How artificial intelligence and data science are shaping the financial services industry

June 2023

1. Why does artificial intelligence matter for financial institutions?

Artificial intelligence (AI) is a field that combines computer science and data analytics to solve real-life problems. Artificial intelligence may first have been born in works of fiction, but it started showing its real power in 2016, when AlphaGo, an AI model that plays the board game Go, defeated one of the world’s top professional players. This was astonishing because Go was thought to be too complicated for computers, at least for a few more decades to come. But however impressive AlphaGo is, it is still weak AI: a system built for one narrow task. Nowadays, AI developers are working towards strong AI, which would match human intelligence across a broad range of tasks. ChatGPT and GPT-4 are among the most recent breakthroughs, showing us that AI is not just something of the future – it is here now.

AI is already applied in many industries and research fields where large amounts of data are available. Financial services is certainly one such industry. For instance, Bank of America’s AI financial assistant, Erica, is one of the most advanced and one of the first widely available models. Since its launch in 2018, Erica has helped nearly 32 million Bank of America clients manage their financial lives. It learns from conversations with clients and expands its capabilities continuously. There are many other applications of AI in the financial services industry, including customer acquisition and onboarding, know-your-customer (KYC), credit decision making, segmentation, retention and cross-selling, to name but a few.

In this paper, we will discuss some key AI terms and algorithms, followed by some use cases of AI in the financial services industry and analysis of the trends.


2. What are data science and artificial intelligence?

Data science (DS) is an interdisciplinary field of study that uses various statistical and computing models to extract valuable insights and knowledge from noisy data. Artificial intelligence (AI) refers to the field of study that simulates human intelligence to carry out the process of learning, thinking and acting.

Machine Learning (ML) is a branch of AI that enables machines to learn and improve themselves. Within ML, there is a subfield that specialises in using neural networks to learn patterns from data. We refer to this as deep learning (DL).

Since machine learning, by construction, involves the development of a model and training it with data, it relies heavily on mathematical, statistical and computing methods to calibrate the model with raw data. This overlaps with the field of data science.


Figure 1 – Relationship between AI, Machine Learning (ML) and Data Science (DS)


3. Brief glossary of common terms


Natural language processing
Natural language processing (NLP) allows machines to read and understand human language in the form of structured and unstructured data. NLP has been widely adopted in the financial services industry, all the way from retail banking to hedge funds. Common NLP applications include extracting data from reports or social media, performing sentiment analysis and answering questions.

Machine learning
Machine learning is a branch of AI and computer science that focuses on the use of data and algorithms to imitate the way that humans learn, gradually improving accuracy. There are three types of machine learning.

Table 1 – Classification of machine learning


Type of ML

Supervised Learning
In supervised learning, the algorithm is provided with sample inputs and desired outputs. From this labelled data, the machine learns a general rule that maps inputs to outputs, which is used to solve classification and regression problems.

Unsupervised Learning
In unsupervised learning, the algorithm learns from unlabelled data and identifies unknown patterns in the data without human instruction. The algorithm helps humans find features in datasets through clustering and dimensionality reduction.

Reinforcement Learning
In reinforcement learning, the algorithm interacts with the environment and is rewarded for desired behaviours and punished for discouraged ones. Consequently, the algorithm finds the rule to achieve the goal through trial and error.

Big data
Big data refers to massive and complex datasets that are difficult to handle with traditional data processing methods. It is often described using the “three Vs”: volume, velocity and variety. Volume represents the size and amount of data; velocity represents the speed of data generation and transfer; and variety refers to the type and nature of data. Variety can be further classified into three categories: structured data, semi-structured data and unstructured data.
AI and machine learning models require very large training datasets to learn effectively. For supervised models, labelling large amounts of data is a prerequisite for training. This is often a labour-intensive process and usually takes up most of the time in a machine learning project. In return, big data have a synergistic relationship with AI and machine learning: feeding big data into machine learning algorithms can greatly enhance the accuracy and efficiency of decision-making. AI and big data work together to deliver better solutions in customer service, risk management, insight generation, etc.

Structured data
Structured data are organised in a predefined format, following a consistent order. The data are typically stored in relational (SQL) databases or in spreadsheets. The well-defined format of structured data facilitates easy storage and access.

Unstructured data
Unstructured data, such as audio and images, are not organised in a predefined format. It is estimated that over 80% of data generated or collected by organisations are unstructured. This type of data is large in volume but difficult to analyse.

Artificial Neural Networks (ANNs)
Artificial Neural Networks in deep learning use various layers for computation, including the input layer, hidden layers and the output layer. The layers are composed of nodes (or neurons) that are analogous to human biological neurons.

Figure 2 – Artificial neural network
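
To make the layer structure concrete, the following is a minimal sketch of a forward pass through an input layer, two hidden layers and one output node. The layer sizes, the ReLU and sigmoid activations and the random weights are all illustrative assumptions; a real network would learn its weights from data.

```python
import math
import random

def relu(values):
    """Rectified linear unit: a common hidden-layer activation."""
    return [max(0.0, v) for v in values]

def sigmoid(value):
    """Squash a real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-value))

def dense(inputs, weights, biases):
    """One fully connected layer: each node takes a weighted sum of all inputs."""
    return [sum(w * x for w, x in zip(node_weights, inputs)) + b
            for node_weights, b in zip(weights, biases)]

def random_layer(n_inputs, n_nodes, rng):
    """Hypothetical initialisation: small random weights and zero biases."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
               for _ in range(n_nodes)]
    return weights, [0.0] * n_nodes

def forward(features, layers):
    """Pass features through the hidden layers, then a single sigmoid output node."""
    activations = features
    for weights, biases in layers[:-1]:
        activations = relu(dense(activations, weights, biases))
    out_weights, out_biases = layers[-1]
    return sigmoid(dense(activations, out_weights, out_biases)[0])

rng = random.Random(0)
# 3 input features -> hidden layer of 4 nodes -> hidden layer of 4 nodes -> 1 output node
layers = [random_layer(3, 4, rng), random_layer(4, 4, rng), random_layer(4, 1, rng)]
score = forward([0.5, -1.2, 3.0], layers)
print(f"untrained network output: {score:.3f}")
```

Because the weights here are random, the output carries no meaning; training would adjust the weights so that the output matches known examples.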




4. Key applications in banking


1a. Fraud detection
For years, fraud has been a critical issue in the banking sector. As customers engage in more digital and online transactions via a greater variety of payment options, such as credit cards or digital wallets, there is a greater chance of fraudulent activities occurring. An effective fraud risk management model is very important to banks as it enables them to mitigate losses due to fraud.
Banks can employ models based on machine learning to recognise the hidden patterns in fraudulent transactions. In a traditional or rule-based approach, rules have to be set up to identify suspicious transactions. However, fraudsters are becoming more technologically savvy. They are able to make use of the latest developments in technology and more sophisticated schemes to defraud banks. Therefore, banks cannot predict fraud precisely with strict rules. They need models that analyse patterns in the data and adapt to new situations quickly. Unsupervised learning algorithms are used in fraud detection because they can analyse data and evolve without labelled examples or human supervision. As such, they can identify the hidden similarities between fraudulent transactions. In addition, neural networks are also used for fraud detection modelling as they can model non-linear and complicated interactions. This is especially practical as many of the relationships between the inputs and outputs of fraudulent transactions are non-linear and complex.
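
The contrast between a fixed rule and a data-driven approach can be sketched in a few lines. The example below is a deliberately simple stand-in for an unsupervised model, not a production technique: it scores each transaction by how far it sits from the customer’s own typical spending, using the median and the median absolute deviation. The 1000 threshold, the cut-off of 5 and the transaction histories are hypothetical.

```python
import statistics

def rule_based_flag(amount, threshold=1000.0):
    """Static rule: flag any transaction above a fixed amount."""
    return amount > threshold

def robust_anomaly_score(amount, history):
    """Distance from the customer's median spend, in units of median absolute deviation."""
    median = statistics.median(history)
    mad = statistics.median(abs(x - median) for x in history)
    return abs(amount - median) / (mad if mad > 0 else 1.0)

# Hypothetical spending histories for two customers
low_spender = [12.0, 15.5, 9.9, 14.2, 11.0, 13.7, 10.4]
high_spender = [900.0, 1200.0, 1100.0, 950.0, 1300.0]

ANOMALY_CUTOFF = 5.0  # assumed cut-off, chosen for illustration only

for label, amount, history in [("low spender", 480.0, low_spender),
                               ("high spender", 1250.0, high_spender)]:
    score = robust_anomaly_score(amount, history)
    print(f"{label}: amount={amount}, rule flag={rule_based_flag(amount)}, "
          f"anomaly flag={score > ANOMALY_CUTOFF}")
```

The fixed rule misses the unusual 480 payment from the low spender and flags the routine 1250 payment from the high spender, whereas the history-based score does the opposite. No labelled fraud cases are needed to compute it.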

1b. Case study: Danske Bank
Danske Bank is the largest bank in Denmark and plays an important role in the northern European region, serving more than 5 million retail customers. However, its traditional fraud detection model relied heavily on manual work, with low efficiency and accuracy. Danske Bank struggled with a low fraud detection rate of 40%, and the model raised 1200 false positive cases each day. The enormous number of false alarms prompted Danske Bank to modernise its fraud detection model to identify fraud more precisely.
Danske Bank integrated deep learning software with graphics processing unit (GPU) appliances that were optimised for deep learning. The deep learning enhanced solution helped the bank identify potential fraud cases and reduced false positives. The labour required to make operational decisions was also reduced as the process was shifted to the AI system. The bank’s deep learning systems compare models in real time using the “champion/challenger” methodology to identify the most productive models. Different challenger models analyse the data simultaneously in real time and learn from the data feed. As a result, Danske Bank achieved a 60% reduction in false positives and a 50% rise in true positives. Danske Bank can now detect fraud more effectively thanks to the deployment of deep learning technologies.
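
The champion/challenger idea can be illustrated with a small sketch: every candidate model scores the same transactions, and the best performer is promoted. This is not Danske Bank’s implementation. The two toy models, the transactions and the choice of the F1 score as the comparison metric are assumptions made for illustration.

```python
def confusion_counts(predictions, labels):
    """Count true positives, false positives and false negatives for fraud flags."""
    tp = sum(1 for p, y in zip(predictions, labels) if p and y)
    fp = sum(1 for p, y in zip(predictions, labels) if p and not y)
    fn = sum(1 for p, y in zip(predictions, labels) if not p and y)
    return tp, fp, fn

def pick_champion(models, transactions, labels):
    """Score every model on the same feed and return the one with the best F1."""
    best_name, best_f1 = None, -1.0
    for name, model in models.items():
        tp, fp, fn = confusion_counts([model(t) for t in transactions], labels)
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        if f1 > best_f1:
            best_name, best_f1 = name, f1
    return best_name, best_f1

# Hypothetical labelled feed: True means the transaction was confirmed as fraud
transactions = [
    {"amount": 1200, "foreign": False},
    {"amount": 700, "foreign": True},
    {"amount": 1500, "foreign": True},
    {"amount": 50, "foreign": False},
    {"amount": 2000, "foreign": False},
    {"amount": 600, "foreign": True},
]
labels = [False, True, True, False, False, False]

models = {
    "champion": lambda t: t["amount"] > 1000,                    # current rule
    "challenger": lambda t: t["amount"] > 500 and t["foreign"],  # candidate rule
}

winner, winner_f1 = pick_champion(models, transactions, labels)
print(f"promote: {winner} (F1 = {winner_f1:.2f})")
```

On this toy feed the challenger raises fewer false positives and catches more fraud, so it would replace the champion. In practice the comparison would run continuously on live data.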

2a. Credit risk assessment
Credit risk is typically the biggest financial risk for commercial banks. For all types of lending, banks have to determine the default risk of their customers. In the past, banks managed credit risk effectively with the aid of predictive statistical models. For example, logistic regression models are among the most commonly used to estimate the probability of default. However, in the big data era, banks can use AI algorithms to analyse much more data, especially unstructured data such as social media data or litigation data from judicial websites. Using these data effectively helps to enhance the predictive power of the models and hence helps banks to make better credit decisions and optimise portfolio performance. Traditional information such as financial ratios, company profiles and borrower demographics is commonly used for the development of credit scoring models. However, since the data from some SMEs may not be sufficient to construct a valid scorecard, the adoption of alternative data, including telco data, shipping records and behavioural traits, is essential. Machine learning models enable the most effective use of these alternative data, as they can come in unstructured or semi-structured forms.
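
As a minimal sketch of the logistic regression approach mentioned above, the example below fits a probability-of-default model by plain gradient descent. The two features (a debt-to-income ratio and a share of late payments), the ten borrowers and the training settings are hypothetical; a real scorecard would use far more data and a tested statistical library.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_pd(weights, bias, features):
    """Probability of default: a linear score passed through the logistic function."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))

def fit_logistic(rows, labels, learning_rate=0.5, epochs=2000):
    """Fit weights by plain batch gradient descent on the log-loss."""
    weights, bias = [0.0] * len(rows[0]), 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for features, label in zip(rows, labels):
            error = predict_pd(weights, bias, features) - label
            grad_b += error
            for j, x in enumerate(features):
                grad_w[j] += error * x
        bias -= learning_rate * grad_b / n
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
    return weights, bias

# Hypothetical borrowers: [debt-to-income ratio, share of late payments]; 1 = defaulted
rows = [[0.10, 0.0], [0.20, 0.1], [0.25, 0.0], [0.30, 0.2], [0.65, 0.2],
        [0.35, 0.1], [0.60, 0.3], [0.70, 0.5], [0.80, 0.4], [0.90, 0.6]]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

weights, bias = fit_logistic(rows, labels)
pd_low_risk = predict_pd(weights, bias, [0.15, 0.0])
pd_high_risk = predict_pd(weights, bias, [0.85, 0.5])
print(f"PD low-risk borrower:  {pd_low_risk:.2f}")
print(f"PD high-risk borrower: {pd_high_risk:.2f}")
```

The fitted weights can be read directly as the direction and strength of each factor, which is one reason logistic regression remains popular for credit decisions.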

2b. Case study: a leading Chinese online bank (affiliated with Alibaba)
MYbank is a leading Chinese online bank. It offers online loans to small and medium enterprises (SMEs). Since its establishment in 2015, the bank has provided loans to more than 45 million SMEs. Its loans are typically smaller and shorter in duration than loans from traditional banks.
In July 2022, Ant Group’s MYbank released the “Bailing” intelligent risk management system to apply AI technology in loan approval. The Bailing system enables customers, especially small and micro business owners, to prove their operational strength and stability by simply taking photos of their credentials. The NLP technologies could identify 26 types of credentials automatically, including contracts, invoices and business licences. In the control test, the NLP-based AI identifier demonstrated 95% consistency with manual review, while greatly improving the efficiency of credential identification. Since inception, more than 5 million customers have obtained loans through the Bailing system without human interaction.

3a. Customer segmentation
Banking was one of the earliest industries to embrace the idea of targeting specific customers using segmentation analysis. Customer segmentation allows banks to divide their diverse customer bases into groups according to different criteria, such as customer needs, credit quality and profitability. Well-designed segmentation analysis enables banks to understand their clients thoroughly and allows them to improve customer experience with more personalised products.
In order to gain a better understanding of client data and segment clients into groups, modern banks could make use of artificial intelligence and machine learning models. ML models help banks to detect hidden patterns or common characteristics in the data effectively. The algorithm groups customers into different clusters. For instance, the bank may use common attributes such as demographics, stages of life and levels of income. Subsequently, the bank could design and select tailor-made marketing campaigns or suggestions for clients.

3b. K-means clustering
K-means is an unsupervised learning algorithm that is commonly adopted in clustering (assigning input data to groups using data patterns).

Figure 3 – K-means clustering

The technique partitions the data into K clusters, each containing at least one data point. The K-means clustering algorithm uses an initial set of randomly chosen centroids as the starting points for each cluster; it then uses iterative calculations to optimise the positions of the centroids. In each iteration, every data point is assigned to its nearest centroid and each centroid is recomputed as the mean of its assigned points. The process repeats until the distance between the centroids and their data points can no longer be reduced.
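
The iteration described above can be sketched in plain Python as follows. The customer data (age and annual income in thousands) and the choice of K = 2 are hypothetical, and the code is a teaching sketch rather than an optimised implementation.

```python
import random

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(points, k, seed=0, max_iterations=100):
    """Cluster points into k groups; returns the centroids and each point's cluster index."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # random initial centroids drawn from the data
    assignments = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # no point changed cluster, so the centroids have stopped moving
        assignments = new_assignments
        # Update step: each centroid moves to the mean of its assigned points
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, assignments

# Hypothetical customers: (age, annual income in thousands)
customers = [(25, 30), (27, 35), (24, 28), (58, 90), (61, 95), (55, 88)]
centroids, assignments = k_means(customers, k=2)
for customer, cluster in zip(customers, assignments):
    print(customer, "-> cluster", cluster)
```

With real customer data, attributes on different scales would normally be standardised first, and K would be chosen by comparing results for several values.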

3c. Case study: a leading bank in the Middle East
A leading bank in the Middle East has encountered numerous issues in recent years as a result of subpar customer service and customer orientation. In response, a study focused on dividing the bank’s clients into groups according to expected benefits. The bank assisted scholars in their research by providing the appropriate information and working closely with the research team.
In the project, K-means analysis was employed to execute the segmentation and establish the number of segments. Some demographic information about customers (e.g. gender, marital status) was used in the clustering analysis, which resulted in four distinct segments based on the anticipated benefits. The four customer groups are benefit-oriented, peace-oriented, interest-oriented and moderate clients, each of which showed distinctive data attributes. As a result, the bank can choose the ideal marketing plan and create an appropriate advertising programme according to customer characteristics.



5. Key applications in asset management


4a. Robo-advising
Robo-advisers are digital financial advisers that provide investment or financial planning services driven by AI and ML algorithms. Robo-advisers use natural language processing (NLP) to understand natural language input (e.g. voice and text) and interact with investors in the form of chatbots. This reduces the need for human supervision and thus lowers the entry barrier for retail investors.
For example, robo-advisers can be used to gather investors’ information (e.g. return objectives and risk appetites) through behavioural questionnaires. Robo-advisers can also be used to answer frequently asked questions.

4b. Portfolio management
NLP has become an effective tool in portfolio management because it is capable of extracting insights from different unstructured and semi-structured formats, such as annual reports, news articles and posts on social media. Instead of the traditional dictionary-based approaches that extract information only from individual words, AI approaches could also interpret context and tone.
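
The limitation of a dictionary-based approach is easy to demonstrate. The sketch below counts positive and negative words from two tiny, hypothetical word lists; it is illustrative only and does not represent any vendor’s sentiment model.

```python
# Hypothetical word lists, far smaller than a real financial sentiment dictionary
POSITIVE = {"strong", "growth", "beat", "improved"}
NEGATIVE = {"weak", "loss", "decline", "missed"}

def dictionary_sentiment(text):
    """Count positive minus negative words, ignoring context entirely."""
    words = text.lower().replace(",", " ").replace(".", " ").split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

sentences = [
    "Revenue growth was strong and margins improved.",
    "Revenue growth was not strong.",
    "The company reported a loss and a decline in sales.",
]
for sentence in sentences:
    print(dictionary_sentiment(sentence), sentence)
```

The second sentence scores as positive even though its meaning is negative, because the word count cannot see the negation. Context-aware models are intended to close exactly this gap.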

Some asset management companies use machine learning models to perform sophisticated fundamental analysis, including the analysis of semi-structured data from the news, or data from financial reports. AI approaches such as support vector machines could generate insights on numerous stocks, identify the correlations between different asset classes and provide more accurate estimates of covariances and expected returns. In portfolio optimisation, traditional approaches focus on hedging and quantitative analysis, while machine learning techniques can process and extract information from extremely large amounts of data. Moreover, machine learning can readily apply non-linear techniques and reduce dimensionality, which can provide better out-of-sample performance. In addition, reinforcement learning allows the machine to improve continuously, dealing with complex asset allocation problems that are impractical for humans to solve by hand.

4c. Case study: UOB Asset Management (UOBAM)
United Overseas Bank (UOB) is a leading bank in Asia with 500 offices around the globe. Its new product, UOBAM Invest, is Singapore’s first robo-adviser on a mobile wallet and includes sustainable investing solutions. It has millions of users and the total assets under management have surpassed SGD 36.5 billion.
The robo-adviser employs risk profile and goal-setting algorithms to make sure that its investment recommendations are aligned with the risk tolerance levels and investment objectives set by clients. In addition, the time period and ESG investing preferences are also taken into account. UOBAM Robo-Invest aims to optimise portfolios for stable long-term growth using a hybrid strategy of actively managed funds and exchange traded funds (ETFs). The optimisation model maximises returns subject to the investment goals and risk tolerance of clients. The combination of AI and portfolio management has brought impressive results to UOBAM Invest, as the asset manager reported returns higher than its benchmarks.
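
To illustrate what maximising returns within a client’s risk tolerance can mean, here is a minimal two-fund sketch. It is not UOBAM’s model: the expected returns, volatilities, correlation and volatility limits are hypothetical, and the grid search stands in for a proper optimiser.

```python
import math

def portfolio_stats(weight_a, mean_a, mean_b, vol_a, vol_b, correlation):
    """Expected return and volatility of a two-asset portfolio."""
    weight_b = 1.0 - weight_a
    expected = weight_a * mean_a + weight_b * mean_b
    variance = ((weight_a * vol_a) ** 2 + (weight_b * vol_b) ** 2
                + 2 * weight_a * weight_b * vol_a * vol_b * correlation)
    return expected, math.sqrt(variance)

def best_allocation(max_volatility, steps=100, **market):
    """Grid-search the weight that maximises expected return within the volatility limit."""
    best = None
    for i in range(steps + 1):
        weight_a = i / steps
        expected, volatility = portfolio_stats(weight_a, **market)
        if volatility <= max_volatility and (best is None or expected > best[1]):
            best = (weight_a, expected, volatility)
    return best  # (weight in fund A, expected return, volatility), or None if infeasible

# Hypothetical inputs: fund A is an actively managed fund, fund B is a bond ETF
market = dict(mean_a=0.08, mean_b=0.04, vol_a=0.20, vol_b=0.06, correlation=0.2)

cautious = best_allocation(max_volatility=0.08, **market)
growth = best_allocation(max_volatility=0.15, **market)
print(f"cautious client: {cautious[0]:.0%} in fund A")
print(f"growth client:   {growth[0]:.0%} in fund A")
```

A client with a higher risk tolerance receives a larger allocation to the riskier fund, which is the behaviour a risk-profiling algorithm is meant to produce.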

5a. Trading processes
In addition to transactions on exchanges, over-the-counter (OTC) executions have also been replaced by electronic trading. The rise of electronic trading has generated large datasets for analysis throughout the trading process, from pre-trade to execution.
Figure 5

During the pre-trade stage, AI approaches could analyse the transaction costs before execution, including the bid–ask spreads, market impact cost and commissions. AI techniques could capture non-linear relationships to predict the market impact and provide additional insights compared with traditional market impact models. Further, AI models could also use alternative data to make predictions when the historical trading data of the asset is insufficient.
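
For context, a traditional pre-trade estimate simply adds up the three cost components named above. The sketch below uses the square-root rule of thumb for market impact, which is an assumption chosen for illustration; the coefficient and all inputs are hypothetical. An AI model would aim to replace the impact term with a prediction learned from trading data.

```python
import math

def pre_trade_cost_bps(order_size, daily_volume, spread_bps, daily_volatility_bps,
                       commission_bps, impact_coefficient=1.0):
    """Estimate total cost in basis points: half the spread, market impact and commission."""
    half_spread = spread_bps / 2.0
    participation = order_size / daily_volume
    # Square-root rule: impact grows with the square root of the share of daily volume traded
    market_impact = impact_coefficient * daily_volatility_bps * math.sqrt(participation)
    return half_spread + market_impact + commission_bps

# Hypothetical orders in a stock that trades 5 million shares a day
small_order = pre_trade_cost_bps(order_size=50_000, daily_volume=5_000_000,
                                 spread_bps=4.0, daily_volatility_bps=150.0,
                                 commission_bps=1.0)
large_order = pre_trade_cost_bps(order_size=500_000, daily_volume=5_000_000,
                                 spread_bps=4.0, daily_volatility_bps=150.0,
                                 commission_bps=1.0)
print(f"small order: {small_order:.1f} bps, large order: {large_order:.1f} bps")
```

The larger order costs more per share because it consumes more of the available liquidity, which is the relationship that both traditional and AI-based impact models try to capture.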
At the stage of trade execution, the algorithm could minimise the transaction cost by recommending the appropriate size and timing of the order. For instance, machine learning models could analyse the broker-dealer inventory, historical trades and pricing by actively learning from these data. However, the post-trade analysis often requires human intervention in monitoring the risk and realised market outcome.

5b. Case Study: BlackRock
BlackRock has taken advantage of machine learning and AI techniques to analyse its own trading data and identify patterns in transaction costs. BlackRock’s traders could use these patterns to obtain a better understanding of their trades. This facilitates the reduction of trading costs and execution time, which eventually benefits BlackRock and its clients. Moreover, the application of AI techniques enables BlackRock to analyse enormous volumes of text data and to anticipate probable changes in companies’ future earnings. For example, NLP technology could transform unstructured text into its proprietary measures of market sentiment or trading trends. The technology also analyses over 5000 transcripts of earnings calls and over 6000 broker reports every day; the traditional approach would be far more time-consuming, as the reports would have to be read by humans.



6. Key applications in insurance


6a. Automating and improving the claims process
As claims data exist in various formats, such as photos and documents, it is difficult to identify and analyse these unstructured or semi-structured data efficiently.
As a result, major pain points in claims processing are long waiting times and a poor client experience. In this case, AI could provide a helping hand by handling the claims process automatically and efficiently. For instance, insurers can implement an automated first notice of loss (FNOL) system with AI. It could automate the customer-facing and claim management aspects, such that the reporting process could be finished without human intervention. At the claim assessment stage, the damage estimation is traditionally handled by an adjuster or vehicle repair shop, which may take days or weeks. Using image recognition technology, the insurer could introduce a machine learning model that classifies claims based on images of vehicle damage, estimating the repair fees with the support of a vast database. AI could make the assessment and authorise the claim within a very short period, helping insurers to ensure the quality and speed of the claims process. Moreover, it also empowers insurers to draw observations from claim audits and prevent claim leakage.

6b. Case Study: Fukoku Mutual Life Insurance
Fukoku Mutual is a leading life insurer operating in Japan. In line with the wider industry, Fukoku was struggling with increasing operational costs due to inefficiencies in claim processing. It adopted AI systems to analyse and interpret claims data, such as images of medical certificates, and to calculate the claim pay-out with high accuracy. In order to prevent payment oversights, the AI system could also check for any violation of the insurance contracts. After the AI system was deployed, the time required to calculate the large volume of pay-outs was drastically reduced, while productivity was significantly increased.

7a. Price optimisation
The major business of insurance companies is to understand and anticipate the risk of clients and to take on that risk at an appropriate price. Nowadays, the integration of machine learning techniques into risk modelling has allowed insurers to better predict different events. As a result, insurers could refine their pricing models to increase profitability. Some commonly employed methodologies include the Generalised Linear Model (GLM), LightGBM and random forests. They could analyse a vast dataset of structured and unstructured data in real time. The pricing team would feed the model with historical policy attributes, client information and events data. In contrast to the traditional approach that builds frequency and severity models for predicting the number and amounts of claims, the machine learning approach might directly infer the incurred claim. It often provides a more accurate result as the model maps non-linear relationships and could use a larger variety of variables.
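
The difference between the two modelling targets can be shown with simple arithmetic. In the sketch below, the traditional route estimates claim frequency and claim severity separately and multiplies them, while the direct route works with incurred claims per policy-year. The five policies are hypothetical, and no predictive model is fitted; the point is only to show which quantity each approach predicts.

```python
def frequency_severity_premium(policies):
    """Traditional approach: expected claims per policy-year times average claim size."""
    exposure = sum(p["years"] for p in policies)
    claim_count = sum(len(p["claims"]) for p in policies)
    total_paid = sum(sum(p["claims"]) for p in policies)
    frequency = claim_count / exposure
    severity = total_paid / claim_count if claim_count else 0.0
    return frequency, severity, frequency * severity

def direct_pure_premium(policies):
    """Direct approach: incurred claims per policy-year, with no frequency/severity split."""
    return sum(sum(p["claims"]) for p in policies) / sum(p["years"] for p in policies)

# Hypothetical portfolio: exposure in policy-years and the claim amounts paid
policies = [
    {"years": 1.0, "claims": []},
    {"years": 1.0, "claims": [1200.0]},
    {"years": 0.5, "claims": []},
    {"years": 1.0, "claims": [800.0, 400.0]},
    {"years": 0.5, "claims": []},
]

frequency, severity, premium = frequency_severity_premium(policies)
print(f"frequency = {frequency:.2f} claims per year, severity = {severity:.0f}")
print(f"frequency x severity = {premium:.0f}, direct = {direct_pure_premium(policies):.0f}")
```

On portfolio averages the two routes give the same figure. They diverge once each quantity is predicted from policy attributes, where a single model of incurred claims can pick up interactions that separate frequency and severity models may miss.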

7b. Case study: AXA
In Japan, 7–10% of AXA’s customers were causing a car accident every year, and the resulting losses were significant. Since AI techniques could draw on more historical data to provide predictive analysis, the data science team of AXA Japan decided to create a deep learning model. They used TensorFlow, an open-source machine learning platform developed by Google, to develop the experimental model using random forest and neural network techniques. Numerous factors, such as driver age and vehicle type, were identified and fed into a neural network with three hidden layers. Eventually, the machine learning approach generated predictions with an accuracy of 78%, enabling AXA to provide better underwriting services with more precise pricing. Meanwhile, AXA Tianping, one of China’s largest property and accident insurers, has announced a partnership with an insurance technology company, Akur8. Using Akur8’s AI technology, AXA Tianping could automate rate modelling and improve the pricing process. This would allow AXA Tianping to develop a portfolio of well-priced products more quickly in a dynamic risk environment.


7. Challenges for AI / ML development in the financial services industry


The quality and quantity of data
Like other empirical models, AI models rely on the availability and integrity of data. Poor data quality can easily trigger what is known as “garbage in, garbage out”. Data quality and sufficiency become particularly important because AI outputs are often taken at face value. Therefore, identifying data-related issues by evaluating the model outcomes might not be a straightforward exercise. Furthermore, AI models require large amounts of data during the learning phase, often more than what is made available.

The lack of trust in machine learning for sensitive decisions
Complex machine learning models are not easy to interpret, and explaining their predictions can be difficult or even impossible. Lacking an understanding of the black-box models, such as artificial neural networks with many layers, end users can find it hard to trust their outputs. This is especially true for financial services regulators and bankers working on high-stakes decisions as they would want to support their decisions with comprehensive explanations.


At Accuracy, we have a team of data scientists and a technology laboratory who help our clients on the following tasks:

• Performing strategic analysis on the adoption of appropriate AI / ML solutions in different business functions
• Developing AI / ML models that best suit your business needs
• Developing and implementing AI / ML-powered platforms and systems
• Designing AI / ML model governance frameworks and implementing best practices.

At Accuracy, our financial services industry experts work with banks and non-bank financial institutions on mergers and acquisitions, strategic transformations, quantitative modelling and adoption of technology solutions. We have been working closely with both global financial institutions and small to medium sized entities over the past two decades to add value to their businesses.
