From self-driving cars, image recognition and speech recognition to medical diagnosis and chess, machine learning has been making waves everywhere and continues to improve technologies across multiple industries, especially cybersecurity. The end goal of such applications is to move towards increased automation, self-learning and, in effect, reduced reliance on humans.

Modern fraud solutions – such as CashShield – have begun to incorporate machine learning algorithms to improve the operational efficiency of the fraud screening process, as well as to increase the accuracy of pinpointing fraudsters in the sea of unique users. This is a step away from traditional fraud detection solutions that are predominantly rules-based and are heavily reliant on manual labor to configure and update the rules and to make sense of the scores resulting from the rules. Machine learning solutions have improved the situation significantly; they have allowed for greater automation and greatly reduced the need for manual reviews.

But there is a caveat: most modern fraud solutions built with machine learning can greatly reduce the need for manual reviews, but cannot eliminate it completely.

Human involvement in fraud prevention: boon or bane?

The industry broadly agrees that overreliance on manual reviews and long manual review times are undesirable in the fraud screening process. Despite this, manual reviews are still commonplace for most companies, whether or not they have adopted fraud tools designed with machine learning algorithms.

Proponents of keeping the manual review process worry that machines will make mistakes, passing fraudulent transactions and failing genuine ones. And of all the nightmares a merchant could possibly have, rejecting genuine customers and risking losing them forever is probably one of the worst.

False positives are a tricky problem to solve, however. Customers could be turned away by overly strict controls in the fraud screening process, whether these come from the rule set or from a manual reviewer who does not want to risk letting in a fraudulent transaction and raising the chargeback rate. After all, human judgement is not impervious to error.

When manual reviews cost more than they save

Keeping a manual review team is costly. Hiring the team costs money, finding the right people is difficult, and training the staff so that they stay up to date on recent fraud trends and well versed in newer fraud tools puts a further strain on resources. Many merchants report that maintaining a team dedicated solely to fraud prevention is unjustified, and would rather divert resources to other parts of the business to generate revenue.

As your business expands and receives more transactions a day, scaling the manual review team becomes an important consideration, so that more transactions can be processed without delay. But, reality check: not everyone can afford to invest extra resources solely in fraud mitigation, particularly for short promotional periods that last a day, or a week at most.

To put it simply, this is how manual reviews cost more than they save:

Assuming that sales usually peak during the summer sale and the year-end sales, a company might choose to enlarge the manual review team halfway through the year to meet the incoming demand. However, as sales dip between the Black Friday craze and the holiday season, the excess capacity from all the extra hires will take a toll on operational costs. Meanwhile, opportunities may be lost all year round whenever the existing manual review team cannot keep up with an unexpected influx of transactions, resulting in lost revenue.
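As a rough illustration of this trade-off, the sketch below compares a team sized for quiet months with a team sized for the peak. Every figure in it (reviews per analyst, cost per analyst, margin per order and the monthly demand) is a hypothetical number chosen only to make the arithmetic visible; none of them come from real merchant data.

```python
# Hypothetical figures for illustration only.
REVIEWS_PER_ANALYST = 3_000   # transactions one reviewer can handle per month
COST_PER_ANALYST = 4_000      # monthly cost of one reviewer
MARGIN_PER_ORDER = 5          # profit lost when an order cannot be reviewed in time

# Reviews needed per month: quiet months, a summer sale, and a year-end peak.
demand = [6_000, 6_000, 6_000, 6_000, 6_000, 12_000,
          6_000, 6_000, 6_000, 6_000, 15_000, 15_000]

def monthly_waste(month_demand, analysts):
    """Return (cost of idle capacity, revenue lost to missing capacity) for one month."""
    capacity = analysts * REVIEWS_PER_ANALYST
    if month_demand <= capacity:
        idle_analysts = (capacity - month_demand) / REVIEWS_PER_ANALYST
        return idle_analysts * COST_PER_ANALYST, 0
    return 0, (month_demand - capacity) * MARGIN_PER_ORDER

def yearly_waste(analysts):
    """Sum idle cost and lost revenue over the twelve months."""
    idle = sum(monthly_waste(d, analysts)[0] for d in demand)
    lost = sum(monthly_waste(d, analysts)[1] for d in demand)
    return idle, lost

small_team = yearly_waste(2)  # sized for the quiet months
large_team = yearly_waste(5)  # sized for the year-end peak

print("2 analysts: idle cost", small_team[0], "lost revenue", small_team[1])
print("5 analysts: idle cost", large_team[0], "lost revenue", large_team[1])
```

Under these assumed numbers, the small team pays nothing for idle capacity but loses revenue in every peak month, while the large team loses no revenue but pays for idle reviewers most of the year. Either way, a fixed headcount is wasteful when demand is seasonal.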

Call on the machines to take on a bigger role

We have all heard the term “machine learning” somewhere, but what do we really know about it and, more importantly, how can it prevent fraud?

Let’s understand how the fraudster works:

He has bought a list of 10,000 stolen credit card numbers off the Dark Web, and intends to use the numbers to make purchases on his favorite e-commerce stores. Of course, the fraudster doesn’t want to get caught. To avoid detection, he needs to ensure that the transactions he makes with those stolen credit card numbers do not appear to come from the same source.

Using easily obtainable tools from the Dark Web, the fraudster can mask, change or randomize his IP address so that the transactions seem to originate from different sources. More sophisticated fraudsters can also make micro-changes to the device fingerprint, making it harder to trace the various transactions back to one source. For example:

  • Transaction A was made from an iPhone 6S
  • Transaction B was made from an iMac computer
  • Transaction C was made from a Linux computer

All these changes can be made in mere seconds, and the transactions could be made one after another or minutes apart, at random. Now we ask: how do machines do what humans can’t do?

Given the sheer volume of data that must be analyzed to match fraud patterns, machine learning has become indispensable.

Most fraud systems have at least the basics: they use historical data sets of known fraud patterns to train the machines, so that they are able to predict and capture (or block) the same types of fraud patterns. This is known as supervised machine learning.

Typically, a supervised machine learning model “learns” to recognize patterns and make predictions, constantly refining its accuracy by processing and analyzing emerging data. This is done by collecting a colossal amount of data, labelling the data based on previous incidents of fraudulent behavior, and then training the model to recognize and predict the same anomalies in future outcomes. Unlike static and inflexible rules, supervised machine learning is able to keep up with the increasing volume, velocity, complexity and variety of data today. By having machines train themselves automatically based on historical data, the effort required to keep fraud detection up to date is drastically reduced.
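To make the label-then-train cycle concrete, here is a minimal sketch of supervised learning: a tiny logistic regression fitted by stochastic gradient descent in plain Python. The two features (order amount and account age), the labelled rows and the training settings are all invented for illustration. A production system would use far more data, far more features and a proper machine learning library, and nothing here should be read as how any particular vendor trains its models.

```python
import math

# Toy labelled history (entirely made up): each row is
# (order amount in hundreds, account age in years, label), where label 1 = fraud.
history = [
    (0.3, 4.0, 0), (0.5, 3.0, 0), (0.8, 5.0, 0), (1.0, 2.5, 0),
    (4.0, 0.1, 1), (5.5, 0.2, 1), (3.5, 0.0, 1), (6.0, 0.3, 1),
]

def sigmoid(z):
    """Squash any real number into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def train(rows, epochs=2000, learning_rate=0.1):
    """Fit logistic regression weights with per-sample gradient updates."""
    w_amount, w_age, bias = 0.0, 0.0, 0.0
    for _ in range(epochs):
        for amount, age, label in rows:
            predicted = sigmoid(w_amount * amount + w_age * age + bias)
            error = predicted - label  # gradient of the log loss w.r.t. the logit
            w_amount -= learning_rate * error * amount
            w_age -= learning_rate * error * age
            bias -= learning_rate * error
    return w_amount, w_age, bias

def fraud_score(model, amount, age):
    """Probability-like score for an unseen transaction."""
    w_amount, w_age, bias = model
    return sigmoid(w_amount * amount + w_age * age + bias)

model = train(history)
large_order_new_account = fraud_score(model, 5.0, 0.1)
small_order_old_account = fraud_score(model, 0.4, 4.0)

print("large order, new account:", round(large_order_new_account, 3))
print("small order, old account:", round(small_order_old_account, 3))
```

The model is never told a rule such as “large orders from new accounts are risky”; it infers that from the labels. Retraining on fresh labelled data updates the weights without anyone rewriting rules by hand.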

Some fraud solutions go further and share known fraud behavioral patterns across various merchants and companies, because most fraudsters do not stop at one place. In fact, if they are blocked on one platform, they quickly move on to a different platform to try their luck. And if they succeed, they grow even more audacious and reuse the winning attack on other websites to maximize their profits.

In the face of new fraud trends

However, if the machines are only trained to capture fraud based on historical data, what happens to fraud attacks that are new with zero traces of historical data anywhere?

To fill this gap, CashShield incorporates unsupervised machine learning, which allows the system to identify fraud patterns without known (or historical) data. With each incoming transaction, the CashShield system analyzes millions of data points within seconds, a large part of them on the user’s behavior, to identify good behavior as much as bad behavior. Even seemingly negligible data points such as the device battery level, the browser version and whether or not the user has connected to social media are all clues to whether a fraudster is attempting to trick the system by making micro-changes to the transactions. Real-time pattern recognition analyzes each incoming transaction to identify fraudulent patterns, especially those of coordinated fraud attacks or new attacks launched by the fraudster.
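The idea of spotting a coordinated attack without any labels can be sketched very simply. The example below is a deliberately naive stand-in for unsupervised pattern detection, not CashShield’s method: it groups transactions by a “soft signature” built from minor signals (battery level, browser version, social login) and flags any group that produces a burst of transactions within a short time window, even though the device and IP address differ each time. The field names, sample data, window length and burst size are all hypothetical.

```python
from collections import defaultdict

# Made-up transactions; "t" is seconds since the first one. Device and IP change
# every time, but the minor signals stay identical for A, B and C.
transactions = [
    {"id": "A", "t": 0,   "device": "iPhone 6S", "ip": "203.0.113.5",  "battery": 37, "browser": "Chrome 62", "social": False},
    {"id": "B", "t": 40,  "device": "iMac",      "ip": "198.51.100.7", "battery": 37, "browser": "Chrome 62", "social": False},
    {"id": "C", "t": 95,  "device": "Linux PC",  "ip": "192.0.2.44",   "battery": 37, "browser": "Chrome 62", "social": False},
    {"id": "D", "t": 60,  "device": "Pixel 2",   "ip": "192.0.2.10",   "battery": 81, "browser": "Chrome 61", "social": True},
    {"id": "E", "t": 900, "device": "iPad",      "ip": "203.0.113.9",  "battery": 37, "browser": "Chrome 62", "social": False},
]

def soft_signature(tx):
    """Minor signals a fraudster may forget to vary (a hypothetical choice)."""
    return (tx["battery"], tx["browser"], tx["social"])

def find_bursts(txs, window_seconds=300, min_size=3):
    """Return lists of transaction ids that share a signature and arrive in a burst."""
    groups = defaultdict(list)
    for tx in txs:
        groups[soft_signature(tx)].append(tx)

    flagged = []
    for members in groups.values():
        ordered = sorted(members, key=lambda tx: tx["t"])
        start = 0
        for end in range(len(ordered)):
            # Shrink the window until it spans at most window_seconds.
            while ordered[end]["t"] - ordered[start]["t"] > window_seconds:
                start += 1
            if end - start + 1 >= min_size:
                flagged.append([tx["id"] for tx in ordered[start:end + 1]])
                break  # report the first burst found for this signature
    return flagged

suspicious = find_bursts(transactions)
print(suspicious)
```

No transaction here was ever labelled as fraud. The burst stands out purely because three supposedly unrelated devices behave too much alike in too short a time, whereas transaction E, which shares the signature but arrives much later, is left alone.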

Machine learning can help us speedily trawl through massive data sets and flag potentially fraudulent behavior, but in the end, machine learning systems often just produce a probability score (or what we call the “fraud score”). Even if up to 95% of transactions can be automatically passed or failed, the remaining 5%, the borderline transactions, would still require manual reviews or human decisions for completion.
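The gap between a score and a decision is easy to see in code. In the sketch below, the two cut-off values and the sample scores are hypothetical; the point is only that any score falling between the cut-offs has no automatic answer and lands in a review queue.

```python
# Hypothetical cut-offs: below PASS_BELOW is accepted, above FAIL_ABOVE is rejected.
PASS_BELOW = 0.20
FAIL_ABOVE = 0.80

def decide(fraud_score):
    """Turn a probability-like fraud score into pass, fail or manual review."""
    if fraud_score < PASS_BELOW:
        return "pass"
    if fraud_score > FAIL_ABOVE:
        return "fail"
    return "review"  # borderline: a score alone cannot settle it

# Made-up scores for ten incoming transactions.
scores = [0.02, 0.05, 0.10, 0.45, 0.91, 0.03, 0.08, 0.97, 0.01, 0.12]
decisions = [decide(score) for score in scores]
review_share = decisions.count("review") / len(decisions)

print(decisions)
print("share sent to manual review:", review_share)
```

Moving the cut-offs closer together shrinks the review queue, but only by forcing borderline cases into pass or fail, which trades manual effort for more false positives or more chargebacks.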

To achieve full automation, we must move beyond just relying on machine learning for the answers.

Striving to achieve full machine automation

When CashShield was first conceptualized a decade ago, it was imagined from the start as a fully automated solution, to lessen the stress on the operational team, as well as to provide instant delivery for our digital goods merchants.

Through a unique application of high frequency trading (HFT) algorithms, combined with our machine learning models, the core CashShield system is able to make sense of the fraud score and automate decisions to pass or fail a transaction in real-time.

Think about it: accepting a transaction is extremely similar to investing in a stock – both have a potential return with a risk of default (or loss). Applying financial modelling (such as a return to risk ratio) allows the system to view all the transactions as part of an investment portfolio, and maximize the merchant’s returns based on an optimal risk level.

Take a look at the chart:

Red denotes high-risk transactions, yellow denotes borderline-risk transactions and green denotes low-risk transactions. By comparing risk against return, the system can accept a riskier transaction with a great potential return, as long as it has offset the higher risk with other low-risk transactions within the portfolio. This also allows the system to be more aggressive in accepting transactions to maximize revenue.
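One way to picture the portfolio view is the simplified sketch below. It is not CashShield’s actual algorithm, which the article does not spell out. It assumes that each transaction has an amount, a profit margin and an estimated fraud probability, that a fraudulent transaction loses the full amount, and that the merchant tolerates a fixed expected loss per unit of accepted volume. The class name, the parameters and all the numbers are hypothetical.

```python
class Portfolio:
    """Accept transactions while the portfolio's expected loss rate stays within a limit."""

    def __init__(self, max_loss_rate):
        self.max_loss_rate = max_loss_rate  # tolerated expected loss per unit of volume
        self.volume = 0.0                   # total amount accepted so far
        self.expected_loss = 0.0            # sum of p_fraud * amount over accepted items

    def consider(self, amount, margin, p_fraud):
        """Return True and record the transaction if it is worth accepting."""
        # Expected return: earn the margin if genuine, lose the amount if fraudulent.
        expected_return = (1 - p_fraud) * margin - p_fraud * amount
        if expected_return <= 0:
            return False
        new_loss = self.expected_loss + p_fraud * amount
        new_volume = self.volume + amount
        if new_loss / new_volume > self.max_loss_rate:
            return False  # would push the whole portfolio past its risk limit
        self.volume, self.expected_loss = new_volume, new_loss
        return True

# A risky but lucrative order, offered to an empty portfolio.
fresh = Portfolio(max_loss_rate=0.02)
risky_alone = fresh.consider(amount=200, margin=80, p_fraud=0.10)

# The same order, offered after twelve low-risk orders have been accepted.
seasoned = Portfolio(max_loss_rate=0.02)
for _ in range(12):
    seasoned.consider(amount=100, margin=30, p_fraud=0.005)
risky_offset = seasoned.consider(amount=200, margin=80, p_fraud=0.10)

print("risky order alone:", risky_alone)
print("risky order after low-risk orders:", risky_offset)
```

With these assumed numbers, the same risky order is declined on its own but accepted once enough low-risk volume sits in the portfolio to absorb it, which is the offsetting effect described above. An order whose expected return is negative is declined regardless of what else the portfolio holds.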

Humans or machines?

Supporters of manual reviews do not trust machines to be fault-free. And we agree that they are not, because if fraud risk can be seen as financial risk, 0% risk means 0% returns. Some risk should be taken, but just enough to maximize your business potential.

But let us not forget the other benefits of full machine automation: keeping one step ahead of fraudsters without compromising consumer experience and growth. Without the need to dedicate extra manpower and resources to managing fraud, these efforts can be committed to other areas. With a fully automated fraud system, a business can streamline its operations to handle large volumes of data, scaling aggressively and easily on demand without a huge jump in costs.