Many retailers have been shifting toward applications that let users shop and pay for products directly from their phones. This modern way of transacting creates new avenues for tech-savvy fraudsters. There are many ways to defraud a mobile payment app: reinstalling the app, using dummy email addresses, using multiple devices, loading card information stolen from the dark web, and so on. Because no perfect way to validate someone’s identity is in widespread use today, committing fraud can be relatively easy (although identity validation is evolving quickly, with technologies such as facial recognition and other biometric verification methods). Luckily for retailers and consumers alike, there is a suite of machine learning tools that can help us identify and stop fraud before it happens.
Leveraging Data in Machine Learning
As transactions flow through a payment system, data is generated, collected, and attached to each transaction. This information is the raw material for a fraud detection system. Many machine learning classifiers can use this data to categorize each transaction as fraudulent or not.
Available data elements will change depending on the payment architecture, but there is always something to work with. For example, you might be able to see how many transactions the phone has attempted in the past day, or the distance between the geolocation of the phone and the billing address associated with the credit card. Whatever it might be, this data can be used to build predictive models. All we need is a historical data set with a “target” (in this case, which transactions were fraudulent, and which were not) that we will attempt to classify. If a transaction is associated with a chargeback (a demand by a credit-card provider for a retailer to make good the loss on a fraudulent or disputed transaction), we categorize it as fraudulent.
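As a rough illustration of the two example data elements above, the sketch below turns a raw transaction into model features and a target. The field names (device_id, timestamp, phone_lat, billing_lat, chargeback, and so on) and the helper names are hypothetical; a real payment architecture will expose different fields.

```python
import math
from datetime import datetime, timedelta


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def attempts_in_past_day(device_id, now, history):
    """Count earlier attempts by this device in the 24 hours before `now`.

    `history` is a list of (device_id, timestamp) pairs (hypothetical layout).
    """
    cutoff = now - timedelta(days=1)
    return sum(1 for dev, ts in history if dev == device_id and cutoff <= ts < now)


def build_features(txn, history):
    """Turn one raw transaction (a dict with assumed keys) into features plus target."""
    return {
        "attempts_past_day": attempts_in_past_day(txn["device_id"], txn["timestamp"], history),
        "km_from_billing": haversine_km(
            txn["phone_lat"], txn["phone_lon"], txn["billing_lat"], txn["billing_lon"]
        ),
        # Target: a transaction associated with a chargeback is labeled fraudulent.
        "is_fraud": int(txn["chargeback"]),
    }
```

Each historical transaction would be passed through a function like build_features to produce one row of the data set that the classifiers are trained on.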
One problem that often arises when classifying transactions, however, is the large imbalance between what is fraud and what is not. Often the number of fraudulent transactions is as low as 1% to 2% of the full data set, so it can be very difficult to build an accurate classifier that separates the fraud signal from the noise of all the healthy transactions. This makes it crucial to have as much data as possible: the more transactions you collect, the more examples of fraud the model has to learn from, and the more reliable its predictions will be.
Using Machine Learning to Classify Transactions
With a large, trustworthy, and well-managed data source, we can begin applying machine learning algorithms to find the best-performing one. The process generally works by splitting the data set in a couple of different ways. First, we set aside a portion of the data set, known as the test data, for the very end of our evaluation. This will give us a better idea of how well our model will perform in a production setting. We then split the rest of the data into training and validation data sets. The training data set is used to train each model, and we then evaluate its accuracy on the validation data. Once we pick the model that gives us the best validation accuracy, we apply it one last time to the test data to produce a final set of predictions and evaluate their accuracy.
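The three-way split described above can be sketched in a few lines. This is a minimal illustration, not a prescribed method: the function name, the 20% fractions, and the fixed seed are all assumptions chosen for the example.

```python
import random


def three_way_split(rows, test_frac=0.2, val_frac=0.2, seed=0):
    """Shuffle rows and split them into (train, validation, test) lists.

    The fractions and seed are illustrative defaults, not recommendations.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_test = int(len(shuffled) * test_frac)
    n_val = int(len(shuffled) * val_frac)
    test = shuffled[:n_test]                 # held out until the very end
    val = shuffled[n_test:n_test + n_val]    # used to compare candidate models
    train = shuffled[n_test + n_val:]        # used to fit each candidate model
    return train, val, test


train, val, test = three_way_split(range(1000))
```

With fraud making up only a small share of the data, a purely random split can leave very few fraudulent rows in the smaller sets, so in practice it may be worth splitting the fraudulent and healthy transactions separately and recombining them (a stratified split).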
Accuracy can be defined in many ways. With such imbalanced classes, it is not a good idea to look only at overall model accuracy: if just 2% of transactions are fraudulent, a model that never flags anything is still 98% accurate. Instead, you need to evaluate what is known as sensitivity and specificity, among other more advanced measures. Sensitivity is the proportion of actual “positives” (in this case, fraud) that were correctly identified, while specificity is the proportion of observations that are not fraud that were correctly identified as such. For example, 80% specificity tells us that 20% of the legitimate transactions were identified as fraudulent; these are false positives. We need to be very cautious about them, because they are good consumers that our model would be turning away.
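To make these measures concrete, here is a small sketch that computes them from lists of true labels and predictions (1 for fraud, 0 for not fraud). The function names and the toy data are made up for illustration.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 means fraud."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def accuracy(y_true, y_pred):
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + tn + fn)


def sensitivity(y_true, y_pred):
    """Share of actual fraud that was flagged; assumes at least one fraud case."""
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn)


def specificity(y_true, y_pred):
    """Share of legitimate transactions left alone; assumes at least one exists."""
    _, fp, tn, _ = confusion_counts(y_true, y_pred)
    return tn / (tn + fp)


# Toy data: 20 fraudulent and 980 legitimate transactions.
y_true = [1] * 20 + [0] * 980
never_flag = [0] * 1000                               # a "model" that flags nothing
catches_some = [1] * 15 + [0] * 5 + [1] * 49 + [0] * 931

print(accuracy(y_true, never_flag), sensitivity(y_true, never_flag))
print(sensitivity(y_true, catches_some), specificity(y_true, catches_some))
```

The model that never flags anything scores high on accuracy while catching no fraud at all, which is why sensitivity and specificity are reported separately.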
Once we have good accuracy in our classification of historical data, we can begin implementing our classifier in a production setting to stop fraud in its tracks.
Implementation of Machine Learning
Implementation can be tricky and has limitations. Even after we have found the model that classifies transactions with the best accuracy, we need to consider some other issues. The first has to do with false positives. Our goal is not only to stop fraudsters, but also to have as little impact on good consumers as possible. This means we must be very careful when evaluating false positives. In general, we make sure that any model we implement produces very few of them, usually flagging fewer than 1% of all healthy transactions (very high specificity). Another limitation has to do with fraud management platforms. For example, one platform might only allow rules-based models, while others might allow more complicated classifiers. In a rules-based system we must use decision trees or random forests, even if we could get better accuracy from black-box approaches such as neural networks.
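One way to enforce a cap on false positives, assuming the chosen model outputs a fraud score for each transaction, is to pick the flagging threshold on validation data. The sketch below is only an illustration of that idea: the function name, the scores, and the 1% default are assumptions, and it finds the lowest threshold (and therefore the most fraud caught) that keeps the false positive rate at or below the cap.

```python
def pick_threshold(scores, labels, max_fpr=0.01):
    """Lowest score threshold whose false positive rate stays within max_fpr.

    A transaction is flagged when its score is >= the threshold. `labels`
    uses 1 for fraud and 0 for healthy; at least one healthy row is assumed.
    """
    healthy_scores = [s for s, y in zip(scores, labels) if y == 0]
    # The false positive rate can only fall as the threshold rises, so the
    # first candidate (in ascending order) that satisfies the cap is the lowest.
    for threshold in sorted(set(scores)):
        false_positives = sum(1 for s in healthy_scores if s >= threshold)
        if false_positives / len(healthy_scores) <= max_fpr:
            return threshold
    return float("inf")  # no threshold meets the cap, so flag nothing


# Made-up validation scores: 200 healthy transactions and 3 fraudulent ones.
healthy = [i / 1000 for i in range(200)]
fraud = [0.90, 0.95, 0.15]
scores = healthy + fraud
labels = [0] * len(healthy) + [1] * len(fraud)

threshold = pick_threshold(scores, labels)
caught = sum(1 for s in fraud if s >= threshold)
```

In this toy data one fraudulent transaction scores below the threshold and is missed, which shows the trade-off: a tighter cap on false positives generally means lower sensitivity.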
The technology around fraud management services is constantly evolving. We are at a point where tools have become sophisticated enough to have a huge impact on stopping fraud, and they will only get better over time. This is crucial because fraudsters are constantly evolving as well, and our goal is to always stay one step ahead.
For further discussion around mitigating mobile fraud, contact Stuart Greenlee at [email protected].