A leading US retailer approached KPMG with an ambitious goal: to develop a predictive risk monitoring solution that could proactively identify and quantify risks to its business such as employee theft, counterfeit products, litigation, customer safety issues, money laundering and more, before costly incidents occurred instead of reacting to them after the fact.
By using modern artificial intelligence (AI) technologies, including natural language processing (NLP), to analyze vast amounts of publicly available data, including unstructured data from ecommerce websites and other data sources, the retailer hoped it could illuminate these potential risks well in advance and eliminate any surprises.
A leading US retailer wanted to proactively identify, quantify and respond to risks to its business, such as employee theft, litigation, counterfeit products and more, before costly incidents occurred instead of reacting to them after the fact.
We helped design and deploy a predictive risk monitoring solution that uses advanced artificial intelligence (AI) technologies including natural language processing (NLP) and several machine learning techniques to analyze vast amounts of publicly available data, including product reviews, descriptions and other related data, to track and quantify more than a dozen key risks to the retailer.
With our team of experienced AI and NLP technology professionals, our leading AI and NLP tools and accelerators, and our extensive background in risk management, KPMG was able to build and deploy the first release of the solution in just four weeks.
Two of the retailer’s key predictive risk goals were to detect and prevent the sale of counterfeit products on its online store, and to identify which retail locations were more likely to face litigation related to the COVID-19 pandemic.
The retailer’s partner program enables smaller businesses and individuals to sell products under its brand through its online store. Occasionally a few rogue sellers would offer counterfeit goods. But with thousands of partner products for sale, it was nearly impossible to detect such goods by manually investigating each product listing.
The retailer’s litigation concerns stemmed directly from its past experience. The retailer, which also operates one of the top pharmacies in the US, had been caught by surprise when litigation tied to the opioid crisis appeared almost overnight. This happened even though the retailer had systems designed to examine public sentiment, which it had hoped would be a viable predictor of such risks but which proved not to be.
For example, to detect if a product might be a counterfeit, the AI models were trained to look for:
Next, we identified the publicly available sources of such data for each proxy and then developed the systems to acquire, preprocess, cleanse and transform it.
For example, to detect counterfeit products, one model would focus on the content of product reviews while another would focus on the online behavior of the reviewers.
Finally, we created personalized dashboards for each member of the retailer’s team so they could visualize the results and drill down to analyze the details. The system scores each risk factor from 1 to 5, with a higher score indicating a greater potential risk, and displays those scores on the dashboard. Team members can click on a score to see the supporting evidence behind it.
For example, for each partner product it analyzes, the system provides separate scores for the product’s description, its reviews and its pricing, and automatically flags suspicious outliers or anomalous offerings for further review by the antifraud team. By drilling down, the team can look at the actual product description provided by the seller, for example, side by side with that of the manufacturer and see the highlighted words or sentences that the system found to be suspicious.
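As a rough illustration of the scoring and flagging described above, the sketch below maps a model's estimated risk probability onto a 1-to-5 scale and flags listings whose price is an outlier relative to comparable listings. The function names, the equal-width score bands and the median-absolute-deviation threshold are all assumptions made for illustration; the case study does not describe how the deployed system computes its scores.

```python
from statistics import median


def to_risk_score(probability: float) -> int:
    """Map a risk probability in [0, 1] onto a 1-5 score (5 = highest risk).

    Equal-width bands are an assumption made for this sketch.
    """
    clamped = min(max(probability, 0.0), 1.0)
    return min(5, int(clamped * 5) + 1)


def flag_price_outliers(prices: list[float], threshold: float = 3.5) -> list[bool]:
    """Flag prices that sit far from the median of comparable listings.

    Uses the modified z-score based on the median absolute deviation (MAD),
    which is less distorted by the outliers themselves than a mean-based z-score.
    """
    mid = median(prices)
    mad = median(abs(p - mid) for p in prices)
    if mad == 0:
        # At least half of the prices equal the median: flag anything that differs.
        return [p != mid for p in prices]
    return [abs(0.6745 * (p - mid) / mad) > threshold for p in prices]
```

In a setup like this, a flagged price would only raise the pricing score for that listing; the decision about whether the product is actually counterfeit would still rest with the antifraud team.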
For identifying potential litigation hotspots, the retailer’s team could see scores for the threat level COVID-19 posed to the community where each store was located, the community’s current rate of COVID-19 transmission and the economic hardship of that community, including its unemployment rate and the number of bankruptcy claims. The team could drill down to see a summary of related litigation that had been recently filed in the community.
KPMG Signals Repository is an active listening platform that can deliver unprecedented insights into market dynamics in real time. It continuously harvests data from more than 55,000 public and private sources — everything from unemployment data and crime statistics to school schedules and real estate prices. It transforms both structured and unstructured data into signals to help significantly improve the accuracy of predictions. It uses ML to automatically identify and validate correlations among data sources and create a subset or "bundle" of sources that are shown to have an impact on the factor being analyzed.
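The idea of narrowing many sources down to a "bundle" that correlates with the factor being analyzed can be sketched very simply. The example below ranks candidate signals by the absolute value of their Pearson correlation with a target series and keeps those above a cutoff. This is a deliberately simplified stand-in, not a description of how KPMG Signals Repository works internally: the function names, the choice of Pearson correlation and the 0.5 cutoff are assumptions.

```python
from math import sqrt


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def select_bundle(
    signals: dict[str, list[float]],
    target: list[float],
    min_abs_corr: float = 0.5,  # hypothetical cutoff
) -> list[str]:
    """Return the names of signals correlated with the target, strongest first."""
    scored = {name: pearson(values, target) for name, values in signals.items()}
    kept = [name for name, r in scored.items() if abs(r) >= min_abs_corr]
    return sorted(kept, key=lambda name: abs(scored[name]), reverse=True)
```

A correlation filter like this only shortlists candidates. Validating that a source genuinely improves predictions would require testing it against held-out outcomes.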
The solution was built and deployed on Google Cloud Platform (GCP). We used various GCP services, including Cloud Storage for staging the data, Cloud Functions for the serverless functions required during ingestion, BigQuery as the data lake and Looker for reporting and data visualization. Various ML and NLP models were trained using Google Compute Engine, Cloud Natural Language and AutoML.
On top of these technologies, we were able to build the custom predictive models required to support the retailer’s goals. We also integrated the solution with the retailer’s existing case management system to facilitate workflows that followed from the identification of risks.
This solution is one of the first to apply Bidirectional Encoder Representations from Transformers (BERT) to predicting risk. The key is the "Bidirectional" part of "BERT": the model considers content on both sides of the word it’s examining to better understand context. A good example is "a river bank" and "a bank deposit." An NLP model that looks at words on both sides of "bank" is more likely to get the proper meaning than one that looks only at the words that precede it.
The next phase of the project will be to incorporate the retailer’s proprietary data into the models to augment the publicly available sources to enhance risk prediction.
Our ability to handle all aspects of the project from design through deployment enabled our client to hire just one vendor and helped to accelerate and derisk the development process. Although the solution is designed to be platform agnostic, we were able to use our experience with the client’s platform of choice to leverage many of its key features, which also helped reduce the development cycles.
Unlike business-only consultancies, our more than 15,000 technology professionals have the resources, engineering experience, battle-tested tools and close alliances with leading technology providers to deliver on your vision — quickly, efficiently and reliably. And unlike technology-only firms, we have the business credentials and sector experience to help you deliver measurable business results, not just blinking lights.
Our professionals immerse themselves in your organization, applying industry knowledge, powerful solutions and innovative technology to deliver sustainable results. Whether it’s helping you lead an ESG integration, risk mitigation or digital transformation, KPMG creates tailored data-driven solutions that help you deliver value, drive innovation and build stakeholder trust.