The Predictive Analytics Story
If you’ve been reading me for a while, you already know how important it is to capture data on your ROI. If you know what’s working, you can put your money in the right place. I discuss this at length in this post on calculating ROI. Predictive analytics is one way to take that data from your past successes or failures, and stretch it into insights about your future.
To share a piece of what may at this point be marketing legend, consider the beer/diaper correlation. The story goes that one retail outlet decided to use predictive analytics to track what items their customers were purchasing. Their data showed an odd trend: diapers were being purchased alongside beer disproportionately often by men aged 30-40 after 7pm on Friday nights. Young men were running in for a late-night diaper restock, and grabbed a six pack while they were at it. According to the story, the retailer used this insight to place beer and diapers close to one another in the store, and sales of both zoomed.
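For the curious, here is a rough sketch of how a pairing like this might surface from checkout data. It computes "lift," a common market-basket measure of how much more often two items appear together than chance alone would suggest. The baskets below are invented purely for illustration, and the function name `pair_lift` is my own; nothing here is taken from the retailer's actual analysis.

```python
from collections import Counter
from itertools import combinations

# Hypothetical market baskets, invented purely for illustration.
baskets = [
    {"beer", "diapers", "chips"},
    {"beer", "diapers"},
    {"milk", "bread"},
    {"diapers", "milk"},
    {"beer", "chips"},
    {"beer", "diapers", "milk"},
    {"bread", "milk"},
    {"bread", "chips"},
]


def pair_lift(baskets):
    """Return {(item_a, item_b): lift} for every pair that co-occurs at least once."""
    n = len(baskets)
    # How many baskets contain each single item.
    item_counts = Counter(item for basket in baskets for item in basket)
    # How many baskets contain each pair (pairs are stored in alphabetical order).
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    lifts = {}
    for (a, b), together in pair_counts.items():
        # lift = P(a and b) / (P(a) * P(b)); a value above 1 means the two
        # items show up together more often than independent purchases would.
        lifts[(a, b)] = (together / n) / ((item_counts[a] / n) * (item_counts[b] / n))
    return lifts


lifts = pair_lift(baskets)
for pair, lift in sorted(lifts.items(), key=lambda kv: kv[1], reverse=True):
    print(pair, round(lift, 2))
```

A lift above 1 only says two items travel together in the data. It does not say that moving them next to each other on the shelf will raise sales.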
Big Names in Big Data
That isn’t the whole story, however. The real story tells an even deeper truth about chasing after marketing trends. While trying to find sources for the diaper-beer story, or even the name of the store, I came across this 2012 article at CanWorkSmart:
“The myth itself relates to a study done in June of 1992 when Thomas Blischok, then VP of industrial consulting for NCR (now spun off to TeraData), did an analysis for Osco Drug. They examined 1.2 million market baskets in 25 stores identifying over 20 different product couplings including beer and diapers, and fruit juice and cough syrup.
The story about how Osco moved beer next to the Diapers and both made more sales isn’t correct though. Osco took the NCR study and identified approximately 5,000 slow-moving SKUs in its inventory. After removing those items from the shelf, consumers, now finding more items they wanted easier, actually thought Osco’s selection had increased.”
The distinction is an important one. What Osco, the formerly anonymous store, did was mine data to determine overall trends. It did not discover a magic formula for which products to place together to exponentially increase sales. Why does this matter? Just ask any of the companies in the late ’90s that built huge, expensive data warehouses. Many were hoping to glean such clear-cut insights, only to have the project fail or never find any useful data at all. This article from Forbes all the way back in 1998 describes failed attempts by big names such as American Express and J.C. Penney to create large data centers to mine useful information for their predictive analytics campaigns.
Industries Benefiting From Predictive Analytics
So, the diaper-beer story shows us the basic idea of how predictive analytics is supposed to work. The “predictive” part comes in when you use past trends to try to predict future buying patterns. For example, a retail company might mine its checkout data to predict surges or troughs in demand. This allows it to stock items appropriately. According to SAS, a “financial services firm can use predictive analytics to predict the likelihood of fraud activity for any given transaction before it is authorized – within 40 milliseconds of the transaction initiation.”
Some public works and utilities firms are mining data to predict when huge machines, like wind turbines, will need maintenance. The US Census Bureau has long analyzed its data to understand population trends. Manufacturers can use data to optimize parts, service resources, distribution, and quality control efforts.
How Predictive Analytics Works
According to SAS, a provider of predictive analytics services,
“There are two types of predictive models. Classification models predict class membership. For instance, you try to classify whether someone is likely to leave, whether he will respond to a solicitation, whether he’s a good or bad credit risk, etc. Usually, the model results are in the form of 0 or 1, with 1 being the event you are targeting. Regression models predict a number – for example, how much revenue a customer will generate over the next year or the number of months before a component will fail on a machine.”
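To make the two model types concrete, here is a minimal sketch using only the Python standard library. The classification half uses a nearest-centroid rule (my choice for simplicity, not a method SAS describes) to answer "will this customer leave?" with a 0 or 1. The regression half fits an ordinary least-squares line to predict a revenue figure. All of the numbers, feature names, and function names are hypothetical.

```python
from statistics import mean

# --- Classification: predict class membership (1 = customer leaves) ---
# Hypothetical training data: months since last purchase, and whether the customer left.
months_inactive = [0, 1, 1, 2, 5, 6, 7, 8]
churned = [0, 0, 0, 0, 1, 1, 1, 1]


def fit_centroids(xs, labels):
    """Return the average feature value for each class label."""
    return {
        label: mean(x for x, y in zip(xs, labels) if y == label)
        for label in set(labels)
    }


def classify(x, centroids):
    """Assign x to the class whose centroid is closest."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


centroids = fit_centroids(months_inactive, churned)
will_leave = classify(7, centroids)  # a 0-or-1 answer, not a quantity

# --- Regression: predict a number (next year's revenue) ---
# Hypothetical training data: orders placed last year, and revenue the following year.
orders_last_year = [1, 2, 3, 4, 5]
revenue_next_year = [110, 205, 290, 410, 500]


def fit_line(xs, ys):
    """Ordinary least squares for one feature; returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return slope, y_bar - slope * x_bar


slope, intercept = fit_line(orders_last_year, revenue_next_year)
predicted_revenue = slope * 6 + intercept  # a dollar figure, not a class

print("Will leave (0/1):", will_leave)
print("Predicted revenue:", round(predicted_revenue, 2))
```

The difference shows up in the output: the classifier returns a category you can act on (target this customer with a retention offer or not), while the regression returns an amount you can budget around.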