Predictive modeling is a way of finding patterns and relationships in existing data and using those to predict what will happen in spaces where data isn’t available. Good predictive modeling has become a vital component of many fields and professions.
An organization has a problem it needs to solve and a cache of data about that problem. For example, a supermarket might like to know when its customers are likely to return to the supermarket using a history of previous visits. If the model predicts a drop off in visits, the supermarket can send that customer a promotion. We can make a prediction algorithm with that one customers data and compare the results with existing data. The closer the match the more confident we can be about future predictions.
There are many different approaches to predictive modeling, each with its own advantages. A company called Kaggle offers a new perspective by connecting organizations who have problems and data, with teams of data scientists in what Kaggle describes as online competitions. By having lots of data scientists attacking a problem in different ways, Kaggle’s view is that the best approach will out perform all others contributing to greater accuracy beyond the organization that might have just 1 or 2 data scientists on staff.
A competition run to predict the propensity of accidents by different vehicle models for a major insurance company improved on the benchmark by 340%. Their largest competition so far has been running for 3 years. Called the Heritage Healthcare prize, this $3 million bake off asks players to predict the likelihood of a patient going to the hospital in the next year. With that prediction, the competition host hopes to take preventive measures to help keep their patients healthy and avoid hospital stays.
In essence, Kaggle makes predictive analytics a sport.