The value of machine learning: best practices
In this blog post, we look at how to prove machine learning value over time and how to understand the business value of machine learning projects.
Technical issues are rarely what cause machine learning (ML) projects to fail. Most often, they fail because of poor communication and planning. Much of this communication, or lack thereof, concerns the project’s goals, which are directly tied to the project’s value. All too often, an unclear or poorly stated value proposition kills an ML project before it has had a chance to succeed.
A key question in any machine learning project, both early on and later, is: What machine learning value are you hoping to get from it, and how will you prove that value? Without a clear plan with clear goals and milestones, you are dooming your ML projects to rushed, impetuous, and/or arbitrary decisions that will likely lead to the project’s failure.
Let’s first discuss why so many ML projects fail due to poor communication and planning, then discuss what you can do to avoid this and create an adaptable plan that accounts for smaller failures so that the bigger project can succeed.
Model Drift: how machine learning models lose value
In this new, data-driven world, Machine Learning, Artificial Intelligence and Deep Learning models have become significant drivers of business decisions. But just like the business strategies and models of old, ML models need to be periodically revisited and sometimes tweaked to ensure they are still working toward your final project goals.
Model drift is the decay in a model’s predictive performance as the data it sees in production moves away from the data it was trained on, for example when the relationship between the target variable and the independent variables changes over time. Because of this drift, the model becomes increasingly unreliable and produces increasingly erroneous results. Left unaddressed, the model loses its value, gets discarded, and the entire project has to be restarted.
There are two main types of machine learning model drift:
- Concept drift, when the statistical properties of the target variable, and in particular its relationship to the predictors, change, so that what the model learned becomes obsolete.
- Data drift, when the statistical properties of the predictors change.
Data drift is generally the more common of the two. Examples include season-related pattern changes (e.g., summer models not working in the winter) and changes in customer behavior or preferences over time. The mass adoption of smartphones, for instance, would have required any mobile phone manufacturer that entered the market pre-smartphone to significantly alter its models to account for the new buying preference.
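To make the idea of data drift concrete, here is a minimal sketch that compares the distribution of one predictor in the training data against recent data using the Population Stability Index (PSI). The article does not prescribe a detection method; PSI is swapped in here as one common choice. The function name, the temperature data, and the equal-width binning are all assumptions made for illustration.

```python
import math
import random


def population_stability_index(reference, recent, bins=10, eps=1e-6):
    """PSI between two numeric samples, binned over the reference range."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all reference values are equal

    def bin_shares(values):
        counts = [0] * bins
        for v in values:
            # Values outside the reference range are clamped into the edge bins.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # eps keeps empty bins from producing log(0) or division by zero.
        return [max(c / len(values), eps) for c in counts]

    ref_shares = bin_shares(reference)
    new_shares = bin_shares(recent)
    return sum((n - r) * math.log(n / r) for r, n in zip(ref_shares, new_shares))


# Hypothetical daily temperatures (degrees Celsius) seen by a demand model.
rng = random.Random(0)
summer_training = [rng.gauss(25, 3) for _ in range(5000)]
summer_recent = [rng.gauss(25, 3) for _ in range(5000)]
winter_recent = [rng.gauss(5, 3) for _ in range(5000)]

print("summer vs summer:", round(population_stability_index(summer_training, summer_recent), 3))
print("summer vs winter:", round(population_stability_index(summer_training, winter_recent), 3))
```

A commonly cited rule of thumb treats a PSI below 0.1 as stable and above 0.25 as a significant shift, but these cut-offs are conventions, not guarantees. A high PSI only says the inputs have moved; whether the model's predictions have actually degraded still has to be checked against real outcomes.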
And now the question naturally comes up: How do you address model drift?
A Quick Guide to Retraining Your Models
Avoiding model drift helps preserve Machine Learning value. It starts with treating your ML model as a continuous process of training and retraining, not a static artifact that either ‘works’ or doesn’t ‘work’: model training never stops. By continuing to retrain your models on new data, you make them more resilient to changes in your data composition, whether due to external factors like seasonality or internal ones such as equipment upgrades.
Specifically, these are the key points to consider when retraining your models and avoiding ML model drift:
Build it, and then adjust it over time
This is not the same as adjusting “on the fly”. You are building your initial iteration and then doing scheduled retests to ensure it’s still working. It could be six months later, or one year later, and then every six months or every year after that.
The point is: it takes iterations to prove Machine Learning value. You can’t expect meaningful results right off the bat. Not with the first model, not with the second, even if the second is better than the first. A model that worked in 2010 will not necessarily work well in 2021. Things change: customers, real-life events, technologies. This might seem obvious, but many organizations overlook it, and so they dismiss ML too quickly because it didn’t work the first or second time around.
They need to build a first model and then iterate to improve it over time, in what is a continuous process.
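The scheduled retest described in this section can be sketched as a small check that scores the current model on recently labeled data and flags it for retraining when accuracy slips. This is only an illustration: `scheduled_retest`, `RetestResult`, the 5-point tolerance, and the toy temperature "model" are hypothetical names and values, not part of any particular platform.

```python
from dataclasses import dataclass


@dataclass
class RetestResult:
    accuracy: float
    needs_retraining: bool


def scheduled_retest(predict, recent_examples, baseline_accuracy, tolerance=0.05):
    """Score the model on recent labeled data and flag it for retraining
    when accuracy has fallen more than `tolerance` below the baseline."""
    if not recent_examples:
        raise ValueError("need at least one recent labeled example")
    correct = sum(1 for features, label in recent_examples if predict(features) == label)
    accuracy = correct / len(recent_examples)
    return RetestResult(accuracy, accuracy < baseline_accuracy - tolerance)


# Hypothetical "model": predicts high demand whenever the temperature exceeds 20 degrees.
def summer_model(temperature):
    return temperature > 20


# Hypothetical labeled observations: (temperature, was demand actually high?).
summer_data = [(28, True), (25, True), (18, False), (30, True), (15, False)]
winter_data = [(5, True), (8, True), (2, False), (10, True), (12, False)]

print(scheduled_retest(summer_model, summer_data, baseline_accuracy=1.0))
print(scheduled_retest(summer_model, winter_data, baseline_accuracy=1.0))
```

Run on a fixed schedule (every six months, say), a check like this turns "is the model still working?" into a recorded number, which also gives you a history to point to when you need to show value over time.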
Incorporate data from real-world events
Proving Machine Learning value is not about adjusting in real time but about adjusting over time as you get more input from real life. You need a minimum of data, for sure, but that’s not enough. You can create an initial model to help a human detect important elements, but the human then needs to make judgments based on the first results and compare those results with what actually happens in the real world.
The self-driving car is a good example: Cars don’t become self-driving overnight. Before becoming full-fledged self-driving vehicles, they start with driving assistance, like lane change warnings, brake assistance, or adaptive cruise control.
Encryption is another example of a technology that evolved over time as computing power increased. Early schemes weren’t perfect, and hackers were able to crack them, but people still used them. Over time, the codes have become stronger thanks to better encryption and more computing power.
Don’t shoot for perfection
Many companies and ML scientists make the “all or nothing” mistake when trying to prove Machine Learning value. That is, by holding out for 99% accuracy, they miss out on the potentially huge value their ML model could already deliver at 90% accuracy. This is also known as letting the perfect be the enemy of the good.
If you’re trying to train your model to be 99% accurate, you may need years of data and training. A model with 90% accuracy, by contrast, might be ready in a few weeks, so you can start deriving value from it much sooner.
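A back-of-the-envelope calculation shows why shipping early can win. The sketch below assumes, purely for illustration, that value accrues linearly with accuracy for every week a model is live, that the 90% model takes 4 weeks to build, that the 99% model takes 104 weeks, and that you are planning over a three-year horizon. All of these numbers are hypothetical.

```python
def cumulative_value(accuracy, weeks_until_ready, horizon_weeks, value_per_week=1.0):
    """Value accumulated over the horizon by a model that goes live after
    `weeks_until_ready` weeks, assuming value scales linearly with accuracy."""
    weeks_live = max(horizon_weeks - weeks_until_ready, 0)
    return accuracy * value_per_week * weeks_live


HORIZON_WEEKS = 156  # hypothetical three-year planning horizon

good_enough = cumulative_value(accuracy=0.90, weeks_until_ready=4, horizon_weeks=HORIZON_WEEKS)
near_perfect = cumulative_value(accuracy=0.99, weeks_until_ready=104, horizon_weeks=HORIZON_WEEKS)

print(f"90% model live after 4 weeks:    {good_enough:.1f} value-weeks")
print(f"99% model live after 104 weeks:  {near_perfect:.1f} value-weeks")
```

Under these assumed numbers, the 90% model accumulates 136.8 value-weeks against 51.5 for the 99% model. The linear assumption will not hold everywhere: where a single error is very costly, the balance can shift, so plug in your own estimates before drawing conclusions.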
Remember: there isn’t ONE RULE for every model. Sometimes a model can prove its value over time. Some models will need to be retrained much more frequently or require a massive data set. But you can probably have or add value early on and put that value to good use. You can do this with a Machine Learning platform like the ForePaaS Platform.
Don’t wait until you have the perfect model to prove Machine Learning value. You can start now, and even if you know that the model isn’t perfect, it will become better over time.
For more articles on cloud infrastructure, data, analytics, machine learning, and data science, follow Paul Sinaï on Towards Data Science.