The Problem with Machine Learning

 

 

Machine learning, AI, deep learning, neural networks: the biggest topics in the tech world and the hottest buzzwords in industry. Put them on your CV and your job prospects will certainly improve. Offer them as services in your company and watch your marketability go up.

 

Or so you would think.

 

But there’s an almost unspoken problem with these buzzwords, these holy grails: as much as every company would love to have them incorporated into their product in some manner, it’s just… too much work. Too complicated, too expensive. Whether it’s a small team working on a web app or a multi-million-dollar company, there is an almost comedic paradox between how many companies would like to use machine learning in their processes and how many actually do. An often-reported figure is that fewer than 9% of the companies that could implement machine learning actually do so [1]. Whilst this number is on the rise, it’s still amazing that a technology that has been around for a while, that has been proven, and that is seen as extremely beneficial has seen so little adoption.

 

So what’s the problem here? 

 

Costs. Lack of infrastructure. Talent deficit. Time. Inexperience. Most likely you’ve read in other articles that these are the biggest issues. In fact, these issues have largely been addressed by the ongoing rise of AI orchestration platforms. Their very existence removes the need to build a rigorous infrastructure yourself, which lowers costs. Many ML orchestration platforms also try to address inexperience and the talent deficit in several ways. The most common and simplest is a guide that introduces those with little experience to the exceedingly large ML pool, while also providing a necessary introduction to the product. Others have tried to wrap popular machine learning libraries like scikit-learn in a more user-friendly form via a “simple” UI involving next to no user-written code, giving those with less experience (or in some cases, none at all) the ability to produce a fully functional pipeline, even if it isn’t optimized for their data. This strategy at least ensures that new users can get a taste of machine learning at little cost in money or time.

 

It also doesn’t help that machine learning has become a victim of its own success. To those outside the field, its benefits have been exaggerated, and something that good surely has a price, one proportional to the perceived benefits. So these overhyped (and purely hypothetical) benefits equate to big costs, and a lot of companies just don’t want to invest the perceived time, effort, money, and knowledge, and of course the expensive data scientists, that come with such a technology. As you might have picked up on, the operative word here is “perceived”. Yes, all these concerns are valid, but not as much as people think. Whilst this misconception isn’t universal, sadly it is the image of machine learning held by the people that matter.

 

However, I’ve found that the problems start before we even reach the issues usually brought up, before concerns about data availability or costs are even considered. When I first got into machine learning, I was blown away by its potential. Even if the time and effort needed to become competent at it were large, the benefits seemed worth it. So I was surprised to learn that a large number of companies were not even considering using it. I asked around for an answer to this paradox of “we want it” versus “we have no plans to use it”, and the answer turned out to be quite simple and understandable: the vast majority of companies and teams that I asked just didn’t know how machine learning could benefit them. That makes a lot of sense. If you don’t have someone who knows something about machine learning to advise you, you are unlikely to identify exactly how ML can benefit you. Worse, you probably won’t even investigate how it could help, which I believe explains the low ML adoption figures. So we end up in a chicken-and-egg scenario: you need someone with ML experience to begin understanding how ML can benefit you, and of course you won’t hire such a person unless you already know how ML can benefit you.

 

For the companies that do get past this entry-level hurdle, the issues start at the other end. After the pipelines have been tested, after the costs have been paid and the personnel hired, some projects still end up in data limbo or resort to unnecessary workarounds for production problems.

 

When I talked to some companies that had reached this point, their implementation of ML was… verbose, but out of necessity. They had everything ready and working from an ML perspective, but they were training and modifying their models locally and uploading the predicted data to whatever cloud provider they were using, rather than having everything contained in one place. The issues were scalability, deployment, automation, and resilience to time, the last consisting mostly of model maintenance, monitoring, updates, and modification as datasets expand and new data becomes available. These issues are much more difficult to solve and will take time to resolve. However, I believe that as AI orchestration platforms continue to grow, they will keep being addressed and solutions will be presented.
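To make “resilience to time” a little more concrete, here is a minimal sketch of the kind of monitoring check a team might otherwise run by hand. It is an illustration only, not how any particular company or platform does it: the class name AccuracyMonitor, the baseline accuracy, the tolerance, and the window size are all assumptions made for the example.

```python
from collections import deque


class AccuracyMonitor:
    """Hypothetical helper: tracks recent prediction accuracy and flags
    when it falls too far below the accuracy measured at deployment."""

    def __init__(self, baseline_accuracy, tolerance=0.05, window_size=100):
        self.baseline_accuracy = baseline_accuracy  # accuracy at deployment (assumed known)
        self.tolerance = tolerance                  # allowed drop before we raise a flag
        self.window_size = window_size
        # Only the most recent `window_size` outcomes are kept.
        self._outcomes = deque(maxlen=window_size)

    def record(self, prediction, actual):
        """Store whether a single prediction matched the observed outcome."""
        self._outcomes.append(prediction == actual)

    def rolling_accuracy(self):
        """Accuracy over the current window, or None if nothing is recorded."""
        if not self._outcomes:
            return None
        return sum(self._outcomes) / len(self._outcomes)

    def needs_retraining(self):
        """True once the window is full and accuracy has dropped past the tolerance."""
        if len(self._outcomes) < self.window_size:
            return False
        return self.rolling_accuracy() < self.baseline_accuracy - self.tolerance
```

In practice, something like this would run wherever the model is served, so that a drop in quality triggers retraining automatically instead of relying on someone noticing it after a manual upload.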

 

Machine learning can be complicated, and machine learning on the cloud can be extremely complicated. But it doesn’t have to be. It is more than worth the time and energy to focus on these issues and make ML more accessible to those who have yet to adopt it.

 

[1] Advanced Technologies Adoption and Use by U.S. Firms: Evidence from the Annual Business Survey