A friend and I were chatting this afternoon about how automated machine learning/AI products like AWS’s new SageMaker will affect the role of data scientists. There is a real possibility that much of the current work that business-oriented data scientists do – model tuning, optimization, and deployment – will be automated. What’s a data scientist to do?
The conversation got me thinking about an old quote that salespeople toss around about the difference between selling features and selling results.
“People don’t want to buy a quarter-inch drill, they want a quarter-inch hole.” – Theodore Levitt
Many data scientists are in the drill and drill-bit business, focused on the ceremony of producing, tweaking, and deploying models. This is busy work that doesn’t necessarily add value for the business. As the ceremonial tasks of data science are automated, data scientists will move to the ‘quarter-inch hole’ end of the spectrum where they can focus full-time on delivering predictive and prescriptive results for the business.
As the current data science role becomes standardized and automated, I think you’ll see the role of a data scientist evolve to concentrate on two main areas:
- Domain expert. Automated data science tools will amplify a data scientist’s skillset and free up more time for value-add work like domain expert analysis, prediction, and prescription. With the tooling handling the routine steps, the data scientist can focus on producing real value and results. And data scientists may become – gasp! – analysts. Full circle.
- Data engineering. This is the most underrated area of data work, and one that (for now) won’t be automated. Personally, I find data engineering more interesting and fulfilling than the data science tasks of choosing and tuning algorithms. It’s gnarly work, but absolutely necessary for data science to work properly and scale in production. No data engineering = toy-level data science, at best.
These are exciting times for data scientists. It is still critical to learn the fundamentals of how algorithms work and their use cases. But, as SageMaker demonstrates, a lot of the busy work will be abstracted away behind the scenes. This opens giant opportunities for data scientists to push their skills into more valuable areas. Exciting times indeed.