Bridging the Skills Gap for AI and Machine Learning

Even as COVID-19 has slowed business investments worldwide, AI/ML spending is increasing. In a post for IDN, dotData’s CEO Ryohei Fujimaki, Ph.D., looks at the latest trends in AI/ML automation and how they will speed adoption across industries.


Ryohei Fujimaki, Ph.D., dotData



COVID-19 has impacted businesses across the globe, from closures to supply chain interruptions to resource scarcity. As businesses adjust to the new normal, many are looking to do more with less and find ways to optimize their current business investments. 


In this resource-constrained environment, many types of business investments have slowed dramatically. That said, investments in AI and machine learning are accelerating, according to a recent Adweek survey.


Adweek found that two-thirds of business executives say COVID-19 has not slowed AI projects. In fact, some 40% of respondents told Adweek that the pandemic has accelerated their AI/ML efforts. Reasons for the sustained and growing interest in AI/ML include decreasing costs, improving performance, and increasing efficiencies, all efforts to make up for time and output lost during the COVID-19 slowdown.


Despite the rosy outlook for AI/ML investments, businesses admit they still struggle to scale these technologies beyond proofs of concept (PoCs). The cause is an ongoing talent shortage in the data science field, a shortage that COVID-19 has made even more acute.


Data science is an interdisciplinary approach that requires cross-domain expertise, including mathematics, statistics, data engineering, software engineering, and subject matter expertise.  


The shortage of data scientists, as well as data architects and machine learning engineers skilled in building, testing, and deploying ML models, has created a big challenge for businesses implementing AI and ML initiatives, limiting the scale of data science projects and slowing time to production. The scarcity has also created a quandary for organizations: how can they change the way they do data science and empower the teams they already have?


The democratization of data science is an important industry trend, but true democratization has never been easy for organizations. Analytics and data science leaders lament that their teams can manage only a few projects per year. BI leaders, on the other hand, have been trying to embed predictive analytics in their dashboards but face the daunting task of learning how to build AI/ML models. What can organizations do? What tactics will help them scale AI initiatives and bridge the gap between what is required and what is available?

Effective Democratization Needs AI Automation 

True democratization of data science means empowering teams with advanced analytical tools and automation technologies.


These tools can significantly simplify tasks that formerly could only be completed by data scientists. They are empowering business analysts, BI developers and data engineers to execute AI and machine learning projects. Further, they accelerate data science processes with very little training. 

Notable among these offerings are: 

  • AutoML, a class of tools that usually refers to the automated building of ML models, or that provides end-to-end data science automation, from data preparation through building and operationalizing models. Other technologies in this category include data science automation and ML operationalization platforms.
  • End-to-end data science automation (from raw business data through data prep and feature engineering to machine learning) is enabling enterprises to build effective data science teams at minimal cost, using their current talent.

This class of automation tools removes much of the time and expense needed to design and deploy AI-powered analytics pipelines, and does so at little cost and without high-priced technical staff.
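The automated model building described above can be pictured as a search loop: fit several candidate models, score each on held-out data, and keep the best. The following is a minimal sketch of that idea, not any vendor's implementation. The function names, the two toy candidate models, and the data are all hypothetical and chosen only for illustration.

```python
# Hypothetical candidate "models": each fit function returns a predictor.
def fit_mean(xs, ys):
    """Baseline: always predict the mean of the training targets."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Ordinary least-squares fit of a straight line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def mean_squared_error(model, xs, ys):
    """Average squared prediction error of a fitted model on (xs, ys)."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def select_model(candidates, train, holdout):
    """Fit every candidate on the training split and score it on the holdout split."""
    scores = {}
    for name, fit in candidates.items():
        model = fit(*train)
        scores[name] = mean_squared_error(model, *holdout)
    best_name = min(scores, key=scores.get)
    return best_name, scores


# Toy data, invented for this example.
train = ([1.0, 2.0, 3.0, 4.0], [2.1, 3.9, 6.2, 7.8])
holdout = ([5.0, 6.0], [10.1, 11.9])

best_name, scores = select_model(
    {"mean": fit_mean, "line": fit_line}, train, holdout
)
print(best_name, scores)
```

Real tools search far larger spaces of algorithms and settings, but the structure stays the same: a person defines the data and the metric, and the loop replaces the manual trial and error.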


Today, a typical data team is interdisciplinary and consists of data engineers, data analysts and data scientists. The data analyst and engineer are responsible for cleaning, formatting and preparing data for the data scientist, who then uses the analytics-ready data to build features and then ML models using a trial-and-error approach.


Data science processes are complicated, highly manual, and iterative in nature. Depending on the maturity of the data pipelines, a data science project can take from 30 to 90 days to complete, with nearly 80% of the effort spent on AI-focused data preparation and feature engineering.

Further, the AI-focused data preparation process demands considerable hacking skill from developers, data scientists and data engineers, who must clean, manipulate and transform the data before data scientists can execute feature engineering.


That said, the landscape is changing. Tools are now surfacing that use AI automation to connect to data, pre-process it, and automatically build features and ML models. These tools reduce the need for a large team and let the work be done efficiently and at the greatest possible speed.


In addition, feature engineering automation has vast potential to change the traditional data science process. Feature engineering involves the application of business knowledge, math, and statistics to transform data into a format that can be directly consumed by machine learning models. 


Automating feature engineering can also lower skill barriers beyond what ML automation alone achieves, eliminating hundreds or even thousands of manually crafted SQL queries and speeding up the data science project even without full domain knowledge.
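To make the idea concrete, here is a minimal sketch of the kind of hand-written feature engineering that such automation aims to replace: turning raw transaction rows into one row of numeric features per customer. The table layout, the feature names, and the data are hypothetical, and the sketch uses plain Python in place of the SQL queries mentioned above.

```python
from collections import defaultdict
from datetime import date

# Hypothetical raw transaction log: (customer_id, purchase_date, amount).
transactions = [
    ("c1", date(2020, 5, 1), 40.0),
    ("c1", date(2020, 6, 15), 60.0),
    ("c2", date(2020, 6, 20), 25.0),
]


def build_features(rows, as_of):
    """Aggregate raw rows into one feature row per customer."""
    grouped = defaultdict(list)
    for customer_id, purchased_on, amount in rows:
        grouped[customer_id].append((purchased_on, amount))

    features = {}
    for customer_id, purchases in grouped.items():
        amounts = [amount for _, amount in purchases]
        last_purchase = max(day for day, _ in purchases)
        features[customer_id] = {
            "purchase_count": len(purchases),
            "total_spend": sum(amounts),
            "avg_spend": sum(amounts) / len(amounts),
            # Recency is measured against a fixed reference date.
            "days_since_last_purchase": (as_of - last_purchase).days,
        }
    return features


features = build_features(transactions, as_of=date(2020, 7, 1))
for customer_id, row in features.items():
    print(customer_id, row)
```

Each feature here encodes a small business assumption (for example, that recency matters). A data scientist would normally write and test many such aggregations by hand, which is the effort that feature engineering automation is meant to cut.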


Organizations with large data science teams will also find automation platforms very valuable. These platforms free highly skilled staff from many of the manual, time-consuming tasks in data science and machine learning workflows and allow them to focus on more complex and challenging strategic tasks.


The trend is to leverage automation technologies to speed up the ML development process. By using AI automation technologies, BI developers and junior data scientists can automatically build models. This frees up time for experienced data scientists to take on more challenging business problems. While much of the early focus was on automating the building of ML models, the industry is moving towards automating the entire AI/ML workflow.


This empowers data scientists to achieve higher productivity and drive greater business impact than ever before.

Upskill Current Employees

Another important tactic for bridging the skills gap in data science is ongoing skills training for the AI, data science and business intelligence teams. 


Rather than hiring outside talent from an already shallow pool, companies are often better off investing time and resources in data science training for their existing employees. These citizen data scientists can bridge the skills gap, address the labor shortage and enable companies to make the most of the resources they already have.


There are many advantages to this approach. 


The idea is to build a team from inside the company rather than hiring experts from outside. Any transformation will succeed only if it is embraced by the vast majority. Creating internal AI teams, empowering citizen data scientists and scaling pilot programs focused on AI is the right approach.


One of the most important advantages is building data science skills across multiple teams to support the democratization of data science across the organization. This strategy can be implemented by first identifying employees with existing programming, analytical and quantitative skills and then augmenting those skills with the required data science skills and tools training. Experienced data scientists can play the role of evangelist, sharing data science best practices and guiding the citizen data scientists through the process.


AI- and ML-driven innovation is becoming indispensable as more enterprises transform themselves into data-driven organizations. Building a strong analytics team, while challenging in today’s resource-scarce environment, is attainable by using appropriate automation tools. The benefits of this approach include:

  • Significantly shortening the learning curve.
  • Empowering citizen data scientists with skills and knowledge.
  • Supporting the data science team through a data-driven culture.

These factors not only help fill the skills gap but also help accelerate both data science and business innovation, delivering greater and broader business impact.


Ryohei Fujimaki, Ph.D. is the CEO and co-founder of dotData. Prior to founding dotData, he was the youngest research fellow ever in NEC Corporation’s 119-year history. During his tenure at NEC, Ryohei was heavily involved in developing many cutting-edge data science solutions. Ryohei received his Ph.D. degree from the University