Since Data Pipelines and AI Pipelines share similar components, does it make sense to combine them?

The answer is yes. Both pipelines draw on the same data sources, and the insights they produce can be sent to the same destinations. Data Scientists tasked with building an AI solution will prefer to avoid embarking on a complex data ingestion, transformation and storage project, and will naturally seek access to existing clean data sets.

Data Pipelines are generally built by Data Engineers and used by Business Users, whereas AI Pipelines are generally built and used by Data Scientists. Joining the AI Pipeline and the Data Pipeline creates a collaborative platform that allows Data Engineers and Data Scientists to work together, and allows Business Users to benefit from the predictive models as well. The resulting predictions and recommendations can be stored back in the Data Warehouse of the Data Pipeline, where they complement the information presented by the Reporting and Analytics tools for greater business insight.
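To make the idea of storing predictions back in the Data Warehouse more concrete, here is a minimal sketch in Python. It rests on several assumptions that do not come from any specific product: an in-memory SQLite database stands in for the Data Warehouse, the `customers` and `churn_predictions` tables are hypothetical, and the `score` function is a placeholder for a real trained model.

```python
import sqlite3


def score(monthly_spend: float) -> float:
    """Hypothetical stand-in for a trained model: returns a churn-risk score in [0, 1]."""
    return round(max(0.0, min(1.0, 1.0 - monthly_spend / 100.0)), 2)


def run_ai_branch(conn: sqlite3.Connection) -> int:
    """Read clean rows from the warehouse, score them, and store the predictions back."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS churn_predictions ("
        "customer_id INTEGER PRIMARY KEY, churn_risk REAL)"
    )
    # The AI branch reuses the clean data the Data Pipeline already produced.
    rows = conn.execute(
        "SELECT customer_id, monthly_spend FROM customers"
    ).fetchall()
    predictions = [(customer_id, score(spend)) for customer_id, spend in rows]
    # INSERT OR REPLACE keeps one prediction per customer across repeated runs.
    conn.executemany(
        "INSERT OR REPLACE INTO churn_predictions VALUES (?, ?)", predictions
    )
    conn.commit()
    return len(predictions)


def build_demo_warehouse() -> sqlite3.Connection:
    """Create an in-memory stand-in for the Data Warehouse with sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, monthly_spend REAL)"
    )
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(1, 20.0), (2, 80.0), (3, 150.0)],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    warehouse = build_demo_warehouse()
    run_ai_branch(warehouse)
    # A reporting tool can now join predictions with the existing business data.
    report = warehouse.execute(
        "SELECT c.customer_id, c.monthly_spend, p.churn_risk "
        "FROM customers c JOIN churn_predictions p "
        "ON c.customer_id = p.customer_id ORDER BY c.customer_id"
    )
    for row in report:
        print(row)
```

Because the predictions land in an ordinary warehouse table, the Reporting and Analytics tools can query them alongside the existing data without needing to know anything about the model itself.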

To correctly graft their AI Pipeline onto the existing corporate Data Pipeline, Data Scientists need to establish whether the existing data fits their needs or whether additional data sources need to be tapped. They’ll also need to determine their data access rights, for security reasons. Since predictive models have a relentless appetite for data, they’ll need to define how much additional disk space is required, as well as the architecture on which to run their AI models. Finally, Data Scientists will need an end-to-end AI management system that allows them to easily add their AI Pipeline branch to the existing Data Pipeline.

For more information, please read our two-part blog series called “Data Pipelines Vs. AI Pipelines, The Similarities and Differences – Analyzed” here.