How To Choose The Right Data Storage Strategy for Machine Learning


A lot has been written about Data Warehouses and Data Lakes, and about which approach is better for data scientists who build, train, test and deploy Artificial Intelligence (AI) and Machine Learning (ML) algorithms. Complicating the issue, the latest data management advancements are blurring the line between the Data Warehouse and the Data Lake, and they offer the possibility of scaling efficiently to exabytes.


In this post I quickly go over the use cases the Data Warehouse and the Data Lake were traditionally designed for, and the new data management technologies data scientists can now choose from. I then focus on which approach is better suited to AI.


Data Warehouses and Data Lakes are both used for storing large amounts of data. Both use data extracted from transactional systems, IoT devices and external data sources. Both can keep large amounts of historical data, so data scientists can perform trend analysis and compare today’s numbers with past data.

