Test-driven data preparation is used mainly in the fields of big data, smart data and artificial intelligence. It describes a method in which data is prepared and checked so that it can be used reliably for further analysis or machine learning.
Instead of simply collating all the data, test-driven data preparation applies tests early in the process. These tests automatically check whether the data is complete, correct and usable, so that errors and gaps are recognised and corrected early rather than surfacing during analysis.
A practical example: a company wants to use artificial intelligence to predict which products will soon be in particularly high demand. This requires clean, up-to-date sales data from various sources. With test-driven data preparation, automatic tests are first defined to check that no important figures are missing and that all data is in the correct format. Only once these tests have passed is the data released for analysis.
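The sales example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (`product_id`, `units_sold`, `sale_date`) and the helper functions `check_record` and `release_for_analysis` are hypothetical and not part of any particular tool. The sketch checks each record for missing values and for the expected format, and releases the data only if every record passes.

```python
from datetime import date

# Hypothetical schema for a sales record; real projects define their own.
REQUIRED_FIELDS = ("product_id", "units_sold", "sale_date")


def is_missing(value):
    """Treat None and the empty string as missing values."""
    return value is None or value == ""


def check_record(record):
    """Return a list of problems found in one sales record (empty if clean)."""
    problems = []

    # Completeness: every required field must be present and non-empty.
    for field in REQUIRED_FIELDS:
        if is_missing(record.get(field)):
            problems.append(f"missing {field}")

    # Format: units_sold must be a non-negative integer (bool is excluded).
    units = record.get("units_sold")
    if not is_missing(units):
        if isinstance(units, bool) or not isinstance(units, int) or units < 0:
            problems.append("units_sold must be a non-negative integer")

    # Format: sale_date must be a string that parses as an ISO 8601 date.
    sale_date = record.get("sale_date")
    if not is_missing(sale_date):
        try:
            date.fromisoformat(sale_date)
        except (TypeError, ValueError):
            problems.append("sale_date must be an ISO 8601 date")

    return problems


def release_for_analysis(records):
    """Return (released_records, report).

    The data is released only if every record passes all checks;
    otherwise nothing is released and the report maps each failing
    record's index to its list of problems.
    """
    report = {}
    for index, record in enumerate(records):
        problems = check_record(record)
        if problems:
            report[index] = problems
    released = list(records) if not report else []
    return released, report
```

Calling `release_for_analysis` on a list of record dictionaries returns the records unchanged when all checks pass. If any record fails, it returns an empty list together with a report of what needs to be corrected. Blocking the whole batch is a design choice made here for simplicity; a real pipeline might instead quarantine only the failing records.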
This approach makes results more reliable, saves time in subsequent analyses because data errors do not have to be traced back later, and makes data projects more likely to succeed.















