Quality metrics for training data are particularly important in the fields of artificial intelligence, big data, smart data and digital transformation. Training data is the basis on which computer programmes, for example for speech or image recognition, "learn". For artificial intelligence to work reliably, this training data must be of good quality.
Quality metrics are key figures used to check exactly that. They help to determine how complete, correct and relevant the data is before it is fed into a system. This reduces the risk that incorrect or biased data leads to poor results or wrong decisions.
An illustrative example: a company wants to develop an AI system that automatically sorts emails. To ensure that the programme does this reliably, quality metrics for training data check whether all emails are correctly labelled, whether any important emails are missing and whether the data set contains duplicates. The system only learns correctly from good training data, which in turn improves automation and saves time and costs. Quality metrics for training data therefore help to make artificial intelligence more accurate and more reliable.
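
The three checks from the email example can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function name `email_quality_metrics`, the label set `ALLOWED_LABELS` and the sample emails are all hypothetical. Note also that the code can only check whether a label is present and belongs to the expected set; whether a label is actually the right one for an email still requires human review.

```python
from collections import Counter

# Hypothetical label set for the email-sorting example (an assumption).
ALLOWED_LABELS = {"invoice", "support", "spam"}


def email_quality_metrics(emails, allowed_labels=ALLOWED_LABELS):
    """Compute simple quality metrics for a list of {'text', 'label'} dicts."""
    total = len(emails)
    if total == 0:
        raise ValueError("data set is empty")

    # Labelling: share of emails whose label is present and in the allowed set.
    valid = sum(1 for e in emails if e.get("label") in allowed_labels)

    # Duplicates: emails whose text is identical after normalising
    # case and whitespace; every copy beyond the first counts once.
    text_counts = Counter(" ".join(e["text"].lower().split()) for e in emails)
    duplicates = sum(n - 1 for n in text_counts.values())

    # Missing data: expected categories without a single example.
    seen = {e.get("label") for e in emails}
    missing = sorted(allowed_labels - seen)

    return {
        "label_validity": valid / total,
        "duplicate_rate": duplicates / total,
        "missing_labels": missing,
    }


# Made-up sample data, for illustration only.
sample = [
    {"text": "Invoice 1001 attached", "label": "invoice"},
    {"text": "invoice 1001  attached", "label": "invoice"},
    {"text": "My login fails", "label": "support"},
    {"text": "Win a prize now", "label": None},
]

metrics = email_quality_metrics(sample)
print(metrics)
```

For this sample, three of the four emails carry a valid label, one email is a duplicate of another, and the category "spam" has no example at all. In practice, a company would set thresholds for such key figures and only release the data set for training once they are met.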















