Pseudo-labelling is a concept from the fields of artificial intelligence, big data and smart data. More precisely, it is a semi-supervised machine learning technique: it is used to improve a model when only a small amount of "real" data with suitable labels is available, but plenty of unlabelled data exists.
In machine learning, a computer normally needs many examples that are already correctly labelled, for instance images tagged "cat" or "dog", before it can recognise these animals in new images. If only a few such labelled images exist, pseudo-labelling comes into play: a model is first trained on the few labelled images and then tries to label the unknown images itself. These suggestions (pseudo-labels), usually only the ones the model is confident about, are added to the training data, and the model is trained again. In this way, it can learn much more from a small amount of labelled data.
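The loop described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not a reference implementation: the nearest-centroid classifier, the distance-based confidence score, the 0.9 threshold, the toy 2D points and all function names are hypothetical choices made for this example.

```python
import math

def fit_centroids(points, labels):
    """Train a nearest-centroid classifier: one mean point per class."""
    sums, counts = {}, {}
    for (x, y), label in zip(points, labels):
        sx, sy = sums.get(label, (0.0, 0.0))
        sums[label] = (sx + x, sy + y)
        counts[label] = counts.get(label, 0) + 1
    return {
        label: (sx / counts[label], sy / counts[label])
        for label, (sx, sy) in sums.items()
    }

def predict_with_confidence(centroids, point):
    """Return (label, confidence) using a softmax over negative distances."""
    scores = {
        label: math.exp(-math.dist(point, centre))
        for label, centre in centroids.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best] / sum(scores.values())

def pseudo_label(labelled_x, labelled_y, unlabelled_x, threshold=0.9):
    """One round of pseudo-labelling with a confidence threshold (assumed 0.9)."""
    # Step 1: train on the small labelled set only.
    model = fit_centroids(labelled_x, labelled_y)
    train_x, train_y = list(labelled_x), list(labelled_y)
    # Step 2: predict the unlabelled points and keep only confident predictions.
    for point in unlabelled_x:
        label, confidence = predict_with_confidence(model, point)
        if confidence >= threshold:
            train_x.append(point)
            train_y.append(label)
    # Step 3: retrain on real labels plus pseudo-labels.
    return fit_centroids(train_x, train_y), train_y[len(labelled_y):]

# Toy data: four labelled points, four unlabelled points.
labelled_x = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (6.0, 5.0)]
labelled_y = ["cat", "cat", "dog", "dog"]
unlabelled_x = [(0.5, 1.0), (5.0, 6.0), (3.0, 2.5), (6.0, 6.0)]

model, pseudo_labels = pseudo_label(labelled_x, labelled_y, unlabelled_x)
print(pseudo_labels)  # ['cat', 'dog', 'dog']
```

The point (3.0, 2.5) lies equally far from both class centres, so its confidence is only about 0.5 and it receives no pseudo-label. Discarding such uncertain predictions is a common design choice, because every wrong pseudo-label is treated as if it were a real one in the next training round.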
For example, a company wants to develop an AI model that automatically analyses medical records, but only a small number of records come with a doctor's diagnosis. With the help of pseudo-labelling, the system creates its own provisional labels for the remaining records, which are then used for training alongside the "real" examples. This allows the model to improve more quickly, without experts having to label every record by hand.
Pseudo-labelling is a clever way for companies to get more out of their artificial intelligence with less labelled data.















