The term "interacting multimodality" is particularly relevant in artificial intelligence, digital transformation, and Industry and Factory 4.0, where different types of data and information channels, such as voice, images and text, are used at the same time and actively linked with one another. "Multimodal" means that several ways of sending or receiving information are in use at once. "Interacting" means that these channels influence and complement each other.
A simple example is a modern customer service robot: it can understand speech, respond to text and even analyse images, for example when a customer sends a photo of a faulty product. The robot combines these inputs to give the most fitting answer. If the customer speaks first, then uploads a photo and later continues in writing, the system merges all of this into a single picture of the request and analyses it as a whole, which can noticeably improve the quality of service.
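The customer service scenario above can be sketched in a few lines of Python. This is only an illustrative sketch, not a description of any real product: the names `ModalInput`, `Session`, `add`, `modalities` and `combined_context` are hypothetical, and the sketch assumes that speech and images have already been turned into text (a transcript and an image description) by upstream components that are not shown. The combination step here is plain concatenation in arrival order, the simplest possible way to bring several channels together.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class ModalInput:
    """One customer contribution (hypothetical structure)."""
    modality: str  # e.g. "speech", "image" or "text"
    content: str   # transcript, image description, or typed text


@dataclass
class Session:
    """Collects inputs from all channels for one customer request."""
    inputs: List[ModalInput] = field(default_factory=list)

    def add(self, modality: str, content: str) -> None:
        # Keep arrival order so later steps see the conversation as it happened.
        self.inputs.append(ModalInput(modality, content))

    def modalities(self) -> Set[str]:
        # Which channels the customer has used so far.
        return {item.modality for item in self.inputs}

    def combined_context(self) -> str:
        # Merge every channel into one text block that can be analysed as a whole.
        return "\n".join(f"[{item.modality}] {item.content}" for item in self.inputs)


# The customer speaks first, then uploads a photo, then continues in writing.
session = Session()
session.add("speech", "My coffee machine stopped working.")
session.add("image", "photo shows a cracked water tank")
session.add("text", "I bought it three weeks ago.")

print(session.combined_context())
```

In a real system the merged context would be passed on to whatever component formulates the answer; the point of the sketch is only that all three channels end up in one shared record rather than being handled separately.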
Interacting multimodality is therefore an important step towards making digital systems more "human" and enabling them to solve more complex tasks automatically, wherever different information channels interlock and need to be analysed together.