Deep audio models belong to the fields of artificial intelligence, automation and digital transformation. The term refers to specialised machine learning models, typically deep neural networks, that are trained to process and interpret audio data such as speech, music or environmental sounds.
These models are loosely comparable to human hearing: they analyse sound in many successive layers, with each layer building on the output of the previous one in order to recognise subtleties and differences. Deep audio models are used in voice assistants such as Alexa or Siri, for example, where they help to recognise spoken commands correctly so that they can be carried out.
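The idea of analysing sound "in many layers" can be illustrated with a minimal sketch. It assumes NumPy is available and uses a synthetic 440 Hz tone in place of a real recording. The function names (`magnitude_spectrogram`, `forward`), the frame length and the layer sizes are hypothetical choices for illustration, and the weights are random rather than trained, so the output has no meaning beyond showing how data flows from raw audio through stacked layers.

```python
import numpy as np

FRAME_LEN = 1024      # samples per analysis frame (assumed value)
HOP = 512             # step between consecutive frames (assumed value)
SAMPLE_RATE = 16000   # samples per second (assumed value)

def magnitude_spectrogram(signal, frame_len=FRAME_LEN, hop=HOP):
    """Split the signal into overlapping windowed frames and return |FFT| per frame."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1))  # shape: (n_frames, frame_len // 2 + 1)

def forward(features, weights):
    """Pass features through stacked layers: a linear map followed by ReLU each time."""
    activations = features
    for w in weights:
        activations = np.maximum(activations @ w, 0.0)
    return activations

# One second of a synthetic 440 Hz tone stands in for real audio.
t = np.arange(SAMPLE_RATE) / SAMPLE_RATE
tone = np.sin(2 * np.pi * 440 * t)

# First representation: how much energy each frequency has in each frame.
spec = magnitude_spectrogram(tone)

# Stacked layers with random, untrained weights (illustration only).
rng = np.random.default_rng(0)
layer_sizes = [spec.shape[1], 64, 16]
weights = [
    rng.normal(0.0, 0.1, size=(n_in, n_out))
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]
embedding = forward(np.log1p(spec), weights)

# The strongest frequency bin should lie close to the 440 Hz tone.
peak_hz = spec.mean(axis=0).argmax() * SAMPLE_RATE / FRAME_LEN
print(spec.shape, embedding.shape, peak_hz)
```

In a real deep audio model the weights would be learned from large amounts of labelled or unlabelled audio, and the layers would usually be convolutional, recurrent or attention-based rather than plain matrix products.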
An illustrative example: if someone asks their smartphone for the weather in a noisy environment, a deep audio model separates the person's voice from the background noise and converts the request into text. This makes speech recognition considerably more robust under such conditions.
Deep audio models are also used in the music industry, for example to automatically categorise or recommend music tracks. In summary, deep audio models enable machines to interpret speech and sounds more accurately than earlier approaches, which makes everyday technology more convenient and efficient.