
Technology

Audio AIs are trained on data full of bias and offensive language

Seven major datasets used to train audio-generating AI models are three times more likely to use the words "man" or "men" than "woman" or "women", raising fears of bias

By Victoria Turk

11 November 2024


Audio training data has been overlooked when it comes to assessing AI

Israel Palacio/Unsplash

Artificial intelligence models that generate audio are being trained on datasets plagued with bias, offensive language and potential copyright infringement, sparking concerns about their use.

Generative audio products, such as song generators, voice cloning tools and transcription services, are increasingly popular, but while text and image generators have been subject to much scrutiny, audio has received less attention.

To help rectify this, a researcher at Carnegie Mellon University in Pennsylvania and his…

To help rectify this, …