Updated · The Verge · Jun 20

The Atlantic Opens AI Music Database Covering 12 Million Tracks as 9 Million More Surface

3 articles · Updated · The Verge · Jun 20

The Atlantic made four music-training datasets searchable, exposing two massive sets with 12 million and 9 million tracks plus two others with more than 100,000 songs each.
Alex Reisner reported the collections have been downloaded thousands of times, and Google and Stability have acknowledged using them in research papers.
Three datasets are distributed as link lists to YouTube or Spotify tracks, with developers typically pulling the audio through automated tools that can bypass logins, ads and creator monetization mechanisms.
The database surfaces artists from Lady Gaga and Radiohead to Wu-Tang Clan and Bruce Springsteen, highlighting how widely commercial and personal-use-only music has entered AI training pipelines.

Sources

The Verge · 3h ago

The Atlantic Creates Searchable Database of Music Used to Train AI

onemoreshot.ai · 13h ago

20 Million Songs Exposed in AI Training Data | One More Shot AI Blog

Reddit · 3h ago

I just discovered that 10 of my songs were stolen to train Suno AI in a dataset... : r/SunoAI

5 Sources

Lawsuits are mounting against AI music companies. Is licensed data their only path to survival?

Will new US legislation finally end the 'fair use' debate for AI music training?

21 Million Songs Scraped: The Legal, Ethical, and Economic Fallout of AI Music Datasets

Overview

In late 2024, a major investigation revealed that AI developers had amassed over 21 million music recordings, including both popular and independent tracks, to train their models. These vast datasets were often collected through automated scraping, sometimes without consent or compensation. The exposure of this practice caused immediate outrage, especially when it was found that sacred recordings from Aboriginal, Torres Strait Islander, and Māori artists were included without permission. This not only raised serious copyright concerns but also highlighted a deep violation of Indigenous cultural rights, sparking urgent debates about ethics and the future of AI in music.

...