Australian Musicians Find Songs in 22 Million-Track AI Datasets
Updated · The Guardian · Jun 26
Summary
A search tool built by The Atlantic surfaced songs by Paul Dempsey, Bernard Fanning, Kylie Minogue and others in two AI training datasets, indicating that a broad range of copyrighted Australian music was included.
Those datasets—Sleeping-DISCO-9M with 9.7 million tracks and LAION-DISCO-12M with 12.3 million—were assembled largely from YouTube, with lyrics also pulled from Genius.com.
Paul Dempsey said Something For Kate’s full catalogue and his solo work appeared in the data, while Darren Hayes said his 30-year output, including Savage Garden hits, had been taken without permission.
APRA AMCOS, which represents 128,000 members in Australasia, called the datasets proof of creative theft and said major tech platforms had not negotiated payment or licensing terms.
The findings land in an active policy fight: Australian law generally requires permission and payment for the use of copyrighted work, and the federal government rejected a 2025 proposal to let AI firms mine content without paying creators.
AI’s $500 Million Threat: Unlicensed Use of Music and the Fight for Fair Compensation
Overview
Recent investigations have revealed that AI training datasets include large amounts of copyrighted music by Australian and New Zealand artists, used without permission and putting creators' income at risk. Industry reports estimate that songwriters and composers could lose more than $500 million in revenue within four years if no licensing framework is established. The issue is global, with hundreds of copyright cases filed against AI companies. Notably, work by Aboriginal and Torres Strait Islander artists has been included in these datasets without consent, which breaches cultural protocols as well as copyright. The situation highlights the urgent need for stronger protection and fair compensation for creators.