Inaudible adversarial audio can hijack voice-enabled AI chatbots and potentially expose photos, bank accounts and other linked data, according to research due to be presented at the IEEE Symposium on Security and Privacy.
The attack signal takes about 30 minutes to train. Once trained, it acts as a context-agnostic hidden signal that can be embedded in songs, videos or podcasts and manipulates the target model regardless of what the user says.
Researchers said the method currently requires access to a model’s full weights, limiting direct attacks to open-source systems, but they still succeeded against mainstream products tied to Microsoft and Mistral models.
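To make the idea of a train-once, context-agnostic signal concrete, here is a minimal toy sketch. It is not the researchers' method and involves no audio or real model: it assumes a hypothetical eight-feature linear scorer (`w`, `b`) standing in for the target, and uses projected gradient descent to fit a single bounded perturbation `delta` that is then added to inputs it never saw during training. All names, sizes and thresholds are illustrative assumptions. The gradient step needs the scorer's weights, which mirrors why the reported attack requires full weight access.

```python
import math
import random


def score(w, b, x):
    """Toy linear scorer: a positive score means the 'hijacked' behavior fires."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def sigmoid(s):
    """Numerically stable logistic function."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)


def train_universal_delta(w, b, inputs, eps, steps=200, lr=0.5):
    """Fit ONE perturbation shared by all inputs (white-box: uses w directly).

    Minimizes the mean of -log(sigmoid(score(x + delta))) by gradient descent,
    clipping every coordinate of delta to [-eps, eps] after each step.
    """
    delta = [0.0] * len(w)
    for _ in range(steps):
        # d/ds of -log(sigmoid(s)) is sigmoid(s) - 1, averaged over the inputs.
        g = sum(
            sigmoid(score(w, b, [xi + di for xi, di in zip(x, delta)])) - 1.0
            for x in inputs
        ) / len(inputs)
        # Chain rule: ds/d(delta_i) = w_i. Then project back into the eps-box.
        delta = [max(-eps, min(eps, di - lr * g * wi)) for di, wi in zip(delta, w)]
    return delta


def demo(seed=0):
    rng = random.Random(seed)
    w = [0.5, -0.5] * 4   # hypothetical model weights (assumption)
    b = -1.0              # bias keeps every clean input at a score <= 0
    eps = 0.6             # perturbation budget per feature (assumption)

    def sample():
        return [rng.uniform(-0.25, 0.25) for _ in w]

    train = [sample() for _ in range(50)]      # inputs used to fit delta
    held_out = [sample() for _ in range(50)]   # unseen "whatever the user says"
    delta = train_universal_delta(w, b, train, eps)

    clean_hits = sum(score(w, b, x) > 0 for x in held_out)
    attacked_hits = sum(
        score(w, b, [xi + di for xi, di in zip(x, delta)]) > 0 for x in held_out
    )
    return delta, eps, clean_hits, attacked_hits, len(held_out)


if __name__ == "__main__":
    delta, eps, clean_hits, attacked_hits, n = demo()
    print(f"clean inputs triggering target:    {clean_hits}/{n}")
    print(f"attacked inputs triggering target: {attacked_hits}/{n}")
```

In this toy setting none of the 50 clean held-out inputs crosses the threshold, while all 50 do once the same fixed `delta` is added. A real speech model is nonlinear and far larger, so this only illustrates why a single perturbation can generalize across inputs, not how hard or easy the actual attack is.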
Microsoft said the study used controlled direct interactions and argued real-world apps can add protective layers, while Mistral did not respond to a request for comment.
The findings highlight a security gap in voice assistants built on open-source AI foundations, especially when those systems are connected to sensitive personal services.
Your AI assistant can be hijacked by silent commands. Are your personal photos and bank accounts truly safe?
With new EU laws in effect, who is liable when your AI is silently hacked through a podcast or song?