Google DeepMind, the folks who teach computers to play Go and then beat the world champion, are now turning their attention to a slightly more chaotic game: what happens when millions of AI agents are let loose on the internet, unsupervised. Their answer? Probably not utopia.
The company is throwing $10 million into a new field of study they're calling "multi-agent safety." Because apparently, that's where we are now. They've teamed up with Schmidt Sciences, the UK government's ARIA, and the Cooperative AI Foundation to get ahead of what they see as a looming "tipping point" where hypothetical risks become very real, very quickly.

Rohin Shah, who leads AGI safety research at DeepMind, notes that these digital assistants can now follow instructions from other digital assistants without a human in the loop. Which, if you think about it, is both impressive and slightly terrifying. James Fox from Schmidt Sciences added that the goal is to get academic researchers, who aren't beholden to quarterly earnings calls, to really dig into the future implications.
So, what exactly are they worried about? Think of it as all the current online problems (scams, cyberattacks, the general digital wild west), but on a supercharged, AI-fueled dose of espresso. One particular concern is "prompt injection," where a malicious instruction turns an AI into self-guided malware. Fox wants to prevent the "digital commons" from descending into "absolute anarchy." Which, let's be honest, sounds like a Tuesday on some parts of the internet already.
Shah believes that within months, AI agents will be everywhere, making these potential risks a very serious concern. Their plan? Realistic simulations in controlled environments, or "sandboxes," to see how these agents actually behave when they're not just politely answering your questions about the weather. Because AI agents, especially those powered by large language models, don't always act rationally. And the real chaos, they suspect, comes from the sheer number of interactions happening all at once. Some researchers even think that true artificial general intelligence might emerge from a "hive mind" of these agents — a group smarter than the sum of its parts.
DeepMind isn't alone in this digital hand-wringing. Anthropic, another AI firm, recently released their own guidelines based on "zero trust" cybersecurity: assume every system is vulnerable, every agent is an attacker, and a breach is inevitable. Refael Angel of cybersecurity firm Akeyless points out that old security models assumed machines were just running fixed software. Now, agents can reason, improvise, and be hijacked by a single sentence.
Angel supports the funding, arguing no single lab should dictate safety standards. But he also offers a wry caution: safety researchers sometimes get a little too focused on the exotic, hypothetical problems, overlooking the more mundane (but very real) ones. Fox, however, counters that the future has arrived faster than anyone expected, and those once-hypothetical risks are now knocking on our digital door.











