Rice University scientists just cracked a problem that's been slowing down genetic medicine: how to find the right DNA sequence among millions of possibilities.
The challenge sounds simple until you think about the scale. In synthetic biology, you need cells to do specific things — produce insulin, fight cancer, glow under certain conditions. But for any given function, there are countless DNA designs that might work. "There are many possible designs for any given function, and finding the right one can be like looking for a needle in a haystack," says Rice scientist Caleb Bashor.
Until now, researchers could only test hundreds or thousands of designs at a time. The new approach, called CLASSIC, changes that entirely. The team can now generate and test hundreds of thousands to millions of DNA circuits simultaneously.
How it actually works
The breakthrough combines two sequencing methods that are usually treated as alternatives. Long-read sequencing captures entire circuit designs by reading thousands of DNA bases in one go. Short-read sequencing is faster and more accurate but only works over short stretches. Used together, each covers the other's weakness.
Here's the clever bit: researchers created DNA circuits, each tagged with a short DNA barcode, and inserted them into human cells engineered to glow when certain genes activate. They then used short-read sequencing of the barcodes to track which designs produced which outcomes. Suddenly they had millions of data points linking DNA design to biological behavior.
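For readers who think in code, the linking step can be pictured as a lookup followed by a tally. This is a minimal sketch, not the team's pipeline: the barcodes, part names, and the idea of sorting cells into numbered brightness bins are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical long-read result: each barcode maps to one full circuit design.
barcode_to_design = {
    "AAGT": ("promoter_A", "gene_X"),
    "CCGA": ("promoter_B", "gene_X"),
}

# Hypothetical short-read result: (barcode, brightness_bin) pairs, where a
# higher bin number stands for a brighter cell. The bins are an assumption.
short_reads = [
    ("AAGT", 3), ("AAGT", 3), ("AAGT", 2),
    ("CCGA", 0), ("CCGA", 1),
    ("TTTT", 2),  # barcode with no known design; skipped below
]

def mean_bin_per_design(barcode_to_design, short_reads):
    """Average the brightness bin observed for each known design."""
    totals = defaultdict(lambda: [0, 0])  # design -> [sum of bins, read count]
    for barcode, bin_index in short_reads:
        design = barcode_to_design.get(barcode)
        if design is None:
            continue  # cannot link this read to any design
        totals[design][0] += bin_index
        totals[design][1] += 1
    return {design: s / n for design, (s, n) in totals.items()}

design_to_brightness = mean_bin_per_design(barcode_to_design, short_reads)
```

The long reads are only needed once, to build the lookup table. After that, the cheap short reads do all the counting, which is what lets the approach scale.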
Then they fed all that data into machine learning models. These models learned the underlying patterns well enough to predict how untested circuits would behave. In validation, the AI got all 40 predictions exactly right when checked against manually tested sequences.
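To make "learning the pattern" concrete, here is a toy version of the idea. It assumes each part adds a fixed amount to a circuit's brightness and fits those amounts from measured designs. This additive model is a stand-in chosen for simplicity, not the models the team used, and the part names and numbers are invented.

```python
from collections import defaultdict

# Hypothetical measurements: design (one part per slot) -> brightness.
measured = {
    ("weak_promoter", "one_copy"): 1.0,
    ("weak_promoter", "two_copies"): 2.0,
    ("strong_promoter", "one_copy"): 3.0,
}

def fit_additive(measured, sweeps=30):
    """Fit one additive contribution per (slot, part) by repeated averaging."""
    effects = defaultdict(float)  # (slot, part) -> contribution to brightness
    n_slots = len(next(iter(measured)))
    for _ in range(sweeps):
        for slot in range(n_slots):
            # Collect what is left of each measurement once the other
            # slots' current contributions are subtracted.
            residuals = defaultdict(list)
            for design, value in measured.items():
                others = sum(
                    effects[(s, p)] for s, p in enumerate(design) if s != slot
                )
                residuals[design[slot]].append(value - others)
            for part, vals in residuals.items():
                effects[(slot, part)] = sum(vals) / len(vals)
    return effects

def predict(effects, design):
    """Predict brightness for a design by summing its parts' contributions."""
    return sum(effects.get((s, p), 0.0) for s, p in enumerate(design))

effects = fit_additive(measured)
# This combination was never measured; the model fills it in.
prediction = predict(effects, ("strong_promoter", "two_copies"))
```

With only three measurements the toy model has little to go on. The point of a library with millions of entries is that far richer models, able to capture interactions between parts, become trainable.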
"This was the first time AI could be used to analyze circuits and make accurate predictions for untested ones because up to this point nobody could build libraries as large as ours," says co-first author Kshitij Rai.
What makes this particularly useful: the team discovered that most genetic functions don't have one "perfect" design — they have many solutions that work. That flexibility matters for real applications. It means engineers can design biological systems that are more robust, more likely to work reliably in actual patients rather than just in the lab.
The research, published in Nature in 2022, suggests this combination of massive datasets and AI modeling could accelerate development of cell-based therapies, including better insulin-producing cells and cancer-fighting immune cells, along with other synthetic biology applications that are still in early stages.
The real shift here isn't that AI is "designing" DNA. It's that AI can now learn from datasets far larger than anyone could generate before. That's the bottleneck that just broke.
Combining long and short range sequencing to investigate genetic complexity - Nature, 2022










