Internet Archive preserves its trillionth webpage in 30 years

The Internet Archive just preserved its trillionth webpage—a staggering milestone after 30 years of digital rescue work that underscores how fragile the web really is.

Elena Voss

Feb 21 · 2 min read · San Francisco, United States

Originally reported by Popular Science · Rewritten for clarity and brevity by Brightcast

Why it matters: Future generations and researchers now have an irreplaceable digital record of human knowledge and culture, protecting our shared history from being lost forever.

The Internet Archive just hit a number that sounds abstract until you think about what it means: one trillion webpages saved. Since 1996, the nonprofit has been quietly building a backup copy of the internet itself. The stakes of that work became hard to ignore in 2019, when MySpace acknowledged that a server migration had accidentally deleted 50 million songs from 14 million artists.

That's the thing about digital content: it doesn't persist unless someone actively maintains it. A website can vanish when its owner loses interest, when a company shuts down, when a server fails. The internet, despite being everywhere, is fundamentally fragile.

So the Internet Archive built web crawlers to automatically capture publicly available websites, and invited volunteers to upload everything else: old books, obscure music, documents that might otherwise disappear. After nearly three decades, its collections include one trillion webpages, 41 million texts, and millions of other digital artifacts. The crawlers add around 500 million webpages every day. The total storage: 100,000 terabytes, or roughly the capacity of 50,000 of today's highest-end iPhones.
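
The storage comparison above is simple arithmetic and can be checked in a few lines. This is a minimal sketch that uses only the two figures quoted in the article (100,000 terabytes and 50,000 phones). The names are illustrative, and decimal units (1 petabyte = 1,000 terabytes) are an assumption.

```python
# Figures quoted in the article.
TOTAL_STORAGE_TB = 100_000   # total Archive storage, in terabytes
PHONE_COUNT = 50_000         # number of high-end phones in the comparison

# Assumption: decimal units, so 1 petabyte = 1,000 terabytes.
TB_PER_PB = 1_000


def implied_phone_capacity_tb(total_tb: int, phones: int) -> float:
    """Capacity each phone must have for the comparison to hold."""
    return total_tb / phones


def terabytes_to_petabytes(total_tb: int) -> float:
    """Convert terabytes to petabytes using decimal units."""
    return total_tb / TB_PER_PB


if __name__ == "__main__":
    print(implied_phone_capacity_tb(TOTAL_STORAGE_TB, PHONE_COUNT))  # 2.0
    print(terabytes_to_petabytes(TOTAL_STORAGE_TB))                  # 100.0
```

Under these assumptions the comparison implies a phone with 2 terabytes of storage, and the total works out to 100 petabytes.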

This matters more than it might seem. Journalists use the Archive to verify what a news outlet said five years ago. Researchers trace how misinformation spreads. Historians document how the web has changed. Ordinary people check what their favorite website looked like before a redesign.
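
Looking up an older version of a page, as in the examples above, usually comes down to a URL. The sketch below builds a Wayback Machine address from a page URL and a date, following the publicly visible `https://web.archive.org/web/<timestamp>/<url>` pattern, where the timestamp is a 14-digit `YYYYMMDDhhmmss` string. The function name `wayback_url` is hypothetical, the code makes no network request, and it does not check whether a capture exists near the chosen date.

```python
from datetime import datetime

# Publicly visible prefix of Wayback Machine capture addresses.
WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(page_url: str, when: datetime) -> str:
    """Build a Wayback Machine address for page_url near the moment `when`."""
    timestamp = when.strftime("%Y%m%d%H%M%S")  # 14 digits: YYYYMMDDhhmmss
    return f"{WAYBACK_PREFIX}/{timestamp}/{page_url}"


if __name__ == "__main__":
    # Hypothetical lookup: what did this page look like on 21 February 2020?
    print(wayback_url("https://example.com/", datetime(2020, 2, 21)))
```

Opening the resulting address in a browser typically redirects to the capture closest to the requested timestamp, if the page was archived at all.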

The New Pressure

But the Archive is now caught in a collision between two forces. On one side, tech companies are scraping the internet to train AI systems—often in legally murky ways. On the other, major media outlets including The New York Times, The Guardian, and USA Today have started blocking the Archive from preserving their newer content, worried that their articles will end up in AI training datasets without compensation.

It's a legitimate concern. Writers and publishers deserve to be paid for their work, and right now, there's no clear legal or financial framework that covers this scenario. The tension is real. But it also creates a paradox: the more content that gets hidden from the Archive, the harder it becomes to preserve what might be the most fragile information ecosystem we've ever built.

The Archive will keep growing toward its two trillionth webpage. But its future depends on finding a middle ground—ways to protect creators' rights while keeping the permanent record of the internet intact.

Brightcast Impact Score (BIS)

The Internet Archive's preservation of 1 trillion webpages represents a genuine positive milestone in digital conservation—a solution to the ephemeral nature of online content. The work is globally significant, permanent in impact, and emotionally resonant (the MySpace example powerfully illustrates why this matters). However, verification is moderate: the article cites specific numbers but lacks named expert sources or organizational endorsements, and the piece appears incomplete, cutting off mid-sentence.

Hope: 30/40 (emotional uplift and inspirational potential)

Reach: 28/30 (audience impact and shareability)

Verification: 16/30 (source credibility and content accuracy)

Total: 74/100, Significant (major proven impact)
