The Digital Hoard: Saving Our Online History from Oblivion

The internet, a sprawling digital landscape teeming with information, is also a graveyard of forgotten content. Websites disappear, social media trends fade, and digital artefacts vanish into the ether. But a small group of dedicated individuals is waging a silent war against this digital oblivion, striving to preserve our online history for future generations.

Scott, a "free-range archivist and software curator" with the Internet Archive, is one of those warriors. Founded in 1996 by internet pioneer Brewster Kahle, the Internet Archive acts as a vast online library, collecting and storing data that would otherwise be lost to time.

Since its founding, the organisation has amassed a gargantuan collection of web content, including relics from the GeoCities era. Its holdings are not limited to born-digital artefacts; the Internet Archive also houses a large collection of digitised books, meticulously scanned and rescued from potential oblivion.

To date, the Internet Archive holds over 145 petabytes of data, encompassing over 95 million public media files, including movies, images, and texts. It has even managed to preserve nearly half a million MTV news pages, capturing a snapshot of a bygone era.

The Internet Archive's Wayback Machine, a treasure trove of internet history, lets users rewind time and see how websites appeared on the dates they were captured. It holds over 800 billion stored web pages and adds another 650 million each day. The organisation also records and stores TV channels from around the world, and even archives TikToks and YouTube videos. This vast digital collection is safeguarded across multiple data centres owned by the Internet Archive itself.

However, the task is daunting. The sheer volume of digital content created daily makes it a Sisyphean struggle to keep pace. As Jack Cushman, Director of Harvard's Library Innovation Lab, puts it, "We’re creating so much new stuff that we must always delete more things than we did the year before."

Archivists face a constant dilemma: what should be saved for posterity? Which TikToks deserve a place in the digital archives?

Niels Brügger, an internet researcher at Aarhus University, offers a pragmatic approach. "We cannot imagine what historians in 30 years’ time would like to study about today," he argues. "So we shouldn’t try to anticipate and constrain the possible questions that future historians would ask."

Instead, Brügger advocates for a comprehensive approach: "Get it all, and then historians will find out what the hell they’re going to do with it."

At the Internet Archive, the focus lies on preserving content most at risk of being lost. Jefferson Bailey, who helps develop archiving software for libraries and institutions, explains: "Material that is ephemeral or at risk or has not yet been digitized and therefore is more easily destroyed, because it’s in analog or print format—those do get priority."

The race to save our online history is a complex and ongoing battle. As the internet continues to evolve, the task of preserving its past becomes increasingly challenging. But thanks to the tireless efforts of archivists like those at the Internet Archive, future generations will have the opportunity to explore the digital legacy of our time, and understand the world that shaped us.
