Data hoarding is more important than ever

The destruction of public data and media is reaching critical levels. You can help stop it by downloading literally everything.

A woman taking a large archive tape off a shelf at the Johnson Space Center in 1986.
Credit: NASA, JSC

There's a popular saying that "the internet never forgets," but it's more like the internet has selective memory. The rich pockets of culture and information found in forums, blogs, social media platforms, online games, libraries, chatrooms, and mailing lists are not eternal. Websites go away, links turn into 404 pages, CMS migrations break blog posts, and servers shut down.

The Internet Archive, Library Genesis, the Wikimedia Foundation, local libraries, and countless other organizations are providing an invaluable service by maintaining public access to everything from research papers to 90s PC games to books. Unfortunately, not everything can be archived in time, and the effort is costly and vulnerable to legal attacks. It's a constant battle that demands time, energy, and money.

We've seen attack after attack on information and media access over the past few years. Social media platforms are locking down. Research papers and public resources are being intentionally destroyed. Movies and TV shows are vanishing from distribution, sometimes before they see the light of day. Books are getting pulled from library shelves. It's more important than ever to be a data hoarder: save everything, back up everything, and share everything.

Cutting the Vine

There's a powerful idea that technology, information, and inventions can only be added to the collective knowledge of humanity, and never taken away, at least in the modern era. Sure, the Library of Alexandria burned down and no one knows what happened to Roanoke, but now we've got things figured out. We all have small pocket bricks with quick access to Wikipedia and Google. YouTube still has videos from 2005. It feels like anything is knowable if you have enough time and curiosity.

This will instantly date me as Gen Z, but the first significant shutdown I remember witnessing in real time was Vine, the short-form video service that launched in 2013 and had over 200 million monthly active users by December 2015. Vine shut down in January 2017, but to the company's credit, links to Vine videos continued working and an archive gallery was created. Unfortunately, the archive stopped working sometime in late 2024, presumably another victim of Elon Musk, famed guest star of the worst Iron Man movie.

The handful of most popular Vine videos are still mirrored across YouTube and other hosting services. I can still watch the video that permanently hardwired my brain to think "I sure hope it does" whenever I see a road work sign. Unfortunately, countless other videos are just... gone, and there are many stories like this one: "Just found a link to an old video of a friend who passed, it was one of my fav vids and I can’t get it."

For several years, I used Google+, a social network that did not have a good reputation with the mainstream public but still had many great people to follow and communities to join. I even moderated a popular Minecraft community for a while. Google+ shut down in April 2019, and even though many posts remain accessible through the Internet Archive, others are lost forever.

There are so many other platforms that have deleted vast swathes of content. Countless forum sites are gone, and many others still have broken images, because Photobucket deactivated free accounts and ImageShack deleted many images. Imgur deleted many old images and videos, too.

Even though it's not the same as outright deletion, I do want to mention the collapse of Google, Bing, and other web search engines. It's increasingly difficult to find useful and accurate information on the web under the weight of AI slop and SEO spam. How many blogs, forum comments, and social media posts are now so inaccessible that they may as well be deleted?

Slash and burn

The past few years have seen a significant uptick in the intentional destruction of information and culture. Every day is a Library of Alexandria burning.

After Reddit started locking out third-party apps, many users deleted all of their own posts and comments. When Twitter was taken over by Elon Musk, champion of a cyberpunk vehicle that can't withstand rain, many people there also deleted their posts and accounts. To be clear, I absolutely believe people should have the right to scrub their own content from the internet—I wiped the posts on my own Twitter account. Those acts are prompted by the same right-wing and profit-extracting motives that are torching other information and culture sources, though. I don't want my posts on a platform that now specializes in organizing right-wing violence and misinformation campaigns.

Warner Bros. Discovery kicked off a trend of wiping movies and TV shows from streaming services, largely to take advantage of tax benefits, which is a problem when many shows and movies are only accessible through those services. Westworld is one of my favorite shows of all time, and it's sold out on physical media and was pulled from HBO Max. It's still available for purchase on digital platforms, but who knows how long that will last—I'm thankful I bought the Blu-ray sets while I had the chance. Other media companies have followed in Warner's footsteps, like Disney+ removing its original series Willow, which its star Warwick Davis called "embarrassing."

Aaron Paul in Westworld looking at a humanoid robot.
Credit: Warner Bros. Discovery

This has all become so much worse under the new U.S. Trump administration. Thousands of web pages, research papers, and datasets about climate change, infectious diseases like COVID-19, diversity, and LGBTQ+ groups have been systematically deleted. NASA personnel were told to "drop everything" and remove anything mentioning women in leadership, environmental justice, indigenous people, and other topics.

There are ongoing efforts to archive as much data as possible and host it somewhere else, like CDC Restored and SciOp, but some data is still permanently deleted or difficult to access. You also can't archive what never existed in the first place: academic researchers are now told to avoid using simple words like "women" or "diversity" or "historically," or they won't get funding from the National Science Foundation.

The destruction of state-owned information, resources, and staff is just the start. Failed submarine rescuer Elon Musk, who is now in charge of the 'DOGE' Trump administration initiative, has continued attacks on Wikipedia and retweeted messages to "Stop donating to Wokepedia." Federal funding for nearly all public media, including NPR and PBS, could be eliminated. The Wikimedia Foundation and the Internet Archive are both based in the United States.

The hobby of data hoarding might be most closely associated with saving illegally downloaded movies on mountains of hard drives, but archiving public media and information has always been a component as well. Both parts are now critical to the survival of our shared culture and information base.

Patch the holes

This is a political and economic problem that cannot be solved on an individual level. That being said, there are ways we can help as individuals and communities.

If you have a computer, home server, or NAS and a stable internet connection, consider setting up a torrent client and adding some torrents. All downloads on the Internet Archive have torrents available, so pick some files you care about or browse through the collections, and let them continue seeding in the background. SciOp is another project working to mirror "cultural, intellectual and scientific heritage" through torrents, and the Library Genesis Project (LibGen) uses torrents to distribute and archive its millions of books and research material. If you're downloading or uploading copyrighted material, or you just want more privacy, you should use a VPN that supports torrents.
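As a small illustration of the Internet Archive part: item torrents appear to follow a predictable URL pattern based on the item's identifier. The helper below is a sketch under that assumption. The function name and the `example-item` identifier are made up for illustration, so check the download options on the real item page before relying on the link.

```python
from urllib.parse import quote


def archive_torrent_url(identifier: str) -> str:
    """Build the likely torrent URL for an Internet Archive item.

    Assumption: items expose a torrent at
    /download/<identifier>/<identifier>_archive.torrent.
    """
    safe = quote(identifier, safe="")
    return f"https://archive.org/download/{safe}/{safe}_archive.torrent"


# "example-item" is a hypothetical identifier, not a real collection.
url = archive_torrent_url("example-item")
print(url)
```

The resulting link can be added to any torrent client that accepts torrent URLs, or downloaded and opened as a regular .torrent file.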

I currently have a Synology NAS with 4 TB hard drives, a gigabit internet plan forced on me through my apartment lease, and a cheap legacy Proton VPN subscription. The docker-transmission-openvpn Docker image has turned my NAS into a capable 24/7 torrent machine, with over 29 TB of data uploaded since I set it up around 300 days ago. I love that I can repurpose my internet bandwidth and unused storage capacity for something that helps other people.
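For a sense of scale, those figures work out to a fairly modest sustained upload rate. Here is a rough back-of-the-envelope calculation in Python, assuming decimal units (1 TB = 10^12 bytes) and treating "around 300 days" as exactly 300. The function name is just for illustration.

```python
def average_upload_rate(total_tb: float, days: float) -> dict:
    """Average sustained upload rate for total_tb (decimal terabytes) over days."""
    total_bytes = total_tb * 1e12          # assumption: 1 TB = 10^12 bytes
    seconds = days * 24 * 60 * 60
    bytes_per_second = total_bytes / seconds
    return {
        "gb_per_day": total_bytes / days / 1e9,
        "mb_per_second": bytes_per_second / 1e6,
        "mbit_per_second": bytes_per_second * 8 / 1e6,
    }


rate = average_upload_rate(total_tb=29, days=300)
for name, value in rate.items():
    print(f"{name}: {value:.2f}")
```

That comes to roughly 97 GB per day, or about 9 Mbit/s on average, which is under 1% of what a gigabit connection can carry.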

If you have a niche hobby or interest, think about the resources of that community and how you can help preserve them. Maybe you have a favorite PC game that isn't on the Internet Archive already, or some guides and forum posts for a hobby might only be on one website. Bookmark them, or save them for offline access, or both. Many years ago, I downloaded the soundtrack for Nickelodeon's Invader Zim animated show, which was a limited-print promotional CD. I looked around recently and couldn't find it online at all, except converted YouTube versions, so I uploaded it to the Internet Archive and backed it up in a few more places.
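If you keep copies in more than one place, it helps to have a way to confirm the copies still match. One option is a checksum manifest. The sketch below is only an illustration: the function names are hypothetical, and it uses nothing beyond Python's standard library.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 checksum."""
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare_manifests(original: dict[str, str], copy: dict[str, str]) -> list[str]:
    """Return relative paths that are missing from the copy or differ from the original."""
    return sorted(
        name for name, checksum in original.items() if copy.get(name) != checksum
    )
```

Running `build_manifest` on the original folder and on each backup, then passing both results to `compare_manifests`, lists any files that went missing or changed. An empty list means the backup matches the original.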

You can also donate to organizations that are focused on information access, like the Internet Archive, the Electronic Frontier Foundation, archive.is, or the Wikimedia Foundation. It's worth noting that Wikimedia is flush with cash right now, so your money might go further with other organizations. There are also the groups fighting the Trump administration in courts to restore deleted data, like Public Citizen, the ACLU, and Doctors for America.

I recently played Death Stranding, which places you in the shoes of Sam Porter Bridges, a delivery person tasked with rebuilding America after a worldwide apocalypse. He says something near the end of the game that feels relevant:

With the shape the world’s in, it’ll only be delaying the inevitable. Still, if it buys us time to try and build something better. A new lease on life, at least for a little bit. [...] Nothing lasts forever. Not even the world. But we gotta keep it going as long as we can, right? Patch the holes, change the parts, all that. So we can say we had a good run. That we lived.

Patch the holes. Seed a torrent.
