The Wayback Machine is the biggest internet archive on the planet and offers a fascinating insight into the development of the internet. This enormous library is maintained by the San Francisco-based Internet Archive and is one of the web’s most popular resources.


For any of us who have accidentally deleted a blog page or site, it’s probably a familiar resource already, but it’s also a fascinating place for research and simple browsing. The Wayback Machine periodically crawls the web and stores dated versions of web pages in its index. You can see not only the version of a website that exists today but also how it looked one, five and even ten years ago.
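For readers who want to jump straight to an older version of a page, here is a minimal Python sketch of how the dated lookups described above could be put together. It assumes the commonly seen address pattern `https://web.archive.org/web/<timestamp>/<page>`, where the archive shows the capture nearest to the requested date. The function names `snapshot_url` and `years_back` are made up for this illustration and are not part of any official tool.

```python
from datetime import datetime

# Assumed address pattern for archived pages; not taken from the article.
WAYBACK_PREFIX = "https://web.archive.org/web"


def snapshot_url(page_url: str, when: datetime) -> str:
    """Build an address asking for the capture of page_url nearest to `when`."""
    timestamp = when.strftime("%Y%m%d%H%M%S")  # 14 digits: YYYYMMDDhhmmss
    return f"{WAYBACK_PREFIX}/{timestamp}/{page_url}"


def same_day_years_ago(today: datetime, years: int) -> datetime:
    """Return the same calendar day `years` earlier (29 Feb falls back to 28 Feb)."""
    try:
        return today.replace(year=today.year - years)
    except ValueError:
        return today.replace(year=today.year - years, day=28)


def years_back(page_url: str, today: datetime, offsets=(1, 5, 10)) -> dict:
    """Map each offset in years to the address of the matching dated lookup."""
    return {n: snapshot_url(page_url, same_day_years_ago(today, n)) for n in offsets}


if __name__ == "__main__":
    # example.com is a placeholder site used only for illustration.
    for n, url in years_back("example.com", datetime.now()).items():
        print(f"{n} year(s) back: {url}")
```

The sketch only builds addresses and does not contact the archive, so whether a capture actually exists for a given date still has to be checked in a browser.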

For media journalists it’s also an important tool, as it can be used to find deleted information, dig up data on past news events and research stories from long-dead web sites. It’s been used many times to embarrass public figures and expose lies that would otherwise have been long forgotten.

There has been some pressure on the owners of the Wayback Machine to remove or censor content stored there. It is not a direction they wish to take, although in the past they have decided to remove content that was deemed dangerous to people. It is also unclear whether removal means complete deletion or simply removal from the publicly accessible servers.

The director is certain that there will be instances in the future where they remove web pages, but it will never be based on personal choice or censorship. The small staff of 10 editors sees cultural preservation as the heart of their roles.

Content can be removed and made non-visible, though. For example, if the content is blocked using a file called robots.txt, then the Wayback Machine crawler will respect this decision. They will crawl and archive the page if possible, but that particular version will not be accessible for public viewing. You will likely receive a message stating that the page has been excluded or removed from public viewing.
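To make the robots.txt idea a little more concrete, here is a small Python sketch using the standard library's `urllib.robotparser` to check whether a given crawler is permitted to fetch a page. The robots.txt contents, the `example.com` addresses and the helper name `is_allowed` are all invented for illustration, and the `ia_archiver` user agent is an assumption about the name the archive's crawler has historically answered to. It shows how a site owner's rules are read in general, not how the Wayback Machine itself applies them.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep one crawler out of /private/, allow everyone else.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /private/

User-agent: *
Disallow:
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the rules in robots_txt let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse the text directly, no network access
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent, url in [
        ("ia_archiver", "https://example.com/private/page.html"),
        ("ia_archiver", "https://example.com/blog/post.html"),
        ("SomeOtherBot", "https://example.com/private/page.html"),
    ]:
        verdict = "allowed" if is_allowed(ROBOTS_TXT, agent, url) else "blocked"
        print(f"{agent} -> {url}: {verdict}")
```

A site owner could use a check like this to confirm that the rules they have written block what they intended before relying on them.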

Most people in the media are completely against the censorship or filtering of any content from the internet. After all, if something is offensive then it’s easy enough to simply disregard the pages. It is important to remember that something can be perfectly reasonable to some people and offensive to others. Take, for example, science-based articles which discredit biblical scriptures: it all depends on which side of the fence you are sitting.

The internet is already becoming very controlled and filtered. Many web sites operate a system called region locking, which blocks content based on the visitor’s location. Some of these systems are very sophisticated, with large media sites like Netflix also blocking VPN services in order to stop people accessing their subscriptions from specific physical locations.

The concept is probably similar to that of a physical library, where it is possible to remove a book from a library’s shelves but only with some effort. Librarians there would also be very reluctant to remove books based on individual requests, although obviously the scale would be much smaller than that of the digital archive.

John Henry

Video Blogger – Using a Proxy for Netflix