Preserving the web - California Digital Archives
The California Digital Library is embarking on a project to save web pages that would otherwise vanish. Here is what I found out this morning:
The Web at Risk
The Web has revolutionized our access to information. Documents and publications that were once difficult to find are now readily available to anyone with an Internet connection. Federal, state and local government agencies and non-profit organizations now have an inexpensive means for distributing information to the public. When important historical events such as Hurricane Katrina or 9/11 take place, we can see the popular reaction unfold via blogs and personal web sites, and have an unprecedented view into popular culture. All of these materials will serve as valuable resources for researchers for years to come.
But ready access to these publications cannot be taken for granted. Web pages and documents are as easy to change or remove as they are to publish. When sites are redesigned, when new administrations take office, when policies or organizations change, we witness the wholesale disappearance of information. State and local web publications are particularly at risk. In many cases, these documents are no longer available in print, and libraries are challenged to continue their historic role as cultural memory institutions in the digital environment.
As scholars increasingly rely on web citations, it becomes difficult or impossible to verify a scholar’s sources. Studies of web citations are showing that up to half of the citations in scholarly journal articles can cease to function within four years. Even if a web citation still returns a page, there is no guarantee that you are looking at the same content the author cited. Furthermore, web content faces the same risks as other digital publications as file formats evolve and change.
In 2005, the National Digital Information Infrastructure and Preservation Program awarded a grant to the California Digital Library and its partners at New York University Libraries and the University of North Texas Library to provide librarians and archivists with the tools to capture, curate and preserve web publications. One result of that grant is the Web Archiving Service, which produced the archives available here. Curators at University of California Libraries, Stanford University Libraries and New York University Libraries, along with a growing number of institutions, have used these tools to save web publications for researchers.
Of course, this raises the question of the role of the Internet Archive (TIA), which has been doing this for a number of years (although maybe not as aggressively as the article implies).
Will TIA be involved? I can see it happening, considering that Peter Brantley, who has ties to CDL, is now the Exec. Dir. at TIA.