Leaving the Cathedral and Entering the Bazaar

Library and Archives Canada Engages Canada’s Digital Society, a presentation by Daniel J. Caron to the Association for Computing Machinery and Institute of Electrical and Electronics Engineers (ACM/IEEE) Joint Conference on Digital Libraries, Ottawa

To begin, I would like to thank the organizing committee for the Joint Conference on Digital Libraries for giving me the opportunity to share my thoughts concerning how national libraries and archives are adapting to the challenges of the digital environment.
Evidence of the Change
One sign of the transformation to the information age between 1950 and 1980 was that the manufacture of goods was eclipsed by information management as the dominant economic activity in the world. The tipping point in information storage occurred in 2002, when more information was stored digitally than in analogue format. In 2000, 75% of the world's information was still in analogue format (paper, videotape, etc.), but by 2007, 94% was preserved digitally (Hilbert & Lopez, 2011).
Our Role is Exercised Differently
Today, national libraries and archives are faced with the challenge of adapting practices that were applicable in a predominantly analogue environment to one that is becoming more and more digital. That is not to say that we have moved completely from one modus operandi to another. Our mandate remains to acquire, preserve, and make accessible a nation’s documentary heritage to its citizens, which means that we will continue to operate in a hybrid environment and work with information resources regardless of whether they are formatted in a physical container or encoded into bits and bytes. However, the tried and true practices of the analogue environment cannot be simply transferred to the digital realm.
Exponential Growth in Information
The rapid growth of information is a constant source of change and new opportunity, but its enormous scale and invisible nature also make the change staggeringly difficult to comprehend. The management of information can be divided into three major divisions, each evolving at its own pace: storing information (reaching across time, 23% growth per year); communicating information (reaching across space, 28% growth per year); and computing information (composing or processing, 58% growth per year).
According to the latest estimates, stored information around the globe is somewhere near 800 exabytes, a mind-boggling quantity that defies earthly comparison and is best understood mathematically.
Simply put, at this astronomical level of production, it becomes an impossible task to sift through all of the information that is being produced and to decide what is to be preserved, even if all of it could be captured and saved.
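To make the scale of these growth rates more tangible, the three annual rates quoted above can be converted into doubling times. The short Python sketch below assumes, purely for illustration, that each rate stays constant and compounds annually; the function and variable names are my own and do not come from the cited study.

```python
import math

# Annual growth rates quoted in the talk (Hilbert & Lopez, 2011).
ANNUAL_GROWTH = {
    "storing": 0.23,
    "communicating": 0.28,
    "computing": 0.58,
}


def doubling_time_years(rate: float) -> float:
    """Years needed to double at a constant compound annual growth rate.

    Assumption: the rate does not change from year to year.
    """
    return math.log(2) / math.log(1 + rate)


doubling_times = {
    name: doubling_time_years(rate) for name, rate in ANNUAL_GROWTH.items()
}

for name, years in doubling_times.items():
    print(f"{name}: doubles roughly every {years:.1f} years")
```

Under that assumption, stored information doubles in a little over three years, while computing capacity doubles in about a year and a half.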
Social Transformations
The degrees of separation between citizens and information resources have diminished dramatically. Acquisition, preservation, and access to information resources, which were once mediated by documentary heritage institutions in the analogue world, have become more direct, unfiltered, and immediate in the digital domain. This advance provides opportunities for both documentary heritage institutions, such as LAC, and our clients, the citizens of Canada, to reassess and readjust monopolistic and obsolete business models and methodologies that were designed for and more applicable to the print era, and to participate in activities such as cataloguing and metadata allocation, once the sole purview of information professionals.
Disruptive Technological Change
It should also be noted that the principal enabler of the profound changes affecting the world in which we live, information and communications technology, also undergoes a series of wholly unpredictable metamorphoses. In a relatively short period of time, we have moved from mainframe to personal to mobile computing devices. As a result, the storage of information has moved from centralized locations to be spread through computer networks and finally into the cloud. Without question, this technological evolution affects the very nature of how we can attempt to capture, preserve, and make accessible a nation’s documentary heritage.
Importantly, it is the rate of technological change and its impact on our social relations that creates an enormous challenge for documentary heritage institutions.
To use the extended metaphor established in Raymond’s seminal work, The Cathedral and the Bazaar, which juxtaposes the cultures of commercial software and open source software producers, we can see that there are two substantially different lines of authority and flows of information when it comes to the production of commercial and open source software. In the commercial, there is a command and control hierarchy, and in the open source, there is a flatter, more fluid organizational architecture that we typically associate with networks. Likewise, there are two distinct fundamental mindsets and operational paradigms that distinguish how a national library conducts its affairs in a predominantly analogue and predominantly digital environment.
The National Library as Cathedral
The national library in the analogue world is a bricks and mortar monument to knowledge. It is characterized by the ideology of the book: literacy and democracy are supported by a comprehensive collection whose access is mediated by professionals primarily for other professionals on whom the public is dependent for access. In this instance, the mapping and description of materials is the exclusive domain of a professional class. The geography of space is structured to bring forth a particular order, for example, the Dewey Decimal System.
The National Digital Library as Electronic Bazaar
The National Digital Library offers access to digital information resources, distributed online, anywhere, anytime. To access the materials, users will make use of third party platforms, more often than not from commercial information service providers. As a result, access is no longer always free as there are costs in making use of computing devices, gaining Internet access, and navigating the Web. More importantly, the organization of the materials is user-centered and is often user-generated. In this domain, the ideology of the book no longer dominates as the act of reading has changed considerably. Rather than engaging a text that conducts an extended argument, readers read in information snippets that make use of extended hyperlinks. Moreover, literacy is no longer about mastering an established canon. Instead, literacy is about the capacity and competency of processing information resources in their myriad forms.
Acquisition, Preservation and Resource Discovery with National Library Analogue Collections
There exists a comprehensive collecting approach. An educated class decides what passes for knowledge. Everything is relevant and enduring. The material arrives pre-mediated by the publishing chain.
That which has been deemed to be included in the collection is to be preserved for all time. The container determines preservation techniques and decisions.
Resource Discovery
Resource Discovery is systemized, ordered, and formal, mediated by rules and protocols that are site specific.
Acquisition, Preservation and Resource Discovery in the Digital Domain
Because of the volume of digital information that is being produced, a comprehensive collection doesn’t make much sense. As well, there is a significant democratization of documentary production. Consequently, selection criteria must be established.
Digital information is considered to be perishable. It will be preserved and recycled as long as it remains relevant. The criteria for what gets preserved are largely market driven. Again, superabundance means that preservation is not simply about storage. In fact, managing the content so that it remains accessible presents a more difficult challenge.
Resource Discovery
Resource Discovery is characterized by a user-generated sense of order. Users want to arrange, describe, and gain access to information resources without mediation.
LAC Responds
At LAC, we have come to the realization that in order to perform the traditional functions of acquisition, preservation, and resource discovery, we must be proactive and move further upstream in the process of documentation towards the point of creation of digital information resources. Capturing the documentary moment is no longer a function performed unilaterally by national memory institutions. It is done collaboratively among the partners of a documentary heritage network. This is very different from the past, when institutions such as ours were appointed to preserve the public memory in specific areas, and given a legislative mandate to manage information resources over time on behalf of the public. Using a whole-of-society framework means new and broader partnerships and relationships between LAC and the rest of society.
Publishing Ecosystem is Far From Equilibrium
We can say that the transition from the printed word to the digital text has moved the publishing ecosystem out of its static analogue state into a dynamic evolving state in which the tension between the driving forces will over time push the publishing domain into another operating order. Given the nature of technological change, we would anticipate that the new equilibrium will be considerably shorter in duration than the age of print. In other words, we have left an epoch in which change is gradual to arrive in an epoch characterized by punctuated equilibrium.
Business Models
The traditional business models with regard to the publication chain have been disrupted. Consumers have far more choice with regard to the information delivery platforms through which they can access information. Consequently, publishers cannot assume that their clients will abide by the protocols they establish when providing information. Constant technological innovation means that consumers may leave en masse and migrate towards another platform, and businesses must adapt quickly to changing market conditions that often include game-changing technology.
Value Propositions
As a corollary to the instability of business models, a key concern becomes the value of the information or the information service that is being offered. Perceived value is also subject to disruptive change. What has value within a particular context of information dissemination and consumption can quickly be lost. In fact, a national archive often becomes the final resting place of information platforms that have dropped out of use, for example, nitrate and cellulose film, transistor radios, tape recorders, etc.
Intellectual Property Rights
It is widely recognized that as we have moved from scarcity to an overabundance of published material, where the cost of making digital copies has dropped to next to nothing, the protection of intellectual property rights would need to evolve. Indeed, the necessity of guaranteeing fair usage becomes readily apparent when the rise of person-to-person computer networks allows for file sharing that threatens the revenue streams that creators have come to depend on. Legislatures around the world are struggling to modernize copyright legislation, and Canada is no exception as we anticipate that new copyright legislation will be tabled this fall.
Information and Communications Technology and User Preferences
In the less than twenty years since the Internet came into being, we have seen a series of technological advances that have radically altered the digital landscape. Computing devices have become increasingly mobile, search engines have made the World Wide Web easy to navigate, and the rise of social media has fundamentally changed the way we communicate, so much so as to even threaten some existing power structures: consider the pivotal role that mobile computing devices and social networks have played in the popular uprisings during the Arab Spring of 2011.
In many ways, we have seen a radical democratization of the info-sphere in which the proliferation of blogs, personal Web pages, self-published video and text, and instant messaging has taken the production of information out of the hands of the few and into the hands of the great many. As well, the lifecycle of any technological platform has been reduced significantly and the fate of any ICT offering is determined by the preferences of the users, who are constantly testing and adapting features of a product to their own needs.
The revenue streams of public and private information resource and service providers appear to be moving in opposite directions. Following the fallout of the Great Recession and the challenge of managing increased public debt, public funding of national libraries and archives struggles to maintain current levels while the market capitalization of private information service providers is constantly on the rise.
The Genie Has Escaped the Lamp
Information has been liberated from its physical containers. As a result, there has been a steady decline in a static relationship between particular content and a particular communications medium. Delivery of content has moved to a single platform, the Internet, and as a consequence, the business models for the delivery of content via physical mediums have faltered.
For example, newspaper circulation is down and in response publishers are moving more and more to electronic formats, accompanied by the challenge of maintaining revenue streams. As well, the music industry has been totally revolutionized by the advent of digital formats transmitted over the Internet and often through person-to-person file sharing. Vinyl records gave way to CDs, which in turn gave way to single song downloads through distributors like iTunes. Finally, in what will certainly be a historic case study, students will examine how the company Blockbuster became North America’s leading home movie distributor based on a bricks and mortar distribution chain only to be eventually undone by video streaming furnished by cable companies and Internet distributors like Netflix and iTunes.
New Concerns Emerge
There has been an initial swing of the pendulum away from monopoly and hegemony as scarcity gave way to abundance. However, recently we have seen a new concentration of control over access in the rise of the giants: Apple, Microsoft, and Google. Large portions of Internet traffic are channelled into applications controlled by the big three vendors.
The Bricks and Mortar Campus Is Crumbling
Among the traditional economic models for bricks and mortar institutions that are experiencing radical transformation is that of the traditional university. The economics of gathering large numbers of students on campus to attend in-person lectures, more often than not taught by graduate students, is being challenged by the online delivery of university courses by accomplished professors from prestigious universities who make all the course material available digitally. Indeed, the business of knowledge distribution is radically altered when renowned universities like MIT and Stanford put their undergraduate programs online and can reach students in Africa who can follow the courses on their smartphones. Similarly, academic publication is also altered when budding academics can publish their papers online, bypassing traditional review procedures, and have their work evaluated by the number of downloads they attract and the number of citations they garner in other electronic academic forums.
Public Information Providers Must Take Heed of What’s Happening in the Private Sector
Gaining access to the Internet-based information resource network for the vast majority of users entails the utilization of a number of private third-party platforms beginning with the computing device, including an Internet service provider, a Web browser furnished by proprietary software and finally the search engine.
Market-driven enterprise, regulated by government, offers its services to a public situated in a global information economy with ICT sales of over $4 trillion annually. The market valuation of the companies before you is in excess of $1 trillion.
Publicly funded information resource providers therefore find themselves competing for the attention of information consumers with far less in the way of human and financial resources as compared to the private sector. Moreover, their services offered to the public may be supplanted by the private sector. For example, governments provide a search function for their websites, but users more often than not use the ubiquitous Google search to find the information they require.
As a result, public information providers face new levels of risk that were previously associated with the private sector. In particular, decisions regarding capital investments in information infrastructure are difficult to make when the ICT landscape is fraught with dynamic change. Indeed, technological advances often function as disrupters and, as a result, can change the lay of the land so that what was perceived as an opportunity to address a particular need can vanish quickly as clients opt for new services or entirely new platforms offered by the private sector.
Heterogeneous Domains of Access
Presently, more and more public domain materials are in the process of being digitized or are born digital, which in theory places this domain of information in the category of free and democratic access.
Access to government-generated information can vary depending on the rights management framework that oversees access to information and privacy laws, copyright, and enabling legislation with regard to open data and open government. In general, there is a world-wide movement to make as much government information as possible available democratically and at little or no cost, but such developments progress at varied rates from one nation to another.
Access to information generated or delivered by the private sector will always carry a cost component that users must be prepared to accept. That being said, commercial information service providers do collaborate with public partners in order to bring large bodies of information to users at no cost as long as they are prepared to use a particular information delivery platform; for example, Apple provides free books and free access to university courses via iTunes.
Open Access vs. Copyright
The nature of the underlying tension between open and limited access is best captured by Stewart Brand’s often misquoted observation that information wants to be free and that information also wants to be expensive.
Clearly Defined Areas
Public domain works allow the Internet Archive, the HathiTrust, and the emerging Digital Public Library of America to collaborate and advance the public good by making available digitally a great number of educational texts.
Commercial publication in which copyright (all rights reserved) exists has a long legal tradition. However, what complicates matters is that there is often a lack of compatibility between proprietary information provider platforms. For example, Google Books will function with e-readers such as the Sony Reader and the Kobo eReader but not with Amazon.com’s Kindle application.
Areas Yet To Be Worked Out
Creative Commons and licensing agreements (some rights reserved) represent a potential third way.
The thorny problem of orphan works, where the rights holders cannot be located, has caused a major problem in the attempt to make large numbers of texts available digitally. It is labour intensive to perform a diligent search to locate the rights holder for each and every orphaned work.
How This is Playing Out
Nowhere is the tension between open access and proprietary platforms greater than in the race to build the first truly great digital library. In many ways it is reminiscent of the race to put the first man on the moon in that the two competing forces emerge from quite different ideological domains. On one side, there are the Americans and the Europeans with their desire to create a comprehensive digital collection founded on the ideology of the book. On the other, there is the financial might and technological expertise of Google, which can not only provide digital access to more than 12 million books with unparalleled search capabilities to enhance access but also, starting tomorrow, provide users with its Chromebook, a low cost cloud computing device that offers an unprecedented, comprehensive, vertical integration of content, software, memory storage and search engine.
As you know, the matter of how to deal with orphan works has created a snag for Google in the courts and has delayed the offering of its digital collection. As we observe the denouement of the legal challenge, we need to keep in mind that we are witnesses to a historical process similar in kind to Gutenberg’s invention of the printing press with movable type, but whose game changing consequences will unfold at a velocity exponentially greater than the societal changes that occurred during the Renaissance.
LAC’s Core Mandate Remains
LAC’s prescribed role to identify, manage, and preserve Canada’s documentary heritage continues. Of particular concern is preservation of the foundation governance documents that can best be defined as the foundational civic goods of our nation—the original documents that record our decisions and actions and the information to be found in our books and other documentary media and artifacts. They are required within society to articulate, express and share common goals; they provide individuals and groups with the capacities of social literacy necessary to enable their democratic participation within their communities; they allow citizens to act on their entitlements, rights and freedoms, and to ensure the accountability of public administration and responsible governance under the rule of law; and they promote an inclusive social consensus and progress through the distribution and sharing of information resources and the preservation of an accessible public memory. In short, foundation governance documents are the information resources that permit Canadians to function collectively as a democracy.
More Than Ever a National Library Must Be Circumspect
A national library cannot assume sole responsibility for providing access to digital information to its citizens. It must be circumspect with regard to the content it is able to capture, preserve, and make accessible, all the while operating within its budgetary constraints and its legislated mandate, especially taking into consideration the unknown consequences of public and private sector attempts to establish large scale digital libraries.
Value Added Propositions
Linked Data: Metadata Interoperability. The rapid growth of Internet resources and digital collections is accompanied by a proliferation of metadata schemas. Each metadata schema has been designed based on the requirements of the particular user community, intended users, type of materials, subject domain, the depth of description, etc. Problems arise when building large digital libraries or repositories with metadata records prepared according to diverse schemas. Most users do not and should not have to know or understand the underlying structure of the digital collection; but in reality, they are experiencing difficulties in resource discovery and access. How to enable a “one-stop” seamless search presents considerable challenges and Library and Archives Canada is determined to provide its users with a seamless search capacity for its entire holdings of archival and library materials.
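To illustrate what metadata interoperability involves in practice, the sketch below shows one common approach, a schema crosswalk, in which records described under different schemas are mapped to a shared set of fields before being searched together. It is a simplified illustration and not a description of LAC's systems: the field mappings are reduced to three elements, the sample records are invented placeholders, and the function names are hypothetical.

```python
# Hypothetical crosswalks: each maps a schema's own field names
# to a small shared vocabulary (title, author, year).
CROSSWALKS = {
    # Dublin Core style element names
    "dublin_core": {"title": "title", "creator": "author", "date": "year"},
    # MARC style tags and subfields, heavily simplified for illustration
    "marc_like": {"245a": "title", "100a": "author", "260c": "year"},
}


def normalize(record: dict, schema: str) -> dict:
    """Translate one record into the shared vocabulary."""
    mapping = CROSSWALKS[schema]
    return {
        common: record[source]
        for source, common in mapping.items()
        if source in record
    }


def search(records: list, term: str) -> list:
    """Return normalized records in which any field contains the term."""
    term = term.lower()
    return [
        r for r in records
        if any(term in str(value).lower() for value in r.values())
    ]


# Invented placeholder records, one per schema.
library_record = {"245a": "Sample harbour survey", "100a": "Author A", "260c": "1901"}
archive_record = {"title": "Harbour commission letters", "creator": "Author B", "date": "1899"}

merged = [
    normalize(library_record, "marc_like"),
    normalize(archive_record, "dublin_core"),
]

hits = search(merged, "harbour")
for hit in hits:
    print(hit)
```

Once both records share the same field names, a single query reaches library and archival material alike, and the user never needs to know which schema described which item.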
National and International Collaboration. There is a growing consensus that the public memory challenges of the Digital Age need to be met by collaborative strategy and research; that potential solutions and interventions will not succeed through independent unilateral actions, but can emerge through institutional and occupational convergences. And I think that we are all beginning to realize that this collaboration cannot be confined to ourselves as memory professionals and memory institutions. We are beginning to understand that the construction and constitution of the civic goods of public memory are a collective social responsibility requiring broad participation across all sectors.
Preservation: Trusted Digital Repository. Beginning in 2017, Library and Archives Canada will no longer receive or will receive very little in the way of paper government records. LAC will become a Trusted Digital Repository, but not in the sense of a bricks and mortar building with an ingest method taken from the analogue era. More precisely, we envision ourselves becoming a node in a distributed network, using Web archiving services in the cloud to maintain reliable, authentic, and trustworthy information. The expertise garnered in working with government records will then be transferred to capture, preserve, and make accessible a greater variety of digital texts.
Dark Storage. With the question of digital preservation being far from settled, printed documents still offer a stable, reliable, long-term medium for preservation. Interestingly enough, the Internet Archive is presently archiving hard copies of the books that it digitizes and then offers online.
Thank you. I would be happy to respond to any questions or comments.