Documenting the 2020 COVID-19 Pandemic

More than ever before, web archiving emerged internationally in 2020 as a rapid-response means of documenting a crisis. The COVID-19 pandemic demonstrated that web archiving is one of the few immediate actions that information professionals, digital librarians, and archivists can take to preserve a historical timeline and primary resources on an extended crisis.

From the beginning of the COVID-19 pandemic in early 2020, LAC's Web Archiving and Social Media Program team was fully engaged in documenting the evolution of the situation and its effects on Canadian society. The team curated a diverse collection that includes websites from government and non-government sources, as well as social media relating to the pandemic's impact on life in Canada.

COVID-19 Collection Scope, Priorities, and Highlights

  • French and English news media (daily newspaper crawls and targeted COVID-19 content)
  • Impact on business and the economy (for example, corporate sites for affected industries)
  • Health, science, and medicine (for example, information about research efforts)
  • Sites focused on social and cultural aspects, including religion, artistic and cultural expression, and impacts on families, children, and education
  • Curated social media related to COVID-19 (e.g., Twitter communications from public health officials; ongoing capture of tweets with hashtags related to COVID-19 in Canada)

This important work not only collects digital information that will serve as historical primary sources on COVID-19 for future research; it will also help tomorrow’s Canadians understand what it was like for those living through this crisis, and provide future leaders with important background, data, and experiences to help guide their decisions.

The information below summarizes LAC's work in capturing website and social media content to document the overall response to the crisis starting on February 1, 2020, and will be updated periodically:

Summary Statistics (2020-02-01 to 2020-11-17)

  • Total news/media websites crawled daily for COVID-19: 34
  • Total non-media web resources selected for COVID-19 collection: 1,429
  • Total digital assets collected for COVID-19: 149,261,077
  • Total data collected (including news media) for COVID-19: 6.2 terabytes
  • Tweets captured for COVID-19 (hashtags #covidcanada, #covid19canada, #canadalockdown, #canadacovid19, #maskupcanada, #masks4canada, #lightuplive, #eclaironslesscenes): 721,651

This graph shows the distribution of all resources collected in the LAC COVID-19 collection by publisher/data origin.


Figure 1: Resource distribution
Resource distribution, see text version below
Figure 1: Resource distribution – text version
  • Government of Canada: 7%
  • Provincial and territorial governments: 5%
  • Non-governmental: 88%

This graph shows the distribution of all resources collected in the LAC COVID-19 collection by language.

Figure 2: Resource language
Resource language, see text version below
Figure 2: Resource language – text version
  • English: 63%
  • French: 37%

This graph shows the distribution of all resources collected in the LAC COVID-19 collection by type of resource.

Figure 3: Resource type
Resource type, see text version below
Figure 3: Resource type – text version
  • Article: 11%
  • Website: 37%
  • Website-partial: 40%
  • Social media: 7%
  • Other (data repository, podcast, Wikipedia, etc.): 5%