Love Data Week Workshop: Navigating HathiTrust with Python

Feb 14, 2024
10 - 11am EST
Online
Registration is required
This event is free

Who can attend?

  • General public
  • Faculty
  • Staff
  • Students

Contact

Data Services

Description

When working with a digital repository, a fundamental research task is constructing a subcorpus. HathiTrust is currently the largest open collaborative repository, housing more than 18 million volumes and offering remote, open access to scholars who work with these sources across a variety of disciplines. However, finding specific subcorpora in such a large repository can be challenging because of labeling errors, OCR errors, and other issues. This workshop covers a research workflow for programmatically interacting with HathiTrust that can be applied to a variety of tasks, including finding subcorpora that the metadata does not label directly or cleanly.

See the entire schedule of Johns Hopkins Data Services' Love Data Week.

Registration

Please register in advance
