[Rescheduled] Digital Humanities Workshops: OCR for Non-Roman Texts

Attendees will be introduced to the basics of optical character recognition (OCR), which makes a digitized document full-text searchable and open to other kinds of text manipulation, with a particular focus on OCR for materials in languages other than English and in scripts other than Roman/Latin. OCR is fairly commonplace for English and other Roman-script languages such as French or Spanish, but it works less reliably for languages such as Arabic, Hindi, or Chinese. This workshop is an opportunity to explore Kraken, an open-source OCR tool that has demonstrated success with some non-Roman scripts. The workshop will look at a few different non-Roman scripts; participants are also encouraged to bring a digitized, highly legible sample text of interest.

[Originally scheduled for 9/4/19, this workshop will take place October 2, 2019]

Friday, October 4, 2019 at 1:00pm to 2:30pm

Perry-Castañeda Library (PCL), Learning Lab 2
101 21ST ST E, Austin, Texas 78705

Event Type

Academics, Arts & Humanities

Departments

University of Texas Libraries

Target Audience

Students, Faculty

Website

https://www.lib.utexas.edu/events/297
