To help people explore a digital collection, the platform that hosts it needs a comprehensive understanding of the material it presents. The level and quality of assistance that a system can offer a user depends largely on how much information it holds about the collection. Such information can be supplied by manually tagging and annotating archive contents, but this can be expensive and time-consuming, or even infeasible if the collection is very large.
This talk will explore the challenges involved in the automatic identification and disambiguation of entities within digital cultural heritage collections.
Seamus Lawless is Assistant Professor at Trinity College Dublin.
Our Big Ideas seminar series is funded by the Friends of The National Archives.