Digital Humanities Research Network: Searching for John: An analysis of Papers Past URL logs

The October DHRN meetup will feature a presentation from Karin Stahel on her PhD research, which focuses on using machine learning methods to classify historical newspaper articles by genre and topic. Join us 1-2 pm on Thursday 24 October in Elsie Locke 313.

I will talk about the patterns and trends discovered in my recently completed analysis of a URL log dataset containing over 4 million unique Papers Past URLs.

The analysis was completed as part of my PhD project. For the user study stage of the project, the National Library of New Zealand provided me with the URL log dataset, which covers the six months from October 2023 to March 2024. In this talk, I will discuss some of the features and patterns of user search behaviour found in the logs, including the most visited newspapers and articles and the types of search filters and queries used. I will also discuss how newspaper article genre terms such as “notice”, “poetry”, and “letter” are used in search queries, and which topics or concepts were searched for in relation to these genres.