Text mining

The Historical Medical Library, as part of its role with the Medical Heritage Library (MHL), is working on a consortium-wide digitization effort, in conjunction with the Internet Archive, to provide scholarly access to the entirety of the State Medical Society Journals published in the 20th century. For an introduction to this project, you can read my previous blog post.

In this post, I would like to explore what I began to discuss at the end of my last post: the application of computer-aided text analysis techniques, also referred to as “text mining.” In this second post in a series about the MHL project and the possibilities for digital scholarship, I will offer an introduction to some of the core concepts of text mining, as well as some easy-to-use, browser-based tools for getting started without a high level of expertise or specialized software. For readers interested in exploring these concepts and processes more fully, there will be a link to more in-depth resources at the end of this article.