Added by Sarah on 25 November 2008 18:22

At the JISC Collections AGM on 20 November 2008, Sophia Ananiadou, Director of the National Centre for Text Mining (NaCTeM), gave an excellent presentation on what text mining is, why it matters for researchers and how it facilitates new and innovative research.

Against the background of information overload and the difficulty of keeping up with the growing volume of new literature, Sophia made the point that much information on the web is unstructured (she estimates about 80%) and/or not searchable (e.g. text in PDF or PowerPoint files, which cannot be found by ordinary search engines). She explained how text mining not only helps with finding relevant information, but can also make intelligent connections to scholarship from other fields and provoke questions that might not otherwise have been asked.

Text mining is a step on from the type of search engine we are used to. It enables a search to be done on semantic tags and on the relationships between these tags, to make more intelligent connections. JISC have produced some useful fact sheets explaining what text mining is and how it can be useful for researchers: "National Centre for Text Mining: Introduction to tools for researchers" and "What text mining can do".

The Centre has tools and services to help institutions and researchers do text mining. The current areas of focus are biology and the social sciences, but the Centre has had a lot of interest from publishers and hopes to expand the disciplines it covers.

One major issue is getting access to the full text of journals for text mining: currently only abstracts are used, because of copyright issues. Sophia hopes that by getting publishers more involved, this issue can be raised and dealt with.

