Title: Text Mining: Generating hypotheses from MEDLINE
Author: Padmini Srinivasan.
Publisher: Journal of the American Society for Information Science and Technology, 55(5): 396-413, 2004.
In this paper, the author presents two literature-based discovery algorithms for hypothesis generation, termed the open discovery and closed discovery algorithms. Their goal is to minimize the human intervention needed in the hypothesis discovery process. The algorithms use the MeSH terms pre-assigned to biomedical documents, together with UMLS semantic types, to construct topic profiles, which are then used to generate a ranked list of terms that represent candidate novel discoveries. A topic profile is a vector of vectors of MeSH terms, with one vector of MeSH terms per UMLS semantic type. In other words, the vector for a given UMLS semantic type contains all the MeSH terms that belong to that semantic type. These MeSH terms are taken from those pre-assigned to biomedical documents. The open discovery algorithm starts from a user topic A and proceeds to its topic profile AP, then to a list of B-terms, then to a set of topic profiles BPs, and finally to a topic profile CP; a term in CP is considered novel if a MEDLINE search combining it with the user-specified topic A returns zero results. Similarly, the closed discovery algorithm starts from the user-specified topics A and C and proceeds to their topic profiles AP and CP, then to a list of B-terms, which are the MeSH terms common to AP and CP, and finally to a topic profile BP; a term in BP is considered novel if a MEDLINE search combining it with the user-specified topics A and C returns zero results.
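The two pipelines can be illustrated with a minimal Python sketch. It is not the paper's implementation: the corpus, term names, semantic types, and the helpers `search`, `build_profile`, `open_discovery`, and `closed_discovery` are all hypothetical, a small in-memory list stands in for MEDLINE, and raw co-occurrence counts stand in for whatever term weighting the paper uses. The closed discovery novelty check follows one reading of the summary above (a search combining the B-term with both A and C).

```python
from collections import Counter, defaultdict

# Toy stand-in for MEDLINE: each record is the set of MeSH terms assigned
# to one document. All terms and semantic types are invented placeholders.
CORPUS = [
    {"A", "b1"},
    {"A", "b1", "b2"},
    {"b1", "c1"},
    {"b2", "c2"},
    {"b2", "c1"},
    {"A", "c3"},
    {"b2", "c3"},
]
SEMANTIC_TYPE = {
    "A": "Disease",
    "b1": "Function", "b2": "Function",
    "c1": "Substance", "c2": "Substance", "c3": "Substance",
}

def search(*terms):
    """Return the documents that contain every given term."""
    return [doc for doc in CORPUS if all(t in doc for t in terms)]

def build_profile(topic):
    """Topic profile: one weighted vector of MeSH terms per semantic type.
    Weights are raw co-occurrence counts (an assumption of this sketch)."""
    profile = defaultdict(Counter)
    for doc in search(topic):
        for term in doc - {topic}:
            profile[SEMANTIC_TYPE[term]][term] += 1
    return profile

def open_discovery(topic_a, b_type, c_type, top_m=2):
    """A -> AP -> B-terms -> BPs -> CP, keeping C-terms never seen with A."""
    ap = build_profile(topic_a)
    b_terms = [t for t, _ in ap[b_type].most_common(top_m)]
    cp = Counter()
    for b in b_terms:
        cp.update(build_profile(b)[c_type])  # merge the BPs into CP
    ranked = sorted(cp.items(), key=lambda kv: (-kv[1], kv[0]))
    # Novelty filter: a search for A together with the term finds nothing.
    return [(c, w) for c, w in ranked if not search(topic_a, c)]

def closed_discovery(topic_a, topic_c, b_type):
    """A, C -> AP, CP -> shared B-terms -> BP, filtered for novelty."""
    ap, cp = build_profile(topic_a), build_profile(topic_c)
    shared = set(ap[b_type]) & set(cp[b_type])
    bp = {b: ap[b_type][b] + cp[b_type][b] for b in shared}
    ranked = sorted(bp.items(), key=lambda kv: (-kv[1], kv[0]))
    # Novelty filter: no document mentions A, the B-term, and C together.
    return [(b, w) for b, w in ranked if not search(topic_a, b, topic_c)]

if __name__ == "__main__":
    print(open_discovery("A", "Function", "Substance"))
    print(closed_discovery("A", "c1", "Function"))
```

In this toy corpus, `c3` is reachable from A through `b2` but is dropped by the open discovery filter because one document already mentions A and `c3` together, which is the role the zero-result MEDLINE search plays in the summary above.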