Literature reading: A semantic approach for mining hidden links from complementary and non-interactive biomedical literature

Hu X., Zhang X., Yoo I., Zhang Y., A Semantic Approach for Mining Hidden Links from Complementary and Non-interactive Biomedical Literature, in the 2006 SIAM Conference on Data Mining, pp. 200-209, Bethesda, Maryland, April 20-22, 2006

When I think about this paper, I think of a literature-based discovery system that uses UMLS semantic relations to derive two sets of UMLS semantic types. One set restricts B-term generation and the other constrains A-term generation. In other words, UMLS concepts in the B-list and A-list must belong to the semantic types derived this way. In addition, the authors use the UMLS hierarchy to eliminate semantic types that are too general, such as those at Levels 1, 2 and 3 of the UMLS Semantic Network.
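To make the type filtering concrete for myself, here is a minimal sketch of how I picture it. It is not the authors' code. The tiny hierarchy, the concept-to-type assignments, the function names, and the convention that the root sits at Level 1 are all my own assumptions for illustration, not real UMLS data.

```python
# Toy is-a hierarchy of semantic types (child -> parent).
# Hypothetical and heavily simplified; NOT the real UMLS Semantic Network.
PARENT = {
    "Entity": None,
    "Physical Object": "Entity",
    "Substance": "Physical Object",
    "Chemical": "Substance",
    "Pharmacologic Substance": "Chemical",
}


def level(sem_type):
    """Depth of a semantic type, counting the root as Level 1 (assumption)."""
    depth = 1
    while PARENT[sem_type] is not None:
        sem_type = PARENT[sem_type]
        depth += 1
    return depth


def drop_general_types(sem_types, max_general_level=3):
    """Keep only types that sit below the 'too general' levels."""
    return {t for t in sem_types if level(t) > max_general_level}


def filter_concepts(concept_types, allowed_types):
    """Keep concepts that have at least one allowed semantic type."""
    return sorted(c for c, types in concept_types.items() if types & allowed_types)


# Toy concept-to-type assignments (hypothetical).
concept_types = {
    "Magnesium": {"Pharmacologic Substance"},
    "Materials": {"Substance"},
}

allowed = drop_general_types(set(PARENT))
kept = filter_concepts(concept_types, allowed)
print(allowed)
print(kept)
```

In this toy setup only the types below Level 3 survive, so a concept typed only as "Substance" is dropped from the candidate list while one typed as "Pharmacologic Substance" is kept.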

As mentioned in the paper, the goal of the proposed system is to reduce human intervention in selecting B terms, which is the main drawback of Swanson’s approach. The authors replicate Swanson’s Migraine/Magnesium and Raynaud disease/Fish Oils discoveries and compare the results with LSI-based and association-rule approaches. The system requires users to provide a starting concept C, a date range, the initial semantic relations between the starting concept C and the to-be-discovered target concept A, and the role of the starting concept C in those relations (subject or object).

The authors claim that their method is novel in that no existing literature-based discovery approach to hypothesis generation exploits UMLS semantic relations. Existing methods use terms, MeSH terms, and UMLS semantic types.

In short, the system can be divided into two phases. The first phase generates constraints, or category restrictions, in terms of the semantic types to which UMLS concepts in the B-list and A-list must belong. The second phase finds UMLS concepts for the B-list and A-list as usual. These concepts come from pre-assigned MeSH terms, and terms are ranked by frequency counts.
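The second phase can be sketched roughly as follows. This is my own toy reading, not the authors' implementation. The documents, the semantic type labels, the function names, the `top_b` cutoff, and the step that drops concepts already co-occurring with C (the usual Swanson-style open discovery) are all assumptions. The allowed type sets are taken as given from the first phase.

```python
from collections import Counter


def rank_linked(docs, seed, allowed_types, concept_types):
    """Count concepts that co-occur with `seed` and have an allowed type."""
    counts = Counter()
    for mesh_terms in docs:
        if seed not in mesh_terms:
            continue
        for term in mesh_terms:
            if term != seed and concept_types.get(term, set()) & allowed_types:
                counts[term] += 1
    return counts


def discover(docs, c_term, b_types, a_types, concept_types, top_b=2):
    """Return (B-list, ranked A-list) for a starting concept C."""
    # B-list: most frequent type-restricted concepts co-occurring with C.
    b_counts = rank_linked(docs, c_term, b_types, concept_types)
    b_list = [b for b, _ in b_counts.most_common(top_b)]

    # A-list: type-restricted concepts co-occurring with any B concept.
    a_counts = Counter()
    for b in b_list:
        a_counts.update(rank_linked(docs, b, a_types, concept_types))

    # Assumption: drop anything already directly linked to C.
    known = {t for d in docs if c_term in d for t in d}
    for term in known:
        a_counts.pop(term, None)
    return b_list, a_counts.most_common()


# Each document is the set of MeSH terms pre-assigned to it (toy data).
docs = [
    {"Migraine", "Vasoconstriction"},
    {"Migraine", "Vasoconstriction", "Headache"},
    {"Migraine", "Platelet Aggregation"},
    {"Vasoconstriction", "Magnesium"},
    {"Platelet Aggregation", "Magnesium"},
    {"Vasoconstriction", "Calcium"},
]
# Toy type assignments, not checked against UMLS.
concept_types = {
    "Vasoconstriction": {"Physiologic Function"},
    "Platelet Aggregation": {"Physiologic Function"},
    "Headache": {"Sign or Symptom"},
    "Magnesium": {"Element, Ion, or Isotope"},
    "Calcium": {"Element, Ion, or Isotope"},
}

b_list, a_ranked = discover(
    docs,
    "Migraine",
    b_types={"Physiologic Function"},
    a_types={"Element, Ion, or Isotope"},
    concept_types=concept_types,
)
print(b_list)
print(a_ranked)
```

On this toy data, "Headache" never reaches the B-list because its type is not allowed, and "Magnesium" outranks "Calcium" in the A-list simply because it co-occurs with the B concepts more often.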

If I have to say something negative about this paper, I would say that the experimental results should also be compared with those from existing work such as Swanson, Srinivasan, and Gordon and Lindsay, among others. The results are not really trustworthy because the authors use their own implementations of the LSI-based and association-rule approaches.
