Category Archives: text mining

Text preprocessing: Wrap around a punctuation with spaces using Java

Input text = “Hello-my $world?”
Output text = “Hello - my $ world ? ”
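One way to get this result is a regular-expression replacement. The sketch below is not necessarily the code from the original post: the class name `PunctuationSpacer` and the method `wrap` are made up for illustration, and it assumes that "punctuation" means the ASCII characters matched by Java's `\p{Punct}` class (which includes symbols such as `$` and `-`).

```java
import java.util.regex.Pattern;

// Hypothetical helper, named for illustration only.
public class PunctuationSpacer {

    // \p{Punct} matches ASCII punctuation and symbols: !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
    private static final Pattern PUNCT = Pattern.compile("\\p{Punct}");

    public static String wrap(String text) {
        // Put one space on each side of every punctuation character ($0 is the matched character).
        String spaced = PUNCT.matcher(text).replaceAll(" $0 ");
        // Collapse the doubled spaces that appear when a space was already present.
        return spaced.replaceAll(" {2,}", " ");
    }

    public static void main(String[] args) {
        // Brackets make the trailing space visible; prints "[Hello - my $ world ? ]"
        System.out.println("[" + wrap("Hello-my $world?") + "]");
    }
}
```

The trailing space after `?` is kept on purpose so that the output matches the example above; call `trim()` on the result if it is not wanted.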

Posted in Java, text mining

Resources for literature mining research in biomedical domain

The goal of this blog is to collect useful resources for literature mining research in the biomedical domain, ranging from software tools and datasets to interesting literature. Its content will be updated regularly. Currently, I am focusing on automatic pattern generation, coreference resolution and discourse …

Posted in text mining

Setup WEKA and LIBSVM for WEKA in Debian Etch

When we try to use LIBSVM through WEKA, we might find that libsvm.jar is not on the classpath. Here, I’d like to describe how to set up WEKA and LIBSVM for WEKA. 1) Download weka-3.5.7.zip from http://www.cs.waikato.ac.nz/ml/weka/, and unzip …
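A quick way to confirm whether libsvm.jar is visible to the JVM is to try loading one of its classes. The sketch below is only a diagnostic aid, not part of the setup steps: the class name `LibsvmClasspathCheck` is made up for illustration, and it assumes the LIBSVM Java distribution exposes its main class as `libsvm.svm`.

```java
// Hypothetical diagnostic, named for illustration only.
public class LibsvmClasspathCheck {

    // Returns true if the named class can be found on the current classpath.
    public static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class.forName(className, false, LibsvmClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: "libsvm.svm" is the main class shipped in libsvm.jar.
        String libsvmClass = "libsvm.svm";
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
        System.out.println(libsvmClass + " found: " + isOnClasspath(libsvmClass));
    }
}
```

Run it with the same classpath used to start WEKA; if the class is reported as not found, libsvm.jar still needs to be added to that classpath.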

Posted in Java, Linux, text mining | 1 Comment

A Practice on LIBSVM Example in Debian Etch Using Java

This exercise comes from the paper “A Practical Guide to Support Vector Classification” by Chih-Wei Hsu, Chih-Chung Chang and Chih-Jen Lin. It can be downloaded from http://www.csie.ntu.edu.tw/~cjlin/libsvm/index.html. The goal of this page is to record step-by-step instructions of an example …

Posted in Java, text mining | 3 Comments

Literature reading: A semantic approach for mining hidden links from complementary and non-interactive biomedical literature

Hu X., Zhang X., Yoo I., Zhang Y., A Semantic Approach for Mining Hidden Links from Complementary and Non-interactive Biomedical Literature, in the 2006 SIAM Conference on Data Mining, pp. 200-209, Bethesda, Maryland, April 20-22, 2006.

If I think about …

Posted in text mining

Literature reading: Knowledge discovery across documents through concept chain queries

Title: Knowledge discovery across documents through concept chain queries
Authors: Wei Jin and Rohini K. Srihari
Publisher: Sixth IEEE International Conference on Data Mining – Workshops (ICDMW’06), IEEE Computer Society, 2006.

In this paper, the authors combine Srinivasan’s topic profile and …

Posted in text mining

Literature reading: Text mining: Generating hypotheses from MEDLINE

Title: Text Mining: Generating hypotheses from MEDLINE
Author: Padmini Srinivasan
Publisher: Journal of the American Society for Information Science and Technology, 55(5): 396-413, 2004.

In this paper, the author presents two literature-based discovery algorithms for hypothesis generation, termed open discovery and closed discovery algorithms. …

Posted in text mining