How to Create an Entrez Utilities Java Library for Embedding in Java Applications

Update: April 24, 2009.
=====================
In this update, I describe how I recreated the latest version of the NCBI Entrez Utilities Web Service Java library (v2.0), eutils.jar, using Apache Axis2. I am currently using Debian Lenny RC2 AMD64 on a Sony VGN-NR160E laptop.

1) I install the Sun Java JDK in /opt as described below.

2) I download the precompiled Apache Axis2 library and unzip it into any directory.

3) I generate all stub files of the NCBI Entrez Utilities by performing the following steps:

3.1 I change to the bin sub-directory of the unzipped Apache Axis2 directory.

>cd axis2-1.4.1

>cd bin

3.2 Then, I give myself execute permission on wsdl2java.sh by issuing the following command.

>chmod u+x wsdl2java.sh

3.3 I generate all stub files using the following command.

>./wsdl2java.sh -uri http://www.ncbi.nlm.nih.gov/entrez/eutils/soap/v2.0/eutils.wsdl (For all Entrez Utilities except EFetch)

Since I want to use EFetch to retrieve titles and abstracts from NCBI PubMed, I need to run wsdl2java.sh again with the URI for PubMed and EFetch. I get the URI from http://www.ncbi.nlm.nih.gov/entrez/query/static/esoap_help.html, and run the following command.

>./wsdl2java.sh -uri http://www.ncbi.nlm.nih.gov/entrez/eutils/soap/v2.0/efetch_pubmed.wsdl
(For EFetch and PUBMED)

Note that if I want to use EFetch to retrieve articles from PMC, I also need to run wsdl2java.sh once more with the URI for EFetch and PMC, as follows.

>./wsdl2java.sh -uri http://www.ncbi.nlm.nih.gov/entrez/eutils/soap/v2.0/efetch_pmc.wsdl (For EFetch and PMC)

If successful, a new directory structure called “src/gov/nih/nlm/ncbi/www/soap/eutils” is created. When using the first two URIs above (without PMC), this newly generated directory contains the following files:
– EFetchPubmedServiceCallbackHandler.java
– EFetchPubmedServiceStub.java
– EUtilsServiceCallbackHandler.java
– EUtilsServiceStub.java
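The wsdl2java.sh invocations above can also be run in one loop. The following is only a minimal sketch: the names WSDL2JAVA, BASE_URI, and generate_stubs are my own illustrative choices, not part of Axis2, and the only option used is the -uri option shown above.

```shell
#!/bin/bash
# Sketch: run wsdl2java.sh once per WSDL file.
# WSDL2JAVA, BASE_URI and generate_stubs are illustrative names (assumptions).
WSDL2JAVA="${WSDL2JAVA:-./wsdl2java.sh}"
BASE_URI="http://www.ncbi.nlm.nih.gov/entrez/eutils/soap/v2.0"

generate_stubs() {
    # Each argument is a WSDL file name located under BASE_URI.
    local wsdl
    for wsdl in "$@"; do
        # Stop at the first failure so a bad download is not silently skipped.
        "$WSDL2JAVA" -uri "$BASE_URI/$wsdl" || return 1
    done
}

# Example call, run from the Axis2 bin directory:
#   generate_stubs eutils.wsdl efetch_pubmed.wsdl efetch_pmc.wsdl
```

Leaving efetch_pmc.wsdl out of the call gives the same four files listed above.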

3.4 I create Client.java, which uses ESearch and EFetch to retrieve a set of PubMed IDs and their corresponding titles and abstracts from NCBI PubMed. I put the Client.java file into “src/gov/nih/nlm/ncbi/www/soap/eutils”.

4) Now, I compile all the generated stub files by using the “compile.bat” file provided at http://www.ncbi.nlm.nih.gov/entrez/eutils/soap/v2.0/DOC/esoap_java_help.html.

4.1 I copy and paste the content of the “compile.bat” file into a new file called “compile.sh”. Note that I use the “.sh” extension because I am working on a Linux operating system.

4.2 I edit the content of the “compile.sh” file to suit the Linux format. After editing, the file should look like below.

———————————————————————————————————————————–

————————————————————————————————————————————–
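As a rough idea of what a Linux compile script for these stubs can look like, here is a minimal sketch. It is not the content of the NCBI “compile.bat”. The variable names AXIS2_HOME, SRC_DIR, and OUT_DIR, the default paths, and the two function names are all assumptions of mine. It relies on two standard javac features: the classpath wildcard (available since Java 6) and @argument files.

```shell
#!/bin/bash
# Sketch of a compile step for the generated stubs.
# AXIS2_HOME, SRC_DIR and OUT_DIR are assumed names and default locations.
AXIS2_HOME="${AXIS2_HOME:-$HOME/axis2-1.4.1}"
SRC_DIR="${SRC_DIR:-src}"
OUT_DIR="${OUT_DIR:-classes}"

list_sources() {
    # Print every .java file under the given directory, one per line.
    # LC_ALL=C keeps the sort order the same on every system.
    find "$1" -name '*.java' | LC_ALL=C sort
}

compile_stubs() {
    mkdir -p "$OUT_DIR" || return 1
    list_sources "$SRC_DIR" > sources.txt
    # The quoted lib/* is expanded by javac (all jars in lib), not by the shell.
    javac -cp "$AXIS2_HOME/lib/*" -d "$OUT_DIR" @sources.txt
}

# Example call, run from the directory that contains src:
#   compile_stubs
```

Writing the class files to a separate directory is a design choice in this sketch; it keeps the generated sources and the compiled classes apart.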

4.3 I put the “compile.sh” file in the same directory as the “src” sub-directory generated in the previous step (that is, in the bin directory of Apache Axis2 if the stubs were generated there).

4.4 Then, I add execute permission to the “compile.sh” file.

>chmod u+x compile.sh

4.5 I compile the stub files in the “src” directory.

>./compile.sh

Several class files are generated after the compilation.

—————————————————————————————————–

Entrez Utilities, or EUtils, is a web service from NCBI that allows users to retrieve biomedical documents and other related information from Java applications. Its web site is at http://www.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html

The goal of this post is to create an Entrez Utilities Java library for embedding in Java applications. I will call the library eutils.jar. There are four main steps: 1) install the Sun JDK and set up the Java path, 2) download and install Apache Axis-1_4, 3) download and compile Entrez Utilities, and 4) construct the eutils.jar file.

Install Sun JDK and setup Java path

– Download the latest binary version of the JDK (the .bin file, not the rpm) from the Sun web site. The current version is jdk1.6.0_02.

– Make the downloaded file executable.

>chmod u+x jdk1.6.0_02-*.bin

– Install the JDK into the /opt directory by running the installer from there.

>./jdk1.6.0_02-*.bin

After a successful installation, the JDK should be in the /opt/jdk1.6.0_02 directory. Next, set up JAVA_HOME and the Java path. As a normal user,

>cd

>vi .mysetup

In .mysetup, enter the following text.

—————————————————–

JAVA_HOME=/opt/jdk1.6.0_02

export JAVA_HOME

PATH=$JAVA_HOME/bin:$PATH

export PATH

———————————————-

Then, save the .mysetup file, and edit the .bashrc file.

>cd

>vi .bashrc

Add the following line at the end of the .bashrc file.

——————————————–

source ~/.mysetup

——————————————–

Save the .bashrc file and activate the change.

>source .bashrc

Now, we need to download and install Apache Axis. Although Entrez Utilities has not been officially tested with Axis-1_4, it works fine with that version.

Go to the Apache Axis web site (use Google to find it) and download the binary tar.gz version. Although I am using the 64-bit version of Debian, I can use the pre-compiled binary version of Axis-1_4 without recompiling the source distribution.

To install Apache Axis-1_4, I just unpack the downloaded binary tar.gz file and move it to /opt. Then, I set AXIS_HOME=/opt/axis-1_4 and add $AXIS_HOME/bin to PATH in the .mysetup file. So, now we have two installed directories, /opt/jdk1.6.0_02 and /opt/axis-1_4. The content of the .mysetup file becomes:

————————————————————

JAVA_HOME=/opt/jdk1.6.0_02

export JAVA_HOME

AXIS_HOME=/opt/axis-1_4

export AXIS_HOME

PATH=$JAVA_HOME/bin:$AXIS_HOME/bin:$PATH

export PATH

————————————————————–

Save the .mysetup file, and reactivate the changes by issuing the following command in my home directory.

>source .bashrc

Next, following the instructions on the Entrez Utilities web site, we need to create the “generate.sh” and “compile.sh” files. Because we are working on a Linux machine, we use “.sh” files instead of “.bat” files.

To save time, I copy the whole content of the “generate.bat” file and paste it into my “generate.sh” file. Then, we need to change the Windows syntax to Linux syntax. Therefore, we need to:

1. replace ‘%’ with ‘$’

2. replace ‘\’ with ‘/’

3. replace ‘;’ with ‘:’

4. remove every “set” keyword. In a Linux shell, variables are assigned without “set”.

After the replacement, we need to remove the ‘$’ sign from the end of each variable. For example, in Windows an environment variable such as AXIS_LIB is referred to as %AXIS_LIB%, which the first replacement turns into $AXIS_LIB$. In Linux, we use $AXIS_LIB.

Then, we need to change the value of AXIS_LIB to reflect the actual path on our Linux system. In this case, AXIS_LIB should be /opt/axis-1_4/lib.
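The replacement rules above can be sketched as a small sed filter. This is an illustration under assumptions, not a general batch-to-shell converter: bat_to_sh is a name I made up, the filter rewrites %NAME% to $NAME in a single step (so there is no trailing ‘$’ to remove afterwards), and it also strips Windows carriage returns and turns a batch argument such as %1 into $1. Windows paths such as the value of AXIS_LIB still have to be edited by hand.

```shell
#!/bin/bash
# Sketch: apply the Windows-to-Linux replacements to a .bat file read on stdin.
# bat_to_sh is an illustrative name (assumption).
bat_to_sh() {
    tr -d '\r' |                                         # drop Windows line endings
    sed -e 's/%\([A-Za-z_][A-Za-z0-9_]*\)%/$\1/g' \
        -e 's/%\([0-9]\)/$\1/g' \
        -e 's|\\|/|g' \
        -e 's/;/:/g' \
        -e 's/^[Ss][Ee][Tt]  *//'
    # Rules in order: %NAME% -> $NAME, %1 -> $1, '\' -> '/', ';' -> ':',
    # and removal of a leading "set" keyword.
}

# Example call:
#   bat_to_sh < generate.bat > generate.sh
```

The converted file should still be read through once, because any text that merely contains ‘;’ or ‘\’ is rewritten as well.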

After completing “generate.sh”, we are ready to download and install Entrez Utilities. First of all, we need to make the script executable. The easy way is to issue:

>chmod 755 generate.sh

Then, we can run the “generate.sh” by:

>./generate.sh http://eutils.ncbi.nlm.nih.gov/entrez/eutils/soap/utils.wsdl

If there is no error, a new directory called “gov” will be created in the same working directory as “generate.sh”.

Next, we need to compile all Java files in the “gov” directory. So, we need to create a “compile.sh” file.

As with “generate.sh”, I just copy all the content of the “compile.bat” file on the Entrez Utilities web site into my “compile.sh”. Then, I perform the Windows/Linux replacement and change AXIS_LIB and JDK_HOME to reflect their actual paths on my system. Next, I make “compile.sh” executable, and execute it with the following command.

>./compile.sh

The above command compiles all Java source files in the “gov” directory into class files.

Next, we will create a jar file that we can use as a Java library for embedding in any Java application. To create a jar file named “eutils.jar”, issue the following command.

>jar -cf eutils.jar gov/*

To use the newly created “eutils.jar” library, we create a Client.java as described on the Entrez Utilities web site. In Client.java, we need to change the package statement to reflect the actual package path, and we may need to add one more import statement. To run Client.java, we need two things on the classpath: “eutils.jar” and every jar file in /opt/axis-1_4/lib. Putting only /opt/axis-1_4/lib/axis.jar on the classpath is not enough; all of the jars are needed. Alternatively, we can set the classpath in the “.mysetup” file in our home directory.
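One way to put every jar on the classpath is to build the colon-separated list with a small shell function. This is a minimal sketch; build_classpath is a hypothetical helper name of mine, and the paths in the example call are the ones used in this post.

```shell
#!/bin/bash
# Sketch: build a colon-separated classpath from a directory of jars.
# build_classpath is an illustrative name (assumption).
build_classpath() {
    # $1: directory containing jar files.
    # Remaining arguments: extra entries (for example eutils.jar), placed first.
    local lib_dir="$1"; shift
    local cp="" entry
    for entry in "$@" "$lib_dir"/*.jar; do
        # Skip entries that do not exist, including an unmatched *.jar pattern.
        [ -e "$entry" ] || continue
        cp="${cp:+$cp:}$entry"
    done
    printf '%s\n' "$cp"
}

# Example call, which could also go at the end of .mysetup:
#   CLASSPATH=$(build_classpath /opt/axis-1_4/lib "$HOME/eutils.jar" .)
#   export CLASSPATH
```

With CLASSPATH exported this way, both javac and java pick up “eutils.jar” and all of the Axis jars without a long -classpath argument.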
