Sapelo Island Microbial Observatory

RDPquery

 

Introduction

RDPquery is a Java application for retrieving taxonomic identifications for prokaryotic 16S rRNA gene sequences. The program uses the Ribosomal Database Project's (http://rdp.cme.msu.edu) online sequence match tool to retrieve classification information. RDPquery was created by Glen Dyszynski and Wade Sheldon in the Departments of Microbiology and Marine Sciences at the University of Georgia. It relies on JAligner, a Java application by Ahmed Moustafa that creates alignments and performs comparisons on sequence data.

Classification Strategy

The general strategy is as follows (Figure 1). For each query sequence, RDPquery asks the RDP to find the 10 entries (or a specified number of entries from 1 to 20) with the highest Sab values. However, the sequence with the highest Sab value is frequently not the sequence with the highest similarity, just as the sequence with the highest BLAST score frequently does not have the highest similarity. RDPquery therefore uses JAligner to calculate the sequence similarity between the query and each of the sequences with high Sab values. To limit the number of requests made to the RDP, this calculation is done locally using a downloaded copy of all the RDP sequences. It is therefore necessary to download the most recent version of the RDP sequences, so that every matching sequence returned by the RDP can be found in the local RDP database FASTA file.
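The re-ranking step described above can be sketched in a few lines of Java. This is a minimal illustration, not RDPquery's code: it substitutes a small hand-written Smith-Waterman routine with a linear gap penalty for JAligner, and the scoring values, class name, method names, and sequence IDs are all assumptions made for the example. Similarity is approximated here as percent identity over the best local alignment, which may differ from the value JAligner reports.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class RerankBySimilarity {

    // Scoring scheme for the stand-in aligner (assumed values, not RDPquery's).
    static final int MATCH = 2, MISMATCH = -1, GAP = -2;

    /** Percent identity over the best local (Smith-Waterman) alignment of a and b. */
    static double percentIdentity(String a, String b) {
        int n = a.length(), m = b.length();
        int[][] score = new int[n + 1][m + 1];
        int best = 0, bi = 0, bj = 0;
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                int sub = a.charAt(i - 1) == b.charAt(j - 1) ? MATCH : MISMATCH;
                int diag = score[i - 1][j - 1] + sub;
                int up = score[i - 1][j] + GAP;
                int left = score[i][j - 1] + GAP;
                score[i][j] = Math.max(0, Math.max(diag, Math.max(up, left)));
                if (score[i][j] > best) { best = score[i][j]; bi = i; bj = j; }
            }
        }
        // Trace back from the best cell, counting identical columns and all columns.
        int matches = 0, columns = 0, i = bi, j = bj;
        while (i > 0 && j > 0 && score[i][j] > 0) {
            int sub = a.charAt(i - 1) == b.charAt(j - 1) ? MATCH : MISMATCH;
            if (score[i][j] == score[i - 1][j - 1] + sub) {
                if (sub == MATCH) matches++;
                i--; j--;
            } else if (score[i][j] == score[i - 1][j] + GAP) {
                i--;   // gap in b
            } else {
                j--;   // gap in a
            }
            columns++;
        }
        return columns == 0 ? 0.0 : 100.0 * matches / columns;
    }

    /** Returns the ID of the candidate with the highest percent identity to the query. */
    static String bestMatch(String query, Map<String, String> candidates) {
        String bestId = null;
        double bestIdentity = -1.0;
        for (Map.Entry<String, String> e : candidates.entrySet()) {
            double identity = percentIdentity(query, e.getValue());
            if (identity > bestIdentity) { bestIdentity = identity; bestId = e.getKey(); }
        }
        return bestId;
    }

    public static void main(String[] args) {
        // Hypothetical candidates, listed in the order the Sab ranking returned them.
        Map<String, String> candidates = new LinkedHashMap<>();
        candidates.put("hitA", "ACGTTGCAACGT"); // ranked first by Sab in this made-up case
        candidates.put("hitB", "ACGTACGTACGT");
        System.out.println(bestMatch("ACGTACGTACGT", candidates)); // prints hitB
    }
}
```

In this toy case the entry listed first is not the most similar one, so re-ranking by alignment changes which sequence is chosen. With real data the candidate sequences would be looked up by ID in the locally downloaded FASTA file rather than hard-coded.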


Figure 1. Overview of RDPquery

RDPquery then identifies the sequence with the highest similarity and creates an output file with two sets of taxonomic identifications. The first set contains all the taxonomic data provided by RDP for the sequence with the highest similarity. The second set contains only those taxonomic identifications where the similarity value exceeds a predetermined cutoff. These cutoffs were generated by a survey of the taxonomy in Bergey's Manual of Systematic Bacteriology [2] (Figure 2). The default cutoff values were set to represent the similarity value at which one would be 95% confident in declaring a given taxonomic assignment. For instance, 95% of the comparisons we surveyed between members of different genera from within the same family possessed less than 95% sequence similarity. Similarly, 95% of the comparisons between members of different families from within the same order possessed less than 92% sequence similarity. Thus, a clone possessing 94% sequence similarity to a type strain would be classified in the same family but not in the same genus. These guidelines are conservative and tend to assign clones to taxonomic groups only when there is a high level of confidence. Note, however, that the guidelines were developed from nearly complete sequences, so caution should be used when applying them to partial sequences, which may be more or less conserved than the entire gene.
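The cutoff rule can be illustrated with a short Java sketch. It uses only the two default values stated above (95% for genus, 92% for family); cutoffs for other ranks are not given in this text and are left out. The class and method names are hypothetical, and treating a similarity exactly equal to the cutoff as passing is an assumption.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CutoffFilter {

    /**
     * Minimum percent similarity required to keep an assignment at a rank.
     * Only the genus (95) and family (92) values come from the text;
     * ranks without an entry are simply not evaluated by this sketch.
     */
    static final Map<String, Double> CUTOFFS = new LinkedHashMap<>();
    static {
        CUTOFFS.put("family", 92.0);
        CUTOFFS.put("genus", 95.0);
    }

    /** Returns the ranks whose cutoff is met by the given percent similarity. */
    static List<String> supportedRanks(double similarity) {
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, Double> e : CUTOFFS.entrySet()) {
            // Assumption: a similarity equal to the cutoff counts as meeting it.
            if (similarity >= e.getValue()) kept.add(e.getKey());
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(supportedRanks(94.0)); // prints [family]
        System.out.println(supportedRanks(96.5)); // prints [family, genus]
        System.out.println(supportedRanks(90.0)); // prints []
    }
}
```

The first call reproduces the example from the text: a clone with 94% similarity to a type strain keeps its family assignment but not its genus assignment.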

Figure 2. Survey of taxonomic assignments in Bergey's Manual of Systematic Bacteriology. At each level, the rRNA sequence similarity was determined for representatives of different taxa from within the same higher taxonomic group. Thus, at the genus level, representatives of genera within the same family were compared. At the family level, representatives within the same order were compared. All sequences used were from type strains and >1300 bp. No more than six sequences were selected from any one taxon.
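One way to read the default cutoffs is as the 95th percentile of the between-taxon similarities collected in the survey. The sketch below shows that calculation using the nearest-rank percentile definition. The choice of percentile method is an assumption, the names are hypothetical, and the similarity values are invented purely for illustration; they are not the survey data.

```java
import java.util.Arrays;

class CutoffFromSurvey {

    /** Nearest-rank percentile: the smallest value with at least p percent of the data at or below it. */
    static double percentile(double[] values, double p) {
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        // Invented percent similarities between members of different genera
        // within the same family (illustrative only, not the survey data).
        double[] betweenGenera = {
            84.1, 85.3, 86.0, 86.8, 87.5, 88.2, 88.9, 89.4, 90.0, 90.6,
            91.1, 91.7, 92.3, 92.8, 93.2, 93.7, 94.1, 94.6, 95.0, 97.2
        };
        System.out.println(percentile(betweenGenera, 95)); // prints 95.0
    }
}
```

A query whose similarity to a type strain is above this value is more similar than most between-genus pairs in the sample, which is the reasoning behind treating it as a member of the same genus.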

Licensing

The RDPquery source code is licensed under the GNU General Public License. If you use RDPquery in a published work or product, please include the citation:

Dyszynski, G. and Sheldon, W.M. RDPquery: A Java program from the Sapelo Program Microbial Observatory for automatic classification of bacterial 16S rRNA sequences based on Ribosomal Database Project taxonomy and Smith-Waterman alignment. (http://simo.marsci.uga.edu/public_db/rdp_query.htm, [version used]).

 

Downloads

RDPquery version 2.7 (October 2006):

Documentation only (Adobe PDF)
Source code and documentation (Zip archive)

 

 

 
   
 

The Sapelo Island Microbial Observatory is funded by the National Science Foundation.

This material is based upon work supported by the National Science Foundation under grant number MCB-0702125. Any opinions, findings, conclusions, or recommendations expressed in the material are those of the authors and do not necessarily reflect the views of the National Science Foundation.

 
