Sheldon, W. M., Moran, M. A. and J. T. Hollibaugh. 2002. Efforts to link
ecological metadata with bacterial gene sequences at the Sapelo Island Microbial Observatory. Proceedings of
the 6th World Multiconference on Systemics, Cybernetics, and Informatics. Information Systems Development II
7:402-407.
The existence of public databases for archiving genetic sequence data, such as GenBank and
the Ribosomal Database Project, coupled with the availability of standardized sequence alignment and
comparison tools, has led to rapid advances in the fields of bacterial genetics and systematics. Many microbial
ecologists now routinely submit gene sequences obtained from environmental isolates, clones, and bands excised
from electrophoretic gels to public sequence databases. As the amount of environmental sequence data in these
systems has increased, ecologists have begun using sequence databases for broader classes of studies, such as
biogeography and community ecology. Unfortunately, the general lack of documentation and data quality control
standards has resulted in many sequences being entered without appropriate metadata, effectively orphaning
records from their ecological context and making comparisons impossible.
To address these shortcomings of public sequence databases, an independent 16S rRNA
sequence database was recently developed at the Sapelo Island Microbial Observatory (SIMO) in Georgia, USA.
The database was created to store complete information from all SIMO research activities using a hierarchical
structure designed to reflect the actual flow of information from sample collection through final publication.
By incorporating key fields from external databases, such as GenBank, the SIMO database is able to serve both
as an independent research tool for SIMO scientists and as a reference source of SIMO data stored in other
databases.
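
As a rough illustration of the hierarchical idea described above, the following minimal sketch links a sample record to its sequences and carries a GenBank accession as a cross-reference field. It is not the actual SIMO schema: the class names, field names, and the find_context helper are all hypothetical, and the intermediate levels of the real hierarchy are not specified here.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Sequence:
    """A 16S rRNA sequence record. All field names are hypothetical."""
    sequence_id: str
    residues: str
    source_type: str  # e.g. "isolate", "clone", or "gel band"
    # Key field copied from an external database (assumed: a GenBank accession).
    genbank_accession: Optional[str] = None
    # Citations of publications that report this sequence, if any.
    publications: List[str] = field(default_factory=list)


@dataclass
class Sample:
    """Top of the hierarchy: the collection event that supplies ecological context."""
    sample_id: str
    site: str
    collected_on: str  # ISO 8601 date string
    sequences: List[Sequence] = field(default_factory=list)


def find_context(samples: List[Sample], accession: str) -> Optional[Sample]:
    """Return the sample whose sequences include the given accession, or None."""
    for sample in samples:
        for seq in sample.sequences:
            if seq.genbank_accession == accession:
                return sample
    return None
```

Under these assumptions, a sequence retrieved from an external database by accession number could be traced back to its sampling record with a call such as find_context(samples, accession), which is the kind of link that sequences entered without metadata lack.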