Biopolym. Cell. 2024; 40(3):220-220.
Chronicle and Information
Assessing microbial genome representation across various reference databases: A comprehensive evaluation
1Boldirev G., 2Sharma N., 3Munteanu V., 2Bhavatharini A., 4Koslicki D., 1Zelikovsky A., 2Mangul S.
  1. Georgia State University
    Atlanta, GA, USA, 30302
  2. University of Southern California
    3470, Trousdale Parkway, Los Angeles, CA, USA, 90089
  3. Technical University of Moldova
    168, Stefan cel Mare Blvd., Chisinau, Republic of Moldova, MD–2004
  4. Pennsylvania State University
    201, Old Main, University Park, PA, USA, 16802

Abstract

Aim. Metagenomics research can provide significant insights into the composition, diversity and functions of mixed microbial communities found in various environments. To identify bacterial species, reads from samples are mapped to reference genomes stored in bacterial reference databases. Multiple references may be assigned the same taxonomic identifier yet contain different genomic information. This project was designed to uncover and correct inconsistencies in bacterial reference databases by comparing species names and genomic representation across the three most commonly used bacterial reference databases (PATRIC, RefSeq and Ensembl). Our first study, “Improving the usability and comprehensiveness of microbial databases” [1], considered the concordance of the databases based solely on species names. We extended that research to compare not only the species names but also the bacterial genomes themselves and to estimate their similarity. Conclusions. The lack of species and genus overlap among the databases not only undermines the accuracy of metagenomic analysis but also emphasizes the critical need for a standardized integration of existing databases. Our analysis will not only enhance the identification and characterization of microbial life but also improve the comparability and rigor of metagenomic research.
Keywords: metagenomics, bacterial reference databases, taxonomic discrepancies
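
The two comparisons described in the abstract, concordance of species names and similarity of genomes, can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not state which similarity measure was used, so the Jaccard index over name sets and over k-mer sets is an assumption here, and the species lists, sequences, and helper names (`normalize`, `jaccard`, `kmers`) are hypothetical toy values chosen only for illustration.

```python
from itertools import combinations


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so trivial spelling variants match."""
    return " ".join(name.lower().split())


def jaccard(a: set, b: set) -> float:
    """Jaccard index |A & B| / |A | B|; defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def kmers(seq: str, k: int) -> set:
    """All overlapping substrings of length k in the uppercased sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


# Hypothetical toy species lists; real database exports would be far larger.
species = {
    "PATRIC": {"Escherichia coli", "Bacillus subtilis", "Vibrio cholerae"},
    "RefSeq": {"escherichia  coli", "Bacillus subtilis", "Listeria monocytogenes"},
    "Ensembl": {"Escherichia coli", "Listeria monocytogenes"},
}
normalized = {db: {normalize(n) for n in names} for db, names in species.items()}

# Name-level concordance: pairwise Jaccard and species present in every database.
pairwise = {
    (a, b): jaccard(normalized[a], normalized[b])
    for a, b in combinations(normalized, 2)
}
shared_by_all = set.intersection(*normalized.values())

# Genome-level similarity: Jaccard over k-mer sets of two toy sequences
# that would carry the same species name in different databases.
genome_similarity = jaccard(kmers("ACGTACGTGA", 4), kmers("ACGTACGTCA", 4))

for pair, score in pairwise.items():
    print(pair, round(score, 3))
print("shared by all:", sorted(shared_by_all))
print("k-mer Jaccard:", genome_similarity)
```

In this toy case the two sequences differ by a single base yet share only half of their 4-mers, which shows why references with identical names can still differ in genomic content. At genome scale, exact k-mer sets are usually replaced by a sketching technique to keep memory use manageable.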

References

[1] Loeffler C et al. Improving the usability and comprehensiveness of microbial databases [published correction appears in BMC Biol. 2020; 18(1):92]. BMC Biol. 2020; 18(1):37.