RiboDB : a prokaryotic ribosomal components DataBase
Ribosomal proteins (r-proteins) are named according to Ban N, Beckmann R, Cate JHD, et al. A new system for naming ribosomal proteins. Current Opinion in Structural Biology, 2014, vol. 24, p. 165-169 (see also the Ban Lab website).
Selection of the subsets
Retrieving ribosomal proteins: Queries use keywords to scan the "FASTA commentary lines" of the ribosomal proteins contained in the database. The structure of these lines is described below.
The most relevant searches target the following fields:
  • Genus, Species, or lineage_report (e.g. #Sodalis_praecaptivus, @Bacillaceae-Bacillus)
  • NCBI_Species_TaxID (e.g. #~1463164)
  • Genome_assembly_number (e.g. #GCF_900890425.1)
To avoid confusion among taxonomic ranks, add "-" at the end of the taxon name when querying RiboDB on lineage report information. For example, "#Listeria" retrieves both Listeria (genus) and Listeriaceae (family). To retrieve ribosomal proteins from the Listeria genus only, use "#Listeria-".
Similarly, put a "~" before the number when querying on TaxID (e.g. "#~1312852").
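The effect of the "-" and "~" delimiters can be illustrated with a small Python sketch. It assumes that a query behaves like a plain, case-sensitive substring search over each commentary line and that the leading "#" only marks the query; RiboDB's actual matching code is not described here. The function `search` is hypothetical, and the two header lines are made up for illustration (strain, assembly, contig, position and TaxID values are placeholders).

```python
# Sketch only: assumes RiboDB queries behave like case-sensitive substring
# searches on the commentary line. All field values below are placeholders.
HEADERS = [
    ">Listeria_monocytogenes|X1#C~GCF_000000001.1~NZ_XX000001.1~[100..500]"
    "~1639~11~A.V=Bacteria-Firmicutes-Bacilli-Bacillales-Listeriaceae-"
    "Listeria-Listeria_monocytogenes",
    ">Brochothrix_thermosphacta|X2#C~GCF_000000002.1~NZ_XX000002.1"
    "~C[16390..16800]~2756~11~A.V=Bacteria-Firmicutes-Bacilli-Bacillales-"
    "Listeriaceae-Brochothrix-Brochothrix_thermosphacta",
]


def search(headers, query):
    """Return the species part of every header that contains the keyword."""
    # Drop the leading query marker ("#" or "@") if there is one.
    keyword = query[1:] if query[:1] in ("#", "@") else query
    return [h[1:].split("|")[0] for h in headers if keyword in h]


print(search(HEADERS, "#Listeria"))   # both lines: "Listeriaceae" also matches
print(search(HEADERS, "#Listeria-"))  # only the line with the genus Listeria
print(search(HEADERS, "#1639"))       # both lines: a position contains "1639"
print(search(HEADERS, "#~1639"))      # only the line whose TaxID field is 1639
```

Under this assumption, the delimiter simply anchors the keyword to a field boundary, which is why it removes the unwanted hits.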
More generally, any information contained in the "FASTA commentary lines" may be queried, but free-text keywords can return unexpected or irrelevant matches.
For instance, querying the database with "#Myco" will return information on Mycobacterium, Mycolicibacterium, Mycobacteroides, Mycolicibacter, and other Mycobacteriaceae (Actinobacteria), Mycoplasma (Mycoplasmatales), Mycoplana_dimorpha (an alphaproteobacterium), and Mycoavidus_cysteinexigens (a betaproteobacterium) strains contained in RiboDB.
Similarly, "#myco" will return proteins from Corynebacterium_amycolatum, Amycolatopsis, Streptomyces_antimycoticus, and Actinoplanes_awajinensis_subsp._mycoplanecinus (Actinobacteria), Bacillus_mycoides, Bacillus_paramycoides, Bacillus_pseudomycoides, and Mycoplasma_mycoides (Firmicutes).
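Before relying on a short keyword, it can help to tally which genera it actually hits in a set of downloaded commentary lines. The sketch below assumes case-sensitive substring matching and takes the genus as the second-to-last "-"-separated element of the lineage report. The helper `genera_hit` is hypothetical, and the three header lines are abridged placeholders ("..." stands for omitted fields and ranks).

```python
from collections import Counter

# Abridged, made-up commentary lines; "..." replaces omitted fields and ranks.
HEADERS = [
    ">Mycobacterium_tuberculosis|X1#C~...=...-Mycobacteriaceae-Mycobacterium-"
    "Mycobacterium_tuberculosis",
    ">Mycoplasma_mycoides|X2#C~...=...-Mycoplasmataceae-Mycoplasma-"
    "Mycoplasma_mycoides",
    ">Bacillus_mycoides|X3#C~...=...-Bacillaceae-Bacillus-Bacillus_mycoides",
]


def genus_of(header):
    """Genus = second-to-last rank of the lineage report (text after '=')."""
    lineage = header.rpartition("=")[2]
    return lineage.split("-")[-2]


def genera_hit(headers, keyword):
    """Count, per genus, the headers that contain the keyword (case-sensitive)."""
    return Counter(genus_of(h) for h in headers if keyword in h)


print(sorted(genera_hit(HEADERS, "Myco")))  # ['Mycobacterium', 'Mycoplasma']
print(sorted(genera_hit(HEADERS, "myco")))  # ['Bacillus', 'Mycoplasma']
```

A taxon name that itself contains "-" would break this simple split, so the tally is only a quick sanity check.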
Retrieving information/statistics only: Queries may target the species name (e.g. @Acinetobacter_colistiniresistens), the strain ID (e.g. @NR1165), the genome ID (e.g. @GCF_003227755), the NCBI TaxID (e.g. "@TaxId 280145;", mind the trailing ";"), or any part of the taxonomic hierarchy (e.g. @-Gammaproteobacteria-; note that the "-" may be mandatory in some cases).
FASTA commentary lines are built as follows:
>Genus_species|strain_ID#genome_type#genome_quality~genome_assembly_number~contig_number~[position_on_the_genome]~NCBI_Species_TaxID~Genetic_code~Genome_source.Protein_evidence=lineage_report
with:
  • Genus_species: e.g. Pseudomonas_aeruginosa
  • strain_ID: e.g. PAO1
  • genome_type [#T, #R, or #E]: #T = genome tagged as type strain material in RefSeq or GenBank; #R = genome tagged as reference/representative genome in RefSeq; #E = genome listed in Ensembl Bacteria
  • genome_quality [#C, #S, #U, and #d]: #C = complete genome; #S = scaffolds; #U = unassembled; #d indicates an origin from metagenomes or another potential loss of quality in the assembly (e.g. #S#d is a scaffold-level genome from a metagenome)
  • genome_assembly_number: e.g. GCF_000006765.1
  • contig_number: e.g. NZ_002516.2
  • position: indicates the position of CDS on the contig, with "C" indicating that the CDS is encoded on the reverse strand, e.g. C[4781985..4782680]
  • NCBI_Species_TaxID: the species-level TaxID of the strain
  • Genetic_code: indicates the genetic code for the genome
  • Genome_source [~A. or ~B.] with A = genome from RefSeq, B = genome from GenBank not present in RefSeq
  • Protein_evidence [V or H] (given directly after the genome source, e.g. A.V) with V = match between RiboDB and CDS annotations as ribosomal protein, H = if the protein identified by RiboDB is annotated as ribosomal protein.
  • Lineage_report = Domain-Phylum-Class-Order-Family-Genus-Species taxonomic ranks separated by "-": e.g. Bacteria-Proteobacteria-Gammaproteobacteria-Pseudomonadales-Pseudomonadaceae-Pseudomonas-Pseudomonas_aeruginosa.
See for example:
>Methanocaldococcus_bathoardescens|JH146#R#T#E#C~GCF_000739065.1~NZ_CP009149.1~C[1571584..1571994]~1301915~11~A.V=Archaea-Euryarchaeota-Methanococci-Methanococcales-Methanocaldococcaceae-Methanocaldococcus-Methanocaldococcus_bathoardescens
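As an illustration of this layout, the following Python sketch splits a commentary line into its fields. It is not part of RiboDB: `RiboHeader` and `parse_header` are hypothetical names. The parser assumes that "|", "~" and "=" act only as field separators, that the genome type and quality tags follow the strain ID as in the example line, and that the genome source and protein evidence are joined by "." (A.V).

```python
import re
from dataclasses import dataclass
from typing import List

# Position field, e.g. "C[1571584..1571994]"; a leading "C" = reverse strand.
POSITION_RE = re.compile(r"^(C?)\[(\d+)\.\.(\d+)\]$")
TYPE_TAGS = {"T", "R", "E"}          # type strain, reference, Ensembl
QUALITY_TAGS = {"C", "S", "U", "d"}  # complete, scaffold, unassembled, degraded


@dataclass
class RiboHeader:
    species: str
    strain: str
    genome_type: List[str]
    genome_quality: List[str]
    assembly: str
    contig: str
    reverse_strand: bool
    start: int
    end: int
    taxid: int
    genetic_code: int
    source: str
    evidence: str
    lineage: List[str]


def parse_header(line):
    """Parse one commentary line; raises ValueError if the layout differs."""
    body = line.strip()
    if body.startswith(">"):
        body = body[1:]
    left, _, lineage = body.rpartition("=")
    # Split from the right so the name/strain part is kept in one piece.
    name, assembly, contig, position, taxid, code, src_ev = left.rsplit("~", 6)
    species, _, strain_and_tags = name.partition("|")
    strain, *tags = strain_and_tags.split("#")
    match = POSITION_RE.match(position)
    if match is None:
        raise ValueError("unexpected position field: " + position)
    source, _, evidence = src_ev.partition(".")
    return RiboHeader(
        species=species,
        strain=strain,
        genome_type=[t for t in tags if t in TYPE_TAGS],
        genome_quality=[t for t in tags if t in QUALITY_TAGS],
        assembly=assembly,
        contig=contig,
        reverse_strand=match.group(1) == "C",
        start=int(match.group(2)),
        end=int(match.group(3)),
        taxid=int(taxid),
        genetic_code=int(code),
        source=source,
        evidence=evidence,
        lineage=lineage.split("-"),  # naive: breaks on taxa containing "-"
    )


example = (
    ">Methanocaldococcus_bathoardescens|JH146#R#T#E#C~GCF_000739065.1"
    "~NZ_CP009149.1~C[1571584..1571994]~1301915~11~A.V="
    "Archaea-Euryarchaeota-Methanococci-Methanococcales-"
    "Methanocaldococcaceae-Methanocaldococcus-"
    "Methanocaldococcus_bathoardescens"
)
header = parse_header(example)
print(header.strain, header.genome_type, header.genome_quality, header.taxid)
```

Splitting from the right on "~" keeps the sketch tolerant of unusual characters in the species or strain part, but lines that deviate from the example layout would still need dedicated handling.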