J Mol Evol (1999) 49:551–557
© Springer-Verlag New York Inc. 1999
Functional Classes in the Three Domains of Life
M.A. Andrade,1 C. Ouzounis,1 C. Sander,1 J. Tamames,2 A. Valencia2
1 EMBL-EBI, Wellcome Trust Genome Campus, Cambridge, UK
2 Protein Design Group, CNB-CSIC, Madrid, Spain
Received: 21 July 1997 / Accepted: 5 May 1999
Abstract. The evolutionary divergence among the three major domains of life can now be addressed through the first set of complete genomes from representative species. These model species from the three domains of life, Haemophilus influenzae for Bacteria, Saccharomyces cerevisiae for Eukarya, and Methanococcus jannaschii for Archaea, provide the basis for a universal functional classification and analysis. We have chosen 13 functional classes and three superclasses (ENERGY, COMMUNICATION, and INFORMATION) as global descriptors of protein function. Compositional comparison of the three complete genomes reveals that functional classes are ubiquitous yet diverse in the three domains of life. Proteins related to ENERGY processes are generally represented in all three domains, while those related to COMMUNICATION represent the most distinctive functional feature of each single domain. Finally, functions related to INFORMATION processing (translation, transcription, and replication) show a complex behaviour. In Archaea, proteins in this superclass are related to proteins in either Eukarya or Bacteria, as recognized previously. The distribution of functional classes in the three domains accurately reflects the principal characteristics of cellular life forms.
Key words: Genome comparison — Functional classes — Haemophilus influenzae — Saccharomyces cerevisiae — Methanococcus jannaschii — Archaea
Introduction
The three domains of cellular life (Bacteria, Archaea, and Eukarya) exhibit differences that can possibly be attributed to their genome structure and composition. The availability of the first bacterial (Haemophilus influenzae) (HI) (Fleischmann et al. 1995), eukaryotic (Saccharomyces cerevisiae) (SC) (Goffeau et al. 1996), and archaeal (Methanococcus jannaschii) (MJ) (Bult et al. 1996) genomes, as well as a large number of additional eukaryotic and bacterial sequences, provides a unique opportunity to approach this question. However, a comparative analysis will be meaningful only to the extent that these three species have representative genomes for the corresponding domains. For example, the genome of Mycoplasma genitalium (Fraser et al. 1995) does not adequately represent Bacteria, given that it lacks many important functions as a consequence of its parasitic lifestyle. To minimize domain misrepresentation by the above-mentioned genomes, we compared them with the full set of sequences in the database using phylogenetic criteria, instead of restricting the analysis to species intersections. The number of sequences in the archaeal domain is still insufficient to be used as a reference set, limiting the analysis to the distribution of sequences in the eukaryotic and bacterial domains. The analysis is based on the presence or absence of key cellular functions in these model genomes. Cellular function was reduced to a comprehensive set of 13 classes of functions derived from a previously proposed scheme (Riley 1993), later applied to the HI (Fleischmann et al. 1995) and MJ (Bult et al. 1996) genomes. For the SC genome, a similar classification was provided
Correspondence to: Alfonso Valencia, CNB-CSIC, Campus U. Autonoma, Cantoblanco, Madrid 28049, Spain; e-mail: valencia@cnb.uam.es
Table 1. Percentages of the SC and HI proteins of each functional class with at least one homologue in Bacteria and Eukarya, respectively
SC
Functional class
Amino acid biosynthesis
Biosynthesis of cofactors
Central & int. metabolism
Energy metabolism
Fatty acids & phospholipids
Nucleotide biosynthesis
Transport
Energy
Replication
Transcription
Translation
Information
Cell envelope/cell wall
Cellular...