Pseudogenes

Pages: 53 (13,206 words). Published: August 1, 2012
Letter

Millions of Years of Evolution Preserved: A Comprehensive Catalog of the Processed Pseudogenes in the Human Genome
Zhaolei Zhang, Paul M. Harrison, Yin Liu, and Mark Gerstein1
Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut 06520-8114, USA

Processed pseudogenes were created by reverse transcription of mRNAs; they provide snapshots of ancient genes that existed millions of years ago in the genome. To find them in the present-day human genome, we developed a pipeline using features such as intron absence, frame disruption, polyadenylation, and truncation. This has enabled us to identify ∼8000 processed pseudogenes in recent genome drafts (distributed from http://pseudogene.org). Overall, processed pseudogenes are very similar to their closest corresponding human gene, being 94% complete in coding regions, with sequence similarity of 75% for amino acids and 86% for nucleotides. Their chromosomal distribution appears random and dispersed, with the numbers on chromosomes proportional to length, suggesting sustained "bombardment" over evolution. However, the distribution does vary with GC content: processed pseudogenes occur mostly in regions of intermediate GC content. This is similar to Alus but contrasts with functional genes and L1 repeats. Pseudogenes, moreover, have age profiles similar to Alus. The number of pseudogenes associated with a given gene follows a power-law relationship, with a few genes giving rise to many pseudogenes and most giving rise to few. The prevalence of processed pseudogenes agrees well with germ-line gene expression. Highly expressed ribosomal proteins account for ∼20% of the total. Other notables include cyclophilin-A, keratin, GAPDH, and cytochrome c.
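To make the sequence-level features named above more concrete, the following is a minimal Python sketch of how frame disruption, polyadenylation, and truncation might be checked for a single candidate sequence. It is not the authors' pipeline: the function names, the 10-base poly-A threshold, and the coverage measure are all hypothetical choices made for illustration, and a real pipeline would work from alignments against the parent gene rather than from raw strings.

```python
# Illustrative sketch only. All names and thresholds here are assumptions,
# not taken from the paper's pipeline.

STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def in_frame_stops(cds: str) -> list[int]:
    """Return 0-based codon indices of stop codons that occur before the
    final complete codon, i.e. premature in-frame stops."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    return [i for i, codon in enumerate(codons[:-1]) if codon in STOP_CODONS]


def length_suggests_frameshift(cds: str) -> bool:
    """A length that is not a multiple of three hints at an insertion or
    deletion somewhere in the sequence (a crude proxy for a frameshift)."""
    return len(cds) % 3 != 0


def has_poly_a_tail(downstream: str, min_run: int = 10) -> bool:
    """True if the 3' flanking sequence contains a run of at least
    `min_run` adenines (hypothetical threshold)."""
    return "A" * min_run in downstream.upper()


def coding_coverage(aligned_length: int, parent_cds_length: int) -> float:
    """Fraction of the parent gene's coding sequence that the candidate
    covers; values well below 1.0 indicate truncation."""
    return aligned_length / parent_cds_length
```

For example, `in_frame_stops("ATGTAAGGGTGA")` flags the second codon (`TAA`) as a premature stop, while the terminal `TGA` is ignored because it is the final codon. Intron absence is omitted here because it requires comparing the candidate with the exon structure of the parent gene.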
Pseudogenes are sequences in the genome that have close similarities to one or more paralogous functional genes, but in general are unable to be transcribed (Vanin 1985; Alberts et al. 1994; Mighell et al. 2000). The nonfunctionality of the pseudogene is often caused by the lack of functional promoters or other regulatory elements. As a result, these sequences are released from selection pressure and are free to accumulate non-gene-like features such as frame disruptions (frameshifts, in-frame stop codons, or disrupting interspersed repeats) in the original protein-coding sequence (CDS).

There are two major types of pseudogenes: duplicated (nonprocessed) and processed (retrotransposed). Duplicated pseudogenes arose from genomic DNA duplication or unequal crossing-over; hence, they have often retained the original exon–intron structures of the functional genes, although sometimes incompletely. Processed pseudogenes resulted from the process of retrotransposition, that is, the reverse transcription of an mRNA transcript followed by integration into the genomic DNA, presumably in the germ line (Maestre et al. 1995; Esnault et al. 2000; Goncalves et al. 2000). Because of their origin, processed pseudogenes are sometimes considered as a special type of retrotransposon, just like Alu or LINE (Long Interspersed Nuclear Elements), and are referred to as retropseudogenes. They are typically characterized by a complete lack of introns, the presence of small flanking direct repeats, and a polyadenine tail near the 3′ end, provided that they have not decayed.

In the last several years, many efforts have been made to systematically identify and characterize the pseudogene population in completely sequenced genomes. It has been reported that between 1000 and 2000 pseudogenes, or about one for every eight functional genes, exist in the Caenorhabditis elegans genome (Harrison et al. 2001). In the yeast Saccharomyces cerevisiae genome, a genome-wide survey has found ∼200 disabled open reading frames (Harrison et al. 2002a). Large numbers of pseudogenes also exist in some bacterial genomes (Cole et al. 2001; Homma et
1 Corresponding author. E-MAIL Mark.Gerstein@yale.edu; FAX (360) 838-7861. Article and publication are at...