Alignment – the process of lining up two or more DNA or protein sequences so as to maximize the number of identical nucleotides or residues while minimizing the number of mismatches and gaps.
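As a minimal illustration of the quantities an alignment trades off, the sketch below counts identities, mismatches, and gaps in two sequences that have already been aligned. The function name `alignment_stats` and the use of `-` as the gap character are assumptions made for this example; it does not compute an alignment itself.

```python
def alignment_stats(seq_a: str, seq_b: str, gap: str = "-") -> tuple[int, int, int]:
    """Count (matches, mismatches, gaps) in two pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    matches = mismatches = gaps = 0
    for x, y in zip(seq_a, seq_b):
        if x == gap or y == gap:
            gaps += 1          # a gap in either sequence
        elif x == y:
            matches += 1       # identical nucleotide or residue
        else:
            mismatches += 1    # substitution
    return matches, mismatches, gaps


print(alignment_stats("GATTACA", "GA-TCCA"))
```

An alignment program searches for the arrangement of gaps that makes the first number as large as possible while keeping the other two small.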
Annotation – in the most general sense, linking information from the literature to database entries for genes or proteins. In the context of genome sequencing, annotation refers to the identification of putative genes using a combination of ab initio methods, homology searches and physical evidence.
BLAST – Basic Local Alignment Search Tool – A program that compares a
query sequence (input) to all the sequences in a database that you choose.
The program aligns the most similar segments between sequences. For proteins,
BLAST scores alignments using a substitution matrix such as BLOSUM (see entry
below). This scoring method penalizes gaps and gives the highest scores to
identical residues. Substitutions are scored based on how conservative the
changes are. The output is a list of sequences, with the highest scoring
sequence at the top. The significance of each hit is reported as an E-value.
The lower the E-value, the more significant the match. Hits with E-values in
the range of 10^-100 to 10^-50 are very similar (or even identical) sequences.
Hits with E-values of 10^-10 and higher need to be examined by other methods
to determine homology. The E-value is the number of hits of that quality
expected by chance alone, so an E-value of 10^-10 can be interpreted as
roughly “a 1 in 10^10 chance that the sequence was pulled from the database
by chance alone (has no homology to the query sequence).”
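The link between an E-value and a chance probability can be sketched as follows. The sketch assumes that the number of chance hits follows a Poisson distribution, which gives P = 1 - exp(-E) for the probability of at least one chance hit; the function name is hypothetical.

```python
import math


def prob_at_least_one_chance_hit(e_value: float) -> float:
    """Convert an E-value to P(at least one chance hit), assuming Poisson statistics."""
    # -expm1(-E) equals 1 - exp(-E) but stays accurate for very small E
    return -math.expm1(-e_value)


for e in (1e-100, 1e-10, 1.0, 10.0):
    print(e, prob_at_least_one_chance_hit(e))
```

For small E-values the probability and the E-value are nearly identical, which is why the "1 in 10^10" reading works; for E-values near or above 1 the two diverge.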
BLOSUM – Blocks Substitution Matrix – A type of substitution matrix that is
used by programs like BLAST to give sequences a score based on similarity to
another sequence. The matrix assigns higher scores to conservative
substitutions of amino acids than to non-conservative ones. A conservative
substitution is a substitution of an amino acid similar in size and chemical
properties to the amino acid in the query sequence.
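How a substitution matrix turns residue pairs into a score can be sketched with a tiny lookup table. The values in `TOY_SCORES` are illustrative numbers chosen for this example, not an actual BLOSUM table, and the function names are hypothetical. Gaps are not handled.

```python
# Toy substitution scores (illustrative only, NOT a real BLOSUM matrix).
TOY_SCORES = {
    ("L", "L"): 4, ("I", "I"): 4, ("D", "D"): 6,  # identities score highest
    ("L", "I"): 2,    # conservative: both hydrophobic, similar size
    ("L", "D"): -4,   # non-conservative: hydrophobic vs. acidic
    ("I", "D"): -3,
}


def pair_score(a: str, b: str) -> int:
    """Look up a residue pair; the table is treated as symmetric."""
    if (a, b) in TOY_SCORES:
        return TOY_SCORES[(a, b)]
    return TOY_SCORES[(b, a)]


def ungapped_score(query: str, subject: str) -> int:
    """Sum pair scores over two equal-length, ungapped sequences."""
    if len(query) != len(subject):
        raise ValueError("sequences must have the same length")
    return sum(pair_score(q, s) for q, s in zip(query, subject))


print(ungapped_score("LID", "LID"))  # identical
print(ungapped_score("LID", "ILD"))  # conservative substitutions
print(ungapped_score("LID", "DID"))  # one non-conservative substitution
```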
Bioinformatics – Bioinformatics is a field of study that merges math, biology,
and computer science. Researchers in this field have developed a wide range of
tools to help biomedical researchers work with genomic, biochemical, and
medical information. Some types of bioinformatics tools include database
storage and search programs as well as software programs for analyzing
genomic and proteomic data.
CDS – the abbreviation for a coding sequence. CDS is not synonymous with exon, since exons may contain noncoding sequence.
ClustalW – A program for making multiple sequence alignments.
Clusters of orthologous genes (COGs) – sets of genes from a collection of species that are proposed to encode the same gene product, based on pairwise best-match sequence similarity. For eukaryotes, the acronym KOGs is sometimes used.
Consensus sequence – a hypothetical sequence consisting of the most common nucleotide or amino acid at each position in a multiple alignment of DNA or protein sequences.
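A consensus can be derived column by column from an alignment, as in the minimal sketch below. The function name `consensus` is an assumption for this example, and ties between equally common symbols are resolved by whichever appears first in the column, which is an arbitrary choice.

```python
from collections import Counter


def consensus(aligned: list[str]) -> str:
    """Return the most common symbol at each column of an alignment."""
    if not aligned:
        raise ValueError("alignment is empty")
    if any(len(seq) != len(aligned[0]) for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    # zip(*aligned) yields one tuple of symbols per alignment column
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*aligned))


print(consensus(["ACGT", "ACGA", "TCGT"]))
```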
Conserved – when talking about a position in a multiple sequence alignment,
“conserved” means the amino acid residues at that position are identical
throughout the alignment.
Conservative residue change – when talking about a position in a multiple
sequence alignment, a “conservative change” is a change to an amino acid
residue with similar size and chemical properties.
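The distinction between a conserved position and a conservative change can be sketched by classifying each alignment column. The residue groupings in `SIMILAR_GROUPS` are one hypothetical grouping by chemical properties chosen for this example; other groupings are in use, and the function names are also assumptions.

```python
# Hypothetical similarity groups (one of several possible groupings).
SIMILAR_GROUPS = [
    set("ILVM"),  # aliphatic / hydrophobic
    set("FYW"),   # aromatic
    set("KRH"),   # basic
    set("DE"),    # acidic
    set("ST"),    # small hydroxyl
    set("NQ"),    # amide
]


def classify_column(residues: str) -> str:
    """Label one alignment column as conserved, conservative, or variable."""
    unique = set(residues)
    if len(unique) == 1:
        return "conserved"        # identical throughout the alignment
    if any(unique <= group for group in SIMILAR_GROUPS):
        return "conservative"     # all residues fall in one similarity group
    return "variable"


def classify_alignment(aligned: list[str]) -> list[str]:
    """Classify every column of an equal-length protein alignment."""
    return [classify_column("".join(col)) for col in zip(*aligned)]


print(classify_alignment(["MKLD", "MRIE", "MKVA"]))
```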
Contig – a set of sequence fragments that have been ordered into a contiguous, linear stretch on the basis of sequence overlaps at the fragment ends. An ordered set of contigs constitutes the “scaffold” of a whole-genome sequence.
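Joining fragments by the overlap at their ends can be sketched as below. This is a toy exact-match merge of two fragments under an assumed minimum overlap of 3 bases; real assemblers tolerate sequencing errors and handle many fragments at once. The function names and the `min_overlap` parameter are hypothetical.

```python
from typing import Optional


def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Length of the longest suffix of `left` that is also a prefix of `right`."""
    for n in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:n]):
            return n
    return 0


def merge_fragments(left: str, right: str, min_overlap: int = 3) -> Optional[str]:
    """Merge two fragments into one contiguous stretch, or return None if they do not overlap."""
    n = overlap_length(left, right, min_overlap)
    if n == 0:
        return None
    return left + right[n:]  # keep the overlapping bases only once


print(merge_fragments("ATGCGTAC", "GTACCTGA"))
```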
DeepView/Swiss-Pdb Viewer – a program for viewing 3-D structures. It loads
“.pdb” files, which contain the 3-D coordinates for molecular structures.
Swiss-Pdb Viewer is free and easy to download on any computer (Mac or PC) and
can be used no matter what browser you are using. It is fairly easy to learn to
use at the basic level; however, it also has very advanced...