Bioinformatics in the Classroom

Characteristics of Sequences

Characteristics of Sequences - Random vs. Biological Sequences
Concepts: Nucleotide and amino acid sequences from living organisms are not random. DNA and proteins carry information. Specific functions are correlated with specific sequence patterns.

Genes consist of sequences of nucleotides. These nucleotides, read as a triplet code, determine the order of the individual amino acids in peptides. The sequence of amino acids within a peptide determines the characteristics and, thus, the function of the resulting protein. Genes and proteins are tailored to serve specific functions; they have undergone mutation and selection over long time spans. Thus, their elements, nucleotides and amino acids, are not lined up randomly but in very specific patterns that developed during evolution.
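
As a small illustration of the triplet code, the following Python sketch reads a DNA string three nucleotides at a time and looks up each codon. The table is deliberately partial (only five codons of the standard genetic code), and the function name `translate` and the example sequence are made up for this illustration; they are not part of the worksheets.

```python
# Partial codon table taken from the standard genetic code.
# Only the codons used in the example below are listed; "*" marks a stop codon.
CODON_TABLE = {
    "ATG": "M",  # methionine (start)
    "GCC": "A",  # alanine
    "AAA": "K",  # lysine
    "TGG": "W",  # tryptophan
    "TAA": "*",  # stop
}


def translate(dna):
    """Translate a DNA coding sequence codon by codon until a stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        # Codons missing from this partial table are reported as "X".
        amino_acid = CODON_TABLE.get(codon, "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical 15-nucleotide example: ATG GCC AAA TGG TAA
print(translate("ATGGCCAAATGGTAA"))
```

Changing a single nucleotide changes the codon that is looked up, which is one way to see why the order of nucleotides matters for the resulting peptide.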

The exercises in worksheets 1-5 will help you to better understand the difference between biological and random sequences. They will also prepare you to understand how bioinformatics tools work and to answer questions like these:

  • How often would you expect a given sequence of 16 nucleotides (16-mer) to occur in the human genome by chance? How often a given sequence of 300 nucleotides?


  • Alu elements are nucleotide stretches of ca. 350 bp that occur repetitively in the human genome and make up about 10% of the entire human DNA sequence. How many copies of Alu does this amount to? Do you think this high number is due to chance? Why or why not? What kind of information could be contained in the occurrence of so many Alu elements?
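
The arithmetic behind both questions can be sketched in a few lines of Python. This is a rough estimate under stated assumptions, not part of the worksheets: the genome length of 3.2 billion bp is an assumed round figure, the random model treats all four nucleotides as equally likely and independent, only one strand is counted, and the function names are invented for this example.

```python
# Assumed haploid human genome length in base pairs (a round figure).
GENOME_SIZE = 3_200_000_000


def expected_occurrences(k, genome_size=GENOME_SIZE):
    """Expected count of one specific k-mer in a random sequence.

    Assumes equal, independent nucleotide frequencies (probability 1/4 each)
    and counts start positions on a single strand.
    """
    positions = genome_size - k + 1
    return positions / 4 ** k


def alu_copy_estimate(fraction=0.10, element_length=350,
                      genome_size=GENOME_SIZE):
    """Estimate the copy number from the genome fraction and element length."""
    return fraction * genome_size / element_length


print(f"16-mer:  {expected_occurrences(16):.3f} expected occurrences")
print(f"300-mer: {expected_occurrences(300):.3e} expected occurrences")
print(f"Alu:     about {alu_copy_estimate():,.0f} copies")
```

Under these assumptions, a specific 16-mer is expected less than once in a random genome of this size, and a specific 300-mer essentially never, while the figures given for Alu correspond to roughly 900,000 copies. Comparing these numbers is the starting point for the discussion questions above.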
Identifying Genes in Nucleotide Sequences - Exercises: Work through worksheets 1-5