A DNA sequence or genetic sequence is a succession of letters representing the primary structure of a real or hypothetical DNA molecule or strand, with the capacity to carry information as described by the central dogma of molecular biology.
The possible letters are A, C, G, and T, representing the four nucleotide bases of a DNA strand — adenine, cytosine, guanine, thymine — covalently linked to a phosphodiester backbone. In the typical case, the sequences are printed abutting one another without gaps, as in the sequence AAAGTCTGAC, read left to right in the 5' to 3' direction. Short sequences of nucleotides are referred to as oligonucleotides and are used in a range of laboratory applications in molecular biology. With regard to biological function, a DNA sequence may be considered sense or antisense, and either coding or noncoding. DNA sequences can also contain "junk DNA."
Sequences can be derived from the biological raw material through a process called DNA sequencing.
In some special cases, letters besides A, T, C, and G are present in a sequence. These letters represent ambiguity. Of all the molecules sampled, there is more than one kind of nucleotide at that position. The rules of the International Union of Pure and Applied Chemistry (IUPAC) are as follows:[1]
|
Plural |
DNA sequence (plural DNA sequences)
A DNA sequence or genetic sequence is a succession of letters representing the primary structure of a real or hypothetical DNA molecule or strand, with the capacity to carry information.
The possible letters are A, C, G, and T, representing the four nucleotide subunits of a DNA strand - adenine, cytosine, guanine, thymine bases covalently linked to phospho-backbone. In the typical case, the sequences are printed abutting one another without gaps, as in the sequence AAAGTCTGAC, going from 5' to 3' from left to right. A succession of any number of nucleotides greater than four is liable to be called a sequence. With regard to its biological function, which may depend on context, a sequence may be sense or anti-sense, and either coding or noncoding. DNA sequences can also contain "junk DNA."
Sequences can be derived from the biological raw material through a process called DNA sequencing.
In some special cases, letters besides A, T, C, and G are present in a sequence. These letters represent ambiguity. Of all the molecules sampled, there is more than one kind of nucleotide at that position. The rules of the International Union of Pure and Applied Chemistry (IUPAC) are as follows:
A = adenine
C = cytosine
G = guanine
T = thymine
R = G A (purine)
Y = T C (pyrimidine)
K = G T (keto)
M = A C (amino)
S = G C (strong bonds)
W = A T (weak bonds)
B = G T C (all but A)
D = G A T (all but C)
H = A C T (all but G)
V = G C A (all but T)
N = A G C T (any)
| This page uses content from the English language Wikipedia. The original content was at DNA sequence. The list of authors can be seen in the page history. As with this Familypedia wiki, the content of Wikipedia is available under the Creative Commons License. |
|
|