The table below gives the encoding for the four bases (A, C, T, G) and for ambiguous positions in a DNA sequence.
This one-letter code is commonly used in FASTA files and other DNA file formats.
The Etymology column gives a mnemonic to help you memorize each code.
Code | Meaning | Etymology | Complement | Opposite |
---|---|---|---|---|
A | A | Adenosine | T | B |
T/U | T | Thymidine/Uridine | A | V |
G | G | Guanine | C | H |
C | C | Cytidine | G | D |
K | G or T | Keto | M | M |
M | A or C | Amino | K | K |
R | A or G | Purine | Y | Y |
Y | C or T | Pyrimidine | R | R |
S | C or G | Strong | S | W |
W | A or T | Weak | W | S |
B | C or G or T | not A (B comes after A) | V | A |
V | A or C or G | not T/U (V comes after U) | B | T/U |
H | A or C or T | not G (H comes after G) | D | G |
D | A or G or T | not C (D comes after C) | H | C |
X/N | G or A or T or C | any | N | . |
. | not G or A or T or C | . | N | |
- | gap of indeterminate length | | | |
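
To illustrate how the table might be used in practice, here is a minimal Python sketch that encodes the Meaning and Complement columns as dictionaries. The names `IUPAC_BASES`, `COMPLEMENT`, `expand`, `matches`, and `reverse_complement` are hypothetical and not part of any library. Two assumptions are made here: U is treated like T and X like N, and characters not listed in the dictionaries (such as the gap symbols `.` and `-`) are passed through unchanged.

```python
# Bases represented by each code (the "Meaning" column).
# Assumption: U is handled like T, and X like N.
IUPAC_BASES = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "K": "GT", "M": "AC", "R": "AG", "Y": "CT",
    "S": "CG", "W": "AT",
    "B": "CGT", "V": "ACG", "H": "ACT", "D": "AGT",
    "N": "ACGT", "X": "ACGT",
}

# Complementary code for each code (the "Complement" column).
COMPLEMENT = {
    "A": "T", "T": "A", "U": "A", "G": "C", "C": "G",
    "K": "M", "M": "K", "R": "Y", "Y": "R",
    "S": "S", "W": "W",
    "B": "V", "V": "B", "H": "D", "D": "H",
    "N": "N", "X": "N",
}


def expand(code):
    """Return the set of bases that a one-letter code can stand for."""
    return set(IUPAC_BASES[code.upper()])


def matches(code, base):
    """Return True if the concrete base is allowed by the (ambiguous) code."""
    return base.upper() in IUPAC_BASES[code.upper()]


def reverse_complement(seq):
    """Reverse the sequence and complement every code.

    Assumption: characters without an entry in COMPLEMENT
    (for example the gap symbols '.' and '-') are kept as they are.
    """
    return "".join(COMPLEMENT.get(c, c) for c in reversed(seq.upper()))
```

For example, `expand("B")` yields the set of C, G, and T, `matches("R", "G")` is true because R stands for A or G, and `reverse_complement("ATGCRYN")` returns `"NRYGCAT"`.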