The superclass for a number of pattern classes that are related with mapping DNA data to aminoacid names and numbers.
The idea of DNA as a "program code" is as popular as misleading1 . So while this mechanism is the same in principle, the real significance of a structure in the DNA varies entirely with the living context in which the translation happens. Sonification is not a bad analogy: while certainly, the data has influence on the sound, also here, its outcome depends on many other factors.
The related pattern classes model the simplest part of this translation and transcription process. The combinations of the DNA specific nucleic acids, guanine (G), adenine (A), thymine (T), or cytosine (C) thus serve as a code for a larger set of corresponding amino acids.
The class ProteinBioSynthesis also allows to open a textfile containing genetic information and extract it. In real life, the startcodons and stopcodons distinguish the part of the code representing a protein and the one that remains unused (Ptranscribe).
For examples, see the subclasses, in particular AminoacidPattern.
Return a new pattern from a genome
genome |
The genome data, a string containing the characters representing the four nucleic acids: guanine (G), adenine (A), thymine (T), or cytosine (C). |
The mapping of the 64 tripletIDs to the 21 aminoacids, represented as integer numbers (see below). The genetic code is called 'degenerated', because it is partly redundant. By default, this is the standard translation table for most higher forms of life, the interpretation varies in different organisms.
Internally used to map from triplets (all combinations of the bases) to tripletIDs.
A string of four characters representing the bases.
An IdentityDictionary mapping aminoacid ids to aminoacid names:
-1 | stop, // stopcodon |
0 | methionin // startcodon |
1 | phenylalanin |
2 | leucin |
3 | serin |
4 | tyrosin |
5 | cystein |
6 | tryptophan |
7 | prolin |
8 | histidin |
9 | glutamin |
10 | arginin |
11 | isoleucin |
12 | threonin |
13 | asparagin |
14 | lysin |
15 | valin |
16 | alanin |
17 | asparaginacid |
18 | glutaminacid |
19 | glycin |
The triplet that marks the beginning of a valid sequence. This is only used by Ptranscribe
The triplet that marks the end of a valid sequence. This is only used by Ptranscribe
Translate nucleic acid triplets to aminoacid id. This is what in protein bio synthesis is called "translation".
triplet |
A string of three characters from the nucleic acids. |
Translate from aminoacid id to nucleic acid triplets.
aminoacidIndex |
An integer index. |
The nucleic acids represented as a String.
If set to true, print information to the post window.
Transcription is the process that produces the so called codons, the aminoacids from nucleic acid triplets.
Load a genome data file (see example file)
path |
The file path. |
finishFunc |
A function to be called when the reading has finished. |