
ProteinBioSynthesis
Extension

Pattern class for the sonification of DNA information in terms of the amino acids it encodes in a living cell.

Description

The superclass for a number of pattern classes concerned with mapping DNA data to aminoacid names and numbers.

NOTE: In a living cell, the DNA is transcribed into a complementary messenger RNA (mRNA). The ribosomes then translate the mRNA into proteins, arranging chains of amino acids by reading base triplets (codons). This process, called protein biosynthesis, is the same in all living beings down to bacteria, and even partly in viruses.

The idea of DNA as "program code" is as popular as it is misleading [1]. While the mechanism is the same in principle, the real significance of a structure in the DNA varies entirely with the living context in which the translation happens. Sonification is not a bad analogy: the data certainly influences the sound, but here too the outcome depends on many other factors.

The related pattern classes model the simplest part of this transcription and translation process. Combinations of the four DNA bases, guanine (G), adenine (A), thymine (T), and cytosine (C), serve as a code for a larger set of corresponding amino acids.

The class ProteinBioSynthesis can also open a text file containing genetic information and extract it. In real life, start codons and stop codons distinguish the part of the code that represents a protein from the part that remains unused (see Ptranscribe).
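
The role of start and stop codons can be illustrated outside SuperCollider. The following Python sketch is not the implementation used by ProteinBioSynthesis or Ptranscribe. The function name coding_region is hypothetical, and the start codon ATG and the stop codons TAA, TAG and TGA are assumed from the standard genetic code.

```python
START_CODON = "ATG"                  # assumption: standard start codon
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: standard stop codons


def coding_region(genome):
    """Return the triplets from the first start codon up to,
    but not including, the next stop codon in the same reading frame."""
    start = genome.find(START_CODON)
    if start < 0:
        return []  # no start codon: nothing is read
    codons = []
    # Step through the genome three bases at a time.
    for i in range(start, len(genome) - 2, 3):
        triplet = genome[i:i + 3]
        if triplet in STOP_CODONS:
            break
        codons.append(triplet)
    return codons


print(coding_region("CCATGTTTGGCTAAGG"))  # ['ATG', 'TTT', 'GGC']
```

Everything before the start codon and after the stop codon is ignored, which corresponds to the "unused" part of the data described above.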

For examples, see the subclasses, in particular AminoacidPattern.

Class Methods

.new

Return a new pattern from a genome.

Arguments:

genome

The genome data, a string of characters representing the four bases: guanine (G), adenine (A), thymine (T), and cytosine (C).

.transcriptionDict

The mapping of the 64 tripletIDs to the 21 aminoacid ids (20 aminoacids plus the stop signal), represented as integers (see .aminoacids below). The genetic code is called 'degenerate' because it is partly redundant: several triplets encode the same aminoacid. By default, this is the standard translation table for most higher forms of life; the interpretation varies between organisms.
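
The redundancy of the code can be made concrete with a small Python sketch. It is an illustration only and does not reproduce the internal layout of transcriptionDict. The base ordering TCAG and the names CODE, LETTER_TO_ID, table and redundancy are assumptions. The integer ids follow the list given under .aminoacids.

```python
from itertools import product

BASES = "TCAG"  # assumption: conventional ordering of the standard code table
# One-letter amino acid codes for all 64 triplets in TCAG order ('*' = stop).
CODE = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Integer ids as listed under .aminoacids (-1 = stop, 0 = methionin, ...).
LETTER_TO_ID = {letter: i - 1 for i, letter in enumerate("*MFLSYCWPHQRITNKVADEG")}

# All 64 triplets, paired with their amino acid id.
triplets = ["".join(p) for p in product(BASES, repeat=3)]
table = {t: LETTER_TO_ID[c] for t, c in zip(triplets, CODE)}

# Count how many triplets encode each id: the redundancy of the code.
redundancy = {}
for amino_id in table.values():
    redundancy[amino_id] = redundancy.get(amino_id, 0) + 1

print(len(table), len(redundancy))  # 64 21
print(table["ATG"], table["TAA"])   # 0 -1
print(redundancy[2])                # 6 (six triplets encode leucin)
```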

.tripletDict

Internally used to map from triplets (all combinations of the bases) to tripletIDs.
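
One simple way to number all triplets is to read each one as a three-digit number in base 4. The Python sketch below shows this idea. The actual numbering used by tripletDict is not documented here, so the base ordering GATC and the function name triplet_id are assumptions.

```python
from itertools import product

BASES = "GATC"  # assumption: any fixed ordering of the four bases would do


def triplet_id(triplet):
    """Read the triplet as a three-digit number in base 4."""
    value = 0
    for base in triplet:
        value = value * 4 + BASES.index(base)
    return value


# Map every possible triplet to its integer id.
triplet_ids = {"".join(p): triplet_id("".join(p)) for p in product(BASES, repeat=3)}

print(len(triplet_ids))                        # 64
print(triplet_ids["GGG"], triplet_ids["CCC"])  # 0 63
```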

.bases

A string of four characters representing the bases.

.aminoacids

An IdentityDictionary mapping aminoacid ids to aminoacid names:

-1    stop             // stopcodon
0     methionin        // startcodon
1     phenylalanin
2     leucin
3     serin
4     tyrosin
5     cystein
6     tryptophan
7     prolin
8     histidin
9     glutamin
10    arginin
11    isoleucin
12    threonin
13    asparagin
14    lysin
15    valin
16    alanin
17    asparaginacid
18    glutaminacid
19    glycin

.startcodon

The triplet that marks the beginning of a valid sequence. This is only used by Ptranscribe.

.stopcodon

The triplet that marks the end of a valid sequence. This is only used by Ptranscribe.

Instance Methods

.translate

Translate a nucleic acid triplet to an aminoacid id. This is what is called "translation" in protein biosynthesis.

Arguments:

triplet

A string of three characters, each representing one of the four bases.

.translateBack

Translate from an aminoacid id back to nucleic acid triplets.

Arguments:

aminoacidIndex

An integer index.
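
Because several triplets can encode the same aminoacid, the reverse direction is one-to-many. The Python sketch below lists every candidate triplet for a given id. Whether translateBack returns one triplet or all of them is not stated here, so this only illustrates the ambiguity. The function name triplets_for and the base ordering TCAG are assumptions.

```python
from itertools import product

BASES = "TCAG"  # assumption: conventional ordering of the standard code table
# One-letter amino acid codes for all 64 triplets in TCAG order ('*' = stop).
CODE = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Integer ids as listed under .aminoacids (-1 = stop, 0 = methionin, ...).
LETTER_TO_ID = {letter: i - 1 for i, letter in enumerate("*MFLSYCWPHQRITNKVADEG")}

triplets = ["".join(p) for p in product(BASES, repeat=3)]


def triplets_for(amino_id):
    """Return every triplet that encodes the given aminoacid id."""
    return [t for t, c in zip(triplets, CODE) if LETTER_TO_ID[c] == amino_id]


print(triplets_for(0))   # ['ATG']
print(triplets_for(-1))  # ['TAA', 'TAG', 'TGA']
print(len(triplets_for(10)))  # 6 (arginin)
```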

.genome

The nucleic acids represented as a String.

.verbose

If set to true, print information to the post window.

.transcribe

Transcription is the process that produces the so-called codons; from these nucleic acid triplets, the aminoacids are derived.

.loadData

Load a genome data file (see the example file).

Arguments:

path

The file path.

finishFunc

A function to be called when the reading has finished.

[1] This was the inspiration for a radio broadcast on this topic, featuring the most varied sonifications of the same DNA, done with the present implementation.