ContextSnake is a Markov chain/generative grammar approach with self-adjusting context depth.
Based on a corpus of examples, a ContextSnake creates a stream of values. For each next value, the number of previous values taken into account defines the context depth; the longer the context considered, the less likely it is to occur in the corpus in its full length. The snake extends the context until it becomes unique, i.e. until it occurs at only a single location in the corpus; then it reduces the context depth again so that more than one option becomes available, and picks the next item from these options. (Like a gourmet, it insists on choices.)
Two parameters influence these choices further: minDepth is the minimum context depth below which single choices are accepted, and acceptSingleCond is a flag, or a function returning a boolean, that decides whether to accept single choices even above the minimum depth.
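The selection rule described above can be sketched in a few lines. This is a minimal Python illustration, not the SuperCollider implementation: the function names (`followers`, `next_value`, `generate`) are hypothetical, the corpus is assumed to be one flat list, "more than one option" is assumed to mean more than one matching location, and single choices are assumed to be accepted at or below `min_depth`.

```python
import random

def followers(corpus, context):
    """Return the value found after every occurrence of `context` in `corpus`."""
    n = len(context)
    return [corpus[i + n] for i in range(len(corpus) - n)
            if corpus[i:i + n] == context]

def next_value(corpus, history, min_depth=1, accept_single=False, rng=random):
    """Pick the next value from the deepest context that still leaves a choice.

    Assumption: walking the depth downwards from the longest possible context
    is equivalent to extending the context until it is unique and then
    shortening it again.
    """
    max_depth = min(len(history), len(corpus) - 1)
    for depth in range(max_depth, 0, -1):
        context = history[len(history) - depth:]
        options = followers(corpus, context)
        if len(options) > 1:
            # Several locations match: a real choice is available.
            return rng.choice(options)
        if len(options) == 1:
            # Only one location matches: accept it at or below min_depth,
            # or when the flag / test function says so.
            ok = accept_single(context) if callable(accept_single) else accept_single
            if depth <= min_depth or ok:
                return options[0]
    # Depth 0: no usable context, so any corpus element may follow.
    return rng.choice(corpus)

def generate(corpus, starter, count, **kwargs):
    """Extend `starter` by `count` values drawn with next_value."""
    corpus = list(corpus)
    out = list(starter)
    for _ in range(count):
        out.append(next_value(corpus, out, **kwargs))
    return out
```

With the corpus `list("abcxbd")` and the history `list("ab")`, the context "ab" occurs only once, so with `accept_single=True` the single follower "c" is taken; with `accept_single=False` the context shrinks to "b", which leaves "c" and "d" to choose from.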
Concept by Gerhard Nierhaus, implementation Alberto de Campo. Comments and suggestions welcome!
Fairytales and poetry can make good material; for example, we experimented with Schneewittchen (Snow White), Erlkoenig, and others (see ContextSnake_3texts.html). Formulaic segments can become points of 'modulation' between sections.
First code examples:
corpus | the corpus from which to generate new streams
starter | an optional start sequence (usually the beginning of a corpus specimen)
minDepth | the minimum context depth below which single choices are also accepted
acceptSingleCond | a flag or test function that decides whether to accept a next value when there is only a single option
starterLength | optional; used when no starter list is given
the corpus used
the list to use as starter
the minimum depth
the flag or test func whether to accept a single option
flag whether to post info or not
the starterLength to use when no starter list is given
sublist | find all locations of a given sublist in the corpus
generate a random starter list
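As an illustration of these two helpers, here is a minimal Python sketch. The names `find_locations` and `random_starter` are hypothetical, the corpus is assumed to be one flat list, and the random starter is assumed to be a contiguous slice of the corpus; the actual class may work differently.

```python
import random

def find_locations(corpus, sublist):
    """Return every start index at which `sublist` occurs in `corpus`."""
    n = len(sublist)
    return [i for i in range(len(corpus) - n + 1)
            if corpus[i:i + n] == sublist]

def random_starter(corpus, starter_length, rng=random):
    """Return a random contiguous slice of the corpus with `starter_length` items."""
    start = rng.randrange(len(corpus) - starter_length + 1)
    return corpus[start:start + starter_length]
```

Because the starter is cut from the corpus itself, its context is guaranteed to occur at least once when generation begins.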
inval | method to yield values from the pattern; see Pattern
ContextSnakes can analyse a sample for different criteria, for example, what the longest sequences are that occur verbatim in the corpus. If the found sequences overlap, then the sample is a possible production from the given corpus. Other methods test whether the sample is valid output of the snake, and whether it is new or not.
sample, snippets | test whether a sample is possible output of the ContextSnake. If you already have the snippets, you can pass them in for efficiency. This also posts the amounts of overlap, in effect telling you the likely minimum context depth.
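The overlap criterion can be sketched as follows. This is a hedged Python illustration under stated assumptions, not the class's own code: `longest_snippets`, `overlaps` and `is_possible` are hypothetical names, the corpus is one flat list, and snippets are represented as (start, end) index pairs into the sample with an exclusive end.

```python
def occurs(corpus, sub):
    """True if `sub` appears as a contiguous run in `corpus`."""
    n = len(sub)
    return any(corpus[i:i + n] == sub for i in range(len(corpus) - n + 1))

def longest_snippets(corpus, sample):
    """Return (start, end) pairs of maximal sample slices found verbatim in the corpus."""
    snippets = []
    for start in range(len(sample)):
        end = start
        while end < len(sample) and occurs(corpus, sample[start:end + 1]):
            end += 1
        # Skip slices that are empty or already contained in the previous snippet.
        if end > start and (not snippets or end > snippets[-1][1]):
            snippets.append((start, end))
    return snippets

def overlaps(snippets):
    """Number of shared items between each pair of neighbouring snippets."""
    return [a_end - b_start
            for (_, a_end), (b_start, _) in zip(snippets, snippets[1:])]

def is_possible(corpus, sample, snippets=None):
    """True if the snippets cover the whole sample and each neighbouring pair overlaps."""
    if snippets is None:
        snippets = longest_snippets(corpus, sample)
    if not snippets or snippets[0][0] != 0 or snippets[-1][1] != len(sample):
        return False
    return all(o > 0 for o in overlaps(snippets))
```

For the corpus `list("abcabd")`, the sample `list("cabcab")` splits into "cab" and "abcab", which share two items, so it counts as a possible production; the smallest overlap found hints at the minimum context depth that was in effect. The sample `list("abdabc")` splits into "abd" and "abc" with no overlap and is rejected.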
all the elements occurring in the corpus.
sample | finds the indices of the longest snippets that come from one continuous source section
sample | test whether a sample occurs in the corpus or is a new production of the pattern. (Sometimes the snake may regenerate long parts of the corpus, which should be identified as not new.)
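The newness test amounts to a verbatim search. A minimal Python sketch, assuming a flat list corpus and using the hypothetical name `is_new`:

```python
def is_new(corpus, sample):
    """True if `sample` does not occur verbatim as a contiguous run in `corpus`."""
    n = len(sample)
    return not any(corpus[i:i + n] == sample
                   for i in range(len(corpus) - n + 1))
```

A sample that merely replays a stretch of the corpus, such as `list("bcab")` against `list("abcabd")`, is reported as not new, while `list("cabc")` is new even though every shorter part of it occurs in the corpus.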
For a musical example with Palestrina melodies, see