@hackage simseq0.0

Simulate sequencing with different models for priming and errors

Simseq - SIMulate SEQuences. Yep, that's real creative.


Generates a bunch of sequences from a set of reference sequences. For ESTs, NCBI's RefSeq transcripts are probably a good choice.

The sequences are generated using a model that specifies priming conditions and error generation.

Currently, this is not very refined. You can try:

simseq --model=sanger:n,d reference.fasta

Here n is the number of sequences to generate, with starting points drawn from a uniform distribution, and d is the probability that a sequence is read in the forward direction. Or, even more experimentally:

simseq --model=454:n,d

This implements a completely unfounded and baseless model of 454/Roche pyrosequencing. (Okay, it is actually based on a paper by Margulies et al., but more data is definitely required.)
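
To make the shape of the --model argument concrete, here is a small Python sketch that splits a string such as "sanger:n,d" into its parts. It is an illustration only, not simseq's actual (Haskell) option parser: the names ModelSpec and parse_model, the validation rules, and the example values 1000 and 0.5 are all assumptions made for this sketch.

```python
from typing import NamedTuple


class ModelSpec(NamedTuple):
    """Hypothetical container for a parsed --model value."""
    name: str       # model name, e.g. "sanger" or "454"
    count: int      # n: number of sequences to generate
    forward: float  # d: probability of a forward-direction read


def parse_model(spec: str) -> ModelSpec:
    """Parse a string of the assumed form NAME:n,d."""
    name, sep, params = spec.partition(":")
    if not sep or not name:
        raise ValueError(f"expected NAME:n,d, got {spec!r}")
    fields = params.split(",")
    if len(fields) != 2:
        raise ValueError(f"expected two parameters n,d, got {params!r}")
    count = int(fields[0])      # raises ValueError if n is not an integer
    forward = float(fields[1])  # raises ValueError if d is not a number
    if count < 0:
        raise ValueError("n must not be negative")
    if not 0.0 <= forward <= 1.0:
        raise ValueError("d is a probability and must lie in [0, 1]")
    return ModelSpec(name, count, forward)


if __name__ == "__main__":
    # Example values chosen for illustration only.
    print(parse_model("sanger:1000,0.5"))
```

Under these assumptions, "sanger:1000,0.5" would ask for 1000 sequences, about half of them in the forward direction.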

A Solexa model will be added as soon as anybody says something definitive about the error modes.

In any case, a read that runs out of reference sequence is filled up with X's, indicating vector, which I hope makes sense for Sanger, at least.
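
The priming scheme described above (uniform starting points, forward direction with probability d, X's once the sequence runs out) can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not simseq's implementation: the fixed read_length, the treatment of a non-forward read as the reverse complement, and the function names are all hypothetical, and no sequencing errors are introduced.

```python
import random

# Complement table for plain DNA letters; other characters pass through.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def simulate_reads(reference, n, d, read_length=8, rng=None):
    """Draw n error-free reads from a single reference sequence.

    Assumptions of this sketch (not taken from simseq itself):
    - every read has the same fixed length, read_length;
    - a non-forward read is taken from the reverse complement.
    """
    if not reference:
        raise ValueError("reference must not be empty")
    rng = rng or random.Random()
    reads = []
    for _ in range(n):
        # Forward with probability d, otherwise the opposite strand.
        forward = rng.random() < d
        template = reference if forward else reverse_complement(reference)
        # Starting point drawn uniformly over the template.
        start = rng.randrange(len(template))
        read = template[start:start + read_length]
        # Running out of sequence is marked with X's (vector).
        reads.append(read.ljust(read_length, "X"))
    return reads


if __name__ == "__main__":
    for r in simulate_reads("ACGTTGCAAC", n=5, d=0.5):
        print(r)
```

Reads that start near the end of the template come back shorter than read_length and are therefore padded with X's on the right.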


Installation is the usual Cabal routine. Get a working GHC, install my 'bio' library, and do:

chmod +x Setup.hs
./Setup.hs configure
./Setup.hs build
sudo ./Setup.hs install

Mail me if it doesn't work.