@hackage dephd0.0

Analyze 'phred' output (.phd files)

Synopsis

dephd - A simple tool for base calling and quality appraisal

Reads files in phd format (phred output), specified either individually or by directory (use the --dir option to read directories).
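For readers unfamiliar with the input, here is a minimal sketch of reading one phd file. It is written in Python purely for illustration and is not dephd's own parser. It assumes the commonly documented phd layout, where each line between BEGIN_DNA and END_DNA holds a base, its quality value, and its peak position. The sample record and the name parse_phd are made up for the example.

```python
from typing import List, Tuple

# A tiny hand-written record in the assumed phd layout (not real phred output).
SAMPLE_PHD = """\
BEGIN_SEQUENCE example_read

BEGIN_COMMENT
CHROMAT_FILE: example_read
END_COMMENT

BEGIN_DNA
a 8 5
c 24 17
g 35 29
t 12 41
END_DNA

END_SEQUENCE
"""


def parse_phd(text: str) -> Tuple[str, List[Tuple[str, int, int]]]:
    """Return (sequence name, [(base, quality, peak position), ...])."""
    name = ""
    calls: List[Tuple[str, int, int]] = []
    in_dna = False
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if fields[0] == "BEGIN_SEQUENCE" and len(fields) > 1:
            name = fields[1]
        elif fields[0] == "BEGIN_DNA":
            in_dna = True
        elif fields[0] == "END_DNA":
            in_dna = False
        elif in_dna and len(fields) >= 3:
            # base, quality, peak position
            calls.append((fields[0], int(fields[1]), int(fields[2])))
    return name, calls


name, calls = parse_phd(SAMPLE_PHD)
print(name, [quality for _, quality, _ in calls])
```

Everything outside the BEGIN_DNA/END_DNA block is ignored here except the sequence name.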

Installation

You need the GHC compiler or, if you know what you are doing, another Haskell compiler or interpreter with Cabal. You also need to install the 'bio' library (darcs get http://malde.org/~ketil/bio).

With those things in place, you should be able to run:

 runhaskell Setup configure
 runhaskell Setup build
 sudo runhaskell Setup install

Optionally, add "--prefix $HOME" (without the quotes) after configure to install to your home directory, in which case you don't need the 'sudo'.

Usage

    dephd --rank files..
    dephd --rank --dir dirs..

Outputs (to standard output) a summary of all phd files, including sequence name, average quality, length of longest contiguous run with qualities >= 15, 20 and 30, and longest run with sliding average quality 20 or better.
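The summary statistics described above can be sketched as follows. This is an illustrative Python sketch, not dephd's implementation. The function names and the window size of 5 are assumptions, as is the choice to measure a sliding-average run in bases covered by consecutive qualifying windows.

```python
from typing import List


def average_quality(quals: List[int]) -> float:
    """Mean quality over the whole read (0.0 for an empty read)."""
    return sum(quals) / len(quals) if quals else 0.0


def longest_run_at_least(quals: List[int], threshold: int) -> int:
    """Length of the longest contiguous run with quality >= threshold."""
    best = current = 0
    for q in quals:
        current = current + 1 if q >= threshold else 0
        best = max(best, current)
    return best


def longest_run_sliding_average(quals: List[int],
                                threshold: float = 20.0,
                                window: int = 5) -> int:
    """Longest stretch, in bases, covered by consecutive windows whose
    mean quality is >= threshold. The window size is an assumption."""
    best = current = 0
    for start in range(len(quals) - window + 1):
        if sum(quals[start:start + window]) >= threshold * window:
            current += 1
        else:
            current = 0
        best = max(best, current)
    # n consecutive windows of width w cover n + w - 1 bases
    return best + window - 1 if best else 0


quals = [10, 16, 22, 31, 35, 30, 18, 9, 25, 25]
print(average_quality(quals))
print([longest_run_at_least(quals, t) for t in (15, 20, 30)])  # [6, 4, 3]
print(longest_run_sliding_average(quals))                      # 10
```

In this example the sliding average tolerates the single low-quality base (9) that breaks every fixed-threshold run, which is why the sliding-average run is longer.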

    dephd --call files..
    dephd --call --dir dirs..

Produces the files 'dephd.fasta' and 'dephd.qual' in the current directory, containing sequence and quality data, respectively. Bases estimated (currently very conservatively) to be of good quality are written in upper case; very low quality bases are written as lower case 'n's.
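To illustrate the case convention only, here is a small Python sketch. It is not how dephd estimates quality: the per-base thresholds (good=20, bad=5), the function name, and the treatment of in-between bases as lower-case letters are all assumptions made for the example.

```python
from typing import List


def mask_by_quality(bases: str, quals: List[int],
                    good: int = 20, bad: int = 5) -> str:
    """Upper-case good bases, replace very low quality bases with 'n'.

    The thresholds are hypothetical; bases in between are kept in
    lower case, which is also an assumption.
    """
    out = []
    for base, q in zip(bases, quals):
        if q >= good:
            out.append(base.upper())
        elif q < bad:
            out.append("n")
        else:
            out.append(base.lower())
    return "".join(out)


print(mask_by_quality("acgtacgt", [3, 8, 25, 40, 19, 20, 4, 30]))  # ncGTaCnT
```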

    dephd --plot files..
    dephd --plot --dir dirs..
    dephd --plot -X files..
    dephd --plot -X --dir dirs..

Produces a plot of sequence quality, along with a sliding average. With -X, the plot is displayed directly; without -X, a jpg file with the plot is produced. A similar option, --plotall, generates and displays all plots at once instead of one at a time (only useful together with -X).
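The sliding-average curve drawn alongside the raw qualities could be computed roughly as below. This is a Python sketch under stated assumptions, not dephd's plotting code: the centered window, its default width of 5, and the truncation at the read ends are all choices made for illustration, and the drawing step itself is left out.

```python
from typing import List


def sliding_average(quals: List[int], window: int = 5) -> List[float]:
    """Centered moving average with one value per base.

    Near the ends of the read the window is truncated to the
    positions that exist, so the output has the same length as
    the input.
    """
    half = window // 2
    smoothed = []
    for i in range(len(quals)):
        lo = max(0, i - half)
        hi = min(len(quals), i + half + 1)
        smoothed.append(sum(quals[lo:hi]) / (hi - lo))
    return smoothed


print(sliding_average([10, 20, 30, 40, 50], window=3))
# [15.0, 20.0, 30.0, 40.0, 45.0]
```

Plotting the raw values and the smoothed values against base position gives the kind of picture described above.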

Bugs

Not many, I hope. Specifying more than one action at a time will pull all sequences into memory, but a single action should stream okay. Approximately 15,000 phd files can be processed with --call, --plot, or --rank (one action at a time) using less than 100 MB of RAM.

For further questions, email me at ketil@malde.org.