ICORN: Iterative Correction of Reference Nucleotides
Instructions For Use
Inputs: reference sequence (FASTA), reads (FASTQ), and information about the short reads
- Download ICORN from the download page and unzip the file.
- Download and install SSAHA_pileup and SNP-o-matic.
- Define the shell variables ICORN_HOME, PILEUP_HOME and SNPOMATIC_HOME, pointing to the paths where the three packages are installed. If ICORN cannot find SSAHA_pileup or SNP-o-matic, it reports an error asking you to adjust the path.
- Run:
icorn.start.sh <Reference> <Start Iteration, should be 1*> <Stop Iteration, e.g. 5*> <fastq forward> <fastq reverse> <insert range, e.g. 100,400> <mean insert, e.g. 300> (for mate pair reads)
(* In the current version, the algorithm does not stop on its own.)
- ICORN runs for several hours, depending on the size of the reference and the number of short reads. The current version is designed for small genomes. A distributed version using an LSF farm is available from the authors on request.
- To see the statistics, open the files Stats.Mapping.csv and Stats.Correction. The file Final.corrected.fa holds the final corrected sequence.
- To show the results via a graphical interface, start Artemis and load the sequence and the All.sequence.gff file. In Artemis, load the sequence/contig by going to 'File', 'Read an Entry or Open' and selecting the sequence file, and after this select the <All.sequence.>.gff file.
- You can also load the coverage plots for the reference sequences, which show the coverage in the first iteration. Load a plot by going to 'Graph', 'Add User Plot' and selecting the file '<sequence>.coverage.plot' in the plots directory.
- For screenshots, examples and a test dataset please see this.
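
The setup and run steps above can be sketched as a small shell snippet. This is only an illustration, not part of the ICORN distribution: the install directories under $HOME/software and the helper functions check_icorn_env and icorn_command are hypothetical, and the input file names in the usage note are placeholders. Only the three variable names and the argument order of icorn.start.sh come from the instructions above.

```shell
#!/bin/sh
# Hypothetical install locations (assumption): adjust to where you
# unpacked ICORN, SSAHA_pileup and SNP-o-matic. Values already set in
# the environment are kept.
: "${ICORN_HOME:=$HOME/software/icorn}"
: "${PILEUP_HOME:=$HOME/software/ssaha_pileup}"
: "${SNPOMATIC_HOME:=$HOME/software/snpomatic}"
export ICORN_HOME PILEUP_HOME SNPOMATIC_HOME

# Hypothetical helper: returns 0 only if all three variables point at
# existing directories, and names each one that does not.
check_icorn_env() {
    missing=0
    for name in ICORN_HOME PILEUP_HOME SNPOMATIC_HOME; do
        eval "dir=\${$name}"
        if [ ! -d "$dir" ]; then
            echo "$name is not a directory: $dir" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Hypothetical helper: prints the icorn.start.sh command line so it can
# be inspected before running. Arguments, in order: reference, start
# iteration, stop iteration, forward fastq, reverse fastq, insert range,
# mean insert.
icorn_command() {
    if [ "$#" -ne 7 ]; then
        echo "expected 7 arguments, got $#" >&2
        return 2
    fi
    printf '%s %s %s %s %s %s %s %s\n' "icorn.start.sh" "$@"
}
```

For example, after sourcing the snippet, `check_icorn_env && icorn_command reference.fa 1 5 reads_1.fastq reads_2.fastq 100,400 300` prints the command for five iterations only if all three directories exist. Printing the command rather than executing it is a deliberate choice here, since a run can take several hours.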