ICORN: Iterative Correction of Reference Nucleotides

iCORN2

We are happy to announce version 2 of iCORN. We recently used this version to successfully correct PacBio assemblies of bacteria and eukaryotes. Although PacBio uses Quiver to correct the consensus, iCORN2 still corrects many indels! The current version can be found here: iCORN2.

A short description of iCORN2 (including a simple test set).

iCORN(1) in PAGIT

We bundled ICORN into PAGIT. Please follow the installation instructions described here.

icorn

ICORN, a.k.a. ACORN, is software for correcting reference genome sequences. The main idea is to iteratively map reads and find differences between the reads and the sequence. A change to the sequence is kept only if the number of perfectly mapping reads over that region does not decrease. Results are exported for Artemis or Gap4.
We have applied ICORN successfully to projects such as Plasmodium falciparum, Chlamydia, Echinococcus multilocularis, Salmonella, Streptococcus pneumoniae...

ICORN uses SSAHA_pileup to map short reads against a genomic sequence or a set of contigs. Next, it finds single nucleotide polymorphisms and short insertions and deletions (indels) of up to 3 bases. These are treated as errors in the reference sequence. After the errors are corrected, the coverage of perfectly mapping reads is calculated (with SNP-o-matic) and compared with the coverage before the correction. If the coverage does not decrease, the correction is accepted. The algorithm runs until no more errors are found.
The output of ICORN is best analyzed in Artemis. For further information, please see the Documentation.
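
The accept-or-reject loop described above can be illustrated with a small Python sketch. This is not ICORN's code: it is a toy under stated assumptions. Exact substring search stands in for SSAHA_pileup and SNP-o-matic, only single-base substitutions are proposed (no indels), and the number of perfectly mapping reads is counted over the whole sequence rather than per region. All function names are hypothetical.

```python
from collections import Counter


def count_perfect_reads(reference, reads):
    """Number of reads that occur in the reference without any mismatch."""
    return sum(1 for read in reads if read in reference)


def propose_substitutions(reference, reads):
    """Toy SNP caller (assumption, not ICORN's method): propose a base
    change wherever a read aligns to the reference with exactly one mismatch."""
    votes = Counter()
    for read in reads:
        if read in reference:
            continue  # read already maps perfectly
        for start in range(len(reference) - len(read) + 1):
            window = reference[start:start + len(read)]
            diffs = [i for i, (a, b) in enumerate(zip(window, read)) if a != b]
            if len(diffs) == 1:
                votes[(start + diffs[0], read[diffs[0]])] += 1
    # Best-supported candidates first.
    return [candidate for candidate, _ in votes.most_common()]


def iterative_correct(reference, reads, max_iterations=10):
    """Apply candidate corrections, keeping each one only if the number of
    perfectly mapping reads does not decrease; stop when nothing is accepted."""
    for _ in range(max_iterations):
        accepted = False
        for pos, base in propose_substitutions(reference, reads):
            before = count_perfect_reads(reference, reads)
            trial = reference[:pos] + base + reference[pos + 1:]
            if count_perfect_reads(trial, reads) >= before:
                reference = trial
                accepted = True
        if not accepted:
            break
    return reference


if __name__ == "__main__":
    true_sequence = "ACGTTGCAAGGCTTACGATC"
    # Error-free 8-base reads tiled across the true sequence.
    reads = [true_sequence[i:i + 8] for i in range(len(true_sequence) - 7)]
    draft = "ACGTTGCAATGCTTACGATC"  # one wrong base at position 9
    print(iterative_correct(draft, reads))
```

The cap on iterations is a safeguard added for the sketch: because a correction is accepted when the count merely stays equal, a toy caller could otherwise alternate between two equally supported bases.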

Download

A version of ICORN that does not depend on a cluster computer can be downloaded as standalone software. A run takes around 12 hours for a 4 Mb bacterial genome.

Example

An example is given for chromosome 1 of Plasmodium falciparum 3D7. The Illumina/Solexa reads and the chromosome can be downloaded. Screenshots are shown.

Funding

Funding for the development of ICORN is provided by the European Union 6th Framework Program grant to the BioMalPar Consortium [grant number LSHP-LT-2004-503578] and the Wellcome Trust Sanger Institute.
