Widespread adoption of massively parallel deoxyribonucleic acid (DNA) sequencing instruments has prompted the recent development of de novo short-read assembly algorithms. A common shortcoming of the available tools is their inability to efficiently assemble the vast amounts of data generated by large-scale sequencing projects, such as the sequencing of individual human genomes to catalog natural genetic variation. To address this limitation, ABySS (Assembly By Short Sequences), a de novo, parallel, paired-end sequence assembler, was designed and developed for short reads. The single-node version is useful for assembling genomes up to 100 Mbases in size. A parallel version of ABySS, implemented using MPI, is capable of assembling larger genomes. The script abyss-pe runs a more comprehensive set of tools to process paired-end data.
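
As a rough illustration of choosing between the single-node and MPI versions, the following Python sketch assembles an abyss-pe command line. It rests on several assumptions that are not stated in the text above: the parameter names name=, k=, in= and np= are taken from the ABySS documentation and should be checked against the installed version; the 100 Mbases figure is treated as a hard switch-over point, although the text only calls the single-node version "useful" up to that size; and the helper name, read file names, k value and process count are hypothetical. The sketch only builds the command and does not run it.

```python
import shlex

# Assumption: treat "up to 100 Mbases" as the point at which to switch to MPI.
SINGLE_NODE_LIMIT_BASES = 100_000_000


def build_abyss_pe_command(name, k, reads, genome_size_bases, mpi_processes=8):
    """Return an abyss-pe argument list (hypothetical helper, not part of ABySS).

    name, k and in are passed as make-style variables; np is added only
    when the estimated genome size exceeds the single-node limit.
    """
    cmd = ["abyss-pe", f"name={name}", f"k={k}", "in=" + " ".join(reads)]
    if genome_size_bases > SINGLE_NODE_LIMIT_BASES:
        # Request the MPI version with the given number of processes.
        cmd.insert(1, f"np={mpi_processes}")
    return cmd


# Placeholder inputs: file names and k are illustrative only.
small = build_abyss_pe_command("ecoli", 64, ["reads1.fa", "reads2.fa"], 4_600_000)
large = build_abyss_pe_command("human", 64, ["reads1.fa", "reads2.fa"], 3_100_000_000)

# shlex.join quotes the in= argument because it contains a space.
print(shlex.join(small))
print(shlex.join(large))
```

The returned list can be passed to subprocess.run once the parameters have been confirmed for the ABySS release in use.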
