An improved sequence assembly program

Document Type

Article

Publication Date

4-1-1996

Department

Department of Computer Science

Abstract

We describe a number of improvements to the CAP sequence assembly program. These improvements include the development of methods for solving the problem caused by simple repetitive sequences, for automatically editing fragment alignments and consensus sequences, and for identifying chimeric fragments. The improved program (CAP2) assembled each of seven data sets, six of which contain repetitive sequences of very strong similarity, into a single sequence. As an example, CAP2 assembled a set of 1467 fragments into a single sequence of 73,328 bp that has only eight differences from the original sequence. The effects of fragment length, coverage, and error rate on the performance of CAP2 were evaluated using artificial data sets.

Publication Title

Genomics
