Whitehead's Arachne Brings Whole-Genome Shotgun Sequence Assembly to the Masses


A new assembly program from the Whitehead Institute may give researchers outside of Celera Genomics their first chance at assembling whole-genome shotgun sequence.

The program, called Arachne, was described in the January 2002 issue of Genome Research. The authors noted that while other assembly programs, such as Phrap, TIGR Assembler, Amass, Euler, GigAssembler, and the Celera assembler, have been reported in the literature, only Celera’s assembler has so far been able to handle large and complex eukaryotic genomes. That would make Arachne the first publicly available tool for the job.

Whitehead researcher David Jaffe, an author on the paper, told BioInform that the Sanger Center is also developing a whole-genome shotgun assembler called Phusion, but it is not yet publicly available. The two sequencing centers have been comparing their programs against the same data sets in a friendly competition, Jaffe said, “and we’re learning from each other’s assemblies as we go along.” Both programs are currently being used to assemble the whole-genome shotgun sequence for 5-6X coverage of mouse, which Jaffe said is expected to be publicly available in March.

The key difference between Arachne and other assembly methods is its use of pairing information — paired forward and reverse reads from both ends of plasmid clones — to order and orient unique contigs into longer segments called supercontigs (or scaffolds). Programs such as Phrap do not use this pairing information, Jaffe said, and are too slow to scale to larger data sets. Arachne is similar to Phrap, however, in its use of quality scores to ascertain the accuracy of read alignment.
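To make the idea of ordering and orienting contigs from read pairs concrete, here is a minimal Python sketch. It is not Arachne's algorithm, only an illustration of the general technique under stated assumptions: the two reads of a pair point toward each other on opposite strands, a link is trusted only when at least `min_support` pairs agree, and the links within a scaffold are mutually consistent. The function name `build_scaffolds`, its parameters, and the input tuple layout are all hypothetical.

```python
from collections import Counter, defaultdict
from graphlib import TopologicalSorter  # Python 3.9+


def build_scaffolds(pair_links, min_support=2):
    """Order and orient contigs from read-pair links (illustrative only).

    pair_links: one (contig_a, strand_a, contig_b, strand_b) tuple per read
    pair whose two reads landed in different contigs; strands are '+' or '-'
    relative to each contig as assembled.
    Returns a list of scaffolds, each a list of (contig, orientation).
    """
    # Pool pairs that imply the same relative order and orientation.
    support = Counter()
    for a, sa, b, sb in pair_links:
        if a > b:  # canonical order so (a, b) and (b, a) are counted together
            a, sa, b, sb = b, sb, a, sa
        flip = sa == sb      # mates on the same strand: one contig is reversed
        a_first = sa == "+"  # a '+' read points right, toward its mate
        support[(a, b, flip, a_first)] += 1

    # Keep only links backed by enough agreeing pairs.
    graph = defaultdict(list)
    for (a, b, flip, a_first), count in support.items():
        if count >= min_support:
            graph[a].append((b, flip, a_first))
            graph[b].append((a, flip, a_first == flip))  # same link, b's frame

    scaffolds, is_reversed = [], {}
    for seed in sorted(graph):
        if seed in is_reversed:
            continue
        is_reversed[seed] = False
        before = defaultdict(set)  # contig -> contigs that must precede it
        before[seed]               # make sure the seed is a node
        stack = [seed]
        while stack:
            x = stack.pop()
            for y, flip, x_first in graph[x]:
                if y not in is_reversed:
                    is_reversed[y] = is_reversed[x] ^ flip
                    stack.append(y)
                # x_first is in x's own frame; mirror it if x is reversed.
                if x_first ^ is_reversed[x]:
                    before[y].add(x)
                else:
                    before[x].add(y)
        # Conflicting links would raise graphlib.CycleError here.
        order = TopologicalSorter(before).static_order()
        scaffolds.append(
            [(c, "-" if is_reversed[c] else "+") for c in order]
        )
    return scaffolds


# Three contigs; c2 was assembled in reverse. One stray c1-c3 pair is ignored.
links = [
    ("c1", "+", "c2", "+"), ("c1", "+", "c2", "+"),
    ("c3", "-", "c2", "-"), ("c2", "-", "c3", "-"),
    ("c1", "+", "c3", "+"),
]
print(build_scaffolds(links))
# [[('c1', '+'), ('c2', '-'), ('c3', '+')]]
```

A production assembler would also use the known insert sizes to estimate gap lengths and would have to resolve conflicting links caused by repeats and chimeric clones, neither of which this sketch attempts.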

Whitehead has used Arachne to assemble the 40-megabase genome of the fungus Neurospora crassa and is applying it to its other sequencing projects, including the 400-megabase Tetraodon nigroviridis genome and the 180-megabase Ciona savignyi genome. For Neurospora, the Whitehead researchers compared their assembly to four megabases of independently generated finished sequence and found only two discrepancies (99.996 percent accuracy).

Whitehead demonstrated the feasibility of Arachne for mammalian-sized genomes by producing an initial WGS assembly of 4X coverage of the mouse genome in eight days on a single Alpha processor running at 833 MHz and using less than 24 GB of RAM. The authors noted, however, that while the program should be useful for producing initial WGS assemblies of large genomes, “producing high-quality finished sequences of such genomes will require at least some clone-based sequencing.”

The Arachne software package is freely available from the Whitehead website ( edu/wga) for Compaq Alpha hardware running Tru64 Unix. Source code is also available (

— BT
