The recovery effort at the site of the World Trade Center disaster officially ended on May 30 with a ceremony that brought a sense of closure to many New York City residents. But for families and friends of the victims of September 11, the mourning continues. For many of the bereaved, the healing process can’t begin without positive identification of a loved one’s remains.
Ann Arbor, Mich.-based bioinformatics company Gene Codes has unexpectedly found itself playing a central role in the WTC recovery effort. Through its new wholly owned subsidiary, Gene Codes Forensics, the company has been able to bring speed and accuracy to the grim and painstaking investigation, contributing to the identification of 1,203 individuals to date, out of roughly 2,800 victims who perished at the site.
In late September, New York’s Office of the Chief Medical Examiner recruited Gene Codes to help solve the horrific DNA pattern-matching problem. It was immediately obvious that the ME’s standard methods for crime-scene DNA identification, based on the FBI’s CODIS (Combined DNA Index System) software for one-to-one pattern matching, would be utterly inadequate to handle the unprecedented WTC disaster. The office arranged to outsource DNA extraction and short tandem repeat profiling to Myriad Genetics in Salt Lake City and Bode Technology Group in Springfield, Va., but the department of forensic biology, which had long used Gene Codes’ Sequencher software for mitochondrial DNA analysis, soon realized it required a new level of computational help for the task.
Department head Bob Shaler called on Gene Codes’ president, Howard Cash, to deliver five features in a new software program: match DNA from individual remains with DNA from family members and victims’ personal effects; reunify separated pieces of individuals; track collected samples; maintain chains of custody for all submitted swabs and personal effects; and confirm the accuracy of the identifications with rigorous quality assurance tests. After all, there had been no preparation for this data-gathering project, and the donors as well as most of those collecting the data were in a state of shock at the time. Entry errors were a given.
Accuracy Is Top Priority
To be sure, the computational challenge for Gene Codes would not be extraordinarily complex: write a program that detects matches within and among several fields of data. But the significance of the program’s findings, which would trigger the release of remains to families for burial, meant that impeccably accurate output would be imperative. As Gene Codes’ quality assurance specialist Amy Sutton noted, the demands of the project were different from creating a commercial software product. “It’s embarrassing if it crashes, but it’s acceptable,” she said. “What’s not acceptable is making the wrong match.” Added Cash: “The bug that crashes the program is not that bad; a bug that makes it look like we made an ID when we didn’t is catastrophic.”
Since September 11, the ME’s office has been receiving remains from the WTC disaster at a rate of about 100 “pieces” per day. Meanwhile, victims’ parents, siblings, and children have submitted cheek swabs, as well as personal effects of the deceased (toothbrushes, combs, razors, etc.), to the New York State Police. New York’s ultimate goal is that every bit of human remains recovered from the disaster site will be identified by DNA, matched to a relative and a personal effect, and returned to a grieving family. Even remains that are too damaged to yield enough readable short tandem repeats are being preserved in hopes that a mitochondrial DNA profile or even some identifying SNPs can be recovered.
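To make the matching task concrete, here is a minimal sketch in Python of exact profile matching between remains and personal effects. It is an illustration only, not the M-FISys code or its method. The sample IDs, allele values, and function names are invented for the example, and it assumes complete profiles typed at the same loci. Kinship matching against relatives' cheek swabs and handling of partial profiles are more involved and are not covered.

```python
from collections import defaultdict

# Hypothetical STR profile: locus name -> pair of allele repeat counts.
# The locus names are real CODIS loci; every value below is invented.

def profile_key(profile):
    """Canonical hashable form: alleles sorted per locus, loci sorted by name."""
    return tuple(
        sorted((locus, tuple(sorted(alleles))) for locus, alleles in profile.items())
    )

def group_identical(samples):
    """Group sample IDs whose profiles agree at every locus."""
    groups = defaultdict(list)
    for sample_id, profile in samples.items():
        groups[profile_key(profile)].append(sample_id)
    return groups

def match_remains_to_effects(remains, effects):
    """Map each personal effect to the remains that share its exact profile."""
    by_profile = group_identical(remains)
    return {
        effect_id: sorted(by_profile.get(profile_key(profile), []))
        for effect_id, profile in effects.items()
    }

remains = {
    "R-001": {"D3S1358": (15, 17), "vWA": (16, 18), "FGA": (21, 24)},
    "R-002": {"D3S1358": (17, 15), "vWA": (18, 16), "FGA": (24, 21)},  # same person, alleles listed in reverse
    "R-003": {"D3S1358": (14, 16), "vWA": (17, 17), "FGA": (20, 22)},
}
effects = {
    "toothbrush-A": {"D3S1358": (15, 17), "vWA": (16, 18), "FGA": (21, 24)},
    "razor-B": {"D3S1358": (12, 13), "vWA": (15, 19), "FGA": (19, 25)},
}

matches = match_remains_to_effects(remains, effects)
print(matches)
```

Grouping remains by a canonical key handles both jobs at once: pieces with the same profile fall into one group (reunification), and a personal effect is looked up against those groups (identification). Sorting the alleles within each locus keeps a data-entry difference in allele order from hiding a true match.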
Extreme Programming for Extreme Circumstances
Gene Codes engineers toiled days, nights, weekends, and holidays through October and November to build a customized profile-matching software program dubbed M-FISys (pronounced “emphasis” and short for Mass Fatality Identification System). Writing code in C#, they used an extreme programming approach, working intensively in pairs to both test and review each line of code “to ferret out all the defects in the program before it goes out the door,” Sutton said.
Aside from the more than 700 tests that are run automatically 10 to 15 times a day on the software, Sutton tested the program by pushing it to every conceivable limit that a user might dream up. Cash said his company has always had high standards for quality control, but “we’ve never done Q/A like this.”
Gene Codes delivered version 1.0 of M-FISys to the ME’s office on December 10, and Cash has flown from Detroit to New York every Friday for the past eight months to install weekly M-FISys upgrades and get feedback on the previous week’s work. Gene Codes has delivered nearly 30 upgrades of the program so far.
By the one-year anniversary of the tragedy, the ME’s office expects to have made all the matches it can using short tandem repeats for body parts that remain unidentified. It will then move on to mitochondrial DNA analysis and, with the help of Orchid’s subsidiary GeneScreen and Celera Genomics, the ME’s office will also try a new approach to identifying remains by SNP analysis. Gene Codes programmers are now upgrading M-FISys to incorporate those additional search fields.
Although Gene Codes has a $10 million contract with New York City, Cash said he intends to bill only for his costs, which he estimates will fall in the range of $3 million. As a consequence, he expects the company might suffer its first unprofitable quarter in eight-and-a-half years. But Cash is unfazed by the financial repercussions of the project, which he has called the most important moment of his professional life.