
NEW YORK (GenomeWeb) — With a staff of 175, including 160 scientists, the Department of Forensic Biology of the City of New York Office of Chief Medical Examiner (OCME) is the largest forensic DNA testing laboratory in the United States and has been at the forefront of developing and testing new technologies for analyzing samples from crime scenes and identifying human remains.
Besides the forensic biology lab, OCME operates three other laboratories, focusing on forensic toxicology, molecular genetics, and histology. The office investigates all deaths in New York City that result from crime, accidents, or suicides, or that occur in a sudden or unusual manner, and conducts an average of 5,500 autopsies per year.
Following 9/11, the forensic biology lab gained nationwide attention through its identification work on the victims of the World Trade Center attacks, which is still ongoing. As of earlier this month, OCME had identified remains from 60 percent of the 2,753 persons reported missing, and DNA played a role in almost 1,500 of those identifications.
GenomeWeb recently met with Timothy Kupferschmid, OCME's Chief of Laboratories, to talk about the forensic biology lab's work and its exploration of new technologies, including next-generation sequencing. Kupferschmid joined OCME in 2013 after seven years at Sorenson Forensics in Salt Lake City. Earlier in his career, he was the forensic technical director at Myriad Genetics, a business that Myriad has since divested; director of the Maine State Police Crime Laboratory; and a senior DNA analyst at the Armed Forces DNA Identification Laboratory. Below is an edited version of the conversation.
What does the Department of Forensic Biology do? What kinds and how many samples do you analyze?
The Department of Forensic Biology at the OCME is a forensic laboratory that works almost exclusively on criminal evidence that is recovered from crime scenes. Our primary customer is [the New York Police Department]. We service all five boroughs of New York City, with a population of around 8.3 million people. We do between 9,000 and 10,000 cases per year, which equates to about 50,000 samples per year. We do all the criminal casework from any biological material we recover from crime scenes. This includes homicides, rapes, assaults, burglaries, other types of property crime, gun cases — just about any case where you would expect to have DNA left behind by either the perpetrator or the victim. NYPD collects that evidence and submits it to our laboratory for processing.
What types of DNA analyses do you perform?
For the criminal evidence, we are exclusively doing STR or short tandem repeat analysis, the forensic markers that have been well established for over a decade. We also do human identification work, mostly for the Chief Medical Examiner's Office, and we still have ongoing work with identifying the remains from 9/11. Most of that work is done on bone remains, and we use mitochondrial DNA sequencing — we're still using Sanger sequencing for that.
How much progress are you still making on samples from 9/11 victims?
We just made at least one more ID in 2015, and we had three new IDs in 2014. Obviously, the samples are getting harder and harder. We're going back to the same remains 10, 12, 15 times: whenever we develop a new procedure, we try it on all the old bones. Mostly, we are fine-tuning the DNA extraction procedure. If we can get DNA out of the bone, we use standard PCR and sequencing techniques, so it's all in the extraction. We're using different chemicals, different times, different incubations. It's nothing magical, just continuous tweaking.
Overall, what have been the greatest improvements in forensic DNA testing over the last few years?
We have been very steady since the late 1990s. Our technology has been STR analysis by capillary electrophoresis. There has been a slow and steady increase in stability, sensitivity, balance of the PCR reactions, and so on.
But right now, in the last couple of years and moving forward, we are embarking on a technology that's new to forensics, which is massively parallel sequencing, or next-generation sequencing, and that is being worked on deliberately at several labs around the country, as well as by manufacturers. We try to stay at the forefront of new lab techniques here at the OCME, and we have certainly embraced the next-generation sequencing methods out there. We're working to test those procedures and develop our own methods around them.
How have you been testing next-gen sequencing?
There are only two manufacturers out there that have targeted the forensic industry: [Thermo Fisher Scientific's] Life Tech with the Ion Torrent PGM, and Illumina, which came out with the MiSeq FGx this year. The FGx is mostly upgraded software around their standard MiSeq. We have a MiSeq FGx in house, and we have a visiting scientist dedicated to that project, as well as some staff scientists. The MiSeq and the forensic kit that goes with it are what we're testing.
How have you been testing the MiSeq FGx, and what's your experience been so far?
We have not done a whole lot of testing — we've done some degradation studies, some sensitivity studies, some mixture studies; all the types of studies that we do in forensics. The huge difference between forensics and clinical and research work is our starting material is so terrible, and we can't go back and draw another tube of blood from the patient or volunteer. We get what we get, whatever we can recover from a crime scene, from the sidewalk, from a taxi cab. It's important for us in forensics to examine what the effects of degradation are, natural and man-made degradation, such as bleach. What are the effects of mixtures? You can imagine a taxi door handle has a mixture of dozens of individuals on it. And of course the sensitivity, because we don't recover a lot of DNA. Sensitivity has improved greatly in next-generation sequencing, but it's still not quite as sensitive as traditional STRs. That's why it has taken forensics so long to get into this game.
The target that the next-gen sequencing manufacturers are saying they get consistent reliable results for is 1 nanogram of DNA. In forensics, that's a bucket load. We routinely work at the picogram level of DNA. Can we go down to 100 picograms? Can we go down to 50 picograms? With STRs, we're down in the 10, 15, 20 picogram range and we're getting reliable results. Obviously, the technology of next-generation sequencing wasn't designed around those levels of DNA. It's up to us in the forensic labs to push it and tweak it. How far can we push it and still get reliable, accurate, and precise results?
Using standard STR methods, what is the least amount of DNA you can get away with to obtain reliable STR profiles?
Part of the problem in answering that question is the quantification of the amount of DNA you have. We use real-time PCR to quantitate DNA, and it's only accurate to within plus or minus 30 to 40 percent. Given that, we can certainly be in the 10-picogram DNA range and get reliable results. That's a little over one cell, but we don't have a whole cell's worth of DNA; we most likely have a bunch of pieces of DNA from multiple cells.
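For readers who want to check the arithmetic behind these quantities, the following is a minimal sketch in Python. It assumes a figure that does not come from the interview: roughly 6.6 picograms of nuclear DNA per diploid human cell, a commonly cited approximation. The function names are illustrative only.

```python
# Assumption (not from the interview): a diploid human cell holds roughly
# 6.6 picograms of nuclear DNA. Treat this as an approximate figure.
PG_PER_DIPLOID_CELL = 6.6


def cell_equivalents(picograms, pg_per_cell=PG_PER_DIPLOID_CELL):
    """Convert a DNA mass in picograms to an approximate number of cells."""
    return picograms / pg_per_cell


def quant_range(picograms, rel_error):
    """Return the (low, high) mass range implied by a relative quantification error."""
    return picograms * (1 - rel_error), picograms * (1 + rel_error)


# Compare the 1-nanogram NGS target with the 10-picogram STR range.
for label, pg in [("NGS target (1 ng)", 1000.0), ("STR low end (10 pg)", 10.0)]:
    print(f"{label}: about {cell_equivalents(pg):.1f} cell equivalents")

# Apply the upper end of the quoted quantification uncertainty (40 percent).
low, high = quant_range(10.0, 0.40)
print(f"10 pg at +/-40%: {low:.0f} to {high:.0f} pg, "
      f"or {cell_equivalents(low):.1f} to {cell_equivalents(high):.1f} cells")
```

Under that assumption, 10 picograms works out to about 1.5 cell equivalents, and the quoted uncertainty puts the true amount anywhere from just under one cell to roughly two.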
Where do you see the greatest promise of NGS technology in forensics?
The greatest promise is that we can multiplex so much more information than we used to. With NGS, we have all the traditional STRs, the so-called CODIS STRs, plus they have added additional STRs, such as Y-STRs — most commonly, we're looking for male DNA, as the perpetrator is generally male — and also X-STRs that focus on the female; generally, females are victims in forensics. We are also looking at identity SNPs, ancestry SNPs, and phenotypic SNPs that code for eye color, hair color, etc. With ancestry SNPs, if there is no suspect, we could at least give the police an investigative lead, such as 'you're looking for someone of Asian or Northern European descent.' With the same 1 nanogram or 0.5 nanogram of DNA, you can get all that additional information, whereas with traditional STRs, the most we are able to do is 22 or 23 STRs.
The other major advantage that we are really excited about is we are able to determine SNPs within the STRs, which could help us for mixture deconvolution. In traditional STRs, you have two alleles, one allele from your mother and one from your father. So I may be what they call an 11/12, I have 11 repeats on this allele, 12 repeats on the other allele. If my sample was mixed with somebody else that was an 11/13, the DNA mixture would look like 11/12/13, but if it was a 1:1 mixture, I would not know who had the 11, who had the 13, or the 12. But if there is a SNP within the STR, you could deconvolute the mixtures and determine contributions and who is there, and we can't do that now with traditional STRs.
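The 11/12 and 11/13 example can be sketched as a toy enumeration in Python. This is not a casework method: it assumes exactly two contributors, no allele dropout, and no use of peak heights or read counts. The suffixes "a" and "b" are hypothetical labels for two sequence variants of the 11-repeat allele that differ by a SNP.

```python
from itertools import combinations_with_replacement


def length_allele(seq_allele):
    """Drop the hypothetical sequence-variant suffix, leaving only the repeat count."""
    return seq_allele.rstrip("ab")


def consistent_pairs(observed):
    """List unordered pairs of genotypes whose combined alleles equal the observed set."""
    genotypes = list(combinations_with_replacement(sorted(observed), 2))
    return [
        (g1, g2)
        for g1, g2 in combinations_with_replacement(genotypes, 2)
        if set(g1) | set(g2) == set(observed)
    ]


# Hypothetical contributors: both carry an 11-repeat allele, but the two
# copies differ by a SNP inside the repeat region ("11a" versus "11b").
person_a = ("11a", "12")
person_b = ("11b", "13")

seq_observed = set(person_a) | set(person_b)
length_observed = {length_allele(a) for a in seq_observed}

print("By length:  ", sorted(length_observed),
      "->", len(consistent_pairs(length_observed)), "candidate genotype pairs")
print("By sequence:", sorted(seq_observed),
      "->", len(consistent_pairs(seq_observed)), "candidate genotype pairs")
```

In this toy case, length-based typing sees three alleles and leaves six possible two-person genotype combinations, while sequence-based typing sees four distinct alleles and leaves three. Real mixtures are harder, but the sketch shows why separating alleles of the same length narrows the possibilities.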
How often is that a problem?
Mixture deconvolution is our biggest challenge in forensic DNA. You get weaker statistics around mixtures because of allele sharing, you can't do random match probability statistics, you have to do likelihood ratios, which are going to be lower. The random match probabilities are generally those outrageous numbers you hear, one in a sextillion or one in a septillion. But with likelihood ratios, it's much lower, one in a million or one in 10,000.
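To see where numbers like one in a sextillion come from, here is a minimal sketch of the product rule for a single-source profile. The allele frequencies and the number of loci are hypothetical, chosen only for illustration. The sketch assumes Hardy-Weinberg proportions and independent loci, and it omits the population-structure corrections used in real casework.

```python
def genotype_frequency(p, q=None):
    """Hardy-Weinberg genotype frequency: p*p for a homozygote, 2*p*q for a heterozygote."""
    if q is None:
        return p * p
    return 2 * p * q


def random_match_probability(genotype_freqs):
    """Product rule: multiply per-locus genotype frequencies, assuming independent loci."""
    rmp = 1.0
    for f in genotype_freqs:
        rmp *= f
    return rmp


# Hypothetical profile: 20 loci, heterozygous at each one, with both alleles
# at a population frequency of 0.2 (illustrative values, not real data).
per_locus = [genotype_frequency(0.2, 0.2)] * 20
rmp = random_match_probability(per_locus)

print(f"Per-locus genotype frequency: {per_locus[0]:.2f}")
print(f"Random match probability: about 1 in {1 / rmp:.1e}")
```

With these made-up inputs, each locus matches about 8 percent of the population, and 20 such loci multiply out to odds in the sextillions. A mixture cannot be reduced to one genotype per locus in this way, which is why analysts fall back on likelihood ratios.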
How does next-gen sequencing compare cost-wise to traditional STR profiling?
It is more expensive now. We're hoping, as the technology and manufacturing processes mature, that costs will come down, but at the moment, it is more expensive, and throughput is lower. On the MiSeq, we're only able to run 32 samples at a time because we want enough coverage. If we had more DNA, we could do 96 samples at a time or even more. It also takes much longer — with an STR run, it only takes 45 minutes to do the electrophoresis, so a single capillary electrophoresis instrument is much higher throughput than a MiSeq instrument.
What is the cost of a standard STR analysis?
In private industry, for known samples from convicted offenders, for instance, where we're dealing with high-quality, high-copy DNA, generally from buccal swabs, it's down to about $25 to $30 per sample. With forensic samples, it's not as automated, it's a much more manual method, a lot more labor is involved, so generally, the private companies charge between $250 and $500 a sample. Next-gen sequencing is not even close, but I don't have an exact number.
What can you do today with DNA phenotyping, and how might this be used in forensics in the future?
It's still quite limited. Eye color, hair color, and skin tone are possible today. People get excited about eye color, but you can really only tell between blue eyes and dark eyes, and only about 16 percent of people in North America are blue-eyed, with an even lower percentage in the Southern hemisphere. So it doesn't help us too much if you are a dark-eyed person. With hair color, of course, a lot of people dye their hair, so that's always an issue. Skin tone, how can you put words on skin tone: are you dark, are you medium dark, are you light, what does that mean? Phenotypes are challenging, and we are really not quite there yet. That will develop, I'm sure, over time.
What about determining facial features from DNA variants?
That's even more immature. They're just coming out with those, so they haven't been tested very thoroughly yet. And some facial features, like your ears, grow over time.
Where do the recently launched integrated platforms for STR analysis, such as the RapidHIT 200 from IntegenX and the DNAscan Rapid DNA Analysis System from Network Biosystems (NetBio) and GE Healthcare, have a place in forensics?
They are absolutely coming, and we are also interested in looking at those technologies. They are designed and developed to do known samples, mostly buccal swabs from known individuals, and they absolutely work. Again, the challenge would be how much more development will be needed before they can work with forensic samples, with the degradation, mixtures, inhibitors, and so on.
What's the main advantage of those platforms? The ease of workflow? Or are they cheaper?
They are not cheaper than lab work, but they are cheap in the sense that you don't need any infrastructure. The military put a lot of money into them, so they are designed for rugged fieldwork. In battlefields, you could imagine that you want the identity of a person you just captured. They are also looking at refugee camps where you want to figure out family structures; they would be perfect for that operation. They were designed in the US around law enforcement, to go in booking stations. More than 20 states require DNA to be collected from arrestees, so even before you are convicted, if you are arrested, just like when you are fingerprinted, you will get a buccal swab, you would put that in the rapid machine, and within an hour or an hour and a half, you would have the DNA profile, you can scan it against the CODIS database and see if it has hits to any other crimes. That's the strength of those instruments; they are not ready for crime scene samples yet.
Do booking stations already do their own STR profiling?
No, they don't. Right now, they take the buccal swab, they send it to a lab, and the lab could take weeks or months to run those samples. We don't do this work here, it is done by the database lab in Albany. And New York does not have an arrestee law; you have to be convicted of one of the felony or misdemeanor crimes listed in the law to have your DNA profiled.
Several years ago, New York City cleared its backlog of untested rape kits. How did you do this, and could technology improvements help other places clear their backlogs, too?
We were the first to even recognize it was a problem, and we cleared it, and we do not have a rape kit backlog, we're very proud to say. How did we do it? We outsourced it, essentially, to three different private laboratories. That got rid of the backlog while our lab kept current with cases. But over the years, advances are continually being made in screening kits, extracting the DNA, and of course running the samples. Automation is a big help, but process flow, I think, is the most important thing. I'm a student of Lean Six Sigma, a process improvement method, which has helped amazingly in the forensic field. Some labs I have worked with have improved 300 percent, just by looking at their system and eliminating the waste inherent in their system.
Are you using any non-DNA technologies?
We do have a very small proteomics lab and are doing some very exciting work, using MALDI-TOF mass spectrometry. For example, we are able to determine the species of, mostly, bone remains. For some bones, especially if they are very small fragmented bones, you can't tell if they are human or animal. That's very important for the Medical Examiner's Office — is that a dead deer that's 25 years old? Or is this a burial site from a murder victim?
Why is a proteomics test better for this than a DNA-based test?
Protein is more abundant than DNA, so we have better chances of getting protein out of bone. We're also looking at proteins to do body fluid identification. Right now, we only have presumptive tests to identify blood, but again, the courts want to know, 'Is the blood in the back of a pickup truck deer blood, fish blood, or human blood?' We also need to identify semen, saliva, is it menstrual blood vs. venous blood — that might be important in a case. For all of those things we're looking at protein markers.
Where will your work be going in the next few years?
My main focus has been to eliminate our backlog. We have no sexual assault backlog and no homicide backlog, but we did traditionally have a property crime backlog. Property crimes are committed at rates many times higher than homicides and rapes, thankfully, so there are a lot more of those cases. Over the last year, we have really eliminated that backlog, so we hope to stay current. The more rapidly we can turn around DNA cases, the better it is for the police investigations and for the courts.
But we are also very interested in staying on the forefront of technology. The OCME has been a leader in technology under previous directors, and I intend to continue that leadership and embrace the latest technology for the field.