Chances are great that Cari Soderlund’s software built the map of any genome you’ve ever analyzed, or ever will
By Adrienne J. Burke
Before last month, computer scientist Cari Soderlund considered FPC, the program she wrote for building physical maps of genomes, to be her crowning glory. Now she has bragging rights of another kind: her software, which has been downloaded by labs worldwide and used in the mapping of many a genome since 1998, is also an improvement on the work of a Nobel laureate.
Soderlund, an unassuming academic with a twangy accent patchworked over 50 years of life in disparate locales (Michigan, Tennessee, Louisiana, New Jersey, New Mexico, England, South Carolina, and Arizona), wrote FPC (short for Finger Printed Contigs) last decade during a six-year stint in the Sanger Centre’s informatics division. John Sulston, said laureate, had in the 1980s developed a fingerprinting program that used restriction enzymes to measure fragments of clones. The program, which Sanger’s informatics chief Richard Durbin recalls as having been “written in Fortran on VAXes with limited user interface and some archaic graphic systems,” was used to build a map of all of the clones for the C. elegans sequencing project.
But by the time Soderlund arrived at Sanger, after having worked with biologist Chris Fields at New Mexico State on the first gene identification program and at Los Alamos National Lab with Christian Burks on restriction map assembly, the fingerprinting approach to mapping was all but dead. Fingerprinting is “what we call in computer science an NP-complete problem,” Soderlund says. “It’s a very hard problem,” and Sulston’s largely interactive program wasn’t up to bigger genome tasks.
When Sanger decided to use a BAC-by-BAC approach to sequence the whole human genome, Durbin wanted a next-generation fingerprint program to assemble the physical maps and fingerprint data. Soderlund led the project to build a stopgap software solution. Over the years, she says, “I got an algorithm to work for assembling the clones, and even though, since it’s NP-complete, it would take too much time to solve it exactly, there’s still the approximation algorithm that works really well. And the better the data, the better it works.”
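To make the idea of a cheap heuristic concrete, here is a minimal Python sketch of one piece of fingerprint-based mapping: grouping clones into contigs when their fingerprints share enough fragment sizes. It is an illustration only, not FPC's actual algorithm. The function names, the tolerance, the threshold, and the fragment sizes are all hypothetical, and the sketch skips the genuinely hard part, which is ordering the clones within each contig.

```python
from itertools import combinations


def shared_bands(a, b, tolerance=2):
    """Count bands in two sorted fingerprints that match within `tolerance`."""
    i = j = count = 0
    while i < len(a) and j < len(b):
        if abs(a[i] - b[j]) <= tolerance:
            count += 1
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return count


def build_contigs(fingerprints, min_shared=2, tolerance=2):
    """Group clones into contigs: two clones are linked when they share
    at least `min_shared` bands (hypothetical cutoff, single linkage)."""
    parent = {name: name for name in fingerprints}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    sorted_fp = {name: sorted(bands) for name, bands in fingerprints.items()}
    for a, b in combinations(sorted_fp, 2):
        if shared_bands(sorted_fp[a], sorted_fp[b], tolerance) >= min_shared:
            parent[find(a)] = find(b)

    contigs = {}
    for name in fingerprints:
        contigs.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in contigs.values())


# Made-up fragment sizes, for illustration only.
clones = {
    "cloneA": [100, 250, 400, 610, 800],
    "cloneB": [251, 399, 611, 900, 1200],
    "cloneC": [900, 1201, 1500, 1800],
    "cloneD": [50, 75, 3000],
}
loose = build_contigs(clones, min_shared=2)
strict = build_contigs(clones, min_shared=3)
```

With the looser cutoff, cloneA, cloneB, and cloneC chain into one contig; with the stricter one, cloneC drops out because it shares only two bands with cloneB. The pattern echoes Soderlund's point that the result depends heavily on the quality of the data: noisier fragment measurements force a wider tolerance, which in turn produces more chance matches.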
Plants Without Politics
Today her program is an industry standard that, with add-ons by Marco Marra and Soderlund herself, has been integral to fingerprinting whole genomes including Arabidopsis, rice, maize, mouse, zebrafish, and, of course, human. Having kept a running count of the number of labs that downloaded the publicly accessible FPC over a three-month period this year — about 50 — Soderlund suspects her program is being used to map a lot of smaller genomes too.
While she seems to have been deprived of much acclaim for her contribution to the Human Genome Project, Soderlund’s calendar in September hints at her increasing importance in another area of genomics: Tuesday to Washington for an NSF maize sequencing meeting; next Monday and Tuesday, St. Louis for a soybean mapping meeting; the week after that, Cornell for a tomato meeting.
In June Soderlund and five of her staff moved from Clemson University’s Genome Institute, where she cut her teeth on ag genomes, to the University of Arizona in Tucson. Soderlund, who rides a mountain bike to campus every morning, has come a long way since she last lived in these parts: she spent a summer during college waitressing on a dude ranch in the foothills of the Rincon Mountains. “It’s amazing to me to come back 25 years later as an academic in a different situation indeed.”
She is setting up the Arizona Genomics Computational Laboratory, the informatics complement to Rod Wing’s Arizona Genomics Institute (also transplanted from Clemson). Among the computational chores that Soderlund’s five-person team will handle are EST analysis, managing the infrastructure of AGI’s online clone-ordering service, and overall IT support for Wing’s group’s agricultural genomics endeavors. She’s also co-PI on a multi-university grant to look at Magnaporthe grisea, the pathogen that causes rice blast. Her team will put EST, sequence, and microarray data into a database to enable researchers to ask questions.
After years in the human genome arena, Soderlund has developed a fondness for the plant community. “There’s a different feeling to it. The politics are less and the genomes are fascinating. They’re more difficult, more complex than humans.”
Of her partnership with Wing, Soderlund says, “We’re on a roll here. The data he produces in his lab and the software I produce are very synergistic.” The duo is named on grants for work on barley, cotton, maize, rice, sorghum, and tomatoes. And Wing benefits from Soderlund’s continuing development of FPC — “we get to test the Ferrari version with all the bells and whistles,” Wing says. She’ll be keeping his group on its toes: NSF recently awarded her $300,000 to extend FPC to incorporate sequence and mapping data, and USDA awarded her an access grant to make results available and enable analysis on the Web.
Wing, who notes that Soderlund is not “under my wing — she needs her own identity,” describes FPC as Soderlund’s baby. “She does that to relax. She really enjoys that aspect of her work. We have lots of conversations and Cari will show us something and we’ll say, that’s great, but what about this?”
Soderlund’s undergrad training in psychology might have something to do with her uncanny ability to communicate in the language of her biology counterparts. One of that breed of bioinformaticists who found their way into the biology department from a strict CS background (she holds an MS in system science and a PhD in computer science, and her first job was designing circuit boards for Bell Labs), Soderlund prides herself on “being interdisciplinary.”
Says her old Sanger supervisor Durbin, “Cari is unusual in that she has a computer science background but really cares about what is required by the end user to get the job done. She’s very responsive and takes account of the people who need the tools she’s writing. … She also investigates the niggling, awkward problems that arise datawise and userwise and finds solutions for them.”
In spite of her ability to transcend disciplines, or maybe because of it, Soderlund confesses that she gets a little lonely. “I don’t feel like I belong anywhere anymore. I am a computer scientist, but the problems I solve are genomics, not computer science.” Asked if there’s anyone she’d point to as a mentor or role model she answers frankly, “Nope, not at all. I’ve really been finding my own way.”
At Arizona, when the computer science department wouldn’t have her, she set up shop in a converted fifth-floor conference room in the plant sciences department. “I would have liked to have been in CS, but I understand that, in a way, [my work] doesn’t fit.” Soderlund is in discussions about setting up a bioinformatics degree program at Arizona, and says that, eventually, she could envision her group evolving into its own department.
Durbin considers her identity crisis an advantage. “The strongest science is done where these disciplinary boundaries don’t mean very much,” he says, noting that years ago, at the Laboratory of Molecular Biology in Cambridge, Fred Sanger, Max Perutz, and Francis Crick disdained academic department divisions.
“There’s a lesson there for science,” Durbin says, “and Cari is an example.”