You Want Fries with That?


By Nat Goodman

The headline is the punch line to an old, cruel joke: What does a physics PhD say when he gets his first job?

As we contemplate the start of another school year, it’s time to ask whether bioinformatics is in the same basket as physics. Are we producing too many computational biology graduates with the wrong skills? And for those already in the field, what new skills do we need to stay on the diners’ side of the fast-food counter?

Unhappy Meals

The bioinformatics job market is bleak. I hear regularly from old friends who’ve lost their jobs and are having trouble finding new ones, and from new people struggling to break into the field. The résumés I see when trying to fill the few openings in my own group are heartbreaking: fine people out of work for months. While the mainstream press continues to hype bioinformatics as the latest taste sensation, it smells to me like yesterday’s burgers.

I spoke with several headhunters and hiring managers to get a broader perspective. Many insisted on anonymity, since it’s not good for business to dis the food.

One headhunter said flatly that he had seen no new openings in the past six months. Another said that the market was so bad, he’d essentially shut down his bioinformatics practice. The most optimistic words I heard were: yes, the market’s been bad, but it’s picking up slowly.

The model of hiring great software people and teaching them biology hasn’t worked very well and is pretty much off the menu. I heard a wide range of explanations as to why this is so. Some companies hired professional software engineers who insisted on following a rigid discipline that didn’t match the loosey-goosey culture of biology. Others hired academic software folks with great ideas but too little discipline to craft complex software that really worked. Some hired software developers and expected them to be architects and project managers. Others hired software architects and expected them to be coders. It seems that management lacked the experience to staff these projects and fell into every trap in the book.

The opposite approach of teaching biologists to write software also hasn’t worked very well, but has emerged as the strategy of choice. One source put it this way: hiring managers subconsciously want to clone themselves; since most bioinformatics managers are biologists, and many fancy themselves to be decent programmers, it’s natural for them to hire people with these same skills. The sad fact is that it’s a huge leap from writing a small Perl script to creating a complete software system. Few people can learn to do this without training and experience. It’s like asking a home cook to take over a restaurant kitchen. Serving up hundreds of burgers and fries to paying customers is a far cry from putting dinner on the table for a family of four.

Lighter Fare

My sources are seeing openings for people with specific scientific backgrounds, like oncology, coupled with strong computing expertise. Hiring managers are looking to plug very specific holes — a French-fryer rather than a chef or even a short-order cook. Call me cynical, but I gotta wonder what happens to the lucky applicant when all the fries have been cooked. The flip side is that it’s hard to find people with such highly specialized backgrounds, and many of these jobs remain open for months.

There is also a growing number of positions for data analysts: people who can make sense of the mountains of microarray and SNP data being generated. One source called this “end-user bioinformatics.” Managers are generally looking for people who can push the buttons on standard packages, rather than people who understand the stats and math and can do analyses from scratch. These jobs require enough biology to understand the problem being attacked. They also require decent programming skills, since you end up having to write scripts to move data from databases into the packages, store the results back in a database, and combine results from multiple tools.

Running a Kitchen

Bioinformatics projects — even small ones — are surprisingly complex from a software development perspective. Projects routinely involve Web interfaces, local and remote databases, external software packages, internally developed software tools, middleware, and other elements.

The gobbledygook is imposing: Netscape, Mozilla, IE. HTML, XML, JavaScript. Apache, mod_perl, CGI. Session management, Web page templates. Perl, Java, C++, R, MATLAB, Octave. BioPerl, BioJava. Object-oriented programming. Persistent objects. Relational databases. MySQL, PostgreSQL, Oracle, SQL Server. BioSQL, OBDA. Object-relational brokers. Web services. SOAP, BioMOBY, LWP, I3C.

It’s impossible for a programmer to be an expert in all these areas. In the real world, companies would expect to hire a team of programmers to cover the ground. In bioinformatics, we try to find one person who can do it all.

The biological context is also complex, and when you add the two together, you end up with a software development challenge that is as hard as it gets. Management tends to blame the labor pool: software people can’t learn biology; biologists can’t learn programming. But the real problem is that we as a community haven’t figured out how to organize the work in such a way that excellent, but imperfect, human beings can actually do it.

Short Order

As we start another school year, here’s my three cents of advice for students new and old:

1. Improve your software skills. Read a CS textbook on a subject you haven’t studied. Keep your eyes open to what’s new in the computing field, so you don’t waste time reinventing the wheel.

2. Improve your biology skills. Join a journal club and do the reading! Be aware of trends in biology, so you won’t be surprised by new requirements coming down the pike.

3. Hone your business instincts. Remember that your job is to create excellent software that helps scientists do research. Never deviate from this path. When your boss strokes your ego and sends you down a different path, look for a new boss.

I hear the school bell ringing. Back to class, everyone!


Fast-Food Foray: Lessons to Learn From the Bioinformaticist Gone Bust

Imagine you’re the bioinformatics jock in a small laboratory. The boss sends you an e-mail: “Hey Sal, here are five sequences I need Blasted. Can you do it for me?”

No problem. Five minutes to extract the sequences from the e-mail and send them to NCBI. Ten minutes to clean up the results when they come back and forward them to the boss. Cost: about $15 worth of labor.

Next day. “Hey Sal, do you have those sequences I sent you yesterday? I lost your e-mail. Can you run them again? And here’s another five sequences. What I really want to know is whether they’re alternative splice forms or allelic variants.”

Grrr... You dig the sequences out of your e-mail trash bin, thanking your lucky stars you didn’t empty it last night, and decide to build a database so this won’t happen again. You also realize that Blast is the wrong tool for answering the real question. So you write a couple of Perl scripts to store the sequences in MySQL, run all pairs through Smith-Waterman, and analyze the alignments for big gaps (suggesting alternative splicing) and small mismatches (suggesting allelic variations). It takes about a week to get this working. It’s really cool and you’re very proud of the result. Cost: just over $2,000.
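For readers who want to see the shape of Sal's week of work, here is a minimal sketch of the all-pairs step, written in Python rather than Sal's Perl and leaving out the MySQL storage. Everything specific in it is an assumption made for illustration: the function names, the scoring values (match 2, mismatch -1, gap open 5, gap extend 1), the two thresholds that decide what counts as a "big gap" or a "small mismatch", and the made-up toy sequences. It uses affine gap costs so that one long gap is not broken into many short ones; a production pipeline would more likely call an established alignment package with a proper substitution matrix.

```python
from itertools import combinations

NEG_INF = float("-inf")


def smith_waterman(a, b, match=2, mismatch=-1, gap_open=5, gap_extend=1):
    """Best local alignment of a and b with affine gap costs.

    Returns (score, aligned_a, aligned_b); '-' marks a gap position.
    gap_open is the cost of the first gap character, gap_extend of each
    further one. All scoring values are illustrative assumptions.
    """
    n, m = len(a), len(b)
    H = [[0] * (m + 1) for _ in range(n + 1)]        # best score ending at (i, j)
    E = [[NEG_INF] * (m + 1) for _ in range(n + 1)]  # ends with b[j-1] opposite a gap
    F = [[NEG_INF] * (m + 1) for _ in range(n + 1)]  # ends with a[i-1] opposite a gap
    best, bi, bj = 0, 0, 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            E[i][j] = max(H[i][j - 1] - gap_open, E[i][j - 1] - gap_extend)
            F[i][j] = max(H[i - 1][j] - gap_open, F[i - 1][j] - gap_extend)
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + sub, E[i][j], F[i][j])
            if H[i][j] > best:
                best, bi, bj = H[i][j], i, j

    # Trace back from the best cell, tracking which matrix we are in.
    out_a, out_b = [], []
    i, j, state = bi, bj, "H"
    while i > 0 and j > 0:
        if state == "H":
            if H[i][j] == 0:
                break
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if H[i][j] == H[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
            elif H[i][j] == F[i][j]:
                state = "F"
            else:
                state = "E"
        elif state == "F":
            out_a.append(a[i - 1])
            out_b.append("-")
            if F[i][j] != F[i - 1][j] - gap_extend:
                state = "H"  # the gap was opened at this row
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            if E[i][j] != E[i][j - 1] - gap_extend:
                state = "H"  # the gap was opened at this column
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


def classify_pair(aligned_a, aligned_b, min_splice_gap=10, max_variant_mismatches=3):
    """Label an alignment by its longest gap run and its mismatch count.

    Both thresholds are hypothetical; real data would need tuning.
    """
    longest_gap = run = mismatches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            run += 1
            longest_gap = max(longest_gap, run)
        else:
            run = 0
            if x != y:
                mismatches += 1
    if longest_gap >= min_splice_gap:
        return "possible alternative splicing"
    if 1 <= mismatches <= max_variant_mismatches:
        return "possible allelic variation"
    return "unclear"


def compare_all(sequences):
    """Align every pair in a {name: sequence} dict and label each pair."""
    labels = {}
    for (name1, seq1), (name2, seq2) in combinations(sequences.items(), 2):
        _, aligned1, aligned2 = smith_waterman(seq1, seq2)
        labels[(name1, name2)] = classify_pair(aligned1, aligned2)
    return labels


# Made-up toy "exons"; they do not come from any real gene.
EXON_1 = "ATGGCCAAGCTGACCTCTGA"
EXON_2 = "GGTTTCCAGGAACGTATTCGCACC"
EXON_3 = "TTCGAGTGCCGCATCAAGTA"
FULL = EXON_1 + EXON_2 + EXON_3
SPLICED = EXON_1 + EXON_3                 # middle exon skipped
ALLELE = FULL[:30] + "G" + FULL[31:]      # single-base substitution

if __name__ == "__main__":
    toy = {"full": FULL, "spliced": SPLICED, "allele": ALLELE}
    for pair, label in compare_all(toy).items():
        print(pair, label)
```

Note that a local alignment only bridges a skipped exon when the flanking matches outscore the gap penalty, so the scoring values and the gap threshold have to be chosen together. That is one reason a week, not an afternoon, is a believable estimate.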

The PI is very happy — of course, you didn’t tell him the cost — and wants more. “Please do it again with different Smith-Waterman parameters.” Piece of cake. An hour’s work. $50. “Can we try it with Clustal-W?” A little harder. A day’s work: about $500.

The boss is never satisfied. (He says that’s why he’s such a successful scientist.) “Is there a way I can run these programs from the Web?” You take the challenge and whip out an HTML interface in a couple of days: another $1,000.

“Professor Copycat loves your work and wonders if his postdoc can use our website, too.” Groan. Need separate databases for each user. Login screens. Authentication. Another two weeks: $4,000 to $5,000.

“Have you seen the Institut Pasteur’s SeWeR website? They have a whole pile of alignment programs. Copycat’s postdoc says he’ll switch to them unless you add those programs.” You’re not worried until you actually look at the site and see that they support 27 programs, many of which you’ve never heard of. Two months’ work: $20,000.

“And, you know, with all these programs it’s hard for us busy scientists to remember what analyses we’ve run, so please keep a history.” “Gee, the history is great, but is there any way I can re-run an analysis?” “It would be a lot more useful if I could change the analysis before re-running it. You know, so I can run the same analysis on different sequences from the database.”

And finally, the coup de grâce. “Everyone at the dysfunctional genomics meeting loved our website. Here’s a list of 25 people who want accounts. A lot of them are doing microarrays and a couple are doing genetics, so they may need some different tools.” You finally complain that it’s too much work for one person to do. “No problem,” he says, “Let’s start a company. We’ll make millions.”

A year of hard but exciting work later. “Sorry Sal. We have to let you go. The investors are too dumb to see the potential of your website and want to refocus the company on siRNA. I wish I could take you back to the lab, but I’ve already filled your old job and money is tight. Don’t worry, though, you’ll have a new job in no time. I hear bioinformatics is real hot.” Yeah, right, as hot as a fry cooker.

Where did poor Sal make her mistake? What should she have done to avoid joining the legion of ex-bioinformaticians? Send your answers, and I’ll summarize the advice in a future column.

— NG


Software Engineers Are Like the Salad Bar: So Many Flavors

One of my sources offered this wonderful description of the software people he sees in bioinformatics. He divides these people into three types.

First are hard-core software engineers who come from big, established companies like Boeing or American Express. This is a world where a year-long project is short. These folks expect to start a project with full, fixed, written specifications and then proceed through a structured design, development, test, and deployment methodology. They do not expect specifications to change much, since they come from a world where change-orders cost big bucks. These people have a hard time surviving in bioinformatics, where a year is an eternity and projects change constantly.

At the opposite extreme are ‘hackers’ who can deliver quick and dirty solutions to a range of problems. These folks can be of great practical value, but they lack the experience to understand the full scope of a project. They tend to produce fragile solutions that work today but are unable to evolve as needs change. You have to be careful managing such people: they look like heroes so long as you give them small enough tasks, but flounder and fail if you give them something that’s too complex to hack through. You should also be aware that you’re accumulating no value: the software developed last year, or even last month, generally has no value moving forward.

In between are the dream programmers: people who can deliver quick yet clean solutions that evolve gracefully as needs change. These folks usually have a lot of experience and are able to see both the specific problem that has to be solved today and the generalization of that problem that may have to be solved tomorrow. These are people who’ve seen it all before. People who can look at the problem you’ve just thrown at them and say, “Boy, that’s just like something company XYZ did five years ago.” Rather than, “Wow, you’re a genius, boss. No one has ever thought of something like this before.” This experience is hard to find. And, of course, some managers prefer to hire people who think they’re geniuses.

— NG

