First Step: Your Netbook Won't Cut It

Oct 22, 2009

Bioinformatics Zen's Michael Barton has a post on analyzing massive data sets -- "gigabytes of data with millions of data points" -- and his experience handling them as a biologist. He says his preference is "to use a database to store and format data as I think this makes projects easier to maintain compared with using just scripts." He also lists several common steps for tackling a project of this scale, including adding database indices, using a fast language interpreter, and deleting unnecessary data and analysis (among many others).
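One of the steps Barton mentions, adding database indices, is easy to demonstrate. The sketch below is not from his post; the table and column names are invented, and SQLite stands in for whatever database a real project would use. Without an index, a lookup by gene name forces a full table scan; after `CREATE INDEX`, the same query becomes a B-tree lookup.

```python
import sqlite3

# Hypothetical dataset: 100,000 measurement rows across 1,000 genes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurements (gene TEXT, value REAL)")
conn.executemany(
    "INSERT INTO measurements VALUES (?, ?)",
    ((f"gene{i % 1000}", float(i)) for i in range(100_000)),
)

# Before indexing, this query scans every row.
count = conn.execute(
    "SELECT COUNT(*) FROM measurements WHERE gene = ?", ("gene42",)
).fetchone()[0]

# Adding the index lets the database answer the same query
# with a B-tree lookup instead of a scan.
conn.execute("CREATE INDEX idx_gene ON measurements (gene)")

# EXPLAIN QUERY PLAN confirms SQLite now uses the index.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT COUNT(*) FROM measurements WHERE gene = ?",
    ("gene42",),
).fetchall()
print(count, plan)
```

On data sets in the gigabyte range, the difference between a scan and an indexed lookup is often the difference between minutes and milliseconds per query.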