Too Much to Read

Pedro Beltrao at the Public Rambling blog says there never seems to be enough time to keep up with all the literature researchers keep churning out. In 2009, 848,865 papers were added to PubMed, he says, which works out to something like 1.6 papers per minute. While there's definitely no scarcity of outlets in which to publish, is anyone even paying attention? "It is very literally impossible to keep up with the current literature without some sophisticated filtering system," Beltrao says. "With all of the imperfections of our current system of editorial control, subjective peer review, subjective impact evaluations, impact factors and so on, we must agree that we need a lot of help filtering through these many articles." Some argue that each individual researcher should be able to filter through the masses of literature to judge what is best for them to read, he adds, and while that's fine for papers within the narrow focus of his own research, what about the papers that describe new methods or data that may be of interest but fall technically outside that narrow focus? Maybe the journals themselves can help with that. "I would pay for tools that would recommend me papers to read," Beltrao says. "In my mind, this is where publishers of today should be making their money, in tools that connect the readers to what they want to read."

You probably know this already, but in case you don't: you can set up a search strategy in PubMed and have email alerts for matching articles and abstracts sent to your mailbox as they are published.
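For readers who would rather script this than rely on email, a similar "what's new for my search" check can be sketched against NCBI's E-utilities ESearch endpoint. This is not the PubMed email alert feature itself, only a rough analogue. The function name, the example search term, and the defaults below are assumptions chosen for illustration.

```python
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint (public search API for Entrez databases).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_recent_search_url(term, days=7, max_results=20):
    """Build an ESearch URL for PubMed records added in the last `days` days.

    `build_recent_search_url` is a hypothetical helper, not part of any library.
    """
    params = {
        "db": "pubmed",          # search the PubMed database
        "term": term,            # the saved search strategy
        "reldate": days,         # only records from the last N days
        "datetype": "edat",      # filter on the date the record entered PubMed
        "retmax": max_results,   # cap on the number of PMIDs returned
        "retmode": "json",       # ask for a JSON response
    }
    return ESEARCH_URL + "?" + urlencode(params)


# Example search term (made up for illustration).
url = build_recent_search_url("protein phosphorylation AND kinase")
print(url)
```

Fetching that URL on a schedule (for example with `urllib.request.urlopen`) and comparing the returned PMIDs against the previous run would give a basic new-paper feed. Anyone doing this regularly should check NCBI's usage guidelines on request rates.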

Scientific papers have been the main repositories of experimental data until the last decade, but the sequencing of genomes has resulted in a paradigm shift that the biomedical research community is only starting to come to grips with. The dizzying proliferation of scientific papers and even new journals is really driven by the need to receive individual recognition and funding.

Data repositories such as UniProt and many of those available through the NCBI are powerful resources with structural information, but there are few databases with quantitative measurements of diverse parameters (e.g. protein expression, protein phosphorylation, enzyme activities, cell proliferation, apoptosis, etc.) in response to diverse perturbations (e.g. hormones, compounds, RNAi, mutations). The scientific community should be working to centralize such data (including negative data if the experimental approaches are sound) so that they are more accessible for analysis. The KiNET DataBase (www.kinet.ca) provided by Kinexus Bioinformatics Corporation is an early attempt to develop such a quantitative on-line protein expression resource with data from our company. However, major government-funded agencies should really be creating such databases, making them easy for researchers to contribute to based on standardized protocols. Individual contributions of experimental data to such databases could be easily tracked and quantified, which could still permit the recognition that is important for career advancement.
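As a rough illustration of how contributions to such a database might be recorded and tallied, here is a minimal Python sketch. Every field name, the `Measurement` class, the `count_contributions` helper, and all the values are hypothetical placeholders. None of it is taken from KiNET or any existing repository, and the numbers are not real measurements.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """One quantitative result submitted to a shared repository (hypothetical schema)."""
    contributor: str               # who submitted the data, used for credit
    target: str                    # e.g. the protein that was measured
    parameter: str                 # e.g. "protein expression", "enzyme activity"
    perturbation: str              # e.g. "hormone", "compound", "mutation"
    value: float                   # the measured quantity
    units: str                     # units for `value`
    protocol_id: str               # reference to a standardized protocol
    negative_result: bool = False  # True when no effect was observed


def count_contributions(records):
    """Return the number of submitted measurements per contributor."""
    return Counter(r.contributor for r in records)


# Placeholder records, invented purely to exercise the code.
records = [
    Measurement("lab_a", "ERK1", "protein phosphorylation", "hormone",
                2.4, "fold change", "P-001"),
    Measurement("lab_a", "ERK1", "protein expression", "compound",
                1.0, "fold change", "P-002", negative_result=True),
    Measurement("lab_b", "AKT1", "enzyme activity", "mutation",
                0.3, "fold change", "P-001"),
]

print(count_contributions(records))
```

Flagging negative results as ordinary records, rather than leaving them out, means they count toward a contributor's tally in the same way as positive findings.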

Future scientific publications should be more like blogs with links to the specific data in the repository databases, but provide more in-depth critical analyses of these data and also document the commentaries of peers. This would provide much better peer review, even if it is largely post-publication. If a scientific paper had faults, these would be better revealed and retractions would be possible. To avoid embarrassment, researchers would probably be a lot more careful about what they publish. It would be possible to track the scientific contributions not only of the authors of the publications, but also of the readers who take the time to add helpful and insightful comments.

Scientific reviews that summarize a large body of work from many investigators should be developed with a Wikipedia approach so that they are always updated. Such reviews could be available on websites that also feature the experimental databases. Individual contributions to such collective works could also be tracked and quantified.

Publication in traditional scientific journals is expensive, slow and often not readily accessible to those without deep financial resources. Search engines such as Google Scholar only provide fragments of text from hundreds if not thousands of keyword hits, the vast majority of which are irrelevant and time-consuming to sift through. PubMed provides citation information for scientific papers and at best short abstracts. Retrieval of specific information from an individual scientific paper is difficult even if one is able to obtain a copy of the complete paper. As we have entered the second decade of this century, perhaps it is time to completely re-think and re-engineer how we disseminate the data from scientific research so that they are more efficiently found and utilized.