Bioinformatics Study Finds 88 Protein Sequences with Vaccine Potential Conserved in West Nile Virus


This story originally ran on May 13.

A large-scale bioinformatics analysis of the West Nile virus proteome has identified 88 protein sequences that researchers said could provide the basis for the development of treatments for the disease, for which no vaccine or therapy currently exists.

Corresponding to 1,169 amino acids, or 34 percent of the total 3,430 amino acid composition of the WNV proteome, the sequences represent an "attractive target for the development of diagnostics, specific anti-viral compounds, and vaccine candidate targets," the authors said in a study published April 29 in PLoS One. "In short, they can be defined as multi-purpose immutable, functional, and immunological tags of WNV."

The study is the latest in research that the authors have conducted for a number of pathogenic viruses, including dengue fever and influenza. In coming months, they also expect to publish work on yellow fever and hepatitis A, and a "major paper" on T-cell epitopes of pathogens identified by human leukocyte antigen transgenic mice, corresponding author Thomas August told ProteoMonitor last week.

August is a professor in the department of Pharmacology and Molecular Sciences at the Johns Hopkins University School of Medicine.

"We're analyzing the whole proteome of a number of viral pathogens in six HLA transgenic mice [and] have identified [what we call] epitope peptide sequences that encode those HLA-restricted epitopes, and then we have further characterized those epitopes for where they are in nature," he said.

The work is directed mainly at vaccine discovery and development, he said, and "the critical thing is not which of these peptide epitopes are conserved in all West Nile viruses, but which ones of these are specific to West Nile viruses."

While WNV is thought to have emerged as a distinct virus about 1,000 years ago, the virus was first isolated in 1937 in the West Nile District of Uganda. The first reported cases of the disease in the US occurred in 1999 in the New York City area.

WNV remained a rare occurrence in the US between 1999 and 2001, when there were only 149 cases reported and 18 deaths, according to the Centers for Disease Control and Prevention. In 2002, though, the numbers skyrocketed to more than 4,000 reported cases and 284 fatalities in the US, and the following year, the number of reported cases peaked at nearly 10,000 while 264 deaths were reported.

Last year a total of 1,356 WNV cases and 44 deaths were recorded by the CDC.

No vaccine or therapy for WNV currently exists, partly due to its genetic diversity. According to the authors of the PLoS One study, five distinct genotypes have been identified, with each varying from the others by 20 percent to 25 percent across the entire genome.

Despite the diversity, the variability among the genotypes is "uneven across the viral genome, since mutations detrimental to viral fitness are restricted," the researchers said. As a result, though certain protein sites permit multiple mutations, sites that are crucial to structure and function have remained evolutionarily "robust and highly conserved."

In a separate study published April 24 as a Papers in Press in the online edition of Molecular and Cellular Proteomics, a team of researchers from the French Institut de Médecine Tropicale du Service de Santé des Armées [Institute of Tropical Medicine of the Army Military Health Service] said that "clinical manifestations of West Nile virus infection are diverse and their pathogenic mechanisms depend on complex virus-cell interactions."

In that study, the scientists used a mass spectrometry-based approach combined with 2D-DIGE to examine the effects of WNV infection on the African green monkey kidney cell line, or Vero. While Vero is commonly used for flavivirus isolation, propagation, and titration, no studies have focused on identifying Vero cellular proteins whose expression has been altered by WNV infection, the researchers said.

In a quantitative analysis, they detected 127 differentially expressed proteins: 68 up-regulated and 59 down-regulated. Of those, 93 were successfully identified, with most of them involved in transcription/translation processes, alteration of the cytoskeleton networks, cellular stress response, and apoptotic pathways, they wrote. The study, they added, provides "an understanding of how the host metabolism is modified by West Nile infection, and for identifying new potential targets for antiviral therapy."

The goal of the PLoS One study was to identify and characterize those protein regions that have shown high conservation "throughout the recorded history of the virus" because they could be potential targets for T-cell responses, which other studies have suggested are a mechanism to control and cure WNV, the researchers said.

Applying a bioinformatics approach, they set out to accomplish five tasks: examine the large number of WNV sequences available in public databases; analyze the conservation and variability of the sequences; identify all sequence fragments of WNV proteins that have been completely conserved in all known WNV phenotypes; examine the structure-function relationship and distribution in nature of pan-WNV sequences; and assess the immune relevance of pan-WNV sequences as potential T-cell epitopes "correlating immunoinformatic predictions to previously reported human WNV T-cell epitopes and to our current studies in the identification of human WNV T-cell epitopes by use of HLA transgenic mice," according to the researchers.

"Because we're interested in all West Nile viruses, we're interested in those sequences that are conserved," August said. "That approach is critical for rapidly mutating pathogens, in particular HIV, which is the most rapidly mutating pathogen we know, but also influenza virus, which [requires a new] vaccine every year because they develop those vaccines to target sequences, regions that are highly immunogenic."

Historical Conservation

The researchers began by retrieving sequence records from the Entrez database of the National Center for Biotechnology Information, searching the NCBI taxonomy browser for WNV. August and his colleagues applied a method called entropy analysis to a total of 2,746 complete and partial sequences extracted from Entrez and identified 88 completely conserved sequence fragments across the whole WNV proteome.

The length of the fragments ranged from nine to 29 amino acids, covering a total length of 1,169 amino acids, or 34 percent of the 3,430 amino acids of the complete WNV polyprotein, they wrote.

They chose that fragment length because those are the amino acids that would make the best targets for vaccine and drug development, August said.
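As a rough illustration of what an entropy-based conservation scan can look like, the sketch below computes Shannon entropy for each column of a set of aligned sequences and reports maximal runs of zero-entropy columns that are at least nine residues long. It is not the authors' pipeline: the function names, the gap handling, and the three toy sequences are assumptions made for this example, and the input is assumed to be already aligned to equal length.

```python
import math
from collections import Counter

def column_entropy(column):
    """Shannon entropy (in bits) of the residues seen at one alignment position."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def conserved_fragments(aligned, min_len=9):
    """Return (start, fragment) for each maximal run of completely conserved
    columns that is at least min_len residues long.

    Assumption: `aligned` holds pre-aligned sequences of equal length, with
    '-' marking gaps. Columns containing a gap are treated as not conserved.
    """
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must be aligned to the same length")
    fragments = []
    run_start = None
    # Iterate one position past the end so a run reaching the last column is closed.
    for i in range(length + 1):
        conserved = False
        if i < length:
            column = [s[i] for s in aligned]
            conserved = "-" not in column and column_entropy(column) == 0.0
        if conserved and run_start is None:
            run_start = i
        elif not conserved and run_start is not None:
            if i - run_start >= min_len:
                fragments.append((run_start, aligned[0][run_start:i]))
            run_start = None
    return fragments

# Made-up toy alignment (not WNV data): positions 11 and 15 vary.
toy_alignment = [
    "MKTAYIAKQRQISFVK",
    "MKTAYIAKQRQLSFVK",
    "MKTAYIAKQRQISFVR",
]
toy_result = conserved_fragments(toy_alignment)
```

On the toy alignment, only the first 11 positions form a conserved run long enough to pass the nine-residue threshold; the three-residue run after the first variable position is discarded.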

The nonstructural proteins NS3 and NS5 had the greatest number of completely conserved fragments, 25 in the former and 30 in the latter, they said. Of the 88 sequences found in all WNV phenotypes, 50 are associated with putative or known biological functions and/or structures. Because many of the identified critical biological and/or structural properties are associated with conserved sequences, "they are unlikely to significantly diverge in newly emerging WNV isolates in the future," according to the researchers.

The biological significance of the remaining 38 sequences was not determined.

The research team also determined that 67 of the 88 sequences overlapped at least nine amino acid sequences of as many as 68 other viruses of the family Flaviviridae, genus Flavivirus. Each of the 67 sequences matched between one and 67 Flavivirus species, including the Murray Valley encephalitis virus, which matched 49 of the 67 sequences, and the Japanese encephalitis (JEV) and Usutu viruses, which shared 47 and 41 sequences, respectively. Fifty-eight of the 67 sequences were from non-structural proteins.
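A minimum overlap of nine amino acids between a conserved fragment and another proteome can be checked by comparing sets of nonamers. The following is a minimal sketch of that idea, not the method used in the study; the helper names and both example strings are hypothetical.

```python
def kmers(seq, k=9):
    """Set of all length-k substrings of seq (empty if seq is shorter than k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def shared_kmers(fragment, other_seq, k=9):
    """Length-k substrings present in both sequences."""
    return kmers(fragment, k) & kmers(other_seq, k)

# Made-up strings for illustration only; they are not real viral sequences.
toy_fragment = "GDTAWDFGSVGG"
toy_other = "AAAGDTAWDFGSKKK"
toy_shared = shared_kmers(toy_fragment, toy_other)
```

Here the two toy strings share exactly one nonamer, `GDTAWDFGS`, so the fragment would count as overlapping the other sequence under a nine-residue criterion.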

A literature survey and search of the Immune Epitope Database found three pan-WNV sequences that matched three previously reported WNV T-cell epitopes immunogenic in humans, "having HLA restriction (when known) with both class I and II specificities," the researchers said.

They also noted that nine consecutive amino acids of five of the pan-WNV sequences are present in non-viral proteomes. While the overlap may be coincidental, they said that it is "likely to be statistically significant as the probability of randomly matching a nonamer is almost negligible."
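The intuition behind the "almost negligible" probability can be seen with simple arithmetic. The sketch below assumes, purely for illustration, that each of the 20 standard amino acids is equally likely and independent at every position, which real proteins do not satisfy; it does not reproduce any statistic from the study, and the function name and its parameters are hypothetical.

```python
ALPHABET_SIZE = 20  # standard amino acids
K = 9               # nonamer length

# Number of distinct nonamers and the chance that one specific nonamer
# appears at one specific position under the uniform, independent model.
possible_nonamers = ALPHABET_SIZE ** K   # 512,000,000,000
p_single_match = 1 / possible_nonamers   # roughly 2e-12

def expected_chance_matches(n_query_nonamers, n_database_positions):
    """Expected number of purely random nonamer matches under the toy model."""
    return n_query_nonamers * n_database_positions * p_single_match
```

Under this toy model, a single nonamer would need to be compared against about 512 billion database positions before one chance match is expected, which illustrates why a random exact match is considered improbable.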

And they said that there is evidence that many of the conserved sequences "are immunologically relevant in humans." Half of the sequences contained at least nine amino acids overlapping with a total of 54 peptides that are reported to be immunogenic in humans and/or HLA transgenic mice. Computational analysis also predicted putative T-cell epitopes for 12 major HLA class I supertypes and for class II DR supertype "with broad application to the immune responses of human population worldwide," August and his co-researchers said.

Some of the putative T-cell epitopes were predicted to be "promiscuous" to multiple HLA supertypes. The limited variability of WNV sequences, as they pertain to cellular immunity, may suggest a better chance at success in the development of a vaccine against West Nile virus, they added, than, for example, the Flavivirus, which is "highly" variable.

It was unclear whether August and his team would do any instrument-based work to validate the bioinformatics findings, but he said that he has no plans to do any mass spectrometry-based work.

Regardless, the conservation of the immunologically relevant sequences through the entire recorded WNV history, he and his team said, suggests they "will be valuable" as components of any peptide-specific vaccines or therapies "for sequence-specific diagnosis of a wide-range of Flavivirus infections and for studies of homologous sequences among other flaviviruses."

August added that any vaccine work in the future may ultimately depend on such sequences. "Our postulate is that if you want vaccines that will work against any representative of a given viral pathogen that comes out of these rapidly mutating viruses, you need to select conserved sequences," he said.
