High-quality Zebrafish Reference Genome Set to Inform Human Gene Research, Studies Suggest

NEW YORK (GenomeWeb News) – A pair of papers published online today in Nature is providing information on the latest genetic and genomic resources for those working with zebrafish.

In the first of these studies, an international team reported on initial findings from the first high-quality reference genome assembly for zebrafish. For instance, the group found that the genome contains tens of thousands of apparent protein-coding genes, including orthologs for the majority of annotated human genes.

"This genome will allow researchers to understand how our genes work and how genetic variants can cause disease in ways that cannot be easily studied in humans or other organisms," Wellcome Trust Sanger Institute researcher Derek Stemple, that study's corresponding author, said in a statement.

The zebrafish has found favor as a vertebrate model organism, particularly for studying gene function as it relates to both disease states and normal developmental processes. The animal's relatively straightforward genetics and transparent embryos have helped in that respect, researchers noted, as have its overall similarities to other animals, including humans.

While a draft version of the zebrafish genome has been publicly available for more than a decade, efforts have continued to bring it up to the high-quality reference standard already achieved for the human genome and for that of mouse, another major model organism.

To that end, researchers used a clone-by-clone approach based on multiple large-insert clones to sequence genomic DNA from a zebrafish reference strain called Tübingen. These clones were systematically sequenced and assembled with the help of an existing high-density zebrafish meiotic map known as the Sanger AB Tübingen map, or SATmap.

The team's analysis of the resulting 1.4-billion-base genome assembly unearthed many so-called type II transposable elements, along with a preponderance of repeat sequences.

And although it contains relatively few pseudogene sequences compared to the human genome, the researchers reported, the zebrafish reference is home to a large set of predicted protein-coding genes: some 26,206, according to current counts.

With the exception of zebrafish-specific genes, which tended to cluster on chromosome 4, many of the coding sequences in the zebrafish genome also appear to correspond with genes described in humans.

When they directly compared the human and zebrafish references, for instance, investigators determined that about 70 percent of protein-coding genes in the human genome have one or more corresponding orthologs in zebrafish.

That estimate was even higher when the team specifically considered human genes that have been implicated in disease: Some 84 percent of those human genes had one or more zebrafish orthologs.

"The vast majority of human genes have counterparts in the zebrafish, especially genes related to human disease," co-author Jane Rogers said in a statement.

"By modeling these human disease genes in zebrafish, we hope that resources worldwide will produce important biological information regarding the function of these genes and possibly find new targets for drug development," added Rogers, who was affiliated with the Sanger Institute and the Genome Analysis Centre in Norwich at the time the study was performed.

In a related Nature article, also co-led by Sanger's Stemple, researchers outlined an exome enrichment and chemical mutagenesis-based method that they have started using to systematically assess zebrafish gene function.

With the help of the zebrafish reference genome, they noted, investigators have been able to more efficiently target coding sequences for enrichment.

And barcoded sequencing of pooled exome samples from zebrafish exposed to mutagenic chemicals makes it relatively straightforward to characterize the genetic alterations behind a trait or condition of interest.

"Our challenge is to develop a comprehensive, functional understanding of all human genes as quickly as possible," Stemple said in a statement.

"Our systematic analysis of zebrafish gene function will advance understanding of human disease," he continued, noting that the latest zebrafish resource "will help researchers and clinicians find the gene variations responsible for our inheritance of, and susceptibility to, diseases."

The group has already assessed mutations in more than 38 percent of the zebrafish's suite of protein-coding genes, including 3,188 of the almost 5,500 zebrafish orthologs of disease-related human genes.

The researchers also demonstrated that they could use the newly available zebrafish genetic and genomic resources to tease apart the phenotypic effects associated with around 1,000 embryogenesis-related alleles.

Going forward, this study's authors said the same strategy is expected to aid studies of other biological pathways and processes as well. "All mutant alleles and data are available to the community," they wrote, "and our phenotyping scheme is adaptable to phenotypic analysis beyond embryogenesis."
