Newly Created Cytogenomics Array Group Blossoms into International Data-Sharing Effort


A growing number of cytogeneticists are joining a new entity called the Cytogenomics Array Group to share data related to abnormal microarray findings.

CAGdb, the group's database of copy number variants, is the main interface for members who have been enticed to join CAG by the resource's "user-friendly and intuitive interface, search functionality, data output, and community," according to Hutton Kearney, director of the cytogenetics laboratory at the Fullerton Genetics Center at Mission Health in Asheville, NC.

Kearney told BioArray News this week that CAG was originally called the Carolinas Array Group and began as a data-sharing effort between labs in North and South Carolina. In recent months, though, the effort has been renamed as other labs, including a few outside the US, asked to join. CAG is now planning its first-ever meeting, to be held later this month in Asheville.

"We never planned for this to be an international resource, so there was never a plan to go big, but nearly every colleague that heard about our resource has shown interest," said Kearney.

CAGdb is designed to house each laboratory's abnormal cases that are freely shared in a de-identified fashion with all participating laboratories. CNV details, interpretation comments, parental and other follow-up data, and clinical features, "all fully stripped of patient identifiers," are collected and shared for each case in the database, according to CAG. In addition, the database provides a "useful laboratory tool to track experience with in-house microarray cases, network with colleagues, and to share experience with rare findings."

"We started building this site as a way for our own laboratory to organize and track our experience with abnormal cases," said Kearney. "Most laboratory information systems do not capture microarray data in an intuitive and accessible format," she said. Kearney acknowledged that other commercial software packages exist, but said that these are not affordable for most labs.

"We built our own solution, and because it was web-based, it was trivial to invite other laboratories to share the resource," said Kearney. "Several of us were already e-mailing and phoning one another on a weekly, sometimes daily basis to look for shared experience with rare cases."

Kearney's lab took the initiative in building the database, which is now hosted by the Mission Healthcare Foundation, a nonprofit wing of western North Carolina's Mission Health.

"We built an interface that was platform independent and accessible to any clinical laboratory," said Kearney, who credited Mission Health with being "incredibly supportive" and said the foundation is "committed to providing a widely available free resource."

Now that the site is available, and labs are signing up, Kearney said the "real proof" that CAG is a success will be in whether labs actually contribute data.

"Everyone wants to utilize the data from other labs, but labs also have to be willing to invest the energy to add cases to the site," Kearney said. "We are doing everything we can think of to make this investment worthwhile for laboratories, both by making it as efficient as possible as well as giving the labs tools to organize and interpret their cases."

Shared Goals

Access to clinical data on cases with abnormal microarray findings continues to be paramount for labs offering array-based cytogenetic testing. For years, cytogeneticists have worked together, notably through the International Standards for Cytogenomic Arrays consortium, to share data in order to define variants of currently unknown clinical significance as pathogenic or benign (BAN 7/6/2010).

ISCA maintains its own database that contains whole-genome array data from a subset of the ISCA clinical diagnostic laboratories. That database is hosted as a collection within the National Center for Biotechnology Information's ClinVar, a public archive of reports of the relationships among human variations and phenotypes.

ISCA and partners from labs that use next-generation sequencing are also developing another database within ClinVar that will consist of curated human genomic variation data (BAN 4/3/2012).

According to Kearney, CAGdb is different from ClinVar in that it "collects information a bit differently and more comprehensively." Also, unlike ClinVar, CAGdb is not a public resource and access is only granted to users from participating clinical laboratories. These attributes "set it apart from other resources, though the goals are shared across all groups — understanding the clinical impact of copy number variation," she said.

While CAGdb could be seen as a competitive resource to the ISCA collection, Kearney said that CAG is encouraging its members to share their data with the ISCA study collection as well.

"Data dispersed across many different resources makes it cumbersome to find all relevant cases," Kearney said. "Additionally, ISCA and NCBI have more far-reaching research goals that will be fueled by increased data in their collections," she said. CAG will also reformat and track participating laboratories' data for submission to NCBI upon request, Kearney added.

Kearney said that clinical laboratories and researchers are "desperate" to curate and access information connecting rare genotype to phenotype. "We are making great progress," she said, "but much genomic variation remains to be understood."

Vendor Support

Navigating the various databases that serve clinical cytogeneticists will be one of the topics of CAG's meeting July 22-23. Scheduled speakers include the Hospital for Sick Children's Stephen Scherer, Harvard Partners' Charles Lee, and the Medical University of South Carolina's Daynna Wolff.

The meeting is being sponsored by Affymetrix, Illumina, Agilent Technologies, and Oxford Gene Technology. Baylor College of Medicine, the Mission Healthcare Foundation, and the Medical Innovation and Commercialization Alliance of Western North Carolina are also supporting the meeting. The North Carolina Biotechnology Center is co-hosting the event.

In addition to sponsorship, Kearney said that vendors have been helping CAG in other ways. For instance, Affymetrix provided funding to make data from its ChAS software immediately uploadable to CAGdb. Other microarray vendors have expressed interest in funding similar functionality, she said.

While data sharing will be one topic addressed at the meeting, Kearney said that its ultimate goal is to have an "in-depth discussion of best practices" in clinical microarray use. "This is one part tutorial, one part discussion and debate," she said. Kearney noted that CAG originally planned a regional workshop, but that the number of registrants "quickly outgrew all expectation." This level of interest is evidence that there is a "great need for this type of exchange," she noted.

Looking forward, CAG aims to improve its database "as much as possible," and is "working closely with ISCA and NCBI to dovetail [its] resources." CAG is also soliciting feedback from its advisory board and database users and will be implementing improvements on an ongoing basis.

"The better our database functions, the more use it will see," said Kearney. "The more cases are contributed, the better we will understand the impact of copy number variation on human disease," she added. "The ultimate goal is to have such a clear understanding of copy number variation that the site is no longer needed."
