Researchers in the synthetic biology community are developing a new language intended to standardize the way scientists describe and exchange biological parts, modules, and systems.
A draft specification of the language, called the Synthetic Biology Open Language or SBOL, was released in October, and its developers plan to finalize the specification this month.
A number of researchers have already agreed to implement SBOL in their software tools, and bioengineering firm DNA2.0 announced this week that it has integrated the language into its Gene Designer software for gene design and assembly.
Claes Gustafsson, DNA2.0's chief operating officer, noted that the new language is an important step toward building an infrastructure for synthetic biology.
"The frustration right now is that [the field lacks] the infrastructure and tools and common standardized languages" that are available in disciplines like electrical or mechanical engineering, which have "predefined tools and standards" that make design processes easier, he told BioInform.
At present, "none of that exists in biological engineering," but SBOL is "a fantastic first step in that direction," he said.
Researchers began developing SBOL in 2008 with a grant from Microsoft. The original intent was to create a means for different software packages to "communicate designs to each other" in a common language, Cesar Rodriguez, a synthetic biology software designer for startup Genome Compiler and one of the founders and editors of SBOL, told BioInform this week.
At its core, SBOL is a "well-structured GenBank file," Rodriguez said. However, it goes a step further by providing both structural and functional information for the genetic parts that users select, he said.
"Not only do we want to send over a sequence with annotations but we also want to send over some information about the phenotype, or the behavior, or the function," he said. "That’s what distinguishes SBOL from other standards."
Gustafsson added that SBOL not only defines genetic parts, but also finds "the specific sequence that is associated with [each part] and the constraints of that sequence" in addition to metadata.
SBOL ensures that information "flows consistently and has a standardized format between the different users and the different types of applications that you are using," he told BioInform. This is in contrast to tools like GenBank, "which are more like diagnostic tools in the sense that they are not really associating specific features with specific units," he said.
For example, when a user selects a genetic part such as a promoter, SBOL attaches a number that includes extra information, such as the location of the sequence within a particular plasmid as well as fluorescent activity, and transfers this information to the design software, Rodriguez explained. All of this occurs automatically, without requiring any user input, he added. Users can also obtain the structure of the plasmid in question if they choose to do so, he said.
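As a rough illustration of the kind of record Rodriguez describes, the Python sketch below bundles a part's sequence with its location in a plasmid and a fluorescence measurement. It is not drawn from the SBOL specification: the class name `PartRecord`, its fields, the placeholder sequence, and the numeric values are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class PartRecord:
    """Hypothetical record for one genetic part (not the SBOL data model)."""
    name: str                      # human-readable label
    part_type: str                 # e.g. "promoter"
    sequence: str                  # DNA sequence of the part
    plasmid: Optional[str] = None  # plasmid the part sits in, if known
    start: Optional[int] = None    # 1-based start position in the plasmid
    end: Optional[int] = None      # 1-based inclusive end position
    # Functional data, keyed by an assumed measurement name
    measurements: Dict[str, float] = field(default_factory=dict)

    def location_is_consistent(self) -> bool:
        """Check that the stated location spans exactly the sequence length."""
        if self.start is None or self.end is None:
            return True  # nothing to check without a location
        return self.end - self.start + 1 == len(self.sequence)


# Placeholder sequence and values, invented for illustration only
PLACEHOLDER_SEQUENCE = "TTGACAGCTAGCTCAGTCCTAGGTATAATGCTAGC"

promoter = PartRecord(
    name="example_promoter",
    part_type="promoter",
    sequence=PLACEHOLDER_SEQUENCE,
    plasmid="example_plasmid",
    start=101,
    end=101 + len(PLACEHOLDER_SEQUENCE) - 1,
    measurements={"fluorescence_au": 1250.0},
)
```

Keeping the structural fields (sequence, location) and the functional field (measurements) in one record mirrors the distinction Rodriguez draws between SBOL and a plain annotated sequence.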
Once the parts have been selected, they are transmitted to tools like Gene Designer where they are converted into virtual sequences and ultimately synthesized.
Among SBOL's current specifications is a visual language that lets users create diagrams of systems they plan to design, Rodriguez said.
The language includes an extension called SBOL Visual that "standardizes the actual symbols that people would use to describe a genetic module," he said.
"Those symbols map to a language within SBOL that says, 'I have a symbol for promoter,'" he explained. "It’s a standard term or a controlled vocabulary that can be transmitted from one computer to another."
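One way to picture a controlled vocabulary that travels between computers is a fixed set of terms that the sender writes out and the receiver validates. The Python sketch below assumes a four-term vocabulary and two helper functions, `encode_design` and `decode_design`; these names and terms are illustrative only and are not taken from SBOL or SBOL Visual.

```python
from enum import Enum
from typing import List


class PartTerm(Enum):
    """Illustrative controlled vocabulary; the terms are assumptions."""
    PROMOTER = "promoter"
    RIBOSOME_BINDING_SITE = "ribosome_binding_site"
    CODING_SEQUENCE = "coding_sequence"
    TERMINATOR = "terminator"


def encode_design(parts: List[PartTerm]) -> List[str]:
    """Turn a design into plain strings that any program can receive."""
    return [part.value for part in parts]


def decode_design(terms: List[str]) -> List[PartTerm]:
    """Rebuild the design, rejecting any term outside the vocabulary.

    PartTerm(term) raises ValueError for an unknown string.
    """
    return [PartTerm(term) for term in terms]


design = [
    PartTerm.PROMOTER,
    PartTerm.RIBOSOME_BINDING_SITE,
    PartTerm.CODING_SEQUENCE,
    PartTerm.TERMINATOR,
]
wire_format = encode_design(design)
received = decode_design(wire_format)
```

Because the receiver accepts only known terms, a misspelled label such as "promotor" fails loudly instead of being silently misread.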
Putting SBOL to the Test
The first version of the language was released in October, and although there is still a lot of work to be done, SBOL's developers are asking the synthetic biology community to try it out.
"SBOL is an open community so anybody can join ... they just have to indicate that they are going to use the standard in their software tool," Rodriguez said.
Currently, 45 researchers have signed on to implement SBOL in their software tools, he said.
Commercially, in addition to Gene Designer, SBOL has been implemented in a similar software tool dubbed Genome Compiler that will be offered by a startup of the same name.
Rodriguez, who designs software for the newly minted company, explained that Genome Compiler focuses on developing "computer-aided design systems for genetic engineering."
On the academic side, several genetic sequence design software developers are planning to adopt SBOL into their systems. These include the Virginia Bioinformatics Institute's GenoCAD, which is used to design sequences based on grammatical models of genetic parts; and Device Editor, the graphical user interface for the Joint BioEnergy Institute’s J5 software (BI 8/13/2010).
Other tools include the Standard Biological Parts Knowledgebase, a semantic web resource that now uses SBOL to represent standard biological parts from the Registry of Standard Biological Parts at the Massachusetts Institute of Technology; and the University of Washington's TinkerCell, a synthetic biology CAD tool for drawing and quantitative simulation.
Moving forward, Rodriguez and his colleagues are planning a number of updates to SBOL, he said, noting that the version of the language implemented in Gene Designer and Genome Compiler is a prototype and hasn’t been formally "ratified."
Planned changes include a performance extension that organizes data and information about the performance of a DNA component as well as a script extension that will provide controlled English for specifying genetic modules, genetic systems, and genome modifications. Other extensions being considered are for modeling, assembly, and host/chassis context.
Also under development are an Extensible Markup Language (XML) serialization of SBOL's core model and a Resource Description Framework (RDF)/OWL implementation of the model.
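To give a sense of what an XML serialization involves, the Python sketch below writes a single part to XML and reads it back using the standard library's `xml.etree.ElementTree`. The element and attribute names (`part`, `name`, `type`, `sequence`) are assumptions chosen for this example; the actual SBOL schema was still under development and is not reproduced here.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional


def part_to_xml(name: str, part_type: str, sequence: str) -> str:
    """Serialize one part to an XML string (hypothetical element names)."""
    root = ET.Element("part", attrib={"name": name, "type": part_type})
    seq = ET.SubElement(root, "sequence")
    seq.text = sequence
    # encoding="unicode" makes tostring return str rather than bytes
    return ET.tostring(root, encoding="unicode")


def part_from_xml(xml_text: str) -> Dict[str, Optional[str]]:
    """Parse the XML string back into a plain dictionary."""
    root = ET.fromstring(xml_text)
    return {
        "name": root.get("name"),
        "type": root.get("type"),
        "sequence": root.findtext("sequence"),
    }


# Placeholder values, invented for illustration only
xml_text = part_to_xml("example_promoter", "promoter", "TTGACAGCTAGC")
round_trip = part_from_xml(xml_text)
```

A round trip like this one, where the parsed record matches what was written, is the basic property any exchange format between design tools has to satisfy.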