Current Research Projects

Genetic Design Automation

Accelerating Synthetic Biology Discovery through Integrated Curation

Systems designed using synthetic biology have applications in many areas, including the environment, manufacturing, sensor development, defense, and medicine. Currently, however, progress in synthetic biology is impeded by the time required for literature studies and for replicating existing but poorly documented work. The Synthetic Biology Knowledge System (SBKS) project endeavored to address these challenges by integrating data from parts repositories with information extracted from the literature into a unified knowledge system. This form of post-hoc curation requires curators, who are separate from the original authors, to extract knowledge from manuscript and supplemental text files after publication. To handle large amounts of data, machines are used to scour free text, recognize key words, and work out their meaning from context. This tests the limits of named entity recognition and entity classification. It also leaves ambiguous entities that only the original authors could disambiguate. For example, "yeast" may refer to many different strains of yeast. The SBKS project also extracted sequences provided as supplemental information in publications. Even when such sequences are provided, they are typically poorly annotated, incomplete, and in formats that are not machine readable. Taken together, the SBKS project demonstrated that reconstructing this important design information through post-hoc curation is extremely noisy and error prone.
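The ambiguity problem described above can be illustrated with a minimal sketch. This is a plain dictionary lookup, not the named entity recognition approach used by SBKS, and the lexicon, the function name tag_organisms, and the candidate strain lists are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical lexicon: surface term -> organisms or strains it might denote.
# The entries are illustrative assumptions, not data from the SBKS project.
LEXICON = {
    "yeast": [
        "Saccharomyces cerevisiae BY4741",
        "Saccharomyces cerevisiae W303",
        "Pichia pastoris",
    ],
    "e. coli": ["Escherichia coli"],
}


def tag_organisms(text):
    """Find lexicon terms in free text and flag mentions with several candidates."""
    mentions = []
    for term, candidates in LEXICON.items():
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            mentions.append(
                {
                    "term": match.group(0),
                    "start": match.start(),
                    "candidates": candidates,
                    # More than one candidate means the text alone cannot
                    # decide which organism the authors meant.
                    "ambiguous": len(candidates) > 1,
                }
            )
    return sorted(mentions, key=lambda m: m["start"])


if __name__ == "__main__":
    sentence = "The circuit was moved from E. coli into yeast."
    for mention in tag_organisms(sentence):
        print(mention["term"], mention["ambiguous"], mention["candidates"])
```

A mention flagged as ambiguous is exactly the kind of entity that a post-hoc curator cannot resolve from the text alone and that the original authors could resolve easily.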

This project is funded by National Science Foundation Grant No. 2231864. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the funding agencies.

Genetic Design Automation

EBUGS (Engineering Biology for Underwater and Ground Sensors)

In response to DARPA’s BAA Tellus, the Draper Team, including partners from Boston University, MIT, NC State, Raytheon BBN, and CU Boulder, proposes EBUGS to revolutionize environmental monitoring. EBUGS aims to develop a software-guided methodology for rapidly designing, building, and validating microbial sensors capable of multiplex detection in complex environments. Unlike current hardware-based and manual sample collection methods, EBUGS leverages engineered microorganisms to sense and process multiple stimuli, producing diverse outputs beyond traditional fluorescent signals. The project introduces innovative approaches such as an end-to-end software tool for predicting genetic circuit performance, next-generation remote sensing for signal transduction, and DNA barcoding for recording sensing events, significantly advancing the state-of-the-art in microbial sensor technology.

This project is supported by DARPA HR0011-24-C-0423. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the funding agencies.

Former Research Projects

Genetic Design Automation

Center for Harnessing Microbiota from Military Environments (CHARMME)

The CHARMME project aims to identify consortia amenable to carrying synthetic functions, render them receptive to engineering, design genetic functions to work in never-before-tried species, and simulate environments to test engineered microbes. The project focuses on engineering soil extremophiles (Bacilli, Pseudomonads, Acinetobacter, etc.) and filamentous fungi/black yeast (Trichoderma, Aspergillus, Penicillium, Fusarium, Cladosporium, Aureobasidium, etc.). SynBioHub3 will link new computational tools with CAD/bioinformatics software already in use by Army researchers (e.g., Benchling or Geneious) and will connect to the BioMADE API. CHARMME collaborators include research groups from MIT, Caltech, Columbia University, and the University of Colorado Boulder.

This project is supported by the Army Research Office under Cooperative Agreement Number W911NF-22-2-0210. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the funding agencies.

Genetic Design Automation

FLUENT Verification Project

This research aims to advance probabilistic verification techniques for the rigorous design of dependable systems in synthetic biology and nanotechnology. The project has four major goals. First, scale up stochastic model checking with efficient and accurate state space truncation techniques. Second, investigate practical stochastic counterexample generation techniques and use them to improve the accuracy of the state reductions. Third, derive automated guidance mechanisms learned from stochastic counterexamples to improve the quality and efficiency of rare-event stochastic simulations. Finally, integrate the proposed framework within existing state-of-the-art stochastic model checking tools, PRISM and STORM, and evaluate the proposed methodology on a wide range of case studies derived from synthetic biology and nanotechnology applications. This is the first time these methods have been combined into a single methodology. Altogether, this research will improve the accuracy of analysis of infinite-state stochastic systems with rare-event properties.
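To make the idea of state space truncation concrete, here is a minimal sketch under stated assumptions. It is not the project's algorithm and does not use PRISM or STORM. It takes a simple birth-death process (constant production rate k, per-molecule degradation rate g), whose state space is infinite, keeps only the states 0 to n_states - 1, and redirects any transition that leaves this set into an absorbing sink state. The transient distribution is then computed by uniformization. The function name truncated_birth_death and all parameter names are hypothetical.

```python
import math


def truncated_birth_death(k, g, n_states, t, eps=1e-10, max_terms=100000):
    """Transient distribution at time t of a truncated birth-death CTMC.

    States 0..n_states-1 are kept; index n_states is an absorbing sink that
    collects probability leaving the truncated set. Starts in state 0.
    Returns (probabilities of the kept states, sink probability).
    """
    sink = n_states
    q = k + g * (n_states - 1)  # uniformization rate, at least every exit rate

    def step(p):
        # One step of the uniformized discrete-time chain P = I + Q / q.
        new = [0.0] * (n_states + 1)
        for i in range(n_states):
            if p[i] == 0.0:
                continue
            birth = k / q
            death = g * i / q
            new[i + 1] += p[i] * birth  # from the last kept state this is the sink
            if i > 0:
                new[i - 1] += p[i] * death
            new[i] += p[i] * (1.0 - birth - death)
        new[sink] += p[sink]  # the sink is absorbing
        return new

    p = [0.0] * (n_states + 1)
    p[0] = 1.0
    weight = math.exp(-q * t)  # Poisson(q * t) weight for zero jumps
    result = [weight * x for x in p]
    total = weight
    n = 0
    # Accumulate Poisson-weighted powers of P until the weights sum to ~1.
    while total < 1.0 - eps and n < max_terms:
        n += 1
        p = step(p)
        weight *= q * t / n
        result = [r + weight * x for r, x in zip(result, p)]
        total += weight
    return result[:n_states], result[sink]


if __name__ == "__main__":
    for size in (8, 20, 40):
        probs, lost = truncated_birth_death(k=10.0, g=1.0, n_states=size, t=1.0)
        print(size, lost)
```

In this sketch the sink probability is the probability mass that escaped the truncated state space by time t, so it gives a simple indication of how much accuracy the truncation costs. Enlarging the kept set shrinks that mass at the price of a larger model, which is the trade-off that truncation techniques try to manage.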

This project was supported by National Science Foundation Grant No. 1856740. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the funding agencies.

Genetic Design Automation

Synergistic Discovery and Design

SD2E began as the DARPA SD2 program's environment for enabling advanced scientific modeling and computation. The Synergistic Discovery and Design (SD2) program focuses on developing data-based approaches for accelerating scientific discovery and the design of robust models in new domains of research. More information on SD2 can be found here.

SD2E now serves SD2 and other related data-driven scientific programs. SD2E consists of this web portal; a web-based research workbench; RESTful APIs (Tapis) and a function-as-a-service platform (Abaco) that link computational applications and workflows with command-line access and control via the web portal; high-performance computing and data storage hardware; and the skilled personnel supporting scientists in their use of the computational resources. Additional tools incorporated into SD2E include JupyterHub, GitLab, Jenkins, Redash, and SynBioHub. Access control to data and software is maintained at multiple levels, allowing private user-only access during initial testing, followed by project-level shared access, and finally publishing capabilities.

SD2E is managed by the Texas Advanced Computing Center (TACC), where many of the world’s most powerful research computing resources are designed and operated. More information on TACC resources can be found here.

This project was supported by DARPA FA8750-17-C-0229. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the funding agencies.

Genetic Design Automation

Synthetic Biology Knowledge System

The scientific challenge for this project is to accelerate discovery and exploration of the synthetic biology design space. In particular, many parts used in synthetic biology come from or are initially tested in a simple bacterium, E. coli, but many potential applications in energy, agriculture, materials, and health require either different bacteria or higher-level organisms (yeast, for example). Currently, researchers use a trial-and-error approach because they cannot find reliable information about prior experiments with a given part of interest. This process simply cannot scale. To achieve scale, a wide range of data must be harnessed so that the likelihood of success can be estimated with confidence. The quantity of data and the exponential increase in publications generated by this field are creating a tipping point, but this data is not readily accessible to practitioners. To address this challenge, our multidisciplinary team of biological engineers, machine learning experts, data scientists, library scientists, and social scientists will build a knowledge system that integrates disparate data and publication repositories in order to deliver effective and efficient access to collectively available information. Doing so will enable expedited, knowledge-based synthetic biology design research.

This project was funded by National Science Foundation Grant No. 1939892. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the funding agencies.

Genetic Design Automation

SynBioHub 3 - An Interactive Genetic Design Repository

This research successfully advanced SynBioHub, an interactive genetic design repository, by enhancing usability, extensibility, and support for biological data standards. Key achievements include a more intuitive front-end, a modernized back-end with improved authentication and SBOL 3 support, refined plugin capabilities, and a robust testing framework. The project delivered a functional search system, streamlined user management, and improved data submission and visualization. The back-end now provides secure, granular access control and faceted search, while plugin support for visualization and downloads has been implemented. A comprehensive testing infrastructure ensures consistency between SynBioHub1 and SynBioHub3. While SynBioHub2 and SynBioHub3 are still being finalized, this work has laid a strong foundation for their completion. The developed methodologies and improvements will support the synthetic biology community in managing and sharing genetic designs more effectively.

This project was supported by NIST award 70NANB21H103. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the funding agencies.

Research Areas

Genetic Design Automation

Active research area

Analog Circuit Design and Verification

Former research area

Asynchronous Circuit Design and Verification

Former research area

Formal Verification of Cyberphysical Systems

Former research area
