Publications

Decoding Genetic Circuit Failures: Analyzing Static and Dynamic Failures in Genetic Circuitry

Synthetic biology resides at the nexus of engineering and biology, employing diverse ap- proaches to engineer biological systems. These systems can be as simple as DNA sequences, bio- chemical reactions, or more abstracted through control theory or digital logic, among other ways. Similar to other engineering disciplines, for real-world applications, the designed systems must ex- hibit robustness and adaptability to environmental changes beyond controlled laboratory settings. This dissertation focuses on genetic constructs viewed specifically as digital logic genetic circuits, examining their implementation and failure behavior. It aims to elucidate and analyze various failure modes and proposes analytical methods to enhance genetic circuit robustness. This work defines genetic circuit failure, where deviations from expected output are deemed as unexpected and faulty. Such deviations may stem from failures at the cellular level or from flaws in the circuit’s logic implementation or Boolean function. Subsequently, this dissertation develops computational methods to predict circuit behavior, employing diverse analysis techniques such as ordinary differ- ential equation analysis, stochastic simulation algorithms, and stochastic model verification. These methodologies enable the prediction of the likelihood of failure occurrence. Furthermore, this dis- sertation compares different computational modeling techniques to assess the effort required for genetic circuit analysis. Finally, experimental validation is provided for a predicted circuit failure, demonstrating the practical application of the proposed methodologies.

PhageBox: An Open Source Digital Microfluidic Extension With Applications for Phage Discovery

Objective Recent advancements demonstrate the significant role of digital microfluidics in automating laboratory work with DNA and on-site viral testing. However, since commercially available instruments are limited to droplet manipulation, our work addresses the need for accelerated integration of other components, such as temperature control, that can expand the application domain. Methods We developed PhageBox—an accessible device that can be used as a biochip extension. At hardware level, PhageBox integrates temperature and electromagnetic control modules. At software level, PhageBox is controlled by embedded software containing a unique model for bio-protocol programming, and a graphical user interface for visual device feedback and operation. Results To evaluate PhageBox’s efficacy for biomedical applications, we performed functional testing. Similarly, we validated the temperature control using thermography, obtaining a range of ± 0.2 ∘C . The electromagnets produced a magnetic force of 15 milliTesla, demonstrating precise immobilization of magnetic beads. We show the potential of PhageBox for bacteriophage research through three initial protocols a universal framework for PCR, T7 bacteriophage restriction enzyme digestion, and concentrating ϕX174 RF genomic DNA. Conclusion Our work presents an open-source hardware and software extension for digital microfluidics devices. This extension integrates temperature and electromagnetic modules, demonstrating efficacy in biomedical applications and potential for bacteriophage research. Significance We developed PhageBox to be accessible the components are off-the-shelf at a low cost ( ≤ $ 200), and the hardware designs and software code are open-source. With the long aim of ensuring reproducibility and accelerating collaboration, we also provide a DIY-build document.

Evaluating the Contribution of Model Complexity in Predicting Robustness in Synthetic Genetic Circuits

The design–build–test–learn workflow is pivotal in synthetic biology as it seeks to broaden access to diverse levels of expertise and enhance circuit complexity through recent advancements in automation. The design of complex circuits depends on developing precise models and parameter values for predicting the circuit performance and noise resilience. However, obtaining characterized parameters under diverse experimental conditions is a significant challenge, often requiring substantial time, funding, and expertise. This work compares five computational models of three different genetic circuit implementations of the same logic function to evaluate their relative predictive capabilities. The primary focus is on determining whether simpler models can yield conclusions similar to those of more complex ones and whether certain models offer greater analytical benefits. These models explore the influence of noise, parametrization, and model complexity on predictions of synthetic circuit performance through simulation. The findings suggest that when developing a new circuit without characterized parts or an existing design, any model can effectively predict the optimal implementation by facilitating qualitative comparison of designs’ failure probabilities (e.g., higher or lower). However, when characterized parts are available and accurate quantitative differences in failure probabilities are desired, employing a more precise model with characterized parts becomes necessary, albeit requiring additional effort.

Robustness and reproducibility of simple and complex synthetic logic circuit designs using a DBTL loop

Computational tools addressing various components of design–build–test–learn (DBTL) loops for the construction of synthetic genetic networks exist but do not generally cover the entire DBTL loop. This manuscript introduces an end-to-end sequence of tools that together form a DBTL loop called Design Assemble Round Trip (DART). DART provides rational selection and refinement of genetic parts to construct and test a circuit. Computational support for experimental process, metadata management, standardized data collection and reproducible data analysis is provided via the previously published Round Trip (RT) test–learn loop. The primary focus of this work is on the Design Assemble (DA) part of the tool chain, which improves on previous techniques by screening up to thousands of network topologies for robust performance using a novel robustness score derived from dynamical behavior based on circuit topology only. In addition, novel experimental support software is introduced for the assembly of genetic circuits. A complete design-through-analysis sequence is presented using several OR and NOR circuit designs, with and without structural redundancy, that are implemented in budding yeast. The execution of DART tested the predictions of the design tools, specifically with regard to robust and reproducible performance under different experimental conditions. The data analysis depended on a novel application of machine learning techniques to segment bimodal flow cytometry distributions. Evidence is presented that, in some cases, a more complex build may impart more robustness and reproducibility across experimental conditions.

Investigating Genetic Circuit Failures

Synthetic biology is an engineering discipline in which biological components are assembled to form devices with user-defined functions. As a nascent discipline, genetic circuit design is reserved only for experienced researchers with an in-depth knowledge of biology. This work aims to alleviate some of these constraints by developing software to facilitate genetic circuit design and analysis so that more researchers can participate in this thriving discipline and help elucidate the causes of circuit failures. Firstly, an automatic dynamic model generator can be implemented to predict a circuit’s behavior between steady states and determine the amount of time needed to reach such steady states. Moreover, the analysis of the predicted dynamic behavior will help the designers understand the risks of applying specific input changes and decide whether the risk is critical for the designed systems’ intended purposes. Extrinsic and intrinsic noise can contribute to the observed output variability of a clonal population. Therefore, to account for a genetic circuits’ stochastic behavior, this work aims to develop stochastic modeling using extrinsic and intrinsic noise contributions that can help infer glitch probabilities and elucidate the causes of circuit failure. All the methodologies developed in this work will serve the overacting aim of redesigning genetic circuits to avoid circuit failures. Dynamic ODE modeling will predict glitching behavior and the time to reach said states; stochastic modeling will be used to predict glitch propensities; hazard-preserving transformations will be used to avoid solvable hazards. Facilitated dynamic modeling of genetic circuits would be an instrumental technique for synthetic biologists, especially if it can be accompanied by a circuit design automation tool, such as those proposed in this work. This would help automation in synthetic biology and provide a way to debug circuit designs before construction and compare predictions with experimental data once the synthesized circuit is implemented, saving time, effort, and money. This project aims to expand such capabilities to help researchers through the design process with the development of automated modeling, logic synthesis, hazard identification, and genetic circuit redesign.

BioSimulators: a central registry of simulation engines and services for recommending specific tools

Computational models have great potential to accelerate bioscience, bioengineering, and medicine. However, it remains challenging to reproduce and reuse simulations, in part, because the numerous formats and methods for simulating various subsystems and scales remain siloed by different software tools. For example, each tool must be executed through a distinct interface. To help investigators find and use simulation tools, we developed BioSimulators (https://biosimulators.org), a central registry of the capabilities of simulation tools and consistent Python, command-line and containerized interfaces to each version of each tool. The foundation of BioSimulators is standards, such as CellML, SBML, SED-ML and the COMBINE archive format, and validation tools for simulation projects and simulation tools that ensure these standards are used consistently. To help modelers find tools for particular projects, we have also used the registry to develop recommendation services. We anticipate that BioSimulators will help modelers exchange, reproduce, and combine simulations.

Genetic circuit design automation with Cello 2.0

Cells interact with their environment, communicate among themselves, track time and make decisions through functions controlled by natural regulatory genetic circuits consisting of interacting biological components. Synthetic programmable circuits used in therapeutics and other applications can be automatically designed by computer-aided tools. The Cello software designs the DNA sequences for programmable circuits based on a high-level software description and a library of characterized DNA parts representing Boolean logic gates. This process allows for design specification reuse, modular DNA part library curation and formalized circuit transformations based on experimental data. This protocol describes Cello 2.0, a freely available cross-platform software written in Java. Cello 2.0 enables flexible descriptions of the logic gates’ structure and their mathematical models representing dynamic behavior, new formal rules for describing the placement of gates in a genome, a new graphical user interface, support for Verilog 2005 syntax and a connection to the SynBioHub parts repository software environment. Collectively, these features expand Cello’s capabilities beyond Escherichia coli plasmids to new organisms and broader genetic contexts, including the genome. Designing circuits with Cello 2.0 produces an abstract Boolean network from a Verilog file, assigns biological parts to each node in the Boolean network, constructs a DNA sequence and generates highly structured and annotated sequence representations suitable for downstream processing and fabrication, respectively. The result is a sequence implementing the specified Boolean function in the organism and predictions of circuit performance. Depending on the size of the design space and users’ expertise, jobs may take minutes or hours to complete.

Round Trip: An Automated Pipeline for Experimental Design, Execution, and Analysis

Synthetic biology is a complex discipline that involves creating detailed, purpose-built designs from genetic parts. This process is often phrased as a Design-Build-Test-Learn loop, where iterative design improvements can be made, implemented, measured, and analyzed. Automation can potentially improve both the end-to-end duration of the process and the utility of data produced by the process. One of the most important considerations for the development of effective automation and quality data is a rigorous description of implicit knowledge encoded as a formal knowledge representation. The development of knowledge representation for the process poses a number of challenges, including developing effective human–machine interfaces, protecting against and repairing user error, providing flexibility for terminological mismatches, and supporting extensibility to new experimental types. We address these challenges with the DARPA SD2 Round Trip software architecture. The Round Trip is an open architecture that automates many of the key steps in the Test and Learn phases of a Design-Build-Test-Learn loop for high-throughput laboratory science. The primary contribution of the Round Trip is to assist with and otherwise automate metadata creation, curation, standardization, and linkage with experimental data. The Round Trip’s focus on metadata supports fast, automated, and replicable analysis of experiments as well as experimental situational awareness and experimental interpretability. We highlight the major software components and data representations that enable the Round Trip to speed up the design and analysis of experiments by 2 orders of magnitude over prior ad hoc methods. These contributions support a number of experimental protocols and experimental types, demonstrating the Round Trip’s breadth and extensibility. We describe both an illustrative use case using the Round Trip for an on-the-loop experimental campaign and overall contributions to reducing experimental analysis time and increasing data product volume in the SD2 program.

Promotion of Data Reuse in Synthetic Biology

Synthetic biology is a movement to standardize genetic engineering and make it more reproducible and accessible by using functional descriptions of desired circuits. Such descriptions can then be converted to genetic designs via genetic design automation tools. Subsequently, the genetic designs can be used to generate models for in silico experimentation using automatic model generators. Both of these technologies rely on access to libraries of genetic part information encoded in standard, machine-readable ways. The Synthetic Biology Open Language (SBOL) can be used together with SynBioHub (a genetic part repository) to encode and store the information. However, the use of SynBioHub for the storage and reuse of parts is still very limited. This is due to insufficient metadata (making it difficult to find parts or judge their usefulness) and the effort required to submit parts to the repository. This dissertation aims to decrease the barriers to part reuse and thus enable a more automated synthetic biology workflow. Hence, an integrated curation workflow is proposed based on the contributions of the dissertation. The contributions are: a proposed SBOL Data Content Standard, tools for working with genetic parts in spreadsheets, a framework to modularly extend the SynBioHub part repository, and the lessons learned from the analysis and curation of data from existing genetic data repositories.

Curation Principles Derived from the Analysis of the SBOL iGEM Data Set

As an engineering endeavor, synthetic biology requires effective sharing of genetic design information that can be reused in the construction of new designs. While there are a number of large community repositories of design information, curation of this information has been limited. This in turn limits the ways in which design information can be put to use. The aim of this work was to improve this situation by creating a curated library of parts from the International Genetically Engineered Machines (iGEM) registry data set. To this end, an analysis of the Synthetic Biology Open Language (SBOL) version of the iGEM registry was carried out using four different approaches—simple statistics, SnapGene autoannotation, SYNBICT autoannotation, and expert analysis—the results of which are presented herein. Key challenges encountered include the use of free text, insufficient part provenance, part duplication, lack of part removal, and insufficient continuous curation. On the basis of these analyses, the focus has shifted from the creation of a curated iGEM part library to instead the extraction of a set of lessons, which are presented here. These lessons can be exploited to facilitate the creation and curation of other part libraries using a simpler and less labor intensive process.

Synthetic Biology Open Language Visual (SBOL Visual) Version 3.0

People who engineer biological organisms often find it useful to draw diagrams in order to communicate both the structure of the nucleic acid sequences that they are engineering and the functional relationships between sequence features and other molecular species. Some typical practices and conventions have begun to emerge for such diagrams. SBOL Visual aims to organize and systematize such conventions in order to produce a coherent language for expressing the structure and function of genetic designs. This document details version 3.0 of SBOL Visual, a new major revision of the standard. The major difference between SBOL Visual 3 and SBOL Visual 2 is that diagrams and glyphs are defined with respect to the SBOL 3 data model rather than the SBOL 2 data model. A byproduct of this change is that the use of dashed undirected lines for subsystem mappings has been removed, pending future determination on how to represent general SBOL 3 constraints; in the interim, this annotation can still be used as an annotation. Finally, deprecated material has been removed from collection of glyphs: the deprecated insulator'' glyph and macromolecule’’ alternative glyphs have been removed, as have the deprecated BioPAX alternatives to SBO terms.

Quantitative Characterization of Recombinase-Based Digitizer Circuits Enables Predictable Amplification of Biological Signals

Quantitative Characterization of Recombinase-Based Digitizer Circuits Enables Predictable Amplification of Biological Signals

Many synthetic gene circuits are restricted to single-use applications or require iterative refinement for incorporation into complex systems. One example is the recombinase-based digitizer circuit, which has been used to improve weak or leaky biological signals. Here we present a workflow to quantitatively define digitizer performance and predict responses to different input signals. Using a combination of signal-to-noise ratio (SNR), area under a receiver operating characteristic curve (AUC), and fold change (FC), we evaluate three small-molecule inducible digitizer designs demonstrating FC up to 508x and SNR up to 3.77 dB. To study their behavior further and improve modularity, we develop a mixed phenotypic/mechanistic model capable of predicting digitizer configurations that amplify a synNotch cell-to-cell communication signal ($Δ$ SNR up to 2.8 dB). We hope the metrics and modeling approaches here will facilitate incorporation of these digitizers into other systems while providing an improved workflow for gene circuit characterization. Kiwimagi & Letendre et al. present a workflow to quantitatively define recombinase-based digitizer and predict responses to different input signals. With a mechanistic/phenotypic model that can predict circuit performance, they generate a synthetic cell-cell communication device that amplifies a synNotch output signal.

Synthetic Biology Open Language Visual (SBOL Visual) Version 2.3

People who are engineering biological organisms often find it useful to communicate in diagrams, both about the structure of the nucleic acid sequences that they are engineering and about the functional relationships between sequence features and other molecular species. Some typical practices and conventions have begun to emerge for such diagrams. The Synthetic Biology Open Language Visual (SBOL Visual) has been developed as a standard for organizing and systematizing such conventions in order to produce a coherent language for expressing the structure and function of genetic designs. This document details version 2.3 of SBOL Visual, which builds on the prior SBOL Visual 2.2 in several ways. First, the specification now includes higher-level interactions with interactions,'' such as an inducer molecule stimulating a repression interaction. Second, binding with a nucleic acid backbone can be shown by overlapping glyphs, as with other molecular complexes. Finally, a new unspecified interaction’’ glyph is added for visualizing interactions whose nature is unknown, the insulator'' glyph is deprecated in favor of a new inert DNA spacer’’ glyph, and the polypeptide region glyph is recommended for showing 2A sequences.

Genetic Circuit Dynamics: Hazard and Glitch Analysis

Genetic Circuit Dynamics: Hazard and Glitch Analysis

Multiple input changes can cause unwanted switching variations, or glitches, in the output of genetic combinational circuits. These glitches can have drastic effects if the output of the circuit causes irreversible changes within or with other cells such as a cascade of responses, apoptosis, or the release of a pharmaceutical in an off-target tissue. Therefore, avoiding unwanted variation of a circuit’s output can be crucial for the safe operation of a genetic circuit. This paper investigates what causes unwanted switching variations in combinational genetic circuits using hazard analysis and a new dynamic model generator. The analysis is done in previously built and modeled genetic circuits with known glitching behavior. The dynamic models generated not only predict the same steady states as previous models but can also predict the unwanted switching variations that have been observed experimentally. Multiple input changes may cause glitches due to propagation delays within the circuit. Modifying the circuit’s layout to alter these delays may change the likelihood of certain glitches, but it cannot eliminate the possibility that the glitch may occur. In other words, function hazards cannot be eliminated. Instead, they must be avoided by restricting the allowed input changes to the system. Logic hazards, on the other hand, can be avoided using hazard-free logic synthesis. This paper demonstrates this by showing how a circuit designed using a popular genetic design automation tool can be redesigned to eliminate logic hazards.

SBML Level 3: an extensible format for the exchange and reuse of biological models

Systems biology has experienced dramatic growth in the number, size, and complexity of computational models. To reproduce simulation results and reuse models, researchers must exchange unambiguous model descriptions. We review the latest edition of the Systems Biology Markup Language (SBML), a format designed for this purpose. A community of modelers and software authors developed SBML Level 3 over the past decade. Its modular form consists of a core suited to representing reaction-based models and packages that extend the core with features suited to other model types including constraint-based models, reaction-diffusion models, logical network models, and rule-based models. The format leverages two decades of SBML and a rich software ecosystem that transformed how systems biologists build and interact with models. More recently, the rise of multiscale models of whole cells and organs, and new data sources such as single-cell measurements and live imaging, has precipitated new ways of integrating data with models. We provide our perspectives on the challenges presented by these developments and how SBML Level 3 provides the foundation needed to support this evolution.

The first 10 years of the international coordination network for standards in systems and synthetic biology (COMBINE)

This paper presents a report on outcomes of the 10th Computational Modeling in Biology Network (COMBINE) meeting that was held in Heidelberg, Germany, in July of 2019. The annual event brings together researchers, biocurators and software engineers to present recent results and discuss future work in the area of standards for systems and synthetic biology. The COMBINE initiative coordinates the development of various community standards and formats for computational models in the life sciences. Over the past 10 years, COMBINE has brought together standard communities that have further developed and harmonized their standards for better interoperability of models and data. COMBINE 2019 was co-located with a stakeholder workshop of the European EU-STANDS4PM initiative that aims at harmonized data and model standardization for in silico models in the field of personalized medicine, as well as with the FAIRDOM PALs meeting to discuss findable, accessible, interoperable and reusable (FAIR) data sharing. This report briefly describes the work discussed in invited and contributed talks as well as during breakout sessions. It also highlights recent advancements in data, model, and annotation standardization efforts. Finally, this report concludes with some challenges and opportunities that this community will face during the next 10 years.

Synthetic biology open language (SBOL) version 3.0.0

Synthetic biology builds upon genetics, molecular biology, and metabolic engineering by applying engineering principles to the design of biological systems. When designing a synthetic system, synthetic biologists need to exchange information about multiple types of molecules, the intended behavior of the system, and actual experimental measurements. The Synthetic Biology Open Language (SBOL) has been developed as a standard to support the specification and exchange of biological design information in synthetic biology, following an open community process involving both wet bench scientists and dry scientific modelers and software developers, across academia, industry, and other institutions. This document describes SBOL 3.0.0, which condenses and simplifies previous versions of SBOL based on experiences in deployment across a variety of scientific and industrial settings. In particular, SBOL 3.0.0, (1) separates sequence features from part/sub-part relationships, (2) renames Component Definition/Component to Component/Sub-Component, (3) merges Component and Module classes, (4) ensures consistency between data model and ontology terms, (5) extends the means to define and reference Sub-Components, (6) refines requirements on object URIs, (7) enables graph-based serialization, (8) moves Systems Biology Ontology (SBO) for Component types, (9) makes all sequence associations explicit, (10) makes interfaces explicit, (11) generalizes Sequence Constraints into a general structural Constraint class, and (12) expands the set of allowed constraints.

Synthetic biology open language visual (SBOL visual) version 2.2

People who are engineering biological organisms often find it useful to communicate in diagrams, both about the structure of the nucleic acid sequences that they are engineering and about the functional relationships between sequence features and other molecular species. Some typical practices and conventions have begun to emerge for such diagrams. The Synthetic Biology Open Language Visual (SBOL Visual) has been developed as a standard for organizing and systematizing such conventions in order to produce a coherent language for expressing the structure and function of genetic designs. This document details version 2.2 of SBOL Visual, which builds on the prior SBOL Visual 2.1 in several ways. First, the grounding of molecular species glyphs is changed from BioPAX to SBO, aligning with the use of SBO terms for interaction glyphs. Second, new glyphs are added for proteins, introns, and polypeptide regions (e. g., protein domains), the prior recommended macromolecule glyph is deprecated in favor of its alternative, and small polygons are introduced as alternative glyphs for simple chemicals.

Improving Authentication and Authorization on SynBioHub

Synthetic Biology is an emerging discipline which uses engineering principles to shape biological behavior. The Synthetic Biology Open Language (SBOL) is a standard for describing biological constructs which enables engineering workflows that previous formats, such as GenBank and FASTA, could not. SynBioHub is an online repository for storing and sharing genetic designs. It uses the SBOL standard and an RDF triplestore to store designs, as well as supporting file attachment and external links. Several research efforts in synthetic biology have adopted SynBioHub and SBOL. These research efforts have revealed key areas for improvement in SynBioHub. Improving user sharing and permissioning is a primary target for improvement. The existing system has basic support for sharing with different privilege levels. Unfortunately, its architecture makes it difficult to extend and improve. Due to this difficulty, many features which would make SynBioHub more collaborative have not been implemented. This work aims to make synthetic biology more collaborative by providing a better foundation for experimentation and innovation in user sharing and permissioning. The existing authentication and authorization (auth) system is not centralized; it mixes concerns between page rendering and permissions management. The new system separates auth into its own software layer, separate entirely from page rendering. This new layer is itself split into separate authentication and authorization steps. New feature development and refinement will be made easier by the strong separations between the different components of SynBioHub.

Automated Generation of Dynamic Models for Genetic Regulatory Networks

Synthetic biology is an engineering discipline in which biological components are assembled to form devices with user-defined functions. As in any engineering discipline, modeling is a big part of the design process, since it helps to predict, control, and debug systems in an efficient manner. Systems biology has always been concerned with dynamic models, and a recent increase in high-throughput of experimental data has made it essential to develop dynamic models that can be used for an iterative learning process in a design/build/test workflow. In this thesis work, an automated model generator is created to automatically generate dynamic models for genetic regulatory networks, implemented in the genetic design automation tool, iBioSim. This automated model generator uses parameters stored at an online parts repository and encodes the mathematical models it generates using Systems Biology Markup Language. The automated model generator is then used to model and simulate genetic circuits created with the design environment referred to as Cello. The simulation of the mathematical models produces a dynamical response prediction of each of the circuits, which is unavailable with steady-state modeling. Some of these dynamical responses present unexpected behavior. Using the dynamic models generated with the automatic model generator of this work, an analysis of the predicted behaviors yielded insight into the underlying biology phenomena that cause the observed glitching behavior of these circuits. The last chapter of this thesis is focused mainly on future enhancements to the automated model generator of this work to produce more accurate and precise models not only for genetic regulatory networks in textitEscherichia coli, but any organism where parametrization exists as proposed in this thesis work. It also explores different analysis that could be implemented into the automated model generator of this work, in order to expand the assessment done on genetic circuits.

Asynchronous Genetic Design

Synthetic biology is applying engineering concepts to biological processes to enable genetic circuit designs, among other applications. As more biological parts are being discovered, it is vital to have an automated procedure to allow complex circuit designs to be built. Technology mapping is a set of procedures that maps biological components to a design specification. Current technology mapping frameworks for genetic circuits are used to design combinational circuits. This dissertation illustrates the process of building an automated workflow for a technology mapping framework to design synchronous sequential genetic circuits. An automated process to create a library of gates for logic and memory circuits is described to construct gates from DNA parts retrieved from a standardize data repository. Genetic constraints address what parts can be mapped to the design specification when the gates and designs are constructed. The proposed automaton workflow begins with a specification provided in a formal design language, such as Verilog. The input design specification is converted into a genetic regulatory network represented using the Synthetic Biology Open Language (SBOL). The network is decomposed into base functions (NOR gates, inverters, and genetic toggle switches) and matching and covering algorithms are performed to produce the output design. The output design is converted to the Systems Biology Markup Language (SBML) data format for testing and simulation. The outcome of this work provides the synthetic biology community insights on how asynchronous sequential circuit designs can be built through an automated procedure to perform technology mapping from libraries composed of logic gates and memory circuits.

SBOL-OWL: An Ontological Approach for Formal and Semantic Representation of Synthetic Biology Information

Standard representation of data is key for the reproducibility of designs in synthetic biology. The Synthetic Biology Open Language (SBOL) has already emerged as a data standard to represent information about genetic circuits, and it is based on capturing data using graphs. The language provides the syntax using a free text document that is accessible to humans only. This paper describes SBOL-OWL, an ontology for a machine understandable definition of SBOL. This ontology acts as a semantic layer for genetic circuit designs. As a result, computational tools can understand the meaning of design entities in addition to parsing structured SBOL data. SBOL-OWL not only describes how genetic circuits can be constructed computationally, it also facilitates the use of several existing Semantic Web tools for synthetic biology. This paper demonstrates some of these features, for example, to validate designs and check for inconsistencies. Through the use of SBOL-OWL, queries can be simplified and become more intuitive. Moreover, existing reasoners can be used to infer information about genetic circuit designs that cannot be directly retrieved using existing querying mechanisms. This ontological representation of the SBOL standard provides a new perspective to the verification, representation, and querying of information about genetic circuits and is important to incorporate complex design information via the integration of biological ontologies.

A Computational Workflow for the Automated Generation of Models of Genetic Designs

Computational models are essential to engineer predictable biological systems and to scale up this process for complex systems. Computational modeling often requires expert knowledge and data to build models. Clearly, manual creation of models is not scalable for large designs. Despite several automated model construction approaches, computational methodologies to bridge knowledge in design repositories and the process of creating computational models have still not been established. This paper describes a workflow for automatic generation of computational models of genetic circuits from data stored in design repositories using existing standards. This workflow leverages the software tool SBOLDesigner to build structural models that are then enriched by the Virtual Parts Repository API using Systems Biology Open Language (SBOL) data fetched from the SynBioHub design repository. The iBioSim software tool is then utilized to convert this SBOL description into a computational model encoded using the Systems Biology Markup Language (SBML). Finally, this SBML model can be simulated using a variety of methods. This workflow provides synthetic biologists with easy to use tools to create predictable biological systems, hiding away the complexity of building computational models. This approach can further be incorporated into other computational workflows for design automation.

Stochastic Analysis of an Genetic Sensor

The Systems Biology Markup Language (SBML): Language Specification for Level 3 Version 2 Core Release 2

Computational models can help researchers to interpret data, understand biological functions, and make quantitative predictions. The Systems Biology Markup Language (SBML) is a file format for representing computational models in a declarative form that different software systems can exchange. SBML is oriented towards describing biological processes of the sort common in research on a number of topics, including metabolic pathways, cell signaling pathways, and many others. By supporting SBML as an input/output format, different tools can all operate on an identical representation of a model, removing opportunities for translation errors and assuring a common starting point for analyses and simulations. This document provides the specification for Release 2 of Version 2 of SBML Level 3 Core. The specification defines the data structures prescribed by SBML as well as their encoding in XML, the eXtensible Markup Language. Release 2 corrects some errors and clarifies some ambiguities discovered in Release 1. This specification also defines validation rules that determine the validity of an SBML document, and provides many examples of models in SBML form. Other materials and software are available from the SBML project website at http://sbml.org/ .

Synthetic Biology Open Language Visual (SBOL Visual) Version 2.1

People who are engineering biological organisms often find it useful to communicate in diagrams, both about the structure of the nucleic acid sequences that they are engineering and about the functional relationships between sequence features and other molecular species . Some typical practices and conventions have begun to emerge for such diagrams. The Synthetic Biology Open Language Visual (SBOL Visual) has been developed as a standard for organizing and systematizing such conventions in order to produce a coherent language for expressing the structure and function of genetic designs. This document details version 2.1 of SBOL Visual, which builds on the prior SBOL Visual 2.0 standard by expanding diagram syntax to include methods for showing modular structure and mappings between elements of a system, interactions arrows that can split or join (with the glyph at the split or join indicating either superposition or a chemical process), and adding new glyphs for indicating genomic context (e.g., integration into a plasmid or genome) and for stop codons.

Synthetic Biology Open Language (SBOL) Version 2.3

Synthetic biology builds upon the techniques and successes of genetics, molecular biology, and metabolic engineering by applying engineering principles to the design of biological systems. The field still faces substantial challenges, including long development times, high rates of failure, and poor reproducibility. One method to ameliorate these problems is to improve the exchange of information about designed systems between laboratories. The synthetic biology open language (SBOL) has been developed as a standard to support the specification and exchange of biological design information in synthetic biology, filling a need not satisfied by other pre-existing standards. This document details version 2.3.0 of SBOL, which builds upon version 2.2.0 published in last year’s JIB Standards in Systems Biology special issue. In particular, SBOL 2.3.0 includes means of succinctly representing sequence modifications, such as insertion, deletion, and replacement, an extension to support organization and attachment of experimental data derived from designs, and an extension for describing numerical parameters of design elements. The new version also includes specifying types of synthetic biology activities, unambiguous locations for sequences with multiple encodings, refinement of a number of validation rules, improved figures and examples, and clarification on a number of issues related to the use of external ontology terms.

Scalable and Reproducible Modeling and Simulation for Heterogeneous Populations

Advancements in the systems and synthetic biology fields have proved that biology can be engineered. The development of computer-aided design (CAD) tools has contributed to advancements in these fields. Mathematical modeling and simulation methods are important assets of CAD tools that are frequently applied to the systems and synthetic biology fields. Modeling and simulation methods are used to understand or predict the behavior of a biological system being studied. However, many modeling efforts in those fields face a reproducibility problem, where many published models are not reproducible. In order to address such issue, standards have been created for the representation of biological models. A major advantage of standards is that they enable model reuse and sharing. The leading standard representation of biological systems is the Systems Biology Markup Language (SBML). The SBML standard is used to describe how biological processes affect and modify biological entities in a system. Such standard has been widely used to describe biochemical networks, cell signaling path, and gene regulation, among others. Unfortunately, not all models use SBML since there are many biological systems that SBML is incapable of representing efficiently, such as heterogeneous cellular populations. This dissertation explores extensions to SBML for the efficient representation of large heterogeneous cellular populations and simulation methods that can simulate such complex models efficiently. Since cellular populations are inherently hierarchical, this dissertation proposes an efficient simulator for hierarchical SBML models. Since the hierarchical structure is preserved in the proposed simulator, the hierarchical simulator is a perfect fit for handling hybrid models. However, no one has explored the coupling of different modeling formalisms within the same SBML model. Hence, this dissertation proposes a methodology that can be used to describe hybrid models. Such methodology is demonstrated by using dynamic flux balance analysis (DFBA) models as examples and such models can be successfully exchanged between tools. This dissertation also discusses extensions to the SBML data model to support regular structures in the form of arrays. Arrays is well-suited for population models since population models use large regular structures. Another application of arrays is microsimulation of disease models, where a population of individuals with unique characteristics need to be model. With the proposed arrays extension, simulators need to scale in order to handle the increase complexity that the arrays extension introduces. Hence, this dissertation also proposes an efficient simulation method that takes advantages of arrays.

SBOLExplorer: Data Infrastructure and Data Mining for Genetic Design Repositories

Biology is a very noisy field. Experiments are difficult to reproduce, the mechanisms behind life are not well understood, and data that we do obtain is difficult to make sense of. Much like traditional engineering fields where engineers draw from a library of reusable parts for their designs, experimental and synthetic biologists have designed biological circuits by drawing from a library of genetic constructs. However, these so-called genetic parts are poorly understood and are therefore limited in their usefulness. Additionally, there are hundreds of thousands of parts and sequences that have been either created or discovered. For my thesis, I filter through this biological noise to provide genetic circuit designers a powerful way to search for and access the genetic parts that are useful to them. This thesis is focused on creating SBOLExplorer, a system that is used to provide intuitive search within the SynBioHub genetic design repository. SynBioHub integrates genetic construct data from various sources and transforms and stores this data in a standardized data model. By tackling the intricate data mining and data infrastructure problems associated with large-scale semi-structured and noisy data, the search, transformation, and storage of data in genetic design repositories can be enhanced. In particular, this thesis focuses on improving the usability of genetic part repositories’ search capabilities. By clustering SynBioHub’s genetic parts into many derived collections, duplicate parts are merged. From there, a graph analysis algorithm is used to rank collections of parts by popularity and usefulness. Finally, data infrastructure challenges relating to indexing, storing, serving, and distributed search are solved. The end goal of SBOLExplorer is to integrate these findings into SynBioHub and other genetic design repositories’ data representation, search functionality, and data infrastructure.

Approximation techniques for stochastic analysis of biological systems

This book presents outstanding contributions in an exciting, new and multidisciplinary research area: the application of formal, automated reasoning techniques to analyse complex models in systems biology and systems medicine. Automated reasoning is a field of computer science devoted to the development of algorithms that yield trustworthy answers, providing a basis of sound logical reasoning. For example, in the semiconductor industry formal verification is instrumental to ensuring that chip designs are free of defects (or "bugs"). Over the past 15 years, systems biology and systems medicine have been introduced in an attempt to understand the enormous complexity of life from a computational point of view. This has generated a wealth of new knowledge in the form of computational models, whose staggering complexity makes manual analysis methods infeasible. Sound, trusted, and automated means of analysing the models are thus required in order to be able to trust their conclusions. Above all, this is crucial to engineering safe biomedical devices and to reducing our reliance on wet-lab experiments and clinical trials, which will in turn produce lower economic and societal costs. Some examples of the questions addressed here include: Can we automatically adjust medications for patients with multiple chronic conditions? Can we verify that an artificial pancreas system delivers insulin in a way that ensures Type 1 diabetic patients never suffer from hyperglycaemia or hypoglycaemia? And lastly, can we predict what kind of mutations a cancer cell is likely to undergo? This book brings together leading researchers from a number of highly interdisciplinary areas, including: - Parameter inference from time series - Model selection - Network structure identification - Machine learning - Systems medicine - Hypothesis generation from experimental data - Systems biology, systems medicine, and digital pathology - Verification of biomedical devices "This book presents a comprehensive spectrum of model-focused analysis techniques for biological systems …an essential resource for tracking the developments of a fast moving field that promises to revolutionize biology and medicine by the automated analysis of models and data." Prof Luca Cardelli FRS, University of Oxford

Synthetic Biology Open Language (SBOL) Version 2.2.0

Synthetic biology builds upon the techniques and successes of genetics, molecular biology, and metabolic engineering by applying engineering principles to the design of biological systems. The field still faces substantial challenges, including long development times, high rates of failure, and poor reproducibility. One method to ameliorate these problems would be to improve the exchange of information about designed systems between laboratories. The synthetic biology open language (SBOL) has been developed as a standard to support the specification and exchange of biological design information in synthetic biology, filling a need not satisfied by other pre-existing standards. This document details version 2.2.0 of SBOL that builds upon version 2.1.0 published in last year’s JIB special issue. In particular, SBOL 2.2.0 includes improved description and validation rules for genetic design provenance, an extension to support combinatorial genetic designs, a new class to add non-SBOL data as attachments, a new class for genetic design implementations, and a description of a methodology to describe the entire design-build-test-learn cycle within the SBOL data model.

SBOLDesigner: A Hierarchical Genetic Design Editor

Synthetic biology, as a field of research, applies electrical engineering, systems biology, and bioinformatics to genetic circuit design. Software tools are leveraged to provide rapid iteration through the design space, and data standards are used to encode and characterize complicated genetic circuit designs. Specifically, genetic design automation workflows centered around standards, abstraction, and decoupling are utilized to help experimental biologists accomplish their goals. Unfortunately, the software tools that support this workflow are lacking in some critical features such as combinatorial design and support for an extended range of glyphs. These inadequacies hinder the adoption of data standards, and therefore hurt the reproducibility of experiments and results. Necessary details of experiments are not recorded, and the resulting conclusions are therefore not trusted. SBOLDesigner, a sequence-based computer aided design tool, addresses these issues. This thesis will focus on SBOLDesigner’s implementation of combinatorial design and the SBOL Visual 2 standard using the SBOL 2 data model. Using SBOLDesigner, experimental biologists are able to visualize their genetic circuits unambiguously and express the full state of their design robustly. This results in higher productivity when designing genetic circuits, more comprehensive circuit descriptions, and most importantly, enhanced reproducibility in the field of synthetic biology.

The Systems Biology Markup Language (SBML): Language Specification for Level 3 Version 2 Core

Computational models can help researchers to interpret data, understand biological functions, and make quantitative predictions. The Systems Biology Markup Language (SBML) is a file format for representing computational models in a declarative form that different software systems can exchange. SBML is oriented towards describing biological processes of the sort common in research on a number of topics, including metabolic pathways, cell signaling pathways, and many others. By supporting SBML as an input/output format, different tools can all operate on an identical representation of a model, removing opportunities for translation errors and assuring a common starting point for analyses and simulations. This document provides the specification for Version 2 of SBML Level 3 Core. The specification defines the data structures prescribed by SBML, their encoding in XML (the eXtensible Markup Language), validation rules that determine the validity of an SBML document, and examples of models in SBML form. The design of Version 2 differs from Version 1 principally in allowing new MathML constructs, making more child elements optional, and adding identifiers to all SBML elements instead of only selected elements. Other materials and software are available from the SBML project website at http://sbml.org/.

SynBioHub: A Standards-Enabled Design Repository for Synthetic Biology

The SynBioHub repository (https://synbiohub.org) is an open-source software project that facilitates the sharing of information about engineered biological systems. SynBioHub provides computational access for software and data integration, and a graphical user interface that enables users to search for and share designs in a Web browser. By connecting to relevant repositories (e.g., the iGEM repository, JBEI ICE, and other instances of SynBioHub), the software allows users to browse, upload, and download data in various standard formats, regardless of their location or representation. SynBioHub also provides a central reference point for other resources to link to, delivering design information in a standardized format using the Synthetic Biology Open Language (SBOL). The adoption and use of SynBioHub, a community-driven effort, has the potential to overcome the reproducibility challenge across laboratories by helping to address the current lack of information about published designs.

Synthetic Biology Open Language (SBOL) Version 2.1.0

Synthetic biology builds upon the techniques and successes of genetics, molecular biology, and metabolic engineering by applying engineering principles to the design of biological systems. The field still faces substantial challenges, including long development times, high rates of failure, and poor reproducibility. One method to ameliorate these problems would be to improve the exchange of information about designed systems between laboratories. The Synthetic Biology Open Language (SBOL) has been developed as a standard to support the specification and exchange of biological design information in synthetic biology, filling a need not satisfied by other pre-existing standards. This document details version 2.1 of SBOL that builds upon version 2.0 published in last year’s JIB special issue. In particular, SBOL 2.1 includes improved rules for what constitutes a valid SBOL document, new role fields to simplify the expression of sequence features and how components are used in context, and new best practices descriptions to improve the exchange of basic sequence topology information and the description of genetic design provenance, as well as miscellaneous other minor improvements.

Toward Community Standards and Software for Whole-Cell Modeling

Objective: Whole-cell (WC) modeling is a promising tool for biological research, bioengineering, and medicine. However, substantial work remains to create accurate comprehensive models of complex cells. Methods: We organized the 2015 Whole-Cell Modeling Summer School to teach WC modeling and evaluate the need for new WC modeling standards and software by recoding a recently published WC model in the Systems Biology Markup Language. Results: Our analysis revealed several challenges to representing WC models using the current standards. Conclusion: We, therefore, propose several new WC modeling standards, software, and databases. Significance: We anticipate that these new standards and software will enable more comprehensive models.

Verification methodologies for fault-tolerant network-on-chip systems

Over the last decade, cyber-physical systems (CPSs) have seen significant applications in many safety-critical areas, such as autonomous automotive systems, automatic pilot avionics, wireless sensor networks, etc. A Cps uses networked embedded computers to monitor and control physical processes. The motivating example for this dissertation is the use of fault- tolerant routing protocol for a Network-on-Chip (NoC) architecture that connects electronic control units (Ecus) to regulate sensors and actuators in a vehicle. With a network allowing Ecus to communicate with each other, it is possible for them to share processing power to improve performance. In addition, networked Ecus enable flexible mapping to physical processes (e.g., sensors, actuators), which increases resilience to Ecu failures by reassigning physical processes to spare Ecus. For the on-chip routing protocol, the ability to tolerate network faults is important for hardware reconfiguration to maintain the normal operation of a system. Adding a fault-tolerance feature in a routing protocol, however, increases its design complexity, making it prone to many functional problems. Formal verification techniques are therefore needed to verify its correctness. This dissertation proposes a link-fault-tolerant, multiflit wormhole routing algorithm, and its formal modeling and verification using two different methodologies. An improvement upon the previously published fault-tolerant routing algorithm, a link-fault routing algorithm is proposed to relax the unrealistic node-fault assumptions of these algorithms, while avoiding deadlock conservatively by appropriately dropping network packets. This routing algorithm, together with its routing architecture, is then modeled in a process-algebra language LNT, and compositional verification techniques are used to verify its key functional properties. As a comparison, it is modeled using channel-level VHDL which is compiled to labeled Petri-nets (LPNs). Algorithms for a partial order reduction method on LPNs are given. An optimal result is obtained from heuristics that trace back on LPNs to find causally related enabled predecessor transitions. Key observations are made from the comparison between these two verification methodologies.

Stochastic Model Checking of Genetic Circuits

Synthetic genetic circuits have a number of exciting potential applications such as cleaning up toxic waste, hunting and killing tumor cells, and producing drugs and bio-fuels more efficiently. When designing and analyzing genetic circuits, researchers are often interested in the probability of observing certain behaviors. Discerning these probabilities typically involves simulating the circuit to produce some time series data and computing statistics over the resulting data. However, for very rare behaviors of complex genetic circuits, it becomes computationally intractable to obtain good results as the number of required simulation runs grows exponentially. It is, therefore, necessary to apply numerical methods to determine these probabilities directly. This article describes how stochastic model checking, a method for determining the likelihood that certain events occur in a system, can by applied to models of genetic circuits by translating them into continuous-time Markov chains (CTMCs) and analyzing them using Markov chain analysis to check continuous stochastic logic (CSL) properties. The utility of this approach is demonstrated with several case studies illustrating how this method can be used to perform design space exploration of two genetic oscillators and two genetic state-holding elements. Our results show that this method results in a substantial speedup as compared with conventional simulation-based approaches.

Systems Biology Markup Language (SBML) Level 2 Version 5: Structures and Facilities for Model Definitions

Computational models can help researchers to interpret data, understand biological function, and make quantitative predictions. The Systems Biology Markup Language (SBML) is a file format for representing computational models in a declarative form that can be exchanged between different software systems. SBML is oriented towards describing biological processes of the sort common in research on a number of topics, including metabolic pathways, cell signaling pathways, and many others. By supporting SBML as an input/output format, different tools can all operate on an identical representation of a model, removing opportunities for translation errors and assuring a common starting point for analyses and simulations. This document provides the specification for Version 5 of SBML Level 2. The specification defines the data structures prescribed by SBML as well as their encoding in XML, the eXtensible Markup Language. This specification also defines validation rules that determine the validity of an SBML document, and provides many examples of models in SBML form. Other materials and software are available from the SBML project web site, http://sbml.org/.

Synthetic Biology Open Language (SBOL) Version 2.0.0

Synthetic biology builds upon the techniques and successes of genetics, molecular biology, and metabolic engineering by applying engineering principles to the design of biological systems. The field still faces substantial challenges, including long development times, high rates of failure, and poor reproducibility. One method to ameliorate these problems would be to improve the exchange of information about designed systems between laboratories. The Synthetic Biology Open Language (SBOL) has been developed as a standard to support the specification and exchange of biological design information in synthetic biology, filling a need not satisfied by other pre-existing standards. This document details version 2.0 of SBOL, introducing a standardized format for the electronic exchange of information on the structural and functional aspects of biological designs. The standard has been designed to support the explicit and unambiguous description of biological designs by means of a well defined data model. The standard also includes rules and best practices on how to use this data model and populate it with relevant design details. The publication of this specification is intended to make these capabilities more widely accessible to potential developers and users in the synthetic biology community and beyond.

Efficient, Sound Formal Verification for Analog/Mixed-Signal Circuits

The increasing demand for smaller, more efficient circuits has created a need for both digital and analog designs to scale down. Digital technologies have been successful in meeting this challenge, but analog circuits have lagged behind due to smaller transistor sizes having a disproportionate negative affect. Since many applications require small, low-power analog circuits, the trend has been to take advantage of digital’s ability to scale by replacing as much of the analog circuitry as possible with digital counterparts. The results are known as emphdigitally-intensive analog/mixed-signal (AMS) circuits. Though such circuits have helped the scaling problem, they have further complicated verification. This dissertation improves on techniques for AMS property specifications, as well as, develops sound, efficient extensions to formal AMS verification methods. With the emphlanguage for analog/mixed-signal properties (LAMP), one has a simple intuitive language for specifying AMS properties. LAMP provides a more procedural method for describing properties that is more straightforward than temporal logic-like languages. However, LAMP is still a nascent language and is limited in the types of properties it is capable of describing. This dissertation extends LAMP by adding statements to ignore transient periods and be able to reset the property check when the environment conditions change. After specifying a property, one needs to verify that the circuit satisfies the property. An efficient method for formally verifying AMS circuits is to use the restricted polyhedral class of emphzones. Zones have simple operations for exploring the reachable state space, but they are only applicable to circuit models that utilize constant rates. To extend zones to more general models, this dissertation provides the theory and implementation needed to soundly handle models with ranges of rates. As a second improvement to the state representation, this dissertation describes how octagons can be adapted to model checking AMS circuit models. Though zones have efficient algorithms, it comes at a cost of over-approximating the reachable state space. Octagons have similarly efficient algorithms while adding additional flexibility to reduce the necessary over-approximations. Finally, the full methodology described in this dissertation is demonstrated on two examples. The first example is a switched capacitor integrator that has been studied in the context of transforming the original formal model to use only single rate assignments. Th property of not saturating is written in LAMP, the circuit is learned, and the property is checked against a faulty and correct circuit. In addition, it is shown that the zone extension, and its implementation with octagons, recovers all previous conclusions with the switched capacitor integrator without the need to translate the model. In particular, the method applies generally to all the models produced and does not require the soundness check needed by the translational approach to accept positive verification results. As a second example, the full tool flow is demonstrated on a digital C-element that is driven by a pair of RC networks, creating an AMS circuit. The RC networks are chosen so that the inputs to the C-element are ordered. LAMP is used to codify this behavior and it is verified that the input signals change in the correct order for the provided SPICE simulation traces.

Specifications of Standards in Systems and Synthetic Biology

Standards shape our everyday life. From nuts and bolts to electronic devices and technological processes, standardised products and processes are all around us. Standards have technological and economic benefits, such as making information exchange, production, and services more efficient. However, novel, innovative areas often either lack proper standards, or documents about standards in these areas are not available from a centralised platform or formal body (such as the International Standardisation Organisation).textless/ptextgreatertextlessptextgreaterSystems and synthetic biology is a relatively novel area, and it is only in the last decade that the standardisation of data, information, and models related to systems and synthetic biology has become a community-wide effort. Several open standards have been established and are under continuous development as a community initiative. COMBINE, the ‘COmputational Modeling in BIology’ NEtwork [1] has been established as an umbrella initiative to coordinate and promote the development of the various community standards and formats for computational models. There are yearly two meeting, HARMONY (Hackathons on Resources for Modeling in Biology), Hackathon-type meetings with a focus on development of the support for standards, and COMBINE forums, workshop-style events with oral presentations, discussion, poster, and breakout sessions for further developing the standards. For more information see http://co.mbine.org/. So far the different standards were published and made accessible through the standards’ web-pages or preprint services. The aim of this special issue is to provide a single, easily accessible and citable platform for the publication of standards in systems and synthetic biology. This special issue is intended to serve as a central access point to standards and related initiatives in systems and synthetic biology, it will be published annually to provide an opportunity for standard development groups to communicate updated specifications.

SBML Level 3 package: Hierarchical Model Composition, Version 1 Release 3

Constructing a model in a hierarchical fashion is a natural approach to managing model complexity, and offers additional opportunities such as the potential to re-use model components. The SBML Level 3 Version 1 Core specification does not directly provide a mechanism for defining hierarchical models, but it does provide a mechanism for SBML packages to extend the Core specification and add additional syntactical constructs. The SBML Hierarchical Model Composition package for SBML Level 3 adds the necessary features to SBML to support hierarchical modeling. The package enables a modeler to include submodels within an enclosing SBML model, delete unneeded or redundant elements of that submodel, replace elements of that submodel with element of the containing model, and replace elements of the containing model with elements of the submodel. In addition, the package defines an optional “port” construct, allowing a model to be defined with suggested interfaces between hierarchical components; modelers can chose to use these interfaces, but they are not required to do so and can still interact directly with model elements if they so chose. Finally, the SBML Hierarchical Model Composition package is defined in such a way that a hierarchical model can be “flattened” to an equivalent, non-hierarchical version that uses only plain SBML constructs, thus enabling software tools that do not yet support hierarchy to nevertheless work with SBML hierarchical models.

Technology mapping of genetic circuit designs

Synthetic biology is a new field in which engineers, biologists, and chemists are working together to transform genetic engineering into an advanced engineering discipline, one in which the design and construction of novel genetic circuits are made possible through the application of engineering principles. This dissertation explores two engineering strategies to address the challenges of working with genetic technology, namely the development of standards for describing genetic components and circuits at separate yet connected levels of detail and the use of Genetic Design Automation (GDA) software tools to simplify and speed up the process of optimally designing genetic circuits. Its contributions to the field of synthetic biology include (1) a proposal for the next version of the Synthetic Biology Open Language (SBOL), an existing standard for specifying and exchanging genetic designs electronically, and (2) a GDA work ow that enables users of the software tool iBioSim to create an abstract functional specication, automatically select genetic components that satisfy the specication from a design library, and compose the selected components into a standardized genetic circuit design for subsequent analysis and physical construction. Ultimately, this dissertation demonstrates how existing techniques and concepts from electrical and computer engineering can be adapted to overcome the challenges of genetic design and is an example of what is possible when working with publicly available standards for genetic design.

QDI logic for signaling data validity in bundled-data design: a kogge-stone case study

A Methodology to Annotate Systems Biology Markup Language Models with the Synthetic Biology Open Language

Recently, we have begun to witness the potential of synthetic biology, noted here in the form of bacteria and yeast that have been genetically engineered to produce biofuels, manufacture drug precursors, and even invade tumor cells. The success of these projects, however, has often failed in translation and application to new projects, a problem exacerbated by a lack of engineering standards that combine descriptions of the structure and function of DNA. To address this need, this paper describes a methodology to connect the systems biology markup language (SBML) to the synthetic biology open language (SBOL), existing standards that describe biochemical models and DNA components, respectively. Our methodology involves first annotating SBML model elements such as species and reactions with SBOL DNA components. A graph is then constructed from the model, with vertices corresponding to elements within the model and edges corresponding to the cause-and-effect relationships between these elements. Lastly, the graph is traversed to assemble the annotating DNA components into a composite DNA component, which is used to annotate the model itself and can be referenced by other composite models and DNA components. In this way, our methodology can be used to build up a hierarchical library of models annotated with DNA components. Such a library is a useful input to any future genetic technology mapping algorithm that would automate the process of composing DNA components to satisfy a behavioral specification. Our methodology for SBML-to-SBOL annotation is implemented in the latest version of our genetic design automation (GDA) software tool, iBioSim.

Asynchronous Circuit Design (in Chinese)

Stochastic Analysis of Synthetic Genetic Circuits

Over the past few decades, synthetic biology has generated great interest to biologists and engineers alike. Synthetic biology combines the research of biology with the engineering principles of standards, abstraction, and automated construction with the ultimate goal of being able to design and build useful biological systems. To realize this goal, researchers are actively working on better ways to model and analyze synthetic genetic circuits, groupings of genes that influence the expression of each other through the use of proteins. When designing and analyzing genetic circuits, researchers are often interested in building circuits that exhibit a particular behavior. Usually, this involves simulating their models to produce some time series data and analyzing this data to discern whether or not the circuit behaves appropriately. This method becomes less attractive as circuits grow in complexity because it becomes very time consuming to generate a sufficient amount of runs for analysis. In addition, trying to select representative runs out of a large data set is tedious and error-prone thereby motivating methods of automating this analysis. This has led to the need for design space exploration techniques that allow synthetic biologists to easily explore the effect of varying parameters and efficiently consider alternative designs of their systems. This dissertation attempts to address this need by proposing new analysis and verification techniques for synthetic genetic circuits. In particular, it applies formal methods such as model checking techniques to models of genetic circuits in order to ensure that they behave correctly and are as robust as possible for a variety of different inputs and/or parameter settings. However, model checking stochastic systems is not as simple as model checking deterministic systems where it is always known what the next state of the system will be at any given step. Stochastic systems can exhibit a variety of different behaviors that are chosen randomly with different probabilities at each time step. Therefore, model checking a stochastic system involves calculating the probability that the system will exhibit a desired behavior. Although it is often more difficult to work with the probabilities that stochastic systems introduce, stochastic systems and the models that represent them are becoming commonplace in many disciplines including electronic circuit design where as parts are being made smaller and smaller, they are becoming less reliable. In addition to stochastic model checking, this dissertation proposes a new incremental stochastic simulation algorithm (iSSA) based on Gillespie’s stochastic simulation algorithm (SSA) that is capable of presenting a researcher with a simulation trace of the typical behavior of the system. Before the development of this algorithm, discerning this information was extremely error-prone as it involved performing many simulations and attempting to wade through the massive amounts of data. This algorithm greatly aids researchers in designing genetic circuits as it efficiently shows the researcher the most likely behavior of the circuit. Both the iSSA and stochastic model checking can be used in concert to give a researcher the likelihood that the system will exhibit its most typical behavior. Once the typical behavior is known, properties for nontypical behaviors can be constructed and their likelihoods can also be computed. This methodology is applied to several genetic circuits leading to new understanding of the effects of various parameters on the behavior of these circuits.

Utilizing stochastic model checking to analyze genetic circuits

When designing and analyzing genetic circuits, researchers are often interested in the probability of the system reaching a given state within a certain amount of time. Usually, this involves simulating the system to produce some time series data and analyzing this data to discern the state probabilities. However, as the complexity of models of genetic circuits grow, it becomes more difficult for researchers to reason about the different states by looking only at time series simulation results of the models. To address this problem, this paper employs the use of stochastic model checking, a method for determining the likelihood that certain events occur in a system, with continuous stochastic logic (CSL) properties to obtain similar results. This goal is accomplished by the introduction of a methodology for converting a genetic circuit model (GCM) into a continuous-time Markov chain (CTMC). This CTMC is analyzed using transient Markov chain analysis to determine the likelihood that the circuit satisfies a given CSL property in a finite amount of time. This paper illustrates a use of this methodology to determine the likelihood of failure in a genetic toggle switch and compares these results to stochastic simulation-based analysis of this same circuit. Our results show that this method results in a substantial speedup as compared with conventional simulation-based approaches.

Erlang-delayed stochastic chemical kinetic formalism for efficient analysis of biological systems with non-elementary reaction effects

Stochastic chemical kinetics (SCK) has become an important formalism for modeling and analysis of complex biological systems as it can capture the discreteness and the randomness of underlying biochemical reactions. One of the assumptions made by SCK is that each reaction be an elementary step which cannot be broken down into smaller steps. As such, transition events of the SCK model occur spontaneously in that there is no time lag between the reaction initiation and completion. In practice, however, it is very difficult to experimentally determine if an observed state change is a result of an elementary reaction or a sequence of several reaction steps. To test various hypotheses on such time-delays through the use of the SCK, all of the intermediate reactions and species need to be explicitly specified, which can quickly become cumbersome. To more efficiently model and analyze potential effects of such intermediate reaction steps, this paper proposes a new formalism for higher-level discrete-stochastic treatment of biological systems with reaction delays. Our new formalism can represent a time delay caused by intermediate reaction steps as an Erlang random variable, allowing a model’s size to be reduced substantially. This paper illustrates an application of this formalism for analysis of the effect of different transcription elongation steps and rates on the distribution of RNA molecules.

Learning Genetic Regulatory Network Connectivity from Time Series Data

Recent experimental advances facilitate the collection of time series data that indicate which genes in a cell are expressed. This information can be used to understand the genetic regulatory network that generates the data. Typically, Bayesian analysis approaches are applied which neglect the time series nature of the experimental data, have difficulty in determining the direction of causality, and do not perform well on networks with tight feedback. To address these problems, this paper presents a method to learn genetic network connectivity which exploits the time series nature of experimental data to achieve better causal predictions. This method first breaks up the data into bins. Next, it determines an initial set of potential influence vectors for each gene based upon the probability of the gene’s expression increasing in the next time step. These vectors are then combined to form new vectors with better scores. Finally, these influence vectors are competed against each other to determine the final influence vector for each gene. The result is a directed graph representation of the genetic network’s repression and activation connections. Results are reported for several synthetic networks with tight feedback showing significant improvements in recall and runtime over Yu’s dynamic Bayesian approach. Promising preliminary results are also reported for an analysis of experimental data for genes involved in the yeast cell cycle.

Design and analysis of a robust genetic Muller C-element

This paper presents results on the design and analysis of a robust genetic Muller C-element. The Muller C-element is a standard logic gate commonly used to synchronize independent processes in most asynchronous electronic circuits. Synthetic biological logic gates have been previously demonstrated, but there remain many open issues in the design of sequential (state-holding) logic operations. Three designs are considered for the genetic Muller C-element: a majority gate, a toggle switch, and a speed-independent implementation. While the three designs are logically equivalent, each design requires different assumptions to operate correctly. The majority gate design requires the most timing assumptions, the speed-independent design requires the least, and the toggle switch design is a compromise between the two. This paper examines the robustness of these designs as well as the effects of parameter variation using stochastic simulation. The results show that robustness to timing assumptions does not necessarily increase reliability, suggesting that modifications to existing logic design tools are going to be necessary for synthetic biology. Parameter variation simulations yield further insights into the design principles necessary for building robust genetic gates. The results suggest that high gene count, cooperativity of at least two, tight repression, and balanced decay rates are necessary for robust gates. Finally, this paper presents a potential application of the genetic Muller C-element as a quorum-mediated trigger.

Temperature Control of Fimbriation Circuit Switch in Uropathogenic Escherichia coli: Quantitative Analysis via Automated Model Abstraction

Uropathogenic Escherichia coli (UPEC) represent the predominant cause of urinary tract infections (UTIs). A key UPEC molecular virulence mechanism is type 1 fimbriae, whose expression is controlled by the orientation of an invertible chromosomal DNA element—the fim switch. Temperature has been shown to act as a major regulator of fim switching behavior and is overall an important indicator as well as functional feature of many urologic diseases, including UPEC host-pathogen interaction dynamics. Given this panoptic physiological role of temperature during UTI progression and notable empirical challenges to its direct in vivo studies, in silico modeling of corresponding biochemical and biophysical mechanisms essential to UPEC pathogenicity may significantly aid our understanding of the underlying disease processes. However, rigorous computational analysis of biological systems, such as fim switch temperature control circuit, has hereto presented a notoriously demanding problem due to both the substantial complexity of the gene regulatory networks involved as well as their often characteristically discrete and stochastic dynamics. To address these issues, we have developed an approach that enables automated multiscale abstraction of biological system descriptions based on reaction kinetics. Implemented as a computational tool, this method has allowed us to efficiently analyze the modular organization and behavior of the E. coli fimbriation switch circuit at different temperature settings, thus facilitating new insights into this mode of UPEC molecular virulence regulation. In particular, our results suggest that, with respect to its role in shutting down fimbriae expression, the primary function of FimB recombinase may be to effect a controlled down-regulation (rather than increase) of the ON-to-OFF fim switching rate via temperature-dependent suppression of competing dynamics mediated by recombinase FimE. Our computational analysis further implies that this down-regulation mechanism could be particularly significant inside the host environment, thus potentially contributing further understanding toward the development of novel therapeutic approaches to UPEC-caused UTIs.

Representing Genetic Networks as Labeled Hybrid Petri Nets for State Space Exploration and Markov Chain Analysis

This paper presents the bachelor’s thesis of Curtis Kendall Madsen which can be broken down into the following three goals. The first goal of this project is to develop a way to convert genetic networks into logical models. Once this is done, finding the state graph of these nets and performing Markov chain analysis on them can provide researchers with insight into the reachability of the states in the original network. Therefore, the second goal of this project is to develop an automated tool that can perform state space exploration of a logical model, and the third goal is to implement a Markov chain analyzer for the stage graph. For the logical representation of genetic networks, the conversion method uses labeled hybrid Petri nets (LHPNs) because they are designed for modeling logic while still allowing for important information required by Markov chain analysis such as transition rates to be included in the model. This conversion method is automated and is integrated into the iBioSim program allowing users to transform a Genetic Circuit Model (GCM) into an LHPN with just a click of a button. Also, iBioSim now includes the LHPN file type in its file tree so users can view and edit LHPNs once conversion is complete. In addition, a method for performing state space exploration on an LHPN allows the user to view the state graph using Graphviz’s Dotty tool.

Efficient Modeling and Verification of Analog/Mixed-Signal Circuits Using Labeled Hybrid Petri Nets

Analog circuit design is traditionally done by expert designers in an ad hoc manner heavily dependent on simulation. This methodology has worked successfully for many years, but process variation and design complexity are prompting designers to explore new techniques. Formal methods are being used successfully to aid in the complex validation problem for digital circuits. This dissertation presents formal methods for analog and mixed-signal (AMS) circuits. This dissertation describes the development of a formal model, labeled hybrid Petri nets (LHPNs), appropriate for the modeling and verification of AMS circuits. An LHPN is a Petri net variant capable of modeling both continuous and discrete quantities. Creating an LHPN model of an AMS circuit by hand is a complicated and error prone exercise that requires expert knowledge. This is unacceptable for practical adoption of the LHPN model and its associated analysis methods. For this reason, this dissertation introduces an automatic LHPN model generation method. The method uses a set of simulation traces and a desired system property to generate an LHPN modeling the behavior of the simulation traces. The model generator can also be used to generate abstract Verilog-AMS or VHDL-AMS models suitable for use in system-level simulations. Formal verification of a property over the entire state space of an LHPN model is complicated by the infinite state of the model. For this reason, the infinite states of the model are grouped into potentially finite groups of equivalent states for verification. Difference bound matrices (DBMs), a restricted form of convex polygons, are used to represent these equivalent classes of infinite states. Reachability analysis using DBMs is very efficient at the cost of exactness. This dissertation presents algorithms for conservative state space analysis and verification of LHPNs. Finally, these methods are demonstrated on several case studies of AMS circuits from both academia and industry. The formal verification methods demonstrate the ability to find bugs missed by standard simulations. The abstract modeling methods show the promise of using automatically generated abstract models by demonstrating up to 40x speedup for some examples.

Production-passage-time approximation: a new approximation method to accelerate the simulation process of enzymatic reactions

Given the substantial computational requirements of stochastic simulation, approximation is essential for efficient analysis of any realistic biochemical system. This paper introduces a new approximation method to reduce the computational cost of stochastic simulations of an enzymatic reaction scheme which in biochemical systems often includes rapidly changing fast reactions with enzyme and enzyme-substrate complex molecules present in very small counts. Our new method removes the substrate dissociation reaction by approximating the passage time of the formation of each enzyme-substrate complex molecule which is destined to a production reaction. This approach skips the firings of unimportant yet expensive reaction events, resulting in a substantial acceleration in the stochastic simulations of enzymatic reactions. Additionally, since all the parameters used in our new approach can be derived by the Michaelis-Menten parameters which can actually be measured from experimental data, applications of this approximation can be practical even without having full knowledge of the underlying enzymatic reaction. Here, we apply this new method to various enzymatic reaction systems, resulting in a speedup of orders of magnitude in temporal behavior analysis without any significant loss in accuracy. Furthermore, we show that our new method can perform better than some of the best existing approximation methods for enzymatic reactions in terms of accuracy and efficiency.

Design and Analysis of Genetic Circuits

Electronic Design Automation (EDA) tools have facilitated the design of ever more complex integrated circuits each year. Synthetic biology would also benefit from the development of Genetic Design Automation (GDA) tools. Existing GDA tools require biologists to design genetic circuits at the molecular level, roughly equivalent to designing electronic circuits at the layout level. Analysis of these circuits is also performed at this very low level. This thesis presents a first step at developing a GDA tool that supports higher levels of abstraction. In particular, this thesis describes the Genetic Circuit Model (GCM), a graphical specification language from which molecular descriptions can be synthesized. The GCM has several advantages. The input is tightly controlled through the use of an editor, limiting the possibility of user error. The representation of the genetic circuit is much more compact than using the System Biology Markup Language (SBML), the standard form for representing genetics circuits. The GCM can be automatically translated into SBML, allowing GCM’s to be easily simulated across multiple different simulators. The GCM to SBML translation process is targeted in such a way that the resulting output can be easily abstracted to allow for efficient simulation. To evaluate and test the GCM, this thesis presents a case study of the design of a genetic Muller C-element, a gate often used in asynchronous design. Three different genetic Muller C-elements are designed and analyzed. The utility of the GCM is demonstrated as it allows for efficient analysis of the Muller C-elements. The results of the simulations show that logically equivalent circuits can have different behaviors. In particular, a speed independent Muller C-element does not necessary imply that the gate is more robust than a non-speed independent gate. Design principles gathered from the simulations are that dual-rail outputs are essential, high gene count increases robustness, cooperativity greater than one is necessary, repression needs to be strong, and decay rates must be balanced for high robustness and low switching time. One potential application of the genetic Muller C-element is determining when to start the invasion of cancer cells. The two input signals are an environmental signal, and a communication signal. Using these signals, the bacteria colony can correctly reach consensus on when to begin the invasion. One interesting result is that noise is necessary in correctly switching into the invasion state.

Model Abstraction and Temporal Behavior Analysis of Genetic Regulatory Networks

With advances in technologies such as high throughput data collection and genome sequencing methods, the Human Genome Project which, among other things, determined the human DNA sequence and identified all the genes in human DNA was completed at least two years ahead of time. As these technologies are becoming more accurate, efficient, and cost effective and a massive amount of genomic and proteomic data are becoming available at a rapid pace, we are now in the position to face the challenge to understand how these genes coupled with environmental stimuli orchestrate the regulation of cell-level behaviors. However, understanding such complex systems is very expensive and is most likely impractical with wet-lab experiments alone as the amount and the complexity of data substantially increase, requiring the integration of computational methods to make the process more efficient. To allow for substantial acceleration in computational analysis, this dissertation develops a model abstraction methodology for biochemical systems which systematically performs various model abstractions to reduce the complexity of computational biochemical models. Our methodology is particularly useful for systems with small molecular counts that require the discrete and stochastic representation and thus demand substantial computational requirements. As a case study, this dissertation illustrates the application of individual abstraction methods to such systems. Furthermore, it demonstrates the application of collective abstraction methods at various accuracy levels to temporal behavior analysis of several genetic regulatory networks. This dissertation shows that analysis time of biologically relevant properties of such genetic regulatory networks can be improved from days of work to minutes of work using our methodology while maintaining reasonable accuracy.

Learning Genetic Regulatory Network Connectivity from Time Series Data

Recent experimental advances facilitate the collection of time series data that indicate which genes in a cell are expressed. This information can be used to understand the genetic regulatory network that generates the data. Typically, Bayesian analysis approaches are applied which neglect the time series nature of the experimental data, have difficulty in determining the direction of causality, and do not perform well on networks with tight feedback. To address these problems, this dissertation presents an improved method, called the GeneNet algorithm, to learn genetic regulatory network connectivity which exploits the time series nature of experimental data to allow for better causal predictions on networks with tight feedback. More specifically, the GeneNet algorithm provides several contributions to the area of genetic network discovery. It finds networks with cyclic or tight feedback behavior often missed by other methods as it performs a more local analysis of the data. It provides the researcher with the ability to see the interactions between genes in a genetic network. It guides experimental design by providing feedback to the researcher as to which parts of the network are the most unclear. It is encased in an infrastructure that allows for rapid genetic network model creation and evaluation. The GeneNet algorithm first encodes the data into levels. Next, it determines an initial set of influence vectors for each species based upon the probability of the species’ expression increasing. From this set of influence vectors, it determines if any influence vectors should be merged, representing a combined effect. Finally, influence vectors are competed against each other to obtain the best influence vector. The result is a directed graph representation of the genetic network’s repression and activation connections. Results are reported for several synthetic networks showing significant improvements in both recall and runtime while performing nearly as well or better in precision over a dynamic Bayesian approach.

Verification of Analog and Mixed-Signal Circuits Using Symbolic Methods

With the rapidly increasing complexity of hardware, traditional validation techniques are becoming insufficient. This has led to a substantial interest in the formal verification of digital components. There has been relatively little research, however, into the application of formal verification methods to the analog/mixed-signal domain. Therefore, the overall goal of this work is to provide a system for efficient and meaningful analysis of analog/mixed-signal circuits. This encompasses two major efforts: modeling and symbolic analysis. The continuous nature of analog circuits requires a modeling method that is capable of representing continuous behavior and the discrete nature of digital circuits requires a modeling method that is capable of representing discrete behavior. This dual requirement necessitates a hybrid model—a model that can simultaneously represent continuous and discrete behavior. This work details the development of a specialized hybrid Petri net model with capabilities similar to hybrid automata. Analysis is greatly complicated by the addition of continuous behavior to the model. To help alleviate this, infinite numbers of states are often grouped into equivalence classes represented by symbolic structures. The analysis methods described here represent ranges of continuous variables using groups of inequalities which are then either mapped to Binary Decision Diagram variables so that necessary operations can be performed efficiently, or handed over to an advanced Satisfiability Modulo Theories solver for analysis. After describing the verification system in detail, experiences applying the techniques to several case studies are described and performance results are provided.

Production-Passage-Time Approximation: A New Approximation Method to Accelerate the Simulation Process of Enzymatic Reactions

Given the substantial computational requirements of stochastic simulation, approximation is essential for efficient analysis of any realistic biochemical system. This paper introduces a new approximation method to reduce the computational cost of stochastic simulations of an enzymatic reaction scheme which in biochemical systems often includes rapidly changing fast reactions with enzyme and enzyme-substrate complex molecules present in very small counts. Our new method removes the substrate dissociation reaction by approximating the passage time of the formation of each enzyme-substrate complex molecule which is destined to a production reaction. This approach skips the firings of unimportant yet expensive reaction events, resulting in a substantial acceleration in the stochastic simulations of enzymatic reactions. Additionally, since all the parameters used in our new approach can be derived by the Michaelis-Menten parameters which can actually be measured from experimental data, applications of this approximation can be practical even without having full knowledge of the underlying enzymatic reaction. Furthermore, since our approach does not require a customized simulation procedure for enzymatic reactions, it allows biochemical systems that include such reactions to still take advantage of standard stochastic simulation tools. Here, we apply this new method to various enzymatic reaction systems, resulting in a speedup of orders of magnitude in temporal behavior analysis without any significant loss in accuracy.

The case for analog circuit verification

Abstracted stochastic analysis of type 1 pili expression in E. coli

Abstract — With the aid of model abstractions, biochemical networks can be analyzed at different levels of resolution: from low-level quantitative models to high-level qualitative ones. Furthermore, an ability to change the level of abstraction can be very useful when dealing with many biological systems, including gene regulatory networks. These systems typically have too many components and states to be practically studied using all-inclusive low-level models, yet they often manifest enough dynamical and functional complexity, making an entirely high-level qualitative representation similarly inadequate — thus necessitating a search for some intermediate level of abstraction. Finally, while most abstractions used in modeling of biochemical networks have traditionally been performed manually, doing so accurately in a large system is a tedious and time-consuming process that is highly susceptible to errors during model transformation. To address these issues, we have developed a methodology and implemented an automated modeling and analysis tool with variable abstraction level capabilities. In this paper, we use it for the analysis of switching in Type 1 pili expression dynamics and, in particular, for the problem of estimating the effect of H-NS and Lrp regulatory protein levels on phase variation rates in E. coli. Such behavior is notoriously difficult to study due to the size of the associated gene regulatory network and the characteristically stochastic dynamics involved, which result in very high analytical and computational demands. Here, we show how, by using our system, we are able to automatically abstract the switch network and accurately predict E. coli afimbriation rates, while, at the same time, accelerating the required computations by up to two orders of magnitude. I.

Asynchronous abstraction methodology for genetic regulatory networks

Abstract. In order to efficiently analyze a large scale system in an automated and objective manner, abstraction is essential. This paper presents an automated abstraction methodology that systematically reduces the small scale complexity found in genetic regulatory network models, while broadly preserving the large scale system behavior. Our approach is to first reduce the number of reactions through a quasisteady-state approximation-based algorithm. Second, it represents the exact molecular state of the system by a set of reduced Boolean (or nary) discrete levels. This results in a chemical master equation that is approximated by a Markov chain with a much smaller state space providing significant simulation time acceleration and computability gains. 1 Background Numerous methods have been proposed to model genetic regulatory networks [1, 2]. While many traditional approaches have relied on a differential equation representation as inferred from a set of underlying biochemical reactions, there has been a growing appreciation of their limitations [3-6]. In particular, differential equation analysis of genetic networks generally assumes that the number of molecules in a cell is high and their concentrations can be viewed as continuous quantities, while their underlying reactions occur deterministically. However, in natural genetic networks these assumptions frequently do not hold as, for example, DNA molecules are typically present in single digit quantities, while some promoters can lead to substantial fluctuations in transcription/translation rates and essentially non-deterministic expression characteristics [7, 8].

Technology Mapping of Timed Asynchronous Circuits

This dissertation presents an efficient method for technology-mapping of timedasynchronous circuits. Technology-mapping combines the steps of decomposition, partitioning, and matching/covering to implement a synthesized design in a given technology. This work is applied to timed circuits, which are a class of asynchronous circuits that utilize explicit timing information for optimization throughout the entire design process. This work carries the timing constraints down to the circuit implementation level, giving new insight into the detection and elimination of hazards. In asynchronous circuits, correct operation requires that there are no hazards in the circuit implementation. Therefore, each internal node and output of the transformed circuit following decomposition must be verified for hazard-freedom to ensure correct operation. Current verification algorithms require an explicit state exploration often resulting in state explosion for even modest sized examples. The goal of the hazard verification portion of technology-mapping is to abstract the behavior of internal nodes and utilize the reduced state space to make a conservative determination of hazard-freedom for each node in the circuit. The newly annotated circuit is then mapped to an existing library for implementation. This dissertation explores various complexities of libraries used for matching and examines the hazard covering behavior using a variety of gates. Issues such as short-circuit detection and common-input matching are explored in detail, particularly when libraries contain generalized C-elements. The goal of this research is a hazard-free implementation of the synthesized design in a user-provided technology. Experimental results indicate that this new hazard-verification approach is substantially more efficient than existing timing verification tools and that in most cases hazard-free netlists are produced with modest sized libraries.

Efficient verification of hazard-freedom in gate-level timed asynchronous circuits

This paper presents an efficient method for verifying hazard freedom in timed asynchronous circuits. Timed circuits are a class of asynchronous circuits that utilize explicit timing information for optimization throughout the entire design process. In asynchronous circuits, correct operation requires that there are no hazards in the circuit implementation. Therefore, when designing an asynchronous circuit, each internal node and output of the circuit must be verified for hazard-freedom to ensure correct operation. Current verification algorithms for timed asynchronous circuits require an explicit state exploration often resulting in state explosion for even modest sized examples. The goal of this work is to abstract the behavior of internal nodes and utilize this information to make a conservative determination of hazard-freedom for each node in the circuit. Experimental results indicate that this approach is substantially more efficient than existing timing verification tools. These results also indicate that this method scales well for large examples. It is capable of analyzing circuits in less than a second that could not be previously analyzed. While this method is conservative in that some false hazards may be reported, our results indicate that the number of false hazards is small.

Verifying synchronization strategies

Asynchronous Circuit Design (in Japanese)

A Comparison of Timed State Space Analysis Methods

Circuit designers continue to push the limits of high speed circuit design. This desire for speed motivates the creation of new and innovative design styles. One of these design styles is timed circuits. Timed circuits take advantage of timing information to increase performance. This style has been applied in industrial research to designs like IBM’s guTS microprocessor, the Intel RAPPID project, and Sun’s GasP circuits. These experimental designs were successful at increasing performance, but they are only experimental designs. Timed circuits have not yet been used in a commercial design. One problem that plagues timed circuit design is the difficulty of understanding the complex timing interactions between circuit components. In response to this problem, researchers have created several methods and tools to help verify timed circuit designs. One of the most critical aspects of these methods is state space exploration of both the timed and untimed state space. This work concentrates on two leading methods to do timed state space exploration using Petri nets. These methods were previously implemented in two different CAD tools, ATACS and VINAS-P. However, a clear comparison of the methods has not been done due to the fact that the methods were implemented in different tools. We extended the ATACS framework by adding the methods previously only used by VINAS-P. Having both methods implemented in the same tool allows us to do an accurate comparison of the methods to better understand the strengths and weaknesses of each method.

Design Methodology for Analog VLSI Implementations of Error Control Decoders

In order to reach the Shannon limit, researchers have found more efficient error control coding schemes. However, the computational complexity of such error control coding schemes is a barrier to implementing them. Recently, researchers have found that bioinspired analog network decoding is a good approach with better combined power/speed performance than its digital counterparts. However, the lack of CAD (computer aided design) tools makes the analog implementation quite time consuming and error prone. Meanwhile, the performance loss due to the nonidealities of the analog circuits has not been systematically analyzed. Also, how to organize analog circuits so that the nonideal effects are minimized has not been discussed. In designing analog error control decoders, simulation is a time-consuming task because the bit error rate is quite low at high SNR (signal to noise ratio), requiring a large number of simulations. By using high-level VHDL simulations, the simulation is done both accurately and efficiently. Many researchers have found that error control decoders can be interpreted as operations of the sum-product algorithm on probability propagation networks, which is a kind of factor graph. Of course, analog error control decoders can also be described at a high-level using factor graphs. As a result, an automatic simulation tool is built. From its high-level factor graph description, the VHDL simulation files for an analog error control decoder can be automatically generated, making the simulation process simple and efficient. After analyzing the factor graph representations of analog error control decoders, we found that analog error control decoders have quite regular structures and can be built by using a small number of basic cells in a cell library, facilitating automatic synthesis. This dissertation also presents the cell library and how to automatically synthesize analog decoders from a factor graph description. All substantial nonideal effects of the analog circuit are also discussed in the dissertation. How to organize the circuit to minimize these effects and make the circuit optimized in a combined consideration of speed, performance, and power is also provided.

Correctness and Reduction in Timed Circuit Analysis

To increase performance, circuit designers are experimenting with timed circuits a class of circuits that rely on a complex set of timing constraints for correct functionality. This is evidenced in published experimental designs from industry. Timing constraints are key to the success of these designs, and algorithms to verify timing constraints are required to make them practical in commercial applications. Due to the complexity of the constraints, however, traditional static timing analysis is not adequate. Timed state space analysis is required; thus, improved timed state space analysis is paramount to producing efficient timed circuits. This dissertation discusses two facets of work in timed state space analysis: correctness and reduction. For correctness, this dissertation presents the level ruled Petri net as a model for timed circuits. This model is based on the Petri net language. It includes, however, timing information and level expressions that are key to the specification and verification of timed circuits. This dissertation formalizes the intent of correctness in the verification of a timed circuit by defining a set of failure conditions that can be analyzed in the circuit’s respective model. The circuit is said to be correct if its model is failure free. For reduction, this dissertation presents a timed state space analysis algorithm that verifies correctness in the timed circuit model. The algorithm, when compared to existing algorithms, reduces on average the running time and memory footprint of analysis. A partial order reduction is implemented for the algorithm to further reduce its resource usage. This reduction is not supported by the existing algorithms; thus, the new analysis algorithm can be applied to systems that are beyond their capacity. This is demonstrated in verifying industrial designs from IBM and Sun Microsystems.

Protocol Selection, Implementation, and Analysis for Asynchronous Circuits

This dissertation presents new methods for handshaking expansion of asynchronous circuits. Handshaking expansion includes protocol selection, reshuffling, and state variable insertion. The starting point is a channel-level specification of a design. The goal is a signal-level description of the given design that is correct, synthesizable, and efficient. This dissertation studies the impact of protocol selection and implementation on deadlock avoidance, complete state coding (CSC), CPU time required to compile a given example, and the quality of the circuit. This dissertation treats reshuffling and state-variable insertion as special cases of concurrency reduction. Prior work in the field has also taken this approach. However, this dissertation extends this approach and applies it to specifications that contain quantitative timing assumptions. The concurrency reduction algorithms have been implemented within a computer aided design (CAD) tool. Starting from a signal-level specification that contains the constraints of the desired protocol, these algorithms search the concurrency reduction design space, guided by an estimate of the performance of the final circuit. The CAD tool that this dissertation presents also contains a front end that, given a channel-level specification, produces the starting point for concurrency-reduction. This front end currently handles only pure synchronization channels, using one protocol. Finding all possible ways to reduce concurrency of a specification is a fundamentally exponential problem. However, this dissertation presents techniques to dramatically prune the search space. This dissertation demonstrates that these techniques are capable of reducing the search space by several orders of magnitude compared to the theoretical upper bound – and by one order of magnitude beyond existing techniques – without significantly impacting the quality of the solutions.

Analysis and Characterization of a Locally-Clocked Module

This thesis describes an evaluation of a locally-clocked module. Locally-clocked modules can be used as synchronous datapath elements in synchronous systems or as asynchronous elements in an asynchronous system. One key element of a locally-clocked module is a stoppable ring oscillator (or stoppable clock). If locally-clocked modules are to be used, their practicality must be quantified. Namely, it must be shown that a reliable and useful stoppable clock can be built. This thesis presents the design and evaluation of a fabricated locally-clocked sequential multiplier. The multiplier is used as a driving example to evaluate local clocks. The design for the stoppable clock is a hybrid of stoppable clocks from previous work. The same gates that make up the critical path of the multiplier are used to make the delay element of the stoppable clock. Although the stoppable clock is meant to track the datapath under a wide range of voltages and temperatures, it is shown that the clock requires tuning to match the critical path sometimes. This is due to the fact that it is difficult to match the critical path exactly. In addition, some temperature and voltage data points cause the cutoff path for the clock to be too slow. This problem is fixed by slowing down the clock. Future designs can focus on speeding up the cutoff path; thus, matching the critical path delay is the only limiting factor on clock frequency. A 20-bit multiplier was fabricated through MOSIS using AMI’s 0.5 μm process. The multiplier consumes 0.468 mm2 and contains 8190 transistors. With a 5 volt power supply, the multiplier runs at 13.3 MHz and consumes 196.6 mW of power, while the stoppable clock runs at 174 MHz. This thesis presents latency and power measurements for the multiplier and stoppable clock in addition to a detailed analysis of stoppable clocks. Process variation is analyzed in that five chips are tested and shown to have little variation in measured values.

Improved POSET timing analysis in timed Petri nets

2001 Conference on Advanced Research in VLSI

Automatic Abstraction for Synthesis and Verification of Deterministic Timed Systems

Achieving fast and exact hazard-free logic minimization of extended burst-mode gC finite state machines

This paper presents a new approach to two-level hazard-free logic minimization in the context of extended burst-mode finite state machine synthesis targeting generalized C-elements (gC). No currently available minimizers for literal-exact two-level hazard-free logic minimization of extended burst-mode gC controllers can handle large circuits without synthesis times ranging up over thousands of seconds. Even existing heuristic approaches take too much time when iterative exploration over a large design space is required and do not yield minimum results. The logic minimization approach presented in this paper is based on state graph exploration in conjunction with single-cube cover algorithms, an approach that has not been considered for minimization of extended burst-mode finite state machines previously. Our algorithm achieves very fast logic minimization by introducing compacted state graphs and cover tables and an efficient single-cube cover algorithm for single-output minimization. Our exact logic minimizer finds minimal number of literal solutions to all currently available benchmarks, in less than one second on a 333 MHz microprocessor-more than three orders of magnitude faster than existing literal exact methods, and over an order of magnitude faster than existing heuristic methods for the largest benchmarks. This includes a benchmark that has never been possible to solve exactly in number of literals before.

Algorithms for Synthesis and Verification of Timed Circuits and Systems

In order to increase performance, circuit designers are beginning to move away from traditional, synchronous designs based on static logic. Recent design examples have shown that significant performance gains are realized when aggressive circuit styles are used. Circuit correctness in these aggressive circuit styles is highly timing dependent, and in industry they are typically designed by hand. In order to automate the process of designing and verifying timed circuits, algorithms to explore the reachable state space of the circuit under the timing constraints are necessary. This thesis presents a new specification method for timed circuits, timed event/level (TEL) structures, and new algorithms for exploring a timed state space. The TEL structure specification allows the designer to specify behavior controlled by signal transitions, which is best for representing sequencing, and behavior controlled by signal levels, which is best for representing gate level circuits. This thesis also presents algorithms based on partially ordered sets (POSETs) that explores the timed state space of the TEL structure. Results using the new specification method and algorithms show orders of magnitude improvement over previous techniques in both speed and memory performance. The algorithms have also been successfully applied to several circuit examples from the recently published experimental Gigahertz processor developed at IBM. The speed and memory performance improvements of the algorithm allow automatic synthesis and verification of complex timed circuits, making them an attractive design alternative.

Timed Event/Level Structures

Synthesis of timed circuits using BDDs

Machine analysis and data coding of the Qing imperial lineage (in Chinese)