Extended Molecular Computing Model - wseas.us

Extended Molecular Computing Model

OLGIERD UNOLD1, MACIEJ TROĆ1, TADEUSZ DOBOSZ2, ALICJA TRUSEWICZ2
1 Institute of Engineering Cybernetics, Wroclaw University of Technology, Wyb. Wyspianskiego 27, 50-370 Wroclaw, POLAND
2 Medical University, Institute of Forensic Medicine, Mikulicza-Radeckiego 4, 50-368 Wroclaw, POLAND

Abstract: - The paper extends the molecular computation model that is based on a splicing system and was implemented in vitro by Shapiro. The use of a different restriction enzyme, BseMII, is proposed. A model of a 3-state molecular finite state machine is introduced.

Key-Words: - DNA Computing, Finite-State Automata

1 Introduction

DNA Computing is a method of solving mathematical problems with the help of biological operations on DNA strands. The current surge of interest in DNA Computing started with Adleman’s article [1], in which an instance of the directed Hamiltonian path problem, a well-known NP-complete problem, is solved using DNA and standard biological techniques. To date, several models of DNA Computing can be distinguished: computing inside a single molecule, computing by interactions among molecules, computing with membranes, and computing with geometry [4]. Regardless of the DNA model applied, constructing a general molecular computer requires that some universal model of computation, such as a Turing Machine (TM), be expressed in chemistry. In this paper, we extend the computation model verified in the laboratory by Ehud Shapiro [2], which is in fact a kind of splicing system [5]. Our molecular finite 3-state machine is implemented with the restriction enzyme BseMII, in contrast to the 2-state machine and the enzyme FokI used in Shapiro’s model.

This paper is organized as follows. In the next section, we summarize the basic biological concepts necessary to understand DNA Computing. In Section 3 we introduce finite state machines. Splicing systems are introduced in Section 4. In Section 5 we briefly describe Shapiro’s model of computation. A model of DNA computation with a 3-state automaton is proposed in Section 6. Section 7 concludes the paper, and Section 8 describes the lab materials and methods.

2 Biological Preliminaries

In DNA Computation, the instances of a problem are encoded in oligonucleotides, or strands, of DNA. There are four types of nucleotides, which differ in the chemical group called the base. The four bases are adenine (A), guanine (G), cytosine (C), and thymine (T), which bind according to the Watson-Crick complement condition: the pairs A,T and G,C are called complementary. The nucleotides form DNA strands that possess polarity, meaning that the direction of a strand matters as well as its sequence. For example, CTT and TTC are different strands. Hybridization is a chemical process that joins two complementary single strands into a double strand. Ligation is a chemical process whereby two double strands are joined into one double strand. A restriction enzyme (such as EcoRI, FokI, or BseMII) is characterized by the double-stranded sequence that it recognizes. For example, the enzyme FokI recognizes the site GGATG/CCTAC, cleaves the input molecule at a distance from this site, and works according to the definition FokI: GGATG(N)9/13↓, where N ∈ {A,C,G,T}; that is, it cuts 9 nucleotides past the site on one strand and 13 on the other, leaving a 4-nucleotide sticky end. See [10] for further background in molecular biology.
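The complement rule and the FokI cut notation can be illustrated with a short Python sketch. It is only a string-level approximation: the function names are hypothetical, strands are plain strings read in one direction, and only a recognition site on the given (top) strand is considered.

```python
# Watson-Crick pairing: A with T, G with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand):
    """Base-by-base complement, without changing the reading direction."""
    return "".join(COMPLEMENT[base] for base in strand)


def reverse_complement(strand):
    """Complementary strand written in its own reading direction."""
    return complement(strand)[::-1]


def foki_cut(top, site="GGATG", top_offset=9, bottom_offset=13):
    """Locate the staggered cut GGATG(N)9/13 on a top strand.

    Returns (top_cut, bottom_cut, sticky), where the cut positions are
    indices into `top` and `sticky` is the 4-nucleotide stretch between
    the two cuts, written in top-strand sense. Returns None when the
    site is absent or the strand is too short.
    """
    start = top.find(site)
    if start < 0:
        return None
    site_end = start + len(site)
    top_cut = site_end + top_offset
    bottom_cut = site_end + bottom_offset
    if bottom_cut > len(top):
        return None
    return top_cut, bottom_cut, top[top_cut:bottom_cut]


if __name__ == "__main__":
    print(reverse_complement("CTT"))
    print(foki_cut("AAGGATGACGTACGTACCTTGG"))
```

The cut position depends only on the distance from the recognition site, not on the bases at the cut, which is what lets the spacer length select the frame in the machines described later.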

3 Finite State Machine

We assume that the reader is familiar with finite-state machines (for background see [6]). Here we give only a brief overview, avoiding any formal model. A finite-state machine (FSM) is a restricted form of a TM. An FSM has only an internal memory determined by its finite state set; the input tape is not used as additional memory. An FSM just reads the input in one sweep from left to right. The machine can be in one of a finite number of internal states, of which one is designated the initial state and some are designated accepting states. Its software consists of transition rules, each specifying a next state based on the current state and the current symbol. It is initially positioned on the leftmost input symbol in the initial state. In each transition the machine moves one symbol to the right, changing its internal state according to one of the applicable transition rules. Alternatively, it may suspend without completing the computation if no transition rule applies. A computation terminates on processing the last input symbol. An automaton is said to accept an input if a computation on this input terminates in an accepting final state.
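The behaviour described above can be sketched in a few lines of Python. The rule table below is a hypothetical two-state example (it accepts words with an even number of b symbols) and is not taken from the paper; the function name and data layout are likewise assumptions made for illustration.

```python
def accepts(rules, initial, accepting, word):
    """Run a (possibly nondeterministic) finite-state machine on `word`.

    `rules` maps (state, symbol) to the set of possible next states.
    The machine suspends, and the word is rejected, as soon as no rule
    applies to any state it could currently be in.
    """
    current = {initial}
    for symbol in word:
        current = {
            nxt
            for state in current
            for nxt in rules.get((state, symbol), ())
        }
        if not current:  # no applicable transition rule: suspend
            return False
    return bool(current & accepting)


# Hypothetical example program: even number of b's.
EVEN_B = {
    ("S0", "a"): {"S0"},
    ("S0", "b"): {"S1"},
    ("S1", "a"): {"S1"},
    ("S1", "b"): {"S0"},
}

if __name__ == "__main__":
    print(accepts(EVEN_B, "S0", {"S0"}, "abba"))
    print(accepts(EVEN_B, "S0", {"S0"}, "ab"))
```

Selecting a subset of the rules and a set of accepting states is exactly what "programming" means for the molecular machines discussed in the following sections.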

4 Splicing Systems

Splicing is a paradigm for DNA Computing which provides a theoretical model of enzymatic systems operating on DNA strands. Splicing models a solution with enzymatic actions (restriction, hybridization and ligation) operating on DNA in parallel in a common test tube. The DNA strands are modeled as strings over a finite alphabet. Splicing allows for the generation of new strings from an initially chosen set of strings, using a specified set of splicing operations. The splicing operations on the DNA are string editing operations such as cutting and appending. These operations can be applied in any order, and thus the splicing system can be considered to be autonomously controlled. Also, the operations may be nondeterministic, and a large number of possible results may be obtained from a small number of operations.

The splicing system was introduced as a generative formalism in 1987 by Head [5]. Splicing systems and their extensions have been studied by many authors (for example [3], [7], [8]), and Freund et al. proved [3] that the generative power of finite extended splicing systems equals that of a TM. Rothemund [9] showed that a universal TM can be simulated by recombinant DNA operations in splicing models. The restriction enzymes that Rothemund suggests using are a subclass of the type II endonucleases, namely the type IIS restriction endonucleases. Recently an experimental test of splicing was carried out by Shapiro [2], who used biological molecules to create a DNA computer based, like Rothemund’s TM, on the restriction nuclease FokI.
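As an illustration of splicing as a string operation, the sketch below uses a common textbook formulation that the text itself does not spell out, so it should be read as an assumption: a rule (u1, u2, u3, u4) cuts a string x between u1 and u2, cuts a string y between u3 and u4, and joins the left part of x to the right part of y. The function name is hypothetical.

```python
def splice(x, y, rule):
    """Apply one splicing rule to the pair (x, y).

    For every occurrence of u1+u2 in x and of u3+u4 in y, the result
    keeps x up to and including u1, followed by y from u4 onwards.
    Returns the set of all strings obtainable in this way.
    """
    u1, u2, u3, u4 = rule
    left_site, right_site = u1 + u2, u3 + u4
    results = set()
    i = x.find(left_site)
    while i != -1:
        j = y.find(right_site)
        while j != -1:
            results.add(x[: i + len(u1)] + y[j + len(u3):])
            j = y.find(right_site, j + 1)
        i = x.find(left_site, i + 1)
    return results


if __name__ == "__main__":
    # Cut AACGTT between C and G, cut GGTACC between T and A, then join.
    print(splice("AACGTT", "GGTACC", ("C", "G", "T", "A")))
```

Because a rule may match at several positions and any pair of strings in the tube may react, repeated application naturally yields the nondeterministic, parallel behaviour described above.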

5 Shapiro’s Machine

In [2] Shapiro and his coworkers described a programmable finite two-state, two-input-symbol machine comprising DNA and DNA-manipulating enzymes. This kind of automaton can have 8 possible transition rules, and programming amounts to selecting some of these transition rules and deciding which internal states are accepting. There are 255 possible transition-rule selections and 3 possible selections of accepting states (either S0, or S1, or both), resulting in 765 syntactically distinct programs. A few of these programs were tested in the lab by Shapiro’s team.

The hardware of Shapiro’s molecular computer consists of the restriction enzyme FokI, T4 DNA ligase and ATP, while the software comprises 8 short double-stranded DNA molecules, the transition molecules, which encode the 8 possible transition rules. A double-stranded DNA molecule encodes the initial state of the automaton and the input, with 6 base pairs coding for each input symbol. The system also contains ‘peripherals’: two output-detection molecules of different lengths, each of which can interact selectively with a different output molecule to form an output-reporting molecule that indicates a final state and can be readily detected by gel electrophoresis. The computation starts when the hardware, software and input are all mixed together, and it runs autonomously, if possible until termination. If the peripherals are also added to the mixture, output reporters are formed in situ upon termination.

The process of DNA computing is as follows. First, the input is cleaved by FokI, thereby exposing a 4-nucleotide sticky end that encodes the initial state and the first input symbol. The computation proceeds via a cascade of transition cycles. In each cycle the sticky end of an applicable transition molecule hybridizes with, and is ligated to, the sticky end of the input molecule, detecting the current state and the current symbol. The product is cleaved by FokI inside the encoding of the next symbol, exposing a new 4-nucleotide sticky end.

The design of the transition molecules ensures that the 6-bp-long encodings of the input symbols a and b are cleaved by FokI at only two different frames, the leftmost frame encoding the state S1 and the rightmost frame encoding S0. The exact next restriction site, and thus the next internal state, is determined by the current state and the size of the spacers in an applicable transition molecule. The computation proceeds until no transition molecule matches the exposed sticky end of the input or until the special terminator symbol is cleaved, forming an output molecule that has a sticky end encoding the final state. In a step extraneous to the computation and analogous to a print instruction of a conventional computer, this sticky end ligates to one of two output detectors, and the resultant output reporter is identified by gel electrophoresis.
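The two-frame idea can be made concrete with a small Python sketch. The 6-bp symbol codes below are placeholders chosen only for illustration and are not taken from the text; the helper names are hypothetical. The only properties used are the ones stated above: a 4-nucleotide sticky end, with the leftmost frame read as S1 and the rightmost frame as S0.

```python
STICKY_LEN = 4  # length of the sticky end exposed by FokI

# Placeholder 6-bp symbol codes (an assumption, for illustration only).
SYMBOL_CODE = {"a": "CTGGCT", "b": "CGCAGC"}


def frames(code):
    """Return the two 4-nt frames of a 6-bp symbol code."""
    return {"S1": code[:STICKY_LEN], "S0": code[-STICKY_LEN:]}


def build_decoder(symbol_code):
    """Map each sticky end to the <state, symbol> pair it encodes.

    Raises ValueError if two pairs would share a sticky end, since the
    transition molecules could then no longer tell them apart.
    """
    table = {}
    for symbol, code in symbol_code.items():
        for state, sticky in frames(code).items():
            if sticky in table:
                raise ValueError(f"ambiguous sticky end {sticky}")
            table[sticky] = (state, symbol)
    return table


DECODER = build_decoder(SYMBOL_CODE)

if __name__ == "__main__":
    for sticky, pair in sorted(DECODER.items()):
        print(sticky, pair)
```

A usable encoding must give every <state, symbol> pair a distinct sticky end, which is why the decoder rejects ambiguous symbol codes.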

6 Model of 3-State Molecular Automaton

Before developing a model of a 3-state molecular machine, we validated our approach by reproducing all of the experiments performed by Shapiro in our own molecular computation simulator [11]. We also checked automata not reported in [2]. Besides FokI, the restriction enzymes BseMII and BseXI were also tested. In the next step the model of two-state automata was extended to a model of a three-state, two-input-symbol machine. This kind of automaton can have 18 possible transition rules. There are 262,143 possible transition-rule selections and 7 possible selections of accepting states, resulting in 1,835,001 (!) syntactically distinct programs. In our approach the enzyme BseMII was used. We have implemented the automaton shown in Figure 1 and checked it in simulation.
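The program counts quoted for the two-state and three-state machines follow from simple counting, sketched below in Python under the assumption (consistent with the figures in the text) that a program is a non-empty selection of transition rules together with a non-empty selection of accepting states. The function name is hypothetical.

```python
def count_programs(n_states, n_symbols):
    """Count syntactically distinct programs of a finite automaton.

    A transition rule is a triple (current state, symbol, next state).
    Returns (rules, rule selections, accepting selections, programs),
    where empty selections are excluded.
    """
    n_rules = n_states * n_symbols * n_states
    rule_selections = 2 ** n_rules - 1
    accepting_selections = 2 ** n_states - 1
    programs = rule_selections * accepting_selections
    return n_rules, rule_selections, accepting_selections, programs


if __name__ == "__main__":
    print(count_programs(2, 2))  # two-state, two-symbol machine
    print(count_programs(3, 2))  # three-state, two-symbol machine
```

The count grows doubly fast with the number of states, because both the number of rules and the number of subsets of rules increase.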

7 Conclusions

We explore the capabilities of DNA Computing by giving some extensions to the splicing model implemented in the lab by Shapiro and his team. The model of two-state automata was extended to a model of a three-state, two-input-symbol machine. This kind of automaton admits 1,835,001 (!) syntactically distinct programs, compared with 765 programs in Shapiro’s approach. In our model the enzyme BseMII was used. Although many issues remain for future investigation, we have started the first set of experiments in vitro.

8 Lab Materials and Methods

All 5’-phosphorylated oligonucleotides used were prepared by Bionovo Legnica. Reagents (molecular biology grade) were from Sigma. Restriction enzymes FokI, BseMII, and BseXI were from Fermentas, and T4 DNA ligase was from Sigma. Plasmid pBlueScript II SK+ was from Stratagene. Electrophoresis was performed on a 2% agarose gel in 1x TBE buffer, using Sigma apparatus and FMC Gold Agarose. The PCR reaction and all other enzymatic processes were performed in a Biometra Unothermoblock thermocycler. Purification of the experimental DNA mixtures was performed using the QIAquick DNA Purification Kit. Results of the experiments were checked and analysed by DNA sequencing using the BigDye Terminator 3 technique (PE Applied Biosystems) and a capillary ABI 310 Genetic Analyser, manufactured by PE Applied Biosystems.

References

[1] Adleman, L.M.: Molecular Computation of Solutions to Combinatorial Problems, Science 266, 1994, pp. 1021-1024
[2] Benenson, Y., Paz-Elitzur, T., Adar, R., Keinan, E., Livneh, Z. and Shapiro, E.: Programmable Computing Machine Made of Biomolecules, Nature 414, 2001, pp. 430-434
[3] Freund, R., Kari, L., Paun, G.: DNA Computing Based on Splicing; the Existence of Universal Computers, Theory Comput. Syst. 32, 1999, pp. 69-112
[4] Hagiya, M.: From Molecular Computing to Molecular Programming, DNA Computing, 6th International Workshop on DNA-Based Computers, DNA 2000, In: Condon, A., Rozenberg, G. (eds.): LNCS, Vol. 2054, 2001, pp. 89-102
[5] Head, T.: Formal Language Theory and DNA: an Analysis of the Generative Capacity of Specific Recombinant Behaviors, Bull. Math. Biol. 49, 1987, 6, pp. 737-759
[6] Hopcroft, J.E., Ullman, J.D.: Introduction to Automata Theory, Languages and Computation, Addison-Wesley, Reading MA, 1979
[7] Mateescu, A., Paun, G., Rozenberg, G. and Salomaa, A.: Simple Splicing Systems, Discrete Appl. Math. 84, 1998, pp. 145-163
[8] Pixton, R.: Splicing in Abstract Families of Languages, In: SNAC Working Material, TUCS General Publ., 1997, 6, pp. 513-540
[9] Rothemund, P.W.K.: A DNA and Restriction Enzyme Implementation of Turing Machines, In: Lipton, R.J., Baum E.B. (eds.): DNA Based Computers, American Mathematical Society, Providence, RI, 1996, pp. 1-22
[10] Stryer, L.: Biochemistry, Freeman & Co., 1995
[11] Unold, O., Troć, M.: Restriction Enzyme Computation, 7th International Work-Conference on Artificial and Natural Neural Networks, IWANN 2003, LNCS, Vol. 2687, 2003, pp. 686-693

a) Diagram of the 3-state example automaton. [Diagram not reproduced: states S0, S1 and S2 connected by arrows labelled a and b.] The incoming unlabelled arrow marks the initial state (S2), labelled arrows represent transition rules, and the double circle marks the accepting state (S0).

b) Symbols and states encoding

Symbol   Encoding   S0   S1   S2
a        CATC       TC   AT   CA
b        CTAC       AC   TA   CT
t        CGGC       GC   GG   CG

c) Initial states encoding

State   Top strand   Bottom strand
S0      CTCAG(6)     GAGTC(6)
S1      CTCAG(7)     GAGTC(7)
S2      CTCAG(8)     GAGTC(8)

d) Transition molecules

Rule   Current   Symbol   Next   Top strand     Bottom strand
t1     S0        a        S0     CTCAG(4)TC     GAGTC(4)
t2     S0        a        S1     CTCAG(5)TC     GAGTC(5)
t3     S0        a        S2     CTCAG(6)TC     GAGTC(6)
t4     S0        b        S0     CTCAG(4)AC     GAGTC(4)
t5     S0        b        S1     CTCAG(5)AC     GAGTC(5)
t6     S0        b        S2     CTCAG(6)AC     GAGTC(6)
t7     S1        a        S0     CTCAG(3)AT     GAGTC(3)
t8     S1        a        S1     CTCAG(4)AT     GAGTC(4)
t9     S1        a        S2     CTCAG(5)AT     GAGTC(5)
t10    S1        b        S0     CTCAG(3)TA     GAGTC(3)
t11    S1        b        S1     CTCAG(4)TA     GAGTC(4)
t12    S1        b        S2     CTCAG(5)TA     GAGTC(5)
t13    S2        a        S0     CTCAG(2)CA     GAGTC(2)
t14    S2        a        S1     CTCAG(3)CA     GAGTC(3)
t15    S2        a        S2     CTCAG(4)CA     GAGTC(4)
t16    S2        b        S0     CTCAG(2)CT     GAGTC(2)
t17    S2        b        S1     CTCAG(3)CT     GAGTC(3)
t18    S2        b        S2     CTCAG(4)CT     GAGTC(4)

e) Output-detection molecules

Detector   Top strand   Bottom strand
S0-D       (100)GC      (100)
S1-D       (200)GG      (200)
S2-D       (300)CG      (300)

f) A sample processing of an input molecule ba

Sequence of the input symbols: ba
Initial state: S2
Final states: S0
Sequence of the transition rules: t17 t7 (OK)

Input molecule:
  CTCAG8CTACCATCCGGC300
  GAGTC8GATGGTAGGCCG300

Intermediate configuration formed upon restriction by BseMII:
  CTCAG8CT      ACCATCCGGC300
  GAGTC8      GATGGTAGGCCG300

Ligation with the t17 transition molecule:
  CTCAG3CTACCATCCGGC300
  GAGTC3GATGGTAGGCCG300

Next intermediate configuration formed upon restriction by BseMII:
  CTCAG3CTACCAT      CCGGC300
  GAGTC3GATGG      TAGGCCG300

Ligation with the t7 transition molecule:
  CTCAG3ATCCGGC300
  GAGTC3TAGGCCG300

Next intermediate configuration formed upon restriction by BseMII:
  CTCAG3ATCCGGC      300
  GAGTC3TAGGC      CG300

Ligation with the output detector S0-D:
  100GC300
  100CG300

Fig. 1. Design details and mechanism of operation of the 3-state molecular finite automaton
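The sample run in Fig. 1f can be replayed with a short top-strand simulation in Python. This is a simplified sketch, not the authors' simulator [11]: it tracks only the top strand, replaces spacers with runs of N, omits the 300-bp tail, and takes the cut positions (10 nucleotides past the site on the top strand, 8 on the bottom strand) from the configurations shown in the figure. The spacer formula 4 + next - current is read off the transition-molecule listing, and all names are hypothetical.

```python
SITE = "CTCAG"               # recognition site as written in Fig. 1
TOP_CUT, BOTTOM_CUT = 10, 8  # cut offsets read off the trace in Fig. 1f

# 4-bp symbol codes and 2-nt sticky ends <state, symbol> from Fig. 1b.
SYMBOL_CODE = {"a": "CATC", "b": "CTAC", "t": "CGGC"}
STICKY = {
    "TC": (0, "a"), "AT": (1, "a"), "CA": (2, "a"),
    "AC": (0, "b"), "TA": (1, "b"), "CT": (2, "b"),
    "GC": (0, "t"), "GG": (1, "t"), "CG": (2, "t"),
}


def cleave(top):
    """Cut past the site; return (sticky end, remaining top strand)."""
    body = top[len(SITE):]
    return body[BOTTOM_CUT:TOP_CUT], body[TOP_CUT:]


def run(word, initial, program):
    """Process `word` from state `initial`.

    `program` maps (state, symbol) to the next state, i.e. it is the
    selected subset of transition molecules. Returns the final state,
    or None if the computation suspends.
    """
    # Initial-state molecules use spacers 6, 7, 8 for S0, S1, S2.
    codes = "".join(SYMBOL_CODE[s] for s in word + "t")
    top = SITE + "N" * (6 + initial) + codes
    while True:
        sticky, rest = cleave(top)
        state, symbol = STICKY[sticky]
        if symbol == "t":      # terminator reached: output molecule
            return state
        nxt = program.get((state, symbol))
        if nxt is None:        # no matching transition molecule
            return None
        spacer = 4 + nxt - state
        top = SITE + "N" * spacer + sticky + rest  # ligation


# The two rules used in the sample run: t17 (S2, b -> S1), t7 (S1, a -> S0).
SAMPLE_PROGRAM = {(2, "b"): 1, (1, "a"): 0}

if __name__ == "__main__":
    print(run("ba", 2, SAMPLE_PROGRAM))
```

A shorter spacer moves the next cut further into the following symbol, so the same 4-bp code exposes a different 2-nt frame; this is how the transition molecule selects the next state.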