Knots in DNA

Knots in DNA by Pieter van de Griend

Abstract In this paper we shall investigate several solutions to the problems arising from isotopy and transmutation, which are exhibited by knotted structures and encountered in the topological approach to enzymology in DNA research.

INTRODUCTION

“Before Jones the maths was incredibly arcane. The way knots were classified had nothing to do with biology. But now you can calculate the things that are important to you.” Molecular biologist Nicholas Cozzarelli, 1986.

Contrary to commonly held belief, the tying of knots is not solely a privilege of man. Knots can occur spontaneously in intelligence-free objects like electric cables and garden hoses. There are also animals which tie themselves into an occasional knot. Eptatretus stoutii, known as the hagfish on the eastern Pacific shores, protects itself by tying its body into an Overhand Knot.¹ In 1844 the Dane Ørsted discovered a type of bacterium called Leucothrix mucor in shallow pools at the beach. About 120 years later Thomas Brock observed that this marine bacterium can tie various sorts of knots in its long, monofilamentous body.²

DNA molecules are also long and flexible. One may wonder whether they are a medium capable of hosting a knotted structure. Since the beginning of the 1980s the answer has been affirmative, but in order to understand the underlying mechanism, and to address the question in some depth, we shall have to take a closer look at DNA.

Most cellular DNA is double-stranded (duplex), consisting of two linear backbones of alternating sugar and phosphorus. Attached to each sugar molecule is one of the four bases: A=Adenine, T=Thymine, C=Cytosine and G=Guanine. A ladder is formed by hydrogen bonding between the base pairs, with A bonding to T, and C bonding to G. The base-pair sequence, or code, for a linear segment of duplex DNA is obtained by reading along one of the two backbones and is a word in the letters {A, C, G, T}. In the Crick-Watson model for DNA, the ladder spirals in a right-handed fashion with an average, nearly constant pitch of 10.5 base pairs per full helical twist. Duplex DNA in vivo (in the living cell) is usually a linear molecule, but in nature duplex DNA can also exist in closed circular form.

Enzymology is the study of the (geometric) actions of various naturally occurring enzymes which alter the physical appearance of DNA.
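The base-pairing rule and the helical pitch described above can be made concrete in a few lines. The following is only a minimal sketch: the function names `partner_strand` and `helical_turns` are hypothetical, and the partner strand is read position by position, ignoring the chemical direction of the backbone.

```python
# Illustrative sketch; names are hypothetical and not taken from the paper.
PAIRING = {"A": "T", "T": "A", "C": "G", "G": "C"}  # A bonds to T, C bonds to G
BP_PER_TURN = 10.5  # average pitch quoted in the text (base pairs per helical twist)


def partner_strand(code: str) -> str:
    """Return the bases hydrogen-bonded to `code`, position by position."""
    return "".join(PAIRING[base] for base in code)


def helical_turns(n_base_pairs: int) -> float:
    """Approximate number of full helical twists in a duplex of this length."""
    return n_base_pairs / BP_PER_TURN


print(partner_strand("ACGT"))  # TGCA
print(helical_turns(105))      # 10.0
```

A word in the letters {A, C, G, T} thus determines the opposite rung of the ladder completely, which is why reading one backbone suffices.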
For our purposes there are roughly two categories of interesting enzymes. Those affecting the way DNA is embedded in 3-space, which are called topoisomerases, and an enzyme type which acts during recombination, called recombinases. One role of the topoisomerases is to facilitate the central genetic events of replication, transcription and recombination of the DNA. This is done by:

¹ [Jensen 1966]
² [Brock 1964]

1. Writhing, which is the coiling up of the molecule in the nucleus.
2. Strand-passage of the molecule through another via a transient break in one of the strands.

Recombinases break the strands and rejoin them at different ends. These moves are performed on so-called substrate DNA and yield product DNA.

To get an indication of the packing and recombination problems of DNA in the nucleus, imagine the latter scaled up to the size of a football. It then contains about 200 km of DNA material the thickness of a fishing line. During the replication process, which goes at the unimaginably awesome speed of 200 km per hour along our fishing line, breaks can occur. The other obvious problems relate to the storing of this length, but even uncoiling poses difficulties, also causing an occasional severing of the strands. All of these operations are conducted by enzymes and can cause the formation of twists and knots. If this kinking and knotting is performed on circular DNA then these knots can be trapped and used to derive an understanding of the enzyme workings. By studying the behaviour of an enzyme in vitro (in the laboratory), scientists are one step closer to understanding its behaviour in vivo. For the record: knotted circular DNA has been observed in the cell [Wang 1985].

After removal the molecules are prepared for viewing under the electron microscope (EM) by coating them with protein rec A. This coating increases the DNA diameter from 20 to 100 Angstroms, stiffens the molecule and lengthens it by reducing the helical pitch of the core DNA. This fattening and stiffening facilitates the unambiguous determination of crossings in an electron micrograph of a configuration of DNA circles and reduces the number of extraneous crossovers. Hence the complete stereostructure of knots can be determined by electron microscopy of protein-coated DNA molecules [Krasnow et al. 1983, Wasserman & Cozzarelli 1984].

In the following we shall adopt an approach commonly used in computer science and called prototyping. We shall come to consider three models which reflect the iterative aspect of the human learning process. This knowledge-approximating process will enable us to formulate specific problems and develop further models, which will be more appropriate.


Section 1: PRELIMINARIES

§1.1 What do we want?

In spite of the smoothing effect of rec A on DNA knots, they may still be terribly tangled structures. The first hurdle on the way is to fractionate the product family according to knot type. We thus want to apply a (presently undefined) equivalence relation over our set of knot projections. In the past the following techniques have been used:

1. Gel electrophoresis, which works as a shape- and mass-sensitive filter. The problem here is lack of understanding of the operational mechanism.

2. Counting of crossing points in the diagram. Node counting has the important advantage of describing all the key topological and geometric properties of DNA that are altered by enzymes. However, this method has certain shortcomings, several of which are illustrated by the two distinct “4” crossing point knots below.

[Figure: two distinct “4” crossing point knots]

3. A mathematical classification scheme [Schubert 1956], which appears to be a numerical method. However, [White et al. 1987] call it redundant and cumbersome.

Ergo, none of the above-mentioned methods is sufficient or satisfactory, especially since they have been virtually useless in predicting any of the topological changes brought about by enzymes, which will be our second (and ultimate) goal.


§1.2 What have we got?

Our aim is to build a purified mathematical environment in which we can model reality. Our first task is one of synchronizing mathematical and DNA researchers’ terminology. To a mathematician a knot $K$ is the trajectory of an embedding $\gamma$ of the unit circle $S^1$ into $\mathbb{R}^3$. For reasons of compactification Euclidean 3-space is usually replaced by $S^3$. The notion of a knot is thus denoted by:

$$\gamma : S^1 \to S^3 \quad\text{and}\quad K = \gamma(S^1)$$

This can be translated into saying that a knot is a closed curve that meanders smoothly through a bit of 3-space, which is contained in a compact bit of 4-space. If more than one separate curve can be seen to make a circuitous journey throughout $S^3$ then mathematicians speak of an n-link, in which n denotes the number of disjoint closed curves. This is denoted by:

$$\gamma : \underbrace{S^1 \times \cdots \times S^1}_{n\ \text{times}} \to S^3 \quad\text{and}\quad L = \gamma(S^1 \times \cdots \times S^1)$$

Here the n-fold product is to be read as n disjoint copies of the circle.

In other words an n-link occurs when n proper knots are intertangled with each other, without any mutual intersections. When these knotted structures occur in genetic material the terminology is slightly different. DNA researchers speak of (DNA) knots in the former case and about catenanes in the case of a DNA n-link. From now on we shall adopt the contextually most convenient terminology.

§1.3 Idealizing.

We are given EM scans showing apparently knotted structures in circular DNA. How can we be sure that the stereostructure is really knotted and that the images are not just fanciful whims of the photographic process? We have several reassurances.

1. The contrast-creating process employed removes all doubt that the crossings are proper crossings.
2. DNA often occurs naturally in circular form, which can “catch knots”.

We can therefore discard the notion that the images produced by electron microscopes are optical trickery. The EM images show that we may replace the knotted DNA structures by real-life knots. We are still left with problems pertaining to the flexibility of DNA. The twists occurring in the DNA molecules are caused by enzymatic workings, but we will assume that this genetic material has a certain “homogeneousness”, making it flexible like rope or twine. We can thus reconstruct the DNA knots in an everyday piece of string. These knots can be captured by fusing the string’s ends, as in a necklace. This relates to the fact that we are working with circular DNA.


§1.4 Generalizing.

We are now in a position to consider a DNA knot or catenane as an n-link, which is the image of an embedding $\gamma : S^1 \times \cdots \times S^1 \to S^3$. It is a well-known fact that knots can be deformed into very differently appearing knotted structures. This capacity for transformation is called isotopy. It is easy to see that it implements an equivalence relation on the set of all knotted structures, causing the latter to partition into knot-classes.

As 3-dimensional objects, knots are rather cumbersome. We wish to reduce this problem by considering diagrams of projections. How can we be sure that it is sufficient to treat diagrams of these knotted circles? After all, isotopy might be a property of 3-space only. Luckily there has been rigorous research on this problem by Reidemeister [1932], showing that we have not lost any essential information in performing this step. Isotopy is preserved by certain elementary moves on diagrams, called Reidemeister Moves. Therewith we have reduced the 3-dimensional knotted objects to 2-dimensional diagrams.

When traversing the unit circle $S^1$ one naturally assigns a (global) orientation to it. This choice of direction is arbitrary, but permanent once chosen. However, it is rare for catenanes to be oriented in one stringent manner. The plausibility argument is that the orientation will, in the first instance, merely work as a catalyst. In a later, more refined, model we will return to this problem. ...


Section 2: MODEL I

It is natural, though somewhat simple-minded, to reduce the diagrams of the knot projections to graphs. Our first attempt will therefore apply graph theory to these EM scans. This will also enable us to get an extendable operational perspective on the problems facing us. We shall proceed in the same manner as knot theory did. In a publication of 1847 Johann Benedict Listing discussed some topological phenomena involving knots. We shall employ Listing’s ideas to get some clearer indications of what exactly is needed in tackling the knotty problems. This will aid us in identifying and analyzing the first set of problems encountered in DNA research.

A first step is to arrive at the concept of an invariant. From the planar projections of spirals we can come to the concept of handedness for an oriented crossing. It will soon be found that there are two types of crossings. Like the spirals from which they originate, these crossings are also mutually discernible by an application of the so-called corkscrew rule. To each of them one can assign a certain symbol distribution consisting of δ’s and λ’s to the pairwise diametrically separated regions, which is illustrated below.

[Figure: LEFT-HANDED CROSSING and RIGHT-HANDED CROSSING, each with its symbol distribution]
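The corkscrew rule for an oriented crossing can be sketched numerically. The snippet below is a minimal illustration under a stated assumption: it adopts the common convention that a crossing is right-handed when the over-strand direction must be turned counterclockwise to reach the under-strand direction, which may or may not coincide with the labelling used in the figure. The function name `crossing_handedness` is hypothetical.

```python
def crossing_handedness(over, under):
    """Classify an oriented crossing seen in the projection plane.

    over, under: (dx, dy) direction vectors of the over- and under-passing
    strands at the crossing. Assumed convention: a positive z-component of
    over x under means a right-handed crossing.
    """
    z = over[0] * under[1] - over[1] * under[0]
    if z == 0:
        raise ValueError("parallel strands do not form a transverse crossing")
    return "right-handed" if z > 0 else "left-handed"


# Over strand runs to the upper right, under strand to the upper left.
print(crossing_handedness((1, 1), (-1, 1)))    # right-handed
# Reversing the direction of travel on both strands changes nothing.
print(crossing_handedness((-1, -1), (1, -1)))  # right-handed
# Exchanging over and under gives the mirror crossing.
print(crossing_handedness((-1, 1), (1, 1)))    # left-handed
```

The second call shows why, for a single closed curve, the choice of orientation does not affect the handedness of any crossing: reversing the direction of travel reverses both strands at once.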

Using the symbol distributions just introduced, we next decorate all crossings in an oriented knot diagram. As stated, the orientation has a mere catalyzing function, as it will prove insignificant in this model. Every region is to have a type, which is either δ or λ, determined by the crossings of the sides bounding the region. This should also hold for the unbounded region. Depending on whether or not a mixture of δ’s and λ’s occurs per region, the diagram is called either amphitypical or monotypical. These terms will be used to tell whether a diagram is in reduced form, by which is understood that the diagram displays the minimal number of crossings. In the case of amphitypicality, when there is a mixture of types in at least one of the regions, an attempt must be made to reduce the diagram further still. There is no general algorithm for performing a crossing point reduction, but an indication of when to perform a reduction will be encountered further on.


After decorating the crossings in a diagram and considering the resulting symbol distribution in each of the diagram’s regions we are able to put forth an “invariant”. In general an invariant is a mathematical expression which carries non-altering information about a system. In this particular instance the knot is the system and we should be looking for a property depending solely on the knot or link under consideration and not on any particular picture of it. The easiest invariant to visualize, but which is not very telling, is the number of components in an n-link. By definition it equals n and remains so whatever continuous deformations the link is subjected to.

An invariant must be independent of the chosen representation. In our development so far we have opted to work from diagrams. This means that an invariant is a map $p : D \to IS$, in which $D$ denotes the set of knot projections (i.e. diagrams) and $IS$ the set of Invariance Symbols. The latter may be polynomials, numbers, etc. An ideal invariant does two things. First it must be well-defined on the knot-classes, which means that if $K_1$ and $K_2$ give rise to the two diagrams $D_1$ and $D_2$ respectively, then the following assertion should hold (with $\sim$ denoting equivalence under isotopy):

$$K_1 \sim K_2 \;\Rightarrow\; D_1 \sim D_2 \;\Rightarrow\; p(D_1) = p(D_2)$$

Secondly we would like it to be able to distinguish between different knots, i.e. the following property should hold:

$$D_1 \not\sim D_2 \;\Rightarrow\; p(D_1) \neq p(D_2)$$

The idea concealed behind the invariant concept is that if two diagrams determine links with distinct invariants then these links will have been proved distinct. However, the invariants proposed to date have not been able to detect absolute distinctness between knots. That is, the following injectivity property does not hold:

$$p(D_1) = p(D_2) \;\Rightarrow\; D_1 \sim D_2 \;\Rightarrow\; K_1 \sim K_2$$
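The number of components, mentioned above as the easiest invariant, can be read off mechanically from a diagram. The sketch below assumes the diagram is given as a planar-diagram (PD) code, an encoding not used in this paper: each crossing is a 4-tuple of arc labels in which the first and third entries belong to one strand and the second and fourth to the other. The function name and the three sample codes are illustrative assumptions.

```python
def count_components(pd_code):
    """Count the closed curves of a diagram given as a PD code.

    pd_code: iterable of 4-tuples (a, b, c, d) of arc labels; arcs a and c
    continue into each other through the crossing, as do arcs b and d.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b, c, d in pd_code:
        parent[find(a)] = find(c)  # same strand passes straight through
        parent[find(b)] = find(d)

    return len({find(x) for x in list(parent)})


one_kink_unknot = [(1, 2, 2, 1)]
trefoil = [(1, 5, 2, 4), (3, 1, 4, 6), (5, 3, 6, 2)]
hopf_link = [(4, 1, 3, 2), (2, 3, 1, 4)]

print(count_components(one_kink_unknot))  # 1
print(count_components(trefoil))          # 1
print(count_components(hopf_link))        # 2
```

The first two results coincide although the unknot and the trefoil are different knots, which is a small instance of an invariant that is well-defined but far from injective.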

In our case we obtain an “invariant” for a reduced and monotypical diagram as follows. Count the number of sides per region and make it the exponent of the symbol denoting the region’s type. Thereafter sum equal types into polynomial-like forms, called the complexions-symbol. In this particular instance $p$ maps diagrams into the set of complexions-symbols. The example below should illustrate the derivation of the “Listing Invariant” for the knot numbered $7_7$ in Reidemeister’s list.

[Figure: diagram of the knot $7_7$]

In $7_7$ there are four δ-regions, of which three are 3-sided while the unbounded region adjoins five sides. There are five λ-regions, of which two are 2-sided and two are 3-sided, while the remaining one is 4-sided. This is denoted symbolically by:

$$\left\{ \begin{array}{l} \delta^5 + 3\delta^3 \\ \lambda^4 + 2\lambda^3 + 2\lambda^2 \end{array} \right\}$$
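The counting recipe above is easy to mechanize. The following sketch, with the hypothetical function name `complexion_symbol`, takes the side counts of the regions of each type (written δ and λ here) and assembles the symbol as a string; the input for $7_7$ is transcribed from the description in the text.

```python
from collections import Counter


def complexion_symbol(delta_sides, lambda_sides):
    """Build a complexions-symbol string from the side counts of the regions.

    delta_sides, lambda_sides: one integer per region of that type, giving
    the number of sides of the region (the unbounded region included).
    """
    def poly(symbol, sides):
        counts = Counter(sides)  # exponent -> number of regions with that many sides
        terms = []
        for exponent in sorted(counts, reverse=True):
            coeff = counts[exponent]
            prefix = "" if coeff == 1 else str(coeff)
            terms.append(f"{prefix}{symbol}^{exponent}")
        return " + ".join(terms)

    return "{ " + poly("δ", delta_sides) + " ; " + poly("λ", lambda_sides) + " }"


# 7_7: three 3-sided and one 5-sided δ-region; two 2-sided, two 3-sided
# and one 4-sided λ-region.
print(complexion_symbol([3, 3, 3, 5], [2, 2, 3, 3, 4]))
# { δ^5 + 3δ^3 ; λ^4 + 2λ^3 + 2λ^2 }
```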

In general a complexions-symbol would have the form:

$$\left\{ \begin{array}{l} a_0\delta^{n} + a_1\delta^{n-1} + a_2\delta^{n-2} + \cdots + a_{n-1}\delta^{1} \\ b_0\lambda^{n} + b_1\lambda^{n-1} + b_2\lambda^{n-2} + \cdots + b_{n-1}\lambda^{1} \end{array} \right\}$$

in which $n$ is the number of crossings and the coefficients $a_i$ and $b_i$, $0 \le i \le n-1$, indicate the number of the various regions. As potential check-facilities graph theory gives that:

$$\sum_{i=0}^{n-1} (a_i + b_i) = n + 2 \quad\text{and}\quad \sum_{i=0}^{n-1} a_i\,(n-i) = \sum_{i=0}^{n-1} b_i\,(n-i) = 2n$$

These are camouflaged versions of Euler’s polyhedra formula, revealing the invariant’s graph-theoretical base. An indication of when to perform a reduction is given by the following observation: if the exponent of any term equals unity, then its corresponding region can be removed, as it will comprise a simple twist.
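The check-facilities can be tried on the $7_7$ example. The sketch below reads them as: the regions number $n + 2$ in total, and the sides of the regions of each type sum to $2n$, with $n$ the number of crossings; this reading of the formulas, and the function name `check_complexion`, are assumptions. It also flags 1-sided regions, the stated sign that a reduction is possible.

```python
def check_complexion(n, delta_sides, lambda_sides):
    """Apply the Euler-type checks to the side counts of a diagram's regions.

    n: number of crossings.
    delta_sides, lambda_sides: number of sides of each region, per type.
    """
    return {
        # total number of regions equals n + 2
        "region_count_ok": len(delta_sides) + len(lambda_sides) == n + 2,
        # the sides of each type sum to 2n
        "delta_sides_ok": sum(delta_sides) == 2 * n,
        "lambda_sides_ok": sum(lambda_sides) == 2 * n,
        # a term with exponent 1 is a simple twist that can be removed
        "reducible": 1 in delta_sides or 1 in lambda_sides,
    }


# Side counts of 7_7 as described in the text (n = 7 crossings).
print(check_complexion(7, [3, 3, 3, 5], [2, 2, 3, 3, 4]))
# {'region_count_ok': True, 'delta_sides_ok': True, 'lambda_sides_ok': True, 'reducible': False}
```

For $7_7$ the sums are 4 + 5 = 9 regions and 3·3 + 5 = 14 = 4 + 2·3 + 2·2 sides per type, so all checks pass.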


This invariant, even though it is reminiscent of a modern 2-variable polynomial, is not very powerful, nor does it cater for all knots. The invariant of the unknot U, for instance, is not defined, because there are no crossings. However, there are two far more serious defects:

1. We define parity to indicate whether the line in an oriented knot passes under or over at an encountered crossing. It can be shown that the monotypicality demand forces knots of alternating parity to ensue. The consequence of this fact is immediately felt, since there are knots which refuse to be projected into monotypic diagrams. It is well-known that a diagram can be coloured like a checkerboard, but the demands imposed by monotypicality cause the invariant to become undefined for non-alternating knots. Examples of these already occur at $8_{19}$ to $8_{21}$, and two of them are shown below.

[Figure: two of the non-alternating knots $8_{19}$ to $8_{21}$]

For the same reason this “invariant” fails to work for certain n-links in which $n \ge 2$. Such complex forms are to be found in nature, for instance in kinetoplasts [Englund et al. 1982].

2. Occasionally the “invariant” is not truly invariant. The knot $7_7$ in the version in which an inverted unbounded region type occurs, such as the example shown below, enforces the equation:

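[Figure: a diagram of $7_7$ with inverted unbounded region type]

$$\left\{ \begin{array}{l} \delta^5 + 3\delta^3 \\ \lambda^4 + 2\lambda^3 + 2\lambda^2 \end{array} \right\} = \left\{ \;\ldots\; \right\}$$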

Rather optimistically one could attempt to find possible transformations. This however is not quite necessary, as Peter Tait [1877] gave two distinct knots with equivalent “invariant”,³ terminally disrupting any hope of detecting distinctness.

³ Proc RSE 1877 p310 & footnote, also p325. [Tait p293]

Tait’s knots are given below:

[Figure: Tait’s two distinct knots with equal complexions-symbols]
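The first defect above turns on whether a diagram has alternating parity. For a one-component diagram this can be tested by walking once around the curve and recording, at each crossing met, whether the line passes over or under. The sketch below assumes this record is available as a sequence of the letters "O" and "U"; the function name and the sample sequences are illustrative, not data for any particular tabulated knot.

```python
def is_alternating(parities):
    """Tell whether a one-component diagram has alternating parity.

    parities: sequence of "O" (over) and "U" (under), one entry per crossing
    encountered during a single full traversal of the closed curve. Every
    crossing is met twice, so a valid record has even length.
    """
    k = len(parities)
    if k == 0 or k % 2 != 0:
        return False
    # Compare each entry with its cyclic successor, closing the loop at the end.
    return all(parities[i] != parities[(i + 1) % k] for i in range(k))


print(is_alternating("OUOUOU"))  # True: over and under alternate all the way round
print(is_alternating("OOUUOU"))  # False: two consecutive overpasses
```

This tests a single diagram only. A knot is non-alternating when none of its diagrams passes the test, which no finite check of one picture can establish.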

§2.1 Intermediate conclusions.

An invariant must be extractable from any diagram into which the link projects. The invariant just described must be extracted from a reduced diagram, and in certain cases it is necessary to transform it afterwards. The main problem with this invariant is how to find it from a non-reduced diagram. The Reidemeister moves do not work: they change region type or even cause it to become undefined in certain cases. In topological terminology: the proposed invariant is not isotopy-invariant. Two embeddings are isotopic when either of them can be deformed by a special kind of continuous transformation into the other. We say that embeddings $\gamma_0, \gamma_1 : S^1 \to S^3$ are isotopic if there exists a diffeomorphism $H$ such that:

$$H : S^1 \times I \to S^3 \times I, \qquad (x,t) \mapsto (h_t(x), t)$$

in which $h_t$ is an embedding for 0