LOGICAL DEPTH AS A MEASURE OF BIOLOGICAL ORGANIZATION AND ITS RELATION TO ALGORITHMIC COMPLEXITY

J. D. COLLIER
Department of Philosophy, University of Newcastle, Callaghan, NSW 2308, Australia

Although the complexity of biological systems and subsystems like DNA and various transcription and translation pathways is of interest in itself, organization is of fundamental importance to understanding biological systems. It would be convenient to have a general definition of organization applicable to biological systems. I propose that C.H. Bennett’s notion of logical depth is a suitable candidate. I discuss the problems with using complexity measures alone, and then the relations between logical depth and algorithmic complexity. Last, I give some examples in which depth gives a better measure of what might naively be taken to be complexity in biological systems by many biologists, and then argue that this must be augmented by consideration of dynamical processes.

1. Algorithmic Complexity vs. Logical Depth

1.1 Biological Complexity

When biologists refer to “complexity”, they generally mean the nature of interactions within biological systems rather than simply the measure of information required to specify the system, such as Kolmogorov complexity, or the complexity as determined by MML (minimum message length) or MDL (minimum description length) methods.1,2,3 Biological interactions are characterized by their functionality, which requires feedback and, in more advanced systems, anticipation. Thus spatial and temporal closure of biological processes and overall interconnectedness imply organization. Function itself is often characterized as a consequence of adaptation,4,5 but it is better characterized by its role in maintaining system autonomy, itself an organizational trait.6 A system is (relatively) autonomous if its functional and interaction closure works so as to actively maintain system integrity against internal and external changes. Biological complexity is thus organized complexity, and we need some measure of organization.

Complexity methods start with an isomorphic mapping of features onto a string that is then amenable to MML or MDL methods of compression to determine overall complexity. A simple case in which the resulting measure might be misleading is found in eukaryotes, in which the “reading frame”, a segment of DNA to be transcribed to mRNA, is controlled by transcription factors that can be influenced by distant regulatory regions. The complexity of the reading frame gives information about the complexity of the mRNA transcript, but the regulatory function is not purely local, unlike in prokaryotes, so the complexity of the local regulatory factors will give a misleading estimate of the complexity of the regulatory factors as a whole. The organization as well as the complexity of the genome is relevant to understanding gene regulation.

A further complication is the editing phase of mRNA, in which introns are deleted and the mRNA is modified to produce the final functional mRNA transcript. The introns are typically random, or at least chancy, and so increase the complexity of the reading frame of the DNA without contributing to function. Furthermore, in many cases polypeptide chains formed from more than one frame, often from disparate parts of the genome, fold together to form functional proteins, again leading to a divergence between local complexity measures of DNA and their mRNA transcripts and the complexity of functional proteins. Each element can be studied independently by complexity measures, but the overall organization is likely to be overlooked by these methods alone. I will give some further examples later. I should note now, though, that translation itself is not a simple process, in either the complexity or the organizational sense.
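As a rough illustration of the string-based complexity measures just described, the sketch below uses the length of a zlib-compressed string as a crude, computable stand-in for algorithmic complexity. This is an assumption made for illustration only: zlib is a general-purpose compressor, not an MML or MDL method, and its output length only bounds the true complexity from above. The toy sequences and the function name `compressed_fraction` are hypothetical.

```python
import random
import zlib


def compressed_fraction(sequence: str) -> float:
    """Return compressed size divided by raw size (lower means more redundancy)."""
    raw = sequence.encode("ascii")
    return len(zlib.compress(raw, 9)) / len(raw)


# A highly repetitive toy "DNA" string: one short motif repeated many times.
repetitive = "ACGT" * 1000

# A pseudo-random toy string over the same alphabet (fixed seed for repeatability).
rng = random.Random(0)
random_like = "".join(rng.choice("ACGT") for _ in range(4000))

print(f"repetitive : {compressed_fraction(repetitive):.3f}")
print(f"random-like: {compressed_fraction(random_like):.3f}")
```

The repetitive string compresses to a small fraction of its length, while the random-like string cannot be squeezed much below the two bits per base that a four-letter alphabet requires. Neither figure says anything about how a sequence functions in the cell, which is the limitation the discussion above points to.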
The connection to phenotypic traits is even more complicated in most cases, with the notable exception of some genetic defects. As I will show later, however, even genetic defects are not organizationally simple in at least some cases. The relevant complexity for biological systems is organized complexity, not complexity simpliciter.

1.2 Logical Depth

Logical depth was developed by C.H. Bennett, who hypothesized that it was a suitable measure of organization.7 Formally, logical depth is a measure of the minimal computation time (in number of computational steps) required to compute an uncompressed string from its maximally compressed form. Some adjustments are required to the definition to get a reasonable value of depth for finite strings. We want to rule out cases in which the most compressed program to produce a string is slow, but a slightly longer program can produce the string much more quickly. To accommodate this problem, the depth is defined relative to a significance level s, so that the depth of a string at significance level s is the least time required to compute the string by any program no more than s bits longer than the minimal program. Physically, the logical depth of a system places a lower limit on how quickly the system can form from disassembled resources.

As a measure of the organization in a system, however, depth has a drawback. Adding more components to a system will not increase the system's organization, only the size of the system organized, but it will increase its depth, because the sheer length of the sequence to be computed has increased. All sequences of n identical entries are intuitively equally trivial; however, the depth of each string depends on the depth of n itself. This effect can be made negligible if we consider only relative depth: the depth of a sequence relative to the depth of the length of the sequence. The relative depth itself of a sequence of n identical entries is no more than the depth required to specify the entry itself (and negligible if the entry is 0 or 1). In the case of adding identical components to a system, the relative depth does not increase, since the depth of a component is already included in the relative depth of the original system. It is not transparent whether relative depth deals satisfactorily with all possible cases of this kind, but it is a reasonable, and plausibly sufficient, refinement of logical depth simpliciter to adopt.

1.3 Relation of Logical Depth to Complexity

The first thing to note is that logical depth implies redundancy. A maximally complex string is not compressible, and so necessarily has a computation time equivalent to that of copying out its uncompressed form. Secondly, logical depth cannot be produced by low order redundancy, where redundancy order is ½ the minimal length required to detect the redundancy. Simple repetition, for example, requires only a relatively short computation time and has low relative logical depth. Rather, relative logical depth requires higher order redundancy, and relatively greater depth requires relatively higher order redundancy. MML and MDL methods are well adapted to detecting redundancies of various orders.2

It would be convenient if relative logical depth were implied by higher order redundancy, but this is not generally the case. For example, a sequence made up of repetitions of an incompressible string of length n will have redundancy order n, but will not require a long computation time from the compressed form of the complete string. Therefore, higher order redundancy can be taken only as a sign of possible organization, worth further investigation. This example, however, implies that (relative) logically deep strings will show redundancy of both higher and lower orders, the discovery of which would be further evidence for organization.

Unfortunately, there has been little work on minimal computation time for computing the uncompressed form from the compressed form of a string. Most of the available work deals with the relative computational difficulty of classes of problems rather than of particular problems. Nonetheless, this work can give some indication of the class of organization involved in a particular problem.
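The contrast between low order and higher order redundancy, and the caveat that higher order redundancy alone does not imply depth, can be sketched with a general-purpose compressor. The example below is an illustration under stated assumptions, not a measurement of logical depth: it reads the example in this section as an incompressible block of length n repeated several times, uses pseudo-random bytes as a stand-in for an incompressible block, and uses zlib-compressed length as a rough proxy for compressibility. All names are hypothetical.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed form (a crude compressibility proxy)."""
    return len(zlib.compress(data, 9))


rng = random.Random(1)
n = 1000       # block length, i.e. the scale at which the redundancy appears
copies = 20

# Stand-in for an incompressible string of length n (assumption: pseudo-random bytes).
block = bytes(rng.getrandbits(8) for _ in range(n))

low_order = b"\x00" * (n * copies)    # simple repetition of a single symbol
high_order = block * copies           # redundancy visible only at scale n
no_redundancy = bytes(rng.getrandbits(8) for _ in range(n * copies))

for name, data in [("low order", low_order),
                   ("high order", high_order),
                   ("none", no_redundancy)]:
    print(f"{name:10s} raw={len(data):6d} compressed={compressed_size(data):6d}")
```

The repeated block compresses to little more than the size of one copy, so it is redundant at the scale of n, yet expanding it again is only a matter of copying. That is why such a string shows higher order redundancy without being deep.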

Relative logical depth promises to capture the core of the intuitive sense of organization discussed in the first part of this section. Crystals are much easier to produce than are computers and, within the latter, otherwise comparable memory chips are much easier to produce than CPUs, which have a much more complicated organization (as well as being more complex in the algorithmic sense). However, I do not know of a theorem that requires a relation between ease of production and depth. I will assume that higher redundancy order computationally associated with lower order redundancy indicates higher logical depth, other things being equal.

Whether or not organization requires anything else is somewhat unclear at present. It is, for example, unclear to me whether relative logical depth can be used to satisfactorily distinguish the kind of deep organization exhibited by a computer running a complex coherent program from that exhibited by a living cell, which is autonomous, defends an internal/external phase separation through a controlling boundary membrane, and is self-reproducing, all organizational features the computer lacks.

A further complication is that biological systems are organized hierarchically through causal closure conditions (a paradigmatic example is the cohesion of species as well as their members, but similar considerations apply to cells and the organisms they comprise).8 Consider two systems constituted of the same basic components and with equal depth, the first a single-level system with subtly correlated parts so that they mutually regulate and some control others, and the second a multi-level hierarchy with internally simple levels but cross-level correlations, none of which are control relations. If the notion of organization is fundamentally concerned with just the extent of coordination in the sense of depth, then ex hypothesi these two systems would have the same measure of organization.
If, on the other hand, the notion of organization also includes reference to the degree to which correlations are hierarchically organized by levels, as it will for many, then the second system would have the greater organization. Secondly, there is the question of whether ordering of regulation or control should play any intrinsic role in the concept of organization. If so, then the first system will be judged to possess more organization than the second. Finally, these two intuitions are often joined to emphasize the importance of hierarchical order of regulation or control to organization. In this case a system with coherent interactions at a high level used to control lower levels will be judged to possess more organization than the first, since the first system can show only first order hierarchical regulation or control while the second system can show higher order hierarchical regulation and/or control. Perhaps we need to introduce concepts of ‘hierarchical organization’, ‘regulatory (control) organization’ and ‘hierarchical regulatory (control) organization’ in addition to organization simpliciter. Further development of this would take me too far afield, but it is worth noting that the nature of the interactions in a system, rather than its logical structure alone, is of some importance to understanding and classifying the system as “organized”.

2. Some Examples of Organization that Seem to Defy Complexity Analysis

Molecular genes are segments of DNA that can be transcribed into functional mRNA or translated into polypeptides that are either functional themselves or can be enfolded with other polypeptides to form functional proteins. In prokaryotes, the regulatory DNA is adjacent to the transcribed DNA, and the whole can be regarded as a functional molecular gene. As described above, however, in eukaryotes regulatory factors can arise at some distance from the transcribed “reading frame”; the exact mechanism is at present unknown. In order to define a functional molecular gene for eukaryotes, then, a more global perspective is required to determine the computational relations that underlie the gene.

It should be noted that replicators need not include complete functional genes, nor need they contain only complete functional genes. This means that even for molecular reductionists the unit of replication (inheritance) is not a functional gene. In order to understand inherited traits, it is necessary to understand how the DNA interacts with itself and the cell metabolism to produce phenotypic traits. This is a problem of organization, not of complexity. The information transferred is organizational information, not the information of complexity theory. This suggests that the unit of inheritance is a trait, not a molecular gene.

It is well known that the fitness of Mendelian (or evolutionary) alleles is highly context dependent. E.O. Wilson describes a trait as genetically determined if a genetic difference makes a difference to the trait in some environment.9 For example, eye color is genetically determined because there are environments in which genetic differences make a difference to eye color, even though the environment and other factors can affect eye color. This definition of genetic determination has limited interest to molecular geneticists, since a genetically determined trait need not depend on a single stretch of DNA, and may have a large environmental influence. However, it is of interest to evolutionary biologists because genes are selected because of the phenotypic traits they express. Typically, for complex traits, a number of segments of DNA, perhaps not even on the same chromosome, may be involved. Wilson’s definition of genetic determinism is not entirely satisfactory even for his own purposes (not only because it diverges from popular conceptions of genetic determinism, however useful it is for evolutionary biology), since it allows any environment.
It would be better to restrict the environments to the evolutionarily effective ones, and to consider a norm of reaction across these environments to determine the strength of genetic determination for a particular trait across a range of environments. It may well turn out that for a particular trait the genetic determination (difference attributable to genetic causes) is low in one environment and high in another. Intelligence (or more operationally, IQ), for example, may be such a trait.

As I mentioned above, genetic defects are more likely to be traceable to individual segments of DNA, but even this is not a simple matter. Consider, for example, the evolutionary gene for myotonic dystrophy. What distinguishes sufferers of this genetic disease at the molecular level is that they possess any one of a number of sequences of between 50 and 200 repeats, whereas people who are normal in this respect have any of several sequences of between 5 and 27 repeats. In other words, there is a threshold effect beyond which the translated proteins are sufficient in number to cause the disease. This cannot be determined by genetics alone, at least for the local regions, and requires understanding the metabolic processes. A full genetic understanding requires considering the genetic basis for the organization of these processes as well as the specific threshold level. This likely involves non-additive (non-linear) interactions among a number of molecular genes through their products. This organization should show up in higher order redundancies in the genome involving the relevant repeated sequences, but whether or not this would be a productive line of research is questionable.

Many normal traits in most species show considerable variability across a species. The evolutionary definition of genetic determination cannot be isolated from environmental interaction; as noted above, there is a norm of reaction. This is in line with the requirement of interaction closure for autonomy and the possibility of functionality. This presents further problems for complexity analysis, since environmental interactions must be taken into consideration.
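Returning to the myotonic dystrophy example, the threshold effect can be sketched as a count of consecutive repeats compared against the two ranges given in the text (5 to 27 repeats in unaffected people, 50 to 200 in sufferers). The repeated unit is not named in the text; the sketch takes it as a parameter and uses the trinucleotide CTG in the usage lines purely as an assumed placeholder. The function names, labels, and toy sequences are hypothetical, and counts that fall in neither range are reported as such rather than guessed at.

```python
import re


def longest_tandem_run(sequence: str, motif: str) -> int:
    """Return the largest number of back-to-back copies of `motif` in `sequence`."""
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence)
    return max((len(run) // len(motif) for run in runs), default=0)


def classify_repeat_count(count: int) -> str:
    """Label a repeat count using only the two ranges quoted in the text."""
    if 5 <= count <= 27:
        return "normal range"
    if 50 <= count <= 200:
        return "disease range"
    return "outside both quoted ranges"


# Toy sequences: flanking bases are arbitrary; the motif is an assumed placeholder.
motif = "CTG"
samples = {
    "sample_a": "AAGT" + motif * 12 + "GGAC",
    "sample_b": "AAGT" + motif * 80 + "GGAC",
    "sample_c": "AAGT" + motif * 35 + "GGAC",
}

for name, seq in samples.items():
    count = longest_tandem_run(seq, motif)
    print(name, count, classify_repeat_count(count))
```

A purely local count of this kind says nothing about why the threshold lies where it does. As argued above, that depends on the organization of the metabolic processes involved.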
It is possible to compare the complexity of the environment with the complexity of the functional parts of the genome to get some idea of the complexity of adaptation.10 Unfortunately, this correlation tells only a small part of the story. Again, a major factor is the organization of environmental interaction and adaptive strategies, which again involve logical depth. A deep strategy is not necessarily more flexible, though it will involve higher order redundancy. It may represent a deeply entrenched and complicated relationship between an organism and a particular environment to which it has specialized. If the depth is associated with higher cohesive levels, however, it is more likely that the depth is associated with adaptability and less specialization.

Similar considerations apply to ecological resiliency. Mutual complexity tells us something about degree of adaptation, but depth can tell us more, and the type of depth can tell us about the kind of adaptive strategy in use. Similarly, ecological complexity, especially the sum of mutual complexity of interactions, can tell us a good deal about the nature of an ecological system, but this may indicate a brittle ecological community or one with considerable resiliency, depending on how the interactions are organized.11 In this case, too great a depth may indicate an ecological community that has lost its resilience, much as a specialized adaptation is not an effective strategy in a varying environment. Organization as well as complexity is required to understand such systems, but as pointed out in the previous section, the type of organization is also relevant. Complexity and organization studies can place limits on the possibilities for biological theories of specific systems, but the dynamical character of the system processes is required for full understanding.

3. Conclusions

Biological complexity is organized complexity. Even relatively basic and supposedly simple biological subsystems like DNA, when regarded as functional, show complex organization. To some extent biological organization can be understood in terms of logical depth. Depth, in turn, can be recognized, but not reliably, through higher order redundancy bound together with lower order redundancies.
Algorithmic complexity places limits on possible biological processes, and the logical depth of a biological system or subsystem places further limits. However, even these constraints leave open a wide range of strategies. Full understanding requires knowing the dynamical character of the processes of the system.

Acknowledgments

I would like to thank Professor C.A. Hooker, who supported this work by granting me a Research Associateship under his Australian Research Council Large Grant. Several of the ideas in this paper were developed together with him and Wayne Christensen. Several of the examples I used were derived from a talk by Paul Griffiths based on his unpublished work.

References

1. C.S. Wallace and D.M. Boulton, “An information measure for classification”, Computer Journal 11, 185-195 (1968).
2. J.J. Rissanen, Stochastic Complexity and Statistical Inquiry, World Scientific Publishers (1989).
3. M. Li and P.M.B. Vitányi, An Introduction to Kolmogorov Complexity and its Applications, New York, Springer-Verlag (1993).
4. K. Neander, “Functions as selected effects: The conceptual analyst’s defense”, Philosophy of Science 58, 168-184 (1991).
5. R.G. Millikan, “In defense of proper functions”, Philosophy of Science 56, 288-302 (1989).
6. W.D. Christensen, J.D. Collier and C.A. Hooker, “Adaptiveness and adaptation: A new autonomy-theoretic analysis”, unpublished manuscript.
7. C.H. Bennett, “Dissipation, information, computational complexity and the definition of organization”, in D. Pines, Ed., Emerging Syntheses in Science: Proceedings of the Founding Workshops of the Santa Fe Institute, Redwood City, Calif., Addison-Wesley Publishing Company (1985).
8. J.D. Collier and C.A. Hooker, “Complexly organized dynamical systems”, unpublished manuscript.
9. E.O. Wilson, On Human Nature, Cambridge, MA, Harvard University Press (1978).
10. J.D. Collier, “Information increase in biological systems: how does adaptation fit?”, in G. van der Vijver, S.N. Salthe and M. Delpos, Eds., Evolutionary Systems, Dordrecht, Reidel (1998).
11. R.E. Ulanowicz, Growth and Development: Ecosystems Phenomenology, New York, Springer-Verlag (1986).
