Automated Drawing of Structural Molecular Formulas Under User-Defined Constraints
Patrick Fricker, Matthias Rarey
Center for Bioinformatics (ZBH) Hamburg
Problem
§ Results from virtual screening, high-throughput screening, etc. have to be presented in a convenient way.
§ One way to browse hundreds of structures quickly is with 2D structure diagrams.
§ Usually, the molecules of interest are similar to each other in some way.
§ Such relationships should become visible to the modeler.
April, 16.
Patrick Fricker
Overview
§ Structure Diagram Generation (SDG) from the connection tables of molecules
  § Diagrams have no overlapping bonds and fulfill aesthetic standards
§ User-defined constraints
  § Influence the layout
§ Automatically generated constraints
  § Feature Trees
SDG: Drawing Units
§ For easier handling, the molecule is split into components.
§ A Chain Drawing Unit (CDU) represents a sequence of acyclic bonds.
§ A Ring Drawing Unit (RDU) represents a start bond and a ring system.
SDG: Algorithm
§ Basic algorithm as found in the literature [1].
  § Create an initial Chain Drawing Unit from the longest chain in the molecule.
  § Sequentially add Drawing Units for substituents of the already drawn part and handle collisions.
  § Choose the overall diagram orientation.
  § Add atom and bond labels.
§ It is easy to preserve the layout of the core fragment for compounds of a combinatorial library.
[1] Helson, H. E., Structure Diagram Generation, in Reviews in Computational Chemistry, 1999, p.313-398
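The first step of the algorithm, picking the longest chain, can be illustrated for the acyclic case. This is a minimal sketch and not the implementation behind the slides: it assumes the molecule has no rings and is given as a plain adjacency dictionary, and the names `farthest_path`, `longest_chain` and `skeleton` are hypothetical. For a tree, two breadth-first searches are enough to find a longest path.

```python
from collections import deque

def farthest_path(adjacency, start):
    """Return the BFS path from start to an atom at maximal distance."""
    parent = {start: None}
    queue = deque([start])
    last = start
    while queue:
        last = queue.popleft()  # the last atom dequeued is farthest from start
        for neighbor in adjacency[last]:
            if neighbor not in parent:
                parent[neighbor] = last
                queue.append(neighbor)
    path = []
    node = last
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]

def longest_chain(adjacency):
    """Longest path in a tree: BFS from any atom, then BFS from the far end."""
    any_atom = next(iter(adjacency))
    far_end = farthest_path(adjacency, any_atom)[-1]
    return farthest_path(adjacency, far_end)

# Heavy-atom skeleton of 2-methylpentane (assumed example): C0-C1(-C5)-C2-C3-C4
skeleton = {0: [1], 1: [0, 2, 5], 2: [1, 3], 3: [2, 4], 4: [3], 5: [1]}
chain = longest_chain(skeleton)
```

Molecules with ring systems would need the rings collapsed into single nodes first, which this sketch does not attempt.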
SDG: Collisions
§ Detection via a plane sweep
§ Collision handling:
  a) Rotating Drawing Units
  b) Exchanging substituents
  c) Alternating angle pattern
  d) Shortening terminal bonds
  e) Rotating within Units
  f) Stretching bonds (optional)
  • No distortion of angular patterns
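Collision detection by sweeping over the diagram can be sketched as follows. This is a simplified sweep under stated assumptions, not the routine from the talk: bonds are straight segments between 2D atom coordinates, only proper crossings count as collisions, and the names `find_collisions`, `coords` and `bonds` are hypothetical. Bonds are visited in order of their left end, and only bonds whose x-ranges still overlap are tested against each other.

```python
def orientation(p, q, r):
    """Sign of the cross product (q - p) x (r - p)."""
    value = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (value > 0) - (value < 0)

def segments_cross(a, b, c, d):
    """True if segments ab and cd properly cross (touching cases are ignored)."""
    return (orientation(a, b, c) * orientation(a, b, d) < 0
            and orientation(c, d, a) * orientation(c, d, b) < 0)

def find_collisions(coords, bonds):
    """Sweep over x: only bonds with overlapping x-ranges are tested pairwise."""
    def x_range(bond):
        xs = (coords[bond[0]][0], coords[bond[1]][0])
        return min(xs), max(xs)

    collisions = []
    active = []  # bonds whose x-range may still overlap the sweep position
    for bond in sorted(bonds, key=lambda b: x_range(b)[0]):
        left = x_range(bond)[0]
        active = [other for other in active if x_range(other)[1] >= left]
        for other in active:
            if set(bond) & set(other):
                continue  # bonds sharing an atom meet by construction
            if segments_cross(coords[bond[0]], coords[bond[1]],
                              coords[other[0]], coords[other[1]]):
                collisions.append((other, bond))
        active.append(bond)
    return collisions

# Assumed toy layout: bonds (0, 1) and (2, 3) cross, bond (3, 4) is harmless.
coords = {0: (0.0, 0.0), 1: (2.0, 2.0), 2: (0.0, 2.0), 3: (2.0, 0.0), 4: (3.0, 0.0)}
bonds = [(0, 1), (2, 3), (3, 4)]
crossing = find_collisions(coords, bonds)
```

A full implementation would also have to detect overlapping atom labels and atoms lying too close to foreign bonds; the sketch covers bond crossings only.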
Example: NCI
                                           Count  Percentage
Total in database                         122473      100.00
No salts, no single atoms                 121972       99.59
Failures:
  Ring topology too complex                 2748        2.24
  Collisions                                3814        3.11
  Coordination number larger than five       243        0.20
  Other                                       45        0.04
  Failures total                            6850        5.59
Runtime: 2 ms per compound
Example: Ugi-Reaction
§ Compounds from an Ugi-based library
User-Defined Constraints
§ Each bond may be marked with a directional constraint.
§ The generated structure diagram should fulfill both chemical standards and those restrictions.
§ Basic idea:
  § Find rules to describe chemical standards.
  § Calculate the set of structure diagrams fulfilling all constraints.
  § Choose one.
Directions
§ Eight “logical” directions for Drawing Units.
§ The direction of a bond is the direction of the corresponding chain.
§ Additional information on the zigzag orientation of each bond is necessary.
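How a logical direction and a zigzag orientation together fix the layout of a chain can be sketched as below. Several points are assumptions for illustration only: the eight directions are taken to be compass directions in 45° steps, the bonds are taken to alternate by ±30° around the chain direction, and z-left is taken to mean that the first bond turns counterclockwise. The names `Direction`, `Zigzag` and `chain_coordinates` are hypothetical.

```python
import math
from enum import Enum

class Direction(Enum):
    """Eight logical directions; the 45-degree spacing is an assumption."""
    EAST = 0
    NORTH_EAST = 1
    NORTH = 2
    NORTH_WEST = 3
    WEST = 4
    SOUTH_WEST = 5
    SOUTH = 6
    SOUTH_EAST = 7

    def angle(self):
        return self.value * 45.0

class Zigzag(Enum):
    Z_LEFT = "z-left"    # assumed: first bond turns counterclockwise
    Z_RIGHT = "z-right"  # assumed: first bond turns clockwise

def chain_coordinates(n_atoms, direction, zigzag, bond_length=1.0):
    """Place the atoms of a zigzag chain that advances along direction."""
    sign = 1 if zigzag is Zigzag.Z_LEFT else -1
    x, y = 0.0, 0.0
    points = [(x, y)]
    for _ in range(n_atoms - 1):
        angle = math.radians(direction.angle() + sign * 30.0)
        x += bond_length * math.cos(angle)
        y += bond_length * math.sin(angle)
        points.append((x, y))
        sign = -sign  # alternate the turn to produce the zigzag
    return points

butane = chain_coordinates(4, Direction.EAST, Zigzag.Z_LEFT)
```

The same chain with `Zigzag.Z_RIGHT` is the mirror image about the chain axis, which is why the direction alone does not determine the drawing.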
Local Rules
§ Rules define how the outgoing bonds may be placed.
§ Such rules can easily be defined for acyclic bonds.
§ For ring systems, the rules are taken from the corresponding drawing. Each outgoing bond may be z-right or z-left.
Algorithm
§ Create the so-called Direction Tree from the molecule graph.
§ Find all valid combinations of drawing rules.
§ Choose a drawing by sequentially selecting the longest chains that can be drawn in the same direction (respecting the rules).
§ Use the normal drawing routine for the rest.
Finding Valid Combinations
§ Traverse the Direction Tree. For each node expansion, three steps are needed:
  § Calculate rules for this node and its children.
  § Restrict orientations of the children according to the user constraints.
  § Ensure that the already visited subtree contains a consistent set of drawing rules.
§ After the traversal, we have the set of rule combinations fulfilling the user constraints (collisions are not yet considered).
Keeping The Tree Valid (I)
§ Case 1: Orientations were removed in the node (tighten_down).
  § Rules for the node and the children may be invalid. Those are removed.
  § Only orientations of children may be invalid. Remove invalid ones. Use tighten_down for the children.
Keeping The Tree Valid (II)
§ Case 2: Orientations were removed in a child (“black”) (tighten_up).
  § Rules of the node may be invalid. They are removed.
  § Orientations in children may have become invalid (but not in “black”!). Remove them and use tighten_down.
  § Orientations in this node may have become invalid. Remove them and use tighten_up for the parent node.
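The two cases can be sketched as a pair of mutually recursive pruning routines. The data model here is an assumption made for illustration: each node holds a set of allowed orientations, and each rule of a node is a pair of the node's own orientation and one orientation per child. Only the names tighten_down and tighten_up come from the slides; `Node`, `prune_rules` and `restrict_children` are hypothetical.

```python
class Node:
    def __init__(self, name, orientations, parent=None):
        self.name = name
        self.orientations = set(orientations)
        self.parent = parent
        self.children = []
        self.rules = []  # each rule: (own orientation, tuple of child orientations)
        if parent is not None:
            parent.children.append(self)

def prune_rules(node):
    """Drop rules using an orientation no longer allowed for node or a child."""
    node.rules = [
        (own, kids) for own, kids in node.rules
        if own in node.orientations
        and all(k in c.orientations for k, c in zip(kids, node.children))
    ]

def restrict_children(node, skip=None):
    """Remove child orientations that no surviving rule supports."""
    for index, child in enumerate(node.children):
        if child is skip:
            continue
        supported = {kids[index] for _, kids in node.rules}
        if not child.orientations <= supported:
            child.orientations &= supported
            tighten_down(child)

def tighten_down(node):
    """Case 1: orientations were removed in node itself."""
    prune_rules(node)
    restrict_children(node)

def tighten_up(node, changed_child):
    """Case 2: orientations were removed in changed_child (the 'black' child)."""
    prune_rules(node)
    restrict_children(node, skip=changed_child)
    supported = {own for own, _ in node.rules}
    if not node.orientations <= supported:
        node.orientations &= supported
        if node.parent is not None:
            tighten_up(node.parent, node)

# Assumed toy tree: the two substituents of the root must point to different sides.
root = Node("root", {"E"})
a = Node("a", {"NE", "SE"}, parent=root)
b = Node("b", {"NE", "SE"}, parent=root)
root.rules = [("E", ("NE", "SE")), ("E", ("SE", "NE"))]

a.orientations = {"NE"}  # a user constraint removes an orientation in child a
tighten_up(root, a)
```

After the call, child b is left with the single orientation "SE" and the root keeps one rule. An empty orientation set anywhere in the tree would indicate that no rule combination satisfies the constraints.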
The Optimization Problem
§ Apart from collisions, the algorithm always finds a valid drawing if one exists. And what if none exists?
§ Find a drawing with a maximal number of user constraints fulfilled
  § Weights for the directional constraints
§ So far, only solved heuristically
  § The order of the expansion in the Direction Tree is chosen according to node weights
  § Inconsistent constraints are modified or ignored
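One simple way to realize "heaviest first, ignore what does not fit" is a greedy pass over the weighted constraints. This is a sketch of a generic greedy heuristic and not the heuristic used in the talk. The consistency test is deliberately reduced to a single condition taken from the Directions slide, namely that all bonds of a chain share the chain's direction. The names `select_constraints`, `chain_of` and `constraints` are hypothetical.

```python
def select_constraints(constraints, chain_of):
    """Greedy: heaviest constraints first; ignore any that contradicts an accepted one."""
    chain_direction = {}
    accepted, ignored = [], []
    for bond, direction, weight in sorted(constraints, key=lambda c: -c[2]):
        chain = chain_of[bond]
        # The first constraint seen for a chain fixes that chain's direction.
        if chain_direction.setdefault(chain, direction) == direction:
            accepted.append((bond, direction, weight))
        else:
            ignored.append((bond, direction, weight))
    return accepted, ignored

# Assumed toy input: bonds b1 and b2 lie in the same chain but ask for different directions.
chain_of = {"b1": "chainA", "b2": "chainA", "b3": "chainB"}
constraints = [("b1", "E", 1.0), ("b2", "N", 3.0), ("b3", "SE", 2.0)]
accepted, ignored = select_constraints(constraints, chain_of)
```

Here the constraint on b1 is ignored because the heavier constraint on b2 has already fixed the direction of their common chain. A greedy pass like this gives no optimality guarantee in general.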
Example: Manually Imposed
Mayer, D.; Naylor, C. B.; Motoc, I. and G.R. Marshall, A unique geometry of the active site of angiotensin-converting enzyme consistent with structure-activity studies. JCAMD, 1987. 1: p. 3-16
Application: Feature Trees
§ “Feature Tree” pairwise similarity measure (based on tree matching of physico-chemical properties)
§ Constraints are obtained from a template molecule based on bond paths between and within Feature Tree matches
Example: Feature Tree Imposed
Mutschler, E., G. Geisslinger, H.K. Kroemer, and M. Schäfer-Korting, Mutschler Arzneimittelwirkungen. 8 ed. 2001
Example: Feature Tree Imposed (II)
Mutschler, E., G. Geisslinger, H.K. Kroemer, and M. Schäfer-Korting, Mutschler Arzneimittelwirkungen. 8 ed. 2001
Summary
§ For structure diagram generation, Drawing Units are used to obtain
  § a unified handling of cyclic and acyclic bonds in collision handling and
  § a good heuristic for collision handling that preserves straight chains.
§ With the user-defined constraints, programs can influence the layout, e.g. to generate similar diagrams for similar molecules.
§ The user-defined constraints may be imposed automatically by software tools. As a proof of concept, Feature Trees were used.
P. Fricker, M. Gastreich, M. Rarey; Automated Drawing of Structural Molecular Formulas under Constraints; JCICS, 2004, 44, 1065-1078
Acknowledgment
§ Thomas Lengauer, MPI Saarbrücken
§ Marcus Gastreich, BioSolveIT St. Augustin
§ Hugo Kubinyi, University of Heidelberg