PNAS PLUS

Metagenomic natural product discovery in lichen provides evidence for a family of biosynthetic pathways in diverse symbioses

Annette Kampa(a,1), Andrey N. Gagunashvili(b,1), Tobias A. M. Gulder(a), Brandon I. Morinaka(a), Cristina Daolio(c), Markus Godejohann(c), Vivian P. W. Miao(d), Jörn Piel(a,e,2), and Ólafur S. Andrésson(b,2)

(a) Kekule Institute of Organic Chemistry and Biochemistry, University of Bonn, 53121 Bonn, Germany; (b) Faculty of Life and Environmental Sciences, University of Iceland, 101 Reykjavik, Iceland; (c) Bruker BioSpin GmbH, 76287 Rheinstetten, Germany; (d) Department of Microbiology and Immunology, University of British Columbia, V6T 1Z3 Vancouver, Canada; and (e) Institute of Microbiology, Eidgenössische Technische Hochschule Zurich, 8093 Zurich, Switzerland

Edited by Nancy A. Moran, Yale University, West Haven, CT, and approved June 18, 2013 (received for review March 27, 2013)

of mycobiont lectin genes (6), and PCR-based phylogenetics in investigation of intrathalline bacterial diversity (7). In a number of bacterial–eukaryote symbioses, bacterial partners have been implicated in the production of complex molecules derived from polyketide synthase (PKS) and nonribosomal peptide synthetase (NRPS) pathways (3, 8, 9). Examples include pederin, made by bacteria that live in rove beetles of the genus Paederus, and structurally related metabolites, the onnamides and psymberin, produced by bacteria that live in marine sponges (Fig. 2). In general, metabolites known or suspected to be of symbiont origin show remarkably low structural overlap with natural products discovered in screening programs from free-living bacteria (10). This phenomenon raises the intriguing question of whether symbiont chemistry might encompass structural scaffolds covering distinctive regions of chemical space. In this study, we applied a combination of metagenomic and natural product discovery methods to identify nosperin, the first

biosynthesis | Peltigera membranacea | trans-acyltransferase polyketide synthase | 13C nuclear magnetic resonance

Symbiosis, defined by de Bary (1) as the “living together of two organisms,” includes a broad range of partnerships, from loose associations to obligate interdependencies and host–parasite interactions. Many involve microbes, with perhaps the most successful, between bacteria and early nucleated cells in the Precambrian, leading to mitochondria and chloroplasts in modern eukaryotes (2). Symbiotic interactions are being examined with increasing molecular detail, focusing not only on attributes that may be beneficial for each organism individually but also on what might be important for the association. It is increasingly being recognized that biosynthetic pathways leading to synthesis of specialized metabolites may play key roles in the biology of symbiosis (3). Lichens are ancient and physiologically highly integrated symbioses between heterotrophic filamentous fungi (mycobionts) and cyanobacteria or coccoidal green algae (photobionts) that may date as far back as 600 Mya (4). The morphology of the characteristic and stable macroscopic body of a lichen, the thallus, typically bears little resemblance to the individual organisms that form it and, in many cases, can be highly organized: fungal cells on the periphery for physical support and protection and photobiont cells inside, providing photosynthate or fixed nitrogen or both (5) (Fig. 1 A–C). Although the photobionts can often be isolated in pure culture (Fig. 1D), most mycobionts (almost exclusively from the Ascomycota) are refractory to propagation in vitro by standard methods, and intact lichens cannot be maintained artificially for long. Nevertheless, such limitations are gradually being overcome using advanced analytical platforms, e.g., metagenomics in the characterization

Significance

Remarkable chemical families are being recognized by studying diverse symbioses. We identified, through metagenomics, the first cyanobacterial trans-AT polyketide biosynthetic pathway in the Nostoc symbiont of the lichen Peltigera membranacea and showed its expression in natural thalli. An isotope-based technique designed for characterizing minute amounts of material confirmed predictions that its product, nosperin, is a distinct member of the pederin family of compounds that was previously thought exclusive to animal–bacteria associations. The unexpected discovery of nosperin in lichen expands the structural range and known distribution of this family of natural products and suggests a role associated with symbiosis.

www.pnas.org/cgi/doi/10.1073/pnas.1305867110

Author contributions: A.N.G. and Ó.S.A. initiated project; A.K., A.N.G., V.P.W.M., J.P., and Ó.S.A. designed research; A.K., A.N.G., B.I.M., V.P.W.M., J.P., and Ó.S.A. performed research; A.N.G. carried out bioinformatic analyses of WGS, isolated Nostoc strains and conducted gene expression studies; A.K. performed feeding studies and compound isolations; A.K., T.A.M.G., B.I.M., C.D., and M.G. performed metabolic analyses and elucidated structure; J.P. analyzed the trans-AT PKS genes and performed metabolic prediction; C.D. and M.G. contributed new reagents/analytic tools; A.N.G. and V.P.W.M. examined distribution of gene cluster; A.K., A.N.G., T.A.M.G., B.I.M., C.D., M.G., V.P.W.M., J.P., and Ó.S.A. analyzed data; A.K., A.N.G., T.A.M.G., V.P.W.M., J.P., and Ó.S.A. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

Data deposition: The sequences reported in this paper have been deposited in the GenBank database [accession nos. GQ979609 (nsp gene cluster), JQ975876 (second trans-AT gene cluster), GU591312 (nostopeptolide-like gene cluster), JX181775 (P. membranacea WGS Nostoc rRNA genes), KC489223 (heterocyst glycolipid gene cluster), KC291407 (rbcLXS operon), and JX975209 (Nostoc sp. N6 rRNA genes)].

1 A.K. and A.N.G. contributed equally to this work.

2 To whom correspondence should be addressed. E-mail: [email protected] or [email protected].

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1305867110/-/DCSupplemental.

PNAS | Published online July 29, 2013 | E3129–E3137

MICROBIOLOGY

Bacteria are a major source of natural products that provide rich opportunities for both chemical and biological investigation. Although the vast majority of known bacterial metabolites derive from free-living organisms, increasing evidence supports the widespread existence of chemically prolific bacteria living in symbioses. A strategy based on bioinformatic prediction, symbiont cultivation, isotopic enrichment, and advanced analytics was used to characterize a unique polyketide, nosperin, from a lichen-associated Nostoc sp. cyanobacterium. The biosynthetic gene cluster and the structure of nosperin, determined from 30 μg of compound, are related to those of the pederin group previously known only from nonphotosynthetic bacteria associated with beetles and marine sponges. The presence of this natural product family in such highly dissimilar associations suggests that some bacterial metabolites may be specific to symbioses with eukaryotes and encourages exploration of other symbioses for drug discovery and better understanding of ecological interactions mediated by complex bacterial metabolites.

Fig. 1. The foliose lichen Peltigera membranacea and Nostoc symbiont. (A) Lichen in situ. (Scale bar, 5 cm.) (B) Rhizines (Rhi) on lower surface and apothecia (Apo) protruding from thallus edge. (C) Thallus cross section illustrating stratified internal structure including photosynthetic cyanobiont layer (shown with arrows) between cortical and medullary mycobiont layers (above and below, respectively). (Scale bar, 100 μm.) (D) Nostoc sp. N6 in culture. (Scale bar, 100 μm.) (photograph for Fig. 1C, courtesy of Martin Grube).

member of the pederin family from a lichenized cyanobacterium and a further example toward the emerging concept of symbiosis-associated natural product pathways (10).

Results

Discovery of Trans-AT PKS Genes in the Lichen Metagenome. Peltigera membranacea is a widely distributed terrestrial lichen carrying Nostoc sp. as its photobiont (Fig. 1 A–D). Total lichen DNA extracted from field samples collected in Iceland and processed for whole genome sequencing (WGS) (11) revealed approximately equal contributions from the mycobiont, the photobiont, and the community of intrathalline microbes. Bioinformatic mining of the initial metagenome assembly yielded 18 candidate clusters containing genes that encode PKS enzymes (SI Appendix, Table S1). Among the putative bacterial gene clusters, two were members of the trans-acyltransferase (AT) PKS family (Fig. 3; SI Appendix, Fig. S1), in which AT domains are not encoded by the PKS genes but rather by a separate gene elsewhere: i.e., the ATs that load the polyketide building blocks are not integral parts of the modules but act as free-standing units (10). This group of enzymes is particularly interesting because many of them are responsible for products made specifically by symbiotic bacteria (10). These gene clusters in the lichen are most likely derived from the photobiont, as only Nostoc exhibited a high-level clonal presence, indicated by DNA sequence coverage in the WGS, and a commensurate level of coverage was found for diagnostic markers

of the Nostoc genome, such as hgl (involved in heterocyst glycolipid biosynthesis; SI Appendix, Table S1). The longer of the two gene clusters in P. membranacea, designated “nsp” (Fig. 3) had significant homology to the gene clusters for the biosynthesis of pederin family compounds and therefore was selected for further investigation. The nsp gene cluster consists of a 59-kb region with 3 large genes (nspA, nspC, and nspD) that encode multidomain PKS or PKS/NRPS proteins and a suite of 10 smaller genes that encode accessory enzymes (Fig. 3; Table 1). The multidomain proteins together comprise a “starter” module 0, followed by nine PKS or PKS/NRPS elongating modules (modules 1–9). The 5′ end of the gene cluster, i.e., nspA (modules 0–3), nspB, and the beginning of nspC (module 4 and the KS region of module 5), as well as accessory genes at the 3′ end of the cluster, have closely related counterparts in biosynthetic gene clusters of pederin-type compounds (Fig. 3). The middle region, however, has primary affinities to NRPS–PKS biosynthetic pathways from other members of Proteobacteria or Cyanobacteria, viz., the end of nspC (modules 5–7) is similar to the PKS genes of the rhizoxin (rhi) biosynthetic gene clusters from Burkholderia sp. (12) and Pseudomonas sp. (13). Further downstream, the PKS genes have resemblances to gene clusters reported from various Nostocales or Oscillatoriales. An ∼3-kb region at the junction of nspC and nspD is especially intriguing in bearing ∼80% identity at the DNA level to a portion of the nos-like gene cluster (a cis-AT PKS pathway) in

Fig. 2. Pederin family compounds and symbioses. (Upper Left) Image of Paederus fuscipes courtesy of Christoph Benisch (www.kerbtier.de). (Upper Right) Image of Theonella swinhoei courtesy of Yoichi Nakao. (Lower Right) Image of Psammocinia aff. bulbosa adapted with permission from ref. 15. Copyright 2007 American Chemical Society. (Lower Left) Image of Mycale hentscheli courtesy of Mike Page.

Kampa et al.

Fig. 3. Nosperin biosynthetic gene cluster nsp and flanking regions. Microsynteny and homology with pederin and onnamide biosynthetic gene clusters are indicated in gray. Similarity of nsp to other PKS biosynthetic gene clusters is indicated by double-headed arrows. Numbers denote individual modules. Genes with similar proposed functions (Table 1) are indicated with identical colors. β, genes involved in β-branch formation; T, transposon. See SI Appendix, Figs. S15 and S16 for details of regions flanking the nsp locus.

P. membranacea (SI Appendix, Fig. S2); a homolog of this cluster in Nostoc sp. GSV224 is responsible for biosynthesis of nostopeptolide (14), a cyclic peptide-polyketide (SI Appendix, Table S2). Altogether, the nsp locus appears to be an evolutionary mosaic of trans- and cis-AT PKS fragments from diverse sources. Expression of the nsp pathway was detected by RNA-seq analysis in P. membranacea thalli freshly collected from the same location as the source material for the WGS. Consistent with expectations for a photobiont-specific gene cluster, nsp transcripts were observed in the main thallus tissue that contains both mycobiont and photobiont cells, but not in apothecia or rhizines, which are lichen structures that are derived only from the mycobiont (Fig. 1B; SI Appendix, Table S3). Although trans-AT PKS systems have been found in a wide range of bacteria (16, 17), none have been reported for cyanobacteria, which are otherwise rich sources of cis-AT PKSs (17, 18). These observations suggested the possibility of metabolic products that might be novel, not only from structural but also from ecological and evolutionary perspectives.

Prediction of the Compound Structure. Detailed examination of the ketosynthase (KS) domains in PKS gene clusters using phylogenetic methods and comparisons of module architecture in

pathways with similar products can often facilitate prediction of natural product structures generated by trans-AT PKSs (16, 19). When the Nsp KS sequences (KS1–KS9, referring to the module number in which the domain occurs) were aligned and compared with 494 homologs using KSs from cis-AT systems as an outgroup, the resulting clades were generally consistent with respect to KS functions (SI Appendix, Fig. S3–S5). For example, all KSs with known function in the same group as KS1 accept acetyl starters incorporated by domains of the GCN5-related N-acetyltransferase family (GNAT) (20). In this way, partial structures for the substrates of KS1–3, 5, 7, and 9 were predicted (Table 2). As expected from the earlier analyses, KS1–5 (nspA, nspC) were most similar to KSs of the pederin (21, 22) and/or onnamide (23) PKS, and a full domain analysis revealed virtually complete architectural identity with corresponding portions of the ped and onn PKS–NRPS clusters over the first six modules, ending with KS5. The region also included an NRPS (module 4a) that catalyzes the insertion of a glycine residue (Fig. 4). This observation indicated that a large part of the polyketide product would resemble pederin and onnamides. The remainder of the core structure was more difficult to predict, because two of the four KSs (KS6 and KS8) fell into clades consisting of KS0s, which are nonelongating KS variants that usually show little consistency

ORF | Protein size | Proposed function | Closest homolog (protein, origin) | Percent identity | Accession number*
nspA | 5,320 | PKS | PedI, Paederus fuscipes symbiont | 42 | AAR19304
nspB | 371 | Flavin-dependent oxygenase | PedJ, P. fuscipes symbiont | 66 | AAR19305
nspC | 8,252 | PKS-NRPS | OnnI, Theonella swinhoei symbiont | 49 | AAV97877
nspD | 2,206 | PKS | JamP, Lyngbya majuscula | 60 | AAS98787
nspE | 474 | MatE efflux transporter | SxtM1, Lyngbya wollei | 54 | ACG63829
nspF | 285 | O-Methyltransferase | OnnH, T. swinhoei symbiont | 57 | AAV97876
nspG | 86 | ACP | Cpap_1683, Clostridium papyrosolvens DSM 2782 | 58 | EGD47495
nspH | 411 | β-Ketoacyl synthase | Cpap_1682, C. papyrosolvens DSM 2782 | 60 | EGD47494
nspI | 420 | HMG-CoA synthase | PksG, Bacillus subtilis subsp. subtilis SC-8 | 72 | EHA29460
nspJ | 262 | Enoyl-CoA hydratase | Cpap_1678, C. papyrosolvens DSM 2782 | 52 | EGD47490
nspK | 442 | Acyltransferase | PedD, P. fuscipes symbiont | 49 | AAS47563
nspL | 464 | Cytochrome P450 | PPSIR1_33239, Plesiocystis pacifica SIR-1 | 35 | EDM78481
nspM | 647 | Asparagine synthase | Acid_5610, Candidatus Solibacter usitatus Ellin6076 | 48 | ABJ86557

*Accession numbers are for the GenBank database.

Table 1. List of the genes present in the nsp gene cluster and their predicted functions

Table 2. Analysis of KS domains present in the Nsp PKSs

Domain | Closest characterized relative (substrate specificity) | Predicted specificity of KS clade | Moiety present in nosperin
KS1 | pederin KS1 (acetyl starter) | Acetyl | Acetyl
KS2 | onnamide KS2 (α-L-methyl + β-D-OH) | α-L-methyl + β-D-OH | α-L-methyl + β-D-OH (anti configured)
KS3 | onnamide KS3 (β-exomethylene) | mostly β-exomethylene | β-exomethylene
KS4 | onnamide KS4 (KS0) | KS0 | KS0
KS5 | pederin KS5 (amino acid) | amino acid | glycine
KS6 | rhizoxin KS11 (KS0, double bond) | KS0, double bond | KS0, double bond
KS7 | rhizoxin KS12 (shifted double bond) | shifted double bond | shifted double bond
KS8 | bryostatin KS8 (KS0) | KS0 | KS0
KS9 | oxazolomycin KS9 (serine) | amino acid | proline

between phylogeny and substrate structure (16). KS6 was positioned in a small subclade containing homologs from the rhizoxin and bacillaene PKSs that are involved in shifting double bonds from the α,β- to the β,γ-position (24, 25). These KSs are found in modules harboring, in addition to the KS0 and the acyl carrier protein (ACP), a dehydratase (DH) domain postulated to catalyze double bond isomerization and characterized by a NSAF/YL instead of the usual DxxxQ/H motif involved in dehydration (26). The same elements are present in the nsp module

encoding KS6; moreover, KS7, encoded by the module immediately downstream, is highly similar to rhizoxin KS12, which accepts a substrate with a shifted double bond (25). Together with the upstream NRPS module, these features strongly suggested the presence of an enamide moiety, which is not present in pederin or onnamides. KS9, associated with KSs elongating chains of amino acid residues, consistent with its position C-terminal to a second NRPS module (8a), was the only other KS with predictable function. An analysis of residues lining the

Fig. 4. The nsp gene cluster, deduced architecture of the PKS proteins NspA, NspC, and NspD, and proposed biosynthesis of nosperin. GNAT, GCN5-related N-acetyltransferase family (20); KS, β-ketoacyl synthase; KR, ketoreductase; MT, C-methyltransferase; CR, crotonase superfamily (also known as enoyl-CoA hydratase) (30); KS0, nonelongating KS; C, nonribosomal peptide synthetase (NRPS) condensation domain; A, NRPS adenylation domain; DH, dehydratase; AT, acyltransferase; ER, enoyl reductase; TE, thioesterase; ?, unknown. Small black circles symbolize acyl and peptidyl carrier proteins. The positions of amplicons used for the nsp screening are shown with black boxes and roman numerals.

Isolation and Characterization of the Polyketide Nosperin. Using the preliminary structural information as a guide, total extracts of whole lichens were examined for the presence of the predicted metabolites. Due to copious amounts of diverse glycolipids and other metabolites, however, LC-MS and extensive NMR-guided subfractionation failed to detect a pederin-type polyketide. As Nostoc symbionts have often been cultured from lichens (5), an alternative approach focusing only on the cyanobacterium was taken. Macerated thalli of P. membranacea were plated on BG-11₀, a minimal medium lacking nitrogen, and cyanobacteria identifiable as Nostoc sp. by microscopic examination were established in pure culture (Fig. 1D). The presence of the nsp cluster in three random isolates was confirmed by PCR for amplicons representing the PKS genes nspA and nspC, and the accessory gene, nspF (Fig. 4; SI Appendix, Table S7). One strain, designated N6, also characterized by sequencing of 16S and 23S ribosomal RNAs, was grown in BG-11 liquid medium for 4 wk to evaluate gene expression in culture. Transcription of the nsp gene cluster in Nostoc sp. N6 was confirmed by mapping RNA-seq data and found to be fivefold higher than in the thallus, relative to expression of rbcLXS, a Nostoc reference marker (SI Appendix, Table S3). When extracts were prepared from scaled-up cultures, numerous metabolites were observed in small amounts. However, due to the unusual architecture of the terminal Nsp domains and the unknown nature of post-PKS modifications, prediction of the mass of the compound was challenging and convincing candidates were not identified by MS analysis. In light of the challenges imposed by multicomponent trace mixtures and the absence of a known mass, a strategy of stable isotopic enrichment followed by HPLC-SPE-NMR was used to address problems of sensitivity and complexity while allowing detection of predicted structural moieties. Nostoc sp. was cultured in 25 L of BG-11 supplemented with 13C-labeled NaHCO3, and after 5 wk, cyanobacterial biomass from 10 L of culture was freeze-dried and extracted. HPLC-electrospray ionization (ESI)-MS analysis confirmed that most components in the extracts had multiple 13C atoms incorporated into individual molecules. To

Distribution of the nsp Locus in Cyanobacteria. Although the cultivated strain Nostoc sp. N6 carried the nsp cluster, it also exhibited a distinctive 23S rRNA polymorphism and originated from a specimen of P. membranacea independent from that used in metagenome sequencing. These observations suggested that the nsp pathway might be common in P. membranacea photobionts or in Nostoc and possibly even other cyanobacteria. A PCR-based survey for nspA, nspC, and nspF indicated that the nsp cluster was present in P. membranacea from several locations in Iceland, but samples of this lichen in British Columbia, Canada, also included specimens where the targeted sequences were not detected (Table 3). Three nsp amplicons obtained from a specimen near Vancouver on the mainland were nearly identical (99.9%) to those from a Vancouver Island sample, but showed ∼3% divergence from the Icelandic reference (SI Appendix, Fig. S14). A bioinformatic search for nsp-like sequences in GenBank (SI Appendix, Table S2) and a PCR survey for amplicons of nspA, nspC, and nspF in 26 cyanobacterial strains representing 15 genera from all cyanobacterial orders (Materials and Methods) did not return any positive results except in the genus Nostoc, suggesting that the nsp pathway per se may have a phylogenetically restricted distribution. Within Nostoc there appeared to be an association with lichens: results from PCR testing of four strains (Nostoc sp. PCC9709, AR10B, AR9A, and WL-1) isolated from Peltigera spp. (32, 33) were positive for nsp, whereas four other strains not associated with lichens (Nostoc muscorum

obtain insights into structural features of these compounds, the crude extract was subjected to repetitive HPLC-solid phase extraction (SPE) purification with subsequent NMR analysis of target molecules eluted with fully deuterated solvent (SI Appendix, Figs. S6–S9). This method allowed collection of high-quality 1H- and 13C-spectra, as well as COSY-, HSQC-, and HMBC-2D-NMR data from microgram amounts of the 13C-labeled material (SI Appendix, Figs. S8–S13). NMR signals characteristic of the predicted exomethylene, methoxy, and C-methyl functions were detected for a component eluting at 30.3 min (SI Appendix, Figs. S6–S9), of which only 30 μg were obtained. Further support for the identity of the compound came from ESI(+)-MS analysis, which indicated a multiply labeled minor component with an exact unlabeled mass of m/z = 564.2926, well within the predicted range and best fitting a calculated atomic composition of C26H43N3O9Na (calculated: m/z = 564.2897). MS/MS analysis of the molecular ion peak additionally revealed a daughter ion consistent with the formal loss of MeOH (m/z = 532.2664; calculated for C25H39N3O8Na: m/z = 532.2635), suggesting a methoxy group in the predicted structure. Several other indicative fragments were also visible in the MS/MS data, e.g., an additional cleavage of an acetamide functionality (m/z = 473.2260; calculated for C23H34N2O7Na: m/z = 473.2264). The NMR data fully supported the identity of the compound as a member of the pederin group and allowed elucidation of its constitution. In combination with bioinformatic analysis it was possible to predict most of the stereogenic elements, except the configuration at C12 and C14 (Fig. 5; for a full description of the MS and NMR-based characterization see SI Appendix, Data S2).
The structure of the compound is in almost perfect agreement with the predicted features and represents a hybrid of pederin and an unusual proline-containing terminal moiety not previously observed in this group of natural products. Two deviations from the incomplete product prediction are the terminal amide function, most likely generated by the asparagine synthase-like protein NspM, and the hydroxyl moiety at C20, indicating post-PKS oxidation by the P450 homolog NspL. Altogether, the data show that this compound, which has been designated nosperin, represents a unique member of the pederin family of natural products.
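The calculated m/z values quoted for the sodium adduct and its two fragments can be checked with a short script. The following is a minimal sketch, not the authors' procedure: it assumes standard monoisotopic masses for C, H, N, O, and Na, ignores the electron mass (which is what reproduces the calculated values given in the text), and uses hypothetical helper names `monoisotopic_mass` and `ppm_error`.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes; standard reference values.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "Na": 22.98976928,
}


def monoisotopic_mass(formula: str) -> float:
    """Sum isotope masses for a simple formula such as 'C26H43N3O9Na'."""
    total = 0.0
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[symbol] * (int(count) if count else 1)
    return total


def ppm_error(observed: float, calculated: float) -> float:
    """Relative deviation of the observed m/z from the calculated one, in ppm."""
    return (observed - calculated) / calculated * 1e6


# (formula, observed m/z) pairs as reported in the text above.
ions = [
    ("C26H43N3O9Na", 564.2926),  # sodium adduct of nosperin
    ("C25H39N3O8Na", 532.2664),  # after formal loss of MeOH
    ("C23H34N2O7Na", 473.2260),  # after additional loss of acetamide
]

for formula, observed in ions:
    calc = monoisotopic_mass(formula)
    print(f"{formula}: calc {calc:.4f}, obs {observed:.4f}, "
          f"{ppm_error(observed, calc):+.1f} ppm")
```

The mass differences between consecutive ions correspond to CH4O (methanol) and C2H5NO (acetamide), matching the fragment assignments in the text.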

substrate pocket of the adenylation domain, known as the nonribosomal code (27, 28), returned a perfect match to proline-activating domains (SI Appendix, Table S5). In the absence of diagnostic downstream KS domains, the portions of the polyketide generated by modules 5, 7, and 9 were predicted using classical PKS colinearity rules (29), although they often apply poorly to trans-AT PKSs. These rules indicated the presence of two additional methyl groups and a hydroxyl function. The terminal elongation step was predicted to be catalyzed by module 9 in NspD, a cis-AT PKS module with an integrated AT domain. This step remained obscure, because the module architecture (KS-AT-KR-ER-ACP-TE) contrasts with the canonical order (KS-AT-DH-ER-KR-ACP-TE) and lacks a DH domain to provide the substrate for the subsequent enoyl reduction. This feature suggested either that the ER domain is nonfunctional, despite the presence of key amino acid residues, or that the DH activity is provided in trans. Further structural predictions were possible by comparison of the accessory and post-PKS nsp genes to known pathways. The genes nspGHIJK resembled those typically involved in the generation of polyketide β-branches (Table 1), indicating the presence of a pederin-type exomethylene bond (30). Because the closest relatives of nspB and nspF in the ped and onn clusters encode an oxygenase and a methyltransferase (31), responsible, respectively, for one oxygenation (at C7) and one methylation (at the C6 acetal oxygen) within the corresponding moiety of pederin, similar units in the nsp product were expected. The putative asparagine synthetase (NspM) and cytochrome P450 (NspL) enzyme homologs, however, remained without counterpart in pederin-type pathways.
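The colinearity argument about module 9 can be illustrated with a small rule table. This is a simplified sketch of the textbook cis-AT colinearity rules, not the authors' prediction pipeline; the function name `predict_module` and the warning texts are hypothetical, and, as the text notes, such rules often apply poorly to trans-AT PKSs.

```python
def predict_module(domains):
    """Apply textbook colinearity rules to one module's domain list.

    Returns (beta_carbon_state, alpha_methyl, warnings).
    """
    present = set(domains)

    # Reductive loop: each additional domain processes the beta-carbon one step further.
    if "KR" not in present:
        state = "keto"
    elif "DH" not in present:
        state = "hydroxyl"
    elif "ER" not in present:
        state = "double bond"
    else:
        state = "saturated"

    # Flag domains whose expected substrate cannot be produced within the module.
    warnings = []
    if "DH" in present and "KR" not in present:
        warnings.append("DH without KR: no hydroxyl to dehydrate")
    if "ER" in present and "DH" not in present:
        warnings.append("ER without DH: no enoyl substrate to reduce")

    return state, "MT" in present, warnings


# Module 9 of NspD as described in the text, and the canonical order for comparison.
module_9 = "KS-AT-KR-ER-ACP-TE".split("-")
canonical = "KS-AT-DH-ER-KR-ACP-TE".split("-")

for name, domains in [("module 9", module_9), ("canonical", canonical)]:
    state, methyl, warnings = predict_module(domains)
    print(name, state, "alpha-methyl" if methyl else "no alpha-methyl", warnings)
```

For module 9 the sketch reports a hydroxyl and raises the ER-without-DH warning, which is the inconsistency the text resolves by proposing either a nonfunctional ER domain or a DH activity supplied in trans.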

Fig. 5. Stereochemical characterization of nosperin by NMR and bioinformatic analysis. The absolute configurations at the chiral centers were predicted by analysis of the stereospecificity of KR domains (blue), the NRPS domain structure (orange), the overall domain organization in comparison with other pederin-type biosynthetic gene clusters (red), and NMR coupling constants and/or chemical shifts (green).

PCC7906, Nostoc punctiforme PCC73102, Nostoc spp. PCC6705, and PCC7107) were negative. The limited distribution within Cyanobacteria and the apparent absence of the nsp gene cluster in some samples from Canada suggested that rather than being a core part of the genome, the nsp genes may have been introduced horizontally. Investigation of regions flanking the nsp cluster revealed IS4 elements on both sides and linkage to genes associated with plasmid replication and partitioning (including parA, parB, and parM homologs and a gene encoding a DNA helicase), suggesting the possibility of an extrachromosomal source (Fig. 3; SI Appendix, Figs. S15 and S16).

Discussion

In this report, we describe identification of the nsp genes in the P. membranacea lichen metagenome, the first trans-AT PKS gene cluster from a cyanobacterium, and the application of a strategy consisting of bioinformatic prediction, symbiont cultivation, isotope enrichment, and 13C-NMR that enabled characterization of a unique symbiosis-associated natural product, nosperin, from the photobiont. Lichens have long been known for distinctive mycobiont-produced compounds, such as depside and depsidone polyketides (34), but unique structures and pathways are now also emerging from studies of the photobionts. Two conventional cis-AT PKS–NRPS biosynthetic pathways have recently been described from cyanobacteria associated with lichens: the mcy gene cluster (35, 36) involved in synthesis of microcystins, notorious hepatotoxins typical of many cyanobacteria, and the crp gene cluster, responsible for production of cryptophycins (37), anticancer agents of more limited distribution. The elements of the nosperin biosynthetic pathway, in contrast, are similar to the less common

trans-AT PKS–NRPS systems responsible for a number of animal–bacteria symbiosis-associated compounds including pederin, theopederins, onnamides, mycalamides, psymberin (irciniastatin A), and others (10) (Fig. 2). These compounds, with almost identical core regions but different biosynthetic starter regions and/or termini, are often highly toxic to eukaryotes and some have been considered promising candidates for anticancer drug development (38–40). Notably, this group of compounds has never been recovered from screening free-living bacteria, despite conspicuous pharmacological activities. Studies of mycalamide A (Fig. 2), which binds in the E-site of the ribosome normally occupied by the tRNA-terminal CCA (41), and synthetic analogs together with molecular modeling, have identified the N-acyl linked tetrahydropyran structure as central to binding and activity (42). The presence of the N-acyl linked tetrahydropyran in nosperin suggests it might have similar bioactivity; however, the amounts available were too small for testing. The discovery of nosperin not only increases the number of chemical scaffolds and biosynthetic enzymes encompassed by the pederin group but also expands the remarkable range of symbioses associated with this natural product family (Fig. 2). Furthermore, although the taxonomic identities of their producers are unknown, with the exception of pederin, which is produced by a close relative of Pseudomonas aeruginosa (21, 43, 44), both lichen metagenomic data and expression and product characterization in Nostoc sp. N6 clearly show that nosperin derives from the cyanobacterial photobiont of P. membranacea. Although this cyanobacterium is essential to every phase of thallus growth and development, it is also a facultative symbiont, being culturable by itself on basic mineral salts media. This first individually identified and culturable producer of a pederin family natural product provides new opportunities to study the biochemistry and

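Table 3. Detection of nsp amplicons in P. membranacea from Iceland and two locations in British Columbia, Canada

| Region | Locality | Number of samples | nspA | nspC | nspF |
|---|---|---|---|---|---|
| Reykjavík | Grafarholt | 1 | + | + | + |
| Reykjavík | Keldur | 3 | + | + | + |
| Reykjavík | Mosfellsbaer | 1 | + | + | + |
| Reykjavík | Öskjuhlid | 5 | + | + | + |
| Reykjavík | Raudavatn | 1 | + | + | + |
| Reykjavík | Ulfarsfell | 1 | + | + | + |
| Vancouver (Mainland) | Belcarra | 1 | + | + | + |
| Vancouver (Mainland) | Black Mountain | 1 | − | − | − |
| Vancouver (Mainland) | Brothers Creek | 1 | + | + | + |
| Vancouver (Mainland) | Eagle Ridge | 2 | + | + | + |
| Vancouver (Mainland) | Eagle Ridge | 3 | − | − | − |
| Vancouver Island | Horth Hill | 2 | + | + | + |
| Vancouver Island | Roche Cove | 2 | − | − | − |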

For positions of the amplicons see Fig. 4. +, a PCR product of the expected size was observed; −, no PCR product was observed.
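The counts in Table 3 can be tallied with a short script. The following is a minimal sketch, not part of the original analysis: the rows are transcribed from Table 3, the grouping of localities into regions follows the table's region labels as reconstructed from the extracted text, and it is assumed that every sample within a row shares the result reported for that row. The names `TABLE3`, `summarize`, and `country` are hypothetical.

```python
# Rows transcribed from Table 3: (region, locality, n_samples, nspA, nspC, nspF).
# Assumption: all samples counted in a row share that row's PCR result.
TABLE3 = [
    ("Reykjavík", "Grafarholt", 1, "+", "+", "+"),
    ("Reykjavík", "Keldur", 3, "+", "+", "+"),
    ("Reykjavík", "Mosfellsbaer", 1, "+", "+", "+"),
    ("Reykjavík", "Öskjuhlid", 5, "+", "+", "+"),
    ("Reykjavík", "Raudavatn", 1, "+", "+", "+"),
    ("Reykjavík", "Ulfarsfell", 1, "+", "+", "+"),
    ("Vancouver (Mainland)", "Belcarra", 1, "+", "+", "+"),
    ("Vancouver (Mainland)", "Black Mountain", 1, "-", "-", "-"),
    ("Vancouver (Mainland)", "Brothers Creek", 1, "+", "+", "+"),
    ("Vancouver (Mainland)", "Eagle Ridge", 2, "+", "+", "+"),
    ("Vancouver (Mainland)", "Eagle Ridge", 3, "-", "-", "-"),
    ("Vancouver Island", "Horth Hill", 2, "+", "+", "+"),
    ("Vancouver Island", "Roche Cove", 2, "-", "-", "-"),
]


def summarize(rows, key):
    """Return {group: (positive_samples, total_samples)}.

    A row counts as positive only if all three amplicons were detected.
    """
    out = {}
    for region, _locality, n, a, c, f in rows:
        group = key(region)
        pos, tot = out.get(group, (0, 0))
        hit = n if (a, c, f) == ("+", "+", "+") else 0
        out[group] = (pos + hit, tot + n)
    return out


def country(region):
    """Map a Table 3 region label to its country."""
    return "Iceland" if region == "Reykjavík" else "Canada"


by_country = summarize(TABLE3, country)
by_region = summarize(TABLE3, lambda region: region)

for group, (pos, tot) in by_country.items():
    print(f"{group}: {pos}/{tot} samples positive for all three nsp amplicons")
```

Under these assumptions, the tally gives the per-country and per-region fractions of samples that were positive for all three nsp amplicons.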

E3134 | www.pnas.org/cgi/doi/10.1073/pnas.1305867110

Kampa et al.


13C-NMR–based technique and other recent methods such as imaging MS (53) can detect low-concentration signatures of nosperin and facilitate investigation of its role in symbiosis. These approaches could also identify molecular variants: e.g., recently studied specimens of P. membranacea that appear negative for only one or two of the three primer sets used for nsp screening may present variants of nosperin. This possibility is akin to the situation of the microcystins and cryptophycins, for which a large number of structural variants have been found (36, 37). A thorough study of the >1,500 species of cyanobacteria-bearing lichens and the multitude of other organisms including bryophytes, ferns, cycads, and angiosperms (54) that harbor cyanobacterial symbionts may yield many new biosynthetic pathways and metabolites to provide both alternative chemistry for potential pharmacological applications and a wealth of information on the chemical biology of symbiosis.

Materials and Methods

Identification of PKS Gene Clusters in the P. membranacea Metagenome and Expression Analysis of Whole Thalli. Metagenomic DNA was processed for sequencing at commercial facilities via Roche 454 and Illumina Solexa 2 × 35-bp methodology generating 1.76 GB of 454 data and 1.4 GB of Illumina data, yielding ∼50× coverage of the Nostoc genome. A draft assembly of the P. membranacea metagenome was constructed with MIRA v3.2.1 (www.chevreux.org/projects_mira.html). To search for PKS gene clusters, concatenated consensus sequences of the KS (N terminus, pf00109; C terminus, pf02801; http://pfam.sanger.ac.uk/) and ACP domains (pf00698) were used in a TBLASTN search (55) to retrieve all relevant contigs from the metagenomic database. Accuracy of the assembly was verified by visual inspection of the contigs in GAP4 (Staden package) (56) based on a mapped 3.5-kb paired-end library. Portions of the nsp sequence were verified by PCR amplification and sequenced directly using BigDye chemistry (Applied Biosystems; MacroGen). RNA-seq data sets from field samples of lichen thalli, apothecia, and rhizines were previously generated (6) and used for mapping in this study with Bowtie (57).

Structure Prediction. Amino acid sequences of 503 KS domains from trans-AT and cis-AT PKSs were retrieved from GenBank and aligned using the MUSCLE algorithm with a gap open score of −1, as implemented in Geneious 5.5.3 (Biomatters Ltd.). After manual improvement of the alignment, phylogenetic reconstruction was performed by means of the Geneious software using the neighbor-joining algorithm with a Jukes-Cantor distance method. KS domains of cis-AT PKSs were used as an outgroup. Bootstrap analysis was done with 1,000 pseudoreplicate sequences.

Chemical Analysis of Whole Lichen. Air-dried lichen (30 g) was ground to a fine powder in liquid nitrogen using a mortar and pestle and stirred for 24 h at room temperature in MeOH. The mixture was filtered, and the solid material was extracted a second time. The solvent of the combined MeOH extracts was removed under reduced pressure. The crude extract was partitioned between 10:1 MeOH/H2O (300 mL) and n-hexane (3 × 100 mL). The solvent was removed from the aqueous MeOH layer under reduced pressure, and the residue was further fractionated by silica gel column chromatography. The following solvents (0.5 L each) were used to elute compounds: petroleum ether, petroleum ether/EtOAc (1:1), EtOAc, EtOAc/MeOH (9:1; 8:2; 7:3; 1:1), and MeOH. The fractions were evaporated under reduced pressure and analyzed by LC-MS using a Phenomenex Luna C18 column with a mobile phase gradient of 1:9 CH3CN/H2O + 0.1% TFA to 100% acetonitrile over 30 min and a flow rate of 1 mL/min.

Isolation of Nostoc sp. N6. Lichen thalli collected from the same location as material used for WGS and expression studies were macerated between sterile microscope slides (58), and cells were plated on BG-110 (without NaNO3) agar medium (59) and incubated at 20 °C with a 12/12-h day/night cycle. Nostoc colonies were purified by repeated streaking on the same medium and maintained at room temperature. Analysis of the 16S and 23S rRNA sequences in the RNA-seq library (below) confirmed both the purity of the culture and its identification as a Nostoc sp.

RNA Extraction and RT-PCR of Nostoc sp. N6. Total RNA was isolated from 1 L of BG-11 medium incubated at 20 °C under constant illumination for 4 wk. Cyanobacteria were retained on Miracloth (Calbiochem) after culture filtration, rinsed with water, blotted with paper towels, flash frozen in liquid

PNAS | Published online July 29, 2013 | E3135

MICROBIOLOGY

physiology of the biosynthetic pathway in vivo, as well as to improve metabolite yield through optimization of production protocols and strain improvement. Study of genes, gene clusters, and biosynthetic pathways in diverse symbiotic associations may help clarify their functions or identify metabolic products that are essential. Some polyketides produced by trans-AT PKSs, such as pederin (21, 22, 43), bryostatin (45, 46), and rhizoxin (12, 47), are known to participate in host defense and pathogenicity in symbiotic associations. It has also been suggested that PKS–NRPS compounds such as the microcystins, sometimes produced by cyanobacterial symbionts, may contribute to the chemical defense of lichens against grazers (35, 36). Expression of the nsp genes in P. membranacea and their presence in all Icelandic specimens tested suggest that it is a beneficial trait, although its role is unclear. In this regard, it may be significant that no microcystin pathway homolog was identified in P. membranacea, leaving open the possibility that nosperin might have a similar function in the lichen. Examination of geographically distant populations of P. membranacea was informative, as absence of nsp amplicons from some samples from Canada indicates that these genes are not essential for the lichen symbiosis although nosperin may confer advantage under some conditions. This provision may also apply to pederin, onnamides, and psymberin, where the metazoan hosts can be found with or without the metabolites (23, 43, 48). In the case of P. membranacea, additional field studies may help elucidate whether there is a primary cause, e.g., a founder effect, a particular environmental condition, or an interplay of other factors that underlie the distributional differences observed. The presence of similar trans-AT PKS–NRPS gene clusters in different groups of bacteria has suggested that these clusters are horizontally transferred (44). 
The flanking of the nsp cluster by transposable elements is consistent with this hypothesis, and the mosaic of homologies across the gene cluster suggests involvement of several intergenomic and intragenomic recombination events. The homology of NspE and part of NspD to proteins from Oscillatoriales (Fig. 3) suggests that an ancestral ped-like operon, specifying the conserved core part of the molecule, may have been introduced into and modified by oscillatorean cyanobacteria: the position of nspE and nspD between sequences with high homology to the ped gene cluster and the orientation of nspE opposite to the ped-like genes suggest an intragenomic rearrangement mediated by genetic similarity of PKS–NRPS modules. Transfer to Nostocales and subsequent recombination resulted in the present domain organization that includes the ∼3-kb cis-AT-containing fragment from a Nostoc nos-like cluster. The presence of a cis-AT domain is unusual in a PKS relying on trans-ATs, with few occurrences among the ∼40 large trans-AT PKS complexes with known products (10, 49). A relatively recent insertion of this ∼3-kb fragment is indicated by the high amino acid and nucleotide similarities to the nos cluster and suggests that near relatives of the nsp pathway may exist. It will be interesting to see whether a gene cluster similar to nsp, but without the cis-AT encoding region, or with other types of inserts and substitutions, will be found in other bacteria. Study of such examples of naturally engineered multidomain genes and gene clusters involving distantly related participants may not only generate useful hypotheses for further understanding their evolution, but the phylogenetic reconstruction may also be informative in identifying models of successful architectures for industrial application in combinatorial biosynthesis.
The metabolic options offered by symbiotic associations provide exciting potential for drug development and highlight the need for new discovery strategies applicable to these complex systems. Although individual steps of the present procedure have been used previously in natural product research (15, 19, 47, 50–52), the combination of methods has not been reported and should be applicable to many further organisms. The

nitrogen, and crushed to a fine powder. TRIzol reagent (Life Technologies) was added to the powder, and it was ground again. The mixture was transferred to a 15-mL polypropylene tube and processed according to the TRIzol protocol. Before RT-PCR, the RNA was treated with DNase I (RNase-free) (Fermentas) to remove residual genomic DNA. First-strand cDNA was synthesized from 1 μg total RNA using SuperScript II Reverse Transcriptase (Invitrogen). RNA-seq data were obtained using Illumina Solexa Genome Analyzer IIx at the deCODE Genetics facility (Reykjavik, Iceland). RNA-seq mapping was done with Bowtie (57) and visualized in Geneious 5.5.3.

Culture of Nostoc sp. N6 for Natural Product Analysis. Twenty-five liters of cells were grown in an illuminated (5,200 lm) bubble-column bioreactor in BG-11 liquid medium, optionally enriched with 3 mM 13C-labeled NaHCO3, for 5 wk at pH 7.8 and 25 °C. The cyanobacteria from 10- and 5-L portions of the culture were collected by filtration, frozen in liquid nitrogen, freeze-dried, and stored at −20 °C.

Chemical Extractions and Analysis of Nostoc sp. N6, Unlabeled Culture. Freeze-dried cyanobacteria (above) were homogenized in 50 mL CH2Cl2/MeOH (2:1) and stirred for 15 min at room temperature. Biomass was filtered and treated again with the same amount of CH2Cl2/MeOH for 30 min at 30 °C. This procedure was repeated twice. The combined extracts were dried under reduced pressure. The crude extract was dissolved in MeOH and subjected to LC-MS analysis using an Agilent 1200 series HPLC and Bruker Daltonics micrOTOF-Q-spectrometer. HPLC was carried out with a Phenomenex Luna C18 column (5 μm, 250 × 2.00 mm), a mobile phase gradient of CH3CN/H2O (20:80) to (80:20) over 45 min, and a flow rate of 1 mL/min.

Chemical Extractions and Analysis of Nostoc sp. N6, Labeled Culture. Freeze-dried cyanobacteria were extracted with stirring for 24 h in 2 L MeOH at room temperature. After filtration, the methanolic fraction was dried by evaporation and redissolved in 0.5 L MeOH/H2O (10:1) followed by liquid–liquid extraction with 0.5 L cyclohexane. The cyclohexane fraction was discarded. The remaining MeOH/H2O fraction was dried and stored at −20 °C. This material was directly used for LC-SPE-NMR analyses.

HPLC-SPE-NMR. The solvent system consisted of eluent A (H2O + 0.1% deuterated formic acid) and eluent B (acetonitrile) with a linear gradient starting with 10% of B up to 90% B in 30 min. The flow rate was 0.8 mL/min at 25 °C, and the injection volume was 50 μL. The chromatography was monitored at 210, 220, and 254 nm, and these wavelengths were used to define absorbance thresholds to trigger SPE trapping. The HPLC eluate was diluted with H2O (2.4 mL/min) before trapping on SPE cartridges (Spark Holland), and individual peaks were trapped four times to increase concentration on cartridge. The cartridges were dried with pressurized nitrogen gas for 30 min each, and the analytes were eluted with 190 μL CD3CN (99.8 atom %; Deutero GmbH) into 3-mm match tubes from Bruker BioSpin GmbH.
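The HPLC-SPE-NMR gradient and the post-column water dilution can be illustrated numerically. The following is a minimal sketch under stated assumptions: the gradient is taken as strictly linear from 10% to 90% B over 30 min with no hold or dwell volume, and the dilution is treated as ideal volumetric mixing of the 0.8 mL/min eluate with the 2.4 mL/min water make-up flow. The function names are hypothetical and do not come from any instrument software.

```python
def percent_b(t_min, start_b=10.0, end_b=90.0, duration_min=30.0):
    """Percent eluent B (acetonitrile) at time t_min for a linear gradient.

    Values are clamped to the start and end compositions outside the ramp.
    """
    if t_min <= 0:
        return start_b
    if t_min >= duration_min:
        return end_b
    return start_b + (end_b - start_b) * t_min / duration_min


def organic_after_dilution(b_percent, hplc_flow=0.8, makeup_flow=2.4):
    """Approximate percent acetonitrile reaching the SPE cartridge.

    Assumes ideal mixing of the HPLC eluate (hplc_flow, mL/min) with the
    water make-up flow (makeup_flow, mL/min).
    """
    return b_percent * hplc_flow / (hplc_flow + makeup_flow)


if __name__ == "__main__":
    for t in (0, 10, 15, 30):
        b = percent_b(t)
        print(f"t = {t:>2} min: {b:.1f}% B, "
              f"about {organic_after_dilution(b):.1f}% organic after dilution")
```

With these flow rates the eluate is diluted fourfold, which lowers the organic content at the cartridge and so favors retention of the analyte during trapping.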

heteronuclear multiple-bond correlation spectroscopy (HMBC) were carried out using 4,000 complex data points in F2 and 512 points in the F1 dimension. The multiplicity edited gradient heteronuclear single quantum correlation (HSQC) was acquired with 2,000 data points in F2 and 400 points in the F1 dimension. The COSY experiment was acquired with 32 scans, the HSQC with 64 scans, and the HMBC with 128 scans per increment, resulting in experiment times of 8 h 46 min (COSY), 12 h 4 min (HSQC), and 1 d 11 h (HMBC). A 13C spectrum with composite pulse decoupling on the proton channel was acquired by collecting 4,096 scans with 131,072 complex data points at a sweep width of 40,761 Hz and with a relaxation delay of 5 s. The experiment time was 7 h 11 min.

Distribution Survey. P. membranacea thalli were collected at several localities in Iceland (Reykjavik area) and in British Columbia (North Shore mountains near Vancouver; Vancouver Island), and DNA was extracted using the previously described methods (60). DNA samples representing cyanobacterial strains other than those newly isolated from P. membranacea for this study were prepared and described previously (32, 61) and stored at −20 °C. They include Anabaena sphaerica UTEX1616, Chlorogloeopsis fritschii PCC6718, Cylindrospermum stagnale PCC7417, Fischerella muscicola PCC7414, Geitlerinema sp. PCC7105, Gloeobacter violaceus PCC7421, Leptolyngbya sp. PCC7104, Leptolyngbya sp. PCC7375, Lyngbya kuetzingii UTEX1547, Myxosarcina sp. PCC7325, Nodularia spumigena PCC73104, Nodularia harveyana UTEX2093, Pleurocapsa sp. PCC7315, Pleurocapsa sp. PCC7324, Pleurocapsa sp. PCC7321, Scytonema hofmanni PCC7110, Synechocystis PCC6803, Nostoc punctiforme PCC73102A, Nostoc sp. PCC6705, Nostoc sp. PCC9709, Nostoc sp. AR10B, and Nostoc sp. AR9A. In addition, DNA samples from Calothrix sp. PCC7601, Nostoc muscorum PCC7906, and Nostoc sp. PCC7107 (originally obtained from the Pasteur Culture Collection of Cyanobacteria) and Nostoc sp.
WL-1 (kindly provided by E. Loos, University of Regensburg, Regensburg, Germany) were prepared from cultures using similar methods (32). Amplification of rbcLX (62) or rnpB (63) regions (SI Appendix, Table S7) (and in some cases, also the 16S rRNA gene) was used as a positive control to ensure DNA quality before screening with primer sets targeting the nsp gene cluster (Fig. 4). DNA from the Nostoc sp. N6 strain was used as an nsp positive control. Conditions for nsp screening were 94 °C for 2 min, then 94 °C for 10 s, 55 °C for 30 s, and 72 °C for 30 s (35×), and then 72 °C for 7 min. For rbcLX primers, the extension time was 1 min. Eppendorf MasterMix 2.5× (Eppendorf) was used according to the manufacturer’s protocol in a final volume of 50 μL. All negative samples were repeated at least once. PCR amplicons from two samples were sequenced directly (MacroGen). The bioinformatic search was conducted in November 2012.
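The thermal cycling conditions given above can be written out as a small data structure to estimate program length. The following is a minimal sketch: only the hold times stated in the text are summed (ramp times between temperatures are ignored), and the rbcLX program is assumed to be identical to the nsp program apart from the 1-min extension, which the text does not state explicitly. The names `Step`, `program_seconds`, `NSP_PROGRAM`, and `RBCLX_PROGRAM` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int    # hold temperature in degrees Celsius
    seconds: int   # hold time in seconds


def program_seconds(initial, cycle, n_cycles, final):
    """Total hold time of a PCR program in seconds (ramp times excluded)."""
    per_cycle = sum(step.seconds for step in cycle)
    return initial.seconds + n_cycles * per_cycle + final.seconds


# nsp screening: 94 C 2 min; 35 x (94 C 10 s, 55 C 30 s, 72 C 30 s); 72 C 7 min
NSP_PROGRAM = dict(
    initial=Step(94, 120),
    cycle=[Step(94, 10), Step(55, 30), Step(72, 30)],
    n_cycles=35,
    final=Step(72, 420),
)

# Assumption: rbcLX control differs only in the extension time (1 min).
RBCLX_PROGRAM = dict(
    initial=Step(94, 120),
    cycle=[Step(94, 10), Step(55, 30), Step(72, 60)],
    n_cycles=35,
    final=Step(72, 420),
)

if __name__ == "__main__":
    for name, prog in (("nsp", NSP_PROGRAM), ("rbcLX", RBCLX_PROGRAM)):
        minutes, seconds = divmod(program_seconds(**prog), 60)
        print(f"{name}: {minutes} min {seconds} s of hold time")
```

Actual run times on a thermal cycler will be longer because of heating and cooling ramps, which depend on the instrument.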

NMR. All NMR experiments were acquired on an AVANCE III 600 MHz NMR spectrometer equipped with a 5-mm QNP cryo probe head (Bruker Biospin). Standard parameter sets created for the Bruker SELU (structure elucidation) program were uniformly used. Gradient correlation spectroscopy (COSY) and

ACKNOWLEDGMENTS. We thank T. Taylor and E. Loos for a gift of lichens and Nostoc isolate WL-1; K. Anamthawat-Jónsson for help with microscopy; deCODE Genetics (D. N. Magnúsdóttir, G. P. Örlygsdóttir, S. Snorradóttir, and Ó. T. Magnússon) for sequencing; G. König for providing a fermentor; H. Gross for sharing knowledge on culturing cyanobacteria; and K. Peters-Plaumbaum and M. Engeser for MS support. This work was financially supported in part by the DFG (SFB 642 to J.P. and Emmy Noether fellowship to T.A.M.G.), the EU (BlueGenics to J.P.), the Alexander von Humboldt Foundation (B.I.M.), and the Icelandic Research Fund (to Ó.S.A.).

1. de Bary HA (1879) Die Erscheinung der Symbios [The Phenomenon of Symbiosis]. (Verlag von Karl J. Trübner, Strassburg, Germany). 2. Margulis L (1970) Origin of Eukaryotic Cells: Evidence and Research Implications for a Theory of the Origin and Evolution of Microbial, Plant, and Animal Cells on the Precambrian Earth (Yale Univ Press, New Haven, Connecticut). 3. Piel J (2011) Approaches to capturing and designing biologically active small molecules produced by uncultured microbes. Annu Rev Microbiol 65:431–453. 4. Yuan X, Xiao S, Taylor TN (2005) Lichen-like symbiosis 600 million years ago. Science 308(5724):1017–1020. 5. Nash T, ed. (2008) Lichen Biology (Cambridge Univ Press, Cambridge, UK). 6. Miao VPW, Manoharan SS, Snaebjarnarson V, Andrésson ÓS (2012) Expression of lec-1, a mycobiont gene encoding a galectin-like protein in the lichen Peltigera membranacea. Symbiosis 57:23–31. 7. Bates ST, Cropsey GWG, Caporaso JG, Knight R, Fierer N (2011) Bacterial communities associated with the lichen symbiosis. Appl Environ Microbiol 77(4): 1309–1314. 8. Rath CM, et al. (2011) Meta-omic characterization of the marine invertebrate microbial consortium that produces the chemotherapeutic natural product ET-743. ACS Chem Biol 6(11):1244–1256. 9. Kalaitzis JA, Lauro FM, Neilan BA (2009) Mining cyanobacterial genomes for genes encoding complex biosynthetic pathways. Nat Prod Rep 26(11):1447–1465. 10. Piel J (2010) Biosynthesis of polyketides by trans-AT polyketide synthases. Nat Prod Rep 27(7):996–1047.

11. Xavier BB, Miao VPW, Jónsson ZO, Andrésson ÓS (2012) Mitochondrial genomes from the lichenized fungi Peltigera membranacea and Peltigera malacea: Features and phylogeny. Fungal Biol 116(7):802–814. 12. Partida-Martinez LP, Hertweck C (2007) A gene cluster encoding rhizoxin biosynthesis in “Burkholderia rhizoxina”, the bacterial endosymbiont of the fungus Rhizopus microsporus. ChemBioChem 8(1):41–45. 13. Brendel N, Partida-Martinez LP, Scherlach K, Hertweck C (2007) A cryptic PKS-NRPS gene locus in the plant commensal Pseudomonas fluorescens Pf-5 codes for the biosynthesis of an antimitotic rhizoxin complex. Org Biomol Chem 5(14):2211– 2213. 14. Hoffmann D, Hevel JM, Moore RE, Moore BS (2003) Sequence analysis and biochemical characterization of the nostopeptolide A biosynthetic gene cluster from Nostoc sp. GSV224. Gene 311:171–180. 15. Robinson SJ, et al. (2007) Probing the bioactive constituents from chemotypes of the sponge Psammocinia aff. bulbosa. J Nat Prod 70(6):1002–1009. 16. Nguyen T, et al. (2008) Exploiting the mosaic structure of trans-acyltransferase polyketide synthases for natural product discovery and pathway dissection. Nat Biotechnol 26(2):225–233. 17. Hochmuth T, Piel J (2009) Polyketide synthases of bacterial symbionts in sponges— evolution-based applications in natural products research. Phytochemistry 70(15–16): 1841–1849. 18. Jenke-Kodama H, Sandmann A, Müller R, Dittmann E (2005) Evolutionary implications of bacterial polyketide synthases. Mol Biol Evol 22(10):2027–2039.


41. Gürel G, Blaha G, Steitz TA, Moore PB (2009) Structures of triacetyloleandomycin and mycalamide A bind to the large ribosomal subunit of Haloarcula marismortui. Antimicrob Agents Chemother 53(12):5010–5014. 42. Wan S, et al. (2011) Total synthesis and biological evaluation of pederin, psymberin, and highly potent analogs. J Am Chem Soc 133(41):16668–16679. 43. Kellner RL (2002) Molecular identification of an endosymbiotic bacterium associated with pederin biosynthesis in Paederus sabaeus (Coleoptera: Staphylinidae). Insect Biochem Mol Biol 32(4):389–395. 44. Piel J, Höfer I, Hui D (2004) Evidence for a symbiosis island involved in horizontal acquisition of pederin biosynthetic capabilities by the bacterial symbiont of Paederus fuscipes beetles. J Bacteriol 186(5):1280–1286. 45. Lopanik N, Lindquist N, Targett N (2004) Potent cytotoxins produced by a microbial symbiont protect host larvae from predation. Oecologia 139(1):131–139. 46. Sudek S, et al. (2007) Identification of the putative bryostatin polyketide synthase gene cluster from “Candidatus Endobugula sertula”, the uncultivated microbial symbiont of the marine bryozoan Bugula neritina. J Nat Prod 70(1):67–74. 47. Partida-Martinez LP, Hertweck C (2005) Pathogenic fungus harbours endosymbiotic bacteria for toxin production. Nature 437(7060):884–888. 48. Fisch KM, et al. (2009) Polyketide assembly lines of uncultivated sponge symbionts from structure-based gene targeting. Nat Chem Biol 5(7):494–501. 49. Ross AC, et al. (2013) Biosynthetic multitasking facilitates thalassospiramide structural diversity in marine bacteria. J Am Chem Soc 135(3):1155–1162. 50. Carmeli S, Moore RE, Patterson GML (1991) Mirabimides A-D, new N-acylpyrrolinones from the blue-green alga Scytonema mirabile. Tetrahedron 47(12):2087–2096. 51. Sailer M, et al. (1993) 15N- and 13C-labeled media from Anabaena sp. 
for universal isotopic labeling of bacteriocins: NMR resonance assignments of leucocin A from Leuconostoc gelidum and nisin A from Lactococcus lactis. Biochemistry 32(1):310–318. 52. Schlotterbeck G, Ceccarelli SM (2009) LC-SPE-NMR-MS: A total analysis system for bioanalysis. Bioanalysis 1(3):549–559. 53. Watrous JD, Dorrestein PC (2011) Imaging mass spectrometry in microbiology. Nat Rev Microbiol 9(9):683–694. 54. Adams DG (2000) Symbiotic interactions. The Ecology of Cyanobacteria: Their Diversity in Time and Space, eds Whitton BA, Potts M (Kluwer Academic Publishers, Dordrecht, The Netherlands), pp 523–561. 55. Altschul SF, et al. (1997) Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. Nucleic Acids Res 25(17):3389–3402. 56. Bonfield JK, Smith KF, Staden R (1995) A new DNA sequence assembly program. Nucleic Acids Res 23(24):4992–4999. 57. Langmead B, Trapnell C, Pop M, Salzberg SL (2009) Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol 10(3):R25. 58. Yoshimura I, Yamamoto Y, Nakano T, Finnie J (2002) Isolation and culture of lichen photobionts and mycobionts. Protocols in Lichenology—Culturing, Biochemistry, Physiology and Use in Biomonitoring, eds Kranner I, Beckett RP (Springer-Verlag, Varma, AK), pp 3–33. 59. Waterbury JB (2006) The Cyanobacteria – isolation, purification and identification. The Prokaryotes, eds Dworkin M, Falkow S, Rosenberg E, Schleifer KH, Stackebrandt E (Springer, New York), Vol 4, pp 1053–1073. 60. Sinnemann SJ, Andrésson ÓS, Brown DW, Miao VPW (2000) Cloning and heterologous expression of Solorina crocea pyrG. Curr Genet 37(5):333–338. 61. Lorne J, Scheffer J, Lee A, Painter M, Miao VPW (2000) Genes controlling circadian rhythm are widely distributed in cyanobacteria. FEMS Microbiol Lett 189(2):129–133. 62. Rudi K, Skulberg OM, Jakobsen KS (1998) Evolution of cyanobacteria by exchange of genetic material among phyletically related strains. 
J Bacteriol 180(13):3453–3461. 63. Lee A (2003) Conservation of the cyanobacterial circadian clock: Comparative studies in Nostoc sp. strain PCC 9709, a cyanobacterium isolated from the lichen Peltigera membranacea. MS thesis (Univ of British Columbia, Vancouver).


19. Teta R, et al. (2010) Genome mining reveals trans-AT polyketide synthase directed antibiotic biosynthesis in the bacterial phylum bacteroidetes. ChemBioChem 11(18): 2506–2512. 20. Gu L, et al. (2007) GNAT-like strategy for polyketide chain initiation. Science 318(5852):970–974. 21. Piel J (2002) A polyketide synthase-peptide synthetase gene cluster from an uncultured bacterial symbiont of Paederus beetles. Proc Natl Acad Sci USA 99(22): 14002–14007. 22. Piel J, Wen G, Platzer M, Hui D (2004) Unprecedented diversity of catalytic domains in the first four modules of the putative pederin polyketide synthase. ChemBioChem 5(1):93–98. 23. Piel J, et al. (2004) Antitumor polyketide biosynthesis by an uncultivated bacterial symbiont of the marine sponge Theonella swinhoei. Proc Natl Acad Sci USA 101(46): 16222–16227. 24. Moldenhauer J, et al. (2010) The final steps of bacillaene biosynthesis in Bacillus amyloliquefaciens FZB42: Direct evidence for β,γ dehydration by a trans-acyltransferase polyketide synthase. Angew Chem Int Ed Engl 49(8):1465–1467. 25. Kusebauch B, Busch B, Scherlach K, Roth M, Hertweck C (2010) Functionally distinct modules operate two consecutive α,β→β,γ double-bond shifts in the rhizoxin polyketide assembly line. Angew Chem Int Ed Engl 49(8):1460–1464. 26. Keatinge-Clay A (2008) Crystal structure of the erythromycin polyketide synthase dehydratase. J Mol Biol 384(4):941–953. 27. Stachelhaus T, Mootz HD, Marahiel MA (1999) The specificity-conferring code of adenylation domains in nonribosomal peptide synthetases. Chem Biol 6(8):493–505. 28. Challis GL, Ravel J, Townsend CA (2000) Predictive, structure-based model of amino acid recognition by nonribosomal peptide synthetase adenylation domains. Chem Biol 7(3):211–224. 29. Khosla C, Tang Y, Chen AY, Schnarr NA, Cane DE (2007) Structure and mechanism of the 6-deoxyerythronolide B synthase. Annu Rev Biochem 76:195–221. 30. Calderone CT (2008) Isoprenoid-like alkylations in polyketide biosynthesis. 
Nat Prod Rep 25(5):845–853. 31. Zimmermann K, Engeser M, Blunt JW, Munro MHG, Piel J (2009) Pederin-type pathways of uncultivated bacterial symbionts: Analysis of O-methyltransferases and generation of a biosynthetic hybrid. J Am Chem Soc 131(8):2780–2781. 32. Miao VPW, Rabenau A, Lee A (1997) Cultural and molecular characterization of photobionts of Peltigera membranacea. Lichenologist 29(6):571–586. 33. Wastlhuber R, Loos E (1996) Differences between cultured and freshly isolated cyanobiont from Peltigera: Is there symbiosis-specific regulation of a glucose carrier? Lichenologist 28(1):67–78. 34. Stocker-Wörgötter E (2008) Metabolic diversity of lichen-forming ascomycetous fungi: Culturing, polyketide and shikimate metabolite production, and PKS genes. Nat Prod Rep 25(1):188–200. 35. Kaasalainen U, Jokela J, Fewer DP, Sivonen K, Rikkinen J (2009) Microcystin production in the tripartite cyanolichen Peltigera leucophlebia. Mol Plant Microbe Interact 22(6): 695–702. 36. Kaasalainen U, et al. (2012) Cyanobacteria produce a high variety of hepatotoxic peptides in lichen symbiosis. Proc Natl Acad Sci USA 109(15):5886–5891. 37. Magarvey NA, et al. (2006) Biosynthetic characterization and chemoenzymatic assembly of the cryptophycins. Potent anticancer agents from cyanobionts. ACS Chem Biol 1(12):766–779. 38. Narquizian R, Kocienski P (2000) The pederin family of antitumor agents: Structures, synthesis and biological activity. The Role of Natural Products in Drug Discovery, eds Mulzer J, Bohlmann R (Springer, New York), pp 25–56. 39. Cichewicz RH, Valeriote FA, Crews P (2004) Psymberin, a potent sponge-derived cytotoxin from Psammocinia distantly related to the pederin family. Org Lett 6(12):1951–1954. 40. Lee KH, et al. (2005) Inhibition of protein synthesis and activation of stress-activated protein kinases by onnamide A and theopederin B, antitumor marine natural products. Cancer Sci 96(6):357–364.


Supporting Information (SI) Appendix

Metagenomic natural product discovery in lichen provides evidence for a family of biosynthetic pathways in diverse symbioses

Annette Kampa1∗, Andrey N. Gagunashvili2∗, Tobias A. M. Gulder1, Brandon I. Morinaka1, Cristina Daolio3, Markus Godejohann3, Vivian P. W. Miao4, Jörn Piel1,5 & Ólafur S. Andrésson2

1 Kekule Institute of Organic Chemistry & Biochemistry, University of Bonn, Gerhard-Domagk-Str. 1, 53121 Bonn, Germany
2 Faculty of Life and Environmental Sciences, University of Iceland, Sturlugata 7, 101 Reykjavik, Iceland
3 Bruker BioSpin GmbH, Silberstreifen 4, 76287 Rheinstetten, Germany
4 Department of Microbiology & Immunology, University of British Columbia, V6T 1Z3 Vancouver, Canada
5 Institute of Microbiology, Eidgenössische Technische Hochschule (ETH) Zurich, Wolfgang-Pauli-Str. 10, 8093 Zurich, Switzerland
∗ These authors contributed equally to this work.

Author contributions. A.N.G. and Ó.S.A. initiated project; A.K., A.N.G., V.P.W.M., J.P., and Ó.S.A. designed research; A.K., A.N.G., B.I.M., V.P.W.M., J.P., and Ó.S.A. performed research; A.N.G. carried out bioinformatic analyses of WGS, isolated Nostoc strains and conducted gene expression studies; A.K. performed feeding studies and compound isolations; A.K., T.A.M.G., B.I.M., C.D., and M.G. performed metabolic analyses and elucidated structure; J.P. analyzed the trans-AT PKS genes and performed metabolic prediction; C.D. and M.G. contributed new reagents/analytic tools; A.N.G. and V.P.W.M. examined distribution of gene cluster; all authors analyzed data; A.K., A.N.G., T.A.M.G., V.P.W.M., J.P., and Ó.S.A. wrote the paper.


List of Figures, Tables and Data

Figure S1. Nostoc sp. second trans-AT PKS gene cluster
Figure S2. Nostoc sp. NRPS/PKS nos-like gene cluster
Figure S3. Cladogram of the clade containing nsp KS1 and KS2
Figure S4. Cladogram of the clade containing nsp KS3, 4 and 7
Figure S5. Cladogram of the clade containing nsp KS5, 6, 8 and 9
Figure S6. HPLC-MS chromatogram of Nostoc sp. N6 MeOH extract
Figure S7. MS/MS spectrum of nosperin
Figure S8. 1H-NMR spectrum of nosperin
Figure S9. 13C-NMR spectrum of nosperin
Figure S10. 2D-COSY spectrum of nosperin
Figure S11. HSQC spectrum of nosperin
Figure S12. HMBC spectrum of nosperin
Figure S13. Key 1H-1H COSY and HMBC correlations of nosperin
Figure S14. Sequences of nspA, nspC and nspF amplicons
Figure S15. 5′ flanking region of the nsp gene cluster
Figure S16. 3′ flanking region of the nsp gene cluster
Table S1. List of polyketide synthase genes and gene clusters in Peltigera membranacea metagenome
Table S2. Best hits of a BLASTN search of the nsp gene cluster
Table S3. Relative expression of the nsp gene cluster
Table S4. Analysis of KR domains present in the nsp PKSs
Table S5. Analysis of adenylation domains present in the nsp PKSs
Table S6. NMR spectroscopic data of nosperin in CD3CN
Table S7. List of primers used
Data S1. Sequences of PKS proteins encoded in the nsp gene cluster
Data S2. Analysis of NMR data for nosperin

[Gene map: ORF1 to ORF11; total length 22,429 bp]

| ORF | Protein size, aa | Proposed function | Closest homolog (protein, origin) | % identity | GenBank accession number |
|---|---|---|---|---|---|
| ORF1 | 610 | AMP-dependent synthetase/ligase | Cce_3758, Cyanothece sp. ATCC 51142 | 50 | YP_001805172 |
| ORF2 | 1,885 | PKS | PksB, Bacillus subtilis B11 | 39 | ABH03696 |
| ORF3 | 1,622 | PKS | ChiB, Sorangium cellulosum ‘So ce 56’ | 37 | YP_001614780 |
| ORF4 | 324 | Fatty acid desaturase | OA307_3202, Octadecabacter antarcticus 307 | 27 | ZP_05051826 |
| ORF5 | 314 | Acyltransferase | Mahau_1047, Mahella australiensis 50-1 BON | 45 | YP_004463067 |
| ORF6 | 329 | Fatty acid desaturase | Cyan7822_0629, Cyanothece sp. PCC 7822 | 39 | YP_003885940 |
| ORF7 | 136 | Unknown | Ava_3979, Anabaena variabilis ATCC 29413 | 45 | YP_324479 |
| ORF8 | 457 | Secretion protein HlyD | Npun_F1932, Nostoc punctiforme PCC 73102 | 68 | YP_001865528 |
| ORF9 | 647 | ABC transporter | Npun_F1933, N. punctiforme PCC 73102 | 79 | YP_001865529 |
| ORF10 | 368 | ABC transporter | Npun_F1934, N. punctiforme PCC 73102 | 71 | YP_001865530 |
| ORF11 | 373 | ABC transporter | Npun_F1935, N. punctiforme PCC 73102 | 71 | YP_001865531 |

Fig. S1. Nostoc sp. second trans-AT PKS gene cluster (GenBank accession number JQ975876). Deduced architecture of the encoded PKS proteins: ACP-KS-KR-ACP-KS0 and ACP-KR-MT-CR for ORF2 and ORF3, respectively.

[Gene map: ORF1 to ORF9, 38,419 bp]

| ORF | Protein size, aa | Proposed function | Closest homolog (protein, origin) | % identity | GenBank accession number |
|---|---|---|---|---|---|
| ORF1 | 2,603 | NRPS | OciB, Planktothrix agardhii NIVA-CYA 116 | 60 | ABI26078 |
| ORF2 | 3,302 | NRPS | Ava_1611, Anabaena variabilis ATCC 29413 | 81 | YP_322129 |
| ORF3 | 1,418 | PKS | Ava_1612, A. variabilis ATCC 29413 | 84 | YP_322130 |
| ORF4 | 3,474 | NRPS | Ava_1613, A. variabilis ATCC 29413 | 77 | YP_322131 |
| ORF5 | 265 | Proline dioxygenase | Npun_F2185, Nostoc punctiforme PCC 73102 | 90 | YP_001865729 |
| ORF6 | 367 | Alcohol dehydrogenase | NosE, Nostoc sp. GSV224 | 95 | AAF17283 |
| ORF7 | 303 | JmjC domain-containing protein | L8106_24340, Lyngbya sp. PCC 8106 | 35 | EAW38021 |
| ORF8 | 273 | Pyrroline-5-carboxylate reductase | NosF, Nostoc sp. GSV224 | 90 | AAF17284 |
| ORF9 | 669 | ABC transporter | Ava_1617, A. variabilis ATCC 29413 | 84 | YP_322135 |

Fig. S2. Nostoc sp. NRPS/PKS nos-like gene cluster (GenBank accession number GU591312). Deduced architecture of the encoded NRPS and PKS proteins: C-A(Methyl?-)Pro -PCP-C-AIle -MT-PCP, C-ASer -PCP-C-AMethyl-Pro -PCP-C-APhe -PCP, KS-AT-MT-ACP and C-A(Methyl?-)Pro -PCP-C-AGln -PCP-E-CPCP-TE for ORF1-4, respectively. The green line shows a 3.4 kb region with homology to the nsp gene cluster (see Table S2).


Fig. S3. Cladogram of the clade containing nsp KS1 and KS2 (highlighted in bold). Bootstrap values higher than 60% are shown at the nodes. Question marks refer to unknown substrates. Abbreviations for members of this clade and the clades shown in Figures S4-S5: Alb, albicidin, Xanthomonas albilineans; Bae, bacillaene, Bacillus amyloliquefaciens FZB 42; Bat, batumin/kalimantacin, Pseudomonas fluorescens BCCM ID9359; Bry, bryostatin, “Candidatus Endobugula sertula”, the endosymbiont of the marine bryozoan Bugula neritina; BBR1 and BBR2, Brevibacillus brevis NBRC 100599 clusters BBR47_31930-32020 and BBR47_39780-39920; BCER, Bac. cereus BGSC 6E1 cluster; BP, Burkholderia pseudomallei cluster (multiple strains); BTP, Bac. thuringiensis pondicheriensis BGSC 4BA1 cluster; CACI, Catenulispora acidiphila DSM 44928; CC, CC2, CC3, Clostridium cellulolyticum H10 clusters Ccel_0858-0868, Ccel_2373-2386 and Ccel_0965-0980; Chi, chivosazol, Sorangium cellulosum ‘So ce 56’; CJA, Cellvibrio japonicus Ueda107; CP, Chitinophaga pinensis cluster; Cor, corallopyronin, Corallococcus coralloides B035; DDAN, Dickeya dadantii Ech703 cluster; Dif, difficidin, Bac. amyloliquefaciens FZB 42; Dsz, disorazol, Sor. cellulosum ‘So ce 12’; Etn, etnangien, Sor. cellulosum ‘So ce 56’; GU, Geobacter uraniireducens Rf4; KA, Kordia algicida OT-1; Kir, kirromycin, Streptomyces collinus; Lkc, lankacidin, Str. rochei 7434AN; LLYN, Leptolyngbya sp. PCC7375 cluster; Lnm, leinamycin, Str. atroolivaceus; Mgs, migrastatin, Str. platensis; Mis, misakinolide, the endosymbiont of the sponge Theonella sp.; Mmp, mupirocin, P. fluorescens NCIMB 10586; Mln, macrolactin, Bac. amyloliquefaciens FZB 42; Onn, onnamide, the endosymbiont of the sponge T. swinhoei; Ped, pederin, the Pseudomonas sp. endosymbiont of Paederus fuscipes beetles; PksX, bacillaene, Bac. subtilis; PPOL, Paenibacillus polymyxa E681 cluster; Psy, psymberin, the endosymbiont of the sponge Psammocinia aff. bulbosa; Rhi, rhizoxin, Burk. rhizoxinica; SG, Str. griseus NBRC 13350 cluster; SGRF, Str. griseoflavus Tü4000 cluster; Tai, thailandamide, Burk. thailandensis; Vir, virginiamycin M, Str. virginiae.


Fig. S4. Cladogram of the clade containing nsp KS3, 4 and 7. For abbreviations see the legend of Fig. S3.


Fig. S5. Cladogram of the clade containing nsp KS5, 6, 8 and 9. For abbreviations see the legend of Fig. S3.


Fig. S6. (A) HPLC chromatogram at 220 nm (blue line) of Nostoc sp. N6 MeOH extract. Nosperin peak at retention time 30.3 min is indicated with an arrow. (B) Mass spectrum of nosperin peak at 30.3 min.


Fig. S7. MS/MS spectrum of nosperin.

Fig. S8. 1H-NMR spectrum of nosperin.

Fig. S9. 13C-NMR spectrum of nosperin.

Fig. S10. 2D-COSY spectrum of nosperin.

Fig. S11. HSQC spectrum of nosperin.

Fig. S12. HMBC spectrum of nosperin.

Fig. S13. Key 1H-1H COSY (bold lines) and HMBC (arrows) correlations of nosperin. The three partial structures are separated by dashed lines.

              10        20        30        40        50        60        70        80
nspA  AAGCAGATGGCGATCTAATTTATGCGACGATCAAGGGGTCAGCCTCAAATCATGGTGGACAATCCGCCGGTCTCACCGTA
126   AAGCAGATGGCGATCTAATTTATGCGACGATCAAGGGGTCAGCCTCAAATCATGGTGGACAGTCCGCCGGTCTCACTGTA
1316  AAGCAGATGGCGATCTAATTTATGCGACGATCAAGGGGTCAGCCTCAAATCATGGTGGACAGTCCGCCGGTCTCACTGTA
      ************************************************************* ************** ***

              90       100       110       120       130       140       150       160
nspA  CCGAATCCGCAACAGCAGGCAGCACTCTTAACCAATGCCTGGAAAGCCTCTGGTGTAGCCCCTAACACGATTAGTTTTAT
126   CCGAATCCGCAACAGCAGGCAGCACTCTTAACCAATGCCTGGAAAGCCTCTGGTGTAGCCCCTAACACCATTAGTTTTAT
1316  CCGAATCCGCAACAGCAGGCAGCACTCTTAACCAATGCCTGGAAAGCCTCTGGTGTAGCCCCTAACACCATTAGTTTTAT
      ******************************************************************** ***********

             170       180       190       200       210       220       230       240
nspA  CGAAGCCCACGGAACGGGCACAGCCTTAGGAGATCCCATTGAGATCCAGGGAATTCAACAAGCTTTTTCGGAATGGAGTG
126   CGAAGCCCACGGAACGGGCACAGCCTTAGGAGATCCCATTGAGATCCAGGGAATTCAACAAGCTTTTTCGGAATGGAGTC
1316  CGAAGCCCACGGAACGGGCACAGCCTTAGGAGATCCCATTGAGATCCAGGGAATTCAACAAGCTTTTTCGGAATGGAGTC
      *******************************************************************************

             250       260       270       280       290       300       310       320
nspA  AGACTCCCCAAGTCCCAATCTCCTGCGGTCTGGGTTCACTCAAAACCAACCTGGGGCATTTGGAGGCGGCTGCGGGCATA
126   AGACTCCCCAAGTCGCGATCTCCTGCGGTCTGGGTTCGCTCAAAACCAACCTGCGGCATTTGGAGGCGGCTGCGGGTATA
1316  AGACTCCCCAAGTCGCGATCTCCTGCGGTCTGGGTTCGCTCAAAACCAACCTGGGGCATTTGGAGGCGGCTGCGGGTATA
      ************** * ******************** *************** ********************** ***

nspA  GCAGGT
126   GCAGGT
1316  GCAGGT
      ******

              10        20        30        40        50        60        70        80
nspC  GCCTTGGTACACAAGTACGAAATGATGGACGACGCGGAACGACGGTTGATTCAGTGGGATTTGAATGCAACTTCAGTGGA
126   GCCTTGGTACACAAGTACGAACTGATGAACGACTCGGAACGACGGTTGATTCAGTGGGATTTGAATGCAACTTCCGTGGA
1316  GCCTTGGTACACAAGTACGAACTGATGAACGACTCGGAACGACGGTTGATTCAGTGGGATTTGAATGCAACTTCCGTGGA
      ********************* ***** ***** **************************************** *****

              90       100       110       120       130       140       150       160
nspC  ATATCCCAGCAATTGCGTTCACCAACTATTTCAAGCTCAAGTCGAGAGAACTCCGAATGCTGTGGCGGTGGCTTGTGGTG
126   ATATCCCAGCAATTGCGTTCACCAACTGTTTCAAGCTCAAGTCGAGAGAACTCCGAATGCTGTGGCGGTGGCTTGCGGTG
1316  ATATCCCAGCAATTGCGTTCACCAACTGTTTCAAGCTCAAGTCGAGAGAACTCCGAATGCTGTGGCGGTGGCTTGCGGTG
      *************************** *********************************************** ****

             170       180       190       200       210       220       230       240
nspC  AGCGGACTCTATCTTATGAAGAACTGAATCGCAAAAGTACAGCTTTAGCCAAATACTTGCAGCAACAAGGTGTCCAGTCG
126   AGCAGACTCTATCTTATGAAGAACTGAATCGCAAAAGTACAGCTTTAGCCAAATACTTGCAGCAACAAGGTGTCCAGTCG
1316  AGCAGACTCTATCTTATGAAGAACTGAATCGCAAAAGTACAGCTTTAGCCAAATACTTGCAGCAACAAGGTGTCCAGTCG
      *** ****************************************************************************

             250       260       270       280       290       300       310
nspC  GAAACCTTGGTTGCGGTTTGCGTCGAGCGATCGCTTGATACGATCGTCGCCTTGCTTGGGATTATGAAAGCTG
126   GAAACCTTGGTTGCAGTTTGCGTCGAGCGATCGCTAGAGATGATCGTCGCCTTGCTTGGGATTATGAAGGCTG
1316  GAAACCTTGGTTGCAGTTTGCGTCGAGCGATCGCTAGAGATGATCGTCGCCTTGCTTGGGATTATGAAGGCTG
      ************** ******************** ** * *************************** ****

              10        20        30        40        50        60        70        80
nspF  AGTCCGATACCAGAAAAGGTAGCAGAAATGTATAATGCGCCCGGAGGTCAAGGAGGGCATCTCATCTTTGACGGGCAGTT
126   AGTCCGATACCAGAAAAGGTAGCAGAAATGTATAATGCGCCCGGAGGTCAAGGAGGGCATCTCATCTTTGACGGGCAGTT
1316  AGTCCGATACCAGAAAAGGTAGCAGAAATGTATAATGCGCCCGGAGGTCAAGGAGGGCATCTCATCTTTGACGGGCAGTT
      ********************************************************************************

              90       100       110       120       130       140       150       160
nspF  TCATTGGGGCTATTGGGACGAAAAGAATCCTGATGCCAGTCTTGCTGAGGCAGCCGATCGCCTGACTCAGATTATGATCG
126   TCATTGGGGCTATTGGGATAAAAAGAATCCTGACGCCAGTCTTGGTGAGGCAGCCGACCGCCTGACTCAGATTATGATCG
1316  TCATTGGGGCTATTGGGATAAAAAGAATCCTGACGCCAGTCTTGGTGAGGCAGCCGACCGCCTGACTCAGATTATGATCG
      ******************  ************* ********** ************ **********************

             170       180       190       200       210       220       230       240
nspF  ACAAGTCGGAGATATCTCAAGGAGAACGCTTCTGCGATCTGGGATGCGGTGTTGGTGTGCCAGCTATGCGGATTGCAAAA
126   ACAAGTCGGAGATATCTCAGGGAGAACGCTTCTGCGATCTGGGATGCGGTGTTGGTGTCCCAGCTATGCGGATTGCAAAA
1316  ACAAGTCGGAGATATCTCAGGGAGAACGCTTCTGCGATCTGGGATGCGGTGTTGGTGTCCCAGCTATGCGGATTGCAAAA
      ******************* ************************************** *********************

             250       260       270       280       290       300       310       320
nspF  GCGAAAGAATGTTTCGTTGATGCAATTACCATCAGCAAGTACCAATACGACAAGGCGAAGCAGCTAGCTGAAGAAGCTGG
126   GCGAAAGAATGTTTCGTTGATGCAATTACCATCAGTAAGTACCAATACGACAAGGCGAAGCAGCTAGCCCAGGAAGCTGG
1316  GCGAAAGAATGTTTCGTTGATGCAATTACCATCAGTAAGTACCAATACGACAAGGCGAAGCAGCTAGCCCAGGAAGCTGG
      *********************************** ********************************  * ********

             330       340       350       360       370       380
nspF  TATGTCCGATCGAGTCCGCTTCATCCAGGGCAATGCGTTGGAAATGCCCTGCGAGGACGCA
126   TATGTCCGATCGAGTCCGCTTCATCCAGGGCAATGCCTTGGAAATGCCCTGCGATGACGCA
1316  TATGTCCGATCGAGTCCGCTTCATCCAGGGCAATGCCTTGGAAATGCCCTGCGATGACGCA
      ************************************ ***************** ******

Fig. S14. Sequences of nspA, nspC and nspF amplicons from P. membranacea from Vancouver mainland (“126”) and Vancouver Island (“1316”). Reference sequence shown in top line.
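The asterisk line under each alignment block marks columns where the reference and both amplicons agree. As a minimal sketch of how such a line can be recomputed, the Python below compares pre-aligned, equal-length sequences column by column. The function names (`identity_line`, `mismatch_positions`) are hypothetical helpers introduced here for illustration, and the example data are the final nspF block (positions 321 to 381) of Fig. S14; this is not the software that produced the figure.

```python
def identity_line(aligned):
    """Return '*' for columns where every sequence agrees and ' ' elsewhere."""
    lengths = {len(seq) for seq in aligned}
    if len(lengths) != 1:
        raise ValueError("aligned sequences must all have the same length")
    return "".join("*" if len(set(column)) == 1 else " " for column in zip(*aligned))


def mismatch_positions(aligned, start=1):
    """List alignment positions (counted from `start`) where the sequences differ."""
    line = identity_line(aligned)
    return [start + i for i, mark in enumerate(line) if mark == " "]


# Final block of the nspF alignment in Fig. S14 (positions 321-381).
reference = "TATGTCCGATCGAGTCCGCTTCATCCAGGGCAATGCGTTGGAAATGCCCTGCGAGGACGCA"
sample_126 = "TATGTCCGATCGAGTCCGCTTCATCCAGGGCAATGCCTTGGAAATGCCCTGCGATGACGCA"
sample_1316 = sample_126  # the two lichen amplicons are identical in this block

rows = [reference, sample_126, sample_1316]
print(identity_line(rows))
print(mismatch_positions(rows, start=321))
```

Passing `start=321` reports the differing columns in amplicon coordinates rather than block-local ones.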

[Gene map: 20.0 kb 5′ flanking region, ending at nspA]

| ORF | Protein size, aa | Proposed function | Closest homolog (protein, origin) | % identity | GenBank accession number | Genomic context of the closest homolog |
|---|---|---|---|---|---|---|
| ORF1 | 65 | Unknown | Npun_R3124, Nostoc punctiforme PCC 73102 | 97 | YP_001866539 | Chromosome (a) |
| recD | 744 | DNA helicase (RecD/TraA family) | Npun_AR019, N. punctiforme PCC 73102 | 94 | YP_001869949 | Plasmid pNPUN01 |
| parB | 317 | Plasmid partitioning protein | Alr8007, Nostoc sp. PCC 7120 | 52 | NP_478432 | Plasmid pCC7120γ |
| parA | 250 | Plasmid partitioning protein | CWATWH0003_5567, Crocosphaera watsonii WH 0003 | 61 | EHJ09658 | Unknown (b) |
| ORF2 | 196 | Unknown | Aazo_4564, ‘Nostoc azollae’ 0708 | 49 | YP_003722944 | Chromosome |
| ORF3 | 171 | Unknown | FJSC11DRAFT_4252, Fischerella sp. JSC-11 | 72 | ZP_08988044 | Unknown |
| parM | 319 | Plasmid segregation protein | Npun_AF025, N. punctiforme PCC 73102 | 98 | YP_001869952 | Plasmid pNPUN01 |
| ORF4 | 96 | Unknown | Npun_AF026, N. punctiforme PCC 73102 | 70 | YP_001869953 | Plasmid pNPUN01 |
| ORF5 | 73 | Unknown | N9414_03328, Nodularia spumigena CCY9414 | 61 | ZP_01631110 | Unknown |
| fixC | 256 | Flavin-dependent dehydrogenase | Npun_F2230, N. punctiforme PCC 73102 | 80 | YP_001865764 | Chromosome |
| ORF6 | 228 | Unknown | FrEUN1fDRAFT_4836, Frankia sp. EUN1f | 43 | ZP_06415138 | Unknown |
| ORF7 | 184 | Unknown | Npun_BR065, N. punctiforme PCC 73102 | 90 | YP_001870252 | Plasmid pNPUN02 |
| ORF8 | 297 | Unknown | Npun_BR066, N. punctiforme PCC 73102 | 95 | YP_001870253 | Plasmid pNPUN02 |
| ORF9 | 265 | Unknown | Npun_BR067, N. punctiforme PCC 73102 | 81 | YP_001870254 | Plasmid pNPUN02 |
| ORF10 | 103 | Transcriptional regulator (XRE family) | Ava_3497, Anabaena variabilis ATCC 29413 | 29 | YP_323999 | Chromosome (c) |
| ORF11 | 530 | Transposase (IS5 family) | Gll0151, Gloeobacter violaceus PCC 7421 | 61 | NP_923097 | Chromosome |
| ORF12 | 167 | Unknown, WD-repeat protein | Npun_R3184, N. punctiforme PCC 73102 | 91 | YP_001866584 | Chromosome |
| ORF13 | 317 | Transposase (IS4 family) | Nfla_9901, Nostoc flagelliforme str. Sunitezuoqi | 83 | ADO19316 | Unknown |
| (ORF14) (d) | – | Transposase (IS4 family) | Nfla_1703, N. flagelliforme str. Sunitezuoqi | 77 | ADO18996 | Unknown |
| ORF15 | 170 | Transposase (IS4 family) | Nfla_7904, N. flagelliforme str. Sunitezuoqi | 92 | ADO19252 | Unknown (e) |

(a) The homolog Asr7090 in Nostoc sp. PCC 7120 (69% identity; accession NP_490196) is encoded on plasmid pCC7120α.
(b) The homolog Alr8562 in Nostoc sp. PCC 7120 (57% identity; accession NP_489473) is encoded on plasmid pCC7120δ.
(c) Another homolog, Ava_B0280 in A. variabilis ATCC 29413 (24% identity; accession YP_320180), is encoded on plasmid A.
(d) Partial ORF.
(e) The homolog All7244 in Nostoc sp. PCC 7120 (53% identity; accession NP_490350) is encoded on plasmid pCC7120α.

Fig. S15. 5′ flanking region of the nsp gene cluster.

[Gene map: nspM and the 10.4 kb 3′ flanking region]

| ORF | Protein size, aa | Proposed function | Closest homolog (protein, origin) | % identity | GenBank accession number | Genomic context of the closest homolog |
|---|---|---|---|---|---|---|
| (ORF16) (a) | – | Transposase (IS4 family) | Nfla_1703, Nostoc flagelliforme str. Sunitezuoqi | 74 | ADO18996 | Unknown |
| (dnaB) (a) | – | Replicative DNA helicase (N-terminal domain) | Npun_F2831, N. punctiforme PCC 73102 | 84 | YP_001866307 | Chromosome (b) |
| marR | 123 | Transcriptional regulator (MarR family) | Ava_A0028, Anabaena variabilis ATCC 29413 | 72 | YP_320274 | Plasmid B |
| ORF17 | 124 | Transcriptional regulator (XRE family) | Ava_A0027, A. variabilis ATCC 29413 | 72 | YP_320273 | Plasmid B |
| (parB) (a) | – | Plasmid partitioning protein | Alr7083, Nostoc sp. PCC 7120 | 61 | NP_490189 | Plasmid pCC7120α |
| (ORF18) (a) | – | AAA ATPase/DNA topoisomerase | OSCI_10006, Oscillatoria sp. PCC 6506 | 48 | ZP_07108397 | Unknown (c) |
| (ORF19) (a) | – | Unknown | N9414_06264, Nodularia spumigena CCY9414 | 86 | ZP_01630751 | Unknown |
| (tetR) (a) | – | Transcriptional regulator (TetR family) | PCC8801_3314, Cyanothece sp. PCC 8801 | 79 | YP_002373441 | Chromosome |
| ORF20 | 86 | Unknown | Npun_F2218, N. punctiforme PCC 73102 | 91 | YP_001865757 | Chromosome |
| pmsr1 | 225 | Peptide methionine sulfoxide reductase | Npun_F2220, N. punctiforme PCC 73102 | 96 | YP_001865759 | Chromosome |
| pmsr2 | 163 | Peptide methionine sulfoxide reductase | Npun_F2221, N. punctiforme PCC 73102 | 91 | YP_001865760 | Chromosome |
| lysR | 305 | Transcriptional regulator (LysR family) | AXYL_02191, Achromobacter xylosoxidans A8 | 43 | YP_003978229 | Chromosome |
| nmrA | 292 | Transcriptional regulator (NmrA family) | Glr2279, Gloeobacter violaceus PCC 7421 | 73 | NP_925225 | Chromosome |
| ORF21 | 120 | Transposase (IS4 family) | Nfla_9901, N. flagelliforme str. Sunitezuoqi | 83 | ADO19316 | Unknown |

(a) Partial ORF.
(b) The homolog Ava_A0029 in A. variabilis ATCC 29413 (83% identity; accession YP_320275) is encoded on plasmid B.
(c) The homolog Ava_C0030 in A. variabilis ATCC 29413 (35% identity; accession YP_320308) is encoded on plasmid C.

Fig. S16. 3′ flanking region of the nsp gene cluster.

Table S1. List of polyketide synthase (PKS) genes and gene clusters in Peltigera membranacea metagenome

| Description | Origin | GenBank accession number |
|---|---|---|
| trans-AT PKS gene cluster 1, nsp | Cyanobiont | GQ979609 |
| trans-AT PKS gene cluster 2 | Cyanobiont | JQ975876 |
| NRPS/PKS gene cluster 3, nos-like | Cyanobiont | GU591312 |
| NRPS-PKS gene cluster 4 | Cyanobiont | KC407995 |
| NRPS-PKS gene cluster 5 | Cyanobiont | KC407996 |
| PKS gene cluster 6 | Cyanobiont | KC539822 |
| Heterocyst glycolipid gene cluster 7, hgl | Cyanobiont | KC489223 |
| Reducing type I PKS gene, pks1 | Mycobiont | GU441232 |
| Reducing type I PKS gene, pks2 | Mycobiont | GU460164 |
| Reducing type I PKS gene, pks3 | Mycobiont | GU477350 |
| Non-reducing type I PKS gene, pks4 | Mycobiont | HM180407 |
| Reducing type I PKS gene, pks5 | Mycobiont | HM180408 |
| Non-reducing type I PKS gene, pks6 | Mycobiont | HM180409 |
| Reducing type I PKS gene, pks7 | Mycobiont | HM180410 |
| Reducing type I PKS gene, pks8 | Mycobiont | HM180411 |
| Reducing type I PKS gene, pks9 | Mycobiont | HM180412 |
| Reducing type I PKS gene, pks10 | Mycobiont | HM189674 |
| Reducing type I PKS-NRPS gene, pks11 | Mycobiont | HM189675 |

Table S2. Best hits of a BLASTN search (a) of the nsp gene cluster as a query sequence against GenBank nucleotide database

| Hit | Accession number | Description | Score, bits | Identity | Gaps | E-value | nsp region | Hit region |
|---|---|---|---|---|---|---|---|---|
| 1 | GU591312 | Nostoc sp. ’P. membranacea cyanobiont’ hybrid NRPS/PKS gene cluster | 2,663 | 2,653/3,424 (77%) | 40/3,424 (1%) | 0 | 41,139–44,546 | 17,007–20,406 |
| 2 | AF204805 | Nostoc sp. GSV224 nostopeptolide biosynthetic gene cluster | 2,639 | 2,662/3,440 (77%) | 41/3,440 (1%) | 0 | 41,139–44,557 | 13,781–17,200 |
| 3 | CP001037 | Nostoc punctiforme PCC 73102, complete genome | 2,551 | 2,498/3,225 (77%) | 73/3,225 (2%) | 0 | 41,388–44,557 | 2,679,901–2,683,107 |
| 4 | CP000117 | Anabaena variabilis ATCC 29413, complete genome | 1,867 | 2,041/2,704 (75%) | 37/2,704 (1%) | 0 | 41,879–44,558 | 2,005,385–2,008,075 |
| 5 | AY522504 | Lyngbya majuscula jamaicamide biosynthesis gene cluster | 847 | 1,691/2,478 (68%) | 62/2,478 (3%) | 0 | 42,088–44,540 | 56,984–59,424 |

(a) Default parameters were used.
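The Identity and Gaps columns of Table S2 report counts over the alignment length together with a rounded percentage. As a small worked check, the sketch below recomputes those percentages from the counts listed in the table. The helper name `percent` and the tuple layout are illustrative assumptions, and rounding to the nearest whole percent is assumed because it reproduces the printed values.

```python
def percent(part, whole):
    """Percentage of `part` in `whole`, rounded to the nearest integer."""
    return round(100 * part / whole)


# (identical positions, gap positions, alignment length) as listed in Table S2
alignments = [
    (2653, 40, 3424),
    (2662, 41, 3440),
    (2498, 73, 3225),
    (2041, 37, 2704),
    (1691, 62, 2478),
]

for identical, gaps, length in alignments:
    print(
        f"{identical}/{length} ({percent(identical, length)}%) identical, "
        f"{gaps}/{length} ({percent(gaps, length)}%) gaps"
    )
```

For example, 1,691 identical positions over an alignment of 2,478 columns gives 68%, and 62 gap positions over the same length gives 3%.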

Peltigera RNA (thallus tip, poly(A) mRNA depleted); Peltigera RNA (thallus base, poly(A) mRNA depleted); Nostoc total RNA (BG-11 medium without NO3−); Nostoc total RNA (BG-11 medium with NO3−)

Sample

99,320,008

126,704,492

97,101,786

78,020,202

Total number of reads

0.551

0.334

0.014

0.005

nsp cluster

4.798

1.182

0.214

0.083

rpoB operonb

5.419

7.814

1.163

0.490

rbcLXS operonc

Datasets

2.598

1.402

0.318

0.108

rpoB gene

Length adjusted read counta

b

Average number of mapped reads per base. Levels for Peltigera apothecia and rhizines datasets were BryKS9 EPIAIIGMSCLFPGEVTTVDEFWELLIQERHAIQPL-PKG-R----W-QWPEGV--D--P---S------------------------GAQLG------I--D--Q-GGFL-DGIDTFDADFFRISRKEAELMDPQQRKLLELSWQVIEHAGY-----K PSVF----------S-G-----QEIGIYVGAC----HGN--YR------EL--LT---KSDQS-LKTTEGYLMT---G-S M-LSFLPNRISYFYNFKGPSVAIDTACSSVLVAIDQAVYAIQSGRCDQALVGGINLMST---PS-DTVSYY--QAG-ML--SKSGK-CKTFDATADGYVRGEGGAMLFLKSLSQAEYDKDCIYGVIKGVGVNHGGQAS-SLTVPKPDSQAALLKSVYLK ANVHPDTISYIETHGTATPLGDSVEVNALKQAF--G-SFYPDMAQ-----K------------------------K--LN S--------R----------------Y-C-GLGSVKTNIGHLEAASGIAGIIKILLSMRYRRIPASL-NYTQLNPN-IE--LSDS--PFYIVNKSRDW-PV-LQ-----DSRGGEIP--RRAGVNVY-GGGGVNAHVVLEE >KAKS2 TDIAIIGISCKLPGNITSVDALWEALTEEKSLISSY-PKE-R--GNWANSEEFK----------------------------------G----------I--D--Q-GGFM-QNAAAFDASFFRISPKEAQITDPQQRILLELAWACLEDAGI-----P AKKL----------I-G-----SNTGVFVGAS----NAD--YS------RK--VQ---DAKLA-I---EAHHAV---G-S S-LAILANRLSYYFDLSGPSLLVDTACSSSLVAIHSAVQSLHSGECDSAIVGGVNFICH---PD-LSIAYH--KAG-ML--SPEAK-CKVFDSKANGYVRGEGTVVMLLKPLKKAIEDKNQIHGVIKGSAINHGGLAA-GLTVPNPKKQSELLVNAWKN ANISTSDLSYIEAHGTGTSLGDPIEVQGIKNAL--T-TH-----S-----D------------------------SKTLE K---------------------------C-TIGSIKSNVGHLESAAGMTGMLKVILAMKHKQLPASI-HFSTLNPK-IN--LNDS--PIQIQNTLTSW--------------QSEKP--LLAGVSSF-GSGGTNGHVVLEE >BatKS1 LDIAIVGAACRLPFDIETLDGFWQLLLNGEEVIGTV-PEG-R----W-QWPVDI--E--P--HT------------------------K-HRG------I--D--Q-GAFL-SDVARFDGAFFRISPHEAELMDPQQRILLELSWQCLEDAGY-----C PSSL----------A-G-----SRTGVYVGAS----GSD--YR------LL--VD---RTGAG-I---EGHSGL---G-T S-MSILPNRISYFYDFCGPSMLIDTACSSSLVAVHQAIESLRAGNCTQAMVAGINIMCD---PS-TTVAYY--RAG-ML--STDGR-CRTFDASANGYVRAEGAVMLLLKPYEYALRDNDNIYAVLKGSAVNHGGLAG-GLTVPNPRAQAELVMGACRN ADIDINSLGYIEAHGTGTSLGDPIEVAGLKSAF--E-RF-----TADGDPV------------------------N---A-------------------------R-C-GLGSLKTNFGHLEAAAGIAGLLKAMLCLRHNVIPGSL-NFDRLNPL-ID--LSGT--PFHVVDATTAW------------PASHEEP--RRAGVSSF-GSGGTNAHVIVEE >PsyKS1 
KGVAIVGMACRLPGGITTPEALWTVLAEGRDVVGTV-PAG-R----W-VWPQET--G--P--EH------------------------G-DPG------I--D--C-GGFL-DDIARFDAKLFRISPREAKVMDPQQRLLLELAWSAFEDAGY-----S KDAV----------E-G-----TKTGVFVGAS----GSD--YR------LL--LE---QHRVN-I---EPVMGT---G-T A-VAVLPNRISYFFDLQGPSLLIDTACSSSLVAIHEAVQALRAGSCEQALVGGINIMCH---PA-MTLAYY--KAG-ML--SPDGR-CKTFDAEANGYVRSEGAIVMMLKPLSAAQRDGDRIYAVVKGSACNHGGQAG-GLTVPNPQQQTALLRAAWAS ARVTPDQLGYLEAHGTGTSLGDPIEVKGMQDAF--R-AD-----------D------------------------N--IA A--------A----------------TTC-YLGSVKSNLGHLEAAAGIAGLMKLALCLYHRQLVSSL-HVHTVNPK-LG--LEQT--PFQIAQQVMAW-----------PTLKSGQP--SLTGVSSF-GSGGTNAHVVVEG >BP17KS1 DGIAIVGIACRLPGGLDTPEAFWDALKAGACVVGEL-PGD-R----W-TWPADI--D--P--GA------------------------R-HRG------I--D--R-GGFL-DDIRSFDAGLFRLSPKEVATMDPQQRILLELAWEAIERAGH-----C ADAV----------A-G-----SRTGVYVGAS----GSD--YR------LL--LE---RAGTG-V---DAHVAT---G-A S-MAVIANRISYTYDLRGPSIQVDTACSSSLVALHQAVQALRAGECDQALVGGVNVICH---PG-NTIAYY--KAG-ML--SPQGR-CKTLDDAADGYVRSEGAVMLMLRRLEQAVADGDPIHAVIRGSACNHGGLTG-GLTVPHPDRQADLLRAAWAA ARVSADDIGYLEMHGTGTRLGDPIEVRGLADAF--GARD-----------D------------------------A--AA R--------G----------------T-C-GIGSVKSNLGHLEAAAGLAGVLKTVLALKHREVPATI-HFSRLNAQ-IS--LART--PFAVVDTHRAW------------PARGGAR--RLAGVSSF-GSGGANAHVVLEE >NspKS1 DAVAIVGLACRLPGGINNPESLWDLLKNGGSVIEKL-PEG-R----W-QWPRDI--N--P---E------------------------GKHQG------I--D--W-GGFL-SQIDLFDAGFFRISAIEAQSMDPQQRILMELAWQTLENAGI-----T ANKV----------A-G-----TSTGVFVGAS----GSD--YC------RV--ME---RVGIP----IEAHVAT---G-T S-LAALANRISYFFDLRGPSIVIDTACSSSLMAVHQAVQSIRAGECLQALVGGIHIMSH---PA-NSIAYY--KAG-ML--AHDGK-CKTFDDRADGYVRSEGAVIFLLKQLRQAEADGDLIYATIKGSASNHGGQSA-GLTVPNPQQQAALLTNAWKA SGVAPNTISFIEAHGTGTALGDPIEIQGIQQAF--S-EW-----SETPQVP------------------------I-----------------------------S-C-GLGSLKTNLGHLEAAAGIAGLLKVVLCLQHQELPANQ-HFGHLNRH-IN--LADS--PFYIVDRHQKW-----------DRPNESIP--RRAGVSSF-GSGGANAHVVVEE >PedKS1 
DDVAIVGMACRLPGGVDSPTAFWQCLQEGASLIGDL-PVE-R----W-EWPDDI--D--P--QR------------------------A-HKG------I--A--R-GGFI-EDVKAFDAPFFRISPAEAQSMDPQQRMLLELCWQTIEHAGY-----A PDAL----------A-G-----TDTGVFIGAS----GSD--YA-----RLL--ER---SESPL-----DAHYGT---G-S S-MAVLANRLSYFYDFTGPSLLLDTACSSSLVAVHKAVQSIWAGESVQALVGGVNLILH---PA-NSIAYY--KAG-ML--AKDGL-CKTFDQQADGYVRGEGAVMLLLKPLARALESRDRVYAVIKGTACNHGGQAG-GLTVPNPERQSTLLCSAWQS AGIDPSDLGYIEAHGTGTSLGDPIEVRGLKDAF--A------------ATR------------------------C--GA G--------Q----------------N-C-GLGSVKTNLGHLEAAAGIAGLLKVVLCLQHRQLPASL-HFQQLNAH-IE--LERG---LYVVDRLQPW-----------TPPSSGRV--RFAGVSSF-GSGGANAHVVVAE >OnnKS1_AcGNAT GHIAIVGLACRLPGGIETPQALWQLLKNGESAVGSL-PSG-R----W-NWPADI--D--P--DN------------------------R-HRG------I--D--Q-GGFL-DDIAGFDAAFFRLSTTEVESMDPQQRMLLELSWQVLEDAGY-----A PKDL----------K-K-----SQTGVFIGAS----GSD--YS------YL--LN---QSPVS----VEAHFGT---G-S A-MAVLANRISYFYDFYGPSLVVDTACSSSLVAVHKAVQSLRVGECDQVLVGGVHVMCH---PA-NSLAYY--QAG-ML--AKDGK-CKTFDQQANGYVRAEGAVMLLLKPLEAAVADQDQVFAVIRGTSCNHGGLAS-GLTVPNPEQQAALLQQAWRD ARISPLELSYLEAHGTGTALGDPIEIQGMKDAF--A-GY-----VKARSLP------------------------V--PE I--------R----------------S-C-GLGSIKTNLGHLEAAAGIAGLLKVVLAFRHRELPPLL-HFKQLNDH-ID-

--LANT--PFYPVDQLRSW------------DVPEGAI--RKAGVSSF-GSGGTNSHVVLEE >PsyKS10 RDIAIVGMSGRYPKA-HDLAAYWNVLRKAVDCVEEM-PQG-R----W-PLEGFF--E--P---D---------KAK--AV A----E---G-KSY------T--T--L-GGWL-DGIDEFDPLFFNISPQEARFMDPQERLFLQVVWGCLEEAGY-----I QPDW---------QKHP-----RDIGVFVGVS----YNN--YQ------LF--LA---EALKK-----GAHYSV-----G SQTYSIANRISYFFNLTGPSTTVDTACSSSLFAIHQACEAIYNGTTKMALAGGVSLSLH---PS-KYVTLC--ASG-FA--SSKGH-CHAFAADGDGLIPSEAVGVVLLKPLADAQADGDRILAVIKGTGVSQDGKTQ-GYTVPNPVAQTRAIRMAMDR AGVHPETISYVEAHGTGTALGDPIEVQGLVDAY--Q-PY-----T-----E------------------------K---K--------Q----------------F-C-AMGSVKANYGHGEAAAGIGQLTKVVLQLQHQTLVPSL-LHGPMNPN-LD--FRQT--PFVVQRGLAPW--------------IADHP--RRAGISSF-GAGGVNVHLIVEE >TaiKS10 LDIAIIGLAGRYPHA-PTLDAFWRNLVAGRDCVDEI-PPQ-R----W-PLDGFY--E--A---D---------PAR--AA A----E---G-KSA------S--K--W-GAFL-SDVDQFDPLFFGITPNEARLTDPHERLFLETAWACVEDAGY-----T RASL-------AALRDG-----PGVGVFVGAS----FNQ--YQ-----LIV--SD---AAQRR-----GARQFA---A-P SQIFSISNRVSYVMNFTGPSLTVDTACSSSLYAIHLACESLRRGESSVALAGGVNLSLH---PS-KYVSLS--LGR-FL--AADGR-CRAFDEGGTGYVPGEAVGAVLLKPLADAERDGDAIHGVIRGSGVSHGGRTN-GFAVPSPDAQALAIRRAVAQ AGVAPRSVGYVEAHGTGTALGDPVEIAGLEDVF--R-AG-----T-----D------------------------D---V--------G----------------F-C-AIASVKTMIGHSEAAAGIAQLTKVLLQMKHATLVRNLSHGGAPNPN-LA--LERS--PFRVVRDNEPW------------AAGDGGV--RRAGVSSF-GAGGANAHLIVDA >MmpKS2 GQIAIIGMAGVYPKS-ADLSAFWDCLANAGDCIETV-PEQ-R----W-SLDEHF--N--A---D---------RQR--AI Q----E---G-KSY------G--K--W-GGFI-EGLDDFDPMFFNFSLAEATYMHPKERQFIQCAWHALEDAGY-----T PASL----------E-Q-----EKVGVFVGVS----KAG--HD---------------NYKDS------------------FFSIANRVSYRFGFTGPSLPVDTACSSSLTAVHEACLHLQAGECTVAVVGGVNAYTH---PS-TFAEFA--RLG-VL--SADGK-SRAFGAGANGFVPGEGVGALLLKPLERALADGDMVHGVIAASAVNHGGKAN-GYTVPNPEAQRAMIRLALDR AGVSADQVTYVEAHGTGTALGDPIEFRGLVEAF--R-QD-----T-----E------------------------R---T--------G----------------F-C-RLGSVKSNIGHLEAAAGIAGLSKVLLQMRHGQIAPSL-HSQQMNPD-ID--ASLS--PFRVPQALEPW---------DVDGEGAQ-E--RIACLSSF-GAGGANAHVILRQ >TaKS4 
EPIAIIGLSGRYPEA-HDLTAFWSNLARGADCVSEL-PSD-R----W-PLDVFF--D--A---D---------KDA--AI A----S---G-KSY------G--K--W-GGFL-GGLYDFDPLFFRISPREVEIIHPESRLFLQCAWHVVEDAGY-----T PTAL----------A-S-----ERVGVFVGAS----KVG--VA---------------EQHHT------------------FFTIPNRVSYALNLKGPSLAIDTACSSSLVAIHLACQQLQSGDCTLAIAGGVNTYTH---PY-HFADLS--KLQ-ML--STEGR-SKAFGAGANGFVPGEGVGAVLLKPLGRALRDGDHIYGVIRGSEINHGGKTN-GFTVPNPRAQAEVITRALAR AKVPARHITYVEAHGTGTDLGDPIEIRGLTEAF--A-AQ-----T-----P------------------------D---L--------A----------------Y-C-RIGSVKTNIGHLEAAAGIAGLSKILLQGQHRKYVPSL-HSSTLNPH-ID--FGKT--PFVVQQTCEDW--------LPCDAQGQSIP--RIACISSF-GAGGSNAHLIIEE >DifKS4 EGIAIIGVSGRYPQA-ETLEEYWDNLAAGKDCVTEI-PEE-R----W-EKDKYY--N--P---D---------PEA--AV K----E---G-KSY------S--K--R-GGFL-KDAAFFDPLFFQISPRDALNMDPQERLFIEMCWQVLEDAGY-----T KEQL-EN------LY-N-----GRVGVFAGIT----KNG--YG------MY---------WKN-----GGTLQP---K-T S-FSSAANRVSYFLNLKGPSMPVDTMCSSSLTAIHEACENILRGECDMAIAGGVNVYTH---PS-AYAELS--AYR-ML--SKDGT-CKSFGEGGDGFVPGEGVGAVLLKPLSKAMADKDHIYGVIRGTHINHGGRTN-GYTVPSPSAQADVISGALEK SGIHPRMISCIEAHGTGTELGDPIEIDGLTRAF--Q-TK-----T-----S------------------------D---K--------G----------------F-C-AISSVKSNIGHLEAAAGIAGVTKLLLQLQKRKLAPSL-HCRKLNPN-IV--FDRT--PFVVQRELADW-KRPVI---EENGAQKEIP--RIAGISSF-GAGGANAHILIEE >EtnKS9 LDVAIIGLAGRYPHA-RTLDEFWENLRAGKDCITEV-PPD-R----W-PIEGFY--D--G---A---------AES--SP G----N---G-KSY------S--K--W-GGFL-EGFADFDPLFFGISPGEAESLDPQERSFLEICWEVIEDAGY-----T RRTL-QS------RH-G-----GRVGVYAGIT----KTG--FD------LY--GP---ALWAT-----GKDAYP---H-T S-FSSLANRVSYLFDLRGPSIPVDTMCSASLTAIHEACEHLKRGECELAIAGGVNLYLH---PS-NYVILC--SHR-ML--SPDGK-CRSFGAGANGYVPGEGVGAVLLKLLRRAEADGDHIYGVIKGTAINHGGRTH-GYTVPNPNAQAEVVAMALAR ARIDARAVSYIEAHGTGTELGDPIEITGLTKAF--R-AS-----A-----A------------------------G--PL T--------R----------------R-C-AIGSLKSNIGHAESAAGIAGVTKVLLQMKHGQLVPSL-HARTPNPN-ID--FDRT--PFVVQQELESW-GRLTQ---VVDGRTEELP--RIAGISSF-GAGGANAHVVIQE >BatKS4 EGIAIIGLSGRYPGA-RTVAEFWENLRDAKNCISEI-PGE-R----W-SLTGFY--E--P---D---------MET--AL 
D----Q---G-KSY------S--K--W-GGFV-EGFADFDPLFFGISPREATNIDPQERLFLEASWSVLEDAGY-----T RESL-AA------RH-A-----GNVGVYVGIT----KAG--FN------LY--GP---PLWQR-----GEKSMP---Y-T S-FSSVANRVSYVLDLHGPSMPIDTMCSASLTAIHEACENLLRGECELAIAGGVNLYLH---PS-SYIGLC--AQR-ML--SLDGQ-CKSFGAQGNGFVPGEGVGAVLLKPLGRAIADGDHIYGVIKGTSVNHGGRTN-GYTVPSPTAQRDLVLAALRK ANVDPRTMSYIEAHGTGTELGDPIEIEGLVQAF--E-QY-----T-----H------------------------D---K--------Q----------------F-C-ALGSAKSNIGHLESAAGIAGVTKVLLQMRHRTLVPSL-HSHHLNPL-ID--FART--PFRVQQAVGEW-PRPVI---DSNGMSMEHP--RIAGVSSF-GAGGANAHVIIEE >EtnKS3 GPIAIIGLSGRYPQA-EDVGEFWEHLKAGRSCTREI-PAD-R----W-SLDGFY--E--P---D---------PEQ--AV M----E---G-KSY------G--K--W-GGFL-DGFADFDPLFFNVSPRESMNIDPQERLFIECCWSVLEDAGY-----T RDLI-AS------QH-G-----GRVGVFAGIT----KTG--FD------LH--GP---EMRLH-----DRQSFP---H-T S-FGSVANRVSYLLNLRGPSMPIDTMCSSSLTAIHEACEHLLRDECEIAIAGGVNLYVH---PS-TYVGLS--AQR-ML--SRTGH-CKSFGEGGDGFVPGECVGAVLLKPLQRAIDDGDHIYAVVKGTSINHGGKTN-GYTVPNPVAQRELILSALAR AGVDARAVSYVEAHGTGTELGDPIELAGLTQAF--R-AW-----T-----E------------------------E----

R--------E----------------Y-C-AIGSVKSNIGHCESAAGIAGVTKVVLQMKHGVLVPSL-HSERLNPH-ID--FENS--PFFVQRELSAW-RRPVV---KVDGKEREHP--RIAGVSSF-GAGGANAHVVLEE >EtnKS18 RRIAIIGLSGRYPQA-RTVDEYWENLKAGKDSVVEI-PPE-R----W-SLDGFY--C--D---D---------VEE--AV T----R---G-KSY------C--R--W-GGFL-DGFTEFDPLFFHISPLEAEGIDPQERLFMQASWDVLEDAGY-----T RESL-AR------RH-G-----GRVGVFAGIT----RGG--FE------LH--GP---GLWQR-----GIAAFP---R-T S-FSSVANRVSYFLNLHGPSMPVDTMCSSSLTAIHEACEHLLRDECELAIAGGVNLYVH---PS-SYVMLC--LGR-ML--SPRGR-CRSFGVDGDGMVPGEGVGTVLLKRLSRAEADGDHIYGVILGTSINHGGKTN-GYTVPNPAAHKALILSALGK AGVDARAVSYVEAHGTGTELGDPIEIAGLTQAF--R-AS-----T-----P------------------------D---V--------Q----------------F-C-AIASAKSNIGHCEAAAGIAGLTKVLLQMKHGQIAPSL-HVEELNPH-IA--FEKT--PFVVQRRLGEW-KRPTL---SVGGRSVEAP--RIAGISSF-GAGGANAHVIVAE >EtnKS19 RRIAIIGLSGRYPQA-RTIDEYWENLKAGKDSVVEI-PPE-R----W-AIEGFY--C--E---D---------PRA--AV E----A---G-KSY------C--R--W-GGFL-DGFAEFDPMFFNISPLEAEGIDPQERLFMQASWDVLEDAGY-----T RESL-------ARRH-G-----GRVGVFAGIT----KTG--FD------LY--GP---ELWRR-----GAAAFP---H-T S-FSSVANRVSYFLNIRGPSMPIDTMCSSSLTAIHEACEHLLRGECELAIAGGVNLYLH---PS-SYVLLC--MAG-ML--SKQGR-SRSFGRGADGMVPGEAVGAVLLKPLSRAEADGDPIYGVILGTSINHGGKTN-GYLVPNPAAHRELILSALAR AGVDARAVSYVEAHGTGTELGDPIEIAGLTQAF--R-AS-----T-----P------------------------D---V--------Q----------------F-C-AIASAKSNIGHCEAAAGIAGLTKVLLQMKHGQIAPSL-HVDELNPH-IA--FEET--PFVVQRRLGEW-KRPTL---TVGGRSVEAP--RIAGISSF-GAGGANAHVIVAE >LLYNKS4 TEIAIIGISGRYSQA-DNLDTFWKNLEAGHHCVTEI-PAD-R----W-SLEGFF--E--A---D---------VEK--AV K----Q---T-KSY------S--K--W-GSFL-PGFADFDPLFFGIPPSEAMKMDPQERLFLQATWEALEDAGH-----T RTML-AD------QY-G-----GRVGVFVGIS----RTG--FE------LF--GA---DFQAR-----GGRFQP---R-T S-FSSAANRVSYFLNIHGPSMPIDTMCSSSLTAIHEACEHLRRGTCDLVIAGGVNLFLH---PS-SYVELC--ASG-ML--SQDGK-CKSFGAGGDGFVPGEGVGAVVLKPLQRAIADGDNIHAVIRGSSVNHGGKTN-GYTVPNPLAQRDLIQAALAQ 
AGVHARSMSYIEAHGTGTELGDPIEITGLSLAF--E-AD-----T-----D------------------------D---T--------G----------------F-C-AIGSVKSNLGHLEAAAGIAGLTKVVLQLKHGRLTPSL-HATALNPN-IS--FEKT--PFYVQHQAGPW-PRPEL---ALQGESHQYP--RLAGVSSF-GAGGTNAHVVLEE >SGRFKS6 GGVAIVGMSARFPQA-ETLDAFWTVLREGRNTVTEI-PAD-R----W-ALDGFY--E--P---D---------KEE--AV A----Q---G-RSY------S--K--W-GAFL-EGFADFDPLFFEMTPNDALEIDPQERLFLQEAWRAFENAGY-----T RDRL-AR------QH-G-----SRVGVFVGIT----RAG--HN------LH--GP---ERWRR-----GDTARP---Y-T S-FGSAANRVSFKLDLHGPSMPIDTMCSSSMTALHEACEHLLHGDCDMAVAGGVNLYLH---PS-SYVGLC--DLG-ML---ADGDECRSFGAGGNGFVPGEGVGALVLKRLEDAERDGDHIHAVIRGTSINHGGTTN-GYTVPNPVAQGRVVRDALDR AGIDARTISYVEAHGTGTVLGDPVEINGLTRAF--S-QD-----T-----R------------------------D---V--------R----------------Y-C-AIGSVKSNIGHAEAAAGIAGVFKTVLQMQHGELVPSL-HADEVNPH-ID--FDAT--PFVVQRAGAEW-KRPVV---TVDGVGTEVP--RRAGVSSF-GAGGSNAHVVIEE >SGKS9 DAIAIIGMSGRYPGA-RNPDEFWQNLTAGRDRITEI-PGD-R----W-PLDGFY--E--P---D---------RAT--AV A----N---G-LSY------S--K--W-GAFL-EDFDAFDPHFFRIAPRDAYAMDPQERLFLQASWEVIEDAGY-----T REEL-AR------RH-Q-----RRVGVFAGVT----KSG--HA------RH--GE---ARLPS-----GERIAP---A-L S-FASLSARTSYVLDLRGPSLTIDTMCSSSLTAIHEACEHLRRGSCELAVAGGVNLYLH---PS-DYVELC--RSG-ML--SSQSR-VRSFGRGADGFVPGEGVGAVLLKPLARARADGDRVLAVIRGTSINHGGRSN-GYTVPSPAAQAELIGEALTR AGISAREVGYVEAHGTGTELGDPIEVKGLSLAF--E-KF-----T-----G------------------------D---R--------Q----------------F-C-AIGSVKSVIGHLEAAAGIAGVTKAVLQLRHRTLVPSL-HAEEPNPG-IR--FEET--PFRLQRALAPW--------------ESARP--RVTAVSSF-GAGGSNAHVIIEE >TaiKS11 EPIAIIGIAGRYPQA-SDLDQFWRNLAGGVDSVTTI-PPG-R----W-PLEGFF--E--P---D---------RAR--AL E----A---G-KSY------G--K--W-GAFL-DDHERFDAPFFQMSPLEAINLDPQIRLFIETCWAALEDAGY-----T RRLL--------AERHG-----KRVGVFAGIT----KTG--YA------LH--GP---RVWPL-----RQSYNP---Q-T S-FSALVNRVSYVLDLHGPSVPIDTMCSSSLTAVHEACEHIRSGACALALAGGVNLYLH---PS-NFLELA--GVQ-ML--SADGR-CKSFGQGADGFVPGEGVGAVLLKPLSRALADGDHVHAVIRATGINHGGKVN-GFTVPNPNAQRELIRATLER AGVRARRVSYVEAHGTGTDLGDPIEVAALAQAF--A-AS-----G-----G------------------------D--RD 
T--------G----------------Y-C-ALGSVKSNIGHLEAAAGIAGLTKVVLQMTHGMRAPTL-HARIPNEK-IP--FART--PFVLQRERDDW--PHAH-------GDAAQS--RIATVSSF-GAGGANAFAV-->MigKS4 EPIALIGLSGRYPDA-PTLEAFWENLRAGRESVREV-PAE-R----W-PLDAFY--E--P---D---------PQR--AV Q----Q---G-ASY------S--K--W-GAFL-DDFARFDAAFFGIAPRDAADMDPQERLFVESAWSVLEDAGY-----T RQRL-AE------QH-A-----SSVGVFAGIT----KTG--FD----------RH---RPPAT-----DGLPPA---PRT S-FGSLANRVSYLLDLHGPSMPIDTMCSSSLTAIHEACEHLRHGACELAIAGGVNLYLH---PS-SYVELC--RSR-ML--ATDGH-CRSFGAGGDGFLPGEGVGAVLLKPLSAAEADGDPIHAVIVGSAINHGGRTN-GYTVPNPRAQAALIRDALDR AGVSAAGIGYIEAHGTGTRLGDPVEIDGLTQAF--A-PD-----A-----G------------------------G---S--------G----------------A-C-ALGSVKSNIGHLEAAAGIAGLTKAVLQLQHGEFAPTL-HAEQTNPD-ID--FAAT--PFTLQTGGAPW-----------PRPADGGP--RRAGISSF-GAGGANAHVIVAE >BP17KS3 EPIAIVGVSGRYPQA-RDLDAFWDNLMRGRDSITEI-PPE-R----W-PLDGFY--D--E---D---------RER--AI G----A---S-RSY------A--K--W-GGFI-DGFAEFDPQFFNLSPREASNMDPQERIFLQACWEALEDAAY-----T RARI-AR------EH-G-----GRLGVFAGIT----RAE--FC------LY--GA---GNLKQ-----GKAPFT---S---FCSLVNRVSYFLDANGPSIPIDTMCSSSLVAVHEACDKLRLGECEVALAGGVNLSLH---PY-MYVSLS--AQR-ML--SSDGR-CKSFGLGGNGYVPGEGVGVIVLKPLSRALADGDRIHATIRATSINHGGKTN-GYTVPNPIAQQNVIRSALDR

AGVHARAVSYVEAHGTGTELGDPIEIAGLSGAF--R-RD-----T-----S------------------------D---R--------G----------------F-C-AIGSVKSNIGHLEAASGLAGLTKVLLQMKHGLLVPSL-HASELNPN-ID--FPAS--PFVVNRETRAW-ERPVI-------DGREHP--RIAGVSSF-GAGGTNAHVILEE >PsyKS5 EPIAIVGISGRYPGS-PDLDAFWEHLKAGDDLVTEI-PSD-R----W-SLDGFF--H--A---S---------VEE--AI A----Q---G-KSY------A--K--W-GSFV-EGVSEFDPLFFALSPREAMDMDPQERLFLQTAWEALEDAGL-----T RAAL-QH------QY-H-----RRVGVFAGVT----KTG--YE------LY--GP---TLRAQ-----GEQVFP---H-T S-FSSMANRLSYFLDVEGPSMPIDTMCSSSLSAIHEACLHIRSGACEIALVGAVNLYLH---PA-TYTALC--SRR-ML--SAQGR-CASFGEGARGFVPGEGVGVLVLKPLARAEADGDQIYGMIRGSHMNHGGKTN-GYTVPNPKAQAQLIRDALRH AGVEARAISYIEAHGTGTALGDPIEIEGLSQAF--A-SS-----T-----D------------------------D---R--------A----------------F-C-ALGSVKSNMGHLEAAAGMAGLTKVLLQMKHQTLVPSL-HAKTLNPN-ID--FEKT--PFVVQRELAPW-TRPMI-------NGQPMP--RIAGLSSF-GAGGANVHVIVEE >LLYNKS5 EPIAIIGVSARFPQA-ETLDAFWSMLRSGTHLVTEI-PSQ-R----W-PLAGFY--E--P---D---------SKQ--AV A----Q---G-KSY------G--K--W-GAFL-EGFADFDPLFFNLSPQEALRMDPQERLFLQTSWSALEDAGY-----T PARL-AQ------CC-D-----SNVGVFAGIT----KTG--YD------LY--GP---ELWQT-----GDTLQP---R-T S-FASVANRVSFLLNLKGPSMPVDTMCSSSLTAIHEACQRLRLGDCALALAGGVNLYLH---PA-NYVELS--TLQ-ML--SKDGR-CRSFGNGGDGFVPGEGVAVVLLKPLANALADGDQIHAVIRGSQVNHGGRTN-GYTVPNPKAQADLIRRTLDK AGVDARTVSYVEAHGTGTSLGDPIEVEGLTQAF--R-AD-----T-----S------------------------A---T--------G----------------F-C-ALGSVKSNLGHLEAAAGMAGLVKIVLQMRYRQLVPSL-HAAQLNPH-IE--FDET--PFQLQQTLADW-PRPTV---SVNGTTQEVP--RIACLSSF-GAGGSNAHVVVEE >LLYNKS7 EPIAIIGISGHYPQA-DSLAAFWSNLSVGKDGVTEI-PPE-R----W-SLEGFY--L--E---Q---------PEE--AV A----Q---A-MSY------G--K--W-GSFL-DGFAEFDPQFFNISGREAMSMDPQERLFLQSAWAALEDAGY-----T RETL-------QSRF-D-----GNVGVFAGIT----KTG--YD------LH--GA---EWRQH-----GKLFYP---H-T S-FGSVANRVSYLLNLHGPSMPIDTMCSSSLFAIHEACERLRQGVCPMAIAGGVNLYLH---PT-NYVELS--ALK-ML--SKGGK-CRSFGEGGDGFVPGEGVGVVILKPLAQALQDRDRIHGVIRATAVNHGGKTN-GYTVPNPKAQAEVIRRTLDL 
AGIHARTVSYVEAHGTGTALGDPIEVTGLTQAF--H-HD-----T-----T------------------------D---K--------G----------------F-C-ALGSVKSNIGHLEAAAGMAGLTKILLQMKHGKLAPSL-HADRLNPN-IS--FENT--PFQVQRSLADW-HRPRV---NVDGTPREYP--RIAGLSSF-GAGGANAHVILEE >TaKS2 EPIAVIGMSGRYPGA-ENLTEFWERLSRGDDCITEI-PPE-R----W-SLDGFF--Y--P---D---------KKH--AA A----R---G-MSY------S--K--W-GGFL-GGFADFDPLFFNISPREATSMDPQERLFLQSCWEVLEDAGY-----T RDSL-AQ------RF-G-----SAVGVFAGIT----KTG--YE------LY--GA---ELEGR-----DASVRP---Y-T S-FASVANRVSYLLDLKGPSMPVDTMCSASLTAVHMACEALQRGACVMAIAGGVNLYVH---PS-SYVSLS--GQQ-ML--STDGR-CRSFGAGGNGFVPGEGVGAVLLKPLSKAIADGDSIYAVIRSTSVNHGGKTN-GYTVPDPQAQAALIRSALDK AGLSARDVSYLEAHGTGTELGDPIEVAGLTQAF--R-RD-----T-----D------------------------A---R--------A----------------Y-C-RIGSLKSNIGHLEAAAGIAGLTKVILQLRHKRLVPSL-HAESLNAN-IA--FGET--PFVVQRGFEDW-ARPRV---EVDGVLSERP--RLAGVSSF-GAGGANAHVLVEE >PedKS2 EAIAVIGMSGRFASA-ADLDEYWQLLAAGESCISEV-PAE-R----W-AVEDFF--H--P---D---------PDE--AI V----Q---G-KSY------S--K--W-GGFL-QGVTEFDPLFFNISPKEARSIDPQERLFLRASWEVLEDAGY-----T RERL-TK------DF-Q-----QQLGVFAGIT----RTG--FD------LF--GP---ELWRQ-----GYTVYP---H-T S-FSSVANRVSYFLNARGPSVPVDTMCSSSLTAVHQACQSLRNGECRMAIAGGVNVYLH---PS-GYVGLS--AAH-ML--SKDGV-CRSFGKGANGFVPGEGVGAVLLKPLSQALADNDQIHAIIRSTQVNHGGKTN-GYTVPNPLAQAELIRRALDK AGLNARAVSYVEAHGTGTDLGDPIEVSGLTQAF--R-HD-----I-----Q------------------------E---N--------G----------------F-C-ALGSVKSNIGHLEAAAGIAGMIKVILQMRHGQLVPSL-NAAEINPN-ID--FSRT--PFVLQRELAPW--TPPYTL-GQNGEREEGT--RIAGVSSF-GAGGGNAHVILEE >RhiKS8 EPIAVIGMSGRYPQA-DNLDEYWRNLKEGRDCISEI-PAN-R----W-PLQDFY--V--D---S---------VQQ--AL K----Q---G-RSY------S--K--W-GGFV-EGFADFDPLFFNISPREAVNMDPQERLFLQSCWQALEDAGI-----T KADL-VE------HY-Q-----GNVGVFAGIT----KTG--YG------LY--GP---ALREQ-----GHLIFP---R-T S-FSSTANRVSYVLGLNGPSEPVDTMCSSSLTALHRACESLRRGECRMAFAGGVNLYLH---PS-NYVELC--AGQ-ML--SKDGC-CRSFGEGGNGFVPGEGVGVLLLKPLSQALADRDNIHAIIRATAVNHGGKTN-GYTVPNPQAQAKVIRLAIEQ 
AGLHASQISYIEAHGTGTELGDPIEVSGLSQAF--A-MD-----G-----V------------------------D---L--------G----------------R-C-ALGSVKSNIGHLEGAAGIAGVSKVILQLKHKQIAPSL-HAAKLNPN-LT--LDTT--PFYLQQALGDW------------QPAAEGP--RIAGVSSF-GAGGANAHVLLQE >NspKS2 DAIAIIGMSGRYPMA-KDLSTFWERLKAGTDCISEI-PRD-R----W-SWEDHF--H--P---H---------VEE--AV E----L---G-KSY------C--K--W-GSFL-DDAKEFDAQFFGISPHEVVNIDPQERIFLQAAWEALEDAGH-----T RQML-ES------QY-Q-----QQVGVFVGVT----RTG--FD------LY--GP---ELWKQ-----GKIGHP---H-T S-FSSIANRLSYFLNLRGPSMPIDTMCSSSLTAVHEACQHLQNGSVQLAFAGGVNLYLH---PS-SYNILC--ASK-ML--SKDGR-CKSFGNHSNGFVPGEGVGVVLLKPLSVAIADRDNIYAVIRGTHVNHGGKTN-GYTVPNPKAQAELVFEALEK SGVDAREVSYIEAHGTGTELGDPIEVAGLTKAF--Q-RH-----T-----Q------------------------D---T--------S----------------F-C-AIGSVKSNVGHLEAASGIAGLTKAILQLRHKQLVPSL-HADELNPN-ID--FPQT--PFYVQRHLQHW-PQPLV---EVDGKTKTVP--RIAGVSSF-GAGGVNAHVVVEE >OnnKS2 EPIAIIGMSGHYPMA-DTPDAFWEMLKNGQDCIREI-PPD-R----W-PLEGFY--L--A---D---------QDQ--AV A----S---G-KSY------S--K--W-GGFL-ENPFDFDARFFNISPKEAKDMDPQERIFLQAAWEALEDAGY-----D KKSL-AT------RY-R-----QRVGVFVGIT----RTG--FD------LY--GP---QLWEQ-----GNTAYP---H-T S-FSSVANRISYLLDLRGPSMPIDTMCSSSLTAIHEACQRIQFGECEMAFAGGVNLYVH---AS-SYVGLC--ASR-ML-

--SKDGR-CKSFGTGSNGFVPGEGVGVVLLKPLSQAIKDHDPIHAVVRGTYVNHGGKTH-GYTVPNPNAQGELIREALNR AGVHARTVSYVEAHGTGTELGDPIEVTGLTQAF--R-QE-----T-----Q------------------------D---S--------G----------------F-C-ALGSVKSNIGHLESAAAMAGLTKIILQMKHGMLAPSL-HARELNPK-IP--FEKT--PFVIQQELAPW-QRPTV---SLDGDIKEYP--RIAGISSF-GAGGSNAHVILEE >BryKS13 EPIAIIGLSGHYPQA-NSLDAYWENLKAGKDCIREI-PDD-R----W-SLDGFF--H--E---D---------VEE--AI A----Q---G-KSY------S--K--W-GGFL-EGFADFDPLFFNLSPREVMTIDPQERLFLQSAWEAVEDAGY-----T RAQL-AS------QF-N-----KRVGVFAGIT----KTG--FN------LY--AG---DLNSQ-----AELFYP---Y-T S-FSMLVNRVSYFLDLQGPSIPVDTMCSSSLTAIHEACEHLHRQRCELAIAGGVNLYLH---PS-SYIHLC--AGH-IL--SKNNR-CSAFGQGGDGFVPGEGVGCVLLKPLSCAERDGDNIYAVILGSHTNHSGRAG-GMG-PNLNAQSDLIIENLRQ CGIAPDTIGYVESASNGSHLGDSIELRALDKAF--S-QH-----T-----K------------------------K---R--------D----------------F-C-AIGSVKPNIGHLESASGMSQLTKVLLQLRHKQLVPSI-HAQPLNSN-ID--FEDT--AFRLQKEVEEW-KRLIV---QVNGENKEIP--RRAAINSF-GAGGVNANLIIQE >BryKS16 EPIAIIGLSGHYPQA-NSLDAYWENLKAGKDCIREI-PDD-R----W-SLDGFF--H--E---D---------VEE--AI A----Q---G-KSY------S--K--W-GGFL-EGFADFDPLFFNLSPREVMTIDPQERLFLQSAWEAVEDAGY-----T RAQL-AS------QF-N-----KRVGVFAGIT----KTG--FD------FY-------GIQSD-----QLFSAY-----T S-FSSVANRVSYFLGLQGPSLSIDTMCSSSLTAIHEACEHLHRQRCELAIAGGVNLYLH---PS-TYIRLC--TLR-ML--SKEGL-CKSFGYGGNGFVPGEGVGAVLLKPLSRAIQDQDSIYAIIRGSCVNHGGKTN-GYTVPNPHSQGDLIREALDK AQVNARMVSYIEAHGTGTELGDPIEVRGLTQAF--Q-QD-----T-----D------------------------D---V--------G----------------F-C-VLGSVKSNIGHLEAAAGIAGLSKVILQMKYEKIVASL-HAERLNAN-IN--FEQT--PFVVQQSLNEW-ERPNL---HVNGKIKEYP--RTAGISSF-GAGGTNAHIIIQE >PedKS6 EPIAIIGLSGRYPQA-ETLEEFWENLQAGKDCVSEI-PED-R----W-RLENFF--H--P---D---------PKE--AV A----Q---G-KSY------S--K--W-GGFI-EGFAEFDPLFFNISPREALAMDPQERLFLQCAWHVLEDAGY-----T RQSL---------QQGG-----HKVGVFVGIT----KTG--FD------LY--GP---ELWHR-----GERLFP---H-T S-FSSVANRVSYCLNLKGPSMPIDTMCSSSLTAIHEACQHLRQGDCDMAIVGGVNMYVH---PS-TYVGLC--SAY-ML--SRDGQ-CRSFGQGGNGFVPGEGIGAVLLKPLARAQEDDDLIHAVIRSSSVNHGGRTN-GYTVPNPNAQAELIGDCLKK 
AGVDARSIGYIEAHGTGTELGDPIEVNGLAQAF--G-QE-----A-----G------------------------E---H--------S----------------R-C-FLGSVKSNLGHLEAAAGMAGLTKVILQMRHGQIVPSL-HAQVLNPN-ID--FAAT--PFTVPQQLVEW-RRTIL---QESGRSRELP--RRAGLSSF-GAGGSNAHLILEE >OnnKS6 EPIAIIGISGRYAQA-DTLDDFWHNLKAGKDCVTEI-PQE-R----W-SLDDFY--L--A---D---------REA--AV A----Q---G-KSY------S--K--W-GSFL-DGFADFDARFFAIAPREVRSLDPQERLFLQCSWEALEDAGY-----T RESL-------QRHH-Q-----RNVGVFVGIT----KTG--FD------FY--EP---ELLAQ-----GETLHP---H-T S-FGSVANRVSYFLNLQGPSMPIDTMCSSALTAIHEACEHLRHGACELAIAGGVNLYVH---PL-SYVRLC--SAR-ML--SGDGR-CKSFGEGGNGFVPGEGVGAVLLKPLSAAIRDRDHIWAVIRGSGVNHDGKTN-GYTVPNPNAQGQLIADTLKK AGVPARAISYVEAHGTGTELGDPIEVTGLTQAF--S-QD-----T-----N------------------------D---R--------G----------------F-C-ALGSVKSSLGHLEAAAGMAGLTKIVLQMKHGQIVPSL-HARVLNPN-IN--FDKT--PFVVQQELGIW-KRPQI---EINGMVQEIP--RMAGLSSF-GAGGANAHLIIEE >KAKS3 EPIAIIGISGIYPEA-KDLDEFWENLQEGKNSISEI-PPS-R----W-SLDDFY--E--P---D---------MQK--AV K----E---G-KSY------A--K--W-GGFI-NEFAQFDPMFFGISPREALNIDPHERLFLQESWRALESSGY-----T KHDL-KQ------KY-H-----QKVGVFAGVT----KTG--FE-----LK---AP---FYRND-----NHKFHP---R-S S-FSSVANRLSYVLDIKGPSMPIDTMCSSSLTAIHEACEHIHRGECEVAFAGGVNLYLH---PS-TYSYLS--SQH-ML--SVDGQ-CKSFGLGGNGFVPGEGVGVVLLKPLSEAVKDNDVIHGVILSTHVNHGGKTN-GYTVPSPVSQSQLIKTAIKK AGINARDISYIEAHGTGTELGDPIEVEGLKKAF--D-AD-----T-----T------------------------D---K--------N----------------Y-C-AIGSVKSNIGHLEAAAGISGLTKVLLQMKHGKIVPSL-HTEDLNPN-ID--FEKT--AFQVATDLQNW-ERPFV-------QNEEKP--RIAGISSF-GAGGANAHVILQE >CPKS11 EPIAIIGISGRYPGG-DNLAAFWESLKNGKDSISEI-PAD-R----W-SLDDFY--V--P---D---------KTA--AL E----N---G-KSY------S--K--W-GGFI-NGHADFDPQFFNISPREAVNIDPQERLFLQACWEVMEDAGY-----T RRQI-AE------QF-A-----HRVGVFAGIT----RTG--FD------LY--GP---ELWRK-----GAAYFP---R-T S-FSSLANRTSYLLNLRGPSMPVDTMCSSSLTAIHEACEHLYRNECEMAIAGGVNLYLH---PS-SYVLFC--SQQ-ML--AADGK-CKSFGEGGDGFVPGEGVGVVLLKKLSLAERDNDHIYAVIKASGVNHGGKTN-GYTVPNPVAQAALISETITK 
AGIEAEAISYIEAHGTGTELGDPIEITGLSQGF--A-ST-----------E------------------------N---Q-------------------------F-C-SIGAVKSNIGHCEAAAGIAGITKVVLQMKHGLIAPTL-HAETLNPK-IR--FEST--PFVVKRELTEW-KRPVR---EKHGLMQEVP--RIAGISSF-GAGGSNAHVLVQE >PPOLKS10 RDVAIIGMSGRYPGA-RNIEEYWNNLREGKNCITEI-PKD-R----W-DWKTYF--D--Q---E---------KGK--------R---G-TIY------T--K--S-GGFI-EGIDLFDPLFFKISPAEAETMDPQERLFLEAAYASIEDAGY-----T PANL----------CES-----RKVGVFAGVM----NGN--YG----------------LGPQ------------------YNSISNRVSYHLDFQGPSMSLDTACSSSLTAIHLALESLYSGTSECAIAGGVNLIVD---PI-HYIKLS--AMT-ML--SKTGE-CKSFGDQADGFVDGEGVGAIVLKPLEKAIADGDHIYGVIKGSMLNASGKTS-GYTVPSPQAQSWLIEEALQR ADVDARTVSYIEAHGTGTALGDPIEIAGLTRAF--E-KF-----T-----K------------------------D---K--------Q----------------F-C-AIGAVKSNIGHCESAAGIAGVTKILLQLKHSQLVPSL-HSRKLNPE-IN--FNNT--PFIVQQSLEEW-KRPVI---TTNGITKEYP--RRAGISSF-GAGGANAHIVIEE >GUKS10 EPIAIIGMSGRYPQA-DNLGDYWENLKAGRDCITEI-PED-R----W-PLDGFY--H--A---D---------PQE--AV E----M---R-KSY------S--K--W-GGFV-RGFADFDPLFFNISALDAINTDPQERLFIQSCWEVFEDAGY-----T REQL-AA------QF-K-----GRVGVFAGIT----KNG--FA------LY--GP---DLWRQ-----GETISP---R-T

S-FSSVANRVSYLFNLRGPSMAVDTMCSASLTAIHEACEHLHHKECEMAIAGGVNLYLH---PA-SYVELC--ALQ-ML--STDGQ-CKSFGLDGNGFVPGEGVGVVLLKPLSRAIADRDQIHAVIRGSRTNHGGKTN-GYTVPSPVAQGELIRETLDK AGIHARAVSYIEAHGTGTKLGDPIEITGLAQAF--H-KD-----T-----A------------------------D---T--------G----------------Y-C-AIGSVKSNIGHLEAAAGIAGVTKIVLQMKHRSHVPSL-HSVNLNPN-ID--FAKT--PFTVQQELSEW-VRPVV---EIDGETREYP--RIAGISSF-GAGGSNAHVILEE >BBR2KS7 EPIAIIGMSGRYPKA-RSLNEYWENLKSGKDCITEI-PEE-R----W-SLDGFF--E--P---D---------PDK--AV A----E---G-KSY------G--K--W-GGFV-DGFADFDPLFFNMSPWEAMHFDPQERLFMESCWEVLEDAGY-----T RQQL-AE------KY-N-----RRVGVFGGIT----KTG--FS------LY--GP---DLWKQ-----GELIYP---H-T S-FSSLTNRVSYFLNLQGPSMPIDTMCSASLTAIHEACEHLYNGDCELAIAGGVNLYLH---PS-SYVFLS--ALH-ML--SVDGQ-CKSFGQGGNGFVPGEGVGTVLLKPLSKAIADGDHIYGLIRGTSVNHGGKTN-GYTVPNPTAQGELIRQALDK AGVHAKTVSYIEAHGTGTELGDPIEISGLIQAF--R-KD-----T-----Q------------------------D---T--------G----------------Y-C-AIGSVKSNIGHLEAAAGIAGVAKILLQMKHQQLVPSL-HAKELNPN-IP--FSKT--PFVVQQDLVEW-KRPLM---EVNGVLREFP--RIAGISSF-GAGGSNAHVVIEE >PPOLKS2 EPIAIIGMAGKYPWS-RDMNEYWENLKAGKDCVTEI-PRE-R----W-TLEGFF--H--P---D---------PQE--AV V----N---A-KSY------S--K--W-GSFI-EGFAEFDPLFFNISPREALSMDPQERLFIEACWEALEDAGY-----T REQL-AV------RH-N-----RRVGVFAGIT----KTG--YE------LY--GP---ELWKK-----GAQIFP---Q-I S-FSSVANRISYLFNLQGPSMPIDTMCSSSLTAIHEACEHLYRGECEMAIAGGVNLYLH---PL-SYVGLC--ANH-ML--SADGQ-CKSFGKGGNGFVPGEGAGVVLLKPLSKAIADGDLIHAVIRGTSINHGGKTN-GYTVPNPTAQGELIRAALER AGVHARTVSYIEAHGTGTELGDPIEVTGLTQAF--R-KD-----T-----P------------------------D---T--------G----------------F-C-AIGSVKSNIGHLEAAAGIAGVGKIILQMKNQTLVPSL-HAQELNPN-IH--FGQT--PFVVQQELGEW-KRPVV---EIDGETREYP--RIAGISSF-GAGGSNAHVIIEE >PPOLKS4 EPIAIIGMAGKYPWS-RDMNEYWENLKAGKDCVTEI-PRE-R----W-MMEGFF--H--P---D---------SQE--AV A----N---G-KSY------S--K--W-GGFI-EGFADFDPRFFNIAPREALGMDPQERLFIEACWEALEDAGY-----T REQL-------AVRH-N-----RRVGVFAGIT----KTG--FD------LY--GL---DLWRK-----GTQIYP---H-T 
S-FSSVANRISYLFNLQGPSMPIDTMCSSSLTAIHEACEHLYRGECEMAIAGGVNLYLH---PL-SYVGLC--ANH-ML--SADGQ-CKSFGKGGNGFVPGEGAGVVLLKPLSKAIADGDLIHAVIRGTSINHGGKTN-GYTVPNPTAQGELIRAALER AGVHARTVSYIEAHGTGTELGDPIEVTGLTQAF--R-KD-----T-----P------------------------D---T--------G----------------F-C-AIGSVKSNIGHLEAAAGIAGVGKIILQMKNQTLVPSL-HAQELNPN-IH--FGQT--PFVVQQELGEW-KRPVV---EIDGETREYP--RIAGISSF-GAGGSNAHVIIEE >BatKS10 SAIAIVGMSGAFANS-RSVDEFWRNLLEGNESVDEV-PLH-R----W-PVEKFY--D--P---D---------PLA--------M---K-KTY------S--K--W-MGTL-QDVDRFDAQFFNISRAEAEVIDPQQRLFLEHSWSCIEDAGI-----N PRSL----------S-A-----SRCGVFVGCG----SGD--YA---------------AYADF-----SAEGMM---G-N A-PSILAARISYFLNLKGPCLAIDTACSSALVAIAEACDSLLLGRSDLALAGGVNVLTG---PG-MHIMAS--RAG-ML--SKQGR-CFTFDDRADGFVPGEGVGVVLLKRYSDALRDEDIIHGVIRGWGVNQDGKTN-GITAPSVTSQIALEKDVYKQ FCIDPQTIGLVEAHGTGTGLGDPIEVQALTETF--R-SF-----T-----E------------------------R---Q--------N----------------F-C-ALGSVKSNIGHLLAAAGIAGVIKVLLAMRHQTLPPTI-NFKQLNSR-IN--LEGS--PFFINTEACVW------------NAGAGGV--RRAAVSSF-GFSGTNSHIVIDES >EtnKS2 EGIAVIGMSGAFPKA-RDLDEFWENIARGVDCVTEI-PPD-R----W-AIAEHY--D--P---R---PDA--------------P---R-KTY------S--K--W-MGVL-DDVDRFDPLFFGITPTEAELMDPQQRLFLEHCWSCIESAGI-----D PQRL----------S-G-----SRCGVYVGCM----PGD--YV------RV--GD---GTDLT------LHWLM---G-S S-SSILAARISYCLNLKGPCLAIDTACSSSLVAIVEACDSLLLGRSDLALAGGVSVAVR---PE-THIIGS--GAG-ML--SPRGR-CFTFDARADGFVPAECVGVVLLKRLADAVRDGDIIAGVLRGWALNQDGRTN-GITAPSVTSQAALQKQVYDH FGIDPETISLVEAHGTGTKLGDPIEVQALTESF--R-AY-----T-----D------------------------E---R--------E----------------Y-C-ALGSVKSNIGHGLAGAGVAGVLKVLLALRHERLPPTI-HYATRNEH-IT--FEGS--PFYVNTELRPW------------ARRAGSP--RRAAINSF-GFSGTNAHVVIEE>BatKS5 DGIAVIGLSGAFPKA-RNAQVFWENLAQGLDCVSEI-PSS-R----W-SIEEHY--D--P---N---------PEA--------V---G-KTY------C--K--W-MGCL-EEADRFDPLFFNISPAEAISMDPQQRLFLEHSWSCIENAGI-----D PQSL----------S-G-----SRCGVYAGCG----PGD--YG------YT--PG---GSGLT------AQVLT---G-N 
S-SSILSARISYFLNLRGPCVALDTACSSSLVAIALACDSLVGGSCDVALAGGVCVLTG---PS-LHIMTS--KAR-ML--SPQGR-SFAFDARADGFVPGEGVGVLLLKRFADAVRDGDQIQGVVRGWGVNHDGKTN-GMTAPSANSQASLEKDVYAR FEIDPATITLVEAHGTGTKLGDPIEVEALTASF--Q-AY-----T-----Q------------------------Q---R--------H----------------Y-C-ALGSVKSNIGHLLAAAGVSGVIKALLALQHRMLPPTI-QYQSLNEH-VM--LDGG--PFYINEALKPW------------SLEGSGS--RRAAVSSF-GFSGTNAHVVLEE>BatKS11 DGIAVIGLSGAFPKA-RNAQVFWENLAQGLDCVSEI-PSS-R----W-SIEEHY--D--P---N--------------------PEAVG-KTY------C--K--W-MGCL-EEADRFDPLFFNISPAEAISMDPQQRLFLEHSWSCIENAGI-----D PQSL----------S-G-----SRCGVYAGCG----PGD--YG---------------YTPGG-----SGLTAQVLTG-N S-SSILSARISYFLNLRGPCVALDTACSSSLVAIALACDSLVGGSCDVALAGGVCVLTG---PS-LHIMTS--RAR-ML--SPQGR-SFAFDARADGFVPGEGVGVLLLKRFADAVRDGDQIQGVVRGWGVNHDGKTN-GMTAPSANSQASLEKDVYAR FEIDPATITLVEAHGTGTKLGDPIEVEALTASF--Q-AY-----T-----Q------------------------Q---R--------H----------------Y-C-ALGSVKSNIGHLLAAAGVSGVIKALLALQHRMLPPTI-QYQSLNEH-VM--LDGG--PFYINEALKPW------------SLEGSGS--RRAAVSSF-GFSGTNAHVVLEE>BryKS1 EGIAVIGLSGQYPKS-KTLEQFWQTLADGVDCISEI-PAD-R----W-SLEEYY--S--P---I---------PEG------------G-KTY------C--K--W-MGVL-EDMDCFDPLFFAISPREAEVMDPQQRLFLENAWSCIEDAGI-----N

PKML----------S-R-----SRCGVFVGCG----AND--YS-----ALM-NSS---HSTSL-----ELMKEL---G-N N-SSILSARISYFLNLKGPCLAIDTACSSSLVAIAESCNSLVLGTSDLALAGGVLLMPG---PS-LHIGLS--HGE-ML--SVDGR-CFTFDQRANGFVPGEGVGVVLLKRMSDAVRDGDPIRAVIRGWGVNQDGRSN-GITAPSSKAQSALEQEVYQR FNIDPSSITLVEAHGTGTKLGDPIEVEALAESF--R-VY-----T-----D------------------------K---R--------H----------------Y-C-ALGSVKSNIGHLGVGAGIAGVTKVLLSLQHRMLPPTI-HCEDVNPQ-IA--LEGS--PFYINTELKPW------------QSGDSIP--RRAGVSSF-GFSGTNAHLVLEE>BryKS5 EGIAVIGLSGQYPKS-KTLEQFWQTLADGVDCISEI-PAD-R----W-SLEEYY--S--P---I---------PEG------------G-KTY------C--K--W-MGVL-EDMDCFDPVFFAISPREAEVMDPQQRLFLENAWSCIEDAGI-----N PKML----------S-R-----SRCGVFVGCG----AND--YS-----ALM-------NSSHS-----TSLELMKELG-N N-SSILSARISYFLNLKGPCLAIDTACSSSLVAIAESCNSLVLGTSDLALAGGVLLMPG---PS-LHIGLS--HAE-ML--SVDGR-CFTFDQRANGFVPGEGVGVVLLKRMSDAVRDGDPIRAVIRGWGVNQDGRSN-GIMAPSSKAQSALEQEVYQR FNIDPSSITLVEAHGTGTKLGDPIEVEALAESF--R-VY-----T-----D------------------------K---R--------H----------------Y-C-ALGSVKSNIGHLGVGAGIAGVTKVLLSLQHRMLPPTI-HCEDVNPQ-IA--LEGS--PFYINTELKPW------------QSGDGIP--RRAGVSSF-GVSGTNAHLVLEE>TaKS8 EPVAVIGMAGKFPRA-SDLAQFWQNIARGVDCISEV-PAG-R----W-SVAEHY--D--A---D---------GNT--------P---G-KSY------S--K--W-LGVL-EDAESFDPLFFGISPAEAERMDPQQRLFLAACWHCIEDAAI-----R PSSL----------S-D-----TRCGVFVGCG----PTD--YG------RD--LR---GGDLN------AQALM---G-G A-TSILAARISYLLNLKGPCLAIETACSSSLVAISQACDSLVLRNCDLALAGGVNVAFG---PA-MHIMSS--DAG-ML--SKDGR-CFTFDARANGFVPGEGVGVLLLKRLSDAVRDDDPIAGVIRGWGVNQDGKTN-GITAPSAKSQAALEREVYRR FGIDADTISLVEAHGTGTRLGDPIEVEALVESF--R-EH-----S-----A------------------------R---K--------H----------------Y-C-ALGSVKSNIGHLMTAAGVSGVLKVLMAMRHQSLPPTL-HFEAVNPH-LA--LEGS--PFYVNTELKPW---------------TGVP-VRRACVSSF-GFSGTNAHVVLEE>PPOLKS9 LDIAIIGVSGRYPGA-LGIREFWDNLRDGKDCITEI-PKE-R----W-DHSLYF--N--E---N---------RGK--P---------G-KTY------S--K--W-GGFI-DGVDWFDPLFFNISPREAETMDPQERLFLECVYETIEDAGY-----T RETL-CAERELG-GI-E-----GNVGVYVGVM----YHE--YQ------LY--GA---QEQAK-----GRMIGL---L-G 
N-ASAVANRVSYFCNFQGPSIAVDTMCSSSLTSIHMACHGLQHGECELAVAGGVNVSIH---PN-KYLLLG--QGS-FA--SSNGR-CESFGQ-GDGYVPGEGVGALLLKPLAKAKADRDRIYGVIKGTAVNHGGKTN-GYTVPNPNAQARVIGQAIKE AGINPRTISYIEAHGTGTALGDPIEIAGLQKAF--R-KY-----T-----G------------------------D---K--------Q----------------F-C-SIGSAKSNIGHCESAAGIAAVTKVLLQLKYKQIVPSL-HASTLNPN-ID--FSDS--PFKVQQGLEEW-KRPVV---AINGETREYP--RAAGVSAF-GAGGSNAHVVIEE>PsyKS2 PAIAVIGMSGQFPQA-NNVEALWQNLVEGRDCISEV-PLD-R----W-PVDAYF--D--P---T---PQV--------------P---G-KTY------S--R--W-MGVL-EDADKFDPLFFSISPREAMAMDPQQRLFLETCWSCVEDAGY-----A PSSL----------S-G-----TRCGVFAGCG----VSD--YN------QH--LD---ADGLD------AQRFM---G-G S-TSILAARISYELNLRGPSMAVDTACSASLVAIAVACDNLVAGACDTALAGGVCVMAG---PA-MHIMTS--QAR-ML--SPDGR-CFTFDQRANGFVPGEGVGVVLLKRLADAERDGDRILGVLRGWGVNQDGKTN-GITAPSGDSQTGLQRDVYER YGIDPATIQLVEAHGTGTKLGDPIEVEGLCQAF--S-SF-----T-----D------------------------Q---R--------N----------------Y-C-ALGSAKSNIGHLLMAAGVAGLIKTLLALQHQTLPPTI-HFEQLNEH-IA--LDDS--AFYVNDRIRPW------------ASQGATP--RRAAVSSF-GFSGTNAHVVVEE>KAKS6 HRIAVIGIAGQFPKA-KNVTEFWNNLATGKNCISEV-SKD-R----W-DNKKHF--K--A---G---------DPT--------P---G-KTN------S--K--W-MGAL-ENYDKFDPLFFNISPVEAESMDPQQRLFLQSCWHTIEDAGY-----N TNAL----------S-G-----TKCGVFVGCG----ASD--YH---------------QLSSE-----HRLSAQGFTG-G S-SSILAARISYLLNLQGPCLSIDTACSSSLVALATACDNLVSGNCNVALAGGVGIMAT---PA-MHIMTA--QAG-ML--SQDGK-CHTFDQNANGFVPGEAVGVVMLKRLEDAERDNDRIYGVVQGWGVNQDGKTN-GITAPNAVSQASLEAEVYEK FNIDPAQIQLIEAHGTGTKLGDPIEVDALKQSF--K-KY-----T-----E------------------------N---E--------T----------------F-C-ALGSVKSNIGHTMWAAGISGFLKVILALQHQQLPPTI-NYSKLNEH-IN--LKNS--PFYVNNKLQDW------------KINSEEK--RQAAISSF-GFSGTNAHIVLSE>KAKS8 QAIAVVGMAGQFPKA-ENVQQFWQNIAQGENCITEV-PKE-R----W-DINEYY--S--E---G---------TPT--------P---G-KTN------S--K--W-LGSL-AGYDQFDPLFFTISPIEAESMDPQQRLFLQSCWHSIEDAGY-----N PQAL----------S-G-----TKCGVFVGCA----TGD--YQ-----LPS--RE---QQLSA-------QGFT---G-G 
T-SSILAARISYFLNLQGPCIALDTACSSSLVAIANACDSLASGASDSALAGGVYVMTG---PE-MHVKTA--QSG-ML--SKDGK-CHTFDQEANGFVPGEAVGVVMLKRLEDAERDNDNIYGVIKGWGLNQDGRTN-GITAPNSQSQTNLEQEVYDR YDINPEEIQLIEAHGTGTKLGDPIEVAALKNSF--K-KY-----T-----E------------------------K---E--------Q----------------Y-C-ALGSVKSNIGHCLTAAGVAGFIKVMKAMEHKKLPPTI-HYNQLNEH-IS--LEKS--PFYVNDRLQDW------------EVGQAEI--RHAAISAF-GFSGTNAHLVMSE>LLYNKS2 AGIAIIGMAGQFPKA-PDLNAFWQNIAQGIDCVTEI-PQE-R----W-SPADFY--D--S---C---DHA--------------P---N-PTP------H--K--W-MGVL-EDVDKFDPLFFNISPREAELMDPQQRLFLEASWACIEDAGY-----D PSSL----------S-G-----KKCGVFAGCG----AGD--YG------LS--LE---AEGLN------AQALM---G-G A-ASILPARIAYTLNLHGPCLALDTACSSSLVAMAMACDSLVAGNSDMTLAGGVTVLAG---PA-MHLMTG--NAG-ML--SANGR-CHTFDQRANGFVPGEGVGVLLMKRLEDAERDGDRILGVIRGWGVNQDGRTN-GITAPNGDSQTRLQRQVYDR FNIDPATIQYVEAHGTGTKLGDPIEVEGLKQSF--R-QY-----T-----D------------------------R---R--------G----------------Y-C-ALGAVKSNIGHSLMAAGVAGVIKILQAIRHRHMPPTL-HFETLNDQ-ID--LEGS--PFYVNSAGQAW------------VAPPGQT--RCAAINSF-GFSGTNAHLVIEE>OocKS2 QAIAIVGMSGQFPKA-KNLQQFWDNLSKGEDCISDI-PAD-R----W-AIDDFF--D--R---N---------RQT----

-----P---G-KTY------S--K--W-MGVL-EDADKFDPLFFNISPREAEMMDPQQRVFLEACWGCIEDAGY-----N PADL----------S-G-----SRCGVFVGCG----DGD--YD----------AR---FGMEL-----NAQTFM---G-N A-ISILAARIAYHLDLQGPSLAIDTACSSSLVAIASACDSLVLGNSDLALAGGVSVLAG---PS-MHIMTS--KAG-ML--SEDGR-CFTFDQRANGFVPGEGVGVVLLKRLEDAIEDQDVIHGVIKGWGVNQDGKSN-GITAPNGNAQSRLEQSVYAS FGIDPEQIQYVETHGTGTKLGDPVEVVGLKQTF--A-QF-----T-----R------------------------K---E--------H----------------Y-C-ALGSVKSNIGHTLRAAGVASVLKVVLAMKNQQIPPTL-HHQKLNEH-IA--LDGT--PFYVNRELSDW------------VVESGQR--RQAAVSSF-GFSGTNAHLVIEQ>LLYNKS13 QDIAVIGMAGQFPMA-HDITAFWENLRTGRDCISEV-SSD-R----W-PLAAYY--D--P---R---PEA--------------P---G-KTY------S--K--W-MGVL-EDVDKFDPLFFNISPLEAEAMDPQQRLFLQSCWSCIEDAGY-----N PLTL----------A-G-----SNCGVFAGCG----TGD--YG-----LSF-------GRVEQ-----DASTLM---G-S S-LSILAARISYVLNLQGPCIAMDTACSSSLVAIASACDSLTLGSSDLALAGGVCVMAG---PT-MHITTS--KAG-ML--SPDGR-CFTFDQRANGFVPGEGVGVVLLKRLADAERDGDRIDGVIRGWGVNQDGKTN-GITAPNGDSQTRLEKYVYDR FGLNPENIQLVEAHGTGTKLGDPIEVAGLQASF--Q-EY-----T-----S------------------------R---Q--------H----------------Y-C-ALGSVKSNVGHLLAAAGVAGVIKSTLALKHRKLPPTI-HFETLNEH-IE--LENS--PFYVNTTCRDW------------QVEANQV--RCAAVSGF-GFSGTNAHLVIEE>PedKS3 PAIAIIGMSGQFPQA-PDVKAFWRNIVEGRDCVSEI-PAE-R----W-SIEEYY--D--S---D---------RNA--------D---G-KTV------C--R--R-MGAL-SDREVFDPLFFNISPSEAELMDPQQRLFLLNSWHCIEDAGY-----D PTRL----------S-G-----SLCGIFVGCA----ASD--YS-----QLA-------ESQTQ-----TAQGLL---G-E S-VAMLPARVAYYLNLQGPCLAIDTACSASLVALASACDSLVLGNSDVALAGGVYVING---PD-IQVKMS--KAG-ML--SPDGR-CFSFDQRGNGFVPGEGVGVLMLKRLEQAQRDGDDIYAVIRGWGVNQDGKTN-GITAPNQESQTRLETGIYRK FGINPEHIQLVEAHGTATRLGDPIEVEALSESF--R-RF-----S-----D------------------------R---K--------Q----------------Y-C-ALGSVKSNIGHLATAAGVTSVIKSALALQHRILPPTI-NFQTLNEH-IR--LQDS--PFYINTERRLW------------EMPQGHS--RQVAVSSF-GFSGTNAHMVLEE>OnnKS3 GRIAIIGISGKFPKA-STLDQFWENIAEGRNCVSEV-PES-R----W-SVDEFY--D--A---D---------GKV--------P---G-KTM------S--K--W-MGIL-EEVDQFDPLFFAISPRDAELMDPQQRLFLQACWSCIEDAGY-----N 
PKTL----------S-G-----SSCGVFVGCD----MGD--YG-----RSV-------QYQEL-----DAQSLL---G-G V-VSILPARISYFLNLQGPCLAVDTACSSSLTAIANACDSLLLGHSDCAVAGGVCVMTG---PE-IHIMMS--KAG-ML--SPNGT-CFTFDQRANGFVPGEGVGAMFLKRYEDAVADGDPIYAVLRGWGINQDGKTN-GITAPNARSQTRLEKRVYEQ CGIHPEDIQLIEAHGTGTKLGDPIEVEGLRDAF--A-HF-----T-----E------------------------K---Q--------H----------------Y-C-ALGSVKSNIGHLATAAGVSGMIKLVLALQHQKLPPTV-NHEKLNEH-IR--LEGS--PFYINTACRDW------------VVPEGKT--RCAAISSF-GFSGTNVHMVVEE>NspKS3 DAIAVVGMAGRFPKA-KNLAEFWENIANGRNCVSEI-ASD-R----W-NLADFY--D--P---D---------RNA--------A---N-KTY------C--K--S-MGAL-EDVDRFDPLFFHISPREAEFMDPQQRLFLQTSWQCIEDAGY-----N PKSF----------A-G-----SKCGVFVGCE----TGD--YG------KI--VQ---RYELN------ALGLL---G-S S-AALLPARISYFLNLQGPCMAIDTACSASLVAIANACDSLLLGHSDAALAGGVYVLSG---PE-MHIMMS--KAG-IL--SPDGR-CFTFDRRANGFVPGEGVGVVLLKRLADAEKDGDDICGVIRGWGVNQDGKTN-GITAPNGQSQQRLQKEVYER FQIQPADIQLVEAHGTGTRLGDPIEVEALCETF--R-EF-----T-----N------------------------K---E--------K----------------F-C-AIGSLKSNIGHLATAAGVAGVIKVLLAIRNKKLPPTI-NYESLNEH-IN--LDSS--PFYINTECKEW-------------IVDAEP--RRAAINCF-GFSGTNAHMVIEE>CJAKS7 -EEIAVIGMAGVFAES-DNVHEFWQHLHQGTNLMAEI-PRS-R----F-DYRPWF--D--E---T--------------GE-K--D---N-GIY------C--T--W-GSFI-RDVDKFDAEFFNIGLREAEVMDPQLRKLLQVTYLTAEDAGY----ATRIR------------G-----SNTGVFVGCC----FYD--YQ------AE--ML---AQGKP-I---EVYDGI---GNS-PTMLANRQSYFYDLRGPSLTVDTACSSSLVALHLACQALQRRECEQAFVSGANLLLT---PG-HYQYFC--RIG-AL ---GRTGR-CHSFDEQADGYIPGEGIASVLLKPLDSALRDGDRIYGVIKSSAIKHGGYAS-SVTAPNVKGEQEVIVSAWQ QAQIDPSTIDYIEAHGTGTALGDPIEFQATEKAF--K-QF-----T-----T------------------------G---T--------S----------------F-C-AIGSAKAHIGHLEGAAGIAGLIKVLLSMKHKTIPAMP-AFNGLNPY-IK ---LANS--PLYINGQSIPW-------------PERGHR--RRAGVNSF-GFGGTFAHIVVEE >RhiKS12 
-QDIAIIGMQGQFGGA-DDLDQFWQNIWQDRELIKEV-PAD-H----W-DVAPWF--D--A---D--------------PD-A--H---D-KTY------S--K--W-GSFI-DNVDKFDADFFHLSRREAQWMDPQVRLLLQSTYAAAENAGV----IRQLR------------G-----SRTGVFIGSC----FNE--YV------DK--IT---ELGLP-M---DPYIAT---GSG--VIAANRISFWFDFKGPSLMFNTACSSSLVALHAACVSLRNGECEMAFVGGSNLLLS---SW-HYRYFS--AIR-AL ---SPTGR-CHTFDAQADGYVPGECVAMLLLKPLSKALADGDPIHGIIKGSAALHGGYTP-SLTAPSVAGEENVIVHAWQ DAGIDPRSLSYIEAHGTGTKLGDPIEINALKRAF--A-RY-----T-----N------------------------D---R--------G----------------F-C-HIGSVKANIGHTEGAAGIAGVIKVLQQMKHHKLPALG-HFKQQNPH-IK ---LDDS--PLVIDREGRDW-------------PEQSAP--RRAGVSSF-GFSGTNAHVVLQE >NspKS7 -GDIAVIGMHSTYAEC-EDLDQFWQMIRDGRDVIREV-PRD-R----W-DYRPWF--D--P---N---------PGS--------E---D-KTY------C--K--W-GSFI-KDADKFDAAFFNISPREAMWMDPQNRLLMQSMYAAAEDAGV----VNQLR------------G-----SNTGVFAGIC----FDD--YA------EK--IV---ELGLP-L---DIYTGT---GRS--GISANRISFWFDFKGPSMVINTACSSSLVALHYACQALRNKECEMAFVGGANLLLS---SL-HYRYFS--KLG-AL ---SPTGH-CHTFDAAADGYVPAECIASVLLKPLDQAEKDGDRIHAVIKGTAFTHGGYTP-SLSAPSVAGEENAIVKAWE DAGIDPETLTYIEAHGTGTKLGDPIETNSLKNAF--A-RY-----T-----N------------------------K---T--------G----------------F-C-AIGSVKANIGHAEGAAGITGLIKVILQIKHRQIPPLA-NLKQLNPY-LK ---LEGS--PLYINRELTDW--------------ESEGV--RRAGVSSF-GFSGAFAHAVIEE >OnnKS11

-EDIAVIGMHGIFPDG-ADVDQFWRDIRDAKDLIREI-PLD-H----W-DVGPWY--D--E---N--------------PE-A--K---D-RTY------S--K--W-GSFI-ADVGGFDPGFFSISPREAEWMDPQVRLTLQSIYATAEDAGV----INRLR------------G-----SDTGVFIGIC----FND--YA------DK--IA---DLRLP-V---DPYSGT---GSS--GIAANRASFIFDLTGPSLVINTACSSSLFALHAACHALRNGECGMAFVGGANLLLS---SF-HYRYFS--AIK-AL ---SPTGR-CHAFDAAADGYVPGEFVGSILLKPLSRAQADGDHIYAVVKGSAALHGGHSP-SLTAPSVAGEENVIVKAWE DAGINPETISYIEAHGTGTKLGDPVELNSLQKAF--R-RY-----T-----Q------------------------K---E--------G----------------F-C-AVGSVKANIGHTEGAAGMAGIMKVILQMQHREIPPLA-LFENLNPY-IR ---LDQS--ALYINRESQAW-------------DVSDAP--RRAGINSF-GFSGSYAHVVLEE >OocKS10 -QNISIIGMAGIFPQA-EHISGFWENLSAGRTTLSAL-SGK-R------RYLMAL--D--A--RD--------------------------RGS------D--R--I-GGYL-DGVEYFDHKLFKVPHKEAQKLDPQIRKLLEVIWQSVTDAGY----TLSQF----------R-E-----KRTGLFVATR---GHSG--YQ-----DIP--AR---MDPTQ-----AAQWRF---QAEQISAYANRISNILNLSGLSEIVETGCASFLVAIRHAMSAIKEGRCQQAIVATAELGLS---PF-VQNRTD--DQA-LY ---SAHPV-TKSFAHDSDGYVKSEVVGAIILKAETEALAQGDAIYANVKAVGVSHGGKAPLKWYSPNIEGQKSAIVAAFS EAGIDPATISYIEPEANGSQLGDASELVAIQAVY--G-PY-----LQESKAR-------------QAAIQTSHIPAD--S HA--------P----------------S-I-AIGSLKPLTGHAETASTFPVLVNMVLSMYHRRLAKVE-GLGELNEG-IT ---LTDG---FELLRADRDW------------AKPDHLP--RRGAIHSM-SIGGVNAHLLLEE >TaiKS4 -FDVAIVGASCRFPGA-DGLDALWRCIVDERSCVRDV---GAK----WLRTRDPA--D--A----------------------------G-ADY------R-------AAVL-DGIDRFDAARFGISPREARRMDPQQRLLLTEAWRALQDAGD----AARAA------------A-----HRTGVFVAAG----ANE--YG----------AG---LDAAD-----NPFSMT----SMAPALMPNRISYALDLRGPSEMTDTACSSSLVALHRAVRSLRDGECDQAVVAAVNLLLS---AE-KFEGFA--ELG-FL ---SPSGR-TRSFDAAGDGFVRGEGAAALVLKPLAAARRDGDFVYACIKGTAVHHGGRGA-ALTAPNAAGIREAMSAAYR NAGIDARTVSYLEAHGVGSPVGDAIELNAIRDAY--A-AL-----SGE-PAP------------------------A--A PA--------A----------------S-C-RIGSVKPVFGHVELASGLLAVCKVLMALRHGVLPGVP-GFERPNPH-AN ---LAGS--PLVVARAAAPW-PAPRD-----DANGAAVP--RRASVNSF-GFGGVNAHVVLEE >RhiKS1 
-DDFAIIGISGCFPGG--ELEDFWAAVRSGASQIREA-PAG-R----A---------N--A----------------------------G-------------S--V-GGYI-DDIDCFDHPFFGVSAAEAALMDPQQRLLLQHAWLALEDAGI----PAAQL----------A-R-----RPTGVFIAAA----PSE--YR----------SV---VEVPK-----DSPFLL---TSSSACMYANRVSYLLDLRGPSEYCNTACSSALVALHRAMQAIKAGECRQALVGAVNLLLS---PD-ETAGYQ--LMG-FL ---SAHGQ-TRSFQAGADGYVRSEGVGVLLLKPLADAEQDGDHIYLKLKGSGVCHGGRGA-SLTAPNQDSMKAAIVAAYR RANVEPNSVSYVEAHGVGSLLGDAIEIAAIQAAR--A-EL-----S-----D------------------------DDHD GP-------------------------W---TISTLKPVIGHCELAFGMAALFKVIDAVAQRQLPGIP-GYAQLNPA-IT ---LDQR--QLRLQADSRPW-------PAPKDANGLALP--RRASINSY-GFGGVNAHLVVEE >SGKS3 -DGMAIVGMSARFPGA-DDVDAFWRNVEEGRRSITAP-PQK-R----A-DWAVHG--D-------------------------------GAADL------R-------GGFL-DGVHEFDPLFFRMSMTEARQVTPELRLLLMTAWNAVEDAGY----RPAEL----------R-N-----RPTGVFVATT----QSE--YR----------PA---ATDLM--------------SL PS-PAMVPNRISYLLDLDGPSEQCDTTCSSSFVALHRAIRSIRDGECEQAIVGGVNLVMS---PA-GFGGMR--AAG-ML ---SPRGD-VRPFQQGADGTARSEGVAAVLIKPLRRALEDGDFVHCVVRGTGVAHGGRGV-SFTAPNIRGMKTAVAHAYA DAAIDPGSVEYIETHGMSSTLADSAELAALGAGF--R-ED-----------G------------------------S---E--------D------------------VTYLGNVKPCIGHTEVVSGLAALVKTAQAMRHGVIPAIP-GFEELHRD-LS ---LKGT--RLRVAERNLPW-------PGRTDDVGRPLP--RRAALHSF-GIGGVNAHVVLER >BaeKS14 -QGIAIIGMSAQFPQS-PDIQSFWEHIVNGDHCITEI-PAD-R----W-DWRRYA-GD--E--ND--------------------------TSL---------R--W-GGFI-DGVGEFDPLFFGISPKEASQMGPEQFLLLMHTWKAMEDAGL----TNKAL----------S-S-----RPTGVFVAAG----NSD--------------PN---NGTAI------------------PSIIPNRISYALNLQGPSEYYEAACTSTLVALHRAVQSIRHNECEQAVVGAANILQS---PK-GFIGFD--SMG-YL ---SKNGR-AKSFQKDADGFVRSEGAGVIIIKPLEAAIEDGDHIHMVIKGTGVSHGGKGM-SLHAPNPAGMKAAMKKAYE DTDVDPQTVTYIEAHGIASEMADALEFNAIKAGY--G-ES-----A-----N------------------------QEES AP---------------------------C-YISTVKPCIGHGELASGLAALIKVAMAMKHHTIPGIP-RFTAANEQ-MA ---IQKS--RFRFTEDNQEW-------TQLTDHTGRPIP--RRAAINSY-GFGGMNAHVVLEQ >PksXKS15 
-DGIAIIGMSGQFPKA-NSVTEFWDNLVQGKNCVSEV-PKE-R----W-DWRKYA-AA--D--KE------------------------G-QSS------L--Q--W-GGFI-EGIGEFDPLFFGISPKEAANMDPQEFLLLIHAWKAMEDAGL----TGQVL----------S-S-----RPTGVFVAAG----NTD--TA----------VV--------------------------PSLIPNRISYALDVKGPSEYYEAACSSALVALHRAIQSIRNGECEQAIVGAVNLLLS---PK-GFIGFD--SMG-YL ---SEKGQ-AKSFQADANGFVRSEGAGVLIIKPLQKAIEDSDHIYSVIKGSGVSHGGRGM-SLHAPNPAGMKDAMLKAYQ GAQIDPKTVTYIEAHGIASPLADAIEIEALKSGCSQL-EL-----ELPQEVR------------------------E---E--------A----------------P-C-YISSLKPSIGHGELVSGMAALMKVSMAMKHQTIPGIS-GFSSLNDQ-VS ---LKGT--RFRVTAENQQW-RD------LSDDAGKKIP--ARASINSY-SFGGVNAHVILEE >DDANKS11 -DAVAIIGVGGFFPGA-DSIEDFCQKLDQQESLFSRV-PEN-H----F------------P----------------------------GAEMR------N--R--Y-GAFL-KDIAGFDPDLFNIGPMEAEYMDPRTRLLLMSAWHTLEDACY----LPQQL----------R-E-----AKVDVYIASE----GAA--YT-----PFL--AK---AELTS-------YSIL---GMM-SWSLPNYISHAFRFNGKSLYIDTACSGASVALHQAKEALLRKDADYALVGAANLLFGDALAG-SYLGQE--SLG-IL ---GGATV-CSPFQADAKGFLPAEATVTVLLKRLSQAVADRDNIHGVILGSDVNHTGGYG-SLTMPSAESQASVIVNAYR QAGIDANTVTYVEAHGASSLLADAEEIKGFKRAD--A-QL-----NAHLAPS------------------------DLEP SK--------G----------------SPC-KVSTLKPNMGHANSASGMVALVRVLHAFNTQKKPGIK-DFSACSDK-IR ---LEGS--RFYINDMTEEW-PP------LRDECGRDIP--RRAAINNF-GAGGVNAHLLLEE

>DDANKS5 -APLAITGLAGYFPQC-MSVAELWRHLDADDALITEL-PPQ-R--KAWYQQGETG------------------------------G---G-EPW------S--V--LAGGFI-PDIASFDAQKFGILPIEAEEMDPRQRLLLMSTYHMLEDAGI----SPESL----------R-K-----THTGVFIGCE----SNE--YA------SL--MA---RHGYR-----PEFGLA----QA-DSMIANRISYQFDLAGPSELINATCAGFAVALHRATLALRAGMIDRAIIGAANVILV---PD-VTNQLN--DAQ-QL ----THGKTVRSFGKNGDGFMRSEGVGTLLLERLADAEQAGRRVYAVVKHTRVNFNGQGGVSMASPNTDAHCELIKDCYR EAGIDPRRVSYIEAQGMGLPVADIAEWTAINRAL--S-QL-----CDEQGVA-----------------------FE---P--------G----------------Y-C-RVSSLKPLLGHMHSASSLGALLKVIRSLQTGKIHKIL-GFEQANEY-CD ---TQDV--PCSLATETQAW-------------PAGEHP--RLAAIHSY-GSGGNNAHILIEE >LLYNKS9 DDAVVVTGVSGYFPGC-MNVASFFNHIDHDQPLITAM-SDH-R----L-DLMSHR--E----------------GDQ--------L---G-FMR------R--Q--Q-GGFI-PDIVSFDAELFGILPIEADEMDPRQRLLLMSTWRTLEDASI----NTESL----------K-K-----SATGVFVGCE----SNE--YA-----QLM--AQ---HGFVP-----TMGLSQ---A---DSMMANRISYHFDLAGPSEMVNATCAGFAVALHRAFTAVRLGLIERAIVGAANLILL---PD-PFRILS--EAG-QL ---TAGSS-VKSFGHGADGFLRAEGVGTILIERLSDAEAAGRPVYAVIKNASVNFNGQGGFSMAAPNIEAHTELIKACYR EADIDPRRVAYVEAQGMGLPVADIAEWNAINRAL--K-QL-----C-----D------------------------E---K--GLVYEPG----------------F-C-RVSTLKPLVGHMHAASSLGALLKIIRSFQTNKIHRIL-GFEQANEF-CD ---NEAM--PCRFSRETELW-------------EQSSEP--RLAALHSY-GSGGNNAHVLLED >TaKS3 -QAVAIVGAAGFFPGC-QSLQAFWDALDAEQTLLEEI-PPN-R----F-DYRPLF-----P---D--------------------------KSR------S--K--W-GGFI-PDVASFDPGFFNILPAEAETLDPRQRLLLMAVYHCLEDAGQ----APEKL----------K-G-----SRTGVFVGAE----ENE--YL------LH--LR---EQGVD-----TASGFD----QA-ASMVANRLSYFFDLSGPSELVNTMCSSAAVAIHRAVLAIRSGEIDRAIVAAANVILR---PD-GFIKLS--QLQ-QL ---SPSVR-VQSFGKDASGHQRAEGVASVLLMPLEQAEREGHPIHAVLRGSAVNYNGRSGVSIAAPSRQSHSELIQSCYR SAGIDVRDVEYIEAQGMANPVADIAEWEAINDAL--K-RL-----A-----T------------------------A---QGVDVVP--G----------------H-C-RVSSLKPLTGHMESASALGALFKVIHSFRRDKVYGIA-GFESANSY-LE ---LDRQ--PARLATRTEPW-------------RRNGKP--RLAGLHSF-GAGGNNAHLLVEE >PsyKS3 
-EPIAIVGLSGMFPQC-SDVRAFWRALDADQALLEEL-PTT-R----F-PWRDWY--D-----AT---------GEN--------P---D-KSR------S--K--V-GGFL-PDIASFDPRFFGVLPDDAARMDPRQRLLLMAVYHALEDAGI----DAGSL----------K-K-----SRTGVFVAGE----DNE--YA------QV--LR---EAEVD-----LGDGFA----QA-ANMLANQISYFFDFAGPSEMINTMCSGGAVALHRAVSALRAREVELAVVGAANVILR---PE-PFVQLS--RAK-QL ---STTAT-VRSFGEGADGHLRAEGVASVLLKPLRAAEAAGDRIYAVIKHSAVNYNGQGGMSIAAPFVQSHQEVIRACYD EAKVDPREVGYIEAQGMGNPVADLAEWHACNNAL--R-AM-----A-----Q------------------------E---QGVALPK--G----------------N-C-RVSSLKPMLGHMESTSAFGALFKIIRSFQTHTVHQIV-GFAKPNPE-LV ---VEQQ--PCRLMAATEPW-------------PAGPVP--RLAGLHAY-GIGGNNAHLLVEE >MmpKS1 -EPVAIIGLSANVAQS-ASVRQFWQALDDDRSLIEEI-PAT-R----F-DFTSWY--A-----GS---------NIE--------E---G-KMR------T--R--W-GGFI-PAIDQFDPVFFGMLPAEARKMDPQQRLLLMSVRQTFEDAGY----RHTDW----------K-G-----SATGVFIAAE----RNE--YH-----LNL--LQ---AQIDP-----GEGLDQ---A---ASMLANRVSHFYDLRGPSERIDAMCAGGAVALHHAVTALRSGQINAAIVGACNLLLR---PD-VFVTLS--QSG-QM ---SPEPT-VRSFGAGADGYLRGEGVCSLLLKPLSKAEADGDHIYGLIRNTAVNYNGGDAASIAAPSVSAHSSLVQDCYR RAGIDPRHVSYIEAQGMGNPVADIAEWDALNHGL--L-AL-----G-----R------------------------E---QGVQLQE--G----------------Q-C-AISTLKPMSGHMHAASAIGALFKIIRSLQTEKIHKIL-DFEQPNLH-LH ---TAGQ--PCRLATHTVDW-------------PRQATP--RLAGLHSY-GAGGNNAHILVEE >PedKS4 -EAIAIVGLSGYFPQS-ASVDEFWRHLDQDATLIEEI-PDS-R----F-DWRKVF--D--P-TGE--------------------R--PG-SSC------S--K--W-GGFI-PDIRGFDPAFFNIPGAEAITLDPRQRLLLMSAYQTLNDAGY----ASQAL----------R-Q-----SKTGVFVALQ----DNE--YL-----QLL--AD---AGIDP-----GQWYAQ-------TCLLANRISYFFDWRGTSEVVDAQCPGAAVAIHRAVSALRNGEIELALVGAANLLLR---PE-PFVLLS--ESG-QL ---SESAS-VHSFGAQAQGHLRAEGVCSLLLKPLTKALADGDPIYASIKHSAVNFNGQGGASIAAPNVDSHVDLIKSCYQ QARVDPRQVRYIEAQGMGNVLADLVEWQAFNRAL--T-DI-----A-----R------------------------Q---QRVSLPP--G----------------N-C-LISTLKPMMGHMESASALGALFKVIRSLHTRTIHKIA-HFTQYHPD-MD ---YQGQ--PCAIAGETVAW-------------PQMEGL--RLAGIHCY-GMGGVNAHLLVEE >OnnKS4 
-EPIAIIGLSGSLPKS-QTIAEFWRSLDQDLSLIEEI-PRS-R----F-NWEEVY--D--P-DGK--------------------D--VD-KMR------T--K--W-GGFL-RDIYGFDPHFFKILPRDAAVMDPRQRLLLMSVYQTLADAGY----APETF----------K-K-----SKTGVFFSIQ----DNE--YL-----QLL-------REGGV-----DRGEGF----GH-ASMIANRIAYFFDFRGPSEFVDAQCAGAAVALYRAVSTLRSGDITYAVVGAANLLLR---AE-PFAVLT--RAN-QL ---SPTNC-VNSFGKDAQGHLRAEGVVSLLLKPLSKAEADGDPIYALIKNTACNYNGQGGMSIAAPNVDSHAELIETCYE QVQVDPGEIRYIEAQGMGNPLSDLGEWHAYNQAL--Q-SM-----A-----K------------------------K---RGVVLPQ--G----------------Q-C-AISTLKPMMGHMESVSSLGAIMKVIRSFKTNTIHKIL-NVQEISPD-LD ---PQGM--PCRLLTETEPW-------------PEQARP--RLAGLHSF-GIGGNNVHILLEE >NspKS4 -EPIAIVGVSGYLPGC-MSVREFWQALDADRSLIQEI-PGE-R----F-DWRKYY--D--PAG-----------------------KDPG-KTR------C--K--W-GGFI-PDVAGFDAHFFKILPFDAKQIDPRQRLLLMSVYQTLCDAGY----APATL----------K-K-----SRTGVYVAHQ----DDD--YL-----QIL--NE---RGLDP-----GEGYGQ-------ASLLANRIAYFFDFRGPSEIVDAQCAGAAVALHRAVSALRSGEISHAIVSAANLLLR---PQ-PFLALS--RTK-QL ---SRTNT-VKSFQENADGHLRAEGVVSVMLKRLTQAEADRDFIYALISNTAVNFNGQGGTSSASPNIESHVDLIERCYG EAGIDPRDVSYIEAQGMGNRLSDLAEWEAFNRAL--R-SL-----A-----K------------------------A---RGVTLDL--H----------------S-C-RISTLKPLIGHMESTSALGALLKIVRSLETERIHKIL-GFSGADMA-LD
---TDNQ--PCVLASETLPW-------------NKTDKP--RLAGLHSF-GMSGNNAHILIQE >VirKS3 NAVAVVGMSCRFGPA-TSADRLWELLEQGRSGIHRYSPEE--------LVRLGH--R--P--EL---------VRR--------P---G-FVP--------------AGVVMEDADAFNNEFFGYSPVHAEWLDPQQRVLLETAWHALEDAGF-----A PDRT------------G-----LRTAAYVSVG----QST--MP-----QVG--IT---DLDAA-----GMIRFS---S-S D-KDFAASRISYKLGLTGPSLTVQSACSSGLVAVHLAVESLLGEESDLAVVGAASLHFP---QA-GYLAAP--DMI--L--SPSGE-CRPFDDGADGTVFGNGAGALVLRRLADAVRDGDPIRAVIRGSAVNNDGARKMDYHAPSPEGQEAVLREALAV AGIDAHTVGYLETHGTGTHLGDPIEYAALDRVY--G-GE----------------------------------------R--------P----------------HPP-PSAPSRASSATSTPRPDSPAWPRLILALEHAAVPPQA-PFEKPNRR-LA--GTGG---LRIAAGGDGW-------------PVPDGP--RRAAVSSF-GIGGTNAHVVLEQ >CJAKS1 NDVAIIGMACRFPGA-QNVEEFETNLWAGKESIRTL---T-R----E-ELRLAG--V--P---D---------------E W-L--A---N-PGY------I--P--V-TSSF-ADVESFDADFFKYSSREARRMDPQQRCLLECAWEALEHAGY-----S QKAF------------D-----YPVGVFAGAS----SNT--YL-----LNN--ML---RKKVA-LNLDFMEYLEDRQG-G D-KDFIATRVAYKMNLCGPAFTVQSACSTSLVAVHTACQSLLNGECDMALAGAVTVMYP---LD-QGYWHK--EGS-MV--SPDGH-CRTFSDRAQGTVFGNGVGVIVLKRLEDALAERANILGVIKGSAMNNDGAVKPSYTAPSLDKQSEVISDALAI ADVNPDTILFVEAHGTGTPMGDPIEVAALTQAY--R-DY-----T-----D------------------------K---R--------Q----------------F-C-ALGSVKTNIGHLDVAAGMAGIIKTLLVLKNHSVPPTL-HFNAANPA-ID--FVRS--PFFVNTDVQPL-------------PDVDTP--ARAGITSL-GVGGTNVHMILEA >OzmKS9 NDIAIVGMAGRFPGA-DSVGEFWELLRSGREGITRF---S-D----E-ELAAAG--V--P---A---------------A L-R--A---D-PAY------V--R--A-HGIL-PDVDLFDTGFFEFTPAEAEVIDPQHRLFLESCHTALEDAGY-----D PRRY------------D-----GLISVYGGAA----INT--YL-----QRH--VL---PSIDQ-TAT--SDHFRVMVG-N D-KDFLATRVSYKLDLRGPSYSVQTACSTSLVAIHLACQGLINGECDMALAGGVTVKLP---QA-RGYLYE--EGA-IL--SPDGR-VRTFDAEAGGTVLGNGVGIVVLKLLADALDAGDTIHAVIKGTATNNDGSLKVSYAAPGKEGQAAVVAEAHAV SGTEPESVTYVEAHGTATRLGDPVEVAALTDAF--R-RG-----T-----S------------------------D---T--------G----------------F-C-AIGSLKSNVGHLDAAAGVAGVIKTALMLRHRSLVPTL-NHERPNPA-ID--FAAT--PFYVNTETRPW--------------AGEGP--LRAGVSSF-GIGGTNAHAILQE >OzmKS12 
SDIAVIGLACRFPGA-ATPDTFWKVLSEGRETLTHF---S-D----E-ELRAAG--V--A---E---------------P L-L--A---D-DRY------V--K--A-GQVL-VDADKFDAGLFGITRDEAELIDPQQRQFLECAYEALERAGY-----D PQRG------------E-----QRIGVYAGVG----LNT--YL-----LHN-LGERYRTASSV-----DRYRMM---ITN D-KDFVATRTAYKLNLCGPSVSTNTACSTSLVAVHLACLSLLSGDCTMALAGAAHIQAD---QG-EGYLHH--EGM-IF--SPDGH-CRAFDAKAQGTVIGNGVGAVVLKRLSDALADGDTVHAVIKGTAVNNDGSDKTGYTAPSVQGQAAVVAEAQEI ADVGPETVSYVEAHGTATPLGDPIEVAALNQAF--N-RE------------------------------------GAALA P--------G----------------S-C-ALGSVKTNVGHLDTAAGMAGLIKTILMLRHRTLVPSL-CFEAPNPE-ID--FAAG--PFYVGTETKEW-------------PAGPTP--RRAGVSSF-GIGGTNAHVIVEE >CC3KS4 LETAIVGIAGRFPGA-KNIEEFWENLRDGKESIRTF-TDD-E------LIESGI--D--P-ILL------------------------KNPNY------I--R--S-GTIL-EGADMFDAEFFGYNPRELEIMDPQQRVFLECAWEAMENSGY-----N SKTL------------D-----GSIGVFAGTN----MNT--YI-----LSI--LS---AKGNA-RRM--IDPVQAEIG-N D-KDYIATRVSYKLNLDGPSVVVQSACSSALVAVHTACRALLGGECDMALAGGVGVRVP---LK-SGYVYQ--KGG-LF--SHDGH-CRAFDAKASGTVFGSGVGIVVLKRLSDAINDRDNIYAVIKGSAVNNDGSSKVGYTAPSVDGQTKVIKTAQLV SEVEARSISYIEAHGTGTNLGDPIELSALTNAF--R-TS-----T-----K------------------------D---K--------G----------------F-C-AIGSVKSNIGHLAAASGAASLIKTALALKNKKLPPSI-NFEKPNPN-ID--FEST--PFYVNNKLSEW-------------ESSEMP--RRAGVSSF-GVGGTNAHIILEE >CC4KS1 NDIAVIGMSARVPGA-RNIKEFWANLCQGKESITRF-SD--Q------EIIAEG-IA--P--EL--------------------L---KKPEY------VK-A--W--GVL-EDAYKFDAQFFGYNPREAEILDPQQRVFLEEAWKAMEDAGC-----D SERF------------N-----GAIGTFASVG----MNT--YV------KN-LTENNESGNVA-----NNYQIM---I-N NDKDFLATRVAYKLNLEGPGITVQTACSSSMVAIHLACRSLLNRECDMALAGGVSIRL----PQ-KTGYLY--QEGMIL--SPDGH-CRAFDEKAKGTVGGNGAGVVVLKRLEDAIAEGDNICAVIKGTAVNNDGSLKVGYTAPRIEGEAGVISKAQEL AGVSPETITYIEAHGTGTPLGDPIEIEALEKVF--S-EK-----T-----D------------------------K---K--------R----------------H-C-AVGSVKTNIGHLDAASGVIGLIKTVLAIQNSKIPPSL-NFDKPNPK-CD--FENG--HFYVNTKLWDW-------------KTDGIP--RRAGVSSF-GIGGTNAHAVLEE >CC4KS2 GAIAIIGMAGRFPGA-NNTEEFWENLYNGVESVKFF---N-H----D-DLIKMG--I--D---E---------------H 
L-L--D---N-PKY------V--A--A-DAIL-DGMDMFDAEFFDYSAREAEITDPQHRLFLESAWEVLESAGY-----N SDLY------------D-----GRIAVYASAN----LSG--YM---VRNLYSNPG---LVESL-----GSFKIMIANG---QDFLATKVSYKMNLMGPSVNVNTLCSSSMVAVHYACQSLNSFECDIALAGGVSFQVS---RN-ETFFYQ--EGG-IG--SADGH-CRAFDSKANGTVSGSGLGILALKRLEDAIADGDCIHAIIKGTGINNDGSSKNSYTAPNVDGQAECIAEAIEM SGVNPETITYIDAHGTGTNLGDPIEIAALTKAF--R-AY-----T-----D------------------------K---K--------E----------------F-C-AIGSAKTNIGHLVNAGGLASMIKTVLSMKHRIIPASL-NFEEPNPK-ID--FVNS--PFYVNSKLSKW-------------ETEGFP--IRAAVSSF-GIGGTNTHVILEE >NspKS9 YEIAVIAMVGRFPGA-KDIDEFWQNLSLGVESIT------------WFTDEELLKAGVNP------------------------DWLSN-PNY------V--K--A-NAVL-SDMELFDANFFGYSAREAEIIDPQQRLFLESAWTALEQAGY-----N PQIY------------K-----GLIGVYAGLG----LNS--YL-----LNN--LT---PNREL-LET--VDPLQLLIC-S D-KDFLPMRVAYKLNLTGPAVNVQTACSTSLAAVHFACQSLLNGECDMALAGGVSLSFL---EN-TGYLYQ--EGM-IL--SPDGH-CRAFDANAQGTIGGSGVGIVVLKRLNEALADGDCIQAIIKGSAINNDGALKVGYTAPSINGQAAVIAEAQAV AGVDAETISYIEAHGTGTPLGDPIEIAALTQAF--A-QS-----T-----D------------------------K----
K--------G----------------F-C-AIGSVKTNVGHLNAAAGVTGLIKSVMALQHKLLPPSL-NFSTPNPK-ID--FANS--PFYVNTTLSEW-------------KTNNIP--RRAGVSSF-GIGGTNAHVILEE >SGKS1 DAIAVIGISCRFPGA-GDHRAFWSALVEGRSGGTSW---S-V----E-ELRALG--V--P---D---------------E L-I--T---R-TGY------V--P--Q-RSVV-DGRAEFDAAFFDISPRDAEFMDPQARLLLQHAWQALEDAGY-----R PEDV------------------PATSVTTSTS----TNF--YQ-----ALL--PALMANAAGP-RVLASSETYAAWLF-A Q-GGTVPTMISTKLGLHGPSMAVSTNCSSALSAVHVACRGLLAGDADQAIVGAASLFSA---GELGYVHQP--GLN--F--SSDGH-SRAFADGADGMSGGEGVGVVVLKRAGEAIAAGDHVYCLIRGVAVNNDGGDKAGFYAPSVRGQADVIRKALDR AGTDPSTISYVEAHGTGTRLGDPIEVAALTEAY--R-HY-----T-----D------------------------R---S--------Q----------------F-C-GIGSVKSNIGHLDAAAGIAGLIKVALALQHGEVPRTL-HGDVPSTE-ID--WEES--PFFVADRNLPL--------------DGDGP--ARAGLSSF-GIGGTNVHAVLEQ >BatKS2 ESVAIIGISCQLPGA-ESHRAFWKNLADGTQSIEVL---S-H----Q-ALRDAN--V--P---D---------------E I-L--A---R-DDF------V--P--A-LGRI-TGREFFDAEFFKVSPRDAELMDPQLRLLLQHAWNAIEDAGY-----V SKEI------------------PDTCVLIASS----RNG--YA-----T----GQ--TGTSGA-DVLHSSEQYVSWLL-S Q-SGTIPTMISHRLGLKGPSLAVHSNCSSSLVALNVAYRSLLAGEARQALVGAAALLSP---CDLGYVYQK--GLN--F--SSDGR-VRTFDASADGMVGGEGVAAVLLKRATEAVADGDHIYALIRGIALNNDGDAKVGFYAPSMEGQTDVIEKVLRE TGIDPQSISYVEAHGTGTELGDPIELAALSNAY--R-LH-----------T------------------------D---R-----T--Q----------------F-C-GIGSVKTNIGHLDTAAGLAGLIKVALSLSKGAMPPTL-NYHTPNPQ-FA--LDRS--PFYVVDHYTAW-------------HQDHGP--RRAALSSF-GIGGTNAHAILEE >TaiKS5 DGLAIIGIALRVPGA-ADARAFWRNLREGRSALERL---D-A----R-RLMAHG--V--A---S---------------A L-A--G---A-RQT------V--G--V-RATI-ADKHRFDAEFFGVSMRDAALMDPQARQLLQHAWLAFEDAGY-----V PADA------------------PDTAVFVSAS----HSR--YA-----AKQ--AD-GARAAAE-AVLDDPADYVGWIL-E Q-GGTIPALISYKLGLTGPSLYVHTNCSSSLAALYAAWQTIRAGDAKQALVAAATLFAD---ERLGYVHQP--GLN--F--SSDGR-IKTFDRNADGMVPGEGVVAVLVKRVAEALADGDRIYAIVRDVALNNDGAAKAGFYAPSVRGQAQVIDALLRR 
TGVRAADIVYVEAHGTGTQIGDPIEVAALTDAY--R-AH-----G-----A------------------------G---T--------G----------------H-C-GLGSVKTNVGHLDTAAGLVGLVKVALSLEQRMLPPSL-NFDAPNPA-LD--LASS--PFYVVERATPI-------------APRAGR--TFAAVSAF-GVGGTNAHALVEA >TaKS1 GSLAVIGISCQLPGA-ADPWRFWKNLREGRDSVVAY---R-H----E-ELRELG--V--P---E---------------E V-L--R---D-SRY------V--A--V-RSSI-EDKECFDPQFFGLTARDASFMDPQFRLLLMHAWKAVEDAAT-----T PERL------------------GPCGVFMTAS----NSF--YH-----QGS--PQ--FPADGQ-PVLRTADEYVLWVL-A Q-AGSIPTMVSYKLGLKGPSLFVHTNCSSSLSALYVAQQAIAAGDCQTALVGAATVFPS---ANLGYLHQR--GLN--F--SSAGR-VKAFDAAADGMIAGEGVAVLVVKDAAAAVRDGDPIYCLVRKVGINNDGQDKVGFYAPSATGQAEVIRRLFDR TGIDPASIGYVEAHGTGTLLGDPVEVSALSEAF--R-TF-----T-----D------------------------R---R--------G----------------Y-C-RLGSVKSNLGHLDTVAGLAGLIKTALSLRQGEVPPTL-HVTQVNPK-LE--LTDS--PFVIADRLAPW-------------PSLPGP--RRAAVSAF-GLGGTNTHAILEH >PsyKS4 DSLAIVGISCQFTDA-EDHRAFWTNLRAGKSSGRFL-TPE--------ELRAAG--V--P--ED------------------TIA---D-PRF------V--P--F-DGRL-PGRDCFDPAFFNLSSSNAELFDPQLRLLLIHAWQAVEDAGY-----V PEDI------------------ADAAVFMSAC----NSY--YK-----TLLHQLG---AVRES-----DEYVAW---I-A SQGGSIPTMVSYQLGLKGPSVFVHTNCSSSLSALYFAQQTLRAGDASAALVGASTLFPV---PGVGYTYEA--GLN--Y--ASDGR-CKTFDAAADGMVGGEGVAVLLVKRAADAMRDGDHIYAIVRGIALNNDGAEKAGFYAPSVNGQAAVIDRVLKT TQVHPETISYLEAHGTGTSLGDPIEVMALTDAY--R-RY-----T-----D------------------------K---K--------Q----------------F-C-GLGSVKTNLGHLDTAAGLAGCIKLALSLEHRELTPTL-HLVSPNPA-ID--FAAS--PFYLQDQGCSW-------------QQETGS--RRAALSSF-GIGGTNAHAVLEE >BP17KS5 DAIAIIGISCQFPGA-QDHRAFWRNLRDGKSGARFY-SED--------ELRAAG--V--P---D-----------------TLIR---D-RHY------V--P--M-QQTI-EGKDLFDRHFFRLTTKDAQLMDPQFRLLLQHAWKAIEDAGC-----T RERI------------------ADAGVYMSAS----NSY--YQ-----AMLRAAG---TIDAS-----DEYQAW---LLA Q-GGTIPTRISYELGLTGPSLFIHSNCSSGLVSLSVAAKSLLQRESRCALVGAATVLPD---ADIGYVYQP--GLN--L--SSDGR-CRTFDENADGLTSGEGVAVLLVKRARDAIDDGDPIYALLRGIAVNNDGADKVGFYAPSVGGQADVIRKVLDA 
TGIHPETIGYVEAHGTGTKLGDPVEVAALTDAY--R-RH-----T-----A------------------------R---T--------G----------------F-C-AIGSVKPNIGHLDTVAGLSGCIKVALSLRHGEIAPSI-NYEKPNRE-ID--FAHS--PFYVVDRLTRW-----------PAREPGAP--RRAALSSF-GIGGTNAHLILEA >NspKS5 DSLAIVGISCHFPDA-PNHDQFWQNICAGKESGQFF---S-E----E-ELRQAG--V--S---E---------------D S-I--R---N-PYF------V--G--V-QRTI-EGKGNFDPEFFNISPRNAVFMDPQFRLLLMHAWRAVEDAGY-----V SSEI------------------PRTSVFISAS----NSG--YN-----TLVDKAG---IIEAT-----DEYTAW---MLN Q-SGTIPTTISYQLGFIGPSVFVHTNCSSSLAALSAAYQSLSLGQSDYALVGASTLLPK---SEIGYLHRP--GLN--L--SRDGH-CRTFDAAADGLVAGEGVAVLLVKKAPLAIADGDRIYALIRGIGINNDGRDKAGFFAPSVRGQSQAIDMALTS TGIGAETIGYVEAHGTGTKLGDPIEIQALCDSY--Q-KH-----T-----N------------------------Q---R--------Q----------------Y-C-AIGSVKPNIGHLDTGAGLAGCIKVAMSLYHKKIPPSI-NFSQPNPA-ID--FERS--PFFVIDRLREW-------------EAAPWP--RRAALSAF-GIGGTNTHAILEE >PedKS5 DSLAIIGISCNMPGA-RTLRQFWENLRQGKESSTRL---S-E----R-ELRRAG--V--P---E---------------E L-I--R---H-PDF------V--P--M-QYSM-EGKELFDPDFFNLSAKNALFMDPQYRVLLQQAWQAIEDAGY-----V AQDI------------------PETAVFMSAS----NNF--YK----------TL---LHSAGAVETTDEYAAW---IAG Q-GGTIPTMISYQLGFKGPSFAVHSNCSSSLVGLYLASQCLRLKEAKYALVGGATLFPV---AGTGHLYTP--DMN--L--SSDGH-CKAFDADADGLVGGEGAVVLMVRKALDAIRDGDPIYALIRGVAVNNDGSDKVGFYAPSVNGQAAVIQKALDI
TGVDPQSVAYVEAHGTGTRLGDPVEIMALNEVY--R-RY-----T-----E------------------------K---R--------Q----------------F-C-RIGSVKPNIGHLDTVAGLAGMLKVVLSLKHAEFFPSI-NYREPNPA-ID--FTSS--PFEVVTQLTPW-------------PAGNEP--RRAALSSF-GIGGTNTHAILEE >OnnKS5 DSVAIVGISCHFPGA-ADHHTFWRNLRDGKESASFF-PEE--------ELRAAH--V--P--EA------------------RIY---N-PNY------V--P--L-KLTI-EGKDLFDAEFFNISPANAVYMDPQLRLLLTHSWQAMEDAGY-----C ARDI------------P------DTAVFMSAC----NSF--YK----TLLH-RAN---AIGEA-----DEYAAW---I-A SQSGTIPTMISYQLGLKGPSAFVHTNCSSSLSGLYFAVQSLQTGQAKAALVGAATVFPL---PGIGYVHQP--GLN--V--SSDGH-IRTFDAAADGLTGGEGVGVIMVKKAQDAIADGDHIYALLRGISLNNDGSDKTGFYAPSVKGQSEVIGKVLRA TNVDPSSISYMEAHGTGTRLGDPIEVMALSDAY--R-QF-----T-----S------------------------Q---T--------Q----------------F-C-GIGSVKPNIGHLDTAAGLAGCIKVALSLSHGEIPPSI-NYRQPNPE-ID--FKTS--PFYVVDQLQAW-------------KTDAVP--RRAALSAF-GIGGTNVHAIMEE >PPOLKS1 DSLAIIGISCHFPGA-KNHAEFWSNLRAGVESVRFF---S-E----E-ELVGLG--L--E---K---------------E I-V--G---N-RGY------V--P--G-RCTI-EGKEYFDPEFFSLSQKNAEFMDPQMKLLLQHSWKAVEDAGY-----I SKEI------------P------ETSVYMSAS----SSF--YQ-----AFI--PN--LASQSP-NVLKNADEYVTWIL-A Q-GGTIPTMISHKLGFKGPSLFVHSNCSSSLVGLRLASQSLLSGEAKYALVGASTIFPF---TSLGYVHQP--GLN--F--SSDGH-IKAFDESADGMIGGEGVGVVLLKKAQDAIRDGDHIYVLLRGVGVNNDGTDKLGFYAPSVKGQAEVIQKTIES TRIHPETISYIETHGTGTKLGDPVEFAALNETY--R-QY-----T-----T------------------------K---K--------Q----------------F-C-GIGSVKTNIGHLDTAAGLAGCIKVALSLYHNEIPPSL-NYEKPNSD-IN--LPDS--PFYVVDSLQKW-------------EKASAP--HRAGLSSF-GIGGTNAHAIFEQ >BBR2KS6 DSIAIIGISCHFPGA-KNHREFWENLRNGVESVRFF---S-D----E-EIDQLH--L--P---D---------------E Y-L--Q---N-PNF------I--P--V-QSTI-EGKDHFDPGFFHLSARDAEFMDPQFRLLLTHSWKALEDAGY-----T PKQV------------------PQTGVFMSAS----NSY--YQ-----ALL--PH--FTKESP-NVMKDPNEYVSWVL-A Q-GGTIPTMISHKMGLKGPSFAVHSNCSSSLVGLHLAYRSLQSGESKVALVGGATIFPT---TYSGYVYQA--GLN--F--SSDGH-CKTFDASADGMVGGEGVGVVVLKKASDAIKDGDHIYALMRGIRINNDGGDKVGFYAPSIKGQAEVIQQVLED 
TKVHPETISYIEAHGTGTYLGDPIEFAALNEVY--T-KY-----T-----S------------------------N---K--------Q----------------Y-C-GLGSVKTNIGHLDTAAGLAGCIKVALSLYHNEIPPSL-HYEKPNPN-IN--LDQS--PFYVVDTLKKW-------------EDAPVP--HRAALSSF-GLGGTNAHAIFEQ >BaeKS10 DSVAIVGISCQFPGA-KNHHEFWKQLREGKESVRFY---S-E----E-ELREAG--V--P---E---------------D L-I--E---N-PDY------V--P--A-LSTI-EGKDLFDPEFFHISPKDAEFMDPQLRLLLLHSWKAVEDAGY-----V SKEI------------------PKTSVYMSAS----NNS--YR-----SLL-PEKTTEGHESP-----DGYVSW---VLA Q-SGTIPTMVSHKLGLKGPSYFVHSNCSSSLVGLYSAYKSITSGESEYALVGGATLHAA---TSIGYVHQN--GLN--F--SSDGH-VKAFDASADGMAGGEGAAVILLKKASQAVQDGDHIYAMLRGIGLNNDGADKVGFYAPSVKGQTDVIQHVLDS TNIHPETISYIEAHGTGTTLGDPIEMSALQQVY--K-RY-----T-----D------------------------R---E--------Q----------------Y-C-GIGSVKTNIGHLDTAAGLAGCIKVAMSLYHRELAPTI-NYTSPNPN-IK--FSGS--PFYVADKRKTL-------------PERETP--HRAALSSF-GLGGTNAHAIFEQ >PksXKS11 DSVAIVGISCQFPGA-KNHHDFWNHIKEGKESIRFF---S-E----E-DVRANG--V--P---E---------------E L-I--Q---H-PDY------V--P--V-QSVI-EGKDLFDPGFFQISPKDAEYMDPQLRLLLLHSWKAIEDAGY-----V AKEI------------------PATSVYMSAS----SNS--YR-----TLL--PK---ETTEG-HESPDGYVSW---VLA Q-SGTIPTMISHKLGLKGPSYFVHSNCSSSLVGLYQAYKSLTSGESQYALVGGATLHAQ---SAIGYVHQN--GLN--F--SSDGH-VKAFDASADGMAGGEGVAVILLKKAVDAVKDGDHIYAIMRGIGINNDGAEKAGFYAPSVKGQTEVIQHVLDT TKIHPETVSYIEAHGTGTKLGDPIEMSALNKVY--K-QY-----T-----D------------------------K---T--------Q----------------F-C-GIGSVKTNIGHLDTAAGLAGCIKVAMSLYHNELAPTI-NCTEPNPD-IK--FESS--PFYVVRERKSL-------------EKHAGV--HRAALSSF-GLGGTNAHAIFEQ >BaeKS1 DSLAVIGISCEFPGA-KDHYEFWNNIKEGKESITFF---S-K----E-ELRRSG--I--S---E---------------E L-A--D---H-PGF------V--P--A-KSVL-EGKEMFDPGFFGFSPKDAEYMDPQLRMLLLHSWKAIEDAGY-----I SKEI------------P------ETSVYMSAS----TNS--YR-----SLL-PEETTAQLETP-----DGYVSW---VLA Q-SGTIPTMISHKLGLKGPSYFVHANCSSSLIGLHSAFQSLQSGEAKYALVGGATLHTE---SSAGYVHQP--GLN--F--SSDGH-IKAFDADADGMIGGEGAGAVLLKKASDAVKDGDHIYALLRGIGVNNDGADKVGFYAPSVKGQAEVIQKVIDQ 
TGIHPETIAYVEAHGTGTKLGDPIELSALQSVY--G-RY-----T-----D------------------------K---K--------Q----------------Y-C-GIGSVKTNLGHLDTAAGMAGCIKVVMSLYHQEIAPSI-NYKEPNPN-LH--LEDS--PFFVAEEKKEL-------------TRENRA--HRMALSSF-GLGGTNTHAIFEQ >PksXKS1 DSVAIIGISCEFPGA-KNHDEFWENLRDGKESIAFF---N-K----E-ELQRFG--I--S---K---------------E I-A--E---N-ADY------V--P--A-KASI-DGKDRFDPSFFQISPKDAEFMDPQLRMLLTHSWKAIEDAGY-----A ARQI------------------PQTSVFMSAS----NNS--YR-----ALL--PS---DTTES-LETPDGYVSW---VLA Q-SGTIPTMISHKLGLRGPSYFVHANCSSSLIGLHSAYKSLLSGESDYALVGGATLHTE---SNIGYVHQP--GLN--F--SSDGH-IKAFDASADGMIGGEGVAVVLLKKAADAVKDGDHIYALLRGIGVNNDGADKVGFYAPSVKGQADVVQQVMNQ TKVQPESICYVEAHGTGTKLGDPIELAALTNVY--R-QY-----T-----N------------------------K---T--------Q----------------F-C-GIGSVKTNIGHLDTAAGLAGCIKVVMSLYHQELAPSV-NYKEPNPN-TD--LASS--PFYVVDQKKTL-------------SREIKT--HRAALSSF-GLGGTNTHAIFEQ >CorKS5 QDIAITGLAGRYPGA-PTLDALLDNLRAGRSAFRTIPA------ERW-------------------------SGTEGG-----------------------VR--H-GAFL-EDVDRFDPLFFNISPGEAEEMDPQERILLELALHALEDAGQ-----P RGALLRQPL--------------KAGVFIGAMNPDYEWLGARASAAGTPNRSSS--------------------------RFWSIANRISYWFNLRGPSFAVDSACSASLTAIHLACQSLQRGECELALAGGVNLILHPDH----LERLA--HAGLL--
--TKGDS-TRSFGARADGFVDGEGAGLVVLKPLARARADGDVIHGVIKGSSINAGGKTA-GYLAPNPQAQADVIDEALRR ARVPARSISYVECAAAGSSMGDAIEVSGLKQAFARH-----------------------------------------------------------------TPERGFC-AVGSIKANVGHLESASGIAALTKVLLQLRTRELFPSP-HAGTPNPEL---ELSDS--AFHLQAALAPWRPGP-------SADGQDAP--LRAGISAF-GAGGANAHLIVEE >AlbKS2 DAAAIIGLAGRFPGA-DTLEEFWNNLRNGQSSMGEV-PGE-R----W-DHQHYF--D--S---E---------RQA--P---------G-KTY------S--R--W-GAFL-RDIDGFDAAFFEWPDSVALESDPQARIFLEQAYAGIEDAGY-----T PGSL----------SKS-----QRVGVFVGVM----NGY--YS----------------------------------G-G ARFWQIANRVSYQFDFRGPSLAVDTACSASLTAIHLALESLRSGSCEVALAGGVNLLVD---PQ-QYLNLA--GAA-ML--SAGAS-CRPFGEAADGFVAGEACGVVLLKPLKQARADGDVIHAVIRGSMINAGGHTS-AFSSPNPAAQAEVVRQALQR AGVAPDSISYIEAHGTGTVLGDAVELGALNKVF-----------------D------------------------K---R--------A----------------APC-PIGSLKANIGHAESAAGIAGLAKLVLQFRHGELVPSL-NAFPLNPY-IE--FGR----FQVQQQPAPW------------PRRGAQP--RRAGLSAF-GAGGSNAHLVVEE >ChiKS9 DDIAILALDGRYPQA-RSPEELWENLRAGRECTREV-PAD-R----W-DVSAYY--D--A---D---------PRRAAA---------G-RMY------C--K--W-GGFL-DDIGRFDALFFQISPTEAASLDPSERLFLEIAWSTLERAGY-----A RRRP----------Q-S-----RSVGVFVGVN----VGD--YH------LL--AL---EEQAR-----GRWVFS-----N PSFSAIANRVSYFFDFQGPSLAIDTQCSSSLTAIHLACESLLRGECEMALAGGVNLYPH---PS-RYVNLC--QVK-AL--SSTGQ-TRSFGAGGDGFVPGEGVGAVLLKPLRQALLDRDPILAVIKGSALNHAGKTS-GFMAPSPAAQADLLERALAR ANVDPGSVSYIEAQGMGSTLVDAAELAAFTRVL--R-RG-----------R------------------------R---Q--------G----------------P-C-LLGSIKPNIGHLEGAAGISQLTKVVHQLRSRQIAPSL-HADPVNPE-VG--FDAS--LFRIPGALEPW-PMPVV-------DGHAEPSTRRACISSF-GAGGSGVYLIVEE >DszKS9 REIAVIGLAGRYPGA-DTPRQLWRALRSGQSAVTRP-PAG-R----FGASAPQG-DE--P---------------------------RGGGAS------P--G--W-GGYL-ERLDRFDSLFFGISPAEAKLMDPQERLFIEVAWECLEDAGY-----T PEEL---------RRAA-----PRVGVFVGAM----WSD--YQ------SV--GL---EAWQR-----DRRAKA-----V AFHSSIANRISYLFDLHGPSVAIDTSCSSGLTALHLASRSLRLGECDVALVGGVNLLGH---PF-HPDLLE--GLN-LT--SRDDK-TRAFGAGGSGWVPGEGVGAVLLRRLPEAEERGEHIRCVLKGTALAHAGKAP-RYGMPSTRAQAGSIRDALAD 
GGVAASEIDYVECAATGSGIADASEVDALKQAF--EGRS-----------P------------------------D---G--------P----------------P-C-LLGSVKPNIGHLESASALSQLTKVILQLEHGEIAPTL-HTEPRNPL-IQ--LDGT--PFRINRALSPW-PR--------AAGADAPP--RRALINAF-GATGSSAHAVVEE >RhiKS2 DAMAIIGISGRYPGA-ANPDELWQNLSAGRASIIPL---S-R------EALFYG-SD--D---A------------------------G-DSP---------Q--WAVGAL-AGKQLFDPLLFKITPAEAKTLDPQERLFLQAVWHCLESSGY-----T AASL---------RRQA-----ERIGVFVGAM----WGD--YQ-----HHR---------PTE-----QGERAT-----S F-LSAIANRVSFFNDFNGPSVAFDTSCSSAMTALHFACNSIRQGECQAAIVGGVNLISH---PS-HLELLT--SLK-LL--SDDSQ-SYPFGRHANGWVAGEGVGALLIRPLEDAMRDGDSILGVIRATAISHSGKTF-RYGAPNADSHALSMRRVLQQ AGLSADEIGYVEAAAPGASLADGAEFAAISNVF--GARR-----------S------------------------D---A--------P------------------L-LVGSIKANIGHLESASALSQITKVLMQLKHRQIAPTL-GCNPLSPM-IC--LDDN--HLAIADQLSDW----------------RGP--QRALINAF-GASGSGGHLIVEA >ChiKS10 TDIAIVGQSGRYPGA-PDAAALWERLRRGERSIRPA-PAD-R----W-DPAPLQATG--P---D--------------------K---G-GIY------C--S--S-GGFL-DDVDRFDCLLFRMSPAEARSIDPQERLFLEAAWACLEAAGT-----T AERL---------NAQA-----GKVGVFVGVM----WND--FQ-----NEG--VE---GFRED-----HVARAV-----A L-HSSIANRVSHTFDFKGPSVAVDTSCSSAMTALHLACESIQRGECRAAIVGGVNLMTH---PY-HQGLLC--SLG-MV--SESGF-GNALGEDATGWIPGEGVGAVLIRPADDAERSGDHIHALIKATAINHTGATP-RYGMPSAEAQAASIRDVLRR AGLGPEAVSYVEAAATGAAIADASEIAALIEVF--GERQ-----------G------------------------S---A--------P----------------R-V-ALGSIKPNIGHLESASAMSQLAKVLLQIQHKTLAPHV-LSGALNPM-IP--WDRA--PFWVPEQPAAW-------------QPRSGP--RRALVNAF-GATGSLGHAVIEE >VirKS7 RPVAVVGFAARLPGA-DDLRGLDAMLAAGDCALGPA-PQE-R----WAPLRSNA--R--R------------------------A---G----------A--R--V-GGYL-PDVTAVEVEEFGLADAEVAAVDPQERLLLTVTRHCLEDAAI-----T PERL----------SQT-----GQVGVFVGSM----WQD--HA------LH--GV---AARAE-----GRTGTH-----A T-RGGLAHRMSHAFGLTGPSLVIDTGCVSGLAAVEAAFRAVADGRCEAAVAAGSNLVLH---PD-HLDVLS--ELG-LV--AEQED-SCAFTDRASGWLVGEGVGAVLLKPLDRALEDGDPVHAVLRGGALLHSGTTR-QFGIPDPRRQEQVMRAALAD 
ARLTAGDIGYVEAAAAGAALADALEFTALGRLF--G-AA-----GVDGGAD-------------VEGASGAGNG-A---R-----AGLG----------------P-V-PVGSVKPNTGHLEAASVFAQLGKLIAQFRRDRLYPTR-LTSAANPALAP--YRGS--VALAGPSADQW------------RTAPGGT--RRALVNGF-AGGGSYGSLVVEE >LnmKS1 EPAAVIGMAGRLPGA-GDLDAFWDNLVSGRTAIGPA-PAS-R------PETAPS----------------------------------GARAT--------------GGFL-PHIDRFDSLLFHVSPQEAPALDPQARLMLESVWQCLDDAGH-----T ADSL---------RRSA-----GRVGVFIGSM----WHD--YR------QQ--GA---DRWNG-----GDSAEV-----A ATASDIANRVSHFFDFRGPSLAVDTSCSSSFAALHLAVESLRRGECGAAVVGAVNLLAH---PY-HWGLLD--GLE-LL--AADAP-PAAYAAEGSGWHPGEGVGVLLLRPADAARRAKDTVHGLIEGTRIGHAGRAP-RYGAPHTAALADSLARALAD ASVIPDEVDYVECAAAGAGIADAAELEALGSVL--A-RC-----------A------------------------G--AS P---------------------------V-PVGTLKPNIGHLEAASGLSQLIKVLLQIRHGRIAPTL-VSGELSPL-VD--WDGL--PVELVDTPRAL-----------TPRAADGR--ATVLVNAV-GATGSYGHVVVRA >CACIKS1 TAIAIVGMAARLPGA-DDLDTLWRNLSAGRSAVAPL---DAR------RAAALG--L--P------------------------------QVR--------------GGFL-SDIESFDALRFRIAPSEAAGLDPQARQVLEAVWRCVEDAGR-----T DALA----------DLG------RVGVFVGSM----WPD--FQ------LA--GA---DAWRV-----DGTAAQ---S-G
I-ASDLPNRVSHAFGFTGPSVAVDTSCSSSLTALHLAVQSIRLGECAAAVVAGVNLIAH---PY-HLALLS--GLD-LL--GRKEA-EGAFDGEAPGWTPAEGVAAVLLRPADAAAEDRDLVHAVIEGSWTGHLGTAP-RFGAPDAGALTGSLRHALAA AGLTAADIDYVECAAAGAALADAAEIEALGRVF--A-DR-----------A----------------------------R--------P------------------V-AFGTVKPNLGHLESASGMSQLAKVVLQLRHGRLAPTL-LASHRSPL-IS--WDPK--ALRVVDAIEPW----------PPHTDPAAP--PRALINAL-GATGSLAHIIIRG >TaiKS16 EPVAIIGISGRYPGA-YDVPAFWRNLLAGACAITEV-PAE-R----W-DWRAHY--R--A---D---------AAE--AA R----E---G-KSY------S--K--W-GGFV-DDVGRFDPAFFGMTPQDAQHTDPQELLFLEMCWHALQDAGQ-----T PALL--------PGDVR-----RRAGVFAAIT-----KH--YA---------------FPPTS------------------FASLANRVSHALDFGGKSLAIDTMCSSSLVAVNEAWEYLQR-DGRLAVVGGVNLYLD---PQ-QYAHLS--RFR-FA--SSGPV-CKAFGEGGDGFVPGEGAGAIVLKRLSDAERDGDPIHAVIRGCAVNHNGRST-SFTASDPARQADVVRDALTR AGVDPRTIGYVEAAANGHAMGDAIEMTGLGKVFAAC-DG-----V-----S------------------------G---T-------------------------R---AIGSVKANIGHCEAASGMSQLTKVVMAMRDGVLAPTL-RDGTRNPN-IA--FERL--PFEVQEQAAPW-RRLIV-------DGSEVP--RRAGVTSI-GGGGVNAHVVLEE >BryKS8 EAMAVIGMSACYPSA-KNLDQYWENLKCGKNCITEI-PDD-R----W-SIDGFF--C--P---D---------VEE--AL S----Q---G-KSY------S--K--W-GGFL-EDFAAFDPLFFNLSPRDAMRIDPQERIFLQECWRAFEDAGY-----V CSRL--------SPELR-----HKTGVYGAMT--------------------------KINPN----------------T S-FASLVNRVSYIMDLHGPSVPVDSMCSSSLVALHQACESLRQGTIDMALVGAVNLYLH---QD-IYLGMC--QAK-VI--SDSAT-PAIFGCDGKGFTPSEGVGAVVIKRLSDAEKGNDRVLAVIRGSAVNHSGRTN-HYGVPCPRQQAAVIHEAIDN ANVDPRSIAYIESAANGSEMGDAIEMSALTKVF--Q-TH-----------R------------------------DNGKA Q-------------------------Y---SIGSLKSILGHGEAVSGMAQFMKVVLQLRNKSLCPSP-DPQQKNPN-IH--FENL--PFELQTELDEW-RQLTI-------ADKKIP--RRAGITAL-GAGGVNAHMIVEE >NspKS8 DAIAVIGMSGRFPQA-KTLTEFWENLQSGRDCITDL-PNN-R----W-NMDGFF--E--A---D---------PKM--AR E----Q---G-KSY------C--K--W-GAFL-EDFDTFDPLFFNISPREAAQMDPQARVFLQECWRAFEDAGY-----S PSGL--------DDAVR-----DRIGVYGAVT----KVG--FN------------------------------------T 
S-FAAMVNRVSHAMDLRGASVAVDTMCSSALVALHQGCEALRRSDLQMAIVGAVNLYLD---PR-NYKYLC--EVG-LL--AESKM-PRVFGEGGTGFVPGEGVGAVVLKRLEDARRDNDSIVAVIRSSAVNHSGRAN-AYGAPNPLRQAEVISQALDR GGIDPRSVGYIESAAMGLEIADSLELNALKKVF--GDRA-----------E------------------------D---R-------------------------WQCYRLGTLKSTIGHGESVSAMAQFMKVVLQLKHRKLCPTK-VVRPLNPD-ID--FASL--PFQLQTELEPW-SPLTI-------DGICLP--RRAGITAL-GSGGVNAHLIVEE >BBR2KS13 GKIAIIGMSGRFAQA-NNLDEFWENLLHAKKSISEI-PAK-R----W-DWTAYY--H--P---N---------REE--AI S----Q---G-KSY------S--K--W-GAFL-EEFDQFDPLFFQMTPREAENIDPQERLYLEECWKALEDAGY-----A SSKM--------STELR-----KRTGVFGGIT----KQG--FH------LY-STE-------------TTHHFP-----T TSYSSMVNRVSYYLNLQGPSMPIDTMCSSAFVAIHEACEYIRNGKGSMAIAGGVNLYTH---PL-TYFGLT--VGQ-LI--SHTSD-SAAFGNGGIGFVPGEGVGAVVLKDYDQAVQDRDHIYAVIRGTAVNHKGKAN-SYMTPSPLPIADVMEEALKE SGLDPRSISYLEASAYGSDIVDAVEMTAVTKAFHNR-QG-----A-----E------------------------G---D-------------------------Y---RLGSVKPNIGHCESASGMSQLMKVILSLQHQTLVPTL-IPDELNPN-IP--FDQL--PFQLQREVSEW-KQVTV-------DGQTVP--RRAGITSF-GGGGVNAHVIVEE >OzmKS7 DAVAVIGMSGVFPGA-PDPDGLWELLMAGRSAVTEV-PGR-R----W-DWREHY--D--P---H---------PEG--AD V----V---G-KSH------S--K--W-GAFL-DGFDAFDPAVFGFTEQEARNTDPQVRLFLQECWKALEDAGI-----A PSKL--------PSETR-----GRIGVFAGGA----KHG--FT---------------QLGAE-----GRLEMP-----R TSFGDMVNRVSFQFDLGGPSKAVDTACSSAHVALHEAVESIRSGRCDLALAGAVNLYLH---PS-TYVELA--TVG-LL--SDRDD-CASFGAEAAGIVPGEGVGAVLLKPLRQARRDGDPVHAVIRGSAVNHNGRTI-GFTSPSSQRQAEVIREALRD ARVDPRTIGYVEATANGSEIGDAVEMTSLTQVFEDR-PD-----A-----R------------------------G---P-------------------------Y---RIGSLKPNIGHGEASAGMAQLFKVILALRHRTLPPTR-LPGEYNPA-ID--IDRL--PFELSGAPVAW-DQVTV-------DGALVP--RRAGITGL-GGGGTNAHVVLEE >BP17KS2 DGIAIIGLSGRYPDA-PTLDAFWRNLVSGRRSISEI-PAE-R----W-DWRDHY--E--R---D---------PDT--AV A----H---G-KSY------G--K--W-GGFL-DGFSAFDPLFFQIAPREAEFIDPQERLFLEACWHALEDAGC-----P PSAL--------TRAQR-----AKAGVFGGMT----KQG--FN------LY--GA-------------GGAQPY---QST 
S-LAALVNRVSHCFDFNGPAVAFDSHCASALVAIHEACQYLRREPEGIAIAGAVNLNLH---PS-NYQQLS--KMQ-VL--ASGAE-SASFASGGLGYVPGEGVGAVVLKDYRRALEDGDPIYGVIRGSAVNQNGRMN-RFGMPSQKQQEAVVRAALAQ AGVDPRSITYVEASAHGSAVGDAIEMAALTRVFGAR-ER-----A-----D------------------------G---R-------------------------Y---RIGSVKPNIGHGEAVSGMSQLTKVLLSLRHGQLPPTL-VCGAPNPD-ID--FDAL--PFELNTSLTDW-ARARV-------DSERVP--RRAGITST-GASGLNAHLVLEE >BP17KS7 DAIAIVGMAGRYPGA-DDLSAFWRNLVDGVNAITEI-PAE-R----W-DWRAHY--H--P---D---------PEQ--AA R----L---R-KSY------G--K--W-GGFL-GEFDCFDPLFFWMAPRRIAMIDPQERLFLEECWKALEDAGY-----P PSRL--------GDALR-----ERTGVFGGLS----KHG--FS------LY-ASQ---YAGTQ--------------PHT S-PASMVGRVSHFFDLKGPSVAIDNHCASSLVAVHEACEYLRRGDGDLAIAGGVSLCLH---PS-SYVQLS--LVR-ML--SRDAH-CAAFDEGGAGYVPGEGVGVVVLKRLAQARAHGDPIHAVIRSGAVNHNGRMR-YYGQPDQAGQQAAIRAALAR ARIDPRSISYIEAAASGVETTDAVEMAALTEVFGDRAGA-----------A------------------------G---A-------------------------Y---TIGTVKPAIGHGEAASGMSQLMRVALSLKHATLTPTR-LPRRPSPL-ID--FDRL--PFRLAAEAAPW-APVSV-------DGRPVP--RRAGVTAI-G-NGVNAHLVLEE >SorKS24 LDIAIIGLSGRYARA-DDVHEFWENLKAGRDGITTIPP------ERW-DHRSHA-----------------ASMELPA------------EAV--------LD--R-GGFM-SGLDSFDHEFFHFKDDEVDGLDPQEKLFLEAVWHLLEHAGY-----T
SHHIKTRHHG-------------DIGVYAGA----------------SAFVHAG-------------------------FMTGMVAGRAAGFLHLSGPTVVVDSHSASSTTALHLACQALVNGDCELAIAGGVHVH-SPQL----FYAFARWQQGM----EPERR---GFSDAG-GVIMSEGVGAALLKPLHAAVRDGDPIIAVIKGTALTSAGDMA-NLP-PEPTRLADVIRKCLKK AGVDARTVGCAEAMAMGIPTGDYCEFAGFSKAFREQ-----------------------------------------------------------------TKDSGFC-AMGTVESNIGHSIAACGIAQLTKVALQIHHAQLVPTI-KAEPLNPQI---DLDDS--PFYLQRSLSEW-QTPA---------GHDGK--RRGVFASR-GRSGTLAAVVLEE >ChiKS18 GEFAIIGIGGRYPEA-ADVREFWENLKAGRSCIGEV-PPH-R----W-DGDAYY--R--P---D---------GG-------------G-ASR------S--K--W-GGFL-EDVDRFDPLLFNISPLEAERLDPQLRLFLQTAWETFEDAGY-----P RRRLRVVQQ----GA-T-----SGVGVFVGSM----YQH--YP------FV-APD---GATAA-----QLSSFP-------GSAIANRVSHYFDLKGPSMLVDTACSSSLTAIYMACESLARGECAMALAGGVNLSLH---PQ-KYVIFS--QMG-LL--GSKER-SSSLGE-GDGITVGEGVGALLLKPLALALRDGDRVYAVIKGGFVNHGGRTH-GATVPNPSAQADLIVEAFRR AGVRPDAVSYIEVAANGSPLGDSIEIAGLKQAF--R-RF-----T-----V------------------------E---R--------G----------------F-C-ALGSVKSSIGHLEAASGVSQVTKVAYQLHHRTLVPTL-NSEPLNPN-IR--LDDS--PFYVQRERAPW-----------RPAVEGEP--LRAAVASF-GAGGANAYLILES >DszKS10 VDIAIVGLSGRYPGA-DTIDAFWSNLRQGRDSVTEV-PAD-R----W-DAAAIF--D--P---E---------GGP------------G-KTR------Q--R--W-GGFL-DRVDRFDALLFNISPREAAGMDPQERLFLEIAWCAFEDAVY-----T RERL----AEEQARA-G-----VGAGVFVGSM----YQQ--YS------ML-------ARTPD-----AGASSS-------FWSIANRVSYFFDLRGPSLAVDTACASSLTALHLACESLRRGECCLALAGGVNLHLH---PH-KYVALD--RLG-LL--GSGAA-SKSLGD-GDGYVPGEAVGAVVLKPLDRAVADNDRIYGVIKGSFANHAGKTA-GYGVPSPAAQADLIAAALRR TGIDPETIGYIEVAANGSSLGDAIELAGLTQAF--R-RF-----T-----A------------------------R---K--------H----------------F-C-AVGSVKSNIGHPEAASGIAQLTKVLGQLHHRTLVPTL-HAEPHNPN-ID--LRDS--PFYVQRELGPW-TAPTL---AGEGGTAELP--RRAAISSF-GAGGANTHLLVEE >SorKS8 AGVAVIGVAGRYPHA-QDIDRLWQNLTSGRDCITEVPA------VRW-DHRRYF---------DA-------EKGKPG------------KSY--------GK--W-GGFI-EGVDEFDPLFFNVSPRNAEAMDPQIRLFLEIVWELLETAGY-----S 
RQALQARFDG-------------DVGLYVGSMSQQYRGLGGDPSTQALAALASA----------------------------GDIANRASNFFDLKGPSVAVDTMCSSSAMAIHMACRDLLQGDCRMAIAGGVNLLIHPDR----YVSLS--QAQMI---GSQPG-SRSFAAG-DGYLPAEAVGAVLLKPLGAAIRDRDTIWAVIKSTCTNHSGRSS-GYAAPDPNTQAAVIERALQK AGVEPETVSWVEAAATGFVLADAVELTALNKAFAKL-----------------------------------------------------------------TAGPRAC-AVGSVKSHIGHAEAASGISQLTKVLLQMRHRTLIPPL-AVETPNPNL---RFETS--PFYLLREVQEW-RRPVVTI---DGREREAP--RRALIDSF-GAGGSYVSLVVEE >SGRFKS16 DAIAVIGLAGRYPGA-QDLDVFWENLVAGRDGVTEV-PRD-R----W-EHGPAD--D--S---G---------RDL--P---------G-GTP------A--R--W-GGFL-DGAADFDAQFFNVSPREAAIMDPQTRLFLECVWTLLESSGY-----T RDRL--------REAHG-----GRVGVYVGAM----YQH--YQ------LL-------SSDPV-----HESITS-----V MSYSAIANRVSHFFDLQGPSLAIDTTCSSSLVAIHMACEELRRGGGDMMIAGGVNLSLH---PK-KYLGLS--LTG-LT--GSDPG-SRPLLD-GDGFIPAEGVGAVLLKPLADAVRDGDEILAVVRSSATNHKGRTS-GPMVPSPARQERLITESLER AGVHPRTISYAEVSANGSQMGDAIEFAALRDAF--G-ER-----T-----R------------------------D---E--------R----------------F-C-ALGTVKSTLGNMEAASGVAQLSKVVLQLAHRRLVPFA-GEGRLNPG-VE--LAGT--AFYLPQEPQPW-LRPVV---NLDGQEREYP--LRATINSF-GAGGTNAHLVVEE >RhiKS16 RELAIVGISGRYPGA-EDLEAFWHKLAGGEDLISEV-PTQ-R----W-DHQAYF--A--D---Q---------RDR--F---------D-KTY------C--K--W-GGFL-DGVEDFDPLFFNLSPREAEIINPNDRLFIETCWNLLESAGL-----T RQRL--------KQQYQ-----QQVGVFVGVM----YQQ--YQ------AF-------EADFV-----RESLVS-----V TSYSAIANRVSYFFDFQGPSLAIDTMCSSSISAIHAAGEALRNGDCRLAIAGGVNLTLH---PK-KYIGLS--IGK-VL--GSHAS-SRSFAD-GDGYLPAEGVGAVLLKTLADAERDGDQILAVIKSTAVNHGGHTH-GFSMPSAKAEAALIDSNFKR AGVDPRTISYVEAAANGSAMGDAIELSALNRVF--G-QA-----G-----V------------------------A---H--------Q----------------S-C-AIGSVKSNIGHAEAASGMSQLSKLVLQLQHQQLAPSL-LLGSLNPK-LD--FENS--PFVLQRELGHW-PQPVV---ETDGVSRQYP--RRAALSAF-GAGGSNAHLVLEE >CPKS14 EAIAIIAIEGRYPQS-ANIAAFWDKLRTGKDCITTV-PTE-R----W-DNSLYF--E--Q---Q---------PAA--P---------G-KTY------C--N--W-GGFI-PAADQFDPLFFNISPREASLTDPQERLFLETVWQLLESAGY-----T KDTL--------KRKYN-----SEVGVFAGAM----YNH--YQ------LL-GGG---DAQEA-------VTAL-----S 
S-HSAIANRVSYYFNLHGPSLAVDTACSSSLVALHYACESILKGECQMAIAGGVNVTVH---PK-KYTGLS--LTG-MI--ASHPD-SRSFSA-GDGYLPADGVGAVLLKPLSAAIADNDNILAVVLSTAISHNGQSN-GFTVPNLAAQAQLIEQHFRH AGIDPATISYVEAAANGSAMGDAIEFAALSRAF--R-KF-----S-----D------------------------Q---Q--------G----------------F-C-AIGAVKSNIGHAEAASGMSQLSKVVLQLHHRQLVPTL-RKGASNPD-IS--WEGS--PFYLQDHLADW--------------DRPFP--LRATVSSF-GAGGTNVHVILEE >PedKS13 RDIAIIGMSGRFPFA-PDLEAFWENLSQGCDCITEI-PPT-R----W-KHQEYF--D--P---E---------KGK--------P---G-KTY------C--K--W-GGFL-ESIDQFDPLFFKIPPAQAEVLDPQERLFLETVWNLLESSGY-----L GETL--------QRIAQ-----SRVGVFVGSM----SQQ--YH------AF-------QADLT-----RESLVT-----M SSHSSIANRVSYFFDFQGPSVAVDTMCSSALVAVHMACESLLRXDCKAAVAGGVNLSIH---PK-KYIGLS--ASQ-IL--GSHPD-SSSFGQ-GDGYLPSEGVGAVLLKPLREAVADNDTILGVIKSTTINHSGQSN-GYFVPNGAAQTELMVSNFTK AGIDPRTLSYVESAANGSSLGDAIEINALTAGF--G-RY-----T-----A------------------------D---K--------Q----------------F-C-ALGSVKSNIGHGEAASGIAQLIKVLLQLKHRQLVPTI-KAQPLNSN-ID--FTHT--PFCLQRRLEPW-RRPSL--ALGDGPMREYP--LRATVSSF-GAGGSNAHLILEE >PPOLKS12 MDIAVIGVSGKYPGA-ENLQEFWSNLQESKDCITEV-PKD-R----W-EHDLYF--D--K---E---------RNK----
-----P---G-KTY------C--K--W-GGFM-ERISLSEPAFFHLTPYEISLMDPMEQLFLGIIWNLLESAGY-----T REAL--------QKIHQ-----NKVGVYVGAT----YHK--YC----------SC---DIEPD-----SAQRVT-----S FGSVAVANPMSHYFNFQGPSISIDTMSSSSAVAVHMACESLIRGECQIAVAGGGNLLTS---PK-KYIESS--QNQ-LI--GSHED-SRSFAD-GDGYLPAEGVGAVLLKPLHKAVQDGDCILAVIKSTATNHGGHSN-GYTIPNPNAQAQLVEENFLK AGIDPRTISYVEAAANGSTLGDPIELAALNKAF--Q-KF-----T-----T------------------------E---Q--------Q----------------F-C-AIGSVKSNIGHAEAASGISQLTKVILQLWHRKLVPTI-KAEKLNPN-IN--FSNT--PFYLQREVQEW-ERPVI---EIQGEEREFP--LRATVSSF-GAGGSNVHFILEE >BBR2KS5 LDIAVIGISGRYPQA-STIQEFWNNLRDGKDCITEI-PQE-R----W-DHSLFF--D--E---A---------RNS--------Q---G-KAY------S--K--W-GGFI-DGVDQFDPLFFHISPREAEMMDPQERLFLETVWNLLEEAGH-----T REVL--------QSQYE-----NKVGVYVGAM----YQP--YH------AF-------DTDME-----KSSIIS-----L SSYHSIANRVSYFFNLQGPSMAIDTACSSSAIAIYHACESLLKGESTLAIAGGVNVSIH---PK-KYLGLS--QAQ-MI--GSHVD-SRSFGD-GDGYLPAEGVGAVLLKPLAKAVEDGDSILAVIKSAATNHGGRTN-GFSVPNPNAQAQLIEENFAR AGIDPRTISYVEAAANGSLLGDPIELKALTNAF--S-KQ-----T-----D------------------------D---V--------Q----------------F-C-AIGSVKSNIGHAEAASGISQLTKVILQLQHQELVPSI-KAEPLNPH-IH--FAET--PFYLQKARQKW-ERPVL---RINGEEREVP--RRATISSF-GAGGSNAHLILEE >BCERKS5 EEIAIVGISGKYPLS-ENSDTFWKNLKNGKNCITEV-PTE-R----W-DAELYF--N--T---E---------KGV--T---------G-KSY------T--K--W-GGFI-NEVDKFDPLFFNISPAEAELMDPQERLFLEIVWATLEDAGC-----T RDSL------------G-----TEVGVFVGSM----YKH--YP------WI-------AKDTE-----AESLLS-----S TSYWAIPNRVSYLYDFQGPSIAIDTACSSSLNAIHQACQSIKLGECKAAIAGGVNLSIY---PE-KYVGLS--RTG-MI--GSSEK-SKSFGD-GDGYVMGEGVGAVLLKPLSKAVEDGNHIYGIIKSSASNHGGKTN-GFAVPSLNAQVNLIEKVIKS ANIPAETISYIESAANGSVLGDMIEVNALNKVF--K-NV-----T-----N------------------------R---K--------N----------------T-V-PIGTVKANIGHLEAASGISQLTKVLLQIKHKSLVPTI-SARPINPH-IE--LENS--PLYISDREEEW------------KVTNGVP--RRALINSF-GAGGSNTALIVEE >BTPKS3 RDIAIIGLSGKYPKA-KNIEEFWSNLVNGENCITEI-PAE-R----W-DSSLFY--D--P---D---------KGI--------H---G-KAY------S--K--W-GGFI-GDVDKFDPLFFNISPREAELMDPQERIMLEITWHALEDAGY-----S 
LKKL------RQLKEEG-----HQVGVFIGSM----NQQ--YP----------WT---AANRE-----LGATLS---G-N S-YWAIPNRISHFLGVEGPSLAVDTACSSSFSALHLAMNSLQKGECSIAIVGGVNLSLH---PY-KYIGLS--QKK-LL--GSTDR-SLSLGL-GDGYVPGEGAGVVILKALTAAQQDEDKIYCKIKSSVMKHGGNSD-AYTVPSKKVQKDLVLKAFEE SNVDPVTIGYYELAANGSAKGDAIEIEALKEAY--K-EF-----T-----E------------------------L---K--------D----------------I-C-AVGSVKSNIGHLEASSGMSQLTKVILQLQHETLVPSI-NSEVLNPD-ID--LKNS--PFRVQQITDKW-KRKTI---EHNGEIEEVP--LRAAINSI-GAGGTGVTVVLEE >BCERKS3 RDIAIIGLSGKYPKA-KNIEEFWSNLVNGENCITEI-PAE-R----W-DSSLFY--D--P---D---------KGI--------H---G-KAY------S--K--W-GGFI-GDVDKFDPLFFNISPREAELMDPQERIMLEITWHALEDAGY-----S LKKL------RQLKEEG-----HQVGVFIGSM----NQQ--YP----------WT---AANRE-----LGATLS---G-N S-YWAIPNRISHFLGVEGPSLAVDTACSSSFSALHLAMNSLQKGECSIAIVGGVNLSLH---PY-KYIGLS--QKK-LL--GSTDR-SLSLGL-GDGYVPGEGAGVVILKALTAAQQDEDKIYCKIKSSVMKHGGNSD-AYTVPSKKVQKDLVLKAFEE SNVDPVTIGYYELAANGSAKGDAIEIEALKEAY--K-EF-----T-----E------------------------L---K--------D----------------I-C-AVGSVKSNIGHLEASSGMSQLTKVILQLQHETLVPSI-NSEVLNPD-ID--LKNS--PFRVQQITDKW-KRKTI---EHNGEIEEVP--LRAAINSI-GAGGTGVTVVLEE >DszKS8 GDIAIIGVSGRYPQA-EDLRALWARLQAGESCIEEI-PAE-R----W-DKDRYF--D--P---Q---------KGR--S---------G-KSE------S--K--W-GGFL-RDVDQFDPLLFNIPPARARIMDPMQRLFLESVYETLEDAGY-----T RAML--------SKD-G-----GKVGVYVGAI----YHH--YA------ML-------AADES-----TRSLLL-----S AFGAHIANHVSHFFDLHGPCMAVDTTCASSLTAIHLACEGLLLGRTDLAIAGGVNLSLI---PE-KYLGLS--QLQ-FM--SGGAL-SRPFGD-SDGMIPGEGVGAVLLKPLDRAVRDRDHIHAIIRSSAVSHGGAST-GFTAPNLKAQSDMFVEAIER AGIDPRTISYVEAAANGAPLGDPIEVNALTRAF--R-RF-----T-----A------------------------D---T--------G----------------F-C-ALGTVKSNIGHLEGASGVSQLAKVLLQLRHGALAPTI-NAEPRNPN-LH--LDDT--PFYLQERLDDW-RRPII-------SGREVP--RRAMINSF-GAGGGYATLVVEE >RhiKS11 RDIAIIGVSGRYPMA-ADLACFWDNLRNGRNCVSEA-PNS-R----W-SESLTG--T--Q---S---------WAA------------G-R-----------Y--Y-GGFL-EQVEEFDHKLFGVPHEEVSNLSPELRQMLEVTWRTFEDAGY-----N SDAI-----ARIQQRDA-----AGVGVFIGSM----YHQ--SP-----WT--------ESSLE-----RAAIKS---N-V 
T-DWQIPNRISHYFDLKGPSLAVNSACSSSMTAIHLACQSLLQNDCAMAIAGGVNLILD---PS-KYQTLK--LAN-YL--GSSDV-SRGFGQ-GDGMIPGEGAGAVLLKPLAAAVADGDRIYGVIKSSSVNHGGGRL-MYSAPDTQQQSRLIAQTIER AGLKPAQINYVEAAANGSELGDPIEVAALKKVF--G-DM-----Q-----P------------------------S---S---------------------------C-ALGTVKSNIGHLEAASGISQLTKVLLQLQHGQLAPSI-NADPPNPH-IR--LEGS--AFYLQQQVAPW-PAPVS------AEGAREP--RRCLINSF-GAGGSYASLVVEE >NspKS6 DDIAIVGIAGRYPGA-KNLDELWEVLRTGRSCIVEA---D-R----F-RRSNLS--D--R--KH--------------------S---ETPAR------S--H--F-GGFL-NDVYHFDRQLFGVSEERAIAMPPEVRLFLEITWETFEAAGY-----S PAAL-----KQFQLREK-----KGIGVFVGSM----YSQ--YA----------WT---NPSRK-----EAVLSS---N-G T-EWQIANQVSHFFDLTGPSLVLNTACSSSLTAIHLACESLRQGSCSMALAGGVNLTLE---PS-KFLSLE--RSN-FL--GSGQH-SRSLGD-GDGMIPSEGVGALLLKPLSLAIADNDRIEGIIKSSFVNHSGGRQ-AFTAPDPAQQTQLVLDSIAL SGCDIETITYVESAANGSSLGDPIEIIALKNAF--A-KL-----T-----N------------------------K---T--------G----------------F-C-AIGSVKSNLGHLEAASGISQIAKVLLQFQHKTLVPTI-NATPINPR-IK--LDNS--PFYLQEQLSPW-------VPQSDSHNSDLP--RRSLINSF-GAGGSYANLIVEE >OnnKS10

DDIAIIGISGRYPLS-KTLDALWENLKAERNCITEA-DAS-R----W-RQALDG--I--V---A---------QGG------------P-VPP------CR-Y--Y-GGFL-QDVHGFDHALFDIAPEQVAGLSPELRLFLEITWETFEDAGY-----A KHAL-----QALQAREQ-----QGVGIFVGTM----YSQHSFT---------------APNLT-----EAAYLS---N-G T-DWQIANRTSHFFDLTGPSIAVNSACSSSLTAIHLACESLKQRSCSMAIAGGINLTLL---PS-KYDALS--RSK-ML--GSGHE-SKSLGV-GDGYIPGEGVGAVLLKPLGAAKRDHDRILGVIKSSFINHSGGRQ-MYTAPDVKRQAELIINSIER SGIDPETIGYVESAVNGSELGDPIEISALQKAF--A-TF-----T-----N------------------------K---R--------Q----------------F-C-ALGSVKSNLGHLEAASGVSQLSKVLLQHQHQMLVPSI-NANPMNPH-VK--LQKT--AFYLQQACSPW-EPLHH-----PETGERIL--RRSMINSF-GAGGSYANLIVEE >BaeKS13 DDIAIIGISGRYPES-ETLDELWEHLKAGDSCITEA-PEN-R----W-KSGLLK--T--M---A---------KET------------R-KEE------RKTR--Y-GGFL-QHIDAFDHHLFDIREDHVMEMTPELRLSLETVWETFENGGY-----S LERV------TEWQESD-----SGIGVFMGSM----YNQ--YF----------WN---IPSLE-----KAALSS---N-G G-DWHIANRISHFFNLTGPSMGVTTACSSSLSAIHLACESLKLNSCSMAIAGGVNLTLE---PS-KYDALE--RAN-LL--EQGSE-SKSFGT-GTGLMPGEGVGAVLLKPLSKALADKDHIYGVIKSSALCHSGGRQ-MYTAPDPKQQAKLMAASIDK AGINPETISYVESAANGSVLGDPIEVIALTNAF--A-QY-----T-----D------------------------K---K--------R----------------F-C-ALGSVKSNLGHLEAASGMSQLAKVLLQMERETLVPTI-NAKPQNPN-IN--LEQT--AFYLQEKTEYW-ERMRD-----AETGDIIP--RRSMINSF-GAGGAYANLIVEE >PksXKS14 EDIAIIGVSGRYPMS-NSLEELWGHLIAGDNCITEA-PES-R----W-RTSLLK--T--L---S---------KDP------------K-KPA------NKKR--Y-GGFL-QDIEAFDHQLFEVEQNRVMEMTPELRLCLETVWETFEDGGY-----T RTRL-------DKLRDD-----DGVGVFIGNM----YNQ--YF----------WN---IPSLE-----QAVLSS---N-G G-DWHIANRVSHFFNLTGPSIAVSSACSSSLNAIHLACESLKLKNCSMAIAGGVNLTHD---LS-KYDSLE--RAN-LL--GSGNQ-SKSFGT-GNGLIPGEGVGAVLLKPLSKAMEDQDHIYAVIKSSFANHSGGRQ-MYTAPDPKQQAKLIVKSIQQ SGIDPETIGYIESAANGSALGDPIEVIALTNAF--Q-QY-----T-----N------------------------K---K--------Q----------------F-C-AIGSVKSNLGHLEAASGISQLTKVLLQMKKGTLVPTI-NAMPVNPN-IK--LEHT--AFYLQEQTEPW-HRLND-----PETGKQLP--RRSMINSF-GAGGAYANLIIEE