Conservation of amino acid residues in microsporidian ISC and CIA proteins

Supplementary Figure 1: Conservation of amino acid residues in microsporidian ISC and CIA proteins. Multiple sequence alignments are provided for a) the ferredoxin oxidoreductase Arh1, b) the ferredoxin Yah1, c) the monothiol glutaredoxin, d) the ABC transporter Atm1, e) the metal-binding P-loop NTPases Cfd1 and Nbp35, f) the CIA targeting complex protein Cia1, g) the CIA targeting complex protein Cia2, h) the iron-only hydrogenase-like CIA factor Nar1, and the electron transfer chain components i) Dre2 and j) Tah18.

a. Ferredoxin oxidoreductase Arh1

Arh1_Tra_hom Arh1_Enc_cun Arh1_Enc_int Arh1_Nos_cer Arh1_Ent_bie Arh1_Sac_cer Arh1_Neu_cra Arh1_Hom_sap Arh1_Cry_par

M------------------------------------------KICIIGA M------------------------------------------KVCVIGG M------------------------------------------KVCVIGG M------------------------------------------KVAIIGG M-----------------------------------------KTVAIIGS MSF---------------------------VQIRHISSQINRKTVSIVGS MTFVARVKPGALGLRGRWPSQVHAPRRMYSQHPAHHTSPEHPLRVAVIGS MASRC----WRWWGWSAWPRTRLPPAGSTPSFCHHFSTQEKTPQICVVGS MRP-----------------------------------YKPIYKLCIVGA

Arh1_Tra_hom Arh1_Enc_cun Arh1_Enc_int Arh1_Nos_cer Arh1_Ent_bie Arh1_Sac_cer Arh1_Neu_cra Arh1_Hom_sap Arh1_Cry_par

GPSTYYLTNHIHTHAP----STTIHVFEKNSSPLQSLKLFHTKPT-MR-GPAGLYTAASLLARN------IDVTLHEKEAEVGGMYR-YSLLPASK--GPAGLYTAASLLSRD------VNVTLYEKEPELGGLYR-YSLLPESR--GPSGLFLAKYLSKHA------NSITIYEKSGTFGGMYN-YSYNPKL---GPSGLLVANYLANY-------MKINIYEASNKILGHYN-YSLNIKK---GPSGFYTAYHLLKKSP---IPLNVTIWEKLPVPFGLSR-YGVAPDHPEVK GPAGFYTTYKLMAKIQ----GTKVDMYESLPVPFGLVR-FGVAPDHPEVK GPAGFYTAQHLLKHPQ-----AHVDIYEKQPVPFGLVR-FGVAPDHPEVK GPSGCYLAKYLLARSKKENIAIKIDLLDSLDKPFGLLR-YGIAPDRHDLK

Arh1_Tra_hom Arh1_Enc_cun Arh1_Enc_int Arh1_Nos_cer Arh1_Ent_bie Arh1_Sac_cer Arh1_Neu_cra Arh1_Hom_sap Arh1_Cry_par

-----INNVL--------------CNIKVHK-----------WMFERLSA --MSPFAKLL---EH---KNFSLKLNSKVDLG-----------KLKTMEK --MSPFTKLL---EH-----KNFSLKLNSKIDL---------EKLSGMEK ---NIFKNIL---NT-----KNIDFKPNFEITES---------NFKTIEP ---SLFENIL---KN-----KNINLFLNTKIT-----------DLTTI-NCEETFTTCAEEFSSPTNQKHKFSF---VG-GITIGKEILLKELL----D NCQEKFEEVA---SS-----PDFRFIGNVSVGTKSDHPDGLTVPLASLFR NVINTFTQTA---HS-----GRCAFWGNVEVGRDV--------TVPELRE KSISSIDNSLFKKYS-----DDIKFYGNVTLGYDV--------KLEELKR

Arh1_Tra_hom Arh1_Enc_cun Arh1_Enc_int Arh1_Nos_cer Arh1_Ent_bie Arh1_Sac_cer Arh1_Neu_cra Arh1_Hom_sap Arh1_Cry_par

FYDGIVVAVGGRARR-LD-------GA---LN-KFAVYGEDVVRKGGGLE EFDAFVIATGSDGPRRLD-----IPGG------EHCVSSLDIAKSWTG-EFDAFVIATGPGGPRKLA-----IPGA------DHCIGSLDIAKSWAG-YYDKFVLAIGGVPNF-NE--------N------SSYINALDIIKNK---NADAIILATGGIAKS-IN----------------GTKTALDIIKQYDH-NQDAVILSYGCTGDRKLN-----IPGE---LGTKGVFSSREFVNWYNGHP HYNAIIFAYGAAQDRKLG-----IPGE---DQLKGVYSAREFVGWYNGLP AYHAVVLSYGAEDHRALE-----IPGE---E-LPGVCSARAFVGWYNGLP KYDVVVLAVGGLQSF-HTLPVKYMNNELQNKIIGGVFSSRDWVFYYNSHP

Arh1_Tra_hom Arh1_Enc_cun Arh1_Enc_int Arh1_Nos_cer Arh1_Ent_bie Arh1_Sac_cer Arh1_Neu_cra Arh1_Hom_sap Arh1_Cry_par

GRTG------------RSARLLESVLESKRMLGRGRRSAEAMGLEKKNKI ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------DFAK------------------DKRF-----------------------QH----------------------AN-----------------------EN----------------------QE-----------------------MFKKMLYPTKNEHSNSINNKLIDNERIDMDIELNYLKKKNNQFFEYKSPL

Arh1_Tra_hom Arh1_Enc_cun Arh1_Enc_int Arh1_Nos_cer Arh1_Ent_bie Arh1_Sac_cer Arh1_Neu_cra Arh1_Hom_sap Arh1_Cry_par

TKNKNNEENENTRKD--VNSREMHESLVHAKPFSRLRRVAIVGAGNVSLD -------------EELR---------------YTVGRKVLVVGMGDVSMD -------------EEPK---------------YTVGSKVLIVGMGDVSMD --------------VSI---------------NSLGKKVCIIGMGNVALD --------------NQD---------------VNIGKNICIIGAGNVAMD -------------TDFD---------------WSKVSKVGIIGNGNVALD -------------LNPD---------------LTQGEEAVVIGQGNVALD -------------LEPD---------------LS-CDTAVILGQGNVALD SSTEISVPFKN--ENEDFKYGYGSDILRNYILNSSERNAVIIGNGNVSLD

Arh1_Tra_hom Arh1_Enc_cun Arh1_Enc_int Arh1_Nos_cer Arh1_Ent_bie Arh1_Sac_cer Arh1_Neu_cra Arh1_Hom_sap Arh1_Cry_par

VANMLYE---------------------------------QGVPQIFILS IARFLFGWQ------GPQFRFPKSILEKV-----------KDVEDVTITS ITKYLFGWA------GPQFKFPKNALERV-----------KEVRDVTITS VCRKII----------------------------------TKVNEIDIIS LLLRLK----------------------------------NKLSTATVIS ITRVLISNQIDEIWENTDISSLALNLLRR-----------APVKDVKLIA VARMLLEDV--DVLRKTDIAAHALETLSQ-----------SRVKRVHVVG VARILLTPP--EHLERTDITKAALGVLRQ-----------SRVKTVWLVG ITRLLSFYTHEQLSKNKYLNPDYLNLIDTSSKYSNSDIYRPLFKNIFIIG

Arh1_Tra_hom Arh1_Enc_cun Arh1_Enc_int Arh1_Nos_cer Arh1_Ent_bie Arh1_Sac_cer Arh1_Neu_cra Arh1_Hom_sap Arh1_Cry_par

RNELAKCAFGSYELRRCMHDKNVRVWGGTVGNDRRGMI-VGERLRKCEKRRGVSGSAFTNSGLRSVLEIPDLGFHWSEGGNTPQNHRPRPENFEETSNRRNVLESSFSNHGLRSVLEIPRLGFSWSSPKDISCPDKDKTKDLEQKFTRGGLYISKFGNNVMREILNLANFKGH----------NISLPSNLKEN--RTDLFNSKFTNNKLRNIKDYYQITTNIYN-INDL-----QPKNNEDI--RRDFVHSKFTNKELRELWELEKYGIRGRIDPKFFQK-------------RRGPMQAAFTIKEVRELMKLSNVSF----HPVDTSL-------------RRGPLQVAFTIKELREMIQLPGARPI—LDPVDFLG--------------RRGWIQNSFKYPLLKEFIDKSRKSKYNSTNGMNIRVMM-SQEDFELSQD-

Arh1_Tra_hom Arh1_Enc_cun Arh1_Enc_int Arh1_Nos_cer Arh1_Ent_bie Arh1_Sac_cer Arh1_Neu_cra Arh1_Hom_sap Arh1_Cry_par

---DS-----------GDEEIIETE-NESSRNSTKIENNKIANDTRCDDP ---WAREHR-----DSEENGRIKK---WWERRM----G--------LLG---YLQKDD-------GKKDNINK---WKERRL----R--------LFQ------------------------------RRY----N--------LLN------------------------------RRL----H--------IIK---------------------EMFDPSKYDRAFNRRVEMCSEYLKPFNER ---------------------LPPDLKSLPRAPRRLMEMLAKGTTAISQS ---------------------LQDKIKEVPRPRKRLTELLLRTATE-KPG ---RT-----------SLFELERSG-PEIKRRFLKMKSIFQEMVNNHQEY

Arh1_Tra_hom Arh1_Enc_cun Arh1_Enc_int Arh1_Nos_cer Arh1_Ent_bie Arh1_Sac_cer Arh1_Neu_cra Arh1_Hom_sap Arh1_Cry_par

FYTSPTLLSLFNLSYNPTETSFHNQSPSLQPKQTLYLLLNTVLQSIKKNV ---AVRKG------------------ARRLRLMFNTNIKSIEKVGAQYKV ---GVREG------------------AKRLRLMFNTNIKSIERVGRQYKV ---NNNIKIDK--------------TKNNINLFFDTSIKKVTKINGKYVV ---KINNNIN---------------NPKISFIFNGTVKQIADNKVTY--SKKNYKKAPPPSSGYDKFWELDYLKTPLKINRDDFGAIN-----------PS----------ETAKSWSLDFCLTPKAFSPSSSSSTP---------------PAEAARQASASRAWGLRFFRSPQQVLPSPDGRRA----------IANNDNIFHNDKTINIHFKNLFSTVNIKTEEVNIPENNVKKK--------

Arh1_Tra_hom Arh1_Enc_cun Arh1_Enc_int Arh1_Nos_cer Arh1_Ent_bie Arh1_Sac_cer Arh1_Neu_cra Arh1_Hom_sap Arh1_Cry_par

SLFNLALK------------------TR-E--GTKMLNNVDLVVNCTGYT KME---------------------------QDGVQIEECFDSVISSIGFN KME---------------------------QGGIPIEEYFDTVISSVGFS EFN---------------------------N---GCEHKYDSIITSFGFK ---------------------L-----S-N--GETVEKIFTDIISSIGFI S-LSLCNNRLNE-----D-----NSLQPLKDVNNIMTYKVDLLITSLGYA TSTQLASTTFER--TTLSPSPFDPNAYA-LPTGETLTLPSSIAFRSIGYK AGVRLAVTRLEG--VD-------EATRA-VPTGDMEDLPCGLVLSSIGYK SIP--FIKGIELARNIRDSKPI-TDKTK-LNEKEKYYLPCQLLITSLGFK

Arh1_Tra_hom Arh1_Enc_cun Arh1_Enc_int Arh1_Nos_cer Arh1_Ent_bie Arh1_Sac_cer Arh1_Neu_cra Arh1_Hom_sap Arh1_Cry_par

GRDL-----STYV------------------------------CTKPLYF RTKE-----KAPG------------------------------IAKAVYH RADP-----RSLG------------------------------FTKPVYY ANQL------KIN------------------------------TDKPVYK PNIN-----VN--T--------------------------NTVNNVPVYK GVPMPEFSKLSIGFDK--DHIANK-QGRVL--------TSSGEIFPHLYA STPLPEFSDINIPFDERRGIISNDGRGRVQHEERTRGAEMSHGSFPGLYC SRPV----DPSVPFDSKLGVIPNV-EG-------------RVMDVPGLYC PKYD-----YIFNGNKDY-------------------SFENNCFPCPIFK

Arh1_Tra_hom Arh1_Enc_cun Arh1_Enc_int Arh1_Nos_cer Arh1_Ent_bie Arh1_Sac_cer Arh1_Neu_cra Arh1_Hom_sap Arh1_Cry_par

LGWCKD-AKGNLSVVKGRAVELGDRMIDEMGL-----------------VGWARH-PRGNVERAKEDAQDVVNKIVQTEKI-----------------VGWAKH-PKGNAERAKEDAQDVVNKIVEMKK------------------IGWCDI-PFGNISDAVQSAKKMVYKIL----------------------IGWCNK-PMGNIASLRINAQILADQIKTVLL------------------SGWIRKGSQGVIASTMQDAFEVGDRVIQDLVVSGALSLENSIDLS----AGWVKTGPTGVIASTMENAFATADAIIEDWVSRTPFLNADRNVHGWEGVK SGWVKRGPTGVIATTMTDSFLTGQMLLQDL--KAGLLPSGPR-PGYAAIQ TGWMETNSKGDLNIALQKSLTLGNEILSLLKKMP----------------

Arh1_Tra_hom Arh1_Enc_cun Arh1_Enc_int Arh1_Nos_cer Arh1_Ent_bie Arh1_Sac_cer Arh1_Neu_cra Arh1_Hom_sap Arh1_Cry_par

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------NIK------HTTWKDWERINKKELLRGKKEHKTRSKFLTFEELWNGVE SEVLKSGDDKRVVDWQGWRRIDEAERDRGRETGRSREKFTRTGEMLN--V ALLSS--RGVRPVSFSDWEKLDAEEVARGQGTGKPREKLVDPQEMLR-LL ---------------------------------------------PK-NV

Arh1_Tra_hom  -K
Arh1_Enc_cun  -Q
Arh1_Enc_int  -M
Arh1_Nos_cer  -T
Arh1_Ent_bie  -K
Arh1_Sac_cer  GI
Arh1_Neu_cra  LG
Arh1_Hom_sap  GH
Arh1_Cry_par  EA

Abbreviations: Tra_hom, Trachipleistophora hominis; Enc_cun, Encephalitozoon cuniculi; Enc_int, Encephalitozoon intestinalis; Nos_cer, Nosema ceranae; Ent_bie, Enterocytozoon bieneusi; Sac_cer, Saccharomyces cerevisiae; Neu_cra, Neurospora crassa; Hom_sap, Homo sapiens; Cry_par, Cryptosporidium parvum. Conserved residues situated close to the prosthetic group FAD are highlighted in yellow [1,2]. Conserved residues situated close to the cofactor NADP+ are highlighted in red. In bold: consensus sequences of four highly conserved peptide segments in Arh1 homologues. All of these polypeptide motifs map to the active site of Arh1 and make contacts with both FAD and NADP. Three of the motifs are involved in binding FAD: 1) [VI]-[VI]-G-X-G-P; 2) G-L-X-R-X-G-X-A-P-D-H-X(3)-[KR] (note that this motif is not conserved in microsporidia); and 3) G-W-X(3)-G-X(2)-G. The last motif is involved in binding NADP: G-X-G-N-V-X(2)-D-X(2)-R [3]. The Arh1-Yah1 complex displays a highly charged surface arising from interacting surfaces that are predominantly acidic (Yah1) or basic (Arh1). In green: basic residues from Arh1 involved in salt bridges with acidic residues of Yah1 [4]. In dark green: basic amino acids from Arh1 that share properties with those residues shown empirically to interact with Yah1. Arg239 and Arg243 of the human sequence (underlined) were shown experimentally to have binding affinity for Asp76 and Asp79 of ferredoxin [5,6].
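The consensus motifs in the legend above use a PROSITE-like notation (X for any residue, X(n) for n residues, brackets for alternatives), so they can be checked against the alignment with ordinary regular expressions. The following is a minimal sketch, not the procedure used to prepare the figure. The function names and dictionary keys are hypothetical, motif 2 is written with every residue separated by a hyphen, and the fragments are short excerpts copied from consecutive Arh1 alignment rows with gap characters stripped before searching.

```python
import re

# Consensus motifs transcribed from the legend (X = any residue, X(n) = n residues).
MOTIFS = {
    "FAD_1": "[VI]-[VI]-G-X-G-P",
    "FAD_2": "G-L-X-R-X-G-X-A-P-D-H-X(3)-[KR]",
    "FAD_3": "G-W-X(3)-G-X(2)-G",
    "NADP": "G-X-G-N-V-X(2)-D-X(2)-R",
}

# Short excerpts copied from the Arh1 alignment rows. The keys are hypothetical
# labels chosen for this sketch, and "-" marks an alignment gap.
FRAGMENTS = {
    "Hom_sap_Nterm": "QICVVGSGPAGFYTAQHLLKHPQ-----AHVDIYEKQPVPFGLVR-FGVAPDHPEVK",
    "Tra_hom_Nterm": "KICIIGAGPSTYYLTNHIHTHAP----STTIHVFEKNSSPLQSLKLFHTKPT-MR",
    "Hom_sap_NADP": "CDTAVILGQGNVALDVARILLTPP--EHLERTDITKAALGVLRQ",
    "Hom_sap_FAD3": "SGWVKRGPTGVIATTMTDSFLTGQMLLQDL",
}


def motif_to_regex(motif):
    """Translate a hyphen-separated consensus motif into a regular expression."""
    parts = []
    for element in motif.split("-"):
        wildcard = re.fullmatch(r"X(?:\((\d+)\))?", element)
        if wildcard is None:
            parts.append(element)          # literal residue or [..] alternative
        elif wildcard.group(1) is None:
            parts.append(".")              # single wildcard position
        else:
            parts.append(".{" + wildcard.group(1) + "}")  # n wildcard positions
    return "".join(parts)


def find_motif(motif, sequence):
    """Return the first match of the motif in the gap-stripped sequence, or None."""
    ungapped = sequence.replace("-", "")
    match = re.search(motif_to_regex(motif), ungapped)
    return match.group(0) if match else None


for label, fragment in FRAGMENTS.items():
    hits = {name: find_motif(motif, fragment) for name, motif in MOTIFS.items()}
    print(label, hits)
```

With these excerpts, motif 2 is found in the human fragment but not in the T. hominis fragment, in line with the legend's note that this motif is not conserved in microsporidia.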

b. [2Fe-2S] ferredoxin Yah1

Yah1_Tra_hom Yah1_Enc_cun Yah1_Enc_int Yah1_Sac_cer Yah1_Neu_cra Yah1_Hom_sap Yah1_Gia_int Yah1_Tri_vag Yah1_Cry_par

M-LKNV-------------------------------------------MDMFSA-------------------------------------------MDMANA-------------------------------------------MLKIVTRAGHTARI-------SNIAAHLLRTSPSLLTR-TT--------MSTPRVLRQSLQRLAQHARCYSKTTTAPLRTQPQRLPTAWS--------MAAAGGAR-------------------LLRAASAVLGG-PAGRWLHHAGS MSLLSS-------------------------------------------MLASIS-------------------------------------------MVNNLIWR----IS-------R-------------------PISSRVFSA

Yah1_Tra_hom Yah1_Enc_cun Yah1_Enc_int Yah1_Sac_cer Yah1_Neu_cra Yah1_Hom_sap Yah1_Gia_int Yah1_Tri_vag Yah1_Cry_par

-----------------------------KDE--KLINFIFLDK--TPKE -----------------------------PDRIPEQIRIFFKTM-KQVVP -----------------------------PDKTSGKVGLLFKTM-GKMIP ---------TTTRFLPFSTSSFLNHGHLKKPKPGEELKITFILKDGSQKT TTTQLSASASASARRSLSTSSALQHGHVDPPKPGEELYVTFIDKDNQTHR RAGSSGLLRNRGPGGSAEASRSLSVSARARSSSEDKITVHFINRDGETLT --------------------------IRR-----F-ITFRVVQQ-GVEHT ---------------------------RS------AVKIHWTGK-GCDKI IPYFSKRTLFLSFKRFFHSDPELW-----TKDVHPKIELSFILRDGEKKV

Yah1_Tra_hom  VFSVPGKTLLEVAHANKIDLE--GACEGSLACSTCHVIL-DKKLYNSLEE
Yah1_Enc_cun  AKAVCGSTVLDVAHKNGVDLE--GACEGNLACSTCHVIL-EEPLYRKLGE
Yah1_Enc_int  VNAVYGDTVLETAHKNGVDLE--GACEGNLACSTCHVIL-EEPLYRRLGE
Yah1_Sac_cer  YEVCEGETILDIAQGHNLDME--GACGGSCACSTCHVIV-DPDYYDALPE
Yah1_Neu_cra  LAVSEGDNLLDIAQAHDLEME--GACGGSCACSTCHVIVQDQDMYDRMPE
Yah1_Hom_sap  TKGKVGDSLLDVVVENNLDIDGFGACEGTLACSTCHLIF-EDHIYEKLDA
Yah1_Gia_int  VSGAVGQSLLDAIKAAHIPIQ--DACEGHLGCGTCGVYL-DKKTYKRIPR
Yah1_Tri_vag  VEGHNGETLLKIAERNKLPLP--NACEGNRACATCQVYV-NKG-GDLLNE
Yah1_Cry_par  FNAPKNISLLEAAQHEELDIE--GACEASLACSTCHVIL-DKEIYDELEP

Yah1_Tra_hom Yah1_Enc_cun Yah1_Enc_int Yah1_Sac_cer Yah1_Neu_cra Yah1_Hom_sap Yah1_Gia_int Yah1_Tri_vag Yah1_Cry_par

PSDREYDLLEQAFMPCNTSRLGCQVRVDERLRNSTIKLPRATRNMAVDGPSDKEYDLIDQAFGATGTSRLGCQLRVDKSFENAVFTVPRATKNMAVDGPSDKEYDLIDQAFGITSTSRLGCQLKIDKSFEKTVLTIPRATKNMAVDGPEDDENDMLDLAYGLTETSRLGCQIKMSKDIDGIRVALPQMTRNVNNNDPDDDENDMLDLAFGLTETSRLGCQVHMTKELDGLVVKLPSMTRNLQASDITDEENDMLDLAYGLTDRSRLGCQICLTKSMDNMTVRVPETVADARQSIATKEEAVLLDQVPNPKPTSRLSCAVKLSSMLEGATVRIPSFNKNVLSESD ISDAEYDTLDYAVDLREQSRLACTCVLQTDDGEMDVVIPERCRNIDVSEPSEREEDMLDMAPQVCETSRLACQIKVDERLTKGNIHLPNMTRNFYVDG-

Yah1_Tra_hom Yah1_Enc_cun Yah1_Enc_int Yah1_Sac_cer Yah1_Neu_cra Yah1_Hom_sap Yah1_Gia_int Yah1_Tri_vag Yah1_Cry_par

-----FKP--QPH -----FKP--KPH -----FKP--KPH -----FS----------FKS---------DVG--KTS ILASEEKKRHGQH -----FKKKKSIL -----FKP--SPH

Abbreviations: Tra_hom, Trachipleistophora hominis; Enc_cun, Encephalitozoon cuniculi; Enc_int, Encephalitozoon intestinalis; Sac_cer, Saccharomyces cerevisiae; Neu_cra, Neurospora crassa; Hom_sap, Homo sapiens; Gia_int, Giardia intestinalis; Tri_vag, Trichomonas vaginalis; Cry_par, Cryptosporidium parvum. Residues involved in coordination of the [2Fe-2S] cluster are highlighted in red [7,8]. Acidic residues that interact with Arh1 are highlighted in green [6]. The glutamic acid residue (D to E change) in ThYah1 is highlighted in dark green. Replacing D76 (underlined) with E in human Yah1 resulted in a strong increase in the Km value, whereas replacing D79 with E (as in the T. hominis sequence) only slightly reduced binding affinity [9]. The two conserved acidic regions are in bold. Ferredoxins containing a [2Fe-2S] cluster comprise two major groups, the plant type and the vertebrate type, which exhibit distinctive biochemical and structural properties. The vertebrate type is present in almost all living cells except Archaea. The amino acid residues that distinguish the vertebrate type from the plant type are highlighted in light blue.

c. Monothiol glutaredoxin Grx

Grx3_Tra_hom Grx5_Enc_cun Grx3_Enc_int Grx3_Nos_cer Grx3_Ent_bie Grx5_Sac_cer Grx5_Neu_cra Grx5_Hom_sap Grx5_Gia_int Grx3_Cry_par

MDRSPHELSAEEKKNEDELNNIFFKLQNYTNIIAHENDE-QEIDALL--K MALSQPL-----KMEVLDHNDAFEELKEYDVVVGYESES-NELCNML--R M-------------NGIEDSNPFEELKEYEVVVAYDDEN-KKLSNVM--K MEYNK--------------KKNTFADIDYDNIVLF-------YENEI--MDLKN--------------ICDTNDFVLFYKENNTAFEN---LNTQI-KE MFLP-----------KFNPIRSFSPILR--------------AKTLL--R MLTR-----------SLFSRQLFAAASRPAIAPKAVSSA---FRPVL--F M--------------SGSLGRAAAALLRWGRGAGG-GGL---WGPGV--R MDQINGGL----FPKALALTRYASRGLILGIISAAGAELENQLAGFLTRR MSTTLNSF---VGITDFVLGNCLSAILILSGKEEIDGSL-LELENVLQES

Grx3_Tra_hom Grx5_Enc_cun Grx3_Enc_int Grx3_Nos_cer Grx3_Ent_bie Grx5_Sac_cer Grx5_Neu_cra Grx5_Hom_sap Grx5_Gia_int Grx3_Cry_par

YIDDDEYIIVNLSISPVLKERFMKTYSVEKLPILISYSTNI--------NAGVD-YVEVNLGESERLKAEFMKHFNVSKLPVLIIKGSPVS-------DRGLD-YVEVNLAVSEKLRTEFMKHFRVSELPVLLIEGIPAL-------PPGMEQNIRIINCSKQELRNAVVARYNLETLPALLFYKKVIY-------FDDVA-YVD--LSISKKLEELVSKTFNVNSFPLLIFKNTKI--------YQ-----------------------------------------------YQ-----------------------------------------------AA-----------------------------------------------YPRSL-FLPNAQ--NSVSRFLM---------PVCAPLATVS-------RG FSNVK-FGKISP--SGVNEISAIKQFDVKELPSILLFTCQSLKPYKVIS-

Grx3_Tra_hom Grx5_Enc_cun Grx3_Enc_int Grx3_Nos_cer Grx3_Ent_bie Grx5_Sac_cer Grx5_Neu_cra Grx5_Hom_sap Grx5_Gia_int Grx3_Cry_par

--YNDENVDAY--LKERKVNEDSFIDHKIDKMVGEKKVMVFIKGSPDKPE -GDPSDKIREY--AE----EREGDILRRIQSTVDPKRVTLFIKGSPENPK -DDPSERIKEY--IE----RKEKKSLEKIQSVIDPNKITLFIKGSPEHPR --LKNDNIKNY--VE----DKQLLLEREVKRIINSCKIVLFIKGDLFDPY --TPKDTVENV--L-------YNYYFKFCKEFVCQSKYVFFMKGTIEKPY -------------NR---MYLSTEIRKAIEDAIESAPVVLFMKGTPEFPK -------------NR----FLSDATRQAIDKAVASAPVVLFMKGTPETPQ -------------GS---GAGGGGSAEQLDALVKKDKVVVFLKGTPEQPQ LERTSDLNATLAQSESIKQELKMFILPQIRELLAENPVVLFMKGTPDSPE GYNPSELHTNLEELIKIQNLSIPSQNEKFKILTNFKSLMVFMKGIKEEPY

Grx3_Tra_hom  CKFTKELISHFDELQLKNGKDYSYFNIKLDNKTRNRLKKRNNWPTFPQIY
Grx5_Enc_cun  CGFTKTLMDILYSAGVT-KDQIVYFDILSDEDVRRKLKEINSWPTFPQVY
Grx3_Enc_int  CGFTKSLIEILYGLGVT-RDKIEYLDVLSDEDIRERLKEINRWPTFPQVY
Grx3_Nos_cer  CHFSKEVIQILKDNNVN-LDEIVYYNVLKNKEMAEKIKEVNKWPTFPQLF
Grx3_Ent_bie  CKYSKQLVELCNKKNI---IDIIAFDIFQDNIMREYLKKINNWPTYPMIF
Grx5_Sac_cer  CGFSRATIGLLGNQGVD-PAKFAAYNVLEDPELREGIKEFSEWPTIPQLY
Grx5_Neu_cra  CGFSRASIQVLGLQGVD-PNKFAAFNVLEDAELRQGIKEYSDWPTIPQLY
Grx5_Hom_sap  CGFSNAVVQILRLHGV---RDYAAYNVLDDPELRQGIKDYSNWPTIPQVY
Grx5_Gia_int  CGFSKFASMLLKYNNI----SFVGVDVLDDPALRQGIKLYGNWPTIPQLY
Grx3_Cry_par  CRFAKGLVSLLDSIKV---KNYGHYNIFENEETRQGLKEYHNWPTFPQIC

Grx3_Tra_hom Grx5_Enc_cun Grx3_Enc_int Grx3_Nos_cer Grx3_Ent_bie Grx5_Sac_cer Grx5_Neu_cra Grx5_Hom_sap Grx5_Gia_int Grx3_Cry_par

IDKLFVGGLDTFKKMKEKKIVQKMLFPGDQ-------------EEIE IGGRFIGGLDVVRKMSEKGELRREIQ------------------EII IRGRFIGGLDIVRKMSEKGKLKDELS------------------GII VNGKLIGGCDILKKLNETKELTKIL-------------------NKQ VDGQFIGGLDAFTDIIRC--------------------------DKI VNKEFIGGCDVITSMARSGELADLLEEAQALVPE-----EEEETKDR IDKEFVGGCDIIVSMHQNGELAKLLEEKDVLVKGEEGAAEEQTEKKE LNGEFVGGCDILLQMHQNGDLVEELKKLGIHSALL---DEKKDQDSK VKGELIGGSDIIQQLHESGELRKVCG------------------LPD INGEFIGGLDILNEMHSNGELVNEIPK-----------------DAF

Abbreviations: Tra_hom, Trachipleistophora hominis; Enc_cun, Encephalitozoon cuniculi; Enc_int, Encephalitozoon intestinalis; Nos_cer, Nosema ceranae; Ent_bie, Enterocytozoon bieneusi; Sac_cer, Saccharomyces cerevisiae; Neu_cra, Neurospora crassa; Hom_sap, Homo sapiens; Gia_int, Giardia intestinalis; Cry_par, Cryptosporidium parvum. Monothiol glutaredoxins bind a [2Fe-2S] cluster in a bridging fashion. The iron atoms are coordinated by the cysteine in the active site CGFS (in yellow) of the protein and a cysteine from bound glutathione. Monothiol glutaredoxins from bacteria and Grx3 and Grx4 from yeast form homodimers, and it has been proposed that switching from the dimeric to the monomeric conformation releases the [2Fe-2S] cluster to acceptor proteins. In humans, holo-GLRX5 is tetrameric, whereas the metal-free protein is monomeric [10-13]. The amino acids involved in inter-subunit interaction in human Grx5 are shown in green. The amino acids in pink stabilize a loop in the tetrameric structure that shields the [2Fe-2S] cluster. In light blue and bold: essential residues for the biological activity of yeast Grx5 [14]. The glutaredoxin domain (PF00462) is underlined in blue. The N-terminal thioredoxin-like domain (SSF52833) identified in the E. cuniculi and E. intestinalis glutaredoxins using HHPred (http://toolkit.tuebingen.mpg.de/hhpred) [15] is highlighted in red. A CDD search [16] identified a thioredoxin-like domain (cd02984) in the protein from C. parvum (red). By contrast, no thioredoxin-like domain was identified in the corresponding segment of the T. hominis and G. intestinalis sequences.
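Because the colour highlighting of the CGFS active site does not survive in plain text, the aligned tetrapeptides can be compared directly. The sketch below is illustrative only: it assumes that the alignment block whose rows all begin with a cysteine (CKFT..., CGFT..., CGFS...) holds the active-site column, it uses the first twelve residues of each of those rows, and the names ACTIVE_SITE, ROW_STARTS, mismatches and classify are hypothetical.

```python
# Reference active-site tetrapeptide named in the legend.
ACTIVE_SITE = "CGFS"

# First twelve residues of each row in the alignment block that starts at the
# active-site cysteine (assumption: this column is the aligned active site).
ROW_STARTS = {
    "Grx3_Tra_hom": "CKFTKELISHFD",
    "Grx5_Enc_cun": "CGFTKTLMDILY",
    "Grx3_Enc_int": "CGFTKSLIEILY",
    "Grx3_Nos_cer": "CHFSKEVIQILK",
    "Grx3_Ent_bie": "CKYSKQLVELCN",
    "Grx5_Sac_cer": "CGFSRATIGLLG",
    "Grx5_Neu_cra": "CGFSRASIQVLG",
    "Grx5_Hom_sap": "CGFSNAVVQILR",
    "Grx5_Gia_int": "CGFSKFASMLLK",
    "Grx3_Cry_par": "CRFAKGLVSLLD",
}


def mismatches(tetrapeptide, reference=ACTIVE_SITE):
    """Count positions at which the tetrapeptide differs from the reference."""
    return sum(a != b for a, b in zip(tetrapeptide, reference))


def classify(rows):
    """Map each sequence name to (tetrapeptide, number of mismatches to CGFS)."""
    return {name: (seq[:4], mismatches(seq[:4])) for name, seq in rows.items()}


for name, (site, n_diff) in classify(ROW_STARTS).items():
    print(f"{name}: {site} ({n_diff} mismatch(es) to {ACTIVE_SITE})")
```

In these rows the cysteine at the first position is present in every sequence, while the following three positions vary in several of the microsporidian sequences and in C. parvum.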

d. ABC transporter Atm1

Atm1_Tra_hom1 Atm1_Tra_hom2 Atm1_Tra_hom3 Atm1_Enc_cun1 Atm1_Enc_int1 Atm1_Nos_cer1 Atm1_Nos_cer2 Atm1_Nos_cer3 Atm1_Sac_cer Atm1_Neu_cra Atm1_Hom_sap Atm1_Tri_vag Atm1_Cry_par

MAS----------------------------------------------M------------------------------------------------MQGSNNAYRN---------------------------------------MVICFRSPFK---QVNKYFR-----------------------------MGG----------------------------------------------MKK----------------------------------------------MLE----------------------------------------------MKE----------------------------------------------MLLLPRCPVI-----GRIVR-SKFRSGLIRNHS----------------MAPSIKLSTM-----ATSLHRAHGTSALLRRPR--------LWAPRLSSI MALLAMHSWRWAAA-AAAFE-KRRHSAILIRPLVSVSGSGPQWRPHQLGA MES----------------------------------------------MWAFTQKASN--L--K-ILK---PHPFVLNRST-----------------

Atm1_Tra_hom1 Atm1_Tra_hom2 Atm1_Tra_hom3 Atm1_Enc_cun1 Atm1_Enc_int1 Atm1_Nos_cer1 Atm1_Nos_cer2 Atm1_Nos_cer3 Atm1_Sac_cer Atm1_Neu_cra Atm1_Hom_sap Atm1_Tri_vag Atm1_Cry_par

------------------------------------------------------------------------------------------------------------------------------------------SSYVYDRLCD --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------PVIFTV-SKL-STQRPLLFNSAVNLWNQAQKDITHKKSVEQFSSAPK HATPTIANLRASF-TTSSPRLFAPNGSAKDESK--PAVSTVPKTTGRGPS LGTARAYQIPESLKSITWQRLGKGNSGQFLDAA--KALQVWPLIEKRTCW ---------------------------------------------------KYGILYLGVFATSIRGKRAFFSSESNFAQGNCMHAMTTSTFKNKDLAK

Atm1_Tra_hom1 Atm1_Tra_hom2 Atm1_Tra_hom3 Atm1_Enc_cun1 Atm1_Enc_int1 Atm1_Nos_cer1 Atm1_Nos_cer2 Atm1_Nos_cer3 Atm1_Sac_cer Atm1_Neu_cra Atm1_Hom_sap Atm1_Tri_vag Atm1_Cry_par

-------------I-----TTYEILQIIMTKYVK--DIPLIR-MTIMPTL ----------------------NIHLYIFKRYVI--LNTTLA-LFMPPIL IACTYTDTIKKLCERSPHMSKPFQIFIFVVKEMF--SPPLKPCFVIFTIV --KISASMGGKKKV-----SELQTLSIIVRKYIL--SIPQVR-IIVFPVL ----------KKKV-----SELQTLSIIVRKYIL--GIPQAR-IIVFPVL -------------I-----SNRKIIKEIIYNDLL--KIRMLR-YVIVPII ----------KKSI---YGSDISIMKDLIVEYVI--SKPFIR-TFVIYII -------------K-----TNYEILYDTFVKYCY--NISYIR-YVMFPIL VK-----TQVKKTSKAPTLSELKILKD-LFRYIWPKGNNKVR-IRVLIAL DP--L--AAIDKTAQEQRKADWAIMKE-MSKYLWPKGSWGDK-ARVLLAI HGHAGGGLHTDPKEGLKDVDTRKIIKA-MLSYVWPKDRPDLR-ARVAISL ----------------------DVTYI----------------------EN---KLVNILKKSRFIQEKDSKNIEI-LTKYLWP-KNREYR-KRIIFSL

Atm1_Tra_hom1 Atm1_Tra_hom2 Atm1_Tra_hom3 Atm1_Enc_cun1 Atm1_Enc_int1 Atm1_Nos_cer1 Atm1_Nos_cer2 Atm1_Nos_cer3 Atm1_Sac_cer Atm1_Neu_cra Atm1_Hom_sap Atm1_Tri_vag Atm1_Cry_par

FLMFVARIMEVKVSEIVQNASLEFS-------G-----------YREGGT TCIILSKYLKIYASSILKDIGQAIE-------D----------SHTVSRL AIIFITELLLYKIRIENLRSLSSCK-------NYSTLRSFLF-YTTICGI LSIFAGKYFEVKVASCISHISDGLD-------S----------EGIPRGR LSIFAGKYFEVKVASCISYISEGLD-------L----------EEVPRSR FLTIFYTVLEIKISTLVKNLNQSIA-------N----------RKGEHTA FGTFAVCTFNVLTVEATRKLTSDIT-------D----------NKDLSQS LSTIIACYLEVQASNISKRIAEDFE-------N----------KINAGKS GLLISAKILNVQVPFFFKQTIDSMNI------AWDDPTVALP--AAIGLT GLLVGGKVLNVQVPFYFREIVDSLNI------DFSTTGGSVT--AVAGAM GFLGGAKAMNIVVPFMFKYAVDSLNQMSGNMLNLSDAPNTVA--TMATAV --------------------------------DRNSIGDSLN-------LSLGVAKLATIQVPLLLSRLIDNVGA-----ISSSNLSLSLTNKLNVLFL TMD1

Atm1_Tra_hom1 Atm1_Tra_hom2 Atm1_Tra_hom3 Atm1_Enc_cun1 Atm1_Enc_int1 Atm1_Nos_cer1 Atm1_Nos_cer2 Atm1_Nos_cer3 Atm1_Sac_cer Atm1_Neu_cra Atm1_Hom_sap Atm1_Tri_vag Atm1_Cry_par

FK--KYMIVGILSSLLIE-LQGFIFKSSVQRAYRTALKSS--LREYLLLII--KLGACYLLYAFFNE-IYDFIAASPIQRAARHASNDF---LRNCLMIYPTKFLVYQVIYSDYLERLSCRAFYTVLSRRWEITLQEEGLNSRFKNLIA--MFLAVSLMGIMFTE-LQGFVFVSAVQYVYRYTLRTT--FEYFIQMII--MFLVISLTGIIFTE-LQGFVFVSAIQYVYRYTLRTT--FEYFIRMLE--RFVLQSLLVAVLTE-LNGFIFTGVVQYIYRNTAKST--FKGFISLLL--YFAVLTIVSITMSE-LNNFIFVTPVQHVFRLTGKNS--FKNFINMIF--KYILFLTSSIVLRQ-INDIVFSGPIQFLYKIVGVEA--FYHYISLIL--CYGVARFGSVLFGE-LRNAVFAKVAQNAIRTVSLQT--FQHLMKLIL--GYGAARVGAVVSQE-LRNAVFASVAQKAIRKVARNT--FEHLLNLLI--GYGVSRAGAAFFNE-VRNAVFGKVAQNSIRRIAKNV--FLHLHNL--------------LLNE--------DPAQISSVFN------FSKY---IS--SYGIARISSSGFNE-LRNALFSEVSQYACKDLSLKA--FHHFHNVS TMD2

Atm1_Tra_hom1 Atm1_Tra_hom2 Atm1_Tra_hom3 Atm1_Enc_cun1 Atm1_Enc_int1 Atm1_Nos_cer1 Atm1_Nos_cer2 Atm1_Nos_cer3 Atm1_Sac_cer Atm1_Neu_cra Atm1_Hom_sap Atm1_Tri_vag Atm1_Cry_par

NWSEFKMKGMGEITASIERRSSAVSEILDVFIINLLPVFFVLFLAVLKIY ERVRLYECSSGEVGRTLVRSASAVSDLIDITLLEILPLLITFVLALTEMY GWLVESVTDNGIFQSFVHRGVKGMTNLIRQLVLSLLARIFSYFFVMKETN ETETFESYGSGTIQSIITRKSKAISDFVEVSVQNLFPVVASLAFVGLEAY ETEMFESYGSGTIQSIITRKSKAISDFVEVSIQNLFPVVASLTFVGLEVY SPKNFSTIGGGEIQTIIDRKSKSASELIEVFLTSLLPICLKLLFALIAVI ELSRYNKIGCGEIQTIIDRESKAISELIEVLFVNIINIFFTVILACSSIY NLENFNKIGSGEIQMIIERKSRAYGDILELTTLSAVPTCIVVIMSFFSVY DLGWHLSRQTGGLTRAMDRGTKGISQVLTAMVFHIIPISFEISVVCGILT DLSFHLSKQTGGLTRAIDRGTKGISFLLTSMVFHIVPTALEISMVCGILT DLGFHLSRQTGALSKAIDRGTRGISFVLSALVFNLLPIMFEVMLVSGVLY ----MQLRNIGQIIA--------------------------FVFTGFTIS NLSFIQSHRSGELLTIITRGFKSVSKLLNIMIFQIIPTTAEFLMVLGILL TMD3

Atm1_Tra_hom1 Atm1_Tra_hom2 Atm1_Tra_hom3 Atm1_Enc_cun1 Atm1_Enc_int1 Atm1_Nos_cer1 Atm1_Nos_cer2 Atm1_Nos_cer3 Atm1_Sac_cer Atm1_Neu_cra Atm1_Hom_sap Atm1_Tri_vag Atm1_Cry_par

AIMGFTSSLIILISLITYTTVTIKMAVWRNEIRKKLNLSINESNNKLIDI RKFGYAAFMYNVMMLIVYIVITFAITRYRMRYRRRNNMYENRAHSKCLEC DQLGTTILLAIIFMIVITLLTNIIIIYILIYYRRRYNFLRAQLDNHVSEC LRLGVLASLIIILAVGVYGTMTISIALRRNRIRGALNNAENTASNIVYDT LKLGVLASFIVALAVAAYGFMTISIALRRNRIRGALNSAENAASNIVYDT KNMGGLAGFIMFVCVSCYATVTILIAQWRSNIRRELNNSENRSSNKLQDG TNLGLTNMFIILITLLVYILATAKIVHWRTGIRKEYNCAQQRCSNHLHDS KNLGRDALYVMLATSALYVYFTFVFSIWRNNIRRQYNKSQDKLSNKLQDF YQFGASFAAITFSTMLLYSIFTIKTTAWRTHFRRDANKADNKAASVALDS YNFGWQYAALTALTMVSYTAFTILTTAWRTKFRRQANAADNKASTIAVDS YKCGAQFALVTLGTLGTYTAFTVAVTRWRTRFRIEMNKADNDAGNAAIDS APLTLFAMFMSIFT--TYIFHKIR-DYAVNNIRAKFK-VNAAAMTVCLET HKVGSEVALITLATMVAYMDFTRRITHKRTIYRKNMNTSEQKSNGLLSDS TMD4

Atm1_Tra_hom1 Atm1_Tra_hom2 Atm1_Tra_hom3 Atm1_Enc_cun1 Atm1_Enc_int1 Atm1_Nos_cer1 Atm1_Nos_cer2 Atm1_Nos_cer3 Atm1_Sac_cer Atm1_Neu_cra Atm1_Hom_sap Atm1_Tri_vag Atm1_Cry_par

LSNYESILAFNNQNLELYKYDNKLATSEKHYVKLWRTFYLLNFLQRFVLC IRNVDTITVYKTVQFELNQYDDINKKVQFYSSRQYQSLALLNLTQKVILY LSNHLLVTYCHKEMDEYQRYKMKVKNYRTAVVMLESVEHVLEILLYLITN LSNHESVVSFNNYDIETRRYDAKLMEIERFGTNLFRGLYILNMLQKLIFA LSNHESVMSFNNYCIEVRRYDGKLMEIERFGTNLFRGLYILNMLQKLIFA LNNHETIVSFGTTDLEVDEYDQFLKINASNSNRLWRALYILNLSQRAIFL LINHETILAYKTEEEESLKYEKYVSEVESECNRIWRSLSFLYLVNKIIFA LANHETIKAYNMEEETITYFDENQKPVEYFGVKSHRILFSLLYIQKMTFA LINFEAVKYFNNEKYLADKYNGSLMNYRDSQIKVSQSLAFLNSGQNLIFT LINYEAVKYFNNEAYEVGRYDKALAQYEKNSIKVATSLAFLNSGQNIIFS LLNYETVKYFNNERYEAQRYDGFLKTYETASLKSTSTLAMLNFGQSAIFS ISNPRTVYFFDQEERSINKYYTIVDRVCQ-LERIFHGIFSISFGGERTFN LINAETLKYLNGEKYIYDLYSKYQEIYKNSNVKVQTSLAFLNFGQNFIFT

Atm1_Tra_hom1 Atm1_Tra_hom2 Atm1_Tra_hom3 Atm1_Enc_cun1 Atm1_Enc_int1 Atm1_Nos_cer1 Atm1_Nos_cer2 Atm1_Nos_cer3 Atm1_Sac_cer Atm1_Neu_cra Atm1_Hom_sap Atm1_Tri_vag Atm1_Cry_par

VQTG-MIIYVG----MCNKITSDQFVLYLSISKILASNLDKLGYMYSRFT IFMV-TYLYR-----VRESLGNDQLLQYFSICSTIIEELSNLGCIYHRFS ITKF-VVYYV-IYK-NNDVYPVHLVMFVMQKIDVIEQNTKWAGSFYEKLR FLNA-SVIALGAYGVLSSKMDGKMLIFYVTTSRILLMNLNNLGFTYCRFT FLNS-SMIALGVYGLLSSKMDGRVLIFYVTTSRILLINLNNLGYTYCRFT LQSF-FIIYAGISGILSQNMSSNQLIYLMSITSTVTINLSNLGYLYTRYT LQCF-IVITLGNYGILTAKLSARDVVFYIGINRTLYSSFGQLGFFYSRYT IQAI-IIITLGSFGYFPIKLSSQQLVFYISISKTLTNSLSEMGMLYTRYV TALT-AMMYMGCTGVIGGNLTVGDLVLINQLVFQLSVPLNFLGSVYRDLK SALT-VMMYMGAHGVATGQLTVGDLVLINQLVFQLSVPLNFLGSVYRELR VGLT-AIMVLASQGIVAGTLTVGDLVMVNGLLFQLSLPLNFLGTVYRETR EGTFAVVLSFGGYLVISDRMSGGNLVAMLRIISSFSFLFGLLMGTANNEA GGLL-SAMLITTNKVLAGTLPIGSIVLVTSLLFQLAIPLNFIGMIYRESK TMD5 TMD6 ★

Atm1_Tra_hom1 Atm1_Tra_hom2 Atm1_Tra_hom3 Atm1_Enc_cun1 Atm1_Enc_int1 Atm1_Nos_cer1 Atm1_Nos_cer2 Atm1_Nos_cer3 Atm1_Sac_cer Atm1_Neu_cra Atm1_Hom_sap Atm1_Tri_vag Atm1_Cry_par

AAILNAKMSF-LDTV-----LPKKLYPIR--------------------KAIVDLNTPFVRDMAETNQTYDRKPSVNLTEEPRTKRNTHVGNAVDQYHA KAILDAEFAYDFVSSSKN-LKASHDSDLIIAGIGQSRDTCTDHGLSDSHS EAMLNAREVLSEDYDLK----TSARLSVA--------------------EAMLNAREVLSEDYDLK----TNENISMV--------------------QAIINARSTYDTMLDIK---KENNKFKIT--------------------QAILNIRTSFKPELIKE----EPNLVDVN--------------------QGFLNAKSGYYEFKEDT----QTDKIRLI--------------------QSLIDMETLFKLRKNEVK-IKNAERPLML--------------------QSLLDMETLFNLQKVNVT-IKEQPNAKPL--------------------QALIDMNTLFTLLKVDTQ-IKDKVMASPL--------------------RSMEAANRVFNLLENTQT-VDLKKGLEPE--------------------LTLIDLSKLNEYLTIKPK-NSSNSQCRTI---------------------

Atm1_Tra_hom1 Atm1_Tra_hom2 Atm1_Tra_hom3 Atm1_Enc_cun1 Atm1_Enc_int1 Atm1_Nos_cer1 Atm1_Nos_cer2 Atm1_Nos_cer3 Atm1_Sac_cer Atm1_Neu_cra Atm1_Hom_sap Atm1_Tri_vag Atm1_Cry_par

------------YF-----------------------------------RGHEETNNDGRNSNNYAQKGKNICREENINTDQESKNDIQENGRAIPFIN DEGEQPAAKRYLETTVTLPSKEIYYDENENVTNTAPSR---------NTK ------------RF-----------------------------------------------RF-----------------------------------------------EF-----------------------------------------------EF-----------------------------------------------DF-----------------------------------------------PEN----------------------------------------------TLT----------------------------------------------QIT----------------------------------P ------------SL-----------------------------------------------QLNPKTKVLDFYKEMNL--------------NGTGSSS

Atm1_Tra_hom1 Atm1_Tra_hom2 Atm1_Tra_hom3 Atm1_Enc_cun1 Atm1_Enc_int1 Atm1_Nos_cer1 Atm1_Nos_cer2 Atm1_Nos_cer3 Atm1_Sac_cer Atm1_Neu_cra Atm1_Hom_sap Atm1_Tri_vag Atm1_Cry_par

-DKKIVFRNVALPLSSENILNTSNDVFTYAISDNYIFKNMSFEIKKGEKI TKPLLQFQNVTIKH-----------------KNTPILTNLTFNVAQYAKV GTLIMQFRDFTIIM-----------------NQKLLFKPLNLSICKNDKI -RKNIVFNDVHSYY-----------------GDKKVLRGVNLTIEKGDKV -GKNIVFKNVRLYY-----------------GDKKILNDVNLTIKKGDKV -KESIKFNNLSFGY-----------------ESRQIFNKINLEIYKGEKV -NNDIRLENASFNY-----------------YSKKILTDINILIKKGEKV -NEKLEFRNVSFSY-----------------LNKPILVNANFVINKGERV VPYDITFENVTFGY----------------HPDRKILKNASFTIPAGWKT RGGEIEFKDVTFGY----------------HPESPILRDLSLTIPAGKKV QTATVAFDNVHFEY----------------IEGQKVLSGISFEVPAGKKV -KGDIEFKNVWFKYP---------------TRDQWVLKNVSFKINSGDIV KKNSIKLENVSFGFPS-TFGNEHTYIETSNSSDDLVVNDLSLEIPLGKRM

Atm1_Tra_hom1 Atm1_Tra_hom2 Atm1_Tra_hom3 Atm1_Enc_cun1 Atm1_Enc_int1 Atm1_Nos_cer1 Atm1_Nos_cer2 Atm1_Nos_cer3 Atm1_Sac_cer Atm1_Neu_cra Atm1_Hom_sap Atm1_Tri_vag Atm1_Cry_par

ALIGPNGIGKSNFLKMLLKF-NEYTGSIKIDDDELCTIDNYSLRDLISYV AIVGPNGAGKSSIIKAILKL-TPYSGQI-------QKIENTR----LTYT LIKGPNGIGKSSIIRAIFGM-IDYIGDVTIKGVPLQNINTSKLCAMMAIC AIVGSNGSGKSTILKTLLKF-NSYQGSICIDGINIKAIENGSFRRTIGYV AIVGSNGSGKSTILKTFLRF-NRYQGSICIDGISIDAIENGSFRRIIGYV AIIGKNGSGKSTLLKLLLKFDEDYKGSILIDDIDIKKIKDEFYRNLLGYI AIIGKNGSGKTSLIKMIMRF-ENYGGNIYLDNRDIKEISNSSYRSLISFA AIIGKNGTGKSTIIKLLMKF-YKYDGDILIDDTEIDNISDRSYRSLISYA AIVGSSGSGKSTILKLVFRFYDPESGRILINGRDIKEYDIDALRKVIGVV AIVGPSGCGKSTLLRLLFRFYDPQKGAIYIDGQDIRSVTLESLRRAIGVV AIVGGSGSGKSTIVRLLFRFYEPQKGSIYLAGQNIQDVSLESLRRAVGVV AFVGHSGCGKSTIVQLLLRFYDVNSGEVLIDGRNIKEYSPSFIHRNIGVV GFVGSSGSGKTTLAKLIYRIFEPNSGKIRIFGKEIEDYEINEYRNCFAVL

Atm1_Tra_hom1 Atm1_Tra_hom2 Atm1_Tra_hom3 Atm1_Enc_cun1 Atm1_Enc_int1 Atm1_Nos_cer1 Atm1_Nos_cer2 Atm1_Nos_cer3 Atm1_Sac_cer Atm1_Neu_cra Atm1_Hom_sap Atm1_Tri_vag Atm1_Cry_par

PQNSYLVSGTVKENIKYGNLVATDEEMIELCRSLNFHESFVRLSSGYETC SQEPQLFADTVLYNVAYGS-KAKLRCIIAIAMKMGVHKDILRMR-GYGSL PQSPLVFKNTIRFNLGYGN-CATDEGMTKMCKHMNLFDKLVEMEDGLDSL PQNSSLFNETVMYNIKYGSPSVSDYAVVELAKRFNIHDSIMRLERGYFTN PQNSSLFNETVMYNIKYGNPNVSDYTVVELAKRFNIHDSIMRLEKGYFTN PQNTFLFNESVKYNIKYGSFGISDEDIFALCKEFGLYDVFMNLENGFDTN PQTSFLFNETVYYNLTYGKEIFDKEEVIKISKKLCVHDSINNLEDEYNTH TQNTFLFDNTVTYNIFYGTKNVTEKEVLELAKKIGVLESIQEFKDGFSTS PQDTPLFNDTIWENVKFGRIDATDEEVITVVEKAQLAPLIKKLPQGFDTI PQDTPLFNDTVEHNIRYGNLSATPEQVIEAAKAAHIHEKIISWRDGYNTK PQDAVLFHNTIYYNLLYGNISASPEEVYAVAKLAGLHDAILRMPHGYDTQ QQDSALFTLSVRDNILYGKTDSTNEDVENAAKVAFAHNFIIKLPHQYDSM PQEVLLLNMSIIDNLKIANNNATLDEIKSACKLAGVHENILKMKNGYETI

Atm1_Tra_hom1 Atm1_Tra_hom2 Atm1_Tra_hom3 Atm1_Enc_cun1 Atm1_Enc_int1 Atm1_Nos_cer1 Atm1_Nos_cer2 Atm1_Nos_cer3 Atm1_Sac_cer Atm1_Neu_cra Atm1_Hom_sap Atm1_Tri_vag Atm1_Cry_par

IGENNSVLSGGEKQKVAIARAML--------------------------LTENARNLSGGERQKITLLRNVVYGICACGDGVDCAHCEGCRDGSVQEWS ILSGGENFSGGEKKRLCVARAAL--------------------------VGECGRHISGGERQKIVILRALL--------------------------VGEAGRHISGGERQKIIILRALL--------------------------VGERGRLLSGGEKQKILLMRTML--------------------------VGDKGHKLSGGERQKVILLRSAL--------------------------VGERGRFLSGGERQKIMLMRALL--------------------------VGERGLMISGGEKQRLAIARVLL--------------------------VGERGLMISGGEKQRLAVSRLIL--------------------------VGERGLKLSGGEKQRVAIARAIL--------------------------VGEKGTTLSGGQRQRIAIARAVL--------------------------VGERGCSLSGGEKQRLGFARMLI---------------------------

Atm1_Tra_hom1 Atm1_Tra_hom2 Atm1_Tra_hom3 Atm1_Enc_cun1 Atm1_Enc_int1 Atm1_Nos_cer1 Atm1_Nos_cer2 Atm1_Nos_cer3 Atm1_Sac_cer Atm1_Neu_cra Atm1_Hom_sap Atm1_Tri_vag Atm1_Cry_par

-------------------------------------------------DEEERNVKISVADRESITQFNDDHQAFNSDFVPYEVTMQDDTYQNATEDR ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Atm1_Tra_hom1 Atm1_Tra_hom2 Atm1_Tra_hom3 Atm1_Enc_cun1 Atm1_Enc_int1 Atm1_Nos_cer1 Atm1_Nos_cer2 Atm1_Nos_cer3 Atm1_Sac_cer Atm1_Neu_cra Atm1_Hom_sap Atm1_Tri_vag Atm1_Cry_par

---------------------------KNAE--IYLFDEPTANLD----GSTTNYNMMYDVDEISSKNEETFYPSAKNSPSILMLFDEATSAMD-------------------------------KRCE--IYWFDEPTAGLD-------------------------------KRSE--ILVMDEPTSNLD-------------------------------RRPE--ILAMDEPTSNLD-------------------------------RNKE--IMALDEPTAALD-------------------------------KNSP--IIILDEPTAALD-------------------------------KNSE--IVLLDEPTSALD-------------------------------KNAR--IMFFDEATSALD-------------------------------KDPP--LLFFDEATSALD-------------------------------KDPP--VILYDEATSSLD-------------------------------KNPS--LLITDEATAALD-------------------------------KKSP--IWILDEPTSALDLINHN

Atm1_Tra_hom1 Atm1_Tra_hom2 Atm1_Tra_hom3 Atm1_Enc_cun1 Atm1_Enc_int1 Atm1_Nos_cer1 Atm1_Nos_cer2 Atm1_Nos_cer3 Atm1_Sac_cer Atm1_Neu_cra Atm1_Hom_sap Atm1_Tri_vag Atm1_Cry_par

-------------------------------------------KKSEETF -------------------------------------------KSSEYDV -------------------------------------------STNMHAV -------------------------------------------KEAEIDI -------------------------------------------KEAEIDI -------------------------------------------KESEKKI -------------------------------------------KKAEYEI -------------------------------------------KKSELET -------------------------------------------THTEQAL -------------------------------------------THTEQAL -------------------------------------------SITEETI -------------------------------------------SVSEKKV FMVKMLSFLHEYSLSTSNSVNLNDKYDLGSKYKDIFSLIPYVKDKKEIQS


Atm1_Tra_hom1 Atm1_Tra_hom2 Atm1_Tra_hom3 Atm1_Enc_cun1 Atm1_Enc_int1 Atm1_Nos_cer1 Atm1_Nos_cer2 Atm1_Nos_cer3 Atm1_Sac_cer Atm1_Neu_cra Atm1_Hom_sap Atm1_Tri_vag Atm1_Cry_par

FKRLAEDT-------QNKTVIVILHNLELLDYFDRIMFLQKDRIQEIKRE IERLFDVC-------SECTVLMVIHNLTILHLFDTIIYVDKE--LEIGSF VDFITR---------MKGTVIAIMHTDEYDRCFDQVIYLERM-------IRNIIDSE-------GSVTVIAIVHNLDLLPFFNKVCFVDKGSAKMISQT IKNIIDFE-------SSVTVMAIVHNLDLLPLFNKVCFVDRGSVRMIDQT LEIILQQ--------EDRTVLMIIHNLELINKFDKIIYIDENNIEVYKND LKSLISNN-------LEKTIIIVLHNLDLLHLFDKVLSIKNKTVSL---MTYIFNEF-------KYHTFVIIVHNLELLALFDKILFVNGNEVTMIEDI LRTIRDNF-----TSGSRTSVYIAHRLRTIADADKIIVLDNGRVREEGKH MENINAILKGLGQKGEKKTSLFVAHRLRTIYDSDLIIVLKEGRVAEQGTH LGAMKDVV-------KHRTSIFIAHRLSTVVDADEIIVLDQGKVAERGTH EKALRSVM-------SSRTSIIIAHRLGTIRCASHIFVLDDGEVVEEGSH IKKLIDDIV-----KLPLTIIVIAHRLSSVRNFDLIAYLEEGNVKEVGNH

Atm1_Tra_hom1 Atm1_Tra_hom2 Atm1_Tra_hom3 Atm1_Enc_cun1 Atm1_Enc_int1 Atm1_Nos_cer1 Atm1_Nos_cer2 Atm1_Nos_cer3 Atm1_Sac_cer Atm1_Neu_cra Atm1_Hom_sap Atm1_Tri_vag Atm1_Cry_par

EALKLMNSTKNEKSTKNAQEM-------------------------------DRLMQKKG-NFYLFYERM-----------------------------------------------------------------------------NGAARSIAERL-RDC----------------------------------GTSSESAAEGL-RSY----------------------------------IEGQKKFSTNL-QYFLS------------------------------------HKIEENI-----------------------------------------NKKLEENSKYFA-------------------------------------LELLAMPGSLYRELWTIQEDLD-----------------------HL ---RELMERNG-VYAQLWRAQEMLMTEEGEVS--------------KKGE ---HGLLANPHSIYSEMWHTQSSRVQNHDNPKWEAKKENISKEEERK-KL ---DELISRRG-VYYELVKI--------------------------------DQLIENKM-QYYQLWNKQ-----------------------------

Atm1_Tra_hom1 Atm1_Tra_hom2 Atm1_Tra_hom3 Atm1_Enc_cun1 Atm1_Enc_int1 Atm1_Nos_cer1 Atm1_Nos_cer2 Atm1_Nos_cer3 Atm1_Sac_cer Atm1_Neu_cra Atm1_Hom_sap Atm1_Tri_vag Atm1_Cry_par

---CDT------SEEQ ----RK------ERRD -------------------GM------LGVK ----EI------PGVK -----Y------KLGK ----IS------TLNL ----DI------LNNE ENELKD------QQEL KEEVGE------KKEA QEEIVNSVKGCGNCSC ------------QLDA ---HID------SILK

Abbreviations: Tra_hom, Trachipleistophora hominis; Enc_cun, Encephalitozoon cuniculi; Enc_int, Encephalitozoon intestinalis; Nos_cer, Nosema ceranae; Sac_cer, Saccharomyces cerevisiae; Neu_cra, Neurospora crassa; Hom_sap, Homo sapiens; Tri_vag, Trichomonas vaginalis; Cry_par, Cryptosporidium parvum. Atm1 proteins are half-size transporters that belong to subfamily B of the ABC transporter family17,18. The structure of these proteins includes an N-terminal domain with six transmembrane domains (TMD1-6) followed by a nucleotide-binding fold (NBF). The predicted TMDs are labelled in brown. These were predicted with the TOPCONS web server: http://topcons.cbr.su.se 19.


The motif for the ATP binding box A (Walker A motif; G-X-X-G-X-G-K-S/T-X-X-X-X-X-I/V) is labelled in red. Note that in some cases the I/V residue is replaced by another hydrophobic amino acid. The mutation K475M within this motif resulted in the complete loss of Saccharomyces cerevisiae Atm1 function20. The conserved basic amino acid that precedes the Walker A motif is labelled in blue. The ATP binding box B (Walker B motif; Φ-Φ-Φ-Φ-D, where Φ represents a hydrophobic residue) is labelled in pink. The Q-loop of the conserved ATP-binding motif is labelled in green. The conserved glutamate that acts as a catalytic base is labelled in dark green21. The ABC signature motif (LSGG), also called the C-loop, which is present in the nucleotide-binding fold, is labelled in grey22. The D-loop (SALD), which interacts with the Walker A motif, is labelled in teal23. The conserved histidine that is proposed to be involved in the catalytic reaction is labelled in light grey24. The residues identified in Saccharomyces cerevisiae that interact with glutathione are labelled in yellow. The star (★) corresponds to the human Atm1 E433K mutation found in patients with X-linked sideroblastic anemia and cerebellar ataxia (XLSA/A)25.
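The motif definitions in this legend can be written as simple pattern searches. The following is a minimal Python sketch, not the procedure used to annotate the figure. The regular expressions are literal transcriptions of the patterns quoted above, the set of residues treated as hydrophobic (Φ) is an assumption, and `find_motifs` is a hypothetical helper name. The two fragments were transcribed from the Sac_cer and Cry_par rows of the Atm1 alignment blocks above with gap characters removed, assuming the rows follow the order of the labels.

```python
import re

# Assumption: residues counted as hydrophobic (the legend's "Φ").
HYDROPHOBIC = "AVLIMFWYC"

# Literal transcriptions of the motifs named in the legend.
MOTIFS = {
    "Walker A": re.compile(r"G..G.GK[ST].{5}[IV]"),
    "Walker B": re.compile(rf"[{HYDROPHOBIC}]{{4}}D"),
    "C-loop": re.compile(r"LSGG"),
    "D-loop": re.compile(r"SALD"),
}


def find_motifs(seq):
    """Return {motif name: [(0-based start, matched text), ...]}."""
    ungapped = seq.replace("-", "")  # drop alignment gap characters
    return {
        name: [(m.start(), m.group()) for m in pattern.finditer(ungapped)]
        for name, pattern in MOTIFS.items()
    }


# Fragments spanning the C-loop, Walker B and D-loop regions only;
# the Walker A region lies upstream and is not part of these fragments.
ATM1_FRAGMENTS = {
    "Atm1_Sac_cer": "VGERGLMISGGEKQRLAIARVLLKNARIMFFDEATSALD",
    "Atm1_Cry_par": "VGERGCSLSGGEKQRLGFARMLIKKSPIWILDEPTSALDLINHN",
}

for label, fragment in ATM1_FRAGMENTS.items():
    print(label, find_motifs(fragment))
```

On these fragments the literal LSGG search finds the Cry_par signature but returns nothing for Sac_cer, which reads ISGG at the aligned position. Strict patterns of this kind are therefore only a starting point and will miss conservative substitutions.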


e. P-loop NTPase Cfd1 and Nbp35

Cfd1_Tra_hom Cfd1_Enc_cun Cfd1_Enc_int Cfd1_Nos_cer Cfd1_Ent_bie Cfd1_Sac_cer Cfd1_Neu_cra Cfd1_Hom_sap Cfd1_Tri_vag Cfd1_Ent_his Nbp35_Tra_hom Nbp35_Enc_cun Nbp35_Enc_int Nbp35_Nos_cer Nbp35_Ent_bie Nbp35_Sac_cer Nbp35_Neu_cra Nbp35_Hom_sap Nbp35_Tri_vag Nbp35_Gia_int Nbp35_Ent_his Nbp35_Cry_par

M------------------------------------------------M------------------------------------------------M------------------------------------------------M------------------------------------------------MP-----------------------------------------------MEE----------------------------------------------MS-----------------------------------------------MEA----------------------------------------------M------------------------------------------------MTE----------------------------------------------MSK------------------------------------------CPGIH MGES-----------------------------------------CPGVS MGES-----------------------------------------CPGVS MQNN------------------------------------------YKCP MNN------------------------------------------CSNVD MTEIL-----PHVNDEVL--------------PAEYELNQPEPEHCPGPE MAPSLEAEPESVASVLAN--------------PQKPQLVAPEPEHCPGPE MEEV--------------------------------------PHDCPGAD MSCS---------------------------------------------MDCIAYPPRSHRENKMPCCGNSGNGPCACHSGANGVESDLPKSGNKPVSD C------------------------------------------------IQI----------------------------------------GNCVGVD

Cfd1_Tra_hom Cfd1_Enc_cun Cfd1_Enc_int Cfd1_Nos_cer Cfd1_Ent_bie Cfd1_Sac_cer Cfd1_Neu_cra Cfd1_Hom_sap Cfd1_Tri_vag Cfd1_Ent_his Nbp35_Tra_hom Nbp35_Enc_cun Nbp35_Enc_int Nbp35_Nos_cer Nbp35_Ent_bie Nbp35_Sac_cer Nbp35_Neu_cra Nbp35_Hom_sap Nbp35_Tri_vag Nbp35_Gia_int Nbp35_Ent_his Nbp35_Cry_par

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ERI ----------------------------------------------KKNE -------------------------------------QEIGVPAASLAGI ----------------------------------------------LAKV ----------------------------------------AAEPGNLAGV ------------------------------------------------ST -------------------------------------LNSDRNFVGVDHV SE--------RAGQAEACASCPNSTYCQQTTN---SSTITKSRIARNTAG SK--------DAGKAEECKGCPNVGYCSQPVQ---QDPDIKAIQENLSGV SK--------DAGKAKECKGCPNASYCSQPAQ---PDPDIKIIQENLRGV TS--------NFGKSEQCNECPNQSICGT-VK---PNDSLPLISANVSHF KK--------------QCENCPNRDNCYG-N--C-EDEDINLIKNKLQCF SD--------MAGKSDACGGCANKEICESLP--KGPDPDIPLITDNLSGI SQ--------QAGTADSCAGCPNQAICATAP--KGPDPDIPLITARLSGV SA--------QAGRGASCQGCPNQRLCASGA-GATPDTAIEEIKEKMKTV ---------------GNCGSCSHAGTCSSHGTPEALQGALEECKTVLENV APTPEQISLKGECAPDKCSGCPARGACSSRGA---DSSTSIAIAERIQHV ---------------------------GGG--NNGPDRELEEIIEKLKGI SP--------DAGIADSCAGCPNALICAS-G--Q-AKKKPTENIENLSKI


Cfd1_Tra_hom Cfd1_Enc_cun Cfd1_Enc_int Cfd1_Nos_cer Cfd1_Ent_bie Cfd1_Sac_cer Cfd1_Neu_cra Cfd1_Hom_sap Cfd1_Tri_vag Cfd1_Ent_his Nbp35_Tra_hom Nbp35_Enc_cun Nbp35_Enc_int Nbp35_Nos_cer Nbp35_Ent_bie Nbp35_Sac_cer Nbp35_Neu_cra Nbp35_Hom_sap Nbp35_Tri_vag Nbp35_Gia_int Nbp35_Ent_his Nbp35_Cry_par

-LFISVVSGKGGVGKSTIAALIAAKLSKQ--APTLLLDFDICGPSIGNIF -SRIAVMSGKGGVGKSSVSIMLSTVLSEK--GRTLLLDFDLCGPSIASGF -VMVAVMSGKGGVGKSSISIMLSTAMSER--GKTLLLDFDLCGPSVASGL PQKIAVMSGKGGVGKSSISILLSTILSEK--HKCLLLDFDLCGPSCFSSL PKIYCILSGKGGVGKSAVAAFLALQLKKN--LKVLFIDFDICGPSAAIYF KHIILILSGKGGVGKSSVTTQTALTLCSMG-FKVGVLDIDLTGPSLPRMF KHIVLVLSGKGGVGKSSVTTQLALSLSLAG-HSVGVLDVDLTGPSIPRMF RHIILVLSGKGGVGKSTISTELALALRHAG-KKVGILDVDLCGPSIPRML QNFILVMSGKGGVGKSTTAANIARAYAAKY-GKVGLLDLDLTGPSIPTLF KNVILVLSGKGGVGKSTIATVLARSFALAG-KKTGILDIDLCGPSIPKMM KKIIAVMSGKGGVGKSTMCMQIAHALK-----KCCVLDFDVSGPSIAKMS KAVIAVMSGKGGVGKSTVTRNIAELMSSRG-IATCILDLDLSGPSIPRLT KTIVAIMSGKGGVGKSTVVRNIAESVSSRG-ITTCILDLDLSGPSIPRLT KLIFSIFSGKGGVGKSTITRNIAEFLSLKN-YKVLLLDLDLSGPSIPKMT KLILCIMCGKGGVGKSLLSVILAQYFSEK--FKTILIDLDLAGSSIPRLT EHKILVLSGKGGVGKSTFAAMLSWALSADEDLQVGAMDLDICGPSLPHML KHKILILSGKGGVGKSTFTSLLAHAFATNAEQTVGVMDTDICGPSIPKML KHKILVLSGKGGVGKSTFSAHLAHGLAEDENTQIALLDIDICGPSIPKIM THKILILSGKGGVGKSTLTYILTKYLAK-T-KKVGVLDLDLCGPSIPILF GRIILVLSGKGGVGKSTLATQLAFFLADTMGKYVGLLDLDICGPSIPTMT KHKYVILSGKGGVGKSTFATQFSWVLSED--KQVGLCDYDICGPSIPQMF KNIILVLSGKGGVGKSTISSQISWCLSSKK-FNVGLLDIDICGPSAPKMM Walker A motif

Cfd1_Tra_hom Cfd1_Enc_cun Cfd1_Enc_int Cfd1_Nos_cer Cfd1_Ent_bie Cfd1_Sac_cer Cfd1_Neu_cra Cfd1_Hom_sap Cfd1_Tri_vag Cfd1_Ent_his Nbp35_Tra_hom Nbp35_Enc_cun Nbp35_Enc_int Nbp35_Nos_cer Nbp35_Ent_bie Nbp35_Sac_cer Nbp35_Neu_cra Nbp35_Hom_sap Nbp35_Tri_vag Nbp35_Gia_int Nbp35_Ent_his Nbp35_Cry_par

PASGK-VVKTRT-GLKPLQVNN-------SS-LYILSMSSLI-KNDSSVI GAKEN-VYKGEK-GLVPIRVSK--------N-LYILSMALLM-KDSDSVI GIKEK-IYKGEK-GLIPAKASE--------N-LYILSMALLM-KESDSVI NGKGE-VKKAKK-GLTPIQITN--------N-LYVLSMGSMI-KPDDAVI NVTGK-ITKHKN-GFKPLTLDS--------N-LDILSFGNIL-GENDVVI GLENESIYQGPE-GWQPVKVETN----STGS-LSVISLGFLLGDRGNSVI GIEDAKVTQAPG-GWLPITVHEADPSAGVGS-LRVMSLGFLLPKRGDAVV GAQGRAVHQCDR-GWAPVFLDR------EQS-ISLMSVGFLLEKPDEAVV GIQDKEIKSRNG-KMVP-QVVD--------G-VQIISLGLMLSDPHDAVI GLDNQGVYQGEHGGILPAKSKI-----GDTF-IDTLSVGFMLSSPDSPVI GTENAIITNVQD-TFVPVHVHG--------TCIGVVSAYHVNEWHSVEQL GTDGQLMCETNG-RLQPVEVHG--------L-LKAVSAGYLQDPCEEGVV GTDGMSMCETSG-IIQPIEVNK--------F-LKVVSVGYLQD-CGEGIM HTEGEIIIESNK-RFYPVKLSE--------N-LGCISVGYFADSQPSQNL NTTDYFITNVEN-QFNPIKVNE----------LSVVSMGHIHNNISD--I GCIKETVHESNS-GWTPVYVTD--------N-LATMSIQYMLPEDDSAII GVEGETIHVSST-GWSPAWAMD--------N-LAVMSIQFMLPNRDDAII GLEGEQVHQSGS-GWSPVYVED--------N-LGVMSVGFLLSSPDDAVI NCDVEPLLDTTF-GFQPYHAAK--------N-INVVSIQFFLPDFDSPLV FTKTEQVQNLPM-GWEPVSVSH--------T-LQALSVGHLVTQEDAPVI GQIGVNVTSGMT-GLQPIYVTE--------N-LCTMSIGYLV-ATETAVV GVQGNDVHISAN-GWSPVYVND--------N-LSVMSTAFLLPQSDDAVI


Cfd1_Tra_hom Cfd1_Enc_cun Cfd1_Enc_int Cfd1_Nos_cer Cfd1_Ent_bie Cfd1_Sac_cer Cfd1_Neu_cra Cfd1_Hom_sap Cfd1_Tri_vag Cfd1_Ent_his Nbp35_Tra_hom Nbp35_Enc_cun Nbp35_Enc_int Nbp35_Nos_cer Nbp35_Ent_bie Nbp35_Sac_cer Nbp35_Neu_cra Nbp35_Hom_sap Nbp35_Tri_vag Nbp35_Gia_int Nbp35_Ent_his Nbp35_Cry_par

WRAPRKLQLYEMFYNSAYENILQNTTESDPDSYDILDKHNNIGNKTAQKN WRGPKKMSVLSMFYESID-------------------------------WRGPKKMSVLSMFYESAD-------------------------------WRGPKKLSLLNLFYDSID-------------------------------WRGAKKQIFLELMFNTSLFK-----------------------------WRGPKKTSMIKQFISDVAWG-----------------------------WRGPKKTAMVRQFLSDVFWD-----------------------------WRGPKKNALIKQFVSDVAWG-----------------------------WRGPKKSAMINQFFQLIEW------------------------------WRGPKKGAAIEQFLNDVEWG-----------------------------YQPSFISSFLINVLSNCNFD-----------------------------FSSTLKTSAMKKLLKWCSYE-----------------------------FSSSFKTGIIKKFLAQCNYE-----------------------------FSSTYKTNTIRNILINGDIA-----------------------------YTSEIKRYFIKNILKNCTMD-----------------------------WRGSKKNLLIKKFLKDVDWD-----------------------------WRGPKKNGLIKQFLKDVEWG-----------------------------WRGPKKNGMIKQFLRDVDWG-----------------------------ARGPKKNALVLQLINQIDWS-----------------------------LRGPKKHGMVKQMLTETNWE-----------------------------WKGPKKNSLIRQFIHDVDWG-----------------------------WRGPKKNGLIKQFLSDVVWG------------------------------

Cfd1_Tra_hom Cfd1_Enc_cun Cfd1_Enc_int Cfd1_Nos_cer Cfd1_Ent_bie Cfd1_Sac_cer Cfd1_Neu_cra Cfd1_Hom_sap Cfd1_Tri_vag Cfd1_Ent_his Nbp35_Tra_hom Nbp35_Enc_cun Nbp35_Enc_int Nbp35_Nos_cer Nbp35_Ent_bie Nbp35_Sac_cer Nbp35_Neu_cra Nbp35_Hom_sap Nbp35_Tri_vag Nbp35_Gia_int Nbp35_Ent_his Nbp35_Cry_par

VTVDESCQLGYKYVVIDTPPGITIVHSFIKE----------------KN---------GFDNVVFDMPPGISEEHGFLIG----------------KD---------GFDNVVIDMPPGISEEHGFLVG----------------KD---------DFDFVIIDTPPGVSEEHGFLID----------------KN---DEDGNFIYDAILIDTPPGISEEHGFLVG---------------KKN---------ELDYLLIDTPPGTSDEHISIAE-------EL----RYSK-P ---------ETDYLLIDTPPGTSDEHISLAE-------NLLQKARPGQ-L ---------ELDYLVVDTPPGTSDEHMATIE-------AL----RPYQ-P ---------DCNTVIVDLPPGTSDEHLSTFD-------VLN---RN-NFS ---------DKDVLVVDTPPGTSDEHITIMD-------FF----RKRNQE ---------TYTHVVIDTPPGITDEHLIISN-------YV---------D ---------GTDVLLLDTPPNVTDEHLGMVN-------FI----R----P ---------GVDVLLLDTPPNVTDEHLGMVN-------FI----K----P ---------DYEILIIDTPPNVTDEHLGIVN-------YL----K----L ---------NKEILILDTPPNITEEHFAIYN-------YI----C----N ---------KLDYLVIDTPPGTSDEHISINK-------YM----RESG-I ---------DLDFLLVDTPPGTSDEHLSVNT-------YL----KKSG-I ---------EVDYLIVDTPPGTSDEHLSVVR-------YL----ATAH-I ---------DQDFLLVDTPPGTSDEHLSVVS-------FM----RDSE-I ---LDPRFPKSNIIIVDTPPGTSDEHLSIIDMYQNAIRYMQSNAFPNVPV ---------ELDYLIIDTPPGTSDEHLTIVS-------IL----NKCN-V ---------ELDFLIIDTPPGTSDEHLSIVS-------YL----NGSN-V Walker B motif


Cfd1_Tra_hom Cfd1_Enc_cun Cfd1_Enc_int Cfd1_Nos_cer Cfd1_Ent_bie Cfd1_Sac_cer Cfd1_Neu_cra Cfd1_Hom_sap Cfd1_Tri_vag Cfd1_Ent_his Nbp35_Tra_hom Nbp35_Enc_cun Nbp35_Enc_int Nbp35_Nos_cer Nbp35_Ent_bie Nbp35_Sac_cer Nbp35_Neu_cra Nbp35_Hom_sap Nbp35_Tri_vag Nbp35_Gia_int Nbp35_Ent_his Nbp35_Cry_par

TKILLVTTSQNVAISDSINTINFFGK----ISGIIENMSGLKCPNCKKIT VGALIITTPQNVSLGDSSKAIDFCASNGIRILGLVENMSGYCCECCGSSV ISVLIATTPQNISLGDSSRAIDFCISNGIQILGLVENMSGYCCESCGNPT IYSLIVTTSQNVALSDTVKAIDFCKINNIKILGIIENLSGYKCNCCGHIT VHSLIVTTGQNLALNCCQSTIEFCLYHNLNIIGVIQNMSYYVCECCHEKI DGGIVVTTPQSVATADVKKEINFCKKVDLKILGIIENMSGFVCPHCAECT AGAVVVTTPQAVATADVRKELNFCTKTNIRVLGVVENMCGFVCPNCSECT LGALVVTTPQAVSVGDVRRELTFCRKTGLRVMGIVENMSGFTCPHCTECT YSVIIVTTPNVLAVADVRKGINLCLKVNAKIIGIIENFCGVVCPCCNKVS TKAVIVTTPQLVSTNDVEKEIDFCNECQIPIIGLVENMSGYLCPHCSTVT AVCVLVSTPGVLAVNDLVRQIDFCERAGVKVLGVVENMREFVCE-CGCVV RFGIVVTTPQKFSLQDVARQVDFCRKARIEVLGIIENMKRFTCQKCGHSK KFAIVVTTPQKFSLQDVIRQIDFCRKAKISVLGVIENMKRFVCPRCSHQK NFAIVVTTPQLISFQDVIRQYTFCYKNNIKILGIIENMKGFRCEKCDSLQ AKAILISTPHVLCTTELNRQFIFCQKANIDIVGIVSNMDGIRCSKCNHIN DGALVVTTPQEVALLDVRKEIDFCKKAGINILGLVENMSGFVCPNCKGES DGAVMVTTPQEVSLLDVRKEIDFCRKAGIKVLGLVENMSGFVCPKCTHES DGAVIITTPQEVSLQDVRKEINFCRKVKLPIIGVVENMSGFICPKCKKES DGAVIVTTPDEVSISDVRREIEFCQKAGVKILGVVENMSQYKCPMCGKTS LEAVVVSTPQEVALADVRKEINFCKQLNLHIKGVIENMSGFVCPFCETET DGAIIITTPQDVSLIDVRKEINFCKKIGLPIIGVVENMSGFICPCCHKES NGALIVTTPQEIALQDVRKEINFCKKVGLNILGVVENM-GMIFKNAEHDS

Cfd1_Tra_hom Cfd1_Enc_cun Cfd1_Enc_int Cfd1_Nos_cer Cfd1_Ent_bie Cfd1_Sac_cer Cfd1_Neu_cra Cfd1_Hom_sap Cfd1_Tri_vag Cfd1_Ent_his Nbp35_Tra_hom Nbp35_Enc_cun Nbp35_Enc_int Nbp35_Nos_cer Nbp35_Ent_bie Nbp35_Sac_cer Nbp35_Neu_cra Nbp35_Hom_sap Nbp35_Tri_vag Nbp35_Gia_int Nbp35_Ent_his Nbp35_Cry_par

NIY--SRNGGSQLAEEFNIPFYGTLEIDQNISQFIENGTLY--------NIF--GSKGGERLAEETGIPFVCRLPIDSLLCEALDEGRFV--------NIF--GARGGERLAMEMGVRFICELKIDPLLCEALDEGKFL--------NIF--ASKGGQQLSQHYLINFIEKLPIEPLFGELLDTKEFI--------YLY--GKNGGKLLAEEYGIEYLGEIPMESQMLNAIEQGQFP--------NIF--SSGGGKRLSEQFSVPYLGNVPIDPKFVEMIENQVSSKK------NIF--MSGGGEVMANDFGVRFLGRVPIDPQFLVLIETGKRPTYPAGTTVD SVF--SRGGGEELAQLAGVPFLGSVPLDPALMRTLEEGHDF--------PLL--GDKAAEIMSEELQLDILAKIPFLPQAASAADKGEKS--------NIF--SSNGGKELADKYQLKFVGAIPIEPKICLAGETGLN---------PM---GTVDVREVCLRRGVRYLGGLQCVKAVGMFADGGMVY--------SIF--RSVGVESYCMSNGIAYLGSIDLKQDIAKRSDSGDTI--------NVF--VNTEVESYSKSNGIPYLGSIDLRQDIAKASDIGRPT--------DIF--YNSDIEQKCKENNLNYIGSLPLNIEYGKSGDNGILI--------QLF--SKDIILKFCQNKYIQFLGEIEFNSQIVKNIDKGEVI--------QIFKATTGGGEALCKELGIKFLGSVPLDPRIGKSCDMGESF--------EIFKATTGGGRKLAEEMGIAFLGSVPLDPRIGMACDYGESF--------QIFPPTTGGAELMCQDLEVPLLGRVPLDPLIGKNCDKGQSF--------SIYGHEFGGAEELCKQENLDLLGRIPIDPYIVAGQFEPQK---------PVIEATTGGVKKMCEDMHVPYIGSMPLDPQLMKAGEDGVAW--------TIFPPTHGGAKQMCEEMGVKFLGKIPLDPIIAHSCDIGAPY--------SV--------KDMCDNMEVEYLNKIPWDKELLYVCDLGLSI---------


Cfd1_Tra_hom Cfd1_Enc_cun Cfd1_Enc_int Cfd1_Nos_cer Cfd1_Ent_bie Cfd1_Sac_cer Cfd1_Neu_cra Cfd1_Hom_sap Cfd1_Tri_vag Cfd1_Ent_his Nbp35_Tra_hom Nbp35_Enc_cun Nbp35_Enc_int Nbp35_Nos_cer Nbp35_Ent_bie Nbp35_Sac_cer Nbp35_Neu_cra Nbp35_Hom_sap Nbp35_Tri_vag Nbp35_Gia_int Nbp35_Ent_his Nbp35_Cry_par

-------------------------ENINSLGCNNVLDTVVR---K----------------------------ERCGSIEAYMKFRKAVL---G----------------------------EKCGSIESYIRLRRSVL---E----------------------------LKYQELKTYKILKKWINR--E----------------------------SYCEATGIFNDIQRKIS---H-------------------------TLVEMYRESSLCPIFEEIMKKLRKQDTT GKDISTPAGASTSEEEEVKDGSRLVHKYKDCSLAPIFSKITADVISA--------------------------IQEFPGSPAFAALTSIAQKILDATP-------------------------D-----VILSFFNEVIDKIFPQQ---------------------------PFADEPSANALKPITDFVADLA--------------------------ED-------ALFAGVVRNITDE---------------------------EE-------EVLGKIVDAIMVVC--------------------------RE-------EIFDRMADAVLSI---------------------------DD-------QIFSKTIDLIINE---------------------------HIPQLKSIYSKLLSIIMEFVPFNK------------------------LDNYPDSPASSAVLNVVEALRDA--------------------------FDSFPDSPACRALKGVVKGLATEMGL ------------------------FIDAPDSPATLAYRSIIQRIQEFCNL -----------------------DLPEAINDAASVICEKIQQKLS--------------------------STICDIDTSPGYDAFANICGKII----------------------------FLEHPDSEATKNFKRIYKEIIT---------------------------CEKFPQSPSSIGIKKLVDIIIYQ---

Cfd1_Tra_hom   ------------LVGS
Cfd1_Enc_cun   ------------LADI
Cfd1_Enc_int   ------------ITNI
Cfd1_Nos_cer   ------------NWTF
Cfd1_Ent_bie   ------------L--L
Cfd1_Sac_cer   TPVVDKHEQPQIESPK
Cfd1_Neu_cra   -------------VQQ
Cfd1_Hom_sap   ------------ACLP
Cfd1_Tri_vag   ------------KAAQ
Cfd1_Ent_his   ------------KTFA
Nbp35_Tra_hom  -------------LEK
Nbp35_Enc_cun  ------------SSKA
Nbp35_Enc_int  -------------HES
Nbp35_Nos_cer  --------------IK
Nbp35_Ent_bie  -----------KNDFN
Nbp35_Sac_cer  ------------VGDV
Nbp35_Neu_cra  DPEV----VMPEEDDA
Nbp35_Hom_sap  HQS-----KEENLISS
Nbp35_Tri_vag  ---------------A
Nbp35_Gia_int  ---------------E
Nbp35_Ent_his  --------------NL
Nbp35_Cry_par  ------------SKIN

Abbreviations: Tra_hom, Trachipleistophora hominis; Enc_cun, Encephalitozoon cuniculi; Enc_int, Encephalitozoon intestinalis; Nos_cer, Nosema ceranae; Ent_bie, Enterocytozoon bieneusi; Sac_cer, Saccharomyces cerevisiae; Neu_cra, Neurospora crassa; Hom_sap, Homo sapiens; Tri_vag, Trichomonas vaginalis; Gia_int, Giardia intestinalis; Ent_his, Entamoeba histolytica; Cry_par, Cryptosporidium parvum.


The major structural difference between Cfd1 and Nbp35 is the presence in Nbp35 of an N-terminal Cys-X13-Cys-X2-Cys-X5-Cys ferredoxin-like motif (labelled in red) coordinating a [4Fe-4S] cluster26-28. The insertion of this cluster depends on electron transfer from the Tah18-Dre2 complex29. The genome of E. histolytica lacks homologues of Tah18 and Dre2, which is consistent with the loss of the ferredoxin-like motif in Nbp35 from E. histolytica30. Cysteine residues in the CX18CPXCX2C (Cfd1) and CX18CX2CX38C (Nbp35) motifs that are conserved at the C termini of yeast Cfd1 and Nbp35 homologues are labelled in yellow and dark green, respectively. The two central cysteine residues of both motifs, which are conserved in microsporidian sequences, are essential for yeast viability and for maturation of cytosolic and nuclear Fe/S proteins. These motifs also have a critical role in Cfd1-Nbp35 complex formation26,28,31. Cfd1 and Nbp35 are classified as Mrp-like proteins that belong to the Mrp/Nbp35 subfamily of P-loop NTPases. Nucleotide binding and/or hydrolysis is apparently critical for loading an Fe/S cluster onto the Cfd1-Nbp35 complex. The consensus Walker A motif of the Mrp family and the ENMS motif are labelled in pink and light blue, respectively. The Asn in the ENMS motif is predicted to form a contact with the ATP adenine32. The Walker B motif with a conserved Gly residue (typically, within the signature hhhhDxxG, where h is a hydrophobic residue) is labelled in green.
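The N-terminal Cys-X13-Cys-X2-Cys-X5-Cys spacing described above translates directly into a regular expression. The sketch below is illustrative only and is not the method used to annotate the figure; `ferredoxin_like_cysteines` is a hypothetical helper name. The N-terminal sequences were transcribed from the Nbp35_Sac_cer, Nbp35_Hom_sap and Nbp35_Ent_his rows of the first two alignment blocks of panel e with gap characters removed, assuming the rows follow the order of the labels.

```python
import re

# Cys-X13-Cys-X2-Cys-X5-Cys, where X is any residue.
FERREDOXIN_LIKE = re.compile(r"C.{13}C.{2}C.{5}C")

# Ungapped N-terminal regions transcribed from the alignment (see lead-in).
NBP35_N_TERMINI = {
    "Nbp35_Sac_cer": ("MTEILPHVNDEVLPAEYELNQPEPEHCPGPE"
                      "SDMAGKSDACGGCANKEICESLPKGPDPDIPLITDNLSGI"),
    "Nbp35_Hom_sap": ("MEEVPHDCPGAD"
                      "SAQAGRGASCQGCPNQRLCASGAGATPDTAIEEIKEKMKTV"),
    "Nbp35_Ent_his": "CGGGNNGPDRELEEIIEKLKGI",
}


def ferredoxin_like_cysteines(seq):
    """Return 1-based positions of the four motif cysteines, or None."""
    match = FERREDOXIN_LIKE.search(seq.replace("-", ""))
    if match is None:
        return None
    first = match.start() + 1  # convert to 1-based residue numbering
    # Offsets follow the spacing: 13, 2 and 5 intervening residues.
    return [first, first + 14, first + 17, first + 23]


for label, seq in NBP35_N_TERMINI.items():
    print(label, ferredoxin_like_cysteines(seq))
```

With these transcribed fragments the motif is found in the yeast and human sequences and is absent from the E. histolytica fragment, in line with the loss described in the legend.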


f. WD40 protein Cia1

Cia1_Tra_hom Cia1_Enc_cun Cia1_Enc_int Cia1_Sac_cer Cia1_Neu_cra Cia1_Hom_sap Cia1_Gia_int Cia1_Ent_his Cia1_Cry_par

M-----K--------------ITERGSAQFDEKLL--SLAVTKDSVLTGG M--------------------KYRITSKKLGEKIL--AVHA-GKSIYTGG M--------------------KYKITSKRLDEKIL--AVHV-NGAVYTGG MASI--NLIKS-----------LKL----YKEKIW--SFDFSQGILATGS MATETTTATPTAPSATTAEITPLPAFSPDLYQRAWASIPHPSLPLIATCH MKDSL-VLLGR-----------VPA---HPDSRCWFLAWNPAGTLLASCG MVHDRVSLLNH-----------TAA----HTDRIWRLRASHTGELVASCS M-----QLVDS-----------FEV---------------P--------MGSL--EKLGI-----------IGV----LDSAIWSVASHPKDRIIASCG

Cia1_Tra_hom Cia1_Enc_cun Cia1_Enc_int Cia1_Sac_cer Cia1_Neu_cra Cia1_Hom_sap Cia1_Gia_int Cia1_Ent_his Cia1_Cry_par

TKK-ILHKL-------------------STDLH----TQKILC--QHEKTSR-ML-----------------------VNQD----TGEVMC--RCKKTSR-TL-----------------------VNQD----TGEVMC--RCRKTDRKIKLV--------------------SVKY-DDFTLIDVLDETAHKKA-HSVTVF--------------------SLST-LS--KHSVLT-GGHTRGDRRIRIW--------------------GTEG-DSWICKSVLS-EGHQRADGSMAIW--------------------RASKDLSLQLVQRLQ-PGHDNP -------------------------------------LKPLLT--PHNRS-S-IVVWMDTKLKNHKWYQEIEIVNKCSMAG-NSWVKAYEFGSLEHKR-

Cia1_Tra_hom Cia1_Enc_cun Cia1_Enc_int Cia1_Sac_cer Cia1_Neu_cra Cia1_Hom_sap Cia1_Gia_int Cia1_Ent_his Cia1_Cry_par

-SIRCIA--AKD----------GVIVCGSYDGNATVLY------------SVRSIA--SHG----------RYVCCGSYDCTAVLFH------------SIRSIA--SHG----------RYICCASYDCTAVLFH------------AIRSVAWRPHT----------SLLAAGSFDSTVSIWAKEE---------SVRSAAWQPPRGKVSGKEAKRLRLVTGSFDTTAGVWTWDQGRREESLER -TVRKVAWSPCG----------NYLASASFDATTCIWKKN---------VIVRDCAFSAND----------QHLVVAAYDGSMYVYDLID---------TIRRVKCSKNG-----------LLACCSFDSTVSLWE------------LIRKIAWSPCG----------GMIISASFDSSISVWEFVS---------

Cia1_Tra_hom Cia1_Enc_cun Cia1_Enc_int Cia1_Sac_cer Cia1_Neu_cra Cia1_Hom_sap Cia1_Gia_int Cia1_Ent_his Cia1_Cry_par

---------------------EDKILDLI-EGPETEIKGVDLLNGNRRDN ---------------------DGKVVDVI-EGPDTEVKCVAFSEDG---R ---------------------DGKVVDVI-EGPDTEIKCVGFSEDG---R ---------------SADRTFEMDLLAII-EGHENEVKGVAWSNDG---Y EIRLQSGENDQEAEEEEEAEDEWELTLVL-EGHENEVKSVNYSPSG---Q -------------------QDDFECVTTL-EGHENEVKSVAWAPSG---N ---------------ELTKGDPFQLTAIIANAHEKEIKSVDISKEG----------------------LNENTIIGTL-EGHESEVKCVDWSFGS---N ------------------RDIGWACICKI-LGPESEVKCVDWSPFN---N

Cia1_Tra_hom Cia1_Enc_cun Cia1_Enc_int Cia1_Sac_cer Cia1_Neu_cra Cia1_Hom_sap Cia1_Gia_int Cia1_Ent_his Cia1_Cry_par

YIALSTRGKTVWVC--------------KLN---DKIEIDSILEDHTQDV YLAMATRGRSVWVV--------------KID---GEIEIDGVIEDHLHDV YLAMATRGKSVWVV--------------KID---PEIEIDEIIEDHLHDV YLATCSRDKSVWIW--------------ETDESGEEYECISVLQEHSQDV YLATCSRDKSVWIWEDVGNPNPSSEEEDEEEEDEDEWETVAVLQEHDGDV LLATCSRDKSVWVW--------------EVD-EEDEYECVSVLNSHTQDV TVAACSRDRFVSFW--------------RPCSDSPDYDCIGLFNNHTEDI MVATCSRDKSVWLWKSY---------------SGIDYECCSVLTGHSGDV FVAACCRDRAIWFFSLDI-------GENRKLGTLIEYDCIGVVTAHTNDI


Cia1_Tra_hom Cia1_Enc_cun Cia1_Enc_int Cia1_Sac_cer Cia1_Neu_cra Cia1_Hom_sap Cia1_Gia_int Cia1_Ent_his Cia1_Cry_par

KGVKFF-----------------NNLLYTYGYDNTVKVYQRFTMY----KGCIFH-----------------GGLLFTYGYDNTVKVYDRF-DY----KGCVFH-----------------KGFLFTYGYDNTIKIYERF-DY----KHVIWHPS---------------EALLASSSYDDTVRIWKD---Y----KAVAWCPDVPGRKGKYAPPRRYGDDVLASASYDNTVRLWRED--G----KHVVWHPS---------------QELLASASYDDTVKLYRE---E----KCVRFSRN---------------GHYLVSASYDNNICLYKRCLETADDLG KTVLFHPS---------------GTILFSGSFDGTIKVWKGE--E----KKIKWHPT--------------IPMVLLSCSYDNTIIAWAPS--S-QLLG

Cia1_Tra_hom Cia1_Enc_cun Cia1_Enc_int Cia1_Sac_cer Cia1_Neu_cra Cia1_Hom_sap Cia1_Gia_int Cia1_Ent_his Cia1_Cry_par

-----DDSWVLLQSLEAQKE---------------------TVWDVEML-----DDSWELVQSID-ERS---------------------TVWCVIFH-----DDSWELVQSIS-EKN---------------------TVWCVIFH-----DDDWECVAVLNGHEG---------------------TVWSSDFDK -----DGEWVCVAVLEGHEG---------------------TVWGVAWEG -----EDDWVCCATLEGHES---------------------TVWSLAFDP E--EEIESWIFAGSTKSELDLNSCEMNAPDSEIVASGASCHTVWTAIFMA -----ETEWSELQTIQAYGK---------------------TVWDLKITK HDEVKGLEWVKLYTLNGHSS---------------------TVWDFTYSP

Cia1_Tra_hom Cia1_Enc_cun Cia1_Enc_int Cia1_Sac_cer Cia1_Neu_cra Cia1_Hom_sap Cia1_Gia_int Cia1_Ent_his Cia1_Cry_par

--------TKLFVASNDGCIYVYRK-------------------------------NGRMVCTTEEGTVSIYAL-------------------------------GDKMVCSTEEGTISSYVL----------------------------TEGVFRLCSGSDDSTVRVWKYM-GDDEDD----------------RPRENDKFPRLLSWGADEVIRVWSLK-EPEEEEHGE----GAAGGGGNNT ----S--GQRLASCSDDRTVRIWRQYLPGNEQG--------VACSGS------N--NSSILAVDGNGCIRCYNII---------------------------E--GKFIVAGCANGVIILYEFK---------------------------N--GEFLLSCSDDSSIVLWNSN-QGNENKFKNLNSVNFALTDTFKM

Cia1_Tra_hom Cia1_Enc_cun Cia1_Enc_int Cia1_Sac_cer Cia1_Neu_cra Cia1_Hom_sap Cia1_Gia_int Cia1_Ent_his Cia1_Cry_par

-------------------EKDWV-FDYCCNISVYPILSLCRIR------------------------RSGWT-LEMSRKLSVLPIYSICSVG------------------------RNGWE-LEACKKLSIFPIYSICSVG------------------------QQEWVCEAILPDVHKRQVYNVAWG-FN---WGFGVPNTMRR------SLKEEWECTAVLPKVHKGDIYSVAWSTET----------------------DPSWKCICTLSGFHSRTIYDIAWCQLT----------------------EGGVKQLGCTILHGRRPVYDISLVEPRHASK -------------------DNLLVELDTINNEKYRDIYSIDIND-----IFYNTPNTKRLSKYIQID-QANSFINNYDKELYSYPIYSIEWCNYI----

Cia1_Tra_hom Cia1_Enc_cun Cia1_Enc_int Cia1_Sac_cer Cia1_Neu_cra Cia1_Hom_sap Cia1_Gia_int Cia1_Ent_his Cia1_Cry_par

---DFLALAVDSKSLVI------------------------VDESLQVKC ---ENMAYVLNRSSIGI------------------------VDSNLNLVM ---RDMAYVLNRNNIGI------------------------IDSNLNLTT ---GLIASVGADGVLAVYEE---------------------VDGEWKVFA ---GLLSSVGSDGVLALYQETANTTEKNEENETNGEAPTTTSAGGWKVLT ---GALATACGDDAIRVFQEDPNSD---------------PQQPTFSLTA ASSIYIATAGQDGVVCLSIIN--P-----------------ITGTATPIV ---NNVLVGSGDNAIRLFKIN--T-----------------IKKKLELIE ---NCIIVSSADKSLHLFSV---T-----------------DSKRLKHIC


Cia1_Tra_hom Cia1_Enc_cun Cia1_Enc_int Cia1_Sac_cer Cia1_Neu_cra Cia1_Hom_sap Cia1_Gia_int Cia1_Ent_his Cia1_Cry_par

SLEDAHQK-DINCVKYSDD----------NNMLVTCSDDGCLKVYHVDFSIENVHED-SINSIVYDEG----------RNRIVSGGDDGILNTIEL--TIEDIHED-FINGIAYDEG----------RGRIISGGDDGILNVIEM--KRALCHGVYEINVVKWLEL--------NGKTILATGGDDGIVNFWSLEKTVKGAHGPYEINHITWCKRYDAGSERKGEEEMLVTTGDDGVVRPWQVR-HLHQAHSQ-DVNCVAWNPK---------EPGLLASCSDDGEVAFWKYQRP HITGAHDG-EVNSVCDITQ----AVATDGHVVVCSGGDDGCINIWRISTEKQDAHTN-DVNCVKWIN-----------KTLSISVGDDNMLKIWKI--ERPNAHNS-EINSVSWLND--------NKRGEFISAGDDGEIALWRFD--

Cia1_Tra_hom  -SE
Cia1_Enc_cun  --L
Cia1_Enc_int  --F
Cia1_Sac_cer  -AA
Cia1_Neu_cra  -IQ
Cia1_Hom_sap  EGL
Cia1_Gia_int  -EE
Cia1_Ent_his  -VN
Cia1_Cry_par  -FE

Abbreviations: Tra_hom, Trachipleistophora hominis; Enc_cun, Encephalitozoon cuniculi; Enc_int, Encephalitozoon intestinalis; Sac_cer, Saccharomyces cerevisiae; Neu_cra, Neurospora crassa; Hom_sap, Homo sapiens; Gia_int, Giardia intestinalis; Ent_his, Entamoeba histolytica; Cry_par, Cryptosporidium parvum. Cia1 belongs to the WD40-repeat protein family. WD40 repeats are conserved domains of approximately 44-60 residues that typically contain a GH dipeptide 11-24 residues from the N terminus of the repeat and a WD dipeptide at its C terminus. The WD repeat combines a conserved core structure with variable regions that are probably surface-exposed. Most WD proteins contain a cluster of seven or more WD repeats, with as many as 16 and as few as four reported. A WD protein forms a propeller-like structure with several blades, where each blade is composed of a four-stranded anti-parallel β-sheet. Each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade. The proposed common function of this protein family is to coordinate the assembly of multi-protein complexes by functioning as a docking site for other proteins. Residues on the top and bottom surfaces of the propeller are proposed to coordinate interactions with other proteins and/or small ligands33,34. The 3D structure of the Cia1 protein from S. cerevisiae shows the typical architecture of other WD40-repeat proteins35,36. The β-propeller structure contains three potential interacting surfaces: the top, the bottom and the circumference35,36. The WD40 repeats are shaded in yellow. The seven WD40 repeats in S. cerevisiae were obtained from UniProtKB: http://www.uniprot.org/uniprot/Q05583 and the WD40 repeats from the other organisms were obtained from SMART: http://smart.embl-heidelberg.de/. Note that Tra_hom has four inferred WD repeats, Enc_cun and Enc_int have five WD repeats, and Gia_int and Ent_his have six WD repeats.
The arginine (R) in green is critical for the function of Cia1 in yeast36 and is conserved in all of the microsporidian sequences. Invariant residues that are also present in the microsporidian sequences are shown in bold.
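The spacing rule quoted in this legend (a repeat of roughly 44-60 residues with a GH dipeptide 11-24 residues from its N terminus and a WD dipeptide at its C terminus) can be turned into a crude candidate filter. The Python sketch below is only an illustration of that rule. It is not how the repeats in the figure were assigned (those came from UniProtKB and SMART), `candidate_wd_repeats` is a hypothetical helper name, and the test string is artificial rather than a real protein sequence.

```python
import re


def candidate_wd_repeats(seq, gh_offset=(11, 24), length=(44, 60)):
    """Return (GH index, WD end index) pairs that fit the quoted spacing.

    Indices are 0-based on the ungapped sequence; the end is exclusive.
    """
    seq = seq.replace("-", "")  # drop alignment gap characters
    gh_starts = [m.start() for m in re.finditer("GH", seq)]
    wd_ends = [m.end() for m in re.finditer("WD", seq)]
    hits = []
    for gh in gh_starts:
        for end in wd_ends:
            # The repeat may begin 11-24 residues before the GH dipeptide;
            # accept the pair if any such start gives an allowed length.
            for offset in range(gh_offset[0], gh_offset[1] + 1):
                start = gh - offset
                if start >= 0 and length[0] <= end - start <= length[1]:
                    hits.append((gh, end))
                    break
    return hits


# Artificial example: GH at index 15, WD ending at index 49.
toy = "A" * 15 + "GH" + "A" * 30 + "WD" + "A" * 5
print(candidate_wd_repeats(toy))
```

Because the legend describes the dipeptides as typical rather than invariant, a literal filter like this will miss repeats in which GH or WD is substituted, which is one reason profile-based tools are used instead.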


g. Cia2

Cia2_Tra_Hom Cia2_Enc_cun Cia2_Enc_int Cia2_Nos_cer Cia2_Ent_bie Cia2_Sac_cer Cia2_Neu_cra Cia2B_Hom_sap Cia2A_Hom_sap Cia2_Gia_lam Cia2_Tri_vag Cia2_Cry_par

---------------------------------MNREPELFE-SAVPECT------------------------------------------MNEFPFVAS-SLEERHP------------------------------------------MNKSPFVSS-SLERRYP------------------------------------------MNISPQIKNKNFENRFD------------------------------------MNNPNIINQFPSLRS-KSYSRVN---------MSEFLNENPDILEENQLPTRKEDSTKDLLLGGFSNEATLERRSLLLKIDHSLKSQVLQDI ---------------------------MAKSDLDNANPTVLSVSQLPSRNLAKGHVRKGP ---------------------------MVGGGGVGGGLLENANPLIYQ--------------------------------------------MQRVSGLLSWTLSRVLWLSG---------------------------------------MAPHAPYTAGPFFNRGR----------------------------------------MAANPNPVVYGSAKYVR-----------------------------------------------------------------------

Cia2_Tra_Hom Cia2_Enc_cun Cia2_Enc_int Cia2_Nos_cer Cia2_Ent_bie Cia2_Sac_cer Cia2_Neu_cra Cia2B_Hom_sap Cia2A_Hom_sap Cia2_Gia_lam Cia2_Tri_vag Cia2_Cry_par

---------------------------------EQFDP---NKLTKGMVF------------------------------------------ISMSNGVLQNVTQRSVF------------------------------------------IDISDGILQEITQYSVF------------------------------------------LHFKDKLLTDISVDSVF------------------------------------------LDFENGYLREVTAEAIF---------EVLDKLLSIRIPPELTSDEDSLPAESEDESVAGGGKEEEEPDLIDAQEIY---------DSKYDHILFPKQWWAGSSLNTDPSVWTSDEDEDDDLTLATEEPIDEQEIYGENYDPSSCT ------------------------RSGERPVTAGEEDEQVPDSIDAREIF--------------------------------------LSEPGAARQPRIMEEKALEVY----------------------------------------------PEDYEPITPEEVF---------------------------------------STEDDLDSPEREAIDSLELY--------------------------------------------------------DVY----------

Cia2_Tra_Hom Cia2_Enc_cun Cia2_Enc_int Cia2_Nos_cer Cia2_Ent_bie Cia2_Sac_cer Cia2_Neu_cra Cia2B_Hom_sap Cia2A_Hom_sap Cia2_Gia_lam Cia2_Tri_vag Cia2_Cry_par

ELIRHIKDPEHP-YSLEILNVVNLDSIEIKEISTTY------GKNLQQVVVHFQPTIPHC ELIRDIRDPEHP-YTLEQLGVVSREGVSIGCIGPDG-IAPNVGLPIRCVKVVFKPTIPHC ELIRDIRDPEHS-YTLEQLGVVSREGITIGLIDSDG-IAPSAGLPIKYIKVMFKPTIPHC ELIRDIKDPEHP-YTLEELNVVRKDLIKIYQLKDEY-VVEDI---INCIEVQFEPTIPHC ELIRDIQDPEHP-YTLEDLGVVSLSDIKIYTVYNNTNIKCTDGFPLKFIEVQFTPTVPHC DLIAHISDPEHP-LSLGQLSVVNLEDIDVHDSGNQN--------EMAEVVIKITPTITHC YLLSTISDPEHP-VTLGQIAVVRLDDIHLSPSPAER----LDPNTLTNVEVDLTPTVNHC DLIRSINDPEHP-LTLEELNVVEQVRVQVSDPE-------------STVAVAFTPTIPHC DLIRTIRDPEKP-NTLEELEVVSESCVEVQEINEEE----------YLVIIRFTPTVPHC DIIRSVRDPEHMNMTLEDLRVVNLNDITVMDEQG-------------LVRVVYTPTTPTC NYIRLIKDPEHP-FSLEQLHIVSPDDIKVDDKEGR-------------VNLVFTPTVPNC ECIKDIIDPEYP-LTLEQLNVVSLENIIIN-------------HEEQIIFVFFKPTVTSC

Cia2_Tra_Hom   SMAAIIGLCIFYVLKARL-DTFWIRVQIAE--DTHVNWKTINKQLDDKDRTNAAFENTSI
Cia2_Enc_cun   SMAAVIGLCIKTHVSRHV-RNHFVQVHIVD--GGHINFRALNKQLDDKDRVLAATENEVL
Cia2_Enc_int   SMAAIIGLCIKAQINQYI-ENHFIQVHIVN--DGHINFKALNKQLDDKDRVLAAMENETL
Cia2_Nos_cer   SMAAIIGLIIKILLEKYI-KGYYIIVSILE--GSHVNDKMLNKQLKDKDRVQAASENEAL
Cia2_Ent_bie   SLVGIIGLSIAYQLYKHT-RNYVIKLRITK--GSHHQEEIYNKQLNDRERVFAAFENESI
Cia2_Sac_cer   SLATLIGLGIRVRLERSLPPRFRITILLKK--GTHDSENQVNKQLNDKERVAAACENEQL
Cia2_Neu_cra   SLATVIGLAVRVRLENALPPNYR--IIVRMKDGSHAQDDQVNKQLGDKERVAAALENDTL
Cia2B_Hom_sap  SMATLIGLSIKVKLLRSLPQRFKMDVHITP--GTHASEHAVNKQLADKERVAAALENTHL
Cia2A_Hom_sap  SLATLIGLCLRVKLQRCLPFKHKLEIYISEG--THSTEEDINKQINDKERVAAAMENPNL
Cia2_Gia_lam   SLGSIIGLSLKIKLDRCLPRRFCSVVYCKD--GTHENAISLNKQINDKERALAALTNKNI
Cia2_Tri_vag   SLPAVLGLCIRERLLQVLPQRFHSKIFITVARGKHIQEDSINRQLRDKERCLAALERRNI
Cia2_Cry_par   SQASLIGLSLYYKLHTVFNKNFKIIIKVVK--GTHDLEDSINKQLKDKERVHAALENPQI


Cia2_Tra_Hom Cia2_Enc_cun Cia2_Enc_int Cia2_Nos_cer Cia2_Ent_bie Cia2_Sac_cer Cia2_Neu_cra Cia2B_Hom_sap Cia2A_Hom_sap Cia2_Gia_lam Cia2_Tri_vag Cia2_Cry_par

LNLIGDCIGPCSV-------LDLMEKCLPAI---------LDLMKECLPRHGELLDMPK-LEIIDECLVSIIDKYDL---LEIIENSINK----------LGVVSKMLVTCK--------KGIIEKMLETCV--------LEVVNQCLSARS--------REIVEQCVLEPD--------ASVVNTAIRI----------RTMIDNCIACDDEEE-----YKTITKGLANSDVWEDQSLLY

Abbreviations: Tra_hom, Trachipleistophora hominis; Enc_cun, Encephalitozoon cuniculi; Enc_int, Encephalitozoon intestinalis; Nos_cer, Nosema ceranae; Ent_bie, Enterocytozoon bieneusi; Sac_cer, Saccharomyces cerevisiae; Neu_cra, Neurospora crassa; Hom_sap1-3, Homo sapiens; Gia_lam, Giardia lamblia; Tri_vag, Trichomonas vaginalis; Cry_par, Cryptosporidium parvum. The DUF59 (PF01883) domain, identified using SMART (http://smart.embl-heidelberg.de) and the Pfam database (http://pfam.xfam.org), is highlighted in light blue. Yeast mutants in which the hyper-reactive cysteine (red; universally conserved across the aligned sequences) is replaced by alanine are not viable37. Cia2 is part of the CIA targeting complex (with Cia1 and Mms19), which facilitates the insertion of [4Fe-4S] clusters into cytosolic and nuclear apoproteins38,39. Humans possess two isoforms of Cia2. CIA2B, together with CIAO1 and MMS19, is required for maturation of the bulk of cytosolic and nuclear Fe/S proteins. CIA2A binds to CIAO1 and IRP2 and is involved in cellular iron regulation. The human CIA targeting complex CIA2B-CIAO1-MMS19 binds to numerous Fe/S proteins, presumably reflecting the apoforms38,39.
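The statement that a residue is "universally conserved across aligned sequences" can be checked column by column. The sketch below is illustrative only: it uses the last seven alignment columns of the Cia2 block that ends in "...PTIPHC", copied from the rows above, and a hypothetical helper named conserved_columns. The colour highlighting of the original figure is not preserved in this text, so whether the invariant cysteine found here is the residue marked in red is an assumption, not something this check can confirm.

```python
# Last seven alignment columns of one Cia2 block, copied from the alignment
# above (one string per sequence). The helper name is hypothetical.
rows = {
    "Cia2_Tra_Hom": "QPTIPHC",
    "Cia2_Enc_cun": "KPTIPHC",
    "Cia2_Enc_int": "KPTIPHC",
    "Cia2_Nos_cer": "EPTIPHC",
    "Cia2_Ent_bie": "TPTVPHC",
    "Cia2_Sac_cer": "TPTITHC",
    "Cia2_Neu_cra": "TPTVNHC",
    "Cia2B_Hom_sap": "TPTIPHC",
    "Cia2A_Hom_sap": "TPTVPHC",
    "Cia2_Gia_lam": "TPTTPTC",
    "Cia2_Tri_vag": "TPTVPNC",
    "Cia2_Cry_par": "KPTVTSC",
}


def conserved_columns(aligned):
    """Return (column index, residue) for every column that holds the same
    non-gap residue in all rows of the alignment excerpt."""
    seqs = list(aligned.values())
    width = len(seqs[0])
    if any(len(s) != width for s in seqs):
        raise ValueError("all aligned rows must have the same length")
    hits = []
    for i in range(width):
        column = {s[i] for s in seqs}
        # A conserved column has exactly one distinct symbol, and not a gap.
        if len(column) == 1 and "-" not in column:
            hits.append((i, seqs[0][i]))
    return hits


print(conserved_columns(rows))  # [(1, 'P'), (2, 'T'), (6, 'C')]
```

Within this excerpt the proline, threonine and final cysteine columns are invariant across all twelve sequences, while the remaining columns vary.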


h. Hydrogenase-like Nar1

Nar1_Tra_hom Nar1_Enc_cun Nar1_Enc_int Nar1_Nos_cer Nar1_Sac_cer Nar1_Neu_cra Nar1_Hom_sap Nar1_Tri_vag Nar1_Gia_int Nar1_Ent_his Nar1_Cry_par

MRRP-----------HMNKKINPCIK-----------------------MDAL--IRPPMSFFADLPKDNKKCIK-----------------------MSFF----------AGLSKNNQKCIK-----------------------MNT----------------ENKPCI------------------------MSAL----LSESDLNDFISPALACVKPTQ------VSGGKKDNVNMNGEY MSAI----LSVDDLNDFISPGVACIKPIETLPTAAPPAGDANSSLEVEVI MASPFSGALQLTDLDDFIGPSQECIKPVKVEKRA-GSGVAKIRIEDDGSY MSAD-----------PAASTSFDCLHPVSIEE-----------------MSLK------VKVASDLNTLPEECVVPLKPADAP-STGTVKLRLK----MSLS--VGLQIAGVDDYIQQNLVCVMPLKETPP--QEHKGAAKISLGGPMFST---AVKLANLDDYLESSQDCIVSLLSDK---DDTKPKIAVMRPAKA

Nar1_Tra_hom Nar1_Enc_cun Nar1_Enc_int Nar1_Nos_cer Nar1_Sac_cer Nar1_Neu_cra Nar1_Hom_sap Nar1_Tri_vag Nar1_Gia_int Nar1_Ent_his Nar1_Cry_par

----------GEKPFELTLNDCLACSGCV-SMEETKITAE----DLQ------------IGSPLALSLSDCLACSGCV-SADEAGALSEDLS-FVL------------TDTPFNLSLSDCLACSGCV-TTDEAGALSADIS-FVR--------------SPFTFKLEDCLACSACI-ADFSVPKIPTY--TELK-----EVSTEPDQLEKVSITLSDCLACSGCI-TSSEEILLSSQSHSVFLKNW LDGQQPEAKSNAPPAEISLTDCLACSGCV-TSAEAVLVSLQSHNEVLNML FQINQDGGTRRLEKAKVSLNDCLACSGCI-TSAETVLITQQSHEELKKVL ---RGRVKADDEATFKVTLQDCLACSGCAITKDEITIISEQNTSRIFEKL ---ACDAAPVSSTPVKITINDCLMCSGCV-TSAEEVFFRELNTTALQNAI ---EEGNELPKLTKVTVRLEDCLACSGCI-TSAETVLIEQQGLPEFRKNI QGNKDDKKSGTSDKATVNVADCLACSGCV-TSAEAKLLEDQNVSEFMNIL

Nar1_Tra_hom Nar1_Enc_cun Nar1_Enc_int Nar1_Nos_cer Nar1_Sac_cer Nar1_Neu_cra Nar1_Hom_sap Nar1_Tri_vag Nar1_Gia_int Nar1_Ent_his Nar1_Cry_par

--------------------------NVPINLIMSHYSVLNLCNEIKKQQ ------------------------DLSPQTSFVLSPQSKINIFNLYR--------------------------DSAPQTSFVLSPQSKVNIFNIYG----------------------------DVPLTFLLSPHSKMNMYAHYN--GKLS-------------------QQQDKFLVVSVSPQCRLSLAQYYG--DSAPALKLVGPDANGKHSVQGLENSDAKLYVASVSPQSRASLAAACG--DANKM----------------AAPSQQRLVVVSVSPQSRASLAARFQ--------------------------DEVKDYIVLVATHVVANLAAVRN--TSGP--------------------KAGRPIVLSLSQSAILSLSRVL---KEL---------------------SQRKKVICTIADECIASMSVVHN--------------------------KQKRLTVVSISNQSCSSFACHLN---

Nar1_Tra_hom Nar1_Enc_cun Nar1_Enc_int Nar1_Nos_cer Nar1_Sac_cer Nar1_Neu_cra Nar1_Hom_sap Nar1_Tri_vag Nar1_Gia_int Nar1_Ent_his Nar1_Cry_par

LKQNKDDSTNLHAITVKEWYA-YLQNALRRKFSVK--------R---------------EDGMEYREFEA-VLSSFLRSKFNIH--------R---------------KGNMSYREFEG-ALSSFLRAKLNVR--------R---------------NHIMSFSDFEY-HLISFIKLKFNVI--------K------------------LTLEAADL-CLMNFFQKHFQCK--------Y---------------NG-VTEQQAGR-MIEQLFLGEQGLARGGKWGNKF------------------LNPTDTAR-KLTSFF-KKIGVH--------F------------------WSAAKAFS-TIKQLF-L-------SKGAQKV------------------LDITSVEPCTVDTLFK-QLEYALRTRVAGLRHCTYED -------------QPFNVVWT-RVEKAL-KKEGVD--------E------------------CDLITIQR-KLSGLF-KHIGAR--------F------


Nar1_Tra_hom  -----IVNTYDYQQLSNYMIYNE-----------L--LTKNKLILSECPG
Nar1_Enc_cun  -----IVDTSYLRSKIYEETYRE-----------Y--MATNHLIVSACPG
Nar1_Enc_int  -----IVDTSSMRRKIYKEIYKE-----------Y--LATDHLVISACPG
Nar1_Nos_cer  -----VYDTSYVKNILYDSVYE------------E--STRDKIIISDCPG
Nar1_Sac_cer  -----MVGTEMGRIISISKTVEKI--IAHKKQKENTGADRKPLLSAVCPG
Nar1_Neu_cra  ---TWVVDTNTAREATLVLGSDEV--LGGLIAPSD--KAATPVLTASCPG
Nar1_Hom_sap  -----VFDTAFSRHFSLLESQREF--VRRFRGQAD-CRQALPLLASACPG
Nar1_Tri_vag  -----VLDTD-IQLVFRRLVVKEF--IEN--------QTLSPFMISRCAG
Nar1_Gia_int  APPVYVVSEAQHQEQSVLMNVRQISFLMQS---SE--PRSNIAIITHCPA
Nar1_Ent_his  -----LRDLSQAQDISLFGIYDEF--KEYQ-------KMNKVLLTSTCPG
Nar1_Cry_par  -----VMNSTISEYISLLETKYEF--ISRYKA-----KSDLPMIISHCPG

Nar1_Tra_hom Nar1_Enc_cun Nar1_Enc_int Nar1_Nos_cer Nar1_Sac_cer Nar1_Neu_cra Nar1_Hom_sap Nar1_Tri_vag Nar1_Gia_int Nar1_Ent_his Nar1_Cry_par

VVAYVERRR-PDLIPYLSEVPSLQQMCAYLCEEEG--------------VVTYIERTA-PYLIGYLSRVKSPQQMAFSLVKG----------------VVAYVERTA-PHLIDYLSRVKSPQQMAFSLVKG----------------TVSYIERQA-HHLIEYLSVTSTQQQAIKLLAQ-----------------FLIYTEKTK-PQLVPMLLNVKSPQQITGSLIRATFESLAI---------WVCYAEKTH-PYVLPHLSRVKSPQALMGTLLKTSLSRILD---------WICYAEKTHGSFILPHISTARSPQQVMGSLVKDFFAQQQH---------SVVYYERKT--SYADHLAQIKPYPQLYAMYEKKILQ-------------VRLFITKRN-RELISYIVSTASPMELFGASYCGIDA-------------WVCYSEKMQGKWMFEYMSKVASSMTIAGMIMKKQNS-------------WICYSEKSLNSSVLPLLSKVRSAQQLQGILIKTLTLEIYNQLLFLYKFRL

Nar1_Tra_hom Nar1_Enc_cun Nar1_Enc_int Nar1_Nos_cer Nar1_Sac_cer Nar1_Neu_cra Nar1_Hom_sap Nar1_Tri_vag Nar1_Gia_int Nar1_Ent_his Nar1_Cry_par

-------------------------VLTVGVMQCHDKRLEGGV---RV--------------------------SRTVSVMPCQDKKLENGRDGVK---------------------------DRTVSVMPCQDKKLESGRDGVK---------------------------GRSISVIQCYDKFLENSDDVLTT----------------------ARESFYHLSLMPCFDKKLEASRPESLD---------------------IAPERIWHLAVMPCFDKKLEASREELTDAV --------------------LTPDKIYHVTVMPCYDKKLEASRPDFFNQ-----------------------STNYVLYIGPCYDRKLEAARFEED--------------------------SPLLVSIQPCQDRKLEQFRGAAV--------------------------EIYHVSIQMCFDKKLEATKTYNNI-SNSYRTNMNVKSTFTQNDNFVEQSDIFHVAIMPCHDKKLESTRSSLSLK-

Nar1_Tra_hom Nar1_Enc_cun Nar1_Enc_int Nar1_Nos_cer Nar1_Sac_cer Nar1_Neu_cra Nar1_Hom_sap Nar1_Tri_vag Nar1_Gia_int Nar1_Ent_his Nar1_Cry_par

--------------DYVLSSKDIYEICSD--VRINFEVDNNDRDIYDDVG -------------FDFILTTRGFCKALDS--LGFRRPARAS---------------------FDHVLTTREFRRVLDE--LEFELFLKAN----------------------------YDFYKMILD--LGFLQHNFIK-----GTCE -----------DGIDCVITPREIVTMLQE--LNLDFKSFLT-----EDTS WAGDGKPGRGVRDVDCVITSKEVLMLAAS--RGFDFFSLSA--SMPPQTP -------EHQTRDVDCVLTTGEVFRLLEE--EGVSLPDLEP---------------------VDAVLTIAEINDHITE--PTEEIPVKFP----------------------DVCLTAQEVHSFLAGTPQGSSPPA---------FCS -----------HVIDCVLTTSEIDSIIDW--NE-------------------SSDKNSSCPEVDIVLATSEVGEIIKL--AGFNSLLDVP--------E


Nar1_Tra_hom Nar1_Enc_cun Nar1_Enc_int Nar1_Nos_cer Nar1_Sac_cer Nar1_Neu_cra Nar1_Hom_sap Nar1_Tri_vag Nar1_Gia_int Nar1_Ent_his Nar1_Cry_par

VGNEGKGDNNGRDIHDDVGVGNEGKGDNNGRDTYD--------------GKSLCS-------------------------------------------SWNHHE-------------------------------------------KWEKCT--------------------------IGY--------------LYGRLS-------------------------------------------RFPDQL-------------------------I-----------------APLDSL-------------------------C-----------------ADTDLN-------------------------------------------SYTPSP-----------------------------------------------PI-------------------------------------------APLDNL-------------------------WLNQNFQITKKHNLSLLIT

Nar1_Tra_hom Nar1_Enc_cun Nar1_Enc_int Nar1_Nos_cer Nar1_Sac_cer Nar1_Neu_cra Nar1_Hom_sap Nar1_Tri_vag Nar1_Gia_int Nar1_Ent_his Nar1_Cry_par

----------DVGVGNEGKGDNNGRDIYDDVGVNGKDNNIYETCNGVHTN ----------MEE---------AETTQW-------------------------------IED---------MEVTQW-------------------------------HYG---------YLEHILNKKNNYLTKNDSIKIFNSK------------PPG---------WDPRVH------------------W-------HDFLFRPG---------HRQQ----------------------------SG---ASA---------EEPT---------------------------------AIS---------QKLGQI-----------------------------------------------TS-------------------------------------------NEITST---------------------ENYVSNQILNQFS---------WLIPSY----------------------

Nar1_Tra_hom Nar1_Enc_cun Nar1_Enc_int Nar1_Nos_cer Nar1_Sac_cer Nar1_Neu_cra Nar1_Hom_sap Nar1_Tri_vag Nar1_Gia_int Nar1_Ent_his Nar1_Cry_par

DKDDNNTNTHNDRLSPEQHSKNSSEHVLTNEKEFLTGITPYLIKM-----------------NIGT-SSGGYAEFILG------KHC------------------------NIGT-SSGGYAEFILS------KHR----------------------------------------NEEVRLNGKRKL------------------AS-NLGG-TCGGYAYQYVT------AVQRLHPGSQ---------------S-REAG-TSGGNMHFILR------HLQAKNPGSQ---------------S-HRGG-GSGGYLEHVFR------HAARELFGIHV----------------KDSL-NSDSIYQLIAE------IEP------------------------FWQY-ALGPLLVLY-L------RAKEWISDEGLSRLL -------------RMKG-FISSPAQYIAL------MEQKKE----------------------FNS-NSGGFCEYIIR------SAIKELAGDHI----

Nar1_Tra_hom Nar1_Enc_cun Nar1_Enc_int Nar1_Nos_cer Nar1_Sac_cer Nar1_Neu_cra Nar1_Hom_sap Nar1_Tri_vag Nar1_Gia_int Nar1_Ent_his Nar1_Cry_par

--------LNLTIERIQTI----------NKSYRVLHFK-------NSSL ---------VVETREIRNGI-------------KEH-LLDD-G------R ---------SVKKIKDRNGI-------------REY-MVDD-G------G ---------NINTVTLNKINIPEIDYKYKNGVFTYTQIKN------NKKK --------MIVLEGRNSDIV-------------EYR-LLHD-D---RIIA --------IQTVPGRNADVV-------------EYK-LIAEAG---EVMF ------AEVTYKPLRNKDFQ-------------EVT-LEKE-G---QVLL ---------TLNEEEINSLI-------------SEL-PSRF-----DLEI VQRADGVDLHWTKIGNELFS-------------CTI-ELSQ-NQSPYSCV ---------PFKVTRNKDFL-------------END------G---------DNKVQLPFNKLKN-DIL-------------EAK-YIKN-N----VEL


Nar1_Tra_hom Nar1_Enc_cun Nar1_Enc_int Nar1_Nos_cer Nar1_Sac_cer Nar1_Neu_cra Nar1_Hom_sap Nar1_Tri_vag Nar1_Gia_int Nar1_Ent_his Nar1_Cry_par

TFAHITGLKNLLNFLND--------------------------------TISQITGLENSINYFKSSKT-----------------------------IVSQITGLENSINYFKISKT-----------------------------KYLRILGLEPFLNFIKESKH-----------------------------AASELSGFRNIQNLVRKLTSGSGSERKRNITALRKRRTGPKANSREMAAA KAARYYGFRNIQNLVRKLKP-------------AKTSRMPGGKPFGSAKR HFAMAYGFRNIQNLVQRLKR-----------------------------STNSFDGETLNKRLTKTLDMM-------------SSG------------IIYKSTGYHNLQNLVRRIHA-------------LCP--------------IAIANGFRNIQNVVRFVK------------------------------NYCLAYGFRAIQSISRKLNL-------------QKNASQ-------NTQY

Nar1_Tra_hom Nar1_Enc_cun Nar1_Enc_int Nar1_Nos_cer Nar1_Sac_cer Nar1_Neu_cra Nar1_Hom_sap Nar1_Tri_vag Nar1_Gia_int Nar1_Ent_his Nar1_Cry_par

-----EEMRYNFVDIFLCDDRCFGGPGQIKNN-VQNDY-AHYFKLV--------KGPRHKMTEIFLCKNGCIGGPGQERVNDVEMDI-REY-DRN--------KGPKYKMAEVFLCKNSCIGGPGQERINSVEVDS-AEY-NIY--------KELEYDVCEIYICNQGCINGPGQLYTDNLYVNN-SEYIDID---TAATADPYHSDYIEVNACPGACMNGGGLLNGEQNSLKR-KQLVQTL---PAGKASGLDYGYVEVMACPGGCTNGGGQIKVDDQVVVDRKGLAVKPGPQE -----GRCPYHYVEVMACPSGCLNGGGQLQAPDRPSRELLQHVERL---Y --KKVPKPAPRLAQIDFCKGGCLVGGGQIRGNSPAQRR-A-LIAATQ-------NKSALYILDLHACPYGCYGGA-CIAGDDRHPVSSV--ASAS--HA -----SKTKLQFIEVEACPGGCICGGGQIKCSPKEKDE-RVK-KMME-IL KQSVVNHVNYHLIEAMACPTGCVSGGGQILSQNDQNDDNSDL-NKL--RK

Nar1_Tra_hom Nar1_Enc_cun Nar1_Enc_int Nar1_Nos_cer Nar1_Sac_cer Nar1_Neu_cra Nar1_Hom_sap Nar1_Tri_vag Nar1_Gia_int Nar1_Ent_his Nar1_Cry_par

---GLNTNEEV------------------------------------IVP ----GREQPRIFYSS--------------------------------------GKEQPEIYYSD---------------------------------------------------------------------------------LGI ---NKRHGEELAMVD---------------PLT--LGPKLEEAAARPLSL QKEWQKEVDEAYFSG-----DESGSRAQDESLDLVVDGISPSHIRNVLTH GMVRAEAPEDAPGVQ------------------------------ELYTH ---EVHTQNESTNIS---------------------------FPTELYNE ATMSADKAVLSHILAADTCAGL-------------LEGLVQAVDRITLQE EPKVVDEKNKSIYES----------------------------------NIKFIDEVQEALYKG-----INLN-KNQEV-----ILPDEIPIVNILYEY

Nar1_Tra_hom  QVLKDK---------KRIFENR--------KIYR--------SNFKVEW
Nar1_Enc_cun  -PGLEE---------KRVFREV--------KAKR--------VDLRVDW
Nar1_Enc_int  -PGLEE---------KRTFRPV--------SVKR--------IDFKVDW
Nar1_Nos_cer  DCKQLQ---------KRTYRKI--------ETKK--------VNFKIEW
Nar1_Sac_cer  EYVFAP-----------VKQ----------------AVEKDLVSVGSTW
Nar1_Neu_cra  WSTLTGIQLER--LAYTSYREVVSDVGKEKKMTDTERVVQLAGKIGGGW
Nar1_Hom_sap  WLQGTDSECAGR-LLHTQYHAV----------------EKASTGLGIRW
Nar1_Tri_vag  L--------IKF-GYKTHYESL----------------PQEEEKDQFAW
Nar1_Gia_int  TVIRTD---------GEVVSPE--------EIAARKGIQSGVRIQDLAW
Nar1_Ent_his  -IKDSI---------KLTFIDR--------KESAQ------ENALHLNW
Nar1_Cry_par  LIHIDK-QIDRSSGLKLPFLRN--------DFVSI---NEVPTASSLKW


Abbreviations: Tra_hom, Trachipleistophora hominis; Enc_cun, Encephalitozoon cuniculi; Enc_int, Encephalitozoon intestinalis; Nos_cer, Nosema ceranae; Sac_cer, Saccharomyces cerevisiae; Neu_cra, Neurospora crassa; Hom_sap, Homo sapiens; Tri_vag, Trichomonas vaginalis; Gia_int, Giardia intestinalis; Ent_his, Entamoeba histolytica; Cry_par, Cryptosporidium parvum. Nar1 proteins are related to Fe-only hydrogenases40 and contain two conserved cysteine motifs. One is located at the N-terminus and the second is distributed between the central part of the protein and its C-terminus; the cysteine residues forming these motifs are coloured in red and blue, respectively. In yeast, each cysteine motif coordinates a [4Fe-4S] cluster and both are essential for the assembly of cytosolic Fe/S proteins41. It has been shown that the C-terminal Fe/S cluster is stably bound to the protein and that its assembly depends on the Fe/S cluster from the N-terminal cysteine motif42. A conserved C-terminal tryptophan characteristic of the Nar1 protein family is highlighted in yellow.
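Because the red and blue colouring of the cysteine motifs is lost in this text, a simple pattern scan can help locate a candidate N-terminal cysteine cluster in the rows above. The following is a minimal sketch under stated assumptions: the pattern C-x2-C-x2-C is read off the visible rows (for example DCLACSGCV) and is not a published motif definition, and the helper name find_motif is hypothetical. The four excerpts are copied from the second Nar1 alignment block.

```python
import re

# Excerpts around the N-terminal cysteine-rich stretch, copied from the
# Nar1 alignment rows above (gaps retained as "-").
excerpts = {
    "Nar1_Neu_cra": "EISLTDCLACSGCV-TSAEAV",
    "Nar1_Hom_sap": "KVSLNDCLACSGCI-TSAETV",
    "Nar1_Tri_vag": "KVTLQDCLACSGCAITKDEIT",
    "Nar1_Gia_int": "KITINDCLMCSGCV-TSAEEV",
}

# Illustrative pattern only (an assumption read off these rows):
# three cysteines, each separated by exactly two residues.
MOTIF = re.compile(r"C.{2}C.{2}C")


def find_motif(aligned_seq):
    """Strip gap characters, then return (start, matched text) for the
    first C-x2-C-x2-C match, or None if the pattern is absent."""
    ungapped = aligned_seq.replace("-", "")
    match = MOTIF.search(ungapped)
    return (match.start(), match.group()) if match else None


for name, seq in excerpts.items():
    print(name, find_motif(seq))
```

Gaps are removed before matching so that an alignment gap falling inside the motif would not hide it; positions are therefore indices into the ungapped excerpt, not alignment columns.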


i. Fe/S protein Dre2

Dre2_Tra_hom Dre2_Enc_cun Dre2_Enc_int Dre2_Ent_bie Dre2_Sac_cer Dre2_Neu_cra Dre2_Hom_sap Dre2_Cry_par

MATE----------------------------------------N----MEDK---------------------------------------------MEDK---------------------------------------------MTSS----------------------------------------D----MSQYKTGLLL-IHPA---------------------------VTTTP--MSPITLDLTSDFNPA-NTTGAGSSSSQP-----RTLLVAPPSVASHE--MADFGISAGQ-FVAVVWDKSSPVEALKGLVDKLQALTGNEGRVSV----MTQLIITHQ---------SDSKLEES-------EVFLSELNRIKKEEDKF

Dre2_Tra_hom Dre2_Enc_cun Dre2_Enc_int Dre2_Ent_bie Dre2_Sac_cer Dre2_Neu_cra Dre2_Hom_sap Dre2_Cry_par

--DDF---KKLLEG-------------------------------------EEL---RKLLRS-------------------------------------EEL---RKLLRS-------------------------------------EEL---KELLKQ-------------------------------------ELVENTKAQAASKKVKFVDQFLINKLNDGSITLENAKYETVHYLTPEA --ERI---SALFSTYPRDTTDLHMLDRLAAGLVTLPTSTY-----------ENI---KQLLQSAHKESSFDIILSGLVPGSTTLHSA-----------GKFSS---LSDLRAIVKKGEFRIVSIYLSSGSILGEIF------------

Dre2_Tra_hom Dre2_Enc_cun Dre2_Enc_int Dre2_Ent_bie Dre2_Sac_cer Dre2_Neu_cra Dre2_Hom_sap Dre2_Cry_par

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------QTDIKFPKKLISVLADSLKPNGSLIGLSDIYKVDALINGFEIINEPDYCW -----------DLILVLTDPDGSRHAEASALLSNRAVWSLLVPALKAGGK --------EILAEIARILRPGGCLFLKEPVETAVDNNSKVKTASKLCSAL ------TFEFLKEFYGVLDFGSVLKVNI---LALDSIDKVKAFER---NL

Dre2_Tra_hom Dre2_Enc_cun Dre2_Enc_int Dre2_Ent_bie Dre2_Sac_cer Dre2_Neu_cra Dre2_Hom_sap Dre2_Cry_par

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------IKMDSSKLNQ------------TVSIPLKKKKTNNTKLQSGSKLPT--FK LRSEDGTLGRDTTTPEARE---AVLAGL---------VAGADGFTKPDYA TLSGLVEVKELQREPLTPEEVQSVREHLGHESDNLLFVQITG--KKPNFE LFSGFIKVKKLKGDGLN-----SSDSDF---------------------E

Dre2_Tra_hom Dre2_Enc_cun Dre2_Enc_int Dre2_Ent_bie Dre2_Sac_cer Dre2_Neu_cra Dre2_Hom_sap Dre2_Cry_par

---------ALKVKNR---R-----TEE------------------------------MMRKSTDP--RT-------------KM----------------------TIRKNTDP--RV-------------KI----------------------ATTYKQDP--R-----------------------------KASSSTSNLPSFKKADHSRQPIVKETDSFKPPSFKMTTEPKVYRVVDDLEEEAVPLRFGLKRKTNP--NPVVAP—IQPVAQVVTAAPAGVGFVTLDL-VGSSRQLKLSITKKSSP—-SVKPAVDPAAAKL-------------WTLSIVIKAEKP--SWKPEE----------------------------------

Dre2_Tra_hom Dre2_Enc_cun Dre2_Enc_int Dre2_Ent_bie Dre2_Sac_cer Dre2_Neu_cra Dre2_Hom_sap Dre2_Cry_par

----------D-----------------EKDF--LD-EDA-------------------Q-----------------DEYL--TD-EDK---AIQ-------------S-----------------DEYL--TD-EDK---AET-------------H-----------------RIPM--VDRPSK-------------IEDSDDDDFSSDSSKAQYFDQVDTSDDS--IE-EEE-LIDEDGSGK NDDLDLDGEDD-----------------DDDV--ID-EDT-LLTEADLRR ----ANDMEDD-----------------SMDL--ID-SDE-LLDPEDLKK ----------G-----------------KVLVDDID-LEGSVPDIKNYVP


Dre2_Tra_hom Dre2_Enc_cun Dre2_Enc_int Dre2_Ent_bie Dre2_Sac_cer Dre2_Neu_cra Dre2_Hom_sap Dre2_Cry_par

---SE--SVSNSRRVQKKKRACENCVCGRAEGKKKLSREELKKM---------------RSERPPAKKRACKDCTCGLKEEQEV------------------------KTLRTPAKKRACKDCTCGLKEKQEV-----------------I---LTQSNKCRENKARKCSNCTCNKNTNTNNT-------------S---MITMITCGKSKTKKKKACKDCTCGMKEQEENEINDI-RSQQDKV-P---IQQPPECQPKPGKKRRACKDCTCGLAERLEAEDKAR-RDKADQALN PDPASLRAASCGEG--KKRKACKNCTCGLAEELEKEKS---REQ--------LGQ----GKESCKSKERACNNCNCGRADLEKEIGVEAARKV------

Dre2_Tra_hom  ----SREEV-------KKLANAGCGGCKMGDAFRCGDCPFYGLPSFEEGD
Dre2_Enc_cun  ------------------RTRSACGNCYKGDAFRCSGCPSLGLPPYEPGD
Dre2_Enc_int  ------------------EVRSACGNCYKGDAFRCSGCPSLGLPPYEPGE
Dre2_Ent_bie  ------------------IYKSKCGSCHLGDPFRCSSCPYKGLPPFNEGD
Dre2_Sac_cer  -VKFTEDELTEIDFTIDGKKVGGCGSCSLGDAFRCSGCPYLGLPAFKPGQ
Dre2_Neu_cra  TLKLKSEDLLELDLTV-PGKTGSCGSCALGDAFRCAGCPYLGLPPFKVGE
Dre2_Hom_sap  --------M-------SSQPKSACGNCYLGDAFRCASCPYLGMPAFKPGE
Dre2_Cry_par  ----YQEKVE------TGTARSSCGNCYLGDAFRCSGCPYKGMPAFKPGE

Dre2_Tra_hom Dre2_Enc_cun Dre2_Enc_int Dre2_Ent_bie Dre2_Sac_cer Dre2_Neu_cra Dre2_Hom_sap Dre2_Cry_par

EVFFD--------------------------------------------VVSFSMDL--------SNE-----------------------------FQ VVSFSTDL--------DEG-----------------------------LQ EINFD--------------------------------------------PINLDSIS--------D--------------------------------EVSILNNV--------P--------------------------------KVLLSDSN-----------------------------------------KVSLANAEGDANDHTVDMNLIHEEKVDLITTTFDDDGSGVNNVQSKGGVL

Dre2_Tra_hom Dre2_Enc_cun Dre2_Enc_int Dre2_Ent_bie Dre2_Sac_cer Dre2_Neu_cra Dre2_Hom_sap Dre2_Cry_par

---GEDA GQDG --EL --DL --QL LHDA KLNI

Abbreviations: Tra_hom, Trachipleistophora hominis; Enc_cun, Encephalitozoon cuniculi; Enc_int, Encephalitozoon intestinalis; Ent_bie, Enterocytozoon bieneusi; Sac_cer, Saccharomyces cerevisiae; Neu_cra, Neurospora crassa; Hom_sap, Homo sapiens; Cry_par, Cryptosporidium parvum. Dre2 is an essential protein in Saccharomyces cerevisiae and is conserved among animal species and other eukaryotes43. The human protein is called CIAPIN1 or anamorsin. Dre2 homologues have a conserved C-terminal domain (underlined with a blue bar), sometimes called a CIAPIN1 motif, which is present in the microsporidian sequences. In yeast, amino acids 173-348 within the CIAPIN1 motif interact with the FMN- and FAD-binding domains of Tah18. The protein contains an N-terminal [2Fe-2S] cluster and a C-terminal [4Fe-4S] cluster within the CIAPIN1 motif. The coordinating cysteine residues are marked in red44. In yeast, Tah18 transfers electrons to the [2Fe-2S] cluster of Dre229. In human and yeast, the C-terminus of the protein is connected to an N-terminal S-adenosylmethionine (SAM) methyltransferase-like domain, which is not known to bind SAM45. The SAM-like domain is not present in the microsporidian Dre2-like proteins and was not detected using CDD at the NCBI16. Notably, residues 1-172 of this region are known to be important for Dre2 function in Fe/S protein biogenesis, yet the region is not essential for yeast cell viability43-45. The acidic and serine (E, D and S)-rich patch identified in human and yeast Dre243 is shown in yellow. The E, D and S residues in this region of the microsporidian proteins are also labelled in yellow for comparison.


j. Diflavin-oxidoreductase Tah18

Tah18_Tra_hom  ----------------------------------------MSVQIPIVYGTQSGNSLHVS
Tah18_Enc_cun  -------------------------------------------MIPILYGSQTGTAIYVS
Tah18_Enc_int  -------------------------------------------MIPILYGSQTGTSIYVS
Tah18_Nos_cer  --------------------------------------------MIILYGSQTGNSIHIA
Tah18_Sac_cer  --------------------------------------MSSSKKIVILYGSETGNAHDFA
Tah18_Neu_cra  ------------------------MGEPVLAASVAKSTTMEGRNLILLYGSETGNSEEIA
Tah18_Hom_sap  ---------------------------------------MPSPQLLVLFGSQTGTAQDVS
Tah18_Cry_par  MPFFYINNALNYLSICNISIGFILVLKKLAQLSSEYIENNVPSNFSIFYSTETGNSRKIS

Tah18_Tra_hom  QLIRSKLTH-YTTHIVP-------------------VDEFLLENFLNCNTFIFVCSTYGN
Tah18_Enc_cun  NLIARAIMHGYDAKTIYNLDAFLYSPGQKDACLVMEMDLLDIEKILDIDLIIFVCSTHGD
Tah18_Enc_int  NLIERALLYGYDPKTIYNLDCFFNSRDQENLSSVMEMDLFDIEKILDIDFIIFVCSTHGD
Tah18_Nos_cer  KLIQNVILYGYNKDLIYNVDKELLPTD-----FTLDMDSFDFEKILDIDMIIFVCSTHGN
Tah18_Sac_cer  TILSHRLHRWHFSHTFCS------------------IGDYDPQDILKCRYLFIICSTTGQ
Tah18_Neu_cra  MELAKMAERLHFNTVVG------------------EMDDFKLTDLLRYSLAIFVTSTTGQ
Tah18_Hom_sap  ERLGREARRRRLGCRVQ------------------ALDSYPVVNLINEPLVIFVCATTGQ
Tah18_Cry_par  ELFKKLLDEISIEANVREIN---------------SILEENMYLNSNNSVFVFVVSTCGN

Tah18_Tra_hom  GDFPFAA--------QYFYNCLVSESVPNDFLKKLVRACPTIFLKSCNIAVFGLGDSSYA
Tah18_Enc_cun  GAEPFNM--------TKFWSFLSRDDLPSTILSHLS------------FAVFGLGDSSYE
Tah18_Enc_int  GTEPFNM--------TKFWSFLSNSDLPGNLLSHLN------------FAVFGLGDSSYE
Tah18_Nos_cer  GSEPFNM--------TKFWKFLRKKNLPTNFLQHLN------------FAVFGLGDSSYK
Tah18_Sac_cer  GELPRNVNALKGERPVTFWSFLKRKNLPSNLLNHIQT------------AMLGLGDSSYP
Tah18_Neu_cra  GDMPKNT--------TTLWKSLRRTKLNNTNCLAPVK-----------FSIFGLGDSSYP
Tah18_Hom_sap  GDPPDNM--------KNFWRFIFRKNLPST-ALCQMD-----------FAVLGLGDSSYA
Tah18_Cry_par  GSFPASS--------RKFIRYLSKMIKSGNEIFLGIK-----------YTIIGLGSSLYE

Tah18_Tra_hom  KFNFASKKLFKVFS-KLGANLFVERGNGNVQDDEG--------YYTALFPWIEKLITNLQ
Tah18_Enc_cun  KFNYCSKRLFNRLR-MLGARPVIRRGSGDSQDREG--------FLSDFRPWLLELTAYLR
Tah18_Enc_int  KFNYCSKKLFNRLR-MLGAKPVVRRGDGNAQDKEG--------FLSDLRPWLLELMAHFD
Tah18_Nos_cer  SFNFCSKKLYNCLL-KHGAKPLIRKGNGDSQDKEG--------FMGEFKTWIKDLYYILP
Tah18_Sac_cer  KFNYGIRKLHQRIVTQLGANELFDRLEADDQAMAGSNKGTGLGIESVYFEYEKKVLSFLL
Tah18_Neu_cra  KFNWAARKLRVRLL-QLGASEFFRPGEADERHENG--------LDSIYLPWYQELRESLL
Tah18_Hom_sap  KFNFVAKKLHRRLL-QLGGSALLPVCLGDDQHELG--------PDAAVDPWLRDLWDRVL
Tah18_Cry_par  YSFNSAALKLDKLISSLKGEKYCEIALLDEVNGNE------IDFKTWWNNTFLNRLGISD

Tah18_Tra_hom Tah18_Enc_cun Tah18_Enc_int Tah18_Nos_cer Tah18_Sac_cer Tah18_Neu_cra Tah18_Hom_sap Tah18_Cry_par

S-LNIKGAELIIKQKR-------------------------------------------TYISRPCTDFISTQPK-------------------------------------------P-LKIKHIDILSLKPE-------------------------------------------HYKLQNAKNFASCKND-------------------------------------------SKYPNRKVNGQIIKREELDPEVYLEPASYLQLS--------------------------SQFPLPKGIEPIPDDAPLPPKYNIRLVPSTGSLKDKITNGEGHVSQVEDNEQLAARFERM GLYPPPPGLTEIPPGVPLPSKFTLLFLQEAPS---------------------------T SKEHLNKHFSIAVTRKKDEFVCKLRSINEISLYN--------------------------

Tah18_Tra_hom Tah18_Enc_cun Tah18_Enc_int Tah18_Nos_cer Tah18_Sac_cer Tah18_Neu_cra Tah18_Hom_sap Tah18_Cry_par

-------------------------------LYDATIISKNMLTPCDHFQEILEVGLSVE -------------------------------RYVSRLVEKRVLTPDDHFQKIVEFVFDIP -------------------------------KYTSRLVGKRLLTPEGHFQKIIELVFEIP -------------------------------LYSASINDIKILTPYNYIYPILEIKFDI-----------DEHANEKFTSTKVIFEGDESLKVGRVNINKRITSEGHFQDVRQFKFSNV STESEATEAPGQKDGTDVPDFPPAKLLPIPGSFTAQVVCNKRVTPEDHWQDVRHIEFELR GSEGQRVAHPGSQE-------PPSESKP----FLAPMISNQRVTGPSHFQDVRLIEFDIL ------------HSLSGNCHVSEIYFKLIEFCPINRELLCETKTIDGIERQIFNLKLELP


Tah18_Tra_hom  NYDDFE-----PGDTIRIYPSN--YNWREFCD-YIGNVDDEDHVR--------------N
Tah18_Enc_cun  DYKEFF-----PGDCLSLLPEN--YNYREFMS-YNGIGDGDLDGVSSVWM-------LQN
Tah18_Enc_int  EYKEFS-----PGDCLSFCPEN--YNYKEFMK-YNGM-EEDVDGISSSLM-------MKH
Tah18_Nos_cer  DIENFE-----IGDCLAVYPEN--YNYEEFVR-YNNIKDNTLVKY------------IKK
Tah18_Sac_cer  DKIQEN---YEPGDTVTIYPCNTDEDVSRFLANQSHWLEIADKPLNFTSGVPNDLKDGGL
Tah18_Neu_cra  SPGRNGAMSFAG-QTLLIYPKNYPKDVQKLID-LMGWSEVAEQRIEIDWVKGTRPRDYHF
Tah18_Hom_sap  GSG----ISFAAGDVVLIQPSNSAAHVQRFCQ-VLGLD--PDQLFMLQPREPDVSSPTRL
Tah18_Cry_par  DKVQYK-----TFDIIDILPPNLDENITFFSSKVLGINSIEDLKNITVEFVPINNMTRNI

Tah18_Tra_hom  TMDYN----FVPFFRIFAEIVKYLEDEPLNDYVFKNPERKEAVI-----------RRIRE
Tah18_Enc_cun  VVDFN----SQPHQPFFFALRHFLGQK---------NRTEEEIL-----------LKIEE
Tah18_Enc_int  SIDFN----SQPHQPFFFALRYFLDRK---------GGIEDEVL-----------LKIEE
Tah18_Nos_cer  YCDFN----SIPQIYFFLQLSLITDT------------IAEEYR-----------EKCKE
Tah18_Sac_cer  VRPMT----LRNLLKYHCDFMSIPRTSFFLKIWTFATDVTKMERGQEQLNDQREKLRQFA
Tah18_Neu_cra  LKDAT----IRDVLTHNFDISAVPKRTFLEFMAYHTTNPLEKER-----------LHELT
Tah18_Hom_sap  PQPCS----MRHLVSHYLDIASVPRRSFFELLACLSLHELEREK-----------LLEFS
Tah18_Cry_par  SVPFPNNRSLMHILKYYFDLMTLPPHSVMLQFVPYLNSIEGELISNE--------SFFNE

Tah18_Tra_hom Tah18_Enc_cun Tah18_Enc_int Tah18_Nos_cer Tah18_Sac_cer Tah18_Neu_cra Tah18_Hom_sap Tah18_Cry_par

MSTDYEL-YFDYVKRPKRTFFEVLQDFAL------KLPFSFLKKIVPNIQPRYFTLTKRE IAQDYDL-YHDYVIRARRTVFEVLKDLRI------KVDIGFLKSFVPAMYPRFFSVTKKK IAQDYDL-YYEYIVKPKRTIFEVLQELRV------KVDARFLKKFVPTIYPRFFSVTKKK IYLNYDL-YYDYILLPKRTIFEVLKDFKI------KLTSNFMYKYIPVINPRYFTLTKKD TDQDMQD-LYDYCNRPRRSILEVLEDFIS--V---KLPWKYVLDYLPIIKPRYYSISSGP QRGDSDE-FYDYTSRPRRTILEVLEDFPG--V---KIPYTRLLEFP-IIRPREFSLCNGG SAQGQEE-LFEYCNRPRRTILEVLCDFPHTAA---AIPPDYLLDLIPVIRPRAFSIASSNKDSYEFSFHLFINRFMKSLIPIPIEKFVKFTGIRQYPRSYSISSSSLASPSMIDLTIST

Tah18_Tra_hom Tah18_Enc_cun Tah18_Enc_int Tah18_Nos_cer Tah18_Sac_cer Tah18_Neu_cra Tah18_Hom_sap Tah18_Cry_par

-------------------------SSYYLTIALVEYQNSIKAKRKGLCSQYLREINVK-------------------------GLYHITVAIVRYTTFLSEPRRGVCSEYLMSLSLN-------------------------GLYHVTVAVVNYKTILSQPRRGVCSEYLMGLSLN-------------------------FCYFVTVSLVSFKTSLKEERKGLCSEYLKLLTKGGD-----------------------PNIELTVAIVKYKTILRKIRRGICTNYIARLQEGDPAVNAKDLVISNEQDTTTTTTTDVYKFEILAALVHYRTIIRKPRQGLCSRYLRHLPVG---------------------------LLILVAVVQFQTRLKEPRRGLCSSWLASLDPGQ CIKG------------------QISAPLNEIISEGANKKNSKKIIKGLCSSFLFEFDLN-

Tah18_Tra_hom  --ESIKVGMVKSHLFYDSAN-------LLFICTGTGITLPRAFWN----FFTD----KNI
Tah18_Enc_cun  --DVIKIGVERSNLYFDSDK-------LLFVCTGTGITLPRACVN----EFKD----KEI
Tah18_Enc_int  --DKVPIGIGRSNLYFGSNK-------LLFICTGTGITLPRACIN----EFKD----KEI
Tah18_Nos_cer  --STIHVNIVQNRLNFSGKK-------ILFMCTGTGITLPRAFVN----YYSDLNFFKKV
Tah18_Sac_cer  --EQIRYKLQNNHIIKKEFLN----KPMILVGPGVGLAPLLSVVK--------AEISKDI
Tah18_Neu_cra  --TTVQIGIKPPSSPFAMDDPSFYSRPLIGVATGTGIAPFRALLQDRCLVQEDQQKLGPT
Tah18_Hom_sap  GPVRVPLWVRPGSLAFPETP----DTPVIMVGPGTGVAPFRAAIQERVAQGQTG-----N
Tah18_Cry_par  --LPVLGMIRSSSLNIDNVTS-----SALMFSHGSGIAPIRALLHERKYLINEKKIIKPA

Tah18_Tra_hom Tah18_Enc_cun Tah18_Enc_int Tah18_Nos_cer Tah18_Sac_cer Tah18_Neu_cra Tah18_Hom_sap Tah18_Cry_par

VVFYGYRHDDKDRLYVEEMKKNKNVKVVYAPSRMG------------------------VIFYGFRYKNKDFLYPDEWT-GRNVRMFTAASRD-------------------------VVFYGFRYRDRDFLYSDEWN-CKNVRMLTAASRD-------------------------ILLYGFRFRDVDFLYKEEFE-NKGIEIYPAVSRE-------------------------KLLFGCRYKDKDYIYKDMLEDWFRKGKIALHSSFSRDEENSP-----------------LLFFGCRNAAADFHFQAEWGTVPN---LTVYPAFSRDNDSSSTEEEETKLALQRAAGIYD FLFFGCRWRDQDFYWEAEWQELEKRDCLTLIPAFSREQEQ-------------------YLFYGCRTEN-EIIYKDELKDFKRIGALTEVFFALSKT----------------------


Tah18_Tra_hom Tah18_Enc_cun Tah18_Enc_int Tah18_Nos_cer Tah18_Sac_cer Tah18_Neu_cra Tah18_Hom_sap Tah18_Cry_par

-EELYVQEIFYKYF------HDSIDQYLIYVSGR---TRLNKEVRQMFVKKYG-------DKVYVQDVFRKSP------VEDVDQYLIFVSGN---SRLNREVKKLLEDVYG-------DKMYVQDVFNRNP------IEDIDEYLIFVSGN---SRLNKEVRKLFQKLYG-------DNKYIQDIYKTIKG-----LENIDDWLIFVSGN---SRLNNIIEKMLLDIYK-------GVKYVQDYLWRLGEEITNLVVNKDAVFFLCGSS---GKMPIQVRLTFIEMLKKWGNFSD AGKNYVQNQIRQHAAEVGELL-RQNPIIVVCGNS---GRMPKSVREALEDAAVGSGVVAD --KVYVQHRLRELGSLVWELLDRQGAYFYLAGNA---KSMPADVSEALMSIFQEEGGLCS -QKKYVKDIIPFYKHIILKVVDQNDSIIYICGKKEFVSGIKNEVASIISNNRS-------

Tah18_Tra_hom Tah18_Enc_cun Tah18_Enc_int Tah18_Nos_cer Tah18_Sac_cer Tah18_Neu_cra Tah18_Hom_sap Tah18_Cry_par

------------AELYFQAETW------------KRIVFQSETW------------RTIAFQSETW------------KKIYFQAETWEETAKKYLKEMEKSDRYIQETWKEEAKGWFDRK-ENCVYWQETWPDAAAYLARLQ-QTRRFQTETWA ----RNIIKKMFVEGRIFIESWN

Abbreviations: Tra_hom, Trachipleistophora hominis; Enc_cun, Encephalitozoon cuniculi; Enc_int, Encephalitozoon intestinalis; Nos_cer, Nosema ceranae; Sac_cer, Saccharomyces cerevisiae; Neu_cra, Neurospora crassa; Hom_sap, Homo sapiens; Cry_par, Cryptosporidium parvum. Tah18 is a member of the diflavin oxidoreductase family, which contains FMN and FAD cofactors that are reduced by cytosolic NADPH. The FMN binding domain of diflavin reductase proteins is located at the N-terminus, while the FAD/NADH binding domain is located in the C-terminal domain46. The FMN and FAD domains interact with Dre245. In vivo, Tah18 interacts with the Fe/S CIA component Dre2 and transfers electrons to the [2Fe-2S] cluster of Dre229. Motifs similar to those of other diflavin reductases are marked as follows. Residues involved in FMN binding are indicated in bold. Residues that make hydrogen bonds with phosphate groups of FMN are labelled in red. The tyrosine that stabilizes the FMN prosthetic group is labelled in green47. The FMN, FAD and NAD binding domains identified using SMART (http://smart.embl-heidelberg.de) are highlighted in pink (PF00258), yellow (PF00667) and light blue (PF00175), respectively.


Supplementary Figure 2. Phylogenetic trees for mitosomal and cytosolic Fe/S protein biogenesis components and for cytosolic and nuclear Fe/S proteins. Components of the mitosomal ISC pathway have originated from the mitochondrial endosymbiont. The CIA pathway appears largely bacterial in character, and not archaeal as might be expected given that the host for the mitochondrial endosymbiosis is now thought to have descended from an Archaeon. By contrast, important nuclear and cytosolic Fe/S proteins do appear to have originated from an Archaeon. Monophyly of eukaryotic sequences, including those from microsporidians, is generally observed, suggesting that there is strong negative selection against gene replacement and reflecting the important roles that Fe/S proteins play in eukaryotic physiology.


1 Mitochondrial / mitosomal Fe/S cluster (ISC) assembly components

[Phylogenetic tree: Mitochondrial Hsp70 (Ssc1 in yeast). Microsporidian sequences included: Trachipleistophora hominis, Nosema ceranae BRL01, Encephalitozoon cuniculi GB-M1. Scale bar: 0.2.]

Legend for all trees (taxon groups): Alphaproteobacteria, Betaproteobacteria, Gammaproteobacteria, Deltaproteobacteria, Epsilonproteobacteria, other bacterial groups, Euryarchaeota, Thaumarchaeota, Aigarchaeota, Crenarchaeota, Korarchaeota, Eukaryota.

[Phylogenetic tree: Jac1. Microsporidian sequences included: Trachipleistophora hominis, Encephalitozoon cuniculi GB-M1. Scale bar: 0.2.]


[Phylogenetic tree: Arh1. Microsporidian sequences included: Trachipleistophora hominis, Nosema ceranae, Encephalitozoon cuniculi GB-M1. Scale bar: 0.2.]


[Phylogenetic tree: Yah1. Microsporidian sequences included: Trachipleistophora hominis, Encephalitozoon cuniculi GB-M1. Scale bar: 0.2.]


[Phylogenetic tree: Yfh1. Microsporidian sequences included: Trachipleistophora hominis, Encephalitozoon cuniculi GB-M1. Scale bar: 0.2.]


[Phylogenetic tree: Nfs1. Microsporidian sequences included: Nosema ceranae BRL01, Encephalitozoon cuniculi GB-M1, Trachipleistophora hominis. Scale bar: 0.2.]


[Phylogenetic tree: Isu1. Microsporidian sequences included: Trachipleistophora hominis, Encephalitozoon romaleae SJ-2008, Encephalitozoon cuniculi GB-M1. Scale bar: 0.2.]


[Phylogenetic tree: Grx proteins, with clades labelled Grx3/5 and Grx1/2. Microsporidian sequences included: Nosema ceranae BRL01, Trachipleistophora hominis, Encephalitozoon cuniculi GB-M1. Scale bar: 0.2.]

[Phylogenetic tree: Atm1, with clades labelled Atm1 and Atm1-like. Microsporidian sequences included: Vittaforma corneae, Nematocida parisii ERTm1, Enterocytozoon bieneusi H348, Nosema ceranae BRL01, Encephalitozoon cuniculi, Encephalitozoon intestinalis ATCC 50506, Edhazardia aedis, Vavraia culicus, Trachipleistophora hominis. Scale bar: 0.2.]


2 Mitochondrial / mitosomal Fe/S cluster (ISC) assembly components lost in the Microsporidia

[Phylogenetic tree: Iba57. Scale bar: 0.2.]


[Phylogenetic tree: Isa1/2, with clades labelled Isa1 and Isa2. Scale bar: 0.2.]


[Phylogenetic tree: Mge1. Scale bar: 0.2.]


[Phylogenetic tree: Nfu1. Scale bar: 0.2.]

52

Thauera sp. MZ1T l

3 Components of the cytosolic Fe/S protein assembly (CIA) pathway

Tah18

[Phylogenetic tree: taxon labels and branch support values are not reproduced in this text version. Scale bar: 0.2.]

Cfd1/Nbp35/Ind1

[Phylogenetic tree with clades labelled Cfd/Nbp35 and Ind1: taxon labels and branch support values are not reproduced in this text version. Scale bar: 0.2.]

Ind1 is localized to mitochondria.

Nar1

[Phylogenetic tree: taxon labels and branch support values are not reproduced in this text version. Scale bar: 0.2.]

Cia2

[Phylogenetic tree: taxon labels and branch support values are not reproduced in this text version. Scale bar: 0.2.]

Note that the region that can be aligned between eukaryotes and prokaryotes is quite short for this protein (122 amino acids), and the specific relationships between the eukaryotes and any particular prokaryotic group are therefore tentative.

4 Nuclear and cytosolic Fe/S cluster-containing proteins

Rli1

[Phylogenetic tree: taxon labels and branch support values are not reproduced in this text version. Scale bar: 0.2.]

Elp3

[Phylogenetic tree: taxon labels and branch support values are not reproduced in this text version. Scale bar: 0.2.]

Ntg1

[Phylogenetic tree: taxon labels and branch support values are not reproduced in this text version. Scale bar: 0.2.]

Rad3/Chl1/Rtel1

[Phylogenetic tree with clades labelled Chl1, Rtel1 and Rad3: taxon labels and branch support values are not reproduced in this text version. Scale bar: 0.2.]

Dna2

[Phylogenetic tree: taxon labels and branch support values are not reproduced in this text version. Scale bar: 0.2.]

Although this tree is poorly resolved, the best BLAST hits of eukaryotic Dna2 are to sequences from the Haloarchaea; this is perhaps consistent with an archaeal origin for this gene, given the presence of good homologues in these and other Euryarchaeota. The Giardia and Plasmodium sequences included here are the most likely Dna2 candidates in these species, although the tree topology weakly excludes them from the otherwise monophyletic eukaryotic clade.

Pri2

[Phylogenetic tree: taxon labels and branch support values are not reproduced in this text version. Scale bar: 0.2.]

B-family DNA polymerases

[Phylogenetic tree with clades labelled Zeta, Delta, Epsilon and Alpha: taxon labels and branch support values are not reproduced in this text version. Scale bar: 0.2.]

Supplementary Figure 3. Quantitative analysis of immunogold labelling for iron-sulphur cluster (ISC) assembly components over the mitosome matrix. Distribution of immunogold labelling over mitosome matrix profiles compared with a similar number of random points (analysed as described in the Methods). Plots (a-e) show the cumulative fraction (Fraction) of gold labelling (blue) and random points (red) plotted against the distance from the inner membrane (nm). For both Kolmogorov-Smirnov and Mann-Whitney tests, p < 0.01 for (a-d) and p