Web document 7.8. ML inference of phylogeny using 13 globin protein sequences.

 

[1] These sequences come from Web document 7.1. They were obtained by searching NCBI Entrez Protein, following the information provided in Dayhoff et al. (1972), p. 20. The sequence names are shortened relative to those in Web document 7.1.

 

>mbkangaroo P02194 Macropus rufus (red kangaroo)
MGLSDGEWQLVLNIWGKVETDEGGHGKDVLIRLFKGHPETLEKFDKFKHLKSEDEMKASEDLKKHGITVL
TALGNILKKKGHHEAELKPLAQSHATKHKIPVQFLEFISDAIIQVIQSKHAGNFGADAQAAMKKALELFR
HDMAAKYKEFGFQG
>mbharbor_porpoise P68278 Phocoena phocoena 
MGLSEGEWQLVLNVWGKVEADLAGHGQDVLIRLFKGHPETLEKFDKFKHLKTEAEMKASEDLKKHGNTVL
TALGGILKKKGHHDAELKPLAQSHATKHKIPIKYLEFISEAIIHVLHSRHPAEFGADAQGAMNKALELFR
KDIATKYKELGFHG
>mbgray_seal P68081 Halichoerus grypus
MGLSDGEWHLVLNVWGKVETDLAGHGQEVLIRLFKSHPETLEKFDKFKHLKSEDDMRRSEDLRKHGNTVL
TALGGILKKKGHHEAELKPLAQSHATKHKIPIKYLEFISEAIIHVLHSKHPAEFGADAQAAMKKALELFR
NDIAAKYKELGFHG
>alphahorse P01958 Equus caballus
MVLSAADKTNVKAAWSKVGGHAGEYGAEALERMFLGFPTTKTYFPHFDLSHGSAQVKAHGKKVGDALTLA
VGHLDDLPGALSNLSDLHAHKLRVDPVNFKLLSHCLLSTLAVHLPNDFTPAVHASLDKFLSSVSTVLTSK
YR
>alphakangaroo P01975 Macropus giganteus (eastern gray kangaroo)
VLSAADKGHVKAIWGKVGGHAGEYAAEGLERTFHSFPTTKTYFPHFDLSHGSAQIQAHGKKIADALGQAV
EHIDDLPGTLSKLSDLHAHKLRVDPVNFKLLSHCLLVTFAAHLGDAFTPEVHASLDKFLAAVSTVLTSKY
R
>alphadog P60529 Canis lupus familiaris (dog)
VLSPADKTNIKSTWDKIGGHAGDYGGEALDRTFQSFPTTKTYFPHFDLSPGSAQVKAHGKKVADALTTAV
AHLDDLPGALSALSDLHAYKLRVDPVNFKLLSHCLLVTLACHHPTEFTPAVHASLDKFFAAVSTVLTSKY
R
>betadog XP_537902 Canis lupus familiaris (dog)
MVHLTAEEKSLVSGLWGKVNVDEVGGEALGRLLIVYPWTQRFFDSFGDLSTPDAVMSNAKVKAHGKKVLN
SFSDGLKNLDNLKGTFAKLSELHCDKLHVDPENFKLLGNVLVCVLAHHFGKEFTPQVQAAYQKVVAGVAN
ALAHKYH
>betarabbit NP_001075729 Oryctolagus cuniculus (rabbit)
MVHLSSEEKSAVTALWGKVNVEEVGGEALGRLLVVYPWTQRFFESFGDLSSANAVMNNPKVKAHGKKVLA
AFSEGLSHLDNLKGTFAKLSELHCDKLHVDPENFRLLGNVLVIVLSHHFGKEFTPQVQAAYQKVVAGVAN
ALAHKYH
>betakangaroo P02106 Macropus giganteus (eastern gray kangaroo)
VHLTAEEKNAITSLWGKVAIEQTGGEALGRLLIVYPWTSRFFDHFGDLSNAKAVMANPKVLAHGAKVLVA
FGDAIKNLDNLKGTFAKLSELHCDKLHVDPENFKLLGNIIVICLAEHFGKEFTIDTQVAWQKLVAGVANA
LAHKYH
>globinlamprey 690951A Lampetra fluviatilis (European river lamprey)
PIVDSGSPAVLSAAEKTKIRSAWAPVYSNYETSGVDILVKFFTSTPAAQEFFPKFKGMTSADELKKSADV
RWHAERIINAVNDAVASMDDTEKMSMKDLSGKHAKSFQVDPQYFKVLAVIADTVAAGDAGFEKLSMCIIL
MLRSAY
>globinsealamprey P02208 Petromyzon marinus (sea lamprey)
MPIVDTGSVAPLSAAEKTKIRSAWAPVYSTYETSGVDILVKFFTSTPAAQEFFPKFKGLTTADQLKKSAD
VRWHAERIINAVNDAVASMDDTEKMSMKLRDLSGKHAKSFQVDPQYFKVLAAVIADTVAAGDAGFEKLMS
MICILLRSAY
>globininsect P02229 Chironomus thummi thummi (midge)
MKFLILALCFAAASALSADQISTVQASFDKVKGDPVGILYAVFKADPSIMAKFTQFAGKDLESIKGTAPF
EIHANRIVGFFSKIIGELPNIEADVNTFVASHKPRGVTHDQLNNFRAGFVSYMKAHTDFAGAEAAWGATL
DTFFGMIFSKM
>globinsoybean 711674A Glycine max (soybean)
VAFTEKQDALVSSSFEAFKANIPQYSVVFYTSILEKAPAAKDLFSFLANPTDGVNPKLTGHAEKLFALVR
DSAGQLKASGTVVADAALGSVHAQKAVTNPEFVVKEALLKTIKAAVGDKWSDELSRAWEVAYDELAAAIK
AK
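
Before aligning, it can help to confirm that the FASTA text was copied intact. The following is a minimal sketch in Python, not part of the original workflow. The function name read_fasta is hypothetical, and it assumes that the short name is everything between ">" and the first space on the header line. The two example records are copied from the list above.

```python
def read_fasta(text):
    """Parse FASTA text into an ordered list of (name, sequence) pairs."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            # Assumption: the short name is the first word of the header.
            name, chunks = line[1:].split()[0], []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


# Two records copied from the sequence list in step [1].
example = """>alphahorse P01958 Equus caballus
MVLSAADKTNVKAAWSKVGGHAGEYGAEALERMFLGFPTTKTYFPHFDLSHGSAQVKAHGKKVGDALTLA
VGHLDDLPGALSNLSDLHAHKLRVDPVNFKLLSHCLLSTLAVHLPNDFTPAVHASLDKFLSSVSTVLTSK
YR
>alphadog P60529 Canis lupus familiaris (dog)
VLSPADKTNIKSTWDKIGGHAGDYGGEALDRTFQSFPTTKTYFPHFDLSPGSAQVKAHGKKVADALTTAV
AHLDDLPGALSALSDLHAYKLRVDPVNFKLLSHCLLVTLACHHPTEFTPAVHASLDKFFAAVSTVLTSKY
R
"""

# Print each short name with its unaligned length.
for name, seq in read_fasta(example):
    print(name, len(seq))
```

Run on the full list, the same loop should report 13 records, one per header line.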
 

[2] Go to the EBI MAFFT server (http://www.ebi.ac.uk/mafft/) and create a multiple sequence alignment of the sequences from step [1].

 

>mbkangaroo P02194 Macropus rufus (red kangaroo)
-------------MGLSDGEWQLVLNIWGKVETDEGGHGKDVLIRLFKGHPETLEKFDKF
KHLKSEDEMKASEDLKKHGITVLTALGNILKKKGHHEAELKPLAQS---HATKHKIPVQF
LEFISDAIIQVIQSKHAGNFGADAQAAMKKALELFRHDMAAKYKEFGFQG
>mbharbor_porpoise P68278 Phocoena phocoena 
-------------MGLSEGEWQLVLNVWGKVEADLAGHGQDVLIRLFKGHPETLEKFDKF
KHLKTEAEMKASEDLKKHGNTVLTALGGILKKKGHHDAELKPLAQS---HATKHKIPIKY
LEFISEAIIHVLHSRHPAEFGADAQGAMNKALELFRKDIATKYKELGFHG
>mbgray_seal P68081 Halichoerus grypus
-------------MGLSDGEWHLVLNVWGKVETDLAGHGQEVLIRLFKSHPETLEKFDKF
KHLKSEDDMRRSEDLRKHGNTVLTALGGILKKKGHHEAELKPLAQS---HATKHKIPIKY
LEFISEAIIHVLHSKHPAEFGADAQAAMKKALELFRNDIAAKYKELGFHG
>alphahorse P01958 Equus caballus
------------MV-LSAADKTNVKAAWSKVGGHAGEYGAEALERMFLGFPTTKTYFPHF
-DLSHGSA-----QVKAHGKKVGDALTLAVGHLDDLPGALSNLSDL---HAHKLRVDPVN
FKLLSHCLLSTLAVHLPNDFTPAVHASLDKFLSSVSTVLTSKYR------
>alphakangaroo P01975 Macropus giganteus (eastern gray kangaroo)
-------------V-LSAADKGHVKAIWGKVGGHAGEYAAEGLERTFHSFPTTKTYFPHF
-DLSHGSA-----QIQAHGKKIADALGQAVEHIDDLPGTLSKLSDL---HAHKLRVDPVN
FKLLSHCLLVTFAAHLGDAFTPEVHASLDKFLAAVSTVLTSKYR------
>alphadog P60529 Canis lupus familiaris (dog)
-------------V-LSPADKTNIKSTWDKIGGHAGDYGGEALDRTFQSFPTTKTYFPHF
-DLSPGSA-----QVKAHGKKVADALTTAVAHLDDLPGALSALSDL---HAYKLRVDPVN
FKLLSHCLLVTLACHHPTEFTPAVHASLDKFFAAVSTVLTSKYR------
>betadog XP_537902 Canis lupus familiaris (dog)
------------MVHLTAEEKSLVSGLWGKV--NVDEVGGEALGRLLIVYPWTQRFFDSF
GDLSTPDAVMSNAKVKAHGKKVLNSFSDGLKNLDNLKGTFAKLSEL---HCDKLHVDPEN
FKLLGNVLVCVLAHHFGKEFTPQVQAAYQKVVAGVANALAHKYH------
>betarabbit NP_001075729 Oryctolagus cuniculus (rabbit)
------------MVHLSSEEKSAVTALWGKV--NVEEVGGEALGRLLVVYPWTQRFFESF
GDLSSANAVMNNPKVKAHGKKVLAAFSEGLSHLDNLKGTFAKLSEL---HCDKLHVDPEN
FRLLGNVLVIVLSHHFGKEFTPQVQAAYQKVVAGVANALAHKYH------
>betakangaroo P02106 Macropus giganteus (eastern gray kangaroo)
-------------VHLTAEEKNAITSLWGKV--AIEQTGGEALGRLLIVYPWTSRFFDHF
GDLSNAKAVMANPKVLAHGAKVLVAFGDAIKNLDNLKGTFAKLSEL---HCDKLHVDPEN
FKLLGNIIVICLAEHFGKEFTIDTQVAWQKLVAGVANALAHKYH------
>globinlamprey 690951A Lampetra fluviatilis (European river lamprey)
-PIVDS----GSPAVLSAAEKTKIRSAWAPVYSNYETSGVDILVKFFTSTPAAQEFFPKF
KGMTSADELKKSADVRWHAERIINAVNDAVASMDDTEKMSMK--DLSGKHAKSFQVDPQY
FKVL-AVIADTVAAG---------DAGFEKLSMCIILMLRSAY-------
>globinsealamprey P02208 Petromyzon marinus (sea lamprey)
MPIVDT----GSVAPLSAAEKTKIRSAWAPVYSTYETSGVDILVKFFTSTPAAQEFFPKF
KGLTTADQLKKSADVRWHAERIINAVNDAVASMDDTEKMSMKLRDLSGKHAKSFQVDPQY
FKVLAAVIADTVAAG---------DAGFEKLMSMICILLRSAY-------
>globinsoybean 711674A Glycine max (soybean)
-------------VAFTEKQDALVSSSFEAFKANIPQYSVVFYTSILEKAPAAKDLFSFL
ANPTDG----VNPKLTGHAEKLFALVRDSAGQL-KASGTVVADAALGSVHAQKAVTNPEF
--VVKEALLKTIKAAVGDKWSDELSRAWEVAYDELAAAIKAK--------
>globininsect P02229 Chironomus thummi thummi (midge)
MKFLILALCFAAASALSADQISTVQASFDKVKGD----PVGILYAVFKADPSIMAKFTQF
AG-KDLESIKGTAPFEIHANRIVGFFSKIIGELPNIEADVNTFVAS---HKPRGVTHDQ-
---LNNFRAGFVSYMKAHTDFAGAEAAWGATLDTFFGMIFSKM-------
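
A quick sanity check on any alignment is that every row has the same number of columns, and that removing the gap characters gives back the unaligned sequence. The sketch below illustrates both checks in Python. It is not part of the MAFFT output or the original workflow; the function names alignment_length and ungap are hypothetical, and the two rows are the betadog and betarabbit rows copied from the alignment above.

```python
def alignment_length(records):
    """Return the shared column count; raise ValueError if rows differ."""
    lengths = [len(seq) for _, seq in records]
    if not lengths:
        raise ValueError("no sequences given")
    if len(set(lengths)) != 1:
        raise ValueError(f"rows differ in length: {lengths}")
    return lengths[0]


def ungap(seq, gap="-"):
    """Remove gap characters to recover the unaligned sequence."""
    return seq.replace(gap, "")


# Two aligned rows copied from the MAFFT alignment in step [2].
aligned = [
    ("betadog",
     "------------MVHLTAEEKSLVSGLWGKV--NVDEVGGEALGRLLIVYPWTQRFFDSF"
     "GDLSTPDAVMSNAKVKAHGKKVLNSFSDGLKNLDNLKGTFAKLSEL---HCDKLHVDPEN"
     "FKLLGNVLVCVLAHHFGKEFTPQVQAAYQKVVAGVANALAHKYH------"),
    ("betarabbit",
     "------------MVHLSSEEKSAVTALWGKV--NVEEVGGEALGRLLVVYPWTQRFFESF"
     "GDLSSANAVMNNPKVKAHGKKVLAAFSEGLSHLDNLKGTFAKLSEL---HCDKLHVDPEN"
     "FRLLGNVLVIVLSHHFGKEFTPQVQAAYQKVVAGVANALAHKYH------"),
]

# Number of alignment columns shared by all rows.
print(alignment_length(aligned))

# Unaligned length of each row after gap removal.
for name, seq in aligned:
    print(name, len(ungap(seq)))
```

The column count printed first should agree with the second number in the header of the PHYLIP file produced in step [3].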

 

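[3] Convert the alignment into PHYLIP format. To do this, go to the ReadSeq utility at Baylor College of Medicine (search the web for "readseq bcm"), paste in the MAFFT alignment, and select PHYLIP as the output format. The header line of the result gives the number of sequences (13) and the number of alignment columns (170), and each name is truncated to 10 characters.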
http://searchlauncher.bcm.tmc.edu/seq-util/readseq.html
 
13 170
mbkangaroo   ---------- ---MGLSDGE WQLVLNIWGK VETDEGGHGK DVLIRLFKGH
mbharbor_p   ---------- ---MGLSEGE WQLVLNVWGK VEADLAGHGQ DVLIRLFKGH
mbgray_sea   ---------- ---MGLSDGE WHLVLNVWGK VETDLAGHGQ EVLIRLFKSH
alphahorse   ---------- --MV-LSAAD KTNVKAAWSK VGGHAGEYGA EALERMFLGF
alphakanga   ---------- ---V-LSAAD KGHVKAIWGK VGGHAGEYAA EGLERTFHSF
alphadog     ---------- ---V-LSPAD KTNIKSTWDK IGGHAGDYGG EALDRTFQSF
betadog      ---------- --MVHLTAEE KSLVSGLWGK V--NVDEVGG EALGRLLIVY
betarabbit   ---------- --MVHLSSEE KSAVTALWGK V--NVEEVGG EALGRLLVVY
betakangar   ---------- ---VHLTAEE KNAITSLWGK V--AIEQTGG EALGRLLIVY
globinlamp   -PIVDS---- GSPAVLSAAE KTKIRSAWAP VYSNYETSGV DILVKFFTST
globinseal   MPIVDT---- GSVAPLSAAE KTKIRSAWAP VYSTYETSGV DILVKFFTST
globinsoyb   ---------- ---VAFTEKQ DALVSSSFEA FKANIPQYSV VFYTSILEKA
globininse   MKFLILALCF AAASALSADQ ISTVQASFDK VKGD----PV GILYAVFKAD
 
             PETLEKFDKF KHLKSEDEMK ASEDLKKHGI TVLTALGNIL KKKGHHEAEL
             PETLEKFDKF KHLKTEAEMK ASEDLKKHGN TVLTALGGIL KKKGHHDAEL
             PETLEKFDKF KHLKSEDDMR RSEDLRKHGN TVLTALGGIL KKKGHHEAEL
             PTTKTYFPHF -DLSHGSA-- ---QVKAHGK KVGDALTLAV GHLDDLPGAL
             PTTKTYFPHF -DLSHGSA-- ---QIQAHGK KIADALGQAV EHIDDLPGTL
             PTTKTYFPHF -DLSPGSA-- ---QVKAHGK KVADALTTAV AHLDDLPGAL
             PWTQRFFDSF GDLSTPDAVM SNAKVKAHGK KVLNSFSDGL KNLDNLKGTF
             PWTQRFFESF GDLSSANAVM NNPKVKAHGK KVLAAFSEGL SHLDNLKGTF
             PWTSRFFDHF GDLSNAKAVM ANPKVLAHGA KVLVAFGDAI KNLDNLKGTF
             PAAQEFFPKF KGMTSADELK KSADVRWHAE RIINAVNDAV ASMDDTEKMS
             PAAQEFFPKF KGLTTADQLK KSADVRWHAE RIINAVNDAV ASMDDTEKMS
             PAAKDLFSFL ANPTDG---- VNPKLTGHAE KLFALVRDSA GQL-KASGTV
             PSIMAKFTQF AG-KDLESIK GTAPFEIHAN RIVGFFSKII GELPNIEADV
 
             KPLAQS---H ATKHKIPVQF LEFISDAIIQ VIQSKHAGNF GADAQAAMKK
             KPLAQS---H ATKHKIPIKY LEFISEAIIH VLHSRHPAEF GADAQGAMNK
             KPLAQS---H ATKHKIPIKY LEFISEAIIH VLHSKHPAEF GADAQAAMKK
             SNLSDL---H AHKLRVDPVN FKLLSHCLLS TLAVHLPNDF TPAVHASLDK
             SKLSDL---H AHKLRVDPVN FKLLSHCLLV TFAAHLGDAF TPEVHASLDK
             SALSDL---H AYKLRVDPVN FKLLSHCLLV TLACHHPTEF TPAVHASLDK
             AKLSEL---H CDKLHVDPEN FKLLGNVLVC VLAHHFGKEF TPQVQAAYQK
             AKLSEL---H CDKLHVDPEN FRLLGNVLVI VLSHHFGKEF TPQVQAAYQK
             AKLSEL---H CDKLHVDPEN FKLLGNIIVI CLAEHFGKEF TIDTQVAWQK
             MK--DLSGKH AKSFQVDPQY FKVL-AVIAD TVAAG----- ----DAGFEK
             MKLRDLSGKH AKSFQVDPQY FKVLAAVIAD TVAAG----- ----DAGFEK
             VADAALGSVH AQKAVTNPEF --VVKEALLK TIKAAVGDKW SDELSRAWEV
             NTFVAS---H KPRGVTHDQ- ---LNNFRAG FVSYMKAHTD FAGAEAAWGA
 
             ALELFRHDMA AKYKEFGFQG
             ALELFRKDIA TKYKELGFHG
             ALELFRNDIA AKYKELGFHG
             FLSSVSTVLT SKYR------
             FLAAVSTVLT SKYR------
             FFAAVSTVLT SKYR------
             VVAGVANALA HKYH------
             VVAGVANALA HKYH------
             LVAGVANALA HKYH------
             LSMCIILMLR SAY-------
             LMSMICILLR SAY-------
             AYDELAAAIK AK--------
             TLDTFFGMIF SKM-------
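
The layout above can also be produced without the web form. The following is a minimal Python sketch of an interleaved PHYLIP writer that approximates the layout shown: a header with the sequence and column counts, names cut to 10 characters, and residues in groups of 10 with 5 groups per line. It is not ReadSeq's implementation. The function name to_phylip and its parameters are hypothetical, and the three spaces after the name field are an assumption taken from the spacing of the output above.

```python
def to_phylip(records, name_width=10, block=10, blocks_per_line=5):
    """Format (name, aligned_sequence) pairs as interleaved PHYLIP text."""
    names = [name[:name_width] for name, _ in records]
    if len(set(names)) != len(names):
        raise ValueError("names are not unique after truncation")
    lengths = {len(seq) for _, seq in records}
    if len(lengths) != 1:
        raise ValueError("sequences must all have the same length")
    ncols = lengths.pop()
    per_line = block * blocks_per_line

    # Header: number of sequences, then number of alignment columns.
    lines = [f"{len(records)} {ncols}"]
    for start in range(0, ncols, per_line):
        if start:
            lines.append("")  # blank line between interleaved sections
        for name, (_, seq) in zip(names, records):
            chunk = seq[start:start + per_line]
            groups = " ".join(chunk[i:i + block]
                              for i in range(0, len(chunk), block))
            # Names appear only in the first section.
            label = name.ljust(name_width) if start == 0 else " " * name_width
            lines.append(f"{label}   {groups}")
    return "\n".join(lines) + "\n"


# Toy input: the long name is cut to its first 10 characters.
toy = [("alphahorse_example", "MV-LSAADKTNV"),
       ("betadog", "MVHLTAEEKSLV")]
print(to_phylip(toy), end="")
```

Because names are cut to 10 characters, the sketch rejects inputs whose names collide after truncation. The 13 shortened names used here remain distinct (for example, globinlamp and globinseal).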