Web document 7.10. Bayesian inference of phylogeny using 13 globin protein sequences.

 

[1] These sequences, taken from Web document 7.1, were obtained by searching NCBI Entrez Protein using the information provided in Dayhoff et al. (1972), p. 20. The sequence names are shortened relative to those in Web document 7.1.

 

>mbkangaroo P02194 Macropus rufus (red kangaroo)
MGLSDGEWQLVLNIWGKVETDEGGHGKDVLIRLFKGHPETLEKFDKFKHLKSEDEMKASEDLKKHGITVL
TALGNILKKKGHHEAELKPLAQSHATKHKIPVQFLEFISDAIIQVIQSKHAGNFGADAQAAMKKALELFR
HDMAAKYKEFGFQG
>mbharbor_porpoise P68278 Phocoena phocoena 
MGLSEGEWQLVLNVWGKVEADLAGHGQDVLIRLFKGHPETLEKFDKFKHLKTEAEMKASEDLKKHGNTVL
TALGGILKKKGHHDAELKPLAQSHATKHKIPIKYLEFISEAIIHVLHSRHPAEFGADAQGAMNKALELFR
KDIATKYKELGFHG
>mbgray_seal P68081 Halichoerus grypus
MGLSDGEWHLVLNVWGKVETDLAGHGQEVLIRLFKSHPETLEKFDKFKHLKSEDDMRRSEDLRKHGNTVL
TALGGILKKKGHHEAELKPLAQSHATKHKIPIKYLEFISEAIIHVLHSKHPAEFGADAQAAMKKALELFR
NDIAAKYKELGFHG
>alphahorse P01958 Equus caballus
MVLSAADKTNVKAAWSKVGGHAGEYGAEALERMFLGFPTTKTYFPHFDLSHGSAQVKAHGKKVGDALTLA
VGHLDDLPGALSNLSDLHAHKLRVDPVNFKLLSHCLLSTLAVHLPNDFTPAVHASLDKFLSSVSTVLTSK
YR
>alphakangaroo P01975 Macropus giganteus (eastern gray kangaroo)
VLSAADKGHVKAIWGKVGGHAGEYAAEGLERTFHSFPTTKTYFPHFDLSHGSAQIQAHGKKIADALGQAV
EHIDDLPGTLSKLSDLHAHKLRVDPVNFKLLSHCLLVTFAAHLGDAFTPEVHASLDKFLAAVSTVLTSKY
R
>alphadog P60529 Canis lupus familiaris (dog)
VLSPADKTNIKSTWDKIGGHAGDYGGEALDRTFQSFPTTKTYFPHFDLSPGSAQVKAHGKKVADALTTAV
AHLDDLPGALSALSDLHAYKLRVDPVNFKLLSHCLLVTLACHHPTEFTPAVHASLDKFFAAVSTVLTSKY
R
>betadog XP_537902 Canis lupus familiaris (dog)
MVHLTAEEKSLVSGLWGKVNVDEVGGEALGRLLIVYPWTQRFFDSFGDLSTPDAVMSNAKVKAHGKKVLN
SFSDGLKNLDNLKGTFAKLSELHCDKLHVDPENFKLLGNVLVCVLAHHFGKEFTPQVQAAYQKVVAGVAN
ALAHKYH
>betarabbit NP_001075729 Oryctolagus cuniculus (rabbit)
MVHLSSEEKSAVTALWGKVNVEEVGGEALGRLLVVYPWTQRFFESFGDLSSANAVMNNPKVKAHGKKVLA
AFSEGLSHLDNLKGTFAKLSELHCDKLHVDPENFRLLGNVLVIVLSHHFGKEFTPQVQAAYQKVVAGVAN
ALAHKYH
>betakangaroo P02106 Macropus giganteus (eastern gray kangaroo)
VHLTAEEKNAITSLWGKVAIEQTGGEALGRLLIVYPWTSRFFDHFGDLSNAKAVMANPKVLAHGAKVLVA
FGDAIKNLDNLKGTFAKLSELHCDKLHVDPENFKLLGNIIVICLAEHFGKEFTIDTQVAWQKLVAGVANA
LAHKYH
>globinlamprey 690951A Lampetra fluviatilis (European river lamprey)
PIVDSGSPAVLSAAEKTKIRSAWAPVYSNYETSGVDILVKFFTSTPAAQEFFPKFKGMTSADELKKSADV
RWHAERIINAVNDAVASMDDTEKMSMKDLSGKHAKSFQVDPQYFKVLAVIADTVAAGDAGFEKLSMCIIL
MLRSAY
>globinsealamprey P02208 Petromyzon marinus (sea lamprey)
MPIVDTGSVAPLSAAEKTKIRSAWAPVYSTYETSGVDILVKFFTSTPAAQEFFPKFKGLTTADQLKKSAD
VRWHAERIINAVNDAVASMDDTEKMSMKLRDLSGKHAKSFQVDPQYFKVLAAVIADTVAAGDAGFEKLMS
MICILLRSAY
>globininsect P02229 Chironomus thummi thummi (midge)
MKFLILALCFAAASALSADQISTVQASFDKVKGDPVGILYAVFKADPSIMAKFTQFAGKDLESIKGTAPF
EIHANRIVGFFSKIIGELPNIEADVNTFVASHKPRGVTHDQLNNFRAGFVSYMKAHTDFAGAEAAWGATL
DTFFGMIFSKM
>globinsoybean 711674A Glycine max (soybean)
VAFTEKQDALVSSSFEAFKANIPQYSVVFYTSILEKAPAAKDLFSFLANPTDGVNPKLTGHAEKLFALVR
DSAGQLKASGTVVADAALGSVHAQKAVTNPEFVVKEALLKTIKAAVGDKWSDELSRAWEVAYDELAAAIK
AK
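
Before aligning, it can help to confirm that the pasted FASTA text contains the expected number of sequences. The following is a minimal sketch, not part of the original workflow; the function name parse_fasta is hypothetical, and the short name is assumed to be the first word after ">".

```python
def parse_fasta(text):
    """Return a list of (short_name, sequence) pairs from FASTA-formatted text."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            header = line[1:].split()
            # Assumption: the short name is the first token of the header line.
            name, chunks = (header[0] if header else ""), []
        elif name is not None:
            chunks.append(line)  # sequence lines are joined without separators
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


if __name__ == "__main__":
    example = """>alphahorse P01958 Equus caballus
MVLSAADKTNVKAAWSKVGGHAGEYGAEALERMFLGFPTTKTYFPHFDLSHGSAQVKAHGKKVGDALTLA
VGHLDDLPGALSNLSDLHAHKLRVDPVNFKLLSHCLLSTLAVHLPNDFTPAVHASLDKFLSSVSTVLTSK
YR
"""
    for short_name, seq in parse_fasta(example):
        print(short_name, len(seq))
```

Run on the full set above, the list should contain 13 records.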
 
 

[2] Go to the EBI MAFFT server (http://www.ebi.ac.uk/mafft/) and create a multiple sequence alignment.

 

>mbkangaroo P02194 Macropus rufus (red kangaroo)
-------------MGLSDGEWQLVLNIWGKVETDEGGHGKDVLIRLFKGHPETLEKFDKF
KHLKSEDEMKASEDLKKHGITVLTALGNILKKKGHHEAELKPLAQS---HATKHKIPVQF
LEFISDAIIQVIQSKHAGNFGADAQAAMKKALELFRHDMAAKYKEFGFQG
>mbharbor_porpoise P68278 Phocoena phocoena 
-------------MGLSEGEWQLVLNVWGKVEADLAGHGQDVLIRLFKGHPETLEKFDKF
KHLKTEAEMKASEDLKKHGNTVLTALGGILKKKGHHDAELKPLAQS---HATKHKIPIKY
LEFISEAIIHVLHSRHPAEFGADAQGAMNKALELFRKDIATKYKELGFHG
>mbgray_seal P68081 Halichoerus grypus
-------------MGLSDGEWHLVLNVWGKVETDLAGHGQEVLIRLFKSHPETLEKFDKF
KHLKSEDDMRRSEDLRKHGNTVLTALGGILKKKGHHEAELKPLAQS---HATKHKIPIKY
LEFISEAIIHVLHSKHPAEFGADAQAAMKKALELFRNDIAAKYKELGFHG
>alphahorse P01958 Equus caballus
------------MV-LSAADKTNVKAAWSKVGGHAGEYGAEALERMFLGFPTTKTYFPHF
-DLSHGSA-----QVKAHGKKVGDALTLAVGHLDDLPGALSNLSDL---HAHKLRVDPVN
FKLLSHCLLSTLAVHLPNDFTPAVHASLDKFLSSVSTVLTSKYR------
>alphakangaroo P01975 Macropus giganteus (eastern gray kangaroo)
-------------V-LSAADKGHVKAIWGKVGGHAGEYAAEGLERTFHSFPTTKTYFPHF
-DLSHGSA-----QIQAHGKKIADALGQAVEHIDDLPGTLSKLSDL---HAHKLRVDPVN
FKLLSHCLLVTFAAHLGDAFTPEVHASLDKFLAAVSTVLTSKYR------
>alphadog P60529 Canis lupus familiaris (dog)
-------------V-LSPADKTNIKSTWDKIGGHAGDYGGEALDRTFQSFPTTKTYFPHF
-DLSPGSA-----QVKAHGKKVADALTTAVAHLDDLPGALSALSDL---HAYKLRVDPVN
FKLLSHCLLVTLACHHPTEFTPAVHASLDKFFAAVSTVLTSKYR------
>betadog XP_537902 Canis lupus familiaris (dog)
------------MVHLTAEEKSLVSGLWGKV--NVDEVGGEALGRLLIVYPWTQRFFDSF
GDLSTPDAVMSNAKVKAHGKKVLNSFSDGLKNLDNLKGTFAKLSEL---HCDKLHVDPEN
FKLLGNVLVCVLAHHFGKEFTPQVQAAYQKVVAGVANALAHKYH------
>betarabbit NP_001075729 Oryctolagus cuniculus (rabbit)
------------MVHLSSEEKSAVTALWGKV--NVEEVGGEALGRLLVVYPWTQRFFESF
GDLSSANAVMNNPKVKAHGKKVLAAFSEGLSHLDNLKGTFAKLSEL---HCDKLHVDPEN
FRLLGNVLVIVLSHHFGKEFTPQVQAAYQKVVAGVANALAHKYH------
>betakangaroo P02106 Macropus giganteus (eastern gray kangaroo)
-------------VHLTAEEKNAITSLWGKV--AIEQTGGEALGRLLIVYPWTSRFFDHF
GDLSNAKAVMANPKVLAHGAKVLVAFGDAIKNLDNLKGTFAKLSEL---HCDKLHVDPEN
FKLLGNIIVICLAEHFGKEFTIDTQVAWQKLVAGVANALAHKYH------
>globinlamprey 690951A Lampetra fluviatilis (European river lamprey)
-PIVDS----GSPAVLSAAEKTKIRSAWAPVYSNYETSGVDILVKFFTSTPAAQEFFPKF
KGMTSADELKKSADVRWHAERIINAVNDAVASMDDTEKMSMK--DLSGKHAKSFQVDPQY
FKVL-AVIADTVAAG---------DAGFEKLSMCIILMLRSAY-------
>globinsealamprey P02208 Petromyzon marinus (sea lamprey)
MPIVDT----GSVAPLSAAEKTKIRSAWAPVYSTYETSGVDILVKFFTSTPAAQEFFPKF
KGLTTADQLKKSADVRWHAERIINAVNDAVASMDDTEKMSMKLRDLSGKHAKSFQVDPQY
FKVLAAVIADTVAAG---------DAGFEKLMSMICILLRSAY-------
>globinsoybean 711674A Glycine max (soybean)
-------------VAFTEKQDALVSSSFEAFKANIPQYSVVFYTSILEKAPAAKDLFSFL
ANPTDG----VNPKLTGHAEKLFALVRDSAGQL-KASGTVVADAALGSVHAQKAVTNPEF
--VVKEALLKTIKAAVGDKWSDELSRAWEVAYDELAAAIKAK--------
>globininsect P02229 Chironomus thummi thummi (midge)
MKFLILALCFAAASALSADQISTVQASFDKVKGD----PVGILYAVFKADPSIMAKFTQF
AG-KDLESIKGTAPFEIHANRIVGFFSKIIGELPNIEADVNTFVAS---HKPRGVTHDQ-
---LNNFRAGFVSYMKAHTDFAGAEAAWGATLDTFFGMIFSKM-------
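
Every row of a multiple sequence alignment must have the same length once gaps are included. The sketch below checks this and lists the columns that contain no gap in any row. The function names are hypothetical, and the example rows are only the first 20 columns of three of the aligned sequences above.

```python
def alignment_length(rows):
    """Return the common length of all aligned rows; raise ValueError if they differ."""
    lengths = {len(seq) for seq in rows.values()}
    if len(lengths) != 1:
        raise ValueError(f"rows have unequal lengths: {sorted(lengths)}")
    return lengths.pop()


def gap_free_columns(rows, gap="-"):
    """Return the 0-based indices of columns in which no row has a gap character."""
    alignment_length(rows)  # fail early if the rows are not aligned
    seqs = list(rows.values())
    return [i for i, column in enumerate(zip(*seqs)) if gap not in column]


if __name__ == "__main__":
    excerpt = {
        "alphahorse": "------------MV-LSAAD",
        "betadog": "------------MVHLTAEE",
        "globinlamprey": "-PIVDS----GSPAVLSAAE",
    }
    print("columns:", alignment_length(excerpt))
    print("gap-free columns:", gap_free_columns(excerpt))
```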

 

 

[3] Convert the alignment to NEXUS format. To do this, go to the ReadSeq utility at Baylor College of Medicine (search for "readseq bcm", or use the URL below), paste in the MAFFT alignment, and select PAUP/NEXUS as the output format.
http://searchlauncher.bcm.tmc.edu/seq-util/readseq.html
 
#NEXUS
[/tmp/readseq.in.8649 -- data title]
 
[Name: mbkangaroo        Len:   170  Check:     15FB]
[Name: mbharbor_porpoise  Len:   170  Check:     18C0]
[Name: mbgray_seal       Len:   170  Check:     1A19]
[Name: alphahorse        Len:   170  Check:      878]
[Name: alphakangaroo     Len:   170  Check:     243B]
[Name: alphadog          Len:   170  Check:      61B]
[Name: betadog           Len:   170  Check:      965]
[Name: betarabbit        Len:   170  Check:      B20]
[Name: betakangaroo      Len:   170  Check:     2370]
[Name: globinlamprey     Len:   170  Check:     1AD3]
[Name: globinsealamprey  Len:   170  Check:      255]
[Name: globinsoybean     Len:   170  Check:     1E0A]
[Name: globininsect      Len:   170  Check:     123A]
 
 
begin data;
 dimensions ntax=13 nchar=170;
 format datatype=protein interleave missing=-;
  matrix
mbkangaro  -------------MGLSDGE WQLVLNIWGKVETDEGGHGK DVLIRLFKGHPETLEKFDKF KHLKSEDEMKASEDLKKHGI TVLTALGNILKKKGHHEAEL
mbharbor_  -------------MGLSEGE WQLVLNVWGKVEADLAGHGQ DVLIRLFKGHPETLEKFDKF KHLKTEAEMKASEDLKKHGN TVLTALGGILKKKGHHDAEL
mbgray_se  -------------MGLSDGE WHLVLNVWGKVETDLAGHGQ EVLIRLFKSHPETLEKFDKF KHLKSEDDMRRSEDLRKHGN TVLTALGGILKKKGHHEAEL
alphahors  ------------MV-LSAAD KTNVKAAWSKVGGHAGEYGA EALERMFLGFPTTKTYFPHF -DLSHGSA-----QVKAHGK KVGDALTLAVGHLDDLPGAL
alphakang  -------------V-LSAAD KGHVKAIWGKVGGHAGEYAA EGLERTFHSFPTTKTYFPHF -DLSHGSA-----QIQAHGK KIADALGQAVEHIDDLPGTL
 alphadog  -------------V-LSPAD KTNIKSTWDKIGGHAGDYGG EALDRTFQSFPTTKTYFPHF -DLSPGSA-----QVKAHGK KVADALTTAVAHLDDLPGAL
  betadog  ------------MVHLTAEE KSLVSGLWGKV--NVDEVGG EALGRLLIVYPWTQRFFDSF GDLSTPDAVMSNAKVKAHGK KVLNSFSDGLKNLDNLKGTF
betarabbi  ------------MVHLSSEE KSAVTALWGKV--NVEEVGG EALGRLLVVYPWTQRFFESF GDLSSANAVMNNPKVKAHGK KVLAAFSEGLSHLDNLKGTF
betakanga  -------------VHLTAEE KNAITSLWGKV--AIEQTGG EALGRLLIVYPWTSRFFDHF GDLSNAKAVMANPKVLAHGA KVLVAFGDAIKNLDNLKGTF
globinlam  -PIVDS----GSPAVLSAAE KTKIRSAWAPVYSNYETSGV DILVKFFTSTPAAQEFFPKF KGMTSADELKKSADVRWHAE RIINAVNDAVASMDDTEKMS
globinsea  MPIVDT----GSVAPLSAAE KTKIRSAWAPVYSTYETSGV DILVKFFTSTPAAQEFFPKF KGLTTADQLKKSADVRWHAE RIINAVNDAVASMDDTEKMS
globinsoy  -------------VAFTEKQ DALVSSSFEAFKANIPQYSV VFYTSILEKAPAAKDLFSFL ANPTDG----VNPKLTGHAE KLFALVRDSAGQL-KASGTV
globinins  MKFLILALCFAAASALSADQ ISTVQASFDKVKGD----PV GILYAVFKADPSIMAKFTQF AG-KDLESIKGTAPFEIHAN RIVGFFSKIIGELPNIEADV
 
mbkangaro  KPLAQS---HATKHKIPVQF LEFISDAIIQVIQSKHAGNF GADAQAAMKKALELFRHDMA AKYKEFGFQG
mbharbor_  KPLAQS---HATKHKIPIKY LEFISEAIIHVLHSRHPAEF GADAQGAMNKALELFRKDIA TKYKELGFHG
mbgray_se  KPLAQS---HATKHKIPIKY LEFISEAIIHVLHSKHPAEF GADAQAAMKKALELFRNDIA AKYKELGFHG
alphahors  SNLSDL---HAHKLRVDPVN FKLLSHCLLSTLAVHLPNDF TPAVHASLDKFLSSVSTVLT SKYR------
alphakang  SKLSDL---HAHKLRVDPVN FKLLSHCLLVTFAAHLGDAF TPEVHASLDKFLAAVSTVLT SKYR------
 alphadog  SALSDL---HAYKLRVDPVN FKLLSHCLLVTLACHHPTEF TPAVHASLDKFFAAVSTVLT SKYR------
  betadog  AKLSEL---HCDKLHVDPEN FKLLGNVLVCVLAHHFGKEF TPQVQAAYQKVVAGVANALA HKYH------
betarabbi  AKLSEL---HCDKLHVDPEN FRLLGNVLVIVLSHHFGKEF TPQVQAAYQKVVAGVANALA HKYH------
betakanga  AKLSEL---HCDKLHVDPEN FKLLGNIIVICLAEHFGKEF TIDTQVAWQKLVAGVANALA HKYH------
globinlam  MK--DLSGKHAKSFQVDPQY FKVL-AVIADTVAAG----- ----DAGFEKLSMCIILMLR SAY-------
globinsea  MKLRDLSGKHAKSFQVDPQY FKVLAAVIADTVAAG----- ----DAGFEKLMSMICILLR SAY-------
globinsoy  VADAALGSVHAQKAVTNPEF --VVKEALLKTIKAAVGDKW SDELSRAWEVAYDELAAAIK AK--------
globinins  NTFVAS---HKPRGVTHDQ- ---LNNFRAGFVSYMKAHTD FAGAEAAWGATLDTFFGMIF SKM-------
 
;
  end;
 
 
 
[4] Save the output as a text file called 13globins.nex.
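
If the web converter is not convenient, a NEXUS data block can also be written with a short script. The following is a sketch under stated assumptions, not the ReadSeq procedure: it writes a non-interleaved matrix with full taxon names and declares gap=- and missing=?, whereas the ReadSeq output above is interleaved, truncates names, and declares missing=-. The function name to_nexus is hypothetical.

```python
def to_nexus(rows, datatype="protein", gap="-", missing="?"):
    """Build a non-interleaved NEXUS data block from {name: aligned_sequence}."""
    if not rows:
        raise ValueError("no sequences given")
    lengths = {len(seq) for seq in rows.values()}
    if len(lengths) != 1:
        raise ValueError(f"rows have unequal lengths: {sorted(lengths)}")
    nchar = lengths.pop()
    width = max(len(name) for name in rows)  # pad names so sequences line up
    lines = [
        "#NEXUS",
        "begin data;",
        f" dimensions ntax={len(rows)} nchar={nchar};",
        f" format datatype={datatype} gap={gap} missing={missing};",
        " matrix",
    ]
    for name, seq in rows.items():
        lines.append(f"{name.ljust(width)}  {seq}")
    lines += [" ;", "end;"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Tiny illustrative alignment (first 20 columns of two rows shown above).
    excerpt = {
        "alphahorse": "------------MV-LSAAD",
        "betadog": "------------MVHLTAEE",
    }
    print(to_nexus(excerpt))
    # To save the result, for example:
    # with open("13globins.nex", "w") as handle:
    #     handle.write(to_nexus(full_alignment))
```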
 
[5] Obtain MrBayes and open the program.
 
[6] At the MrBayes prompt, type execute 13globins.nex to read in the alignment.
 
[7] An evolutionary model must be specified.
 
lset defines the structure of the model; type help lset to see the options. 
 
prset defines the prior probability distributions on the parameters of the model; type help prset for more information.
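
The same commands can be collected in a MrBayes block and run as a batch file rather than typed one at a time. The sketch below only assembles the text of such a block. The function name mrbayes_batch is hypothetical, and the option values in the example (rates=gamma, aamodelpr=mixed) are illustrative choices, not the settings used in this document, which keeps the defaults.

```python
def mrbayes_batch(data_file, lset=None, prset=None):
    """Return the text of a NEXUS file holding a MrBayes block.

    lset and prset are optional {option: value} dicts; when omitted, the
    corresponding command is left out and MrBayes keeps its defaults.
    """
    def command(name, options):
        settings = " ".join(f"{key}={value}" for key, value in options.items())
        return f"  {name} {settings};"

    lines = ["#NEXUS", "begin mrbayes;", f"  execute {data_file};"]
    if lset:
        lines.append(command("lset", lset))    # structure of the model
    if prset:
        lines.append(command("prset", prset))  # priors on model parameters
    lines.append("  showmodel;")               # print the resulting settings
    lines.append("end;")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Illustrative values only (assumptions, not taken from this document).
    print(mrbayes_batch("13globins.nex",
                        lset={"rates": "gamma"},
                        prset={"aamodelpr": "mixed"}))
```

Calling mrbayes_batch("13globins.nex") with no options produces a block that loads the data and reports the default model.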
 
Type showmodel
This summarizes the model settings; in this example the default settings are:
 
 
 
[8] Type mcmc
This starts the Markov chain Monte Carlo (MCMC) run. Every 10,000 generations the average standard deviation of split frequencies is reported; for typical analyses this value should fall below 0.01 as the runs approach convergence.
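
To make the diagnostic concrete, the sketch below computes an average standard deviation of split frequencies from the trees sampled in two or more independent runs. It is an illustration of the idea, not MrBayes code: each tree is represented simply as a set of splits (each split a frozenset of taxon names), and both the 0.10 minimum-frequency cutoff and the use of the sample standard deviation are assumptions made here.

```python
from statistics import stdev


def split_frequencies(trees):
    """Return {split: fraction of trees containing it} for one run."""
    counts = {}
    for splits in trees:
        for split in splits:
            counts[split] = counts.get(split, 0) + 1
    return {split: n / len(trees) for split, n in counts.items()}


def average_sd_of_split_frequencies(runs, min_freq=0.10):
    """Average, over splits, the standard deviation of split frequency across runs.

    Only splits reaching min_freq in at least one run are included (assumed cutoff).
    """
    if len(runs) < 2:
        raise ValueError("at least two runs are needed")
    freqs = [split_frequencies(trees) for trees in runs]
    all_splits = set().union(*freqs)
    deviations = []
    for split in all_splits:
        values = [f.get(split, 0.0) for f in freqs]
        if max(values) >= min_freq:
            deviations.append(stdev(values))
    return sum(deviations) / len(deviations) if deviations else 0.0


if __name__ == "__main__":
    alpha = frozenset({"alphahorse", "alphadog"})
    beta = frozenset({"betadog", "betarabbit"})
    run1 = [{alpha, beta}, {alpha, beta}, {alpha}, {alpha, beta}]
    run2 = [{alpha, beta}, {alpha}, {alpha, beta}, {alpha}]
    print(average_sd_of_split_frequencies([run1, run2]))
```

When the runs sample the same splits at the same frequencies, the value is 0; larger values indicate that the runs still disagree about the tree.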