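Web document 3.7. Output of the PRSS program showing an alignment of human beta globin (top sequence) with myoglobin. Two runs are shown. In the first, the myoglobin sequence was shuffled 100 times (-k 100), and the resulting Z-score is 210.7 (reported on the line beginning "initn:"). In the second, the myoglobin sequence was shuffled 1000 times (-k 1000), and the resulting Z-score is 213.9.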

 

You can access PRSS and similar programs here:

 

http://fasta.bioch.virginia.edu/fasta_www2/fasta_www.cgi?rm=shuffle

http://www.isrec.isb-sib.ch/experiment/ALIGN_form.html

http://www.ch.embnet.org/software/PRSS_form.html

 
 
 
# /seqprg/bin/prss34_t -p -q -w 80 -m 6 -Z 10000 -A -H -k 100 -f -10 -g -2 @ TMP.q2
PRSS evaluates statistical signficance using Smith-Waterman
 version 34.26 January 12, 2007
Please cite:
 W.R. Pearson (1996) Meth. Enzymol. 266:227-258
 
@ - QUERY 147 aa
 vs TMP.q2 - QUERY shuffled sequence
 
  15400 residues in   100 sequences
 (shuffled) MLE statistics: Lambda= 0.1701;  K=0.03061
 
 Smith-Waterman (3.5 Sept 2006) function [BL50 matrix (15:-5)], open/ext: -10/-2
 Scan time:  0.030

The best scores are:                                                          s-w bits E(10000)
QUERY                                                                  ( 154)  163 45.0 6.3e-06 align

>>>@, 147 aa vs TMP.q2 library

>>QUERY                                                                       (154 aa)
 initn: 163  Z-score: 210.7  bits: 45.0 E(): 6.3e-06
Smith-Waterman score: 163;  25.517% identity (58.621% similar) in 145 aa overlap (4-146:3-147)
               10        20          30        40        50        60        70        
QUERY  MVHLTPEEKSAVTALWGKVNVDEVGG--EALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPKVKAHGKKVLGAFSDGLAH
          :.  : . :  .::::..:  :   :.: ::.  .: : . :..:  :.. : . ..  .: ::  :: :..  : .
QUERY   MGLSDGEWQLVLNVWGKVEADIPGHGQEVLIRLFKGHPETLEKFDKFKHLKSEDEMKASEDLKKHGATVLTALGGILKK
                10        20        30        40        50        60        70         
 
       80        90       100       110       120       130       140             
QUERY  LDNLKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHFGKEFTPPVQAAYQKVVAGVANALAHKYH      
         . .. .  :.. :  : ..  . ...... .. ::  .   .:   .:.:..:..    . .: .:       
QUERY  KGHHEAEIKPLAQSHATKHKIPVKYLEFISECIIQVLQSKHPGDFGADAQGAMNKALELFRKDMASNYKELGFQG
      80        90       100       110       120       130       140       150    
 

 
 
147 residues in 1 query   sequences
15400 residues in 100 library sequences
 Tcomplib [34.26] (2 proc)
 start: Thu Mar  1 07:06:00 2007 done: Thu Mar  1 07:06:00 2007
 Scan time:  0.030 Display time:  0.000
 
Function used was PRSS [version 34.26 January 12, 2007]
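
The idea behind the shuffling test can be sketched in a few lines of Python. This is a minimal illustration, not the PRSS implementation: the function names (sw_score, shuffle_test) are hypothetical, the scoring uses a simple +5/-4 match/mismatch scheme in place of the BLOSUM50 matrix, and a gap of length k is assumed to cost open + k*ext with the -10/-2 penalties shown in the output. The two sequences are copied from the alignment above with the gap characters removed.

```python
import random
import statistics

# Sequences taken from the alignment in the PRSS output (gaps removed).
BETA_GLOBIN = (
    "MVHLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPKVKAHGKKVLGAFSDGLAH"
    "LDNLKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHFGKEFTPPVQAAYQKVVAGVANALAHKYH"
)
MYOGLOBIN = (
    "MGLSDGEWQLVLNVWGKVEADIPGHGQEVLIRLFKGHPETLEKFDKFKHLKSEDEMKASEDLKKHGATVLTALGGILKK"
    "KGHHEAEIKPLAQSHATKHKIPVKYLEFISECIIQVLQSKHPGDFGADAQGAMNKALELFRKDMASNYKELGFQG"
)


def sw_score(a, b, match=5, mismatch=-4, gap_open=-10, gap_ext=-2):
    """Best Smith-Waterman local alignment score with affine gaps.

    Simplification: match/mismatch scoring stands in for BLOSUM50.
    Assumption: a gap of length k costs gap_open + k * gap_ext.
    """
    neg = float("-inf")
    prev_h = [0] * (len(b) + 1)    # best score ending at (i-1, j)
    prev_f = [neg] * (len(b) + 1)  # best score ending in a vertical gap
    best = 0
    for i in range(1, len(a) + 1):
        cur_h = [0] * (len(b) + 1)
        cur_f = [neg] * (len(b) + 1)
        e = neg                    # best score ending in a horizontal gap
        for j in range(1, len(b) + 1):
            e = max(e + gap_ext, cur_h[j - 1] + gap_open + gap_ext)
            cur_f[j] = max(prev_f[j] + gap_ext, prev_h[j] + gap_open + gap_ext)
            diag = prev_h[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur_h[j] = max(0, diag, e, cur_f[j])
            best = max(best, cur_h[j])
        prev_h, prev_f = cur_h, cur_f
    return best


def shuffle_test(query, subject, n_shuffles=100, seed=0):
    """Score the real pair, then score the query against shuffled subjects."""
    rng = random.Random(seed)
    real = sw_score(query, subject)
    residues = list(subject)
    null_scores = []
    for _ in range(n_shuffles):
        rng.shuffle(residues)  # same composition and length, random order
        null_scores.append(sw_score(query, "".join(residues)))
    mean = statistics.mean(null_scores)
    sd = statistics.pstdev(null_scores)
    z = (real - mean) / sd if sd > 0 else float("inf")
    return real, mean, sd, z


if __name__ == "__main__":
    real, mean, sd, z = shuffle_test(BETA_GLOBIN, MYOGLOBIN, n_shuffles=100)
    print(f"real score {real}, shuffled mean {mean:.1f}, sd {sd:.1f}, z {z:.1f}")
```

Because the scoring scheme is simplified and the z value here is a plain (score - mean) / sd, the numbers printed by this sketch are not expected to match the Smith-Waterman score or the Z-score that PRSS reports. The point is only the procedure: shuffling preserves the length and composition of the second sequence, so the shuffled scores estimate what an unrelated sequence of the same composition would score by chance.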

 

 

 

# /seqprg/bin/prss34_t -p -q -w 80 -m 6 -Z 10000 -A -H -k 1000 -f -10 -g -2 @ TMP.q2
PRSS evaluates statistical signficance using Smith-Waterman
 version 34.26 January 12, 2007
Please cite:
 W.R. Pearson (1996) Meth. Enzymol. 266:227-258
 
@ - QUERY 147 aa
 vs TMP.q2 - QUERY shuffled sequence
 
 154000 residues in  1000 sequences
 (shuffled) MLE statistics: Lambda= 0.1727;  K=0.03112
 
 Smith-Waterman (3.5 Sept 2006) function [BL50 matrix (15:-5)], open/ext: -10/-2
 Scan time:  0.240

The best scores are:                                                          s-w bits E(10000)
QUERY                                                                  ( 154)  163 45.6 4.2e-06 align

>>>@, 147 aa vs TMP.q2 library

>>QUERY                                                                       (154 aa)
 initn: 163  Z-score: 213.9  bits: 45.6 E(): 4.2e-06
Smith-Waterman score: 163;  25.517% identity (58.621% similar) in 145 aa overlap (4-146:3-147)
               10        20          30        40        50        60        70        
QUERY  MVHLTPEEKSAVTALWGKVNVDEVGG--EALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPKVKAHGKKVLGAFSDGLAH
          :.  : . :  .::::..:  :   :.: ::.  .: : . :..:  :.. : . ..  .: ::  :: :..  : .
QUERY   MGLSDGEWQLVLNVWGKVEADIPGHGQEVLIRLFKGHPETLEKFDKFKHLKSEDEMKASEDLKKHGATVLTALGGILKK
                10        20        30        40        50        60        70         
 
       80        90       100       110       120       130       140             
QUERY  LDNLKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHFGKEFTPPVQAAYQKVVAGVANALAHKYH      
         . .. .  :.. :  : ..  . ...... .. ::  .   .:   .:.:..:..    . .: .:       
QUERY  KGHHEAEIKPLAQSHATKHKIPVKYLEFISECIIQVLQSKHPGDFGADAQGAMNKALELFRKDMASNYKELGFQG
      80        90       100       110       120       130       140       150    
 

 
 
147 residues in 1 query   sequences
154000 residues in 1000 library sequences
 Tcomplib [34.26] (2 proc)
 start: Thu Mar  1 07:07:25 2007 done: Thu Mar  1 07:07:25 2007
 Scan time:  0.240 Display time:  0.000
 
Function used was PRSS [version 34.26 January 12, 2007]
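
The bit scores and E() values in both runs can be checked against the Lambda and K values printed on the "(shuffled) MLE statistics" lines. The sketch below assumes the standard Karlin-Altschul relations, bits = (Lambda*S - ln K) / ln 2 and E = D*m*n*K*exp(-Lambda*S), with m = 147 and n = 154 as the two sequence lengths and D = 10000 taken from the E(10000) column header. It is a consistency check on the printed numbers, not a description of how PRSS computes them internally; the function names are hypothetical.

```python
import math


def bit_score(raw, lam, k):
    """Convert a raw alignment score to bits (Karlin-Altschul normalization)."""
    return (lam * raw - math.log(k)) / math.log(2)


def expect(raw, lam, k, m, n, db_size=1):
    """Expected number of chance scores >= raw in db_size comparisons of m x n."""
    return db_size * m * n * k * math.exp(-lam * raw)


RAW_SCORE = 163          # Smith-Waterman score reported in both runs
M, N = 147, 154          # lengths of the two sequences
DB_SIZE = 10000          # assumed from the E(10000) column header

# (label, Lambda, K) copied from the two "(shuffled) MLE statistics" lines
RUNS = [
    ("100 shuffles", 0.1701, 0.03061),
    ("1000 shuffles", 0.1727, 0.03112),
]

for label, lam, k in RUNS:
    bits = bit_score(RAW_SCORE, lam, k)
    e_val = expect(RAW_SCORE, lam, k, M, N, DB_SIZE)
    print(f"{label}: bits = {bits:.1f}, E({DB_SIZE}) = {e_val:.1e}")
```

Under these assumptions the computed values agree with the reported 45.0 bits and 6.3e-06 for 100 shuffles, and 45.6 bits and 4.2e-06 for 1000 shuffles. The raw score of 163 is identical in the two runs; only the fitted Lambda and K change slightly with the number of shuffles, which is why the significance estimates differ.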