Web document 5.2. PHI-BLAST.

 

We can perform a blastp search of the RefSeq database, restricted to bacterial sequences, using human retinol-binding protein 4 (RBP4) as the query.

 

The query is:

 

>gi|55743122|ref|NP_006735.2| retinol-binding protein 4, plasma precursor [Homo sapiens]
MKWVWALLLLAALGSGRAERDCRVSSFRVKENFDKARFSGTWYAMAKKDPEGLFLQDNIVAEFSVDETGQ
MSATAKGRVRLLNNWDVCADMVGTFTDTEDPAKFKMKYWGVASFLQKGNDDHWIVDTDYDTYAVQYSCRL
LNLDGTCADSYSFVFSRDPNGLPPEAQKIVRQRQEELCLARQYRLIVHNGYCDGRSERNLL

 

The two bacterial matches with low E values are:

 

>gi|119470685|ref|ZP_01613353.1| outer membrane lipoprotein (lipocalin) [Alteromonadales bacterium TW-7]
MKAITTILLITGLFLLTACTSAPEGITPVKNFDLEQYKGKWYEIARLDHSFEEGMEQVTATYTVNDDGTV
KVLNKGFITKEQKWDEAEGLAKFVEGTDTGHFKVSFFGPFYGAYVIFELDQDDYQYAFITSYNRDFLWFL
SRTPTVSDKLKQHFIAKANKLGFATEQIIWVKQ
 
>gi|84519543|ref|ZP_01006814.1| lipoprotein Blc [Prochlorococcus marinus str. MIT 9211]
MYLLLENGALAMMAVLRRWFLIVGLMGLASCTSLPEGIEPVSGFDSDRYLGTWYEIARLDHSFERGLTNV
RAEYSRNDDGSIKVINRGYNAEEEQWEEADGRAVFVEDENTGHLKVSFFGPFYASYVVFELDKDEYSYAY
VTGYDRDYLWFLSRTPEVS

 

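A multiple sequence alignment of the query and the two bacterial matches, generated with ClustalW (Chapter 6), is shown here. The highly conserved GXW motif (shaded green in the original figure) appears near the end of the first alignment block as GKW, GTW, and GTW: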

 

CLUSTAL W (1.83) multiple sequence alignment
 
 
ZP_01613353          ------------MKAITTILLITGL-FLLTACTSAPEGITPVKNFDLEQYKGKWYEIARL 47
ZP_01006814          MYLLLENGALAMMAVLRRWFLIVGL-MGLASCTSLPEGIEPVSGFDSDRYLGTWYEIARL 59
human_NP_006735      ------------MKWVWALLLLAALGSGRAERDCRVSSFRVKENFDKARFSGTWYAMAKK 48
                                 *  :   :*:..*    :   .  ..:   ..**  :: *.** :*: 
 
ZP_01613353          DHSFEEGMEQVTATYTVNDDGTVKVLNKGFITKEQKWDEAEGLA-KFVEGTDTGHFKVSF 106
ZP_01006814          DHSFERGLTNVRAEYSRNDDGSIKVINRGYNAEEEQWEEADGRA-VFVEDENTGHLKVSF 118
human_NP_006735      DPEGLFLQDNIVAEFSVDETGQMSATAKGRVRLLNNWDVCADMVGTFTDTEDPAKFKMKY 108
                     * .      :: * :: :: * :..  :*     ::*: . . .  *.:  :..::*:.:
 
ZP_01613353          FG--PFYG----AYVIFELDQDDYQYAFIT-------SYNRDFLWFLSRTP-TVSDKLKQ 152
ZP_01006814          FG--PFYA----SYVVFELDKDEYSYAYVT-------GYDRDYLWFLSRTP-EVS----- 159
human_NP_006735      WGVASFLQKGNDDHWIVDTDYDTYAVQYSCRLLNLDGTCADSYSFVFSRDPNGLPPEAQK 168
                     :*  .*       : :.: * * *   :             .: :.:** *  :.     
 
ZP_01613353          HFIAKANKLGFATEQIIWVKQ------------ 173
ZP_01006814          ---------------------------------
human_NP_006735      IVRQRQEELCLARQYRLIVHNGYCDGRSERNLL 201
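
The conservation of the GXW motif can be checked directly from the first alignment block. The following is a minimal sketch, not part of the original analysis: it copies the three aligned rows from the block above and reports the alignment column at which G, any residue, W first occurs in each row. The names `aligned`, `motif_column`, and `columns` are illustrative only.

```python
import re

# First block of the ClustalW alignment above, copied verbatim (gaps are "-").
aligned = {
    "ZP_01613353": "------------MKAITTILLITGL-FLLTACTSAPEGITPVKNFDLEQYKGKWYEIARL",
    "ZP_01006814": "MYLLLENGALAMMAVLRRWFLIVGL-MGLASCTSLPEGIEPVSGFDSDRYLGTWYEIARL",
    "human_NP_006735": "------------MKWVWALLLLAALGSGRAERDCRVSSFRVKENFDKARFSGTWYAMAKK",
}

# G, then any residue (but not a gap character), then W.
GXW = re.compile(r"G[A-Z]W")


def motif_column(row):
    """Return the 0-based alignment column of the first GXW match, or None."""
    match = GXW.search(row)
    return match.start() if match else None


columns = {name: motif_column(row) for name, row in aligned.items()}

for name, col in columns.items():
    motif = aligned[name][col:col + 3] if col is not None else "no match"
    print(f"{name:<18} column {col}  {motif}")

# The motif is aligned if every row reports the same column.
print("same column in all rows:", len(set(columns.values())) == 1)
```

If the motif is truly aligned, all three rows report the same column, which is what the asterisks in the ClustalW consensus line indicate for the G and W positions.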

 

Many patterns could be selected for a PHI-BLAST search. Here is one, in which X denotes any residue and square brackets enclose the residues permitted at that position:

GXW[YF]X[VILMAFY]A[RKH]XD
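
Because PHI-BLAST requires the pattern to occur in the query, it is worth confirming that the pattern above matches the human RBP4 query, and also the two bacterial proteins, before running a search. The sketch below is an illustration only and does not reproduce how PHI-BLAST itself scans a database: it translates the pattern into a Python regular expression and searches the three sequences listed earlier. The helper names `phi_pattern_to_regex` and `find_pattern` are hypothetical.

```python
import re

PATTERN = "GXW[YF]X[VILMAFY]A[RKH]XD"

# Sequences copied from the FASTA records above.
SEQUENCES = {
    "human_NP_006735": (
        "MKWVWALLLLAALGSGRAERDCRVSSFRVKENFDKARFSGTWYAMAKKDPEGLFLQDNIVAEFSVDETGQ"
        "MSATAKGRVRLLNNWDVCADMVGTFTDTEDPAKFKMKYWGVASFLQKGNDDHWIVDTDYDTYAVQYSCRL"
        "LNLDGTCADSYSFVFSRDPNGLPPEAQKIVRQRQEELCLARQYRLIVHNGYCDGRSERNLL"
    ),
    "ZP_01613353": (
        "MKAITTILLITGLFLLTACTSAPEGITPVKNFDLEQYKGKWYEIARLDHSFEEGMEQVTATYTVNDDGTV"
        "KVLNKGFITKEQKWDEAEGLAKFVEGTDTGHFKVSFFGPFYGAYVIFELDQDDYQYAFITSYNRDFLWFL"
        "SRTPTVSDKLKQHFIAKANKLGFATEQIIWVKQ"
    ),
    "ZP_01006814": (
        "MYLLLENGALAMMAVLRRWFLIVGLMGLASCTSLPEGIEPVSGFDSDRYLGTWYEIARLDHSFERGLTNV"
        "RAEYSRNDDGSIKVINRGYNAEEEQWEEADGRAVFVEDENTGHLKVSFFGPFYASYVVFELDKDEYSYAY"
        "VTGYDRDYLWFLSRTPEVS"
    ),
}


def phi_pattern_to_regex(pattern):
    """Translate X (any residue) to [A-Z]; bracketed sets pass through as-is."""
    parts = []
    in_brackets = False
    for char in pattern:
        if char == "[":
            in_brackets = True
        elif char == "]":
            in_brackets = False
        # Only a bare X outside brackets is treated as a wildcard.
        parts.append("[A-Z]" if char == "X" and not in_brackets else char)
    return re.compile("".join(parts))


def find_pattern(pattern, sequence):
    """Return (1-based start, matched residues) for the first hit, or None."""
    match = phi_pattern_to_regex(pattern).search(sequence)
    return (match.start() + 1, match.group()) if match else None


hits = {name: find_pattern(PATTERN, seq) for name, seq in SEQUENCES.items()}

for name, hit in hits.items():
    print(name, hit)
```

All three sequences contain the pattern: the query matches at GTWYAMAKKD, and the bacterial proteins match at GKWYEIARLD and GTWYEIARLD. This simple translation handles only the X wildcard and bracketed residue sets used here, not repeat counts or other pattern syntax.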