
Identification of Variable Regions of HIV-1 Proteases


[PURPOSE] Identification of variable regions of HIV-1 proteases. In particular, we want to identify structural differences between (1) crystal structures and NMR models, (2) crystal structures of different space groups, to assess the influence of crystal packing, and (3) substrate-bound forms and FDA drug-bound forms, to analyze the mechanism of multi-drug resistance due to non-active site mutations.

[METHOD] We can identify the five-residue fragments of an HIV-1 protease that form a "minor conformation" (i.e. a rare conformation) as follows:

Step1) Encode a collection of HIV-1 protease structures available from the PDB database, using the ProteinEncoder program.
* ProteinEncoder is available from the PROGRAM>ProteinEncoder section.
Step2) Compile a "fragment-to-D2 code" conversion table of five-residue fragments: the frag_code5_HIV1PR_ID.tbl table (sample).
* frag_code5_HIV1PR_ID.tbl is available from the PROGRAM>NtileCodePredictor>DOWNLOADS section.
Step3) Identify the five-residue fragments of an HIV-1 protease chain that are assigned a minor D2 code, using the NtileCodePredictor program with frag_code5_HIV1PR_ID.tbl:
* NtileCodePredictor is available from the PROGRAM>NtileCodePredictor section.
* Type the following command: NtileCodePredictor -t frag_code5_HIV1PR_ID.tbl  filename.code
Step4) See the ".pred" file (sample), where the residues with a minor D2 code are marked with "F" in the [Comp] row.

Note that NtileCodePredictor computes the most frequently occurring D2 code (the major D2 code) for each five-residue fragment based on the "fragment-to-D2 code" conversion table.
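
The major/minor distinction described in the note above can be sketched as follows. This is a minimal illustration only: the in-memory table layout, the function names (build_major_table, flag_minor), and the toy fragments are assumptions made for the example. They do not reflect the actual format of frag_code5_HIV1PR_ID.tbl or the internals of NtileCodePredictor.

```python
from collections import Counter, defaultdict


def build_major_table(observations):
    """Map each five-residue fragment to its most frequent (major) D2 code.

    `observations` is an iterable of (fragment, d2_code) pairs. This layout
    is an assumption for illustration, not the real .tbl file format.
    """
    counts = defaultdict(Counter)
    for fragment, code in observations:
        counts[fragment][code] += 1
    # most_common(1) yields the code with the highest count for the fragment.
    return {frag: c.most_common(1)[0][0] for frag, c in counts.items()}


def flag_minor(chain_fragments, major_table):
    """Return 'F' where the observed code differs from the major code, else '_'."""
    flags = []
    for fragment, code in chain_fragments:
        major = major_table.get(fragment)
        flags.append("F" if major is not None and code != major else "_")
    return "".join(flags)


# Toy data: "GGIGG" mostly assumes code "0" and occasionally "Q".
observations = [("GGIGG", "0")] * 3 + [("GGIGG", "Q")] + [("PQITL", "0")] * 2
major_table = build_major_table(observations)

chain = [("PQITL", "0"), ("GGIGG", "Q")]
print(flag_minor(chain, major_table))  # prints "_F"
```

A fragment that is absent from the table is left unflagged in this sketch; how the real program treats unseen fragments is not stated here.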

[RESULT]
See (2) for alternative D2 code assignments of an HIV-1 protease (obtained from 142 chains contained in 72 PDB files of the N37S mutant).
See (3.1) for crystal structures and NMR models, (3.2) for crystal structures of different space groups, and (3.3) for apo (ligand-free) forms and holo (ligand-bound) forms.

(0) Dataset used

All the structures of HIV-1 proteases refined at a resolution equal to or higher than 2.5 Å that are available from the PDB database on 2009-01-01:

Type                                          Dataset
Crystal structures of space group P21212      120 PDB entries
Crystal structures of space group P212121     106 PDB entries
Crystal structures of space group P61         64 PDB entries
Crystal structures of other space groups      32 PDB entries
NMR models                                    2 PDB entries (28 + 20 models)


The D2 codes of all the five-residue fragments contained in the dataset were computed to create a "fragment-to-D2 code" conversion table: the frag_code5_HIV1PR_ID.tbl table (sample).
* frag_code5_HIV1PR_ID.tbl is available from the PROGRAM>NtileCodePredictor>DOWNLOADS section.

[Remark] We did not use the structures refined at a resolution lower than 2.5 Å.


(1) Statistics of the conformations of five-residue fragments that occurred in HIV-1 proteases

Fragment coverage: 33 fragments (5.7%) occurred only once in the dataset. On the other hand, 12 fragments (2.1%) occurred more than 700 times.

Alternative conformations: 440 fragments (76.7%) are assigned a unique D2 code, although one fragment (amino acid sequence "GGIGG") assumes 9 different D2 codes. (56% of its occurrences assume D2 code "0", 37% assume "Q", 2% assume "R", 1% assume "O", and the others assume "1", "2", "3", "B", or "G".)
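
The two statistics above (how often each fragment occurs, and how many distinct D2 codes it assumes) can be computed along the following lines. This is a sketch under assumptions: the (fragment, code) pair layout and the function name fragment_statistics are hypothetical, and the toy data is not taken from the dataset. As a side note, the quoted percentages imply roughly 574 distinct fragments (440 / 0.767), a figure inferred here rather than stated in the text.

```python
from collections import Counter, defaultdict


def fragment_statistics(observations):
    """Summarize fragment coverage and alternative D2 code assignments.

    `observations` is an iterable of (fragment, d2_code) pairs
    (an assumed layout, for illustration only).
    """
    occurrences = Counter()
    codes = defaultdict(set)
    for fragment, code in observations:
        occurrences[fragment] += 1
        codes[fragment].add(code)

    total = len(occurrences)
    singletons = sum(1 for n in occurrences.values() if n == 1)
    unique_code = sum(1 for s in codes.values() if len(s) == 1)
    return {
        "fragments": total,
        "occur_once_pct": 100.0 * singletons / total,
        "unique_code_pct": 100.0 * unique_code / total,
        "max_codes": max(len(s) for s in codes.values()),
    }


# Toy data: four fragments, one of which assumes two different codes.
toy = [
    ("PQITL", "0"),
    ("QITLW", "0"), ("QITLW", "0"), ("QITLW", "0"),
    ("GGIGG", "0"), ("GGIGG", "Q"),
    ("ITLWQ", "1"),
]
stats = fragment_statistics(toy)
print(stats)
```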

 
[Figure: Fragment coverage]
[Figure: Alternative conformations (different D2 code assignments)]


(2) Alternative D2 code assignments of the N37S mutant


There are 72 PDB entries (142 chains) of the N37S mutant, which is the most common mutant in the dataset. We used this mutant to assess the local flexibility of an HIV-1 protease chain. A region of an amino acid chain is considered flexible if it assumes more than one D2 code. The bar graph below shows the breakdown of the D2 codes assigned to each residue:

Spatial distribution of the 99 amino acids of HIV-1 protease

[Figures: 10 images]
          1         10         20         30         40         50         60         70         80         90
AA seq.   PQITLWQRP LVTIKIGGQL KEALLDTGAD DTVLEEMSLP GRWKPKMIGG IGGFIKVRQY DQILIEICGH KAIGTVLVGP TPVNIIGRNL LTQIGCTLNF
* The residue colored blue shows the difference between the N37S mutant and the "Consensus B amino acid sequence."
Breakdown of D2 code assignments of the N37S mutant (142 chains)
jpg image
* Gray bars and dark gray bars in the background indicate the position of minor and major drug resistance mutations, respectively.

The D2 codes that occurred most frequently in the 142 chains of the N37S mutant ("Major" D2 code assignment of the N37S mutant)
Major D2 code:
--001R00R 000000RRG0 0000R0QBGR 0000000000 0000000000 0R00000000 0R00000RRG 000000000R 0R00000GAA AAABR000--
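
A positionwise "major" assignment such as the one above, together with the positions that count as flexible (more than one D2 code observed), can be derived from aligned per-chain code strings roughly as follows. This is a minimal sketch: the function names and the short toy strings are assumptions for illustration, and the real analysis used 142 chains of 99 residues.

```python
from collections import Counter


def major_assignment(code_strings):
    """Return the positionwise most frequent code across equal-length strings."""
    length = len(code_strings[0])
    if any(len(s) != length for s in code_strings):
        raise ValueError("all code strings must have the same length")
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*code_strings)
    )


def flexible_positions(code_strings):
    """Return 1-based positions where more than one code is observed."""
    return [
        i
        for i, column in enumerate(zip(*code_strings), start=1)
        if len(set(column)) > 1
    ]


# Toy example: four short chains; "-" marks residues without a D2 code.
chains = ["--001R0--", "--001R0--", "--0Q1R0--", "--001B0--"]
print(major_assignment(chains))    # prints "--001R0--"
print(flexible_positions(chains))  # prints [4, 6]
```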


Superimposition of the 57 chains that assume the same D2 code assignment as the major D2 code assignment of the N37S mutant
[Figures: front view and top view]
* The 57 chains include not only chains of the N37S mutant but also chains of other mutants.
* In this study, superimpositions of chains were computed with the DaliLite server.

(3) Distributions of the residues with a minor D2 code


(3.1) Variable regions of Crystal structures and NMR models

The bar graphs below show the spatial distribution of the occurrence of minor D2 codes among structures of a mutant.

Crystal structures: There are 72 PDB files (142 chains) of the N37S mutant, where 348 residues of the 142 chains are assigned a minor D2 code. 40% of the minor D2 codes are assigned at residue 50 or 51 and 15% are assigned at residue 5. The deformations are probably explained by crystal packing and variation in the size of the bound ligands.

NMR liganded dimer: There is a PDB file (28 NMR models, 56 chains) of a liganded HIV-1 protease (PDB ID 1BVE), where 480 residues of the 56 chains are assigned a minor D2 code. 22% of the minor D2 codes are assigned at the loop around residue 40 and 10% are assigned at the loop around residue 60. The deformations are probably explained by collision with another molecule. Note that no minor D2 code is assigned at the loop around residue 50.

NMR monomer: There is a PDB file (20 NMR models, 20 chains) of an HIV-1 protease monomer (PDB ID 1Q9P), where 367 residues of the 20 chains are assigned a minor D2 code. 15% of the minor D2 codes are assigned at residues around the active site (residues 25, 26, and 27) and 14% are assigned at the loop around residue 50. The deformations are probably explained by the absence of a bound ligand.
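
The totals, per-chain averages (for example 348 / 142 ≈ 2.5), and per-residue percentages quoted above can be tallied from per-chain flag rows in which "F" marks a residue with a minor D2 code. The sketch below assumes one flag string per chain, similar in spirit to the [Comp] row of a ".pred" file; the function name and the toy rows are hypothetical, and the actual ".pred" layout is not reproduced here.

```python
from collections import Counter


def minor_code_distribution(comp_rows):
    """Tally 'F' flags over chains.

    Returns (total flags, flags per chain, {1-based residue: percent of flags}).
    """
    per_residue = Counter()
    for row in comp_rows:
        for position, flag in enumerate(row, start=1):
            if flag == "F":
                per_residue[position] += 1

    total = sum(per_residue.values())
    share = (
        {pos: 100.0 * n / total for pos, n in per_residue.items()}
        if total
        else {}
    )
    return total, total / len(comp_rows), share


# Toy rows for four short chains (not real data).
rows = ["--__F___--", "--__F_F_--", "--______--", "--____F_--"]
total, per_chain, share = minor_code_distribution(rows)
print(total, per_chain, share)
```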


72 crystal structures of the N37S mutant (dimer)
(348 residues with a minor D2 code / 142 chains = 2.5)
[Figure: bar graph of minor D2 code occurrences]
[Figures: superimposition of the chains (front view / top view) for P21212 (98 chains), P212121 (24 chains), and P61 (18 chains)]


28 NMR models of a liganded triple mutant (1BVE, dimer)
(480 residues with a minor D2 code / 56 chains = 8.6)
[Figure: bar graph of minor D2 code occurrences]
[Figure: superimposition of the chains (front view / top view)]


20 NMR models of an HIV-1 PR monomer (1Q9P)
(367 residues with a minor D2 code / 20 chains = 18.4)
[Figure: bar graph of minor D2 code occurrences]
[Figure: superimposition of the chains (front view / top view)]


* Gray bars and dark gray bars in the background indicate the positions of minor and major drug resistance mutations, respectively.
* In the right figures, yellow spheres indicate the positions of the residues that assume a minor D2 code.
* In the right figures, blue circles indicate the positions of the residues with a minor D2 code that are shared by crystals of multiple space groups.
* In the right figures, red circles indicate the positions of the residues with a minor D2 code that are unique to the crystals of the corresponding space group.
* In this study, superimpositions of chains were computed with the DaliLite server.
* Residues with a minor D2 code are detected by NtileCodePredictor as residues where the prediction failed (i.e. marked with "F" in the ".pred" file).
* Note that different N37S mutants are bound to different ligands. In contrast, the NMR models of 1BVE are bound to the same ligand.




[FOR REFERENCE (1)] Distribution of minor PB-block codes    (Added on 2009-04-22)

The bar graphs below show the spatial distribution of the occurrence of minor PB-block codes among structures of a mutant.
* PB blocks are a set of 16 short structural motifs of length five residues. 
* See Protein Blocks Expert Home (http://bioinformatics.univ-reunion.fr/PBE) for more info.


72 crystal structures of the N37S mutant (dimer)
(1008 residues with a minor PB code / 142 chains = 7.1)
[Figure: bar graph]
* Major PB-block codes of the N37S mutant (for each position, the PB-block code that occurred most frequently in the 142 chains of the N37S mutant):

ZZDFKBCCDDDDDEEHIACDDDDDFKBMBDCDDDDDDEHJACDDDDDFKNOIACDDDDEHIACDEEHIACDFBLCDFKLCBDCDFKLMMMMNOPACDZZ


28 NMR models of a liganded triple mutant (1BVE, dimer)
(1074 residues with a minor PB code / 56 chains = 19.2)
[Figure: bar graph]
* Major PB-block codes of 1BVE (for each position, the PB-block code that occurred most frequently in the chains of the 28 NMR models of the 1BVE molecule):

ZZFKLMBLCDDDDEHHIACDDDDDFKBCKLPCDDFBLCHJMMDFDDDFBGOIACFBLCEHIACDEEHIACDFBLCDFKLCBDCDFKLMMMMNOPACDZZ



[FOR REFERENCE (2)] Distribution of minor phi and psi dihedral angles    (Added on 2009-04-29)

The bar graphs below show the spatial distribution of the occurrence of large deviation of dihedral angles from the average. Spatial distributions of the occurrence of minor D2 code and PB-block code are also shown for comparison.

* [Definition] dPHIi := PHIi - <PHIi>, where PHIi is the phi dihedral angle of the i-th residue and <PHIi> is the average over the chains (142 chains in the case of N37S mutants, and 56 chains in the case of NMR models of 1BVE).
* [Definition] dPSIi := PSIi - <PSIi>, where PSIi is the psi dihedral angle of the i-th residue and <PSIi> is the average over the chains (142 chains in the case of N37S mutants, and 56 chains in the case of NMR models of 1BVE).
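
The two definitions above can be sketched in code as follows. The sketch takes the definitions literally, using a plain arithmetic mean over chains, and compares the absolute deviation with the threshold; reading "away from the average by more than" as an absolute value is an assumption made here. The function names and the toy angles are hypothetical.

```python
def angle_deviations(angles_by_chain):
    """Return d_i = angle_i - <angle_i> for every chain.

    `angles_by_chain` is a list of per-chain lists of angles in degrees,
    all of the same length. <angle_i> is the arithmetic mean over chains,
    taken literally from the definitions (no circular averaging).
    """
    n_chains = len(angles_by_chain)
    means = [sum(column) / n_chains for column in zip(*angles_by_chain)]
    return [[a - m for a, m in zip(chain, means)] for chain in angles_by_chain]


def count_large_deviations(dphi, dpsi, threshold):
    """Count residues where |dPHI| or |dPSI| exceeds `threshold` degrees."""
    return sum(
        1
        for chain_phi, chain_psi in zip(dphi, dpsi)
        for p, s in zip(chain_phi, chain_psi)
        if abs(p) > threshold or abs(s) > threshold
    )


# Toy data: two chains, two residues each (degrees).
phi = [[-60.0, -170.0], [-60.0, 170.0]]
psi = [[-45.0, 120.0], [-45.0, 120.0]]
dphi = angle_deviations(phi)
dpsi = angle_deviations(psi)
print(count_large_deviations(dphi, dpsi, 110.0))  # prints 2
```

In the toy data, -170° and +170° are only 20° apart on the circle, yet each deviates by 170° from their arithmetic mean of 0°. Two chains with offset averages can therefore both be flagged even though their conformations are close, which is similar to the chain A / chain B effect noted in this section.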


The top row shows the residues whose phi angle or psi angle deviates from the average by more than 110°: 386 residues in 72 crystals (142 chains) and 480 residues in 28 NMR models (56 chains). The phi or psi angles of residues 40 and 73 almost always deviate from the average by more than 110° because chain A and chain B of the dimers have different average values at these residues (for example, X and X+340). On the other hand, the phi and psi angles of residue 50 (the tip of the flap) are almost always within 110° of the average.

The second row shows the residues whose phi angle or psi angle deviates from the average by more than 30°. The phi or psi angles of residues 40, 50, 51, 73, 78, and 86 almost always deviate from the average by more than 30° (because chain A and chain B of the dimers have different average values at these residues).

In either case, the regions that could form multiple conformations ("residues associated with minor phi or psi angles") in crystal structures differ from those in NMR models.


Criterion                         72 crystal structures of the N37S mutant   28 NMR models (1BVE)
dPHIi > 110° or dPSIi > 110°      386 residues (2.7 residues/chain)          480 residues (8.6 residues/chain)
dPHIi > 30° or dPSIi > 30°        1037 residues (7.3 residues/chain)         1498 residues (26.8 residues/chain)
Minor D2 code (for comparison)    348 residues (2.5 residues/chain)          480 residues (8.6 residues/chain)
Minor PB code (for comparison)    1008 residues (7.1 residues/chain)         1074 residues (19.2 residues/chain)
[Figures: one bar graph per table cell]







(3.2) Variable regions and space groups

The top bar graph (dark grey) shows the spatial distribution of the occurrence of minor D2 codes among the N37S mutants of a space group. The bottom bar graph (grey) shows the spatial distribution among all of the HIV-1 proteases of the same space group. Both graphs show very similar patterns of distribution. The bar graphs and figures below show that the crystals of space group P212121 are more influenced by crystal packing than others. On the other hand, the crystals of space group P61 are more flexible than others.

Space group P 21 21 2: There are 49 P21212 crystals of the N37S mutants (dimers), where 219 residues of the 98 chains are assigned a minor D2 code. 48% of the minor D2 codes are assigned at residue 50 or 51, 18% are assigned at residue 5, and 11% are assigned at residue 69. On the other hand, there are 120 P21212 crystals of HIV-1 protease dimers, where 530 residues of the 240 chains are assigned a minor D2 code.

Space group P 21 21 21: There are 12 P212121 crystals of the N37S mutants (dimers), where 57 residues of the 24 chains are assigned a minor D2 code. 37% of the minor D2 codes are assigned at residue 50 or 51, 18% are assigned at residue 24, 12% are assigned at residue 30, and 11% are assigned at residue 5. On the other hand, there are 106 P212121 crystals of the HIV-1 protease dimers, where 569 residues of the 212 chains are assigned a minor D2 code. The minor D2 codes at residues 16, 40, and 61 are probably explained by crystal packing.

Space group P 61: There are 9 P61 crystals of the N37S mutants (dimers), where 64 residues of the 18 chains are assigned a minor D2 code. 19% of the minor D2 codes are assigned at residue 50 or 51, 13% are assigned at residue 30, 11% are assigned at residue 24, 9% are assigned at residue 7, and 9% are assigned at residue 89. On the other hand, there are 64 P61 crystals of the HIV-1 protease dimers, where 430 residues of the 128 chains are assigned a minor D2 code. Note that no minor D2 code is assigned at the two crystal packing interaction points (residues 46 and 53).
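
As a quick check of the counts above, the number of minor-code residues per chain can be compared across space groups. The figures below are transcribed from the three paragraphs; the variable and function names are only for illustration.

```python
# (residues with a minor D2 code, chains), transcribed from the text above.
n37s = {"P21212": (219, 98), "P212121": (57, 24), "P61": (64, 18)}
all_pr = {"P21212": (530, 240), "P212121": (569, 212), "P61": (430, 128)}


def per_chain(table):
    """Return minor-code residues per chain for each space group."""
    return {sg: residues / chains for sg, (residues, chains) in table.items()}


for label, table in (("N37S", n37s), ("all", all_pr)):
    rates = per_chain(table)
    # List space groups from the lowest to the highest rate.
    for sg, rate in sorted(rates.items(), key=lambda kv: kv[1]):
        print(f"{label:5s} {sg:8s} {rate:.2f}")
```

In both sets, P61 has the highest rate (about 3.6 per chain for N37S and 3.4 for all proteases), followed by P212121 and then P21212.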


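Space group P 21 21 2
[Figure: bar graphs of minor D2 code occurrences]
[Figure: superimposition of 98 N37S chains]


Space group P 21 21 21
[Figure: bar graphs of minor D2 code occurrences]
[Figure: superimposition of 24 N37S chains]


Space group P 61
[Figure: bar graphs of minor D2 code occurrences]
[Figure: superimposition of 18 N37S chains]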


* Gray bars and dark gray bars in the background indicate the positions of minor and major drug resistance mutations, respectively.
* In the right figures, yellow spheres indicate the positions of the residues that assume a minor D2 code.
* In the right figures, blue circles indicate the positions of the residues with a minor D2 code that are shared by crystals of multiple space groups.
* In the right figures, red circles indicate the positions of the residues with a minor D2 code that are unique to the crystals of the corresponding space group.
* In this study, superimpositions of chains were computed with the DaliLite server.
* Residues with a minor D2 code are detected by NtileCodePredictor as residues where the prediction failed (i.e. marked with "F" in the ".pred" file).


(3.3) Apo (ligand-free) forms and Holo (ligand-bound) forms

The bar graphs below show the spatial distribution of the occurrence of minor D2 codes among the structures of HIV-1 proteases.

Residues 50, 51, and 69 often assume a minor D2 code in both substrate-bound (or substrate analog-bound) and FDA drug-bound forms, although substrate-bound (or substrate analog-bound) forms have fewer residues with a minor D2 code than FDA drug-bound forms. In particular, residues 24, 30, and 79 often assume a minor D2 code in FDA drug-bound forms. Some of the substrate-bound (or substrate analog-bound) forms of space group P212121 have a minor D2 code at residues 16, 30, and 37 (probably due to crystal packing).


Space group   Ligand-free form   Substrate (analog)-bound form   FDA drug-bound form
P21212        -                  [Figure]                        [Figure]
P212121       -                  [Figure]                        [Figure]
P61           [Figure]           [Figure]                        [Figure]
Others        [Figure]           -                               [Figure]
* Gray bars and dark gray bars in the background indicate the position of minor and major drug resistance mutations, respectively.
* P61 APO (4 chains): They are two tethered molecules.
* P61 SUB (4 chains): They are one dimer and one tethered molecule.



Superimpositions of the corresponding chains:

Space group   Ligand-free form   Substrate (analog)-bound form   FDA drug-bound form
P21212        -                  [Figure]                        [Figure]
P212121       -                  [Figure]                        [Figure]
P61           [Figure]           [Figure]                        [Figure]
Others        [Figure]           -                               [Figure]
* In the figures, yellow spheres indicate the positions of the residues that assume a minor D2 code.
* In this study, superimpositions of chains were computed with the DaliLite server.


(4) Examples


(4.1) Substrate-bound forms and FDA drug-bound forms of a resistant mutant (V82A)

Substrate-bound forms with fewer residues with a minor D2 code correspond to better (smaller) Ki values. That is, a mutant does not disrupt ligand binding severely when only a few residues assume a minor D2 code.


In the following, all of the chains have the same amino acid sequence and all of the crystals are in the same space group of P21212:

PDB Id   Ligand                      Inhibition constant (Ki)
(Below each entry: residues with a minor D2 code, marked with "F")

The amino acid sequence (note 1):
              1         2         3         4         5         6         7         8         9
     123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
AA         K                         I   S                         I   A              A            A

2AOE     Substrate analog (CA-P2)    0.024 µM (note 2)
              1         2         3         4         5         6         7         8         9
     123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
ChnA --_____________________________________________________________________________________________----
ChnB --_______________________________________________FF____________________________________________----

2AOG     Substrate analog (P2-NC)    0.53 µM (note 2)
              1         2         3         4         5         6         7         8         9
     123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
ChnA --_____________________________________________________________________________________________----
ChnB --_______________________________________________FF____________________________________________----

2AOF     Substrate analog (P1-P6)    28.2 µM (note 2)
              1         2         3         4         5         6         7         8         9
     123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
ChnA --__F__________________________________________________________________________________________----
ChnB --_______________________________________________FF_________________F__________________________----

2AOH     Substrate analog (P6-PR)    36.3 µM (note 2)
              1         2         3         4         5         6         7         8         9
     123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
ChnA --_______________________________________________FF____________________________________F_______----
ChnB --_____________________F__________________________F_________________F__________________________----

2NMY     FDA drug (SQV)              -
              1         2         3         4         5         6         7         8         9
     123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
ChnA --_____________________________________________________________________________________F_______----
ChnB --_______________________________________________FF____________________________________________----

1SDV     FDA drug (IDV)              3.3-fold relative to wild type (note 3)
              1         2         3         4         5         6         7         8         9
     123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
ChnA --__F____________________________________________FF____________________________________________----
ChnB --__________________________________________________________________F__________________________----

1) Only mutations from the "consensus B sequence" are shown, and the resistance mutation V82A is colored red, where
                                              1         2         3         4         5         6         7         8         9
                                     123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
   Consensus B amino acid sequence : PQITLWQRPLVTIKIGGQLKEALLDTGADDTVLEEMNLPGRWKPKMIGGIGGFIKVRQYDQILIEICGHKAIGTVLVGPTPVNIIGRNLLTQIGCTLNF
2) Tie, Y., Boross, P.I., Wang, Y.F., Gaddis, L., Liu, F., Chen, X., Tozser, J., Harrison, R.W., Weber, I.T. (2005) Molecular basis for substrate recognition and drug resistance from 1.1 to 1.6 angstroms resolution crystal structures of HIV-1 protease mutants with substrate analogs. FEBS J. 272: 5265-5277
3) Mahalingam, B., Wang, Y.-F., Boross, P.I., Tozser, J., Louis, J.M., Harrison, R.W., Weber, I.T. (2004) Crystal structures of HIV protease V82A and L90M mutants reveal changes in the indinavir-binding site. Eur. J. Biochem. 271: 1516-1524
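
The trend stated in (4.1), that substrate analog-bound forms with smaller Ki values have fewer residues with a minor D2 code, can be checked with a few lines of code. The Ki values and the "F" counts (chain A + chain B) below are transcribed from the table in (4.1); the function name is hypothetical, and with only four entries this is a consistency check rather than a statistical test.

```python
# (PDB ID, Ki in µM, residues with a minor D2 code in chain A + chain B),
# transcribed from the table in (4.1).
substrate_forms = [
    ("2AOE", 0.024, 0 + 2),
    ("2AOG", 0.53, 0 + 2),
    ("2AOF", 28.2, 1 + 3),
    ("2AOH", 36.3, 3 + 3),
]


def counts_follow_ki(entries):
    """Return True if minor-code counts never decrease as Ki increases."""
    ordered = sorted(entries, key=lambda entry: entry[1])
    counts = [entry[2] for entry in ordered]
    return all(a <= b for a, b in zip(counts, counts[1:]))


for pdb_id, ki, n_minor in sorted(substrate_forms, key=lambda entry: entry[1]):
    print(f"{pdb_id}  Ki = {ki:g} µM  minor-code residues = {n_minor}")
print(counts_follow_ki(substrate_forms))  # prints True
```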