Comparative Analysis: The D2 Encoding and Existing Methods

(1) GroES 'Mobile Loop'

"We show that structural diversity has a significant effect on structural alignment. Moreover, we observe alignment inconsistencies even for modest spatial divergence, implying that the biological interpretation of alignments is less straightforward than commonly assumed. A salient example is the GroES 'mobile loop' where sub-Ångstrom variations give rise to contradictory sequence alignments." [1]

The D2 encoding method (that is, the programs ProteinEncoder and ComSubStruct) is compared with existing methods.
For the comparative study we used the twelve GroES chains from the X-ray crystal structures examined in [1]: 1p3h (chain N), 1aon (chains O, P, and S), 1pcq (chains O, P, Q, R, and S), 1pf9 (chains O and R), and 1svt (chain O).

(1.1) 1D representation of protein backbone

The D2 code (ProteinEncoder) is compared with two encoding methods: HMM-SA (structural alphabet) [2] and Stride (secondary structure assignment) [3].


(1.2) Structural alignment

Alignment by ComSubStruct is compared with five structural alignment tools: CE [4], DALI [5], HMM-SA [2], FATCAT [6], and MATT [7].


[Figure, panels (A)-(D): (A) Structural variations of GroES. (B) D2 code-variable residues of GroES. (C) HMM-SA-variable residues of GroES. (D) Stride-variable residues of GroES.]
(A) The 'master-slave' alignments by DALI with 1p3h-N as master, where residues of the base loop are shown in red, the roof loop in blue, and the 79-83 turn in yellow. [8]
(B) The 'master-slave' alignments by DALI with 1p3h-N as master, where residues with different D2 codes are colored accordingly: '0' in blue, 'A' in red, 'B' and 'Q' in orange, 'G' in green, and 'R' in yellow.
(C) Only the structure of 1p3h-N is shown, where residues with different HMM-SA codes are shown in blue.
(D) Only the structure of 1p3h-N is shown, where residues with different Stride codes are shown in blue.
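Panels (B) to (D) mark residues whose 1D code differs among the compared chains. A minimal sketch of that idea follows, assuming each chain has already been reduced to a code string of equal length with one character per residue in the same residue order. The function name `variable_positions` and the toy strings are hypothetical; they are not output of ProteinEncoder, HMM-SA, or Stride.

```python
def variable_positions(codes):
    """Return 0-based positions where the aligned code strings disagree.

    `codes` maps a chain label to its 1D code string. All strings are
    assumed to be the same length and already in register (assumption).
    """
    lengths = {len(s) for s in codes.values()}
    if len(lengths) != 1:
        raise ValueError("all code strings must have the same length")
    # zip(*...) walks the strings column by column, one residue at a time.
    columns = zip(*codes.values())
    return [i for i, col in enumerate(columns) if len(set(col)) > 1]


# Hypothetical toy strings, NOT real encoder output.
toy = {"chain1": "AAB0G", "chain2": "AAB0G", "chain3": "AAQ0R"}
print(variable_positions(toy))  # [2, 4]
```

The same function applies to any per-residue alphabet, so the three encodings can be compared by running it once per encoding and comparing the resulting position lists.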


[1] Pirovano W, Feenstra KA, Heringa J. The meaning of alignment: lessons from structural diversity. BMC Bioinformatics. 2008 Dec 23;9:556.
[2] Camproux AC, Gautier R, Tuffery P. A hidden markov model derived structural alphabet for proteins. J Mol Biol. 2004 Jun 4;339(3):591-605.
[3] Heinig M, Frishman D. STRIDE: a Web server for secondary structure assignment from known atomic coordinates of proteins. Nucleic Acids Res. 2004;32:W500-2.
[4] Shindyalov IN, Bourne PE. Protein structure alignment by incremental combinatorial extension (CE) of the optimal path. Protein Eng. 1998;11:739–747.
[5] Holm L, Park J. DaliLite workbench for protein structure comparison. Bioinformatics. 2000;16:566–567.
[6] Ye Y, Godzik A. Flexible structure alignment by chaining aligned fragment pairs allowing twists. Bioinformatics. 2003;19:ii246–255.
[7] Menke M, Berger B, Cowen L. Matt: local flexibility aids protein multiple structure alignment. PLoS Comput Biol. 2008;4:e10.
[8] Roberts MM, Coker AR, Fossati G, Mascagni P, Coates ARM, Wood SP. Mycobacterium tuberculosis chaperonin 10 heptamers self-associate through their biologically active loops. J Bacteriol. 2003;185:4172-4185.

(2) HIV-1 PR

Human immunodeficiency virus type 1 protease (HIV-1 PR) is one of the major anti-HIV-1 drug targets [1], and a large collection of crystal structures of its variants is available in the PDB. HIV-1 PR is a homodimer consisting of two identical 99-residue polypeptide chains. It is a C2-symmetric molecule, with a twofold axis passing through the active site (residues Asp25, Thr26, and Gly27).

For comparative performance analysis, we consider the identification of variable regions in HIV-1 PR molecules using the D2 code method and two existing methods: a (Phi, Psi) dihedral angle-based method [2] and the Protein Blocks (PB) method (structural alphabet) [3].
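For readers unfamiliar with the (Phi, Psi) description, the sketch below shows how a backbone dihedral angle can be computed from four atom positions. It illustrates only the standard geometric definition, not the method of [2]. Phi of residue i is the dihedral over C(i-1), N(i), CA(i), C(i), and Psi is the dihedral over N(i), CA(i), C(i), N(i+1). The function name `dihedral_deg` and the test coordinates are hypothetical.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees, range -180..180) for four 3D points."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    # atan2 form: numerically stable and returns the signed angle directly.
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Hypothetical coordinates chosen so the four points form a right-angle twist.
angle = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
print(round(angle, 3))  # 90.0
```

To obtain Phi or Psi for a real structure, the four points would be the backbone atom coordinates listed above, read from the PDB file by whatever parser is at hand.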

The study is based on 72 crystal structures (142 chains) of the N37S mutant and the 28 NMR models (56 chains) of a mutant (1bve).


[Figure, panels (A)-(D), top and front views: (A) N37S (P21212) (B) N37S (P212121) (C) N37S (P61) (D) 1bve (NMR)]
'Master-slave' alignments by DALI, where the master is 1d4jA for crystal structures and model26-A for NMR models.
In the figure, top and front views are given, where yellow spheres indicate the position of the residues that assume a "minority" D2 code (i.e., D2-variable residues).
(A) 49 crystal structures (98 chains) of space group P21212 of the N37S mutant.
(B) 12 crystal structures (24 chains) of space group P212121 of the N37S mutant.
(C) 9 crystal structures (18 chains) of space group P61 of the N37S mutant.
(D) 28 NMR models (56 chains) of a mutant.
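The caption above highlights residues that assume a "minority" D2 code. The sketch below shows one way such residues could be picked out, under the assumption that "minority" means "different from the most common code at that position"; the source does not spell out the exact criterion. The function name `minority_residues` and the toy strings are hypothetical, and the strings are assumed to be of equal length and in register.

```python
from collections import Counter


def minority_residues(codes):
    """Map each variable position to the chains whose code is not the majority code.

    `codes` maps a chain label to its 1D code string. Positions where all
    chains agree are omitted. On a tie, the code seen first counts as the
    majority (Counter.most_common ordering), which is an arbitrary choice.
    """
    if len({len(s) for s in codes.values()}) != 1:
        raise ValueError("all code strings must have the same length")
    names = list(codes)
    result = {}
    for i, col in enumerate(zip(*codes.values())):
        majority, _ = Counter(col).most_common(1)[0]
        outliers = [n for n, c in zip(names, col) if c != majority]
        if outliers:
            result[i] = outliers
    return result


# Hypothetical toy strings, NOT real D2 codes of HIV-1 PR.
toy_codes = {"m1": "AAB0", "m2": "AAB0", "m3": "AQB0", "m4": "AAG0"}
print(minority_residues(toy_codes))  # {1: ['m3'], 2: ['m4']}
```

Keeping the chain labels alongside each position makes it possible to see whether the minority code is confined to one space group or NMR model.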


[1] Wlodawer A, Vondrasek J. Inhibitors of HIV-1 protease: a major success of structure-assisted drug design. Annu Rev Biophys Biomol Struct 1998;27:249-84.
[2] Kuznetsov IB. Ordered conformational change in the protein backbone: prediction of conformationally variable positions from sequence and low-resolution structural data. Proteins 2008; 72:74–87.
[3] Tyagi M, Sharma P, Swamy CS, Cadet F, Srinivasan N, de Brevern AG, Offmann B. Protein Block Expert (PBE): a web-based protein structure analysis server using a structural alphabet. Nucleic Acids Res. 2006 Jul 1;34(Web Server issue):W119-23.

See also EXAMPLES (Prediction/Alignment/Others) > Structural alignment > HIV-1 PR variants
and EXAMPLES (Prediction/Alignment/Others) > Prediction > Variable regions of HIV-1 PR.