[Statistics of variable regions]


A "variable region" of a structure-dissimilar pair is a region where the two members of the pair fold into different structures.
That is, a variable region is an amino-acid sequence that can adopt different secondary structures in proteins.
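
As a rough illustration of the definition, the sketch below locates candidate variable regions in a pair with identical sequences, given one structure-code character per residue for each member (for example, the 5-tile codes used on this page). It rests on an assumption: a region is taken to be a maximal run of positions where the two codes differ. The page does not state how the regions were actually delimited, and the function name `differing_runs` and the input strings are made up for the example.

```python
def differing_runs(left: str, right: str) -> list[tuple[int, int]]:
    """Return (start, length) for each maximal run where left[i] != right[i].

    Assumption for illustration only: a "variable region" is a run of
    consecutive residues whose per-residue structure codes differ.
    """
    if len(left) != len(right):
        raise ValueError("code strings must have equal length")
    runs = []
    start = None  # index where the current differing run began, if any
    for i, (a, b) in enumerate(zip(left, right)):
        if a != b:
            if start is None:
                start = i
        elif start is not None:
            runs.append((start, i - start))
            start = None
    if start is not None:  # a run that extends to the end of the strings
        runs.append((start, len(left) - start))
    return runs


# Made-up code strings, not taken from the database.
print(differing_runs("AAAAB00RR0", "AAB0B00GQ0"))
```

With these inputs the function reports two runs of length 2, starting at positions 2 and 7 (0-based).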

The following table shows the length frequency distribution of the variable regions of the structure-dissimilar pairs (158 pairs with 100% sequence identity and RMSD ≥ 6 Å).


Length frequency distribution of the variable regions that occurred in the database (158 pairs with 100% sequence identity and RMSD ≥ 6 Å)


Each entry gives the length, the frequency (count and percentage), and up to three examples. Each example lists the amino-acid sequence (AA), the 5-tile code (left), and the 5-tile code (right).

Length 1: 471 (60.8%)
  • AA: V; Left: 0; Right: R
  • AA: D; Left: 0; Right: G
  • AA: I; Left: B; Right: G

Length 2: 133 (17.2%)
  • AA: QW; Left: 00; Right: R
  • AA: KY; Left: AA; Right: BQ
  • AA: QT; Left: QB; Right: R

Length 3: 66 (8.5%)
  • AA: RIV; Left: AAA; Right: B1Q
  • AA: DPV; Left: 00Q; Right: G2A
  • AA: YGS; Left: 0R0; Right: RQB

Length 4: 49 (6.3%)
  • AA: SQQS; Left: 0RRR; Right: R0QG
  • AA: LKLH; Left: AAAA; Right: BRRQ
  • AA: TGTN; Left: AAB0; Right: BG0G

Length 5: 18 (2.3%)
  • AA: EDDHH; Left: 0RQBO; Right: R0RRR
  • AA: GEADC; Left: 00001; Right: QBQBG
  • AA: KDTDS; Left: AAAAA; Right: G0GRQ

Length 6: 15 (1.9%)
  • AA: DIDATF; Left: 0QBRR0; Right: RG0QBG
  • AA: RGEMSQ; Left: AAAAAA; Right: BRGG0Q
  • AA: WFGNAQ; Left: AAAB00; Right: B000RG

Length 7: 10 (1.3%)
  • AA: NELGAGI; Left: AAAAAAA; Right: B0RR00Q
  • AA: VADKGYT; Left: 00RRRRG; Right: R300000
  • AA: YHGAGSK; Left: RGRGRGB; Right: QAB00R0

Length 8: 3 (0.4%)
  • AA: DAVTFVNA; Left: 00000ABH; Right: 1BGRG0RQ
  • AA: GDTPIDTF; Left: 1QAAAAAA; Right: 00R0001Q
  • AA: KSPELQAE; Left: AGQB01R1; Right: BHRGR0QA

Length 9: 2 (0.3%)
  • AA: KTPETAGLI; Left: 00QB00RQG; Right: QG01RG00R
  • AA: QHTIDLTDS; Left: AAAAAAAAA; Right: BRR00QBGQ

Length 10: 1 (0.1%) (cluster 15)
  • AA: SSGAASTQSL; Left: 00000QBR00; Right: QB1RG000RG

Length 11: 3 (0.4%)
  • AA: VKSPELQAEAK; Left: AAAAB00RGQA; Right: HB1QAAG0QBQ
  • AA: EGEETITAFVE; Left: 0RR0RBBQBAA; Right: RG0R01QAABQ
  • AA: ANIVRDLFASV; Left: 00R00RG000Q; Right: QAAAAABRQBR

Length 12: 1 (0.1%) (cluster 17)
  • AA: YVEANMGLNPSS; Left: AB01RRQBBGR0; Right: BHQ00000000R

Length 17: 1 (0.1%) (cluster 01)
  • AA: RGLLGCIITSLTGRDKN; Left: 00QAAAAAAAABRG0RG; Right: G0G0000000QR10R1R

Length 21: 1 (0.1%) (cluster 22)
  • AA: IEKTNEKFHQIEKEFSEVEGR; Left: AAAAAAAAAAAAAAAAAAAAA; Right: BG0000000000000R0000

Length 38: 1 (0.1%) (cluster 48)
  • AA: LDMGNEIDTQNRQIDRIMEKADSNKTRIDEANQRATKM; Left: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA; Right: B01RG01RRG0000000000RRR00000000000R00R

Total: 775 (100.0%)
[NOTE] Average length of the 775 variable regions = 2.00
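
The total, the percentages, and the average length can be rechecked from the counts in the table. The snippet below is a minimal sketch of that arithmetic; the variable names are illustrative and the counts are copied from the table, not recomputed from the database.

```python
# Number of variable regions for each length, copied from the table.
counts = {1: 471, 2: 133, 3: 66, 4: 49, 5: 18, 6: 15, 7: 10, 8: 3,
          9: 2, 10: 1, 11: 3, 12: 1, 17: 1, 21: 1, 38: 1}

total = sum(counts.values())                                 # number of regions
residues = sum(length * n for length, n in counts.items())   # summed region lengths
mean_length = residues / total
percent = {length: 100.0 * n / total for length, n in counts.items()}

print(f"total regions  = {total}")
print(f"total residues = {residues}")
print(f"mean length    = {mean_length:.3f}")
for length, n in counts.items():
    print(f"{length:>2}  {n:>3}  ({percent[length]:.1f}%)")
```

The counts sum to 775 regions covering 1554 residues, so the mean is about 2.005, in line with the average quoted in the note.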








[The 5-tile code]
  • 0 = DDDDD    (extended strand, small blue spheres)
  • 1 = DDDDU
  • 2 = DDDUD
  • 3 = DDDUU
  • 8 = DUDDD
  • 9 = DUDDU
  • A = DUDUD    (helix, large red spheres)
  • B = DUDUU    (C-cap, large yellow spheres)
  • G = UDDDD
  • H = UDDDU
  • I = UDDUD
  • J = UDDUU
  • O = UUDDD
  • P = UUDDU
  • Q = UUDUD    (N-cap, large yellow spheres)
  • R = UUDUU    (turn, large yellow spheres)
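
To read the code strings in the table, each symbol can be expanded back into its U/D pattern with a simple lookup. The sketch below copies the mapping from the list above. It also checks an observation that holds for all sixteen listed symbols but is not stated on the page: reading D as 0 and U as 1, each symbol is the base-32 digit (0-9, then A-V) of its five-bit pattern. The names `TILE_PATTERNS`, `decode`, and `as_base32_digit` are made up for this example.

```python
# Symbol-to-pattern table, copied from the list above.
TILE_PATTERNS = {
    "0": "DDDDD", "1": "DDDDU", "2": "DDDUD", "3": "DDDUU",
    "8": "DUDDD", "9": "DUDDU", "A": "DUDUD", "B": "DUDUU",
    "G": "UDDDD", "H": "UDDDU", "I": "UDDUD", "J": "UDDUU",
    "O": "UUDDD", "P": "UUDDU", "Q": "UUDUD", "R": "UUDUU",
}


def decode(code: str) -> list[str]:
    """Expand a 5-tile code string into one U/D pattern per symbol."""
    return [TILE_PATTERNS[symbol] for symbol in code]


def as_base32_digit(pattern: str) -> str:
    """Hypothetical helper: read a pattern as 5 bits (D=0, U=1) and
    return the corresponding base-32 digit."""
    value = int(pattern.replace("D", "0").replace("U", "1"), 2)
    return "0123456789ABCDEFGHIJKLMNOPQRSTUV"[value]


# Every listed symbol agrees with the base-32 reading of its pattern.
assert all(as_base32_digit(p) == s for s, p in TILE_PATTERNS.items())

# Left code of the TGTN example from the table.
print(decode("AAB0"))
```

A symbol outside the sixteen listed ones raises a `KeyError`, which is a deliberate choice here so that typos in a code string are not silently accepted.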