Solenoid protein domain

Solenoid protein domains are a highly modular type of protein domain. They consist of a chain of nearly identical folds, often simply called tandem repeats. They are extremely common among all types of proteins, though exact figures are unknown.[1]

Common examples of protein domains with a solenoid architecture: the WD40 repeat domain of beta-TrCP (green), leucine-rich repeat domain of TLR2 (red), armadillo repeat domain of beta-catenin (blue), ankyrin repeat domain of ANKRA2 (orange), kelch repeat domain of Keap1 (yellow) and HEAT repeat domain of a PP2A regulatory subunit R1a (magenta).

"Repeats" in molecular biology

In proteins, a "repeat" is any sequence block that occurs more than once in the sequence, either in an identical or a highly similar form. Repetitiveness does not in itself indicate anything about the structure of the protein. As a rule of thumb, short repetitive sequences (e.g. those shorter than 10 amino acids) may be intrinsically disordered and not part of any folded protein domain. Repeats that are at least 30 to 40 amino acids long are far more likely to be folded as part of a domain. Such long repeats frequently indicate the presence of a solenoid domain in the protein.
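The rule of thumb above can be illustrated with a minimal Python sketch. It is not an established repeat-detection method: it only finds exact, back-to-back copies of a unit, whereas real repeats are often merely similar. The function names, the default thresholds (10 and 30 residues, taken from the figures in the text) and the "indeterminate" label for intermediate lengths are assumptions made for illustration, and the test sequence is a made-up toy string.

```python
def classify_repeat_length(unit_length: int,
                           short_cutoff: int = 10,
                           long_cutoff: int = 30) -> str:
    """Apply the length rule of thumb to a repeat unit (length in residues).

    The cutoffs are illustrative defaults; the text gives them only as
    rough guides, and exceptions exist in both directions.
    """
    if unit_length < short_cutoff:
        return "possibly disordered"
    if unit_length >= long_cutoff:
        return "likely folded"
    # The text makes no statement about intermediate lengths.
    return "indeterminate"


def find_exact_tandem_repeats(sequence: str, unit_length: int, min_copies: int = 2):
    """Return (start, unit, copies) for each run of an identical unit
    repeated back to back at least min_copies times."""
    hits = []
    i = 0
    n = len(sequence)
    while i + unit_length * min_copies <= n:
        unit = sequence[i:i + unit_length]
        copies = 1
        # Count how many further identical copies follow immediately.
        while sequence[i + copies * unit_length:
                       i + (copies + 1) * unit_length] == unit:
            copies += 1
        if copies >= min_copies:
            hits.append((i, unit, copies))
            i += copies * unit_length  # skip past the whole run
        else:
            i += 1
    return hits


# Toy sequence (hypothetical): three copies of a 7-residue unit.
toy = "AAYSPTSPSYSPTSPSYSPTSPSGG"
for start, unit, copies in find_exact_tandem_repeats(toy, unit_length=7):
    print(start, unit, copies, classify_repeat_length(len(unit)))
```

Because the classification depends on length alone, it should be read only as a first guess about whether a repeat is folded.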

Examples of disordered repetitive sequences include the 7-mer peptide repeats found in the RPB1 subunit of RNA polymerase II,[2] or the tandem beta-catenin or axin binding linear motifs in APC (adenomatous polyposis coli).[3] Examples of short repeats exhibiting ordered structures include the three-residue collagen repeat or the five-residue pentapeptide repeat that forms a beta helix structure.

Architecture of solenoid domains

Because their building blocks are nearly identical, solenoid domains can assume only a limited number of shapes. Two main topologies are possible: linear (or open, generally with some degree of helical curvature) and circular (or closed).[4]

Linear (open) solenoids

Linear (open) solenoid structure

If the two terminal repeats of a solenoid do not physically interact, the result is an open, or linear, structure. Members of this group are frequently rod- or crescent-shaped. The number of individual repeats can range from 2 to over 50. A clear advantage of this topology is that both the N- and C-terminal ends are free: new repeats and folds can be added, or existing ones removed, during evolution without any gross impact on the structural stability of the entire domain.[5] This type of domain is extremely common among extracellular segments of receptors or cell adhesion molecules. A non-exhaustive list of examples includes EGF repeats, cadherin repeats, leucine-rich repeats, HEAT repeats, ankyrin repeats, armadillo repeats and tetratricopeptide repeats. When a linear solenoid domain participates in protein-protein interactions, the ligand-binding site is frequently formed by three or more repeats. Thus, while individual repeats might have a (limited) ability to fold on their own, they usually cannot perform the functions of the entire domain alone.
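The open/closed distinction rests on whether the two terminal repeats are in physical contact. A rough sketch of that test in Python follows, assuming each repeat is reduced to a single centre point in three dimensions. The function name, the point representation and the 15 Å contact cutoff are all hypothetical choices for illustration, not values taken from the literature.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def classify_topology(repeat_centres: Sequence[Point],
                      contact_cutoff: float = 15.0) -> str:
    """Label a solenoid as 'open' or 'closed' from its repeat centres.

    repeat_centres: one (x, y, z) point per repeat, in sequence order.
    contact_cutoff: distance (same units as the points) below which the
        first and last repeats count as interacting. Illustrative value.
    """
    if len(repeat_centres) < 3:
        # With two repeats the terminal repeats are sequence neighbours,
        # so their distance says nothing about ring closure.
        raise ValueError("need at least three repeats")
    terminal_distance = math.dist(repeat_centres[0], repeat_centres[-1])
    return "closed" if terminal_distance <= contact_cutoff else "open"


# Hypothetical rod: ten repeats spaced 10 units apart along one axis.
rod = [(10.0 * k, 0.0, 0.0) for k in range(10)]

# Hypothetical ring: seven repeats evenly spaced on a circle of radius 12.
ring = [(12.0 * math.cos(2 * math.pi * k / 7),
         12.0 * math.sin(2 * math.pi * k / 7),
         0.0) for k in range(7)]

print(classify_topology(rod))
print(classify_topology(ring))
```

A real analysis would use inter-residue contacts from an experimental structure rather than a single distance between centre points.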

Circular (closed) solenoids

Circular (closed) solenoid domain

When the N- and C-terminal repeats of a solenoid domain lie in close physical contact, the result is a topologically compact, closed structure. Such domains typically display high rotational symmetry (unlike open solenoids, which have only translational symmetry) and assume a wheel-like shape. Because of the constraints of this structure, the number of individual repeats is not arbitrary. In the case of WD40 repeats (perhaps the largest family of closed solenoids), the number of repeats can range from 4 to 10 (more usually between 5 and 7).[6] Kelch repeats, beta-barrels and beta-trefoil repeats are further examples of this architecture. Closed solenoids frequently function as protein-protein interaction modules: if the ligand-binding site is located at the centre or axis of the domain "wheel", all repeats may need to be present to form it.
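The rotational symmetry of a closed solenoid can be made concrete with a small geometric sketch. It assumes an idealized wheel with perfect n-fold symmetry, in which each repeat is related to its neighbour by a rotation of 360/n degrees about the central axis; real domains deviate from this. The function names and the unit radius are illustrative assumptions.

```python
import math
from typing import List, Tuple


def repeat_rotation_degrees(n_repeats: int) -> float:
    """Rotation relating neighbouring repeats in an ideal n-fold wheel."""
    if n_repeats < 2:
        raise ValueError("a closed wheel needs at least two repeats")
    return 360.0 / n_repeats


def repeat_centres_on_wheel(n_repeats: int,
                            radius: float = 1.0) -> List[Tuple[float, float]]:
    """Place n repeat centres evenly on a circle around the symmetry axis."""
    step = 2.0 * math.pi / n_repeats
    return [(radius * math.cos(k * step), radius * math.sin(k * step))
            for k in range(n_repeats)]


# Repeat counts quoted in the text for WD40 domains: 4 to 10.
for n in range(4, 11):
    print(n, round(repeat_rotation_degrees(n), 1))
```

In this idealized picture the last repeat sits exactly as far from the first as any repeat does from its neighbour, which is what closes the ring. It also shows why the repeat count is constrained: for a fixed repeat size, only certain values of n give a spacing that lets neighbouring repeats pack against each other.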

Repetitive supradomain modules

The BRCT repeats of MDC1, bound to a ligand peptide from phosphorylated histone H2AX. Image based on PDB entry 2AZM.

As is common in biology, there are several borderline cases between solenoid architectures and regular protein domains. Proteins that contain tandem repeats of ordinary domains are very common in eukaryotes. Even if these domains are perfectly capable of folding on their own, some of them might bind together and assume a rigidly fixed orientation in the full protein. These supradomain modules can perform functions of which their individual constituents are incapable.[7] A famous example is the tandem BRCT domains found in the tumor suppressor protein BRCA1.[8] While individual BRCT domains that bind DNA are found in certain proteins (e.g. some DNA ligases), these tandem BRCT domains evolved a novel function: binding phosphorylated linear motifs.[9][10] In the case of BRCA1 (and MDC1), the peptide-binding groove lies in a cleft formed by the junction of the two domains. This elegantly explains why the individual constituents of this supradomain block are incapable of ligand binding, while their proper assembly endows them with a novel function. Therefore, tandem BRCT domains can also be regarded as a form of a single, linear solenoid domain.

References

  1. ^ Andrade MA, Perez-Iratxeta C, Ponting CP (2001). "Protein repeats: structures, functions, and evolution". J. Struct. Biol. 134 (2–3): 117–31. doi:10.1006/jsbi.2001.4392. PMID 11551174.
  2. ^ Meyer PA, Ye P, Zhang M, Suh MH, Fu J (June 2006). "Phasing RNA polymerase II using intrinsically bound Zn atoms: an updated structural model". Structure. 14 (6): 973–82. doi:10.1016/j.str.2006.04.003. PMID 16765890.
  3. ^ Liu J, Xing Y, Hinds TR, Zheng J, Xu W (June 2006). "The third 20 amino acid repeat is the tightest binding site of APC for beta-catenin". J. Mol. Biol. 360 (1): 133–44. doi:10.1016/j.jmb.2006.04.064. PMID 16753179.
  4. ^ Patthy, László (2007). Protein Evolution. Wiley-Blackwell. ISBN 978-1-4051-5166-5.
  5. ^ Kinch LN, Grishin NV (June 2002). "Evolution of protein structures and functions". Curr. Opin. Struct. Biol. 12 (3): 400–8. doi:10.1016/s0959-440x(02)00338-x. PMID 12127461.
  6. ^ Chen CK, Chan NL, Wang AH (October 2011). "The many blades of the β-propeller proteins: conserved but versatile". Trends Biochem. Sci. 36 (10): 553–61. doi:10.1016/j.tibs.2011.07.004. PMID 21924917.
  7. ^ Vogel C, Berzuini C, Bashton M, Gough J, Teichmann SA (February 2004). "Supra-domains: evolutionary units larger than single protein domains". J. Mol. Biol. 336 (3): 809–23. CiteSeerX 10.1.1.116.6568. doi:10.1016/j.jmb.2003.12.026. PMID 15095989.
  8. ^ Yu X, Chini CC, He M, Mer G, Chen J (October 2003). "The BRCT domain is a phospho-protein binding domain". Science. 302 (5645): 639–42. Bibcode:2003Sci...302..639Y. doi:10.1126/science.1088753. PMID 14576433. S2CID 29407635.
  9. ^ Sheng ZZ, Zhao YQ, Huang JF (2011). "Functional Evolution of BRCT Domains from Binding DNA to Protein". Evol. Bioinform. Online. 7: 87–97. doi:10.4137/EBO.S7084. PMC 3140412. PMID 21814458.
  10. ^ Leung CC, Glover JN (August 2011). "BRCT domains: easy as one, two, three". Cell Cycle. 10 (15): 2461–70. doi:10.4161/cc.10.15.16312. PMC 3180187. PMID 21734457.