hCONDELs are regions deleted from the human genome that contain sequences highly conserved among closely related species. Almost all of these deletions fall within non-coding regions. They represent a new class of regulatory sequences and may have played an important role in the development of the specific traits and behaviors that distinguish closely related organisms from each other.[1][2]
Nomenclature
The group of CONDELs belonging to a specific organism is named by prefixing "CONDELs" with the first letter of the organism's name. For instance, hCONDELs are the CONDELs found in humans, whereas mCONDELs and cCONDELs are mouse and chimpanzee CONDELs respectively.
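The naming rule can be written as a one-line function. The following Python sketch is purely illustrative; the helper name `condel_group_name` is hypothetical and does not come from any published tool.

```python
def condel_group_name(organism: str) -> str:
    """Return the CONDEL group label for an organism's common name.

    Hypothetical helper: prefixes "CONDELs" with the lowercase first
    letter of the organism name, following the convention described above.
    """
    name = organism.strip()
    if not name:
        raise ValueError("organism name must not be empty")
    return name[0].lower() + "CONDELs"


for organism in ("human", "mouse", "chimpanzee"):
    print(organism, "->", condel_group_name(organism))
```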
Identification of CONDELs
The term hCONDEL was first used in the 2011 Nature article by McLean et al.,[3] which described a whole-genome comparison analysis.[4] The analysis first identified a subset of 37,251 human deletions (hDELs)[5] through pairwise comparisons of the chimpanzee and macaque genomes.[6] Chimpanzee sequences highly conserved in other species were then identified by pairwise alignment of chimpanzee sequences with macaque, mouse and chicken sequences using BLASTZ,[7] followed by multiple alignment of the pairwise alignments using MULTIZ.[8] The highly conserved chimpanzee sequences were searched against the human genome using BLAT to identify conserved regions not present in humans. This search identified 583 deleted regions, which were then referred to as hCONDELs. Of these, 510 were validated computationally, and 39 of those were also validated by polymerase chain reaction (PCR).
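The final step of this procedure, intersecting human-specific deletions with conserved chimpanzee sequence, can be sketched with plain interval arithmetic. The Python example below is a toy model under stated assumptions, not the published pipeline: it does not run BLASTZ, MULTIZ or BLAT, it takes both interval sets as given, and it counts a conserved element as lost only when the deletion contains it entirely. The names `Interval`, `contains` and `find_condels` and all coordinates are invented for illustration.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


def contains(outer: Interval, inner: Interval) -> bool:
    """True if `inner` lies entirely within `outer` on the same chromosome."""
    return (outer.chrom == inner.chrom
            and outer.start <= inner.start
            and inner.end <= outer.end)


def find_condels(deletions: list[Interval],
                 conserved: list[Interval]) -> dict[Interval, list[Interval]]:
    """Map each deletion to the conserved elements it fully removes.

    Deletions that remove no conserved element are left out of the result.
    """
    condels: dict[Interval, list[Interval]] = {}
    for deletion in deletions:
        lost = [c for c in conserved if contains(deletion, c)]
        if lost:
            condels[deletion] = lost
    return condels


# Toy data in chimpanzee coordinates (invented for illustration).
hdels = [
    Interval("chr1", 1_000, 4_000),
    Interval("chr1", 10_000, 10_500),
    Interval("chr2", 500, 900),
]
conserved_elements = [
    Interval("chr1", 1_200, 1_290),  # inside the first deletion
    Interval("chr1", 3_950, 4_050),  # only partly overlaps the first deletion
    Interval("chr2", 2_000, 2_100),  # not deleted at all
]

result = find_condels(hdels, conserved_elements)
for deletion, lost in result.items():
    removed_bp = sum(c.end - c.start for c in lost)
    print(deletion, "removes", removed_bp, "bp of conserved sequence")
```

In this toy data only the first deletion qualifies, because it is the only one that fully contains a conserved element. A real analysis would also have to handle partial overlaps, alignment gaps and assembly errors, which is why the candidate regions were validated afterwards.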
Characteristics
hCONDELs cover approximately 0.14% of the chimpanzee genome. The genome-wide comparison method currently identifies 583 hCONDELs; however, computational validation of these predicted deletions leaves 510 hCONDELs. The remaining predicted regions are either false positives or non-existent genes. hCONDELs have been confirmed through PCR, and 88 percent of the confirmed deletions were shown to be missing from the draft Neanderthal genome as well.[9] hCONDELs, on average, remove about 95 base pairs (bp) of highly conserved sequence from the human genome. The median size of the 510 validated hCONDELs is about 2,804 bp, which shows that the characteristic deletions vary widely in length. Another noticeable characteristic of hCONDELs (and of other identified groups of CONDELs, such as those from mouse and chimpanzee) is that they tend to be skewed towards GC-poor regions.[10] Simulations show that hCONDELs are enriched near genes[11] involved in hormone receptor signaling and neural function, and near genes encoding fibronectin type III or CD80-like immunoglobulin C2-set domains.
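The relationship between the two counts is simple arithmetic. This short Python sketch uses only the figures quoted above; the variable names are illustrative.

```python
candidates = 583  # regions identified by the genome-wide comparison
validated = 510   # regions that passed validation

unvalidated = candidates - validated
validated_fraction = validated / candidates

print(f"unvalidated candidates: {unvalidated}")           # 73
print(f"validated fraction: {validated_fraction:.1%}")    # 87.5%
```

About one candidate in eight therefore failed validation.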
Impact in humans
Sialic acid loss
Of the 510 identified hCONDELs, only one has been shown to affect a protein-coding region:[12] it removes a 92 bp sequence from the human sequence. This deletion results in a frameshift mutation in the CMAH gene, which codes for the cytidine monophosphate-N-acetylneuraminic acid hydroxylase-like protein, an enzyme involved in the production of N-glycolylneuraminic acid, one type of sialic acid. Sialic acids are known to play a crucial part in cell signaling pathways and interaction processes. The loss of this gene is evident in humans, where N-glycolylneuraminic acid is undetectable although it is abundant in mouse, pig, chimpanzee and other mammalian tissues, and it may provide more insight into the history of human evolution.[13]
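A 92 bp deletion causes a frameshift because 92 is not a multiple of three, the length of one codon. The Python sketch below illustrates the rule under the simplifying assumption that the whole deletion lies within coding sequence; the function names and the toy sequence are invented for illustration and are not the CMAH sequence.

```python
def causes_frameshift(deleted_bp: int) -> bool:
    """A deletion inside coding sequence shifts the reading frame
    unless its length is a multiple of three (one codon = 3 bp)."""
    return deleted_bp % 3 != 0


def codons(seq: str) -> list[str]:
    """Split a sequence into complete codons, dropping any trailing bases."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


print(92 % 3)                 # 2 bases left over after whole codons
print(causes_frameshift(92))  # True
print(causes_frameshift(90))  # False: 30 whole codons, frame preserved

# Toy sequence: deleting 2 bp after the first codon changes every later codon.
toy = "ATGGCCAAATTTGGGCCC"
shifted = toy[:3] + toy[5:]
print(codons(toy))
print(codons(shifted))
```

In the toy example every codon downstream of the deletion is read differently, which is how a frameshift can inactivate the encoded protein.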
The mechanisms and timing of hCONDEL formation are not entirely understood. However, given that conserved non-coding sequences play a major developmental role through the regulation of genes,[1] their loss in hCONDELs is expected to have developmental consequences that can be observed in human-specific traits. In situ hybridization experiments by McLean et al.,[3] using mouse constructs in which the sequence was fused to a basal promoter driving LacZ expression,[14] were carried out for hCONDELs near the androgen receptor (AR) locus and the growth arrest and DNA-damage-inducible protein GADD45 gamma (GADD45G) locus. The results suggest that these deletions affect regulatory sequences in humans.
Loss of whiskers and penile spines
An hCONDEL located near the androgen receptor (AR) gene may be responsible for the loss of whiskers and penile spines in humans compared with their close relatives, including chimpanzees.[citation needed] This 60.7 kb hCONDEL removes a 5 kb sequence that contains an enhancer[15] for the AR locus. A mouse construct with LacZ expression showed that the activity of this hCONDEL region (the AR enhancer) is localized to the mesenchyme of vibrissae follicles and the mesoderm cells of penile organs.
Expansion of brain size
Many hCONDELs are located around genes expressed during cortical neurogenesis. A 3,181 bp hCONDEL located near the GADD45G gene removes a forebrain-specific enhancer containing a p300 binding site. The removal of this region, which is known to function as a suppressor, specifically increases proliferation in the subventricular zone (SVZ) of the septum. The loss of this SVZ enhancer region in an hCONDEL may provide further insight into the DNA sequence changes that contributed to the evolution of the human brain[16] and of humans more generally.
References
- ^ a b Woolfe, A.; Goodson, M.; Goode, D. K.; Snell, P.; McEwen, G. K.; Vavouri, T.; Smith, S. F.; North, P.; Callaway, H.; Kelly, K.; Walter, K.; Abnizova, I.; Gilks, W.; Edwards, Y. J. K.; Cooke, J. E.; Elgar, G. (2005). "Highly Conserved Non-Coding Sequences Are Associated with Vertebrate Development". PLOS Biology. 3 (1): e7. doi:10.1371/journal.pbio.0030007. PMC 526512. PMID 15630479.
- ^ Dermitzakis, E. T.; Reymond, A.; Scamuffa, N.; Ucla, C.; Kirkness, E.; Rossier, C.; Antonarakis, S. E. (2003). "Evolutionary Discrimination of Mammalian Conserved Non-Genic Sequences (CNGs)". Science. 302 (5647): 1033–1035. Bibcode:2003Sci...302.1033D. doi:10.1126/science.1087047. PMID 14526086. S2CID 35299360.
- ^ a b McLean, C. Y.; Reno, P. L.; Pollen, A. A.; Bassan, A. I.; Capellini, T. D.; Guenther, C.; Indjeian, V. B.; Lim, X.; Menke, D. B.; Schaar, B. T.; Wenger, A. M.; Bejerano, G.; Kingsley, D. M. (2011). "Human-specific loss of regulatory DNA and the evolution of human-specific traits". Nature. 471 (7337): 216–9. Bibcode:2011Natur.471..216M. doi:10.1038/nature09774. PMC 3071156. PMID 21390129.
- ^ Chen, R.; Bouck, J. B.; Weinstock, G. M.; Gibbs, R. A. (2001). "Comparing Vertebrate Whole-Genome Shotgun Reads to the Human Genome". Genome Research. 11 (11): 1807–1816. doi:10.1101/gr.203601. PMC 311156. PMID 11691844.
- ^ Harris, R. A.; Rogers, J.; Milosavljevic, A. (2007). "Human-Specific Changes of Genome Structure Detected by Genomic Triangulation". Science. 316 (5822): 235–237. Bibcode:2007Sci...316..235H. doi:10.1126/science.1139477. PMID 17431168.
- ^ Gibbs, R. A.; Gibbs, J.; Rogers, M. G.; Katze, R.; Bumgarner, G. M.; Weinstock, E. R.; Mardis, K. A.; Remington, R. L.; Strausberg, J. C.; Venter, R. K.; Wilson, M. A.; Batzer, C. D.; Bustamante, E. E.; Eichler, M. W.; Hahn, R. C.; Hardison, K. D.; Makova, W.; Miller, A.; Milosavljevic, R. E.; Palermo, A.; Siepel, J. M.; Sikela, T.; Attaway, S.; Bell, K. E.; Bernard, C. J.; Buhay, M. N.; Chandrabose, M.; Dao, C.; Davis, K. D.; et al. (2007). "Evolutionary and Biomedical Insights from the Rhesus Macaque Genome". Science. 316 (5822): 222–234. Bibcode:2007Sci...316..222.. doi:10.1126/science.1139247. PMID 17431167.
- ^ Schwartz, S.; Kent, W. J.; Smit, A.; Zhang, Z.; Baertsch, R.; Hardison, R. C.; Haussler, D.; Miller, W. (2003). "Human–Mouse Alignments with BLASTZ". Genome Research. 13 (1): 103–107. doi:10.1101/gr.809403. PMC 430961. PMID 12529312.
- ^ Blanchette, M.; Kent, W. J.; Riemer, C.; Elnitski, L.; Smit, A. F.; Roskin, K. M.; Baertsch, R.; Rosenbloom, K.; Clawson, H.; Green, E. D.; Haussler, D.; Miller, W. (2004). "Aligning Multiple Genomic Sequences with the Threaded Blockset Aligner". Genome Research. 14 (4): 708–715. doi:10.1101/gr.1933104. PMC 383317. PMID 15060014.
- ^ Green, R. E.; Krause, J.; Briggs, A. W.; Maricic, T.; Stenzel, U.; Kircher, M.; Patterson, N.; Li, H.; Zhai, W.; Fritz, M. H. Y.; Hansen, N. F.; Durand, E. Y.; Malaspinas, A. S.; Jensen, J. D.; Marques-Bonet, T.; Alkan, C.; Prüfer, K.; Meyer, M.; Burbano, H. A.; Good, J. M.; Schultz, R.; Aximu-Petri, A.; Butthof, A.; Höber, B.; Höffner, B.; Siegemund, M.; Weihmann, A.; Nusbaum, C.; Lander, E. S.; Russ, C. (2010). "A Draft Sequence of the Neandertal Genome". Science. 328 (5979): 710–722. Bibcode:2010Sci...328..710G. doi:10.1126/science.1188021. PMC 5100745. PMID 20448178.
- ^ Musto, H.; Cacciò, S.; Rodríguez-Maseda, H.; Bernardi, G. (1997). "Compositional constraints in the extremely GC-poor genome of Plasmodium falciparum". Memórias do Instituto Oswaldo Cruz. 92 (6): 835–841. doi:10.1590/S0074-02761997000600020. PMID 9566216.
- ^ Levy, S.; Hannenhalli, S.; Workman, C. (2001). "Enrichment of regulatory signals in conserved non-coding genomic sequence". Bioinformatics. 17 (10): 871–877. doi:10.1093/bioinformatics/17.10.871. PMID 11673231.
- ^ Suzuki, R.; Saitou, N. (2011). "Exploration for Functional Nucleotide Sequence Candidates within Coding Regions of Mammalian Genes". DNA Research. 18 (3): 177–187. doi:10.1093/dnares/dsr010. PMC 3111233. PMID 21586532.
- ^ Chou, H. -H.; Takematsu, H.; Diaz, S.; Iber, J.; Nickerson, E.; Wright, K. L.; Muchmore, E. A.; Nelson, D. L.; Warren, S. T.; Varki, A. (1998). "A mutation in human CMP-sialic acid hydroxylase occurred after the Homo-Pan divergence". Proceedings of the National Academy of Sciences. 95 (20): 11751–11756. Bibcode:1998PNAS...9511751C. doi:10.1073/pnas.95.20.11751. PMC 21712. PMID 9751737.
- ^ Poulin, F.; Nobrega, M. A.; Plajzer-Frick, I.; Holt, A.; Afzal, V.; Rubin, E. M.; Pennacchio, L. A. (2005). "In vivo characterization of a vertebrate ultraconserved enhancer" (PDF). Genomics. 85 (6): 774–781. doi:10.1016/j.ygeno.2005.03.003. PMID 15885503. S2CID 21888183.
- ^ Gotea, V.; Visel, A.; Westlund, J. M.; Nobrega, M. A.; Pennacchio, L. A.; Ovcharenko, I. (2010). "Homotypic clusters of transcription factor binding sites are a key component of human promoters and enhancers". Genome Research. 20 (5): 565–577. doi:10.1101/gr.104471.109. PMC 2860159. PMID 20363979.
- ^ Hill, R. S.; Walsh, C. A. (2005). "Molecular insights into human brain evolution". Nature. 437 (7055): 64–67. Bibcode:2005Natur.437...64H. doi:10.1038/nature04103. PMID 16136130. S2CID 4406401.