Family with sequence similarity 98 member C (FAM98C) is a gene that encodes the FAM98C protein. FAM98C has two aliases, FLJ44669 and hypothetical protein LOC147965.[5] It has two paralogs in humans, FAM98A and FAM98B. FAM98C can be characterized as a leucine-rich protein. The function of FAM98C is not yet defined. FAM98C has orthologs in mammals, reptiles, and amphibians; its most distant orthologs are found in the amphibians Rhinatrema bivittatum and Nanorana parkeri.

FAM98C
Identifiers
Aliases: FAM98C, family with sequence similarity 98 member C
External IDs: MGI: 1921083; HomoloGene: 45483; GeneCards: FAM98C; OMA: FAM98C - orthologs
Orthologs (values listed as human; mouse)
Ensembl: ENSG00000130244;[1] ENSMUSG00000030590[2]
RefSeq (mRNA): NM_174905, NM_001351675; NM_001146023, NM_028661
RefSeq (protein): NP_777565, NP_001338604; n/a
Location (UCSC): Chr 19: 38.4 – 38.41 Mb; Chr 7: 28.85 – 28.86 Mb
PubMed search: [3]; [4]

Gene

Locus

In humans, the FAM98C gene is located at 19q13.2 on the plus strand. It spans positions 38,403,135 to 38,409,088 on chromosome 19. The primary mRNA transcript of FAM98C is 5,954 base pairs in length.[5] Neighboring genes include RASGRP4 and RYR1.[5]
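The transcript length quoted above follows directly from the coordinates, because an interval given as inclusive positions spans end − start + 1 bases. A minimal check, with illustrative variable names:

```python
# Genomic coordinates of FAM98C on human chromosome 19, as stated in the text.
start_bp = 38_403_135
end_bp = 38_409_088

# Both endpoints are part of the gene, so the span is end - start + 1.
span_bp = end_bp - start_bp + 1
print(span_bp)  # 5954
```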

Transcripts

FAM98C has two known transcript variants.[6] The first variant encodes the longest isoform, which is 349 amino acids long.[7] The second variant encodes a shorter isoform of 267 amino acids.[8] The gene is composed of eight exons.[7]

Proteins

The FAM98C protein is 349 amino acids in length, with a predicted molecular weight of 37.3 kDa and a predicted isoelectric point of 6.89.[9] The composition of the FAM98C protein is notable for its abundance of leucine (16%) and its lysine-rich C-terminus. FAM98C contains a high-scoring positive segment with 6 consecutive lysine residues.[9]
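Compositional features of this kind (the share of one residue and the longest run of another) can be computed from any one-letter protein sequence. The following is a minimal sketch, not the statistical method of the cited analysis; the function names are illustrative and `toy_seq` is a made-up placeholder, not the FAM98C sequence.

```python
from collections import Counter
import re


def residue_percent(seq: str, residue: str) -> float:
    """Percentage of positions in seq occupied by the given one-letter residue."""
    return 100.0 * Counter(seq)[residue] / len(seq)


def longest_run(seq: str, residue: str) -> int:
    """Length of the longest uninterrupted run of the residue in seq."""
    runs = re.findall(f"{re.escape(residue)}+", seq)
    return max((len(run) for run in runs), default=0)


# Made-up 20-residue placeholder (an assumption for illustration only).
toy_seq = "MLLALELGLLSAKKKKKKRE"

print(f"Leu: {residue_percent(toy_seq, 'L'):.1f}%")
print(f"Longest Lys run: {longest_run(toy_seq, 'K')}")
```

Running the same two functions on the isoform 1 sequence retrieved from NCBI would be one way to reproduce the leucine fraction and the lysine run reported above.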

Domains and motifs

FAM98C contains a domain of unknown function, DUF2465, spanning amino acids 18–334.[10] This domain is unique to the FAM98 family and is conserved in all orthologs.[11] DUF2465 is poorly characterized, but it is proposed to bind RNA. Among the paralogs, the domain in FAM98A binds mRNA, while FAM98B is involved in tRNA splicing.[12]

Structure

The secondary structure of FAM98C is predicted to be composed of approximately 46% alpha helix, 46% random coil and 7% extended strand.[13][14] However, no beta strands were found in any of the predicted secondary structures.[15] The tertiary structure of FAM98C is predicted to have 10 alpha helices by the I-TASSER software.[16][17]

Gene level regulation

Promoter

The FAM98C promoter region (GXP_7536558) is 1,254 base pairs in length. Both the E2F-myc activator/cell cycle regulator and the Krueppel-like transcription factor families had nineteen predicted binding sites on the promoter.[18]

Expression pattern

A GEO profile of multiple normal tissues revealed that FAM98C is ubiquitously, though not uniformly, expressed.[19][20] The highest expression levels are in the jejunum, liver, and kidney.[19]

Sub-cellular localization

The subcellular localization of FAM98C was predicted using the PSORT II tool.[21] FAM98C is predicted to be localized in the nucleus (60.9%), followed by the mitochondria (21.7%) and then the cytoplasm (17.4%).
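PSORT II bases its final call on a k-nearest-neighbor classifier, so the reported percentages can be read as shares of neighbor votes. The figures above are consistent with a vote among 23 neighbors. Both the value k = 23 and the vote counts derived below are assumptions inferred from the percentages, not values stated in the source.

```python
# Assumed neighborhood size; inferred from the percentages, not from the source.
K_NEIGHBORS = 23

# Localization percentages as reported in the text.
reported = {"nucleus": 60.9, "mitochondria": 21.7, "cytoplasm": 17.4}


def implied_votes(percent: float, k: int = K_NEIGHBORS) -> int:
    """Whole number of neighbor votes closest to the given percentage of k."""
    return round(percent * k / 100)


votes = {compartment: implied_votes(p) for compartment, p in reported.items()}

# Consistency check: the votes must use up all k neighbors and round back
# to the reported percentages.
consistent = sum(votes.values()) == K_NEIGHBORS and all(
    round(100 * n / K_NEIGHBORS, 1) == reported[c] for c, n in votes.items()
)
print(votes, consistent)
```

If the check failed for a given k, that value of k could not have produced the reported percentages; it says nothing about the biological reliability of the prediction itself.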

Protein level regulation

Post-translational modifications

Phosphorylation

FAM98C has three predicted phosphorylation sites, at amino acid positions 225, 239, and 300, that are conserved in distant orthologs.[22] The site at position 225 is a predicted tyrosine kinase site; phosphorylation at such a site can function as an "on" and "off" switch. Position 239 is a predicted calmodulin-dependent protein kinase site.[23]

 
Schematic illustration of FAM98C showing DUF2465 and the three predicted phosphorylation sites. FAM98C is mostly composed of the domain of unknown function 2465. The lysine-rich C-terminus is also shown.

SUMOylation

SUMOylation is a post-translational modification that regulates many proteins. The GPS-SUMO tool from the CUCKOO workgroup predicted SUMOylation sites at positions 347, 348, and 349.[24] These residues are conserved even in the most distant FAM98C orthologs.
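The positions reported in this article can be placed on a simple feature map to see which predicted sites fall inside DUF2465. This is a minimal sketch using only the coordinates given in the text; the names `DUF2465`, `predicted_sites`, and `in_domain` are illustrative.

```python
PROTEIN_LENGTH = 349   # isoform 1 length in amino acids
DUF2465 = (18, 334)    # inclusive domain boundaries from the text

# Predicted modification sites, by position, as reported in the text.
predicted_sites = {
    "phosphorylation": [225, 239, 300],
    "SUMOylation": [347, 348, 349],
}


def in_domain(position: int, domain: tuple = DUF2465) -> bool:
    """True if the residue position lies within the inclusive domain range."""
    return domain[0] <= position <= domain[1]


# Share of the protein covered by the domain (inclusive interval length).
domain_length = DUF2465[1] - DUF2465[0] + 1
coverage_percent = 100 * domain_length / PROTEIN_LENGTH
print(f"DUF2465 covers {domain_length} residues ({coverage_percent:.1f}%)")

for kind, positions in predicted_sites.items():
    for pos in positions:
        where = "inside" if in_domain(pos) else "outside"
        print(f"{kind} site {pos}: {where} DUF2465")
```

On these coordinates, the three predicted phosphorylation sites lie within the domain, while the predicted SUMOylation sites sit in the last three residues of the C-terminus, beyond the domain boundary.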

Homology

Paralogs

FAM98C has only two paralogs, FAM98A and FAM98B.[5]

Orthologs

Orthologs of FAM98C have been found in mammals, reptiles, and amphibians. The most distant orthologs are in amphibians, which diverged from humans an estimated 351.8 million years ago (MYA). FAM98C is present only in Metazoa and is not found in protozoa. The table below lists a selection of orthologs of human FAM98C, ordered by increasing date of divergence.[25]

Sequence Number | Genus species | Common Name | Taxonomic Group | Date of Divergence (MYA) | Accession Number | Sequence Length (aa) | Sequence Identity | Sequence Similarity
1 | Homo sapiens | Human | Primates | 0 | NP_777565.3 | 349 | 100% | 100%
2 | Pan troglodytes | Chimpanzee | Primates | 6.7 | XP_524252.3 | 350 | 99% | 99%
3 | Microcebus murinus | Gray mouse lemur | Primates | 73.8 | XP_012630183.1 | 353 | 84% | 88%
4 | Octodon degus | Common degu | Rodentia | 90 | XP_023577316.1 | 352 | 78% | 84%
5 | Ochotona princeps | American pika | Lagomorpha | 90 | XP_004595135.1 | 353 | 77% | 83%
6 | Mus musculus | Mouse | Rodentia | 90 | NP_001139495.1 | 344 | 74% | 79%
7 | Rattus norvegicus | Brown rat | Rodentia | 90 | NP_001185513.1 | 344 | 73% | 80%
8 | Bos taurus | Cattle | Artiodactyla | 96 | XP_002695017.1 | 353 | 81% | 85%
9 | Canis lupus familiaris | Dog | Carnivora | 96 | XP_541643.2 | 353 | 80% | 83%
10 | Leptonychotes weddellii | Weddell seal | Carnivora | 96 | XP_006739473.1 | 345 | 79% | 84%
11 | Monodon monoceros | Narwhal | Artiodactyla | 96 | XP_029092965.1 | 352 | 78% | 84%
12 | Desmodus rotundus | Common vampire bat | Chiroptera | 96 | XP_024433437.1 | 355 | 77% | 83%
13 | Chrysochloris asiatica | Cape golden mole | Afrosoricida | 105 | XP_006871606.1 | 348 | 75% | 81%
14 | Vombatus ursinus | Common wombat | Diprotodontia | 159 | XP_027711296.1 | 358 | 64% | 73%
15 | Phascolarctos cinereus | Koala | Diprotodontia | 159 | XP_020834255.1 | 358 | 64% | 73%
16 | Ornithorhynchus anatinus | Platypus | Monotremata | 177 | XP_028920793.1 | 338 | 57% | 65%
17 | Chelonoidis abingdonii | Pinta Island tortoise | Testudines | 312 | XP_032660367.1 | 329 | 44% | 57%
18 | Podarcis muralis | Common wall lizard | Squamata | 312 | XP_028597878.1 | 330 | 43% | 55%
19 | Python bivittatus | Burmese python | Squamata | 312 | XP_015745259.1 | 318 | 42% | 58%
20 | Nanorana parkeri | High Himalaya frog | Anura | 351.8 | XP_018411523.1 | 351 | 38% | 55%
21 | Rhinatrema bivittatum | Two-lined caecilian | Gymnophiona | 351.8 | XP_029475031.1 | 338 | 38% | 51%
 
Rate of Divergence of FAM98C compared to fibrinogen and cytochrome c.

Rate of Evolution

FAM98C is rapidly evolving with a rate of divergence faster than both cytochrome C, a slowly evolving gene, and fibrinogen, a rapidly evolving gene.
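The source does not state how the rate of divergence was calculated. One common approach, shown here only as an assumption, converts percent identity into a Poisson-corrected distance d = ln(1/p), where p is the fractional identity, and then compares d against divergence time. The identity and date values below are taken from the ortholog table; the function name is illustrative.

```python
import math

# (species, date of divergence in MYA, percent identity to human FAM98C),
# a subset of rows from the ortholog table.
orthologs = [
    ("Homo sapiens", 0.0, 100),
    ("Pan troglodytes", 6.7, 99),
    ("Mus musculus", 90.0, 74),
    ("Ornithorhynchus anatinus", 177.0, 57),
    ("Python bivittatus", 312.0, 42),
    ("Nanorana parkeri", 351.8, 38),
]


def poisson_distance(identity_percent: float) -> float:
    """Poisson-corrected distance: expected substitutions per site.

    Assumes substitutions are independent across sites, which lets the
    observed identity p be converted to a distance of ln(1/p).
    """
    return math.log(100.0 / identity_percent)


for species, mya, identity in orthologs:
    d = poisson_distance(identity)
    print(f"{species:26s} {mya:6.1f} MYA  d = {d:.3f}")
```

Plotting d against divergence date for FAM98C alongside the same calculation for cytochrome c and fibrinogen orthologs would give a comparison of the kind described above; the steeper the slope, the faster the protein is diverging.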

Interacting proteins

FAM98C has been predicted to interact with DR1, LRRCC1, FAM83F, TMEM256, PRDM16, and SPRED1.[26][27] LRRCC1 and TMEM256 were both reported alongside FAM98C as potentially novel genes associated with ciliopathies.[28]

Clinical significance

In a bioinformatics study, FAM98C and nine other novel genes were identified as being associated with the prognosis of cholangiocarcinoma.[29]

References

  1. ^ a b c GRCh38: Ensembl release 89: ENSG00000130244. Ensembl, May 2017
  2. ^ a b c GRCm38: Ensembl release 89: ENSMUSG00000030590. Ensembl, May 2017
  3. ^ "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. ^ "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. ^ a b c d "FAM98C Gene - GeneCards | FA98C Protein | FA98C Antibody". www.genecards.org. Retrieved 2020-12-19.
  6. ^ "FAM98C family with sequence similarity 98 member C [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2020-12-15.
  7. ^ a b "protein FAM98C isoform 1 [Homo sapiens] - Protein - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2020-12-19.
  8. ^ "protein FAM98C isoform 2 [Homo sapiens] - Protein - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2020-12-19.
  9. ^ a b Brendel V, Bucher P, Nourbakhsh IR, Blaisdell BE, Karlin S (March 1992). "Methods and algorithms for statistical analysis of protein sequences". Proceedings of the National Academy of Sciences of the United States of America. 89 (6): 2002–6. Bibcode:1992PNAS...89.2002B. doi:10.1073/pnas.89.6.2002. PMC 48584. PMID 1549558.
  10. ^ "HomoloGene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2020-12-19.
  11. ^ "CDD Conserved Protein Domain Family: DUF2465". www.ncbi.nlm.nih.gov. Retrieved 2020-12-19.
  12. ^ Dürnberger G, Bürckstümmer T, Huber K, Giambruno R, Doerks T, Karayel E, et al. (July 2013). "Experimental characterization of the human non-sequence-specific nucleic acid interactome". Genome Biology. 14 (7): R81. doi:10.1186/gb-2013-14-7-r81. PMC 4053969. PMID 23902751.
  13. ^ Prof. T. Ashok Kumar. "CFSSP: Chou & Fasman Secondary Structure Prediction Server". www.biogem.org. Retrieved 2020-12-16.
  14. ^ "NPS@ : GOR4 secondary structure prediction". npsa-prabi.ibcp.fr. Retrieved 2020-12-16.
  15. ^ "Bioinformatics Toolkit". toolkit.tuebingen.mpg.de. Retrieved 2020-12-16.
  16. ^ "I-TASSER server for protein structure and function prediction". zhanglab.ccmb.med.umich.edu. Retrieved 2020-12-19.
  17. ^ Roy A, Kucukural A, Zhang Y (April 2010). "I-TASSER: a unified platform for automated protein structure and function prediction". Nature Protocols. 5 (4): 725–38. doi:10.1038/nprot.2010.5. PMC 2849174. PMID 20360767.
  18. ^ "ElDorado: Annotation & Analysis". www.genomatix.de. Archived from the original on 2018-05-07. Retrieved 2020-12-16.
  19. ^ a b "GEO DataSet Browser". www.ncbi.nlm.nih.gov. Retrieved 2020-12-19.
  20. ^ "GEO Accession viewer". www.ncbi.nlm.nih.gov. Retrieved 2020-12-19.
  21. ^ "PSORT II Prediction". psort.hgc.jp. Retrieved 2020-12-16.
  22. ^ "GPS 5.0 - Kinase-specific Phosphorylation Site Prediction". gps.biocuckoo.cn. Retrieved 2020-12-16.
  23. ^ "GPS 5.0 - Kinase-specific Phosphorylation Site Prediction". gps.biocuckoo.cn. Retrieved 2020-12-19.
  24. ^ "GPS-SUMO: Prediction of SUMOylation Sites & SUMO-interaction Motifs". sumosp.biocuckoo.org. Archived from the original on 2013-05-10. Retrieved 2020-12-16.
  25. ^ "TimeTree :: The Timescale of Life". www.timetree.org. Retrieved 2020-12-19.
  26. ^ "FAM98C protein (human) - STRING interaction network". string-db.org. Retrieved 2020-12-19.
  27. ^ "PSICQUIC View". www.ebi.ac.uk. Retrieved 2020-12-19.
  28. ^ Shaheen R, Szymanska K, Basu B, Patel N, Ewida N, Faqeih E, et al. (November 2016). "Characterizing the morbid genome of ciliopathies". Genome Biology. 17 (1): 242. doi:10.1186/s13059-016-1099-5. PMC 5126998. PMID 27894351.
  29. ^ Da Z, Gao L, Su G, Yao J, Fu W, Zhang J, et al. (2020-04-22). "Bioinformatics combined with quantitative proteomics analyses and identification of potential biomarkers in cholangiocarcinoma". Cancer Cell International. 20 (1): 130. doi:10.1186/s12935-020-01212-z. PMC 7178764. PMID 32336950.