C1orf185

Identifiers
Aliases	C1orf185, chromosome 1 open reading frame 185
External IDs	MGI: 1914896; HomoloGene: 49856; GeneCards: C1orf185
Gene location (Human)
Chr.	Chromosome 1 (human)
End	51,148,086 bp
Gene location (Mouse)
Chr.	Chromosome 4 (mouse)
End	109,388,494 bp
RNA expression pattern
Top expressed in (Human)	stromal cell of endometrium; gastric mucosa; gallbladder; monocyte; blood; prefrontal cortex; right coronary artery; canal of the cervix; Achilles tendon; adipose tissue
Top expressed in (Mouse)	spermatid; seminiferous tubule; spermatocyte; secondary oocyte; stria vascularis; extraocular muscle; submandibular gland; pancreas; vastus lateralis muscle; islet of Langerhans
Orthologs
Species	Human	Mouse
Entrez	284546	67646
Ensembl	ENSG00000204006	ENSMUSG00000060491
UniProt	Q5T7R7	Q9CPZ3
RefSeq (mRNA)	NM_001136508	NM_001199090; NM_026291
RefSeq (protein)	NP_001129980	NP_001186019; NP_080567

Chromosome 1 open reading frame 185, also known as C1orf185, is a protein that in humans is encoded by the C1orf185 gene. In humans, C1orf185 is expressed at low levels and has occasionally been detected in the circulatory system.^[5]^[6]

Gene

C1orf185 is located on the positive strand of human chromosome 1, between bases 51,102,221 and 51,148,086.^[7] The main splice isoform contains 5 exons; the number and selection of exons vary among the other isoforms.^[7]
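As a quick arithmetic check on the coordinates above, the following minimal sketch computes the genomic span of the locus. It assumes the two positions are 1-based and inclusive, which the source does not state; the function and variable names are illustrative only.

```python
# Coordinates of C1orf185 on human chromosome 1, as quoted in the text.
# Assumption: positions are 1-based and inclusive.
GENE_START = 51_102_221
GENE_END = 51_148_086


def span_length(start: int, end: int) -> int:
    """Return the number of bases covered by an inclusive interval."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


locus_bp = span_length(GENE_START, GENE_END)
print(f"C1orf185 locus spans {locus_bp:,} bp (~{locus_bp / 1000:.1f} kb)")
```

Under that assumption the locus covers about 45.9 kb.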

C1orf185 locus within the human genome. Diagrams from NCBI Genome Viewer^[8] (top) and the Integrative Genomics Viewer^[9] (bottom).

mRNA and Protein Isoforms

C1orf185 has 5 different splice isoforms in humans.^[7]

C1orf185 Transcripts
Isoform	mRNA Accession	Protein Accession	Transcript Length (bp)	Protein Length (AA)
uncharacterized protein C1orf185	NM_001136508.2	NP_001129980.1	921	199
uncharacterized protein C1orf185 isoform X1	XM_011541282.2	XP_011539584.1	787	195
uncharacterized protein C1orf185 isoform X2	XM_024446525.1	XP_024302293.1	586	116
uncharacterized protein C1orf185 isoform X3	XM_024446528.1	XP_024302296.1	420	116
uncharacterized protein C1orf185 isoform X4	XM_024446529.1	XP_024302297.1	367	107
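The transcript and protein lengths in the table can be cross-checked with simple arithmetic. The sketch below assumes that each coding sequence consists of one codon per amino acid plus a single stop codon, and treats whatever remains of the transcript as untranslated sequence; it is an illustrative consistency check, not part of the cited annotation.

```python
# Transcript length (bp) and protein length (aa) copied from the isoform table.
ISOFORMS = {
    "NM_001136508.2": (921, 199),
    "XM_011541282.2": (787, 195),
    "XM_024446525.1": (586, 116),
    "XM_024446528.1": (420, 116),
    "XM_024446529.1": (367, 107),
}


def cds_length(protein_aa: int) -> int:
    """Nucleotides needed to encode the protein plus one stop codon (assumed)."""
    return 3 * (protein_aa + 1)


def coding_fraction(transcript_bp: int, protein_aa: int) -> float:
    """Share of the transcript occupied by the coding sequence."""
    return cds_length(protein_aa) / transcript_bp


for accession, (transcript_bp, protein_aa) in ISOFORMS.items():
    cds = cds_length(protein_aa)
    utr = transcript_bp - cds  # combined 5' and 3' untranslated length
    fraction = coding_fraction(transcript_bp, protein_aa)
    print(f"{accession}: CDS {cds} nt, UTRs {utr} nt, coding fraction {fraction:.0%}")
```

Every listed transcript is long enough to encode its protein under this assumption; for the main isoform, 600 of the 921 nucleotides would be coding.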

Protein

C1orf185 is a member of the pfam15842 protein family, which contains a domain of unknown function, DUF4718.^[10] Proteins in this family are between 130 and 224 amino acids long and are found only in eukaryotes.

The main splice isoform of C1orf185 has a molecular weight of 22.4 kDa^[11] and an isoelectric point of 7.67.^[12] It contains a transmembrane domain spanning positions 15 to 37.^[7] There is also a conserved serine-rich region from S123 to S142, which could indicate a function as a "splicing activator", as described for the serine-rich domains of SR proteins.^[13]

C1orf185 is predicted to have 3 topological regions: an extracellular domain spanning positions 1–14, a transmembrane domain spanning positions 15–37, and a large intracellular domain spanning positions 38–199.^[14]
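The sizes of the three regions follow directly from the quoted positions. The sketch below tallies them and also gives a rough mass estimate; the average residue mass of 110 Da is a general rule of thumb introduced here as an assumption, not a value from the cited tools, so the result is only an approximate comparison with the reported 22.4 kDa.

```python
# Topology of the 199-residue main isoform, as described in the text
# (1-based, inclusive residue positions).
TOPOLOGY = {
    "extracellular": (1, 14),
    "transmembrane": (15, 37),
    "intracellular": (38, 199),
}
PROTEIN_LENGTH = 199

# Assumption: rule-of-thumb average residue mass, not a sequence-based value.
AVERAGE_RESIDUE_MASS_DA = 110

domain_lengths = {name: end - start + 1 for name, (start, end) in TOPOLOGY.items()}

# The three segments should tile the protein with no gaps or overlaps.
assert sum(domain_lengths.values()) == PROTEIN_LENGTH

for name, length in domain_lengths.items():
    print(f"{name}: {length} residues ({length / PROTEIN_LENGTH:.0%} of the protein)")

approx_mass_kda = PROTEIN_LENGTH * AVERAGE_RESIDUE_MASS_DA / 1000
print(f"Rule-of-thumb mass: {approx_mass_kda:.1f} kDa")
```

The intracellular region accounts for 162 of the 199 residues, which is why the text calls it large.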

Below are predicted secondary and tertiary structures of C1orf185, modeled using the Chou-Fasman^[15] secondary structure prediction tool and the I-TASSER^[16] protein structure and function prediction tool. Chou-Fasman predicts a mixture of α-helices, β-sheets, and other structural turns and coils, which can be seen modeled on the I-TASSER prediction.

Chou-Fasman Secondary Structure Prediction^[15] (left) and I-TASSER Tertiary Structure Prediction^[16] (right) for C1orf185.

Regulation of Expression

Gene Level Regulation

Below is a diagram showing the locations of predicted transcription factor binding sites in the C1orf185 promoter, along with a table describing the attributes of each individual binding site. The transcription factors were found and analyzed using the ElDorado tool from Genomatix.^[17]

Diagram of the C1orf185 promoter with transcription factor binding sites annotated.

Transcription Factor Binding Sites within the C1orf185 Promoter
Transcription Factor	Detailed matrix info	Matrix similarity	Sequence	+/-
VTATA.02	Mammalian C-type LTR TATA box	0.91	tgtcaTAAAaacattcc	+
NKX25.05	Homeodomain factor Nkx-2.5/Csx	0.986	tttttTGAGtgaagtcttg	-
CDX1.01	Intestine specific homeodomain factor CDX-1	0.988	ttgccctTTTAtgaaaaaa	+
VTATA.02	Mammalian C-type LTR TATA box	0.914	tacttTAAAaataagca	-
ERG.02	v-ets erythroblastosis virus E26 oncogene homolog	0.942	gtctcaaaGGAAaataaaaag	-
SPI1.02	SPI-1 proto-oncogene; hematopoietic transcription factor PU.1	0.992	attaaagaGGAAgtctcaaag	-
FHXB.01	Fork head homologous X binds DNA with a dual sequence specificity (FHXA and FHXB)	0.831	ttctaaATAAcacattt	-
TGIF.01	TG-interacting factor belonging to TALE class of homeodomain factors	1	tctataaatGTCAatta	+
ZNF219.01	Kruppel-like zinc finger protein 219	0.913	ctccaCCCCcgtcagcccaaagg	+
ZBP89.01	Zinc finger transcription factor ZBP-89	0.956	catctccaCCCCcgtcagcccaa	+
CREB.02	cAMP-responsive element binding protein	0.922	cctttgggcTGACgggggtgg	-
FOXP1_ES.01	Alternative splicing variant of FOXP1, activated in ESCs	1	tcataaaAACAttccag	-
VTATA.02	Mammalian C-type LTR TATA box	0.895	tgtcaTAAAaacattcc	-
CREB1.02	cAMP-responsive element binding protein 1	0.949	tggaaGTGAtgtcataaaaac	-
SPI1.02	SPI-1 proto-oncogene; hematopoietic transcription factor PU.1	0.979	atttgagtGGAAgtgatgtca	-
NKX25.05	Homeodomain factor Nkx-2.5/Csx	0.994	gaattTGAGtggaagtgat	-
MESP1_2.01	Mesoderm posterior 1 and 2	0.917	cagtCATAtggct	+
MESP1_2.01	Mesoderm posterior 1 and 2	0.929	aagcCATAtgact	-
DELTAEF1.01	deltaEF1	0.99	gcttcACCTaaag	+
ERG.02	v-ets erythroblastosis virus E26 oncogene homolog	0.93	gaagaagaGGAAaatatattt	+

Matrix similarity reflects the confidence of the prediction for each individual binding site. The +/- column indicates whether the site lies on the positive or negative strand. The transcription factors are listed in order of appearance from the beginning to the end of the promoter.
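A table of this kind can be filtered and summarised programmatically. The sketch below uses a handful of rows copied from the table; it assumes that the capitalised bases in each sequence mark the core of the matrix match, and the 0.95 similarity cut-off is a hypothetical threshold chosen for illustration rather than one used by the source.

```python
import re
from collections import Counter

# A few rows copied from the table: (matrix, similarity, sequence, strand).
SITES = [
    ("VTATA.02", 0.91, "tgtcaTAAAaacattcc", "+"),
    ("NKX25.05", 0.986, "tttttTGAGtgaagtcttg", "-"),
    ("SPI1.02", 0.992, "attaaagaGGAAgtctcaaag", "-"),
    ("TGIF.01", 1.0, "tctataaatGTCAatta", "+"),
    ("FOXP1_ES.01", 1.0, "tcataaaAACAttccag", "-"),
    ("DELTAEF1.01", 0.99, "gcttcACCTaaag", "+"),
]


def core_motif(sequence: str) -> str:
    """Return the single run of uppercase bases (assumed to be the matrix core)."""
    runs = re.findall(r"[ACGT]+", sequence)
    if len(runs) != 1:
        raise ValueError(f"expected exactly one uppercase run in {sequence!r}")
    return runs[0]


def high_confidence(sites, threshold=0.95):
    """Keep sites whose matrix similarity meets the (hypothetical) threshold."""
    return [site for site in sites if site[1] >= threshold]


strand_counts = Counter(site[3] for site in SITES)

for matrix, similarity, sequence, strand in high_confidence(SITES):
    print(f"{matrix}\t{similarity}\t{core_motif(sequence)}\t{strand}")
print(dict(strand_counts))
```

Raising or lowering the threshold changes which predicted sites are retained, so any cut-off should be reported alongside the results.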

C1orf185 is expressed at very low levels, and the circulatory system is the only site in the body where expression has been reported. Two NCBI GEO profiles show that C1orf185 was consistently overexpressed in whole blood samples from a group of postmenopausal women,^[18] and somewhat overexpressed in the peripheral blood of Parkinson's patients compared to controls.^[19]

Transcript Level Regulation

Below is a figure produced by mfold^[20] showing predicted mRNA structure of the 3' UTR of C1orf185.

Possible mRNA secondary structure of the C1orf185 3' UTR predicted by mfold.^[20] There are 3 main branches, each ending in 1–2 stem loops. The stem loop near the end of the sequence contains the poly-A signal, which directs cleavage and polyadenylation at the 3' end of the transcript.

C1orf185 has one miRNA binding site of type 7mer-A1 that is conserved among several orthologs.^[21] The presence of a 7mer-A1 binding site suggests that C1orf185 may be post-transcriptionally repressed.^[22]
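A 7mer-A1 site is a match to the miRNA seed (nucleotides 2–7) followed by an adenine opposite miRNA position 1. The sketch below shows how such a site can be derived and located. Because the source does not name the miRNA involved, it uses hsa-let-7a-5p and a made-up UTR fragment purely as illustrations; neither is claimed to relate to C1orf185, and the function names are hypothetical.

```python
# Watson-Crick partners for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def seven_mer_a1_site(mirna: str) -> str:
    """Target-site sequence (5'->3' on the mRNA) for a 7mer-A1 match.

    The site pairs with miRNA nucleotides 2-7 (the seed) and carries an A
    opposite miRNA position 1.
    """
    seed = mirna[1:7]  # nucleotides 2-7 as a 0-based slice
    return reverse_complement(seed) + "A"


def find_sites(utr: str, mirna: str) -> list:
    """Return 0-based start positions of the 7-nt site in a UTR.

    Simplification: this does not check miRNA position 8, so an 8mer site
    would also be reported here.
    """
    site = seven_mer_a1_site(mirna)
    width = len(site)
    return [i for i in range(len(utr) - width + 1) if utr[i:i + width] == site]


# Illustration only: hsa-let-7a-5p and an invented UTR fragment.
LET_7A = "UGAGGUAGUAGGUUGUAUAGUU"
TOY_UTR = "GCAUUACCUCAGGAU"

print(seven_mer_a1_site(LET_7A))
print(find_sites(TOY_UTR, LET_7A))
```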

Possible conserved C1orf185 miRNA binding site details found using TargetScan.^[21]

Protein Level Regulation

Below is a figure and table showing predicted post-translational modification sites for C1orf185.

Table of Post-Translational Modifications for C1orf185
Type of Modification	Tool	Positions in Homo sapiens
Phosphorylation	NetPhos^[23]	S61, S69, S104, S130, S142, S147, S165, S186
Glycation	NetGlycate,^[24] NetNGlyc^[25]	K5, K50, K98, K113
O-GlcNAc	YinOYang^[26]	T121, S122, S130

The presence of multiple predicted lysine glycation sites suggests a possible means of impairing the function of the protein, as glycation has been associated with the loss of protein function in blood vessels.^[27]
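The predicted modification sites can be placed on the topology described in the Protein section. The sketch below is an illustrative mapping that uses only positions quoted in this article; the helper names are hypothetical.

```python
# Predicted modification sites copied from the table (residue letter + position).
PTM_SITES = {
    "phosphorylation": ["S61", "S69", "S104", "S130", "S142", "S147", "S165", "S186"],
    "glycation": ["K5", "K50", "K98", "K113"],
    "O-GlcNAc": ["T121", "S122", "S130"],
}

# Topological segments of the main isoform (1-based, inclusive).
TOPOLOGY = [
    ("extracellular", 1, 14),
    ("transmembrane", 15, 37),
    ("intracellular", 38, 199),
]

# Conserved serine-rich region quoted in the text (S123 to S142).
SERINE_RICH = (123, 142)


def position(site: str) -> int:
    """Extract the residue number from a label such as 'S130'."""
    return int(site[1:])


def region_of(site: str) -> str:
    """Return the topological segment that contains the site."""
    pos = position(site)
    for name, start, end in TOPOLOGY:
        if start <= pos <= end:
            return name
    raise ValueError(f"{site} lies outside the protein")


site_regions = {site: region_of(site) for sites in PTM_SITES.values() for site in sites}

# Residues predicted to carry either a phosphate or an O-GlcNAc group.
shared = sorted(set(PTM_SITES["phosphorylation"]) & set(PTM_SITES["O-GlcNAc"]))

# Phosphorylation sites that fall inside the serine-rich region.
serine_rich = [s for s in PTM_SITES["phosphorylation"]
               if SERINE_RICH[0] <= position(s) <= SERINE_RICH[1]]

print(site_regions)
print("shared:", shared)
print("in serine-rich region:", serine_rich)
```

With these inputs, K5 is the only predicted site outside the intracellular domain, and S130 is the only residue predicted to be both phosphorylated and O-GlcNAc-modified.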

Clinical Significance

C1orf185 has been linked to the circulatory system, possibly in a reactive role, as it is expressed at low levels across many species. It appears in genome-wide association studies of atrial fibrillation^[6] and QRS duration,^[5] which suggests that it may play a role in these cardiac conditions.

Homology

Below is a table showing C1orf185 orthologs across a variety of species. Orthologs were found using NCBI BLAST,^[28] dates of divergence were found using TimeTree,^[29] and global sequence identities and similarities were calculated using the Clustal Omega multiple sequence alignment tool.^[30]

Ortholog Table for C1orf185.
Genus and Species	Common Name	Taxonomic Group	Date of Divergence (MYA)	Accession Number	Sequence Length (aa)	Sequence Identity (Global)	Sequence Similarity (Global)
Homo sapiens	Human	Primates	0	NP_001129980.1	199	100%	100%
Pongo abelii	Sumatran orangutan	Primates	15.76	PNJ53823.1	195	93.50%	95.50%
Cebus capucinus imitator	Capuchin	Primates	43.2	XP_017404303.1	229	77.00%	79.60%
Galeopterus variegatus	Sunda flying lemur	Dermoptera	76	XP_008578352.1	203	73.70%	77.90%
Oryctolagus cuniculus	Rabbit	Lagomorpha	90	XP_008263491.1	225	69.90%	76.40%
Dipodomys ordii	Ord's kangaroo rat	Rodentia	90	XP_012877642.1	188	52.20%	59.40%
Mastomys coucha	Southern multimammate mouse	Rodentia	90	XP_031234037	263	51.50%	61.50%
Mus musculus	House mouse	Rodentia	90	NP_001186019.1	226	47.40%	59.50%
Peromyscus leucopus	White-footed mouse	Rodentia	90	XP_028745885.1	295	41.00%	48.20%
Phyllostomus discolor	Pale spear-nosed bat	Chiroptera	96	XP_028367083.1	191	73.40%	80.40%
Myotis davidii	David's myotis	Chiroptera	96	XP_006768446.1	196	71.40%	78.40%
Equus caballus	Horse	Perissodactyla	96	XP_023485921.1	243	63.80%	68.30%
Muntiacus muntjak	Indian muntjac	Artiodactyla	96	KAB0362285.1	200	59.40%	65.90%
Hipposideros armiger	Great roundleaf bat	Chiroptera	96	XP_019487867.1	157	54.90%	59.20%
Tursiops truncatus	Bottlenose dolphin	Artiodactyla	96	XP_033708766.1	189	54.10%	59.00%
Sarcophilus harrisii	Tasmanian devil	Dasyuromorphia	159	XP_031825005.1	333	18.20%	27.70%
Ornithorhynchus anatinus	Platypus	Monotremata	180	XP_028902271	309	26.80%	37.40%
Pelodiscus sinensis	Chinese softshell turtle	Reptilia	312	XP_025042106.1	890	7.40%	11.40%
Gopherus evgoodei	Sinaloan thornscrub tortoise	Reptilia	312	XP_030429802.1	777	4.00%	6.30%
Chrysemys picta bellii	Western painted turtle	Reptilia	312	XP_023960730.1	748	3.70%	5.80%

Compared to other genes, C1orf185 appears to be evolving relatively quickly: orthologs were found only in mammals and a few turtles, and more distantly related mammals share low sequence identity with the human protein. Primates are the only taxonomic group in which the sequence is highly conserved relative to the human sequence, while other mammals and turtles strongly conserve only the transmembrane domain (positions 15–37). As mammals are warm-blooded, this may further support the evidence showing a possible role in the circulatory system.
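The trend described above can be summarised from the ortholog table. The sketch below averages the global sequence identity at each divergence date; the values are copied from the table with the human reference row left out, and the grouping by date is an illustrative choice rather than the analysis used by the source.

```python
from collections import defaultdict
from statistics import mean

# (taxonomic group, divergence in MYA, global identity in %) from the table;
# the human reference row is omitted.
ORTHOLOGS = [
    ("Primates", 15.76, 93.5),
    ("Primates", 43.2, 77.0),
    ("Dermoptera", 76, 73.7),
    ("Lagomorpha", 90, 69.9),
    ("Rodentia", 90, 52.2),
    ("Rodentia", 90, 51.5),
    ("Rodentia", 90, 47.4),
    ("Rodentia", 90, 41.0),
    ("Chiroptera", 96, 73.4),
    ("Chiroptera", 96, 71.4),
    ("Perissodactyla", 96, 63.8),
    ("Artiodactyla", 96, 59.4),
    ("Chiroptera", 96, 54.9),
    ("Artiodactyla", 96, 54.1),
    ("Dasyuromorphia", 159, 18.2),
    ("Monotremata", 180, 26.8),
    ("Reptilia", 312, 7.4),
    ("Reptilia", 312, 4.0),
    ("Reptilia", 312, 3.7),
]


def mean_identity_by_date(rows):
    """Average the identity values that share a divergence date."""
    grouped = defaultdict(list)
    for _group, mya, identity in rows:
        grouped[mya].append(identity)
    return {mya: round(mean(values), 1) for mya, values in sorted(grouped.items())}


identity_by_date = mean_identity_by_date(ORTHOLOGS)
for mya, identity in identity_by_date.items():
    print(f"{mya:>6} MYA: mean identity {identity}%")
```

The averages fall from 93.5% at 15.76 MYA to about 5% at 312 MYA, although not strictly monotonically, since the rodent sequences pull the 90 MYA average below the 96 MYA one.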

References

[1] GRCh38: Ensembl release 89: ENSG00000204006 – Ensembl, May 2017
[2] GRCm38: Ensembl release 89: ENSMUSG00000060491 – Ensembl, May 2017
[3] "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
[4] "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
[5] Sotoodehnia N, Isaacs A, de Bakker PI, Dörr M, Newton-Cheh C, Nolte IM, et al. (December 2010). "Common variants in 22 loci are associated with QRS duration and cardiac ventricular conduction". Nature Genetics. 42 (12): 1068–76. doi:10.1038/ng.716. PMC 3338195. PMID 21076409.
[6] Roselli C, Chaffin MD, Weng LC, Aeschbacher S, Ahlberg G, Albert CM, et al. (June 2018). "Multi-ethnic genome-wide association study for atrial fibrillation". Nature Genetics. 50 (9): 1225–1233. doi:10.1038/s41588-018-0133-9. PMC 6136836. PMID 29892015.
[7] "C1orf185 chromosome 1 open reading frame 185 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2020-05-01.
[8] "Genome Data Viewer". www.ncbi.nlm.nih.gov. Retrieved 2020-05-01.
[9] "Home | Integrative Genomics Viewer". software.broadinstitute.org. Retrieved 2020-05-01.
[10] "CDD Conserved Protein Domain Family: DUF4718". www.ncbi.nlm.nih.gov. Retrieved 2020-05-01.
[11] "SAPS < Sequence Statistics < EMBL-EBI". www.ebi.ac.uk. Retrieved 2020-05-01.
[12] "ExPASy - Compute pI/Mw tool". web.expasy.org. Retrieved 2020-05-01.
[13] Graveley BR, Maniatis T (April 1998). "Arginine/serine-rich domains of SR proteins can function as activators of pre-mRNA splicing". Molecular Cell. 1 (5): 765–71. doi:10.1016/s1097-2765(00)80076-3. PMID 9660960.
[14] "TMHMM Server, v. 2.0". www.cbs.dtu.dk. Retrieved 2020-05-01.
[15] "CFSSP: Chou & Fasman Secondary Structure Prediction Server". www.biogem.org. Retrieved 2020-05-01.
[16] "I-TASSER server for protein structure and function prediction". zhanglab.ccmb.med.umich.edu. Retrieved 2020-05-01.
[17] "Genomatix - NGS Data Analysis & Personalized Medicine". www.genomatix.de. Retrieved 2020-05-01.
[18] "13889230 - GEO Profiles - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2020-05-01.
[19] "129780050 - GEO Profiles - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2020-05-01.
[20] "The Mfold Web Server | mfold.rit.albany.edu". unafold.rna.albany.edu. Retrieved 2020-05-01.
[21] "TargetScanHuman 7.2". www.targetscan.org. Retrieved 2020-05-01.
[22] Grimson A, Farh KK, Johnston WK, Garrett-Engele P, Lim LP, Bartel DP (July 2007). "MicroRNA targeting specificity in mammals: determinants beyond seed pairing". Molecular Cell. 27 (1): 91–105. doi:10.1016/j.molcel.2007.06.017. PMC 3800283. PMID 17612493.
[23] "NetPhos 3.1 Server". www.cbs.dtu.dk. Retrieved 2020-05-01.
[24] "NetGlycate 1.0 Server". www.cbs.dtu.dk. Retrieved 2020-05-01.
[25] "NetNGlyc 1.0 Server". www.cbs.dtu.dk. Retrieved 2020-05-01.
[26] "YinOYang 1.2 Server". www.cbs.dtu.dk. Retrieved 2020-05-01.
[27] Kim CS, Park S, Kim J (September 2017). "The role of glycation in the pathogenesis of aging and its prevention through herbal products and physical exercise". Journal of Exercise Nutrition & Biochemistry. 21 (3): 55–61. doi:10.20463/jenb.2017.0027. PMC 5643203. PMID 29036767.
[28] "BLAST: Basic Local Alignment Search Tool". blast.ncbi.nlm.nih.gov. Retrieved 2020-05-01.
[29] "TimeTree :: The Timescale of Life". www.timetree.org. Retrieved 2020-05-01.
[30] "Clustal Omega < Multiple Sequence Alignment < EMBL-EBI". www.ebi.ac.uk. Retrieved 2020-05-01.
