CPLM 1.0 - Compendium of Protein Lysine Modification
Tag    Content
CPLM ID CPLM-044195
UniProt Accession
Genbank Protein ID
Genbank Nucleotide ID
  
Protein Name
 Protein capicua homolog 
Protein Synonyms/Alias
  
Gene Name
 CIC 
Gene Synonyms/Alias
  
Created Date
 July 27, 2013 
Organism
 Homo sapiens (Human) 
NCBI Taxa ID
 9606 
Lysine Modification
Position  Peptide          Type         References
1598      TPVPIASKPFPTSGR  acetylation  [1, 2, 3]
Reference
 [1] Proteomic investigations reveal a role for RNA processing factor THRAP3 in the DNA damage response.
 Beli P, Lukashchuk N, Wagner SA, Weinert BT, Olsen JV, Baskcomb L, Mann M, Jackson SP, Choudhary C.
 Mol Cell. 2012 Apr 27;46(2):212-25. [PMID: 22424773]
 [2] Proteomic investigations of lysine acetylation identify diverse substrates of mitochondrial deacetylase sirt3.
 Sol EM, Wagner SA, Weinert BT, Kumar A, Kim HS, Deng CX, Choudhary C.
 PLoS One. 2012;7(12):e50545. [PMID: 23236377]
 [3] Integrated proteomic analysis of post-translational modifications by serial enrichment.
 Mertins P, Qiao JW, Patel J, Udeshi ND, Clauser KR, Mani DR, Burgess MW, Gillette MA, Jaffe JD, Carr SA.
 Nat Methods. 2013 Jul;10(7):634-7. [PMID: 23749302]
Functional Description
  
Sequence Annotation
  
Keyword
 Complete proteome; Reference proteome. 
Sequence Source
 UniProt (SWISSPROT/TrEMBL); GenBank; EMBL 
Protein Length
 2514 AA 
Protein Sequence
MKPMKKACTG LSGPGSGSKS PPATRAKALR RRGAGEGDKP EEEDDEAQQP QPQSGPEEAE 60
EGEEEEAERG PGAEGPPLEL HPGDPAPGPA EDPKGDGEAG RWEPSLSRKT ATFKSRAPKK 120
KYVEEHGAGS SGVAGAPEER VRTPEEASGL GVPPRPPTST RSSSTDTASE HSADLEDEPA 180
EACGPGPWPP GSTSGSYDLR QLRSQRVLAR RGDGLFLPAV VRQVRRSQDL GVQFPGDRAL 240
TFYEGVPGAG VDVVLDATPP PGALVVGTAV CTCVEPGVAA YREGVVVEVA TKPAAYKVRL 300
SPGPSSQPGL PGSLPQPPQP LHREPEEAVW VARSSLRLLR PPWEPETMLR KPPTGPEEEQ 360
AEPGATLPPC PAALDPKQPE DAEVSKISFG GNLGTHCEEG EEKHPPALGT PALLPLPPPQ 420
LLSPPPKSPA FVGPGRPGEQ PSPCQEGSQG GSRSSSVASL EKGTAPAARA RTPLTAAQQK 480
YKKGDVVCTP SGIRKKFNGK QWRRLCSRDG CMKESQRRGY CSRHLSMRTK EMEGLADSGP 540
GGAGRPAAVA AREGSTEFDW GDETSRDSEA SSVAARGDSR PRLVAPADLS RFEFDECEAA 600
VMLVSLGSSR SGTPSFSPVS TQSPFSPAPS PSPSPLFGFR PANFSPINAS PVIQRTAVRS 660
RHLSASTPKA GVLTPPDLGP HPPPPAPRER HSSGILPTFQ TNLTFTVPIS PGRRKTELLP 720
HPGALGAPGA GGGGAAPDFP KSDSLDSGVD SVSHTPTPST PAGFRAVSPA VPFSRSRQPS 780
PLLLLPPPAG LTSDPGPSVR RVPAVQRDSP VIVRNPDVPL PSKFPGEVGT AGEVRAGGPG 840
RGCRETPVPP GVASGKPGLP PPLPAPVPIT VPPAAPTAVA QPMPAFGLAS SPFQPVAFHP 900
SPAALLPVLV PSSYTSHPAP KKEVIMGRPG TVWTNVEPRS VAVFPWHSLV PFLAPSQPDP 960
SVQPSEAQQP ASHPVASNQS KEPAESAAVA HERPPGGTGS ADPERPPGAT CPESPGPGPP 1020
HPLGVVESGK GPPPTTEEEA SGPPGEPRLD SETESDHDDA FLSIMSPEIQ LPLPPGKRRT 1080
QSLSALPKER DSSSEKDGRS PNKREKDHIR RPMNAFMIFS KRHRALVHQR HPNQDNRTVS 1140
KILGEWWYAL GPKEKQKYHD LAFQVKEAHF KAHPDWKWCN KDRKKSSSEA KPTSLGLAGG 1200
HKETRERSMS ETGTAAAPGV SSELLSVAAQ TLLSSDTKAP GSSSCGAERL HTVGGPGSAR 1260
PRAFSHSGVH SLDGGEVDSQ ALQELTQMVS GPASYSGPKP STQYGAPGPF AAPGEGGALA 1320
ATGRPPLLPT RASRSQRAAS EDMTSDEERM VICEEEGDDD VIADDGFGTT DIDLKCKERV 1380
TDSESGDSSG EDPEGNKGFG RKVFSPVIRS SFTHCRPPLD PEPPGPPDPP VAFGKGYGSA 1440
PSSSASSPAS SSASAATSFS LGSGTFKAQE SGQGSTAGPL RPPPPGAGGP ATPSKATRFL 1500
PMDPATFRRK RPESVGGLEP PGPSVIAAPP SGGGNILQTL VLPPNKEEQE GGGARVPSAP 1560
APSLAYGAPA APLSRPAATM VTNVVRPVSS TPVPIASKPF PTSGRAEASP NDTAGARTEM 1620
GTGSRVPGGS PLGVSLVYSD KKSAAATSPA PHLVAGPLLG TVGKAPATVT NLLVGTPGYG 1680
APAPPAVQFI AQGAPGGGTT AGSGAGAGSG PNGPVPLGIL QPGALGKAGG ITQVQYILPT 1740
LPQQLQVAPA PAPAPGTKAA APSGPAPTTS IRFTLPPGTS TNGKVLAATA PTPGIPILQS 1800
VPSAPPPKAQ SVSPVQAPPP GGSAQLLPGK VLVPLAAPSM SVRGGGAGQP LPLVSPPFSV 1860
PVQNGAQPPS KIIQLTPVPV STPSGLVPPL SPATLPGPTS QPQKVLLPSS TRITYVQSAG 1920
GHALPLGTSP ASSQAGTVTS YGPTSSVALG FTSLGPSGPA FVQPLLSGQA PLLAPGQVGV 1980
SPVPSPQLPP ACAAPGGPVI TAFYSGSPAP TSSAPLAQPS QAPPSLVYTV ATSTTPPAAT 2040
ILPKGPPAPA TATPAPTSPF PSATGSMTYS LVAPKAQRPS PKAPQKVKAA IASIPVGSFE 2100
AGASGRPGPA PRQPLEPGPV REPTAPESEL EGQPTPPAPP PLPETWTPTA RSSPPLPPPA 2160
EERTSAKGPE TMASKFPSSS SDWRVPGQGL ENRGEPPTPP SPAPAPAVAP GGSSESSSGR 2220
AAGDTPERKE AAGTGKKVKV RPPPLKKTFD SVDKVLSEVD FEERFAELPE FRPEEVLPSP 2280
TLQSLATSPR AILGSYRKKR KNSTDLDSAP EDPTSPKRKM RRRSSCSSEP NTPKSAKCEG 2340
DIFTFDRTGT EAEDVLGELE YDKVPYSSLR RTLDQRRALV MQLFQDHGFF PSAQATAAFQ 2400
ARYADIFPSK VCLQLKIREV RQKIMQAATP TEQPPGAEAP LPVPPPTGTA AAPAPTPSPA 2460
GGPDPTSPSS DSGTAQAAPP LPPPPESGPG QPGWEGAPQP SPPPPGPSTA ATGR 2514 
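The sequence above is printed in blocks of 10 residues, with the position of the last residue at the end of each line. As a minimal sketch (not part of the CPLM record), the following Python shows one way to strip that layout and check that the acetylated lysine reported at position 1598 falls inside the peptide TPVPIASKPFPTSGR. The helper names `parse_block_line` and `residue_at` are hypothetical, and only the sequence line ending at 1620 is used as input.

```python
def parse_block_line(line):
    """Split one numbered sequence line into (residues, end_position).

    Assumes the layout used in this record: space-separated residue
    blocks followed by the 1-based position of the last residue.
    """
    *blocks, end = line.split()
    return "".join(blocks), int(end)


def residue_at(line, position):
    """Return the residue at a 1-based protein position found on this line."""
    residues, end = parse_block_line(line)
    start = end - len(residues) + 1  # 1-based position of the first residue
    if not start <= position <= end:
        raise ValueError(f"position {position} is not on this line ({start}-{end})")
    return residues[position - start]


# Sequence line from this record that covers positions 1561-1620.
line = "APSLAYGAPA APLSRPAATM VTNVVRPVSS TPVPIASKPF PTSGRAEASP NDTAGARTEM 1620"
peptide = "TPVPIASKPFPTSGR"  # peptide listed in the Lysine Modification table
site = 1598                  # modified position listed in the same table

residues, end = parse_block_line(line)
start = end - len(residues) + 1

# Locate the peptide on the line and convert its offset to a protein position.
offset = residues.find(peptide)
peptide_start = start + offset

print("line covers", start, "to", end)
print("peptide starts at", peptide_start)
print("residue at", site, "is", residue_at(line, site))
```

The same helpers could be applied to every sequence line in turn to rebuild the full 2514-residue string before looking up other positions.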
Gene Ontology
  
Interpro
 IPR009071; HMG_box_dom. 
Pfam
 PF00505; HMG_box 
SMART
 SM00398; HMG 
PROSITE
 PS50118; HMG_BOX_2 
PRINTS