CPLM 1.0 - Compendium of Protein Lysine Modification
Tag    Content
CPLM ID CPLM-000453
UniProt Accession
Genbank Protein ID
Genbank Nucleotide ID
Protein Name
 Trinucleotide repeat-containing gene 18 protein 
Protein Synonyms/Alias
 Long CAG trinucleotide repeat-containing gene 79 protein 
Gene Name
 TNRC18 
Gene Synonyms/Alias
 CAGL79; KIAA1856 
Created Date
 July 27, 2013 
Organism
 Homo sapiens (Human) 
NCBI Taxa ID
 9606 
Lysine Modification
Position  Peptide          Type            References
548       VVAASSSKKAYLDPG  ubiquitination  [1]
2301      ALLVPSAKRRSRKTS  acetylation     [2]
2328      GSEEPGAKARGRGRK  ubiquitination  [3, 4]
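Each peptide in this table appears to be the 15-residue window centered on the modified lysine (seven residues on either side), with the position given as a 1-based index into the protein sequence. The following is a minimal Python sketch of that lookup under this assumed windowing rule. The function name `lysine_window` and the `flank` parameter are illustrative and are not part of CPLM.

```python
def lysine_window(sequence: str, position: int, flank: int = 7) -> str:
    """Return the residues within `flank` of a 1-based lysine position."""
    index = position - 1  # convert the 1-based position to a 0-based index
    if not 0 <= index < len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[index] != "K":
        raise ValueError(f"residue {position} is {sequence[index]!r}, not lysine")
    start = max(0, index - flank)  # clip at the N-terminus
    return sequence[start:index + flank + 1]  # slicing clips at the C-terminus


# Residues 541-560 of this entry, copied from the Protein Sequence field.
fragment = "VVAASSSKKAYLDPGAVLPR"
fragment_offset = 540  # number of residues preceding the fragment

# Position 548 in the full protein is position 8 within the fragment.
peptide = lysine_window(fragment, 548 - fragment_offset)
print(peptide)
```

Run against the full 2968-residue sequence, the same call with `position=548` and no offset should return the same window. Sites closer than seven residues to either terminus would yield a shorter peptide in this sketch.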
Reference
 [1] Systematic and quantitative assessment of the ubiquitin-modified proteome.
 Kim W, Bennett EJ, Huttlin EL, Guo A, Li J, Possemato A, Sowa ME, Rad R, Rush J, Comb MJ, Harper JW, Gygi SP.
 Mol Cell. 2011 Oct 21;44(2):325-40. [PMID: 21906983]
 [2] Regulation of cellular metabolism by protein lysine acetylation.
 Zhao S, Xu W, Jiang W, Yu W, Lin Y, Zhang T, Yao J, Zhou L, Zeng Y, Li H, Li Y, Shi J, An W, Hancock SM, He F, Qin L, Chin J, Yang P, Chen X, Lei Q, Xiong Y, Guan KL.
 Science. 2010 Feb 19;327(5968):1000-4. [PMID: 20167786]
 [3] Tryptic digestion of ubiquitin standards reveals an improved strategy for identifying ubiquitinated proteins by mass spectrometry.
 Denis NJ, Vasilescu J, Lambert JP, Smith JC, Figeys D.
 Proteomics. 2007 Mar;7(6):868-74. [PMID: 17370265]
 [4] hCKSAAP_UbSite: improved prediction of human ubiquitination sites by exploiting amino acid pattern and properties.
 Chen Z, Zhou Y, Song J, Zhang Z.
 Biochim Biophys Acta. 2013 Aug;1834(8):1461-7. [PMID: 23603789]
Functional Description
  
Sequence Annotation
 DOMAIN 2817 2962 BAH.
 MOD_RES 263 263 Phosphoserine.
 MOD_RES 618 618 Phosphothreonine (By similarity).
 MOD_RES 1857 1857 Phosphoserine.
 MOD_RES 1863 1863 Phosphoserine.
 CROSSLNK 2328 2328 Glycyl lysine isopeptide (Lys-Gly)  
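The annotation lines above follow a "key, start, end, description" layout. As a rough sketch, assuming every line has exactly that shape, they can be split into records and cross-checked against the modified lysine positions. The `Feature` type and `parse_feature` helper are hypothetical names introduced here for illustration.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    key: str
    start: int
    end: int
    description: str


def parse_feature(line: str) -> Feature:
    """Split one annotation line into key, start, end, and description."""
    key, start, end, description = line.split(maxsplit=3)
    # Drop surrounding whitespace and the trailing period, if present.
    return Feature(key, int(start), int(end), description.strip().rstrip("."))


lines = [
    "DOMAIN 2817 2962 BAH.",
    "MOD_RES 263 263 Phosphoserine.",
    "MOD_RES 618 618 Phosphothreonine (By similarity).",
    "CROSSLNK 2328 2328 Glycyl lysine isopeptide (Lys-Gly)  ",
]
features = [parse_feature(line) for line in lines]

# Features that cover residue 2328, one of the ubiquitinated lysines.
covering_2328 = [f for f in features if f.start <= 2328 <= f.end]
print(covering_2328)
```

Only the cross-link entry covers residue 2328. The BAH domain starts at 2817, so none of the three modified lysines in this record fall inside it.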
Keyword
 Alternative splicing; Coiled coil; Complete proteome; Isopeptide bond; Phosphoprotein; Polymorphism; Reference proteome; Ubl conjugation. 
Sequence Source
 UniProt (SWISSPROT/TrEMBL); GenBank; EMBL 
Protein Length
 2968 AA 
Protein Sequence
MDGRDFGPQR SVHGPPPPLL SGLAMDSHRV GAATAGRLPA SGLPGPLPPG KYMAGLNLHP 60
HPGEAFLGSF VASGMGPSAS SHGSPVPLPS DLSFRSPTPS NLPMVQLWAA HAHEGFSHLP 120
SGLYPSYLHL NHLEPPSSGS PLLSQLGQPS IFDTQKGQGP GGDGFYLPTA GAPGSLHSHA 180
PSARTPGGGH SSGAPAKGSS SRDGPAKERA GRGGEPPPLF GKKDPRARGE EASGPRGVVD 240
LTQEARAEGR QDRGPPRLAE RLSPFLAESK TKNAALQPSV LTMCNGGAGD VGLPALVAEA 300
GRGGAKEAAR QDEGARLLRR TETLLPGPRP CPSPLPPPPA PPKGPPAPPA ATPAGVYTVF 360
REQGREHRVV APTFVPSVEA FDERPGPIQI ASQARDARAR EREAGRPGVL QAPPGSPRPL 420
DRPEGLREKN SVIRSLKRPP PADAPTVRAT RASPDPRAYV PAKELLKPEA DPRPCERAPR 480
GPAGPAAQQA AKLFGLEPGR PPPTGPEHKW KPFELGNFAA TQMAVLAAQH HHSRAEEEAA 540
VVAASSSKKA YLDPGAVLPR SAATCGRPVA DMHSAAHGSG EASAMQSLIK YSGSFARDAV 600
AVRPGGCGKK SPFGGLGTMK PEPAPTSAGA SRAQARLPHS GGPAAGGGRQ LKRDPERPES 660
AKAFGREGSG AQGEAEVRHP PVGIAVAVAR QKDSGGSGRL GPGLVDQERS LSLSNVKGHG 720
RADEDCVDDR ARHREERLLG ARLDRDQEKL LRESKELADL ARLHPTSCAP NGLNPNLMVT 780
GGPALAGSGR WSADPAAHLA THPWLPRSGN ASMWLAGHPY GLGPPSLHQG MAPAFPPGLG 840
GSLPSAYQFV RDPQSGQLVV IPSDHLPHFA ELMERATVPP LWPALYPPGR SPLHHAQQLQ 900
LFSQQHFLRQ QEFLYLQQQA AQALELQRSA QLVQERLKAQ EHRAEMEEKG SKRGLEAAGK 960
AGLATAGPGL LPRKPPGLAA GPAGTYGKAV SPPPSPRASP VAALKAKVIQ KLEDVSKPPA 1020
YAYPATPSSH PTSPPPASPP PTPGITRKEE APENVVEKKD LELEKEAPSP FQALFSDIPP 1080
RYPFQALPPH YGRPYPFLLQ PTAAADADGL APDVPLPADG PERLALSPED KPIRLSPSKI 1140
TEPLREGPEE EPLAEREVKA EVEDMDEGPT ELPPLESPLP LPAAEAMATP SPAGGCGGGL 1200
LEAQALSATG QSCAEPSECP DFVEGPEPRV DSPGRTEPCT AALDLGVQLT PETLVEAKEE 1260
PVEVPVAVPV VEAVPEEGLA QVAPSESQPT LEMSDCDVPA GEGQCPSLEP QEAVPVLGST 1320
CFLEEASSDQ FLPSLEDPLA GMNALAAAAE LPQARPLPSP GAAGAQALEK LEAAESLVLE 1380
QSFLHGITLL SEIAELELER RSQEMGGAER ALVARPSLES LLAAGSHMLR EVLDGPVVDP 1440
LKNLRLPREL KPNKKYSWMR KKEERMYAMK SSLEDMDALE LDFRMRLAEV QRQYKEKQRE 1500
LVKLQRRRDS EDRREEPHRS LARRGPGRPR KRTHAPSALS PPRKRGKSGH SSGKLSSKSL 1560
LTSDDYELGA GIRKRHKGSE EEHDALIGMG KARGRNQTWD EHEASSDFIS QLKIKKKKMA 1620
SDQEQLASKL DKALSLTKQD KLKSPFKFSD SAGGKSKTSG GCGRYLTPYD SLLGKNRKAL 1680
AKGLGLSLKS SREGKHKRAA KTRKMEVGFK ARGQPKSAHS PFASEVSSYS YNTDSEEDEE 1740
FLKDEWPAQG PSSSKLTPSL LCSMVAKNSK AAGGPKLTKR GLAAPRTLKP KPATSRKQPF 1800
CLLLREAEAR SSFSDSSEES FDQDESSEEE DEEEELEEED EASGGGYRLG ARERALSPGL 1860
EESGLGLLAR FAASALPSPT VGPSLSVVQL EAKQKARKKE ERQSLLGTEF EYTDSESEVK 1920
VRKRSPAGLL RPKKGLGEPG PSLAAPTPGA RGPDPSSPDK AKLAVEKGRK ARKLRGPKEP 1980
GFEAGPEASD DDLWTRRRSE RIFLHDASAA APAPVSTAPA TKTSRCAKGG PLSPRKDAGR 2040
AKDRKDPRKK KKGKEAGPGA GLPPPRAPAL PSEARAPHAS SLTAAKRSKA KAKGKEVKKE 2100
NRGKGGAVSK LMESMAAEED FEPNQDSSFS EDEHLPRGGA VERPLTPAPR SCIIDKDELK 2160
DGLRVLIPMD DKLLYAGHVQ TVHSPDIYRV VVEGERGNRP HIYCLEQLLQ EAIIDVRPAS 2220
TRFLPQGTRI AAYWSQQYRC LYPGTVVRGL LDLEDDGDLI TVEFDDGDTG RIPLSHIRLL 2280
PPDYKIQCAE PSPALLVPSA KRRSRKTSKD TGEGKDGGTA GSEEPGAKAR GRGRKPSAKA 2340
KGDRAATLEE GNPTDEVPST PLALEPSSTP GSKKSPPEPV DKRAKAPKAR PAPPQPSPAP 2400
PAFTSCPAPE PFAELPAPAT SLAPAPLITM PATRPKPKKA RAAEESGAKG PRRPGEEAEL 2460
LVKLDHEGVT SPKSKKAKEA LLLREDPGAG GWQEPKSLLS LGSYPPAAGS SEPKAPWPKA 2520
TDGDLAQEPG PGLTFEDSGN PKSPDKAQAE QDGAEESESS SSSSSGSSSS SSSSSSSGSE 2580
TEGEEEGDKN GDGGCGTGGR NCSAASSRAA SPASSSSSSS SSSSSSSSSS SSSSSSSSSS 2640
SSSSSSSSSS SSSSSSSSSS SSSSSSSSSS STTDEDSSCS SDDEAAPAPT AGPSAQAALP 2700
TKATKQAGKA RPSAHSPGKK TPAPQPQAPP PQPTQPLQPK AQAGAKSRPK KREGVHLPTT 2760
KELAKRQRLP SVENRPKIAA FLPARQLWKW FGKPTQRRGM KGKARKLFYK AIVRGKEMIR 2820
IGDCAVFLSA GRPNLPYIGR IQSMWESWGN NMVVRVKWFY HPEETSPGKQ FHQGQHWDQK 2880
SSRSLPAALR VSSQRKDFME RALYQSSHVD ENDVQTVSHK CLVVGLEQYE QMLKTKKYQD 2940
SEGLYYLAGT YEPTTGMIFS TDGVPVLC 2968 
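The sequence above is laid out in groups of ten residues with a running residue count at the end of each line. Before a position such as 548 can be looked up, the listing needs to be joined into one string. The following Python sketch does that, assuming every non-empty line ends with such a count. The function name `parse_sequence_block` is illustrative only.

```python
def parse_sequence_block(text: str) -> str:
    """Join a numbered, space-grouped sequence listing into one string."""
    groups = []
    total = 0
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines
        # The last token on each line is the running residue count.
        *line_groups, count = tokens
        groups.extend(line_groups)
        total += sum(len(group) for group in line_groups)
        if int(count) != total:
            raise ValueError(f"line claims {count} residues, found {total}")
    return "".join(groups)


# The first two lines of the Protein Sequence field, copied verbatim.
block = """\
MDGRDFGPQR SVHGPPPPLL SGLAMDSHRV GAATAGRLPA SGLPGPLPPG KYMAGLNLHP 60
HPGEAFLGSF VASGMGPSAS SHGSPVPLPS DLSFRSPTPS NLPMVQLWAA HAHEGFSHLP 120
"""
sequence = parse_sequence_block(block)
print(len(sequence), sequence[:10])
```

Checking the running count on every line catches dropped or duplicated residues early, which matters because the modification positions are only meaningful against an intact sequence.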
Gene Ontology
 GO:0003677; F:DNA binding; IEA:InterPro. 
Interpro
 IPR001025; BAH_dom. 
Pfam
 PF01426; BAH 
SMART
 SM00439; BAH 
PROSITE
 PS51038; BAH 
PRINTS