CPLM 1.0 - Compendium of Protein Lysine Modification
Tag    Content
CPLM ID
 CPLM-019664 
UniProt Accession
Genbank Protein ID
Genbank Nucleotide ID
Protein Name
 Msx2-interacting protein 
Protein Synonyms/Alias
 SMART/HDAC1-associated repressor protein; SPEN homolog 
Gene Name
 SPEN 
Gene Synonyms/Alias
 KIAA0929; MINT; SHARP 
Created Date
 July 27, 2013 
Organism
 Homo sapiens (Human) 
NCBI Taxa ID
 9606 
Lysine Modification
Position  Peptide          Type            References
902       PVEKLKAKLDNDTVK  acetylation     [1]
1293      GSPKVDEKVLPYSNI  ubiquitination  [2, 3]
1308      TVREESLKFNPYDSS  ubiquitination  [2, 3]
1361      SFPNSIIKRDSLRKR  ubiquitination  [2, 3]
1430      SSSLERNKFYSFALD  ubiquitination  [2, 3]
1445      KTITPDTKALLERAK  ubiquitination  [2, 3, 4]
1632      ENKDSELKTPPSVGP  ubiquitination  [5]
1656      SAPSALEKTTGDKTV  ubiquitination  [5]
1838      AAVSIVEKPVTRKSE  acetylation     [6]
1864      SPRGEAQKLLELKME  ubiquitination  [2, 3]
2617      VAPVIAPKITSVISR  ubiquitination  [2, 3, 4, 7]
2680      KGSVTTLKSLVSTPA  ubiquitination  [2, 3, 7]
2823      LLSYSGQKTEGPQRI  ubiquitination  [5, 7]
2857      SVSKSQVKPDSVTAS  ubiquitination  [5]
2959      SIVTTNKKLADPVTL  ubiquitination  [1, 7]
2967      LADPVTLKIETKVLQ  ubiquitination  [1, 2, 3, 7]
2971      VTLKIETKVLQPANL  ubiquitination  [2, 3]
2993      HPPALPSKLPTEVNH  ubiquitination  [7]
3240      AAKTPDAKAAPTPTP  ubiquitination  [5, 7]
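Each entry in the table above pairs a 1-based lysine position with a 15-residue peptide in which the modified lysine appears to sit in the middle (7 residues on either side). A minimal Python sketch for checking that convention on a few rows copied from the table; the helper names `central_residue` and `window_bounds` are illustrative and not part of any CPLM tooling.

```python
# Three rows copied from the lysine modification table: (position, peptide, type).
ROWS = [
    (902, "PVEKLKAKLDNDTVK", "acetylation"),
    (1293, "GSPKVDEKVLPYSNI", "ubiquitination"),
    (2617, "VAPVIAPKITSVISR", "ubiquitination"),
]


def central_residue(peptide):
    """Return the middle residue of an odd-length peptide window."""
    return peptide[len(peptide) // 2]


def window_bounds(position, peptide):
    """Return the 1-based (start, end) protein coordinates covered by the window,
    assuming the modified residue is at the centre of the peptide."""
    half = len(peptide) // 2
    return position - half, position + half


for position, peptide, mod_type in ROWS:
    start, end = window_bounds(position, peptide)
    print(position, central_residue(peptide), f"{start}-{end}", mod_type)
```

A window whose central residue is not K would suggest either a truncated peptide near a protein terminus or a parsing error in the row.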
Reference
 [1] Integrated proteomic analysis of post-translational modifications by serial enrichment.
 Mertins P, Qiao JW, Patel J, Udeshi ND, Clauser KR, Mani DR, Burgess MW, Gillette MA, Jaffe JD, Carr SA.
 Nat Methods. 2013 Jul;10(7):634-7. [PMID: 23749302]
 [2] A proteome-wide, quantitative survey of in vivo ubiquitylation sites reveals widespread regulatory roles.
 Wagner SA, Beli P, Weinert BT, Nielsen ML, Cox J, Mann M, Choudhary C.
 Mol Cell Proteomics. 2011 Oct;10(10):M111.013284. [PMID: 21890473]
 [3] hCKSAAP_UbSite: improved prediction of human ubiquitination sites by exploiting amino acid pattern and properties.
 Chen Z, Zhou Y, Song J, Zhang Z.
 Biochim Biophys Acta. 2013 Aug;1834(8):1461-7. [PMID: 23603789]
 [4] Systems-wide analysis of ubiquitylation dynamics reveals a key role for PAF15 ubiquitylation in DNA-damage bypass.
 Povlsen LK, Beli P, Wagner SA, Poulsen SL, Sylvestersen KB, Poulsen JW, Nielsen ML, Bekker-Jensen S, Mailand N, Choudhary C.
 Nat Cell Biol. 2012 Oct;14(10):1089-98. [PMID: 23000965]
 [5] Systematic and quantitative assessment of the ubiquitin-modified proteome.
 Kim W, Bennett EJ, Huttlin EL, Guo A, Li J, Possemato A, Sowa ME, Rad R, Rush J, Comb MJ, Harper JW, Gygi SP.
 Mol Cell. 2011 Oct 21;44(2):325-40. [PMID: 21906983]
 [6] Proteomic investigations reveal a role for RNA processing factor THRAP3 in the DNA damage response.
 Beli P, Lukashchuk N, Wagner SA, Weinert BT, Olsen JV, Baskcomb L, Mann M, Jackson SP, Choudhary C.
 Mol Cell. 2012 Apr 27;46(2):212-25. [PMID: 22424773]
 [7] Refined preparation and use of anti-diglycine remnant (K-ε-GG) antibody enables routine quantification of 10,000s of ubiquitination sites in single proteomics experiments.
 Udeshi ND, Svinkina T, Mertins P, Kuhn E, Mani DR, Qiao JW, Carr SA.
 Mol Cell Proteomics. 2013 Mar;12(3):825-31. [PMID: 23266961]
Functional Description
 May serve as a nuclear matrix platform that organizes and integrates transcriptional responses. In osteoblasts, supports transcription activation: synergizes with RUNX2 to enhance FGFR2-mediated activation of the osteocalcin FGF-responsive element (OCFRE) (By similarity). Has also been shown to be an essential corepressor protein, which probably regulates different key pathways such as the Notch pathway. Negative regulator of the Notch pathway via its interaction with RBPSUH, which prevents the association between NOTCH1 and RBPSUH, and therefore suppresses the transactivation activity of Notch signaling. Blocks the differentiation of precursor B-cells into marginal zone B-cells. Probably represses transcription via the recruitment of large complexes containing histone deacetylase proteins. May bind both to DNA and RNA. 
Sequence Annotation
 DOMAIN 6 81 RRM 1.
 DOMAIN 335 415 RRM 2.
 DOMAIN 438 513 RRM 3.
 DOMAIN 517 589 RRM 4.
 DOMAIN 2201 2707 RID.
 DOMAIN 3498 3664 SPOC.
 DNA_BIND 1 573 By similarity.
 REGION 2130 2464 Interaction with MSX2 (By similarity).
 REGION 2709 2870 Interaction with RBPSUH (By similarity).
 MOD_RES 309 309 Phosphoserine.
 MOD_RES 623 623 Phosphoserine.
 MOD_RES 727 727 Phosphoserine.
 MOD_RES 736 736 Phosphoserine.
 MOD_RES 740 740 Phosphoserine.
 MOD_RES 1062 1062 Phosphoserine.
 MOD_RES 1194 1194 Phosphoserine.
 MOD_RES 1222 1222 Phosphoserine.
 MOD_RES 1252 1252 Phosphoserine.
 MOD_RES 1261 1261 Phosphoserine.
 MOD_RES 1268 1268 Phosphoserine.
 MOD_RES 1278 1278 Phosphoserine.
 MOD_RES 1283 1283 Phosphoserine.
 MOD_RES 1287 1287 Phosphoserine.
 MOD_RES 1333 1333 Phosphoserine.
 MOD_RES 1380 1380 Phosphoserine.
 MOD_RES 1382 1382 Phosphoserine.
 MOD_RES 1439 1439 Phosphothreonine.
 MOD_RES 1441 1441 Phosphothreonine.
 MOD_RES 1633 1633 Phosphothreonine.
 MOD_RES 1897 1897 Phosphoserine.
 MOD_RES 1947 1947 Phosphothreonine.
 MOD_RES 2101 2101 Phosphoserine.
 MOD_RES 2120 2120 Phosphoserine.
 MOD_RES 2126 2126 Phosphoserine.
 MOD_RES 2159 2159 Phosphoserine.
 MOD_RES 2366 2366 Phosphoserine.
 MOD_RES 2421 2421 Phosphothreonine.
 MOD_RES 2452 2452 Phosphoserine.
 MOD_RES 2460 2460 Phosphothreonine.
 MOD_RES 2938 2938 Phosphothreonine.
 MOD_RES 2950 2950 Phosphothreonine.
 MOD_RES 3433 3433 Phosphoserine.  
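The annotation lines above follow a "KEY start end note" layout, so a modified lysine can be checked against the listed domain and region boundaries. A minimal sketch, assuming the lines are whitespace-delimited exactly as shown; `parse_feature` and `features_at` are hypothetical helper names, and the sample covers only a handful of the lines.

```python
# A few annotation lines copied from the Sequence Annotation block.
FEATURE_LINES = """\
DOMAIN 2201 2707 RID.
DOMAIN 3498 3664 SPOC.
REGION 2130 2464 Interaction with MSX2 (By similarity).
REGION 2709 2870 Interaction with RBPSUH (By similarity).
MOD_RES 2452 2452 Phosphoserine.
"""


def parse_feature(line):
    """Split one annotation line into (key, start, end, note)."""
    key, start, end, note = line.split(maxsplit=3)
    return key, int(start), int(end), note.strip().rstrip(".")


def features_at(position, features):
    """Return every feature whose 1-based inclusive range contains position."""
    return [f for f in features if f[1] <= position <= f[2]]


features = [parse_feature(line) for line in FEATURE_LINES.splitlines() if line.strip()]

# Lysine positions taken from the Lysine Modification table.
for site in (2617, 2823):
    print(site, [(key, note) for key, _, _, note in features_at(site, features)])
```

With these sample lines, K2617 falls inside the RID domain and K2823 inside the region annotated as interacting with RBPSUH.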
Keyword
 3D-structure; Activator; Coiled coil; Complete proteome; DNA-binding; Host-virus interaction; Notch signaling pathway; Nucleus; Phosphoprotein; Polymorphism; Reference proteome; Repeat; Repressor; RNA-binding; Transcription; Transcription regulation. 
Sequence Source
 UniProt (SWISSPROT/TrEMBL); GenBank; EMBL 
Protein Length
 3664 AA 
Protein Sequence
MVRETRHLWV GNLPENVREE KIIEHFKRYG RVESVKILPK RGSEGGVAAF VDFVDIKSAQ 60
KAHNSVNKMG DRDLRTDYNE PGTIPSAARG LDDTVSIASR SREVSGFRGG GGGPAYGPPP 120
SLHAREGRYE RRLDGASDNR ERAYEHSAYG HHERGTGGFD RTRHYDQDYY RDPRERTLQH 180
GLYYASRSRS PNRFDAHDPR YEPRAREQFT LPSVVHRDIY RDDITREVRG RRPERNYQHS 240
RSRSPHSSQS RNQSPQRLAS QASRPTRSPS GSGSRSRSSS SDSISSSSST SSDSSDSSSS 300
SSDDSPARSV QSAAVPAPTS QLLSSLEKDE PRKSFGIKVQ NLPVRSTDTS LKDGLFHEFK 360
KFGKVTSVQI HGTSEERYGL VFFRQQEDQE KALTASKGKL FFGMQIEVTA WIGPETESEN 420
EFRPLDERID EFHPKATRTL FIGNLEKTTT YHDLRNIFQR FGEIVDIDIK KVNGVPQYAF 480
LQYCDIASVC KAIKKMDGEY LGNNRLKLGF GKSMPTNCVW LDGLSSNVSD QYLTRHFCRY 540
GPVVKVVFDR LKGMALVLYN EIEYAQAAVK ETKGRKIGGN KIKVDFANRE SQLAFYHCME 600
KSGQDIRDFY EMLAERREER RASYDYNQDR TYYESVRTPG TYPEDSRRDY PARGREFYSE 660
WETYQGDYYE SRYYDDPREY RDYRNDPYEQ DIREYSYRQR ERERERERFE SDRDRDHERR 720
PIERSQSPVH LRRPQSPGAS PSQAERLPSD SERRLYSRSS DRSGSCSSLS PPRYEKLDKS 780
RLERYTKNEK TDKERTFDPE RVERERRLIR KEKVEKDKTD KQKRKGKVHS PSSQSSETDQ 840
ENEREQSPEK PRSCNKLSRE KADKEGIAKN RLELMPCVVL TRVKEKEGKV IDHTPVEKLK 900
AKLDNDTVKS SALDQKLQVS QTEPAKSDLS KLESVRMKVP KEKGLSSHVE VVEKEGRLKA 960
RKHLKPEQPA DGVSAVDLEK LEARKRRFAD SNLKAEKQKP EVKKSSPEME DARVLSKKQP 1020
DVSSREVILL REGEAERKPV RKEILKRESK KIKLDRLNTV ASPKDCQELA SISVGSGSRP 1080
SSDLQARLGE LAGESVENQE VQSKKPIPSK PQLKQLQVLD DQGPEREDVR KNYCSLRDET 1140
PERKSGQEKS HSVNTEEKIG IDIDHTQSYR KQMEQSRRKQ QMEMEIAKSE KFGSPKKDVD 1200
EYERRSLVHE VGKPPQDVTD DSPPSKKKRM DHVDFDICTK RERNYRSSRQ ISEDSERTGG 1260
SPSVRHGSFH EDEDPIGSPR LLSVKGSPKV DEKVLPYSNI TVREESLKFN PYDSSRREQM 1320
ADMAKIKLSV LNSEDELNRW DSQMKQDAGR FDVSFPNSII KRDSLRKRSV RDLEPGEVPS 1380
DSDEDGEHKS HSPRASALYE SSRLSFLLRD REDKLRERDE RLSSSLERNK FYSFALDKTI 1440
TPDTKALLER AKSLSSSREE NWSFLDWDSR FANFRNNKDK EKVDSAPRPI PSWYMKKKKI 1500
RTDSEGKMDD KKEDHKEEEQ ERQELFASRF LHSSIFEQDS KRLQHLERKE EDSDFISGRI 1560
YGKQTSEGAN STTDSIQEPV VLFHSRFMEL TRMQQKEKEK DQKPKEVEKQ EDTENHPKTP 1620
ESAPENKDSE LKTPPSVGPP SVTVVTLESA PSALEKTTGD KTVEAPLVTE EKTVEPATVS 1680
EEAKPASEPA PAPVEQLEQV DLPPGADPDK EAAMMPAGVE EGSSGDQPPY LDAKPPTPGA 1740
SFSQAESNVD PEPDSTQPLS KPAQKSEEAN EPKAEKPDAT ADAEPDANQK AEAAPESQPP 1800
ASEDLEVDPP VAAKDKKPNK SKRSKTPVQA AAVSIVEKPV TRKSERIDRE KLKRSNSPRG 1860
EAQKLLELKM EAEKITRTAS KNSAADLEHP EPSLPLSRTR RRNVRSVYAT MGDHENRSPV 1920
KEPVEQPRVT RKRLERELQE AAAVPTTPRR GRPPKTRRRA DEEEENEAKE PAETLKPPEG 1980
WRSPRSQKTA AGGGPQGKKG KNEPKVDATR PEATTEVGPQ IGVKESSMEP KAAEEEAGSE 2040
QKRDRKDAGT DKNPPETAPV EVVEKKPAPE KNSKSKRGRS RNSRLAVDKS ASLKNVDAAV 2100
SPRGAAAQAG ERESGVVAVS PEKSESPQKE DGLSSQLKSD PVDPDKEPEK EDVSASGPSP 2160
EATQLAKQME LEQAVEHIAK LAEASASAAY KADAPEGLAP EDRDKPAHQA SETELAAAIG 2220
SIINDISGEP ENFPAPPPYP GESQTDLQPP AGAQALQPSE EGMETDEAVS GILETEAATE 2280
SSRPPVNAPD PSAGPTDTKE ARGNSSETSH SVPEAKGSKE VEVTLVRKDK GRQKTTRSRR 2340
KRNTNKKVVA PVESHVPESN QAQGESPAAN EGTTVQHPEA PQEEKQSEKP HSTPPQSCTS 2400
DLSKIPSTEN SSQEISVEER TPTKASVPPD LPPPPQPAPV DEEPQARFRV HSIIESDPVT 2460
PPSDPSIPIP TLPSVTAAKL SPPVASGGIP HQSPPTKVTE WITRQEEPRA QSTPSPALPP 2520
DTKASDVDTS SSTLRKILMD PKYVSATSVT STSVTTAIAE PVSAAPCLHE APPPPVDSKK 2580
PLEEKTAPPV TNNSEIQASE VLVAADKEKV APVIAPKITS VISRMPVSID LENSQKITLA 2640
KPAPQTLTGL VSALTGLVNV SLVPVNALKG PVKGSVTTLK SLVSTPAGPV NVLKGPVNVL 2700
TGPVNVLTTP VNATVGTVNA APGTVNAAAS AVNATASAVT VTAGAVTAAS GGVTATTGTV 2760
TMAGAVIAPS TKCKQRASAN ENSRFHPGSM PVIDDRPADA GSGAGLRVNT SEGVVLLSYS 2820
GQKTEGPQRI SAKISQIPPA SAMDIEFQQS VSKSQVKPDS VTASQPPSKG PQAPAGYANV 2880
ATHSTLVLTA QTYNASPVIS SVKADRPSLE KPEPIHLSVS TPVTQGGTVK VLTQGINTPP 2940
VLVHNQLVLT PSIVTTNKKL ADPVTLKIET KVLQPANLGS TLTPHHPPAL PSKLPTEVNH 3000
VPSGPSIPAD RTVSHLAAAK LDAHSPRPSG PGPSSFPRAS HPSSTASTAL STNATVMLAA 3060
GIPVPQFISS IHPEQSVIMP PHSITQTVSL SHLSQGEVRM NTPTLPSITY SIRPEALHSP 3120
RAPLQPQQIE VRAPQRASTP QPAPAGVPAL ASQHPPEEEV HYHLPVARAT APVQSEVLVM 3180
QSEYRLHPYT VPRDVRIMVH PHVTAVSEQP RAADGVVKVP PASKAPQQPG KEAAKTPDAK 3240
AAPTPTPAPV PVPVPLPAPA PAPHGEARIL TVTPSNQLQG LPLTPPVVVT HGVQIVHSSG 3300
ELFQEYRYGD IRTYHPPAQL THTQFPAASS VGLPSRTKTA AQGPPPEGEP LQPPQPVQST 3360
QPAQPAPPCP PSQLGQPGQP PSSKMPQVSQ EAKGTQTGVE QPRLPAGPAN RPPEPHTQVQ 3420
RAQAETGPTS FPSPVSVSMK PDLPVSLPTQ TAPKQPLFVP TTSGPSTPPG LVLPHTEFQP 3480
APKQDSSPHL TSQRPVDMVQ LLKKYPIVWQ GLLALKNDTA AVQLHFVSGN NVLAHRSLPL 3540
SEGGPPLRIA QRMRLEATQL EGVARRMTVE TDYCLLLALP CGRDQEDVVS QTESLKAAFI 3600
TYLQAKQAAG IINVPNPGSN QPAYVLQIFP PCEFSESHLS RLAPDLLASI SNISPHLMIV 3660
IASV 3664 
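The sequence block above is printed in groups of 10 residues with a running residue count at the end of each line, so it has to be cleaned before positions can be looked up. A minimal sketch, assuming only letters belong to the sequence; `clean_sequence` and `residue` are illustrative names, and the sample uses just the first two lines of the block.

```python
import re

# First two lines of the Protein Sequence block, copied verbatim.
BLOCK = """\
MVRETRHLWV GNLPENVREE KIIEHFKRYG RVESVKILPK RGSEGGVAAF VDFVDIKSAQ 60
KAHNSVNKMG DRDLRTDYNE PGTIPSAARG LDDTVSIASR SREVSGFRGG GGGPAYGPPP 120
"""


def clean_sequence(block):
    """Drop spaces, newlines and the trailing residue counters; keep letters only."""
    return re.sub(r"[^A-Za-z]", "", block).upper()


def residue(sequence, position):
    """Return the residue at a 1-based position."""
    return sequence[position - 1]


seq = clean_sequence(BLOCK)
print(len(seq), residue(seq, 1), residue(seq, 60))
```

Applied to the full block, the cleaned length should match the Protein Length field (3664 AA) if the block is intact, and `residue(seq, position)` should return K for every position in the Lysine Modification table.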
Gene Ontology
 GO:0017053; C:transcriptional repressor complex; IDA:BHF-UCL.
 GO:0000166; F:nucleotide binding; IEA:InterPro.
 GO:0003723; F:RNA binding; IEA:UniProtKB-KW.
 GO:0001191; F:RNA polymerase II transcription factor binding transcription factor activity involved in negative regulation of transcription; IDA:BHF-UCL.
 GO:0003700; F:sequence-specific DNA binding transcription factor activity; IEA:Compara.
 GO:0003697; F:single-stranded DNA binding; IEA:Compara.
 GO:0003714; F:transcription corepressor activity; IEA:Compara.
 GO:0007219; P:Notch signaling pathway; IEA:UniProtKB-KW.
 GO:0050769; P:positive regulation of neurogenesis; IMP:BHF-UCL.
 GO:0045893; P:positive regulation of transcription, DNA-dependent; IEA:Compara.
 GO:0019048; P:virus-host interaction; IEA:UniProtKB-KW. 
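Each Gene Ontology line above has three semicolon-separated fields: the GO identifier, an aspect letter with the term name (C, F or P for cellular component, molecular function or biological process), and an evidence code with its source. A minimal sketch of splitting such a line, assuming term names never contain a semicolon; `parse_go` is a hypothetical helper name.

```python
ASPECTS = {
    "C": "cellular component",
    "F": "molecular function",
    "P": "biological process",
}


def parse_go(line):
    """Split one GO line into identifier, aspect, term, evidence code and source."""
    go_id, term, evidence = [part.strip() for part in line.strip().rstrip(".").split(";")]
    aspect, name = term.split(":", 1)
    code, source = evidence.split(":", 1)
    return {
        "id": go_id,
        "aspect": ASPECTS[aspect],
        "term": name,
        "evidence": code,
        "source": source,
    }


# Two lines copied from the Gene Ontology block.
print(parse_go(" GO:0017053; C:transcriptional repressor complex; IDA:BHF-UCL."))
print(parse_go(" GO:0045893; P:positive regulation of transcription, DNA-dependent; IEA:Compara."))
```

Splitting on the semicolon rather than the comma keeps term names such as "positive regulation of transcription, DNA-dependent" intact.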
Interpro
 IPR012677; Nucleotide-bd_a/b_plait.
 IPR000504; RRM_dom.
 IPR012921; SPOC_C.
 IPR016194; SPOC_like_C_dom.
 IPR010912; SPOC_met. 
Pfam
 PF00076; RRM_1
 PF07744; SPOC 
SMART
 SM00360; RRM 
PROSITE
 PS50102; RRM
 PS50917; SPOC 
PRINTS