CPLM 1.0 - Compendium of Protein Lysine Modification
Tag Content
CPLM ID CPLM-022606
UniProt Accession
Genbank Protein ID
Genbank Nucleotide ID
Protein Name
 Protein SON 
Protein Synonyms/Alias
 Negative regulatory element-binding protein; NRE-binding protein 
Gene Name
 Son 
Gene Synonyms/Alias
 Nrebp 
Created Date
 July 27, 2013 
Organism
 Mus musculus (Mouse) 
NCBI Taxa ID
 10090 
Lysine Modification
Position  Peptide          Type            References
16        FRSFVVSKFREIQQE  ubiquitination  [1]
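The peptide reported for the site can be cross-checked against the protein sequence given in this entry. The following is a minimal sketch, assuming the peptide is a window of 7 residues on each side of the modified lysine; the helper name `site_window` and the `flank` parameter are illustrative and not part of CPLM.

```python
def site_window(sequence: str, position: int, flank: int = 7) -> str:
    """Return the peptide centred on a 1-based position, clipped at the termini."""
    idx = position - 1                      # convert to a 0-based index
    start = max(0, idx - flank)
    end = min(len(sequence), idx + flank + 1)
    return sequence[start:end]


# First 30 residues of the protein sequence listed in this entry
n_term = "MAADIEQVFRSFVVSKFREIQQELSSGRSE"

# Assumption: the window is symmetric with 7 flanking residues per side
peptide = site_window(n_term, 16)
print(n_term[16 - 1], peptide)
```

With the N-terminal residues of this entry, position 16 is a lysine and the window reproduces the peptide listed in the table.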
Reference
 [1] Proteomic analyses reveal divergent ubiquitylation site patterns in murine tissues.
 Wagner SA, Beli P, Weinert BT, Schölz C, Kelstrup CD, Young C, Nielsen ML, Olsen JV, Brakebusch C, Choudhary C.
 Mol Cell Proteomics. 2012 Dec;11(12):1578-85. [PMID: 22790023]
Functional Description
 RNA-binding protein that acts as an mRNA splicing cofactor by promoting efficient splicing of transcripts that possess weak splice sites. Specifically promotes splicing of many cell-cycle and DNA-repair transcripts that possess weak splice sites, such as TUBG1, KATNB1, TUBGCP2, AURKB, PCNT, AKT1, RAD23A, and FANCG. Probably acts by facilitating the interaction between serine/arginine-rich proteins such as SRSF2 and RNA polymerase II. Also binds to DNA; binds to the consensus DNA sequence 5'-GA[GT]AN[CG][AG]CC-3' (By similarity). May also regulate ghrelin signaling in hypothalamic neurons by acting as a negative regulator of GHSR expression. 
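The consensus DNA sequence in the description maps directly onto a regular expression, with N read as any of A, C, G or T. The sketch below scans only the strand it is given; the function name `find_sites` and the test DNA string are made up for illustration and do not come from this entry.

```python
import re

# 5'-GA[GT]AN[CG][AG]CC-3', with N expanded to any nucleotide.
# The lookahead lets overlapping occurrences be reported as well.
CONSENSUS = re.compile(r"(?=(GA[GT]A[ACGT][CG][AG]CC))")


def find_sites(dna: str):
    """Return (0-based start, matched sequence) pairs on the given strand."""
    return [(m.start(), m.group(1)) for m in CONSENSUS.finditer(dna.upper())]


# Hypothetical DNA fragment, not taken from the entry
example = "TTGAGATCACCAA"
print(find_sites(example))
```

Scanning the reverse complement as well would be needed to cover both strands; that step is left out here.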
Sequence Annotation
 REPEAT 1001 1006 1-1.
 REPEAT 1009 1014 1-2.
 REPEAT 1016 1021 1-3.
 REPEAT 1025 1030 1-4.
 REPEAT 1033 1038 1-5.
 REPEAT 1041 1046 1-6.
 REPEAT 1050 1055 1-7.
 REPEAT 1058 1063 1-8.
 REPEAT 1066 1071 1-9.
 REPEAT 1075 1080 1-10.
 REPEAT 1084 1089 1-11.
 REPEAT 1095 1100 1-12.
 REPEAT 1106 1111 1-13.
 REPEAT 1115 1120 1-14.
 REPEAT 1950 1956 2-1.
 REPEAT 1959 1977 3-1.
 REPEAT 1978 1984 2-2.
 REPEAT 1985 1991 2-3.
 REPEAT 1992 1998 2-4.
 REPEAT 1999 2005 2-5.
 REPEAT 2006 2012 2-6.
 REPEAT 2013 2019 2-7; approximate.
 REPEAT 2020 2030 3-2; approximate.
 DOMAIN 2323 2369 G-patch.
 DOMAIN 2389 2444 DRBM.
 REGION 721 850 13 X 10 AA tandem repeats of L-A-[ST]-
 REGION 907 983 11 X 7 AA tandem repeats of [DR]-P-Y-R-
 REGION 1001 1120 14 X 6 AA repeats of [ED]-R-S-M-M-S.
 REGION 1141 1173 3 X 11 AA tandem repeats of P-P-L-P-P-E-E-
 REGION 1950 2019 7 X 7 AA repeats of P-S-R-R-S-R-[TS].
 REGION 1959 2030 2 X 19 AA repeats of P-S-R-R-R-R-S-R-S-V-
 REGION 2031 2057 3 X tandem repeats of [ST]-P-[VLI]-R-
 MOD_RES 2 2 N-acetylalanine (By similarity).
 MOD_RES 16 16 N6-acetyllysine (By similarity).
 MOD_RES 94 94 Phosphoserine (By similarity).
 MOD_RES 142 142 Phosphoserine (By similarity).
 MOD_RES 150 150 Phosphoserine (By similarity).
 MOD_RES 152 152 Phosphoserine (By similarity).
 MOD_RES 158 158 Phosphoserine (By similarity).
 MOD_RES 284 284 N6-acetyllysine (By similarity).
 MOD_RES 1714 1714 Phosphoserine.
 MOD_RES 1723 1723 Phosphoserine.
 MOD_RES 1794 1794 Phosphoserine (By similarity).
 MOD_RES 1808 1808 Phosphoserine (By similarity).
 MOD_RES 1809 1809 Phosphoserine.
 MOD_RES 1845 1845 Phosphoserine.
 MOD_RES 1846 1846 Phosphoserine.
 MOD_RES 1849 1849 Phosphoserine.
 MOD_RES 1973 1973 Phosphoserine.
 MOD_RES 1975 1975 Phosphoserine.
 MOD_RES 1979 1979 Phosphoserine.
 MOD_RES 2027 2027 Phosphoserine.
 MOD_RES 2029 2029 Phosphoserine.
 MOD_RES 2031 2031 Phosphoserine.
 MOD_RES 2073 2073 N6-acetyllysine (By similarity).
 MOD_RES 2181 2181 Phosphothreonine (By similarity).  
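The annotation lines above follow a "KEY start end description" layout, so they can be split into records and checked against the sequence. The sketch below assumes that whitespace-delimited layout and reads the [ED]-R-S-M-M-S repeat as a regular expression; `parse_feature` is a hypothetical helper, and the excerpt is residues 1001-1021 copied from the sequence in this entry.

```python
import re


def parse_feature(line: str) -> dict:
    """Split one annotation line into key, 1-based start/end and description."""
    key, start, end, description = line.split(None, 3)
    return {
        "key": key,
        "start": int(start),
        "end": int(end),
        "description": description.strip(),
    }


lines = [
    " REPEAT 1001 1006 1-1.",
    " REPEAT 1009 1014 1-2.",
    " REPEAT 1016 1021 1-3.",
]
repeats = [parse_feature(line) for line in lines]

# Residues 1001-1021 of the protein sequence in this entry
excerpt = "ERSMMSSYERSMMSYERSMMS"
offset = 1001

# [ED]-R-S-M-M-S written as a regex; lookahead allows overlapping hits
motif = re.compile(r"(?=([ED]RSMMS))")
found = [(offset + m.start(), offset + m.start() + 5) for m in motif.finditer(excerpt)]

annotated = [(f["start"], f["end"]) for f in repeats]
print(found == annotated)
```

For this excerpt the motif hits coincide with the first three annotated REPEAT ranges.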
Keyword
 Acetylation; Alternative splicing; Cell cycle; Complete proteome; DNA-binding; mRNA processing; mRNA splicing; Nucleus; Phosphoprotein; Reference proteome; Repeat; RNA-binding; Transcription; Transcription regulation. 
Sequence Source
 UniProt (SWISSPROT/TrEMBL); GenBank; EMBL 
Protein Length
 2444 AA 
Protein Sequence
MAADIEQVFR SFVVSKFREI QQELSSGRSE GQLNGETNPP IEGNQAGDTA ASARSLPNEE 60
IVQKIEEVLS GVLDTELRYK PDLKEASRKS RCVSVQTDPT DEVPTKKSKK HKKHKNKKKK 120
KKKEKEKKYK RQPEESESKL KSHHDGNLES DSFLKFDSEP SAAALEHPVR AFGLSEASET 180
ALVLEPPVVS MEVQESHVLE TLKPATKAAE LSVVSTSVIS EQSEQPMPGM LEPSMTKILD 240
SFTAAPVPMS TAALKSPEPV VTMSVEYQKS VLKSLETMPP ETSKTTLVEL PIAKVVEPSE 300
TLTIVSETPT EVHPEPSPST MDFPESSTTD VQRLPEQPVE APSEIADSSM TRPQESLELP 360
KTTAVELQES TVASALELPG PPATSILELQ GPPVTPVPEL PGPSATPVPE LSGPLSTPVP 420
ELPGPPATVV PELPGPSVTP VPQLSQELPG PPAPSMGLEP PQEVPEPPVM AQELSGVPAV 480
SAAIELTGQP AVTVAMELTE QPVTTTEFEQ PVAMTTVEHP GHPEVTTATG LLGQPEAAMV 540
LELPGQPVAT TALELSGQPS VTGVPELSGL PSATRALELS GQSVATGALE LPGQLMATGA 600
LEFSGQSGAA GALELLGQPL ATGVLELPGQ PGAPELPGQP VATVALEISV QSVVTTSELS 660
TMTVSQSLEV PSTTALESYN TVAQELPTTL VGETSVTVGV DPLMAQESHM LASNTMETHM 720
LASNTMDSQM LASNTMDSQM LASNTMDSQM LASSTMDSQM LASSTMDSQM LATSTMDSQM 780
LATSSMDSQM LATSSMDSQM LATSSMDSQM LATSSMDSQM LATSSMDSQM LATSSMDSQM 840
LATSSMDSQM LATSSMDSQM LASGAMDSQM LASGTMDAQM LASGTMDAQM LASSTQDSAM 900
MGSKSPDPYR LAQDPYRLAQ DPYRLGHDPY RLGHDAYRLG QDPYRLGHDP YRLTPDPYRV 960
SPRPYRIAPR SYRIAPRPYR LAPRPLMLAS RRSMMMSYAA ERSMMSSYER SMMSYERSMM 1020
SPMAERSMMS AYERSMMSAY ERSMMSPMAE RSMMSAYERS MMSAYERSMM SPMADRSMMS 1080
MGADRSMMSS YSAADRSMMS SYSAADRSMM SSYTDRSMMS MAADSYTDSY TDSYTEAYMV 1140
PPLPPEEPPT MPPLPPEEPP MTPPLPPEEP PEGPALSTEQ SALTADNTWS TEVTLSTGES 1200
LSQPEPPVSQ SEISEPMAVP ANYSMSESET SMLASEAVMT VPEPAREPES SVTSAPVESA 1260
VVAEHEMVPE RPMTYMVSET TMSVEPAVLT SEASVISETS ETYDSMRPSG HAISEVTMSL 1320
LEPAVTISQP AENSLELPSM TVPAPSTMTT TESPVVAVTE IPPVAVPEPP IMAVPELPTM 1380
AVVKTPAVAV PEPLVAAPEP PTMATPELCS LSVSEPPVAV SELPALADPE HAITAVSGVS 1440
SLEPSVPILE PAVSVLQPVM IVSEPSVPVQ EPTVAVSEPA VIVSEHTQIT SPEMAVESSP 1500
VIVDSSVMSS QIMKGMNLLG GDENLGPEVG MQETLLHPGE EPRDGGHLKS DLYENEYDRN 1560
ADLTVNSHLI VKDAEHNTVC ATTVGPVGEA SEEKILPISE TKEITELATC AAVSEADIGR 1620
SLSSQLALEL DTVGTSKGFE FVTASALISE SKYDVEVSVT TQDTEHDMVI STSPSGGSEA 1680
DIEGPLPAKD IHLDLPSTNF VCKDVEDSLP IKESAQAVAV ALSPKESSED TEVPLPNKEI 1740
VPESGYSASI DEINEADLVR PLLPKDMERL TSLRAGIEGP LLASEVERDK SAASPVVISI 1800
PERASESSSE EKDDYEIFVK VKDTHEKSKK NKNRDKGEKE KKRDSSLRSR SKRSKSSEHK 1860
SRKRTSESRS RARKRSSKSK SHRSQTRSRS RSRRRRRSSR SRSKSRGRRS VSKEKRKRSP 1920
KHRSKSRERK RKRSSSRDNR KAARARSRTP SRRSRSHTPS RRRRSRSVGR RRSFSISPSR 1980
RSRTPSRRSR TPSRRSRTPS RRSRTPSRRS RTPSRRRRSR SAVRRRSFSI SPVRLRRSRT 2040
PLRRRFSRSP IRRKRSRSSE RGRSPKRLTD LDKAQLLEIA KANAAAMCAK AGVPLPPNLK 2100
PAPPPTIEEK VAKKSGGATI EELTEKCKQI AQSKEDDDVI VNKPHVSDEE EEEPPFYHHP 2160
FKLSEPKPIF FNLNIAAAKP TPPKSQVTLT KEFPVSSGSQ HRKKEADSVY GEWVPVEKNG 2220
EESKDDDNVF SSSLPSEPVD ISTAMSERAL AQKRLSENAF DLEAMSMLNR AQERIDAWAQ 2280
LNSIPGQFTG STGVQVLTQE QLANTGAQAW IKKDQFLRAA PVTGGMGAVL MRKMGWREGE 2340
GLGKNKEGNK EPILVDFKTD RKGLVAVGER AQKRSGNFSA AMKDLSGKHP VSALMEICNK 2400
RRWQPPEFLL VHDSGPDHRK HFLFRVLRNG SPYQPNCMFF LNRY 2444 
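The sequence block is laid out in groups of ten residues with a running position count at the end of each line. A minimal sketch for turning such a block into a plain string follows; it assumes that anything that is not a letter (spaces, line breaks, position counts) can be discarded, and `parse_sequence_block` is an illustrative name. Only the first two lines of the block are used here.

```python
def parse_sequence_block(text: str) -> str:
    """Keep only letters, dropping spaces, newlines and position counts."""
    return "".join(ch for ch in text if ch.isalpha()).upper()


# First two lines of the sequence block in this entry
block = """\
MAADIEQVFR SFVVSKFREI QQELSSGRSE GQLNGETNPP IEGNQAGDTA ASARSLPNEE 60
IVQKIEEVLS GVLDTELRYK PDLKEASRKS RCVSVQTDPT DEVPTKKSKK HKKHKNKKKK 120
"""

sequence = parse_sequence_block(block)
print(len(sequence), sequence[:10])
```

Applied to the full block, the resulting length can be compared with the stated Protein Length of 2444 AA as a consistency check.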
Gene Ontology
 GO:0016607; C:nuclear speck; ISS:UniProtKB.
 GO:0003677; F:DNA binding; IEA:UniProtKB-KW.
 GO:0003725; F:double-stranded RNA binding; IEA:InterPro.
 GO:0003723; F:RNA binding; ISS:UniProtKB.
 GO:0000910; P:cytokinesis; ISS:UniProtKB.
 GO:0000226; P:microtubule cytoskeleton organization; ISS:UniProtKB.
 GO:0006397; P:mRNA processing; ISS:UniProtKB.
 GO:0043066; P:negative regulation of apoptotic process; IEA:Compara.
 GO:0051726; P:regulation of cell cycle; ISS:UniProtKB.
 GO:0043484; P:regulation of RNA splicing; ISS:UniProtKB.
 GO:0006355; P:regulation of transcription, DNA-dependent; IEA:UniProtKB-KW.
 GO:0008380; P:RNA splicing; IEA:UniProtKB-KW.
 GO:0006351; P:transcription, DNA-dependent; IEA:UniProtKB-KW. 
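Each Gene Ontology line carries an identifier, an aspect letter with the term name, and an evidence code with its source, separated by "; ". The sketch below assumes exactly that three-part layout, with C, F and P read as cellular component, molecular function and biological process; `parse_go_line` is a hypothetical helper name.

```python
ASPECTS = {
    "C": "cellular component",
    "F": "molecular function",
    "P": "biological process",
}


def parse_go_line(line: str) -> dict:
    """Split 'GO:id; A:term; EVIDENCE:source.' into its parts."""
    go_id, aspect_term, evidence_source = line.strip().rstrip(".").split("; ")
    aspect, term = aspect_term.split(":", 1)
    evidence, source = evidence_source.split(":", 1)
    return {
        "id": go_id,
        "aspect": ASPECTS[aspect],
        "term": term,
        "evidence": evidence,
        "source": source,
    }


record = parse_go_line(" GO:0006355; P:regulation of transcription, DNA-dependent; IEA:UniProtKB-KW.")
print(record)
```

Splitting on "; " rather than on commas keeps term names such as "regulation of transcription, DNA-dependent" intact.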
Interpro
 IPR001159; Ds-RNA-bd.
 IPR014720; dsRNA-bd-like_dom.
 IPR000467; G_patch_dom. 
Pfam
 PF01585; G-patch 
SMART
 SM00443; G_patch 
PROSITE
 PS50137; DS_RBD
 PS50174; G_PATCH 
PRINTS