CPLM 1.0 - Compendium of Protein Lysine Modification
Tag Content
CPLM ID CPLM-017288
UniProt Accession
Genbank Protein ID
Genbank Nucleotide ID
Protein Name
 pre-mRNA 3' end processing protein WDR33 
Protein Synonyms/Alias
 WD repeat-containing protein 33; WD repeat-containing protein WDC146 
Gene Name
 Wdr33 
Gene Synonyms/Alias
 Wdc146 
Created Date
 July 27, 2013 
Organism
 Mus musculus (Mouse) 
NCBI Taxa ID
 10090 
Lysine Modification
Position  Peptide          Type            References
46        QQLTFDGKRMRKAVN  ubiquitination  [1]
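The peptide listed for position 46 appears to be the 15-residue window centred on the modified lysine (seven residues on each side). A minimal Python sketch of that extraction follows. The flank width of 7 is inferred from this one record rather than from any CPLM documentation, and the names `site_window` and `N_TERMINAL_60` are hypothetical.

```python
# First 60 residues of the protein sequence given in this record (spaces removed).
N_TERMINAL_60 = (
    "MATEIGSPPR" "FFHMPRFQHQ" "APRQLFYKRP"
    "DFAQQQAMQQ" "LTFDGKRMRK" "AVNRKTIDYN"
)


def site_window(sequence: str, position: int, flank: int = 7) -> str:
    """Return the residues within `flank` of a 1-based position.

    The window is clipped at the sequence ends, so sites near a terminus
    yield a shorter peptide. flank=7 is an assumption inferred from the
    15-residue peptide shown in this record.
    """
    index = position - 1  # convert to 0-based indexing
    start = max(0, index - flank)
    end = min(len(sequence), index + flank + 1)
    return sequence[start:end]


window = site_window(N_TERMINAL_60, 46)
print(window)  # QQLTFDGKRMRKAVN
```

The central residue of the returned window is the lysine at position 46, which matches the peptide column above.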
Reference
 [1] Proteomic analyses reveal divergent ubiquitylation site patterns in murine tissues.
 Wagner SA, Beli P, Weinert BT, Schölz C, Kelstrup CD, Young C, Nielsen ML, Olsen JV, Brakebusch C, Choudhary C.
 Mol Cell Proteomics. 2012 Dec;11(12):1578-85. [PMID: 22790023]
Functional Description
 Essential for both cleavage and polyadenylation of pre-mRNA 3' ends (By similarity). 
Sequence Annotation
 REPEAT 117 156 WD 1.
 REPEAT 159 198 WD 2.
 REPEAT 200 239 WD 3.
 REPEAT 242 283 WD 4.
 REPEAT 286 325 WD 5.
 REPEAT 329 369 WD 6.
 REPEAT 373 412 WD 7.
 DOMAIN 617 769 Collagen-like.
 MOD_RES 2 2 N-acetylalanine (By similarity).
 MOD_RES 7 7 Phosphoserine (By similarity).
 MOD_RES 46 46 N6-acetyllysine (By similarity).
 MOD_RES 1204 1204 Phosphoserine (By similarity).  
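Each annotation line above follows the pattern "KEY start end description.", so the list can be read into structured records. The Python sketch below assumes that whitespace-separated layout. The `Feature` type and `parse_feature` helper are hypothetical names for illustration, not part of any CPLM or UniProt tool.

```python
from typing import NamedTuple

# Annotation lines copied from this record.
ANNOTATION = """
REPEAT 117 156 WD 1.
REPEAT 159 198 WD 2.
REPEAT 200 239 WD 3.
REPEAT 242 283 WD 4.
REPEAT 286 325 WD 5.
REPEAT 329 369 WD 6.
REPEAT 373 412 WD 7.
DOMAIN 617 769 Collagen-like.
MOD_RES 2 2 N-acetylalanine (By similarity).
MOD_RES 7 7 Phosphoserine (By similarity).
MOD_RES 46 46 N6-acetyllysine (By similarity).
MOD_RES 1204 1204 Phosphoserine (By similarity).
"""


class Feature(NamedTuple):
    key: str    # feature type, e.g. REPEAT, DOMAIN, MOD_RES
    start: int  # 1-based first residue
    end: int    # 1-based last residue (inclusive)
    note: str   # free-text description without the trailing period

    @property
    def length(self) -> int:
        return self.end - self.start + 1


def parse_feature(line: str) -> Feature:
    """Split one annotation line into key, start, end and description."""
    key, start, end, note = line.split(maxsplit=3)
    return Feature(key, int(start), int(end), note.rstrip(". "))


features = [parse_feature(ln) for ln in ANNOTATION.strip().splitlines()]
wd_repeats = [f for f in features if f.key == "REPEAT"]
collagen = next(f for f in features if f.key == "DOMAIN")
print(len(wd_repeats), collagen.length)  # 7 153
```

Note that position 46 is annotated here as N6-acetyllysine (by similarity), the same residue reported as ubiquitinated in the Lysine Modification table.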
Keyword
 Acetylation; Collagen; Complete proteome; mRNA processing; Nucleus; Phosphoprotein; Reference proteome; Repeat; WD repeat. 
Sequence Source
 UniProt (SWISSPROT/TrEMBL); GenBank; EMBL 
Protein Length
 1330 AA 
Protein Sequence
MATEIGSPPR FFHMPRFQHQ APRQLFYKRP DFAQQQAMQQ LTFDGKRMRK AVNRKTIDYN 60
PSVIKYLENR IWQRDQRDMR AIQPDAGYYN DLVPPIGMLN NPMNAVTTKF VRTSTNKVKC 120
PVFVVRWTPE GRRLVTGASS GEFTLWNGLT FNFETILQAH DSPVRAMTWS HNDMWMLTAD 180
HGGYVKYWQS NMNNVKMFQA HKEAIREASF SPTDNKFATC SDDGTVRIWD FLRCHEERIL 240
RGHGADVKCV DWHPTKGLVV SGSKDSQQPI KFWDPKTGQS LATLHAHKNT VMEVKLNLNG 300
NWLLTASRDH LCKLFDIRNL KEELQVFRGH KKEATAVAWH PVHEGLFASG GSDGSLLFWH 360
VGVEKEVGGM EMAHEGMIWS LAWHPLGHIL CSGSNDHTSK FWTRNRPGDK MRDRYNLNLL 420
PGMSEDGVEY DDLEPNSLAV IPGMGIPEQL KLAMEQEQMG KDESSEIEMT IPGLDWGMEE 480
VMQKDQKKVP QKKVPYAKPI PAQFQQAWMQ NKVPIPAPNE VLNDRKEDIK LEEKKKTQAE 540
IEQEMATLQY TNPQLLEQLK IERLAQKQAD QIQPPPSSGT PLLGPQPFSG QGPISQIPQG 600
FQQPHPSQQM PLVPQMGPPG PQGQFRAPGP QGQMGPQGPP MHQGGGGPQG FMGPQGPQGP 660
PQGLPRPQDM HGPQGMQRHP GPHGPLGPQG PPGPQGSSGP QGHMGPQGPP GPQGHIGPQG 720
PPASQGHMGP QGPPGTQGMQ GPPGPRGMQG PPHPHGIQGG PASQGIQGPL MGLNPRGMQG 780
PPGPRENQGP APQGLMIGHP PQEMRGPHPP SGLLGHGPQE MRGPQEMRGM QGPPPQGSML 840
GPPQELRGPS GSQGQQGPPQ GSLGPPPQGG MQGPPGPQGQ QNPARGPHPS QGPIPFQQQK 900
APLLGDGPRA PFNQEGQSTG PPPLIPGLGQ QGAQGRIPPL NPGQGPGPNK GDTRGPPNHH 960
LGPMSERRHE QSGGPEHGPD RGPFRGGQDC RGPPDRRGSH PDFPDDFRPD DFHPDKRFGH 1020
RLREFEGRGG PLPQEEKWRR GGPGPPFPPD HREFNEGDGR GAARGPPGAW EGRRPGDDRF 1080
PRDPDDPRFR GRREESFRRG APPRHEGRAP PRGRDNFPGP DDFGPEEGFD ASDEAARGRD 1140
LRGRGRGTPR GGSRKCLLPT PDEFPRFEGG RKPDSWDGNR EPGPGHEHFR DAPRPDHPPH 1200
DGHSPASRER SSSLQGMDMA SLPPRKRPWH DGSGTSEHRE MEAQGGPSED RGSKGRGGPG 1260
PSQRVPKSGR SSSLDGDHHD GYHRDEPFGG PPGSSSSSRG ARSGSNWGRG SNMNSGPPRR 1320
GTSRGSGRGR 1330 
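The sequence block above is laid out in groups of ten residues, with a running residue count at the end of each line. A short Python sketch for turning that layout back into a plain sequence string, checking each line against its count, is given below. It uses only the first two lines of the block as sample input, and `parse_sequence_block` is a hypothetical helper name.

```python
# First two lines of the sequence block from this record, used as sample input.
SAMPLE_BLOCK = """
MATEIGSPPR FFHMPRFQHQ APRQLFYKRP DFAQQQAMQQ LTFDGKRMRK AVNRKTIDYN 60
PSVIKYLENR IWQRDQRDMR AIQPDAGYYN DLVPPIGMLN NPMNAVTTKF VRTSTNKVKC 120
"""


def parse_sequence_block(text: str) -> str:
    """Join the residue groups and verify each line's trailing running count."""
    residues = []
    total = 0
    for line in text.strip().splitlines():
        # The last token on a line is the cumulative residue count.
        *groups, count = line.split()
        residues.extend(groups)
        total += sum(len(group) for group in groups)
        if total != int(count):
            raise ValueError(f"expected {count} residues so far, found {total}")
    return "".join(residues)


sequence = parse_sequence_block(SAMPLE_BLOCK)
print(len(sequence))  # 120
print(sequence[45])   # K (residue 46, the modified lysine)
```

Applied to the full block, the same check should end at the final count of 1330, which agrees with the Protein Length field.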
Gene Ontology
 GO:0005581; C:collagen; IEA:UniProtKB-KW.
 GO:0005634; C:nucleus; IDA:MGI.
 GO:0006397; P:mRNA processing; IEA:UniProtKB-KW. 
Interpro
 IPR008160; Collagen.
 IPR015943; WD40/YVTN_repeat-like_dom.
 IPR001680; WD40_repeat.
 IPR017986; WD40_repeat_dom. 
Pfam
 PF01391; Collagen
 PF00400; WD40 
SMART
 SM00320; WD40 
PROSITE
 PS00678; WD_REPEATS_1
 PS50082; WD_REPEATS_2
 PS50294; WD_REPEATS_REGION 
PRINTS