CPLM 1.0 - Compendium of Protein Lysine Modification
Tag	Content
CPLM ID CPLM-005708
UniProt Accession
Genbank Protein ID
Genbank Nucleotide ID
Protein Name
 Murinoglobulin-1 
Protein Synonyms/Alias
 MuG1 
Gene Name
 Mug1 
Gene Synonyms/Alias
 Mug-1 
Created Date
 July 27, 2013 
Organism
 Mus musculus (Mouse) 
NCBI Taxa ID
 10090 
Lysine Modification
Position  Peptide          Type            References
71        LVSQSGRKNLFDELV  ubiquitination  [1]
171       LAYIEDPKKNRIMQW  ubiquitination  [1]
188       IKTENGLKQMSFSLA  ubiquitination  [1]
204       EPIQGPYKIVVHKES  ubiquitination  [1]
339       KIERITNKLIFLKAD  ubiquitination  [1]
344       TNKLIFLKADSHFRH  ubiquitination  [1]
365       KVRLVDIKGDPIPNE  ubiquitination  [1]
373       GDPIPNEKVFIKAQE  ubiquitination  [1]
644       YSSWLAEKHTNLVPH  ubiquitination  [1]
655       LVPHGTEKDVYRYVE  ubiquitination  [1]
943       VLPPTVVKDSARAHF  ubiquitination  [1]
1001      ETQQLTQKIKTKALG  ubiquitination  [1]
1005      LTQKIKTKALGFLRA  ubiquitination  [1]
1023      RELNYKHKDGSYSAF  ubiquitination  [1]
1077      TWLSQKQKDNGCFRS  ubiquitination  [1]
1135      SCLESSWKTIEQERN  ubiquitination  [1]
1170      NKRDEILKSLDEEAI  ubiquitination  [1]
1178      SLDEEAIKENNSIHW  ubiquitination  [1]
1265      VALDALSKYGAVTFS  ubiquitination  [1]
1395      LSGFIPLKPTVKKLE  ubiquitination  [1]
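Each modification row carries four fields: the lysine position, a 15-residue peptide window, the modification type, and a reference number. The following is a minimal Python sketch for splitting such rows into fields, assuming that layout. It is not part of CPLM; the names `ROW`, `parse_site`, and `centre_is_lysine` are hypothetical, and the pattern tolerates rows with or without whitespace between fields.

```python
import re

# Assumed row layout: position, peptide (uppercase residues), modification
# type (lowercase word), and a bracketed reference number.
# Whitespace between fields is optional so run-together rows also parse.
ROW = re.compile(r"^\s*(\d+)\s*([A-Z]+)\s*([a-z]+)\s*\[(\d+)\]\s*$")

def parse_site(line):
    """Split one modification row into a dict; return None if it does not match."""
    m = ROW.match(line)
    if m is None:
        return None
    position, peptide, mod_type, ref = m.groups()
    return {
        "position": int(position),
        "peptide": peptide,
        "type": mod_type,
        "reference": int(ref),
    }

def centre_is_lysine(site):
    """True when the middle residue of the peptide window is K."""
    peptide = site["peptide"]
    return peptide[len(peptide) // 2] == "K"

# Three rows taken from the table above, in both spaced and unspaced form.
rows = [
    "71LVSQSGRKNLFDELVubiquitination[1]",
    "171 LAYIEDPKKNRIMQW ubiquitination [1]",
    "1395LSGFIPLKPTVKKLEubiquitination[1]",
]
sites = [parse_site(r) for r in rows]
for site in sites:
    print(site["position"], site["peptide"], centre_is_lysine(site))
```

The centre-residue check is a cheap sanity test: in a 15-residue window the modified lysine is expected at the eighth residue, so a row that fails it was probably split at the wrong place.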
Reference
 [1] Proteomic analyses reveal divergent ubiquitylation site patterns in murine tissues.
 Wagner SA, Beli P, Weinert BT, Schölz C, Kelstrup CD, Young C, Nielsen ML, Olsen JV, Brakebusch C, Choudhary C.
 Mol Cell Proteomics. 2012 Dec;11(12):1578-85. [PMID: 22790023]
Functional Description
 A proteinase activates the inhibitor by specific proteolysis in the bait region, which, by an unknown mechanism, leads to reaction at the cysteinyl-glutamyl internal thiol ester site and to a conformational change, whereby the proteinase is trapped and/or covalently bound to the inhibitor. While in the tetrameric proteinase inhibitors steric inhibition is sufficiently strong, monomeric forms need a covalent linkage between the activated glutamyl residue of the original thiol ester and a terminal amino group of a lysine or another nucleophilic group on the proteinase for inhibition to be effective. 
Sequence Annotation
 REGION 677 734 Bait region.
 CARBOHYD 55 55 N-linked (GlcNAc...) (Potential).
 CARBOHYD 294 294 N-linked (GlcNAc...) (Potential).
 CARBOHYD 313 313 N-linked (GlcNAc...).
 CARBOHYD 500 500 N-linked (GlcNAc...) (Potential).
 CARBOHYD 749 749 N-linked (GlcNAc...) (Potential).
 CARBOHYD 776 776 N-linked (GlcNAc...).
 CARBOHYD 871 871 N-linked (GlcNAc...).
 CARBOHYD 993 993 N-linked (GlcNAc...).
 CARBOHYD 1142 1142 N-linked (GlcNAc...).
 CARBOHYD 1180 1180 N-linked (GlcNAc...) (Potential).
 CARBOHYD 1426 1426 N-linked (GlcNAc...) (Potential).
 DISULFID 48 86 By similarity.
 DISULFID 251 276 By similarity.
 DISULFID 269 288 By similarity.
 DISULFID 461 555 By similarity.
 DISULFID 587 773 By similarity.
 DISULFID 634 680 By similarity.
 DISULFID 849 885 By similarity.
 DISULFID 923 1323 By similarity.
 DISULFID 1081 1129 By similarity.
 DISULFID 1354 1469 By similarity.
 CROSSLNK 974 977 Isoglutamyl cysteine thioester (Cys-Gln)  
Keyword
 Bait region; Complete proteome; Direct protein sequencing; Disulfide bond; Glycoprotein; Protease inhibitor; Reference proteome; Secreted; Serine protease inhibitor; Signal; Thioester bond. 
Sequence Source
 UniProt (SWISSPROT/TrEMBL); GenBank; EMBL 
Protein Length
 1476 AA 
Protein Sequence
MWKSRRAQLC LFSVLLAFLH SASLLNGDSK YMVLVPSQLY TETPEKICLH LYQLNETVTV 60
TASLVSQSGR KNLFDELVLD KDLFQCVSFI IPRLSSSDEE DFLYVDIKGP THEFSKRKAV 120
LVKNKESVVF VQTDKPVYKP GQSVKFRVVS MDKMLRPLNE LLPLAYIEDP KKNRIMQWRD 180
IKTENGLKQM SFSLAAEPIQ GPYKIVVHKE SGEKEEHSFT VMEFVLPRFN VDLKVPNAMS 240
VNDEVLSVTA CGKYTYGKPV PGHVKINVCR ETETGCREVN SQLDNNGCST QEVNITELQS 300
KKRNYEVQLF HVNATVTEEG TGLEFSRSGT TKIERITNKL IFLKADSHFR HGIPFFVKVR 360
LVDIKGDPIP NEKVFIKAQE LSYTSATTTD QHGLAEFSID TTCISGSSLH IKVNHKEEDS 420
CSYFYCMEER HASAKHVAYA VYSLSKSYIY LDTETSSILP CNQIHTVQAH FILKGDLGVL 480
KELIFYYLVM AQGSIIQTGN HTHQVEPGEA PVKGKFALEI PVEFSMVPMA KMLIYTILPD 540
GEVIADSVNF EIEKCLRNKV DLRFSTSQSL PASQTRLQVT ASPQSLCGLR AVDQSVLLLK 600
PESELSPSWI YNLPGMQQNK FVPSSRLSED QEDCILYSSW LAEKHTNLVP HGTEKDVYRY 660
VEDMGLTAFT NLMIKLPIIC FDYGMVPISA PRVEFDLAFT PEISWSLRTT LSKRPEEPPR 720
KDPSSNDPLT ETIRKYFPET WVWDIVTVNS TGLAEVEMTV PDTITEWKAG ALCLSNDTGL 780
GLSSVVPLQA FKPFFVEVSL PYSVVRGEAF MLKATVMNYL PTSMQMSVQL EASPDFTAVP 840
VGDDQDSYCL SANGRHTSSW LVTPKSLGNV NFSVSAEAQQ SSEPCGSEVA TVPETGRKDT 900
VVKVLIVEPE GIKQEHTFSS LFCASDAEIS EKMSLVLPPT VVKDSARAHF SVMGDILSSA 960
IRNTQNLLHM PYGCGEQNMV LFAPNIYVLK YLNETQQLTQ KIKTKALGFL RAGYQRELNY 1020
KHKDGSYSAF GDQNGEREGN TWLTAFVLKS FAQARAFIFI DESHITHAFT WLSQKQKDNG 1080
CFRSSGSLFN NAMKGGVDDE MTLSAYITMA LLESSLPATH PVVSKALSCL ESSWKTIEQE 1140
RNASFVYTKA LMAYAFALAG NQNKRDEILK SLDEEAIKEN NSIHWKRPQK SRKSEHHLYK 1200
PQASSAEVEM NAYVVLARLT AQPAPSPEDL TLSMSTIMWL TKQQNSNGGF SSTQDTVVAL 1260
DALSKYGAVT FSRSQKTTLV TIQSTGSFSQ KFQVENSNRL LLQQVALPDI PGDYTISVSG 1320
EGCVYAQTML RYNMHLEKQL SAFAIWVQTV PLTCNNPKGH NSFQISLEIS YTGSRPASNM 1380
VIADVKMLSG FIPLKPTVKK LERLEHVSRT EVSNNNVLIY LDQVTNQTLA FSFIIQQDIP 1440
VRNLQPAIVK VYDYYETDEM AFAEYSSPCS TDKQNV 1476 
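The sequence block is printed in groups of ten residues with a running residue count at the end of each line, so it has to be cleaned before positions can be looked up. The sketch below, in Python, strips the spacing and counters and then cuts the 15-residue window around a 1-based position. The helper names `clean_sequence` and `window` are hypothetical, and only the first two lines of the block (residues 1-120) are included to keep the example short. Windows near either terminus are simply truncated here, which may differ from how the database pads them.

```python
def clean_sequence(block):
    """Drop spaces, digits and line breaks from a numbered sequence block."""
    return "".join(ch for ch in block if ch.isalpha())

def window(sequence, position, flank=7):
    """Return residues position-flank .. position+flank (1-based, inclusive)."""
    start = max(position - flank - 1, 0)
    return sequence[start:position + flank]

# First two lines of the sequence block in this entry (residues 1-120).
block = """
MWKSRRAQLC LFSVLLAFLH SASLLNGDSK YMVLVPSQLY TETPEKICLH LYQLNETVTV 60
TASLVSQSGR KNLFDELVLD KDLFQCVSFI IPRLSSSDEE DFLYVDIKGP THEFSKRKAV 120
"""
seq = clean_sequence(block)

print(len(seq))         # 120
print(seq[71 - 1])      # K
print(window(seq, 71))  # LVSQSGRKNLFDELV
```

The window printed for position 71 matches the peptide listed for that site under Lysine Modification, which is a quick way to confirm that the position numbering and the sequence agree.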
Gene Ontology
 GO:0005615; C:extracellular space; IEA:InterPro.
 GO:0004867; F:serine-type endopeptidase inhibitor activity; IEA:UniProtKB-KW.
 GO:0010951; P:negative regulation of endopeptidase activity; IEA:GOC. 
Interpro
 IPR009048; A-macroglobulin_rcpt-bd.
 IPR011626; A2M_comp.
 IPR002890; A2M_N.
 IPR011625; A2M_N_2.
 IPR001599; Macroglobln_a2.
 IPR019742; MacrogloblnA2_CS.
 IPR019565; MacrogloblnA2_thiol-ester-bond.
 IPR008930; Terpenoid_cyclase/PrenylTrfase.
 IPR010916; TonB_box_CS. 
Pfam
 PF00207; A2M
 PF07678; A2M_comp
 PF01835; A2M_N
 PF07703; A2M_N_2
 PF07677; A2M_recep
 PF10569; Thiol-ester_cl 
SMART
  
PROSITE
 PS00477; ALPHA_2_MACROGLOBULIN 
PRINTS