CPLM 1.0 - Compendium of Protein Lysine Modification
Tag Content
CPLM ID CPLM-014445
UniProt Accession
Genbank Protein ID
 U07615 
Genbank Nucleotide ID
Protein Name
 Mucin-2 
Protein Synonyms/Alias
 MUC-2; Intestinal mucin-2 
Gene Name
 Muc2 
Gene Synonyms/Alias
  
Created Date
 July 27, 2013 
Organism
 Rattus norvegicus (Rat) 
NCBI Taxa ID
 10116 
Lysine Modification
Position  Peptide          Type         References
451       VVLLTDNKKNVVAFK  acetylation  [1]
Reference
 [1] Proteomic analysis of lysine acetylation sites in rat tissues reveals organ specificity and subcellular patterns.
 Lundby A, Lage K, Weinert BT, Bekker-Jensen DB, Secher A, Skovgaard T, Kelstrup CD, Dmytriyev A, Choudhary C, Lundby C, Olsen JV.
 Cell Rep. 2012 Aug 30;2(2):419-31. [PMID: 22902405]
Functional Description
 Coats the epithelia of the intestines, airways, and other mucous membrane-containing organs. Thought to provide a protective, lubricating barrier against particles and infectious agents at mucosal surfaces. Major constituent of both the inner and outer mucus layers of the colon and may play a role in excluding bacteria from the inner mucus layer (By similarity). 
Sequence Annotation
 DOMAIN 33 237 VWFD 1.
 DOMAIN 292 348 TIL.
 DOMAIN 350 410 VWFC.
 DOMAIN 387 601 VWFD 2.
 DOMAIN 857 1062 VWFD 3.
 REPEAT 1392 1407 1.
 REPEAT 1408 1423 2.
 REPEAT 1424 1434 3.
 REPEAT 1435 1445 4.
 REPEAT 1446 1456 5.
 REPEAT 1457 1467 6.
 REPEAT 1468 1478 7.
 REPEAT 1479 1489 8.
 REPEAT 1490 1500 9.
 REPEAT 1501 1511 10.
 REPEAT 1512 >1513 11.
 REGION 1392 >1513 Approximate repeats.
 CARBOHYD 160 160 N-linked (GlcNAc...) (Potential).
 CARBOHYD 420 420 N-linked (GlcNAc...) (Potential).
 CARBOHYD 667 667 N-linked (GlcNAc...) (Potential).
 CARBOHYD 767 767 N-linked (GlcNAc...) (Potential).
 CARBOHYD 837 837 N-linked (GlcNAc...) (Potential).
 CARBOHYD 892 892 N-linked (GlcNAc...) (Potential).
 CARBOHYD 1136 1136 N-linked (GlcNAc...) (Potential).
 CARBOHYD 1151 1151 N-linked (GlcNAc...) (Potential).
 CARBOHYD 1212 1212 N-linked (GlcNAc...) (Potential).
 CARBOHYD 1227 1227 N-linked (GlcNAc...) (Potential).
 CARBOHYD 1243 1243 N-linked (GlcNAc...) (Potential).
 CARBOHYD 1350 1350 N-linked (GlcNAc...) (Potential).  
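The annotation lines above share one layout: a feature key, a start position, an end position, and a free-text description, with `>` marking an end that extends beyond the known sequence. A minimal sketch of reading such a line into a record follows. The function name `parse_feature` and the dictionary field names are hypothetical and are not part of CPLM or UniProt.

```python
def parse_feature(line):
    """Split one annotation line into key, positions and description.

    Assumes the layout used in this record: KEY START END DESCRIPTION.
    A leading '>' on the end position marks a fragment boundary.
    """
    key, start, end, *rest = line.split()
    return {
        "key": key,
        "start": int(start),
        "end": int(end.lstrip(">")),
        "open_ended": end.startswith(">"),
        # Drop only the full stop that terminates the line.
        "description": " ".join(rest).rstrip("."),
    }


domain = parse_feature(" DOMAIN 33 237 VWFD 1.")
last_repeat = parse_feature(" REPEAT 1512 >1513 11.")
glyco = parse_feature(" CARBOHYD 160 160 N-linked (GlcNAc...) (Potential).")
```

Keeping the `>` flag separate from the integer end position lets the positions be compared numerically while still recording that the repeat region is incomplete.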
Keyword
 Autocatalytic cleavage; Complete proteome; Direct protein sequencing; Disulfide bond; Glycoprotein; Reference proteome; Repeat; Secreted; Signal. 
Sequence Source
 UniProt (SWISSPROT/TrEMBL); GenBank; EMBL 
Protein Length
 1513 AA 
Protein Sequence
MGLPLARLVA VCLVLALAKG LELQKEARSR NHVCSTWGDF HYKTFDGDVF RFPGLCDYNF 60
ASDCRDSYKE FAVHLKRGLD KAGGHSSIES VLITIKDDTI YLTHKLAVVN GAMVSTPHYS 120
SGLLIEKNDA YTKVYSRAGL SLMWNREDAL MVELDGRFQN HTCGLCGDFN GMQANNEFLS 180
DGIRFSAIEF GNMQKINKPE VVCEDPEEVQ EPESCSEHRA ECERLLTSTA FEDCQARVPV 240
ELYVLACMHD RCQCPQGGAC ECSTLAEFSR QCSHAGGRPE NWRTASLCPK KCPGNMVYLE 300
SGSPWLDTCS HLEVSSLCEE HYMDGCFCPE GTVYDDITGS GCIPVSQCHC KLHGHLYMPG 360
QEITNDCEQC VCNAGRWMCK DLPCPETCAL EGGSHITTFD GKKFTFHGDC YYVLTKTKYN 420
DSYALLGELA SCGSTDKQTC LKTVVLLTDN KKNVVAFKSG GSVLLNEMEV SLPHVAASFS 480
IFKPSSYHIV VNTMFGLRLQ IQLVPVMQLF VTLDQSAQGQ VQGLCGNFNG LESDDFMTSG 540
GMVEATGAGF ANTWKAQSSC HDKLDWLDDP CPLNIESANY AEHWCSLLKR SETPFARCHL 600
AVDPTEYYKR CKYDTCNCQN NEDCMCAALS SYARACAAKG VMLWGWRESV CNKDVHACPS 660
SQIFMYNLTT CQQTCRSISE GDTHCLKGFA PVEGCGCPDH TFMDEKGRCV PLSKCSCYHH 720
GLYLEAGDVI LRQEERCICR NGRLQCTQVK LIGHTCLSPQ ILVDCNNLTA LAIREPRPTS 780
CQTLVARYYH TECISGCVCP DGLLDNGRGG CVVEDECPCI HNKQFYDSGK SIKLDCNNTC 840
TCQKGRWECT RYACHSTCSI YGSGHYITFD GKHYDFDGHC SYVAVQDYCG QNSTGSFSII 900
TENVPCGTTG VTCSKAIKIF IGGTELKLVD KHRVVKQLEE GHHVPFITRE VGLYLVVEVS 960
SGIIVIWDKK TTIFIKLDPS YKGNVCGLCG NFDDQTKNDF TTRDHMVVAS ELDFGNSWKE 1020
ASTCPDVSHN PDPCSLNPHR RSWAEKQCSI IKSDVFLACH GKVDPTVFYD ACVHDSCSCD 1080
TGGDCECFCS AVASYAQECT KAEACVFWRT PDLCPVFCDY YNPPDECEWH YEPCGNRSFE 1140
TCRTLNGIHS NISVSYLEGC YPRCPEDRPI YDEDLKKCVS GDKCGCYIED TRYPPGGSVP 1200
TDEICMSCTC TNTSEIICRP DEGKIINQTQ DGIFCYWETC GSNGTVEKHF EICVSSTLSP 1260
TSMTSFTTTS TPISTTPIST TITTTSATAT TTVPCCFWSD WINNNHPTSG NGGDRENFEH 1320
VCSAPENIEC RAATDPKLDW TELGQKVQCN VSEGLICNNE DQYGTGQFEL CYDYEIRVNC 1380
CFPMEYCLST VSPTTSTPIS STPQPTSSPT TLPTTSPLTS SATSPTTSHI TSTVSPTTSP 1440
TTSTTSPTTS PTTSTTSPTT STTSPTPSPT TSTTSPTPSP TTSTTSPTPS PTTSTTSPTT 1500
SPITSPTTST TSP 1513 
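The listing above prints the sequence in blocks of ten residues, with the position of the last residue at the end of each row. As a rough illustration of how the reported modification site maps onto this layout, the sketch below takes the row ending at 480, recovers its starting position, and extracts the residue at position 451 together with seven flanking residues on each side. The helper names `parse_row` and `window` and the flank width of 7 are assumptions made for this example.

```python
def parse_row(row):
    """Return (start_position, residues) for one numbered sequence row."""
    *blocks, end = row.split()
    residues = "".join(blocks)
    # The trailing number is the 1-based position of the row's last residue.
    start = int(end) - len(residues) + 1
    return start, residues


def window(row, position, flank=7):
    """Return the residue at a 1-based position and its flanking peptide."""
    start, residues = parse_row(row)
    i = position - start
    return residues[i], residues[max(0, i - flank): i + flank + 1]


# Row copied from the sequence listing (residues 421 to 480).
ROW = "DSYALLGELA SCGSTDKQTC LKTVVLLTDN KKNVVAFKSG GSVLLNEMEV SLPHVAASFS 480"
residue, peptide = window(ROW, 451)
```

Under these assumptions, `residue` is a lysine and `peptide` matches the 15-residue peptide listed for the acetylation site, so the position in the modification table is consistent with the sequence as printed.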
Gene Ontology
 GO:0005737; C:cytoplasm; IDA:RGD.
 GO:0070702; C:inner mucus layer; ISS:UniProtKB.
 GO:0005634; C:nucleus; IDA:RGD.
 GO:0070703; C:outer mucus layer; ISS:UniProtKB.
 GO:0046983; F:protein dimerization activity; IMP:RGD.
 GO:0071356; P:cellular response to tumor necrosis factor; IEP:RGD.
 GO:0009725; P:response to hormone stimulus; IEP:RGD.
 GO:0032496; P:response to lipopolysaccharide; IEP:RGD.
 GO:0033189; P:response to vitamin A; IEP:RGD. 
Interpro
 IPR002919; TIL_dom.
 IPR014853; Unchr_dom_Cys-rich.
 IPR001007; VWF_C.
 IPR001846; VWF_type-D.
 IPR025155; WxxW_domain. 
Pfam
 PF08742; C8
 PF13330; Mucin2_WxxW
 PF01826; TIL
 PF00094; VWD 
SMART
 SM00832; C8
 SM00214; VWC
 SM00216; VWD 
PROSITE
 PS01208; VWFC_1
 PS50184; VWFC_2
 PS51233; VWFD 
PRINTS