CPLM 1.0 - Compendium of Protein Lysine Modification
Tag Content
CPLM ID CPLM-004149
UniProt Accession
Genbank Protein ID
Genbank Nucleotide ID
Protein Name
 Versican core protein 
Protein Synonyms/Alias
 Chondroitin sulfate proteoglycan core protein 2; Chondroitin sulfate proteoglycan 2; Glial hyaluronate-binding protein; GHAP; Large fibroblast proteoglycan; PG-M 
Gene Name
 VCAN 
Gene Synonyms/Alias
 CSPG2 
Created Date
 July 27, 2013 
Organism
 Homo sapiens (Human) 
NCBI Taxa ID
 9606 
Lysine Modification
 Position  Peptide          Type            References
 98        IKIGQDYKGRVSVPT  ubiquitination  [1]
 2238      ESTKHFPKGMRPTIQ  methylation     [2]
 3312      AKTFGKMKPRYEINS  acetylation     [3]
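Each listed peptide appears to be a 15-residue window centered on the modified lysine (seven residues on either side). The sketch below checks this for position 98 only, using the first 120 residues copied from the Protein Sequence field of this record. The helper name `window_around` and the `flank` parameter are illustrative assumptions, not part of CPLM.

```python
# First 120 residues of this record's Protein Sequence field (two display
# lines), with the spacing and running counters removed.
SEQ_1_120 = (
    "MFINIKSILWMCSTLIVTHALHKVKVGKSPPVRGSLSGKVSLPCHFSTMPTLPPSYNTSE"
    "FLRIKWSKIEVDKNGKDLKETTVLVAQNGNIKIGQDYKGRVSVPTHPEAVGDASLTVVKL"
)


def window_around(seq, position, flank=7):
    """Return residues position-flank .. position+flank (1-based, inclusive).

    The window is clipped at the ends of the sequence. Hypothetical helper
    written for this example.
    """
    start = max(position - 1 - flank, 0)
    end = min(position + flank, len(seq))
    return seq[start:end]


peptide = window_around(SEQ_1_120, 98)
print(peptide)            # peptide window around position 98
print(SEQ_1_120[98 - 1])  # residue at position 98
```

The same check applies to positions 2238 and 3312 once the full 3396-residue sequence is loaded.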
Reference
 [1] Systematic and quantitative assessment of the ubiquitin-modified proteome.
 Kim W, Bennett EJ, Huttlin EL, Guo A, Li J, Possemato A, Sowa ME, Rad R, Rush J, Comb MJ, Harper JW, Gygi SP.
 Mol Cell. 2011 Oct 21;44(2):325-40. [PMID: 21906983]
 [2] Large-scale global identification of protein lysine methylation in vivo.
 Cao XJ, Arnaudo AM, Garcia BA.
 Epigenetics. 2013 May 1;8(5):477-85. [PMID: 23644510]
 [3] Regulation of cellular metabolism by protein lysine acetylation.
 Zhao S, Xu W, Jiang W, Yu W, Lin Y, Zhang T, Yao J, Zhou L, Zeng Y, Li H, Li Y, Shi J, An W, Hancock SM, He F, Qin L, Chin J, Yang P, Chen X, Lei Q, Xiong Y, Guan KL.
 Science. 2010 Feb 19;327(5968):1000-4. [PMID: 20167786]
Functional Description
 May play a role in intercellular signaling and in connecting cells with the extracellular matrix. May take part in the regulation of cell motility, growth and differentiation. Binds hyaluronic acid. 
Sequence Annotation
 DOMAIN 21 146 Ig-like V-type.
 DOMAIN 150 245 Link 1.
 DOMAIN 251 347 Link 2.
 DOMAIN 3089 3125 EGF-like 1.
 DOMAIN 3127 3163 EGF-like 2; calcium-binding (Potential).
 DOMAIN 3176 3290 C-type lectin.
 DOMAIN 3294 3354 Sushi.
 REGION 348 1335 GAG-alpha (glucosaminoglycan attachment domain).
 REGION 1336 3089 GAG-beta.
 CARBOHYD 57 57 N-linked (GlcNAc...) (Potential).
 CARBOHYD 330 330 N-linked (GlcNAc...).
 CARBOHYD 615 615 N-linked (GlcNAc...) (Potential).
 CARBOHYD 782 782 N-linked (GlcNAc...) (Potential).
 CARBOHYD 809 809 N-linked (GlcNAc...) (Potential).
 CARBOHYD 1332 1332 N-linked (GlcNAc...) (Potential).
 CARBOHYD 1398 1398 N-linked (GlcNAc...) (Potential).
 CARBOHYD 1442 1442 N-linked (GlcNAc...) (Potential).
 CARBOHYD 1468 1468 N-linked (GlcNAc...) (Potential).
 CARBOHYD 1663 1663 N-linked (GlcNAc...) (Potential).
 CARBOHYD 1898 1898 N-linked (GlcNAc...) (Potential).
 CARBOHYD 2179 2179 N-linked (GlcNAc...) (Potential).
 CARBOHYD 2272 2272 N-linked (GlcNAc...) (Potential).
 CARBOHYD 2280 2280 N-linked (GlcNAc...) (Potential).
 CARBOHYD 2360 2360 N-linked (GlcNAc...) (Potential).
 CARBOHYD 2385 2385 N-linked (GlcNAc...) (Potential).
 CARBOHYD 2392 2392 N-linked (GlcNAc...) (Potential).
 CARBOHYD 2496 2496 N-linked (GlcNAc...) (Potential).
 CARBOHYD 2628 2628 N-linked (GlcNAc...) (Potential).
 CARBOHYD 2934 2934 N-linked (GlcNAc...) (Potential).
 CARBOHYD 3067 3067 N-linked (GlcNAc...) (Potential).
 CARBOHYD 3369 3369 N-linked (GlcNAc...) (Potential).
 CARBOHYD 3379 3379 N-linked (GlcNAc...) (Potential).
 DISULFID 44 130 By similarity.
 DISULFID 172 243 By similarity.
 DISULFID 196 217 By similarity.
 DISULFID 270 345 By similarity.
 DISULFID 294 315 By similarity.
 DISULFID 3093 3104 By similarity.
 DISULFID 3098 3113 By similarity.
 DISULFID 3115 3124 By similarity.
 DISULFID 3131 3142 By similarity.
 DISULFID 3136 3151 By similarity.
 DISULFID 3153 3162 By similarity.
 DISULFID 3169 3180 By similarity.
 DISULFID 3197 3289 By similarity.
 DISULFID 3265 3281 By similarity.
 DISULFID 3296 3339 By similarity.
 DISULFID 3325 3352 By similarity.  
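The DOMAIN and REGION coordinates above can be used to locate each modified lysine within the protein. The following is a minimal sketch, assuming the feature ranges are 1-based and inclusive as listed. The names `FEATURES`, `MODIFIED_LYSINES` and `features_at` are hypothetical and used only for this example.

```python
# DOMAIN and REGION entries copied from the Sequence Annotation field
# (start, end, name); coordinates are treated as 1-based and inclusive.
FEATURES = [
    (21, 146, "Ig-like V-type"),
    (150, 245, "Link 1"),
    (251, 347, "Link 2"),
    (348, 1335, "GAG-alpha"),
    (1336, 3089, "GAG-beta"),
    (3089, 3125, "EGF-like 1"),
    (3127, 3163, "EGF-like 2"),
    (3176, 3290, "C-type lectin"),
    (3294, 3354, "Sushi"),
]

# Modified positions and types from the Lysine Modification field.
MODIFIED_LYSINES = {
    98: "ubiquitination",
    2238: "methylation",
    3312: "acetylation",
}


def features_at(position):
    """Return the names of all features whose range contains the position."""
    return [name for start, end, name in FEATURES if start <= position <= end]


for position, kind in sorted(MODIFIED_LYSINES.items()):
    print(position, kind, features_at(position))
```

A list is returned rather than a single name because the annotated ranges can overlap: residue 3089 is both the last position of GAG-beta and the first of EGF-like 1.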
Keyword
 Alternative splicing; Calcium; Cataract; Complete proteome; Direct protein sequencing; Disulfide bond; EGF-like domain; Extracellular matrix; Glycoprotein; Hyaluronic acid; Immunoglobulin domain; Lectin; Phosphoprotein; Polymorphism; Proteoglycan; Reference proteome; Repeat; Secreted; Signal; Sushi. 
Sequence Source
 UniProt (SWISSPROT/TrEMBL); GenBank; EMBL 
Protein Length
 3396 AA 
Protein Sequence
MFINIKSILW MCSTLIVTHA LHKVKVGKSP PVRGSLSGKV SLPCHFSTMP TLPPSYNTSE 60
FLRIKWSKIE VDKNGKDLKE TTVLVAQNGN IKIGQDYKGR VSVPTHPEAV GDASLTVVKL 120
LASDAGLYRC DVMYGIEDTQ DTVSLTVDGV VFHYRAATSR YTLNFEAAQK ACLDVGAVIA 180
TPEQLFAAYE DGFEQCDAGW LADQTVRYPI RAPRVGCYGD KMGKAGVRTY GFRSPQETYD 240
VYCYVDHLDG DVFHLTVPSK FTFEEAAKEC ENQDARLATV GELQAAWRNG FDQCDYGWLS 300
DASVRHPVTV ARAQCGGGLL GVRTLYRFEN QTGFPPPDSR FDAYCFKPKE ATTIDLSILA 360
ETASPSLSKE PQMVSDRTTP IIPLVDELPV IPTEFPPVGN IVSFEQKATV QPQAITDSLA 420
TKLPTPTGST KKPWDMDDYS PSASGPLGKL DISEIKEEVL QSTTGVSHYA TDSWDGVVED 480
KQTQESVTQI EQIEVGPLVT SMEILKHIPS KEFPVTETPL VTARMILESK TEKKMVSTVS 540
ELVTTGHYGF TLGEEDDEDR TLTVGSDEST LIFDQIPEVI TVSKTSEDTI HTHLEDLESV 600
SASTTVSPLI MPDNNGSSMD DWEERQTSGR ITEEFLGKYL STTPFPSQHR TEIELFPYSG 660
DKILVEGIST VIYPSLQTEM THRRERTETL IPEMRTDTYT DEIQEEITKS PFMGKTEEEV 720
FSGMKLSTSL SEPIHVTESS VEMTKSFDFP TLITKLSAEP TEVRDMEEDF TATPGTTKYD 780
ENITTVLLAH GTLSVEAATV SKWSWDEDNT TSKPLESTEP SASSKLPPAL LTTVGMNGKD 840
KDIPSFTEDG ADEFTLIPDS TQKQLEEVTD EDIAAHGKFT IRFQPTTSTG IAEKSTLRDS 900
TTEEKVPPIT STEGQVYATM EGSALGEVED VDLSKPVSTV PQFAHTSEVE GLAFVSYSST 960
QEPTTYVDSS HTIPLSVIPK TDWGVLVPSV PSEDEVLGEP SQDILVIDQT RLEATISPET 1020
MRTTKITEGT TQEEFPWKEQ TAEKPVPALS STAWTPKEAV TPLDEQEGDG SAYTVSEDEL 1080
LTGSERVPVL ETTPVGKIDH SVSYPPGAVT EHKVKTDEVV TLTPRIGPKV SLSPGPEQKY 1140
ETEGSSTTGF TSSLSPFSTH ITQLMEETTT EKTSLEDIDL GSGLFEKPKA TELIEFSTIK 1200
VTVPSDITTA FSSVDRLHTT SAFKPSSAIT KKPPLIDREP GEETTSDMVI IGESTSHVPP 1260
TTLEDIVAKE TETDIDREYF TTSSPPATQP TRPPTVEDKE AFGPQALSTP QPPASTKFHP 1320
DINVYIIEVR ENKTGRMSDL SVIGHPIDSE SKEDEPCSEE TDPVHDLMAE ILPEFPDIIE 1380
IDLYHSEENE EEEEECANAT DVTTTPSVQY INGKHLVTTV PKDPEAAEAR RGQFESVAPS 1440
QNFSDSSESD THPFVIAKTE LSTAVQPNES TETTESLEVT WKPETYPETS EHFSGGEPDV 1500
FPTVPFHEEF ESGTAKKGAE SVTERDTEVG HQAHEHTEPV SLFPEESSGE IAIDQESQKI 1560
AFARATEVTF GEEVEKSTSV TYTPTIVPSS ASAYVSEEEA VTLIGNPWPD DLLSTKESWV 1620
EATPRQVVEL SGSSSIPITE GSGEAEEDED TMFTMVTDLS QRNTTDTLIT LDTSRIITES 1680
FFEVPATTIY PVSEQPSAKV VPTKFVSETD TSEWISSTTV EEKKRKEEEG TTGTASTFEV 1740
YSSTQRSDQL ILPFELESPN VATSSDSGTR KSFMSLTTPT QSEREMTDST PVFTETNTLE 1800
NLGAQTTEHS SIHQPGVQEG LTTLPRSPAS VFMEQGSGEA AADPETTTVS SFSLNVEYAI 1860
QAEKEVAGTL SPHVETTFST EPTGLVLSTV MDRVVAENIT QTSREIVISE RLGEPNYGAE 1920
IRGFSTGFPL EEDFSGDFRE YSTVSHPIAK EETVMMEGSG DAAFRDTQTS PSTVPTSVHI 1980
SHISDSEGPS STMVSTSAFP WEEFTSSAEG SGEQLVTVSS SVVPVLPSAV QKFSGTASSI 2040
IDEGLGEVGT VNEIDRRSTI LPTAEVEGTK APVEKEEVKV SGTVSTNFPQ TIEPAKLWSR 2100
QEVNPVRQEI ESETTSEEQI QEEKSFESPQ NSPATEQTIF DSQTFTETEL KTTDYSVLTT 2160
KKTYSDDKEM KEEDTSLVNM STPDPDANGL ESYTTLPEAT EKSHFFLATA LVTESIPAEH 2220
VVTDSPIKKE ESTKHFPKGM RPTIQESDTE LLFSGLGSGE EVLPTLPTES VNFTEVEQIN 2280
NTLYPHTSQV ESTSSDKIED FNRMENVAKE VGPLVSQTDI FEGSGSVTST TLIEILSDTG 2340
AEGPTVAPLP FSTDIGHPQN QTVRWAEEIQ TSRPQTITEQ DSNKNSSTAE INETTTSSTD 2400
FLARAYGFEM AKEFVTSAPK PSDLYYEPSG EGSGEVDIVD SFHTSATTQA TRQESSTTFV 2460
SDGSLEKHPE VPSAKAVTAD GFPTVSVMLP LHSEQNKSSP DPTSTLSNTV SYERSTDGSF 2520
QDRFREFEDS TLKPNRKKPT ENIIIDLDKE DKDLILTITE STILEILPEL TSDKNTIIDI 2580
DHTKPVYEDI LGMQTDIDTE VPSEPHDSND ESNDDSTQVQ EIYEAAVNLS LTEETFEGSA 2640
DVLASYTQAT HDESMTYEDR SQLDHMGFHF TTGIPAPSTE TELDVLLPTA TSLPIPRKSA 2700
TVIPEIEGIK AEAKALDDMF ESSTLSDGQA IADQSEIIPT LGQFERTQEE YEDKKHAGPS 2760
FQPEFSSGAE EALVDHTPYL SIATTHLMDQ SVTEVPDVME GSNPPYYTDT TLAVSTFAKL 2820
SSQTPSSPLT IYSGSEASGH TEIPQPSALP GIDVGSSVMS PQDSFKEIHV NIEATFKPSS 2880
EEYLHITEPP SLSPDTKLEP SEDDGKPELL EEMEASPTEL IAVEGTEILQ DFQNKTDGQV 2940
SGEAIKMFPT IKTPEAGTVI TTADEIELEG ATQWPHSTSA SATYGVEAGV VPWLSPQTSE 3000
RPTLSSSPEI NPETQAALIR GQDSTIAASE QQVAARILDS NDQATVNPVE FNTEVATPPF 3060
SLLETSNETD FLIGINEESV EGTAIYLPGP DRCKMNPCLN GGTCYPTETS YVCTCVPGYS 3120
GDQCELDFDE CHSNPCRNGA TCVDGFNTFR CLCLPSYVGA LCEQDTETCD YGWHKFQGQC 3180
YKYFAHRRTW DAAERECRLQ GAHLTSILSH EEQMFVNRVG HDYQWIGLND KMFEHDFRWT 3240
DGSTLQYENW RPNQPDSFFS AGEDCVVIIW HENGQWNDVP CNYHLTYTCK KGTVACGQPP 3300
VVENAKTFGK MKPRYEINSL IRYHCKDGFI QRHLPTIRCL GNGRWAIPKI TCMNPSAYQR 3360
TYSMKYFKNS SSAKDNSINT SKHDHRWSRR WQESRR 3396 
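The sequence is displayed in blocks of ten residues, with a running residue count at the end of each line. A minimal sketch of reading that layout, using the first two lines and the last line of the block above as sample input, is shown here. The function `parse_sequence_line` is a hypothetical helper, and the parsing rule (last whitespace-separated token is the counter) is an assumption based on how the lines look in this record.

```python
def parse_sequence_line(line):
    """Split one display line into (residues, running_count).

    Assumes the last whitespace-separated token is the running residue
    count and every other token is a block of residues.
    """
    *blocks, counter = line.split()
    return "".join(blocks), int(counter)


# First two display lines of the Protein Sequence field.
FIRST_LINES = [
    "MFINIKSILW MCSTLIVTHA LHKVKVGKSP PVRGSLSGKV SLPCHFSTMP TLPPSYNTSE 60",
    "FLRIKWSKIE VDKNGKDLKE TTVLVAQNGN IKIGQDYKGR VSVPTHPEAV GDASLTVVKL 120",
]
# Last display line; its counter should equal the stated protein length.
LAST_LINE = "TYSMKYFKNS SSAKDNSINT SKHDHRWSRR WQESRR 3396 "

sequence = ""
for line in FIRST_LINES:
    residues, counter = parse_sequence_line(line)
    sequence += residues
    # The running count should match the residues accumulated so far.
    print(counter, len(sequence), counter == len(sequence))

last_residues, last_counter = parse_sequence_line(LAST_LINE)
# Residues preceding the last line, inferred from its counter.
print(last_counter - len(last_residues))
```

Applied to every line of the block, the final counter can be compared against the Protein Length field (3396 AA) as a consistency check.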
Gene Ontology
 GO:0005615; C:extracellular space; IDA:BHF-UCL.
 GO:0005796; C:Golgi lumen; TAS:Reactome.
 GO:0043202; C:lysosomal lumen; TAS:Reactome.
 GO:0005578; C:proteinaceous extracellular matrix; TAS:ProtInc.
 GO:0005509; F:calcium ion binding; IEA:InterPro.
 GO:0030246; F:carbohydrate binding; IEA:InterPro.
 GO:0005540; F:hyaluronic acid binding; TAS:ProtInc.
 GO:0005975; P:carbohydrate metabolic process; TAS:Reactome.
 GO:0007155; P:cell adhesion; TAS:ProtInc.
 GO:0008037; P:cell recognition; TAS:ProtInc.
 GO:0030206; P:chondroitin sulfate biosynthetic process; TAS:Reactome.
 GO:0030207; P:chondroitin sulfate catabolic process; TAS:Reactome.
 GO:0030208; P:dermatan sulfate biosynthetic process; TAS:Reactome.
 GO:0008347; P:glial cell migration; IDA:BHF-UCL.
 GO:0007507; P:heart development; IEA:Compara.
 GO:0007275; P:multicellular organismal development; TAS:ProtInc. 
Interpro
 IPR001304; C-type_lectin.
 IPR016186; C-type_lectin-like.
 IPR018378; C-type_lectin_CS.
 IPR016187; C-type_lectin_fold.
 IPR000742; EG-like_dom.
 IPR001881; EGF-like_Ca-bd.
 IPR013032; EGF-like_CS.
 IPR000152; EGF-type_Asp/Asn_hydroxyl_site.
 IPR018097; EGF_Ca-bd_CS.
 IPR007110; Ig-like_dom.
 IPR013783; Ig-like_fold.
 IPR003599; Ig_sub.
 IPR013106; Ig_V-set.
 IPR000538; Link.
 IPR000436; Sushi_SCR_CCP. 
Pfam
 PF00008; EGF
 PF00059; Lectin_C
 PF00084; Sushi
 PF07686; V-set
 PF00193; Xlink 
SMART
 SM00032; CCP
 SM00034; CLECT
 SM00181; EGF
 SM00179; EGF_CA
 SM00409; IG
 SM00445; LINK 
PROSITE
 PS00010; ASX_HYDROXYL
 PS00615; C_TYPE_LECTIN_1
 PS50041; C_TYPE_LECTIN_2
 PS00022; EGF_1
 PS01186; EGF_2
 PS50026; EGF_3
 PS01187; EGF_CA
 PS50835; IG_LIKE
 PS01241; LINK_1
 PS50963; LINK_2
 PS50923; SUSHI 
PRINTS
 PR01265; LINKMODULE.