CPLM 1.0 - Compendium of Protein Lysine Modification
Tag  Content
CPLM ID CPLM-010901
UniProt Accession
Genbank Protein ID
Genbank Nucleotide ID
Protein Name
 Collagen alpha-3(IV) chain 
Protein Synonyms/Alias
 Goodpasture antigen; Tumstatin 
Gene Name
 COL4A3 
Gene Synonyms/Alias
  
Created Date
 July 27, 2013 
Organism
 Homo sapiens (Human) 
NCBI Taxa ID
 9606 
Lysine Modification
Position  Peptide          Type            References
1414      EKGNKGSKGEPGPAG  ubiquitination  [1, 2]
1429      SDGLPGLKGKRGDSG  acetylation     [3]
Reference
 [1] Tryptic digestion of ubiquitin standards reveals an improved strategy for identifying ubiquitinated proteins by mass spectrometry.
 Denis NJ, Vasilescu J, Lambert JP, Smith JC, Figeys D.
 Proteomics. 2007 Mar;7(6):868-74. [PMID: 17370265]
 [2] hCKSAAP_UbSite: improved prediction of human ubiquitination sites by exploiting amino acid pattern and properties.
 Chen Z, Zhou Y, Song J, Zhang Z.
 Biochim Biophys Acta. 2013 Aug;1834(8):1461-7. [PMID: 23603789]
 [3] Proteomic investigations of lysine acetylation identify diverse substrates of mitochondrial deacetylase sirt3.
 Sol EM, Wagner SA, Weinert BT, Kumar A, Kim HS, Deng CX, Choudhary C.
 PLoS One. 2012;7(12):e50545. [PMID: 23236377]
Functional Description
 Type IV collagen is the major structural component of glomerular basement membranes (GBM), forming a 'chicken-wire' meshwork together with laminins, proteoglycans and entactin/nidogen. 
Sequence Annotation
 DOMAIN 1445 1669 Collagen IV NC1.
 REGION 29 42 7S domain.
 REGION 43 1438 Triple-helical region.
 REGION 1427 1444 Epitope recognized by Goodpasture
 REGION 1479 1557 Required for the anti-angiogenic activity
 REGION 1610 1628 Required for the anti-tumor cell activity
 MOTIF 791 793 Cell attachment site (Potential).
 MOTIF 996 998 Cell attachment site (Potential).
 MOTIF 1154 1156 Cell attachment site (Potential).
 MOTIF 1306 1308 Cell attachment site (Potential).
 MOTIF 1345 1347 Cell attachment site (Potential).
 MOTIF 1432 1434 Cell attachment site (Potential).
 CARBOHYD 253 253 N-linked (GlcNAc...) (Potential).
 DISULFID 1460 1551 Or C-1460 with C-1548 (By similarity).
 DISULFID 1493 1548 Or C-1493 with C-1551 (By similarity).
 DISULFID 1505 1511 By similarity.
 DISULFID 1570 1665 Or C-1570 with C-1662 (By similarity).
 DISULFID 1604 1662 Or C-1604 with C-1665 (By similarity).
 DISULFID 1616 1622 By similarity.
 CROSSLNK 1414 1414 Glycyl lysine isopeptide (Lys-Gly)
 CROSSLNK 1533 1533 S-Lysyl-methionine sulfilimine (Met-Lys)
 CROSSLNK 1651 1651 S-Lysyl-methionine sulfilimine (Lys-Met)  
Keyword
 Alport syndrome; Alternative splicing; Basement membrane; Cell adhesion; Collagen; Complete proteome; Deafness; Direct protein sequencing; Disease mutation; Disulfide bond; Extracellular matrix; Glycoprotein; Hydroxylation; Isopeptide bond; Phosphoprotein; Polymorphism; Reference proteome; Repeat; Secreted; Signal; Ubl conjugation. 
Sequence Source
 UniProt (SWISSPROT/TrEMBL); GenBank; EMBL 
Protein Length
 1670 AA 
Protein Sequence
MSARTAPRPQ VLLLPLLLVL LAAAPAASKG CVCKDKGQCF CDGAKGEKGE KGFPGPPGSP 60
GQKGFTGPEG LPGPQGPKGF PGLPGLTGSK GVRGISGLPG FSGSPGLPGT PGNTGPYGLV 120
GVPGCSGSKG EQGFPGLPGT LGYPGIPGAA GLKGQKGAPA KEEDIELDAK GDPGLPGAPG 180
PQGLPGPPGF PGPVGPPGPP GFFGFPGAMG PRGPKGHMGE RVIGHKGERG VKGLTGPPGP 240
PGTVIVTLTG PDNRTDLKGE KGDKGAMGEP GPPGPSGLPG ESYGSEKGAP GDPGLQGKPG 300
KDGVPGFPGS EGVKGNRGFP GLMGEDGIKG QKGDIGPPGF RGPTEYYDTY QEKGDEGTPG 360
PPGPRGARGP QGPSGPPGVP GSPGSSRPGL RGAPGWPGLK GSKGERGRPG KDAMGTPGSP 420
GCAGSPGLPG SPGPPGPPGD IVFRKGPPGD HGLPGYLGSP GIPGVDGPKG EPGLLCTQCP 480
YIPGPPGLPG LPGLHGVKGI PGRQGAAGLK GSPGSPGNTG LPGFPGFPGA QGDPGLKGEK 540
GETLQPEGQV GVPGDPGLRG QPGRKGLDGI PGTPGVKGLP GPKGELALSG EKGDQGPPGD 600
PGSPGSPGPA GPAGPPGYGP QGEPGLQGTQ GVPGAPGPPG EAGPRGELSV STPVPGPPGP 660
PGPPGHPGPQ GPPGIPGSLG KCGDPGLPGP DGEPGIPGIG FPGPPGPKGD QGFPGTKGSL 720
GCPGKMGEPG LPGKPGLPGA KGEPAVAMPG GPGTPGFPGE RGNSGEHGEI GLPGLPGLPG 780
TPGNEGLDGP RGDPGQPGPP GEQGPPGRCI EGPRGAQGLP GLNGLKGQQG RRGKTGPKGD 840
PGIPGLDRSG FPGETGSPGI PGHQGEMGPL GQRGYPGNPG ILGPPGEDGV IGMMGFPGAI 900
GPPGPPGNPG TPGQRGSPGI PGVKGQRGTP GAKGEQGDKG NPGPSEISHV IGDKGEPGLK 960
GFAGNPGEKG NRGVPGMPGL KGLKGLPGPA GPPGPRGDLG STGNPGEPGL RGIPGSMGNM 1020
GMPGSKGKRG TLGFPGRAGR PGLPGIHGLQ GDKGEPGYSE GTRPGPPGPT GDPGLPGDMG 1080
KKGEMGQPGP PGHLGPAGPE GAPGSPGSPG LPGKPGPHGD LGFKGIKGLL GPPGIRGPPG 1140
LPGFPGSPGP MGIRGDQGRD GIPGPAGEKG ETGLLRAPPG PRGNPGAQGA KGDRGAPGFP 1200
GLPGRKGAMG DAGPRGPTGI EGFPGPPGLP GAIIPGQTGN RGPPGSRGSP GAPGPPGPPG 1260
SHVIGIKGDK GSMGHPGPKG PPGTAGDMGP PGRLGAPGTP GLPGPRGDPG FQGFPGVKGE 1320
KGNPGFLGSI GPPGPIGPKG PPGVRGDPGT LKIISLPGSP GPPGTPGEPG MQGEPGPPGP 1380
PGNLGPCGPR GKPGKDGKPG TPGPAGEKGN KGSKGEPGPA GSDGLPGLKG KRGDSGSPAT 1440
WTTRGFVFTR HSQTTAIPSC PEGTVPLYSG FSFLFVQGNQ RAHGQDLGTL GSCLQRFTTM 1500
PFLFCNVNDV CNFASRNDYS YWLSTPALMP MNMAPITGRA LEPYISRCTV CEGPAIAIAV 1560
HSQTTDIPPC PHGWISLWKG FSFIMFTSAG SEGTGQALAS PGSCLEEFRA SPFLECHGRG 1620
TCNYYSNSYS FWLASLNPER MFRKPIPSTV KAGELEKIIS RCQVCMKKRH 1670 
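
The peptides listed under Lysine Modification appear to be 15-residue windows centred on the modified lysine (7 residues on each side). The sketch below is one way to cross-check a listed position against the sequence block; it is illustrative only and not part of CPLM. The helper names `parse_sequence_block` and `site_window`, and the `first_residue` offset parameter, are hypothetical and introduced here for the example.

```python
import re


def parse_sequence_block(block: str) -> str:
    """Join a numbered sequence block into one string of residues.

    Removes the spaces between 10-residue groups and the running
    position counter at the end of each line.
    """
    return re.sub(r"[^A-Za-z]", "", block).upper()


def site_window(seq: str, position: int, flank: int = 7,
                first_residue: int = 1) -> str:
    """Return the peptide spanning `flank` residues either side of a site.

    `position` is 1-based in protein coordinates; `first_residue` is the
    protein coordinate of seq[0], so a fragment can be used in place of
    the full sequence.
    """
    idx = position - first_residue
    if not 0 <= idx < len(seq):
        raise ValueError(f"position {position} lies outside the sequence")
    start = max(0, idx - flank)  # clip at the N-terminus
    return seq[start: idx + flank + 1]


# The line of the record covering residues 1381-1440.
fragment = parse_sequence_block(
    "PGNLGPCGPR GKPGKDGKPG TPGPAGEKGN KGSKGEPGPA GSDGLPGLKG KRGDSGSPAT 1440"
)

for pos in (1414, 1429):
    residue = fragment[pos - 1381]
    print(pos, residue, site_window(fragment, pos, first_residue=1381))
```

Run against this fragment, both sites resolve to a lysine (K), and the extracted windows match the peptides given in the modification table. Passing the full 1670-residue sequence with the default `first_residue=1` should work the same way.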
Gene Ontology
 GO:0005587; C:collagen type IV; IDA:UniProtKB.
 GO:0005788; C:endoplasmic reticulum lumen; TAS:Reactome.
 GO:0005201; F:extracellular matrix structural constituent; IEA:InterPro.
 GO:0005178; F:integrin binding; IDA:UniProtKB.
 GO:0008191; F:metalloendopeptidase inhibitor activity; NAS:UniProtKB.
 GO:0005198; F:structural molecule activity; NAS:ProtInc.
 GO:0006919; P:activation of cysteine-type endopeptidase activity involved in apoptotic process; IDA:UniProtKB.
 GO:0007411; P:axon guidance; TAS:Reactome.
 GO:0008015; P:blood circulation; TAS:ProtInc.
 GO:0007155; P:cell adhesion; IEA:UniProtKB-KW.
 GO:0008283; P:cell proliferation; IDA:UniProtKB.
 GO:0007166; P:cell surface receptor signaling pathway; NAS:UniProtKB.
 GO:0030574; P:collagen catabolic process; TAS:Reactome.
 GO:0022617; P:extracellular matrix disassembly; TAS:Reactome.
 GO:0032836; P:glomerular basement membrane development; ISS:UniProtKB.
 GO:0006917; P:induction of apoptosis; IDA:UniProtKB.
 GO:0016525; P:negative regulation of angiogenesis; IDA:UniProtKB.
 GO:0008285; P:negative regulation of cell proliferation; TAS:ProtInc.
 GO:0007605; P:sensory perception of sound; TAS:ProtInc. 
Interpro
 IPR016187; C-type_lectin_fold.
 IPR008160; Collagen.
 IPR001442; Collagen_VI_NC. 
Pfam
 PF01413; C4
 PF01391; Collagen 
SMART
 SM00111; C4 
PROSITE
 PS51403; NC1_IV 
PRINTS