CPLM 1.0 - Compendium of Protein Lysine Modification
Tag | Content
CPLM ID CPLM-044343
UniProt Accession
Genbank Protein ID
Genbank Nucleotide ID
Protein Name
 Collagen alpha-4(IV) chain 
Protein Synonyms/Alias
 Collagen, type IV, alpha 4, isoform CRA_a 
Gene Name
 COL4A4 
Gene Synonyms/Alias
 hCG_19233 
Created Date
 July 27, 2013 
Organism
 Homo sapiens (Human) 
NCBI Taxa ID
 9606 
Lysine Modification
Position | Peptide | Type | References
1229 | PPGPRGKKGPPGPPG | ubiquitination | [1]
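The position and peptide in the entry above can be cross-checked against the protein sequence. The sketch below assumes the 15-residue peptide is the modified lysine plus 7 flanking residues on each side; the record does not state this convention, it is inferred from the peptide length. The names `SEGMENT`, `SEGMENT_START`, and `peptide_window` are hypothetical, and only residues 1201-1260 of the sequence are copied in to keep the example short.

```python
# Residues 1201-1260 of this record's protein sequence (copied from the sequence block).
SEGMENT_START = 1201
SEGMENT = (
    "GPVGIPGLKG" "ERGDPGSPGI" "SPPGPRGKKG"
    "PPGPPGSSGP" "PGPAGATGRA" "PKDIPDPGPP"
)


def peptide_window(segment, segment_start, position, flank=7):
    """Return (residue, window) for a 1-based protein position.

    Assumption: the peptide column is the site residue with `flank`
    residues on each side, truncated at the segment boundaries.
    """
    idx = position - segment_start  # 0-based index into the segment
    if not 0 <= idx < len(segment):
        raise ValueError("position lies outside the supplied segment")
    lo = max(0, idx - flank)
    hi = min(len(segment), idx + flank + 1)
    return segment[idx], segment[lo:hi]


residue, window = peptide_window(SEGMENT, SEGMENT_START, 1229)
print(residue, window)
```

If the assumption holds, the residue at position 1229 is a lysine and the window equals the peptide listed in the table.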
Reference
 [1] A data set of human endogenous protein ubiquitination sites.
 Shi Y, Chan DW, Jung SY, Malovannaya A, Wang Y, Qin J.
 Mol Cell Proteomics. 2011 May;10(5):M110.002089. [PMID: 20972266]
Functional Description
  
Sequence Annotation
  
Keyword
 Collagen; Complete proteome; Disulfide bond; Reference proteome. 
Sequence Source
 UniProt (SWISSPROT/TrEMBL); GenBank; EMBL 
Protein Length
 1687 AA 
Protein Sequence
MWSLHIVLMR CSFRLTKSLA TGPWSLILIL FSVQYVYGSG KKYIGPCGGR DCSVCHCVPE 60
KGSRGPPGPP GPQGPIGPLG APGPIGLSGE KGMRGDRGPP GAAGDKGDKG PTGVPGFPGL 120
DGIPGHPGPP GPRGKPGMSG HNGSRGDPGF PGGRGALGPG GPLGHPGEKG EKGNSVFILG 180
AVKGIQGDRG DPGLPGLPGS WGAGGPAGPT GYPGEPGLVG PPGQPGRPGL KGNPGVGVKG 240
QMGDPGEVGQ QGSPGPTLLV EPPDFCLYKG EKGIKGIPGM VGLPGPPGRK GESGIGAKGE 300
KGIPGFPGPR GDPGSYGSPG FPGLKGELGL VGDPGLFGLI GPKGDPGNRG HPGPPGVLVT 360
PPLPLKGPPG DPGFPGRYGE TGDVGPPGPP GLLGRPGEAC AGMIGPPGPQ GFPGLPGLPG 420
EAGIPGRPDS APGKPGKPGS PGLPGAPGLQ GLPGSSVIYC SVGNPGPQGI KGKVGPPGGR 480
GPKGEKGNEG LCACEPGPMG PPGPPGLPGR QGSKGDLGLP GWLGTKGDPG PPGAEGPPGL 540
PGKHGASGPP GNKGAKGDMV VSRVKGHKGE RGPDGPPGFP GQPGSHGRDG HAGEKGDPGP 600
PGDHEDATPG GKGFPGPLGP PGKAGPVGPP GLGFPGPPGE RGHPGVPGHP GVRGPDGLKG 660
QKGDTISCNV TYPGRHGPPG FDGPPGPKGF PGPQGAPGLS GSDGHKGRPG TPGTAEIPGP 720
PGFRGDMGDP GFGGEKGSSP VGPPGPPGSP GVNGQKGIPG DPAFGHLGPP GKRGLSGVPG 780
IKGPRGDPGC PGAEGPAGIP GFLGLKGPKG REGHAGFPGV PGPPGHSCER GAPGIPGQPG 840
LPGYPGSPGA PGGKGQPGDV GPPGPAGMKG LPGLPGRPGA HGPPGLPGIP GPFGDDGLPG 900
PPGPKGPRGL PGFPGFPGER GKPGAEGCPG AKGEPGEKGM SGLPGDRGLR GAKGAIGPPG 960
DEGEMAIISQ KGTPGEPGPP GDDGFPGERG DKGTPGMQGR RGEPGRYGPP GFHRGEPGEK 1020
GQPGPPGPPG PPGSTGLRGF IGFPGLPGDQ GEPGSPGPPG FSGIDGARGP KGNKGDPASH 1080
FGPPGPKGEP GSPGCPGHFG ASGEQGLPGI QGPRGSPGRP GPPGSSGPPG CPGDHGMPGL 1140
RGQPGEMGDP GPRGLQGDPG IPGPPGIKGP SGSPGLNGLH GLKGQKGTKG ASGLHDVGPP 1200
GPVGIPGLKG ERGDPGSPGI SPPGPRGKKG PPGPPGSSGP PGPAGATGRA PKDIPDPGPP 1260
GDQGPPGPDG PRGAPGPPGL PGSVDLLRGE PGDCGLPGPP GPPGPPGPPG YKGFPGCDGK 1320
DGQKGPVGFP GPQGPHGFPG PPGEKGLPGP PGRKGPTGLP GEPGPPADVD DCPRIPGLPG 1380
APGMRGPEGA MGLPGMRGPS GPGCKGEPGL DGRRGVDGVP GSPGPPGRKG DTGEDGYPGG 1440
PGPPGPIGDP GPKGFGPGYL GGFLLVLHSQ TDQEPTCPLG MPRLWTGYSL LYLEGQEKAH 1500
NQDLGLAGSC LPVFSTLPFA YCNIHQVCHY AQRNDRSYWL ASAAPLPMMP LSEEAIRPYV 1560
SRCAVCEAPA QAVAVHSQDQ SIPPCPQTWR SLWIGYSFLM HTGAGDQGGG QALMSPGSCL 1620
EDFRAAPFLE CQGRQGTCHF FANKYSFWLT TVKADLQFSS APAPDTLKES QAQRQKISRC 1680
QVCVKYS 1687 
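The sequence block above is laid out in groups of 10 residues with a running residue count at the end of each line. A minimal sketch of turning such a block into a plain string, using the counters as a consistency check, might look as follows. The function name `parse_sequence_block` is hypothetical, and only the first two lines of the block are used as sample input.

```python
def parse_sequence_block(text):
    """Join a space-grouped, numbered sequence block into one string.

    Each line is expected to end with the cumulative residue count;
    a mismatch raises ValueError.
    """
    groups = []
    total = 0
    for line in text.strip().splitlines():
        *chunks, counter = line.split()
        groups.extend(chunks)
        total += sum(len(chunk) for chunk in chunks)
        if total != int(counter):
            raise ValueError(f"expected {counter} residues, counted {total}")
    return "".join(groups)


# First two lines of the sequence block in this record.
SAMPLE = """
MWSLHIVLMR CSFRLTKSLA TGPWSLILIL FSVQYVYGSG KKYIGPCGGR DCSVCHCVPE 60
KGSRGPPGPP GPQGPIGPLG APGPIGLSGE KGMRGDRGPP GAAGDKGDKG PTGVPGFPGL 120
"""

sequence = parse_sequence_block(SAMPLE)
print(len(sequence), sequence[:10])
```

Applied to the full block, the same check should end at the stated protein length of 1687.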
Gene Ontology
 GO:0005581; C:collagen; IEA:UniProtKB-KW.
 GO:0005201; F:extracellular matrix structural constituent; IEA:InterPro. 
Interpro
 IPR016187; C-type_lectin_fold.
 IPR008160; Collagen.
 IPR001442; Collagen_VI_NC. 
Pfam
 PF01413; C4
 PF01391; Collagen 
SMART
 SM00111; C4 
PROSITE
 PS51403; NC1_IV 
PRINTS