CPLM 1.0 - Compendium of Protein Lysine Modification
Tag Content
CPLM ID CPLM-017579
UniProt Accession
Genbank Protein ID
Genbank Nucleotide ID
Protein Name
 Sulfatase-modifying factor 1 
Protein Synonyms/Alias
 C-alpha-formylglycine-generating enzyme 1 
Gene Name
 SUMF1 
Gene Synonyms/Alias
 FGE; PSEC0152; UNQ3037/PRO9852 
Created Date
 July 27, 2013 
Organism
 Homo sapiens (Human) 
NCBI Taxa ID
 9606 
Lysine Modification
Position  Peptide  Type  References
249  RLFPWGNKLQPKGQH  ubiquitination  [1]
Reference
 [1] Refined preparation and use of anti-diglycine remnant (K-ε-GG) antibody enables routine quantification of 10,000s of ubiquitination sites in single proteomics experiments.
 Udeshi ND, Svinkina T, Mertins P, Kuhn E, Mani DR, Qiao JW, Carr SA.
 Mol Cell Proteomics. 2013 Mar;12(3):825-31. [PMID: 23266961]
Functional Description
 Using molecular oxygen and an unidentified reducing agent, oxidizes a cysteine residue in the substrate sulfatase to an active site 3-oxoalanine residue, which is also called C(alpha)-formylglycine. Known substrates include GALNS, ARSA, STS and ARSE. 
Sequence Annotation
 REGION 341 360 Interaction with sulfatases.
 ACT_SITE 333 333 Proton acceptor (Probable).
 METAL 130 130 Calcium 2.
 METAL 259 259 Calcium 1.
 METAL 260 260 Calcium 1; via carbonyl oxygen.
 METAL 273 273 Calcium 1.
 METAL 275 275 Calcium 1; via carbonyl oxygen.
 METAL 293 293 Calcium 2; via carbonyl oxygen.
 METAL 296 296 Calcium 2; via carbonyl oxygen.
 METAL 298 298 Calcium 2; via carbonyl oxygen.
 METAL 300 300 Calcium 2.
 CARBOHYD 141 141 N-linked (GlcNAc...).
 DISULFID 50 52
 DISULFID 218 365
 DISULFID 235 346
 DISULFID 336 341 Redox-active.  
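The annotation lines above follow a UniProt-style layout of feature key, two positions, and an optional note. A minimal sketch of reading them into records is given below; the `Feature` type, the `parse_features` helper, and the regular expression are hypothetical names introduced for illustration and are not part of CPLM or UniProt tooling. Note that for DISULFID lines the two numbers are the bonded cysteines, not a range.

```python
import re
from typing import List, NamedTuple

# Annotation lines copied from the record above.
ANNOTATION = """
 REGION 341 360 Interaction with sulfatases.
 ACT_SITE 333 333 Proton acceptor (Probable).
 METAL 130 130 Calcium 2.
 METAL 259 259 Calcium 1.
 METAL 260 260 Calcium 1; via carbonyl oxygen.
 METAL 273 273 Calcium 1.
 METAL 275 275 Calcium 1; via carbonyl oxygen.
 METAL 293 293 Calcium 2; via carbonyl oxygen.
 METAL 296 296 Calcium 2; via carbonyl oxygen.
 METAL 298 298 Calcium 2; via carbonyl oxygen.
 METAL 300 300 Calcium 2.
 CARBOHYD 141 141 N-linked (GlcNAc...).
 DISULFID 50 52
 DISULFID 218 365
 DISULFID 235 346
 DISULFID 336 341 Redox-active.
"""


class Feature(NamedTuple):
    """One annotation line (hypothetical record type for this sketch)."""
    key: str
    start: int
    end: int
    note: str


# Assumed layout: KEY, two integers, then free text (possibly empty).
FEATURE_RE = re.compile(r"^\s*([A-Z_]+)\s+(\d+)\s+(\d+)\s*(.*?)\s*$")


def parse_features(text: str) -> List[Feature]:
    """Parse each matching line into a Feature; skip blank or odd lines."""
    features = []
    for line in text.splitlines():
        match = FEATURE_RE.match(line)
        if match:
            key, start, end, note = match.groups()
            features.append(Feature(key, int(start), int(end), note))
    return features


features = parse_features(ANNOTATION)

# For DISULFID, start and end are the two cysteines joined by the bond.
disulfide_pairs = [(f.start, f.end) for f in features if f.key == "DISULFID"]

# Residues listed as coordinating the first calcium ion.
calcium_1_sites = [f.start for f in features
                   if f.key == "METAL" and f.note.startswith("Calcium 1")]

print(disulfide_pairs)
print(calcium_1_sites)
```

Grouping by feature key in this way makes it straightforward to cross-check positions against the protein sequence, for example that every DISULFID position is a cysteine.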
Keyword
 3D-structure; Alternative splicing; Calcium; Complete proteome; Direct protein sequencing; Disease mutation; Disulfide bond; Endoplasmic reticulum; Glycoprotein; Ichthyosis; Leukodystrophy; Metachromatic leukodystrophy; Metal-binding; Mucopolysaccharidosis; Oxidoreductase; Polymorphism; Redox-active center; Reference proteome; Signal. 
Sequence Source
 UniProt (SWISSPROT/TrEMBL); GenBank; EMBL 
Protein Length
 374 AA 
Protein Sequence
MAAPALGLVC GRCPELGLVL LLLLLSLLCG AAGSQEAGTG AGAGSLAGSC GCGTPQRPGA 60
HGSSAAAHRY SREANAPGPV PGERQLAHSK MVPIPAGVFT MGTDDPQIKQ DGEAPARRVT 120
IDAFYMDAYE VSNTEFEKFV NSTGYLTEAE KFGDSFVFEG MLSEQVKTNI QQAVAAAPWW 180
LPVKGANWRH PEGPDSTILH RPDHPVLHVS WNDAVAYCTW AGKRLPTEAE WEYSCRGGLH 240
NRLFPWGNKL QPKGQHYANI WQGEFPVTNT GEDGFQGTAP VDAFPPNGYG LYNIVGNAWE 300
WTSDWWTVHH SVEETLNPKG PPSGKDRVKK GGSYMCHRSY CYRYRCAARS QNTPDSSASN 360
LGFRCAADRL PTMD 374 
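As a consistency check, the listed modification site can be compared against this sequence. The sketch below strips the spacing and position counters, then extracts the residues around position 249. The helper names and the flank of 7 residues are assumptions inferred from the 15-residue peptide in the Lysine Modification table, not a documented CPLM convention.

```python
# Sequence block pasted as it appears in the record, including spacing
# and the running position counter at the end of each line.
RAW_SEQUENCE = """
MAAPALGLVC GRCPELGLVL LLLLLSLLCG AAGSQEAGTG AGAGSLAGSC GCGTPQRPGA 60
HGSSAAAHRY SREANAPGPV PGERQLAHSK MVPIPAGVFT MGTDDPQIKQ DGEAPARRVT 120
IDAFYMDAYE VSNTEFEKFV NSTGYLTEAE KFGDSFVFEG MLSEQVKTNI QQAVAAAPWW 180
LPVKGANWRH PEGPDSTILH RPDHPVLHVS WNDAVAYCTW AGKRLPTEAE WEYSCRGGLH 240
NRLFPWGNKL QPKGQHYANI WQGEFPVTNT GEDGFQGTAP VDAFPPNGYG LYNIVGNAWE 300
WTSDWWTVHH SVEETLNPKG PPSGKDRVKK GGSYMCHRSY CYRYRCAARS QNTPDSSASN 360
LGFRCAADRL PTMD 374
"""


def clean_sequence(raw: str) -> str:
    """Keep only letters, dropping spaces, newlines, and position counters."""
    return "".join(ch for ch in raw if ch.isalpha())


def site_window(sequence: str, position: int, flank: int = 7) -> str:
    """Return residues within `flank` of a 1-based position, clipped at the ends."""
    start = max(0, position - 1 - flank)
    end = min(len(sequence), position + flank)
    return sequence[start:end]


sequence = clean_sequence(RAW_SEQUENCE)
site = 249  # position from the Lysine Modification table

print(len(sequence))                # protein length
print(sequence[site - 1])           # residue at the modified position
print(site_window(sequence, site))  # peptide centred on the site
```

With the sequence as given, the length matches the stated 374 AA, the residue at position 249 is a lysine, and the window reproduces the listed peptide RLFPWGNKLQPKGQH.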
Gene Ontology
 GO:0005788; C:endoplasmic reticulum lumen; TAS:Reactome.
 GO:0046872; F:metal ion binding; IEA:UniProtKB-KW.
 GO:0016491; F:oxidoreductase activity; IEA:UniProtKB-KW.
 GO:0006687; P:glycosphingolipid metabolic process; TAS:Reactome.
 GO:0043687; P:post-translational protein modification; TAS:Reactome.
 GO:0044281; P:small molecule metabolic process; TAS:Reactome. 
Interpro
 IPR016187; C-type_lectin_fold.
 IPR005532; FGE_dom. 
Pfam
 PF03781; FGE-sulfatase 
SMART
  
PROSITE
  
PRINTS