CPLM 1.0 - Compendium of Protein Lysine Modification
Tag    Content
CPLM ID CPLM-023213
UniProt Accession
Genbank Protein ID
Genbank Nucleotide ID
Protein Name
 Histone-lysine N-methyltransferase 2B 
Protein Synonyms/Alias
 Lysine N-methyltransferase 2B; Myeloid/lymphoid or mixed-lineage leukemia protein 4; Trithorax homolog 2; WW domain-binding protein 7; WBP-7 
Gene Name
 KMT2B 
Gene Synonyms/Alias
 HRX2; KIAA0304; MLL2; MLL4; TRX2; WBP7 
Created Date
 July 27, 2013 
Organism
 Homo sapiens (Human) 
NCBI Taxa ID
 9606 
Lysine Modification
Position  Peptide          Type            References
797       EKMFSLLKRAKVQLF  ubiquitination  [1, 2]
926       PSRSRRGKVEAAGPG  acetylation     [3, 4, 5, 6]
1567      PSAAFQGKDPAAFSH  acetylation     [5]
1866      SFSGARIKVPNYSPS  methylation     [7]
2196      SKIILVNKLGQVFVK  ubiquitination  [1, 2, 6, 8, 9]
2296      PPAPPPYKAPRLDED  acetylation     [3]
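Each site row can be split back into its four fields mechanically. The sketch below is a minimal, hypothetical parser; the names `Site`, `ROW`, `parse_site` and `centre_is_lysine` are illustrative and not part of CPLM. It assumes the peptide is written in upper-case one-letter codes, the modification type in lower case, and the references as a bracketed list, and it accepts rows with or without whitespace between the fields. It also checks the assumption, which holds for the rows of this entry, that the peptide is an odd-length window centred on the modified lysine.

```python
import re
from typing import List, NamedTuple


class Site(NamedTuple):
    position: int      # 1-based residue number of the modified lysine
    peptide: str       # sequence window around the site
    mod_type: str      # e.g. "acetylation"
    refs: List[int]    # reference numbers from the Reference list


# Fields may be fused ("797EKMF...ubiquitination[1, 2]") or space-separated.
ROW = re.compile(r"^\s*(\d+)\s*([A-Z]+)\s*([a-z]+)\s*\[([\d,\s]+)\]\s*$")


def parse_site(row: str) -> Site:
    match = ROW.match(row)
    if match is None:
        raise ValueError(f"unrecognised site row: {row!r}")
    position, peptide, mod_type, refs = match.groups()
    return Site(int(position), peptide, mod_type,
                [int(r) for r in refs.split(",")])


def centre_is_lysine(site: Site) -> bool:
    # Assumption: odd-length window with the modified residue in the middle.
    middle = len(site.peptide) // 2
    return len(site.peptide) % 2 == 1 and site.peptide[middle] == "K"


site = parse_site("797EKMFSLLKRAKVQLFubiquitination[1, 2]")
print(site.position, site.peptide, site.mod_type, site.refs,
      centre_is_lysine(site))
```

Rows that do not follow the assumed layout raise `ValueError` rather than being silently skipped, which makes extraction damage easy to spot.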
Reference
 [1] A proteome-wide, quantitative survey of in vivo ubiquitylation sites reveals widespread regulatory roles.
 Wagner SA, Beli P, Weinert BT, Nielsen ML, Cox J, Mann M, Choudhary C.
 Mol Cell Proteomics. 2011 Oct;10(10):M111.013284. [PMID: 21890473]
 [2] hCKSAAP_UbSite: improved prediction of human ubiquitination sites by exploiting amino acid pattern and properties.
 Chen Z, Zhou Y, Song J, Zhang Z.
 Biochim Biophys Acta. 2013 Aug;1834(8):1461-7. [PMID: 23603789]
 [3] Lysine acetylation targets protein complexes and co-regulates major cellular functions.
 Choudhary C, Kumar C, Gnad F, Nielsen ML, Rehman M, Walther TC, Olsen JV, Mann M.
 Science. 2009 Aug 14;325(5942):834-40. [PMID: 19608861]
 [4] Proteomic investigations reveal a role for RNA processing factor THRAP3 in the DNA damage response.
 Beli P, Lukashchuk N, Wagner SA, Weinert BT, Olsen JV, Baskcomb L, Mann M, Jackson SP, Choudhary C.
 Mol Cell. 2012 Apr 27;46(2):212-25. [PMID: 22424773]
 [5] Proteomic investigations of lysine acetylation identify diverse substrates of mitochondrial deacetylase sirt3.
 Sol EM, Wagner SA, Weinert BT, Kumar A, Kim HS, Deng CX, Choudhary C.
 PLoS One. 2012;7(12):e50545. [PMID: 23236377]
 [6] Integrated proteomic analysis of post-translational modifications by serial enrichment.
 Mertins P, Qiao JW, Patel J, Udeshi ND, Clauser KR, Mani DR, Burgess MW, Gillette MA, Jaffe JD, Carr SA.
 Nat Methods. 2013 Jul;10(7):634-7. [PMID: 23749302]
 [7] Large-scale global identification of protein lysine methylation in vivo.
 Cao XJ, Arnaudo AM, Garcia BA.
 Epigenetics. 2013 May 1;8(5):477-85. [PMID: 23644510]
 [8] Systems-wide analysis of ubiquitylation dynamics reveals a key role for PAF15 ubiquitylation in DNA-damage bypass.
 Povlsen LK, Beli P, Wagner SA, Poulsen SL, Sylvestersen KB, Poulsen JW, Nielsen ML, Bekker-Jensen S, Mailand N, Choudhary C.
 Nat Cell Biol. 2012 Oct;14(10):1089-98. [PMID: 23000965]
 [9] Refined preparation and use of anti-diglycine remnant (K-ε-GG) antibody enables routine quantification of 10,000s of ubiquitination sites in single proteomics experiments.
 Udeshi ND, Svinkina T, Mertins P, Kuhn E, Mani DR, Qiao JW, Carr SA.
 Mol Cell Proteomics. 2013 Mar;12(3):825-31. [PMID: 23266961]
Functional Description
 Histone methyltransferase. Methylates 'Lys-4' of histone H3. H3 'Lys-4' methylation represents a specific tag for epigenetic transcriptional activation. Plays a central role in beta-globin locus transcription regulation by being recruited by NFE2. Plays an important role in controlling bulk H3K4me during oocyte growth and preimplantation development. Required during the transcriptionally active period of oocyte growth for the establishment and/or maintenance of bulk H3K4 trimethylation (H3K4me3), global transcriptional silencing that precedes resumption of meiosis, oocyte survival and normal zygotic genome activation.
Sequence Annotation
 DOMAIN 1727 1783 FYR N-terminal.
 DOMAIN 2411 2492 FYR C-terminal.
 DOMAIN 2575 2691 SET.
 DOMAIN 2699 2715 Post-SET.
 DNA_BIND 37 44 A.T hook 1.
 DNA_BIND 110 117 A.T hook 2.
 DNA_BIND 357 365 A.T hook 3.
 ZN_FING 959 1006 CXXC-type.
 ZN_FING 1201 1252 PHD-type 1.
 ZN_FING 1249 1303 PHD-type 2.
 ZN_FING 1335 1396 PHD-type 3.
 REGION 2652 2653 S-adenosyl-L-methionine binding (By similarity).
 METAL 2655 2655 Zinc (By similarity).
 METAL 2703 2703 Zinc (By similarity).
 METAL 2705 2705 Zinc (By similarity).
 METAL 2710 2710 Zinc (By similarity).
 BINDING 2585 2585 S-adenosyl-L-methionine (By similarity).
 BINDING 2587 2587 S-adenosyl-L-methionine (By similarity).
 BINDING 2629 2629 S-adenosyl-L-methionine (By similarity).
 BINDING 2704 2704 S-adenosyl-L-methionine (By similarity).
 MOD_RES 2 2 N-acetylalanine.
 MOD_RES 320 320 Phosphoserine (By similarity).
 MOD_RES 821 821 Phosphoserine.
 MOD_RES 844 844 Phosphoserine.
 MOD_RES 861 861 Phosphoserine.
 MOD_RES 1032 1032 Phosphoserine.
 MOD_RES 1035 1035 Phosphoserine.
 MOD_RES 1930 1930 Phosphoserine.
 MOD_RES 2068 2068 Phosphothreonine.
 MOD_RES 2070 2070 Phosphoserine.
 MOD_RES 2083 2083 Phosphothreonine.  
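The annotation lines above share one layout: a feature key, a start residue, an end residue, and a free-text note. The following sketch shows one way to read them and to ask which features cover a given residue. `Feature`, `parse_feature` and `features_at` are hypothetical names chosen for illustration, and the code assumes 1-based inclusive coordinates, as the single-residue entries (for example `METAL 2655 2655`) suggest.

```python
from typing import List, NamedTuple


class Feature(NamedTuple):
    key: str     # feature key, e.g. "DOMAIN" or "MOD_RES"
    start: int   # first residue (1-based, inclusive)
    end: int     # last residue (1-based, inclusive)
    note: str    # free-text description


def parse_feature(line: str) -> Feature:
    # Split on whitespace three times; the remainder is the description.
    key, start, end, note = line.split(None, 3)
    return Feature(key, int(start), int(end), note.strip().rstrip("."))


def features_at(features: List[Feature], position: int) -> List[Feature]:
    # Return every feature whose range covers the given residue.
    return [f for f in features if f.start <= position <= f.end]


lines = [
    "DOMAIN 2575 2691 SET.",
    "DOMAIN 2699 2715 Post-SET.",
    "METAL 2655 2655 Zinc (By similarity).",
]
features = [parse_feature(line) for line in lines]
for feature in features_at(features, 2655):
    print(feature.key, feature.note)
```

With the full list loaded, the same lookup can be run for each modified lysine position to see whether a site falls inside an annotated domain or zinc finger.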
Keyword
 3D-structure; Acetylation; Alternative splicing; Chromatin regulator; Complete proteome; DNA-binding; Metal-binding; Methyltransferase; Nucleus; Phosphoprotein; Polymorphism; Reference proteome; Repeat; S-adenosyl-L-methionine; Transcription; Transcription regulation; Transferase; Zinc; Zinc-finger. 
Sequence Source
 UniProt (SWISSPROT/TrEMBL); GenBank; EMBL 
Protein Length
 2715 AA 
Protein Sequence
MAAAAGGGSC PGPGSARGRF PGRPRGAGGG GGRGGRGNGA ERVRVALRRG GGATGPGGAE 60
PGEDTALLRL LGLRRGLRRL RRLWAGPRVQ RGRGRGRGRG WGPSRGCVPE EESSDGESDE 120
EEFQGFHSDE DVAPSSLRSA LRSQRGRAPR GRGRKHKTTP LPPPRLADVA PTPPKTPARK 180
RGEEGTERMV QALTELLRRA QAPQAPRSRA CEPSTPRRSR GRPPGRPAGP CRRKQQAVVV 240
AEAAVTIPKP EPPPPVVPVK HQTGSWKCKE GPGPGPGTPR RGGQSSRGGR GGRGRGRGGG 300
LPFVIKFVSR AKKVKMGQLS LGLESGQGQG QHEESWQDVP QRRVGSGQGG SPCWKKQEQK 360
LDDEEEEKKE EEEKDKEGEE KEERAVAEEM MPAAEKEEAK LPPPPLTPPA PSPPPPLPPP 420
STSPPPPLCP PPPPPVSPPP LPSPPPPPAQ EEQEESPPPV VPATCSRKRG RPPLTPSQRA 480
EREAARAGPE GTSPPTPTPS TATGGPPEDS PTVAPKSTTF LKNIRQFIMP VVSARSSRVI 540
KTPRRFMDED PPKPPKVEVS PVLRPPITTS PPVPQEPAPV PSPPRAPTPP STPVPLPEKR 600
RSILREPTFR WTSLTRELPP PPPAPPPPPA PSPPPAPATS SRRPLLLRAP QFTPSEAHLK 660
IYESVLTPPP LGAPEAPEPE PPPADDSPAE PEPRAVGRTN HLSLPRFAPV VTTPVKAEVS 720
PHGAPALSNG PQTQAQLLQP LQALQTQLLP QALPPPQPQL QPPPSPQQMP PLEKARIAGV 780
GSLPLSGVEE KMFSLLKRAK VQLFKIDQQQ QQKVAASMPL SPGGQMEEVA GAVKQISDRG 840
PVRSEDESVE AKRERPSGPE SPVQGPRIKH VCRHAAVALG QARAMVPEDV PRLSALPLRD 900
RQDLATEDTS SASETESVPS RSRRGKVEAA GPGGESEPTG SGGTLAHTPR RSLPSHHGKK 960
MRMARCGHCR GCLRVQDCGS CVNCLDKPKF GGPNTKKQCC VYRKCDKIEA RKMERLAKKG 1020
RTIVKTLLPW DSDESPEASP GPPGPRRGAG AGGPREEVVA HPGPEEQDSL LQRKSARRCV 1080
KQRPSYDIFE DSDDSEPGGP PAPRRRTPRE NELPLPEPEE QSRPRKPTLQ PVLQLKARRR 1140
LDKDALAPGP FASFPNGWTG KQKSPDGVHR VRVDFKEDCD LENVWLMGGL SVLTSVPGGP 1200
PMVCLLCASK GLHELVFCQV CCDPFHPFCL EEAERPLPQH HDTWCCRRCK FCHVCGRKGR 1260
GSKHLLECER CRHAYHPACL GPSYPTRATR KRRHWICSAC VRCKSCGATP GKNWDVEWSG 1320
DYSLCPRCTQ LYEKGNYCPI CTRCYEDNDY ESKMMQCAQC DHWVHAKCEG LSDEDYEILS 1380
GLPDSVLYTC GPCAGAAQPR WREALSGALQ GGLRQVLQGL LSSKVVGPLL LCTQCGPDGK 1440
QLHPGPCGLQ AVSQRFEDGH YKSVHSFMED MVGILMRHSE EGETPDRRAG GQMKGLLLKL 1500
LESAFGWFDA HDPKYWRRST RLPNGVLPNA VLPPSLDHVY AQWRQQEPET PESGQPPGDP 1560
SAAFQGKDPA AFSHLEDPRQ CALCLKYGDA DSKEAGRLLY IGQNEWTHVN CAIWSAEVFE 1620
ENDGSLKNVH AAVARGRQMR CELCLKPGAT VGCCLSSCLS NFHFMCARAS YCIFQDDKKV 1680
FCQKHTDLLD GKEIVNPDGF DVLRRVYVDF EGINFKRKFL TGLEPDAINV LIGSIRIDSL 1740
GTLSDLSDCE GRLFPIGYQC SRLYWSTVDA RRRCWYRCRI LEYRPWGPRE EPAHLEAAEE 1800
NQTIVHSPAP SSEPPGGEDP PLDTDVLVPG APERHSPIQN LDPPLRPDSG SAPPPAPRSF 1860
SGARIKVPNY SPSRRPLGGV SFGPLPSPGS PSSLTHHIPT VGDPDFPAPP RRSRRPSPLA 1920
PRPPPSRWAS PPLKTSPQLR VPPPTSVVTA LTPTSGELAP PGPAPSPPPP EDLGPDFEDM 1980
EVVSGLSAAD LDFAASLLGT EPFQEEIVAA GAMGSSHGGP GDSSEEESSP TSRYIHFPVT 2040
VVSAPGLAPS ATPGAPRIEQ LDGVDDGTDS EAEAVQQPRG QGTPPSGPGV VRAGVLGAAG 2100
DRARPPEDLP SEIVDFVLKN LGGPGDGGAG PREESLPPAP PLANGSQPSQ GLTASPADPT 2160
RTFAWLPGAP GVRVLSLGPA PEPPKPATSK IILVNKLGQV FVKMAGEGEP VPPPVKQPPL 2220
PPTISPTAPT SWTLPPGPLL GVLPVVGVVR PAPPPPPPPL TLVLSSGPAS PPRQAIRVKR 2280
VSTFSGRSPP APPPYKAPRL DEDGEASEDT PQVPGLGSGG FSRVRMKTPT VRGVLDLDRP 2340
GEPAGEESPG PLQERSPLLP LPEDGPPQVP DGPPDLLLES QWHHYSGEAS SSEEEPPSPD 2400
DKENQAPKRT GPHLRFEISS EDGFSVEAES LEGAWRTLIE KVQEARGHAR LRHLSFSGMS 2460
GARLLGIHHD AVIFLAEQLP GAQRCQHYKF RYHQQGEGQE EPPLNPHGAA RAEVYLRKCT 2520
FDMFNFLASQ HRVLPEGATC DEEEDEVQLR STRRATSLEL PMAMRFRHLK KTSKEAVGVY 2580
RSAIHGRGLF CKRNIDAGEM VIEYSGIVIR SVLTDKREKF YDGKGIGCYM FRMDDFDVVD 2640
ATMHGNAARF INHSCEPNCF SRVIHVEGQK HIVIFALRRI LRGEELTYDY KFPIEDASNK 2700
LPCNCGAKRC RRFLN 2715 
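The sequence block is printed in groups of ten residues, with the running residue count at the end of each line. A minimal sketch for turning that layout into one contiguous string follows. The function name `parse_sequence_block` is hypothetical, and the code assumes every non-empty line ends with its running count, which it uses as a consistency check.

```python
def parse_sequence_block(block: str) -> str:
    residues = ""
    for line in block.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        # All fields but the last are residue groups; the last is the count.
        *groups, running_count = fields
        residues += "".join(groups)
        if len(residues) != int(running_count):
            raise ValueError(
                f"expected {running_count} residues, have {len(residues)}")
    return residues


block = """\
MAAAAGGGSC PGPGSARGRF PGRPRGAGGG GGRGGRGNGA ERVRVALRRG GGATGPGGAE 60
PGEDTALLRL LGLRRGLRRL RRLWAGPRVQ RGRGRGRGRG WGPSRGCVPE EESSDGESDE 120
"""
sequence = parse_sequence_block(block)
# Positions in this record are 1-based, so residue 2 is at index 1.
print(len(sequence), sequence[1])
```

Applied to the full block, the returned string should have the length given under Protein Length, and `sequence[position - 1]` gives the residue at any 1-based position, for example to confirm that each listed modification site is a lysine.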
Gene Ontology
 GO:0035097; C:histone methyltransferase complex; IDA:MGI.
 GO:0003677; F:DNA binding; IEA:UniProtKB-KW.
 GO:0042800; F:histone methyltransferase activity (H3-K4 specific); IDA:MGI.
 GO:0003700; F:sequence-specific DNA binding transcription factor activity; NAS:UniProtKB.
 GO:0008270; F:zinc ion binding; NAS:UniProtKB.
 GO:0048096; P:chromatin-mediated maintenance of transcription; NAS:UniProtKB.
 GO:0016458; P:gene silencing; IEA:Compara.
 GO:0080182; P:histone H3-K4 trimethylation; IEA:Compara.
 GO:0009994; P:oocyte differentiation; IEA:Compara.
 GO:0001541; P:ovarian follicle development; IEA:Compara.
 GO:0030728; P:ovulation; IEA:Compara.
 GO:0051569; P:regulation of histone H3-K4 methylation; IEA:Compara.
 GO:0006351; P:transcription, DNA-dependent; IEA:UniProtKB-KW. 
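Each Gene Ontology line carries three semicolon-separated fields: the GO identifier, a term prefixed by its aspect letter (C, F or P for cellular component, molecular function or biological process), and an evidence code with its source. The sketch below groups terms by aspect. `parse_go_line` and `group_by_aspect` are hypothetical helper names, and the code assumes that term names never contain a semicolon, as is the case for the lines in this entry.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

ASPECTS = {
    "C": "cellular component",
    "F": "molecular function",
    "P": "biological process",
}


def parse_go_line(line: str) -> Tuple[str, str, str, str, str]:
    # Drop surrounding whitespace and the trailing full stop, then split.
    parts = [p.strip() for p in line.strip().rstrip(".").split(";")]
    go_id, term, evidence = parts
    aspect, name = term.split(":", 1)
    code, source = evidence.split(":", 1)
    return go_id, aspect, name, code, source


def group_by_aspect(lines: Iterable[str]) -> Dict[str, List[Tuple[str, str]]]:
    grouped: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for line in lines:
        go_id, aspect, name, _code, _source = parse_go_line(line)
        grouped[ASPECTS.get(aspect, aspect)].append((go_id, name))
    return dict(grouped)


go_lines = [
    " GO:0035097; C:histone methyltransferase complex; IDA:MGI.",
    " GO:0003677; F:DNA binding; IEA:UniProtKB-KW.",
    " GO:0006351; P:transcription, DNA-dependent; IEA:UniProtKB-KW. ",
]
for aspect, terms in group_by_aspect(go_lines).items():
    print(aspect, terms)
```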
Interpro
 IPR017956; AT_hook_DNA-bd_motif.
 IPR003889; FYrich_C.
 IPR003888; FYrich_N.
 IPR015722; Histone-lysine_MeTfrase.
 IPR016569; MeTrfase_trithorax.
 IPR003616; Post-SET_dom.
 IPR001214; SET_dom.
 IPR002857; Znf_CXXC.
 IPR011011; Znf_FYVE_PHD.
 IPR001965; Znf_PHD.
 IPR019787; Znf_PHD-finger.
 IPR013083; Znf_RING/FYVE/PHD. 
Pfam
 PF05965; FYRC
 PF05964; FYRN
 PF00628; PHD
 PF00856; SET
 PF02008; zf-CXXC 
SMART
 SM00384; AT_hook
 SM00542; FYRC
 SM00541; FYRN
 SM00249; PHD
 SM00508; PostSET
 SM00317; SET 
PROSITE
 PS51543; FYRC
 PS51542; FYRN
 PS50868; POST_SET
 PS50280; SET
 PS51058; ZF_CXXC
 PS01359; ZF_PHD_1
 PS50016; ZF_PHD_2 
PRINTS