CPLM 1.0 - Compendium of Protein Lysine Modification
Tag  Content
CPLM ID CPLM-019425
UniProt Accession
Genbank Protein ID
Genbank Nucleotide ID
Protein Name
 Histone-lysine N-methyltransferase, H3 lysine-36 and H4 lysine-20 specific 
Protein Synonyms/Alias
 Androgen receptor coactivator 267 kDa protein; Androgen receptor-associated protein of 267 kDa; H3-K36-HMTase; H4-K20-HMTase; Lysine N-methyltransferase 3B; Nuclear receptor-binding SET domain-containing protein 1; NR-binding SET domain-containing protein 
Gene Name
 NSD1 
Gene Synonyms/Alias
 ARA267; KMT3B 
Created Date
 July 27, 2013 
Organism
 Homo sapiens (Human) 
NCBI Taxa ID
 9606 
Lysine Modification
Position  Peptide          Type            References
865       KHVLSELKELSYRSL  ubiquitination  [1, 2]
886       SGTSKPSKPLLFSSA  ubiquitination  [1, 2]
2086      GFLGVRPKNQPIATE  acetylation     [3]
2095      QPIATEEKSKKFKKK  acetylation     [3]
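Each peptide listed for a modification site appears to be a 15-residue window centered on the modified lysine, with 7 flanking residues on each side. That window size is inferred from the four rows in this entry and is not stated by CPLM. A minimal Python sketch of that reading, using hypothetical helper names (`central_residue`, `window_bounds`), could look like this:

```python
# Site rows copied from this entry: (position, peptide, modification type).
SITES = [
    (865, "KHVLSELKELSYRSL", "ubiquitination"),
    (886, "SGTSKPSKPLLFSSA", "ubiquitination"),
    (2086, "GFLGVRPKNQPIATE", "acetylation"),
    (2095, "QPIATEEKSKKFKKK", "acetylation"),
]


def central_residue(peptide):
    """Return the middle residue of an odd-length peptide window."""
    return peptide[len(peptide) // 2]


def window_bounds(position, peptide):
    """Return the 1-based (start, end) sequence coordinates of the window.

    Assumes the window is symmetric around `position`, which is an
    inference from this entry rather than a documented CPLM rule.
    """
    flank = len(peptide) // 2
    return position - flank, position + flank


for position, peptide, mod_type in SITES:
    start, end = window_bounds(position, peptide)
    print(position, central_residue(peptide), f"{start}-{end}", mod_type)
```

If the assumption holds, every row should report K as the central residue, and the window coordinates can be compared against the protein sequence given later in the entry.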
Reference
 [1] A proteome-wide, quantitative survey of in vivo ubiquitylation sites reveals widespread regulatory roles.
 Wagner SA, Beli P, Weinert BT, Nielsen ML, Cox J, Mann M, Choudhary C.
 Mol Cell Proteomics. 2011 Oct;10(10):M111.013284. [PMID: 21890473]
 [2] hCKSAAP_UbSite: improved prediction of human ubiquitination sites by exploiting amino acid pattern and properties.
 Chen Z, Zhou Y, Song J, Zhang Z.
 Biochim Biophys Acta. 2013 Aug;1834(8):1461-7. [PMID: 23603789]
 [3] Regulation of cellular metabolism by protein lysine acetylation.
 Zhao S, Xu W, Jiang W, Yu W, Lin Y, Zhang T, Yao J, Zhou L, Zeng Y, Li H, Li Y, Shi J, An W, Hancock SM, He F, Qin L, Chin J, Yang P, Chen X, Lei Q, Xiong Y, Guan KL.
 Science. 2010 Feb 19;327(5968):1000-4. [PMID: 20167786]
Functional Description
 Histone methyltransferase. Preferentially methylates 'Lys-36' of histone H3 and 'Lys-20' of histone H4 (in vitro). Transcriptional intermediary factor capable of either negatively or positively influencing transcription, depending on the cellular context. 
Sequence Annotation
 DOMAIN 323 388 PWWP 1.
 DOMAIN 1756 1818 PWWP 2.
 DOMAIN 1890 1940 AWS.
 DOMAIN 1942 2059 SET.
 DOMAIN 2066 2082 Post-SET.
 ZN_FING 1543 1589 PHD-type 1.
 ZN_FING 1590 1646 PHD-type 2.
 ZN_FING 1707 1751 PHD-type 3.
 ZN_FING 2118 2165 PHD-type 4; atypical.
 REGION 1952 1954 S-adenosyl-L-methionine binding.
 REGION 1994 1997 S-adenosyl-L-methionine binding.
 REGION 2020 2021 S-adenosyl-L-methionine binding.
 REGION 2060 2066 Inhibits enzyme activity in the absence
 BINDING 2065 2065 S-adenosyl-L-methionine.
 BINDING 2071 2071 S-adenosyl-L-methionine.
 MOD_RES 483 483 Phosphoserine.
 MOD_RES 486 486 Phosphoserine.
 MOD_RES 766 766 Phosphoserine.
 MOD_RES 2462 2462 Phosphothreonine.
 MOD_RES 2471 2471 Phosphoserine.  
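The annotation lines above follow a "KEY start end description" layout with 1-based, inclusive coordinates. As a rough sketch, assuming that layout holds for every line, the following Python parses a few of them and looks up which features cover a given residue. The names `Feature`, `parse_feature` and `features_at` are hypothetical and not part of any CPLM or UniProt tool:

```python
from typing import NamedTuple


class Feature(NamedTuple):
    key: str    # feature type, e.g. DOMAIN or ZN_FING
    start: int  # first residue, 1-based
    end: int    # last residue, inclusive
    note: str   # free-text description


def parse_feature(line):
    """Split one annotation line into key, start, end and description."""
    key, start, end, note = line.split(None, 3)
    return Feature(key, int(start), int(end), note.strip().rstrip("."))


def features_at(features, position):
    """Return the features whose range contains the given residue."""
    return [f for f in features if f.start <= position <= f.end]


# A subset of the lines from this entry.
LINES = [
    "DOMAIN 1942 2059 SET.",
    "DOMAIN 2066 2082 Post-SET.",
    "ZN_FING 2118 2165 PHD-type 4; atypical.",
    "BINDING 2065 2065 S-adenosyl-L-methionine.",
]
FEATURES = [parse_feature(line) for line in LINES]

for site in (2070, 2086, 2095):
    print(site, [f.note for f in features_at(FEATURES, site)])
```

With only this subset loaded, residue 2070 falls in the Post-SET domain, while the acetylated lysines at 2086 and 2095 lie between the Post-SET domain and the fourth PHD-type zinc finger.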
Keyword
 3D-structure; Activator; Alternative splicing; Chromatin regulator; Chromosomal rearrangement; Chromosome; Complete proteome; Disease mutation; Metal-binding; Methyltransferase; Nucleus; Phosphoprotein; Polymorphism; Proto-oncogene; Reference proteome; Repeat; Repressor; S-adenosyl-L-methionine; Transcription; Transcription regulation; Transferase; Zinc; Zinc-finger. 
Sequence Source
 UniProt (SWISSPROT/TrEMBL); GenBank; EMBL 
Protein Length
 2696 AA 
Protein Sequence
MDQTCELPRR NCLLPFSNPV NLDAPEDKDS PFGNGQSNFS EPLNGCTMQL STVSGTSQNA 60
YGQDSPSCYI PLRRLQDLAS MINVEYLNGS ADGSESFQDP EKSDSRAQTP IVCTSLSPGG 120
PTALAMKQEP SCNNSPELQV KVTKTIKNGF LHFENFTCVD DADVDSEMDP EQPVTEDESI 180
EEIFEETQTN ATCNYETKSE NGVKVAMGSE QDSTPESRHG AVKSPFLPLA PQTETQKNKQ 240
RNEVDGSNEK AALLPAPFSL GDTNITIEEQ LNSINLSFQD DPDSSTSTLG NMLELPGTSS 300
SSTSQELPFC QPKKKSTPLK YEVGDLIWAK FKRRPWWPCR ICSDPLINTH SKMKVSNRRP 360
YRQYYVEAFG DPSERAWVAG KAIVMFEGRH QFEELPVLRR RGKQKEKGYR HKVPQKILSK 420
WEASVGLAEQ YDVPKGSKNR KCIPGSIKLD SEEDMPFEDC TNDPESEHDL LLNGCLKSLA 480
FDSEHSADEK EKPCAKSRAR KSSDNPKRTS VKKGHIQFEA HKDERRGKIP ENLGLNFISG 540
DISDTQASNE LSRIANSLTG SNTAPGSFLF SSCGKNTAKK EFETSNGDSL LGLPEGALIS 600
KCSREKNKPQ RSLVCGSKVK LCYIGAGDEE KRSDSISICT TSDDGSSDLD PIEHSSESDN 660
SVLEIPDAFD RTENMLSMQK NEKIKYSRFA ATNTRVKAKQ KPLISNSHTD HLMGCTKSAE 720
PGTETSQVNL SDLKASTLVH KPQSDFTNDA LSPKFNLSSS ISSENSLIKG GAANQALLHS 780
KSKQPKFRSI KCKHKENPVM AEPPVINEEC SLKCCSSDTK GSPLASISKS GKVDGLKLLN 840
NMHEKTRDSS DIETAVVKHV LSELKELSYR SLGEDVSDSG TSKPSKPLLF SSASSQNHIP 900
IEPDYKFSTL LMMLKDMHDS KTKEQRLMTA QNLVSYRSPG RGDCSTNSPV GVSKVLVSGG 960
STHNSEKKGD GTQNSANPSP SGGDSALSGE LSASLPGLLS DKRDLPASGK SRSDCVTRRN 1020
CGRSKPSSKL RDAFSAQMVK NTVNRKALKT ERKRKLNQLP SVTLDAVLQG DRERGGSLRG 1080
GAEDPSKEDP LQIMGHLTSE DGDHFSDVHF DSKVKQSDPG KISEKGLSFE NGKGPELDSV 1140
MNSENDELNG VNQVVPKKRW QRLNQRRTKP RKRMNRFKEK ENSECAFRVL LPSDPVQEGR 1200
DEFPEHRTPS ASILEEPLTE QNHADCLDSA GPRLNVCDKS SASIGDMEKE PGIPSLTPQA 1260
ELPEPAVRSE KKRLRKPSKW LLEYTEEYDQ IFAPKKKQKK VQEQVHKVSS RCEEESLLAR 1320
GRSSAQNKQV DENSLISTKE EPPVLEREAP FLEGPLAQSE LGGGHAELPQ LTLSVPVAPE 1380
VSPRPALESE ELLVKTPGNY ESKRQRKPTK KLLESNDLDP GFMPKKGDLG LSKKCYEAGH 1440
LENGITESCA TSYSKDFGGG TTKIFDKPRK RKRQRHAAAK MQCKKVKNDD SSKEIPGSEG 1500
ELMPHRTATS PKETVEEGVE HDPGMPASKK MQGERGGGAA LKENVCQNCE KLGELLLCEA 1560
QCCGAFHLEC LGLTEMPRGK FICNECRTGI HTCFVCKQSG EDVKRCLLPL CGKFYHEECV 1620
QKYPPTVMQN KGFRCSLHIC ITCHAANPAN VSASKGRLMR CVRCPVAYHA NDFCLAAGSK 1680
ILASNSIICP NHFTPRRGCR NHEHVNVSWC FVCSEGGSLL CCDSCPAAFH RECLNIDIPE 1740
GNWYCNDCKA GKKPHYREIV WVKVGRYRWW PAEICHPRAV PSNIDKMRHD VGEFPVLFFG 1800
SNDYLWTHQA RVFPYMEGDV SSKDKMGKGV DGTYKKALQE AAARFEELKA QKELRQLQED 1860
RKNDKKPPPY KHIKVNRPIG RVQIFTADLS EIPRCNCKAT DENPCGIDSE CINRMLLYEC 1920
HPTVCPAGGR CQNQCFSKRQ YPEVEIFRTL QRGWGLRTKT DIKKGEFVNE YVGELIDEEE 1980
CRARIRYAQE HDITNFYMLT LDKDRIIDAG PKGNYARFMN HCCQPNCETQ KWSVNGDTRV 2040
GLFALSDIKA GTELTFNYNL ECLGNGKTVC KCGAPNCSGF LGVRPKNQPI ATEEKSKKFK 2100
KKQQGKRRTQ GEITKEREDE CFSCGDAGQL VSCKKPGCPK VYHADCLNLT KRPAGKWECP 2160
WHQCDICGKE AASFCEMCPS SFCKQHREGM LFISKLDGRL SCTEHDPCGP NPLEPGEIRE 2220
YVPPPVPLPP GPSTHLAEQS TGMAAQAPKM SDKPPADTNQ MLSLSKKALA GTCQRPLLPE 2280
RPLERTDSRP QPLDKVRDLA GSGTKSQSLV SSQRPLDRPP AVAGPRPQLS DKPSPVTSPS 2340
SSPSVRSQPL ERPLGTADPR LDKSIGAASP RPQSLEKTSV PTGLRLPPPD RLLITSSPKP 2400
QTSDRPTDKP HASLSQRLPP PEKVLSAVVQ TLVAKEKALR PVDQNTQSKN RAALVMDLID 2460
LTPRQKERAA SPHQVTPQAD EKMPVLESSS WPASKGLGHM PRAVEKGCVS DPLQTSGKAA 2520
APSEDPWQAV KSLTQARLLS QPPAKAFLYE PTTQASGRAS AGAEQTPGPL SQSPGLVKQA 2580
KQMVGGQQLP ALAAKSGQSF RSLGKAPASL PTEEKKLVTT EQSPWALGKA SSRAGLWPIV 2640
AGQTLAQSCW SAGSTQTLAQ TCWSLGRGQD PKPEQNTLPA LNQAPSSHKC AESEQK 2696 
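The sequence above is printed in blocks of 10 residues with a running residue count at the end of each line. To work with it as one string, the spaces and counters have to be removed. A minimal Python sketch follows, assuming every non-alphabetic token is a counter. The function names are hypothetical:

```python
def join_sequence(block):
    """Concatenate the residue groups of a numbered sequence block.

    Tokens made only of letters are kept as residues; anything else
    (the running position counters) is dropped.
    """
    residues = []
    for line in block.splitlines():
        for token in line.split():
            if token.isalpha():
                residues.append(token.upper())
    return "".join(residues)


def residue_at(sequence, position):
    """Return the residue at a 1-based sequence position."""
    return sequence[position - 1]


# First line of the sequence from this entry.
FIRST_LINE = (
    "MDQTCELPRR NCLLPFSNPV NLDAPEDKDS PFGNGQSNFS EPLNGCTMQL STVSGTSQNA 60"
)
sequence = join_sequence(FIRST_LINE)
print(len(sequence), residue_at(sequence, 1), residue_at(sequence, 28))
```

Applied to the full block, the joined length can be compared with the stated protein length of 2696 AA, and `residue_at` can confirm that each listed modification position is a lysine.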
Gene Ontology
 GO:0005694; C:chromosome; IEA:UniProtKB-SubCell.
 GO:0005634; C:nucleus; IEA:UniProtKB-SubCell.
 GO:0050681; F:androgen receptor binding; IDA:UniProtKB.
 GO:0003682; F:chromatin binding; ISS:UniProtKB.
 GO:0030331; F:estrogen receptor binding; ISS:UniProtKB.
 GO:0046975; F:histone methyltransferase activity (H3-K36 specific); IDA:UniProtKB.
 GO:0042799; F:histone methyltransferase activity (H4-K20 specific); ISS:UniProtKB.
 GO:0016922; F:ligand-dependent nuclear receptor binding; ISS:UniProtKB.
 GO:0046965; F:retinoid X receptor binding; ISS:UniProtKB.
 GO:0046966; F:thyroid hormone receptor binding; ISS:UniProtKB.
 GO:0003714; F:transcription corepressor activity; ISS:UniProtKB.
 GO:0008270; F:zinc ion binding; IDA:UniProtKB.
 GO:0001702; P:gastrulation with mouth forming second; IEA:Compara.
 GO:0000122; P:negative regulation of transcription from RNA polymerase II promoter; ISS:UniProtKB.
 GO:0045893; P:positive regulation of transcription, DNA-dependent; IDA:UniProtKB.
 GO:0006351; P:transcription, DNA-dependent; IEA:UniProtKB-KW. 
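Each Gene Ontology line above packs three fields separated by "; ": the GO identifier, an aspect letter with the term name (C for cellular component, F for molecular function, P for biological process), and an evidence code with its source. The following Python sketch splits one line into those parts. It assumes term names never contain "; ", which holds for the lines in this entry. `parse_go` is a hypothetical helper name:

```python
ASPECTS = {
    "C": "cellular component",
    "F": "molecular function",
    "P": "biological process",
}


def parse_go(line):
    """Split one GO annotation line into its labelled parts."""
    go_id, term, evidence = line.strip().rstrip(".").split("; ")
    aspect, name = term.split(":", 1)
    code, source = evidence.split(":", 1)
    return {
        "id": go_id,
        "aspect": ASPECTS[aspect],
        "term": name,
        "evidence": code,
        "source": source,
    }


example = parse_go(
    " GO:0046975; F:histone methyltransferase activity (H3-K36 specific); IDA:UniProtKB."
)
print(example)
```

Grouping the parsed records by the `evidence` field separates experimentally supported annotations (IDA) from those inferred by similarity (ISS) or assigned electronically (IEA).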
Interpro
 IPR006560; AWS.
 IPR003616; Post-SET_dom.
 IPR000313; PWWP.
 IPR001214; SET_dom.
 IPR019786; Zinc_finger_PHD-type_CS.
 IPR011011; Znf_FYVE_PHD.
 IPR001965; Znf_PHD.
 IPR019787; Znf_PHD-finger.
 IPR001841; Znf_RING.
 IPR013083; Znf_RING/FYVE/PHD. 
Pfam
 PF00628; PHD
 PF00855; PWWP
 PF00856; SET 
SMART
 SM00570; AWS
 SM00249; PHD
 SM00508; PostSET
 SM00293; PWWP
 SM00184; RING
 SM00317; SET 
PROSITE
 PS51215; AWS
 PS50868; POST_SET
 PS50812; PWWP
 PS50280; SET
 PS01359; ZF_PHD_1
 PS50016; ZF_PHD_2 
PRINTS