CPLM 1.0 - Compendium of Protein Lysine Modification
Tag    Content
CPLM ID CPLM-026044
UniProt Accession
Genbank Protein ID
Genbank Nucleotide ID
Protein Name
 Set1, isoform A 
Protein Synonyms/Alias
 Set1, isoform B; Set1, isoform C; Set1, isoform D; Set1, isoform E; Set1, isoform F; Set1, isoform G; Set1, isoform H; Set1, isoform I 
Gene Name
 Set1 
Gene Synonyms/Alias
 CG40351; Dmel_CG40351 
Created Date
 July 27, 2013 
Organism
 Drosophila melanogaster (Fruit fly) 
NCBI Taxa ID
 7227 
Lysine Modification
Position    Peptide            Type           References
530         TRIALIFKGKTFGNA    acetylation    [1]
Reference
 [1] Proteome-wide mapping of the Drosophila acetylome demonstrates a high degree of conservation of lysine acetylation.
 Weinert BT, Wagner SA, Horn H, Henriksen P, Liu WR, Olsen JV, Jensen LJ, Choudhary C.
 Sci Signal. 2011 Jul 26;4(183):ra48. [PMID: 21791702]
Functional Description
  
Sequence Annotation
  
Keyword
 Complete proteome; Methyltransferase; Nucleus; Reference proteome; Transferase. 
Sequence Source
 UniProt (SWISSPROT/TrEMBL); GenBank; EMBL 
Protein Length
 1641 AA 
Protein Sequence
MQDVRNINLV NNSSNSHDSS LANSKMPRNF KLLSDPQLVK CGTRLYRYDG LMPGDPSYPT 60
ITPRDPRNPL IRIRARAVEP LMLLIPRFVI DSDYVGQPPA VEVTIVNLND NIDKQFLASM 120
LDKCGTSDEI NIYHHPITNK HLGIARIVFD STKGARQFVE KYNQKSVMGK ILDVFCDPFG 180
ATLKKSLESL TNSVAGKQLI GPKVTPQWTF QQAALEDTEF IHGYPEKNGE HIKDIYTTQT 240
NHEIPNRSRD RNWNRDKERE RDRHFKERSR HSSERSYDRD RGMRENVGTS IRRRRTFYRR 300
RSSDISPEDS RDILIMTRER SRDSDSRPRD YCRSRERESF RDRKRSHEKG RDQPREKREH 360
YYNSSKDREY RGRDRDRSAE IDQRDRGSLK YCSRYSLHEY IETDVRRSSN TISSYYSASS 420
LPIASHGFNS CSFPSIENIK TWSDRRAWTA FQPDFHPVQP PPPPPEEIDN WDEEEHDKNS 480
IVPTHYGCMA KLQPPVPSNV NFATKLQSVT QPNSDPGTVD LDTRIALIFK GKTFGNAPPF 540
LQMDSSDSET DQGKPEVFSD VNSDSNNSEN KKRSCEKNNK VLHQPNEASD ISSDEELIGK 600
KDKSKLSLIC EKEVNDDNMS LSSLSSQEDP IQTKEGAEYK SIMSSYMYSH SNQNPFYYHA 660
SGYGHYLSGI PSESASRLFS NGAYVHSEYL KAVASFNFDS FSKPYDYNKG ALSDQNDGIR 720
QKVKQVIGYI VEELKQILKR DVNKRMIEIT AFKHFETWWD EHTSKARSKP LFEKADSTVN 780
TPLNCIKDTS YNEKNPDINL LINAHREVAD FQSYSSIGLR AAMPKLPSFR RIRKHPSPIP 840
TKRNFLERDL SDQEEMVQRS DSDKEDSNVE ISDTARSKIK GPVPIQESDS KSHTSGLNSK 900
RKGSASSFFS SSSSSTSSEA EYEAIDCVEK ARTSEEDSPR GYGQRNLNQR TTTIRNRNLV 960
GTMDVINVRN LCSGSNEFKK ENVTKRTKKN IYSDTDEDND RTLFPALKEK NISTILSDLE 1020
EISKDSCIGL DENGIEPTIL RKIPNTPKLN EECRRSLTPV PPPGYNEEEI KKKVDCKQKP 1080
SFEYDRIYSD SEEEKEYQER RKRNTEYMAQ MEREFLEEQE KRIEKSLDKN LQSPNNIVKN 1140
NNSPRNKNDE TRKTAISQTR SCFESASKVD TTLVNIISVE NDINEFGPHE EGDVLTNGCN 1200
KMYTNSKGKT KRTQSPVYSE GGSSQASQAS QVALEHCYSL PPHSVSLGDY PSGKVNETKN 1260
ILKREAENIA IVSQMTRTGP GRPRKDPICI QKKKRDLAPR MSNVKSKMTP NGDEWPDLAH 1320
KNVHFVPCDM YKTRDQNEEM VILYTFLTKG IDAEDINFIK MSYLDHLHKE PYAMFLNNTH 1380
WVDHCTTDRA FWPPPSKKRR KDDELIRHKT GCARTEGFYK LDVREKAKHK YHYAKANTED 1440
SFNEDRSDEP TALTNHHHNK LISKMQGISR EARSNQRRLL TAFGSMGESE LLKFNQLKFR 1500
KKQLKFAKSA IHDWGLFAME PIAADEMVIE YVGQMIRPVV ADLRETKYEA IGIGSSYLFR 1560
IDMETIIDAT KCGNLARFIN HSCNPNCYAK VITIESEKKI VIYSKQPIGI NEEITYDYKF 1620
PLEDEKIPCL CGAQGCRGTL N 1641 
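
The modified position listed under Lysine Modification can be checked against the sequence block above. The following is a minimal sketch, not part of the CPLM record: the helper names `parse_block` and `window` are hypothetical, and the 7-residue flank is an assumption inferred from the 15-residue peptide shown in the record. For brevity it uses only the sequence line covering residues 481 to 540.

```python
def parse_block(lines):
    """Join sequence lines, dropping spaces and the trailing residue counts."""
    residues = []
    for line in lines:
        for token in line.split():
            # Residue groups are alphabetic; the running count at line end is numeric.
            if token.isalpha():
                residues.append(token)
    return "".join(residues)


def window(seq, position, flank=7, offset=0):
    """Return the residues around a 1-based position.

    `offset` is the number of residues that precede `seq` in the full protein
    (0 when `seq` is the complete sequence). `flank=7` is an assumption based
    on the 15-residue peptide in the record.
    """
    i = position - 1 - offset
    return seq[max(0, i - flank): i + flank + 1]


# Sequence line from the record that covers residues 481-540.
segment_lines = [
    "IVPTHYGCMA KLQPPVPSNV NFATKLQSVT QPNSDPGTVD LDTRIALIFK GKTFGNAPPF 540",
]
segment = parse_block(segment_lines)

site = 530
residue = segment[site - 1 - 480]
peptide = window(segment, site, offset=480)

print(residue)   # residue at position 530
print(peptide)   # 15-residue window centred on position 530
```

Passing the full 28-line sequence block to `parse_block` and calling `window` with `offset=0` should give the same peptide, and the parsed length can be compared with the Protein Length field.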
Gene Ontology
 GO:0000791; C:euchromatin; IDA:FlyBase.
 GO:0005700; C:polytene chromosome; IDA:FlyBase.
 GO:0048188; C:Set1C/COMPASS complex; IDA:FlyBase.
 GO:0035327; C:transcriptionally active chromatin; IDA:FlyBase.
 GO:0042800; F:histone methyltransferase activity (H3-K4 specific); IDA:FlyBase.
 GO:0003676; F:nucleic acid binding; IEA:InterPro.
 GO:0000166; F:nucleotide binding; IEA:InterPro.
 GO:0044648; P:histone H3-K4 dimethylation; IDA:FlyBase.
 GO:0080182; P:histone H3-K4 trimethylation; IDA:FlyBase.
 GO:0046427; P:positive regulation of JAK-STAT cascade; IMP:FlyBase. 
Interpro
 IPR024657; COMPASS_Set1_N-SET.
 IPR012677; Nucleotide-bd_a/b_plait.
 IPR003616; Post-SET_dom.
 IPR000504; RRM_dom.
 IPR001214; SET_dom. 
Pfam
 PF11764; N-SET
 PF00076; RRM_1
 PF00856; SET 
SMART
 SM00508; PostSET
 SM00360; RRM
 SM00317; SET 
PROSITE
 PS50868; POST_SET
 PS50102; RRM
 PS50280; SET 
PRINTS