CPLM 1.0 - Compendium of Protein Lysine Modification
Tag	Content
CPLM ID CPLM-044075
UniProt Accession
Genbank Protein ID
Genbank Nucleotide ID
Protein Name
 Zn finger homeodomain 2, isoform B 
Protein Synonyms/Alias
  
Gene Name
 zfh2 
Gene Synonyms/Alias
 CG1449; Dmel_CG1449 
Created Date
 July 27, 2013 
Organism
 Drosophila melanogaster (Fruit fly) 
NCBI Taxa ID
 7227 
Lysine Modification
Position  Peptide          Type         References
1299      SDGPVGIKQERLEQE  acetylation  [1]
1837      QKANLPMKVVKHWFR  acetylation  [1]
2115      KKMQIVGKTFEKNVA  acetylation  [1]
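Each of the three rows above pairs a lysine position with a 15-residue peptide in which the modified lysine appears to sit at the centre (7 residues on either side). The sketch below checks that layout and derives the protein coordinates each peptide covers. The helper name window_bounds and the constant FLANK are hypothetical, and the fixed 15-residue window is an assumption inferred from this record only; sites near a protein terminus may be formatted differently.

```python
# Hypothetical helper. The 15-residue window with the modified lysine at
# the centre is an assumption inferred from the three rows of this record.
FLANK = 7


def window_bounds(position, peptide):
    """Return the (start, end) protein coordinates covered by the peptide."""
    if len(peptide) != 2 * FLANK + 1:
        raise ValueError("expected a 15-residue peptide")
    if peptide[FLANK] != "K":
        raise ValueError("central residue is not lysine")
    return position - FLANK, position + FLANK


# Position and peptide pairs copied from the Lysine Modification table.
sites = [
    (1299, "SDGPVGIKQERLEQE"),
    (1837, "QKANLPMKVVKHWFR"),
    (2115, "KKMQIVGKTFEKNVA"),
]

bounds = [window_bounds(pos, pep) for pos, pep in sites]
for (pos, pep), (start, end) in zip(sites, bounds):
    print(f"K{pos}: residues {start}-{end} ({pep})")
```

The coordinates are 1-based, matching the residue counts printed at the end of each line of the Protein Sequence block.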
Reference
 [1] Proteome-wide mapping of the Drosophila acetylome demonstrates a high degree of conservation of lysine acetylation.
 Weinert BT, Wagner SA, Horn H, Henriksen P, Liu WR, Olsen JV, Jensen LJ, Choudhary C.
 Sci Signal. 2011 Jul 26;4(183):ra48. [PMID: 21791702]
Functional Description
  
Sequence Annotation
  
Keyword
 Complete proteome; DNA-binding; Homeobox; Nucleus; Reference proteome. 
Sequence Source
 UniProt (SWISSPROT/TrEMBL); GenBank; EMBL 
Protein Length
 3003 AA 
Protein Sequence
MSSFDVETFN GKIVYNLDGS AHIIATDNTN GGGSGSGQNC YGSTTNSLKN LSKDKGRGQE 60
EKDIEHPSQY HREQSDNKRQ EEAVDNRPGV ESLGSACYKS SPKIHSFRVV SAQDANSTCQ 120
DQIRAFKIQK PILMCFICKL SFGNVKSFSL HANTEHRLNL EELDQQLLNR EYSSAIIQRN 180
MDEKPQISFL QPLANNDASA DTNDTEKLQT ATEGSDATLP SSPQPVFRNV SELEPENKQE 240
TEQNRLLNQD REQEPESDQH TSSSKMAAPS AYIPLSSPKV AGKLTVKFGS LNSATAKTNN 300
LSKVSSTSSP PSTYASGEVL SPSTDNISNH KSTHCNQETE PPSSSSSEVE MKIGSMSTSP 360
QTNDSDVPCS GFLQMQHMTT GGAYTPQVSS FHASLAALAA NESNDNRVKL ITEFLQQQLQ 420
QHQSSLFPSP CPDHPDLNGV DCKTCELLDI QQRSKSPSSS HHQFSQSLPQ LQIQSQPQQT 480
PHRSPCSNSV ALPVSPSASS VASVGNASTA TSSFTIGACS EHINGRPQGV DCARCEMLLN 540
SARLNSGVQM STRNSCKTLK CPQCNWHYKY QETLEIHMRE KHPDGESACG YCLAGQQHPR 600
LARGESYSCG YKPYRCEICN YSTTTKGNLS IHMQSDKHLN NMQELNSSQN MVAAAAAAAV 660
TGKLLLSSSS PQVTAACPSN SGSGAGSGSS NIVGGTASLS GNATPSVTGA NSSNANAGSN 720
TNNAGTKPKP SFRCDICSYD TSVARNLRIH MTSEKHTHNM AVLQNNIKHI QAFNFLQQQQ 780
QSGTGNIASH SSGSFMPEVA LADLAYNQAL MIQLLHQQQQ HQQSANTKLS PSSSPVSTPD 840
QFSFSPKPIK LNHGTGAAMG IGMAMGMGMS HSNEVSCELS GDPHPLTKTD KWPMAFYSCL 900
VCDCYSTNNL DDLNQHLLLD RSRQSSSASS EIMVIHNNNY ICRLCNYKTN LKANFQLHSK 960
TDKHLQKLNF INHIREGGPQ NEYKMQYQQQ QLAANVVQLK CNCCDFHTNS IQKLSLHTQQ 1020
MRHDTMRMIF QHLLYIVQQS EMHNKSSGSA EDDPQCACPD EDQQLQLQSS KKLLLCQLCN 1080
FTAQNIHEMV QHVKGIRHLQ VEQFICLQRR SENQEIPALN EVFKVTEWVM ENEDVSLAPG 1140
LNLARTTTND ATTDASYAAA SSAAVPAIPD VSMFSPTSPS SCATSCDKNL SQIVLPNVNN 1200
LGSGVPTTVF KCNLCEYFVQ SKSEIAAHIE TEHSCAESDE FITIPTNTAA LQAFQTAVAA 1260
AALAAVHQRC AVINPPTQDT VDEDKDLDTN VSDGPVGIKQ ERLEQEVDRT TSMDVTKDLA 1320
SQATDFGAPE SPKVAETEVG VQCPLCLENH FREKQYLEDH LTSVHSVTRD GLSRLLLLVD 1380
QKALKKESTD IACPTDKAPY ANTNALERAP TPIENTCNVS LIKSTSANPS QSVSLQGLSC 1440
QQCEASFKHE EQLLKHAQQN QHFSLQNGEY LCLAASHISR PCFMTFRTIP TMISHFQDLH 1500
MSLIISERHV YKYRCKQCSL AFKTQEKLTT HMLYHSMRDA TKCSFCQRNF RSTQALQKHM 1560
EQAHAEDGTP STRTNSPQTP MLSTEETHKH LLAESHAVER VSGSDVSPIE LETHLNKETR 1620
HLSPTPMSLD SQSHQKHLAT FAALLKQQQC NSDAGGLHPE ALSMSTGEMP PQLQGLQNLQ 1680
HIQQHFGAVA AAAGLPINPV DMLNIMQFHH LMSLNFMNLA PPLVFGANAA GNAVSGPSAL 1740
NNSITTSTAT SASGLGDTHL TSGVSSIPVD SGKATAVPPQ TQLNANANSQ LASNQKRART 1800
RITDDQLKIL RAHFDINNSP SEESIMEMSQ KANLPMKVVK HWFRNTLFKE RQRNKDSPYN 1860
FNNPPSTTLN LEEYERTGQA KVTPLNDTCS VAVTGPMTSS TISLPPSGNI NLSSKENATS 1920
KVLAAGKANA SGPVTFSATV PVSTPLSRPE STNSSGNISD YIGNNIFFGQ LGSKEQILPY 1980
SLDGQIKSEP QDDMIGATDF AYQTKQHSSF SFLKQQQDLV DPPEQCLTNQ NADTAQDQSL 2040
LAGSSLASNC QSQQQINIFE TKSESGSSDV LSRPPSPNSG AAGNVYGSMN DLLNQQLENM 2100
GSNMGPPKKM QIVGKTFEKN VAPMVTSGSV STQFESNSSN SSSSSSSTSG GKRANRTRFT 2160
DYQIKVLQEF FENNSYPKDS DLEYLSKLLL LSPRVIVVWF QNARQKQRKI YENQPNNTLF 2220
ENEETKKQNI NYACKKCNLV FQRYYELIRH QKNHCFKEEN NKKSAKAQIA AAQIAQNLSS 2280
EDSNSSMDIH HVGICPPGSA VASHTLSTPG SAAPLPGQYT QHSFGALPSP QHLFAKSSSL 2340
TDFSPSTTPT PPQRERSNSL DQIQRPPKFD CDKCELNFNQ LEKLREHQLL HLMNPGNICS 2400
DVGQNSNPEA NFGPFGSILQ SLQQAAAQQQ QQHHQQPPTK KRKYSDCSSN ADEMQSLSEL 2460
EASQKKHEYL YKYFMQNETS QEVKQQFLMQ QQQKKLEQGN ECDFELDFLT NFYQQNELKK 2520
VSNYDFLLQY YRTHEEAKSS QQHTFSSSKK PTIEFLLQYY QLNESKKFFQ LVASPQIIPD 2580
VPGYKPSLRI PKSTSDEAPY IGETSLEQAT ELQREKQDEQ LRIDRPSEEN DLSMNKNKVE 2640
NINNNNINVD QSNLTETNGG VPSVETKEEC TQESSLIAMD DENKYLCTRS KQKDDKEKSH 2700
YLHNLEDFLD ATMIENNSQT LTFNDDEKAC QKDELTQNSN AIEKRSSVSP VNVSSKQNKR 2760
LRTTILPEQL NFLYECYQSE SNPSRKMLEE ISKKVNLKKR VVQVWFQNSR AKDKKSRNQR 2820
HYAHISDDNS YDGSSGKEVY SDLRSNGITV DTDLETNLQD CQLCQVTQVN IRKHAFSVEH 2880
ISKMKKLLEQ TTELYAQSNG SGSEDNDSDR EKRFYNLSKA FLLQHVVTNA TSHAIHTARQ 2940
DSDVIAEGNC ILNYDTNGGD SKSHVQHNLP NEVVSEDARK IAGNQELMQQ LFNRNHITVI 3000
GGK 3003 
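The sequence block above is laid out in groups of 10 residues, with a running residue count at the end of each line. A minimal sketch of turning that layout back into one continuous string follows, assuming every all-digit token is a count and everything else is sequence. The function name join_sequence is hypothetical, and the example uses only the last two lines of the block.

```python
def join_sequence(block):
    """Concatenate residue groups, skipping the running count on each line."""
    residues = []
    for line in block.splitlines():
        for token in line.split():
            if token.isdigit():
                continue  # running residue count, not sequence
            residues.append(token)
    return "".join(residues)


# Last two lines of the Protein Sequence block (residues 2941-3003).
tail = """DSDVIAEGNC ILNYDTNGGD SKSHVQHNLP NEVVSEDARK IAGNQELMQQ LFNRNHITVI 3000
GGK 3003"""

fragment = join_sequence(tail)
print(len(fragment), fragment[-5:])
```

Applied to the full block, the length of the joined string can be compared against the stated Protein Length as a basic consistency check.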
Gene Ontology
 GO:0005634; C:nucleus; IEA:UniProtKB-SubCell.
 GO:0043565; F:sequence-specific DNA binding; IEA:InterPro.
 GO:0003700; F:sequence-specific DNA binding transcription factor activity; IEA:InterPro.
 GO:0008270; F:zinc ion binding; IEA:InterPro. 
Interpro
 IPR017970; Homeobox_CS.
 IPR001356; Homeodomain.
 IPR009057; Homeodomain-like.
 IPR007087; Znf_C2H2.
 IPR015880; Znf_C2H2-like.
 IPR013087; Znf_C2H2/integrase_DNA-bd. 
Pfam
 PF00046; Homeobox
 PF00096; zf-C2H2 
SMART
 SM00389; HOX
 SM00355; ZnF_C2H2 
PROSITE
 PS00027; HOMEOBOX_1
 PS50071; HOMEOBOX_2
 PS00028; ZINC_FINGER_C2H2_1
 PS50157; ZINC_FINGER_C2H2_2 
PRINTS