CPLM 1.0 - Compendium of Protein Lysine Modification
Tag    Content
CPLM ID CPLM-034942
UniProt Accession
Genbank Protein ID
Genbank Nucleotide ID
  
Protein Name
 Protein Mga 
Protein Synonyms/Alias
  
Gene Name
 Mga 
Gene Synonyms/Alias
  
Created Date
 July 27, 2013 
Organism
 Rattus norvegicus (Rat) 
NCBI Taxa ID
 10116 
Lysine Modification
Position  Peptide          Type         References
2426      ADKLIGQKNLLSRKR  acetylation  [1]
Reference
 [1] Proteomic analysis of lysine acetylation sites in rat tissues reveals organ specificity and subcellular patterns.
 Lundby A, Lage K, Weinert BT, Bekker-Jensen DB, Secher A, Skovgaard T, Kelstrup CD, Dmytriyev A, Choudhary C, Lundby C, Olsen JV.
 Cell Rep. 2012 Aug 30;2(2):419-31. [PMID: 22902405]
Functional Description
  
Sequence Annotation
  
Keyword
 Complete proteome; DNA-binding; Nucleus; Reference proteome; Transcription; Transcription regulation. 
Sequence Source
 UniProt (SWISSPROT/TrEMBL); GenBank; EMBL 
Protein Length
 3005 AA 
Protein Sequence
MEEKQQIILA NQDGGTVTGA APTFFVILKQ PGNGKTDQGI LVTNRDARAL LSRESSPGKS 60
KEKICLPADC TVGKITVTLD NNSMWNEFHN RSTEMILTKQ GRRMFPYCRY WITGLDSNLK 120
YILVMDISPV DSHRYKWNGR WWEPSGKAEP HILGRVFIHP ESPSTGHYWM HQPVSFYKLK 180
LTNNTLDQEG HIILHSMHRY LPRLHLVPAE KATEVIQLNG PGVHTFTFPQ TEFFAVTAYQ 240
NIQITQLKID YNPFAKGFRD DGLSSKPQRD GKQRNSSDQE GNSVSSSPGH RVRLTEGEGS 300
EIHSGDFDPV LRGHETSSLG LEKAPNNVKQ DFLGFMNTDS THEVPQLKRE ISESHVSSFE 360
ENSQISSPLN PNGNFNVVIK EEPLDDYDYE LGECPEGITV KQEETDEETD VYSNSDDDPI 420
LEKQLKRHNK VDNLEADHSS YKWLPNSPGV AKAKMFKLDA GKMPVVYLEP CAVTKSTVKI 480
SELPDNMLST SRKDKSMLAE LEYLPAYIEN SDETGFCLSK DSENSLRKHS SDLRMVQKYT 540
LLKEPHWKYP DIFDSSSTEK THDSSKGSTE DSFSGKEDLG KKRTTMLKMA IPSKTVNASQ 600
SASPNTPGKR GRPRKLRLSK AGRPPKNTGK SLTASKNIPV GPGNTFPDVK PDLEDVDGVL 660
FVSFESKEAL DIHAVDGTTE EPSSVQTTTT NDSGSRTRIS QLEKELIEDL KSLRHKQVIH 720
PALQEVGLKL NSVDPTMSID LKYLGVQLPL APATSFPLWN VTGTNPASPD AGFPFVSRTG 780
KTNDFTKIKG WRGKFHNASA SRNEGGNSET SLKNRSAFCS DKLDEYLENE GKLMETSMGF 840
SSNAPTAPTS PVVYQLPTKS TSYVRTLDSV LKKQSTISPS TSHSVKPHSV TTASRKTKAQ 900
NKQTTVSGRT KSSYKSILPY PVSPKQKNSH VSPGDKITKN SLSSASDNQV TNLVVPSIDE 960
SAFPKQISLR QAQQQHIQQQ GTRPPGLSKS QVKLMDLEDC ALWEGKPRTY ITEERADVSL 1020
TTLLTAQASL KTKPIHTIIR KRAPPCNNDF CRLGCVCSSL ALEKRQPAHC RRPDCMFGCT 1080
CLKRKVVLVK GGSKTKHLHK KAANRDPLFY DTLGEGGREG GGGVREDEEQ LKEKKKRKKL 1140
EYTVCEAEPE QPVRHYPLWV KVDGEVDPEP VYIPTPSVIE PIKPLVLPQP DASSTTIKGK 1200
LTPGIKPTRA YTPKPNPVIR EEDKDPVYLY FESMMTCARV RVYERKKDEQ RQLSPPLSPS 1260
SSFQQQSSCY SSPENHATKE LDSEQTLKQL ICDLEDDSDK SQEKTWKSSC NEGESSSTSY 1320
VHQRSPGGPT KLIEIISDCN WEEDRNKILS ILSQHINSNM PQSLKVGSFI IELASQRKCR 1380
GEKTPPVYSS RVKISMPSSQ DQDDMAEKSG SETPDGPLSP GKMDDISPVQ TDALDSVRER 1440
LHGGKGLPFY AGLSPSGKLV AYKRKPSSTT SGLIQVASNA KVAASRKPRT LLPSTSNSKM 1500
ASSGPTTNRS GKNLKAFVPA KRPIAARPSP GGVFTQFVMS KVGALQQKIP GVRTPQPLTG 1560
PQKFSIRPSP VMVVAPVVSS EQVQVCSTVT AAVTTSPQVV LENVTAVPSL TANSDMGAKE 1620
ATYSSSASTA GVVEISETNN TTPVTSTQST ATVNLTKTTG ITTSPVASVS FPKPLVASPT 1680
ITLPVASTAS TSIVMVTTAA SSSVVTTPTS SLSSVPIILS GINGSPPVSQ RPENAPQIPV 1740
TTPQISPNNV KRTGPRLLHP NGQIVQLLPL HQIRGSNTQP SLQPVVFRNP GSVVGIRLPT 1800
PCKSSETPSS SSSSAFSVMS PVIQAVGSSP TVNVISQAPS LLSSGSSFVS QAGTLTLRIS 1860
PPETQNLASK TASESKITPS TGGQPVGTAS LIPLQSGSFA LLQLPGQKPV PNSVLQHVAS 1920
LQIKKESQNT DQKDESNSIK REKETKKALS SKDEVVDPEA NIMKQNSGII ASEDTFNNSL 1980
DDRGDLLDEE SLREDTRPYE YSYSTGSHTD EDKDGDEESG NKNENSPKEK QTVPEVRAGS 2040
ENINIMTLQN VRKVRPQKYV EVKVEQQQGS ENPDDFLVLS KEESKFELSG SQVKEQKSNS 2100
QTEAKKDCED SLGKDSLRER WRKHLKGPLT HKYVGISQDF KKEADVQLFT EMKPCQDNSE 2160
RDISELLGKS GTIESGGIFK TEDGSWSGIS SSAAFSIIPR RATKGRRGSR RFQGHFLLSK 2220
EHMKPKQQAN GGRSSADFTV LDLEDEDEED ENEKTDDSID EIVDVVSGYQ SEEVDVEKNN 2280
YVDYIEDDEQ VDVETVEELS EEINFTYKKT TAAHTQSFRQ QCHSLISADE KASEKSRKVS 2340
LISSKLKDDC WSDKPHKETE AFAYYRRTHT ANERRRRGEM RDLFEKLKIT LGLLHSSKVS 2400
KSLILNRAFS EIQGLTDQAD KLIGQKNLLS RKRSILIRKV SSLSGKTEEV VLKKLEYIYA 2460
KQQALEAQKR KKKMGSDEFG LSPRISTQLE GSSSSVDLGQ MFMSNRRGKP LILSRRRDQA 2520
TENASPSDTP HSSANLVMTP QGQLLTLKGP LFSAPVVAVS PALLEAGLKP QVASSTMAQS 2580
ENDDLFMMPR IVNVTSLAAE EDLGGMSGNK YLHEVPDGKP ALDHLRDVSG NEASSLKDTE 2640
RISSRGNHRD SRMALGPTQV FLANKDSGFP HVPDVSTMQA AQEFIPKKMS GDLRGHRHKW 2700
KECELRGERL KSKESQFHKL KMKDLKDSSI EMELRKVASA IEEAALDPSE LLTNMEDEDD 2760
TDETLTSLLN EIAFLNQQLN DDSGLAELSG SLDTEFPGDA QQAFISKLAP GNRSAFQVGH 2820
LGTGVKELPD VQEESESISP LLLHLEDDDF SENEKQLGDT ASEPDVLKIV IDSEIKDSLV 2880
SHRKSSDGGQ STSGLPAEPE SVSSPPILHM KTGPENSSTD TLWRPMPKLA PLGLKVANPP 2940
SDADGQSLKV MPALAPIAAK VGSVGHKMNT TGSDQEGRGS KMMPTLAPVV TKLGNSGVPS 3000
SSSGK 3005 
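The modification site listed in this record can be cross-checked against the sequence block. The following is a minimal sketch, not part of CPLM itself: the helper names `parse_sequence_block` and `site_window` are hypothetical, and the flank of 7 residues on each side is an assumption inferred from the 15-residue peptide shown for the site. It uses only the sequence row that ends at residue 2460.

```python
import re


def parse_sequence_block(text: str) -> str:
    """Join a numbered sequence block into one string.

    Spaces, line breaks and the running residue counters are dropped;
    only letters are kept.
    """
    return re.sub(r"[^A-Za-z]", "", text).upper()


def site_window(fragment: str, fragment_start: int, position: int, flank: int = 7) -> str:
    """Return the residues from position-flank to position+flank.

    `fragment_start` and `position` are 1-based protein coordinates.
    The window is clipped if it would run past the start of the fragment.
    """
    index = position - fragment_start  # 0-based index inside the fragment
    if not 0 <= index < len(fragment):
        raise ValueError("position lies outside the fragment")
    lo = max(0, index - flank)
    return fragment[lo:index + flank + 1]


# Sequence row from this record; the trailing number is the running count.
row = "KSLILNRAFS EIQGLTDQAD KLIGQKNLLS RKRSILIRKV SSLSGKTEEV VLKKLEYIYA 2460"
fragment = parse_sequence_block(row)
start = 2460 - len(fragment) + 1  # first residue covered by this row

residue = fragment[2426 - start]           # residue at the annotated position
peptide = site_window(fragment, start, 2426)
print(residue, peptide)
```

With this row, the residue at position 2426 is a lysine and the extracted window matches the peptide given in the Lysine Modification table.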
Gene Ontology
 GO:0005634; C:nucleus; IEA:UniProtKB-KW.
 GO:0003677; F:DNA binding; IEA:UniProtKB-KW.
 GO:0003700; F:sequence-specific DNA binding transcription factor activity; IEA:InterPro.
 GO:0006351; P:transcription, DNA-dependent; IEA:UniProtKB-KW. 
Interpro
 IPR011598; bHLH_dom.
 IPR008967; p53-like_TF_DNA-bd.
 IPR001699; TF_T-box.
 IPR018186; TF_T-box_CS. 
Pfam
 PF00010; HLH
 PF00907; T-box 
SMART
 SM00353; HLH
 SM00425; TBOX 
PROSITE
 PS50888; BHLH
 PS01264; TBOX_2
 PS50252; TBOX_3 
PRINTS
 PR00937; TBOX.