CPLM 1.0 - Compendium of Protein Lysine Modification
Tag    Content
CPLM ID CPLM-019327
UniProt Accession
Genbank Protein ID
Genbank Nucleotide ID
Protein Name
 Integrator complex subunit 4 
Protein Synonyms/Alias
 Int4 
Gene Name
 INTS4 
Gene Synonyms/Alias
 MSTP093 
Created Date
 July 27, 2013 
Organism
 Homo sapiens (Human) 
NCBI Taxa ID
 9606 
Lysine Modification
Position  Peptide          Type            References
26        PQEEIATKKLRLTKP  ubiquitination  [1]
27        QEEIATKKLRLTKPS  ubiquitination  [2]
62        YLLQFARKPVEAESV  ubiquitination  [2]
157       RLVDVACKHLTDTSH  ubiquitination  [2]
169       TSHGVRNKCLQLLGN  ubiquitination  [2]
186       SLEKSVTKDAEGLAA  ubiquitination  [2]
198       LAARDVQKIIGDYFS  ubiquitination  [2]
304       VVRVQAAKLLGSMEQ  ubiquitination  [2]
324       LEQTLDKKLMSDLRR  ubiquitination  [2]
340       RTAHERAKELYSSGE  ubiquitination  [2]
677       LIKALQEKLWNVAAP  ubiquitination  [2]
688       VAAPLYLKQSDLASA  ubiquitination  [2]
Reference
 [1] Methods for quantification of in vivo changes in protein ubiquitination following proteasome and deubiquitinase inhibition.
 Udeshi ND, Mani DR, Eisenhaure T, Mertins P, Jaffe JD, Clauser KR, Hacohen N, Carr SA.
 Mol Cell Proteomics. 2012 May;11(5):148-59. [PMID: 22505724]
 [2] Refined preparation and use of anti-diglycine remnant (K-ε-GG) antibody enables routine quantification of 10,000s of ubiquitination sites in single proteomics experiments.
 Udeshi ND, Svinkina T, Mertins P, Kuhn E, Mani DR, Qiao JW, Carr SA.
 Mol Cell Proteomics. 2013 Mar;12(3):825-31. [PMID: 23266961]
Functional Description
 Component of the Integrator complex, which is involved in transcription of the small nuclear RNAs (snRNAs) U1 and U2 and in their 3'-box-dependent processing. The Integrator complex associates with the C-terminal domain (CTD) of the RNA polymerase II largest subunit (POLR2A) and is recruited to the U1 and U2 snRNA genes. 
Sequence Annotation
 REPEAT 66 105 HEAT 1.
 REPEAT 145 183 HEAT 2.
 REPEAT 190 228 HEAT 3.
 REPEAT 229 263 HEAT 4.
 REPEAT 277 313 HEAT 5.
 REPEAT 369 405 HEAT 6.
 REPEAT 406 444 HEAT 7.
 REPEAT 446 484 HEAT 8.  
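
The repeat coordinates above can be cross-referenced with the modified lysine positions listed under Lysine Modification. The following is a minimal sketch, not part of the CPLM record: the names `HEAT_REPEATS`, `SITES`, and `repeat_for` are illustrative, and the coordinates are assumed to be 1-based and inclusive.

```python
# HEAT repeat ranges copied from the Sequence Annotation section
# (assumed 1-based, inclusive): repeat number -> (start, end).
HEAT_REPEATS = {
    1: (66, 105), 2: (145, 183), 3: (190, 228), 4: (229, 263),
    5: (277, 313), 6: (369, 405), 7: (406, 444), 8: (446, 484),
}

# Modified lysine positions copied from the Lysine Modification table.
SITES = [26, 27, 62, 157, 169, 186, 198, 304, 324, 340, 677, 688]


def repeat_for(pos):
    """Return the HEAT repeat number containing pos, or None if outside all repeats."""
    for number, (start, end) in HEAT_REPEATS.items():
        if start <= pos <= end:
            return number
    return None


# Keep only the sites that fall inside an annotated repeat.
in_repeats = {p: repeat_for(p) for p in SITES if repeat_for(p) is not None}
print(in_repeats)
```

Under these assumptions, only positions 157, 169, 198, and 304 fall inside an annotated HEAT repeat; the remaining sites lie outside the annotated ranges.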
Keyword
 Alternative splicing; Complete proteome; Nucleus; Reference proteome; Repeat. 
Sequence Source
 UniProt (SWISSPROT/TrEMBL); GenBank; EMBL 
Protein Length
 963 AA 
Protein Sequence
MAAHLKKRVY EEFTKVVQPQ EEIATKKLRL TKPSKSAALH IDLCKATSPA DALQYLLQFA 60
RKPVEAESVE GVVRILLEHY YKENDPSVRL KIASLLGLLS KTAGFSPDCI MDDAINILQN 120
EKSHQVLAQL LDTLLAIGTK LPENQAIQMR LVDVACKHLT DTSHGVRNKC LQLLGNLGSL 180
EKSVTKDAEG LAARDVQKII GDYFSDQDPR VRTAAIKAML QLHERGLKLH QTIYNQACKL 240
LSDDYEQVRS AAVQLIWVVS QLYPESIVPI PSSNEEIRLV DDAFGKICHM VSDGSWVVRV 300
QAAKLLGSME QVSSHFLEQT LDKKLMSDLR RKRTAHERAK ELYSSGEFSS GRKWGDDAPK 360
EEVDTGAVNL IESGACGAFV HGLEDEMYEV RIAAVEALCM LAQSSPSFAE KCLDFLVDMF 420
NDEIEEVRLQ SIHTMRKISN NITLREDQLD TVLAVLEDSS RDIREALHEL LCCTNVSTKE 480
GIHLALVELL KNLTKYPTDR DSIWKCLKFL GSRHPTLVLP LVPELLSTHP FFDTAEPDMD 540
DPAYIAVLVL IFNAAKTCPT MPALFSDHTF RHYAYLRDSL SHLVPALRLP GRKLVSSAVS 600
PSIIPQEDPS QQFLQQSLER VYSLQHLDPQ GAQELLEFTI RDLQRLGELQ SELAGVADFS 660
ATYLRCQLLL IKALQEKLWN VAAPLYLKQS DLASAAAKQI MEETYKMEFM YSGVENKQVV 720
IIHHMRLQAK ALQLIVTART TRGLDPLFGM CEKFLQEVDF FQRYFIADLP HLQDSFVDKL 780
LDLMPRLMTS KPAEVVKILQ TMLRQSAFLH LPLPEQIHKA SATIIEPAGE SDNPLRFTSG 840
LVVALDVDAT LEHVQDPQNT VKVQVLYPDG QAQMIHPKPA DFRNPGPGRH RLITQVYLSH 900
TAWTEACQVE VRLLLAYNSS ARIPKCPWME GGEMSPQVET SIEGTIPFSK PVKVYIMPKP 960
ARR 963 
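
The sequence block above is written in groups of 10 residues with a running position counter at the end of each line. As a consistency check, the listed modification sites can be compared against it. This is a minimal sketch, not an official CPLM tool: `parse_sequence`, `window`, and the variable names are illustrative, and it assumes each reported peptide is the 15-residue window centred on the modified lysine.

```python
import re

# Sequence block copied verbatim from this record.
RAW = """
MAAHLKKRVY EEFTKVVQPQ EEIATKKLRL TKPSKSAALH IDLCKATSPA DALQYLLQFA 60
RKPVEAESVE GVVRILLEHY YKENDPSVRL KIASLLGLLS KTAGFSPDCI MDDAINILQN 120
EKSHQVLAQL LDTLLAIGTK LPENQAIQMR LVDVACKHLT DTSHGVRNKC LQLLGNLGSL 180
EKSVTKDAEG LAARDVQKII GDYFSDQDPR VRTAAIKAML QLHERGLKLH QTIYNQACKL 240
LSDDYEQVRS AAVQLIWVVS QLYPESIVPI PSSNEEIRLV DDAFGKICHM VSDGSWVVRV 300
QAAKLLGSME QVSSHFLEQT LDKKLMSDLR RKRTAHERAK ELYSSGEFSS GRKWGDDAPK 360
EEVDTGAVNL IESGACGAFV HGLEDEMYEV RIAAVEALCM LAQSSPSFAE KCLDFLVDMF 420
NDEIEEVRLQ SIHTMRKISN NITLREDQLD TVLAVLEDSS RDIREALHEL LCCTNVSTKE 480
GIHLALVELL KNLTKYPTDR DSIWKCLKFL GSRHPTLVLP LVPELLSTHP FFDTAEPDMD 540
DPAYIAVLVL IFNAAKTCPT MPALFSDHTF RHYAYLRDSL SHLVPALRLP GRKLVSSAVS 600
PSIIPQEDPS QQFLQQSLER VYSLQHLDPQ GAQELLEFTI RDLQRLGELQ SELAGVADFS 660
ATYLRCQLLL IKALQEKLWN VAAPLYLKQS DLASAAAKQI MEETYKMEFM YSGVENKQVV 720
IIHHMRLQAK ALQLIVTART TRGLDPLFGM CEKFLQEVDF FQRYFIADLP HLQDSFVDKL 780
LDLMPRLMTS KPAEVVKILQ TMLRQSAFLH LPLPEQIHKA SATIIEPAGE SDNPLRFTSG 840
LVVALDVDAT LEHVQDPQNT VKVQVLYPDG QAQMIHPKPA DFRNPGPGRH RLITQVYLSH 900
TAWTEACQVE VRLLLAYNSS ARIPKCPWME GGEMSPQVET SIEGTIPFSK PVKVYIMPKP 960
ARR 963
"""

# Position -> peptide, copied from the Lysine Modification table.
SITES = {
    26: "PQEEIATKKLRLTKP", 27: "QEEIATKKLRLTKPS", 62: "YLLQFARKPVEAESV",
    157: "RLVDVACKHLTDTSH", 169: "TSHGVRNKCLQLLGN", 186: "SLEKSVTKDAEGLAA",
    198: "LAARDVQKIIGDYFS", 304: "VVRVQAAKLLGSMEQ", 324: "LEQTLDKKLMSDLRR",
    340: "RTAHERAKELYSSGE", 677: "LIKALQEKLWNVAAP", 688: "VAAPLYLKQSDLASA",
}


def parse_sequence(raw):
    """Drop whitespace and position counters, keeping only residue letters."""
    return re.sub(r"[^A-Z]", "", raw)


def window(seq, pos, flank=7):
    """Return residues pos-flank..pos+flank (1-based), clipped at the N-terminus."""
    start = max(pos - 1 - flank, 0)
    return seq[start:pos + flank]


seq = parse_sequence(RAW)

# A site is consistent if the residue is K and the surrounding window equals the peptide.
mismatches = [
    pos for pos, peptide in SITES.items()
    if seq[pos - 1] != "K" or window(seq, pos) != peptide
]
print(len(seq), mismatches)
```

An empty `mismatches` list means every listed position is a lysine and every peptide agrees with the sequence; the parsed length can also be compared with the stated Protein Length of 963 AA.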
Gene Ontology
 GO:0032039; C:integrator complex; IDA:HGNC.
 GO:0016180; P:snRNA processing; IDA:HGNC. 
Interpro
 IPR011989; ARM-like.
 IPR016024; ARM-type_fold. 
Pfam
  
SMART
  
PROSITE
 PS50077; HEAT_REPEAT 
PRINTS