CPLM 1.0 - Compendium of Protein Lysine Modification
Tag Content
CPLM ID
 CPLM-019248
UniProt Accession
Genbank Protein ID
Genbank Nucleotide ID
Protein Name
 THO complex subunit 1 
Protein Synonyms/Alias
 Tho1; Nuclear matrix protein p84; p84N5; hTREX84 
Gene Name
 THOC1 
Gene Synonyms/Alias
 HPR1 
Created Date
 July 27, 2013 
Organism
 Homo sapiens (Human) 
NCBI Taxa ID
 9606 
Lysine Modification
Position  Peptide          Type            References
31        ALNNKNIKPLLSTFS  ubiquitination  [1, 2]
133       NTFYSAGKNYLLRMC  acetylation     [3]
237       IDYNLYRKFWSLQDY  ubiquitination  [1, 2, 4]
258       CYEKISWKTFLKYSE  ubiquitination  [4]
300       GEHVYFAKFLTSEKL  acetylation     [3, 5]
401       EGCPSFVKERTSDTK  ubiquitination  [4, 5]
430       LGKGPTKKILMGNEE  ubiquitination  [4]
453       PDNMEACKSETREHM  ubiquitination  [4]
531       NMVIKLAKELPPPSE  ubiquitination  [4]
595       LAPYLEMKDSEIRQI  sumoylation     [6]
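Each row above pairs a 1-based position with a 15-residue peptide, and in every row the eighth residue is a lysine, which suggests the peptide is a window of seven flanking residues on each side of the modified site. The following is a minimal Python sketch of that reading, not part of CPLM itself: the rows are transcribed by hand from the table, and the names `SITES`, `WINDOW` and `centre_residue` are illustrative assumptions.

```python
from collections import Counter

# Rows transcribed from the Lysine Modification table: (position, peptide, type).
SITES = [
    (31, "ALNNKNIKPLLSTFS", "ubiquitination"),
    (133, "NTFYSAGKNYLLRMC", "acetylation"),
    (237, "IDYNLYRKFWSLQDY", "ubiquitination"),
    (258, "CYEKISWKTFLKYSE", "ubiquitination"),
    (300, "GEHVYFAKFLTSEKL", "acetylation"),
    (401, "EGCPSFVKERTSDTK", "ubiquitination"),
    (430, "LGKGPTKKILMGNEE", "ubiquitination"),
    (453, "PDNMEACKSETREHM", "ubiquitination"),
    (531, "NMVIKLAKELPPPSE", "ubiquitination"),
    (595, "LAPYLEMKDSEIRQI", "sumoylation"),
]

WINDOW = 15  # assumed window length, inferred from the peptides in the table


def centre_residue(peptide: str) -> str:
    """Return the middle residue of an odd-length peptide window."""
    return peptide[len(peptide) // 2]


# Every peptide should be 15 residues long and centred on the modified lysine.
for position, peptide, _ in SITES:
    assert len(peptide) == WINDOW, f"site {position}: unexpected window length"
    assert centre_residue(peptide) == "K", f"site {position}: not centred on lysine"

# Tally the sites by modification type.
counts = Counter(mod_type for _, _, mod_type in SITES)
print(dict(counts))
```

For this entry the tally comes to seven ubiquitination sites, two acetylation sites and one sumoylation site.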
Reference
 [1] A proteome-wide, quantitative survey of in vivo ubiquitylation sites reveals widespread regulatory roles.
 Wagner SA, Beli P, Weinert BT, Nielsen ML, Cox J, Mann M, Choudhary C.
 Mol Cell Proteomics. 2011 Oct;10(10):M111.013284. [PMID: 21890473]
 [2] hCKSAAP_UbSite: improved prediction of human ubiquitination sites by exploiting amino acid pattern and properties.
 Chen Z, Zhou Y, Song J, Zhang Z.
 Biochim Biophys Acta. 2013 Aug;1834(8):1461-7. [PMID: 23603789]
 [3] Lysine acetylation targets protein complexes and co-regulates major cellular functions.
 Choudhary C, Kumar C, Gnad F, Nielsen ML, Rehman M, Walther TC, Olsen JV, Mann M.
 Science. 2009 Aug 14;325(5942):834-40. [PMID: 19608861]
 [4] Refined preparation and use of anti-diglycine remnant (K-ε-GG) antibody enables routine quantification of 10,000s of ubiquitination sites in single proteomics experiments.
 Udeshi ND, Svinkina T, Mertins P, Kuhn E, Mani DR, Qiao JW, Carr SA.
 Mol Cell Proteomics. 2013 Mar;12(3):825-31. [PMID: 23266961]
 [5] Integrated proteomic analysis of post-translational modifications by serial enrichment.
 Mertins P, Qiao JW, Patel J, Udeshi ND, Clauser KR, Mani DR, Burgess MW, Gillette MA, Jaffe JD, Carr SA.
 Nat Methods. 2013 Jul;10(7):634-7. [PMID: 23749302]
 [6] Site-specific identification of SUMO-2 targets in cells reveals an inverted SUMOylation motif and a hydrophobic cluster SUMOylation motif.
 Matic I, Schimmel J, Hendriks IA, van Santen MA, van de Rijke F, van Dam H, Gnad F, Mann M, Vertegaal AC.
 Mol Cell. 2010 Aug 27;39(4):641-52. [PMID: 20797634]
Functional Description
 Component of the THO subcomplex of the TREX complex. The TREX complex specifically associates with spliced mRNA and not with unspliced pre-mRNA. It is recruited to spliced mRNAs by a transcription-independent mechanism. Binds to mRNA upstream of the exon-junction complex (EJC) and is recruited in a splicing- and cap-dependent manner to a region near the 5' end of the mRNA where it functions in mRNA export. The recruitment occurs via an interaction between ALYREF/THOC4 and the cap-binding protein NCBP1. DDX39B functions as a bridge between ALYREF/THOC4 and the THO complex. The TREX complex is essential for the export of Kaposi's sarcoma-associated herpesvirus (KSHV) intronless mRNAs and infectious virus production. The recruitment of the TREX complex to the intronless viral mRNA occurs via an interaction between KSHV ORF57 protein and ALYREF/THOC4. 
Sequence Annotation
 DOMAIN 570 653 Death.
 MOTIF 414 430 Nuclear localization signal.
 MOD_RES 1 1 N-acetylmethionine.
 MOD_RES 2 2 Phosphoserine.
 MOD_RES 133 133 N6-acetyllysine.
 MOD_RES 300 300 N6-acetyllysine.
 MOD_RES 560 560 Phosphoserine.  
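The annotation lines above follow a simple "key, start, end, description" layout, so they can be intersected with a residue position to see which features cover a given site. The sketch below is one hedged way to do that in Python; the helper names `parse_feature` and `features_at` are hypothetical, and the feature text is copied from this entry rather than fetched from UniProt.

```python
# Feature lines copied from the Sequence Annotation section of this entry.
FEATURES = """\
DOMAIN 570 653 Death.
MOTIF 414 430 Nuclear localization signal.
MOD_RES 1 1 N-acetylmethionine.
MOD_RES 2 2 Phosphoserine.
MOD_RES 133 133 N6-acetyllysine.
MOD_RES 300 300 N6-acetyllysine.
MOD_RES 560 560 Phosphoserine.
"""


def parse_feature(line: str) -> tuple[str, int, int, str]:
    """Split one line into (key, start, end, description without the final dot)."""
    key, start, end, description = line.split(maxsplit=3)
    return key, int(start), int(end), description.rstrip(".")


features = [parse_feature(line) for line in FEATURES.splitlines() if line.strip()]


def features_at(position: int) -> list[tuple[str, str]]:
    """Return (key, description) for every feature whose range covers the position."""
    return [(key, desc) for key, start, end, desc in features if start <= position <= end]


print(features_at(595))  # sumoylated lysine from the modification table
print(features_at(430))  # ubiquitinated lysine from the modification table
print(features_at(133))  # acetylated lysine from the modification table
```

Under these assumptions, position 595 falls inside the Death domain, position 430 is the last residue of the nuclear localization signal, and positions 133 and 300 coincide with the N6-acetyllysine annotations.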
Keyword
 3D-structure; Acetylation; Alternative splicing; Apoptosis; Complete proteome; Cytoplasm; DNA-binding; mRNA processing; mRNA splicing; mRNA transport; Nucleus; Phosphoprotein; Reference proteome; RNA-binding; Transcription; Transcription regulation; Transport. 
Sequence Source
 UniProt (SWISSPROT/TrEMBL); GenBank; EMBL 
Protein Length
 657 AA 
Protein Sequence
MSPTPPLFSL PEARTRFTKS TREALNNKNI KPLLSTFSQV PGSENEKKCT LDQAFRGILE 60
EEIINHSSCE NVLAIISLAI GGVTEGICTA STPFVLLGDV LDCLPLDQCD TIFTFVEKNV 120
ATWKSNTFYS AGKNYLLRMC NDLLRRLSKS QNTVFCGRIQ LFLARLFPLS EKSGLNLQSQ 180
FNLENVTVFN TNEQESTLGQ KHTEDREEGM DVEEGEMGDE EAPTTCSIPI DYNLYRKFWS 240
LQDYFRNPVQ CYEKISWKTF LKYSEEVLAV FKSYKLDDTQ ASRKKMEELK TGGEHVYFAK 300
FLTSEKLMDL QLSDSNFRRH ILLQYLILFQ YLKGQVKFKS SNYVLTDEQS LWIEDTTKSV 360
YQLLSENPPD GERFSKMVEH ILNTEENWNS WKNEGCPSFV KERTSDTKPT RIIRKRTAPE 420
DFLGKGPTKK ILMGNEELTR LWNLCPDNME ACKSETREHM PTLEEFFEEA IEQADPENMV 480
ENEYKAVNNS NYGWRALRLL ARRSPHFFQP TNQQFKSLPE YLENMVIKLA KELPPPSEEI 540
KTGEDEDEED NDALLKENES PDVRRDKPVT GEQIEVFANK LGEQWKILAP YLEMKDSEIR 600
QIECDSEDMK MRAKQLLVAW QDQEGVHATP ENLINALNKS GLSDLAESLT NDNETNS 657 
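The sequence is printed in groups of ten residues with a running residue count at the end of each line, so it has to be cleaned before positions can be looked up. The Python sketch below shows one way to do that and to cross-check the entry against itself: the declared length of 657 AA, the presence of a lysine at each listed modification position, and the 15-residue window around a site. The names `parse_sequence`, `window` and `LYSINE_SITES` are illustrative assumptions, and the seven-residue flank is inferred from the peptides in the modification table.

```python
import re

# The sequence block as printed in this entry, including spacing and line counts.
RAW = """
MSPTPPLFSL PEARTRFTKS TREALNNKNI KPLLSTFSQV PGSENEKKCT LDQAFRGILE 60
EEIINHSSCE NVLAIISLAI GGVTEGICTA STPFVLLGDV LDCLPLDQCD TIFTFVEKNV 120
ATWKSNTFYS AGKNYLLRMC NDLLRRLSKS QNTVFCGRIQ LFLARLFPLS EKSGLNLQSQ 180
FNLENVTVFN TNEQESTLGQ KHTEDREEGM DVEEGEMGDE EAPTTCSIPI DYNLYRKFWS 240
LQDYFRNPVQ CYEKISWKTF LKYSEEVLAV FKSYKLDDTQ ASRKKMEELK TGGEHVYFAK 300
FLTSEKLMDL QLSDSNFRRH ILLQYLILFQ YLKGQVKFKS SNYVLTDEQS LWIEDTTKSV 360
YQLLSENPPD GERFSKMVEH ILNTEENWNS WKNEGCPSFV KERTSDTKPT RIIRKRTAPE 420
DFLGKGPTKK ILMGNEELTR LWNLCPDNME ACKSETREHM PTLEEFFEEA IEQADPENMV 480
ENEYKAVNNS NYGWRALRLL ARRSPHFFQP TNQQFKSLPE YLENMVIKLA KELPPPSEEI 540
KTGEDEDEED NDALLKENES PDVRRDKPVT GEQIEVFANK LGEQWKILAP YLEMKDSEIR 600
QIECDSEDMK MRAKQLLVAW QDQEGVHATP ENLINALNKS GLSDLAESLT NDNETNS 657
"""


def parse_sequence(block: str) -> str:
    """Keep only residue letters, dropping whitespace and the line-end counts."""
    return re.sub(r"[^A-Z]", "", block)


def window(seq: str, position: int, flank: int = 7) -> str:
    """Return residues within `flank` of a 1-based position (no padding at the termini)."""
    start = max(position - 1 - flank, 0)
    return seq[start:position + flank]


sequence = parse_sequence(RAW)
assert len(sequence) == 657  # matches the Protein Length field

# Positions listed in the Lysine Modification table (1-based).
LYSINE_SITES = [31, 133, 237, 258, 300, 401, 430, 453, 531, 595]
for pos in LYSINE_SITES:
    assert sequence[pos - 1] == "K", f"position {pos} is not a lysine"

print(window(sequence, 31))   # ALNNKNIKPLLSTFS
print(window(sequence, 595))  # LAPYLEMKDSEIRQI
```

Both printed windows match the peptides given for positions 31 and 595 in the modification table.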
Gene Ontology
 GO:0005737; C:cytoplasm; IDA:UniProtKB.
 GO:0016363; C:nuclear matrix; IEA:UniProtKB-SubCell.
 GO:0016607; C:nuclear speck; IEA:UniProtKB-SubCell.
 GO:0000445; C:THO complex part of transcription export complex; IDA:UniProtKB.
 GO:0003677; F:DNA binding; IEA:UniProtKB-KW.
 GO:0003723; F:RNA binding; IEA:UniProtKB-KW.
 GO:0006915; P:apoptotic process; IDA:UniProtKB.
 GO:0046784; P:intronless viral mRNA export from host nucleus; IDA:UniProtKB.
 GO:0006397; P:mRNA processing; IEA:UniProtKB-KW.
 GO:0032784; P:regulation of DNA-dependent transcription, elongation; IDA:UniProtKB.
 GO:0006396; P:RNA processing; TAS:ProtInc.
 GO:0008380; P:RNA splicing; IEA:UniProtKB-KW.
 GO:0007165; P:signal transduction; IEA:InterPro.
 GO:0006351; P:transcription, DNA-dependent; IEA:UniProtKB-KW. 
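Each Gene Ontology line above carries three semicolon-separated fields: the GO identifier, an aspect letter with the term name (C for cellular component, F for molecular function, P for biological process), and an evidence code with its source. A hedged Python sketch of splitting such a line follows; the `GoAnnotation` type and `parse_go_line` function are hypothetical names introduced here, not part of any CPLM or UniProt tooling.

```python
from typing import NamedTuple


class GoAnnotation(NamedTuple):
    go_id: str
    aspect: str
    term: str
    evidence: str
    source: str


# Aspect letters as used in the lines of this entry.
ASPECTS = {
    "C": "cellular component",
    "F": "molecular function",
    "P": "biological process",
}


def parse_go_line(line: str) -> GoAnnotation:
    """Split 'GO:id; X:term; EVID:source.' into its five parts."""
    go_id, aspect_term, evidence_source = line.strip().rstrip(".").split("; ")
    # Split on the first colon only; the term itself may contain commas.
    aspect, term = aspect_term.split(":", 1)
    evidence, source = evidence_source.split(":", 1)
    return GoAnnotation(go_id, ASPECTS[aspect], term, evidence, source)


example = parse_go_line(
    " GO:0032784; P:regulation of DNA-dependent transcription, elongation; IDA:UniProtKB."
)
print(example)
```

Splitting on "; " rather than on commas matters here, because term names such as "transcription, DNA-dependent" contain commas of their own.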
Interpro
 IPR011029; DEATH-like_dom.
 IPR000488; Death_domain.
 IPR021861; THO_THOC1. 
Pfam
 PF00531; Death
 PF11957; efThoc1 
SMART
 SM00005; DEATH 
PROSITE
 PS50017; DEATH_DOMAIN 
PRINTS