CPLM 1.0 - Compendium of Protein Lysine Modification
Tag    Content
CPLM ID CPLM-035235
UniProt Accession
Genbank Protein ID
Genbank Nucleotide ID
  
Protein Name
 Agrin 
Protein Synonyms/Alias
  
Gene Name
 Agrn 
Gene Synonyms/Alias
  
Created Date
 July 27, 2013 
Organism
 Rattus norvegicus (Rat) 
NCBI Taxa ID
 10116 
Lysine Modification
Position  Peptide          Type         References
278       TRGLLLQKVRSGQCQ  acetylation  [1]
Reference
 [1] Proteomic analysis of lysine acetylation sites in rat tissues reveals organ specificity and subcellular patterns.
 Lundby A, Lage K, Weinert BT, Bekker-Jensen DB, Secher A, Skovgaard T, Kelstrup CD, Dmytriyev A, Choudhary C, Lundby C, Olsen JV.
 Cell Rep. 2012 Aug 30;2(2):419-31. [PMID: 22902405]
Functional Description
  
Sequence Annotation
  
Keyword
 Complete proteome; Disulfide bond; Reference proteome; Repeat. 
Sequence Source
 UniProt (SWISSPROT/TrEMBL); GenBank; EMBL 
Protein Length
 1948 AA 
Protein Sequence
MPPLPLEHRP RQEPGASMLV RYFMIPCNIC LILLATSTLG FAVLLFLSNY KPGIHFTPAP 60
PTPPDVCRGM LCGFGAVCEP SVEDPGRASC VCKKNACPAT VAPVCGSDAS TYSNECELQR 120
AQCNQQRRIR LLRQGPCGSR DPCANVTCSF GSTCVPSADG QTASCLCPTT CFGAPDGTVC 180
GSDGVDYPSE CQLLSHACAS QEHIFKKFNG PCDPCQGSMS DLNHICRVNP RTRHPEMLLR 240
PENCPAQHTP ICGDDGVTYE NDCVMSRIGA TRGLLLQKVR SGQCQTRDQC PETCQFNSVC 300
LSRRGRPHCS CDRVTCDGSY RPVCAQDGHT YNNDCWRQQA ECRQQRAIPP KHQGPCDQTP 360
SPCHGVQCAF GAVCTVKNGK AECECQRVCS GIYDPVCGSD GVTYGSVCEL ESMACTLGRE 420
IQVARRGPCD PCGQCRFGSL CEVETGRCVC PSECVESAQP VCGSDGHTYA SECELHVHAC 480
THQISLYVAS AGHCQTCGEK VCTFGAVCSA GQCVCPRCEH PPPGPVCGSD GVTYLSACEL 540
REAACQQQVQ IEEAHAGPCE PAECGSGGSG SGEDDECEQE LCRQRGGIWD EDSEDGPCVC 600
DFSCQSVPRS PVCGSDGVTY GTECDLKKAR CESQQELYVA AQGACRGPTL APLLPVAFPH 660
CAQTPYGCCQ DNFTAAQGVG LAGCPSTCHC NPHGSYSGTC DPATGQCSCR PGVGGLRCDR 720
CEPGFWNFRG IVTDGHSGCT PCSCDPRGAV RDDCEQMTGL CSCRPGVAGP KCGQCPDGQV 780
LGHLGCEADP MTPVTCVEIH CEFGASCVEK AGFAQCICPT LTCPEANSTK VCGSDGVTYG 840
NECQLKAIAC RQRLDISTQS LGPCQESVTP GASPTSASMT TPRHILSKTL PFPHNSLPLS 900
PGSTTHDWPT PLPISPHTTV SIPRSTAWPV LTVPPTAAAS DVTSLATSIF SESGSANGSG 960
DEELSGDEEA SGGGSGGLEP PVGSIVVTHG PPIERASCYN SPLGCCSDGK TPSLDSEGSN 1020
CPATKAFQGV LELEGVEGQE LFYTPEMADP KSELFGETAR SIESTLDDLF RNSDVKKDFW 1080
SVRLRELGPG KLVRAIVDVH FDPTTAFQAS DVGQALLRQI QVSRPWALAV RRPLQEHVRF 1140
LDFDWFPTFF TGAATGTTAA MATARATTVS RLPASSVTPR VYPSHTSRPV GRTTAPPTTR 1200
RPPTTATNMD RPRTPGHQQP SKSCDSQPCL HGGTCQDQDS GKGFTCSCTA GRGGSVCEKV 1260
QPPSMPAFKG HSFLAFPTLR AYHTLRLALE FRALETEGLL LYNGNARGKD FLALALLDGR 1320
VQFRFDTGSG PAVLTSLVPV EPGRWHRLEL SRHWRQGTLS VDGETPVVGE SPSGTDGLNL 1380
DTNLYVGGIP EEQVAMVLDR TSVGVGLKGC IRMLDINNQQ LELSDWQRAA VQSSGVGECG 1440
DHPCLPNPCH GGALCQALEA GMFLCQCPPG RFGPTCADEK SPCQPNPCHG AAPCRVLSSG 1500
GAKCECPLGR SGTFCQTVLE TAGSRPFLAD FNGFSYLELK GLHTFERDLG EKMALEMVFL 1560
ARGPSGLLLY NGQKTDGKGD FVSLALHNRH LEFCYDLGKG AAVIRSKEPI ALGTWVRVFL 1620
ERNGRKGALQ VGDGPRVLGE SPKSRKVPHT MLNLKEPLYI GGAPDFSKLA RGAAVSSGFN 1680
GVIQLVSLRG HQLLTQEHVL RAVDVSPFAD HPCTQALGNP CLNGGSCVPR EATYECLCPG 1740
GFSGLHCEKG LVEKSVGDLE TLAFDGRTYI EYLNAVIESE LTNEIPAEKA LQSNHFELSL 1800
RTEATQGLVL WIGKAAERAD YMALAIVDGH LQLSYDLGSQ PVVLRSTVKV NTNRWLRIRA 1860
HREHREGSLQ VGNEAPVTGS SPLGATQLDT DGALWLGGLQ KLPVGQALPK AYGTGFVGCL 1920
RDVVVGHRQL HLLEDAVTKP ELRPCPTP 1948 
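
Note: the modification site listed above (position 278, peptide TRGLLLQKVRSGQCQ) can be cross-checked against the blocked sequence. The following is a minimal sketch, not part of the CPLM record. It assumes the sequence is laid out in rows of 60 residues with a trailing running count, as shown above, and that the peptide is the site residue with 7 residues on each side. The helper names `parse_sequence_block` and `window` are hypothetical and introduced only for illustration.

```python
import re


def parse_sequence_block(block: str) -> str:
    """Drop spaces and running residue counts, keeping only the letters."""
    return re.sub(r"[^A-Za-z]", "", block).upper()


def window(seq: str, position: int, flank: int = 7) -> str:
    """Return the residues within `flank` of a 1-based position."""
    start = max(0, position - 1 - flank)
    return seq[start:position + flank]


# Row ending at residue 300, copied from the record's sequence block.
row = "PENCPAQHTP ICGDDGVTYE NDCVMSRIGA TRGLLLQKVR SGQCQTRDQC PETCQFNSVC 300"
residues = parse_sequence_block(row)

row_end = 300                            # running count printed at the end of the row
row_start = row_end - len(residues) + 1  # first residue number in this row (241)
site = 278                               # modified position from the record

local = site - row_start + 1             # 1-based index of the site inside the row
print(residues[local - 1])               # residue at the site
print(window(residues, local))           # site residue with 7 residues on each side
```

For this row the script prints `K` followed by `TRGLLLQKVRSGQCQ`, which agrees with the lysine position and peptide given in the Lysine Modification table.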
Gene Ontology
  
Interpro
 IPR008985; ConA-like_lec_gl_sf.
 IPR013320; ConA-like_subgrp.
 IPR000742; EG-like_dom.
 IPR013032; EGF-like_CS.
 IPR002049; EGF_laminin.
 IPR003645; Fol_N.
 IPR002350; Kazal_dom.
 IPR001791; Laminin_G.
 IPR000082; SEA_dom. 
Pfam
 PF00008; EGF
 PF00050; Kazal_1
 PF07648; Kazal_2
 PF00053; Laminin_EGF
 PF00054; Laminin_G_1
 PF01390; SEA 
SMART
 SM00181; EGF
 SM00180; EGF_Lam
 SM00274; FOLN
 SM00280; KAZAL
 SM00282; LamG
 SM00200; SEA 
PROSITE
 PS00022; EGF_1
 PS01186; EGF_2
 PS50026; EGF_3
 PS01248; EGF_LAM_1
 PS50027; EGF_LAM_2
 PS51465; KAZAL_2
 PS50025; LAM_G_DOMAIN
 PS50024; SEA 
PRINTS