CPLM 1.0 - Compendium of Protein Lysine Modification
Tag Content
CPLM ID
 CPLM-000947
UniProt Accession
Genbank Protein ID
Genbank Nucleotide ID
Protein Name
 Cubilin 
Protein Synonyms/Alias
 460 kDa receptor; Intestinal intrinsic factor receptor; Intrinsic factor-cobalamin receptor; Intrinsic factor-vitamin B12 receptor 
Gene Name
 CUBN 
Gene Synonyms/Alias
 IFCR 
Created Date
 July 27, 2013 
Organism
 Homo sapiens (Human) 
NCBI Taxa ID
 9606 
Lysine Modification
 Position  Peptide          Type         References
 1909      IQNCYYDKLRIYDGP  methylation  [1]
Reference
 [1] Large-scale global identification of protein lysine methylation in vivo.
 Cao XJ, Arnaudo AM, Garcia BA.
 Epigenetics. 2013 May 1;8(5):477-85. [PMID: 23644510]
Functional Description
 Cotransporter which plays a role in lipoprotein, vitamin and iron metabolism by facilitating their uptake. Binds to ALB, MB, kappa and lambda light chains, TF, hemoglobin, GC, SCGB1A1, APOA1, high density lipoprotein, and the GIF-cobalamin complex. The binding of all ligands requires calcium. Serves as an important transporter in several absorptive epithelia, including intestine, renal proximal tubules and embryonic yolk sac. Interaction with LRP2 mediates its trafficking throughout vesicles and facilitates the uptake of specific ligands like GC, hemoglobin, ALB, TF and SCGB1A1. Interaction with AMN controls its trafficking to the plasma membrane and facilitates endocytosis of ligands. May play an important role in the development of the peri-implantation embryo through internalization of APOA1 and cholesterol. Binds to LGALS3 at the maternal-fetal interface. 
Sequence Annotation
 DOMAIN 132 168 EGF-like 1.
 DOMAIN 170 211 EGF-like 2; calcium-binding (Potential).
 DOMAIN 263 304 EGF-like 3; calcium-binding (Potential).
 DOMAIN 305 348 EGF-like 4; calcium-binding (Potential).
 DOMAIN 349 385 EGF-like 5.
 DOMAIN 395 430 EGF-like 6.
 DOMAIN 432 468 EGF-like 7; calcium-binding (Potential).
 DOMAIN 474 586 CUB 1.
 DOMAIN 590 702 CUB 2.
 DOMAIN 708 816 CUB 3.
 DOMAIN 816 928 CUB 4.
 DOMAIN 932 1042 CUB 5.
 DOMAIN 1048 1161 CUB 6.
 DOMAIN 1165 1277 CUB 7.
 DOMAIN 1278 1389 CUB 8.
 DOMAIN 1391 1506 CUB 9.
 DOMAIN 1510 1619 CUB 10.
 DOMAIN 1620 1734 CUB 11.
 DOMAIN 1738 1850 CUB 12.
 DOMAIN 1852 1963 CUB 13.
 DOMAIN 1978 2091 CUB 14.
 DOMAIN 2092 2213 CUB 15.
 DOMAIN 2217 2334 CUB 16.
 DOMAIN 2336 2448 CUB 17.
 DOMAIN 2452 2565 CUB 18.
 DOMAIN 2570 2687 CUB 19.
 DOMAIN 2689 2801 CUB 20.
 DOMAIN 2805 2919 CUB 21.
 DOMAIN 2920 3035 CUB 22.
 DOMAIN 3037 3150 CUB 23.
 DOMAIN 3157 3274 CUB 24.
 DOMAIN 3278 3393 CUB 25.
 DOMAIN 3395 3507 CUB 26.
 DOMAIN 3511 3623 CUB 27.
 METAL 980 980 Calcium 1.
 METAL 988 988 Calcium 1.
 METAL 1027 1027 Calcium 1.
 METAL 1029 1029 Calcium 1.
 METAL 1030 1030 Calcium 1; via carbonyl oxygen.
 METAL 1096 1096 Calcium 2.
 METAL 1105 1105 Calcium 2.
 METAL 1146 1146 Calcium 2.
 METAL 1148 1148 Calcium 2; via carbonyl oxygen.
 METAL 1149 1149 Calcium 2; via carbonyl oxygen.
 METAL 1213 1213 Calcium 3.
 METAL 1221 1221 Calcium 3.
 METAL 1262 1262 Calcium 3.
 METAL 1264 1264 Calcium 3; via carbonyl oxygen.
 METAL 1265 1265 Calcium 3; via carbonyl oxygen.
 METAL 1328 1328 Calcium 4.
 METAL 1336 1336 Calcium 4.
 METAL 1373 1373 Calcium 4.
 METAL 1375 1375 Calcium 4; via carbonyl oxygen.
 MOD_RES 3008 3008 Phosphothreonine (By similarity).
 CARBOHYD 3 3 N-linked (GlcNAc...) (Potential).
 CARBOHYD 105 105 N-linked (GlcNAc...) (Potential).
 CARBOHYD 428 428 N-linked (GlcNAc...) (Potential).
 CARBOHYD 482 482 N-linked (GlcNAc...) (Potential).
 CARBOHYD 711 711 N-linked (GlcNAc...) (Potential).
 CARBOHYD 749 749 N-linked (GlcNAc...) (Potential).
 CARBOHYD 781 781 N-linked (GlcNAc...) (Potential).
 CARBOHYD 857 857 N-linked (GlcNAc...) (Potential).
 CARBOHYD 957 957 N-linked (GlcNAc...) (Potential).
 CARBOHYD 984 984 N-linked (GlcNAc...).
 CARBOHYD 1092 1092 N-linked (GlcNAc...).
 CARBOHYD 1168 1168 N-linked (GlcNAc...).
 CARBOHYD 1217 1217 N-linked (GlcNAc...).
 CARBOHYD 1285 1285 N-linked (GlcNAc...).
 CARBOHYD 1307 1307 N-linked (GlcNAc...).
 CARBOHYD 1319 1319 N-linked (GlcNAc...).
 CARBOHYD 1332 1332 N-linked (GlcNAc...).
 CARBOHYD 1500 1500 N-linked (GlcNAc...) (Potential).
 CARBOHYD 1551 1551 N-linked (GlcNAc...) (Potential).
 CARBOHYD 1646 1646 N-linked (GlcNAc...) (Potential).
 CARBOHYD 1802 1802 N-linked (GlcNAc...) (Potential).
 CARBOHYD 1819 1819 N-linked (GlcNAc...) (Potential).
 CARBOHYD 1885 1885 N-linked (GlcNAc...) (Potential).
 CARBOHYD 2085 2085 N-linked (GlcNAc...) (Potential).
 CARBOHYD 2117 2117 N-linked (GlcNAc...) (Potential).
 CARBOHYD 2274 2274 N-linked (GlcNAc...) (Potential).
 CARBOHYD 2386 2386 N-linked (GlcNAc...) (Potential).
 CARBOHYD 2400 2400 N-linked (GlcNAc...) (Potential).
 CARBOHYD 2531 2531 N-linked (GlcNAc...) (Potential).
 CARBOHYD 2581 2581 N-linked (GlcNAc...) (Potential).
 CARBOHYD 2592 2592 N-linked (GlcNAc...) (Potential).
 CARBOHYD 2610 2610 N-linked (GlcNAc...) (Potential).
 CARBOHYD 2813 2813 N-linked (GlcNAc...) (Potential).
 CARBOHYD 2923 2923 N-linked (GlcNAc...) (Potential).
 CARBOHYD 2945 2945 N-linked (GlcNAc...) (Potential).
 CARBOHYD 3042 3042 N-linked (GlcNAc...) (Potential).
 CARBOHYD 3103 3103 N-linked (GlcNAc...) (Potential).
 CARBOHYD 3125 3125 N-linked (GlcNAc...) (Potential).
 CARBOHYD 3165 3165 N-linked (GlcNAc...) (Potential).
 CARBOHYD 3268 3268 N-linked (GlcNAc...) (Potential).
 CARBOHYD 3283 3283 N-linked (GlcNAc...) (Potential).
 CARBOHYD 3290 3290 N-linked (GlcNAc...) (Potential).
 CARBOHYD 3295 3295 N-linked (GlcNAc...) (Potential).
 CARBOHYD 3357 3357 N-linked (GlcNAc...) (Potential).
 CARBOHYD 3430 3430 N-linked (GlcNAc...) (Potential).
 CARBOHYD 3457 3457 N-linked (GlcNAc...) (Potential).
 CARBOHYD 3533 3533 N-linked (GlcNAc...) (Potential).
 CARBOHYD 3576 3576 N-linked (GlcNAc...) (Potential).
 DISULFID 136 147 By similarity.
 DISULFID 141 156 By similarity.
 DISULFID 158 167 By similarity.
 DISULFID 174 190 By similarity.
 DISULFID 184 199 By similarity.
 DISULFID 201 210 By similarity.
 DISULFID 267 280 By similarity.
 DISULFID 274 289 By similarity.
 DISULFID 292 303 By similarity.
 DISULFID 353 366 By similarity.
 DISULFID 360 376 By similarity.
 DISULFID 399 409 By similarity.
 DISULFID 404 418 By similarity.
 DISULFID 420 429 By similarity.
 DISULFID 436 447 By similarity.
 DISULFID 441 456 By similarity.
 DISULFID 458 467 By similarity.
 DISULFID 474 500 By similarity.
 DISULFID 527 549 By similarity.
 DISULFID 590 616 By similarity.
 DISULFID 643 665 By similarity.
 DISULFID 708 734 By similarity.
 DISULFID 869 891 By similarity.
 DISULFID 932 958
 DISULFID 985 1005
 DISULFID 1048 1074
 DISULFID 1165 1191
 DISULFID 1218 1240
 DISULFID 1278 1306
 DISULFID 1333 1351
 DISULFID 1391 1417 By similarity.
 DISULFID 1444 1466 By similarity.
 DISULFID 1510 1536 By similarity.
 DISULFID 1563 1581 By similarity.
 DISULFID 1620 1647 By similarity.
 DISULFID 1675 1697 By similarity.
 DISULFID 1738 1764 By similarity.
 DISULFID 1791 1812 By similarity.
 DISULFID 1905 1927 By similarity.
 DISULFID 1978 2006 By similarity.
 DISULFID 2032 2054 By similarity.
 DISULFID 2092 2118 By similarity.
 DISULFID 2217 2247 By similarity.
 DISULFID 2275 2297 By similarity.
 DISULFID 2336 2363 By similarity.
 DISULFID 2390 2411 By similarity.
 DISULFID 2452 2478 By similarity.
 DISULFID 2505 2527 By similarity.
 DISULFID 2570 2599 By similarity.
 DISULFID 2628 2649 By similarity.
 DISULFID 2689 2715 By similarity.
 DISULFID 2742 2764 By similarity.
 DISULFID 2805 2831 By similarity.
 DISULFID 2860 2883 By similarity.
 DISULFID 2920 2946 By similarity.
 DISULFID 2977 2999 By similarity.
 DISULFID 3037 3064 By similarity.
 DISULFID 3091 3113 By similarity.
 DISULFID 3157 3185 By similarity.
 DISULFID 3215 3237 By similarity.
 DISULFID 3278 3306 By similarity.
 DISULFID 3332 3354 By similarity.
 DISULFID 3395 3421 By similarity.
 DISULFID 3448 3470 By similarity.
 DISULFID 3511 3537 By similarity.
 DISULFID 3564 3586 By similarity.  
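The feature lines above follow a "KEY start end note" layout, so they can be searched by residue position. The following is a minimal sketch, assuming that layout holds for every line; the names `FEATURE_RE`, `parse_features` and `features_at` are hypothetical helpers introduced for illustration and are not part of CPLM or UniProt tooling. It looks up which of a few copied feature lines cover the methylated lysine at position 1909.

```python
import re

# A few feature lines copied verbatim from the Sequence Annotation above.
FEATURES = """\
 DOMAIN 1738 1850 CUB 12.
 DOMAIN 1852 1963 CUB 13.
 DOMAIN 1978 2091 CUB 14.
 CARBOHYD 1885 1885 N-linked (GlcNAc...) (Potential).
 DISULFID 1905 1927 By similarity.
"""

# Assumed layout: KEY, start, end, then an optional free-text note.
FEATURE_RE = re.compile(r"^\s*(\w+)\s+(\d+)\s+(\d+)\s*(.*?)\s*$")


def parse_features(text):
    """Return a list of (key, start, end, note) tuples."""
    features = []
    for line in text.splitlines():
        match = FEATURE_RE.match(line)
        if match:
            key, start, end, note = match.groups()
            features.append((key, int(start), int(end), note))
    return features


def features_at(features, position):
    """Return the features that involve the given residue position.

    A DISULFID line names two bonded cysteines rather than a range,
    so it is matched only on its two endpoints.
    """
    hits = []
    for feature in features:
        key, start, end, _note = feature
        if key == "DISULFID":
            if position in (start, end):
                hits.append(feature)
        elif start <= position <= end:
            hits.append(feature)
    return hits


features = parse_features(FEATURES)
print(features_at(features, 1909))
# -> [('DOMAIN', 1852, 1963, 'CUB 13.')]
```

With these lines, position 1909 falls inside the CUB 13 domain (1852 to 1963). Treating DISULFID as a pair of endpoints is a design choice of this sketch; a range test would wrongly report every residue between the two cysteines.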
Keyword
 3D-structure; Calcium; Cholesterol metabolism; Cleavage on pair of basic residues; Cobalamin; Cobalt; Complete proteome; Direct protein sequencing; Disease mutation; Disulfide bond; EGF-like domain; Endocytosis; Endosome; Glycoprotein; Lipid metabolism; Lysosome; Membrane; Metal-binding; Phosphoprotein; Polymorphism; Protein transport; Receptor; Reference proteome; Repeat; Signal; Steroid metabolism; Sterol metabolism; Transport. 
Sequence Source
 UniProt (SWISSPROT/TrEMBL); GenBank; EMBL 
Protein Length
 3623 AA 
Protein Sequence
MMNMSLPFLW SLLTLLIFAE VNGEAGELEL QRQKRSINLQ QPRMATERGN LVFLTGSAQN 60
IEFRTGSLGK IKLNDEDLSE CLHQIQKNKE DIIELKGSAI GLPQNISSQI YQLNSKLVDL 120
ERKFQGLQQT VDKKVCSSNP CQNGGTCLNL HDSFFCICPP QWKGPLCSAD VNECEIYSGT 180
PLSCQNGGTC VNTMGSYSCH CPPETYGPQC ASKYDDCEGG SVARCVHGIC EDLMREQAGE 240
PKYSCVCDAG WMFSPNSPAC TLDRDECSFQ PGPCSTLVQC FNTQGSFYCG ACPTGWQGNG 300
YICEDINECE INNGGCSVAP PVECVNTPGS SHCQACPPGY QGDGRVCTLT DICSVSNGGC 360
HPDASCSSTL GSLPLCTCLP GYTGNGYGPN GCVQLSNICL SHPCLNGQCI DTVSGYFCKC 420
DSGWTGVNCT ENINECLSNP CLNGGTCVDG VDSFSCECTR LWTGALCQVP QQVCGESLSG 480
INGSFSYRSP DVGYVHDVNC FWVIKTEMGK VLRITFTFFR LESMDNCPHE FLQVYDGDSS 540
SAFQLGRFCG SSLPHELLSS DNALYFHLYS EHLRNGRGFT VRWETQQPEC GGILTGPYGS 600
IKSPGYPGNY PPGRDCVWIV VTSPDLLVTF TFGTLSLEHH DDCNKDYLEI RDGPLYQDPL 660
LGKFCTTFSV PPLQTTGPFA RIHFHSDSQI SDQGFHITYL TSPSDLRCGG NYTDPEGELF 720
LPELSGPFTH TRQCVYMMKQ PQGEQIQINF THVELQCQSD SSQNYIEVRD GETLLGKVCG 780
NGTISHIKSI TNSVWIRFKI DASVEKASFR AVYQVACGDE LTGEGVIRSP FFPNVYPGER 840
TCRWTIHQPQ SQVILLNFTV FEIGSSAHCE TDYVEIGSSS ILGSPENKKY CGTDIPSFIT 900
SVYNFLYVTF VKSSSTENHG FMAKFSAEDL ACGEILTEST GTIQSPGHPN VYPHGINCTW 960
HILVQPNHLI HLMFETFHLE FHYNCTNDYL EVYDTDSETS LGRYCGKSIP PSLTSSGNSL 1020
MLVFVTDSDL AYEGFLINYE AISAATACLQ DYTDDLGTFT SPNFPNNYPN NWECIYRITV 1080
RTGQLIAVHF TNFSLEEAIG NYYTDFLEIR DGGYEKSPLL GIFYGSNLPP TIISHSNKLW 1140
LKFKSDQIDT RSGFSAYWDG SSTGCGGNLT TSSGTFISPN YPMPYYHSSE CYWWLKSSHG 1200
SAFELEFKDF HLEHHPNCTL DYLAVYDGPS SNSHLLTQLC GDEKPPLIRS SGDSMFIKLR 1260
TDEGQQGRGF KAEYRQTCEN VVIVNQTYGI LESIGYPNPY SENQHCNWTI RATTGNTVNY 1320
TFLAFDLEHH INCSTDYLEL YDGPRQMGRY CGVDLPPPGS TTSSKLQVLL LTDGVGRREK 1380
GFQMQWFVYG CGGELSGATG SFSSPGFPNR YPPNKECIWY IRTDPGSSIQ LTIHDFDVEY 1440
HSRCNFDVLE IYGGPDFHSP RIAQLCTQRS PENPMQVSST GNELAIRFKT DLSINGRGFN 1500
ASWQAVTGGC GGIFQAPSGE IHSPNYPSPY RSNTDCSWVI RVDRNHRVLL NFTDFDLEPQ 1560
DSCIMAYDGL SSTMSRLART CGREQLANPI VSSGNSLFLR FQSGPSRQNR GFRAQFRQAC 1620
GGHILTSSFD TVSSPRFPAN YPNNQNCSWI IQAQPPLNHI TLSFTHFELE RSTTCARDFV 1680
EILDGGHEDA PLRGRYCGTD MPHPITSFSS ALTLRFVSDS SISAGGFHTT VTASVSACGG 1740
TFYMAEGIFN SPGYPDIYPP NVECVWNIVS SPGNRLQLSF ISFQLEDSQD CSRDFVEIRE 1800
GNATGHLVGR YCGNSFPLNY SSIVGHTLWV RFISDGSGSG TGFQATFMKI FGNDNIVGTH 1860
GKVASPFWPE NYPHNSNYQW TVNVNASHVV HGRILEMDIE EIQNCYYDKL RIYDGPSIHA 1920
RLIGAYCGTQ TESFSSTGNS LTFHFYSDSS ISGKGFLLEW FAVDAPDGVL PTIAPGACGG 1980
FLRTGDAPVF LFSPGWPDSY SNRVDCTWLI QAPDSTVELN ILSLDIESHR TCAYDSLVIR 2040
DGDNNLAQQL AVLCGREIPG PIRSTGEYMF IRFTSDSSVT RAGFNASFHK SCGGYLHADR 2100
GIITSPKYPE TYPSNLNCSW HVLVQSGLTI AVHFEQPFQI PNGDSSCNQG DYLVLRNGPD 2160
ICSPPLGPPG GNGHFCGSHA SSTLFTSDNQ MFVQFISDHS NEGQGFKIKY EAKSLACGGN 2220
VYIHDADSAG YVTSPNHPHN YPPHADCIWI LAAPPETRIQ LQFEDRFDIE VTPNCTSNYL 2280
ELRDGVDSDA PILSKFCGTS LPSSQWSSGE VMYLRFRSDN SPTHVGFKAK YSIAQCGGRV 2340
PGQSGVVESI GHPTLPYRDN LFCEWHLQGL SGHYLTISFE DFNLQNSSGC EKDFVEIWDN 2400
HTSGNILGRY CGNTIPDSID TSSNTAVVRF VTDGSVTASG FRLRFESSME ECGGDLQGSI 2460
GTFTSPNYPN PNPHGRICEW RITAPEGRRI TLMFNNLRLA THPSCNNEHV IVFNGIRSNS 2520
PQLEKLCSSV NVSNEIKSSG NTMKVIFFTD GSRPYGGFTA SYTSSEDAVC GGSLPNTPEG 2580
NFTSPGYDGV RNYSRNLNCE WTLSNPNQGN SSISIHFEDF YLESHQDCQF DVLEFRVGDA 2640
DGPLMWRLCG PSKPTLPLVI PYSQVWIHFV TNERVEHIGF HAKYSFTDCG GIQIGDSGVI 2700
TSPNYPNAYD SLTHCSSLLE APQGHTITLT FSDFDIEPHT TCAWDSVTVR NGGSPESPII 2760
GQYCGNSNPR TIQSGSNQLV VTFNSDHSLQ GGGFYATWNT QTLGCGGIFH SDNGTIRSPH 2820
WPQNFPENSR CSWTAITHKS KHLEISFDNN FLIPSGDGQC QNSFVKVWAG TEEVDKALLA 2880
TGCGNVAPGP VITPSNTFTA VFQSQEAPAQ GFSASFVSRC GSNFTGPSGY IISPNYPKQY 2940
DNNMNCTYVI EANPLSVVLL TFVSFHLEAR SAVTGSCVND GVHIIRGYSV MSTPFATVCG 3000
DEMPAPLTIA GPVLLNFYSN EQITDFGFKF SYRIISCGGV FNFSSGIITS PAYSYADYPN 3060
DMHCLYTITV SDDKVIELKF SDFDVVPSTS CSHDYLAIYD GANTSDPLLG KFCGSKRPPN 3120
VKSSNNSMLL VFKTDSFQTA KGWKMSFRQT LGPQQGCGGY LTGSNNTFAS PDSDSNGMYD 3180
KNLNCVWIII APVNKVIHLT FNTFALEAAS TRQRCLYDYV KLYDGDSENA NLAGTFCGST 3240
VPAPFISSGN FLTVQFISDL TLEREGFNAT YTIMDMPCGG TYNATWTPQN ISSPNSSDPD 3300
VPFSICTWVI DSPPHQQVKI TVWALQLTSQ DCTQNYLQLQ DSPQGHGNSR FQFCGRNASA 3360
VPVFYSSMST AMVIFKSGVV NRNSRMSFTY QIADCNRDYH KAFGNLRSPG WPDNYDNDKD 3420
CTVTLTAPQN HTISLFFHSL GIENSVECRN DFLEVRNGSN SNSPLLGKYC GTLLPNPVFS 3480
QNNELYLRFK SDSVTSDRGY EIIWTSSPSG CGGTLYGDRG SFTSPGYPGT YPNNTYCEWV 3540
LVAPAGRLVT INFYFISIDD PGDCVQNYLT LYDGPNASSP SSGPYCGGDT SIAPFVASSN 3600
QVFIKFHADY ARRPSAFRLT WDS 3623 
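Each line of the sequence block above holds space-separated groups of ten residues followed by the index of the last residue on that line. The following is a minimal sketch, assuming that layout, of how a reported modification site can be checked against the sequence; the function names are hypothetical and chosen for illustration. It uses the line ending at 1920 to confirm that position 1909 is a lysine and to rebuild the flanking peptide.

```python
# One line of the sequence block above, copied verbatim (ends at residue 1920).
SEQ_LINE = "GKVASPFWPE NYPHNSNYQW TVNVNASHVV HGRILEMDIE EIQNCYYDKL RIYDGPSIHA 1920"


def parse_numbered_line(line):
    """Return (start_position, residues) for one numbered sequence line."""
    *groups, last_index = line.split()
    residues = "".join(groups)
    # The trailing number is the 1-based index of the last residue on the line.
    start = int(last_index) - len(residues) + 1
    return start, residues


def residue_at(line, position):
    """Return the residue at a 1-based protein position on this line."""
    start, residues = parse_numbered_line(line)
    offset = position - start
    if not 0 <= offset < len(residues):
        raise ValueError(f"position {position} is not on this line")
    return residues[offset]


def flanking_peptide(line, position, flank=7):
    """Return the residue plus up to `flank` neighbours on each side."""
    start, residues = parse_numbered_line(line)
    offset = position - start
    if not 0 <= offset < len(residues):
        raise ValueError(f"position {position} is not on this line")
    return residues[max(0, offset - flank): offset + flank + 1]


print(residue_at(SEQ_LINE, 1909))        # -> K
print(flanking_peptide(SEQ_LINE, 1909))  # -> IQNCYYDKLRIYDGP
```

The rebuilt window matches the peptide listed for the methylation site. The sketch handles a single line only, so a window that crosses a line boundary would be truncated; joining all lines first would avoid that.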
Gene Ontology
 GO:0016324; C:apical plasma membrane; IEA:Compara.
 GO:0031526; C:brush border membrane; NAS:UniProtKB.
 GO:0005905; C:coated pit; IEA:Compara.
 GO:0005829; C:cytosol; TAS:Reactome.
 GO:0030139; C:endocytic vesicle; IEA:Compara.
 GO:0005783; C:endoplasmic reticulum; IEA:Compara.
 GO:0010008; C:endosome membrane; IEA:UniProtKB-SubCell.
 GO:0031232; C:extrinsic to external side of plasma membrane; NAS:UniProtKB.
 GO:0005794; C:Golgi apparatus; IEA:Compara.
 GO:0043202; C:lysosomal lumen; TAS:Reactome.
 GO:0005765; C:lysosomal membrane; IEA:UniProtKB-SubCell.
 GO:0005509; F:calcium ion binding; IEA:InterPro.
 GO:0031419; F:cobalamin binding; IEA:UniProtKB-KW.
 GO:0042803; F:protein homodimerization activity; IDA:UniProtKB.
 GO:0004872; F:receptor activity; TAS:ProtInc.
 GO:0005215; F:transporter activity; TAS:ProtInc.
 GO:0008203; P:cholesterol metabolic process; IEA:UniProtKB-KW.
 GO:0015889; P:cobalamin transport; TAS:ProtInc.
 GO:0042157; P:lipoprotein metabolic process; TAS:Reactome.
 GO:0042953; P:lipoprotein transport; IEA:Compara.
 GO:0006898; P:receptor-mediated endocytosis; NAS:UniProtKB.
 GO:0001894; P:tissue homeostasis; NAS:UniProtKB.
 GO:0042359; P:vitamin D metabolic process; TAS:Reactome. 
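The Gene Ontology lines above share a fixed "ID; aspect:term; evidence:source." layout, where the aspect letters C, F and P stand for cellular component, molecular function and biological process. The following is a minimal sketch, assuming every line follows that layout, of splitting a line into its fields; `parse_go_line` and the dictionary keys are hypothetical names used only for illustration.

```python
# Three lines copied verbatim from the Gene Ontology list above.
GO_LINES = """\
 GO:0031526; C:brush border membrane; NAS:UniProtKB.
 GO:0042803; F:protein homodimerization activity; IDA:UniProtKB.
 GO:0015889; P:cobalamin transport; TAS:ProtInc.
"""

ASPECTS = {
    "C": "cellular component",
    "F": "molecular function",
    "P": "biological process",
}


def parse_go_line(line):
    """Split one GO line into id, aspect, term, evidence code and source."""
    fields = [part.strip() for part in line.strip().rstrip(".").split(";")]
    go_id, aspect_term, evidence_source = fields
    aspect, term = aspect_term.split(":", 1)
    evidence, source = evidence_source.split(":", 1)
    return {
        "id": go_id,
        "aspect": ASPECTS[aspect],
        "term": term,
        "evidence": evidence,
        "source": source,
    }


annotations = [parse_go_line(line) for line in GO_LINES.splitlines() if line.strip()]
for annotation in annotations:
    print(annotation["aspect"], "|", annotation["term"], "|", annotation["evidence"])
```

Keeping the evidence code as its own field makes it straightforward to separate experimentally supported annotations such as IDA from electronically inferred ones such as IEA.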
Interpro
 IPR000859; CUB_dom.
 IPR000742; EG-like_dom.
 IPR001881; EGF-like_Ca-bd.
 IPR013032; EGF-like_CS.
 IPR000152; EGF-type_Asp/Asn_hydroxyl_site.
 IPR018097; EGF_Ca-bd_CS.
 IPR024731; EGF_dom_MSP1-like.
 IPR009030; Growth_fac_rcpt_N_dom. 
Pfam
 PF00431; CUB
 PF00008; EGF
 PF12947; EGF_3
 PF07645; EGF_CA 
SMART
 SM00042; CUB
 SM00181; EGF
 SM00179; EGF_CA 
PROSITE
 PS00010; ASX_HYDROXYL
 PS01180; CUB
 PS00022; EGF_1
 PS01186; EGF_2
 PS50026; EGF_3
 PS01187; EGF_CA 
PRINTS