Protein

Aggrecan core protein

Gene

ACAN

Organism
Bos taurus (Bovine)
Status
Reviewed. Annotation score: 5 out of 5. Experimental evidence at protein level.

Function

This proteoglycan is a major component of the extracellular matrix of cartilaginous tissues. A major function of this protein is to resist compression in cartilage. It binds avidly to hyaluronic acid via an N-terminal globular region. May play a regulatory role in the matrix assembly of the cartilage.

Sites

Feature key     Position(s)   Description                                      Length
Metal binding   2215          Calcium 1 (By similarity)                        1
Metal binding   2219          Calcium 1 (By similarity)                        1
Metal binding   2219          Calcium 3 (By similarity)                        1
Metal binding   2239          Calcium 2 (By similarity)                        1
Metal binding   2241          Calcium 2 (By similarity)                        1
Metal binding   2242          Calcium 1 (By similarity)                        1
Metal binding   2248          Calcium 1; via carbonyl oxygen (By similarity)   1
Metal binding   2248          Calcium 2 (By similarity)                        1
Metal binding   2249          Calcium 1 (By similarity)                        1
Metal binding   2249          Calcium 3 (By similarity)                        1
Metal binding   2262          Calcium 2 (By similarity)                        1
Metal binding   2263          Calcium 2 (By similarity)                        1
Metal binding   2263          Calcium 2; via carbonyl oxygen (By similarity)   1

Keywords - Ligand

Calcium, Lectin, Metal-binding

Names & Taxonomy

Protein names
Recommended name: Aggrecan core protein
Alternative name(s): Cartilage-specific proteoglycan core protein
Short name: CSPCP
Gene names
Name: ACAN
Synonyms: AGC1
Organism: Bos taurus (Bovine)
Taxonomic identifier: 9913 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Laurasiatheria > Cetartiodactyla > Ruminantia > Pecora > Bovidae > Bovinae > Bos
Proteomes
  • UP000009136 Component: Unplaced

Subcellular location

Keywords - Cellular component

Extracellular matrix, Secreted

PTM / Processing

Molecule processing

Feature key      Position(s)   Description                              Length
Signal peptide   1 – 16        Sequence analysis                        16
Chain            17 – 2364     Aggrecan core protein (PRO_0000017502)   2348
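
The positions in these feature tables are 1-based and inclusive, so a feature's length is end minus start plus one. A minimal sketch of that arithmetic in Python, using the signal peptide and chain coordinates from the table; the helper names `feature_length` and `feature_slice` are illustrative and not part of the entry, and only the first 20 residues of the canonical sequence are used.

```python
def feature_length(start: int, end: int) -> int:
    """Length of a feature given 1-based, inclusive coordinates."""
    return end - start + 1


def feature_slice(sequence: str, start: int, end: int) -> str:
    """Extract a feature from a sequence using 1-based, inclusive coordinates."""
    return sequence[start - 1:end]


# First 20 residues of the canonical sequence (isoform 1), copied from this entry.
N_TERMINUS = "MTTLLLVFVTLRVITAAISV"

signal_peptide_len = feature_length(1, 16)   # signal peptide, 1 – 16
chain_len = feature_length(17, 2364)         # mature chain, 17 – 2364
signal_peptide = feature_slice(N_TERMINUS, 1, 16)
```

The two lengths add up to the full precursor length reported for isoform 1, which is a quick consistency check on the coordinates.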

Amino acid modifications

Feature key      Position(s)   Description                                           Length
Disulfide bond   51 ↔ 133      By similarity
Glycosylation    126           N-linked (GlcNAc...) (Sequence analysis)              1
Disulfide bond   175 ↔ 246     By similarity
Disulfide bond   199 ↔ 220     By similarity
Glycosylation    239           N-linked (GlcNAc...) (Sequence analysis)              1
Disulfide bond   273 ↔ 348     By similarity
Disulfide bond   297 ↔ 318     By similarity
Glycosylation    333           N-linked (GlcNAc...) (Sequence analysis)              1
Glycosylation    371           O-linked (Xyl...) (keratan sulfate) (By similarity)   1
Glycosylation    376           O-linked (Xyl...) (keratan sulfate) (By similarity)   1
Glycosylation    387           N-linked (GlcNAc...) (Sequence analysis)              1
Disulfide bond   509 ↔ 580     By similarity
Disulfide bond   533 ↔ 554     By similarity
Disulfide bond   607 ↔ 682     By similarity
Glycosylation    611           N-linked (GlcNAc...) (Sequence analysis)              1
Disulfide bond   631 ↔ 652     By similarity
Glycosylation    667           N-linked (GlcNAc...) (Sequence analysis)              1
Disulfide bond   2117 ↔ 2128   By similarity
Disulfide bond   2122 ↔ 2137   By similarity
Disulfide bond   2139 ↔ 2148   By similarity
Disulfide bond   2182 ↔ 2274   By similarity
Disulfide bond   2250 ↔ 2266   By similarity
Disulfide bond   2281 ↔ 2324   By similarity
Disulfide bond   2310 ↔ 2337   By similarity

Post-translational modification

Contains mostly chondroitin sulfate, but also N-linked and O-linked (about 40) oligosaccharides.
The keratan sulfate contents differ considerably between adult and fetal bovine proteoglycans.

Keywords - PTM

Disulfide bond, Glycoprotein, Proteoglycan

Proteomic databases

PaxDb: P13608.
PeptideAtlas: P13608.
PRIDE: P13608.

Miscellaneous databases

PMAP-CutDB: P13608.

Interaction

Subunit structure

Interacts with FBLN1 and COMP (By similarity).

Binary interactions

With      Entry    #Exp.   IntAct                     Notes
ADAMTS5   Q9UNA0   2       EBI-6259246, EBI-2808663   From a different organism.
COMP      P49747   2       EBI-6259246, EBI-2531022   From a different organism.

Protein-protein interaction databases

IntAct: P13608. 2 interactors.
STRING: 9913.ENSBTAP00000021512.

Structure

3D structure databases

ProteinModelPortal: P13608.

Family & Domains

Domains and Repeats

Feature key   Position(s)   Description                                                  Length
Domain        25 – 147      Ig-like V-type                                               123
Domain        153 – 248     Link 1 (PROSITE-ProRule annotation)                          96
Domain        254 – 350     Link 2 (PROSITE-ProRule annotation)                          97
Domain        487 – 582     Link 3 (PROSITE-ProRule annotation)                          96
Domain        588 – 684     Link 4 (PROSITE-ProRule annotation)                          97
Repeat        774 – 779     1                                                            6
Repeat        780 – 785     2                                                            6
Repeat        786 – 791     3                                                            6
Repeat        792 – 797     4                                                            6
Repeat        798 – 803     5                                                            6
Repeat        804 – 809     6                                                            6
Repeat        810 – 815     7                                                            6
Repeat        816 – 821     8                                                            6
Repeat        822 – 827     9                                                            6
Repeat        828 – 833     10                                                           6
Repeat        834 – 839     11                                                           6
Repeat        840 – 845     12                                                           6
Repeat        846 – 851     13                                                           6
Repeat        852 – 857     14                                                           6
Repeat        858 – 863     15                                                           6
Repeat        864 – 869     16                                                           6
Repeat        870 – 875     17                                                           6
Repeat        876 – 881     18                                                           6
Repeat        882 – 887     19                                                           6
Repeat        888 – 893     20                                                           6
Repeat        894 – 899     21                                                           6
Repeat        900 – 905     22                                                           6
Domain        2113 – 2149   EGF-like; calcium-binding (PROSITE-ProRule annotation)       37
Domain        2161 – 2276   C-type lectin (PROSITE-ProRule annotation)                   116
Domain        2279 – 2339   Sushi (PROSITE-ProRule annotation)                           61

Region

Feature key   Position(s)   Description                                                           Length
Region        774 – 905     22 X 6 AA tandem repeats of E-[EKGV]-[PL]-[FSI]-[PAT]-[STPL]          132
Region        1433 – 2112   CS-2                                                                  680
Region        2114 – 2364   G3                                                                    251
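
The consensus given for the tandem repeat region, E-[EKGV]-[PL]-[FSI]-[PAT]-[STPL], maps directly onto a regular expression. The sketch below, in Python, splits residues 774 – 905 (copied from the canonical sequence in this entry) into 6-residue units and checks each against the pattern. The names `REPEAT_REGION`, `REPEAT_PATTERN` and `split_repeats` are illustrative and not part of the entry.

```python
import re

# Residues 774-905 of the canonical sequence, copied from the sequence
# listing in this entry (132 residues = 22 units of 6).
REPEAT_REGION = (
    "EGPSATEVPSASEKPFPSEEPFPPEEP"
    "FPSEKPFPPEELFPSEKPFPSEKPFPSEEPFPSEKPFPPEELFPSEKPIP"
    "SEEPFPSEEPFPSEKPFPPEEPFPSEKPIPSEEPFPSEKPFPSEEPFPSE"
    "EPSTL"
)

# The consensus E-[EKGV]-[PL]-[FSI]-[PAT]-[STPL] as a regular expression.
REPEAT_PATTERN = re.compile(r"E[EKGV][PL][FSI][PAT][STPL]")


def split_repeats(region, unit=6):
    """Cut a region into consecutive, non-overlapping units of fixed size."""
    return [region[i:i + unit] for i in range(0, len(region), unit)]


repeats = split_repeats(REPEAT_REGION)
# True only if every 6-residue unit matches the consensus in full.
all_match = all(REPEAT_PATTERN.fullmatch(r) is not None for r in repeats)
```

Splitting on fixed 6-residue boundaries works here because the annotated repeats are contiguous and all the same length; a region with variable-length repeats would need a scanning approach instead.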

Domain

Two globular domains, G1 and G2, comprise the N-terminus of the proteoglycan, while another globular region, G3, makes up the C-terminus. G1 contains Link domains and thus consists of three disulfide-bonded loop structures designated as the A, B, B' motifs. G2 is similar to G1. The keratan sulfate (KS) and the chondroitin sulfate (CS) attachment domains lie between G2 and G3.

Sequence similarities

Contains 1 C-type lectin domain (PROSITE-ProRule annotation).
Contains 1 EGF-like domain (PROSITE-ProRule annotation).
Contains 4 Link domains (PROSITE-ProRule annotation).
Contains 1 Sushi (CCP/SCR) domain (PROSITE-ProRule annotation).

Keywords - Domain

EGF-like domain, Immunoglobulin domain, Repeat, Signal, Sushi

Phylogenomic databases

eggNOG: ENOG410IJP2. Eukaryota.
  ENOG410XRES. LUCA.
HOVERGEN: HBG007982.
InParanoid: P13608.
KO: K06792.

Family and domain databases

CDD: cd00033. CCP. 1 hit.
Gene3D: 2.60.40.10. 1 hit.
  3.10.100.10. 5 hits.
InterPro: IPR001304. C-type_lectin-like.
  IPR016186. C-type_lectin-like/link.
  IPR018378. C-type_lectin_CS.
  IPR016187. CTDL_fold.
  IPR001881. EGF-like_Ca-bd_dom.
  IPR013032. EGF-like_CS.
  IPR000742. EGF-like_dom.
  IPR000152. EGF-type_Asp/Asn_hydroxyl_site.
  IPR018097. EGF_Ca-bd_CS.
  IPR007110. Ig-like_dom.
  IPR013783. Ig-like_fold.
  IPR003599. Ig_sub.
  IPR013106. Ig_V-set.
  IPR000538. Link_dom.
  IPR000436. Sushi_SCR_CCP_dom.
Pfam: PF00008. EGF. 1 hit.
  PF00059. Lectin_C. 1 hit.
  PF00084. Sushi. 1 hit.
  PF07686. V-set. 1 hit.
  PF00193. Xlink. 4 hits.
PRINTS: PR01265. LINKMODULE.
SMART: SM00032. CCP. 1 hit.
  SM00034. CLECT. 1 hit.
  SM00181. EGF. 1 hit.
  SM00179. EGF_CA. 1 hit.
  SM00409. IG. 1 hit.
  SM00406. IGv. 1 hit.
  SM00445. LINK. 4 hits.
SUPFAM: SSF48726. SSF48726. 1 hit.
  SSF56436. SSF56436. 5 hits.
  SSF57535. SSF57535. 1 hit.
PROSITE: PS00010. ASX_HYDROXYL. 1 hit.
  PS00615. C_TYPE_LECTIN_1. 1 hit.
  PS50041. C_TYPE_LECTIN_2. 1 hit.
  PS00022. EGF_1. 1 hit.
  PS50026. EGF_3. 1 hit.
  PS01187. EGF_CA. 1 hit.
  PS50835. IG_LIKE. 1 hit.
  PS01241. LINK_1. 4 hits.
  PS50963. LINK_2. 4 hits.
  PS50923. SUSHI. 1 hit.

Sequences (2)

Sequence status: Complete.

Sequence processing: The displayed sequence is further processed into a mature form.

This entry describes 2 isoforms produced by alternative splicing.

Isoform 1 (identifier: P13608-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MTTLLLVFVT LRVITAAISV EVSEPDNSLS VSIPEPSPLR VLLGSSLTIP
60 70 80 90 100
CYFIDPMHPV TTAPSTAPLA PRIKWSRISK EKEVVLLVAT EGRVRVNSAY
110 120 130 140 150
QDKVTLPNYP AIPSDATLEI QNMRSNDSGI LRCEVMHGIE DSQATLEVVV
160 170 180 190 200
KGIVFHYRAI STRYTLDFDR AQRACLQNSA IIATPEQLQA AYEDGFHQCD
210 220 230 240 250
AGWLADQTVR YPIHTPREGC YGDKDEFPGV RTYGIRDTNE TYDVYCFAEE
260 270 280 290 300
MEGEVFYATS PEKFTFQEAA NECRRLGARL ATTGQLYLAW QGGMDMCSAG
310 320 330 340 350
WLADRSVRYP ISKARPNCGG NLLGVRTVYL HANQTGYPDP SSRYDAICYT
360 370 380 390 400
GEDFVDIPES FFGVGGEEDI TIQTVTWPDV ELPLPRNITE GEARGSVILT
410 420 430 440 450
AKPDFEVSPT APEPEEPFTF VPEVRATAFP EVENRTEEAT RPWAFPREST
460 470 480 490 500
PGLGAPTAFT SEDLVVQVTL APGAAEVPGQ PRLPGGVVFH YRPGSSRYSL
510 520 530 540 550
TFEEAKQACL RTGAIIASPE QLQAAYEAGY EQCDAGWLQD QTVRYPIVSP
560 570 580 590 600
RTPCVGDKDS SPGVRTYGVR PPSETYDVYC YVDRLEGEVF FATRLEQFTF
610 620 630 640 650
WEAQEFCESQ NATLATTGQL YAAWSRGLDK CYAGWLADGS LRYPIVTPRP
660 670 680 690 700
ACGGDKPGVR TVYLYPNQTG LLDPLSRHHA FCFRGVSAAP SPEEEEGSAP
710 720 730 740 750
TAGPDVEEWM VTQVGPGVAA VPIGEETTAI PGFTVEPENK TEWELAYTPA
760 770 780 790 800
GTLPLPGIPP TWPPTGEATE EHTEGPSATE VPSASEKPFP SEEPFPPEEP
810 820 830 840 850
FPSEKPFPPE ELFPSEKPFP SEKPFPSEEP FPSEKPFPPE ELFPSEKPIP
860 870 880 890 900
SEEPFPSEEP FPSEKPFPPE EPFPSEKPIP SEEPFPSEKP FPSEEPFPSE
910 920 930 940 950
EPSTLSAPVP SRTELPSSGE VSGVPEISGD FTGSGEISGH LDFSGQPSGE
960 970 980 990 1000
SASGLPSEDL DSSGLTSTVG SGLPVESGLP SGEEERITWT SAPKVDRLPS
1010 1020 1030 1040 1050
GGEGPEVSGV EDISGLPSGG EVHLEISASG VEDISGLPSG GEVHLEISAS
1060 1070 1080 1090 1100
GVEDLSRIPS GEGPEISASG VEDISGLPSG EEGHLEISAS GVEDLSGIPS
1110 1120 1130 1140 1150
GEGPEVSASG VEDLIGLPSG EGPEVSASGV EDLSRLPSGE GPEVSASGVE
1160 1170 1180 1190 1200
DLSGLPSGEG PEVSVSGVED LSRLPSGEGP EVSASGVEDL SRLPSGEGPE
1210 1220 1230 1240 1250
ISVSGVEDIS ILPSGEGPEV SASGVEDLSV LPSGEGHLEI STSGVEDLSV
1260 1270 1280 1290 1300
LPSGEGHLET SSGVEDISRL PSGEGPEVSA SGVEDLSVLP SGEDHLEISA
1310 1320 1330 1340 1350
SGVEDLGVLP SGEDHLEISA SGVEDISRLP SGEGPEVSAS GVEDLSVLPS
1360 1370 1380 1390 1400
GEGHLEISAS GVEDLSRLPS GGEDHLETSA SGVGDLSGLP SGREGLEISA
1410 1420 1430 1440 1450
SGAGDLSGLT SGKEDLTGSA SGALDLGRIP SVTLGSGQAP EASGLPSGFS
1460 1470 1480 1490 1500
GEYSGVDLES GPSSGLPDFS GLPSGFPTVS LVDTTLVEVV TATTAGELEG
1510 1520 1530 1540 1550
RGTIDISGAG ETSGLPFSEL DISGGASGLS SGAELSGQAS GSPDISGETS
1560 1570 1580 1590 1600
GLFGVSGQPS GFPDISGETS GLLEVSGQPS GFYGEISGVT ELSGLASGQP
1610 1620 1630 1640 1650
EISGEASGIL SGLGPPFGIT DLSGEAPGIP DLSGQPSGLP EFSGTASGIP
1660 1670 1680 1690 1700
DLVSSAVSGS GESSGITFVD TSLVEVTPTT FKEEEGLGSV ELSGLPSGEL
1710 1720 1730 1740 1750
GVSGTSGLAD VSGLSSGAID SSGFTSQPPE FSGLPSGVTE VSGEASGAES
1760 1770 1780 1790 1800
GSSLPSGAYD SSGLPSGFPT VSFVDRTLVE SVTQAPTAQE AGEGPSGILE
1810 1820 1830 1840 1850
LSGAPSGAPD MSGDHLGSLD QSGLQSGLVE PSGEPASTPY FSGDFSGTTD
1860 1870 1880 1890 1900
VSGESSAATS TSGEASGLPE VTLITSELVE GVTEPTVSQE LGQRPPVTYT
1910 1920 1930 1940 1950
PQLFESSGEA SASGDVPRFP GSGVEVSSVP ESSGETSAYP EAEVGASAAP
1960 1970 1980 1990 2000
EASGGASGSP NLSETTSTFH EADLEGTSGL GVSGSPSAFP EGPTEGLATP
2010 2020 2030 2040 2050
EVSGESTTAF DVSVEASGSP SATPLASGDR TDTSGDLSGH TSGLDIVIST
2060 2070 2080 2090 2100
TIPESEWTQQ TQRPAEARLE IESSSPVHSG EESQTADTAT SPTDASIPAS
2110 2120 2130 2140 2150
AGGTDDSEAT TTDIDECLSS PCLNGATCVD AIDSFTCLCL PSYQGDVCEI
2160 2170 2180 2190 2200
QKLCEEGWTK FQGHCYRHFP DRATWVDAES QCRKQQSHLS SIVTPEEQEF
2210 2220 2230 2240 2250
VNNNAQDYQW IGLNDKTIEG DFRWSDGHSL QFENWRPNQP DNFFATGEDC
2260 2270 2280 2290 2300
VVMIWHEKGE WNDVPCNYQL PFTCKKGTVA CGEPPVVEHA RIFGQKKDRY
2310 2320 2330 2340 2350
EINALVRYQC TEGFIQGHVP TIRCQPSGHW EEPRITCTDP ATYKRRLQKR
2360
SSRPLRRSHP STAH
Length: 2,364
Mass (Da): 246,362
Last modified: July 15, 1998 - v3
Checksum: 6FF83763420C3D4C

Isoform 2 (identifier: P13608-2)

The sequence of this isoform differs from the canonical sequence as follows:
     2114-2150: Missing.

Length: 2,327
Mass (Da): 242,481
Checksum: 5C048060466806B0
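
Isoform 2 is described as the canonical sequence with residues 2114 – 2150 missing, a span that largely overlaps the annotated EGF-like domain (2113 – 2149). The following Python sketch shows how such a "Missing" variation can be applied with 1-based, inclusive coordinates, and checks that the resulting length agrees with the one reported for isoform 2. The function name `remove_segment` and the placeholder sequence are assumptions made for illustration; the real residues are not reproduced here.

```python
def remove_segment(sequence: str, start: int, end: int) -> str:
    """Delete residues start..end (1-based, inclusive) from a sequence."""
    return sequence[:start - 1] + sequence[end:]


CANONICAL_LENGTH = 2364                    # length of isoform 1
MISSING_START, MISSING_END = 2114, 2150    # segment absent from isoform 2

missing_len = MISSING_END - MISSING_START + 1
isoform2_length = CANONICAL_LENGTH - missing_len

# Placeholder sequence of canonical length: "B" marks the deleted span,
# "A" and "C" mark the flanking residues that are kept.
placeholder = "A" * 2113 + "B" * missing_len + "C" * (CANONICAL_LENGTH - 2150)
isoform2_placeholder = remove_segment(placeholder, MISSING_START, MISSING_END)
```

With the full canonical sequence in place of the placeholder, the same call would yield the isoform 2 sequence.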

Experimental Info

Feature key         Position(s)   Description                                                Length
Sequence conflict   573 – 576     SETY → QSET in AA sequence (PubMed:2022637) (Curated)      4

Alternative sequence

Feature key            Position(s)    Description                                           Length
Alternative sequence   2114 – 2150    Missing in isoform 2 (VSP_003072). 1 Publication      37

Sequence databases

EMBL / GenBank / DDBJ:
  U76615 mRNA. Translation: AAB38524.1.
  AY226875, AY226858, AY226859, AY226860, AY226861, AY226862, AY226863, AY226864, AY226865, AY226866, AY226867, AY226868, AY226871, AY226872, AY226873, AY226874 Genomic DNA. Translation: AAP44492.1.
  L07053 mRNA. No translation available.
PIR: A29164.
  A34234. A39808.
  B29164.
  S74144.
  T42630.
RefSeq: NP_776406.1. NM_173981.2. [P13608-2]
UniGene: Bt.4953.
  Bt.92700.

Genome annotation databases

GeneID: 280985.
KEGG: bta:280985.

Keywords - Coding sequence diversity

Alternative splicing

Cross-references

Organism-specific databases

CTD: 176.

Entry information

Entry name: PGCA_BOVIN
Accession: Primary (citable) accession number: P13608
Secondary accession number(s): P79117, Q28159, Q6XL66
Entry history
Integrated into UniProtKB/Swiss-Prot: January 1, 1990
Last sequence update: July 15, 1998
Last modified: November 30, 2016
This is version 157 of the entry and version 3 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program

Miscellaneous

Keywords - Technical term

Complete proteome, Direct protein sequencing, Reference proteome

Documents

  1. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.