Protein

Aggrecan core protein

Gene

Acan

Organism
Rattus norvegicus (Rat)
Status
Reviewed - Annotation score: 5 out of 5 - Experimental evidence at protein level

Function

This proteoglycan is a major component of the extracellular matrix of cartilaginous tissues. A major function of this protein is to resist compression in cartilage. It binds avidly to hyaluronic acid via an N-terminal globular region. It may play a regulatory role in the matrix assembly of the cartilage.

Sites

Feature key | Position(s) | Length | Description
Metal binding | 1975 – 1975 | 1 | Calcium 1
Metal binding | 1979 – 1979 | 1 | Calcium 1
Metal binding | 1979 – 1979 | 1 | Calcium 3
Metal binding | 1999 – 1999 | 1 | Calcium 2
Metal binding | 2001 – 2001 | 1 | Calcium 2
Metal binding | 2002 – 2002 | 1 | Calcium 1
Metal binding | 2008 – 2008 | 1 | Calcium 1; via carbonyl oxygen
Metal binding | 2008 – 2008 | 1 | Calcium 2
Metal binding | 2009 – 2009 | 1 | Calcium 1
Metal binding | 2009 – 2009 | 1 | Calcium 3
Metal binding | 2022 – 2022 | 1 | Calcium 2
Metal binding | 2023 – 2023 | 1 | Calcium 2
Metal binding | 2023 – 2023 | 1 | Calcium 2; via carbonyl oxygen
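
The rows above list one residue per calcium contact, so a residue that coordinates two ions appears twice. A minimal Python sketch, assuming the positions and calcium numbers are transcribed by hand from the table (the names `METAL_BINDING`, `residues_by_calcium` and `shared_residues` are illustrative and not part of any UniProt tool), groups the contacts by ion and finds the shared residues:

```python
from collections import defaultdict

# (residue position, calcium ion number), transcribed from the Sites table.
METAL_BINDING = [
    (1975, 1), (1979, 1), (1979, 3), (1999, 2), (2001, 2), (2002, 1),
    (2008, 1), (2008, 2), (2009, 1), (2009, 3), (2022, 2), (2023, 2),
    (2023, 2),
]


def residues_by_calcium(rows):
    """Map each calcium ion number to the sorted residues that contact it."""
    sites = defaultdict(set)
    for position, ion in rows:
        sites[ion].add(position)
    return {ion: sorted(positions) for ion, positions in sorted(sites.items())}


def shared_residues(rows):
    """Return the residues that contact more than one calcium ion."""
    ions_by_residue = defaultdict(set)
    for position, ion in rows:
        ions_by_residue[position].add(ion)
    return sorted(p for p, ions in ions_by_residue.items() if len(ions) > 1)


print(residues_by_calcium(METAL_BINDING))
print(shared_residues(METAL_BINDING))
```

Using a set per ion collapses the two rows for residue 2023, which both refer to Calcium 2.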

GO - Molecular function

  • calcium ion binding Source: RGD
  • carbohydrate binding Source: UniProtKB-KW
  • hyaluronic acid binding Source: InterPro

GO - Biological process

  • cell adhesion Source: InterPro
  • cellular response to growth factor stimulus Source: RGD
  • central nervous system development Source: RGD
  • chondroblast differentiation Source: RGD
  • multicellular organism aging Source: RGD
  • negative regulation of cell migration Source: RGD
  • ossification Source: RGD
  • response to acidic pH Source: RGD
  • response to drug Source: RGD
  • response to glucose Source: RGD
  • response to gravity Source: RGD
  • response to mechanical stimulus Source: RGD
  • response to organic cyclic compound Source: RGD
  • response to radiation Source: RGD
  • spinal cord development Source: RGD

Keywords - Ligand

Lectin, Metal-binding

Names & Taxonomy

Protein names
Recommended name: Aggrecan core protein
Alternative name(s): Cartilage-specific proteoglycan core protein
Short name: CSPCP
Gene names
Name: Acan
Synonyms: Agc, Agc1
Organism: Rattus norvegicus (Rat)
Taxonomic identifier: 10116 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Glires > Rodentia > Sciurognathi > Muroidea > Muridae > Murinae > Rattus
Proteomes
  • UP000002494 Component: Unplaced

Organism-specific databases

RGD: 68358. Acan.

Subcellular location

GO - Cellular component

  • perineuronal net Source: RGD
  • proteinaceous extracellular matrix Source: RGD

Keywords - Cellular component

Extracellular matrix, Secreted

PTM / Processing

Molecule processing

Feature key | Position(s) | Length | Description | Evidence | Feature identifier
Signal peptide | 1 – 19 | 19 | - | 1 Publication | -
Chain | 20 – 2124 | 2105 | Aggrecan core protein | - | PRO_0000017508

Amino acid modifications

Feature key | Position(s) | Length | Description | Evidence
Disulfide bond | 51 ↔ 133 | - | - | By similarity
Glycosylation | 126 – 126 | 1 | N-linked (GlcNAc...) | 1 Publication
Disulfide bond | 175 ↔ 246 | - | - | By similarity
Disulfide bond | 199 ↔ 220 | - | - | By similarity
Glycosylation | 239 – 239 | 1 | N-linked (GlcNAc...) | Sequence analysis
Disulfide bond | 273 ↔ 348 | - | - | By similarity
Disulfide bond | 297 ↔ 318 | - | - | By similarity
Glycosylation | 333 – 333 | 1 | N-linked (GlcNAc...) | 1 Publication
Glycosylation | 371 – 371 | 1 | O-linked (Xyl...) (keratan sulfate) | By similarity
Glycosylation | 376 – 376 | 1 | O-linked (Xyl...) (keratan sulfate) | By similarity
Glycosylation | 387 – 387 | 1 | N-linked (GlcNAc...) | Sequence analysis
Disulfide bond | 509 ↔ 580 | - | - | By similarity
Disulfide bond | 533 ↔ 554 | - | - | By similarity
Disulfide bond | 607 ↔ 682 | - | - | By similarity
Glycosylation | 611 – 611 | 1 | N-linked (GlcNAc...) | Sequence analysis
Disulfide bond | 631 ↔ 652 | - | - | By similarity
Glycosylation | 667 – 667 | 1 | N-linked (GlcNAc...) | Sequence analysis
Glycosylation | 1842 – 1842 | 1 | N-linked (GlcNAc...) | Sequence analysis
Disulfide bond | 1914 ↔ 1925 | - | - | By similarity
Disulfide bond | 1942 ↔ 2034 | - | - | 1 Publication
Disulfide bond | 2010 ↔ 2026 | - | - | 1 Publication
Disulfide bond | 2041 ↔ 2084 | - | - | By similarity
Disulfide bond | 2070 ↔ 2097 | - | - | By similarity

Post-translational modification

Contains mostly chondroitin sulfate, but also keratan sulfate chains, N-linked and O-linked oligosaccharides. (1 Publication)

Keywords - PTM

Disulfide bond, Glycoprotein, Proteoglycan

Proteomic databases

PaxDb: P07897.
PRIDE: P07897.

Interaction

Subunit structure

Interacts with FBLN1 and COMP. (By similarity)

Protein-protein interaction databases

IntAct: P07897. 1 interaction.
STRING: 10116.ENSRNOP00000042691.

Structure

Secondary structure

Feature key | Position(s) | Length | Evidence
Beta strand | 1919 – 1921 | 3 | Combined sources
Beta strand | 1924 – 1933 | 10 | Combined sources
Helix | 1935 – 1944 | 10 | Combined sources
Helix | 1955 – 1965 | 11 | Combined sources
Beta strand | 1969 – 1974 | 6 | Combined sources
Beta strand | 1976 – 1978 | 3 | Combined sources
Turn | 2005 – 2007 | 3 | Combined sources
Beta strand | 2008 – 2013 | 6 | Combined sources
Turn | 2015 – 2019 | 5 | Combined sources
Beta strand | 2021 – 2025 | 5 | Combined sources
Beta strand | 2030 – 2036 | 7 | Combined sources

3D structure databases

Entry | Method | Resolution (Å) | Chain | Positions
1TDQ | X-ray | 2.60 | B | 1909-2037

ProteinModelPortal: P07897.
SMR: P07897. Positions 1912-2037.

Miscellaneous databases

EvolutionaryTrace: P07897.

Family & Domains

Domains and Repeats

Feature key | Position(s) | Length | Description | Evidence
Domain | 34 – 147 | 114 | Ig-like V-type | -
Domain | 153 – 248 | 96 | Link 1 | PROSITE-ProRule annotation
Domain | 254 – 350 | 97 | Link 2 | PROSITE-ProRule annotation
Domain | 487 – 582 | 96 | Link 3 | PROSITE-ProRule annotation
Domain | 588 – 684 | 97 | Link 4 | PROSITE-ProRule annotation
Domain | 1910 – 2036 | 127 | C-type lectin | PROSITE-ProRule annotation
Domain | 2039 – 2099 | 61 | Sushi | PROSITE-ProRule annotation
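
The Length column in these feature tables follows from the positions, which are 1-based and inclusive, so length = end - start + 1. A small Python sketch, assuming the domain rows are copied by hand from the table above (`DOMAINS`, `span_length` and `length_mismatches` are hypothetical names chosen for this example), checks that the stated lengths are consistent:

```python
# (description, start, end, stated length), copied from the Domains table.
DOMAINS = [
    ("Ig-like V-type", 34, 147, 114),
    ("Link 1", 153, 248, 96),
    ("Link 2", 254, 350, 97),
    ("Link 3", 487, 582, 96),
    ("Link 4", 588, 684, 97),
    ("C-type lectin", 1910, 2036, 127),
    ("Sushi", 2039, 2099, 61),
]


def span_length(start, end):
    """Length of a 1-based, inclusive residue range."""
    return end - start + 1


def length_mismatches(rows):
    """Return the descriptions whose stated length disagrees with the range."""
    return [
        name
        for name, start, end, stated in rows
        if span_length(start, end) != stated
    ]


print(length_mismatches(DOMAINS))
```

An empty list means every stated length agrees with its range.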

Region

Feature key | Position(s) | Length | Description
Region | 48 – 140 | 93 | G1-A
Region | 152 – 247 | 96 | G1-B
Region | 253 – 349 | 97 | G1-B'
Region | 486 – 580 | 95 | G2-B
Region | 587 – 682 | 96 | G2-B'
Region | 685 – 798 | 114 | KS
Region | 801 – 1226 | 426 | CS-1
Region | 1227 – 1909 | 683 | CS-2
Region | 1910 – 2124 | 215 | G3

Domain

Two globular domains, G1 and G2, comprise the N-terminus of the proteoglycan, while another globular region, G3, makes up the C-terminus. G1 contains Link domains and thus consists of three disulfide-bonded loop structures designated as the A, B, B' motifs. G2 is similar to G1. The keratan sulfate (KS) and the chondroitin sulfate (CS) attachment domains lie between G2 and G3.
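
The layout described above (G1 and G2 at the N-terminus, then the KS and CS attachment regions, then G3) can be turned into a simple position lookup. The following Python sketch assumes the boundaries from the Region table of this entry; the names `REGIONS` and `region_at` are illustrative only:

```python
# (region name, start, end), 1-based inclusive, from the Region table.
REGIONS = [
    ("G1-A", 48, 140),
    ("G1-B", 152, 247),
    ("G1-B'", 253, 349),
    ("G2-B", 486, 580),
    ("G2-B'", 587, 682),
    ("KS", 685, 798),
    ("CS-1", 801, 1226),
    ("CS-2", 1227, 1909),
    ("G3", 1910, 2124),
]


def region_at(position):
    """Return the annotated region containing a residue, or None."""
    for name, start, end in REGIONS:
        if start <= position <= end:
            return name
    return None


# Residue 2008 is one of the calcium-binding sites; residue 400 lies in the
# unannotated stretch between G1 and G2.
print(region_at(2008), region_at(400))
```

Positions that fall in the gaps between annotated regions return `None` rather than being assigned to the nearest region.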

Sequence similarities

Contains 1 C-type lectin domain. (PROSITE-ProRule annotation)
Contains 4 Link domains. (PROSITE-ProRule annotation)
Contains 1 Sushi (CCP/SCR) domain. (PROSITE-ProRule annotation)

Keywords - Domain

Immunoglobulin domain, Repeat, Signal, Sushi

Phylogenomic databases

eggNOG: ENOG410IJP2. Eukaryota.
ENOG410XRES. LUCA.
HOGENOM: HOG000168421.
HOVERGEN: HBG007982.
InParanoid: P07897.
KO: K06792.

Family and domain databases

Gene3D: 2.60.40.10. 1 hit.
3.10.100.10. 5 hits.
InterPro: IPR001304. C-type_lectin.
IPR016186. C-type_lectin-like.
IPR018378. C-type_lectin_CS.
IPR016187. C-type_lectin_fold.
IPR007110. Ig-like_dom.
IPR013783. Ig-like_fold.
IPR003006. Ig/MHC_CS.
IPR003599. Ig_sub.
IPR013106. Ig_V-set.
IPR000538. Link_dom.
IPR000436. Sushi_SCR_CCP_dom.
Pfam: PF00059. Lectin_C. 1 hit.
PF00084. Sushi. 1 hit.
PF07686. V-set. 1 hit.
PF00193. Xlink. 4 hits.
PRINTS: PR01265. LINKMODULE.
SMART: SM00032. CCP. 1 hit.
SM00034. CLECT. 1 hit.
SM00409. IG. 1 hit.
SM00406. IGv. 1 hit.
SM00445. LINK. 4 hits.
SUPFAM: SSF48726. SSF48726. 1 hit.
SSF56436. SSF56436. 5 hits.
SSF57535. SSF57535. 1 hit.
PROSITE: PS00615. C_TYPE_LECTIN_1. 1 hit.
PS50041. C_TYPE_LECTIN_2. 1 hit.
PS50835. IG_LIKE. 1 hit.
PS00290. IG_MHC. 1 hit.
PS01241. LINK_1. 4 hits.
PS50963. LINK_2. 4 hits.
PS50923. SUSHI. 1 hit.

Sequences (2)

Sequence status: Complete.

Sequence processing: The displayed sequence is further processed into a mature form.

This entry describes 2 isoforms produced by alternative splicing.

Isoform 1 (identifier: P07897-1) [UniParc]

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MTTLLLVFVT LRVIAAVISE EVPDHDNSLS VSIPQPSPLK ALLGTSLTIP
60 70 80 90 100
CYFIDPMHPV TTAPSTAPLT PRIKWSRVSK EKEVVLLVAT EGQVRVNSIY
110 120 130 140 150
QDKVSLPNYP AIPSDATLEI QNLRSNDSGI YRCEVMHGIE DSEATLEVIV
160 170 180 190 200
KGIVFHYRAI STRYTLDFDR AQRACLQNSA IIATPEQLQA AYEDGFHQCD
210 220 230 240 250
AGWLADQTVR YPIHTPREGC YGDKDEFPGV RTYGIRDTNE TYDVYCFAEE
260 270 280 290 300
MEGEVFYATS PEKFTFQEAA NECRTVGARL ATTGQLYLAW QGGMDMCSAG
310 320 330 340 350
WLADRSVRYP ISKARPNCGG NLLGVRTVYL HANQTGYPDP SSRYDAICYT
360 370 380 390 400
GEDFVDIPEN FFGVGGEEDI TIQTVTWPDL ELPLPRNITE GEARGNVILT
410 420 430 440 450
AKPIFDMSPT VSEPGEALTL APEVGTTVFP EAGERTEKTT RPWGFPEEAT
460 470 480 490 500
RGPDSATAFA SEDLVVRVTI SPGAVEVPGQ PRLPGGVVFH YRPGSTRYSL
510 520 530 540 550
TFEEAQQACI RTGAAIASPE QLQAAYEAGY EQCDAGWLQD QTVRYPIVSP
560 570 580 590 600
RTPCVGDKDS SPGVRTYGVR PSSETYDVYC YVDKLEGEVF FATQMEQFTF
610 620 630 640 650
QEAQAFCAAQ NATLASTGQL YAAWSQGLDK CYAGWLADGT LRYPIVNPRP
660 670 680 690 700
ACGGDKPGVR TVYLYPNQTG LPDPLSKHHA FCFRGVSVVP SPGGTPTSPS
710 720 730 740 750
DIEDWIVTRV EPGVDAVPLE PETTEVPYFT TEPEKQTEWE PAYTPVGTSP
760 770 780 790 800
LPGIPPTWLP TVPAAEEHTE SPSASQEPSA SQVPSTSEEP YTPSLAVPSG
810 820 830 840 850
TELPSSGDTS GAPDLSGDFT GSTDTSGRLD SSGEPSGGSE SGLPSGDLDS
860 870 880 890 900
SGLGPTVSSG LPVESGSASG DGEIPWSSTP TVDRLPSGGE SLEGSASASG
910 920 930 940 950
TGDLSGLPSG GEITETSASG TEEISGLPSG GDDLETSTSG IDGASVLPTG
960 970 980 990 1000
RGGLETSASG VEDLSGLPSG EEGSETSTSG IEDISVLPTG ESPETSASGV
1010 1020 1030 1040 1050
GDLSGLPSGG ESLETSASGV EDVTQLPTER GGLETSASGI EDITVLPTGR
1060 1070 1080 1090 1100
ENLETSASGV EDVSGLPSGK EGLETSASGI EDISVFPTEA EGLETSASGG
1110 1120 1130 1140 1150
YVSGIPSGED GTETSTSGVE GVSGLPSGGE GLETSASGVE DLGLPTRDSL
1160 1170 1180 1190 1200
ETSASGVDVT GYPSGREDTE TSVPGVGDDL SGLPSGQEGL ETSASGAEDL
1210 1220 1230 1240 1250
GGLPSGKEDL VGSASGALDF GKLPSGTLGS GQTPEASGLP SGFSGEYSGV
1260 1270 1280 1290 1300
DIGSGPSSGL PDFSGLPSGF PTVSLVDSTL VEVITATTAS ELEGRGTISV
1310 1320 1330 1340 1350
SGSGEESGPP LSELDSSADI SGLPSGTELS GQTSGSLDVS GETSGFFDVS
1360 1370 1380 1390 1400
GQPFGSSGTG EGTSGIPEVS GQAVRSPDTT EISELSGLSS GQPDVSGEGS
1410 1420 1430 1440 1450
GILFGSGQSS GITSVSGETS GISDLSGQPS GFPVLSGTTP GTPDLASGAM
1460 1470 1480 1490 1500
SGSGDSSGIT FVDTSLIEVT PTTFREEEGL GSVELSGLPS GETDLSGTSG
1510 1520 1530 1540 1550
MVDVSGQSSG AIDSSGLISP TPEFSGLPSG VAEVSGEVSG VETGSSLSSG
1560 1570 1580 1590 1600
AFDGSGLVSG FPTVSLVDRT LVESITLAPT AQEAGEGPSS ILEFSGAHSG
1610 1620 1630 1640 1650
TPDISGDLSG SLDQSTWQPG WTEASTEPPS SPYFSGDFSS TTDASGESIT
1660 1670 1680 1690 1700
APTGSGETSG LPEVTLITSE LVEGVTEPTV SQELGHGPSM TYTPRLFEAS
1710 1720 1730 1740 1750
GEASASGDLG GPVTIFPGSG VEASVPEGSS DPSAYPEAGV GVSAAPEASS
1760 1770 1780 1790 1800
QLSEFPDLHG ITSASRETDL EMTTPGTEVS SNPWTFQEGT REGSAAPEVS
1810 1820 1830 1840 1850
GESSTTSDID AGTSGVPFAT PMTSGDRTEI SGEWSDHTSE VNVTVSTTVP
1860 1870 1880 1890 1900
ESRWAQSTQH PTETLQEIGS PNPSYSGEET QTAETAKSLT DTPTLASPEG
1910 1920 1930 1940 1950
SGETESTAAD QEQCEEGWTK FQGHCYRHFP DRETWVDAER RCREQQSHLS
1960 1970 1980 1990 2000
SIVTPEEQEF VNKNAQDYQW IGLNDRTIEG DFRWSDGHSL QFEKWRPNQP
2010 2020 2030 2040 2050
DNFFATGEDC VVMIWHERGE WNDVPCNYQL PFTCKKGTVA CGEPPAVEHA
2060 2070 2080 2090 2100
RTLGQKKDRY EISSLVRYQC TEGFVQRHVP TIRCQPSADW EEPRITCTDP
2110 2120
NTYKHRLQKR TMRPTRRSRP SMAH
Length: 2,124
Mass (Da): 221,118
Last modified: July 1, 1989 - v2
Checksum: E30BBE61593A34B1

Isoform 2 (identifier: P07897-2) [UniParc]

The sequence of this isoform differs from the canonical sequence as follows:
     1909-1909: A → ADIDECLSSPCLNGATCVDALDTFTCLCLPSYRGDLCEI

Length: 2,162
Mass (Da): 225,171
Checksum: AE948F63BAC16612
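
The isoform 2 length follows from the variation listed above: the single residue at position 1909 is replaced by a 39-residue segment, adding 38 residues to the 2,124 of the canonical sequence. A minimal Python sketch of that replacement is shown here. To keep it short it uses a stand-in sequence, which is an assumption of this example: only residue 1909 (A) is real and every other residue is written as "X". The function name `apply_variation` is hypothetical.

```python
ISOFORM2_SEGMENT = "ADIDECLSSPCLNGATCVDALDTFTCLCLPSYRGDLCEI"

# Stand-in for the 2,124-residue canonical sequence (assumption for brevity):
# 1,908 placeholder residues, the real A at position 1909, 215 placeholders.
CANONICAL_STAND_IN = "X" * 1908 + "A" + "X" * 215


def apply_variation(sequence, start, end, original, replacement):
    """Replace residues start..end (1-based, inclusive) after checking them."""
    if sequence[start - 1:end] != original:
        raise ValueError("sequence does not match the expected original residues")
    return sequence[:start - 1] + replacement + sequence[end:]


isoform2 = apply_variation(CANONICAL_STAND_IN, 1909, 1909, "A", ISOFORM2_SEGMENT)
print(len(CANONICAL_STAND_IN), len(isoform2))
```

Checking the original residue before substituting guards against applying the variation to the wrong coordinates, for example to a sequence that is not the canonical isoform.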

Experimental Info

Feature key | Position(s) | Length | Description | Evidence
Sequence conflict | 38 – 38 | 1 | P → W AA sequence (PubMed:3693371). | Curated
Sequence conflict | 61 – 61 | 1 | T → E AA sequence (PubMed:3693371). | Curated
Sequence conflict | 149 – 149 | 1 | I → L AA sequence (PubMed:3693371). | Curated
Sequence conflict | 239 – 239 | 1 | N → S AA sequence (PubMed:3693371). | Curated
Sequence conflict | 241 – 241 | 1 | T → A AA sequence (PubMed:3693371). | Curated
Sequence conflict | 275 – 276 | 2 | TV → RL in AAA21000 (PubMed:3693370). | Curated
Sequence conflict | 374 – 374 | 1 | T → H AA sequence (PubMed:3693371). | Curated
Sequence conflict | 377 – 377 | 1 | W → E AA sequence (PubMed:3693371). | Curated
Sequence conflict | 380 – 380 | 1 | L → V AA sequence (PubMed:3693371). | Curated
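
Each conflict row names the residue in the canonical sequence first and the residue reported by the other source second. As a sanity check, the following Python sketch compares the first two rows against the first 70 residues of the canonical sequence displayed earlier in this entry. The names `N_TERMINUS`, `CONFLICTS` and `canonical_residue_matches` are illustrative, and only conflicts within the copied stretch are covered.

```python
# Residues 1-70 of the canonical sequence (isoform 1), copied from this entry.
N_TERMINUS = (
    "MTTLLLVFVT" "LRVIAAVISE" "EVPDHDNSLS" "VSIPQPSPLK"
    "ALLGTSLTIP" "CYFIDPMHPV" "TTAPSTAPLT"
)

# (position, canonical residue, conflicting residue) for positions up to 70.
CONFLICTS = [
    (38, "P", "W"),
    (61, "T", "E"),
]


def canonical_residue_matches(sequence, position, expected):
    """True if the 1-based position in the sequence holds the expected residue."""
    return sequence[position - 1] == expected


for position, canonical, other in CONFLICTS:
    ok = canonical_residue_matches(N_TERMINUS, position, canonical)
    print(position, canonical, "->", other, "canonical residue confirmed:", ok)
```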

Alternative sequence

Feature key | Position(s) | Length | Description | Evidence | Feature identifier
Alternative sequence | 1909 – 1909 | 1 | A → ADIDECLSSPCLNGATCVDALDTFTCLCLPSYRGDLCEI in isoform 2. | 1 Publication | VSP_039196

Sequence databases

EMBL / GenBank / DDBJ:
J03485 mRNA. Translation: AAA21000.1.
M13518 mRNA. Translation: AAA41836.1.
PIR: A92623. A28452.
RefSeq: NP_071526.1. NM_022190.1.
UniGene: Rn.54503.

Genome annotation databases

GeneID: 58968.
KEGG: rno:58968.
UCSC: RGD:68358. rat. [P07897-1]

Keywords - Coding sequence diversity

Alternative splicing

Cross-references

Organism-specific databases

CTD: 176.
RGD: 68358. Acan.

Miscellaneous databases

EvolutionaryTrace: P07897.
PRO: P07897.

Publications

  1. "Complete primary structure of the rat cartilage proteoglycan core protein deduced from cDNA clones."
    Doege K., Sasaki M., Horigan E., Hassell J.R., Yamada Y.
    J. Biol. Chem. 262:17757-17767(1987)
    Cited for: NUCLEOTIDE SEQUENCE [MRNA] (ISOFORM 2).
  2. Erratum
    Doege K., Sasaki M., Horigan E., Hassell J.R., Yamada Y.
    J. Biol. Chem. 263:10040-10040(1988)
    Cited for: SEQUENCE REVISION TO 698.
  3. "Partial cDNA sequence encoding a globular domain at the C-terminus of the rat cartilage proteoglycan."
    Doege K., Fernandez P., Hassell J.R., Sasaki M., Yamada Y.
    J. Biol. Chem. 261:8108-8111(1986)
    Cited for: NUCLEOTIDE SEQUENCE [MRNA] OF 1856-2124 (ISOFORM 1).
  4. "Cartilage proteoglycan aggregates. The link protein and proteoglycan amino-terminal globular domains have similar structures."
    Neame P.J., Christner J.E., Baker J.R.
    J. Biol. Chem. 262:17768-17778(1987)
    Cited for: PROTEIN SEQUENCE OF 20-83 AND 88-386, GLYCOSYLATION AT ASN-126 AND ASN-333.
  5. "Structural basis for interactions between tenascins and lectican C-type lectin domains: evidence for a crosslinking role for tenascins."
    Lundell A., Olin A.I., Morgelin M., al-Karadaghi S., Aspberg A., Logan D.T.
    Structure 12:1495-1506(2004)
    Cited for: X-RAY CRYSTALLOGRAPHY (2.6 ANGSTROMS) OF 1909-2037 IN COMPLEX WITH TNR, DISULFIDE BOND, CALCIUM-BINDING SITES.

Entry information

Entry name: PGCA_RAT
Accession: Primary (citable) accession number: P07897
Entry history
Integrated into UniProtKB/Swiss-Prot: August 1, 1988
Last sequence update: July 1, 1989
Last modified: June 8, 2016
This is version 146 of the entry and version 2 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program

Miscellaneous

Keywords - Technical term

3D-structure, Complete proteome, Direct protein sequencing, Reference proteome

Documents

  1. PDB cross-references
    Index of Protein Data Bank (PDB) cross-references
  2. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.