Protein

Complement C3

Gene

C3

Organism
Mus musculus (Mouse)
Status
Reviewed - Annotation score: 5 out of 5 - Experimental evidence at protein level

Function

C3 plays a central role in the activation of the complement system. Its processing by C3 convertase is the central reaction in both the classical and alternative complement pathways. After activation, C3b can bind covalently, via its reactive thioester, to cell surface carbohydrates or immune aggregates.
Derived from proteolytic degradation of complement C3, C3a anaphylatoxin is a mediator of local inflammatory processes. In chronic inflammation, it acts as a chemoattractant for neutrophils (By similarity). It induces the contraction of smooth muscle, increases vascular permeability and causes histamine release from mast cells and basophilic leukocytes. The short isoform has B-cell stimulatory activity (By similarity).
C3-beta-c: Acts as a chemoattractant for neutrophils in chronic inflammation (By similarity).
Acylation stimulating protein: Adipogenic hormone that stimulates triglyceride (TG) synthesis and glucose transport in adipocytes, regulating fat storage and playing a role in postprandial TG clearance. Appears to stimulate TG synthesis via activation of the PLC, MAPK and AKT signaling pathways. Ligand for C5AR2. Promotes the phosphorylation, ARRB2-mediated internalization and recycling of C5AR2.

Keywords - Biological process

Complement alternate pathway, Complement pathway, Fatty acid metabolism, Immunity, Inflammatory response, Innate immunity, Lipid metabolism

Enzyme and pathway databases

Reactome: R-MMU-173736. Alternative complement activation.
R-MMU-174577. Activation of C3 and C5.
R-MMU-198933. Immunoregulatory interactions between a Lymphoid and a non-Lymphoid cell.
R-MMU-375276. Peptide ligand-binding receptors.
R-MMU-418594. G alpha (i) signalling events.
R-MMU-6798695. Neutrophil degranulation.
R-MMU-977606. Regulation of Complement cascade.

Protein family/group databases

MEROPS: I39.950.

Names & Taxonomy

Protein names: Complement C3
Gene names: C3
Organism: Mus musculus (Mouse)
Taxonomic identifier: 10090 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Glires > Rodentia > Sciurognathi > Muroidea > Muridae > Murinae > Mus > Mus
Proteomes:
  • UP000000589 Component: Chromosome 17

Organism-specific databases

MGI: MGI:88227. C3.

Subcellular location

Keywords - Cellular component

Secreted

Pathology & Biotech

Disruption phenotype

Null mice display altered lipid metabolism and morphological changes in adipocyte distribution. They show reduced adipsin/CFD expression, an increased number of smaller fat cells, decreased DGAT1 expression and activity, and less triglyceride storage capacity, associated with delayed postprandial clearance. Mice on a high-fat diet exhibited neither diet-induced up-regulation of adipsin/CFD expression nor adipocyte differentiation (1 publication).

PTM / Processing

Molecule processing

Feature key | Position(s) | Description | Identifier | Evidence | Length
Signal peptide | 1 – 24 | - | - | 2 publications | 24
Chain | 25 – 1663 | Complement C3 | PRO_0000005917 | - | 1639
Chain | 25 – 666 | Complement C3 beta chain | PRO_0000005918 | - | 642
Chain | 569 – 666 | C3-beta-c | PRO_0000430431 | By similarity | 98
Chain | 671 – 1663 | Complement C3 alpha chain | PRO_0000005919 | - | 993
Chain | 671 – 748 | C3a anaphylatoxin | PRO_0000005920 | - | 78
Chain | 671 – 747 | Acylation stimulating protein | PRO_0000419936 | - | 77
Chain | 749 – 1663 | Complement C3b alpha' chain | PRO_0000005921 | - | 915
Chain | 749 – 954 | Complement C3c alpha' chain fragment 1 | PRO_0000005922 | - | 206
Chain | 955 – 1303 | Complement C3dg fragment | PRO_0000005923 | - | 349
Chain | 955 – 1001 | Complement C3g fragment | PRO_0000005924 | - | 47
Chain | 1002 – 1303 | Complement C3d fragment | PRO_0000005925 | - | 302
Peptide | 1304 – 1320 | Complement C3f fragment | PRO_0000005927 | - | 17
Chain | 1321 – 1663 | Complement C3c alpha' chain fragment 2 | PRO_0000273949 | - | 343
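
The Length column follows from the 1-based, inclusive coordinates: length = end - start + 1. The following is a minimal Python sketch that rechecks a few rows of the table above. The names `FEATURES`, `feature_length` and `mismatches` are illustrative only and are not part of any UniProt tool.

```python
# Rows copied from the molecule-processing table:
# (description, start, end, listed length). Coordinates are 1-based, inclusive.
FEATURES = [
    ("Signal peptide", 1, 24, 24),
    ("Complement C3", 25, 1663, 1639),
    ("Complement C3 beta chain", 25, 666, 642),
    ("Complement C3 alpha chain", 671, 1663, 993),
    ("C3a anaphylatoxin", 671, 748, 78),
    ("Acylation stimulating protein", 671, 747, 77),
    ("Complement C3b alpha' chain", 749, 1663, 915),
]


def feature_length(start, end):
    """Length of a feature given 1-based, inclusive coordinates."""
    return end - start + 1


def mismatches(features):
    """Return descriptions whose listed length disagrees with the coordinates."""
    return [name for name, start, end, listed in features
            if feature_length(start, end) != listed]


if __name__ == "__main__":
    print(mismatches(FEATURES))
```

The same check makes the gap between the beta chain (ending at 666) and the alpha chain (starting at 671) easy to see: four residues lie between them.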

Amino acid modifications

Feature key | Position(s) | Description | Evidence | Length
Modified residue | 40 | Phosphoserine | By similarity | 1
Disulfide bond | 559 ↔ 816 | Interchain (between beta and alpha chains) | PROSITE-ProRule annotation | -
Disulfide bond | 626 ↔ 661 | - | By similarity | -
Modified residue | 671 | Phosphoserine | By similarity | 1
Disulfide bond | 693 ↔ 720 | - | By similarity | -
Disulfide bond | 694 ↔ 727 | - | By similarity | -
Disulfide bond | 707 ↔ 728 | - | By similarity | -
Disulfide bond | 873 ↔ 1513 | - | By similarity | -
Glycosylation | 939 | N-linked (GlcNAc...) | - | 1
Modified residue | 968 | Phosphoserine | By similarity | 1
Cross-link | 1010 ↔ 1013 | Isoglutamyl cysteine thioester (Cys-Gln) | By similarity | -
Disulfide bond | 1101 ↔ 1158 | - | By similarity | -
Modified residue | 1321 | Phosphoserine | By similarity | 1
Disulfide bond | 1358 ↔ 1489 | - | By similarity | -
Disulfide bond | 1389 ↔ 1458 | - | By similarity | -
Disulfide bond | 1506 ↔ 1511 | - | By similarity | -
Disulfide bond | 1518 ↔ 1590 | - | By similarity | -
Disulfide bond | 1537 ↔ 1661 | - | By similarity | -
Modified residue | 1573 | Phosphoserine | By similarity | 1
Glycosylation | 1617 | N-linked (GlcNAc...) | - | 1
Disulfide bond | 1637 ↔ 1646 | - | By similarity | -

Post-translational modification

C3b is rapidly split at two positions by factor I and a cofactor to form iC3b (inactivated C3b) and C3f, which is released. iC3b is then slowly cleaved (possibly by factor I) to form C3c (beta chain + alpha' chain fragment 1 + alpha' chain fragment 2), C3dg and C3f. Other proteases produce other fragments such as C3d or C3g. C3a is further processed by carboxypeptidases to release the C-terminal arginine residue, generating the acylation stimulating protein (ASP). Levels of ASP are increased in adipocytes in the postprandial period and by dietary chylomicrons.
Phosphorylated by FAM20C in the extracellular medium (By similarity).

Sites

Feature key | Position(s) | Description | Evidence | Length
Site | 747 – 748 | Cleavage; by carboxypeptidases | By similarity | 2
Site | 748 – 749 | Cleavage; by C3 convertase | - | 2
Site | 954 – 955 | Cleavage; by factor I | Sequence analysis | 2
Site | 1303 – 1304 | Cleavage; by factor I | - | 2
Site | 1320 – 1321 | Cleavage; by factor I | - | 2
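
A site listed as 748 – 749 marks the peptide bond between those two residues. The sketch below applies that reading to residues 741-750 of the canonical sequence given in the Sequences section of this entry. `WINDOW`, `WINDOW_START` and the helper functions are illustrative names, and the window is only a 10-residue excerpt, not the full protein.

```python
from typing import Tuple

# Residues 741-750 of the canonical sequence (isoform Long), copied from the entry.
WINDOW = "DHVLGLARSE"
WINDOW_START = 741  # 1-based canonical position of WINDOW[0]


def residue(pos: int) -> str:
    """Residue at a 1-based canonical position inside the excerpt."""
    return WINDOW[pos - WINDOW_START]


def cleave(p1: int) -> Tuple[str, str]:
    """Split the excerpt at the bond between positions p1 and p1 + 1."""
    cut = p1 - WINDOW_START + 1
    return WINDOW[:cut], WINDOW[cut:]


# C3 convertase cuts the 748-749 bond: C3a ends at 748, the alpha' chain starts at 749.
c3a_tail, alpha_prime_head = cleave(748)

# Carboxypeptidases then remove the C-terminal arginine of C3a (the 747-748 bond),
# leaving the acylation stimulating protein, which ends at 747.
asp_tail, released = c3a_tail[:-1], c3a_tail[-1]
```

In this excerpt the released residue is the arginine at position 748, consistent with the chain coordinates 671 – 748 for C3a and 671 – 747 for ASP.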

Keywords - PTM

Cleavage on pair of basic residues, Disulfide bond, Glycoprotein, Phosphoprotein, Thioester bond

Proteomic databases

MaxQB: P01027.
PaxDb: P01027.
PeptideAtlas: P01027.
PRIDE: P01027.

PTM databases

iPTMnet: P01027.
PhosphoSitePlus: P01027.
SwissPalm: P01027.

Miscellaneous databases

PMAP-CutDB: Q80XP1.

Expression

Gene expression databases

Bgee: ENSMUSG00000024164.
CleanEx: MM_C3.
ExpressionAtlas: P01027. baseline and differential.
Genevisible: P01027. MM.

Interaction

Subunit structure

C3 precursor is first processed by the removal of 4 Arg residues, forming two chains, beta and alpha, linked by a disulfide bond. C3 convertase activates C3 by cleaving the alpha chain, releasing C3a anaphylatoxin and generating C3b (beta chain + alpha' chain). C3dg interacts with CR2 (via the N-terminal Sushi domains 1 and 2). Interacts with VSIG4. Both C3a and ASP interact with C5AR2; the interaction occurs with higher affinity for ASP, enhancing the phosphorylation and activation of C5AR2, recruitment of ARRB2 to the cell surface and endocytosis of GPR77 (By similarity).

Protein-protein interaction databases

BioGrid: 198418. 1 interactor.
IntAct: P01027. 6 interactors.
MINT: MINT-1858136.
STRING: 10090.ENSMUSP00000024988.

Structure

3D structure databases

ProteinModelPortal: P01027.
SMR: P01027.

Family & Domains

Domains and Repeats

Feature key | Position(s) | Description | Evidence | Length
Domain | 693 – 728 | Anaphylatoxin-like | PROSITE-ProRule annotation | 36
Domain | 1518 – 1661 | NTR | PROSITE-ProRule annotation | 144

Region

Feature key | Position(s) | Description | Evidence | Length
Region | 1424 – 1456 | Properdin-binding | By similarity | 33

Sequence similarities

Contains 1 anaphylatoxin-like domain (PROSITE-ProRule annotation).
Contains 1 NTR domain (PROSITE-ProRule annotation).

Keywords - Domain

Signal

Phylogenomic databases

eggNOG: KOG1366. Eukaryota.
ENOG410XRED. LUCA.
GeneTree: ENSGT00760000118982.
HOGENOM: HOG000286028.
HOVERGEN: HBG005110.
InParanoid: P01027.
KO: K03990.
OMA: DETEQWE.
OrthoDB: EOG091G00FJ.
TreeFam: TF313285.

Family and domain databases

Gene3D: 1.20.91.20. 1 hit.
1.50.10.20. 1 hit.
2.60.40.690. 1 hit.
InterPro: IPR009048. A-macroglobulin_rcpt-bd.
IPR011626. A2M_comp.
IPR002890. A2M_N.
IPR011625. A2M_N_2.
IPR000020. Anaphylatoxin/fibulin.
IPR018081. Anaphylatoxin_comp_syst.
IPR001840. Anaphylatoxn_comp_syst_dom.
IPR001599. Macroglobln_a2.
IPR019742. MacrogloblnA2_CS.
IPR019565. MacrogloblnA2_thiol-ester-bond.
IPR001134. Netrin_domain.
IPR018933. Netrin_module_non-TIMP.
IPR008930. Terpenoid_cyclase/PrenylTrfase.
IPR008993. TIMP-like_OB-fold.
Pfam: PF00207. A2M. 1 hit.
PF07678. A2M_comp. 1 hit.
PF01835. A2M_N. 1 hit.
PF07703. A2M_N_2. 1 hit.
PF07677. A2M_recep. 1 hit.
PF01821. ANATO. 1 hit.
PF01759. NTR. 1 hit.
PF10569. Thiol-ester_cl. 1 hit.
PRINTS: PR00004. ANAPHYLATOXN.
SMART: SM01360. A2M. 1 hit.
SM01359. A2M_N_2. 1 hit.
SM01361. A2M_recep. 1 hit.
SM00104. ANATO. 1 hit.
SM00643. C345C. 1 hit.
SUPFAM: SSF47686. SSF47686. 1 hit.
SSF48239. SSF48239. 1 hit.
SSF49410. SSF49410. 1 hit.
SSF50242. SSF50242. 1 hit.
PROSITE: PS00477. ALPHA_2_MACROGLOBULIN. 1 hit.
PS01177. ANAPHYLATOXIN_1. 1 hit.
PS01178. ANAPHYLATOXIN_2. 1 hit.
PS50189. NTR. 1 hit.

Sequences (2)

Sequence status: Complete.

Sequence processing: The displayed sequence is further processed into a mature form.

This entry describes 2 isoforms produced by alternative initiation.

Isoform Long (identifier: P01027-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MGPASGSQLL VLLLLLASSP LALGIPMYSI ITPNVLRLES EETIVLEAHD
60 70 80 90 100
AQGDIPVTVT VQDFLKRQVL TSEKTVLTGA SGHLRSVSIK IPASKEFNSD
110 120 130 140 150
KEGHKYVTVV ANFGETVVEK AVMVSFQSGY LFIQTDKTIY TPGSTVLYRI
160 170 180 190 200
FTVDNNLLPV GKTVVILIET PDGIPVKRDI LSSNNQHGIL PLSWNIPELV
210 220 230 240 250
NMGQWKIRAF YEHAPKQIFS AEFEVKEYVL PSFEVRVEPT ETFYYIDDPN
260 270 280 290 300
GLEVSIIAKF LYGKNVDGTA FVIFGVQDGD KKISLAHSLT RVVIEDGVGD
310 320 330 340 350
AVLTRKVLME GVRPSNADAL VGKSLYVSVT VILHSGSDMV EAERSGIPIV
360 370 380 390 400
TSPYQIHFTK TPKFFKPAMP FDLMVFVTNP DGSPASKVLV VTQGSNAKAL
410 420 430 440 450
TQDDGVAKLS INTPNSRQPL TITVRTKKDT LPESRQATKT MEAHPYSTMH
460 470 480 490 500
NSNNYLHLSV SRMELKPGDN LNVNFHLRTD PGHEAKIRYY TYLVMNKGKL
510 520 530 540 550
LKAGRQVREP GQDLVVLSLP ITPEFIPSFR LVAYYTLIGA SGQREVVADS
560 570 580 590 600
VWVDVKDSCI GTLVVKGDPR DNHLAPGQQT TLRIEGNQGA RVGLVAVDKG
610 620 630 640 650
VFVLNKKNKL TQSKIWDVVE KADIGCTPGS GKNYAGVFMD AGLAFKTSQG
660 670 680 690 700
LQTEQRADLE CTKPAARRRR SVQLMERRMD KAGQYTDKGL RKCCEDGMRD
710 720 730 740 750
IPMRYSCQRR ARLITQGENC IKAFIDCCNH ITKLREQHRR DHVLGLARSE
760 770 780 790 800
LEEDIIPEED IISRSHFPQS WLWTIEELKE PEKNGISTKV MNIFLKDSIT
810 820 830 840 850
TWEILAVSLS DKKGICVADP YEIRVMQDFF IDLRLPYSVV RNEQVEIRAV
860 870 880 890 900
LFNYREQEEL KVRVELLHNP AFCSMATAKN RYFQTIKIPP KSSVAVPYVI
910 920 930 940 950
VPLKIGQQEV EVKAAVFNHF ISDGVKKTLK VVPEGMRINK TVAIHTLDPE
960 970 980 990 1000
KLGQGGVQKV DVPAADLSDQ VPDTDSETRI ILQGSPVVQM AEDAVDGERL
1010 1020 1030 1040 1050
KHLIVTPAGC GEQNMIGMTP TVIAVHYLDQ TEQWEKFGIE KRQEALELIK
1060 1070 1080 1090 1100
KGYTQQLAFK QPSSAYAAFN NRPPSTWLTA YVVKVFSLAA NLIAIDSHVL
1110 1120 1130 1140 1150
CGAVKWLILE KQKPDGVFQE DGPVIHQEMI GGFRNAKEAD VSLTAFVLIA
1160 1170 1180 1190 1200
LQEARDICEG QVNSLPGSIN KAGEYIEASY MNLQRPYTVA IAGYALALMN
1210 1220 1230 1240 1250
KLEEPYLGKF LNTAKDRNRW EEPDQQLYNV EATSYALLAL LLLKDFDSVP
1260 1270 1280 1290 1300
PVVRWLNEQR YYGGGYGSTQ ATFMVFQALA QYQTDVPDHK DLNMDVSFHL
1310 1320 1330 1340 1350
PSRSSATTFR LLWENGNLLR SEETKQNEAF SLTAKGKGRG TLSVVAVYHA
1360 1370 1380 1390 1400
KLKSKVTCKK FDLRVSIRPA PETAKKPEEA KNTMFLEICT KYLGDVDATM
1410 1420 1430 1440 1450
SILDISMMTG FAPDTKDLEL LASGVDRYIS KYEMNKAFSN KNTLIIYLEK
1460 1470 1480 1490 1500
ISHTEEDCLT FKVHQYFNVG LIQPGSVKVY SYYNLEESCT RFYHPEKDDG
1510 1520 1530 1540 1550
MLSKLCHSEM CRCAEENCFM QQSQEKINLN VRLDKACEPG VDYVYKTELT
1560 1570 1580 1590 1600
NIELLDDFDE YTMTIQQVIK SGSDEVQAGQ QRKFISHIKC RNALKLQKGK
1610 1620 1630 1640 1650
KYLMWGLSSD LWGEKPNTSY IIGKDTWVEH WPEAEECQDQ KYQKQCEELG
1660
AFTESMVVYG CPN
Length: 1,663
Mass (Da): 186,484
Last modified: July 27, 2011 - v3
Checksum: 7E5546CC7C314779
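
The sequence display interleaves ruler lines (position numbers) with rows of ten-residue blocks. The following is a minimal sketch of turning such a display back into a plain residue string, assuming that ruler lines contain only digits and whitespace. `parse_display` is a hypothetical helper, and only the first two rows of the display are included here.

```python
# First two rows of the canonical sequence display, copied from the entry.
DISPLAY = """\
        10         20         30         40         50
MGPASGSQLL VLLLLLASSP LALGIPMYSI ITPNVLRLES EETIVLEAHD
60 70 80 90 100
AQGDIPVTVT VQDFLKRQVL TSEKTVLTGA SGHLRSVSIK IPASKEFNSD
"""


def parse_display(text: str) -> str:
    """Drop ruler lines and whitespace, returning the bare residue string."""
    rows = []
    for line in text.splitlines():
        compact = "".join(line.split())
        if not compact or compact.isdigit():
            continue  # blank line or position ruler
        rows.append(compact)
    return "".join(rows)


sequence = parse_display(DISPLAY)
signal_peptide = sequence[:24]   # positions 1-24 per the feature table
residue_40 = sequence[40 - 1]    # 1-based position 40 -> 0-based index 39
```

With the full display as input, the length of the returned string can be compared against the Length value listed for the isoform.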
Isoform Short (identifier: P01027-2)

The sequence of this isoform differs from the canonical sequence as follows:
     1-1128: Missing.

Length: 535
Mass (Da): 60,952
Checksum: 6F6187342DC868CD
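
Because isoform Short lacks residues 1-1128 of the canonical sequence, its length and coordinates follow by subtraction. The following is a small sketch under that assumption; the constant and function names are illustrative.

```python
CANONICAL_LENGTH = 1663               # isoform Long
MISSING_START, MISSING_END = 1, 1128  # "1-1128: Missing" in isoform Short


def short_length() -> int:
    """Length of isoform Short: canonical length minus the missing N-terminal span."""
    return CANONICAL_LENGTH - (MISSING_END - MISSING_START + 1)


def to_short(canonical_pos: int):
    """Map a canonical position to isoform Short numbering, or None if absent."""
    if canonical_pos <= MISSING_END:
        return None
    return canonical_pos - MISSING_END
```

Under this mapping, canonical position 1129 becomes residue 1 of isoform Short, while the thioester residues at 1010 and 1013 fall in the missing region and map to `None`.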

Experimental Info

Feature key | Position(s) | Description | Evidence | Length
Sequence conflict | 137 | K → Q in AAA37339 (PubMed:6356427). | Curated | 1
Sequence conflict | 858 | E → Q in AAC42013 (PubMed:6208565). | Curated | 1
Sequence conflict | 1553 | E → K in AAC42013 (PubMed:6208565). | Curated | 1
Sequence conflict | 1553 | E → K in AAA37336 (PubMed:6094532). | Curated | 1

Alternative sequence

Feature key | Position(s) | Description | Identifier | Evidence | Length
Alternative sequence | 1 – 1128 | Missing in isoform Short. | VSP_018708 | Curated | 1128

Sequence databases

EMBL / GenBank / DDBJ:
K02782 mRNA. Translation: AAC42013.1.
BC043338 mRNA. Translation: AAH43338.1.
M35659 mRNA. Translation: AAA37339.1.
M33032 mRNA. Translation: AAA37378.1.
J00369, J00367 Genomic DNA. Translation: AAA37336.1.
Z37998 Genomic DNA. Translation: CAA86099.2.
CCDS: CCDS37670.1. [P01027-1]
PIR: A92459. C3MS.
I48284.
RefSeq: NP_033908.2. NM_009778.3. [P01027-1]
UniGene: Mm.19131.

Genome annotation databases

Ensembl: ENSMUST00000024988; ENSMUSP00000024988; ENSMUSG00000024164. [P01027-1]
GeneID: 12266.
KEGG: mmu:12266.
UCSC: uc008deg.2. mouse. [P01027-1]

Keywords - Coding sequence diversity

Alternative initiation

Cross-references

Organism-specific databases

CTD: 718.

Miscellaneous databases

ChiTaRS: C3. mouse.
PRO: P01027.

Entry information

Entry name: CO3_MOUSE
Accession: Primary (citable) accession number: P01027
Secondary accession number(s): Q61370, Q80XP1
Entry history
Integrated into UniProtKB/Swiss-Prot: July 21, 1986
Last sequence update: July 27, 2011
Last modified: November 30, 2016
This is version 172 of the entry and version 3 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program

Miscellaneous

Keywords - Technical term

Complete proteome, Direct protein sequencing, Reference proteome

Documents

  1. MGD cross-references
    Mouse Genome Database (MGD) cross-references in UniProtKB/Swiss-Prot
  2. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (also known as the seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.