Protein

Cytotoxic T-lymphocyte protein 4

Gene

CTLA4

Organism
Homo sapiens (Human)
Status
Reviewed - Annotation score: 5 out of 5 - Experimental evidence at protein level

Function

Inhibitory receptor acting as a major negative regulator of T-cell responses. The affinity of CTLA4 for its natural B7 family ligands, CD80 and CD86, is considerably stronger than the affinity of their cognate stimulatory coreceptor CD28. (2 Publications)

GO - Biological process

  • adaptive immune response Source: UniProtKB-KW
  • B cell receptor signaling pathway Source: UniProtKB
  • cellular response to DNA damage stimulus Source: UniProtKB
  • immune response Source: ProtInc
  • negative regulation of B cell proliferation Source: UniProtKB
  • negative regulation of immune response Source: Ensembl
  • negative regulation of regulatory T cell differentiation Source: BHF-UCL
  • negative regulation of T cell proliferation Source: Ensembl
  • positive regulation of apoptotic process Source: UniProtKB
  • T cell costimulation Source: Reactome

Keywords - Biological process

Adaptive immunity, Immunity

Enzyme and pathway databases

Reactome: R-HSA-389513. CTLA4 inhibitory signaling.
SIGNOR: P16410.

Names & Taxonomy

Protein names
Recommended name:
Cytotoxic T-lymphocyte protein 4
Alternative name(s):
Cytotoxic T-lymphocyte-associated antigen 4
Short name:
CTLA-4
CD_antigen: CD152
Gene names
Name: CTLA4
Synonyms: CD152
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes
  • UP000005640 Component: Chromosome 2

Organism-specific databases

HGNC: HGNC:2505. CTLA4.

Subcellular location

Topology

Feature key         Position(s)  Length  Description    Evidence
Topological domain  36 – 161     126     Extracellular  Sequence analysis
Transmembrane       162 – 182    21      Helical        Sequence analysis
Topological domain  183 – 223    41      Cytoplasmic    Sequence analysis
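
The topology coordinates are 1-based and inclusive, so each region can be cut directly out of the canonical sequence listed in the Sequences section of this entry. The following is a minimal sketch of that slicing; the names `CANONICAL`, `TOPOLOGY`, `slice_feature` and `segments` are illustrative and not part of any UniProt tooling.

```python
# Canonical CTLA4 sequence (isoform 1, P16410-1) as listed in this entry.
CANONICAL = (
    "MACLGFQRHKAQLNLATRTWPCTLLFFLLFIPVFCKAMHVAQPAVVLASS"
    "RGIASFVCEYASPGKATEVRVTVLRQADSQVTEVCAATYMMGNELTFLDD"
    "SICTGTSSGNQVNLTIQGLRAMDTGLYICKVELMYPPPYYLGIGNGTQIY"
    "VIDPEPCPDSDFLLWILAAVSSGLFFYSFLLTAVSLSKMLKKRSPLTTGV"
    "YVKMPPTEPECEKQFQPYFIPIN"
)

# Topology features from this entry: (name, first, last), 1-based inclusive.
TOPOLOGY = [
    ("extracellular", 36, 161),
    ("transmembrane", 162, 182),
    ("cytoplasmic", 183, 223),
]


def slice_feature(sequence: str, first: int, last: int) -> str:
    """Return residues first..last using 1-based, inclusive coordinates."""
    return sequence[first - 1:last]


segments = {
    name: slice_feature(CANONICAL, first, last)
    for name, first, last in TOPOLOGY
}

for name, residues in segments.items():
    print(f"{name:14s} {len(residues):3d} aa")
```

The three segment lengths (126, 21 and 41 residues) match the Length column of the topology table, and together they account for the 188 residues that follow the 35-residue signal peptide.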

GO - Cellular component

  • clathrin-coated endocytic vesicle Source: BHF-UCL
  • external side of plasma membrane Source: BHF-UCL
  • Golgi apparatus Source: BHF-UCL
  • integral component of plasma membrane Source: ProtInc
  • perinuclear region of cytoplasm Source: BHF-UCL
  • plasma membrane Source: Reactome
  • protein complex involved in cell adhesion Source: MGI

Keywords - Cellular component

Cell membrane, Membrane

Pathology & Biotech

Involvement in disease

Systemic lupus erythematosus (SLE) (2 Publications)
Disease susceptibility is associated with variations affecting the gene represented in this entry.
Disease description: A chronic, relapsing, inflammatory, and often febrile multisystemic disorder of connective tissue, characterized principally by involvement of the skin, joints, kidneys and serosal membranes. It is of unknown etiology, but is thought to represent a failure of the regulatory mechanisms of the autoimmune system. The disease is marked by a wide range of system dysfunctions, an elevated erythrocyte sedimentation rate, and the formation of LE cells in the blood or bone marrow.
See also OMIM:152700

Genetic variations in CTLA4 may influence susceptibility to Graves disease, an autoimmune disorder associated with overactivity of the thyroid gland and hyperthyroidism.

Diabetes mellitus, insulin-dependent, 12 (IDDM12) (1 Publication)
Disease susceptibility is associated with variations affecting the gene represented in this entry.
Disease description: A multifactorial disorder of glucose homeostasis that is characterized by susceptibility to ketoacidosis in the absence of insulin therapy. Clinical features are polydipsia, polyphagia and polyuria which result from hyperglycemia-induced osmotic diuresis and secondary thirst. These derangements result in long-term complications that affect the eyes, kidneys, nerves, and blood vessels.
See also OMIM:601388

Celiac disease 3 (CELIAC3) (2 Publications)
Disease susceptibility is associated with variations affecting the gene represented in this entry.
Disease description: A multifactorial, chronic disorder of the small intestine caused by intolerance to gluten. It is characterized by immune-mediated enteropathy associated with failed intestinal absorption, and malnutrition. In predisposed individuals, the ingestion of gluten-containing food such as wheat and rye induces a flat jejunal mucosa with infiltration of lymphocytes.
See also OMIM:609755

Autoimmune lymphoproliferative syndrome 5 (ALPS5) (2 Publications)
The disease is caused by mutations affecting the gene represented in this entry.
Disease description: An autosomal dominant primary immunodeficiency characterized by severe autoimmunity, infiltration of non-lymphoid organs, such as the intestine, lungs and brain, by hyperactive T cells and B cells, autoimmune cytopenias, and hypogammaglobulinemia in early childhood.
See also OMIM:616100

Feature key      Feature ID  Position(s)  Length  Description
Natural variant  VAR_072681  70           1       R → W in ALPS5. (1 Publication) Corresponds to variant rs606231422 (dbSNP, Ensembl).

Pharmaceutical use

Engineered fusion proteins consisting of the extracellular domain of CTLA4 and the IgG Fc region (CTLA4-Ig) inhibit T-cell-dependent antibody responses and are used as immunosuppressive agents. They are soluble, have an enhanced affinity for B7 ligands, and act as competitive inhibitors of CD28.

Keywords - Disease

Diabetes mellitus, Disease mutation, Systemic lupus erythematosus

Organism-specific databases

DisGeNET: 1493.
MalaCards: CTLA4.
MIM: 109100. phenotype.
     152700. phenotype.
     601388. phenotype.
     609755. phenotype.
     610424. phenotype.
     616100. phenotype.
OpenTargets: ENSG00000163599.
Orphanet: 555. Celiac disease.
          900. Granulomatosis with polyangiitis.
          855. Hashimoto struma.
          536. Systemic lupus erythematosus.
PharmGKB: PA27006.

Chemistry databases

ChEMBL: CHEMBL2364164.
DrugBank: DB06186. Ipilimumab.
GuidetoPHARMACOLOGY: 2743.

Polymorphism and mutation databases

BioMuta: CTLA4.
DMDM: 27735177.

PTM / Processing

Molecule processing

Feature key     Feature ID      Position(s)  Length  Description                       Evidence
Signal peptide  -               1 – 35       35      -                                 Sequence analysis
Chain           PRO_0000014734  36 – 223     188     Cytotoxic T-lymphocyte protein 4  -

Amino acid modifications

Feature key       Position(s)  Length  Description                        Evidence
Disulfide bond    58 ↔ 129     -       -                                  Combined sources, 5 Publications
Disulfide bond    85 ↔ 103     -       -                                  Combined sources, 5 Publications
Glycosylation     113          1       N-linked (GlcNAc...)               3 Publications
Glycosylation     145          1       N-linked (GlcNAc...)               2 Publications
Disulfide bond    157          -       Interchain                         Combined sources, 1 Publication
Modified residue  201          1       Phosphotyrosine; by TXK and JAK2   3 Publications
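
As a sanity check on the annotated positions, the canonical sequence can be scanned for the general N-linked glycosylation sequon, Asn-X-Ser/Thr with X other than Pro. This is a hedged sketch, not UniProt's annotation method: a sequon match alone does not demonstrate glycosylation, and the names `SEQUON`, `ANNOTATED_SITES` and `predicted` are illustrative.

```python
import re

# Canonical CTLA4 sequence (isoform 1, P16410-1) as listed in this entry.
CANONICAL = (
    "MACLGFQRHKAQLNLATRTWPCTLLFFLLFIPVFCKAMHVAQPAVVLASS"
    "RGIASFVCEYASPGKATEVRVTVLRQADSQVTEVCAATYMMGNELTFLDD"
    "SICTGTSSGNQVNLTIQGLRAMDTGLYICKVELMYPPPYYLGIGNGTQIY"
    "VIDPEPCPDSDFLLWILAAVSSGLFFYSFLLTAVSLSKMLKKRSPLTTGV"
    "YVKMPPTEPECEKQFQPYFIPIN"
)

# Annotated N-linked glycosylation sites from this entry (1-based).
ANNOTATED_SITES = [113, 145]

# Asn followed by any residue except Pro, then Ser or Thr.
# The lookahead consumes only the Asn, so overlapping sequons are not skipped.
SEQUON = re.compile(r"N(?=[^P][ST])")

# Convert 0-based match offsets to 1-based residue positions.
predicted = [match.start() + 1 for match in SEQUON.finditer(CANONICAL)]
print(predicted)  # [113, 145]

# The other annotated residues can be checked the same way.
disulfide_cysteines = [58, 129, 85, 103, 157]
assert all(CANONICAL[pos - 1] == "C" for pos in disulfide_cysteines)
assert CANONICAL[201 - 1] == "Y"  # phosphotyrosine site
```

Under this simple motif, the only two sequons in the canonical sequence are the two annotated sites, and the annotated disulfide and phosphorylation positions fall on Cys and Tyr residues respectively.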

Post-translational modification

N-glycosylation is important for dimerization. (3 Publications)
Phosphorylation at Tyr-201 prevents binding to the AP-2 adapter complex, blocks endocytosis, and leads to retention of CTLA4 on the cell surface. (3 Publications)

Keywords - PTM

Disulfide bond, Glycoprotein, Phosphoprotein

Proteomic databases

PaxDb: P16410.
PRIDE: P16410.

PTM databases

iPTMnet: P16410.
PhosphoSitePlus: P16410.

Expression

Tissue specificity

Widely expressed, with highest levels in lymphoid tissues. Detected in activated T-cells, where cell-surface expression following activation is 30- to 50-fold lower than that of CD28, the stimulatory coreceptor. (3 Publications)

Gene expression databases

Bgee: ENSG00000163599.
CleanEx: HS_CTLA4.
ExpressionAtlas: P16410. baseline and differential.
Genevisible: P16410. HS.

Interaction

Subunit structure

Homodimer; disulfide-linked. Binds to CD80/B7-1 and CD86/B7.2. (3 Publications)

Binary interactions

With    Entry   #Exp.  IntAct
CD80    P33681  3      EBI-1030991, EBI-1031024
CD86    P42081  3      EBI-1030991, EBI-1030956
PIK3R1  P27986  3      EBI-1030991, EBI-79464

Protein-protein interaction databases

BioGrid: 107875. 12 interactors.
DIP: DIP-35607N.
IntAct: P16410. 10 interactors.
MINT: MINT-6631153.
STRING: 9606.ENSP00000303939.

Structure

Secondary structure

Feature key  Position(s)  Length  Evidence
Beta strand  44 – 47      4       Combined sources
Beta strand  50 – 52      3       Combined sources
Beta strand  54 – 60      7       Combined sources
Beta strand  64 – 66      3       Combined sources
Beta strand  68 – 77      10      Combined sources
Beta strand  80 – 90      11      Combined sources
Beta strand  102 – 108    7       Combined sources
Beta strand  111 – 116    6       Combined sources
Helix        121 – 123    3       Combined sources
Beta strand  125 – 138    14      Combined sources
Beta strand  140 – 143    4       Combined sources
Beta strand  147 – 150    4       Combined sources
Beta strand  156 – 158    3       Combined sources

3D structure databases

PDB entry  Method  Resolution (Å)  Chain  Positions
1AH1       NMR     -               A      37-161
1H6E       X-ray   3.60            P      197-207
1I85       X-ray   3.20            C/D    36-161
1I8L       X-ray   3.00            C/D    36-161
2X44       X-ray   2.60            D      36-161
3BX7       X-ray   2.10            C      38-161
3OSK       X-ray   1.80            A/B    36-161
ProteinModelPortal: P16410.
SMR: P16410.
ModBase: Search...
MobiDB: Search...

Miscellaneous databases

EvolutionaryTrace: P16410.

Family & Domains

Domains and Repeats

Feature key  Position(s)  Length  Description
Domain       39 – 140     102     Ig-like V-type

Region

Feature key  Position(s)  Length  Description
Region       46 – 50      5       Homodimerization
Region       150 – 155    6       Homodimerization

Sequence similarities

Keywords - Domain

Immunoglobulin domain, Signal, Transmembrane, Transmembrane helix

Phylogenomic databases

eggNOG: ENOG410IJ05. Eukaryota.
        ENOG410YUQR. LUCA.
GeneTree: ENSGT00530000063873.
HOGENOM: HOG000112047.
HOVERGEN: HBG057978.
InParanoid: P16410.
KO: K06538.
OMA: FSKGMHV.
OrthoDB: EOG091G0IGY.
PhylomeDB: P16410.
TreeFam: TF335679.

Family and domain databases

Gene3D: 2.60.40.10. 1 hit.
InterPro: IPR008096. CTLA4.
          IPR007110. Ig-like_dom.
          IPR013783. Ig-like_fold.
          IPR003599. Ig_sub.
          IPR013106. Ig_V-set.
Pfam: PF07686. V-set. 1 hit.
PRINTS: PR01720. CTLANTIGEN4.
SMART: SM00409. IG. 1 hit.
       SM00406. IGv. 1 hit.
SUPFAM: SSF48726. SSF48726. 1 hit.

Sequences (5)

Sequence status: Complete.

Sequence processing: The displayed sequence is further processed into a mature form.

This entry describes 5 isoforms produced by alternative splicing.

Isoform 1 (identifier: P16410-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MACLGFQRHK AQLNLATRTW PCTLLFFLLF IPVFCKAMHV AQPAVVLASS
        60         70         80         90        100
RGIASFVCEY ASPGKATEVR VTVLRQADSQ VTEVCAATYM MGNELTFLDD
       110        120        130        140        150
SICTGTSSGN QVNLTIQGLR AMDTGLYICK VELMYPPPYY LGIGNGTQIY
       160        170        180        190        200
VIDPEPCPDS DFLLWILAAV SSGLFFYSFL LTAVSLSKML KKRSPLTTGV
       210        220
YVKMPPTEPE CEKQFQPYFI PIN
Length: 223
Mass (Da): 24,656
Last modified: January 10, 2003 - v3
Checksum: 6F9466FB2E139A5A

Isoform 2 (identifier: P16410-2)
Also known as: ss-CTLA-4

The sequence of this isoform differs from the canonical sequence as follows:
     38-204: Missing.

Length: 56
Mass (Da): 6,560
Checksum: 096CBF7AD57AE9B9

Isoform 3 (identifier: P16410-3)

The sequence of this isoform differs from the canonical sequence as follows:
     38-204: Missing.
     205-223: PPTEPECEKQFQPYFIPIN → KEKKPSYNRGLCENAPNRARM

Length: 58
Mass (Da): 6,745
Checksum: 5F70948EEDC80A94

Isoform 4 (identifier: P16410-4)

The sequence of this isoform differs from the canonical sequence as follows:
     58-58: C → S
     59-204: Missing.
     205-223: PPTEPECEKQFQPYFIPIN → KEKKPSYNRGLCENAPNRARM

Length: 79
Mass (Da): 8,855
Checksum: 60CBF1BC1DA59D8A

Isoform 5 (identifier: P16410-5)

The sequence of this isoform differs from the canonical sequence as follows:
     153-174: DPEPCPDSDFLLWILAAVSSGL → AKEKKPSYNRGLCENAPNRARM
     175-223: Missing.

Length: 174
Mass (Da): 19,145
Checksum: 0881BFA757AC3FDB

Experimental Info

Feature key        Position(s)  Length  Description
Sequence conflict  37           1       A → V in ABG85285 (PubMed:18595775). Curated
Sequence conflict  147          1       T → A in AAA52773 (PubMed:3220103). Curated

Polymorphism

Genetic variations in CTLA4 are associated with susceptibility to several autoimmune disorders (PubMed:18595775, PubMed:12724780, PubMed:10189842, PubMed:10924276, PubMed:15138458, PubMed:15657618, PubMed:15688186, PubMed:25329329, PubMed:25213377). They influence responsiveness to hepatitis B virus (HBV) infection [MIM:610424] (PubMed:15452244). (10 Publications)

Natural variant

Feature key      Feature ID  Position(s)  Length  Description
Natural variant  VAR_013577  17           1       T → A. Increased risk for Graves disease, insulin-dependent diabetes mellitus, thyroid-associated orbitopathy, systemic lupus erythematosus and susceptibility to HBV infection. (7 Publications) Corresponds to variant rs231775 (dbSNP, Ensembl).
Natural variant  VAR_072681  70           1       R → W in ALPS5. (1 Publication) Corresponds to variant rs606231422 (dbSNP, Ensembl).
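
Each natural variant is a single-residue substitution at a 1-based position in the canonical sequence. A minimal sketch of applying such substitutions follows; it checks the reference residue first so that a coordinate mix-up fails loudly. The helper `apply_variant` and the `VARIANTS` mapping are illustrative names, not part of the entry.

```python
# Canonical CTLA4 sequence (isoform 1, P16410-1) as listed in this entry.
CANONICAL = (
    "MACLGFQRHKAQLNLATRTWPCTLLFFLLFIPVFCKAMHVAQPAVVLASS"
    "RGIASFVCEYASPGKATEVRVTVLRQADSQVTEVCAATYMMGNELTFLDD"
    "SICTGTSSGNQVNLTIQGLRAMDTGLYICKVELMYPPPYYLGIGNGTQIY"
    "VIDPEPCPDSDFLLWILAAVSSGLFFYSFLLTAVSLSKMLKKRSPLTTGV"
    "YVKMPPTEPECEKQFQPYFIPIN"
)

# Variants from this entry: feature ID -> (reference, 1-based position, alternate).
VARIANTS = {
    "VAR_013577": ("T", 17, "A"),
    "VAR_072681": ("R", 70, "W"),
}


def apply_variant(sequence: str, ref: str, position: int, alt: str) -> str:
    """Substitute one residue, verifying the reference residue first."""
    found = sequence[position - 1]
    if found != ref:
        raise ValueError(f"expected {ref} at position {position}, found {found}")
    return sequence[:position - 1] + alt + sequence[position:]


variant_sequences = {
    feature_id: apply_variant(CANONICAL, ref, position, alt)
    for feature_id, (ref, position, alt) in VARIANTS.items()
}

for feature_id, seq in variant_sequences.items():
    print(feature_id, len(seq))
```

Both substitutions leave the length at 223 residues; position 17 falls within the signal peptide (1 – 35) and position 70 within the extracellular domain (36 – 161).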

Alternative sequence

Feature key           Feature ID  Position(s)  Length  Description
Alternative sequence  VSP_041284  38 – 204     167     Missing in isoform 2 and isoform 3. (1 Publication)
Alternative sequence  VSP_041285  58           1       C → S in isoform 4. (1 Publication)
Alternative sequence  VSP_041286  59 – 204     146     Missing in isoform 4. (1 Publication)
Alternative sequence  VSP_047238  153 – 174    22      DPEPC…VSSGL → AKEKKPSYNRGLCENAPNRARM in isoform 5. (1 Publication)
Alternative sequence  VSP_047239  175 – 223    49      Missing in isoform 5. (1 Publication)
Alternative sequence  VSP_041287  205 – 223    19      PPTEP…FIPIN → KEKKPSYNRGLCENAPNRARM in isoform 3 and isoform 4. (1 Publication)
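
The alternative-sequence features fully determine isoforms 2 to 5 from the canonical sequence. The sketch below rebuilds them by applying each deletion or replacement from the C-terminus backwards, so that earlier coordinates stay valid, and the resulting lengths can be compared with those listed per isoform. `ISOFORM_EDITS` and `apply_edits` are illustrative names introduced here.

```python
# Canonical CTLA4 sequence (isoform 1, P16410-1) as listed in this entry.
CANONICAL = (
    "MACLGFQRHKAQLNLATRTWPCTLLFFLLFIPVFCKAMHVAQPAVVLASS"
    "RGIASFVCEYASPGKATEVRVTVLRQADSQVTEVCAATYMMGNELTFLDD"
    "SICTGTSSGNQVNLTIQGLRAMDTGLYICKVELMYPPPYYLGIGNGTQIY"
    "VIDPEPCPDSDFLLWILAAVSSGLFFYSFLLTAVSLSKMLKKRSPLTTGV"
    "YVKMPPTEPECEKQFQPYFIPIN"
)

ALT_TAIL = "KEKKPSYNRGLCENAPNRARM"  # replaces 205-223 in isoforms 3 and 4

# Edits per isoform: (first, last, replacement), 1-based inclusive.
# An empty replacement means the range is missing.
ISOFORM_EDITS = {
    "P16410-2": [(38, 204, "")],
    "P16410-3": [(38, 204, ""), (205, 223, ALT_TAIL)],
    "P16410-4": [(58, 58, "S"), (59, 204, ""), (205, 223, ALT_TAIL)],
    "P16410-5": [(153, 174, "AKEKKPSYNRGLCENAPNRARM"), (175, 223, "")],
}


def apply_edits(sequence: str, edits: list) -> str:
    """Apply edits from the highest start position down, so that
    coordinates of edits nearer the N-terminus are not shifted."""
    for first, last, replacement in sorted(edits, reverse=True):
        sequence = sequence[:first - 1] + replacement + sequence[last:]
    return sequence


isoforms = {acc: apply_edits(CANONICAL, edits) for acc, edits in ISOFORM_EDITS.items()}

for acc, seq in isoforms.items():
    print(acc, len(seq))
# P16410-2 56
# P16410-3 58
# P16410-4 79
# P16410-5 174
```

The computed lengths agree with the lengths listed for each isoform in this entry. Isoforms 2 to 4 lose the transmembrane helix (162 – 182) entirely, and isoform 5 has it disrupted by the replacement at 153 – 174 and the deletion from 175 onward.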

Sequence databases

EMBL, GenBank, DDBJ:
L15006 mRNA. Translation: AAB59385.1.
M74363 Genomic DNA. Translation: AAA52127.1.
AF411058 Genomic DNA. Translation: AAL40932.1.
AY792514 mRNA. Translation: AAV66331.1.
AY999702 mRNA. Translation: AAY00166.1.
DQ785106 mRNA. Translation: ABG85285.1.
AF414120 mRNA. Translation: AAL07473.1.
DQ357942 Genomic DNA. Translation: ABC67470.1.
AC010138 Genomic DNA. Translation: AAX93176.1.
BC074842 mRNA. Translation: AAH74842.1.
BC074893 mRNA. Translation: AAH74893.1.
AH002733 Genomic DNA. Translation: AAA52773.1.
U90273 mRNA. Translation: AAD00698.1.
AF142144 Genomic DNA. Translation: AAF02499.1.
CCDS: CCDS2362.1. [P16410-1]
      CCDS42803.1. [P16410-5]
PIR: S08614.
RefSeq: NP_001032720.1. NM_001037631.2. [P16410-5]
        NP_005205.2. NM_005214.4. [P16410-1]
UniGene: Hs.247824.

Genome annotation databases

Ensembl: ENST00000295854; ENSP00000295854; ENSG00000163599. [P16410-5]
         ENST00000302823; ENSP00000303939; ENSG00000163599. [P16410-1]
         ENST00000472206; ENSP00000417779; ENSG00000163599. [P16410-4]
GeneID: 1493.
KEGG: hsa:1493.
UCSC: uc002vak.3. human. [P16410-1]

Keywords - Coding sequence diversity

Alternative splicing, Polymorphism

Cross-references

Web resources

Wikipedia: CTLA-4 entry

Protocols and materials databases

Structural Biology Knowledgebase: Search...

Organism-specific databases

CTD: 1493.
DisGeNET: 1493.
GeneCards: CTLA4.
HGNC: HGNC:2505. CTLA4.
MalaCards: CTLA4.
MIM: 109100. phenotype.
     123890. gene.
     152700. phenotype.
     601388. phenotype.
     609755. phenotype.
     610424. phenotype.
     616100. phenotype.
neXtProt: NX_P16410.
OpenTargets: ENSG00000163599.
Orphanet: 555. Celiac disease.
          900. Granulomatosis with polyangiitis.
          855. Hashimoto struma.
          536. Systemic lupus erythematosus.
PharmGKB: PA27006.
GenAtlas: Search...

Miscellaneous databases

EvolutionaryTrace: P16410.
GeneWiki: CTLA-4.
GenomeRNAi: 1493.
PRO: P16410.
SOURCE: Search...

Family and domain databases

Gene3D: 2.60.40.10. 1 hit.
InterPro: IPR008096. CTLA4.
          IPR007110. Ig-like_dom.
          IPR013783. Ig-like_fold.
          IPR003599. Ig_sub.
          IPR013106. Ig_V-set.
Pfam: PF07686. V-set. 1 hit.
PRINTS: PR01720. CTLANTIGEN4.
SMART: SM00409. IG. 1 hit.
       SM00406. IGv. 1 hit.
SUPFAM: SSF48726. SSF48726. 1 hit.
ProtoNet: Search...

Entry information

Entry name: CTLA4_HUMAN
Accession: Primary (citable) accession number: P16410
Secondary accession number(s): A0N1S0, E9PDH0, O95653, Q0PP65, Q52MC1, Q53TD5, Q5S005, Q8WXJ1, Q96P43, Q9UKN9
Entry history
Integrated into UniProtKB/Swiss-Prot: August 1, 1990
Last sequence update: January 10, 2003
Last modified: November 2, 2016
This is version 190 of the entry and version 3 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

Keywords - Technical term

3D-structure, Complete proteome, Pharmaceutical, Reference proteome

Documents

  1. Human cell differentiation molecules
    CD nomenclature of surface proteins of human leucocytes and list of entries
  2. Human chromosome 2
    Human chromosome 2: entries, gene names and cross-references to MIM
  3. Human entries with polymorphisms or disease mutations
    List of human entries with polymorphisms or disease mutations
  4. Human polymorphisms and disease mutations
    Index of human polymorphisms and disease mutations
  5. MIM cross-references
    Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot
  6. PDB cross-references
    Index of Protein Data Bank (PDB) cross-references
  7. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.