Protein

HLA class I histocompatibility antigen, alpha chain F

Gene

HLA-F

Organism
Homo sapiens (Human)
Status
Reviewed. Annotation score: 5 out of 5. Experimental evidence at transcript level.

Function

Involved in the presentation of foreign antigens to the immune system.

GO - Molecular function

  • peptide antigen binding Source: GO_Central
  • receptor binding Source: GO_Central
  • TAP1 binding Source: UniProtKB
  • TAP2 binding Source: UniProtKB

Keywords - Biological process

Immunity

Enzyme and pathway databases

BioCyc: ZFISH:ENSG00000137403-MONOMER.
Reactome: R-HSA-1236974. ER-Phagosome pathway.
R-HSA-1236977. Endosomal/Vacuolar pathway.
R-HSA-198933. Immunoregulatory interactions between a Lymphoid and a non-Lymphoid cell.
R-HSA-877300. Interferon gamma signaling.
R-HSA-909733. Interferon alpha/beta signaling.
R-HSA-983170. Antigen Presentation: Folding, assembly and peptide loading of class I MHC.

Names & Taxonomy

Protein names
Recommended name:
HLA class I histocompatibility antigen, alpha chain F
Alternative name(s):
CDA12
HLA F antigen
Leukocyte antigen F
MHC class I antigen F
Gene names
Name: HLA-F
Synonyms: HLA-5.4, HLAF
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes
  • UP000005640 Component: Chromosome 6

Organism-specific databases

HGNC: HGNC:4963. HLA-F.

Subcellular location

Topology

Feature key | Position(s) | Description | Length
Topological domain | 22 – 305 | Extracellular (Sequence analysis) | 284
Transmembrane | 306 – 329 | Helical (Sequence analysis) | 24
Topological domain | 330 – 346 | Cytoplasmic (Sequence analysis) | 17
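
The positions in the topology table are 1-based and inclusive, so each length is end minus start plus one. The minimal Python sketch below recomputes the three lengths and checks that the segments tile the mature chain (22 to 346, length 325, as listed under Molecule processing) without gaps. The names `TOPOLOGY`, `feature_length` and `is_contiguous` are illustrative only and are not part of any UniProt tooling.

```python
# 1-based inclusive coordinates copied from the Topology table of this entry.
TOPOLOGY = [
    ("Topological domain", 22, 305, "Extracellular"),
    ("Transmembrane", 306, 329, "Helical"),
    ("Topological domain", 330, 346, "Cytoplasmic"),
]


def feature_length(start: int, end: int) -> int:
    """Length of a feature given 1-based inclusive start and end positions."""
    return end - start + 1


def is_contiguous(features) -> bool:
    """True if every feature starts immediately after the previous one ends."""
    return all(nxt[1] == cur[2] + 1 for cur, nxt in zip(features, features[1:]))


lengths = [feature_length(start, end) for _, start, end, _ in TOPOLOGY]
print(lengths)                  # [284, 24, 17]
print(sum(lengths))             # 325, the length of the mature chain 22-346
print(is_contiguous(TOPOLOGY))  # True
```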

Keywords - Cellular component

Membrane, MHC I

Pathology & Biotech

Organism-specific databases

DisGeNET: 3134.
OpenTargets: ENSG00000137403.
ENSG00000204642.
ENSG00000206509.
ENSG00000229698.
ENSG00000235220.
ENSG00000237508.
PharmGKB: PA35082.

Polymorphism and mutation databases

BioMuta: HLA-F.
DMDM: 317373438.

PTM / Processing

Molecule processing

Feature key | Position(s) | Description | Length
Signal peptide | 1 – 21 | | 21
Chain (PRO_0000018884) | 22 – 346 | HLA class I histocompatibility antigen, alpha chain F | 325

Amino acid modifications

Feature key | Position(s) | Description | Length
Glycosylation | 107 | N-linked (GlcNAc...) (By similarity) | 1
Disulfide bond | 122 ↔ 185 | (PROSITE-ProRule annotation) |
Disulfide bond | 224 ↔ 280 | (PROSITE-ProRule annotation) |
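
As a sanity check on these annotations, the sketch below looks up the residues at the listed positions in the canonical sequence (isoform 1, copied from the Sequences section of this entry). It checks that both disulfide bonds join two cysteines and that position 107 starts an N-X-S/T motif, the usual consensus for N-linked glycosylation. The helper names `residue` and `is_sequon` are hypothetical, and passing this check does not show that the modifications occur in vivo.

```python
# Canonical sequence (isoform 1, P30511-1) as displayed in this entry;
# whitespace from the 10-residue grouping is stripped by split()/join().
CANONICAL = "".join("""
MAPRSLLLLL SGALALTDTW AGSHSLRYFS TAVSRPGRGE PRYIAVEYVD
DTQFLRFDSD AAIPRMEPRE PWVEQEGPQY WEWTTGYAKA NAQTDRVALR
NLLRRYNQSE AGSHTLQGMN GCDMGPDGRL LRGYHQHAYD GKDYISLNED
LRSWTAADTV AQITQRFYEA EEYAEEFRTY LEGECLELLR RYLENGKETL
QRADPPKAHV AHHPISDHEA TLRCWALGFY PAEITLTWQR DGEEQTQDTE
LVETRPAGDG TFQKWAAVVV PPGEEQRYTC HVQHEGLPQP LILRWEQSPQ
PTIPIVGIVA GLVVLGAVVT GAVVAAVMWR KKSSDRNRGS YSQAAV
""".split())


def residue(seq: str, pos: int) -> str:
    """Residue at a 1-based position, as used throughout the entry."""
    return seq[pos - 1]


def is_sequon(tripeptide: str) -> bool:
    """N-X-S/T with X not proline: the usual N-glycosylation consensus."""
    return (len(tripeptide) == 3 and tripeptide[0] == "N"
            and tripeptide[1] != "P" and tripeptide[2] in "ST")


DISULFIDES = [(122, 185), (224, 280)]  # from the table above
GLYCOSYLATION_SITE = 107               # from the table above

print(len(CANONICAL))                                   # 346
print([(residue(CANONICAL, a), residue(CANONICAL, b))
       for a, b in DISULFIDES])                         # [('C', 'C'), ('C', 'C')]
motif = CANONICAL[GLYCOSYLATION_SITE - 1:GLYCOSYLATION_SITE + 2]
print(motif, is_sequon(motif))                          # NQS True
```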

Keywords - PTM

Disulfide bond, Glycoprotein

Proteomic databases

PaxDb: P30511.
PeptideAtlas: P30511.
PRIDE: P30511.

PTM databases

iPTMnet: P30511.
PhosphoSitePlus: P30511.

Expression

Gene expression databases

Bgee: ENSG00000137403.
CleanEx: HS_HLA-F.
ExpressionAtlas: P30511. baseline and differential.
Genevisible: P30511. HS.

Interaction

Subunit structure

Heterodimer of an alpha chain and a beta chain (beta-2-microglobulin).

GO - Molecular function

  • receptor binding Source: GO_Central
  • TAP1 binding Source: UniProtKB
  • TAP2 binding Source: UniProtKB

Protein-protein interaction databases

BioGrid: 109379. 16 interactors.
IntAct: P30511. 2 interactors.
STRING: 9606.ENSP00000259951.

Structure

3D structure databases

ProteinModelPortal: P30511.
SMR: P30511.

Family & Domains

Domains and Repeats

Feature key | Position(s) | Description | Length
Domain | 206 – 296 | Ig-like C1-type | 91

Region

Feature key | Position(s) | Description | Length
Region | 22 – 111 | Alpha-1 | 90
Region | 112 – 203 | Alpha-2 | 92
Region | 204 – 295 | Alpha-3 | 92
Region | 296 – 305 | Connecting peptide | 10

Sequence similarities

Belongs to the MHC class I family. (Curated)

Keywords - Domain

Signal, Transmembrane, Transmembrane helix

Phylogenomic databases

eggNOG: ENOG410IUXN. Eukaryota.
ENOG4111K8F. LUCA.
GeneTree: ENSGT00760000118960.
HOGENOM: HOG000296917.
HOVERGEN: HBG016709.
InParanoid: P30511.
KO: K06751.
OMA: INMEIAE.
PhylomeDB: P30511.
TreeFam: TF336617.

Family and domain databases

Gene3D: 2.60.40.10. 1 hit.
3.30.500.10. 1 hit.
InterPro: IPR007110. Ig-like_dom.
IPR013783. Ig-like_fold.
IPR003006. Ig/MHC_CS.
IPR003597. Ig_C1-set.
IPR011161. MHC_I-like_Ag-recog.
IPR011162. MHC_I/II-like_Ag-recog.
IPR001039. MHC_I_a_a1/a2.
Pfam: PF07654. C1-set. 1 hit.
PF00129. MHC_I. 1 hit.
PRINTS: PR01638. MHCCLASSI.
SMART: SM00407. IGc1. 1 hit.
SUPFAM: SSF48726. SSF48726. 1 hit.
SSF54452. SSF54452. 1 hit.
PROSITE: PS50835. IG_LIKE. 1 hit.
PS00290. IG_MHC. 1 hit.

Sequences (3)

Sequence status: Complete.

Sequence processing: The displayed sequence is further processed into a mature form.

This entry describes 3 isoforms produced by alternative splicing.

Isoform 1 (identifier: P30511-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MAPRSLLLLL SGALALTDTW AGSHSLRYFS TAVSRPGRGE PRYIAVEYVD
        60         70         80         90        100
DTQFLRFDSD AAIPRMEPRE PWVEQEGPQY WEWTTGYAKA NAQTDRVALR
       110        120        130        140        150
NLLRRYNQSE AGSHTLQGMN GCDMGPDGRL LRGYHQHAYD GKDYISLNED
       160        170        180        190        200
LRSWTAADTV AQITQRFYEA EEYAEEFRTY LEGECLELLR RYLENGKETL
       210        220        230        240        250
QRADPPKAHV AHHPISDHEA TLRCWALGFY PAEITLTWQR DGEEQTQDTE
       260        270        280        290        300
LVETRPAGDG TFQKWAAVVV PPGEEQRYTC HVQHEGLPQP LILRWEQSPQ
       310        320        330        340
PTIPIVGIVA GLVVLGAVVT GAVVAAVMWR KKSSDRNRGS YSQAAV
Length: 346
Mass (Da): 39,062
Last modified: January 11, 2011 - v3
Checksum: D4782968A67E9B7D

Isoform 2 (identifier: P30511-2)

The sequence of this isoform differs from the canonical sequence as follows:
     204-295: Missing.

Length: 254
Mass (Da): 28,588
Checksum: C81F225D409AAED2

Isoform 3 (identifier: P30511-3)

The sequence of this isoform differs from the canonical sequence as follows:
     346-346: V → AYSVVSGNLM...MKRVQIKIFD

Length: 442
Mass (Da): 50,438
Checksum: 66109DEBE5EF2D57

Sequence caution

The sequence AAC24827 differs from that shown. Reason: Erroneous gene model prediction. (Curated)
The sequence AAH09260 differs from that shown. Reason: Erroneous initiation. Translation N-terminally shortened. (Curated)
The sequence BAB63337 differs from that shown. Reason: Erroneous gene model prediction. (Curated)
The sequence CAA34947 differs from that shown. Reason: Erroneous gene model prediction. (Curated)
The sequence CAB46623 differs from that shown. Reason: Erroneous gene model prediction. (Curated)

Experimental Info

Feature key | Position(s) | Description | Length
Isoform 3 (identifier: P30511-3)
Sequence conflict | 353 | N → L in AAH09260 (PubMed:15489334). (Curated) | 1

Natural variant

Feature key | Position(s) | Description | Length
Natural variant (VAR_056525) | 13 | A → V. Corresponds to variant rs17875379 (dbSNP, Ensembl). | 1
Natural variant (VAR_056526) | 71 | P → Q. Corresponds to variant rs17875380 (dbSNP, Ensembl). | 1
Natural variant (VAR_018327) | 272 | P → S. 9 Publications. Corresponds to variant rs1736924 (dbSNP, Ensembl). | 1

Alternative sequence

Feature key | Position(s) | Description | Length
Alternative sequence (VSP_038846) | 204 – 295 | Missing in isoform 2. 1 Publication | 92
Alternative sequence (VSP_040349) | 346 | V → AYSVVSGNLMITWWSSLFLL GVLFQGYLGCLRSHSVLGRR KVGDMWILFFLWLWTSFNTA FLALQSLRFGFGFRRGRSFL LRSWHHLMKRVQIKIFD in isoform 3. 1 Publication | 1
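
The two alternative-sequence features can be read as edit instructions against the canonical sequence. The sketch below applies them to reproduce isoforms 2 and 3 and compares the resulting lengths with the values listed in this entry (254 and 442). The helper `replace_range` is an illustrative name, not a UniProt API. The spaces in the displayed replacement sequence are assumed to be display grouping only and are removed.

```python
# Canonical sequence (isoform 1, P30511-1) as displayed in this entry.
CANONICAL = "".join("""
MAPRSLLLLL SGALALTDTW AGSHSLRYFS TAVSRPGRGE PRYIAVEYVD
DTQFLRFDSD AAIPRMEPRE PWVEQEGPQY WEWTTGYAKA NAQTDRVALR
NLLRRYNQSE AGSHTLQGMN GCDMGPDGRL LRGYHQHAYD GKDYISLNED
LRSWTAADTV AQITQRFYEA EEYAEEFRTY LEGECLELLR RYLENGKETL
QRADPPKAHV AHHPISDHEA TLRCWALGFY PAEITLTWQR DGEEQTQDTE
LVETRPAGDG TFQKWAAVVV PPGEEQRYTC HVQHEGLPQP LILRWEQSPQ
PTIPIVGIVA GLVVLGAVVT GAVVAAVMWR KKSSDRNRGS YSQAAV
""".split())

# Replacement for position 346 in isoform 3 (VSP_040349), grouping removed.
ISOFORM3_TAIL = "".join("""
AYSVVSGNLMITWWSSLFLL GVLFQGYLGCLRSHSVLGRR KVGDMWILFFLWLWTSFNTA
FLALQSLRFGFGFRRGRSFL LRSWHHLMKRVQIKIFD
""".split())


def replace_range(seq: str, start: int, end: int, new: str = "") -> str:
    """Replace 1-based inclusive positions start..end with `new`.

    An empty `new` deletes the range ("Missing" in the feature table).
    """
    return seq[:start - 1] + new + seq[end:]


# VSP_038846: residues 204-295 are missing in isoform 2.
isoform2 = replace_range(CANONICAL, 204, 295)
# VSP_040349: V at position 346 is replaced by a longer C-terminus in isoform 3.
isoform3 = replace_range(CANONICAL, 346, 346, ISOFORM3_TAIL)

print(len(isoform2))   # 254
print(len(isoform3))   # 442
print(isoform3[352])   # N, the residue at position 353 named in the sequence conflict
```

The deleted range 204 to 295 coincides with the Alpha-3 region listed under Family & Domains, so isoform 2 as constructed here lacks that region entirely.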

Sequence databases

EMBL, GenBank, DDBJ:
X17093 Genomic DNA. Translation: CAA34947.1. Sequence problems.
AF055066 Genomic DNA. Translation: AAC24827.1. Sequence problems.
AY253269 mRNA. Translation: AAO86773.1.
AY253270 mRNA. Translation: AAO86774.1.
AY253271 mRNA. Translation: AAO86775.1.
AF523284 Genomic DNA. Translation: AAM74979.1.
AF523285 Genomic DNA. Translation: AAM74980.1.
AF523286 Genomic DNA. Translation: AAM74981.1.
AF523287 Genomic DNA. Translation: AAM74982.1.
AF523288 Genomic DNA. Translation: AAM74983.1.
AF523289 Genomic DNA. Translation: AAM74984.1.
AF523290 Genomic DNA. Translation: AAM74985.1.
AF523291 Genomic DNA. Translation: AAM74986.1.
AF523292 Genomic DNA. Translation: AAM74987.1.
AF523293 Genomic DNA. Translation: AAM74988.1.
AF523294 Genomic DNA. Translation: AAM74989.1.
AF523295 Genomic DNA. Translation: AAM74990.1.
AF523296 Genomic DNA. Translation: AAM74991.1.
AF523297 Genomic DNA. Translation: AAM74992.1.
AY645742 Genomic DNA. Translation: AAT73225.1.
AY645743 Genomic DNA. Translation: AAT73226.1.
AY645744 Genomic DNA. Translation: AAT73227.1.
AY645745 Genomic DNA. Translation: AAT73228.1.
AY645746 Genomic DNA. Translation: AAT73229.1.
AY645748 Genomic DNA. Translation: AAT73231.1.
AY645749 Genomic DNA. Translation: AAT73232.1.
AY645750 Genomic DNA. Translation: AAT73233.1.
AY645751 Genomic DNA. Translation: AAT73234.1.
AY645752 Genomic DNA. Translation: AAT73235.1.
AY645753 Genomic DNA. Translation: AAT73236.1.
AY645754 Genomic DNA. Translation: AAT73237.1.
AY645756 Genomic DNA. Translation: AAT73239.1.
AY645757 Genomic DNA. Translation: AAT73240.1.
AY645758 Genomic DNA. Translation: AAT73241.1.
AY645759 Genomic DNA. Translation: AAT73242.1.
DQ367723 mRNA. Translation: ABD38924.1.
BA000025 Genomic DNA. Translation: BAB63337.1. Sequence problems.
AL022723 Genomic DNA. Translation: CAB46623.1. Sequence problems.
AL645939 Genomic DNA. Translation: CAI18086.2.
AL645939 Genomic DNA. Translation: CAI18087.1.
AL645939 Genomic DNA. Translation: CAI18088.2.
AL669813 Genomic DNA. Translation: CAI17626.1.
AL669813 Genomic DNA. Translation: CAI17627.2.
AL844851 Genomic DNA. Translation: CAI18576.1.
AL844851 Genomic DNA. Translation: CAI18577.2.
BX005428 Genomic DNA. Translation: CAI18731.1.
BX005428 Genomic DNA. Translation: CAI18732.2.
CR753818 Genomic DNA. Translation: CAQ08349.1.
CR753818 Genomic DNA. Translation: CAQ08350.1.
BX927250 Genomic DNA. Translation: CAQ10721.1.
BX927250 Genomic DNA. Translation: CAQ10722.1.
CH471081 Genomic DNA. Translation: EAX03223.1.
CH471081 Genomic DNA. Translation: EAX03226.1.
BC009260 mRNA. Translation: AAH09260.2. Different initiation.
BC062991 mRNA. Translation: AAH62991.1.
CCDS: CCDS43437.1. [P30511-3]
CCDS43438.1. [P30511-1]
CCDS43439.1. [P30511-2]
PIR: A60384.
RefSeq: NP_001091948.1. NM_001098478.1. [P30511-2]
NP_001091949.1. NM_001098479.1. [P30511-3]
NP_061823.2. NM_018950.2. [P30511-1]
UniGene: Hs.519972.

Genome annotation databases

Ensembl: ENST00000259951; ENSP00000259951; ENSG00000204642. [P30511-3]
ENST00000334668; ENSP00000334263; ENSG00000204642. [P30511-1]
ENST00000359076; ENSP00000351977; ENSG00000206509. [P30511-2]
ENST00000376848; ENSP00000366044; ENSG00000229698. [P30511-2]
ENST00000376861; ENSP00000366057; ENSG00000204642. [P30511-1]
ENST00000383515; ENSP00000373007; ENSG00000137403. [P30511-2]
ENST00000420067; ENSP00000393535; ENSG00000235220. [P30511-2]
ENST00000434407; ENSP00000397376; ENSG00000204642. [P30511-2]
ENST00000440590; ENSP00000399835; ENSG00000237508. [P30511-2]
GeneID: 3134.
KEGG: hsa:3134.
UCSC: uc003nnm.5. human. [P30511-1]

Keywords - Coding sequence diversity

Alternative splicing, Polymorphism

Cross-references

Protocols and materials databases

DNASU: 3134.

Organism-specific databases

CTD: 3134.
DisGeNET: 3134.
GeneCards: HLA-F.
HGNC: HGNC:4963. HLA-F.
MIM: 143110. gene.
neXtProt: NX_P30511.
OpenTargets: ENSG00000137403.
ENSG00000204642.
ENSG00000206509.
ENSG00000229698.
ENSG00000235220.
ENSG00000237508.
PharmGKB: PA35082.

Miscellaneous databases

ChiTaRS: HLA-F. human.
GeneWiki: HLA-F.
GenomeRNAi: 3134.
PRO: P30511.

Entry information

Entry name: HLAF_HUMAN
Accession: Primary (citable) accession number: P30511
Secondary accession number(s): Q5JQI8, Q5JQJ1, Q5SPT5, Q860R0, Q8MGQ1, Q8WLP5, Q95HC0, Q9TP68
Entry history
Integrated into UniProtKB/Swiss-Prot: April 1, 1993
Last sequence update: January 11, 2011
Last modified: November 2, 2016
This is version 152 of the entry and version 3 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

Keywords - Technical term

Complete proteome, Reference proteome

Documents

  1. Human chromosome 6
    Human chromosome 6: entries, gene names and cross-references to MIM
  2. Human entries with polymorphisms or disease mutations
    List of human entries with polymorphisms or disease mutations
  3. Human polymorphisms and disease mutations
    Index of human polymorphisms and disease mutations
  4. MIM cross-references
    Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot
  5. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.