Protein

Hornerin

Gene

HRNR

Organism
Homo sapiens (Human)
Status
Reviewed. Annotation score: 5 out of 5. Experimental evidence at protein level.

Function

Component of the epidermal cornified cell envelopes. (1 publication)

Regions

Feature key      Position(s)  Description                       Length
Calcium binding  19 – 32      1 (PROSITE-ProRule annotation)    14
Calcium binding  62 – 73      2 (PROSITE-ProRule annotation)    12

GO - Molecular function

GO - Biological process

  • cell envelope organization Source: UniProtKB
  • establishment of skin barrier Source: UniProtKB
  • keratinization Source: UniProtKB-KW

Keywords - Molecular function

Developmental protein

Keywords - Biological process

Keratinization

Keywords - Ligand

Calcium, Metal-binding

Enzyme and pathway databases

BioCyc: ZFISH:G66-31446-MONOMER.
Reactome: R-HSA-6798695. Neutrophil degranulation.

Names & Taxonomy

Protein names
Recommended name: Hornerin
Gene names
Name: HRNR
Synonyms: S100A18
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes
  • UP000005640 Component: Chromosome 1

Organism-specific databases

HGNC: HGNC:20846. HRNR.

Subcellular location

GO - Cellular component

  • cell envelope Source: UniProtKB
  • cornified envelope Source: InterPro
  • cytoplasm Source: UniProtKB
  • extracellular exosome Source: UniProtKB
  • keratohyalin granule Source: UniProtKB
  • nucleus Source: UniProtKB
  • perinuclear region of cytoplasm Source: UniProtKB

Pathology & Biotech

Organism-specific databases

DisGeNET: 388697.
OpenTargets: ENSG00000197915.
PharmGKB: PA134936141.

Polymorphism and mutation databases

DMDM: 45476906.

PTM / Processing

Molecule processing

Feature key             Position(s)  Description  Length
Chain (PRO_0000144038)  1 – 2850     Hornerin     2850

Amino acid modifications

Feature key       Position(s)  Description                             Length
Modified residue  659          Phosphoserine (Combined sources)        1
Modified residue  661          Phosphoserine (Combined sources)        1
Modified residue  890          Phosphoserine (Combined sources)        1
Modified residue  993          Phosphoserine (Combined sources)        1
Modified residue  1008         Phosphoserine (Combined sources)        1
Modified residue  1205         Omega-N-methylarginine (By similarity)  1
Modified residue  1463         Phosphoserine (Combined sources)        1
Modified residue  1478         Phosphoserine (Combined sources)        1
Modified residue  1712         Phosphoserine (Combined sources)        1
Modified residue  1714         Phosphoserine (Combined sources)        1
Modified residue  1829         Phosphoserine (Combined sources)        1
Modified residue  1831         Phosphoserine (Combined sources)        1
Modified residue  1933         Phosphoserine (Combined sources)        1
Modified residue  1948         Phosphoserine (Combined sources)        1
Modified residue  2299         Phosphoserine (Combined sources)        1
Modified residue  2301         Phosphoserine (Combined sources)        1
Modified residue  2403         Phosphoserine (Combined sources)        1
Modified residue  2418         Phosphoserine (Combined sources)        1
Modified residue  2652         Phosphoserine (Combined sources)        1
Modified residue  2654         Phosphoserine (Combined sources)        1

Post-translational modification

Processed during epidermal differentiation. (By similarity)
Forms covalent cross-links mediated by transglutaminase TGM3, between glutamine and the epsilon-amino group of lysine residues (in vitro).

Keywords - PTM

Methylation, Phosphoprotein

Proteomic databases

PaxDb: Q86YZ3.
PeptideAtlas: Q86YZ3.
PRIDE: Q86YZ3.

2D gel databases

UCD-2DPAGE: Q86YZ3.

PTM databases

iPTMnet: Q86YZ3.
PhosphoSitePlus: Q86YZ3.

Expression

Tissue specificity

Expressed in cornified epidermis, psoriatic and regenerating skin after wounding. Found in the upper granular layer and in the entire cornified layer of epidermis. (2 publications)

Induction

By UV-B irradiation. (1 publication)

Gene expression databases

Bgee: ENSG00000197915.
CleanEx: HS_HRNR.

Organism-specific databases

HPA: HPA031469.

Interaction

Protein-protein interaction databases

BioGrid: 132814. 24 interactors.
IntAct: Q86YZ3. 11 interactors.
MINT: MINT-2809380.
STRING: 9606.ENSP00000357791.

Structure

3D structure databases

ProteinModelPortal: Q86YZ3.
SMR: Q86YZ3.

Family & Domains

Domains and Repeats

Feature key  Position(s)    Description                             Length
Domain       13 – 48        EF-hand 1 (PROSITE-ProRule annotation)  36
Domain       49 – 84        EF-hand 2 (PROSITE-ProRule annotation)  36
Repeat       97 – 187       1                                       91
Repeat       188 – 278      2                                       91
Repeat       279 – 369      3                                       91
Repeat       370 – 460      4                                       91
Repeat       474 – 566      5                                       93
Repeat       593 – 683      6                                       91
Repeat       685 – 747      7                                       63
Repeat       748 – 836      8                                       89
Repeat       839 – 875      9                                       37
Repeat       876 – 965      10                                      90
Repeat       966 – 1004     11                                      39
Repeat       1007 – 1097    12                                      91
Repeat       1098 – 1188    13                                      91
Repeat       1215 – 1305    14                                      91
Repeat       1332 – 1422    15                                      91
Repeat       1423 – 1474    16                                      52
Repeat       1477 – 1567    17                                      91
Repeat       1568 – 1658    18                                      91
Repeat       1685 – 1775    19                                      91
Repeat       1802 – 1892    20                                      91
Repeat       1893 – 1944    21                                      52
Repeat       1947 – 2037    22                                      91
Repeat       2038 – 2128    23                                      91
Repeat       2155 – 2245    24                                      91
Repeat       2272 – 2362    25                                      91
Repeat       2363 – 2414    26                                      52
Repeat       2417 – 2507    27                                      91
Repeat       2508 – 2598    28                                      91
Repeat       2625 – 2715    29                                      91
Repeat       2716 – 2806    30                                      91

Region

Feature key  Position(s)  Description  Length
Region       1 – 81       S-100-like   81
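
The Length column in the feature tables follows from the Position(s) range, which is 1-based and inclusive: length = end - start + 1. The following is a minimal Python sketch of that relationship, checked against a few rows copied from the tables above. The function name `feature_length` and the tuple layout are illustrative assumptions and are not part of the entry.

```python
def feature_length(start: int, end: int) -> int:
    """Length of a feature given 1-based, inclusive start and end positions."""
    if start < 1 or end < start:
        raise ValueError(f"invalid range: {start} - {end}")
    return end - start + 1


# (description, start, end, listed length), copied from the feature tables.
features = [
    ("EF-hand 1", 13, 48, 36),
    ("EF-hand 2", 49, 84, 36),
    ("Repeat 1", 97, 187, 91),
    ("Repeat 5", 474, 566, 93),
    ("Repeat 9", 839, 875, 37),
    ("S-100-like region", 1, 81, 81),
]

# Collect any row whose listed length disagrees with its position range.
mismatches = [
    name for name, start, end, listed in features
    if feature_length(start, end) != listed
]
print(mismatches)
```

An empty list means every sampled row is internally consistent.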

Sequence similarities

Belongs to the S100-fused protein family. (Curated)
In the N-terminal section; belongs to the S-100 family. (Curated)
Contains 2 EF-hand domains. (PROSITE-ProRule annotation)

Keywords - Domain

Repeat

Phylogenomic databases

eggNOG: ENOG410IX7U. Eukaryota.
        ENOG410ZH9F. LUCA.
GeneTree: ENSGT00530000063634.
HOGENOM: HOG000112590.
InParanoid: Q86YZ3.
OMA: SHHESSQ.
OrthoDB: EOG091G01X9.
TreeFam: TF338665.

Family and domain databases

Gene3D: 1.10.238.10. 1 hit.
InterPro: IPR011992. EF-hand-dom_pair.
          IPR018247. EF_Hand_1_Ca_BS.
          IPR002048. EF_hand_dom.
          IPR033201. HRNR.
          IPR001751. S100/CaBP-9k_CS.
          IPR013787. S100_Ca-bd_sub.
PANTHER: PTHR22571:SF25. PTHR22571:SF25. 2 hits.
Pfam: PF01023. S_100. 1 hit.
SMART: SM01394. S_100. 1 hit.
SUPFAM: SSF47473. SSF47473. 1 hit.
PROSITE: PS00018. EF_HAND_1. 1 hit.
         PS50222. EF_HAND_2. 1 hit.
         PS00303. S100_CABP. 1 hit.

Sequence

Sequence status: Complete.

Q86YZ3-1 [UniParc]

        10         20         30         40         50
MPKLLQGVIT VIDVFYQYAT QHGEYDTLNK AELKELLENE FHQILKNPND
        60         70         80         90        100
PDTVDIILQS LDRDHNKKVD FTEYLLMIFK LVQARNKIIG KDYCQVSGSK
       110        120        130        140        150
LRDDTHQHQE EQEETEKEEN KRQESSFSHS SWSAGENDSY SRNVRGSLKP
       160        170        180        190        200
GTESISRRLS FQRDFSGQHN SYSGQSSSYG EQNSDSHQSS GRGQCGSGSG
       210        220        230        240        250
QSPNYGQHGS GSGQSSSNDT HGSGSGQSSG FSQHKSSSGQ SSGYSQHGSG
       260        270        280        290        300
SGHSSGYGQH GSRSGQSSRG ERHRSSSGSS SSYGQHGSGS RQSLGHGRQG
       310        320        330        340        350
SGSRQSPSHV RHGSGSGHSS SHGQHGSGSS YSYSRGHYES GSGQTSGFGQ
       360        370        380        390        400
HESGSGQSSG YSKHGSGSGH SSSQGQHGST SGQASSSGQH GSSSRQSSSY
       410        420        430        440        450
GQHESASRHS SGRGQHSSGS GQSPGHGQRG SGSGQSPSSG QHGTGFGRSS
       460        470        480        490        500
SSGPYVSGSG YSSGFGHHES SSEHSSGYTQ HGSGSGHSSG HGQHGSRSGQ
       510        520        530        540        550
SSRGERQGSS AGSSSSYGQH GSGSRQSLGH SRHGSGSGQS PSPSRGRHES
       560        570        580        590        600
GSRQSSSYGP HGYGSGRSSS RGPYESGSGH SSGLGHQESR SGQSSGYGQH
       610        620        630        640        650
GSSSGHSSTH GQHGSTSGQS SSCGQHGATS GQSSSHGQHG SGSSQSSRYG
       660        670        680        690        700
QQGSGSGQSP SRGRHGSDFG HSSSYGQHGS GSGWSSSNGP HGSVSGQSSG
       710        720        730        740        750
FGHKSGSGQS SGYSQHGSGS SHSSGYRKHG SRSGQSSRSE QHGSSSGLSS
       760        770        780        790        800
SYGQHGSGSH QSSGHGRQGS GSGHSPSRVR HGSSSGHSSS HGQHGSGTSC
       810        820        830        840        850
SSSCGHYESG SGQASGFGQH ESGSGQGYSQ HGSASGHFSS QGRHGSTSGQ
       860        870        880        890        900
SSSSGQHDSS SGQSSSYGQH ESASHHASGR GRHGSGSGQS PGHGQRGSGS
       910        920        930        940        950
GQSPSYGRHG SGSGRSSSSG RHGSGSGQSS GFGHKSSSGQ SSGYTQHGSG
       960        970        980        990       1000
SGHSSSYEQH GSRSGQSSRS EQHGSSSGSS SSYGQHGSGS RQSLGHGQHG
      1010       1020       1030       1040       1050
SGSGQSPSPS RGRHGSGSGQ SSSYGPYRSG SGWSSSRGPY ESGSGHSSGL
      1060       1070       1080       1090       1100
GHRESRSGQS SGYGQHGSSS GHSSTHGQHG STSGQSSSCG QHGASSGQSS
      1110       1120       1130       1140       1150
SHGQHGSGSS QSSGYGRQGS GSGQSPGHGQ RGSGSRQSPS YGRHGSGSGR
      1160       1170       1180       1190       1200
SSSSGQHGSG LGESSGFGHH ESSSGQSSSY SQHGSGSGHS SGYGQHGSRS
      1210       1220       1230       1240       1250
GQSSRGERHG SSSGSSSHYG QHGSGSRQSS GHGRQGSGSG HSPSRGRHGS
      1260       1270       1280       1290       1300
GLGHSSSHGQ HGSGSGRSSS RGPYESRSGH SSVFGQHESG SGHSSAYSQH
      1310       1320       1330       1340       1350
GSGSGHFCSQ GQHGSTSGQS STFDQEGSST GQSSSYGHRG SGSSQSSGYG
      1360       1370       1380       1390       1400
RHGAGSGQSP SRGRHGSGSG HSSSYGQHGS GSGWSSSSGR HGSGSGQSSG
      1410       1420       1430       1440       1450
FGHHESSSWQ SSGCTQHGSG SGHSSSYEQH GSRSGQSSRG ERHGSSSGSS
      1460       1470       1480       1490       1500
SSYGQHGSGS RQSLGHGQHG SGSGQSPSPS RGRHGSGSGQ SSSYSPYGSG
      1510       1520       1530       1540       1550
SGWSSSRGPY ESGSSHSSGL GHRESRSGQS SGYGQHGSSS GHSSTHGQHG
      1560       1570       1580       1590       1600
STSGQSSSCG QHGASSGQSS SHGQHGSGSS QSSGYGRQGS GSGQSPGHGQ
      1610       1620       1630       1640       1650
RGSGSRQSPS YGRHGSGSGR SSSSGQHGSG LGESSGFGHH ESSSGQSSSY
      1660       1670       1680       1690       1700
SQHGSGSGHS SGYGQHGSRS GQSSRGERHG SSSRSSSRYG QHGSGSRQSS
      1710       1720       1730       1740       1750
GHGRQGSGSG QSPSRGRHGS GLGHSSSHGQ HGSGSGRSSS RGPYESRSGH
      1760       1770       1780       1790       1800
SSVFGQHESG SGHSSAYSQH GSGSGHFCSQ GQHGSTSGQS STFDQEGSST
      1810       1820       1830       1840       1850
GQSSSHGQHG SGSSQSSSYG QQGSGSGQSP SRGRHGSGSG HSSSYGQHGS
      1860       1870       1880       1890       1900
GSGWSSSSGR HGSGSGQSSG FGHHESSSWQ SSGYTQHGSG SGHSSSYEQH
      1910       1920       1930       1940       1950
GSRSGQSSRG EQHGSSSGSS SSYGQHGSGS RQSLGHGQHG SGSGQSPSPS
      1960       1970       1980       1990       2000
RGRHGSGSGQ SSSYGPYGSG SGWSSSRGPY ESGSGHSSGL GHRESRSGQS
      2010       2020       2030       2040       2050
SGYGQHGSSS GHSSTHGQHG SASGQSSSCG QHGASSGQSS SHGQHGSGSS
      2060       2070       2080       2090       2100
QSSGYGRQGS GSGQSPGHGQ RGSGSRQSPS YGRHGSGSGR SSSSGQHGPG
      2110       2120       2130       2140       2150
LGESSGFGHH ESSSGQSSSY SQHGSGSGHS SGYGQHGSRS GQSSRGERHG
      2160       2170       2180       2190       2200
SSSGSSSRYG QHGSGSRQSS GHGRQGSGSG HSPSRGRHGS GSGHSSSHGQ
      2210       2220       2230       2240       2250
HGSGSGRSSS RGPYESRSGH SSVFGQHESG SGHSSAYSQH GSGSGHFCSQ
      2260       2270       2280       2290       2300
GQHGSTSGQS STFDQEGSST GQSSSHGQHG SGSSQSSSYG QQGSGSGQSP
      2310       2320       2330       2340       2350
SRGRHGSGSG HSSSYGQHGS GSGWSSSSGR HGSGSGQSSG FGHHESSSWQ
      2360       2370       2380       2390       2400
SSGYTQHGSG SGHSSSYEQH GSRSGQSSRG ERHGSSSGSS SSYGQHGSGS
      2410       2420       2430       2440       2450
RQSLGHGQHG SGSGQSPSPS RGRHGSGSGQ SSSYSPYGSG SGWSSSRGPY
      2460       2470       2480       2490       2500
ESGSGHSSGL GHRESRSGQS SGYGQHGSSS GHSSTHGQHG STSGQSSSCG
      2510       2520       2530       2540       2550
QHGASSGQSS SHGQHGSGSS QSSGYGRQGS GSGQSPGHGQ RGSGSRQSPS
      2560       2570       2580       2590       2600
YGRHGSGSGR SSSSGQHGSG LGESSGFGHH ESSSGQSSSY SQHGSGSGHS
      2610       2620       2630       2640       2650
SGYGQHGSRS GQSSRGERHG SSSGSSSHYG QHGSGSRQSS GHGRQGSGSG
      2660       2670       2680       2690       2700
QSPSRGRHGS GLGHSSSHGQ HGSGSGRSSS RGPYESRLGH SSVFGQHESG
      2710       2720       2730       2740       2750
SGHSSAYSQH GSGSGHFCSQ GQHGSTSGQS STFDQEGSST GQSSSYGHRG
      2760       2770       2780       2790       2800
SGSSQSSGYG RHGAGSGQSL SHGRHGSGSG QSSSYGQHGS GSGQSSGYSQ
      2810       2820       2830       2840       2850
HGSGSGQDGY SYCKGGSNHD GGSSGSYFLS FPSSTSPYEY VQEQRCYFYQ
Length: 2,850
Mass (Da): 282,390
Last modified: March 15, 2004 - v2
Checksum: F25D8028C5AB6701
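
The sequence above is laid out as position rulers alternating with rows of five 10-residue blocks. To work with it programmatically, the rulers and spacing have to be stripped first. The following is a minimal Python sketch of one way to do that, shown on the first two rows only. The function name `parse_numbered_sequence` is an illustrative assumption, and the approach assumes that ruler lines contain only digits and whitespace.

```python
import re


def parse_numbered_sequence(block: str) -> str:
    """Join the residue rows of a ruler-annotated sequence block.

    Lines made up only of digits and whitespace are treated as position
    rulers and skipped; spaces inside residue rows are removed.
    """
    residues = []
    for line in block.splitlines():
        stripped = line.strip()
        if not stripped or re.fullmatch(r"[\d\s]+", stripped):
            continue  # blank line or position ruler
        residues.append(re.sub(r"\s+", "", stripped))
    return "".join(residues)


# First two rows of the entry (positions 1 to 100).
example = """\
        10         20         30         40         50
MPKLLQGVIT VIDVFYQYAT QHGEYDTLNK AELKELLENE FHQILKNPND
        60         70         80         90        100
PDTVDIILQS LDRDHNKKVD FTEYLLMIFK LVQARNKIIG KDYCQVSGSK
"""

sequence = parse_numbered_sequence(example)
print(len(sequence))   # number of residues parsed
print(sequence[:10])   # first block
```

Applied to the full block, the parsed length can be compared with the Length value listed for the entry as a quick sanity check on the copy.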

Experimental Info

Feature key        Position(s)  Description                                    Length
Sequence conflict  271          E → D in AAR91619 (Ref. 2). (Curated)          1
Sequence conflict  2382         R → Q in BAC57496 (PubMed:15507446). (Curated) 1
Sequence conflict  2539         G → S in BAC57496 (PubMed:15507446). (Curated) 1
Sequence conflict  2688         L → S in BAC57496 (PubMed:15507446). (Curated) 1
Sequence conflict  2837         P → T in AAR91619 (Ref. 2). (Curated)          1

Natural variant

Feature key                   Position(s)  Description                                                Length
Natural variant (VAR_048494)  85           R → H. Corresponds to variant rs11204937. (1 publication)  1
Natural variant (VAR_061053)  122          R → W. Corresponds to variant rs57277761.                  1
Natural variant (VAR_048495)  167          G → D. Corresponds to variant rs12741518.                  1
Natural variant (VAR_059174)  273          H → Q. Corresponds to variant rs7545406.                   1
Natural variant (VAR_061054)  376          Q → R. Corresponds to variant rs6587649.                   1
Natural variant (VAR_061055)  427          G → D. Corresponds to variant rs6666097.                   1
Natural variant (VAR_048496)  473          E → G. Corresponds to variant rs6587648.                   1
Natural variant (VAR_048497)  492          G → R. Corresponds to variant rs6587647.                   1
Natural variant (VAR_061056)  517          Y → C. Corresponds to variant rs41266134.                  1
Natural variant (VAR_048498)  664          R → Q. Corresponds to variant rs7520249.                   1
Natural variant (VAR_048499)  799          S → T. Corresponds to variant rs6662450.                   1
Natural variant (VAR_059175)  2435         S → G. Corresponds to variant rs4248393.                   1
Natural variant (VAR_059176)  2461         G → S. Corresponds to variant rs6659183.                   1
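
Each natural variant is written as reference residue → variant residue at a 1-based position in the canonical sequence, so the reference residue can be checked directly against the sequence. The following is a minimal Python sketch of that check for the three variants that fall within the first 200 residues. The sequence fragment is copied from the Sequence section; the helper names `reference_matches` and `apply_variant` are illustrative assumptions.

```python
# Residues 1 to 200 of the canonical sequence, copied from the Sequence section.
first_200 = (
    "MPKLLQGVIT VIDVFYQYAT QHGEYDTLNK AELKELLENE FHQILKNPND "
    "PDTVDIILQS LDRDHNKKVD FTEYLLMIFK LVQARNKIIG KDYCQVSGSK "
    "LRDDTHQHQE EQEETEKEEN KRQESSFSHS SWSAGENDSY SRNVRGSLKP "
    "GTESISRRLS FQRDFSGQHN SYSGQSSSYG EQNSDSHQSS GRGQCGSGSG"
).replace(" ", "")

# (feature ID, 1-based position, reference residue, variant residue)
variants = [
    ("VAR_048494", 85, "R", "H"),
    ("VAR_061053", 122, "R", "W"),
    ("VAR_048495", 167, "G", "D"),
]


def reference_matches(sequence: str, position: int, ref: str) -> bool:
    """True if the residue at the 1-based position equals the reference."""
    return sequence[position - 1] == ref


def apply_variant(sequence: str, position: int, ref: str, alt: str) -> str:
    """Return a copy of the sequence with the substitution applied."""
    if not reference_matches(sequence, position, ref):
        raise ValueError(f"expected {ref} at position {position}")
    return sequence[: position - 1] + alt + sequence[position:]


for feature_id, position, ref, alt in variants:
    print(feature_id, position, reference_matches(first_200, position, ref))
```

The same check extends to the remaining variants once the full 2,850-residue sequence is loaded.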

Sequence databases

EMBL / GenBank / DDBJ:
AB104446 mRNA. Translation: BAC57496.1.
BR000036 mRNA. Translation: FAA00004.1.
AY396741 mRNA. Translation: AAR91619.1.
AL589986 Genomic DNA. Translation: CAH70026.1.
CCDS: CCDS30859.1.
RefSeq: NP_001009931.1. NM_001009931.2.
UniGene: Hs.490162.

Genome annotation databases

Ensembl: ENST00000368801; ENSP00000357791; ENSG00000197915.
GeneID: 388697.
KEGG: hsa:388697.
UCSC: uc001ezt.3. human.

Keywords - Coding sequence diversity

Polymorphism

Cross-references

Organism-specific databases

CTD: 388697.
DisGeNET: 388697.
GeneCards: HRNR.
H-InvDB: HIX0200013.
HGNC: HGNC:20846. HRNR.
HPA: HPA031469.
MIM: 616293. gene.
neXtProt: NX_Q86YZ3.
OpenTargets: ENSG00000197915.
PharmGKB: PA134936141.

Miscellaneous databases

GenomeRNAi: 388697.
PRO: Q86YZ3.

Entry information

Entry name: HORN_HUMAN
Accession: Primary (citable) accession number: Q86YZ3
Secondary accession number(s): Q5DT20, Q5U1F4
Entry history
Integrated into UniProtKB/Swiss-Prot: March 15, 2004
Last sequence update: March 15, 2004
Last modified: November 30, 2016
This is version 123 of the entry and version 2 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

Keywords - Technical term

Complete proteome, Reference proteome

Documents

  1. Human chromosome 1
    Human chromosome 1: entries, gene names and cross-references to MIM
  2. Human entries with polymorphisms or disease mutations
    List of human entries with polymorphisms or disease mutations
  3. Human polymorphisms and disease mutations
    Index of human polymorphisms and disease mutations
  4. MIM cross-references
    Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot
  5. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
  • 100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
  • 90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
  • 50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.
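
The UniRef90 and UniRef50 criteria above combine two thresholds: a minimum sequence identity and a minimum 80% overlap with the seed sequence. The following Python sketch illustrates how such a two-part test could be expressed. It is not UniRef's actual implementation: the `PairwiseHit` type, the `joins_cluster` function, and the definitions of identity (identical positions divided by aligned positions) and overlap (aligned positions divided by seed length) are all assumptions made for illustration, and the alignment counts are invented examples.

```python
from dataclasses import dataclass


@dataclass
class PairwiseHit:
    """Hypothetical summary of one alignment against a cluster seed."""
    identical: int    # aligned positions with identical residues
    aligned: int      # aligned positions on the seed sequence
    seed_length: int  # length of the seed (longest) sequence


def joins_cluster(hit: PairwiseHit, min_identity: float,
                  min_overlap: float = 0.80) -> bool:
    """Apply the identity and overlap thresholds (assumed definitions)."""
    if hit.aligned <= 0 or hit.seed_length <= 0:
        return False
    identity = hit.identical / hit.aligned
    overlap = hit.aligned / hit.seed_length
    return identity >= min_identity and overlap >= min_overlap


# Invented example against a 2,850-residue seed (the length of this entry).
hit = PairwiseHit(identical=1500, aligned=2400, seed_length=2850)
print(joins_cluster(hit, min_identity=0.90))  # UniRef90-style threshold
print(joins_cluster(hit, min_identity=0.50))  # UniRef50-style threshold
```

With these invented counts the hit fails the 90% identity test but passes the 50% one, since its identity is 62.5% and its overlap is about 84%.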