Protein

Zinc finger protein 521

Gene

ZNF521

Organism
Homo sapiens (Human)
Status
Reviewed - Annotation score: 5 out of 5 - Experimental evidence at protein level

Function

Transcription factor that can act as either an activator or a repressor depending on the context. Involved in BMP signaling and in the regulation of the immature compartment of the hematopoietic system. Associates with SMADs in response to BMP2, leading to transcriptional activation of BMP target genes. Acts as a transcriptional repressor via its interaction with EBF1, a transcription factor involved in specification of the B-cell lineage; this interaction prevents EBF1 from binding DNA and activating target genes. (1 publication)

Regions

Feature key   Position(s)    Description                Length
Zinc finger   47 – 67        C2H2-type 1; degenerate    21
Zinc finger   118 – 140      C2H2-type 2                23
Zinc finger   146 – 168      C2H2-type 3                23
Zinc finger   174 – 196      C2H2-type 4                23
Zinc finger   202 – 224      C2H2-type 5                23
Zinc finger   246 – 269      C2H2-type 6                24
Zinc finger   281 – 304      C2H2-type 7                24
Zinc finger   310 – 332      C2H2-type 8                23
Zinc finger   405 – 429      C2H2-type 9; degenerate    25
Zinc finger   437 – 460      C2H2-type 10               24
Zinc finger   477 – 500      C2H2-type 11               24
Zinc finger   513 – 536      C2H2-type 12               24
Zinc finger   560 – 585      C2H2-type 13; atypical     26
Zinc finger   634 – 656      C2H2-type 14               23
Zinc finger   664 – 686      C2H2-type 15               23
Zinc finger   694 – 717      C2H2-type 16               24
Zinc finger   722 – 745      C2H2-type 17               24
Zinc finger   752 – 775      C2H2-type 18               24
Zinc finger   783 – 805      C2H2-type 19               23
Zinc finger   809 – 832      C2H2-type 20               24
Zinc finger   886 – 908      C2H2-type 21; degenerate   23
Zinc finger   930 – 952      C2H2-type 22               23
Zinc finger   959 – 981      C2H2-type 23               23
Zinc finger   1020 – 1042    C2H2-type 24               23
Zinc finger   1065 – 1083    C2H2-type 25; degenerate   19
Zinc finger   1138 – 1161    C2H2-type 26               24
Zinc finger   1195 – 1217    C2H2-type 27               23
Zinc finger   1225 – 1247    C2H2-type 28               23
Zinc finger   1256 – 1279    C2H2-type 29               24
Zinc finger   1286 – 1309    C2H2-type 30               24
Evidence for all rows: PROSITE-ProRule annotation.
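
The Length column follows from the positions, which are 1-based and inclusive. A minimal Python sketch of that arithmetic, using four rows copied from the table above (the helper name span_length is illustrative and not part of any UniProt tool):

```python
# (finger number, start, end, listed length), copied from the Regions table.
FINGERS = [
    (1, 47, 67, 21),
    (9, 405, 429, 25),
    (13, 560, 585, 26),
    (25, 1065, 1083, 19),
]

def span_length(start: int, end: int) -> int:
    """Length of an inclusive, 1-based residue range."""
    return end - start + 1

# Compare the computed length with the length listed in the table.
for number, start, end, listed in FINGERS:
    computed = span_length(start, end)
    status = "ok" if computed == listed else "MISMATCH"
    print(f"C2H2-type {number}: {start}-{end} -> {computed} ({status})")
```

The same check can be run over all 30 rows to catch transcription errors when the table is copied by hand.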

GO - Molecular function

  • DNA binding Source: UniProtKB-KW
  • metal ion binding Source: UniProtKB-KW
  • protein domain specific binding Source: UniProtKB

Keywords - Molecular function

Activator, Developmental protein, Repressor

Keywords - Biological process

Differentiation, Transcription, Transcription regulation

Keywords - Ligand

DNA-binding, Metal-binding, Zinc

Enzyme and pathway databases

BioCyc: ZFISH:G66-32735-MONOMER.
SIGNOR: Q96K83.

Names & Taxonomy

Protein names
Recommended name:
Zinc finger protein 521
Alternative name(s):
Early hematopoietic zinc finger protein
LYST-interacting protein 3
Gene names
Name: ZNF521
Synonyms: EHZF, LIP3
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes
  • UP000005640 Component: Chromosome 18

Organism-specific databases

HGNC: HGNC:24605. ZNF521.

Subcellular location

GO - Cellular component

  • nucleus Source: MGI

Keywords - Cellular component

Nucleus

Pathology & Biotech

Involvement in disease

A chromosomal aberration involving ZNF521 is found in acute lymphoblastic leukemia: translocation t(9;18)(p13;q11.2) with PAX5. The translocation generates the PAX5-ZNF521 oncogene, consisting of the N-terminal part of PAX5 and the C-terminal part of ZNF521.

Sites

Feature key   Position(s)   Description                                        Length
Site          72 – 73       Breakpoint for translocation to form PAX5-ZNF521   2

Keywords - Disease

Proto-oncogene

Organism-specific databases

DisGeNET: 25925.
OpenTargets: ENSG00000198795.
PharmGKB: PA134956321.

Polymorphism and mutation databases

BioMuta: ZNF521.
DMDM: 74760909.

PTM / Processing

Molecule processing

Feature key   Identifier       Position(s)   Description               Length
Chain         PRO_0000306871   1 – 1311      Zinc finger protein 521   1311

Amino acid modifications

Feature key        Position(s)   Description     Length
Modified residue   546           Phosphoserine   1
Modified residue   605           Phosphoserine   1
Modified residue   608           Phosphoserine   1
Evidence for all rows: combined sources.
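
As a sanity check, each annotated phosphoserine position should fall on a serine (S) in the canonical sequence. The sketch below assumes 1-based numbering and uses two 10-residue fragments copied from the Sequence section of this entry (residues 541-550 and 601-610); the names FRAGMENTS and residue_at are illustrative only.

```python
# Fragments of the canonical sequence, keyed by the position of their first residue.
FRAGMENTS = {
    541: "GSRFGSPVLG",  # residues 541-550
    601: "SRALSPLSPV",  # residues 601-610
}

def residue_at(position: int) -> str:
    """Return the residue at a 1-based position covered by one of the fragments."""
    for start, fragment in FRAGMENTS.items():
        if start <= position < start + len(fragment):
            return fragment[position - start]
    raise KeyError(f"position {position} is not covered by the fragments")

PHOSPHOSERINES = [546, 605, 608]

for pos in PHOSPHOSERINES:
    print(pos, residue_at(pos))
```

With the full 1,311-residue sequence in hand, the lookup reduces to sequence[position - 1].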

Keywords - PTM

Phosphoprotein

Proteomic databases

EPD: Q96K83.
PaxDb: Q96K83.
PeptideAtlas: Q96K83.
PRIDE: Q96K83.

PTM databases

iPTMnet: Q96K83.
PhosphoSitePlus: Q96K83.

Expression

Tissue specificity

Predominantly expressed in hematopoietic cells. Present in organs and tissues that contain stem and progenitor cells, myeloid and/or lymphoid: placenta, spleen, lymph nodes, thymus, bone marrow and fetal liver. Within the hematopoietic system, it is abundant in CD34+ cells but undetectable in mature peripheral blood leukocytes, and its levels rapidly decrease during the differentiation of CD34+ cells in response to hemopoietins. (1 publication)

Gene expression databases

Bgee: ENSG00000198795.
CleanEx: HS_ZNF521.
ExpressionAtlas: Q96K83. baseline and differential.
Genevisible: Q96K83. HS.

Organism-specific databases

HPA: HPA023056.
HPA: HPA023849.

Interaction

Subunit structure

Interacts with EBF1. Interacts with SMAD1 and SMAD4. (2 publications)

GO - Molecular function

  • protein domain specific binding Source: UniProtKB

Protein-protein interaction databases

BioGrid: 117425. 10 interactors.
IntAct: Q96K83. 4 interactors.
STRING: 9606.ENSP00000354794.

Structure

3D structure databases

ProteinModelPortal: Q96K83.
SMR: Q96K83.
ModBase: Search...
MobiDB: Search...

Family & Domains

Domain

Uses different DNA- and protein-binding zinc fingers to regulate the distinct BMP-Smad and hematopoietic systems. (By similarity)

Sequence similarities

Contains 30 C2H2-type zinc fingers. (PROSITE-ProRule annotation)

Keywords - Domain

Repeat, Zinc-finger

Phylogenomic databases

eggNOG: KOG1721. Eukaryota.
eggNOG: COG5048. LUCA.
GeneTree: ENSGT00840000129874.
HOVERGEN: HBG052773.
InParanoid: Q96K83.
OMA: PAGEYIC.
OrthoDB: EOG091G00X5.
PhylomeDB: Q96K83.
TreeFam: TF331504.

Family and domain databases

Gene3D: 3.30.160.60. 7 hits.
Gene3D: 3.30.40.10. 1 hit.
InterPro: IPR007087. Znf_C2H2.
InterPro: IPR015880. Znf_C2H2-like.
InterPro: IPR013087. Znf_C2H2/integrase_DNA-bd.
InterPro: IPR013083. Znf_RING/FYVE/PHD.
Pfam: PF13912. zf-C2H2_6. 6 hits.
SMART: SM00355. ZnF_C2H2. 30 hits.
PROSITE: PS00028. ZINC_FINGER_C2H2_1. 27 hits.
PROSITE: PS50157. ZINC_FINGER_C2H2_2. 24 hits.

Sequence

Sequence status: Complete.

Q96K83-1 [UniParc]

        10         20         30         40         50
MSRRKQAKPR SLKDPNCKLE DKTEDGEALD CKKRPEDGEE LEDEAVHSCD
60 70 80 90 100
SCLQVFESLS DITEHKINQC QLTDGVDVED DPTCSWPASS PSSKDQTSPS
110 120 130 140 150
HGEGCDFGEE EGGPGLPYPC QFCDKSFSRL SYLKHHEQSH SDKLPFKCTY
160 170 180 190 200
CSRLFKHKRS RDRHIKLHTG DKKYHCSECD AAFSRSDHLK IHLKTHTSNK
210 220 230 240 250
PYKCAICRRG FLSSSSLHGH MQVHERNKDG SQSGSRMEDW KMKDTQKCSQ
260 270 280 290 300
CEEGFDFPED LQKHIAECHP ECSPNEDRAA LQCVYCHELF VEETSLMNHM
310 320 330 340 350
EQVHSGEKKN SCSICSESFH TVEELYSHMD SHQQPESCNH SNSPSLVTVG
360 370 380 390 400
YTSVSSTTPD SNLSVDSSTM VEAAPPIPKS RGRKRAAQQT PDMTGPSSKQ
410 420 430 440 450
AKVTYSCIYC NKQLFSSLAV LQIHLKTMHL DKPEQAHICQ YCLEVLPSLY
460 470 480 490 500
NLNEHLKQVH EAQDPGLIVS AMPAIVYQCN FCSEVVNDLN TLQEHIRCSH
510 520 530 540 550
GFANPAAKDS NAFFCPHCYM GFLTDSSLEE HIRQVHCDLS GSRFGSPVLG
560 570 580 590 600
TPKEPVVEVY SCSYCTNSPI FNSVLKLNKH IKENHKNIPL ALNYIHNGKK
610 620 630 640 650
SRALSPLSPV AIEQTSLKMM QAVGGAPARP TGEYICNQCG AKYTSLDSFQ
660 670 680 690 700
THLKTHLDTV LPKLTCPQCN KEFPNQESLL KHVTIHFMIT STYYICESCD
710 720 730 740 750
KQFTSVDDLQ KHLLDMHTFV FFRCTLCQEV FDSKVSIQLH LAVKHSNEKK
760 770 780 790 800
VYRCTSCNWD FRNETDLQLH VKHNHLENQG KVHKCIFCGE SFGTEVELQC
810 820 830 840 850
HITTHSKKYN CKFCSKAFHA IILLEKHLRE KHCVFETKTP NCGTNGASEQ
860 870 880 890 900
VQKEEVELQT LLTNSQESHN SHDGSEEDVD TSEPMYGCDI CGAAYTMETL
910 920 930 940 950
LQNHQLRDHN IRPGESAIVK KKAELIKGNY KCNVCSRTFF SENGLREHMQ
960 970 980 990 1000
THLGPVKHYM CPICGERFPS LLTLTEHKVT HSKSLDTGNC RICKMPLQSE
1010 1020 1030 1040 1050
EEFLEHCQMH PDLRNSLTGF RCVVCMQTVT STLELKIHGT FHMQKTGNGS
1060 1070 1080 1090 1100
AVQTTGRGQH VQKLYKCASC LKEFRSKQDL VKLDINGLPY GLCAGCVNLS
1110 1120 1130 1140 1150
KSASPGINVP PGTNRPGLGQ NENLSAIEGK GKVGGLKTRC SSCNVKFESE
1160 1170 1180 1190 1200
SELQNHIQTI HRELVPDSNS TQLKTPQVSP MPRISPSQSD EKKTYQCIKC
1210 1220 1230 1240 1250
QMVFYNEWDI QVHVANHMID EGLNHECKLC SQTFDSPAKL QCHLIEHSFE
1260 1270 1280 1290 1300
GMGGTFKCPV CFTVFVQANK LQQHIFSAHG QEDKIYDCTQ CPQKFFFQTE
1310
LQNHTMTQHS S
Length: 1,311
Mass (Da): 147,866
Last modified: December 1, 2001 - v1
Checksum: C52DCC71C2B16C8F
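
The sequence listing interleaves position rulers with blocks of ten residues, so it has to be cleaned before any position-based lookup. A minimal Python sketch of that cleanup, shown on the first two rows of the listing only (the function name clean_sequence is illustrative; it assumes ruler lines contain nothing but digits and whitespace):

```python
import re

# First two rows of the listing above, rulers included.
RAW = """\
        10         20         30         40         50
MSRRKQAKPR SLKDPNCKLE DKTEDGEALD CKKRPEDGEE LEDEAVHSCD
60 70 80 90 100
SCLQVFESLS DITEHKINQC QLTDGVDVED DPTCSWPASS PSSKDQTSPS
"""

def clean_sequence(raw: str) -> str:
    """Drop ruler lines and join the residue blocks into one string."""
    residue_lines = [
        line for line in raw.splitlines()
        if not re.fullmatch(r"[\d\s]*", line)  # skip rulers and blank lines
    ]
    return "".join("".join(line.split()) for line in residue_lines)

sequence = clean_sequence(RAW)
print(len(sequence))   # number of residues recovered from the two rows
print(sequence[:10])   # first block of ten residues
```

Applied to the whole listing, the cleaned string should have the stated length of 1,311 residues, which is a quick way to detect a dropped or duplicated row.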

Sequence caution

The sequence ABI33104 differs from that shown. Reason: Erroneous initiation. (Curated)
The sequence BAB13829 differs from that shown. Reason: Erroneous initiation. (Curated)

Experimental Info

Feature key         Position(s)   Description                                  Length
Sequence conflict   203           K → E in CAD57322 (PubMed:14630787).         1
Sequence conflict   301           E → K in ABI33104 (PubMed:17344859).         1
Sequence conflict   599           K → Q in BAB13829 (PubMed:14702039).         1
Sequence conflict   649           F → L in CAD57322 (PubMed:14630787).         1
Sequence conflict   701           K → E in CAD57322 (PubMed:14630787).         1
Sequence conflict   933           N → S in CAD57322 (PubMed:14630787).         1
Sequence conflict   933           N → S in CAB56016 (PubMed:17344859).         1
Sequence conflict   1019          G → A in CAD57322 (PubMed:14630787).         1
Sequence conflict   1019          G → A in CAB56016 (PubMed:17344859).         1
Sequence conflict   1173          L → M in AAG49442 (PubMed:17974005).         1
Sequence conflict   1176          P → T in AAG49442 (PubMed:17974005).         1
Sequence conflict   1204          F → L in AAG49442 (PubMed:17974005).         1
Evidence for all rows: Curated.
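
Each conflict row encodes a position, the residue in the canonical sequence, the residue in the conflicting submission, and that submission's accession. A hedged Python sketch of parsing one row and applying it to a fragment follows; the regular expression and the function names parse_conflict and apply_conflict are assumptions for illustration, and the fragment is residues 201-210 copied from the Sequence section.

```python
import re

# Matches e.g. "203 K → E in CAD57322"; whitespace around the residues is optional.
CONFLICT = re.compile(r"(\d+)\s*([A-Z])\s*→\s*([A-Z]) in (\w+)")

def parse_conflict(text: str):
    """Return (position, canonical residue, variant residue, accession)."""
    match = CONFLICT.search(text)
    if match is None:
        raise ValueError(f"not a conflict description: {text!r}")
    pos, ref, alt, accession = match.groups()
    return int(pos), ref, alt, accession

def apply_conflict(fragment: str, fragment_start: int, pos: int, ref: str, alt: str) -> str:
    """Substitute alt for ref at a 1-based position inside a fragment."""
    index = pos - fragment_start
    if fragment[index] != ref:
        raise ValueError(f"expected {ref} at {pos}, found {fragment[index]}")
    return fragment[:index] + alt + fragment[index + 1:]

pos, ref, alt, accession = parse_conflict("203 K → E in CAD57322 (PubMed:14630787).")
variant = apply_conflict("PYKCAICRRG", 201, pos, ref, alt)  # residues 201-210
print(accession, variant)
```

Checking the canonical residue before substituting guards against off-by-one errors in the position numbering.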

Sequence databases

EMBL / GenBank / DDBJ:
AJ518106 mRNA. Translation: CAD57322.1.
AK021452 mRNA. Translation: BAB13829.1. Different initiation.
AK027354 mRNA. Translation: BAB55056.1.
AK074046 mRNA. Translation: BAB84872.1.
EF445043 Genomic DNA. Translation: ACA06095.1.
CH471088 Genomic DNA. Translation: EAX01201.1.
BC113622 mRNA. Translation: AAI13623.1.
BC113648 mRNA. Translation: AAI13649.1.
DQ845345 mRNA. Translation: ABI33104.1. Different initiation.
AL117615 mRNA. Translation: CAB56016.1.
AF141339 mRNA. Translation: AAG49442.1.
CCDS: CCDS32806.1.
PIR: T17326.
RefSeq: NP_001295154.1. NM_001308225.1.
RefSeq: NP_056276.1. NM_015461.2.
RefSeq: XP_011524213.1. XM_011525911.1.
RefSeq: XP_016881187.1. XM_017025698.1.
UniGene: Hs.116935.

Genome annotation databases

Ensembl: ENST00000361524; ENSP00000354794; ENSG00000198795.
Ensembl: ENST00000538137; ENSP00000440768; ENSG00000198795.
GeneID: 25925.
KEGG: hsa:25925.
UCSC: uc002kvk.3. human.

Keywords - Coding sequence diversity

Chromosomal rearrangement

Cross-references

Protocols and materials databases

Structural Biology Knowledgebase: Search...

Organism-specific databases

CTD: 25925.
DisGeNET: 25925.
GeneCards: ZNF521.
HGNC: HGNC:24605. ZNF521.
HPA: HPA023056.
HPA: HPA023849.
MIM: 610974. gene.
neXtProt: NX_Q96K83.
OpenTargets: ENSG00000198795.
PharmGKB: PA134956321.
GenAtlas: Search...

Miscellaneous databases

ChiTaRS: ZNF521. human.
GenomeRNAi: 25925.
PRO: Q96K83.
SOURCE: Search...

Entry information

Entry name: ZN521_HUMAN
Accession: Primary (citable) accession number: Q96K83
Secondary accession number(s): A3QVP7, B0YJB7, Q8IXI0, Q8TES6, Q9C065, Q9HAL5, Q9UFK4
Entry history
Integrated into UniProtKB/Swiss-Prot: October 2, 2007
Last sequence update: December 1, 2001
Last modified: November 30, 2016
This is version 128 of the entry and version 1 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

Keywords - Technical term

Complete proteome, Reference proteome

Documents

  1. Human chromosome 18
    Human chromosome 18: entries, gene names and cross-references to MIM
  2. MIM cross-references
    Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot
  3. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. the seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.