Protein

Bcl-2-associated transcription factor 1

Gene

BCLAF1

Organism
Homo sapiens (Human)
Status
Reviewed - Annotation score: 5 out of 5 - Experimental evidence at protein level

Function

Death-promoting transcriptional repressor. May be involved in cyclin-D1/CCND1 mRNA stability through the SNARP complex, which associates with both the 3' end of the CCND1 gene and its mRNA. (1 publication)

GO - Molecular function

  • DNA binding Source: MGI
  • poly(A) RNA binding Source: UniProtKB

GO - Biological process

  • apoptotic process Source: UniProtKB
  • negative regulation of transcription, DNA-templated Source: UniProtKB
  • positive regulation of apoptotic process Source: MGI
  • positive regulation of DNA-templated transcription, initiation Source: UniProtKB
  • positive regulation of intrinsic apoptotic signaling pathway Source: UniProtKB
  • positive regulation of response to DNA damage stimulus Source: UniProtKB
  • regulation of DNA-templated transcription in response to stress Source: UniProtKB
  • transcription, DNA-templated Source: UniProtKB-KW

Keywords - Molecular function

Repressor

Keywords - Biological process

Transcription, Transcription regulation

Keywords - Ligand

DNA-binding

Enzyme and pathway databases

BioCyc: ZFISH:ENSG00000029363-MONOMER.

Names & Taxonomy

Protein names
Recommended name: Bcl-2-associated transcription factor 1
Short name: Btf
Gene names
Name: BCLAF1
Synonyms: BTF, KIAA0164
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes
  • UP000005640 Component: Chromosome 6

Organism-specific databases

HGNC: HGNC:16863. BCLAF1.

Subcellular location

GO - Cellular component

  • cytoplasm Source: UniProtKB-SubCell
  • nuclear speck Source: UniProtKB
  • nucleolus Source: HPA
  • nucleoplasm Source: UniProtKB
  • nucleus Source: HPA

Keywords - Cellular component

Cytoplasm, Nucleus

Pathology & Biotech

Organism-specific databases

DisGeNET: 9774.
OpenTargets: ENSG00000029363.
PharmGKB: PA134868035.

Polymorphism and mutation databases

BioMuta: BCLAF1.
DMDM: 47605556.

PTM / Processing

Molecule processing

Feature key  Identifier      Position(s)  Length  Description
Chain        PRO_0000064888  1 – 920      920     Bcl-2-associated transcription factor 1

Amino acid modifications

Feature key       Position(s)  Length  Description [evidence]
Modified residue  102          1       Phosphoserine [Combined sources]
Modified residue  104          1       Phosphoserine [Combined sources]
Modified residue  152          1       N6-acetyllysine [Combined sources]
Modified residue  177          1       Phosphoserine [Combined sources; 1 publication]
Modified residue  181          1       Phosphoserine [Combined sources]
Modified residue  196          1       Phosphoserine [Combined sources]
Modified residue  198          1       Phosphoserine [Combined sources]
Modified residue  219          1       Phosphotyrosine [Combined sources]
Modified residue  222          1       Phosphoserine [Combined sources]
Modified residue  259          1       Phosphoserine [Combined sources]
Modified residue  262          1       Phosphoserine [Combined sources]
Modified residue  264          1       Phosphoserine [Combined sources]
Modified residue  268          1       Phosphoserine [Combined sources; 1 publication]
Modified residue  284          1       Phosphotyrosine [Combined sources]
Modified residue  285          1       Phosphoserine [Combined sources]
Modified residue  290          1       Phosphoserine [Combined sources; 1 publication]
Modified residue  297          1       Phosphoserine [Combined sources]
Modified residue  300          1       Phosphoserine [Combined sources]
Modified residue  315          1       Phosphoserine [Combined sources]
Modified residue  332          1       N6-acetyllysine; alternate [By similarity]
Cross-link        332                  Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO2); alternate [Combined sources]
Modified residue  341          1       Phosphothreonine [Combined sources]
Modified residue  355          1       Phosphothreonine [Combined sources]
Modified residue  383          1       Phosphotyrosine [Combined sources]
Modified residue  385          1       Phosphoserine [Combined sources]
Modified residue  389          1       Phosphoserine [Combined sources]
Modified residue  397          1       Phosphoserine [Combined sources]
Modified residue  402          1       Phosphothreonine [Combined sources]
Cross-link        413                  Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO2) [Combined sources]
Modified residue  421          1       N6-acetyllysine; alternate [By similarity]
Cross-link        421                  Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO2); alternate [Combined sources]
Modified residue  422          1       Phosphoserine [Combined sources]
Modified residue  427          1       Phosphoserine [Combined sources]
Modified residue  431          1       Phosphothreonine [Combined sources]
Modified residue  437          1       N6-acetyllysine; alternate [Combined sources]
Cross-link        437                  Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO2); alternate [Combined sources]
Modified residue  450          1       Phosphoserine [Combined sources]
Cross-link        457                  Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO2) [Combined sources]
Cross-link        462                  Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO2) [Combined sources]
Modified residue  472          1       Phosphoserine [Combined sources]
Modified residue  475          1       N6-acetyllysine [By similarity]
Cross-link        491                  Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO2) [Combined sources]
Modified residue  494          1       Phosphothreonine [Combined sources]
Modified residue  496          1       Phosphoserine [Combined sources]
Cross-link        501                  Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO2) [Combined sources]
Modified residue  502          1       Phosphoserine [Combined sources]
Modified residue  512          1       Phosphoserine [Combined sources; 1 publication]
Modified residue  525          1       Phosphoserine [Combined sources]
Modified residue  531          1       Phosphoserine [Combined sources; 1 publication]
Cross-link        536                  Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO2) [Combined sources]
Cross-link        548                  Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO2) [Combined sources]
Cross-link        550                  Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO2) [Combined sources]
Modified residue  559          1       Phosphoserine [Combined sources]
Modified residue  564          1       Phosphoserine [Combined sources]
Modified residue  566          1       Phosphothreonine [Combined sources]
Modified residue  578          1       Phosphoserine [Combined sources]
Cross-link        580                  Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO1) [Combined sources]
Cross-link        580                  Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO2) [Combined sources]
Modified residue  648          1       Phosphoserine [Combined sources]
Modified residue  658          1       Phosphoserine [Combined sources; 1 publication]
Modified residue  660          1       Phosphoserine [Combined sources]
Modified residue  661          1       Phosphothreonine [Combined sources]
Cross-link        676                  Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO2) [Combined sources]
Modified residue  690          1       Phosphoserine [Combined sources]
Modified residue  760          1       Phosphoserine [Combined sources]
Cross-link        778                  Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO2) [Combined sources]
Modified residue  803          1       Citrulline [By similarity]
Modified residue  809          1       Omega-N-methylarginine [Combined sources]
Cross-link        831                  Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO1) [Combined sources]
Cross-link        831                  Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO2) [Combined sources]
Isoform 4 (identifier: Q9NYF8-4)
Modified residue  339          1       Phosphoserine [Combined sources]

Post-translational modification

Citrullinated by PADI4. (By similarity)

Keywords - PTM

Acetylation, Citrullination, Isopeptide bond, Methylation, Phosphoprotein, Ubl conjugation

Proteomic databases

EPD: Q9NYF8.
MaxQB: Q9NYF8.
PaxDb: Q9NYF8.
PeptideAtlas: Q9NYF8.
PRIDE: Q9NYF8.

PTM databases

iPTMnet: Q9NYF8.
PhosphoSitePlus: Q9NYF8.
SwissPalm: Q9NYF8.

Expression

Tissue specificity

Ubiquitous.

Gene expression databases

Bgee: ENSG00000029363.
CleanEx: HS_BCLAF1.
ExpressionAtlas: Q9NYF8. baseline and differential.
Genevisible: Q9NYF8. HS.

Organism-specific databases

HPA: HPA006669.
     HPA027770.

Interaction

Subunit structure

Interacts with Bcl-2-related proteins, with EMD, with the adenovirus E1B 19 kDa protein, and with DNA. Component of the SNARP complex, which consists at least of SNIP1, SNW1, THRAP3, BCLAF1 and PNN. Component of the WTAP complex, composed of WTAP, ZC3H13, CBLL1, KIAA1429, RBM15, BCLAF1 and THRAP3. (3 publications)

Binary interactions

With  Entry   #Exp.  IntAct
BCL2  P10415  2      EBI-437804, EBI-77694
EMD   P50402  3      EBI-437804, EBI-489887

Protein-protein interaction databases

BioGrid: 115118. 98 interactors.
IntAct: Q9NYF8. 70 interactors.
MINT: MINT-92502.
STRING: 9606.ENSP00000435210.

Structure

3D structure databases

ProteinModelPortal: Q9NYF8.
ModBase: Search...
MobiDB: Search...

Family & Domains

Compositional bias

Feature key         Position(s)  Length  Description
Compositional bias  141 – 148    8       Poly-Ser
Compositional bias  749 – 763    15      Poly-Ser

Phylogenomic databases

eggNOG: ENOG410IJCD. Eukaryota.
        ENOG4110X68. LUCA.
GeneTree: ENSGT00530000063211.
HOVERGEN: HBG050681.
InParanoid: Q9NYF8.
KO: K13087.
OMA: DDSKHKS.
OrthoDB: EOG091G03HU.
PhylomeDB: Q9NYF8.
TreeFam: TF335939.

Family and domain databases

InterPro: IPR026668. Bcl-2_assoc_TF1.
          IPR029199. THRAP3_BCLAF1.
PANTHER: PTHR15268:SF4. PTHR15268:SF4. 1 hit.
Pfam: PF15440. THRAP3_BCLAF1. 1 hit.

Sequences (4)

Sequence status: Complete.

This entry describes 4 isoforms produced by alternative splicing.

Isoform 1 (identifier: Q9NYF8-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MGRSNSRSHS SRSKSRSQSS SRSRSRSHSR KKRYSSRSRS RTYSRSRSRD
        60         70         80         90        100
RMYSRDYRRD YRNNRGMRRP YGYRGRGRGY YQGGGGRYHR GGYRPVWNRR
       110        120        130        140        150
HSRSPRRGRS RSRSPKRRSV SSQRSRSRSR RSYRSSRSPR SSSSRSSSPY
       160        170        180        190        200
SKSPVSKRRG SQEKQTKKAE GEPQEESPLK SKSQEEPKDT FEHDPSESID
       210        220        230        240        250
EFNKSSATSG DIWPGLSAYD NSPRSPHSPS PIATPPSQSS SCSDAPMLST
       260        270        280        290        300
VHSAKNTPSQ HSHSIQHSPE RSGSGSVGNG SSRYSPSQNS PIHHIPSRRS
       310        320        330        340        350
PAKTIAPQNA PRDESRGRSS FYPDGGDQET AKTGKFLKRF TDEESRVFLL
       360        370        380        390        400
DRGNTRDKEA SKEKGSEKGR AEGEWEDQEA LDYFSDKESG KQKFNDSEGD
       410        420        430        440        450
DTEETEDYRQ FRKSVLADQG KSFATASHRN TEEEGLKYKS KVSLKGNRES
       460        470        480        490        500
DGFREEKNYK LKETGYVVER PSTTKDKHKE EDKNSERITV KKETQSPEQV
       510        520        530        540        550
KSEKLKDLFD YSPPLHKNLD AREKSTFREE SPLRIKMIAS DSHRPEVKLK
       560        570        580        590        600
MAPVPLDDSN RPASLTKDRL LASTLVHSVK KEQEFRSIFD HIKLPQASKS
       610        620        630        640        650
TSESFIQHIV SLVHHVKEQY FKSAAMTLNE RFTSYQKATE EHSTRQKSPE
       660        670        680        690        700
IHRRIDISPS TLRKHTRLAG EERVFKEENQ KGDKKLRCDS ADLRHDIDRR
       710        720        730        740        750
RKERSKERGD SKGSRESSGS RKQEKTPKDY KEYKSYKDDS KHKREQDHSR
       760        770        780        790        800
SSSSSASPSS PSSREEKESK KEREEEFKTH HEMKEYSGFA GVSRPRGTFF
       810        820        830        840        850
RIRGRGRARG VFAGTNTGPN NSNTTFQKRP KEEEWDPEYT PKSKKYFLHD
       860        870        880        890        900
DRDDGVDYWA KRGRGRGTFQ RGRGRFNFKK SGSSPKWTHD KYQGDGIVED
       910        920
EEETMENNEE KKDRRKEEKE
Length: 920
Mass (Da): 106,122
Last modified: May 24, 2004 - v2
Checksum: 8892B98E54F52C20
Isoform 2 (identifier: Q9NYF8-2)
Also known as: Btf-l

The sequence of this isoform differs from the canonical sequence as follows:
     35-36: Missing.

Length: 918
Mass (Da): 105,948
Checksum: 8DD27EC0EC8FBFC5
Isoform 3 (identifier: Q9NYF8-3)
Also known as: Btf-s, BP-1

The sequence of this isoform differs from the canonical sequence as follows:
     35-36: Missing.
     800-848: Missing.

Length: 869
Mass (Da): 100,232
Checksum: 6A11356844C6EF40
Isoform 4 (identifier: Q9NYF8-4)

The sequence of this isoform differs from the canonical sequence as follows:
     339-511: Missing.

Length: 747
Mass (Da): 85,937
Checksum: 85CDAF9B406FAE0F

Sequence caution

The sequence AAH47687 differs from that shown. Contaminating sequence. Potential poly-A sequence. (Curated)
The sequence AAH47887 differs from that shown. Contaminating sequence. Potential poly-A sequence. (Curated)
The sequence AAH56894 differs from that shown. Contaminating sequence. Potential poly-A sequence. (Curated)
The sequence AAH63846 differs from that shown. Contaminating sequence. Potential poly-A sequence. (Curated)
The sequence BAA11481 differs from that shown. Reason: Erroneous initiation. (Curated)

Experimental Info

Feature key        Position(s)  Length  Description
Sequence conflict  4            1       S → A in AAF64304 (PubMed:10330179). (Curated)

Natural variant

Feature key      Identifier  Position(s)  Length  Description
Natural variant  VAR_059591  66           1       G → A. Corresponds to variant rs9942517 (dbSNP, Ensembl).
Natural variant  VAR_050692  209          1       S → C. Corresponds to variant rs6940018 (dbSNP, Ensembl).
Natural variant  VAR_050693  459          1       Y → D. Corresponds to variant rs1967446 (dbSNP, Ensembl).
Natural variant  VAR_050694  461          1       L → H. Corresponds to variant rs1967445 (dbSNP, Ensembl).
Natural variant  VAR_050695  629          1       N → S. Corresponds to variant rs7381749 (dbSNP, Ensembl).
Natural variant  VAR_050696  875          1       R → C. Corresponds to variant rs34541670 (dbSNP, Ensembl).
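
Because all positions refer to the canonical isoform, each variant's reference residue can be cross-checked against the canonical sequence. The following is a minimal sketch, not part of the entry: the 10-residue blocks are copied from the canonical sequence shown earlier, and the names BLOCKS, parse_variant and residue_at are hypothetical helpers introduced only for illustration.

```python
import re

# 10-residue blocks copied from the canonical sequence, keyed by the
# 1-based position of their first residue. Only blocks that contain a
# listed variant are included (an assumption made to keep the sketch short).
BLOCKS = {
    61: "YRNNRGMRRP",
    201: "EFNKSSATSG",
    451: "DGFREEKNYK",
    461: "LKETGYVVER",
    621: "FKSAAMTLNE",
    871: "RGRGRFNFKK",
}

# Variants written the way the entry lists them: position, reference, alternative.
VARIANTS = ["66G → A", "209S → C", "459Y → D", "461L → H", "629N → S", "875R → C"]


def parse_variant(text):
    """Split '66G → A' into (66, 'G', 'A')."""
    match = re.fullmatch(r"(\d+)([A-Z]) → ([A-Z])", text)
    if match is None:
        raise ValueError(f"unrecognised variant notation: {text!r}")
    return int(match.group(1)), match.group(2), match.group(3)


def residue_at(position):
    """Look up the canonical residue at a 1-based position via its 10-residue block."""
    block_start = (position - 1) // 10 * 10 + 1
    return BLOCKS[block_start][position - block_start]


def reference_matches(text):
    """True when the variant's reference residue equals the canonical residue."""
    position, reference, _alternative = parse_variant(text)
    return residue_at(position) == reference


for variant in VARIANTS:
    print(variant, "reference matches canonical:", reference_matches(variant))
```

A mismatch here would usually indicate that a position was read against the wrong isoform or with 0-based indexing.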

Alternative sequence

Feature key           Identifier  Position(s)  Length  Description
Alternative sequence  VSP_010369  35 – 36      2       Missing in isoform 2 and isoform 3. (2 publications)
Alternative sequence  VSP_010371  339 – 511    173     Missing in isoform 4. (1 publication)
Alternative sequence  VSP_010370  800 – 848    49      Missing in isoform 3. (1 publication)
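
The isoform lengths reported in this entry follow directly from the canonical length and the "Missing" ranges. A minimal sketch of that arithmetic is given below; the function names are hypothetical, ranges are treated as 1-based and inclusive, and they are assumed not to overlap.

```python
# Canonical length and "Missing" ranges as listed in this entry
# (1-based, inclusive positions on the canonical sequence).
CANONICAL_LENGTH = 920

MISSING_RANGES = {
    "Isoform 2": [(35, 36)],
    "Isoform 3": [(35, 36), (800, 848)],
    "Isoform 4": [(339, 511)],
}


def isoform_length(canonical_length, missing):
    """Subtract the size of every deleted range from the canonical length."""
    removed = sum(end - start + 1 for start, end in missing)
    return canonical_length - removed


def apply_missing(sequence, missing):
    """Return `sequence` with the given 1-based inclusive ranges deleted."""
    kept = []
    cursor = 0  # 0-based index of the next residue not yet copied
    for start, end in sorted(missing):
        kept.append(sequence[cursor:start - 1])
        cursor = end
    kept.append(sequence[cursor:])
    return "".join(kept)


for name, ranges in MISSING_RANGES.items():
    print(name, isoform_length(CANONICAL_LENGTH, ranges))
```

The results (918, 869 and 747) agree with the lengths given for isoforms 2, 3 and 4. Passing the full canonical sequence to apply_missing would yield the corresponding isoform sequences under the same assumptions.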

Sequence databases

EMBL / GenBank / DDBJ:
AF249273 mRNA. Translation: AAF64304.1.
D79986 mRNA. Translation: BAA11481.2. Different initiation.
AL121713 Genomic DNA. Translation: CAB96722.1.
CH471051 Genomic DNA. Translation: EAW47950.1.
CH471051 Genomic DNA. Translation: EAW47951.1.
BC047687 mRNA. Translation: AAH47687.1. Sequence problems.
BC047887 mRNA. Translation: AAH47887.1. Sequence problems.
BC056894 mRNA. Translation: AAH56894.1. Sequence problems.
BC063846 mRNA. Translation: AAH63846.1. Sequence problems.
BC132780 mRNA. Translation: AAI32781.1.
BC144281 mRNA. Translation: AAI44282.1.
CCDS: CCDS47485.1. [Q9NYF8-4]
      CCDS47486.1. [Q9NYF8-3]
      CCDS5177.1. [Q9NYF8-1]
      CCDS75525.1. [Q9NYF8-2]
RefSeq: NP_001070908.1. NM_001077440.1. [Q9NYF8-3]
        NP_001070909.1. NM_001077441.1. [Q9NYF8-4]
        NP_001287967.1. NM_001301038.1. [Q9NYF8-2]
        NP_055554.1. NM_014739.2. [Q9NYF8-1]
UniGene: Hs.486542.

Genome annotation databases

Ensembl: ENST00000353331; ENSP00000229446; ENSG00000029363. [Q9NYF8-3]
         ENST00000392348; ENSP00000376159; ENSG00000029363. [Q9NYF8-3]
         ENST00000527759; ENSP00000434826; ENSG00000029363. [Q9NYF8-2]
         ENST00000530767; ENSP00000436501; ENSG00000029363. [Q9NYF8-4]
         ENST00000531224; ENSP00000435210; ENSG00000029363. [Q9NYF8-1]
GeneID: 9774.
KEGG: hsa:9774.
UCSC: uc003qgw.2. human. [Q9NYF8-1]

Keywords - Coding sequence diversity

Alternative splicing, Polymorphism

Cross-references

Web resources

Atlas of Genetics and Cytogenetics in Oncology and Haematology

Protocols and materials databases

Structural Biology Knowledgebase: Search...

Organism-specific databases

CTD: 9774.
DisGeNET: 9774.
GeneCards: BCLAF1.
HGNC: HGNC:16863. BCLAF1.
HPA: HPA006669.
     HPA027770.
MIM: 612588. gene.
neXtProt: NX_Q9NYF8.
OpenTargets: ENSG00000029363.
PharmGKB: PA134868035.
HUGE: Search...
GenAtlas: Search...

Miscellaneous databases

ChiTaRS: BCLAF1. human.
GeneWiki: BCLAF1.
GenomeRNAi: 9774.
PRO: Q9NYF8.
SOURCE: Search...

Family and domain databases

InterPro: IPR026668. Bcl-2_assoc_TF1.
          IPR029199. THRAP3_BCLAF1.
PANTHER: PTHR15268:SF4. PTHR15268:SF4. 1 hit.
Pfam: PF15440. THRAP3_BCLAF1. 1 hit.
ProtoNet: Search...

Entry information

Entry name: BCLF1_HUMAN
Accession: Primary (citable) accession number: Q9NYF8
Secondary accession number(s): A2RU75, B7ZM58, E1P586, Q14673, Q86WU6, Q86WY0
Entry history
Integrated into UniProtKB/Swiss-Prot: May 24, 2004
Last sequence update: May 24, 2004
Last modified: November 2, 2016
This is version 150 of the entry and version 2 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

Keywords - Technical term

Complete proteome, Direct protein sequencing, Reference proteome

Documents

  1. Human chromosome 6
    Human chromosome 6: entries, gene names and cross-references to MIM
  2. Human entries with polymorphisms or disease mutations
    List of human entries with polymorphisms or disease mutations
  3. Human polymorphisms and disease mutations
    Index of human polymorphisms and disease mutations
  4. MIM cross-references
    Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.
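
The UniRef90 and UniRef50 membership criteria described above can be expressed as a simple threshold check. This is only an illustrative sketch of the stated criteria, not the actual UniRef clustering procedure; the function name joins_cluster is hypothetical, and identity and overlap are assumed to be given as fractions measured against the seed (longest) sequence.

```python
# Minimum sequence identity per cluster level, as stated in the text.
IDENTITY_THRESHOLDS = {"UniRef90": 0.90, "UniRef50": 0.50}

# Minimum overlap with the longest (seed) sequence, the same for both levels.
MIN_OVERLAP = 0.80


def joins_cluster(level, identity, overlap):
    """Return True if a sequence meets both criteria for the given UniRef level.

    `identity` and `overlap` are fractions in [0, 1] relative to the seed sequence.
    """
    return identity >= IDENTITY_THRESHOLDS[level] and overlap >= MIN_OVERLAP


print(joins_cluster("UniRef90", identity=0.95, overlap=0.85))
print(joins_cluster("UniRef90", identity=0.95, overlap=0.70))
print(joins_cluster("UniRef50", identity=0.60, overlap=0.80))
```

The second call fails on overlap alone, which shows that high identity over a short aligned region is not enough under these criteria.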