Protein

Transcription initiation factor TFIID subunit 1-like

Gene

TAF1L

Organism
Homo sapiens (Human)
Status
Reviewed - Annotation score: 5 out of 5 - Experimental evidence at protein level

Function

May act as a functional substitute for TAF1/TAFII250 during male meiosis, when sex chromosomes are transcriptionally silenced. 1 Publication

GO - Molecular function

  • DNA binding Source: UniProtKB-KW
  • histone acetyltransferase activity Source: UniProtKB
  • lysine-acetylated histone binding Source: UniProtKB
  • protein serine/threonine kinase activity Source: UniProtKB
  • TBP-class protein binding Source: UniProtKB

GO - Biological process

  • gene expression Source: Reactome
  • histone acetylation Source: GOC
  • male meiosis Source: UniProtKB
  • positive regulation of transcription, DNA-templated Source: UniProtKB
  • protein phosphorylation Source: GOC
  • regulation of transcription from RNA polymerase II promoter Source: UniProtKB
  • transcription elongation from RNA polymerase II promoter Source: Reactome
  • transcription from RNA polymerase II promoter Source: Reactome
  • transcription initiation from RNA polymerase II promoter Source: Reactome
  • viral process Source: Reactome

Keywords - Biological process

Cell cycle, Transcription, Transcription regulation

Keywords - Ligand

DNA-binding

Enzyme and pathway databases

Reactome: REACT_1655. RNA Polymerase II Transcription Pre-Initiation And Promoter Opening.
REACT_1851. RNA Polymerase II Transcription Initiation.
REACT_2089. RNA Polymerase II Promoter Escape.
REACT_22107. RNA Polymerase II Pre-transcription Events.
REACT_6233. Transcription of the HIV genome.
REACT_6253. RNA Polymerase II HIV Promoter Escape.
REACT_6332. HIV Transcription Initiation.
REACT_834. RNA Polymerase II Transcription Initiation And Promoter Clearance.

Names & Taxonomy

Protein names
Recommended name:
Transcription initiation factor TFIID subunit 1-like
Alternative name(s):
TAF(II)210
TBP-associated factor 1-like
TBP-associated factor 210 kDa
Transcription initiation factor TFIID 210 kDa subunit
Gene names
Name: TAF1L
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes: UP000005640. Component: Chromosome 9

Organism-specific databases

HGNC: HGNC:18056. TAF1L.

Subcellular location

Keywords - Cellular component

Nucleus

Pathology & Biotech

Organism-specific databases

PharmGKB: PA134947802.

Polymorphism and mutation databases

BioMuta: TAF1L.
DMDM: 57013082.

PTM / Processing

Molecule processing

Feature key   Position(s)   Length   Description                                            Feature identifier
Chain         1 – 1826      1826     Transcription initiation factor TFIID subunit 1-like   PRO_0000211217

Proteomic databases

MaxQB: Q8IZX4.
PaxDb: Q8IZX4.
PRIDE: Q8IZX4.

PTM databases

PhosphoSite: Q8IZX4.

Expression

Tissue specificity

Testis-specific; apparently expressed in germ cells.

Gene expression databases

CleanEx: HS_TAF1L.
Genevisible: Q8IZX4. HS.

Organism-specific databases

HPA: HPA056605.

Interaction

Subunit structure

Can bind directly to TATA-box binding protein (TBP). Interacts (via bromo domains) with acetylated lysine residues on the N-terminus of histone H1.4, H2A, H2B, H3 and H4 (in vitro). 2 Publications

Protein-protein interaction databases

BioGrid: 126514. 9 interactions.
IntAct: Q8IZX4. 2 interactions.
MINT: MINT-2810023.
STRING: 9606.ENSP00000418379.

Structure

Secondary structure

Feature key   Position(s)   Length   Description
Helix         1523 – 1536   14       Combined sources
Turn          1537 – 1540   4        Combined sources
Helix         1545 – 1547   3        Combined sources
Turn          1553 – 1555   3        Combined sources
Helix         1557 – 1562   6        Combined sources
Helix         1569 – 1577   9        Combined sources
Helix         1584 – 1602   19       Combined sources
Helix         1607 – 1625   19       Combined sources
Helix         1627 – 1649   23       Combined sources

3D structure databases

Entry   Method   Resolution (Å)   Chain   Positions
3HMH    X-ray    2.05             A       1523-1654
ProteinModelPortal: Q8IZX4.
SMR: Q8IZX4. Positions 607-1107, 1396-1641.

Miscellaneous databases

EvolutionaryTrace: Q8IZX4.

Family & Domains

Domains and Repeats

Feature key   Position(s)   Length   Description
Domain        1416 – 1486   71       Bromo 1 (PROSITE-ProRule annotation)
Domain        1539 – 1609   71       Bromo 2 (PROSITE-ProRule annotation)
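
The feature tables in this entry give positions as 1-based, inclusive residue ranges, so a feature's length is end - start + 1. The following is a minimal sketch of that arithmetic, plus a range-overlap check applied to the two bromo domains and the 3HMH crystal structure positions (1523-1654) quoted in this entry. The function and constant names are illustrative only and are not part of any UniProt tooling.

```python
from typing import Optional, Tuple

Range = Tuple[int, int]  # 1-based, inclusive (start, end)


def feature_length(start: int, end: int) -> int:
    """Length of a feature given 1-based inclusive coordinates."""
    return end - start + 1


def overlap(a: Range, b: Range) -> Optional[Range]:
    """Return the overlapping inclusive range of a and b, or None."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return (start, end) if start <= end else None


# Coordinates copied from this entry.
BROMO_1 = (1416, 1486)
BROMO_2 = (1539, 1609)
PDB_3HMH = (1523, 1654)

print(feature_length(*BROMO_1))    # 71
print(overlap(BROMO_2, PDB_3HMH))  # (1539, 1609)
print(overlap(BROMO_1, PDB_3HMH))  # None
```

Under these coordinates the 3HMH structure spans the whole of Bromo 2 but none of Bromo 1.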

Motif

Feature key   Position(s)   Length   Description
Motif         1370 – 1377   8        Nuclear localization signal (Sequence Analysis)

Compositional bias

Feature key          Position(s)   Length   Description
Compositional bias   156 – 164     9        Pro-rich

Sequence similarities

Belongs to the TAF1 family. (Curated)
Contains 2 bromo domains. (PROSITE-ProRule annotation)

Keywords - Domain

Bromodomain, Repeat

Phylogenomic databases

eggNOG: COG5076.
GeneTree: ENSGT00390000012659.
HOGENOM: HOG000020066.
HOVERGEN: HBG050223.
InParanoid: Q8IZX4.
KO: K03125.
OMA: VIREEPQ.
OrthoDB: EOG7QNVK2.
PhylomeDB: Q8IZX4.
TreeFam: TF313573.

Family and domain databases

Gene3D: 1.10.1100.10. 1 hit.
1.20.920.10. 2 hits.
InterPro: IPR001487. Bromodomain.
IPR018359. Bromodomain_CS.
IPR011177. TAF1_animal.
IPR009067. TAF_II_230-bd.
IPR022591. TFIID_sub1_DUF3591.
Pfam: PF00439. Bromodomain. 2 hits.
PF12157. DUF3591. 1 hit.
PF09247. TBP-binding. 1 hit.
PIRSF: PIRSF003047. TAF1_animal. 1 hit.
PRINTS: PR00503. BROMODOMAIN.
SMART: SM00297. BROMO. 2 hits.
SUPFAM: SSF47055. SSF47055. 1 hit.
SSF47370. SSF47370. 2 hits.
PROSITE: PS00633. BROMODOMAIN_1. 2 hits.
PS50014. BROMODOMAIN_2. 2 hits.

Sequence

Sequence status: Complete.

Q8IZX4-1 [UniParc]

        10         20         30         40         50
MRPGCDLLLR AAATVTAAIM SDSDSEEDSS GGGPFTLAGI LFGNISGAGQ
60 70 80 90 100
LEGESVLDDE CKKHLAGLGA LGLGSLITEL TANEELTGTG GALVNDEGWI
110 120 130 140 150
RSTEDAVDYS DINEVAEDES QRHQQTMGSL QPLYHSDYDE DDYDADCEDI
160 170 180 190 200
DCKLMPPPPP PPGPMKKDKD QDAITCVSES GEDIILPSII APSFLASEKV
210 220 230 240 250
DFSSYSDSES EMGPQEATQA ESEDGKLTLP LAGIMQHDAT KLLPSVTELF
260 270 280 290 300
PEFRPGKVLR FLHLFGPGKN VPSVWRSARR KRKKHRELIQ EEQIQEVECS
310 320 330 340 350
VESEVSQKSL WNYDYAPPPP PEQCLADDEI TMMVPVESKF SQSTGDVDKV
360 370 380 390 400
TDTKPRVAEW RYGPARLWYD MLGVSEDGSG FDYGFKLRKT QHEPVIKSRM
410 420 430 440 450
MEEFRKLEES NGTDLLADEN FLMVTQLHWE DSIIWDGEDI KHKGTKPQGA
460 470 480 490 500
SLAGWLPSIK TRNVMAYNVQ QGFAPTLDDD KPWYSIFPID NEDLVYGRWE
510 520 530 540 550
DNIIWDAQAM PRLLEPPVLA LDPNDENLIL EIPDEKEEAT SNSPSKESKK
560 570 580 590 600
ESSLKKSRIL LGKTGVIREE PQQNMSQPEV KDPWNLSNDE YYFPKQQGLR
610 620 630 640 650
GTFGGNIIQH SIPAMELWQP FFPTHMGPIK IRQFHRPPLK KYSFGALSQP
660 670 680 690 700
GPHSVQPLLK HIKKKAKMRE QERQASGGGE LFFMRTPQDL TGKDGDLILA
710 720 730 740 750
EYSEENGPLM MQVGMATKIK NYYKRKPGKD PGAPDCKYGE TVYCHTSPFL
760 770 780 790 800
GSLHPGQLLQ ALENNLFRAP VYLHKMPETD FLIIRTRQGY YIRELVDIFV
810 820 830 840 850
VGQQCPLFEV PGPNSRRANM HIRDFLQVFI YRLFWKSKDR PRRIRMEDIK
860 870 880 890 900
KAFPSHSESS IRKRLKLCAD FKRTGMDSNW WVLKSDFRLP TEEEIRAKVS
910 920 930 940 950
PEQCCAYYSM IAAKQRLKDA GYGEKSFFAP EEENEEDFQM KIDDEVHAAP
960 970 980 990 1000
WNTTRAFIAA MKGKCLLEVT GVADPTGCGE GFSYVKIPNK PTQQKDDKEP
1010 1020 1030 1040 1050
QAVKKTVTGT DADLRRLSLK NAKQLLRKFG VPEEEIKKLS RWEVIDVVRT
1060 1070 1080 1090 1100
MSTEQAHSGE GPMSKFARGS RFSVAEHQER YKEECQRIFD LQNKVLSSTE
1110 1120 1130 1140 1150
VLSTDTDSIS AEDSDFEEMG KNIENMLQNK KTSSQLSREW EEQERKELRR
1160 1170 1180 1190 1200
MLLVAGSAAS GNNHRDDVTA SMTSLKSSAT GHCLKIYRTF RDEEGKEYVR
1210 1220 1230 1240 1250
CETVRKPAVI DAYVRIRTTK DEKFIQKFAL FDEKHREEMR KERRRIQEQL
1260 1270 1280 1290 1300
RRLKRNQEKE KLKGPPEKKP KKMKERPDLK LKCGACGAIG HMRTNKFCPL
1310 1320 1330 1340 1350
YYQTNVPPSK PVAMTEEQEE ELEKTVIHND NEELIKVEGT KIVFGKQLIE
1360 1370 1380 1390 1400
NVHEVRRKSL VLKFPKQQLP PKKKRRVGTT VHCDYLNIPH KSIHRRRTDP
1410 1420 1430 1440 1450
MVTLSSILES IINDMRDLPN THPFHTPVNA KVVKDYYKII TRPMDLQTLR
1460 1470 1480 1490 1500
ENVRKCLYPS REEFREHLEL IVKNSATYNG PKHSLTQISQ SMLDLCDEKL
1510 1520 1530 1540 1550
KEKEDKLARL EKAINPLLDD DDQVAFSFIL DNIVTQKMMA VPDSWPFHHP
1560 1570 1580 1590 1600
VNKKFVPDYY KMIVNPVDLE TIRKNISKHK YQSRESFLDD VNLILANSVK
1610 1620 1630 1640 1650
YNGPESQYTK TAQEIVNICY QTITEYDEHL TQLEKDICTA KEAALEEAEL
1660 1670 1680 1690 1700
ESLDPMTPGP YTSQPPDMYD TNTSLSTSRD ASVFQDESNL SVLDISTATP
1710 1720 1730 1740 1750
EKQMCQGQGR LGEEDSDVDV EGYDDEEEDG KPKPPAPEGG DGDLADEEEG
1760 1770 1780 1790 1800
TVQQPEASVL YEDLLISEGE DDEEDAGSDE EGDNPFSAIQ LSESGSDSDV
1810 1820
GYGGIRPKQP FMLQHASGEH KDGHGK
Length: 1,826
Mass (Da): 207,302
Last modified: March 1, 2003 - v1
Checksum: 35D780E749AC9B17
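
The sequence above is displayed in groups of ten residues with ruler lines of position numbers in between. The sketch below shows one way to turn that display back into a plain residue string, assuming ruler lines contain only integers and residue lines contain only letters. The helper name `parse_grouped_sequence` is hypothetical, and the sample is just the first 100 residues copied from this entry.

```python
def parse_grouped_sequence(text: str) -> str:
    """Strip ruler lines and spaces from a grouped sequence display."""
    residues = []
    for line in text.splitlines():
        tokens = line.split()
        # Skip blank lines and ruler lines made only of position numbers.
        if not tokens or all(token.isdigit() for token in tokens):
            continue
        residues.append("".join(tokens))
    return "".join(residues)


# First two rows of the sequence block in this entry.
sample = """\
        10         20         30         40         50
MRPGCDLLLR AAATVTAAIM SDSDSEEDSS GGGPFTLAGI LFGNISGAGQ
60 70 80 90 100
LEGESVLDDE CKKHLAGLGA LGLGSLITEL TANEELTGTG GALVNDEGWI
"""

seq = parse_grouped_sequence(sample)
print(len(seq))  # 100
print(seq[46])   # G (residue 47, using 0-based indexing)
```

Applied to the full block, the resulting string length can be compared against the stated length of 1,826 as a sanity check on the copy.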

Natural variant

Feature key       Position   Change   dbSNP        Feature identifier   Publications   Description
Natural variant   47         G → A    -            VAR_041934           1              In a lung small cell carcinoma sample; somatic mutation.
Natural variant   171        Q → E    rs56352331   VAR_041935           1
Natural variant   256        G → A    rs55991718   VAR_041936           1
Natural variant   371        M → V    rs17219559   VAR_041937           1
Natural variant   532        I → N    rs56128445   VAR_041938           1
Natural variant   637        P → S    rs56157814   VAR_041939           1
Natural variant   750        L → F    -            VAR_041940           1              In a lung adenocarcinoma sample; somatic mutation.
Natural variant   762        L → I    -            VAR_041941           1              In a lung adenocarcinoma sample; somatic mutation.
Natural variant   794        E → D    -            VAR_041942           1              In a lung adenocarcinoma sample; somatic mutation.
Natural variant   820        M → T    rs1258       VAR_048434           -
Natural variant   845        R → Q    rs34787787   VAR_041943           1
Natural variant   1016       R → C    rs35905429   VAR_041944           1
Natural variant   1038       K → N    rs55767137   VAR_041945           1
Natural variant   1169       T → I    rs55976674   VAR_041946           1
Natural variant   1312       V → L    rs55824107   VAR_041947           1
Natural variant   1356       R → C    rs56107531   VAR_041948           1
Natural variant   1389       P → S    rs56393725   VAR_041949           1
Natural variant   1411       I → V    rs34500740   VAR_041950           1
Natural variant   1540       A → T    rs55782058   VAR_041951           1
Natural variant   1549       H → Y    -            VAR_041952           1              In a glioblastoma multiforme sample; somatic mutation.
Natural variant   1731       K → N    rs34241003   VAR_041953           1
Natural variant   1805       I → V    rs16918393   VAR_048435           -
Natural variant   1810       P → L    rs56342342   VAR_041954           1
Natural variant   1824       H → Q    -            VAR_041955           1              In a lung adenocarcinoma sample; somatic mutation.
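
Each natural variant above is a single-residue substitution at a 1-based position. As a rough illustration, the sketch below applies one such substitution to a sequence string and first checks that the reference residue matches, which catches off-by-one errors when copying positions. The function name `apply_variant` is hypothetical; the example uses only the first 50 residues of this entry and the G → A variant at position 47 (VAR_041934).

```python
def apply_variant(seq: str, position: int, ref: str, alt: str) -> str:
    """Apply a single-residue substitution at a 1-based position.

    Raises ValueError if the position is out of range or the residue
    found at that position is not the expected reference residue.
    """
    if not 1 <= position <= len(seq):
        raise ValueError(f"position {position} outside 1..{len(seq)}")
    observed = seq[position - 1]
    if observed != ref:
        raise ValueError(f"expected {ref} at {position}, found {observed}")
    return seq[:position - 1] + alt + seq[position:]


# Residues 1-50 copied from the sequence in this entry.
first_50 = "MRPGCDLLLRAAATVTAAIMSDSDSEEDSSGGGPFTLAGILFGNISGAGQ"

mutant = apply_variant(first_50, 47, "G", "A")
print(mutant[40:50])  # LFGNISAAGQ
```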

Sequence databases

EMBL / GenBank / DDBJ: AF390562 mRNA. Translation: AAN40840.1.
CCDS: CCDS35003.1.
RefSeq: NP_722516.1. NM_153809.2.
UniGene: Hs.591086.

Genome annotation databases

Ensembl: ENST00000242310; ENSP00000418379; ENSG00000122728.
GeneID: 138474.
KEGG: hsa:138474.
UCSC: uc003zrg.1. human.

Keywords - Coding sequence diversity

Polymorphism

Cross-references

Chemistry

ChEMBL: CHEMBL3108641.

Organism-specific databases

CTD: 138474.
GeneCards: GC09M032619.
H-InvDB: HIX0169078.
HGNC: HGNC:18056. TAF1L.
HPA: HPA056605.
MIM: 607798. gene.
neXtProt: NX_Q8IZX4.
PharmGKB: PA134947802.

Miscellaneous databases

EvolutionaryTrace: Q8IZX4.
GenomeRNAi: 138474.
NextBio: 83796.
PRO: Q8IZX4.

Publications

  1. "Functional substitution for TAF(II)250 by a retroposed homolog that is expressed in human spermatogenesis."
    Wang P.J., Page D.C.
    Hum. Mol. Genet. 11:2341-2346(2002)
    Cited for: NUCLEOTIDE SEQUENCE [MRNA], FUNCTION, INTERACTION WITH TBP.
    Tissue: Testis.
  2. Cited for: X-RAY CRYSTALLOGRAPHY (2.05 ANGSTROMS) OF 1523-1654, SUBUNIT.
  3. "Patterns of somatic mutation in human cancer genomes."
    Greenman C., Stephens P., Smith R., Dalgliesh G.L., Hunter C., Bignell G., Davies H., Teague J., Butler A., Stevens C., Edkins S., O'Meara S., Vastrik I., Schmidt E.E., Avis T., Barthorpe S., Bhamra G., Buck G., Choudhury B., Clements J., Cole J., Dicks E., Forbes S., Gray K., Halliday K., Harrison R., Hills K., Hinton J., Jenkinson A., Jones D., Menzies A., Mironenko T., Perry J., Raine K., Richardson D., Shepherd R., Small A., Tofts C., Varian J., Webb T., West S., Widaa S., Yates A., Cahill D.P., Louis D.N., Goldstraw P., Nicholson A.G., Brasseur F., Looijenga L., Weber B.L., Chiew Y.-E., DeFazio A., Greaves M.F., Green A.R., Campbell P., Birney E., Easton D.F., Chenevix-Trench G., Tan M.-H., Khoo S.K., Teh B.T., Yuen S.T., Leung S.Y., Wooster R., Futreal P.A., Stratton M.R.
    Nature 446:153-158(2007)
    Cited for: VARIANTS [LARGE SCALE ANALYSIS] ALA-47; GLU-171; ALA-256; VAL-371; ASN-532; SER-637; PHE-750; ILE-762; ASP-794; GLN-845; CYS-1016; ASN-1038; ILE-1169; LEU-1312; CYS-1356; SER-1389; VAL-1411; THR-1540; TYR-1549; ASN-1731; LEU-1810 AND GLN-1824.

Entry information

Entry name: TAF1L_HUMAN
Accession: Primary (citable) accession number: Q8IZX4
Secondary accession number(s): Q0VG57
Entry history
Integrated into UniProtKB/Swiss-Prot: January 4, 2005
Last sequence update: March 1, 2003
Last modified: July 22, 2015
This is version 110 of the entry and version 1 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.
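
The accession numbers listed under Entry information (Q8IZX4 and Q0VG57) follow the fixed UniProtKB accession format, which makes them easy to tell apart from entry names such as TAF1L_HUMAN. The sketch below uses the regular expression published in the UniProt help pages for accession numbers, to the best of my knowledge; treat the pattern as an assumption and check it against the current UniProt documentation before relying on it. The helper name `is_accession` is illustrative.

```python
import re

# Pattern as given in the UniProt help on accession numbers (assumed current).
ACCESSION_RE = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def is_accession(text: str) -> bool:
    """True if the whole string looks like a UniProtKB accession number."""
    return ACCESSION_RE.fullmatch(text) is not None


print(is_accession("Q8IZX4"))       # True
print(is_accession("Q0VG57"))       # True
print(is_accession("TAF1L_HUMAN"))  # False
```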

Miscellaneous

Keywords - Technical term

3D-structure, Complete proteome, Reference proteome

Documents

  1. Human chromosome 9
    Human chromosome 9: entries, gene names and cross-references to MIM
  2. Human entries with polymorphisms or disease mutations
    List of human entries with polymorphisms or disease mutations
  3. Human polymorphisms and disease mutations
    Index of human polymorphisms and disease mutations
  4. MIM cross-references
    Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot
  5. PDB cross-references
    Index of Protein Data Bank (PDB) cross-references
  6. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. the seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.