Protein

Single-stranded DNA-binding protein

Gene

ssb

Organism
Escherichia coli (strain K12)
Status
Reviewed. Annotation score: 5 out of 5. Experimental evidence at protein level.

Function

Plays an important role in DNA replication, recombination and repair. Binds to ssDNA and to an array of partner proteins to recruit them to their sites of action during DNA metabolism. Acts as a sliding platform that migrates on DNA via reptation. (UniRule annotation; 3 publications)

Enzyme regulation

The C-terminal tail exerts an inhibitory effect on ssDNA binding. (1 publication)

Regions

Feature key   Position(s)   Length   Description
DNA binding   55 – 61       7        (UniRule annotation)

GO - Molecular function

  • enzyme activator activity Source: UniProtKB
  • identical protein binding Source: EcoCyc
  • single-stranded DNA binding Source: EcoCyc

GO - Biological process

  • DNA replication Source: UniProtKB-HAMAP
  • mismatch repair Source: EcoCyc
  • positive regulation of catalytic activity Source: UniProtKB
  • recombinational repair Source: EcoCyc
  • SOS response Source: EcoCyc

Keywords - Biological process

DNA damage, DNA recombination, DNA repair, DNA replication

Keywords - Ligand

DNA-binding

Enzyme and pathway databases

BioCyc: EcoCyc:EG10976-MONOMER.
        ECOL316407:JW4020-MONOMER.

Names & Taxonomy

Protein names
Recommended name: Single-stranded DNA-binding protein (UniRule annotation)
Short name: SSB (UniRule annotation)
Alternative name(s): Helix-destabilizing protein
Gene names
Name: ssb
Synonyms: exrB, lexC
Ordered Locus Names: b4059, JW4020
Organism: Escherichia coli (strain K12)
Taxonomic identifier: 83333 [NCBI]
Taxonomic lineage: Bacteria > Proteobacteria > Gammaproteobacteria > Enterobacteriales > Enterobacteriaceae > Escherichia
Proteomes
  • UP000000318 Component: Chromosome
  • UP000000625 Component: Chromosome

Organism-specific databases

EcoGene: EG10976. ssb.

Subcellular location

GO - Cellular component

  • cytosol Source: EcoCyc

Pathology & Biotech

Mutagenesis

Feature key   Position(s)   Length   Description
Mutagenesis   5 – 5         1        G → D: Increased frequency of precise excision of transposon Tn10 derivatives (mutant SSB-200). 1 publication
Mutagenesis   11 – 11       1        L → F: Increased frequency of precise excision of transposon Tn10 derivatives (mutant SSB-202). 1 publication
Mutagenesis   25 – 25       1        P → S: Increased frequency of precise excision of transposon Tn10 derivatives (mutant SSB-202). 1 publication
Mutagenesis   56 – 56       1        H → L: Reduces DNA-binding affinity. 1 publication
Mutagenesis   56 – 56       1        H → Y: Destabilizes the tetramer (mutant SSB-1). 1 publication
Mutagenesis   61 – 61       1        F → A: Reduces DNA-binding affinity. 1 publication
Mutagenesis   103 – 103     1        V → M: Increased frequency of precise excision of transposon Tn10 derivatives (mutant SSB-201). 1 publication
Mutagenesis   177 – 177     1        P → S: Strongly reduced exonuclease I (sbcB) stimulation. 1 publication
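
The mutagenesis positions above can be checked against the canonical sequence given in the Sequence section of this entry. The following is a minimal Python sketch, not part of the UniProt record: the names `SSB`, `MUTATIONS`, `wild_type_matches` and `apply_substitution` are illustrative, and positions are assumed to be 1-based on the unprocessed sequence (initiator methionine counted as residue 1).

```python
# Canonical sequence P0AGE0-1, copied from the Sequence section of this entry.
SSB = (
    "MASRGVNKVILVGNLGQDPEVRYMPNGGAVANITLATSESWRDKATGEMK"
    "EQTEWHRVVLFGKLAEVASEYLRKGSQVYIEGQLRTRKWTDQSGQDRYTT"
    "EVVVNVGGTMQMLGGRQGGGAPAGGNIGGGQPQGGWGQPQQPQGGNQFSG"
    "GAQSRPQQSAPAAPSNEPPMDFDDDIPF"
)

# (position, wild-type residue, mutant residue), copied from the table above.
MUTATIONS = [
    (5, "G", "D"), (11, "L", "F"), (25, "P", "S"), (56, "H", "L"),
    (56, "H", "Y"), (61, "F", "A"), (103, "V", "M"), (177, "P", "S"),
]


def wild_type_matches(seq, position, wild_type):
    """Return True if the residue at a 1-based position is the expected one."""
    return seq[position - 1] == wild_type


def apply_substitution(seq, position, wild_type, mutant):
    """Return a copy of seq carrying a single-residue substitution."""
    if not wild_type_matches(seq, position, wild_type):
        found = seq[position - 1]
        raise ValueError(f"expected {wild_type} at {position}, found {found}")
    return seq[:position - 1] + mutant + seq[position:]


# Collect any table rows whose wild-type residue disagrees with the sequence.
mismatches = [m for m in MUTATIONS if not wild_type_matches(SSB, m[0], m[1])]
print("rows that disagree with the sequence:", mismatches)

# Example: build the SSB-1 (H56Y) variant sequence.
ssb_1 = apply_substitution(SSB, 56, "H", "Y")
print(ssb_1[50:61])
```

An empty mismatch list indicates that the numbering in the table and the displayed sequence agree.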

PTM / Processing

Molecule processing

Feature key            Position(s)   Length   Description                           Feature identifier
Initiator methionine                          Removed (2 publications)
Chain                  2 – 178       177      Single-stranded DNA-binding protein   PRO_0000096036

Post-translational modification

Phosphorylated on tyrosine residue(s). (1 publication)

Keywords - PTM

Phosphoprotein

Proteomic databases

EPD: P0AGE0.
PaxDb: P0AGE0.
PRIDE: P0AGE0.

2D gel databases

SWISS-2DPAGE: P0AGE0.

Interaction

Subunit structure

Homotetramer. Interacts, via its C-terminus, with several proteins involved in DNA metabolism such as DnaG, HolC, PriA, PriB, RecO, RecQ and SbcB. (UniRule annotation; 11 publications)

Binary interactions

With   Entry    #Exp.   IntAct
dinG   P27296   2       EBI-1118620, EBI-1114590
dnaG   P0ABS5   2       EBI-1118620, EBI-549259
dnaX   P06710   2       EBI-1118620, EBI-549140
holC   P28905   9       EBI-1118620, EBI-549169
priA   P17888   2       EBI-1118620, EBI-552050

GO - Molecular function

  • identical protein binding Source: EcoCyc

Protein-protein interaction databases

BioGrid: 4262671. 106 interactions.
DIP: DIP-35980N.
IntAct: P0AGE0. 12 interactions.
MINT: MINT-8298796.
STRING: 511145.b4059.

Structure

Secondary structure

Evidence for all features below: combined sources.

Feature key   Position(s)   Length
Beta strand   6 – 17        12
Beta strand   20 – 22       3
Beta strand   25 – 27       3
Beta strand   30 – 37       8
Beta strand   45 – 48       4
Beta strand   54 – 61       8
Helix         63 – 71       9
Beta strand   77 – 89       13
Beta strand   91 – 94       4
Beta strand   96 – 103      8
Beta strand   105 – 107     3
Beta strand   108 – 111     4
Beta strand   120 – 124     5
Beta strand   136 – 139     4
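
The Length column follows from the positions as end - start + 1 (positions are inclusive). A short Python sketch, offered only as an illustration, recomputes the lengths and totals the residues per element type; the names `ELEMENTS`, `span_length` and `residues_by_type` are hypothetical and the coordinates are copied from the table above.

```python
# (element type, start, end), 1-based inclusive, copied from the table above.
ELEMENTS = [
    ("strand", 6, 17), ("strand", 20, 22), ("strand", 25, 27),
    ("strand", 30, 37), ("strand", 45, 48), ("strand", 54, 61),
    ("helix", 63, 71), ("strand", 77, 89), ("strand", 91, 94),
    ("strand", 96, 103), ("strand", 105, 107), ("strand", 108, 111),
    ("strand", 120, 124), ("strand", 136, 139),
]


def span_length(start, end):
    """Number of residues in an inclusive 1-based range."""
    return end - start + 1


def residues_by_type(elements):
    """Sum the residues covered by each secondary-structure element type."""
    totals = {}
    for kind, start, end in elements:
        totals[kind] = totals.get(kind, 0) + span_length(start, end)
    return totals


totals = residues_by_type(ELEMENTS)
for kind, count in sorted(totals.items()):
    print(f"{kind}: {count} residues")
```

The annotated elements are predominantly beta strands, which fits the OB-fold classification listed under Family and domain databases.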

3D structure databases

Entry   Method   Resolution (Å)   Chain     Positions
1EQQ    X-ray    3.20             A/B/C/D   1-178
1EYG    X-ray    2.80             A/B/C/D   1-116
1KAW    X-ray    2.90             A/B/C/D   2-136
1QVC    X-ray    2.20             A/B/C/D   2-146
1SRU    X-ray    3.30             A/B/C/D   1-113
3C94    X-ray    2.70             B/C       170-178
3SXU    X-ray    1.85             C         175-178
3UF7    X-ray    1.20             B/C       170-178
4MZ9    X-ray    2.20             A/B/C/D   1-178
4Z0U    X-ray    2.00             D/E       170-178
ProteinModelPortal: P0AGE0.
SMR: P0AGE0. Positions 2-123.

Miscellaneous databases

EvolutionaryTrace: P0AGE0.

Family & Domains

Domains and Repeats

Feature key   Position(s)   Length   Description
Domain        6 – 111       106      SSB (UniRule annotation)

Motif

Feature key   Position(s)   Length   Description
Motif         173 – 178     6        Important for interaction with partner proteins

Sequence similarities

Contains 1 SSB domain. (UniRule annotation)

Phylogenomic databases

eggNOG: ENOG4108UUM. Bacteria.
        COG0629. LUCA.
HOGENOM: HOG000106483.
InParanoid: P0AGE0.
KO: K03111.
OMA: QQYNEPP.
OrthoDB: EOG6M9F32.
PhylomeDB: P0AGE0.

Family and domain databases

Gene3D: 2.40.50.140. 1 hit.
HAMAP: MF_00984. SSB.
InterPro: IPR012340. NA-bd_OB-fold.
          IPR000424. Primosome_PriB/ssb.
          IPR011344. ssDNA-bd.
PANTHER: PTHR10302. PTHR10302. 1 hit.
Pfam: PF00436. SSB. 1 hit.
PIRSF: PIRSF002070. SSB. 1 hit.
SUPFAM: SSF50249. SSF50249. 1 hit.
TIGRFAMs: TIGR00621. ssb. 1 hit.
PROSITE: PS50935. SSB. 1 hit.

Sequence

Sequence status: Complete.

Sequence processing: The displayed sequence is further processed into a mature form.

P0AGE0-1 [UniParc]

        10         20         30         40         50
MASRGVNKVI LVGNLGQDPE VRYMPNGGAV ANITLATSES WRDKATGEMK
        60         70         80         90        100
EQTEWHRVVL FGKLAEVASE YLRKGSQVYI EGQLRTRKWT DQSGQDRYTT
       110        120        130        140        150
EVVVNVGGTM QMLGGRQGGG APAGGNIGGG QPQGGWGQPQ QPQGGNQFSG
       160        170
GAQSRPQQSA PAAPSNEPPM DFDDDIPF
Length: 178
Mass (Da): 18,975
Last modified: January 23, 2007 - v2
Checksum: F0AF43B6DC8D02FC
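
The annotated features of this entry can be read directly off the displayed sequence. The sketch below is an illustration in Python rather than anything supplied by UniProt; the helper `region` and the variable names are hypothetical, and the coordinates (chain 2-178, SSB domain 6-111, DNA-binding region 55-61, C-terminal motif 173-178) are taken from the feature tables of this entry and treated as 1-based inclusive.

```python
# Canonical sequence P0AGE0-1 as displayed above, with spaces removed.
SSB = (
    "MASRGVNKVILVGNLGQDPEVRYMPNGGAVANITLATSESWRDKATGEMK"
    "EQTEWHRVVLFGKLAEVASEYLRKGSQVYIEGQLRTRKWTDQSGQDRYTT"
    "EVVVNVGGTMQMLGGRQGGGAPAGGNIGGGQPQGGWGQPQQPQGGNQFSG"
    "GAQSRPQQSAPAAPSNEPPMDFDDDIPF"
)


def region(seq, start, end):
    """Return residues start..end (1-based, inclusive) of seq."""
    return seq[start - 1:end]


# Mature chain: the initiator methionine (position 1) is removed.
mature_chain = region(SSB, 2, 178)
# Feature regions, coordinates as listed in this entry.
ssb_domain = region(SSB, 6, 111)
dna_binding = region(SSB, 55, 61)
c_terminal_motif = region(SSB, 173, 178)

print("full length:", len(SSB))
print("mature chain length:", len(mature_chain))
print("SSB domain length:", len(ssb_domain))
print("DNA-binding region:", dna_binding)
print("C-terminal motif:", c_terminal_motif)
```

The lengths obtained this way can be compared with the Length columns of the Molecule processing, Regions, Domains and Motif tables.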

Experimental Info

Feature key         Position(s)   Length   Description
Sequence conflict   121 – 121     1        A → AQ (no nucleotide entry; PubMed:6351061). Curated
Sequence conflict   134 – 134     1        G → S in AAA24649 (PubMed:6270666). Curated
Sequence conflict   171 – 171     1        D → DG (no nucleotide entry; PubMed:6351061). Curated

Sequence databases

EMBL / GenBank / DDBJ:
J01704 Genomic DNA. Translation: AAA24649.1.
U00006 Genomic DNA. Translation: AAC43153.1.
U00096 Genomic DNA. Translation: AAC77029.1.
AP009048 Genomic DNA. Translation: BAE78061.1.
PIR: B65214. DDEC.
RefSeq: NP_418483.1. NC_000913.3.
        WP_000168305.1. NZ_LN832404.1.

Genome annotation databases

EnsemblBacteria: AAC77029; AAC77029; b4059.
                 BAE78061; BAE78061; BAE78061.
GeneID: 948570.
KEGG: ecj:JW4020.
      eco:b4059.
PATRIC: 32123663. VBIEscCol129921_4180.

Cross-references

Organism-specific databases

EchoBASE: EB0969.

Miscellaneous databases

PRO: P0AGE0.

Publications

  1. Cited for: NUCLEOTIDE SEQUENCE [GENOMIC DNA], PROTEIN SEQUENCE OF 2-53.
  2. "F sex factor encodes a single-stranded DNA binding protein (SSB) with extensive sequence homology to Escherichia coli SSB."
    Chase J.W., Merrill B.M., Williams K.R.
    Proc. Natl. Acad. Sci. U.S.A. 80:5480-5484(1983)
    Cited for: NUCLEOTIDE SEQUENCE [GENOMIC DNA].
  3. "Analysis of the Escherichia coli genome. IV. DNA sequence of the region from 89.2 to 92.8 minutes."
    Blattner F.R., Burland V.D., Plunkett G. III, Sofia H.J., Daniels D.L.
    Nucleic Acids Res. 21:5408-5417(1993)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
    Strain: K12 / MG1655 / ATCC 47076.
  4. Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
    Strain: K12 / MG1655 / ATCC 47076.
  5. "Highly accurate genome sequences of Escherichia coli K-12 strains MG1655 and W3110."
    Hayashi K., Morooka N., Yamamoto Y., Fujita K., Isono K., Choi S., Ohtsubo E., Baba T., Wanner B.L., Mori H., Horiuchi T.
    Mol. Syst. Biol. 2:E1-E5(2006)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
    Strain: K12 / W3110 / ATCC 27325 / DSM 5911.
  6. "Biological activity and a partial amino-acid sequence of Escherichia coli DNA-binding protein I isolated from overproducing cells."
    Beyreuther K., Berthold-Schmidt V., Geider K.
    Eur. J. Biochem. 123:415-420(1982)
    Cited for: PROTEIN SEQUENCE OF 2-41.
  7. "Characterization of the Escherichia coli SSB-113 mutant single-stranded DNA-binding protein. Cloning of the gene, DNA and protein sequence analysis, high pressure liquid chromatography peptide mapping, and DNA-binding studies."
    Chase J.W., L'Italien J.J., Murphy J.B., Spicer E.K., Williams K.R.
    J. Biol. Chem. 259:805-814(1984)
    Cited for: CHARACTERIZATION, SEQUENCE REVISION TO 134.
  8. "Characterization of the structural and functional defect in the Escherichia coli single-stranded DNA binding protein encoded by the ssb-1 mutant gene. Expression of the ssb-1 gene under lambda pL regulation."
    Williams K.R., Murphy J.B., Chase J.W.
    J. Biol. Chem. 259:11804-11811(1984)
    Cited for: MUTANT SSB-1.
  9. "Tryptophan 54 and phenylalanine 60 are involved synergistically in the binding of E. coli SSB protein to single-stranded polynucleotides."
    Casas-Finet J.R., Khamis M.I., Maki A.W., Chase J.W.
    FEBS Lett. 220:347-352(1987)
    Cited for: MUTAGENESIS, DNA-BINDING.
  10. "The single-stranded DNA-binding protein of Escherichia coli."
    Meyer R.R., Laine P.S.
    Microbiol. Rev. 54:342-380(1990)
    Cited for: REVIEW.
  11. "Monomers of the Escherichia coli SSB-1 mutant protein bind single-stranded DNA."
    Bujalowski W., Lohman T.M.
    J. Mol. Biol. 217:63-74(1991)
    Cited for: MUTANT SSB-1, DNA-BINDING.
  12. "Protein interactions in genetic recombination in Escherichia coli. Interactions involving RecO and RecR overcome the inhibition of RecA by single-stranded DNA-binding protein."
    Umezu K., Kolodner R.D.
    J. Biol. Chem. 269:30005-30013(1994)
    Cited for: INTERACTION WITH RECO.
  13. "Escherichia coli proteome analysis using the gene-protein database."
    VanBogelen R.A., Abshire K.Z., Moldover B., Olson E.R., Neidhardt F.C.
    Electrophoresis 18:1243-1251(1997)
    Cited for: IDENTIFICATION BY 2D-GEL.
  14. "Identification and characterization of ssb and uup mutants with increased frequency of precise excision of transposon Tn10 derivatives: nucleotide sequence of uup in Escherichia coli."
    Reddy M., Gowrishankar J.
    J. Bacteriol. 179:2892-2899(1997)
    Cited for: MUTANTS SSB-200; SSB-201 AND SSB-202.
    Strain: K12 / W3110 / ATCC 27325 / DSM 5911.
  15. "Devoted to the lagging strand - the chi subunit of DNA polymerase III holoenzyme contacts SSB to promote processive elongation and sliding clamp assembly."
    Kelman Z., Yuzhakov A., Andjelkovic J., O'Donnell M.
    EMBO J. 17:2436-2449(1998)
    Cited for: INTERACTION WITH HOLC.
  16. "Interaction of E. coli single-stranded DNA binding protein (SSB) with exonuclease I. The carboxy-terminus of SSB is the recognition site for the nuclease."
    Genschel J., Curth U., Urbanke C.
    Biol. Chem. 381:183-192(2000)
    Cited for: INTERACTION WITH EXONUCLEASE I (SBCB).
  17. "PriA helicase and SSB interact physically and functionally."
    Cadman C.J., McGlynn P.
    Nucleic Acids Res. 32:6378-6387(2004)
    Cited for: DNA-BINDING, INTERACTION WITH PRIA.
  18. "Bacterial single-stranded DNA-binding proteins are phosphorylated on tyrosine."
    Mijakovic I., Petranovic D., Macek B., Cepo T., Mann M., Davies J., Jensen P.R., Vujaklija D.
    Nucleic Acids Res. 34:1588-1596(2006)
    Cited for: PHOSPHORYLATION.
  19. "A central role for SSB in Escherichia coli RecQ DNA helicase function."
    Shereda R.D., Bernstein D.A., Keck J.L.
    J. Biol. Chem. 282:19247-19258(2007)
    Cited for: INTERACTION WITH RECQ.
    Strain: K12 / MG1655 / ATCC 47076.
  20. Cited for: REVIEW, FUNCTION.
  21. "Structural basis of Escherichia coli single-stranded DNA-binding protein stimulation of exonuclease I."
    Lu D., Keck J.L.
    Proc. Natl. Acad. Sci. U.S.A. 105:9169-9174(2008)
    Cited for: INTERACTION WITH EXONUCLEASE I (SBCB), MUTAGENESIS OF PRO-177.
  22. "Regulation of single-stranded DNA binding by the C termini of Escherichia coli single-stranded DNA-binding (SSB) protein."
    Kozlov A.G., Cox M.M., Lohman T.M.
    J. Biol. Chem. 285:17246-17252(2010)
    Cited for: FUNCTION, DNA-BINDING, ENZYME REGULATION.
  23. "Small-molecule tools for dissecting the roles of SSB/protein interactions in genome maintenance."
    Lu D., Bernstein D.A., Satyshur K.A., Keck J.L.
    Proc. Natl. Acad. Sci. U.S.A. 107:633-638(2010)
    Cited for: INTERACTION WITH EXONUCLEASE I (SBCB).
  24. "SSB functions as a sliding platform that migrates on DNA via reptation."
    Zhou R., Kozlov A.G., Roy R., Zhang J., Korolev S., Lohman T.M., Ha T.
    Cell 146:222-232(2011)
    Cited for: FUNCTION.
  25. "The helicase-binding domain of Escherichia coli DnaG primase interacts with the highly conserved C-terminal region of single-stranded DNA-binding protein."
    Naue N., Beerbaum M., Bogutzki A., Schmieder P., Curth U.
    Nucleic Acids Res. 41:4507-4517(2013)
    Cited for: INTERACTION WITH DNAG.
  26. "Yeast two-hybrid analysis of PriB-interacting proteins in replication restart primosome: a proposed PriB-SSB interaction model."
    Huang Y.H., Lin M.J., Huang C.Y.
    Protein J. 32:477-483(2013)
    Cited for: INTERACTION WITH PRIB.
  27. "Crystal structure of the homo-tetrameric DNA binding domain of Escherichia coli single-stranded DNA-binding protein determined by multiwavelength X-ray diffraction on the selenomethionyl protein at 2.9-A resolution."
    Raghunathan S., Ricard C.S., Lohman T.M., Waksman G.
    Proc. Natl. Acad. Sci. U.S.A. 94:6652-6657(1997)
    Cited for: X-RAY CRYSTALLOGRAPHY (2.9 ANGSTROMS) OF 1-136, SUBUNIT.
  28. "Structure of the DNA binding domain of E. coli SSB bound to ssDNA."
    Raghunathan S., Kozlov A.G., Lohman T.M., Waksman G.
    Nat. Struct. Biol. 7:648-652(2000)
    Cited for: X-RAY CRYSTALLOGRAPHY (2.8 ANGSTROMS) OF 1-136, DNA-BINDING, SUBUNIT.

Entry information

Entry name: SSB_ECOLI
Accession: Primary (citable) accession number: P0AGE0
Secondary accession number(s): P02339, Q2M6P5
Entry history
Integrated into UniProtKB/Swiss-Prot: July 21, 1986
Last sequence update: January 23, 2007
Last modified: April 13, 2016
This is version 96 of the entry and version 2 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Prokaryotic Protein Annotation Program

Miscellaneous

Keywords - Technical term

3D-structure, Complete proteome, Direct protein sequencing, Reference proteome

Documents

  1. Escherichia coli
    Escherichia coli (strain K12): entries and cross-references to EcoGene
  2. PDB cross-references
    Index of Protein Data Bank (PDB) cross-references
  3. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.