Protein: Histone H3
Gene: his-2
Organism: Caenorhabditis elegans
Status: Reviewed. Annotation score: 5 out of 5. Experimental evidence at protein level.

Function

Core component of nucleosome. Nucleosomes wrap and compact DNA into chromatin, limiting DNA accessibility to the cellular machineries which require DNA as a template. Histones thereby play a central role in transcription regulation, DNA repair, DNA replication and chromosomal stability. DNA accessibility is regulated via a complex set of post-translational modifications of histones, also called histone code, and nucleosome remodeling.

Keywords - Ligand

DNA-binding

Enzyme and pathway databases

Reactome: REACT_332708. Factors involved in megakaryocyte development and platelet production.

Names & Taxonomy

Protein names
Recommended name: Histone H3

Gene names
Name: his-2; ORF Names: T10C6.13
Name: his-6; ORF Names: F45F2.13
Name: his-9; ORF Names: ZK131.3
Name: his-13; ORF Names: ZK131.7
Name: his-17; ORF Names: K06C4.5
Name: his-25; ORF Names: ZK131.2
Name: his-27; ORF Names: K06C4.13
Name: his-32; ORF Names: F17E9.10
Name: his-42; ORF Names: F08G2.3
Name: his-45; ORF Names: B0035.10
Name: his-49; ORF Names: F07B7.5
Name: his-55; ORF Names: F54E12.1
Name: his-59; ORF Names: F55G1.2
Name: his-63; ORF Names: F22B3.2

Organism: Caenorhabditis elegans
Taxonomic identifier: 6239 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Ecdysozoa > Nematoda > Chromadorea > Rhabditida > Rhabditoidea > Rhabditidae > Peloderinae > Caenorhabditis
Proteomes: UP000001940. Components: Chromosome II, Chromosome IV, Chromosome V

Organism-specific databases

WormBase: B0035.10; CE03253; WBGene00001919; his-45.
F07B7.5; CE03253; WBGene00001923; his-49.
F08G2.3; CE03253; WBGene00001916; his-42.
F17E9.10; CE03253; WBGene00001906; his-32.
F22B3.2; CE03253; WBGene00001937; his-63.
F45F2.13; CE03253; WBGene00001880; his-6.
F54E12.1; CE03253; WBGene00001929; his-55.
F55G1.2; CE03253; WBGene00001933; his-59.
K06C4.13; CE03253; WBGene00001901; his-27.
K06C4.5; CE03253; WBGene00001891; his-17.
T10C6.13; CE03253; WBGene00001876; his-2.
ZK131.2; CE03253; WBGene00001899; his-25.
ZK131.3; CE03253; WBGene00001883; his-9.
ZK131.7; CE03253; WBGene00001887; his-13.

Subcellular location

Keywords - Cellular component

Chromosome, Nucleosome core, Nucleus

PTM / Processing

Molecule processing

Feature key | Position(s) | Length | Description | Feature identifier
Initiator methionine | 1 – 1 | 1 | Removed (1 publication) |
Chain | 2 – 136 | 135 | Histone H3 | PRO_0000221297
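
The processing features above use 1-based, inclusive positions. As a minimal sketch (the helper name `feature_slice` is illustrative and not part of any UniProt tool), the mature chain can be derived from the canonical sequence listed in this entry like this:

```python
# Canonical sequence as displayed in the Sequence section of this entry (P08898-1).
PRECURSOR = (
    "MARTKQTARKSTGGKAPRKQLATKAARKSAPASGGVKKPHRYRPGTVALR"
    "EIRRYQKSTELLIRRAPFQRLVREIAQDFKTDLRFQSSAVMALQEACEAY"
    "LVGLFEDTNLCAIHAKRVTIMPKDIQLARRIRGERA"
)


def feature_slice(sequence: str, start: int, end: int) -> str:
    """Return residues start..end using 1-based, inclusive feature positions."""
    return sequence[start - 1:end]


# Initiator methionine: position 1, removed during processing.
initiator = feature_slice(PRECURSOR, 1, 1)

# Chain PRO_0000221297: positions 2 to 136.
mature_chain = feature_slice(PRECURSOR, 2, 136)

print(initiator, len(mature_chain))
```

Note that the modification positions listed in this entry are numbered on the full 136-residue sequence, so Lys-5 is index 3 of `mature_chain`, not index 4.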

Amino acid modifications

Feature key | Position(s) | Length | Description | Evidence
Modified residue | 5 | 1 | N6,N6,N6-trimethyllysine; alternate | 4 publications
Modified residue | 5 | 1 | N6,N6-dimethyllysine; alternate | 4 publications
Modified residue | 5 | 1 | N6-acetyllysine; alternate | 1 publication
Modified residue | 5 | 1 | N6-methyllysine; alternate | 4 publications
Modified residue | 10 | 1 | N6,N6,N6-trimethyllysine; alternate | 6 publications
Modified residue | 10 | 1 | N6,N6-dimethyllysine; alternate | 6 publications
Modified residue | 10 | 1 | N6-acetyllysine; alternate | 2 publications
Modified residue | 11 | 1 | Phosphoserine | 7 publications
Modified residue | 15 | 1 | N6-acetyllysine | 3 publications
Modified residue | 24 | 1 | N6-acetyllysine | 1 publication
Modified residue | 28 | 1 | N6,N6,N6-trimethyllysine; alternate | 2 publications
Modified residue | 28 | 1 | N6,N6-dimethyllysine; alternate | 2 publications
Modified residue | 28 | 1 | N6-methyllysine; alternate | 2 publications
Modified residue | 29 | 1 | Phosphoserine | 2 publications
Modified residue | 37 | 1 | N6,N6,N6-trimethyllysine; alternate | 3 publications
Modified residue | 37 | 1 | N6,N6-dimethyllysine; alternate | 3 publications
Modified residue | 37 | 1 | N6-methyllysine; alternate | 3 publications
Modified residue | 80 | 1 | N6-methyllysine | 1 publication
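
As a quick consistency check, each lysine modification should fall on a K and each phosphoserine on an S in the canonical sequence. The sketch below assumes the positions are 1-based on the full 136-residue sequence shown in this entry. The dictionary and the function name `mismatches` are illustrative only.

```python
# Canonical sequence as displayed in the Sequence section of this entry (P08898-1).
SEQUENCE = (
    "MARTKQTARKSTGGKAPRKQLATKAARKSAPASGGVKKPHRYRPGTVALR"
    "EIRRYQKSTELLIRRAPFQRLVREIAQDFKTDLRFQSSAVMALQEACEAY"
    "LVGLFEDTNLCAIHAKRVTIMPKDIQLARRIRGERA"
)

# Position -> residue implied by the modification type in the table above
# (methyl/acetyl-lysine entries need K, phosphoserine entries need S).
MODIFIED_RESIDUES = {
    5: "K", 10: "K", 11: "S", 15: "K", 24: "K",
    28: "K", 29: "S", 37: "K", 80: "K",
}


def mismatches(sequence: str, expected: dict) -> dict:
    """Return {position: actual_residue} where the sequence disagrees with the annotation."""
    return {
        pos: sequence[pos - 1]
        for pos, residue in expected.items()
        if sequence[pos - 1] != residue
    }


problems = mismatches(SEQUENCE, MODIFIED_RESIDUES)
print(problems or "all modified-residue positions match the sequence")
```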

Post-translational modification

Phosphorylated at Ser-11 and Ser-29 during M phase. Phosphorylation of Ser-11 requires air-2 but not air-1. Dephosphorylated by gsp-1 and/or gsp-2 during chromosome segregation. (7 publications)
Acetylation is generally linked to gene activation. (By similarity)
Methylation at Lys-5 is linked to gene activation and is absent from male inactive X chromosome chromatin. Methylation at Lys-10 is linked to gene repression and is enriched in male inactive X chromosome chromatin. Methylation at Lys-37 occurs on the entire length of autosomes during meiotic prophase. Trimethylation at Lys-10 and Lys-37 is specifically antagonized by jmjd-2. Dimethylation and trimethylation at Lys-28 occur in all nuclei. The mes-2-mes-3-mes-6 complex may be responsible for Lys-28 methylation in most of the germline and in the early embryo. (7 publications)

Keywords - PTM

Acetylation, Methylation, Phosphoprotein

Proteomic databases

PaxDb: P08898.

Expression

Gene expression databases

ExpressionAtlas: P08898. baseline.

Interaction

Subunit structure

The nucleosome is a histone octamer containing two molecules each of H2A, H2B, H3 and H4 assembled in one H3-H4 heterotetramer and two H2A-H2B heterodimers. The octamer wraps approximately 147 bp of DNA.

Protein-protein interaction databases

BioGrid: 42741. 1 interaction.
45065. 3 interactions.
50996. 1 interaction.
IntAct: P08898. 1 interaction.
STRING: 6239.ZK131.7.

Structure

Secondary structure

Feature key | Position(s) | Length | Evidence
Beta strand | 4 – 6 | 3 | Combined sources
Beta strand | 25 – 27 | 3 | Combined sources

3D structure databases

Entry | Method | Resolution (Å) | Chain | Positions
3N9L | X-ray | 2.80 | B | 2-16
3N9N | X-ray | 2.30 | B/C | 2-33
3N9O | X-ray | 2.31 | B | 2-16
3N9O | X-ray | 2.31 | C | 2-18
3N9P | X-ray | 2.39 | B/C | 2-33
3N9Q | X-ray | 2.30 | B | 2-16
3N9Q | X-ray | 2.30 | C | 20-36

ProteinModelPortal: P08898.
SMR: P08898. Positions 17-136.

Miscellaneous databases

EvolutionaryTrace: P08898.

Family & Domains

Sequence similarities

Belongs to the histone H3 family. (Curated)

Phylogenomic databases

eggNOG: COG2036.
HOGENOM: HOG000155290.
InParanoid: P08898.
KO: K11253.
OrthoDB: EOG7HB5C2.
PhylomeDB: P08898.

Family and domain databases

Gene3D: 1.10.20.10. 1 hit.
InterPro: IPR009072. Histone-fold.
IPR007125. Histone_core_D.
IPR000164. Histone_H3/CENP-A.
PANTHER: PTHR11426. PTHR11426. 1 hit.
Pfam: PF00125. Histone. 1 hit.
PRINTS: PR00622. HISTONEH3.
SMART: SM00428. H3. 1 hit.
SUPFAM: SSF47113. SSF47113. 1 hit.
PROSITE: PS00322. HISTONE_H3_1. 1 hit.
PS00959. HISTONE_H3_2. 1 hit.

Sequence

Sequence status: Complete.

Sequence processing: The displayed sequence is further processed into a mature form.

P08898-1 [UniParc]

        10         20         30         40         50
MARTKQTARK STGGKAPRKQ LATKAARKSA PASGGVKKPH RYRPGTVALR
        60         70         80         90        100
EIRRYQKSTE LLIRRAPFQR LVREIAQDFK TDLRFQSSAV MALQEACEAY
       110        120        130
LVGLFEDTNL CAIHAKRVTI MPKDIQLARR IRGERA

Length: 136
Mass (Da): 15,376
Last modified: January 23, 2007 - v4
Checksum: 40D7DE0EF5BA6F1F
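
The sequence is displayed in blocks of ten residues under a position ruler. A minimal sketch for turning that display back into a plain residue string and checking it against the stated length follows. The function name `parse_displayed_sequence` is an illustrative choice, not a UniProt utility.

```python
# The sequence block as displayed in this entry, ruler lines included.
DISPLAYED = """\
        10         20         30         40         50
MARTKQTARK STGGKAPRKQ LATKAARKSA PASGGVKKPH RYRPGTVALR
        60         70         80         90        100
EIRRYQKSTE LLIRRAPFQR LVREIAQDFK TDLRFQSSAV MALQEACEAY
       110        120        130
LVGLFEDTNL CAIHAKRVTI MPKDIQLARR IRGERA
"""


def parse_displayed_sequence(text: str) -> str:
    """Drop ruler lines (digits and spaces only) and join the residue blocks."""
    residues = []
    for line in text.splitlines():
        compact = line.replace(" ", "")
        if not compact or compact.isdigit():
            continue  # blank line or position ruler
        residues.append(compact)
    return "".join(residues)


sequence = parse_displayed_sequence(DISPLAYED)
print(len(sequence))  # 136, matching the Length field of the entry
```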

Natural variant

Feature key | Position(s) | Length | Description
Natural variant | 97 | 1 | C → A.
Natural variant | 101 | 1 | L → I.
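
To illustrate how the variant annotations map onto the canonical sequence, the sketch below applies each single-residue substitution and first checks that the reference residue is actually present at the stated 1-based position. The function name `apply_substitution` is hypothetical.

```python
# Canonical sequence as displayed in the Sequence section of this entry (P08898-1).
CANONICAL = (
    "MARTKQTARKSTGGKAPRKQLATKAARKSAPASGGVKKPHRYRPGTVALR"
    "EIRRYQKSTELLIRRAPFQRLVREIAQDFKTDLRFQSSAVMALQEACEAY"
    "LVGLFEDTNLCAIHAKRVTIMPKDIQLARRIRGERA"
)


def apply_substitution(sequence: str, position: int, ref: str, alt: str) -> str:
    """Return the sequence with one residue replaced at a 1-based position."""
    found = sequence[position - 1]
    if found != ref:
        raise ValueError(f"expected {ref} at position {position}, found {found}")
    return sequence[:position - 1] + alt + sequence[position:]


variant_97 = apply_substitution(CANONICAL, 97, "C", "A")    # C -> A at 97
variant_101 = apply_substitution(CANONICAL, 101, "L", "I")  # L -> I at 101

print(variant_97[90:105])
print(variant_101[90:105])
```

The reference-residue check is the useful part: it fails loudly if a position is read with the wrong offset, for example against the mature chain instead of the full sequence.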

Sequence databases

EMBL/GenBank/DDBJ: X15634 Genomic DNA. Translation: CAA33644.1.
FO081018 Genomic DNA. Translation: CCD68531.1.
FO081059 Genomic DNA. Translation: CCD68868.1.
FO081135 Genomic DNA. Translation: CCD69390.1.
FO081223 Genomic DNA. Translation: CCD70027.1.
FO081551 Genomic DNA. Translation: CCD72363.1.
FO081551 Genomic DNA. Translation: CCD72373.1.
Z68336 Genomic DNA. Translation: CAA92733.1.
Z73102 Genomic DNA. Translation: CAA97411.1.
Z81495 Genomic DNA. Translation: CAB04057.1.
Z82271 Genomic DNA. Translation: CAB05209.1.
Z83245 Genomic DNA. Translation: CAB05831.1.
Z83245 Genomic DNA. Translation: CAB05833.1.
Z83245 Genomic DNA. Translation: CAB05834.1.
Z93388 Genomic DNA. Translation: CAB07653.1.
AF304122 mRNA. Translation: AAG50235.1.
PIR: S04241. HSKW3.
RefSeq: NP_001263958.1. NM_001277029.1.
NP_496890.1. NM_064489.1.
NP_496894.1. NM_064493.5.
NP_496895.1. NM_064494.1.
NP_496899.1. NM_064498.1.
NP_501204.1. NM_068803.3.
NP_501407.1. NM_069006.3.
NP_502134.1. NM_069733.3.
NP_502138.1. NM_069737.1.
NP_502153.1. NM_069752.3.
NP_505199.1. NM_072798.1.
NP_505276.1. NM_072875.1.
NP_505292.1. NM_072891.1.
NP_505297.1. NM_072896.3.
NP_507033.1. NM_074632.3.
UniGene: Cel.12716.
Cel.12871.
Cel.21447.
Cel.21596.
Cel.21806.
Cel.21871.
Cel.2350.
Cel.27007.
Cel.29106.
Cel.29299.
Cel.32605.
Cel.32856.
Cel.33056.

Genome annotation databases

GeneID: 13221387.
175030.
175031.
177628.
180074.
181821.
184113.
184200.
184804.
186250.
186325.
191668.
191672.
191673.
246024.
KEGG: cel:CELE_B0035.10.
cel:CELE_F07B7.5.
cel:CELE_F08G2.3.
cel:CELE_F17E9.10.
cel:CELE_F22B3.2.
cel:CELE_F45F2.13.
cel:CELE_F54E12.1.
cel:CELE_F55G1.2.
cel:CELE_K06C4.13.
cel:CELE_K06C4.5.
cel:CELE_T10C6.13.
cel:CELE_ZK131.2.
cel:CELE_ZK131.3.
cel:CELE_ZK131.7.
UCSC: ZK131.7. c. elegans.

Cross-references

Organism-specific databases

CTD: 175030.
175031.
177628.
180074.
181821.
184113.
184200.
184804.
186250.
186325.
191668.
191672.
191673.
246024.

Miscellaneous databases

NextBio: 886476.
PRO: P08898.

Publications

  1. "Nucleotide sequences of Caenorhabditis elegans core histone genes. Genes for different histone classes share common flanking sequence elements."
    Roberts S.B., Emmons S.W., Childs G.
    J. Mol. Biol. 206:567-577(1989)
    Cited for: NUCLEOTIDE SEQUENCE [GENOMIC DNA] (HIS-10).
    Strain: Bristol N2.
  2. "Genome sequence of the nematode C. elegans: a platform for investigating biology."
    The C. elegans sequencing consortium
    Science 282:2012-2018(1998)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
    Strain: Bristol N2.
  3. "The Caenorhabditis elegans transcriptome project, a complementary view of the genome."
    Kohara Y., Shin-i T., Suzuki Y., Sugano S., Potdevin M., Thierry-Mieg Y., Thierry-Mieg D., Thierry-Mieg J.
    Submitted (SEP-2000) to the EMBL/GenBank/DDBJ databases
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (HIS-4).
    Strain: Bristol N2.
  4. "The primary structure of histone H3 from the nematode Caenorhabditis elegans."
    Vanfleteren J.R., van Bun S.M., van Beeumen J.J.
    FEBS Lett. 211:59-63(1987)
    Cited for: PROTEIN SEQUENCE OF 2-136, ACETYLATION AT LYS-5; LYS-15 AND LYS-24, METHYLATION AT LYS-10; LYS-28; LYS-37 AND LYS-80.
    Strain: DR27.
  5. "Mitotic phosphorylation of histone H3 is governed by Ipl1/aurora kinase and Glc7/PP1 phosphatase in budding yeast and nematodes."
    Hsu J.-Y., Sun Z.-W., Li X., Reuben M., Tatchell K., Bishop D.K., Grushcow J.M., Brame C.J., Caldwell J.A., Hunt D.F., Lin R., Smith M.M., Allis C.D.
    Cell 102:279-291(2000)
    Cited for: PHOSPHORYLATION AT SER-11.
  6. "The aurora B kinase AIR-2 regulates kinetochores during mitosis and is required for separation of homologous chromosomes during meiosis."
    Kaitna S., Pasierbek P., Jantsch M., Loidl J., Glotzer M.
    Curr. Biol. 12:798-812(2002)
    Cited for: PHOSPHORYLATION AT SER-11 AND SER-29.
  7. "X-chromosome silencing in the germline of C. elegans."
    Kelly W.G., Schaner C.E., Dernburg A.F., Lee M.-H., Kim S.K., Villeneuve A.M., Reinke V.
    Development 129:479-492(2002)
    Cited for: METHYLATION AT LYS-5 AND LYS-10, PHOSPHORYLATION AT SER-11, ACETYLATION AT LYS-10 AND LYS-15.
  8. "Germline X chromosomes exhibit contrasting patterns of histone H3 methylation in Caenorhabditis elegans."
    Reuben M., Lin R.
    Dev. Biol. 245:71-82(2002)
    Cited for: METHYLATION AT LYS-5 AND LYS-10.
  9. "The C. elegans Tousled-like kinase (TLK-1) has an essential role in transcription."
    Han Z., Saam J.R., Adams H.P., Mango S.E., Schumacher J.M.
    Curr. Biol. 13:1921-1929(2003)
    Cited for: PHOSPHORYLATION AT SER-11, METHYLATION AT LYS-5; LYS-10 AND LYS-37.
  10. "Role of Caenorhabditis elegans protein phosphatase type 1, CeGLC-7 beta, in metaphase to anaphase transition during embryonic development."
    Sassa T., Ueda-Ohba H., Kitamura K., Harada S., Hosono R.
    Exp. Cell Res. 287:350-360(2003)
    Cited for: PHOSPHORYLATION AT SER-11.
  11. "Caenorhabditis elegans RBX1 is essential for meiosis, mitotic chromosomal condensation and segregation, and cytokinesis."
    Sasagawa Y., Urano T., Kohara Y., Takahashi H., Higashitani A.
    Genes Cells 8:857-872(2003)
    Cited for: PHOSPHORYLATION AT SER-11 AND SER-29.
  12. "The MES-2/MES-3/MES-6 complex and regulation of histone H3 methylation in C. elegans."
    Bender L.B., Cao R., Zhang Y., Strome S.
    Curr. Biol. 14:1639-1643(2004)
    Cited for: METHYLATION AT LYS-28.
  13. "Meiotic pairing and imprinted X chromatin assembly in Caenorhabditis elegans."
    Bean C.J., Schaner C.E., Kelly W.G.
    Nat. Genet. 36:100-105(2004)
    Cited for: ACETYLATION AT LYS-10 AND LYS-15, METHYLATION AT LYS-5 AND LYS-10, PHOSPHORYLATION AT SER-11.
  14. "Reversal of histone lysine trimethylation by the JMJD2 family of histone demethylases."
    Whetstine J.R., Nottke A., Lan F., Huarte M., Smolikov S., Chen Z., Spooner E., Li E., Zhang G., Colaiacovo M., Shi Y.
    Cell 125:467-481(2006)
    Cited for: METHYLATION AT LYS-10 AND LYS-37.

Entry information

Entry name: H3_CAEEL
Accession: Primary (citable) accession number: P08898
Secondary accession number(s): Q9TW44
Entry history
Integrated into UniProtKB/Swiss-Prot: November 1, 1988
Last sequence update: January 23, 2007
Last modified: June 24, 2015
This is version 136 of the entry and version 4 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Caenorhabditis annotation project

Miscellaneous

Keywords - Technical term

3D-structure, Complete proteome, Direct protein sequencing, Reference proteome

Documents

  1. Caenorhabditis elegans
    Caenorhabditis elegans: entries, gene names and cross-references to WormBase
  2. PDB cross-references
    Index of Protein Data Bank (PDB) cross-references
  3. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. the seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.
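
To make the identity thresholds concrete, here is a deliberately simplified sketch: position-by-position identity between two equal-length sequences, compared against the 90% cutoff. This is not how UniRef clusters are actually computed (that involves alignment and the overlap criterion). The function name `ungapped_identity` and the choice of the C → A variant at position 97 as the comparison sequence are assumptions made for illustration.

```python
def ungapped_identity(a: str, b: str) -> float:
    """Fraction of identical positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("this sketch only handles equal-length sequences")
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Canonical sequence as displayed in the Sequence section of this entry (P08898-1).
canonical = (
    "MARTKQTARKSTGGKAPRKQLATKAARKSAPASGGVKKPHRYRPGTVALR"
    "EIRRYQKSTELLIRRAPFQRLVREIAQDFKTDLRFQSSAVMALQEACEAY"
    "LVGLFEDTNLCAIHAKRVTIMPKDIQLARRIRGERA"
)

# Natural variant C -> A at position 97 (1-based), used here only as an example.
variant = canonical[:96] + "A" + canonical[97:]

identity = ungapped_identity(canonical, variant)
print(f"{identity:.4f}")       # 0.9926 (135 of 136 positions identical)
print(identity >= 0.90)        # above the 90% identity threshold
print(identity == 1.0)         # not identical, so below the 100% level
```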