Protein

CCAAT/enhancer-binding protein alpha

Gene

Cebpa

Organism
Mus musculus (Mouse)
Status
Reviewed - Annotation score: 5 out of 5 - Experimental evidence at protein level

Function

Transcription factor that coordinates proliferation arrest and the differentiation of myeloid progenitors, adipocytes, hepatocytes, and cells of the lung and the placenta (PubMed:8415748, PubMed:15107404, PubMed:15589173). Binds directly to the consensus DNA sequence 5'-T[TG]NNGNAA[TG]-3' acting as an activator on distinct target genes. During early embryogenesis, plays essential and redundant functions with CEBPB (PubMed:15509779). Essential for the transition from common myeloid progenitors (CMP) to granulocyte/monocyte progenitors (GMP) (PubMed:24367003). Critical for the proper development of the liver and the lung (PubMed:8798745). Necessary for terminal adipocyte differentiation, is required for postnatal maintenance of systemic energy homeostasis and lipid storage (PubMed:1935900, PubMed:8090719). To regulate these different processes at the proper moment and tissue, interplays with other transcription factors and modulators. Downregulates the expression of genes that maintain cells in an undifferentiated and proliferative state through E2F1 repression, which is critical for its ability to induce adipocyte and granulocyte terminal differentiation. Reciprocally E2F1 blocks adipocyte differentiation by binding to specific promoters and repressing CEBPA binding to its target gene promoters (PubMed:11672531). Proliferation arrest also depends on a functional binding to SWI/SNF complex (PubMed:14660596). In liver, regulates gluconeogenesis and lipogenesis through different mechanisms. To regulate gluconeogenesis, functionally cooperates with FOXO1 binding to IRE-controlled promoters and regulating the expression of target genes such as PCK1 or G6PC (PubMed:17627282). To modulate lipogenesis, interacts and transcriptionally synergizes with SREBF1 in promoter activation of specific lipogenic target genes such as ACAS2 (PubMed:17290224). 
In adipose tissue, seems to act as a FOXO1 coactivator, accessing the ADIPOQ promoter through FOXO1 binding sites (PubMed:17090532). (By similarity; 13 publications)
Isoform 3: Can act as a dominant-negative. Binds DNA and has transactivation activity, although much less efficiently than isoform 2. Does not inhibit cell proliferation. (By similarity; 1 publication)
Isoform 4: Directly and specifically enhances ribosomal DNA transcription by interacting with RNA polymerase I-specific cofactors and inducing histone acetylation. (By similarity)
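
As an illustration of the consensus DNA sequence 5'-T[TG]NNGNAA[TG]-3' quoted above, the following Python sketch translates it into a regular expression and scans both strands of a DNA string. It is a minimal sketch, not part of the database record: the function names (`find_sites`, `reverse_complement`) and the test sequence are assumptions chosen for the example, and N is taken to mean any of A, C, G or T.

```python
import re

# Consensus 5'-T[TG]NNGNAA[TG]-3' with N expanded to any base.
# The lookahead lets overlapping candidate sites be reported.
CEBP_SITE = re.compile(r"(?=(T[TG][ACGT]{2}G[ACGT]AA[TG]))")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(_COMPLEMENT)[::-1]


def find_sites(dna: str):
    """Return sorted (start, strand, site) tuples for consensus matches.

    `start` is the 1-based position of the leftmost base of the match
    on the forward strand, whichever strand carries the site.
    """
    dna = dna.upper()
    n = len(dna)
    hits = [(m.start() + 1, "+", m.group(1)) for m in CEBP_SITE.finditer(dna)]
    for m in CEBP_SITE.finditer(reverse_complement(dna)):
        site = m.group(1)
        # Map the match on the reverse strand back to forward coordinates.
        start_fwd = n - (m.start() + len(site)) + 1
        hits.append((start_fwd, "-", site))
    return sorted(hits)


# Hypothetical test sequence containing one forward-strand site.
example_hits = find_sites("CCTGCAGAAAGCC")
```

A match only indicates that a stretch of DNA is compatible with the consensus; it says nothing about whether the site is bound or functional in vivo.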

GO - Molecular function

  • DNA binding Source: MGI
  • kinase binding Source: UniProtKB
  • protein homodimerization activity Source: MGI
  • RNA polymerase II core promoter proximal region sequence-specific DNA binding Source: NTNU_SB
  • RNA polymerase I regulatory region DNA binding Source: UniProtKB
  • sequence-specific DNA binding Source: UniProtKB
  • transcriptional activator activity, RNA polymerase II core promoter proximal region sequence-specific binding Source: NTNU_SB
  • transcriptional activator activity, RNA polymerase II transcription regulatory region sequence-specific binding Source: MGI
  • transcription coactivator activity Source: UniProtKB
  • transcription factor activity, RNA polymerase II distal enhancer sequence-specific binding Source: MGI
  • transcription factor activity, sequence-specific DNA binding Source: UniProtKB
  • transcription factor binding Source: UniProtKB
  • transcription regulatory region DNA binding Source: MGI

GO - Biological process

  • brown fat cell differentiation Source: MGI
  • cell maturation Source: MGI
  • cellular response to lithium ion Source: MGI
  • cellular response to organic cyclic compound Source: MGI
  • cellular response to tumor necrosis factor Source: MGI
  • cholesterol metabolic process Source: MGI
  • cytokine-mediated signaling pathway Source: UniProtKB
  • embryonic placenta development Source: MGI
  • fat cell differentiation Source: UniProtKB
  • glucose homeostasis Source: UniProtKB
  • granulocyte differentiation Source: UniProtKB
  • inner ear development Source: MGI
  • lipid homeostasis Source: UniProtKB
  • liver development Source: UniProtKB
  • lung development Source: UniProtKB
  • macrophage differentiation Source: MGI
  • mitochondrion organization Source: MGI
  • myeloid cell differentiation Source: UniProtKB
  • negative regulation of cell cycle Source: UniProtKB
  • negative regulation of cell proliferation Source: UniProtKB
  • negative regulation of transcription, DNA-templated Source: UniProtKB
  • negative regulation of transcription from RNA polymerase II promoter Source: MGI
  • Notch signaling pathway Source: MGI
  • positive regulation of fat cell differentiation Source: MGI
  • positive regulation of osteoblast differentiation Source: MGI
  • positive regulation of transcription, DNA-templated Source: MGI
  • positive regulation of transcription from RNA polymerase III promoter Source: MGI
  • positive regulation of transcription from RNA polymerase II promoter Source: NTNU_SB
  • regulation of cell proliferation Source: MGI
  • regulation of transcription, DNA-templated Source: MGI
  • regulation of transcription from RNA polymerase II promoter Source: MGI
  • transcription, DNA-templated Source: UniProtKB
  • transcription from RNA polymerase I promoter Source: UniProtKB
  • urea cycle Source: MGI
  • white fat cell differentiation Source: MGI

Keywords - Molecular function

Activator, Developmental protein

Keywords - Biological process

Transcription, Transcription regulation

Keywords - Ligand

DNA-binding

Enzyme and pathway databases

Reactome: R-MMU-381340. Transcriptional regulation of white adipocyte differentiation.
Reactome: R-MMU-442533. Transcriptional Regulation of Adipocyte Differentiation in 3T3-L1 Pre-adipocytes.

Names & Taxonomy

Protein names
Recommended name: CCAAT/enhancer-binding protein alpha (Imported)
Short name: C/EBP alpha (Imported)
Gene names
Name: Cebpa (Imported)
Synonyms: Cebp (1 publication)
Organism: Mus musculus (Mouse)
Taxonomic identifier: 10090 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Glires > Rodentia > Sciurognathi > Muroidea > Muridae > Murinae > Mus > Mus
Proteomes
  • UP000000589 Component: Chromosome 7

Organism-specific databases

MGI: MGI:99480. Cebpa.

Subcellular location

  • Nucleus (1 publication)
Isoform 4 :

GO - Cellular component

  • nucleolus Source: UniProtKB
  • nucleoplasm Source: Reactome
  • nucleus Source: UniProtKB
  • RNA polymerase II transcription factor complex Source: MGI
  • transcription factor complex Source: MGI

Keywords - Cellular component

Nucleus

Pathology & Biotech

Disruption phenotype

Mutants die of hypoglycemia at 7-10 h after birth. They have defects in the control of hepatic growth and lung development. The liver architecture is disturbed with acinar formation. They show hyperproliferation of type II pneumocytes and disturbed alveolar architecture. At the molecular level, accumulation of glycogen and lipids in the liver and adipose tissues is impaired, and the mutant animals are severely hypoglycemic (PubMed:8798745). In very few cases (less than 1%), mutants are able to survive up to 4 weeks, but they are severely retarded in development. At 2 weeks, they are about half the size of their littermates, very thin and with skin problems. Conditional knockout in adults leads to a lack of granulopoiesis in all hematopoietic organs, with no mature peripheral blood granulocytes and the presence of >30% immature myeloid cells in the bone marrow, but without anemia or thrombocytopenia. Animals rarely survive 4 to 5 weeks of age due to sepsis as a result of granulocytopenia (PubMed:15589173). Double knockout of CEBPA and CEBPB results in embryonic developmental arrest and death at around E10 to E11, associated with a gross placenta failure (PubMed:15509779). (3 publications)

Mutagenesis

Feature key, position(s), length and description:
  • Mutagenesis 182 – 188 (length 7): PPPPPPP → APPPAPA: No effect on DNA-binding or interaction with CDK2 and CDK4. No effect on cell cycle inhibition. (1 publication)
  • Mutagenesis 184 – 186 (length 3): PPP → AAA: No effect on DNA-binding or interaction with CDK2 and CDK4. No effect on cell cycle inhibition. (1 publication)
  • Mutagenesis 193 (length 1): S → A: No effect on DNA-binding. Loss of interaction with CDK2 and CDK4 as well as cell cycle inhibition. (1 publication)
  • Mutagenesis 222 – 230 (length 9): TPPPTPVPS → APPPAPVPA: Decreases phosphorylated form. Deregulation of hepatic glucose metabolism. (1 publication)
  • Mutagenesis 286 (length 1): Y → A: No effect on DNA-binding, represses E2F1:TFDP1-mediated transcription and causes adipose hypoplasia and myeloid dysplasia. (1 publication)
  • Mutagenesis 288 (length 1): V → A: No effect on DNA-binding, no effect on repression of E2F1:TFDP1-mediated transcription and no effect on adipogenesis and granulopoiesis; when associated with A-291. (1 publication)
  • Mutagenesis 291 (length 1): E → A: No effect on DNA-binding, no effect on repression of E2F1:TFDP1-mediated transcription and no effect on adipogenesis and granulopoiesis; when associated with A-288. (1 publication)
  • Mutagenesis 295 (length 1): I → A: No effect on DNA-binding, represses E2F1:TFDP1-mediated transcription and causes adipose hypoplasia and myeloid dysplasia; when associated with A-298. (1 publication)
  • Mutagenesis 298 (length 1): R → A: No effect on DNA-binding, represses E2F1:TFDP1-mediated transcription and causes adipose hypoplasia and myeloid dysplasia; when associated with A-295. (1 publication)
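
The mutagenesis entries above are written as "wild-type residues → mutant residues" at 1-based positions of the canonical sequence. The Python sketch below shows one way such an entry could be applied to a sequence while checking that the stated wild-type residues really occur at the stated position. It is only an illustration: `apply_substitution` is a hypothetical helper, and to keep the example short it works on residues 181 – 200 of the canonical sequence (copied from this entry) rather than on all 359 residues.

```python
# Residues 181-200 of the canonical isoform, copied from this entry.
FRAGMENT_START = 181
FRAGMENT = "PPPPPPPPHPHASPAHLAAP"


def apply_substitution(seq, start, wild_type, mutant, offset=1):
    """Replace `wild_type` by `mutant` at 1-based position `start`.

    `offset` is the sequence position of seq[0], so a fragment can be
    addressed with the coordinates of the full-length protein.
    Raises ValueError if the wild-type residues do not match.
    """
    if len(wild_type) != len(mutant):
        raise ValueError("substitution must preserve length")
    i = start - offset
    if i < 0 or seq[i:i + len(wild_type)] != wild_type:
        raise ValueError(f"expected {wild_type} at position {start}")
    return seq[:i] + mutant + seq[i + len(wild_type):]


# S193A: the phosphoserine site listed in this entry.
s193a = apply_substitution(FRAGMENT, 193, "S", "A", offset=FRAGMENT_START)

# 184-186 PPP -> AAA and 182-188 PPPPPPP -> APPPAPA.
ppp_aaa = apply_substitution(FRAGMENT, 184, "PPP", "AAA", offset=FRAGMENT_START)
pro_run = apply_substitution(FRAGMENT, 182, "PPPPPPP", "APPPAPA", offset=FRAGMENT_START)
```

Checking the wild-type residue before substituting guards against off-by-one errors and against applying canonical coordinates to a different isoform.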

Chemistry databases

ChEMBL: CHEMBL3616358.

PTM / Processing

Molecule processing

Feature key, position(s), length and description:
  • Chain PRO_0000076614, 1 – 359 (length 359): CCAAT/enhancer-binding protein alpha

Amino acid modifications

Feature key, position(s) and description:
  • Modified residue 159: N6-acetyllysine; alternate (By similarity)
  • Cross-link 159: Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO); alternate (By similarity)
  • Cross-link 159: Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO2); alternate (By similarity)
  • Modified residue 193: Phosphoserine (1 publication)
  • Modified residue 222: Phosphothreonine; by GSK3 (1 publication)
  • Modified residue 226: Phosphothreonine; by GSK3 (1 publication)
  • Modified residue 230: Phosphoserine; by GSK3 (1 publication)

Post-translational modification

Sumoylated; sumoylation blocks the inhibitory effect on cell proliferation by disrupting the interaction with SMARCA2 (By similarity).
Phosphorylation at Ser-193 is required for interaction with CDK2, CDK4 and the SWI/SNF complex, leading to cell cycle inhibition. Dephosphorylated at Ser-193 by protein phosphatase 2A (PP2A) through PI3K/AKT signaling pathway regulation (PubMed:15107404). Phosphorylation at Thr-222 and Thr-226 by GSK3 is constitutive in adipose tissue and lung. In liver, both Thr-222 and Thr-226 are phosphorylated only during feeding but not during fasting (PubMed:17290224). Phosphorylation of the GSK3 consensus sites selectively decreases transactivation activity on IRE-controlled promoters (PubMed:17290224). (2 publications)

Keywords - PTM

Acetylation, Isopeptide bond, Phosphoprotein, Ubl conjugation

Proteomic databases

MaxQB: P53566.
PaxDb: P53566.
PRIDE: P53566.

PTM databases

iPTMnet: P53566.
PhosphoSitePlus: P53566.

Expression

Tissue specificity

Isoform 2 and isoform 3 are expressed in adipose tissue and liver (at protein level). (1 publication)

Developmental stage

At E9.5, expressed in the chorionic plate. From E10.5 to at least E11.5, is also expressed in the trophoblasts of the labyrinthine layer. (1 publication)

Gene expression databases

Bgee: ENSMUSG00000034957.
CleanEx: MM_CEBPA.
ExpressionAtlas: P53566. baseline and differential.
Genevisible: P53566. MM.

Interaction

Subunit structure

Binds DNA as a homodimer and as a heterodimer. Can form stable heterodimers with CEBPB, CEBPD, CEBPE and CEBPG (By similarity). Interacts with PRDM16 (PubMed:19641492). Interacts with UBN1 (By similarity). Interacts with ZNF638; this interaction increases transcriptional activation (PubMed:21602272). Interacts with the complex TFDP2:E2F1; the interaction prevents CEBPA binding to target gene promoters and represses its transcriptional activity (By similarity). Interacts with RB1 (PubMed:15107404). Interacts (when phosphorylated at Ser-193) with CDK2, CDK4, E2F4 and SMARCA2 (PubMed:15107404). Interacts with SREBF1 (PubMed:17290224). Interacts with FOXO1 (via the Fork-head domain); the interaction increases when FOXO1 is deacetylated (PubMed:17090532, PubMed:17627282). Isoform 1 and isoform 4 interact with TAF1A and UBTF. Isoform 4 interacts with NPM1 (By similarity). (By similarity; 6 publications)

Binary interactions

  • With: Foxo1; Entry: Q9R1E0; #Exp.: 5; IntAct: EBI-2644207, EBI-1371343

GO - Molecular function

  • kinase binding Source: UniProtKB
  • protein homodimerization activity Source: MGI
  • transcription factor binding Source: UniProtKB

Protein-protein interaction databases

BioGrid: 198667. 120 interactors.
DIP: DIP-44054N.
IntAct: P53566. 3 interactors.
MINT: MINT-1529126.
STRING: 10090.ENSMUSP00000096129.

Structure

3D structure databases

ProteinModelPortal: P53566.
SMR: P53566.

Family & Domains

Domains and Repeats

Feature key, position(s), length and description:
  • Domain 283 – 346 (length 64): bZIP (PROSITE-ProRule annotation)

Region

Feature key, position(s), length and description:
  • Region 1 – 70 (length 70): Required to repress E2F1:TFDP1-mediated transcription, to inhibit cell cycle and to induce adipocyte differentiation (By similarity)
  • Region 126 – 200 (length 75): Required to induce adipocyte differentiation (By similarity)
  • Region 180 – 194 (length 15): Required to functionally cooperate with SREBF1 in promoter activation (1 publication)
  • Region 240 – 359 (length 120): Interaction with FOXO1 (1 publication)
  • Region 287 – 314 (length 28): Basic motif (PROSITE-ProRule annotation)
  • Region 318 – 346 (length 29): Leucine-zipper (PROSITE-ProRule annotation)
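
The feature coordinates above are 1-based and inclusive, so a feature's length is end - start + 1. As a small consistency check, the Python sketch below slices the bZIP domain (residues 283 – 346, copied from the canonical sequence in this entry) into the basic motif and the leucine zipper, and lists the residue found at every seventh position of the zipper. The helper names are assumptions made for the example, and the heptad spacing is the general textbook description of leucine zippers rather than a statement taken from this record.

```python
# bZIP domain, residues 283-346 of the canonical isoform in this entry.
BZIP_START = 283
BZIP = (
    "SNEYRVRR"      # 283-290
    "ERNNIAVRKS"    # 291-300
    "RDKAKQRNVE"    # 301-310
    "TQQKVLELTS"    # 311-320
    "DNDRLRKRVE"    # 321-330
    "QLSRELDTLR"    # 331-340
    "GIFRQL"        # 341-346
)


def residue(pos: int) -> str:
    """Residue at a 1-based position of the full-length protein."""
    return BZIP[pos - BZIP_START]


def feature(start: int, end: int) -> str:
    """Sub-sequence for an inclusive 1-based feature range."""
    return BZIP[start - BZIP_START:end - BZIP_START + 1]


basic_motif = feature(287, 314)
leucine_zipper = feature(318, 346)

# Residues spaced seven apart, starting at the first zipper position.
heptad = [(pos, residue(pos)) for pos in range(318, 347, 7)]
```

With the sequence given in this entry, the positions sampled in `heptad` are 318, 325, 332, 339 and 346.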

Compositional bias

Feature key, position(s), length and description:
  • Compositional bias 99 – 102 (length 4): Poly-Gly
  • Compositional bias 181 – 190 (length 10): Poly-Pro
  • Compositional bias 262 – 271 (length 10): Poly-Gly

Sequence similarities

Belongs to the bZIP family. C/EBP subfamily. (Curated)
Contains 1 bZIP (basic-leucine zipper) domain. (PROSITE-ProRule annotation)

Phylogenomic databases

eggNOG: KOG3119. Eukaryota.
eggNOG: ENOG410YJ8G. LUCA.
GeneTree: ENSGT00530000063192.
HOGENOM: HOG000013112.
HOVERGEN: HBG050879.
InParanoid: P53566.
KO: K09055.
OMA: FDYPGAP.
OrthoDB: EOG091G11FC.
PhylomeDB: P53566.
TreeFam: TF105008.

Family and domain databases

InterPro: IPR004827. bZIP.
InterPro: IPR031106. C/EBP.
InterPro: IPR016468. C/EBP_chordates.
PANTHER: PTHR23334. PTHR23334. 1 hit.
Pfam: PF07716. bZIP_2. 1 hit.
PIRSF: PIRSF005879. CCAAT/enhancer-binding. 1 hit.
SMART: SM00338. BRLZ. 1 hit.
PROSITE: PS50217. BZIP. 1 hit.

Sequences (4)

Sequence status: Complete.

This entry describes 4 isoforms produced by alternative initiation.

Isoform 1 (identifier: P53566-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MESADFYEVE PRPPMSSHLQ SPPHAPSNAA FGFPRGAGPA PPPAPPAAPE
        60         70         80         90        100
PLGGICEHET SIDISAYIDP AAFNDEFLAD LFQHSRQQEK AKAAAGPAGG
       110        120        130        140        150
GGDFDYPGAP AGPGGAVMSA GAHGPPPGYG CAAAGYLDGR LEPLYERVGA
       160        170        180        190        200
PALRPLVIKQ EPREEDEAKQ LALAGLFPYQ PPPPPPPPHP HASPAHLAAP
       210        220        230        240        250
HLQFQIAHCG QTTMHLQPGH PTPPPTPVPS PHAAPALGAA GLPGPGSALK
       260        270        280        290        300
GLAGAHPDLR TGGGGGGSGA GAGKAKKSVD KNSNEYRVRR ERNNIAVRKS
       310        320        330        340        350
RDKAKQRNVE TQQKVLELTS DNDRLRKRVE QLSRELDTLR GIFRQLPESS
LVKAMGNCA

Length: 359
Mass (Da): 37,430
Last modified: January 23, 2002 - v2
Checksum: 1E6CC09A330BEFEF

Isoform 2 (identifier: P53566-3)
Also known as: C/EBPalpha-p42 (1 publication)

The sequence of this isoform differs from the canonical sequence as follows:
     1-14: Missing.

Length: 345
Mass (Da): 35,781
Checksum: 50B6625758216AF1

Isoform 3 (identifier: P53566-4)
Also known as: C/EBPalpha-p30 (1 publication)

The sequence of this isoform differs from the canonical sequence as follows:
     1-117: Missing.

Length: 242
Mass (Da): 25,557
Checksum: DB631CCCFD55BC20

Isoform 4 (identifier: P53566-5)
Also known as: extended-C/EBPalpha (1 publication)

The sequence of this isoform differs from the canonical sequence as follows:
     1-1: M → MRGREPVGALGGRRRQRRHAQAGGRRGSPCRENSNSPM

Length: 396
Mass (Da): 41,484
Checksum: 1E63C3CF81CF808B
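
The isoform descriptions above are expressed as differences from the canonical sequence, so the other three sequences can be rebuilt from it. The Python sketch below does this for the stated differences and compares the resulting lengths with the lengths listed in this entry. It is an illustrative reconstruction only: the variable names are assumptions, and the sequences are copied from this page rather than fetched from the database.

```python
# Canonical isoform (P53566-1), 359 residues, copied from this entry.
CANONICAL = (
    "MESADFYEVEPRPPMSSHLQSPPHAPSNAAFGFPRGAGPAPPPAPPAAPE"
    "PLGGICEHETSIDISAYIDPAAFNDEFLADLFQHSRQQEKAKAAAGPAGG"
    "GGDFDYPGAPAGPGGAVMSAGAHGPPPGYGCAAAGYLDGRLEPLYERVGA"
    "PALRPLVIKQEPREEDEAKQLALAGLFPYQPPPPPPPPHPHASPAHLAAP"
    "HLQFQIAHCGQTTMHLQPGHPTPPPTPVPSPHAAPALGAAGLPGPGSALK"
    "GLAGAHPDLRTGGGGGGSGAGAGKAKKSVDKNSNEYRVRRERNNIAVRKS"
    "RDKAKQRNVETQQKVLELTSDNDRLRKRVEQLSRELDTLRGIFRQLPESS"
    "LVKAMGNCA"
)

# Replacement for residue 1 (M) in isoform 4, as listed in this entry.
ISOFORM4_N_TERMINUS = "MRGREPVGALGGRRRQRRHAQAGGRRGSPCRENSNSPM"

isoforms = {
    "P53566-1": CANONICAL,                             # canonical
    "P53566-3": CANONICAL[14:],                        # 1-14 missing
    "P53566-4": CANONICAL[117:],                       # 1-117 missing
    "P53566-5": ISOFORM4_N_TERMINUS + CANONICAL[1:],   # 1-1: M -> extension
}

lengths = {name: len(seq) for name, seq in isoforms.items()}

# Lengths stated in this entry for isoforms 1, 2, 3 and 4.
expected = {"P53566-1": 359, "P53566-3": 345, "P53566-4": 242, "P53566-5": 396}
```

All four reconstructed sequences begin with a methionine, which fits the statement that the isoforms are produced by alternative initiation.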

Experimental Info

Feature key, position(s), length and description:
  • Sequence conflict 30 – 54 (length 25): AFGFP…EPLGG → RLWLSPGRGPRAAPSPTCRPGAAGR in AAA37374 (PubMed:2006196). (Curated)
  • Sequence conflict 356 – 359 (length 4): GNCA → ATAREARGCGTALGRPPGWRPRGWFRVAGSLGCPGRASQD in AAA37374 (PubMed:2006196). (Curated)

Alternative sequence

Feature key, identifier, position(s), length and description:
  • Alternative sequence VSP_057549, 1 – 117 (length 117): Missing in isoform 3. (1 publication)
  • Alternative sequence VSP_057550, 1 – 14 (length 14): Missing in isoform 2. (1 publication)
  • Alternative sequence VSP_057608, 1 (length 1): M → MRGREPVGALGGRRRQRRHAQAGGRRGSPCRENSNSPM in isoform 4.

Sequence databases

EMBL / GenBank / DDBJ:
M62362 Genomic DNA. Translation: AAA37374.1.
AC150683 Genomic DNA. No translation available.
BC011118 mRNA. Translation: AAH11118.1.
BC028890 mRNA. Translation: AAH28890.1.
BC051102 mRNA. Translation: AAH51102.1.
BC058161 mRNA. Translation: AAH58161.1.
CCDS: CCDS21145.1. [P53566-1]
PIR: I49575.
RefSeq: NP_001274443.1. NM_001287514.1. [P53566-5]
RefSeq: NP_001274444.1. NM_001287515.1. [P53566-3]
RefSeq: NP_001274450.1. NM_001287521.1. [P53566-4]
RefSeq: NP_031704.2. NM_007678.3. [P53566-1]
UniGene: Mm.349667.

Genome annotation databases

Ensembl: ENSMUST00000042985; ENSMUSP00000096129; ENSMUSG00000034957. [P53566-1]
GeneID: 12606.
KEGG: mmu:12606.
UCSC: uc009gjl.2. mouse. [P53566-1]

Keywords - Coding sequence diversity

Alternative initiation

Cross-references

Organism-specific databases

CTD: 1050.
MGI: MGI:99480. Cebpa.

Miscellaneous databases

PRO: P53566.

Entry information

Entry name: CEBPA_MOUSE
Accession: Primary (citable) accession number: P53566
Secondary accession number(s): Q91XB6
Entry history
Integrated into UniProtKB/Swiss-Prot: October 1, 1996
Last sequence update: January 23, 2002
Last modified: November 30, 2016
This is version 145 of the entry and version 2 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program

Miscellaneous

Keywords - Technical term

Complete proteome, Reference proteome

Documents

  1. MGD cross-references
    Mouse Genome Database (MGD) cross-references in UniProtKB/Swiss-Prot
  2. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
  • 100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
  • 90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
  • 50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.