Protein

C-terminal-binding protein 2

Gene

CTBP2

Organism
Homo sapiens (Human)
Status
Reviewed. Annotation score: 5 out of 5. Experimental evidence at protein level.

Function

Corepressor targeting diverse transcription regulators. Functions in brown adipose tissue (BAT) differentiation (By similarity).
Isoform 2 probably acts as a scaffold for specialized synapses.

Sites

Feature key     Position(s)   Description                      Length
Binding site    106           NAD (By similarity)              1
Binding site    210           NAD (1 Publication)              1
Active site     272           (By similarity)                  1
Binding site    296           NAD (1 Publication)              1
Active site     301           (By similarity)                  1
Active site     321           Proton donor (By similarity)     1

Regions

Feature key          Position(s)   Description           Length
Nucleotide binding   186 – 191     NAD (1 Publication)   6
Nucleotide binding   243 – 249     NAD (1 Publication)   7
Nucleotide binding   270 – 272     NAD (1 Publication)   3
Nucleotide binding   321 – 324     NAD (1 Publication)   4

GO - Molecular function

GO - Biological process

  • negative regulation of cell proliferation Source: ProtInc
  • negative regulation of transcription, DNA-templated Source: UniProtKB
  • negative regulation of transcription from RNA polymerase II promoter Source: GO_Central
  • positive regulation of chromatin binding Source: Ensembl
  • positive regulation of retinoic acid receptor signaling pathway Source: MGI
  • positive regulation of transcription from RNA polymerase II promoter Source: MGI
  • transcription, DNA-templated Source: UniProtKB-KW
  • viral genome replication Source: ProtInc
  • white fat cell differentiation Source: UniProtKB

Keywords - Molecular function

Oxidoreductase, Repressor

Keywords - Biological process

Differentiation, Host-virus interaction, Transcription, Transcription regulation

Keywords - Ligand

NAD

Enzyme and pathway databases

Reactome: R-HSA-4641265. Repression of WNT target genes.
R-HSA-5339700. TCF7L2 mutants don't bind CTBP.
SignaLink: P56545.
SIGNOR: P56545.

Names & Taxonomy

Protein names
Recommended name:
C-terminal-binding protein 2
Short name:
CtBP2
Gene names
Name: CTBP2
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes
  • UP000005640 Component: Chromosome 10

Organism-specific databases

HGNC: HGNC:2495. CTBP2.

Subcellular location

GO - Cellular component

  • cell junction Source: UniProtKB-KW
  • nucleus Source: UniProtKB
  • ribbon synapse Source: Ensembl
  • transcriptional repressor complex Source: UniProtKB

Keywords - Cellular component

Cell junction, Nucleus, Synapse

Pathology & Biotech

Organism-specific databases

DisGeNET: 1488.
OpenTargets: ENSG00000175029.
PharmGKB: PA26996.

Polymorphism and mutation databases

BioMuta: CTBP2.
DMDM: 3182976.

PTM / Processing

Molecule processing

Feature key              Position(s)   Description                    Length
Chain (PRO_0000076044)   1 – 445       C-terminal-binding protein 2   445

Amino acid modifications

Feature key        Position(s)   Description                                      Length
Modified residue   22            Asymmetric dimethylarginine (Combined sources)   1
Modified residue   428           Phosphoserine (Combined sources)                 1

Keywords - PTM

Methylation, Phosphoprotein

Proteomic databases

EPD: P56545.
MaxQB: P56545.
PaxDb: P56545.
PeptideAtlas: P56545.
PRIDE: P56545.

PTM databases

iPTMnet: P56545.
PhosphoSitePlus: P56545.

Expression

Tissue specificity

Ubiquitous. Highest levels in heart, skeletal muscle, and pancreas.

Gene expression databases

Bgee: ENSG00000175029.
CleanEx: HS_CTBP2.
ExpressionAtlas: P56545. baseline and differential.
Genevisible: P56545. HS.

Organism-specific databases

HPA: CAB031916.
HPA023559.
HPA023564.
HPA044971.

Interaction

Subunit structure

Interacts with the C-terminus of adenovirus E1A protein. Can form homodimers or heterodimers of CTBP1 and CTBP2. Interacts with HIPK2 and ZNF217. Interacts with PRDM16; this interaction represses expression of white adipose tissue (WAT)-specific genes (By similarity). Interacts with PNN, NRIP1 and WIZ. Interacts with human adenovirus 5 E1A protein; this interaction seems to potentiate viral replication (PubMed:23747199). Interacts with MCRIP1 (PubMed:25728771). Evidence: By similarity; 6 Publications.

Binary interactions

With       Entry      #Exp.   IntAct                      Notes
BCL3       P20749     2       EBI-741533,EBI-958997
CCNH       P51946     5       EBI-741533,EBI-741406
CTBP1      Q13363-2   3       EBI-741533,EBI-10171858
FOXP2      O15409     3       EBI-741533,EBI-983612
FUNDC1     Q8IVP5     3       EBI-741533,EBI-3059266
HIC1       Q14526     2       EBI-741533,EBI-2507362
IKZF1      Q13422     5       EBI-741533,EBI-745305
IKZF2      Q9UKS7     3       EBI-741533,EBI-3893057
LCOR       Q96JN0     4       EBI-741533,EBI-8833163
NOL4       O94818-2   3       EBI-741533,EBI-10190763
NOL4L      Q96MY1     3       EBI-741533,EBI-6660790
PLCB1      Q9NQ66     3       EBI-741533,EBI-3396023
PPP1R15A   O75807     2       EBI-741533,EBI-714746
PROX1      Q92786     2       EBI-741533,EBI-3912635
RAI2       Q9Y5P3     4       EBI-741533,EBI-746228
SOX13      Q9UN79     2       EBI-741533,EBI-3928516
TGIF1      Q15583     5       EBI-741533,EBI-714215
Zbp1       A2APF7     2       EBI-741533,EBI-6115394      From a different organism.
ZNF750     Q32MQ0     3       EBI-741533,EBI-10240029

GO - Molecular function

Protein-protein interaction databases

BioGrid: 107870. 105 interactors.
DIP: DIP-42104N.
IntAct: P56545. 62 interactors.
MINT: MINT-1188878.
STRING: 9606.ENSP00000311825.

Chemistry databases

BindingDB: P56545.

Structure

Secondary structure

Feature key   Position(s)   Description        Length
Beta strand   35 – 38       Combined sources   4
Helix         47 – 51       Combined sources   5
Turn          52 – 54       Combined sources   3
Beta strand   56 – 59       Combined sources   4
Helix         65 – 67       Combined sources   3
Helix         70 – 75       Combined sources   6
Beta strand   76 – 81       Combined sources   6
Beta strand   83 – 85       Combined sources   3
Helix         89 – 93       Combined sources   5
Beta strand   100 – 106     Combined sources   7
Helix         113 – 118     Combined sources   6
Beta strand   122 – 124     Combined sources   3
Beta strand   128 – 130     Combined sources   3
Helix         131 – 147     Combined sources   17
Helix         149 – 157     Combined sources   9
Helix         165 – 171     Combined sources   7
Turn          172 – 174     Combined sources   3
Beta strand   182 – 186     Combined sources   5
Helix         190 – 199     Combined sources   10
Helix         200 – 202     Combined sources   3
Beta strand   205 – 209     Combined sources   5
Helix         217 – 220     Combined sources   4
Helix         229 – 235     Combined sources   7
Beta strand   237 – 241     Combined sources   5
Helix         255 – 258     Combined sources   4
Beta strand   265 – 269     Combined sources   5
Helix         273 – 275     Combined sources   3
Helix         278 – 286     Combined sources   9
Beta strand   289 – 296     Combined sources   8
Beta strand   299 – 302     Combined sources   4
Turn          309 – 312     Combined sources   4
Beta strand   314 – 318     Combined sources   5
Helix         327 – 346     Combined sources   20
Turn          349 – 352     Combined sources   4
Beta strand   354 – 356     Combined sources   3

3D structure databases

PDB entry   Method   Resolution (Å)   Chain             Positions
2OME        X-ray    2.80             A/B/C/D/E/F/G/H   31-364
4LCJ        X-ray    2.86             A/B/C/D/E/F/G/H   31-362
ProteinModelPortal: P56545.
SMR: P56545.

Miscellaneous databases

EvolutionaryTrace: P56545.

Family & Domains

Sequence similarities

Phylogenomic databases

eggNOG: KOG0067. Eukaryota.
COG0111. LUCA.
GeneTree: ENSGT00530000063021.
HOVERGEN: HBG001898.
InParanoid: P56545.
KO: K04496.
OrthoDB: EOG091G08GS.
PhylomeDB: P56545.
TreeFam: TF313593.

Family and domain databases

Gene3D: 3.40.50.720. 2 hits.
InterPro: IPR006139. D-isomer_2_OHA_DH_cat_dom.
IPR029753. D-isomer_DH_CS.
IPR006140. D-isomer_DH_NAD-bd.
IPR016040. NAD(P)-bd_dom.
Pfam: PF00389. 2-Hacid_dh. 1 hit.
PF02826. 2-Hacid_dh_C. 1 hit.
SUPFAM: SSF51735. SSF51735. 1 hit.
PROSITE: PS00671. D_2_HYDROXYACID_DH_3. 1 hit.

Sequences (2)

Sequence status: Complete.

This entry describes 2 isoforms produced by alternative splicing.

Isoform 1 (identifier: P56545-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MALVDKHKVK RQRLDRICEG IRPQIMNGPL HPRPLVALLD GRDCTVEMPI
        60         70         80         90        100
LKDLATVAFC DAQSTQEIHE KVLNEAVGAM MYHTITLTRE DLEKFKALRV
       110        120        130        140        150
IVRIGSGYDN VDIKAAGELG IAVCNIPSAA VEETADSTIC HILNLYRRNT
       160        170        180        190        200
WLYQALREGT RVQSVEQIRE VASGAARIRG ETLGLIGFGR TGQAVAVRAK
       210        220        230        240        250
AFGFSVIFYD PYLQDGIERS LGVQRVYTLQ DLLYQSDCVS LHCNLNEHNH
       260        270        280        290        300
HLINDFTIKQ MRQGAFLVNA ARGGLVDEKA LAQALKEGRI RGAALDVHES
       310        320        330        340        350
EPFSFAQGPL KDAPNLICTP HTAWYSEQAS LEMREAAATE IRRAITGRIP
       360        370        380        390        400
ESLRNCVNKE FFVTSAPWSV IDQQAIHPEL NGATYRYPPG IVGVAPGGLP
       410        420        430        440
AAMEGIIPGG IPVTHNLPTV AHPSQAPSPN QPTKHGDNRE HPNEQ
Length: 445
Mass (Da): 48,945
Last modified: July 15, 1998 - v1
Checksum: 0A8C21CEB36807FA
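
Because all feature positions in this entry are 1-based and refer to the canonical sequence, a small lookup helper can be used to cross-check the annotations against the sequence itself. The following is a minimal sketch, not part of the entry: the function names `parse_blocked_sequence` and `residue_at` are hypothetical, and the sequence is copied from the blocked listing above.

```python
import re

# Canonical isoform (P56545-1), copied from the entry in blocks of 10 residues.
CANONICAL_BLOCKS = """
MALVDKHKVK RQRLDRICEG IRPQIMNGPL HPRPLVALLD GRDCTVEMPI
LKDLATVAFC DAQSTQEIHE KVLNEAVGAM MYHTITLTRE DLEKFKALRV
IVRIGSGYDN VDIKAAGELG IAVCNIPSAA VEETADSTIC HILNLYRRNT
WLYQALREGT RVQSVEQIRE VASGAARIRG ETLGLIGFGR TGQAVAVRAK
AFGFSVIFYD PYLQDGIERS LGVQRVYTLQ DLLYQSDCVS LHCNLNEHNH
HLINDFTIKQ MRQGAFLVNA ARGGLVDEKA LAQALKEGRI RGAALDVHES
EPFSFAQGPL KDAPNLICTP HTAWYSEQAS LEMREAAATE IRRAITGRIP
ESLRNCVNKE FFVTSAPWSV IDQQAIHPEL NGATYRYPPG IVGVAPGGLP
AAMEGIIPGG IPVTHNLPTV AHPSQAPSPN QPTKHGDNRE HPNEQ
"""


def parse_blocked_sequence(text: str) -> str:
    """Drop whitespace and any ruler digits, keeping only residue letters."""
    return re.sub(r"[^A-Za-z]", "", text).upper()


def residue_at(sequence: str, position: int) -> str:
    """Return the residue at a 1-based position, as used in the feature tables."""
    if not 1 <= position <= len(sequence):
        raise IndexError(f"position {position} outside 1..{len(sequence)}")
    return sequence[position - 1]


sequence = parse_blocked_sequence(CANONICAL_BLOCKS)
print("length:", len(sequence))
# Positions taken from the modified-residue and natural-variant annotations.
for position in (22, 47, 428):
    print(position, residue_at(sequence, position))
```

With the sequence as listed, position 22 is an arginine and position 428 a serine, which agrees with the asymmetric dimethylarginine and phosphoserine annotations, and position 47 is the glutamate affected by the E → D variant.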
Isoform 2 (identifier: P56545-2)
Also known as: Ribeye

The sequence of this isoform differs from the canonical sequence as follows:
     1-20: MALVDKHKVKRQRLDRICEG → MPVPSRHINI...VSTMLAPEPS

Length: 985
Mass (Da): 106,187
Checksum: 90EC841907295622

Experimental Info

Feature key         Position(s)   Description                                        Length
Sequence conflict   87            L → F in AAG45951 (PubMed:11163272). Curated       1
Sequence conflict   88            T → N in AAG45951 (PubMed:11163272). Curated       1
Sequence conflict   112           D → N in AAG45951 (PubMed:11163272). Curated       1
Sequence conflict   411           I → T in AAH47018 (PubMed:15489334). Curated       1
Isoform 2 (identifier: P56545-2)
Sequence conflict   455           Y → H in AAG45951 (PubMed:11163272). Curated       1
Sequence conflict   539           Q → E in AAG45951 (PubMed:11163272). Curated       1

Natural variant

Feature key                    Position(s)   Description                                                   Length
Natural variant (VAR_033844)   47            E → D. Corresponds to variant rs3198926 (dbSNP, Ensembl).     1

Alternative sequence

Feature key: Alternative sequence (VSP_027615)
Position(s): 1 – 20
Length: 20
Description: MALVD…RICEG → MPVPSRHINIGRSQSWDAAG WYEGPWENAESLRPLGRRSS LTYGTAEGTWFEPNHRPQDA ALPVAAEPYLYREAVYNSVA ARKGSTPDFTFYDSRQAVMS GRSPLLPREYYSDPSGAARV PKEPPLYRDPGVSRPVPSYG VLGSRTSWDPMQGRSPALQD AGHLYRDPGGKMIPQGRQTQ SRAASPGRYGREQPDTRYGA EVPAYPLSQVFSDISERPID PAPARQVAPTCLVVDPSSAA APEGSTGVAPGALNRGYGPA RESIPSKMAYETYEADLSTF QGPGGKRTVLPEFLAFLRAE GLAEATLGALLQQGFDSPAV LATLEDADIKSVAPNLGQAR VLSRLANSCRTEMQLRRQDR GGPLPRARSSSFSHRSELLH GDLASLGAAAPLQTASPRAG DPARRPSSAPSQHLLETAAT YSAPGVGTHAPHFPSNSGYS SPTPCALTARLSPTYPLQAG VALTNPGPSNPLHPGPRTAY STAYTVPMELLKRERNVAAS PLPSPHGSPQVLRKPGAPLG PSTLPPASQSLHTPHSPYQK VARRTGAPIIVSTMLAPEPS in isoform 2 (1 Publication).
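
The relationship between the two isoforms can be sketched in code: isoform 2 is obtained by replacing residues 1-20 of the canonical sequence with the alternative segment. The helpers below (`apply_alternative_sequence`, `isoform_length`) are hypothetical names introduced for illustration, the short strings in the demonstration are toy inputs rather than the real sequences, and the replacement length of 560 is an assumption obtained by counting the 28 groups of 20 residues listed for VSP_027615.

```python
def apply_alternative_sequence(canonical: str, start: int, end: int, replacement: str) -> str:
    """Replace the 1-based, inclusive range start..end of `canonical` with `replacement`."""
    if not 1 <= start <= end <= len(canonical):
        raise ValueError(f"range {start}-{end} is outside 1..{len(canonical)}")
    return canonical[: start - 1] + replacement + canonical[end:]


def isoform_length(canonical_length: int, start: int, end: int, replacement_length: int) -> int:
    """Length of the isoform after swapping start..end for a replacement segment."""
    return canonical_length - (end - start + 1) + replacement_length


# Toy demonstration: swap the first 4 residues of a 10-residue fragment.
toy = apply_alternative_sequence("MALVDKHKVK", 1, 4, "MPVPSR")
print(toy)

# Length bookkeeping for isoform 2: 445 canonical residues, residues 1-20 replaced
# by a segment assumed to be 560 residues long (28 groups of 20 in the listing).
print(isoform_length(445, 1, 20, 560))
```

Under that assumption the arithmetic gives 445 - 20 + 560 = 985, matching the length reported for isoform 2.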

Sequence databases

AF016507 mRNA. Translation: AAC39603.1.
AF222711 mRNA. Translation: AAG45951.1.
BT007012 mRNA. Translation: AAP35658.1.
AK290390 mRNA. Translation: BAF83079.1.
AL833398 mRNA. Translation: CAH10590.1.
AL596261, AL731571 Genomic DNA. Translation: CAH72472.1.
AL731571, AL596261 Genomic DNA. Translation: CAI16100.1.
AL731571 Genomic DNA. Translation: CAI16102.1.
CH471066 Genomic DNA. Translation: EAW49247.1.
CH471066 Genomic DNA. Translation: EAW49250.1.
CH471066 Genomic DNA. Translation: EAW49249.1.
BC002486 mRNA. Translation: AAH02486.1.
BC047018 mRNA. Translation: AAH47018.1.
BC052276 mRNA. Translation: AAH52276.1.
BC072020 mRNA. Translation: AAH72020.1.
CCDS: CCDS7643.1. [P56545-1]
CCDS7644.1. [P56545-2]
RefSeq: NP_001077383.1. NM_001083914.2. [P56545-1]
NP_001277143.1. NM_001290214.2. [P56545-1]
NP_001277144.1. NM_001290215.2. [P56545-1]
NP_001307941.1. NM_001321012.1. [P56545-1]
NP_001307942.1. NM_001321013.1. [P56545-1]
NP_001307943.1. NM_001321014.1. [P56545-1]
NP_001320.1. NM_001329.3. [P56545-1]
NP_073713.2. NM_022802.2. [P56545-2]
XP_005269618.1. XM_005269561.2. [P56545-1]
XP_005269621.1. XM_005269564.2. [P56545-1]
XP_005269624.1. XM_005269567.2. [P56545-1]
XP_005269625.1. XM_005269568.4. [P56545-1]
XP_005269626.1. XM_005269569.2. [P56545-1]
XP_005269628.1. XM_005269571.2. [P56545-1]
XP_005269629.1. XM_005269572.3. [P56545-1]
XP_006717705.1. XM_006717642.2. [P56545-1]
XP_011537653.1. XM_011539351.1. [P56545-1]
XP_011537655.1. XM_011539353.1. [P56545-1]
XP_011537656.1. XM_011539354.1. [P56545-1]
XP_011537657.1. XM_011539355.1. [P56545-1]
XP_016871245.1. XM_017015756.1. [P56545-1]
XP_016871246.1. XM_017015757.1. [P56545-1]
UniGene: Hs.501345.

Genome annotation databases

Ensembl: ENST00000309035; ENSP00000311825; ENSG00000175029. [P56545-2]
ENST00000337195; ENSP00000338615; ENSG00000175029. [P56545-1]
ENST00000411419; ENSP00000410474; ENSG00000175029. [P56545-1]
ENST00000494626; ENSP00000436285; ENSG00000175029. [P56545-1]
ENST00000531469; ENSP00000434630; ENSG00000175029. [P56545-1]
GeneID: 1488.
KEGG: hsa:1488.
UCSC: uc001lie.5. human. [P56545-1]

Keywords - Coding sequence diversity

Alternative splicing, Polymorphism

Cross-references

Protocols and materials databases

DNASU: 1488.

Organism-specific databases

CTD: 1488.
DisGeNET: 1488.
GeneCards: CTBP2.
HGNC: HGNC:2495. CTBP2.
HPA: CAB031916.
HPA023559.
HPA023564.
HPA044971.
MIM: 602619. gene.
neXtProt: NX_P56545.
OpenTargets: ENSG00000175029.
PharmGKB: PA26996.

Miscellaneous databases

ChiTaRS: CTBP2. human.
EvolutionaryTrace: P56545.
GeneWiki: CTBP2.
GenomeRNAi: 1488.
PRO: P56545.

Entry information

Entry name: CTBP2_HUMAN
Accession: Primary (citable) accession number: P56545
Secondary accession number(s): A8K2X5, D3DRF5, O43449, Q5SQP7, Q69YI3, Q86SV0, Q9H2T8
Entry history
Integrated into UniProtKB/Swiss-Prot: July 15, 1998
Last sequence update: July 15, 1998
Last modified: November 30, 2016
This is version 168 of the entry and version 1 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

Keywords - Technical term

3D-structure, Complete proteome, Direct protein sequencing, Reference proteome

Documents

  1. Human chromosome 10
    Human chromosome 10: entries, gene names and cross-references to MIM
  2. Human entries with polymorphisms or disease mutations
    List of human entries with polymorphisms or disease mutations
  3. Human polymorphisms and disease mutations
    Index of human polymorphisms and disease mutations
  4. MIM cross-references
    Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot
  5. PDB cross-references
    Index of Protein Data Bank (PDB) cross-references
  6. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence in the cluster (a.k.a. the seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.
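
The membership rule stated for UniRef90 and UniRef50 can be illustrated with a small threshold check. This is only a sketch of the two stated criteria (identity and overlap relative to the seed sequence), not the clustering procedure UniRef actually runs; the `Alignment` type and `joins_cluster` function are hypothetical names, and the identity and overlap values are assumed to be precomputed fractions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alignment:
    """Comparison of a candidate sequence against a cluster's seed sequence."""
    identity: float  # fraction of identical aligned positions, 0.0 to 1.0
    overlap: float   # fraction of the seed sequence covered, 0.0 to 1.0


def joins_cluster(alignment: Alignment, min_identity: float, min_overlap: float = 0.80) -> bool:
    """Apply the stated rule: at least `min_identity` identity and 80% overlap with the seed."""
    return alignment.identity >= min_identity and alignment.overlap >= min_overlap


candidate = Alignment(identity=0.93, overlap=0.85)
print(joins_cluster(candidate, min_identity=0.90))  # UniRef90-style threshold
print(joins_cluster(candidate, min_identity=0.50))  # UniRef50-style threshold
```

A candidate with high identity but less than 80% overlap with the seed would fail the check at either level, which is why both conditions are tested together.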