Protein

Tyrosine-protein phosphatase non-receptor type 23

Gene

PTPN23

Organism
Homo sapiens (Human)
Status
Reviewed - Annotation score: 5 out of 5 - Experimental evidence at protein level

Function

Plays a role in sorting of endocytic ubiquitinated cargos into multivesicular bodies (MVBs) via its interaction with the ESCRT-I complex (endosomal sorting complex required for transport I), and possibly also other ESCRT complexes. May act as a negative regulator of Ras-mediated mitogenic activity. Plays a role in ciliogenesis. (3 publications)

Catalytic activity

Protein tyrosine phosphate + H2O = protein tyrosine + phosphate. (PROSITE-ProRule annotation)

Sites

Feature key | Position(s) | Description | Length
Active site | 1392 | Phosphocysteine intermediate (PROSITE-ProRule annotation) | 1
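
The active-site position can be cross-checked against the sequence given further down in this entry. The Python sketch below is only an illustration: it copies residues 1351-1400 from that sequence, confirms that position 1392 is a cysteine, and looks for the C-x(5)-R arrangement that is characteristic of protein tyrosine phosphatase catalytic loops. The names `residue_at` and `find_cx5r` are made up for this example and are not part of any UniProt tooling.

```python
import re

# Residues 1351-1400 of Q9H3S7, copied from the sequence section of this entry.
SEGMENT_START = 1351
SEGMENT = "HFPTWPELGLPDSPSNLLRFIQEVHAHYLHQRPLHTPIIVHCSSGVGRTG"


def residue_at(position, segment=SEGMENT, start=SEGMENT_START):
    """Return the residue at a 1-based UniProt position inside the segment."""
    offset = position - start
    if not 0 <= offset < len(segment):
        raise ValueError(f"position {position} is outside the segment")
    return segment[offset]


def find_cx5r(segment=SEGMENT, start=SEGMENT_START):
    """Return 1-based (cysteine, arginine) positions for each C-x(5)-R match."""
    return [(m.start() + start, m.end() - 1 + start)
            for m in re.finditer(r"C.{5}R", segment)]


print(residue_at(1392))   # residue at the annotated active site
print(find_cx5r())        # positions of the C-x(5)-R match, if any
```

Working with a short, explicitly offset segment keeps the 1-based UniProt coordinates visible and avoids pasting the full 1,636-residue sequence just to check one position.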

GO - Molecular function

  • protein kinase binding Source: UniProtKB
  • protein tyrosine phosphatase activity Source: UniProtKB

GO - Biological process

  • cilium morphogenesis Source: UniProtKB
  • negative regulation of epithelial cell migration Source: UniProtKB
  • positive regulation of adherens junction organization Source: UniProtKB
  • positive regulation of early endosome to late endosome transport Source: UniProtKB
  • positive regulation of homophilic cell adhesion Source: UniProtKB
  • protein transport Source: UniProtKB-KW
  • ubiquitin-dependent protein catabolic process via the multivesicular body sorting pathway Source: UniProtKB

Keywords - Molecular function

Hydrolase, Protein phosphatase

Keywords - Biological process

Cilium biogenesis/degradation, Protein transport, Transport

Enzyme and pathway databases

BioCyc: ZFISH:HS01202-MONOMER.
BRENDA: 3.1.3.48. 2681.

Names & Taxonomy

Protein names
Recommended name: Tyrosine-protein phosphatase non-receptor type 23 (EC 3.1.3.48)
Alternative name(s):
His domain-containing protein tyrosine phosphatase (short name: HD-PTP)
Protein tyrosine phosphatase TD14 (short name: PTP-TD14)
Gene names
Name: PTPN23
Synonyms: KIAA1471
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes
  • UP000005640 Component: Chromosome 3

Organism-specific databases

HGNC: HGNC:14406. PTPN23.

Subcellular location

GO - Cellular component

  • ciliary basal body Source: UniProtKB
  • cytoplasm Source: UniProtKB
  • cytoplasmic, membrane-bounded vesicle Source: UniProtKB-SubCell
  • early endosome Source: UniProtKB
  • endosome Source: UniProtKB
  • extracellular exosome Source: UniProtKB
  • nucleoplasm Source: HPA
  • nucleus Source: UniProtKB

Keywords - Cellular component

Cell projection, Cilium, Cytoplasm, Cytoplasmic vesicle, Cytoskeleton, Endosome, Nucleus

Pathology & Biotech

Mutagenesis

Feature key | Position(s) | Description | Length
Mutagenesis | 202 | L → D: Nearly abolishes interaction with CHMP4B. Abolishes interaction with CHMP4B; when associated with D-206. (1 publication) | 1
Mutagenesis | 206 | I → D: Abolishes interaction with CHMP4B; when associated with D-202. (1 publication) | 1
Mutagenesis | 678 | F → D: Abolishes interaction with UBAP1. (1 publication) | 1
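
Each mutagenesis row gives a 1-based position and a "wild type → mutant" substitution, and "when associated with D-206" refers to the double mutant. As a rough illustration of that notation, the Python sketch below applies the L202D and I206D substitutions to residues 201-210 copied from the sequence section of this entry, checking the wild-type residue first. The function and variable names are hypothetical.

```python
# Residues 201-210 of Q9H3S7, copied from the sequence section of this entry.
SEGMENT_START = 201
SEGMENT = "FLVARISAQV"


def apply_substitutions(segment, start, substitutions):
    """Apply (position, wild_type, mutant) substitutions using 1-based positions.

    Raises ValueError if the wild-type residue does not match the segment,
    which guards against off-by-one errors in the coordinates.
    """
    residues = list(segment)
    for position, wild_type, mutant in substitutions:
        offset = position - start
        if not 0 <= offset < len(residues):
            raise ValueError(f"position {position} is outside the segment")
        if residues[offset] != wild_type:
            raise ValueError(
                f"expected {wild_type} at {position}, found {residues[offset]}"
            )
        residues[offset] = mutant
    return "".join(residues)


# The L202D / I206D pair described in the table (illustrative variable name).
double_mutant = apply_substitutions(
    SEGMENT, SEGMENT_START, [(202, "L", "D"), (206, "I", "D")]
)
print(double_mutant)
```

Validating the wild-type residue before substituting is a cheap way to catch coordinate mix-ups, for example when a sequence with a different initiation site is used by mistake.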

Organism-specific databases

DisGeNET: 25930.
OpenTargets: ENSG00000076201.
PharmGKB: PA33996.

Polymorphism and mutation databases

BioMuta: PTPN23.
DMDM: 68053318.

PTM / Processing

Molecule processing

Feature key | Position(s) | Description | Length
Chain (PRO_0000094777) | 1 – 1636 | Tyrosine-protein phosphatase non-receptor type 23 | 1636

Amino acid modifications

Feature key | Position(s) | Description | Length
Modified residue | 733 | Phosphoserine (combined sources) | 1
Modified residue | 950 | Omega-N-methylarginine (combined sources) | 1
Modified residue | 1122 | Phosphoserine (combined sources) | 1
Modified residue | 1123 | Phosphoserine (combined sources) | 1
Modified residue | 1131 | Phosphothreonine (combined sources) | 1
Modified residue | 1615 | Omega-N-methylarginine (combined sources) | 1

Keywords - PTM

Methylation, Phosphoprotein

Proteomic databases

EPD: Q9H3S7.
MaxQB: Q9H3S7.
PaxDb: Q9H3S7.
PeptideAtlas: Q9H3S7.
PRIDE: Q9H3S7.

PTM databases

DEPOD: Q9H3S7.
iPTMnet: Q9H3S7.
PhosphoSitePlus: Q9H3S7.

Expression

Gene expression databases

Bgee: ENSG00000076201.
CleanEx: HS_PTPN23.
ExpressionAtlas: Q9H3S7. baseline and differential.
Genevisible: Q9H3S7. HS.

Organism-specific databases

HPA: HPA016845.

Interaction

Subunit structure

Interacts with GRAP2 and GRB2. Interacts with UBAP1 and CHMP4B. (3 publications)

Binary interactions

With | Entry | #Exp. | IntAct | Notes
CEP55 | D3DR37 | 3 | EBI-724478,EBI-10173536 |
CHMP4B | Q9H444 | 2 | EBI-724478,EBI-749627 |
GRAP2 | O75791 | 8 | EBI-724478,EBI-740418 |
Grap2 | O89100 | 2 | EBI-724478,EBI-642151 | From a different organism.
GRB2 | P62993 | 6 | EBI-724478,EBI-401755 |
NOTCH2NL | Q7Z3S9 | 3 | EBI-724478,EBI-945833 |
PDCD6 | O75340 | 3 | EBI-724478,EBI-352915 |
PSMA3 | P25788 | 3 | EBI-724478,EBI-348380 |
PTK2 | Q05397 | 4 | EBI-724478,EBI-702142 |
SH3GL2 | Q99962 | 2 | EBI-724478,EBI-77938 |
TRIM27 | P14373 | 3 | EBI-724478,EBI-719493 |
TSG101 | Q99816 | 2 | EBI-724478,EBI-346882 |

GO - Molecular function

  • protein kinase binding Source: UniProtKB

Protein-protein interaction databases

BioGrid: 117430. 40 interactors.
DIP: DIP-29923N.
IntAct: Q9H3S7. 22 interactors.
MINT: MINT-1425077.
STRING: 9606.ENSP00000265562.

Structure

Secondary structure

Feature key | Position(s) | Description | Length
Beta strand | 17 – 20 | Combined sources | 4
Helix | 23 – 32 | Combined sources | 10
Turn | 38 – 41 | Combined sources | 4
Helix | 42 – 56 | Combined sources | 15
Helix | 62 – 81 | Combined sources | 20
Beta strand | 94 – 97 | Combined sources | 4
Turn | 99 – 101 | Combined sources | 3
Beta strand | 104 – 108 | Combined sources | 5
Helix | 110 – 131 | Combined sources | 22
Helix | 137 – 160 | Combined sources | 24
Helix | 167 – 169 | Combined sources | 3
Helix | 171 – 195 | Combined sources | 25
Helix | 200 – 221 | Combined sources | 22
Helix | 224 – 230 | Combined sources | 7
Helix | 233 – 262 | Combined sources | 30
Helix | 266 – 286 | Combined sources | 21
Turn | 287 – 289 | Combined sources | 3
Helix | 292 – 318 | Combined sources | 27
Helix | 327 – 329 | Combined sources | 3
Helix | 349 – 352 | Combined sources | 4
Turn | 356 – 359 | Combined sources | 4

3D structure databases

PDB entry | Method | Resolution (Å) | Chain | Positions
3RAU | X-ray | 1.95 | A/B | 2-361
5CRU | X-ray | 2.40 | A/B/C/D | 1-361
5CRV | X-ray | 2.00 | A/B | 1-361
ProteinModelPortal: Q9H3S7.
SMR: Q9H3S7.
ModBase: Search...
MobiDB: Search...

Miscellaneous databases

EvolutionaryTrace: Q9H3S7.

Family & Domains

Domains and Repeats

Feature key | Position(s) | Description | Length
Domain | 8 – 394 | BRO1 (PROSITE-ProRule annotation) | 387
Repeat | 250 – 283 | TPR 1 | 34
Repeat | 374 – 407 | TPR 2 | 34
Repeat | 953 – 954 | 1 | 2
Repeat | 955 – 956 | 2 | 2
Repeat | 957 – 958 | 3 | 2
Repeat | 959 – 960 | 4 | 2
Repeat | 961 – 962 | 5 | 2
Repeat | 963 – 964 | 6 | 2
Domain | 1192 – 1452 | Tyrosine-protein phosphatase (PROSITE-ProRule annotation) | 261

Region

Feature key | Position(s) | Description | Length
Region | 770 – 1130 | His | 361
Region | 953 – 964 | 6 X 2 AA approximate tandem repeats of P-Q | 12
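
To make the "6 X 2 AA approximate tandem repeats of P-Q" annotation concrete, the Python sketch below slices positions 953-964 out of residues 951-970 (copied from the sequence section of this entry) and splits the region into two-residue units. Feature coordinates are 1-based and inclusive, so the length is last - first + 1. The helper names are illustrative only.

```python
# Residues 951-970 of Q9H3S7, copied from the sequence section of this entry.
SEGMENT_START = 951
SEGMENT = "IGPQPQPHPQPHPSQAFGPQ"


def slice_region(segment, start, first, last):
    """Return residues first..last (1-based, inclusive) from an offset segment."""
    return segment[first - start:last - start + 1]


def split_units(region, unit_length):
    """Split a region into consecutive units of unit_length residues."""
    return [region[i:i + unit_length]
            for i in range(0, len(region), unit_length)]


region = slice_region(SEGMENT, SEGMENT_START, 953, 964)
units = split_units(region, 2)
exact = sum(1 for unit in units if unit == "PQ")

print(region)
print(units)
print(f"{exact} of {len(units)} units are exactly P-Q")
```

Counting the exact matches shows why the repeats are described as approximate: every unit starts with proline, but only some end in glutamine.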

Coiled coil

Feature key | Position(s) | Description | Length
Coiled coil | 550 – 623 | Sequence analysis | 74

Compositional bias

Feature key | Position(s) | Description | Length
Compositional bias | 716 – 1108 | Pro-rich | 393
Compositional bias | 1509 – 1573 | Pro-rich | 65

Sequence similarities

Contains 1 BRO1 domain. (PROSITE-ProRule annotation)
Contains 2 TPR repeats. (Curated)
Contains 1 tyrosine-protein phosphatase domain. (PROSITE-ProRule annotation)

Keywords - Domain

Coiled coil, Repeat, TPR repeat

Phylogenomic databases

eggNOG: KOG0789. Eukaryota.
KOG2220. Eukaryota.
ENOG410XQX6. LUCA.
GeneTree: ENSGT00780000121909.
HOVERGEN: HBG082231.
InParanoid: Q9H3S7.
KO: K18040.
OMA: SIQAPIP.
OrthoDB: EOG091G02XV.
PhylomeDB: Q9H3S7.
TreeFam: TF323502.

Family and domain databases

Gene3D: 1.25.40.280. 1 hit.
3.90.190.10. 1 hit.
InterPro: IPR025304. ALIX_V_dom.
IPR004328. BRO1_dom.
IPR029021. Prot-tyrosine_phosphatase-like.
IPR000242. PTPase_domain.
IPR028770. PTPN23.
IPR016130. Tyr_Pase_AS.
IPR003595. Tyr_Pase_cat.
IPR000387. TYR_PHOSPHATASE_dom.
PANTHER: PTHR19134:SF333. PTHR19134:SF333. 2 hits.
Pfam: PF13949. ALIX_LYPXL_bnd. 1 hit.
PF03097. BRO1. 1 hit.
PF00102. Y_phosphatase. 1 hit.
PRINTS: PR00700. PRTYPHPHTASE.
SMART: SM01041. BRO1. 1 hit.
SM00194. PTPc. 1 hit.
SM00404. PTPc_motif. 1 hit.
SUPFAM: SSF52799. SSF52799. 1 hit.
PROSITE: PS51180. BRO1. 1 hit.
PS00383. TYR_PHOSPHATASE_1. 1 hit.
PS50056. TYR_PHOSPHATASE_2. 1 hit.
PS50055. TYR_PHOSPHATASE_PTP. 1 hit.

Sequence

Sequence status: Complete.

Q9H3S7-1 [UniParc]

        10         20         30         40         50
MEAVPRMPMI WLDLKEAGDF HFQPAVKKFV LKNYGENPEA YNEELKKLEL
60 70 80 90 100
LRQNAVRVPR DFEGCSVLRK YLGQLHYLQS RVPMGSGQEA AVPVTWTEIF
110 120 130 140 150
SGKSVAHEDI KYEQACILYN LGALHSMLGA MDKRVSEEGM KVSCTHFQCA
160 170 180 190 200
AGAFAYLREH FPQAYSVDMS RQILTLNVNL MLGQAQECLL EKSMLDNRKS
210 220 230 240 250
FLVARISAQV VDYYKEACRA LENPDTASLL GRIQKDWKKL VQMKIYYFAA
260 270 280 290 300
VAHLHMGKQA EEQQKFGERV AYFQSALDKL NEAIKLAKGQ PDTVQDALRF
310 320 330 340 350
TMDVIGGKYN SAKKDNDFIY HEAVPALDTL QPVKGAPLVK PLPVNPTDPA
360 370 380 390 400
VTGPDIFAKL VPMAAHEASS LYSEEKAKLL REMMAKIEDK NEVLDQFMDS
410 420 430 440 450
MQLDPETVDN LDAYSHIPPQ LMEKCAALSV RPDTVRNLVQ SMQVLSGVFT
460 470 480 490 500
DVEASLKDIR DLLEEDELLE QKFQEAVGQA GAISITSKAE LAEVRREWAK
510 520 530 540 550
YMEVHEKASF TNSELHRAMN LHVGNLRLLS GPLDQVRAAL PTPALSPEDK
560 570 580 590 600
AVLQNLKRIL AKVQEMRDQR VSLEQQLREL IQKDDITASL VTTDHSEMKK
610 620 630 640 650
LFEEQLKKYD QLKVYLEQNL AAQDRVLCAL TEANVQYAAV RRVLSDLDQK
660 670 680 690 700
WNSTLQTLVA SYEAYEDLMK KSQEGRDFYA DLESKVAALL ERTQSTCQAR
710 720 730 740 750
EAARQQLLDR ELKKKPPPRP TAPKPLLPRR EESEAVEAGD PPEELRSLPP
760 770 780 790 800
DMVAGPRLPD TFLGSATPLH FPPSPFPSST GPGPHYLSGP LPPGTYSGPT
810 820 830 840 850
QLIQPRAPGP HAMPVAPGPA LYPAPAYTPE LGLVPRSSPQ HGVVSSPYVG
860 870 880 890 900
VGPAPPVAGL PSAPPPQFSG PELAMAVRPA TTTVDSIQAP IPSHTAPRPN
910 920 930 940 950
PTPAPPPPCF PVPPPQPLPT PYTYPAGAKQ PIPAQHHFSS GIPAGFPAPR
960 970 980 990 1000
IGPQPQPHPQ PHPSQAFGPQ PPQQPLPLQH PHLFPPQAPG LLPPQSPYPY
1010 1020 1030 1040 1050
APQPGVLGQP PPPLHTQLYP GPAQDPLPAH SGALPFPSPG PPQPPHPPLA
1060 1070 1080 1090 1100
YGPAPSTRPM GPQAAPLTIR GPSSAGQSTP SPHLVPSPAP SPGPGPVPPR
1110 1120 1130 1140 1150
PPAAEPPPCL RRGAAAADLL SSSPESQHGG TQSPGGGQPL LQPTKVDAAE
1160 1170 1180 1190 1200
GRRPQALRLI ERDPYEHPER LRQLQQELEA FRGQLGDVGA LDTVWRELQD
1210 1220 1230 1240 1250
AQEHDARGRS IAIARCYSLK NRHQDVMPYD SNRVVLRSGK DDYINASCVE
1260 1270 1280 1290 1300
GLSPYCPPLV ATQAPLPGTA ADFWLMVHEQ KVSVIVMLVS EAEMEKQKVA
1310 1320 1330 1340 1350
RYFPTERGQP MVHGALSLAL SSVRSTETHV ERVLSLQFRD QSLKRSLVHL
1360 1370 1380 1390 1400
HFPTWPELGL PDSPSNLLRF IQEVHAHYLH QRPLHTPIIV HCSSGVGRTG
1410 1420 1430 1440 1450
AFALLYAAVQ EVEAGNGIPE LPQLVRRMRQ QRKHMLQEKL HLRFCYEAVV
1460 1470 1480 1490 1500
RHVEQVLQRH GVPPPCKPLA SASISQKNHL PQDSQDLVLG GDVPISSIQA
1510 1520 1530 1540 1550
TIAKLSIRPP GGLESPVASL PGPAEPPGLP PASLPESTPI PSSSPPPLSS
1560 1570 1580 1590 1600
PLPEAPQPKE EPPVPEAPSS GPPSSSLELL ASLTPEAFSL DSSLRGKQRM
1610 1620 1630
SKHNFLQAHN GQGLRATRPS DDPLSLLDPL WTLNKT
Length: 1,636
Mass (Da): 178,974
Last modified: March 1, 2001 - v1
Checksum: 536BDDF9D3DC95C0

Sequence caution

The sequence BAA95995 differs from that shown. Reason: Erroneous initiation. Translation N-terminally shortened. (Curated)

Experimental Info

Feature key | Position(s) | Description | Length
Sequence conflict | 647 | L → G in CAB53676 (PubMed:17974005). (Curated) | 1
Sequence conflict | 1087 | S → P in CAB53676 (PubMed:17974005). (Curated) | 1

Natural variant

Feature key | Position(s) | Description | Length
Natural variant (VAR_022682) | 944 | A → T. (2 publications) Corresponds to variant rs6780013 (dbSNP, Ensembl). | 1
Natural variant (VAR_022683) | 1099 | P → S in a lung cancer cell line; may be a common polymorphism. (1 publication) Corresponds to variant rs149563514 (dbSNP, Ensembl). | 1

Sequence databases

AB025194 mRNA. Translation: BAB19280.1.
AF290614 mRNA. Translation: AAK28025.1.
AK289502 mRNA. Translation: BAF82191.1.
CH471055 Genomic DNA. Translation: EAW64823.1.
BC004881 mRNA. Translation: AAH04881.2.
BC027711 mRNA. Translation: AAH27711.2.
BC089042 mRNA. Translation: AAH89042.1.
AB040904 mRNA. Translation: BAA95995.2. Different initiation.
AL110210 mRNA. Translation: CAB53676.1.
BT009758 mRNA. Translation: AAP88760.1.
AF169350 mRNA. Translation: AAD50276.1.
CCDS: CCDS2754.1.
PIR: T14756.
RefSeq: NP_001291411.1. NM_001304482.1.
NP_056281.1. NM_015466.3.
UniGene: Hs.25524.

Genome annotation databases

Ensembl: ENST00000265562; ENSP00000265562; ENSG00000076201.
GeneID: 25930.
KEGG: hsa:25930.
UCSC: uc003crf.2. human.

Keywords - Coding sequence diversity

Polymorphism

Cross-references

Protocols and materials databases

DNASU: 25930.
Structural Biology Knowledgebase: Search...

Organism-specific databases

CTD: 25930.
DisGeNET: 25930.
GeneCards: PTPN23.
HGNC: HGNC:14406. PTPN23.
HPA: HPA016845.
MIM: 606584. gene.
neXtProt: NX_Q9H3S7.
OpenTargets: ENSG00000076201.
PharmGKB: PA33996.
HUGE: Search...
GenAtlas: Search...

Miscellaneous databases

ChiTaRS: PTPN23. human.
EvolutionaryTrace: Q9H3S7.
GeneWiki: PTPN23.
GenomeRNAi: 25930.
PRO: Q9H3S7.
SOURCE: Search...

Family and domain databases

ProtoNet: Search...

Entry information

Entry name: PTN23_HUMAN
Accession: Primary (citable) accession number: Q9H3S7
Secondary accession number(s): A8K0D7, Q7KZF8, Q8N6Z5, Q9BSR5, Q9P257, Q9UG03, Q9UMZ4
Entry history
Integrated into UniProtKB/Swiss-Prot: June 21, 2005
Last sequence update: March 1, 2001
Last modified: November 2, 2016
This is version 140 of the entry and version 1 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

Keywords - Technical term

3D-structure, Complete proteome, Reference proteome

Documents

  1. Human chromosome 3
    Human chromosome 3: entries, gene names and cross-references to MIM
  2. Human entries with polymorphisms or disease mutations
    List of human entries with polymorphisms or disease mutations
  3. Human polymorphisms and disease mutations
    Index of human polymorphisms and disease mutations
  4. MIM cross-references
    Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot
  5. PDB cross-references
    Index of Protein Data Bank (PDB) cross-references
  6. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.
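
The UniRef90 and UniRef50 criteria above amount to two thresholds measured against the seed (longest) sequence. The Python sketch below encodes only those stated thresholds as a membership test. It is not the actual UniRef clustering procedure, the function name `meets_threshold` is invented for this example, and the identity and overlap values are assumed to come from an alignment computed elsewhere.

```python
# Thresholds as stated in the text: (minimum identity, minimum overlap),
# both expressed as fractions relative to the seed sequence.
# UniRef100 is defined differently (identical sequences and sub-fragments),
# so it is deliberately not modelled here.
THRESHOLDS = {
    "UniRef90": (0.90, 0.80),
    "UniRef50": (0.50, 0.80),
}


def meets_threshold(level, identity, overlap):
    """Return True if a sequence satisfies the stated criteria for `level`.

    identity and overlap are fractions in [0, 1], assumed to be precomputed
    from an alignment against the cluster's seed sequence.
    """
    min_identity, min_overlap = THRESHOLDS[level]
    return identity >= min_identity and overlap >= min_overlap


# Hypothetical alignment results, for illustration only.
print(meets_threshold("UniRef90", identity=0.93, overlap=0.85))
print(meets_threshold("UniRef90", identity=0.93, overlap=0.70))
print(meets_threshold("UniRef50", identity=0.60, overlap=0.80))
```

Both conditions must hold at once, so a highly identical but short fragment fails the overlap test and would not join a UniRef90 or UniRef50 cluster under these criteria.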