Protein

Dyslexia susceptibility 1 candidate gene 1 protein

Gene

DYX1C1

Organism
Homo sapiens (Human)
Status
Reviewed - Annotation score: 5 out of 5 - Experimental evidence at protein level

Function

Involved in neuronal migration during development of the cerebral neocortex. May regulate the stability and proteasomal degradation of the estrogen receptors that play an important role in neuronal differentiation, survival and plasticity. Axonemal dynein assembly factor required for ciliary motility. (2 publications)

GO - Molecular function

  • estrogen receptor binding Source: UniProtKB

GO - Biological process

  • cilium movement Source: UniProtKB
  • determination of left/right symmetry Source: UniProtKB
  • inner dynein arm assembly Source: UniProtKB
  • neuron migration Source: UniProtKB
  • outer dynein arm assembly Source: UniProtKB
  • regulation of intracellular estrogen receptor signaling pathway Source: UniProtKB
  • regulation of proteasomal protein catabolic process Source: UniProtKB

Keywords - Biological process

Neurogenesis

Names & Taxonomy

Protein names
Recommended name:
Dyslexia susceptibility 1 candidate gene 1 protein
Gene names
Name: DYX1C1
Synonyms: EKN1
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes
  • UP000005640 Component: Chromosome 15

Organism-specific databases

HGNC: HGNC:21493. DYX1C1.

Subcellular location

GO - Cellular component

  • cytoplasm Source: UniProtKB
  • nucleus Source: UniProtKB
  • plasma membrane Source: HPA

Keywords - Cellular component

Cytoplasm, Nucleus

Pathology & Biotech

Involvement in disease

Dyslexia 1 (DYX1)
Disease susceptibility is associated with variations affecting the gene represented in this entry. A chromosomal aberration involving DYX1C1 has been found in a family affected by dyslexia. Translocation t(2;15)(q11;q21).
Disease description: A relatively common, complex cognitive disorder characterized by an impairment of reading performance despite adequate motivational, educational and intellectual opportunities. It is a multifactorial trait, with evidence for familial clustering and heritability.
See also OMIM:127700
Ciliary dyskinesia, primary, 25 (CILD25) (2 publications)
The disease is caused by mutations affecting the gene represented in this entry.
Disease description: A disorder characterized by abnormalities of motile cilia. Respiratory infections leading to chronic inflammation and bronchiectasis are recurrent, due to defects in the respiratory cilia. Patients may exhibit randomization of left-right body asymmetry and situs inversus, due to dysfunction of monocilia at the embryonic node. Primary ciliary dyskinesia associated with situs inversus is referred to as Kartagener syndrome.
See also OMIM:615482

Keywords - Disease

Ciliopathy, Primary ciliary dyskinesia

Organism-specific databases

MalaCards: DYX1C1.
MIM: 127700. phenotype.
  615482. phenotype.
Orphanet: 244. Primary ciliary dyskinesia.
PharmGKB: PA134870600.

Polymorphism and mutation databases

BioMuta: DYX1C1.
DMDM: 209572610.

PTM / Processing

Molecule processing

Feature key  Position(s)  Length  Description                                         Feature identifier
Chain        1 – 420      420     Dyslexia susceptibility 1 candidate gene 1 protein  PRO_0000106284

Proteomic databases

MaxQB: Q8WXU2.
PaxDb: Q8WXU2.
PRIDE: Q8WXU2.

PTM databases

iPTMnet: Q8WXU2.
PhosphoSite: Q8WXU2.

Expression

Tissue specificity

Expressed in several tissues, including brain, lung, kidney and testis. In brain localizes to a fraction of cortical neurons and white matter glial cells. (1 publication)

Gene expression databases

Bgee: Q8WXU2.
CleanEx: HS_DYX1C1.
ExpressionAtlas: Q8WXU2. baseline and differential.
Genevisible: Q8WXU2. HS.

Organism-specific databases

HPA: HPA051048.

Interaction

Subunit structure

Interacts with ESR1 and ESR2. Interacts with STUB1. Interacts with DNAAF2. Interacts with CCT3, CCT4, CCT5 and CCT8 (By similarity).

Binary interactions

With       Entry   #Exp.  IntAct
GABARAP    O95166  4      EBI-2946907,EBI-712001
GABARAPL1  Q9H0R8  2      EBI-2946907,EBI-746969
GABARAPL2  P60520  2      EBI-2946907,EBI-720116
MAP1LC3C   Q9BXW4  2      EBI-2946907,EBI-2603996
SPDL1      Q96EA4  3      EBI-2946907,EBI-715381

Protein-protein interaction databases

BioGrid: 127797. 10 interactions.
IntAct: Q8WXU2. 9 interactions.
MINT: MINT-8417659.
STRING: 9606.ENSP00000323275.

Structure

3D structure databases

ProteinModelPortal: Q8WXU2.
SMR: Q8WXU2. Positions 285-400.

Family & Domains

Domains and Repeats

Feature key  Position(s)  Length  Description
Domain       3 – 87       85      CS (PROSITE-ProRule annotation)
Repeat       290 – 323    34      TPR 1
Repeat       324 – 357    34      TPR 2
Repeat       366 – 399    34      TPR 3

Region

Feature key  Position(s)  Length  Description
Region       7 – 103      97      Mediates interaction with ESR1 and STUB1
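
The Length column in the feature tables above follows from the positions, which are 1-based and inclusive. The following is a minimal Python sketch of that arithmetic, using only the coordinates listed above. The helper names `feature_length` and `overlap` are hypothetical and are not part of any UniProt tool.

```python
# Feature coordinates copied from the tables above (1-based, inclusive).
FEATURES = [
    ("Domain CS", 3, 87),
    ("Repeat TPR 1", 290, 323),
    ("Repeat TPR 2", 324, 357),
    ("Repeat TPR 3", 366, 399),
    ("Region (interaction with ESR1 and STUB1)", 7, 103),
]


def feature_length(start: int, end: int) -> int:
    """Length of a 1-based inclusive range."""
    return end - start + 1


def overlap(a_start: int, a_end: int, b_start: int, b_end: int) -> int:
    """Number of positions shared by two inclusive ranges (0 if disjoint)."""
    return max(0, min(a_end, b_end) - max(a_start, b_start) + 1)


if __name__ == "__main__":
    for name, start, end in FEATURES:
        print(f"{name}: {start}-{end}, length {feature_length(start, end)}")
    # The CS domain and the ESR1/STUB1 interaction region share most residues.
    print("CS domain / interaction region overlap:", overlap(3, 87, 7, 103))
```

The same inclusive convention applies to the variant and alternative-sequence positions elsewhere in this entry.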

Sequence similarities

Contains 1 CS domain. (PROSITE-ProRule annotation)
Contains 3 TPR repeats. (PROSITE-ProRule annotation)

Keywords - Domain

Repeat, TPR repeat

Phylogenomic databases

eggNOG: KOG1124. Eukaryota.
  COG0457. LUCA.
GeneTree: ENSGT00390000004930.
HOGENOM: HOG000046868.
HOVERGEN: HBG051427.
InParanoid: Q8WXU2.
KO: K19758.
OMA: MPLQVSD.
PhylomeDB: Q8WXU2.
TreeFam: TF328983.

Family and domain databases

Gene3D: 1.25.40.10. 1 hit.
  2.60.40.790. 1 hit.
InterPro: IPR007052. CS_dom.
  IPR008978. HSP20-like_chaperone.
  IPR013026. TPR-contain_dom.
  IPR011990. TPR-like_helical_dom.
  IPR019734. TPR_repeat.
Pfam: PF04969. CS. 1 hit.
SMART: SM00028. TPR. 3 hits.
SUPFAM: SSF48452. SSF48452. 1 hit.
  SSF49764. SSF49764. 2 hits.
PROSITE: PS51203. CS. 1 hit.
  PS50005. TPR. 3 hits.
  PS50293. TPR_REGION. 1 hit.

Sequences (3)

Sequence status: Complete.

This entry describes 3 isoforms produced by alternative splicing.

Isoform 1 (identifier: Q8WXU2-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MPLQVSDYSW QQTKTAVFLS LPLKGVCVRD TDVFCTENYL KVNFPPFLFE
        60         70         80         90        100
AFLYAPIDDE SSKAKIGNDT IVFTLYKKEA AMWETLSVTG VDKEMMQRIR
       110        120        130        140        150
EKSILQAQER AKEATEAKAA AKREDQKYAL SVMMKIEEEE RKKIEDMKEN
       160        170        180        190        200
ERIKATKALE AWKEYQRKAE EQKKIQREEK LCQKEKQIKE ERKKIKYKSL
       210        220        230        240        250
TRNLASRNLA PKGRNSENIF TEKLKEDSIP APRSVGSIKI NFTPRVFPTA
       260        270        280        290        300
LRESQVAEEE EWLHKQAEAR RAMNTDIAEL CDLKEEEKNP EWLKDKGNKL
       310        320        330        340        350
FATENYLAAI NAYNLAIRLN NKMPLLYLNR AACHLKLKNL HKAIEDSSKA
       360        370        380        390        400
LELLMPPVTD NANARMKAHV RRGTAFCQLE LYVEGLQDYE AALKIDPSNK
       410        420
IVQIDAEKIR NVIQGTELKS
Length: 420
Mass (Da): 48,527
Last modified: October 14, 2008 - v2
Checksum: 6B8729B3F4ED5108
Isoform 2 (identifier: Q8WXU2-2)

The sequence of this isoform differs from the canonical sequence as follows:
     350-381: ALELLMPPVTDNANARMKAHVRRGTAFCQLEL → EFCSLEGIECQASEPKLSHHIPSDLHVYIQMA
     382-420: Missing.

Note: No experimental confirmation available.
Length: 381
Mass (Da): 44,201
Checksum: D1CCD335FF5A0C48
Isoform 3 (identifier: Q8WXU2-3)

The sequence of this isoform differs from the canonical sequence as follows:
     351-420: LELLMPPVTD...NVIQGTELKS → YRIMKRHLRLIHPTKLYKLMLRRFGM

Note: No experimental confirmation available.
Length: 376
Mass (Da): 44,031
Checksum: 3761D3EEF990B85E
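
The isoform descriptions above can be replayed mechanically against the canonical sequence. The following is a minimal Python sketch, assuming that all coordinates are 1-based, inclusive and refer to isoform 1, as this entry states. The helper name `splice` is hypothetical and not part of any UniProt tool; the sequence and replacement strings are copied from this entry.

```python
# Canonical sequence (isoform 1, Q8WXU2-1), copied from this entry.
CANONICAL = (
    "MPLQVSDYSWQQTKTAVFLSLPLKGVCVRDTDVFCTENYLKVNFPPFLFE"
    "AFLYAPIDDESSKAKIGNDTIVFTLYKKEAAMWETLSVTGVDKEMMQRIR"
    "EKSILQAQERAKEATEAKAAAKREDQKYALSVMMKIEEEERKKIEDMKEN"
    "ERIKATKALEAWKEYQRKAEEQKKIQREEKLCQKEKQIKEERKKIKYKSL"
    "TRNLASRNLAPKGRNSENIFTEKLKEDSIPAPRSVGSIKINFTPRVFPTA"
    "LRESQVAEEEEWLHKQAEARRAMNTDIAELCDLKEEEKNPEWLKDKGNKL"
    "FATENYLAAINAYNLAIRLNNKMPLLYLNRAACHLKLKNLHKAIEDSSKA"
    "LELLMPPVTDNANARMKAHVRRGTAFCQLELYVEGLQDYEAALKIDPSNK"
    "IVQIDAEKIRNVIQGTELKS"
)


def splice(seq: str, edits: list[tuple[int, int, str]]) -> str:
    """Apply (start, end, replacement) edits in canonical coordinates.

    Positions are 1-based and inclusive. Edits are applied from the
    C-terminus backwards so earlier coordinates stay valid.
    """
    out = seq
    for start, end, replacement in sorted(edits, reverse=True):
        out = out[: start - 1] + replacement + out[end:]
    return out


# Isoform 2: residues 350-381 replaced, residues 382-420 missing.
ISOFORM_2 = splice(
    CANONICAL,
    [(350, 381, "EFCSLEGIECQASEPKLSHHIPSDLHVYIQMA"), (382, 420, "")],
)

# Isoform 3: residues 351-420 replaced by a shorter C-terminus.
ISOFORM_3 = splice(CANONICAL, [(351, 420, "YRIMKRHLRLIHPTKLYKLMLRRFGM")])

if __name__ == "__main__":
    for name, seq in [("1", CANONICAL), ("2", ISOFORM_2), ("3", ISOFORM_3)]:
        print(f"Isoform {name}: {len(seq)} residues")
```

The resulting lengths can be compared with the Length values listed for each isoform.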

Natural variant

Feature key      Position(s)  Length  Description              dbSNP        Feature identifier
Natural variant  2 – 2        1       P → S. (1 publication)   rs143493699  VAR_017383
Natural variant  38 – 38      1       N → K.                   rs16976354   VAR_026214
Natural variant  91 – 91      1       V → I. (1 publication)   rs17819126   VAR_017384
Natural variant  191 – 191    1       E → G. (2 publications)  rs600753     VAR_017385
Natural variant  332 – 332    1       A → V. (1 publication)   rs17855756   VAR_026215
Natural variant  420 – 420    1       S → C. (1 publication)   -            VAR_017386
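
Each natural variant names a position, the canonical residue and its substitution, so the table can be cross-checked against the canonical sequence. The following is a minimal Python sketch, assuming 1-based positions on isoform 1. The function name `apply_variant` is hypothetical; the sequence and variant data are copied from this entry.

```python
# Canonical sequence (isoform 1, Q8WXU2-1), copied from this entry.
CANONICAL = (
    "MPLQVSDYSWQQTKTAVFLSLPLKGVCVRDTDVFCTENYLKVNFPPFLFE"
    "AFLYAPIDDESSKAKIGNDTIVFTLYKKEAAMWETLSVTGVDKEMMQRIR"
    "EKSILQAQERAKEATEAKAAAKREDQKYALSVMMKIEEEERKKIEDMKEN"
    "ERIKATKALEAWKEYQRKAEEQKKIQREEKLCQKEKQIKEERKKIKYKSL"
    "TRNLASRNLAPKGRNSENIFTEKLKEDSIPAPRSVGSIKINFTPRVFPTA"
    "LRESQVAEEEEWLHKQAEARRAMNTDIAELCDLKEEEKNPEWLKDKGNKL"
    "FATENYLAAINAYNLAIRLNNKMPLLYLNRAACHLKLKNLHKAIEDSSKA"
    "LELLMPPVTDNANARMKAHVRRGTAFCQLELYVEGLQDYEAALKIDPSNK"
    "IVQIDAEKIRNVIQGTELKS"
)

# (position, canonical residue, variant residue, feature identifier)
VARIANTS = [
    (2, "P", "S", "VAR_017383"),
    (38, "N", "K", "VAR_026214"),
    (91, "V", "I", "VAR_017384"),
    (191, "E", "G", "VAR_017385"),
    (332, "A", "V", "VAR_026215"),
    (420, "S", "C", "VAR_017386"),
]


def apply_variant(seq: str, position: int, ref: str, alt: str) -> str:
    """Return seq with a single-residue substitution at a 1-based position.

    Raises ValueError if the residue found at that position is not `ref`.
    """
    found = seq[position - 1]
    if found != ref:
        raise ValueError(f"expected {ref} at {position}, found {found}")
    return seq[: position - 1] + alt + seq[position:]


if __name__ == "__main__":
    for position, ref, alt, feature_id in VARIANTS:
        apply_variant(CANONICAL, position, ref, alt)  # raises on mismatch
        print(f"{feature_id}: {ref}{position}{alt} matches the canonical residue")
```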

Alternative sequence

Feature key           Position(s)  Length  Description                                                                   Feature identifier
Alternative sequence  350 – 381    32      ALELL…CQLEL → EFCSLEGIECQASEPKLSHHIPSDLHVYIQMA in isoform 2. (1 publication)  VSP_011822
Alternative sequence  351 – 420    70      LELLM…TELKS → YRIMKRHLRLIHPTKLYKLMLRRFGM in isoform 3. (1 publication)        VSP_041379
Alternative sequence  382 – 420    39      Missing in isoform 2. (1 publication)                                         VSP_011823

Sequence databases

EMBL / GenBank / DDBJ:
  AF337549 mRNA. Translation: AAL73230.1.
  AK095201 mRNA. Translation: BAC04498.1.
  AC013355 Genomic DNA. No translation available.
  AC022083 Genomic DNA. No translation available.
  BC062564 mRNA. Translation: AAH62564.1.
CCDS: CCDS10154.1. [Q8WXU2-1]
  CCDS32243.1. [Q8WXU2-2]
  CCDS32244.1. [Q8WXU2-3]
RefSeq: NP_001028731.1. NM_001033559.2. [Q8WXU2-3]
  NP_001028732.1. NM_001033560.1. [Q8WXU2-2]
  NP_570722.2. NM_130810.3. [Q8WXU2-1]
UniGene: Hs.126403.

Genome annotation databases

Ensembl: ENST00000321149; ENSP00000323275; ENSG00000256061. [Q8WXU2-1]
  ENST00000348518; ENSP00000299561; ENSG00000256061. [Q8WXU2-1]
  ENST00000448430; ENSP00000403412; ENSG00000256061. [Q8WXU2-2]
  ENST00000457155; ENSP00000402640; ENSG00000256061. [Q8WXU2-3]
GeneID: 161582.
KEGG: hsa:161582.
UCSC: uc002adc.3. human. [Q8WXU2-1]

Keywords - Coding sequence diversity

Alternative splicing, Chromosomal rearrangement, Polymorphism

Cross-references

Organism-specific databases

CTD: 161582.
GeneCards: DYX1C1.
HGNC: HGNC:21493. DYX1C1.
HPA: HPA051048.
MalaCards: DYX1C1.
MIM: 127700. phenotype.
  608706. gene.
  615482. phenotype.
neXtProt: NX_Q8WXU2.
Orphanet: 244. Primary ciliary dyskinesia.
PharmGKB: PA134870600.

Miscellaneous databases

GeneWiki: DYX1C1.
GenomeRNAi: 161582.
PRO: Q8WXU2.

Publications

  1. "A candidate gene for developmental dyslexia encodes a nuclear tetratricopeptide repeat domain protein dynamically regulated in brain."
    Taipale M., Kaminen N., Nopola-Hemmi J., Haltia T., Myllyluoma B., Lyytinen H., Muller K., Kaaranen M., Lindsberg P.J., Hannula-Jouppi K., Kere J.
    Proc. Natl. Acad. Sci. U.S.A. 100:11553-11558(2003)
    Cited for: NUCLEOTIDE SEQUENCE [MRNA] (ISOFORM 1), SUBCELLULAR LOCATION, TISSUE SPECIFICITY, CHROMOSOMAL TRANSLOCATION, VARIANTS SER-2; ILE-91; GLY-191 AND CYS-420.
  2. "Complete sequencing and characterization of 21,243 full-length human cDNAs."
    Ota T., Suzuki Y., Nishikawa T., Otsuki T., Sugiyama T., Irie R., Wakamatsu A., Hayashi K., Sato H., Nagai K., Kimura K., Makita H., Sekine M., Obayashi M., Nishi T., Shibahara T., Tanaka T., Ishii S., Yamamoto J., Saito K., Kawai Y., Isono Y., Nakamura Y., Nagahari K., Murakami K., Yasuda T., Iwayanagi T., Wagatsuma M., Shiratori A., Sudo H., Hosoiri T., Kaku Y., Kodaira H., Kondo H., Sugawara M., Takahashi M., Kanda K., Yokoi T., Furuya T., Kikkawa E., Omura Y., Abe K., Kamihara K., Katsuta N., Sato K., Tanikawa M., Yamazaki M., Ninomiya K., Ishibashi T., Yamashita H., Murakawa K., Fujimori K., Tanai H., Kimata M., Watanabe M., Hiraoka S., Chiba Y., Ishida S., Ono Y., Takiguchi S., Watanabe S., Yosida M., Hotuta T., Kusano J., Kanehori K., Takahashi-Fujii A., Hara H., Tanase T.-O., Nomura Y., Togiya S., Komai F., Hara R., Takeuchi K., Arita M., Imose N., Musashino K., Yuuki H., Oshima A., Sasaki N., Aotsuka S., Yoshikawa Y., Matsunawa H., Ichihara T., Shiohata N., Sano S., Moriya S., Momiyama H., Satoh N., Takami S., Terashima Y., Suzuki O., Nakagawa S., Senoh A., Mizoguchi H., Goto Y., Shimizu F., Wakebe H., Hishigaki H., Watanabe T., Sugiyama A., Takemoto M., Kawakami B., Yamazaki M., Watanabe K., Kumagai A., Itakura S., Fukuzumi Y., Fujimori Y., Komiyama M., Tashiro H., Tanigami A., Fujiwara T., Ono T., Yamada K., Fujii Y., Ozaki K., Hirao M., Ohmori Y., Kawabata A., Hikiji T., Kobatake N., Inagaki H., Ikema Y., Okamoto S., Okitani R., Kawakami T., Noguchi S., Itoh T., Shigeta K., Senba T., Matsumura K., Nakajima Y., Mizuno T., Morinaga M., Sasaki M., Togashi T., Oyama M., Hata H., Watanabe M., Komatsu T., Mizushima-Sugano J., Satoh T., Shirai Y., Takahashi Y., Nakagawa K., Okumura K., Nagase T., Nomura N., Kikuchi H., Masuho Y., Yamashita R., Nakai K., Yada T., Nakamura Y., Ohara O., Isogai T., Sugano S.
    Nat. Genet. 36:40-45(2004)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORM 3).
    Tissue: Subthalamic nucleus.
  3. "Analysis of the DNA sequence and duplication history of human chromosome 15."
    Zody M.C., Garber M., Sharpe T., Young S.K., Rowen L., O'Neill K., Whittaker C.A., Kamal M., Chang J.L., Cuomo C.A., Dewar K., FitzGerald M.G., Kodira C.D., Madan A., Qin S., Yang X., Abbasi N., Abouelleil A., Arachchi H.M., Baradarani L., Birditt B., Bloom S., Bloom T., Borowsky M.L., Burke J., Butler J., Cook A., DeArellano K., DeCaprio D., Dorris L. III, Dors M., Eichler E.E., Engels R., Fahey J., Fleetwood P., Friedman C., Gearin G., Hall J.L., Hensley G., Johnson E., Jones C., Kamat A., Kaur A., Locke D.P., Madan A., Munson G., Jaffe D.B., Lui A., Macdonald P., Mauceli E., Naylor J.W., Nesbitt R., Nicol R., O'Leary S.B., Ratcliffe A., Rounsley S., She X., Sneddon K.M.B., Stewart S., Sougnez C., Stone S.M., Topham K., Vincent D., Wang S., Zimmer A.R., Birren B.W., Hood L., Lander E.S., Nusbaum C.
    Nature 440:671-675(2006)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
  4. "The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC)."
    The MGC Project Team
    Genome Res. 14:2121-2127(2004)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORM 2), VARIANTS GLY-191 AND VAL-332.
  5. Cited for: SUBCELLULAR LOCATION.
  6. "Functional interaction of DYX1C1 with estrogen receptors suggests involvement of hormonal pathways in dyslexia."
    Massinen S., Tammimies K., Tapia-Paez I., Matsson H., Hokkanen M.E., Soederberg O., Landegren U., Castren E., Gustafsson J.A., Treuter E., Kere J.
    Hum. Mol. Genet. 18:2802-2812(2009)
    Cited for: FUNCTION, INTERACTION WITH ESR1 AND STUB1, SUBCELLULAR LOCATION.
  7. Cited for: FUNCTION, SUBCELLULAR LOCATION, INTERACTION WITH DNAAF2, INVOLVEMENT IN CILD25.
  8. "Ciliary beat pattern and frequency in genetic variants of primary ciliary dyskinesia."
    Raidt J., Wallmeier J., Hjeij R., Onnebrink J.G., Pennekamp P., Loges N.T., Olbrich H., Haeffner K., Dougherty G.W., Omran H., Werner C.
    Eur. Respir. J. 44:1579-1588(2014)
    Cited for: INVOLVEMENT IN CILD25.

Entry information

Entry name: DYXC1_HUMAN
Accession: Primary (citable) accession number: Q8WXU2
Secondary accession number(s): Q6P5Y9, Q8N1S6
Entry history
Integrated into UniProtKB/Swiss-Prot: November 21, 2003
Last sequence update: October 14, 2008
Last modified: June 8, 2016
This is version 140 of the entry and version 2 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

Keywords - Technical term

Complete proteome, Reference proteome

Documents

  1. Human chromosome 15
    Human chromosome 15: entries, gene names and cross-references to MIM
  2. Human entries with polymorphisms or disease mutations
    List of human entries with polymorphisms or disease mutations
  3. Human polymorphisms and disease mutations
    Index of human polymorphisms and disease mutations
  4. MIM cross-references
    Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot
  5. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.
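
The membership rules quoted above can be written as simple predicates. The following is a minimal Python sketch of the stated thresholds only; it is not the UniRef clustering procedure itself and does not compute alignments. The function names are hypothetical, and the identity and overlap fractions are assumed to come from an alignment tool that is not shown here.

```python
def fits_uniref100(candidate: str, representative: str, min_fragment: int = 11) -> bool:
    """True if candidate is identical to the representative sequence, or is a
    sub-fragment of it with at least `min_fragment` residues."""
    if candidate == representative:
        return True
    return len(candidate) >= min_fragment and candidate in representative


def fits_identity_cluster(
    identity: float,
    overlap: float,
    min_identity: float,
    min_overlap: float = 0.80,
) -> bool:
    """True if pre-computed identity and overlap fractions (0.0 to 1.0) with
    the seed sequence meet the quoted thresholds.

    Use min_identity=0.90 for the UniRef90 rule, 0.50 for the UniRef50 rule.
    """
    return identity >= min_identity and overlap >= min_overlap


if __name__ == "__main__":
    seed = "MPLQVSDYSWQQTKTAVFLS"  # first 20 residues of the canonical sequence
    print(fits_uniref100("MPLQVSDYSWQ", seed))       # 11-residue fragment
    print(fits_uniref100("MPLQVSDYSW", seed))        # 10 residues, too short
    print(fits_identity_cluster(0.92, 0.85, 0.90))   # UniRef90 thresholds
```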