Protein

Kinetochore-associated protein DSN1 homolog

Gene

DSN1

Organism
Homo sapiens (Human)
Status
Reviewed. Annotation score: 5 out of 5. Experimental evidence at protein level.

Function

Part of the MIS12 complex, which is required for normal chromosome alignment and segregation and for kinetochore formation during mitosis. (2 publications)

Keywords - Biological process

Cell cycle, Cell division, Chromosome partition, Mitosis

Enzyme and pathway databases

Reactome: R-HSA-2467813. Separation of Sister Chromatids.
  R-HSA-2500257. Resolution of Sister Chromatid Cohesion.
  R-HSA-5663220. RHO GTPases Activate Formins.
  R-HSA-68877. Mitotic Prometaphase.

Names & Taxonomy

Protein names
Recommended name: Kinetochore-associated protein DSN1 homolog
Gene names
Name: DSN1
Synonyms: C20orf172, MIS13
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes
  • UP000005640 Component: Chromosome 20

Organism-specific databases

HGNC: HGNC:16165. DSN1.

Subcellular location

GO - Cellular component

  • cytoplasm Source: HPA
  • cytosol Source: Reactome
  • MIS12/MIND type complex Source: UniProtKB
  • nucleus Source: HPA

Keywords - Cellular component

Centromere, Chromosome, Kinetochore, Nucleus

Pathology & Biotech

Organism-specific databases

PharmGKB: PA162384106.

Polymorphism and mutation databases

BioMuta: DSN1.
DMDM: 28201793.

PTM / Processing

Molecule processing

Feature key  Position(s)  Length  Description                                  Feature identifier
Chain        1 – 356      356     Kinetochore-associated protein DSN1 homolog  PRO_0000079481

Amino acid modifications

Feature key       Position(s)  Length  Description
Modified residue  28           1       Phosphoserine (Combined sources)
Modified residue  30           1       Phosphoserine (Combined sources)
Modified residue  58           1       Phosphoserine (Combined sources)
Modified residue  77           1       Phosphoserine (Combined sources)
Modified residue  81           1       Phosphoserine (Combined sources)
Modified residue  331          1       Phosphoserine (By similarity)

Keywords - PTM

Phosphoprotein

Proteomic databases

EPD: Q9H410.
MaxQB: Q9H410.
PaxDb: Q9H410.
PRIDE: Q9H410.

PTM databases

iPTMnet: Q9H410.
PhosphoSite: Q9H410.

Expression

Gene expression databases

Bgee: Q9H410.
CleanEx: HS_DSN1.
ExpressionAtlas: Q9H410. baseline and differential.
Genevisible: Q9H410. HS.

Organism-specific databases

HPA: HPA002813.

Interaction

Subunit structure

Component of the MIS12 complex composed of MIS12, DSN1, NSL1 and PMF1. Also interacts with CASC5, CBX3 and CBX5. Interacts with KNSTRN. (3 publications)

Protein-protein interaction databases

BioGrid: 123045. 55 interactions.
IntAct: Q9H410. 32 interactions.
MINT: MINT-4988891.
STRING: 9606.ENSP00000362850.

Structure

3D structure databases

ProteinModelPortal: Q9H410.

Family & Domains

Phylogenomic databases

eggNOG: ENOG410IHC2. Eukaryota.
  ENOG410YXFH. LUCA.
GeneTree: ENSGT00390000011347.
HOGENOM: HOG000231275.
HOVERGEN: HBG051208.
InParanoid: Q9H410.
KO: K11544.
OMA: SMKETNR.
PhylomeDB: Q9H410.
TreeFam: TF335504.

Family and domain databases

InterPro: IPR013218. Dsn1/Mis13.
Pfam: PF08202. MIS13. 1 hit.

Sequences (4)

Sequence status: Complete.

This entry describes 4 isoforms produced by alternative splicing.

Isoform 1 (identifier: Q9H410-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MTSVTRSEII DEKGPVMSKT HDHQLESSLS PVEVFAKTSA SLEMNQGVSE
        60         70         80         90        100
ERIHLGSSPK KGGNCDLSHQ ERLQSKSLHL SPQEQSASYQ DRRQSWRRAS
       110        120        130        140        150
MKETNRRKSL HPIHQGITEL SRSISVDLAE SKRLGCLLLS SFQFSIQKLE
       160        170        180        190        200
PFLRDTKGFS LESFRAKASS LSEELKHFAD GLETDGTLQK CFEDSNGKAS
       210        220        230        240        250
DFSLEASVAE MKEYITKFSL ERQTWDQLLL HYQQEAKEIL SRGSTEAKIT
       260        270        280        290        300
EVKVEPMTYL GSSQNEVLNT KPDYQKILQN QSKVFDCMEL VMDELQGSVK
       310        320        330        340        350
QLQAFMDEST QCFQKVSVQL GKRSMQQLDP SPARKLLKLQ LQNPPAIHGS
GSGSCQ
Length: 356
Mass (Da): 40,067
Last modified: October 1, 2001 - v2
Checksum: C75E113BE2A82A8B

Isoform 2 (identifier: Q9H410-2)

The sequence of this isoform differs from the canonical sequence as follows:
     1-84: MTSVTRSEII...SKSLHLSPQE → MIINWNQVSVLWKCLLK

Note: No experimental confirmation available.
Length: 289
Mass (Da): 32,928
Checksum: DF1BDBB113760B93

Isoform 4 (identifier: Q9H410-4)

The sequence of this isoform differs from the canonical sequence as follows:
     1-16: Missing.

Length: 340
Mass (Da): 38,323
Checksum: 2C2F449A8449F506

Isoform 3 (identifier: Q9H410-3)

The sequence of this isoform differs from the canonical sequence as follows:
     12-118: Missing.

Note: No experimental confirmation available.
Length: 249
Mass (Da): 27,998
Checksum: 9D49E8D25B012B6B
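
As a rough consistency check on the isoform descriptions above, the following Python sketch applies each listed change to the canonical sequence and prints the resulting lengths. It also checks that the positions annotated as phosphoserine in this entry are serine in the canonical sequence. The helper name `apply_variant` and the dictionary layout are illustrative assumptions, not part of any UniProt tool; the sequence, positions and replacement peptide are copied from this entry.

```python
# Canonical DSN1 sequence (isoform 1, Q9H410-1), copied from this entry.
# The 10-residue blocks are joined by dropping all whitespace.
CANONICAL = "".join("""
MTSVTRSEII DEKGPVMSKT HDHQLESSLS PVEVFAKTSA SLEMNQGVSE
ERIHLGSSPK KGGNCDLSHQ ERLQSKSLHL SPQEQSASYQ DRRQSWRRAS
MKETNRRKSL HPIHQGITEL SRSISVDLAE SKRLGCLLLS SFQFSIQKLE
PFLRDTKGFS LESFRAKASS LSEELKHFAD GLETDGTLQK CFEDSNGKAS
DFSLEASVAE MKEYITKFSL ERQTWDQLLL HYQQEAKEIL SRGSTEAKIT
EVKVEPMTYL GSSQNEVLNT KPDYQKILQN QSKVFDCMEL VMDELQGSVK
QLQAFMDEST QCFQKVSVQL GKRSMQQLDP SPARKLLKLQ LQNPPAIHGS
GSGSCQ
""".split())


def apply_variant(seq, start, end, replacement=""):
    """Replace residues start..end (1-based, inclusive) with `replacement`.

    An empty replacement corresponds to a "Missing" range in the entry.
    (Hypothetical helper written for this illustration.)
    """
    return seq[:start - 1] + replacement + seq[end:]


# (start, end, replacement) for each non-canonical isoform, as listed above.
ISOFORM_VARIANTS = {
    "Q9H410-2": (1, 84, "MIINWNQVSVLWKCLLK"),
    "Q9H410-3": (12, 118, ""),
    "Q9H410-4": (1, 16, ""),
}

# Positions annotated as phosphoserine in the PTM / Processing section.
PHOSPHOSERINES = (28, 30, 58, 77, 81, 331)

isoforms = {
    iso_id: apply_variant(CANONICAL, *variant)
    for iso_id, variant in ISOFORM_VARIANTS.items()
}

print("Q9H410-1", len(CANONICAL))
for iso_id in sorted(isoforms):
    print(iso_id, len(isoforms[iso_id]))

# Positions are 1-based in the entry, so subtract 1 for Python indexing.
non_serine = [p for p in PHOSPHOSERINES if CANONICAL[p - 1] != "S"]
print("non-serine phosphosite positions:", non_serine)
```

The derived lengths can be compared with the Length values stated for each isoform; masses and checksums are not recomputed here.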

Sequence caution

The sequence BAB14564.1 differs from that shown. Reason: Frameshift at position 277. Curated

Experimental Info

Feature key        Position(s)  Length  Description
Sequence conflict  142          1       F → S in BAC04024 (PubMed:14702039). Curated

Alternative sequence

Feature key           Position(s)  Length  Description                                                  Feature identifier
Alternative sequence  1 – 84       84      MTSVT…LSPQE → MIINWNQVSVLWKCLLK in isoform 2. 1 Publication  VSP_003826
Alternative sequence  1 – 16       16      Missing in isoform 4. 1 Publication                          VSP_044281
Alternative sequence  12 – 118     107     Missing in isoform 3. 1 Publication                          VSP_043204

Sequence databases

EMBL / GenBank / DDBJ:
  AK093031 mRNA. Translation: BAC04024.1.
  AK023408 mRNA. Translation: BAB14564.1. Frameshift.
  AK301671 mRNA. Translation: BAG63144.1.
  AK301840 mRNA. Translation: BAG63283.1.
  AL132768 Genomic DNA. Translation: CAC10099.2.
  AL132768 Genomic DNA. Translation: CAI21468.1.
  CH471077 Genomic DNA. Translation: EAW76101.1.
  CH471077 Genomic DNA. Translation: EAW76102.1.
  CH471077 Genomic DNA. Translation: EAW76104.1.
  CH471077 Genomic DNA. Translation: EAW76106.1.
  BC058899 mRNA. Translation: AAH58899.1.
CCDS: CCDS13286.1. [Q9H410-1]
  CCDS46596.1. [Q9H410-3]
  CCDS46597.1. [Q9H410-4]
RefSeq: NP_001138787.1. NM_001145315.1. [Q9H410-1]
  NP_001138788.1. NM_001145316.1. [Q9H410-1]
  NP_001138789.1. NM_001145317.1. [Q9H410-3]
  NP_001138790.1. NM_001145318.1. [Q9H410-4]
  NP_079194.3. NM_024918.3. [Q9H410-1]
  XP_006723939.1. XM_006723876.1. [Q9H410-3]
UniGene: Hs.632268.

Genome annotation databases

Ensembl: ENST00000373734; ENSP00000362839; ENSG00000149636. [Q9H410-3]
  ENST00000373750; ENSP00000362855; ENSG00000149636. [Q9H410-1]
  ENST00000426836; ENSP00000389810; ENSG00000149636. [Q9H410-1]
  ENST00000448110; ENSP00000404463; ENSG00000149636. [Q9H410-4]
GeneID: 79980.
KEGG: hsa:79980.
UCSC: uc002xga.4. human. [Q9H410-1]

Keywords - Coding sequence diversity

Alternative splicing

Cross-references

Protocols and materials databases

DNASU: 79980.

Organism-specific databases

CTD: 79980.
GeneCards: DSN1.
HGNC: HGNC:16165. DSN1.
HPA: HPA002813.
MIM: 609175. gene.
neXtProt: NX_Q9H410.
PharmGKB: PA162384106.

Miscellaneous databases

GeneWiki: DSN1.
GenomeRNAi: 79980.
NextBio: 70013.
PRO: Q9H410.

Publications

  1. "Complete sequencing and characterization of 21,243 full-length human cDNAs."
    Ota T., Suzuki Y., Nishikawa T., Otsuki T., Sugiyama T., Irie R., Wakamatsu A., Hayashi K., Sato H., Nagai K., Kimura K., Makita H., Sekine M., Obayashi M., Nishi T., Shibahara T., Tanaka T., Ishii S., Yamamoto J., Saito K., Kawai Y., Isono Y., Nakamura Y., Nagahari K., Murakami K., Yasuda T., Iwayanagi T., Wagatsuma M., Shiratori A., Sudo H., Hosoiri T., Kaku Y., Kodaira H., Kondo H., Sugawara M., Takahashi M., Kanda K., Yokoi T., Furuya T., Kikkawa E., Omura Y., Abe K., Kamihara K., Katsuta N., Sato K., Tanikawa M., Yamazaki M., Ninomiya K., Ishibashi T., Yamashita H., Murakawa K., Fujimori K., Tanai H., Kimata M., Watanabe M., Hiraoka S., Chiba Y., Ishida S., Ono Y., Takiguchi S., Watanabe S., Yosida M., Hotuta T., Kusano J., Kanehori K., Takahashi-Fujii A., Hara H., Tanase T.-O., Nomura Y., Togiya S., Komai F., Hara R., Takeuchi K., Arita M., Imose N., Musashino K., Yuuki H., Oshima A., Sasaki N., Aotsuka S., Yoshikawa Y., Matsunawa H., Ichihara T., Shiohata N., Sano S., Moriya S., Momiyama H., Satoh N., Takami S., Terashima Y., Suzuki O., Nakagawa S., Senoh A., Mizoguchi H., Goto Y., Shimizu F., Wakebe H., Hishigaki H., Watanabe T., Sugiyama A., Takemoto M., Kawakami B., Yamazaki M., Watanabe K., Kumagai A., Itakura S., Fukuzumi Y., Fujimori Y., Komiyama M., Tashiro H., Tanigami A., Fujiwara T., Ono T., Yamada K., Fujii Y., Ozaki K., Hirao M., Ohmori Y., Kawabata A., Hikiji T., Kobatake N., Inagaki H., Ikema Y., Okamoto S., Okitani R., Kawakami T., Noguchi S., Itoh T., Shigeta K., Senba T., Matsumura K., Nakajima Y., Mizuno T., Morinaga M., Sasaki M., Togashi T., Oyama M., Hata H., Watanabe M., Komatsu T., Mizushima-Sugano J., Satoh T., Shirai Y., Takahashi Y., Nakagawa K., Okumura K., Nagase T., Nomura N., Kikuchi H., Masuho Y., Yamashita R., Nakai K., Yada T., Nakamura Y., Ohara O., Isogai T., Sugano S.
    Nat. Genet. 36:40-45(2004)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORMS 1; 2; 3 AND 4).
    Tissue: Esophageal carcinoma, Esophagus, Ovarian carcinoma and Testis.
  2. "The DNA sequence and comparative analysis of human chromosome 20."
    Deloukas P., Matthews L.H., Ashurst J.L., Burton J., Gilbert J.G.R., Jones M., Stavrides G., Almeida J.P., Babbage A.K., Bagguley C.L., Bailey J., Barlow K.F., Bates K.N., Beard L.M., Beare D.M., Beasley O.P., Bird C.P., Blakey S.E., Bridgeman A.M., Brown A.J., Buck D., Burrill W.D., Butler A.P., Carder C., Carter N.P., Chapman J.C., Clamp M., Clark G., Clark L.N., Clark S.Y., Clee C.M., Clegg S., Cobley V.E., Collier R.E., Connor R.E., Corby N.R., Coulson A., Coville G.J., Deadman R., Dhami P.D., Dunn M., Ellington A.G., Frankland J.A., Fraser A., French L., Garner P., Grafham D.V., Griffiths C., Griffiths M.N.D., Gwilliam R., Hall R.E., Hammond S., Harley J.L., Heath P.D., Ho S., Holden J.L., Howden P.J., Huckle E., Hunt A.R., Hunt S.E., Jekosch K., Johnson C.M., Johnson D., Kay M.P., Kimberley A.M., King A., Knights A., Laird G.K., Lawlor S., Lehvaeslaiho M.H., Leversha M.A., Lloyd C., Lloyd D.M., Lovell J.D., Marsh V.L., Martin S.L., McConnachie L.J., McLay K., McMurray A.A., Milne S.A., Mistry D., Moore M.J.F., Mullikin J.C., Nickerson T., Oliver K., Parker A., Patel R., Pearce T.A.V., Peck A.I., Phillimore B.J.C.T., Prathalingam S.R., Plumb R.W., Ramsay H., Rice C.M., Ross M.T., Scott C.E., Sehra H.K., Shownkeen R., Sims S., Skuce C.D., Smith M.L., Soderlund C., Steward C.A., Sulston J.E., Swann R.M., Sycamore N., Taylor R., Tee L., Thomas D.W., Thorpe A., Tracey A., Tromans A.C., Vaudin M., Wall M., Wallis J.M., Whitehead S.L., Whittaker P., Willey D.L., Williams L., Williams S.A., Wilming L., Wray P.W., Hubbard T., Durbin R.M., Bentley D.R., Beck S., Rogers J.
    Nature 414:865-871(2001)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
  3. Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
  4. "The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC)."
    The MGC Project Team
    Genome Res. 14:2121-2127(2004)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORM 1).
    Tissue: Skin.
  5. "A conserved Mis12 centromere complex is linked to heterochromatic HP1 and outer kinetochore protein Zwint-1."
    Obuse C., Iwasaki O., Kiyomitsu T., Goshima G., Toyoda Y., Yanagida M.
    Nat. Cell Biol. 6:1135-1141(2004)
    Cited for: FUNCTION, INTERACTION WITH MIS12; CASC5; CBX3; CBX5; NSL1 AND PMF1, SUBCELLULAR LOCATION.
  6. "The human Mis12 complex is required for kinetochore assembly and proper chromosome segregation."
    Kline S.L., Cheeseman I.M., Hori T., Fukagawa T., Desai A.
    J. Cell Biol. 173:9-17(2006)
    Cited for: FUNCTION, COMPONENT OF MIS12 COMPLEX, SUBCELLULAR LOCATION.
  7. "Human Blinkin/AF15q14 is required for chromosome alignment and the mitotic checkpoint through direct interaction with Bub1 and BubR1."
    Kiyomitsu T., Obuse C., Yanagida M.
    Dev. Cell 13:663-676(2007)
    Cited for: INTERACTION WITH CASC5.
  8. "Combining protein-based IMAC, peptide-based IMAC, and MudPIT for efficient phosphoproteomic analysis."
    Cantin G.T., Yi W., Lu B., Park S.K., Xu T., Lee J.-D., Yates J.R. III
    J. Proteome Res. 7:1346-1351(2008)
    Cited for: IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
    Tissue: Cervix carcinoma.
  9. Cited for: PHOSPHORYLATION [LARGE SCALE ANALYSIS] AT SER-28; SER-30; SER-77 AND SER-81, IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
    Tissue: Cervix carcinoma.
  10. Cited for: IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
  11. "Quantitative phosphoproteomic analysis of T cell receptor signaling reveals system-wide modulation of protein-protein interactions."
    Mayya V., Lundgren D.H., Hwang S.-I., Rezaul K., Wu L., Eng J.K., Rodionov V., Han D.K.
    Sci. Signal. 2:RA46-RA46(2009)
    Cited for: PHOSPHORYLATION [LARGE SCALE ANALYSIS] AT SER-30, IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
    Tissue: Leukemic T-cell.
  12. "Quantitative phosphoproteomics reveals widespread full phosphorylation site occupancy during mitosis."
    Olsen J.V., Vermeulen M., Santamaria A., Kumar C., Miller M.L., Jensen L.J., Gnad F., Cox J., Jensen T.S., Nigg E.A., Brunak S., Mann M.
    Sci. Signal. 3:RA3-RA3(2010)
    Cited for: PHOSPHORYLATION [LARGE SCALE ANALYSIS] AT SER-30; SER-58 AND SER-81, IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
    Tissue: Cervix carcinoma.
  13. "Mitotic regulator SKAP forms a link between kinetochore core complex KMN and dynamic spindle microtubules."
    Wang X., Zhuang X., Cao D., Chu Y., Yao P., Liu W., Liu L., Adams G., Fang G., Dou Z., Ding X., Huang Y., Wang D., Yao X.
    J. Biol. Chem. 287:39380-39390(2012)
    Cited for: INTERACTION WITH KNSTRN.

Entry information

Entry name: DSN1_HUMAN
Accession: Primary (citable) accession number: Q9H410
Secondary accession number(s): B4DWT2, E1P5U9, Q5JW55, Q5JW56, Q9H8P4
Entry history
Integrated into UniProtKB/Swiss-Prot: February 1, 2003
Last sequence update: October 1, 2001
Last modified: April 13, 2016
This is version 129 of the entry and version 2 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

Keywords - Technical term

Complete proteome, Reference proteome

Documents

  1. Human chromosome 20
    Human chromosome 20: entries, gene names and cross-references to MIM
  2. MIM cross-references
    Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.