Protein

Desmocollin-2

Gene

DSC2

Organism
Homo sapiens (Human)
Status
Reviewed - Annotation score: 5 out of 5 - Experimental evidence at protein level

Function

Component of intercellular desmosome junctions. Involved in the interaction of plaque proteins and intermediate filaments mediating cell-cell adhesion. May contribute to epidermal cell positioning (stratification) by mediating differential adhesiveness between cells that express different isoforms.

GO - Molecular function

GO - Biological process

  • bundle of His cell-Purkinje myocyte adhesion involved in cell communication Source: BHF-UCL
  • cardiac muscle cell-cardiac muscle cell adhesion Source: UniProtKB
  • cell adhesion Source: ProtInc
  • cellular response to starvation Source: Ensembl
  • homophilic cell adhesion via plasma membrane adhesion molecules Source: InterPro
  • regulation of heart rate by cardiac conduction Source: BHF-UCL
  • regulation of ventricular cardiac muscle cell action potential Source: BHF-UCL

Keywords - Biological process

Cell adhesion

Keywords - Ligand

Calcium, Metal-binding

Enzyme and pathway databases

BioCyc: ZFISH:ENSG00000134755-MONOMER.
Reactome: R-HSA-6805567. Keratinization.
R-HSA-6809371. Formation of the cornified envelope.

Names & Taxonomy

Protein names
Recommended name:
Desmocollin-2
Alternative name(s):
Cadherin family member 2
Desmocollin-3
Desmosomal glycoprotein II
Desmosomal glycoprotein III
Gene names
Name: DSC2
Synonyms: CDHF2, DSC3
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes
  • UP000005640 Component: Chromosome 18

Organism-specific databases

HGNC: HGNC:3036. DSC2.

Subcellular location

Topology

Feature key | Position(s) | Description | Evidence | Length
Topological domain | 136 – 694 | Extracellular | Sequence analysis | 559
Transmembrane | 695 – 715 | Helical | Sequence analysis | 21
Topological domain | 716 – 901 | Cytoplasmic | Sequence analysis | 186

GO - Cellular component

  • cell-cell adherens junction Source: Ensembl
  • cytoplasmic vesicle Source: UniProtKB
  • desmosome Source: UniProtKB
  • extracellular exosome Source: UniProtKB
  • integral component of membrane Source: UniProtKB-KW
  • intercalated disc Source: UniProtKB
  • plasma membrane Source: BHF-UCL

Keywords - Cellular component

Cell junction, Cell membrane, Membrane

Pathology & Biotech

Involvement in disease

Arrhythmogenic right ventricular dysplasia, familial, 11 (ARVD11) (3 Publications)
The disease is caused by mutations affecting the gene represented in this entry.
Disease description: A congenital heart disease characterized by infiltration of adipose and fibrous tissue into the right ventricle and loss of myocardial cells, resulting in ventricular and supraventricular arrhythmias.
See also OMIM:610476
Feature key | Position(s) | Description | Evidence | Length
Natural variant (VAR_065687) | 203 | R → C in ARVD11; fails to undergo complete processing into a mature form; fails to localize at the desmosomes. Corresponds to variant rs142331975. | 1 Publication | 1
Natural variant (VAR_065688) | 231 | I → T in ARVD11. | 1 Publication | 1
Natural variant (VAR_065689) | 275 | T → M in ARVD11; can be processed into a mature form but shows a higher pro-protein to mature protein ratio; only a proportion of the partly functional mutant is incorporated into the desmosomes. Corresponds to variant rs397517404. | 1 Publication | 1
Natural variant (VAR_065690) | 340 | T → A in ARVD11. Corresponds to variant rs368299411. | 1 Publication | 1
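The reference residues listed for the ARVD11 variants can be cross-checked against the canonical sequence. The following is a minimal sketch, assuming residues 201-350 are copied from the isoform 2A sequence shown in this entry; the names `SEGMENT`, `SEGMENT_START`, `residue_at` and `check_variants` are illustrative and not part of any UniProt tool.

```python
# Residues 201-350 of the canonical (isoform 2A) sequence, copied from this entry.
SEGMENT_START = 201
SEGMENT = (
    "VDREQYESFEIIAFATTPDGYTPELPLPLIIKIEDENDNYPIFTEETYTF"  # 201-250
    "TIFENCRVGTTVGQVCATDKDEPDTMHTRLKYSIIGQVPPSPTLFSMHPT"  # 251-300
    "TGVITTTSSQLDRELIDKYQLKIKVQDMDGQYFGLQTTSTCIINIDDVND"  # 301-350
)

# ARVD11 variants from the table above: position -> (reference, variant).
ARVD11_VARIANTS = {
    203: ("R", "C"),
    231: ("I", "T"),
    275: ("T", "M"),
    340: ("T", "A"),
}


def residue_at(position: int) -> str:
    """Return the residue at a 1-based position within the copied segment."""
    index = position - SEGMENT_START
    if not 0 <= index < len(SEGMENT):
        raise ValueError(f"position {position} is outside the copied segment")
    return SEGMENT[index]


def check_variants(variants):
    """Map each position to True if its listed reference residue matches the sequence."""
    return {pos: residue_at(pos) == ref for pos, (ref, _alt) in variants.items()}


if __name__ == "__main__":
    for pos, ok in sorted(check_variants(ARVD11_VARIANTS).items()):
        ref, alt = ARVD11_VARIANTS[pos]
        status = "matches" if ok else "does not match"
        print(f"{ref}{pos}{alt}: reference residue {status} the sequence")
```

A check of this kind only confirms that the variant coordinates refer to the canonical isoform; it says nothing about the functional effect of the substitution.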

Keywords - Disease

Cardiomyopathy, Disease mutation

Organism-specific databases

DisGeNET: 1824.
MalaCards: DSC2.
MIM: 610476. phenotype.
OpenTargets: ENSG00000134755.
Orphanet: 293899. Familial isolated arrhythmogenic ventricular dysplasia, biventricular form.
293888. Familial isolated arrhythmogenic ventricular dysplasia, left dominant form.
293910. Familial isolated arrhythmogenic ventricular dysplasia, right dominant form.
PharmGKB: PA27489.

Polymorphism and mutation databases

BioMuta: DSC2.
DMDM: 461968.

PTM / Processing

Molecule processing

Feature key | Position(s) | Description | Evidence | Length
Signal peptide | 1 – 27 | | Sequence analysis | 27
Propeptide (PRO_0000003869) | 28 – 135 | | Sequence analysis | 108
Chain (PRO_0000003870) | 136 – 901 | Desmocollin-2 | | 766
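The three processing features should cover the 901-residue precursor without gaps or overlaps, and each listed length should equal end - start + 1. A minimal sketch of that arithmetic follows; the coordinates are taken from the table above, and the names `FEATURES`, `span_length` and `tiles_precursor` are hypothetical helpers introduced for illustration.

```python
# Molecule-processing features from the table above:
# (name, start, end, length as listed in the entry).
FEATURES = [
    ("Signal peptide", 1, 27, 27),
    ("Propeptide", 28, 135, 108),
    ("Chain (Desmocollin-2)", 136, 901, 766),
]
PRECURSOR_LENGTH = 901  # length of the canonical sequence given in this entry


def span_length(start: int, end: int) -> int:
    """Length of an inclusive, 1-based residue range."""
    return end - start + 1


def tiles_precursor(features, total_length: int) -> bool:
    """True if the features cover 1..total_length with no gaps or overlaps."""
    expected_start = 1
    for _name, start, end, _listed in sorted(features, key=lambda f: f[1]):
        if start != expected_start:
            return False
        expected_start = end + 1
    return expected_start == total_length + 1


if __name__ == "__main__":
    for name, start, end, listed in FEATURES:
        print(f"{name}: {start}-{end}, computed {span_length(start, end)}, listed {listed}")
    print("Covers the precursor exactly:", tiles_precursor(FEATURES, PRECURSOR_LENGTH))
```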

Amino acid modifications

Feature key | Position(s) | Description | Evidence | Length
Glycosylation | 34 | N-linked (GlcNAc...) | Sequence analysis | 1
Glycosylation | 166 | N-linked (GlcNAc...) | Sequence analysis | 1
Modified residue | 386 | Phosphothreonine | By similarity | 1
Glycosylation | 392 | N-linked (GlcNAc...) (complex) | 2 Publications | 1
Glycosylation | 546 | N-linked (GlcNAc...) | 1 Publication | 1
Glycosylation | 629 | N-linked (GlcNAc...) | 1 Publication | 1
Modified residue | 864 | Phosphoserine | Combined sources | 1
Modified residue | 868 | Phosphoserine | Combined sources | 1
Modified residue | 873 | Phosphoserine | Combined sources | 1
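To see where each modification sits relative to the membrane, the site positions can be looked up against the processing and topology ranges given in this entry. This is a rough sketch under that assumption; `REGIONS`, `SITES`, `region_of` and `annotate` are hypothetical names, and the output is only a positional lookup, not a statement about biological plausibility.

```python
# Region boundaries taken from the Molecule processing and Topology tables of this entry.
REGIONS = [
    (1, 27, "Signal peptide"),
    (28, 135, "Propeptide"),
    (136, 694, "Extracellular"),
    (695, 715, "Transmembrane (helical)"),
    (716, 901, "Cytoplasmic"),
]

# Modification sites from the table above: (position, description).
SITES = [
    (34, "N-linked glycosylation"),
    (166, "N-linked glycosylation"),
    (386, "Phosphothreonine"),
    (392, "N-linked glycosylation (complex)"),
    (546, "N-linked glycosylation"),
    (629, "N-linked glycosylation"),
    (864, "Phosphoserine"),
    (868, "Phosphoserine"),
    (873, "Phosphoserine"),
]


def region_of(position: int) -> str:
    """Return the name of the region containing a 1-based residue position."""
    for start, end, name in REGIONS:
        if start <= position <= end:
            return name
    raise ValueError(f"position {position} is outside 1-901")


def annotate(sites):
    """Attach the containing region to each (position, description) pair."""
    return [(pos, kind, region_of(pos)) for pos, kind in sites]


if __name__ == "__main__":
    for pos, kind, region in annotate(SITES):
        print(f"{pos:>4}  {kind:<34} {region}")
```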

Keywords - PTM

Cleavage on pair of basic residues, Glycoprotein, Phosphoprotein

Proteomic databases

EPD: Q02487.
MaxQB: Q02487.
PaxDb: Q02487.
PeptideAtlas: Q02487.
PRIDE: Q02487.

PTM databases

iPTMnet: Q02487.
PhosphoSitePlus: Q02487.
SwissPalm: Q02487.

Expression

Tissue specificity

Expressed in epithelia, myocardium and lymph nodes.

Gene expression databases

Bgee: ENSG00000134755.
CleanEx: HS_DSC2.
HS_DSC3.
Genevisible: Q02487. HS.

Organism-specific databases

HPA: HPA011911.
HPA012615.

Interaction

Subunit structure

Interacts with DSP, PKP2 and JUP. (1 Publication)

Binary interactions

With | Entry | #Exp. | IntAct | Notes
GJA1 | P17302 | 2 | EBI-6900677, EBI-1103439 |
GJA1 | Q6TYA9 | 3 | EBI-6900677, EBI-6901331 | From a different organism.
Pkp2 | F1M7L9 | 2 | EBI-6900677, EBI-6900770 | From a different organism.

GO - Molecular function

Protein-protein interaction databases

BioGrid: 108158. 18 interactors.
IntAct: Q02487. 13 interactors.
STRING: 9606.ENSP00000280904.

Structure

Secondary structure

All features below are annotated from combined sources.

Feature key | Position(s) | Length
Beta strand | 142 – 147 | 6
Beta strand | 152 – 158 | 7
Turn | 163 – 166 | 4
Beta strand | 172 – 175 | 4
Turn | 176 – 178 | 3
Beta strand | 179 – 182 | 4
Beta strand | 184 – 188 | 5
Turn | 190 – 192 | 3
Beta strand | 194 – 197 | 4
Turn | 203 – 205 | 3
Beta strand | 212 – 214 | 3
Beta strand | 230 – 234 | 5
Beta strand | 241 – 244 | 4
Beta strand | 246 – 253 | 8
Beta strand | 261 – 264 | 4
Beta strand | 267 – 269 | 3
Helix | 276 – 278 | 3
Beta strand | 281 – 289 | 9
Beta strand | 295 – 297 | 3
Turn | 299 – 301 | 3
Beta strand | 303 – 307 | 5
Turn | 313 – 315 | 3
Beta strand | 318 – 327 | 10
Helix | 328 – 330 | 3
Beta strand | 336 – 346 | 11
Beta strand | 354 – 365 | 12
Beta strand | 370 – 376 | 7
Turn | 387 – 389 | 3
Beta strand | 390 – 398 | 9
Beta strand | 404 – 408 | 5
Turn | 410 – 412 | 3
Beta strand | 415 – 419 | 5
Turn | 425 – 427 | 3
Beta strand | 429 – 441 | 13
Beta strand | 454 – 463 | 10
Beta strand | 472 – 481 | 10
Turn | 498 – 500 | 3
Beta strand | 518 – 520 | 3
Turn | 522 – 524 | 3
Beta strand | 527 – 529 | 3
Beta strand | 544 – 552 | 9
Beta strand | 558 – 568 | 11
Turn | 586 – 588 | 3
Beta strand | 590 – 595 | 6
Helix | 602 – 604 | 3
Beta strand | 609 – 611 | 3
Turn | 617 – 622 | 6
Beta strand | 624 – 636 | 13
Beta strand | 643 – 645 | 3
Beta strand | 648 – 653 | 6
Beta strand | 659 – 664 | 6
Beta strand | 672 – 675 | 4

3D structure databases

PDB entry | Method | Resolution (Å) | Chain | Positions
5ERP | X-ray | 2.70 | A/B | 236-680
5J5J | X-ray | 3.29 | A | 136-235
ProteinModelPortal: Q02487.
SMR: Q02487.

Family & Domains

Domains and Repeats

Feature key | Position(s) | Description | Evidence | Length
Domain | 136 – 243 | Cadherin 1 | PROSITE-ProRule annotation | 108
Domain | 244 – 355 | Cadherin 2 | PROSITE-ProRule annotation | 112
Domain | 356 – 471 | Cadherin 3 | PROSITE-ProRule annotation | 116
Domain | 472 – 579 | Cadherin 4 | PROSITE-ProRule annotation | 108
Domain | 580 – 694 | Cadherin 5 | PROSITE-ProRule annotation | 115

Domain

Calcium may be bound by the cadherin-like repeats. (Curated)
Three calcium ions are usually bound at the interface of each cadherin domain and rigidify the connections, imparting a strong curvature to the full-length ectodomain. (By similarity)

Sequence similarities

Contains 5 cadherin domains. (PROSITE-ProRule annotation)

Keywords - Domain

Repeat, Signal, Transmembrane, Transmembrane helix

Phylogenomic databases

eggNOG: KOG3594. Eukaryota.
ENOG410XQHI. LUCA.
GeneTree: ENSGT00760000118906.
HOGENOM: HOG000231253.
HOVERGEN: HBG102801.
InParanoid: Q02487.
KO: K07601.
OMA: KIKVQDM.
OrthoDB: EOG091G01IB.
PhylomeDB: Q02487.
TreeFam: TF316817.

Family and domain databases

Gene3D: 2.60.40.60. 6 hits.
4.10.900.10. 1 hit.
InterPro: IPR002126. Cadherin.
IPR015919. Cadherin-like.
IPR020894. Cadherin_CS.
IPR000233. Cadherin_cytoplasmic-dom.
IPR014868. Cadherin_pro_dom.
IPR027397. Catenin_binding_dom.
IPR009122. Desmosomal_cadherin.
PANTHER: PTHR24025. PTHR24025. 2 hits.
Pfam: PF00028. Cadherin. 4 hits.
PF01049. Cadherin_C. 1 hit.
PF08758. Cadherin_pro. 1 hit.
PRINTS: PR00205. CADHERIN.
PR01818. DESMOCADHERN.
SMART: SM00112. CA. 5 hits.
SM01055. Cadherin_pro. 1 hit.
SUPFAM: SSF49313. SSF49313. 6 hits.
PROSITE: PS00232. CADHERIN_1. 3 hits.
PS50268. CADHERIN_2. 5 hits.

Sequences (2)

Sequence status: Complete.

Sequence processing: The displayed sequence is further processed into a mature form.

This entry describes 2 isoforms produced by alternative splicing.

Isoform 2A (identifier: Q02487-1)
Also known as: DGII

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MEAARPSGSW NGALCRLLLL TLAILIFASD ACKNVTLHVP SKLDAEKLVG
        60         70         80         90        100
RVNLKECFTA ANLIHSSDPD FQILEDGSVY TTNTILLSSE KRSFTILLSN
       110        120        130        140        150
TENQEKKKIF VFLEHQTKVL KKRHTKEKVL RRAKRRWAPI PCSMLENSLG
       160        170        180        190        200
PFPLFLQQVQ SDTAQNYTIY YSIRGPGVDQ EPRNLFYVER DTGNLYCTRP
       210        220        230        240        250
VDREQYESFE IIAFATTPDG YTPELPLPLI IKIEDENDNY PIFTEETYTF
       260        270        280        290        300
TIFENCRVGT TVGQVCATDK DEPDTMHTRL KYSIIGQVPP SPTLFSMHPT
       310        320        330        340        350
TGVITTTSSQ LDRELIDKYQ LKIKVQDMDG QYFGLQTTST CIINIDDVND
       360        370        380        390        400
HLPTFTRTSY VTSVEENTVD VEILRVTVED KDLVNTANWR ANYTILKGNE
       410        420        430        440        450
NGNFKIVTDA KTNEGVLCVV KPLNYEEKQQ MILQIGVVNE APFSREASPR
       460        470        480        490        500
SAMSTATVTV NVEDQDEGPE CNPPIQTVRM KENAEVGTTS NGYKAYDPET
       510        520        530        540        550
RSSSGIRYKK LTDPTGWVTI DENTGSIKVF RSLDREAETI KNGIYNITVL
       560        570        580        590        600
ASDQGGRTCT GTLGIILQDV NDNSPFIPKK TVIICKPTMS SAEIVAVDPD
       610        620        630        640        650
EPIHGPPFDF SLESSTSEVQ RMWRLKAIND TAARLSYQND PPFGSYVVPI
       660        670        680        690        700
TVRDRLGMSS VTSLDVTLCD CITENDCTHR VDPRIGGGGV QLGKWAILAI
       710        720        730        740        750
LLGIALLFCI LFTLVCGASG TSKQPKVIPD DLAQQNLIVS NTEAPGDDKV
       760        770        780        790        800
YSANGFTTQT VGASAQGVCG TVGSGIKNGG QETIEMVKGG HQTSESCRGA
       810        820        830        840        850
GHHHTLDSCR GGHTEVDNCR YTYSEWHSFT QPRLGEKVYL CNQDENHKHA
       860        870        880        890        900
QDYVLTYNYE GRGSVAGSVG CCSERQEEDG LEFLDNLEPK FRTLAEACMK
R
Length: 901
Mass (Da): 99,962
Last modified: February 1, 1994 - v1
Checksum: 30F7E3D33ECA67CC
Isoform 2B (identifier: Q02487-2)
Also known as: DGIII

The sequence of this isoform differs from the canonical sequence as follows:
     837-847: KVYLCNQDENH → ESIRGHTLIKN
     848-901: Missing.

Length: 847
Mass (Da): 93,769
Checksum: A53588B1D490CD8F
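The two differences listed for isoform 2B are enough to derive its C-terminus and its length from the canonical sequence. The sketch below assumes residues 801-901 are copied from the isoform 2A sequence in this entry; `TAIL`, `canonical_slice`, `isoform_2b_tail` and `isoform_2b_length` are illustrative names only.

```python
# Residues 801-901 of the canonical (isoform 2A) sequence, copied from this entry.
TAIL_START = 801
TAIL = (
    "GHHHTLDSCRGGHTEVDNCRYTYSEWHSFTQPRLGEKVYLCNQDENHKHA"  # 801-850
    "QDYVLTYNYEGRGSVAGSVGCCSERQEEDGLEFLDNLEPKFRTLAEACMK"  # 851-900
    "R"                                                   # 901
)

# Differences listed for isoform 2B.
REPLACED = (837, 847, "KVYLCNQDENH", "ESIRGHTLIKN")  # start, end, canonical, isoform 2B
MISSING = (848, 901)                                  # residues absent from isoform 2B


def canonical_slice(start: int, end: int) -> str:
    """Return canonical residues start..end (1-based, inclusive) from the copied tail."""
    return TAIL[start - TAIL_START : end - TAIL_START + 1]


def isoform_2b_tail() -> str:
    """Residues 801 onward of isoform 2B, built from the canonical tail."""
    start, end, old, new = REPLACED
    if canonical_slice(start, end) != old:
        raise ValueError("canonical residues do not match the listed replacement")
    # Keep 801-836 unchanged, substitute 837-847, and drop 848-901.
    return canonical_slice(TAIL_START, start - 1) + new


def isoform_2b_length() -> int:
    """Total isoform 2B length: unchanged residues 1-836 plus the replacement."""
    start, _end, _old, new = REPLACED
    return (start - 1) + len(new)


if __name__ == "__main__":
    print("Isoform 2B C-terminus (from residue 801):", isoform_2b_tail())
    print("Isoform 2B length:", isoform_2b_length())
```

Because the replacement has the same length as the segment it replaces (11 residues), the isoform ends at position 847, which agrees with the length listed for isoform 2B.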

Sequence caution

The sequence CAA40141 differs from that shown. Reason: Erroneous initiation. (Curated)
The sequence CAA40142 differs from that shown. Reason: Erroneous initiation. (Curated)

Natural variant

Feature key | Position(s) | Description | Evidence | Length
Natural variant (VAR_029480) | 11 | N → S. Corresponds to variant rs868333. | 1 Publication | 1
Natural variant (VAR_065687) | 203 | R → C in ARVD11; fails to undergo complete processing into a mature form; fails to localize at the desmosomes. Corresponds to variant rs142331975. | 1 Publication | 1
Natural variant (VAR_065688) | 231 | I → T in ARVD11. | 1 Publication | 1
Natural variant (VAR_065689) | 275 | T → M in ARVD11; can be processed into a mature form but shows a higher pro-protein to mature protein ratio; only a proportion of the partly functional mutant is incorporated into the desmosomes. Corresponds to variant rs397517404. | 1 Publication | 1
Natural variant (VAR_065690) | 340 | T → A in ARVD11. Corresponds to variant rs368299411. | 1 Publication | 1
Natural variant (VAR_062391) | 358 | T → I. Corresponds to variant rs139399951. | 1 Publication | 1
Natural variant (VAR_065691) | 596 | A → V. Corresponds to variant rs148185335. | 1 Publication | 1
Natural variant (VAR_065692) | 638 | Q → H. Corresponds to variant rs147742157. | 1 Publication | 1
Natural variant (VAR_024388) | 776 | I → V. Corresponds to variant rs1893963. | 2 Publications | 1
Natural variant (VAR_062392) | 798 | R → Q. Corresponds to variant rs61731921. | 3 Publications | 1

Alternative sequence

Feature key | Position(s) | Description | Evidence | Length
Alternative sequence (VSP_000657) | 837 – 847 | KVYLCNQDENH → ESIRGHTLIKN in isoform 2B. | 1 Publication | 11
Alternative sequence (VSP_000658) | 848 – 901 | Missing in isoform 2B. | 1 Publication | 54

Sequence databases

EMBL / GenBank / DDBJ: X56807 mRNA. Translation: CAA40141.1. Different initiation.
X56807 mRNA. Translation: CAA40142.1. Different initiation.
BC063291 mRNA. Translation: AAH63291.1.
CCDS: CCDS11892.1. [Q02487-1]
CCDS11893.1. [Q02487-2]
PIR: A40390. IJHUDB.
B40390. IJHUDA.
RefSeq: NP_004940.1. NM_004949.4. [Q02487-2]
NP_077740.1. NM_024422.4. [Q02487-1]
UniGene: Hs.95612.

Genome annotation databases

Ensembl: ENST00000251081; ENSP00000251081; ENSG00000134755. [Q02487-2]
ENST00000280904; ENSP00000280904; ENSG00000134755. [Q02487-1]
GeneID: 1824.
KEGG: hsa:1824.
UCSC: uc002kwk.5. human. [Q02487-1]

Keywords - Coding sequence diversity

Alternative splicing, Polymorphism

Cross-references

Organism-specific databases

CTD: 1824.
DisGeNET: 1824.
GeneCards: DSC2.
GeneReviews: DSC2.
HGNC: HGNC:3036. DSC2.
HPA: HPA011911.
HPA012615.
MalaCards: DSC2.
MIM: 125645. gene.
610476. phenotype.
neXtProt: NX_Q02487.
OpenTargets: ENSG00000134755.
Orphanet: 293899. Familial isolated arrhythmogenic ventricular dysplasia, biventricular form.
293888. Familial isolated arrhythmogenic ventricular dysplasia, left dominant form.
293910. Familial isolated arrhythmogenic ventricular dysplasia, right dominant form.
PharmGKB: PA27489.

Miscellaneous databases

ChiTaRS: DSC2. human.
GeneWiki: DSC2.
GenomeRNAi: 1824.
PRO: Q02487.

Entry information

Entry name: DSC2_HUMAN
Accession: Primary (citable) accession number: Q02487
Entry history
Integrated into UniProtKB/Swiss-Prot: February 1, 1994
Last sequence update: February 1, 1994
Last modified: November 30, 2016
This is version 164 of the entry and version 1 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

Keywords - Technical term

3D-structure, Complete proteome, Reference proteome

Documents

  1. Human chromosome 18
    Human chromosome 18: entries, gene names and cross-references to MIM
  2. Human entries with polymorphisms or disease mutations
    List of human entries with polymorphisms or disease mutations
  3. Human polymorphisms and disease mutations
    Index of human polymorphisms and disease mutations
  4. MIM cross-references
    Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot
  5. PDB cross-references
    Index of Protein Data Bank (PDB) cross-references
  6. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.