Protein

Synemin

Gene

SYNM

Organism
Homo sapiens (Human)
Status
Reviewed; annotation score: 5 out of 5; experimental evidence at protein level

Function

Type-VI intermediate filament (IF) protein that plays an important role in the muscle cell cytoskeleton. It forms heteropolymeric IFs with desmin and/or vimentin and, via its interactions with the cytoskeletal proteins alpha-dystrobrevin, dystrophin, talin-1, utrophin and vinculin, can link these heteropolymeric IFs to adherens-type junctions, such as the costameres, neuromuscular junctions and myotendinous junctions within striated muscle cells. (3 publications)

GO - Molecular function

  • intermediate filament binding Source: UniProtKB
  • structural constituent of cytoskeleton Source: UniProtKB
  • structural constituent of muscle Source: UniProtKB
  • vinculin binding Source: UniProtKB

Names & Taxonomy

Protein names
Recommended name: Synemin
Alternative name(s): Desmuslin
Gene names
Name: SYNM
Synonyms: DMN, KIAA0353, SYN
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes
  • UP000005640 Component: Chromosome 15

Organism-specific databases

HGNC: HGNC:24466. SYNM.

Subcellular location

GO - Cellular component

  • adherens junction Source: UniProtKB
  • costamere Source: UniProtKB
  • intermediate filament Source: UniProtKB
  • neurofilament cytoskeleton Source: UniProtKB
  • sarcolemma Source: Ensembl

Keywords - Cellular component

Cell junction, Cytoplasm, Cytoskeleton, Intermediate filament

Pathology & Biotech

Organism-specific databases

DisGeNET: 23336.
PharmGKB: PA164726408.

Polymorphism and mutation databases

BioMuta: SYNM.

PTM / Processing

Molecule processing

Feature key   Position(s)   Length   Description   Identifier
Chain         1 – 1565      1565     Synemin       PRO_0000063778

Amino acid modifications

Feature key        Position   Length   Description              Evidence
Modified residue   429        1        Phosphoserine            Combined sources
Modified residue   598        1        Phosphothreonine         Combined sources
Modified residue   651        1        Phosphothreonine         Combined sources
Modified residue   653        1        Phosphoserine            Combined sources
Modified residue   777        1        Phosphoserine            By similarity
Modified residue   1044       1        Phosphoserine            Combined sources
Modified residue   1049       1        Phosphoserine            Combined sources
Modified residue   1077       1        Phosphoserine            By similarity
Modified residue   1087       1        Phosphoserine            By similarity
Modified residue   1181       1        Phosphoserine            Combined sources
Modified residue   1184       1        Phosphoserine            Combined sources
Modified residue   1435       1        Phosphoserine            Combined sources
Modified residue   1487       1        Omega-N-methylarginine   By similarity

Keywords - PTM

Methylation, Phosphoprotein

Proteomic databases

EPD: O15061.
MaxQB: O15061.
PaxDb: O15061.
PeptideAtlas: O15061.
PRIDE: O15061.

PTM databases

iPTMnet: O15061.
PhosphoSitePlus: O15061.

Expression

Tissue specificity

Isoform 2 is strongly detected in adult heart, fetal skeletal muscles and fetal heart. Isoform 1 is weakly detected in fetal heart and also in fetal skeletal muscle. Isoform 1 and isoform 2 are detected in adult bladder (at protein level). The mRNA is predominantly expressed in heart and muscle with some expression in brain which may be due to tissue-specific isoforms. (2 publications)

Developmental stage

In lens, first detected at 16 weeks, when expression is weak and uniformly distributed. Subsequently, expression becomes much stronger in the epithelium of the anterior part at 25 weeks and later. In retina, weakly expressed at 15 weeks in the nerve fiber and ganglion cell layers (NFL and GCL). From 25 weeks onwards, much stronger expression is observed in the endfeet of Mueller cells, the NFL, and GCL, and much lower expression is observed in a minor subpopulation of cells in the inner cell layer (INL). At 30 and 36 weeks, expression remains in the neural retina, and subsequently becomes stronger in the NFL, GCL, and INL and is decreased in Mueller cells. At 36 weeks, also expressed at the external border of the outer nuclear layer (ONL) (at protein level). (1 publication)

Gene expression databases

Bgee: ENSG00000182253.
CleanEx: HS_SYNM.
ExpressionAtlas: O15061. baseline and differential.
Genevisible: O15061. HS.

Organism-specific databases

HPA: CAB017192.
     HPA040066.
     HPA044200.

Interaction

Subunit structure

Interacts with GFAP and VIM (By similarity). Isoform 1 interacts with TLN1 and VCL. Isoform 2 interacts with DES and DTNA. Isoform 1 and isoform 2 interact with DMD and UTRN. (By similarity; 5 publications)

Binary interactions

With   Entry    #Exp.   IntAct
TTN    Q8WZ42   3       EBI-7843148, EBI-681210

GO - Molecular function

  • intermediate filament binding Source: UniProtKB
  • vinculin binding Source: UniProtKB

Protein-protein interaction databases

BioGrid: 116923. 19 interactors.
IntAct: O15061. 10 interactors.
MINT: MINT-198289.
STRING: 9606.ENSP00000336775.

Structure

3D structure databases

ProteinModelPortal: O15061.
SMR: O15061.

Family & Domains

Region

Feature key   Position(s)    Length   Description
Region        1 – 10         10       Head
Region        11 – 320       310      Interaction with DMD and UTRN (1 publication)
Region        11 – 300       290      Rod
Region        11 – 49        39       Coil 1A
Region        50 – 58        9        Linker 1
Region        59 – 163       105      Coil 1B
Region        164 – 186      23       Linker 12
Region        187 – 300      114      Coil 2
Region        301 – 1565     1265     Tail
Region        1152 – 1463    312      Interaction with TLN1 and VCL (2 publications)
Region        1244 – 1563    320      Interaction with DMD and UTRN (1 publication)
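
The Length value listed for each region appears to follow from its inclusive, 1-based coordinates as end - start + 1. A minimal Python sketch that re-derives the lengths from the positions above; the function name `feature_length` and the tuple layout are illustrative assumptions, not part of the entry:

```python
def feature_length(start, end):
    """Length of a feature given 1-based, inclusive coordinates."""
    return end - start + 1

# (description, start, end, length listed in the entry)
regions = [
    ("Head", 1, 10, 10),
    ("Interaction with DMD and UTRN", 11, 320, 310),
    ("Rod", 11, 300, 290),
    ("Coil 1A", 11, 49, 39),
    ("Linker 1", 50, 58, 9),
    ("Coil 1B", 59, 163, 105),
    ("Linker 12", 164, 186, 23),
    ("Coil 2", 187, 300, 114),
    ("Tail", 301, 1565, 1265),
    ("Interaction with TLN1 and VCL", 1152, 1463, 312),
    ("Interaction with DMD and UTRN", 1244, 1563, 320),
]

# Collect any rows whose listed length disagrees with the coordinates.
mismatches = [
    (name, start, end, listed)
    for name, start, end, listed in regions
    if feature_length(start, end) != listed
]
print("mismatches:", mismatches)
```

The same check applies to any feature row that carries both a position range and a length.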

Sequence similarities

Belongs to the intermediate filament family. (Curated)

Keywords - Domain

Coiled coil

Phylogenomic databases

eggNOG: ENOG410IH92. Eukaryota.
        ENOG4111GBF. LUCA.
HOGENOM: HOG000154476.
HOVERGEN: HBG008974.
InParanoid: O15061.
KO: K10376.
OrthoDB: EOG091G00OT.
PhylomeDB: O15061.

Family and domain databases

InterPro: IPR001664. IF.
          IPR018039. Intermediate_filament_CS.
          IPR030634. SYNM.
PANTHER: PTHR23239. PTHR23239. 1 hit.
         PTHR23239:SF194. PTHR23239:SF194. 1 hit.
Pfam: PF00038. Filament. 1 hit.
SMART: SM01391. Filament. 1 hit.
PROSITE: PS00226. IF. 1 hit.

Sequences (3)

Sequence status: Complete.

This entry describes 3 isoforms produced by alternative splicing.

Isoform 1 (identifier: O15061-1)
Also known as: Alpha

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MLSWRLQTGP EKAELQELNA RLYDYVCRVR ELERENLLLE EELRGRRGRE
60 70 80 90 100
GLWAEGQARC AEEARSLRQQ LDELSWATAL AEGERDALRR ELRELQRLDA
110 120 130 140 150
EERAARGRLD AELGAQQREL QEALGARAAL EALLGRLQAE RRGLDAAHER
160 170 180 190 200
DVRELRARAA SLTMHFRARA TGPAAPPPRL REVHDSYALL VAESWRETVQ
210 220 230 240 250
LYEDEVRELE EALRRGQESR LQAEEETRLC AQEAEALRRE ALGLEQLRAR
260 270 280 290 300
LEDALLRMRE EYGIQAEERQ RAIDCLEDEK ATLTLAMADW LRDYQDLLQV
310 320 330 340 350
KTGLSLEVAT YRALLEGESN PEIVIWAEHV ENMPSEFRNK SYHYTDSLLQ
360 370 380 390 400
RENERNLFSR QKAPLASFNH SSALYSNLSG HRGSQTGTSI GGDARRGFLG
410 420 430 440 450
SGYSSSATTQ QENSYGKAVS SQTNVRTFSP TYGLLRNTEA QVKTFPDRPK
460 470 480 490 500
AGDTREVPVY IGEDSTIARE SYRDRRDKVA AGASESTRSN ERTVILGKKT
510 520 530 540 550
EVKATREQER NRPETIRTKP EEKMFDSKEK ASEERNLRWE ELTKLDKEAR
560 570 580 590 600
QRESQQMKEK AKEKDSPKEK SVREREVPIS LEVSQDRRAE VSPKGLQTPV
610 620 630 640 650
KDAGGGTGRE AEARELRFRL GTSDATGSLQ GDSMTETVAE NIVTSILKQF
660 670 680 690 700
TQSPETEASA DSFPDTKVTY VDRKELPGER KTKTEIVVES KLTEDVDVSD
710 720 730 740 750
EAGLDYLLSK DIKEVGLKGK SAEQMIGDII NLGLKGREGR AKVVNVEIVE
760 770 780 790 800
EPVSYVSGEK PEEFSVPFKV EEVEDVSPGP WGLVKEEEGY GESDVTFSVN
810 820 830 840 850
QHRRTKQPQE NTTHVEEVTE AGDSEGEQSY FVSTPDEHPG GHDRDDGSVY
860 870 880 890 900
GQIHIEEEST IRYSWQDEIV QGTRRRTQKD GAVGEKVVKP LDVPAPSLEG
910 920 930 940 950
DLGSTHWKEQ ARSGEFHAEP TVIEKEIKIP HEFHTSMKGI SSKEPRQQLV
960 970 980 990 1000
EVIGQLEETL PERMREELSA LTREGQGGPG SVSVDVKKVQ GAGGSSVTLV
1010 1020 1030 1040 1050
AEVNVSQTVD ADRLDLEELS KDEASEMEKA VESVVRESLS RQRSPAPGSP
1060 1070 1080 1090 1100
DEEGGAEAPA AGIRFRRWAT RELYIPSGES EVAGGASHSS GQRTPQGPVS
1110 1120 1130 1140 1150
ATVEVSSPTG FAQSQVLEDV SQAARHIKLG PSEVWRTERM SYEGPTAEVV
1160 1170 1180 1190 1200
EVSAGGDLSQ AASPTGASRS VRHVTLGPGQ SPLSREVIFL GPAPACPEAW
1210 1220 1230 1240 1250
GSPEPGPAES SADMDGSGRH STFGCRQFHA EKEIIFQGPI SAAGKVGDYF
1260 1270 1280 1290 1300
ATEESVGTQT SVRQLQLGPK EGFSGQIQFT APLSDKVELG VIGDSVHMEG
1310 1320 1330 1340 1350
LPGSSTSIRH ISIGPQRHQT TQQIVYHGLV PQLGESGDSE STVHGEGSAD
1360 1370 1380 1390 1400
VHQATHSHTS GRQTVMTEKS TFQSVVSESP QEDSAGDTSG AEMTSGVSRS
1410 1420 1430 1440 1450
FRHIRLGPTE TETSEHIAIR GPVSRTFVLA GSADSPELGK LADSSRTLRH
1460 1470 1480 1490 1500
IAPGPKETSF TFQMDVSNVE AIRSRTQEAG ALGVSDRGSW RDADSRNDQA
1510 1520 1530 1540 1550
VGVSFKASAG EGDQAHREQG KEQAMFDKKV QLQRMVDQRS VISDEKKVAL
1560
LYLDNEEEEN DGHWF
Length: 1,565
Mass (Da): 172,768
Last modified: January 23, 2002 - v2
Checksum: 18D19000D3CEA537

Isoform 2 (identifier: O15061-2)
Also known as: Beta

The sequence of this isoform differs from the canonical sequence as follows:
     1152-1463: Missing.

Length: 1,253
Mass (Da): 140,135
Checksum: 88162E538D848F2A

Isoform 3 (identifier: O15061-3)

The sequence of this isoform differs from the canonical sequence as follows:
     336-339: EFRN → DGCE
     340-1565: Missing.

Length: 339
Mass (Da): 39,042
Checksum: 090EEA199A0041C4
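
The isoform descriptions above can be read as edits applied to the canonical sequence using 1-based, inclusive coordinates. The following Python sketch illustrates that reading and checks that the listed isoform lengths follow from it. The helper `apply_isoform_edits` is a hypothetical name, and the sequence is a placeholder of the canonical length in which only residues 336-339 (EFRN) are real; substitute the full isoform 1 sequence to obtain actual isoform sequences.

```python
def apply_isoform_edits(canonical, edits):
    """Apply (start, end, replacement) edits with 1-based, inclusive coordinates.

    An empty replacement string stands for 'Missing'.
    """
    seq = canonical
    # Work from the C-terminal end so earlier coordinates stay valid.
    for start, end, replacement in sorted(edits, reverse=True):
        seq = seq[:start - 1] + replacement + seq[end:]
    return seq

CANONICAL_LENGTH = 1565

# Placeholder sequence (assumption): 'X' everywhere except the real
# EFRN residues at positions 336-339 of isoform 1.
placeholder = "X" * 335 + "EFRN" + "X" * (CANONICAL_LENGTH - 339)

# Isoform 2: residues 1152-1463 are missing.
isoform_2 = apply_isoform_edits(placeholder, [(1152, 1463, "")])

# Isoform 3: 336-339 EFRN -> DGCE, and 340-1565 are missing.
isoform_3 = apply_isoform_edits(
    placeholder, [(336, 339, "DGCE"), (340, 1565, "")]
)

print(len(isoform_2), len(isoform_3), isoform_3[-4:])
```

Under these assumptions the derived lengths agree with the entry: 1565 - 312 = 1253 residues for isoform 2 and 339 residues for isoform 3, which ends in DGCE.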

Sequence caution

The sequence AAI10067 differs from that shown. Reason: Erroneous initiation. (Curated)
The sequence BAA20810 differs from that shown. Reason: Erroneous initiation. (Curated)

Experimental Info

Feature key         Position   Length   Description
Sequence conflict   4          1        W → L in CAC83858 (PubMed:11737198). Curated
Sequence conflict   4          1        W → L in CAC83859 (PubMed:11737198). Curated
Sequence conflict   4          1        W → L in CAG27071 (Ref. 3). Curated
Sequence conflict   24         1        Missing in CAC83858 (PubMed:11737198). Curated
Sequence conflict   24         1        Missing in CAC83859 (PubMed:11737198). Curated
Sequence conflict   52         1        Missing in CAC83858 (PubMed:11737198). Curated
Sequence conflict   52         1        Missing in CAC83859 (PubMed:11737198). Curated
Sequence conflict   197        1        E → G in CAG27071 (Ref. 3). Curated
Sequence conflict   241        1        A → T in CAG27071 (Ref. 3). Curated
Sequence conflict   274        1        D → G in CAG27071 (Ref. 3). Curated
Sequence conflict   318        1        E → G in CAG27071 (Ref. 3). Curated
Sequence conflict   322        1        E → Q in CAC83858 (PubMed:11737198). Curated
Sequence conflict   322        1        E → Q in CAC83859 (PubMed:11737198). Curated
Sequence conflict   322        1        E → Q in CAG27071 (Ref. 3). Curated
Sequence conflict   373        1        A → V in CAC83858 (PubMed:11737198). Curated
Sequence conflict   373        1        A → V in CAC83859 (PubMed:11737198). Curated
Sequence conflict   555        1        Q → H in CAC83858 (PubMed:11737198). Curated
Sequence conflict   555        1        Q → H in CAC83859 (PubMed:11737198). Curated
Sequence conflict   564        1        K → N in CAC83858 (PubMed:11737198). Curated
Sequence conflict   564        1        K → N in CAC83859 (PubMed:11737198). Curated
Sequence conflict   655        1        E → Q in CAC83858 (PubMed:11737198). Curated
Sequence conflict   655        1        E → Q in CAC83859 (PubMed:11737198). Curated
Sequence conflict   666        1        T → A in CAC83858 (PubMed:11737198). Curated
Sequence conflict   666        1        T → A in CAC83859 (PubMed:11737198). Curated
Sequence conflict   687        1        V → L in CAC83858 (PubMed:11737198). Curated
Sequence conflict   687        1        V → L in CAC83859 (PubMed:11737198). Curated
Sequence conflict   720        1        K → N in CAC83858 (PubMed:11737198). Curated
Sequence conflict   720        1        K → N in CAC83859 (PubMed:11737198). Curated
Sequence conflict   845        1        D → N in CAC83858 (PubMed:11737198). Curated
Sequence conflict   845        1        D → N in CAC83859 (PubMed:11737198). Curated
Sequence conflict   856        1        E → Q in CAC83858 (PubMed:11737198). Curated
Sequence conflict   856        1        E → Q in CAC83859 (PubMed:11737198). Curated
Sequence conflict   874        1        R → P in CAC83858 (PubMed:11737198). Curated
Sequence conflict   874        1        R → P in CAC83859 (PubMed:11737198). Curated
Sequence conflict   965        1        R → K in CAC83858 (PubMed:11737198). Curated
Sequence conflict   965        1        R → K in CAC83859 (PubMed:11737198). Curated
Sequence conflict   1004       1        N → D in CAC83858 (PubMed:11737198). Curated
Sequence conflict   1004       1        N → D in CAC83859 (PubMed:11737198). Curated
Sequence conflict   1019       1        L → V in CAC83858 (PubMed:11737198). Curated
Sequence conflict   1019       1        L → V in CAC83859 (PubMed:11737198). Curated
Sequence conflict   1039       1        L → M in CAC83858 (PubMed:11737198). Curated
Sequence conflict   1039       1        L → M in CAC83859 (PubMed:11737198). Curated
Sequence conflict   1076       1        P → L in CAC83858 (PubMed:11737198). Curated
Sequence conflict   1076       1        P → L in CAC83859 (PubMed:11737198). Curated
Sequence conflict   1151       1        E → G in CAC83859 (PubMed:11737198). Curated
Sequence conflict   1292       1        I → T in CAC83859 (PubMed:11737198). Curated
Sequence conflict   1493       1        A → R in AAI10067 (PubMed:15489334). Curated
Sequence conflict   1509       1        A → V in CAC83858 (PubMed:11737198). Curated
Sequence conflict   1509       1        A → V in CAC83859 (PubMed:11737198). Curated

Natural variant

Feature key       Identifier   Position   Length   Description   Notes
Natural variant   VAR_012295   272        1        A → V         2 publications
Natural variant   VAR_012296   330        1        V → I         1 publication
Natural variant   VAR_012297   338        1        R → W         1 publication
Natural variant   VAR_059378   355        1        R → W         3 publications; corresponds to variant rs3743242 (dbSNP, Ensembl)
Natural variant   VAR_059379   462        1        G → S         2 publications; corresponds to variant rs3134595 (dbSNP, Ensembl)
Natural variant   VAR_012298   567        1        P → L         2 publications
Natural variant   VAR_012299   612        1        E → A         1 publication
Natural variant   VAR_012300   761        1        P → L         1 publication
Natural variant   VAR_012301   946        1        R → W         1 publication
Natural variant   VAR_012302   976        1        Q → R         1 publication
Natural variant   VAR_012303   1059       1        P → L         1 publication
Natural variant   VAR_012304   1067       1        R → P         1 publication
Natural variant   VAR_012305   1077       1        S → L         1 publication
Natural variant   VAR_059380   1130       1        G → S         Corresponds to variant rs9920074 (dbSNP, Ensembl)
Natural variant   VAR_059381   1345       1        G → A         Corresponds to variant rs7167599 (dbSNP, Ensembl)
Natural variant   VAR_012306   1386       1        G → E         1 publication; corresponds to variant rs2292288 (dbSNP, Ensembl)
Natural variant   VAR_012307   1462       1        F → C         Corresponds to variant rs2292287 (dbSNP, Ensembl)

Alternative sequence

Feature key            Identifier   Position(s)    Length   Description
Alternative sequence   VSP_036478   336 – 339      4        EFRN → DGCE in isoform 3. (1 publication)
Alternative sequence   VSP_036479   340 – 1565     1226     Missing in isoform 3. (1 publication)
Alternative sequence   VSP_002465   1152 – 1463    312      Missing in isoform 2. (2 publications)

Sequence databases

EMBL / GenBank / DDBJ:
AJ310521 mRNA. Translation: CAC83858.1.
AJ310522 mRNA. Translation: CAC83859.1.
AF359284 mRNA. Translation: AAK57487.1.
AJ697971 mRNA. Translation: CAG27071.1.
AB002351 mRNA. Translation: BAA20810.2. Different initiation.
BC110066 mRNA. Translation: AAI10067.1. Different initiation.
BC151243 mRNA. Translation: AAI51244.1.
CCDS: CCDS73786.1. [O15061-2]
      CCDS73787.1. [O15061-1]
RefSeq: NP_056101.5. NM_015286.5.
        NP_663780.2. NM_145728.2.
UniGene: Hs.207106.

Genome annotation databases

Ensembl: ENST00000336292; ENSP00000336775; ENSG00000182253.
GeneID: 23336.
KEGG: hsa:23336.
UCSC: uc032crv.2. human. [O15061-1]

Keywords - Coding sequence diversity

Alternative splicing, Polymorphism

Cross-references

Organism-specific databases

CTD: 23336.
DisGeNET: 23336.
GeneCards: SYNM.
H-InvDB: HIX0172820.
HGNC: HGNC:24466. SYNM.
HPA: CAB017192.
     HPA040066.
     HPA044200.
MIM: 606087. gene.
neXtProt: NX_O15061.
PharmGKB: PA164726408.

Miscellaneous databases

ChiTaRS: SYNM. human.
GenomeRNAi: 23336.
PRO: O15061.

Entry information

Entry name: SYNEM_HUMAN
Accession: Primary (citable) accession number: O15061
Secondary accession number(s): A7E2Y2, Q2TBJ4, Q5NJJ9, Q8TE61, Q8TE62
Entry history
Integrated into UniProtKB/Swiss-Prot: January 23, 2002
Last sequence update: January 23, 2002
Last modified: November 2, 2016
This is version 139 of the entry and version 2 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

Keywords - Technical term

Complete proteome, Reference proteome

Documents

  1. Human chromosome 15
    Human chromosome 15: entries, gene names and cross-references to MIM
  2. Human entries with polymorphisms or disease mutations
    List of human entries with polymorphisms or disease mutations
  3. Human polymorphisms and disease mutations
    Index of human polymorphisms and disease mutations
  4. MIM cross-references
    Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot
  5. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.