Protein

Myomesin-1

Gene

MYOM1

Organism
Homo sapiens (Human)
Status
Reviewed - Annotation score: 5 out of 5 - Experimental evidence at protein level

Function

Major component of the vertebrate myofibrillar M band. Binds myosin, titin, and light meromyosin. This binding is dose-dependent.

GO - Molecular function

  • identical protein binding Source: IntAct
  • protein homodimerization activity Source: UniProtKB
  • structural constituent of muscle Source: ProtInc

Keywords - Molecular function

Muscle protein

Enzyme and pathway databases

BioCyc: ZFISH:ENSG00000101605-MONOMER.

Names & Taxonomy

Protein names
Recommended name:
Myomesin-1
Alternative name(s):
190 kDa connectin-associated protein
190 kDa titin-associated protein
Myomesin family member 1
Gene names
Name: MYOM1
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes
  • UP000005640 Component: Chromosome 18

Organism-specific databases

HGNC: HGNC:7613. MYOM1.

Subcellular location

GO - Cellular component

  • M band Source: UniProtKB
  • striated muscle myosin thick filament Source: ProtInc

Keywords - Cellular component

Cytoplasm, Thick filament

Pathology & Biotech

Organism-specific databases

DisGeNET: 8736.
OpenTargets: ENSG00000101605.
PharmGKB: PA31418.

Polymorphism and mutation databases

BioMuta: MYOM1.
DMDM: 212276443.

PTM / Processing

Molecule processing

Feature key | Identifier | Position(s) | Description | Length
Chain | PRO_0000072684 | 1 – 1685 | Myomesin-1 | 1685

Amino acid modifications

Feature key | Position(s) | Description | Evidence | Length
Modified residue | 113 | Phosphoserine | By similarity | 1
Modified residue | 883 | Phosphoserine | By similarity | 1
Modified residue | 887 | Phosphoserine | By similarity | 1
Modified residue | 1054 | Phosphoserine | By similarity | 1
Disulfide bond | 1160 ↔ 1210 | | PROSITE-ProRule annotation |

Keywords - PTM

Disulfide bond, Phosphoprotein

Proteomic databases

MaxQB: P52179.
PaxDb: P52179.
PeptideAtlas: P52179.
PRIDE: P52179.

PTM databases

iPTMnet: P52179.
PhosphoSitePlus: P52179.

Expression

Gene expression databases

Bgee: ENSG00000101605.
CleanEx: HS_MYOM1.
ExpressionAtlas: P52179. baseline and differential.
Genevisible: P52179. HS.

Organism-specific databases

HPA: HPA014305.
HPA049193.

Interaction

Subunit structure

Homodimer (By similarity). Interacts with TTN/titin (By similarity). Interacts with PNKD. (Evidence: By similarity, 1 Publication)

Binary interactions

With | Entry | #Exp. | IntAct
itself | | 4 | EBI-5353249,EBI-5353249
DYSF | O75923 | 3 | EBI-5353249,EBI-2799016

Protein-protein interaction databases

BioGrid: 114273. 1 interactor.
DIP: DIP-59649N.
IntAct: P52179. 12 interactors.
STRING: 9606.ENSP00000348821.

Structure

Secondary structure

Evidence for all features listed below: Combined sources.

Feature key | Position(s) | Length
Beta strand | 1145 – 1148 | 4
Beta strand | 1154 – 1159 | 6
Beta strand | 1169 – 1173 | 5
Beta strand | 1184 – 1189 | 6
Beta strand | 1192 – 1199 | 8
Helix | 1202 – 1204 | 3
Beta strand | 1206 – 1215 | 10
Beta strand | 1220 – 1224 | 5
Helix | 1226 – 1240 | 15
Beta strand | 1248 – 1254 | 7
Helix | 1256 – 1258 | 3
Beta strand | 1260 – 1265 | 6
Beta strand | 1274 – 1279 | 6
Beta strand | 1282 – 1284 | 3
Beta strand | 1286 – 1289 | 4
Beta strand | 1291 – 1294 | 4
Turn | 1296 – 1298 | 3
Beta strand | 1300 – 1306 | 7
Helix | 1310 – 1312 | 3
Beta strand | 1314 – 1322 | 9
Beta strand | 1325 – 1333 | 9
Helix | 1336 – 1355 | 20
Beta strand | 1357 – 1369 | 13
Turn | 1370 – 1372 | 3
Beta strand | 1373 – 1382 | 10
Beta strand | 1388 – 1393 | 6
Beta strand | 1396 – 1398 | 3
Beta strand | 1407 – 1415 | 9
Helix | 1420 – 1422 | 3
Beta strand | 1424 – 1432 | 9
Beta strand | 1435 – 1443 | 9
Helix | 1446 – 1460 | 15
Beta strand | 1467 – 1470 | 4
Beta strand | 1472 – 1483 | 12
Beta strand | 1489 – 1494 | 6
Beta strand | 1503 – 1510 | 8
Beta strand | 1513 – 1520 | 8
Helix | 1523 – 1525 | 3
Beta strand | 1527 – 1534 | 8
Beta strand | 1539 – 1546 | 8
Helix | 1548 – 1570 | 23
Beta strand | 1573 – 1577 | 5
Beta strand | 1580 – 1585 | 6
Beta strand | 1590 – 1597 | 8
Beta strand | 1603 – 1608 | 6
Beta strand | 1611 – 1613 | 3
Beta strand | 1617 – 1624 | 8
Turn | 1625 – 1627 | 3
Beta strand | 1628 – 1635 | 8
Helix | 1638 – 1640 | 3
Beta strand | 1642 – 1650 | 9
Beta strand | 1653 – 1664 | 12

3D structure databases

PDB entry | Method | Resolution (Å) | Chain | Positions
2R15 | X-ray | 2.24 | A/B | 1459-1667
2Y23 | X-ray | 2.50 | A | 1141-1447
2Y25 | X-ray | 3.50 | A/B/C/D | 1357-1667
3RBS | X-ray | 1.85 | A | 1247-1447
DisProt: DP00517.
ProteinModelPortal: P52179.
SMR: P52179.
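
The four PDB entries listed above cover overlapping stretches of the canonical sequence. As a rough illustration of how much of the 1,685-residue chain has a known crystal structure, the sketch below merges the listed position ranges and counts the covered residues. The function names (`merge_ranges`, `covered_residues`) are hypothetical helpers written for this example, not part of any UniProt or PDB tool.

```python
from typing import Iterable, List, Tuple

# Position ranges (1-based, inclusive) copied from the PDB table in this entry.
PDB_RANGES = {
    "2R15": (1459, 1667),
    "2Y23": (1141, 1447),
    "2Y25": (1357, 1667),
    "3RBS": (1247, 1447),
}
SEQUENCE_LENGTH = 1685  # length of the canonical isoform


def merge_ranges(ranges: Iterable[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Merge overlapping or directly adjacent inclusive ranges."""
    merged: List[Tuple[int, int]] = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1] + 1:
            # Extend the previous range instead of opening a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def covered_residues(ranges: Iterable[Tuple[int, int]]) -> int:
    """Count residues covered by at least one range."""
    return sum(end - start + 1 for start, end in merge_ranges(ranges))


if __name__ == "__main__":
    covered = covered_residues(PDB_RANGES.values())
    print(merge_ranges(PDB_RANGES.values()))
    print(f"{covered} of {SEQUENCE_LENGTH} residues "
          f"({100 * covered / SEQUENCE_LENGTH:.1f}%) have structural coverage")
```

Under these ranges the structures form a single contiguous block at the C-terminal end of the protein, which matches the secondary-structure features all falling between positions 1145 and 1664.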

Miscellaneous databases

EvolutionaryTrace: P52179.

Family & Domains

Domains and Repeats

Feature key | Position(s) | Description | Evidence | Length
Repeat | 182 – 187 | 1 | | 6
Repeat | 188 – 193 | 2 | | 6
Repeat | 194 – 199 | 3 | | 6
Repeat | 200 – 205 | 4 | | 6
Repeat | 206 – 211 | 5 | | 6
Repeat | 212 – 217 | 6 | | 6
Domain | 277 – 368 | Ig-like C2-type 1 | | 92
Domain | 396 – 498 | Ig-like C2-type 2 | | 103
Domain | 512 – 607 | Fibronectin type-III 1 | PROSITE-ProRule annotation | 96
Domain | 640 – 734 | Fibronectin type-III 2 | PROSITE-ProRule annotation | 95
Domain | 741 – 834 | Fibronectin type-III 3 | PROSITE-ProRule annotation | 94
Domain | 933 – 1034 | Fibronectin type-III 4 | PROSITE-ProRule annotation | 102
Domain | 1041 – 1140 | Fibronectin type-III 5 | PROSITE-ProRule annotation | 100
Domain | 1132 – 1230 | Ig-like C2-type 3 | | 99
Domain | 1358 – 1444 | Ig-like C2-type 4 | | 87
Domain | 1573 – 1662 | Ig-like C2-type 5 | | 90

Region

Feature key | Position(s) | Description | Length
Region | 182 – 217 | 6 X 6 AA tandem repeats | 36
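
Feature positions in this entry are 1-based and inclusive, so a feature's length is `end - start + 1`. The sketch below checks that arithmetic and splits the 182 – 217 region into its six 6-residue repeats. The region string was copied by hand from the canonical sequence shown in this entry. The helper names are hypothetical and exist only for illustration.

```python
from typing import List

# Residues 182-217 of the canonical sequence (isoform 1), copied from this entry.
REPEAT_REGION = "KQSTASKQTTASKQSTASKQSTASKQSTASRQSTAS"


def feature_length(start: int, end: int) -> int:
    """Length of a feature given 1-based, inclusive coordinates."""
    return end - start + 1


def split_repeats(region: str, unit: int) -> List[str]:
    """Split a tandem-repeat region into consecutive units of fixed size."""
    if len(region) % unit != 0:
        raise ValueError("region length is not a multiple of the repeat unit")
    return [region[i:i + unit] for i in range(0, len(region), unit)]


if __name__ == "__main__":
    print(feature_length(277, 368))   # Ig-like C2-type 1
    print(feature_length(182, 217))   # tandem repeat region
    for number, repeat in enumerate(split_repeats(REPEAT_REGION, 6), start=1):
        print(number, repeat)
```

Five of the six units share the same K/R-Q-S-T-A-S pattern and one carries a T in place of S at its third position, which is why the entry describes the region as tandem repeats rather than exact copies.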

Sequence similarities

Contains 5 fibronectin type-III domains. (PROSITE-ProRule annotation)

Keywords - Domain

Immunoglobulin domain, Repeat

Phylogenomic databases

eggNOG: ENOG410INWB. Eukaryota.
ENOG411180Q. LUCA.
GeneTree: ENSGT00860000133685.
HOGENOM: HOG000293283.
HOVERGEN: HBG004977.
InParanoid: P52179.
OMA: EDTAQYR.
OrthoDB: EOG091G0095.
PhylomeDB: P52179.
TreeFam: TF331825.

Family and domain databases

CDD: cd00063. FN3. 5 hits.
Gene3D: 2.60.40.10. 12 hits.
InterPro: IPR003961. FN3_dom.
IPR007110. Ig-like_dom.
IPR013783. Ig-like_fold.
IPR013098. Ig_I-set.
IPR003599. Ig_sub.
IPR003598. Ig_sub2.
Pfam: PF00041. fn3. 5 hits.
PF07679. I-set. 5 hits.
SMART: SM00060. FN3. 5 hits.
SM00409. IG. 6 hits.
SM00408. IGc2. 5 hits.
SUPFAM: SSF48726. SSF48726. 7 hits.
SSF49265. SSF49265. 3 hits.
PROSITE: PS50853. FN3. 5 hits.
PS50835. IG_LIKE. 4 hits.

Sequences (2)

Sequence status: Complete.

This entry describes 2 isoforms produced by alternative splicing.

Isoform 1 (identifier: P52179-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MSLPFYQRCH QHYDLSYRNK DVRSTVSHYQ REKKRSAVYT QGSTAYSSRS
60 70 80 90 100
SAAHRRESEA FRRASASSSQ QQASQHALSS EVSRKAASAY DYGSSHGLTD
110 120 130 140 150
SSLLLDDYSS KLSPKPKRAK HSLLSGEEKE NLPSDYMVPI FSGRQKHVSG
160 170 180 190 200
ITDTEEERIK EAAAYIAQRN LLASEEGITT SKQSTASKQT TASKQSTASK
210 220 230 240 250
QSTASKQSTA SRQSTASRQS VVSKQATSAL QQEETSEKKS RKVVIREKAE
260 270 280 290 300
RLSLRKTLEE TETYHAKLNE DHLLHAPEFI IKPRSHTVWE KENVKLHCSI
310 320 330 340 350
AGWPEPRVTW YKNQVPINVH ANPGKYIIES RYGMHTLEIN GCDFEDTAQY
360 370 380 390 400
RASAMNVKGE LSAYASVVVK RYKGEFDETR FHAGASTMPL SFGVTPYGYA
410 420 430 440 450
SRFEIHFDDK FDVSFGREGE TMSLGCRVVI TPEIKHFQPE IQWYRNGVPL
460 470 480 490 500
SPSKWVQTLW SGERATLTFS HLNKEDEGLY TIRVRMGEYY EQYSAYVFVR
510 520 530 540 550
DADAEIEGAP AAPLDVKCLE ANKDYIIISW KQPAVDGGSP ILGYFIDKCE
560 570 580 590 600
VGTDSWSQCN DTPVKFARFP VTGLIEGRSY IFRVRAVNKM GIGFPSRVSE
610 620 630 640 650
PVAALDPAEK ARLKSRPSAP WTGQIIVTEE EPSEGIVPGP PTDLSVTEAT
660 670 680 690 700
RSYVVLSWKP PGQRGHEGIM YFVEKCEAGT ENWQRVNTEL PVKSPRFALF
710 720 730 740 750
DLAEGKSYCF RVRCSNSAGV GEPSEATEVT VVGDKLDIPK APGKIIPSRN
760 770 780 790 800
TDTSVVVSWE ESKDAKELVG YYIEASVAGS GKWEPCNNNP VKGSRFTCHG
810 820 830 840 850
LVTGQSYIFR VRAVNAAGLS EYSQDSEAIE VKAAIGGGVS PDVCPALSDE
860 870 880 890 900
PGGLTASRGR VHEASPPTFQ KDALLGSKPN KPSLPSSSQN LGQTEVSKVS
910 920 930 940 950
ETVQEELTPP PQKAAPQGKS KSDPLKKKTD RAPPSPPCDI TCLESFRDSM
960 970 980 990 1000
VLGWKQPDKI GGAEITGYYV NYREVIDGVP GKWREANVKA VSEEAYKISN
1010 1020 1030 1040 1050
LKENMVYQFQ VAAMNMAGLG APSAVSECFK CEEWTIAVPG PPHSLKCSEV
1060 1070 1080 1090 1100
RKDSLVLQWK PPVHSGRTPV TGYFVDLKEA KAKEDQWRGL NEAAIKNVYL
1110 1120 1130 1140 1150
KVRGLKEGVS YVFRVRAINQ AGVGKPSDLA GPVVAETRPG TKEVVVNVDD
1160 1170 1180 1190 1200
DGVISLNFEC DKMTPKSEFS WSKDYVSTED SPRLEVESKG NKTKMTFKDL
1210 1220 1230 1240 1250
GMDDLGIYSC DVTDTDGIAS SYLIDEEELK RLLALSHEHK FPTVPVKSEL
1260 1270 1280 1290 1300
AVEILEKGQV RFWMQAEKLS GNAKVNYIFN EKEIFEGPKY KMHIDRNTGI
1310 1320 1330 1340 1350
IEMFMEKLQD EDEGTYTFQL QDGKATNHST VVLVGDVFKK LQKEAEFQRQ
1360 1370 1380 1390 1400
EWIRKQGPHF VEYLSWEVTG ECNVLLKCKV ANIKKETHIV WYKDEREISV
1410 1420 1430 1440 1450
DEKHDFKDGI CTLLITEFSK KDAGIYEVIL KDDRGKDKSR LKLVDEAFKE
1460 1470 1480 1490 1500
LMMEVCKKIA LSATDLKIQS TAEGIQLYSF VTYYVEDLKV NWSHNGSAIR
1510 1520 1530 1540 1550
YSDRVKTGVT GEQIWLQINE PTPNDKGKYV MELFDGKTGH QKTVDLSGQA
1560 1570 1580 1590 1600
YDEAYAEFQR LKQAAIAEKN RARVLGGLPD VVTIQEGKAL NLTCNVWGDP
1610 1620 1630 1640 1650
PPEVSWLKNE KALASDDHCN LKFEAGRTAY FTINGVSTAD SGKYGLVVKN
1660 1670 1680
KYGSETSDFT VSVFIPEEEA RMAALESLKG GKKAK
Length: 1,685
Mass (Da): 187,627
Last modified: November 4, 2008 - v2
Checksum: F0DBB25EB6323DF8
Isoform 2 (identifier: P52179-2)

The sequence of this isoform differs from the canonical sequence as follows:
     836-931: Missing.

Length: 1,589
Mass (Da): 177,700
Checksum: AB161B6C31ED4FDC
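
Isoform 2 is described only as a difference from the canonical sequence: residues 836-931 are missing. A minimal sketch of how such an isoform could be derived from isoform 1 follows, assuming 1-based inclusive coordinates as used throughout this entry. `remove_segment` is a hypothetical helper, and a placeholder string of the right length stands in for the full sequence so that the example stays short.

```python
def remove_segment(sequence: str, start: int, end: int) -> str:
    """Return `sequence` without residues start..end (1-based, inclusive)."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("segment lies outside the sequence")
    # Python slices are 0-based and end-exclusive, hence start - 1 and end.
    return sequence[:start - 1] + sequence[end:]


if __name__ == "__main__":
    # Placeholder with the canonical length; substitute the real isoform 1
    # sequence from this entry to obtain the actual isoform 2 sequence.
    canonical_placeholder = "A" * 1685
    isoform_2 = remove_segment(canonical_placeholder, 836, 931)
    print(len(isoform_2))
```

The deleted segment spans 931 - 836 + 1 = 96 residues, so the result has 1,685 - 96 = 1,589 residues, which agrees with the length listed for isoform 2.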

Sequence caution

The sequence CAA48833 differs from that shown. Reason: Erroneous initiation. (Curated)

Experimental Info

Evidence for all sequence conflicts listed below: Curated.

Feature key | Position(s) | Description | Length
Sequence conflict | 247 | E → G in CAA48833 (PubMed:7505783). | 1
Sequence conflict | 442 | Q → R in CAA48833 (PubMed:7505783). | 1
Sequence conflict | 599 | S → F in AAI16184 (PubMed:15489334). | 1
Sequence conflict | 601 | P → A in CAA48833 (PubMed:7505783). | 1
Sequence conflict | 615 | S → G in AAI16184 (PubMed:15489334). | 1
Sequence conflict | 616 – 625 | RPSAPWTGQI → PLSTLDWTV in CAA48833 (PubMed:7505783). | 10
Sequence conflict | 776 | S → N in CAA48833 (PubMed:7505783). | 1
Sequence conflict | 793 – 794 | GS → TH in CAA48833 (PubMed:7505783). | 2
Sequence conflict | 992 | S → R in CAA48833 (PubMed:7505783). | 1
Sequence conflict | 1001 | L → S in BAC86128 (PubMed:14702039). | 1
Sequence conflict | 1080 | Missing in CAA48833 (PubMed:7505783). | 1
Sequence conflict | 1141 | T → R in CAA48833 (PubMed:7505783). | 1
Sequence conflict | 1615 – 1616 | SD → QT in CAA48833 (PubMed:7505783). | 2
Sequence conflict | 1617 | D → G in BAC86128 (PubMed:14702039). | 1

Natural variant

Feature key | Identifier | Position(s) | Description | dbSNP | Length
Natural variant | VAR_047221 | 22 | V → L. | rs1791085 | 1
Natural variant | VAR_047222 | 181 | S → P. 3 Publications | rs1962519 | 1
Natural variant | VAR_047223 | 215 | T → M. | rs2230165 | 1
Natural variant | VAR_047224 | 341 | G → A. 4 Publications | rs8099021 | 1
Natural variant | VAR_047225 | 600 | E → V. | rs9807556 | 1
Natural variant | VAR_047226 | 960 | I → T. 3 Publications | rs1071600 | 1
Natural variant | VAR_047227 | 1408 | D → N. | rs3765623 | 1
Natural variant | VAR_047228 | 1453 | M → T. | rs16944397 | 1
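
Each natural variant above names a 1-based position in the canonical sequence, the reference residue, and the replacement. The sketch below applies one such substitution and first checks that the reference residue really is at the stated position, which is a cheap way to catch off-by-one errors. Only the first 50 residues of the canonical sequence are included, copied from this entry. `apply_substitution` is a hypothetical helper.

```python
# Residues 1-50 of the canonical sequence (isoform 1), copied from this entry.
N_TERMINUS = "MSLPFYQRCHQHYDLSYRNKDVRSTVSHYQREKKRSAVYTQGSTAYSSRS"


def apply_substitution(sequence: str, position: int, ref: str, alt: str) -> str:
    """Replace the residue at a 1-based position after verifying the reference."""
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    observed = sequence[position - 1]
    if observed != ref:
        raise ValueError(
            f"expected {ref} at position {position}, found {observed}"
        )
    return sequence[:position - 1] + alt + sequence[position:]


if __name__ == "__main__":
    # VAR_047221: V -> L at position 22.
    variant = apply_substitution(N_TERMINUS, 22, "V", "L")
    print(N_TERMINUS[18:25], "->", variant[18:25])
```

The same function would work for the other listed variants given the full 1,685-residue sequence. Multi-residue sequence conflicts and deletions would need a different helper.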

Alternative sequence

Feature key | Identifier | Position(s) | Description | Length
Alternative sequence | VSP_035663 | 836 – 931 | Missing in isoform 2. 2 Publications | 96

Sequence databases

EMBL/GenBank/DDBJ:
AJ621424 mRNA. Translation: CAF18565.1.
AK125322 mRNA. Translation: BAC86128.1.
AP005329 Genomic DNA. No translation available.
AP005431 Genomic DNA. No translation available.
BC116183 mRNA. Translation: AAI16184.1.
X69090 mRNA. Translation: CAA48833.1. Different initiation.
CCDS: CCDS45823.1. [P52179-2]
CCDS45824.1. [P52179-1]
PIR: S42167.
RefSeq: NP_003794.3. NM_003803.3. [P52179-1]
NP_062830.1. NM_019856.1. [P52179-2]
UniGene: Hs.464469.

Genome annotation databases

Ensembl: ENST00000261606; ENSP00000261606; ENSG00000101605. [P52179-2]
ENST00000356443; ENSP00000348821; ENSG00000101605. [P52179-1]
GeneID: 8736.
KEGG: hsa:8736.
UCSC: uc002klp.3. human. [P52179-1]

Keywords - Coding sequence diversity

Alternative splicing, Polymorphism

Cross-references

Organism-specific databases

CTD: 8736.
DisGeNET: 8736.
GeneCards: MYOM1.
H-InvDB: HIX0017329.
HGNC: HGNC:7613. MYOM1.
HPA: HPA014305.
HPA049193.
MIM: 603508. gene.
neXtProt: NX_P52179.
OpenTargets: ENSG00000101605.
PharmGKB: PA31418.

Miscellaneous databases

EvolutionaryTrace: P52179.
GeneWiki: MYOM1.
GenomeRNAi: 8736.
PRO: P52179.

Entry information

Entry name: MYOM1_HUMAN
Accession: Primary (citable) accession number: P52179
Secondary accession number(s): Q14BD6, Q6H969, Q6ZUU0
Entry history
Integrated into UniProtKB/Swiss-Prot: October 1, 1996
Last sequence update: November 4, 2008
Last modified: November 30, 2016
This is version 145 of the entry and version 2 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

Keywords - Technical term

3D-structure, Complete proteome, Reference proteome

Documents

  1. Human chromosome 18
    Human chromosome 18: entries, gene names and cross-references to MIM
  2. Human entries with polymorphisms or disease mutations
    List of human entries with polymorphisms or disease mutations
  3. Human polymorphisms and disease mutations
    Index of human polymorphisms and disease mutations
  4. MIM cross-references
    Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot
  5. PDB cross-references
    Index of Protein Data Bank (PDB) cross-references
  6. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.