Protein

Myosin-IIIa

Gene

MYO3A

Organism
Homo sapiens (Human)
Status
Reviewed. Annotation score: 5 out of 5. Experimental evidence at transcript level.

Function

Probable actin-based motor with a protein kinase activity. Probably plays a role in vision and hearing. (1 publication)

Catalytic activity

ATP + a protein = ADP + a phosphoprotein.

Sites

Feature key | Position(s) | Description | Length
Binding site | 50 | ATP (PROSITE-ProRule annotation) | 1
Active site | 150 | Proton acceptor (PROSITE-ProRule annotation) | 1

Regions

Feature key | Position(s) | Description | Length
Nucleotide binding | 27 – 35 | ATP (PROSITE-ProRule annotation) | 9
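
The positions in the Sites and Regions tables are 1-based and refer to the canonical isoform 1 sequence listed under Sequences. As a minimal sketch (the constant and helper names are illustrative and not part of any UniProt tooling), the following Python snippet looks up those positions in the first 150 residues of that sequence, copied from this entry.

```python
# First 150 residues of canonical isoform Q8NEV4-1, copied from this entry.
N_TERMINAL_150 = (
    "MFPLIGKTIIFDNFPDPSDTWEITETIGKGTYGKVFKVLNKKNGQKAAVK"
    "ILDPIHDIDEEIEAEYNILKALSDHPNVVRFYGIYFKKDKVNGDKLWLVL"
    "ELCSGGSVTDLVKGFLKRGERMSEPLIAYILHEALMGLQHLHNNKTIHRD"
)


def residues(seq, start, end=None):
    """Return residues start..end (1-based, inclusive), as in the feature tables."""
    end = start if end is None else end
    if not 1 <= start <= end <= len(seq):
        raise ValueError("position outside the supplied sequence")
    return seq[start - 1:end]


atp_binding_site = residues(N_TERMINAL_150, 50)        # Binding site, position 50
active_site = residues(N_TERMINAL_150, 150)            # Active site, position 150
nucleotide_binding = residues(N_TERMINAL_150, 27, 35)  # Nucleotide binding, 27 – 35

print(atp_binding_site, active_site, nucleotide_binding)
```

On this sequence the lookups return a lysine (K) at position 50, an aspartate (D) at position 150, and the nine-residue stretch IGKGTYGKV for 27 – 35, which matches the length of 9 given in the table.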

GO - Molecular function

  • actin-dependent ATPase activity Source: UniProtKB
  • ADP binding Source: UniProtKB
  • ATP binding Source: UniProtKB-KW
  • calmodulin binding Source: UniProtKB
  • microfilament motor activity Source: UniProtKB
  • plus-end directed microfilament motor activity Source: UniProtKB
  • protein kinase activity Source: UniProtKB
  • protein serine/threonine kinase activity Source: UniProtKB-KW

GO - Biological process

  • protein autophosphorylation Source: UniProtKB
  • response to stimulus Source: UniProtKB-KW
  • sensory perception of sound Source: UniProtKB
  • visual perception Source: UniProtKB-KW

Keywords - Molecular function

Kinase, Motor protein, Myosin, Serine/threonine-protein kinase, Transferase

Keywords - Biological process

Hearing, Sensory transduction, Vision

Keywords - Ligand

Actin-binding, ATP-binding, Nucleotide-binding

Enzyme and pathway databases

BioCyc: ZFISH:HS01834-MONOMER.
SignaLink: Q8NEV4.

Names & Taxonomy

Protein names
Recommended name:
Myosin-IIIa (EC:2.7.11.1)
Gene names
Name: MYO3A
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes
  • UP000005640 Component: Chromosome 10

Organism-specific databases

HGNC: HGNC:7601. MYO3A.

Subcellular location

GO - Cellular component

  • cytoplasm Source: UniProtKB-KW
  • filamentous actin Source: UniProtKB
  • filopodium Source: UniProtKB
  • myosin complex Source: UniProtKB-KW

Keywords - Cellular component

Cytoplasm, Cytoskeleton

Pathology & Biotech

Involvement in disease

Deafness, autosomal recessive, 30 (DFNB30) (1 publication)
The disease is caused by mutations affecting the gene represented in this entry.
Disease description: A form of non-syndromic deafness characterized by bilateral progressive hearing loss, which first affects the high frequencies. Hearing loss begins in the second decade, and by age 50 is severe in high and middle frequencies and moderate at low frequencies.
See also OMIM:607101

Keywords - Disease

Deafness, Non-syndromic deafness

Organism-specific databases

DisGeNET: 53904.
MalaCards: MYO3A.
MIM: 607101. phenotype.
OpenTargets: ENSG00000095777.
Orphanet: 90636. Autosomal recessive non-syndromic sensorineural deafness type DFNB.
PharmGKB: PA31405.

Chemistry databases

ChEMBL: CHEMBL5546.
GuidetoPHARMACOLOGY: 2112.

Polymorphism and mutation databases

BioMuta: MYO3A.
DMDM: 160112826.

PTM / Processing

Molecule processing

Feature key | Identifier | Position(s) | Description | Length
Chain | PRO_0000086413 | 1 – 1616 | Myosin-IIIa | 1616

Proteomic databases

EPD: Q8NEV4.
PaxDb: Q8NEV4.
PeptideAtlas: Q8NEV4.
PRIDE: Q8NEV4.

PTM databases

iPTMnet: Q8NEV4.
PhosphoSitePlus: Q8NEV4.

Expression

Tissue specificity

Strongest expression in retina, retinal pigment epithelial cells, cochlea and pancreas.

Gene expression databases

Bgee: ENSG00000095777.
CleanEx: HS_MYO3A.
ExpressionAtlas: Q8NEV4. baseline and differential.
Genevisible: Q8NEV4. HS.

Organism-specific databases

HPA: HPA048951.

Interaction

GO - Molecular function

  • calmodulin binding Source: UniProtKB

Protein-protein interaction databases

BioGrid: 119814. 3 interactors.
IntAct: Q8NEV4. 1 interactor.
STRING: 9606.ENSP00000265944.

Chemistry databases

BindingDB: Q8NEV4.

Structure

3D structure databases

ProteinModelPortal: Q8NEV4.
SMR: Q8NEV4.
ModBase: Search...
MobiDB: Search...

Family & Domains

Domains and Repeats

Feature key | Position(s) | Description | Length
Domain | 21 – 287 | Protein kinase (PROSITE-ProRule annotation) | 267
Domain | 338 – 1053 | Myosin motor | 716
Domain | 1055 – 1084 | IQ 1 (PROSITE-ProRule annotation) | 30
Domain | 1082 – 1111 | IQ 2 (PROSITE-ProRule annotation) | 30
Domain | 1346 – 1375 | IQ 3 (PROSITE-ProRule annotation) | 30
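
The Length column is the inclusive span of each range, that is, end minus start plus one. The short Python sketch below (helper names are illustrative assumptions) recomputes the lengths from the coordinates listed in this entry and also reports how many residues the IQ 1 and IQ 2 ranges share, since their coordinates overlap.

```python
# Domain coordinates (1-based, inclusive) as listed in this entry.
DOMAINS = [
    ("Protein kinase", 21, 287),
    ("Myosin motor", 338, 1053),
    ("IQ 1", 1055, 1084),
    ("IQ 2", 1082, 1111),
    ("IQ 3", 1346, 1375),
]


def span_length(start, end):
    """Length of an inclusive 1-based range."""
    return end - start + 1


def shared_residues(a, b):
    """Number of positions shared by two inclusive ranges (0 if disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


lengths = {name: span_length(start, end) for name, start, end in DOMAINS}
iq1_iq2_shared = shared_residues((1055, 1084), (1082, 1111))

for name, length in lengths.items():
    print(f"{name}: {length}")
print("IQ 1 / IQ 2 shared residues:", iq1_iq2_shared)
```

The recomputed lengths agree with the table (267, 716, 30, 30, 30), and the IQ 1 and IQ 2 ranges share three positions (1082 – 1084).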

Sequence similarities

In the C-terminal section; belongs to the TRAFAC class myosin-kinesin ATPase superfamily. Myosin family. (Curated)
In the N-terminal section; belongs to the protein kinase superfamily. STE Ser/Thr protein kinase family. (Curated)
Contains 3 IQ domains. (PROSITE-ProRule annotation)
Contains 1 myosin motor domain. (Curated)
Contains 1 protein kinase domain. (PROSITE-ProRule annotation)

Keywords - Domain

Repeat

Phylogenomic databases

eggNOG: KOG0587. Eukaryota.
KOG4229. Eukaryota.
COG5022. LUCA.
GeneTree: ENSGT00840000129687.
HOGENOM: HOG000234203.
HOVERGEN: HBG052555.
InParanoid: Q8NEV4.
KO: K08834.
OMA: NFRGHRE.
OrthoDB: EOG091G00C2.
PhylomeDB: Q8NEV4.
TreeFam: TF326512.

Family and domain databases

InterPro: IPR000048. IQ_motif_EF-hand-BS.
IPR011009. Kinase-like_dom.
IPR001609. Myosin_head_motor_dom.
IPR027417. P-loop_NTPase.
IPR000719. Prot_kinase_dom.
IPR017441. Protein_kinase_ATP_BS.
Pfam: PF00612. IQ. 2 hits.
PF00063. Myosin_head. 1 hit.
PF00069. Pkinase. 1 hit.
PRINTS: PR00193. MYOSINHEAVY.
SMART: SM00015. IQ. 3 hits.
SM00242. MYSc. 1 hit.
SM00220. S_TKc. 1 hit.
SUPFAM: SSF52540. SSF52540. 2 hits.
SSF56112. SSF56112. 1 hit.
PROSITE: PS50096. IQ. 3 hits.
PS51456. MYOSIN_MOTOR. 1 hit.
PS00107. PROTEIN_KINASE_ATP. 1 hit.
PS50011. PROTEIN_KINASE_DOM. 1 hit.

Sequences (2)

Sequence status: Complete.

This entry describes 2 isoforms produced by alternative splicing.

Isoform 1 (identifier: Q8NEV4-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MFPLIGKTII FDNFPDPSDT WEITETIGKG TYGKVFKVLN KKNGQKAAVK
60 70 80 90 100
ILDPIHDIDE EIEAEYNILK ALSDHPNVVR FYGIYFKKDK VNGDKLWLVL
110 120 130 140 150
ELCSGGSVTD LVKGFLKRGE RMSEPLIAYI LHEALMGLQH LHNNKTIHRD
160 170 180 190 200
VKGNNILLTT EGGVKLVDFG VSAQLTSTRH RRNTSVGTPF WMAPEVIACE
210 220 230 240 250
QQLDTTYDAR CDTWSLGITA IELGDGDPPL ADLHPMRALF KIPRNPPPKL
260 270 280 290 300
RQPELWSAEF NDFISKCLTK DYEKRPTVSE LLQHKFITQI EGKDVMLQKQ
310 320 330 340 350
LTEFIGIHQC MGGTEKARRE RIHTKKGNFN RPLISNLKDV DDLATLEILD
360 370 380 390 400
ENTVSEQLEK CYSRDQIYVY VGDILIALNP FQSLGLYSTK HSKLYIGSKR
410 420 430 440 450
TASPPHIFAM ADLGYQSMIT YNSDQCIVIS GESGAGKTEN AHLLVQQLTV
460 470 480 490 500
LGKANNRTLQ EKILQVNNLV EAFGNACTII NDNSSRFGKY LEMKFTSSGA
510 520 530 540 550
VVGAQISEYL LEKSRVIHQA IGEKNFHIFY YIYAGLAEKK KLAHYKLPEN
560 570 580 590 600
KPPRYLQNDH LRTVQDIMNN SFYKSQYELI EQCFKVIGFT MEQLGSIYSI
610 620 630 640 650
LAAILNVGNI EFSSVATEHQ IDKSHISNHT ALENCASLLC IRADELQEAL
660 670 680 690 700
TSHCVVTRGE TIIRPNTVEK ATDVRDAMAK TLYGRLFSWI VNCINSLLKH
710 720 730 740 750
DSSPSGNGDE LSIGILDIFG FENFKKNSFE QLCINIANEQ IQYYYNQHVF
760 770 780 790 800
AWEQNEYLNE DVDARVIEYE DNWPLLDMFL QKPMGLLSLL DEESRFPKAT
810 820 830 840 850
DQTLVEKFEG NLKSQYFWRP KRMELSFGIH HYAGKVLYNA SGFLAKNRDT
860 870 880 890 900
LPTDIVLLLR SSDNSVIRQL VNHPLTKTGN LPHSKTKNVI NYQMRTSEKL
910 920 930 940 950
INLAKGDTGE ATRHARETTN MKTQTVASYF RYSLMDLLSK MVVGQPHFVR
960 970 980 990 1000
CIKPNSERQA RKYDKEKVLL QLRYTGILET ARIRRLGFSH RILFANFIKR
1010 1020 1030 1040 1050
YYLLCYKSSE EPRMSPDTCA TILEKAGLDN WALGKTKVFL KYYHVEQLNL
1060 1070 1080 1090 1100
MRKEAIDKLI LIQACVRAFL CSRRYQKIQE KRKESAIIIQ SAARGHLVRK
1110 1120 1130 1140 1150
QRKEIVDMKN TAVTTIQTSD QEFDYKKNFE NTRESFVKKQ AENAISANER
1160 1170 1180 1190 1200
FISAPNNKGS VSVVKTSTFK PEEETTNAVE SNNRVYQTPK KMNNVYEEEV
1210 1220 1230 1240 1250
KQEFYLVGPE VSPKQKSVKD LEENSNLRKV EKEEAMIQSY YQRYTEERNC
1260 1270 1280 1290 1300
EESKAAYLER KAISERPSYP VPWLAENETS FKKTLEPTLS QRSIYQNANS
1310 1320 1330 1340 1350
MEKEKKTSVV TQRAPICSQE EGRGRLRHET VKERQVEPVT QAQEEEDKAA
1360 1370 1380 1390 1400
VFIQSKYRGY KRRQQLRKDK MSSFKHQRIV TTPTEVARNT HNLYSYPTKH
1410 1420 1430 1440 1450
EEINNIKKKD NKDSKATSER EACGLAIFSK QISKLSEEYF ILQKKLNEMI
1460 1470 1480 1490 1500
LSQQLKSLYL GVSHHKPINR RVSSQQCLSG VCKGEEPKIL RPPRRPRKPK
1510 1520 1530 1540 1550
TLNNPEDSTY YYLLHKSIQE EKRRPRKDSQ GKLLDLEDFY YKEFLPSRSG
1560 1570 1580 1590 1600
PKEHSPSLRE RRPQQELQNQ CIKANERCWA AESPEKEEER EPAANPYDFR
1610
RLLRKTSQRR RLVQQS
Length: 1,616
Mass (Da): 186,208
Last modified: November 13, 2007 - v2
Checksum: 7D126A7E22520574

Isoform 2 (identifier: Q8NEV4-2)

The sequence of this isoform differs from the canonical sequence as follows:
     245-247: NPP → SDD
     248-1616: Missing.

Note: No experimental confirmation available.
Length: 247
Mass (Da): 27,678
Checksum: BAB051B0CFF58532
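
The isoform 2 description can be read as a list of edits against canonical coordinates. The following Python sketch (the edit-list representation and the `apply_edits` helper are illustrative assumptions, not a UniProt format or API) applies those two edits to the first 250 residues of the canonical sequence, copied from this entry. The truncated input is enough here because everything from position 248 onward is removed.

```python
# Residues 1-250 of canonical isoform Q8NEV4-1, copied from this entry.
CANONICAL_1_250 = (
    "MFPLIGKTIIFDNFPDPSDTWEITETIGKGTYGKVFKVLNKKNGQKAAVK"
    "ILDPIHDIDEEIEAEYNILKALSDHPNVVRFYGIYFKKDKVNGDKLWLVL"
    "ELCSGGSVTDLVKGFLKRGERMSEPLIAYILHEALMGLQHLHNNKTIHRD"
    "VKGNNILLTTEGGVKLVDFGVSAQLTSTRHRRNTSVGTPFWMAPEVIACE"
    "QQLDTTYDARCDTWSLGITAIELGDGDPPLADLHPMRALFKIPRNPPPKL"
)

# (start, end, replacement) in 1-based inclusive canonical coordinates.
# An empty replacement encodes "Missing".
ISOFORM_2_EDITS = [
    (245, 247, "SDD"),
    (248, 1616, ""),
]


def apply_edits(seq, edits):
    """Apply edits from the C-terminal end backwards, so the coordinates of
    edits closer to the N-terminus are still valid when they are applied."""
    for start, end, replacement in sorted(edits, reverse=True):
        seq = seq[:start - 1] + replacement + seq[end:]
    return seq


# The segment being replaced should be the one named in the entry.
assert CANONICAL_1_250[244:247] == "NPP"

isoform_2 = apply_edits(CANONICAL_1_250, ISOFORM_2_EDITS)
print(len(isoform_2), isoform_2[-7:])
```

The derived sequence is 247 residues long, in agreement with the length listed for isoform 2, and ends in the substituted SDD tripeptide.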

Experimental Info

Feature key | Position(s) | Description | Length
Sequence conflict | 418 | M → I (PubMed:10936054). Curated | 1
Sequence conflict | 418 | M → I (PubMed:12032315). Curated | 1
Sequence conflict | 636 | A → V (PubMed:10936054). Curated | 1
Sequence conflict | 636 | A → V (PubMed:12032315). Curated | 1
Sequence conflict | 848 – 851 | RDTL → KTLV in AAF70861 (PubMed:12032315). Curated | 4
Sequence conflict | 886 – 890 | TKNVI → LKML in AAF70861 (PubMed:12032315). Curated | 5
Sequence conflict | 1099 | R → G (PubMed:10936054). Curated | 1
Sequence conflict | 1099 | R → G (PubMed:12032315). Curated | 1
Sequence conflict | 1217 | S → F (PubMed:10936054). Curated | 1
Sequence conflict | 1217 | S → F (PubMed:12032315). Curated | 1
Sequence conflict | 1378 | R → K (PubMed:10936054). Curated | 1
Sequence conflict | 1378 | R → K (PubMed:12032315). Curated | 1

Natural variant

Feature key | Identifier | Position(s) | Description | Length
Natural variant | VAR_040871 | 178 | T → I. 1 publication. Corresponds to variant rs33968748 (dbSNP, Ensembl). | 1
Natural variant | VAR_021866 | 204 | D → N. Corresponds to variant rs3737274 (dbSNP, Ensembl). | 1
Natural variant | VAR_040872 | 319 | R → H. 1 publication. Corresponds to variant rs3824700 (dbSNP, Ensembl). | 1
Natural variant | VAR_040873 | 348 | I → V. 1 publication. Corresponds to variant rs3824699 (dbSNP, Ensembl). | 1
Natural variant | VAR_040874 | 369 | V → I. 1 publication. Corresponds to variant rs3817420 (dbSNP, Ensembl). | 1
Natural variant | VAR_040875 | 525 | N → K in an ovarian mucinous carcinoma sample; somatic mutation. 1 publication. | 1
Natural variant | VAR_040876 | 833 | A → S. 1 publication. Corresponds to variant rs33947968 (dbSNP, Ensembl). | 1
Natural variant | VAR_021867 | 956 | S → N. 1 publication. Corresponds to variant rs3758449 (dbSNP, Ensembl). | 1
Natural variant | VAR_040877 | 956 | S → R in an ovarian serous carcinoma sample; somatic mutation. 1 publication. | 1
Natural variant | VAR_040878 | 1032 | A → T. 1 publication. Corresponds to variant rs34918608 (dbSNP, Ensembl). | 1
Natural variant | VAR_040879 | 1045 | V → M. 1 publication. Corresponds to variant rs35447806 (dbSNP, Ensembl). | 1
Natural variant | VAR_040880 | 1137 | V → M. 1 publication. Corresponds to variant rs35449183 (dbSNP, Ensembl). | 1
Natural variant | VAR_040881 | 1195 | V → A. 1 publication. Corresponds to variant rs35675577 (dbSNP, Ensembl). | 1
Natural variant | VAR_033905 | 1284 | T → S. 1 publication. Corresponds to variant rs3740231 (dbSNP, Ensembl). | 1
Natural variant | VAR_040882 | 1287 | P → T. 1 publication. Corresponds to variant rs35575696 (dbSNP, Ensembl). | 1
Natural variant | VAR_022779 | 1313 | R → S. 3 publications. Corresponds to variant rs1999240 (dbSNP, Ensembl). | 1
Natural variant | VAR_040883 | 1347 | D → H in a renal clear cell carcinoma sample; somatic mutation. 1 publication. | 1
Natural variant | VAR_040884 | 1417 | T → I. 1 publication. Corresponds to variant rs34151474 (dbSNP, Ensembl). | 1
Natural variant | VAR_040885 | 1488 | K → E. 1 publication. Corresponds to variant rs34204285 (dbSNP, Ensembl). | 1

Alternative sequence

Feature key | Identifier | Position(s) | Description | Length
Alternative sequence | VSP_056231 | 245 – 247 | NPP → SDD in isoform 2. 1 publication. | 3
Alternative sequence | VSP_056232 | 248 – 1616 | Missing in isoform 2. 1 publication. | 1369

Sequence databases

EMBL / GenBank / DDBJ:
AF229172 mRNA. Translation: AAF70861.1.
AY101367 mRNA. Translation: AAM34500.1.
AL162503, AL358612, AL360217, AL391812 Genomic DNA. Translation: CAH73661.1.
AL358612, AL162503, AL360217, AL391812 Genomic DNA. Translation: CAH73814.1.
AL360217 Genomic DNA. Translation: CAD13206.2.
AL391812, AL162503, AL358612, AL360217 Genomic DNA. Translation: CAI17380.1.
BC036079 mRNA. Translation: AAH36079.1.
CCDS: CCDS7148.1. [Q8NEV4-1]
RefSeq: NP_059129.3. NM_017433.4. [Q8NEV4-1]
XP_011517800.1. XM_011519498.2. [Q8NEV4-1]
XP_011517801.1. XM_011519499.1. [Q8NEV4-1]
XP_011517802.1. XM_011519500.2. [Q8NEV4-1]
UniGene: Hs.662630.

Genome annotation databases

Ensembl: ENST00000265944; ENSP00000265944; ENSG00000095777. [Q8NEV4-1]
ENST00000376302; ENSP00000365479; ENSG00000095777. [Q8NEV4-2]
GeneID: 53904.
KEGG: hsa:53904.
UCSC: uc001ism.3. human. [Q8NEV4-1]

Keywords - Coding sequence diversity

Alternative splicing, Polymorphism

Cross-references

Protocols and materials databases

Structural Biology Knowledgebase: Search...

Organism-specific databases

CTD: 53904.
DisGeNET: 53904.
GeneCards: MYO3A.
GeneReviews: MYO3A.
H-InvDB: HIX0201520.
HGNC: HGNC:7601. MYO3A.
HPA: HPA048951.
MalaCards: MYO3A.
MIM: 606808. gene.
607101. phenotype.
neXtProt: NX_Q8NEV4.
OpenTargets: ENSG00000095777.
Orphanet: 90636. Autosomal recessive non-syndromic sensorineural deafness type DFNB.
PharmGKB: PA31405.
GenAtlas: Search...

Miscellaneous databases

GeneWiki: MYO3A.
GenomeRNAi: 53904.
PRO: Q8NEV4.
SOURCE: Search...

Family and domain databases

ProtoNet: Search...

Entry information

Entry name: MYO3A_HUMAN
Accession: Primary (citable) accession number: Q8NEV4
Secondary accession number(s): Q4G0X2, Q5VZ28, Q8WX17, Q9NYS8
Entry history
Integrated into UniProtKB/Swiss-Prot: April 26, 2004
Last sequence update: November 13, 2007
Last modified: November 30, 2016
This is version 142 of the entry and version 2 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

Keywords - Technical term

Complete proteome, Reference proteome

Documents

  1. Human chromosome 10
    Human chromosome 10: entries, gene names and cross-references to MIM
  2. Human entries with polymorphisms or disease mutations
    List of human entries with polymorphisms or disease mutations
  3. Human polymorphisms and disease mutations
    Index of human polymorphisms and disease mutations
  4. MIM cross-references
    Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot
  5. Human and mouse protein kinases
    Human and mouse protein kinases: classification and index
  6. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.