Protein

Sodium channel protein type 5 subunit alpha

Gene

Scn5a

Organism
Rattus norvegicus (Rat)
Status
Reviewed - Annotation score: 5 out of 5 - Experimental evidence at protein level

Function

This protein mediates the voltage-dependent sodium ion permeability of excitable membranes. Assuming opened or closed conformations in response to the voltage difference across the membrane, the protein forms a sodium-selective channel through which Na+ ions may pass in accordance with their electrochemical gradient. It is a tetrodotoxin-resistant Na+ channel isoform. This channel is responsible for the initial upstroke of the action potential. Channel inactivation is regulated by intracellular calcium levels. (By similarity)

Miscellaneous

Na+ channels in mammalian cardiac membrane have functional properties quite distinct from Na+ channels in nerve and skeletal muscle.

Sites

Feature key | Position(s) | Description | Evidence | Length
Site | 374 | Cys residue near the selectivity filter, which has a free thiol that is susceptible to reaction with methanethiosulfonate (MTSET); sodium current is irreversibly blocked by MTSET | By similarity; 1 publication | 1

Keywords

Molecular function: Calmodulin-binding, Ion channel, Sodium channel, Voltage-gated channel
Biological process: Ion transport, Sodium transport, Transport
Ligand: Sodium

Names & Taxonomy

Protein names
Recommended name:
Sodium channel protein type 5 subunit alpha
Alternative name(s):
Sodium channel protein cardiac muscle subunit alpha
Sodium channel protein type V subunit alpha
Voltage-gated sodium channel subunit alpha Nav1.5
Gene names
Name: Scn5a
Organism: Rattus norvegicus (Rat)
Taxonomic identifier: 10116 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Glires > Rodentia > Myomorpha > Muroidea > Muridae > Murinae > Rattus
Proteomes
  • UP000002494 Component: Unplaced

Organism-specific databases

RGD: 3637. Scn5a.

Subcellular location

Topology

Feature key | Position(s) | Description | Evidence | Length
Topological domain | 1 – 132 | Cytoplasmic | Curated | 132
Transmembrane | 133 – 151 | Helical; Name=S1 of repeat I | By similarity | 19
Topological domain | 152 – 158 | Extracellular | Curated | 7
Transmembrane | 159 – 179 | Helical; Name=S2 of repeat I | By similarity | 21
Topological domain | 180 – 193 | Cytoplasmic | Curated | 14
Transmembrane | 194 – 211 | Helical; Name=S3 of repeat I | By similarity | 18
Topological domain | 212 – 217 | Extracellular | Curated | 6
Transmembrane | 218 – 234 | Helical; Name=S4 of repeat I | By similarity | 17
Topological domain | 235 – 253 | Cytoplasmic | Curated | 19
Transmembrane | 254 – 273 | Helical; Name=S5 of repeat I | By similarity | 20
Topological domain | 274 – 358 | Extracellular | Curated | 85
Intramembrane | 359 – 383 | Pore-forming | By similarity | 25
Topological domain | 384 – 390 | Extracellular | Curated | 7
Transmembrane | 391 – 411 | Helical; Name=S6 of repeat I | By similarity | 21
Topological domain | 412 – 718 | Cytoplasmic | Curated | 307
Transmembrane | 719 – 737 | Helical; Name=S1 of repeat II | By similarity | 19
Topological domain | 738 – 748 | Extracellular | Curated | 11
Transmembrane | 749 – 768 | Helical; Name=S2 of repeat II | By similarity | 20
Topological domain | 769 – 782 | Cytoplasmic | Curated | 14
Transmembrane | 783 – 802 | Helical; Name=S3 of repeat II | By similarity | 20
Topological domain | 803 – 804 | Extracellular | Curated | 2
Transmembrane | 805 – 822 | Helical; Name=S4 of repeat II | By similarity | 18
Topological domain | 823 – 838 | Cytoplasmic | Curated | 16
Transmembrane | 839 – 857 | Helical; Name=S5 of repeat II | By similarity | 19
Topological domain | 858 – 886 | Extracellular | Curated | 29
Intramembrane | 887 – 907 | Pore-forming | By similarity | 21
Topological domain | 908 – 920 | Extracellular | Curated | 13
Transmembrane | 921 – 941 | Helical; Name=S6 of repeat II | By similarity | 21
Topological domain | 942 – 1208 | Cytoplasmic | Curated | 267
Transmembrane | 1209 – 1226 | Helical; Name=S1 of repeat III | By similarity | 18
Topological domain | 1227 – 1239 | Extracellular | Curated | 13
Transmembrane | 1240 – 1258 | Helical; Name=S2 of repeat III | By similarity | 19
Topological domain | 1259 – 1272 | Cytoplasmic | Curated | 14
Transmembrane | 1273 – 1291 | Helical; Name=S3 of repeat III | By similarity | 19
Topological domain | 1292 – 1299 | Extracellular | Curated | 8
Transmembrane | 1300 – 1318 | Helical; Name=S4 of repeat III | By similarity | 19
Topological domain | 1319 – 1335 | Cytoplasmic | Curated | 17
Transmembrane | 1336 – 1355 | Helical; Name=S5 of repeat III | By similarity | 20
Topological domain | 1356 – 1407 | Extracellular | Curated | 52
Intramembrane | 1408 – 1429 | Pore-forming | By similarity | 22
Topological domain | 1430 – 1446 | Extracellular | Curated | 17
Transmembrane | 1447 – 1468 | Helical; Name=S6 of repeat III | By similarity | 22
Topological domain | 1469 – 1531 | Cytoplasmic | Curated | 63
Transmembrane | 1532 – 1549 | Helical; Name=S1 of repeat IV | By similarity | 18
Topological domain | 1550 – 1560 | Extracellular | Curated | 11
Transmembrane | 1561 – 1579 | Helical; Name=S2 of repeat IV | By similarity | 19
Topological domain | 1580 – 1591 | Cytoplasmic | Curated | 12
Transmembrane | 1592 – 1609 | Helical; Name=S3 of repeat IV | By similarity | 18
Topological domain | 1610 – 1622 | Extracellular | Curated | 13
Transmembrane | 1623 – 1639 | Helical; Name=S4 of repeat IV | By similarity | 17
Topological domain | 1640 – 1658 | Cytoplasmic | Curated | 19
Transmembrane | 1659 – 1676 | Helical; Name=S5 of repeat IV | By similarity | 18
Topological domain | 1677 – 1698 | Extracellular | Curated | 22
Intramembrane | 1699 – 1721 | Pore-forming | By similarity | 23
Topological domain | 1722 – 1750 | Extracellular | Curated | 29
Transmembrane | 1751 – 1773 | Helical; Name=S6 of repeat IV | By similarity | 23
Topological domain | 1774 – 2019 | Cytoplasmic | Curated | 246
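
The Length column in the topology table follows from the inclusive residue ranges, and consecutive features abut one another. A minimal Python sketch of both checks, using only the first six rows of the table; the names `FEATURES`, `feature_length` and `is_contiguous` are illustrative and not part of the entry:

```python
# First six rows of the topology table: (feature key, start, end).
# Positions are 1-based and inclusive, as in the entry.
FEATURES = [
    ("Topological domain", 1, 132),
    ("Transmembrane", 133, 151),
    ("Topological domain", 152, 158),
    ("Transmembrane", 159, 179),
    ("Topological domain", 180, 193),
    ("Transmembrane", 194, 211),
]


def feature_length(start, end):
    """Number of residues in an inclusive range."""
    return end - start + 1


def is_contiguous(features):
    """True if every feature starts one residue after the previous one ends."""
    return all(nxt[1] == cur[2] + 1 for cur, nxt in zip(features, features[1:]))


lengths = [feature_length(start, end) for _, start, end in FEATURES]
print(lengths)                  # [132, 19, 7, 21, 14, 18]
print(is_contiguous(FEATURES))  # True
```

The same two functions could be applied to the full table to confirm that the annotated features cover residues 1 to 2019 without gaps or overlaps.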

GO - Cellular component

  • caveola Source: BHF-UCL
  • endoplasmic reticulum Source: BHF-UCL
  • integral component of membrane Source: UniProtKB
  • intercalated disc Source: BHF-UCL
  • T-tubule Source: BHF-UCL
  • voltage-gated sodium channel complex Source: BHF-UCL
  • Z disc Source: BHF-UCL

Keywords - Cellular component

Cell membrane, Membrane

Pathology & Biotech

Mutagenesis

Feature key | Position(s) | Description | Evidence | Length
Mutagenesis | 869 | L → C: >1000-fold increase of sensitivity to the conotoxin GVIIJ(SSG). | 1 publication | 1
Mutagenesis | 1612 | D → N or R: Little change in voltage-dependence of conductance and decrease in affinity to the sea anemone toxin anthopleurin-B (residue Lys-37). | 1 publication | 1
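
The "L → C" notation at position 869 can be checked against the canonical sequence and applied programmatically. The sketch below is illustrative only: it uses the ten-residue fragment 861-870 copied from the isoform 1 sequence in this entry, and the helper name `substitute` is an assumption, not something defined by the database.

```python
# Residues 861-870 of the canonical sequence (isoform 1) in this entry.
FRAGMENT_START = 861
FRAGMENT = "LFGKNYSELR"


def substitute(fragment, start, position, wild_type, mutant):
    """Apply a point substitution at a 1-based entry position.

    Raises ValueError if the fragment does not carry the expected
    wild-type residue at that position.
    """
    idx = position - start
    if fragment[idx] != wild_type:
        raise ValueError(
            f"expected {wild_type} at {position}, found {fragment[idx]}"
        )
    return fragment[:idx] + mutant + fragment[idx + 1:]


mutated = substitute(FRAGMENT, FRAGMENT_START, 869, "L", "C")
print(mutated)  # LFGKNYSECR
```

The wild-type check guards against off-by-one errors when converting between entry positions and string indices.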

Chemistry databases

ChEMBL: CHEMBL3866.
GuidetoPHARMACOLOGY: 582.

PTM / Processing

Molecule processing

Feature key | Position(s) | Description | Feature identifier | Length
Chain | 1 – 2019 | Sodium channel protein type 5 subunit alpha | PRO_0000048498 | 2019

Amino acid modifications

Feature key | Position(s) | Description | Evidence | Length
Modified residue | 37 | Phosphoserine | By similarity | 1
Modified residue | 39 | Phosphothreonine | By similarity | 1
Glycosylation | 215 | N-linked (GlcNAc...) asparagine | Sequence analysis | 1
Disulfide bond | 281 ↔ 336 | | By similarity |
Glycosylation | 284 | N-linked (GlcNAc...) asparagine | Sequence analysis | 1
Glycosylation | 289 | N-linked (GlcNAc...) asparagine | Sequence analysis | 1
Glycosylation | 292 | N-linked (GlcNAc...) asparagine | Sequence analysis | 1
Glycosylation | 319 | N-linked (GlcNAc...) asparagine | Sequence analysis | 1
Glycosylation | 329 | N-linked (GlcNAc...) asparagine | Sequence analysis | 1
Modified residue | 458 | Phosphoserine | By similarity | 1
Modified residue | 461 | Phosphoserine | By similarity | 1
Modified residue | 484 | Phosphoserine | Combined sources | 1
Modified residue | 485 | Phosphoserine | Combined sources | 1
Modified residue | 487 | Phosphothreonine | Combined sources | 1
Modified residue | 498 | Phosphoserine | By similarity | 1
Modified residue | 511 | Phosphoserine | By similarity | 1
Modified residue | 527 | Dimethylated arginine; alternate | By similarity | 1
Modified residue | 527 | Omega-N-methylarginine; alternate | By similarity | 1
Modified residue | 540 | Phosphoserine | By similarity | 1
Modified residue | 572 | Phosphoserine | By similarity | 1
Modified residue | 665 | Phosphoserine | By similarity | 1
Modified residue | 668 | Phosphoserine | By similarity | 1
Glycosylation | 741 | N-linked (GlcNAc...) asparagine | Sequence analysis | 1
Glycosylation | 804 | N-linked (GlcNAc...) asparagine | Sequence analysis | 1
Glycosylation | 865 | N-linked (GlcNAc...) asparagine | Sequence analysis | 1
Disulfide bond | 909 ↔ 918 | | By similarity |
Glycosylation | 1367 | N-linked (GlcNAc...) asparagine | Sequence analysis | 1
Glycosylation | 1376 | N-linked (GlcNAc...) asparagine | Sequence analysis | 1
Glycosylation | 1382 | N-linked (GlcNAc...) asparagine | Sequence analysis | 1
Glycosylation | 1390 | N-linked (GlcNAc...) asparagine | Sequence analysis | 1
Modified residue | 1505 | Phosphoserine; by PKC | By similarity | 1
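
The glycosylation sites above are annotated by sequence analysis. The entry does not state the rule used; the sketch below assumes the commonly used N-X-S/T sequon (asparagine, then any residue except proline, then serine or threonine) and scans residues 281-336 of the canonical sequence, the stretch bounded by the first disulfide bond. The function name `find_sequons` is illustrative.

```python
import re

# Residues 281-336 of the canonical sequence (isoform 1) in this entry.
FRAGMENT_START = 281
FRAGMENT = "CVRNFTELNGTNGSVEADGLVWNSLDVYLNDPANYLLKNGTTDVLLCGNSSDAGTC"


def find_sequons(fragment, start):
    """Return 1-based positions of Asn residues in N-X-[S/T] motifs (X != Pro).

    Assumption: this is the standard sequon rule, not a rule quoted from
    the entry. A lookahead is used so that overlapping motifs are reported.
    """
    pattern = re.compile(r"N(?=[^P][ST])")
    return [start + match.start() for match in pattern.finditer(fragment)]


sites = find_sequons(FRAGMENT, FRAGMENT_START)
print(sites)  # [284, 289, 292, 319, 329]
```

Under this assumption the scan recovers the five glycosylation positions listed for this stretch, while the asparagines at 303, 310 and 314 are skipped because they are not followed by a matching motif.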

Post-translational modification

Phosphorylation at Ser-1505 by PKC in a highly conserved cytoplasmic loop slows inactivation of the sodium channel and reduces peak sodium currents. Regulated through phosphorylation by CaMK2D. (By similarity)
Ubiquitinated by NEDD4L, which promotes its endocytosis. Does not seem to be ubiquitinated by NEDD4 or WWP2. (By similarity)
Lacks the cysteine that covalently binds the conotoxin GVIIJ. In other sodium channel alpha subunits, this cysteine (position 869) is speculated to be involved in covalent binding with the sodium channel subunit beta-2 or beta-4. (1 publication)

Keywords - PTM

Disulfide bond, Glycoprotein, Methylation, Phosphoprotein, Ubl conjugation

Proteomic databases

PaxDb: P15389.
PRIDE: P15389.

PTM databases

iPTMnet: P15389.
PhosphoSitePlus: P15389.

Expression

Tissue specificity

Strongly expressed in the heart. Also expressed in adult and fetal brain, spinal cord, testis, and at moderate levels in kidney, adrenal gland, lung, skeletal muscle, spleen, stomach and bladder. Isoform 2 is expressed in brain. (2 publications)

Interaction

Subunit structure

Interacts with the PDZ domain of the syntrophins SNTA1, SNTB1 and SNTB2. Interacts with NEDD4, NEDD4L, WWP2 and GPD1L. Interacts with CALM. Interacts with FGF13; the interaction is direct and may regulate SCN5A density at membranes and function. May also interact with FGF12 and FGF14. (By similarity)

GO - Molecular function

  • ankyrin binding Source: RGD
  • calmodulin binding Source: UniProtKB-KW
  • fibroblast growth factor binding Source: RGD

Protein-protein interaction databases

DIP: DIP-60063N.
STRING: 10116.ENSRNOP00000060180.

Chemistry databases

BindingDB: P15389.

Structure

3D structure databases

ProteinModelPortal: P15389.

Family & Domains

Domains and Repeats

Feature key | Position(s) | Description | Evidence | Length
Repeat | 114 – 421 | I | Curated | 308
Repeat | 700 – 972 | II | Curated | 273
Repeat | 1189 – 1503 | III | Curated | 315
Repeat | 1512 – 1809 | IV | Curated | 298
Domain | 1903 – 1932 | IQ | | 30

Region

Feature key | Position(s) | Description | Evidence | Length
Region | 1841 – 1903 | Interaction with FGF13 | By similarity | 63
Region | 1977 – 1980 | Interaction with NEDD4, NEDD4L and WWP2 | By similarity | 4

Domain

The sequence contains 4 internal repeats, each with 5 hydrophobic segments (S1, S2, S3, S5, S6) and one positively charged segment (S4). The S4 segments are probably the voltage sensors and are characterized by a series of positively charged amino acids at every third position. (Curated)
The IQ domain mediates association with calmodulin. (By similarity)
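
The "every third position" pattern of positive charges in S4 can be illustrated on S4 of repeat I (residues 218-234 in the topology table). The sketch below is a minimal check, treating Lys (K) and Arg (R) as the positively charged residues; the segment string is copied from the canonical sequence in this entry, and the helper name `charged_positions` is illustrative.

```python
# S4 of repeat I: residues 218-234 of the canonical sequence (isoform 1).
S4_START = 218
S4_REPEAT_I = "ALRTFRVLRALKTISVI"


def charged_positions(segment, start):
    """Return 1-based entry positions of Lys/Arg residues in the segment."""
    return [start + i for i, residue in enumerate(segment) if residue in "KR"]


positions = charged_positions(S4_REPEAT_I, S4_START)
spacings = [b - a for a, b in zip(positions, positions[1:])]
print(positions)  # [220, 223, 226, 229]
print(spacings)   # [3, 3, 3]
```

In this segment the four charged residues are each three positions apart, consistent with the description above. Other S4 segments are not checked here.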

Sequence similarities

Keywords - Domain

Repeat, Transmembrane, Transmembrane helix

Phylogenomic databases

eggNOG: KOG2301. Eukaryota.
ENOG410XNP6. LUCA.
HOGENOM: HOG000231755.
HOVERGEN: HBG053100.
InParanoid: P15389.
KO: K04838.
PhylomeDB: P15389.

Family and domain databases

InterPro: IPR005821. Ion_trans_dom.
IPR008053. Na_channel_a5su.
IPR001696. Na_channel_asu.
IPR010526. Na_trans_assoc.
IPR024583. Na_trans_cytopl.
Pfam: PF00520. Ion_trans. 4 hits.
PF06512. Na_trans_assoc. 1 hit.
PF11933. Na_trans_cytopl. 1 hit.
PRINTS: PR00170. NACHANNEL.
PR01666. NACHANNEL5.

Sequences (2)

Sequence status: Complete.

This entry describes 2 isoforms produced by alternative splicing.

Isoform 1 (identifier: P15389-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MANLLLPRGT SSFRRFTRES LAAIEKRMAE KQARGGSATS QESREGLQEE
60 70 80 90 100
EAPRPQLDLQ ASKKLPDLYG NPPRELIGEP LEDLDPFYST QKTFIVLNKG
110 120 130 140 150
KTIFRFSATN ALYVLSPFHP VRRAAVKILV HSLFSMLIMC TILTNCVFMA
160 170 180 190 200
QHDPPPWTKY VEYTFTAIYT FESLVKILAR GFCLHAFTFL RDPWNWLDFS
210 220 230 240 250
VIVMAYTTEF VDLGNVSALR TFRVLRALKT ISVISGLKTI VGALIQSVKK
260 270 280 290 300
LADVMVLTVF CLSVFALIGL QLFMGNLRHK CVRNFTELNG TNGSVEADGL
310 320 330 340 350
VWNSLDVYLN DPANYLLKNG TTDVLLCGNS SDAGTCPEGY RCLKAGENPD
360 370 380 390 400
HGYTSFDSFA WAFLALFRLM TQDCWERLYQ QTLRSAGKIY MIFFMLVIFL
410 420 430 440 450
GSFYLVNLIL AVVAMAYEEQ NQATIAETEE KEKRFQEAME MLKKEHEALT
460 470 480 490 500
IRGVDTVSRS SLEMSPLAPV TNHERKSKRR KRLSSGTEDG GDDRLPKSDS
510 520 530 540 550
EDGPRALNQL SLTHGLSRTS MRPRSSRGSI FTFRRRDQGS EADFADDENS
560 570 580 590 600
TAGESESHRT SLLVPWPLRH PSAQGQPGPG ASAPGYVLNG KRNSTVDCNG
610 620 630 640 650
VVSLLGAGDA EATSPGSYLL RPMVLDRPPD TTTPSEEPGG PQMLTPQAPC
660 670 680 690 700
ADGFEEPGAR QRALSAVSVL TSALEELEES HRKCPPCWNR FAQHYLIWEC
710 720 730 740 750
CPLWMSIKQK VKFVVMDPFA DLTITMCIVL NTLFMALEHY NMTAEFEEML
760 770 780 790 800
QVGNLVFTGI FTAEMTFKII ALDPYYYFQQ GWNIFDSIIV ILSLMELGLS
810 820 830 840 850
RMGNLSVLRS FRLLRVFKLA KSWPTLNTLI KIIGNSVGAL GNLTLVLAII
860 870 880 890 900
VFIFAVVGMQ LFGKNYSELR HRISDSGLLP RWHMMDFFHA FLIIFRILCG
910 920 930 940 950
EWIETMWDCM EVSGQSLCLL VFLLVMVIGN LVVLNLFLAL LLSSFSADNL
960 970 980 990 1000
TAPDEDGEMN NLQLALARIQ RGLRFVKRTT WDFCCGILRR RPKKPAALAT
1010 1020 1030 1040 1050
HSQLPSCITA PRSPPPPEVE KVPPARKETR FEEDKRPGQG TPGDSEPVCV
1060 1070 1080 1090 1100
PIAVAESDTE DQEEDEENSL GTEEESSKQE SQVVSGGHEP YQEPRAWSQV
1110 1120 1130 1140 1150
SETTSSEAGA STSQADWQQE QKTEPQAPGC GETPEDSYSE GSTADMTNTA
1160 1170 1180 1190 1200
DLLEQIPDLG EDVKDPEDCF TEGCVRRCPC CMVDTTQSPG KVWWRLRKTC
1210 1220 1230 1240 1250
YRIVEHSWFE TFIIFMILLS SGALAFEDIY LEERKTIKVL LEYADKMFTY
1260 1270 1280 1290 1300
VFVLEMLLKW VAYGFKKYFT NAWCWLDFLI VDVSLVSLVA NTLGFAEMGP
1310 1320 1330 1340 1350
IKSLRTLRAL RPLRALSRFE GMRVVVNALV GAIPSIMNVL LVCLIFWLIF
1360 1370 1380 1390 1400
SIMGVNLFAG KFGRCINQTE GDLPLNYTIV NNKSECESFN VTGELYWTKV
1410 1420 1430 1440 1450
KVNFDNVGAG YLALLQVATF KGWMDIMYAA VDSRGYEEQP QWEDNLYMYI
1460 1470 1480 1490 1500
YFVVFIIFGS FFTLNLFIGV IIDNFNQQKK KLGGQDIFMT EEQKKYYNAM
1510 1520 1530 1540 1550
KKLGSKKPQK PIPRPLNKYQ GFIFDIVTKQ AFDVTIMFLI CLNMVTMMVE
1560 1570 1580 1590 1600
TDDQSPEKVN ILAKINLLFV AIFTGECIVK MAALRHYYFT NSWNIFDFVV
1610 1620 1630 1640 1650
VILSIVGTVL SDIIQKYFFS PTLFRVIRLA RIGRILRLIR GAKGIRTLLF
1660 1670 1680 1690 1700
ALMMSLPALF NIGLLLFLVM FIYSIFGMAN FAYVKWEAGI DDMFNFQTFA
1710 1720 1730 1740 1750
NSMLCLFQIT TSAGWDGLLS PILNTGPPYC DPNLPNSNGS RGNCGSPAVG
1760 1770 1780 1790 1800
ILFFTTYIII SFLIVVNMYI AIILENFSVA TEESTEPLSE DDFDMFYEIW
1810 1820 1830 1840 1850
EKFDPEATQF IEYLALSDFA DALSEPLRIA KPNQISLINM DLPMVSGDRI
1860 1870 1880 1890 1900
HCMDILFAFT KRVLGESGEM DALKIQMEEK FMAANPSKIS YEPITTTLRR
1910 1920 1930 1940 1950
KHEEVSATVI QRAFRRHLLQ RSVKHASFLF RQQAGGSGLS DEDAPEREGL
1960 1970 1980 1990 2000
IAYMMNGNFS RRSAPLSSSS ISSTSFPPSY DSVTRATSDN LPVRASDYSR
2010
SEDLADFPPS PDRDRESIV
Length: 2,019
Mass (Da): 227,367
Last modified: April 1, 1990 - v1
Checksum: CFC3B03CEAE708AD

Isoform 2 (identifier: P15389-2)

The sequence of this isoform differs from the canonical sequence as follows:
     1080-1132: Missing.

Length: 1,966
Mass (Da): 221,706
Checksum: E4AC1FB7CF825647

Alternative sequence

Feature key | Position(s) | Description | Feature identifier | Evidence | Length
Alternative sequence | 1080 – 1132 | Missing in isoform 2. | VSP_037482 | 1 publication | 53
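
The length of isoform 2 follows from removing the inclusive range 1080-1132 from the 2,019-residue canonical sequence. A minimal sketch of that arithmetic; the helper `remove_range` and the placeholder string of `X` characters are illustrative stand-ins, not the real sequence:

```python
CANONICAL_LENGTH = 2019
MISSING_START, MISSING_END = 1080, 1132  # 1-based, inclusive


def remove_range(sequence, start, end):
    """Delete the inclusive 1-based range [start, end] from a sequence string."""
    return sequence[:start - 1] + sequence[end:]


missing = MISSING_END - MISSING_START + 1
print(missing)                     # 53
print(CANONICAL_LENGTH - missing)  # 1966

# Same result on a placeholder string of the canonical length.
placeholder = "X" * CANONICAL_LENGTH
isoform2_length = len(remove_range(placeholder, MISSING_START, MISSING_END))
print(isoform2_length)             # 1966
```

Applying `remove_range` to the real canonical sequence with the same coordinates would be one way to obtain the isoform 2 sequence.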

Sequence databases

EMBL / GenBank / DDBJ: M27902 mRNA. Translation: AAA42114.1.
AF353637 mRNA. Translation: AAK38884.1.
PIR: A33996.
RefSeq: NP_001153634.1. NM_001160162.1. [P15389-2]
NP_037257.1. NM_013125.2. [P15389-1]
UniGene: Rn.32074.

Genome annotation databases

GeneID: 25665.
KEGG: rno:25665.
UCSC: RGD:3637. rat. [P15389-1]

Keywords - Coding sequence diversity

Alternative splicing

Cross-references

Organism-specific databases

CTD: 6331.
RGD: 3637. Scn5a.

Miscellaneous databases

PRO: PR:P15389.

Entry information

Entry name: SCN5A_RAT
Accession: Primary (citable) accession number: P15389
Secondary accession number(s): Q925G6
Entry history: Integrated into UniProtKB/Swiss-Prot: April 1, 1990
Last sequence update: April 1, 1990
Last modified: May 10, 2017
This is version 135 of the entry and version 1 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program

Miscellaneous

Keywords - Technical term

Complete proteome, Reference proteome
