Protein: Semaphorin-6D
Gene: SEMA6D
Organism: Homo sapiens (Human)
Status: Reviewed. Annotation score: 5 out of 5. Experimental evidence at protein level.

Function

Shows growth cone collapsing activity on dorsal root ganglion (DRG) neurons in vitro. May be a stop signal for the DRG neurons in their target areas, and possibly also for other neurons. May also be involved in the maintenance and remodeling of neuronal connections.

Keywords - Molecular function

Developmental protein

Keywords - Biological process

Differentiation, Neurogenesis

Enzyme and pathway databases

Reactome: R-HSA-416700. Other semaphorin interactions.

Names & Taxonomy

Protein names
Recommended name: Semaphorin-6D
Gene names
Name: SEMA6D
Synonyms: KIAA1479
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes
  • UP000005640 Component: Chromosome 15

Organism-specific databases

HGNC: HGNC:16770. SEMA6D.

Subcellular location

Topology

Feature key         Position(s)  Description    Evidence           Length
Topological domain  21 – 662     Extracellular  Sequence analysis  642
Transmembrane       663 – 683    Helical        Sequence analysis  21
Topological domain  684 – 1073   Cytoplasmic    Sequence analysis  390
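
The lengths in the topology table follow from the 1-based, inclusive coordinates (length = end - start + 1). The sketch below is only an illustration of that arithmetic: it combines the three topology rows with the signal peptide (1 – 20) listed under Molecule processing and checks that the four segments tile the 1,073-residue canonical sequence without gaps. The names `SEGMENTS`, `segment_length` and `is_contiguous` are made up for this example.

```python
# Coordinates copied from the entry's feature tables (1-based, inclusive).
SEGMENTS = [
    ("Signal peptide", 1, 20),
    ("Extracellular", 21, 662),
    ("Transmembrane helix", 663, 683),
    ("Cytoplasmic", 684, 1073),
]


def segment_length(start, end):
    """Length of an inclusive 1-based range."""
    return end - start + 1


def is_contiguous(segments):
    """True if every segment starts right after the previous one ends."""
    return all(nxt[1] == cur[2] + 1 for cur, nxt in zip(segments, segments[1:]))


lengths = {name: segment_length(start, end) for name, start, end in SEGMENTS}
total = sum(lengths.values())

for name, value in lengths.items():
    print(f"{name}: {value}")
print("total:", total, "contiguous:", is_contiguous(SEGMENTS))
```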

Keywords - Cellular component

Cell membrane, Cytoplasm, Membrane

Pathology & Biotech

Organism-specific databases

DisGeNET: 80031.
OpenTargets: ENSG00000137872.
PharmGKB: PA134951035.

Polymorphism and mutation databases

BioMuta: SEMA6D.
DMDM: 74715611.

PTM / Processing

Molecule processing

Feature key     Position(s)  Description    Identifier      Evidence           Length
Signal peptide  1 – 20       -              -               Sequence analysis  20
Chain           21 – 1073    Semaphorin-6D  PRO_0000044615  -                  1053

Amino acid modifications

Feature key       Position(s)  Description           Evidence                    Length
Glycosylation     51           N-linked (GlcNAc...)  Sequence analysis           1
Disulfide bond    108 ↔ 118    -                     PROSITE-ProRule annotation  -
Disulfide bond    136 ↔ 145    -                     PROSITE-ProRule annotation  -
Disulfide bond    259 ↔ 370    -                     PROSITE-ProRule annotation  -
Glycosylation     283          N-linked (GlcNAc...)  Sequence analysis           1
Disulfide bond    284 ↔ 329    -                     PROSITE-ProRule annotation  -
Glycosylation     435          N-linked (GlcNAc...)  Sequence analysis           1
Glycosylation     461          N-linked (GlcNAc...)  Sequence analysis           1
Disulfide bond    477 ↔ 506    -                     PROSITE-ProRule annotation  -
Disulfide bond    515 ↔ 533    -                     PROSITE-ProRule annotation  -
Disulfide bond    521 ↔ 568    -                     PROSITE-ProRule annotation  -
Disulfide bond    525 ↔ 541    -                     PROSITE-ProRule annotation  -
Glycosylation     631          N-linked (GlcNAc...)  Sequence analysis           1
Modified residue  723          Phosphoserine         Combined sources            1
Modified residue  734          Phosphoserine         By similarity               1
Modified residue  744          Phosphoserine         Combined sources            1
Modified residue  773          Phosphothreonine      Combined sources            1
Modified residue  931          Phosphoserine         Combined sources            1
Modified residue  957          Phosphoserine         Combined sources            1
Modified residue  983          Phosphoserine         Combined sources            1
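
The five N-linked glycosylation sites are predicted by sequence analysis. One quick consistency check is to confirm that each annotated asparagine sits in the classic N-X-S/T sequon, where X is any residue except proline. The sketch below does only that and is not the prediction method used by the database. The three-residue windows were read by hand from the canonical sequence shown further down this entry, and `SEQUON`, `ANNOTATED_SITES` and `is_sequon` are names invented for this example.

```python
import re

# Classic N-glycosylation sequon: Asn, any residue except Pro, then Ser or Thr.
SEQUON = re.compile(r"N[^P][ST]")

# Annotated position -> residues at position, position+1, position+2,
# copied by hand from the canonical sequence (isoform 4) of this entry.
ANNOTATED_SITES = {
    51: "NES",
    283: "NCS",
    435: "NYT",
    461: "NDS",
    631: "NTS",
}


def is_sequon(triplet):
    """True if a three-residue window matches N-X-S/T with X != P."""
    return SEQUON.fullmatch(triplet) is not None


results = {pos: is_sequon(window) for pos, window in ANNOTATED_SITES.items()}

for pos, ok in results.items():
    print(pos, ANNOTATED_SITES[pos], "sequon" if ok else "no sequon")
```

A matching sequon is necessary but not sufficient for glycosylation, so this check can only flag annotations that look inconsistent; it cannot confirm a site.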

Keywords - PTM

Disulfide bond, Glycoprotein, Phosphoprotein

Proteomic databases

MaxQB: Q8NFY4.
PaxDb: Q8NFY4.
PeptideAtlas: Q8NFY4.
PRIDE: Q8NFY4.

PTM databases

iPTMnet: Q8NFY4.
PhosphoSitePlus: Q8NFY4.

Expression

Gene expression databases

Bgee: ENSG00000137872.
ExpressionAtlas: Q8NFY4. baseline and differential.
Genevisible: Q8NFY4. HS.

Organism-specific databases

HPA: HPA043109.

Interaction

Protein-protein interaction databases

BioGrid: 123081. 1 interactor.
STRING: 9606.ENSP00000324857.

Structure

3D structure databases

ProteinModelPortal: Q8NFY4.
SMR: Q8NFY4.

Family & Domains

Domains and Repeats

Feature key  Position(s)  Description  Evidence                    Length
Domain       27 – 512     Sema         PROSITE-ProRule annotation  486
Domain       514 – 569    PSI          -                           56

Sequence similarities

Belongs to the semaphorin family. (Curated)
Contains 1 PSI domain. (Curated)
Contains 1 Sema domain. (PROSITE-ProRule annotation)

Keywords - Domain

Signal, Transmembrane, Transmembrane helix

Phylogenomic databases

eggNOG: KOG3611. Eukaryota.
ENOG410XQZC. LUCA.
GeneTree: ENSGT00760000119134.
HOGENOM: HOG000232047.
HOVERGEN: HBG072910.
InParanoid: Q8NFY4.
KO: K06842.
OMA: YIAGRDQ.
OrthoDB: EOG091G014B.
PhylomeDB: Q8NFY4.
TreeFam: TF316102.

Family and domain databases

Gene3D: 2.130.10.10. 1 hit.
InterPro: IPR002165. Plexin_repeat.
IPR016201. PSI.
IPR001627. Semap_dom.
IPR027231. Semaphorin.
IPR015943. WD40/YVTN_repeat-like_dom.
PANTHER: PTHR11036. PTHR11036. 4 hits.
Pfam: PF01437. PSI. 1 hit.
PF01403. Sema. 1 hit.
SMART: SM00423. PSI. 1 hit.
SM00630. Sema. 1 hit.
SUPFAM: SSF101912. SSF101912. 1 hit.
PROSITE: PS51004. SEMA. 1 hit.

Sequences (8)

Sequence status: Complete.

Sequence processing: The displayed sequence is further processed into a mature form.

This entry describes 8 isoforms produced by alternative splicing.

Isoform 4 (identifier: Q8NFY4-1)
Also known as: SEMA6D.4

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MRVFLLCAYI LLLMVSQLRA VSFPEDDEPL NTVDYHYSRQ YPVFRGRPSG
        60         70         80         90        100
NESQHRLDFQ LMLKIRDTLY IAGRDQVYTV NLNEMPKTEV IPNKKLTWRS
       110        120        130        140        150
RQQDRENCAM KGKHKDECHN FIKVFVPRND EMVFVCGTNA FNPMCRYYRL
       160        170        180        190        200
STLEYDGEEI SGLARCPFDA RQTNVALFAD GKLYSATVAD FLASDAVIYR
       210        220        230        240        250
SMGDGSALRT IKYDSKWIKE PHFLHAIEYG NYVYFFFREI AVEHNNLGKA
       260        270        280        290        300
VYSRVARICK NDMGGSQRVL EKHWTSFLKA RLNCSVPGDS FFYFDVLQSI
       310        320        330        340        350
TDIIQINGIP TVVGVFTTQL NSIPGSAVCA FSMDDIEKVF KGRFKEQKTP
       360        370        380        390        400
DSVWTAVPED KVPKPRPGCC AKHGLAEAYK TSIDFPDETL SFIKSHPLMD
       410        420        430        440        450
SAVPPIADEP WFTKTRVRYR LTAISVDHSA GPYQNYTVIF VGSEAGMVLK
       460        470        480        490        500
VLAKTSPFSL NDSVLLEEIE AYNHAKCSAE NEEDKKVISL QLDKDHHALY
       510        520        530        540        550
VAFSSCIIRI PLSRCERYGS CKKSCIASRD PYCGWLSQGS CGRVTPGMLA
       560        570        580        590        600
EGYEQDTEFG NTAHLGDCHE ILPTSTTPDY KIFGGPTSDM EVSSSSVTTM
       610        620        630        640        650
ASIPEITPKV IDTWRPKLTS SRKFVVQDDP NTSDFTDPLS GIPKGVRWEV
       660        670        680        690        700
QSGESNQMVH MNVLITCVFA AFVLGAFIAG VAVYCYRDMF VRKNRKIHKD
       710        720        730        740        750
AESAQSCTDS SGSFAKLNGL FDSPVKEYQQ NIDSPKLYSN LLTSRKELPP
       760        770        780        790        800
NGDTKSMVMD HRGQPPELAA LPTPESTPVL HQKTLQAMKS HSEKAHGHGA
       810        820        830        840        850
SRKETPQFFP SSPPPHSPLS HGHIPSAIVL PNATHDYNTS FSNSNAHKAE
       860        870        880        890        900
KKLQNIDHPL TKSSSKRDHR RSVDSRNTLN DLLKHLNDPN SNPKAIMGDI
       910        920        930        940        950
QMAHQNLMLD PMGSMSEVPP KVPNREASLY SPPSTLPRNS PTKRVDVPTT
       960        970        980        990       1000
PGVPMTSLER QRGYHKNSSQ RHSISAMPKN LNSPNGVLLS RQPSMNRGGY
      1010       1020       1030       1040       1050
MPTPTGAKVD YIQGTPVSVH LQPSLSRQSS YTSNGTLPRT GLKRTPSLKP
      1060       1070
DVPPKPSFVP QTPSVRPLNK YTY
Length: 1,073
Mass (Da): 119,872
Last modified: October 1, 2002 - v1
Checksum: 7DCE4DFC5BF70F9E

Isoform 1 (identifier: Q8NFY4-2)
Also known as: SEMA6D.1

The sequence of this isoform differs from the canonical sequence as follows:
     549-549: L → LLLTEDFFAFHNHS
     570-644: Missing.

Length: 1,011
Mass (Da): 113,290
Checksum: 9D6B8B3633941B89

Isoform 2 (identifier: Q8NFY4-3)
Also known as: SEMA6D.2

The sequence of this isoform differs from the canonical sequence as follows:
     570-644: Missing.

Length: 998
Mass (Da): 111,730
Checksum: 3F46D6872E8D5344

Isoform 3 (identifier: Q8NFY4-4)
Also known as: SEMA6D.3

The sequence of this isoform differs from the canonical sequence as follows:
     589-644: Missing.

Length: 1,017
Mass (Da): 113,736
Checksum: 4D639CEBADD9F2A0

Isoform 5 (identifier: Q8NFY4-5)

The sequence of this isoform differs from the canonical sequence as follows:
     549-549: L → LLLTEDFFAFHNHS
     589-644: Missing.

Note: No experimental confirmation available.
Length: 1,030
Mass (Da): 115,296
Checksum: 659CF8EA114B048F

Isoform 6 (identifier: Q8NFY4-6)

The sequence of this isoform differs from the canonical sequence as follows:
     570-588: Missing.

Note: No experimental confirmation available.
Length: 1,054
Mass (Da): 117,866
Checksum: A888C403FFACEE8F

Isoform 7 (identifier: Q8NFY4-7)
Also known as: SEMA6Ds, Short

The sequence of this isoform differs from the canonical sequence as follows:
     477-1073: Missing.

Length: 476
Mass (Da): 54,217
Checksum: 3AAB6E77051D2D83

Isoform 8 (identifier: Q8NFY4-8)

The sequence of this isoform differs from the canonical sequence as follows:
     570-588: Missing.
     601-1073: ASIPEITPKV...SVRPLNKYTY → VYDGKSSLESPTRWST

Note: No experimental confirmation available. Gene prediction based on mRNA data.
Length: 597
Mass (Da): 67,538
Checksum: 58D3EE7B6B68DB67

Sequence caution

The sequence BAA96003 differs from that shown. Reason: Erroneous initiation. Translation N-terminally shortened. (Curated)

Natural variant

Feature key      Identifier  Position(s)  Description                                               Length
Natural variant  VAR_051931  307          N → S. Corresponds to variant rs3743279 (dbSNP, Ensembl).  1
Natural variant  VAR_051932  478          S → N. Corresponds to variant rs532598 (dbSNP, Ensembl).   1
Natural variant  VAR_051933  969          S → T. Corresponds to variant rs16960074 (dbSNP, Ensembl). 1

Alternative sequence

Feature key           Identifier  Position(s)  Description                                       Evidence        Length
Alternative sequence  VSP_016564  477 – 1073   Missing in isoform 7.                             1 Publication   597
Alternative sequence  VSP_016565  549          L → LLLTEDFFAFHNHS in isoform 1 and isoform 5.    3 Publications  1
Alternative sequence  VSP_016566  570 – 644    Missing in isoform 1 and isoform 2.               3 Publications  75
Alternative sequence  VSP_016572  570 – 588    Missing in isoform 6 and isoform 8.               Curated         19
Alternative sequence  VSP_016567  589 – 644    Missing in isoform 3 and isoform 5.               1 Publication   56
Alternative sequence  VSP_054084  601 – 1073   ASIPE…NKYTY → VYDGKSSLESPTRWST in isoform 8.      Curated         473
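
Each isoform length listed above can be recomputed from the canonical length (1,073) and the alternative-sequence features: a feature removes the residues in its inclusive range and adds back the length of its replacement, if any. The sketch below is a hedged illustration of that bookkeeping only. It works on lengths, not sequences, and assumes the features combined in one isoform do not overlap, which holds for the combinations listed in this entry. `VSP`, `ISOFORMS` and `isoform_length` are names chosen for this example.

```python
CANONICAL_LENGTH = 1073

# Feature id -> (start, end, replacement); coordinates are 1-based and
# inclusive on the canonical sequence, "" means the range is missing.
VSP = {
    "VSP_016564": (477, 1073, ""),
    "VSP_016565": (549, 549, "LLLTEDFFAFHNHS"),
    "VSP_016566": (570, 644, ""),
    "VSP_016572": (570, 588, ""),
    "VSP_016567": (589, 644, ""),
    "VSP_054084": (601, 1073, "VYDGKSSLESPTRWST"),
}

# Isoform -> features it carries, as stated in the table above.
ISOFORMS = {
    "Isoform 1": ["VSP_016565", "VSP_016566"],
    "Isoform 2": ["VSP_016566"],
    "Isoform 3": ["VSP_016567"],
    "Isoform 5": ["VSP_016565", "VSP_016567"],
    "Isoform 6": ["VSP_016572"],
    "Isoform 7": ["VSP_016564"],
    "Isoform 8": ["VSP_016572", "VSP_054084"],
}


def isoform_length(feature_ids):
    """Canonical length adjusted by each (non-overlapping) feature."""
    length = CANONICAL_LENGTH
    for feature_id in feature_ids:
        start, end, replacement = VSP[feature_id]
        length += len(replacement) - (end - start + 1)
    return length


lengths = {name: isoform_length(ids) for name, ids in ISOFORMS.items()}

for name, value in lengths.items():
    print(name, value)
```

The computed values agree with the lengths reported for isoforms 1, 2, 3, 5, 6, 7 and 8 in the isoform list (1,011; 998; 1,017; 1,030; 1,054; 476; 597).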

Sequence databases

EMBL / GenBank / DDBJ:
AF389426 mRNA. Translation: AAM69449.1.
AF389427 mRNA. Translation: AAM69450.1.
AF389428 mRNA. Translation: AAM69451.1.
AF389429 mRNA. Translation: AAM69452.1.
AF389430 mRNA. Translation: AAM69453.1.
AB040912 mRNA. Translation: BAA96003.2. Different initiation.
AC018900 Genomic DNA. No translation available.
AC044787 Genomic DNA. No translation available.
AC009558 Genomic DNA. No translation available.
AC012050 Genomic DNA. No translation available.
AC023905 Genomic DNA. No translation available.
AC066615 Genomic DNA. No translation available.
AC084882 Genomic DNA. No translation available.
BC150253 mRNA. Translation: AAI50254.1.
CCDS: CCDS32224.1. [Q8NFY4-2]
CCDS32225.1. [Q8NFY4-1]
CCDS32226.1. [Q8NFY4-4]
CCDS32227.1. [Q8NFY4-3]
CCDS32228.1. [Q8NFY4-8]
CCDS32229.1. [Q8NFY4-7]
RefSeq: NP_001185928.1. NM_001198999.1. [Q8NFY4-2]
NP_065909.1. NM_020858.1. [Q8NFY4-2]
NP_079242.2. NM_024966.2. [Q8NFY4-7]
NP_705869.1. NM_153616.1. [Q8NFY4-3]
NP_705870.1. NM_153617.1. [Q8NFY4-4]
NP_705871.1. NM_153618.1. [Q8NFY4-1]
NP_705872.1. NM_153619.1. [Q8NFY4-8]
XP_005254744.1. XM_005254687.2. [Q8NFY4-1]
XP_005254746.1. XM_005254689.3. [Q8NFY4-6]
XP_011520379.1. XM_011522077.2. [Q8NFY4-6]
XP_011520380.1. XM_011522078.2. [Q8NFY4-5]
XP_011520381.1. XM_011522079.2. [Q8NFY4-4]
XP_011520382.1. XM_011522080.2. [Q8NFY4-2]
XP_011520383.1. XM_011522081.2. [Q8NFY4-3]
XP_016878108.1. XM_017022619.1. [Q8NFY4-5]
XP_016878109.1. XM_017022620.1. [Q8NFY4-4]
XP_016878110.1. XM_017022621.1. [Q8NFY4-3]
UniGene: Hs.511265.

Genome annotation databases

Ensembl: ENST00000316364; ENSP00000324857; ENSG00000137872. [Q8NFY4-1]
ENST00000354744; ENSP00000346786; ENSG00000137872. [Q8NFY4-4]
ENST00000355997; ENSP00000348276; ENSG00000137872. [Q8NFY4-8]
ENST00000358066; ENSP00000350770; ENSG00000137872. [Q8NFY4-2]
ENST00000389425; ENSP00000374076; ENSG00000137872. [Q8NFY4-7]
ENST00000389428; ENSP00000374079; ENSG00000137872. [Q8NFY4-3]
ENST00000536845; ENSP00000446152; ENSG00000137872. [Q8NFY4-1]
ENST00000558014; ENSP00000452815; ENSG00000137872. [Q8NFY4-2]
ENST00000558816; ENSP00000453661; ENSG00000137872. [Q8NFY4-8]
GeneID: 80031.
KEGG: hsa:80031.
UCSC: uc001zvw.4. human. [Q8NFY4-1]

Keywords - Coding sequence diversity

Alternative splicing, Polymorphism

Cross-references

Organism-specific databases

CTD: 80031.
DisGeNET: 80031.
GeneCards: SEMA6D.
HGNC: HGNC:16770. SEMA6D.
HPA: HPA043109.
MIM: 609295. gene.
neXtProt: NX_Q8NFY4.
OpenTargets: ENSG00000137872.
PharmGKB: PA134951035.

Miscellaneous databases

ChiTaRS: SEMA6D. human.
GenomeRNAi: 80031.
PRO: Q8NFY4.

Entry information

Entry name: SEM6D_HUMAN
Accession: Primary (citable) accession number: Q8NFY4
Secondary accession number(s): A6NF10, A6NM95, A6NNK1, A7E2A0, Q8NFY3, Q8NFY5, Q8NFY6, Q8NFY7, Q9P249
Entry history
Integrated into UniProtKB/Swiss-Prot: December 20, 2005
Last sequence update: October 1, 2002
Last modified: November 30, 2016
This is version 122 of the entry and version 1 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

Keywords - Technical term

Complete proteome, Reference proteome

Documents

  1. Human chromosome 15
    Human chromosome 15: entries, gene names and cross-references to MIM
  2. Human entries with polymorphisms or disease mutations
    List of human entries with polymorphisms or disease mutations
  3. Human polymorphisms and disease mutations
    Index of human polymorphisms and disease mutations
  4. MIM cross-references
    Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot
  5. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.
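
The UniRef90 and UniRef50 rules above amount to two thresholds applied against a cluster's seed sequence: a minimum sequence identity and a minimum overlap. The following is a minimal sketch of that membership test only. It assumes identity and overlap have already been computed as fractions by some alignment step, and it is not the clustering procedure UniRef itself runs. `THRESHOLDS`, `MIN_OVERLAP` and `meets_cluster_criteria` are hypothetical names.

```python
# Minimum identity to the seed sequence, per cluster level (from the text above).
THRESHOLDS = {"UniRef90": 0.90, "UniRef50": 0.50}

# Minimum overlap with the seed sequence, shared by both levels.
MIN_OVERLAP = 0.80


def meets_cluster_criteria(level, identity, overlap):
    """True if a sequence satisfies both the identity and overlap cut-offs.

    identity and overlap are fractions in [0, 1], measured against the seed.
    """
    return identity >= THRESHOLDS[level] and overlap >= MIN_OVERLAP


print(meets_cluster_criteria("UniRef90", 0.93, 0.85))
print(meets_cluster_criteria("UniRef90", 0.93, 0.70))
print(meets_cluster_criteria("UniRef50", 0.60, 0.80))
```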