Protein

SUN domain-containing protein 1

Gene

SUN1

Organism
Homo sapiens (Human)
Status
Reviewed. Annotation score: 5 out of 5. Experimental evidence at protein level.

Function

Component of SUN-protein-containing multivariate complexes, also called LINC complexes, which link the nucleoskeleton and cytoskeleton by providing versatile outer nuclear membrane attachment sites for cytoskeletal filaments. Required for interkinetic nuclear migration (INM) and essential for nucleokinesis and centrosome-nucleus coupling during radial neuronal migration in the cerebral cortex and during glial migration. Anchors chromosome movement in the prophase of meiosis and is involved in selective gene expression of coding and non-coding RNAs needed for gametogenesis. Required for telomere attachment to the nuclear envelope and for gametogenesis. Helps to define the distribution of nuclear pore complexes (NPCs) (By similarity). Required for efficient localization of SYNE4 in the nuclear envelope (By similarity). Evidence: By similarity, 2 Publications.

GO - Biological process

  • cytoskeletal anchoring at nuclear membrane Source: UniProtKB
  • nuclear envelope organization Source: MGI
  • nuclear matrix anchoring at nuclear membrane Source: UniProtKB
  • ossification Source: Ensembl
  • response to mechanical stimulus Source: Ensembl
  • synapsis Source: Ensembl

Enzyme and pathway databases

BioCyc: ZFISH:ENSG00000164828-MONOMER.
Reactome: R-HSA-1221632. Meiotic synapsis.

Protein family/group databases

TCDB: 1.I.1.1.3. The nuclear pore complex (NPC) family.

Names & Taxonomy

Protein names
Recommended name:
SUN domain-containing protein 1
Alternative name(s):
Protein unc-84 homolog A
Sad1/unc-84 protein-like 1
Gene names
Name: SUN1
Synonyms: KIAA0810, UNC84A
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes
  • UP000005640 Component: Chromosome 7

Organism-specific databases

HGNC: HGNC:18587. SUN1.

Subcellular location

Topology

Feature key | Position(s) | Description | Evidence | Length
Topological domain | 1 – 315 | Nuclear | 1 Publication | 315
Transmembrane | 316 – 335 | Helical | - | 20
Topological domain | 336 – 812 | Perinuclear space | 1 Publication | 477
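The topology rows above can be treated as 1-based, inclusive intervals on the canonical sequence. The following is a minimal sketch, assuming that coordinate convention; the names `Segment`, `TOPOLOGY`, `segment_length` and `locate` are illustrative and not part of any UniProt tooling.

```python
from typing import NamedTuple, Optional


class Segment(NamedTuple):
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive
    label: str


# Topology of the canonical isoform, copied from the table above.
TOPOLOGY = [
    Segment(1, 315, "Nuclear"),
    Segment(316, 335, "Transmembrane (helical)"),
    Segment(336, 812, "Perinuclear space"),
]


def segment_length(seg: Segment) -> int:
    """Length of an inclusive interval: end - start + 1."""
    return seg.end - seg.start + 1


def locate(position: int) -> Optional[str]:
    """Return the topology label covering a residue position, or None."""
    for seg in TOPOLOGY:
        if seg.start <= position <= seg.end:
            return seg.label
    return None


lengths = [segment_length(s) for s in TOPOLOGY]
print(lengths, sum(lengths))
print(locate(48), "|", locate(320), "|", locate(700))
```

The three segment lengths should add up to the length of the canonical sequence, which is a quick consistency check on the table.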

GO - Cellular component

  • acrosomal membrane Source: Ensembl
  • integral component of nuclear inner membrane Source: Ensembl
  • intracellular membrane-bounded organelle Source: HPA
  • LINC complex Source: UniProtKB
  • nuclear envelope Source: UniProtKB
  • nuclear membrane Source: HPA

Keywords - Cellular component

Membrane, Nucleus

Pathology & Biotech

Organism-specific databases

DisGeNET: 23353.
OpenTargets: ENSG00000164828.
PharmGKB: PA165618311.

Polymorphism and mutation databases

BioMuta: SUN1.

PTM / Processing

Molecule processing

Feature key | Identifier | Position(s) | Description | Length
Chain | PRO_0000218911 | 1 – 812 | SUN domain-containing protein 1 | 812

Amino acid modifications

Feature key | Position(s) | Description | Evidence | Length
Modified residue | 48 | Phosphoserine | Combined sources | 1
Modified residue | 100 | Phosphoserine | Combined sources | 1
Modified residue | 138 | Phosphoserine | Combined sources | 1
Modified residue | 371 | Phosphoserine | Combined sources | 1
Isoform 9 (identifier: O94901-9)
Modified residue | 333 | Phosphoserine | Combined sources | 1

Keywords - PTM

Phosphoprotein

Proteomic databases

EPD: O94901.
MaxQB: O94901.
PaxDb: O94901.
PeptideAtlas: O94901.
PRIDE: O94901.

PTM databases

iPTMnet: O94901.
PhosphoSitePlus: O94901.

Expression

Gene expression databases

Bgee: ENSG00000164828.
CleanEx: HS_UNC84A.
ExpressionAtlas: O94901. baseline and differential.
Genevisible: O94901. HS.

Organism-specific databases

HPA: HPA008346.
HPA008461.

Interaction

Subunit structure

Forms dimers and tetramers (By similarity). Core component of the LINC complex, which is composed of inner nuclear membrane SUN domain-containing proteins coupled to outer nuclear membrane KASH domain-containing nesprins. SUN domain-containing proteins interact with A-type lamins of the nuclear lamina, while at the other end of the complex, nesprins interact with unique cytoskeletal components. May interact with SYNE1, SYNE2 and SYNE3. May interact with SYNE4 (By similarity). Interacts with A-type lamin, with a strong preference for unprocessed A-type lamin over the mature protein. Interaction with lamins B1 and C is hardly detectable. Interacts with TSNAX (By similarity). Interacts with EMD and NAT10. Associates with the nuclear pore complex (NPC) (By similarity). Interacts with CCDC155 (via the last 22 amino acids); this interaction mediates CCDC155 telomere localization. Interacts with CCDC79/TERB1, promoting the accumulation of LINC complexes at the telomere-nuclear envelope attachment sites (By similarity).

Binary interactions

With | Entry | #Exp. | IntAct
SYNE1 | Q8NF91-1 | 2 | EBI-2796904, EBI-6170938
SYNE2 | Q8WXH0-1 | 2 | EBI-2796904, EBI-6170976

Protein-protein interaction databases

BioGrid: 116935. 25 interactors.
IntAct: O94901. 8 interactors.
STRING: 9606.ENSP00000384015.

Structure

3D structure databases

ProteinModelPortal: O94901.
SMR: O94901.
ModBase: Search...
MobiDB: Search...

Family & Domains

Domains and Repeats

Feature key | Position(s) | Description | Evidence | Length
Domain | 649 – 811 | SUN | PROSITE-ProRule annotation | 163

Region

Feature key | Position(s) | Description | Length
Region | 1 – 138 | LMNA-binding | 138
Region | 209 – 302 | SYNE2-binding | 94
Region | 223 – 302 | EMD-binding | 80
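Two of the binding regions listed above share residues. As a small sketch of how such overlaps can be computed from the listed coordinates (again assuming 1-based, inclusive positions; `REGIONS` and `overlap` are hypothetical names chosen for this example):

```python
from typing import Optional, Tuple

Interval = Tuple[int, int]  # (start, end), 1-based and inclusive

# Binding regions of the canonical isoform, copied from the Region table.
REGIONS = {
    "LMNA-binding": (1, 138),
    "SYNE2-binding": (209, 302),
    "EMD-binding": (223, 302),
}


def overlap(a: Interval, b: Interval) -> Optional[Interval]:
    """Return the shared interval of a and b, or None if they are disjoint."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return (start, end) if start <= end else None


shared = overlap(REGIONS["SYNE2-binding"], REGIONS["EMD-binding"])
disjoint = overlap(REGIONS["LMNA-binding"], REGIONS["SYNE2-binding"])
print(shared, disjoint)
```

With these coordinates the EMD-binding region lies entirely inside the SYNE2-binding region, while the LMNA-binding region does not touch either.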

Coiled coil

Feature key | Position(s) | Evidence | Length
Coiled coil | 393 – 430 | Sequence analysis | 38
Coiled coil | 455 – 493 | Sequence analysis | 39
Coiled coil | 501 – 523 | Sequence analysis | 23

Domain

The SUN domain may play a role in nuclear anchoring and/or migration. 1 Publication

Sequence similarities

Contains 1 SUN domain. PROSITE-ProRule annotation

Keywords - Domain

Coiled coil, Signal-anchor, Transmembrane, Transmembrane helix

Phylogenomic databases

eggNOG: KOG2687. Eukaryota.
ENOG410YM6S. LUCA.
GeneTree: ENSGT00390000011587.
HOGENOM: HOG000253025.
HOVERGEN: HBG104132.
InParanoid: O94901.
KO: K19347.
PhylomeDB: O94901.
TreeFam: TF323915.

Family and domain databases

InterPro: IPR032680. SUN1_N.
IPR012919. SUN_dom.
Pfam: PF09387. MRP. 1 hit.
PF07738. Sad1_UNC. 1 hit.
PROSITE: PS51469. SUN. 1 hit.

Sequences (9)

Sequence status: Complete.

This entry describes 9 isoforms produced by alternative splicing.

Isoform 1 (identifier: O94901-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MDFSRLHMYS PPQCVPENTG YTYALSSSYS SDALDFETEH KLDPVFDSPR
        60         70         80         90        100
MSRRSLRLAT TACTLGDGEA VGADSGTSSA VSLKNRAART TKQRRSTNKS
       110        120        130        140        150
AFSINHVSRQ VTSSGVSHGG TVSLQDAVTR RPPVLDESWI REQTTVDHFW
       160        170        180        190        200
GLDDDGDLKG GNKAAIQGNG DVGAAAATAH NGFSCSNCSM LSERKDVLTA
       210        220        230        240        250
HPAAPGPVSR VYSRDRNQKC DDCKGKRHLD AHPGRAGTLW HIWACAGYFL
       260        270        280        290        300
LQILRRIGAV GQAVSRTAWS ALWLAVVAPG KAASGVFWWL GIGWYQFVTL
       310        320        330        340        350
ISWLNVFLLT RCLRNICKFL VLLIPLFLLL AGLSLRGQGN FFSFLPVLNW
       360        370        380        390        400
ASMHRTQRVD DPQDVFKPTT SRLKQPLQGD SEAFPWHWMS GVEQQVASLS
       410        420        430        440        450
GQCHHHGENL RELTTLLQKL QARVDQMEGG AAGPSASVRD AVGQPPRETD
       460        470        480        490        500
FMAFHQEHEV RMSHLEDILG KLREKSEAIQ KELEQTKQKT ISAVGEQLLP
       510        520        530        540        550
TVEHLQLELD QLKSELSSWR HVKTGCETVD AVQERVDVQV REMVKLLFSE
       560        570        580        590        600
DQQGGSLEQL LQRFSSQFVS KGDLQTMLRD LQLQILRNVT HHVSVTKQLP
       610        620        630        640        650
TSEAVVSAVS EAGASGITEA QARAIVNSAL KLYSQDKTGM VDFALESGGG
       660        670        680        690        700
SILSTRCSET YETKTALMSL FGIPLWYFSQ SPRVVIQPDI YPGNCWAFKG
       710        720        730        740        750
SQGYLVVRLS MMIHPAAFTL EHIPKTLSPT GNISSAPKDF AVYGLENEYQ
       760        770        780        790        800
EEGQLLGQFT YDQDGESLQM FQALKRPDDT AFQIVELRIF SNWGHPEYTC
       810
LYRFRVHGEP VK
Length: 812
Mass (Da): 90,064
Last modified: July 11, 2003 - v3
Checksum: B958E95510B6F15F
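The sequence display above alternates position rulers with rows of ten-residue groups. A minimal sketch of turning that layout back into a plain residue string follows; `parse_sequence_block` is a hypothetical helper written for this example, and the sample is only an excerpt (the first and last rows of the canonical sequence), not the full 812 residues.

```python
def parse_sequence_block(block: str) -> str:
    """Strip ruler lines and spacing from a grouped sequence display."""
    residues = []
    for line in block.splitlines():
        tokens = line.split()
        # Ruler lines consist only of integers (10, 20, ...); skip them.
        if tokens and all(token.isdigit() for token in tokens):
            continue
        # Remaining tokens are residue groups; blank lines add nothing.
        residues.extend(tokens)
    return "".join(residues)


# Excerpt only: first and last rows of the display above.
sample = """\
        10         20         30         40         50
MDFSRLHMYS PPQCVPENTG YTYALSSSYS SDALDFETEH KLDPVFDSPR
       810
LYRFRVHGEP VK
"""

seq = parse_sequence_block(sample)
print(len(seq), seq[:10], seq[-5:])
```

Applied to the complete block, the length of the result can be compared with the stated Length of 812 as a sanity check.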
Isoform 2 (identifier: O94901-2)

The sequence of this isoform differs from the canonical sequence as follows:
     221-257: DDCKGKRHLDAHPGRAGTLWHIWACAGYFLLQILRRI → KSQSFKTQKKVCFPNLIFPFCKSQCLHYLSWRLKIIP
     258-812: Missing.

Note: No experimental confirmation available.
Length: 257
Mass (Da): 28,083
Checksum: 5E674FAF6714115C

Isoform 3 (identifier: O94901-3)

The sequence of this isoform differs from the canonical sequence as follows:
     221-341: DDCKGKRHLD...GLSLRGQGNF → GASFYVNRIL...AVMLGTSSRE
     342-812: Missing.

Note: No experimental confirmation available.
Length: 341
Mass (Da): 37,468
Checksum: 144E6797A593A612

Isoform 4 (identifier: O94901-4)

The sequence of this isoform differs from the canonical sequence as follows:
     109-109: R → V
     110-812: Missing.

Length: 109
Mass (Da): 11,897
Checksum: ED8ECECB9657080A

Isoform 5 (identifier: O94901-5)

The sequence of this isoform differs from the canonical sequence as follows:
     1-50: Missing.
     220-280: CDDCKGKRHLDAHPGRAGTLWHIWACAGYFLLQILRRIGAVGQAVSRTAWSALWLAVVAPG → W

Length: 702
Mass (Da): 77,911
Checksum: 630DD5C326645BCE

Isoform 6 (identifier: O94901-6)

The sequence of this isoform differs from the canonical sequence as follows:
     151-279: Missing.
     331-331: Missing.

Note: No experimental confirmation available.
Length: 682
Mass (Da): 76,403
Checksum: CEEE24E27B0B6069

Isoform 7 (identifier: O94901-7)

The sequence of this isoform differs from the canonical sequence as follows:
     1-1: M → MGRISPGSPGLPRTVWFEVVNM
     221-257: DDCKGKRHLDAHPGRAGTLWHIWACAGYFLLQILRRI → KSQSFKTQKKVCFPNLIFPFCKSQCLHYLSWRLKIIP
     258-812: Missing.

Note: No experimental confirmation available.
Length: 278
Mass (Da): 30,365
Checksum: C4E91043C5F2C0BD

Isoform 8 (identifier: O94901-8)

The sequence of this isoform differs from the canonical sequence as follows:
     221-247: Missing.

Note: No experimental confirmation available.
Length: 785
Mass (Da): 87,110
Checksum: 1099A50C9540ECF5

Isoform 9 (identifier: O94901-9)

The sequence of this isoform differs from the canonical sequence as follows:
     219-219: K → KCGASFYVNR...GHLSVNGEAL
     232-232: H → HTAAHSQSPRL

Note: No experimental confirmation available. Combined sources
Length: 916
Mass (Da): 101,932
Checksum: 3279C7C12D8613C0

Sequence caution

The sequence BAA34530 differs from that shown. Reason: Erroneous initiation. Curated

Experimental Info

Feature key | Position(s) | Description | Evidence | Length
Sequence conflict | 15 | V → A in CAD98070 (PubMed:17974005). | Curated | 1
Sequence conflict | 78 | S → G in CAD98070 (PubMed:17974005). | Curated | 1
Sequence conflict | 174 | A → V in BAA34530 (PubMed:9872452). | Curated | 1
Sequence conflict | 204 | A → P in AAH13613 (PubMed:15489334). | Curated | 1
Sequence conflict | 445 | P → L in BAG51119 (PubMed:14702039). | Curated | 1
Sequence conflict | 503 | E → K in BAG64069 (PubMed:14702039). | Curated | 1
Sequence conflict | 520 | R → Q in BAG51119 (PubMed:14702039). | Curated | 1

Natural variant

Feature key | Identifier | Position(s) | Description | Evidence | Length
Natural variant | VAR_059828 | 118 | H → Y. Corresponds to variant rs6461378 (dbSNP, Ensembl). | 3 Publications | 1
Natural variant | VAR_071065 | 203 | A → V. Corresponds to variant rs144929525 (dbSNP, Ensembl). | 1 Publication | 1
Natural variant | VAR_071066 | 614 | A → V. | 1 Publication | 1

Alternative sequence

Feature key | Identifier | Position(s) | Description | Evidence | Length
Alternative sequence | VSP_037867 | 1 – 50 | Missing in isoform 5. | 1 Publication | 50
Alternative sequence | VSP_046108 | 1 | M → MGRISPGSPGLPRTVWFEVVNM in isoform 7. | 1 Publication | 1
Alternative sequence | VSP_007741 | 109 | R → V in isoform 4. | 1 Publication | 1
Alternative sequence | VSP_007742 | 110 – 812 | Missing in isoform 4. | 1 Publication | 703
Alternative sequence | VSP_045815 | 151 – 279 | Missing in isoform 6. | 1 Publication | 129
Alternative sequence | VSP_047139 | 219 | K → KCGASFYVNRILWLARYTASSFSSFLVQLFQVVLMKLSYESENYKLKTHESKDCESESYKSKSHESKAHASYYGRMNVREVLREDGHLSVNGEAL in isoform 9. | Curated | 1
Alternative sequence | VSP_037868 | 220 – 280 | CDDCK…VVAPG → W in isoform 5. | 1 Publication | 61
Alternative sequence | VSP_007745 | 221 – 341 | DDCKG…GQGNF → GASFYVNRILWLARYTASSFSSFLVQLFQVVLMKLSYESENYKLKTHESKDCESESYKSKSHESKAHASYYGRMNVREVLREDGHLSVNGEALCKYGFVFLWASVVELVPHAVMLGTSSRE in isoform 3. | 1 Publication | 121
Alternative sequence | VSP_007743 | 221 – 257 | DDCKG…ILRRI → KSQSFKTQKKVCFPNLIFPFCKSQCLHYLSWRLKIIP in isoform 2 and isoform 7. | 2 Publications | 37
Alternative sequence | VSP_046269 | 221 – 247 | Missing in isoform 8. | 1 Publication | 27
Alternative sequence | VSP_047140 | 232 | H → HTAAHSQSPRL in isoform 9. | Curated | 1
Alternative sequence | VSP_007744 | 258 – 812 | Missing in isoform 2 and isoform 7. | 2 Publications | 555
Alternative sequence | VSP_045816 | 331 | Missing in isoform 6. | 1 Publication | 1
Alternative sequence | VSP_007746 | 342 – 812 | Missing in isoform 3. | 1 Publication | 471
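Each isoform is described as a set of edits in canonical coordinates, so the stated isoform lengths can be rechecked by applying those edits. The sketch below is one way to do that, assuming 1-based inclusive positions; `apply_edits` and `ISOFORM_EDITS` are illustrative names, and the canonical sequence is replaced by a placeholder string of the same length, since only lengths are being checked.

```python
from typing import Dict, List, Tuple

# (start, end, replacement) in canonical 1-based inclusive coordinates.
# An empty replacement string stands for "Missing".
Edit = Tuple[int, int, str]


def apply_edits(canonical: str, edits: List[Edit]) -> str:
    """Apply non-overlapping edits given in canonical coordinates."""
    seq = canonical
    # Work from the C-terminal end so earlier coordinates stay valid.
    for start, end, replacement in sorted(edits, reverse=True):
        seq = seq[:start - 1] + replacement + seq[end:]
    return seq


CANONICAL_LENGTH = 812
placeholder = "X" * CANONICAL_LENGTH  # stand-in, not the real sequence

# Edits copied from the isoform descriptions in this entry.
ISOFORM_EDITS: Dict[str, List[Edit]] = {
    "isoform 4": [(109, 109, "V"), (110, 812, "")],
    "isoform 5": [(1, 50, ""), (220, 280, "W")],
    "isoform 6": [(151, 279, ""), (331, 331, "")],
    "isoform 7": [
        (1, 1, "MGRISPGSPGLPRTVWFEVVNM"),
        (221, 257, "KSQSFKTQKKVCFPNLIFPFCKSQCLHYLSWRLKIIP"),
        (258, 812, ""),
    ],
    "isoform 8": [(221, 247, "")],
}

isoform_lengths = {
    name: len(apply_edits(placeholder, edits))
    for name, edits in ISOFORM_EDITS.items()
}
print(isoform_lengths)
```

Applying edits from the highest position downward avoids having to shift coordinates after each insertion or deletion. Using the real canonical sequence instead of the placeholder would yield the isoform sequences themselves.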

Sequence databases

EMBL / GenBank / DDBJ:
AB018353 mRNA. Translation: BAA34530.1. Different initiation.
AK022469 mRNA. Translation: BAB14046.1.
AK022816 mRNA. Translation: BAG51119.1.
AK302896 mRNA. Translation: BAG64069.1.
AK309120 mRNA. No translation available.
BX538211 mRNA. Translation: CAD98070.1.
AC073957 Genomic DNA. No translation available.
AC099731 Genomic DNA. No translation available.
BC013613 mRNA. Translation: AAH13613.1.
BC142707 mRNA. Translation: AAI42708.1.
AF202724 mRNA. Translation: AAF15888.1.
CCDS: CCDS43533.1. [O94901-5]
CCDS47525.1. [O94901-8]
CCDS55078.1. [O94901-7]
CCDS55079.1. [O94901-2]
CCDS55080.1. [O94901-6]
RefSeq: NP_001124437.1. NM_001130965.2. [O94901-8]
NP_001165415.1. NM_001171944.1. [O94901-6]
NP_001165416.1. NM_001171945.1. [O94901-7]
NP_001165417.1. NM_001171946.1. [O94901-2]
NP_079430.3. NM_025154.5. [O94901-5]
UniGene: Hs.438072.

Genome annotation databases

Ensembl: ENST00000389574; ENSP00000374225; ENSG00000164828. [O94901-5]
ENST00000401592; ENSP00000384015; ENSG00000164828. [O94901-8]
ENST00000403868; ENSP00000383947; ENSG00000164828. [O94901-2]
ENST00000425407; ENSP00000392309; ENSG00000164828. [O94901-5]
ENST00000452783; ENSP00000413439; ENSG00000164828. [O94901-6]
ENST00000457378; ENSP00000395952; ENSG00000164828. [O94901-7]
GeneID: 23353.
KEGG: hsa:23353.
UCSC: uc003sjf.4. human. [O94901-1]

Keywords - Coding sequence diversity

Alternative splicing, Polymorphism

Cross-references

Protocols and materials databases

DNASU: 23353.
Structural Biology Knowledgebase: Search...

Organism-specific databases

CTD: 23353.
DisGeNET: 23353.
GeneCards: SUN1.
H-InvDB: HIX0078880.
HGNC: HGNC:18587. SUN1.
HPA: HPA008346.
HPA008461.
MIM: 607723. gene.
neXtProt: NX_O94901.
OpenTargets: ENSG00000164828.
PharmGKB: PA165618311.
HUGE: Search...
GenAtlas: Search...

Miscellaneous databases

ChiTaRS: SUN1. human.
GeneWiki: UNC84A.
GenomeRNAi: 23353.
PRO: O94901.
SOURCE: Search...

Entry information

Entry name: SUN1_HUMAN
Accession: Primary (citable) accession number: O94901
Secondary accession number(s): A5PL20, B3KMV7, B4DZF7, B7WNY4, B7WP53, E9PDU4, E9PF23, F8WD13, Q96CZ7, Q9HA14, Q9UH98
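The accessions above, and the isoform identifiers used throughout this entry (for example O94901-9), follow a fixed pattern. The sketch below checks them with the regular expression that UniProt publishes for accession numbers; it is reproduced here from memory and should be treated as an assumption to verify against the UniProt help pages. `split_isoform_id` is a hypothetical helper for this example.

```python
import re
from typing import Optional, Tuple

# Accession pattern as documented by UniProt (assumed; verify before relying on it).
ACCESSION_RE = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)

# Primary and secondary accessions listed for this entry.
ACCESSIONS = [
    "O94901", "A5PL20", "B3KMV7", "B4DZF7", "B7WNY4", "B7WP53",
    "E9PDU4", "E9PF23", "F8WD13", "Q96CZ7", "Q9HA14", "Q9UH98",
]


def split_isoform_id(identifier: str) -> Tuple[str, Optional[int]]:
    """Split 'O94901-9' into ('O94901', 9); a bare accession gives None."""
    accession, _, isoform = identifier.partition("-")
    if not ACCESSION_RE.fullmatch(accession):
        raise ValueError(f"not a valid accession: {accession!r}")
    return accession, int(isoform) if isoform else None


all_valid = all(ACCESSION_RE.fullmatch(acc) for acc in ACCESSIONS)
print(all_valid, split_isoform_id("O94901-9"))
```

Note that the entry name SUN1_HUMAN is a different kind of identifier and is rejected by this pattern.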
Entry history
Integrated into UniProtKB/Swiss-Prot: August 2, 2002
Last sequence update: July 11, 2003
Last modified: November 30, 2016
This is version 157 of the entry and version 3 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

Keywords - Technical term

Complete proteome, Reference proteome

Documents

  1. Human chromosome 7
    Human chromosome 7: entries, gene names and cross-references to MIM
  2. Human entries with polymorphisms or disease mutations
    List of human entries with polymorphisms or disease mutations
  3. Human polymorphisms and disease mutations
    Index of human polymorphisms and disease mutations
  4. MIM cross-references
    Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot
  5. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.