Protein

Nidogen-1

Gene

Nid1

Organism
Mus musculus (Mouse)
Status
Reviewed - Annotation score: 5 out of 5 - Experimental evidence at protein level

Function

Sulfated glycoprotein widely distributed in basement membranes and tightly associated with laminin. Also binds to collagen IV and perlecan. It probably has a role in cell-extracellular matrix interactions.

Sites

Feature key | Position(s) | Description | Length
Site | 457 | Involved in perlecan binding | 1
Site | 459 | Involved in perlecan binding | 1
Site | 648 | Involved in perlecan binding | 1

GO - Molecular function

  • calcium ion binding Source: InterPro
  • collagen binding Source: MGI
  • extracellular matrix binding Source: MGI
  • laminin-1 binding Source: MGI
  • laminin binding Source: MGI
  • proteoglycan binding Source: MGI

GO - Biological process

  • cell-matrix adhesion Source: MGI
  • extracellular matrix disassembly Source: Reactome
  • extracellular matrix organization Source: MGI
  • glomerular basement membrane development Source: MGI
  • positive regulation of cell-substrate adhesion Source: MGI

Keywords - Biological process

Cell adhesion

Keywords - Ligand

Calcium

Enzyme and pathway databases

Reactome: R-MMU-1474228. Degradation of the extracellular matrix.
Reactome: R-MMU-3000157. Laminin interactions.

Names & Taxonomy

Protein names
Recommended name: Nidogen-1
Short name: NID-1
Alternative name(s): Entactin
Gene names
Name: Nid1
Synonyms: Ent
Organism: Mus musculus (Mouse)
Taxonomic identifier: 10090 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Glires > Rodentia > Sciurognathi > Muroidea > Muridae > Murinae > Mus > Mus
Proteomes
  • UP000000589 Component: Chromosome 13

Organism-specific databases

MGI: MGI:97342. Nid1.

Subcellular location

GO - Cellular component

  • basal lamina Source: MGI
  • basement membrane Source: MGI
  • cell periphery Source: MGI
  • extracellular exosome Source: MGI
  • extracellular matrix Source: UniProtKB
  • extracellular region Source: Reactome
  • proteinaceous extracellular matrix Source: MGI

Keywords - Cellular component

Basement membrane, Extracellular matrix, Secreted

PTM / Processing

Molecule processing

Feature key | Position(s) | Description | Feature identifier | Evidence | Length
Signal peptide | 1 – 28 | - | - | 1 publication | 28
Chain | 29 – 1245 | Nidogen-1 | PRO_0000007670 | - | 1217
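
The Length column follows from the inclusive position ranges: a feature spanning residues a to b covers b - a + 1 residues. A minimal Python sketch of that arithmetic, using the two features listed above (the function name `feature_length` is illustrative only and not part of any UniProt tooling):

```python
def feature_length(start: int, end: int) -> int:
    """Number of residues in an inclusive, 1-based position range."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return end - start + 1


# Positions taken from the Molecule processing table.
signal_peptide = feature_length(1, 28)
mature_chain = feature_length(29, 1245)

# The two features are contiguous, so together they span the full precursor.
precursor = signal_peptide + mature_chain
print(signal_peptide, mature_chain, precursor)
```

The sum of the two lengths should agree with the sequence length reported in the Sequence section of this entry.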

Amino acid modifications

Feature key | Position(s) | Description | Evidence | Length
Glycosylation | 187 | N-linked (GlcNAc...) | 1 publication | 1
Modified residue | 290 | Sulfotyrosine | Sequence analysis | 1
Modified residue | 295 | Sulfotyrosine | Sequence analysis | 1
Glycosylation | 299 | O-linked (GalNAc...) | 1 publication | 1
Glycosylation | 331 | O-linked (GalNAc...) | 1 publication | 1
Glycosylation | 337 | O-linked (GalNAc...) | 1 publication | 1
Glycosylation | 345 | O-linked (GalNAc...) | 1 publication | 1
Glycosylation | 348 | O-linked (GalNAc...); partial | 1 publication | 1
Disulfide bond | 388 ↔ 401 | - | - | -
Disulfide bond | 395 ↔ 410 | - | - | -
Disulfide bond | 409 ↔ 616 | - | - | -
Disulfide bond | 412 ↔ 423 | - | - | -
Glycosylation | 415 | N-linked (GlcNAc...) | 1 publication | 1
Disulfide bond | 670 ↔ 683 | - | By similarity | -
Disulfide bond | 677 ↔ 693 | - | By similarity | -
Disulfide bond | 695 ↔ 706 | - | By similarity | -
Disulfide bond | 712 ↔ 725 | - | By similarity | -
Disulfide bond | 719 ↔ 734 | - | By similarity | -
Disulfide bond | 736 ↔ 748 | - | By similarity | -
Disulfide bond | 760 ↔ 775 | - | By similarity | -
Disulfide bond | 767 ↔ 785 | - | By similarity | -
Disulfide bond | 787 ↔ 798 | - | By similarity | -
Disulfide bond | 804 ↔ 815 | - | By similarity | -
Disulfide bond | 809 ↔ 824 | - | By similarity | -
Disulfide bond | 826 ↔ 837 | - | By similarity | -
Disulfide bond | 847 ↔ 876 | - | By similarity | -
Disulfide bond | 887 ↔ 894 | - | By similarity | -
Disulfide bond | 896 ↔ 917 | - | By similarity | -
Glycosylation | 920 | O-linked (GalNAc...) | 1 publication | 1
Glycosylation | 933 | O-linked (GalNAc...) | 1 publication | 1
Disulfide bond | 1210 ↔ 1221 | - | By similarity | -
Disulfide bond | 1217 ↔ 1230 | - | By similarity | -
Disulfide bond | 1232 ↔ 1241 | - | By similarity | -
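
Each disulfide bond should join two cysteines, which can be checked against the sequence shown in this entry. The sketch below is a hedged illustration rather than a curation tool: it uses only residues 381 to 430 copied from the Sequence section, and the names `FRAGMENT`, `residue_at` and `non_cysteine_ends` are hypothetical helpers introduced for this example.

```python
# Residues 381-430 of the displayed sequence, copied from this entry.
FRAGMENT_START = 381
FRAGMENT = "NTGSQQTCANNRHQCSVHAECRDYATGFCCRCVANYTGNGRQCVAEGSPQ"


def residue_at(position: int) -> str:
    """Return the residue at a 1-based sequence position inside the fragment."""
    index = position - FRAGMENT_START
    if not 0 <= index < len(FRAGMENT):
        raise IndexError(f"position {position} lies outside the fragment")
    return FRAGMENT[index]


def non_cysteine_ends(bonds):
    """List bond positions whose residue is not cysteine (empty if all are C)."""
    return [pos for bond in bonds for pos in bond if residue_at(pos) != "C"]


# Bonds from the table whose two ends both fall inside residues 381-430.
bonds_in_fragment = [(388, 401), (395, 410), (412, 423)]
problems = non_cysteine_ends(bonds_in_fragment)

# The N-linked glycosylation site at 415 should be an asparagine.
glyco_residue = residue_at(415)
print(problems, glyco_residue)
```

The bond 409 ↔ 616 is left out only because residue 616 lies beyond the copied fragment; extending `FRAGMENT` would allow the same check.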

Post-translational modification

N- and O-glycosylated. (1 publication)

Keywords - PTM

Disulfide bond, Glycoprotein, Sulfation

Proteomic databases

MaxQB: P10493.
PaxDb: P10493.
PeptideAtlas: P10493.
PRIDE: P10493.

PTM databases

iPTMnet: P10493.
PhosphoSitePlus: P10493.

Miscellaneous databases

PMAP-CutDB: P10493.

Expression

Gene expression databases

Bgee: ENSMUSG00000005397.
CleanEx: MM_NID1.
Genevisible: P10493. MM.

Interaction

Subunit structure

Interacts with FBLN1 and LGALS3BP. Interacts with PLXDC1. (4 publications)

Binary interactions

With | Entry | #Exp. | IntAct | Notes
Lamc1 | P02468 | 3 | EBI-1032117, EBI-7059830 | -

GO - Molecular function

  • collagen binding Source: MGI
  • laminin-1 binding Source: MGI
  • laminin binding Source: MGI

Protein-protein interaction databases

BioGrid: 201770. 4 interactors.
IntAct: P10493. 3 interactors.
MINT: MINT-215381.
STRING: 10090.ENSMUSP00000005532.

Structure

Secondary structure

Feature key | Position(s) | Evidence | Length
Helix | 388 – 391 | Combined sources | 4
Helix | 392 – 394 | Combined sources | 3
Beta strand | 399 – 403 | Combined sources | 5
Beta strand | 408 – 412 | Combined sources | 5
Beta strand | 416 – 418 | Combined sources | 3
Beta strand | 420 – 425 | Combined sources | 6
Beta strand | 429 – 442 | Combined sources | 14
Beta strand | 449 – 461 | Combined sources | 13
Turn | 462 – 464 | Combined sources | 3
Beta strand | 466 – 473 | Combined sources | 8
Helix | 476 – 479 | Combined sources | 4
Helix | 480 – 482 | Combined sources | 3
Helix | 487 – 495 | Combined sources | 9
Helix | 506 – 510 | Combined sources | 5
Beta strand | 513 – 522 | Combined sources | 10
Beta strand | 525 – 527 | Combined sources | 3
Beta strand | 529 – 536 | Combined sources | 8
Beta strand | 547 – 554 | Combined sources | 8
Beta strand | 562 – 564 | Combined sources | 3
Beta strand | 568 – 575 | Combined sources | 8
Beta strand | 578 – 590 | Combined sources | 13
Beta strand | 601 – 612 | Combined sources | 12
Beta strand | 627 – 641 | Combined sources | 15
Turn | 642 – 645 | Combined sources | 4
Beta strand | 646 – 656 | Combined sources | 11
Beta strand | 942 – 960 | Combined sources | 19
Helix | 964 – 966 | Combined sources | 3
Beta strand | 968 – 984 | Combined sources | 17
Turn | 985 – 988 | Combined sources | 4
Beta strand | 989 – 994 | Combined sources | 6
Turn | 995 – 998 | Combined sources | 4
Beta strand | 999 – 1007 | Combined sources | 9
Beta strand | 1011 – 1014 | Combined sources | 4
Beta strand | 1021 – 1027 | Combined sources | 7
Turn | 1028 – 1031 | Combined sources | 4
Beta strand | 1032 – 1037 | Combined sources | 6
Turn | 1038 – 1041 | Combined sources | 4
Beta strand | 1042 – 1047 | Combined sources | 6
Beta strand | 1054 – 1057 | Combined sources | 4
Beta strand | 1062 – 1070 | Combined sources | 9
Turn | 1071 – 1074 | Combined sources | 4
Beta strand | 1075 – 1080 | Combined sources | 6
Beta strand | 1083 – 1085 | Combined sources | 3
Beta strand | 1087 – 1092 | Combined sources | 6
Beta strand | 1099 – 1102 | Combined sources | 4
Beta strand | 1109 – 1115 | Combined sources | 7
Turn | 1116 – 1119 | Combined sources | 4
Beta strand | 1120 – 1125 | Combined sources | 6
Turn | 1126 – 1129 | Combined sources | 4
Beta strand | 1130 – 1135 | Combined sources | 6
Beta strand | 1138 – 1146 | Combined sources | 9
Beta strand | 1150 – 1157 | Combined sources | 8
Beta strand | 1160 – 1165 | Combined sources | 6
Turn | 1166 – 1169 | Combined sources | 4
Beta strand | 1170 – 1175 | Combined sources | 6
Turn | 1176 – 1179 | Combined sources | 4
Beta strand | 1180 – 1185 | Combined sources | 6
Beta strand | 1196 – 1199 | Combined sources | 4

3D structure databases

PDB entry | Method | Resolution (Å) | Chain | Positions
1GL4 | X-ray | 2.00 | A | 385-665
1H4U | X-ray | 2.20 | A | 395-659
1NPE | X-ray | 2.30 | A | 941-1207
ProteinModelPortal: P10493.
SMR: P10493.

Miscellaneous databases

EvolutionaryTrace: P10493.

Family & Domains

Domains and Repeats

Feature key | Position(s) | Description | Evidence | Length
Domain | 106 – 268 | NIDO | PROSITE-ProRule annotation | 163
Domain | 384 – 424 | EGF-like 1 | PROSITE-ProRule annotation | 41
Domain | 428 – 665 | Nidogen G2 beta-barrel | PROSITE-ProRule annotation | 238
Domain | 666 – 707 | EGF-like 2 | PROSITE-ProRule annotation | 42
Domain | 708 – 749 | EGF-like 3; calcium-binding | PROSITE-ProRule annotation | 42
Domain | 756 – 799 | EGF-like 4 | PROSITE-ProRule annotation | 44
Domain | 800 – 838 | EGF-like 5; calcium-binding | PROSITE-ProRule annotation | 39
Domain | 844 – 917 | Thyroglobulin type-1 | PROSITE-ProRule annotation | 74
Repeat | 988 – 1030 | LDL-receptor class B 1 | - | 43
Repeat | 1031 – 1073 | LDL-receptor class B 2 | - | 43
Repeat | 1074 – 1118 | LDL-receptor class B 3 | - | 45
Repeat | 1119 – 1160 | LDL-receptor class B 4 | - | 42
Domain | 1206 – 1242 | EGF-like 6 | PROSITE-ProRule annotation | 37

Motif

Feature key | Position(s) | Description | Length
Motif | 700 – 702 | Cell attachment site | 3

Sequence similarities

Contains 6 EGF-like domains. (PROSITE-ProRule annotation)
Contains 4 LDL-receptor class B repeats. (PROSITE-ProRule annotation)
Contains 1 NIDO domain. (PROSITE-ProRule annotation)
Contains 1 nidogen G2 beta-barrel domain. (PROSITE-ProRule annotation)
Contains 1 thyroglobulin type-1 domain. (PROSITE-ProRule annotation)

Keywords - Domain

EGF-like domain, Repeat, Signal

Phylogenomic databases

eggNOG: KOG1214. Eukaryota.
eggNOG: ENOG410XR8N. LUCA.
GeneTree: ENSGT00860000133769.
HOGENOM: HOG000072712.
HOVERGEN: HBG006498.
InParanoid: P10493.
KO: K06826.
OMA: LHNCDIP.
OrthoDB: EOG091G030P.
TreeFam: TF320666.

Family and domain databases

CDD: cd00255. nidG2. 1 hit.
Gene3D: 2.120.10.30. 1 hit.
Gene3D: 2.40.155.10. 2 hits.
Gene3D: 4.10.800.10. 1 hit.
InterPro: IPR011042. 6-blade_b-propeller_TolB-like.
InterPro: IPR026823. cEGF.
InterPro: IPR001881. EGF-like_Ca-bd_dom.
InterPro: IPR013032. EGF-like_CS.
InterPro: IPR000742. EGF-like_dom.
InterPro: IPR000152. EGF-type_Asp/Asn_hydroxyl_site.
InterPro: IPR018097. EGF_Ca-bd_CS.
InterPro: IPR024731. EGF_dom.
InterPro: IPR006605. G2_nidogen/fibulin_G2F.
InterPro: IPR009017. GFP.
InterPro: IPR023413. GFP-like.
InterPro: IPR009030. Growth_fac_rcpt_.
InterPro: IPR000033. LDLR_classB_rpt.
InterPro: IPR003886. NIDO_dom.
InterPro: IPR000716. Thyroglobulin_1.
Pfam: PF12662. cEGF. 1 hit.
Pfam: PF12947. EGF_3. 1 hit.
Pfam: PF07645. EGF_CA. 1 hit.
Pfam: PF07474. G2F. 1 hit.
Pfam: PF00058. Ldl_recept_b. 3 hits.
Pfam: PF06119. NIDO. 1 hit.
Pfam: PF00086. Thyroglobulin_1. 1 hit.
SMART: SM00181. EGF. 6 hits.
SMART: SM00179. EGF_CA. 4 hits.
SMART: SM00682. G2F. 1 hit.
SMART: SM00135. LY. 5 hits.
SMART: SM00539. NIDO. 1 hit.
SMART: SM00211. TY. 1 hit.
SUPFAM: SSF54511. SSF54511. 1 hit.
SUPFAM: SSF57184. SSF57184. 2 hits.
SUPFAM: SSF57610. SSF57610. 1 hit.
PROSITE: PS00010. ASX_HYDROXYL. 3 hits.
PROSITE: PS00022. EGF_1. 1 hit.
PROSITE: PS01186. EGF_2. 4 hits.
PROSITE: PS50026. EGF_3. 5 hits.
PROSITE: PS01187. EGF_CA. 2 hits.
PROSITE: PS51120. LDLRB. 4 hits.
PROSITE: PS51220. NIDO. 1 hit.
PROSITE: PS50993. NIDOGEN_G2. 1 hit.
PROSITE: PS00484. THYROGLOBULIN_1_1. 1 hit.
PROSITE: PS51162. THYROGLOBULIN_1_2. 1 hit.

Sequence

Sequence status: Complete.

Sequence processing: The displayed sequence is further processed into a mature form.

P10493-1 [UniParc]

        10         20         30         40         50
MLDASGCSWA MWTWALLQLL LLVGPGGCLN RQELFPFGPG QGDLELEAGD
60 70 80 90 100
DVVSPSLELI GELSFYDRTD ITSVYVTTNG IIAMSEPPAT EYHPGTFPPS
110 120 130 140 150
FGSVAPFLAD LDTTDGLGNV YYREDLSPFI IQMAAEYVQR GFPEVSFQPT
160 170 180 190 200
SVVVVTWESV APYGGPSSSP AEEGKRNTFQ AVLASSNSSS YAIFLYPEDG
210 220 230 240 250
LQFFTTFSKK DESQVPAVVG FSKGLVGFLW KSNGAYNIFA NDRESIENLA
260 270 280 290 300
KSSNAGHQGV WVFEIGSPAT AKGVVSADVN LDLDDDGADY EDEDYDLVTS
310 320 330 340 350
HLGLEDVATP SPSHSPRRGY PDPHNVPRIL SPGYEATERP RGVPTERTRS
360 370 380 390 400
FQLPAERFPQ HHPQVIDVDE VEETGVVFSY NTGSQQTCAN NRHQCSVHAE
410 420 430 440 450
CRDYATGFCC RCVANYTGNG RQCVAEGSPQ RVNGKVKGRI FVGSSQVPVV
460 470 480 490 500
FENTDLHSYV VMNHGRSYTA ISTIPETVGY SLLPLAPIGG IIGWMFAVEQ
510 520 530 540 550
DGFKNGFSIT GGEFTRQAEV TFLGHPGKLV LKQQFSGIDE HGHLTISTEL
560 570 580 590 600
EGRVPQIPYG ASVHIEPYTE LYHYSSSVIT SSSTREYTVM EPDQDGAAPS
610 620 630 640 650
HTHIYQWRQT ITFQECAHDD ARPALPSTQQ LSVDSVFVLY NKEERILRYA
660 670 680 690 700
LSNSIGPVRD GSPDALQNPC YIGTHGCDSN AACRPGPGTQ FTCECSIGFR
710 720 730 740 750
GDGQTCYDID ECSEQPSRCG NHAVCNNLPG TFRCECVEGY HFSDRGTCVA
760 770 780 790 800
AEDQRPINYC ETGLHNCDIP QRAQCIYMGG SSYTCSCLPG FSGDGRACRD
810 820 830 840 850
VDECQHSRCH PDAFCYNTPG SFTCQCKPGY QGDGFRCMPG EVSKTRCQLE
860 870 880 890 900
REHILGAAGG ADAQRPTLQG MFVPQCDEYG HYVPTQCHHS TGYCWCVDRD
910 920 930 940 950
GRELEGSRTP PGMRPPCLST VAPPIHQGPV VPTAVIPLPP GTHLLFAQTG
960 970 980 990 1000
KIERLPLERN TMKKTEAKAF LHIPAKVIIG LAFDCVDKVV YWTDISEPSI
1010 1020 1030 1040 1050
GRASLHGGEP TTIIRQDLGS PEGIALDHLG RTIFWTDSQL DRIEVAKMDG
1060 1070 1080 1090 1100
TQRRVLFDTG LVNPRGIVTD PVRGNLYWTD WNRDNPKIET SHMDGTNRRI
1110 1120 1130 1140 1150
LAQDNLGLPN GLTFDAFSSQ LCWVDAGTHR AECLNPAQPG RRKVLEGLQY
1160 1170 1180 1190 1200
PFAVTSYGKN LYYTDWKTNS VIAMDLAISK EMDTFHPHKQ TRLYGITIAL
1210 1220 1230 1240
SQCPQGHNYC SVNNGGCTHL CLPTPGSRTC RCPDNTLGVD CIERK
Length: 1,245
Mass (Da): 136,538
Last modified: July 27, 2011 - v2
Checksum: 92D12D128E6EF144
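
The sequence display above interleaves position rulers with residue groups of ten, so it has to be cleaned before any position-based lookup. A minimal Python sketch, assuming ruler lines contain only digits and spaces; `clean_sequence_display` is a hypothetical helper name, and the example uses only the first two rows (residues 1 to 100) copied from this entry:

```python
import re


def clean_sequence_display(block: str) -> str:
    """Join residue groups, skipping blank lines and digit-only ruler lines."""
    residues = []
    for line in block.splitlines():
        stripped = line.strip()
        if not stripped or re.fullmatch(r"[\d\s]+", stripped):
            continue  # blank line or position ruler
        residues.append(re.sub(r"\s+", "", stripped))
    return "".join(residues)


# First two rows of the sequence display (residues 1-100).
example = """
        10         20         30         40         50
MLDASGCSWA MWTWALLQLL LLVGPGGCLN RQELFPFGPG QGDLELEAGD
60 70 80 90 100
DVVSPSLELI GELSFYDRTD ITSVYVTTNG IIAMSEPPAT EYHPGTFPPS
"""

sequence = clean_sequence_display(example)
# Positions 1-28 are annotated as the signal peptide in this entry.
signal_peptide = sequence[:28]
print(len(sequence), signal_peptide)
```

Applied to the full display, the cleaned string should have the length reported for this entry; this sketch does not verify that.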

Experimental Info

Feature key | Position(s) | Description | Evidence | Length
Sequence conflict | 170 | P → L in CAA32642 (PubMed:2496973). | Curated | 1
Sequence conflict | 659 | R → K in CAA32642 (PubMed:2496973). | Curated | 1
Sequence conflict | 967 | A → R in CAA32408 (PubMed:3264556). | Curated | 1

Sequence databases

EMBL / GenBank / DDBJ:
X14194 mRNA. Translation: CAA32408.1.
X14480 mRNA. Translation: CAA32642.1.
AK041633 mRNA. Translation: BAC31014.1.
AK084876 mRNA. Translation: BAC39300.1.
AK144878 mRNA. Translation: BAE26114.1.
AK166779 mRNA. Translation: BAE39014.1.
BC131669 mRNA. Translation: AAI31670.1.
AH003206 Genomic DNA. Translation: AAA77652.1.
X83093 Genomic DNA. Translation: CAA58148.1.
CCDS: CCDS26244.1.
PIR: S02730. MMMSND.
RefSeq: NP_035047.2. NM_010917.2.
UniGene: Mm.4691.

Genome annotation databases

Ensembl: ENSMUST00000005532; ENSMUSP00000005532; ENSMUSG00000005397.
GeneID: 18073.
KEGG: mmu:18073.
UCSC: uc007pmf.2. mouse.

Cross-references

Organism-specific databases

CTD: 4811.
MGI: MGI:97342. Nid1.

Miscellaneous databases

ChiTaRS: Nid1. mouse.
EvolutionaryTrace: P10493.
PMAP-CutDB: P10493.
PRO: P10493.

Entry information

Entry name: NID1_MOUSE
Accession: Primary (citable) accession number: P10493
Secondary accession number(s): Q3TKX9, Q8BQI3, Q8C3U8, Q8C9P6
Entry history
Integrated into UniProtKB/Swiss-Prot: April 1, 1990
Last sequence update: July 27, 2011
Last modified: November 30, 2016
This is version 181 of the entry and version 2 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program

Miscellaneous

Keywords - Technical term

3D-structure, Complete proteome, Direct protein sequencing, Reference proteome

Documents

  1. MGD cross-references
    Mouse Genome Database (MGD) cross-references in UniProtKB/Swiss-Prot
  2. PDB cross-references
    Index of Protein Data Bank (PDB) cross-references
  3. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.
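
The UniRef90 and UniRef50 criteria above reduce to two thresholds applied to a comparison against the seed sequence. The sketch below only encodes those thresholds; it is not how UniRef performs its clustering, and the identity and overlap fractions are assumed to come from an alignment computed elsewhere. `THRESHOLDS` and `meets_uniref_threshold` are hypothetical names.

```python
# (minimum identity, minimum overlap) relative to the seed sequence,
# as stated in the descriptions above.
THRESHOLDS = {
    90: (0.90, 0.80),
    50: (0.50, 0.80),
}


def meets_uniref_threshold(identity: float, overlap: float, level: int) -> bool:
    """Check a precomputed identity/overlap pair against a UniRef level."""
    if level not in THRESHOLDS:
        raise ValueError("this sketch only covers levels 90 and 50")
    min_identity, min_overlap = THRESHOLDS[level]
    return identity >= min_identity and overlap >= min_overlap


# Made-up alignment statistics, used purely for illustration.
print(meets_uniref_threshold(0.93, 0.85, 90))
print(meets_uniref_threshold(0.93, 0.70, 90))
print(meets_uniref_threshold(0.60, 0.90, 50))
```

UniRef100 is omitted because its criterion (identical sequences and sub-fragments of 11 or more residues) is not an identity threshold.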