Protein

Neogenin

Gene

Neo1

Organism
Mus musculus (Mouse)
Status
Reviewed - Annotation score: 5 out of 5 - Experimental evidence at protein level

Function

Multi-functional cell surface receptor regulating cell adhesion in many diverse developmental processes, including neural tube and mammary gland formation, myogenesis and angiogenesis. Receptor for members of the BMP, netrin, and repulsive guidance molecule (RGM) families. Netrin-Neogenin interactions result in a chemoattractive axon guidance response and cell-cell adhesion, whereas the interaction of NEO1/Neogenin with RGMa and RGMb induces a chemorepulsive response. (1 publication)

GO - Molecular function

  • BMP receptor binding Source: BHF-UCL
  • cadherin binding Source: MGI
  • co-receptor binding Source: MGI
  • receptor activity Source: MGI

GO - Biological process

  • axon guidance Source: Reactome
  • cell adhesion Source: UniProtKB-KW
  • iron ion homeostasis Source: MGI
  • myoblast fusion Source: MGI
  • negative regulation of protein secretion Source: MGI
  • positive regulation of BMP signaling pathway Source: MGI
  • positive regulation of muscle cell differentiation Source: Reactome
  • regulation of transcription, DNA-templated Source: MGI

Keywords - Biological process

Cell adhesion

Enzyme and pathway databases

Reactome: R-MMU-373752. Netrin-1 signaling.
    R-MMU-375170. CDO in myogenesis.

Names & Taxonomy

Protein names
Recommended name: Neogenin
Gene names
Name: Neo1
Synonyms: Ngn
Organism: Mus musculus (Mouse)
Taxonomic identifier: 10090 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Glires > Rodentia > Sciurognathi > Muroidea > Muridae > Murinae > Mus > Mus
Proteomes
  • UP000000589 Component: Unplaced

Organism-specific databases

MGI: MGI:1097159. Neo1.

Subcellular location

Topology

Feature key | Position(s) | Length | Description | Evidence
Topological domain | 37 – 1136 | 1100 | Extracellular | Sequence analysis
Transmembrane | 1137 – 1157 | 21 | Helical | Sequence analysis
Topological domain | 1158 – 1493 | 336 | Cytoplasmic | Sequence analysis
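
As an illustration of how the topology ranges above partition the 1,493-residue canonical sequence, the following minimal sketch maps a 1-based residue position to its region. The names `REGIONS` and `region_of` are hypothetical choices for this example, and the signal peptide range (1 – 36) is taken from the Molecule processing table of this entry.

```python
# Regions of mouse Neogenin (P97798-1), copied from the feature tables in
# this entry. The data layout and function name are illustrative assumptions.
REGIONS = [
    (1, 36, "Signal peptide"),
    (37, 1136, "Extracellular"),
    (1137, 1157, "Transmembrane (helical)"),
    (1158, 1493, "Cytoplasmic"),
]


def region_of(position):
    """Return the name of the region containing a 1-based residue position."""
    for start, end, name in REGIONS:
        if start <= position <= end:
            return name
    raise ValueError(f"position {position} is outside 1-1493")


# Both endpoints are inclusive, so a feature's length is end - start + 1.
lengths = {name: end - start + 1 for start, end, name in REGIONS}

if __name__ == "__main__":
    for pos in (84, 1149, 1209):
        print(pos, region_of(pos))
```

With these ranges, the glycosylation site at 84 falls in the extracellular domain and the phosphoserine at 1209 in the cytoplasmic domain, and the four lengths sum to the full sequence length.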

Keywords - Cellular component

Cell membrane, Membrane

PTM / Processing

Molecule processing

Feature key | Position(s) | Length | Description | Evidence | Feature identifier
Signal peptide | 1 – 36 | 36 | | Sequence analysis |
Chain | 37 – 1493 | 1457 | Neogenin | | PRO_0000015044

Amino acid modifications

Feature key | Position(s) | Length | Description | Evidence
Glycosylation | 84 | 1 | N-linked (GlcNAc...) | Sequence analysis
Disulfide bond | 85 ↔ 140 | | | PROSITE-ProRule annotation
Disulfide bond | 184 ↔ 232 | | | PROSITE-ProRule annotation
Glycosylation | 221 | 1 | N-linked (GlcNAc...) | 2 Publications
Disulfide bond | 281 ↔ 331 | | | PROSITE-ProRule annotation
Glycosylation | 337 | 1 | N-linked (GlcNAc...) | Sequence analysis
Disulfide bond | 373 ↔ 421 | | | PROSITE-ProRule annotation
Glycosylation | 501 | 1 | N-linked (GlcNAc...) | 2 Publications
Glycosylation | 520 | 1 | N-linked (GlcNAc...) | Sequence analysis
Glycosylation | 670 | 1 | N-linked (GlcNAc...) | 1 Publication
Glycosylation | 746 | 1 | N-linked (GlcNAc...) | Sequence analysis
Glycosylation | 940 | 1 | N-linked (GlcNAc...) | Sequence analysis
Modified residue | 1209 | 1 | Phosphoserine | Combined sources
Modified residue | 1225 | 1 | Phosphoserine | Combined sources
Modified residue | 1229 | 1 | Phosphothreonine | Combined sources
Modified residue | 1433 | 1 | Phosphoserine | By similarity
Modified residue | 1436 | 1 | Phosphothreonine | By similarity
Modified residue | 1464 | 1 | Phosphoserine | Combined sources
Modified residue | 1466 | 1 | Phosphoserine | By similarity
Modified residue | 1467 | 1 | Phosphoserine | Combined sources
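
N-linked glycosylation sites annotated by sequence analysis generally sit in the consensus sequon Asn-X-Ser/Thr, where X is any residue except proline. The sketch below scans the first 100 residues of the canonical sequence (copied from the Sequences section of this entry) for that pattern. The helper names `join_blocks` and `find_sequons` are hypothetical, and the regular expression is a common textbook form of the sequon, not necessarily the exact rule used by the annotation pipeline.

```python
import re

# First 100 residues of isoform 1, kept in the entry's 10-residue block layout.
N_TERMINAL_BLOCKS = """
MAAEREAGRL LCTSSSRRCC PPPPLLLLLP LLLLLGRPAS GAAATKSGPR
RQSQGASVRT FTPFYFLVEP VDTLSVRGSS VILNCSAYSE PSPNIEWKKD
"""


def join_blocks(text):
    """Collapse the blocked display layout into one contiguous string."""
    return "".join(text.split())


def find_sequons(sequence):
    """Return 1-based positions of Asn residues in an N-X-S/T sequon (X != Pro).

    The lookahead keeps the match one residue wide, so overlapping sequons
    would also be reported.
    """
    return [m.start() + 1 for m in re.finditer(r"N(?=[^P][ST])", sequence)]


sequence = join_blocks(N_TERMINAL_BLOCKS)
sequon_positions = find_sequons(sequence)

if __name__ == "__main__":
    print(sequon_positions)
```

In this 100-residue window the only match is Asn-84 (Asn-Cys-Ser), consistent with the first glycosylation row in the table; Asn-94 is followed by Ile-Glu and does not match. A sequon match indicates a candidate site only and says nothing about whether the site is actually occupied.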

Keywords - PTM

Disulfide bond, Glycoprotein, Phosphoprotein

Proteomic databases

MaxQB: P97798.
PaxDb: P97798.
PeptideAtlas: P97798.
PRIDE: P97798.

PTM databases

iPTMnet: P97798.
PhosphoSite: P97798.
SwissPalm: P97798.

Expression

Tissue specificity

Widely expressed.

Developmental stage

Expressed ubiquitously throughout the mid to late stages of gestation and in adult tissues. Strong expression is observed in the ventral region of the ventricular zone of the E15.5 mouse neural tube, as well as in the ventricular zones of the mesencephalon and rhombencephalon. Isoform 3 and isoform 4 are expressed at higher levels than the other isoforms between E11.5 and E16.5.

Gene expression databases

CleanEx: MM_NEO1.

Interaction

Subunit structure

Interacts with BMP2, BMP4, BMP6, and BMP7 (By similarity). Interacts with RGMA and RGMB. Interacts with MYO10. (By similarity; 3 publications)

Binary interactions

With | Entry | #Exp. | IntAct
Myo10 | F8VQB6 | 3 | EBI-774991, EBI-6445959

Protein-protein interaction databases

DIP: DIP-32026N.
IntAct: P97798. 3 interactions.
MINT: MINT-4997055.
STRING: 10090.ENSMUSP00000063656.

Structure

Secondary structure

Feature key | Position(s) | Length | Evidence
Beta strand | 773 – 780 | 8 | Combined sources
Beta strand | 785 – 793 | 9 | Combined sources
Beta strand | 795 – 797 | 3 | Combined sources
Beta strand | 801 – 810 | 10 | Combined sources
Beta strand | 813 – 819 | 7 | Combined sources
Beta strand | 823 – 827 | 5 | Combined sources
Beta strand | 835 – 844 | 10 | Combined sources
Beta strand | 852 – 857 | 6 | Combined sources
Beta strand | 889 – 895 | 7 | Combined sources
Beta strand | 897 – 899 | 3 | Combined sources
Beta strand | 901 – 906 | 6 | Combined sources
Beta strand | 921 – 927 | 7 | Combined sources
Beta strand | 937 – 948 | 12 | Combined sources
Beta strand | 956 – 965 | 10 | Combined sources
Beta strand | 968 – 970 | 3 | Combined sources
Beta strand | 976 – 979 | 4 | Combined sources
Beta strand | 990 – 997 | 8 | Combined sources
Beta strand | 1000 – 1009 | 10 | Combined sources
Beta strand | 1020 – 1027 | 8 | Combined sources
Helix | 1033 – 1035 | 3 | Combined sources
Beta strand | 1036 – 1042 | 7 | Combined sources
Beta strand | 1047 – 1050 | 4 | Combined sources
Beta strand | 1058 – 1067 | 10 | Combined sources
Beta strand | 1070 – 1074 | 5 | Combined sources
Beta strand | 1078 – 1081 | 4 | Combined sources

3D structure databases

Entry | Method | Resolution (Å) | Chain | Positions
4BQ6 | X-ray | 2.30 | A/B | 883-1133
4BQ7 | X-ray | 6.60 | A/B | 883-1133
4BQ8 | X-ray | 2.80 | A | 883-1083
4BQ9 | X-ray | 2.91 | A/B | 883-1083
4BQB | X-ray | 2.70 | A/B/C/D | 883-1133
4BQC | X-ray | 3.20 | A/B | 883-1133
4PLN | X-ray | 3.20 | C/D | 765-980
4UI2 | X-ray | 3.15 | A | 883-1133
ProteinModelPortal: P97798.
SMR: P97798. Positions 64-1088.

Family & Domains

Domains and Repeats

Feature key | Position(s) | Length | Description | Evidence
Domain | 63 – 158 | 96 | Ig-like C2-type 1 |
Domain | 163 – 249 | 87 | Ig-like C2-type 2 |
Domain | 254 – 347 | 94 | Ig-like C2-type 3 |
Domain | 352 – 437 | 86 | Ig-like C2-type 4 |
Domain | 472 – 566 | 95 | Fibronectin type-III 1 | PROSITE-ProRule annotation
Domain | 572 – 662 | 91 | Fibronectin type-III 2 | PROSITE-ProRule annotation
Domain | 667 – 762 | 96 | Fibronectin type-III 3 | PROSITE-ProRule annotation
Domain | 772 – 862 | 91 | Fibronectin type-III 4 | PROSITE-ProRule annotation
Domain | 887 – 986 | 100 | Fibronectin type-III 5 | PROSITE-ProRule annotation
Domain | 988 – 1085 | 98 | Fibronectin type-III 6 | PROSITE-ProRule annotation

Compositional bias

Feature key | Position(s) | Length | Description
Compositional bias | 1149 – 1153 | 5 | Poly-Val
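
The Poly-Val stretch lies inside the helical transmembrane segment (1137 – 1157). As a rough sketch of how such a run can be located, the code below searches the transmembrane residues, copied from the canonical sequence in this entry, for runs of a single amino acid. The function name `homopolymer_runs` and the minimum run length of 5 are assumptions made for this example, not the criteria UniProt applies when annotating compositional bias.

```python
import re

# Residues 1137-1157 of isoform 1 (the helical transmembrane segment),
# copied from the Sequences section of this entry.
TM_START = 1137
TM_HELIX = "LVIIVSVGVITIVVVVVIAVF"


def homopolymer_runs(segment, offset, residue, min_len):
    """Return (start, end) pairs, 1-based and inclusive, for runs of `residue`.

    `offset` is the sequence position of the first character of `segment`.
    """
    pattern = re.compile(f"{re.escape(residue)}{{{min_len},}}")
    return [
        (offset + m.start(), offset + m.end() - 1)
        for m in pattern.finditer(segment)
    ]


val_runs = homopolymer_runs(TM_HELIX, TM_START, "V", 5)

if __name__ == "__main__":
    print(val_runs)
```

For this segment the only valine run of five or more residues spans 1149 to 1153, matching the annotated range.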

Domain

The Fibronectin repeats 5 and 6 mediate interaction with RGM family molecules.
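
To relate this statement to the crystallized constructs listed under 3D structure databases, the sketch below checks which fibronectin type-III domains are fully contained in a given construct. Only three of the PDB entries are included for brevity; the ranges are copied from this entry, while the names `covered_residues` and `fully_covered_domains` are hypothetical.

```python
# Domain boundaries and construct ranges copied from this entry
# (1-based, inclusive). Only a subset of the PDB entries is shown.
FN3_DOMAINS = {
    "Fibronectin type-III 4": (772, 862),
    "Fibronectin type-III 5": (887, 986),
    "Fibronectin type-III 6": (988, 1085),
}
CONSTRUCTS = {
    "4BQ6": (883, 1133),
    "4BQ8": (883, 1083),
    "4PLN": (765, 980),
}


def covered_residues(domain, construct):
    """Number of residues shared by two inclusive ranges (0 if disjoint)."""
    overlap = min(domain[1], construct[1]) - max(domain[0], construct[0]) + 1
    return max(overlap, 0)


def fully_covered_domains(pdb_id):
    """Names of the FN3 domains that lie entirely within a construct."""
    construct = CONSTRUCTS[pdb_id]
    return [
        name
        for name, (start, end) in FN3_DOMAINS.items()
        if covered_residues((start, end), construct) == end - start + 1
    ]


if __name__ == "__main__":
    for pdb_id in CONSTRUCTS:
        print(pdb_id, fully_covered_domains(pdb_id))
```

Under these ranges the 883-1133 construct contains both repeats 5 and 6 in full, the 883-1083 construct stops two residues short of the annotated end of repeat 6, and the 765-980 construct covers repeat 4 plus most of repeat 5.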

Sequence similarities

Belongs to the immunoglobulin superfamily. DCC family. (Curated)
Contains 6 fibronectin type-III domains. (PROSITE-ProRule annotation)

Keywords - Domain

Immunoglobulin domain, Repeat, Signal, Transmembrane, Transmembrane helix

Phylogenomic databases

eggNOG: KOG4221. Eukaryota.
    ENOG410Z913. LUCA.
HOGENOM: HOG000230686.
HOVERGEN: HBG005455.
InParanoid: P97798.
PhylomeDB: P97798.

Family and domain databases

Gene3D: 2.60.40.10. 10 hits.
InterPro: IPR003961. FN3_dom.
    IPR007110. Ig-like_dom.
    IPR013783. Ig-like_fold.
    IPR013098. Ig_I-set.
    IPR003599. Ig_sub.
    IPR003598. Ig_sub2.
    IPR033024. Neogenin.
    IPR010560. Neogenin_C.
PANTHER: PTHR10489:SF55. PTHR10489:SF55. 3 hits.
Pfam: PF00041. fn3. 6 hits.
    PF07679. I-set. 3 hits.
    PF13895. Ig_2. 1 hit.
    PF06583. Neogenin_C. 1 hit.
SMART: SM00060. FN3. 6 hits.
    SM00409. IG. 4 hits.
    SM00408. IGc2. 4 hits.
SUPFAM: SSF48726. SSF48726. 4 hits.
    SSF49265. SSF49265. 3 hits.
PROSITE: PS50853. FN3. 6 hits.
    PS50835. IG_LIKE. 4 hits.

Sequences (5)

Sequence status: Complete.

Sequence processing: The displayed sequence is further processed into a mature form.

This entry describes 5 isoforms produced by alternative splicing.

Note: Additional isoforms seem to exist.

Isoform 1 (identifier: P97798-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MAAEREAGRL LCTSSSRRCC PPPPLLLLLP LLLLLGRPAS GAAATKSGPR
        60         70         80         90        100
RQSQGASVRT FTPFYFLVEP VDTLSVRGSS VILNCSAYSE PSPNIEWKKD
       110        120        130        140        150
GTFLNLESDD RRQLLPDGSL FISNVVHSKH NKPDEGFYQC VATVDNLGTI
       160        170        180        190        200
VSRTAKLTVA GLPRFTSQPE PSSVYVGNSA ILNCEVNADL VPFVRWEQNR
       210        220        230        240        250
QPLLLDDRIV KLPSGTLVIS NATEGDGGLY RCIVESGGPP KFSDEAELKV
       260        270        280        290        300
LQDPEEIVDL VFLMRPSSMM KVTGQSAVLP CVVSGLPAPV VRWMKNEEVL
       310        320        330        340        350
DTESSGRLVL LAGGCLEISD VTEDDAGTYF CIADNGNKTV EAQAELTVQV
       360        370        380        390        400
PPGFLKQPAN IYAHESMDIV FECEVTGKPT PTVKWVKNGD VVIPSDNFKI
       410        420        430        440        450
VKEHNLQVLG LVKSDEGFYQ CIAENDVGNA QAGAQLIILE HDVAIPTLPP
       460        470        480        490        500
TSLTSATTDH LAPATTGPLP SAPRDVVASL VSTRFIKLTW RTPASDPHGD
       510        520        530        540        550
NLTYSVFYTK EGVDRERVEN TSQPGEMQVT IQNLMPATVY IFKVMAQNKH
       560        570        580        590        600
GSGESSAPLR VETQPEVQLP GPAPNIRAYA TSPTSITVTW ETPLSGNGEI
       610        620        630        640        650
QNYKLYYMEK GTDKEQDIDV SSHSYTINGL KKYTEYSFRV VAYNKHGPGV
       660        670        680        690        700
STQDVAVRTL SDVPSAAPQN LSLEVRNSKS IVIHWQPPSS TTQNGQITGY
       710        720        730        740        750
KIRYRKASRK SDVTETLVTG TQLSQLIEGL DRGTEYNFRV AALTVNGTGP
       760        770        780        790        800
ATDWLSAETF ESDLDETRVP EVPSSLHVRP LVTSIVVSWT PPENQNIVVR
       810        820        830        840        850
GYAIGYGIGS PHAQTIKVDY KQRYYTIENL DPSSHYVITL KAFNNVGEGI
       860        870        880        890        900
PLYESAVTRP HTDTSEVDLF VINAPYTPVP DPTPMMPPVG VQASILSHDT
       910        920        930        940        950
IRITWADNSL PKHQKITDSR YYTVRWKTNI PANTKYKNAN ATTLSYLVTG
       960        970        980        990       1000
LKPNTLYEFS VMVTKGRRSS TWSMTAHGAT FELVPTSPPK DVTVVSKEGK
      1010       1020       1030       1040       1050
PRTIIVNWQP PSEANGKITG YIIYYSTDVN AEIHDWVIEP VVGNRLTHQI
      1060       1070       1080       1090       1100
QELTLDTPYY FKIQARNSKG MGPMSEAVQF RTPKADSSDK MPNDQALGSA
      1110       1120       1130       1140       1150
GKGSRLPDLG SDYKPPMSGS NSPHGSPTSP LDSNMLLVII VSVGVITIVV
      1160       1170       1180       1190       1200
VVVIAVFCTR RTTSHQKKKR AACKSVNGSH KYKGNCKDVK PPDLWIHHER
      1210       1220       1230       1240       1250
LELKPIDKSP DPNPVMTDTP IPRNSQDITP VDNSMDSNIH QRRNSYRGHE
      1260       1270       1280       1290       1300
SEDSMSTLAG RRGMRPKMMM PFDSQPPQPV ISAHPIHSLD NPHHHFHSSS
      1310       1320       1330       1340       1350
LASPARSHLY HPSSPWPIGT SMSLSDRANS TESVRNTPST DTMPASSSQT
      1360       1370       1380       1390       1400
CCTDHQDPEG ATSSSYLASS QEEDSGQSLP TAHVRPSHPL KSFAVPAIPP
      1410       1420       1430       1440       1450
PGPPLYDPAL PSTPLLSQQA LEPSTFHSVK TASIGTLGRS RPPMPVVVPS
      1460       1470       1480       1490
APEVQETTRM LEDSESSYEP DELTKEMAHL EGLMKDLNAI TTA
Length: 1,493
Mass (Da): 163,160
Last modified: May 1, 1997 - v1
Checksum: 441DE919D5E17C0E

Isoform 2 (identifier: P97798-2)

The sequence of this isoform differs from the canonical sequence as follows:
     442-461: Missing.

Length: 1,473
Mass (Da): 161,128
Checksum: B830117BB889871A

Isoform 3 (identifier: P97798-3)

The sequence of this isoform differs from the canonical sequence as follows:
     863-878: Missing.

Note: Expression developmentally regulated.
Length: 1,477
Mass (Da): 161,397
Checksum: 3248E1E198F0AB6E

Isoform 4 (identifier: P97798-4)

The sequence of this isoform differs from the canonical sequence as follows:
     1086-1096: Missing.

Note: Expression developmentally regulated.
Length: 1,482
Mass (Da): 161,971
Checksum: 705C7C3FBB8A4CC1

Isoform 5 (identifier: P97798-5)

The sequence of this isoform differs from the canonical sequence as follows:
     1279-1331: Missing.

Note: Expression developmentally regulated.
Length: 1,440
Mass (Da): 157,418
Checksum: D8818C1E884DC0D3

Alternative sequence

Feature key | Position(s) | Length | Description | Evidence | Feature identifier
Alternative sequence | 442 – 461 | 20 | Missing in isoform 2. | Curated | VSP_002594
Alternative sequence | 863 – 878 | 16 | Missing in isoform 3. | Curated | VSP_002595
Alternative sequence | 1086 – 1096 | 11 | Missing in isoform 4. | Curated | VSP_002596
Alternative sequence | 1279 – 1331 | 53 | Missing in isoform 5. | Curated | VSP_002597
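
Each alternative isoform differs from the canonical sequence by a single deleted range, so its length is the canonical length minus the number of missing residues. The sketch below reproduces that arithmetic from the ranges above and shows how the same deletion could be applied to a sequence string. The names `isoform_length` and `apply_deletion` are hypothetical helpers for this example.

```python
CANONICAL_LENGTH = 1493

# Missing ranges per isoform (1-based, inclusive), copied from this entry.
MISSING = {
    "Isoform 2": (442, 461),
    "Isoform 3": (863, 878),
    "Isoform 4": (1086, 1096),
    "Isoform 5": (1279, 1331),
}


def isoform_length(name):
    """Canonical length minus the size of the isoform's deleted range."""
    start, end = MISSING[name]
    return CANONICAL_LENGTH - (end - start + 1)


def apply_deletion(sequence, start, end):
    """Remove residues start..end (1-based, inclusive) from a sequence string."""
    return sequence[: start - 1] + sequence[end:]


if __name__ == "__main__":
    for name in MISSING:
        print(name, isoform_length(name))
```

The computed values (1,473, 1,477, 1,482 and 1,440) agree with the lengths listed for isoforms 2 to 5. Passing the full canonical sequence to `apply_deletion` with the corresponding range would give the isoform sequence itself.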

Sequence databases

EMBL / GenBank / DDBJ: Y09535 mRNA. Translation: CAA70727.1.
UniGene: Mm.42249.

Keywords - Coding sequence diversity

Alternative splicing

Cross-references

Miscellaneous databases

ChiTaRS: Neo1. mouse.
PRO: P97798.

Publications

  1. "Mouse neogenin, a DCC-like molecule, has four splice variants and is expressed widely in the adult mouse and during embryogenesis."
    Keeling S.L., Gad J.M., Cooper H.M.
    Oncogene 15:691-700(1997)
    Cited for: NUCLEOTIDE SEQUENCE [MRNA], ALTERNATIVE SPLICING.
    Tissue: Brain.
  2. "Neogenin-RGMa signaling at the growth cone is bone morphogenetic protein-independent and involves RhoA, ROCK, and PKC."
    Conrad S., Genth H., Hofmann F., Just I., Skutella T.
    J. Biol. Chem. 282:16423-16433(2007)
    Cited for: INTERACTION WITH RGMA.
  3. "Myosin X regulates netrin receptors and functions in axonal path-finding."
    Zhu X.J., Wang C.Z., Dai P.G., Xie Y., Song N.N., Liu Y., Du Q.S., Mei L., Ding Y.Q., Xiong W.C.
    Nat. Cell Biol. 9:184-192(2007)
    Cited for: INTERACTION WITH MYO10.
  4. "The mouse C2C12 myoblast cell surface N-linked glycoproteome: identification, glycosite occupancy, and membrane orientation."
    Gundry R.L., Raginski K., Tarasova Y., Tchernyshyov I., Bausch-Fluck D., Elliott S.T., Boheler K.R., Van Eyk J.E., Wollscheid B.
    Mol. Cell. Proteomics 8:2555-2569(2009)
    Cited for: GLYCOSYLATION [LARGE SCALE ANALYSIS] AT ASN-221 AND ASN-501.
    Tissue: Myoblast.
  5. "Mass-spectrometric identification and relative quantification of N-linked cell surface glycoproteins."
    Wollscheid B., Bausch-Fluck D., Henderson C., O'Brien R., Bibel M., Schiess R., Aebersold R., Watts J.D.
    Nat. Biotechnol. 27:378-386(2009)
    Cited for: GLYCOSYLATION [LARGE SCALE ANALYSIS] AT ASN-221; ASN-501 AND ASN-670.
  6. Cited for: PHOSPHORYLATION [LARGE SCALE ANALYSIS] AT SER-1209; SER-1225; THR-1229; SER-1464 AND SER-1467, IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
    Tissue: Brain, Brown adipose tissue, Heart, Kidney, Liver, Lung and Testis.
  7. "Structure of the repulsive guidance molecule (RGM)-neogenin signaling hub."
    Bell C.H., Healey E., van Erp S., Bishop B., Tang C., Gilbert R.J., Aricescu A.R., Pasterkamp R.J., Siebold C.
    Science 341:77-80(2013)
    Cited for: X-RAY CRYSTALLOGRAPHY (2.3 ANGSTROMS) OF 883-1134 IN COMPLEX WITH HUMAN RGMB, FUNCTION.

Entry information

Entry name: NEO1_MOUSE
Accession: Primary (citable) accession number: P97798
Entry history
Integrated into UniProtKB/Swiss-Prot: December 1, 2000
Last sequence update: May 1, 1997
Last modified: July 6, 2016
This is version 155 of the entry and version 1 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program

Miscellaneous

Keywords - Technical term

3D-structure, Complete proteome, Reference proteome

Documents

  1. MGD cross-references
    Mouse Genome Database (MGD) cross-references in UniProtKB/Swiss-Prot
  2. PDB cross-references
    Index of Protein Data Bank (PDB) cross-references
  3. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
  • 100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
  • 90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
  • 50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.