Protein

Protein bassoon

Gene

Bsn

Organism
Rattus norvegicus (Rat)
Status
Reviewed - Annotation score: 5 out of 5 - Experimental evidence at protein level

Function

Thought to be involved in organizing the cytomatrix at the nerve terminal active zone (CAZ), which regulates neurotransmitter release. Seems to act through binding to ERC2/CAST1. Essential for regulated neurotransmitter release from a subset of brain glutamatergic synapses. Involved in the formation of the retinal photoreceptor ribbon synapses (By similarity). 1 Publication

Regions

Feature key   Position(s)   Description   Evidence            Length
Zinc finger   167 – 190     C4-type       Sequence analysis   24
Zinc finger   195 – 217     C4-type       Sequence analysis   23
Zinc finger   462 – 485     C4-type       Sequence analysis   24
Zinc finger   490 – 512     C4-type       Sequence analysis   23
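
The Length column follows from the positions, which are 1-based and inclusive, so length = end - start + 1. The Python sketch below checks this for the four zinc fingers and, for the two that fall inside a short fragment, lists the cysteines in each range. The fragment is residues 151 to 220 copied from the sequence shown in this entry; the helper names (`feature_length`, `subsequence`, `cysteine_positions`) are illustrative assumptions, not part of any UniProt tool.

```python
# Residues 151-220 of O88778, copied from the sequence in this entry (spaces removed).
FRAGMENT_START = 151
FRAGMENT = (
    "SPYSVPQIAPLPSSTLCPICKTSDLTSTSSQPNFNTCTQCHNKVCNQCGF"
    "NPNPHLTQVKEWLCLNCQMQ"
)

# Zinc finger ranges from the table above (1-based, inclusive).
ZINC_FINGERS = [(167, 190), (195, 217), (462, 485), (490, 512)]


def feature_length(start, end):
    """Length of a feature given 1-based, inclusive coordinates."""
    return end - start + 1


def subsequence(start, end):
    """Slice the fragment using 1-based, inclusive protein coordinates."""
    return FRAGMENT[start - FRAGMENT_START : end - FRAGMENT_START + 1]


def cysteine_positions(start, end):
    """Protein coordinates of every cysteine within start..end."""
    return [start + i for i, aa in enumerate(subsequence(start, end)) if aa == "C"]


lengths = [feature_length(s, e) for s, e in ZINC_FINGERS]
print(lengths)                        # [24, 23, 24, 23]
print(cysteine_positions(167, 190))   # [167, 170, 187, 190]
print(cysteine_positions(195, 217))   # [195, 198, 214, 217]
```

Each of the two ranges covered by the fragment contains exactly four cysteines, which is consistent with the C4-type description.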

GO - Molecular function

  • metal ion binding Source: UniProtKB-KW
  • transcription corepressor binding Source: ParkinsonsUK-UCL

GO - Biological process

  • cytoskeleton organization Source: RGD
  • synapse assembly Source: InterPro

Keywords - Ligand

Metal-binding, Zinc

Names & Taxonomy

Protein names
Recommended name: Protein bassoon
Gene names
Name: Bsn
Organism: Rattus norvegicus (Rat)
Taxonomic identifier: 10116 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Glires > Rodentia > Sciurognathi > Muroidea > Muridae > Murinae > Rattus
Proteomes
  • UP000002494 Component: Unplaced

Organism-specific databases

RGD: 2223. Bsn.

Subcellular location

  • Cytoplasm
  • Cell junction > synapse > synaptosome
  • Cytoplasm > cytoskeleton

  • Note: Localized to the active zone of presynaptic density. In retina, is localized in the outer plexiform layer at ribbon synapses formed by rods and cones but was absent from basal synaptic contacts formed by cones. In the retinal inner plexiform layer localized to conventional inhibitory GABAergic synapses, made by amacrine cells, but absent from the bipolar cell ribbon synapses.

GO - Cellular component

  • cell junction Source: UniProtKB-KW
  • cytoskeleton of presynaptic active zone Source: RGD
  • excitatory synapse Source: BHF-UCL
  • inhibitory synapse Source: RGD
  • neuronal cell body Source: RGD
  • neuron projection Source: UniProtKB-SubCell
  • presynaptic active zone Source: MGI
  • ribbon synapse Source: RGD
  • synapse Source: RGD
  • trans-Golgi network Source: RGD

Keywords - Cellular component

Cell junction, Cytoplasm, Cytoskeleton, Synapse, Synaptosome

Pathology & Biotech

Mutagenesis

Feature key   Position(s)   Description                      Evidence        Length
Mutagenesis   2             G → A: Loss of myristoylation.   1 Publication   1

PTM / Processing

Molecule processing

Feature key            Position(s)   Description                        Length
Initiator methionine                 Removed
Chain                  2 – 3938      Protein bassoon (PRO_0000065004)   3937
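
As a small worked example of the processing described above, the Python sketch below derives the chain coordinates and length from the full sequence length once the initiator methionine is removed, and applies the G → A substitution at position 2 that is listed under Mutagenesis in this entry. The function names (`mature_chain`, `substitute`) are hypothetical helpers; the ten-residue N-terminus is copied from the sequence shown in this entry.

```python
FULL_LENGTH = 3938          # total length given in the Sequence section
N_TERMINUS = "MGNEASLEGG"   # residues 1-10, copied from the sequence


def mature_chain(full_length):
    """Chain coordinates (1-based, inclusive) after removing the initiator Met."""
    return 2, full_length


def substitute(seq, position, ref, alt):
    """Return seq with the residue at a 1-based position changed from ref to alt."""
    if seq[position - 1] != ref:
        raise ValueError(
            f"expected {ref} at position {position}, found {seq[position - 1]}"
        )
    return seq[: position - 1] + alt + seq[position:]


start, end = mature_chain(FULL_LENGTH)
chain_length = end - start + 1
mature_n_terminus = N_TERMINUS[1:]          # drop the initiator methionine
g2a = substitute(N_TERMINUS, 2, "G", "A")   # mutant that loses myristoylation

print(start, end, chain_length)   # 2 3938 3937
print(mature_n_terminus)          # GNEASLEGG
print(g2a)                        # MANEASLEGG
```

The mature chain begins with the glycine at position 2, the residue annotated as N-myristoyl glycine.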

Amino acid modifications

Feature key       Position(s)  Description                             Evidence       Length
Lipidation        2            N-myristoyl glycine                     1 Publication  1
Modified residue  142          Phosphoserine                           By similarity  1
Modified residue  145          Omega-N-methylarginine                  By similarity  1
Modified residue  241          Phosphoserine                           By similarity  1
Modified residue  245          Phosphoserine                           By similarity  1
Modified residue  863          Omega-N-methylarginine                  By similarity  1
Modified residue  965          Phosphoserine                           By similarity  1
Modified residue  1035         Phosphoserine                           By similarity  1
Modified residue  1036         Phosphoserine                           By similarity  1
Modified residue  1085         Phosphoserine                           By similarity  1
Modified residue  1087         Phosphothreonine                        By similarity  1
Modified residue  1093         Phosphoserine                           By similarity  1
Modified residue  1099         Phosphoserine                           By similarity  1
Modified residue  1221         Phosphoserine                           By similarity  1
Glycosylation     1339         O-linked (GlcNAc)                       1 Publication  1
Glycosylation     1380         O-linked (GlcNAc)                       By similarity  1
Modified residue  1470         Phosphoserine                           By similarity  1
Modified residue  1479         Phosphoserine                           By similarity  1
Modified residue  1481         Phosphoserine                           By similarity  1
Modified residue  1780         Omega-N-methylarginine                  By similarity  1
Modified residue  1784         Omega-N-methylarginine                  By similarity  1
Modified residue  1794         Asymmetric dimethylarginine; alternate  By similarity  1
Modified residue  1794         Omega-N-methylarginine; alternate       By similarity  1
Modified residue  1806         Omega-N-methylarginine                  By similarity  1
Glycosylation     1922         O-linked (GlcNAc)                       By similarity  1
Modified residue  1978         Phosphoserine                           By similarity  1
Modified residue  2034         Phosphoserine                           By similarity  1
Modified residue  2039         Omega-N-methylarginine                  By similarity  1
Modified residue  2069         Omega-N-methylarginine                  By similarity  1
Modified residue  2243         Asymmetric dimethylarginine             By similarity  1
Modified residue  2253         Asymmetric dimethylarginine             By similarity  1
Modified residue  2259         Asymmetric dimethylarginine             By similarity  1
Glycosylation     2307         O-linked (GlcNAc)                       By similarity  1
Glycosylation     2510         O-linked (GlcNAc)                       By similarity  1
Modified residue  2564         Phosphoserine                           By similarity  1
Modified residue  2581         Phosphothreonine                        By similarity  1
Modified residue  2608         Phosphothreonine                        By similarity  1
Glycosylation     2685         O-linked (GlcNAc)                       By similarity  1
Modified residue  2796         Phosphoserine                           By similarity  1
Modified residue  2845         Phosphoserine                           By similarity  1
Modified residue  2851         Phosphoserine                           By similarity  1
Glycosylation     2930         O-linked (GlcNAc)                       By similarity  1
Modified residue  3007         Phosphoserine                           By similarity  1
Modified residue  3286         Phosphoserine                           By similarity  1
Modified residue  3368         Phosphoserine                           By similarity  1
Modified residue  3488         Omega-N-methylarginine                  By similarity  1
Modified residue  3822         Omega-N-methylarginine                  By similarity  1
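
Most rows above are annotated by similarity, and only a few cite a publication. The Python sketch below shows one way to separate the two, using only the lipidation and glycosylation rows to keep it short. Treating any evidence string that contains "Publication" as literature-backed is an assumption made for this illustration, and the names `FEATURES` and `has_publication` are hypothetical.

```python
from collections import Counter

# (feature key, position, description, evidence), copied from the table above.
# Only lipidation and glycosylation rows are included here.
FEATURES = [
    ("Lipidation", 2, "N-myristoyl glycine", "1 Publication"),
    ("Glycosylation", 1339, "O-linked (GlcNAc)", "1 Publication"),
    ("Glycosylation", 1380, "O-linked (GlcNAc)", "By similarity"),
    ("Glycosylation", 1922, "O-linked (GlcNAc)", "By similarity"),
    ("Glycosylation", 2307, "O-linked (GlcNAc)", "By similarity"),
    ("Glycosylation", 2510, "O-linked (GlcNAc)", "By similarity"),
    ("Glycosylation", 2685, "O-linked (GlcNAc)", "By similarity"),
    ("Glycosylation", 2930, "O-linked (GlcNAc)", "By similarity"),
]


def has_publication(evidence):
    """Assumption: an evidence tag mentioning 'Publication' is literature-backed."""
    return "Publication" in evidence


published_positions = [
    position
    for _key, position, _description, evidence in FEATURES
    if has_publication(evidence)
]
evidence_counts = Counter(evidence for _k, _p, _d, evidence in FEATURES)

print(published_positions)     # [2, 1339]
print(dict(evidence_counts))   # {'1 Publication': 2, 'By similarity': 6}
```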

Post-translational modification

Myristoylated. The N-terminal myristoylation is not sufficient for presynaptic localization. 1 Publication

Keywords - PTM

Glycoprotein, Lipoprotein, Methylation, Myristate, Phosphoprotein

Proteomic databases

PaxDb: O88778.
PRIDE: O88778.

PTM databases

iPTMnet: O88778.
PhosphoSitePlus: O88778.

Expression

Developmental stage

Detected at embryonic day E18 and at later stages. Expression does not change significantly across the developmental stages tested. 1 Publication

Interaction

Subunit structure

Interacts with ERC2/CAST1, RIMS1 and UNC13A. Part of a complex consisting of ERC2, RIMS1 and BSN. 2 Publications

Binary interactions

With    Entry    #Exp.   IntAct                    Notes
Siah1   Q920M9   3       EBI-2271660, EBI-957514

Protein-protein interaction databases

IntAct: O88778. 7 interactors.
MINT: MINT-4508415.
STRING: 10116.ENSRNOP00000039162.

Structure

3D structure databases

ProteinModelPortal: O88778.

Family & Domains

Domains and Repeats

Feature key   Position(s)   Description   Length
Repeat        568 – 574     1             7
Repeat        575 – 581     2             7
Repeat        582 – 588     3             7

Region

Feature key   Position(s)    Description                                        Length
Region        62 – 70        4 X 2 AA tandem repeats of P-G                     9
Region        568 – 588      3 X 7 AA tandem repeats of K-A-S-P-Q-A-[AK]        21
Region        2934 – 2996    Sufficient for binding to ERC2                     63
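
The repeat motif K-A-S-P-Q-A-[AK] can be located directly in the sequence with a regular expression. The Python sketch below searches residues 551 to 600, copied from the sequence shown in this entry, and converts match offsets back to 1-based protein coordinates. The names `REPEAT` and `find_repeats` are illustrative assumptions.

```python
import re

# Residues 551-600 of O88778, copied from the sequence in this entry (spaces removed).
FRAGMENT_START = 551
FRAGMENT = "KQKGPQGPGQPSGSLPPKASPQAAKASPQAAKASPQAKPLRASEPSKTSS"

# K-A-S-P-Q-A-[AK]: six fixed residues followed by either A or K.
REPEAT = re.compile(r"KASPQA[AK]")


def find_repeats(seq, offset):
    """Return (start, end, text) for each motif match, in 1-based inclusive coordinates."""
    return [
        (m.start() + offset, m.end() + offset - 1, m.group())
        for m in REPEAT.finditer(seq)
    ]


repeats = find_repeats(FRAGMENT, FRAGMENT_START)
for start, end, text in repeats:
    print(start, end, text)
# 568 574 KASPQAA
# 575 581 KASPQAA
# 582 588 KASPQAK
```

The three matches are contiguous and together span 568 to 588, matching the 21-residue region and the three 7-residue repeats listed for this entry.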

Coiled coil

Feature key   Position(s)    Evidence            Length
Coiled coil   1032 – 1087    Sequence analysis   56
Coiled coil   1176 – 1203    Sequence analysis   28
Coiled coil   2345 – 2470    Sequence analysis   126
Coiled coil   2933 – 2975    Sequence analysis   43
Coiled coil   3772 – 3803    Sequence analysis   32

Compositional bias

Feature key          Position(s)    Description   Length
Compositional bias   2594 – 2600    Poly-Arg      7
Compositional bias   2621 – 2626    Poly-Arg      6
Compositional bias   2647 – 2654    Poly-Ala      8
Compositional bias   3770 – 3797    Poly-Gln      28

Keywords - Domain

Coiled coil, Repeat, Zinc-finger

Phylogenomic databases

eggNOG: ENOG410IGEH. Eukaryota.
eggNOG: ENOG410XX5R. LUCA.
HOGENOM: HOG000095267.
HOVERGEN: HBG080934.
InParanoid: O88778.
PhylomeDB: O88778.

Family and domain databases

Gene3D: 3.30.40.10. 2 hits.
InterPro: IPR030627. Bsn.
InterPro: IPR011011. Znf_FYVE_PHD.
InterPro: IPR008899. Znf_piccolo.
InterPro: IPR013083. Znf_RING/FYVE/PHD.
PANTHER: PTHR14113:SF1. PTHR14113:SF1. 1 hit.
Pfam: PF05715. zf-piccolo. 2 hits.
SUPFAM: SSF57903. SSF57903. 2 hits.

Sequence

Sequence status: Complete.

Sequence processing: The displayed sequence is further processed into a mature form.

O88778-1

        10         20         30         40         50
MGNEASLEGG AGEGPLPPGG SGLGPGPGAG KPPSALAGGG QLPVAGAARA
60 70 80 90 100
AGPPTPGLGL VPGPGPGPGP GSVSRRLDPK EPLGSQRATS PTPKQASATA
110 120 130 140 150
PGRESPRETR AQGLSGQEAE GPRRTLQVDS RTQRSGRSPS VSPDRGSTPT
160 170 180 190 200
SPYSVPQIAP LPSSTLCPIC KTSDLTSTSS QPNFNTCTQC HNKVCNQCGF
210 220 230 240 250
NPNPHLTQVK EWLCLNCQMQ RALGMDMTTA PRSKSQQQLH SPALSPAHSP
260 270 280 290 300
AKQPLGKPEQ ERSRSPGATQ SGPRQAEAAR ATSVPGPTQA TAPPEVGRVS
310 320 330 340 350
PQPPLSTKPS TAEPRPPAGE AQGKSATTVP SGLGAAEQTQ GGLTGKLFGL
360 370 380 390 400
GASLLTQAST LMSVQPEADT QGQPSPSKGP PKIVFSDASK EAGPRPPGSG
410 420 430 440 450
PGPGPTPGAK TEPGPRTGPG SGPGALAKTG GTPSPKHGRA DHQAASKAAA
460 470 480 490 500
KPKTMPKERA ACPLCQAELN VGSRGPANYN TCTACKLRVC TLCGFNPTPH
510 520 530 540 550
LVEKTEWLCL NCQTKRLLEG SLGEPAPLPL PTPQEPPAGV PQRAAGASPL
560 570 580 590 600
KQKGPQGPGQ PSGSLPPKAS PQAAKASPQA AKASPQAKPL RASEPSKTSS
610 620 630 640 650
SAPEKKTGIP VKAEPVPKPP PETAVPPGTP KAKSGVKRTD PATPVVKPVP
660 670 680 690 700
EAPKSGEAEE PVPKPYSQDL SRSPQSLSDT GYSSDGVSSS QSEITGVVQQ
710 720 730 740 750
EVEQLDSAGV TGPRPPSPSE LHKVGSSMRP SLEAQAVAPS GEWSKPPSGS
760 770 780 790 800
AVEDQKRRPH SLSIMPEAFD SDEELGDILE EDDSLAMGRQ REQQDTAESS
810 820 830 840 850
DDFGSQLRHD YVEDSSEGGL SPLPPQPPAR ADMTDEEFMR RQILEMSAEE
860 870 880 890 900
DNLEEDDTAV SGRGLAKHGA QKASARPRPE SSQESVALPK RRLPHNATTG
910 920 930 940 950
YEELLSEEGP AEPTDGALQG GLRRFKTIGL NSTGRLWSTS LDLGQGSDPN
960 970 980 990 1000
LDREPELEME SLTGSPEDRS RGEHSSTLPA STPSYTSGTS PTSLSSLEED
1010 1020 1030 1040 1050
SDSSPSRRQR LEEAKQQRKA RHRSHGPLLP TIEDSSEEEE LREEEELLRE
1060 1070 1080 1090 1100
QEKMREVEQQ RIRSTARKTR RDKEELRAQR RRERSKTPPS NLSPIEDASP
1110 1120 1130 1140 1150
TEELRQAAEM EELHRSSCSE YSPSPSLDSE AETLDGGPTR LYKSGSEYNL
1160 1170 1180 1190 1200
PAFMSLCSPT ETPSGSSTTP SSGRPLKSAE EAYEDMMRKA ELLQRQQGQA
1210 1220 1230 1240 1250
AGARGPHGGP SQPTGPRSQG SFEYQDTLDH DYGGRASQPA ADGTPAGLGA
1260 1270 1280 1290 1300
TVYEEILQTS QSIARMRQAS SRDLAFTEDK KKEKQFLNAE SAYMDPMKQN
1310 1320 1330 1340 1350
GGPLTPGTSP TQLAAPVSFP TSTSSDSSGG RVIPDVRVTQ HFAKEPQEPL
1360 1370 1380 1390 1400
KLHSSPASPS LASKEVGMTF SQGPGTPATT AMAPCPASLP RGYMTPAGPE
1410 1420 1430 1440 1450
RSPSTSSTIH SYGQPPTTAN YGSQTEELPH APSGPAGSGR ASREKPLSGG
1460 1470 1480 1490 1500
DGEVGPPQPS RGYSYFTGSS PPLSPSTPSE SPTFSPSKLG PRATAEFSTQ
1510 1520 1530 1540 1550
TPSLTPSSDI PRSVGTPSPM VAQGTQTPHR PSTPRLVWQQ SSQEAPVMVI
1560 1570 1580 1590 1600
TLASDASSQT RMVHASASTS PLCSPTDSQP ASHSYSQTTP PSASQMPSEP
1610 1620 1630 1640 1650
AGPPGFPRAP SAGVDGPLAL YGWGALPAEN ISLCRISSVP GTSRVEPGPR
1660 1670 1680 1690 1700
PPGTAVVDLR TAVKPTPIIL TDQGMDLTSL AVEARKYGLA LDPVPGRQST
1710 1720 1730 1740 1750
AVQPLVINLN AQEQTHTFLA TATTVSITMA SSVLMAQQKQ PVVYGDPFQS
1760 1770 1780 1790 1800
RLDFGQGSGS PVCLAQVKQV EQAVQTAPYR GGPRGRPREA KFARYNLPNQ
1810 1820 1830 1840 1850
VTPLARRDIL ITQMGTAQSV SLKPGPVPEP GAEPHRATPA ELRAHALPGT
1860 1870 1880 1890 1900
RKPHTVVVQM GEGAAGTVTT LLPEEPAGAL DLTGMRPESR LACCDMAYKF
1910 1920 1930 1940 1950
PFGSSCTGTF HPAPSAPDKS VTDAALPGQS SGPFYSPRDP EPPEPLTFRA
1960 1970 1980 1990 2000
QGVVGPGPHE EQRPYPQGLP GRLYSSMSDT NLAEAGLNYH AQRIGQLFQG
2010 2020 2030 2040 2050
PGRDSAVDLS SLKHSYSLGF ADGRYLGQGL QYGSFTDLRH PTDLLSHPLP
2060 2070 2080 2090 2100
MRPYSSVSNI YSDHRYGPRG DAVGFQEASL AQYSATTARE ISRMCAALNS
2110 2120 2130 2140 2150
MDQYGGRHGG GSGGPDLVPY QPQHGPGLNA PQGLASLRSG LLGNPTYPEG
2160 2170 2180 2190 2200
QPSPGNLAQY GPAASQGTAV RQLLPSTATV RAADGMIYST INTPIAATLP
2210 2220 2230 2240 2250
ITTQPASVLR PMVRGGMYRP YGSGGVTAVP LTSLTRVPMI APRVPLGPAG
2260 2270 2280 2290 2300
LYRYPAPSRF PIASTIPPAE GPVYLGKPAA AKASGAGGPP RPELPAGGAR
2310 2320 2330 2340 2350
EEPLSTTAPP AVIKEAPVAQ APAPPPGQKP AGDAAAGSGS GVLGRPVMEK
2360 2370 2380 2390 2400
EEASQEDRQR KQQEQLLQLE RERVELEKLR QLRLQEELER ERVELQRHRE
2410 2420 2430 2440 2450
EEQLLVQREL QELQTIKHHV LQQQQEERQA QFALQREQLA QQRLQLEQIQ
2460 2470 2480 2490 2500
QLQQQLQQQL EEQKQRQKAP FPATCEAPSR GPPPAATELA QNGQYWPPLT
2510 2520 2530 2540 2550
HTAFIAVAGT EGPGQAREPV LHRGLPSSAS DMSLQTEEQW EAGRSGIKKR
2560 2570 2580 2590 2600
HSMPRLRDAC EPESGPDPST VRRIADSSVQ TDDEEGEGRY LLTRRRRTRR
2610 2620 2630 2640 2650
SADCSVQTDD EDNAEWEQPV RRRRSRLSRH SDSGSDSKHE ASASSSAAAA
2660 2670 2680 2690 2700
AARAMSSVGI QTISDCSVQT EPEQLPRVSP AIHITAATDP KVEIVRYISA
2710 2720 2730 2740 2750
PEKTGRGESL ACQTEPDGQA QGVAGPQLIG PTAISPYLPG IQIVTPGALG
2760 2770 2780 2790 2800
RFEKKKPDPL EIGYQAHLPP ESLSQLVSRQ PPKSPQVLYS PVSPLSPHRL
2810 2820 2830 2840 2850
LDTSFASSER LNKAHVSPQK QFIADSTLRQ QTLPRPMKTL QRSLSDPKPL
2860 2870 2880 2890 2900
SPTAEESAKE RFSLYQHQGG LGSQVSALPP NGLVRKVKRT LPSPPPEEAH
2910 2920 2930 2940 2950
LPLAGQVPSQ LYAASLLQRG LAGPTTVPAT KASLLRELDR DLRLVEHEST
2960 2970 2980 2990 3000
KLRKKQAELD EEEKEIDAKL KYLELGITQR KESLAKDRVG RDYPPLRGLG
3010 3020 3030 3040 3050
EHRDYLSDSE LNQLRLQGCT TPAGQYVDYP ASAAVPATPS GPTAFQQPRF
3060 3070 3080 3090 3100
PPAATQYTAG SSGPTQNGFL AHQAPTYTGP STYPAPTYPP GTSYPAEPGL
3110 3120 3130 3140 3150
PSQPAFHPTG HYAAPTPMPT TQSAPFPVQA DSHAAHQKPR QTSLADLEQK
3160 3170 3180 3190 3200
VPTNYEVISS PAVTVSSTPS ETGYSGPAVS SSYEHGKAPE HPRGGDRSSV
3210 3220 3230 3240 3250
SQSPAPTYPS DSHYTSLEQN VPRNYVMIDD ISELTKDSTP TASDSQRPEP
3260 3270 3280 3290 3300
LGPGGVSGRP GKDPGEPAVL EGPTLPCCYG RGEEESEEDS YDPRGKSGHH
3310 3320 3330 3340 3350
RSMESNGRPA STHYYSDSDY RHGARADKYG PGPMGPKHPS KNLAPAAISS
3360 3370 3380 3390 3400
KRSKHRKQGM EQKISKFSPI EEAKDVESDL ASYPPPTVSS SLTSRSRKFQ
3410 3420 3430 3440 3450
DEITYGLKKN VYEQQRYYGV SSRDTAEEDD RMYGGSSRSR VASAYSGEKL
3460 3470 3480 3490 3500
SSHDFSSRSK GYERERETAQ RLQKAGPKPS SLSMAHGRAR PPMRSQASEE
3510 3520 3530 3540 3550
ESPVSPLGRP RPAGGALPPG DTCPQFCSSH SMPDVQEHVK DGPRAHAYKR
3560 3570 3580 3590 3600
EEGYILDDSH CVVSDSEAYH LGQEETDWFD KPRDARSDRF RHHGGHTVSS
3610 3620 3630 3640 3650
SQKRGPARHS YHDYDEPPEE GLWPHDEGGP GRHTSAKEHR HHGDHGRHSG
3660 3670 3680 3690 3700
RHAGEEPGRR AARPHARDMG RHETRPHPQA SPAPAMQKKG QPGYPSSADY
3710 3720 3730 3740 3750
SQPSRAPSAY HHASDSKKGS RQAHSGPTVL QPKPEAQAQP QMQGRQAVPG
3760 3770 3780 3790 3800
PQQSQPPSSR QTPSGTASRQ PQTQQQQQQQ QQQQQQQQQQ QQQQQQQGLG
3810 3820 3830 3840 3850
QQAPQQAPSQ ARLQQQSQPT TRSTAPAASH PAGKPQPGPT TAPGPQPAGL
3860 3870 3880 3890 3900
PRAEQAGSSK PAAKAPQQGR APQAQSAPGP AGAKTGARPG GTPGAPAGQP
3910 3920 3930
AAEGESVFSK ILPGGAAEQA GKLTEAVSAF GKKFSSFW
Length: 3,938
Mass (Da): 418,424
Last modified: January 23, 2007 - v3
Checksum: 5BF3C230E2C71AE2

Sequence databases

EMBL / GenBank / DDBJ: Y16563 mRNA. Translation: CAA76287.1.
PIR: T42761.
UniGene: Rn.29999.

Genome annotation databases

UCSC: RGD:2223. rat.

Cross-references

Miscellaneous databases

PRO: O88778.

Entry information

Entry name: BSN_RAT
Accession: Primary (citable) accession number: O88778
Entry history
Integrated into UniProtKB/Swiss-Prot: August 16, 2004
Last sequence update: January 23, 2007
Last modified: November 2, 2016
This is version 112 of the entry and version 3 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program

Miscellaneous

Keywords - Technical term

Complete proteome, Reference proteome

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.
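
The UniRef90 and UniRef50 rules above amount to two thresholds applied against the seed sequence. The Python sketch below encodes only those thresholds. It assumes identity and overlap have already been computed by some alignment step and are passed in as fractions; the function name `meets_threshold` and the example values are hypothetical, and this is not the UniRef clustering procedure itself.

```python
# Thresholds as stated in the descriptions above, expressed as fractions.
UNIREF_RULES = {
    "UniRef90": {"min_identity": 0.90, "min_overlap": 0.80},
    "UniRef50": {"min_identity": 0.50, "min_overlap": 0.80},
}


def meets_threshold(level, identity, overlap):
    """True if a sequence satisfies both cut-offs relative to the seed sequence.

    identity and overlap are fractions in [0, 1], assumed to be precomputed.
    """
    rule = UNIREF_RULES[level]
    return identity >= rule["min_identity"] and overlap >= rule["min_overlap"]


# Hypothetical values for illustration only.
print(meets_threshold("UniRef90", identity=0.93, overlap=0.85))  # True
print(meets_threshold("UniRef90", identity=0.93, overlap=0.70))  # False
print(meets_threshold("UniRef50", identity=0.62, overlap=0.80))  # True
```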