Protein

Tubby-related protein 4

Gene

TULP4

Organism
Homo sapiens (Human)
Status
Reviewed. Annotation score: 5 out of 5. Experimental evidence at transcript level.

Function

May be a substrate-recognition component of a SCF-like ECS (Elongin-Cullin-SOCS-box protein) E3 ubiquitin ligase complex which mediates the ubiquitination and subsequent proteasomal degradation of target proteins (by similarity).

Pathway: protein ubiquitination

This protein is involved in the pathway protein ubiquitination, which is part of protein modification.

Keywords - Biological process

Ubl conjugation pathway

Enzyme and pathway databases

SignaLink: Q9NRJ4.
UniPathway: UPA00143.

Names & Taxonomy

Protein names
Recommended name: Tubby-related protein 4
Alternative name(s): Tubby superfamily protein; Tubby-like protein 4
Gene names
Name: TULP4
Synonyms: KIAA1397, TUBL4, TUSP
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes
  • UP000005640 Component: Chromosome 6

Organism-specific databases

HGNC: HGNC:15530. TULP4.

Subcellular location

GO - Cellular component

  • cilium (Source: GO_Central)
  • cytoplasm (Source: MGI)

Keywords - Cellular component

Cytoplasm

Pathology & Biotech

Organism-specific databases

PharmGKB: PA134880863.

Polymorphism and mutation databases

BioMuta: TULP4.
DMDM: 212276475.

PTM / Processing

Molecule processing

Feature key | Position(s) | Length | Description | Feature identifier
Chain | 1 – 1543 | 1543 | Tubby-related protein 4 | PRO_0000186472

Proteomic databases

MaxQB: Q9NRJ4.
PaxDb: Q9NRJ4.
PRIDE: Q9NRJ4.

PTM databases

iPTMnet: Q9NRJ4.
PhosphoSite: Q9NRJ4.

Expression

Tissue specificity

Expressed mainly in the brain, skeletal muscle, testis and kidney.

Gene expression databases

Bgee: Q9NRJ4.
CleanEx: HS_TULP4.
ExpressionAtlas: Q9NRJ4. baseline and differential.
Genevisible: Q9NRJ4. HS.

Organism-specific databases

HPA: HPA005445.

Interaction

Protein-protein interaction databases

BioGrid: 121310. 2 interactions.
IntAct: Q9NRJ4. 4 interactions.
STRING: 9606.ENSP00000356064.

Structure

3D structure databases

ProteinModelPortal: Q9NRJ4.
SMR: Q9NRJ4. Positions 78-109, 1483-1537.

Family & Domains

Domains and Repeats

Feature key | Position(s) | Length | Description
Repeat | 6 – 72 | 67 | WD 1
Repeat | 73 – 115 | 43 | WD 2
Repeat | 116 – 158 | 43 | WD 3
Repeat | 159 – 237 | 79 | WD 4
Repeat | 238 – 276 | 39 | WD 5
Repeat | 277 – 334 | 58 | WD 6
Repeat | 335 – 372 | 38 | WD 7
Domain | 364 – 414 | 51 | SOCS box (PROSITE-ProRule annotation)
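
The positions in the table above are 1-based and inclusive, so each length equals end - start + 1. The following is a minimal Python sketch that recomputes the lengths and checks how far the WD 7 repeat and the SOCS box overlap. The `Feature` class and the `overlap` helper are illustrative names, not part of any UniProt tooling.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive

    @property
    def length(self) -> int:
        # Inclusive coordinates: both end points count as residues.
        return self.end - self.start + 1


def overlap(a: Feature, b: Feature) -> int:
    """Number of residues shared by two inclusive ranges (0 if disjoint)."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


# Coordinates copied from the Domains and Repeats table.
FEATURES = [
    Feature("WD 1", 6, 72),
    Feature("WD 2", 73, 115),
    Feature("WD 3", 116, 158),
    Feature("WD 4", 159, 237),
    Feature("WD 5", 238, 276),
    Feature("WD 6", 277, 334),
    Feature("WD 7", 335, 372),
    Feature("SOCS box", 364, 414),
]

lengths = {f.name: f.length for f in FEATURES}
shared = overlap(FEATURES[6], FEATURES[7])  # WD 7 versus SOCS box

for f in FEATURES:
    print(f"{f.name}: {f.start}-{f.end} ({f.length} residues)")
print(f"WD 7 / SOCS box overlap: {shared} residues")
```

Under these coordinates the annotated WD 7 repeat and the SOCS box share residues 364 to 372, while the earlier repeats tile the sequence without overlap.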

Region

Feature key | Position(s) | Length | Description
Region | 1466 – 1543 | 78 | TUB

Domain

The SOCS box domain mediates the interaction with the Elongin BC complex, an adapter module in different E3 ubiquitin ligase complexes (by similarity).

Sequence similarities

Belongs to the TUB family (curated).
Contains 1 SOCS box domain (PROSITE-ProRule annotation).
Contains 7 WD repeats (PROSITE-ProRule annotation).

Keywords - Domain

Repeat, WD repeat

Phylogenomic databases

eggNOG: KOG2503. Eukaryota.
  ENOG410YEKG. LUCA.
GeneTree: ENSGT00610000085970.
HOGENOM: HOG000060128.
HOVERGEN: HBG058280.
InParanoid: Q9NRJ4.
OMA: DFSLYPT.
OrthoDB: EOG78H3SJ.
PhylomeDB: Q9NRJ4.
TreeFam: TF314076.

Family and domain databases

Gene3D: 2.130.10.10. 2 hits.
  3.20.90.10. 3 hits.
InterPro: IPR001496. SOCS_box.
  IPR000007. Tubby_C.
  IPR025659. Tubby_C-like.
  IPR008983. Tumour_necrosis_fac-like_dom.
  IPR015943. WD40/YVTN_repeat-like_dom.
  IPR001680. WD40_repeat.
  IPR017986. WD40_repeat_dom.
Pfam: PF01167. Tub. 1 hit.
SMART: SM00969. SOCS_box. 1 hit.
  SM00320. WD40. 2 hits.
SUPFAM: SSF49842. SSF49842. 1 hit.
  SSF50978. SSF50978. 2 hits.
  SSF54518. SSF54518. 2 hits.
PROSITE: PS50225. SOCS. 1 hit.
  PS50082. WD_REPEATS_2. 1 hit.
  PS50294. WD_REPEATS_REGION. 1 hit.

Sequences (2)

Sequence status: Complete.

This entry describes 2 isoforms produced by alternative splicing.

Isoform 1 (identifier: Q9NRJ4-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MYAAVEHGPV LCSDSNILCL SWKGRVPKSE KEKPVCRRRY YEEGWLATGN
60 70 80 90 100
GRGVVGVTFT SSHCRRDRST PQRINFNLRG HNSEVVLVRW NEPYQKLATC
110 120 130 140 150
DADGGIFVWI QYEGRWSVEL VNDRGAQVSD FTWSHDGTQA LISYRDGFVL
160 170 180 190 200
VGSVSGQRHW SSEINLESQI TCGIWTPDDQ QVLFGTADGQ VIVMDCHGRM
210 220 230 240 250
LAHVLLHESD GVLGMSWNYP IFLVEDSSES DTDSDDYAPP QDGPAAYPIP
260 270 280 290 300
VQNIKPLLTV SFTSGDISLM NNYDDLSPTV IRSGLKEVVA QWCTQGDLLA
310 320 330 340 350
VAGMERQTQL GELPNGPLLK SAMVKFYNVR GEHIFTLDTL VQRPIISICW
360 370 380 390 400
GHRDSRLLMA SGPALYVVRV EHRVSSLQLL CQQAIASTLR EDKDVSKLTL
410 420 430 440 450
PPRLCSYLST AFIPTIKPPI PDPNNMRDFV SYPSAGNERL HCTMKRTEDD
460 470 480 490 500
PEVGGPCYTL YLEYLGGLVP ILKGRRISKL RPEFVIMDPR TDSKPDEIYG
510 520 530 540 550
NSLISTVIDS CNCSDSSDIE LSDDWAAKKS PKISRASKSP KLPRISIEAR
560 570 580 590 600
KSPKLPRAAQ ELSRSPRLPL RKPSVGSPSL TRREFPFEDI TQHNYLAQVT
610 620 630 640 650
SNIWGTKFKI VGLAAFLPTN LGAVIYKTSL LHLQPRQMTI YLPEVRKISM
660 670 680 690 700
DYINLPVFNP NVFSEDEDDL PVTGASGVPE NSPPCTVNIP IAPIHSSAQA
710 720 730 740 750
MSPTQSIGLV QSLLANQNVQ LDVLTNQTTA VGTAEHAGDS ATQYPVSNRY
760 770 780 790 800
SNPGQVIFGS VEMGRIIQNP PPLSLPPPPQ GPMQLSTVGH GDRDHEHLQK
810 820 830 840 850
SAKALRPTPQ LAAEGDAVVF SAPQEVQVTK INPPPPYPGT IPAAPTTAAP
860 870 880 890 900
PPPLPPPQPP VDVCLKKGDF SLYPTSVHYQ TPLGYERITT FDSSGNVEEV
910 920 930 940 950
CRPRTRMLCS QNTYTLPGPG SSATLRLTAT EKKVPQPCSS ATLNRLTVPR
960 970 980 990 1000
YSIPTGDPPP YPEIASQLAQ GRGAAQRSDN SLIHATLRRN NREATLKMAQ
1010 1020 1030 1040 1050
LADSPRAPLQ PLAKSKGGPG GVVTQLPARP PPALYTCSQC SGTGPSSQPG
1060 1070 1080 1090 1100
ASLAHTASAS PLASQSSYSL LSPPDSARDR TDYVNSAFTE DEALSQHCQL
1110 1120 1130 1140 1150
EKPLRHPPLP EAAVTLKRPP PYQWDPMLGE DVWVPQERTA QTSGPNPLKL
1160 1170 1180 1190 1200
SSLMLSQGQH LDVSRLPFIS PKSPASPTAT FQTGYGMGVP YPGSYNNPPL
1210 1220 1230 1240 1250
PGVQAPCSPK DALSPTQFAQ QEPAVVLQPL YPPSLSYCTL PPMYPGSSTC
1260 1270 1280 1290 1300
SSLQLPPVAL HPWSSYSACP PMQNPQGTLP PKPHLVVEKP LVSPPPADLQ
1310 1320 1330 1340 1350
SHLGTEVMVE TADNFQEVLS LTESPVPQRT EKFGKKNRKR LDSRAEEGSV
1360 1370 1380 1390 1400
QAITEGKVKK EARTLSDFNS LISSPHLGRE KKKVKSQKDQ LKSKKLNKTN
1410 1420 1430 1440 1450
EFQDSSESEP ELFISGDELM NQSQGSRKGW KSKRSPRAAG ELEEAKCRRA
1460 1470 1480 1490 1500
SEKEDGRLGS QGFVYVMANK QPLWNEATQV YQLDFGGRVT QESAKNFQIE
1510 1520 1530 1540
LEGRQVMQFG RIDGSAYILD FQYPFSAVQA FAVALANVTQ RLK
Length: 1,543
Mass (Da): 169,000
Last modified: November 4, 2008 - v2
Checksum: 803AB78FF8DF2E98
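
The sequence listing above interleaves position rulers (10, 20, ...) with residue lines split into blocks of ten. The following Python sketch shows one way to recover a plain residue string from that display format. The function name `parse_display_sequence` is hypothetical, and the example uses only the first 100 residues copied from the listing.

```python
import re


def parse_display_sequence(text: str) -> str:
    """Strip ruler lines and block spacing from a display-format sequence."""
    residues = []
    for line in text.splitlines():
        tokens = line.split()
        # Ruler lines consist only of integers (10, 20, ...); skip them
        # together with empty lines.
        if not tokens or all(t.isdigit() for t in tokens):
            continue
        residues.extend(tokens)
    seq = "".join(residues)
    if not re.fullmatch(r"[A-Z]+", seq):
        raise ValueError("unexpected characters in sequence text")
    return seq


# First two residue lines of isoform 1, with their rulers, as shown above.
EXAMPLE = """\
        10         20         30         40         50
MYAAVEHGPV LCSDSNILCL SWKGRVPKSE KEKPVCRRRY YEEGWLATGN
60 70 80 90 100
GRGVVGVTFT SSHCRRDRST PQRINFNLRG HNSEVVLVRW NEPYQKLATC
"""

seq = parse_display_sequence(EXAMPLE)
print(len(seq), seq[:10])
```

Applied to the full listing, the same function should yield a string whose length can be compared with the stated length of 1,543.
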
Isoform 2 (identifier: Q9NRJ4-2)

The sequence of this isoform differs from the canonical sequence as follows:
     672-678: VTGASGV → GDAVWTD
     679-1543: Missing.

Length: 678
Mass (Da): 75,852
Checksum: 6C2EA8130490EE32
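
The two differences listed for isoform 2 can be applied mechanically to the canonical sequence. The sketch below assumes 1-based inclusive positions and represents "Missing" as an empty replacement. The function `apply_isoform_edits` is an illustrative name, and the canonical sequence is a placeholder in which only residues 672-678 are real, so the result shows the mechanics rather than the true isoform sequence.

```python
def apply_isoform_edits(canonical: str, edits: list[tuple[int, int, str]]) -> str:
    """Apply (start, end, replacement) edits given in canonical coordinates.

    Positions are 1-based and inclusive; an empty replacement deletes the span.
    """
    seq = canonical
    # Work from the C-terminus towards the N-terminus so that positions of
    # edits not yet applied are not shifted by earlier changes.
    for start, end, replacement in sorted(edits, reverse=True):
        seq = seq[: start - 1] + replacement + seq[end:]
    return seq


# Differences of isoform 2 relative to the canonical sequence, as listed above.
ISOFORM_2_EDITS = [
    (672, 678, "GDAVWTD"),  # VTGASGV -> GDAVWTD
    (679, 1543, ""),        # Missing
]

# Placeholder canonical sequence: 'X' everywhere except residues 672-678.
placeholder = "X" * 671 + "VTGASGV" + "X" * (1543 - 678)

isoform_2 = apply_isoform_edits(placeholder, ISOFORM_2_EDITS)
print(len(isoform_2), isoform_2[-7:])
```

With the real canonical sequence in place of the placeholder, the resulting length can be checked against the 678 residues reported for isoform 2.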

Sequence caution

The sequence AAF87975.1 differs from that shown. Reason: Frameshift at positions 740 and 744 (curated).
The sequence BAA92635.1 differs from that shown. Reason: Erroneous initiation (curated).

Experimental Info

Feature key | Position(s) | Length | Description
Sequence conflict | 745 – 745 | 1 | P → Q in AAF87975 (PubMed:11595174) (curated).

Natural variant

Feature key | Position(s) | Length | Description | Feature identifier
Natural variant | 199 – 199 | 1 | R → S. Corresponds to variant rs705956. | VAR_059841
Natural variant | 214 – 214 | 1 | G → S. Corresponds to variant rs35262826. | VAR_052417
Natural variant | 522 – 522 | 1 | S → N. Corresponds to variant rs12206717. | VAR_052418
Natural variant | 979 – 979 | 1 | D → N. Corresponds to variant rs34622886. | VAR_052419
Natural variant | 1084 – 1084 | 1 | V → I. Corresponds to variant rs34559793. | VAR_052420
Natural variant | 1281 – 1281 | 1 | P → T. Corresponds to variant rs3749852. | VAR_052421
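
Each natural variant names a reference residue at a canonical position, which can be checked against the sequence before substituting. The following Python sketch does this for the variant at position 199 (R → S), using only residues 191-200 of the canonical sequence. The function `apply_variant` and its `offset` parameter are illustrative assumptions, not part of any UniProt tooling.

```python
def apply_variant(seq: str, position: int, ref: str, alt: str, offset: int = 1) -> str:
    """Substitute one residue after checking the reference residue.

    `position` is in canonical (1-based) coordinates; `offset` is the
    canonical position of seq[0], so a fragment can be used instead of
    the full sequence.
    """
    index = position - offset
    if not 0 <= index < len(seq):
        raise IndexError(f"position {position} lies outside the supplied sequence")
    if seq[index] != ref:
        raise ValueError(f"expected {ref} at {position}, found {seq[index]}")
    return seq[:index] + alt + seq[index + 1:]


# Residues 191-200 of the canonical sequence, copied from the listing above.
segment = "VIVMDCHGRM"

variant = apply_variant(segment, 199, "R", "S", offset=191)
print(segment, "->", variant)
```

The reference check guards against off-by-one errors when mixing 1-based annotation positions with 0-based string indices.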

Alternative sequence

Feature key | Position(s) | Length | Description | Feature identifier
Alternative sequence | 672 – 678 | 7 | VTGASGV → GDAVWTD in isoform 2. 3 publications. | VSP_006676
Alternative sequence | 679 – 1543 | 865 | Missing in isoform 2. 3 publications. | VSP_006677

Sequence databases

EMBL / GenBank / DDBJ:
  AF219946 mRNA. Translation: AAF87975.1. Frameshift.
  AF288480 mRNA. Translation: AAG01020.1.
  AB037818 mRNA. Translation: BAA92635.1. Different initiation.
  AL360169, AL353800 Genomic DNA. Translation: CAI16559.1.
  AL360169, AL353800 Genomic DNA. Translation: CAI16560.1.
  AL353800, AL360169 Genomic DNA. Translation: CAI20993.1.
  AL353800, AL360169 Genomic DNA. Translation: CAI20994.1.
  CH471051 Genomic DNA. Translation: EAW47661.1.
  CH471051 Genomic DNA. Translation: EAW47662.1.
  BC152476 mRNA. Translation: AAI52477.1.
CCDS: CCDS34561.1. [Q9NRJ4-1]
  CCDS34562.1. [Q9NRJ4-2]
RefSeq: NP_001007467.1. NM_001007466.2. [Q9NRJ4-2]
  NP_064630.2. NM_020245.4. [Q9NRJ4-1]
UniGene: Hs.486993.

Genome annotation databases

Ensembl: ENST00000367094; ENSP00000356061; ENSG00000130338. [Q9NRJ4-2]
  ENST00000367097; ENSP00000356064; ENSG00000130338. [Q9NRJ4-1]
GeneID: 56995.
KEGG: hsa:56995.
UCSC: uc003qrf.4. human. [Q9NRJ4-1]

Keywords - Coding sequence diversity

Alternative splicing, Polymorphism

Cross-references

Protocols and materials databases

DNASU: 56995.

Organism-specific databases

CTD: 56995.
GeneCards: TULP4.
H-InvDB: HIX0006328.
HGNC: HGNC:15530. TULP4.
HPA: HPA005445.
neXtProt: NX_Q9NRJ4.
PharmGKB: PA134880863.

Miscellaneous databases

ChiTaRS: TULP4. human.
GenomeRNAi: 56995.
PRO: Q9NRJ4.

Publications

  1. "Molecular cloning and characterization of the mouse and human TUSP gene, a novel member of the tubby superfamily."
    Li Q.-Z., Wang C.-Y., Shi J.-D., Ruan Q.-G., Eckenrode S., Davoodi-Semiromi A., Kukar T., Gu Y., Lian W., Wu D., She J.-X.
    Gene 273:275-284(2001)
    Cited for: NUCLEOTIDE SEQUENCE [MRNA] (ISOFORMS 1 AND 2), ALTERNATIVE SPLICING.
    Tissue: Brain.
  2. "Prediction of the coding sequences of unidentified human genes. XVI. The complete sequences of 150 new cDNA clones from brain which code for large proteins in vitro."
    Nagase T., Kikuno R., Ishikawa K., Hirosawa M., Ohara O.
    DNA Res. 7:65-73(2000)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORM 2).
    Tissue: Brain.
  3. "The DNA sequence and analysis of human chromosome 6."
    Mungall A.J., Palmer S.A., Sims S.K., Edwards C.A., Ashurst J.L., Wilming L., Jones M.C., Horton R., Hunt S.E., Scott C.E., Gilbert J.G.R., Clamp M.E., Bethel G., Milne S., Ainscough R., Almeida J.P., Ambrose K.D., Andrews T.D., Ashwell R.I.S., Babbage A.K., Bagguley C.L., Bailey J., Banerjee R., Barker D.J., Barlow K.F., Bates K., Beare D.M., Beasley H., Beasley O., Bird C.P., Blakey S.E., Bray-Allen S., Brook J., Brown A.J., Brown J.Y., Burford D.C., Burrill W., Burton J., Carder C., Carter N.P., Chapman J.C., Clark S.Y., Clark G., Clee C.M., Clegg S., Cobley V., Collier R.E., Collins J.E., Colman L.K., Corby N.R., Coville G.J., Culley K.M., Dhami P., Davies J., Dunn M., Earthrowl M.E., Ellington A.E., Evans K.A., Faulkner L., Francis M.D., Frankish A., Frankland J., French L., Garner P., Garnett J., Ghori M.J., Gilby L.M., Gillson C.J., Glithero R.J., Grafham D.V., Grant M., Gribble S., Griffiths C., Griffiths M.N.D., Hall R., Halls K.S., Hammond S., Harley J.L., Hart E.A., Heath P.D., Heathcott R., Holmes S.J., Howden P.J., Howe K.L., Howell G.R., Huckle E., Humphray S.J., Humphries M.D., Hunt A.R., Johnson C.M., Joy A.A., Kay M., Keenan S.J., Kimberley A.M., King A., Laird G.K., Langford C., Lawlor S., Leongamornlert D.A., Leversha M., Lloyd C.R., Lloyd D.M., Loveland J.E., Lovell J., Martin S., Mashreghi-Mohammadi M., Maslen G.L., Matthews L., McCann O.T., McLaren S.J., McLay K., McMurray A., Moore M.J.F., Mullikin J.C., Niblett D., Nickerson T., Novik K.L., Oliver K., Overton-Larty E.K., Parker A., Patel R., Pearce A.V., Peck A.I., Phillimore B.J.C.T., Phillips S., Plumb R.W., Porter K.M., Ramsey Y., Ranby S.A., Rice C.M., Ross M.T., Searle S.M., Sehra H.K., Sheridan E., Skuce C.D., Smith S., Smith M., Spraggon L., Squares S.L., Steward C.A., Sycamore N., Tamlyn-Hall G., Tester J., Theaker A.J., Thomas D.W., Thorpe A., Tracey A., Tromans A., Tubby B., Wall M., Wallis J.M., West A.P., White S.S., Whitehead S.L., Whittaker H., Wild A., Willey D.J., Wilmer T.E., Wood J.M., Wray P.W., Wyatt J.C., Young L., Younger R.M., Bentley D.R., Coulson A., Durbin R.M., Hubbard T., Sulston J.E., Dunham I., Rogers J., Beck S.
    Nature 425:805-811(2003)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
  4. Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
  5. "The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC)."
    The MGC Project Team
    Genome Res. 14:2121-2127(2004)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORM 2).

Entry information

Entry name: TULP4_HUMAN
Accession: Primary (citable) accession number: Q9NRJ4
Secondary accession number(s): Q5T3M2, Q5T3M3, Q9HD22, Q9P2F0
Entry history
Integrated into UniProtKB/Swiss-Prot: January 31, 2002
Last sequence update: November 4, 2008
Last modified: June 8, 2016
This is version 137 of the entry and version 2 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

Keywords - Technical term

Complete proteome, Reference proteome

Documents

  1. Human chromosome 6
    Human chromosome 6: entries, gene names and cross-references to MIM
  2. Human entries with polymorphisms or disease mutations
    List of human entries with polymorphisms or disease mutations
  3. Human polymorphisms and disease mutations
    Index of human polymorphisms and disease mutations
  4. PATHWAY comments
    Index of metabolic and biosynthesis pathways
  5. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.