Protein

Heterogeneous nuclear ribonucleoprotein U-like protein 1

Gene

HNRNPUL1

Organism
Homo sapiens (Human)
Status
Reviewed - Annotation score: 5 out of 5 - Experimental evidence at protein level

Function

Acts as a basic transcriptional regulator. Represses basic transcription driven by several viral and cellular promoters. When associated with BRD7, activates transcription of glucocorticoid-responsive promoters in the absence of ligand stimulation. Also plays a role in mRNA processing and transport. Binds avidly to poly(G) and poly(C) RNA homopolymers in vitro. (2 Publications)

GO - Molecular function

  • enzyme binding Source: UniProtKB
  • poly(A) RNA binding Source: UniProtKB
  • RNA binding Source: ProtInc

GO - Biological process

  • mRNA splicing, via spliceosome Source: Reactome
  • regulation of transcription, DNA-templated Source: UniProtKB-KW
  • response to virus Source: ProtInc
  • RNA processing Source: ProtInc
  • transcription, DNA-templated Source: UniProtKB-KW

Keywords - Molecular function

Activator, Repressor, Ribonucleoprotein

Keywords - Biological process

Transcription, Transcription regulation

Keywords - Ligand

RNA-binding

Enzyme and pathway databases

BioCyc: ZFISH:ENSG00000105323-MONOMER.
Reactome: R-HSA-72163. mRNA Splicing - Major Pathway.

Names & Taxonomy

Protein names
Recommended name:
Heterogeneous nuclear ribonucleoprotein U-like protein 1
Alternative name(s):
Adenovirus early region 1B-associated protein 5
E1B-55 kDa-associated protein 5
Short name:
E1B-AP5
Gene names
Name: HNRNPUL1
Synonyms: E1BAP5, HNRPUL1
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes
  • UP000005640 Component: Chromosome 19

Organism-specific databases

HGNC: HGNC:17011. HNRNPUL1.

Subcellular location

GO - Cellular component

  • intracellular ribonucleoprotein complex Source: UniProtKB-KW
  • nucleoplasm Source: HPA
  • nucleus Source: UniProtKB

Keywords - Cellular component

Nucleus

Pathology & Biotech

Organism-specific databases

DisGeNET: 11100.
OpenTargets: ENSG00000105323.
PharmGKB: PA162391519.

Polymorphism and mutation databases

BioMuta: HNRNPUL1.
DMDM: 90101344.

PTM / Processing

Molecule processing

Feature key  Identifier      Position(s)  Length  Description
Chain        PRO_0000227555  1 – 856      856     Heterogeneous nuclear ribonucleoprotein U-like protein 1

Amino acid modifications

Feature key       Position  Length  Evidence          Description
Cross-link        117       -       Combined sources  Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO1)
Cross-link        117       -       Combined sources  Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO2)
Cross-link        142       -       Combined sources  Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO1)
Cross-link        142       -       Combined sources  Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO2)
Modified residue  194       1       Combined sources  Phosphoserine
Modified residue  209       1       By similarity     Phosphothreonine
Modified residue  512       1       Combined sources  Phosphoserine
Modified residue  639       1       Combined sources  Asymmetric dimethylarginine
Modified residue  645       1       Combined sources  Asymmetric dimethylarginine; alternate
Modified residue  645       1       By similarity     Omega-N-methylarginine; alternate
Modified residue  656       1       Combined sources  Asymmetric dimethylarginine; alternate
Modified residue  656       1       By similarity     Omega-N-methylarginine; alternate
Modified residue  661       1       By similarity     Omega-N-methylarginine
Modified residue  671       1       Combined sources  Omega-N-methylarginine
Modified residue  718       1       Combined sources  Phosphoserine

Post-translational modification

Methylated. (2 Publications)

Keywords - PTM

Isopeptide bond, Methylation, Phosphoprotein, Ubl conjugation

Proteomic databases

EPD: Q9BUJ2.
MaxQB: Q9BUJ2.
PaxDb: Q9BUJ2.
PeptideAtlas: Q9BUJ2.
PRIDE: Q9BUJ2.
TopDownProteomics: Q9BUJ2-2. [Q9BUJ2-2]

PTM databases

iPTMnet: Q9BUJ2.
PhosphoSitePlus: Q9BUJ2.
SwissPalm: Q9BUJ2.

Expression

Gene expression databases

Bgee: ENSG00000105323.
CleanEx: HS_HNRNPUL1.
ExpressionAtlas: Q9BUJ2. baseline and differential.
Genevisible: Q9BUJ2. HS.

Organism-specific databases

HPA: CAB046477.
     HPA046290.
     HPA049475.

Interaction

Subunit structure

Interacts with the adenovirus type 5 (Ad5) E1B-55 kDa, BRD7, PRMT2, TP53 and NXF1. Associates with histones and BRD7. (5 Publications)

Binary interactions

With       Entry     #Exp.  IntAct
BRD7       Q9NPI1    5      EBI-1018153,EBI-711221
DZIP3      Q86Y13    5      EBI-1018153,EBI-948630
HNRNPF     P52597    4      EBI-1018153,EBI-352986
MAPK1IP1L  Q8NDC0    5      EBI-1018153,EBI-741424
RBM4B      Q9BQ04    3      EBI-1018153,EBI-715531
SF3B4      Q15427    4      EBI-1018153,EBI-348469
SMN2       Q16637    4      EBI-1018153,EBI-395421
SORBS3     O60504    3      EBI-1018153,EBI-741237
TP53BP2    Q13625-3  3      EBI-1018153,EBI-10175039
VPS37C     A5D8V6    3      EBI-1018153,EBI-2559305
WWP2       O00308    6      EBI-1018153,EBI-743923

GO - Molecular function

  • enzyme binding Source: UniProtKB

Protein-protein interaction databases

BioGrid: 116281. 119 interactors.
DIP: DIP-39419N.
IntAct: Q9BUJ2. 66 interactors.
MINT: MINT-121512.
STRING: 9606.ENSP00000375863.

Structure

3D structure databases

ProteinModelPortal: Q9BUJ2.
SMR: Q9BUJ2.

Family & Domains

Domains and Repeats

Feature key  Position(s)  Length  Evidence                    Description
Domain       3 – 37       35      PROSITE-ProRule annotation  SAP
Domain       191 – 388    198     PROSITE-ProRule annotation  B30.2/SPRY
Repeat       612 – 614    3       -                           1-1
Repeat       620 – 622    3       -                           1-2
Repeat       639 – 641    3       -                           1-3
Repeat       645 – 647    3       -                           1-4
Repeat       656 – 658    3       -                           1-5

Region

Feature key  Position(s)  Length  Evidence       Description
Region       1 – 103      103     -              Necessary for interaction with HRMT1L1
Region       213 – 856    644     1 Publication  Necessary for interaction with TP53
Region       456 – 594    139     1 Publication  Necessary for interaction with BRD7 and transcriptional activation
Region       612 – 658    47      -              5 X 3 AA repeats of R-G-G
Region       612 – 658    47      -              Necessary for transcription repression
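
The five R-G-G repeat positions annotated for region 612 – 658 can be cross-checked against the canonical sequence. The following is a minimal Python sketch, assuming residues 601 – 660 are copied from the isoform 1 sequence given in this entry; the names SEGMENT, SEGMENT_START and find_motif are illustrative only and are not part of any UniProt tool.

```python
import re

# Residues 601-660 of the canonical sequence (isoform 1), copied from this entry.
SEGMENT_START = 601
SEGMENT = (
    "AGPPPEKRFD" "NRGGGGFRGR" "GGGGGFQRYE"
    "NRGPPGGNRG" "GFQNRGGGSG" "GGGNYRGGFN"
)


def find_motif(segment, motif, offset):
    """Return 1-based, inclusive (start, end) positions of each motif match."""
    return [
        (m.start() + offset, m.start() + offset + len(motif) - 1)
        for m in re.finditer(re.escape(motif), segment)
    ]


rgg_repeats = find_motif(SEGMENT, "RGG", SEGMENT_START)
print(rgg_repeats)
# [(612, 614), (620, 622), (639, 641), (645, 647), (656, 658)]
```

The search is restricted to the annotated neighbourhood on purpose: a plain string match only locates the motif and says nothing about whether a match elsewhere in the protein would be annotated as a repeat.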

Compositional bias

Feature key         Position(s)  Length  Description
Compositional bias  613 – 666    54      Gly-rich
Compositional bias  670 – 689    20      Asn-rich
Compositional bias  692 – 811    120     Pro-rich
Compositional bias  757 – 845    89      Tyr-rich
Compositional bias  806 – 832    27      Gln-rich
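
As an illustration of what a compositional-bias annotation summarises, the residue composition of the Gly-rich region can be tallied directly. This is a rough Python sketch, assuming residues 613 – 666 are copied from the isoform 1 sequence in this entry; the entry does not state the threshold used to call a region "rich", so none is applied here, and the helper name residue_fractions is hypothetical.

```python
from collections import Counter

# Residues 613-666 of the canonical sequence (isoform 1), copied from this entry.
GLY_RICH_REGION = (
    "GGGGFRGR" "GGGGGFQRYE" "NRGPPGGNRG"
    "GFQNRGGGSG" "GGGNYRGGFN" "RSGGGG"
)


def residue_fractions(region):
    """Return {residue: fraction of the region}, most frequent residue first."""
    counts = Counter(region)
    return {aa: n / len(region) for aa, n in counts.most_common()}


fractions = residue_fractions(GLY_RICH_REGION)
glycine_count = GLY_RICH_REGION.count("G")
print(len(GLY_RICH_REGION), glycine_count)  # 54 28
```

Glycine accounts for 28 of the 54 residues in this span, which is consistent with the Gly-rich label.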

Domain

The RGG-box domain is methylated.

Sequence similarities

Contains 1 B30.2/SPRY domain. (PROSITE-ProRule annotation)
Contains 1 SAP domain. (PROSITE-ProRule annotation)

Keywords - Domain

Repeat

Phylogenomic databases

eggNOG: KOG2242. Eukaryota.
        ENOG410XSBV. LUCA.
GeneTree: ENSGT00390000020210.
HOVERGEN: HBG061101.
InParanoid: Q9BUJ2.
KO: K15047.
OMA: HYVMDNI.
OrthoDB: EOG091G041T.
PhylomeDB: Q9BUJ2.
TreeFam: TF317301.

Family and domain databases

Gene3D: 1.10.720.30. 1 hit.
        3.40.50.300. 1 hit.
InterPro: IPR001870. B30.2/SPRY.
          IPR013320. ConA-like_dom.
          IPR027025. hnRNP_U-like_1.
          IPR027417. P-loop_NTPase.
          IPR003034. SAP_dom.
          IPR003877. SPRY_dom.
PANTHER: PTHR12381:SF41. PTHR12381:SF41. 2 hits.
Pfam: PF02037. SAP. 1 hit.
      PF00622. SPRY. 1 hit.
SMART: SM00513. SAP. 1 hit.
       SM00449. SPRY. 1 hit.
SUPFAM: SSF49899. SSF49899. 1 hit.
        SSF52540. SSF52540. 1 hit.
PROSITE: PS50188. B302_SPRY. 1 hit.
         PS50800. SAP. 1 hit.

Sequences (5)

Sequence status: Complete.

This entry describes 5 isoforms produced by alternative splicing.

Isoform 1 (identifier: Q9BUJ2-1)
Also known as: Isoform a

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MDVRRLKVNE LREELQRRGL DTRGLKAELA ERLQAALEAE EPDDERELDA
        60         70         80         90        100
DDEPGRPGHI NEEVETEGGS ELEGTAQPPP PGLQPHAEPG GYSGPDGHYA
       110        120        130        140        150
MDNITRQNQF YDTQVIKQEN ESGYERRPLE MEQQQAYRPE MKTEMKQGAP
       160        170        180        190        200
TSFLPPEASQ LKPDRQQFQS RKRPYEENRG RGYFEHREDR RGRSPQPPAE
       210        220        230        240        250
EDEDDFDDTL VAIDTYNCDL HFKVARDRSS GYPLTIEGFA YLWSGARASY
       260        270        280        290        300
GVRRGRVCFE MKINEEISVK HLPSTEPDPH VVRIGWSLDS CSTQLGEEPF
       310        320        330        340        350
SYGYGGTGKK STNSRFENYG DKFAENDVIG CFADFECGND VELSFTKNGK
       360        370        380        390        400
WMGIAFRIQK EALGGQALYP HVLVKNCAVE FNFGQRAEPY CSVLPGFTFI
       410        420        430        440        450
QHLPLSERIR GTVGPKSKAE CEILMMVGLP AAGKTTWAIK HAASNPSKKY
       460        470        480        490        500
NILGTNAIMD KMRVMGLRRQ RNYAGRWDVL IQQATQCLNR LIQIAARKKR
       510        520        530        540        550
NYILDQTNVY GSAQRRKMRP FEGFQRKAIV ICPTDEDLKD RTIKRTDEEG
       560        570        580        590        600
KDVPDHAVLE MKANFTLPDV GDFLDEVLFI ELQREEADKL VRQYNEEGRK
       610        620        630        640        650
AGPPPEKRFD NRGGGGFRGR GGGGGFQRYE NRGPPGGNRG GFQNRGGGSG
       660        670        680        690        700
GGGNYRGGFN RSGGGGYSQN RWGNNNRDNN NSNNRGSYNR APQQQPPPQQ
       710        720        730        740        750
PPPPQPPPQQ PPPPPSYSPA RNPPGASTYN KNSNIPGSSA NTSTPTVSSY
       760        770        780        790        800
SPPQPSYSQP PYNQGGYSQG YTAPPPPPPP PPAYNYGSYG GYNPAPYTPP
       810        820        830        840        850
PPPTAQTYPQ PSYNQYQQYA QQWNQYYQNQ GQWPPYYGNY DYGSYSGNTQ

GGTSTQ
Length: 856
Mass (Da): 95,739
Last modified: March 21, 2006 - v2
Checksum: 6E57C0271E5F3A77

Isoform 2 (identifier: Q9BUJ2-2)
Also known as: Isoform b

The sequence of this isoform differs from the canonical sequence as follows:
     755-806: Missing.

Length: 804
Mass (Da): 90,292
Checksum: 05950AF9FAFD192F

Isoform 3 (identifier: Q9BUJ2-3)

The sequence of this isoform differs from the canonical sequence as follows:
     35-77: Missing.
     263-333: Missing.
     754-754: Q → QSFGFFPSTFQ

Length: 752
Mass (Da): 84,500
Checksum: 207A6A67EFFFA299

Isoform 4 (identifier: Q9BUJ2-4)

The sequence of this isoform differs from the canonical sequence as follows:
     1-100: Missing.

Length: 756
Mass (Da): 84,794
Checksum: B16C6EC86B997FFA

Isoform 5 (identifier: Q9BUJ2-5)

The sequence of this isoform differs from the canonical sequence as follows:
     1-460: Missing.
     461-562: KMRVMGLRRQ...VPDHAVLEMK → MGFCHVGQAG...CSLWGTSFLL
     754-754: Q → QSFGFFPSTFQ

Note: May be due to intron retention. No experimental confirmation available.

Length: 390
Mass (Da): 42,189
Checksum: A86E13953F84279F

Experimental Info

Feature key        Position  Length  Evidence  Description
Sequence conflict  27        1       Curated   A → T in CAA07548 (PubMed:9733834).
Sequence conflict  508       1       Curated   N → S in BAC86806 (PubMed:14702039).
Sequence conflict  619       1       Curated   G → A in CAA07548 (PubMed:9733834).
Sequence conflict  625       1       Curated   G → A in CAA07548 (PubMed:9733834).
Sequence conflict  662       1       Curated   S → N in CAA07548 (PubMed:9733834).
Sequence conflict  691       1       Curated   A → S in BAC86806 (PubMed:14702039).
Sequence conflict  773       1       Curated   A → G in CAA07548 (PubMed:9733834).

Natural variant

Feature key      Identifier  Position  Length  Evidence       Description
Natural variant  VAR_025606  91        1       1 Publication  G → C. Corresponds to variant rs17849624 (dbSNP, Ensembl).

Alternative sequence

Feature key           Identifier  Position(s)  Length  Evidence       Description
Alternative sequence  VSP_017546  1 – 460      460     1 Publication  Missing in isoform 5.
Alternative sequence  VSP_017547  1 – 100      100     1 Publication  Missing in isoform 4.
Alternative sequence  VSP_017548  35 – 77      43      1 Publication  Missing in isoform 3.
Alternative sequence  VSP_017549  263 – 333    71      1 Publication  Missing in isoform 3.
Alternative sequence  VSP_017550  461 – 562    102     1 Publication  KMRVM…VLEMK → MGFCHVGQAGLELLTSGDPP ASASQSAGITGVSHRARPSV FVFLIHYSSFLHLLPSGRPL FWVEGTRLQKVLTSSSCSLW GTSFLL in isoform 5.
Alternative sequence  VSP_017551  754          1       1 Publication  Q → QSFGFFPSTFQ in isoform 3 and isoform 5.
Alternative sequence  VSP_017552  755 – 806    52      1 Publication  Missing in isoform 2.
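
The isoform lengths reported in this entry follow arithmetically from the alternative-sequence features: each feature removes a range of canonical residues and optionally inserts a replacement. The Python sketch below reproduces that bookkeeping under the assumption that the ranges and replacement strings are exactly those listed in this entry; the names ISOFORM_EDITS and isoform_length are illustrative and not part of UniProt.

```python
CANONICAL_LENGTH = 856  # isoform 1 (Q9BUJ2-1)

# Replacement for residues 461-562 in isoform 5 (VSP_017550), copied from this entry.
ISOFORM5_INSERT = (
    "MGFCHVGQAGLELLTSGDPP" "ASASQSAGITGVSHRARPSV" "FVFLIHYSSFLHLLPSGRPL"
    "FWVEGTRLQKVLTSSSCSLW" "GTSFLL"
)
Q754_INSERT = "QSFGFFPSTFQ"  # VSP_017551: Q -> QSFGFFPSTFQ at position 754

# Each edit is (first position, last position, replacement string).
# Positions are 1-based and inclusive; "" means the range is missing.
ISOFORM_EDITS = {
    "Q9BUJ2-2": [(755, 806, "")],
    "Q9BUJ2-3": [(35, 77, ""), (263, 333, ""), (754, 754, Q754_INSERT)],
    "Q9BUJ2-4": [(1, 100, "")],
    "Q9BUJ2-5": [(1, 460, ""), (461, 562, ISOFORM5_INSERT), (754, 754, Q754_INSERT)],
}


def isoform_length(canonical_length, edits):
    """Apply each edit's net length change to the canonical length."""
    length = canonical_length
    for first, last, replacement in edits:
        length += len(replacement) - (last - first + 1)
    return length


lengths = {
    isoform: isoform_length(CANONICAL_LENGTH, edits)
    for isoform, edits in ISOFORM_EDITS.items()
}
print(lengths)
# {'Q9BUJ2-2': 804, 'Q9BUJ2-3': 752, 'Q9BUJ2-4': 756, 'Q9BUJ2-5': 390}
```

Because the edits within one isoform do not overlap, their length changes can simply be summed; building the actual isoform sequences would additionally require applying the edits from the C-terminus backwards so that earlier positions stay valid.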

Sequence databases

EMBL / GenBank / DDBJ:
AJ007509 mRNA. Translation: CAA07548.1.
AK021455 mRNA. Translation: BAB13831.1.
AK022863 mRNA. Translation: BAG51129.1.
AK127057 mRNA. Translation: BAC86806.1.
AC011462 Genomic DNA. No translation available.
AC011510 Genomic DNA. No translation available.
CH471126 Genomic DNA. Translation: EAW57025.1.
BC002564 mRNA. Translation: AAH02564.1.
BC009988 mRNA. Translation: AAH09988.2.
BC014232 mRNA. Translation: AAH14232.1.
BC027713 mRNA. Translation: AAH27713.1.
AL050146 mRNA. Translation: CAB43291.1.
CCDS: CCDS12576.1. [Q9BUJ2-1]
      CCDS12577.1. [Q9BUJ2-4]
PIR: T08776.
     T13159.
RefSeq: NP_001308137.1. NM_001321208.1. [Q9BUJ2-4]
        NP_001308140.1. NM_001321211.1. [Q9BUJ2-4]
        NP_008971.2. NM_007040.5. [Q9BUJ2-1]
        NP_653333.1. NM_144732.4. [Q9BUJ2-4]
UniGene: Hs.155218.
         Hs.718642.

Genome annotation databases

Ensembl: ENST00000378215; ENSP00000367460; ENSG00000105323. [Q9BUJ2-3]
         ENST00000392006; ENSP00000375863; ENSG00000105323. [Q9BUJ2-1]
         ENST00000593587; ENSP00000472629; ENSG00000105323. [Q9BUJ2-4]
         ENST00000595018; ENSP00000473132; ENSG00000105323. [Q9BUJ2-4]
         ENST00000602130; ENSP00000470687; ENSG00000105323. [Q9BUJ2-2]
GeneID: 11100.
KEGG: hsa:11100.
UCSC: uc002opz.5. human. [Q9BUJ2-1]

Keywords - Coding sequence diversity

Alternative splicing, Polymorphism

Cross-references

Protocols and materials databases

DNASU: 11100.

Organism-specific databases

CTD: 11100.
DisGeNET: 11100.
GeneCards: HNRNPUL1.
HGNC: HGNC:17011. HNRNPUL1.
HPA: CAB046477.
     HPA046290.
     HPA049475.
MIM: 605800. gene.
neXtProt: NX_Q9BUJ2.
OpenTargets: ENSG00000105323.
PharmGKB: PA162391519.

Miscellaneous databases

ChiTaRS: HNRNPUL1. human.
GeneWiki: HNRPUL1.
GenomeRNAi: 11100.
PRO: Q9BUJ2.

Entry information

Entry name: HNRL1_HUMAN
Accession: Primary (citable) accession number: Q9BUJ2
Secondary accession number(s): B3KMW7, O76022, Q6ZSZ0, Q7L8P4, Q8N6Z4, Q96G37, Q9HAL3, Q9UG75
Entry history
Integrated into UniProtKB/Swiss-Prot: March 21, 2006
Last sequence update: March 21, 2006
Last modified: November 30, 2016
This is version 155 of the entry and version 2 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

Miscellaneous

Its methylation is enhanced in the late phase of adenoviral infection.

Keywords - Technical term

Complete proteome, Reference proteome

Documents

  1. Human chromosome 19
    Human chromosome 19: entries, gene names and cross-references to MIM
  2. Human entries with polymorphisms or disease mutations
    List of human entries with polymorphisms or disease mutations
  3. Human polymorphisms and disease mutations
    Index of human polymorphisms and disease mutations
  4. MIM cross-references
    Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot
  5. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.