Protein

DNA-directed RNA polymerase II subunit RPB2

Gene

POLR2B

Organism
Homo sapiens (Human)
Status
Reviewed. Annotation score: 5 out of 5. Experimental evidence at protein level.

Function

DNA-dependent RNA polymerase catalyzes the transcription of DNA into RNA using the four ribonucleoside triphosphates as substrates. Second largest component of RNA polymerase II, which synthesizes mRNA precursors and many functional non-coding RNAs. Proposed to contribute to the polymerase catalytic activity and forms the polymerase active center together with the largest subunit. Pol II is the central component of the basal RNA polymerase II transcription machinery. It is composed of mobile elements that move relative to each other. RPB2 is part of the core element with the central large cleft, the clamp element that moves to open and close the cleft and the jaws that are thought to grab the incoming DNA template (By similarity). 1 publication.

Catalytic activity

Nucleoside triphosphate + RNA(n) = diphosphate + RNA(n+1).

Sites

Feature key    Position(s)  Description                  Evidence       Length
Metal binding  792          Magnesium; shared with RPB1  By similarity  1
Metal binding  1119         Zinc                         By similarity  1
Metal binding  1122         Zinc                         By similarity  1
Metal binding  1137         Zinc                         By similarity  1
Metal binding  1140         Zinc                         By similarity  1

Regions

Feature key  Position(s)  Description  Length
Zinc finger  1119 – 1140  C4-type      22
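
The annotated metal-binding positions can be cross-checked against the sequence given later in this entry. The following is a minimal sketch, not a UniProt tool: it copies two 50-residue rows from the sequence block (positions 751 to 800 and 1101 to 1150) and looks up the residues at the listed positions. The names `ROWS` and `residue_at` are assumptions introduced only for this illustration.

```python
# Two rows copied from the sequence block of this entry, keyed by the
# 1-based position of their first residue. Spaces are layout only.
ROWS = {
    751: "LYYPQKPLVT TRSMEYLRFR ELPAGINSIV AIASYTGYNQ EDSVIMNRSA",
    1101: "QFLRERLFEA SDPYQVHVCN LCGIMAIANT RTHTYECRGC RNKTQISLVR",
}


def residue_at(position: int) -> str:
    """Return the one-letter residue at a 1-based sequence position.

    Hypothetical helper: it only knows the rows stored in ROWS.
    """
    for start, row in ROWS.items():
        residues = row.replace(" ", "")
        if start <= position < start + len(residues):
            return residues[position - start]
    raise KeyError(f"position {position} is not covered by ROWS")


# Annotated sites: the magnesium ligand and the four zinc ligands.
magnesium_site = residue_at(792)
zinc_sites = [residue_at(p) for p in (1119, 1122, 1137, 1140)]

print(magnesium_site)  # D
print(zinc_sites)      # ['C', 'C', 'C', 'C']
```

The aspartate at 792 and the four cysteines at 1119, 1122, 1137 and 1140 are consistent with a magnesium-coordinating acidic residue and a C4-type zinc finger, respectively.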

GO - Molecular function

  • chromatin binding Source: Ensembl
  • DNA binding Source: ProtInc
  • DNA-directed RNA polymerase activity Source: UniProtKB-KW
  • metal ion binding Source: UniProtKB-KW
  • poly(A) RNA binding Source: UniProtKB
  • ribonucleoside binding Source: InterPro

Keywords - Molecular function

Nucleotidyltransferase, Transferase

Keywords - Biological process

Transcription

Keywords - Ligand

Magnesium, Metal-binding, Zinc

Enzyme and pathway databases

BioCyc: ZFISH:HS00587-MONOMER.
Reactome:
R-HSA-112382. Formation of RNA Pol II elongation complex.
R-HSA-112387. Elongation arrest and recovery.
R-HSA-113418. Formation of the Early Elongation Complex.
R-HSA-167152. Formation of HIV elongation complex in the absence of HIV Tat.
R-HSA-167158. Formation of the HIV-1 Early Elongation Complex.
R-HSA-167160. RNA Pol II CTD phosphorylation and interaction with CE.
R-HSA-167161. HIV Transcription Initiation.
R-HSA-167162. RNA Polymerase II HIV Promoter Escape.
R-HSA-167172. Transcription of the HIV genome.
R-HSA-167200. Formation of HIV-1 elongation complex containing HIV-1 Tat.
R-HSA-167238. Pausing and recovery of Tat-mediated HIV elongation.
R-HSA-167242. Abortive elongation of HIV-1 transcript in the absence of Tat.
R-HSA-167243. Tat-mediated HIV elongation arrest and recovery.
R-HSA-167246. Tat-mediated elongation of the HIV-1 transcript.
R-HSA-167287. HIV elongation arrest and recovery.
R-HSA-167290. Pausing and recovery of HIV elongation.
R-HSA-168325. Viral Messenger RNA Synthesis.
R-HSA-203927. MicroRNA (miRNA) biogenesis.
R-HSA-452723. Transcriptional regulation of pluripotent stem cells.
R-HSA-5578749. Transcriptional regulation by small RNAs.
R-HSA-5601884. PIWI-interacting RNA (piRNA) biogenesis.
R-HSA-5617472. Activation of anterior HOX genes in hindbrain development during early embryogenesis.
R-HSA-674695. RNA Polymerase II Pre-transcription Events.
R-HSA-6781823. Formation of TC-NER Pre-Incision Complex.
R-HSA-6781827. Transcription-Coupled Nucleotide Excision Repair (TC-NER).
R-HSA-6782135. Dual incision in TC-NER.
R-HSA-6782210. Gap-filling DNA repair synthesis and ligation in TC-NER.
R-HSA-6796648. TP53 Regulates Transcription of DNA Repair Genes.
R-HSA-6803529. FGFR2 alternative splicing.
R-HSA-6807505. RNA polymerase II transcribes snRNA genes.
R-HSA-72086. mRNA Capping.
R-HSA-72163. mRNA Splicing - Major Pathway.
R-HSA-72165. mRNA Splicing - Minor Pathway.
R-HSA-72203. Processing of Capped Intron-Containing Pre-mRNA.
R-HSA-73776. RNA Polymerase II Promoter Escape.
R-HSA-73779. RNA Polymerase II Transcription Pre-Initiation And Promoter Opening.
R-HSA-75953. RNA Polymerase II Transcription Initiation.
R-HSA-75955. RNA Polymerase II Transcription Elongation.
R-HSA-76042. RNA Polymerase II Transcription Initiation And Promoter Clearance.
R-HSA-77075. RNA Pol II CTD phosphorylation and interaction with CE.
R-HSA-8851708. Signaling by FGFR2 IIIa TM.

Names & Taxonomy

Protein names
Recommended name:
DNA-directed RNA polymerase II subunit RPB2 (EC:2.7.7.6)
Alternative name(s):
DNA-directed RNA polymerase II 140 kDa polypeptide
DNA-directed RNA polymerase II subunit B
RNA polymerase II subunit 2
RNA polymerase II subunit B2
Gene names
Name: POLR2B
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes
  • UP000005640 Component: Chromosome 4

Organism-specific databases

HGNC: HGNC:9188. POLR2B.

Subcellular location

  • Nucleus (1 publication)

GO - Cellular component

  • DNA-directed RNA polymerase II, core complex Source: UniProtKB
  • membrane Source: UniProtKB
  • nucleoplasm Source: Reactome
  • nucleus Source: UniProtKB

Keywords - Cellular component

DNA-directed RNA polymerase, Nucleus

Pathology & Biotech

Organism-specific databases

DisGeNET: 5431.
OpenTargets: ENSG00000047315.
PharmGKB: PA33508.

Polymorphism and mutation databases

BioMuta: POLR2B.
DMDM: 401012.

PTM / Processing

Molecule processing

Feature key  Position(s)  Description                                  Feature identifier  Length
Chain        1 – 1174     DNA-directed RNA polymerase II subunit RPB2  PRO_0000048085      1174

Amino acid modifications

Feature key       Position(s)  Description      Evidence          Length
Modified residue  937          Phosphoserine    Combined sources  1
Modified residue  1052         N6-methyllysine  Combined sources  1

Keywords - PTM

Methylation, Phosphoprotein

Proteomic databases

EPD: P30876.
MaxQB: P30876.
PaxDb: P30876.
PeptideAtlas: P30876.
PRIDE: P30876.

PTM databases

iPTMnet: P30876.
PhosphoSitePlus: P30876.

Expression

Gene expression databases

Bgee: ENSG00000047315.
CleanEx: HS_POLR2B.
ExpressionAtlas: P30876. baseline and differential.
Genevisible: P30876. HS.

Organism-specific databases

HPA: HPA037506.

Interaction

Subunit structure

Component of the RNA polymerase II (Pol II) complex consisting of 12 subunits. Interacts with WDR82. Interacts with MEN1. 3 publications.

Protein-protein interaction databases

BioGrid: 111427. 108 interactors.
DIP: DIP-32910N.
IntAct: P30876. 20 interactors.
MINT: MINT-1216897.
STRING: 9606.ENSP00000312735.

Structure

3D structure databases

PDB entry  Method               Resolution (Å)  Chain  Positions
5IY6       electron microscopy  7.20            B      1-1174
5IY7       electron microscopy  8.60            B      1-1174
5IY8       electron microscopy  7.90            B      1-1174
5IY9       electron microscopy  6.30            B      1-1174
5IYA       electron microscopy  5.40            B      1-1174
5IYB       electron microscopy  3.90            B      1-1174
5IYC       electron microscopy  3.90            B      1-1174
5IYD       electron microscopy  3.90            B      1-1174
ProteinModelPortal: P30876.
SMR: P30876.

Family & Domains

Sequence similarities

Belongs to the RNA polymerase beta chain family. (Curated)

Zinc finger

Feature key  Position(s)  Description  Length
Zinc finger  1119 – 1140  C4-type      22

Keywords - Domain

Zinc-finger

Phylogenomic databases

eggNOG: KOG0214. Eukaryota.
COG0085. LUCA.
GeneTree: ENSGT00860000133818.
HOGENOM: HOG000222962.
HOVERGEN: HBG017744.
InParanoid: P30876.
KO: K03010.
OMA: RHAIYEK.
OrthoDB: EOG091G00RQ.
PhylomeDB: P30876.
TreeFam: TF103037.

Family and domain databases

CDD: cd00653. RNA_pol_B_RPB2. 1 hit.
Gene3D: 2.40.270.10. 2 hits.
2.40.50.150. 1 hit.
3.90.1110.10. 1 hit.
InterPro: IPR015712. DNA-dir_RNA_pol_su2.
IPR007120. DNA-dir_RNA_pol_su2_6.
IPR007121. RNA_pol_bsu_CS.
IPR007644. RNA_pol_bsu_protrusion.
IPR007642. RNA_pol_Rpb2_2.
IPR007645. RNA_pol_Rpb2_3.
IPR007646. RNA_pol_Rpb2_4.
IPR007647. RNA_pol_Rpb2_5.
IPR007641. RNA_pol_Rpb2_7.
IPR014724. RNA_pol_RPB2_OB-fold.
PANTHER: PTHR20856. PTHR20856. 1 hit.
Pfam: PF04563. RNA_pol_Rpb2_1. 1 hit.
PF04561. RNA_pol_Rpb2_2. 1 hit.
PF04565. RNA_pol_Rpb2_3. 1 hit.
PF04566. RNA_pol_Rpb2_4. 1 hit.
PF04567. RNA_pol_Rpb2_5. 1 hit.
PF00562. RNA_pol_Rpb2_6. 1 hit.
PF04560. RNA_pol_Rpb2_7. 1 hit.
PROSITE: PS01166. RNA_POL_BETA. 1 hit.

Sequence

Sequence status: Complete.

P30876-1

        10         20         30         40         50
MYDADEDMQY DEDDDEITPD LWQEACWIVI SSYFDEKGLV RQQLDSFDEF
        60         70         80         90        100
IQMSVQRIVE DAPPIDLQAE AQHASGEVEE PPRYLLKFEQ IYLSKPTHWE
       110        120        130        140        150
RDGAPSPMMP NEARLRNLTY SAPLYVDITK TVIKEGEEQL QTQHQKTFIG
       160        170        180        190        200
KIPIMLRSTY CLLNGLTDRD LCELNECPLD PGGYFIINGS EKVLIAQEKM
       210        220        230        240        250
ATNTVYVFAK KDSKYAYTGE CRSCLENSSR PTSTIWVSML ARGGQGAKKS
       260        270        280        290        300
AIGQRIVATL PYIKQEVPII IVFRALGFVS DRDILEHIIY DFEDPEMMEM
       310        320        330        340        350
VKPSLDEAFV IQEQNVALNF IGSRGAKPGV TKEKRIKYAK EVLQKEMLPH
       360        370        380        390        400
VGVSDFCETK KAYFLGYMVH RLLLAALGRR ELDDRDHYGN KRLDLAGPLL
       410        420        430        440        450
AFLFRGMFKN LLKEVRIYAQ KFIDRGKDFN LELAIKTRII SDGLKYSLAT
       460        470        480        490        500
GNWGDQKKAH QARAGVSQVL NRLTFASTLS HLRRLNSPIG RDGKLAKPRQ
       510        520        530        540        550
LHNTLWGMVC PAETPEGHAV GLVKNLALMA YISVGSQPSP ILEFLEEWSM
       560        570        580        590        600
ENLEEISPAA IADATKIFVN GCWVGIHKDP EQLMNTLRKL RRQMDIIVSE
       610        620        630        640        650
VSMIRDIRER EIRIYTDAGR ICRPLLIVEK QKLLLKKRHI DQLKEREYNN
       660        670        680        690        700
YSWQDLVASG VVEYIDTLEE ETVMLAMTPD DLQEKEVAYC STYTHCEIHP
       710        720        730        740        750
SMILGVCASI IPFPDHNQSP RNTYQSAMGK QAMGVYITNF HVRMDTLAHV
       760        770        780        790        800
LYYPQKPLVT TRSMEYLRFR ELPAGINSIV AIASYTGYNQ EDSVIMNRSA
       810        820        830        840        850
VDRGFFRSVF YRSYKEQESK KGFDQEEVFE KPTRETCQGM RHAIYDKLDD
       860        870        880        890        900
DGLIAPGVRV SGDDVIIGKT VTLPENEDEL ESTNRRYTKR DCSTFLRTSE
       910        920        930        940        950
TGIVDQVMVT LNQEGYKFCK IRVRSVRIPQ IGDKFASRHG QKGTCGIQYR
       960        970        980        990       1000
QEDMPFTCEG ITPDIIINPH AIPSRMTIGH LIECLQGKVS ANKGEIGDAT
      1010       1020       1030       1040       1050
PFNDAVNVQK ISNLLSDYGY HLRGNEVLYN GFTGRKITSQ IFIGPTYYQR
      1060       1070       1080       1090       1100
LKHMVDDKIH SRARGPIQIL NRQPMEGRSR DGGLRFGEME RDCQIAHGAA
      1110       1120       1130       1140       1150
QFLRERLFEA SDPYQVHVCN LCGIMAIANT RTHTYECRGC RNKTQISLVR
      1160       1170
MPYACKLLFQ ELMSMSIAPR MMSV
Length: 1,174
Mass (Da): 133,897
Last modified: July 1, 1993 - v1
Checksum: 32BEDF7F95E4DE10
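
The sequence above is laid out in rows of 50 residues, split into groups of ten, with a ruler row of positions above each sequence row. A minimal sketch of turning that layout back into a plain residue string follows; the function name `parse_sequence_block` is an assumption made for this example and is not part of any UniProt tooling. To stay short, it is demonstrated on the last two rows only (positions 1101 to 1174).

```python
def parse_sequence_block(block: str) -> str:
    """Join sequence rows into one string, skipping position rulers."""
    residues = []
    for line in block.splitlines():
        tokens = line.split()
        # Ruler rows contain only numbers (10, 20, ...); skip them and blank rows.
        if not tokens or all(tok.isdigit() for tok in tokens):
            continue
        residues.append("".join(tokens))
    return "".join(residues)


# The last two rows of the sequence block, each with its ruler row.
tail = """
1110 1120 1130 1140 1150
QFLRERLFEA SDPYQVHVCN LCGIMAIANT RTHTYECRGC RNKTQISLVR
1160 1170
MPYACKLLFQ ELMSMSIAPR MMSV
"""

tail_sequence = parse_sequence_block(tail)

# 1,100 residues precede this excerpt, so the total should match "Length".
print(1100 + len(tail_sequence))  # 1174
```

Because the parser splits on any whitespace, it works whether the ruler rows are right-aligned or collapsed to single spaces.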

Sequence databases

EMBL / GenBank / DDBJ:
X63563 mRNA. Translation: CAA45124.1.
AK289823 mRNA. Translation: BAF82512.1.
CH471057 Genomic DNA. Translation: EAX05519.1.
BC023503 mRNA. Translation: AAH23503.2.
AF055028 mRNA. Translation: AAC09367.1.
CCDS: CCDS3511.1.
PIR: S28976.
RefSeq: NP_000929.1. NM_000938.2.
NP_001290197.1. NM_001303268.1.
NP_001290198.1. NM_001303269.1.
UniGene: Hs.602757.

Genome annotation databases

Ensembl: ENST00000314595; ENSP00000312735; ENSG00000047315.
ENST00000381227; ENSP00000370625; ENSG00000047315.
GeneID: 5431.
KEGG: hsa:5431.
UCSC: uc003hcl.1. human.

Cross-references

Organism-specific databases

CTD: 5431.
DisGeNET: 5431.
GeneCards: POLR2B.
HGNC: HGNC:9188. POLR2B.
HPA: HPA037506.
MIM: 180661. gene.
neXtProt: NX_P30876.
OpenTargets: ENSG00000047315.
PharmGKB: PA33508.

Miscellaneous databases

ChiTaRS: POLR2B. human.
GeneWiki: POLR2B.
GenomeRNAi: 5431.
PRO: P30876.

Entry information

Entry name: RPB2_HUMAN
Accession: Primary (citable) accession number: P30876
Secondary accession number(s): A8K1A8, Q8IZ61
Entry history:
Integrated into UniProtKB/Swiss-Prot: July 1, 1993
Last sequence update: July 1, 1993
Last modified: November 30, 2016
This is version 171 of the entry and version 1 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

Miscellaneous

The binding of ribonucleoside triphosphate to the RNA polymerase II transcribing complex probably involves a two-step mechanism. The initial binding seems to occur at the entry (E) site and involves a magnesium ion coordinated by three conserved aspartate residues of the two largest RNA Pol II subunits (By similarity).

Keywords - Technical term

3D-structure, Complete proteome, Reference proteome

Documents

  1. Human chromosome 4
    Human chromosome 4: entries, gene names and cross-references to MIM
  2. MIM cross-references
    Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot
  3. PDB cross-references
    Index of Protein Data Bank (PDB) cross-references
  4. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.
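
The UniRef90 and UniRef50 membership criteria described above reduce to two thresholds each: a minimum sequence identity and a minimum overlap with the seed sequence. The sketch below only encodes those stated thresholds as a simple check; it is not the clustering procedure UniProt runs, and the function name `could_join_cluster` and the example identity and overlap values are hypothetical.

```python
# Thresholds as stated in the text: (minimum identity, minimum overlap with seed),
# both expressed as fractions between 0 and 1.
THRESHOLDS = {
    "UniRef90": (0.90, 0.80),
    "UniRef50": (0.50, 0.80),
}


def could_join_cluster(level: str, identity: float, overlap: float) -> bool:
    """Check a candidate against the identity and overlap cut-offs for a level.

    Hypothetical helper for illustration; how identity and overlap are
    computed is outside the scope of this sketch.
    """
    min_identity, min_overlap = THRESHOLDS[level]
    return identity >= min_identity and overlap >= min_overlap


# Made-up example values, not measurements for this entry.
print(could_join_cluster("UniRef90", identity=0.93, overlap=0.85))  # True
print(could_join_cluster("UniRef90", identity=0.93, overlap=0.60))  # False
print(could_join_cluster("UniRef50", identity=0.55, overlap=0.80))  # True
```

UniRef100 is left out of the table because it is defined by identical sequences and sub-fragments of 11 or more residues rather than by an identity cut-off.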