Protein

Poly(A) polymerase alpha

Gene

PAPOLA

Organism
Bos taurus (Bovine)
Status
Reviewed. Annotation score: 5 out of 5. Experimental evidence at protein level.

Function

Polymerase that creates the 3'-poly(A) tail of mRNAs. Also required for the endoribonucleolytic cleavage reaction at some polyadenylation sites. May acquire specificity through interaction with a cleavage and polyadenylation specificity factor (CPSF) at its C-terminus. (By similarity; 1 publication)

Catalytic activity

ATP + RNA(n) = diphosphate + RNA(n+1). (1 publication)

Cofactor

Mg2+ (1 publication), Mn2+ (1 publication). Note: Binds 2 magnesium ions. Also active with manganese. (1 publication)

Kinetics

  1. KM = 0.229 mM for ATP (1 publication)
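To give the KM value above a practical reading, here is a minimal sketch that evaluates the fraction of maximal velocity at a given ATP concentration. It assumes simple Michaelis-Menten behaviour with respect to ATP, which the entry does not state explicitly; the function name `fractional_velocity` is hypothetical, and only the KM value is taken from the entry.

```python
# KM for ATP as reported in this entry (mM).
KM_ATP_MM = 0.229


def fractional_velocity(atp_mm: float, km_mm: float = KM_ATP_MM) -> float:
    """Return v / Vmax for a given ATP concentration (mM).

    Assumes plain Michaelis-Menten kinetics: v / Vmax = [S] / (KM + [S]).
    """
    if atp_mm < 0:
        raise ValueError("concentration must be non-negative")
    return atp_mm / (km_mm + atp_mm)


if __name__ == "__main__":
    for atp in (0.05, 0.229, 1.0, 5.0):
        print(f"[ATP] = {atp} mM -> v/Vmax = {fractional_velocity(atp):.3f}")
```

Under this assumption, the enzyme runs at half of its maximal rate when the ATP concentration equals KM.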

    Sites

    Feature key     Position(s)   Length   Description
    Binding site    109           1        ATP
    Metal binding   113           1        Magnesium 1; catalytic
    Metal binding   113           1        Magnesium 2; catalytic
    Metal binding   115           1        Magnesium 1; catalytic
    Metal binding   115           1        Magnesium 2; catalytic
    Metal binding   167           1        Magnesium 2; catalytic
    Binding site    167           1        ATP
    Binding site    228           1        ATP
    Binding site    237           1        ATP

    Regions

    Feature key          Position(s)   Length   Description
    Nucleotide binding   100 – 102     3        ATP
    Nucleotide binding   113 – 115     3        ATP
    Nucleotide binding   246 – 247     2        ATP

    GO - Molecular function

    • ATP binding (Source: UniProtKB)
    • magnesium ion binding (Source: UniProtKB)
    • manganese ion binding (Source: UniProtKB)
    • polynucleotide adenylyltransferase activity (Source: UniProtKB)
    • RNA binding (Source: UniProtKB-KW)

    GO - Biological process

    • mRNA polyadenylation (Source: UniProtKB)
    • regulation of mRNA 3'-end processing (Source: UniProtKB)
    • RNA polyadenylation (Source: UniProtKB)

    Keywords - Molecular function

    Transferase

    Keywords - Biological process

    mRNA processing

    Keywords - Ligand

    ATP-binding, Magnesium, Manganese, Metal-binding, Nucleotide-binding, RNA-binding

    Enzyme and pathway databases

    BRENDA: 2.7.7.19. 908.
    Reactome: R-BTA-109688. Cleavage of Growing Transcript in the Termination Region.
              R-BTA-72163. mRNA Splicing - Major Pathway.
              R-BTA-72187. mRNA 3'-end processing.
              R-BTA-77595. Processing of Intronless Pre-mRNAs.
    SABIO-RK: P25500.

    Names & Taxonomy

    Protein names
    Recommended name:
    Poly(A) polymerase alpha (EC:2.7.7.19)
    Short name:
    PAP-alpha
    Alternative name(s):
    Polynucleotide adenylyltransferase alpha
    Cleaved into the following chain:
    Poly(A) polymerase alpha, N-terminally processed
    Gene names
    Name: PAPOLA
    Synonyms: PAP
    Organism: Bos taurus (Bovine)
    Taxonomic identifier: 9913 [NCBI]
    Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Laurasiatheria > Cetartiodactyla > Ruminantia > Pecora > Bovidae > Bovinae > Bos
    Proteomes
    • UP000009136 Component: Chromosome 21

    Subcellular location

    • Nucleus (1 publication)

    Keywords - Cellular component

    Nucleus

    Pathology & Biotech

    Mutagenesis

    Position(s)   Length   Description
    100           1        F → D: Strongly decreased enzyme activity. Strongly reduced affinity for RNA. (1 publication)
    113 – 115     3        DID → AIA: Abolishes most of the specific and non-specific polyadenylation activity.
    113           1        D → H: Abolishes most of the specific and non-specific polyadenylation activity. (1 publication)
    115           1        D → H: Abolishes most of the specific and non-specific polyadenylation activity. (1 publication)
    153           1        F → A: Strongly reduced affinity for RNA. (1 publication)
    156           1        V → A: Strongly decreased enzyme activity. Strongly reduced affinity for RNA. (1 publication)
    162 – 163     2        DG → HA: Small decrease in non-specific and specific polyadenylation activity. (1 publication)
    167           1        D → A: Loss of enzyme activity. (2 publications)
    167           1        D → H: Abolishes most of the specific and non-specific polyadenylation activity. (2 publications)
    167           1        D → N: Strongly decreased enzyme activity. Strongly reduced affinity for RNA. (2 publications)
    186           1        D → H: Small decrease in non-specific and specific polyadenylation activity. (1 publication)
    194           1        D → H: No change in non-specific and specific polyadenylation activity. (1 publication)
    199           1        R → A: Strongly reduced affinity for RNA. (1 publication)
    202           1        N → A: Strongly decreased enzyme activity. Strongly reduced affinity for RNA. (1 publication)
    203           1        G → H: Loss of enzyme activity. Strongly reduced affinity for RNA. (1 publication)
    208 – 209     2        DE → AA: Reduces non-specific and specific polyadenylation activity by 60%.
    208           1        D → A: Reduces non-specific and specific polyadenylation activity by 60%. (1 publication)
    208           1        D → H: Reduces non-specific and specific polyadenylation activity by 20%. (1 publication)
    209           1        E → A: No change in non-specific and specific polyadenylation activity. (1 publication)
    228           1        K → A: Strongly decreased affinity for ATP. (1 publication)
    232           1        K → A: Decreased affinity for ATP. (1 publication)
    237           1        Y → A: Strongly decreased affinity for ATP. (1 publication)
    247           1        V → A or R: Strongly reduced affinity for RNA. (1 publication)
    291 – 292     2        EE → AA: Abolishes most of the non-specific polyadenylation activity.
    291           1        E → A: Reduces non-specific polyadenylation activity by 60%. (1 publication)
    292           1        E → A: No change in non-specific polyadenylation activity. (1 publication)
    308           1        D → A: No change in non-specific and specific polyadenylation activity.
    317           1        T → G: Strongly decreased affinity for ATP.
    431 – 432     2        EE → AA: No change in non-specific and specific polyadenylation activity. (1 publication)
    455           1        D → A: Reduces non-specific polyadenylation activity by 30%. (1 publication)
    459           1        D → A: No change in non-specific polyadenylation activity. (1 publication)
    465           1        D → A: No change in non-specific and specific polyadenylation activity. (1 publication)
    635           1        K → Q: Weak binding to KPNB1. Cytoplasmic location; when associated with Q-644; Q-730 and Q-734. (1 publication)
    635           1        K → R: Some decrease in acetylation. Binds KPNB1 and localizes to the nucleus; when associated with R-644; R-730 and R-734. (1 publication)
    644           1        K → Q: Weak binding to KPNB1. Cytoplasmic location; when associated with Q-635; Q-730 and Q-734. (1 publication)
    644           1        K → R: Large decrease in acetylation. Binds KPNB1 and localizes to the nucleus; when associated with R-635; R-730 and R-734. (1 publication)
    730           1        K → Q: Weak binding to KPNB1. Cytoplasmic location; when associated with Q-635; Q-644 and Q-734. (1 publication)
    730           1        K → R: Some decrease in acetylation. Binds KPNB1 and localizes to the nucleus; when associated with R-635; R-644 and R-734. (1 publication)
    734           1        K → Q: Weak binding to KPNB1. Cytoplasmic location; when associated with Q-635; Q-644 and Q-730. (1 publication)
    734           1        K → R: Some decrease in acetylation. Binds KPNB1 and localizes to the nucleus; when associated with R-635; R-644 and R-730. (1 publication)
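The single-residue substitutions in the mutagenesis table can be applied to the sequence programmatically. The sketch below is illustrative only: it embeds residues 101 to 200 of isoform Long (copied from the sequence listed in this entry) and uses a hypothetical helper, `apply_substitution`, that accepts compact strings such as `D167A`. That compact notation is an assumption of this example; the entry itself writes such a change as position 167, D → A.

```python
import re

# Residues 101-200 of isoform Long, copied from the sequence in this entry.
FRAGMENT_START = 101
FRAGMENT = (
    "GSYRLGVHTK GADIDALCVA PRHVDRSDFF TSFYDKLKLQ EEVKDLRAVE "
    "EAFVPVIKLC FDGIEIDILF ARLALQTIPE DLDLRDDSLL KNLDIRCIRS"
).replace(" ", "")


def apply_substitution(fragment: str, start: int, mutation: str) -> str:
    """Apply one substitution (e.g. 'D167A') to a fragment.

    `start` is the 1-based position of the fragment's first residue in the
    full-length protein. The wild-type residue is checked before replacing.
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognised mutation string: {mutation!r}")
    wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
    index = position - start
    if not 0 <= index < len(fragment):
        raise ValueError(f"position {position} lies outside the fragment")
    if fragment[index] != wild_type:
        raise ValueError(
            f"expected {wild_type} at {position}, found {fragment[index]}"
        )
    return fragment[:index] + replacement + fragment[index + 1:]


if __name__ == "__main__":
    mutant = apply_substitution(FRAGMENT, FRAGMENT_START, "D167A")
    print(mutant[167 - FRAGMENT_START])
```

Checking the wild-type residue first guards against off-by-one errors when converting the entry's 1-based positions to 0-based string indices.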

    PTM / Processing

    Molecule processing

    Feature key            Identifier       Position(s)   Length   Description
    Chain                  PRO_0000051611   1 – 739       739      Poly(A) polymerase alpha
    Initiator methionine   -                -             -        Removed; alternate (By similarity)
    Chain                  PRO_0000434365   2 – 739       738      Poly(A) polymerase alpha, N-terminally processed

    Amino acid modifications

    Feature key        Position(s)   Length   Description
    Modified residue   10            1        Phosphoserine (By similarity)
    Modified residue   24            1        Phosphoserine (By similarity)
    Cross-link         444           -        Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO) (Curated)
    Cross-link         445           -        Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO) (Curated)
    Cross-link         506           -        Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO) (Curated)
    Cross-link         507           -        Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO) (Curated)
    Modified residue   537           1        Phosphoserine; by MAPK (By similarity)
    Modified residue   558           1        Phosphoserine (By similarity)
    Modified residue   635           1        N6-acetyllysine (1 publication)
    Modified residue   644           1        N6-acetyllysine (1 publication)
    Modified residue   730           1        N6-acetyllysine; alternate (1 publication)
    Cross-link         730           -        Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO); alternate (By similarity)
    Modified residue   732           1        Phosphoserine (By similarity)
    Modified residue   734           1        N6-acetyllysine; alternate (1 publication)
    Cross-link         734           -        Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO); alternate (By similarity)

    Post-translational modification

    Polysumoylated. The extent of sumoylation varies with tissue and cell type. Highly sumoylated in bladder and NIH 3T3 cells. Sumoylation is required for nuclear localization and enhances PAP stability, but inhibits polymerase activity. Desumoylated by SENP1. (By similarity)
    Hyperphosphorylation on multiple CDK2 consensus and non-consensus sites in the C-terminal Ser/Thr-rich region represses PAP activity in late M-phase. Phosphorylation/dephosphorylation may regulate the interaction between PAP and CPSF. (By similarity)
    Acetylated in the C-terminus. Acetylation decreases interaction with NUDT21 and KPNB1, and inhibits nuclear localization by inhibiting binding to the importin alpha/beta complex. (1 publication)

    Keywords - PTM

    Acetylation, Isopeptide bond, Phosphoprotein, Ubl conjugation

    Proteomic databases

    PaxDb: P25500.
    PRIDE: P25500.

    Expression

    Gene expression databases

    Bgee: ENSBTAG00000004054.
    ExpressionAtlas: P25500. baseline and differential.

    Interaction

    Subunit structure

    Monomer (PubMed:10944102, PubMed:15328606). Found in a complex with CPSF1, FIP1L1 and PAPOLA. Interacts with AHCYL1 and FIP1L1; the interaction with AHCYL1 seems to increase interaction with FIP1L1 (By similarity). Interacts with NUDT21; the interaction is diminished by acetylation (PubMed:17172643). Interacts with KPNB1; the interaction promotes PAP nuclear import and is inhibited by acetylation of PAP (PubMed:17172643). (By similarity; 3 publications)

    Sites

    Feature key   Position(s)   Length   Description
    Site          153           1        Interaction with RNA (By similarity)
    Site          158           1        Interaction with RNA (By similarity)
    Site          328           1        Interaction with RNA (By similarity)
    Site          399           1        Interaction with RNA (By similarity)
    Site          524           1        Interaction with RNA (By similarity)

    Protein-protein interaction databases

    STRING: 9913.ENSBTAP00000005300.

    Structure

    Secondary structure

    Evidence for all rows: combined sources.

    Feature key   Position(s)   Length
    Beta strand   21 – 23       3
    Helix         33 – 46       14
    Helix         47 – 49       3
    Helix         55 – 82       28
    Helix         87 – 90       4
    Beta strand   96 – 100      5
    Helix         101 – 105     5
    Beta strand   114 – 120     7
    Helix         126 – 129     4
    Helix         132 – 138     7
    Beta strand   143 – 149     7
    Beta strand   152 – 154     3
    Beta strand   156 – 161     6
    Beta strand   164 – 172     9
    Beta strand   176 – 178     3
    Helix         187 – 190     4
    Helix         195 – 211     17
    Beta strand   213 – 215     3
    Helix         217 – 233     17
    Turn          239 – 242     4
    Helix         246 – 259     14
    Helix         265 – 277     13
    Turn          302 – 304     3
    Helix         306 – 310     5
    Beta strand   318 – 321     4
    Turn          325 – 328     4
    Helix         331 – 352     22
    Helix         358 – 361     4
    Helix         367 – 370     4
    Beta strand   372 – 383     12
    Helix         384 – 395     12
    Helix         398 – 406     9
    Beta strand   411 – 416     6
    Beta strand   428 – 430     3
    Beta strand   433 – 443     11
    Helix         457 – 473     17
    Beta strand   482 – 489     8
    Helix         490 – 493     4
    Helix         494 – 496     3

    3D structure databases

    PDB entry   Method   Resolution (Å)   Chain   Positions
    1F5A        X-ray    2.50             A       1-513
    1Q78        X-ray    2.80             A       1-514
    1Q79        X-ray    2.15             A       1-514

    ProteinModelPortal: P25500.
    SMR: P25500.

    Miscellaneous databases

    EvolutionaryTrace: P25500.

    Family & Domains

    Region

    Feature key   Position(s)   Length   Description
    Region        508 – 643     136      Ser/Thr-rich
    Region        671 – 739     69       Required for interaction with NUDT21 (1 publication)

    Motif

    Feature key   Position(s)   Length   Description
    Motif         490 – 507     18       Nuclear localization signal 1
    Motif         644 – 659     16       Nuclear localization signal 2
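The Length value given for each feature in this entry follows from its 1-based, inclusive position range. As a small sanity check, the sketch below recomputes lengths from position strings written the way the feature tables write them; the helper name `feature_length` is hypothetical.

```python
import re


def feature_length(position: str) -> int:
    """Return the residue count for a position string.

    Accepts a single position ('167') or an inclusive range ('508 – 643').
    Any separator between the two numbers is tolerated.
    """
    numbers = [int(n) for n in re.findall(r"\d+", position)]
    if len(numbers) == 1:
        return 1
    if len(numbers) == 2 and numbers[0] <= numbers[1]:
        # Inclusive on both ends, hence the +1.
        return numbers[1] - numbers[0] + 1
    raise ValueError(f"cannot interpret position: {position!r}")


if __name__ == "__main__":
    for pos in ("508 – 643", "671 – 739", "490 – 507", "644 – 659"):
        print(pos, "->", feature_length(pos))
```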

    Sequence similarities

    Belongs to the poly(A) polymerase family. (Curated)

    Phylogenomic databases

    eggNOG: KOG2245. Eukaryota.
            COG5186. LUCA.
    GeneTree: ENSGT00390000017928.
    HOGENOM: HOG000204376.
    HOVERGEN: HBG053502.
    InParanoid: P25500.
    KO: K14376.
    OMA: DMKIAAR.
    OrthoDB: EOG091G0571.
    TreeFam: TF300842.

    Family and domain databases

    Gene3D: 3.30.70.590. 1 hit.
    InterPro: IPR011068. NuclTrfase_I_C.
              IPR007012. PolA_pol_cen_dom.
              IPR007010. PolA_pol_RNA-bd_dom.
              IPR014492. PolyA_polymerase.
              IPR002934. Polymerase_NTP_transf_dom.
    PANTHER: PTHR10682. PTHR10682. 1 hit.
    Pfam: PF01909. NTP_transf_2. 1 hit.
          PF04928. PAP_central. 1 hit.
          PF04926. PAP_RNA-bind. 1 hit.
    PIRSF: PIRSF018425. PolyA_polymerase. 1 hit.
    SUPFAM: SSF55003. SSF55003. 1 hit.

    Sequences (2)

    Sequence status: Complete.

    Sequence processing: The displayed sequence is further processed into a mature form.

    This entry describes 2 isoforms produced by alternative splicing.

    Note: Additional isoforms seem to exist.
    Isoform Long (identifier: P25500-1)

    This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

            10         20         30         40         50
    MPFPVTTQGS QQTQPPQKHY GITSPISLAA PKETDCLLTQ KLVETLKPFG
            60         70         80         90        100
    VFEEEEELQR RILILGKLNN LVKEWIREIS ESKNLPQSVI ENVGGKIFTF
           110        120        130        140        150
    GSYRLGVHTK GADIDALCVA PRHVDRSDFF TSFYDKLKLQ EEVKDLRAVE
           160        170        180        190        200
    EAFVPVIKLC FDGIEIDILF ARLALQTIPE DLDLRDDSLL KNLDIRCIRS
           210        220        230        240        250
    LNGCRVTDEI LHLVPNIDNF RLTLRAIKLW AKRHNIYSNI LGFLGGVSWA
           260        270        280        290        300
    MLVARTCQLY PNAIASTLVH KFFLVFSKWE WPNPVLLKQP EECNLNLPVW
           310        320        330        340        350
    DPRVNPSDRY HLMPIITPAY PQQNSTYNVS VSTRMVMVEE FKQGLAITDE
           360        370        380        390        400
    ILLSKAEWSK LFEAPNFFQK YKHYIVLLAS APTEKQRLEW VGLVESKIRI
           410        420        430        440        450
    LVGSLEKNEF ITLAHVNPQS FPAPKENPDK EEFRTMWVIG LVFKKTENSE
           460        470        480        490        500
    NLSVDLTYDI QSFTDTVYRQ AINSKMFEVD MKIAAMHVKR KQLHQLLPSH
           510        520        530        540        550
    VLQKKKKHST EGVKLTPLND SSLDLSMDSD NSMSVPSPTS AMKTSPLNSS
           560        570        580        590        600
    GSSQGRNSPA PAVTAASVTN IQATEVSLPQ INSSESSGGT SSESIPQTAT
           610        620        630        640        650
    QPAISSPPKP TVSRVVSSTR LVNPPPRPSG NAAAKIPNPI VGVKRTSSPH
           660        670        680        690        700
    KEESPKKTKT EEDETSEDAN CLALSGHDKT ETKEQLDTET STTQSETIQT
           710        720        730
    ATSLLASQKT SSTDLSDIPA LPANPIPVIK NSIKLRLNR
    Length: 739
    Mass (Da): 82,441
    Last modified: January 23, 2007 - v3
    Checksum: 7C89C15E33232CFF

    Isoform Short (identifier: P25500-2)

    The sequence of this isoform differs from the canonical sequence as follows:
         663-683: Missing.
         709-710: KT → II
         711-739: Missing.

    Note: No experimental confirmation available.
    Length: 689
    Mass (Da): 77,066
    Checksum: 20BECA9A51B9ED1A

    Experimental Info

    Feature key         Position(s)   Length   Description
    Sequence conflict   80            1        S → R in CAA45031 (PubMed:1896071). (Curated)

    Alternative sequence

    Feature key            Identifier   Position(s)   Length   Description
    Alternative sequence   VSP_004524   663 – 683     21       Missing in isoform Short. (1 publication)
    Alternative sequence   VSP_004525   709 – 710     2        KT → II in isoform Short. (1 publication)
    Alternative sequence   VSP_004526   711 – 739     29       Missing in isoform Short. (1 publication)
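The three alternative-sequence features relate isoform Short to the canonical sequence, and their combined effect on length can be checked with a short sketch. The names `apply_edits` and `SHORT_ISOFORM_EDITS` are hypothetical, and the demonstration uses a placeholder string of the canonical length (only the KT pair at 709-710 mirrors the real sequence), so it verifies the coordinate arithmetic rather than reproducing the actual isoform.

```python
# (start, end, replacement) in 1-based inclusive coordinates on the
# canonical sequence, as listed in this entry for isoform Short.
SHORT_ISOFORM_EDITS = [
    (663, 683, ""),    # Missing
    (709, 710, "II"),  # KT -> II
    (711, 739, ""),    # Missing
]


def apply_edits(sequence: str, edits) -> str:
    """Apply range replacements to a sequence.

    Edits are applied from the C-terminal end backwards so that earlier
    coordinates stay valid while later regions change length.
    """
    result = sequence
    for start, end, replacement in sorted(edits, reverse=True):
        result = result[:start - 1] + replacement + result[end:]
    return result


if __name__ == "__main__":
    # Placeholder canonical sequence: 739 residues with KT at 709-710.
    placeholder = "A" * 708 + "KT" + "A" * 29
    short = apply_edits(placeholder, SHORT_ISOFORM_EDITS)
    print(len(placeholder), len(short))
```

With a 739-residue input, removing 21 and 29 residues and swapping two in place leaves 689 residues, which agrees with the length listed for isoform Short.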

    Sequence databases

    EMBL/GenBank/DDBJ: X61585 mRNA. Translation: CAA43782.1.
                       X63436 mRNA. Translation: CAA45031.1.
    PIR: S17875.
         S17925.
         S18642.
    RefSeq: NP_788820.1. NM_176647.2. [P25500-1]
    UniGene: Bt.109586.

    Genome annotation databases

    Ensembl: ENSBTAT00000005300; ENSBTAP00000005300; ENSBTAG00000004054. [P25500-1]
    GeneID: 338051.
    KEGG: bta:338051.

    Keywords - Coding sequence diversity

    Alternative splicing

    Cross-references

    Organism-specific databases

    CTD: 10914.

    Entry information

    Entry name: PAPOA_BOVIN
    Accession: Primary (citable) accession number: P25500
    Entry history:
    Integrated into UniProtKB/Swiss-Prot: May 1, 1992
    Last sequence update: January 23, 2007
    Last modified: November 2, 2016
    This is version 146 of the entry and version 3 of the sequence.
    Entry status: Reviewed (UniProtKB/Swiss-Prot)
    Annotation program: Chordata Protein Annotation Program

    Miscellaneous

    Keywords - Technical term

    3D-structure, Complete proteome, Direct protein sequencing, Reference proteome

    Documents

    1. PDB cross-references
      Index of Protein Data Bank (PDB) cross-references
    2. SIMILARITY comments
      Index of protein domains and families

    Similar proteins

    Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
    100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
    90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
    50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.