Protein

Nucleoporin p54

Gene

NUP54

Organism
Homo sapiens (Human)
Status
Reviewed - Annotation score: 5 out of 5 - Experimental evidence at protein level

Function

Component of the nuclear pore complex, a complex required for trafficking across the nuclear membrane (by similarity).

Keywords - Biological process

mRNA transport, Protein transport, Translocation, Transport

Enzyme and pathway databases

Reactome: R-HSA-1169408. ISG15 antiviral mechanism.
R-HSA-159236. Transport of Mature mRNA derived from an Intron-Containing Transcript.
R-HSA-165054. Rev-mediated nuclear export of HIV RNA.
R-HSA-168276. NS1 Mediated Effects on Host Pathways.
R-HSA-168325. Viral Messenger RNA Synthesis.
R-HSA-170822. Regulation of Glucokinase by Glucokinase Regulatory Protein.
R-HSA-180746. Nuclear import of Rev protein.
R-HSA-180910. Vpr-mediated nuclear import of PICs.
R-HSA-3108214. SUMOylation of DNA damage response and repair proteins.
R-HSA-3301854. Nuclear Pore Complex (NPC) Disassembly.
R-HSA-3371453. Regulation of HSF1-mediated heat shock response.
R-HSA-4570464. SUMOylation of RNA binding proteins.
R-HSA-4615885. SUMOylation of DNA replication proteins.
R-HSA-5578749. Transcriptional regulation by small RNAs.
R-HSA-6784531. tRNA processing in the nucleus.

Protein family/group databases

TCDB: 1.I.1.1.3. The nuclear pore complex (NPC) family.

Names & Taxonomy

Protein names
Recommended name: Nucleoporin p54
Alternative name(s): 54 kDa nucleoporin
Gene names
Name: NUP54
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes
  • UP000005640 Component: Chromosome 4

Organism-specific databases

HGNC: HGNC:17359. NUP54.

Subcellular location

Keywords - Cellular component

Membrane, Nuclear pore complex, Nucleus

Pathology & Biotech

Organism-specific databases

PharmGKB: PA31853.

PTM / Processing

Molecule processing

Feature key   Position(s)   Length   Description       Feature identifier
Chain         1 – 507       507      Nucleoporin p54   PRO_0000204874

Post-translational modification

O-glycosylated (by similarity).

Keywords - PTM

Glycoprotein

Proteomic databases

EPD: Q7Z3B4.
MaxQB: Q7Z3B4.
PaxDb: Q7Z3B4.
PRIDE: Q7Z3B4.

PTM databases

iPTMnet: Q7Z3B4.
PhosphoSite: Q7Z3B4.

Expression

Gene expression databases

Bgee: Q7Z3B4.
CleanEx: HS_NUP54.
ExpressionAtlas: Q7Z3B4. baseline and differential.
Genevisible: Q7Z3B4. HS.

Organism-specific databases

HPA: HPA035929.
HPA058122.

Interaction

Subunit structure

Component of the p62 complex, a complex composed of NUP62, NUP54, and isoforms p58 and p45 of NUP58 (by similarity).

Binary interactions

With    Entry    #Exp.   IntAct                     Notes
HAUS1   Q96CS2   3       EBI-741048,EBI-2514791
HGS     O14964   3       EBI-741048,EBI-740220
IFT20   Q8IY31   3       EBI-741048,EBI-744203
KRT15   P19012   3       EBI-741048,EBI-739566
NUP58   Q9BVL2   4       EBI-741048,EBI-2811583
NUP62   P37198   5       EBI-741048,EBI-347978
PA      Q67020   2       EBI-741048,EBI-11514477    From a different organism.

Protein-protein interaction databases

BioGrid: 119759. 48 interactions.
IntAct: Q7Z3B4. 22 interactions.
MINT: MINT-1477978.
STRING: 9606.ENSP00000264883.

Structure

Secondary structure

Feature key   Position(s)   Length   Evidence
Beta strand   454 – 456     3        Combined sources
Helix         457 – 487     31       Combined sources
Helix         488 – 490     3        Combined sources

3D structure databases

Entry   Method                Resolution (Å)   Chain     Positions
4JNU    X-ray                 1.44             A/B/C/D   453-491
4JNV    X-ray                 1.85             A/B/C/D   453-491
4JO7    X-ray                 1.75             B/D/F/H   453-491
4JO9    X-ray                 2.50             A/C       453-491
5IJN    electron microscopy   21.40            F/L/R/X   1-507
5IJO    electron microscopy   21.40            F/L/R/X   1-507

ProteinModelPortal: Q7Z3B4.
SMR: Q7Z3B4. Positions 186-422, 453-491.

Family & Domains

Domains and Repeats

Feature key   Position(s)   Length   Description
Repeat        5 – 6         2        1
Repeat        25 – 26       2        2
Repeat        28 – 29       2        3
Repeat        53 – 54       2        4
Repeat        61 – 62       2        5
Repeat        63 – 64       2        6
Repeat        67 – 68       2        7
Repeat        87 – 88       2        8
Repeat        444 – 445     2        9

Region

Feature key   Position(s)   Length   Description
Region        5 – 445       441      9 X 2 AA repeats of F-G

Compositional bias

Feature key          Position(s)   Length   Description
Compositional bias   6 – 89        84       Gly-rich
Compositional bias   30 – 99       70       Thr-rich
Compositional bias   93 – 97       5        Poly-Gln
Compositional bias   205 – 208     4        Poly-Gln

Domain

Contains FG repeats.
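
The repeat coordinates listed in this entry can be cross-checked against the canonical sequence. The following is a minimal sketch in Python, assuming only the first 100 residues of isoform 1 as given in this entry; the constant `N_TERMINUS` and the helper `find_motif` are illustrative names and not part of any UniProt tool.

```python
# First 100 residues of the canonical isoform (Q7Z3B4-1), copied from this entry.
N_TERMINUS = (
    "MAFNFGAPSG" "TSGTAAATAA" "PAGGFGGFGT" "TSTTAGSAFS" "FSAPTNTGTT"
    "GLFGGTQNKG" "FGFGTGFGTT" "TGTSTGLGTG" "LGTGLGFGGF" "NTQQQQQTTL"
)


def find_motif(sequence, motif="FG"):
    """Return 1-based inclusive (start, end) positions of every motif occurrence."""
    hits = []
    index = sequence.find(motif)
    while index != -1:
        # Convert the 0-based start index to 1-based inclusive coordinates.
        hits.append((index + 1, index + len(motif)))
        index = sequence.find(motif, index + 1)
    return hits


for start, end in find_motif(N_TERMINUS):
    print(f"FG repeat at {start} - {end}")
```

On this excerpt the scan reports the first eight annotated repeats (5-6 through 87-88). The ninth repeat, at 444-445, lies outside the first 100 residues and would need the full 507-residue sequence.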

Sequence similarities

Belongs to the NUP54 family (curated).

Keywords - Domain

Repeat

Phylogenomic databases

eggNOG: KOG3091. Eukaryota.
ENOG410XPU5. LUCA.
GeneTree: ENSGT00390000013620.
HOGENOM: HOG000046583.
HOVERGEN: HBG052698.
InParanoid: Q7Z3B4.
KO: K14308.
OMA: EQANIKT.
PhylomeDB: Q7Z3B4.
TreeFam: TF320237.

Family and domain databases

InterPro: IPR024864. Nup54/Nup57/Nup44.
IPR025712. Nup54_alpha-helical_dom.
PANTHER: PTHR13000. PTHR13000. 1 hit.
Pfam: PF13874. Nup54. 1 hit.

Sequences (3)

Sequence status: Complete.

This entry describes 3 isoforms produced by alternative splicing.

Isoform 1 (identifier: Q7Z3B4-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MAFNFGAPSG TSGTAAATAA PAGGFGGFGT TSTTAGSAFS FSAPTNTGTT
        60         70         80         90        100
GLFGGTQNKG FGFGTGFGTT TGTSTGLGTG LGTGLGFGGF NTQQQQQTTL
       110        120        130        140        150
GGLFSQPTQA PTQSNQLINT ASALSAPTLL GDERDAILAK WNQLQAFWGT
       160        170        180        190        200
GKGYFNNNIP PVEFTQENPF CRFKAVGYSC MPSNKDEDGL VVLVFNKKET
       210        220        230        240        250
EIRSQQQQLV ESLHKVLGGN QTLTVNVEGT KTLPDDQTEV VIYVVERSPN
       260        270        280        290        300
GTSRRVPATT LYAHFEQANI KTQLQQLGVT LSMTRTELSP AQIKQLLQNP
       310        320        330        340        350
PAGVDPIIWE QAKVDNPDSE KLIPVPMVGF KELLRRLKVQ DQMTKQHQTR
       360        370        380        390        400
LDIISEDISE LQKNQTTSVA KIAQYKRKLM DLSHRTLQVL IKQEIQRKSG
       410        420        430        440        450
YAIQADEEQL RVQLDTIQGE LNAPTQFKGR LNELMSQIRM QNHFGAVRSE
       460        470        480        490        500
ERYYIDADLL REIKQHLKQQ QEGLSHLISI IKDDLEDIKL VEHGLNETIH
IRGGVFS
Length: 507
Mass (Da): 55,435
Last modified: October 31, 2003 - v2
Checksum: B5D235E01F1FE467
Isoform 2 (identifier: Q7Z3B4-2)

The sequence of this isoform differs from the canonical sequence as follows:
     1-180: Missing.
     353-388: Missing.

Note: No experimental confirmation available.
Length: 291
Mass (Da): 33,180
Checksum: 442CA17A5CBC6D3E
Isoform 3 (identifier: Q7Z3B4-3)

The sequence of this isoform differs from the canonical sequence as follows:
     51-99: GLFGGTQNKGFGFGTGFGTTTGTSTGLGTGLGTGLGFGGFNTQQQQQTT → A

Note: No experimental confirmation available.
Length: 459
Mass (Da): 50,775
Checksum: D35813EFEA5AC035
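
The isoform lengths follow directly from the listed differences: each "Missing" range removes end - start + 1 residues, and a substitution removes the range and adds the replacement. The sketch below, in Python, works through that arithmetic. The function names `apply_variants` and `isoform_length` and the tuple layout `(start, end, replacement)` are assumptions made for illustration, with 1-based inclusive coordinates as used in this entry.

```python
def apply_variants(canonical, variants):
    """Apply (start, end, replacement) edits, 1-based inclusive, to a sequence.

    Assumes the edited ranges do not overlap.
    """
    pieces = []
    cursor = 0  # 0-based index of the next canonical residue to copy
    for start, end, replacement in sorted(variants):
        pieces.append(canonical[cursor:start - 1])  # unchanged stretch before the edit
        pieces.append(replacement)                  # empty string means "Missing"
        cursor = end
    pieces.append(canonical[cursor:])
    return "".join(pieces)


def isoform_length(canonical_length, variants):
    """Length after applying the edits, without needing the sequence itself."""
    removed = sum(end - start + 1 for start, end, _ in variants)
    added = sum(len(replacement) for _, _, replacement in variants)
    return canonical_length - removed + added


# Differences from the canonical sequence, as listed for each isoform.
ISOFORM_2 = [(1, 180, ""), (353, 388, "")]
ISOFORM_3 = [(51, 99, "A")]

print(isoform_length(507, ISOFORM_2))  # 507 - 180 - 36 = 291
print(isoform_length(507, ISOFORM_3))  # 507 - 49 + 1 = 459
```

Both results agree with the lengths stated for isoforms 2 and 3. Passing the full canonical sequence to `apply_variants` would give the isoform sequences themselves.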

Sequence caution

The sequence BAA91735.1 differs from that shown. Reason: Frameshift at position 479 (curated).

Experimental Info

Feature key         Position(s)   Length   Description
Sequence conflict   22 – 25       4        AGGF → GWV in AAF67488 (PubMed:10931946). Curated
Sequence conflict   226 – 226     1        N → S in BAA91735 (PubMed:14702039). Curated
Sequence conflict   265 – 277     13       FEQAN…QLQQL → LNKPYKNTIAAT in AAF67488 (PubMed:10931946). Curated
Sequence conflict   322 – 322     1        L → V in AAF67488 (PubMed:10931946). Curated
Sequence conflict   363 – 363     1        K → T in AAF67488 (PubMed:10931946). Curated
Sequence conflict   433 – 433     1        E → G in AAF67488 (PubMed:10931946). Curated
Sequence conflict   507 – 507     1        S → G in CAD97957 (PubMed:17974005). Curated

Alternative sequence

Feature key            Position(s)   Length   Description                                   Feature identifier
Alternative sequence   1 – 180       180      Missing in isoform 2. 1 Publication           VSP_008740
Alternative sequence   51 – 99       49       GLFGG…QQQTT → A in isoform 3. 1 Publication   VSP_054355
Alternative sequence   353 – 388     36       Missing in isoform 2. 1 Publication           VSP_008741

Sequence databases

EMBL/GenBank/DDBJ: AF157322 mRNA. Translation: AAF67488.1.
AK001517 mRNA. Translation: BAA91735.1. Frameshift.
AK300036 mRNA. Translation: BAG61847.1.
AK315160 mRNA. Translation: BAG37604.1.
BX538002 mRNA. Translation: CAD97957.1.
AC110795 Genomic DNA. No translation available.
AC112719 Genomic DNA. No translation available.
CH471057 Genomic DNA. Translation: EAX05777.1.
CCDS: CCDS3576.1. [Q7Z3B4-1]
CCDS63998.1. [Q7Z3B4-3]
RefSeq: NP_001265532.1. NM_001278603.1. [Q7Z3B4-3]
NP_059122.2. NM_017426.3. [Q7Z3B4-1]
UniGene: Hs.430435.

Genome annotation databases

Ensembl: ENST00000264883; ENSP00000264883; ENSG00000138750. [Q7Z3B4-1]
ENST00000514987; ENSP00000421304; ENSG00000138750. [Q7Z3B4-3]
GeneID: 53371.
KEGG: hsa:53371.
UCSC: uc003hjs.5. human. [Q7Z3B4-1]

Keywords - Coding sequence diversity

Alternative splicing

Cross-references

Protocols and materials databases

DNASU: 53371.

Organism-specific databases

CTD: 53371.
GeneCards: NUP54.
H-InvDB: HIX0004305.
HGNC: HGNC:17359. NUP54.
HPA: HPA035929.
HPA058122.
MIM: 607607. gene.
neXtProt: NX_Q7Z3B4.
PharmGKB: PA31853.

Miscellaneous databases

GeneWiki: NUP54.
GenomeRNAi: 53371.
PRO: Q7Z3B4.

Publications

  1. Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORM 1).
    Tissue: Pituitary.
  2. "Complete sequencing and characterization of 21,243 full-length human cDNAs."
    Ota T., Suzuki Y., Nishikawa T., Otsuki T., Sugiyama T., Irie R., Wakamatsu A., Hayashi K., Sato H., Nagai K., Kimura K., Makita H., Sekine M., Obayashi M., Nishi T., Shibahara T., Tanaka T., Ishii S., Yamamoto J., Saito K., Kawai Y., Isono Y., Nakamura Y., Nagahari K., Murakami K., Yasuda T., Iwayanagi T., Wagatsuma M., Shiratori A., Sudo H., Hosoiri T., Kaku Y., Kodaira H., Kondo H., Sugawara M., Takahashi M., Kanda K., Yokoi T., Furuya T., Kikkawa E., Omura Y., Abe K., Kamihara K., Katsuta N., Sato K., Tanikawa M., Yamazaki M., Ninomiya K., Ishibashi T., Yamashita H., Murakawa K., Fujimori K., Tanai H., Kimata M., Watanabe M., Hiraoka S., Chiba Y., Ishida S., Ono Y., Takiguchi S., Watanabe S., Yosida M., Hotuta T., Kusano J., Kanehori K., Takahashi-Fujii A., Hara H., Tanase T.-O., Nomura Y., Togiya S., Komai F., Hara R., Takeuchi K., Arita M., Imose N., Musashino K., Yuuki H., Oshima A., Sasaki N., Aotsuka S., Yoshikawa Y., Matsunawa H., Ichihara T., Shiohata N., Sano S., Moriya S., Momiyama H., Satoh N., Takami S., Terashima Y., Suzuki O., Nakagawa S., Senoh A., Mizoguchi H., Goto Y., Shimizu F., Wakebe H., Hishigaki H., Watanabe T., Sugiyama A., Takemoto M., Kawakami B., Yamazaki M., Watanabe K., Kumagai A., Itakura S., Fukuzumi Y., Fujimori Y., Komiyama M., Tashiro H., Tanigami A., Fujiwara T., Ono T., Yamada K., Fujii Y., Ozaki K., Hirao M., Ohmori Y., Kawabata A., Hikiji T., Kobatake N., Inagaki H., Ikema Y., Okamoto S., Okitani R., Kawakami T., Noguchi S., Itoh T., Shigeta K., Senba T., Matsumura K., Nakajima Y., Mizuno T., Morinaga M., Sasaki M., Togashi T., Oyama M., Hata H., Watanabe M., Komatsu T., Mizushima-Sugano J., Satoh T., Shirai Y., Takahashi Y., Nakagawa K., Okumura K., Nagase T., Nomura N., Kikuchi H., Masuho Y., Yamashita R., Nakai K., Yada T., Nakamura Y., Ohara O., Isogai T., Sugano S.
    Nat. Genet. 36:40-45(2004) [PubMed] [Europe PMC] [Abstract]
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORMS 1 AND 3).
    Tissue: Teratocarcinoma.
  3. Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORM 2).
    Tissue: Endometrial tumor.
  4. "Generation and annotation of the DNA sequences of human chromosomes 2 and 4."
    Hillier L.W., Graves T.A., Fulton R.S., Fulton L.A., Pepin K.H., Minx P., Wagner-McPherson C., Layman D., Wylie K., Sekhon M., Becker M.C., Fewell G.A., Delehaunty K.D., Miner T.L., Nash W.E., Kremitzki C., Oddy L., Du H., Sun H., Bradshaw-Cordum H., Ali J., Carter J., Cordes M., Harris A., Isak A., van Brunt A., Nguyen C., Du F., Courtney L., Kalicki J., Ozersky P., Abbott S., Armstrong J., Belter E.A., Caruso L., Cedroni M., Cotton M., Davidson T., Desai A., Elliott G., Erb T., Fronick C., Gaige T., Haakenson W., Haglund K., Holmes A., Harkins R., Kim K., Kruchowski S.S., Strong C.M., Grewal N., Goyea E., Hou S., Levy A., Martinka S., Mead K., McLellan M.D., Meyer R., Randall-Maher J., Tomlinson C., Dauphin-Kohlberg S., Kozlowicz-Reilly A., Shah N., Swearengen-Shahid S., Snider J., Strong J.T., Thompson J., Yoakum M., Leonard S., Pearman C., Trani L., Radionenko M., Waligorski J.E., Wang C., Rock S.M., Tin-Wollam A.-M., Maupin R., Latreille P., Wendl M.C., Yang S.-P., Pohl C., Wallis J.W., Spieth J., Bieri T.A., Berkowicz N., Nelson J.O., Osborne J., Ding L., Meyer R., Sabo A., Shotland Y., Sinha P., Wohldmann P.E., Cook L.L., Hickenbotham M.T., Eldred J., Williams D., Jones T.A., She X., Ciccarelli F.D., Izaurralde E., Taylor J., Schmutz J., Myers R.M., Cox D.R., Huang X., McPherson J.D., Mardis E.R., Clifton S.W., Warren W.C., Chinwalla A.T., Eddy S.R., Marra M.A., Ovcharenko I., Furey T.S., Miller W., Eichler E.E., Bork P., Suyama M., Torrents D., Waterston R.H., Wilson R.K.
    Nature 434:724-731(2005) [PubMed] [Europe PMC] [Abstract]
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
  5. Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
  6. Cited for: IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].

Entry information

Entry name: NUP54_HUMAN
Accession: Primary (citable) accession number: Q7Z3B4
Secondary accession number(s): B2RCK7, B4DT35, Q96EA7, Q9NVL5, Q9P0I1
Entry history
Integrated into UniProtKB/Swiss-Prot: October 31, 2003
Last sequence update: October 31, 2003
Last modified: June 8, 2016
This is version 124 of the entry and version 2 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

Keywords - Technical term

3D-structure, Complete proteome, Reference proteome

Documents

  1. Human chromosome 4
    Human chromosome 4: entries, gene names and cross-references to MIM
  2. MIM cross-references
    Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot
  3. PDB cross-references
    Index of Protein Data Bank (PDB) cross-references
  4. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.
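
The UniRef90 and UniRef50 criteria above reduce to two thresholds: a minimum identity to the seed sequence and a minimum overlap with it. The following Python sketch encodes only those stated thresholds. It is a simplified illustration and not the actual UniRef clustering procedure; the function `meets_threshold` and the assumption that identity and overlap arrive as fractions between 0 and 1 are hypothetical.

```python
# (minimum identity, minimum overlap) relative to the seed sequence, as stated above.
THRESHOLDS = {
    "UniRef90": (0.90, 0.80),
    "UniRef50": (0.50, 0.80),
}


def meets_threshold(level, identity, overlap):
    """Check a pairwise comparison against the stated cut-offs for one UniRef level."""
    min_identity, min_overlap = THRESHOLDS[level]
    return identity >= min_identity and overlap >= min_overlap


# A sequence 93% identical to the seed over 85% of its length qualifies for UniRef90.
print(meets_threshold("UniRef90", 0.93, 0.85))
# The same identity with only 70% overlap fails the overlap criterion.
print(meets_threshold("UniRef90", 0.93, 0.70))
```

UniRef100 is left out of the sketch because its rule (identical sequences and sub-fragments of 11 or more residues) is not a threshold on identity and overlap.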