Protein

Ski oncogene

Gene

SKI

Organism
Homo sapiens (Human)
Status
Reviewed. Annotation score: 5 out of 5. Experimental evidence at protein level.

Function

May play a role in terminal differentiation of skeletal muscle cells but not in the determination of cells to the myogenic lineage. Functions as a repressor of TGF-beta signaling. (1 publication)

GO - Molecular function

  • chromatin binding Source: Ensembl
  • histone deacetylase inhibitor activity Source: BHF-UCL
  • protein domain specific binding Source: UniProtKB
  • protein kinase binding Source: UniProtKB
  • repressing transcription factor binding Source: UniProtKB
  • SMAD binding Source: UniProtKB
  • transcription corepressor activity Source: UniProtKB
  • ubiquitin protein ligase binding Source: UniProtKB
  • zinc ion binding Source: UniProtKB

GO - Biological process

Complete GO annotation...

Enzyme and pathway databases

Reactome: REACT_12034. Signaling by BMP.
  REACT_121111. Downregulation of SMAD2/3:SMAD4 transcriptional activity.
SignaLink: P12755.

Names & Taxonomy

Protein names
Recommended name:
Ski oncogene
Alternative name(s):
Proto-oncogene c-Ski
Gene names
Name: SKI
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes: UP000005640. Component: Chromosome 1

Organism-specific databases

HGNC: HGNC:10896. SKI.

Subcellular location

GO - Cellular component

  • centrosome Source: MGI
  • cytoplasm Source: BHF-UCL
  • nuclear body Source: UniProtKB
  • nucleoplasm Source: Reactome
  • nucleus Source: UniProtKB
  • PML body Source: UniProtKB
  • protein complex Source: MGI
  • transcriptional repressor complex Source: BHF-UCL
  • transcription factor complex Source: BHF-UCL
Complete GO annotation...

Keywords - Cellular component

Nucleus

Pathology & Biotech

Involvement in disease

Shprintzen-Goldberg craniosynostosis syndrome (SGS) (4 publications)

The disease is caused by mutations affecting the gene represented in this entry.

Disease description: A very rare syndrome characterized by a marfanoid habitus, craniosynostosis, characteristic dysmorphic facial features, skeletal and cardiovascular abnormalities, mental retardation, developmental delay and learning disabilities.

See also OMIM:182212
Feature key      Position(s)  Length  Description                             Feature identifier
Natural variant  21 – 21      1       L → R in SGS. (1 publication)           VAR_071170
Natural variant  28 – 28      1       S → T in SGS. (1 publication)           VAR_071659
Natural variant  31 – 31      1       S → L in SGS. (2 publications)          VAR_071171
Natural variant  32 – 32      1       L → P in SGS. (1 publication)           VAR_071172
Natural variant  32 – 32      1       L → V in SGS. (3 publications)          VAR_071173
Natural variant  34 – 34      1       G → A in SGS. (1 publication)           VAR_071660
Natural variant  34 – 34      1       G → C in SGS. (2 publications)          VAR_071174
Natural variant  34 – 34      1       G → D in SGS. (2 publications)          VAR_071175
Natural variant  34 – 34      1       G → S in SGS. (3 publications)          VAR_071176
Natural variant  34 – 34      1       G → V in SGS. (2 publications)          VAR_071177
Natural variant  35 – 35      1       P → Q in SGS. (1 publication)           VAR_071178
Natural variant  35 – 35      1       P → S in SGS. (4 publications)          VAR_071179
Natural variant  94 – 97      4       Missing in SGS. (1 publication)         VAR_071180
Natural variant  95 – 97      3       Missing in SGS. (2 publications)        VAR_071181
Natural variant  116 – 116    1       G → E in SGS. (2 publications)          VAR_071182
Natural variant  117 – 117    1       G → R in SGS. (1 publication)           VAR_071183
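
The single-residue variants listed above can be checked against the canonical sequence given in the Sequence section of this entry: each reference residue should match the residue at the stated 1-based position. The following is a minimal sketch under stated assumptions. The name `apply_substitution` and the tuple layout are illustrative choices, not part of any UniProt tooling, and only substitutions within the first 40 residues are covered (deletions are not handled).

```python
# First 40 residues of SKI_HUMAN (P12755), copied from the Sequence section of this entry.
N_TERMINUS = "MEAAAGGRGCFQPHPGLQKTLEQFHLSSMSSLGGPAAFSA"

# (1-based position, reference residue, variant residue) for the SGS
# substitutions in the table that fall within the first 40 residues.
SUBSTITUTIONS = [
    (21, "L", "R"), (28, "S", "T"), (31, "S", "L"),
    (32, "L", "P"), (32, "L", "V"),
    (34, "G", "A"), (34, "G", "C"), (34, "G", "D"), (34, "G", "S"), (34, "G", "V"),
    (35, "P", "Q"), (35, "P", "S"),
]


def apply_substitution(sequence, position, ref, alt):
    """Return `sequence` with one residue replaced (hypothetical helper).

    `position` is 1-based, as in the feature table. A ValueError is raised
    if the position is out of range or the reference residue does not match.
    """
    index = position - 1
    if not 0 <= index < len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[index] != ref:
        raise ValueError(
            f"expected {ref} at position {position}, found {sequence[index]}"
        )
    return sequence[:index] + alt + sequence[index + 1:]


if __name__ == "__main__":
    for position, ref, alt in SUBSTITUTIONS:
        mutated = apply_substitution(N_TERMINUS, position, ref, alt)
        # Show a short window around the changed residue.
        print(f"{ref}{position}{alt}: {mutated[position - 4:position + 3]}")
```

A mismatch between the table and the sequence surfaces as a `ValueError`, which makes the helper usable as a quick consistency check before any downstream analysis.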

Keywords - Disease

Craniosynostosis, Disease mutation, Proto-oncogene

Organism-specific databases

MIM: 182212. phenotype.
Orphanet: 1606. 1p36 deletion syndrome.
  2462. Shprintzen-Goldberg syndrome.
PharmGKB: PA35796.

Polymorphism and mutation databases

BioMuta: SKI.
DMDM: 134517.

PTM / Processing

Molecule processing

Feature key  Position(s)  Length  Description   Feature identifier
Chain        1 – 728      728     Ski oncogene  PRO_0000129382

Amino acid modifications

Feature key       Position(s)  Length  Description
Modified residue  480 – 480    1       Phosphoserine (1 publication)

Keywords - PTM

Phosphoprotein

Proteomic databases

MaxQB: P12755.
PaxDb: P12755.
PRIDE: P12755.

PTM databases

PhosphoSite: P12755.

Expression

Gene expression databases

Bgee: P12755.
CleanEx: HS_SKI.
Genevisible: P12755. HS.

Organism-specific databases

HPA: CAB010449.

Interaction

Subunit structure

Interacts with SMAD2, SMAD3 and SMAD4. Interacts with HIPK2. Part of a complex with HIPK2 and SMAD1/2/3. Interacts with PRDM16 and SMAD3; the interaction with PRDM16 promotes the recruitment of the SMAD3-HDAC1 complex to the promoter of TGF-beta target genes. (2 publications)

Binary interactions

With    Entry   #Exp.  IntAct                  Notes
Fam89b  Q9QUI1  6      EBI-347281,EBI-6503100  From a different organism.
NCOR1   O75376  4      EBI-347281,EBI-347233
Ncor1   Q60974  5      EBI-347281,EBI-349004   From a different organism.
SIN3A   Q96ST3  3      EBI-347281,EBI-347218
SMAD3   P84022  8      EBI-347281,EBI-347161
SMAD4   Q13485  11     EBI-347281,EBI-347263

Protein-protein interaction databases

BioGrid: 112388. 44 interactions.
DIP: DIP-31514N.
IntAct: P12755. 15 interactions.
MINT: MINT-269973.
STRING: 9606.ENSP00000367797.

Structure

Secondary structure

Feature key  Position(s)  Length  Evidence
Beta strand  101 – 105    5       Combined sources
Beta strand  108 – 115    8       Combined sources
Beta strand  118 – 122    5       Combined sources
Helix        123 – 127    5       Combined sources
Turn         128 – 133    6       Combined sources
Helix        136 – 145    10      Combined sources
Helix        155 – 163    9       Combined sources
Beta strand  175 – 178    4       Combined sources
Helix        179 – 190    12      Combined sources
Beta strand  219 – 222    4       Combined sources
Beta strand  228 – 232    5       Combined sources
Helix        234 – 236    3       Combined sources
Beta strand  245 – 247    3       Combined sources
Turn         248 – 250    3       Combined sources
Helix        256 – 259    4       Combined sources
Beta strand  269 – 275    7       Combined sources
Helix        278 – 280    3       Combined sources
Helix        281 – 284    4       Combined sources
Helix        296 – 310    15      Combined sources

3D structure databases

Entry  Method  Resolution (Å)  Chain  Positions
1MR1   X-ray   2.85            C/D    219-313
1SBX   X-ray   1.65            A      91-192
ProteinModelPortal: P12755.
SMR: P12755. Positions 91-192, 218-312.
ModBase: Search...
MobiDB: Search...

Miscellaneous databases

EvolutionaryTrace: P12755.

Family & Domains

Coiled coil

Feature key  Position(s)  Length  Description
Coiled coil  536 – 710    175     Sequence analysis

Sequence similarities

Belongs to the SKI family. (Curated)

Keywords - Domain

Coiled coil, Repeat

Phylogenomic databases

eggNOG: NOG82850.
GeneTree: ENSGT00530000063040.
HOGENOM: HOG000039989.
HOVERGEN: HBG006599.
InParanoid: P12755.
OMA: ARQIRVC.
OrthoDB: EOG72RN0F.
PhylomeDB: P12755.
TreeFam: TF324133.

Family and domain databases

Gene3D: 3.10.260.20. 1 hit.
  3.10.390.10. 1 hit.
InterPro: IPR014890. c-SKI_SMAD4-bd_dom.
  IPR009061. DNA-bd_dom_put.
  IPR010919. SAND_dom-like.
  IPR028760. Ski.
  IPR003380. Transform_Ski.
  IPR023216. Tscrpt_reg_SKI_SnoN.
PANTHER: PTHR10005. PTHR10005. 1 hit.
  PTHR10005:SF15. PTHR10005:SF15. 1 hit.
Pfam: PF08782. c-SKI_SMAD_bind. 1 hit.
  PF02437. Ski_Sno. 1 hit.
SMART: SM01046. c-SKI_SMAD_bind. 1 hit.
SUPFAM: SSF46955. SSF46955. 1 hit.
  SSF63763. SSF63763. 1 hit.

Sequence

Sequence status: Complete.

P12755-1 [UniParc]

        10         20         30         40         50
MEAAAGGRGC FQPHPGLQKT LEQFHLSSMS SLGGPAAFSA RWAQEAYKKE
        60         70         80         90        100
SAKEAGAAAV PAPVPAATEP PPVLHLPAIQ PPPPVLPGPF FMPSDRSTER
       110        120        130        140        150
CETVLEGETI SCFVVGGEKR LCLPQILNSV LRDFSLQQIN AVCDELHIYC
       160        170        180        190        200
SRCTADQLEI LKVMGILPFS APSCGLITKT DAERLCNALL YGGAYPPPCK
       210        220        230        240        250
KELAASLALG LELSERSVRV YHECFGKCKG LLVPELYSSP SAACIQCLDC
       260        270        280        290        300
RLMYPPHKFV VHSHKALENR TCHWGFDSAN WRAYILLSQD YTGKEEQARL
       310        320        330        340        350
GRCLDDVKEK FDYGNKYKRR VPRVSSEPPA SIRPKTDDTS SQSPAPSEKD
       360        370        380        390        400
KPSSWLRTLA GSSNKSLGCV HPRQRLSAFR PWSPAVSASE KELSPHLPAL
       410        420        430        440        450
IRDSFYSYKS FETAVAPNVA LAPPAQQKVV SSPPCAAAVS RAPEPLATCT
       460        470        480        490        500
QPRKRKLTVD TPGAPETLAP VAAPEEDKDS EAEVEVESRE EFTSSLSSLS
       510        520        530        540        550
SPSFTSSSSA KDLGSPGARA LPSAVPDAAA PADAPSGLEA ELEHLRQALE
       560        570        580        590        600
GGLDTKEAKE KFLHEVVKMR VKQEEKLSAA LQAKRSLHQE LEFLRVAKKE
       610        620        630        640        650
KLREATEAKR NLRKEIERLR AENEKKMKEA NESRLRLKRE LEQARQARVC
       660        670        680        690        700
DKGCEAGRLR AKYSAQIEDL QVKLQHAEAD REQLRADLLR EREAREHLEK
       710        720
VVKELQEQLW PRARPEAAGS EGAAELEP
Length: 728
Mass (Da): 80,005
Last modified: October 1, 1989 - v1
Checksum: 9B78C4840A28C2DA
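
The block layout above (position rulers interleaved with ten-residue groups) has to be flattened before the sequence can be compared with the stated length or indexed by feature position. The following is a minimal sketch of one way to do that. The function name `parse_sequence_block` is a hypothetical helper, and the approach simply assumes that ruler tokens are numeric while residue tokens are purely alphabetic.

```python
# Sequence block for P12755-1, copied from this entry (rulers included).
RAW_SEQUENCE = """
        10         20         30         40         50
MEAAAGGRGC FQPHPGLQKT LEQFHLSSMS SLGGPAAFSA RWAQEAYKKE
60 70 80 90 100
SAKEAGAAAV PAPVPAATEP PPVLHLPAIQ PPPPVLPGPF FMPSDRSTER
110 120 130 140 150
CETVLEGETI SCFVVGGEKR LCLPQILNSV LRDFSLQQIN AVCDELHIYC
160 170 180 190 200
SRCTADQLEI LKVMGILPFS APSCGLITKT DAERLCNALL YGGAYPPPCK
210 220 230 240 250
KELAASLALG LELSERSVRV YHECFGKCKG LLVPELYSSP SAACIQCLDC
260 270 280 290 300
RLMYPPHKFV VHSHKALENR TCHWGFDSAN WRAYILLSQD YTGKEEQARL
310 320 330 340 350
GRCLDDVKEK FDYGNKYKRR VPRVSSEPPA SIRPKTDDTS SQSPAPSEKD
360 370 380 390 400
KPSSWLRTLA GSSNKSLGCV HPRQRLSAFR PWSPAVSASE KELSPHLPAL
410 420 430 440 450
IRDSFYSYKS FETAVAPNVA LAPPAQQKVV SSPPCAAAVS RAPEPLATCT
460 470 480 490 500
QPRKRKLTVD TPGAPETLAP VAAPEEDKDS EAEVEVESRE EFTSSLSSLS
510 520 530 540 550
SPSFTSSSSA KDLGSPGARA LPSAVPDAAA PADAPSGLEA ELEHLRQALE
560 570 580 590 600
GGLDTKEAKE KFLHEVVKMR VKQEEKLSAA LQAKRSLHQE LEFLRVAKKE
610 620 630 640 650
KLREATEAKR NLRKEIERLR AENEKKMKEA NESRLRLKRE LEQARQARVC
660 670 680 690 700
DKGCEAGRLR AKYSAQIEDL QVKLQHAEAD REQLRADLLR EREAREHLEK
710 720
VVKELQEQLW PRARPEAAGS EGAAELEP
"""


def parse_sequence_block(text):
    """Join the residue groups and drop the numeric ruler tokens."""
    return "".join(token for token in text.split() if token.isalpha())


def residues(sequence, start, end):
    """Return residues start..end using the entry's 1-based, inclusive positions."""
    return sequence[start - 1:end]


SEQUENCE = parse_sequence_block(RAW_SEQUENCE)

if __name__ == "__main__":
    print("length:", len(SEQUENCE))
    print("residue 480:", residues(SEQUENCE, 480, 480))
    print("coiled-coil length:", len(residues(SEQUENCE, 536, 710)))
```

The `residues` helper keeps the entry's 1-based, inclusive coordinates, so feature positions such as the modified residue at 480 or the coiled coil at 536 – 710 can be used without manual offset arithmetic.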

Sequence databases

X15218 mRNA. Translation: CAA33288.1.
AL590822 Genomic DNA. No translation available.
CCDS: CCDS39.1.
PIR: S06053. TVHUSK.
RefSeq: NP_003027.1. NM_003036.3.
UniGene: Hs.656507.

Genome annotation databases

Ensembl: ENST00000378536; ENSP00000367797; ENSG00000157933.
GeneID: 6497.
KEGG: hsa:6497.
UCSC: uc001aja.4. human.

Cross-references

Protocols and materials databases

Structural Biology Knowledgebase: Search...

Organism-specific databases

CTD: 6497.
GeneCards: GC01P002173.
GeneReviews: SKI.
HGNC: HGNC:10896. SKI.
HPA: CAB010449.
MIM: 164780. gene.
  182212. phenotype.
neXtProt: NX_P12755.
Orphanet: 1606. 1p36 deletion syndrome.
  2462. Shprintzen-Goldberg syndrome.
PharmGKB: PA35796.
GenAtlas: Search...

Miscellaneous databases

ChiTaRS: SKI. human.
EvolutionaryTrace: P12755.
GeneWiki: SKI_protein.
GenomeRNAi: 6497.
NextBio: 25255.
PRO: P12755.
SOURCE: Search...

Family and domain databases

ProtoNet: Search...

Publications

  1. "Isolation of human cDNA clones of ski and the ski-related gene, sno."
    Nomura N., Sasamoto S., Ishii S., Date T., Matsui M., Ishizaki R.
    Nucleic Acids Res. 17:5489-5500(1989)
    Cited for: NUCLEOTIDE SEQUENCE [MRNA].
  2. "The DNA sequence and biological annotation of human chromosome 1."
    Gregory S.G., Barlow K.F., McLay K.E., Kaul R., Swarbreck D., Dunham A., Scott C.E., Howe K.L., Woodfine K., Spencer C.C.A., Jones M.C., Gillson C., Searle S., Zhou Y., Kokocinski F., McDonald L., Evans R., Phillips K., Atkinson A., Cooper R., Jones C., Hall R.E., Andrews T.D., Lloyd C., Ainscough R., Almeida J.P., Ambrose K.D., Anderson F., Andrew R.W., Ashwell R.I.S., Aubin K., Babbage A.K., Bagguley C.L., Bailey J., Beasley H., Bethel G., Bird C.P., Bray-Allen S., Brown J.Y., Brown A.J., Buckley D., Burton J., Bye J., Carder C., Chapman J.C., Clark S.Y., Clarke G., Clee C., Cobley V., Collier R.E., Corby N., Coville G.J., Davies J., Deadman R., Dunn M., Earthrowl M., Ellington A.G., Errington H., Frankish A., Frankland J., French L., Garner P., Garnett J., Gay L., Ghori M.R.J., Gibson R., Gilby L.M., Gillett W., Glithero R.J., Grafham D.V., Griffiths C., Griffiths-Jones S., Grocock R., Hammond S., Harrison E.S.I., Hart E., Haugen E., Heath P.D., Holmes S., Holt K., Howden P.J., Hunt A.R., Hunt S.E., Hunter G., Isherwood J., James R., Johnson C., Johnson D., Joy A., Kay M., Kershaw J.K., Kibukawa M., Kimberley A.M., King A., Knights A.J., Lad H., Laird G., Lawlor S., Leongamornlert D.A., Lloyd D.M., Loveland J., Lovell J., Lush M.J., Lyne R., Martin S., Mashreghi-Mohammadi M., Matthews L., Matthews N.S.W., McLaren S., Milne S., Mistry S., Moore M.J.F., Nickerson T., O'Dell C.N., Oliver K., Palmeiri A., Palmer S.A., Parker A., Patel D., Pearce A.V., Peck A.I., Pelan S., Phelps K., Phillimore B.J., Plumb R., Rajan J., Raymond C., Rouse G., Saenphimmachak C., Sehra H.K., Sheridan E., Shownkeen R., Sims S., Skuce C.D., Smith M., Steward C., Subramanian S., Sycamore N., Tracey A., Tromans A., Van Helmond Z., Wall M., Wallis J.M., White S., Whitehead S.L., Wilkinson J.E., Willey D.L., Williams H., Wilming L., Wray P.W., Wu Z., Coulson A., Vaudin M., Sulston J.E., Durbin R.M., Hubbard T., Wooster R., Dunham I., Carter N.P., McVean G., Ross M.T., Harrow J., Olson M.V., Beck S., Rogers J., Bentley D.R.
    Nature 441:315-321(2006)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
  3. "Requirement of the co-repressor homeodomain-interacting protein kinase 2 for ski-mediated inhibition of bone morphogenetic protein-induced transcriptional activation."
    Harada J., Kokura K., Kanei-Ishii C., Nomura T., Khan M.M., Kim Y., Ishii S.
    J. Biol. Chem. 278:38998-39005(2003)
    Cited for: INTERACTION WITH HIPK2; SMAD2; SMAD3 AND SMAD4.
  4. "Lys-N and trypsin cover complementary parts of the phosphoproteome in a refined SCX-based approach."
    Gauci S., Helbig A.O., Slijper M., Krijgsveld J., Heck A.J., Mohammed S.
    Anal. Chem. 81:4493-4501(2009)
    Cited for: IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
  5. Cited for: FUNCTION, INTERACTION WITH PRDM16 AND SMAD3.
  6. "Quantitative phosphoproteomic analysis of T cell receptor signaling reveals system-wide modulation of protein-protein interactions."
    Mayya V., Lundgren D.H., Hwang S.-I., Rezaul K., Wu L., Eng J.K., Rodionov V., Han D.K.
    Sci. Signal. 2:RA46-RA46(2009)
    Cited for: PHOSPHORYLATION [LARGE SCALE ANALYSIS] AT SER-480, IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
    Tissue: Leukemic T-cell.
  7. Cited for: VARIANTS SGS LEU-31; VAL-32; PRO-32; CYS-34; VAL-34; SER-34; GLN-35; SER-35; 94-SER--SER-97 DEL AND 95-ASP--SER-97 DEL.
  8. Cited for: VARIANTS SGS ARG-21; VAL-32; ASP-34; CYS-34; SER-34; SER-35; 95-ASP--SER-97 DEL; GLU-116 AND ARG-117.
  9. "De novo exon 1 missense mutations of SKI and Shprintzen-Goldberg syndrome: two new cases and a clinical review."
    FORGE Canada Consortium
    Au P.Y., Racher H.E., Graham J.M. Jr., Kramer N., Lowry R.B., Parboosingh J.S., Innes A.M.
    Am. J. Med. Genet. A 164A:676-684(2014)
    Cited for: VARIANTS SGS SER-35 AND GLU-116.
  10. Cited for: VARIANTS SGS THR-28; LEU-31; VAL-32; VAL-34; ALA-34; ASP-34; SER-34 AND SER-35.

Entry information

Entry name: SKI_HUMAN
Accession: Primary (citable) accession number: P12755
Secondary accession number(s): Q5SYT7
Entry history
Integrated into UniProtKB/Swiss-Prot: October 1, 1989
Last sequence update: October 1, 1989
Last modified: June 24, 2015
This is version 153 of the entry and version 1 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

Keywords - Technical term

3D-structure, Complete proteome, Reference proteome

Documents

  1. Human chromosome 1
    Human chromosome 1: entries, gene names and cross-references to MIM
  2. Human entries with polymorphisms or disease mutations
    List of human entries with polymorphisms or disease mutations
  3. Human polymorphisms and disease mutations
    Index of human polymorphisms and disease mutations
  4. MIM cross-references
    Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot
  5. PDB cross-references
    Index of Protein Data Bank (PDB) cross-references
  6. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.
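
The membership rule stated for UniRef90 and UniRef50 amounts to two thresholds applied against the seed sequence. The following is only a sketch of those thresholds. The function and argument names are assumptions made for illustration, identity and overlap are taken as fractions between 0 and 1, and nothing here reproduces the actual clustering procedure used to build UniRef.

```python
# Identity thresholds quoted in the text; the 80% overlap requirement is shared.
MIN_IDENTITY = {"UniRef90": 0.90, "UniRef50": 0.50}
MIN_OVERLAP = 0.80


def meets_cluster_thresholds(level, identity_to_seed, overlap_with_seed):
    """Return True if a sequence satisfies the stated thresholds for `level`.

    `identity_to_seed` and `overlap_with_seed` are fractions in [0, 1],
    measured against the longest (seed) sequence of the cluster.
    """
    if level not in MIN_IDENTITY:
        raise ValueError(f"unknown cluster level: {level}")
    return (
        identity_to_seed >= MIN_IDENTITY[level]
        and overlap_with_seed >= MIN_OVERLAP
    )


if __name__ == "__main__":
    # Hypothetical values chosen only to exercise both conditions.
    print(meets_cluster_thresholds("UniRef90", 0.93, 0.85))
    print(meets_cluster_thresholds("UniRef90", 0.93, 0.70))
    print(meets_cluster_thresholds("UniRef50", 0.62, 0.95))
```

Both conditions must hold at once: a highly similar fragment that covers less than 80% of the seed would not qualify under this reading of the rule.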