Protein

Protein furry homolog

Gene

FRY

Organism
Homo sapiens (Human)
Status
Reviewed - Annotation score: 4 out of 5 - Experimental evidence at protein level

Function

Plays a crucial role in the structural integrity of mitotic centrosomes and in the maintenance of spindle bipolarity by promoting PLK1 activity at the spindle poles in early mitosis. May function as a scaffold promoting the interaction between AURKA and PLK1, thereby enhancing AURKA-mediated PLK1 phosphorylation. (1 publication)

Names & Taxonomy

Protein names
Recommended name: Protein furry homolog
Gene names
Name: FRY
Synonyms: C13orf14
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes
  • UP000005640 Component: Chromosome 13

Organism-specific databases

HGNC: HGNC:20367. FRY.

Subcellular location

  • Cytoplasm (1 publication)
  • Cytoplasm > cytoskeleton > microtubule organizing center > centrosome (1 publication)
  • Cytoplasm > cytoskeleton > spindle pole (1 publication)

  • Note: Distributed diffusely throughout the cytoplasm in interphase. Localizes to the separating centrosomes in prophase, to the spindle poles and spindle microtubules in prometaphase to metaphase, to spindle microtubules in anaphase and to the distal sections of the midbody in cytokinesis. Colocalizes with PLK1 to separating centrosomes and spindle poles from prophase to metaphase in mitosis, but not in other stages of the cell cycle.

Keywords - Cellular component

Cytoplasm, Cytoskeleton

Pathology & Biotech

Organism-specific databases

PharmGKB: PA134927490.

Polymorphism and mutation databases

BioMuta: FRY.
DMDM: 74745928.

PTM / Processing

Molecule processing

Feature key    Position(s)    Length    Description              Feature identifier
Chain          1 – 3013       3013      Protein furry homolog    PRO_0000281674

Amino acid modifications

Feature key         Position(s)    Length    Description
Modified residue    213            1         Phosphotyrosine (Combined sources)
Modified residue    1382           1         Phosphoserine (By similarity)
Modified residue    1383           1         Phosphoserine (By similarity)
Modified residue    1936           1         Phosphoserine (Combined sources)
Modified residue    1940           1         Phosphoserine (By similarity)
Modified residue    2419           1         Phosphoserine (By similarity)
Modified residue    2420           1         Phosphoserine (By similarity)
Modified residue    2487           1         Phosphoserine (By similarity)
Modified residue    2508           1         Phosphothreonine; by CDK1 (By similarity)
Modified residue    2808           1         Phosphoserine (By similarity)
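
The modification positions are 1-based and can be checked against the sequence shown in this entry. The following is a minimal sketch, not a UniProt tool: `residue_at` is a hypothetical helper, and only the 201-250 and 1901-1950 rows of the displayed sequence are used, so just three of the listed sites are covered.

```python
# Hypothetical helper (an assumption for illustration): look up a residue by
# its 1-based position within one row of the displayed sequence.
def residue_at(block: str, block_start: int, position: int) -> str:
    """Return the residue at `position`, given a row that starts at `block_start`."""
    residues = block.replace(" ", "")
    offset = position - block_start
    if not 0 <= offset < len(residues):
        raise ValueError(f"position {position} is outside this block")
    return residues[offset]


# Rows 201-250 and 1901-1950, copied from the sequence in this entry.
BLOCK_201 = "NLAFKHFKYK EGYLGPNTGN MHIVADLYAE VIGVLAQAKF PAVKKKFMAE"
BLOCK_1901 = "GDEIQGYVME ALLTLEAAVD NLSDCLKNSD LLTVLSRSSS PDLSSSSKLT"

# One-letter residue expected for each annotated modification type.
EXPECTED = {"Phosphotyrosine": "Y", "Phosphoserine": "S", "Phosphothreonine": "T"}

CHECKS = [
    (BLOCK_201, 201, 213, "Phosphotyrosine"),
    (BLOCK_1901, 1901, 1936, "Phosphoserine"),
    (BLOCK_1901, 1901, 1940, "Phosphoserine"),
]

for block, start, pos, kind in CHECKS:
    found = residue_at(block, start, pos)
    status = "ok" if found == EXPECTED[kind] else "MISMATCH"
    print(f"{pos}: {kind} expects {EXPECTED[kind]}, found {found} ({status})")
```

A mismatch would usually point to an off-by-one error in the position handling or to a different sequence version.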

Post-translational modification

Phosphorylated by AURKA, CDK1 and PLK1.

Keywords - PTM

Phosphoprotein

Proteomic databases

EPD: Q5TBA9.
MaxQB: Q5TBA9.
PaxDb: Q5TBA9.
PeptideAtlas: Q5TBA9.
PRIDE: Q5TBA9.

PTM databases

iPTMnet: Q5TBA9.
PhosphoSite: Q5TBA9.

Expression

Gene expression databases

Bgee: Q5TBA9.
CleanEx: HS_FRY.
ExpressionAtlas: Q5TBA9. baseline and differential.
Genevisible: Q5TBA9. HS.

Organism-specific databases

HPA: HPA041635.

Interaction

Subunit structure

When phosphorylated by CDK1, interacts with PLK1; this interaction occurs in mitotic cells, but not in interphase cells, and leads to further phosphorylation by PLK1. Interacts with AURKA. (1 publication)

Protein-protein interaction databases

BioGrid: 115433. 5 interactions.
IntAct: Q5TBA9. 2 interactions.
MINT: MINT-1746188.
STRING: 9606.ENSP00000369600.

Structure

3D structure databases

ProteinModelPortal: Q5TBA9.
ModBase: Search...
MobiDB: Search...

Family & Domains

Sequence similarities

Belongs to the furry protein family. (Curated)

Phylogenomic databases

eggNOG: KOG1825. Eukaryota.
        ENOG410XSZS. LUCA.
GeneTree: ENSGT00610000086058.
HOGENOM: HOG000112653.
HOVERGEN: HBG081541.
InParanoid: Q5TBA9.
OMA: HFSAMIA.
OrthoDB: EOG7K0ZB7.
PhylomeDB: Q5TBA9.
TreeFam: TF313568.

Family and domain databases

InterPro: IPR016024. ARM-type_fold.
          IPR025614. Cell_morpho_N.
          IPR025481. Cell_Morphogen_C.
          IPR029473. MOR2-PAG1_mid.
Pfam: PF14225. MOR2-PAG1_C. 1 hit.
      PF14228. MOR2-PAG1_mid. 4 hits.
      PF14222. MOR2-PAG1_N. 1 hit.
SUPFAM: SSF48371. SSF48371. 8 hits.

Sequence

Sequence status: Complete.

Q5TBA9-1 [UniParc]

        10         20         30         40         50
MASQQDSGFF EISIKYLLKS WSNTSPVGNG YIKPPVPPAS GTHREKGPPT
60 70 80 90 100
MLPINVDPDS KPGEYVLKSL FVNFTTQAER KIRIIMAEPL EKPLTKSLQR
110 120 130 140 150
GEDPQFDQVI SSMSSLSEYC LPSILRTLFD WYKRQNGIED ESHEYRPRTS
160 170 180 190 200
NKSKSDEQQR DYLMERRDLA IDFIFSLVLI EVLKQIPLHP VIDSLIHDVI
210 220 230 240 250
NLAFKHFKYK EGYLGPNTGN MHIVADLYAE VIGVLAQAKF PAVKKKFMAE
260 270 280 290 300
LKELRHKEQN PYVVQSIISL IMGMKFFRIK MYPVEDFEAS LQFMQECAHY
310 320 330 340 350
FLEVKDKDIK HALAGLFVEI LVPVAAAVKN EVNVPCLRNF VESLYDTTLE
360 370 380 390 400
LSSRKKHSLA LYPLVTCLLC VSQKQLFLNR WHIFLNNCLS NLKNKDPKMA
410 420 430 440 450
RVALESLYRL LWVYMIRIKC ESNTATQSRL ITIITTLFPK GSRGVVPRDM
460 470 480 490 500
PLNIFVKIIQ FIAQERLDFA MKEIIFDFLC VGKPAKAFSL NPERMNIGLR
510 520 530 540 550
AFLVIADSLQ QKDGEPPMPV TGAVLPSGNT LRVKKTYLSK TLTEEEAKMI
560 570 580 590 600
GMSLYYSQVR KAVDNILRHL DKEVGRCMML TNVQMLNKEP EDMITGERKP
610 620 630 640 650
KIDLFRTCVA AIPRLLPDGM SKLELIDLLA RLSIHMDDEL RHIAQNSLQG
660 670 680 690 700
LLVDFSDWRE DVLFGFTNFL LREVNDMHHT LLDSSLKLLL QLLTQWKLVI
710 720 730 740 750
QTQGKVYEQA NKIRNSELIA NGSSHRIQSE RGPHCSVLHA VEGFALVLLC
760 770 780 790 800
SFQVATRKLS VLILKEIRAL FIALGQPEDD DRPMIDVMDQ LSSSILESFI
810 820 830 840 850
HVAVSDSATL PLTHNVDLQW LVEWNAVLVN SHYDVKSPSH VWIFAQSVKD
860 870 880 890 900
PWVLCLFSFL RQENLPKHCP TALSYAWPYA FTRLQSVMPL VDPNSPINAK
910 920 930 940 950
KTSTAGSGDN YVTLWRNYLI LCFGVAKPSI MSPGHLRAST PEIMATTPDG
960 970 980 990 1000
TVSYDNKAIG TPSVGVLLKQ LVPLMRLESI EITESLVLGF GRTNSLVFRE
1010 1020 1030 1040 1050
LVEELHPLMK EALERRPENK KRRERRDLLR LQLLRIFELL ADAGVISDST
1060 1070 1080 1090 1100
NGALERDTLA LGALFLEYVD LTRMLLEAEN DKEVEILKDI RAHFSAMVAN
1110 1120 1130 1140 1150
LIQCVPVHHR RFLFPQQSLR HHLFILFSQW AGPFSIMFTP LDRYSDRNHQ
1160 1170 1180 1190 1200
ITRYQYCALK AMSAVLCCGP VFDNVGLSPD GYLYKWLDNI LACQDLRVHQ
1210 1220 1230 1240 1250
LGCEVVVLLL ELNPDQINLF NWAIDRCYTG SYQLASGCFK AIATVCGSRN
1260 1270 1280 1290 1300
YPFDIVTLLN LVLFKASDTN REIYEISMQL MQILEAKLFV YSKKVAEQRP
1310 1320 1330 1340 1350
GSILYGTHGP LPPLYSVSLA LLSCELARMY PELTLPLFSE VSQRFPTTHP
1360 1370 1380 1390 1400
NGRQIMLTYL LPWLHNIELV DSRLLLPGSS PSSPEDEVKD REGDVTASHG
1410 1420 1430 1440 1450
LRGNGWGSPE ATSLVLNNLM YMTAKYGDEV PGPEMENAWN ALANNEKWSN
1460 1470 1480 1490 1500
NLRITLQFLI SLCGVSSDTV LLPYIKKVAI YLCRNNTIQT MEELLFELQQ
1510 1520 1530 1540 1550
TEPVNPIVQH CDNPPFYRFT ASSKASAAAS GTTSSSNTVV AGQENFPDAE
1560 1570 1580 1590 1600
ENKILKESDE RFSNVIRAHT RLESRYSNSS GGSYDEDKND PISPYTGWLL
1610 1620 1630 1640 1650
TITETKQPQP LPMPCTGGCW APLVDYLPET ITPRGPLHRC NIAVIFMTEM
1660 1670 1680 1690 1700
VVDHSVREDW ALHLPLLLHA VFLGLDHYRP EVFEHSKKLL LHLLIALSCN
1710 1720 1730 1740 1750
SNFHSIASVL LQTREMGEAK TLTVQPAYQP EYLYTGGFDF LREDQSSPVP
1760 1770 1780 1790 1800
DSGLSSSSTS SSISLGGSSG NLPQMTQEVE DVDTAAETDE KANKLIEFLT
1810 1820 1830 1840 1850
TRAFGPLWCH EDITPKNQNS KSAEQLTNFL RHVVSVFKDS KSGFHLEHQL
1860 1870 1880 1890 1900
SEVALQTALA SSSRHYAGRS FQIFRALKQP LSAHALSDLL SRLVEVIGEH
1910 1920 1930 1940 1950
GDEIQGYVME ALLTLEAAVD NLSDCLKNSD LLTVLSRSSS PDLSSSSKLT
1960 1970 1980 1990 2000
ASRKSTGQLN MNPGTTSGNT ATAERSRHQR SFSVPKKFGV IDRSSDPPRS
2010 2020 2030 2040 2050
ATLDRIQACT QQGLSSKTRS SSSLKDSLTD PSHINHPTNL LATIFWVTVA
2060 2070 2080 2090 2100
LMESDFEFEY LMALRLLSRL LAHMPLDKAE NREKLEKLQA QLKWADFSGL
2110 2120 2130 2140 2150
QQLLLKGFTS LTTTDLTLQL FSLLTPVSKI SMVDASHAIG FPLNVLCLLP
2160 2170 2180 2190 2200
QLIQHFENPN QFCKDIAERI AQVCLEEKNP KLSNLAHVMT LYKTHSYTRD
2210 2220 2230 2240 2250
CATWVNVVCR YLHEAYADIT LNMVTYLAEL LEKGLPSVQQ PLLQVIYSLL
2260 2270 2280 2290 2300
SYMDLSVVPV KQFNVEVLKT IEKYVQSVHW REALNILKLV VSRSASLVLP
2310 2320 2330 2340 2350
SYQHSDLSKI EIHRVWTSAS KELPGKTLDF HFDISETPII GRRYDELQNS
2360 2370 2380 2390 2400
SGRDGKPRAM AVTRSTSSTS SGSNSNVLVP VSWKRPQYSQ KRTKEKLVHV
2410 2420 2430 2440 2450
LSLCGQEVGL SKNPSVIFSS CGDLDLLEHQ TSLVSSEDGA REQENMDDTN
2460 2470 2480 2490 2500
SEQQFRVFRD FDFLDVELED GEGESMDNFN WGVRRRSLDS LDKCDMQILE
2510 2520 2530 2540 2550
ERQLSGSTPS LNKMHHEDSD ESSEEEDLTA SQILEHSDLI MTLSPSEETN
2560 2570 2580 2590 2600
PMELLTTACD STPAEPHSFN TRMSSFDASL PDMNNLQISE GSKAEAVREE
2610 2620 2630 2640 2650
EDTTVHEDDL SSSINELPAA FECSDSFSLD MTEGEEKGNR ALDQFTLASF
2660 2670 2680 2690 2700
GEGDRGVSPP PSPFFSAILA AFQPAACDDA EEAWRSHINQ LMCDSDGSCA
2710 2720 2730 2740 2750
VYTFHVFSSL FKNIQKRFCF LTCDAASYLG DNLRGIGSKF VSSSQMLTSC
2760 2770 2780 2790 2800
SECPTLFVDA ETLLSCGLLD KLKFSVLELQ EYLDTYNNRK EATLSWLANC
2810 2820 2830 2840 2850
KATFAGGSRD GVITCQPGDS EEKQLELCQR LYKLHFQLLL LFQSYCKLIG
2860 2870 2880 2890 2900
QVHEVSSMPE LLNMSRELSD LKKHLKEASA VIAADPLYSD GAWSEPTFTS
2910 2920 2930 2940 2950
TEAAIQSMLE CLKNNELGKA LRQIRECRSL WPNDIFGSSS DDEVQTLLNI
2960 2970 2980 2990 3000
YFRHQTLGQT GTYALVGSNQ SLTEICTKLM ELNMEIRDMI RRAQSYRVLT
3010
TFLPDSSVSG TSL
Length: 3,013
Mass (Da): 338,875
Last modified: December 21, 2004 - v1
Checksum: 2AB5EDB7D2529538
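
The sequence is displayed in rows of five 10-residue groups, each preceded by a position ruler. The sketch below shows one way to turn such a display back into a plain residue string. It assumes that any line made up only of digits and whitespace is a ruler; `parse_displayed_sequence` is a hypothetical name, and only the first two rows of this entry are used as sample input.

```python
import re


def parse_displayed_sequence(text: str) -> str:
    """Join the residue rows of a ruler-annotated sequence display.

    Lines containing only digits and whitespace are treated as position
    rulers and skipped; spaces between 10-residue groups are removed.
    """
    residues = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or re.fullmatch(r"[\d\s]+", stripped):
            continue  # blank line or position ruler
        residues.append(stripped.replace(" ", ""))
    return "".join(residues)


# First two rows of the sequence as displayed in this entry.
SAMPLE = """\
        10         20         30         40         50
MASQQDSGFF EISIKYLLKS WSNTSPVGNG YIKPPVPPAS GTHREKGPPT
60 70 80 90 100
MLPINVDPDS KPGEYVLKSL FVNFTTQAER KIRIIMAEPL EKPLTKSLQR
"""

sequence = parse_displayed_sequence(SAMPLE)
print(len(sequence))  # 100
```

Run over the full display, the result should have the stated length of 3,013 residues; a different count would suggest a truncated copy or a ruler line that the pattern failed to skip.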

Sequence caution

The sequence CAB42442.1 differs from that shown. Reason: Frameshift at positions 1340 and 1378. (Curated)

Natural variant

Feature key        Position(s)    Length    Description                                                   Feature identifier
Natural variant    1968           1         G → S. Corresponds to variant rs2806639 [dbSNP | Ensembl].    VAR_053831
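
A single-residue variant such as G → S at position 1968 can be applied to the canonical sequence after confirming that the reference residue matches. This is a minimal sketch under stated assumptions: `apply_substitution` is a hypothetical helper, and only the 1951-2000 row of the displayed sequence is used rather than the full protein.

```python
# Hypothetical helper (an assumption for illustration): apply a one-residue
# substitution to a row of the displayed sequence, using 1-based positions.
def apply_substitution(block: str, block_start: int, position: int,
                       ref: str, alt: str) -> str:
    """Return the row with `ref` replaced by `alt` at `position`."""
    residues = block.replace(" ", "")
    offset = position - block_start
    if not 0 <= offset < len(residues):
        raise ValueError(f"position {position} is outside this block")
    if residues[offset] != ref:
        raise ValueError(
            f"expected {ref} at {position}, found {residues[offset]}"
        )
    return residues[:offset] + alt + residues[offset + 1:]


# Row 1951-2000, copied from the sequence in this entry.
BLOCK_1951 = "ASRKSTGQLN MNPGTTSGNT ATAERSRHQR SFSVPKKFGV IDRSSDPPRS"

variant_block = apply_substitution(BLOCK_1951, 1951, 1968, "G", "S")
print(variant_block[10:20])  # MNPGTTSSNT
```

Checking the reference residue first guards against applying a variant to the wrong sequence version or with a mis-indexed position.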

Sequence databases

EMBL / GenBank / DDBJ:
AL049784 mRNA. Translation: CAB42442.1. Frameshift.
AL137143, AL138692, AL445212 Genomic DNA. Translation: CAI12629.1.
AL138692, AL137143, AL445212 Genomic DNA. Translation: CAI13379.1.
AL445212, AL137143, AL138692 Genomic DNA. Translation: CAI40478.1.
CCDS: CCDS41875.1.
RefSeq: NP_075463.2. NM_023037.2.
UniGene: Hs.507669. Hs.718654.

Genome annotation databases

Ensembl: ENST00000542859; ENSP00000445043; ENSG00000073910.
GeneID: 10129.
KEGG: hsa:10129.
UCSC: uc001utx.4. human.

Keywords - Coding sequence diversity

Polymorphism

Cross-references

Protocols and materials databases

Structural Biology Knowledgebase: Search...

Organism-specific databases

CTD: 10129.
GeneCards: FRY.
HGNC: HGNC:20367. FRY.
HPA: HPA041635.
MIM: 614818. gene.
neXtProt: NX_Q5TBA9.
PharmGKB: PA134927490.
GenAtlas: Search...

Miscellaneous databases

ChiTaRS: FRY. human.
GenomeRNAi: 10129.
PRO: Q5TBA9.
SOURCE: Search...

Family and domain databases

ProtoNet: Search...

Publications

  1. Rhodes S.
    Submitted (MAY-1999) to the EMBL/GenBank/DDBJ databases
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA].
  2. "The DNA sequence and analysis of human chromosome 13."
    Dunham A., Matthews L.H., Burton J., Ashurst J.L., Howe K.L., Ashcroft K.J., Beare D.M., Burford D.C., Hunt S.E., Griffiths-Jones S., Jones M.C., Keenan S.J., Oliver K., Scott C.E., Ainscough R., Almeida J.P., Ambrose K.D., Andrews D.T., Ashwell R.I.S., Babbage A.K., Bagguley C.L., Bailey J., Bannerjee R., Barlow K.F., Bates K., Beasley H., Bird C.P., Bray-Allen S., Brown A.J., Brown J.Y., Burrill W., Carder C., Carter N.P., Chapman J.C., Clamp M.E., Clark S.Y., Clarke G., Clee C.M., Clegg S.C., Cobley V., Collins J.E., Corby N., Coville G.J., Deloukas P., Dhami P., Dunham I., Dunn M., Earthrowl M.E., Ellington A.G., Faulkner L., Frankish A.G., Frankland J., French L., Garner P., Garnett J., Gilbert J.G.R., Gilson C.J., Ghori J., Grafham D.V., Gribble S.M., Griffiths C., Hall R.E., Hammond S., Harley J.L., Hart E.A., Heath P.D., Howden P.J., Huckle E.J., Hunt P.J., Hunt A.R., Johnson C., Johnson D., Kay M., Kimberley A.M., King A., Laird G.K., Langford C.J., Lawlor S., Leongamornlert D.A., Lloyd D.M., Lloyd C., Loveland J.E., Lovell J., Martin S., Mashreghi-Mohammadi M., McLaren S.J., McMurray A., Milne S., Moore M.J.F., Nickerson T., Palmer S.A., Pearce A.V., Peck A.I., Pelan S., Phillimore B., Porter K.M., Rice C.M., Searle S., Sehra H.K., Shownkeen R., Skuce C.D., Smith M., Steward C.A., Sycamore N., Tester J., Thomas D.W., Tracey A., Tromans A., Tubby B., Wall M., Wallis J.M., West A.P., Whitehead S.L., Willey D.L., Wilming L., Wray P.W., Wright M.W., Young L., Coulson A., Durbin R.M., Hubbard T., Sulston J.E., Beck S., Bentley D.R., Rogers J., Ross M.T.
    Nature 428:522-528(2004) [PubMed] [Europe PMC] [Abstract]
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
  3. Cited for: PHOSPHORYLATION [LARGE SCALE ANALYSIS] AT SER-1936, IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
  4. "Quantitative phosphoproteomic analysis of T cell receptor signaling reveals system-wide modulation of protein-protein interactions."
    Mayya V., Lundgren D.H., Hwang S.-I., Rezaul K., Wu L., Eng J.K., Rodionov V., Han D.K.
    Sci. Signal. 2:RA46-RA46(2009) [PubMed] [Europe PMC] [Abstract]
    Cited for: PHOSPHORYLATION [LARGE SCALE ANALYSIS] AT TYR-213, IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
    Tissue: Leukemic T-cell.
  5. "Furry protein promotes Aurora A-mediated polo-like kinase 1 activation."
    Ikeda M., Chiba S., Ohashi K., Mizuno K.
    J. Biol. Chem. 287:27670-27681(2012) [PubMed] [Europe PMC] [Abstract]
    Cited for: FUNCTION, INTERACTION WITH PLK1, SUBCELLULAR LOCATION.
  6. "An enzyme assisted RP-RPLC approach for in-depth analysis of human liver phosphoproteome."
    Bian Y., Song C., Cheng K., Dong M., Wang F., Huang J., Sun D., Wang L., Ye M., Zou H.
    J. Proteomics 96:253-262(2014) [PubMed] [Europe PMC] [Abstract]
    Cited for: IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
    Tissue: Liver.

Entry information

Entry name: FRY_HUMAN
Accession: Primary (citable) accession number: Q5TBA9
Secondary accession number(s): Q9Y3N6
Entry history
Integrated into UniProtKB/Swiss-Prot: March 20, 2007
Last sequence update: December 21, 2004
Last modified: July 6, 2016
This is version 100 of the entry and version 1 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

Keywords - Technical term

Complete proteome, Reference proteome

Documents

  1. Human chromosome 13
    Human chromosome 13: entries, gene names and cross-references to MIM
  2. Human entries with polymorphisms or disease mutations
    List of human entries with polymorphisms or disease mutations
  3. Human polymorphisms and disease mutations
    Index of human polymorphisms and disease mutations
  4. MIM cross-references
    Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot
  5. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.
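
The UniRef90 and UniRef50 membership rules can be read as two thresholds applied to a candidate sequence relative to the seed. The sketch below encodes only the identity and overlap cut-offs stated here. `meets_cluster_thresholds` is a hypothetical name, identity and overlap are assumed to be precomputed fractions between 0 and 1, and the actual clustering procedure is not modelled.

```python
# (minimum identity, minimum overlap) relative to the seed sequence, taken
# from the descriptions above. UniRef100 is left out because its rule
# (identical sequences and sub-fragments) is not a simple threshold.
UNIREF_THRESHOLDS = {
    "UniRef90": (0.90, 0.80),
    "UniRef50": (0.50, 0.80),
}


def meets_cluster_thresholds(identity: float, overlap: float, level: str) -> bool:
    """Check a precomputed identity/overlap pair against one UniRef level."""
    min_identity, min_overlap = UNIREF_THRESHOLDS[level]
    return identity >= min_identity and overlap >= min_overlap


# A sequence 93% identical to the seed over 85% of its length passes both
# levels; the same identity over only 70% of the seed passes neither.
print(meets_cluster_thresholds(0.93, 0.85, "UniRef90"))
print(meets_cluster_thresholds(0.93, 0.70, "UniRef50"))
```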