Protein

Rho GTPase-activating protein 6

Gene

Arhgap6

Organism
Mus musculus (Mouse)
Status
Reviewed - Annotation score: 5 out of 5 - Experimental evidence at protein level

Function

GTPase activator for the Rho-type GTPases by converting them to an inactive GDP-bound state. Could regulate the interactions of signaling molecules with the actin cytoskeleton. Promotes continuous elongation of cytoplasmic processes during cell motility and simultaneous retraction of the cell body, changing the cell morphology (By similarity).

Keywords - Molecular function

GTPase activation

Enzyme and pathway databases

Reactome: R-MMU-194840. Rho GTPase cycle.

Names & Taxonomy

Protein names
Recommended name:
Rho GTPase-activating protein 6
Alternative name(s):
Rho-type GTPase-activating protein 6
Rho-type GTPase-activating protein RhoGAPX-1
Gene names
Name: Arhgap6
Organism: Mus musculus (Mouse)
Taxonomic identifier: 10090 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Glires > Rodentia > Sciurognathi > Muroidea > Muridae > Murinae > Mus > Mus
Proteomes
  • UP000000589 Component: Chromosome X

Organism-specific databases

MGI: MGI:1196332. Arhgap6.

Subcellular location

GO - Cellular component

  • actin cytoskeleton Source: MGI
  • cytoplasm Source: MGI

Keywords - Cellular component

Cytoplasm

PTM / Processing

Molecule processing

Feature key | Position(s) | Length | Description | Feature identifier
Chain | 1 – 987 | 987 | Rho GTPase-activating protein 6 | PRO_0000056705

Amino acid modifications

Feature key | Position(s) | Length | Description
Modified residue | 37 – 37 | 1 | Phosphoserine (combined sources)
Modified residue | 265 – 265 | 1 | Phosphoserine (by similarity)
Modified residue | 669 – 669 | 1 | Phosphoserine (combined sources)
Modified residue | 675 – 675 | 1 | Phosphoserine (combined sources)
Modified residue | 713 – 713 | 1 | Phosphoserine (combined sources)
Modified residue | 781 – 781 | 1 | Phosphoserine (by similarity)
Modified residue | 824 – 824 | 1 | Phosphoserine (combined sources)

Keywords - PTM

Phosphoprotein

Proteomic databases

MaxQB: O54834.
PaxDb: O54834.
PRIDE: O54834.

PTM databases

iPTMnet: O54834.
PhosphoSite: O54834.

Expression

Tissue specificity

Expressed in retina and lung.

Gene expression databases

Bgee: O54834.
CleanEx: MM_ARHGAP6.
ExpressionAtlas: O54834. baseline and differential.
Genevisible: O54834. MM.

Interaction

Protein-protein interaction databases

STRING: 10090.ENSMUSP00000033721.

Structure

3D structure databases

ProteinModelPortal: O54834.
SMR: O54834. Positions 406-607.

Family & Domains

Domains and Repeats

Feature key | Position(s) | Length | Description
Domain | 403 – 604 | 202 | Rho-GAP (PROSITE-ProRule annotation)

Motif

Feature key | Position(s) | Length | Description
Motif | 344 – 354 | 11 | SH3-binding
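
Feature positions in this entry are 1-based and inclusive, which is why the Rho-GAP domain at 403 – 604 has length 202 and the SH3-binding motif at 344 – 354 has length 11. A minimal Python sketch of that convention follows. The helper names are illustrative only, and the fragment is residues 341 to 360 copied from the canonical sequence (isoform 1) of this entry.

```python
def feature_length(start: int, end: int) -> int:
    """Length of a feature given 1-based, inclusive positions."""
    return end - start + 1


def extract_feature(fragment: str, fragment_start: int, start: int, end: int) -> str:
    """Slice a feature out of a sequence fragment.

    fragment_start is the 1-based position, in the full sequence,
    of the first residue of the fragment.
    """
    offset = start - fragment_start  # 0-based index inside the fragment
    return fragment[offset:offset + feature_length(start, end)]


# Residues 341-360 of isoform 1, taken from the sequence listed in this entry.
FRAGMENT_341_360 = "ETPNESTSPNTPEPAPRARR"

rho_gap_length = feature_length(403, 604)
sh3_motif = extract_feature(FRAGMENT_341_360, 341, 344, 354)

print(rho_gap_length)
print(sh3_motif)
```

The same offset arithmetic applies to any feature row in this entry, provided the positions refer to the canonical isoform.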

Sequence similarities

Contains 1 Rho-GAP domain (PROSITE-ProRule annotation).

Keywords - Domain

SH3-binding

Phylogenomic databases

eggNOG: KOG2710. Eukaryota.
ENOG410YVI5. LUCA.
GeneTree: ENSGT00760000119123.
HOVERGEN: HBG067762.
InParanoid: O54834.
OMA: SMGGRHS.
OrthoDB: EOG7FJH1S.
PhylomeDB: O54834.
TreeFam: TF316710.

Family and domain databases

Gene3D: 1.10.555.10. 2 hits.
InterPro: IPR008936. Rho_GTPase_activation_prot.
IPR030772. RhoGAP6.
IPR000198. RhoGAP_dom.
PANTHER: PTHR12635:SF6. PTHR12635:SF6. 1 hit.
Pfam: PF00620. RhoGAP. 1 hit.
SMART: SM00324. RhoGAP. 1 hit.
SUPFAM: SSF48350. SSF48350. 1 hit.
PROSITE: PS50238. RHOGAP. 1 hit.

Sequences (4)

Sequence status: Complete.

This entry describes 4 isoforms produced by alternative splicing.

Isoform 1 (identifier: O54834-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MSAQSLLHSV FSCSSPASGG TASAKGFSKR KLRQTRSLDP ALIGGCGSEM
        60         70         80         90        100
GAEGGLRGST VSRLHSPQLL AEGLGSRLAS SPRSQHLRAT RFQTPRPLCS
       110        120        130        140        150
SFSTPSTPQE KSPSGSFHFD YEVPLSRSGL KKSMAWDLPS VLAGSGSASS
       160        170        180        190        200
RSPASILSSS GGGPNGIFSS PRRWLQQRKF QPPPNSRSHP YVVWRSEGDF
       210        220        230        240        250
TWNSMSGRSV RLRSVPIQSL SELERARLQE VAFYQLQQDC DLGCQITIPK
       260        270        280        290        300
DGQKRKKSLR KKLDSLGKEK NKDKEFIPQA FGMPLSQVIA NDRAYKLKQD
       310        320        330        340        350
LQREEQKDAS SDFVSSLLPF GNKKQNKELS SSNSSLSSTS ETPNESTSPN
       360        370        380        390        400
TPEPAPRARR RGAMSVDSIT DLDDNQSRLL EALQLSLPAE AQSKKEKARD
       410        420        430        440        450
KKLSLNPIYR QVPRLVDSCC QHLEKHGLQT VGIFRVGSSK KRVRQLREEF
       460        470        480        490        500
DRGVDVCLEE EHSVHDVAAL LKEFLRDMPD PLLTRELYTA FINTLLLEPE
       510        520        530        540        550
EQLGTLQLLI YLLPPCNCDT LHRLLQFLSI VARHADDNVS KDGQEVTGNK
       560        570        580        590        600
MTSLNLATIF GPNLLHKQKS SDKEYSVQSS ARAEESTAII AVVQKMIENY
       610        620        630        640        650
EALFMVPPDL QNEVLISLLE TDPDVVDYLL RRKASQSSSP DILQTEVSFS
       660        670        680        690        700
MGGRHSSTDS NKASSGDISP YDNNSPVLSE RSLLAMQEDR ARGGSEKLYK
       710        720        730        740        750
VPEQYTLVGH LSSPKSKSRE SSPGPRLGKE MSEEPFNIWG TWHSTLKSGS
       760        770        780        790        800
KDPGMTGSYG DIFESSSLRP RPCSLSQGNL SLNWPRCQGS PTGLDSGTQV
       810        820        830        840        850
IRRTQTAATV EQCSVHLPVS RVCSTPHIQD GSRGTRRPAA SSDPFLSLNS
       860        870        880        890        900
TEDLAEGKED VAWLQSQARP VYQRPQESGK DDRRPPPPYP GSGKPATTSA
       910        920        930        940        950
QLPLEPPLWR LQRHEEGSET AVEGGQQASG EHQTRPKKLS SAYSLSASEQ
       960        970        980
DKQNLGEASW LDWQRERWQI WELLSTDNPD ALPETLV
Length: 987
Mass (Da): 108,844
Last modified: May 29, 2007 - v3
Checksum: B621845ECF1C7EBD
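
The canonical sequence above is displayed in blocks of 10 residues with a position ruler over each row. To recover a plain string for downstream tools, one can drop the ruler rows and the spaces. A minimal sketch, assuming the rows are available as text lines; `parse_sequence_block` is a hypothetical helper, and only the first two rows of the sequence are used here for brevity.

```python
def parse_sequence_block(lines):
    """Join residue rows into one string, skipping numeric ruler rows."""
    residues = []
    for line in lines:
        tokens = line.split()
        # Ruler rows consist only of numbers (10, 20, ...); skip them and blanks.
        if not tokens or all(tok.isdigit() for tok in tokens):
            continue
        residues.append("".join(tokens))
    return "".join(residues)


# First 100 residues of isoform 1 as displayed in this entry.
ROWS = [
    "        10         20         30         40         50",
    "MSAQSLLHSV FSCSSPASGG TASAKGFSKR KLRQTRSLDP ALIGGCGSEM",
    "        60         70         80         90        100",
    "GAEGGLRGST VSRLHSPQLL AEGLGSRLAS SPRSQHLRAT RFQTPRPLCS",
]

sequence = parse_sequence_block(ROWS)
print(len(sequence))
# Index 36 is residue 37 (1-based), annotated in this entry as a phosphoserine.
print(sequence[36])
```

Because ruler rows are detected by content rather than by alignment, the function works whether or not the ruler spacing has survived copying.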
Isoform 2 (identifier: O54834-2)

The sequence of this isoform differs from the canonical sequence as follows:
     639-660: SPDILQTEVSFSMGGRHSSTDS → TSSVLPAAGQACSQSPASDFTP
     661-987: Missing.

Length: 660
Mass (Da): 72,713
Checksum: 7DB2D964FEF9A5F9

Isoform 3 (identifier: O54834-3)

The sequence of this isoform differs from the canonical sequence as follows:
     1-58: Missing.
     59-197: STVSRLHSPQ...SHPYVVWRSE → MGDPSYSEKPRLHYA

Length: 805
Mass (Da): 89,722
Checksum: 31C024E718236D05

Isoform 4 (identifier: O54834-4)

The sequence of this isoform differs from the canonical sequence as follows:
     1-197: MSAQSLLHSV...SHPYVVWRSE → MYKIF

Length: 795
Mass (Da): 88,672
Checksum: 50805457C81512DB
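
The isoform lengths listed above follow from the canonical length (987) and the listed differences: each replaced or missing range removes `end - start + 1` residues and adds the length of its replacement. A small Python sketch of that bookkeeping, using an illustrative `isoform_length` helper that is not part of any UniProt tooling:

```python
CANONICAL_LENGTH = 987

# (start, end, replacement) with 1-based inclusive positions on isoform 1.
# An empty replacement means the range is missing from the isoform.
ISOFORM_EDITS = {
    "O54834-2": [(639, 660, "TSSVLPAAGQACSQSPASDFTP"), (661, 987, "")],
    "O54834-3": [(1, 58, ""), (59, 197, "MGDPSYSEKPRLHYA")],
    "O54834-4": [(1, 197, "MYKIF")],
}


def isoform_length(canonical_length, edits):
    """Apply each edit's net length change to the canonical length."""
    length = canonical_length
    for start, end, replacement in edits:
        length -= end - start + 1   # residues removed from the canonical sequence
        length += len(replacement)  # residues inserted in their place
    return length


lengths = {
    isoform: isoform_length(CANONICAL_LENGTH, edits)
    for isoform, edits in ISOFORM_EDITS.items()
}
print(lengths)
```

The computed values can be compared with the Length fields given for isoforms 2, 3 and 4 as a consistency check on the variant descriptions.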

Sequence caution

The sequence BAC33263.1 differs from that shown. Reason: Erroneous initiation (curated).
The sequence BAC37093.1 differs from that shown. Reason: Erroneous initiation (curated).

Experimental Info

Feature key | Position(s) | Length | Description
Sequence conflict | 58 – 63 | 6 | GSTVSR → AHSKP in AAC53522 (PubMed:10699171). Curated
Sequence conflict | 58 – 63 | 6 | GSTVSR → AHSKP in AAD55086 (PubMed:10699171). Curated
Sequence conflict | 111 – 111 | 1 | K → N in AAC53522 (PubMed:10699171). Curated
Sequence conflict | 111 – 111 | 1 | K → N in AAD55086 (PubMed:10699171). Curated
Sequence conflict | 157 – 157 | 1 | L → V in AAC53522 (PubMed:10699171). Curated
Sequence conflict | 157 – 157 | 1 | L → V in AAD55086 (PubMed:10699171). Curated
Sequence conflict | 262 – 262 | 1 | K → N in AAC53522 (PubMed:10699171). Curated
Sequence conflict | 262 – 262 | 1 | K → N in AAD55086 (PubMed:10699171). Curated
Sequence conflict | 521 – 521 | 1 | L → P in BAE38520 (PubMed:16141072). Curated
Sequence conflict | 580 – 580 | 1 | S → P in BAE38520 (PubMed:16141072). Curated
Sequence conflict | 601 – 602 | 2 | EA → DS in AAC53522 (PubMed:10699171). Curated
Sequence conflict | 601 – 602 | 2 | EA → DS in AAD55086 (PubMed:10699171). Curated
Sequence conflict | 664 – 664 | 1 | S → F in AAC53522 (PubMed:10699171). Curated
Sequence conflict | 723 – 723 | 1 | P → S in AAC53522 (PubMed:10699171). Curated
Sequence conflict | 753 – 753 | 1 | P → Q in BAE38520 (PubMed:16141072). Curated

Alternative sequence

Feature key | Position(s) | Length | Description | Feature identifier
Alternative sequence | 1 – 197 | 197 | MSAQS…VWRSE → MYKIF in isoform 4. 1 Publication | VSP_026051
Alternative sequence | 1 – 58 | 58 | Missing in isoform 3. 1 Publication | VSP_026052
Alternative sequence | 59 – 197 | 139 | STVSR…VWRSE → MGDPSYSEKPRLHYA in isoform 3. 1 Publication | VSP_026053
Alternative sequence | 639 – 660 | 22 | SPDIL…SSTDS → TSSVLPAAGQACSQSPASDFTP in isoform 2. 1 Publication | VSP_001643
Alternative sequence | 661 – 987 | 327 | Missing in isoform 2. 1 Publication | VSP_001644

Sequence databases

EMBL / GenBank / DDBJ:
AF012273 mRNA. Translation: AAC53522.2.
AF177664 mRNA. Translation: AAD55086.1.
AK048162 mRNA. Translation: BAC33263.1. Different initiation.
AK048507 mRNA. Translation: BAC33352.1.
AK077996 mRNA. Translation: BAC37093.1. Different initiation.
AK166015 mRNA. Translation: BAE38520.1.
AL663026, AL805974, AL831750 Genomic DNA. Translation: CAM17930.1.
AL663026, AL805974, AL831750 Genomic DNA. Translation: CAM17933.1.
AL663056, AL805974, AL831750 Genomic DNA. Translation: CAM18669.1.
AL805974, AL663056, AL831750 Genomic DNA. Translation: CAM21305.1.
AL805974, AL663026, AL831750 Genomic DNA. Translation: CAM21306.1.
AL805974, AL663026, AL831750 Genomic DNA. Translation: CAM21309.1.
AL831750, AL663056, AL805974 Genomic DNA. Translation: CAM22606.1.
AL831750, AL663026, AL805974 Genomic DNA. Translation: CAM22607.1.
AL831750, AL663026, AL805974 Genomic DNA. Translation: CAM22609.1.
CCDS: CCDS30534.1. [O54834-1]
CCDS41212.1. [O54834-3]
CCDS72472.1. [O54834-4]
RefSeq: NP_001274459.1. NM_001287530.1. [O54834-4]
NP_033837.2. NM_009707.4. [O54834-1]
NP_848869.1. NM_178754.3. [O54834-3]
XP_006528757.1. XM_006528694.1. [O54834-1]
XP_011246085.1. XM_011247783.1. [O54834-2]
UniGene: Mm.441810.

Genome annotation databases

Ensembl: ENSMUST00000033721; ENSMUSP00000033721; ENSMUSG00000031355. [O54834-1]
ENSMUST00000112127; ENSMUSP00000107755; ENSMUSG00000031355. [O54834-4]
ENSMUST00000112131; ENSMUSP00000107759; ENSMUSG00000031355. [O54834-3]
GeneID: 11856.
KEGG: mmu:11856.
UCSC: uc009uxj.2. mouse. [O54834-1]
uc009uxo.2. mouse. [O54834-3]
uc009uxq.2. mouse. [O54834-4]

Keywords - Coding sequence diversity

Alternative splicing

Cross-references

Protocols and materials databases

DNASU: 11856.

Organism-specific databases

CTD: 395.
MGI: MGI:1196332. Arhgap6.

Miscellaneous databases

ChiTaRS: Arhgap6. mouse.
PRO: O54834.

Publications

  1. "Functional analysis of ARHGAP6, a novel GTPase-activating protein for RhoA."
    Prakash S.K., Paylor R., Jenna S., Lamarche-Vane N., Armstrong D.L., Xu B., Mancini M.A., Zoghbi H.Y.
    Hum. Mol. Genet. 9:477-488(2000)
    Cited for: NUCLEOTIDE SEQUENCE [MRNA] (ISOFORMS 1 AND 2), SEQUENCE REVISION TO 600-601.
    Strain: 129/Sv.
  2. "The transcriptional landscape of the mammalian genome."
    Carninci P., Kasukawa T., Katayama S., Gough J., Frith M.C., Maeda N., Oyama R., Ravasi T., Lenhard B., Wells C., Kodzius R., Shimokawa K., Bajic V.B., Brenner S.E., Batalov S., Forrest A.R., Zavolan M., Davis M.J., Wilming L.G., Aidinis V., Allen J.E., Ambesi-Impiombato A., Apweiler R., Aturaliya R.N., Bailey T.L., Bansal M., Baxter L., Beisel K.W., Bersano T., Bono H., Chalk A.M., Chiu K.P., Choudhary V., Christoffels A., Clutterbuck D.R., Crowe M.L., Dalla E., Dalrymple B.P., de Bono B., Della Gatta G., di Bernardo D., Down T., Engstrom P., Fagiolini M., Faulkner G., Fletcher C.F., Fukushima T., Furuno M., Futaki S., Gariboldi M., Georgii-Hemming P., Gingeras T.R., Gojobori T., Green R.E., Gustincich S., Harbers M., Hayashi Y., Hensch T.K., Hirokawa N., Hill D., Huminiecki L., Iacono M., Ikeo K., Iwama A., Ishikawa T., Jakt M., Kanapin A., Katoh M., Kawasawa Y., Kelso J., Kitamura H., Kitano H., Kollias G., Krishnan S.P., Kruger A., Kummerfeld S.K., Kurochkin I.V., Lareau L.F., Lazarevic D., Lipovich L., Liu J., Liuni S., McWilliam S., Madan Babu M., Madera M., Marchionni L., Matsuda H., Matsuzawa S., Miki H., Mignone F., Miyake S., Morris K., Mottagui-Tabar S., Mulder N., Nakano N., Nakauchi H., Ng P., Nilsson R., Nishiguchi S., Nishikawa S., Nori F., Ohara O., Okazaki Y., Orlando V., Pang K.C., Pavan W.J., Pavesi G., Pesole G., Petrovsky N., Piazza S., Reed J., Reid J.F., Ring B.Z., Ringwald M., Rost B., Ruan Y., Salzberg S.L., Sandelin A., Schneider C., Schoenbach C., Sekiguchi K., Semple C.A., Seno S., Sessa L., Sheng Y., Shibata Y., Shimada H., Shimada K., Silva D., Sinclair B., Sperling S., Stupka E., Sugiura K., Sultana R., Takenaka Y., Taki K., Tammoja K., Tan S.L., Tang S., Taylor M.S., Tegner J., Teichmann S.A., Ueda H.R., van Nimwegen E., Verardo R., Wei C.L., Yagi K., Yamanishi H., Zabarovsky E., Zhu S., Zimmer A., Hide W., Bult C., Grimmond S.M., Teasdale R.D., Liu E.T., Brusic V., Quackenbush J., Wahlestedt C., Mattick J.S., Hume D.A., Kai C., Sasaki D., Tomaru Y., Fukuda S., Kanamori-Katayama M., Suzuki M., Aoki J., Arakawa T., Iida J., Imamura K., Itoh M., Kato T., Kawaji H., Kawagashira N., Kawashima T., Kojima M., Kondo S., Konno H., Nakano K., Ninomiya N., Nishio T., Okada M., Plessy C., Shibata K., Shiraki T., Suzuki S., Tagami M., Waki K., Watahiki A., Okamura-Oho Y., Suzuki H., Kawai J., Hayashizaki Y.
    Science 309:1559-1563(2005)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORMS 3 AND 4).
    Strain: C57BL/6J.
    Tissue: Head, Lung and Testis.
  3. Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
    Strain: C57BL/6J.
  4. "Cloning and characterization of a novel rho-type GTPase-activating protein gene (ARHGAP6) from the critical region for microphthalmia with linear skin defects."
    Schaefer L., Prakash S.K., Zoghbi H.Y.
    Genomics 46:268-277(1997)
    Cited for: NUCLEOTIDE SEQUENCE [MRNA] OF 280-660 (ISOFORM 1).
    Strain: 129/Sv.
  5. Cited for: PHOSPHORYLATION [LARGE SCALE ANALYSIS] AT SER-675, IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
    Tissue: Liver.
  6. Cited for: PHOSPHORYLATION [LARGE SCALE ANALYSIS] AT SER-37; SER-669; SER-675; SER-713 AND SER-824, IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
    Tissue: Heart, Kidney, Lung and Spleen.

Entry information

Entry name: RHG06_MOUSE
Accession: Primary (citable) accession number: O54834
Secondary accession number(s): A2ABW4, A2AC55, Q3TMC2, Q8BG83, Q8C842, Q9QZL8
Entry history
Integrated into UniProtKB/Swiss-Prot: July 15, 1999
Last sequence update: May 29, 2007
Last modified: June 8, 2016
This is version 128 of the entry and version 3 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
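
UniProtKB accession numbers such as the primary and secondary accessions above follow a fixed pattern that UniProt publishes as a regular expression in its help pages. The sketch below checks this entry's accessions against that pattern; the function name is illustrative, and entry names (such as RHG06_MOUSE) and isoform identifiers (such as O54834-1) are deliberately not matched.

```python
import re

# Accession number pattern as documented in the UniProt help pages.
ACCESSION_RE = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def is_valid_accession(accession: str) -> bool:
    """Return True if the whole string is a well-formed UniProtKB accession."""
    return ACCESSION_RE.fullmatch(accession) is not None


# Primary accession followed by the secondary accessions of this entry.
ACCESSIONS = ["O54834", "A2ABW4", "A2AC55", "Q3TMC2", "Q8BG83", "Q8C842", "Q9QZL8"]

results = {acc: is_valid_accession(acc) for acc in ACCESSIONS}
print(results)
```

A check like this only confirms the format of an identifier; it says nothing about whether the accession exists or is still current.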

Miscellaneous

Keywords - Technical term

Complete proteome, Reference proteome

Documents

  1. MGD cross-references
    Mouse Genome Database (MGD) cross-references in UniProtKB/Swiss-Prot
  2. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.