Protein

Fanconi anemia group G protein

Gene

FANCG

Organism
Homo sapiens (Human)
Status
Reviewed - Annotation score: 5 out of 5 - Experimental evidence at protein level

Function

DNA repair protein that may operate in a postreplication repair or a cell cycle checkpoint function. May be implicated in interstrand DNA cross-link repair and in the maintenance of normal chromosome stability. Candidate tumor suppressor gene.

GO - Molecular function

  • damaged DNA binding Source: ProtInc

GO - Biological process

  • cell cycle checkpoint Source: ProtInc
  • DNA repair Source: Reactome
  • mitochondrion organization Source: UniProtKB
  • ovarian follicle development Source: Ensembl
  • response to radiation Source: Ensembl
  • spermatid development Source: Ensembl

Keywords - Biological process

DNA damage, DNA repair

Enzyme and pathway databases

Reactome: REACT_18410. Fanconi Anemia pathway.

Names & Taxonomy

Protein names
Recommended name: Fanconi anemia group G protein
Short name: Protein FACG
Alternative name(s): DNA repair protein XRCC9
Gene names
Name: FANCG
Synonyms: XRCC9
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes: UP000005640. Component: Chromosome 9

Organism-specific databases

HGNC: HGNC:3588. FANCG.

Subcellular location

GO - Cellular component

  • cytoplasm Source: HPA
  • Fanconi anaemia nuclear complex Source: UniProtKB
  • mitochondrion Source: UniProtKB
  • nucleolus Source: HPA
  • nucleoplasm Source: Reactome
  • plasma membrane Source: HPA

Keywords - Cellular component

Cytoplasm, Nucleus

Pathology & Biotech

Involvement in disease

Fanconi anemia complementation group G (FANCG) (2 publications)

The disease is caused by mutations affecting the gene represented in this entry.

Disease description: A disorder affecting all bone marrow elements and resulting in anemia, leukopenia and thrombopenia. It is associated with cardiac, renal and limb malformations, dermal pigmentary changes, and a predisposition to the development of malignancies. At the cellular level it is associated with hypersensitivity to DNA-damaging agents, chromosomal instability (increased chromosome breakage) and defective DNA repair.

See also OMIM:614082
Feature key | Position(s) | Length | Description | Feature identifier
Natural variant | 71 – 71 | 1 | L → P in FANCG; associated with a mild clinical phenotype; disruption of HES1-binding; no effect on FANCA-binding (2 publications) | VAR_017495

Mutagenesis

Feature key | Position(s) | Length | Description
Mutagenesis | 7 – 7 | 1 | S → A: Loss of BRCA2-, FANCD2- and XRCC3-binding. No effect on complex formation with FANCA and FANCF (1 publication)
Mutagenesis | 383 – 383 | 1 | S → A: No effect on BRCA2-, FANCA-, FANCF- or XRCC3-binding (1 publication)
Mutagenesis | 387 – 387 | 1 | S → A: No effect on BRCA2-, FANCA-, FANCF- or XRCC3-binding (1 publication)
Mutagenesis | 546 – 546 | 1 | G → R: No effect on HES1- or FANCA-binding (1 publication)

Keywords - Disease

Disease mutation, Fanconi anemia

Organism-specific databases

MIM: 614082. phenotype.
Orphanet: 84. Fanconi anemia.
PharmGKB: PA28002.

Polymorphism and mutation databases

BioMuta: FANCG.

PTM / Processing

Molecule processing

Feature key | Position(s) | Length | Description | Feature identifier
Chain | 1 – 622 | 622 | Fanconi anemia group G protein | PRO_0000106292

Amino acid modifications

Feature key | Position(s) | Length | Description
Modified residue | 7 – 7 | 1 | Phosphoserine (1 publication)

Keywords - PTM

Phosphoprotein

Proteomic databases

MaxQB: O15287.
PaxDb: O15287.
PRIDE: O15287.

PTM databases

PhosphoSite: O15287.

Expression

Tissue specificity

Highly expressed in testis and thymus. Found in lymphoblasts.

Gene expression databases

Bgee: O15287.
CleanEx: HS_FANCG.
ExpressionAtlas: O15287. baseline and differential.
Genevestigator: O15287.

Organism-specific databases

HPA: CAB008105, HPA045335.

Interaction

Subunit structure

Belongs to the multisubunit FA complex composed of FANCA, FANCB, FANCC, FANCE, FANCF, FANCG, FANCL/PHF9 and FANCM. The complex is not found in FA patients. In complex with FANCF, FANCA and FANCL, but not with FANCC or FANCE, interacts with HES1; this interaction may be essential for the stability and nuclear localization of FA core complex proteins. The complex with FANCC and FANCG may also include EIF2AK2 and HSP70. When phosphorylated at Ser-7, forms a complex with BRCA2, FANCD2 and XRCC3. (9 publications)

Binary interactions

With | Entry | #Exp. | IntAct | Notes
FANCA | O15360 | 10 | EBI-81610, EBI-81570 |
FANCF | Q9NPI8 | 4 | EBI-81610, EBI-81589 |
SPTAN1 | Q13813 | 4 | EBI-81610, EBI-351450 |
tax | P14079 | 3 | EBI-81610, EBI-9675698 | From a different organism.

Protein-protein interaction databases

BioGrid: 108484. 47 interactions.
IntAct: O15287. 15 interactions.
MINT: MINT-96443.
STRING: 9606.ENSP00000367910.

Structure

3D structure databases

ProteinModelPortal: O15287.
SMR: O15287. Positions 223-306, 524-573.

Family & Domains

Domains and Repeats

Feature key | Position(s) | Length | Description
Repeat | 246 – 279 | 34 | TPR 1
Repeat | 344 – 377 | 34 | TPR 2
Repeat | 453 – 486 | 34 | TPR 3
Repeat | 514 – 547 | 34 | TPR 4

Sequence similarities

Contains 4 TPR repeats. (Curated)
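
The repeat coordinates in this entry are 1-based and inclusive, so a span's length is end - start + 1. The short Python sketch below is illustrative only and is not part of the UniProt record: the names `TPR_REPEATS`, `span_length` and `to_slice` are hypothetical helpers introduced here. It recomputes the listed lengths and shows how such coordinates would map onto a 0-based slice.

```python
# TPR repeat coordinates as listed in the Domains and Repeats table
# (1-based, inclusive). The helper names below are illustrative assumptions.
TPR_REPEATS = {
    "TPR 1": (246, 279),
    "TPR 2": (344, 377),
    "TPR 3": (453, 486),
    "TPR 4": (514, 547),
}


def span_length(start: int, end: int) -> int:
    """Length of a 1-based inclusive interval."""
    return end - start + 1


def to_slice(start: int, end: int) -> slice:
    """Convert 1-based inclusive coordinates to a 0-based Python slice."""
    return slice(start - 1, end)


# Recompute the Length column from the Position(s) column.
lengths = {name: span_length(*span) for name, span in TPR_REPEATS.items()}

if __name__ == "__main__":
    for name, (start, end) in TPR_REPEATS.items():
        print(name, start, end, lengths[name])
```

Applied to the 622-residue sequence of this entry held as a Python string, `sequence[to_slice(246, 279)]` would return the 34 residues annotated as TPR 1.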

Keywords - Domain

Repeat, TPR repeat

Phylogenomic databases

eggNOG: NOG40254.
HOGENOM: HOG000231403.
HOVERGEN: HBG051551.
InParanoid: O15287.
KO: K10894.
OMA: DTKALQD.
OrthoDB: EOG7S21Z5.
PhylomeDB: O15287.
TreeFam: TF330722.

Family and domain databases

Gene3D: 1.25.40.10. 1 hit.
InterPro: IPR011990. TPR-like_helical_dom.
IPR001440. TPR_1.
IPR019734. TPR_repeat.
Pfam: PF00515. TPR_1. 1 hit.
SMART: SM00028. TPR. 3 hits.

Sequence

Sequence status: Complete.

O15287-1 [UniParc]

        10         20         30         40         50
MSRQTTSVGS SCLDLWREKN DRLVRQAKVA QNSGLTLRRQ QLAQDALEGL
        60         70         80         90        100
RGLLHSLQGL PAAVPVLPLE LTVTCNFIIL RASLAQGFTE DQAQDIQRSL
       110        120        130        140        150
ERVLETQEQQ GPRLEQGLRE LWDSVLRASC LLPELLSALH RLVGLQAALW
       160        170        180        190        200
LSADRLGDLA LLLETLNGSQ SGASKDLLLL LKTWSPPAEE LDAPLTLQDA
       210        220        230        240        250
QGLKDVLLTA FAYRQGLQEL ITGNPDKALS SLHEAASGLC PRPVLVQVYT
       260        270        280        290        300
ALGSCHRKMG NPQRALLYLV AALKEGSAWG PPLLEASRLY QQLGDTTAEL
       310        320        330        340        350
ESLELLVEAL NVPCSSKAPQ FLIEVELLLP PPDLASPLHC GTQSQTKHIL
       360        370        380        390        400
ASRCLQTGRA GDAAEHYLDL LALLLDSSEP RFSPPPSPPG PCMPEVFLEA
       410        420        430        440        450
AVALIQAGRA QDALTLCEEL LSRTSSLLPK MSRLWEDARK GTKELPYCPL
       460        470        480        490        500
WVSATHLLQG QAWVQLGAQK VAISEFSRCL ELLFRATPEE KEQGAAFNCE
       510        520        530        540        550
QGCKSDAALQ QLRAAALISR GLEWVASGQD TKALQDFLLS VQMCPGNRDT
       560        570        580        590        600
YFHLLQTLKR LDRRDEATAL WWRLEAQTKG SHEDALWSLP LYLESYLSWI
       610        620
RPSDRDAFLE EFRTSLPKSC DL
Length: 622
Mass (Da): 68,554
Last modified: January 1, 1998 - v1
Checksum: 4BC7475472AC3C84

Natural variant

Feature key | Position(s) | Length | Description | Feature identifier
Natural variant | 71 – 71 | 1 | L → P in FANCG; associated with a mild clinical phenotype; disruption of HES1-binding; no effect on FANCA-binding (2 publications) | VAR_017495
Natural variant | 294 – 294 | 1 | G → E (1 publication). Corresponds to variant rs17880082. | VAR_021103
Natural variant | 297 – 297 | 1 | T → I (1 publication). Corresponds to variant rs2237857. | VAR_020311
Natural variant | 330 – 330 | 1 | P → S (1 publication). Corresponds to variant rs4986940. | VAR_021104
Natural variant | 378 – 378 | 1 | S → L (1 publication). Corresponds to variant rs4986939. | VAR_021105
Natural variant | 430 – 430 | 1 | K → E (1 publication). Corresponds to variant rs17881054. | VAR_021106
Natural variant | 513 – 513 | 1 | R → Q (1 publication). Corresponds to variant rs17885240. | VAR_021107
Natural variant | 603 – 603 | 1 | S → F (1 publication). Corresponds to variant rs17878854. | VAR_021108
Natural variant | 607 – 607 | 1 | A → T in a colorectal cancer sample; somatic mutation (1 publication) | VAR_035864
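
Each variant and mutagenesis annotation names a reference residue at a 1-based position, which can be checked against the canonical sequence. The following Python sketch is an illustration under stated assumptions, not a UniProt tool: the sequence string is transcribed from the Sequence section of this entry, and `SITES` and `mismatches` are hypothetical names introduced here.

```python
# Canonical FANCG sequence (O15287-1), transcribed from the Sequence section
# of this entry with the spacing and position rulers removed.
SEQ = (
    "MSRQTTSVGSSCLDLWREKNDRLVRQAKVAQNSGLTLRRQQLAQDALEGL"
    "RGLLHSLQGLPAAVPVLPLELTVTCNFIILRASLAQGFTEDQAQDIQRSL"
    "ERVLETQEQQGPRLEQGLRELWDSVLRASCLLPELLSALHRLVGLQAALW"
    "LSADRLGDLALLLETLNGSQSGASKDLLLLLKTWSPPAEELDAPLTLQDA"
    "QGLKDVLLTAFAYRQGLQELITGNPDKALSSLHEAASGLCPRPVLVQVYT"
    "ALGSCHRKMGNPQRALLYLVAALKEGSAWGPPLLEASRLYQQLGDTTAEL"
    "ESLELLVEALNVPCSSKAPQFLIEVELLLPPPDLASPLHCGTQSQTKHIL"
    "ASRCLQTGRAGDAAEHYLDLLALLLDSSEPRFSPPPSPPGPCMPEVFLEA"
    "AVALIQAGRAQDALTLCEELLSRTSSLLPKMSRLWEDARKGTKELPYCPL"
    "WVSATHLLQGQAWVQLGAQKVAISEFSRCLELLFRATPEEKEQGAAFNCE"
    "QGCKSDAALQQLRAAALISRGLEWVASGQDTKALQDFLLSVQMCPGNRDT"
    "YFHLLQTLKRLDRRDEATALWWRLEAQTKGSHEDALWSLPLYLESYLSWI"
    "RPSDRDAFLEEFRTSLPKSCDL"
)

# 1-based position -> reference residue, taken from the Natural variant and
# Mutagenesis tables of this entry (e.g. "L → P" at 71 gives 71: "L").
SITES = {
    7: "S", 71: "L", 294: "G", 297: "T", 330: "P", 378: "S", 383: "S",
    387: "S", 430: "K", 513: "R", 546: "G", 603: "S", 607: "A",
}


def mismatches(seq, sites):
    """Return (position, annotated, observed) for every site that disagrees."""
    return [
        (pos, ref, seq[pos - 1])  # pos - 1 converts to 0-based indexing
        for pos, ref in sorted(sites.items())
        if seq[pos - 1] != ref
    ]


if __name__ == "__main__":
    print("length:", len(SEQ))
    print("mismatches:", mismatches(SEQ, SITES))
```

An empty mismatch list indicates that the transcribed sequence and the feature tables agree on every annotated reference residue.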

Sequence databases

EMBL / GenBank / DDBJ:
U70310 mRNA. Translation: AAB80802.1.
AJ007669 mRNA. Translation: CAA07602.1.
AY795970 Genomic DNA. Translation: AAV40841.1.
AC004472 Genomic DNA. Translation: AAC07981.1.
AL353795 Genomic DNA. Translation: CAH70994.1.
BC000032 mRNA. Translation: AAH00032.1.
BC011623 mRNA. Translation: AAH11623.1.
CCDS: CCDS6574.1.
PIR: T02244.
RefSeq: NP_004620.1. NM_004629.1.
UniGene: Hs.591084.

Genome annotation databases

Ensembl: ENST00000378643; ENSP00000367910; ENSG00000221829.
GeneID: 2189.
KEGG: hsa:2189.
UCSC: uc003zwb.1. human.

Keywords - Coding sequence diversity

Polymorphism

Cross-references

Web resources

Fanconi Anemia Mutation Database
Atlas of Genetics and Cytogenetics in Oncology and Haematology
NIEHS-SNPs

Protocols and materials databases

DNASU: 2189.

Organism-specific databases

CTD: 2189.
GeneCards: GC09M035073.
GeneReviews: FANCG.
HGNC: HGNC:3588. FANCG.
HPA: CAB008105, HPA045335.
MIM: 602956 (gene); 614082 (phenotype).
neXtProt: NX_O15287.
Orphanet: 84. Fanconi anemia.
PharmGKB: PA28002.

Miscellaneous databases

ChiTaRS: FANCG. human.
GeneWiki: FANCG.
GenomeRNAi: 2189.
NextBio: 8847.
PRO: O15287.

Publications

  1. "The human XRCC9 gene corrects chromosomal instability and mutagen sensitivities in CHO UV40 cells."
    Liu N., Lamerdin J.E., Tucker J.D., Zhou Z.-Q., Walter C.A., Albala J.S., Busch D.B., Thompson L.H.
    Proc. Natl. Acad. Sci. U.S.A. 94:9232-9237(1997)
    Cited for: NUCLEOTIDE SEQUENCE [MRNA].
  2. Cited for: NUCLEOTIDE SEQUENCE [GENOMIC DNA / MRNA].
  3. NIEHS SNPs program
    Submitted (OCT-2004) to the EMBL/GenBank/DDBJ databases
    Cited for: NUCLEOTIDE SEQUENCE [GENOMIC DNA], VARIANTS GLU-294; ILE-297; SER-330; LEU-378; GLU-430; GLN-513 AND PHE-603.
  4. "DNA sequence and analysis of human chromosome 9."
    Humphray S.J., Oliver K., Hunt A.R., Plumb R.W., Loveland J.E., Howe K.L., Andrews T.D., Searle S., Hunt S.E., Scott C.E., Jones M.C., Ainscough R., Almeida J.P., Ambrose K.D., Ashwell R.I.S., Babbage A.K., Babbage S., Bagguley C.L., Bailey J., Banerjee R., Barker D.J., Barlow K.F., Bates K., Beasley H., Beasley O., Bird C.P., Bray-Allen S., Brown A.J., Brown J.Y., Burford D., Burrill W., Burton J., Carder C., Carter N.P., Chapman J.C., Chen Y., Clarke G., Clark S.Y., Clee C.M., Clegg S., Collier R.E., Corby N., Crosier M., Cummings A.T., Davies J., Dhami P., Dunn M., Dutta I., Dyer L.W., Earthrowl M.E., Faulkner L., Fleming C.J., Frankish A., Frankland J.A., French L., Fricker D.G., Garner P., Garnett J., Ghori J., Gilbert J.G.R., Glison C., Grafham D.V., Gribble S., Griffiths C., Griffiths-Jones S., Grocock R., Guy J., Hall R.E., Hammond S., Harley J.L., Harrison E.S.I., Hart E.A., Heath P.D., Henderson C.D., Hopkins B.L., Howard P.J., Howden P.J., Huckle E., Johnson C., Johnson D., Joy A.A., Kay M., Keenan S., Kershaw J.K., Kimberley A.M., King A., Knights A., Laird G.K., Langford C., Lawlor S., Leongamornlert D.A., Leversha M., Lloyd C., Lloyd D.M., Lovell J., Martin S., Mashreghi-Mohammadi M., Matthews L., McLaren S., McLay K.E., McMurray A., Milne S., Nickerson T., Nisbett J., Nordsiek G., Pearce A.V., Peck A.I., Porter K.M., Pandian R., Pelan S., Phillimore B., Povey S., Ramsey Y., Rand V., Scharfe M., Sehra H.K., Shownkeen R., Sims S.K., Skuce C.D., Smith M., Steward C.A., Swarbreck D., Sycamore N., Tester J., Thorpe A., Tracey A., Tromans A., Thomas D.W., Wall M., Wallis J.M., West A.P., Whitehead S.L., Willey D.L., Williams S.A., Wilming L., Wray P.W., Young L., Ashurst J.L., Coulson A., Blocker H., Durbin R.M., Sulston J.E., Hubbard T., Jackson M.J., Bentley D.R., Beck S., Rogers J., Dunham I.
    Nature 429:369-374(2004)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
  5. "The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC)."
    The MGC Project Team
    Genome Res. 14:2121-2127(2004)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA].
    Tissue: Kidney and Uterus.
  6. "Fanconi anemia proteins FANCA, FANCC, and FANCG/XRCC9 interact in a functional nuclear complex."
    Garcia-Higuera I., Kuang Y., Naf D., Wasik J., D'Andrea A.D.
    Mol. Cell. Biol. 19:4866-4873(1999)
    Cited for: CHARACTERIZATION.
  7. "A multiprotein nuclear complex connects Fanconi anemia and Bloom syndrome."
    Meetei A.R., Sechi S., Wallisch M., Yang D., Young M.K., Joenje H., Hoatlin M.E., Wang W.
    Mol. Cell. Biol. 23:3417-3426(2003)
    Cited for: IDENTIFICATION IN A COMPLEX WITH FANCA; FANCC; FANCE; FANCF AND FANCL.
  8. "The Fanconi anemia proteins functionally interact with the protein kinase regulated by RNA (PKR)."
    Zhang X., Li J., Sejas D.P., Rathbun K.R., Bagby G.C., Pang Q.
    J. Biol. Chem. 279:43910-43919(2004)
    Cited for: IDENTIFICATION IN A COMPLEX WITH EIF2AK2; FANCA; FANCC AND HSP70.
  9. Cited for: IDENTIFICATION IN A COMPLEX WITH FANCA; FANCB; FANCC; FANCE; FANCF AND FANCL.
  10. "A human ortholog of archaeal DNA repair protein Hef is defective in Fanconi anemia complementation group M."
    Meetei A.R., Medhurst A.L., Ling C., Xue Y., Singh T.R., Bier P., Steltenpool J., Stone S., Dokal I., Mathew C.G., Hoatlin M., Joenje H., de Winter J.P., Wang W.
    Nat. Genet. 37:958-963(2005)
    Cited for: IDENTIFICATION IN A COMPLEX WITH FANCA; FANCB; FANCC; FANCE; FANCF; FANCL AND FANCM.
  11. "FANCG promotes formation of a newly identified protein complex containing BRCA2, FANCD2 and XRCC3."
    Wilson J.B., Yamamoto K., Marriott A.S., Hussain S., Sung P., Hoatlin M.E., Mathew C.G., Takata M., Thompson L.H., Kupfer G.M., Jones N.J.
    Oncogene 27:3641-3652(2008)
    Cited for: INTERACTION WITH BRCA2; FANCD2 AND XRCC3, PHOSPHORYLATION AT SER-7, MUTAGENESIS OF SER-7; SER-383 AND SER-387.
  12. "FAAP20: a novel ubiquitin-binding FA nuclear core-complex protein required for functional integrity of the FA-BRCA DNA repair pathway."
    Ali A.M., Pradhan A., Singh T.R., Du C., Li J., Wahengbam K., Grassman E., Auerbach A.D., Pang Q., Meetei A.R.
    Blood 119:3285-3294(2012)
    Cited for: IDENTIFICATION IN THE FA COMPLEX.
  13. "A ubiquitin-binding protein, FAAP20, links RNF8-mediated ubiquitination to the Fanconi anemia DNA repair network."
    Yan Z., Guo R., Paramasivam M., Shen W., Ling C., Fox D. III, Wang Y., Oostra A.B., Kuehl J., Lee D.Y., Takata M., Hoatlin M.E., Schindler D., Joenje H., de Winter J.P., Li L., Seidman M.M., Wang W.
    Mol. Cell 47:61-75(2012)
    Cited for: IDENTIFICATION IN THE FA COMPLEX.
  14. "Regulation of Rev1 by the Fanconi anemia core complex."
    Kim H., Yang K., Dejsuphong D., D'Andrea A.D.
    Nat. Struct. Mol. Biol. 19:164-170(2012)
    Cited for: IDENTIFICATION IN THE FA COMPLEX.
  15. Cited for: VARIANT FANCG PRO-71.
  16. Cited for: VARIANT [LARGE SCALE ANALYSIS] THR-607.
  17. "HES1 is a novel interactor of the Fanconi anemia core complex."
    Tremblay C.S., Huang F.F., Habi O., Huard C.C., Godin C., Levesque G., Carreau M.
    Blood 112:2062-2070(2008)
    Cited for: VARIANT FANCG PRO-71, INTERACTION WITH HES1, SUBCELLULAR LOCATION, MUTAGENESIS OF GLY-546.

Entry information

Entry name: FANCG_HUMAN
Accession: Primary (citable) accession number: O15287
Entry history:
Integrated into UniProtKB/Swiss-Prot: July 15, 1999
Last sequence update: January 1, 1998
Last modified: May 27, 2015
This is version 151 of the entry and version 1 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

Keywords - Technical term

Complete proteome, Reference proteome

Documents

  1. Human chromosome 9
    Human chromosome 9: entries, gene names and cross-references to MIM
  2. Human entries with polymorphisms or disease mutations
    List of human entries with polymorphisms or disease mutations
  3. Human polymorphisms and disease mutations
    Index of human polymorphisms and disease mutations
  4. MIM cross-references
    Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot
  5. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into a single UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.