Protein

Insulin-like growth factor II

Gene

IGF2

Organism
Homo sapiens (Human)
Status
Reviewed - Annotation score: 5 out of 5 - Experimental evidence at protein level

Function

The insulin-like growth factors possess growth-promoting activity. In vitro, they are potent mitogens for cultured cells. IGF-II is influenced by placental lactogen and may play a role in fetal development.
Preptin undergoes glucose-mediated co-secretion with insulin and acts as a physiological amplifier of glucose-mediated insulin secretion. It exhibits osteogenic properties by increasing osteoblast mitogenic activity through phosphoactivation of MAPK1 and MAPK3.

GO - Molecular function

  • growth factor activity Source: BHF-UCL
  • insulin-like growth factor receptor binding Source: AgBase
  • insulin receptor binding Source: BHF-UCL
  • protein serine/threonine kinase activator activity Source: BHF-UCL
  • receptor activator activity Source: BHF-UCL

Keywords - Molecular function

Growth factor, Hormone, Mitogen

Keywords - Biological process

Carbohydrate metabolism, Glucose metabolism, Osteogenesis

Enzyme and pathway databases

Reactome: R-HSA-114608. Platelet degranulation.
    R-HSA-2404192. Signaling by Type 1 Insulin-like Growth Factor 1 Receptor (IGF1R).
    R-HSA-2428928. IRS-related events triggered by IGF1R.
    R-HSA-2428933. SHC-related events triggered by IGF1R.
    R-HSA-381426. Regulation of Insulin-like Growth Factor (IGF) transport and uptake by Insulin-like Growth Factor Binding Proteins (IGFBPs).
SignaLink: P01344.
SIGNOR: P01344.

Names & Taxonomy

Protein names
Recommended name:
Insulin-like growth factor II
Short name:
IGF-II
Alternative name(s):
Somatomedin-A
T3M-11-derived growth factor
Cleaved into the following 3 chains:
Insulin-like growth factor II
Insulin-like growth factor II Ala-25 Del
Preptin
Gene names
Name: IGF2
ORF names: PP1446
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes
  • UP000005640 Component: Chromosome 11

Organism-specific databases

HGNC: HGNC:5466. IGF2.

Subcellular location

Keywords - Cellular component

Secreted

Pathology & Biotech

Involvement in disease

Silver-Russell syndrome (SRS) (1 publication)
The gene represented in this entry is involved in disease pathogenesis. Most cases of Silver-Russell syndrome are caused by epigenetic changes, specifically DNA hypomethylation at the telomeric imprinting control region (ICR1) on chromosome 11p15, which involves the H19 and IGF2 genes.
Disease description: A clinically heterogeneous condition characterized by severe intrauterine growth retardation, poor postnatal growth, craniofacial features such as a triangular shaped face and a broad forehead, body asymmetry, and a variety of minor malformations. The phenotypic expression changes during childhood and adolescence, with the facial features and asymmetry usually becoming more subtle with age.
See also OMIM:180860
Growth restriction, severe, with distinctive facies (GRDF) (1 publication)
The disease is caused by mutations affecting the gene represented in this entry.
Disease description: A disease characterized by severe prenatal and postnatal growth restriction, facial dysmorphism, and short stature in the presence of normal or slightly elevated growth hormone levels.
See also OMIM:616489

Mutagenesis

Feature key | Position(s) | Length | Description
Mutagenesis | 92 – 92 | 1 | R → A: Decreases mature IGF2 levels. (1 publication)
Mutagenesis | 112 – 112 | 1 | K → A: No effect on proteolytic processing. (1 publication)
Mutagenesis | 128 – 128 | 1 | R → A: Abolishes proteolytic processing. (1 publication)
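
The three substitutions listed above can be reproduced on the canonical sequence. The following is a minimal sketch, assuming 1-based positions on isoform 1 (P01344-1); the helper `mutate` and the labels `R92A`, `K112A` and `R128A` are illustrative names and not part of the entry.

```python
# Canonical isoform 1 (P01344-1), copied from the Sequences section of this entry.
IGF2 = (
    "MGIPMGKSMLVLLTFLAFASCCIAAYRPSETLCGGELVDTLQFVCGDRGF"
    "YFSRPASRVSRRSRGIVEECCFRSCDLALLETYCATPAKSERDVSTPPTV"
    "LPDNFPRYPVGKFFQYDTWKQSTQRLRRGLPALLRARRGHVLAKELEAFR"
    "EAKRHRPLIALPTQDPAHGGAPPEMASNRK"
)


def mutate(seq: str, pos: int, wild: str, mutant: str) -> str:
    """Return seq with the residue at 1-based position pos changed from wild to mutant."""
    found = seq[pos - 1]
    if found != wild:
        raise ValueError(f"expected {wild} at position {pos}, found {found}")
    return seq[:pos - 1] + mutant + seq[pos:]


# (position, wild-type residue, replacement) as listed in the Mutagenesis table.
MUTATIONS = {
    "R92A": (92, "R", "A"),
    "K112A": (112, "K", "A"),
    "R128A": (128, "R", "A"),
}

mutants = {name: mutate(IGF2, *spec) for name, spec in MUTATIONS.items()}

for name, seq in mutants.items():
    pos = MUTATIONS[name][0]
    # Show a small window around the mutated residue.
    print(name, seq[pos - 4:pos + 3])
```

Checking the wild-type residue before substituting guards against off-by-one errors when positions are copied from a feature table.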

Keywords - Disease

Dwarfism

Organism-specific databases

MalaCards: IGF2.
MIM: 147470. gene+phenotype.
    180860. phenotype.
    616489. phenotype.
Orphanet: 231117. Beckwith-Wiedemann syndrome due to imprinting defect of 11p15.
    2128. Hemihypertrophy.
    231144. Silver-Russell syndrome due to 11p15 microduplication.
    231140. Silver-Russell syndrome due to imprinting defect of 11p15.
PharmGKB: PA29699.

Polymorphism and mutation databases

BioMuta: IGF2.
DMDM: 124255.

PTM / Processing

Molecule processing

Feature key | Position(s) | Length | Description | Feature identifier
Signal peptide | 1 – 24 | 24 | (2 publications) |
Chain | 25 – 91 | 67 | Insulin-like growth factor II | PRO_0000015717
Chain | 26 – 91 | 66 | Insulin-like growth factor II Ala-25 Del | PRO_0000015718
Propeptide | 92 – 180 | 89 | E peptide | PRO_0000015719
Peptide | 93 – 126 | 34 | Preptin | PRO_0000370376
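
The processing features above are 1-based, inclusive ranges on the canonical sequence, so each product can be cut out with a simple slice. The sketch below assumes the isoform 1 sequence from the Sequences section; `FEATURES` and `feature_seq` are illustrative names.

```python
# Canonical isoform 1 (P01344-1), copied from the Sequences section of this entry.
IGF2 = (
    "MGIPMGKSMLVLLTFLAFASCCIAAYRPSETLCGGELVDTLQFVCGDRGF"
    "YFSRPASRVSRRSRGIVEECCFRSCDLALLETYCATPAKSERDVSTPPTV"
    "LPDNFPRYPVGKFFQYDTWKQSTQRLRRGLPALLRARRGHVLAKELEAFR"
    "EAKRHRPLIALPTQDPAHGGAPPEMASNRK"
)

# 1-based inclusive (start, end) ranges from the Molecule processing table.
FEATURES = {
    "signal peptide": (1, 24),
    "IGF-II": (25, 91),
    "IGF-II Ala-25 Del": (26, 91),
    "E peptide": (92, 180),
    "preptin": (93, 126),
}


def feature_seq(seq: str, start: int, end: int) -> str:
    """Return the residues from start to end, both 1-based and inclusive."""
    return seq[start - 1:end]


chains = {name: feature_seq(IGF2, s, e) for name, (s, e) in FEATURES.items()}

for name, sub in chains.items():
    print(f"{name:18s} {len(sub):3d} {sub}")
```

The signal peptide, the mature chain and the E peptide tile the precursor without gaps, which is a quick consistency check on the coordinates.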

Amino acid modifications

Feature key | Position(s) | Length | Description
Disulfide bond | 33 ↔ 71 | | (1 publication)
Disulfide bond | 45 ↔ 84 | | (1 publication)
Disulfide bond | 70 ↔ 75 | | (1 publication)
Glycosylation | 96 – 96 | 1 | O-linked (GalNAc...) (2 publications)
Glycosylation | 99 – 99 | 1 | O-linked (GalNAc...) (2 publications)
Glycosylation | 163 – 163 | 1 | O-linked (GalNAc...) (2 publications)
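
As a sanity check on the positions above, every disulfide partner should be a cysteine and every O-glycosylation site a threonine in the canonical sequence. This is a small sketch under that assumption; `DISULFIDES`, `O_GLYCOSITES` and `residue` are illustrative names.

```python
# Canonical isoform 1 (P01344-1), copied from the Sequences section of this entry.
IGF2 = (
    "MGIPMGKSMLVLLTFLAFASCCIAAYRPSETLCGGELVDTLQFVCGDRGF"
    "YFSRPASRVSRRSRGIVEECCFRSCDLALLETYCATPAKSERDVSTPPTV"
    "LPDNFPRYPVGKFFQYDTWKQSTQRLRRGLPALLRARRGHVLAKELEAFR"
    "EAKRHRPLIALPTQDPAHGGAPPEMASNRK"
)

DISULFIDES = [(33, 71), (45, 84), (70, 75)]  # 1-based cysteine pairs
O_GLYCOSITES = [96, 99, 163]                 # 1-based O-linked sites


def residue(seq: str, pos: int) -> str:
    """Return the residue at a 1-based position."""
    return seq[pos - 1]


all_cys = all(residue(IGF2, a) == "C" and residue(IGF2, b) == "C"
              for a, b in DISULFIDES)
glyco_residues = {pos: residue(IGF2, pos) for pos in O_GLYCOSITES}

# Positions of every cysteine inside the mature chain (25-91).
mature_cys = [i for i in range(25, 92) if residue(IGF2, i) == "C"]
paired = sorted(p for pair in DISULFIDES for p in pair)

print("all disulfide partners are Cys:", all_cys)
print("glycosylated residues:", glyco_residues)
print("mature-chain cysteines:", mature_cys)
```

In this sequence the six cysteines of the mature chain are exactly the six positions named in the three disulfide bonds.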

Post-translational modification

O-glycosylated with core 1 or possibly core 8 glycans. Thr-96 is a minor glycosylation site compared to Thr-99. (4 publications)
Proteolytically processed by PCSK4: proIGF2 is cleaved at Arg-128 and Arg-92 to generate big-IGF2 and mature IGF2. (1 publication)

Keywords - PTM

Cleavage on pair of basic residues, Disulfide bond, Glycoprotein

Proteomic databases

MaxQB: P01344.
PaxDb: P01344.
PeptideAtlas: P01344.
PRIDE: P01344.

PTM databases

iPTMnet: P01344.
PhosphoSite: P01344.
UniCarbKB: P01344.

Miscellaneous databases

PMAP-CutDB: P01344.

Expression

Gene expression databases

Bgee: ENSG00000167244.
CleanEx: HS_IGF2.
Genevisible: P01344. HS.

Organism-specific databases

HPA: HPA007556.
    HPA007993.

Interaction

Binary interactions

With | Entry | #Exp. | IntAct
IGF2R | P11717 | 17 | EBI-7178764, EBI-1048580
RBPMS | Q93062 | 3 | EBI-7178764, EBI-740322

Protein-protein interaction databases

BioGrid: 109702. 15 interactions.
DIP: DIP-29508N.
IntAct: P01344. 6 interactions.
MINT: MINT-6380943.
STRING: 9606.ENSP00000391826.

Structure

Secondary structure

Feature key | Position(s) | Length | Evidence
Helix | 34 – 44 | 11 | Combined sources
Turn | 45 – 48 | 4 | Combined sources
Beta strand | 50 – 52 | 3 | Combined sources
Helix | 55 – 58 | 4 | Combined sources
Helix | 61 – 72 | 12 | Combined sources
Helix | 77 – 82 | 6 | Combined sources
Beta strand | 86 – 88 | 3 | Combined sources

3D structure databases

Entry | Method | Resolution (Å) | Chain | Positions
1GF2 | model | - | A | 25-91
1IGL | NMR | - | A | 25-91
2L29 | NMR | - | B | 25-91
2V5P | X-ray | 4.10 | C/D | 25-91
3E4Z | X-ray | 2.28 | C/D | 25-91
3KR3 | X-ray | 2.20 | D | 25-91
ProteinModelPortal: P01344.
SMR: P01344. Positions 29-88.

Miscellaneous databases

EvolutionaryTrace: P01344.

Family & Domains

Region

Feature key | Position(s) | Length | Description
Region | 25 – 52 | 28 | B
Region | 53 – 64 | 12 | C
Region | 65 – 85 | 21 | A
Region | 86 – 91 | 6 | D
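
The four regions listed above are contiguous and together span the mature chain (25 – 91). A minimal sketch that slices them out of the canonical sequence, assuming 1-based inclusive coordinates; `REGIONS` and `domains` are illustrative names.

```python
# Canonical isoform 1 (P01344-1), copied from the Sequences section of this entry.
IGF2 = (
    "MGIPMGKSMLVLLTFLAFASCCIAAYRPSETLCGGELVDTLQFVCGDRGF"
    "YFSRPASRVSRRSRGIVEECCFRSCDLALLETYCATPAKSERDVSTPPTV"
    "LPDNFPRYPVGKFFQYDTWKQSTQRLRRGLPALLRARRGHVLAKELEAFR"
    "EAKRHRPLIALPTQDPAHGGAPPEMASNRK"
)

# Regions in N- to C-terminal order, 1-based inclusive.
REGIONS = {"B": (25, 52), "C": (53, 64), "A": (65, 85), "D": (86, 91)}

domains = {name: IGF2[start - 1:end] for name, (start, end) in REGIONS.items()}
mature = IGF2[25 - 1:91]

for name, sub in domains.items():
    print(name, len(sub), sub)

# Joining B, C, A and D in order should rebuild the mature chain.
print("regions tile the mature chain:", "".join(domains.values()) == mature)
```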

Sequence similarities

Belongs to the insulin family. (Curated)

Keywords - Domain

Signal

Phylogenomic databases

eggNOG: ENOG410IY3P. Eukaryota.
    ENOG4111KP2. LUCA.
GeneTree: ENSGT00530000063856.
HOGENOM: HOG000233362.
HOVERGEN: HBG006137.
InParanoid: P01344.
KO: K13769.
OMA: FFQYDTW.
OrthoDB: EOG091G0H56.
PhylomeDB: P01344.
TreeFam: TF332820.

Family and domain databases

Gene3D: 1.10.100.10. 1 hit.
InterPro: IPR022334. IGF2.
    IPR013576. IGF2_C.
    IPR016179. Insulin-like.
    IPR022350. Insulin-like_growth_factor.
    IPR022353. Insulin_CS.
    IPR022352. Insulin_family.
Pfam: PF08365. IGF2_C. 1 hit.
    PF00049. Insulin. 2 hits.
PRINTS: PR02002. INSLNLIKEGF.
    PR02006. INSLNLIKEGF2.
    PR00276. INSULINFAMLY.
ProDom: PD005188. IGF2_C. 1 hit.
SMART: SM00078. IlGF. 1 hit.
SUPFAM: SSF56994. SSF56994. 1 hit.
PROSITE: PS00262. INSULIN. 1 hit.

Sequences (3)

Sequence status: Complete.

Sequence processing: The displayed sequence is further processed into a mature form.

This entry describes 3 isoforms produced by alternative splicing.

Isoform 1 (identifier: P01344-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MGIPMGKSML VLLTFLAFAS CCIAAYRPSE TLCGGELVDT LQFVCGDRGF
        60         70         80         90        100
YFSRPASRVS RRSRGIVEEC CFRSCDLALL ETYCATPAKS ERDVSTPPTV
       110        120        130        140        150
LPDNFPRYPV GKFFQYDTWK QSTQRLRRGL PALLRARRGH VLAKELEAFR
       160        170        180
EAKRHRPLIA LPTQDPAHGG APPEMASNRK
Length: 180
Mass (Da): 20,140
Last modified: July 21, 1986 - v1
Checksum: C1B0EB1E016BA37A
Isoform 2 (identifier: P01344-2)

The sequence of this isoform differs from the canonical sequence as follows:
     53-53: S → RLPG

Length: 183
Mass (Da): 20,477
Checksum: A54CD97B56C2B96F
Isoform 3 (identifier: P01344-3)

The sequence of this isoform differs from the canonical sequence as follows:
     1-1: M → MVSPDPQIIVVAPETELASMQVQRTEDGVTIIQIFWVGRKGELLRRTPVSSAMQTPM

Note: Gene prediction based on EST data.
Length: 236
Mass (Da): 26,331
Checksum: CF1395E851055BF6
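
The isoform descriptions above are range replacements on the canonical sequence, so isoforms 2 and 3 can be rebuilt and their stated lengths checked. A minimal sketch, assuming 1-based inclusive ranges; `replace_range` is an illustrative helper, not a UniProt tool.

```python
# Canonical isoform 1 (P01344-1), copied from the Sequences section of this entry.
IGF2 = (
    "MGIPMGKSMLVLLTFLAFASCCIAAYRPSETLCGGELVDTLQFVCGDRGF"
    "YFSRPASRVSRRSRGIVEECCFRSCDLALLETYCATPAKSERDVSTPPTV"
    "LPDNFPRYPVGKFFQYDTWKQSTQRLRRGLPALLRARRGHVLAKELEAFR"
    "EAKRHRPLIALPTQDPAHGGAPPEMASNRK"
)


def replace_range(seq: str, start: int, end: int, new: str) -> str:
    """Replace residues start..end (1-based, inclusive) of seq with new."""
    return seq[:start - 1] + new + seq[end:]


# Isoform 2: 53-53: S -> RLPG
isoform2 = replace_range(IGF2, 53, 53, "RLPG")

# Isoform 3: 1-1: M -> extended N-terminus
isoform3 = replace_range(
    IGF2, 1, 1,
    "MVSPDPQIIVVAPETELASMQVQRTEDGVTIIQIFWVGRKGELLRRTPVSSAMQTPM",
)

print(len(IGF2), len(isoform2), len(isoform3))
```

The resulting lengths can be compared with the Length values given for each isoform (180, 183 and 236).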

Experimental Info

Feature key | Position(s) | Length | Description | Evidence
Sequence conflict | 3 – 3 | 1 | I → M in AAA52544 (PubMed:3683205). | Curated
Sequence conflict | 107 – 110 | 4 | RYPV → EIPL in CAA27249 (PubMed:6382022). | Curated
Sequence conflict | 147 – 147 | 1 | E → ELE in AAA60088 (PubMed:3476948). | Curated

Mass spectrometry

Molecular mass is 7469.4 Da from positions 25 - 91. Determined by MALDI. (2 publications)
Molecular mass is 7398.3 Da from positions 26 - 91. Determined by MALDI. (2 publications)
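
The two measured masses differ by one N-terminal residue, Ala-25. A short arithmetic sketch: the average mass of an alanine residue (C3H5NO, about 71.08 Da) is a standard value supplied here as an assumption and is not stated in the entry, and the 0.1 Da tolerance is arbitrary.

```python
MASS_25_91 = 7469.4  # Da, positions 25-91 (MALDI, from this entry)
MASS_26_91 = 7398.3  # Da, positions 26-91 (MALDI, from this entry)

# Average mass of one alanine residue, C3H5NO (assumed standard value).
ALA_RESIDUE_AVG = 71.0788

delta = MASS_25_91 - MASS_26_91
# Arbitrary tolerance, chosen only for this illustration.
consistent = abs(delta - ALA_RESIDUE_AVG) < 0.1

print(f"observed difference: {delta:.1f} Da")
print("consistent with loss of one Ala residue:", consistent)
```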

Polymorphism

Genetic variations in IGF2 are associated with body mass index (BMI). The BMI is a statistical measurement which compares a person's weight and height. (1 publication)

Natural variant

Feature key | Position(s) | Length | Description | Feature identifier
Natural variant | 120 – 120 | 1 | K → N. Corresponds to variant rs14367 (dbSNP, Ensembl). | VAR_011959
Natural variant | 173 – 173 | 1 | P → Q. Corresponds to variant rs1050342 (dbSNP, Ensembl). | VAR_011960
Natural variant | 180 – 180 | 1 | K → N. Corresponds to variant rs12993 (dbSNP, Ensembl). | VAR_011961

Alternative sequence

Feature key | Position(s) | Length | Description | Feature identifier
Alternative sequence | 1 – 1 | 1 | M → MVSPDPQIIVVAPETELASMQVQRTEDGVTIIQIFWVGRKGELLRRTPVSSAMQTPM in isoform 3. (1 publication) | VSP_045624
Alternative sequence | 53 – 53 | 1 | S → RLPG in isoform 2. (2 publications) | VSP_002708

Sequence databases

EMBL / GenBank / DDBJ:
X03562 Genomic DNA. Translation: CAA27249.1.
X00910 mRNA. Translation: CAA25426.1.
J03242 mRNA. Translation: AAA52545.1.
X03425 Genomic DNA. Translation: CAA27155.1.
X03426 Genomic DNA. Translation: CAA27156.1.
X03427 Genomic DNA. Translation: CAA27157.1.
M17426 mRNA. Translation: AAA60088.1.
M29645 mRNA. Translation: AAA52544.1.
M17863 mRNA. Translation: AAA52443.1. Sequence problems.
S77035 mRNA. Translation: AAB34155.1.
DQ104203 mRNA. Translation: ABD93451.1.
HM481219 mRNA. Translation: ADO21454.1.
AF217977 mRNA. Translation: AAG17220.1.
BT007013 mRNA. Translation: AAP35659.1.
AF517226 Genomic DNA. Translation: AAM51825.1.
AC132217 Genomic DNA. No translation available.
AK126688 mRNA. Translation: BAG54360.1.
CH471158 Genomic DNA. Translation: EAX02485.1.
BC000531 mRNA. Translation: AAH00531.1.
X07868 Genomic DNA. Translation: CAA30717.1.
X06159 mRNA. Translation: CAA29516.1.
X06160 Transcribed RNA. Translation: CAA29517.1.
X06161 mRNA. Translation: CAA29518.1.
M22373 Genomic DNA. Translation: AAA52536.1.
CCDS: CCDS44517.1. [P01344-3]
    CCDS7728.1. [P01344-1]
PIR: B23614. IGHU2.
    I67610.
    S02423.
RefSeq: NP_000603.1. NM_000612.5. [P01344-1]
    NP_001007140.2. NM_001007139.5. [P01344-1]
    NP_001121070.1. NM_001127598.2. [P01344-3]
    NP_001278790.1. NM_001291861.2. [P01344-1]
    NP_001278791.1. NM_001291862.2. [P01344-1]
UniGene: Hs.272259.

Genome annotation databases

Ensembl: ENST00000381389; ENSP00000370796; ENSG00000167244. [P01344-1]
    ENST00000381392; ENSP00000370799; ENSG00000167244. [P01344-2]
    ENST00000381395; ENSP00000370802; ENSG00000167244. [P01344-1]
    ENST00000381406; ENSP00000370813; ENSG00000167244. [P01344-2]
    ENST00000416167; ENSP00000414497; ENSG00000167244. [P01344-1]
    ENST00000418738; ENSP00000402047; ENSG00000167244. [P01344-1]
    ENST00000434045; ENSP00000391826; ENSG00000167244. [P01344-3]
GeneID: 3481.
KEGG: hsa:3481.
UCSC: uc001lvf.4. human. [P01344-1]
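
Each Ensembl cross-reference above pairs a transcript, a protein and a gene identifier with the isoform it encodes. The sketch below groups the transcripts by isoform. It assumes the records have been cleaned to the form `ENST…; ENSP…; ENSG…. [P01344-n]`; the pattern and variable names are illustrative.

```python
import re
from collections import defaultdict

# Ensembl cross-references from this entry, one record per line.
ENSEMBL_LINES = """\
ENST00000381389; ENSP00000370796; ENSG00000167244. [P01344-1]
ENST00000381392; ENSP00000370799; ENSG00000167244. [P01344-2]
ENST00000381395; ENSP00000370802; ENSG00000167244. [P01344-1]
ENST00000381406; ENSP00000370813; ENSG00000167244. [P01344-2]
ENST00000416167; ENSP00000414497; ENSG00000167244. [P01344-1]
ENST00000418738; ENSP00000402047; ENSG00000167244. [P01344-1]
ENST00000434045; ENSP00000391826; ENSG00000167244. [P01344-3]
""".splitlines()

# transcript; protein; gene. [isoform]
RECORD = re.compile(r"(ENST\d+); (ENSP\d+); (ENSG\d+)\. \[(P01344-\d+)\]")

by_isoform = defaultdict(list)
for line in ENSEMBL_LINES:
    match = RECORD.fullmatch(line.strip())
    if match:
        transcript, _protein, _gene, isoform = match.groups()
        by_isoform[isoform].append(transcript)

for isoform in sorted(by_isoform):
    print(isoform, by_isoform[isoform])
```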

Keywords - Coding sequence diversity

Alternative splicing, Polymorphism

Cross-references

Web resources

Wikipedia

Insulin-like growth factor 2 entry

SeattleSNPs

Protocols and materials databases

DNASU: 3481.

Organism-specific databases

CTD: 3481.
GeneCards: IGF2.
GeneReviews: IGF2.
H-InvDB: HIX0128667.
neXtProt: NX_P01344.

Miscellaneous databases

ChiTaRS: IGF2. human.
GeneWiki: Insulin-like_growth_factor_2.
GenomeRNAi: 3481.
PRO: P01344.

Entry information

Entry name: IGF2_HUMAN
Accession: Primary (citable) accession number: P01344
Secondary accession number(s): B3KX48, B7WP08, C9JAF2, E3UN45, P78449, Q14299, Q1WM26, Q9UC68, Q9UC69
Entry history
Integrated into UniProtKB/Swiss-Prot: July 21, 1986
Last sequence update: July 21, 1986
Last modified: September 7, 2016
This is version 210 of the entry and version 1 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

Keywords - Technical term

3D-structure, Complete proteome, Direct protein sequencing, Reference proteome

Documents

  1. Human chromosome 11
    Human chromosome 11: entries, gene names and cross-references to MIM
  2. Human entries with polymorphisms or disease mutations
    List of human entries with polymorphisms or disease mutations
  3. Human polymorphisms and disease mutations
    Index of human polymorphisms and disease mutations
  4. MIM cross-references
    Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot
  5. PDB cross-references
    Index of Protein Data Bank (PDB) cross-references
  6. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.