Protein

Insulin-like growth factor II

Gene

IGF2

Organism
Homo sapiens (Human)
Status
Reviewed. Annotation score: 5 out of 5. Experimental evidence at protein level.

Function

The insulin-like growth factors possess growth-promoting activity. In vitro, they are potent mitogens for cultured cells. IGF-II is influenced by placental lactogen and may play a role in fetal development.
Preptin undergoes glucose-mediated co-secretion with insulin and acts as a physiological amplifier of glucose-mediated insulin secretion. It exhibits osteogenic properties by increasing osteoblast mitogenic activity through phosphoactivation of MAPK1 and MAPK3.

GO - Molecular function

  • growth factor activity Source: BHF-UCL
  • insulin-like growth factor receptor binding Source: AgBase
  • insulin receptor binding Source: BHF-UCL
  • protein serine/threonine kinase activator activity Source: BHF-UCL
  • receptor activator activity Source: BHF-UCL

Keywords - Molecular function

Growth factor, Hormone, Mitogen

Keywords - Biological process

Carbohydrate metabolism, Glucose metabolism, Osteogenesis

Enzyme and pathway databases

BioCyc: ZFISH:ENSG00000167244-MONOMER.
Reactome:
  • R-HSA-114608. Platelet degranulation.
  • R-HSA-2404192. Signaling by Type 1 Insulin-like Growth Factor 1 Receptor (IGF1R).
  • R-HSA-2428928. IRS-related events triggered by IGF1R.
  • R-HSA-2428933. SHC-related events triggered by IGF1R.
  • R-HSA-381426. Regulation of Insulin-like Growth Factor (IGF) transport and uptake by Insulin-like Growth Factor Binding Proteins (IGFBPs).
SignaLink: P01344.
SIGNOR: P01344.

Names & Taxonomy

Protein names
Recommended name: Insulin-like growth factor II
Short name: IGF-II
Alternative name(s):
  • Somatomedin-A
  • T3M-11-derived growth factor
Cleaved into the following 3 chains:
  • Insulin-like growth factor II
  • Insulin-like growth factor II Ala-25 Del
  • Preptin
Gene names
Name: IGF2
ORF Names: PP1446
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes
  • UP000005640 Component: Chromosome 11

Organism-specific databases

HGNC: HGNC:5466. IGF2.

Subcellular location

Keywords - Cellular component

Secreted

Pathology & Biotech

Involvement in disease

Silver-Russell syndrome (SRS). 1 publication
The gene represented in this entry is involved in disease pathogenesis. Most cases of Silver-Russell syndrome are caused by epigenetic changes, specifically DNA hypomethylation at the telomeric imprinting control region (ICR1) on chromosome 11p15, involving the H19 and IGF2 genes.
Disease description: A clinically heterogeneous condition characterized by severe intrauterine growth retardation, poor postnatal growth, craniofacial features such as a triangular-shaped face and a broad forehead, body asymmetry, and a variety of minor malformations. The phenotypic expression changes during childhood and adolescence, with the facial features and asymmetry usually becoming more subtle with age.
See also OMIM:180860
Growth restriction, severe, with distinctive facies (GRDF). 1 publication
The disease is caused by mutations affecting the gene represented in this entry.
Disease description: A disease characterized by severe prenatal and postnatal growth restriction, facial dysmorphism, and short stature in the presence of normal or slightly elevated growth hormone levels.
See also OMIM:616489

Mutagenesis

Feature key   Position(s)   Description                                                Length
Mutagenesis   92            R → A: Decreases mature IGF2 levels. 1 publication         1
Mutagenesis   112           K → A: No effect on proteolytic processing. 1 publication  1
Mutagenesis   128           R → A: Abolishes proteolytic processing. 1 publication     1

Keywords - Disease

Dwarfism

Organism-specific databases

DisGeNET: 3481.
MalaCards: IGF2.
MIM:
  • 147470. gene+phenotype.
  • 180860. phenotype.
  • 616489. phenotype.
OpenTargets: ENSG00000167244.
Orphanet:
  • 231117. Beckwith-Wiedemann syndrome due to imprinting defect of 11p15.
  • 2128. Hemihypertrophy.
  • 231144. Silver-Russell syndrome due to 11p15 microduplication.
  • 231140. Silver-Russell syndrome due to imprinting defect of 11p15.
PharmGKB: PA29699.

Polymorphism and mutation databases

BioMuta: IGF2.
DMDM: 124255.

PTM / Processing

Molecule processing

Feature key      Position(s)   Description                                                 Length
Signal peptide   1 – 24        2 publications                                              24
Chain            25 – 91       Insulin-like growth factor II (PRO_0000015717)              67
Chain            26 – 91       Insulin-like growth factor II Ala-25 Del (PRO_0000015718)   66
Propeptide       92 – 180      E peptide (PRO_0000015719)                                  89
Peptide          93 – 126      Preptin (PRO_0000370376)                                    34
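
The coordinates in this table are 1-based and inclusive, counted on the canonical isoform 1 sequence given in the Sequences section of this entry. As a minimal sketch (the dictionary keys and the helper name `feature_sequence` are illustrative assumptions, not UniProt identifiers), the processed chains can be cut out of that sequence like this:

```python
# Canonical isoform 1 sequence (180 residues), copied from the Sequences section.
CANONICAL = (
    "MGIPMGKSMLVLLTFLAFASCCIAAYRPSETLCGGELVDTLQFVCGDRGF"
    "YFSRPASRVSRRSRGIVEECCFRSCDLALLETYCATPAKSERDVSTPPTV"
    "LPDNFPRYPVGKFFQYDTWKQSTQRLRRGLPALLRARRGHVLAKELEAFR"
    "EAKRHRPLIALPTQDPAHGGAPPEMASNRK"
)

# (start, end) pairs as listed in the table: 1-based, inclusive.
# The key names are hypothetical labels chosen for this example.
FEATURES = {
    "signal_peptide": (1, 24),
    "igf2": (25, 91),
    "igf2_ala25_del": (26, 91),
    "e_peptide": (92, 180),
    "preptin": (93, 126),
}


def feature_sequence(seq, start, end):
    """Return the residues from start to end (1-based, inclusive)."""
    return seq[start - 1:end]


chains = {
    name: feature_sequence(CANONICAL, start, end)
    for name, (start, end) in FEATURES.items()
}

for name, sub in chains.items():
    print(f"{name}: {len(sub)} residues, {sub}")
```

The length of each slice should match the Length column of the table, which is a quick way to catch off-by-one errors when converting to 0-based indexing.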

Amino acid modifications

Feature key      Position(s)   Description                            Length
Disulfide bond   33 ↔ 71       1 publication
Disulfide bond   45 ↔ 84       1 publication
Disulfide bond   70 ↔ 75       1 publication
Glycosylation    96            O-linked (GalNAc...). 2 publications   1
Glycosylation    99            O-linked (GalNAc...). 2 publications   1
Glycosylation    163           O-linked (GalNAc...). 2 publications   1
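
A simple consistency check on these annotations is to confirm that each disulfide position holds a cysteine and to see which residue sits at each O-glycosylation site in the canonical sequence. The sketch below assumes the isoform 1 sequence from the Sequences section; the names `DISULFIDE_BONDS`, `O_GLYCOSYLATION_SITES` and `residue` are made up for illustration.

```python
# Canonical isoform 1 sequence (180 residues), copied from the Sequences section.
CANONICAL = (
    "MGIPMGKSMLVLLTFLAFASCCIAAYRPSETLCGGELVDTLQFVCGDRGF"
    "YFSRPASRVSRRSRGIVEECCFRSCDLALLETYCATPAKSERDVSTPPTV"
    "LPDNFPRYPVGKFFQYDTWKQSTQRLRRGLPALLRARRGHVLAKELEAFR"
    "EAKRHRPLIALPTQDPAHGGAPPEMASNRK"
)

# Positions from the table above (1-based).
DISULFIDE_BONDS = [(33, 71), (45, 84), (70, 75)]
O_GLYCOSYLATION_SITES = [96, 99, 163]


def residue(seq, position):
    """Return the one-letter residue at a 1-based position."""
    return seq[position - 1]


# Every disulfide partner should be a cysteine.
disulfide_ok = all(
    residue(CANONICAL, a) == "C" and residue(CANONICAL, b) == "C"
    for a, b in DISULFIDE_BONDS
)

# Look up the residue at each annotated O-glycosylation site.
glyco_residues = {pos: residue(CANONICAL, pos) for pos in O_GLYCOSYLATION_SITES}

print("disulfide positions are all Cys:", disulfide_ok)
print("residues at glycosylation sites:", glyco_residues)
```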

Post-translational modification

O-glycosylated with core 1 or possibly core 8 glycans. Thr-96 is a minor glycosylation site compared to Thr-99. 4 publications
Proteolytically processed by PCSK4: proIGF2 is cleaved at Arg-128 and Arg-92 to generate big-IGF2 and mature IGF2. 1 publication

Keywords - PTM

Cleavage on pair of basic residues, Disulfide bond, Glycoprotein

Proteomic databases

MaxQB: P01344.
PaxDb: P01344.
PeptideAtlas: P01344.
PRIDE: P01344.

PTM databases

iPTMnet: P01344.
PhosphoSitePlus: P01344.
UniCarbKB: P01344.

Miscellaneous databases

PMAP-CutDB: P01344.

Expression

Gene expression databases

Bgee: ENSG00000167244.
CleanEx: HS_IGF2.
Genevisible: P01344. HS.

Organism-specific databases

HPA:
  • HPA007556.
  • HPA007993.

Interaction

Binary interactions

With    Entry    #Exp.   IntAct
IGF2R   P11717   17      EBI-7178764,EBI-1048580
RBPMS   Q93062   3       EBI-7178764,EBI-740322

GO - Molecular function

  • growth factor activity Source: BHF-UCL
  • insulin-like growth factor receptor binding Source: AgBase
  • insulin receptor binding Source: BHF-UCL

Protein-protein interaction databases

BioGrid: 109702. 15 interactors.
DIP: DIP-29508N.
IntAct: P01344. 6 interactors.
MINT: MINT-6380943.
STRING: 9606.ENSP00000391826.

Structure

Secondary structure

Feature key   Position(s)   Description        Length
Helix         34 – 44       Combined sources   11
Turn          45 – 48       Combined sources   4
Beta strand   50 – 52       Combined sources   3
Helix         55 – 58       Combined sources   4
Helix         61 – 72       Combined sources   12
Helix         77 – 82       Combined sources   6
Beta strand   86 – 88       Combined sources   3

3D structure databases

PDB entry   Method   Resolution (Å)   Chain   Positions
1GF2        model    -                A       25-91
1IGL        NMR      -                A       25-91
2L29        NMR      -                B       25-91
2V5P        X-ray    4.10             C/D     25-91
3E4Z        X-ray    2.28             C/D     25-91
3KR3        X-ray    2.20             D       25-91
5L3L        NMR      -                A       25-91
5L3M        NMR      -                A       25-91
5L3N        NMR      -                A       25-91

ProteinModelPortal: P01344.
SMR: P01344.

Miscellaneous databases

EvolutionaryTrace: P01344.

Family & Domains

Region

Feature key   Position(s)   Description   Length
Region        25 – 52       B             28
Region        53 – 64       C             12
Region        65 – 85       A             21
Region        86 – 91       D             6

Sequence similarities

Belongs to the insulin family. Curated

Keywords - Domain

Signal

Phylogenomic databases

eggNOG:
  • ENOG410IY3P. Eukaryota.
  • ENOG4111KP2. LUCA.
GeneTree: ENSGT00530000063856.
HOGENOM: HOG000233362.
HOVERGEN: HBG006137.
InParanoid: P01344.
KO: K13769.
OMA: FFQYDTW.
OrthoDB: EOG091G0H56.
PhylomeDB: P01344.
TreeFam: TF332820.

Family and domain databases

Gene3D: 1.10.100.10. 1 hit.
InterPro:
  • IPR022334. IGF2.
  • IPR013576. IGF2_C.
  • IPR016179. Insulin-like.
  • IPR022350. Insulin-like_growth_factor.
  • IPR022353. Insulin_CS.
  • IPR022352. Insulin_family.
Pfam:
  • PF08365. IGF2_C. 1 hit.
  • PF00049. Insulin. 2 hits.
PRINTS:
  • PR02002. INSLNLIKEGF.
  • PR02006. INSLNLIKEGF2.
  • PR00276. INSULINFAMLY.
ProDom: PD005188. IGF2_C. 1 hit.
SMART: SM00078. IlGF. 1 hit.
SUPFAM: SSF56994. SSF56994. 1 hit.
PROSITE: PS00262. INSULIN. 1 hit.

Sequences (3)

Sequence status: Complete.

Sequence processing: The displayed sequence is further processed into a mature form.

This entry describes 3 isoforms produced by alternative splicing.

Isoform 1 (identifier: P01344-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MGIPMGKSML VLLTFLAFAS CCIAAYRPSE TLCGGELVDT LQFVCGDRGF
        60         70         80         90        100
YFSRPASRVS RRSRGIVEEC CFRSCDLALL ETYCATPAKS ERDVSTPPTV
       110        120        130        140        150
LPDNFPRYPV GKFFQYDTWK QSTQRLRRGL PALLRARRGH VLAKELEAFR
       160        170        180
EAKRHRPLIA LPTQDPAHGG APPEMASNRK
Length: 180
Mass (Da): 20,140
Last modified: July 21, 1986 - v1
Checksum: C1B0EB1E016BA37A
Isoform 2 (identifier: P01344-2)

The sequence of this isoform differs from the canonical sequence as follows:
     53-53: S → RLPG

Length: 183
Mass (Da): 20,477
Checksum: A54CD97B56C2B96F
Isoform 3 (identifier: P01344-3)

The sequence of this isoform differs from the canonical sequence as follows:
     1-1: M → MVSPDPQIIVVAPETELASMQVQRTEDGVTIIQIFWVGRKGELLRRTPVSSAMQTPM

Note: Gene prediction based on EST data.
Length: 236
Mass (Da): 26,331
Checksum: CF1395E851055BF6
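
Isoforms 2 and 3 are each described as a single replacement relative to the canonical sequence, so they can be rebuilt from isoform 1. The following is a rough sketch rather than UniProt's own tooling; the helper `replace_region` is a hypothetical name introduced here, and the coordinates are the 1-based inclusive ranges quoted above.

```python
# Canonical isoform 1 sequence (180 residues), copied from this entry.
CANONICAL = (
    "MGIPMGKSMLVLLTFLAFASCCIAAYRPSETLCGGELVDTLQFVCGDRGF"
    "YFSRPASRVSRRSRGIVEECCFRSCDLALLETYCATPAKSERDVSTPPTV"
    "LPDNFPRYPVGKFFQYDTWKQSTQRLRRGLPALLRARRGHVLAKELEAFR"
    "EAKRHRPLIALPTQDPAHGGAPPEMASNRK"
)


def replace_region(seq, start, end, replacement):
    """Replace residues start..end (1-based, inclusive) with replacement."""
    return seq[:start - 1] + replacement + seq[end:]


# Isoform 2: 53-53: S -> RLPG
isoform2 = replace_region(CANONICAL, 53, 53, "RLPG")

# Isoform 3: 1-1: M -> 57-residue N-terminal extension
isoform3 = replace_region(
    CANONICAL,
    1,
    1,
    "MVSPDPQIIVVAPETELASMQVQRTEDGVTIIQIFWVGRKGELLRRTPVSSAMQTPM",
)

print(len(CANONICAL), len(isoform2), len(isoform3))
```

The resulting lengths can be compared with the Length values listed for each isoform (180, 183 and 236).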

Experimental Info

Feature key         Position(s)   Description                                              Length
Sequence conflict   3             I → M in AAA52544 (PubMed:3683205). Curated              1
Sequence conflict   107 – 110     RYPV → EIPL in CAA27249 (PubMed:6382022). Curated        4
Sequence conflict   147           E → ELE in AAA60088 (PubMed:3476948). Curated            1

Mass spectrometry

Molecular mass is 7469.4 Da from positions 25 - 91. Determined by MALDI. 2 Publications
Molecular mass is 7398.3 Da from positions 26 - 91. Determined by MALDI. 2 Publications
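
The two measured masses differ by roughly the mass of one alanine residue, which is consistent with the 26 - 91 chain being the Ala-25 Del form. A minimal arithmetic sketch follows; the alanine residue mass of 71.0788 Da is the standard average value for C3H5NO and is an assumption supplied here, not a figure from this entry.

```python
# Measured masses quoted above (Da).
MASS_25_91 = 7469.4
MASS_26_91 = 7398.3

# Average mass of an alanine residue (C3H5NO) in Da.
# Assumed standard value, not taken from this entry.
ALA_RESIDUE_AVG_MASS = 71.0788

difference = MASS_25_91 - MASS_26_91
deviation = abs(difference - ALA_RESIDUE_AVG_MASS)

print(f"mass difference: {difference:.1f} Da")
print(f"deviation from one Ala residue: {deviation:.2f} Da")
```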

Polymorphism

Genetic variations in IGF2 are associated with body mass index (BMI). The BMI is a statistical measurement which compares a person's weight and height. 1 publication

Natural variant

Feature key       Position(s)   Description                                                                  Length
Natural variant   120           K → N (VAR_011959). Corresponds to variant rs14367 (dbSNP, Ensembl).         1
Natural variant   173           P → Q (VAR_011960). Corresponds to variant rs1050342 (dbSNP, Ensembl).       1
Natural variant   180           K → N (VAR_011961). Corresponds to variant rs12993 (dbSNP, Ensembl).         1

Alternative sequence

Feature key            Position(s)   Description                                                                                                    Length
Alternative sequence   1             M → MVSPDPQIIVVAPETELASMQVQRTEDGVTIIQIFWVGRKGELLRRTPVSSAMQTPM in isoform 3 (VSP_045624). 1 publication        1
Alternative sequence   53            S → RLPG in isoform 2 (VSP_002708). 2 publications                                                             1

Sequence databases

EMBL, GenBank, DDBJ:
X03562 Genomic DNA. Translation: CAA27249.1.
X00910 mRNA. Translation: CAA25426.1.
J03242 mRNA. Translation: AAA52545.1.
X03425 Genomic DNA. Translation: CAA27155.1.
X03426 Genomic DNA. Translation: CAA27156.1.
X03427 Genomic DNA. Translation: CAA27157.1.
M17426 mRNA. Translation: AAA60088.1.
M29645 mRNA. Translation: AAA52544.1.
M17863 mRNA. Translation: AAA52443.1. Sequence problems.
S77035 mRNA. Translation: AAB34155.1.
DQ104203 mRNA. Translation: ABD93451.1.
HM481219 mRNA. Translation: ADO21454.1.
AF217977 mRNA. Translation: AAG17220.1.
BT007013 mRNA. Translation: AAP35659.1.
AF517226 Genomic DNA. Translation: AAM51825.1.
AC132217 Genomic DNA. No translation available.
AK126688 mRNA. Translation: BAG54360.1.
CH471158 Genomic DNA. Translation: EAX02485.1.
BC000531 mRNA. Translation: AAH00531.1.
X07868 Genomic DNA. Translation: CAA30717.1.
X06159 mRNA. Translation: CAA29516.1.
X06160 Transcribed RNA. Translation: CAA29517.1.
X06161 mRNA. Translation: CAA29518.1.
M22373 Genomic DNA. Translation: AAA52536.1.
CCDS:
  • CCDS44517.1. [P01344-3]
  • CCDS7728.1. [P01344-1]
PIR:
  • B23614. IGHU2.
  • I67610.
  • S02423.
RefSeq:
  • NP_000603.1. NM_000612.5. [P01344-1]
  • NP_001007140.2. NM_001007139.5. [P01344-1]
  • NP_001121070.1. NM_001127598.2. [P01344-3]
  • NP_001278790.1. NM_001291861.2. [P01344-1]
  • NP_001278791.1. NM_001291862.2. [P01344-1]
UniGene: Hs.272259.

Genome annotation databases

Ensembl:
  • ENST00000381389; ENSP00000370796; ENSG00000167244. [P01344-1]
  • ENST00000381392; ENSP00000370799; ENSG00000167244. [P01344-2]
  • ENST00000381395; ENSP00000370802; ENSG00000167244. [P01344-1]
  • ENST00000381406; ENSP00000370813; ENSG00000167244. [P01344-2]
  • ENST00000416167; ENSP00000414497; ENSG00000167244. [P01344-1]
  • ENST00000418738; ENSP00000402047; ENSG00000167244. [P01344-1]
  • ENST00000434045; ENSP00000391826; ENSG00000167244. [P01344-3]
GeneID: 3481.
KEGG: hsa:3481.
UCSC: uc001lvf.4. human. [P01344-1]

Keywords - Coding sequence diversity

Alternative splicing, Polymorphism

Cross-references

Web resources

Wikipedia

Insulin-like growth factor 2 entry

SeattleSNPs

Protocols and materials databases

DNASU: 3481.

Organism-specific databases

CTD: 3481.
GeneCards: IGF2.
GeneReviews: IGF2.
H-InvDB: HIX0128667.
neXtProt: NX_P01344.

Miscellaneous databases

ChiTaRS: IGF2. human.
GeneWiki: Insulin-like_growth_factor_2.
GenomeRNAi: 3481.
PRO: P01344.

Entry information

Entry name: IGF2_HUMAN
Accession: Primary (citable) accession number: P01344
Secondary accession number(s): B3KX48, B7WP08, C9JAF2, E3UN45, P78449, Q14299, Q1WM26, Q9UC68, Q9UC69
Entry history:
Integrated into UniProtKB/Swiss-Prot: July 21, 1986
Last sequence update: July 21, 1986
Last modified: November 2, 2016
This is version 212 of the entry and version 1 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

Keywords - Technical term

3D-structure, Complete proteome, Direct protein sequencing, Reference proteome

Documents

  1. Human chromosome 11
    Human chromosome 11: entries, gene names and cross-references to MIM
  2. Human entries with polymorphisms or disease mutations
    List of human entries with polymorphisms or disease mutations
  3. Human polymorphisms and disease mutations
    Index of human polymorphisms and disease mutations
  4. MIM cross-references
    Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot
  5. PDB cross-references
    Index of Protein Data Bank (PDB) cross-references
  6. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
  • 100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
  • 90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
  • 50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.