Protein

Cytosolic 10-formyltetrahydrofolate dehydrogenase

Gene

ALDH1L1

Organism
Homo sapiens (Human)
Status
Reviewed; Annotation score: 5 out of 5; Experimental evidence at protein level

Function

Catalytic activity

10-formyltetrahydrofolate + NADP+ + H2O = tetrahydrofolate + CO2 + NADPH.

Sites

Feature key   Position(s)  Description                                       Length
Active site   106          Proton donor (By similarity)                      1
Binding site  142          Substrate                                         1
Site          142          Essential for catalytic activity (By similarity)  1
Active site   673          Proton acceptor (PROSITE-ProRule annotation)      1
Active site   707          Proton donor (By similarity)                      1
Binding site  757          NADP (By similarity)                              1

Regions

Feature key         Position(s)  Description           Length
Nucleotide binding  571 – 573    NADP (By similarity)  3
Nucleotide binding  597 – 600    NADP (By similarity)  4
Nucleotide binding  630 – 635    NADP (By similarity)  6
Nucleotide binding  650 – 651    NADP (By similarity)  2
Nucleotide binding  804 – 806    NADP (By similarity)  3

Keywords - Molecular function

Oxidoreductase

Keywords - Biological process

One-carbon metabolism

Keywords - Ligand

NADP

Enzyme and pathway databases

BioCyc: MetaCyc:HS07217-MONOMER.
ZFISH:HS07217-MONOMER.
BRENDA: 1.5.1.6. 2681.
Reactome: R-HSA-196757. Metabolism of folate and pterines.

Names & Taxonomy

Protein names
Recommended name:
Cytosolic 10-formyltetrahydrofolate dehydrogenase (EC:1.5.1.6)
Short name:
10-FTHFDH
Short name:
FDH
Alternative name(s):
Aldehyde dehydrogenase family 1 member L1
Gene names
Name: ALDH1L1
Synonyms: FTHFD
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes
  • UP000005640 Component: Chromosome 3

Organism-specific databases

HGNC: HGNC:3978. ALDH1L1.

Subcellular location

GO - Cellular component

  • cytoplasm Source: HPA
  • cytosol Source: Reactome
  • extracellular exosome Source: UniProtKB
  • mitochondrion Source: Ensembl

Keywords - Cellular component

Cytoplasm

Pathology & Biotech

Organism-specific databases

DisGeNET: 10840.
OpenTargets: ENSG00000144908.
PharmGKB: PA28393.

Chemistry databases

DrugBank: DB00116. Tetrahydrofolic acid.

Polymorphism and mutation databases

BioMuta: ALDH1L1.

PTM / Processing

Molecule processing

Feature key  Feature ID      Position(s)  Description                                        Length
Chain        PRO_0000199419  1 – 902      Cytosolic 10-formyltetrahydrofolate dehydrogenase  902

Amino acid modifications

Feature key       Position(s)  Description                                                                  Length
Modified residue  9            Phosphoserine (Combined sources)                                             1
Modified residue  38           N6-succinyllysine (By similarity)                                            1
Modified residue  354          O-(pantetheine 4'-phosphoryl)serine; alternate (PROSITE-ProRule annotation)  1
Modified residue  354          Phosphoserine; alternate (Combined sources)                                  1
Modified residue  629          Phosphoserine (Combined sources)                                             1
Modified residue  631          Phosphoserine (Combined sources)                                             1
Modified residue  767          N6-succinyllysine (By similarity)                                            1
Modified residue  825          Phosphoserine (Combined sources)                                             1

Keywords - PTM

Phosphopantetheine, Phosphoprotein

Proteomic databases

EPD: O75891.
MaxQB: O75891.
PaxDb: O75891.
PeptideAtlas: O75891.
PRIDE: O75891.

PTM databases

iPTMnet: O75891.
PhosphoSitePlus: O75891.

Expression

Tissue specificity

Highly expressed in liver, pancreas and kidney. (1 publication)

Gene expression databases

Bgee: ENSG00000144908.
CleanEx: HS_ALDH1L1.
ExpressionAtlas: O75891. baseline and differential.
Genevisible: O75891. HS.

Organism-specific databases

HPA: HPA036900.
HPA050139.

Interaction

Protein-protein interaction databases

BioGrid: 116052. 2 interactors.
IntAct: O75891. 3 interactors.
STRING: 9606.ENSP00000377083.

Structure

Secondary structure

Feature key  Position(s)  Evidence          Length
Beta strand  2 – 6        Combined sources  5
Helix        9 – 21       Combined sources  13
Beta strand  25 – 31      Combined sources  7
Helix        41 – 49      Combined sources  9
Beta strand  53 – 55      Combined sources  3
Helix        67 – 74      Combined sources  8
Beta strand  79 – 85      Combined sources  7
Helix        92 – 95      Combined sources  4
Beta strand  102 – 108    Combined sources  7
Turn         110 – 113    Combined sources  4
Beta strand  114 – 116    Combined sources  3
Helix        118 – 124    Combined sources  7
Beta strand  128 – 136    Combined sources  9
Beta strand  139 – 142    Combined sources  4
Beta strand  146 – 153    Combined sources  8
Helix        160 – 166    Combined sources  7
Turn         167 – 169    Combined sources  3
Helix        170 – 185    Combined sources  16
Helix        206 – 209    Combined sources  4
Helix        217 – 225    Combined sources  9
Turn         226 – 231    Combined sources  6
Beta strand  234 – 237    Combined sources  4
Beta strand  240 – 249    Combined sources  10
Beta strand  258 – 261    Combined sources  4
Beta strand  270 – 273    Combined sources  4
Beta strand  276 – 280    Combined sources  5
Beta strand  286 – 293    Combined sources  8
Beta strand  299 – 301    Combined sources  3
Helix        302 – 304    Combined sources  3
Helix        320 – 334    Combined sources  15
Helix        347 – 351    Combined sources  5
Helix        356 – 367    Combined sources  12
Helix        375 – 380    Combined sources  6
Helix        384 – 397    Combined sources  14

3D structure databases

PDB entry  Method  Resolution (Å)  Chain  Positions
2BW0       X-ray   1.70            A      1-307
2CFI       X-ray   1.85            A      1-307
2CQ8       NMR     -               A      305-401
ProteinModelPortal: O75891.
SMR: O75891.
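
The three PDB entries listed here overlap, so the fraction of the 902-residue canonical sequence with a known structure is not simply the sum of their lengths. A minimal sketch of that arithmetic follows; the names `PDB_RANGES` and `covered_positions` are illustrative and not part of any UniProt or PDB API.

```python
# Residue ranges (1-based, inclusive) copied from the PDB table in this entry.
PDB_RANGES = {"2BW0": (1, 307), "2CFI": (1, 307), "2CQ8": (305, 401)}
SEQUENCE_LENGTH = 902  # length of the canonical isoform (O75891-1)


def covered_positions(ranges):
    """Return the set of residue positions covered by at least one range."""
    covered = set()
    for start, end in ranges:
        covered.update(range(start, end + 1))
    return covered


covered = covered_positions(PDB_RANGES.values())
coverage = len(covered) / SEQUENCE_LENGTH
print(f"{len(covered)} of {SEQUENCE_LENGTH} residues covered ({coverage:.1%})")
# prints: 401 of 902 residues covered (44.5%)
```

Under these ranges the structures cover residues 1-401 only; no PDB entry listed in this entry covers the C-terminal aldehyde dehydrogenase region.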

Miscellaneous databases

EvolutionaryTrace: O75891.

Family & Domains

Domains and Repeats

Feature key  Position(s)  Description                                Length
Domain       323 – 392    Acyl carrier (PROSITE-ProRule annotation)  70

Region

Feature key  Position(s)  Description             Length
Region       1 – 203      GART                    203
Region       88 – 90      Substrate binding       3
Region       417 – 902    Aldehyde dehydrogenase  486

Sequence similarities

In the N-terminal section; belongs to the GART family. (Curated)
In the C-terminal section; belongs to the aldehyde dehydrogenase family. ALDH1L subfamily. (Curated)
Contains 1 acyl carrier domain. (PROSITE-ProRule annotation)
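
To see how the annotated sites relate to this domain layout, the positions from the Sites table can be looked up against the region boundaries. The following is a small sketch under that reading; `REGIONS`, `SITES` and `region_of` are hypothetical names introduced only for illustration.

```python
# Region boundaries (1-based, inclusive) as annotated in this entry.
REGIONS = [
    ("GART", 1, 203),
    ("Acyl carrier", 323, 392),
    ("Aldehyde dehydrogenase", 417, 902),
]

# Single-residue features taken from the Sites table of this entry.
SITES = [
    ("Active site", 106),
    ("Binding site", 142),
    ("Active site", 673),
    ("Active site", 707),
    ("Binding site", 757),
]


def region_of(position):
    """Return the name of the region containing `position`, or None."""
    for name, start, end in REGIONS:
        if start <= position <= end:
            return name
    return None


site_regions = {position: region_of(position) for _, position in SITES}
for kind, position in SITES:
    print(f"{kind} {position}: {site_regions[position]}")
```

With these boundaries, positions 106 and 142 fall in the GART region and positions 673, 707 and 757 in the aldehyde dehydrogenase region. The phosphopantetheine attachment site at position 354 falls inside the acyl carrier domain.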

Phylogenomic databases

eggNOG: KOG2450. Eukaryota.
COG1012. LUCA.
GeneTree: ENSGT00760000118999.
HOGENOM: HOG000006902.
HOVERGEN: HBG051668.
InParanoid: O75891.
KO: K00289.
OMA: MASTFGD.
OrthoDB: EOG091G05E8.
PhylomeDB: O75891.
TreeFam: TF354242.

Family and domain databases

Gene3D: 1.10.1200.10. 1 hit.
3.10.25.10. 1 hit.
3.40.309.10. 1 hit.
3.40.50.170. 1 hit.
3.40.605.10. 1 hit.
InterPro: IPR011407. 10_FTHF_DH.
IPR016161. Ald_DH/histidinol_DH.
IPR016163. Ald_DH_C.
IPR016160. Ald_DH_CS_CYS.
IPR029510. Ald_DH_CS_GLU.
IPR016162. Ald_DH_N.
IPR015590. Aldehyde_DH_dom.
IPR005793. Formyl_trans_C.
IPR002376. Formyl_transf_N.
IPR011034. Formyl_transferase_C-like.
IPR001555. GART_AS.
IPR009081. PP-bd_ACP.
Pfam: PF00171. Aldedh. 1 hit.
PF02911. Formyl_trans_C. 1 hit.
PF00551. Formyl_trans_N. 1 hit.
PF00550. PP-binding. 1 hit.
PIRSF: PIRSF036489. 10-FTHFDH. 1 hit.
SUPFAM: SSF47336. SSF47336. 1 hit.
SSF50486. SSF50486. 1 hit.
SSF53328. SSF53328. 1 hit.
SSF53720. SSF53720. 1 hit.
PROSITE: PS50075. ACP_DOMAIN. 1 hit.
PS00070. ALDEHYDE_DEHYDR_CYS. 1 hit.
PS00687. ALDEHYDE_DEHYDR_GLU. 1 hit.
PS00373. GART. 1 hit.

Sequences (4)

Sequence status: Complete.

This entry describes 4 isoforms produced by alternative splicing.

Isoform 1 (identifier: O75891-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MKIAVIGQSL FGQEVYCHLR KEGHEVVGVF TVPDKDGKAD PLGLEAEKDG
        60         70         80         90        100
VPVFKYSRWR AKGQALPDVV AKYQALGAEL NVLPFCSQFI PMEIISAPRH
       110        120        130        140        150
GSIIYHPSLL PRHRGASAIN WTLIHGDKKG GFSIFWADDG LDTGDLLLQK
       160        170        180        190        200
ECEVLPDDTV STLYNRFLFP EGIKGMVQAV RLIAEGKAPR LPQPEEGATY
       210        220        230        240        250
EGIQKKETAK INWDQPAEAI HNWIRGNDKV PGAWTEACEQ KLTFFNSTLN
       260        270        280        290        300
TSGLVPEGDA LPIPGAHRPG VVTKAGLILF GNDDKMLLVK NIQLEDGKMI
       310        320        330        340        350
LASNFFKGAA SSVLELTEAE LVTAEAVRSV WQRILPKVLE VEDSTDFFKS
       360        370        380        390        400
GAASVDVVRL VEEVKELCDG LELENEDVYM ASTFGDFIQL LVRKLRGDDE
       410        420        430        440        450
EGECSIDYVE MAVNKRTVRM PHQLFIGGEF VDAEGAKTSE TINPTDGSVI
       460        470        480        490        500
CQVSLAQVTD VDKAVAAAKD AFENGRWGKI SARDRGRLMY RLADLMEQHQ
       510        520        530        540        550
EELATIEALD AGAVYTLALK THVGMSIQTF RYFAGWCDKI QGSTIPINQA
       560        570        580        590        600
RPNRNLTLTR KEPVGVCGII IPWNYPLMML SWKTAACLAA GNTVVIKPAQ
       610        620        630        640        650
VTPLTALKFA ELTLKAGIPK GVVNVLPGSG SLVGQRLSDH PDVRKIGFTG
       660        670        680        690        700
STEVGKHIMK SCAISNVKKV SLELGGKSPL IIFADCDLNK AVQMGMSSVF
       710        720        730        740        750
FNKGENCIAA GRLFVEDSIH DEFVRRVVEE VRKMKVGNPL DRDTDHGPQN
       760        770        780        790        800
HHAHLVKLME YCQHGVKEGA TLVCGGNQVP RPGFFFEPTV FTDVEDHMFI
       810        820        830        840        850
AKEESFGPVM IISRFADGDL DAVLSRANAT EFGLASGVFT RDINKALYVS
       860        870        880        890        900
DKLQAGTVFV NTYNKTDVAA PFGGFKQSGF GKDLGEAALN EYLRVKTVTF
EY
Length: 902
Mass (Da): 98,829
Last modified: February 15, 2005 - v2
Checksum: D92CB2930617F7CF
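
The sequence listing above interleaves position rulers with rows of space-separated residue groups, so it needs a small clean-up step before positions such as the annotated sites can be indexed. One possible way to do that is sketched below; `parse_sequence_block` is a hypothetical helper, and only the first two rows of the listing are used as sample input.

```python
import re


def parse_sequence_block(block: str) -> str:
    """Join a ruler-annotated sequence listing into one residue string."""
    residues = []
    for line in block.splitlines():
        # Ruler lines hold only digits and whitespace; skip them and blanks.
        if not line.strip() or re.fullmatch(r"[\d\s]+", line):
            continue
        # Remaining lines are residue groups separated by spaces.
        residues.append("".join(line.split()))
    return "".join(residues)


# First two rows of isoform 1 as listed in this entry.
SAMPLE = """\
        10         20         30         40         50
MKIAVIGQSL FGQEVYCHLR KEGHEVVGVF TVPDKDGKAD PLGLEAEKDG
        60         70         80         90        100
VPVFKYSRWR AKGQALPDVV AKYQALGAEL NVLPFCSQFI PMEIISAPRH
"""

sequence = parse_sequence_block(SAMPLE)
print(len(sequence))  # 100
print(sequence[8])    # S (position 9, annotated in this entry as a phosphoserine)
```

Positions in the entry are 1-based, so position `n` corresponds to `sequence[n - 1]` in Python.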
Isoform 2 (identifier: O75891-2)

The sequence of this isoform differs from the canonical sequence as follows:
     118-218: Missing.

Note: No experimental confirmation available.
Length: 801
Mass (Da): 87,602
Checksum: 2749168F3D76D548

Isoform 3 (identifier: O75891-3)

The sequence of this isoform differs from the canonical sequence as follows:
     1-1: M → MAGPSNPPATM

Length: 912
Mass (Da): 99,753
Checksum: 4703ADF87C0467D9

Isoform 4 (identifier: O75891-4)

The sequence of this isoform differs from the canonical sequence as follows:
     492-505: LADLMEQHQEELAT → APPSPSTRPDPTAT
     506-902: Missing.

Note: No experimental confirmation available.
Length: 505
Mass (Da): 55,394
Checksum: F4D59683C0651699
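
The isoform lengths reported above follow directly from the canonical length and the listed differences: each edit removes the replaced span and adds the length of its replacement. A minimal sketch of that bookkeeping, with `ISOFORM_EDITS` and `isoform_length` as illustrative names only:

```python
CANONICAL_LENGTH = 902  # isoform 1 (O75891-1)

# (start, end, replacement) in 1-based inclusive coordinates on the canonical
# sequence; an empty replacement string stands for "Missing".
ISOFORM_EDITS = {
    "O75891-2": [(118, 218, "")],
    "O75891-3": [(1, 1, "MAGPSNPPATM")],
    "O75891-4": [(492, 505, "APPSPSTRPDPTAT"), (506, 902, "")],
}


def isoform_length(canonical_length, edits):
    """Apply each edit's net length change to the canonical length."""
    length = canonical_length
    for start, end, replacement in edits:
        length += len(replacement) - (end - start + 1)
    return length


lengths = {
    isoform: isoform_length(CANONICAL_LENGTH, edits)
    for isoform, edits in ISOFORM_EDITS.items()
}
print(lengths)  # {'O75891-2': 801, 'O75891-3': 912, 'O75891-4': 505}
```

The computed values agree with the lengths listed for isoforms 2, 3 and 4.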

Experimental Info

Feature key        Position(s)  Description                                   Length
Sequence conflict  63           G → A in AAC35000 (Ref. 1) (Curated)          1
Sequence conflict  85           F → S in AAC35000 (Ref. 1) (Curated)          1
Sequence conflict  176          M → V in AAC35000 (Ref. 1) (Curated)          1
Sequence conflict  195          E → K in AAC35000 (Ref. 1) (Curated)          1
Sequence conflict  470          D → G in AAC35000 (Ref. 1) (Curated)          1
Sequence conflict  472          F → L in BAG57647 (PubMed:14702039) (Curated) 1
Sequence conflict  677          K → E in AAC35000 (Ref. 1) (Curated)          1
Sequence conflict  680          L → F in AAC35000 (Ref. 1) (Curated)          1
Sequence conflict  702          N → S in AAC35000 (Ref. 1) (Curated)          1

Natural variant

Feature key      Feature ID  Position(s)  Description                                                                                                                     Length
Natural variant  VAR_052290  254          L → P. Corresponds to variant rs3796191 (dbSNP, Ensembl).                                                                       1
Natural variant  VAR_052291  330          V → F. Corresponds to variant rs2886059 (dbSNP, Ensembl).                                                                       1
Natural variant  VAR_052292  429          E → A. Corresponds to variant rs9282691 (dbSNP, Ensembl).                                                                       1
Natural variant  VAR_052293  436          A → T. Corresponds to variant rs9282692 (dbSNP, Ensembl).                                                                       1
Natural variant  VAR_052294  436          A → V. Corresponds to variant rs9282693 (dbSNP, Ensembl).                                                                       1
Natural variant  VAR_052295  448          S → N. Corresponds to variant rs9282697 (dbSNP, Ensembl).                                                                       1
Natural variant  VAR_052296  481          S → G. Corresponds to variant rs2276724 (dbSNP, Ensembl).                                                                       1
Natural variant  VAR_036101  511          A → V in a colorectal cancer sample; somatic mutation. (1 publication) Corresponds to variant rs768309358 (dbSNP, Ensembl).     1
Natural variant  VAR_052297  793          D → G. Corresponds to variant rs1127717 (dbSNP, Ensembl).                                                                       1
Natural variant  VAR_052298  803          E → K. Corresponds to variant rs9282689 (dbSNP, Ensembl).                                                                       1
Natural variant  VAR_052299  812          I → V. Corresponds to variant rs4646750 (dbSNP, Ensembl).                                                                       1

Alternative sequence

Feature key           Feature ID  Position(s)  Description                                                    Length
Alternative sequence  VSP_047260  1            M → MAGPSNPPATM in isoform 3. (1 publication)                  1
Alternative sequence  VSP_045569  118 – 218    Missing in isoform 2. (1 publication)                          101
Alternative sequence  VSP_057429  492 – 505    LADLM…EELAT → APPSPSTRPDPTAT in isoform 4. (1 publication)     14
Alternative sequence  VSP_057430  506 – 902    Missing in isoform 4. (1 publication)                          397

Sequence databases

AF052732 mRNA. Translation: AAC35000.1.
AK294392 mRNA. Translation: BAG57647.1.
CR749807 mRNA. Translation: CAH18667.1.
AC079848 Genomic DNA. No translation available.
CH471052 Genomic DNA. Translation: EAW79370.1.
BC027241 mRNA. Translation: AAH27241.1.
CCDS: CCDS3034.1. [O75891-1]
CCDS58850.1. [O75891-2]
CCDS58851.1. [O75891-3]
RefSeq: NP_001257293.1. NM_001270364.1. [O75891-3]
NP_001257294.1. NM_001270365.1. [O75891-2]
NP_036322.2. NM_012190.3. [O75891-1]
XP_006713544.1. XM_006713481.2. [O75891-1]
XP_011510657.1. XM_011512355.1. [O75891-1]
UniGene: Hs.434435.

Genome annotation databases

Ensembl: ENST00000273450; ENSP00000273450; ENSG00000144908. [O75891-3]
ENST00000393431; ENSP00000377081; ENSG00000144908. [O75891-4]
ENST00000393434; ENSP00000377083; ENSG00000144908. [O75891-1]
ENST00000452905; ENSP00000395881; ENSG00000144908. [O75891-2]
ENST00000455064; ENSP00000414126; ENSG00000144908. [O75891-4]
ENST00000472186; ENSP00000420293; ENSG00000144908. [O75891-1]
GeneID: 10840.
KEGG: hsa:10840.
UCSC: uc003eim.3. human. [O75891-1]
uc062njt.1. human.
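
Each Ensembl cross-reference above pairs a transcript and protein identifier with the isoform it encodes, and several isoforms are represented by more than one transcript. The sketch below groups the listed transcripts by isoform; the regular expression assumes the exact layout shown in this entry, and `group_by_isoform` is a hypothetical helper rather than part of any Ensembl or UniProt client.

```python
import re
from collections import defaultdict

# Ensembl cross-reference lines copied from this entry.
ENSEMBL_LINES = """\
ENST00000273450; ENSP00000273450; ENSG00000144908. [O75891-3]
ENST00000393431; ENSP00000377081; ENSG00000144908. [O75891-4]
ENST00000393434; ENSP00000377083; ENSG00000144908. [O75891-1]
ENST00000452905; ENSP00000395881; ENSG00000144908. [O75891-2]
ENST00000455064; ENSP00000414126; ENSG00000144908. [O75891-4]
ENST00000472186; ENSP00000420293; ENSG00000144908. [O75891-1]
"""

# Transcript; protein; gene. [isoform]
PATTERN = re.compile(r"(ENST\d+); (ENSP\d+); (ENSG\d+)\. \[(O75891-\d+)\]")


def group_by_isoform(text):
    """Map each isoform identifier to its (transcript, protein) pairs."""
    groups = defaultdict(list)
    for match in PATTERN.finditer(text):
        transcript, protein, _gene, isoform = match.groups()
        groups[isoform].append((transcript, protein))
    return dict(groups)


groups = group_by_isoform(ENSEMBL_LINES)
for isoform in sorted(groups):
    print(isoform, [transcript for transcript, _ in groups[isoform]])
```

Grouped this way, isoforms 1 and 4 each map to two transcripts, while isoforms 2 and 3 map to one each.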

Keywords - Coding sequence diversity

Alternative splicing, Polymorphism

Cross-references

Protocols and materials databases

DNASU: 10840.

Organism-specific databases

CTD: 10840.
DisGeNET: 10840.
GeneCards: ALDH1L1.
HGNC: HGNC:3978. ALDH1L1.
HPA: HPA036900.
HPA050139.
MIM: 600249. gene.
neXtProt: NX_O75891.
OpenTargets: ENSG00000144908.
PharmGKB: PA28393.

Miscellaneous databases

ChiTaRS: ALDH1L1. human.
EvolutionaryTrace: O75891.
GeneWiki: ALDH1L1.
GenomeRNAi: 10840.
PRO: O75891.

Entry information

Entry name: AL1L1_HUMAN
Accession: Primary (citable) accession number: O75891
Secondary accession number(s): B4DG36, E9PBX3, Q68CS1, Q8TBP8
Entry history
Integrated into UniProtKB/Swiss-Prot: May 30, 2000
Last sequence update: February 15, 2005
Last modified: November 30, 2016
This is version 169 of the entry and version 2 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

Keywords - Technical term

3D-structure, Complete proteome, Reference proteome

Documents

  1. Human chromosome 3
    Human chromosome 3: entries, gene names and cross-references to MIM
  2. Human entries with polymorphisms or disease mutations
    List of human entries with polymorphisms or disease mutations
  3. Human polymorphisms and disease mutations
    Index of human polymorphisms and disease mutations
  4. MIM cross-references
    Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot
  5. PDB cross-references
    Index of Protein Data Bank (PDB) cross-references
  6. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.