Protein

HC-toxin synthetase

Gene

HTS1

Organism
Cochliobolus carbonum (Maize leaf spot fungus) (Bipolaris zeicola)
Status
Reviewed - Annotation score: 3 out of 5 - Experimental evidence at protein level

Function

Non-ribosomal peptide synthetase that activates proline and AEO (2-amino-9,10-epoxy-8-oxodecanoic acid) and epimerizes L-Pro. Catalyzes the production of HC-toxin, a cyclic tetrapeptide. Activates and thioesterifies L-Pro and epimerizes it to D-Pro. It also uses D-Ala as a substrate, but the D-Ala is epimerized from L-Ala by TOXG. (1 publication)

Cofactor

pantetheine 4'-phosphate
Note: Binds 4 phosphopantetheines covalently.

Pathway: HC-toxin biosynthesis

This protein is involved in the pathway HC-toxin biosynthesis, which is part of Mycotoxin biosynthesis.

GO - Molecular function

  • acid-amino acid ligase activity Source: UniProtKB

GO - Biological process

  • toxin biosynthetic process Source: UniProtKB

Keywords - Molecular function

Ligase

Enzyme and pathway databases

UniPathway: UPA00874.

Names & Taxonomy

Protein names
Recommended name: HC-toxin synthetase (EC:6.3.2.-)
Short name: HTS
Gene names
Name: HTS1
Organism: Cochliobolus carbonum (Maize leaf spot fungus) (Bipolaris zeicola)
Taxonomic identifier: 5017 [NCBI]
Taxonomic lineage: Eukaryota > Fungi > Dikarya > Ascomycota > Pezizomycotina > Dothideomycetes > Pleosporomycetidae > Pleosporales > Pleosporineae > Pleosporaceae > Bipolaris

PTM / Processing

Molecule processing

Feature key   Position(s)   Length   Description           Feature identifier
Chain         1 – 5218      5218     HC-toxin synthetase   PRO_0000193097

Amino acid modifications

Feature key        Position(s)   Length   Description
Modified residue   803           1        O-(pantetheine 4'-phosphoryl)serine (PROSITE-ProRule annotation)
Modified residue   2414          1        O-(pantetheine 4'-phosphoryl)serine (PROSITE-ProRule annotation)
Modified residue   3569          1        O-(pantetheine 4'-phosphoryl)serine (PROSITE-ProRule annotation)
Modified residue   4701          1        O-(pantetheine 4'-phosphoryl)serine (PROSITE-ProRule annotation)

Keywords - PTM

Phosphopantetheine, Phosphoprotein

Proteomic databases

PRIDE: Q01886.

Structure

3D structure databases

ProteinModelPortal: Q01886.

Family & Domains

Domains and Repeats

Feature key   Position(s)   Length   Description
Domain        762 – 840     79       Acyl carrier 1 (PROSITE-ProRule annotation)
Domain        2384 – 2450   67       Acyl carrier 2 (PROSITE-ProRule annotation)
Domain        3537 – 3605   69       Acyl carrier 3 (PROSITE-ProRule annotation)
Domain        4668 – 4737   70       Acyl carrier 4 (PROSITE-ProRule annotation)

Region

Feature key   Position(s)   Length   Description
Region        249 – 842     594      Domain 1
Region        1854 – 2452   599      Domain 2
Region        3006 – 3607   602      Domain 3
Region        4159 – 4739   581      Domain 4
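
The Length values in the feature tables follow from 1-based, inclusive coordinates, so length = end - start + 1. The sketch below re-derives them and checks that each phosphopantetheine attachment serine lies inside the acyl carrier domain of the same rank. The coordinates are copied from this entry; the `Feature` class and its method names are illustrative assumptions, not part of any UniProt tool or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """A sequence feature with 1-based, inclusive coordinates."""
    name: str
    start: int
    end: int

    def length(self) -> int:
        # Inclusive coordinates: both endpoints count as residues.
        return self.end - self.start + 1

    def contains(self, position: int) -> bool:
        return self.start <= position <= self.end


# Coordinates copied from the "Domains and Repeats" and "Region" tables.
ACYL_CARRIER_DOMAINS = [
    Feature("Acyl carrier 1", 762, 840),
    Feature("Acyl carrier 2", 2384, 2450),
    Feature("Acyl carrier 3", 3537, 3605),
    Feature("Acyl carrier 4", 4668, 4737),
]
REGIONS = [
    Feature("Domain 1", 249, 842),
    Feature("Domain 2", 1854, 2452),
    Feature("Domain 3", 3006, 3607),
    Feature("Domain 4", 4159, 4739),
]
# Positions of the O-(pantetheine 4'-phosphoryl)serine residues.
PHOSPHOPANTETHEINE_SITES = [803, 2414, 3569, 4701]

domain_lengths = [f.length() for f in ACYL_CARRIER_DOMAINS]
region_lengths = [f.length() for f in REGIONS]

# Pair each modified serine with the acyl carrier domain of the same rank.
sites_in_domains = [
    domain.contains(site)
    for domain, site in zip(ACYL_CARRIER_DOMAINS, PHOSPHOPANTETHEINE_SITES)
]

print(domain_lengths)    # [79, 67, 69, 70]
print(region_lengths)    # [594, 599, 602, 581]
print(sites_in_domains)  # [True, True, True, True]
```

The same `contains` check also shows that each acyl carrier domain sits at the C-terminal end of its enclosing region, which matches the repeated four-module layout the tables describe.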

Sequence similarities

Contains 4 acyl carrier domains. (PROSITE-ProRule annotation)

Keywords - Domain

Repeat

Family and domain databases

Gene3D
  1.10.1200.10. 4 hits.
InterPro
  IPR010071. AA_adenyl_domain.
  IPR020845. AMP-binding_CS.
  IPR000873. AMP-dep_Synth/Lig.
  IPR001242. Condensatn.
  IPR020806. PKS_PP-bd.
  IPR009081. PP-bd_ACP.
  IPR006162. Ppantetheine_attach_site.
Pfam
  PF00501. AMP-binding. 4 hits.
  PF00668. Condensation. 5 hits.
  PF00550. PP-binding. 4 hits.
SMART
  SM00823. PKS_PP. 3 hits.
SUPFAM
  SSF47336. SSF47336. 4 hits.
TIGRFAMs
  TIGR01733. AA-adenyl-dom. 4 hits.
PROSITE
  PS50075. ACP_DOMAIN. 4 hits.
  PS00455. AMP_BINDING. 3 hits.
  PS00012. PHOSPHOPANTETHEINE. 4 hits.

Sequence

Sequence status: Complete.

Q01886-1

        10         20         30         40         50
MTMPHHSSGP AKDSPLCRFP PFPGGNPVFT NIRREKVNFQ LPPPDHALLA
60 70 80 90 100
AAWAVLLRLY TGHVKTCFES ATSDQEANLV TYEARDSDTL QTIVLRGACV
110 120 130 140 150
SSTAEEKAGL RDLNTAVVRT TVSIDSWTDE MLQDKIAALL QPGKEIVLFQ
160 170 180 190 200
TPSGCVLVYM QSFMSAMEVK NVSSTLTYIM SSDPDKTAIR NLSISPRDLA
210 220 230 240 250
QIMRWNDRKL KSERTNLVYD LFSARAHEQD ANMAIDAWDG RMSYTELERV
260 270 280 290 300
SSTWARQLQK QGISQGSWVL FCFEKSRLAV VSMIAILKAG GVCVPIDPRY
310 320 330 340 350
PVERIRDIIR TTNATIALVG AGKTAALFKS ADTAVQTIDI TKDIPHGLSD
360 370 380 390 400
TVVQSNTKID DPAFGLFTSG STGVPKCIVV THSQICTAVQ AYKDRFGVTS
410 420 430 440 450
ETRVLQFSSY TFDISIADTF TALFYGGTLC IPSEEDRMSN LQDYMVSVRP
460 470 480 490 500
NWAVLTPTVS RFLDPGVVKD FISTLIFTGE ASREADTVPW IEAGVNLYNV
510 520 530 540 550
YGPAENTLIT TATRIRKGKS SNIGYGVNTR TWVTDVSGAC LVPVGSIGEL
560 570 580 590 600
LIESGHLADK YLNRPDRTEA AFLSDLPWIP NYEGDSVRRG RRFYRTGDLV
610 620 630 640 650
RYCDDGSLIC VGRSDTQIKL AGQRVELGDV EAHLQSDPTT SQAAVVFPRS
660 670 680 690 700
GPLEARLIAL LVTGNKDGTP HNQQSLPKPA FAQCPPDLVK YATSSLQQRL
710 720 730 740 750
PSYMVPSVWL GIDFLPMSVS GKLDRAVLQD QLESLSPSDY AEILGTTGLE
760 770 780 790 800
VDPGGAASSV ASDSDLRDMN DDSLLLTACS RVLNLPAGKI SYSQSFIHAG
810 820 830 840 850
GDSITAMQVS SWMKRFTGKR IGVKDLLVSP SISTAASCIK SAQDGSRNFV
860 870 880 890 900
AVRPGQRIPV SPIQKLFFQT AEASKSWNHY HQSFLFRIDQ PIKPQTIEDA
910 920 930 940 950
ISLVMQRHPM LQARFERTEE GDWYQYIPID VERRASVEVI GSLSTDDREA
960 970 980 990 1000
AMLRARQSID LTEGPLIRCQ LFNNNVDEAS RLFFVVIHHA VVDLVSWRII
1010 1020 1030 1040 1050
MEELEAHLAT DSTPDRGEAY QESVPFLAWC QVQAEAVKDI PVDRTVPLIP
1060 1070 1080 1090 1100
KIPTADFGYW GLKHDENVYG NTVERKIPLG HSITEDLLYK CHDSLHTKTI
1110 1120 1130 1140 1150
DVLLAAVLVS FRRSFLDRPV PAVFNEGHGR EPGGEDAVDL SRTVGWFTTI
1160 1170 1180 1190 1200
SPVYVPEVSP GDILDVVRRV KDYRWATPNN GFDYFSTKYL TQSGIKLFED
1210 1220 1230 1240 1250
HLPAEILFNY EGRYQAMESE QTVLKPESWH AGEASKDQDP GLRRFCLFEI
1260 1270 1280 1290 1300
STAVLPDGQL HLTCSWNKNM RHQGRIRLWL DTLLPAAIGE IVSSLALASP
1310 1320 1330 1340 1350
QLTLSDVELL RLYDYSSLDI LKKSILSIPA VQTLDDLEGV YPGSPMQDAL
1360 1370 1380 1390 1400
FLSQSKSQDG AYEVDFTWRV ATSLQNSQPA VDIGCLVEAW KDTVALHAAL
1410 1420 1430 1440 1450
RTVILESSLP ATGILHQVVL RSHDPDIVIL DVRDVTAAIT ILDSYPPPTE
1460 1470 1480 1490 1500
EGIALIKRPP HRLLICTTIE GSVLIKFQVN HLVFDGMSTD KIIQDLSKAY
1510 1520 1530 1540 1550
TCRHSNKLPD HSESKLHDGT YGNRPTKPPL AEFIRYIRDP QRKQDSINYW
1560 1570 1580 1590 1600
KNALRGATTC SFPPLFDQIT SEKAMPRQSW ASVPIPLCVD SKELSKTLAN
1610 1620 1630 1640 1650
LGITMSTMFQ TVWAIVLRIY SQNGQSVFGY LTSGRDAPVD GIDSAVGNFI
1660 1670 1680 1690 1700
AMLVCFFDFD DDGVHTVADM ARKIHNASAN SISHQACSLA EIQDALGLST
1710 1720 1730 1740 1750
STPLFNTAFT YLPKRPTNVK AGEPEHHLCF EELSMSDPTE FDLTLFVEPT
1760 1770 1780 1790 1800
QESNEVSAHL DFKLSYISQA YATSIASTVA HILSELVHDP YRALNTLPIV
1810 1820 1830 1840 1850
SEHDTAIIRS WNDHLFPPAT ECIHETFSRK VVEHPQREAI CSWDGSLTYA
1860 1870 1880 1890 1900
ELSDLSQRLS IHLVSLGIKV GTKIPICFEK SMWTIVTILA VVQAGGVFVL
1910 1920 1930 1940 1950
LEPGHPESRL SGIIKQVQAE LLLCSPATSR MGALQNISTQ MGTEFKIVEL
1960 1970 1980 1990 2000
EPEFIRSLPL PPKPNHQPMV GLNDDLYVVF TSGSTGVPKG AVATHQAYAT
2010 2020 2030 2040 2050
GIYEHAVACG MTSLGAPPRS LQFASYSFDA SIGDIFTTLA VGGCLCIPRE
2060 2070 2080 2090 2100
EDRNPAGITT FINRYGVTWA GITPSLALHL DPDAVPTLKA LCVAGEPLSM
2110 2120 2130 2140 2150
SVVTVWSKRL NLINMYGPTE ATVACIANQV TCTTTTVSDI GRGYRATTWV
2160 2170 2180 2190 2200
VQPDNHNSLV PIGAVGELII EGSILCRGYL NDPERTAEVF IRSPSWLHDL
2210 2220 2230 2240 2250
RPNSTLYKTG DLVRYSADGK IIFIGRKDTQ VKMNGQRFEL GEVEHALQLQ
2260 2270 2280 2290 2300
LDPSDGPIIV DLLKRTQSGE PDLLIAFLFV GRANTGTGNS DEIFIATSTS
2310 2320 2330 2340 2350
SLSEFSTVIK KLQDAQRAME VLPLFMVPQA YIPIEGGIPL TAAGKIDRRM
2360 2370 2380 2390 2400
LRKLCEPFNR NDLISFTSKA LSTSVKDAET TDTVEDRLAR IWEKVLGVKG
2410 2420 2430 2440 2450
VGRESDFFSS GGNSMAAIAL RAEAQRSGFT LFVADIFTNP RLADMAKLFS
2460 2470 2480 2490 2500
HGQSVSPSSS TLRTKVPISS LQKRSSGLQT AAPVSNGSPV RRCQKENIID
2510 2520 2530 2540 2550
CPVAFEYEEG PSDTQLKEAS RICGISSRSI EDVFPCTPMQ EALVALSLIP
2560 2570 2580 2590 2600
GAQASYALHA AFELRPGLDR NRFRSAWEST VKAQPILRSR IISGSNGSSV
2610 2620 2630 2640 2650
VVTSATDSIP QLDVSGLDTF LEQQLQVGFA PGAPLFRLAF VYSKADDCDY
2660 2670 2680 2690 2700
FVISAHHAIY DGWSLNLIWS QVLALYTNGE LPPPGPSFKH FARNLNLVQS
2710 2720 2730 2740 2750
KLDSEDFWRK LLVKPDQESF RFPDVPVGHK PATRCTTNFH FPFSMQSKIG
2760 2770 2780 2790 2800
TTANTCINAA WAITLAQYSS NKTVNFGVTL WGRDFPMIDI EHMTGPTIVT
2810 2820 2830 2840 2850
VPRQVNVIPE SSVAEFLQDL QKSLAVVLPH QHLGLHRIQA LGPIARQACD
2860 2870 2880 2890 2900
FSTLLVVNHG SSISWSELEA ADIVPVPLRS SDLYAYPMVV EVENASSDTL
2910 2920 2930 2940 2950
DIRVHSDPDC IEVQLLERLM EQFGHNLQTL CRAASFDPGK RIAELMDDTA
2960 2970 2980 2990 3000
TTHLRTLFSW NSRVKDSPDV AAIAVHKLLE ETAQSQPAES AIVAHDGQLS
3010 3020 3030 3040 3050
YMQMDRCADV LARQIRKTNM ISAQSPFVCI HLLRSATAVV SMLAVLKAGG
3060 3070 3080 3090 3100
AFMPVDISQP RSRLQNLIEE SGAKLVLTLP ESANALATLS GLTKVIPVSL
3110 3120 3130 3140 3150
SELVQQITDN TTKKDEYCKS GDTDPSSPAY LLYTSGTSGK PKGVVMEHRA
3160 3170 3180 3190 3200
WSLGFTCHAE YMGFNSCTRI LQFSSLMFDL SILEIWAVLY AGGCLFIPSD
3210 3220 3230 3240 3250
KERVNNLQDF TRINDINTVF LTPSIGKLLN PKDLPNISFA GFIGEPMTRS
3260 3270 3280 3290 3300
LIDAWTLPGR RLVNSYGPTE ACVLVTAREI SPTAPHDKPS SNIGHALGAN
3310 3320 3330 3340 3350
IWVVEPQRTA LVPIGAVGEL CIEAPSLARC YLANPERTEY SFPSTVLDNW
3360 3370 3380 3390 3400
QTKKGTRVYR TGDLVRYASD GTLDFLGRKD GQIKLRGQRI ELGEIEHHIR
3410 3420 3430 3440 3450
RLMSDDPRFH EASVQLYNPA TDPDRDATVD VQMREPYLAG LLVLDLVFTD
3460 3470 3480 3490 3500
EVMGIPCTSL TSANTSENLQ TLVTELKKSL RGVLPHYMVP LHFVAVSRLP
3510 3520 3530 3540 3550
TGSSGKLDHA FVRACLRELT APLDGNFPKV EQVLTTNESV LRQWWGTVLA
3560 3570 3580 3590 3600
MDPHSIQRGD DFFSLGGSSI SAMRLVGLAR SSGHKLQHED IFMCPRLADM
3610 3620 3630 3640 3650
AGQISFVQEA SVSPTTSPTI KFDLLDDCEV DEVIDHILPQ LDMNKELIED
3660 3670 3680 3690 3700
VYPCTPLQES LMAATARHGE AYTMIQSITV LASQLAQLKK AMDVVFRDFE
3710 3720 3730 3740 3750
VLRTRIALGP SQQALQVVVK HEELSWESFP SIQSFKDHFY RSLGYGKPLA
3760 3770 3780 3790 3800
RLAVITQALD TKQPISHGTR EARTKNSQDT VMVVVGAHHS IYDAHVLSMI
3810 3820 3830 3840 3850
WRRLYREFIG SQADGILEAE TSRSEGVVPF KSYVEKLLRG KDNDESLLFW
3860 3870 3880 3890 3900
KEKLRGVSSS QFPPASWPRV LEHQPSATQT LITKVSLPTS SRKKLGATVA
3910 3920 3930 3940 3950
TVAYAAWALT IAHYTADPDV VFGATLSGRE TMAGSISHPE SIAGPTIITV
3960 3970 3980 3990 4000
PLRIIIDFQT VVSDFLSTLQ KDIVRAAYFG QMMGLNSIAH IDNDCRDACG
4010 4020 4030 4040 4050
FKSIIVVQVP DEGENHDGRA ANPFQMSLES IGHFPAPLVV EVEQSESTDV
4060 4070 4080 4090 4100
LIRMAYDPVL VPEKLAHFIS DTFTTTMSNL SAANPKAKVE SIPALSEAHL
4110 4120 4130 4140 4150
AELDVTCPEW ILGKAKDEKI RTESHQCLQD LVCRRAQQSP NSQAIDSWDG
4160 4170 4180 4190 4200
SISYHELDGL SSILAEHLSQ LGVRPEAPVC LLFEKSKWAV VAMIGIIKAG
4210 4220 4230 4240 4250
GCFVPLDPSY PHERLEHIIS ETGSSVIVTS AAYSKLCLSL SVRGIVCDGS
4260 4270 4280 4290 4300
VFSSTKKPLP STADSPPSFS VRPNQAAYIL FTSGSTGKPK GVVMEHHSVC
4310 4320 4330 4340 4350
SALIALGKRM GLGPQSRVLQ FNSYWFDVML LDIFGTLVYG GCLCIPKEEQ
4360 4370 4380 4390 4400
RMSNLSGWVQ KFKVNTMLLS TSVSRLMQPA DTPSLETLCL TGEAVLQSDV
4410 4420 4430 4440 4450
DRWAPKLHLI AGYGPTETCI MSVSGELTPS SPANLIGKPV SCQAWVINPL
4460 4470 4480 4490 4500
KETELAPYGA TGELYIQGPT VARGYLHDDV LTSKAFIVDP QWLTGYKTNE
4510 4520 4530 4540 4550
NQWSRRAYKT GDLVFWGPQS NLYYVRRKDS SQVKIRGQRV ELAEIEEVIR
4560 4570 4580 4590 4600
QHIPPDVTVC VDLLSSDDQN TRIILGAVLG IGDRALGGPE DLEVIGYMDD
4610 4620 4630 4640 4650
LKSHIIPALE ASLPHHMIPE AYVPFVQLPT LGSGKLDRKT VRRVAGPLAF
4660 4670 4680 4690 4700
SLPQASARHP NQPTVTHTQK LLRQLWCKIL PQLDESAVNK QDNFLGIGGD
4710 4720 4730 4740 4750
SIAAIKLVAL LRQHGISLAV AEIFTRPTLE AMSSLIDEHN FVVSHAGILS
4760 4770 4780 4790 4800
DVTRNTSGVM RQTTNLIAGR HSMAVEKSRE CDNSTLPCTE YQQMFLAGTE
4810 4820 4830 4840 4850
AFTGAHSAQF IFRLPEKIDL DRLQAAFDHC ADWYPNLRTQ IHKDADTGRL
4860 4870 4880 4890 4900
LHDISPIGVK VPWSCHYSDD LNTVLSHDKK FPPGLDGPLH RVTIMRHRDP
4910 4920 4930 4940 4950
TESMLVWTLN HAAYDAWSLR MMLEHITEAY ANPDYEPSYS LGWTAFVLHT
4960 4970 4980 4990 5000
ENTKEASRSF WSSYLSDVKP ARLMFNYNLV SNPRQDRLYE ARINIPKRVL
5010 5020 5030 5040 5050
SQATAATVLL AGLTLLVARV CDTRDVILAH LLTGRTLPLA GIENCPGPTI
5060 5070 5080 5090 5100
TKVPLRIPLM DQDLVTLELD SVAKKITAEL MRVMPHEHSG LSAIREFIPQ
5110 5120 5130 5140 5150
AEGTTTSSGK FHAGSVLGRL PLDLVIHPKG GLDLLGKHGL GLQNEGFRLV
5160 5170 5180 5190 5200
APPSGGLSME CALVDDDDDK RSDTISVDVS VLWDQRAATQ EDVIELVHSL
5210
QGIFTKRNLA ASICLMYK
Length: 5,218
Mass (Da): 574,596
Last modified: February 7, 2006 - v2
Checksum: AC947CEA7FF409E2
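
The sequence display above interleaves residue lines with position rulers and groups residues in blocks of ten, so text copied from it needs cleaning before it can be used as a plain sequence. Below is a minimal sketch, assuming ruler lines contain only digits and spaces; `parse_sequence_block` is a hypothetical helper name, and the sample holds only the first 100 residues of this entry.

```python
def parse_sequence_block(text: str) -> str:
    """Return the bare residue string from a ruler-annotated sequence block."""
    residues = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        # Ruler lines carry positions only (digits and spaces), so skip them.
        if all(ch.isdigit() or ch.isspace() for ch in stripped):
            continue
        # Residue lines are split into groups of ten; join the groups.
        residues.append("".join(stripped.split()))
    return "".join(residues)


# First two residue lines of this entry, with their rulers.
SAMPLE = """\
        10         20         30         40         50
MTMPHHSSGP AKDSPLCRFP PFPGGNPVFT NIRREKVNFQ LPPPDHALLA
60 70 80 90 100
AAWAVLLRLY TGHVKTCFES ATSDQEANLV TYEARDSDTL QTIVLRGACV
"""

sequence = parse_sequence_block(SAMPLE)

# 1-based positions of methionine within the first ten residues.
methionine_positions = [
    index + 1 for index, residue in enumerate(sequence[:10]) if residue == "M"
]

print(len(sequence))         # 100
print(methionine_positions)  # [1, 3]
```

Applied to the full block, the length of the returned string can be compared with the stated length of 5,218. The two methionines at positions 1 and 3 are the candidates named in the entry's caution about the uncertain initiator.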

Sequence databases

EMBL / GenBank / DDBJ: M98024 Genomic DNA. Translation: AAA33023.2.

Publications

  1. "The cyclic peptide synthetase catalyzing HC-toxin production in the filamentous fungus Cochliobolus carbonum is encoded by a 15.7-kilobase open reading frame."
    Scott-Craig J.S., Panaccione D.G., Pocard J.-A., Walton J.D.
    J. Biol. Chem. 267:26044-26049 (1992)
    Cited for: NUCLEOTIDE SEQUENCE [GENOMIC DNA], PARTIAL PROTEIN SEQUENCE.
    Strain: ATCC 90305 / SB111 / 2R15.
  2. Scott-Craig J.S., Panaccione D.G., Pocard J.-A., Walton J.D.
    Submitted (APR-2005) to the EMBL/GenBank/DDBJ databases
    Cited for: SEQUENCE REVISION TO 3448-3462; 3683 AND 4017.
    Strain: ATCC 90305 / SB111 / 2R15.
  3. "A eukaryotic alanine racemase gene involved in cyclic peptide biosynthesis."
    Cheng Y.-Q., Walton J.D.
    J. Biol. Chem. 275:4906-4911 (2000)
    Cited for: FUNCTION.
    Strain: ATCC 90305 / SB111 / 2R15.

Entry information

Entry name: HTS1_COCCA
Accession: Primary (citable) accession number: Q01886
Entry history
Integrated into UniProtKB/Swiss-Prot: February 1, 1994
Last sequence update: February 7, 2006
Last modified: May 11, 2016
This is version 91 of the entry and version 2 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Fungal Protein Annotation Program

Miscellaneous

Caution

It is uncertain whether Met-1 or Met-3 is the initiator. (Curated)

Keywords - Technical term

Direct protein sequencing, Multifunctional enzyme

Documents

  1. PATHWAY comments
    Index of metabolic and biosynthesis pathways
  2. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.
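
The UniRef90 and UniRef50 criteria quoted above amount to two thresholds applied against a seed sequence: a minimum identity and a minimum overlap. The sketch below encodes only those stated numbers. It does not reproduce the actual UniRef clustering procedure, and the function and constant names are hypothetical.

```python
from typing import Dict, List, Tuple

# (minimum identity, minimum overlap with the seed), as stated in the text.
UNIREF_THRESHOLDS: Dict[str, Tuple[float, float]] = {
    "UniRef90": (0.90, 0.80),
    "UniRef50": (0.50, 0.80),
}


def levels_met(identity: float, overlap: float) -> List[str]:
    """List the UniRef levels whose stated thresholds a sequence pair meets.

    Both arguments are fractions between 0 and 1, measured against the
    seed (longest) sequence of a cluster.
    """
    if not (0.0 <= identity <= 1.0 and 0.0 <= overlap <= 1.0):
        raise ValueError("identity and overlap must be fractions in [0, 1]")
    return [
        level
        for level, (min_identity, min_overlap) in UNIREF_THRESHOLDS.items()
        if identity >= min_identity and overlap >= min_overlap
    ]


print(levels_met(0.95, 0.85))  # ['UniRef90', 'UniRef50']
print(levels_met(0.60, 0.90))  # ['UniRef50']
print(levels_met(0.95, 0.50))  # []
```

The last call shows why the overlap condition matters: a short fragment can be nearly identical to the seed and still fall outside both clusters if it covers too little of it.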