Protein

ADAMTS-like protein 3

Gene

ADAMTSL3

Organism
Homo sapiens (Human)
Status
Reviewed - Annotation score: 5 out of 5 - Experimental evidence at protein level

Function

Enzyme and pathway databases

Reactome: R-HSA-5083635. Defective B3GALTL causes Peters-plus syndrome (PpS).
  R-HSA-5173214. O-glycosylation of TSR domain-containing proteins.

Names & Taxonomy

Protein names
Recommended name: ADAMTS-like protein 3
Short name: ADAMTSL-3
Alternative name(s): Punctin-2
Gene names
Name: ADAMTSL3
Synonyms: KIAA1233
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes
  • UP000005640 Component: Chromosome 15

Organism-specific databases

HGNC: HGNC:14633. ADAMTSL3.

Subcellular location

Keywords - Cellular component

Extracellular matrix, Secreted

Pathology & Biotech

Organism-specific databases

PharmGKB: PA134934525.

Polymorphism and mutation databases

BioMuta: ADAMTSL3.
DMDM: 308153648.

PTM / Processing

Molecule processing

Feature key     Position(s)  Length  Feature identifier  Description
Signal peptide  1 – 26       26      -                   Sequence analysis
Chain           27 – 1691    1665    PRO_0000249684      ADAMTS-like protein 3

Amino acid modifications

Feature key     Position(s)   Length  Description
Disulfide bond  87 ↔ 118      -       By similarity
Disulfide bond  91 ↔ 123      -       By similarity
Disulfide bond  102 ↔ 108     -       By similarity
Glycosylation   293           1       N-linked (GlcNAc...); Sequence analysis
Disulfide bond  576 ↔ 620     -       By similarity
Disulfide bond  580 ↔ 625     -       By similarity
Disulfide bond  591 ↔ 609     -       By similarity
Glycosylation   681           1       N-linked (GlcNAc...); Sequence analysis
Glycosylation   797           1       N-linked (GlcNAc...); Sequence analysis
Disulfide bond  831 ↔ 875     -       By similarity
Disulfide bond  835 ↔ 880     -       By similarity
Disulfide bond  846 ↔ 863     -       By similarity
Glycosylation   915           1       N-linked (GlcNAc...); Sequence analysis
Glycosylation   927           1       N-linked (GlcNAc...); Sequence analysis
Disulfide bond  934 ↔ 982     -       By similarity
Glycosylation   1102          1       N-linked (GlcNAc...); Sequence analysis
Glycosylation   1191          1       N-linked (GlcNAc...); Sequence analysis
Disulfide bond  1215 ↔ 1263   -       By similarity
Glycosylation   1292          1       N-linked (GlcNAc...); Sequence analysis
Glycosylation   1316          1       N-linked (GlcNAc...); Sequence analysis
Disulfide bond  1321 ↔ 1367   -       By similarity
Glycosylation   1330          1       N-linked (GlcNAc...); Sequence analysis
Glycosylation   1343          1       N-linked (GlcNAc...); Sequence analysis
Glycosylation   1349          1       N-linked (GlcNAc...); Sequence analysis
Glycosylation   1356          1       N-linked (GlcNAc...); Sequence analysis
Glycosylation   1432          1       N-linked (GlcNAc...); Sequence analysis
Glycosylation   1516          1       N-linked (GlcNAc...); Sequence analysis
Glycosylation   1574          1       N-linked (GlcNAc...); Sequence analysis
Glycosylation   1591          1       N-linked (GlcNAc...); Sequence analysis

Post-translational modification

Glycosylated (By similarity). Can be O-fucosylated by POFUT2 on a serine or a threonine residue found within the consensus sequence C1-X(2)-(S/T)-C2-G of the TSP type-1 repeat domains, where C1 and C2 are the first and second cysteine residues of the repeat, respectively. Fucosylated repeats can then be further glycosylated by the addition of a beta-1,3-glucose residue by the glucosyltransferase B3GALTL. Fucosylation mediates the efficient secretion of ADAMTS family members. Can also be C-glycosylated with one or two mannose molecules on tryptophan residues within the consensus sequence W-X-X-W of the TSP type-1 repeats, and N-glycosylated. These other glycosylations can also facilitate secretion (By similarity).
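
The two consensus sequences named above can be located in the sequence with a simple pattern scan. The following is a minimal sketch, not part of the entry: the function and constant names are hypothetical, the fragment is residues 71-100 of the canonical sequence (the start of TSP type-1 domain 1), and a motif match only marks a candidate site, not an experimentally confirmed modification.

```python
import re

# Residues 71-100 of the canonical sequence (isoform 1), copied from this entry.
FRAGMENT = "DEDKDGNWDAWGDWSDCSRTCGGGASYSLR"
FRAGMENT_START = 71  # 1-based position of the first residue in FRAGMENT

# Lookahead groups let overlapping matches be reported.
# O-fucosylation consensus: C1-X(2)-(S/T)-C2-G
O_FUC_MOTIF = re.compile(r"(?=(C..[ST]CG))")
# C-mannosylation consensus: W-X-X-W
C_MAN_MOTIF = re.compile(r"(?=(W..W))")


def find_motifs(seq, pattern, offset=1):
    """Return (1-based position, matched text) for every motif occurrence."""
    return [(m.start() + offset, m.group(1)) for m in pattern.finditer(seq)]


if __name__ == "__main__":
    print(find_motifs(FRAGMENT, O_FUC_MOTIF, FRAGMENT_START))
    print(find_motifs(FRAGMENT, C_MAN_MOTIF, FRAGMENT_START))
```

On this fragment the scan reports the C-X-X-(S/T)-C-G motif starting at Cys-87, which places the candidate O-fucosylation residue (the S/T position) at Thr-90, and two overlapping W-X-X-W motifs starting at Trp-78 and Trp-81.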

Keywords - PTM

Disulfide bond, Glycoprotein

Proteomic databases

PaxDb: P82987.
PeptideAtlas: P82987.
PRIDE: P82987.

PTM databases

iPTMnet: P82987.
PhosphoSite: P82987.

Expression

Tissue specificity

Expressed in epithelial cells of the colon, fallopian tube, skin, breast, prostate, epididymis, liver, pancreatic islets and bile ducts, as well as by vascular endothelial cells, smooth muscle cells, fibroblasts, cortical and ganglionic neurons and cardiac myocytes. Also expressed by malignant epithelial cells in colon cancer, as well as in breast, prostate, renal and skin tumors. Expression is significantly reduced in colon cancer compared to normal colon (2 publications).

Gene expression databases

Bgee: ENSG00000156218.
CleanEx: HS_ADAMTSL3.
Genevisible: P82987. HS.

Organism-specific databases

HPA: HPA034773.
  HPA034774.

Interaction

Binary interactions

With       Entry   #Exp.  IntAct
KRT40      Q6A162  3      EBI-10221726,EBI-10171697
KRTAP10-8  P60410  3      EBI-10221726,EBI-10171774
KRTAP2-3   P0C7H8  3      EBI-10221726,EBI-10196781
MDFI       Q99750  3      EBI-10221726,EBI-724076
NOTCH2NL   Q7Z3S9  3      EBI-10221726,EBI-945833

Protein-protein interaction databases

BioGrid: 121437. 6 interactions.
IntAct: P82987. 5 interactions.
MINT: MINT-3113527.
STRING: 9606.ENSP00000286744.

Structure

3D structure databases

ProteinModelPortal: P82987.
SMR: P82987. Positions 76-232, 922-984, 1164-1424.

Family & Domains

Domains and Repeats

Feature key  Position(s)   Length  Description
Domain       75 – 124      50      TSP type-1 1 (PROSITE-ProRule annotation)
Domain       418 – 468     51      TSP type-1 2 (PROSITE-ProRule annotation)
Domain       478 – 535     58      TSP type-1 3 (PROSITE-ProRule annotation)
Domain       564 – 626     63      TSP type-1 4 (PROSITE-ProRule annotation)
Domain       703 – 760     58      TSP type-1 5 (PROSITE-ProRule annotation)
Domain       763 – 818     56      TSP type-1 6 (PROSITE-ProRule annotation)
Domain       819 – 881     63      TSP type-1 7 (PROSITE-ProRule annotation)
Domain       896 – 992     97      Ig-like C2-type 1
Domain       1185 – 1279   95      Ig-like C2-type 2
Domain       1296 – 1378   83      Ig-like C2-type 3
Domain       1424 – 1482   59      TSP type-1 8 (PROSITE-ProRule annotation)
Domain       1483 – 1545   63      TSP type-1 9 (PROSITE-ProRule annotation)
Domain       1597 – 1644   48      TSP type-1 10 (PROSITE-ProRule annotation)
Domain       1655 – 1691   37      PLAC (PROSITE-ProRule annotation)
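
The Position(s) and Length columns of the Domains and Repeats table are related by inclusive 1-based coordinates (length = end - start + 1). A small sketch for checking that relationship and tallying domain types is given here; the coordinates are copied from the table, while the names `DOMAINS`, `span_length` and `count_by_prefix` are hypothetical helpers introduced only for illustration.

```python
# (description, start, end, annotated length), copied from the
# Domains and Repeats table of this entry.
DOMAINS = [
    ("TSP type-1 1", 75, 124, 50),
    ("TSP type-1 2", 418, 468, 51),
    ("TSP type-1 3", 478, 535, 58),
    ("TSP type-1 4", 564, 626, 63),
    ("TSP type-1 5", 703, 760, 58),
    ("TSP type-1 6", 763, 818, 56),
    ("TSP type-1 7", 819, 881, 63),
    ("Ig-like C2-type 1", 896, 992, 97),
    ("Ig-like C2-type 2", 1185, 1279, 95),
    ("Ig-like C2-type 3", 1296, 1378, 83),
    ("TSP type-1 8", 1424, 1482, 59),
    ("TSP type-1 9", 1483, 1545, 63),
    ("TSP type-1 10", 1597, 1644, 48),
    ("PLAC", 1655, 1691, 37),
]


def span_length(start, end):
    """Length of a feature given inclusive 1-based start and end positions."""
    return end - start + 1


def count_by_prefix(domains, prefix):
    """Count domains whose description starts with the given prefix."""
    return sum(1 for name, _start, _end, _length in domains if name.startswith(prefix))


if __name__ == "__main__":
    for name, start, end, length in DOMAINS:
        status = "ok" if span_length(start, end) == length else "MISMATCH"
        print(f"{name}: {start}-{end} -> {span_length(start, end)} ({status})")
    print("TSP type-1 domains:", count_by_prefix(DOMAINS, "TSP type-1"))
```

The tallies agree with the counts stated in this entry: 10 TSP type-1 domains and 1 PLAC domain.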

Sequence similarities

Contains 1 PLAC domain (PROSITE-ProRule annotation).
Contains 10 TSP type-1 domains (PROSITE-ProRule annotation).

Keywords - Domain

Repeat, Signal

Phylogenomic databases

eggNOG: ENOG410INDC. Eukaryota.
  ENOG410YY5R. LUCA.
GeneTree: ENSGT00760000118885.
HOGENOM: HOG000230906.
HOVERGEN: HBG079989.
InParanoid: P82987.
OMA: MDTAQFD.
OrthoDB: EOG091G00BQ.
PhylomeDB: P82987.
TreeFam: TF351125.

Family and domain databases

Gene3D: 2.60.40.10. 3 hits.
InterPro: IPR007110. Ig-like_dom.
  IPR013783. Ig-like_fold.
  IPR013098. Ig_I-set.
  IPR003599. Ig_sub.
  IPR003598. Ig_sub2.
  IPR013273. Peptidase_M12B_ADAM-TS.
  IPR010909. PLAC.
  IPR000884. TSP1_rpt.
Pfam: PF07679. I-set. 1 hit.
  PF08686. PLAC. 1 hit.
  PF00090. TSP_1. 9 hits.
PRINTS: PR01857. ADAMTSFAMILY.
SMART: SM00409. IG. 3 hits.
  SM00408. IGc2. 3 hits.
  SM00209. TSP1. 12 hits.
SUPFAM: SSF48726. SSF48726. 3 hits.
  SSF82895. SSF82895. 12 hits.
PROSITE: PS50835. IG_LIKE. 3 hits.
  PS50900. PLAC. 1 hit.
  PS50092. TSP1. 10 hits.

Sequences (2)

Sequence status: Complete.

Sequence processing: The displayed sequence is further processed into a mature form.

This entry describes 2 isoforms produced by alternative splicing.

Isoform 1 (identifier: P82987-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MASWTSPWWV LIGMVFMHSP LPQTTAEKSP GAYFLPEFAL SPQGSFLEDT
        60         70         80         90        100
TGEQFLTYRY DDQTSRNTRS DEDKDGNWDA WGDWSDCSRT CGGGASYSLR
       110        120        130        140        150
RCLTGRNCEG QNIRYKTCSN HDCPPDAEDF RAQQCSAYND VQYQGHYYEW
       160        170        180        190        200
LPRYNDPAAP CALKCHAQGQ NLVVELAPKV LDGTRCNTDS LDMCISGICQ
       210        220        230        240        250
AVGCDRQLGS NAKEDNCGVC AGDGSTCRLV RGQSKSHVSP EKREENVIAV
       260        270        280        290        300
PLGSRSVRIT VKGPAHLFIE SKTLQGSKGE HSFNSPGVFL VENTTVEFQR
       310        320        330        340        350
GSERQTFKIP GPLMADFIFK TRYTAAKDSV VQFFFYQPIS HQWRQTDFFP
       360        370        380        390        400
CTVTCGGGYQ LNSAECVDIR LKRVVPDHYC HYYPENVKPK PKLKECSMDP
       410        420        430        440        450
CPSSDGFKEI MPYDHFQPLP RWEHNPWTAC SVSCGGGIQR RSFVCVEESM
       460        470        480        490        500
HGEILQVEEW KCMYAPKPKV MQTCNLFDCP KWIAMEWSQC TVTCGRGLRY
       510        520        530        540        550
RVVLCINHRG EHVGGCNPQL KLHIKEECVI PIPCYKPKEK SPVEAKLPWL
       560        570        580        590        600
KQAQELEETR IATEEPTFIP EPWSACSTTC GPGVQVREVK CRVLLTFTQT
       610        620        630        640        650
ETELPEEECE GPKLPTERPC LLEACDESPA SRELDIPLPE DSETTYDWEY
       660        670        680        690        700
AGFTPCTATC VGGHQEAIAV CLHIQTQQTV NDSLCDMVHR PPAMSQACNT
       710        720        730        740        750
EPCPPRWHVG SWGPCSATCG VGIQTRDVYC LHPGETPAPP EECRDEKPHA
       760        770        780        790        800
LQACNQFDCP PGWHIEEWQQ CSRTCGGGTQ NRRVTCRQLL TDGSFLNLSD
       810        820        830        840        850
ELCQGPKASS HKSCARTDCP PHLAVGDWSK CSVSCGVGIQ RRKQVCQRLA
       860        870        880        890        900
AKGRRIPLSE MMCRDLPGLP LVRSCQMPEC SKIKSEMKTK LGEQGPQILS
       910        920        930        940        950
VQRVYIQTRE EKRINLTIGS RAYLLPNTSV IIKCPVRRFQ KSLIQWEKDG
       960        970        980        990       1000
RCLQNSKRLG ITKSGSLKIH GLAAPDIGVY RCIAGSAQET VVLKLIGTDN
      1010       1020       1030       1040       1050
RLIARPALRE PMREYPGMDH SEANSLGVTW HKMRQMWNNK NDLYLDDDHI
      1060       1070       1080       1090       1100
SNQPFLRALL GHCSNSAGST NSWELKNKQF EAAVKQGAYS MDTAQFDELI
      1110       1120       1130       1140       1150
RNMSQLMETG EVSDDLASQL IYQLVAELAK AQPTHMQWRG IQEETPPAAQ
      1160       1170       1180       1190       1200
LRGETGSVSQ SSHAKNSGKL TFKPKGPVLM RQSQPPSISF NKTINSRIGN
      1210       1220       1230       1240       1250
TVYITKRTEV INILCDLITP SEATYTWTKD GTLLQPSVKI ILDGTGKIQI
      1260       1270       1280       1290       1300
QNPTRKEQGI YECSVANHLG SDVESSSVLY AEAPVILSVE RNITKPEHNH
      1310       1320       1330       1340       1350
LSVVVGGIVE AALGANVTIR CPVKGVPQPN ITWLKRGGSL SGNVSLLFNG
      1360       1370       1380       1390       1400
SLLLQNVSLE NEGTYVCIAT NALGKAVATS VLHLLERRWP ESRIVFLQGH
      1410       1420       1430       1440       1450
KKYILQATNT RTNSNDPTGE PPPQEPFWEP GNWSHCSATC GHLGARIQRP
      1460       1470       1480       1490       1500
QCVMANGQEV SEALCDHLQK PLAGFEPCNI RDCPARWFTS VWSQCSVSCG
      1510       1520       1530       1540       1550
EGYHSRQVTC KRTKANGTVQ VVSPRACAPK DRPLGRKPCF GHPCVQWEPG
      1560       1570       1580       1590       1600
NRCPGRCMGR AVRMQQRHTA CQHNSSDSNC DDRKRPTLRR NCTSGACDVC
      1610       1620       1630       1640       1650
WHTGPWKPCT AACGRGFQSR KVDCIHTRSC KPVAKRHCVQ KKKPISWRHC
      1660       1670       1680       1690
LGPSCDRDCT DTTHYCMFVK HLNLCSLDRY KQRCCQSCQE G
Length: 1,691
Mass (Da): 188,692
Last modified: October 5, 2010 - v4
Checksum: 48C7281C737EA56E
Isoform 2 (identifier: P82987-2)

The sequence of this isoform differs from the canonical sequence as follows:
     1657-1691: RDCTDTTHYCMFVKHLNLCSLDRYKQRCCQSCQEG → STYTSQTATNKGAASHVKRDKPLEGS

Length: 1,682
Mass (Da): 187,240
Checksum: EED57C7F66CA0CC6
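
The isoform 2 length follows from the substitution described above: 35 residues (1657-1691) are replaced by 26, so 1,691 - 35 + 26 = 1,682. The sketch below applies that replacement to the C-terminal tail of the canonical sequence (residues 1651-1691, copied from this entry). The helper name `apply_replacement` and the constants are hypothetical and serve only to illustrate the bookkeeping.

```python
def apply_replacement(fragment, fragment_start, first, last, original, replacement):
    """Replace residues first..last (1-based, inclusive) within a sequence fragment.

    fragment_start is the 1-based position of the fragment's first residue
    in the full-length sequence.
    """
    i = first - fragment_start
    j = last - fragment_start + 1
    if fragment[i:j] != original:
        raise ValueError("fragment does not match the annotated original segment")
    return fragment[:i] + replacement + fragment[j:]


CANONICAL_LENGTH = 1691
TAIL_START = 1651
# Residues 1651-1691 of isoform 1.
TAIL = "LGPSCDRDCTDTTHYCMFVKHLNLCSLDRYKQRCCQSCQEG"
ORIGINAL = "RDCTDTTHYCMFVKHLNLCSLDRYKQRCCQSCQEG"   # residues 1657-1691
REPLACEMENT = "STYTSQTATNKGAASHVKRDKPLEGS"         # isoform 2 C-terminus

isoform2_tail = apply_replacement(TAIL, TAIL_START, 1657, 1691, ORIGINAL, REPLACEMENT)
isoform2_length = CANONICAL_LENGTH - len(TAIL) + len(isoform2_tail)

if __name__ == "__main__":
    print(isoform2_tail)
    print(isoform2_length)
```

The computed length matches the 1,682 residues listed for isoform 2.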

Experimental Info

Feature key        Position(s)  Length  Description
Sequence conflict  265          1       A → V in AAK15041 (PubMed:14667842). Curated
Sequence conflict  986          1       S → F in AAI28390 (PubMed:15489334). Curated
Sequence conflict  1526         1       A → S in AAI28390 (PubMed:15489334). Curated

Natural variant

Feature key      Position(s)  Length  Feature identifier  Description
Natural variant  146          1       VAR_027478          H → R (1 publication). Corresponds to variant rs4483821.
Natural variant  290          1       VAR_027479          L → V (1 publication). Corresponds to variant rs4144691.
Natural variant  330          1       VAR_035809          V → M in a colorectal cancer sample; somatic mutation (1 publication).
Natural variant  587          1       VAR_035810          R → H in a colorectal cancer sample; somatic mutation (1 publication). Corresponds to variant rs142860011.
Natural variant  661          1       VAR_027480          V → L (1 publication). Corresponds to variant rs4842838.
Natural variant  713          1       VAR_057365          G → R. Corresponds to variant rs34047645.
Natural variant  855          1       VAR_035811          R → C in a colorectal cancer sample; somatic mutation (1 publication). Corresponds to variant rs146769560.
Natural variant  855          1       VAR_027481          R → H. Corresponds to variant rs2277848.
Natural variant  869          1       VAR_027482          L → F (1 publication). Corresponds to variant rs2277849.
Natural variant  1315         1       VAR_035812          A → E in a colorectal cancer sample; somatic mutation (1 publication).
Natural variant  1370         1       VAR_027483          T → A. Corresponds to variant rs17158450.
Natural variant  1558         1       VAR_027484          M → T. Corresponds to variant rs7175910.
Natural variant  1660         1       VAR_027485          T → I. Corresponds to variant rs950169.
Natural variant  1679         1       VAR_027486          R → H. Corresponds to variant rs11857906.

Alternative sequence

Feature key           Position(s)   Length  Feature identifier  Description
Alternative sequence  1657 – 1691   35      VSP_037810          RDCTD…SCQEG → STYTSQTATNKGAASHVKRDKPLEGS in isoform 2. Curated

Sequence databases

EMBL / GenBank / DDBJ:
  AC087738 Genomic DNA. No translation available.
  AC116157 Genomic DNA. No translation available.
  AC027807 Genomic DNA. No translation available.
  BC128389 mRNA. Translation: AAI28390.1.
  BC128390 mRNA. Translation: AAI28391.1.
  AF237652 mRNA. Translation: AAK15041.1.
  AB033059 mRNA. Translation: BAA86547.1.
CCDS: CCDS10326.1. [P82987-1]
  CCDS73773.1. [P82987-2]
RefSeq: NP_001288039.1. NM_001301110.1. [P82987-2]
  NP_997400.2. NM_207517.2. [P82987-1]
UniGene: Hs.459162.

Genome annotation databases

Ensembl: ENST00000286744; ENSP00000286744; ENSG00000156218. [P82987-1]
  ENST00000567476; ENSP00000456313; ENSG00000156218. [P82987-2]
GeneID: 57188.
KEGG: hsa:57188.
UCSC: uc002bjz.5. human. [P82987-1]

Keywords - Coding sequence diversity

Alternative splicing, Polymorphism

Cross-references

Organism-specific databases

CTD: 57188.
GeneCards: ADAMTSL3.
H-InvDB: HIX0017599.
HGNC: HGNC:14633. ADAMTSL3.
HPA: HPA034773.
  HPA034774.
MIM: 609199. gene.
neXtProt: NX_P82987.
PharmGKB: PA134934525.

Miscellaneous databases

GenomeRNAi: 57188.
PRO: P82987.

Entry information

Entry name: ATL3_HUMAN
Accession: Primary (citable) accession number: P82987
Secondary accession number(s): A1A566, A1A567, Q9ULI7
Entry history
Integrated into UniProtKB/Swiss-Prot: September 19, 2006
Last sequence update: October 5, 2010
Last modified: September 7, 2016
This is version 104 of the entry and version 4 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

Caution

Although strongly similar to members of the ADAMTS family, it lacks the metalloprotease and disintegrin-like domains that are typical of that family (Curated).

Keywords - Technical term

Complete proteome, Reference proteome

Documents

  1. Human chromosome 15
    Human chromosome 15: entries, gene names and cross-references to MIM
  2. Human entries with polymorphisms or disease mutations
    List of human entries with polymorphisms or disease mutations
  3. Human polymorphisms and disease mutations
    Index of human polymorphisms and disease mutations
  4. MIM cross-references
    Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot
  5. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.
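
The UniRef90 and UniRef50 membership rules stated above reduce to two thresholds: a minimum sequence identity to the seed sequence and a minimum 80% overlap with it. The sketch below encodes only those stated thresholds. It is not the clustering pipeline itself, and the names `IDENTITY_THRESHOLDS`, `MIN_OVERLAP` and `meets_cluster_criteria` are hypothetical. Identity and overlap are assumed to be precomputed fractions between 0 and 1.

```python
# Minimum sequence identity to the seed sequence, per cluster level.
IDENTITY_THRESHOLDS = {"UniRef90": 0.90, "UniRef50": 0.50}
# Minimum overlap with the seed (longest) sequence for both levels.
MIN_OVERLAP = 0.80


def meets_cluster_criteria(level, identity, overlap):
    """Return True if a sequence satisfies the stated thresholds for a level.

    identity and overlap are fractions (0.0-1.0) measured against the
    cluster's seed sequence; how they are computed is outside this sketch.
    """
    return identity >= IDENTITY_THRESHOLDS[level] and overlap >= MIN_OVERLAP


if __name__ == "__main__":
    print(meets_cluster_criteria("UniRef90", 0.93, 0.85))
    print(meets_cluster_criteria("UniRef90", 0.93, 0.70))
    print(meets_cluster_criteria("UniRef50", 0.55, 0.80))
```

Under these thresholds a sequence with 93% identity but only 70% overlap does not qualify for a UniRef90 cluster, because both conditions must hold.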