Protein

Cytoskeleton-associated protein 2-like

Gene

CKAP2L

Organism
Homo sapiens (Human)
Status
Reviewed - Annotation score: 5 out of 5 - Experimental evidence at protein level

Function

Microtubule-associated protein required for mitotic spindle formation and cell-cycle progression in neural progenitor cells. (1 publication)

Names & Taxonomy

Protein names
Recommended name: Cytoskeleton-associated protein 2-like
Alternative name(s): Radial fiber and mitotic spindle protein
Short name: Radmis
Gene names
Name: CKAP2L
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes
  • UP000005640 Component: Chromosome 2

Organism-specific databases

HGNC: HGNC:26877. CKAP2L.

Subcellular location

GO - Cellular component

  • centrosome Source: UniProtKB
  • cytoplasm Source: UniProtKB-KW
  • spindle pole Source: UniProtKB-SubCell

Keywords - Cellular component

Cytoplasm, Cytoskeleton

Pathology & Biotech

Involvement in disease

Filippi syndrome (FLPIS) (1 publication)
The disease is caused by mutations affecting the gene represented in this entry.
Disease description: A rare disorder characterized by microcephaly, pre- and postnatal growth failure, syndactyly, and distinctive facial features, including a broad nasal bridge and underdeveloped alae nasi. Some affected individuals have intellectual disability, seizures, undescended testicles in males, and teeth and hair abnormalities.
See also OMIM:272440

Mutagenesis

Feature key    Position(s)    Length    Description
Mutagenesis    198            1         K → R: Abrogates sumoylation. (1 publication)

Keywords - Disease

Mental retardation

Organism-specific databases

MalaCards: CKAP2L.
MIM: 272440. phenotype.
PharmGKB: PA144596448.

Polymorphism and mutation databases

BioMuta: CKAP2L.
DMDM: 311033474.

PTM / Processing

Molecule processing

Feature key    Position(s)    Length    Description                               Feature identifier
Chain          1-745          745       Cytoskeleton-associated protein 2-like    PRO_0000324335

Amino acid modifications

Feature key         Position(s)    Length    Description
Cross-link          198                      Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO1)
Modified residue    204            1         Phosphotyrosine (1 publication)
Modified residue    742            1         Phosphothreonine (combined sources)
Modified residue    745            1         Phosphoserine (combined sources)
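
The positions listed for these modifications are 1-based coordinates on the canonical isoform (Q8IYA6-1). As a minimal sketch of how such annotations could be sanity-checked, the Python snippet below looks up the residue at each annotated position in two fragments copied from the canonical sequence given in this entry (residues 191-210 and 736-745). The names `FRAGMENTS`, `SITES`, `residue_at` and `check_sites` are illustrative assumptions and do not belong to any UniProt tool.

```python
# Fragments of the canonical sequence (Q8IYA6-1), keyed by the 1-based
# position of their first residue. Copied from the sequence in this entry.
FRAGMENTS = {
    191: "ILTEPERKPDPKLYTRSKPK",  # residues 191-210
    736: "VTLGFQTPES",            # residues 736-745
}

# (position, expected residue, annotation) as listed in the entry.
SITES = [
    (198, "K", "SUMO1 cross-link"),
    (204, "Y", "phosphotyrosine"),
    (742, "T", "phosphothreonine"),
    (745, "S", "phosphoserine"),
]


def residue_at(position: int) -> str:
    """Return the residue at a 1-based position covered by FRAGMENTS."""
    for start, fragment in FRAGMENTS.items():
        offset = position - start  # convert to a 0-based index in the fragment
        if 0 <= offset < len(fragment):
            return fragment[offset]
    raise KeyError(f"position {position} is not covered by the fragments")


def check_sites() -> dict:
    """Map each annotated position to True if the residue matches."""
    return {pos: residue_at(pos) == expected for pos, expected, _ in SITES}


if __name__ == "__main__":
    for pos, expected, label in SITES:
        print(pos, residue_at(pos), expected, label)
```

A lysine at 198 is what a SUMO isopeptide cross-link requires, and the three phosphosites fall on Tyr, Thr and Ser as annotated, so all four checks should agree with the sequence.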

Post-translational modification

Ubiquitinated by the anaphase promoting complex/cyclosome (APC/C). (By similarity)

Keywords - PTM

Isopeptide bond, Phosphoprotein, Ubl conjugation

Proteomic databases

EPD: Q8IYA6.
MaxQB: Q8IYA6.
PaxDb: Q8IYA6.
PeptideAtlas: Q8IYA6.
PRIDE: Q8IYA6.

PTM databases

iPTMnet: Q8IYA6.
PhosphoSite: Q8IYA6.

Expression

Induction

Expression is cell-cycle dependent: undetectable in interphase and prophase, with strong expression at the spindle pole from metaphase through telophase. (1 publication)

Gene expression databases

Bgee: Q8IYA6.
CleanEx: HS_CKAP2L.
ExpressionAtlas: Q8IYA6. baseline and differential.
Genevisible: Q8IYA6. HS.

Organism-specific databases

HPA: HPA039407.
     HPA040057.

Interaction

Protein-protein interaction databases

BioGrid: 127298. 4 interactions.
IntAct: Q8IYA6. 3 interactions.
MINT: MINT-4992332.
STRING: 9606.ENSP00000305204.

Structure

3D structure databases

ProteinModelPortal: Q8IYA6.

Family & Domains

Motif

Feature key    Position(s)    Length    Description
Motif          185-187        3         KEN box (by similarity)

Domain

The KEN box is required for association with the APC/C-Cdh1 complex, ubiquitination and degradation. (By similarity)
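
The KEN box is a three-residue motif (Lys-Glu-Asn) annotated at positions 185-187. As a rough sketch of how such a motif could be located and reported in 1-based coordinates, the Python snippet below scans residues 171-200 of the canonical sequence (copied from this entry). The helper `find_motif` and the constants are hypothetical names introduced for illustration. A sequence match only confirms that the residues are present; it says nothing about whether the motif is functional.

```python
import re

# Residues 171-200 of the canonical sequence (Q8IYA6-1), copied from this entry.
FRAGMENT_START = 171
FRAGMENT = "VENESLDNFLKETNKENLLDILTEPERKPD"


def find_motif(motif: str, fragment: str, start: int) -> list:
    """Return 1-based (first, last) positions of every motif occurrence.

    A zero-width lookahead is used so that overlapping occurrences
    would also be reported.
    """
    pattern = f"(?={re.escape(motif)})"
    return [
        (m.start() + start, m.start() + start + len(motif) - 1)
        for m in re.finditer(pattern, fragment)
    ]


if __name__ == "__main__":
    print(find_motif("KEN", FRAGMENT, FRAGMENT_START))  # [(185, 187)]
```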

Sequence similarities

Belongs to the CKAP2 family. (Curated)

Phylogenomic databases

eggNOG: ENOG410IGXX. Eukaryota.
        ENOG4111T8P. LUCA.
GeneTree: ENSGT00530000063691.
HOGENOM: HOG000111753.
HOVERGEN: HBG107704.
InParanoid: Q8IYA6.
KO: K16769.
OMA: QDMKLIT.
OrthoDB: EOG7X9G6M.
PhylomeDB: Q8IYA6.
TreeFam: TF333003.

Family and domain databases

InterPro: IPR029197. CKAP2_C.
          IPR026165. CKAP2_fam.
PANTHER: PTHR16076. PTHR16076. 1 hit.
Pfam: PF15297. CKAP2_C. 2 hits.

Sequences (3)

Sequence status: Complete.

This entry describes 3 isoforms produced by alternative splicing.

Isoform 1 (identifier: Q8IYA6-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MVGPGPTAAA AVEERQRKLQ EYLAAKGKLK SQNTKPYLKS KNNCQNQPPS
        60         70         80         90        100
KSTIRPKNDV TNHVVLPVKP KRSISIKLQP RPPNTAGSQK PKLEPPKLLG
       110        120        130        140        150
KRLTSECVSS NPYSKPSSKS FQQCEAGSST TGELSRKPVG SLNIEQLKTT
       160        170        180        190        200
KQQLTDQGNG KCIDFMNNIH VENESLDNFL KETNKENLLD ILTEPERKPD
       210        220        230        240        250
PKLYTRSKPK TDSYNQTKNS LVPKQALGKS SVNSAVLKDR VNKQFVGETQ
       260        270        280        290        300
SRTFPVKSQQ LSRGADLARP GVKPSRTVPS HFIRTLSKVQ SSKKPVVKNI
       310        320        330        340        350
KDIKVNRSQY ERPNETKIRS YPVTEQRVKH TKPRTYPSLL QGEYNNRHPN
       360        370        380        390        400
IKQDQKSSQV CIPQTSCVLQ KSKAISQRPN LTVGRFNSAI PSTPSIRPNG
       410        420        430        440        450
TSGNKHNNNG FQQKAQTLDS KLKKAVPQNH FLNKTAPKTQ ADVTTVNGTQ
       460        470        480        490        500
TNPNIKKKAT AEDRRKQLEE WQKSKGKTYK RPPMELKTKR KVIKEMNISF
       510        520        530        540        550
WKSIEKEEEE KKAQLELSSK INNTLTECLN LIEGGVPSNE ILNILSSIPE
       560        570        580        590        600
AEKFAKFWIC KAKLLASKGT FDVIGLYEEA IKNGATPIQE LRKVVLNILQ
       610        620        630        640        650
DSNRTTEGIT SDSLVAETSI TSVEELAKKM ESVKSCLSPK EREQVTATPR
       660        670        680        690        700
IAKAEQHNYP GIKLQIGPIP RINGMPEVQD MKFITPVRRS SRIERAVSRY
       710        720        730        740
PEMLQEHDLV VASLDELLEV EETKCFIFRR NEALPVTLGF QTPES
Length: 745
Mass (Da): 83,587
Last modified: November 2, 2010 - v4
Checksum: 46FE794CD9EBADD9

Isoform 2 (identifier: Q8IYA6-2)

The sequence of this isoform differs from the canonical sequence as follows:
     1-411: Missing.
     412-412: Q → M

Note: No experimental confirmation available.
Length: 334
Mass (Da): 37,806
Checksum: 303A71394E91CA62

Isoform 3 (identifier: Q8IYA6-3)

The sequence of this isoform differs from the canonical sequence as follows:
     1-165: Missing.

Note: No experimental confirmation available.
Length: 580
Mass (Da): 65,611
Checksum: 904AA6C1F0C1D27F
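
Isoforms 2 and 3 are described as edits against the canonical sequence (a missing range, plus one substitution for isoform 2). The Python sketch below applies those edits to the canonical sequence from this entry and recovers the stated lengths of 334 and 580 residues. The function `apply_edits` and its `(first, last, replacement)` tuple format are assumptions made for illustration; they are not a UniProt API.

```python
# Canonical sequence (Q8IYA6-1) copied from this entry; whitespace is stripped.
CANONICAL = "".join("""
MVGPGPTAAA AVEERQRKLQ EYLAAKGKLK SQNTKPYLKS KNNCQNQPPS
KSTIRPKNDV TNHVVLPVKP KRSISIKLQP RPPNTAGSQK PKLEPPKLLG
KRLTSECVSS NPYSKPSSKS FQQCEAGSST TGELSRKPVG SLNIEQLKTT
KQQLTDQGNG KCIDFMNNIH VENESLDNFL KETNKENLLD ILTEPERKPD
PKLYTRSKPK TDSYNQTKNS LVPKQALGKS SVNSAVLKDR VNKQFVGETQ
SRTFPVKSQQ LSRGADLARP GVKPSRTVPS HFIRTLSKVQ SSKKPVVKNI
KDIKVNRSQY ERPNETKIRS YPVTEQRVKH TKPRTYPSLL QGEYNNRHPN
IKQDQKSSQV CIPQTSCVLQ KSKAISQRPN LTVGRFNSAI PSTPSIRPNG
TSGNKHNNNG FQQKAQTLDS KLKKAVPQNH FLNKTAPKTQ ADVTTVNGTQ
TNPNIKKKAT AEDRRKQLEE WQKSKGKTYK RPPMELKTKR KVIKEMNISF
WKSIEKEEEE KKAQLELSSK INNTLTECLN LIEGGVPSNE ILNILSSIPE
AEKFAKFWIC KAKLLASKGT FDVIGLYEEA IKNGATPIQE LRKVVLNILQ
DSNRTTEGIT SDSLVAETSI TSVEELAKKM ESVKSCLSPK EREQVTATPR
IAKAEQHNYP GIKLQIGPIP RINGMPEVQD MKFITPVRRS SRIERAVSRY
PEMLQEHDLV VASLDELLEV EETKCFIFRR NEALPVTLGF QTPES
""".split())


def apply_edits(seq: str, edits: list) -> str:
    """Apply non-overlapping (first, last, replacement) edits.

    Coordinates are 1-based and inclusive, as in the entry. An empty
    replacement means the range is missing from the isoform.
    """
    out = []
    cursor = 0  # 0-based index of the next unconsumed canonical residue
    for first, last, replacement in sorted(edits):
        out.append(seq[cursor:first - 1])  # unchanged residues before the edit
        out.append(replacement)
        cursor = last  # skip past the edited range
    out.append(seq[cursor:])
    return "".join(out)


ISOFORM_2 = apply_edits(CANONICAL, [(1, 411, ""), (412, 412, "M")])
ISOFORM_3 = apply_edits(CANONICAL, [(1, 165, "")])

if __name__ == "__main__":
    print(len(CANONICAL), len(ISOFORM_2), len(ISOFORM_3))
```

Both derived isoforms begin with methionine: isoform 2 because Q412 is replaced by M, and isoform 3 because residue 166 of the canonical sequence is already M.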

Sequence caution

The sequence AAX93053.1 differs from that shown. Reason: Erroneous gene model prediction. (Curated)

Experimental Info

Feature key          Position(s)    Length    Description
Sequence conflict    375            1         I → A in BAH14612 (PubMed:14702039). (Curated)
Sequence conflict    531            1         L → P in BAC05202 (PubMed:14702039). (Curated)
Sequence conflict    615            1         V → A in BAF85219 (PubMed:14702039). (Curated)

Natural variant

Feature key        Position(s)    Length    Description                Variant (dbSNP)    Feature identifier
Natural variant    19             1         L → F.                     rs36093393         VAR_039735
Natural variant    26             1         K → R.                     rs35593767         VAR_039736
Natural variant    62             1         N → S. (1 publication)     rs17042344         VAR_039737
Natural variant    104            1         T → I.                     rs13007595         VAR_039738
Natural variant    263            1         R → S. (1 publication)     rs17042341         VAR_039739
Natural variant    375            1         I → V. (2 publications)    rs6731822          VAR_039740
Natural variant    379            1         P → A.                     rs2676126          VAR_039741
Natural variant    519            1         S → G.                     rs36046436         VAR_039742
Natural variant    614            1         L → S.                     rs3811040          VAR_039743
Natural variant    706            1         E → D.                     rs3811039          VAR_039744

Alternative sequence

Feature key             Position(s)    Length    Description                              Feature identifier
Alternative sequence    1-411          411       Missing in isoform 2. (1 publication)    VSP_032219
Alternative sequence    1-165          165       Missing in isoform 3. (1 publication)    VSP_053717
Alternative sequence    412            1         Q → M in isoform 2. (1 publication)      VSP_032220

Sequence databases

EMBL / GenBank / DDBJ:
AK097948 mRNA. Translation: BAC05202.1.
AK292530 mRNA. Translation: BAF85219.1.
AK302875 mRNA. Translation: BAG64055.1.
AK316241 mRNA. Translation: BAH14612.1.
AC079922 Genomic DNA. Translation: AAY14923.1.
AC112235 Genomic DNA. Translation: AAX93053.1. Sequence problems.
BC036217 mRNA. Translation: AAH36217.1.
CCDS: CCDS2100.1. [Q8IYA6-1]
RefSeq: NP_001291290.1. NM_001304361.1. [Q8IYA6-3]
        NP_689728.3. NM_152515.4. [Q8IYA6-1]
        XP_011508968.1. XM_011510666.1. [Q8IYA6-3]
UniGene: Hs.434250.

Genome annotation databases

Ensembl: ENST00000302450; ENSP00000305204; ENSG00000169607. [Q8IYA6-1]
GeneID: 150468.
KEGG: hsa:150468.
UCSC: uc002tie.3. human. [Q8IYA6-1]

Keywords - Coding sequence diversity

Alternative splicing, Polymorphism

Cross-references

Organism-specific databases

CTD: 150468.
GeneCards: CKAP2L.
H-InvDB: HIX0200299.
HGNC: HGNC:26877. CKAP2L.
HPA: HPA039407.
     HPA040057.
MalaCards: CKAP2L.
MIM: 272440. phenotype.
     616174. gene.
neXtProt: NX_Q8IYA6.
PharmGKB: PA144596448.

Miscellaneous databases

ChiTaRS: CKAP2L. human.
GenomeRNAi: 150468.
PRO: Q8IYA6.

Publications

  1. "Complete sequencing and characterization of 21,243 full-length human cDNAs."
    Ota T., Suzuki Y., Nishikawa T., Otsuki T., Sugiyama T., Irie R., Wakamatsu A., Hayashi K., Sato H., Nagai K., Kimura K., Makita H., Sekine M., Obayashi M., Nishi T., Shibahara T., Tanaka T., Ishii S., Yamamoto J., Saito K., Kawai Y., Isono Y., Nakamura Y., Nagahari K., Murakami K., Yasuda T., Iwayanagi T., Wagatsuma M., Shiratori A., Sudo H., Hosoiri T., Kaku Y., Kodaira H., Kondo H., Sugawara M., Takahashi M., Kanda K., Yokoi T., Furuya T., Kikkawa E., Omura Y., Abe K., Kamihara K., Katsuta N., Sato K., Tanikawa M., Yamazaki M., Ninomiya K., Ishibashi T., Yamashita H., Murakawa K., Fujimori K., Tanai H., Kimata M., Watanabe M., Hiraoka S., Chiba Y., Ishida S., Ono Y., Takiguchi S., Watanabe S., Yosida M., Hotuta T., Kusano J., Kanehori K., Takahashi-Fujii A., Hara H., Tanase T.-O., Nomura Y., Togiya S., Komai F., Hara R., Takeuchi K., Arita M., Imose N., Musashino K., Yuuki H., Oshima A., Sasaki N., Aotsuka S., Yoshikawa Y., Matsunawa H., Ichihara T., Shiohata N., Sano S., Moriya S., Momiyama H., Satoh N., Takami S., Terashima Y., Suzuki O., Nakagawa S., Senoh A., Mizoguchi H., Goto Y., Shimizu F., Wakebe H., Hishigaki H., Watanabe T., Sugiyama A., Takemoto M., Kawakami B., Yamazaki M., Watanabe K., Kumagai A., Itakura S., Fukuzumi Y., Fujimori Y., Komiyama M., Tashiro H., Tanigami A., Fujiwara T., Ono T., Yamada K., Fujii Y., Ozaki K., Hirao M., Ohmori Y., Kawabata A., Hikiji T., Kobatake N., Inagaki H., Ikema Y., Okamoto S., Okitani R., Kawakami T., Noguchi S., Itoh T., Shigeta K., Senba T., Matsumura K., Nakajima Y., Mizuno T., Morinaga M., Sasaki M., Togashi T., Oyama M., Hata H., Watanabe M., Komatsu T., Mizushima-Sugano J., Satoh T., Shirai Y., Takahashi Y., Nakagawa K., Okumura K., Nagase T., Nomura N., Kikuchi H., Masuho Y., Yamashita R., Nakai K., Yada T., Nakamura Y., Ohara O., Isogai T., Sugano S.
    Nat. Genet. 36:40-45(2004)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORMS 1; 2 AND 3), VARIANT VAL-375.
    Tissue: Testis and Thymus.
  2. "Generation and annotation of the DNA sequences of human chromosomes 2 and 4."
    Hillier L.W., Graves T.A., Fulton R.S., Fulton L.A., Pepin K.H., Minx P., Wagner-McPherson C., Layman D., Wylie K., Sekhon M., Becker M.C., Fewell G.A., Delehaunty K.D., Miner T.L., Nash W.E., Kremitzki C., Oddy L., Du H., Sun H., Bradshaw-Cordum H., Ali J., Carter J., Cordes M., Harris A., Isak A., van Brunt A., Nguyen C., Du F., Courtney L., Kalicki J., Ozersky P., Abbott S., Armstrong J., Belter E.A., Caruso L., Cedroni M., Cotton M., Davidson T., Desai A., Elliott G., Erb T., Fronick C., Gaige T., Haakenson W., Haglund K., Holmes A., Harkins R., Kim K., Kruchowski S.S., Strong C.M., Grewal N., Goyea E., Hou S., Levy A., Martinka S., Mead K., McLellan M.D., Meyer R., Randall-Maher J., Tomlinson C., Dauphin-Kohlberg S., Kozlowicz-Reilly A., Shah N., Swearengen-Shahid S., Snider J., Strong J.T., Thompson J., Yoakum M., Leonard S., Pearman C., Trani L., Radionenko M., Waligorski J.E., Wang C., Rock S.M., Tin-Wollam A.-M., Maupin R., Latreille P., Wendl M.C., Yang S.-P., Pohl C., Wallis J.W., Spieth J., Bieri T.A., Berkowicz N., Nelson J.O., Osborne J., Ding L., Meyer R., Sabo A., Shotland Y., Sinha P., Wohldmann P.E., Cook L.L., Hickenbotham M.T., Eldred J., Williams D., Jones T.A., She X., Ciccarelli F.D., Izaurralde E., Taylor J., Schmutz J., Myers R.M., Cox D.R., Huang X., McPherson J.D., Mardis E.R., Clifton S.W., Warren W.C., Chinwalla A.T., Eddy S.R., Marra M.A., Ovcharenko I., Furey T.S., Miller W., Eichler E.E., Bork P., Suyama M., Torrents D., Waterston R.H., Wilson R.K.
    Nature 434:724-731(2005)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA], VARIANTS SER-62 AND SER-263.
  3. "The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC)."
    The MGC Project Team
    Genome Res. 14:2121-2127(2004)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORM 1), VARIANT VAL-375.
    Tissue: Testis.
  4. Cited for: PHOSPHORYLATION [LARGE SCALE ANALYSIS] AT THR-742 AND SER-745, IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
    Tissue: Cervix carcinoma.
  5. "In vivo identification of sumoylation sites by a signature tag and cysteine-targeted affinity purification."
    Blomster H.A., Imanishi S.Y., Siimes J., Kastu J., Morrice N.A., Eriksson J.E., Sistonen L.
    J. Biol. Chem. 285:19324-19329(2010)
    Cited for: SUMOYLATION AT LYS-198, PHOSPHORYLATION AT TYR-204, MUTAGENESIS OF LYS-198.
    Tissue: Cervix carcinoma.
  6. "Quantitative phosphoproteomics reveals widespread full phosphorylation site occupancy during mitosis."
    Olsen J.V., Vermeulen M., Santamaria A., Kumar C., Miller M.L., Jensen L.J., Gnad F., Cox J., Jensen T.S., Nigg E.A., Brunak S., Mann M.
    Sci. Signal. 3:RA3-RA3(2010)
    Cited for: PHOSPHORYLATION [LARGE SCALE ANALYSIS] AT SER-745, IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
    Tissue: Cervix carcinoma.
  7. "Toward a comprehensive characterization of a human cancer cell phosphoproteome."
    Zhou H., Di Palma S., Preisinger C., Peng M., Polat A.N., Heck A.J., Mohammed S.
    J. Proteome Res. 12:260-271(2013)
    Cited for: PHOSPHORYLATION [LARGE SCALE ANALYSIS] AT SER-745, IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
    Tissue: Erythroleukemia.
  8. Cited for: INVOLVEMENT IN FLPIS, FUNCTION, SUBCELLULAR LOCATION, INDUCTION.

Entry information

Entry name: CKP2L_HUMAN
Accession: Primary (citable) accession number: Q8IYA6
Secondary accession number(s): A8K915, B4DZE3, B7ZAC6, F5H0M5, Q53QF8, Q53RS8, Q8N1J8
Entry history
Integrated into UniProtKB/Swiss-Prot: March 18, 2008
Last sequence update: November 2, 2010
Last modified: July 6, 2016
This is version 96 of the entry and version 4 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

Keywords - Technical term

Complete proteome, Reference proteome

Documents

  1. Human chromosome 2
    Human chromosome 2: entries, gene names and cross-references to MIM
  2. Human entries with polymorphisms or disease mutations
    List of human entries with polymorphisms or disease mutations
  3. Human polymorphisms and disease mutations
    Index of human polymorphisms and disease mutations
  4. MIM cross-references
    Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot
  5. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.
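
The UniRef90 and UniRef50 criteria above reduce to two thresholds: a minimum sequence identity and a minimum overlap with the seed (longest) sequence. The Python sketch below encodes only those stated thresholds as a membership test. It is an illustration under assumptions, not the UniRef clustering procedure itself, which is not described in this entry; `ClusterRule`, `RULES` and `meets_rule` are hypothetical names, and identity and overlap are assumed to be supplied as fractions between 0 and 1 by some upstream alignment step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterRule:
    min_identity: float  # minimum fraction of identical residues to the seed
    min_overlap: float   # minimum fraction of overlap with the seed sequence


# Thresholds as stated in the text for UniRef90 and UniRef50.
RULES = {
    "UniRef90": ClusterRule(min_identity=0.90, min_overlap=0.80),
    "UniRef50": ClusterRule(min_identity=0.50, min_overlap=0.80),
}


def meets_rule(level: str, identity: float, overlap: float) -> bool:
    """Return True if a sequence satisfies the stated thresholds for a level."""
    rule = RULES[level]
    return identity >= rule.min_identity and overlap >= rule.min_overlap


if __name__ == "__main__":
    # A sequence 95% identical to the seed but covering only half of it
    # fails the overlap criterion at both levels.
    print(meets_rule("UniRef90", 0.95, 0.85))
    print(meets_rule("UniRef90", 0.95, 0.50))
    print(meets_rule("UniRef50", 0.60, 0.80))
```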