Protein

Keratin, type II cytoskeletal 71

Gene

KRT71

Organism
Homo sapiens (Human)
Status
Reviewed. Annotation score: 5 out of 5. Experimental evidence at protein level.

Function

Plays a central role in hair formation. Essential component of keratin intermediate filaments in the inner root sheath (IRS) of the hair follicle. (1 publication)

Sites

Feature key  Position(s)  Length  Description
Site         381          1       Stutter

GO - Molecular function

GO - Biological process

  • hair follicle morphogenesis Source: UniProtKB
  • intermediate filament organization Source: UniProtKB

Names & Taxonomy

Protein names
Recommended name: Keratin, type II cytoskeletal 71
Alternative name(s):
  • Cytokeratin-71 (short name: CK-71)
  • Keratin-71 (short name: K71)
  • Type II inner root sheath-specific keratin-K6irs1 (short names: Keratin 6 irs, hK6irs, hK6irs1)
  • Type-II keratin Kb34
Gene names
Name: KRT71
Synonyms: K6IRS1, KB34, KRT6IRS1
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes
  • UP000005640 Component: Chromosome 12

Organism-specific databases

HGNC: HGNC:28927. KRT71.

Subcellular location

GO - Cellular component

  • cytoplasm Source: UniProtKB-KW
  • extracellular exosome Source: UniProtKB
  • keratin filament Source: UniProtKB

Keywords - Cellular component

Cytoplasm, Cytoskeleton, Intermediate filament, Keratin

Pathology & Biotech

Involvement in disease

Hypotrichosis 13 (HYPT13) (1 publication)
The disease is caused by mutations affecting the gene represented in this entry.
Disease description: A form of hypotrichosis, a condition characterized by the presence of less than the normal amount of hair and abnormal hair follicles and shafts, which are thin and atrophic. The extent of scalp and body hair involvement can be very variable, within as well as between families. HYPT13 is characterized by sparse woolly hair.
See also OMIM:615896

Feature key      Position(s)  Length  Feature identifier  Description
Natural variant  141          1       VAR_071406          F → C in HYPT13; dominant negative; decreased keratin intermediate filament formation. (1 publication)

Keywords - Disease

Disease mutation, Hypotrichosis

Organism-specific databases

MalaCards: KRT71.
MIM: 615896. phenotype.
Orphanet: 170. Woolly hair.
PharmGKB: PA147357697.

Polymorphism and mutation databases

BioMuta: KRT71.
DMDM: 296439318.

PTM / Processing

Molecule processing

Feature key  Position(s)  Length  Feature identifier  Description
Chain        1 – 523      523     PRO_0000314874      Keratin, type II cytoskeletal 71

Proteomic databases

EPD: Q3SY84.
MaxQB: Q3SY84.
PaxDb: Q3SY84.
PRIDE: Q3SY84.

PTM databases

iPTMnet: Q3SY84.
PhosphoSite: Q3SY84.

Expression

Tissue specificity

Highly expressed in hair follicles from scalp. Specifically expressed in the inner root sheath (IRS) of the hair follicle. Present in all 3 IRS layers: the cuticle, the Henle and the Huxley layers. Also detected in the pseudopods of specialized Huxley cells, termed Fluegelzellen, along the area of differentiated Henle cells (at protein level). (4 publications)

Developmental stage

In all 3 IRS layers, expression begins simultaneously in adjacent cells of the lowermost bulb above the germinative cell pool and terminates higher up in the follicle with the asynchronous terminal differentiation of each cell layer (at protein level).

Gene expression databases

Bgee: Q3SY84.
CleanEx: HS_KRT71.
Genevisible: Q3SY84. HS.

Organism-specific databases

HPA: HPA049404.

Interaction

Subunit structure

Heterodimer of a type I and a type II keratin. Associates with KRT16 and/or KRT17 (By similarity).

Binary interactions

With   Entry   #Exp.  IntAct
KRT13  A1A4E9  3      EBI-2952676,EBI-10171552
KRT15  P19012  3      EBI-2952676,EBI-739566
KRT31  Q15323  3      EBI-2952676,EBI-948001
KRT38  O76015  3      EBI-2952676,EBI-1047263
KRT40  Q6A162  3      EBI-2952676,EBI-10171697

Protein-protein interaction databases

BioGrid: 125205. 10 interactions.
IntAct: Q3SY84. 9 interactions.
STRING: 9606.ENSP00000267119.

Structure

3D structure databases

ProteinModelPortal: Q3SY84.
SMR: Q3SY84. Positions 127-280, 296-437.

Family & Domains

Region

Feature key  Position(s)  Length  Description
Region       1 – 129      129     Head
Region       130 – 439    310     Rod
Region       130 – 165    36      Coil 1A
Region       166 – 184    19      Linker 1
Region       185 – 276    92      Coil 1B
Region       277 – 300    24      Linker 12
Region       301 – 439    139     Coil 2
Region       440 – 523    84      Tail
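
The region coordinates above are 1-based and inclusive, so each length is end - start + 1, and the five rod sub-regions should tile the rod without gaps or overlaps. The following is a minimal sketch of that consistency check; the names REGIONS, region_length and tiles are illustrative helpers introduced here, not part of any UniProt tool.

```python
# Region boundaries (1-based, inclusive) copied from the Region table.
REGIONS = {
    "Head": (1, 129),
    "Rod": (130, 439),
    "Coil 1A": (130, 165),
    "Linker 1": (166, 184),
    "Coil 1B": (185, 276),
    "Linker 12": (277, 300),
    "Coil 2": (301, 439),
    "Tail": (440, 523),
}

# Sub-regions that together make up the rod domain.
ROD_PARTS = ["Coil 1A", "Linker 1", "Coil 1B", "Linker 12", "Coil 2"]


def region_length(start, end):
    """Length of an inclusive 1-based interval."""
    return end - start + 1


def tiles(parts, whole):
    """True if the intervals in parts are contiguous, non-overlapping and span whole."""
    parts = sorted(parts)
    if parts[0][0] != whole[0] or parts[-1][1] != whole[1]:
        return False
    return all(prev[1] + 1 == nxt[0] for prev, nxt in zip(parts, parts[1:]))


lengths = {name: region_length(*span) for name, span in REGIONS.items()}
rod_ok = tiles([REGIONS[name] for name in ROD_PARTS], REGIONS["Rod"])
protein_ok = tiles([REGIONS[name] for name in ("Head", "Rod", "Tail")], (1, 523))

for name, span in REGIONS.items():
    print(f"{name:10s} {span[0]:>3d} - {span[1]:>3d}  length {lengths[name]}")
print("rod sub-regions tile the rod:", rod_ok)
print("head + rod + tail tile the chain:", protein_ok)
```

Under these coordinates the head, rod and tail together account for all 523 residues of the chain.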

Compositional bias

Feature key         Position(s)  Length  Description
Compositional bias  10 – 101     92      Gly-rich

Sequence similarities

Belongs to the intermediate filament family. (Curated)

Keywords - Domain

Coiled coil

Phylogenomic databases

eggNOG: ENOG410IQKP. Eukaryota.
ENOG4111AD7. LUCA.
GeneTree: ENSGT00760000118796.
HOGENOM: HOG000230976.
HOVERGEN: HBG013015.
InParanoid: Q3SY84.
KO: K07605.
OMA: FFKCLFE.
OrthoDB: EOG7P2XRX.
PhylomeDB: Q3SY84.
TreeFam: TF317854.

Family and domain databases

InterPro: IPR001664. IF.
IPR018039. Intermediate_filament_CS.
IPR032444. Keratin_2_head.
IPR003054. Keratin_II.
PANTHER: PTHR23239. PTHR23239. 1 hit.
Pfam: PF00038. Filament. 1 hit.
PF16208. Keratin_2_head. 1 hit.
PRINTS: PR01276. TYPE2KERATIN.
SMART: SM01391. Filament. 1 hit.
PROSITE: PS00226. IF. 1 hit.

Sequence

Sequence status: Complete.

Q3SY84-1 [UniParc]

        10         20         30         40         50
MSRQFTCKSG AAAKGGFSGC SAVLSGGSSS SFRAGSKGLS GGFGSRSLYS
        60         70         80         90        100
LGGVRSLNVA SGSGKSGGYG FGRGRASGFA GSMFGSVALG PVCPTVCPPG
       110        120        130        140        150
GIHQVTVNES LLAPLNVELD PEIQKVRAQE REQIKALNNK FASFIDKVRF
       160        170        180        190        200
LEQQNQVLET KWELLQQLDL NNCKNNLEPI LEGYISNLRK QLETLSGDRV
       210        220        230        240        250
RLDSELRNVR DVVEDYKKRY EEEINKRTAA ENEFVLLKKD VDAAYANKVE
       260        270        280        290        300
LQAKVESMDQ EIKFFRCLFE AEITQIQSHI SDMSVILSMD NNRNLDLDSI
       310        320        330        340        350
IDEVRTQYEE IALKSKAEAE ALYQTKFQEL QLAAGRHGDD LKNTKNEISE
       360        370        380        390        400
LTRLIQRIRS EIENVKKQAS NLETAIADAE QRGDNALKDA RAKLDELEGA
       410        420        430        440        450
LHQAKEELAR MLREYQELMS LKLALDMEIA TYRKLLESEE CRMSGEFPSP
       460        470        480        490        500
VSISIISSTS GGSVYGFRPS MVSGGYVANS SNCISGVCSV RGGEGRSRGS
       510        520
ANDYKDTLGK GSSLSAPSKK TSR
Length: 523
Mass (Da): 57,292
Last modified: May 18, 2010 - v3
Checksum: 797F5655EE3A62D7

Experimental Info

Feature key        Position(s)  Length  Description
Sequence conflict  210          1       R → Q in AAI03919 (PubMed:15489334). (Curated)

Natural variant

Feature key      Position(s)  Length  Feature identifier  Description
Natural variant  107          1       VAR_038082          V → I. Corresponds to variant rs665522.
Natural variant  122          1       VAR_038083          E → K. Corresponds to variant rs665470.
Natural variant  141          1       VAR_071406          F → C in HYPT13; dominant negative; decreased keratin intermediate filament formation. (1 publication)
Natural variant  355          1       VAR_038084          I → F. Corresponds to variant rs35988863.
Natural variant  464          1       VAR_038085          V → G. Corresponds to variant rs10783518. (2 publications)
Natural variant  523          1       VAR_038086          R → Q. Corresponds to variant rs2292506.
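
Each variant and sequence-conflict row gives a 1-based position and a reference residue, which can be cross-checked against the canonical sequence listed in this entry. The sketch below does that; SEQUENCE is copied from the Sequence section, while the helper names residue_at, mismatches and apply_variant are hypothetical and introduced only for illustration.

```python
# Canonical sequence Q3SY84-1, copied from the Sequence section of this entry.
# Whitespace between the 10-residue blocks is stripped by split()/join().
SEQUENCE = "".join("""
MSRQFTCKSG AAAKGGFSGC SAVLSGGSSS SFRAGSKGLS GGFGSRSLYS
LGGVRSLNVA SGSGKSGGYG FGRGRASGFA GSMFGSVALG PVCPTVCPPG
GIHQVTVNES LLAPLNVELD PEIQKVRAQE REQIKALNNK FASFIDKVRF
LEQQNQVLET KWELLQQLDL NNCKNNLEPI LEGYISNLRK QLETLSGDRV
RLDSELRNVR DVVEDYKKRY EEEINKRTAA ENEFVLLKKD VDAAYANKVE
LQAKVESMDQ EIKFFRCLFE AEITQIQSHI SDMSVILSMD NNRNLDLDSI
IDEVRTQYEE IALKSKAEAE ALYQTKFQEL QLAAGRHGDD LKNTKNEISE
LTRLIQRIRS EIENVKKQAS NLETAIADAE QRGDNALKDA RAKLDELEGA
LHQAKEELAR MLREYQELMS LKLALDMEIA TYRKLLESEE CRMSGEFPSP
VSISIISSTS GGSVYGFRPS MVSGGYVANS SNCISGVCSV RGGEGRSRGS
ANDYKDTLGK GSSLSAPSKK TSR
""".split())

# (position, reference residue, alternative residue) from the tables in this entry.
NATURAL_VARIANTS = [
    (107, "V", "I"),
    (122, "E", "K"),
    (141, "F", "C"),  # HYPT13 variant
    (355, "I", "F"),
    (464, "V", "G"),
    (523, "R", "Q"),
]
SEQUENCE_CONFLICTS = [(210, "R", "Q")]


def residue_at(seq, pos):
    """Residue at a 1-based position."""
    return seq[pos - 1]


def mismatches(seq, changes):
    """Rows whose reference residue does not match the sequence."""
    return [(pos, ref) for pos, ref, _alt in changes if residue_at(seq, pos) != ref]


def apply_variant(seq, pos, ref, alt):
    """Return a copy of seq carrying the substitution, after checking the reference."""
    if residue_at(seq, pos) != ref:
        raise ValueError(f"expected {ref} at position {pos}")
    return seq[: pos - 1] + alt + seq[pos:]


print("length:", len(SEQUENCE))
print("variant mismatches:", mismatches(SEQUENCE, NATURAL_VARIANTS))
print("conflict mismatches:", mismatches(SEQUENCE, SEQUENCE_CONFLICTS))
hypt13 = apply_variant(SEQUENCE, 141, "F", "C")
print("residue 141 in HYPT13 variant:", residue_at(hypt13, 141))
```

An empty mismatch list means every listed reference residue agrees with the canonical sequence at the stated position.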

Sequence databases

EMBL / GenBank / DDBJ:
AJ308599 mRNA. Translation: CAC43429.1.
AK122795 mRNA. Translation: BAG53733.1.
AC055736 Genomic DNA. No translation available.
CH471054 Genomic DNA. Translation: EAW96636.1.
BC103917 mRNA. Translation: AAI03918.1.
BC103918 mRNA. Translation: AAI03919.1.
CCDS: CCDS8831.1.
RefSeq: NP_258259.1. NM_033448.2.
UniGene: Hs.660007.

Genome annotation databases

Ensembl: ENST00000267119; ENSP00000267119; ENSG00000139648.
GeneID: 112802.
KEGG: hsa:112802.
UCSC: uc001sao.3. human.

Keywords - Coding sequence diversity

Polymorphism

Cross-references

Protocols and materials databases

DNASU: 112802.

Organism-specific databases

CTD: 112802.
GeneCards: KRT71.
H-InvDB: HIX0026399.
HGNC: HGNC:28927. KRT71.
HPA: HPA049404.
MalaCards: KRT71.
MIM: 608245. gene.
615896. phenotype.
neXtProt: NX_Q3SY84.
Orphanet: 170. Woolly hair.
PharmGKB: PA147357697.

Miscellaneous databases

GenomeRNAi: 112802.
NextBio: 78665.
PRO: Q3SY84.


Publications

  1. "A novel epithelial keratin, hK6irs1, is expressed differentially in all layers of the inner root sheath, including specialized Huxley cells (Flugelzellen) of the human hair follicle."
    Langbein L., Rogers M.A., Praetzel S., Aoki N., Winter H., Schweizer J.
    J. Invest. Dermatol. 118:789-799(2002)
    Cited for: NUCLEOTIDE SEQUENCE [MRNA], TISSUE SPECIFICITY.
    Tissue: Scalp.
  2. "Complete sequencing and characterization of 21,243 full-length human cDNAs."
    Ota T., Suzuki Y., Nishikawa T., Otsuki T., Sugiyama T., Irie R., Wakamatsu A., Hayashi K., Sato H., Nagai K., Kimura K., Makita H., Sekine M., Obayashi M., Nishi T., Shibahara T., Tanaka T., Ishii S., Yamamoto J., Saito K., Kawai Y., Isono Y., Nakamura Y., Nagahari K., Murakami K., Yasuda T., Iwayanagi T., Wagatsuma M., Shiratori A., Sudo H., Hosoiri T., Kaku Y., Kodaira H., Kondo H., Sugawara M., Takahashi M., Kanda K., Yokoi T., Furuya T., Kikkawa E., Omura Y., Abe K., Kamihara K., Katsuta N., Sato K., Tanikawa M., Yamazaki M., Ninomiya K., Ishibashi T., Yamashita H., Murakawa K., Fujimori K., Tanai H., Kimata M., Watanabe M., Hiraoka S., Chiba Y., Ishida S., Ono Y., Takiguchi S., Watanabe S., Yosida M., Hotuta T., Kusano J., Kanehori K., Takahashi-Fujii A., Hara H., Tanase T.-O., Nomura Y., Togiya S., Komai F., Hara R., Takeuchi K., Arita M., Imose N., Musashino K., Yuuki H., Oshima A., Sasaki N., Aotsuka S., Yoshikawa Y., Matsunawa H., Ichihara T., Shiohata N., Sano S., Moriya S., Momiyama H., Satoh N., Takami S., Terashima Y., Suzuki O., Nakagawa S., Senoh A., Mizoguchi H., Goto Y., Shimizu F., Wakebe H., Hishigaki H., Watanabe T., Sugiyama A., Takemoto M., Kawakami B., Yamazaki M., Watanabe K., Kumagai A., Itakura S., Fukuzumi Y., Fujimori Y., Komiyama M., Tashiro H., Tanigami A., Fujiwara T., Ono T., Yamada K., Fujii Y., Ozaki K., Hirao M., Ohmori Y., Kawabata A., Hikiji T., Kobatake N., Inagaki H., Ikema Y., Okamoto S., Okitani R., Kawakami T., Noguchi S., Itoh T., Shigeta K., Senba T., Matsumura K., Nakajima Y., Mizuno T., Morinaga M., Sasaki M., Togashi T., Oyama M., Hata H., Watanabe M., Komatsu T., Mizushima-Sugano J., Satoh T., Shirai Y., Takahashi Y., Nakagawa K., Okumura K., Nagase T., Nomura N., Kikuchi H., Masuho Y., Yamashita R., Nakai K., Yada T., Nakamura Y., Ohara O., Isogai T., Sugano S.
    Nat. Genet. 36:40-45(2004)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA].
    Tissue: Thymus.
  3. "The finished DNA sequence of human chromosome 12."
    Scherer S.E., Muzny D.M., Buhay C.J., Chen R., Cree A., Ding Y., Dugan-Rocha S., Gill R., Gunaratne P., Harris R.A., Hawes A.C., Hernandez J., Hodgson A.V., Hume J., Jackson A., Khan Z.M., Kovar-Smith C., Lewis L.R., Lozado R.J., Metzker M.L., Milosavljevic A., Miner G.R., Montgomery K.T., Morgan M.B., Nazareth L.V., Scott G., Sodergren E., Song X.-Z., Steffen D., Lovering R.C., Wheeler D.A., Worley K.C., Yuan Y., Zhang Z., Adams C.Q., Ansari-Lari M.A., Ayele M., Brown M.J., Chen G., Chen Z., Clerc-Blankenburg K.P., Davis C., Delgado O., Dinh H.H., Draper H., Gonzalez-Garay M.L., Havlak P., Jackson L.R., Jacob L.S., Kelly S.H., Li L., Li Z., Liu J., Liu W., Lu J., Maheshwari M., Nguyen B.-V., Okwuonu G.O., Pasternak S., Perez L.M., Plopper F.J.H., Santibanez J., Shen H., Tabor P.E., Verduzco D., Waldron L., Wang Q., Williams G.A., Zhang J., Zhou J., Allen C.C., Amin A.G., Anyalebechi V., Bailey M., Barbaria J.A., Bimage K.E., Bryant N.P., Burch P.E., Burkett C.E., Burrell K.L., Calderon E., Cardenas V., Carter K., Casias K., Cavazos I., Cavazos S.R., Ceasar H., Chacko J., Chan S.N., Chavez D., Christopoulos C., Chu J., Cockrell R., Cox C.D., Dang M., Dathorne S.R., David R., Davis C.M., Davy-Carroll L., Deshazo D.R., Donlin J.E., D'Souza L., Eaves K.A., Egan A., Emery-Cohen A.J., Escotto M., Flagg N., Forbes L.D., Gabisi A.M., Garza M., Hamilton C., Henderson N., Hernandez O., Hines S., Hogues M.E., Huang M., Idlebird D.G., Johnson R., Jolivet A., Jones S., Kagan R., King L.M., Leal B., Lebow H., Lee S., LeVan J.M., Lewis L.C., London P., Lorensuhewa L.M., Loulseged H., Lovett D.A., Lucier A., Lucier R.L., Ma J., Madu R.C., Mapua P., Martindale A.D., Martinez E., Massey E., Mawhiney S., Meador M.G., Mendez S., Mercado C., Mercado I.C., Merritt C.E., Miner Z.L., Minja E., Mitchell T., Mohabbat F., Mohabbat K., Montgomery B., Moore N., Morris S., Munidasa M., Ngo R.N., Nguyen N.B., Nickerson E., Nwaokelemeh O.O., Nwokenkwo S., Obregon M., Oguh M., Oragunye N., Oviedo R.J., Parish B.J., Parker D.N., Parrish J., Parks K.L., Paul H.A., Payton B.A., Perez A., Perrin W., Pickens A., Primus E.L., Pu L.-L., Puazo M., Quiles M.M., Quiroz J.B., Rabata D., Reeves K., Ruiz S.J., Shao H., Sisson I., Sonaike T., Sorelle R.P., Sutton A.E., Svatek A.F., Svetz L.A., Tamerisa K.S., Taylor T.R., Teague B., Thomas N., Thorn R.D., Trejos Z.Y., Trevino B.K., Ukegbu O.N., Urban J.B., Vasquez L.I., Vera V.A., Villasana D.M., Wang L., Ward-Moore S., Warren J.T., Wei X., White F., Williamson A.L., Wleczyk R., Wooden H.S., Wooden S.H., Yen J., Yoon L., Yoon V., Zorrilla S.E., Nelson D., Kucherlapati R., Weinstock G., Gibbs R.A.
    Nature 440:346-351(2006)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
  4. Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA], VARIANT GLY-464.
  5. "The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC)."
    The MGC Project Team
    Genome Res. 14:2121-2127(2004)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA], VARIANT GLY-464.
  6. "K6irs1, K6irs2, K6irs3, and K6irs4 represent the inner-root-sheath-specific type II epithelial keratins of the human hair follicle."
    Langbein L., Rogers M.A., Praetzel S., Winter H., Schweizer J.
    J. Invest. Dermatol. 120:512-522(2003)
    Cited for: TISSUE SPECIFICITY.
  7. "K25 (K25irs1), K26 (K25irs2), K27 (K25irs3), and K28 (K25irs4) represent the type I inner root sheath keratins of the human hair follicle."
    Langbein L., Rogers M.A., Praetzel-Wunder S., Helmke B., Schirmacher P., Schweizer J.
    J. Invest. Dermatol. 126:2377-2386(2006)
    Cited for: TISSUE SPECIFICITY.
  8. Cited for: IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
  9. "A missense mutation within the helix initiation motif of the keratin K71 gene underlies autosomal dominant woolly hair/hypotrichosis."
    Fujimoto A., Farooq M., Fujikawa H., Inoue A., Ohyama M., Ehama R., Nakanishi J., Hagihara M., Iwabuchi T., Aoki J., Ito M., Shimomura Y.
    J. Invest. Dermatol. 132:2342-2349(2012)
    Cited for: FUNCTION, INVOLVEMENT IN HYPT13, VARIANT HYPT13 CYS-141, CHARACTERIZATION OF VARIANT HYPT13 CYS-141, TISSUE SPECIFICITY, SUBCELLULAR LOCATION.

Entry information

Entry name: K2C71_HUMAN
Accession: Primary (citable) accession number: Q3SY84
Secondary accession number(s): B3KVC1, Q3SY85, Q96DU2
Entry history
Integrated into UniProtKB/Swiss-Prot: January 15, 2008
Last sequence update: May 18, 2010
Last modified: May 11, 2016
This is version 98 of the entry and version 3 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

There are two types of cytoskeletal and microfibrillar keratin, I (acidic) and II (neutral to basic) (40-55 and 56-70 kDa, respectively).

Keywords - Technical term

Complete proteome, Reference proteome

Documents

  1. Human chromosome 12
    Human chromosome 12: entries, gene names and cross-references to MIM
  2. Human entries with polymorphisms or disease mutations
    List of human entries with polymorphisms or disease mutations
  3. Human polymorphisms and disease mutations
    Index of human polymorphisms and disease mutations
  4. MIM cross-references
    Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot
  5. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.