Protein: Cell adhesion molecule 1
Gene: CADM1
Organism: Homo sapiens (Human)
Status: Reviewed. Annotation score: 5 out of 5. Experimental evidence at protein level.

Function

Mediates homophilic cell-cell adhesion in a Ca2+-independent manner. Also mediates heterophilic cell-cell adhesion with CADM3 and NECTIN3 in a Ca2+-independent manner. Acts as a tumor suppressor in non-small-cell lung cancer (NSCLC) cells. Interaction with CRTAM promotes natural killer (NK) cell cytotoxicity and interferon-gamma (IFN-gamma) secretion by CD8+ cells in vitro as well as NK cell-mediated rejection of tumors expressing CADM3 in vivo. May contribute to the less invasive phenotypes of lepidic growth tumor cells. In mast cells, may mediate attachment to and promote communication with nerves. CADM1, together with MITF, is essential for development and survival of mast cells in vivo. Acts as a synaptic cell adhesion molecule and plays a role in the formation of dendritic spines and in synapse assembly (By similarity). May be involved in neuronal migration, axon growth, pathfinding, and fasciculation on the axons of differentiating neurons. May play diverse roles in spermatogenesis, including the adhesion of spermatocytes and spermatids to Sertoli cells and their normal differentiation into mature spermatozoa. (By similarity; 7 publications)

GO - Molecular function

  • cell adhesion molecule binding Source: GO_Central
  • PDZ domain binding Source: UniProtKB
  • protein homodimerization activity Source: UniProtKB
  • receptor activity Source: GO_Central
  • receptor binding Source: UniProtKB

GO - Biological process

  • adherens junction organization Source: Reactome
  • apoptotic process Source: UniProtKB-KW
  • brain development Source: Ensembl
  • cell differentiation Source: UniProtKB-KW
  • cell recognition Source: UniProtKB
  • detection of stimulus Source: UniProtKB
  • heterophilic cell-cell adhesion via plasma membrane cell adhesion molecules Source: UniProtKB
  • homophilic cell adhesion via plasma membrane adhesion molecules Source: UniProtKB
  • immune system process Source: UniProtKB-KW
  • liver development Source: Ensembl
  • positive regulation of cytokine secretion Source: UniProtKB
  • positive regulation of natural killer cell mediated cytotoxicity Source: UniProtKB
  • spermatogenesis Source: UniProtKB-KW
  • susceptibility to natural killer cell mediated cytotoxicity Source: UniProtKB

Keywords - Molecular function

Developmental protein

Keywords - Biological process

Apoptosis, Cell adhesion, Differentiation, Immunity, Spermatogenesis

Enzyme and pathway databases

BioCyc: ZFISH:G66-33180-MONOMER.
Reactome: R-HSA-418990. Adherens junctions interactions.
Reactome: R-HSA-420597. Nectin/Necl trans heterodimerization.

Names & Taxonomy

Protein names
Recommended name: Cell adhesion molecule 1
Alternative name(s):
  • Immunoglobulin superfamily member 4 (short name: IgSF4)
  • Nectin-like protein 2 (short name: NECL-2)
  • Spermatogenic immunoglobulin superfamily (short name: SgIgSF)
  • Synaptic cell adhesion molecule (short name: SynCAM)
  • Tumor suppressor in lung cancer 1 (short name: TSLC-1)
Gene names
Name: CADM1 (Imported)
Synonyms: IGSF4 (By similarity), IGSF4A, NECL2 (1 publication), SYNCAM (By similarity), TSLC1 (1 publication)
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes
  • UP000005640 Component: Chromosome 11

Organism-specific databases

HGNC: HGNC:5951. CADM1.

Subcellular location

Topology

Feature key | Position(s) | Description | Length
Topological domain | 45 – 374 | Extracellular (Sequence analysis) | 330
Transmembrane | 375 – 395 | Helical (Sequence analysis) | 21
Topological domain | 396 – 442 | Cytoplasmic (Sequence analysis) | 47
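
The topology rows can be cross-checked with a few lines of Python. The following is a minimal sketch, not part of the entry: it assumes 1-based inclusive coordinates, takes the signal peptide range (1 – 44) from the Molecule processing table of this entry, and uses hypothetical names (`REGIONS`, `region_length`, `region_of`) chosen only for illustration.

```python
# Region boundaries copied from the entry (1-based, inclusive).
# REGIONS, region_length and region_of are illustrative names, not UniProt API.
REGIONS = [
    ("signal peptide", 1, 44),
    ("extracellular", 45, 374),
    ("transmembrane helix", 375, 395),
    ("cytoplasmic", 396, 442),
]


def region_length(start, end):
    """Number of residues in an inclusive 1-based range."""
    return end - start + 1


def region_of(position):
    """Return the name of the region that contains a residue position."""
    for name, start, end in REGIONS:
        if start <= position <= end:
            return name
    raise ValueError(f"position {position} is outside the 442-residue precursor")


lengths = {name: region_length(start, end) for name, start, end in REGIONS}
total = sum(lengths.values())

# The four regions tile the precursor without gaps or overlaps.
print(lengths)
print(total)
print(region_of(406))  # residue 406 is one of the mutagenesis sites in this entry
```

Under these assumptions the region lengths reproduce the Length column (330, 21 and 47) and sum, together with the 44-residue signal peptide, to the 442 residues of the canonical sequence.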

GO - Cellular component

  • basolateral plasma membrane Source: UniProtKB
  • cell-cell adherens junction Source: GO_Central
  • cell-cell junction Source: UniProtKB
  • extracellular exosome Source: UniProtKB
  • integral component of plasma membrane Source: GO_Central
  • neuron projection Source: Ensembl
  • plasma membrane Source: UniProtKB
  • synapse Source: UniProtKB-SubCell

Keywords - Cellular component

Cell junction, Cell membrane, Membrane, Synapse

Pathology & Biotech

Mutagenesis

Feature key | Position(s) | Description | Length
Mutagenesis | 406 | Y → A: Nearly abolishes EPB41L3 binding. (1 publication) | 1
Mutagenesis | 408 | T → A: Strongly reduced affinity for EPB41L3. (1 publication) | 1

Keywords - Disease

Tumor suppressor

Organism-specific databases

DisGeNET: 23705.
OpenTargets: ENSG00000182985.
PharmGKB: PA29764.

Polymorphism and mutation databases

BioMuta: CADM1.
DMDM: 150438862.

PTM / Processing

Molecule processing

Feature key | Position(s) | Description | Length
Signal peptide | 1 – 44 | (Sequence analysis) | 44
Chain (PRO_0000291968) | 45 – 442 | Cell adhesion molecule 1 (Sequence analysis) | 398

Amino acid modifications

Feature key | Position(s) | Description | Length
Disulfide bond | 64 ↔ 124 | (PROSITE-ProRule annotation) |
Glycosylation | 67 | N-linked (GlcNAc...) (Sequence analysis) | 1
Glycosylation | 101 | N-linked (GlcNAc...) (3 publications) | 1
Glycosylation | 113 | N-linked (GlcNAc...) (3 publications) | 1
Glycosylation | 165 | N-linked (GlcNAc...) (Sequence analysis) | 1
Disulfide bond | 166 ↔ 220 | (PROSITE-ProRule annotation) |
Disulfide bond | 267 ↔ 313 | (PROSITE-ProRule annotation) |
Glycosylation | 301 | N-linked (GlcNAc...); atypical (1 publication) | 1
Glycosylation | 302 | N-linked (GlcNAc...); atypical (1 publication) | 1
Glycosylation | 304 | N-linked (GlcNAc...) (Sequence analysis) | 1
Glycosylation | 308 | N-linked (GlcNAc...) (Sequence analysis) | 1
Modified residue | 422 | Phosphothreonine (By similarity) | 1
Modified residue | 434 | Phosphoserine (By similarity) | 1
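
The positions in this table can be checked against the canonical sequence of isoform 1 given in this entry. The sketch below is illustrative only: it assumes the common N-X-S/T sequon rule (X is any residue except proline) as the criterion for typical N-linked sites, which is a general convention and not something this entry states, and the names `CANONICAL`, `find_sequons` and `DISULFIDES` are hypothetical.

```python
import re

# Canonical isoform 1 sequence as printed in this entry (blocks of 10 residues);
# whitespace is stripped before use.
CANONICAL = "".join("""
MASVVLPSGS QCAAAAAAAA PPGLRLRLLL LLFSAAALIP TGDGQNLFTK
DVTVIEGEVA TISCQVNKSD DSVIQLLNPN RQTIYFRDFR PLKDSRFQLL
NFSSSELKVS LTNVSISDEG RYFCQLYTDP PQESYTTITV LVPPRNLMID
IQKDTAVEGE EIEVNCTAMA SKPATTIRWF KGNTELKGKS EVEEWSDMYT
VTSQLMLKVH KEDDGVPVIC QVEHPAVTGN LQTQRYLEVQ YKPQVHIQMT
YPLQGLTREG DALELTCEAI GKPQPVMVTW VRVDDEMPQH AVLSGPNLFI
NNLNKTDNGT YRCEASNIVG KAHSDYMLYV YDPPTTIPPP TTTTTTTTTT
TTTILTIITD SRAGEEGSIR AVDHAVIGGV VAVVVFAMLC LLIILGRYFA
RHKGTYFTHE AKGADDAADA DTAIINAEGG QNNSEEKKEY FI
""".split())

# Disulfide pairs copied from the table above (1-based positions).
DISULFIDES = [(64, 124), (166, 220), (267, 313)]


def find_sequons(seq):
    """1-based positions of N in N-X-S/T motifs, X != P (assumed rule)."""
    # The lookahead keeps overlapping motifs from being skipped.
    return [m.start() + 1 for m in re.finditer(r"N(?=[^P][ST])", seq)]


sequons = find_sequons(CANONICAL)
# Keep only sites inside the extracellular domain (45 - 374 per the Topology table).
extracellular_sequons = [p for p in sequons if 45 <= p <= 374]
cysteines_ok = all(
    CANONICAL[a - 1] == "C" and CANONICAL[b - 1] == "C" for a, b in DISULFIDES
)

print(extracellular_sequons)
print(cysteines_ok)
```

With this rule the extracellular hits are positions 67, 101, 113, 165, 304 and 308, matching the six typical sites listed above. The sites at 301 and 302 are not found, which is consistent with their "atypical" label, and the one extra match at position 432 lies in the cytoplasmic tail.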

Post-translational modification

Glycosylation at Asn-67 and Asn-101 promotes adhesive binding and synapse induction (By similarity).

Keywords - PTM

Disulfide bond, Glycoprotein, Phosphoprotein

Proteomic databases

EPD: Q9BY67.
PaxDb: Q9BY67.
PeptideAtlas: Q9BY67.
PRIDE: Q9BY67.

PTM databases

iPTMnet: Q9BY67.
PhosphoSitePlus: Q9BY67.

Expression

Gene expression databases

Bgee: ENSG00000182985.
CleanEx: HS_CADM1.
ExpressionAtlas: Q9BY67. baseline and differential.
Genevisible: Q9BY67. HS.

Organism-specific databases

HPA: CAB037266.

Interaction

Subunit structure

Homodimer. Interacts with FARP1 (By similarity). Interacts with CRTAM. Interacts (via C-terminus) with EPB41L3/DAL1. The interaction with EPB41L3/DAL1 may act to anchor CADM1 to the actin cytoskeleton. Interacts via its C-terminus with the PDZ domain of MPP3 and the PDZ domain of MPP6. (By similarity; 6 publications)

Binary interactions

With | Entry | #Exp. | IntAct
PSMA6 | P60900 | 3 | EBI-5652260, EBI-357793

Protein-protein interaction databases

BioGrid: 117218. 8 interactors.
DIP: DIP-57599N.
IntAct: Q9BY67. 5 interactors.
MINT: MINT-1184987.
STRING: 9606.ENSP00000395359.

Structure

Secondary structure

Feature key | Position(s) | Description | Length
Beta strand | 52 – 55 | (Combined sources) | 4
Beta strand | 60 – 68 | (Combined sources) | 9
Beta strand | 74 – 77 | (Combined sources) | 4
Beta strand | 83 – 86 | (Combined sources) | 4
Beta strand | 97 – 103 | (Combined sources) | 7
Beta strand | 106 – 113 | (Combined sources) | 8
Helix | 116 – 118 | (Combined sources) | 3
Beta strand | 120 – 126 | (Combined sources) | 7
Beta strand | 128 – 130 | (Combined sources) | 3
Beta strand | 132 – 141 | (Combined sources) | 10
Beta strand | 405 – 407 | (Combined sources) | 3

3D structure databases

PDB entry | Method | Resolution (Å) | Chain | Positions
3BIN | X-ray | 2.30 | B | 400-411
4H5S | X-ray | 1.70 | B | 45-144

ProteinModelPortal: Q9BY67.
SMR: Q9BY67.

Miscellaneous databases

EvolutionaryTrace: Q9BY67.

Family & Domains

Domains and Repeats

Feature key | Position(s) | Description | Length
Domain | 45 – 139 | Ig-like V-type (Sequence analysis) | 95
Domain | 144 – 238 | Ig-like C2-type 1 (Sequence analysis) | 95
Domain | 243 – 329 | Ig-like C2-type 2 (Sequence analysis) | 87

Domain

The cytoplasmic domain appears to play a critical role in proapoptosis and tumor suppressor activity in NSCLC. (2 publications)

Sequence similarities

Belongs to the nectin family. (Curated)
Contains 2 Ig-like C2-type (immunoglobulin-like) domains. (Sequence analysis)
Contains 1 Ig-like V-type (immunoglobulin-like) domain. (Sequence analysis)

Keywords - Domain

Immunoglobulin domain, Repeat, Signal, Transmembrane, Transmembrane helix

Phylogenomic databases

eggNOG: ENOG410IH4H. Eukaryota.
eggNOG: ENOG41116SB. LUCA.
GeneTree: ENSGT00770000120518.
HOGENOM: HOG000036057.
HOVERGEN: HBG057086.
InParanoid: Q9BY67.
KO: K06781.
OMA: EIYTTIT.
PhylomeDB: Q9BY67.
TreeFam: TF334317.

Family and domain databases

Gene3D: 2.60.40.10. 3 hits.
InterPro: IPR013162. CD80_C2-set.
InterPro: IPR007110. Ig-like_dom.
InterPro: IPR013783. Ig-like_fold.
InterPro: IPR003599. Ig_sub.
InterPro: IPR003598. Ig_sub2.
InterPro: IPR013106. Ig_V-set.
InterPro: IPR003585. Neurexin-like.
Pfam: PF08205. C2-set_2. 1 hit.
Pfam: PF07686. V-set. 1 hit.
SMART: SM00294. 4.1m. 1 hit.
SMART: SM00409. IG. 3 hits.
SMART: SM00408. IGc2. 3 hits.
SUPFAM: SSF48726. SSF48726. 3 hits.
PROSITE: PS50835. IG_LIKE. 3 hits.

Sequences (5)

Sequence status: Complete.

Sequence processing: The displayed sequence is further processed into a mature form.

This entry describes 5 isoforms produced by alternative splicing.

Isoform 1 (identifier: Q9BY67-1) (1 publication)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MASVVLPSGS QCAAAAAAAA PPGLRLRLLL LLFSAAALIP TGDGQNLFTK
60 70 80 90 100
DVTVIEGEVA TISCQVNKSD DSVIQLLNPN RQTIYFRDFR PLKDSRFQLL
110 120 130 140 150
NFSSSELKVS LTNVSISDEG RYFCQLYTDP PQESYTTITV LVPPRNLMID
160 170 180 190 200
IQKDTAVEGE EIEVNCTAMA SKPATTIRWF KGNTELKGKS EVEEWSDMYT
210 220 230 240 250
VTSQLMLKVH KEDDGVPVIC QVEHPAVTGN LQTQRYLEVQ YKPQVHIQMT
260 270 280 290 300
YPLQGLTREG DALELTCEAI GKPQPVMVTW VRVDDEMPQH AVLSGPNLFI
310 320 330 340 350
NNLNKTDNGT YRCEASNIVG KAHSDYMLYV YDPPTTIPPP TTTTTTTTTT
360 370 380 390 400
TTTILTIITD SRAGEEGSIR AVDHAVIGGV VAVVVFAMLC LLIILGRYFA
410 420 430 440
RHKGTYFTHE AKGADDAADA DTAIINAEGG QNNSEEKKEY FI
Length: 442
Mass (Da): 48,509
Last modified: June 26, 2007 - v2
Checksum: CDEDE1E0C08BDD3A

Isoform 2 (identifier: Q9BY67-2) (1 publication)

The sequence of this isoform differs from the canonical sequence as follows:
     332-333: DP → GT
     334-442: Missing.

Length: 333
Mass (Da): 36,915
Checksum: D7C1102F46D08492

Isoform 3 (identifier: Q9BY67-3)

The sequence of this isoform differs from the canonical sequence as follows:
     359-359: T → TDTTATTEPAVHGLTQLPNSAEELDSEDLS

Show »
Length: 471
Mass (Da): 51,533
Checksum: 322A71AB89E8B21F

Isoform 4 (identifier: Q9BY67-4)

The sequence of this isoform differs from the canonical sequence as follows:
     359-359: T → TDTTATTEPAVH

Show »
Length: 453
Mass (Da): 49,633
Checksum: 466DC2D374481CFB

Isoform 5 (identifier: Q9BY67-5)
Also known as: E

The sequence of this isoform differs from the canonical sequence as follows:
     333-360: Missing.

Length: 414
Mass (Da): 45,624
Checksum: 4C5AB05F34BA714A

Sequence caution

The sequence AAI25103 differs from that shown. Reason: Frameshift at position 1. (Curated)

Experimental Info

Feature key | Position(s) | Description | Length
Sequence conflict | 13 | A → V in CCD32613 (PubMed:22438059). (Curated) | 1
Sequence conflict | 153 | K → R in AAF69029 (PubMed:15893517). (Curated) | 1
Sequence conflict | 333 – 359 | PPTTI…LTIIT → TTATTEPAVHGLTQLPNSAEELDSEDLS in BAC11657 (PubMed:14702039). (Curated) | 27

Natural variant

Feature key | Position(s) | Description | Length
Natural variant (VAR_061309) | 285 | D → E. Corresponds to variant rs45525440 (dbSNP, Ensembl). | 1

Alternative sequence

Feature key | Position(s) | Description | Length
Alternative sequence (VSP_052461) | 332 – 333 | DP → GT in isoform 2. (1 publication) | 2
Alternative sequence (VSP_047405) | 333 – 360 | Missing in isoform 5. (2 publications) | 28
Alternative sequence (VSP_052462) | 334 – 442 | Missing in isoform 2. (1 publication) | 109
Alternative sequence (VSP_047406) | 359 | T → TDTTATTEPAVHGLTQLPNSAEELDSEDLS in isoform 3. (1 publication) | 1
Alternative sequence (VSP_047407) | 359 | T → TDTTATTEPAVH in isoform 4. (1 publication) | 1
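
The isoform lengths reported in the Sequences section follow directly from these alternative sequence features. As a rough sketch (the `ISOFORM_EDITS` table and `isoform_length` helper are hypothetical names for illustration, and coordinates are assumed to be 1-based and inclusive), each edit changes the length by the size of the replacement minus the size of the replaced range:

```python
CANONICAL_LENGTH = 442  # length of isoform 1 (Q9BY67-1) as given in this entry

# (start, end, replacement) per isoform; an empty replacement means "Missing".
# Values are copied from the alternative sequence features of this entry.
ISOFORM_EDITS = {
    "Q9BY67-2": [(332, 333, "GT"), (334, 442, "")],
    "Q9BY67-3": [(359, 359, "TDTTATTEPAVHGLTQLPNSAEELDSEDLS")],
    "Q9BY67-4": [(359, 359, "TDTTATTEPAVH")],
    "Q9BY67-5": [(333, 360, "")],
}


def isoform_length(edits, canonical_length=CANONICAL_LENGTH):
    """Length after applying non-overlapping replace/delete edits."""
    length = canonical_length
    for start, end, replacement in edits:
        replaced = end - start + 1  # residues removed from the canonical sequence
        length += len(replacement) - replaced
    return length


lengths = {iso: isoform_length(edits) for iso, edits in ISOFORM_EDITS.items()}
for iso, length in lengths.items():
    print(iso, length)
```

Under these assumptions the computed values agree with the lengths listed for isoforms 2 to 5 (333, 471, 453 and 414 residues).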

Sequence databases

EMBL / GenBank / DDBJ:
AF132811 mRNA. Translation: AAF69029.1.
HE586496 mRNA. Translation: CCD32610.1.
HE586497 mRNA. Translation: CCD32611.1.
HE586498 mRNA. Translation: CCD32612.1.
HE586499 mRNA. Translation: CCD32613.1.
KJ534791 mRNA. Translation: AHW56431.1.
KJ534794 mRNA. Translation: AHW56434.1.
AB094146 mRNA. Translation: BAC66178.1.
AK075502 mRNA. Translation: BAC11657.1.
AP000462 Genomic DNA. No translation available.
AP000465 Genomic DNA. No translation available.
AP003174 Genomic DNA. No translation available.
AP003179 Genomic DNA. No translation available.
AP005020 Genomic DNA. No translation available.
BC125102 mRNA. Translation: AAI25103.1. Frameshift.
CCDS: CCDS53711.1. [Q9BY67-5]
CCDS: CCDS73398.1. [Q9BY67-4]
CCDS: CCDS73399.1. [Q9BY67-3]
CCDS: CCDS8373.1. [Q9BY67-1]
RefSeq: NP_001091987.1. NM_001098517.1. [Q9BY67-5]
RefSeq: NP_001287972.1. NM_001301043.1. [Q9BY67-3]
RefSeq: NP_001287973.1. NM_001301044.1. [Q9BY67-4]
RefSeq: NP_001287974.1. NM_001301045.1.
RefSeq: NP_055148.3. NM_014333.3. [Q9BY67-1]
UniGene: Hs.370510.

Genome annotation databases

Ensembl: ENST00000331581; ENSP00000329797; ENSG00000182985. [Q9BY67-3]
Ensembl: ENST00000452722; ENSP00000395359; ENSG00000182985. [Q9BY67-1]
Ensembl: ENST00000537058; ENSP00000439817; ENSG00000182985. [Q9BY67-4]
Ensembl: ENST00000542447; ENSP00000439176; ENSG00000182985. [Q9BY67-5]
GeneID: 23705.
KEGG: hsa:23705.
UCSC: uc001ppk.4. human. [Q9BY67-1]
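
Each Ensembl row pairs a transcript, a protein and a gene identifier with the isoform it encodes. One possible way to turn those rows into a lookup table is sketched below; the regular expression and the `parse_ensembl` helper are assumptions made for this example and are not part of any UniProt or Ensembl tooling.

```python
import re

# Ensembl cross-reference rows as they appear in this entry.
ENSEMBL_LINES = """\
ENST00000331581; ENSP00000329797; ENSG00000182985. [Q9BY67-3]
ENST00000452722; ENSP00000395359; ENSG00000182985. [Q9BY67-1]
ENST00000537058; ENSP00000439817; ENSG00000182985. [Q9BY67-4]
ENST00000542447; ENSP00000439176; ENSG00000182985. [Q9BY67-5]
"""

# transcript; protein; gene. [isoform]
PATTERN = re.compile(r"(ENST\d+); (ENSP\d+); (ENSG\d+)\. \[(Q9BY67-\d+)\]")


def parse_ensembl(text):
    """Map each isoform identifier to its Ensembl transcript/protein/gene IDs."""
    mapping = {}
    for transcript, protein, gene, isoform in PATTERN.findall(text):
        mapping[isoform] = {
            "transcript": transcript,
            "protein": protein,
            "gene": gene,
        }
    return mapping


by_isoform = parse_ensembl(ENSEMBL_LINES)
print(by_isoform["Q9BY67-1"])
print(sorted(by_isoform))
```

In this listing isoform 2 has no Ensembl row, so it is absent from the resulting dictionary, and all four rows share the gene identifier ENSG00000182985.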

Keywords - Coding sequence diversity

Alternative splicing, Polymorphism

Cross-references

Organism-specific databases

CTD: 23705.
DisGeNET: 23705.
GeneCards: CADM1.
HGNC: HGNC:5951. CADM1.
HPA: CAB037266.
MIM: 605686. gene.
neXtProt: NX_Q9BY67.
OpenTargets: ENSG00000182985.
PharmGKB: PA29764.

Miscellaneous databases

ChiTaRS: CADM1. human.
EvolutionaryTrace: Q9BY67.
GeneWiki: Cell_adhesion_molecule_1.
GenomeRNAi: 23705.
PRO: Q9BY67.

Entry information

Entry name: CADM1_HUMAN
Accession: Primary (citable) accession number: Q9BY67
Secondary accession number(s): A4FVB5, F5H0J4, H0YGA7, H1ZZV9, H1ZZW1, H1ZZW2, Q86WB8, Q8N2F4, X5D2C8
Entry history
Integrated into UniProtKB/Swiss-Prot: June 26, 2007
Last sequence update: June 26, 2007
Last modified: November 2, 2016
This is version 138 of the entry and version 2 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

Keywords - Technical term

3D-structure, Complete proteome, Reference proteome

Documents

  1. Human chromosome 11
    Human chromosome 11: entries, gene names and cross-references to MIM
  2. Human entries with polymorphisms or disease mutations
    List of human entries with polymorphisms or disease mutations
  3. Human polymorphisms and disease mutations
    Index of human polymorphisms and disease mutations
  4. MIM cross-references
    Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot
  5. PDB cross-references
    Index of Protein Data Bank (PDB) cross-references
  6. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.
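
The membership criteria described above can be written as a simple predicate. This is only a sketch of the stated thresholds, not the UniRef clustering procedure itself: it assumes that sequence identity and overlap with the seed sequence have already been computed elsewhere and are given as fractions between 0 and 1, and the names `THRESHOLDS` and `joins_cluster` are hypothetical.

```python
# (minimum identity, minimum overlap) relative to the seed (longest) sequence,
# taken from the UniRef90 and UniRef50 descriptions above.
THRESHOLDS = {
    "UniRef90": (0.90, 0.80),
    "UniRef50": (0.50, 0.80),
}


def joins_cluster(level, identity, overlap):
    """True if a sequence meets the stated identity and overlap thresholds.

    `identity` and `overlap` are fractions (0.0 to 1.0) measured against the
    cluster's seed sequence; how they are computed is outside this sketch.
    """
    min_identity, min_overlap = THRESHOLDS[level]
    return identity >= min_identity and overlap >= min_overlap


print(joins_cluster("UniRef90", identity=0.93, overlap=0.85))
print(joins_cluster("UniRef90", identity=0.93, overlap=0.70))
print(joins_cluster("UniRef50", identity=0.55, overlap=0.80))
```

Both conditions must hold, so a sequence that is 93% identical to the seed but overlaps only 70% of it would not join a UniRef90 cluster under this reading.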