Protein

Contactin-associated protein-like 3

Gene

CNTNAP3

Organism
Homo sapiens (Human)
Status
Reviewed - Annotation score: 4 out of 5 - Experimental evidence at transcript level

Function

GO - Biological process

  • cell adhesion Source: UniProtKB-KW
  • cell recognition Source: UniProtKB

Keywords - Biological process

Cell adhesion

Names & Taxonomy

Protein names
Recommended name:
Contactin-associated protein-like 3
Alternative name(s):
Cell recognition molecule Caspr3
Gene names
Name: CNTNAP3
Synonyms: CASPR3, KIAA1714
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes
  • UP000005640 Component: Chromosome 9

Organism-specific databases

HGNC: HGNC:13834. CNTNAP3.

Subcellular location

Topology

Feature key         Position(s)   Length  Description
Topological domain  26 – 1245     1220    Extracellular (Sequence analysis)
Transmembrane       1246 – 1266   21      Helical (Sequence analysis)
Topological domain  1267 – 1288   22      Cytoplasmic (Sequence analysis)
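
The lengths in the topology table follow from inclusive 1-based coordinates, so length = end - start + 1. The Python sketch below reproduces that arithmetic and checks that the three regions tile the mature chain (26 – 1288) without gaps. The helper names are illustrative assumptions and are not part of any UniProt tool.

```python
# Topology regions copied from the table above: (label, start, end),
# with 1-based inclusive coordinates on the canonical sequence.
TOPOLOGY = [
    ("Extracellular", 26, 1245),
    ("Helical transmembrane", 1246, 1266),
    ("Cytoplasmic", 1267, 1288),
]


def feature_length(start: int, end: int) -> int:
    """Length of a feature given inclusive 1-based coordinates."""
    return end - start + 1


def is_contiguous(regions) -> bool:
    """True if each region starts directly after the previous one ends."""
    return all(nxt[1] == cur[2] + 1 for cur, nxt in zip(regions, regions[1:]))


lengths = {label: feature_length(start, end) for label, start, end in TOPOLOGY}
total = sum(lengths.values())  # residues covered by the three regions
```

The three lengths add up to the 1,263 residues of the mature chain; adding the 25-residue signal peptide gives the full 1,288-residue precursor.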

GO - Cellular component

  • extracellular region Source: UniProtKB-SubCell
  • integral component of membrane Source: UniProtKB
  • plasma membrane Source: UniProtKB-SubCell

Keywords - Cellular component

Cell membrane, Membrane, Secreted

Pathology & Biotech

Organism-specific databases

PharmGKB: PA134963289.

Polymorphism and mutation databases

BioMuta: CNTNAP3.
DMDM: 209572752.

PTM / Processing

Molecule processing

Feature key     Position(s)  Length  Description                                     Feature identifier
Signal peptide  1 – 25       25      Sequence analysis
Chain           26 – 1288    1263    Contactin-associated protein-like 3             PRO_0000019509

Amino acid modifications

Feature key     Position(s)   Length  Description
Disulfide bond  31 ↔ 177              By similarity
Glycosylation   285 – 285     1       N-linked (GlcNAc...) (Sequence analysis)
Disulfide bond  332 ↔ 364             By similarity
Glycosylation   359 – 359     1       N-linked (GlcNAc...) (Sequence analysis)
Glycosylation   441 – 441     1       N-linked (GlcNAc...) (Sequence analysis)
Glycosylation   497 – 497     1       N-linked (GlcNAc...) (Sequence analysis)
Disulfide bond  513 ↔ 545             By similarity
Disulfide bond  551 ↔ 562             By similarity
Disulfide bond  556 ↔ 571             By similarity
Disulfide bond  573 ↔ 583             By similarity
Glycosylation   623 – 623     1       N-linked (GlcNAc...) (Sequence analysis)
Glycosylation   706 – 706     1       N-linked (GlcNAc...) (Sequence analysis)
Disulfide bond  931 ↔ 958             By similarity
Disulfide bond  962 ↔ 975             By similarity
Disulfide bond  969 ↔ 984             By similarity
Disulfide bond  986 ↔ 996             By similarity
Glycosylation   1023 – 1023   1       N-linked (GlcNAc...) (Sequence analysis)
Glycosylation   1073 – 1073   1       N-linked (GlcNAc...) (Sequence analysis)
Glycosylation   1120 – 1120   1       N-linked (GlcNAc...) (Sequence analysis)
Disulfide bond  1167 ↔ 1203           By similarity
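
The N-linked glycosylation sites above are predictions ("Sequence analysis"). One common heuristic for such predictions is the N-X-S/T sequon, where X is any residue except proline. The Python sketch below scans a fragment for that motif. It is an assumption for illustration and is not necessarily the rule UniProt applied; the function name and fragment constants are hypothetical, and the fragments are copied from the canonical sequence in this entry.

```python
import re

# Asparagine followed by any residue except proline, then Ser or Thr.
# The lookahead keeps the match one residue wide so that overlapping
# motifs are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(fragment: str, first_position: int = 1) -> list[int]:
    """Return 1-based positions of Asn residues in N-X-S/T sequons.

    first_position is the canonical coordinate of fragment[0].
    """
    return [first_position + m.start() for m in SEQUON.finditer(fragment)]


# Residues 281-290 and 351-365 of the canonical sequence.
FRAGMENT_281_290 = "DTQVNFTVDK"
FRAGMENT_351_365 = "KPQILMMGNVSFSCP"

sites_a = find_sequons(FRAGMENT_281_290, first_position=281)
sites_b = find_sequons(FRAGMENT_351_365, first_position=351)
```

On these two fragments the scan reports positions 285 and 359, which match two of the annotated sites. A sequon match indicates only a candidate site, not experimental evidence of glycosylation.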

Keywords - PTM

Disulfide bond, Glycoprotein

Proteomic databases

EPD: Q9BZ76.
MaxQB: Q9BZ76.
PaxDb: Q9BZ76.
PRIDE: Q9BZ76.

PTM databases

iPTMnet: Q9BZ76.
PhosphoSite: Q9BZ76.

Expression

Gene expression databases

Bgee: Q9BZ76.
CleanEx: HS_CNTNAP3.
ExpressionAtlas: Q9BZ76. baseline and differential.
Genevisible: Q9BZ76. HS.

Organism-specific databases

HPA: HPA015604.
HPA047731.

Interaction

Protein-protein interaction databases

BioGrid: 123011. 33 interactions.
STRING: 9606.ENSP00000297668.

Structure

3D structure databases

ProteinModelPortal: Q9BZ76.
SMR: Q9BZ76. Positions 45-177, 208-348, 371-584, 786-1110.

Family & Domains

Domains and Repeats

Feature key  Position(s)   Length  Description
Domain       31 – 177      147     F5/8 type C (PROSITE-ProRule annotation)
Domain       183 – 364     182     Laminin G-like 1 (PROSITE-ProRule annotation)
Domain       370 – 545     176     Laminin G-like 2 (PROSITE-ProRule annotation)
Domain       551 – 583     33      EGF-like 1 (PROSITE-ProRule annotation)
Domain       584 – 792     209     Fibrinogen C-terminal (PROSITE-ProRule annotation)
Domain       793 – 958     166     Laminin G-like 3 (PROSITE-ProRule annotation)
Domain       962 – 996     35      EGF-like 2 (PROSITE-ProRule annotation)
Domain       1015 – 1203   189     Laminin G-like 4 (PROSITE-ProRule annotation)

Compositional bias

Feature key         Position(s)  Length  Description
Compositional bias  42 – 48      7       Poly-Ser

Sequence similarities

Belongs to the neurexin family. (Curated)
Contains 2 EGF-like domains. (PROSITE-ProRule annotation)
Contains 1 F5/8 type C domain. (PROSITE-ProRule annotation)
Contains 1 fibrinogen C-terminal domain. (PROSITE-ProRule annotation)
Contains 4 laminin G-like domains. (PROSITE-ProRule annotation)

Keywords - Domain

EGF-like domain, Repeat, Signal, Transmembrane, Transmembrane helix

Phylogenomic databases

eggNOG: KOG3516. Eukaryota.
ENOG410XPHG. LUCA.
GeneTree: ENSGT00760000118991.
HOGENOM: HOG000230964.
HOVERGEN: HBG057718.
InParanoid: Q9BZ76.
OMA: CEAHRHR.
OrthoDB: EOG7GXP9N.
PhylomeDB: Q9BZ76.
TreeFam: TF321823.

Family and domain databases

Gene3D: 2.60.120.200. 4 hits.
2.60.120.260. 1 hit.
3.90.215.10. 1 hit.
InterPro: IPR028873. CASPR3.
IPR013320. ConA-like_dom.
IPR000742. EGF-like_dom.
IPR000421. FA58C.
IPR014716. Fibrinogen_a/b/g_C_1.
IPR002181. Fibrinogen_a/b/g_C_dom.
IPR008979. Galactose-bd-like.
IPR001791. Laminin_G.
PANTHER: PTHR10127:SF605. PTHR10127:SF605. 1 hit.
Pfam: PF00754. F5_F8_type_C. 1 hit.
PF02210. Laminin_G_2. 4 hits.
SMART: SM00181. EGF. 2 hits.
SM00231. FA58C. 1 hit.
SM00282. LamG. 4 hits.
SUPFAM: SSF49785. SSF49785. 1 hit.
SSF49899. SSF49899. 4 hits.
SSF56496. SSF56496. 1 hit.
PROSITE: PS50026. EGF_3. 2 hits.
PS01285. FA58C_1. 1 hit.
PS01286. FA58C_2. 1 hit.
PS50022. FA58C_3. 1 hit.
PS51406. FIBRINOGEN_C_2. 1 hit.
PS50025. LAM_G_DOMAIN. 4 hits.

Sequences (2)

Sequence status: Complete.

Sequence processing: The displayed sequence is further processed into a mature form.

This entry describes 2 isoforms produced by alternative splicing.

Isoform 1 (identifier: Q9BZ76-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MASVAWAVLK VLLLLPTQTW SPVGAGNPPD CDAPLASALP RSSFSSSSEL
        60         70         80         90        100
SSSHGPGFSR LNRRDGAGGW TPLVSNKYQW LQIDLGERME VTAVATQGGY
       110        120        130        140        150
GSSDWVTSYL LMFSDGGRNW KQYRREESIW GFPGNTNADS VVHYRLQPPF
       160        170        180        190        200
EARFLRFLPL AWNPRGRIGM RIEVYGCAYK SEVVYFDGQS ALLYRLDKKP
       210        220        230        240        250
LKPIRDVISL KFKAMQSNGI LLHREGQHGN HITLELIKGK LVFFLNSGNA
       260        270        280        290        300
KLPSTIAPVT LTLGSLLDDQ HWHSVLIELL DTQVNFTVDK HTHHFQAKGD
       310        320        330        340        350
SSYLDLNFEI SFGGIPTPGR SRAFRRKSFH GCLENLYYNG VDVTELAKKH
       360        370        380        390        400
KPQILMMGNV SFSCPQPQTV PVTFLSSRSY LALPGNSGED KVSVTFQFRT
       410        420        430        440        450
WNRAGHLLFG ELRRGSGSFV LFLKDGKLKL SLFQPGQSPR NVTAGAGLND
       460        470        480        490        500
GQWHSVSFSA KWSHMNVVVD DDTAVQPLVA VLIDSGDTYY FGGCLDNSSG
       510        520        530        540        550
SGCKSPLGGF QGCLRLITIG DKAVDPILVQ QGALGSFRDL QIDSCGITDR
       560        570        580        590        600
CLPSYCEHGG ECSQSWDTFS CDCLGTGYTG ETCHSSLYEQ SCEAHKHRGN
       610        620        630        640        650
PSGLYYIDAD GSGPLGPFLV YCNMTADAAW TVVQHGGPDA VTLRGAPSGH
       660        670        680        690        700
PRSAVSFAYA AGAGQLRSAV NLAERCEQRL ALRCGTARRP DSRDGTPLSW
       710        720        730        740        750
WVGRTNETHT SWGGSLPDAQ KCTCGLEGNC IDSQYYCNCD AGRNEWTSDT
       760        770        780        790        800
IVLSQKEHLP VTQIVMTDAG RPHSEAAYTL GPLLCRGDQS FWNSASFNTE
       810        820        830        840        850
TSYLHFPAFH GELTADVCFF FKTTVSSGVF MENLGITDFI RIELRAPTEV
       860        870        880        890        900
TFSFDVGNGP CEVTVQSPTP FNDNQWHHVR AERNVKGASL QVDQLPQKMQ
       910        920        930        940        950
PAPADGHVRL QLNSQLFIGG TATRQRGFLG CIRSLQLNGV ALDLEERATV
       960        970        980        990       1000
TPGVEPGCAG HCSTYGHLCR NGGRCREKRR GVTCDCAFSA YDGPFCSNEI
      1010       1020       1030       1040       1050
SAYFATGSSM TYHFQEHYTL SENSSSLVSS LHRDVTLTRE MITLSFRTTR
      1060       1070       1080       1090       1100
TPSLLLYVSS FYEEYLSVIL ANNGSLQIRY KLDRHQNPDA FTFDFKNMAD
      1110       1120       1130       1140       1150
GQLHQVKINR EEAVVMVEVN QSTKKQVILS SGTEFNAVKS LILGKVLEAA
      1160       1170       1180       1190       1200
GADPDTRRAA TSGFTGCLSA VRFGRAAPLK AALRPSGPSR VTVRGHVAPM
      1210       1220       1230       1240       1250
ARCAAGAASG SPARELAPRL AGGAGRSGPA DEGEPLVNAD RRDSAVIGGV
      1260       1270       1280
IAVVIFILLC ITAIAIRIYQ QRKLRKENES KVSKKEEC
Length: 1,288
Mass (Da): 140,690
Last modified: October 14, 2008 - v3
Checksum: F41F1CE8A83D417E
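
The sequence display alternates ruler lines of position numbers with rows of ten-residue groups. To work with the sequence programmatically, the rulers and spaces have to be stripped first. The Python sketch below shows one way to do that; the function name is a hypothetical helper, and only the first two rows of the display are used as sample input.

```python
def parse_blocked_sequence(text: str) -> str:
    """Join a blocked sequence display into one continuous string.

    Blank lines and ruler lines (tokens that are all digits) are skipped;
    every other token is treated as a group of residues.
    """
    residues = []
    for line in text.splitlines():
        tokens = line.split()
        if not tokens or all(token.isdigit() for token in tokens):
            continue
        residues.extend(tokens)
    return "".join(residues)


# First two rows of the canonical sequence as displayed in this entry.
first_rows = """
        10         20         30         40         50
MASVAWAVLK VLLLLPTQTW SPVGAGNPPD CDAPLASALP RSSFSSSSEL
        60         70         80         90        100
SSSHGPGFSR LNRRDGAGGW TPLVSNKYQW LQIDLGERME VTAVATQGGY
"""

sequence = parse_blocked_sequence(first_rows)
residue_31 = sequence[31 - 1]  # convert the 1-based position to a 0-based index
```

Because positions in the entry are 1-based, residue 31 (the first cysteine of the 31 ↔ 177 disulfide bond) is at index 30 of the parsed string.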

Isoform 2 (identifier: Q9BZ76-2)

The sequence of this isoform differs from the canonical sequence as follows:
     1120-1127: NQSTKKQV → IPQMQKSN
     1128-1288: Missing.

Note: No experimental confirmation available.
Length: 1,127
Mass (Da): 124,116
Checksum: CB24B889FE46BEB1
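
The two differences listed for isoform 2 are expressed in canonical (isoform 1) coordinates, so they can be applied mechanically to the canonical sequence. The Python sketch below illustrates this under stated assumptions: the function name is hypothetical, and the input is a placeholder in which only residues 1120-1127 are real and every other position is filled with "X".

```python
def apply_isoform_edits(canonical: str, edits) -> str:
    """Apply (start, end, replacement) edits to a canonical sequence.

    Coordinates are 1-based and inclusive; an empty replacement means the
    range is missing in the isoform.
    """
    sequence = canonical
    # Work from the C-terminal end so that earlier coordinates stay valid.
    for start, end, replacement in sorted(edits, reverse=True):
        sequence = sequence[: start - 1] + replacement + sequence[end:]
    return sequence


# Placeholder for the 1,288-residue canonical sequence (assumption for
# illustration): real residues only at positions 1120-1127.
placeholder = "X" * 1119 + "NQSTKKQV" + "X" * 161

ISOFORM_2_EDITS = [
    (1120, 1127, "IPQMQKSN"),  # 1120-1127: NQSTKKQV -> IPQMQKSN
    (1128, 1288, ""),          # 1128-1288: Missing
]

isoform_2 = apply_isoform_edits(placeholder, ISOFORM_2_EDITS)
```

The result is 1,127 residues long and ends in IPQMQKSN, consistent with the length reported for isoform 2.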

Sequence caution

The sequence BAB21805.2 differs from that shown. Reason: Erroneous initiation. (Curated)

Experimental Info

Feature key        Position(s)  Length  Description
Sequence conflict  21 – 21      1       S → R in AAG52889 (PubMed:12093160). (Curated)
Sequence conflict  33 – 33      1       A → S in AAG52889 (PubMed:12093160). (Curated)
Sequence conflict  89 – 89      1       M → I in AAG52889 (PubMed:12093160). (Curated)
Sequence conflict  711 – 711    1       S → Y in AAG52889 (PubMed:12093160). (Curated)
Sequence conflict  711 – 711    1       S → Y in BAB21805 (PubMed:11214970). (Curated)
Sequence conflict  714 – 714    1       G → V in BAB21805 (PubMed:11214970). (Curated)
Sequence conflict  769 – 771    3       AGR → TGQ in AAG52889 (PubMed:12093160). (Curated)
Sequence conflict  777 – 777    1       A → D in AAG52889 (PubMed:12093160). (Curated)

Natural variant

Feature key      Position(s)  Length  Description                                                          Feature identifier
Natural variant  628 – 628    1       A → S. Corresponds to variant rs1758272 (dbSNP).                     VAR_046710
Natural variant  845 – 845    1       R → H. 1 Publication. Corresponds to variant rs7852039 (dbSNP).      VAR_046711
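
Each natural variant is a single-residue substitution at a canonical position. Before applying one, it is worth confirming that the reference residue is actually present at that position. The Python sketch below does this on short fragments copied from the canonical sequence in this entry; the function and constant names are hypothetical.

```python
def apply_substitution(fragment: str, first_position: int,
                       position: int, ref: str, alt: str) -> str:
    """Substitute one residue, checking the reference residue first.

    first_position is the canonical (1-based) coordinate of fragment[0];
    position is the canonical coordinate of the residue to change.
    """
    index = position - first_position
    if not 0 <= index < len(fragment):
        raise ValueError(f"position {position} lies outside the fragment")
    if fragment[index] != ref:
        raise ValueError(
            f"expected {ref} at position {position}, found {fragment[index]}"
        )
    return fragment[:index] + alt + fragment[index + 1:]


# Residues 621-630 and 841-850 of the canonical sequence.
FRAGMENT_621_630 = "YCNMTADAAW"
FRAGMENT_841_850 = "RIELRAPTEV"

variant_628 = apply_substitution(FRAGMENT_621_630, 621, 628, "A", "S")
variant_845 = apply_substitution(FRAGMENT_841_850, 841, 845, "R", "H")
```

The reference check guards against off-by-one errors when mixing 1-based entry coordinates with 0-based string indices.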

Alternative sequence

Feature key           Position(s)   Length  Description                                             Feature identifier
Alternative sequence  1120 – 1127   8       NQSTKKQV → IPQMQKSN in isoform 2. 1 Publication         VSP_003535
Alternative sequence  1128 – 1288   161     Missing in isoform 2. 1 Publication                     VSP_003536

Sequence databases

EMBL, GenBank, DDBJ:
AF333769 mRNA. Translation: AAG52889.2.
AB051501 mRNA. Translation: BAB21805.2. Different initiation.
AL162501, AL353729 Genomic DNA. Translation: CAH70488.1.
AL353729, AL162501 Genomic DNA. Translation: CAH72039.1.
CCDS: CCDS6616.1. [Q9BZ76-1]
RefSeq: NP_387504.2. NM_033655.3. [Q9BZ76-1]
UniGene: Hs.128474.
Hs.521495.
Hs.604441.

Genome annotation databases

Ensembl: ENST00000297668; ENSP00000297668; ENSG00000106714. [Q9BZ76-1]
GeneID: 79937.
KEGG: hsa:79937.
UCSC: uc004abi.4. human. [Q9BZ76-1]

Keywords - Coding sequence diversity

Alternative splicing, Polymorphism

Cross-references

Organism-specific databases

CTD: 79937.
GeneCards: CNTNAP3.
H-InvDB: HIX0008061.
HGNC: HGNC:13834. CNTNAP3.
HPA: HPA015604.
HPA047731.
MIM: 610517. gene.
neXtProt: NX_Q9BZ76.
PharmGKB: PA134963289.

Miscellaneous databases

GenomeRNAi: 79937.
NextBio: 69878.
PRO: Q9BZ76.

Publications

  1. "Caspr3 and Caspr4, two novel members of the Caspr family are expressed in the nervous system and interact with PDZ domains."
    Spiegel I., Salomon D., Erne B., Schaeren-Wiemers N., Peles E.
    Mol. Cell. Neurosci. 20:283-297(2002)
    Cited for: NUCLEOTIDE SEQUENCE [MRNA] (ISOFORM 1).
    Tissue: Brain.
  2. "Prediction of the coding sequences of unidentified human genes. XIX. The complete sequences of 100 new cDNA clones from brain which code for large proteins in vitro."
    Nagase T., Kikuno R., Hattori A., Kondo Y., Okumura K., Ohara O.
    DNA Res. 7:347-355(2000)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORM 2), VARIANT HIS-845.
    Tissue: Brain.
  3. Nagase T., Kikuno R., Yamakawa H., Ohara O.
    Submitted (JAN-2003) to the EMBL/GenBank/DDBJ databases
    Cited for: SEQUENCE REVISION.
  4. "DNA sequence and analysis of human chromosome 9."
    Humphray S.J., Oliver K., Hunt A.R., Plumb R.W., Loveland J.E., Howe K.L., Andrews T.D., Searle S., Hunt S.E., Scott C.E., Jones M.C., Ainscough R., Almeida J.P., Ambrose K.D., Ashwell R.I.S., Babbage A.K., Babbage S., Bagguley C.L., Bailey J., Banerjee R., Barker D.J., Barlow K.F., Bates K., Beasley H., Beasley O., Bird C.P., Bray-Allen S., Brown A.J., Brown J.Y., Burford D., Burrill W., Burton J., Carder C., Carter N.P., Chapman J.C., Chen Y., Clarke G., Clark S.Y., Clee C.M., Clegg S., Collier R.E., Corby N., Crosier M., Cummings A.T., Davies J., Dhami P., Dunn M., Dutta I., Dyer L.W., Earthrowl M.E., Faulkner L., Fleming C.J., Frankish A., Frankland J.A., French L., Fricker D.G., Garner P., Garnett J., Ghori J., Gilbert J.G.R., Glison C., Grafham D.V., Gribble S., Griffiths C., Griffiths-Jones S., Grocock R., Guy J., Hall R.E., Hammond S., Harley J.L., Harrison E.S.I., Hart E.A., Heath P.D., Henderson C.D., Hopkins B.L., Howard P.J., Howden P.J., Huckle E., Johnson C., Johnson D., Joy A.A., Kay M., Keenan S., Kershaw J.K., Kimberley A.M., King A., Knights A., Laird G.K., Langford C., Lawlor S., Leongamornlert D.A., Leversha M., Lloyd C., Lloyd D.M., Lovell J., Martin S., Mashreghi-Mohammadi M., Matthews L., McLaren S., McLay K.E., McMurray A., Milne S., Nickerson T., Nisbett J., Nordsiek G., Pearce A.V., Peck A.I., Porter K.M., Pandian R., Pelan S., Phillimore B., Povey S., Ramsey Y., Rand V., Scharfe M., Sehra H.K., Shownkeen R., Sims S.K., Skuce C.D., Smith M., Steward C.A., Swarbreck D., Sycamore N., Tester J., Thorpe A., Tracey A., Tromans A., Thomas D.W., Wall M., Wallis J.M., West A.P., Whitehead S.L., Willey D.L., Williams S.A., Wilming L., Wray P.W., Young L., Ashurst J.L., Coulson A., Blocker H., Durbin R.M., Sulston J.E., Hubbard T., Jackson M.J., Bentley D.R., Beck S., Rogers J., Dunham I.
    Nature 429:369-374(2004)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].

Entry information

Entry name: CNTP3_HUMAN
Accession: Primary (citable) accession number: Q9BZ76
Secondary accession number(s): B1AMA0, Q9C0E9
Entry history
Integrated into UniProtKB/Swiss-Prot: December 5, 2001
Last sequence update: October 14, 2008
Last modified: May 11, 2016
This is version 146 of the entry and version 3 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

Keywords - Technical term

Complete proteome, Reference proteome

Documents

  1. Human chromosome 9
    Human chromosome 9: entries, gene names and cross-references to MIM
  2. Human entries with polymorphisms or disease mutations
    List of human entries with polymorphisms or disease mutations
  3. Human polymorphisms and disease mutations
    Index of human polymorphisms and disease mutations
  4. MIM cross-references
    Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot
  5. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.