Protein: Matrin-3
Gene: Matr3
Organism: Mus musculus (Mouse)
Status: Reviewed. Annotation score: 4 out of 5. Experimental evidence at protein level.

Function

May play a role in transcription or may interact with other nuclear matrix proteins to form the internal fibrogranular network. In association with the SFPQ-NONO heteromer may play a role in nuclear retention of defective RNAs (By similarity).

Regions

Feature key | Position(s) | Length | Description | Evidence
Zinc finger | 800 – 831 | 32 | Matrin-type | PROSITE-ProRule annotation

GO - Molecular function

GO - Biological process

  • heart valve development Source: MGI
  • posttranscriptional regulation of gene expression Source: MGI
  • ventricular septum development Source: MGI

Keywords - Ligand

Metal-binding, RNA-binding, Zinc

Names & Taxonomy

Protein names
Recommended name: Matrin-3
Gene names
Name: Matr3
Organism: Mus musculus (Mouse)
Taxonomic identifier: 10090 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Glires > Rodentia > Sciurognathi > Muroidea > Muridae > Murinae > Mus > Mus
Proteomes
  • UP000000589 Component: Chromosome 18

Organism-specific databases

MGI: MGI:1298379. Matr3.

Subcellular location

  • Nucleus matrix (PROSITE-ProRule annotation)

GO - Cellular component

  • membrane Source: MGI
  • nuclear matrix Source: UniProtKB-SubCell
  • nucleoplasm Source: MGI
  • nucleus Source: MGI

Keywords - Cellular component

Nucleus

PTM / Processing

Molecule processing

Feature key | Position(s) | Length | Description | Feature identifier
Initiator methionine | | | Removed (By similarity) |
Chain | 2 – 846 | 845 | Matrin-3 | PRO_0000081623

Amino acid modifications

Feature key | Position(s) | Length | Description | Evidence
Modified residue | 2 – 2 | 1 | N-acetylserine | By similarity
Modified residue | 3 – 3 | 1 | N6-acetyllysine; alternate | By similarity
Cross-link | 3 – 3 | | Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO2); alternate | By similarity
Modified residue | 4 – 4 | 1 | Phosphoserine | By similarity
Modified residue | 9 – 9 | 1 | Phosphoserine | By similarity
Modified residue | 14 – 14 | 1 | Phosphoserine | By similarity
Modified residue | 22 – 22 | 1 | Phosphoserine | By similarity
Modified residue | 41 – 41 | 1 | Phosphoserine | By similarity
Modified residue | 118 – 118 | 1 | Phosphoserine | By similarity
Cross-link | 146 – 146 | | Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO2) | By similarity
Modified residue | 150 – 150 | 1 | Phosphothreonine | Combined sources
Modified residue | 188 – 188 | 1 | Phosphoserine | Combined sources
Modified residue | 195 – 195 | 1 | Phosphoserine | Combined sources
Modified residue | 206 – 206 | 1 | Phosphoserine | By similarity
Modified residue | 208 – 208 | 1 | Phosphoserine | By similarity
Modified residue | 219 – 219 | 1 | Phosphotyrosine | By similarity
Cross-link | 487 – 487 | | Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO2) | By similarity
Cross-link | 491 – 491 | | Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO2) | By similarity
Modified residue | 522 – 522 | 1 | N6-acetyllysine | By similarity
Modified residue | 533 – 533 | 1 | Phosphoserine | Combined sources
Cross-link | 554 – 554 | | Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO2) | By similarity
Modified residue | 571 – 571 | 1 | N6-acetyllysine | By similarity
Modified residue | 596 – 596 | 1 | Phosphoserine | By similarity
Modified residue | 598 – 598 | 1 | Phosphoserine | Combined sources
Modified residue | 604 – 604 | 1 | Phosphoserine | Combined sources
Modified residue | 606 – 606 | 1 | Phosphoserine | By similarity
Cross-link | 616 – 616 | | Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO2) | By similarity
Modified residue | 653 – 653 | 1 | Phosphoserine | By similarity
Modified residue | 670 – 670 | 1 | Phosphoserine | By similarity
Modified residue | 672 – 672 | 1 | Phosphoserine | By similarity
Modified residue | 673 – 673 | 1 | Phosphoserine | By similarity
Modified residue | 678 – 678 | 1 | Phosphothreonine | By similarity
Modified residue | 688 – 688 | 1 | Phosphoserine | By similarity
Cross-link | 718 – 718 | | Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO2) | By similarity
Cross-link | 735 – 735 | | Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO2) | By similarity
Modified residue | 740 – 740 | 1 | Phosphothreonine | By similarity
Modified residue | 746 – 746 | 1 | Phosphoserine | By similarity
Modified residue | 758 – 758 | 1 | Phosphoserine | By similarity
Modified residue | 835 – 835 | 1 | N6-acetyllysine | Combined sources

Keywords - PTM

Acetylation, Isopeptide bond, Phosphoprotein, Ubl conjugation

Proteomic databases

EPD: Q8K310.
MaxQB: Q8K310.
PaxDb: Q8K310.
PRIDE: Q8K310.

PTM databases

iPTMnet: Q8K310.
PhosphoSite: Q8K310.
SwissPalm: Q8K310.

Expression

Gene expression databases

Bgee: Q8K310.
CleanEx: MM_MATR3.
ExpressionAtlas: Q8K310. baseline and differential.
Genevisible: Q8K310. MM.

Interaction

Subunit structure

Part of a complex consisting of SFPQ, NONO and MATR3. Interacts with AGO1 and AGO2 (By similarity). Part of a complex composed at least of ASCL2, EMSY, HCFC1, HSPA8, MATR3, MKI67, CCAR2, RBBP5, TUBB2A, WDR5 and ZNF335; this complex may have a histone H3-specific methyltransferase activity (By similarity). Interacts with TARDBP (By similarity).

Protein-protein interaction databases

BioGrid: 201324. 9 interactions.
IntAct: Q8K310. 10 interactions.
MINT: MINT-4101305.
STRING: 10090.ENSMUSP00000125761.

Structure

Secondary structure

Feature key | Position(s) | Length | Evidence
Beta strand | 399 – 404 | 6 | Combined sources
Beta strand | 408 – 410 | 3 | Combined sources
Helix | 411 – 416 | 6 | Combined sources
Turn | 417 – 419 | 3 | Combined sources
Helix | 420 – 422 | 3 | Combined sources
Beta strand | 425 – 430 | 6 | Combined sources
Beta strand | 432 – 435 | 4 | Combined sources
Beta strand | 437 – 443 | 7 | Combined sources
Helix | 444 – 456 | 13 | Combined sources
Beta strand | 467 – 471 | 5 | Combined sources
Beta strand | 497 – 502 | 6 | Combined sources
Helix | 511 – 514 | 4 | Combined sources
Turn | 515 – 520 | 6 | Combined sources
Beta strand | 524 – 529 | 6 | Combined sources
Turn | 530 – 533 | 4 | Combined sources
Beta strand | 534 – 538 | 5 | Combined sources
Helix | 542 – 554 | 13 | Combined sources
Beta strand | 559 – 562 | 4 | Combined sources
Beta strand | 565 – 569 | 5 | Combined sources
Beta strand | 573 – 576 | 4 | Combined sources

3D structure databases

Entry | Method | Resolution (Å) | Chain | Positions
1X4D | NMR | - | A | 390-478
1X4F | NMR | - | A | 478-576
ProteinModelPortal: Q8K310.
SMR: Q8K310. Positions 390-576.

Miscellaneous databases

EvolutionaryTrace: Q8K310.

Family & Domains

Domains and Repeats

Feature key | Position(s) | Length | Description | Evidence
Domain | 398 – 473 | 76 | RRM 1 | PROSITE-ProRule annotation
Domain | 496 – 571 | 76 | RRM 2 | PROSITE-ProRule annotation

Motif

Feature key | Position(s) | Length | Description | Evidence
Motif | 709 – 717 | 9 | Nuclear localization signal | Sequence analysis

Sequence similarities

Contains 1 matrin-type zinc finger (PROSITE-ProRule annotation).
Contains 2 RRM (RNA recognition motif) domains (PROSITE-ProRule annotation).

Zinc finger

Feature key | Position(s) | Length | Description | Evidence
Zinc finger | 800 – 831 | 32 | Matrin-type | PROSITE-ProRule annotation

Keywords - Domain

Repeat, Zinc-finger

Phylogenomic databases

eggNOG: ENOG410IGPC. Eukaryota.
    ENOG410XSPB. LUCA.
GeneTree: ENSGT00840000129848.
HOVERGEN: HBG057347.
InParanoid: Q8K310.
KO: K13213.
OMA: EQEPSML.
OrthoDB: EOG7GJ6C8.
PhylomeDB: Q8K310.
TreeFam: TF333921.

Family and domain databases

Gene3D: 3.30.70.330. 2 hits.
InterPro: IPR012677. Nucleotide-bd_a/b_plait.
    IPR000504. RRM_dom.
    IPR000690. Znf_C2H2_matrin.
    IPR003604. Znf_U1.
SMART: SM00360. RRM. 2 hits.
    SM00451. ZnF_U1. 2 hits.
SUPFAM: SSF54928. SSF54928. 2 hits.
PROSITE: PS50102. RRM. 2 hits.
    PS50171. ZF_MATRIN. 1 hit.

Sequence

Sequence status: Complete.

Sequence processing: The displayed sequence is further processed into a mature form.

Q8K310-1

        10         20         30         40         50
MSKSFQQSSL GRDSQGHGRD LSAAGIGLLA AATQSLSMPA SLGRMNQGTA
        60         70         80         90        100
RLASLMNLGM SSSLNQQGAH SALSSASTSS HNLQSIFNIG SRGPLPLSSQ
       110        120        130        140        150
HRGDTDQASN ILASFGLSAR DLDELSRYPE DKITPENLPQ ILLQLKRRRT
       160        170        180        190        200
EEGPTLSYGR DGRSATREPP YRVPRDDWEE KRHFRRDSFD DRGPSLNPVL
       210        220        230        240        250
DYDHGSRSQE SGYYDRMDYE DDRLRDGERC RDDSFFGETS HNYHKFDSEY
       260        270        280        290        300
ERMGRGPGPL QERSLFEKKR GAPPSSNIED FHGLLPKGYP HLCSICDLPV
       310        320        330        340        350
HSNKEWSQHI NGASHSRRCQ LLLEIYPEWN PDNDTGHTMG DPFMLQQSTN
       360        370        380        390        400
PAPGILGPPP PSFHLGGPAV GPRGNLGAGN GNLQGPRHMQ KGRVETSRVV
       410        420        430        440        450
HIMDFQRGKN LRYQLLQLVE PFGVISNHLI LNKINEAFIE MATTEDAQAA
       460        470        480        490        500
VDYYTTTPAL VFGKPVRVHL SQKYKRIKKP EGKPDQKFDQ KQELGRVIHL
       510        520        530        540        550
SNLPHSGYSD SAVLKLAEPY GKIKNYILMR MKSQAFIEME TREDAMAMVD
       560        570        580        590        600
HCLKKALWFQ GRCVKVDLSE KYKKLVLRIP NRGIDLLKKD KSRKRSYSPD
       610        620        630        640        650
GKESPSDKKS KTDAQKTESP AEGKEQEEKS GEDGEKDTKD DQTEQEPSML
       660        670        680        690        700
LESEDELLVD EEEAAALLES GSSVGDETDL ANLGDVSSDG KKEPSDKAVK
       710        720        730        740        750
KDPSASATSK KKLKKVDKIE ELDQENEAAL ENGIKNEENT EPGAESAENA
       760        770        780        790        800
DDPNKDTSEN ADGQNDENKE DYTIPDEYRI GPYQPNVPVG IDYVIPKTGF
       810        820        830        840
YCKLCSLFYT NEEVAKNTHC SSLPHYQKLK KFLNKLAEER RQKKET
Length: 846
Mass (Da): 94,630
Last modified: October 1, 2002 - v1
Checksum: 59C30E63D55093AF
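
The sequence display above interleaves position rulers with residue blocks of ten, so it cannot be pasted directly into most tools. The following is a minimal sketch, not part of the UniProt entry, of how the display could be flattened into a plain residue string. The function name `clean_sequence` and the two-row `sample` are illustrative assumptions; the sample rows are copied from the first 100 residues shown in this entry.

```python
def clean_sequence(display_text: str) -> str:
    """Strip position rulers and block spacing from a sequence display."""
    residues = []
    for line in display_text.splitlines():
        tokens = line.split()
        # Ruler lines hold only numbers (10, 20, ...); blank lines hold nothing.
        if not tokens or all(tok.isdigit() for tok in tokens):
            continue
        # Residue lines hold space-separated blocks of letters; join them.
        residues.append("".join(tokens))
    return "".join(residues)


# First two rows of the display (residues 1-100), used as a small sample.
sample = """\
        10         20         30         40         50
MSKSFQQSSL GRDSQGHGRD LSAAGIGLLA AATQSLSMPA SLGRMNQGTA
        60         70         80         90        100
RLASLMNLGM SSSLNQQGAH SALSSASTSS HNLQSIFNIG SRGPLPLSSQ
"""

sequence = clean_sequence(sample)
print(len(sequence))  # 100
```

Applied to the full display, the length of the returned string can be compared against the stated length of 846 as a quick consistency check.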

Sequence databases

EMBL / GenBank / DDBJ: AK087939 mRNA. Translation: BAC40050.1.
    BC029070 mRNA. Translation: AAH29070.1.
CCDS: CCDS29142.1.
RefSeq: NP_034901.2. NM_010771.6.
UniGene: Mm.215034.

Genome annotation databases

Ensembl: ENSMUST00000166793; ENSMUSP00000125761; ENSMUSG00000037236.
    ENSMUST00000187389; ENSMUSP00000139745; ENSMUSG00000037236.
    ENSMUST00000190029; ENSMUSP00000140846; ENSMUSG00000037236.
GeneID: 17184.
KEGG: mmu:17184.
UCSC: uc008emg.2. mouse.

Cross-references

Organism-specific databases

CTD: 9782.
MGI: MGI:1298379. Matr3.

Miscellaneous databases

ChiTaRS: Matr3. mouse.
EvolutionaryTrace: Q8K310.
NextBio: 291510.
PRO: Q8K310.


Publications

  1. "The transcriptional landscape of the mammalian genome."
    Carninci P., Kasukawa T., Katayama S., Gough J., Frith M.C., Maeda N., Oyama R., Ravasi T., Lenhard B., Wells C., Kodzius R., Shimokawa K., Bajic V.B., Brenner S.E., Batalov S., Forrest A.R., Zavolan M., Davis M.J., Wilming L.G., Aidinis V., Allen J.E., Ambesi-Impiombato A., Apweiler R., Aturaliya R.N., Bailey T.L., Bansal M., Baxter L., Beisel K.W., Bersano T., Bono H., Chalk A.M., Chiu K.P., Choudhary V., Christoffels A., Clutterbuck D.R., Crowe M.L., Dalla E., Dalrymple B.P., de Bono B., Della Gatta G., di Bernardo D., Down T., Engstrom P., Fagiolini M., Faulkner G., Fletcher C.F., Fukushima T., Furuno M., Futaki S., Gariboldi M., Georgii-Hemming P., Gingeras T.R., Gojobori T., Green R.E., Gustincich S., Harbers M., Hayashi Y., Hensch T.K., Hirokawa N., Hill D., Huminiecki L., Iacono M., Ikeo K., Iwama A., Ishikawa T., Jakt M., Kanapin A., Katoh M., Kawasawa Y., Kelso J., Kitamura H., Kitano H., Kollias G., Krishnan S.P., Kruger A., Kummerfeld S.K., Kurochkin I.V., Lareau L.F., Lazarevic D., Lipovich L., Liu J., Liuni S., McWilliam S., Madan Babu M., Madera M., Marchionni L., Matsuda H., Matsuzawa S., Miki H., Mignone F., Miyake S., Morris K., Mottagui-Tabar S., Mulder N., Nakano N., Nakauchi H., Ng P., Nilsson R., Nishiguchi S., Nishikawa S., Nori F., Ohara O., Okazaki Y., Orlando V., Pang K.C., Pavan W.J., Pavesi G., Pesole G., Petrovsky N., Piazza S., Reed J., Reid J.F., Ring B.Z., Ringwald M., Rost B., Ruan Y., Salzberg S.L., Sandelin A., Schneider C., Schoenbach C., Sekiguchi K., Semple C.A., Seno S., Sessa L., Sheng Y., Shibata Y., Shimada H., Shimada K., Silva D., Sinclair B., Sperling S., Stupka E., Sugiura K., Sultana R., Takenaka Y., Taki K., Tammoja K., Tan S.L., Tang S., Taylor M.S., Tegner J., Teichmann S.A., Ueda H.R., van Nimwegen E., Verardo R., Wei C.L., Yagi K., Yamanishi H., Zabarovsky E., Zhu S., Zimmer A., Hide W., Bult C., Grimmond S.M., Teasdale R.D., Liu E.T., Brusic V., Quackenbush J., Wahlestedt C., Mattick J.S., Hume D.A., Kai C., Sasaki D., Tomaru Y., Fukuda S., Kanamori-Katayama M., Suzuki M., Aoki J., Arakawa T., Iida J., Imamura K., Itoh M., Kato T., Kawaji H., Kawagashira N., Kawashima T., Kojima M., Kondo S., Konno H., Nakano K., Ninomiya N., Nishio T., Okada M., Plessy C., Shibata K., Shiraki T., Suzuki S., Tagami M., Waki K., Watahiki A., Okamura-Oho Y., Suzuki H., Kawai J., Hayashizaki Y.
    Science 309:1559-1563(2005)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA].
    Strain: NOD.
    Tissue: Thymus.
  2. "The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC)."
    The MGC Project Team
    Genome Res. 14:2121-2127(2004)
    Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA].
    Strain: Czech II.
    Tissue: Mammary tumor.
  3. Lubec G., Sunyer B., Chen W.-Q.
    Submitted (JAN-2009) to UniProtKB
    Cited for: PROTEIN SEQUENCE OF 4-12; 20-44; 93-127; 133-146; 193-223; 230-245; 271-304; 399-407; 413-433; 497-515; 525-530; 533-542; 556-562; 719-735; 780-797 AND 804-816, IDENTIFICATION BY MASS SPECTROMETRY.
    Strain: OF1.
    Tissue: Hippocampus.
  4. "Comprehensive identification of phosphorylation sites in postsynaptic density preparations."
    Trinidad J.C., Specht C.G., Thalhammer A., Schoepfer R., Burlingame A.L.
    Mol. Cell. Proteomics 5:914-922(2006)
    Cited for: PHOSPHORYLATION [LARGE SCALE ANALYSIS] AT SER-188, IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
    Tissue: Brain.
  5. Cited for: PHOSPHORYLATION [LARGE SCALE ANALYSIS] AT SER-188, IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
    Tissue: Liver.
  6. "The phagosomal proteome in interferon-gamma-activated macrophages."
    Trost M., English L., Lemieux S., Courcelles M., Desjardins M., Thibault P.
    Immunity 30:143-154(2009)
    Cited for: PHOSPHORYLATION [LARGE SCALE ANALYSIS] AT SER-188, IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
  7. "Large scale localization of protein phosphorylation by use of electron capture dissociation mass spectrometry."
    Sweet S.M., Bailey C.M., Cunningham D.L., Heath J.K., Cooper H.J.
    Mol. Cell. Proteomics 8:904-912(2009)
    Cited for: PHOSPHORYLATION [LARGE SCALE ANALYSIS] AT SER-598, IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
    Tissue: Embryonic fibroblast.
  8. Cited for: PHOSPHORYLATION [LARGE SCALE ANALYSIS] AT THR-150; SER-188; SER-195; SER-533; SER-598 AND SER-604, IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
    Tissue: Brain, Brown adipose tissue, Heart, Kidney, Lung, Pancreas, Spleen and Testis.
  9. "SIRT5-mediated lysine desuccinylation impacts diverse metabolic pathways."
    Park J., Chen Y., Tishkoff D.X., Peng C., Tan M., Dai L., Xie Z., Zhang Y., Zwaans B.M., Skinner M.E., Lombard D.B., Zhao Y.
    Mol. Cell 50:919-930(2013)
    Cited for: ACETYLATION [LARGE SCALE ANALYSIS] AT LYS-835, IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
    Tissue: Embryonic fibroblast.
  10. "Solution structure of RRM domains in matrin 3."
    RIKEN structural genomics initiative (RSGI)
    Submitted (NOV-2005) to the PDB data bank
    Cited for: STRUCTURE BY NMR OF 390-576.

Entry information

Entry name: MATR3_MOUSE
Accession: Primary (citable) accession number: Q8K310
Entry history
Integrated into UniProtKB/Swiss-Prot: November 23, 2004
Last sequence update: October 1, 2002
Last modified: May 11, 2016
This is version 130 of the entry and version 1 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program

Miscellaneous

Keywords - Technical term

3D-structure, Complete proteome, Direct protein sequencing, Reference proteome

Documents

  1. MGD cross-references
    Mouse Genome Database (MGD) cross-references in UniProtKB/Swiss-Prot
  2. PDB cross-references
    Index of Protein Data Bank (PDB) cross-references
  3. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.
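
The UniRef90 and UniRef50 criteria described above reduce to two thresholds per level: a minimum sequence identity to the seed and a minimum overlap with it. The sketch below only illustrates that membership test and is not the UniRef clustering pipeline itself. The names `UNIREF_THRESHOLDS` and `meets_uniref_criteria` are hypothetical, and the identity and overlap fractions are assumed to come from an alignment computed elsewhere.

```python
# Thresholds taken from the descriptions above:
# (minimum identity to the seed, minimum overlap with the seed).
UNIREF_THRESHOLDS = {
    "UniRef90": (0.90, 0.80),
    "UniRef50": (0.50, 0.80),
}


def meets_uniref_criteria(level: str, identity: float, overlap: float) -> bool:
    """Return True if a sequence satisfies the stated thresholds for a level.

    identity and overlap are fractions in [0, 1], measured against the
    seed (longest) sequence of the cluster. Both are assumed inputs.
    """
    min_identity, min_overlap = UNIREF_THRESHOLDS[level]
    return identity >= min_identity and overlap >= min_overlap


# A sequence 93% identical to the seed over 85% of its length passes both
# levels; at 70% overlap it passes neither, whatever the identity.
print(meets_uniref_criteria("UniRef90", 0.93, 0.85))  # True
print(meets_uniref_criteria("UniRef90", 0.93, 0.70))  # False
print(meets_uniref_criteria("UniRef50", 0.55, 0.80))  # True
```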