Protein

Meiosis arrest female protein 1

Gene

KIAA0430

Organism
Homo sapiens (Human)
Status
Reviewed - Annotation score: 5 out of 5 - Experimental evidence at protein level

Function

Essential regulator of oogenesis required for female meiotic progression, repressing transposable elements and preventing their mobilization, which is essential for germline integrity. Probably acts via an RNA metabolic process equivalent to the piRNA system in males, which mediates the repression of transposable elements during meiosis by forming complexes composed of RNAs and governs the methylation and subsequent repression of transposons. Also required to protect from DNA double-strand breaks (By similarity).

GO - Molecular function

GO - Biological process

Keywords - Biological process

Differentiation, Meiosis, Oogenesis

Keywords - Ligand

RNA-binding

Names & Taxonomy

Protein names
Recommended name: Meiosis arrest female protein 1
Alternative name(s): Limkain-b1
Gene names
Name: KIAA0430
Synonyms: LKAP, MARF1
Organism: Homo sapiens (Human)
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo
Proteomes
  • UP000005640 Component: Chromosome 16

Organism-specific databases

HGNC: HGNC:29562. KIAA0430.

Subcellular location

GO - Cellular component

  • membrane Source: UniProtKB
  • peroxisome Source: UniProtKB

Keywords - Cellular component

Peroxisome

Pathology & Biotech

Organism-specific databases

OpenTargets: ENSG00000166783.
  ENSG00000277140.
PharmGKB: PA145148631.

Polymorphism and mutation databases

BioMuta: KIAA0430.
DMDM: 387912929.

PTM / Processing

Molecule processing

Feature key    Feature ID        Position(s)    Description                        Length
Chain          PRO_0000276846    1 – 1742       Meiosis arrest female protein 1    1742

Amino acid modifications

Feature key         Position(s)    Description        Evidence            Length
Modified residue    66             Phosphoserine      Combined sources    1
Modified residue    699            Phosphotyrosine    By similarity       1
Modified residue    760            Phosphoserine      Combined sources    1
Modified residue    1091           Phosphoserine      Combined sources    1
Modified residue    1093           Phosphoserine      Combined sources    1
Modified residue    1571           Phosphoserine      Combined sources    1

Keywords - PTM

Phosphoprotein

Proteomic databases

EPD: Q9Y4F3.
PaxDb: Q9Y4F3.
PeptideAtlas: Q9Y4F3.
PRIDE: Q9Y4F3.

PTM databases

iPTMnet: Q9Y4F3.
PhosphoSitePlus: Q9Y4F3.

Expression

Gene expression databases

Bgee: ENSG00000166783.
CleanEx: HS_KIAA0430.
ExpressionAtlas: Q9Y4F3. baseline and differential.
Genevisible: Q9Y4F3. HS.

Organism-specific databases

HPA: HPA017992.

Interaction

Subunit structure

Interacts with LIMK2 (1 publication).

Protein-protein interaction databases

BioGrid: 115020. 8 interactors.
IntAct: Q9Y4F3. 9 interactors.
STRING: 9606.ENSP00000379654.

Structure

Secondary structure

Feature key    Position(s)    Evidence            Length
Beta strand    510 – 518      Combined sources    9
Helix          525 – 537      Combined sources    13
Turn           538 – 540      Combined sources    3
Beta strand    543 – 545      Combined sources    3
Beta strand    551 – 557      Combined sources    7
Helix          558 – 568      Combined sources    11
Beta strand    573 – 576      Combined sources    4
Beta strand    579 – 583      Combined sources    5
Beta strand    791 – 797      Combined sources    7
Helix          804 – 818      Combined sources    15
Beta strand    821 – 826      Combined sources    6
Beta strand    836 – 842      Combined sources    7
Helix          843 – 853      Combined sources    11
Beta strand    856 – 858      Combined sources    3
Beta strand    861 – 867      Combined sources    7

3D structure databases

PDB entry    Method    Resolution (Å)    Chain    Positions
2DGX         NMR       -                 A        789-871
2DIU         NMR       -                 A        510-592

ProteinModelPortal: Q9Y4F3.
SMR: Q9Y4F3.

Miscellaneous databases

EvolutionaryTrace: Q9Y4F3.

Family & Domains

Domains and Repeats

Feature key    Position(s)     Description       Evidence                      Length
Domain         353 – 490       NYN                                             138
Domain         791 – 870       RRM               PROSITE-ProRule annotation    80
Domain         875 – 949       HTH OST-type 1    PROSITE-ProRule annotation    75
Domain         1003 – 1079     HTH OST-type 2    PROSITE-ProRule annotation    77
Domain         1099 – 1173     HTH OST-type 3    PROSITE-ProRule annotation    75
Domain         1175 – 1250     HTH OST-type 4    PROSITE-ProRule annotation    76
Domain         1259 – 1334     HTH OST-type 5    PROSITE-ProRule annotation    76
Domain         1335 – 1410     HTH OST-type 6    PROSITE-ProRule annotation    76
Domain         1411 – 1485     HTH OST-type 7    PROSITE-ProRule annotation    75
Domain         1486 – 1560     HTH OST-type 8    PROSITE-ProRule annotation    75
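
The listed lengths in the domain table follow from the 1-based, inclusive positions (length = end - start + 1). As a minimal sketch, assuming the coordinates are transcribed correctly from the table, the following Python snippet recomputes each length and checks that no two domains overlap. The names `DOMAINS`, `feature_length`, `mismatches` and `overlaps` are illustrative and not part of any UniProt tooling.

```python
# Domain coordinates copied from the table above (1-based, inclusive),
# with the length listed in the entry as the last field.
DOMAINS = [
    ("NYN", 353, 490, 138),
    ("RRM", 791, 870, 80),
    ("HTH OST-type 1", 875, 949, 75),
    ("HTH OST-type 2", 1003, 1079, 77),
    ("HTH OST-type 3", 1099, 1173, 75),
    ("HTH OST-type 4", 1175, 1250, 76),
    ("HTH OST-type 5", 1259, 1334, 76),
    ("HTH OST-type 6", 1335, 1410, 76),
    ("HTH OST-type 7", 1411, 1485, 75),
    ("HTH OST-type 8", 1486, 1560, 75),
]


def feature_length(start: int, end: int) -> int:
    """Length of a feature given 1-based inclusive coordinates."""
    return end - start + 1


def mismatches(domains):
    """Names of features whose listed length disagrees with their coordinates."""
    return [name for name, start, end, listed in domains
            if feature_length(start, end) != listed]


def overlaps(domains):
    """Pairs of neighbouring features (sorted by start) that share residues."""
    ordered = sorted(domains, key=lambda d: d[1])
    return [(a[0], b[0]) for a, b in zip(ordered, ordered[1:]) if b[1] <= a[2]]


if __name__ == "__main__":
    print("length mismatches:", mismatches(DOMAINS))
    print("overlapping pairs:", overlaps(DOMAINS))
```

The same check can be reused for any feature table in the entry that lists positions and lengths side by side.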

Compositional bias

Feature key           Position(s)     Description    Length
Compositional bias    117 – 125       Poly-Gly       9
Compositional bias    1079 – 1082     Poly-Pro       4
Compositional bias    1685 – 1690     Poly-Ser       6

Sequence similarities

Contains 8 HTH OST-type domains (PROSITE-ProRule annotation).
Contains 1 NYN domain (Curated).
Contains 1 RRM (RNA recognition motif) domain (PROSITE-ProRule annotation).

Keywords - Domain

Repeat

Phylogenomic databases

eggNOG: ENOG410IF58. Eukaryota.
  ENOG410XQFX. LUCA.
GeneTree: ENSGT00390000002393.
HOVERGEN: HBG081919.
InParanoid: Q9Y4F3.
KO: K17573.
OMA: KDNCLMM.
OrthoDB: EOG091G07VR.
PhylomeDB: Q9Y4F3.
TreeFam: TF329117.

Family and domain databases

Gene3D: 3.30.70.330. 2 hits.
  3.40.50.1010. 1 hit.
InterPro: IPR024582. Limkain_b1_cons_dom.
  IPR024768. Marf1.
  IPR012677. Nucleotide-bd_a/b_plait.
  IPR021139. NYN_limkain-b1.
  IPR025605. OST-HTH/LOTUS_dom.
  IPR029060. PIN_domain-like.
  IPR000504. RRM_dom.
PANTHER: PTHR14379. PTHR14379. 4 hits.
Pfam: PF11608. Limkain-b1. 1 hit.
  PF01936. NYN. 1 hit.
  PF12872. OST-HTH. 5 hits.
SMART: SM00360. RRM. 2 hits.
SUPFAM: SSF54928. SSF54928. 2 hits.
PROSITE: PS51644. HTH_OST. 8 hits.
  PS50102. RRM. 1 hit.

Sequences (5)

Sequence status: Complete.

This entry describes 5 isoforms produced by alternative splicing.

Isoform 1 (identifier: Q9Y4F3-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.

        10         20         30         40         50
MMEGNGTENS CSRTRGWLQQ DNDAKPWLWK FSNCFSRPEQ TLPHSPQTKE
60 70 80 90 100
YMENKKVAVE LKDVPSPLHA GSKLFPAVPL PDIRSLQQPK IQLSSVPKVS
110 120 130 140 150
CCAHCPNEPS TSPMRFGGGG GGSGGTSSLI HPGALLDSQS TRTITCQVGS
160 170 180 190 200
GFAFQSASSL QNASARNNLA GIASDFPSMC LESNLSSCKH LPCCGKLHFQ
210 220 230 240 250
SCHGNVHKLH QFPSLQGCTS AGYFPCSDFT SGAPGHLEEH ISQSELTPHL
260 270 280 290 300
CTNSLHLNVV PPVCLKGSLY CEDCLNKPAR NSIIDAAKVW PNIPPPNTQP
310 320 330 340 350
APLAVPLCNG CGTKGTGKET TLLLATSLGK AASKFGSPEV AVAGQVLENL
360 370 380 390 400
PPIGVFWDIE NCSVPSGRSA TAVVQRIREK FFKGHREAEF ICVCDISKEN
410 420 430 440 450
KEVIQELNNC QVTVAHINAT AKNAADDKLR QSLRRFANTH TAPATVVLVS
460 470 480 490 500
TDVNFALELS DLRHRHGFHI ILVHKNQASE ALLHHANELI RFEEFISDLP
510 520 530 540 550
PRLPLKMPQC HTLLYVYNLP ANKDGKSVSN RLRRLSDNCG GKVLSITGCS
560 570 580 590 600
AILRFINQDS AERAQKRMEN EDVFGNRIIV SFTPKNRELC ETKSSNAIAD
610 620 630 640 650
KVKSPKKLKN PKLCLIKDAS EQSSSAKATP GKGSQANSGS ATKNTNVKSL
660 670 680 690 700
QELCRMESKT GHRNSEHQQG HLRLVVPTHG NSSAAVSTPK NSGVAEPVYK
710 720 730 740 750
TSQKKENLSA RSVTSSPVEK KDKEETVFQV SYPSAFSKLV ASRQVSPLLA
760 770 780 790 800
SQSWSSRSMS PNLLNRASPL AFNIANSSSE ADCPDPFANG ADVQVSNIDY
810 820 830 840 850
RLSRKELQQL LQEAFARHGK VKSVELSPHT DYQLKAVVQM ENLQDAIGAV
860 870 880 890 900
NSLHRYKIGS KKILVSLATG AASKSLSLLS AETMSVLQDA PACCLPLFKF
910 920 930 940 950
TDIYEKKFGH KLNVSDLYKL TDTVAIREQG NGRLVCLLPS SQARQSPLGS
960 970 980 990 1000
SQSHDGSSTN CSPIIFEELE YHEPVCRQHC SNKDFSEHEF DPDSYKIPFV
1010 1020 1030 1040 1050
ILSLKTFAPQ VHSLLQTHEG TVPLLSFPDC YIAEFGDLEV VQENQGGVPL
1060 1070 1080 1090 1100
EHFITCVPGV NIATAQNGIK VVKWIHNKPP PPNTDPWLLR SKSPVGNPQL
1110 1120 1130 1140 1150
IQFSREVIDL LKSQPSCVIP ISHFIPSYHH HFAKQCRVSD YGYSKLIELL
1160 1170 1180 1190 1200
EAVPHVLQIL GMGSKRLLTL THRAQVKRFT QDLLKLLKSQ ASKQVIVREF
1210 1220 1230 1240 1250
SQAYHWCFSK DWDVTEYGVC ELIDIVSEIP DTTICLSQQD NEMVICIPKR
1260 1270 1280 1290 1300
ERTQDEIERT KQFSKDVVDL LRHQPHFRMP FNKFIPSYHH HFGRQCKLAY
1310 1320 1330 1340 1350
YGFTKLLELF EAIPDTLQVL ECGEEKILTL TEVERFKALA AQFVKLLRSQ
1360 1370 1380 1390 1400
KDNCLMMTDL LTEYAKTFGY TFRLQDYDVS SISALTQKLC HVVKVADIES
1410 1420 1430 1440 1450
GRQIQLINRK SLRSLTAQLL VLLMSWEGTT HLSVEELKRH YESTHNTPLN
1460 1470 1480 1490 1500
PCEYGFMTLT ELLKSLPYLV EVFTNDKMEE CVKLTSLYLF AKNVRSLLHT
1510 1520 1530 1540 1550
YHYQQIFLHE FSMAYTKYVG ETLQPKTYGH SSVEELLGAI PQVVWIKGHG
1560 1570 1580 1590 1600
HKRIVVLKND MKSRLSSLSL SPANHENQPS EGERILEVPE SHTASELKLG
1610 1620 1630 1640 1650
ADGSGPSHTE QELLRLTDDS PVDLLCAPVP SCLPSPQLRP DPVILQSADL
1660 1670 1680 1690 1700
IQFEERPQEP SEIMILNQEE KMEIPIPGKS KTLTSDSSSS CISAAVPVPP
1710 1720 1730 1740
CPSSETSESL LSKDPVESPA KKQPKNRVKL AANFSLAPIT KL
Length: 1,742
Mass (Da): 192,859
Last modified: May 16, 2012 - v6
Checksum: 9A0C0687B93A6A4B

Isoform 2 (identifier: Q9Y4F3-3)

The sequence of this isoform differs from the canonical sequence as follows:
     1026-1084: Missing.

Length: 1,683
Mass (Da): 186,447
Checksum: FBB6FE3DF3E14D75

Isoform 3 (identifier: Q9Y4F3-4)

The sequence of this isoform differs from the canonical sequence as follows:
     336-338: Missing.

Length: 1,739
Mass (Da): 192,618
Checksum: AE0825D8F42C7200

Isoform 4 (identifier: Q9Y4F3-5)

The sequence of this isoform differs from the canonical sequence as follows:
     509-509: Missing.
     754-754: W → WS

Length: 1,742
Mass (Da): 192,818
Checksum: 68D45929E9D4BB88

Isoform 5 (identifier: Q9Y4F3-6)

The sequence of this isoform differs from the canonical sequence as follows:
     278-291: PARNSIIDAAKVWP → VRIFLFLKLGAAED
     293-1742: Missing.

Note: May be due to an intron retention.

Length: 292
Mass (Da): 31,458
Checksum: A36DF5EF146267CD
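
Each isoform above is described as a set of edits against canonical coordinates. As a rough sketch (not UniProt's own tooling), the Python snippet below applies such edits to a sequence and recomputes the isoform lengths. The function name `apply_edits`, the `ISOFORM_EDITS` table and the use of a placeholder string of `X` residues in place of the real canonical sequence are assumptions made for illustration; only the positions, replacement strings and the canonical length of 1,742 come from the entry.

```python
from typing import Dict, List, Tuple

# (start, end, replacement): 1-based inclusive positions on the canonical
# sequence; an empty replacement means the range is missing in the isoform.
Edit = Tuple[int, int, str]

CANONICAL_LENGTH = 1742  # length of isoform 1 as listed in the entry

# Edits transcribed from the isoform descriptions above.
ISOFORM_EDITS: Dict[str, List[Edit]] = {
    "Q9Y4F3-3": [(1026, 1084, "")],
    "Q9Y4F3-4": [(336, 338, "")],
    "Q9Y4F3-5": [(509, 509, ""), (754, 754, "WS")],
    "Q9Y4F3-6": [(278, 291, "VRIFLFLKLGAAED"), (293, 1742, "")],
}


def apply_edits(canonical: str, edits: List[Edit]) -> str:
    """Return the isoform sequence obtained by applying edits to canonical."""
    seq = canonical
    # Work from the C-terminal end so earlier coordinates stay valid.
    for start, end, replacement in sorted(edits, reverse=True):
        seq = seq[: start - 1] + replacement + seq[end:]
    return seq


# Placeholder sequence: only its length matters for the length check.
placeholder = "X" * CANONICAL_LENGTH
lengths = {
    isoform: len(apply_edits(placeholder, edits))
    for isoform, edits in ISOFORM_EDITS.items()
}

if __name__ == "__main__":
    for isoform, length in lengths.items():
        print(isoform, length)
```

Passing the real canonical sequence instead of the placeholder would yield the isoform sequences themselves; applying the edits from the highest position downward avoids having to shift coordinates after each deletion or insertion.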

Sequence caution

The sequence AAC31662 differs from that shown. Reason: Erroneous gene model prediction. (Curated)
The sequence AAH64914 differs from that shown. Contaminating sequence. Potential poly-A sequence. (Curated)
The sequence EAW53920 differs from that shown. Reason: Erroneous gene model prediction. (Curated)

Experimental Info

Feature key          Position(s)    Description                                Evidence    Length
Sequence conflict    451            T → A in AC026401 (PubMed:15616553).       Curated     1
Sequence conflict    941            S → R in BAA24860 (PubMed:9455477).        Curated     1
Sequence conflict    1541           P → S in AAI44516 (PubMed:15489334).       Curated     1

Alternative sequence

Feature key             Feature ID    Position(s)     Description                                    Evidence         Length
Alternative sequence    VSP_037755    278 – 291       PARNS…AKVWP → VRIFLFLKLGAAED in isoform 5.     1 publication    14
Alternative sequence    VSP_037756    293 – 1742      Missing in isoform 5.                          1 publication    1450
Alternative sequence    VSP_037757    336 – 338       Missing in isoform 3.                          1 publication    3
Alternative sequence    VSP_037758    509             Missing in isoform 4.                          1 publication    1
Alternative sequence    VSP_037759    754             W → WS in isoform 4.                           1 publication    1
Alternative sequence    VSP_022988    1026 – 1084     Missing in isoform 2.                          1 publication    59

Sequence databases

EMBL / GenBank / DDBJ:
  AK302667 mRNA. Translation: BAG63901.1.
  U95740 Genomic DNA. Translation: AAC31662.1. Sequence problems.
  AC026401 Genomic DNA. No translation available.
  CH471226 Genomic DNA. Translation: EAW53920.1. Sequence problems.
  BC064914 mRNA. Translation: AAH64914.2. Sequence problems.
  BC137165 mRNA. Translation: AAI37166.1.
  BC137170 mRNA. Translation: AAI37171.1.
  BC144514 mRNA. Translation: AAI44515.1.
  BC144515 mRNA. Translation: AAI44516.1.
  AB007890 mRNA. Translation: BAA24860.3.
  AB012134 mRNA. Translation: BAB82433.1.
CCDS: CCDS10562.2. [Q9Y4F3-1]
  CCDS53990.1. [Q9Y4F3-5]
  CCDS55991.1. [Q9Y4F3-4]
PIR: T00060.
RefSeq: NP_001171927.1. NM_001184998.1. [Q9Y4F3-5]
  NP_001171928.1. NM_001184999.1. [Q9Y4F3-4]
  NP_055462.2. NM_014647.3. [Q9Y4F3-1]
  XP_005255764.1. XM_005255707.1.
  XP_016879390.1. XM_017023901.1. [Q9Y4F3-5]
UniGene: Hs.173524.

Genome annotation databases

Ensembl: ENST00000396368; ENSP00000379654; ENSG00000166783. [Q9Y4F3-1]
  ENST00000548025; ENSP00000449376; ENSG00000166783. [Q9Y4F3-4]
  ENST00000551742; ENSP00000450309; ENSG00000166783. [Q9Y4F3-5]
  ENST00000621511; ENSP00000479383; ENSG00000277140. [Q9Y4F3-1]
  ENST00000632465; ENSP00000487685; ENSG00000277140. [Q9Y4F3-4]
  ENST00000632628; ENSP00000488025; ENSG00000277140. [Q9Y4F3-5]
GeneID: 9665.
KEGG: hsa:9665.
UCSC: uc002ddr.4. human. [Q9Y4F3-1]

Keywords - Coding sequence diversity

Alternative splicing

Cross-references

Organism-specific databases

CTD: 9665.
GeneCards: KIAA0430.
HGNC: HGNC:29562. KIAA0430.
HPA: HPA017992.
MIM: 614593. gene.
neXtProt: NX_Q9Y4F3.
OpenTargets: ENSG00000166783.
  ENSG00000277140.
PharmGKB: PA145148631.

Miscellaneous databases

ChiTaRS: KIAA0430. human.
EvolutionaryTrace: Q9Y4F3.
GeneWiki: KIAA0430.
GenomeRNAi: 9665.
PRO: Q9Y4F3.

Entry information

Entry name: MARF1_HUMAN
Accession
Primary (citable) accession number: Q9Y4F3
Secondary accession number(s): A8MSK2, B2RNX2, B4DYY9, B7ZMG1, B7ZMG2, F8VV09, Q6P1R6, Q8WYR2, Q9Y4J9
Entry history
Integrated into UniProtKB/Swiss-Prot: February 6, 2007
Last sequence update: May 16, 2012
Last modified: November 30, 2016
This is version 128 of the entry and version 6 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Miscellaneous

Keywords - Technical term

3D-structure, Complete proteome, Reference proteome

Documents

  1. Human chromosome 16
    Human chromosome 16: entries, gene names and cross-references to MIM
  2. MIM cross-references
    Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot
  3. PDB cross-references
    Index of Protein Data Bank (PDB) cross-references
  4. SIMILARITY comments
    Index of protein domains and families

Similar proteins

Links to similar proteins from the UniProt Reference Clusters (UniRef) at 100%, 90% and 50% sequence identity:
100%: UniRef100 combines identical sequences and sub-fragments with 11 or more residues from any organism into one UniRef entry.
90%: UniRef90 is built by clustering UniRef100 sequences that have at least 90% sequence identity to, and 80% overlap with, the longest sequence (a.k.a. the seed sequence).
50%: UniRef50 is built by clustering UniRef90 seed sequences that have at least 50% sequence identity to, and 80% overlap with, the longest sequence in the cluster.
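
The UniRef90 and UniRef50 criteria above reduce to two thresholds measured against the seed sequence. A minimal sketch of that membership test in Python follows; it assumes identity and overlap have already been computed by an alignment step and are given as fractions between 0 and 1. The function name `meets_uniref_criteria` and the constants are hypothetical, and the snippet does not reproduce the actual UniRef clustering procedure or the UniRef100 rule for identical sequences and sub-fragments.

```python
# Thresholds taken from the descriptions above; names are illustrative only.
IDENTITY_THRESHOLDS = {"UniRef90": 0.90, "UniRef50": 0.50}
MIN_OVERLAP = 0.80  # required overlap with the seed (longest) sequence


def meets_uniref_criteria(level: str, identity: float, overlap: float) -> bool:
    """Check a precomputed identity/overlap pair against one UniRef level.

    identity and overlap are fractions in [0, 1], both measured against
    the seed sequence of the candidate cluster.
    """
    return identity >= IDENTITY_THRESHOLDS[level] and overlap >= MIN_OVERLAP


if __name__ == "__main__":
    # A sequence 93% identical to the seed over 85% of its length.
    print(meets_uniref_criteria("UniRef90", 0.93, 0.85))
    # High identity but too little overlap with the seed.
    print(meets_uniref_criteria("UniRef90", 0.99, 0.60))
```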