How can I download the sequences corresponding to a specified domain or region, or the sequences of mature chains or peptides, from a list of UniProt entries?

Last modified February 1, 2021

Download the sequences of all annotated disintegrin domains

Run your query, e.g. to retrieve the UniProtKB entries annotated as containing
disintegrin domains (alternatively, start from a list of identifiers).

Then click on "Download" and choose to download the results in GFF format.
You can modify the GFF file as follows:

Keep only the lines containing your domain/region (e.g. "Disintegrin", "Cytoplasmic" or "Transit") and discard all other lines (e.g. using grep). The retained lines include the start and end positions of the domains/regions.

Transform the relevant lines (e.g. using a scripting language, or a word processor) from

Q9R158 UniProtKB Domain 392 478 . . . Note=Disintegrin
Q10741 UniProtKB Domain 457 551 . . . Note=Disintegrin
O14672 UniProtKB Domain 457 551 . . . Note=Disintegrin

to

Q9R158[392-478]
Q10741[457-551]
O14672[457-551]

and ignore all other lines.
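
The filtering and transformation steps above can be sketched in Python. This is a minimal illustration, not an official UniProt tool: the function name gff_to_ranges and its parameters are hypothetical, and it assumes each GFF line has the nine standard columns (accession, source, feature type, start, end, score, strand, phase, attributes), separated by tabs or, as in the example above, by whitespace.

```python
def gff_to_ranges(lines, feature_type="Domain", note="Disintegrin"):
    """Return 'ACCESSION[start-end]' strings for the matching GFF lines.

    Hypothetical helper: keeps lines whose feature type equals
    `feature_type` and whose attributes contain 'Note=<note>'.
    """
    ranges = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and GFF header/comment lines
        # Assumption: tab-separated GFF; fall back to whitespace splitting
        # for lines pasted as in the example above.
        fields = line.split("\t") if "\t" in line else line.split(None, 8)
        if len(fields) < 9:
            continue  # not a complete feature line
        accession, ftype, start, end = fields[0], fields[2], fields[3], fields[4]
        attributes = fields[8]
        if ftype != feature_type:
            continue
        if note is not None and f"Note={note}" not in attributes:
            continue
        ranges.append(f"{accession}[{start}-{end}]")
    return ranges


SAMPLE = [
    "Q9R158 UniProtKB Domain 392 478 . . . Note=Disintegrin",
    "Q10741 UniProtKB Domain 457 551 . . . Note=Disintegrin",
    "O14672 UniProtKB Domain 457 551 . . . Note=Disintegrin",
]

if __name__ == "__main__":
    # Prints the three lines shown above, e.g. Q9R158[392-478]
    print("\n".join(gff_to_ranges(SAMPLE)))
```

Writing the returned strings to a text file, one per line, gives a file in the form shown above.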

You can then upload the file obtained by modifying the GFF to the "Retrieve/ID mapping" service and retrieve the corresponding entries. To download the subsequences, select the format "FASTA (source list)" from the download menu.

If you only have a short list of entries, you can also select the domains manually from the entry views by clicking on "Add to basket" at the right-hand side of the feature descriptions in the "Family and domains" section of these entries. When you have finished selecting your domains, open the basket and click on "Download".

Download the sequences of all mature chains or peptides

There is no pre-computed database carrying the sequence data for mature chains or peptides.
However, for any given query, or even for the complete database, you can proceed as described above: download the results in GFF format and follow the same steps, replacing "Domain" with "Chain" and/or "Peptide".
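
As a rough sketch of this variant, the Python function below selects lines by feature type only ("Chain" and/or "Peptide") and emits the same ACCESSION[start-end] form. The name mature_ranges is hypothetical, and the accession X00001 in the usage comment is invented for illustration. Skipping features whose coordinates are not plain integers is a defensive assumption, not documented UniProt behaviour.

```python
def mature_ranges(gff_lines, feature_types=("Chain", "Peptide")):
    """Return 'ACCESSION[start-end]' for each Chain/Peptide feature line."""
    wanted = set(feature_types)
    out = []
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and GFF header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            # Assumption: fall back to whitespace-separated columns
            fields = line.split(None, 8)
        if len(fields) < 5 or fields[2] not in wanted:
            continue
        accession, start, end = fields[0], fields[3], fields[4]
        # Defensive assumption: keep only features with numeric coordinates
        if start.isdigit() and end.isdigit():
            out.append(f"{accession}[{start}-{end}]")
    return out


# Usage with an invented line (X00001 is not a real accession):
# mature_ranges(["X00001\tUniProtKB\tChain\t20\t150\t.\t.\t.\tID=EXAMPLE"])
```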

