Downloaded data seems incomplete or corrupted - how can I get help with download problems?
Last modified May 15, 2015
FTP downloads
Every folder on our FTP server contains a file called RELEASE.metalink that specifies the size and MD5 checksum of every file in that folder, e.g.
ftp://ftp.uniprot.org/pub/databases/uniprot/knowledgebase/RELEASE.metalink
Metalink is an extensible metadata file format that describes one or more computer files available for download. It facilitates file verification and recovery from data corruption and lists alternate download sources (mirror URIs).
Various command line download tools, e.g. cURL version 7.30 or higher and aria2, support metalink.
Example: The following command will download all files in the current_release/ folder and verify their MD5 checksums:
curl --metalink ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/RELEASE.metalink
The files are downloaded from one of the alternative locations listed in the metalink file. If one FTP server goes down during a download, a metalink-aware program can automatically switch to another mirror location. Some programs can also download segments from several FTP locations at the same time, which can make downloads much faster.
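If your download tool does not support metalink, the same MD5 check can be done by hand. The following Python sketch reads the MD5 checksums out of a metalink file and compares one of them with a downloaded file. It assumes the file uses the usual Metalink layout, with "file" elements carrying a "name" attribute and a nested "hash" element of type "md5"; the function names are illustrative and not part of any UniProt tool.

```python
import hashlib
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace such as '{urn:...}file' down to 'file'."""
    return tag.rsplit("}", 1)[-1]


def md5_checksums(metalink_xml):
    """Return {file name: md5 hex digest} for every file listed in the metalink.

    Assumption: each <file name="..."> element contains a <hash type="md5">
    element somewhere below it.
    """
    root = ET.fromstring(metalink_xml)
    checksums = {}
    for elem in root.iter():
        if local_name(elem.tag) != "file":
            continue
        for child in elem.iter():
            is_hash = local_name(child.tag) == "hash"
            if is_hash and child.get("type", "").lower() == "md5":
                checksums[elem.get("name")] = (child.text or "").strip().lower()
    return checksums


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 digest of a file without loading it fully into memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify(path, expected_md5):
    """Return True if the file at 'path' matches the expected MD5 checksum."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

A typical use would be to call md5_checksums() on the contents of a downloaded RELEASE.metalink and then verify() on each local file; a mismatch means that file should be downloaded again.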
Please note that UniProt can be downloaded from the consortium member FTP sites at three different geographical locations:
USA: ftp://ftp.uniprot.org/pub/databases/uniprot
UK: ftp://ftp.ebi.ac.uk/pub/databases/uniprot
Switzerland: ftp://ftp.expasy.org/databases/uniprot
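For tools that do not switch mirrors automatically, the fallback can be scripted. The following Python sketch tries the three locations above in turn. It assumes that a file has the same relative path under each mirror root, which is not guaranteed by this page; download_with_fallback and its parameters are illustrative names.

```python
from urllib.request import urlretrieve

# Mirror roots as listed above.
MIRRORS = [
    "ftp://ftp.uniprot.org/pub/databases/uniprot",
    "ftp://ftp.ebi.ac.uk/pub/databases/uniprot",
    "ftp://ftp.expasy.org/databases/uniprot",
]


def download_with_fallback(relative_path, destination, fetch=urlretrieve):
    """Try each mirror in order and return the URL that succeeded.

    Assumption: 'relative_path' (e.g. 'current_release/RELEASE.metalink')
    exists under every mirror root. 'fetch' is injectable so the logic can
    be exercised without network access.
    """
    errors = []
    for mirror in MIRRORS:
        url = f"{mirror}/{relative_path.lstrip('/')}"
        try:
            fetch(url, destination)
            return url
        except OSError as exc:  # urllib's URLError is a subclass of OSError
            errors.append(f"{url}: {exc}")
    raise OSError("all mirrors failed:\n" + "\n".join(errors))
```

After a successful download, the file should still be checked against its MD5 checksum, since this sketch only handles connection failures.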
HTTP downloads
Long-running HTTP transfers are unreliable (HTTP streams tend to fail after a while due to packet loss), so large downloads should be split into smaller chunks using the “offset” and “limit” parameters. These are described in our FAQ for programmatic access.
1) Start by retrieving the number of results for your query from the “X-Total-Results” response header, as in the example “Download all UniProt sequences for a given organism in FASTA format”.
2) If the number of results x is greater than 50000, repeat your query and append the following to the URL:
&offset=0&limit=50000
&offset=50000&limit=50000
&offset=100000&limit=50000
etc.
Also use compress=yes. For example (these URLs use a limit of 50):
http://www.uniprot.org/uniprot/?query=organism:%22Homo%20sapiens%20(Human)%20[9606]%22&fil=&offset=0&limit=50&compress=yes&format=fasta
http://www.uniprot.org/uniprot/?query=organism:%22Homo%20sapiens%20(Human)%20[9606]%22&fil=&offset=50&limit=50&compress=yes&format=fasta
http://www.uniprot.org/uniprot/?query=organism:%22Homo%20sapiens%20(Human)%20[9606]%22&fil=&offset=100&limit=50&compress=yes&format=fasta
etc.
3) Once you have downloaded the chunks, use gzip -t to check the integrity of each file. Then uncompress the chunks and concatenate them into a single download file.
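The steps above can be sketched in Python as follows. The code builds one URL per chunk from a known total result count, checks each downloaded chunk in the same spirit as gzip -t, and concatenates the uncompressed chunks. It is a minimal sketch: the helper names are illustrative, the base URL is taken from the example URLs above, and fetching the URLs and reading the “X-Total-Results” header are left to the caller.

```python
import gzip
import shutil
import zlib
from urllib.parse import urlencode

BASE_URL = "http://www.uniprot.org/uniprot/"  # from the example URLs above


def chunk_urls(query, total_results, limit=50000, fmt="fasta"):
    """Return one URL per chunk, covering offsets 0, limit, 2*limit, ..."""
    urls = []
    for offset in range(0, total_results, limit):
        params = {
            "query": query,
            "offset": offset,
            "limit": limit,
            "compress": "yes",
            "format": fmt,
        }
        urls.append(BASE_URL + "?" + urlencode(params))
    return urls


def is_valid_gzip(path):
    """Decompress the whole file and report whether it is intact."""
    try:
        with gzip.open(path, "rb") as handle:
            while handle.read(1 << 20):
                pass
        return True
    except (OSError, EOFError, zlib.error):
        return False


def concatenate_chunks(chunk_paths, output_path):
    """Uncompress the chunks in order and write them to a single file."""
    with open(output_path, "wb") as out:
        for path in chunk_paths:
            if not is_valid_gzip(path):
                raise ValueError(f"corrupt or truncated chunk: {path}")
            with gzip.open(path, "rb") as chunk:
                shutil.copyfileobj(chunk, out)
```

Any chunk that fails the integrity check only needs to be downloaded again with the same offset and limit, which is the main benefit of splitting the download.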
