Lots of people have problems with their Amazon Glacier uploads timing out. For some, switching to a different AWS region may help. Otherwise, the key to dealing with timeouts is to use multipart uploads.
I use the glacier-cmd command-line tool. It seems to be popular and well maintained, and it has a slightly more coherent set of commands for managing your uploads than the standard AWS client. The multipart approach I used to work around timeouts is explained below.
Set up
First you will need to install and configure glacier-cmd as per the instructions on GitHub.
Create a file vault if you don't already have one. This is a container in which all your uploads will reside:
glacier-cmd mkvault [VaultName]
Uploads and resuming
Then begin an upload, specifying a small part size so that as little of the upload as possible is lost if there is a failure. The value is in megabytes, so --partsize 1 means parts of 1 MB (1048576 bytes):
glacier-cmd upload [VaultName] ./filename --description "file description" --partsize 1
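One thing to check before settling on a part size of 1: Glacier caps a multipart upload at 10,000 parts, and the part size must be a power of two in megabytes (from 1 MB up to 4 GB). With 1 MB parts that works out to a ceiling of roughly 10.4 GB per archive. The sketch below is my own helper, not part of glacier-cmd, and the function name is made up for illustration. It prints the smallest part size that fits a given file size in bytes.

```shell
# min_partsize_mb BYTES
# Hypothetical helper: prints the smallest power-of-two part size (in MB)
# that keeps an archive of BYTES bytes within Glacier's 10,000-part limit.
min_partsize_mb() {
    bytes=$1
    mb=1
    # Double the part size until 10,000 parts are enough to hold the file.
    while [ $(( mb * 1048576 * 10000 )) -lt "$bytes" ]; do
        mb=$(( mb * 2 ))
    done
    # Glacier does not accept parts larger than 4 GB (4096 MB).
    if [ "$mb" -gt 4096 ]; then
        echo "file too large for a single Glacier archive" >&2
        return 1
    fi
    echo "$mb"
}
```

For example, min_partsize_mb 7413500000 prints 1, so a file of that size still fits with --partsize 1, whereas min_partsize_mb 20000000000 prints 2.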
You may get the dreaded Timeout exception:
boto.glacier.exceptions.UnexpectedHTTPResponseError: Expected 204, got (408, code=RequestTimeoutException, message=Request timed out.)
You should be able to see the failed upload if you run the listmultiparts command:
glacier-cmd listmultiparts [VaultName]
+------------------------------------+--------------------+--------------------------+-----------------+------------------------------------------------------------+
| MultipartUploadId | ArchiveDescription | CreationDate | PartSizeInBytes | VaultARN |
+------------------------------------+--------------------+--------------------------+-----------------+------------------------------------------------------------+
| AbCdEfghIhijklm_...zxZxZXzxZ00XZxx | file_description | 2016-11-05T10:26:00.000Z | 1048576 | arn:aws:glacier:us-east-1:012617398452:vaults/[VaultName] |
+------------------------------------+--------------------+--------------------------+-----------------+------------------------------------------------------------+
You can then resume the upload by referring to the MultipartUploadId from the output above:
glacier-cmd upload --resume --uploadid "AbCdEfghIhijklm_...zxZxZXzxZ00XZxx" [VaultName] ./filename --description "file description" --partsize 1
If the upload fails again, you can keep resuming and eventually it should complete. For me, this happened frequently enough to warrant running the command in a simple bash-script loop:
for ((n=0;n<10;n++)); do /usr/local/bin/glacier-cmd upload --resume --uploadid "AbCdEfghIhijklm_...zxZxZXzxZ00XZxx" [VaultName] ./filename --description "file description" --partsize 1 ; done
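The loop above always runs ten times, even if an early attempt finishes the upload. A slightly more careful variant stops at the first success and pauses between attempts. This is only a sketch: the retry function and the RETRY_DELAY variable are names I made up, and it assumes glacier-cmd exits with a non-zero status when an upload fails, which I have not verified for every failure mode.

```shell
# retry MAX CMD [ARGS...]
# Hypothetical helper: runs CMD up to MAX times and stops at the first success.
# RETRY_DELAY (seconds between attempts, default 30) is an assumed setting.
retry() {
    max=$1
    shift
    attempt=1
    while :; do
        # Assumes the command signals failure through its exit status.
        "$@" && return 0
        echo "attempt $attempt of $max failed" >&2
        [ "$attempt" -ge "$max" ] && return 1
        attempt=$(( attempt + 1 ))
        sleep "${RETRY_DELAY:-30}"
    done
}

# Usage, with the placeholders from this post filled in by you:
# retry 10 /usr/local/bin/glacier-cmd upload --resume --uploadid "<MultipartUploadId>" \
#     "<VaultName>" ./filename --description "file description" --partsize 1
```

The function returns 0 as soon as one attempt succeeds and 1 if all attempts fail, so it can be chained with && or used in an if statement.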
Clean up
If you have made multiple attempts to upload the same file, you may find there are 'orphaned' file fragments that will show up in the listmultiparts command:
glacier-cmd listmultiparts [VaultName]
You can run abortmultipart to delete them:
glacier-cmd abortmultipart [VaultName] AbCdEfghIhijklm_...zxZxZXzxZ00XZxx
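If there are several orphaned uploads, copying each ID by hand gets tedious. The sketch below pulls the first column out of a table shaped like the listmultiparts output shown earlier. The multipart_ids function is my own helper, and it assumes the table layout matches what is shown in this post and that each ID is printed in full on a single row.

```shell
# multipart_ids
# Hypothetical helper: reads a listmultiparts-style table on stdin and
# prints the value of the first column (MultipartUploadId) for each data row.
multipart_ids() {
    awk -F'|' '
        /^\|/ {
            id = $2
            gsub(/^[ \t]+|[ \t]+$/, "", id)    # trim the cell padding
            # Skip the header row and any empty cells.
            if (id != "" && id != "MultipartUploadId") print id
        }'
}

# Usage (this aborts EVERY unfinished upload in the vault, including any
# you might still want to resume):
# glacier-cmd listmultiparts "<VaultName>" | multipart_ids |
#     while read -r id; do glacier-cmd abortmultipart "<VaultName>" "$id"; done
```

It is worth running the pipeline without the while loop first, to confirm the IDs it prints are the ones you expect to delete.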
Listing your successfully uploaded files
Unfortunately, viewing an up-to-date list of your uploaded files isn't as simple as a 'dir' or 'ls' command. Retrieving the vault inventory is a job that AWS runs in the background, notifying you on completion. You start the job like this:
glacier-cmd inventory [VaultName]
+---------------------------+------------------------------------+
| Header | Value |
+---------------------------+------------------------------------+
| Status | Inventory retrieval in progress. |
| Job ID | AbCdEfghIhijklm_...zxZxZXzxZ00XZxx |
| Job started (time in UTC) | 2016-11-18T23:23:00.000Z |
+---------------------------+------------------------------------+
If you have set up notifications for Glacier via the AWS console, you may get an email, SMS, or other notification on completion. You can also check on the status of the running job using the listjobs command:
glacier-cmd listjobs [VaultName]
+-----------------------------------------------------------+------------------------------------+------------+--------------------+--------------------------+------------+
| VaultARN | Job ID | Archive ID | Action | Initiated | Status |
+-----------------------------------------------------------+------------------------------------+------------+--------------------+--------------------------+------------+
| arn:aws:glacier:us-east-1:012617398452:vaults/[VaultName] | AbCdEfghIhijklm_...zxZxZXzxZ00XZxx | None | InventoryRetrieval | 2016-11-18T23:23:00.000Z | InProgress |
+-----------------------------------------------------------+------------------------------------+------------+--------------------+--------------------------+------------+
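Rather than re-running listjobs by hand, the Status column can be read in a script. The sketch below finds the "Job ID" and "Status" columns from the header row and prints the status for one job. The job_status function is my own helper, it assumes the table layout shown above, and InProgress is the only status value I rely on because it is the only one shown in this post.

```shell
# job_status JOB_ID
# Hypothetical helper: reads a listjobs-style table on stdin and prints
# the Status cell of the row whose "Job ID" cell equals JOB_ID.
job_status() {
    awk -F'|' -v want="$1" '
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        /^\|/ {
            if (!idcol) {
                # First table row is the header: locate the two columns by name.
                for (i = 1; i <= NF; i++) {
                    if (trim($i) == "Job ID") idcol = i
                    if (trim($i) == "Status") stcol = i
                }
                next
            }
            if (trim($idcol) == want) print trim($stcol)
        }'
}

# Usage: wait until the job is no longer reported as InProgress,
# checking every 15 minutes.
# while [ "$(glacier-cmd listjobs "<VaultName>" | job_status "<JobID>")" = "InProgress" ]; do
#     sleep 900
# done
```

If the job ID is not in the table at all, the function prints nothing, so the loop in the usage comment also ends in that case.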
Once the job completes, which may take up to 24 hours, you can then use the inventory command again to view the up-to-date file listing:
glacier-cmd inventory [VaultName]
Inventory of vault: arn:aws:glacier:us-east-1:012617398452:vaults/[VaultName]
Inventory Date: 2016-12-22T12:00:00Z
Content:
+------------------------------------+---------------------+----------------------+------------------------------------+------------+
| Archive ID | Archive Description | Uploaded | SHA256 tree hash | Size |
+------------------------------------+---------------------+----------------------+------------------------------------+------------+
| AbCdEfghIhijklm_...zxZxZXzxZ00XZxx | file one desc | 2016-11-05T21:00:00Z | AbCdEfghIhijklm_...zxZxZXzxZ00XZxx | 725500000 |
| AbCdEfghIhijklm_...zxZxZXzxZ00XZxx | file two desc | 2016-11-16T07:00:00Z | AbCdEfghIhijklm_...zxZxZXzxZ00XZxx | 7413500000 |
+------------------------------------+---------------------+----------------------+------------------------------------+------------+
This vault contains 2 items, total size 10.2 GB.
That's it. Good luck using AWS Glacier for your own archive storage!