Using Amazon Glacier and handling 408 RequestTimeoutException errors

Filed under: TechNotes, AWS — lars @ 08:30:00 pm

Lots of people have problems with their Amazon Glacier uploads timing out.  For some, switching to a different AWS region may help.  Otherwise, the key to dealing with timeouts is to use multipart uploads.

I use the glacier-cmd command-line tool.  It seems to be popular and well maintained, and it has a slightly more coherent set of commands for managing your uploads than the standard AWS client.  The multipart approach I used to work around timeouts is explained below.


Set up

First you will need to install and configure glacier-cmd as per the instructions on GitHub.

Create a file vault if you don't already have one.  This is a container in which all your uploads will reside:

glacier-cmd mkvault [VaultName]


Uploads and resuming

Then begin an upload, specifying a small part size (here 1 MB) so that as little data as possible has to be re-sent if there is a failure:

glacier-cmd upload [VaultName] ./filename --description "file description" --partsize 1
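
A small part size has a cost: Glacier limits a multipart upload to 10,000 parts, and part sizes are powers of two in MB.  The sketch below works out the smallest part size that keeps a file within that limit.  The function name min_partsize_mb is made up for this example and is not part of glacier-cmd; whether glacier-cmd adjusts the part size for you is not something this sketch relies on.

```shell
#!/usr/bin/env bash
# min_partsize_mb is a hypothetical helper, not a glacier-cmd command.
# Given a file size in bytes, print the smallest part size in MB
# (a power of two) that needs no more than 10,000 parts.
min_partsize_mb() {
    local bytes=$1
    local max_parts=10000
    local mb=1
    local part_bytes parts
    # Glacier part sizes range from 1 MB to 4096 MB.
    while [ "$mb" -le 4096 ]; do
        part_bytes=$((mb * 1024 * 1024))
        # Round up: a final partial part still counts as a part.
        parts=$(( (bytes + part_bytes - 1) / part_bytes ))
        if [ "$parts" -le "$max_parts" ]; then
            echo "$mb"
            return 0
        fi
        mb=$((mb * 2))
    done
    echo "file too large for a single multipart upload" >&2
    return 1
}
```

For example, a 7,413,500,000-byte file fits in 7,071 parts of 1 MB, so --partsize 1 is fine, while a 20 GiB file would need --partsize 4.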

You may get the dreaded Timeout exception:

boto.glacier.exceptions.UnexpectedHTTPResponseError: Expected 204, got (408, code=RequestTimeoutException, message=Request timed out.)

You should be able to see the failed upload if you run the listmultiparts command:

glacier-cmd listmultiparts [VaultName]
+------------------------------------+--------------------+--------------------------+-----------------+------------------------------------------------------------+
|         MultipartUploadId          | ArchiveDescription |       CreationDate       | PartSizeInBytes |                          VaultARN                          |
+------------------------------------+--------------------+--------------------------+-----------------+------------------------------------------------------------+
| AbCdEfghIhijklm_...zxZxZXzxZ00XZxx |  file_description  | 2016-11-05T10:26:00.000Z |     1048576     | arn:aws:glacier:us-east-1:012617398452:vaults/[VaultName]  |
+------------------------------------+--------------------+--------------------------+-----------------+------------------------------------------------------------+

You can then resume the upload by referring to the MultipartUploadId from the output above:

glacier-cmd upload --resume --uploadid "AbCdEfghIhijklm_...zxZxZXzxZ00XZxx" [VaultName] ./filename --description "file description" --partsize 1

If the upload fails again, you can keep resuming and eventually it should complete.  For me, this happened frequently enough to warrant running the command in a simple bash loop.  The loop retries up to 10 times and stops at the first success, assuming glacier-cmd exits with a non-zero status when an upload fails:

for ((n=0;n<10;n++)); do    /usr/local/bin/glacier-cmd upload --resume --uploadid "AbCdEfghIhijklm_...zxZxZXzxZ00XZxx" [VaultName] ./filename --description "file description" --partsize 1 && break ;   done
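
If you retry often, a small reusable wrapper with a pause between attempts may be kinder to both you and the service.  This is only a sketch: retry_until_success is a made-up helper name, and it assumes the wrapped command signals failure through a non-zero exit status.

```shell
#!/usr/bin/env bash
# retry_until_success is a hypothetical helper, not part of glacier-cmd.
# Usage: retry_until_success MAX_ATTEMPTS DELAY_SECONDS command [args...]
# Runs the command until it exits with status 0 or the attempts run out.
retry_until_success() {
    local max_attempts=$1
    local delay=$2
    shift 2
    local attempt
    for ((attempt = 1; attempt <= max_attempts; attempt++)); do
        if "$@"; then
            echo "succeeded on attempt $attempt" >&2
            return 0
        fi
        echo "attempt $attempt of $max_attempts failed" >&2
        # Pause before the next try, but not after the last one.
        if ((attempt < max_attempts)); then
            sleep "$delay"
        fi
    done
    return 1
}

# Example call (commented out so the sketch runs without glacier-cmd):
# retry_until_success 10 30 /usr/local/bin/glacier-cmd upload --resume \
#     --uploadid "AbCdEfghIhijklm_...zxZxZXzxZ00XZxx" [VaultName] ./filename \
#     --description "file description" --partsize 1
```

The wrapper returns 1 if every attempt fails, so a calling script can tell a finished upload from one that gave up.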


Clean up

If you have made multiple attempts to upload the same file, you may find there are 'orphaned' file fragments that will show up in the listmultiparts command:

glacier-cmd listmultiparts [VaultName]

You can run abortmultipart to delete them:

glacier-cmd abortmultipart [VaultName] AbCdEfghIhijklm_...zxZxZXzxZ00XZxx
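
If there are many fragments, you could pull the IDs out of the listmultiparts table rather than copying them by hand.  The sketch below assumes the output looks like the table shown earlier, with the upload ID in the first column; extract_upload_ids is a made-up helper name.

```shell
#!/usr/bin/env bash
# extract_upload_ids is a hypothetical helper, not part of glacier-cmd.
# It reads a listmultiparts table on stdin and prints the first column
# of every data row, skipping the border lines and the header row.
extract_upload_ids() {
    awk -F'|' '
        /^[|]/ {
            id = $2
            gsub(/^[ \t]+|[ \t]+$/, "", id)   # trim the cell padding
            if (id != "" && id != "MultipartUploadId") print id
        }
    '
}

# Example pipeline (commented out so the sketch runs without glacier-cmd):
# glacier-cmd listmultiparts [VaultName] | extract_upload_ids |
#     while read -r id; do
#         glacier-cmd abortmultipart [VaultName] "$id"
#     done
```

Be careful with the pipeline form: it aborts every pending multipart upload in the vault, including one you may still want to resume.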


Listing your successfully uploaded files

Unfortunately, viewing an up-to-date list of your uploaded files isn't as simple as a 'dir' or 'ls' command.  Retrieving the vault inventory is a job that AWS runs in the background, notifying you on completion.  You start it like this:

glacier-cmd inventory [VaultName]
+---------------------------+------------------------------------+
|           Header          |               Value                |
+---------------------------+------------------------------------+
|           Status          |  Inventory retrieval in progress.  |
|           Job ID          | AbCdEfghIhijklm_...zxZxZXzxZ00XZxx |
| Job started (time in UTC) |      2016-11-18T23:23:00.000Z      |
+---------------------------+------------------------------------+

If you have set up notifications for Glacier via the AWS console, you may get an email, SMS or other notification on completion.  You can also check on the status of the running job using the listjobs command:

glacier-cmd listjobs [VaultName]
+-----------------------------------------------------------+------------------------------------+------------+--------------------+--------------------------+------------+
|                          VaultARN                         |               Job ID               | Archive ID |       Action       |        Initiated         |   Status   |
+-----------------------------------------------------------+------------------------------------+------------+--------------------+--------------------------+------------+
| arn:aws:glacier:us-east-1:012617398452:vaults/[VaultName] | AbCdEfghIhijklm_...zxZxZXzxZ00XZxx |    None    | InventoryRetrieval | 2016-11-18T23:23:00.000Z | InProgress |
+-----------------------------------------------------------+------------------------------------+------------+--------------------+--------------------------+------------+
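
Rather than re-running listjobs by hand, you could poll it from a script.  The sketch below only looks for the InventoryRetrieval and InProgress strings shown in the table above; job_in_progress and wait_for_inventory are made-up helper names, and the polling interval is up to you.

```shell
#!/usr/bin/env bash
# Both functions are hypothetical helpers, not part of glacier-cmd.

# Succeeds if the listjobs table on stdin still shows an
# InventoryRetrieval job with the status InProgress.
job_in_progress() {
    awk '/InventoryRetrieval/ && /InProgress/ { found = 1 } END { exit !found }'
}

# Usage: wait_for_inventory INTERVAL_SECONDS command [args...]
# Re-runs the listing command until no inventory job is in progress.
# Returns 1 if the listing command itself fails.
wait_for_inventory() {
    local interval=$1
    shift
    local output
    while :; do
        output=$("$@") || return 1
        printf '%s\n' "$output" | job_in_progress || return 0
        sleep "$interval"
    done
}

# Example call (commented out so the sketch runs without glacier-cmd):
# wait_for_inventory 900 glacier-cmd listjobs [VaultName] &&
#     glacier-cmd inventory [VaultName]
```

Since the job can take many hours, a long interval such as the 15 minutes in the example is plenty.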

Once the job completes, which may take up to 24 hours, you can then use the inventory command again to view the up-to-date file listing:

glacier-cmd inventory [VaultName]
Inventory of vault: arn:aws:glacier:us-east-1:012617398452:vaults/[VaultName]
Inventory Date: 2016-12-22T12:00:00Z

Content:
+------------------------------------+---------------------+----------------------+------------------------------------+------------+
|             Archive ID             | Archive Description |       Uploaded       |          SHA256 tree hash          |    Size    |
+------------------------------------+---------------------+----------------------+------------------------------------+------------+
| AbCdEfghIhijklm_...zxZxZXzxZ00XZxx |    file one desc    | 2016-11-05T21:00:00Z | AbCdEfghIhijklm_...zxZxZXzxZ00XZxx | 725500000  |
| AbCdEfghIhijklm_...zxZxZXzxZ00XZxx |    file two desc    | 2016-11-16T07:00:00Z | AbCdEfghIhijklm_...zxZxZXzxZ00XZxx | 7413500000 |
+------------------------------------+---------------------+----------------------+------------------------------------+------------+
This vault contains 2 items, total size 10.2 GB.

That's it.  Good luck using AWS Glacier for your own archive storage!
