Backup to S3

  1. Requirements
  2. Define a remote
  3. Check remote status
  4. Install sample package and run operations
  5. Push run to S3
  6. Restore run from S3
  7. Delete run from S3
  8. Manage runs in S3

This guide describes how to use S3 to back up runs.

Requirements

Define a remote

Guild remotes are defined in ~/.guild/config.yml. You must edit this file to add and modify remote definitions.

In this guide we add a remote named s3‑backup that will serve as a backup location in S3 for runs.

Modify ~/.guild/config.yml and add the following at the end of the file:

remotes:
  s3-backup:
    type: s3
    description: Backups on S3
    bucket: <your S3 bucket>
    root: guild-backups

Note

If you already have a remotes section, add s3‑backup within that section—don’t add a second remotes section.

Save your changes to ~/.guild/config.yml.
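As a quick check against the duplicate-section mistake described in the note above, you can count the top-level remotes keys in the file. This is a minimal sketch that assumes a POSIX shell with grep; count_remotes_sections is a hypothetical helper name and is not part of Guild.

```shell
# Hypothetical helper (not part of Guild): count top-level "remotes:" keys
# in a config file. A top-level YAML key starts in the first column, so
# indented occurrences are not counted.
count_remotes_sections() {
    # grep -c prints the number of matching lines (it prints 0 and exits
    # non-zero when there are no matches).
    grep -c '^remotes:' "$1"
}
```

Running count_remotes_sections ~/.guild/config.yml should print 1 after the edit above. A larger number suggests a second remotes section was added by mistake.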

In a command console, list available remotes:

guild remotes

Guild should show:

s3-backup        Backups on S3

If you don’t see the remote or Guild exits with an error, verify the steps above and try again.

Check remote status

Use remote status to check status for s3‑backup:

guild remote status s3-backup

Guild should exit with this error message:

guild: missing required AWS_ACCESS_KEY_ID environment variable

Guild requires AWS access keys to check remote status in S3. You must define the following two environment variables to use S3 remotes in Guild:

AWS_ACCESS_KEY_ID
Access key ID for your AWS security credentials.
AWS_SECRET_ACCESS_KEY
Secret access key for your AWS security credentials.

Note

If you don’t have these values, refer to Requirements above for help.

Define the required environment variables, replacing <...> with your access key values:

AWS_ACCESS_KEY_ID=<your access key id>
AWS_SECRET_ACCESS_KEY=<your secret access key>
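Guild reads these keys from its process environment, so in a POSIX shell the variables generally need to be exported rather than only assigned. The sketch below is one way to confirm that both are visible to child processes before checking status again. It assumes a POSIX shell with env and grep; require_aws_env is a hypothetical helper name and is not part of Guild.

```shell
# In a POSIX shell, export the variables so that guild can read them, e.g.:
#   export AWS_ACCESS_KEY_ID='<your access key id>'
#   export AWS_SECRET_ACCESS_KEY='<your secret access key>'

# Hypothetical helper (not part of Guild): succeed only if both variables
# are exported with non-empty values.
require_aws_env() {
    missing=0
    for name in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY; do
        # env lists only exported variables; "=." requires a non-empty value.
        if ! env | grep "^${name}=." >/dev/null; then
            echo "missing required $name environment variable" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For example, require_aws_env && guild remote status s3-backup runs the status check only when both keys are exported. The helper prints variable names only, never their values.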

Check status again:

guild remote status s3-backup

Guild should show the following (the bucket name reflects your configuration):

s3-backup (S3 bucket guild-dev-backup) is available

If Guild exits with an error, verify that the requirements above are met. If you cannot resolve the issue, open an issue on GitHub.

Important

Do not post AWS security credentials to GitHub issues or otherwise make them available in plain text to others.

Install sample package and run operations

To illustrate backing up runs to S3, we need to first generate some runs. We use the gpkg.mnist package in this guide, but any Guild project or package will work.

Install gpkg.mnist:

guild install gpkg.mnist

To verify installation, list available operations for mnist:

guild operations mnist

Verify that you see the following:

gpkg.mnist/cnn:evaluate     Evaluate a trained CNN
gpkg.mnist/cnn:train        Train the CNN
gpkg.mnist/logreg:evaluate  Evaluate a trained logistic regression
gpkg.mnist/logreg:train     Train the logistic regression
gpkg.mnist/samples:prepare  Generate a set of sample MNIST images

Run train for the logreg model:

guild run logreg:train

Press Enter to confirm.

Guild trains the model for ten epochs. When the run is finished, list available runs:

guild runs

You should see the following run (ID and dates will differ):

[1:d6e12108]  gpkg.mnist/logreg:train  2018-10-26 04:23:46  completed

If you see other runs, that’s okay. We back up only this run in this guide.

Push run to S3

With our S3 remote and a training run, we’re ready to back up.

Use the push command to back up the latest run to s3‑backup:

guild push --operation gpkg.mnist/logreg:train 1 s3-backup

The use of ‑‑operation limits the command to operations matching the specified value. The value 1 indicates that only the latest run should be copied. The command can be read to mean “copy the latest MNIST logreg training run to the S3 backup”.

Verify that the gpkg.mnist/logreg:train run will be copied and press Enter.

Guild copies the run to the configured S3 bucket under the guild‑backups root path. You may use the AWS console or other S3 bucket browser to verify.

Note

You can copy all available runs by omitting run selection options—e.g. by using guild push s3‑backup. Refer to the push command for details.
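If you back up runs regularly, the push command shown above can be wrapped in a small shell function. This is a sketch only: backup_latest_run and the GUILD_DRY_RUN variable are hypothetical names introduced for illustration and are not part of Guild. The guild push arguments are the same ones used in this section.

```shell
# Hypothetical wrapper (not part of Guild) around the push command above.
# Usage: backup_latest_run OPERATION REMOTE
# If GUILD_DRY_RUN (an assumed variable name) is non-empty, the command is
# printed instead of executed.
backup_latest_run() {
    operation="${1:-}"   # e.g. gpkg.mnist/logreg:train
    remote="${2:-}"      # e.g. s3-backup
    if [ -z "$operation" ] || [ -z "$remote" ]; then
        echo "usage: backup_latest_run OPERATION REMOTE" >&2
        return 2
    fi
    if [ -n "${GUILD_DRY_RUN:-}" ]; then
        echo "guild push --operation $operation 1 $remote"
        return 0
    fi
    # The value 1 selects only the latest matching run; Guild still prompts
    # for confirmation before copying.
    guild push --operation "$operation" 1 "$remote"
}
```

With GUILD_DRY_RUN=1 set, backup_latest_run gpkg.mnist/logreg:train s3-backup prints the command so you can review it before running it for real.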

Next, list the runs in the S3 remote:

guild runs -r s3-backup

Guild shows the run, but in this case it’s listed from S3.

Restore run from S3

In this section we delete our local run and restore it from S3.

Delete the local gpkg.mnist/logreg:train run:

guild runs rm --operation gpkg.mnist/logreg:train 1

Confirm that our sample run is displayed in the confirmation prompt and press Enter to confirm.

We use ‑‑operation and the value 1 to ensure that only the latest gpkg.mnist/logreg:train run is deleted.

Next, we use the S3 remote to restore the run.

Note

You can recover a deleted run in Guild using the runs restore command, provided it’s not been permanently deleted. In this case we restore the run from S3 instead.

Use the pull command to copy runs from s3‑backup:

guild pull --operation gpkg.mnist/logreg:train 1 s3-backup

Verify the run to be copied and press Enter.

Guild copies the run from S3 to the local system.

Verify that the run is restored:

guild runs
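To check for the restored run without reading the listing by eye, you can filter the output for the operation name. This is a minimal sketch assuming a POSIX shell with grep and a listing format like the sample line shown earlier in this guide; has_run_for_operation is a hypothetical helper name and is not part of Guild.

```shell
# Hypothetical helper (not part of Guild): read a run listing on stdin and
# succeed if any line mentions the given operation.
has_run_for_operation() {
    # -F matches the operation name literally rather than as a pattern.
    grep -F -- "$1" >/dev/null
}
```

For example, guild runs | has_run_for_operation gpkg.mnist/logreg:train succeeds when the listing contains a run for that operation. It does not distinguish between several runs of the same operation, so treat it as a rough check.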

Delete run from S3

In this section we demonstrate Guild’s run management support for S3 by deleting the backup run.

Delete the latest gpkg.mnist/logreg:train run in S3:

guild runs rm --operation gpkg.mnist/logreg:train 1 -r s3-backup

Verify the run to be deleted and press Enter.

Guild deletes the run in S3. You can verify using the AWS console or an S3 browser.

List the runs in S3:

guild runs -r s3-backup

Manage runs in S3

Guild supports the following run management operations for S3 remotes:

Delete a run in S3 (can be undeleted using restore).
Show run information for a run in S3.
Label a run in S3.
List runs in S3.
Permanently delete runs in S3.
Restore a (non-permanently) deleted run in S3.

You may have noticed that when we deleted the run in S3 in the previous section, Guild did not actually delete it but moved it to a trash path under the bucket root. Guild does this to support restoring deleted runs.

You can list deleted runs using the ‑‑deleted option:

guild runs --deleted -r s3-backup

You can restore deleted runs on S3 using:

guild runs restore -r s3-backup

Note

This command restores all deleted runs. If you want to restore a subset of runs, use the run filter options available with the runs restore command.

Finally, if you want to permanently delete the run, use the ‑‑permanent option:

guild runs rm \
  --permanent \
  --operation gpkg.mnist/logreg:train 1 \
  -r s3-backup