AWS Command Line Interface (CLI) Tool
1. aws cli Installation
The AWS Command Line tool is used to interact with the storage service and can be scripted for automated workflows. Installation is summarized below; for full details, consult the official AWS CLI install guide.
CADES SHPC Users
- The aws client is provided via a software module, though you may install a local version in your home directory if you wish.
- From the SHPC login nodes:
  -bash-4.2$ module load python/3.6.3
  -bash-4.2$ aws --version
  aws-cli/1.16.12 Python/3.6.3 Linux/3.10.0-862.9.1.el7.x86_64 botocore/1.12.2
- Download from https://aws.amazon.com/cli/.
- See AWS macOS instructions.
You may encounter issues if awscli and awscli-plugin-endpoint are installed from different sources, e.g. one from your distribution's package manager (yum) and one from pip. Installing both via pip usually allows them to work together well.
📝 Note: It is recommended to install the components in a Python virtual environment, the instructions for which are available here.
If you wish to install system-wide (as root), you may do so with:
sudo pip install awscli
sudo pip install awscli-plugin-endpoint
2. Initial aws cli Profile Configuration
Specify IAM user credentials by editing your ~/.aws/credentials file, or run aws configure and paste the appropriate values into the prompts. The region and output format answers are stored in ~/.aws/config rather than in the credentials file. The resulting entries look like:
[default]
aws_access_key_id = <accessKey>
aws_secret_access_key = <secretKeyValue>
Default region name = <leave blank or us-east-1>
Default output format = <leave blank or text or json>
Changing Default S3 Endpoints and Multiple Profiles
By default the external Amazon S3 service is assumed, so if you wish to use AWS there is no need to change anything.
If you wish to use another S3 provider (such as an on-premises S3-compatible service), you must explicitly define the endpoint to use.
Example: to switch to an on-prem S3 endpoint and make it the endpoint for the default profile:
aws configure set plugins.endpoint awscli_plugin_endpoint
aws configure --profile default set s3.endpoint_url http://or-rda-s3.ornl.gov
aws configure --profile default set s3api.endpoint_url http://or-rda-s3.ornl.gov
📝 Note: The first command enables the "endpoint" plugin, which allows easy switching between multiple internal (S3) identities or external (AWS) accounts by passing a --profile argument. Your ~/.aws/credentials must have profiles and credentials defined for each identity.
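The same two endpoint settings can be applied to any additional named profile. The sketch below wraps them in a small shell function. The function name set_s3_endpoint is hypothetical, and the sketch assumes that the endpoint plugin has already been enabled and that credentials exist under the same profile name.

```shell
# Hypothetical helper: point both the s3 and s3api commands of a named
# profile at a custom endpoint (assumes awscli-plugin-endpoint is enabled).
set_s3_endpoint() {
  local profile="$1" url="$2"
  aws configure --profile "$profile" set s3.endpoint_url "$url"
  aws configure --profile "$profile" set s3api.endpoint_url "$url"
}

# Example call (profile name and URL taken from elsewhere in this guide):
# set_s3_endpoint cades-ops-s3 http://or-rda-s3.ornl.gov
```

After such a call, commands run with --profile cades-ops-s3 should be directed at that endpoint, while other profiles keep their own settings.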
Further information on configuring multiple named profiles is available in the official AWS CLI documentation.
3. Basic S3 Storage Operations
Integrated User Manual
The AWS CLI has help text integrated into it. To invoke it, use aws help. To get detailed help about a specific feature, build your command line and append help to the command. For example, if you want help with the S3 copy command, type:
aws s3 cp help
As we are dealing with the S3 service, we will almost always be specifying one of two commands to run: s3 or s3api.
Create a New Bucket
Buckets are storage areas similar to Unix volumes or Windows drives. Most s3 commands operate on a bucket, which must be specified.
Create a new bucket with:
aws s3 mb s3://mynewbucket
Listing S3 Buckets
Use ls without a bucket name to list all buckets visible to you:
aws s3 ls s3://
Or use the s3api command with list-buckets, which offers additional options:
aws s3api list-buckets
Display Capacity Used
Several methods exist for reporting S3 capacity utilization: the CLI, the utilization metrics in the AWS web console, and third-party tools.
To return the overall capacity used by a bucket (total bytes, followed by the number of objects):
aws s3api list-objects --bucket rda-sup-backups --output json --query "[sum(Contents[].Size), length(Contents[])]" --profile cades-ops-s3
[ 1277153105, 8667 ]
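The first number returned is a raw byte count. As a convenience, it can be converted to a human-readable figure. The helper below is a minimal sketch using awk with binary (1024-based) units; the name human_bytes is an assumption and is not part of the AWS CLI.

```shell
# Hypothetical helper: convert a byte count to a human-readable size
# using binary units (1 KiB = 1024 bytes).
human_bytes() {
  awk -v b="$1" 'BEGIN {
    split("B KiB MiB GiB TiB PiB", unit, " ")
    i = 1
    # Divide by 1024 until the value fits the current unit.
    while (b >= 1024 && i < 6) { b /= 1024; i++ }
    printf "%.2f %s\n", b, unit[i]
  }'
}

human_bytes 1277153105   # prints: 1.19 GiB
```

Applied to the example output above, the bucket holds roughly 1.19 GiB across 8667 objects.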
Copying Files Into and Out of S3
Copying files into S3 is very similar to copying files with cp on the Unix command line or with scp. As with all operations, the aws command is used together with the configured endpoint.
We specify the S3 service and that we want to copy files. The direction can be either local → S3 or S3 → local file system, simply by reversing the order of the two arguments.
aws s3 cp <local filename> s3://<bucket>/<remote filename>
aws s3 cp largefile s3://mynewbucket/largefile
As with all aws commands, --profile may optionally be added to specify the named profile and matching credentials to be used.
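The examples above show only the local → S3 direction. A sketch of the reverse direction, combined with --profile, might look like the following. The function name s3_fetch and its argument order are hypothetical; the underlying aws s3 cp command and --profile option are the ones described in this section.

```shell
# Hypothetical helper: download one object from S3 to a local path,
# using a named profile (falls back to "default" if none is given).
s3_fetch() {
  local bucket="$1" key="$2" dest="$3" profile="${4:-default}"
  aws s3 cp "s3://${bucket}/${key}" "${dest}" --profile "${profile}"
}

# Example call (bucket and file names from the examples above):
# s3_fetch mynewbucket largefile ./largefile cades-ops-s3
```

Swapping the two path arguments of aws s3 cp inside the function would turn it back into an upload.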
To list the files in a bucket, type:
aws s3 ls <bucket>
aws s3 ls mynewbucket
📝 Note: Object storage is non-hierarchical; directories do not exist. They are emulated through 'prefix' paths, designated in ls output with PRE. You can ls through a series of prefixes, much as you would a directory structure. Be sure to end the final prefix path with a trailing /
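To illustrate how prefixes appear in listings, the sketch below filters ls output down to the PRE entries only. The function name list_prefixes is hypothetical, and the sketch assumes prefix names contain no spaces, since awk splits fields on whitespace.

```shell
# Hypothetical helper: print only the prefix ("directory") names found
# directly under s3://<bucket>/<prefix>. A prefix, if given, should end in /.
list_prefixes() {
  local bucket="$1" prefix="${2:-}"
  # Prefix lines in `aws s3 ls` output have PRE as their first field.
  aws s3 ls "s3://${bucket}/${prefix}" | awk '$1 == "PRE" { print $2 }'
}

# Example calls:
# list_prefixes mynewbucket
# list_prefixes mynewbucket model_output/
```

Each name printed can be appended to the previous path to descend one level further.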
The aws s3 command provides a capability similar to that of the rsync command. As with the copy command, the direction of synchronization can be either to S3 or from S3. The <local directory> can be relative or absolute. Syncing is significantly faster than copying files one at a time if you have a moderate number of files.
aws s3 sync <local directory> s3://<bucket>/directory
aws --quiet s3 sync /home/xyz/project/model_output s3://mynewbucket/model_output
When the sync operation is used, a status line is continually updated with the current command statistics. Above we see the optional parameter --quiet, which suppresses this output. This is useful when capturing command output, as the progress display otherwise fills log files with a large amount of unintelligible output.
Syncing creates destination PRE (prefix) paths automatically.
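Because a sync can transfer many files at once, it may help to preview it first. The s3 sync command accepts a --dryrun option that lists the operations without performing them. The wrapper below is a minimal sketch; the function name sync_preview is hypothetical.

```shell
# Hypothetical helper: show what a sync would transfer without
# actually copying anything.
sync_preview() {
  local src="$1" dest="$2"
  aws s3 sync --dryrun "$src" "$dest"
}

# Example call (paths from the example above):
# sync_preview /home/xyz/project/model_output s3://mynewbucket/model_output
```

Once the preview looks right, the same source and destination can be passed to aws s3 sync without --dryrun.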
Removing a single file:
aws s3 rm s3://<bucket>/<filename>
aws s3 rm s3://mynewbucket/largefile
Removing a directory
With the addition of the --recursive option, an entire directory (that is, every object under a prefix) can be removed.
aws s3 rm --recursive s3://mynewbucket/large_directory
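A recursive removal cannot be undone, so a preview may be worthwhile. The sketch below uses the --dryrun option of s3 rm to count how many objects would be deleted. The function name count_rm_targets is hypothetical, and the sketch assumes each dry-run line begins with "(dryrun) delete:", which may differ between CLI versions.

```shell
# Hypothetical helper: count the objects a recursive rm would delete,
# without deleting anything. Assumes dry-run lines start with
# "(dryrun) delete:".
count_rm_targets() {
  local target="$1"
  aws s3 rm --recursive --dryrun "$target" | grep -c '^(dryrun) delete:'
}

# Example call:
# count_rm_targets s3://mynewbucket/large_directory
```

If the count matches expectations, the removal can be run for real by repeating the command without --dryrun.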