lconstantin
CSO Senior Writer

S3 shadow buckets leave AWS accounts open to compromise

News
08 Aug 2024, 7 mins

Attackers can gain access to AWS accounts or sensitive data by creating, in advance, S3 storage buckets with predictable names that various services and tools will then use automatically.

[Image: old metal bucket with water in a garden. Credit: Tom Korcak / Shutterstock]

Researchers have found a new way to attack AWS services or third-party projects that automatically provision AWS S3 storage buckets. Dubbed Shadow Resource, the new attack vector can result in AWS account takeover, remote code execution, or sensitive data leaks.

Researchers from security firm Aqua Security identified six AWS services that were creating predictably named S3 buckets and that were vulnerable to the new hijacking technique. They presented their findings in a talk at the Black Hat USA security conference this week.

Shadow Resource involves attackers creating buckets in advance in other AWS regions and then waiting for the targeted users to enable the vulnerable services in those regions, causing sensitive files and configurations to be stored in the attacker-controlled buckets.

The AWS services vulnerable to this technique that Aqua identified were CloudFormation, Glue, EMR, SageMaker, ServiceCatalog and CodeStar. However, the researchers noted that other AWS services and third-party open-source tools that exhibit similar S3 bucket provisioning behaviour could still be vulnerable to this attack vector.

“AWS is aware of this research. We can confirm that we have fixed this issue, all services are operating as expected, and no customer action is required,” an AWS spokesperson told CSO via email.

Shadow buckets with backdoor potential

The Aqua researchers began their investigation when they noticed that AWS CloudFormation was creating an S3 bucket in the background every time it was enabled in a new AWS geographical region. The S3 bucket was used to store CloudFormation templates created by the user and had a name that followed the format [fixed prefix]-[unique hash value]-[AWS region name], for example cf-templates-123abcdefghi-us-east-1.

S3 bucket names are unique across the entire AWS infrastructure, so the researchers wondered what would happen if an attacker registered in advance the bucket name CloudFormation would be expected to create in a different region that a user might later enable.

The prefix and hash part of the bucket name remain the same across regions — only the region part of the bucket name changes. So, if an attacker determines the hash, they can preregister that bucket in a region not yet used by the user. Guessing the hash is not possible, but Aqua researchers managed to find such hashes in public repositories on GitHub or in open bug tickets.
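Because only the region suffix changes, the names worth auditing follow mechanically from one known bucket name. The following is a minimal Python sketch of that derivation, assuming the known name ends with its region; the function name `sibling_bucket_names` and the short region list are illustrative and not part of any AWS tooling.

```python
def sibling_bucket_names(known_name, regions):
    """Return the names the same prefix and hash would produce in other regions.

    known_name: a bucket name of the form [prefix]-[hash]-[region].
    regions:    region names to consider (hypothetical list supplied by the caller).
    """
    # Try longer region names first so a longer suffix is never mistaken for a shorter one.
    current = next(
        (r for r in sorted(regions, key=len, reverse=True)
         if known_name.endswith("-" + r)),
        None,
    )
    if current is None:
        raise ValueError("known_name does not end with any of the given regions")

    # Everything before the region suffix (prefix plus hash) stays constant.
    stem = known_name[: -len(current) - 1]
    return [f"{stem}-{r}" for r in regions if r != current]


if __name__ == "__main__":
    regions = ["us-east-1", "eu-west-2", "ap-southeast-1"]
    for name in sibling_bucket_names("cf-templates-123abcdefghi-us-east-1", regions):
        print(name)
```

A defender could feed the resulting names into a bucket-ownership check before enabling the service in a new region.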

The next question they had was whether CloudFormation would use the existing attacker-created bucket when the user deployed the service in a region or if it would give an error in trying to create it. They found that CloudFormation does respond with an error, but only if the bucket isn’t configured for public access, because in that case it cannot write files to it.

So, if the attacker configures a very permissive policy to allow the actions needed by the service and enables public access, CloudFormation will simply use the rogue bucket.

The issue’s impact depends on what the vulnerable service stores in the bucket. CloudFormation is an infrastructure-as-code tool, so its bucket holds the templates that are then used to automatically deploy infrastructure stacks as defined by the user.

These templates can contain sensitive information, such as environment variables, credentials, and more. But it gets worse: An attacker can inject a backdoor into a template saved in the bucket, which would then be executed in the user’s account. For example, a rogue Lambda function injected into the template could create a new admin role on the account that the attacker can then use.

Predictable S3 bucket names using account IDs

The CloudFormation attack depends on a leak: the name of a bucket that the service has already created for the user in some region must have been exposed, for example in a code repository. Other AWS services that create S3 buckets automatically use even more predictable naming patterns. For example, AWS EMR (Elastic MapReduce) generates S3 buckets with the name aws-emr-studio-[account-ID]-[region] while AWS SageMaker uses sagemaker-[region]-[account-ID].
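To make the predictability concrete, the two patterns quoted above can be expanded for any account ID and region list. This is an illustrative Python sketch only; the dictionary keys, the function name `predicted_bucket_names`, and the sample account ID are assumptions made for the example.

```python
# Naming patterns as quoted in the article; the dictionary keys are hypothetical labels.
PATTERNS = {
    "emr_studio": "aws-emr-studio-{account_id}-{region}",
    "sagemaker": "sagemaker-{region}-{account_id}",
}


def predicted_bucket_names(account_id, regions):
    """Map each service label to the bucket names it would use per region."""
    return {
        service: [
            pattern.format(account_id=account_id, region=region)
            for region in regions
        ]
        for service, pattern in PATTERNS.items()
    }


if __name__ == "__main__":
    # 123456789012 is a placeholder account ID, not a real one.
    names = predicted_bucket_names("123456789012", ["us-east-1", "eu-central-1"])
    for service, buckets in names.items():
        print(service, buckets)
```

An organization can run this against its own account ID to list the buckets it should verify it owns.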

According to AWS documentation, the AWS account ID is not considered secret or sensitive information. As such, it’s much more likely to be exposed in multiple places than a unique hash generated by one specific service.

AWS EMR is a service that enables users to process and analyze large data sets using frameworks such as Apache Hadoop, Apache Spark, Apache Hive, and Jupyter Notebook. S3 buckets created by EMR Studio, which are used to store sensitive configuration files, are susceptible to the same attack.

For example, an attacker can inject a rogue function in a Jupyter notebook (.ipynb) stored by the victim’s EMR service in the rogue shadow bucket that would result in a cross-site scripting (XSS) vulnerability in the Jupyter Notebook interface. This vulnerability could be used to redirect the user to a spoofed AWS login page to steal their credentials, the researchers said.

AWS SageMaker, a service for building, training, and deploying machine learning models, is similarly vulnerable because SageMaker Canvas automatically sets up a predictable S3 storage bucket. If an attacker pre-registers this bucket, they can gain access to sensitive model training data or even poison the dataset to create inaccurate models.

The researchers also warned that many open-source tools that organizations use to deploy resources in their AWS environments create S3 buckets with predictable names, often relying on the AWS account ID, constant prefixes, and the region name. Whether these tools are vulnerable depends on whether they give an error if a bucket already exists, or if they go ahead and use the existing one, which could be owned by an attacker. The impact also depends on the type of files and resources stored in those buckets.

The researchers searched GitHub for AWS account ID patterns and got almost 160,000 results. There are also existing lists of AWS account IDs built by others, as well as lists of S3 bucket names that could include account IDs. AWS account IDs can also be derived from AWS Access Key IDs using known techniques.
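One publicly described technique treats the characters after the four-character prefix of an access key ID as Base32 and reads the account ID from the leading bits. The Python sketch below follows that description; it is understood to apply only to newer-format key IDs, the function name is an assumption, and the sample key ID is used purely for illustration.

```python
import base64


def account_id_from_access_key_id(access_key_id):
    """Decode the 12-digit AWS account ID embedded in a newer-format access key ID."""
    # Drop the 4-character type prefix (for example AKIA or ASIA).
    trimmed = access_key_id[4:]
    # The remaining 16 characters are Base32 and decode to 10 bytes.
    decoded = base64.b32decode(trimmed)
    # The account ID sits in the first 6 bytes, after the top bit
    # and before the lowest 7 bits.
    value = int.from_bytes(decoded[:6], byteorder="big")
    mask = 0x7FFFFFFFFF80
    account = (value & mask) >> 7
    # Account IDs are 12 digits and may start with zero.
    return f"{account:012d}"


if __name__ == "__main__":
    print(account_id_from_access_key_id("ASIAQNZGKIQY56JQ7WML"))
```

This illustrates why an access key ID that appears in logs or repositories also reveals the account ID it belongs to.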

Bucket monopoly attack — and mitigations

To maximize their chances of success, attackers could create shadow S3 buckets using predictable name patterns in all of the AWS regions that an organization doesn’t use yet. AWS currently has 33 regions and it’s unlikely that an organization uses all of them. The researchers call this a “bucket monopoly attack.”

First, the attackers find a service or open-source tool that generates S3 buckets with predictable names based on AWS account IDs. Then they identify organizations that use that service or tool and find their account IDs. Determining which regions they don’t yet use for the service or tool is trivial because bucket names are unique across the service, making it easy to check whether they already exist.

One mitigation proposed by the Aqua Security researchers is for organizations to define a scoped policy for the roles used or assumed by the services or tools they want to use and include the aws:ResourceAccount condition element in the policy. This can be used to check that the AWS account that owns a resource such as an S3 bucket matches the user’s own AWS account ID provided in the condition.
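A scoped policy of this kind might look roughly like the document built below. This Python sketch only assembles the JSON; the action list, the wildcard resource, and the function name are assumptions chosen for illustration and would need to be narrowed to what a given service actually requires.

```python
import json


def scoped_s3_policy(account_id):
    """Build an IAM policy that allows S3 access only to buckets owned by account_id."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Hypothetical action set; restrict to what the service needs.
                "Action": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"],
                "Resource": "arn:aws:s3:::*",
                # Requests succeed only when the bucket's owning account matches.
                "Condition": {
                    "StringEquals": {"aws:ResourceAccount": account_id}
                },
            }
        ],
    }


if __name__ == "__main__":
    # 123456789012 is a placeholder account ID.
    print(json.dumps(scoped_s3_policy("123456789012"), indent=2))
```

With such a condition attached to the role, a write to a same-named bucket owned by another account would be denied rather than silently succeeding.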

Organizations that want to check whether buckets that follow a predictable name pattern for certain services are actually owned by them can use the command: aws s3api list-objects-v2 --bucket <BUCKET_NAME> --expected-bucket-owner <AWS_ACCOUNT_ID>. If the reply is Access Denied then the checked bucket is not under your account despite including your account ID in its name.
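The same ownership check can be scripted. The Python sketch below assumes a boto3 S3 client is passed in by the caller; the function name is hypothetical, and only an Access Denied reply is treated as a mismatch, so other errors (such as a missing bucket) are re-raised for the caller to handle.

```python
def bucket_owned_by(s3_client, bucket_name, account_id):
    """Return False if S3 answers Access Denied when account_id is the expected owner.

    s3_client is expected to be a boto3 S3 client, for example boto3.client("s3").
    """
    try:
        # ExpectedBucketOwner makes S3 reject the call when the owner differs.
        s3_client.list_objects_v2(
            Bucket=bucket_name,
            ExpectedBucketOwner=account_id,
            MaxKeys=1,
        )
        return True
    except Exception as exc:
        # botocore's ClientError carries the service error code in exc.response.
        code = getattr(exc, "response", {}).get("Error", {}).get("Code")
        if code == "AccessDenied":
            return False
        raise


# Usage (requires boto3 and AWS credentials; the bucket name is a placeholder):
#   import boto3
#   s3 = boto3.client("s3")
#   bucket_owned_by(s3, "sagemaker-us-east-1-123456789012", "123456789012")
```

Passing the client in keeps the function easy to exercise with a stub, and lets the caller choose credentials and region.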

Tools that automatically generate bucket names using a predictable pattern based on AWS account IDs should transition to using unique hashes and random identifiers in the bucket names for each region.
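For tool authors, a random component can be added with the standard library. This is a minimal Python sketch, assuming a hypothetical tool prefix and an 8-byte random suffix; the validation pattern reflects the general S3 naming constraints (3 to 63 characters, lowercase letters, digits, and hyphens) rather than every rule.

```python
import re
import secrets

# Simplified check: 3-63 chars, lowercase letters, digits, hyphens,
# starting and ending with a letter or digit.
_BUCKET_NAME_RE = re.compile(r"^[a-z0-9][a-z0-9-]{1,61}[a-z0-9]$")


def unpredictable_bucket_name(prefix, region, entropy_bytes=8):
    """Return a bucket name with a fresh random hex component for this region."""
    # token_hex draws from a cryptographically strong source.
    suffix = secrets.token_hex(entropy_bytes)
    name = f"{prefix}-{suffix}-{region}"
    if not _BUCKET_NAME_RE.fullmatch(name):
        raise ValueError(f"invalid bucket name: {name!r}")
    return name


if __name__ == "__main__":
    # "mytool" is a placeholder prefix.
    print(unpredictable_bucket_name("mytool", "eu-west-1"))
```

Because the name can no longer be recomputed from the account ID, a tool using this approach has to record the generated name, for example in its own state or configuration, so it can find the bucket again.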
