S3 buckets power the internet. They store backups, media assets, application logs, database exports, and millions of records of personal data. They are also among the most commonly misconfigured resources in cloud environments. A single checkbox left unchecked, a single wildcard principal in a bucket policy, or a legacy ACL set to “public-read” has led to some of the largest data breaches in cloud computing history.

This post dissects how S3 bucket exposures happen at a technical level, walks through real-world breaches, demonstrates the CLI commands attackers and defenders both use, and provides a concrete hardening framework you can apply today.

The S3 Permission Model: ACLs, Bucket Policies, and Block Public Access

Understanding how S3 controls access requires understanding three overlapping layers that have accumulated over AWS’s history.

Layer 1: Access Control Lists (ACLs)

ACLs were S3’s original access control mechanism, predating IAM. They are XML-based permissions attached to individual buckets and objects. The critical predefined grantee groups are:

  • http://acs.amazonaws.com/groups/global/AllUsers — unauthenticated internet access
  • http://acs.amazonaws.com/groups/global/AuthenticatedUsers — any AWS account (not just yours)

The canned ACL public-read grants READ to AllUsers, so an object carrying it is retrievable by anyone with the URL. Historically, the AWS console offered a single-click “Make Public” option that set this ACL without additional confirmation.

Since April 2023, new buckets are created with ACLs disabled (Object Ownership set to “Bucket owner enforced”) and with all four Block Public Access settings turned on. Buckets created before that change keep whatever configuration they had, and ACLs can be re-enabled on any bucket, so legacy public ACLs are still found in practice.

Layer 2: Bucket Policies

Bucket policies are JSON IAM resource-based policies that attach to a bucket. The most dangerous pattern is a wildcard principal:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PublicReadGetObject",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::my-company-data/*"
    }
  ]
}

The "Principal": "*" means any identity on the internet, authenticated or not.
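The wildcard can also sit inside a mapping or a list, so comparing the principal against the string "*" alone misses some policies. A minimal sketch of a more tolerant check, assuming the policy has already been fetched and parsed into a dict. The function names are hypothetical, and labelling any statement with a Condition as "conditional" is a simplification for triage, not the logic AWS uses to decide whether a policy is public:

```python
def statement_principal_is_wildcard(statement):
    """Return True if an Allow statement names every principal ("*")."""
    if statement.get("Effect") != "Allow":
        return False
    principal = statement.get("Principal")
    if principal == "*":
        return True
    if isinstance(principal, dict):
        aws = principal.get("AWS", [])
        values = [aws] if isinstance(aws, str) else list(aws)
        return "*" in values
    return False


def classify_statement(statement):
    """Label a statement 'public', 'conditional', or 'scoped'."""
    if not statement_principal_is_wildcard(statement):
        return "scoped"
    # A Condition may narrow the wildcard (source IP, VPC endpoint, org ID),
    # so these statements need a human look rather than an automatic verdict.
    return "conditional" if statement.get("Condition") else "public"
```

Applied to the policy above, the single statement is classified as "public".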

Layer 3: Block Public Access (BPA)

Introduced in November 2018 in response to a wave of public bucket disclosures, Block Public Access is a meta-setting that overrides ACLs and bucket policies. It has four independent controls:

Setting                  What It Blocks
BlockPublicAcls          Rejects PUT requests that include public ACLs
IgnorePublicAcls         Ignores existing public ACLs
BlockPublicPolicy        Rejects bucket policies that grant public access
RestrictPublicBuckets    Restricts access to buckets with public policies to only AWS services and authorized users

Block Public Access can be configured at the account level (covering all buckets) or at the individual bucket level. S3 applies the most restrictive combination of the two, so a bucket-level setting cannot loosen a control that is enabled at the account level.
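One way to reason about the two levels is that each of the four controls is effectively on if it is on at either level. A small sketch of that rule, assuming both configurations are available as dicts in the shape the S3 APIs return under PublicAccessBlockConfiguration (the function name is hypothetical, and None stands for "nothing configured at that level"):

```python
BPA_KEYS = (
    "BlockPublicAcls",
    "IgnorePublicAcls",
    "BlockPublicPolicy",
    "RestrictPublicBuckets",
)


def effective_block_public_access(account_cfg, bucket_cfg):
    """Combine account-level and bucket-level settings, most restrictive wins."""
    account_cfg = account_cfg or {}
    bucket_cfg = bucket_cfg or {}
    return {
        key: bool(account_cfg.get(key)) or bool(bucket_cfg.get(key))
        for key in BPA_KEYS
    }
```

A bucket with every control switched off is still fully protected when the account-level configuration has all four enabled, which is why the account level is the better place to enforce the baseline.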

Real-World Breaches

Capital One (2019) — SSRF to Metadata to S3

The Capital One breach is the canonical AWS misconfiguration case study, though it is often mischaracterized as a simple public bucket exposure.

What actually happened:

  1. Paige Thompson, a former AWS engineer, identified a misconfigured WAF (Web Application Firewall) running on an EC2 instance that proxied requests without filtering the destination.
  2. She crafted an SSRF (Server-Side Request Forgery) request through the WAF to http://169.254.169.254/latest/meta-data/iam/security-credentials/ — the EC2 Instance Metadata Service (IMDS).
  3. The IMDS returned temporary IAM credentials (AccessKeyId, SecretAccessKey, SessionToken) associated with the EC2 instance’s attached IAM role.
  4. That role could list every bucket in the account (the IAM action is s3:ListAllMyBuckets) and had s3:GetObject on multiple buckets containing Capital One customer data.
  5. Thompson exfiltrated data from over 700 S3 folders containing 100 million+ customer records, including Social Security numbers and bank account information.

Date: The breach occurred between March and July 2019 and was disclosed on July 29, 2019. The OCC fined Capital One $80 million in August 2020.

The buckets were not publicly accessible. The vulnerability was the overly permissive IAM role attached to an EC2 instance with an exploitable SSRF vector. This demonstrates that “not public” does not mean “secure.”
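A role like this can be caught by reviewing its policy documents for S3 permissions that apply to every resource. The following is a rough sketch of such a check, assuming the policy document has already been retrieved and parsed into a dict; the function names are hypothetical and the heuristic ignores conditions, NotAction, and NotResource:

```python
def _as_list(value):
    """Normalize a policy field that may be a string, a list, or absent."""
    if value is None:
        return []
    return [value] if isinstance(value, str) else list(value)


def broad_s3_statements(policy_document):
    """Return Allow statements that grant S3 actions on all resources."""
    statements = policy_document.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]
    flagged = []
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        actions = [a.lower() for a in _as_list(stmt.get("Action"))]
        resources = _as_list(stmt.get("Resource"))
        touches_s3 = any(a == "*" or a.startswith("s3:") for a in actions)
        all_resources = any(r in ("*", "arn:aws:s3:::*") for r in resources)
        if touches_s3 and all_resources:
            flagged.append(stmt)
    return flagged
```

Expect s3:ListAllMyBuckets to show up here, since it can only be granted on "*". The findings that matter are read and write actions such as s3:GetObject or s3:* with an unrestricted resource.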

Toyota Vehicle Data (2023): Public for Nearly a Decade

In May 2023, Toyota Motor Corporation disclosed that a cloud misconfiguration had exposed vehicle data for about 2.15 million customers in Japan to the public internet. The data, managed by its subsidiary Toyota Connected Corporation, had been publicly accessible from November 2013 until April 2023, nearly ten years.

A developer had misconfigured access controls when migrating data to a cloud environment. The data included vehicle identification numbers (VINs), in-vehicle device IDs, vehicle locations, and timestamps. Toyota identified the exposure through an internal data governance audit rather than external notification.

Lesson: Misconfigured resources can persist for years undetected without continuous monitoring. A one-time audit at deployment is insufficient.

GrayKey and Cellebrite Configuration Files Exposed

In 2018, researchers discovered that configuration files belonging to law enforcement agencies using GrayKey (iPhone forensic tool by Grayshift) and Cellebrite mobile forensics platforms were stored in publicly accessible S3 buckets. The files contained device serial numbers, case numbers, and operational details. The agencies were unaware the third-party vendors’ integration software had written data to publicly accessible storage.

Lesson: Third-party software and integrations write to cloud storage in ways that asset owners do not always control or monitor.

Attack Flow: Enumerating and Exploiting Public Buckets

Step 1: Discover Candidate Bucket Names

Attackers enumerate bucket names through multiple channels: DNS brute force, Certificate Transparency log monitoring (bucket names often appear in TLS certificates for custom domain endpoints), job postings and source code repositories referencing bucket names, and dedicated search tools.

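# bucket-stream watches Certificate Transparency logs and tests bucket names
# derived from newly issued certificates. It is run from a clone of its
# repository rather than installed as a package.
git clone https://github.com/eth0izzle/bucket-stream.git
cd bucket-stream
pip3 install -r requirements.txt
python3 bucket-stream.py --only-interesting

# Manual DNS-based enumeration
for name in company-backup company-data company-logs company-assets; do
  host "${name}.s3.amazonaws.com"
done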
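Defenders can run the same kind of wordlist against their own naming conventions to see which guessable names they actually own. A minimal sketch of a candidate generator follows; the function name, the separators, and the simplified naming pattern are assumptions for illustration (the real S3 naming rules have further restrictions, such as reserved prefixes):

```python
import re

# Simplified S3 naming rule: 3-63 characters, lowercase letters, digits,
# hyphens and dots, starting and ending with a letter or digit.
_BUCKET_NAME = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")


def candidate_bucket_names(org, words, separators=("-", ".", "")):
    """Combine an organization name with common words into valid bucket names."""
    org = org.lower()
    candidates = []
    for word in words:
        for sep in separators:
            for name in (f"{org}{sep}{word}", f"{word}{sep}{org}"):
                name = name.lower()
                if not _BUCKET_NAME.match(name) or ".." in name:
                    continue  # not a name S3 would accept
                if name not in candidates:
                    candidates.append(name)
    return candidates
```

For example, candidate_bucket_names("company", ["backup", "logs"]) yields names such as company-backup and logs.company, which can then be compared against the output of list-buckets.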

Step 2: Check Bucket Accessibility

# Check if anonymous listing is allowed
aws s3 ls s3://target-bucket-name --no-sign-request

# Retrieve bucket ACL (may work without credentials on public buckets)
aws s3api get-bucket-acl --bucket target-bucket-name --no-sign-request

# Attempt to get a known object
aws s3 cp s3://target-bucket-name/backup.sql . --no-sign-request
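The HTTP status code of an unsigned request already says a lot about a bucket, which is what the commands above rely on. A hedged sketch of that interpretation using only the standard library; the function names and labels are hypothetical, and the probe assumes the virtual-hosted endpoint form, which does not work over TLS for bucket names containing dots. Only run it against buckets you are authorized to test:

```python
import urllib.error
import urllib.request


def classify_anonymous_response(status_code):
    """Interpret the status of an unsigned HEAD request on a bucket."""
    if status_code == 200:
        return "public-listing"        # anonymous callers may list the bucket
    if status_code == 403:
        return "exists-access-denied"  # bucket exists, anonymous access refused
    if status_code == 404:
        return "no-such-bucket"
    if status_code in (301, 307):
        return "exists-other-region"   # redirected to the bucket's home region
    return "unknown"


def probe_bucket(bucket_name, timeout=5):
    """Send an unsigned HEAD request and return the HTTP status code."""
    url = f"https://{bucket_name}.s3.amazonaws.com/"
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code
```

A 403 is still useful to an attacker, because it confirms the name is taken, and to a defender, because it shows the bucket is not listable anonymously.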

Step 3: If Credentials Available — Enumerate All Buckets

Once an attacker has AWS credentials (via SSRF, exposed .env file, or leaked key), enumeration becomes comprehensive:

# List all buckets in the account
aws s3api list-buckets --query 'Buckets[].Name' --output text

# For each bucket, check Block Public Access settings
aws s3api get-public-access-block --bucket <BUCKET_NAME>

# Get the bucket policy
aws s3api get-bucket-policy --bucket <BUCKET_NAME> --query Policy --output text | python3 -m json.tool

# Get ACL
aws s3api get-bucket-acl --bucket <BUCKET_NAME>

# Check encryption
aws s3api get-bucket-encryption --bucket <BUCKET_NAME>

# Check if logging is enabled
aws s3api get-bucket-logging --bucket <BUCKET_NAME>

Step 4: Python boto3 Enumeration Script

#!/usr/bin/env python3
"""
S3 Public Access Enumerator
Checks all buckets in an AWS account for public access misconfigurations.
Only bucket-level settings are inspected; account-level Block Public Access
is not taken into account.
"""

import json

import boto3
from botocore.exceptions import ClientError

PUBLIC_GRANTEES = {
    "http://acs.amazonaws.com/groups/global/AllUsers",
    "http://acs.amazonaws.com/groups/global/AuthenticatedUsers",
}


def is_wildcard_principal(principal):
    """Return True for "*", {"AWS": "*"} and {"AWS": ["*", ...]}."""
    if principal == "*":
        return True
    if isinstance(principal, dict):
        aws = principal.get("AWS", [])
        return "*" in ([aws] if isinstance(aws, str) else aws)
    return False


def check_bucket_public_access(s3_client, bucket_name):
    findings = {
        "bucket": bucket_name,
        "block_public_access": None,
        "public_acl": False,
        "public_policy": False,
        "encryption": None,
        "logging": False,
    }

    # Check bucket-level Block Public Access settings
    try:
        bpa = s3_client.get_public_access_block(Bucket=bucket_name)
        config = bpa["PublicAccessBlockConfiguration"]
        all_blocked = all(
            config.get(key, False)
            for key in ("BlockPublicAcls", "IgnorePublicAcls",
                        "BlockPublicPolicy", "RestrictPublicBuckets")
        )
        findings["block_public_access"] = "ENABLED" if all_blocked else "PARTIAL/DISABLED"
        findings["bpa_detail"] = config
    except ClientError as e:
        code = e.response["Error"]["Code"]
        if code == "NoSuchPublicAccessBlockConfiguration":
            findings["block_public_access"] = "NOT_CONFIGURED"
        else:
            # For example AccessDenied: record it instead of leaving None
            findings["block_public_access"] = f"UNKNOWN ({code})"

    # Check ACL for grants to the predefined public groups
    try:
        acl = s3_client.get_bucket_acl(Bucket=bucket_name)
        findings["public_acl"] = any(
            grant.get("Grantee", {}).get("URI", "") in PUBLIC_GRANTEES
            for grant in acl.get("Grants", [])
        )
    except ClientError:
        pass

    # Check bucket policy for Allow statements with a wildcard principal.
    # Conditions are not evaluated, so a flagged policy may still be restricted.
    try:
        policy = json.loads(s3_client.get_bucket_policy(Bucket=bucket_name)["Policy"])
        statements = policy.get("Statement", [])
        if isinstance(statements, dict):
            statements = [statements]
        findings["public_policy"] = any(
            stmt.get("Effect") == "Allow" and is_wildcard_principal(stmt.get("Principal"))
            for stmt in statements
        )
    except ClientError:
        # NoSuchBucketPolicy (no policy attached) or AccessDenied
        pass

    # Check default encryption
    try:
        enc = s3_client.get_bucket_encryption(Bucket=bucket_name)
        rules = enc["ServerSideEncryptionConfiguration"]["Rules"]
        findings["encryption"] = rules[0]["ApplyServerSideEncryptionByDefault"]["SSEAlgorithm"]
    except ClientError:
        findings["encryption"] = "NOT_CONFIGURED"

    # Check server access logging
    try:
        logging_cfg = s3_client.get_bucket_logging(Bucket=bucket_name)
        findings["logging"] = "LoggingEnabled" in logging_cfg
    except ClientError:
        pass

    return findings


def main():
    session = boto3.Session()
    s3 = session.client("s3")

    buckets = s3.list_buckets()["Buckets"]
    print(f"[*] Scanning {len(buckets)} buckets...\n")

    risky = []
    for bucket in buckets:
        name = bucket["Name"]
        result = check_bucket_public_access(s3, name)

        is_risky = (
            result["block_public_access"] != "ENABLED"
            or result["public_acl"]
            or result["public_policy"]
            or result["encryption"] == "NOT_CONFIGURED"
        )

        if is_risky:
            risky.append(result)
            print(f"[RISK] {name}")
            print(f"       BPA: {result['block_public_access']}")
            print(f"       Public ACL: {result['public_acl']}")
            print(f"       Public Policy: {result['public_policy']}")
            print(f"       Encryption: {result['encryption']}")
            print(f"       Logging: {result['logging']}")
            print()

    print(f"\n[*] {len(risky)}/{len(buckets)} buckets flagged for review.")


if __name__ == "__main__":
    main()

Detection: Identifying Unauthorized Access

CloudTrail Queries via Athena

S3 data events (GetObject, PutObject, DeleteObject) must be explicitly enabled in CloudTrail — they are not logged by default. Once enabled, Athena can query them at scale.

-- Find all GetObject requests by anonymous (unauthenticated) principals.
-- Anonymous callers are normally recorded with an accountid of
-- ANONYMOUS_PRINCIPAL; confirm against a known anonymous request in your own logs.
SELECT
    eventtime,
    requestparameters,
    sourceipaddress,
    useragent,
    errorcode
FROM cloudtrail_logs
WHERE
    eventsource = 's3.amazonaws.com'
    AND eventname = 'GetObject'
    AND upper(useridentity.accountid) LIKE 'ANONYMOUS%'
    AND eventtime > '2026-04-01'
ORDER BY eventtime DESC
LIMIT 1000;

-- Find bucket policy changes (who modified access controls)
SELECT
    eventtime,
    useridentity.arn AS actor,
    requestparameters,
    sourceipaddress
FROM cloudtrail_logs
WHERE
    eventsource = 's3.amazonaws.com'
    AND eventname IN ('PutBucketPolicy', 'PutBucketAcl', 'PutPublicAccessBlock', 'DeletePublicAccessBlock')
ORDER BY eventtime DESC;

-- Detect high-volume downloads (exfiltration indicator), per source IP and bucket.
-- Grouping by the raw requestparameters would count each object separately.
SELECT
    sourceipaddress,
    json_extract_scalar(requestparameters, '$.bucketName') AS bucket,
    COUNT(*) AS request_count,
    MIN(eventtime) AS first_seen,
    MAX(eventtime) AS last_seen
FROM cloudtrail_logs
WHERE
    eventsource = 's3.amazonaws.com'
    AND eventname = 'GetObject'
    AND eventtime > '2026-04-01'
GROUP BY sourceipaddress, json_extract_scalar(requestparameters, '$.bucketName')
HAVING COUNT(*) > 1000
ORDER BY request_count DESC;
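These queries only return rows once data events are switched on for the trail. As a sketch of what that configuration looks like, the helper below builds a basic event selector for a list of buckets. The function name is hypothetical; the structure follows the EventSelectors parameter of CloudTrail's put_event_selectors call, and the trailing slash in each ARN is what scopes the selector to all objects in that bucket:

```python
def s3_data_event_selectors(bucket_names, read_write_type="All"):
    """Build CloudTrail event selectors that log S3 object-level events."""
    if read_write_type not in ("All", "ReadOnly", "WriteOnly"):
        raise ValueError("read_write_type must be All, ReadOnly or WriteOnly")
    return [
        {
            "ReadWriteType": read_write_type,
            "IncludeManagementEvents": True,
            "DataResources": [
                {
                    "Type": "AWS::S3::Object",
                    "Values": [f"arn:aws:s3:::{name}/" for name in bucket_names],
                }
            ],
        }
    ]


# Hypothetical usage, with <TRAIL_NAME> as a placeholder:
#   import boto3
#   boto3.client("cloudtrail").put_event_selectors(
#       TrailName="<TRAIL_NAME>",
#       EventSelectors=s3_data_event_selectors(["my-important-bucket"]),
#   )
```

Data events are billed per event, so many teams start with the buckets that hold sensitive data rather than enabling them everywhere.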

S3 Server Access Logs

Enable S3 server access logging on all buckets. Each log line contains:

  • Requester (IP address and identity)
  • Operation (REST.GET.OBJECT, REST.PUT.OBJECT, etc.)
  • HTTP status code
  • Bytes sent

# Enable server access logging on a bucket
aws s3api put-bucket-logging \
  --bucket my-important-bucket \
  --bucket-logging-status '{
    "LoggingEnabled": {
      "TargetBucket": "my-access-logs-bucket",
      "TargetPrefix": "s3-access-logs/my-important-bucket/"
    }
  }'
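Once logs are being delivered, each line can be split into the fields listed above. The parser below is a minimal sketch that covers the leading fields of the server access log format; the sample line is made up for illustration, the function names are hypothetical, and user agents containing embedded double quotes would need more careful handling:

```python
import re

# Leading fields of an S3 server access log record, in order.
FIELDS = [
    "bucket_owner", "bucket", "time", "remote_ip", "requester", "request_id",
    "operation", "key", "request_uri", "http_status", "error_code",
    "bytes_sent", "object_size", "total_time", "turn_around_time",
    "referer", "user_agent", "version_id",
]

# A token is a [bracketed] timestamp, a "quoted" string, or a run of non-spaces.
_TOKEN = re.compile(r'\[[^\]]*\]|"[^"]*"|\S+')


def _unwrap(token):
    """Strip the surrounding brackets or quotes from a token."""
    if len(token) >= 2 and (token[0], token[-1]) in (("[", "]"), ('"', '"')):
        return token[1:-1]
    return token


def parse_access_log_line(line):
    """Map the leading tokens of a log line onto FIELDS."""
    tokens = [_unwrap(t) for t in _TOKEN.findall(line)]
    return dict(zip(FIELDS, tokens))


def is_anonymous_read(entry):
    """Unauthenticated requests carry '-' in the requester field."""
    return entry.get("requester") == "-" and entry.get("operation") == "REST.GET.OBJECT"


# Illustrative line, not taken from a real bucket.
SAMPLE_LINE = (
    'ownerid my-important-bucket [06/Feb/2019:00:00:38 +0000] 192.0.2.3 - '
    '3E57427F3EXAMPLE REST.GET.OBJECT backup.sql "GET /backup.sql HTTP/1.1" '
    '200 - 4096 4096 12 10 "-" "curl/8.0" -'
)
```

Filtering parsed entries with is_anonymous_read gives a quick list of objects that were actually fetched without credentials, which is a stronger signal than a bucket merely being configured as public.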

IOCs to Monitor

  • Requests whose CloudTrail useridentity.accountid is ANONYMOUS_PRINCIPAL (anonymous access)
  • REST.GET.OBJECT from IP ranges belonging to cloud providers other than AWS (cross-cloud exfiltration)
  • Sudden spike in GetObject or HeadObject requests from a single IP
  • PutBucketPolicy or DeletePublicAccessBlock events outside change management windows
  • GetBucketAcl or GetBucketPolicy events from IAM principals that do not own buckets

# Near-real-time: list CloudTrail events for S3 policy changes in the last 24h.
# lookup-events covers management events only and can lag by several minutes.
# The date invocation uses GNU date syntax.
aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=EventName,AttributeValue=PutBucketPolicy \
  --start-time "$(date -u -d '24 hours ago' +%Y-%m-%dT%H:%M:%SZ)" \
  --query 'Events[*].{Time:EventTime,User:Username,Resource:Resources[0].ResourceName}' \
  --output table

# Check bucket-level public access block settings for every bucket in the current account
for bucket in $(aws s3api list-buckets --query 'Buckets[].Name' --output text); do
  echo "== ${bucket}"
  aws s3api get-public-access-block --bucket "${bucket}" 2>/dev/null \
    || echo "   no bucket-level Block Public Access configuration"
done
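The "sudden spike" indicator from the list above can be approximated by counting read events per source IP in a batch of CloudTrail records. A simple sketch, assuming the records are already parsed into dicts with the standard eventName and sourceIPAddress fields and that the batch covers a single time window; the function name and the threshold are hypothetical and need tuning per environment:

```python
from collections import Counter


def flag_download_spikes(events, threshold=1000):
    """Return {source_ip: count} for IPs at or above the read threshold."""
    counts = Counter(
        event.get("sourceIPAddress", "unknown")
        for event in events
        if event.get("eventName") in ("GetObject", "HeadObject")
    )
    return {ip: n for ip, n in counts.items() if n >= threshold}
```

A fixed threshold is crude. Comparing each IP against its own historical baseline gives fewer false positives, but the counting step stays the same.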

Defense and Mitigation

Control 1: Block Public Access at the Organization Level via SCP

Apply an SCP (Service Control Policy) that prevents any account in your AWS Organization from changing or removing Block Public Access. The same IAM actions authorize both setting and deleting the configuration: s3:PutAccountPublicAccessBlock at the account level and s3:PutBucketPublicAccessBlock at the bucket level. Because the SCP blocks every change, attach it only after Block Public Access is enabled, and consider exempting a break-glass role with a condition on aws:PrincipalARN. With BlockPublicAcls and IgnorePublicAcls locked on, public ACLs are rejected or ignored, so the SCP does not need to deny s3:PutBucketAcl or s3:PutObjectAcl outright.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyChangesToS3BlockPublicAccess",
      "Effect": "Deny",
      "Action": [
        "s3:PutAccountPublicAccessBlock",
        "s3:PutBucketPublicAccessBlock"
      ],
      "Resource": "*"
    }
  ]
}

Enable Block Public Access at the account level immediately:

# Enable BPA at the account level
aws s3control put-public-access-block \
  --account-id $(aws sts get-caller-identity --query Account --output text) \
  --public-access-block-configuration \
    BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true

Control 2: S3 Access Analyzer

Enable IAM Access Analyzer, which also backs Access Analyzer for S3 in the S3 console, to continuously monitor for buckets accessible from outside your organization. An external access analyzer uses the ORGANIZATION (or ACCOUNT) type; the UNUSED_ACCESS types look for unused permissions instead. Organization-wide analyzers must be created from the management account or a delegated administrator account.

# Create an external access analyzer with the organization as the zone of trust
aws accessanalyzer create-analyzer \
  --analyzer-name s3-org-access-analyzer \
  --type ORGANIZATION

# List current findings
aws accessanalyzer list-findings \
  --analyzer-arn arn:aws:access-analyzer:us-east-1:123456789012:analyzer/s3-org-access-analyzer \
  --filter '{"resourceType": {"eq": ["AWS::S3::Bucket"]}}' \
  --query 'findings[*].{Resource:resource,Status:status,UpdatedAt:updatedAt}' \
  --output table

Control 3: Explicit Deny in Bucket Policies

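Add an explicit deny to every bucket policy so that access from outside your organization is blocked even if another statement is misconfigured. Statement order does not matter, because an explicit deny always overrides an allow. The aws:PrincipalIsAWSService condition keeps AWS services that deliver to the bucket (CloudTrail or load balancer logs, for example) from being caught by the deny:

{
  "Sid": "DenyNonOrgAccess",
  "Effect": "Deny",
  "Principal": "*",
  "Action": "s3:*",
  "Resource": [
    "arn:aws:s3:::my-bucket",
    "arn:aws:s3:::my-bucket/*"
  ],
  "Condition": {
    "StringNotEquals": {
      "aws:PrincipalOrgID": "o-xxxxxxxxxxxx"
    },
    "BoolIfExists": {
      "aws:PrincipalIsAWSService": "false"
    }
  }
}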
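Rolling this statement out to many buckets is easier when it is merged into each existing policy programmatically. The helper below is a sketch of that merge, operating on a policy that has already been parsed into a dict. The function name is hypothetical, and the aws:PrincipalIsAWSService condition is an addition assumed here so that AWS service principals delivering to the bucket are not denied:

```python
import copy

GUARDRAIL_SID = "DenyNonOrgAccess"


def with_org_guardrail(policy, bucket_name, org_id):
    """Return a copy of the policy with the organization-only deny statement."""
    guarded = copy.deepcopy(policy)
    statements = guarded.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]
    # Drop any earlier copy of the guardrail so the merge is idempotent.
    statements = [s for s in statements if s.get("Sid") != GUARDRAIL_SID]
    statements.append({
        "Sid": GUARDRAIL_SID,
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:*",
        "Resource": [
            f"arn:aws:s3:::{bucket_name}",
            f"arn:aws:s3:::{bucket_name}/*",
        ],
        "Condition": {
            "StringNotEquals": {"aws:PrincipalOrgID": org_id},
            # Assumption: exempt AWS service principals (log delivery etc.)
            "BoolIfExists": {"aws:PrincipalIsAWSService": "false"},
        },
    })
    guarded["Statement"] = statements
    guarded.setdefault("Version", "2012-10-17")
    return guarded
```

The result can be serialized with json.dumps and written back with put-bucket-policy. Test on a non-production bucket first, since a wrong organization ID locks out everyone except the account root user.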

Control 4: Amazon Macie for Data Classification

# Enable Macie in the current region
aws macie2 enable-macie

# Create a one-time classification job for the listed buckets.
# <ACCOUNT_ID> and <BUCKET_NAME> are placeholders; list every bucket to scan.
aws macie2 create-classification-job \
  --job-type ONE_TIME \
  --name "Full-S3-Classification-$(date +%Y%m%d)" \
  --s3-job-definition '{
    "bucketDefinitions": [
      {"accountId": "<ACCOUNT_ID>", "buckets": ["<BUCKET_NAME>"]}
    ]
  }' \
  --sampling-percentage 100

Control 5: Data Classification Tagging

Tag buckets with sensitivity classification to drive automated policy enforcement:

# Note: put-bucket-tagging replaces the bucket's entire existing tag set
aws s3api put-bucket-tagging \
  --bucket my-sensitive-bucket \
  --tagging '{
    "TagSet": [
      {"Key": "DataClassification", "Value": "Confidential"},
      {"Key": "DataOwner", "Value": "security-team"},
      {"Key": "PIIContained", "Value": "true"},
      {"Key": "Environment", "Value": "production"}
    ]
  }'

Use Lambda + EventBridge to enforce that buckets tagged PIIContained=true must have Block Public Access fully enabled and Macie active.
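The core of such a Lambda function is a small rule evaluation. A sketch of that rule follows, kept free of AWS calls so it can be unit tested. The function name and message strings are hypothetical, and the inputs are assumed to come from get-bucket-tagging (flattened into a dict), get-public-access-block, and a Macie status lookup performed by the caller:

```python
BPA_KEYS = (
    "BlockPublicAcls",
    "IgnorePublicAcls",
    "BlockPublicPolicy",
    "RestrictPublicBuckets",
)


def pii_bucket_violations(tags, bpa_config, macie_enabled):
    """Return a list of policy violations for a bucket, empty if compliant."""
    if tags.get("PIIContained", "").lower() != "true":
        return []  # the rule only applies to buckets tagged as holding PII
    violations = []
    missing = [key for key in BPA_KEYS if not (bpa_config or {}).get(key)]
    if missing:
        violations.append(
            "Block Public Access not fully enabled: " + ", ".join(missing)
        )
    if not macie_enabled:
        violations.append("Macie is not enabled")
    return violations
```

The handler around it would react to tagging and Block Public Access change events, call this function, and either alert or remediate when the list is not empty.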

Control 6: Require IMDSv2 to Prevent SSRF Credential Theft (Capital One Pattern)

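The Capital One attack relied on IMDSv1, which answers any plain GET request and therefore hands credentials to an SSRF that can only control a URL. IMDSv2 requires a session token that is obtained through a PUT request and passed in a custom header, which most SSRF primitives cannot do. Require IMDSv2 on all EC2 instances: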

# Enforce IMDSv2 on a running instance
aws ec2 modify-instance-metadata-options \
  --instance-id i-1234567890abcdef0 \
  --http-tokens required \
  --http-endpoint enabled

# SCP to require IMDSv2 on all new instances
# Add condition: "ec2:MetadataHttpTokens": "required"
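Before enforcing the setting, it helps to know which instances still accept IMDSv1. The function below is a sketch that scans the Reservations list returned by EC2's describe_instances call and reports instances whose metadata endpoint is on without tokens being required. The function name is hypothetical, and pagination of describe_instances is left to the caller:

```python
def instances_allowing_imdsv1(reservations):
    """Return IDs of instances whose metadata service still accepts IMDSv1."""
    flagged = []
    for reservation in reservations:
        for instance in reservation.get("Instances", []):
            options = instance.get("MetadataOptions", {})
            endpoint_on = options.get("HttpEndpoint", "enabled") == "enabled"
            tokens_required = options.get("HttpTokens") == "required"
            if endpoint_on and not tokens_required:
                flagged.append(instance["InstanceId"])
    return flagged


# Hypothetical usage:
#   import boto3
#   response = boto3.client("ec2").describe_instances()
#   print(instances_allowing_imdsv1(response["Reservations"]))
```

Instances that appear in the output are candidates for the modify-instance-metadata-options command above, after confirming that the software on them uses an SDK version that supports IMDSv2.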

MITRE ATT&CK Mapping

  • T1530 (Data from Cloud Storage): Adversaries access data stored in cloud infrastructure, including S3 buckets, Azure Blob Storage, and GCP Cloud Storage, often exploiting misconfigured access controls.
  • T1537 (Transfer Data to Cloud Account): Data is exfiltrated from one cloud environment to an attacker-controlled cloud storage instance.
  • T1552.005 (Unsecured Credentials: Cloud Instance Metadata API): Credentials are stolen from IMDS, as in the Capital One pattern.
  • T1078.004 (Valid Accounts: Cloud Accounts): The stolen IAM credentials are then used as valid credentials to reach S3.

References