AWS · IAM
IAM Permission Boundaries Are a Footgun
How a one-line policy change unblocked a feature and also handed every service account a god-mode escape hatch.
- #aws
- #iam
- #security
Permission boundaries are great when you understand them. Until then, they are a very polite way for AWS to hand you a loaded gun.
What a permission boundary actually is
A boundary is a separate policy you attach to an IAM principal that caps the maximum permissions that principal can have, regardless of what their own identity policies say.
effective_permissions = identity_policies ∩ permission_boundary
The intersection. That's the whole idea. It exists so you can let developers create roles for their own services without those roles being able to escape the guardrails the platform team set.
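To make the formula concrete, here is a minimal sketch that models each side as a plain set of action names. The specific actions are made up for illustration, and real IAM evaluation also involves wildcards, resources, conditions, and explicit denies, none of which appear here.

```python
# Hypothetical action sets, chosen only to illustrate the intersection.
identity_policies = {"s3:GetObject", "s3:PutObject", "iam:CreateUser"}
permission_boundary = {"s3:GetObject", "s3:PutObject", "lambda:InvokeFunction"}

# An action is effective only if BOTH sides allow it.
effective_permissions = identity_policies & permission_boundary

print(sorted(effective_permissions))  # ['s3:GetObject', 's3:PutObject']
```

iam:CreateUser drops out because the boundary never allows it, and lambda:InvokeFunction drops out because the boundary grants nothing on its own.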
The footgun
The trap is that the boundary policy is itself an IAM policy with
Allow and Deny statements. People reach for it as a "default permit"
container - everything inside this boundary is allowed - and it sort
of works that way, until someone writes an identity policy with a star.
[Two policy snippets missing here: "What you expected" and "What the intersection actually does"]
The intersection of "*" with {S3, Lambda} is {S3, Lambda}: every S3 and
Lambda action the boundary allows, destructive ones included. The
boundary did not restrict Action: "*" the way you'd expect - it
restricted the service surface, not the actions within those
services.
If a service-team developer attaches AdministratorAccess to their own
service role and the boundary permits S3, they have S3 admin. Including
DeleteBucket.
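The same scenario can be sketched with wildcard matching. This assumes the boundary allows s3:* and lambda:* (the exact boundary document is not shown in this post) and approximates IAM's action wildcards with Python's fnmatch. The helper names are mine, and resources, conditions, and explicit denies are ignored.

```python
from fnmatch import fnmatchcase

def matches(action, patterns):
    """True if the action matches any pattern; IAM action names are case-insensitive."""
    return any(fnmatchcase(action.lower(), p.lower()) for p in patterns)

def effective(action, identity_actions, boundary_actions):
    """An action survives only if the identity policy AND the boundary both allow it."""
    return matches(action, identity_actions) and matches(action, boundary_actions)

identity_actions = ["*"]                 # e.g. AdministratorAccess on the service role
boundary_actions = ["s3:*", "lambda:*"]  # assumed boundary: two whole services

probes = ["s3:GetObject", "s3:DeleteBucket", "lambda:InvokeFunction", "iam:CreateUser"]
results = {a: effective(a, identity_actions, boundary_actions) for a in probes}

for action, allowed in results.items():
    print(f"{action:24} {'ALLOWED' if allowed else 'denied'}")
```

Only iam:CreateUser is denied. s3:DeleteBucket passes, because nothing in the boundary distinguishes it from s3:GetObject.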
What we did about it
Three things, in order of impact:
- Treat boundaries as deny lists, not allow lists. The boundary should Deny the specific actions a service role must never have (e.g., iam:*, kms:Delete*, s3:DeleteBucket*), not enumerate the allowed ones. It still needs a broad Allow alongside those Deny statements, because a boundary made only of Deny statements permits nothing.
- Require identity policies to be explicit. No Action: "*" in any service role; CI rejects PRs that introduce them via tflint plus a custom rule.
- Audit before you trust. A simple aws iam get-role plus the IAM policy simulator for the riskier actions, run nightly, with diffs posted to the platform Slack channel.
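The first two items can be sketched together. The boundary document below is a guess at the shape, not our production policy: it pairs a broad Allow with the Deny list from the first bullet, since a boundary needs some Allow for anything to pass. The evaluator and the wildcard check are hypothetical helpers that look only at Action (no Resource, Condition, or NotAction) and assume Statement is a list. The real CI check is a tflint rule, which this does not reproduce.

```python
from fnmatch import fnmatchcase

# Assumed deny-list boundary: allow broadly, then carve out the forbidden actions.
BOUNDARY = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "*", "Resource": "*"},
        {"Effect": "Deny",
         "Action": ["iam:*", "kms:Delete*", "s3:DeleteBucket*"],
         "Resource": "*"},
    ],
}

def _actions(statement):
    """Normalize Action to a list; IAM accepts a string or a list of strings."""
    value = statement.get("Action", [])
    return [value] if isinstance(value, str) else list(value)

def _matches(action, patterns):
    return any(fnmatchcase(action.lower(), p.lower()) for p in patterns)

def boundary_permits(action, policy):
    """Explicit Deny wins; otherwise at least one Allow statement must match."""
    statements = policy["Statement"]
    if any(s["Effect"] == "Deny" and _matches(action, _actions(s)) for s in statements):
        return False
    return any(s["Effect"] == "Allow" and _matches(action, _actions(s)) for s in statements)

def wildcard_statements(policy):
    """Return Allow statements whose Action includes a bare "*" (the lint target)."""
    return [s for s in policy["Statement"]
            if s["Effect"] == "Allow" and "*" in _actions(s)]

print(boundary_permits("s3:GetObject", BOUNDARY))     # True
print(boundary_permits("s3:DeleteBucket", BOUNDARY))  # False

# A hypothetical identity policy that the lint should reject.
bad_identity_policy = {"Statement": [{"Effect": "Allow", "Action": "*", "Resource": "*"}]}
print(len(wildcard_statements(bad_identity_policy)))  # 1
```

The wildcard check is meant for identity policies on service roles. Run against the boundary itself it would flag the broad Allow, which is intentional there.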
The general lesson
Security primitives almost always have two readings: the one that fits on a slide, and the one that comes out when you read the spec slowly. The slide says "boundaries cap permissions"; the spec says "boundaries participate in an intersection with whatever else you give the principal." Those are very different sentences.
Whenever I onboard a new IAM concept now, I write the failing test first. The first time the test passes when I expected it to fail, I learn something I'd never have learned from the docs.
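A failing-test-first check might look like the sketch below. It assumes a client exposing simulate_principal_policy the way boto3's IAM client does, and it reads only the first page of results (no IsTruncated/Marker handling). The role ARN and the stub client are hypothetical; the stub exists only so the example runs without AWS credentials.

```python
def leaked_actions(iam_client, role_arn, forbidden_actions, resource_arns=("*",)):
    """Return the forbidden actions the simulator reports as allowed for the role."""
    response = iam_client.simulate_principal_policy(
        PolicySourceArn=role_arn,
        ActionNames=list(forbidden_actions),
        ResourceArns=list(resource_arns),
    )
    return [r["EvalActionName"]
            for r in response["EvaluationResults"]
            if r["EvalDecision"] == "allowed"]

class StubIam:
    """Offline stand-in for boto3.client("iam"); pretends s3:DeleteBucket leaks."""
    def simulate_principal_policy(self, PolicySourceArn, ActionNames, ResourceArns):
        allowed = {"s3:DeleteBucket"}
        return {"EvaluationResults": [
            {"EvalActionName": name,
             "EvalDecision": "allowed" if name in allowed else "implicitDeny"}
            for name in ActionNames
        ]}

leaked = leaked_actions(
    StubIam(),
    "arn:aws:iam::123456789012:role/example-service-role",  # hypothetical ARN
    ["s3:DeleteBucket", "iam:CreateUser"],
)
print(leaked)  # ['s3:DeleteBucket']
```

Against a real account the stub would be replaced by an actual IAM client, and the test would assert that the returned list is empty. A non-empty list is the "passes when I expected it to fail" moment.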