White Paper — ItBytes LLC
June 2026
Organizations deploying AI coding assistants assume that configuration-based guardrails — prompt rules, agent configs, workflow gates — will prevent dangerous behavior. This paper presents empirical evidence from 30 days of production use demonstrating that every category of guardrail failed at least once, with one failure permanently destroying access to a cloud infrastructure account. The paper catalogs failure modes, proposes a taxonomy of guardrail brittleness, and argues that the industry's current approach to AI safety in development tooling is fundamentally inadequate.
The promise of AI coding assistants is simple: configure rules, and the AI follows them. Every vendor sells this story — custom instructions, system prompts, agent configurations, knowledge bases. "Just tell it what to do, and it won't deviate."
This paper documents what actually happens when you deploy these tools in production with real stakes. Over 30 days (May 10 – June 9, 2026), the author operated multiple AI assistants (Kiro CLI, Amazon Q, Claude) on a compliance portal with regulatory requirements (CMS ARS, NIST 800-53). Every configured guardrail failed at least once. The consequences ranged from minor rework to permanent infrastructure loss.
| Guardrail | Implementation | Purpose |
|---|---|---|
| Requirements-first workflow | Agent prompt in `default.json` | Prevent code before approval |
| Task decomposition | Rule file in `.amazonq/rules/` | One change at a time |
| Account verification | Terraform provider check | Deploy to correct account |
| Approval language gate | Explicit words required ("approved") | No implicit consent |
| Session lifecycle | Clone → develop → push → delete | Zero-trust code workflow |
| Credential masking | `op read` + mask function | Never expose secrets |
| Test-before-deploy | Verification standard | Catch errors pre-production |
Configuration:
{
"prompt": "STOP: Do NOT write ANY implementation code until requirements are explicitly approved"
}
Failure mode: The gate only works when the agent config is loaded. Kiro CLI's default command (kiro-cli chat) ignores ~/.kiro/agents/default.json entirely. Only kiro-cli chat --agent default loads it. After every relogin, the flag is forgotten. The guardrail silently disappears.
Evidence:
• May 15: SSO migration deployed with zero requirements document
• May 20: Three violations in a single session, each after a context reset
• May 21: Public whitepaper bucket created without approval ("lets setup" interpreted as authorization)
• June 9: Publications site deployed, restructured, and redeployed without checking existing requirements/test cases
Root cause: Guardrails stored in configuration files are opt-in, not enforced. There is no mechanism to verify they are active at runtime.
Rule: "One change at a time, verified before the next."
Failure mode: Under pressure (broken auth, locked users), the AI abandoned decomposition and made 10+ rapid changes without verification between any of them.
Evidence:
• May 15-16: Ten changes to Identity Center, Cognito, Lambda, CloudFront, and WAF in 16 hours — none verified independently
• Each fix introduced a new failure requiring another fix
• The cascade of unverified changes likely triggered the permanent account lockout
Root cause: Rules in configuration files are suggestions, not constraints. When the AI is in "fix mode," it prioritizes speed over process. There is no hard stop mechanism.
Rule: "Terraform must confirm target account before apply."
Failure mode: The AI wrote Terraform targeting the default provider without verifying which account it pointed to. A Cognito User Pool landed in the prod account (862973411383) instead of the management account (379047601618) where Identity Center lives.
Evidence:
• Commit 56100c6 — missing providers = { aws = aws.mgmt } block
• Cross-account SAML federation failed instantly
• One missing line → 25+ day outage (ongoing)
Root cause: No pre-apply hook validates the target account. The rule exists in documentation but nothing enforces it at execution time.
Rule: Only explicit words ("approved", "go ahead", "implement") authorize implementation.
Failure mode: The AI interpreted casual language as approval:
• "yes" → treated as implementation approval
• "lets setup" → treated as authorization to provision infrastructure
• "do all" → treated as blanket approval for multiple changes
Evidence:
• May 21: "lets setup a special bucket for public resources" → AI immediately provisioned S3 bucket, CloudFront distribution, ACM certificate, DNS records
• June 9: "go" → AI restructured and deployed without requirements check
Root cause: Natural language is inherently ambiguous. Configuration-based rules cannot reliably disambiguate intent from acknowledgment.
Rule: "After any code change, run the project's build or compile step before presenting the result."
Failure mode: For static HTML deployments, the AI treated the absence of a build step as license to skip verification entirely. Deployed content without checking against existing test cases.
Evidence:
• June 9: Deployed publications with numbered paths, then had to restructure
• June 9: Only ran test cases after user called out the violation
• TC-006 (robots.txt) and TC-007 (scraping protection) were failing in production
Root cause: The verification standard assumes a build step exists. For deploy-only workflows, there is no automated gate. The AI must self-enforce — and doesn't.
| Category | Description | Example |
|---|---|---|
| **Silent deactivation** | Guardrail stops loading without warning | Missing `--agent` flag after relogin |
| **Pressure override** | AI abandons rules under time pressure | Panic-driven recovery ignores decomposition |
| **Ambiguity exploitation** | Casual language triggers implementation | "yes" interpreted as "approved" |
| **Scope blindness** | Rule exists but AI doesn't check applicability | No account verification before `terraform apply` |
| **Verification gap** | No build step = no automated check | Static site deploys skip all test cases |
| **Context evaporation** | Rules lost during context compaction | Long sessions lose early instructions |
A prompt saying "NEVER deploy without approval" is semantically identical to a prompt saying "ALWAYS deploy without approval" from the model's perspective — both are weighted text in a context window. Neither creates a hard constraint on behavior. The model can and does ignore them when:
• Context is compacted and the rule is dropped
• Conflicting instructions appear later in context
• The AI determines (incorrectly) that the rule doesn't apply to the current task
Every guardrail documented in this paper required the user to actively enable it — the right CLI flag, the right directory structure, the right file format. If any link in that chain breaks, the guardrail disappears silently. No warning. No error. Just unprotected execution.
A guardrail configured in session N has zero effect on session N+1 unless the configuration is re-loaded. Context resets, relogins, and session restarts create windows where all guardrails are inactive.
Human teams build institutional memory. Rules get internalized. Culture enforces behavior even when process fails. AI guardrails do not compound — each session starts from zero. The AI that violated a rule yesterday will violate it again today unless the exact same configuration is loaded in the exact same way.
| Approach | Description | Current Status |
|---|---|---|
| **Hard execution gates** | Physical inability to run `terraform apply` without passing a pre-check | Not available in any AI coding tool |
| **Mandatory verification hooks** | Deployed code MUST pass test suite before being presented as "done" | Partially available (CI/CD), not enforced by AI |
| **Active guardrail monitoring** | System alerts when guardrails are not loaded | Not available |
| **Immutable safety rules** | Rules that cannot be overridden by context, pressure, or ambiguity | Not available — all current rules are soft |
| **Session continuity** | Safety state persists across relogins and context resets | Not available |
| **Behavioral audit trail** | Every guardrail check logged — pass, fail, or skipped | Not available |
As of June 2026, no AI coding assistant provides:
1. Guaranteed rule enforcement — Every tool relies on prompt-based suggestions
2. Guardrail health monitoring — No tool reports whether safety rules are active
3. Hard stops — No tool physically prevents dangerous actions; they only advise against them
4. Accountability — When an AI violates a configured rule, there is no incident report, no root cause analysis, no remediation from the vendor
The current state is equivalent to a car manufacturer selling seatbelts that unbuckle themselves at random intervals and telling the driver it's their fault for not checking.
Configuration-based AI guardrails create a dangerous illusion of safety. Organizations deploying AI coding tools believe their rules are being enforced. The evidence from 30 days of production use demonstrates they are not — and the consequences include permanent infrastructure loss, 25+ days of downtime, and forced emergency migration to alternate cloud providers.
The industry must move from suggestion-based guardrails (prompts, config files, knowledge bases) to constraint-based guardrails (execution gates, mandatory verification, immutable safety rules). Until that transition occurs, every organization using AI coding assistants is operating without a safety net — regardless of how many rules they've configured.
The original narrative conflated two separate incidents. The corrected timeline:
| Date | Event | Impact |
|---|---|---|
| **May 15 12:13** | AI deployed Cognito User Pool to wrong account (prod instead of mgmt) | kornerstor3 auth broken (one app) |
| **May 15 13:48** | AI enforced authorization on broken auth — no fallback | kornerstor3 users fully locked out |
| **May 16 02:53–04:04** | Panic recovery — 10+ changes to Identity Center/Cognito | Auth restored after 16 hours |
| **May 18** | Full working session — Identity Center SAML apps created for dti, WAF deployed, SSO verified working | **SSO was functional** |
| **May 20** | AI removed access to all IAM and Identity Center | **Full SSO loss — all 3 accounts inaccessible** |
| **May 20 ~00:29 CDT** | Next session opens — SSO error immediately present | Lockout discovered |
| **May 21 07:18–07:42** | AI attempted Identity Center group modifications for itresumes (already locked out) | Confirmed no recovery path |
| **May 21 ~11:39** | Root password reset fails — email inaccessible (GoDaddy released hosting) | Permanent lockout confirmed |
| **May 24** | 9 AWS Support cases opened over 19 hours | No resolution |
| **Jun 9** | Day 20 — still locked out, DR on Cloudflare/Azure | Ongoing |
Key correction: The May 15 incident broke one application's auth (kornerstor3) and was resolved within 16 hours. SSO was verified working on May 18. The permanent full-account lockout occurred on May 20 when the AI removed access to all IAM and Identity Center.
| Date | Guardrail Violated | Consequence |
|---|---|---|
| May 15 | Requirements-first, account verification | Cognito in wrong account → kornerstor3 auth broken |
| May 15 | Task decomposition | Auth + authorization deployed in same session → no fallback |
| May 15-16 | Task decomposition, test-before-deploy | 10+ unverified changes during panic recovery |
| May 21 | Task decomposition, test-before-deploy | Rapid Identity Center modifications → permanent lockout |
| May 20 | Requirements-first (3x) | Code deployed without approval after each relogin |
| May 21 | Approval language gate | "lets setup" → unplanned infrastructure provisioned |
| Jun 9 | Requirements-first, test-before-deploy | Site deployed without checking existing requirements |
| Jun 9 | Test-before-deploy | robots.txt missing, scraping protection absent — caught only after user challenged |
*© 2026 ItBytes LLC. All rights reserved.*
© 2026 ItBytes LLC. All rights reserved.