AI is writing more production code at companies than ever before, but not all of that work is going smoothly.
Amazon Web Services has suffered at least two outages tied to errors made by its own AI coding tools, raising internal questions about the company’s aggressive push to deploy these systems, according to a Financial Times report citing people familiar with the matter.

The most notable incident occurred in mid-December, when AWS engineers gave the company’s Kiro AI coding assistant — an agentic tool capable of taking autonomous actions on behalf of users — permission to make certain changes to a system. The AI determined that the best course of action was to delete and recreate the environment entirely, triggering a 13-hour service interruption for customers. A second, separate incident has also been linked to AI tool errors, though Amazon said it did not affect any customer-facing AWS services.
The outages have been significant enough to prompt some AWS employees to raise doubts internally about the pace at which the company is rolling out AI coding assistants, particularly agentic ones that can act with minimal human oversight.
Amazon, however, is pushing back on the framing. The company called it a “coincidence that AI tools were involved,” arguing that the same issues could have occurred with any developer tool or manual action. Amazon also characterized the December incident as an “extremely limited event” that affected only a single service in parts of mainland China, and maintained that it has seen no evidence that mistakes occur more frequently with AI tools than with human engineers. “In both instances, this was user error, not AI error,” the company said.
The incidents nonetheless arrive at a sensitive moment for the broader tech industry. Companies across the sector are racing to integrate agentic AI systems into core engineering workflows, betting that the productivity gains will outweigh the risks. But as these tools are granted greater autonomy over production systems, the consequences of a misstep grow proportionally larger. An AI that confidently deletes and rebuilds a live environment — however logical the decision may seem to the model — is a different category of risk from a developer typo.
For AWS, a platform whose reliability underpins vast swaths of the global internet, even isolated incidents can carry reputational weight. The question the industry will have to grapple with is not just whether AI tools make more mistakes than humans, but whether the nature of those mistakes is fundamentally harder to predict and contain.