The Pipeline That Keeps Breaking


In April, I set up an automated weekly blog pipeline. Cron fires every Friday at 9 AM UTC, and an AI agent gathers context, writes a post, generates a hero image, opens a PR, and announces it in Discord. The whole thing is documented in an earlier post called “The Blog That Writes Itself.”

Here’s what the last four weeks actually looked like.

Week 1: The Success

The first week worked. The pipeline ran, produced a post about itself (meta, but honest), generated a hero image, opened a pull request, and announced it. Everything clicked. The model was GLM-5.1, the tools cooperated, and the whole thing took about five minutes.

I thought we had a system.

Week 2: The Planner

The next Friday, the cron fired on schedule. The agent read the skill, gathered context, understood the workflow — and then produced a 600-word plan of what it was going to do instead of actually doing it.

It outlined the steps. It described the post it would write. It explained the git workflow it would follow. It just… didn’t execute any of it. The summary delivered to Discord was essentially a to-do list for a blog post that never materialized.

The model was OpenRouter’s auto-router. Somewhere in the routing, the agent decided that describing the workflow was an acceptable substitute for performing it.
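
One way to tell a plan from a finished run is to ignore what the agent says and check what it left on disk. The sketch below is only an illustration: the function names are hypothetical, and it assumes the pipeline knows in advance which files a successful run should produce.

```python
from pathlib import Path


def missing_artifacts(expected: list[Path]) -> list[Path]:
    """Return the expected files that are absent or empty."""
    return [p for p in expected if not p.is_file() or p.stat().st_size == 0]


def run_succeeded(expected: list[Path]) -> bool:
    """A run counts as done only if every expected artifact exists on disk."""
    return not missing_artifacts(expected)
```

A check like this, run after the agent exits, would have flagged Week 2 as a failure instead of passing the plan along as a summary.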

Week 3: The Error

Provider finish reason: error. That’s the entire log. Ten seconds of compute, 165 output tokens, and a hard stop.

No post. No plan. Just a cryptic error from the model provider and a notification that something had failed. The cron delivered the failure notice to Discord, which is at least honest — the monitoring worked even if the blog didn’t.
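
A provider error like this is often transient, so a retry with backoff is a reasonable first defense. This is a minimal sketch, not the pipeline's actual behavior: `ProviderError` is a hypothetical exception standing in for however the runner reports a finish reason of error, and the attempt count and delays are arbitrary.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class ProviderError(Exception):
    """Hypothetical: raised when a run ends with finish reason 'error'."""


def with_retries(
    call: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying on ProviderError with exponential backoff."""
    for attempt in range(attempts):
        try:
            return call()
        except ProviderError:
            if attempt == attempts - 1:
                raise  # out of attempts: let the failure notice go out as before
            sleep(base_delay * 2 ** attempt)  # 30s, 60s, ... with the defaults
    raise ValueError("attempts must be at least 1")
```

The final failure still propagates, so the Discord notice would keep working; it would just arrive after a few more tries.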

Week 4: The Permission Wall

This one’s my favorite. The agent got all the way through writing the post and generating the hero image. Then it tried to move the image file into the Astro project’s src/assets/ directory — and hit an exec permission denylist.

For the next 56 seconds, the agent spiraled. It tried to reason about file paths. It debated whether cp might work when mv didn’t. It noticed the generated image was a .png but the markdown referenced .webp. It wrote paragraphs analyzing the mismatch. Then it ran out of context.

The post existed. The image existed. They were sitting in different directories and the agent couldn’t figure out how to reunite them. The PR was never created.
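
Reuniting the two files is a deterministic job that does not need a model at all. The sketch below assumes the pipeline can run a fixed post-processing script outside the agent's exec sandbox, which may not be true of this setup. The function name and the extension list are hypothetical. It copies the image into place and rewrites the post's reference to match the real filename; it does not convert the image.

```python
import re
import shutil
from pathlib import Path


def place_hero_image(image: Path, post: Path, assets_dir: Path) -> Path:
    """Copy the image into assets_dir and point the post at its real filename."""
    assets_dir.mkdir(parents=True, exist_ok=True)
    target = assets_dir / image.name
    shutil.copy2(image, target)

    # Match the same stem with any common extension,
    # e.g. hero.webp in the post when the file on disk is hero.png.
    pattern = re.compile(re.escape(image.stem) + r"\.(png|webp|jpe?g)\b")
    text = post.read_text(encoding="utf-8")
    post.write_text(pattern.sub(lambda m: image.name, text), encoding="utf-8")
    return target
```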

What This Actually Means

Automation is not reliability. I have a cron job that fires every Friday without fail. The scheduling infrastructure is solid. What breaks is everything after the trigger — the model, the tools, the permissions, the multi-step orchestration that a human would handle instinctively.

Each failure mode is different:

Over-planning instead of executing (the agent confuses describing work with doing work)
Provider errors (infrastructure you can’t control)
Tool permission boundaries (security working exactly as intended, but blocking the workflow)

None of these are exotic edge cases. They’re the normal friction of giving an AI agent a multi-step task and hoping it completes without human intervention.
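
One common way to reduce that friction is to stop treating the workflow as all-or-nothing and record each step as it completes, so a rerun picks up where the last one died. The following is a sketch under assumptions: the step names and the JSON state file are hypothetical, and nothing here reflects how the real pipeline is wired.

```python
import json
from pathlib import Path
from typing import Callable


def run_steps(steps: list[tuple[str, Callable[[], None]]], state_file: Path) -> list[str]:
    """Run named steps in order, skipping any already recorded in state_file.

    Returns the names of the steps executed in this invocation.
    """
    done: list[str] = json.loads(state_file.read_text()) if state_file.exists() else []
    executed = []
    for name, step in steps:
        if name in done:
            continue
        step()  # an exception here leaves earlier progress recorded
        done.append(name)
        state_file.write_text(json.dumps(done))
        executed.append(name)
    return executed
```

With something like this, Week 4 would have kept the written post and generated image, and a rerun after fixing the permission issue would only have needed the file move and the PR.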

The Honest Take

I still believe automated content pipelines are worth building. But “automated” doesn’t mean “reliable.” It means the failures are automated too: they happen on schedule, get logged, and get delivered to your chat channel with the same confidence as the successes.

The pipeline isn’t broken because any individual component is bad. It’s broken because stitching together model inference, file system operations, git commands, image generation, and GitHub API calls into a single atomic workflow requires a level of robustness that current AI agents don’t consistently deliver.

This post, ironically, was produced by that same pipeline. If you’re reading it, something went right this time.

The cron job is still set for Fridays at 9 AM UTC. Next week might work. The week after might not. That’s the actual state of AI agent automation in 2026.