Building Dipflow: Months of Late Nights, Broken Queues, and Learning to Ship
Ivaylo
February 15, 2026
I’ve been putting off writing this post for months. Not because I don’t have anything to say. I have too much to say. Building Dipflow has been one of the most challenging, frustrating, and genuinely rewarding things I’ve ever done. And I wanted to wait until it felt “ready” to talk about.
But here’s the thing I’ve learned: it’s never ready. You just have to start talking.
So here’s the real story of building a programmatic SEO and AI content generation platform. The parts that worked, the parts that broke spectacularly, and everything I wish someone had told me before I started.
The Problem That Wouldn’t Leave Me Alone
Like most side projects that turn into actual products, Dipflow started with a personal itch.
I was running a content-heavy site. Nothing huge. Maybe 200 articles, decent traffic, making some affiliate revenue. The problem was scaling it. Every new article meant:
- Hours of keyword research
- Competitor analysis (opening 10+ tabs, squinting at what was ranking)
- Writing the actual content
- Finding or creating images
- Formatting, optimizing, publishing
For one article, that’s fine. For ten? Tedious. For the hundreds I needed to actually compete? Impossible without either hiring a team or finding a better way.
I tried the existing tools. Some were good at one piece. Keyword research here, AI writing there. But none of them connected the dots. I’d export from one tool, import to another, copy-paste into a third. The workflow was held together with duct tape and browser tabs.
What I wanted was simple: put in a keyword, get out a properly researched, well-structured, publish-ready article. With images. Formatted correctly. Ready to go to my WordPress site.
That tool didn’t exist. So I started building it.
The First Attempt (It Was Bad)
Let me be honest about version one: it was a mess.
I hacked together a Python script that called the OpenAI API, generated some text, and dumped it into a file. It “worked” in the sense that it produced words. But the content was generic, the structure was random, and it had none of the strategic thinking that makes content actually rank.
The problem with AI-generated content isn’t that it’s bad at writing sentences. GPT can write perfectly good sentences all day. The problem is that it doesn’t know what to write about unless you tell it. And telling it properly? That’s the hard part.
I realized I wasn’t building a wrapper around an AI. I was building an editorial system that happens to use AI as one component.
That realization changed everything.
Designing the Pipeline (Where Things Got Complicated)
Here’s what I landed on: a multi-step wizard that guides an article from keyword to published post. Each step builds on the previous one, and the AI gets progressively more context about what it’s supposed to create.
The steps look like this:
- SERP Research – Fetch what’s actually ranking, let the user pick which competitors to analyze
- Title Generation – Generate 5 options based on competitor analysis, user picks one
- Deep Research – Search the web, extract relevant content, synthesize into a research brief
- Topic Generation – Create the article structure and outline
- Content Generation – Write the full article
- Extras – Key takeaways, FAQ sections
- Metadata – Tags, external links
- Visuals – Image prompts and descriptions
- Image Generation – Actually create the images
Nine steps. Each one involving API calls, user decisions, and state management.
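The progression above can be sketched as an ordered pipeline where each step's output feeds the next. This is a simplified Python sketch with illustrative names (the real backend is PHP), not Dipflow's actual code:

```python
# The nine pipeline steps, in order. Each step's output is saved as
# context, so the AI gets progressively more information to work with.
STEPS = [
    "serp_research", "title_generation", "deep_research",
    "topic_generation", "content_generation", "extras",
    "metadata", "visuals", "image_generation",
]

class Article:
    def __init__(self):
        self.step_index = 0   # which step the article is on
        self.context = {}     # accumulated output of prior steps

    @property
    def current_step(self):
        return STEPS[self.step_index]

    def complete_step(self, output):
        # Store this step's result and advance; later steps read
        # everything in self.context when building their prompts.
        self.context[self.current_step] = output
        self.step_index += 1

article = Article()
article.complete_step({"competitors": ["a.com", "b.com"]})  # SERP research
article.complete_step({"title": "The chosen title"})        # title picked
print(article.current_step)  # deep_research
```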
Building this as a simple request-response system would have been a nightmare. User clicks “generate,” waits 3 minutes, gets a result or a timeout error. Nobody wants that experience.
So I went with WebSockets.
The WebSocket Decision (And Why It Almost Broke Me)
Real-time updates sounded great in theory. User kicks off generation, sees each step complete in real-time, can watch the article come together. Feels fast and responsive even when the actual processing takes time.
In practice? WebSockets added a layer of complexity I wasn’t prepared for.
First, there’s the infrastructure. You need a separate WebSocket server running alongside your main application. In my case, that meant a Node.js service that could receive updates from the PHP backend and broadcast them to connected frontend clients.
Then there’s the protocol. Every update needs a consistent format:
```json
{
  "type": "article_update",
  "article_id": 123,
  "step": "content_ready",
  "status": "content_ready",
  "article": {},
  "data": {}
}
```
Seems simple, right? But now you’re managing state in three places: the database (source of truth), the WebSocket broadcast (real-time updates), and the frontend (user interface). They need to stay in sync. When they don’t, and they won’t at first, you get articles that show as “generating” forever, or content that appears but doesn’t save, or users refreshing the page because something “feels stuck.”
I spent weeks debugging race conditions. Job finishes, broadcasts update, but the database transaction hasn’t committed yet. Frontend receives update, fetches article, gets stale data. User sees old content. User refreshes. User submits support ticket.
The fix was adding slight delays and ensuring transaction commits before broadcasts. Ugly, but it works.
The harder fix was building retry logic that doesn’t spam the WebSocket. If a job fails and retries, you don’t want to send “generating” three times. The frontend needs to handle idempotent updates gracefully.
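Both fixes can be sketched in a few lines. The `FakeDB` and `FakeBroadcaster` classes below stand in for the real MySQL transaction and WebSocket server; everything here is illustrative, not Dipflow's actual code:

```python
STEP_ORDER = ["draft", "serp_pending", "serp_ready", "generating", "content_ready"]

class FakeDB:
    def __init__(self):
        self.committed = []
        self.pending = None
    def update_status(self, article_id, status):
        self.pending = (article_id, status)   # inside the transaction
    def commit(self):
        self.committed.append(self.pending)

class FakeBroadcaster:
    def __init__(self):
        self.sent = []
    def send(self, msg):
        self.sent.append(msg)

def finish_job(db, broadcaster, article_id, status):
    db.update_status(article_id, status)
    db.commit()                    # commit FIRST, so a client that
    broadcaster.send({             # re-fetches on update never sees
        "type": "article_update",  # stale data
        "article_id": article_id,
        "status": status,
    })

class ClientArticle:
    """Frontend-side state that applies updates idempotently."""
    def __init__(self):
        self.status = "draft"
    def on_update(self, msg):
        # Retried jobs can broadcast the same status more than once;
        # only apply an update that moves the article forward.
        if STEP_ORDER.index(msg["status"]) > STEP_ORDER.index(self.status):
            self.status = msg["status"]

db, ws, client = FakeDB(), FakeBroadcaster(), ClientArticle()
finish_job(db, ws, 123, "serp_ready")
client.on_update(ws.sent[-1])
client.on_update({"status": "serp_pending"})  # stale duplicate, ignored
```

The ordering comparison is what makes retried broadcasts harmless: a duplicate or out-of-order message simply fails the "moves forward" check and gets dropped.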
Would I use WebSockets again? Yes, but I’d start with a clearer state machine. The article status progression (`draft` to `serp_pending` to `serp_ready` and so on) was an afterthought. It should have been the first thing I designed.
The Queue System That Saved Everything
Early on, I was running jobs synchronously. User clicks generate, PHP script blocks while calling OpenAI, user waits, server times out if it takes too long.
It worked for testing. It absolutely did not work for production.
The answer was a Redis queue. Jobs get pushed onto the queue with their parameters, worker processes pick them up, execute them, and either complete successfully or fail into a retry cycle.
I’ll be honest: I spent an embarrassing amount of time building custom queue infrastructure before realizing I was overcomplicating it. The final implementation is straightforward:
- Jobs extend a base class with `get_hook()` and `execute()` methods
- Scheduling is just `$job->schedule(['params'], $delay)`
- Workers poll Redis, pull jobs, run them
- Failed jobs retry with exponential backoff (immediate, 60 seconds, 180 seconds)
- After three failures, they move to a failed queue for manual review
The backoff matters more than you’d think. OpenAI rate limits hit you at the worst times. Image generation APIs have cold starts. SEO APIs occasionally return 500s. Without backoff, a temporary blip turns into a permanent failure.
Running three worker replicas was another lesson. One worker handles normal load fine, but when someone kicks off a campaign with 50 keywords, you need parallelism. Multiple workers competing for the same queue means you need atomic operations. Redis LPOP is atomic. Building your own locking mechanism is not worth it.
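The backoff schedule is small enough to sketch in full. This is an illustrative Python version (the real implementation is PHP); attempt 1 runs immediately, retries wait 60 and then 180 seconds, and the third failure sends the job to the failed queue:

```python
# Retry schedule: immediate, 60s, 180s, then give up.
ATTEMPT_DELAYS = [0, 60, 180]   # seconds before attempts 1, 2, 3
MAX_FAILURES = 3

def next_attempt_delay(failures: int):
    """Seconds to wait before the next attempt, or None when the job
    should move to the failed queue for manual review."""
    if failures >= MAX_FAILURES:
        return None
    return ATTEMPT_DELAYS[failures]

print(next_attempt_delay(0))  # 0    -> run immediately
print(next_attempt_delay(1))  # 60   -> first retry
print(next_attempt_delay(2))  # 180  -> second retry
print(next_attempt_delay(3))  # None -> failed queue
```

With multiple workers sharing one queue, each worker pops jobs with Redis `LPOP` (or blocking `BLPOP`), which is atomic, so two workers can never grab the same job.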
The debugging tools I wish I’d built earlier:
- A command to show queue depth (`LLEN jobs`)
- A way to view what’s in the queue (`LRANGE jobs 0 -1`)
- A dashboard for failed jobs with retry buttons
I eventually built all of these. I should have built them first.
The Credit System (Fairness Is Harder Than You’d Think)
Pricing AI products is genuinely difficult. The costs are real and variable. OpenAI charges per token, image generation charges per image, SEO APIs charge per query. You can’t just offer unlimited usage for $10/month. You’ll go broke.
But per-usage pricing feels nickel-and-dime-y to users. They don’t want to think about whether each click is costing them money.
I landed on credits. You get a monthly allotment based on your plan, and each operation costs a fixed number of credits. Simple math:
- SERP research: 1 credit
- Title generation: 5 credits
- Deep research: 10 credits
- Topic generation: 15 credits
- Content generation: 20 credits
- Extras: 10 credits
- Metadata: 5 credits
- Visuals: 10 credits
- Image generation: 15 credits per image
A full article costs roughly 100 credits. The starter plan gives you 1,500 credits. That’s about 15 articles per month. Makes sense for the price point.
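The arithmetic is easy to verify from the table above (image count per article is my assumption for illustration):

```python
# Per-operation credit costs from the table above.
STEP_COSTS = {
    "serp_research": 1, "title_generation": 5, "deep_research": 10,
    "topic_generation": 15, "content_generation": 20, "extras": 10,
    "metadata": 5, "visuals": 10,
}
IMAGE_COST = 15   # per image

def article_cost(num_images: int) -> int:
    return sum(STEP_COSTS.values()) + IMAGE_COST * num_images

print(article_cost(1))  # 91  -- "roughly 100 credits"
print(article_cost(2))  # 106
```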
But then the edge cases appeared.
What if title generation fails due to an API error? Do you refund the credits? (Yes, you have to.) What if the user generates titles but abandons the article? (Credits spent, nothing you can do.) What if they retry the same step multiple times? (Each attempt costs credits. Is that fair?)
The answer I landed on: credits are deducted before API calls, and refunded only on infrastructure failures (API timeouts, server errors). User-initiated retries cost additional credits. Abandoned articles are sunk costs.
Is this perfectly fair? Probably not. But it’s understandable, and understandable beats “fair but confusing” every time.
The other thing I didn’t anticipate: bonus credits. Users can buy credit packs on top of their subscription. Those credits shouldn’t reset at the end of the month. They’re purchased, not allocated. So now I’m tracking two credit balances: subscription credits (reset monthly) and bonus credits (persist until used).
The transaction log was a late addition that I’m glad I built. Every credit change, whether a deduction, grant, or refund, gets logged with a timestamp and reason. When users ask “where did my credits go,” I can actually tell them.
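Putting the two balances and the transaction log together looks roughly like this. It's a Python sketch of the model described above; the spend order (subscription credits first, so purchased bonus credits survive the monthly reset) is my assumption, not necessarily Dipflow's exact rule:

```python
import time

class CreditLedger:
    """Two balances: subscription credits (reset monthly) and bonus
    credits (persist until used). Every change is logged with a reason."""
    def __init__(self, subscription: int, bonus: int = 0):
        self.subscription = subscription
        self.bonus = bonus
        self.log = []

    def _record(self, amount, reason):
        self.log.append({"ts": time.time(), "amount": amount, "reason": reason})

    def deduct(self, amount, reason):
        # Deducted BEFORE the API call is made.
        if amount > self.subscription + self.bonus:
            raise ValueError("insufficient credits")
        from_sub = min(amount, self.subscription)   # assumption: spend
        self.subscription -= from_sub               # subscription first
        self.bonus -= amount - from_sub
        self._record(-amount, reason)

    def refund(self, amount, reason):
        # Only for infrastructure failures (API timeout, server error).
        self.subscription += amount
        self._record(amount, reason)

    def monthly_reset(self, allotment):
        self.subscription = allotment   # bonus credits persist
        self._record(allotment, "monthly reset")

ledger = CreditLedger(subscription=1500, bonus=200)
ledger.deduct(20, "content_generation: article 123")
ledger.refund(20, "refund: OpenAI timeout")
```

When a user asks "where did my credits go," the answer is a query over `ledger.log` rather than a guess.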
Integrating Multiple AI Providers (The Abstraction That Paid Off)
Early on, I hardcoded everything against OpenAI. It worked, but it was fragile. What if I wanted to try a different model? What if OpenAI changed their API? What if a better option appeared?
I refactored to use a provider pattern. There’s an interface for each integration type:
```php
interface SeoProviderInterface {
    public function search_serp(string $keyword): array;
    public function suggest_keywords(string $keyword): array;
}
```
Then implementations for each actual provider. And this part was crucial: fixture providers for development.
The fixture providers return static JSON responses instead of calling real APIs. During development, I’m not burning through API credits or waiting for network round trips. I can work on the UI, test edge cases, and iterate quickly.
Switching between fixtures and real APIs is a single config toggle: `use_fixtures` set to `true` means everything runs locally with mock data; `false` means real API calls.
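Here is the shape of that toggle in a Python sketch (the real backend is PHP, and all names here are illustrative). Both providers satisfy the same interface, so the rest of the code never knows which one it got:

```python
import json

class RealSeoProvider:
    def search_serp(self, keyword):
        raise NotImplementedError("would call the real SEO API here")

class FixtureSeoProvider:
    """Returns canned JSON instead of hitting the network."""
    def search_serp(self, keyword):
        return json.loads('[{"url": "https://example.com", "position": 1}]')

def make_seo_provider(config):
    # One config flag flips the whole app between mock and real data.
    if config.get("use_fixtures"):
        return FixtureSeoProvider()
    return RealSeoProvider()

provider = make_seo_provider({"use_fixtures": True})
results = provider.search_serp("best running shoes")
```

During development everything runs against `FixtureSeoProvider`; flipping the flag is the only change needed to go live.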
This pattern saved me probably hundreds of dollars in API costs during development, and made debugging infinitely easier. When something breaks, I can isolate whether it’s my code or the external API in seconds.
The abstraction also made adding new providers straightforward. When I wanted to add Tavily for web search alongside existing options, it was: create interface, create implementation, add to factory. No touching existing code.
The Campaign System (Automated Bulk Publishing)
Single article generation is useful. But the real value of programmatic SEO is scale. You’re not writing one article. You’re creating hundreds around related keywords.
Campaigns automate this. You provide a list of keywords, configure the target WordPress site, set publishing frequency, and let it run.
The lifecycle seems simple:
`draft → active → (paused) → completed`
Active campaigns get picked up by a processor job that runs on a schedule. For each due campaign, it:
- Picks the next pending keyword
- Creates a draft article
- Kicks off the full generation pipeline
- Publishes to WordPress when complete
- Schedules the next keyword based on `publish_interval_hours`
The “seems simple” part is doing a lot of work in that sentence.
What happens when generation fails? The keyword moves to failed status, and the campaign pauses automatically. Nobody wants to wake up to 50 failed articles because the API was down overnight.
What happens when the user runs out of credits mid-campaign? Pause, notify, wait for credits.
What happens when the WordPress site rejects the publish request? Mark that specific article as failed, log the error, continue with the next one.
Each of these edge cases was a separate debugging session. The campaign processor job is probably the most complex single piece of code in the system, and it’s still not perfect. There are race conditions I’ve worked around rather than solved. There’s retry logic that probably retries more than it should.
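The pause-don't-cascade behavior can be sketched like this. The campaign shape, the credit threshold, and the helper names are all illustrative, not Dipflow's actual code:

```python
class GenerationError(Exception): pass
class PublishError(Exception): pass

def generate_article(term):
    # Stand-in for the full nine-step generation pipeline.
    return {"title": term.title()}

def process_campaign(campaign, user, publisher):
    if user["credits"] < 100:                 # roughly one article's worth
        campaign["status"] = "paused"         # pause, notify, wait for credits
        return "paused: out of credits"
    keyword = next((k for k in campaign["keywords"]
                    if k["status"] == "pending"), None)
    if keyword is None:
        campaign["status"] = "completed"
        return "completed"
    try:
        article = generate_article(keyword["term"])
    except GenerationError:
        keyword["status"] = "failed"
        campaign["status"] = "paused"         # don't wake up to 50 failures
        return "paused: generation failed"
    try:
        publisher.publish(article)
        keyword["status"] = "published"       # next run scheduled per interval
    except PublishError:
        keyword["status"] = "failed"          # log it, continue with the next
    return "scheduled next"

class WpPublisher:
    def publish(self, article):
        pass   # stand-in for the WordPress REST call

campaign = {"status": "active",
            "keywords": [{"term": "best trail shoes", "status": "pending"}]}
print(process_campaign(campaign, {"credits": 1500}, WpPublisher()))
```

The key design choice is that generation failures pause the whole campaign while publish failures only skip one keyword: a down API affects everything, but one site rejecting one post shouldn't stop the run.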
But it works. Users set up campaigns, articles appear on their sites, and I get emails thanking me for saving them hours per week.
Those emails make the debugging worth it.
The Time Investment (Being Honest About Hours)
I didn’t track my hours religiously, but I can estimate. Over the course of about 4 months:
Initial prototype: 3 weeks of evenings and weekends. Maybe 60 hours. Got basic generation working, realized it was all wrong, scrapped most of it.
Core architecture: 2 months of serious development. 150+ hours. This was the queue system, WebSockets, database schema, basic API endpoints.
Frontend wizard: 6 weeks. 100+ hours. React components, state management, real-time updates, the TipTap editor integration.
Credit system and Stripe: 3 weeks. 50 hours. More than half of this was testing edge cases and handling webhooks correctly.
Campaign system: 4 weeks. 80 hours. Would have been faster if I’d planned better upfront.
Image generation: 2 weeks. 40 hours. Mostly spent on the real-time update flow for multiple images.
Polish, bug fixes, documentation: Ongoing. At least 100 hours of just making things work right.
Total: somewhere between 600 and 800 hours. Call it the equivalent of a year of part-time work – Claude helped a lot!
There were features that took way longer than expected:
- Article status transitions seemed like they’d be simple. They weren’t. Validating that transitions are legal (you can’t go from `draft` to `completed`), handling rollbacks, displaying the right UI for each status. Easily 20+ hours on what I thought would be a 2-hour task.
- Image placeholders in content caused weeks of headaches. The idea was simple: insert placeholder tags during content generation, replace them with actual images later. But TipTap handles pasted content differently than typed content, the placeholder detection needed to be robust against edge cases, and coordinating the async image generation with the synchronous content save was a mess.
- The WebSocket reconnection logic took four iterations to get right. First version didn’t reconnect at all. Second version reconnected but didn’t re-subscribe to articles. Third version worked but hammered the server with connection attempts when something was wrong. Fourth version (current) has exponential backoff and actually handles offline/online transitions properly.
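A reconnect policy like the final iteration's comes down to one small function. The jitter factor below is my addition (a common refinement to keep a fleet of clients from reconnecting in lockstep), and the constants are illustrative:

```python
import random

BASE_DELAY = 1.0    # seconds before the first reconnect attempt
MAX_DELAY = 30.0    # cap so repeated failures don't wait forever

def reconnect_delay(attempt: int) -> float:
    # Exponential backoff: 1s, 2s, 4s, 8s, ... capped at MAX_DELAY,
    # with random jitter so clients don't all retry at the same moment.
    delay = min(BASE_DELAY * 2 ** attempt, MAX_DELAY)
    return delay * random.uniform(0.5, 1.0)
```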
What Surprised Me
AI APIs Are More Reliable Than I Expected
I went in expecting constant failures, rate limits, and outages. OpenAI has been remarkably stable. The rate limits are real but manageable with proper backoff. Image generation has occasional cold starts but rarely fails outright.
The unreliable part isn’t the AI. It’s everything around it. Network timeouts. My own bugs. Edge cases in parsing responses.
Users Don’t Read Instructions
Every feature I thought was “obvious” needed a tooltip, a label, or a tour. The step-by-step wizard that felt intuitive to me was confusing to first-time users who didn’t know what “SERP research” meant.
I’ve added so much explanatory text that I sometimes worry the UI is too cluttered. But users stopped asking “what does this button do?” so it’s probably worth it.
The 80/20 Rule Is Real
20% of the features take 80% of the time. Basic article generation worked early. Everything after (campaigns, integrations, image generation, the credit system) took the bulk of development time.
I should have shipped earlier. The MVP could have been just the wizard without campaigns or image generation. Users would have paid for that. I spent months building features before validating anyone wanted the core product.
PHP Gets a Bad Rap
I used PHP (WordPress backend) because I needed WordPress integration anyway, and I know it well. People love to criticize PHP, but honestly? It’s fine. Type hints, namespacing, modern autoloading. PHP 8 is a perfectly reasonable language.
The WordPress ecosystem is the frustrating part. Hooks, filters, global functions, decades of backward compatibility baggage. Working with WordPress is fine. Working around WordPress patterns to build something modern is where the friction comes in.
What I’d Do Differently
Start With the State Machine
The article status progression should have been a formal state machine from day one. Which statuses exist, what transitions are allowed, what happens on each transition. Instead, I built it organically, adding statuses as needed, and spent weeks debugging illegal transitions and edge cases.
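For concreteness, the formal version is just a transition table plus a validator. Only `draft`, `serp_pending`, `serp_ready`, and `completed` are statuses the post actually names; the rest of this Python sketch is illustrative:

```python
# Statuses and legal transitions declared up front; every status
# change goes through one validated function.
TRANSITIONS = {
    "draft":         {"serp_pending"},
    "serp_pending":  {"serp_ready", "failed"},
    "serp_ready":    {"generating"},
    "generating":    {"content_ready", "failed"},
    "content_ready": {"completed"},
    "failed":        {"serp_pending"},   # explicit retry path
    "completed":     set(),              # terminal state
}

class IllegalTransition(Exception):
    pass

def transition(current: str, target: str) -> str:
    if target not in TRANSITIONS.get(current, set()):
        raise IllegalTransition(f"{current} -> {target}")
    return target

status = transition("draft", "serp_pending")   # fine
# transition("draft", "completed")             # raises IllegalTransition
```

With the table in one place, "what statuses exist and which moves are legal" is documentation and enforcement at the same time, instead of being scattered across job handlers.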
Build Monitoring Earlier
I added proper logging and monitoring late in development. For months, I was debugging production issues by adding `error_log()` statements and deploying. Embarrassing. Proper structured logging from the start would have saved hours.
Ship the Wizard, Then Add Campaigns
The campaign system is powerful but complex. I should have validated the core article generation with real users before investing in automation. Some users only ever create single articles. They’re not doing programmatic SEO. For them, campaigns add complexity without value.
Charge Earlier
I spent too long in “free beta” mode. Real payment changes everything. Users care more, feedback is better, you feel pressure to fix things. I should have added Stripe within the first month of having a working product.
Write Better Prompts, Not More Code
Half my “smart” prompt engineering could have been replaced by clearer instructions and better examples in the prompts themselves. I wrote code to preprocess data, structure context, and inject information, when often just telling the AI what I wanted in plain English worked better.
Where It’s Going
I’m not going to pretend I have a grand vision or a five-year roadmap. I’ve learned that plans change as users actually use the product.
But there are some directions I’m exploring:
More integration targets. WordPress is great, but people publish elsewhere. Webflow, Shopify, Ghost, custom CMSes. Each integration is significant work, but it expands who can use the platform.
Better research. The current research step is good but not great. I want to pull in more sources, synthesize better, and give the AI more context about what makes content actually rank.
Team features. Right now it’s single-user. Adding workspaces, permissions, and collaboration would open up agency use cases.
Analytics. Tracking which generated content actually performs well and feeding that back into the generation process. Ambitious, but could be powerful.
I’m working on these slowly, shipping incrementally, watching what users actually use versus what they say they want.
The Part Nobody Talks About
Building something like this is lonely. There’s no team to brainstorm with, no one to rubber-duck problems, no celebration when something finally works. It’s just you, your IDE, and an increasingly long list of issues.
I’ve shipped features at 2 AM that I was sure would change everything. They didn’t. I’ve fixed bugs that I thought nobody would notice. Users noticed immediately.
The dopamine hits are real but unpredictable. A user emails to say the platform saved them 10 hours this week. That’s a good day. A bug takes down image generation for 6 hours and nobody complains. That’s a confusing day. (Did nobody notice? Does nobody use that feature? Am I building in a vacuum?)
Imposter syndrome is constant. Other tools in this space have teams, funding, marketing budgets. I have a terminal window and too much coffee.
But also: I built this. Every feature, every bug fix, every email response. It exists because I made it exist. That’s not nothing.
Final Thoughts
Dipflow isn’t done. I’m not sure when it will be. Products like this are never really done. There will always be another feature to build, another edge case to handle, another user request to evaluate.
But it’s working. People are using it. Articles are being generated, published, ranked. The thing I wished existed now exists.
If you’re building something similar, a complex product with multiple integrations, real-time requirements, monetization to figure out, the only advice I have is: start smaller than you think, ship before you’re ready, and budget twice as much time as you expect for the “simple” parts.
And if you’re considering whether to build something at all: the worst case is you spend months learning a ton and build something that doesn’t work out. The best case is you build something useful that people pay for.
Both outcomes are better than not trying.
This post was written late at night, like most of the code it describes. If you’ve read this far and have questions, I’m probably procrastinating on building features by answering emails. No guarantees on response time.