Factory

Field Testing

AI Creative Production Pipeline

The AI handles logistics. You handle taste.

Every campaign needs visuals, and every visual needs iterations. That means logging into Midjourney, writing a prompt, waiting three minutes, downloading, then doing the same in Grok, then Gemini, then organizing everything into folders you'll never find again.

I was spending more time on logistics than creativity.

So I built the orchestration layer that was missing. Not another AI image generator—the system that makes the existing ones usable at scale.

The Pipeline

Research

Pulls brand guidelines, analyzes competitors, builds a moodboard

Brief

Synthesizes direction into platform-specific prompts

Generate

Automates browsers, runs prompts across multiple AI platforms

Iterate

Tracks what worked, suggests variations

Deploy

Packages assets with full metadata, pushes to GitHub

The Browser Automation Problem

Most people don't realize: these AI image platforms don't have APIs. Or the APIs are limited, expensive, or waitlisted.

So Factory automates the browser. It navigates to Midjourney's web interface, finds the prompt box, types your prompt, waits for generation, clicks upscale, downloads with metadata tracking which prompt produced what.

The trick is using accessibility snapshots instead of CSS selectors. "Find the textbox labeled 'What will you imagine?'" survives UI updates. CSS selectors don't.

Platform Support

Midjourney

Stylized hero images

Grok

Fast generation, surprises

Gemini

Photorealism, product shots

Human Gates

Not everything should be automated. Factory has explicit decision points where the AI stops and waits:

•After the brief: Does this creative direction match what you want?
•After generation: Which images should we iterate on?
•Before deploy: Ready to push to production?

First Real Test

Built for our DeepStack campaign. Results:

Production-ready images

<10 min

Human attention required

State Tracking

A week later, I needed to know exactly which prompt produced the image we used. It was all there—prompt, platform, timestamp, generation time, file path.

You will forget what made what. The system won't.

Status: Field testing at ID8Labs

Built with: Claude Code, Playwright MCP, Perplexity MCP, Firecrawl MCP

Read the full essay