Five opportunities found. Two designed in full. One built to run.
Professional Services, COGS, and Support runs on a hidden, repeating task: a person reading messy input, interpreting it, and routing it by hand. This submission finds five high-value instances of that task, designs two of them in full, and builds one that runs.
Five instances of the same shape
These are not five unrelated ideas. They are five instances of one recurring problem inside the function. The inputs differ, the systems differ, but the architectural shape is identical every time.
Because the shape repeats, the first pipeline built creates reusable patterns and shared infrastructure that compound across every build after it. This is a repeatable strategy, not a collection of point solutions.
Five opportunities, ranked by impact
Ranked by labor hours at stake, timeline compression, breadth of beneficiaries, and proximity to a buildable implementation.
Combined conservative estimate: 3,600 to 6,300 labor hours per year in steady state. The range discounts for overlap between opportunities rather than stacking best cases. For context, the AI team is working toward eliminating 55,000 manual hours this year across many contributors; these five are a measurable contribution to that, not a claim on the whole.
WebCentral Revisions Intelligence Pipeline
WebCentral runs 30 to 40 revision rounds a week. Each one loses one to four days in a manual chain: a PM splits compound requests, classifies each as design or dev, copies them to OneNote, flags scope, and routes to an art director. Two of the four ADs skip that review entirely. This pipeline replaces the chain with classification and routing that runs in seconds and applies one consistent scope standard to every batch.
The flow
The same four-step shape from the framing, made concrete. Input, the model layer, the routing branch, and the human gate that holds on the review path. A caption below explains why the numbering jumps from 01 to 03.
The numbering jumps from 01 to 03 on purpose. Step 02 in the framing is the human transform, the manual bottleneck this pipeline exists to remove, so it is absent from the diagram by design. The flow goes straight from messy input to the model that replaces the hand-off.
Model selection
Structured input, constrained JSON output, pattern-matching against a known ruleset. That is a small-fast-model task. All three candidates are reachable through OpenRouter on one key, so this is a capability-and-cost call, not an infrastructure one.
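To make "constrained JSON output" concrete, here is a minimal sketch of the kind of check that could sit between the model and the router. The field names (`items`, `category`, `in_scope`, `confidence`) are assumptions for illustration, not the pipeline's actual schema; only the design/dev split comes from the process described above.

```python
import json

# From the manual process: each request is classified as design or dev.
ALLOWED_CATEGORIES = {"design", "dev"}


def parse_classification(raw: str) -> list[dict]:
    """Parse a model reply and reject anything outside the expected shape.

    The schema here is hypothetical: a top-level "items" list whose entries
    carry a category, a boolean scope flag, and a confidence in [0, 1].
    """
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on bad JSON
    items = data.get("items") if isinstance(data, dict) else None
    if not isinstance(items, list) or not items:
        raise ValueError("expected a non-empty 'items' list")
    for item in items:
        if not isinstance(item, dict):
            raise ValueError("each item must be a JSON object")
        if item.get("category") not in ALLOWED_CATEGORIES:
            raise ValueError(f"unknown category: {item.get('category')!r}")
        if not isinstance(item.get("in_scope"), bool):
            raise ValueError("'in_scope' must be a boolean")
        conf = item.get("confidence")
        if isinstance(conf, bool) or not isinstance(conf, (int, float)):
            raise ValueError("'confidence' must be a number")
        if not 0.0 <= conf <= 1.0:
            raise ValueError("'confidence' must be between 0 and 1")
    return items
```

Rejecting a malformed reply outright, rather than repairing it, keeps a bad classification from ever reaching a ticket.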
Current-gen, near Sonnet-4 reasoning, the capability headroom
Likely sufficient; the cheaper fallback if it scores equally
Cost floor, but an aging line that loses on longevity, not price
Self-hosted Mistral 7B via Ollama is named as the path if volume scales or governance requires keeping content off third-party APIs. At 30 to 40 runs a week the dollar difference is trivial, so the default takes the headroom and the eval settles the tie on real data.
Evaluation: offline before online
The unlock is that the golden set already exists. WebCentral PMs have hand-parsed, classified, and scope-flagged batches for over 18 months, every result archived as a OneNote page. That archive is a labeled corpus. No hand-labeling to fund.
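A rough sketch of what grading against that archive could look like, assuming each archived OneNote page has already been exported into per-request records. The record keys (`category`, `scope_flag`) and the one-to-one alignment between predicted and archived items are assumptions; the real export format is not described here.

```python
def offline_scores(predicted: list[dict], golden: list[dict]) -> dict:
    """Compare model output to archived PM labels, item by item.

    Assumes both lists are aligned one-to-one and that every record has a
    "category" string and a boolean "scope_flag" (hypothetical field names).
    """
    if not golden or len(predicted) != len(golden):
        raise ValueError("predicted and golden must be non-empty and aligned")

    pairs = list(zip(predicted, golden))
    category_hits = sum(p["category"] == g["category"] for p, g in pairs)

    # Precision: of the items the model flagged, how many did the PM also flag?
    flagged = [g for p, g in pairs if p["scope_flag"]]
    true_flags = sum(g["scope_flag"] for g in flagged)

    return {
        "category_accuracy": category_hits / len(golden),
        "scope_flag_precision": true_flags / len(flagged) if flagged else None,
    }
```

Precision is reported as `None` when the model flags nothing, so a silent model is not mistaken for a perfect one.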
What gets measured online
| Metric | Target |
|---|---|
| OneNote confirmation rate | > 85% at 60 days |
| PM review rate | < 30% at steady state |
| AD correction rate | < 5% |
| Scope flag precision | > 70% |
| Processing time (webhook to ticket) | < 5 minutes |
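The targets in the table can be encoded as data so a dashboard or scheduled job can test observations against them. This is a sketch only: the metric keys are hypothetical names, and the timing qualifiers in the table (60 days, steady state) are left to the caller to respect.

```python
import operator

# Thresholds copied from the table above; keys are illustrative names.
TARGETS = {
    "onenote_confirmation_rate": (operator.gt, 0.85),  # at 60 days
    "pm_review_rate": (operator.lt, 0.30),             # at steady state
    "ad_correction_rate": (operator.lt, 0.05),
    "scope_flag_precision": (operator.gt, 0.70),
    "processing_minutes": (operator.lt, 5.0),          # webhook to ticket
}


def check_targets(observed: dict) -> dict:
    """Return {metric: True/False} for every observed metric that has a target."""
    return {
        name: compare(observed[name], threshold)
        for name, (compare, threshold) in TARGETS.items()
        if name in observed
    }
```

For example, `check_targets({"pm_review_rate": 0.35})` reports that metric as missing its target, while metrics that were not observed are simply omitted from the result.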
Key tradeoffs
Failure modes instrumented: empty SharePoint URL halts the write rather than guessing, missing retrieval corpus halts the auto-route path, and revision text is treated as untrusted user content to guard against prompt injection.
Project Initiation Intelligence Brief
When a project enters the queue, an implementation manager is supposed to read the full account history before assigning a PM. The handoff form meant to capture it is used on one product, filled out poorly, and a cross-product rollout depends on sales buy-in that history says will not come. This pipeline makes that question moot: the information already lives in Cloud Coach and Gong, so the brief is produced automatically, for every product, with no form for sales to fill.
The flow
Model selection
Synthesis across deduplicated email, transcripts, and records, surfacing anomalies a human IM would catch. A small model here produces something technically a brief but missing the signal, which is worse than no brief because it creates false confidence.
Current-gen synthesis, near prior-Opus capability; ~$0.06 per brief
A genuine option if cost ever outweighs synthesis quality
At a heavy hypothetical of 100 projects a week, Sonnet 4.6 runs $25 to $45 a month. Against a tool replacing hundreds of review hours a year, the model cost is immaterial, so the spend goes to the dimension that decides whether IMs trust the output.
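The low end of that range can be checked directly from the roughly $0.06 per-brief figure. The sketch below assumes 52/12 weeks per month; the per-brief cost behind the $45 upper end is not stated, so the second helper only back-solves what it would have to be.

```python
WEEKS_PER_MONTH = 52 / 12  # assumption: average month length


def monthly_cost(briefs_per_week: float, cost_per_brief: float) -> float:
    """Model spend per month for a given weekly volume and per-brief cost."""
    return briefs_per_week * cost_per_brief * WEEKS_PER_MONTH


def implied_cost_per_brief(monthly_spend: float, briefs_per_week: float) -> float:
    """Back-solve the per-brief cost that a monthly figure would imply."""
    return monthly_spend / (briefs_per_week * WEEKS_PER_MONTH)


low = monthly_cost(100, 0.06)               # about $26 a month
implied_high = implied_cost_per_brief(45, 100)  # about $0.10 per brief
```

At 100 briefs a week, $0.06 per brief comes to about $26 a month, in line with the bottom of the range; the $45 top would correspond to roughly $0.10 per brief, which is an inference rather than a stated figure.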
Why the evaluation is forward-looking
Architecture A grades against 18 months of consistent archived output. Architecture B cannot, and that is the problem statement, not a gap. The CRF has been filled out poorly for years; grading against that archive would mean measuring against a broken answer key. So the eval is forward-looking from day one: a PM feedback module built into the end-of-project CRF completion (a one-to-five rating plus a conditional reason field) accumulates into a labeled dataset as volume builds. The asymmetry between the two architectures is deliberate.
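One possible shape for that feedback record, as a sketch. The source specifies only a one-to-five rating and a conditional reason field; the rule that a reason is required at a rating of 3 or below, and the field names, are assumptions made here for illustration.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional

# Assumption: ratings at or below this value must explain themselves.
REASON_REQUIRED_AT_OR_BELOW = 3


@dataclass(frozen=True)
class BriefFeedback:
    project_id: str
    rating: int                    # one to five, per the feedback module
    reason: Optional[str] = None   # conditional field

    def __post_init__(self) -> None:
        if not 1 <= self.rating <= 5:
            raise ValueError("rating must be between 1 and 5")
        needs_reason = self.rating <= REASON_REQUIRED_AT_OR_BELOW
        if needs_reason and not (self.reason or "").strip():
            raise ValueError("a reason is required for low ratings")


def average_rating(records: list[BriefFeedback]) -> float:
    """Mean usefulness rating across all collected feedback."""
    return mean(r.rating for r in records)
```

Requiring a reason only on low ratings keeps the form light for PMs while making every negative signal usable as a labeled example.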
What gets measured
| Metric | Target |
|---|---|
| Brief generation rate | 100% |
| Escalation flag precision | > 60% at 90 days |
| PM usefulness rating | > 3.5 / 5.0 average |
| Thin-record rate | Track (no target) |
Key tradeoffs
The one real risk: governance
The data this pipeline reads already lives in Cloud Coach and is already viewable by anyone with access there. This is not net-new internal exposure. The only new question is routing that data out to an external model provider, and that is a compliance and contractual clearance, not a technical wall. Because the data is already centralized, it is a more contained question than a fresh data-access request, and the same clearance unblocks every later pipeline that touches these sources.
If that clearance fails, the costed fallback is Ollama running an open-weight model on CivicPlus infrastructure, keeping all content in-house. The tradeoff is stated plainly: a small local model is weaker at exactly the multi-source synthesis this architecture argues for, so self-hosting trades away the capability that justified Sonnet 4.6 in the first place. It is the fallback if clearance fails, not the default.
The working n8n flow
Architecture A, built. When a revision form is submitted, the workflow resolves the project, assembles the two-layer scope context, sends it to Claude Haiku 4.5 via OpenRouter for a single classification pass, writes a DRAFT OneNote page, and routes the batch by confidence and scope. It executes end to end. Every node needing a CivicPlus credential is stubbed, with the real API call documented in its notes.
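Routing "by confidence and scope" could reduce to a function like the one below. It is a sketch, not the workflow's actual node code: the 0.8 threshold, the field names, and the branch labels are hypothetical, and the real threshold lives in the Architecture A design.

```python
# Hypothetical value; the actual routing threshold is defined in Architecture A.
CONFIDENCE_THRESHOLD = 0.8


def route_batch(items: list[dict]) -> str:
    """Pick one of the two branches for a whole batch.

    A batch goes to the PM review path if it is empty, if any item falls
    below the confidence threshold, or if any item is flagged out of scope.
    Otherwise it is auto-routed to the art director.
    """
    if not items:
        return "pm_review"
    low_confidence = any(i["confidence"] < CONFIDENCE_THRESHOLD for i in items)
    out_of_scope = any(not i["in_scope"] for i in items)
    return "pm_review" if (low_confidence or out_of_scope) else "auto_route"
```

Deciding at the batch level means a single doubtful item holds the whole batch for review, which errs toward the human gate.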
The eleven nodes, in order
1. Manual Trigger · stands in for the production PADS webhook
2. Code · injects the realistic test submission
3. Code · resolves project by POC email, then PM, with fuzzy org fallback
4. Code · two Graph calls in production (notebook, then section)
5. Code · builds the two-layer scope corpus and the request body
6. HTTP · live OpenRouter call to anthropic/claude-haiku-4-5
7. Code · parses JSON, computes the routing decision
8. Code · builds the DRAFT page; fires on every batch, both paths
9. Switch · the one branch point in the flow
10. Code · creates the AD ticket and an informational PM notice
11. Code · sends flags and the draft clarification email to the PM
The two real-logic Code nodes (5 and 7) carry the pipeline's actual intelligence; the one real external call (6) does the classification. Everything stubbed is a CivicPlus credential boundary, not a design gap, and each stub documents its production endpoint inline.
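The project-resolution step (POC email, then PM, then fuzzy org fallback) might look like the sketch below. It reads that order as exact match on POC email, then exact match on PM email, then a fuzzy match on organization name using the standard library's `difflib`. The record keys and the 0.8 similarity cutoff are assumptions, and the stubbed node may match differently.

```python
from difflib import get_close_matches
from typing import Optional


def resolve_project(submission: dict, projects: list[dict],
                    cutoff: float = 0.8) -> Optional[dict]:
    """Find the project a revision submission belongs to.

    Order: exact POC email, then exact PM email, then fuzzy org name.
    Keys ("email", "org", "poc_email", "pm_email") are hypothetical.
    """
    email = submission.get("email", "").strip().lower()
    if email:
        for key in ("poc_email", "pm_email"):
            for project in projects:
                if project.get(key, "").lower() == email:
                    return project

    # Fallback: closest organization name above the similarity cutoff.
    by_org = {p["org"].lower(): p for p in projects}
    org = submission.get("org", "").strip().lower()
    match = get_close_matches(org, list(by_org), n=1, cutoff=cutoff)
    return by_org[match[0]] if match else None
```

Returning `None` when nothing clears the cutoff lets the caller halt or escalate instead of attaching a revision to the wrong project.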
One run, start to finish
A real Standard-tier batch from a test "Westminster CO" submission. The customer dumped multiple requests into one text block, exactly the behavior the manual process struggles with. Here is what went in and what the model returned.
2. The about us page needs a new photo and the staff bios should be moved below the mission statement.
3. Add a search bar to the homepage. We also want to add a live chat widget and integrate our Facebook feed.
Honest notes on the build
The routing threshold, scope ruleset, OneNote page structure, due-date logic, and two-branch outcome are all implemented as designed in Architecture A. Nothing was removed that affects the pipeline's logic.
Why this submission, from inside the function
These opportunities are not hypothetical. They come from years of living the pain points, and from tools already built and running against them: a mass-notifications import pipeline with fuzzy matching, a Chrome-extension agenda importer evolving toward headless Playwright, an accessibility scanner with an OpenAI feature and C-suite visibility. An external candidate can guess where the work is slow. This is knowing.