Build vs buy: what it actually takes to ship an AI agent for Slack in 2026
OAuth scopes, event subscriptions, tool-call gateway, approval architecture, audit logs, on-call rotation. A realistic walk-through of the 6-12 month engineering project to build your own AI agent for Slack — and where buy starts to win.
Marketing operators ask this every other week now: do we build our own AI agent for Slack, or install a managed one? The answer depends on what you actually need. This post walks through what 'build' realistically costs in 2026 — every component, every gotcha, every place we've seen teams underestimate the engineering surface — and then makes the honest case for where 'buy' wins. Mavrick is the buy side of this trade. We'll be transparent about that throughout.
The short version: a production-grade AI agent for Slack is a 6-12 month engineering project for a team of 2-3 senior engineers. For most marketing teams, installing a managed agent is the better trade. For platform teams with unique tools or strict compliance, building is sometimes the right call.
Phase 1 — Slack app foundation (2-4 weeks)
You start by creating a Slack app via api.slack.com/apps and defining its manifest. The manifest declares OAuth scopes, event subscriptions, slash commands, interactive components, and the bot user. Mistakes here ripple through everything else.
OAuth scopes — the trust surface
The minimum useful scope set for an AI agent that responds to @-mentions and posts back to channels:
- app_mentions:read: receive events when users @-mention the bot
- chat:write: post messages back in channels and threads
- im:write: open and post in DMs (for operator approval flows)
- im:history: read DMs sent to the bot (im:read only exposes basic DM metadata; the message.im event requires im:history)
- users:read: resolve user IDs to names for context
- files:write: upload artifacts (CSVs, charts, briefings)
Add channels:history if the agent needs to read channel context beyond just @-mentions. Add commands if you want a slash-command interface alongside @-mentions. Be parsimonious — every scope you ask for is friction at the install step, and Slack App Directory review weighs scope creep heavily.
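The scope set can be written down as a manifest fragment. This is a minimal sketch in Python that builds the `oauth_config` section and serializes it to JSON. The app name `example-agent` is hypothetical, and `im:history` is included on the assumption that the bot needs the content of DMs (the `message.im` event requires it).

```python
import json

# Minimum bot scopes for an @-mention agent. im:history is assumed here
# because Slack requires it to deliver message.im events.
MINIMUM_BOT_SCOPES = [
    "app_mentions:read",
    "chat:write",
    "im:write",
    "im:history",
    "users:read",
    "files:write",
]


def build_manifest(extra_scopes=()):
    """Return a manifest dict; extra_scopes are opt-in, e.g. channels:history."""
    scopes = list(MINIMUM_BOT_SCOPES)
    for scope in extra_scopes:
        if scope not in scopes:
            scopes.append(scope)
    return {
        "display_information": {"name": "example-agent"},  # hypothetical name
        "features": {
            "bot_user": {"display_name": "example-agent", "always_online": True}
        },
        "oauth_config": {"scopes": {"bot": scopes}},
    }


# Opt in to ambient channel context only when the agent really needs it.
manifest = build_manifest(extra_scopes=["channels:history"])
print(json.dumps(manifest, indent=2))
```

Keeping the minimum list in one constant makes scope creep visible in code review: every additional scope has to be passed in explicitly.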
Event subscriptions — the inbound webhook
Configure your Request URL to point at a public HTTPS endpoint you control. Slack will send a one-time URL verification challenge; respond with the challenge string within 3 seconds or the endpoint won't validate. Subscribe to the app_mention and message.im events (and message.channels, which requires the channels:history scope, if you need ambient context).
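The verification handshake is small enough to sketch as a pure function. The example below assumes your web framework hands you the raw request body; the function name and the (status, headers, body) return shape are hypothetical, and request signature checking is left out for brevity.

```python
import json


def handle_slack_request(body: bytes):
    """Return (status, headers, body) for an Events API POST."""
    payload = json.loads(body)
    if payload.get("type") == "url_verification":
        # Slack sends {"type": "url_verification", "challenge": "..."} once;
        # echoing the challenge back as plain text validates the endpoint.
        return 200, {"Content-Type": "text/plain"}, payload["challenge"]
    # Any other event type: acknowledge here and process it elsewhere.
    return 200, {"Content-Type": "text/plain"}, ""


status, headers, text = handle_slack_request(
    json.dumps({"type": "url_verification", "challenge": "abc123"}).encode()
)
print(status, text)
```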
Critical gotcha: Slack retries failed deliveries up to 3 times with backoff. If your endpoint takes >3 seconds to respond, you'll get duplicate events. You must ack the event within 3s (return 200 immediately) and process the work asynchronously. Most teams accomplish this by enqueuing the event into a job queue (Sidekiq, BullMQ, Celery) and returning 200 right away.
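A minimal sketch of the ack-then-process pattern, using Python's standard-library queue as a stand-in for Sidekiq, BullMQ, or Celery. The in-memory set of seen `event_id` values is an assumption for illustration; a production system would keep that in a shared store so every web process sees it.

```python
import json
import queue
import threading

event_queue: "queue.Queue[dict]" = queue.Queue()
_seen_event_ids = set()
_seen_lock = threading.Lock()


def receive_event(body: bytes) -> int:
    """Ack immediately; enqueue the event unless it is a retried duplicate."""
    payload = json.loads(body)
    event_id = payload.get("event_id")
    with _seen_lock:
        if event_id in _seen_event_ids:
            return 200  # duplicate delivery: ack again, do not re-enqueue
        _seen_event_ids.add(event_id)
    event_queue.put(payload)
    return 200  # returned before any LLM or tool work happens


def worker(process):
    """Drain the queue in the background; process() does the slow work."""
    while True:
        payload = event_queue.get()
        process(payload)
        event_queue.task_done()


processed = []
threading.Thread(target=worker, args=(processed.append,), daemon=True).start()

body = json.dumps(
    {"type": "event_callback", "event_id": "Ev123", "event": {"type": "app_mention"}}
).encode()
first = receive_event(body)
retry = receive_event(body)  # simulates Slack redelivering the same event
event_queue.join()
print(first, retry, len(processed))
```

Both calls return 200, but only one copy of the event reaches the worker.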
Phase 2 — LLM integration + tool calls (3-6 weeks)
Now you wire the agent to an LLM and give it tools. The LLM parses intent from the user's @-mention, decides which tool to call, and composes a response. As of mid-2026, the dominant choices are Claude (Anthropic) and GPT-5 (OpenAI). Both support tool-calling via JSON-schema-defined functions.
Tool gateway — the most-overlooked component
Your tools are functions the LLM can call: 'pull last week's Meta Ads ROAS', 'list HubSpot contacts updated today', 'pause Google campaign id_X', etc. The gateway layer that maps LLM tool-call JSON to actual API requests is the single most-rebuilt component of every AI agent project. It needs:
- Parameter validation (LLMs sometimes hallucinate parameter values)
- Authentication injection (OAuth tokens fetched from your secure vault, injected at the boundary; the LLM should NEVER see raw tokens)
- Rate limit handling (with per-vendor backoff strategy)
- Response caching (same query in two minutes shouldn't re-fetch)
- Error normalization (Meta's 400 vs Stripe's 402 vs HubSpot's 401 all need to surface as the same coherent failure type to the LLM)
- Audit logging (every tool call gets a row in your audit_log with timestamp, tool, params hash, result code, workspace_id)
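To make the gateway's responsibilities concrete, here is a minimal sketch covering four of them: parameter validation, token injection, error normalization, and audit logging. Rate limiting and caching are omitted. Every name in it (`TOOLS`, `call_tool`, `fake_vendor_call`, the error kinds) is hypothetical, and the vendor call is a stub rather than a real HTTP request.

```python
import hashlib
import json
from datetime import datetime, timezone


class ToolError(Exception):
    """Single failure type surfaced to the LLM, whatever the vendor returned."""

    def __init__(self, kind, message):
        super().__init__(message)
        self.kind = kind


# Hypothetical mapping from vendor HTTP status codes to one shared vocabulary.
STATUS_TO_KIND = {400: "invalid_request", 401: "auth_failed",
                  402: "payment_required", 429: "rate_limited"}

# Hypothetical tool registry: vendor plus required parameter names and types.
TOOLS = {"get_roas": {"vendor": "meta_ads",
                      "required": {"account_id": str, "days": int}}}

audit_log = []  # stand-in for the audit_log table


def fake_vendor_call(vendor, token, params):
    """Stub for the real HTTP request; rejects windows over 90 days."""
    return (400, None) if params["days"] > 90 else (200, {"roas": 3.2})


def call_tool(workspace_id, name, params, vault, vendor_call=fake_vendor_call):
    spec = TOOLS.get(name)
    if spec is None:
        raise ToolError("unknown_tool", f"no tool named {name}")
    # 1. Validate before anything leaves the process.
    for key, expected in spec["required"].items():
        if not isinstance(params.get(key), expected):
            raise ToolError("invalid_params", f"{key} must be {expected.__name__}")
    # 2. Inject the token at the boundary; it never appears in params or logs.
    token = vault[(workspace_id, spec["vendor"])]
    status, result = vendor_call(spec["vendor"], token, params)
    # 3. Audit every executed call with a hash of the parameters.
    digest = hashlib.sha256(json.dumps(params, sort_keys=True).encode()).hexdigest()
    audit_log.append({"ts": datetime.now(timezone.utc).isoformat(),
                      "workspace_id": workspace_id, "tool": name,
                      "params_hash": digest, "result_code": status})
    # 4. Normalize vendor failures into one error type.
    if status != 200:
        kind = STATUS_TO_KIND.get(status, "vendor_error")
        raise ToolError(kind, f"{spec['vendor']} returned {status}")
    return result


vault = {("W1", "meta_ads"): "secret-token"}  # stand-in for the secure vault
result = call_tool("W1", "get_roas", {"account_id": "act_1", "days": 7}, vault)
print(result, audit_log[-1]["result_code"])
```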
OAuth credential management
Your customers authorize the agent against their Meta Ads, Google Ads, HubSpot, Stripe, etc. Each of those gives you an OAuth refresh token. Tokens must be stored encrypted at rest (we use Supabase Vault with pgsodium and an isolated master key). Tokens must be rotated when refresh windows close. Tokens must be redacted from every log output. Tokens must never reach the LLM.
If you have to support more than 10 third-party services, this is where most build projects stall. The OAuth-flow-per-vendor maintenance is unending — every vendor changes their callback URL format, scope syntax, or token rotation pattern every 6-12 months. The managed-connector layer pattern (Pipedream Connect, Workato, Tray.io, or roll-your-own) exists specifically because this work is too cross-cutting to keep in-house at most companies.
Phase 3 — Approval architecture (2-4 weeks if done right)
An AI agent that takes action without explicit human approval will eventually take a wrong action. The recovery cost is steep — financial loss, customer trust damage, audit-trail gaps. The cleared-hot approval pattern (one-click confirmation in Slack before any mutation) is non-negotiable for a production agent.
The architecture: any tool call classified as mutative (writes, deletes, transfers, sends) returns a proposed-action message rather than executing. The message includes the action description, expected outcome, and a Slack Block Kit button identified by an action_id, with a reference to the pending action in its value. When the user clicks 'Cleared hot — execute', your interactive endpoint receives the block_actions payload, validates the user is authorized to approve, and triggers the actual execution.
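A minimal sketch of that flow, assuming an in-memory store of pending actions and a fixed approver list (both hypothetical). The block structure and the `actions` and `user` fields of the interaction payload follow Block Kit conventions; the `cleared_hot_execute` identifier is invented for illustration.

```python
import uuid

pending_actions = {}          # action token -> proposed tool call
APPROVERS = {"U_OPERATOR"}    # hypothetical set of Slack user IDs allowed to approve
ACTION_ID = "cleared_hot_execute"  # hypothetical identifier


def propose_action(tool, params, description, expected_outcome):
    """Store the proposed call and return Block Kit blocks asking for approval."""
    token = uuid.uuid4().hex
    pending_actions[token] = {"tool": tool, "params": params}
    return [
        {"type": "section",
         "text": {"type": "mrkdwn",
                  "text": f"*Proposed action:* {description}\n"
                          f"*Expected outcome:* {expected_outcome}"}},
        {"type": "actions",
         "elements": [{"type": "button", "action_id": ACTION_ID,
                       "style": "primary", "value": token,
                       "text": {"type": "plain_text",
                                "text": "Cleared hot — execute"}}]},
    ]


def handle_block_action(payload, execute):
    """Handle a block_actions payload; execute only for an authorized approver."""
    action = payload["actions"][0]
    if action["action_id"] != ACTION_ID:
        return "ignored"
    if payload["user"]["id"] not in APPROVERS:
        return "unauthorized"
    # pop() makes approval single-use: a second click finds nothing to run.
    proposed = pending_actions.pop(action["value"], None)
    if proposed is None:
        return "expired"
    execute(proposed["tool"], proposed["params"])
    return "executed"


executed = []
blocks = propose_action("pause_campaign", {"campaign_id": "id_X"},
                        "Pause Google campaign id_X", "Spend stops immediately")
token = blocks[1]["elements"][0]["value"]
click = {"type": "block_actions", "user": {"id": "U_OPERATOR"},
         "actions": [{"action_id": ACTION_ID, "value": token}]}
outcome = handle_block_action(click, lambda tool, params: executed.append(tool))
print(outcome, executed)
```

Because the mutating call only ever runs inside `handle_block_action`, there is no code path that executes it without a click from an approver.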
Cleared-hot is not a setting. If your agent has an 'auto-approve' toggle, the toggle will eventually be on when something destructive runs. Make the approval gate architectural — there is no path that bypasses it. This is Privacy Charter Rule 3 at Mavrick.
The classification problem
Deciding which tool calls require approval is harder than it looks. 'Read MRR' is obviously safe. 'Pause campaign' is obviously mutative. 'Generate a draft email' looks safe but if your tool definition includes 'send' as a side effect, it isn't. The safest pattern is a tool-side flag (mutates: true/false) that every tool declares explicitly, plus a CI check that fails the build if any new tool is added without that flag.
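One way to sketch the explicit flag and the CI check, assuming tools are registered through a decorator. The registry, decorator, and tool names are hypothetical; the point is that a missing flag is detectable by a script and that the runtime fails closed.

```python
TOOL_REGISTRY = {}


def tool(name, **meta):
    """Register a tool function along with its declared metadata."""
    def register(fn):
        TOOL_REGISTRY[name] = {"fn": fn, **meta}
        return fn
    return register


@tool("read_mrr", mutates=False)
def read_mrr():
    return 0


@tool("pause_campaign", mutates=True)
def pause_campaign(campaign_id):
    return campaign_id


@tool("draft_email")  # flag forgotten: the CI check should catch this
def draft_email(to, body):
    return (to, body)


def tools_missing_mutates_flag(registry):
    """Names of tools that did not declare mutates as an explicit boolean."""
    return sorted(name for name, meta in registry.items()
                  if not isinstance(meta.get("mutates"), bool))


def requires_approval(name, registry):
    """Fail closed: anything not explicitly mutates=False needs approval."""
    return registry[name].get("mutates", True) is not False


missing = tools_missing_mutates_flag(TOOL_REGISTRY)
print(missing)
```

In CI, the script would exit non-zero whenever `missing` is non-empty, for example with `sys.exit(1)`, so the build fails until the new tool declares its flag.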
Phase 4 — Production concerns (4-8 weeks across the project)
Audit logging
Every tool call lands in a per-workspace audit_log with timestamp, tool name, parameters (sanitized of secrets), result code, and workspace_id. 12 months rolling retention is the common standard for B2B SaaS. Make this queryable by workspace admins from a dashboard — your future SOC 2 audit will thank you.
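A minimal sketch of such a table, using SQLite from the standard library as a stand-in for your production database. The column names follow the fields listed above; the list of secret-looking keys and the helper names are assumptions.

```python
import json
import sqlite3
from datetime import datetime, timedelta, timezone

# Hypothetical list of parameter names that must never be stored.
SECRET_KEYS = {"token", "access_token", "refresh_token", "api_key", "authorization"}

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE audit_log (
        id INTEGER PRIMARY KEY,
        ts TEXT NOT NULL,
        workspace_id TEXT NOT NULL,
        tool TEXT NOT NULL,
        params TEXT NOT NULL,
        result_code TEXT NOT NULL
    )
""")
# Index supports the per-workspace dashboard query.
conn.execute("CREATE INDEX audit_log_ws_ts ON audit_log (workspace_id, ts)")


def sanitize(params):
    """Replace values of secret-looking keys before the row is written."""
    return {k: "[REDACTED]" if k.lower() in SECRET_KEYS else v
            for k, v in params.items()}


def record(conn, workspace_id, tool, params, result_code):
    conn.execute(
        "INSERT INTO audit_log (ts, workspace_id, tool, params, result_code) "
        "VALUES (?, ?, ?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), workspace_id, tool,
         json.dumps(sanitize(params), sort_keys=True), result_code),
    )


def rows_for_workspace(conn, workspace_id):
    return conn.execute(
        "SELECT tool, params, result_code FROM audit_log "
        "WHERE workspace_id = ? ORDER BY ts", (workspace_id,)
    ).fetchall()


def purge_older_than(conn, days=365):
    """Rolling retention: drop rows older than the cutoff."""
    cutoff = (datetime.now(timezone.utc) - timedelta(days=days)).isoformat()
    conn.execute("DELETE FROM audit_log WHERE ts < ?", (cutoff,))


record(conn, "W1", "get_roas", {"days": 7, "access_token": "secret"}, "200")
record(conn, "W2", "read_mrr", {}, "200")
rows = rows_for_workspace(conn, "W1")
print(rows)
```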
Rate limiting and per-workspace concurrency caps
Without per-workspace caps, one chatty workspace will starve your LLM quota and degrade response time for everyone else. The Mavrick pattern: a sliding 90-second window enforces a maximum of 5 simultaneous agent runs per workspace, env-overridable. Cap-hit returns an honest customer message ("Mavrick is busy — burst limit") rather than a silent drop, and fires an operator alert.
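A minimal in-process sketch of a per-workspace cap. It counts run starts inside a sliding window as a proxy for simultaneous runs, which is an assumption about how the pattern works; the class name and environment variable names are hypothetical, and a multi-process deployment would need a shared store.

```python
import os
import time
from collections import defaultdict, deque

# Hypothetical environment variable names for the overrides.
WINDOW_SECONDS = float(os.environ.get("AGENT_WINDOW_SECONDS", "90"))
MAX_RUNS = int(os.environ.get("AGENT_MAX_RUNS", "5"))


class WorkspaceLimiter:
    def __init__(self, max_runs=MAX_RUNS, window=WINDOW_SECONDS, clock=time.monotonic):
        self.max_runs = max_runs
        self.window = window
        self.clock = clock
        self._starts = defaultdict(deque)  # workspace_id -> recent start times

    def try_start(self, workspace_id):
        """Return True if the run may start, False if the cap is hit."""
        now = self.clock()
        starts = self._starts[workspace_id]
        # Drop starts that have slid out of the window.
        while starts and now - starts[0] >= self.window:
            starts.popleft()
        if len(starts) >= self.max_runs:
            return False  # caller sends the busy message and alerts the operator
        starts.append(now)
        return True


# Fake clock so the example is deterministic.
fake_now = [0.0]
limiter = WorkspaceLimiter(max_runs=5, window=90, clock=lambda: fake_now[0])
results = [limiter.try_start("W1") for _ in range(6)]
other_workspace = limiter.try_start("W2")
fake_now[0] = 91.0
after_window = limiter.try_start("W1")
print(results, other_workspace, after_window)
```

The sixth request from the same workspace is refused, a different workspace is unaffected, and capacity returns once the window has passed.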
Idempotency on webhook ingestion
Slack retries events. Stripe retries webhooks. Every external event source retries. Without idempotency, you'll double-charge users and double-post messages. Mavrick gates Stripe events through a UNIQUE primary-key check on processed_stripe_events before any dispatch runs. If Stripe retries delivery, the duplicate is caught at the PG level, returns 200 received:true with duplicate:true, and the side-effecting dispatch never fires twice.
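The primary-key gate can be sketched with SQLite standing in for Postgres. The table name comes from the text above; the handler name and the dispatch callback are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE processed_stripe_events (event_id TEXT PRIMARY KEY)")


def handle_stripe_event(conn, event, dispatch):
    """Record the event ID first; dispatch only if the insert succeeded."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO processed_stripe_events (event_id) VALUES (?)",
                (event["id"],),
            )
    except sqlite3.IntegrityError:
        # The primary key already exists: this is a redelivery.
        return 200, {"received": True, "duplicate": True}
    dispatch(event)
    return 200, {"received": True}


dispatched = []
event = {"id": "evt_123", "type": "invoice.paid"}
first = handle_stripe_event(conn, event, dispatched.append)
second = handle_stripe_event(conn, event, dispatched.append)
print(first, second, len(dispatched))
```

Letting the database enforce uniqueness avoids the race in a check-then-insert approach, where two concurrent deliveries could both pass the check. In Postgres, `INSERT ... ON CONFLICT DO NOTHING` combined with a row-count check is an alternative to catching the exception.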
Operator alerting and on-call
When the agent fails, you need to know before the customer does. Wire your errors into Sentry. Wire your high-severity failures (catastrophic exceptions, cost-cap breaches, credential rotation failures) into a Telegram or PagerDuty rotation. Scrub secrets from every alert before send — bearer tokens, sk- prefixes, xoxb-, ghp_, Stripe live keys, AWS access keys, etc. all have characteristic patterns that should be regex-redacted.
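A minimal sketch of regex redaction for the token families named above. The patterns are approximations written for illustration, not an exhaustive or authoritative list, and the sample secrets are built from filler characters.

```python
import re

# Approximate patterns; tune and extend these for your own key formats.
SECRET_PATTERNS = [
    re.compile(r"Bearer\s+[A-Za-z0-9._~+/=-]+"),     # HTTP bearer tokens
    re.compile(r"\b(?:sk|rk)_live_[A-Za-z0-9]+"),    # Stripe live keys
    re.compile(r"\bsk-[A-Za-z0-9_-]{16,}"),          # sk- prefixed API keys
    re.compile(r"\bxoxb-[A-Za-z0-9-]+"),             # Slack bot tokens
    re.compile(r"\bghp_[A-Za-z0-9]{20,}"),           # GitHub personal access tokens
    re.compile(r"\bAKIA[0-9A-Z]{16}\b"),             # AWS access key IDs
]


def redact(text):
    """Replace anything that looks like a credential before sending an alert."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text


# Sample secrets assembled from filler characters, not real credentials.
samples = [
    "Bearer abc.def.ghi",
    "xoxb-123-456-abcDEF",
    "sk_live_" + "a" * 24,
    "sk-" + "b" * 32,
    "ghp_" + "c" * 36,
    "AKIAIOSFODNN7EXAMPLE",  # example key ID from AWS documentation
]
alert = "auth failed: " + " | ".join(samples)
clean = redact(alert)
print(clean)
```

Running redaction in the alerting layer itself, rather than at each call site, means a forgotten call site cannot leak a token into Sentry or a pager message.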
Schema drift detection
When you ship a database migration that drops a column your tool code references, you'll discover it in production. The standard mitigation is a CI check that parses applied migrations and walks every database-query call site to verify referenced columns exist. We installed this at Mavrick after a single dropped-column incident — never had another one.
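A deliberately simplified sketch of such a check. It assumes migrations are plain SQL using only CREATE TABLE, ADD COLUMN, and DROP COLUMN with simple column types, and that query call sites look like `.table("name").select("col, col")`. Both conventions are hypothetical; a real check would use a proper SQL parser and match your own query builder.

```python
import re

# Hypothetical call-site convention: db.table("contacts").select("id, email")
CALL_RE = re.compile(r'\.table\("(\w+)"\)\.select\("([^"]*)"\)')


def schema_from_migrations(migrations):
    """Replay migrations in order and return {table: set(columns)}."""
    schema = {}
    for sql in migrations:
        for stmt in (s.strip() for s in sql.split(";")):
            m = re.match(r"CREATE TABLE (\w+)\s*\((.*)\)$", stmt, re.I | re.S)
            if m:
                # Naive split on commas: fine for simple column types only.
                schema[m.group(1)] = {part.split()[0]
                                      for part in m.group(2).split(",")
                                      if part.strip()}
                continue
            m = re.match(r"ALTER TABLE (\w+) ADD COLUMN (\w+)", stmt, re.I)
            if m:
                schema.setdefault(m.group(1), set()).add(m.group(2))
                continue
            m = re.match(r"ALTER TABLE (\w+) DROP COLUMN (\w+)", stmt, re.I)
            if m:
                schema.get(m.group(1), set()).discard(m.group(2))
    return schema


def find_drift(schema, source_code):
    """Return (table, column) pairs referenced in code but absent from schema."""
    problems = []
    for table, cols in CALL_RE.findall(source_code):
        for col in (c.strip() for c in cols.split(",")):
            if col and col not in schema.get(table, set()):
                problems.append((table, col))
    return problems


migrations = [
    "CREATE TABLE contacts (id INTEGER PRIMARY KEY, email TEXT, phone TEXT);",
    "ALTER TABLE contacts DROP COLUMN phone;",
]
code = 'rows = db.table("contacts").select("id, phone")'
drift = find_drift(schema_from_migrations(migrations), code)
print(drift)
```

The check reports the `phone` reference because the second migration dropped that column; in CI, a non-empty result would fail the build.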
Phase 5 — Slack App Directory submission (2-6 weeks calendar)
If you want your customers to install your agent in one click, you need App Directory approval. Slack's review process typically takes 2-6 weeks and gates on: scope minimization (asking for fewer scopes wins), privacy policy clarity, terms of service, in-product app icon, screenshots, written description, and the install-flow user experience. Expect 1-2 rounds of revisions before approval. The first submission almost never passes.
If you don't need App Directory listing, you can distribute as a private app — share an install link directly, and the customer's admin approves the OAuth on first install. This is faster but limits your reach.
Phase 6 — Ongoing maintenance (perpetual)
After ship, you're on the hook for:
- LLM API changes: OpenAI and Anthropic make breaking changes every few months
- Slack API deprecations: Slack deprecates old methods routinely; you have ~6 months to migrate
- Vendor API changes for every connected tool: Meta breaks their API somewhere quarterly; Google Ads has a 1-year migration window when they deprecate; HubSpot announces breaking changes with 90-day notice
- Security patches across your dependency tree: Renovate or Dependabot catches them; you still need to ship them
- Customer-reported bugs: every workspace runs slightly different workflows; edge cases will surface
- Compliance evidence: SOC 2 Type 2 requires continuous logging, not just point-in-time audits
Maintenance is the silent cost. A working AI agent for Slack at customer-zero is 30% of the total project. The other 70% is keeping it working through 2-5 years of platform evolution.
Total realistic effort
| Phase | Senior engineer-weeks | Calendar weeks |
|---|---|---|
| Slack app foundation | 3-5 | 2-4 |
| LLM integration + tool gateway | 8-12 | 3-6 |
| Approval architecture | 3-5 | 2-4 |
| Production concerns (audit, rate limit, idempotency, on-call) | 6-10 | 4-8 |
| App Directory submission + revisions | 1-2 | 2-6 (calendar wait) |
| TOTAL (build to customer-zero) | 21-34 | 13-28 |
Plus 30-60% of one senior engineer perpetually for maintenance. At 2026 SF-Bay-Area fully-loaded engineering cost (~$300K/yr per senior eng), that's $360K-$1.2M to build and $90K-$180K/yr to maintain.
Where 'build' wins
There are real cases where building is the right call:
- Your stack is genuinely unique: you've built proprietary internal tools the agent must integrate with that no managed connector covers
- Compliance forces on-prem hosting: regulated industries (defense, certain healthcare verticals) sometimes can't use SaaS at all
- You're a platform company selling agents as a core product: owning the infrastructure becomes a competitive asset
- You have a 5+ engineer team where ongoing maintenance is amortized across many product features
Where 'buy' wins (most marketing teams)
For marketing teams not in those buckets, installing a managed AI agent for Slack is the better trade:
- 60-second install vs. 13-28 weeks to ship
- $50/mo Pilot tier vs. $360K+ build cost
- 3,200+ integrations via managed connector layer vs. a one-by-one OAuth project per tool
- SOC 2 posture, GDPR alignment, contractual Privacy Charter included vs. building your own compliance program
- Public Constitution + system prompt + decline log governing the agent's behavior vs. rolling your own governance
- Anthropic + OpenAI model updates flowing through automatically vs. you keeping pace with API changes
Mavrick is the buy side of this trade. If your team's job is marketing and not platform engineering, the math leans heavily toward installing instead of building. The 6-12 months you'd spend building is 6-12 months you could spend running campaigns with an agent that already works.
What to do next
If you're seriously evaluating build: download Slack's app manifest reference, sketch your tool gateway architecture, and budget 6-12 months. If you're seriously evaluating buy: install Mavrick in your Slack in 60 seconds, run a real mission against your accounts, and decide whether the trade-off works for your team. Start free — 10 missions, no credit card.

Brian MacDonald
Brian MacDonald is the founder of Mavrick, the AI coworker for marketing teams. Previously ran SetupClaw.tech, an AI deployment service for SMBs. Read more about Brian and the mission.