CI is great at running code, but it’s notoriously bad at one thing your product probably depends on: email. If your tests cover signups, password resets, magic links, or inbound notifications, you still need a way to see emails in CI without (1) logging into a shared mailbox, (2) scraping HTML, or (3) dumping sensitive content into job logs.
The good news is you can treat email like any other test artifact: provision it per run, wait deterministically, assert on structured fields, and attach the results to CI output.
Why “log into a mailbox” breaks CI
A traditional mailbox (Gmail, Outlook, IMAP) is a human UX wrapped around a long-lived identity. CI needs the opposite: short-lived, isolated, machine-readable state.
Common failure modes when you rely on a real mailbox in pipelines:
- Parallel runs collide: multiple jobs read the same inbox, pick the wrong message, or delete each other’s state.
- Auth and MFA don’t belong in CI: OAuth refresh tokens expire, MFA challenges block runs, and permissions are too broad.
- Observability is backwards: you can’t easily store “what email did we receive?” as a build artifact.
- Security gets worse: teams start pasting OTPs and magic links into logs to debug.
If you want emails to be testable, they must be consumable like events.
The three practical ways to see emails in CI
There are lots of email testing tricks, but in 2026 most teams converge on one of these three, depending on what they’re validating.
Option 1: Local SMTP capture (fast, great for dev and some CI)
If your goal is “did our app attempt to send an email?” then a local SMTP capture tool is the simplest approach.
How it works:
- Your app sends via SMTP to a local server inside the CI network.
- The server stores messages and exposes them via a small API/UI.
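As a rough sketch of the sending side, the app under test can be pointed at the capture server with Python's standard `smtplib`. The host, port 1025, the sender address, and the `build_signup_email` helper are all assumptions made for illustration; use whatever your capture tool and mailer actually expose.

```python
import smtplib
from email.message import EmailMessage

# Assumed values: adjust to wherever your capture server listens in CI.
CAPTURE_HOST = "localhost"
CAPTURE_PORT = 1025


def build_signup_email(to_addr: str, otp: str) -> EmailMessage:
    """Build the message the app would send (hypothetical template)."""
    msg = EmailMessage()
    msg["From"] = "no-reply@example.test"
    msg["To"] = to_addr
    msg["Subject"] = "Verify your email"
    msg.set_content(f"Your verification code is {otp}")
    return msg


def send_to_capture(msg: EmailMessage) -> None:
    """Deliver the message to the local capture server instead of a real relay."""
    with smtplib.SMTP(CAPTURE_HOST, CAPTURE_PORT, timeout=5) as smtp:
        smtp.send_message(msg)
```

Keeping message construction separate from delivery means the template can be asserted on even in tests that never open a socket.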
Pros:
- Very fast
- No external dependency
- Great for unit tests and integration tests
Cons:
- Not a true end-to-end test of real inbound routing
- Doesn’t catch deliverability and routing issues that only appear in production email infrastructure
This is a good default for developer workflows, but it often falls short for E2E flows that must behave like production.
Option 2: A disposable inbox API (best for E2E and verification flows)
For end-to-end tests where you need to receive real inbound email and extract an OTP or link, programmable disposable inboxes are the most reliable pattern:
- Create a fresh inbox via API per run or per attempt.
- Use its email address in the flow under test.
- Wait for arrival deterministically.
- Retrieve the message as structured JSON.
- Assert and extract only the artifact you need.
This avoids mailbox logins entirely. CI just fetches data.
Mailhook is built for exactly this model: disposable inbox creation via API, email delivered as JSON, webhook notifications, polling retrieval, shared domains, custom domain support, signed payloads, and batch processing. For the canonical integration details, use the published contract: Mailhook llms.txt.
Option 3: Provider-specific event streams (useful, but often not inbox-shaped)
Some email sending providers expose delivery events (accepted, bounced, delivered) and sometimes message previews. This can be helpful when you care about deliverability telemetry, but it’s not always enough for “click this magic link” style tests.
In most orgs, teams still need an inbox-shaped abstraction (address, messages, retrieval) for deterministic verification flows.
Quick decision table
| Goal in CI | Best approach | What you actually assert on |
|---|---|---|
| “We generated the right email content” | Local SMTP capture | Subject/from/to, text body, templates |
| “A user can complete signup verification” | Disposable inbox API | Arrival within timeout, extracted OTP or link |
| “Deliverability and provider behavior” | Provider events + targeted tests | Delivery events, bounces, complaint signals |
If your pipeline needs to “see emails” the way a user would experience them, disposable inboxes win.
The deterministic pattern: inbox per run, wait, parse JSON
The key is to stop thinking in terms of “search a mailbox.” In CI you want an explicit contract:
- Isolation: each run gets its own inbox.
- Delivery: your test harness has a reliable way to wait for messages (webhook first, polling fallback).
- Consumption: you assert on stable JSON fields, not rendered HTML.

Step 1: Provision an inbox
Create the inbox at test start and keep both:
- the email address (what your app uses)
- the inbox handle (what your test harness uses to fetch messages)
With Mailhook, the exact request and response fields are documented in llms.txt. Treat that file as the source of truth in code reviews.
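To show the shape of what the harness keeps, here is a minimal sketch using an in-memory stand-in. `FakeInboxProvider`, `Inbox`, and their field names are hypothetical and do not reflect any real provider's API; a real client would make HTTP calls where this one touches a dictionary.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Inbox:
    inbox_id: str  # handle the test harness uses to fetch messages
    address: str   # address the application under test sends to


class FakeInboxProvider:
    """In-memory stand-in for a disposable inbox API (hypothetical)."""

    def __init__(self, domain: str = "inbox.example.test") -> None:
        self._domain = domain
        self._messages: Dict[str, List[dict]] = {}

    def create_inbox(self) -> Inbox:
        # A random ID per inbox keeps parallel runs from sharing state.
        inbox_id = uuid.uuid4().hex
        self._messages[inbox_id] = []
        return Inbox(inbox_id=inbox_id, address=f"{inbox_id}@{self._domain}")

    def list_messages(self, inbox_id: str) -> List[dict]:
        # Return a copy so callers cannot mutate the provider's state.
        return list(self._messages[inbox_id])
```

Passing the `Inbox` value around as one object makes it harder to accidentally pair one run's address with another run's handle.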
Step 2: Trigger the email
Run the action under test, for example:
- signup
- password reset
- email sign in
For determinism, pass a correlation value you control (for example, a run ID) into your application so it ends up in the email. Common places:
- a query param inside the verification link
- a custom header your mailer adds
- a reference token in the email text
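The query-param option could look roughly like this. The parameter names `token` and `run_id` and both helper functions are assumptions for illustration; your application decides how the link is actually built.

```python
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse


def build_verification_link(base_url: str, token: str, run_id: str) -> str:
    """Append the verification token and the CI run ID as query parameters."""
    query = urlencode({"token": token, "run_id": run_id})
    return f"{base_url}?{query}"


def run_id_from_link(link: str) -> Optional[str]:
    """Read the run ID back out of a link found in a received email."""
    values = parse_qs(urlparse(link).query).get("run_id")
    return values[0] if values else None
```

The test harness can then compare `run_id_from_link(...)` against its own run ID before following the link.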
Step 3: Wait for arrival (polling is simplest inside CI)
Webhooks are great when you already have a public endpoint. In CI, polling is often easier and still deterministic if you implement timeouts and backoff.
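Provider-agnostic polling pseudocode:
```javascript
// Resolve after `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// `listMessages(inboxId)` is a placeholder for your provider's API call.
async function waitForEmail({ inboxId, matcher, timeoutMs }) {
  const started = Date.now();
  let delay = 250; // initial poll interval in ms
  while (Date.now() - started < timeoutMs) {
    const messages = await listMessages(inboxId);
    const match = messages.find(matcher);
    if (match) return match;
    await sleep(delay);
    delay = Math.min(delay * 1.5, 2000); // back off, capped at 2 s
  }
  throw new Error(`Timed out waiting for email in inbox ${inboxId}`);
}
```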
The important part is the matcher: make it narrow enough that parallel runs cannot match each other’s mail.
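A narrow matcher might be sketched as follows in Python. The message fields `to` (a list of addresses) and `text` are assumed names, not a documented payload format; map them to whatever your provider returns.

```python
from typing import Callable


def make_matcher(address: str, run_id: str) -> Callable[[dict], bool]:
    """Return a predicate that accepts only this run's email."""
    wanted = address.lower()

    def matcher(message: dict) -> bool:
        # Both conditions must hold: right recipient AND right correlation token.
        recipients = [r.lower() for r in message.get("to", [])]
        return wanted in recipients and run_id in message.get("text", "")

    return matcher
```

Because the token check is a substring test, it works best with fixed-length IDs such as UUIDs, so one run's ID can never be a prefix of another's.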
Step 4: Assert on JSON, extract only what you need
Once you retrieve a structured JSON representation, prefer assertions like:
- From domain is correct
- Recipient matches expected address
- Subject contains the expected intent
- Text body contains an OTP pattern
Avoid brittle behaviors:
- scraping HTML markup
- depending on exact formatting or CSS
- letting an LLM “read the whole email” when you only need an OTP
Example extraction (OTP):
```python
import re


def extract_otp(text: str) -> str:
    """Return the first standalone 6-digit code found in the text body."""
    m = re.search(r"\b(\d{6})\b", text)
    if not m:
        raise ValueError("OTP not found")
    return m.group(1)
```
Step 5: Attach the email JSON as a CI artifact (instead of printing it)
To debug failures without leaking secrets:
- write the received message JSON to a file
- upload it as an artifact
- redact or omit sensitive fields where appropriate
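One way to sketch the write-and-redact step in Python is shown below. The field names `text` and `html`, the `token=` parameter, and the 6-digit OTP shape are assumptions; extend the patterns to match what your own emails contain.

```python
import json
import re
from pathlib import Path

OTP_PATTERN = re.compile(r"\b\d{6}\b")
TOKEN_PARAM_PATTERN = re.compile(r"(token=)[^&\s\"'>]+")


def redact(text: str) -> str:
    """Mask 6-digit codes and token query values, leaving the rest readable."""
    text = OTP_PATTERN.sub("[REDACTED-OTP]", text)
    return TOKEN_PARAM_PATTERN.sub(r"\1[REDACTED]", text)


def save_email_artifact(message: dict, path: str = "email.json") -> Path:
    """Write a redacted copy of the message to disk for artifact upload."""
    safe = dict(message)  # shallow copy so the caller's message is untouched
    for field in ("text", "html"):
        if isinstance(safe.get(field), str):
            safe[field] = redact(safe[field])
    out = Path(path)
    out.write_text(json.dumps(safe, indent=2, sort_keys=True), encoding="utf-8")
    return out
```

Redacting before the file is written means the secret never lands on disk, which is safer than relying on artifact access controls alone.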
In GitHub Actions, artifact upload is straightforward:
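```yaml
# `if: always()` keeps both steps running after an earlier test step fails,
# which is exactly when the email is needed for debugging.
- name: Save inbound email JSON
  if: always()
  run: node scripts/save-email.js > email.json
- name: Upload email artifact
  if: always()
  uses: actions/upload-artifact@v4
  with:
    name: inbound-email
    path: email.json
```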
This is the biggest practical difference between “manual mailbox debugging” and “CI-grade email visibility.” You get a durable artifact tied to a run.
Webhooks in CI: when they make sense
Webhooks are ideal when:
- you run a long-lived test environment with a stable public URL
- you want the lowest latency
- you already operate a small receiver service
If you do use webhooks, treat the webhook request as an untrusted input channel:
- verify signatures
- enforce timestamp tolerance
- dedupe deliveries (retries happen)
Mailhook supports signed payloads, which is the correct primitive here. Again, the exact signature and header formats are described in Mailhook llms.txt.
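The three checks can be sketched generically as below. The signing scheme here (HMAC-SHA256 over `<timestamp>.<body>`, hex encoded), the 300-second tolerance, and the `delivery_id` parameter are assumptions chosen for illustration, not Mailhook's documented format; follow the provider's contract for the real header names and payload layout.

```python
import hashlib
import hmac
import time
from typing import Optional, Set

TOLERANCE_SECONDS = 300  # assumed replay window


def sign(secret: bytes, timestamp: str, body: bytes) -> str:
    """HMAC-SHA256 over '<timestamp>.<body>' (assumed scheme)."""
    payload = timestamp.encode() + b"." + body
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def accept_webhook(secret: bytes, timestamp: str, signature: str, body: bytes,
                   seen_ids: Set[str], delivery_id: str,
                   now: Optional[float] = None) -> bool:
    """Return True only for an authentic, fresh, not-yet-seen delivery."""
    now = time.time() if now is None else now
    try:
        sent_at = float(timestamp)
    except ValueError:
        return False
    if abs(now - sent_at) > TOLERANCE_SECONDS:
        return False  # too old or too far in the future: possible replay
    expected = sign(secret, timestamp, body)
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(expected.encode(), signature.encode()):
        return False
    if delivery_id in seen_ids:
        return False  # retry of a delivery that was already processed
    seen_ids.add(delivery_id)
    return True
```

The signature is computed over the raw request bytes, so verify before parsing the JSON body rather than after re-serializing it.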
CI reliability checklist for “seeing emails”
These are the guardrails that remove most flakiness:
Use a unique inbox per run (or per attempt)
If you share inboxes, you will eventually match the wrong message. Isolation is cheaper than debugging.
Prefer deterministic waiting over fixed sleeps
Replace sleep(10) with “wait until a matching email arrives or timeout.” This improves speed and stability.
Dedupe at the right level
Email pipelines can legitimately deliver duplicates due to retries. Your harness should be robust by:
- selecting the latest matching message
- tracking seen message IDs
- making consumption idempotent
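These three habits can be combined in one small helper, sketched here under the assumption that each message carries an `id` and a sortable `received_at` value (for example, ISO 8601 UTC timestamps in a uniform format). Both field names are hypothetical.

```python
from typing import Callable, List, Optional, Set


def latest_unseen_match(messages: List[dict],
                        matcher: Callable[[dict], bool],
                        seen_ids: Set[str]) -> Optional[dict]:
    """Return the newest matching message not consumed before, or None."""
    candidates = [m for m in messages
                  if matcher(m) and m["id"] not in seen_ids]
    if not candidates:
        return None
    latest = max(candidates, key=lambda m: m["received_at"])
    # Mark every candidate as seen so duplicates are not consumed later.
    seen_ids.update(m["id"] for m in candidates)
    return latest
```

Calling the helper twice with the same messages returns a result only the first time, which is what makes consumption idempotent.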
Treat email content as hostile input
Even in test environments, email can contain:
- unexpected HTML
- tracking links
- attachments
If an agent is involved, constrain what it can do with the email. Extract a minimal artifact (OTP or URL), validate it, then proceed.
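For the URL case, "extract, validate, then proceed" might look like the sketch below. The allowlisted host `app.example.test` and the helper name are assumptions; the point is that only an HTTPS link on a host you control is ever handed to the next step.

```python
import re
from typing import Optional
from urllib.parse import urlparse

ALLOWED_HOSTS = {"app.example.test"}  # assumed: hosts your app serves links from
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def extract_verified_link(text: str) -> Optional[str]:
    """Return the first HTTPS link on an allowlisted host, or None."""
    for raw in URL_PATTERN.findall(text):
        candidate = raw.rstrip(".,;:!?)")  # drop trailing sentence punctuation
        parsed = urlparse(candidate)
        # Exact hostname match, so lookalike subdomains are rejected.
        if parsed.scheme == "https" and parsed.hostname in ALLOWED_HOSTS:
            return candidate
    return None
```

Tracking links and lookalike domains fall through to `None`, so the caller fails loudly instead of following an unexpected URL.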
Keep secrets out of logs
If you must log something, log identifiers:
- run ID
- inbox ID
- message ID
Then store the full message content in artifacts with appropriate access controls.
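A minimal sketch of identifier-only logging, assuming the message dictionary has an `id` field (a hypothetical name) and using Python's standard `logging` module:

```python
import logging

logger = logging.getLogger("email-ci")


def log_received(run_id: str, inbox_id: str, message: dict) -> str:
    """Log identifiers only; the body, OTPs, and links never reach the log."""
    line = (f"email received run_id={run_id} "
            f"inbox_id={inbox_id} message_id={message['id']}")
    logger.info(line)
    return line
```

With these three identifiers in the job log, the matching artifact can be located without the log itself containing anything sensitive.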
A minimal “see emails in CI” implementation plan
If you’re starting from a flaky shared mailbox setup, migrate in this order:
Phase 1: Stop logging into mailboxes
- Replace mailbox UI steps with API retrieval.
- Store the received message as an artifact.
Phase 2: Isolate
- Create an inbox per run (or per attempt for verification flows).
- Add a correlation token to the email content.
Phase 3: Make it secure and scalable
- Add webhook signature verification if you use webhooks.
- Add timeouts, dedupe, and clear error messages.
- Consider a custom domain if you need allowlisting or stricter control.
Mailhook supports both instant shared domains and custom domains, so you can start fast and tighten control later.
Where Mailhook fits
If your goal is to see emails in CI without ever logging into a mailbox, you want an inbox that behaves like a test resource:
- provisioned on demand via API
- consumed as structured JSON
- delivered via webhooks or fetched via polling
- safe to run in parallel
That is the core workflow Mailhook is designed for. Use the canonical integration reference to implement the exact calls and payload formats: Mailhook llms.txt. You can also start from the product overview at Mailhook.