Create Disposable Email Address per Test Run (No Collisions)

Email-dependent tests fail in a very specific, very annoying way: one run “steals” another run’s message. You trigger a sign-up email, then your test reads an OTP that belongs to a different job, a retry, or a parallel shard. The fix is not “sleep 10 seconds” or “pick the latest email”. It is to create a disposable email address per test run, so that every run has its own isolated inbox and there is nothing to collide with.

This guide shows a collision-proof pattern you can apply to CI, QA automation, and LLM-driven test agents: inbox-per-run (or inbox-per-attempt), deterministic waiting (webhook-first, polling fallback), and machine-readable assertions on structured JSON.

What “collisions” look like in CI (and why they keep happening)

An email collision is any case where your test consumes an email that it did not produce. In practice, it usually shows up as:

OTP mismatch, “invalid code”, or “token already used”
Magic-link verification that logs in the wrong user
A “pass locally, fail in CI” pattern that gets worse as parallelism increases
Flakes that disappear when you re-run a job (because timing changes)

The root causes are boring but consistent:

Shared mailbox state. If multiple runs read from the same inbox, they are racing.

Retries and resends. Your app may resend an email, your job may retry, the SMTP path may deliver duplicates, or your webhook handler may be retried.

Weak correlation. If your harness matches “the latest email with subject X”, you have no hard boundary.

Non-deterministic waiting. Fixed sleeps are guesses, not synchronization.

If you want “no collisions”, you need an architecture that makes cross-run mix-ups impossible by default, not one that merely makes them unlikely.

The invariant that eliminates collisions: inbox-per-run (not “one email string”)

Generating a unique email string is not enough unless you also control where it routes and how you read it.

A collision-proof model has two parts:

A disposable inbox resource created per run (or per attempt).
An email address that is bound to that inbox, plus an inbox handle (often an inbox_id) your code uses to read messages.

That second piece matters. When you only pass around an email address, your consumer has to “search email somewhere”. When you pass around email + inbox_id, your consumer can deterministically read from exactly one place.

If you want a deeper explanation of this modeling approach, see Disposable Email With Inbox: The Deterministic Pattern.

A collision-proof addressing scheme (per run, per attempt, and parallel-safe)

When people say “create disposable email address”, they often mean “make it unique”. Uniqueness is necessary, but in CI you also want traceability and idempotency.

A practical addressing strategy uses three identifiers:

run_id: stable for the CI run (or workflow execution)
attempt_id: changes for retries (or for each test that triggers email)
nonce: a short random suffix to avoid edge cases and make addresses unguessable

You can store these values as metadata in your test harness, even if you do not encode all of them into the local-part.
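
A minimal sketch of how the three identifiers might be combined, assuming a Node.js runtime. The function name `buildAttemptIdentity`, the `t-` prefix, and the local-part layout are illustrative assumptions, not a Mailhook convention. With a disposable inbox API the provider typically assigns the address, and these values travel as metadata instead.

```typescript
import { randomBytes } from "node:crypto";

type AttemptIdentity = {
  runId: string;
  attemptId: string;
  nonce: string;
  // Illustrative local-part; a hypothetical layout, not a provider requirement.
  localPart: string;
};

// Keep only characters that are safe in an unquoted local-part.
function slug(value: string): string {
  return value.toLowerCase().replace(/[^a-z0-9]+/g, "-").replace(/^-+|-+$/g, "");
}

function buildAttemptIdentity(runId: string, attemptId: string): AttemptIdentity {
  // 8 random bytes -> 16 hex characters: short, but hard to guess.
  const nonce = randomBytes(8).toString("hex");
  // RFC 5321 limits the local-part to 64 octets, so cap the readable portion.
  const readable = `${slug(runId)}-${slug(attemptId)}`.slice(0, 40);
  return { runId, attemptId, nonce, localPart: `t-${readable}-${nonce}` };
}
```

The readable portion gives traceability in logs, while the nonce keeps two attempts with identical IDs from ever producing the same string.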

Recommended rotation rules

To get “no collisions” across shards and retries, use these rules:

Inbox per attempt if retries can trigger new emails (sign-up verification, password reset, OTP sign-in). This is the safest default.
Inbox per run if the run triggers a single email and you can guarantee no resends.
Never reuse inboxes across parallel jobs unless you can enforce strict routing and filtering boundaries.
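
The rotation rules above can be written down as a small decision function. This is a sketch only: `FlowTraits`, `chooseRotation`, and `inboxCacheKey` are hypothetical names, and the job identifier is an assumed input that distinguishes parallel shards within one run.

```typescript
type RotationScope = "per-attempt" | "per-run";

type FlowTraits = {
  // True if a retry of the test or a resend by the app can produce another email.
  retriesCanSendEmail: boolean;
  // Number of emails a single run is expected to trigger.
  emailsPerRun: number;
};

function chooseRotation(flow: FlowTraits): RotationScope {
  // Per-run is only acceptable for exactly one email with no possible resend.
  if (!flow.retriesCanSendEmail && flow.emailsPerRun === 1) return "per-run";
  // Everything else gets the safest default.
  return "per-attempt";
}

// Key under which a harness might remember "the inbox for this scope".
// The job ID is always part of the key, so parallel jobs never share an inbox.
function inboxCacheKey(scope: RotationScope, runId: string, jobId: string, attemptId: string): string {
  const base = `${runId}:${jobId}`;
  return scope === "per-attempt" ? `${base}:${attemptId}` : base;
}
```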

These rotation rules align with the broader lifecycle guidance in Temp Inbox Strategies: Rotation, Expiry, and Limits.

Why plus-addressing and catch-all domains still collide

Plus-addressing (a tag appended to the local-part after a `+`) and catch-all domains can be useful, but they are easy to get wrong at scale.

They often route into the same underlying mailbox, which means your test still has to search and filter.
Many systems normalize addresses differently (case, dots, tags), which can cause surprising merges.
They encourage “search by subject/body”, which is exactly where collisions happen.

Disposable inbox APIs flip the model: instead of “filter a shared stream”, you “read from a private container”.
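
To make the normalization problem concrete, here is a sketch of a hypothetical mailbox provider that lower-cases the address, drops dots, and strips the `+tag`. The function name and the sample addresses are invented for illustration; real providers differ in which of these rules they apply.

```typescript
// Hypothetical normalization: lower-case, strip "+tag", drop dots in the local-part.
function normalizeForSharedMailbox(address: string): string {
  const at = address.lastIndexOf("@");
  if (at < 0) throw new Error(`not an email address: ${address}`);
  const local = address.slice(0, at).toLowerCase();
  const domain = address.slice(at + 1).toLowerCase();
  const plus = local.indexOf("+");
  const base = plus >= 0 ? local.slice(0, plus) : local;
  return `${base.replace(/\./g, "")}@${domain}`;
}

// Two runs that each believe they own a unique address.
const runA = "qa.bot+run-101@example.com";
const runB = "QABot+run-102@example.com";

// Both resolve to the same underlying mailbox, so the runs still race.
const sameMailbox = normalizeForSharedMailbox(runA) === normalizeForSharedMailbox(runB); // true
```

The strings are unique, but the container is shared, which is why uniqueness alone does not remove collisions.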

Comparison: common approaches to email in automated tests

| Approach | Collision risk in parallel CI | Setup cost | Debuggability | Works with real inbound SMTP | Notes |
| --- | --- | --- | --- | --- | --- |
| Reserved domains (`example.com`, `.test`) | None (no email is sent) | Low | Medium | No | Good for unit tests that should not send email. |
| Local SMTP capture | Low | Medium | High | Usually local-only | Great for dev, often not representative of real delivery. |
| Plus-addressing to a real mailbox | High | Low | Low | Yes | Collisions are common unless you build strong routing and isolation. |
| Catch-all domain to a shared mailbox | High | Medium | Medium | Yes | Requires careful recipient-to-test mapping to avoid races. |
| Disposable inbox per run/attempt (API) | Low | Medium | High | Yes | Designed for deterministic automation. |

The reference workflow (create, trigger, wait, assert, expire)

A deterministic email test harness is a small state machine. The workflow below is provider-agnostic, but Mailhook is built specifically for it: disposable inbox creation via API, emails delivered as structured JSON, webhook notifications, polling fallback, and signed payloads.

For the exact API contract and fields, use the canonical spec: Mailhook integration contract (llms.txt).

Step 1: Provision a disposable inbox for this attempt

At the start of the test attempt:

Create an inbox via API.
Capture the returned email and inbox_id.
Store run_id and attempt_id in your test context.

Pseudocode:

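type EmailWithInbox = {
  email: string;
  inbox_id: string;
  expires_at?: string;
};

// Placeholder client shape, for illustration only.
// Consult https://mailhook.co/llms.txt for the exact endpoint/SDK shape.
declare const mailhook: {
  createInbox(req: { metadata: Record<string, string> }): Promise<EmailWithInbox>;
};

async function provisionInboxForAttempt(ctx: {
  runId: string;
  attemptId: string;
}): Promise<EmailWithInbox> {
  // Tag the inbox with run and attempt identifiers so it can be traced later.
  return mailhook.createInbox({
    metadata: { run_id: ctx.runId, attempt_id: ctx.attemptId },
  });
}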

Step 2: Trigger the email (sign-up, OTP, password reset)

Use the returned email in the flow under test. This is where your app sends the verification message.

The key point is that the address routes into exactly one inbox, created for this attempt.

Step 3: Wait deterministically (webhook-first, polling fallback)

In automation, your waiting logic should have:

An overall deadline (for example, 60 seconds)
A narrow matcher (subject, sender, or a correlation header if you control the sender)
A dedupe key (message-level IDs, plus artifact-level hashes)

Webhook-first is the most reliable and lowest-latency approach, with polling as a safety net. This hybrid pattern is covered in Temp Email Receive: Webhook-First, Polling Fallback.

Pseudocode shape:

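// Placeholder client shape, for illustration only.
// Consult https://mailhook.co/llms.txt for the actual contract.
declare const mailhook: {
  waitForMessage(req: {
    inbox_id: string;
    deadline_ms: number;
    match: { from?: string; subjectIncludes?: string };
  }): Promise<unknown>;
};

async function waitForVerificationEmail(params: {
  inboxId: string;
  deadlineMs: number;
  match: { from?: string; subjectIncludes?: string };
}) {
  // Prefer webhook delivery into your system.
  // If webhook is unavailable, fall back to polling.
  return mailhook.waitForMessage({
    inbox_id: params.inboxId,
    deadline_ms: params.deadlineMs,
    match: params.match,
  });
}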
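
For the polling fallback specifically, a deadline-bounded loop might look like the sketch below. It is provider-agnostic: `listMessages` is a hypothetical callback that returns the messages of exactly one inbox, and the `Message` shape is assumed for illustration.

```typescript
type Message = { id: string; from: string; subject: string };
type Matcher = { from?: string; subjectIncludes?: string };

function matches(msg: Message, m: Matcher): boolean {
  if (m.from !== undefined && msg.from.toLowerCase() !== m.from.toLowerCase()) return false;
  if (m.subjectIncludes !== undefined && !msg.subject.includes(m.subjectIncludes)) return false;
  return true;
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function pollForMessage(
  listMessages: () => Promise<Message[]>, // hypothetical: reads ONE inbox_id only
  match: Matcher,
  deadlineMs: number,
  intervalMs = 1000,
): Promise<Message> {
  const deadline = Date.now() + deadlineMs;
  for (;;) {
    const found = (await listMessages()).find((m) => matches(m, match));
    if (found) return found;
    const remaining = deadline - Date.now();
    // Fail loudly at the deadline instead of returning "whatever is there".
    if (remaining <= 0) throw new Error(`no matching email within ${deadlineMs} ms`);
    // Never sleep past the deadline.
    await sleep(Math.min(intervalMs, remaining));
  }
}
```

Unlike a fixed sleep, the loop returns as soon as a matching message exists and fails with an explicit error when the deadline passes.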

[Figure: CI pipeline with three parallel test jobs. Each job provisions its own disposable inbox (unique inbox_id and email), receives a webhook event when an email arrives, and extracts an OTP. No mailbox is shared, so there are no cross-run collisions.]

Step 4: Assert on structured JSON, not rendered HTML

When you receive the email, treat it as data, not as something to “open in a browser”. Your assertions should be stable even if the template changes.

Good assertions:

Sender identity (with conservative matching)
Expected intent (for example, “Your verification code is …”)
Extracted OTP or verification URL

Avoid:

CSS selectors on an HTML template
“pick the newest message” without a matcher
Parsing that depends on whitespace or formatting

Mailhook delivers received emails as structured JSON, which makes this style of assertion straightforward.
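
As a sketch of what such assertions could look like, the check below works on an assumed message shape. The `ReceivedEmail` fields and the function name are hypothetical; the real field names come from the provider's contract.

```typescript
// Hypothetical message shape; consult the provider's contract for real field names.
type ReceivedEmail = { from: string; subject: string; text?: string; html?: string };

// Returns a list of problems; an empty list means the email looks as expected.
function checkVerificationEmail(msg: ReceivedEmail, expectedSenderDomain: string): string[] {
  const problems: string[] = [];

  // Conservative sender match: compare only the domain after the last "@",
  // tolerating a display-name form such as "App <no-reply@app.test>".
  const domain = msg.from.slice(msg.from.lastIndexOf("@") + 1).replace(/>$/, "").toLowerCase();
  if (domain !== expectedSenderDomain.toLowerCase()) {
    problems.push(`unexpected sender domain: ${domain}`);
  }

  // Intent check on a stable phrase rather than on layout or markup.
  if (!/verification code/i.test(msg.text ?? "")) {
    problems.push("missing verification intent in text body");
  }

  return problems;
}
```

Returning a list of problems rather than throwing on the first one makes the failure message in CI more useful when several things are wrong at once.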

Step 5: Extract the minimal artifact and continue the test

Your test usually needs one artifact:

OTP code, or
Magic link URL

Extract only what you need, then proceed with verification.

This is also a good boundary for LLM agents: give the agent a minimal, deterministic artifact rather than the full email body.
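
A minimal extraction sketch, assuming a 6-digit numeric OTP and an `https` link in the plain-text body. Both assumptions, and both function names, are illustrative; adjust the patterns to what your application actually sends.

```typescript
// Assumes the OTP is exactly six digits and not part of a longer alphanumeric run.
function extractOtp(text: string): string | null {
  return text.match(/\b\d{6}\b/)?.[0] ?? null;
}

// Returns the first https URL, stopping at whitespace, quotes, or angle brackets.
// Trailing sentence punctuation is not stripped, so validate the result before use.
function extractFirstHttpsLink(text: string): string | null {
  return text.match(/https:\/\/[^\s<>"']+/)?.[0] ?? null;
}
```

Keeping the two extractors separate lets the test ask for the one artifact it expects, so a numeric token inside a URL is not mistaken for an OTP.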

Step 6: Expire and clean up

The simplest way to keep your system collision-free is to keep inboxes short-lived.

Use a TTL that matches your test deadlines.
Clean up after success and failure.
Store the JSON message as a CI artifact for debugging, instead of keeping inboxes around forever.
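
One way to guarantee cleanup on both success and failure is a small wrapper around the test body. In this sketch, `create` and `expire` are hypothetical callbacks standing in for whatever provisioning and cleanup calls your provider exposes.

```typescript
type Inbox = { email: string; inbox_id: string };

async function withDisposableInbox<T>(
  create: () => Promise<Inbox>, // hypothetical provisioning call
  expire: (inboxId: string) => Promise<void>, // hypothetical cleanup call
  body: (inbox: Inbox) => Promise<T>,
): Promise<T> {
  const inbox = await create();
  try {
    return await body(inbox);
  } finally {
    // Runs on success and on failure. A cleanup error is swallowed so it
    // cannot mask the real test result; the inbox TTL is the backstop.
    await expire(inbox.inbox_id).catch(() => undefined);
  }
}
```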

Dedupe rules that keep retries from re-breaking determinism

Even with inbox-per-attempt, you should assume duplicates exist. Retries happen at many layers.

A practical dedupe design uses three layers:

| Layer | What duplicates look like | What to dedupe on | Why it matters |
| --- | --- | --- | --- |
| Delivery | Same message delivered twice to your webhook | Provider delivery id, or signature timestamp plus body hash | Webhook retries are normal. |
| Message | Same email content arrives more than once | Stable message id (if available) plus inbox_id | SMTP retransmits, resend logic, etc. |
| Artifact | Same OTP/link seen twice | Hash of extracted OTP or normalized URL | Prevent “consume twice” failures. |

If you want more detail on the polling side (cursors, timeouts, dedupe), see Pull Email with Polling: Cursors, Timeouts, and Dedupe.
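
The three layers can be sketched as a single gate in front of your handler, assuming a Node.js runtime. The `Deduper` class, the event field names, and the in-memory `Set` are illustrative assumptions; a real harness that spans processes would need a shared store.

```typescript
import { createHash } from "node:crypto";

const sha256 = (s: string): string => createHash("sha256").update(s).digest("hex");

type Layer = "delivery" | "message" | "artifact";

class Deduper {
  private seen = new Set<string>();

  // Returns true the first time a key is seen in a layer, false on every repeat.
  firstTime(layer: Layer, key: string): boolean {
    const k = `${layer}:${key}`;
    if (this.seen.has(k)) return false;
    this.seen.add(k);
    return true;
  }
}

// Hypothetical event shape carrying one identifier per layer.
type InboundEvent = { deliveryId: string; inboxId: string; messageId: string; artifact: string };

function shouldProcess(d: Deduper, ev: InboundEvent): boolean {
  if (!d.firstTime("delivery", ev.deliveryId)) return false; // webhook retry
  if (!d.firstTime("message", `${ev.inboxId}:${ev.messageId}`)) return false; // retransmit
  return d.firstTime("artifact", sha256(ev.artifact)); // same OTP/link already consumed
}
```

The artifact is hashed before it is stored so that the dedupe state never holds a usable OTP or link.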

Security guardrails (especially for LLM agents)

A disposable inbox makes tests more reliable, but it does not make inbound email trustworthy. Treat inbound email as untrusted input.

Verify webhook authenticity

If you accept inbound messages via webhooks, verify the request authenticity at the HTTP layer.

Mailhook supports signed payloads for webhook notifications, so you can fail closed on spoofed or tampered deliveries.

For a threat-model-driven checklist, see Email Signed By: Verify Webhook Payload Authenticity.
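
As an illustration of failing closed, the sketch below verifies an HMAC-SHA256 signature over the raw request body using Node.js `crypto`. The signing scheme, the hex encoding, and the function name are assumptions made for this example; the scheme Mailhook actually uses is defined in its integration contract, and that is what your verifier should follow.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// ASSUMPTION: signature = hex(HMAC-SHA256(secret, rawBody)). Check the
// provider's contract for the real algorithm, encoding, and header name.
function verifySignature(rawBody: string, signatureHex: string, secret: string): boolean {
  const expected = createHmac("sha256", secret).update(rawBody).digest();
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on length mismatch, so reject early instead.
  if (received.length !== expected.length) return false;
  // Constant-time comparison avoids leaking how many bytes matched.
  return timingSafeEqual(received, expected);
}
```

Verify against the raw bytes you received, before any JSON parsing or re-serialization, since re-encoding can change the body and invalidate a genuine signature.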

Validate links before using them

If you extract a verification URL:

Enforce an allowlist of expected hostnames
Reject unexpected protocols
Prefer a server-side verification call if possible, instead of having an agent “click” arbitrary links
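
The first two checks can be sketched with the standard `URL` class. The function name and the choice to allow only `https:` are assumptions for illustration.

```typescript
function validateVerificationUrl(raw: string, allowedHosts: ReadonlySet<string>): URL {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    throw new Error("not a valid absolute URL");
  }
  // Reject javascript:, data:, http:, and anything else unexpected.
  if (url.protocol !== "https:") throw new Error(`unexpected protocol: ${url.protocol}`);
  // Exact hostname match: "app.test.evil.example" does not pass for "app.test".
  if (!allowedHosts.has(url.hostname)) throw new Error(`host not allowlisted: ${url.hostname}`);
  return url;
}
```

Comparing the parsed `hostname` rather than searching the raw string avoids being fooled by lookalike prefixes or by credentials placed before an `@` in the URL.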

Prefer `text/plain` for extraction

If both text and HTML are present, extraction from text/plain is usually less risky and more stable.
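
A small sketch of that preference, assuming the message exposes optional `text` and `html` fields (hypothetical names). The HTML fallback is deliberately crude and keeps visible text only.

```typescript
type Bodies = { text?: string; html?: string };

function bodyForExtraction(msg: Bodies): { source: "text" | "html"; content: string } | null {
  // Prefer the plain-text part whenever it has real content.
  if (msg.text !== undefined && msg.text.trim() !== "") {
    return { source: "text", content: msg.text };
  }
  if (msg.html !== undefined && msg.html.trim() !== "") {
    // Crude fallback: drop tags (and their attributes) so patterns only
    // match visible text. Links that exist only in href attributes are lost.
    return { source: "html", content: msg.html.replace(/<[^>]*>/g, " ") };
  }
  return null;
}
```

Returning the `source` alongside the content lets the harness log which part was used, which helps when a template change silently removes the plain-text alternative.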

Log identifiers, not full bodies

For debugging, log:

run_id, attempt_id
inbox_id
message IDs and timestamps

Avoid logging entire email bodies in shared CI logs unless you have a strong reason and a retention policy.
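
One way to enforce this is to build the log line from an explicit field list, so a body can never leak in by accident. The field names below are assumptions chosen to mirror the identifiers listed above.

```typescript
type LogContext = {
  run_id: string;
  attempt_id: string;
  inbox_id: string;
  message_id: string;
  received_at: string;
};

// Copies only the allowlisted identifiers; any other property on the
// object passed in (subject, text, html, ...) is ignored.
function toLogLine(ctx: LogContext): string {
  return JSON.stringify({
    run_id: ctx.run_id,
    attempt_id: ctx.attempt_id,
    inbox_id: ctx.inbox_id,
    message_id: ctx.message_id,
    received_at: ctx.received_at,
  });
}
```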

Putting it into a real test run (CI-friendly checklist)

If you are implementing this for the first time, focus on these harness-level guarantees:

Create inbox at the start of the attempt, not at suite startup.
Pass email + inbox_id through your test context.
Wait with an explicit deadline, not a fixed sleep.
Match narrowly, by sender and subject intent (or a correlation header if you control the sender).
Store the received JSON as a build artifact when a test fails.

If you need a routing refresher (envelope recipient vs To: header), RFC 5321 is the canonical reference: Simple Mail Transfer Protocol (RFC 5321).

[Figure: five-step flow. 1) Create disposable inbox via API (email plus inbox_id), 2) Trigger signup email, 3) Receive webhook or poll, 4) Parse structured JSON and extract OTP or link, 5) Expire inbox and store JSON artifact.]

Frequently Asked Questions

How do I create a disposable email address per test run without collisions? Create a new disposable inbox per run (or per attempt) via an API, then read emails only from that inbox using its inbox_id. Avoid shared mailboxes and “search the latest email”.

Should I rotate per run or per attempt? If retries or resends can happen, rotate per attempt. If your flow triggers exactly one email and never retries, per run can be sufficient, but per attempt is safer.

Is polling enough, or do I need webhooks? Polling can work, but webhook-first is typically faster and more parallel-safe. A hybrid design (webhook-first, polling fallback) is the most reliable.

How do I prevent duplicate emails from breaking tests? Use idempotency and dedupe at multiple layers: delivery-level for webhook retries, message-level for retransmits, and artifact-level (OTP/link hash) to enforce consume-once semantics.

Where do I find the exact Mailhook API contract? Use the canonical reference at mailhook.co/llms.txt, which documents the integration semantics in a machine-readable way.

Implement inbox-per-run with Mailhook

Mailhook is designed for exactly this “no collisions” workflow: create disposable inboxes via API, receive incoming emails as structured JSON, get real-time webhook notifications (with polling fallback), and verify authenticity with signed payloads.

If you want to implement disposable inbox provisioning in your test harness or agent toolset, start with the canonical contract at Mailhook integration contract (llms.txt), then explore Mailhook at mailhook.co. No credit card is required to get started.
