What's the difference between getting an email address programmatically and just generating a random email string?

A programmatic email address for testing needs to be routable (it can receive mail), isolated (scoped to a single test), and observable (your code can read its messages), not just a valid-looking string.

Why doesn't plus-addressing work well for parallel CI runs?

Plus-addressing like user+token@domain.com still routes to one shared mailbox, creating race conditions when multiple tests run simultaneously and search for their specific messages.

Should I use webhooks or polling to wait for test emails?

Webhooks are faster and more efficient, but polling is simpler to implement and works in restricted network environments. Many teams use a hybrid approach with webhook-first and polling fallback.

What should I extract from test emails to avoid flaky tests?

Extract only the minimum verification artifact you need (OTP codes, magic links, reset tokens) from structured JSON or plain text, avoiding HTML rendering and treating email content as potentially hostile input.

How to Get an Email Address Programmatically (For Testing)

When you say “get an email address programmatically” for testing, you usually mean something more specific than “generate a random string that looks like an email.” You need an address that is routable, isolated, and observable so your test runner (or LLM agent) can deterministically wait for the right message and extract a verification artifact (OTP, magic link, reset token) without flakiness.

The mistake teams make is treating email like a UI surface (scrape HTML from a shared mailbox) instead of treating it like an event stream (a short-lived inbox that your code can read).

What you actually need (not just an email string)

In test automation, an email address is only useful if you also have a reliable way to:

Scope messages to a single test attempt (parallel CI runs cannot collide).
Wait for delivery without arbitrary sleeps.
Fetch the message in a machine-readable format.
Deduplicate retries and duplicates.
Clean up the inbox (or let it expire) so future tests do not see old state.

That is why the most robust model is Email + Inbox Handle:

email: the routable recipient you hand to the system under test.
inbox_id (or equivalent): the handle your test uses to read only the messages for that address.

If you only pass around an email string, you will eventually end up “searching” a shared mailbox and fighting race conditions.

Figure: flow diagram in which the test runner creates a disposable inbox, uses the generated email address in the app under test, waits for delivery via webhook or polling, extracts the OTP or verification link from structured JSON, then discards or expires the inbox.

Options to get an email address programmatically (and when each one works)

There are several legitimate approaches. The right one depends on whether you are doing unit tests, local dev, CI, or agent-driven end-to-end flows.

Approach	What it gives you	Good for	Common failure mode in CI
Reserved domains (`example.com`, `.test`)	A safe string that never receives mail	Unit tests, validation-only tests	Not routable, cannot test email delivery
Plus-addressing (e.g., `user+token@domain`)	Many “unique-ish” addresses on one mailbox	Small-scale manual testing	Collisions, filtering quirks, shared mailbox search
Catch-all on your domain	Unlimited recipients under one domain	Staging environments, controlled integrations	Still a shared mailbox unless you build routing + storage
Local SMTP capture (Mailpit/MailHog style)	Inbox-like behavior on localhost	Local dev and PR previews	Hard to use in distributed CI, not internet-routable
Disposable inbox API	A real address plus inbox isolation, JSON retrieval	CI, QA automation, LLM agents	Provider choice and webhook security are critical

1) Reserved domains for tests that should never send email

If your test is only checking format validation (for example: “reject missing @”), do not send mail at all. Use reserved domains defined for documentation and testing, like example.com and example.test.

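A canonical reference is RFC 2606, which reserves the second-level domains example.com, example.net, and example.org, along with the top-level domains .test, .example, .invalid, and .localhost.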

This is the simplest “get an email address” approach, but it cannot test the actual email workflow.
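
As a minimal sketch of this approach, the helpers below build a validation-only address on a reserved domain and check whether a given address can never receive mail. The function names are hypothetical; the reserved names themselves come from RFC 2606.

```typescript
// Reserved by RFC 2606: these never resolve to a real mailbox.
const RESERVED_SECOND_LEVEL = ["example.com", "example.net", "example.org"];
const RESERVED_TLDS = ["test", "example", "invalid", "localhost"];

/** Builds an address that is safe to use in validation-only tests (hypothetical helper). */
function reservedTestEmail(label: string): string {
  return `${label}@example.test`;
}

/** Returns true when the address's domain is reserved and can never receive mail. */
function isReservedDomain(email: string): boolean {
  const at = email.lastIndexOf("@");
  if (at < 0) return false; // not an address at all
  const domain = email.slice(at + 1).toLowerCase();
  const tld = domain.split(".").pop() ?? "";
  return (
    RESERVED_TLDS.includes(tld) ||
    RESERVED_SECOND_LEVEL.some((d) => domain === d || domain.endsWith(`.${d}`))
  );
}
```

A guard like isReservedDomain can also sit in front of your mail-sending code in unit tests, so an accidental send to a real domain fails loudly.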

2) Plus-addressing: quick, but not isolation

Many providers support plus-addressing (subaddressing), so you can generate:

user+token@domain.com

Pros:

Easy to implement.
Often works with existing mailboxes.

Cons:

You still have one mailbox, which means shared state.
Some systems normalize or strip the plus part.
Your test harness usually devolves into “search the inbox for a subject line,” which is brittle.

If you run tests in parallel, plus-addressing tends to become a source of nondeterminism.
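
To make the shared-state problem concrete, here is a small sketch. plusAddress builds a tagged recipient, and stripPlusTag imitates a system that normalizes the tag away; both names are hypothetical and only illustrate the idea.

```typescript
/** Builds a plus-addressed recipient such as qa+run-a@domain.com. */
function plusAddress(mailbox: string, token: string): string {
  const at = mailbox.lastIndexOf("@");
  if (at < 1) throw new Error(`not an email address: ${mailbox}`);
  return `${mailbox.slice(0, at)}+${token}${mailbox.slice(at)}`;
}

/** Mimics a system that strips the subaddress before storing or comparing. */
function stripPlusTag(email: string): string {
  const at = email.lastIndexOf("@");
  if (at < 0) return email;
  const local = email.slice(0, at);
  const plus = local.indexOf("+");
  return (plus < 0 ? local : local.slice(0, plus)) + email.slice(at);
}

const runA = plusAddress("qa@domain.com", "run-a");
const runB = plusAddress("qa@domain.com", "run-b");
// The strings differ, but both collapse to the same mailbox after normalization,
// so two parallel runs still read from (and race on) one inbox.
const sameMailbox = stripPlusTag(runA) === stripPlusTag(runB);
```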

3) Catch-all domains: powerful, but you are building infrastructure

A catch-all domain (or catch-all subdomain like test-mail.yourcompany.com) can route mail for any recipient under that domain somewhere you control.

This can be a good strategy when you need:

Allowlisting with vendors
Deliverability control
A stable domain for long-running environments

But a catch-all domain alone is not a testing solution. You still need:

Recipient-to-inbox mapping
Message storage
Retrieval API
Webhook delivery (optional)
Deduplication and lifecycle policies

If you want this “domain control” plus automation-friendly inboxes, many teams end up using a programmable inbox provider with custom domain support, rather than building the whole mail ingestion pipeline.
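
To give a sense of what "building the rest" involves, the sketch below keeps the recipient-to-inbox mapping, message storage, deduplication, and expiry in memory. It is an illustration under simplifying assumptions (no SMTP ingestion, no persistence), and CatchAllRouter and StoredMail are hypothetical names.

```typescript
type StoredMail = { messageId: string; to: string; subject: string; text: string };

class CatchAllRouter {
  private inboxByRecipient = new Map<string, string>();
  private mailByInbox = new Map<string, StoredMail[]>();
  private counter = 0;

  /** Reserves a recipient under the catch-all domain and returns its inbox handle. */
  createInbox(domain: string): { email: string; inboxId: string } {
    this.counter += 1;
    const inboxId = `inbox-${this.counter}`;
    const email = `t${this.counter}@${domain.toLowerCase()}`;
    this.inboxByRecipient.set(email, inboxId);
    this.mailByInbox.set(inboxId, []);
    return { email, inboxId };
  }

  /** Called by the ingestion side; returns false for unknown recipients and duplicates. */
  deliver(mail: StoredMail): boolean {
    const inboxId = this.inboxByRecipient.get(mail.to.toLowerCase());
    if (inboxId === undefined) return false;
    const stored = this.mailByInbox.get(inboxId) ?? [];
    if (stored.some((m) => m.messageId === mail.messageId)) return false; // dedupe retries
    stored.push(mail);
    this.mailByInbox.set(inboxId, stored);
    return true;
  }

  /** Retrieval: returns a copy of the messages for one inbox only. */
  list(inboxId: string): StoredMail[] {
    return [...(this.mailByInbox.get(inboxId) ?? [])];
  }

  /** Lifecycle: drops the mapping and the stored messages. */
  expire(inboxId: string): void {
    this.mailByInbox.delete(inboxId);
    for (const [recipient, id] of this.inboxByRecipient) {
      if (id === inboxId) this.inboxByRecipient.delete(recipient);
    }
  }
}
```

Even this toy version needs four moving parts, which is the point: the catch-all rule only solves routing.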

4) Local SMTP capture: best for local dev loops

Local SMTP capture tools are great when your app sends mail to localhost and you want to inspect it during development. They are less great when:

Your CI runs in multiple containers
Your system under test is remote
You need real inbound routing and realistic delivery behavior

They solve “see the email locally,” not “reliably coordinate email events in distributed automation.”

5) Disposable inbox APIs: the most direct fit for CI and agents

For end-to-end tests, the cleanest approach is:

Create a fresh inbox via API
Use the generated real email address in your app
Wait for email arrival deterministically
Consume the email as structured JSON

This is the pattern products like Mailhook are designed for: create disposable inboxes via API, receive emails as JSON, and drive automation using webhooks or polling.

For the exact integration contract and payload shapes, use the canonical reference: Mailhook llms.txt.

The deterministic workflow you can standardize across test frameworks

No matter which test runner you use (Playwright, Cypress, pytest, Jest), the reliable workflow is the same:

Provision an inbox (one per test run or one per attempt).
Trigger the email (signup, password reset, invite).
Wait with explicit semantics (webhook-first is ideal, polling fallback is practical).
Parse as data (JSON fields, text/plain) and extract the minimal artifact.
Assert and continue.

The key design choice is step 3: avoid sleep(10) and instead wait on a concrete condition.
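
One way to replace a fixed sleep is a small polling helper with a deadline and a growing interval. This is a sketch, not any provider's client: waitFor, its option names, and the default values are assumptions chosen for illustration.

```typescript
/** Resolves after `ms` milliseconds. */
function delay(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

/**
 * Calls `check` until it returns something other than undefined, or rejects on timeout.
 * The interval grows by `factor` after each miss, capped at `maxIntervalMs`.
 */
async function waitFor<T>(
  check: () => Promise<T | undefined>,
  opts: { timeoutMs: number; intervalMs?: number; factor?: number; maxIntervalMs?: number },
): Promise<T> {
  const deadline = Date.now() + opts.timeoutMs;
  let interval = opts.intervalMs ?? 250;
  const factor = opts.factor ?? 1.5;
  const maxInterval = opts.maxIntervalMs ?? 2000;
  for (;;) {
    const result = await check();
    if (result !== undefined) return result;
    const remaining = deadline - Date.now();
    if (remaining <= 0) throw new Error(`condition not met within ${opts.timeoutMs} ms`);
    // Never sleep past the deadline; one final check runs after the last wait.
    await delay(Math.min(interval, remaining));
    interval = Math.min(interval * factor, maxInterval);
  }
}
```

In a test, `check` would typically call your inbox provider and return the first matching message, so the wait ends as soon as the email exists and fails with a clear error when it never arrives.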

A provider-agnostic “EmailAddressFactory” interface

Even if you use a specific provider, it helps to hide it behind an interface so your tests stay portable.

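// TypeScript-style pseudo-interface
export type EmailWithInbox = {
  email: string; // routable address you hand to the system under test
  inboxId: string; // handle used to read only this inbox's messages
  expiresAt?: string; // optional expiry time, if the provider reports one
};

export interface EmailAddressFactory {
  // Provisions a fresh inbox, optionally registering a webhook URL.
  createInbox(params?: { webhookUrl?: string }): Promise<EmailWithInbox>;
  // Resolves with the first message that satisfies `match`;
  // implementations are expected to reject once timeoutMs elapses.
  waitForMessage(params: {
    inboxId: string;
    timeoutMs: number;
    match?: {
      fromContains?: string;
      subjectContains?: string;
    };
  }): Promise<{ messageId: string; text?: string; html?: string; raw?: string }>;
}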

Two immediate benefits:

Your test code becomes “ask for an inbox, wait for message,” not “glue together email hacks.”
Your LLM agent tools can be constrained to safe primitives (create, wait, extract) rather than “read arbitrary inbox HTML.”
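
A side effect of hiding the provider behind an interface is that you can unit-test the harness itself with a fake. The sketch below is an in-memory stand-in with the same method shapes; the class name, the deliver test hook, and the extra from/subject fields are assumptions added for illustration.

```typescript
type EmailWithInbox = { email: string; inboxId: string; expiresAt?: string };
type ReceivedMessage = { messageId: string; from: string; subject: string; text?: string };
type MatchRule = { fromContains?: string; subjectContains?: string };

function matchesRule(m: ReceivedMessage, rule?: MatchRule): boolean {
  if (!rule) return true;
  if (rule.fromContains && !m.from.includes(rule.fromContains)) return false;
  if (rule.subjectContains && !m.subject.includes(rule.subjectContains)) return false;
  return true;
}

/** In-memory stand-in for a real provider, useful for testing the harness itself. */
class InMemoryEmailFactory {
  private inboxes = new Map<string, ReceivedMessage[]>();
  private nextId = 1;

  async createInbox(): Promise<EmailWithInbox> {
    const inboxId = `inbox-${this.nextId++}`;
    this.inboxes.set(inboxId, []);
    // example.test is reserved, so this fake address can never reach a real mailbox.
    return { email: `${inboxId}@example.test`, inboxId };
  }

  /** Test hook that simulates the system under test sending mail. */
  deliver(inboxId: string, message: ReceivedMessage): void {
    const inbox = this.inboxes.get(inboxId);
    if (!inbox) throw new Error(`unknown inbox: ${inboxId}`);
    inbox.push(message);
  }

  async waitForMessage(params: {
    inboxId: string;
    timeoutMs: number;
    match?: MatchRule;
  }): Promise<ReceivedMessage> {
    const deadline = Date.now() + params.timeoutMs;
    for (;;) {
      const inbox = this.inboxes.get(params.inboxId) ?? [];
      const found = inbox.find((m) => matchesRule(m, params.match));
      if (found) return found;
      if (Date.now() >= deadline) throw new Error(`no matching message in ${params.inboxId}`);
      await new Promise<void>((resolve) => setTimeout(resolve, 10));
    }
  }
}
```

Swapping this fake for a real provider adapter should not require changes to the tests, as long as both honor the same wait and match semantics.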

Webhook-first vs polling: what to choose in 2026

Both models can be production-grade if you implement them carefully.

Webhook-first is best when:

You want fast end-to-end tests (no polling delay)
You already run an HTTP endpoint in CI
You can verify signatures and implement idempotency

Polling is best when:

You do not want to expose an inbound endpoint
You are running locally or in restricted networks
You can tolerate a small delay and implement backoff

In practice, most teams settle on a hybrid:

Webhook delivers “message arrived” quickly
Polling fetches the message (or acts as a fallback if the webhook is delayed)
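
The hybrid can be sketched as a race between a webhook signal and a polling loop. Everything here is illustrative: WebhookSignals, waitHybrid, and the Arrival shape are hypothetical names, and the signal tracks only the first arrival per inbox, which fits an inbox-per-attempt setup.

```typescript
type Arrival = { messageId: string };
type Slot = { promise: Promise<Arrival>; resolve: (a: Arrival) => void };

/** Bridges webhook deliveries to test code waiting on a specific inbox. */
class WebhookSignals {
  private slots = new Map<string, Slot>();

  private slot(inboxId: string): Slot {
    let s = this.slots.get(inboxId);
    if (!s) {
      let resolve: (a: Arrival) => void = () => {};
      const promise = new Promise<Arrival>((r) => { resolve = r; });
      s = { promise, resolve };
      this.slots.set(inboxId, s);
    }
    return s;
  }

  /** Call this from the HTTP handler that receives the provider's webhook. */
  notify(inboxId: string, arrival: Arrival): void {
    this.slot(inboxId).resolve(arrival);
  }

  /** Resolves with the first arrival, even if the webhook fired before the wait began. */
  firstArrival(inboxId: string): Promise<Arrival> {
    return this.slot(inboxId).promise;
  }
}

async function waitHybrid(
  signals: WebhookSignals,
  inboxId: string,
  poll: () => Promise<Arrival | undefined>,
  opts: { timeoutMs: number; pollEveryMs: number },
): Promise<Arrival> {
  let done = false;
  const viaPolling = (async () => {
    const deadline = Date.now() + opts.timeoutMs;
    while (!done && Date.now() < deadline) {
      const hit = await poll();
      if (hit) return hit;
      await new Promise<void>((r) => setTimeout(r, opts.pollEveryMs));
    }
    // Reached on timeout, or after the webhook already won (the race ignores it then).
    throw new Error(`no message for ${inboxId} within ${opts.timeoutMs} ms`);
  })();
  try {
    return await Promise.race([signals.firstArrival(inboxId), viaPolling]);
  } finally {
    done = true; // stop polling once either side has produced a result
  }
}
```

The polling side doubles as the timeout, so a missing webhook degrades to slower delivery rather than a hung test.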

Mailhook supports both webhook notifications and a polling API, and can sign payloads (useful for verifying authenticity). Again, refer to the exact fields and verification guidance in their llms.txt.
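
For orientation only, here is what signature verification and idempotent handling often look like in a Node.js handler. The scheme shown (hex-encoded HMAC-SHA256 over the raw request body with a shared secret) is an assumption and not Mailhook's documented format; check llms.txt for the real header names and algorithm. IdempotentHandler is a hypothetical helper.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

/** Verifies a hex-encoded HMAC-SHA256 signature over the raw body (assumed scheme). */
function verifySignature(rawBody: string, signatureHex: string, secret: string): boolean {
  const expected = createHmac("sha256", secret).update(rawBody).digest();
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return received.length === expected.length && timingSafeEqual(received, expected);
}

/** Processes each message ID at most once, even if the webhook is retried. */
class IdempotentHandler {
  private seen = new Set<string>();
  private onMessage: (messageId: string) => void;

  constructor(onMessage: (messageId: string) => void) {
    this.onMessage = onMessage;
  }

  /** Returns true if the message was processed, false if it was a duplicate. */
  handle(messageId: string): boolean {
    if (this.seen.has(messageId)) return false;
    this.seen.add(messageId);
    this.onMessage(messageId);
    return true;
  }
}
```

Verify against the raw bytes you received, before JSON parsing, since re-serializing the payload can change whitespace and break the comparison.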

Message parsing: treat email as hostile input

For testing, you typically only need one thing: a verification artifact.

Examples:

OTP code (6 digits)
Magic link URL
Password reset link

A few hard rules that make your harness safer and less flaky:

Prefer structured JSON output from your inbox provider.
Prefer text/plain over HTML when extracting.
Extract the minimum and avoid “rendering” email content.
If you extract URLs, validate them:
- Allowlist hostnames
- Follow redirects carefully
- Avoid fetching arbitrary links (SSRF risk in CI)
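
These rules can be sketched as two small extractors that work on the text/plain body. The function names are hypothetical, and the six-digit OTP format and the HTTPS-only rule are assumptions you would adapt to your own emails.

```typescript
/** Returns the first standalone 6-digit code in the text, or undefined. */
function extractOtp(text: string): string | undefined {
  const m = /\b\d{6}\b/.exec(text);
  return m ? m[0] : undefined;
}

/** Returns the first https link whose hostname is on the allowlist, or undefined. */
function extractAllowedLink(text: string, allowedHosts: string[]): string | undefined {
  const candidates = text.match(/https?:\/\/[^\s<>"')]+/g) ?? [];
  for (const candidate of candidates) {
    // Drop sentence punctuation that often trails a link in plain text.
    const cleaned = candidate.replace(/[.,;!?]+$/, "");
    let url: URL;
    try {
      url = new URL(cleaned);
    } catch {
      continue; // not a parseable URL, skip it
    }
    if (url.protocol === "https:" && allowedHosts.includes(url.hostname)) {
      return url.toString();
    }
  }
  return undefined;
}
```

Comparing the parsed hostname, rather than doing a substring match on the raw link, avoids being fooled by URLs such as https://app.example.test.evil.example.net/.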

If you are integrating with LLM agents, keep the agent-facing view minimal: do not feed raw HTML and full headers unless you have to.

Example scenario: testing an order-confirmation email

Consider an e-commerce test where a user places an order and should receive a confirmation email with an order number and a “view order” link. This applies whether you are testing your own store or a partner flow.

For instance, if you are testing an integration with a high-volume storefront, such as a shop for buying bulk jerky online that sends order confirmations, shipping notices, and password resets, you want every CI run to have a fresh inbox so parallel purchases do not cross-contaminate.

A robust test flow looks like:

Create inbox (unique per test run)
Use its email at checkout
Wait for the “Order Confirmation” message
Extract the order number from text (or structured fields if your pipeline adds them)
Validate the confirmation link points to your expected domain
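
The last two steps might look like the sketch below. The "Order #<id>" format, the parseOrderConfirmation name, and the sample host are hypothetical; adjust the pattern to whatever your confirmation template actually prints.

```typescript
type OrderConfirmation = { orderNumber: string; viewOrderUrl: string };

/** Extracts the order number and the view-order link, rejecting unexpected hosts. */
function parseOrderConfirmation(text: string, expectedHost: string): OrderConfirmation {
  // Requires the "#" so the words "Order Confirmation" are not mistaken for an ID.
  const order = /Order\s+#([A-Z0-9-]+)/i.exec(text);
  if (!order) throw new Error("order number not found");
  const link = /https:\/\/[^\s<>"')]+/.exec(text);
  if (!link) throw new Error("view-order link not found");
  const url = new URL(link[0]);
  if (url.hostname !== expectedHost) {
    throw new Error(`unexpected link host: ${url.hostname}`);
  }
  return { orderNumber: order[1], viewOrderUrl: url.toString() };
}
```

Throwing on a missing field or an unexpected host turns a silent mismatch into a clear test failure.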

How Mailhook fits (without changing your test architecture)

If your main goal is: “I need to get an email address programmatically and then consume the resulting emails as data,” Mailhook is designed around that contract:

Create disposable inboxes via API
Receive inbound emails as structured JSON
Get real-time webhook notifications (with signed payloads)
Poll for emails when webhooks are not convenient
Use shared domains instantly, or bring a custom domain when you need control
Process messages in batches when you are scaling ingestion

To avoid mismatches between this article and the current API, use the machine-readable integration reference: Mailhook llms.txt.

A short checklist for choosing your approach

If you do not need to receive email, use reserved domains and do not send.
If you need to receive email in CI with parallel runs, prefer disposable inboxes with an inbox handle.
If you need vendor allowlisting or deliverability control, plan for a custom domain.
If you use webhooks, verify signatures and design handlers to be idempotent.

If you want the simplest “inbox-per-run” workflow without building your own mail ingestion pipeline, start with Mailhook’s disposable inbox API and follow the contract in the llms.txt reference.