Email-dependent test flows are where “it works on my machine” goes to die. The email arrives late, lands in the wrong inbox, gets resent, or comes from a vendor that only delivers to allowlisted domains. If you are building QA automation or LLM-driven agents that must complete signups, password resets, or magic-link logins, an email address with custom domain is often the simplest way to make those flows deterministic and enterprise-compatible.
This guide walks through a practical setup: pick a safe domain layout, configure DNS, choose an addressing scheme that scales in CI, then consume inbound messages in a way that is reliable for automation and safe for agents.
For Mailhook-specific integration details (endpoints, payloads, webhook signatures), use the canonical reference at llms.txt.
What an “email address with custom domain” means for testing
In test automation, the key requirement is not “a mailbox UI.” It is a routable address you control, backed by an API that your code can query.
A typical model looks like this:
- You own a domain (or subdomain) like
qa-mail.example.com. - You point that domain’s MX records to an inbound email provider.
- Each test run creates unique recipients under that domain (for isolation).
- Your test runner or agent receives inbound emails as structured data (ideally JSON), and extracts a minimal artifact (OTP or verification URL).
If you are comparing domain strategies first, see Email Domains for Testing: Shared vs Custom.
When a custom domain is worth the effort
A shared domain is great for quick starts, but a custom domain becomes important when you need control that external vendors recognize.
A custom domain setup is usually justified when:
- A SaaS vendor, customer, or IdP requires domain allowlisting.
- You need strong isolation across environments (
dev,staging,prod-like) and across parallel CI shards. - You want fewer deliverability surprises caused by other tenants’ traffic.
- You need predictable routing rules (catch-all, alias tables, or encoded local-parts) without collisions.
The setup, at a glance
You can think about the work in three layers: DNS routing, recipient mapping, and message consumption.
| Layer | What you configure | Why it matters for tests |
|---|---|---|
| DNS (domain routing) | MX records for your domain or subdomain | Ensures inbound email is deliverable to your system |
| Addressing (recipient mapping) | How you generate unique recipients per run/attempt | Prevents collisions and makes correlation deterministic |
| Retrieval (automation interface) | Webhooks and/or polling returning structured JSON | Eliminates brittle HTML scraping and reduces flakiness |

Step 1: Choose a domain layout that will not bite you later
Most teams should avoid using the apex domain (like example.com) for test inbox routing. Instead, create a dedicated subdomain that communicates intent and supports separation.
Practical patterns:
-
qa-mail.example.comfor a shared QA inbox domain -
staging-mail.example.comfor staging-only flows -
mail.dev.example.comif you want environment nesting
This gives you:
- Safer blast radius (you can delete or change records without impacting real corporate email).
- Clear allowlisting guidance for third parties (“allow
qa-mail.example.com”). - Easier cleanup and rotation if you later migrate providers.
Step 2: Point MX records to your inbound provider
Inbound SMTP delivery is controlled by MX records. When a sender tries to deliver mail to [email protected], their mail server looks up MX records for qa-mail.example.com and connects to the provider listed.
What to do:
- In your DNS provider, create the MX records your inbound provider specifies.
- Set a short TTL while iterating (for example 60 to 300 seconds), then increase it once stable.
- If your provider supports custom domains (Mailhook does), follow their custom-domain onboarding steps and validation requirements.
Because MX values are provider-specific, do not guess them. For Mailhook’s current contract and instructions, start with llms.txt.
Verify propagation and routing
DNS changes can look “saved” but not be live everywhere. Verify externally:
# Replace with your subdomain
DIG_DOMAIN="qa-mail.example.com"
dig MX "$DIG_DOMAIN" +short
If you see no MX records, or they do not match your provider’s expected values, delivery will fail.
Also remember a subtle but important detail for debugging: SMTP routing is based on the envelope recipient (the address used during SMTP delivery), which can differ from the To: header shown in the email body. If you want the conceptual model, Mailhook’s routing deep dive is here: Domains and Emails: How Routing Works in API Inboxes.
Step 3: Pick an addressing scheme that scales in CI
Once the domain routes correctly, your tests need a way to generate lots of unique addresses without accidental overlap.
Here are common recipient mapping options, and how they behave under parallelism:
| Mapping strategy | Example recipient | Good for | Common pitfall |
|---|---|---|---|
| Encoded local-part | [email protected] |
High parallel CI, easy correlation | Beware of normalization rules and length limits |
| Alias table | [email protected] |
Human-readable, stable fixtures | Requires managing alias state |
| Catch-all with tags | [email protected] |
Fast experimentation | Easy to create collisions if you do not isolate per run |
For testing flows, the “encoded local-part” pattern tends to be the most robust because it makes uniqueness explicit.
A practical convention:
- Generate a
run_id(orattempt_id) per test attempt. - Use it in the recipient local-part.
- Store it alongside your test artifacts so failures are debuggable.
Important: do not assume every system preserves case, dots, or plus addressing semantics consistently across providers. If you accept user-entered email addresses in your app, keep your product policy separate from what your test harness uses. For a deep dive on edge cases, see Email Addresses in Automation: Validation and Edge Cases.
Step 4: Consume inbound email deterministically (webhook first, polling as fallback)
Most flaky email tests fail for one of two reasons:
- They look in the wrong place (shared inbox, wrong run).
- They wait incorrectly (fixed sleeps instead of event-driven waits).
A reliable pattern is:
- Create an inbox per run or per attempt.
- Prefer a webhook signal for arrival.
- Keep polling as a fallback (for CI environments where inbound webhooks are hard).
Mailhook supports both real-time webhook notifications and a polling API, and delivers emails as structured JSON, which is much safer than scraping HTML.
Minimal “inbox per attempt” flow (provider-agnostic pseudocode)
attempt_id = new_uuid()
inbox = create_inbox({
domain: "qa-mail.example.com",
metadata: { attempt_id }
})
# Use inbox.email when your app asks for an address
submit_signup_form(email=inbox.email)
# Deterministic wait (webhook-first, polling fallback)
msg = wait_for_message({
inbox_id: inbox.inbox_id,
match: { purpose: "signup_verification" },
timeout_seconds: 60
})
artifact = extract_verification_artifact(msg) # OTP or URL
complete_verification(artifact)
Two implementation details that matter in real systems:
- Idempotency: retries happen. Your “wait” and “consume” logic should tolerate duplicate deliveries.
- Narrow matching: match on stable signals you control (a correlation header, a run token in the body), not on fragile HTML layout.
If you want a full reliability recipe specifically for signups, see Generate Temp Email for Signup Tests Without Flakes.
Webhook security is part of test reliability
If your provider signs webhook payloads (Mailhook supports signed payloads), verify the signature and reject invalid requests. This prevents spoofed “email arrived” events from contaminating your test pipeline.
Treat this like any other event ingestion:
- Verify the signature.
- Enforce timestamp tolerance to reduce replay risk.
- Make handlers idempotent.
Step 5: Make it safe for LLM agents to handle email
Email is untrusted input, and agent tooling amplifies risk because an LLM may follow links or execute instructions embedded in content. When you connect an inbox to an agent, design a constrained interface.
Recommended guardrails:
- Provide the agent a minimized, structured view (subject, sender, received time, and extracted OTP or a single verification URL).
- Allowlist domains for clickable links, and require explicit confirmation before any navigation.
- Avoid rendering HTML in the agent loop. Prefer
text/plainor a sanitized extraction pipeline. - Log identifiers (inbox_id, message_id, attempt_id), not full bodies, unless you have a strong retention policy.
Mailhook’s model of receiving emails as JSON is a good fit for this, because you can treat messages like data records and pass only what the agent needs.
Step 6: Troubleshooting “email not received” with a custom domain
When a custom domain setup fails, you want a checklist that distinguishes DNS problems from application problems.
Start with these high-signal checks:
- Confirm MX records are present and correct using
dig. - Confirm you are sending to the exact domain you configured (typos and wrong environment subdomains are common).
- Confirm the system is using the right recipient at the SMTP envelope layer (not just the
To:header). - If you are using webhooks, confirm your endpoint is reachable from the public internet and that signature verification is not rejecting valid events.
- If you are polling, confirm your polling loop is using explicit timeouts and is not ignoring the newest matching message.
If you need to debug deeper, capture the full set of stable identifiers from your provider’s JSON output and correlate them in your CI logs (attempt_id, inbox_id, message_id). That turns “it flaked” into an actionable trace.
Putting it together with Mailhook
Mailhook is designed for exactly this style of workflow: create disposable inboxes via API, route email for shared or custom domains, and consume messages as structured JSON via webhook notifications or polling.
To avoid stale examples, use Mailhook’s canonical integration contract at llms.txt. If you want to get a custom domain running quickly and then harden your approach, these two posts complement this guide:
Once your custom domain is in place, the biggest win is organizational: your team can treat email as a testable event stream, not a flaky side channel.