If your test suite covers signups, password resets, OTPs, magic links, invite acceptance, or agent-driven onboarding, email is part of the product surface. A custom-domain address makes those flows closer to production, easier to allowlist, and easier to debug. But the reliable pattern is not to create one permanent mailbox on that domain and share it across every run.
For test automation, the better pattern is to create a short-lived email address with a custom domain and bind it to a programmatic inbox. Your tests or LLM agents can then wait for the right message, receive it as structured JSON, extract only the needed artifact, and move on without logging into a human mailbox.
This guide focuses on the implementation contract: what to create, how to route it, and how to keep custom-domain email test flows deterministic in CI and agent workflows.
Create an address descriptor, not just an email string
A bare email address is not enough for reliable automation. Your test harness needs a descriptor that tells it where the email is routed, how to read messages, and how to correlate the message to the current run.
A practical descriptor looks like this:
| Field | Purpose in test flows |
|---|---|
| email | The custom-domain address you pass to the application under test |
| inbox_id or provider handle | The programmatic inbox resource your code reads from |
| domain | The configured test domain or subdomain, such as tests.example.com |
| run_id | CI job, test run, or agent session identifier for correlation |
| attempt_id | Retry-safe identifier, especially useful when a test re-runs |
| created_at | Helps debug late arrivals and stale messages |
| expires_at or retention policy | Defines how long the address should remain useful |
| receive_mode | Webhook, polling, or hybrid delivery pattern |
This distinction matters because test code should not be searching a shared mailbox for “the latest email.” It should read from the exact inbox created for the exact attempt.
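The descriptor fields in the table can be captured as a type. The following is a minimal sketch, not a provider contract: the names `TestAddressDescriptor`, `buildDescriptor`, and `isExpired` are illustrative assumptions, and the TTL-based expiry is just one possible way to express a retention policy.

```typescript
type ReceiveMode = "webhook" | "polling" | "hybrid";

// Hypothetical descriptor shape mirroring the table above.
type TestAddressDescriptor = {
  email: string;
  inboxId: string;
  domain: string;
  runId: string;
  attemptId: string;
  createdAt: string; // ISO 8601 timestamp
  expiresAt: string; // ISO 8601 timestamp
  receiveMode: ReceiveMode;
};

function buildDescriptor(input: {
  email: string;
  inboxId: string;
  runId: string;
  attemptId: string;
  ttlMs: number;
  receiveMode: ReceiveMode;
  now?: Date;
}): TestAddressDescriptor {
  const at = input.email.lastIndexOf("@");
  if (at <= 0 || at === input.email.length - 1) {
    throw new Error(`Not a usable email address: ${input.email}`);
  }
  const now = input.now ?? new Date();
  return {
    email: input.email,
    inboxId: input.inboxId,
    // Derive the domain from the address so the two can never disagree.
    domain: input.email.slice(at + 1).toLowerCase(),
    runId: input.runId,
    attemptId: input.attemptId,
    createdAt: now.toISOString(),
    expiresAt: new Date(now.getTime() + input.ttlMs).toISOString(),
    receiveMode: input.receiveMode,
  };
}

// True once the descriptor's retention window has passed.
function isExpired(descriptor: TestAddressDescriptor, now: Date = new Date()): boolean {
  return now.getTime() >= Date.parse(descriptor.expiresAt);
}
```

Keeping the email and the inbox handle in one value means a test can never read from an inbox that does not belong to the address it submitted.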
With Mailhook, the relevant primitives are programmable disposable inboxes, structured JSON email output, RESTful API access, webhooks, polling, shared domains, custom domain support, signed payloads, and batch processing. For exact API details and machine-readable integration guidance, use the canonical Mailhook llms.txt reference.
When a custom domain is worth using for tests
Shared provider domains are useful when you want to start fast, especially for internal CI or prototypes. A custom domain becomes valuable when your test flows need more control.
Use a custom domain when you need:
- Allowlisting compatibility for third-party SaaS products that reject unfamiliar disposable domains.
- Environment separation between development, staging, preview, and production-like tests.
- Stable routing rules that your organization controls through DNS.
- Auditability for CI, QA, and agent workflows where every address should map back to a run.
- Closer production parity for sign-up, password reset, invite, or verification flows.
Avoid using your primary production domain for automated tests. A dedicated subdomain is usually safer, easier to route, and easier to revoke without affecting human email.
Reference architecture for custom-domain test email
The architecture is simple:
custom test subdomain -> MX records -> inbound email provider -> disposable inbox -> JSON message -> test harness or agent
The important part is that the domain and the inbox are separate concerns. DNS routes messages for the domain. Your inbox provider maps each recipient address to a programmatic inbox. Your automation consumes messages through webhooks, polling, or both.
In SMTP, routing depends on the envelope recipient and MX records, not just the visible To: header. The SMTP model is defined in RFC 5321, and it is one reason test harnesses should rely on provider-normalized routing metadata instead of scraping headers alone.
Step 1: choose a safe subdomain layout
Start with a subdomain dedicated to automated inbound mail. The right layout depends on how your environments and tenants are organized.
| Subdomain pattern | Best for | Notes |
|---|---|---|
| tests.example.com | General CI and QA | Good default for one shared automation environment |
| staging-mail.example.com | Staging-only email flows | Keeps staging mail clearly separated from production |
| ci.example.com | Parallel CI suites | Short and easy to include in generated addresses |
| tenant-a.tests.example.com | Tenant-isolated test flows | Useful when tenant routing or allowlists differ |
| agents.example.com | LLM agent workflows | Makes agent-originated test mail easy to identify |
Keep the domain value configurable. Your code should not hard-code tests.example.com inside every test. Use an environment variable or centralized configuration so you can move from shared domains to custom domains without rewriting the harness.
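One way to centralize that configuration is a small resolver that every test goes through. This is a sketch under assumptions: `TEST_EMAIL_DOMAIN` is the variable name used later in this guide, while `resolveTestEmailDomain`, the `source` field, and the optional shared-domain fallback are hypothetical names introduced here for illustration.

```typescript
type EmailDomainConfig = {
  domain: string;
  source: "custom" | "shared";
};

// Resolve the test email domain from an environment-like object.
// Passing the environment in (instead of reading it globally) keeps this testable.
function resolveTestEmailDomain(
  env: Record<string, string | undefined>,
  sharedFallback?: string
): EmailDomainConfig {
  const custom = env["TEST_EMAIL_DOMAIN"]?.trim().toLowerCase();
  if (custom) {
    // Loose sanity check: dot-separated labels of letters, digits, and hyphens.
    if (!/^[a-z0-9-]+(\.[a-z0-9-]+)+$/.test(custom)) {
      throw new Error(`TEST_EMAIL_DOMAIN is not a valid domain: ${custom}`);
    }
    return { domain: custom, source: "custom" };
  }
  if (sharedFallback) {
    return { domain: sharedFallback, source: "shared" };
  }
  throw new Error("No test email domain configured");
}
```

In a Node-based harness this would typically be called once, for example as `resolveTestEmailDomain(process.env)`, and the result passed to whatever creates inboxes.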
Step 2: point MX records to your inbound provider
To receive email on a custom test domain, configure MX records for the subdomain at your DNS provider. The inbound provider will tell you the required MX hostnames and priorities.
A typical DNS verification check looks like this:
dig MX tests.example.com +short
Before running full test suites, send a smoke-test email to a generated address and verify that your provider receives it. If the message does not arrive, debug in this order:
- Confirm the MX record is on the exact subdomain you are using.
- Confirm DNS propagation from more than one network or resolver.
- Confirm the application under test sends to the generated address, not a cached address.
- Confirm the inbound provider recognizes the custom domain.
- Compare the envelope recipient with the visible To: header when troubleshooting forwarded or templated mail.
SPF, DKIM, and DMARC are important when the domain sends email, but they do not route inbound mail. For inbound-only test addresses, MX routing and recipient-to-inbox mapping are the first things to verify.
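The MX check can also run as a preflight step in the test harness rather than by hand. The sketch below uses Node's `resolveMx` from `node:dns/promises`; the function names `findUnexpectedMx` and `assertMxPointsToProvider` are illustrative, and the expected MX hostnames are placeholders you would replace with the values your inbound provider gives you.

```typescript
import { resolveMx } from "node:dns/promises";

type MxRecord = { exchange: string; priority: number };

// Lowercase and drop a trailing dot so "MX1.Example.net." equals "mx1.example.net".
function normalizeHost(host: string): string {
  return host.trim().toLowerCase().replace(/\.$/, "");
}

// Return every MX host that is not in the expected provider list.
function findUnexpectedMx(records: MxRecord[], expectedHosts: string[]): string[] {
  const expected = new Set(expectedHosts.map(normalizeHost));
  return records
    .map(record => normalizeHost(record.exchange))
    .filter(host => !expected.has(host));
}

// Preflight check: fails the run early if the subdomain is not routed to the provider.
// resolveMx itself rejects when the lookup fails or returns no data.
async function assertMxPointsToProvider(domain: string, expectedHosts: string[]): Promise<void> {
  const records = await resolveMx(domain);
  if (records.length === 0) {
    throw new Error(`No MX records found for ${domain}`);
  }
  const unexpected = findUnexpectedMx(records, expectedHosts);
  if (unexpected.length > 0) {
    throw new Error(`Unexpected MX hosts for ${domain}: ${unexpected.join(", ")}`);
  }
}
```

Running this once before the suite turns a silent "email never arrived" timeout into an immediate, readable DNS error.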
Step 3: choose an address creation model
There are several ways to create email addresses on a custom domain. For automated test flows, the safest default is one programmatic inbox per run or per attempt.
| Model | How it works | Best use case | Main risk |
|---|---|---|---|
| API-created inbox | Provider creates an inbox and returns a routable address on your domain | CI, QA, signup tests, LLM agents | Requires provider integration |
| Encoded local-part | Address includes encoded metadata, such as run and attempt IDs | Stateless routing and debugging | Local-part length and character constraints |
| Alias table | Your system maps generated aliases to inboxes | Compatibility with fixed recipient formats | State management and cleanup |
| Catch-all | Any address on the domain routes to a handler | Exploratory tests or legacy systems | Collisions and noisy mail if unconstrained |
For most teams, API-created inboxes are the best balance. They give you a real address, a provider-side inbox handle, and a clean place to read messages. Encoded local-parts can work well too, but they should still map to isolated inboxes if the flow runs in parallel.
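For the encoded local-part model, the main constraints are the character set and the 64-octet local-part limit from RFC 5321. The sketch below shows one possible encoding; the dot-separated `flow.run.attempt` layout and the names `encodeLocalPart` and `decodeLocalPart` are assumptions made for illustration, not a provider convention. Note that sanitizing is lossy, so decoding returns the sanitized tokens rather than the original strings.

```typescript
const MAX_LOCAL_PART_LENGTH = 64; // RFC 5321 limit for the part before "@"

type AddressTokens = { flow: string; runId: string; attemptId: string };

// Keep only lowercase letters, digits, and single hyphens.
function sanitizeToken(token: string): string {
  return token
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

function encodeLocalPart(tokens: AddressTokens): string {
  const parts = [tokens.flow, tokens.runId, tokens.attemptId].map(sanitizeToken);
  if (parts.some(part => part.length === 0)) {
    throw new Error("Every token must contain at least one letter or digit");
  }
  // "." cannot appear inside a sanitized token, so it is a safe separator.
  const localPart = parts.join(".");
  if (localPart.length > MAX_LOCAL_PART_LENGTH) {
    throw new Error(`Local-part exceeds ${MAX_LOCAL_PART_LENGTH} characters: ${localPart}`);
  }
  return localPart;
}

// Recover the tokens from an address (or bare local-part); null if the layout differs.
function decodeLocalPart(address: string): AddressTokens | null {
  const at = address.lastIndexOf("@");
  const localPart = at === -1 ? address : address.slice(0, at);
  const parts = localPart.split(".");
  if (parts.length !== 3 || parts.some(part => part.length === 0)) return null;
  const [flow, runId, attemptId] = parts as [string, string, string];
  return { flow, runId, attemptId };
}
```

Failing loudly on over-long identifiers is deliberate: truncating a run ID silently would reintroduce the collisions this model is meant to avoid.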
Step 4: implement an EmailAddressFactory
Your application tests should not know the details of DNS, provider APIs, webhook verification, or polling cursors. Hide those concerns behind a small factory.
The interface can be provider-agnostic:
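type TestInbox = {
  email: string;
  inboxId: string;
  domain: string;
  runId: string;
  attemptId: string;
  createdAt: string;
};

// Placeholder for your provider client; the shape shown here is illustrative only.
declare const inboxProvider: {
  createInbox(input: {
    domain: string;
    metadata: Record<string, string>;
  }): Promise<{ id: string; email: string }>;
};

async function createTestInbox(input: {
  flow: string;
  runId: string;
  attemptId: string;
}): Promise<TestInbox> {
  // Fail fast when the domain is not configured instead of passing undefined along.
  const domain = process.env.TEST_EMAIL_DOMAIN;
  if (!domain) {
    throw new Error("TEST_EMAIL_DOMAIN is not set");
  }

  const inbox = await inboxProvider.createInbox({
    domain,
    metadata: {
      flow: input.flow,
      runId: input.runId,
      attemptId: input.attemptId
    }
  });

  // Return the address and the inbox handle together so they are never separated.
  return {
    email: inbox.email,
    inboxId: inbox.id,
    domain,
    runId: input.runId,
    attemptId: input.attemptId,
    createdAt: new Date().toISOString()
  };
}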
This is pseudocode, not a Mailhook SDK contract. The point is the boundary: tests ask for a test inbox, then use the returned email address. The provider-specific implementation can use Mailhook’s REST API and should follow the exact contract documented in llms.txt.
Step 5: wait for the message deterministically
Fixed sleeps are the fastest way to make email tests flaky. A custom-domain address does not fix that by itself. You still need deterministic waiting.
A robust wait strategy uses webhooks first, with polling as a fallback. Webhooks minimize latency and reduce unnecessary requests. Polling gives your test harness a recovery path if a webhook is delayed, blocked, or missed.
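async function waitForVerificationEmail(inbox: TestInbox) {
  // Overall budget for this attempt.
  const deadline = Date.now() + 60_000;
  // Give the webhook only part of the budget so the polling fallback still has time to run.
  const webhookDeadline = Date.now() + 30_000;

  const webhookResult = await waitForWebhookEvent({
    inboxId: inbox.inboxId,
    runId: inbox.runId,
    until: webhookDeadline
  });
  if (webhookResult) return webhookResult;

  // Fallback: poll the same inbox until the overall deadline.
  return pollMessagesUntil({
    inboxId: inbox.inboxId,
    match: message =>
      message.to.includes(inbox.email) &&
      message.subject.toLowerCase().includes("verify"),
    deadline
  });
}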
Again, adapt the function names to your provider. The invariant is what matters: wait against the inbox created for this attempt, use a clear deadline, and match the message narrowly.
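To make the polling side concrete, here is a minimal sketch of a deadline-bound polling helper. It is not a provider API: `listMessages` is a hypothetical function you would implement against your provider's message-listing endpoint, and the `InboundMessage` shape is a simplified assumption.

```typescript
// Simplified message shape; a real provider payload will carry more fields.
type InboundMessage = { id: string; to: string[]; subject: string };

type PollOptions = {
  inboxId: string;
  match: (message: InboundMessage) => boolean;
  deadline: number; // epoch milliseconds
  intervalMs?: number;
  // Hypothetical fetcher, injected so the helper stays provider-agnostic.
  listMessages: (inboxId: string) => Promise<InboundMessage[]>;
};

async function pollMessagesUntil(options: PollOptions): Promise<InboundMessage> {
  const { inboxId, match, deadline, listMessages, intervalMs = 1_000 } = options;

  while (true) {
    // Always check at least once, even if the deadline is already close.
    const messages = await listMessages(inboxId);
    const found = messages.find(match);
    if (found) return found;

    const remaining = deadline - Date.now();
    if (remaining <= 0) {
      throw new Error(`No matching message in inbox ${inboxId} before the deadline`);
    }
    // Never sleep past the deadline.
    await new Promise<void>(resolve => setTimeout(resolve, Math.min(intervalMs, remaining)));
  }
}
```

The helper throws on timeout rather than returning an empty value, so a missing email fails the test at the waiting step with the inbox ID in the error message.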
Step 6: consume structured JSON, not rendered HTML
For agents and CI, email should be data. A normalized JSON payload is easier to test, log, dedupe, and minimize than a rendered mailbox page.
Your test harness should extract only the artifact it needs:
| Test flow | Artifact to extract | Safer assertion |
|---|---|---|
| Sign-up verification | OTP or verification link | Code format, sender, expected domain |
| Password reset | Reset link | Host allowlist, token presence, expiration behavior |
| Magic link login | Login URL | Correct recipient, single-use behavior |
| Team invite | Accept-invite link | Team or tenant identifier in the flow |
| Billing or admin alert | Message metadata and key text | Sender, subject, event ID, not pixel-perfect HTML |
Prefer text/plain content when available. If you must inspect HTML, do not render it inside CI or expose it directly to an LLM agent. Treat inbound email as untrusted input.
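As an illustration of extracting only the artifact, the sketch below pulls an OTP or an allowlisted HTTPS link out of a plain-text body. The function names, the six-digit default, and the example hosts are assumptions; adjust them to the codes and domains your application actually sends.

```typescript
// Find the first standalone run of exactly `digits` digits in the text.
function extractOtp(text: string, digits = 6): string | null {
  const pattern = new RegExp(`(?<!\\d)\\d{${digits}}(?!\\d)`);
  const match = text.match(pattern);
  return match ? match[0] : null;
}

// Return the first HTTPS link whose hostname is explicitly allowlisted.
function extractAllowedLink(text: string, allowedHosts: string[]): string | null {
  const candidates = text.match(/https:\/\/[^\s<>"')]+/g) ?? [];
  for (const candidate of candidates) {
    // Drop sentence punctuation that trails the URL in prose.
    const trimmed = candidate.replace(/[.,;!?]+$/, "");
    let url: URL;
    try {
      url = new URL(trimmed);
    } catch {
      continue; // Not a parseable URL; treat as untrusted noise.
    }
    // Compare the parsed hostname, never a substring of the raw text.
    if (url.protocol === "https:" && allowedHosts.includes(url.hostname)) {
      return url.toString();
    }
  }
  return null;
}
```

Checking the parsed hostname matters because a link such as `https://app.example.com@evil.example.net/` contains the trusted name as text but resolves to a different host.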
Reliability rules for CI and retries
Creating a custom-domain address is only one part of a reliable test flow. The rest is about isolation, correlation, idempotency, and observability.
| Failure mode | Common cause | Fix |
|---|---|---|
| Wrong email selected | Shared inbox or broad matcher | Create one inbox per attempt and match by inbox ID plus flow intent |
| Duplicate processing | SMTP retries, webhook retries, polling overlap | Dedupe by delivery ID, message ID, and extracted artifact |
| Retry consumes stale email | Reused address after failed attempt | Generate a new inbox for each retry attempt |
| Late message fails test | Fixed short sleeps | Use deadline-based waiting with webhook-first delivery and polling fallback |
| Hard-to-debug CI failure | Missing correlation logs | Log run ID, attempt ID, inbox ID, message ID, and timestamps |
| Agent clicks unsafe link | Full email exposed to model | Validate links in code and pass only the approved artifact to the agent |
The most important operational rule is simple: do not reuse the same custom-domain inbox across parallel tests. Reuse creates ambiguous state, especially when providers resend messages, webhooks retry, or CI jobs run concurrently.
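The dedupe rule from the table can be sketched as a small in-memory guard. The field names (`deliveryId`, `messageId`, `artifact`) and the `ArtifactDeduper` class are hypothetical; a real harness would map them from whatever identifiers the provider payload exposes, and might persist the keys if several processes consume the same events.

```typescript
type DeliveryEvent = {
  inboxId: string;
  deliveryId: string; // one webhook or poll delivery
  messageId: string;  // one received email
  artifact: string;   // extracted OTP or link
};

class ArtifactDeduper {
  private readonly seen = new Set<string>();

  // Returns true only the first time an event is new at every level.
  shouldProcess(event: DeliveryEvent): boolean {
    const keys = [
      `delivery:${event.inboxId}:${event.deliveryId}`,
      `message:${event.inboxId}:${event.messageId}`,
      `artifact:${event.inboxId}:${event.artifact}`,
    ];
    const duplicate = keys.some(key => this.seen.has(key));
    // Record all keys even for duplicates so later variants are also caught.
    for (const key of keys) this.seen.add(key);
    return !duplicate;
  }
}
```

Keys are scoped by inbox ID so that two isolated attempts receiving the same OTP value do not suppress each other.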
Guardrails for LLM agents using custom-domain email
LLM agents should not browse a mailbox like a human. Give them a narrow tool surface.
A safe agent contract might include:
- create_test_inbox(flow_name) to provision an isolated address.
- wait_for_message(inbox_id, expected_intent) to wait with a deadline.
- extract_verification_artifact(message_id) to return an OTP or approved URL.
- complete_flow(artifact) to use the artifact in the target application.
The agent does not need raw MIME, full HTML, tracking pixels, or every header. It needs a minimal, typed result. Your code should verify webhook signatures, dedupe events, validate URLs, and enforce domain allowlists before the model sees anything actionable.
This is where signed payloads matter. If you receive email events through a webhook, verify the signature before processing the payload. Mailhook supports signed payloads, which helps protect automation from spoofed webhook requests and tampered JSON.
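The sketch below shows what signature verification commonly looks like, using Node's `createHmac` and `timingSafeEqual`. It assumes an HMAC-SHA256 signature over the raw request body, hex-encoded and delivered alongside the payload. That scheme is an assumption for illustration only; the actual algorithm, encoding, and header name for Mailhook must come from its llms.txt reference.

```typescript
import { Buffer } from "node:buffer";
import { createHmac, timingSafeEqual } from "node:crypto";

// ASSUMPTION: HMAC-SHA256 over the raw body, hex-encoded. Check your provider's docs.
function verifyWebhookSignature(rawBody: string, signatureHex: string, secret: string): boolean {
  const expected = createHmac("sha256", secret).update(rawBody, "utf8").digest();
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on length mismatch, so reject unequal lengths first.
  if (received.length !== expected.length) return false;
  return timingSafeEqual(received, expected);
}
```

Verify against the raw body exactly as received, before JSON parsing, because re-serializing the payload can change its bytes and invalidate a correct signature.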
How Mailhook fits this workflow
Mailhook is designed for programmable temp inboxes rather than human mailbox management. For custom-domain test flows, the useful building blocks are:
- Disposable inbox creation via API.
- Structured JSON output for received emails.
- RESTful API access for automation harnesses and agents.
- Real-time webhook notifications.
- Polling API support when pull-based retrieval is simpler or needed as fallback.
- Instant shared domains for fast prototyping.
- Custom domain support for allowlisting, control, and environment separation.
- Signed payloads for webhook security.
- Batch email processing for higher-volume workflows.
A practical rollout is to prototype your harness on an instant shared domain, then move the same factory interface to a custom subdomain once you need allowlisting or tighter governance. Mailhook’s llms.txt is the best reference for exact implementation details.
Custom-domain test flow checklist
Before you rely on a custom-domain address in CI or an LLM toolchain, confirm that the workflow satisfies these checks:
- The domain is a dedicated test subdomain, not the primary human email domain.
- MX records point to the inbound provider and are verified from DNS.
- The test creates a unique inbox per run or per retry attempt.
- The test stores the returned email and inbox handle together.
- Email waiting uses webhooks first or polling with a clear deadline.
- Message selection uses narrow matchers, not “latest email in mailbox.”
- The parser extracts only the required OTP, magic link, or assertion text.
- Webhook payloads are verified before processing.
- Dedupe keys exist for delivery, message, and extracted artifact levels.
- CI logs include stable identifiers but avoid full secrets and unnecessary email content.
- LLM agents receive minimized artifacts, not raw inbox access.
Frequently Asked Questions
Do I need a custom domain for every email test?
No. For syntax validation, use reserved non-routable examples. For local development, SMTP capture may be enough. Use a custom domain when the flow must receive real email, pass allowlists, or behave like a production-adjacent integration.
Should I create a real mailbox account or an API inbox?
For automated test flows, an API inbox is usually better. A real mailbox account is built for humans and long-lived identity. An API inbox is easier to isolate per run, read as JSON, and clean up after tests.
Can I use one catch-all inbox on my custom domain?
You can, but it is risky in parallel CI. Catch-all routing should be constrained with strict local-part patterns and correlation tokens. For reliability, prefer one disposable inbox per test attempt.
What DNS records matter for inbound-only test email?
MX records matter most because they route inbound mail for the subdomain. SPF, DKIM, and DMARC matter when the domain sends mail or when you are specifically testing email authentication behavior.
How should an LLM agent handle verification emails?
The agent should use a constrained tool that returns only the extracted OTP or validated link. Do not expose raw HTML or full mailbox access to the model. Verify webhook signatures and validate links in code first.
Where can I find Mailhook’s API details for agents?
Use the Mailhook llms.txt reference. It is intended to make the integration contract easier for LLMs, agents, and developers to consume.
Build custom-domain test inboxes without mailbox glue
If your current tests depend on a shared mailbox, manual login, or brittle HTML scraping, moving to API-created inboxes is a cleaner path. With Mailhook, you can create disposable inboxes, receive emails as structured JSON, use webhooks or polling, and support custom domains for more realistic test flows.
Start with Mailhook and check the llms.txt integration reference when you are ready to wire custom-domain inboxes into CI, QA automation, or LLM agent tools. No credit card is required to get started.