Email is one of the last “human-shaped” dependencies in automated systems. The moment you ask a test runner or an LLM agent to receive an email, you inherit queues, retries, spam filtering, and timing uncertainty. That is why most “temp email receive” implementations become flaky: they default to fixed sleeps, shared mailboxes, or polling loops that miss edge cases.
A more reliable pattern is webhook-first delivery with a polling fallback. Webhooks give you low latency and less wasted compute. Polling gives you a deterministic escape hatch when webhooks cannot reach you (CI sandboxes, local dev, transient outages). Together, they form an integration contract that is fast, parallel-safe, and retry-friendly.
This post shows how to implement that hybrid pattern for automation and agent workflows, and how to do it safely.
For provider-specific request/response shapes, refer to the canonical contract in Mailhook’s llms.txt.
What “temp email receive” should mean in automation
For humans, “receive email” means “I can open my inbox and read it.” For code, it should mean something stricter:
- You can create an inbox on demand (disposable, scoped to a run or attempt).
- Emails arrive as structured JSON (so you assert on data, not brittle HTML rendering).
- You can wait deterministically (event-driven when possible, pull-based when necessary).
- Processing is idempotent (safe to retry without double-consuming the same verification code or magic link).
Mailhook is built around this model: disposable inbox creation via API, JSON email output, real-time webhooks, and a polling API as a fallback (see llms.txt for the integration contract).
Why webhook-first is the default
Polling is simple to understand, but it is rarely the best default once you run at scale.
Webhook-first delivery is preferred because it:
- Cuts latency: you react as soon as the email lands.
- Reduces cost: you are not burning CPU and API calls on empty polls.
- Handles parallelism cleanly: each run can own an inbox, and webhooks arrive independently.
- Improves debuggability: you can log one inbound HTTP event per delivery attempt.
In practice, webhook-first does not mean “process everything inside the webhook handler.” It means:
- Treat the webhook as a delivery signal.
- Verify authenticity (signatures).
- Store the payload (or fetch the message via API if your provider supports it).
- Hand off the minimal artifact (OTP, verification URL) to the workflow.

The reliability contract: what to design for
A hybrid “webhook-first, polling fallback” integration is reliable only if you explicitly handle the failure modes both mechanisms share.
1) Expect duplicates and retries
Email delivery pipelines and webhook dispatchers retry. Your handler must be idempotent.
A practical approach is to track three layers of identity:
- Delivery identity: a provider-specific delivery attempt ID (used for webhook retry dedupe).
- Message identity: a stable message identifier (used to avoid processing the same email twice).
- Artifact identity: a derived “thing you actually consume” such as an OTP string or verification URL hash (used to enforce consume-once semantics).
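Deduping on only the OTP or only the email subject is not enough: subjects repeat across distinct emails, and an artifact key alone cannot distinguish a webhook retry from a genuine resend that carries the same value.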
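As a rough illustration of the three layers, here is a minimal in-memory sketch. The class and method names (DedupeLedger, first_delivery, first_message, consume) are hypothetical, and a real deployment would likely back this with a shared store that expires entries rather than process-local sets.

```python
import hashlib


class DedupeLedger:
    """In-memory ledger for the three identity layers (illustrative only)."""

    def __init__(self):
        self._deliveries = set()  # provider delivery attempt IDs
        self._messages = set()    # stable message IDs
        self._artifacts = set()   # hashes of consumed OTPs / verification URLs

    @staticmethod
    def _first_time(bucket: set, key: str) -> bool:
        # Returns True only on the first sighting of key in this bucket.
        if key in bucket:
            return False
        bucket.add(key)
        return True

    def first_delivery(self, delivery_id: str) -> bool:
        """Dedupe webhook retries of the same delivery attempt."""
        return self._first_time(self._deliveries, delivery_id)

    def first_message(self, message_id: str) -> bool:
        """Avoid processing the same email twice."""
        return self._first_time(self._messages, message_id)

    def consume(self, inbox_id: str, artifact: str) -> bool:
        """Consume-once: True only the first time this artifact is used for this inbox."""
        digest = hashlib.sha256(f"{inbox_id}:{artifact}".encode()).hexdigest()
        return self._first_time(self._artifacts, digest)
```

Keeping the layers separate means a resent email with a fresh message ID still gets inspected, while an OTP that was already used is refused at the last step.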
2) Verify webhook authenticity before processing
“Email signed by” indicators (DKIM) are not a substitute for webhook request authentication. Your threat model is different: an attacker may try to spoof your webhook endpoint directly.
Use your provider’s signed payload mechanism (Mailhook supports signed payloads) and verify signatures over the raw request body. The exact headers and signature format are provider-specific, so use the canonical description in Mailhook’s llms.txt.
At minimum, enforce:
- Signature verification on the raw bytes.
- A timestamp tolerance window.
- Replay protection (store a “seen delivery id” set for a bounded time).
- Fail closed (reject if anything does not verify).
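The four checks above can be sketched as a single function. The signing scheme used here (hex-encoded HMAC-SHA256 over the timestamp, a dot, and the raw body) is an assumption chosen for illustration, not Mailhook's documented format, and the function and parameter names are hypothetical. Check llms.txt for the real headers and algorithm.

```python
import hashlib
import hmac
import time


def verify_webhook(secret, raw_body, timestamp, signature_hex, delivery_id,
                   seen, tolerance_s=300, now=None):
    """Return True only if every check passes (fail closed).

    ASSUMED scheme: signature = hex(HMAC-SHA256(secret, timestamp + b"." + raw_body)).
    seen is a dict mapping delivery_id -> expiry time, used for replay protection.
    """
    now = time.time() if now is None else now
    if not isinstance(timestamp, str) or not isinstance(signature_hex, str):
        return False

    # 1) Timestamp tolerance window.
    try:
        ts = int(timestamp)
    except ValueError:
        return False
    if abs(now - ts) > tolerance_s:
        return False

    # 2) Signature over the raw bytes, compared in constant time.
    signed = timestamp.encode() + b"." + raw_body
    expected = hmac.new(secret, signed, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected.encode(), signature_hex.encode()):
        return False

    # 3) Replay protection: drop expired entries, then reject known delivery IDs.
    for key in [k for k, expiry in seen.items() if expiry <= now]:
        del seen[key]
    if delivery_id in seen:
        return False
    seen[delivery_id] = now + 2 * tolerance_s
    return True
```

The delivery ID is recorded only after the signature verifies, so unauthenticated requests cannot fill the replay set.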
3) Keep webhook handlers fast
Webhook endpoints are not job runners. Aim for:
- Verify signature.
- Persist payload and minimal metadata.
- Enqueue a job or trigger a waiting workflow.
- Return a success response.
This reduces timeout-related retries and makes duplicates less frequent.
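A framework-agnostic sketch of that shape follows. It assumes the payload carries a string message_id field and that the is_authentic callable wraps your signature check; both are illustrative assumptions, as are the function name and the in-process dict and queue standing in for durable storage and a job system.

```python
import json
import queue


def handle_webhook(raw_body: bytes, is_authentic, store: dict, jobs: queue.Queue) -> int:
    """Verify, persist, enqueue, and return an HTTP status code. No heavy work here."""
    # Verify first, on the raw bytes, before any parsing.
    if not is_authentic(raw_body):
        return 401

    try:
        payload = json.loads(raw_body)
        message_id = payload["message_id"]  # assumed field name
    except (ValueError, KeyError, TypeError):
        return 400
    if not isinstance(message_id, str):
        return 400

    # Persist once and enqueue once; a retried delivery becomes a cheap no-op.
    if message_id not in store:
        store[message_id] = payload
        jobs.put(message_id)

    # Acknowledge quickly so the provider does not retry on a timeout.
    return 200
```

A worker then pulls message IDs off the queue and does the slow part (artifact extraction, resuming the workflow) outside the request path.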
Polling fallback: when and how to do it
Polling exists to cover cases where webhooks are temporarily impossible or unreliable:
- Local development without a public callback URL.
- CI environments that block inbound connections.
- A transient outage between the provider and your webhook.
- You deliberately disable webhooks for a batch job.
Polling done poorly causes flakes. Polling done well is deterministic.
Polling rules that prevent flakes
Use these rules as your baseline:
- Use a per-attempt inbox: do not poll a shared mailbox.
- Use a deadline, not a sleep: wait up to a time budget, with short backoff.
- Use a cursor or “seen ids”: do not reprocess old messages.
- Match narrowly: select the email you want by stable attributes (recipient, inbox id, time window, and optionally a correlation token).
Here is provider-agnostic pseudocode for a resilient polling wait:
```python
import time


def wait_for_email(poll, match, timeout_s=60):
    """
    poll(): returns a list of messages (newest-first or provider-defined)
    match(msg): returns True for the message you want
    """
    # Monotonic clock: the deadline is unaffected by wall-clock adjustments.
    deadline = time.monotonic() + timeout_s
    seen_ids = set()
    backoff_s = 0.5

    while time.monotonic() < deadline:
        messages = poll()
        for msg in messages:
            msg_id = msg.get("id") or msg.get("message_id")
            # Skip messages already inspected in an earlier poll.
            if msg_id and msg_id in seen_ids:
                continue
            if msg_id:
                seen_ids.add(msg_id)
            if match(msg):
                return msg

        # Back off gradually, but never sleep past the deadline.
        remaining = deadline - time.monotonic()
        time.sleep(max(0.0, min(backoff_s, remaining)))
        backoff_s = min(backoff_s * 1.5, 5.0)

    raise TimeoutError("No matching email before deadline")
```
This pattern avoids fixed sleeps and makes timeouts actionable.
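The match predicate is where the "match narrowly" rule lives. One possible shape is sketched below; the field names (inbox_id, received_at as epoch seconds, subject, text) are assumptions about the message JSON, so adjust them to your provider's actual schema.

```python
from typing import Callable, Optional


def make_matcher(inbox_id: str, not_before: float,
                 token: Optional[str] = None) -> Callable[[dict], bool]:
    """Build a predicate that accepts only messages for this attempt.

    Assumed message fields: inbox_id, received_at (epoch seconds), subject, text.
    """
    def match(msg: dict) -> bool:
        # Stable attribute 1: the inbox this attempt owns.
        if msg.get("inbox_id") != inbox_id:
            return False
        # Stable attribute 2: a time window that excludes older messages.
        received = msg.get("received_at")
        if not isinstance(received, (int, float)) or received < not_before:
            return False
        # Optional: a correlation token planted in the triggering request.
        if token is not None:
            haystack = f'{msg.get("subject") or ""}\n{msg.get("text") or ""}'
            return token in haystack
        return True

    return match
```

A predicate built this way can be passed as the match argument of a polling wait, so a stale or unrelated message never satisfies the wait by accident.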

The hybrid pattern: webhook signals, polling as the safety net
A robust implementation uses both mechanisms, but with clear responsibilities:
- Webhook path (primary): fastest way to learn an email arrived.
- Polling path (fallback): used when the webhook never arrives or cannot be received.
A common approach is “race then converge”:
- Start a wait with a deadline.
- Subscribe to webhook deliveries.
- In parallel, run a low-frequency polling loop.
- Whichever sees the message first resolves the wait.
You can implement this without complex concurrency by using a shared store:
- Webhook handler writes inbox_id -> latest_messages (or message_id -> payload).
- Polling checks the provider API only if the store has not been updated recently.
This reduces API calls while still being resilient.
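A minimal sketch of the shared-store approach, assuming the webhook handler runs in another thread of the same process. InboxStore, wait_hybrid, and the quiet_s threshold are hypothetical names for illustration; poll stands in for whatever provider API call lists an inbox's messages.

```python
import threading
import time


class InboxStore:
    """Shared store written by the webhook handler and read by the waiter."""

    def __init__(self):
        self._lock = threading.Lock()
        self._messages = {}    # inbox_id -> list of payloads
        self._updated_at = {}  # inbox_id -> monotonic time of last webhook write

    def put(self, inbox_id, payload):
        with self._lock:
            self._messages.setdefault(inbox_id, []).append(payload)
            self._updated_at[inbox_id] = time.monotonic()

    def messages(self, inbox_id):
        with self._lock:
            return list(self._messages.get(inbox_id, []))

    def is_quiet(self, inbox_id, quiet_s):
        # True if no webhook has written to this inbox within quiet_s seconds.
        with self._lock:
            last = self._updated_at.get(inbox_id)
        return last is None or time.monotonic() - last >= quiet_s


def wait_hybrid(store, inbox_id, poll, match, timeout_s=60.0, quiet_s=5.0, tick_s=0.25):
    """Resolve from the webhook-fed store first; poll only when the store is quiet."""
    deadline = time.monotonic() + timeout_s
    next_poll = time.monotonic() + quiet_s  # give the webhook a head start
    while True:
        for msg in store.messages(inbox_id):
            if match(msg):
                return msg
        now = time.monotonic()
        if now >= deadline:
            raise TimeoutError("No matching email before deadline")
        if now >= next_poll and store.is_quiet(inbox_id, quiet_s):
            for msg in poll():
                if match(msg):
                    return msg
            next_poll = time.monotonic() + quiet_s
        time.sleep(min(tick_s, max(0.0, deadline - time.monotonic())))
```

When webhooks are healthy the provider API is never called; when they are blocked, the wait degrades to a low-frequency poll under the same deadline.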
Security guardrails for LLM agents receiving email
If an LLM agent is in the loop, “temp email receive” becomes a security boundary. Email is untrusted input, and it often contains links.
Recommended guardrails:
- Prefer structured JSON fields and text/plain content when extracting artifacts.
- Avoid rendering or “browsing” arbitrary HTML from inbound mail.
- Treat links as potentially hostile (open redirects, tracking, phishing).
- Validate verification URLs before using them (scheme, host allowlist, path constraints).
- Minimize what you show the agent. Often the agent only needs one artifact: an OTP string or a single verification URL.
If you provide the agent a tool like extract_verification_artifact(message_json), you reduce prompt injection risk by not exposing the entire raw email.
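One way such a tool might look is sketched here. The text field name, the six-digit OTP format, the allowlisted host, and the /verify path are all placeholder assumptions; substitute the values your own application actually emits.

```python
import re
from typing import Optional
from urllib.parse import urlsplit

ALLOWED_HOSTS = {"app.example.com"}  # hypothetical allowlist
ALLOWED_PATH = "/verify"             # hypothetical path constraint
URL_RE = re.compile(r"https://[^\s<>\"']+")
OTP_RE = re.compile(r"(?<!\d)\d{6}(?!\d)")  # assumes six-digit codes


def extract_verification_artifact(message_json: dict) -> Optional[dict]:
    """Return one minimal artifact (allowlisted URL or OTP), or None.

    Reads only the plain-text body (assumed field: "text"); HTML is never rendered.
    """
    text = message_json.get("text") or ""

    for candidate in URL_RE.findall(text):
        candidate = candidate.rstrip(".,;:)")  # trim trailing sentence punctuation
        try:
            parts = urlsplit(candidate)
        except ValueError:
            continue  # malformed URL: skip it rather than guess
        # Validate scheme, host, and path before the URL ever reaches the agent.
        if (parts.scheme == "https"
                and parts.hostname in ALLOWED_HOSTS
                and parts.path == ALLOWED_PATH):
            return {"type": "url", "value": candidate}

    otp = OTP_RE.search(text)
    if otp:
        return {"type": "otp", "value": otp.group(0)}
    return None
```

The agent sees only the returned dictionary, so instructions embedded elsewhere in the email body never enter its context.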
Webhook-only vs polling-only vs hybrid
Most teams end up hybrid, but it helps to be explicit:
| Approach | Strengths | Weaknesses | Best for |
|---|---|---|---|
| Webhook-only | Low latency, efficient, event-driven | Requires inbound connectivity, must handle signature verification and retries | Production automation, agents running on servers |
| Polling-only | Works anywhere, simplest network model | Higher latency, wasted calls, can miss edge cases without cursors and dedupe | Local dev, restricted CI |
| Hybrid (recommended) | Fast by default, resilient in weird environments | Slightly more moving parts | CI plus production, agent toolchains, signup verification |
A practical checklist for implementation
Use this as a code review checklist for your receiving path:
- Inbox lifecycle: create a disposable inbox per run or per attempt.
- Correlation: log run_id, inbox_id, and message identifiers, not full bodies.
- Webhooks: verify signatures, enforce timestamp tolerance, and dedupe delivery retries.
- Storage: persist the normalized JSON and enough metadata to replay/debug.
- Consumption: extract minimal artifacts (OTP/link), enforce consume-once semantics.
- Polling fallback: deadline-based wait with backoff and a cursor or seen-id set.
Frequently Asked Questions
What is the best way to receive temp email in CI?
Use webhook-first if your CI can accept inbound callbacks, otherwise use polling with a strict deadline, backoff, and inbox-per-run isolation.
Why not just sleep for 10 seconds and fetch the mailbox?
Fixed sleeps are the main source of flakiness. Delivery latency varies, and retries can create duplicates. Deadline-based waits plus dedupe are more reliable.
Do I still need polling if I have webhooks?
Usually yes, as a fallback. Network misconfigurations, transient outages, and local development are common. A minimal polling fallback makes failures recoverable.
How do I secure webhook-delivered email JSON?
Verify signed payloads on the raw request body, enforce timestamp tolerance, add replay detection, and only then parse and process the JSON.
What should I expose to an LLM agent from an email?
Ideally a minimal, deterministic artifact like an OTP or a single allowlisted verification URL, not the full HTML body.
Use Mailhook for webhook-first temp inbox receiving
If you are building “temp email receive” into an agent workflow, QA suite, or signup verification harness, Mailhook provides the primitives you want: disposable inbox creation via API, emails as structured JSON, real-time webhook notifications, and a polling API fallback, plus signed payloads for webhook security.
- Start with Mailhook
- Use the canonical integration reference in Mailhook’s llms.txt (recommended for LLM tools and exact API details)
No credit card is required to try it, and you can later move to shared or custom domains depending on your deliverability and allowlisting needs.