Email is a stubborn dependency in automation. It arrives late, it gets duplicated, HTML templates change, and MIME edge cases quietly break parsers. A temp email API solves the infrastructure problem by giving you disposable inboxes on demand, and it solves the integration problem by delivering each message as structured JSON, so your tests and AI agents consume email like data, not like a webpage.
This guide covers what “receive and parse emails as JSON” should look like in practice, how to choose webhook vs polling delivery, and how to extract OTPs and magic links safely.
What a temp email API is (and what it should do)
A temp email API is an automation-friendly way to provision inboxes programmatically, receive inbound mail for those inboxes, and consume messages from code.
For QA and agent workflows, the winning model is inbox-first:
- You create a fresh inbox per run or per attempt.
- You get back a routable email address plus an inbox handle (often an inbox_id).
- You wait for messages deterministically (webhook-first, polling fallback).
- You parse the message as JSON and extract a minimal artifact (OTP, verification URL).
- You discard or rotate the inbox.
That last point matters: short-lived inboxes reduce cross-test collisions and limit the blast radius if something leaks.
“Receive emails as JSON” means a stable contract, not a best-effort parse
Email is defined by standards like RFC 5322 and MIME (RFC 2045), but real-world mail often contains:
- Duplicate or folded headers
- Multi-part bodies with both text/plain and text/html
- Weird encodings
- Attachments and inline resources
- Threading headers that are present, missing, or malformed
A provider that returns JSON should normalize these details into a contract you can rely on. When you evaluate any temp email API, ask whether the JSON output is stable enough that you can write assertions against it.
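To make one piece of that normalization concrete, here is a minimal sketch of unfolding continuation lines and grouping repeated headers. `normalizeHeaders` is a hypothetical helper written for illustration, not any provider's API, and it assumes you have the raw header block as a string.

```javascript
// Hypothetical helper: turn a raw header block into a lookup keyed by
// lowercased header name, keeping every value for repeated headers.
function normalizeHeaders(rawHeaderBlock) {
  // Unfold: a line break followed by spaces or tabs continues the previous
  // line. The fold is collapsed into a single space here.
  const unfolded = rawHeaderBlock.replace(/\r?\n[ \t]+/g, " ");
  const headers = Object.create(null); // no prototype, so odd names are safe keys
  for (const line of unfolded.split(/\r?\n/)) {
    const colon = line.indexOf(":");
    if (colon <= 0) continue; // skip blank or malformed lines
    const name = line.slice(0, colon).trim().toLowerCase();
    const value = line.slice(colon + 1).trim();
    (headers[name] ??= []).push(value);
  }
  return headers;
}

const sampleHeaders = [
  "Received: from a.example",
  "Received: from b.example",
  "Subject: Your verification",
  "  code",
].join("\r\n");

const normalized = normalizeHeaders(sampleHeaders);
console.log(normalized.received.length, normalized.subject[0]);
```

Storing every header as an array keeps the contract uniform: callers never have to guess whether a header appeared once or several times.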
Recommended JSON fields for automation and agents
A practical JSON payload usually needs:
| JSON field (concept) | Why it matters | Notes for reliability |
|---|---|---|
| message_id (provider ID) | Stable identifier for idempotency | Do not rely only on Subject or From |
| received_at | Deterministic time budgets | Prefer server receive time vs client guess |
| to, from | Routing and correlation | Normalize address casing and formatting |
| subject | Debugging and loose matching | Useful, but not trustworthy for strict matching |
| headers (normalized) | Forensics and correlation | Handle duplicates and folded headers |
| text (plain) | Safe, stable parsing target | Prefer text/plain for extraction |
| html (optional) | Human inspection, fallback | Treat as untrusted input |
| attachments (metadata + fetch) | Workflows that email files | Keep attachment handling explicit |
| raw (optional) | Debugging and reprocessing | Helps when templates change |
If you are building LLM tooling, consider exposing a minimized view to the model (for example: from, to, subject, received_at, text, and a small set of extracted artifacts) and keep raw or HTML out of the agent context unless you truly need it.
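A minimized view like that could be sketched as a small projection function. The field names follow the recommended fields listed earlier and are assumptions rather than a specific provider's payload; `toAgentView` and the character budget are hypothetical.

```javascript
// Hypothetical projection from a full message to what the model may see.
function toAgentView(message, artifacts = []) {
  const MAX_TEXT_CHARS = 4000; // assumed budget; tune for your model
  return {
    from: message.from,
    to: message.to,
    subject: message.subject,
    received_at: message.received_at,
    text: (message.text ?? "").slice(0, MAX_TEXT_CHARS),
    artifacts,
    // html, raw, headers, and attachments are deliberately left out.
  };
}

const agentView = toAgentView(
  {
    from: "no-reply@example.com",
    to: "run-123@inbox.example",
    subject: "Your code",
    received_at: "2024-01-01T00:00:00Z",
    text: "Your code is 123456",
    html: "<p>Your code is <b>123456</b></p>",
  },
  [{ type: "otp", value: "123456" }]
);
console.log(Object.keys(agentView));
```

Building the view with an explicit field list (rather than deleting unwanted keys) means new payload fields stay hidden from the agent until you opt them in.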

Delivery patterns: webhook vs polling (and why most teams need both)
There are two ways to receive messages from a temp email API:
Webhooks (push)
Webhooks are event-driven: the provider calls your endpoint when a message arrives.
They are ideal when:
- You want low latency
- You want fewer API calls
- You run many parallel inboxes and want an event stream
Operationally, webhooks require you to think about retries, signature verification, and idempotent processing.
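As an illustration of the signature step, the sketch below assumes the provider signs the raw request body with HMAC-SHA256 and sends the hex digest alongside the request. That scheme, the function names, and the status codes are assumptions for this example; the real scheme is defined by your provider's documentation.

```javascript
const nodeCrypto = require("node:crypto");

// Assumption: signature = hex(HMAC-SHA256(secret, rawBody)).
function verifySignature(rawBody, signatureHex, secret) {
  const expected = nodeCrypto.createHmac("sha256", secret).update(rawBody).digest();
  const received = Buffer.from(signatureHex ?? "", "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return received.length === expected.length &&
    nodeCrypto.timingSafeEqual(received, expected);
}

// Hypothetical handler shape: verify first, parse only after verification.
function handleWebhook(rawBody, signatureHex, secret) {
  if (!verifySignature(rawBody, signatureHex, secret)) {
    return { status: 401, message: null };
  }
  return { status: 200, message: JSON.parse(rawBody) };
}
```

Verifying against the raw bytes before calling `JSON.parse` matters: re-serializing a parsed body can change whitespace or key order and invalidate an otherwise correct signature.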
Polling (pull)
Polling is simpler to deploy: your code repeatedly asks the API whether messages have arrived.
It is useful when:
- Your environment cannot accept inbound web requests (some CI setups)
- You want a straightforward implementation first
- You need a fallback path when webhooks are delayed or blocked
Polling must be designed carefully to avoid fixed sleeps, missed messages, and duplicate processing.
The recommended hybrid
In production-grade automation, the most reliable approach is:
- Webhook-first for fast delivery
- Polling fallback for recovery (if your webhook endpoint was down, if a deploy happened, or if the notification was dropped)
Here is a quick comparison:
| Mechanism | Strength | Common pitfall | Fix |
|---|---|---|---|
| Webhook | Fast, scalable | Replay, spoofing, retries | Verify signatures, enforce idempotency |
| Polling | Simple, firewall-friendly | Rate limits, fixed sleeps, duplicates | Backoff, cursors, seen-IDs |
| Hybrid | Most robust | More moving parts | Keep a single “wait for message” contract |
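One way to keep that single "wait for message" contract is a small waiter object that accepts webhook pushes and also polls. This is a sketch under assumptions: `createInboxWaiter`, `push`, and the injected `listMessages` function are hypothetical names, and matching and deduplication are left out for brevity.

```javascript
// Hypothetical hybrid waiter: webhook pushes wake the wait immediately,
// polling catches anything the webhook path missed.
function createInboxWaiter({ listMessages, pollIntervalMs = 1000 }) {
  const delivered = []; // messages handed over by the webhook handler
  let wake = null;      // ends the current sleep early when a push arrives

  return {
    // Call this from your webhook handler.
    push(message) {
      delivered.push(message);
      if (wake) wake();
    },
    // Single contract: return the first message or throw on timeout.
    async waitForMessage(timeoutMs) {
      const deadline = Date.now() + timeoutMs;
      while (Date.now() < deadline) {
        if (delivered.length > 0) return delivered.shift();
        const polled = await listMessages();
        if (polled.length > 0) return polled[0];
        if (delivered.length > 0) continue; // a push landed during the poll
        const remaining = Math.max(0, deadline - Date.now());
        await new Promise((resolve) => {
          const timer = setTimeout(resolve, Math.min(pollIntervalMs, remaining));
          wake = () => { clearTimeout(timer); resolve(); };
        });
        wake = null;
      }
      if (delivered.length > 0) return delivered.shift(); // arrived at the deadline
      throw new Error("Timed out waiting for email");
    },
  };
}
```

Callers only ever see `waitForMessage`, so tests do not need to know which path delivered the email.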
How to parse emails as JSON safely (especially for LLM agents)
Even when the provider gives you JSON, the email content is still untrusted input. The JSON format just makes it easier to apply consistent rules.
Prefer text/plain for deterministic extraction
For OTP and verification flows, text/plain usually changes less often than HTML, and it avoids markup-related parsing failures.
If you must fall back to HTML, do it with explicit safeguards:
- Do not render HTML in an agent tool
- Strip tags and decode entities before searching
- Avoid “clicking” links automatically
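The strip-and-decode step can be sketched with a few regular expressions. `htmlToSearchableText` is a hypothetical helper, it handles only a handful of named entities, and it is meant for searching text, not for sanitizing HTML that will be rendered.

```javascript
// Hypothetical fallback: reduce HTML to plain text for searching only.
// This is not a sanitizer; never render or execute the result.
const NAMED_ENTITIES = new Map([
  ["amp", "&"], ["lt", "<"], ["gt", ">"], ["quot", '"'], ["apos", "'"], ["nbsp", " "],
]);

function htmlToSearchableText(html) {
  return html
    // Drop script and style blocks, whose content is never visible text.
    .replace(/<(script|style)\b[^>]*>[\s\S]*?<\/\1\s*>/gi, " ")
    // Strip all remaining tags.
    .replace(/<[^>]*>/g, " ")
    // Decode entities in a single pass so nothing is decoded twice.
    .replace(/&(#x[0-9a-f]+|#\d+|[a-z]+);/gi, (match, body) => {
      if (body[0] !== "#") return NAMED_ENTITIES.get(body.toLowerCase()) ?? match;
      const code = /^#x/i.test(body)
        ? parseInt(body.slice(2), 16)
        : parseInt(body.slice(1), 10);
      return code <= 0x10ffff ? String.fromCodePoint(code) : match;
    })
    // Collapse whitespace left behind by removed markup.
    .replace(/\s+/g, " ")
    .trim();
}

console.log(htmlToSearchableText("<p>Your code is <b>123456</b>&nbsp;&amp; expires soon</p>"));
```

The output is then searched with the same OTP and URL rules you apply to text/plain, so both paths share one extraction policy.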
Extract minimal artifacts, not “meaning”
For automation you typically need one of these artifacts:
- A numeric OTP
- A verification URL (magic link)
- A reset-password link
The goal is minimal deterministic extraction, not summarization.
A simple extraction policy:
- Accept at most one OTP candidate that matches your format
- Accept at most one URL candidate that matches an allowlisted domain and expected path prefix
- If there are multiple candidates, fail with an explainable error and log the message IDs
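For the URL half of that policy, a minimal sketch might look like the following. `extractSingleUrl` and its option names are hypothetical, and the URL-matching regex is deliberately simple; it only considers https links.

```javascript
// Hypothetical policy helper: return the single URL in `text` whose host is
// allowlisted and whose path starts with `pathPrefix`; throw on ambiguity.
function extractSingleUrl(text, { allowedHosts, pathPrefix, messageId }) {
  const candidates = new Set(); // identical links count as one candidate
  for (const raw of text.match(/https:\/\/[^\s<>"')\]]+/g) ?? []) {
    let url;
    try {
      // Trim sentence punctuation that often trails a link in plain text.
      url = new URL(raw.replace(/[.,;:!?]+$/, ""));
    } catch {
      continue; // not a parseable URL
    }
    if (allowedHosts.includes(url.hostname) && url.pathname.startsWith(pathPrefix)) {
      candidates.add(url.href);
    }
  }
  if (candidates.size > 1) {
    throw new Error(`Ambiguous: ${candidates.size} URL candidates in message ${messageId}`);
  }
  return candidates.size === 1 ? [...candidates][0] : null;
}

const link = extractSingleUrl(
  "Confirm here: https://app.example.com/verify?token=abc. Need help? https://help.example.org/faq",
  { allowedHosts: ["app.example.com"], pathPrefix: "/verify", messageId: "msg_1" }
);
console.log(link);
```

Returning `null` for "nothing found" and throwing for "too many found" keeps the two failure modes distinguishable in logs.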
Validate links before using them (SSRF and open redirect hygiene)
If your automation follows URLs from email, treat that as a security boundary:
- Allowlist domains you control
- Reject IP-literals and private network destinations
- Do not follow unbounded redirect chains
The OWASP SSRF Prevention Cheat Sheet is a useful baseline for link-following safety.
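A pre-flight check along those lines could be sketched as below. `isSafeLink` is a hypothetical helper; it validates the URL string only and does not resolve DNS, so it does not by itself protect against an allowlisted name that resolves to a private address.

```javascript
// Hypothetical pre-flight check to run before any code follows an emailed link.
function isSafeLink(rawUrl, allowedHosts) {
  let url;
  try {
    url = new URL(rawUrl);
  } catch {
    return false; // unparseable input is never safe
  }
  if (url.protocol !== "https:") return false;
  if (url.username || url.password) return false; // no embedded credentials
  const host = url.hostname;
  // Reject IP literals: IPv6 hosts are bracketed, and the URL parser
  // normalizes numeric IPv4 forms to dotted decimals.
  if (host.startsWith("[") || /^\d{1,3}(\.\d{1,3}){3}$/.test(host)) return false;
  // Exact-match allowlist: "app.example.com.evil.test" does not match.
  return allowedHosts.includes(host);
}

console.log(isSafeLink("https://app.example.com/verify?token=abc", ["app.example.com"]));
```

If your automation fetches the link, one option is to disable automatic redirects (for example `redirect: "manual"` with `fetch`), then re-run the same check on each `Location` value up to a small, fixed number of hops.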
Determinism rules for CI and agent workflows
Most flaky email tests fail for predictable reasons: shared inbox collisions, naive sleeps, and weak matching.
Use an inbox-per-attempt model
If you reuse inboxes across runs, you will eventually read the wrong email. The clean fix is: create a disposable inbox per attempt (or at least per run) and treat its inbox_id as your scope.
Add correlation that you control
Make matching narrow and explainable:
- Include a run identifier in the action that triggers the email (for example: a unique signup email address per run, or a correlation token in metadata your system echoes)
- Prefer stable identifiers from the JSON payload over fuzzy subject matching
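A narrow matcher built on payload fields might look like this sketch. It assumes `to` is a single address string and `received_at` is an ISO 8601 timestamp; both shapes, and the `matchesRun` name, are assumptions for illustration.

```javascript
// Hypothetical matcher: accept a message only if it was addressed to this
// run's inbox and arrived after the run started.
function matchesRun(message, { inboxAddress, startedAt }) {
  const normalize = (addr) => String(addr ?? "").trim().toLowerCase();
  const addressedToRun = normalize(message.to) === normalize(inboxAddress);
  // An unparseable timestamp yields NaN, which fails the comparison.
  const arrivedDuringRun = Date.parse(message.received_at) >= startedAt.getTime();
  return addressedToRun && arrivedDuringRun;
}

const run = {
  inboxAddress: "run-7f3a@inbox.example",
  startedAt: new Date("2024-01-01T12:00:00Z"),
};
console.log(
  matchesRun({ to: "Run-7F3A@inbox.example", received_at: "2024-01-01T12:00:05Z" }, run)
);
```

Because every condition is an explicit comparison, a rejected message can be explained in a log line, which is harder with fuzzy subject matching.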
Design idempotency up front
Assume duplicates will happen:
- Webhooks can retry
- Polling can return the same message multiple times
- Your own job runner can restart
Use a stable key like inbox_id + message_id (and sometimes an extracted artifact hash) to ensure you process the same logical email once.
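A minimal sketch of that key in use is shown below. `createDeduper` is hypothetical and keeps its state in memory; a real pipeline would typically back it with a durable store so a restarted job runner does not forget what it already processed.

```javascript
// Hypothetical in-memory deduper keyed on inbox_id + message_id.
function createDeduper() {
  const seen = new Set();
  return async function processOnce(message, handler) {
    const key = `${message.inbox_id}:${message.message_id}`;
    if (seen.has(key)) return false; // duplicate delivery, skip
    seen.add(key); // claim the key before awaiting so concurrent duplicates are skipped
    try {
      await handler(message);
      return true;
    } catch (err) {
      seen.delete(key); // release the key so a retry can process it
      throw err;
    }
  };
}
```

The same `processOnce` function can sit behind both the webhook handler and the polling loop, so it does not matter which path delivers a message first.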
Batch processing for scale
If you ingest a lot of mail (load tests, large QA suites, or agent fleets), batch retrieval and processing reduces overhead and makes your pipeline more predictable.
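On the processing side, batching can be as simple as working through messages in fixed-size groups. `processInBatches` and the default batch size are hypothetical; how batch retrieval works is provider-specific and not shown here.

```javascript
// Hypothetical batching helper: at most `batchSize` handlers run at once.
async function processInBatches(messages, handler, batchSize = 25) {
  const results = [];
  for (let i = 0; i < messages.length; i += batchSize) {
    const batch = messages.slice(i, i + batchSize);
    // allSettled keeps one bad message from aborting the rest of its batch.
    // The async wrapper turns synchronous throws into rejections.
    const settled = await Promise.allSettled(batch.map(async (m) => handler(m)));
    results.push(...settled);
  }
  return results; // same order as `messages`
}
```

Each result carries a `status` of `"fulfilled"` or `"rejected"`, so failures can be logged with their message IDs while the rest of the batch completes.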
Minimal integration example (provider-agnostic pseudocode)
Below is a reference shape that works well for both QA harnesses and LLM tools.
Create an inbox
// Pseudocode, see your provider docs for exact endpoints
const inbox = await emailApi.createInbox({
  // optionally: domain strategy, webhook URL, metadata
});
// Use inbox.email when signing up, creating users, or triggering a workflow
Wait for a message (hybrid)
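// Resolve after `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// emailApi and pickBestCandidate are placeholders for your provider client and matching rule.
async function waitForMessage({ inboxId, timeoutMs }) {
  const deadline = Date.now() + timeoutMs;
  let delayMs = 250; // example starting delay; doubles up to a 5 s cap
  // Prefer webhook in real systems, but keep polling as a fallback.
  // This function represents a single contract: "return the first matching message or time out".
  while (Date.now() < deadline) {
    const messages = await emailApi.listMessages({ inboxId });
    const candidate = pickBestCandidate(messages);
    if (candidate) return candidate;
    // Back off between polls, but never sleep past the deadline.
    await sleep(Math.min(delayMs, Math.max(0, deadline - Date.now())));
    delayMs = Math.min(delayMs * 2, 5000);
  }
  throw new Error(`Timed out waiting for email for inbox ${inboxId}`);
}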
Extract an OTP or magic link from JSON
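function extractArtifact(messageJson) {
  const text = messageJson.text ?? "";
  // Collect distinct 6-digit candidates; more than one is ambiguous, so fail loudly.
  const otps = [...new Set(text.match(/\b\d{6}\b/g) ?? [])];
  if (otps.length > 1) {
    throw new Error(`Multiple OTP candidates in message ${messageJson.message_id}`);
  }
  if (otps.length === 1) return { type: "otp", value: otps[0] };
  // findFirstAllowlistedUrl is a helper you supply; it should enforce your domain allowlist.
  const url = findFirstAllowlistedUrl(text, {
    allowlistDomains: ["example.com"],
  });
  if (url) return { type: "url", value: url };
  throw new Error(`No OTP or allowlisted URL found in message ${messageJson.message_id}`);
}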
The important part is not the regex but the contract: wait deterministically, parse JSON, extract minimally, and fail loudly when ambiguous.
Implementing this with Mailhook
Mailhook is built around this exact automation model:
- Disposable inbox creation via API
- Receive emails as structured JSON
- Real-time webhook notifications (with signed payloads)
- Polling API for fallback
- Shared domains for instant setup, and custom domain support
- Batch email processing
For the canonical, machine-readable integration details (endpoints, payload shapes, signature verification specifics), use Mailhook’s llms.txt reference.
You can also explore the product at Mailhook.
Evaluation checklist for a temp email API (quick sanity test)
Use this to compare providers without getting lost in marketing:
- Can you create inboxes programmatically and isolate them per run?
- Do messages arrive as normalized JSON, with stable IDs and timestamps?
- Are webhooks supported, and are payloads signed?
- Is polling available as a fallback, with semantics that avoid duplicates?
- Can you choose between shared domains and custom domains?
- Is batch processing supported if you need throughput?
- Can you keep raw source for debugging without forcing agents to read raw HTML?
Frequently Asked Questions
What is a temp email API? A temp email API lets you create disposable inboxes via API, receive inbound email for those inboxes, and retrieve messages from code (often as JSON).
Why should I parse emails as JSON instead of scraping HTML? JSON makes the email payload stable and machine-readable. Scraping HTML is brittle, unsafe for agents, and breaks when templates change.
Should I use webhooks or polling to receive emails? Use webhooks for low latency and scale, and keep polling as a fallback for reliability. A hybrid “webhook-first, polling fallback” pattern is the most robust.
How do I safely extract OTPs or verification links from email? Prefer text/plain, extract a minimal artifact deterministically (one OTP or one allowlisted URL), and validate links before following them.
How do I prevent duplicates in webhook and polling flows? Treat email as an at-least-once event stream. Deduplicate using stable keys like inbox_id + message_id, and make handlers idempotent.
Try Mailhook for JSON-first email automation
If you want a temp email API that fits LLM agents and QA automation, Mailhook provides disposable inboxes via API and delivers inbound messages as structured JSON, with webhooks (signed payloads) and polling fallback.
Get the exact integration contract here: Mailhook llms.txt, then start at mailhook.co (no credit card required).