Email is an odd “notification bus” for software agents.
On the one hand, it is ubiquitous: password resets, magic links, invoices, alerts, invites, and audit events often only ship via email. On the other hand, it is fundamentally retry-heavy and duplicate-prone. SMTP senders retry. Mail providers greylist and retry. Your webhook endpoint flakes and gets retried. Your worker restarts mid-job. Your LLM agent decides to “try again” and triggers a resend.
If you treat an inbound email notification as a one-shot event, your system will eventually double-consume an OTP, follow the wrong link, or mark the wrong run as “verified.” This post lays out a practical reliability contract for notifications email in agent workflows, centered on webhooks, retries, and idempotency.
## The core reality: inbound email is at-least-once, out-of-order, and sometimes late
For agent workflows, the most important mindset shift is this:
- You will see duplicates (identical message content arriving via separate deliveries, or the same delivery retried).
- Order is not guaranteed (resends, multi-recipient messages, or provider processing can reorder what you observe).
- A “missing” email is often just delayed (or delivered to spam/quarantine if you are using a real mailbox).
Your design goal is not “no duplicates.” Your goal is safe duplicates, meaning a duplicated notification produces the same final state.
That implies two things:
- Your ingestion path must be retry-tolerant.
- Your downstream processing must be idempotent at the right layer (delivery, message, and extracted artifact).
## Webhook-first ingestion: treat notifications as events, not inbox scraping
If an agent needs to react to inbound notifications quickly and reliably, a webhook-first design is the default choice:
- Low latency (no polling loop delays)
- Natural fan-out into queues/workers
- Cleaner time budgets (you can fail fast and alert on missing events)
A webhook-first design only works if you adopt a strict rule:
> **Ack fast, process async.** Your HTTP handler should validate, persist, enqueue, and return a 2xx quickly. Everything else happens in a worker.
### What “ack fast” means in practice
Keep your webhook handler’s critical path extremely short:
- Read raw request body
- Verify authenticity (signature and timestamp, if provided)
- Derive idempotency key(s)
- Upsert a delivery record (or “seen” record)
- Enqueue a job keyed by the idempotency key
- Return `200 OK`
If you do heavy parsing, template classification, link following, or LLM calls inside the handler, you increase the chance of timeouts, which causes retries, which increases duplicates.
For HTTP semantics, remember that providers typically retry on non-2xx responses (and sometimes on timeouts). The relevant status code definitions live in the HTTP specification (for example, see RFC 9110).
## Webhook authenticity and replay defense
For agent systems, webhook authenticity is not optional.
At minimum, you want:
- Signature verification over the raw request body (not a parsed JSON object that can be re-serialized)
- Timestamp tolerance (reject very old timestamps)
- Replay detection (reject a repeated delivery identifier, or store a hash of the body)
This is separate from email-level authenticity signals (like DKIM). DKIM can tell you something about the email’s origin, but it does not prove the webhook request hitting your endpoint is genuine.
## Retries: assume your notification pipeline will retry at multiple layers
Retries happen whether you implement them or not. In a typical notifications email pipeline, you might see:
- Sender retries at the SMTP layer
- Inbound provider retries delivering to your webhook
- Your queue retries jobs
- Your agent retries actions (sometimes without telling you)
Instead of trying to “turn retries off,” design your system so retries are safe.
### Define retry classes: transient vs. permanent
In your worker (not the webhook handler), classify failures:
- Transient: network timeouts, 5xx from dependencies, rate limits, temporary DB contention
- Permanent: invalid signature, schema violations, unsupported content type, missing required artifact
Only transient failures should be retried automatically. Permanent failures should be recorded and surfaced for debugging, because repeated retries will not change the outcome.
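A sketch of that classification. The error shapes (an HTTP `status`, a Node-style `code`, named error classes) are assumptions; map your own dependencies' errors into the two classes:

```javascript
// Returns "transient" (safe to retry) or "permanent" (record and surface).
function classifyFailure(err) {
  // Permanent: retrying will not change the outcome
  if (err.name === "SignatureError" || err.name === "SchemaError") {
    return "permanent";
  }
  // Transient: timeouts, dependency 5xx, rate limits
  if (err.status >= 500 || err.status === 429) return "transient";
  if (err.code === "ETIMEDOUT" || err.code === "ECONNRESET") return "transient";
  // Fail closed: unknown errors are surfaced for debugging, not retried
  return "permanent";
}
```

Defaulting unknown errors to "permanent" is a deliberate choice: an unclassified failure that loops forever is harder to debug than one that surfaces immediately.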
### Use backoff and a bounded time budget
Agents often run inside a larger deadline (CI job timeout, tool invocation budget, or workflow SLA). So retries should be bounded by:
- A maximum attempt count
- A maximum wall-clock duration
- Exponential backoff with jitter
That makes failures diagnosable instead of turning into silent “infinite waiting.”
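Those three bounds compose into one small helper. A sketch using “full jitter” (each delay drawn uniformly from an exponentially growing ceiling); the default numbers are assumptions to tune per workflow:

```javascript
// Produce the retry delays for a job, bounded by attempts AND wall clock.
function backoffDelays(
  { baseMs = 500, capMs = 30000, maxAttempts = 6, budgetMs = 120000 } = {},
  random = Math.random // injectable for deterministic tests
) {
  const delays = [];
  let elapsed = 0;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
    const delay = Math.floor(random() * ceiling); // full jitter
    if (elapsed + delay > budgetMs) break; // respect the wall-clock budget
    delays.push(delay);
    elapsed += delay;
  }
  return delays;
}
```

Injecting the random source keeps the schedule testable; in production the jitter is what prevents a fleet of retrying workers from hammering a dependency in lockstep.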
## Idempotency: pick the right key for the job
“Idempotency” is overloaded. For inbound notifications email, you typically need idempotency at three different layers.
### Layer 1: Delivery idempotency (webhook retry safety)
This is the simplest case: the same webhook delivery might be posted multiple times.
If your provider gives you a stable `delivery_id`, that should be your primary dedupe key for the webhook handler. If it does not, derive a deterministic key (for example, a cryptographic hash of the raw body) and store it with a TTL.
### Layer 2: Message idempotency (SMTP duplicates and resends)
Even if your webhook delivery is deduped, you can still receive multiple messages that look identical:
- The sender actually sent twice
- An intermediate system duplicated the message
- A “resend code” action happened
A stable `message_id` (often based on the RFC 5322 `Message-ID` header) is the common choice here.
### Layer 3: Artifact idempotency (OTP/link consume-once)
This is the layer that breaks agent workflows most often.
Even if you dedupe deliveries and messages, you can still extract the same artifact multiple times and accidentally apply it twice, such as:
- Submitting the same OTP twice
- Following the same verification link twice
- Marking the same “email verified” state twice
Artifact idempotency typically uses a key like:

`artifact_type + artifact_value_hash + target_account`

where `target_account` is the user, session, or workflow instance that the artifact is meant for.
### A practical key map
| Layer | What can duplicate? | Recommended idempotency key | Stored where? |
|---|---|---|---|
| Delivery | Webhook POST retried | `delivery_id` or `sha256(raw_body)` | DB table with unique constraint |
| Message | SMTP duplicates, resends | `message_id` (or normalized fallback) | Messages table |
| Artifact | OTP/link consumed twice | `type + hash(value) + subject` | “Consumes” table with unique constraint |
The big win: once you model these layers explicitly, retries stop being scary because you know where to collapse duplicates.
## Reference architecture: ingest, persist, enqueue, process
Here is a minimal, production-friendly layout.

### Webhook handler pseudocode (ack-fast)
```javascript
// Express-style pseudocode
app.post("/email/webhook", async (req, res) => {
  const rawBody = req.rawBody; // capture raw bytes (e.g., via a body-parser "verify" hook)

  // 1) Verify signature (provider-specific)
  verifySignatureOrThrow(req.headers, rawBody);

  // 2) Build a delivery dedupe key
  const deliveryId = req.headers["x-delivery-id"] ?? sha256(rawBody);

  // 3) Idempotent insert (unique constraint on delivery_id)
  const inserted = await db.tryInsert("deliveries", {
    delivery_id: deliveryId,
    received_at: new Date().toISOString(),
    raw_body: rawBody, // the worker parses from this, so it must be persisted
    raw_body_sha256: sha256(rawBody),
  });

  // 4) Enqueue at most once
  if (inserted) {
    await queue.enqueue("process_delivery", { delivery_id: deliveryId });
  }

  // 5) Ack fast
  res.status(200).send("ok");
});
```
Notes:
- `tryInsert` should be backed by a unique constraint on `delivery_id`.
- If signature verification fails, respond with a non-2xx and do not enqueue.
- If your provider retries on 5xx, prefer returning 2xx once you have persisted the delivery, even if the worker will handle failures later.
### Worker pseudocode (message and artifact idempotency)
```javascript
worker.on("process_delivery", async ({ delivery_id }) => {
  const delivery = await db.get("deliveries", { delivery_id });
  const email = parseProviderPayload(delivery.raw_body); // yields normalized JSON

  // Message idempotency
  const messageId = email.message_id ?? stableFallbackMessageId(email);
  await db.upsert("messages", {
    message_id: messageId,
    inbox_id: email.inbox_id,
    received_at: email.received_at,
    subject: email.subject,
    from: email.from,
    text: email.text,
    // store html/raw cautiously, often behind restricted access
  });

  // Artifact extraction (deterministic, not "ask the LLM to guess")
  const artifact = extractOtpOrLink(email.text);
  if (!artifact) {
    return; // nothing to consume; record it if an artifact was expected
  }

  // Consume-once idempotency
  const consumeKey = sha256(`${artifact.type}:${artifact.value}:${email.inbox_id}`);
  const firstConsume = await db.tryInsert("artifact_consumes", {
    consume_key: consumeKey,
    message_id: messageId,
    created_at: new Date().toISOString(),
  });
  if (!firstConsume) {
    return; // already processed
  }

  await applyArtifactToWorkflow(artifact);
});
```
This is where you prevent “double verify” and similar automation bugs.
## Polling fallback: your safety net when webhooks fail
Even a well-designed webhook system can go down.
A polling fallback gives agents a deterministic escape hatch:
- If no webhook arrives within `T`, start polling by inbox ID
- Use cursor-based listing and dedupe seen message IDs
- Stop at an overall deadline
Polling is not the default; it is the insurance policy. The key is to keep polling deterministic:
- Poll a specific inbox, not a shared mailbox
- Use a cursor (or monotonic IDs) to avoid re-reading
- Deduplicate at message and artifact layers as usual
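Those rules fit in one loop. A sketch where `listMessages(inboxId, cursor)` is a hypothetical client call returning `{ messages, nextCursor }` and `extract` is your deterministic artifact extractor; both are injected assumptions so the loop stays provider-agnostic:

```javascript
async function pollForArtifact(
  listMessages,
  extract,
  inboxId,
  { intervalMs = 2000, deadlineMs = 60000 } = {}
) {
  const seen = new Set(); // message-layer dedupe for this attempt
  let cursor = null;
  const deadline = Date.now() + deadlineMs;

  while (Date.now() < deadline) {
    const { messages, nextCursor } = await listMessages(inboxId, cursor);
    cursor = nextCursor ?? cursor; // only ever advance the cursor

    for (const msg of messages) {
      if (seen.has(msg.message_id)) continue;
      seen.add(msg.message_id);
      const artifact = extract(msg.text);
      if (artifact) return artifact; // consume-once still happens downstream
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  // Bounded deadline: fail loudly instead of waiting forever
  throw new Error(`no artifact for inbox ${inboxId} within ${deadlineMs}ms`);
}
```

Note the loop only finds the artifact; applying it still goes through the consume-once table, so webhook and polling paths cannot double-apply.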
## Agent-specific hazards (and guardrails)
LLM agents add failure modes you do not see in “normal” backend consumers.
### Guardrail 1: minimize what the agent sees
Agents do not need the full HTML email body, tracking pixels, or all headers. Give them a minimized view:
- The single artifact you want (OTP or verification URL)
- The intended domain allowlist check result
- A small amount of provenance (received timestamp, sender address)
This reduces prompt injection risk and keeps the agent focused on the task.
### Guardrail 2: prevent resend loops
Agents that can click “resend code” can get trapped in a loop:
- No email arrives quickly
- Agent resends
- Multiple emails arrive
- Agent picks the wrong one
Fix this with policy:
- A resend budget (for example, max 1 resend)
- Always choose the latest matching message within the same attempt/inbox
- Artifact consume-once semantics (so only one OTP is accepted)
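The “always choose the latest” rule is easy to get wrong when messages arrive out of order. A sketch, assuming each message carries an ISO-8601 `received_at` and each attempt records when it started:

```javascript
// Pick the newest message received within the current attempt, or null.
function pickLatest(messages, attemptStartedAt) {
  const since = new Date(attemptStartedAt);
  return (
    messages
      .filter((m) => new Date(m.received_at) >= since) // ignore pre-attempt mail
      .sort((a, b) => new Date(b.received_at) - new Date(a.received_at))[0] ?? null
  );
}
```

Filtering by the attempt's start time is what keeps a stale OTP from a previous attempt out of consideration even if it is still sitting in the inbox.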
### Guardrail 3: never “browse” from an email by default
If your agent extracts a link, validate it before any navigation:
- Allowlist domains
- Block IP-literals and internal hostnames
- Prefer making a backend call with a token rather than having the agent open a browser
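A sketch of that validation using the WHATWG `URL` parser; the allowlisted hostnames are assumptions for illustration:

```javascript
// Hosts the agent is allowed to follow links to (assumed for this example).
const ALLOWED_HOSTS = new Set(["example.com", "auth.example.com"]);

function isSafeVerificationLink(href) {
  let url;
  try {
    url = new URL(href);
  } catch {
    return false; // unparseable: fail closed
  }
  if (url.protocol !== "https:") return false;
  // Block IP literals: dotted-quad IPv4 and bracketed IPv6 hostnames
  if (/^\d{1,3}(\.\d{1,3}){3}$/.test(url.hostname)) return false;
  if (url.hostname.startsWith("[")) return false;
  return ALLOWED_HOSTS.has(url.hostname);
}
```

Exact-match hostnames (rather than suffix matching) avoid the classic `evil-example.com` bypass; add subdomains to the set explicitly if you need them.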
## Where Mailhook fits for notifications email
If you are building agent workflows that need inbound email as structured events, you want primitives that line up with the architecture above:
- Create a disposable inbox via API
- Receive inbound messages as normalized JSON
- Get webhook notifications (push) with authenticity controls
- Fall back to polling (pull) by inbox when needed
Mailhook is designed around that model: programmable disposable inboxes, JSON email output, real-time webhooks, polling, signed payload support, custom domains, and batch processing.
For the exact integration contract and current API surface area, use the canonical reference: Mailhook llms.txt.
## A concrete “notifications email” example
Imagine a product where hosts need a reliable email notification that a time-based action has completed, like an event gallery unlocking at the end of a timer. Even if the user experience is mobile-first, the operational notification is still often email. A consumer product like Revel.cam’s instant event photo sharing is a good example of a workflow where delivery timing and duplicate handling matter, because notifications can trigger downstream actions (publishing, sharing, moderation).
For an agent, you would implement this as:
- Provision an inbox for the workflow attempt
- Trigger the action that sends the email
- Wait webhook-first, poll if needed
- Extract one artifact (a link, code, or state) and consume-once
- Expire or rotate the inbox
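Those five steps can be sketched as one orchestration function. Every dependency is injected and hypothetical (none of these names come from a real client library); treat it as a shape to fill in, not an implementation:

```javascript
async function runEmailVerifiedWorkflow(deps) {
  const inbox = await deps.provisionInbox(); // 1) one inbox per workflow attempt
  try {
    await deps.triggerAction(inbox.address); // 2) cause the email to be sent
    const artifact = await deps.waitForArtifact(inbox.id); // 3) webhook-first, poll fallback
    await deps.consumeOnce(artifact, inbox.id); // 4) consume-once semantics
    return artifact;
  } finally {
    await deps.expireInbox(inbox.id); // 5) expire/rotate the inbox, success or failure
  }
}
```

Expiring the inbox in `finally` matters: a failed attempt that leaves its inbox alive is exactly how stale OTPs leak into the next attempt.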
## A quick reliability checklist you can code review against
Use this to review an agent-facing email notifications pipeline:
- Webhook handler verifies authenticity using the raw body and fails closed
- Handler is ack-fast and never runs long parsing or LLM calls
- Delivery dedupe exists (unique constraint on a delivery key)
- Message dedupe exists (unique constraint on a message key)
- Artifact consume-once exists (unique constraint on a consume key)
- Worker retries are bounded by attempts and wall-clock deadline
- Polling fallback is inbox-scoped, cursor-based, and deduped
- Agent receives a minimized view, not full HTML
If all of the above are true, retries stop being incidents and become normal background noise.