Engineering

Notifications Email for Agents: Webhooks, Retries, Idempotency

11 min read

Email is an odd “notification bus” for software agents.

On the one hand, it is ubiquitous: password resets, magic links, invoices, alerts, invites, and audit events often only ship via email. On the other hand, it is fundamentally retry-heavy and duplicate-prone. SMTP senders retry. Mail providers greylist and retry. Your webhook endpoint flakes and gets retried. Your worker restarts mid-job. Your LLM agent decides to “try again” and triggers a resend.

If you treat an inbound email notification as a one-shot event, your system will eventually double-consume an OTP, follow the wrong link, or mark the wrong run as “verified.” This post lays out a practical reliability contract for notifications email for agents, centered on webhooks, retries, and idempotency.

The core reality: inbound email is at-least-once, out-of-order, and sometimes late

For agent workflows, the most important mindset shift is this:

  • You will see duplicates (identical message content arriving via separate deliveries, or the same delivery retried).
  • Order is not guaranteed (resends, multi-recipient messages, or provider processing can reorder what you observe).
  • A “missing” email is often just delayed (or delivered to spam/quarantine if you are using a real mailbox).

Your design goal is not “no duplicates.” Your goal is safe duplicates, meaning a duplicated notification produces the same final state.

That implies two things:

  1. Your ingestion path must be retry-tolerant.
  2. Your downstream processing must be idempotent at the right layer (delivery, message, and extracted artifact).

Webhook-first ingestion: treat notifications as events, not inbox scraping

If an agent needs to react to inbound notifications quickly and reliably, a webhook-first design is the default choice:

  • Low latency (no polling loop delays)
  • Natural fan-out into queues/workers
  • Cleaner time budgets (you can fail fast and alert on missing events)

A webhook-first design only works if you adopt a strict rule:

Ack fast, process async. Your HTTP handler should validate, persist, enqueue, and return a 2xx quickly. Everything else happens in a worker.

What “ack fast” means in practice

Keep your webhook handler’s critical path extremely short:

  • Read raw request body
  • Verify authenticity (signature and timestamp, if provided)
  • Derive idempotency key(s)
  • Upsert a delivery record (or “seen” record)
  • Enqueue a job keyed by the idempotency key
  • Return 200 OK

If you do heavy parsing, template classification, link following, or LLM calls inside the handler, you increase the chance of timeouts, which causes retries, which increases duplicates.

For HTTP semantics, remember that providers typically retry on non-2xx responses (and sometimes on timeouts). The relevant status code definitions live in the HTTP specification (for example, see RFC 9110).

Webhook authenticity and replay defense

For agent systems, webhook authenticity is not optional.

At minimum, you want:

  • Signature verification over the raw request body (not a parsed JSON object that can be re-serialized)
  • Timestamp tolerance (reject very old timestamps)
  • Replay detection (reject a repeated delivery identifier, or store a hash of the body)

This is separate from email-level authenticity signals (like DKIM). DKIM can tell you something about the email’s origin, but it does not prove the webhook request hitting your endpoint is genuine.

Retries: assume your notification pipeline will retry at multiple layers

Retries happen whether you implement them or not. In a typical notifications email pipeline, you might see:

  • Sender retries at the SMTP layer
  • Inbound provider retries delivering to your webhook
  • Your queue retries jobs
  • Your agent retries actions (sometimes without telling you)

Instead of trying to “turn retries off,” design your system so retries are safe.

Define retry classes: transient vs permanent

In your worker (not the webhook handler), classify failures:

  • Transient: network timeouts, 5xx from dependencies, rate limits, temporary DB contention
  • Permanent: invalid signature, schema violations, unsupported content type, missing required artifact

Only transient failures should be retried automatically. Permanent failures should be recorded and surfaced for debugging, because repeated retries will not change the outcome.

Use backoff and a bounded time budget

Agents often run inside a larger deadline (CI job timeout, tool invocation budget, or workflow SLA). So retries should be bounded by:

  • A maximum attempt count
  • A maximum wall-clock duration
  • Exponential backoff with jitter

That makes failures diagnosable instead of turning into silent “infinite waiting.”
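The bounded-retry policy above can be sketched as a small wrapper. The `TransientError` marker class is an assumption standing in for whatever failure classification your worker uses:

```javascript
// Sketch: retries bounded by attempt count AND wall-clock deadline,
// with exponential backoff and full jitter. Only errors classified as
// transient (here, the hypothetical TransientError class) are retried.
class TransientError extends Error {}

async function withRetries(fn, { maxAttempts = 5, deadlineMs = 30000, baseMs = 200 } = {}) {
  const startedAt = Date.now();
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn(attempt);
    } catch (err) {
      if (!(err instanceof TransientError)) throw err; // permanent: retrying will not help
      if (attempt >= maxAttempts) throw err;           // attempt budget exhausted
      // Full jitter: sleep a random duration up to the exponential cap.
      const cap = baseMs * 2 ** (attempt - 1);
      const delay = Math.random() * cap;
      if (Date.now() + delay - startedAt > deadlineMs) throw err; // wall-clock budget
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

Throwing the last error when a budget is exhausted (rather than swallowing it) is what keeps failures diagnosable.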

Idempotency: pick the right key for the job

“Idempotency” is overloaded. For inbound notifications email, you typically need idempotency at three different layers.

Layer 1: Delivery idempotency (webhook retry safety)

This is the simplest case: the same webhook delivery might be posted multiple times.

If your provider gives you a stable delivery_id, that should be your primary dedupe key for the webhook handler. If it does not, derive a deterministic key (for example, a cryptographic hash of the raw body) and store it with a TTL.

Layer 2: Message idempotency (SMTP duplicates and resends)

Even if your webhook delivery is deduped, you can still receive multiple messages that look identical:

  • The sender actually sent twice
  • An intermediate system duplicated it
  • A “resend code” action happened

A stable message_id (often based on the RFC 5322 Message-ID header) is the common choice here (see RFC 5322).

Layer 3: Artifact idempotency (OTP/link consume-once)

This is the layer that breaks agent workflows most often.

Even if you dedupe deliveries and messages, you can still extract the same artifact multiple times and accidentally apply it twice, such as:

  • Submitting the same OTP twice
  • Following the same verification link twice
  • Marking the same “email verified” state twice

Artifact idempotency typically uses a key like:

  • artifact_type + artifact_value_hash + target_account

Where target_account is the user, session, or workflow instance that the artifact is meant for.

A practical key map

| Layer    | What can duplicate?       | Recommended idempotency key           | Stored where?                            |
|----------|---------------------------|---------------------------------------|------------------------------------------|
| Delivery | Webhook POST retried      | delivery_id or sha256(raw_body)       | DB table with unique constraint          |
| Message  | SMTP duplicates, resends  | message_id (or normalized fallback)   | Messages table                           |
| Artifact | OTP/link consumed twice   | type + hash(value) + subject          | “Consumes” table with unique constraint  |

The big win: once you model these layers explicitly, retries stop being scary because you know where to collapse duplicates.

Reference architecture: ingest, persist, enqueue, process

Here is a minimal, production-friendly layout.

Diagram: an inbound email provider sends a webhook to a small “Webhook Handler” box that verifies the signature, writes a Delivery record, and enqueues a job to a Queue. A Worker consumes the job, upserts a Message record, extracts an OTP or link as an Artifact, and writes a Consume-Once record. A separate Polling Fallback box can read messages by inbox ID if the webhook is down.

Webhook handler pseudocode (ack-fast)

// Express-style pseudocode
app.post("/email/webhook", async (req, res) => {
  const rawBody = req.rawBody; // capture raw bytes

  // 1) Verify signature (provider-specific)
  verifySignatureOrThrow(req.headers, rawBody);

  // 2) Build a delivery dedupe key
  const deliveryId = req.headers["x-delivery-id"] ?? sha256(rawBody);

  // 3) Idempotent insert
  const inserted = await db.tryInsert("deliveries", {
    delivery_id: deliveryId,
    received_at: new Date().toISOString(),
    raw_body_sha256: sha256(rawBody)
  });

  // 4) Enqueue at most once
  if (inserted) {
    await queue.enqueue("process_delivery", { delivery_id: deliveryId });
  }

  // 5) Ack fast
  res.status(200).send("ok");
});

Notes:

  • tryInsert should be backed by a unique constraint on delivery_id.
  • If signature verification fails, respond with a non-2xx and do not enqueue.
  • If your provider retries on 5xx, prefer returning 2xx once you have persisted the delivery, even if the worker will handle failures later.

Worker pseudocode (message and artifact idempotency)

worker.on("process_delivery", async ({ delivery_id }) => {
  const delivery = await db.get("deliveries", { delivery_id });
  const email = parseProviderPayload(delivery.raw_body); // yields a normalized JSON

  // Message idempotency
  const messageId = email.message_id ?? stableFallbackMessageId(email);
  await db.upsert("messages", {
    message_id: messageId,
    inbox_id: email.inbox_id,
    received_at: email.received_at,
    subject: email.subject,
    from: email.from,
    text: email.text,
    // store html/raw cautiously, often behind restricted access
  });

  // Artifact extraction (deterministic, not “ask the LLM to guess”)
  const artifact = extractOtpOrLink(email.text);
  if (!artifact) {
    return; // nothing actionable; record for debugging rather than retrying
  }

  // Consume-once idempotency
  const consumeKey = sha256(`${artifact.type}:${artifact.value}:${email.inbox_id}`);

  const firstConsume = await db.tryInsert("artifact_consumes", {
    consume_key: consumeKey,
    message_id: messageId,
    created_at: new Date().toISOString()
  });

  if (!firstConsume) {
    return; // already processed
  }

  await applyArtifactToWorkflow(artifact);
});

This is where you prevent “double verify” and similar automation bugs.

Polling fallback: your safety net when webhooks fail

Even a well-designed webhook system can go down.

A polling fallback gives agents a deterministic escape hatch:

  • If no webhook arrives within T, start polling by inbox ID
  • Use cursor-based listing and dedupe seen message IDs
  • Stop at an overall deadline

Polling is not the default; it is the insurance policy. The key is to keep polling deterministic:

  • Poll a specific inbox, not a shared mailbox
  • Use a cursor (or monotonic IDs) to avoid re-reading
  • Deduplicate at message and artifact layers as usual
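The fallback loop above can be sketched as follows. The `client.listMessages` call (returning `{ messages, nextCursor }`) is a hypothetical paginated listing endpoint; the artifact extractor is passed in so the same deterministic extraction used by the worker can be reused:

```javascript
// Sketch: deterministic polling fallback. Poll one inbox, advance a
// cursor, dedupe seen message IDs, and stop at an overall deadline.
// client.listMessages is a hypothetical API; adapt to your provider.
async function pollForArtifact(client, inboxId, extract, { intervalMs = 2000, deadlineMs = 60000 } = {}) {
  const startedAt = Date.now();
  const seen = new Set();
  let cursor = null;
  while (Date.now() - startedAt < deadlineMs) {
    const page = await client.listMessages(inboxId, { cursor });
    for (const msg of page.messages) {
      if (seen.has(msg.message_id)) continue; // message-level dedupe
      seen.add(msg.message_id);
      const artifact = extract(msg.text);
      if (artifact) return artifact; // consume-once check still applies downstream
    }
    cursor = page.nextCursor ?? cursor; // advance only when the provider returns a new cursor
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`no artifact in inbox ${inboxId} within ${deadlineMs}ms`);
}
```

Throwing at the deadline (instead of looping forever) is what turns “the email never came” into a diagnosable failure.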

Agent-specific hazards (and guardrails)

LLM agents add failure modes you do not see in “normal” backend consumers.

Guardrail 1: minimize what the agent sees

Agents do not need the full HTML email body, tracking pixels, or all headers. Give them a minimized view:

  • The single artifact you want (OTP or verification URL)
  • The intended domain allowlist check result
  • A small amount of provenance (received timestamp, sender address)

This reduces prompt injection risk and keeps the agent focused on the task.

Guardrail 2: prevent resend loops

Agents that can click “resend code” can get trapped in a loop:

  • No email arrives quickly
  • Agent resends
  • Multiple emails arrive
  • Agent picks the wrong one

Fix this with policy:

  • A resend budget (for example, max 1 resend)
  • Always choose the latest matching message within the same attempt/inbox
  • Artifact consume-once semantics (so only one OTP is accepted)
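Two of these policies fit in a few lines. A minimal sketch, where the message shape (message_id, subject, received_at) follows the normalized JSON used earlier and both helpers are illustrative:

```javascript
// Sketch: "pick the latest matching message" selection plus a resend
// budget. Both helpers are hypothetical policy utilities for the agent.
function pickLatestMatch(messages, matcher) {
  return messages
    .filter(matcher)
    .sort((a, b) => new Date(b.received_at) - new Date(a.received_at))[0] ?? null;
}

function makeResendGate(maxResends = 1) {
  let used = 0;
  return () => {
    if (used >= maxResends) return false; // budget exhausted: fail loudly, do not loop
    used++;
    return true;
  };
}
```

The gate forces the agent to fail loudly after its resend budget is spent, and the selector makes “pick the latest” deterministic instead of leaving it to the model.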

Guardrail 3: never “browse” from an email by default

If your agent extracts a link, validate it before any navigation:

  • Allowlist domains
  • Block IP-literals and internal hostnames
  • Prefer making a backend call with a token rather than having the agent open a browser

Where Mailhook fits for notifications email

If you are building agent workflows that need inbound email as structured events, you want primitives that line up with the architecture above:

  • Create a disposable inbox via API
  • Receive inbound messages as normalized JSON
  • Get webhook notifications (push) with authenticity controls
  • Fall back to polling (pull) by inbox when needed

Mailhook is designed around that model: programmable disposable inboxes, JSON email output, real-time webhooks, polling, signed payload support, custom domains, and batch processing.

For the exact integration contract and current API surface area, use the canonical reference: Mailhook llms.txt.

A concrete “notifications email” example

Imagine a product where hosts need a reliable email notification that a time-based action has completed, like an event gallery unlocking at the end of a timer. Even if the user experience is mobile-first, the operational notification is still often email. A consumer product like Revel.cam’s instant event photo sharing is a good example of a workflow where delivery timing and duplicate handling matter, because notifications can trigger downstream actions (publishing, sharing, moderation).

For an agent, you would implement this as:

  • Provision an inbox for the workflow attempt
  • Trigger the action that sends the email
  • Wait webhook-first, poll if needed
  • Extract one artifact (a link, code, or state) and consume-once
  • Expire or rotate the inbox

A quick reliability checklist you can code review against

Use this to review an agent-facing email notifications pipeline:

  • Webhook handler verifies authenticity using the raw body and fails closed
  • Handler is ack-fast and never runs long parsing or LLM calls
  • Delivery dedupe exists (unique constraint on a delivery key)
  • Message dedupe exists (unique constraint on a message key)
  • Artifact consume-once exists (unique constraint on a consume key)
  • Worker retries are bounded by attempts and wall-clock deadline
  • Polling fallback is inbox-scoped, cursor-based, and deduped
  • Agent receives a minimized view, not full HTML

If all of the above are true, retries stop being incidents and become normal background noise.

email-infrastructure webhooks idempotency ai-agents system-architecture
