When an inbox event can unblock a CI test, submit an OTP, or tell an LLM agent to continue a workflow, a webhook is no longer a convenience. It is a trust boundary. Signed email webhooks give automation teams a way to prove that the JSON payload was produced by the expected provider and was not modified in transit before their code acts on it.
That proof matters because email automation is usually wired to state changes: mark a signup verified, continue an onboarding run, resolve a QA assertion, or hand a verification link to an agent tool. If your endpoint accepts any POST that looks like an email event, an attacker or a misconfigured internal service can trigger the same downstream actions as a real message.
For automation teams, the goal is not just to add a signature check somewhere. The goal is to make webhook verification the first gate in a repeatable, testable ingestion pipeline. This guide breaks down what to verify, where to verify it, how to test the verifier, and how to keep signed email webhooks safe when they feed CI, QA, and LLM-driven workflows.
What a signed email webhook actually proves
A signed webhook is usually a request that includes the original payload plus signing metadata in HTTP headers. The provider computes a cryptographic signature over the request body, commonly using HMAC with a shared secret. Your application recomputes the signature using the same secret and compares the result.
If the comparison succeeds, you have evidence that the payload was created by someone with the secret and that the raw body has not changed since signing. That is powerful, but it is not the same as proving every fact inside the email is safe or true.
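As a rough illustration of what that comparison does and does not establish, the sketch below signs a body and then checks three cases. It assumes HMAC-SHA256 over the raw body; the secret, the payload, and the helper names `hmac` and `matches` are made up for the example, and your provider's actual scheme may differ.

```javascript
const crypto = require("node:crypto");

// Assumption: the provider uses HMAC-SHA256 over the raw body with a shared secret.
function hmac(rawBody, secret) {
  return crypto.createHmac("sha256", secret).update(rawBody).digest();
}

const secret = "example-shared-secret"; // hypothetical value
const rawBody = Buffer.from('{"event":"email.received","id":"evt_1"}', "utf8");

// What the provider would compute and attach to the request.
const providerSignature = hmac(rawBody, secret);

function matches(body, key) {
  // Both digests are 32 bytes, so timingSafeEqual cannot throw on length here.
  return crypto.timingSafeEqual(hmac(body, key), providerSignature);
}

const results = {
  intact: matches(rawBody, secret),
  tamperedBody: matches(Buffer.from('{"event":"email.received","id":"evt_2"}', "utf8"), secret),
  wrongSecret: matches(rawBody, "some-other-secret"),
};
console.log(results);
```

Only the untouched body checked with the right secret matches. Nothing in this check looks at what the email inside the payload actually says.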
| Verification signal | What it helps prove | What it does not prove |
|---|---|---|
| Valid webhook signature | The raw payload matches what the provider signed | The original email sender is trustworthy |
| Fresh timestamp | The delivery is recent enough to process | The event has never been delivered before |
| Delivery ID or event ID | You can detect replay and duplicate deliveries | The message content is semantically safe |
| JSON schema validation | The payload has the expected shape | The links, HTML, and text are safe to execute |
| Email authentication fields | SPF, DKIM, or DMARC results can be evaluated as email signals | The webhook request itself is authentic |
This distinction is especially important for LLM agents. A signed payload can still contain malicious email content, prompt injection, misleading links, or stale verification codes. The signature authenticates the transport event. Your processing layer still needs content validation, link constraints, deduplication, and minimal extraction.
The verification gates every automation team should enforce
Treat webhook verification as a sequence of gates. Each gate should fail closed, produce an auditable reason, and stop the request before any downstream action runs.
Gate 1: Capture the raw request body before parsing
Signature verification must run against the exact raw bytes that were signed. If your framework parses JSON first, reorders fields, changes whitespace, decodes characters, or normalizes line endings, the recomputed signature may no longer match the provider’s signature.
This is a common source of false failures and, in worse designs, accidental bypasses. The webhook handler should read the raw body, verify it, and only then parse it as JSON.
Many mature webhook providers emphasize this raw-body requirement. For example, GitHub recommends validating webhook deliveries by computing an HMAC-SHA256 digest over the payload body with the webhook secret and comparing it to the value sent in the X-Hub-Signature-256 header.
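A small experiment can make the raw-body requirement concrete. The sketch below assumes an HMAC-SHA256 scheme and an invented payload, and shows that parsing and re-serializing JSON produces different bytes and therefore a different digest. It is an illustration of the failure mode, not a verifier.

```javascript
const crypto = require("node:crypto");

function hmacHex(bytes, secret) {
  return crypto.createHmac("sha256", secret).update(bytes).digest("hex");
}

const secret = "example-shared-secret"; // hypothetical value
// Bytes as they arrived on the wire: note the extra spaces and the \u00e9 escape.
const rawBody = Buffer.from('{ "subject": "Caf\\u00e9 receipt",  "id": 7 }', "utf8");
const signedByProvider = hmacHex(rawBody, secret);

// Parsing and re-serializing yields equivalent JSON but not the same bytes.
const reserialized = Buffer.from(
  JSON.stringify(JSON.parse(rawBody.toString("utf8"))),
  "utf8"
);

// Plain equality is fine for this demonstration; a real verifier should
// use a constant-time comparison.
const rawDigestUnchanged = hmacHex(rawBody, secret) === signedByProvider;
const reserializedDigestUnchanged = hmacHex(reserialized, secret) === signedByProvider;
console.log({ rawDigestUnchanged, reserializedDigestUnchanged });
```

If a framework hands your handler only the parsed object, a verifier built on the re-serialized form would reject legitimate deliveries in exactly this way.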
Gate 2: Validate signing metadata before doing expensive work
A signed webhook is more than a signature string. Your verifier should check that all required signing metadata is present and formatted correctly. Depending on the provider, that may include a timestamp, signature version, key identifier, delivery ID, or algorithm identifier.
Do not silently accept missing metadata. Do not downgrade to an unsigned mode in production. Do not let a malformed header fall through to the normal event processor. A malformed request should be rejected before JSON parsing, queueing, or database writes.
If your provider supports secret rotation through key IDs, select the expected secret based on the key ID and reject unknown keys. Secret selection should be explicit, not a best-effort loop that hides configuration mistakes.
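One way to express this gate is a small parser that either returns complete metadata or a rejection reason. Everything provider-specific here is an assumption for illustration: the `x-webhook-*` header names, the `v1=` signature prefix, and the `secretsByKeyId` map are hypothetical and should be replaced with whatever your provider documents.

```javascript
// Hypothetical header names; substitute the ones your provider documents.
const REQUIRED_HEADERS = [
  "x-webhook-signature",
  "x-webhook-timestamp",
  "x-webhook-key-id",
  "x-webhook-delivery-id",
];

function parseSigningMetadata(headers) {
  for (const name of REQUIRED_HEADERS) {
    if (typeof headers[name] !== "string" || headers[name].length === 0) {
      return { ok: false, reason: `missing_header:${name}` };
    }
  }
  const signature = headers["x-webhook-signature"];
  // Assumed format: version prefix plus a hex-encoded SHA-256 digest.
  if (!/^v1=[0-9a-f]{64}$/.test(signature)) {
    return { ok: false, reason: "malformed_signature" };
  }
  const timestamp = headers["x-webhook-timestamp"];
  // Assumed format: Unix seconds as decimal digits.
  if (!/^\d{1,12}$/.test(timestamp)) {
    return { ok: false, reason: "malformed_timestamp" };
  }
  return {
    ok: true,
    signatureHex: signature.slice(3),
    timestampSeconds: Number(timestamp),
    keyId: headers["x-webhook-key-id"],
    deliveryId: headers["x-webhook-delivery-id"],
  };
}

function selectSecret(keyId, secretsByKeyId) {
  // Explicit lookup: an unknown key ID is a rejection, not a cue to try every secret.
  return secretsByKeyId.has(keyId) ? secretsByKeyId.get(keyId) : null;
}
```

A caller would reject the request when `parseSigningMetadata` returns `ok: false` or when `selectSecret` returns `null`, before touching the body.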
Gate 3: Enforce timestamp freshness
A signature over an old payload remains cryptographically valid indefinitely. That does not mean the payload should still be processed. Without a freshness window, anyone who captures a valid request could replay it later.
A practical approach is to enforce a short tolerance window, often measured in minutes, based on your provider’s delivery behavior and your infrastructure latency. The exact number should be a team policy, not an afterthought. A window that is too short causes false rejections during incidents; one that is too long increases replay risk.
Timestamp checks should compare against a trusted server clock. They should also fail safely if the timestamp is missing, unparsable, too far in the future, or too old.
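A freshness check can be a few lines. The sketch below assumes the timestamp is expressed in Unix seconds and uses a 5-minute tolerance purely as a placeholder; `checkFreshness` and its reason strings are hypothetical names.

```javascript
// Placeholder policy: 5 minutes. Choose a value that fits your provider and risk model.
const TOLERANCE_SECONDS = 300;

function checkFreshness(timestampSeconds, nowSeconds, tolerance = TOLERANCE_SECONDS) {
  // Missing or unparsable values arrive here as NaN or non-integers and are rejected.
  if (!Number.isSafeInteger(timestampSeconds)) {
    return { ok: false, reason: "timestamp_unparsable" };
  }
  const skew = nowSeconds - timestampSeconds;
  if (skew > tolerance) return { ok: false, reason: "timestamp_too_old", skew };
  if (skew < -tolerance) return { ok: false, reason: "timestamp_in_future", skew };
  return { ok: true, skew };
}
```

In a handler this might be called as `checkFreshness(Number(headerValue), Math.floor(Date.now() / 1000))`, with the server clock as the only time source. Returning the skew makes it available for logging without exposing payload content.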
Gate 4: Compare signatures using a constant-time comparison
After recomputing the expected signature, compare it to the received signature using a constant-time comparison function. Ordinary string comparison can leak timing differences in some environments. That may sound theoretical for internal test infrastructure, but webhook endpoints are often public by design.
The verification function should also bind the signature to the exact payload and timestamp format expected by the provider. Do not invent a canonicalization scheme unless the provider explicitly documents one. If the provider signs timestamp plus raw body, verify timestamp plus raw body. If it signs only raw body, verify only raw body.
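To make this concrete, here is a minimal sketch using Node's `crypto.timingSafeEqual`. The signing scheme is an assumption chosen for the example: a hex-encoded HMAC-SHA256 over the timestamp, a literal dot, and the raw body. Your provider may sign something different, and the function names are hypothetical.

```javascript
const crypto = require("node:crypto");

// Assumed scheme (hypothetical): HMAC-SHA256(secret, `${timestamp}.` + rawBody).
function expectedSignature(secret, timestamp, rawBody) {
  return crypto
    .createHmac("sha256", secret)
    .update(`${timestamp}.`, "utf8")
    .update(rawBody)
    .digest();
}

function signatureIsValid(secret, timestamp, rawBody, receivedHex) {
  // Reject anything that is not exactly 64 hex characters before decoding.
  if (typeof receivedHex !== "string" || !/^[0-9a-f]{64}$/i.test(receivedHex)) {
    return false;
  }
  const received = Buffer.from(receivedHex, "hex");
  const expected = expectedSignature(secret, timestamp, rawBody);
  // timingSafeEqual throws when lengths differ, so check the length first.
  return received.length === expected.length && crypto.timingSafeEqual(received, expected);
}
```

Because the timestamp is part of the signed input in this sketch, an attacker cannot pair an old body with a fresh timestamp without invalidating the signature.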
For Mailhook integrations, use the exact signature fields and payload semantics documented in the Mailhook llms.txt integration reference. The reference is the right place to confirm header names, event shapes, and implementation details before wiring production automation.
Gate 5: Detect replays and make processing idempotent
Signature verification answers: did this payload come from the expected signer? Replay detection answers: have we already accepted this delivery?
Automation teams need both. Store a delivery ID or event ID after verification and reject or no-op duplicate deliveries. Webhook providers commonly retry delivery when your endpoint times out or returns an error, so duplicates are normal and should not be treated as rare anomalies.
Idempotency should exist at multiple layers:
- Delivery level, so the same webhook delivery is not processed twice.
- Message level, so the same email message does not create duplicate records.
- Artifact level, so the same OTP or magic link is not consumed twice.
For sign-up verification, artifact-level idempotency is often the most important layer. If two deliveries contain the same verification link, your automation should not click or submit it twice.
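The three layers can share one "claim" primitive keyed differently per layer. The sketch below uses an in-memory map as a stand-in for a shared store (in practice something like a database table with a unique key); the class, the key prefixes, and the event field names are all hypothetical.

```javascript
const crypto = require("node:crypto");

// In-memory stand-in for a shared store with atomic insert-if-absent semantics.
class SeenStore {
  constructor() {
    this.expiry = new Map();
  }
  // Returns true only for the first caller to claim the key within its TTL.
  claim(key, ttlSeconds, nowMs = Date.now()) {
    const expiresAt = this.expiry.get(key);
    if (expiresAt !== undefined && expiresAt > nowMs) return false;
    this.expiry.set(key, nowMs + ttlSeconds * 1000);
    return true;
  }
}

function artifactKey(inboxId, artifact) {
  // Hash the OTP or link so the store never holds the sensitive value itself.
  const digest = crypto.createHash("sha256").update(`${inboxId}:${artifact}`).digest("hex");
  return `artifact:${digest}`;
}

const store = new SeenStore();
const TTL_SECONDS = 86400;

// Decide whether a verified event may trigger a downstream action.
function decide(event) {
  if (!store.claim(`delivery:${event.deliveryId}`, TTL_SECONDS)) return "duplicate_delivery";
  if (!store.claim(`message:${event.messageId}`, TTL_SECONDS)) return "duplicate_message";
  if (!store.claim(artifactKey(event.inboxId, event.artifact), TTL_SECONDS)) return "duplicate_artifact";
  return "act";
}
```

Only a result of `"act"` should lead to submitting an OTP or following a link. A single `claim` call that both checks and records avoids the gap between a separate "exists" and "put".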
A reference handler pattern
The safest architecture is simple: verify first, acknowledge quickly, process asynchronously. The webhook endpoint should not run browser automation, call LLM tools, or mutate complex application state before authenticity and replay checks pass.
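```javascript
// Provider-neutral sketch. verifyEmailWebhook, selectWebhookSecret, logSecurityEvent,
// replayStore, and queue are placeholders for your own implementations.
async function emailWebhook(req) {
  // 1. Read the exact bytes that were signed, before any parsing.
  const raw = await req.rawBody()
  const headers = req.headers

  // 2. Verify signature, signing metadata, and timestamp against the raw body.
  const verified = verifyEmailWebhook({
    rawBody: raw,
    headers,
    secret: selectWebhookSecret(headers)
  })
  if (!verified.ok) {
    logSecurityEvent(verified.reason)
    return { status: 401 }
  }

  // 3. Replay check: an already accepted delivery is a successful no-op.
  if (await replayStore.exists(verified.deliveryId)) {
    return { status: 200 }
  }

  // 4. Parse only after verification; malformed JSON fails without side effects.
  let event
  try {
    event = JSON.parse(raw)
  } catch {
    logSecurityEvent("malformed_json")
    return { status: 400 }
  }

  // 5. Enqueue first, then record the delivery, so a failed enqueue can be retried.
  await queue.enqueue({
    event,
    deliveryId: verified.deliveryId,
    receivedAt: Date.now()
  })
  await replayStore.put(verified.deliveryId, { ttlSeconds: 86400 })
  return { status: 202 }
}
```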
This is provider-neutral pseudocode, not a drop-in SDK. The important shape is the order: raw body first, signature verification second, replay check third, parsing and queueing only after that.
Where verification belongs in the automation architecture
Webhook verification should sit at the edge of your email ingestion path, not deep inside a test runner or agent loop. By the time a payload reaches an LLM tool, CI assertion, or workflow engine, it should already be labeled as verified or rejected.
| Layer | Responsibility | What should never happen here |
|---|---|---|
| Webhook ingress | Read raw body, verify signature, check timestamp | Parse and trust unverified JSON |
| Replay store | Track accepted delivery IDs | Allow duplicate state changes |
| Queue or event bus | Decouple receipt from processing | Treat queue delivery as proof of authenticity |
| Email processor | Normalize JSON, extract artifacts, validate links | Execute raw HTML or arbitrary URLs |
| Agent or test tool | Consume a minimal typed result | Read full untrusted email content by default |
That separation matters in high-trust workflows too. Whether a team is automating onboarding, document intake, or integrations around mortgage products, a webhook should not advance a customer state until it passes authenticity, freshness, and idempotency checks.
The same principle applies to QA and CI. A failed signature should not become a flaky test. It should become a security event with enough context to debug the integration safely.
Special considerations for LLM agents
LLM agents make webhook safety more important because they can turn message content into actions. A human might inspect a suspicious email and stop. An agent might follow an instruction, visit a link, or retry a workflow unless the tool boundary is narrow.
For signed email webhooks that feed agents, use a minimized pipeline:
- Verify the webhook before the event enters the agent system.
- Normalize the email into structured JSON rather than exposing raw MIME.
- Extract only the artifact the agent needs, such as an OTP, verification URL, sender domain, or message timestamp.
- Validate URLs with allowlists, scheme checks, and SSRF protections before any browser or HTTP tool can use them.
- Return a typed tool result, not the full email body, unless the task explicitly requires inspection.
A good agent tool does not ask the model to decide whether a webhook is legitimate. The tool should only return verified, bounded, machine-readable data. The model can decide what to do next within a constrained workflow, but it should not be responsible for cryptographic checks.
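As one possible shape for such a tool boundary, the sketch below turns an already verified email into a small typed result. The allowlisted host, the six-digit OTP pattern, and the email field names (`text`, `links`, `receivedAt`) are assumptions for illustration. The hostname allowlist covers only part of SSRF protection; checks on DNS resolution and redirects are out of scope here.

```javascript
// Hypothetical allowlist; in practice this would come from workflow configuration.
const ALLOWED_HOSTS = new Set(["app.example.com"]);

function validateVerificationUrl(candidate) {
  let url;
  try {
    url = new URL(candidate);
  } catch {
    return { ok: false, reason: "unparsable_url" };
  }
  if (url.protocol !== "https:") return { ok: false, reason: "scheme_not_https" };
  // Reject userinfo tricks such as https://app.example.com@evil.test/.
  if (url.username || url.password) return { ok: false, reason: "embedded_credentials" };
  if (!ALLOWED_HOSTS.has(url.hostname)) return { ok: false, reason: "host_not_allowlisted" };
  return { ok: true, url: url.toString() };
}

// Build the minimal result an agent tool returns; the full body is never included.
function toToolResult(email) {
  const otpMatch = /\b(\d{6})\b/.exec(email.text ?? "");
  const link = (email.links ?? []).map(validateVerificationUrl).find((r) => r.ok);
  return {
    receivedAt: email.receivedAt ?? null,
    otp: otpMatch ? otpMatch[1] : null,
    verificationUrl: link ? link.url : null,
  };
}
```

The model sees only `otp`, `verificationUrl`, and `receivedAt`, so instructions embedded in the email body never reach it through this tool.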
How to test your webhook verifier
A webhook verifier is security-critical code, so it needs negative tests. Happy-path tests prove that a valid payload works. Negative tests prove that the handler fails closed when the request is wrong.
| Test case | Expected result |
|---|---|
| Valid payload, valid timestamp, valid signature | Accepted and queued once |
| Body changed by one byte after signing | Rejected before JSON parsing |
| Missing signature header | Rejected |
| Unknown key ID or signing version | Rejected |
| Timestamp outside tolerance window | Rejected |
| Same delivery ID sent twice | Second request returns no-op success |
| Valid signature with malformed JSON | Signature passes, parsing fails safely |
| Correct JSON but unsafe link domain | Event accepted, artifact rejected |
| Old valid payload replayed after the replay-store TTL expires | Rejected by the timestamp freshness check |
Include these cases in CI. If your application framework changes body parsing behavior during an upgrade, the tampered-body and raw-body tests should catch it before production.
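A table-driven test keeps the negative cases cheap to extend. The sketch below tests a deliberately tiny verifier so that it is self-contained; the signing scheme, the reason strings, and the request shape are assumptions, and in a real suite the cases would run against your actual handler.

```javascript
const { strictEqual } = require("node:assert");
const crypto = require("node:crypto");

// Minimal verifier under test; scheme and field names are assumptions for the example.
const SECRET = "test-secret";
const TOLERANCE = 300;

function sign(ts, body) {
  return crypto.createHmac("sha256", SECRET).update(`${ts}.${body}`).digest("hex");
}

function verify({ ts, body, sig }, now) {
  if (typeof sig !== "string") return "missing_signature";
  if (!Number.isInteger(ts) || Math.abs(now - ts) > TOLERANCE) return "stale_timestamp";
  const expected = Buffer.from(sign(ts, body), "hex");
  const received = Buffer.from(sig, "hex");
  if (received.length !== expected.length || !crypto.timingSafeEqual(received, expected)) {
    return "bad_signature";
  }
  return "ok";
}

const now = 1_700_000_000;
const body = '{"id":"evt_1"}';
const good = { ts: now, body, sig: sign(now, body) };

// Each row: description, request, expected verifier decision.
const cases = [
  ["valid request", good, "ok"],
  ["one byte changed after signing", { ...good, body: '{"id":"evt_2"}' }, "bad_signature"],
  ["missing signature", { ...good, sig: undefined }, "missing_signature"],
  ["timestamp too old", { ts: now - 301, body, sig: sign(now - 301, body) }, "stale_timestamp"],
  ["timestamp in future", { ts: now + 301, body, sig: sign(now + 301, body) }, "stale_timestamp"],
];

for (const [name, request, expected] of cases) {
  strictEqual(verify(request, now), expected, name);
}
console.log(`${cases.length} verifier cases passed`);
```

Passing the clock in as `now` keeps the timestamp cases deterministic, which matters when the suite runs in CI.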
It is also worth testing operational failures. What happens if the replay store is down? For high-safety automation, fail closed rather than processing an event you cannot deduplicate. What happens if the queue is down? Return a retryable status after verification, or persist a durable receipt before acknowledging.
Observability without leaking secrets
Webhook verification should be observable, but logs can become a liability if they contain full emails, OTPs, magic links, signatures, or secrets. Log stable identifiers and decisions instead of sensitive content.
Good logs include delivery ID, inbox ID, message ID, verification decision, rejection reason, timestamp skew, queue status, and processing duration. Risky logs include raw email bodies, full URLs with tokens, signature headers, shared secrets, or complete JSON payloads in production.
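One way to enforce this is to build log records from an explicit allowlist of fields rather than by redacting a full payload. The sketch below is illustrative only: the field names and `buildLogRecord` are hypothetical, and it keeps just the host of any extracted link so that tokens in the path or query never reach the logs.

```javascript
// Keep only the host of a link; paths and query strings often carry tokens.
function linkHost(candidate) {
  try {
    return new URL(candidate).host;
  } catch {
    return "[unparsable-url]";
  }
}

// Copy named fields explicitly so new payload fields are never logged by accident.
function buildLogRecord(input, nowMs = Date.now()) {
  return {
    deliveryId: input.deliveryId,
    inboxId: input.inboxId,
    messageId: input.messageId,
    decision: input.decision,
    reason: input.reason ?? null,
    skewSeconds: input.skewSeconds ?? null,
    durationMs: nowMs - input.startedAtMs,
    linkHost: input.link ? linkHost(input.link) : null,
  };
}
```

Even if a caller passes the raw body or signature header by mistake, those values are dropped because they are not among the copied fields.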
Metrics should separate authenticity failures from delivery failures. A spike in invalid signatures is different from a spike in processing timeouts. A spike in replay no-ops may be normal during provider retries, or it may indicate an upstream retry storm. Your dashboards should make that distinction visible.
The OWASP API Security Top 10 is a useful broader reference for thinking about API abuse, authentication, authorization, and unsafe consumption patterns. Webhook endpoints are API endpoints, even when they are only intended for provider-to-server communication.
Using Mailhook for signed email webhook workflows
Mailhook is built for automation teams that need programmable temp inboxes, structured JSON emails, and deterministic email handling for AI agents, QA automation, and signup verification flows.
A typical Mailhook-style flow looks like this:
- Create a disposable inbox through the API for a specific run, attempt, user simulation, or agent task.
- Use the generated email address in the target workflow.
- Receive the inbound email as structured JSON through a real-time webhook, with polling available as a fallback.
- Verify the signed payload before parsing or processing the event.
- Extract the minimal artifact, such as an OTP or verification link, and process it idempotently.
- Let the inbox expire or clean it up according to your workflow policy.
Mailhook also supports instant shared domains and custom domain setups, which lets teams start quickly and later move toward stronger domain control when allowlisting, environment isolation, or governance requires it. For higher-volume pipelines, batch email processing can help keep ingestion efficient while still preserving the same verification and idempotency rules.
The key is to treat Mailhook’s webhook as the event transport, not the final trust decision for email content. Verify the signed payload, then validate the message fields and extracted artifacts according to your own workflow rules.
Code review checklist for signed email webhooks
Use this checklist when reviewing a new webhook integration or hardening an existing one.
| Review item | Pass condition |
|---|---|
| Raw body handling | Signature is computed over the exact raw request body |
| Metadata validation | Required timestamp, signature, and delivery identifiers are present |
| Freshness policy | Old and future timestamps fail closed |
| Signature comparison | Constant-time comparison is used |
| Replay defense | Delivery IDs are stored and checked before processing |
| Idempotency | Message and artifact processing can safely repeat |
| Parser boundary | JSON parsing happens only after signature verification |
| Agent boundary | LLM tools receive minimal verified artifacts, not raw email by default |
| Secret handling | Webhook secrets are not logged and can be rotated |
| Fallback path | Polling fallback follows the same dedupe and validation rules |
If any item fails, the endpoint is not ready to trigger automated state changes. It may still be acceptable for a local prototype, but not for CI, QA, or agent workflows that run without human review.
Frequently Asked Questions
Are signed email webhooks the same as DKIM? No. DKIM is an email-level signal: the sending domain signs parts of the message so receivers can check that those parts were not altered and which domain vouches for them. A webhook signature authenticates the HTTP payload delivered by your email API provider. Automation teams often need both signals, but they answer different questions.
Can I parse the JSON before checking the signature? Avoid it. Verify the signature against the raw body first. Parsing can change the byte representation, and unverified JSON should not influence routing, queueing, or business logic.
What timestamp tolerance should we use? Choose a tolerance based on provider retry behavior, network latency, and your risk model. Many teams start with a short window measured in minutes, then adjust based on observed delivery patterns. Document the policy and test old, future, and malformed timestamps.
What should happen when the same signed webhook is delivered twice? The second delivery should be a safe no-op. Return success if the first delivery was already accepted, but do not repeat downstream actions such as submitting an OTP, clicking a magic link, or advancing a workflow state.
Do LLM agents need webhook verification if they only read email content? Yes. If an agent consumes email-derived data, the system should verify the webhook before exposing that data. The agent should receive a minimized, typed view, such as a verified OTP artifact, rather than raw untrusted email content whenever possible.
Build email automation on verified events
Signed email webhooks are the foundation for reliable email-driven automation. They keep CI tests deterministic, protect agent workflows from spoofed events, and give teams a clean boundary between receiving an email and acting on it.
If you want disposable inboxes that can be created through an API, delivered as structured JSON, consumed through real-time webhooks or polling, and protected with signed payloads, Mailhook provides the primitives to build that workflow without turning human mailboxes into automation infrastructure.
Start with the exact integration contract in the Mailhook llms.txt reference, then make signature verification, replay defense, and idempotent processing the first checks in every email automation path.