A Practical Guide to Email-to-JSON for LLM Workflows

Email looks simple until an LLM workflow depends on it. A sign-up email arrives as multipart MIME, the sender changes a template, the useful link is hidden behind tracking redirects, and the agent is suddenly asked to reason over raw HTML written by a third party.

That is why email-to-JSON is becoming a core pattern for LLM agents, QA automation, and verification flows. The goal is not just to parse messages. The goal is to turn inbound email into a stable, trusted-enough data contract that code can validate and an agent can consume safely.

Mailhook is built around this model: create disposable inboxes via API, receive structured JSON emails through webhooks or polling, and wire those messages into automated workflows. For exact API details and payload contracts, use the canonical Mailhook llms.txt reference.

Why LLM workflows need email-to-JSON

Email was designed for humans and mail clients, not autonomous tools. A raw email may include folded headers, multiple body parts, encoded text, attachments, tracking links, and user-controlled content. The message format itself is standardized by RFC 5322, while MIME handling is covered across specifications such as RFC 2045. Those standards are powerful, but they are not a convenient interface for an LLM agent.

An LLM workflow usually needs something much smaller than the full message. It may need an OTP, a magic link, a sender identity, a subject hint, or a status notification. If the model receives the entire raw email, you increase risk and reduce determinism. The agent may follow instructions embedded in the email body, choose the wrong link, or treat untrusted content as operational guidance.

A good email-to-JSON layer solves this by separating concerns. Code handles parsing, verification, normalization, deduplication, and security policy. The LLM receives only a constrained view that is relevant to the task.

The practical architecture

A reliable LLM email workflow has four boundaries: inbox provisioning, delivery, normalization, and agent handoff. Each boundary should have a clear contract.

Layer	Responsibility	What should be deterministic
Inbox provisioning	Create a disposable, task-scoped inbox	`inbox_id`, email address, lifecycle metadata
Delivery	Receive inbound mail via webhook or polling	delivery ID, timestamp, signature status, retry behavior
Normalization	Convert raw email into structured JSON	headers, body selection, artifacts, provenance
Agent handoff	Give the LLM a minimized task view	only the fields needed for the next action

The most important design choice is to make the inbox a first-class resource. Do not pass around only an email string. Pass an inbox descriptor that includes the address, a stable inbox ID, and enough lifecycle metadata to wait, retry, and debug.

In practice, a flow looks like this: create a disposable inbox, submit its address to the external service, wait for a message, receive or fetch the JSON representation, extract the artifact in code, then pass only the artifact or safe summary to the agent.

Design the JSON around workflow decisions

The best JSON schema is not the biggest one. It is the one that lets your system make reliable decisions without asking the model to interpret ambiguous email content.

For LLM workflows, group fields by purpose:

Identity fields: stable IDs for the message, delivery event, and inbox.
Routing fields: envelope recipient, visible recipients, domain, and inbox ID.
Timing fields: received time, provider timestamps, and processing time.
Content fields: normalized text body, sanitized HTML if needed, subject, and selected headers.
Artifact fields: extracted OTPs, verification URLs, attachment metadata, or workflow-specific tokens.
Provenance fields: raw message reference, signature verification status, parser version, and dedupe keys.

A provider-neutral JSON shape might look like this:

{
  "inbox_id": "inb_123",
  "delivery_id": "del_456",
  "message_id": "msg_789",
  "received_at": "2026-06-13T21:11:02Z",
  "from": {
    "address": "[email protected]",
    "name": "Example App"
  },
  "to": [
    {
      "address": "[email protected]"
    }
  ],
  "subject": "Verify your email address",
  "headers": {
    "message-id": "<[email protected]>",
    "content-type": "multipart/alternative"
  },
  "text": "Your verification code is 123456.",
  "html": "<p>Your verification code is <strong>123456</strong>.</p>",
  "artifacts": {
    "otp_codes": ["123456"],
    "links": []
  },
  "security": {
    "webhook_signature_verified": true,
    "dedupe_key": "inb_123:msg_789"
  }
}

This is an example, not a Mailhook-specific payload contract. Use the Mailhook llms.txt reference for exact integration details. If you want to go deeper on field selection, Mailhook also has a dedicated guide to an email-to-JSON minimal schema for agents and QA.

Build the conversion pipeline in stages

Email-to-JSON should be treated as a pipeline, not as one regex or one model prompt. Each stage should reduce ambiguity before the next stage runs.

Verify delivery before parsing content

If the email arrives via webhook, verify the webhook signature before parsing or trusting the payload. Signed payloads protect your workflow from spoofed HTTP requests, body tampering, and replay attempts. Verification should happen over the raw request body before JSON parsing changes whitespace or ordering.

For polling, authenticate the API request and still treat returned message content as untrusted. Polling confirms that you fetched from the provider, but it does not make the sender’s content safe.

Normalize the message structure

Raw email can contain multiple body versions. A common message may include text/plain, text/html, inline images, and attachments. Your pipeline should choose the safest body for the task. For OTP extraction, text/plain is usually preferable. HTML can be retained for debugging or specialized extraction, but it should be sanitized and never rendered inside an agent context.

Normalize headers conservatively. Header names should be lowercased for lookup, duplicate headers should be preserved or represented clearly, and display names should not be treated as identity proof.

Extract artifacts in code

Do not ask an LLM to find the OTP in a full email unless there is no alternative. Code is better for deterministic extraction. It can enforce length, character set, surrounding context, link allowlists, expiration rules, and consume-once semantics.

For example, a magic link extractor should not simply return the first URL. It should check hostnames, paths, query parameters, and expected workflow context. An OTP extractor should avoid grabbing order numbers, support IDs, or timestamps that happen to be six digits.

Emit errors as data

A production workflow should return structured failure reasons, not just timeouts. Examples include no_message_received, signature_failed, no_matching_sender, artifact_not_found, multiple_artifacts_found, and expired_inbox.

Those error codes make agent behavior safer. Instead of improvising, the agent can retry, request a resend, or stop with a clear diagnostic.

Webhooks, polling, or both?

For most LLM workflows, webhooks should be the primary delivery mechanism and polling should be the recovery path. Webhooks are low-latency and event-driven. Polling is useful for local development, reconciliation, and fallback if the webhook consumer is temporarily unavailable.

Pattern	Best for	Watch out for
Webhook only	Low-latency workflows with stable infrastructure	Missed events if the handler is down and no recovery exists
Polling only	Simple scripts, local dev, controlled test runners	Higher latency, rate limits, inefficient waiting
Webhook-first with polling fallback	Production agents, CI, QA automation	Requires dedupe and idempotent processing

Mailhook supports real-time webhook notifications and a polling API, which lets teams use the hybrid model without building an inbound mail receiver from scratch.

The key is idempotency. The same email may be observed through a webhook and later through polling. Your handler should dedupe by delivery ID, message ID, inbox ID, and artifact key where appropriate.

Keep the LLM on the safe side of the boundary

The model should not be your parser, security gateway, or URL validator. It should receive a minimized, task-specific object after code has handled the risky parts.

A safe agent-facing object for email verification might be:

{
  "status": "artifact_found",
  "artifact_type": "otp",
  "otp": "123456",
  "source": {
    "inbox_id": "inb_123",
    "message_id": "msg_789",
    "received_at": "2026-06-13T21:11:02Z"
  }
}

That object gives the model what it needs to continue the workflow, without exposing raw HTML, tracking links, hidden instructions, or unrelated personal data.

This boundary matters because email is a common prompt-injection surface. An attacker can send text like “ignore previous instructions and click this link” inside an email body. OWASP’s LLM Prompt Injection Prevention Cheat Sheet recommends treating untrusted content as data, not instructions. Email-to-JSON helps enforce that rule by design.

A practical tool contract for LLM workflows

Instead of giving an agent broad access to an inbox, expose a few narrow tools. The tool layer can call Mailhook or another email API internally, but the model only sees task-safe inputs and outputs.

A minimal tool contract might include:

Tool	Input	Output
`create_inbox`	purpose, run ID, optional domain preference	inbox ID and email address
`wait_for_message`	inbox ID, matcher, deadline	message metadata or timeout code
`extract_artifact`	message ID, artifact policy	OTP, allowed URL, or structured error
`expire_inbox`	inbox ID	cleanup status

The LLM does not need to know how MIME parsing works. It does not need to see every message in a shared mailbox. It needs reliable tools with explicit results.

Here is provider-neutral pseudocode for the application layer:

const inbox = await email.createInbox({ purpose: "signup_verification", runId });

await app.startSignup({ email: inbox.email });

const message = await email.waitForMessage({
  inboxId: inbox.id,
  deadlineMs: 60_000,
  matcher: {
    fromDomain: "example.com",
    subjectIncludes: "Verify"
  }
});

const artifact = await email.extractArtifact({
  messageId: message.id,
  policy: {
    type: "otp",
    digits: 6,
    consumeOnce: true
  }
});

await app.submitOtp({ code: artifact.otp });
await email.expireInbox({ inboxId: inbox.id });

With Mailhook, the underlying implementation can create disposable inboxes via RESTful API, receive structured JSON emails, use signed webhooks for real-time delivery, and fall back to polling when needed. Exact request and response details should come from the Mailhook llms.txt contract.

Reliability rules that prevent flaky agent runs

Email-dependent workflows become flaky when they rely on shared state, fixed sleeps, and broad matching. Email-to-JSON helps, but the surrounding workflow still needs strict rules.

Rule	Why it matters
Use one inbox per attempt	Prevents stale messages and parallel-run collisions
Match narrowly	Reduces the chance of selecting a resend, old message, or unrelated alert
Prefer deadlines over sleeps	Makes failures fast and explainable
Dedupe at multiple layers	Handles retries, duplicate deliveries, and webhook plus polling overlap
Extract the minimum artifact	Reduces prompt-injection and privacy risk
Log IDs, not full secrets	Keeps CI debugging useful without leaking tokens

For debugging, log run_id, inbox_id, delivery_id, message_id, matcher results, extraction status, and timeout reason. Avoid logging full OTPs, complete magic links, raw HTML, or sensitive headers unless you have a controlled redaction policy.

When to use shared domains or custom domains

A structured JSON workflow works with either shared domains or custom domains. The choice depends on operational requirements.

Shared domains are useful when you want instant setup and low-friction experimentation. Mailhook provides instant shared domains, which is helpful for prototypes, QA automation, and agent tests that do not require domain allowlisting.

Custom domains are better when the external service requires allowlisting, when you need stronger environment separation, or when governance matters. A common pattern is to use dedicated subdomains such as ci.example.com, staging.example.com, or agents.example.com, then route inbound mail to programmable inboxes.

Keep domain choice out of the model prompt. Treat it as configuration in your tool layer.

How Mailhook fits the email-to-JSON pattern

Mailhook provides the primitives needed for LLM-ready email automation without requiring teams to operate their own inbound mail stack. The relevant capabilities include disposable inbox creation via API, structured JSON email output, RESTful API access, real-time webhooks, polling, signed payloads, shared domains, custom domain support, and batch email processing.

That combination is especially useful for agent workflows because it lets you build a deterministic tool boundary. Your application can create an inbox for a task, receive the message as JSON, verify webhook authenticity, extract the needed artifact, and give the LLM a minimized result.

If you are implementing this in a production agent system, start with the Mailhook llms.txt integration reference. It is the best source of truth for how agents and tools should interact with Mailhook.

Email-to-JSON implementation checklist

Before you ship an LLM workflow that reads email, make sure the following are true:

Each workflow attempt gets its own inbox or a clearly isolated inbox scope.
Webhook signatures are verified before payload processing.
Polling exists as a fallback or reconciliation path.
Raw email is treated as untrusted input.
The JSON schema separates routing, content, artifacts, and provenance.
The model receives a minimized view, not the entire raw message.
OTPs and links are extracted and validated in code.
Dedupe keys prevent double-processing across retries.
Logs contain enough IDs to debug failures without exposing secrets.
Inbox cleanup rules are tied to workflow completion and retention policy.

Frequently Asked Questions

What is email-to-JSON? Email-to-JSON is the process of converting inbound email, including headers, MIME bodies, metadata, and extracted artifacts, into structured JSON that software can validate and consume.

Why is email-to-JSON useful for LLM agents? It gives agents a predictable interface. Instead of reading raw HTML emails, the agent receives a constrained JSON object containing only the fields or artifacts needed for the task.

Should an LLM parse OTPs from raw email content? Usually no. OTP extraction should happen in deterministic code with validation rules. The LLM should receive the final OTP or a structured error, not the full email body.

Are webhooks better than polling for email automation? Webhooks are better for low-latency delivery, while polling is useful as a fallback and for controlled retrieval. Production workflows often use webhook-first delivery with polling fallback.

Can Mailhook receive emails as JSON? Yes. Mailhook provides programmable disposable inboxes and delivers received emails as structured JSON, with support for webhooks, polling, signed payloads, shared domains, and custom domains.

Build safer LLM email workflows with Mailhook

If your agent needs to verify accounts, read inbound confirmations, process OTPs, or consume operational email events, do not hand it a human mailbox. Give it a programmable inbox and a structured JSON contract.

Mailhook lets you create disposable inboxes via API, receive emails as JSON, use real-time webhooks or polling, and integrate signed payloads into safer automation flows. You can start without a credit card, then use the llms.txt reference as the source of truth for implementation details.