A customer emails "I need to cancel my order #45892 and get a refund." Your support agent copies the order number into the CRM, types out the request, and files a ticket.
Ten minutes later, another customer emails. Then another. Your team spends half their day copy-pasting between email and your systems.
Meanwhile, the urgent ones get buried in the queue because nobody saw the word 'critical' in the subject line.
Every email contains structured data. You just need to extract it.
GATEWAY COMPONENT - Email remains the #1 way businesses receive unstructured requests. Parsing unlocks automation.
Email parsing extracts the useful pieces from raw email messages: who sent it, what they're asking for, any reference numbers, dates, amounts, or entities mentioned. The sender field gives you identity. The subject gives you context. The body gives you intent and details.
The simplest parsing is regex-based: find patterns like order numbers (#\d{5}) or email addresses. More sophisticated parsing uses NLP to understand intent ('cancel my order' vs 'check order status') and extract entities ('order #45892', '$299.00').
You're not trying to understand every email perfectly. You're trying to extract enough structure that automation can take over the repetitive work - routing, data entry, initial response.
Email parsing solves a universal problem: how do you convert freeform human communication into structured data that systems can act on?
Receive → Parse Headers → Parse Body → Extract Entities → Classify Intent → Route or Act. Every email follows this pattern. The variation is in how deeply you parse and what you do with the results.
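The stages above can be sketched end to end in a few lines. This is a minimal illustration, not a production parser: it assumes a single-part plain-text message, the five-digit order format used in this article, and a hypothetical keyword table (`INTENT_KEYWORDS`) standing in for a real intent classifier. The `parse_email` function and the queue names are illustrative, not from any specific product.

```python
import re
from email import message_from_string
from email.utils import parseaddr

ORDER_RE = re.compile(r"#(\d{5})\b")

# Hypothetical keyword rules standing in for a real intent classifier.
INTENT_KEYWORDS = {
    "cancel_order": ("cancel",),
    "refund": ("refund", "return"),
    "order_status": ("status", "where is"),
}

RAW = (
    "From: Dana Smith <dana@acme.com>\n"
    "Subject: Critical: cancel my order\n"
    "\n"
    "I need to cancel my order #45892 and get a refund.\n"
)

def parse_email(raw: str) -> dict:
    msg = message_from_string(raw)                    # Receive
    _, sender = parseaddr(msg.get("From", ""))        # Parse headers
    subject = msg.get("Subject", "")
    body = msg.get_payload()                          # Parse body (single-part text assumed)
    text = f"{subject}\n{body}".lower()
    orders = ORDER_RE.findall(text)                   # Extract entities
    intents = [                                       # Classify intent (keyword rules)
        name
        for name, words in INTENT_KEYWORDS.items()
        if any(word in text for word in words)
    ]
    urgent = "critical" in subject.lower()            # Route or act
    return {
        "sender": sender,
        "order_numbers": orders,
        "intents": intents,
        "queue": "priority" if urgent else "standard",
    }

result = parse_email(RAW)
```

Each stage is a single line here so that it can be swapped independently: a deeper parse might replace the keyword table with a language model while leaving the header and routing steps untouched.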
The metadata you get for free
Every email has structured headers: From, To, Subject, Date, Reply-To. Your mail server or any standard mail library already exposes these as named fields, so you only need to read them. The From field tells you who's writing. The Subject often contains the request type.
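As a rough sketch of reading those headers, Python's standard `email` package is enough. The sample message and the `read_headers` helper are made up for illustration; a real integration would receive the raw message from its mail provider.

```python
from email import message_from_string
from email.utils import parseaddr, parsedate_to_datetime

RAW = (
    "From: Dana Smith <dana@acme.com>\n"
    "To: support@example.com\n"
    "Reply-To: billing@acme.com\n"
    "Subject: Refund request for order #45892\n"
    "Date: Mon, 03 Mar 2025 09:15:00 +0000\n"
    "\n"
    "Please refund my order.\n"
)

def read_headers(raw: str) -> dict:
    """Pull identity and context out of the headers without touching the body."""
    msg = message_from_string(raw)
    name, address = parseaddr(msg.get("From", ""))
    _, reply_to = parseaddr(msg.get("Reply-To", ""))
    date_header = msg.get("Date")
    return {
        "sender_name": name,
        "sender_address": address,
        # Everything after the last "@" is the sender's domain.
        "sender_domain": address.rpartition("@")[2].lower(),
        # Fall back to the From address when no Reply-To is set.
        "reply_to": reply_to or address,
        "subject": msg.get("Subject", ""),
        "sent_at": parsedate_to_datetime(date_header) if date_header else None,
    }

headers = read_headers(RAW)
```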
Regex and rules for known formats
Order numbers follow patterns (#12345). Phone numbers have formats. Dates appear in recognizable ways. Regular expressions catch these reliably. If the customer always includes 'Order #' before the number, you extract it every time.
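A small sketch of this kind of rule-based extraction follows. The patterns are assumptions chosen for illustration (a strict 'Order #' prefix with five digits, dollar amounts, and ISO-style dates); real formats would need to be derived from your own emails.

```python
import re

# Illustrative patterns only; adjust to the formats your customers actually use.
PATTERNS = {
    "order_numbers": re.compile(r"Order #(\d{5})\b"),
    "amounts": re.compile(r"\$\d+(?:,\d{3})*(?:\.\d{2})?"),
    "iso_dates": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
}

def extract_known_formats(text: str) -> dict:
    """Return every match for each known pattern, keyed by pattern name."""
    return {name: pattern.findall(text) for name, pattern in PATTERNS.items()}

sample = "Order #45892 was charged $299.00 on 2025-03-01."
found = extract_known_formats(sample)
# found == {"order_numbers": ["45892"], "amounts": ["$299.00"], "iso_dates": ["2025-03-01"]}
```

The strict order pattern is deliberate here: it only fires when the customer writes exactly 'Order #' before the number, which is the reliable case described above.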
Understanding intent and entities
Language models can recognize that 'I want to return this' signals refund intent, even without the word 'refund.' They extract entities like product names, amounts, and dates even when the format varies. This costs more per email than regex, but it handles natural language.
A customer emails support with a refund request. Instead of a human reading, copying the order number, and filing a ticket, the system parses the email, extracts the order number, looks up the customer, and creates an auto-categorized support ticket in 2 seconds.
You're running NLP on every email to extract the sender's company. Meanwhile, the From header already has their domain (acme.com), and you could look them up in your CRM in milliseconds. Save AI for what actually needs understanding.
Instead: Extract headers first. Only parse the body for what headers can't tell you.
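One way to sketch the headers-first approach: resolve the company from the sender's domain, and only fall back to an expensive body analysis when that lookup fails. `CRM_BY_DOMAIN` is a hypothetical in-memory stand-in for a CRM query, and `body_fallback` is a placeholder for whatever NLP step you would otherwise run on every email.

```python
from email.utils import parseaddr
from typing import Callable, Optional

# Hypothetical stand-in for a CRM lookup keyed by email domain.
CRM_BY_DOMAIN = {
    "acme.com": {"company": "Acme Corp", "tier": "enterprise"},
}

def company_from_sender(from_header: str) -> Optional[dict]:
    """Cheap path: map the From header's domain to a CRM record."""
    _, address = parseaddr(from_header)
    domain = address.rpartition("@")[2].lower()
    return CRM_BY_DOMAIN.get(domain)

def identify_company(
    from_header: str,
    body: str,
    body_fallback: Callable[[str], str],
) -> str:
    record = company_from_sender(from_header)
    if record is not None:
        return record["company"]
    # Expensive path: only reached when the headers cannot answer the question.
    return body_fallback(body)
```

With this shape, the costly step runs only for senders whose domain is unknown, such as customers writing from personal addresses.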
Your parser looks for 'Order #' followed by 5 digits. Works great until someone writes 'order number 45892' or 'Order: #45892' or attaches the order as a PDF. Now 30% of emails fail silently.
Instead: Write tolerant patterns that accept common variations. Test against real email variations. Have a visible fallback, such as a manual review queue, for unparseable emails.
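A tolerant pattern plus an explicit fallback might look like the sketch below. It assumes five-digit order numbers and covers only the text variations mentioned above; it does not read PDF attachments, which would need a separate extraction step. The queue names are illustrative.

```python
import re
from typing import Optional

# Accepts "Order #45892", "order number 45892", "Order: #45892", "Order no. 45892".
ORDER_RE = re.compile(
    r"order(?:\s+(?:number|no\.?))?\s*[:#\s]*(\d{5})\b",
    re.IGNORECASE,
)

def extract_order_number(text: str) -> Optional[str]:
    match = ORDER_RE.search(text)
    return match.group(1) if match else None

def route(text: str) -> dict:
    """Send unparseable emails to a human instead of failing silently."""
    order = extract_order_number(text)
    if order is None:
        return {"queue": "manual_review", "order_number": None}
    return {"queue": "automated", "order_number": order}

variations = [
    "Order #45892",
    "order number 45892",
    "Order: #45892",
    "My order arrived broken",  # no number: falls back to manual review
]
results = [route(text)["queue"] for text in variations]
# results == ["automated", "automated", "automated", "manual_review"]
```

Keeping a list like `variations` as a regression test, fed from real emails, is what catches the next format nobody anticipated.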
You parse the latest message but miss that the customer already provided their order number three replies ago. Your bot asks for information that's already in the thread. Customer gets frustrated.
Instead: Parse the full thread when needed. Track conversation context. Check previous messages before asking again.
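A minimal sketch of checking the thread before asking again is shown below. It assumes the thread is available as a list of customer message bodies ordered oldest to newest, and that order numbers use the five-digit '#' format from this article; the reply wording is purely illustrative.

```python
import re
from typing import Optional, Sequence

ORDER_RE = re.compile(r"#(\d{5})\b")

def find_order_in_thread(messages: Sequence[str]) -> Optional[str]:
    """Search newest to oldest so the most recent order number wins."""
    for body in reversed(messages):
        match = ORDER_RE.search(body)
        if match:
            return match.group(1)
    return None

def next_reply(messages: Sequence[str]) -> str:
    order = find_order_in_thread(messages)
    if order is None:
        # Only ask when no message in the thread contains the number.
        return "Could you share your order number?"
    return f"Thanks, looking into order #{order} now."

thread = [
    "My order #45892 arrived damaged.",
    "Any update on this?",
    "Still waiting, please help.",
]
reply = next_reply(thread)
# reply == "Thanks, looking into order #45892 now."
```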
You've learned how to extract structure from emails. The natural next step is understanding how to classify what the sender actually wants - turning 'I need to cancel' into an actionable intent.