Invoice Data Extraction from PDF
An AI agent calls LlamaParse to parse a PDF invoice, then returns structured JSON with vendor, amount, line items, and totals.
Tags: Invoice, PDF, LlamaParse, Agent, Data Extraction, Accounting
4 nodes · Free & source-available
Stop copying invoice data by hand. Provide a PDF URL and the InvoiceExtractor agent calls the LlamaParse tool to get clean markdown, then returns a structured JSON object with every invoice field — ready for QuickBooks, Xero, or a DataTable.
What this workflow does
- InvoiceURL — provide the PDF URL (paste it in or swap for a Webhook trigger)
- InvoiceExtractor — agent receives the URL, calls the llamaParseAPI tool, and returns structured JSON
- llamaParseAPI — HTTP tool node: agent POSTs the PDF URL to LlamaParse and gets back clean markdown
- InvoiceData — output with structured invoice JSON (vendor, number, date, total, line items)
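The structured output described above can be checked downstream before it reaches accounting software. The following is a minimal Python sketch, not the template's own code: the key names (`vendor`, `number`, `date`, `total`, `line_items`, and the per-item keys) are assumptions for illustration, and the real keys depend on the schema in the agent's system instruction.

```python
import json
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class LineItem:
    description: str
    quantity: Decimal
    unit_price: Decimal
    amount: Decimal


@dataclass
class Invoice:
    vendor: str
    number: str
    date: str  # kept as the string the agent returned
    total: Decimal
    line_items: list[LineItem]


def parse_invoice(raw: str) -> Invoice:
    """Parse the agent's JSON text into typed objects.

    Raises KeyError if an expected (assumed) field is missing.
    """
    # parse_float=Decimal avoids binary floating-point rounding on amounts.
    data = json.loads(raw, parse_float=Decimal)
    items = [
        LineItem(
            description=str(item["description"]),
            quantity=Decimal(str(item["quantity"])),
            unit_price=Decimal(str(item["unit_price"])),
            amount=Decimal(str(item["amount"])),
        )
        for item in data["line_items"]
    ]
    return Invoice(
        vendor=str(data["vendor"]),
        number=str(data["number"]),
        date=str(data["date"]),
        total=Decimal(str(data["total"])),
        line_items=items,
    )


# Made-up sample payload, shaped like the assumed schema.
sample = (
    '{"vendor": "Acme Corp", "number": "INV-001", "date": "2024-01-15", '
    '"total": 150.00, "line_items": [{"description": "Widget", '
    '"quantity": 3, "unit_price": 50.00, "amount": 150.00}]}'
)
invoice = parse_invoice(sample)
```

Parsing into typed objects makes a missing or renamed field fail loudly at this step instead of surfacing later as a bad row in QuickBooks, Xero, or a DataTable.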
Use cases
- Automated AP data entry from emailed PDF invoices
- Invoice pre-processing before uploading to accounting software
- Batch ingestion of historical invoices into a DataTable
Setup
- Open the llamaParseAPI node and replace YOUR_LLAMAPARSE_KEY in the curl command with your real key from cloud.llamaindex.ai.
- Open InvoiceExtractor and connect an OpenAI-compatible credential.
- Run once with the sample URL — the agent calls LlamaParse and returns a JSON invoice object in the output panel.
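If you manage the curl command outside the editor, the key substitution in the first setup step can be scripted. This is a small Python sketch under stated assumptions: `insert_key` is a hypothetical helper, and the command string below is made up for illustration, not the template's actual curl command.

```python
PLACEHOLDER = "YOUR_LLAMAPARSE_KEY"


def insert_key(curl_command: str, api_key: str) -> str:
    """Return the command with the placeholder replaced by api_key."""
    if not api_key or api_key == PLACEHOLDER:
        raise ValueError("provide a real LlamaParse API key")
    if PLACEHOLDER not in curl_command:
        raise ValueError("placeholder not found; the key may already be set")
    return curl_command.replace(PLACEHOLDER, api_key)


# Made-up command for illustration only; the template's real command differs.
template = (
    'curl -H "Authorization: Bearer YOUR_LLAMAPARSE_KEY" '
    "https://example.invalid/parse"
)
configured = insert_key(template, "llx-example-key")
```

Failing when the placeholder is absent or the key is empty catches the most common setup mistake, a run with the sample key still in place.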
Notes
- LlamaParse handles scanned PDFs with OCR automatically.
- Extend the JSON schema in the agent's system instruction to capture PO numbers, tax IDs, or additional line-item fields.
- For high volume, add a Loop upstream and pass each PDF URL through the same workflow.
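The high-volume note amounts to running the same extraction once per PDF URL and keeping failures separate from successes. Below is a minimal Python sketch of that pattern outside Heym. The `extract_invoice` callable is a hypothetical stand-in for one run of the workflow, not a Heym or LlamaParse API.

```python
from typing import Callable


def extract_all(
    urls: list[str],
    extract_invoice: Callable[[str], dict],
) -> tuple[dict[str, dict], dict[str, str]]:
    """Run the extractor on each URL; return (results, failures) keyed by URL."""
    results: dict[str, dict] = {}
    failures: dict[str, str] = {}
    for url in urls:
        try:
            results[url] = extract_invoice(url)
        except Exception as exc:  # one bad PDF should not stop the batch
            failures[url] = str(exc)
    return results, failures


def fake_extractor(url: str) -> dict:
    """Stand-in for a workflow run; fails for URLs that are not PDFs."""
    if not url.endswith(".pdf"):
        raise ValueError("not a PDF URL")
    return {"source": url}


ok, failed = extract_all(
    ["https://example.invalid/a.pdf", "https://example.invalid/b.txt"],
    fake_extractor,
)
```

Collecting failures by URL lets you retry only the invoices that did not parse.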
How to import this template
- 1. Click Import → Copy JSON on this page.
- 2. Open your Heym instance and navigate to a workflow canvas.
- 3. Press Cmd+V / Ctrl+V. The nodes appear instantly.
- 4. Add your API keys in the node config panels and click Run.