Invoice → JSON API

Convert Invoice PDFs to JSON in 2026

Parse any invoice PDF into structured JSON via API or webhook. Typed schemas, line-item arrays, custom fields, per-field confidence scores. Built for developer pipelines.

30 free pages/month · Webhooks + REST API

What the JSON looks like

Schema-driven output. Line items as an array. Per-field confidence scores let you route low-confidence extractions to human review.

{
  "id": "doc_01HX7K2YZQ3NJPM5S9R4T6V8W0",
  "status": "processed",
  "confidence": 0.97,
  "data": {
    "vendor": { "name": "Acme Supply Co", "address": "123 Market St, SF" },
    "invoice_number": "INV-1042",
    "invoice_date": "2026-04-15",
    "due_date": "2026-05-15",
    "po_number": "PO-998",
    "currency": "USD",
    "line_items": [
      {
        "description": "Widget A - Blue",
        "quantity": 10,
        "unit_price": 24.00,
        "tax": 24.00,
        "line_total": 240.00
      },
      {
        "description": "Widget B - Red",
        "quantity": 5,
        "unit_price": 76.00,
        "tax": 38.00,
        "line_total": 380.00
      }
    ],
    "subtotal": 620.00,
    "tax_total": 62.00,
    "grand_total": 682.00,
    "gl_code": "5100-OPERATING-SUPPLIES"
  }
}
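
A minimal sketch of review routing on an envelope like the one above. It reads only the top-level `confidence`, since the sample does not show where per-field scores sit in the payload. The 0.90 cutoff and the `route` function name are assumptions for illustration, not Parsli defaults.

```python
import json

# Assumed cutoff (hypothetical): tune it to your own error tolerance.
REVIEW_THRESHOLD = 0.90


def route(envelope: dict, threshold: float = REVIEW_THRESHOLD) -> str:
    """Return "auto" when the document can be posted straight through,
    or "review" when a person should look at it first."""
    confidence = envelope.get("confidence")
    # A missing score is treated as low confidence rather than trusted.
    if confidence is None or confidence < threshold:
        return "review"
    return "auto"


sample = json.loads(
    '{"id": "doc_example", "status": "processed", '
    '"confidence": 0.97, "data": {"invoice_number": "INV-1042"}}'
)
print(route(sample))  # auto
```

A stricter pipeline might raise the threshold for high-value invoices or always review a vendor's first invoice.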

Three ways to get JSON out

Web upload

Drop a PDF, see the structured JSON in the web app, download as .json.

REST API

POST the PDF to /v1/parsers/{id}/documents, receive JSON in the response or via async webhook.

Webhooks

Parsli pushes JSON to your endpoint the moment extraction completes. Zero polling.

Quick start — one curl call

curl -X POST https://api.parsli.co/v1/parsers/${PARSER_ID}/documents \
  -H "Authorization: Bearer ${PARSLI_API_KEY}" \
  -F "file=@invoice.pdf"

# Response (sync, small PDFs)
# {
#   "id": "doc_...",
#   "status": "processed",
#   "data": { "vendor": {...}, "line_items": [...], "grand_total": 682.00 }
# }
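
The same request can be sketched in Python with only the standard library. The URL, the Bearer header, and the `file` form field come from the curl call above. The helper name `build_upload_request` and the hand-built multipart body are illustrative assumptions, and the function only builds the request without sending it.

```python
import urllib.request
import uuid

API_BASE = "https://api.parsli.co/v1"  # base URL from the curl example


def build_upload_request(
    parser_id: str, api_key: str, filename: str, pdf_bytes: bytes
) -> urllib.request.Request:
    """Build a multipart/form-data POST equivalent to
    `curl -F "file=@invoice.pdf"`. Nothing is sent here."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/pdf\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/parsers/{parser_id}/documents",
        data=head + pdf_bytes + tail,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

To send it, pass the returned object to `urllib.request.urlopen` and decode the response body as JSON. Most projects would use an HTTP client library instead of assembling the multipart body by hand.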

The API docs cover the full reference, authentication, rate limits, and webhook payloads.

Features that matter for developers

  • Typed schemas (string, number, date, boolean, array)
  • Per-field confidence scores
  • Nested objects for vendor / line items
  • Custom fields via natural-language instructions
  • Webhook delivery with HMAC signatures
  • REST API with idempotency keys
  • Per-document PDF retrieval URL
  • Async processing for large PDFs
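
For the HMAC-signed webhooks in the list above, a receiver typically recomputes the signature over the raw request body and compares it in constant time. The sketch below assumes HMAC-SHA256 with a hex-encoded digest. The actual algorithm, encoding, and header name are not stated on this page, so check the API docs before relying on it.

```python
import hashlib
import hmac


def verify_signature(secret: bytes, raw_body: bytes, received_signature: str) -> bool:
    """Return True when the received signature matches the body.

    Assumptions (hypothetical): HMAC-SHA256, hex-encoded, computed
    over the exact bytes of the request body.
    """
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking the match position through timing.
    return hmac.compare_digest(expected.encode("utf-8"),
                               received_signature.encode("utf-8"))
```

Verify against the raw bytes before parsing the JSON, because re-serializing the payload can change whitespace or key order and break the comparison.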

Frequently asked questions

How do I get invoice data as JSON?
Three paths. (1) Define your invoice schema in the no-code builder — vendor, invoice number, line items, totals, plus any custom fields — then upload a PDF or forward it to your parser inbox. The web app returns JSON inline and lets you download it. (2) POST the PDF to the REST API (`/v1/parsers/{id}/documents`) and get structured JSON back in the response. (3) Configure a webhook so every processed invoice pushes a JSON payload to your endpoint automatically.
Is the schema typed?
Yes. Each field in the schema builder has a type: string, number, date, boolean, or an array of objects (for line items). The JSON output respects those types — numbers are numbers, dates are ISO 8601 strings, and the line-item array is a first-class list. You can also specify natural-language instructions per field for things like "normalize all dates to YYYY-MM-DD" or "extract only the numeric total, not the currency symbol."
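Because dates arrive as ISO 8601 strings and amounts as JSON numbers, the `data` object maps cleanly onto typed values on the client side. A minimal sketch, assuming the field names from the sample payload. The `InvoiceHeader` class and `parse_header` function are illustrative names, not part of any Parsli SDK.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass
class InvoiceHeader:
    invoice_number: str
    invoice_date: date
    due_date: date
    currency: str
    grand_total: Decimal


def parse_header(data: dict) -> InvoiceHeader:
    """Convert the `data` object of a processed document into typed values."""
    return InvoiceHeader(
        invoice_number=data["invoice_number"],
        invoice_date=date.fromisoformat(data["invoice_date"]),
        due_date=date.fromisoformat(data["due_date"]),
        currency=data["currency"],
        # Go through str() so the Decimal reflects the printed number,
        # not the binary float approximation.
        grand_total=Decimal(str(data["grand_total"])),
    )
```

Using `Decimal` for money is a choice of this sketch. It avoids float rounding when amounts are summed later.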
How do I extract line items as an array?
Define `line_items` as an array field in your schema with sub-fields for `description`, `quantity`, `unit_price`, `tax`, and `line_total`. The AI returns a JSON array with one object per line item, preserving the original order. Multi-row descriptions and tables that span pages are stitched together automatically. See our [extract line items from invoices guide](/guides/extract-line-items-from-invoices) for the schema builder walkthrough.
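Once line items arrive as an array, a quick arithmetic check catches most extraction slips before they reach accounting. The sketch below assumes the sub-fields named above, and it further assumes that `line_total` excludes tax and that `subtotal` is the sum of the line totals. Adjust it if your invoices work differently. `check_line_items` is a hypothetical helper name.

```python
from decimal import Decimal


def to_decimal(value) -> Decimal:
    """Convert a JSON number to Decimal via its printed form."""
    return Decimal(str(value))


def check_line_items(data: dict, tolerance: Decimal = Decimal("0.01")) -> list[str]:
    """Return a list of human-readable problems; an empty list means consistent."""
    problems = []
    running = Decimal("0")
    for index, item in enumerate(data["line_items"]):
        expected = to_decimal(item["quantity"]) * to_decimal(item["unit_price"])
        actual = to_decimal(item["line_total"])
        if abs(expected - actual) > tolerance:
            problems.append(
                f"line {index}: quantity x unit_price = {expected}, line_total = {actual}"
            )
        running += actual
    subtotal = to_decimal(data["subtotal"])
    if abs(running - subtotal) > tolerance:
        problems.append(f"line totals sum to {running}, subtotal = {subtotal}")
    return problems
```

A non-empty result is a reasonable trigger for human review, alongside low confidence scores.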
What about custom fields like GL code or cost center?
Add them to the schema with plain-English instructions — e.g., "Extract the cost center from the top-right of the invoice, format as a 4-digit code" or "Map the vendor to the matching GL account from this list: …". The AI handles these the same way it handles standard fields. No training data required.
What's the API response shape?
A standard JSON envelope: `{ "id": "doc_…", "status": "processed", "confidence": 0.97, "data": { …your schema… } }`. Confidence is scored per field, so you can route low-confidence results to human review. For async processing, the initial POST returns immediately with `"status": "pending"` and the full payload arrives via webhook. Full reference in our [docs](/docs).
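A caller therefore has to handle both outcomes of the POST. A minimal sketch, assuming only the two status values mentioned here (`processed` and `pending`). Any other value is treated as unexpected, which is an assumption of this example and not documented behavior.

```python
from typing import Optional


def extract_data(envelope: dict) -> Optional[dict]:
    """Return the extracted fields for a processed document, or None
    when the result will arrive later via webhook."""
    status = envelope.get("status")
    if status == "processed":
        return envelope["data"]
    if status == "pending":
        # Keep envelope["id"] so the webhook payload can be matched later.
        return None
    raise ValueError(f"unexpected status: {status!r}")
```

Storing the document `id` from the pending response lets the webhook handler tie the eventual payload back to the original upload.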
How does this compare to AWS Textract or Document AI?
Textract and Google Document AI are primitives: you build the invoice-parsing layer on top and handle vendor variation, template training, and field validation yourself. Parsli is the finished product: schema builder, line-item handling, webhook delivery, native QuickBooks integration, and human-review UI for exceptions, all out of the box. For the underlying benchmark comparison of extraction engines, see our [invoice OCR software guide](/invoice-ocr-software).

Ship invoice parsing to production.

API, webhooks, typed schemas. Free tier, no credit card.