API Documentation

Integrate DocServant into your systems to extract structured data from documents.

Looking for the Web UI User Guide? Learn how to create templates, run extractions, and manage reference data using our dashboard UI.

View User Guide

How the DocServant API works

DocServant uses a template-first approach to document extraction:

Templates define the structure of extracted data
Templates are created once in DocServant Studio
The API processes documents against a template
All processing is asynchronous
Results are delivered via webhooks

Think of templates as a contract between your documents and the structured output you receive. Define the schema once, then process any number of documents against it.

Templates

Templates are the foundation of DocServant's extraction model. Every API request references a template ID.

What templates define

The fields to extract from documents
Data types and formats for each field
How fields map to spreadsheet columns
Validation rules and transformations

Creating templates

Templates are created and managed in DocServant Studio. Upload a sample document, define your columns, and the template is ready to use via the API.

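Templates are reusable: create one once, then process any number of documents against it. Studio is the primary place to create and manage templates; the Template section below also documents a POST /v1/template endpoint for creating them programmatically.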

Quick Start

The typical workflow for integrating DocServant:

Create a template in DocServant Studio

Upload a sample document, define columns, and configure extraction settings.

Generate an API key

Go to the Developers section in your account to create and manage API keys.

Send documents to the API with a template ID

POST documents to /run referencing your template.

Receive structured results via webhook

Authentication

All API requests must include your API key in the Authorization header using the Bearer scheme.

Authorization: Bearer sk_<key_id>_<secret>

API keys follow the format sk_<uuid>_<random>. The raw key is returned once at creation and never stored — save it securely.

Keep your API keys secret. Never expose them in client-side code, public repositories, or share them with unauthorized users. If a key is compromised, revoke it immediately from the Developers page.
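As a minimal sketch of the header format above, the Python helper below builds the request headers. The function name `auth_headers` and the environment variable `DOCSERVANT_API_KEY` are illustrative assumptions, not part of the DocServant API.

```python
import os


def auth_headers(api_key: str) -> dict[str, str]:
    """Build headers for a DocServant API request using the Bearer scheme."""
    if not api_key.startswith("sk_"):
        # The documented key format is sk_<uuid>_<random>; fail early otherwise.
        raise ValueError("API key should start with 'sk_'")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


# Read the key from the environment (hypothetical variable name) instead of
# hard-coding it; the fallback value is a placeholder, not a real key.
API_KEY = os.environ.get("DOCSERVANT_API_KEY", "sk_example-key-id_example-secret")
HEADERS = auth_headers(API_KEY)
```

Keeping the key in an environment variable or a secret manager keeps it out of source control, in line with the warning above.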

API Keys

API keys let external services authenticate without user sessions. They are scoped, revocable, and optionally expiring. Only team owners can create and manage keys.

Scopes

Each key carries a set of scopes that limit what it can access. Grant only what the integration needs.

Scope	Permission
`read:team`	Read team details
`read:run`	Read document runs
`read:template`	Read templates
`read:user`	Read user data
`read:statistics`	Read usage statistics
`write:team`	Modify team settings
`write:run`	Create/modify document runs
`write:template`	Create/modify templates
`write:user`	Create/modify users
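
One way to apply least privilege is to list the scopes an integration needs and compare them with what a key was granted. The sketch below is client-side bookkeeping only; `missing_scopes` is a hypothetical helper, and the scope names are copied from the table above.

```python
KNOWN_SCOPES = frozenset({
    "read:team", "read:run", "read:template", "read:user", "read:statistics",
    "write:team", "write:run", "write:template", "write:user",
})


def missing_scopes(granted: list[str], required: list[str]) -> list[str]:
    """Return the required scopes that the key does not carry, sorted."""
    unknown = set(required) - KNOWN_SCOPES
    if unknown:
        # Catch typos such as "run:read" before they reach the API.
        raise ValueError(f"Unknown scopes: {sorted(unknown)}")
    return sorted(set(required) - set(granted))
```

For example, an integration that only submits documents and polls for results would likely need `write:run` and `read:run` and nothing else.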

Key Management Endpoints

API keys are created and managed through your DocServant Platform account — not via an API key itself.

Endpoints

All endpoints are relative to the API URL:

https://api.platform.docservant.com

Runs

Submit a Run

POST /v1/run

Submit a document for extraction using a specified template.

Request Fields

Field	Type	Description
`template_id`*	string	The ID of the template to use for extraction
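`files`*	files	The document file(s) to process (PDF, DOCX, PNG, JPG). In the cURL example below, each entry is a metadata object with id, filename, content_type, and file_size.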
`output_mode`	string	Controls how results are structured: "combined" merges all documents into a single list; "single" keeps each document's data separate. Defaults to the template's output_mode.
`store_data`	boolean	Whether to retain uploaded files after the run completes. Set to false on privacy-sensitive workloads to purge source files automatically. Defaults to the template's store_data setting.

cURL Example

curl -X POST https://api.platform.docservant.com/v1/run \
  -H 'Authorization: Bearer <sk_your_live_api_key>' \
  -H 'Content-Type: application/json' \
  -d '{
    "template_id": "<your-template-uuid>",
    "output_mode": "combined",
    "store_data": true,
    "files": [
      {
        "id": "abc123",
        "filename": "invoice.pdf",
        "content_type": "application/pdf",
        "file_size": 102400
      }
    ]
  }'

Response Example

{
  "success": true,
  "message": "Successfully generated upload URLs for 1 files",
  "data": {
    "uploads": [
      {
        "document_id": "<document-uuid>",
        "status": "pending",
        "filename": "invoice.pdf",
        "expires_in": 3600,
        "upload_url": "<presigned-s3-upload-url>",
        "upload_fields": {
          "Content-Type": "application/pdf",
          "key": "<s3-upload-key>"
        }
      }
    ],
    "run_id": "<run-uuid>"
  }
}
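
The request and response above can be handled with two small Python helpers: one that assembles the JSON body and one that pulls the upload targets out of the response. This is a sketch; `build_run_payload` and `upload_targets` are hypothetical names. How the file bytes are sent to `upload_url` is not described in this document, so the sketch stops at extracting the targets.

```python
from typing import Any, Optional


def build_run_payload(
    template_id: str,
    files: list[dict[str, Any]],
    output_mode: Optional[str] = None,
    store_data: Optional[bool] = None,
) -> dict[str, Any]:
    """Assemble the JSON body for POST /v1/run from the documented fields."""
    payload: dict[str, Any] = {"template_id": template_id, "files": files}
    if output_mode is not None:
        if output_mode not in ("combined", "single"):
            raise ValueError("output_mode must be 'combined' or 'single'")
        payload["output_mode"] = output_mode
    if store_data is not None:
        payload["store_data"] = store_data
    # Omitted optional fields fall back to the template's settings.
    return payload


def upload_targets(response: dict[str, Any]) -> list[tuple[str, str, dict]]:
    """Return (document_id, upload_url, upload_fields) for each upload entry."""
    return [
        (u["document_id"], u["upload_url"], u.get("upload_fields", {}))
        for u in response["data"]["uploads"]
    ]
```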

Get Run Status

GET /v1/run/:run_id

Check the status of a run and retrieve outputs when complete.

cURL Example

curl https://api.platform.docservant.com/v1/run/<your-run-uuid> \
  -H "Authorization: Bearer <sk_your_live_api_key>" \
  -H "Content-Type: application/json"

Response Example

{
  "success": true,
  "message": "Get Run",
  "data": {
    "run": {
      "run_id": "<run-uuid>",
      "template_id": "<template-uuid>",
      "template_name": "Invoice Extraction",
      "team_id": "<team-uuid>",
      "status": "completed",
      "output_mode": "combined",
      "store_data": true,
      "avg_time_to_process": 9569,
      "completed_at": "2026-04-28T10:41:07.097058+00:00",
      "created_at": "2026-04-28T10:40:58.000960+00:00",
      "updated_at": "2026-04-28T10:41:08.151896+00:00",
      "deleted_at": null,
      "input_tokens": "Total input tokens for this run",
      "output_tokens": "Total output tokens for this run",
      "documents": [
        {
          "document_id": "<document-uuid>",
          "name": "invoice.pdf",
          "status": "completed",
          "pages": 1,
          "bucket_key": "uploads/<team-uuid>/<template-uuid>/<run-uuid>/invoice.pdf",
          "preview_url": "https://s3.amazonaws.com/...",
          "time_to_process": 9569.3,
          "input_tokens": "Amount input tokens for this document",
          "output_tokens": "Amount output tokens for this document",
          "created_at": "2026-04-28T10:40:57.258407+00:00",
          "updated_at": "2026-04-28T10:41:06.827715+00:00",
          "extracted_data": {
            "note": "This object structure depends on your template's defined fields",
            "ExampleGroup": {
              "Field One": "value",
              "Field Two": "value"
            }
          }
        }
      ],
      "extracted_data": [
        {
          "note": "This array structure depends on your template's defined fields",
          "ExampleGroup": {
            "Field One": "value",
            "Field Two": "value"
          }
        }
      ]
    }
  }
}

For failed runs, the response includes an error field with details:

{
  "run_id": "<run-uuid>",
  "status": "failed",
  "error": {
    "code": "EXTRACTION_FAILED",
    "message": "Unable to extract data from document. The file may be corrupted or in an unsupported format."
  }
}
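
If webhooks are not an option, a client can poll this endpoint until the run reaches a terminal state. The sketch below assumes that `completed` and `failed` are the terminal statuses (the only ones shown in this document) and takes the HTTP call as an injected `fetch_run` callable, so no particular HTTP library is implied. The function names and the default interval are illustrative.

```python
import time
from typing import Any, Callable

# Assumption: these are the only terminal statuses; adjust if others exist.
TERMINAL_STATUSES = {"completed", "failed"}


def run_status(body: dict[str, Any]) -> str:
    """Read the status from either response shape shown above."""
    if "data" in body:
        return body["data"]["run"]["status"]
    return body["status"]


def wait_for_run(
    fetch_run: Callable[[str], dict[str, Any]],
    run_id: str,
    interval: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Poll fetch_run(run_id) until the run is completed or failed."""
    for attempt in range(max_attempts):
        body = fetch_run(run_id)
        if run_status(body) in TERMINAL_STATUSES:
            return body
        if attempt < max_attempts - 1:
            sleep(interval)
    raise TimeoutError(f"Run {run_id} did not finish after {max_attempts} polls")
```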

output_mode

Controls how document extraction results are structured once all documents complete. combined (default) merges all document extractions into a single list accessible as extracted_data at the run level. single keeps each document's own extracted_data without a run-level merge. When omitted from the create request, the run inherits the value from the template.
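
Because the location of the extracted data depends on the mode, client code usually needs one branch per mode. The following Python sketch assumes a run object shaped like `data.run` in the response example above; `collect_extracted_data` is a hypothetical helper name.

```python
from typing import Any


def collect_extracted_data(run: dict[str, Any]) -> list[Any]:
    """Return extracted data for a run, honoring its output_mode."""
    if run.get("output_mode") == "combined":
        # Combined: one merged list at the run level.
        return list(run.get("extracted_data") or [])
    # Single: each document keeps its own extracted_data.
    return [
        doc["extracted_data"]
        for doc in run.get("documents", [])
        if doc.get("extracted_data") is not None
    ]
```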

store_data

Controls whether the original uploaded files are kept in storage after the run completes. When true (default), files are retained and each document includes a signed preview_url. When false, files are deleted from storage once the run reaches a terminal state and preview_url is not returned. Use store_data: false on privacy-sensitive workloads where source documents should be purged automatically after extraction. When omitted, the run inherits the value from the template.

Team

Get Team Details

GET /v1/team

Retrieve details about the authenticated user's team, including plan information and quota usage.

cURL Example

curl https://api.platform.docservant.com/v1/team \
  -H "Authorization: Bearer <sk_your_live_api_key>" \
  -H "Content-Type: application/json"

Response Example

{
  "success": true,
  "message": "Successfully get team",
  "data": {
    "team_id": "<team-uuid>",
    "name": "Your Team Name",
    "api_access_enabled": true,
    "preferred_model": "gpt-5.4-mini",
    "is_custom_plan": 0,
    "webhook_mask": "wh_xxx***...***xxx",
    "created_at": "2026-03-13T13:38:55.387267+00:00",
    "updated_at": "2026-04-28T10:41:07.096285+00:00",
    "deleted_at": null,
    "plan": {
      "key": "docservant_platform_standard",
      "name": "DocServant Platform - Standard",
      "type": "STANDARD",
      "token_quota": 5000000,
      "current_token_usage": 40944,
      "last_reported_token_usage": 0,
      "overage_allowed": true,
      "overage_cost_per_token": 0.00008,
      "overage_unit_size": 50000,
      "overage_cost_per_unit": 4.0,
      "price_monthly": 25,
      "recurring": false,
      "start_date": "2026-04-06T09:07:56.549326+00:00",
      "expiration_date": "2026-05-06T09:07:56.549326+00:00",
      "created_at": "2026-04-06T09:07:56.549326+00:00",
      "updated_at": "2026-04-06T09:07:56.549326+00:00"
    }
  }
}
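
The plan fields in this response are enough to track quota consumption on the client. The sketch below assumes that remaining quota is simply `token_quota` minus `current_token_usage`; this document does not confirm how the platform itself computes it, and `quota_summary` is a hypothetical helper.

```python
from typing import Any


def quota_summary(team_response: dict[str, Any]) -> dict[str, Any]:
    """Summarize token usage from a GET /v1/team response body."""
    plan = team_response["data"]["plan"]
    quota = plan["token_quota"]
    used = plan["current_token_usage"]
    return {
        "remaining_tokens": max(quota - used, 0),
        "percent_used": round(100 * used / quota, 2) if quota else 0.0,
        "over_quota": used > quota,
    }
```

With the example values above (a quota of 5,000,000 and usage of 40,944), the remaining balance under this assumption is 4,959,056 tokens.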

List Team Members

GET /v1/user

Retrieve a paginated list of all members in the authenticated user's team.

cURL Example

curl https://api.platform.docservant.com/v1/user \
  -H "Authorization: Bearer <sk_your_live_api_key>" \
  -H "Content-Type: application/json"

Response Example

{
  "success": true,
  "message": "Successfully get list of users",
  "data": {
    "items": [
      {
        "user_id": "<user-uuid>",
        "team_id": "<team-uuid>",
        "email": "owner@example.com",
        "first_name": "Jane",
        "last_name": "Doe",
        "role": "owner",
        "status": "active",
        "email_verified": true,
        "last_login": "2026-04-29T14:31:05.154539+00:00",
        "created_at": "2026-03-13T13:38:56.749083+00:00",
        "updated_at": "2026-04-29T14:31:04.435744+00:00",
        "deleted_at": null
      },
      {
        "user_id": "<user-uuid>",
        "team_id": "<team-uuid>",
        "email": "member@example.com",
        "first_name": "John",
        "last_name": "Smith",
        "role": "admin",
        "status": "pending",
        "email_verified": false,
        "last_login": "2026-03-18T08:18:05.714621+00:00",
        "created_at": "2026-03-18T08:16:56.736167+00:00",
        "updated_at": "2026-03-18T08:18:05.714639+00:00",
        "deleted_at": null
      }
    ],
    "pagination": {
      "current_page": 1,
      "all_pages": 1,
      "total_items": 2
    }
  }
}
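
List endpoints return a `pagination` object, so a client can walk all pages until `current_page` reaches `all_pages`. This document does not say how a specific page is requested, so the Python sketch below takes a caller-supplied `fetch_page(page_number)` function instead of assuming a query parameter. The name `iter_items` is illustrative.

```python
from typing import Any, Callable, Iterator


def iter_items(fetch_page: Callable[[int], dict[str, Any]]) -> Iterator[dict[str, Any]]:
    """Yield every item from a paginated list endpoint, page by page."""
    page = 1
    while True:
        data = fetch_page(page)["data"]
        yield from data["items"]
        # Stop once the last page reported by the API has been consumed.
        if page >= data["pagination"]["all_pages"]:
            break
        page += 1
```

The same loop applies to any response that carries the `items` and `pagination` structure shown above.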

Template

List Templates

GET /v1/template

Retrieve a paginated list of all extraction templates for your team.

cURL Example

curl https://api.platform.docservant.com/v1/template \
  -H "Authorization: Bearer <sk_your_live_api_key>" \
  -H "Content-Type: application/json"

Response Example

{
  "success": true,
  "message": "Successfully get list of jobs",
  "data": {
    "items": [
      {
        "template_id": "<template-uuid>",
        "team_id": "<team-uuid>",
        "name": "Invoice Extraction",
        "output_mode": "single",
        "prompt_instructions": "Extract all relevant invoice data",
        "store_data": false,
        "total_runs": 4,
        "last_run": "2026-04-28T10:41:08.320130+00:00",
        "created_at": "2026-04-28T10:22:49.194772+00:00",
        "created_by": "<user-uuid>",
        "updated_at": "2026-04-28T13:44:48.346157+00:00",
        "updated_by": "<user-uuid>",
        "deleted_at": null,
        "sheets": [
          {
            "sheet_id": "<sheet-uuid>",
            "name": "Invoice",
            "prompt_instructions": "",
            "included_in_extraction": true,
            "single_object_extraction": false,
            "parent_sheet_id": null,
            "parent_sheet_name": null,
            "parent_sheet_column_id": null,
            "parent_sheet_column_name": null,
            "created_at": "2026-04-28T10:22:49.193908+00:00",
            "created_by": "<user-uuid>",
            "updated_at": "2026-04-28T10:22:49.193908+00:00",
            "updated_by": "<user-uuid>",
            "columns": [
              {
                "column_id": "<column-uuid>",
                "name": "Invoice Number",
                "data_type": "string",
                "enabled": true,
                "prompt_instructions": "",
                "enabled_reference_mapping": false,
                "reference_data_set_id": null,
                "return_data_set_field_id": null,
                "created_at": "2026-04-28T10:22:49.193908+00:00",
                "created_by": "<user-uuid>",
                "updated_at": "2026-04-28T10:22:49.193908+00:00",
                "updated_by": "<user-uuid>"
              }
            ],
            "children": []
          }
        ]
      }
    ],
    "pagination": {
      "current_page": 1,
      "all_pages": 1,
      "total_items": 1
    }
  }
}

Create a Template

POST /v1/template

Create a new extraction template with sheets and columns defining the structure of data to extract.

Request Fields

Field	Type	Description
`name`*	string	Display name for the template
`prompt_instructions`	string	Global extraction instructions applied to all sheets
`output_mode`*	string	Extraction output mode. Use "single" for most cases
`store_data`*	boolean	Whether uploaded source files are retained after a run completes. Runs that omit store_data inherit this value.
`sheets`*	array	Array of sheet objects defining the extraction schema (see below)

cURL Example

curl -X POST https://api.platform.docservant.com/v1/template \
  -H "Authorization: Bearer <sk_your_live_api_key>" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Invoice Data Extraction Template",
    "prompt_instructions": "Extract all relevant financial information from invoices with high accuracy",
    "output_mode": "single",
    "store_data": true,
    "sheets": [
      {
        "name": "Invoice Header",
        "prompt_instructions": "Extract the main invoice header information",
        "included_in_extraction": true,
        "single_object_extraction": true,
        "columns": [
          { "name": "Invoice Number", "enabled": true, "prompt_instructions": "Extract the unique invoice number or ID" },
          { "name": "Invoice Date",   "enabled": true, "prompt_instructions": "Extract the invoice date in YYYY-MM-DD format" },
          { "name": "Total Amount",   "enabled": true, "prompt_instructions": "Extract the total amount due" }
        ],
        "children": []
      },
      {
        "name": "Line Items",
        "prompt_instructions": "Extract all individual line items from the invoice",
        "included_in_extraction": true,
        "single_object_extraction": false,
        "columns": [
          { "name": "Item Description", "enabled": true, "prompt_instructions": "Extract the description of the product or service" },
          { "name": "Quantity",         "enabled": true, "prompt_instructions": "Extract the quantity ordered" },
          { "name": "Unit Price",       "enabled": true, "prompt_instructions": "Extract the price per unit" },
          { "name": "Line Total",       "enabled": true, "prompt_instructions": "Extract or calculate the total for this line item" }
        ],
        "children": []
      }
    ]
  }'

Response Example

{
  "success": true,
  "message": "Created Template",
  "data": {
    "template": {
      "template_id": "<template-uuid>",
      "team_id": "<team-uuid>",
      "name": "Invoice Data Extraction Template",
      "output_mode": "single",
      "prompt_instructions": "Extract all relevant financial information from invoices with high accuracy",
      "store_data": true,
      "total_runs": 0,
      "last_run": null,
      "created_at": "2026-04-29T15:41:45.442903+00:00",
      "created_by": "<user-uuid>",
      "updated_at": "2026-04-29T15:41:45.442903+00:00",
      "updated_by": "<user-uuid>",
      "deleted_at": null,
      "sheets": [
        {
          "sheet_id": "<sheet-uuid>",
          "name": "Invoice Header",
          "prompt_instructions": "Extract the main invoice header information",
          "included_in_extraction": true,
          "single_object_extraction": true,
          "parent_sheet_id": null,
          "parent_sheet_name": null,
          "parent_sheet_column_id": null,
          "parent_sheet_column_name": null,
          "created_at": "2026-04-29T15:41:45.441983+00:00",
          "created_by": "<user-uuid>",
          "updated_at": "2026-04-29T15:41:45.441983+00:00",
          "updated_by": "<user-uuid>",
          "columns": [
            {
              "column_id": "<column-uuid>",
              "name": "Invoice Number",
              "data_type": "string",
              "enabled": true,
              "prompt_instructions": "Extract the unique invoice number or ID",
              "enabled_reference_mapping": false,
              "reference_data_set_id": null,
              "return_data_set_field_id": null,
              "created_at": "2026-04-29T15:41:45.441983+00:00",
              "created_by": "<user-uuid>",
              "updated_at": "2026-04-29T15:41:45.441983+00:00",
              "updated_by": "<user-uuid>"
            }
          ],
          "children": []
        }
      ]
    }
  }
}
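
Writing the nested sheet and column structure by hand is error-prone, so a few small builder functions can help. The Python sketch below only assembles the request body using the field names from the cURL example above; the helper names `make_column`, `make_sheet`, and `make_template` are hypothetical.

```python
from typing import Any, Optional


def make_column(name: str, instructions: str = "", enabled: bool = True) -> dict[str, Any]:
    """Build one column object."""
    return {"name": name, "enabled": enabled, "prompt_instructions": instructions}


def make_sheet(
    name: str,
    columns: list[dict[str, Any]],
    instructions: str = "",
    single_object: bool = False,
    children: Optional[list[dict[str, Any]]] = None,
) -> dict[str, Any]:
    """Build one sheet object; single_object=True suits header-style sheets."""
    return {
        "name": name,
        "prompt_instructions": instructions,
        "included_in_extraction": True,
        "single_object_extraction": single_object,
        "columns": columns,
        "children": children or [],
    }


def make_template(
    name: str,
    sheets: list[dict[str, Any]],
    output_mode: str = "single",
    store_data: bool = True,
    instructions: str = "",
) -> dict[str, Any]:
    """Build the JSON body for POST /v1/template."""
    return {
        "name": name,
        "prompt_instructions": instructions,
        "output_mode": output_mode,
        "store_data": store_data,
        "sheets": sheets,
    }
```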

Statistics

GET /v1/statistics

Retrieve usage statistics for your team including token consumption, extraction counts, and overage details.

cURL Example

curl https://api.platform.docservant.com/v1/statistics \
  -H "Authorization: Bearer <sk_your_live_api_key>" \
  -H "Content-Type: application/json"

Response Example

{
  "success": true,
  "message": "Successfully fetched dashboard statistics",
  "data": {
    "templates": 10,
    "extractions": 39,
    "tokens": 40944,
    "avg_processing_time_in_seconds": 21.0,
    "overage": {
      "allowed": true,
      "tokens_left": 459056,
      "overage_tokens": 0,
      "cost_per_token": 0.00008,
      "overage_unit_size": 50000,
      "overage_cost": 0.0,
      "overage_cost_per_unit": 4.0
    }
  }
}
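
The overage fields can be used to estimate extra cost. This document does not state whether overage is billed per token or per whole unit of `overage_unit_size` tokens, so the Python sketch below computes both figures and leaves the choice to the reader. `overage_estimate` is a hypothetical helper, and rounding up to whole units is an assumption.

```python
import math
from typing import Any


def overage_estimate(stats_response: dict[str, Any]) -> dict[str, Any]:
    """Estimate overage cost two ways from a GET /v1/statistics response."""
    overage = stats_response["data"]["overage"]
    tokens = overage["overage_tokens"]
    # Interpretation 1: every overage token is billed individually.
    per_token_cost = tokens * overage["cost_per_token"]
    # Interpretation 2 (assumption): usage is rounded up to whole units.
    units = math.ceil(tokens / overage["overage_unit_size"]) if tokens else 0
    per_unit_cost = units * overage["overage_cost_per_unit"]
    return {"per_token_cost": per_token_cost, "units": units, "per_unit_cost": per_unit_cost}
```

In the example response the two rates agree for a full unit: 50,000 tokens at 0.00008 per token comes to the same 4.0 as one unit.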

Webhooks

Webhooks push real-time event notifications to your server whenever something changes — a run is created, a template is updated, a user is removed, etc.

How webhooks work

A team owner registers a webhook with an HTTPS endpoint URL and an event type.
When the event occurs, the platform publishes it to AWS EventBridge.
An EventBridge rule triggers the delivery function, which POSTs the payload to your endpoint.
The request is signed with HMAC-SHA256 so you can verify it came from DocServant.

Event Types

Event	Triggered when
`create.user`	A new user is added to the team
`update.user`	A user record is modified
`delete.user`	A user is removed
`create.template`	A new template is created
`update.template`	A template is modified
`delete.template`	A template is deleted
`create.run`	A document run is started
`update.run`	A run is updated
`delete.run`	A run is deleted

Payload Format

Every delivery is an HTTP POST with Content-Type: application/json.

{
  "event": "create.run",
  "team_id": "your-team-uuid",
  "data": {
    // full object data for the affected resource
  }
}
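
Since every event name has the form `<action>.<resource>`, a handler can split it once and route on the two parts. The Python sketch below parses a payload of the shape shown above into a small dataclass; `WebhookEvent` and `parse_event` are hypothetical names, and the allowed actions and resources are taken from the Event Types table.

```python
import json
from dataclasses import dataclass
from typing import Any

ACTIONS = {"create", "update", "delete"}
RESOURCES = {"user", "template", "run"}


@dataclass(frozen=True)
class WebhookEvent:
    action: str
    resource: str
    team_id: str
    data: dict[str, Any]


def parse_event(raw_body: str) -> WebhookEvent:
    """Parse a webhook body and validate its event name."""
    payload = json.loads(raw_body)
    action, _, resource = payload["event"].partition(".")
    if action not in ACTIONS or resource not in RESOURCES:
        raise ValueError(f"Unexpected event type: {payload['event']!r}")
    return WebhookEvent(action, resource, payload["team_id"], payload.get("data", {}))
```

Parsing should happen only after the signature has been verified against the raw body.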

Request Headers

Header	Description
`Content-Type`	application/json
`X-Webhook-Signature`	sha256=<hex_signature>
`X-Webhook-Timestamp`	Unix timestamp (seconds) of the delivery

Verifying the Signature

Always verify the signature before processing a webhook. Compute HMAC-SHA256 over {timestamp}.{raw_body} using your team's webhook secret, where timestamp is the value from X-Webhook-Timestamp.

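// Node.js
const crypto = require("crypto");

// body must be the raw request body string, exactly as received
function verifyWebhook(secret, timestamp, body, signatureHeader) {
  const message = `${timestamp}.${body}`;
  const expected = crypto
    .createHmac("sha256", secret)
    .update(message, "utf8")
    .digest("hex");
  const received = String(signatureHeader || "").replace(/^sha256=/, "");
  const expectedBuf = Buffer.from(expected, "utf8");
  const receivedBuf = Buffer.from(received, "utf8");
  // timingSafeEqual throws if the lengths differ, so check them first
  if (expectedBuf.length !== receivedBuf.length) return false;
  return crypto.timingSafeEqual(expectedBuf, receivedBuf);
}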

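# Python (3.9+ for str.removeprefix)
import hashlib
import hmac

def verify_webhook(secret, timestamp, body, signature_header):
    # body must be the raw request body as a str, exactly as received;
    # decode bytes to str first, or the f-string below embeds b'...'
    message = f"{timestamp}.{body}"
    expected = hmac.new(
        secret.encode("utf-8"),
        message.encode("utf-8"),
        hashlib.sha256
    ).hexdigest()
    received = signature_header.removeprefix("sha256=")
    return hmac.compare_digest(expected, received)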

Use a timing-safe comparison (timingSafeEqual / compare_digest) to prevent timing attacks.
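
To exercise a handler locally, it helps to produce a signature the same way the scheme above describes and then check that verification accepts it and rejects a tampered body. The Python sketch below works on the raw body as bytes; the secret, timestamp, and helper names `sign` and `verify` are made up for illustration.

```python
import hashlib
import hmac


def sign(secret: str, timestamp: str, raw_body: bytes) -> str:
    """Compute the X-Webhook-Signature value for {timestamp}.{raw_body}."""
    message = timestamp.encode("utf-8") + b"." + raw_body
    digest = hmac.new(secret.encode("utf-8"), message, hashlib.sha256).hexdigest()
    return f"sha256={digest}"


def verify(secret: str, timestamp: str, raw_body: bytes, signature_header: str) -> bool:
    """Timing-safe check of a received signature header."""
    expected = sign(secret, timestamp, raw_body).encode("utf-8")
    # Comparing bytes avoids errors on non-ASCII header values.
    return hmac.compare_digest(expected, signature_header.encode("utf-8"))


# Hypothetical values for a local test; never hard-code a real secret.
SECRET = "wh_test_secret"
TIMESTAMP = "1777372867"
BODY = b'{"event": "create.run", "team_id": "t1", "data": {}}'
HEADER = sign(SECRET, TIMESTAMP, BODY)
```

Calling `verify(SECRET, TIMESTAMP, BODY, HEADER)` should succeed, while changing any byte of the body or the timestamp should make it fail.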

Delivery Behavior

Method: HTTP POST, timeout: 30 seconds
No automatic retries — failed deliveries are logged but not re-sent
Each webhook records the last delivery attempt in last_delivery

Make your handler idempotent and consider polling the API for missed events if uptime guarantees are critical.
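
One way to make a handler idempotent is to remember which deliveries have already been processed and skip repeats. The payload format above does not include a unique event ID, so the Python sketch below derives a key from the timestamp and raw body; that choice, the class name `Deduplicator`, and the in-memory set are all assumptions. A real deployment would likely use a persistent store shared across processes.

```python
import hashlib


class Deduplicator:
    """Track processed deliveries so repeated ones can be ignored."""

    def __init__(self) -> None:
        self._seen: set[str] = set()

    def first_time(self, timestamp: str, raw_body: bytes) -> bool:
        """Return True the first time a delivery is seen, False afterwards."""
        key = hashlib.sha256(timestamp.encode("utf-8") + b"." + raw_body).hexdigest()
        if key in self._seen:
            return False
        self._seen.add(key)
        return True
```

A handler would call `first_time(...)` after verifying the signature and return 200 without side effects when it yields False.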

Security Notes

HTTPS endpoints only — plaintext HTTP is not accepted
Validate X-Webhook-Signature on every delivery before trusting the payload
Rotate the secret if it is ever exposed
To temporarily stop deliveries, set is_active: false rather than deleting

Error Handling

Runs may fail due to unreadable documents, invalid templates, or processing errors. Clients should handle retries where appropriate.

HTTP Status Codes

Code	Meaning
`200`	Success
`400`	Bad request — Check your request body or parameters
`401`	Unauthorized — Invalid or missing API key
`403`	Forbidden — Key inactive, revoked, expired, or missing required scope
`404`	Not found — Key, run, or template doesn't exist
`429`	Rate limited — Too many requests, slow down
`500`	Server error — Something went wrong on our end
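
A client can use these codes to decide whether a retry is worthwhile. The Python sketch below treats 429 and 500 as retryable and the other error codes as permanent until the request or key is fixed; that split, the helper names, and the backoff values are assumptions, since this document does not publish a retry policy or rate limits.

```python
# Assumption: only rate limiting and server errors are worth retrying.
RETRYABLE_STATUS = {429, 500}


def should_retry(status_code: int) -> bool:
    """Return True if a request with this status may succeed when retried."""
    return status_code in RETRYABLE_STATUS


def backoff_delays(attempts: int, base: float = 1.0, cap: float = 30.0) -> list[float]:
    """Exponential backoff schedule in seconds: base, 2*base, 4*base, up to cap."""
    return [min(base * (2 ** i), cap) for i in range(attempts)]
```

For instance, `backoff_delays(4)` yields waits of 1, 2, 4, and 8 seconds between successive attempts.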

Examples

Extract Invoices into JSON

Submit an invoice for extraction and fetch the structured JSON output:

# Submit the invoice
curl -X POST https://api.platform.docservant.com/v1/run \
  -H "Authorization: Bearer sk_<key_id>_<secret>" \
  -F "template_id=tpl_invoices" \
  -F "file=@invoice.pdf"

# Response: { "run_id": "run_abc123", "status": "queued" }

# Poll for completion (or use webhooks)
curl https://api.platform.docservant.com/v1/run/run_abc123 \
  -H "Authorization: Bearer sk_<key_id>_<secret>"

# When status is "completed", fetch the JSON output
curl "https://storage.docservant.com/outputs/run_abc123.json?token=..."

Bulk Upload

When you submit multiple documents with the same template, each run produces its own output files. To consolidate results, fetch individual JSON outputs and merge them in your application, or download each XLSX and combine sheets programmatically.

# Submit multiple documents
for file in invoices/*.pdf; do
  curl -X POST https://api.platform.docservant.com/v1/run \
    -H "Authorization: Bearer sk_<key_id>_<secret>" \
    -F "template_id=tpl_invoices" \
    -F "file=@$file"
done
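
Merging the results in your application can be as simple as concatenating each run's extracted data while keeping track of where each entry came from. The Python sketch below assumes each input is a run object shaped like `data.run` in the Get Run Status response; `merge_outputs` is a hypothetical helper and the output layout is one possible choice.

```python
from typing import Any


def merge_outputs(runs: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Flatten run-level extracted_data from several runs into one list."""
    merged: list[dict[str, Any]] = []
    for run in runs:
        for entry in run.get("extracted_data") or []:
            # Keep the source run_id so each entry stays traceable.
            merged.append({"run_id": run.get("run_id"), "data": entry})
    return merged
```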

Handle Webhook Delivery (Node.js)

const express = require('express');
const crypto = require('crypto');

const app = express();
app.use(express.raw({ type: 'application/json' })); // raw body needed for signature

app.post('/webhooks/docservant', (req, res) => {
  const signature = req.headers['x-webhook-signature'] || '';
  const timestamp = req.headers['x-webhook-timestamp'] || '';
  const rawBody = req.body.toString('utf8');

  // Recompute the signature over "{timestamp}.{raw_body}" as bare hex
  const message = `${timestamp}.${rawBody}`;
  const expected = crypto
    .createHmac('sha256', process.env.WEBHOOK_SECRET)
    .update(message, 'utf8')
    .digest('hex');
  const received = signature.replace(/^sha256=/, '');

  // Compare hex with hex; timingSafeEqual throws if the lengths differ
  const expectedBuf = Buffer.from(expected, 'utf8');
  const receivedBuf = Buffer.from(received, 'utf8');
  if (
    expectedBuf.length !== receivedBuf.length ||
    !crypto.timingSafeEqual(expectedBuf, receivedBuf)
  ) {
    return res.status(401).send('Invalid signature');
  }

  const { event, team_id, data } = JSON.parse(rawBody);

  if (event === 'create.run') {
    console.log(`New run started for team ${team_id}`, data);
  }

  res.status(200).send('OK');
});

// Port 3000 is an arbitrary choice for this example
app.listen(3000);