API Documentation
Integrate DocServant into your systems to extract structured data from documents.
How the DocServant API works
DocServant uses a template-first approach to document extraction:
- Templates define the structure of extracted data
- Templates are created once in DocServant Studio
- The API processes documents against a template
- All processing is asynchronous
- Results are delivered via webhooks
Templates
Templates are the foundation of DocServant's extraction model. Every API request references a template ID.
What templates define
- The fields to extract from documents
- Data types and formats for each field
- How fields map to spreadsheet columns
- Validation rules and transformations
Creating templates
Templates are created and managed in DocServant Studio. Upload a sample document, define your columns, and the template is ready to use via the API.
Quick Start
The typical workflow for integrating DocServant:
1. Create a template in DocServant Studio. Upload a sample document, define columns, and configure extraction settings.
2. Generate an API key. Go to the Developers section in your account to create and manage API keys.
3. Send documents to the API with a template ID. POST documents to /v1/run, referencing your template.
4. Receive structured results via webhook. Register a webhook to receive real-time event notifications for runs, templates, users, and more.
Authentication
All API requests must include your API key in the Authorization header using the Bearer scheme.
Authorization: Bearer sk_<key_id>_<secret>
API keys follow the format sk_<uuid>_<random>. The raw key is returned once at creation and never stored, so save it securely.
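As a minimal sketch of the Bearer scheme described above, the helper below builds the Authorization header in Python. The function name and the local "sk_" prefix check are illustrative assumptions, not part of the DocServant API.

```python
def auth_headers(api_key: str) -> dict:
    """Build the Authorization header for a DocServant API request."""
    # Assumption: checking the documented "sk_" prefix locally catches obvious
    # copy-paste mistakes; the API itself decides whether a key is valid.
    if not api_key.startswith("sk_"):
        raise ValueError("API keys are expected to start with 'sk_'")
    return {"Authorization": f"Bearer {api_key}"}
```

Pass the returned dictionary as the headers of whichever HTTP client you use.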
API Keys
API keys let external services authenticate without user sessions. They are scoped, revocable, and optionally expiring. Only team owners can create and manage keys.
Scopes
Each key carries a set of scopes that limit what it can access. Grant only what the integration needs.
| Scope | Permission |
|---|---|
| read:team | Read team details |
| read:run | Read document runs |
| read:template | Read templates |
| read:user | Read user data |
| read:statistics | Read usage statistics |
| write:team | Modify team settings |
| write:run | Create/modify document runs |
| write:template | Create/modify templates |
| write:user | Create/modify users |
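One way to apply "grant only what the integration needs" is to derive the scope list from the actions an integration performs. The sketch below is illustrative: the action names and the action-to-scope mapping are assumptions inferred from the scope descriptions in the table, not a documented mapping.

```python
# Hypothetical mapping from integration actions to the scopes listed above.
REQUIRED_SCOPES = {
    "submit_run": {"write:run"},
    "get_run": {"read:run"},
    "list_templates": {"read:template"},
    "create_template": {"write:template"},
    "get_team": {"read:team"},
    "list_users": {"read:user"},
    "get_statistics": {"read:statistics"},
}


def minimal_scopes(actions):
    """Return the sorted union of scopes needed for the given actions."""
    scopes = set()
    for action in actions:
        scopes |= REQUIRED_SCOPES[action]
    return sorted(scopes)
```

For example, an integration that only submits runs and reads their results would request the scopes returned by minimal_scopes(["submit_run", "get_run"]).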
Key Management Endpoints
API keys are created and managed through your DocServant Platform account — not via an API key itself.
Endpoints
All endpoints are relative to the API URL:
https://api.platform.docservant.com
Runs
Submit a Run
POST /v1/run
Submit a document for extraction using a specified template.
Request Fields
| Field | Type | Description |
|---|---|---|
| template_id* | string | The ID of the template to use for extraction |
| files* | files | The document file (PDF, DOCX, PNG, JPG) |
| output_mode | string | Controls how results are structured: "combined" merges all documents into a single list; "single" keeps each document's data separate. Defaults to the template's output_mode. |
| store_data | boolean | Whether to retain uploaded files after the run completes. Set to false on privacy-sensitive workloads to purge source files automatically. Defaults to the template's store_data setting. |
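The request body can also be assembled programmatically. The sketch below builds a JSON body in the shape used by the cURL example in this section, taking file metadata from local paths. Treating "id" as a client-generated identifier is an assumption, and build_run_request is a hypothetical helper name.

```python
import mimetypes
import os
import uuid


def build_run_request(template_id, paths, output_mode=None, store_data=None):
    """Build the JSON body for submitting a run from local file paths."""
    files = []
    for path in paths:
        content_type, _ = mimetypes.guess_type(path)
        files.append({
            "id": uuid.uuid4().hex,  # assumption: any client-chosen identifier
            "filename": os.path.basename(path),
            "content_type": content_type or "application/octet-stream",
            "file_size": os.path.getsize(path),
        })
    body = {"template_id": template_id, "files": files}
    # Omitted optional fields fall back to the template's settings.
    if output_mode is not None:
        if output_mode not in ("combined", "single"):
            raise ValueError("output_mode must be 'combined' or 'single'")
        body["output_mode"] = output_mode
    if store_data is not None:
        body["store_data"] = bool(store_data)
    return body
```

Leaving output_mode and store_data out of the body keeps the template's values, which matches the documented defaults.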
cURL Example
curl -X POST https://api.platform.docservant.com/v1/run \
-H 'Authorization: Bearer <sk_your_live_api_key>' \
-H 'Content-Type: application/json' \
-d '{
"template_id": "<your-template-uuid>",
"output_mode": "combined",
"store_data": true,
"files": [
{
"id": "abc123",
"filename": "invoice.pdf",
"content_type": "application/pdf",
"file_size": 102400
}
]
}'
Response Example
{
"success": true,
"message": "Successfully generated upload URLs for 1 files",
"data": {
"uploads": [
{
"document_id": "<document-uuid>",
"status": "pending",
"filename": "invoice.pdf",
"expires_in": 3600,
"upload_url": "<presigned-s3-upload-url>",
"upload_fields": {
"Content-Type": "application/pdf",
"key": "<s3-upload-key>"
}
}
],
"run_id": "<run-uuid>"
}
}
Get Run Status
GET /v1/run/:run_id
Check the status of a run and retrieve outputs when complete.
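When webhooks are not an option, run status can be polled. The sketch below is one possible polling loop: fetch_run is a caller-supplied function (hypothetical) that calls this endpoint and returns the "run" object from the response. Treating "completed" and "failed" as the only terminal statuses is an assumption based on the examples in this section.

```python
import time

# Assumption: these are the statuses after which a run no longer changes.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_run(fetch_run, run_id, interval=2.0, timeout=300.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_run(run_id) until the run reaches a terminal status."""
    deadline = clock() + timeout
    while True:
        run = fetch_run(run_id)
        if run.get("status") in TERMINAL_STATUSES:
            return run
        if clock() >= deadline:
            raise TimeoutError(f"run {run_id} did not finish within {timeout}s")
        sleep(interval)
```

The sleep and clock parameters exist only so the loop can be exercised in tests without real delays.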
cURL Example
curl https://api.platform.docservant.com/v1/run/<your-run-uuid> \
-H "Authorization: Bearer <sk_your_live_api_key>" \
-H "Content-Type: application/json"
Response Example
{
"success": true,
"message": "Get Run",
"data": {
"run": {
"run_id": "<run-uuid>",
"template_id": "<template-uuid>",
"template_name": "Invoice Extraction",
"team_id": "<team-uuid>",
"status": "completed",
"output_mode": "combined",
"store_data": true,
"avg_time_to_process": 9569,
"completed_at": "2026-04-28T10:41:07.097058+00:00",
"created_at": "2026-04-28T10:40:58.000960+00:00",
"updated_at": "2026-04-28T10:41:08.151896+00:00",
"deleted_at": null,
"input_tokens": "Total input tokens for this run",
"output_tokens": "Total output tokens for this run",
"documents": [
{
"document_id": "<document-uuid>",
"name": "invoice.pdf",
"status": "completed",
"pages": 1,
"bucket_key": "uploads/<team-uuid>/<template-uuid>/<run-uuid>/invoice.pdf",
"preview_url": "https://s3.amazonaws.com/...",
"time_to_process": 9569.3,
"input_tokens": "Amount input tokens for this document",
"output_tokens": "Amount output tokens for this document",
"created_at": "2026-04-28T10:40:57.258407+00:00",
"updated_at": "2026-04-28T10:41:06.827715+00:00",
"extracted_data": {
"note": "This object structure depends on your template's defined fields",
"ExampleGroup": {
"Field One": "value",
"Field Two": "value"
}
}
}
],
"extracted_data": [
{
"note": "This array structure depends on your template's defined fields",
"ExampleGroup": {
"Field One": "value",
"Field Two": "value"
}
}
]
}
}
}
For failed runs, the response includes an error field with details:
{
"run_id": "<run-uuid>",
"status": "failed",
"error": {
"code": "EXTRACTION_FAILED",
"message": "Unable to extract data from document. The file may be corrupted or in an unsupported format."
}
}
output_mode
Controls how document extraction results are structured once all documents complete. combined (default) merges all document extractions into a single list accessible as extracted_data at the run level. single keeps each document's own extracted_data without a run-level merge. When omitted from the create request, the run inherits the value from the template.
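Client code can read results the same way for both modes. The sketch below takes the "run" object from a Get Run Status response and returns a list of extraction records. The helper name is hypothetical and the field names follow the response example above.

```python
def extracted_rows(run):
    """Return extraction results for a run, whichever output_mode it used."""
    if run.get("output_mode") == "combined":
        # combined: the merged list lives at the run level.
        return list(run.get("extracted_data") or [])
    # single: each document carries its own extracted_data.
    return [
        doc["extracted_data"]
        for doc in run.get("documents", [])
        if doc.get("extracted_data") is not None
    ]
```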
store_data
Controls whether the original uploaded files are kept in storage after the run completes. When true (default), files are retained and each document includes a signed preview_url. When false, files are deleted from storage once the run reaches a terminal state and preview_url is not returned. Use store_data: false on privacy-sensitive workloads where source documents should be purged automatically after extraction. When omitted, the run inherits the value from the template.
Team
Get Team Details
GET /v1/team
Retrieve details about the authenticated user's team, including plan information and quota usage.
cURL Example
curl https://api.platform.docservant.com/v1/team \
-H "Authorization: Bearer <sk_your_live_api_key>" \
-H "Content-Type: application/json"
Response Example
{
"success": true,
"message": "Successfully get team",
"data": {
"team_id": "<team-uuid>",
"name": "Your Team Name",
"api_access_enabled": true,
"preferred_model": "gpt-5.4-mini",
"is_custom_plan": 0,
"webhook_mask": "wh_xxx***...***xxx",
"created_at": "2026-03-13T13:38:55.387267+00:00",
"updated_at": "2026-04-28T10:41:07.096285+00:00",
"deleted_at": null,
"plan": {
"key": "docservant_platform_standard",
"name": "DocServant Platform - Standard",
"type": "STANDARD",
"token_quota": 500000,
"current_token_usage": 40944,
"last_reported_token_usage": 0,
"overage_allowed": true,
"overage_cost_per_token": 0.00008,
"overage_unit_size": 50000,
"overage_cost_per_unit": 4.0,
"price_monthly": 25,
"recurring": false,
"start_date": "2026-04-06T09:07:56.549326+00:00",
"expiration_date": "2026-05-06T09:07:56.549326+00:00",
"created_at": "2026-04-06T09:07:56.549326+00:00",
"updated_at": "2026-04-06T09:07:56.549326+00:00"
}
}
}
List Team Members
GET /v1/user
Retrieve a paginated list of all members in the authenticated user's team.
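Paginated responses carry a pagination object with current_page and all_pages, so a client can walk every page with a small generator. This is a sketch under one assumption: how a specific page is requested is not documented here, so fetch_page is a caller-supplied function (hypothetical) that returns the "data" object for a given page number.

```python
def iter_items(fetch_page):
    """Yield every item across all pages of a paginated list endpoint."""
    page = 1
    while True:
        data = fetch_page(page)
        yield from data["items"]
        # Stop once the last page reported by the API has been read.
        if page >= data["pagination"]["all_pages"]:
            break
        page += 1
```

The same generator works for any endpoint that returns the items/pagination shape, such as the template list.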
cURL Example
curl https://api.platform.docservant.com/v1/user \
-H "Authorization: Bearer <sk_your_live_api_key>" \
-H "Content-Type: application/json"
Response Example
{
"success": true,
"message": "Successfully get list of users",
"data": {
"items": [
{
"user_id": "<user-uuid>",
"team_id": "<team-uuid>",
"email": "owner@example.com",
"first_name": "Jane",
"last_name": "Doe",
"role": "owner",
"status": "active",
"email_verified": true,
"last_login": "2026-04-29T14:31:05.154539+00:00",
"created_at": "2026-03-13T13:38:56.749083+00:00",
"updated_at": "2026-04-29T14:31:04.435744+00:00",
"deleted_at": null
},
{
"user_id": "<user-uuid>",
"team_id": "<team-uuid>",
"email": "member@example.com",
"first_name": "John",
"last_name": "Smith",
"role": "admin",
"status": "pending",
"email_verified": false,
"last_login": "2026-03-18T08:18:05.714621+00:00",
"created_at": "2026-03-18T08:16:56.736167+00:00",
"updated_at": "2026-03-18T08:18:05.714639+00:00",
"deleted_at": null
}
],
"pagination": {
"current_page": 1,
"all_pages": 1,
"total_items": 2
}
}
}
Template
List Templates
GET /v1/template
Retrieve a paginated list of all extraction templates for your team.
cURL Example
curl https://api.platform.docservant.com/v1/template \
-H "Authorization: Bearer <sk_your_live_api_key>" \
-H "Content-Type: application/json"
Response Example
{
"success": true,
"message": "Successfully get list of jobs",
"data": {
"items": [
{
"template_id": "<template-uuid>",
"team_id": "<team-uuid>",
"name": "Invoice Extraction",
"output_mode": "single",
"prompt_instructions": "Extract all relevant invoice data",
"store_data": false,
"total_runs": 4,
"last_run": "2026-04-28T10:41:08.320130+00:00",
"created_at": "2026-04-28T10:22:49.194772+00:00",
"created_by": "<user-uuid>",
"updated_at": "2026-04-28T13:44:48.346157+00:00",
"updated_by": "<user-uuid>",
"deleted_at": null,
"sheets": [
{
"sheet_id": "<sheet-uuid>",
"name": "Invoice",
"prompt_instructions": "",
"included_in_extraction": true,
"single_object_extraction": false,
"parent_sheet_id": null,
"parent_sheet_name": null,
"parent_sheet_column_id": null,
"parent_sheet_column_name": null,
"created_at": "2026-04-28T10:22:49.193908+00:00",
"created_by": "<user-uuid>",
"updated_at": "2026-04-28T10:22:49.193908+00:00",
"updated_by": "<user-uuid>",
"columns": [
{
"column_id": "<column-uuid>",
"name": "Invoice Number",
"data_type": "string",
"enabled": true,
"prompt_instructions": "",
"enabled_reference_mapping": false,
"reference_data_set_id": null,
"return_data_set_field_id": null,
"created_at": "2026-04-28T10:22:49.193908+00:00",
"created_by": "<user-uuid>",
"updated_at": "2026-04-28T10:22:49.193908+00:00",
"updated_by": "<user-uuid>"
}
],
"children": []
}
]
}
],
"pagination": {
"current_page": 1,
"all_pages": 1,
"total_items": 1
}
}
}
Create a Template
POST /v1/template
Create a new extraction template with sheets and columns defining the structure of data to extract.
Request Fields
| Field | Type | Description |
|---|---|---|
| name* | string | Display name for the template |
| prompt_instructions | string | Global extraction instructions applied to all sheets |
| output_mode* | string | Extraction output mode. Use "single" for most cases |
| store_data* | boolean | Whether to retain uploaded source files after a run completes |
| sheets* | array | Array of sheet objects defining the extraction schema (see below) |
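Nested sheet and column objects are easier to keep consistent when built with small helpers. The sketch below produces a request body in the shape of the cURL example in this section; the helper names are hypothetical, and the keys are taken from that example.

```python
def column(name, instructions="", enabled=True):
    """Describe one column (field) to extract."""
    return {"name": name, "enabled": enabled, "prompt_instructions": instructions}


def sheet(name, columns, instructions="", single_object=False):
    """Describe one sheet: a group of columns extracted together."""
    return {
        "name": name,
        "prompt_instructions": instructions,
        "included_in_extraction": True,
        "single_object_extraction": single_object,
        "columns": list(columns),
        "children": [],
    }


def template_body(name, sheets, output_mode="single", store_data=True, instructions=""):
    """Assemble the JSON body for creating a template."""
    if output_mode not in ("combined", "single"):
        raise ValueError("output_mode must be 'combined' or 'single'")
    return {
        "name": name,
        "prompt_instructions": instructions,
        "output_mode": output_mode,
        "store_data": store_data,
        "sheets": list(sheets),
    }
```

For instance, template_body("Invoices", [sheet("Invoice Header", [column("Invoice Number")], single_object=True)]) yields a body that can be serialized with json.dumps and sent as the request payload.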
cURL Example
curl -X POST https://api.platform.docservant.com/v1/template \
-H "Authorization: Bearer <sk_your_live_api_key>" \
-H "Content-Type: application/json" \
-d '{
"name": "Invoice Data Extraction Template",
"prompt_instructions": "Extract all relevant financial information from invoices with high accuracy",
"output_mode": "single",
"store_data": true,
"sheets": [
{
"name": "Invoice Header",
"prompt_instructions": "Extract the main invoice header information",
"included_in_extraction": true,
"single_object_extraction": true,
"columns": [
{ "name": "Invoice Number", "enabled": true, "prompt_instructions": "Extract the unique invoice number or ID" },
{ "name": "Invoice Date", "enabled": true, "prompt_instructions": "Extract the invoice date in YYYY-MM-DD format" },
{ "name": "Total Amount", "enabled": true, "prompt_instructions": "Extract the total amount due" }
],
"children": []
},
{
"name": "Line Items",
"prompt_instructions": "Extract all individual line items from the invoice",
"included_in_extraction": true,
"single_object_extraction": false,
"columns": [
{ "name": "Item Description", "enabled": true, "prompt_instructions": "Extract the description of the product or service" },
{ "name": "Quantity", "enabled": true, "prompt_instructions": "Extract the quantity ordered" },
{ "name": "Unit Price", "enabled": true, "prompt_instructions": "Extract the price per unit" },
{ "name": "Line Total", "enabled": true, "prompt_instructions": "Extract or calculate the total for this line item" }
],
"children": []
}
]
}'
Response Example
{
"success": true,
"message": "Created Template",
"data": {
"template": {
"template_id": "<template-uuid>",
"team_id": "<team-uuid>",
"name": "Invoice Data Extraction Template",
"output_mode": "single",
"prompt_instructions": "Extract all relevant financial information from invoices with high accuracy",
"store_data": true,
"total_runs": 0,
"last_run": null,
"created_at": "2026-04-29T15:41:45.442903+00:00",
"created_by": "<user-uuid>",
"updated_at": "2026-04-29T15:41:45.442903+00:00",
"updated_by": "<user-uuid>",
"deleted_at": null,
"sheets": [
{
"sheet_id": "<sheet-uuid>",
"name": "Invoice Header",
"prompt_instructions": "Extract the main invoice header information",
"included_in_extraction": true,
"single_object_extraction": true,
"parent_sheet_id": null,
"parent_sheet_name": null,
"parent_sheet_column_id": null,
"parent_sheet_column_name": null,
"created_at": "2026-04-29T15:41:45.441983+00:00",
"created_by": "<user-uuid>",
"updated_at": "2026-04-29T15:41:45.441983+00:00",
"updated_by": "<user-uuid>",
"columns": [
{
"column_id": "<column-uuid>",
"name": "Invoice Number",
"data_type": "string",
"enabled": true,
"prompt_instructions": "Extract the unique invoice number or ID",
"enabled_reference_mapping": false,
"reference_data_set_id": null,
"return_data_set_field_id": null,
"created_at": "2026-04-29T15:41:45.441983+00:00",
"created_by": "<user-uuid>",
"updated_at": "2026-04-29T15:41:45.441983+00:00",
"updated_by": "<user-uuid>"
}
],
"children": []
}
]
}
}
}
Statistics
GET /v1/statistics
Retrieve usage statistics for your team including token consumption, extraction counts, and overage details.
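The quota figures can be cross-checked against the plan object returned by the team endpoint. The sketch below reproduces the arithmetic: with a token_quota of 500000 and current_token_usage of 40944, 459056 tokens remain. Billing overage per token, rather than per started unit of overage_unit_size, is an assumption; the helper name is hypothetical.

```python
def quota_summary(plan):
    """Derive remaining tokens and estimated overage from a plan object."""
    quota = plan["token_quota"]
    used = plan["current_token_usage"]
    tokens_left = max(quota - used, 0)
    overage_tokens = max(used - quota, 0)
    # Assumption: overage is charged per token; actual billing may round
    # up to whole units of overage_unit_size.
    if plan.get("overage_allowed"):
        overage_cost = overage_tokens * plan["overage_cost_per_token"]
    else:
        overage_cost = 0.0
    return {
        "tokens_left": tokens_left,
        "overage_tokens": overage_tokens,
        "overage_cost": overage_cost,
    }
```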
cURL Example
curl https://api.platform.docservant.com/v1/statistics \
-H "Authorization: Bearer <sk_your_live_api_key>" \
-H "Content-Type: application/json"
Response Example
{
"success": true,
"message": "Successfully fetched dashboard statistics",
"data": {
"templates": 10,
"extractions": 39,
"tokens": 40944,
"avg_processing_time_in_seconds": 21.0,
"overage": {
"allowed": true,
"tokens_left": 459056,
"overage_tokens": 0,
"cost_per_token": 0.00008,
"overage_unit_size": 50000,
"overage_cost": 0.0,
"overage_cost_per_unit": 4.0
}
}
}
Webhooks
Webhooks push real-time event notifications to your server whenever something changes — a run is created, a template is updated, a user is removed, etc.
How webhooks work
- A team owner registers a webhook with an HTTPS endpoint URL and an event type.
- When the event occurs, the platform publishes it to AWS EventBridge.
- An EventBridge rule triggers the delivery function, which POSTs the payload to your endpoint.
- The request is signed with HMAC-SHA256 so you can verify it came from DocServant.
Event Types
| Event | Triggered when |
|---|---|
| create.user | A new user is added to the team |
| update.user | A user record is modified |
| delete.user | A user is removed |
| create.template | A new template is created |
| update.template | A template is modified |
| delete.template | A template is deleted |
| create.run | A document run is started |
| update.run | A run is updated |
| delete.run | A run is deleted |
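Event names follow an action.resource pattern, so a receiver can route deliveries with a small dispatch table. The sketch below is illustrative: split_event and dispatch are hypothetical helpers, and the payload is assumed to carry the event name in "event" and the affected resource in "data".

```python
ACTIONS = {"create", "update", "delete"}
RESOURCES = {"user", "template", "run"}


def split_event(event):
    """Split an event name such as 'create.run' into (action, resource)."""
    action, _, resource = event.partition(".")
    if action not in ACTIONS or resource not in RESOURCES:
        raise ValueError(f"unknown event type: {event!r}")
    return action, resource


def dispatch(payload, handlers):
    """Call the handler registered for payload['event'], if there is one."""
    split_event(payload["event"])  # reject names outside the table above
    handler = handlers.get(payload["event"])
    if handler is None:
        return None  # no handler registered for this event
    return handler(payload["data"])
```

Returning None for events without a handler lets a receiver subscribe broadly while acting on only a few event types.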
Payload Format
Every delivery is an HTTP POST with Content-Type: application/json.
{
"event": "create.run",
"team_id": "your-team-uuid",
"data": {
// full object data for the affected resource
}
}
Request Headers
| Header | Description |
|---|---|
| Content-Type | application/json |
| X-Webhook-Signature | sha256=<hex_signature> |
| X-Webhook-Timestamp | Unix timestamp (seconds) of the delivery |
Verifying the Signature
Always verify the signature before processing a webhook. Compute HMAC-SHA256 over {timestamp}.{raw_body} using your team's webhook secret, where timestamp is the value from X-Webhook-Timestamp.
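// Node.js
const crypto = require("crypto");

// Returns true when signatureHeader matches the HMAC-SHA256 of `${timestamp}.${body}`.
function verifyWebhook(secret, timestamp, body, signatureHeader) {
  const message = `${timestamp}.${body}`;
  const expected = crypto
    .createHmac("sha256", secret)
    .update(message, "utf8")
    .digest("hex");
  const received = (signatureHeader || "").replace(/^sha256=/, "");
  const expectedBuf = Buffer.from(expected, "utf8");
  const receivedBuf = Buffer.from(received, "utf8");
  // timingSafeEqual throws if the lengths differ, so check that first.
  if (expectedBuf.length !== receivedBuf.length) return false;
  return crypto.timingSafeEqual(expectedBuf, receivedBuf);
}

# Python
import hashlib
import hmac

def verify_webhook(secret, timestamp, body, signature_header):
    """Return True when signature_header matches the HMAC-SHA256 of "{timestamp}.{body}"."""
    message = f"{timestamp}.{body}"
    expected = hmac.new(
        secret.encode("utf-8"),
        message.encode("utf-8"),
        hashlib.sha256,
    ).hexdigest()
    # str.removeprefix requires Python 3.9+
    received = signature_header.removeprefix("sha256=")
    return hmac.compare_digest(expected, received)

Use a constant-time comparison (timingSafeEqual / compare_digest) to prevent timing attacks.
Delivery Behavior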
- Method: HTTP POST, timeout: 30 seconds
- No automatic retries — failed deliveries are logged but not re-sent
- Each webhook records the last delivery attempt in last_delivery
Security Notes
- HTTPS endpoints only — plaintext HTTP is not accepted
- Validate X-Webhook-Signature on every delivery before trusting the payload
- Rotate the secret if it is ever exposed
- To temporarily stop deliveries, set is_active: false rather than deleting the webhook
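The notes above can be combined into a single check that runs before any payload is trusted. The sketch below verifies the signature over the raw request bytes and then parses the JSON. Rejecting deliveries whose timestamp is more than 300 seconds old is an added safeguard and an assumption, not documented DocServant behavior; verify_and_parse is a hypothetical helper name.

```python
import hashlib
import hmac
import json
import time


def verify_and_parse(secret, timestamp, raw_body, signature_header,
                     tolerance=300, now=None):
    """Verify a webhook delivery and return its parsed JSON payload.

    raw_body must be the unmodified request bytes. Raises ValueError on failure.
    """
    if not signature_header or not signature_header.startswith("sha256="):
        raise ValueError("missing or malformed X-Webhook-Signature")
    try:
        sent_at = int(timestamp)
    except (TypeError, ValueError):
        raise ValueError("missing or malformed X-Webhook-Timestamp")
    # Assumption: a freshness window limits replay of captured deliveries.
    current = time.time() if now is None else now
    if abs(current - sent_at) > tolerance:
        raise ValueError("timestamp outside the accepted window")
    message = str(timestamp).encode("utf-8") + b"." + raw_body
    expected = hmac.new(secret.encode("utf-8"), message, hashlib.sha256).hexdigest()
    received = signature_header[len("sha256="):]
    # Constant-time comparison on bytes, as recommended above.
    if not hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8")):
        raise ValueError("signature mismatch")
    return json.loads(raw_body)
```

Respond with a 401 when ValueError is raised, and only act on the returned payload.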
Error Handling
Runs may fail due to unreadable documents, invalid templates, or processing errors. Clients should handle retries where appropriate.
HTTP Status Codes
| Code | Meaning |
|---|---|
| 200 | Success |
| 400 | Bad request: check your request body or parameters |
| 401 | Unauthorized: invalid or missing API key |
| 403 | Forbidden: key inactive, revoked, expired, or missing required scope |
| 404 | Not found: key, run, or template doesn't exist |
| 429 | Rate limited: too many requests, slow down |
| 500 | Server error: something went wrong on our end |
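Of the codes above, 429 and 500 are the ones where repeating the same request can succeed; the 4xx client errors need a changed request or key instead. The sketch below retries with exponential backoff under that reading. The delay schedule and attempt limit are assumptions, and send is a caller-supplied function (hypothetical) that performs one request and returns a (status_code, body) pair.

```python
import time

# Assumption: only rate limiting and server errors are worth retrying.
RETRYABLE = {429, 500}


def call_with_retries(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status not in RETRYABLE or attempt == max_attempts:
            return status, body
        # Exponential backoff: base_delay, 2x, 4x, ...
        sleep(base_delay * 2 ** (attempt - 1))
```

The final status is returned even when it is still retryable, so the caller decides how to report the failure.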
Examples
Extract Invoices into JSON
Submit an invoice for extraction and fetch the structured JSON output:
# Submit the invoice
curl -X POST https://api.platform.docservant.com/v1/run \
  -H "Authorization: Bearer sk_<key_id>_<secret>" \
  -F "template_id=tpl_invoices" \
  -F "file=@invoice.pdf"
# Response: { "run_id": "run_abc123", "status": "queued" }
# Poll for completion (or use webhooks)
curl https://api.platform.docservant.com/v1/run/run_abc123 \
  -H "Authorization: Bearer sk_<key_id>_<secret>"
# When status is "completed", fetch the JSON output
curl "https://storage.docservant.com/outputs/run_abc123.json?token=..."
Bulk Upload
When you submit multiple documents with the same template, each run produces its own output files. To consolidate results, fetch individual JSON outputs and merge them in your application, or download each XLSX and combine sheets programmatically.
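Merging in your application can be as simple as flattening the per-document results of each completed run into one list. The sketch below assumes each input is the "run" object from a Get Run Status response; merge_runs is a hypothetical helper, and runs that are not completed are skipped.

```python
def merge_runs(runs):
    """Flatten per-document extraction results from several runs into one list."""
    merged = []
    for run in runs:
        if run.get("status") != "completed":
            continue  # failed or unfinished runs have no results to merge
        for doc in run.get("documents", []):
            merged.append({
                "run_id": run["run_id"],
                "document": doc["name"],
                "extracted_data": doc.get("extracted_data"),
            })
    return merged
```

Keeping run_id and the document name on every record makes it possible to trace a merged row back to its source file.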
# Submit multiple documents
for file in invoices/*.pdf; do
  curl -X POST https://api.platform.docservant.com/v1/run \
    -H "Authorization: Bearer sk_<key_id>_<secret>" \
    -F "template_id=tpl_invoices" \
    -F "file=@$file"
done
Handle Webhook Delivery (Node.js)
const express = require('express');
const crypto = require('crypto');

const app = express();
app.use(express.raw({ type: 'application/json' })); // raw body needed for signature

app.post('/webhooks/docservant', (req, res) => {
  const signature = req.headers['x-webhook-signature'] || '';
  const timestamp = req.headers['x-webhook-timestamp'] || '';
  const rawBody = req.body.toString('utf8');

  // Verify signature: HMAC-SHA256 over "{timestamp}.{raw_body}"
  const message = `${timestamp}.${rawBody}`;
  const expected = crypto
    .createHmac('sha256', process.env.WEBHOOK_SECRET)
    .update(message, 'utf8')
    .digest('hex');
  // Compare bare hex digests, so strip the "sha256=" prefix from the header
  const received = signature.replace(/^sha256=/, '');
  const expectedBuf = Buffer.from(expected, 'utf8');
  const receivedBuf = Buffer.from(received, 'utf8');
  // timingSafeEqual throws on a length mismatch, so check lengths first
  if (expectedBuf.length !== receivedBuf.length ||
      !crypto.timingSafeEqual(expectedBuf, receivedBuf)) {
    return res.status(401).send('Invalid signature');
  }

  const { event, team_id, data } = JSON.parse(rawBody);
  if (event === 'create.run') {
    console.log(`New run started for team ${team_id}`, data);
  }
  res.status(200).send('OK');
});