Welcome to the Data Publisher API documentation. This guide is written for developers who need to build serious integrations with the Data Publisher platform — whether you're creating a custom client interface, building workflow automation, or developing a vertical-specific application.
The Data Publisher v2.0 API exposes 123 endpoints across 23 namespaces. More importantly, it exposes the complete operational logic of the platform in three discrete layers: data comes in, it moves through a document engine, and the output goes out through a distribution layer. Understanding that layering is the first step toward building integrations that actually work in production environments.
Base URL: https://app.datapublisher.io/api
Development URL: http://localhost:3001/api
Authentication: JWT Bearer tokens
Data Format: JSON (with multipart/form-data for file uploads)
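As a minimal sketch of how the connection details above fit together, the following builds an authenticated JSON request with Python's standard library. The helper name `build_request` and the token value are illustrative assumptions; only the base URL, the Bearer scheme, and the JSON content type come from the summary above.

```python
import json
import urllib.request

BASE_URL = "https://app.datapublisher.io/api"  # production base URL from this guide


def build_request(path, token, body=None, method=None):
    """Build (but do not send) a JSON request carrying a JWT Bearer token."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(
        BASE_URL + path,
        data=data,
        method=method or ("POST" if data is not None else "GET"),
    )
    request.add_header("Authorization", f"Bearer {token}")
    request.add_header("Accept", "application/json")
    if data is not None:
        request.add_header("Content-Type", "application/json")
    return request


# Hypothetical usage: "YOUR_JWT" is a placeholder, not a real token.
req = build_request("/data-sources", token="YOUR_JWT")
print(req.get_method(), req.full_url)
```

Passing the result to `urllib.request.urlopen` would send it; keeping construction separate from sending makes the headers easy to inspect during development.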
Before writing a single line of integration code, spend time understanding how the platform is organized. The namespaces are not arbitrary — they reflect a deliberate separation of concerns that maps directly onto how document automation workflows actually operate in practice.
Every document automation workflow begins with data that lives somewhere: a spreadsheet maintained in Google Sheets, a customer database on a SQL Server instance, an Excel file on SharePoint, or a CSV exported from a legacy system. The data layer's job is to accept all of those sources and normalize them into a consistent internal format that the document engine can consume without needing to know where the data came from.
Key namespaces: /csv, /google, /microsoft, /sql-server, /data-sources, /sync, /data-sets
This is the core of what Data Publisher does: it takes a Word template with {{variable}} placeholders and a data file, and produces populated documents. The engine handles:
Everything in the document engine is stateless: you give it a template ID and a data file ID, and it gives you documents.
Key namespaces: /documents, /word-templates, /image-library, /sample-library, /domains
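The {{variable}} model described above can be illustrated with a small sketch. This is not the platform's document engine, which operates on Word files; it is a plain-text approximation that assumes placeholder names consist of word characters, and the function names are hypothetical.

```python
import re

# Assumed placeholder syntax: {{Name}}, with optional spaces inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def find_variables(template_text):
    """Return placeholder names in first-seen order, without duplicates."""
    seen = []
    for name in PLACEHOLDER.findall(template_text):
        if name not in seen:
            seen.append(name)
    return seen


def fill(template_text, row):
    """Replace each {{name}} with the matching value from one data row.

    Placeholders with no matching column are left untouched.
    """
    return PLACEHOLDER.sub(
        lambda m: str(row.get(m.group(1), m.group(0))), template_text
    )


text = "Dear {{FirstName}}, your listing at {{Address}} is ready."
print(find_variables(text))  # ['FirstName', 'Address']
print(fill(text, {"FirstName": "Ana", "Address": "12 Oak Street"}))
```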
Generated documents need to get somewhere. In most enterprise workflows, that means email: individual emails with personalized attachments, bulk campaigns with engagement tracking, or packaged exports that feed into downstream systems. The distribution layer handles the full email lifecycle:
Key namespaces: /email-templates, /email-campaigns, /email/track, /email/auth, /email-publishing-exports
A fourth element sits orthogonally across all three layers: the AI namespace (/claude-coop). Unlike most platforms that treat AI as a UI feature, Data Publisher exposes AI capabilities as API endpoints, meaning they can be invoked at any point in any workflow. This enables:
The platform does not have one "mode" — it has three distinct stages, each with its own API surface. Understanding which stage your use case lives in is the prerequisite to knowing which endpoints to use.
The data layer supports four distinct connector types. Each connector normalizes its source into the same internal format — a registered CSV file that can be addressed by ID. Once data enters the platform through any connector, the downstream generation and distribution layers operate identically regardless of source.
/csv
Direct file upload endpoint supporting multipart form-data (files up to 50MB) and JSON payloads (programmatic uploads). On upload, the platform automatically:
Paginated access via GET /api/csv/:id/data with limit/offset parameters, plus server-side filtering (filterColumn/filterValue), keeps reads of large datasets fast.
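The pagination and filtering parameters can be sketched as follows. The query parameter names come from the description above; the response shape is not specified in this guide, so the sketch assumes a caller-supplied `fetch_page` function that returns a plain list of rows.

```python
from urllib.parse import urlencode


def csv_data_path(csv_id, limit, offset, filter_column=None, filter_value=None):
    """Return the path and query string for one page of GET /api/csv/:id/data."""
    params = {"limit": limit, "offset": offset}
    if filter_column is not None and filter_value is not None:
        params["filterColumn"] = filter_column
        params["filterValue"] = filter_value
    return f"/api/csv/{csv_id}/data?{urlencode(params)}"


def iter_rows(fetch_page, csv_id, page_size=100, **filters):
    """Yield rows page by page until a short or empty page is returned.

    `fetch_page` is a hypothetical function that takes a path and returns a
    list of rows; adapt it to the real response envelope.
    """
    offset = 0
    while True:
        rows = fetch_page(csv_data_path(csv_id, page_size, offset, **filters))
        yield from rows
        if len(rows) < page_size:
            break
        offset += page_size


print(csv_data_path("abc", 50, 100))  # /api/csv/abc/data?limit=50&offset=100
```

Stopping on the first short page avoids needing a total-count field, at the cost of one extra request when the row count is an exact multiple of the page size.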
/google
Full OAuth2 connector for Google Sheets API v4. Three-step authorization flow:
GET /auth/start returns authorization URL
Key endpoints: List spreadsheets (GET /spreadsheets), read sheet data (GET /spreadsheets/:id/sheets/:name/data), import to CSV (POST /spreadsheets/:id/sheets/:name/import). First row always treated as headers; all data imported as strings.
/microsoft
Parallel OAuth2 connector for Microsoft Graph API (OneDrive/SharePoint). Requests three scopes: Files.Read.All, offline_access, User.Read. Architectural mirror of Google Sheets connector with transparent token refresh.
Uses OneDrive file IDs (not URLs) for workbook addressing. Returns both ID and webUrl for display. Supports .xlsx, .xlsm, .xlsb formats (not legacy .xls). Worksheet names are URL-encoded to handle spaces/special characters.
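Worksheet names often contain spaces or punctuation, so they need percent-encoding before being placed in a URL path. The Microsoft connector's endpoint paths are not listed in this section, so the sketch below shows only the encoding step; the helper name is an assumption.

```python
from urllib.parse import quote


def encode_path_segment(name):
    """Percent-encode a value so it is safe inside a single URL path segment."""
    # safe="" also encodes "/", which would otherwise split the segment in two.
    return quote(name, safe="")


print(encode_path_segment("Q3 Listings & Leads"))  # Q3%20Listings%20%26%20Leads
```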
/sql-server
Enterprise database connector for operational data sources. Supports connection strings or server/database/credentials configuration. Platform validates connections at setup time and enforces:
Execution via POST /query (returns typed JSON) or POST /query/import (routes directly to CSV file).
/sync
Scheduled sync layer for all four connectors. Configure frequency (hourly, daily, weekly) to automatically refresh target CSV files from source data. Sync history (GET /schedules/:id/history) records row counts, change deltas (added/updated/deleted), and error messages for debugging.
The Data Sources API (/api/data-sources) provides a unified view across all connector types: a single GET / returns every source with its type, status, and last sync. Batch connection testing via POST /test-all supports health monitoring dashboards.
/data-sets and /domains
Data Sets (/api/data-sets): Group multiple related CSV files with foreign key relationship definitions for complex documents requiring joins across tables.
Domains (/api/domains): Pre-configured vertical-specific Data Sets with relationships and sample data. Available domains: real estate (Properties, Agents, PropertyPhotos), e-commerce (Products, Categories, Specifications), healthcare, finance, HR, education. Clone to user account via POST /domains/:domainId/clone for instant setup.
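To illustrate what a foreign key relationship between two CSV files means in practice, here is a small in-memory join between Properties and Agents. The column names (`PropertyId`, `AgentId`, `AgentName`) and the sample rows are invented for illustration, and this guide does not specify how the platform itself resolves joins.

```python
import csv
import io

# Invented sample data; the real domain's columns may differ.
properties_csv = """PropertyId,Address,AgentId
1,12 Oak Street,A1
2,48 Pine Avenue,A2
"""
agents_csv = """AgentId,AgentName
A1,Dana Reyes
A2,Sam Okafor
"""


def read_rows(text):
    """Parse CSV text into a list of dictionaries keyed by header."""
    return list(csv.DictReader(io.StringIO(text)))


def join_on_foreign_key(children, parents, key):
    """Attach each parent's columns to the child rows that reference it."""
    parents_by_key = {row[key]: row for row in parents}
    joined = []
    for child in children:
        parent = parents_by_key.get(child[key], {})
        joined.append({**parent, **child})  # child values win on name clashes
    return joined


rows = join_on_foreign_key(read_rows(properties_csv), read_rows(agents_csv), "AgentId")
print(rows[0]["Address"], "-", rows[0]["AgentName"])  # 12 Oak Street - Dana Reyes
```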
Two complementary APIs manage templates depending on integration context:
/documents
Primary template management layer handling:
POST /upload with multipart form-data
{{variable}} placeholders; status field progresses from processing to completed
variables array in response defines exact field names expected from data file (basis for field mapping UIs)
GET /api/documents/:id/content returns paragraph structure with variable occurrence counts (useful for previews and detecting breaking changes)
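A field mapping UI built on the variables array typically needs to compare the template's expected names against the columns of a data file. The sketch below assumes exact, case-sensitive matching, in line with the "exact field names" wording above; the function name and sample values are hypothetical.

```python
def check_field_mapping(template_variables, csv_headers):
    """Compare a template's variables with the columns of a data file."""
    variables = set(template_variables)
    headers = set(csv_headers)
    return {
        "missing_in_data": sorted(variables - headers),
        "unused_columns": sorted(headers - variables),
    }


report = check_field_mapping(
    ["FirstName", "Address", "Price"],
    ["FirstName", "Address", "AgentEmail"],
)
print(report)
# {'missing_in_data': ['Price'], 'unused_columns': ['AgentEmail']}
```

Running the same comparison against a previously stored variables list is one way to flag a template edit that would break an existing integration.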
/word-templates
Adds OOXML capabilities for Office.js integration:
GET /api/word-templates/:id/ooxml returns raw Office Open XML for insertion into open Word documents via Office.js (Word.run() + body.insertOoxml())
POST /from-document accepts OOXML from currently open Word document and stores as template
/image-library
Centralized visual asset management:
/sample-library
Onboarding acceleration through pre-built templates:
POST /samples/:id/copy clones to user account
The distribution layer manages the complete email workflow from authentication through engagement analytics.
/email/auth
All email sending routes through Microsoft Graph API. OAuth2 flow requests three scopes:
Token management is transparent — platform automatically refreshes expired access tokens (1-hour lifetime) before send operations. Developer code never touches token state.
/email-templates
Accept Word documents and convert to email-optimized HTML:
htmlContent field in response provides exact rendered output for previews
/email-campaigns
11-endpoint lifecycle for bulk sending with personalization:
POST / configures campaign (data file, email template, recipient field, subject line, attachment settings, tracking preferences, test mode)
POST /:id/send starts asynchronous processing (returns jobId and estimated duration)
GET /:id/status returns progress %, sent/failed/remaining counts, per-recipient errors (poll every 10-30 seconds; ~30 emails/min send rate due to Graph throttling)
POST /:id/pause and POST /:id/resume for operational safety (correct data mid-campaign without losing progress)
POST /:id/cancel stops execution permanently
GET /:id/analytics returns aggregate engagement metrics post-completion
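Because sending is asynchronous, a client usually wraps the status endpoint in a polling loop. The sketch below is one way to do that under stated assumptions: `get_status` is a caller-supplied function wrapping GET /:id/status, and the `"status"` key and its terminal values are guesses, since the response schema is not given here. The duration estimate uses the ~30 emails/min figure quoted above.

```python
import math
import time


def estimated_minutes(recipient_count, emails_per_minute=30):
    """Rough send time at the ~30 emails/min rate quoted above."""
    return math.ceil(recipient_count / emails_per_minute)


def wait_for_campaign(get_status, interval_seconds=15, sleep=time.sleep):
    """Poll until the campaign reaches a terminal state; return the last status.

    The "status" key and the terminal values are assumptions, not documented
    fields. `sleep` is injectable so the loop can be tested without waiting.
    """
    terminal = {"completed", "cancelled", "failed"}
    while True:
        status = get_status()
        if status.get("status") in terminal:
            return status
        sleep(interval_seconds)


print(estimated_minutes(500))  # 17
```

An interval inside the recommended 10-30 second window keeps the UI responsive without adding load; a production loop would also want an overall timeout.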
Personalized Attachments: When attachmentTemplateId + generateAttachments: true specified, platform generates custom PDF/DOCX for each recipient using their row data (500 recipients → 500 unique attached documents).
/email/track
Public endpoints (no authentication, <10ms response time):
GET /open/:trackingId returns 1×1 transparent GIF (async DB update after response)
GET /click/:trackingId?redirect=URL records click and redirects (async update)
POST /reply/:trackingId integrates with Power Automate flows monitoring Outlook inbox
POST /reply-attachment/:trackingId stores uploaded documents from respondents (contract collection workflows)
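The open and click endpoints above imply a particular URL shape inside each outgoing email. The platform presumably generates these links itself when tracking is enabled; the sketch below only shows why the redirect target must be percent-encoded. The mount point `/api/email/track` under the production base URL is an assumption, as are the helper names.

```python
from urllib.parse import quote, urlencode

# Assumed mount point: the tracking namespace under the production base URL.
TRACK_BASE = "https://app.datapublisher.io/api/email/track"


def click_tracking_url(tracking_id, destination):
    """Wrap a destination link in a click-tracking redirect URL."""
    query = urlencode({"redirect": destination})
    return f"{TRACK_BASE}/click/{quote(tracking_id, safe='')}?{query}"


def open_pixel_tag(tracking_id):
    """Return an <img> tag that requests the 1x1 open-tracking GIF."""
    src = f"{TRACK_BASE}/open/{quote(tracking_id, safe='')}"
    return f'<img src="{src}" width="1" height="1" alt="">'


print(click_tracking_url("abc123", "https://example.com/listing?id=7"))
```

Without encoding, the destination's own `?id=7` would be parsed as part of the tracking URL's query string rather than as part of the redirect target.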
Authenticated tracking endpoints:
GET /status/:trackingId returns open count, click count, timestamps, IP addresses, user agents
GET /campaign/:campaignId returns aggregate stats, top-links breakdown, engagement timeline
/email-publishing-exports
Alternative distribution for non-email channels (client portals, print vendors, downstream systems):
Workflow: Create configuration (POST /), execute generation (POST /:id/execute), poll status (GET /:id/status), download ZIP (GET /:id/download).
ZIP Structure: Generated documents at root with consistent naming, static attachments in subdirectory (150 records → 150 individually named files in single archive).
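A downstream consumer of the export archive can rely on the layout described above to separate generated documents from static attachments. The sketch below treats any entry inside a folder as an attachment; the file and folder names in the demo archive are invented, since the actual naming scheme is not specified here.

```python
import io
import zipfile


def split_export_archive(zip_bytes):
    """Separate root-level documents from files stored in subdirectories."""
    documents, attachments = [], []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            target = attachments if "/" in info.filename else documents
            target.append(info.filename)
    return documents, attachments


# Build a tiny in-memory archive with invented names to exercise the helper.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as demo:
    demo.writestr("Brochure_001.pdf", b"doc 1")
    demo.writestr("Brochure_002.pdf", b"doc 2")
    demo.writestr("attachments/terms.pdf", b"static file")

docs, extras = split_export_archive(buffer.getvalue())
print(docs)    # ['Brochure_001.pdf', 'Brochure_002.pdf']
print(extras)  # ['attachments/terms.pdf']
```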
/claude-coop
Unlike platforms that treat AI as UI-only features (chat windows, help buttons), Data Publisher exposes AI capabilities as API endpoints — invokable at any point in any workflow.
Two endpoints:
Integration Opportunities:
GET /:id/analytics data to /analyze for human-readable insights
/suggest for specific optimization recommendations
AI participates as a collaborative workflow layer, not a bolt-on feature — intelligence native to the document automation domain.
The following demonstrates a complete real estate integration using five API namespaces to create a branded property document generator with automated data refresh and engagement tracking.
Registration Flow (email-gated):
POST /api/auth/request-code → sends 6-digit code to email
POST /api/auth/register → exchanges code for JWT (7-day expiry)
Renew the token by calling POST /api/auth/login before expiry
Automated Provisioning: Call request-code and register programmatically for bulk user setup. 14-day trial activates automatically on registration.
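With a 7-day token lifetime, a long-running integration needs to know when to log in again. One approach, sketched below, reads the standard `exp` claim from the JWT payload. Whether the platform's tokens carry `exp` is an assumption; the helper names and the one-day safety margin are illustrative. The token is decoded without signature verification, which is acceptable only for scheduling a renewal, never for trusting its contents.

```python
import base64
import json
import time


def jwt_expiry(token):
    """Read the `exp` claim (seconds since epoch) from a JWT without verifying it."""
    payload_b64 = token.split(".")[1]
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)  # restore stripped padding
    claims = json.loads(base64.urlsafe_b64decode(padded))
    return claims.get("exp")


def should_renew(token, now=None, margin_seconds=24 * 3600):
    """True when the token expires within `margin_seconds` (default: one day)."""
    exp = jwt_expiry(token)
    if exp is None:
        return True  # no expiry claim found: be conservative and renew
    now = time.time() if now is None else now
    return exp - now <= margin_seconds
```

A scheduler could call `should_renew` before each batch of requests and trigger the login call when it returns `True`.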
Clone pre-configured real estate data structure:
POST /api/domains/real-estate/clone
{ "includeSampleData": true }
Result: Receives IDs for:
Replace sample data with real property records to make templates operational.
Connect daily-updated Google Sheet:
GET /api/google/auth/start → get OAuth URL
POST /api/sync/schedules → configure daily sync at 6 AM targeting Properties CSV
Properties data auto-refreshes every morning without manual exports.
Create export configuration for PDF generation:
POST /api/email-publishing-exports
{
  "csvId": "<csv-id>",
  "templateId": "<template-id>",
  "format": "pdf",
  "emailField": "AgentEmail"
}
POST /:id/execute // Start generation
GET /:id/status // Poll until "completed"
GET /:id/download // Download ZIP
Result: ZIP contains one PDF per property with data from joined tables (Properties + Agents + PropertyPhotos).
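The four export calls above can be chained into one helper. This is a sketch under stated assumptions: `call(method, path, body=None)` is a caller-supplied HTTP function that returns parsed JSON (or raw bytes for the download), and the `"id"` and `"status"` response keys and the `"failed"` status value are guesses, since the response schema is not shown here. Only the paths and the `"completed"` status come from the steps above.

```python
import time


def run_export(call, config, poll_seconds=10, sleep=time.sleep):
    """Create, execute, poll, and download an export; returns the ZIP bytes."""
    base = "/api/email-publishing-exports"
    export_id = call("POST", base, config)["id"]  # assumed response key
    call("POST", f"{base}/{export_id}/execute")
    while True:
        status = call("GET", f"{base}/{export_id}/status")["status"]
        if status == "completed":
            break
        if status == "failed":  # assumed failure value
            raise RuntimeError(f"export {export_id} failed")
        sleep(poll_seconds)
    return call("GET", f"{base}/{export_id}/download")
```

Injecting `call` and `sleep` keeps the control flow testable without a live server and lets the same helper run against the development URL.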
Create and execute campaign with personalized attachments:
POST /api/email-campaigns
{
  "dataFileId": "<data-file-id>",
  "templateId": "<template-id>",
  "emailField": "ProspectEmail",
  "attachmentTemplateId": "<attachment-template-id>",
  "trackOpens": true,
  "trackClicks": true
}
POST /:id/send // Initiate (async)
GET /:id/status // Monitor progress
GET /:id/analytics // Post-completion engagement metrics
Reply tracking via Power Automate captures prospect responses and stores attachments (signed offers) retrievable through tracking API.
Data In (Google Sheets via sync) → Document Engine (Property Brochure + Data Set joins) → Distribution (email campaign with attachment generation and tracking)
Five API namespaces. One coherent workflow.
The platform's API layer supports complete white-labeling:
Control what capabilities your interface exposes:
/claude-coop endpoints provide AI capabilities most vertical automation solutions can't match:
- Data Sources API (/data-sources) + Data Sync (/sync): Define how fresh data is
- Campaign Lifecycle Endpoints: Async send pattern requires interface design thinking different from synchronous APIs
- /claude-coop Prototyping: Demonstrating AI capabilities changes prospects' expectations
---
| Layer | Key Namespaces | What It Does |
|-------|---------------|--------------|
| Data Layer | /csv, /google, /microsoft, /sql-server, /data-sources, /sync, /data-sets | Connects data from any source; normalizes to CSV for the engine |
| Document Engine | /documents, /word-templates, /image-library, /sample-library, /domains | Converts templates + data into generated documents |
| Distribution | /email-templates, /email-campaigns, /email/track, /email/auth, /email-publishing-exports | Sends, tracks, and exports generated documents |
| AI Layer | /claude-coop | Analyze results and suggest improvements via natural language |
| Platform | /auth, /users | Account management, JWT lifecycle, subscription status |