Why agents + data rooms
A data room workflow comes down to four steps: provisioning, sharing, watching, and concluding access to a piece of sensitive content. Every step is a tool call, and every step leaves an audit trail. That makes data rooms unusually well-shaped for autonomous workflows.
- Provision. Create the room, attach files, organize folders
- Share. Mint links with the right gating per recipient
- Watch. Pull view analytics, flag interesting visitors
- Conclude. Revoke links, archive the room, export the audit log
The agent loop
Most real-world agents built around data rooms follow the same loop:
┌─────────────┐   ┌─────────────┐   ┌────────────┐
│   Trigger   │ → │  Reasoning  │ → │ Tool call  │
│   (event)   │   │    (LLM)    │   │ (API/MCP)  │
└─────────────┘   └─────────────┘   └────────────┘
       ↑                                   │
       └────────── Observation ←───────────┘
- Trigger: webhook from CRM, cron, inbound email
- Reasoning: Claude / GPT / Gemini, with system prompt + memory
- Tool call: Papermark via MCP (preferred) or direct REST
- Observation: API response or analytics events feed the next turn
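The loop above can be sketched in a few lines. This is a minimal illustration, not Papermark's implementation: `Reasoner`, `ToolRunner`, `runAgent`, and the `maxTurns` budget are hypothetical names, with the LLM call and the tool transport passed in as plain functions.

```typescript
// Hypothetical shapes: none of these names come from the Papermark SDK.
type ToolCall = { name: string; input: Record<string, unknown> };
type Decision =
  | { kind: "call"; call: ToolCall }
  | { kind: "done"; summary: string };

// The reasoner stands in for the LLM; the runner stands in for MCP or REST.
type Reasoner = (trigger: string, observations: string[]) => Promise<Decision>;
type ToolRunner = (call: ToolCall) => Promise<string>;

// Runs trigger -> reasoning -> tool call -> observation until the reasoner
// reports it is done or the turn budget runs out.
async function runAgent(
  trigger: string,
  reason: Reasoner,
  runTool: ToolRunner,
  maxTurns = 8,
): Promise<{ summary: string; observations: string[] }> {
  const observations: string[] = [];
  for (let turn = 0; turn < maxTurns; turn++) {
    const decision = await reason(trigger, observations);
    if (decision.kind === "done") {
      return { summary: decision.summary, observations };
    }
    // Each tool result becomes an observation for the next reasoning turn.
    observations.push(await runTool(decision.call));
  }
  throw new Error(`agent did not finish within ${maxTurns} turns`);
}
```

The turn budget is a design choice rather than a requirement: it keeps a confused model from calling tools indefinitely.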
Tools: MCP vs. function calling
There are two ways for an LLM to call the Papermark API, and they are not mutually exclusive.
MCP (recommended for MCP-native hosts)
Use @papermark/mcp-server when the host (Claude Desktop, Claude Code, Cursor, Zed) speaks MCP natively. Zero wiring on your end.
Native function calling
Use direct tool schemas when you embed the agent in your own product (your own loop, your own model client). Generate schemas from the OpenAPI spec:
import Anthropic from "@anthropic-ai/sdk";
import { Papermark } from "@papermark/sdk";
import { toolSchemas } from "@papermark/sdk/tools"; // OpenAPI → tool defs

const client = new Anthropic();
const pm = new Papermark({ token: process.env.PAPERMARK_TOKEN! });

const result = await client.messages.create({
  model: "claude-opus-4-7",
  max_tokens: 4096,
  tools: toolSchemas, // 43 tools
  messages: [
    { role: "user", content: "Create a dataroom for 'Acme. Series B' and upload deck.pdf." },
  ],
});

// Loop: dispatch tool_use blocks to pm.*, send tool_result back, repeat until stop.
Scoping & safety
An agent token should carry the narrowest scope that still lets the agent do its job. Three rules:
- One token per agent role. The "analytics digest" agent gets analytics.readonly. The "provisioning" agent gets writes.
- No *.delete unless required. Most workflows can soft-delete via update with an archived flag.
- Audit trail is mandatory. Every API call carries request_id. Log them. Replay them. Diff them weekly.
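One way to enforce these rules inside your own loop is a small gate in front of every tool call. The sketch below is an assumption-heavy illustration: the role names, the tool-to-scope mapping, and the `authorize` helper are hypothetical, and the scope strings are borrowed from the patterns in this guide rather than from a confirmed scope list.

```typescript
// Hypothetical role table: one narrow scope set per agent role.
const ROLE_SCOPES: Record<string, readonly string[]> = {
  "analytics-digest": ["analytics.read", "visitors.read"],
  provisioning: ["datarooms.write", "documents.write", "links.write"],
};

// Hypothetical mapping from tool name to the single scope it needs.
// Note there is no delete tool here, so delete calls are always refused.
const TOOL_SCOPE: Record<string, string> = {
  list_visitors: "visitors.read",
  get_view_analytics: "analytics.read",
  create_dataroom: "datarooms.write",
  upload_document: "documents.write",
  create_link: "links.write",
};

type AuditEntry = { requestId: string; role: string; tool: string; allowed: boolean };

// Checks the role's scopes and records the decision, allowed or not,
// so the audit log can be replayed and diffed later.
function authorize(role: string, tool: string, requestId: string, log: AuditEntry[]): boolean {
  const required = TOOL_SCOPE[tool];
  const granted = ROLE_SCOPES[role] ?? [];
  // Unknown tools are denied: no scope in the table could cover them.
  const allowed = required !== undefined && granted.includes(required);
  log.push({ requestId, role, tool, allowed });
  return allowed;
}
```

Denied calls are logged alongside allowed ones on purpose: a weekly diff of refusals is often the first sign that a prompt or a role definition has drifted.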
Patterns
Pattern 1 · Inbound-deal provisioning
CRM event → agent reads contact context → creates dataroom → uploads template kit → mints recipient-scoped link → emails it.
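A minimal sketch of this flow with the human approval gate in place. Everything here is hypothetical: `DealDeps` stands in for the Papermark tools, the Slack approval step, and your mailer, and none of its method signatures come from a real SDK.

```typescript
// Hypothetical dependencies, injected so the flow can be tested with fakes.
interface DealDeps {
  createDataroom(name: string): Promise<string>; // resolves to a room id
  uploadDocument(roomId: string, file: string): Promise<void>;
  createLink(roomId: string, recipient: string): Promise<string>; // resolves to a URL
  requestApproval(summary: string): Promise<boolean>; // e.g. a Slack button
  sendEmail(to: string, body: string): Promise<void>;
}

type DealEvent = { company: string; contactEmail: string; templateFiles: string[] };

async function provisionDeal(event: DealEvent, deps: DealDeps): Promise<"sent" | "held"> {
  const roomId = await deps.createDataroom(event.company);
  for (const file of event.templateFiles) {
    await deps.uploadDocument(roomId, file);
  }
  const url = await deps.createLink(roomId, event.contactEmail);

  // Guardrail: the room and link exist, but nothing is sent without approval.
  const approved = await deps.requestApproval(`Send ${url} to ${event.contactEmail}?`);
  if (!approved) return "held";

  await deps.sendEmail(event.contactEmail, `Your data room: ${url}`);
  return "sent";
}
```

Placing the gate after provisioning but before the email keeps the reversible steps automatic and reserves human attention for the one step that cannot be undone.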
Trigger: CRM "Deal Stage = DD" webhook
Tools: create_dataroom, upload_document, create_link
Scopes: datarooms.write, documents.write, links.write
Guardrails: human approval before send (Slack interactive message)
Pattern 2 · Engagement watcher
Cron → list yesterday's views → cluster by visitor → flag deep-engagement signals → write to your CRM.
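The cluster-and-flag step can be sketched as two pure functions. The `View` shape, the field names, and the thresholds are assumptions for illustration; the real analytics payload and a sensible definition of "deep engagement" will differ per team.

```typescript
// Hypothetical view record; the real analytics payload may differ.
type View = { visitorEmail: string; documentId: string; durationSec: number };
type VisitorSummary = { visitorEmail: string; views: number; totalSec: number; documents: number };

// Groups raw views by visitor and totals view count, time, and distinct documents.
function summarizeByVisitor(views: View[]): VisitorSummary[] {
  const byVisitor = new Map<string, { views: number; totalSec: number; docs: Set<string> }>();
  for (const v of views) {
    const entry = byVisitor.get(v.visitorEmail) ?? { views: 0, totalSec: 0, docs: new Set<string>() };
    entry.views += 1;
    entry.totalSec += v.durationSec;
    entry.docs.add(v.documentId);
    byVisitor.set(v.visitorEmail, entry);
  }
  return Array.from(byVisitor, ([visitorEmail, e]) => ({
    visitorEmail,
    views: e.views,
    totalSec: e.totalSec,
    documents: e.docs.size,
  }));
}

// Assumed heuristic: repeat views or long total reading time count as deep engagement.
function flagDeepEngagement(
  summaries: VisitorSummary[],
  minViews = 3,
  minTotalSec = 300,
): VisitorSummary[] {
  return summaries.filter((s) => s.views >= minViews || s.totalSec >= minTotalSec);
}
```

Keeping the heuristic separate from the aggregation makes it easy to hand the summaries to the LLM instead, if you prefer the model to decide who looks interesting.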
Trigger: cron 0 9 * * *
Tools: list_visitors, list_visitor_views, get_view_analytics
Scopes: analytics.read, visitors.read
Output: Slack digest + CRM enrichment
Pattern 3 · Expiry janitor
Daily sweep: revoke expired links, archive idle rooms, regenerate signed links for ongoing deals.
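A sketch of the revoke step with dry-run as the default. The `Link` shape, the `expiresAt` field, and the `sweepExpiredLinks` helper are hypothetical; the `revoke` callback stands in for whichever tool call you use to revoke a link.

```typescript
// Hypothetical link record; field names are assumptions, not the API's.
type Link = { id: string; expiresAt: string | null }; // ISO 8601, or null for no expiry

type SweepResult = { wouldRevoke: string[]; revoked: string[] };

async function sweepExpiredLinks(
  links: Link[],
  revoke: (id: string) => Promise<void>,
  opts: { now: Date; apply?: boolean },
): Promise<SweepResult> {
  // A link is expired when it has an expiry at or before the sweep time.
  const expired = links
    .filter((l) => l.expiresAt !== null && new Date(l.expiresAt).getTime() <= opts.now.getTime())
    .map((l) => l.id);

  const revoked: string[] = [];
  // Dry run is the default: only revoke when apply is explicitly true.
  if (opts.apply === true) {
    for (const id of expired) {
      await revoke(id);
      revoked.push(id);
    }
  }
  return { wouldRevoke: expired, revoked };
}
```

Passing `now` in, rather than reading the clock inside the function, keeps the sweep deterministic and lets you replay yesterday's run against today's data.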
Trigger: cron 0 3 * * *
Tools: list_links, delete_link, update_dataroom
Scopes: links.write, datarooms.write
Guardrails: dry-run mode default; --apply to commit
Reference architecture
┌──────────────┐
│   Trigger    │   Webhook · cron · email · Slack · CRM
└──────┬───────┘
       │
┌──────▼───────┐
│ Orchestrator │   LangChain · Inngest · Trigger.dev · custom
└──────┬───────┘
       │
┌──────▼───────┐  ┌──────────────────┐
│   LLM call   │←→│      Tools       │   via MCP or function-calling
│              │  │   (Papermark)    │
└──────┬───────┘  └──────────────────┘
       │
┌──────▼───────┐
│ Side effects │   Email · Slack · CRM update · DB write
└──────────────┘