API Documentation
Complete REST API reference for the self-hosted promptShield engine. All endpoints are available at your deployment URL.
Base URL
http://localhost:8000
Replace with your deployment host and port.
Interactive API Docs
When running the server, visit /docs for an auto-generated interactive Swagger UI where you can test every endpoint directly.
PROMPTSHIELD_DEBUG_API_DOCS=1 promptshield serve
In API-key mode, /docs, /openapi.json, and /redoc require Authorization: Bearer <api-key>.
Authentication
When PROMPTSHIELD_API_KEY is set (via promptshield serve --api-key or the Docker env var), every request must carry it as Authorization: Bearer <api-key>; missing or wrong keys get 401, and a key scoped to only some operations gets 403 on the rest. The /health endpoint is always exempt. If no API key is set, the server runs unauthenticated and accepts all requests — only appropriate on a trusted local network.
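As a minimal sketch of the rule above, the snippet below attaches the bearer header with Python's standard library and maps the two authentication failures to readable messages. The helper names authed_request and explain_auth_status are illustrative only and are not part of promptShield; the request is built but never sent.

```python
import urllib.request


def authed_request(url, api_key=None, method="GET"):
    """Build a request; attach the bearer header only when a key is given."""
    req = urllib.request.Request(url, method=method)
    if api_key:
        req.add_header("Authorization", f"Bearer {api_key}")
    return req


def explain_auth_status(status):
    """Translate the auth-related status codes described above."""
    if status == 401:
        return "missing or wrong API key"
    if status == 403:
        return "API key is not scoped for this operation"
    return "not an authentication error"


# Hypothetical usage: build (but do not send) an authenticated request.
req = authed_request("http://localhost:8000/api/documents", "ps_live_your_key_here")
print(req.get_header("Authorization"))
```

To send the request, pass it to urllib.request.urlopen and catch urllib.error.HTTPError, whose code attribute carries the status.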
Quick Start
Get the local API running in under 5 minutes.
Install the package
pip install promptshield-app
Install a language model
export PROMPTSHIELD_ASSETS_DIR="$HOME/.local/share/promptshield/assets"
promptshield models install spacy-en-md
Set your API key
export PROMPTSHIELD_API_KEY="ps_live_your_key_here"
Your API key is available in the API Keys dashboard.
Start the server
promptshield serve --port 8000
Process your first document
AUTH="Authorization: Bearer ps_live_your_key_here"
curl -X POST http://localhost:8000/api/documents/upload \
-H "$AUTH" \
-F "file=@contract.pdf"
# → {"doc_id": "abc123", "filename": "contract.pdf", "page_count": 5, "status": "REVIEWING"}
curl -X POST http://localhost:8000/api/documents/abc123/detect \
-H "$AUTH" \
-H "Content-Type: application/json" \
-d '{"confidence_threshold": 0.7}'
# → {"doc_id": "abc123", "total_regions": 42, "regions": [...]}
curl -X POST http://localhost:8000/api/documents/abc123/anonymize \
-H "$AUTH"
# → {"doc_id": "abc123", "tokens_created": 42, "regions_removed": 0, "output_path": "..."}
curl -o contract-safe.pdf \
-H "$AUTH" \
http://localhost:8000/api/documents/abc123/download/pdf
Or run with Docker (zero install)
docker run -e PROMPTSHIELD_API_KEY="ps_live_your_key_here" \
-p 8000:8000 -v promptshield-data:/data \
promptshield/promptshield-api:latest
Step-by-Step Guide
A comprehensive walkthrough to set up the local API and integrate it into your workflow.
1. Prerequisites
Before you begin, make sure you have the following installed:
Optional, for full format support:
2. Install the SDK
Install the promptShield Python package. The base install handles PDF files. Add the office extra for Word/Excel/PowerPoint.
pip install promptshield-app
pip install "promptshield-app[office]"
Install language models (NER packs) via the CLI:
# Set the install directory (add to your shell profile)
export PROMPTSHIELD_ASSETS_DIR="$HOME/.local/share/promptshield/assets"
# Windows: $env:PROMPTSHIELD_ASSETS_DIR = "$env:APPDATA\promptshield\assets"
# English pack — required (medium is recommended)
promptshield models install spacy-en-md
# For maximum accuracy (larger download)
promptshield models install spacy-en-lg
Install additional language packs the same way:
promptshield models install spacy-fr-md # French
promptshield models install spacy-de-md # German
promptshield models install spacy-es-md # Spanish
promptshield models install spacy-it-md # Italian
promptshield models install spacy-nl-md # Dutch
promptshield models install spacy-pt-md # Portuguese
3. Authentication
The local API uses your promptShield API key for license validation. Set it as an environment variable or pass it when starting the server.
export PROMPTSHIELD_API_KEY="ps_live_your_key_here"
# Windows: $env:PROMPTSHIELD_API_KEY = "ps_live_your_key_here"
The API key is validated online on startup and then cached locally for up to 35 days, so the server can run fully offline after the first launch.
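For planning air-gapped deployments, the 35-day figure can be turned into a concrete date. The sketch below is only date arithmetic on the number quoted above; it does not reproduce promptShield's actual cache logic, which this page does not document, and because the text says "up to" 35 days the real window may be shorter. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

CACHE_DAYS = 35  # "up to 35 days", taken from the text above


def offline_deadline(last_validated):
    """Latest moment a validation performed at last_validated could still apply."""
    return last_validated + timedelta(days=CACHE_DAYS)


def needs_online_check(last_validated, now):
    """True once the assumed 35-day window has elapsed."""
    return now > offline_deadline(last_validated)


validated = datetime(2026, 1, 1, tzinfo=timezone.utc)
print(offline_deadline(validated).date())  # 2026-02-05
```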
4. Start the API Server
Launch the local server with a single command:
promptshield serve
# Starting promptShield API server on 127.0.0.1:8000
# Health endpoint: http://127.0.0.1:8000/health
# Docs: enable with PROMPTSHIELD_DEBUG_API_DOCS=1 in non-API-key mode
Available options:
| --host | Bind address (default: 127.0.0.1) |
| --port | Bind port (default: 8000) |
| --api-key | API key (alternative to env var) |
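Once started, the API is available at the displayed URL. Interactive Swagger docs are served at /docs: in non-API-key mode, enable them with PROMPTSHIELD_DEBUG_API_DOCS=1; in API-key mode, send the Authorization: Bearer header.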
5. Docker Deployment (Alternative)
For production deployments, Docker provides a zero-install experience with all dependencies bundled.
docker run -d --name promptshield \
-e PROMPTSHIELD_API_KEY="ps_live_your_key_here" \
-p 8000:8000 \
-v promptshield-data:/data \
promptshield/promptshield-api:latest
# docker-compose.yml
services:
  promptshield:
    image: promptshield/promptshield-api:latest
    ports:
      - "8000:8000"
    volumes:
      - promptshield-data:/data
    environment:
      - PROMPTSHIELD_API_KEY=${PROMPTSHIELD_API_KEY}
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
volumes:
  promptshield-data:
Create a .env file with your API key:
PROMPTSHIELD_API_KEY=ps_live_your_key_here
The Docker image includes:
6. API Workflow
The standard document processing workflow has 5 steps:
Full workflow example (cURL)
# 1. Upload
AUTH="Authorization: Bearer ps_live_your_key_here"
DOC_ID=$(curl -s -X POST http://localhost:8000/api/documents/upload \
-H "$AUTH" \
-F "file=@contract.pdf" | python -c "import sys,json; print(json.load(sys.stdin)['doc_id'])")
echo "Document ID: $DOC_ID"
# 2. Detect PII (with optional inline overrides)
curl -X POST http://localhost:8000/api/documents/$DOC_ID/detect \
-H "$AUTH" \
-H "Content-Type: application/json" \
-d '{"confidence_threshold": 0.7, "regex_enabled": true, "ner_enabled": true}'
# → {"doc_id": "...", "total_regions": 42, "regions": [...]}
# 3. Review detected regions
curl -H "$AUTH" http://localhost:8000/api/documents/$DOC_ID/regions | python -m json.tool
# → [{"id": "...", "pii_type": "PERSON", "text": "John Smith", "confidence": 0.98,
# "action": "TOKENIZE", ...}, ...]
# 3b. Bulk-update region actions before anonymizing (region_ids from step 3)
curl -X PUT http://localhost:8000/api/documents/$DOC_ID/regions/batch-action \
-H "$AUTH" \
-H "Content-Type: application/json" \
-d '{"region_ids": ["<id1>", "<id2>"], "action": "REMOVE"}'
# 4. Anonymize
curl -X POST http://localhost:8000/api/documents/$DOC_ID/anonymize \
-H "$AUTH"
# → {"doc_id": "...", "tokens_created": 42, "regions_removed": 0}
# 5. Download the anonymized document
curl -o contract-safe.pdf \
-H "$AUTH" \
http://localhost:8000/api/documents/$DOC_ID/download/pdf
echo "Done! Anonymized file saved to contract-safe.pdf"
7. CLI Commands
Process documents directly from the command line without starting a server:
promptshield detect invoice.pdf
# Detected 23 PII entities across 5 pages
promptshield detect invoice.pdf -o report.json
# Report saved to report.json
# Tune the pipeline per run — every desktop-app setting is also a CLI flag.
promptshield detect contract.pdf \
--language en --countries US \
--ner-backend Davlan/bert-base-multilingual-cased-ner-hrl \
--no-llm --fuzziness 0.5 \
--ner-types PERSON,ORG \
-e SA-2026-0847 -e "84-329-1057" \
--export-review -o contract.psreview
# Or load all of the above from demo/setup.json by locale:
promptshield detect contract.pdf --preset en --export-review -o contract.psreview
promptshield anonymize contract.pdf -o contract-safe.pdf
# Anonymized: 42 tokens created, saved to contract-safe.pdf
# Permanently redact PII (black boxes, irreversible)
promptshield anonymize contract.pdf --redact -o contract-redacted.pdf
promptshield detokenize contract-safe.pdf -o contract-restored.pdf
# Restored 42 tokens, saved to contract-restored.pdf
All CLI commands validate your API key and process documents locally. No data leaves your machine.
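When the CLI is driven from a script, it can help to assemble the argument list in code rather than by string concatenation. The sketch below builds a promptshield detect invocation using only flags that appear in the examples above (--language, --countries, -e, --export-review, -o); the helper name detect_command is hypothetical, and the command is printed rather than executed.

```python
import shlex


def detect_command(path, output, language=None, countries=None,
                   expressions=(), export_review=False):
    """Assemble argv for `promptshield detect` from the flags shown above."""
    cmd = ["promptshield", "detect", path]
    if language:
        cmd += ["--language", language]
    if countries:
        cmd += ["--countries", countries]  # passed through verbatim, e.g. "US"
    for expression in expressions:
        cmd += ["-e", expression]  # -e is repeatable
    if export_review:
        cmd.append("--export-review")
    cmd += ["-o", output]
    return cmd


cmd = detect_command("contract.pdf", "contract.psreview", language="en",
                     countries="US", expressions=["SA-2026-0847", "84-329-1057"],
                     export_review=True)
print(shlex.join(cmd))
```

Passing the list to subprocess.run(cmd, check=True) avoids shell quoting problems, which matters once expressions contain regex characters.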
Model Add-ons
Use the built-in models command to browse, install, and remove NER language packs. Set PROMPTSHIELD_ASSETS_DIR so the CLI and running server both find them.
export PROMPTSHIELD_ASSETS_DIR="$HOME/.local/share/promptshield/assets"
# Windows: $env:PROMPTSHIELD_ASSETS_DIR = "$env:APPDATA\promptshield\assets"
promptshield models list
# Language packs (NER)
# [ ] spacy-en-sm English (small) (12 MB)
# [installed] spacy-en-md English (medium) (33 MB)
# [ ] spacy-fr-md French (medium) (16 MB) ...
promptshield models install spacy-fr-md
# Installing spacy-fr-md (French (medium))...
# Downloading... 16.0/16.0 MB (100%)
# Verifying integrity...
# Extracting...
# Installed to ~/.local/share/promptshield/assets/spacy/fr_core_news_md
promptshield models installed
# Installed models:
# spacy-en-md English (medium)
# spacy-fr-md French (medium)
promptshield models remove spacy-fr-md
Installed packs are picked up automatically by detect, anonymize, and the running API server. The install command talks directly to the licensing server, so no running sidecar is needed.
Custom expressions
Built-in regex and NER cover most PII, but every organisation has its own identifiers: contract numbers, internal project codes, tax IDs, confidential flags. Add them as expressions and they are treated as first-class PII, flagged as CUSTOM type with full bounding boxes and propagation across pages. Available on the CLI (-e / --expression), on the self-hosted HTTP API (expressions; the legacy blacklist_terms key still works), and via demo/setup.json presets.
# Literal text — case-insensitive substring match
promptshield detect contract.pdf -e SA-2026-0847 -e "84-329-1057"
# Repeatable — pass as many -e flags as you need
promptshield detect contract.pdf \
-e ACME-CONFIDENTIAL \
-e "Internal Use Only" \
-e CUST-12345
# Regex — prefix with "re:" for full regex semantics
promptshield detect contract.pdf \
-e "re:CASE-\d{4}-[A-Z]{3}" \
-e "re:Project [A-Z][a-z]+"
# Same feature on the self-hosted HTTP API.
curl -X POST http://localhost:8000/api/documents/$DOC_ID/detect \
-H "$AUTH" \
-H "Content-Type: application/json" \
-d '{
"expressions": ["SA-2026-0847", "84-329-1057"],
"expressions_action": "tokenize",
"expressions_fuzziness": 1.0
}'
# expressions_action: "none" | "tokenize" | "remove"
# expressions_fuzziness: 1.0 = exact match, 0.5 = fuzzy edit-distance match
#
# Back-compat: the legacy keys "blacklist_terms" / "blacklist_action" /
# "blacklist_fuzziness" still work and map to the same fields via
# Pydantic alias, so existing integrations don't need changes.
# demo/setup.json
{
  "en": {
    "language": "en",
    "countries": ["US"],
    "ner_backend": "Davlan/bert-base-multilingual-cased-ner-hrl",
    "expressions": ["SA-2026-0847", "84-329-1057"],
    "fuzziness": 0.5
  }
}
# Then any time:
promptshield detect contract.pdf --preset en --export-review -o contract.psreview
On naming: expressions is now the canonical name across all three surfaces: the CLI (-e / --expression), the HTTP API (expressions / expressions_action / expressions_fuzziness), and the desktop UI label. The earlier blacklist_terms / blacklist_action / blacklist_fuzziness keys on the HTTP API still work via a Pydantic alias for back-compat, so existing integrations keep working.
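To preview what an expression is likely to catch before running detection, the two documented matching modes can be approximated locally. The sketch below is an illustration under stated assumptions, not promptShield's matcher: plain expressions are treated as case-insensitive literal substrings and "re:" expressions as regular expressions, as described above. Fuzzy matching (fuzziness below 1.0) and cross-page propagation are not modelled, and treating "re:" patterns as case-sensitive is an assumption. The function name expression_matches is hypothetical.

```python
import re


def expression_matches(expression, text):
    """Return the substrings of text that the expression would match.

    Plain expression: case-insensitive literal substring.
    "re:" prefix: the remainder is used as a regular expression.
    """
    if expression.startswith("re:"):
        pattern = re.compile(expression[3:])
    else:
        pattern = re.compile(re.escape(expression), re.IGNORECASE)
    return [m.group(0) for m in pattern.finditer(text)]


text = "Ref sa-2026-0847 relates to CASE-2024-ABC under Project Falcon."
print(expression_matches("SA-2026-0847", text))             # ['sa-2026-0847']
print(expression_matches(r"re:CASE-\d{4}-[A-Z]{3}", text))  # ['CASE-2024-ABC']
print(expression_matches("re:Project [A-Z][a-z]+", text))   # ['Project Falcon']
```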
Review Workflow: API → Desktop
Automated detection at scale is only half the story. Auto-detection never achieves perfect precision or recall: there are always false positives to dismiss and missed entities to add. promptShield closes the loop with a portable .psreview bundle that lets you detect programmatically, import into the desktop app for visual human QA, then anonymize.
Step 1a — Detect and export (CLI, one-shot)
# Detect + bundle in one command
promptshield detect contract.pdf --export-review -o contract.psreview
# Ingesting contract.pdf...
# Detecting PII on page 1/3...
# Detecting PII on page 2/3...
# Detecting PII on page 3/3...
# Detected 14 PII region(s) across 3 page(s) in 1240ms.
# Review bundle saved to contract.psreview (14 region(s))
Step 1b — Detect and export (HTTP API)
# 1. Upload document
curl -X POST http://localhost:8000/api/documents/upload \
-H "Authorization: Bearer $PROMPTSHIELD_API_KEY" \
-F "file=@contract.pdf"
# → {"doc_id": "abc123ef", "page_count": 3, ...}
# 2. Run detection
curl -X POST http://localhost:8000/api/documents/abc123ef/detect \
-H "Authorization: Bearer $PROMPTSHIELD_API_KEY"
# 3. Download the review bundle
curl http://localhost:8000/api/documents/abc123ef/export-review \
-H "Authorization: Bearer $PROMPTSHIELD_API_KEY" \
-o contract.psreview
.psreview API
/api/documents/{id}/export-review
Export review bundle
Download the document and all its region annotations as a portable .psreview ZIP bundle.
/api/documents/import-review
Import review bundle
Upload a .psreview bundle. Extracts the document (runs OCR), injects the saved region annotations, and opens the document in REVIEWING status — ready for visual QA without re-running detection.
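Because a .psreview bundle is a ZIP archive, its contents can be inspected with standard tooling before import. The sketch below lists the entries with Python's zipfile module. The bundle's internal layout is not documented here, so the entry names in the demo (document.pdf, regions.json) are made up purely to keep the example self-contained.

```python
import io
import zipfile


def list_bundle(bundle):
    """Return (name, size in bytes) for each entry in a .psreview ZIP bundle."""
    with zipfile.ZipFile(bundle) as zf:
        return [(info.filename, info.file_size) for info in zf.infolist()]


# Stand-in bundle built in memory; the entry names are hypothetical.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("document.pdf", b"%PDF-1.7")
    zf.writestr("regions.json", "[]")
buf.seek(0)
print(list_bundle(buf))
```

For a real bundle, pass the file path instead, for example list_bundle("contract.psreview").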
Step 2 — Import into the desktop app
Open the promptShield desktop app. On the upload screen, click Import for Review and select the .psreview file. The document opens immediately in review mode — all detected regions are pre-populated, no detection step needed.
Step 3 — Review, edit, and anonymize
Use the region sidebar to dismiss false positives (set action to Cancel), draw new regions for anything the model missed, then click Export. The desktop app runs anonymization and saves the redacted PDF to your Downloads folder.
8. Python SDK (Direct Import)
For advanced integrations, import the core modules directly in your Python code. This gives you full control over every processing step.
from promptshield import PromptShield

ps = PromptShield()

def process(input_path: str, output_path: str):
    # Step 1: detect PII (runs the full local pipeline)
    result = ps.detect(input_path)
    print(f"Loaded: {result.file} ({result.pages} pages)")
    print(f"Total PII entities: {result.total_regions}")
    for page in result.regions:
        for entity in page["entities"]:
            print(f"  Page {page['page']}: [{entity['type']}] {entity['text']}")
    # Step 2: anonymize (reversible tokens by default; pass redact=True
    # for permanent removal). Output is always a PDF.
    out = ps.anonymize(input_path, output_path)
    print(f"Anonymized: {out.tokens_created} tokens, saved to {out.output_file}")
    # Step 3: restore the originals later from the local token vault
    restored = ps.detokenize(output_path, "contract-restored.pdf")
    print(f"Restored {restored.tokens_replaced} tokens")

process("contract.pdf", "contract-safe.pdf")
# Async variants are available too: ps.adetect / ps.aanonymize / ps.adetokenize
Batch processing example
Process multiple files in a directory:
from pathlib import Path
from promptshield import PromptShield

ps = PromptShield()

def batch_process(input_dir: str, output_dir: str):
    input_path = Path(input_dir)
    output_path = Path(output_dir)
    output_path.mkdir(parents=True, exist_ok=True)
    files = list(input_path.glob("*.pdf"))
    print(f"Processing {len(files)} files...")
    for i, file in enumerate(files, 1):
        print(f"[{i}/{len(files)}] {file.name}")
        out_file = output_path / f"anon_{file.stem}.pdf"
        result = ps.anonymize(str(file), str(out_file))
        print(f"  → {result.tokens_created} tokens, saved to {out_file.name}")
    print("Batch complete!")

batch_process("./documents", "./anonymized")
9. Security & Data Privacy
promptShield is designed for zero-trust environments:
10. Troubleshooting
Common issues and solutions:
'No spaCy model found' or 'no NER model loaded' error
Set PROMPTSHIELD_ASSETS_DIR and install a pack: promptshield models install spacy-en-md. Alternatively for a system-wide install: python -m spacy download en_core_web_sm
OCR not working on scanned PDFs
Install Tesseract OCR and ensure it's in your PATH. On Windows, the installer adds it automatically.
Office files fail to convert
Install LibreOffice. DOCX has a pure-Python fallback, but XLSX/PPTX require LibreOffice.
'Invalid API key' on startup
Verify your key starts with ps_live_ and that your subscription is active. Keys can be managed at /developers/api-keys.
Documents
/api/documents/upload
Upload Document
Upload a document for processing. Supports PDF, DOCX, XLSX, PPTX, and image files.
/api/documents
List Documents
Retrieve all uploaded documents with their detection and anonymization status.
/api/documents/{id}
Get Document
Get detailed information about a specific document including page count and detection status.
/api/documents/{id}
Delete Document
Remove a document and all associated data from the server.
Detection
/api/documents/{id}/detect
Detect PII
Run the PII detection pipeline on a document. Accepts an optional JSON body to override detection settings (confidence_threshold, regex_enabled, ner_enabled, regex_types, etc.) for this run only.
/api/documents/{id}/redetect
Re-detect PII
Re-run detection with full parameter control. Accepts confidence_threshold, regex_types, ner_types, expressions (legacy name: blacklist_terms), and more.
/api/documents/{id}/detection-progress
Detection Progress
Get the progress of a running detection job (percentage, current page, and status).
/api/documents/{id}/regions
Get Regions
Retrieve all detected PII regions for a document with type, text, confidence score, and bounding box.
/api/documents/{id}/regions/batch-action
Batch Update Regions
Set the action (TOKENIZE, REMOVE, or CANCEL) for multiple regions at once, filtered by entity type.
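The optional JSON body for the detect endpoint can be assembled and sanity-checked before it is sent. The sketch below uses only field names that appear elsewhere on this page (confidence_threshold, expressions, expressions_action, expressions_fuzziness) and rejects an action outside the three documented values. The helper name build_detect_body is hypothetical, and the server remains the authority on validation.

```python
import json

ALLOWED_ACTIONS = {"none", "tokenize", "remove"}  # values listed for expressions_action


def build_detect_body(confidence_threshold=None, expressions=(),
                      action="tokenize", fuzziness=1.0):
    """Build the JSON body for a detect call; omit fields that were not requested."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"expressions_action must be one of {sorted(ALLOWED_ACTIONS)}")
    body = {}
    if confidence_threshold is not None:
        body["confidence_threshold"] = confidence_threshold
    if expressions:
        body["expressions"] = list(expressions)
        body["expressions_action"] = action
        body["expressions_fuzziness"] = fuzziness
    return body


body = build_detect_body(0.7, ["SA-2026-0847", "84-329-1057"])
print(json.dumps(body, indent=2))
```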
Anonymize & Export
/api/documents/{id}/regions/sync
Sync Regions
Replace a document's region list entirely. Use after editing regions client-side to push changes before anonymizing.
/api/documents/{id}/anonymize
Anonymize Document
Anonymize a document by replacing detected PII with codes. Returns a JSON summary (tokens_created, regions_removed, output_path); fetch the file itself from the download endpoint.
/api/documents/{id}/download/{type}
Download Anonymized
Download the anonymized output. {type} is 'pdf' or 'text'. Output is always rendered as PDF (non-PDF inputs are converted), so 'pdf' is the primary format.
/api/documents/batch-anonymize
Batch Anonymize
Anonymize multiple documents in a single request. Up to 50 documents processed concurrently.
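The detect, batch-action, anonymize, and download endpoints chain into one workflow for an already uploaded document. The sketch below builds those requests in order with Python's standard library, using the HTTP methods shown in the cURL workflow earlier on this page. The upload step is left out because it needs a multipart body, and nothing is sent: the helper names json_request and workflow_requests are hypothetical.

```python
import json
import urllib.request


def json_request(method, url, api_key, payload=None):
    """Build an authenticated request with an optional JSON body."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    req = urllib.request.Request(url, data=data, method=method)
    req.add_header("Authorization", f"Bearer {api_key}")
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req


def workflow_requests(base_url, doc_id, api_key, remove_ids=()):
    """Ordered requests: detect, optional batch-action, anonymize, download."""
    doc = f"{base_url}/api/documents/{doc_id}"
    steps = [json_request("POST", f"{doc}/detect", api_key,
                          {"confidence_threshold": 0.7})]
    if remove_ids:
        steps.append(json_request("PUT", f"{doc}/regions/batch-action", api_key,
                                  {"region_ids": list(remove_ids), "action": "REMOVE"}))
    steps.append(json_request("POST", f"{doc}/anonymize", api_key))
    steps.append(json_request("GET", f"{doc}/download/pdf", api_key))
    return steps


reqs = workflow_requests("http://localhost:8000", "abc123",
                         "ps_live_your_key_here", remove_ids=["r1"])
for req in reqs:
    print(req.get_method(), req.full_url)
```

Each request can then be sent in sequence with urllib.request.urlopen(req), writing the body of the final response to disk.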
Decode
/api/detokenize
Decode Text
Replace codes in a text string with their original values from the vault.
/api/detokenize/file
Decode File
Upload an encoded document and receive a version with all codes restored to original values.
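Conceptually, decoding is a lookup from each code to its stored original. The sketch below illustrates that idea on a plain string with an in-memory dictionary standing in for the vault. It is not the server's implementation, and the code format used here (PERSON_1, PERSON_12) is invented for the example; the real format is not specified on this page.

```python
import re


def detokenize_text(text, vault):
    """Replace every known code in text with its original value."""
    if not vault:
        return text
    # Try longer codes first so "PERSON_1" cannot match inside "PERSON_12".
    codes = sorted(vault, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(code) for code in codes))
    return pattern.sub(lambda m: vault[m.group(0)], text)


vault = {"PERSON_1": "John Smith", "PERSON_12": "Ada Lovelace"}  # hypothetical codes
print(detokenize_text("PERSON_12 met PERSON_1.", vault))  # Ada Lovelace met John Smith.
```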
Vault & Code Registry
/api/vault/status
Vault Status
Get vault statistics: total codes, total documents, registry size.
/api/vault/tokens
List Codes
Retrieve all code-to-original mappings stored in the vault.
/api/vault/export
Export Vault
Export the entire vault as a JSON file for backup or migration.
/api/vault/import
Import Vault
Import a previously exported vault JSON to restore code mappings.
Health
/health
Health Check
Returns server health status. Use for monitoring and load balancer health probes.
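A startup script can wait for the server by polling this endpoint. The sketch below separates the probe from the retry loop so the loop can be exercised without a running server. It assumes a healthy server answers /health with HTTP 200 and does not inspect the response body, whose shape is not documented here; both function names are hypothetical.

```python
import time
import urllib.error
import urllib.request


def http_probe(url, timeout=2.0):
    """Return True when a GET to url answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until_healthy(probe, attempts=10, delay=1.0, sleep=time.sleep):
    """Call probe() until it returns True or the attempts run out."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay)  # no pause after the final attempt
    return False
```

Typical usage would be wait_until_healthy(lambda: http_probe("http://localhost:8000/health")) before sending the first upload.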
Settings
/api/settings
Get Settings
Retrieve the current detection and processing configuration (thresholds, enabled pipelines, language, etc.).
/api/settings
Update Settings
Partially update the global detection settings. Changes apply to all subsequent detection runs.
/api/settings/patterns
List Custom Patterns
Get all user-defined regex patterns used for custom PII detection.
/api/settings/patterns
Create Custom Pattern
Add a new custom regex pattern for PII detection. Patterns are evaluated alongside built-in rules.
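Before registering a custom pattern, it is worth checking locally that it compiles and behaves as intended on a few samples. The sketch below does that with Python's re module. It assumes the server accepts Python-compatible regex syntax, which this page does not state, and the request body for creating a pattern is not shown here, so only the local check is sketched. The helper name check_pattern is hypothetical.

```python
import re


def check_pattern(pattern, should_match=(), should_not_match=()):
    """Compile the pattern and list samples that behave unexpectedly."""
    compiled = re.compile(pattern)  # raises re.error on invalid syntax
    problems = []
    for sample in should_match:
        if not compiled.search(sample):
            problems.append(("missed", sample))
    for sample in should_not_match:
        if compiled.search(sample):
            problems.append(("false positive", sample))
    return problems


print(check_pattern(r"CASE-\d{4}-[A-Z]{3}",
                    should_match=["CASE-2024-ABC"],
                    should_not_match=["CASE-24-AB"]))  # []
```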
Example
# 1. Upload a document
curl -X POST http://localhost:8000/api/documents/upload \
-F "file=@contract.pdf"
# → {"doc_id": "abc123", "filename": "contract.pdf", "page_count": 5, "status": "REVIEWING"}
# 2. Run PII detection
curl -X POST http://localhost:8000/api/documents/abc123/detect
# → {"doc_id": "abc123", "total_regions": 42, "regions": [...]}
# 3. Get detected regions
curl http://localhost:8000/api/documents/abc123/regions
# → [{"id": "...", "pii_type": "PERSON", "text": "John Smith", "confidence": 0.98,
# "page_number": 1, "bbox": {...}, "source": "NER", "action": "TOKENIZE"}, ...]
# 4. Anonymize the document
curl -X POST http://localhost:8000/api/documents/abc123/anonymize
# → {"doc_id": "abc123", "tokens_created": 42, "regions_removed": 0}
# 5. Download anonymized PDF
curl -o contract-safe.pdf \
http://localhost:8000/api/documents/abc123/download/pdf
Python SDK
The PromptShield class runs the full pipeline locally — detect, anonymize, and detokenize in three calls. All processing happens on your machine; no data leaves it.
from promptshield import PromptShield

ps = PromptShield()
# Detect PII
result = ps.detect("contract.pdf")
print(f"{result.total_regions} entities across {result.pages} pages")
for page in result.regions:
    for entity in page["entities"]:
        print(f"  [{entity['type']}] {entity['text']}")
# Anonymize (reversible tokenization)
out = ps.anonymize("contract.pdf", "contract-safe.pdf")
print(f"Done: {out.tokens_created} tokens created")
# Restore the original values
restored = ps.detokenize("contract-safe.pdf", "contract-restored.pdf")
print(f"Restored {restored.tokens_replaced} tokens")
Error Handling
All error responses follow a standard JSON format with a detail field describing the error.
Bad request. Invalid parameters or a corrupt / unsupported file.
Unauthorized. Missing or invalid API key (Authorization: Bearer) in API-key mode.
Forbidden. The API key lacks the required scope (detect / anonymize / detokenize).
Not found. The document ID does not exist.
Unprocessable. A required field is missing, or the file is password-protected.
Server error. Check the server logs for details.
A required detection model is not installed. Install it with promptshield models install.
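Since every error carries a detail field, client code can surface it uniformly. The sketch below extracts that field from a response body and formats it with the status code. It relies only on the JSON shape described in this section, tolerates bodies that are not valid JSON, and uses hypothetical helper names.

```python
import json


def error_detail(body):
    """Return the "detail" field of an error body, or None if it is absent."""
    try:
        parsed = json.loads(body)
    except (TypeError, ValueError):
        return None
    return parsed.get("detail") if isinstance(parsed, dict) else None


def describe_error(status, body):
    """Format a status code and error body as one readable line."""
    detail = error_detail(body)
    return f"HTTP {status}: {detail if detail is not None else 'no detail provided'}"


print(describe_error(404, '{"detail": "Document not found: abc123"}'))
# HTTP 404: Document not found: abc123
```

With urllib, the inputs come from a caught urllib.error.HTTPError: its code attribute is the status and read() returns the body.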
{
  "detail": "Document not found: abc123"
}
Rate Limits
The self-hosted API has no rate limits by default. You can configure rate limiting via environment variables if needed.