API Documentation

Complete REST API reference for the self-hosted PromptShield engine. All endpoints are available at your deployment URL.

Base URL

http://localhost:8000

Replace with your deployment host and port.

Interactive API Docs

While the server is running, visit /docs for an auto-generated, interactive Swagger UI where you can test every endpoint directly.

http://localhost:8000/docs

Authentication

The self-hosted API runs on your infrastructure. Authentication is optional and configurable via environment variables. By default, the API accepts all local requests.

Quick Start

Get the local API running in under 5 minutes.

Install the package

bash

pip install promptshield-app

Download a language model

bash

python -m spacy download en_core_web_sm

Set your API key

bash

export PROMPTSHIELD_API_KEY="ps_live_your_key_here"

Your API key is available in the API Keys dashboard.

Start the server

bash

promptshield serve --port 8000

Process your first document

bash

curl -X POST http://localhost:8000/api/documents/upload \
  -F "file=@contract.pdf"
# → {"id": "abc123", "filename": "contract.pdf", "pages": 5}

curl -X POST http://localhost:8000/api/documents/abc123/detect
# → {"status": "completed", "regions_count": 42}

curl -X POST http://localhost:8000/api/documents/abc123/anonymize
# → {"status": "completed", "tokens_created": 42}

curl -o contract-safe.pdf \
  http://localhost:8000/api/documents/abc123/download/pdf

Or run with Docker (zero install)

bash

docker run -e PROMPTSHIELD_API_KEY="ps_live_your_key_here" \
  -p 8000:8000 -v promptshield-data:/data \
  promptshield/promptshield-api:latest

Step-by-Step Guide

A comprehensive walkthrough to set up the local API and integrate it into your workflow.

1. Prerequisites

Before you begin, make sure you have the following installed:

Python 3.11 or newer

pip (included with Python)

A PromptShield API key (Professional or Business plan)

Optional — for full format support:

Tesseract OCR — for scanned/image-based PDFs

LibreOffice — for .docx, .xlsx, .pptx conversion

2. Install the SDK

Install the PromptShield Python package. The base install handles PDF files. Add the office extra for Word/Excel/PowerPoint.

Base install (PDF only)

bash

pip install promptshield-app

With Office format support

bash

pip install "promptshield-app[office]"

Download at least one spaCy NLP model:

bash

# English (required — at least one model)
python -m spacy download en_core_web_sm

# For better accuracy (larger model)
python -m spacy download en_core_web_lg

For other languages, add the matching model:

bash

python -m spacy download fr_core_news_sm   # French
python -m spacy download de_core_news_sm   # German
python -m spacy download es_core_news_sm   # Spanish
python -m spacy download it_core_news_sm   # Italian
python -m spacy download nl_core_news_sm   # Dutch
python -m spacy download pt_core_news_sm   # Portuguese

3. Authentication

The local API uses your PromptShield API key for license validation. Set it as an environment variable or pass it when starting the server.

macOS / Linux

bash

export PROMPTSHIELD_API_KEY="ps_live_your_key_here"

Windows (PowerShell)

powershell

$env:PROMPTSHIELD_API_KEY = "ps_live_your_key_here"

The API key is validated online on startup and then cached locally for up to 35 days, so the server can run fully offline after the first launch.
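The caching window described above can be reasoned about with a little date arithmetic. The sketch below is only an illustration: it assumes the 35-day figure acts as a hard time-to-live counted from the last successful online validation, and the function names are hypothetical rather than part of the PromptShield package.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the 35-day figure from the text is a hard time-to-live.
CACHE_TTL = timedelta(days=35)


def cache_is_fresh(validated_at: datetime, now: datetime) -> bool:
    """Return True while a cached validation is still inside the TTL window."""
    return now - validated_at <= CACHE_TTL


def days_remaining(validated_at: datetime, now: datetime) -> int:
    """Whole days left before the cached validation would expire (never negative)."""
    remaining = (validated_at + CACHE_TTL) - now
    return max(remaining.days, 0)


# Hypothetical dates, used only to show the calculation.
last_validated = datetime(2024, 1, 1, tzinfo=timezone.utc)
today = datetime(2024, 1, 31, tzinfo=timezone.utc)
print(cache_is_fresh(last_validated, today), days_remaining(last_validated, today))
```

A check like this can help plan how often an otherwise offline host needs network access to revalidate.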

4. Start the API Server

Launch the local server with a single command:

bash

promptshield serve
# Starting PromptShield API server on 127.0.0.1:8000
# API docs available at http://127.0.0.1:8000/docs

Available options:

--host	Bind address (default: 127.0.0.1)
--port	Bind port (default: 8000)
--api-key	API key (alternative to env var)

Once started, the API is available at the displayed URL. Interactive Swagger docs are at /docs.

5. Docker Deployment (Alternative)

For production deployments, Docker provides a zero-install experience with all dependencies bundled.

Single container

bash

docker run -d --name promptshield \
  -e PROMPTSHIELD_API_KEY="ps_live_your_key_here" \
  -p 8000:8000 \
  -v promptshield-data:/data \
  promptshield/promptshield-api:latest

With Docker Compose

yaml

# docker-compose.yml
services:
  promptshield:
    image: promptshield/promptshield-api:latest
    ports:
      - "8000:8000"
    volumes:
      - promptshield-data:/data
    environment:
      - PROMPTSHIELD_API_KEY=${PROMPTSHIELD_API_KEY}
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/api/health"]
      interval: 30s

volumes:
  promptshield-data:

Create a .env file next to docker-compose.yml so that Compose can substitute ${PROMPTSHIELD_API_KEY}:

.env

bash

PROMPTSHIELD_API_KEY=ps_live_your_key_here

The Docker image includes:

Tesseract OCR with 7 language packs (EN, FR, DE, ES, IT, NL, PT)

LibreOffice for Office format conversion

spaCy NLP models pre-installed

Non-root user, health checks, persistent data volume

6. API Workflow

The standard document processing workflow has 5 steps:

1. Upload: Send your document to the server

2. Detect: Run the PII detection pipeline (regex + NER + optional LLM)

3. Review: Inspect detected regions, edit or confirm

4. Anonymize: Replace PII with codes

5. Download: Retrieve the anonymized document

Full workflow example (cURL)

cURL

bash

# 1. Upload
DOC_ID=$(curl -s -X POST http://localhost:8000/api/documents/upload \
  -F "file=@contract.pdf" | python -c "import sys,json; print(json.load(sys.stdin)['id'])")

echo "Document ID: $DOC_ID"

# 2. Detect PII
curl -X POST http://localhost:8000/api/documents/$DOC_ID/detect
# → {"status": "completed", "regions_count": 42}

# 3. Review detected regions
curl http://localhost:8000/api/documents/$DOC_ID/regions | python -m json.tool
# → [{"type": "PERSON", "text": "John Smith", "score": 0.98}, ...]

# 4. Anonymize
curl -X POST http://localhost:8000/api/documents/$DOC_ID/anonymize
# → {"status": "completed", "tokens_created": 42}

# 5. Download the anonymized document
curl -o contract-safe.pdf \
  http://localhost:8000/api/documents/$DOC_ID/download/pdf

echo "Done! Anonymized file saved to contract-safe.pdf"
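The same workflow can be driven from Python without extra dependencies. The following is a minimal sketch using only the standard library; it assumes the server accepts a multipart form field named file (as in the cURL upload) and returns an id field in the upload response. The helper names (encode_multipart, step_paths, call, anonymize_file) are hypothetical and not part of the PromptShield package.

```python
import json
import urllib.request
import uuid
from pathlib import Path
from typing import Optional

BASE_URL = "http://localhost:8000"  # same base URL as the cURL example


def encode_multipart(field: str, filename: str, data: bytes) -> tuple[bytes, str]:
    """Build a multipart/form-data body with one file part.

    Returns the encoded body and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def step_paths(doc_id: str) -> dict[str, tuple[str, str]]:
    """Map steps 2 to 5 to (HTTP method, path), using endpoints from this reference."""
    prefix = f"/api/documents/{doc_id}"
    return {
        "detect": ("POST", f"{prefix}/detect"),
        "regions": ("GET", f"{prefix}/regions"),
        "anonymize": ("POST", f"{prefix}/anonymize"),
        "download": ("GET", f"{prefix}/download/pdf"),
    }


def call(method: str, path: str, body: Optional[bytes] = None,
         content_type: Optional[str] = None) -> bytes:
    """Send one request to the local server and return the raw response body."""
    headers = {"Content-Type": content_type} if content_type else {}
    req = urllib.request.Request(BASE_URL + path, data=body, method=method, headers=headers)
    with urllib.request.urlopen(req) as resp:
        return resp.read()


def anonymize_file(src: str, dst: str) -> None:
    """Upload src, detect, anonymize, and save the anonymized PDF to dst."""
    body, ctype = encode_multipart("file", Path(src).name, Path(src).read_bytes())
    doc_id = json.loads(call("POST", "/api/documents/upload", body, ctype))["id"]
    paths = step_paths(doc_id)
    call(*paths["detect"])
    call(*paths["anonymize"])
    Path(dst).write_bytes(call(*paths["download"]))
```

With the server running, anonymize_file("contract.pdf", "contract-safe.pdf") would mirror the cURL script above. The review step is left out here; step_paths also exposes the regions endpoint for it.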

7. CLI Commands

Process documents directly from the command line without starting a server:

Detect PII and save a JSON report

bash

promptshield detect invoice.pdf
# Detected 23 PII entities across 5 pages

promptshield detect invoice.pdf -o report.json
# Report saved to report.json

Anonymize a document in one step

bash

promptshield anonymize contract.pdf -o contract-safe.pdf
# Anonymized: 42 tokens created, saved to contract-safe.pdf

Restore codes in a previously anonymized document

bash

promptshield detokenize contract-safe.pdf -o contract-restored.pdf
# Restored 42 tokens, saved to contract-restored.pdf

All CLI commands validate your API key and process documents locally. No data leaves your machine.
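If the CLI needs to be called from a script, the commands above can be assembled programmatically. This is a small sketch under the assumption that only the three subcommands and the -o flag shown in this section are used; cli_command and run_cli are hypothetical helper names.

```python
import subprocess
from typing import Optional

# Subcommands taken from the examples in this section.
ACTIONS = {"detect", "anonymize", "detokenize"}


def cli_command(action: str, source: str, output: Optional[str] = None) -> list[str]:
    """Build the argument list for one promptshield CLI invocation."""
    if action not in ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    cmd = ["promptshield", action, source]
    if output is not None:
        cmd += ["-o", output]
    return cmd


def run_cli(action: str, source: str, output: Optional[str] = None) -> str:
    """Run the command and return its standard output; raises on a non-zero exit."""
    done = subprocess.run(
        cli_command(action, source, output),
        capture_output=True,
        text=True,
        check=True,
    )
    return done.stdout
```

Passing the arguments as a list avoids shell quoting issues with file names that contain spaces.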

8. Python SDK (Direct Import)

For advanced integrations, import the core modules directly in your Python code. This gives you full control over every processing step.

Complete Python example

python

import asyncio
from pathlib import Path
from promptshield import ingest_document, detect_pii_on_page, anonymize_document

async def process(input_path: str, output_path: str):
    # Step 1 — Ingest document
    doc = await ingest_document(input_path, Path(input_path).name)
    print(f"Loaded: {doc.filename} ({len(doc.pages)} pages)")

    # Step 2 — Detect PII
    total_entities = 0
    for page in doc.pages:
        regions = detect_pii_on_page(page)
        total_entities += len(regions)
        print(f"  Page {page.number}: {len(regions)} entities found")

    print(f"Total PII entities: {total_entities}")

    # Step 3 — Anonymize
    result = await anonymize_document(doc)
    print(f"Anonymized: {result.tokens_created} tokens created")

    # Save the result
    with open(output_path, "wb") as f:
        f.write(result.content)
    print(f"Saved to {output_path}")

asyncio.run(process("contract.pdf", "contract-safe.pdf"))

Batch processing example

Process multiple files in a directory:

batch_process.py

python

import asyncio
from pathlib import Path
from promptshield import ingest_document, detect_pii_on_page, anonymize_document

async def batch_process(input_dir: str, output_dir: str):
    input_path = Path(input_dir)
    output_path = Path(output_dir)
    output_path.mkdir(parents=True, exist_ok=True)

    files = list(input_path.glob("*.pdf"))
    print(f"Processing {len(files)} files...")

    for i, file in enumerate(files, 1):
        print(f"[{i}/{len(files)}] {file.name}")
        doc = await ingest_document(str(file), file.name)

        for page in doc.pages:
            detect_pii_on_page(page)

        result = await anonymize_document(doc)

        out_file = output_path / f"anon_{file.name}"
        with open(out_file, "wb") as f:
            f.write(result.content)

        print(f"  → {result.tokens_created} tokens, saved to {out_file.name}")

    print("Batch complete!")

asyncio.run(batch_process("./documents", "./anonymized"))

9. Security & Data Privacy

PromptShield is designed for zero-trust environments:

All processing happens locally — no document content is ever sent to external servers

The code vault is encrypted at rest (AES-256-GCM)

API key authentication prevents unauthorized access in self-hosted mode

Docker runs as non-root with minimal privileges

API key validated online once, then cached locally for offline use (35-day TTL)

10. Troubleshooting

Common issues and solutions:

'No spaCy model found' error

Install at least one model: python -m spacy download en_core_web_sm

OCR not working on scanned PDFs

Install Tesseract OCR and ensure it's in your PATH. On Windows, the installer adds it automatically.

Office files fail to convert

Install LibreOffice. DOCX has a pure-Python fallback, but XLSX/PPTX require LibreOffice.

'Invalid API key' on startup

Verify your key starts with ps_live_ and that your subscription is active. Keys can be managed at /developers/api-keys.
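Some of the "Invalid API key" cases can be caught locally before the server starts. The sketch below checks only what this page states (the PROMPTSHIELD_API_KEY variable and the ps_live_ prefix) plus stray whitespace; it cannot tell whether the subscription is active. The function name key_problem is hypothetical.

```python
import os
from typing import Mapping, Optional

KEY_VARIABLE = "PROMPTSHIELD_API_KEY"
KEY_PREFIX = "ps_live_"


def key_problem(env: Mapping[str, str]) -> Optional[str]:
    """Return a description of the first local problem found, or None if the key looks plausible."""
    key = env.get(KEY_VARIABLE, "")
    if not key:
        return f"{KEY_VARIABLE} is not set"
    if key != key.strip():
        return f"{KEY_VARIABLE} has leading or trailing whitespace"
    if not key.startswith(KEY_PREFIX):
        return f"{KEY_VARIABLE} does not start with {KEY_PREFIX}"
    return None


# Check the current environment; prints None when nothing looks wrong locally.
print(key_problem(os.environ))
```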

Documents

POST

/api/documents/upload

Upload Document

Upload a document for processing. Supports PDF, DOCX, XLSX, PPTX, and image files.

GET

/api/documents

List Documents

Retrieve all uploaded documents with their detection and anonymization status.

GET

/api/documents/{id}

Get Document

Get detailed information about a specific document including page count and detection status.

DELETE

/api/documents/{id}

Delete Document

Remove a document and all associated data from the server.

Detection

POST

/api/documents/{id}/detect

Detect PII

Run the PII detection pipeline on a document. Combines regex, NER, and optional LLM layers.

GET

/api/documents/{id}/regions

Get Regions

Retrieve all detected PII regions for a document with type, text, confidence score, and bounding box.
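For the review step, it is often useful to summarize the regions response before confirming it. The sketch below assumes the response is a JSON list of objects with the type, text, and score fields shown in the examples on this page; the sample entries other than "John Smith" and the 0.8 threshold are made up for illustration.

```python
from collections import Counter

# Sample data shaped like the regions response in the examples.
# Only the first entry comes from this page; the others are hypothetical.
SAMPLE = [
    {"type": "PERSON", "text": "John Smith", "score": 0.98},
    {"type": "EMAIL", "text": "john@example.com", "score": 0.99},
    {"type": "PERSON", "text": "Smith", "score": 0.61},
]


def count_by_type(regions: list[dict]) -> Counter:
    """Count detected regions per entity type."""
    return Counter(region["type"] for region in regions)


def low_confidence(regions: list[dict], threshold: float = 0.8) -> list[dict]:
    """Return the regions whose score is below the threshold, for manual review."""
    return [region for region in regions if region["score"] < threshold]


print(count_by_type(SAMPLE))
print([region["text"] for region in low_confidence(SAMPLE)])
```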

Anonymize & Export

POST

/api/documents/{id}/anonymize

Anonymize Document

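Anonymize a document by replacing detected PII with codes. In the examples on this page the response is a JSON summary (status and tokens_created); the anonymized file itself is fetched from the download endpoint.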

GET

/api/documents/{id}/download/{type}

Download Anonymized

Download the anonymized version of a document as PDF, DOCX, or XLSX.

POST

/api/documents/batch-anonymize

Batch Anonymize

Anonymize multiple documents in a single request. Up to 50 documents are processed concurrently.
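When there are more documents than the stated limit, one option is to split the document IDs into groups on the client side. This sketch assumes the figure of 50 can be treated as a per-request ceiling, which this page does not state explicitly; the request body format for batch-anonymize is not shown here, so only the grouping is illustrated, and chunked is a hypothetical helper name.

```python
def chunked(ids: list[str], size: int = 50) -> list[list[str]]:
    """Split document IDs into consecutive groups of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [ids[start:start + size] for start in range(0, len(ids), size)]


# Hypothetical IDs, used only to show the grouping.
document_ids = [f"doc{n}" for n in range(120)]
print([len(group) for group in chunked(document_ids)])
```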

Decode

POST

/api/detokenize

Decode Text

Replace codes in a text string with their original values from the vault.

POST

/api/detokenize/file

Decode File

Upload an encoded document and receive a version with all codes restored to original values.

Vault & Code Registry

GET

/api/vault/status

Vault Status

Get vault statistics: total codes, total documents, registry size.

GET

/api/vault/tokens

List Codes

Retrieve all code-to-original mappings stored in the vault.

POST

/api/vault/export

Export Vault

Export the entire vault as a JSON file for backup or migration.

POST

/api/vault/import

Import Vault

Import a previously exported vault JSON to restore code mappings.

Health

GET

/api/health

Health Check

Returns server health status. Use for monitoring and load balancer health probes.
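A start-up script can wait for this endpoint before sending documents. The sketch below polls a URL until it answers with a 2xx status; the retry count and delay are arbitrary choices, and http_ok and wait_until_healthy are hypothetical helper names. The probe and sleep functions are parameters so the loop can be exercised without a running server.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET to the URL answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


def wait_until_healthy(
    url: str,
    attempts: int = 10,
    delay: float = 1.0,
    probe: Callable[[str], bool] = http_ok,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Probe the URL up to `attempts` times, pausing `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

For a local deployment the call would be wait_until_healthy("http://localhost:8000/api/health").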

Example

cURL — Upload, Detect, Anonymize

bash

# 1. Upload a document
curl -X POST http://localhost:8000/api/documents/upload \
  -F "file=@contract.pdf"
# → {"id": "abc123", "filename": "contract.pdf", "pages": 5}

# 2. Run PII detection
curl -X POST http://localhost:8000/api/documents/abc123/detect
# → {"status": "completed", "regions_count": 42}

# 3. Get detected regions
curl http://localhost:8000/api/documents/abc123/regions
# → [{"type": "PERSON", "text": "John Smith", "score": 0.98, ...}, ...]

# 4. Anonymize the document
curl -X POST http://localhost:8000/api/documents/abc123/anonymize
# → {"status": "completed", "tokens_created": 42}

# 5. Download anonymized PDF
curl -o contract-safe.pdf \
  http://localhost:8000/api/documents/abc123/download/pdf

Python SDK

Use the core engine directly in your Python code for maximum flexibility.

Python SDK

python

import asyncio
from pathlib import Path
from promptshield import ingest_document, detect_pii_on_page, anonymize_document

async def process(path: str):
    # Ingest a document, passing its file name as the second argument
    doc = await ingest_document(path, Path(path).name)

    # Detect PII on each page
    for page in doc.pages:
        regions = detect_pii_on_page(page)
        print(f"Page {page.number}: {len(regions)} entities")

    # Anonymize the document
    result = await anonymize_document(doc)
    print(f"Done: {result.tokens_created} tokens")

asyncio.run(process("contract.pdf"))

Error Handling

All error responses follow a standard JSON format with a detail field describing the error.

400

Bad request — invalid parameters or file format

404

Not found — document ID does not exist

422

Validation error — missing required fields

500

Server error — check logs for details

Error Response Format

json

{
  "detail": "Document not found: abc123"
}
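On the client side, the status code and the detail field can be combined into a single readable message. The sketch below uses the four status codes listed above and tolerates bodies that are not JSON or lack a detail field; describe_error is a hypothetical helper name, not part of the PromptShield package.

```python
import json

# Short labels for the status codes listed in this section.
STATUS_MEANINGS = {
    400: "Bad request",
    404: "Not found",
    422: "Validation error",
    500: "Server error",
}


def describe_error(status: int, body: bytes) -> str:
    """Combine the status code with the detail field, if the body provides one."""
    try:
        detail = json.loads(body).get("detail")
    except (ValueError, AttributeError):
        # Body is not valid JSON, or the JSON value is not an object.
        detail = None
    label = STATUS_MEANINGS.get(status, "Unexpected error")
    if detail is None:
        return f"{status} {label}"
    return f"{status} {label}: {detail}"


print(describe_error(404, b'{"detail": "Document not found: abc123"}'))
```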

Rate Limits

The self-hosted API has no rate limits by default. You can configure rate limiting via environment variables if needed.
