Trust Center · transparent by design

Security you can verify, not just trust.

SoterAI is an AI security command layer aligned with the OWASP LLM Top 10 and focused on risk reduction for chatbots, RAG apps, and AI agents. This page documents what we test, how we handle data, the controls we run, how you can deploy SoterAI, and how to report a vulnerability.

We do not claim complete protection or certification. Customers remain responsible for secure design, access control, monitoring, incident response, and human oversight.


soterai · security posture

OPERATIONAL

Adversarial benchmark: F1 = 1.0000

False-positive rate: 0 / 25 safe inputs

OWASP LLM Top 10: Aligned coverage

PII redacted before logging

No raw secret storage

Per-project row-level isolation

Honest scope · self-authored evidence · independent audit welcomed

Test status

Self-authored regression and adversarial coverage. Independent third-party auditing is recommended and welcomed.

101 / 101

Adversarial battery

Comprehensive attack scenarios across 40+ services

97 / 97

Garak-style benchmark

F1 = 1.0000, 0 / 25 false positives

60+ files

Unit + integration suites

Guard, agent-firewall, auth, billing, retention

Playwright

E2E guard scenarios

Real attack flows against a live build

Full benchmark data: /benchmarks · live system status: /status
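
For readers who want to recompute the headline metrics from the raw benchmark data, here is a minimal sketch of the standard F1 and false-positive-rate formulas. The function names are illustrative, and the counts in the usage note are examples only, not the benchmark's actual confusion matrix.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Standard F1: 2*TP / (2*TP + FP + FN). Returns 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 0.0 if denom == 0 else 2 * tp / denom


def false_positive_rate(fp: int, tn: int) -> float:
    """Share of safe inputs that were wrongly flagged: FP / (FP + TN)."""
    total = fp + tn
    return 0.0 if total == 0 else fp / total
```

F1 reaches 1.0 only when there are no false positives and no false negatives. A result of 0 / 25 false positives on safe inputs corresponds to `false_positive_rate(fp=0, tn=25)`.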

Data handling

Guard analysis runs inline; request text is evaluated against detection rules and not used to train models.
Sensitive values (PII, secrets) are redacted before logs are persisted; on redaction paths, the original text is never stored.
Audit logs and reports are scoped per project and per organization with row-level ownership checks on every read.
Data retention is configurable; enterprise workspaces can set automated purge windows for guard logs.
Demo workspace data is synthetic and resettable via an authenticated, confirmation-gated script.
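
The redact-before-persist behavior described above can be sketched as follows. This is an illustration under stated assumptions, not SoterAI's implementation: the two regular expressions, the `sk_`/`pk_` key prefix, the placeholder strings, and the `build_log_record` helper are all hypothetical.

```python
import re

# Hypothetical patterns for illustration; a real detector would cover far more.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SECRET_RE = re.compile(r"\b(?:sk|pk)_[A-Za-z0-9]{16,}\b")


def redact(text: str) -> str:
    """Replace matched sensitive values with fixed placeholders."""
    text = EMAIL_RE.sub("[REDACTED_EMAIL]", text)
    return SECRET_RE.sub("[REDACTED_SECRET]", text)


def build_log_record(project_id: str, request_text: str) -> dict:
    """Build the record that gets persisted; only redacted text is kept."""
    return {"project_id": project_id, "text": redact(request_text)}
```

The point of the design is ordering: redaction happens while the record is being built, so the unredacted string never reaches the storage layer.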

See also our privacy policy, subprocessors, and data retention.

Security controls

Defense-in-depth detection

Layered rules for prompt injection, jailbreaks, encoding/obfuscation, multilingual bypass, PII, secrets, and unsafe output.

Strict transport + headers

HSTS in production, strict Content-Security-Policy, X-Frame-Options DENY, nosniff, and a locked-down Permissions-Policy.
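
As a rough illustration of that header set, the sketch below merges security headers into a response's header map and emits HSTS only in production. The header names are standard; the specific values (max-age, CSP directives, disabled features) and the `apply_security_headers` helper are assumptions, not SoterAI's published configuration.

```python
# Example values only; the real policy strings are not given on this page.
SECURITY_HEADERS = {
    "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
    "Content-Security-Policy": "default-src 'self'; frame-ancestors 'none'",
    "X-Frame-Options": "DENY",
    "X-Content-Type-Options": "nosniff",
    "Permissions-Policy": "camera=(), microphone=(), geolocation=()",
}


def apply_security_headers(headers: dict, production: bool) -> dict:
    """Return a copy of `headers` with the security headers added."""
    merged = dict(headers)
    for name, value in SECURITY_HEADERS.items():
        # HSTS is skipped outside production, matching "HSTS in production".
        if name == "Strict-Transport-Security" and not production:
            continue
        merged[name] = value
    return merged
```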

Authn & authz

Session-based auth with CSRF protection; per-project API keys are stored only as hashes, never in plaintext.
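
Hash-only key storage can be sketched as below. This assumes SHA-256 over a high-entropy random key and a constant-time comparison; the hash algorithm, the `sk_` prefix, and the function names are assumptions for illustration, since the page does not specify them.

```python
import hashlib
import hmac
import secrets


def hash_api_key(key: str) -> str:
    """Hash a key for storage. A plain digest is reasonable only because
    the key is long and random, unlike a human-chosen password."""
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def generate_api_key() -> tuple[str, str]:
    """Return (plaintext key shown once to the user, hash to store)."""
    key = "sk_" + secrets.token_urlsafe(32)  # hypothetical prefix
    return key, hash_api_key(key)


def verify_api_key(presented: str, stored_hash: str) -> bool:
    """Compare digests in constant time to avoid timing side channels."""
    return hmac.compare_digest(hash_api_key(presented), stored_hash)
```

Under this scheme the plaintext key exists only at creation time, so a leaked database row does not reveal a usable credential.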

Signed audit exports

HMAC-signed JSONL/CSV exports so downstream SIEM and compliance pipelines can verify integrity.
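
A downstream pipeline might verify an export along these lines. The sketch assumes HMAC-SHA256 computed over the whole export payload and delivered as a hex digest; the actual algorithm, signature placement, and key distribution are not described here, and the function names are hypothetical.

```python
import hashlib
import hmac
import json


def sign_export(records: list[dict], key: bytes) -> tuple[bytes, str]:
    """Serialize records as JSONL and return (payload, hex signature)."""
    payload = "".join(
        json.dumps(r, sort_keys=True, separators=(",", ":")) + "\n"
        for r in records
    ).encode("utf-8")
    signature = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return payload, signature


def verify_export(payload: bytes, signature: str, key: bytes) -> bool:
    """Recompute the HMAC over the received bytes and compare in constant time."""
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)
```

Verification must run over the exact bytes received, before any parsing or re-serialization, because any change to the payload changes the digest.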

Agent firewall

Tool-call authorization, agent passports, approvals, and escrow for autonomous workflows before risky actions execute.
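
To make the authorization and approval steps concrete, here is a minimal sketch of a pre-execution check. The `AgentPassport` fields, the `Decision` values, and the rule that risky tools need a named approver are all hypothetical; escrow is not modeled.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    NEEDS_APPROVAL = "needs_approval"


@dataclass(frozen=True)
class AgentPassport:
    """Hypothetical identity record listing what an agent may call."""
    agent_id: str
    allowed_tools: frozenset
    risky_tools: frozenset = frozenset()


def authorize_tool_call(
    passport: AgentPassport, tool: str, approved_by: Optional[str] = None
) -> Decision:
    """Decide whether a tool call may run before it executes."""
    if tool not in passport.allowed_tools:
        return Decision.DENY  # default-deny anything not explicitly granted
    if tool in passport.risky_tools and approved_by is None:
        return Decision.NEEDS_APPROVAL  # hold until a human signs off
    return Decision.ALLOW
```

The check is default-deny: a tool missing from the passport is refused even if no rule explicitly forbids it.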

Secrets hygiene

Secrets live in environment configuration (gitignored); the repo is scanned to keep keys and tokens out of source control.

Control details and posture: /security · /compliance

Deployment model

Managed SaaS

Hosted guard APIs, dashboard, and audit storage. Fastest path to production.

Self-hosted (Docker)

Run the full stack in your own VPC for data residency and isolation requirements.

Hybrid

Inline SDK detection at the edge with centralized policy, reporting, and audit.

Responsible disclosure

Report suspected vulnerabilities to the security contact listed in your enterprise agreement or deployment runbook. Include affected URLs, impact, reproduction steps, and whether any data was accessed. Only test systems you own or are authorized to assess. Do not access, modify, delete, or exfiltrate data that is not yours.

Read the full disclosure policy

Security questions before you ship?

Try the live playground, review the benchmark, or talk to us about self-hosting.

