Security Overview

This is a practical security overview for teams evaluating SELF for production.

If you need deeper detail, see:

  • technical/SELF-SECURITY.md (security architecture summary)
  • technical/THREAT-MODEL.md (canonical threat model)
  • SECURITY_DOCUMENTATION.md (override prevention / invariants)

What SELF is securing

SELF’s security goal is harm prevention: it conservatively constrains support-critical AI behavior.

Security in SELF does not mean “prevent all attacks on the internet”; it means preventing these operational failures:

  • unsafe behavior under emotional load
  • silent weakening of constraints over time (“safety drift”)
  • un-auditable decisions (“we can’t explain what happened”)
  • implementation shortcuts that bypass governance steps

Core security mechanisms (high-level)

SELF implements defense in depth (a minimal sketch of this pipeline follows the list):

  • input + state detection (S0–S3 posture selection)
  • policy enforcement (explicit constraints per posture)
  • output validation + repair (postflight gate before shipping)
  • logging hooks (pre/post events for auditability)
  • integrity controls (guardrails against bypass/override patterns)
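
The ordering of these mechanisms can be pictured as a small pipeline around your model call. The sketch below is illustrative only: `Posture`, `detectPosture`, `policyFor`, `validateAndRepair`, and `logEvent` are hypothetical names, not SELF’s actual API; only the S0–S3 posture labels and the pre/post logging idea come from this overview.

```ts
// Illustrative defense-in-depth ordering. All identifiers here are
// made up for this sketch; they are not SELF's real API surface.

type Posture = "S0" | "S1" | "S2" | "S3";

interface Policy {
  posture: Posture;
  maxRiskTolerance: "normal" | "reduced" | "minimal";
  requireClarifier: boolean;
}

// 1. Input + state detection: choose a posture for this turn.
function detectPosture(message: string): Posture {
  // Real detection combines input signals and conversation state;
  // this stand-in only escalates on an obvious crisis keyword.
  return /crisis|hurt myself/i.test(message) ? "S3" : "S0";
}

// 2. Policy enforcement: each posture maps to explicit constraints.
function policyFor(posture: Posture): Policy {
  const requireClarifier = posture === "S2" || posture === "S3";
  const maxRiskTolerance =
    posture === "S0" ? "normal" : posture === "S1" ? "reduced" : "minimal";
  return { posture, maxRiskTolerance, requireClarifier };
}

// 3. Output validation + repair: the postflight gate before shipping.
function validateAndRepair(draft: string, policy: Policy): string {
  // A real gate checks the draft against the policy and repairs or
  // blocks it; here we only mark the control point.
  return policy.requireClarifier ? `${draft}\n\n(Can you tell me more?)` : draft;
}

// 4. Logging hooks: pre/post events keep the decision auditable.
function logEvent(stage: "pre" | "post", detail: object): void {
  console.log(JSON.stringify({ stage, ...detail, at: new Date().toISOString() }));
}

// The stages in order, wrapped around an external model call.
async function handleMessage(
  message: string,
  generateDraft: (msg: string, policy: Policy) => Promise<string>,
): Promise<string> {
  const policy = policyFor(detectPosture(message));
  logEvent("pre", { posture: policy.posture });

  const draft = await generateDraft(message, policy); // your model call
  const finalOutput = validateAndRepair(draft, policy);

  logEvent("post", { posture: policy.posture, repaired: finalOutput !== draft });
  return finalOutput; // ship only what the gate returned
}
```

The point of the sketch is the ordering: detection picks the posture before any policy is applied, and nothing reaches the user without passing the postflight gate, with both sides of that decision logged.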

Operational security responsibilities (your side)

To operate SELF securely, you must still:

  • control secrets (API keys, env vars, log paths)
  • restrict network access to the HTTP service
  • enforce rate limiting at the edge (SELF can also rate limit; see the sketch after this list)
  • store logs securely (treat them as sensitive)
  • maintain escalation paths (human support, crisis resources)
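
As one example of an edge-side control, the sketch below is a minimal in-memory per-client rate limiter placed in front of the SELF HTTP service. It is illustrative only: the constants and the `allowRequest` helper are made up here, and most deployments would enforce this at a reverse proxy, API gateway, or CDN rather than in application code.

```ts
// Illustrative fixed-window rate limiter keyed by client ID (e.g. IP).
// None of these names come from SELF; tune the limits to your traffic.

const WINDOW_MS = 60_000;   // 1-minute window
const MAX_REQUESTS = 60;    // per client per window

const counters = new Map<string, { count: number; windowStart: number }>();

function allowRequest(clientId: string, now = Date.now()): boolean {
  const entry = counters.get(clientId);
  if (!entry || now - entry.windowStart >= WINDOW_MS) {
    counters.set(clientId, { count: 1, windowStart: now });
    return true;
  }
  entry.count += 1;
  return entry.count <= MAX_REQUESTS;
}

// Example: reject over-limit callers before the request reaches SELF.
// if (!allowRequest(clientIp)) { /* respond 429 Too Many Requests */ }
```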

What “secure integration” means

Secure integration is not optional:

  • always call /v1/pre before generating a model draft
  • always call /v1/post and ship only the returned output
  • honor clarifier requirements and escalation behaviors

If an integration skips these steps, SELF’s guarantees do not apply.
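
A minimal integration sketch follows, assuming a generic HTTP client. The /v1/pre and /v1/post paths come from this document, but the base URL, request bodies, and response fields (`clarifier`, `output`) are assumptions for illustration, not the documented API shapes.

```ts
// Illustrative call flow: pre-check, draft, post-check, ship only the
// gated output. Request/response shapes here are assumptions.

const SELF_BASE_URL = process.env.SELF_BASE_URL ?? "http://localhost:8080";

async function respondSafely(
  userMessage: string,
  generateDraft: (msg: string, preResult: unknown) => Promise<string>,
): Promise<string> {
  // 1. Always call /v1/pre before generating a model draft.
  const preRes = await fetch(`${SELF_BASE_URL}/v1/pre`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ message: userMessage }),
  });
  if (!preRes.ok) throw new Error(`pre check failed: ${preRes.status}`);
  const pre = await preRes.json();

  // 2. Honor clarifier requirements before drafting anything.
  //    (`clarifier` is an assumed field name for this sketch.)
  if (pre.clarifier?.required) return pre.clarifier.prompt;

  const draft = await generateDraft(userMessage, pre);

  // 3. Always call /v1/post and ship only the returned output.
  const postRes = await fetch(`${SELF_BASE_URL}/v1/post`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ message: userMessage, draft }),
  });
  if (!postRes.ok) throw new Error(`post check failed: ${postRes.status}`);
  const post = await postRes.json();

  return post.output; // never ship the raw draft
}
```

Whatever the exact payloads look like in your deployment, the invariant is the same: the raw model draft never reaches the user; only the output returned by /v1/post does.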