Security Model

Designed to defend, not just observe

governance-sdk enforces before execution — not after. Every design decision prioritizes security: zero dependencies, no network calls, append-only audits, and 64+ injection patterns blocking the attacks that are happening right now in production AI deployments.

Security principles

Zero external network calls

Enforcement is entirely in-process. No calls to external services, no telemetry, no phone-home. Your agent's decisions never leave your runtime.

No eval() or dynamic code

The entire SDK is statically analyzable. No eval(), no new Function(), no dynamic imports that could be hijacked. Works safely in edge runtimes.

Append-only audit semantics

Audit events are written and never modified. The HMAC chain means deletion is detected — you can't silently erase a decision from the audit trail.
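The tamper-evidence property comes from chaining: each event's HMAC covers the previous event's signature, so editing or deleting any entry breaks every later link. A minimal self-contained sketch of the idea (illustrative only; the SDK's internal chain format, field names, and `genesis` seed here are assumptions, not its actual implementation):

```typescript
import { createHmac } from "node:crypto";

// Toy audit chain: each signature covers the previous signature plus the
// payload, so any modification or deletion invalidates all later links.
type AuditEvent = { payload: string; sig: string };

function appendEvent(chain: AuditEvent[], payload: string, key: string): AuditEvent[] {
  const prevSig = chain.length ? chain[chain.length - 1].sig : "genesis";
  const sig = createHmac("sha256", key).update(prevSig + payload).digest("hex");
  return [...chain, { payload, sig }];
}

// Returns -1 if the chain is intact, otherwise the index of the first broken link.
function verifyChain(chain: AuditEvent[], key: string): number {
  let prevSig = "genesis";
  for (let i = 0; i < chain.length; i++) {
    const expected = createHmac("sha256", key).update(prevSig + chain[i].payload).digest("hex");
    if (expected !== chain[i].sig) return i;
    prevSig = chain[i].sig;
  }
  return -1;
}
```

Rewriting the payload of event 0 makes verification fail at index 0, even if the attacker leaves every signature in place, because recomputing the HMAC no longer matches without the key.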

Signing key isolation

The HMAC signing key is provided by you at startup. It never leaves your environment. Rotate it without breaking historical chain verification.

Zero dependencies

No supply chain attack surface. The entire governance enforcement path is first-party code. Nothing from npm can compromise your governance layer.

TypeScript-native, not transpiled

Shipped as TypeScript source with full type safety. You can audit exactly what runs. No minified bundles with hidden behavior.

Injection detection (64+ patterns)

Detection runs on every user-provided string before it reaches the agent. Pattern categories allow a targeted response: block outright, log only, or require approval.

Instruction Override (5 patterns)

Attempts to replace or nullify the agent's original system prompt or instructions.

ignore_previous_instructions, disregard_system_prompt, override_with, forget_everything, new_instructions

Role Switch (4 patterns)

Forces the agent to adopt a different persona, often one without safety constraints.

you_are_now, pretend_you_are, act_as_if, switch_to_mode

Data Exfiltration (4 patterns)

Instructs the agent to send internal data, credentials, or prompts to attacker-controlled endpoints.

send_to_external, export_data_to, forward_contents, leak_prompt

Command Injection (4 patterns)

Embeds OS or interpreter commands inside user input, hoping they execute in the agent's context.

execute_command, run_shell, system_call, exec_

Goal Hijacking (3 patterns)

Overrides the agent's stated objectives with attacker-defined goals.

your_real_goal_is, primary_objective, secret_mission

Prompt Leakage (2 patterns)

Attempts to extract the agent's system prompt, revealing business logic or credentials.

repeat_your_instructions, show_your_system_prompt
Usage
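The original code sample did not survive here, so below is a self-contained sketch of the kind of category-aware check this page describes. It is not the governance-sdk source or its actual API: the pattern names and the block/log/approve responses come from this page, while the function signature, the regex forms, and the policy table are illustrative assumptions.

```typescript
// Toy category-aware injection detector, NOT the real governance-sdk code.
type Category = "instruction_override" | "role_switch" | "data_exfiltration";
type Response = "block" | "log" | "approve";

// A small sample of the documented patterns, grouped by category.
const PATTERNS: Record<Category, RegExp[]> = {
  instruction_override: [/ignore\s+previous\s+instructions/i, /forget\s+everything/i],
  role_switch: [/you\s+are\s+now/i, /pretend\s+you\s+are/i],
  data_exfiltration: [/send\s+to\s+external/i, /leak\s+prompt/i],
};

// Per-category response policy: block outright, log only, or require approval.
const POLICY: Record<Category, Response> = {
  instruction_override: "block",
  role_switch: "approve",
  data_exfiltration: "block",
};

function detectInjection(input: string): { category: Category; response: Response }[] {
  const hits: { category: Category; response: Response }[] = [];
  for (const category of Object.keys(PATTERNS) as Category[]) {
    if (PATTERNS[category].some((p) => p.test(input))) {
      hits.push({ category, response: POLICY[category] });
    }
  }
  return hits;
}
```

Run this kind of check on every user-sourced string before the agent sees it; an empty result means no known pattern matched.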

Threat model

Six threat categories with mitigations in governance-sdk v0.5.0.

HIGH
Prompt injection via user input
detectInjection() on all user-sourced strings; 64+ patterns across 7 categories; category-aware blocking (override vs exfil vs role-switch)

CRITICAL
Agent tool abuse (unauthorized actions)
blockTools() — exact or glob match; requireLevel() — governance score gate; requireSequence() — must complete prerequisite tools first

HIGH
Runaway agent (infinite loops, resource exhaustion)
kill() / killAll() at priority 999; rateLimit() per hour/day; tokenBudget() hard cap

MEDIUM
Audit log tampering
HMAC-SHA256 hash chain; chain.verify() detects any modification; brokenAt reports exact tamper location

HIGH
Unauthorized high-risk actions (payments, deletes)
requireApproval() — human-in-the-loop gate; timeWindow() — restrict to business hours; requireLevel(4+) for sensitive namespaces

LOW
Supply chain compromise
Zero runtime dependencies; first-party enforcement code only; statically analyzable — no eval()
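The tool-abuse mitigations above describe blockTools() as matching tool names exactly or by glob. A minimal sketch of that matching idea, under the assumption that "*" means "any characters" (this is not the SDK's implementation, and its actual wildcard semantics may differ):

```typescript
// Sketch of exact-or-glob tool blocking. Assumes "*" matches any run of
// characters; the real blockTools() semantics may differ.
function globToRegExp(pattern: string): RegExp {
  // Escape regex metacharacters, then turn "*" into ".*".
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&").replace(/\*/g, ".*");
  return new RegExp(`^${escaped}$`);
}

function isBlocked(tool: string, blocklist: string[]): boolean {
  return blocklist.some((entry) =>
    entry.includes("*") ? globToRegExp(entry).test(tool) : entry === tool
  );
}
```

With a blocklist like ["db.drop", "payments.*"], the exact entry stops only db.drop while the glob entry stops every tool in the payments namespace.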

Responsible Disclosure

Found a security issue in governance-sdk? Please report it privately via GitHub Security Advisories before public disclosure. We target a 72-hour initial response for all reports.