Governance Scoring
7-dimension governance scoring model with L0-L4 maturity levels for AI agent assessment.
Every agent gets a composite score from 0 to 100 across 7 weighted dimensions, mapped to governance levels L0 through L4. The score is computed at registration time and recomputed whenever the agent's metadata changes.
7 Scoring Dimensions
Each dimension is scored 0-100 independently, then combined into a weighted composite. Higher-weight dimensions have more influence on the final score.
| Dimension | Weight | What It Measures |
|---|---|---|
| Identity | 1.5 | Name, owner, framework, version, description, authentication, channels |
| Permissions | 1.5 | Explicit permissions, tool scoping, auth, bounded tool count |
| Guardrails | 1.3 | Input/output guardrails, auth, framework-native guardrails, bounded tools |
| Observability | 1.2 | Tracing, audit logging, framework tracing, metadata |
| Auditability | 1.0 | Audit logging, observability, ownership, versioning, documentation |
| Compliance | 1.0 | Audit logs, guardrails, auth, observability, ownership, permissions |
| Lifecycle | 0.8 | Owner, version, description, framework, channels, metadata |
Note: The composite score is a weighted average: each dimension's score is multiplied by its weight, summed, then divided by the total weight (8.3). This means identity and permissions together account for ~36% of the final score.
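The weighted average described in the note can be sketched as follows. This is a minimal illustration, not the SDK's implementation: the weights come from the table above, while the names `WEIGHTS`, `DimensionScores`, and `compositeScore` are hypothetical, and the sketch assumes the composite is left unrounded.

```typescript
// Dimension weights copied from the table above.
const WEIGHTS = {
  identity: 1.5,
  permissions: 1.5,
  guardrails: 1.3,
  observability: 1.2,
  auditability: 1.0,
  compliance: 1.0,
  lifecycle: 0.8,
} as const;

type Dimension = keyof typeof WEIGHTS;
type DimensionScores = Record<Dimension, number>; // each value is 0-100

// Weighted average: sum(score * weight) / sum(weight).
// Hypothetical helper; rounding behavior is an assumption (none applied here).
function compositeScore(scores: DimensionScores): number {
  const dims = Object.keys(WEIGHTS) as Dimension[];
  const totalWeight = dims.reduce((sum, d) => sum + WEIGHTS[d], 0); // about 8.3
  const weightedSum = dims.reduce((sum, d) => sum + scores[d] * WEIGHTS[d], 0);
  return weightedSum / totalWeight;
}

// Made-up agent: perfect Identity and Permissions, nothing else.
const identityAndPermissionsOnly: DimensionScores = {
  identity: 100,
  permissions: 100,
  guardrails: 0,
  observability: 0,
  auditability: 0,
  compliance: 0,
  lifecycle: 0,
};

// 300 / 8.3 is roughly 36.1, matching the ~36% share stated in the note.
const composite = compositeScore(identityAndPermissionsOnly);
```

Because the two heaviest dimensions cap out at about 36 points on their own, an agent cannot get past L1 without also scoring in the remaining dimensions.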
Governance Levels (L0-L4)
The composite score maps directly to a governance level, aligned with the CSA Agent Trust Framework progressive autonomy model.
| Level | Label | Score Range | Autonomy |
|---|---|---|---|
| L0 | Unregistered | 0-20 | No autonomous operation |
| L1 | Basic | 21-40 | Human-in-loop required |
| L2 | Managed | 41-60 | Limited autonomous actions |
| L3 | Governed | 61-80 | Fully autonomous within policy |
| L4 | Certified | 81-100 | Cross-team, regulatory-ready |
Tip: Use the requireLevel() policy preset to enforce minimum governance levels. Agents below the threshold are blocked from operating autonomously.
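The score-to-level mapping in the table above amounts to a threshold lookup, sketched below. The names `levelForScore` and `meetsMinimumLevel` are hypothetical and are not the SDK's requireLevel() preset; they only illustrate the kind of check such a preset might perform. Because the table lists integer ranges, the sketch assumes a fractional composite is rounded to the nearest integer before lookup.

```typescript
type GovernanceLevel = "L0" | "L1" | "L2" | "L3" | "L4";

// Inclusive upper bound of each level's score range, from the table above.
const LEVEL_BOUNDS: Array<[GovernanceLevel, number]> = [
  ["L0", 20],
  ["L1", 40],
  ["L2", 60],
  ["L3", 80],
  ["L4", 100],
];

const LEVEL_RANK: Record<GovernanceLevel, number> = { L0: 0, L1: 1, L2: 2, L3: 3, L4: 4 };

// Hypothetical helper: map a 0-100 composite score to a governance level.
function levelForScore(score: number): GovernanceLevel {
  if (!(score >= 0 && score <= 100)) {
    throw new RangeError(`score must be between 0 and 100, got ${score}`);
  }
  // Assumption: round first, since the documented ranges are integers.
  const rounded = Math.round(score);
  for (const [level, upperBound] of LEVEL_BOUNDS) {
    if (rounded <= upperBound) return level;
  }
  return "L4"; // not reachable for valid input; keeps the return type total
}

// Hypothetical stand-in for a minimum-level policy check.
function meetsMinimumLevel(score: number, minimum: GovernanceLevel): boolean {
  return LEVEL_RANK[levelForScore(score)] >= LEVEL_RANK[minimum];
}
```

For example, `meetsMinimumLevel(58, "L3")` is false because a score of 58 falls in the L2 range, so a policy requiring L3 would block that agent from autonomous operation.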
Scoring at Registration
Scores are computed automatically when you call gov.register(). The more metadata you provide, the higher the score.
Dimension Breakdown
Every assessment includes per-dimension scores with evidence, so you know exactly which features contribute to the score and where the gaps are.
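To make the idea of a per-dimension score with evidence concrete, here is a toy sketch for the Identity dimension. Everything in it is an assumption for illustration: the `AgentMetadata` and `DimensionResult` shapes are hypothetical, the field names simply mirror the Identity row of the dimension table, and the rule that each feature is worth an equal share is invented. The actual scoring rules and assessment format are not specified in this section.

```typescript
// Hypothetical metadata shape; fields mirror the Identity row of the dimension table.
interface AgentMetadata {
  name?: string;
  owner?: string;
  framework?: string;
  version?: string;
  description?: string;
  authentication?: boolean;
  channels?: string[];
}

// Hypothetical result shape: a score plus the features behind it.
interface DimensionResult {
  dimension: string;
  score: number; // 0-100
  evidence: string[]; // features that contributed to the score
  gaps: string[]; // features that are missing
}

// Toy rule (assumption): each of the seven features counts equally.
function scoreIdentity(meta: AgentMetadata): DimensionResult {
  const checks: Array<[string, boolean]> = [
    ["name", !!meta.name],
    ["owner", !!meta.owner],
    ["framework", !!meta.framework],
    ["version", !!meta.version],
    ["description", !!meta.description],
    ["authentication", meta.authentication === true],
    ["channels", meta.channels !== undefined && meta.channels.length > 0],
  ];
  const evidence = checks.filter(([, present]) => present).map(([label]) => label);
  const gaps = checks.filter(([, present]) => !present).map(([label]) => label);
  const score = Math.round((evidence.length / checks.length) * 100);
  return { dimension: "identity", score, evidence, gaps };
}

// Made-up agent with only a name and an owner: 2 of 7 features present.
const sparseResult = scoreIdentity({ name: "support-bot", owner: "platform-team" });
```

Here `sparseResult.evidence` is `["name", "owner"]` and `sparseResult.gaps` names the five missing features, which is the kind of breakdown that shows where an agent loses points.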
Fleet-Wide Scoring
Assess your entire agent fleet at once. The fleet summary includes averages, distributions by level and status, and actionable recommendations.
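The averaging and level distribution in a fleet summary can be sketched as a single pass over per-agent results. The `AgentAssessment` and `FleetSummary` shapes and the `summarizeFleet` function are hypothetical, not the SDK's API. The status distribution and recommendations are left out because this section does not define them.

```typescript
type Level = "L0" | "L1" | "L2" | "L3" | "L4";

// Hypothetical per-agent result; the real assessment object may differ.
interface AgentAssessment {
  agentName: string;
  compositeScore: number; // 0-100
  level: Level;
}

// Hypothetical summary shape covering only averages and level distribution.
interface FleetSummary {
  agentCount: number;
  averageScore: number;
  byLevel: Record<Level, number>;
}

function summarizeFleet(assessments: AgentAssessment[]): FleetSummary {
  const byLevel: Record<Level, number> = { L0: 0, L1: 0, L2: 0, L3: 0, L4: 0 };
  let totalScore = 0;
  for (const assessment of assessments) {
    byLevel[assessment.level] += 1;
    totalScore += assessment.compositeScore;
  }
  const agentCount = assessments.length;
  // Assumption: an empty fleet reports an average of 0 instead of NaN.
  const averageScore = agentCount === 0 ? 0 : totalScore / agentCount;
  return { agentCount, averageScore, byLevel };
}

// Made-up fleet of three agents.
const fleetSummary = summarizeFleet([
  { agentName: "support-bot", compositeScore: 35, level: "L1" },
  { agentName: "billing-agent", compositeScore: 70, level: "L3" },
  { agentName: "triage-agent", compositeScore: 75, level: "L3" },
]);
```

For this made-up fleet the average is 60 and the distribution shows one agent at L1 and two at L3, which immediately points at `support-bot` as the agent to improve first.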
How to Improve Your Score
| Transition | Action |
|---|---|
| L0 → L1 | Register the agent with a name and owner. Declare a known framework. |
| L1 → L2 | Add tools list, enable audit logging, set a version string. |
| L2 → L3 | Enable authentication, add guardrails, configure permissions and observability. |
| L3 → L4 | Complete all metadata: description, channels, metadata object. Enable all security features. |