Position B: Why Safety Is Bigger Than Security


In conventional thinking, safety is a subset of security. You harden the system; harm doesn’t land. “Secure is safe” is the implicit mental model.

For agentic systems, this gets the relationship backwards. Security is a subset of safety. Safe is bigger than secure.

This is not wordplay. It is a structural claim, and it determines what you have to engineer.

The Kitchen Argument

Imagine a fully secured kitchen. The door is locked. The perimeter is intact. No attacker can compromise the space. The supply chain on the food is verified. Every utensil is attested.

Is the kitchen safe?

It depends on who is using it. A trained chef using verified knives in a secured kitchen is safe. A toddler with the same knives in the same kitchen is not. The knives are sharp. The stove is hot. The gas is real. Authorised use — not just authorised access — determines safety.

Security governs whether the wrong person gets in. Safety governs what the right person, once in, can do.

Both are necessary. Security alone is insufficient. The kitchen can be perfectly locked and still produce harm if the authorised user is the toddler — or if the authorised user is misled, mistaken, or acting under false pretences.

What Conventional Security Covers

Conventional security is built around a specific question: can the system resist compromise? The discipline is adversary-relative. The defences exist to prevent attacks from succeeding.

Concrete instances:

Identity — verify who is requesting access
Supply chain — verify the components are authentic
Perimeter — keep external attackers out
Credentials — keep secrets secret
Patch hygiene — keep known vulnerabilities closed

This is genuinely valuable engineering. It addresses real failure modes that have caused real damage. None of what follows discards it.

What conventional security does not answer:

Does the authorised actor act for the purpose they were given, or for some other purpose introduced by the data they consume?
When the authorised actor encounters something unexpected, does it bound the response, or escalate the failure?
Is the actor useful — capable of engaging at consequential scale — or does it default to refusal that purchases the appearance of safety at the cost of the actual job?

These questions are about the use of the secured system, not the securing of it. They are safety questions, not security questions.

The Three Facets Conventional Security Doesn’t Cover

For agentic systems, safety has three facets that have to be engineered separately from the security layer:

Alignment. The agent must act for the entruster’s purpose, not just within the technical bounds of its mandate. A mandate to “reconcile billing” can be technically satisfied by issuing refunds that injected content prompted: every action signed, every credential verified, the perimeter intact. The mandate’s purpose was not satisfied. Alignment is the engineering of intent fidelity, not just access control.
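
One way to picture the gap between access control and intent fidelity is a second check that runs only after authorisation has already succeeded. The sketch below is illustrative, not a real intent verifier: `Mandate`, `Action`, `is_authorised`, `serves_purpose`, and the `requested_by` provenance tag are all hypothetical names, and the alignment rule (only instructions that originate with the principal count) is a deliberately crude assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mandate:
    purpose: str                # what the principal actually wants done
    allowed_actions: frozenset  # what the credentials technically permit


@dataclass(frozen=True)
class Action:
    name: str
    requested_by: str  # hypothetical provenance tag: "principal" or "consumed_data"


def is_authorised(mandate: Mandate, action: Action) -> bool:
    """Security question: is the action within the technical bounds?"""
    return action.name in mandate.allowed_actions


def serves_purpose(mandate: Mandate, action: Action) -> bool:
    """Alignment question. Assumed rule for this sketch: an instruction that
    arrived inside consumed data does not serve the mandate's purpose."""
    return action.requested_by == "principal"


def decide(mandate: Mandate, action: Action) -> str:
    if not is_authorised(mandate, action):
        return "blocked: not authorised"
    if not serves_purpose(mandate, action):
        return "held: authorised but not aligned"
    return "execute"


billing = Mandate("reconcile billing", frozenset({"issue_refund", "post_adjustment"}))
injected = Action("issue_refund", requested_by="consumed_data")
legit = Action("post_adjustment", requested_by="principal")

print(decide(billing, injected))
print(decide(billing, legit))
```

The injected refund passes every security check in this sketch and is stopped only by the second, separate question.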

Resilience. The agent must recover from internal failure gracefully — not because it was attacked, but because something in its operation went wrong. A sensor read incorrectly. A calculation overflowed. A state machine entered an unexpected configuration. The system holds, bounds the failure, and surfaces evidence. Conventional security doesn’t address internal failure that is not adversary-driven; resilience does.
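
“Holds, bounds the failure, and surfaces evidence” can be sketched as a wrapper around a single agent step. This is a minimal illustration under assumptions: `run_bounded`, `Outcome`, and the simulated `read_sensor` fault are hypothetical, and a real system would choose its retry limit and fallback per step.

```python
import traceback
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class Outcome:
    ok: bool
    value: Any = None
    evidence: list = field(default_factory=list)  # record of what went wrong


def run_bounded(step: Callable[[], Any], fallback: Any, max_attempts: int = 2) -> Outcome:
    """Run one step. On internal failure, retry a bounded number of times,
    then return a safe fallback together with the recorded evidence."""
    evidence = []
    for attempt in range(1, max_attempts + 1):
        try:
            return Outcome(ok=True, value=step(), evidence=evidence)
        except Exception as exc:  # non-adversarial fault: bad read, overflow, bad state
            evidence.append({
                "attempt": attempt,
                "error": repr(exc),
                "trace": traceback.format_exc(),
            })
    return Outcome(ok=False, value=fallback, evidence=evidence)


def read_sensor() -> float:
    raise ValueError("sensor returned NaN")  # simulated internal fault


result = run_bounded(read_sensor, fallback=None)
print(result.ok, len(result.evidence))
```

Nothing in this path involves an attacker; the failure is contained and documented all the same.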

Utility. The agent must remain capable of engagement under sustained load. Not “able to refuse harmful requests” — able to engage well, at consequential scale, with the discipline to act on evidence and to refuse on evidence. Useless bureaucrats are a safety failure, not a safety success. Conventional security has no concept of the cost of ceasing to act; safety must.
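
“Act on evidence and refuse on evidence” suggests that a refusal should carry the same burden of proof as an action. The sketch below assumes a hypothetical `Decision` record and a hypothetical metric, `unsupported_refusal_rate`, for spotting default refusals; the request strings are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Decision:
    act: bool
    evidence: str  # every verdict, including a refusal, should cite evidence


def engage_or_refuse(request: str, risk_evidence: Optional[str]) -> Decision:
    """Refuse only when there is concrete evidence of risk; otherwise engage."""
    if risk_evidence:
        return Decision(act=False, evidence=risk_evidence)
    return Decision(act=True, evidence="no risk evidence found for: " + request)


def unsupported_refusal_rate(decisions: list) -> float:
    """Share of decisions that refused without citing any evidence."""
    if not decisions:
        return 0.0
    unsupported = sum(1 for d in decisions if not d.act and not d.evidence.strip())
    return unsupported / len(decisions)


log = [
    engage_or_refuse("reconcile invoice", None),
    engage_or_refuse("refund all customers", "instruction came from an invoice memo field"),
    Decision(act=False, evidence=""),  # a default refusal: the failure mode
]

print(unsupported_refusal_rate(log))
```

A rate above zero would flag the useless-bureaucrat pattern: refusals that bought the appearance of safety without any evidence behind them.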

These three facets — alignment, resilience, utility — are Safety. Conventional security covers part of resilience (the part that defends against attack). It does not cover alignment. It does not cover utility. It barely scratches non-adversarial resilience.

The Inversion

Once you see the three facets, the relationship between safety and security inverts:

Security covers part of resilience.
Safety covers all of resilience, plus alignment, plus utility.
Therefore: safety contains security, not the other way around.

This is Position B. Safety ⊃ Security.
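
The containment claim can be checked mechanically by writing each discipline’s coverage as a set. The labels below are illustrative, and the sketch simplifies by crediting security with adversarial resilience only.

```python
# Coverage as sets of (facet, scope) pairs; the labels are assumptions for this sketch.
security = {("resilience", "adversarial")}
safety = {
    ("resilience", "adversarial"),
    ("resilience", "non-adversarial"),
    ("alignment", "all"),
    ("utility", "all"),
}

security_inside_safety = security < safety  # proper subset: Safety ⊃ Security
uncovered_by_security = safety - security   # what a security-only investment leaves out

print(security_inside_safety)
print(sorted(uncovered_by_security))
```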

The reason this matters is engineering allocation. If you believe security is the larger discipline and safety is a subset, you invest in security and assume the rest follows. You discover, only after deployment, that you have unaddressed alignment failures, unaddressed internal-failure recoveries, and an agent that either refuses everything or executes everything. None of those are security failures. All of them are safety failures, and you did not build the layer that addresses them.

If you believe safety is the larger discipline and security is a subset, you invest in both. You inherit the security investment intact. You add the alignment, resilience, and utility engineering on top. The kitchen is locked and the cook is competent.

What Position B Asks of You

Three engineering consequences follow from the inversion:

1. Run safety and security as separate engineering tracks. Do not roll safety up under your existing security org without giving it its own surface, its own instruments, and its own verdicts. The two solve different problems. Treating them as the same problem is the root cause of post-deployment alignment failures.

2. Instrument all three safety facets. Alignment needs intent verification, mandate fidelity, prompt-injection bounds. Resilience needs internal-failure detection, bounded recovery, contained failure. Utility needs the engagement discipline that prevents useless-bureaucrat default-refusal. Each facet has its own tooling.

3. Compose multiplicatively, not additively. Trust × Safety, both required: if either factor is zero, the product is zero, and strength in one cannot offset absence in the other. Strong security with no alignment engineering is a high-fidelity instrument that will execute the prompt-injected instruction with full attestation. The attestation is proof of the failure, not of its absence.
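
The difference between the two composition rules shows up in a small calculation. The scores in [0, 1] and both function names are assumptions made for illustration; `trust` here stands for the score of the security layer.

```python
def additive(trust: float, safety: float) -> float:
    """Averaging lets a strong factor mask a missing one."""
    return (trust + safety) / 2


def multiplicative(trust: float, safety: float) -> float:
    """A zero in either factor zeroes the result."""
    return trust * safety


# Strong security, no alignment engineering (assumed scores).
trust, safety = 1.0, 0.0

print(additive(trust, safety))
print(multiplicative(trust, safety))
```

Under the additive rule the system still appears half-acceptable; under the multiplicative rule it scores zero, which is the point of the third consequence.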

The Receipt

Position B is not new. Process safety engineering, the discipline that engineers chemical plants, aerospace systems, and automotive controls, has held this position for decades. It has long understood that attack is only one mode of failure, and that engineered systems have to handle the others: misuse, internal error, capability under load. The agentic-systems community is not inventing this distinction. It is receiving it.

What is new is the specific shape of the agentic instance: alignment as a first-class concern (because mandates can be subverted by prompt content), resilience extended to non-adversarial internal failure (because the model itself can produce unexpected outputs), utility as engineered discipline (because refusal-as-default is a real and tempting failure mode).

The conventional security investment stays. The safety layer is added on top. The composition is multiplicative.

The kitchen is locked and the cook is competent. Both. Either alone is failure.
