Adversarial Stress-Testing — AI as Punching Bag

AI does not co-author Zenzic. It is the punching bag we hit until the design stops bleeding.


1. The Arena

Zenzic development operates as an adversarial arena. The rules are simple:

  • Humans decide the architecture. The Three Pillars, the VSM design, the Shield pipeline, and the Blood Sentinel perimeter are strategic human choices.
  • AI is deployed as a controlled Red Team. Its mission is to find logical flaws, Pillar violations, and security weaknesses in what the human has already decided.

| Role | Function |
| --- | --- |
| Human Architect | Integrity Gatekeeper. Decides strategy. Sets invariants. Owns liability. Ratifies or rejects every AI finding. |
| AI Red Team | Adversarial Auditor. Attacks assumptions. Attempts to violate the Three Pillars. Surfaces hidden coupling. Finds contradictions before they ship. |

The Core Rule:

"AI output is treated as an adversarial pull request. It must pass the Constitutional Audit before being merged. If the AI can propose a valid violation, it has found a real bug — not a style suggestion."


2. The Three Pillars as Stress-Test Targets

Every AI-assisted session in Zenzic is framed as a direct attack on one or more of the Three Pillars:

Pillar 1 — Lint the Source, Not the Build

The attack surface: Can any analysis code be made to depend on HTML output, compiled assets, or build artifacts? Can a rule be written that only fires after a build completes?

The invariant: Analysis operates on raw Markdown and configuration files. If the AI finds a code path that reads build/ or requires a build step before firing, it has found a Pillar 1 violation.
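
A minimal sketch of what source-only analysis can look like. The helper name `find_dead_links`, the link regex, and the in-memory `sources` mapping are illustrative assumptions, not Zenzic's actual API. The point is only that the check consumes raw Markdown text and never opens a build directory:

```python
import posixpath
import re

# Hypothetical pattern: captures the target of inline links like [text](target.md).
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s#]+)")


def find_dead_links(sources: dict[str, str]) -> list[tuple[str, str]]:
    """Return (file, target) pairs whose relative target is not a known source file.

    `sources` maps a docs-relative path to raw Markdown text. No HTML output,
    compiled asset, or build step is consulted.
    """
    findings = []
    for path, text in sources.items():
        for target in LINK_RE.findall(text):
            if "://" in target:
                continue  # external URLs are out of scope for this sketch
            resolved = posixpath.normpath(
                posixpath.join(posixpath.dirname(path), target)
            )
            if resolved not in sources:
                findings.append((path, target))
    return findings


sources = {
    "index.md": "See the [guide](guide/setup.md) and the [FAQ](faq.md).",
    "guide/setup.md": "Back to [home](../index.md).",
}
print(find_dead_links(sources))  # [('index.md', 'faq.md')]
```

Because the function takes text as input, it can run before any build exists, which is the property a Pillar 1 attack tries to break.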

Pillar 2 — Zero Subprocesses

The attack surface: Can any code path lead to subprocess.run, os.system, os.popen, shutil.which + exec, or any form of external process invocation? Can a plugin contract be satisfied by a class that wraps a subprocess?

The invariant: 100% pure Python. No exceptions. If the AI can write a BaseRule subclass that calls a subprocess and still passes plugin contract validation without raising PluginContractError, that is a real contract vulnerability.
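
One way to probe this attack surface is a static scan of rule source code. The sketch below uses the standard `ast` module; the function name `find_process_calls` and the forbidden lists are assumptions made for illustration and say nothing about how Zenzic's own contract validation is implemented:

```python
import ast

# Assumed deny-lists for this sketch only.
FORBIDDEN_MODULES = {"subprocess"}
FORBIDDEN_ATTRS = {("os", "system"), ("os", "popen")}


def find_process_calls(source: str) -> list[str]:
    """Report imports and attribute accesses that suggest process invocation."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in FORBIDDEN_MODULES:
                    findings.append(f"line {node.lineno}: import {alias.name}")
        elif isinstance(node, ast.ImportFrom):
            if (node.module or "").split(".")[0] in FORBIDDEN_MODULES:
                findings.append(f"line {node.lineno}: from {node.module} import ...")
        elif isinstance(node, ast.Attribute) and isinstance(node.value, ast.Name):
            if (node.value.id, node.attr) in FORBIDDEN_ATTRS:
                findings.append(f"line {node.lineno}: {node.value.id}.{node.attr}")
    return findings


# A stand-in rule class; it does not inherit from the real BaseRule.
RULE_SOURCE = '''
import subprocess

class SneakyRule:
    def check(self, file_path, text):
        subprocess.run(["true"])
        return []
'''
print(find_process_calls(RULE_SOURCE))  # ['line 2: import subprocess']
```

A scan like this is easy to evade with `__import__`, `importlib`, or `getattr` indirection, and those evasions are exactly what an adversarial session is meant to surface.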

Pillar 3 — Pure Functions First

The attack surface: Can analysis logic accumulate state between calls? Can a rule hold a mutable counter that affects future findings? Can the check() method make an I/O call that influences its output?

The invariant: Analysis logic is deterministic. check(file_path, text) always returns the same findings for the same inputs. If the AI can construct a valid BaseRule implementation with a hidden self._cache that changes behavior on the second call, it has found a Pillar 3 violation.
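
The hidden-cache scenario can be sketched with two duck-typed stand-ins. Neither class inherits from the real BaseRule, and `is_deterministic` is a hypothetical helper; only the `check(file_path, text)` signature comes from the text above:

```python
class StatefulRule:
    """Hypothetical rule with a hidden cache that silences repeat findings."""

    def __init__(self):
        self._cache = set()

    def check(self, file_path, text):
        if file_path in self._cache:
            return []  # second call on the same file reports nothing
        self._cache.add(file_path)
        return [f"{file_path}: TODO found"] if "TODO" in text else []


class PureRule:
    """Same detection logic, with no state carried between calls."""

    def check(self, file_path, text):
        return [f"{file_path}: TODO found"] if "TODO" in text else []


def is_deterministic(rule, file_path, text, calls=3):
    """Call check() repeatedly on identical inputs and compare the findings."""
    first = rule.check(file_path, text)
    return all(rule.check(file_path, text) == first for _ in range(calls - 1))


print(is_deterministic(StatefulRule(), "a.md", "TODO: fix"))  # False
print(is_deterministic(PureRule(), "a.md", "TODO: fix"))      # True
```

Repeated-call comparison only catches state that changes within the probed inputs. A rule whose behavior flips after many calls or on other files would need a broader probe.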


3. Adversarial Session Types

Type A — Architecture Violation Hunt

The AI is given the full codebase and tasked with finding any code that violates an [INVARIANT] from the Zenzic Ledger. No guidance is given on where to look.

Outcome: If a real violation is found, it is promoted to a bug and fixed in the same sprint. If no violations are found, the session confirms architectural soundness.

Type B — Regex Canary Attack (ZRT-002)

The AI is tasked with constructing a regex pattern that:

  1. Would be accepted by AdaptiveRuleEngine construction (passes _assert_regex_canary)
  2. Exhibits catastrophic backtracking on input sizes > 1 KiB

This is a direct security stress-test on the ReDoS hardening.
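
To make the idea of a canary concrete, here is a rough sketch of a timing-based check. It is not the `_assert_regex_canary` implementation, whose details are not described here. The function name `passes_canary`, the canary string, and the 50 ms budget are all assumptions:

```python
import re
import time


def passes_canary(pattern: str, canary: str, budget_s: float = 0.05, tries: int = 2) -> bool:
    """Accept the pattern if one match attempt on the canary finishes within budget.

    Python's `re` has no timeout, so the canary must be short enough that even a
    catastrophic pattern terminates in reasonable time.
    """
    compiled = re.compile(pattern)
    for _ in range(tries):
        start = time.perf_counter()
        compiled.match(canary)
        if time.perf_counter() - start <= budget_s:
            return True
    return False


# A run of 'a' followed by a non-matching character forces backtracking.
CANARY = "a" * 22 + "!"

results = {p: passes_canary(p, CANARY) for p in (r"a+$", r"(a+)+$")}
print(results)
```

On a typical CPython build the nested quantifier `(a+)+$` needs exponentially many steps on this input and is rejected, while `a+$` is accepted. A pattern that stays fast on the canary's shape but blows up on a different or longer input would slip through, and finding such a pattern is the goal of a Type B session.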

Type C — Shield Bypass Hunt

The AI is given the 8-stage normalization pipeline and tasked with constructing a Markdown fragment that:

  1. Contains a real credential (from a known family in _SECRETS)
  2. Passes through all 8 normalization stages undetected

This is the most adversarial session type. Any successful bypass is a Z201 SHIELD_SECRET detection failure — a Critical security finding requiring an immediate patch and a new normalization stage.
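
The sketch below illustrates why normalization precedes detection. It uses three illustrative stages, not the eight stages of the Shield. The single secret pattern (the AWS access key ID format, with AWS's published example key) stands in for `_SECRETS`, and all names are hypothetical:

```python
import html
import re
import unicodedata

# Assumption: one illustrative secret family (AWS access key ID format).
SECRET_RE = re.compile(r"AKIA[0-9A-Z]{16}")

# Zero-width characters that can split a token invisibly.
ZERO_WIDTH = dict.fromkeys(map(ord, "\u200b\u200c\u200d\u2060\ufeff"))


def normalize(text: str) -> str:
    text = html.unescape(text)                   # decode HTML entities
    text = unicodedata.normalize("NFKC", text)   # fold fullwidth and other compatibility forms
    return text.translate(ZERO_WIDTH)            # delete zero-width characters


def has_secret(text: str, *, normalized: bool = True) -> bool:
    candidate = normalize(text) if normalized else text
    return SECRET_RE.search(candidate) is not None


PAYLOADS = [
    "key: AKIAIOSF\u200bODNN7EXAMPLE",   # zero-width space inside the token
    "key: AKIA&#73;OSFODNN7EXAMPLE",     # one letter written as an HTML entity
    "key: \uff21KIAIOSFODNN7EXAMPLE",    # fullwidth 'A' as the first letter
]
for payload in PAYLOADS:
    print(has_secret(payload, normalized=False), has_secret(payload))
# Each line prints: False True
```

Each payload evades the raw regex and is caught after normalization. A Type C session looks for an encoding that no existing stage undoes.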

Type D — Blood Sentinel Escape

The AI is given the InMemoryPathResolver._build_target() implementation and tasked with constructing a path string that:

  1. Is a valid relative Markdown link (parseable by the MDX renderer)
  2. After os.path.normpath() collapse, resolves to a path outside docs_root
  3. Does not contain obvious traversal sequences (literal ../)

Any successful escape is a Z202 PATH_TRAVERSAL false-negative — a Critical security finding requiring immediate perimeter hardening.
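
A minimal containment check, sketched under assumptions: `escapes_root` is a hypothetical function, not `InMemoryPathResolver._build_target()`, and percent-decoding is one guess at how a traversal could hide without a literal `../`. It uses `posixpath` so the behavior is the same on every platform:

```python
import posixpath
from urllib.parse import unquote


def escapes_root(docs_root: str, source_file: str, link: str, *, decode: bool = True) -> bool:
    """Return True if `link`, resolved relative to `source_file`, leaves `docs_root`."""
    target = unquote(link) if decode else link
    root = posixpath.normpath(docs_root)
    resolved = posixpath.normpath(
        posixpath.join(root, posixpath.dirname(source_file), target)
    )
    # commonpath compares whole components, so "/project/docs2" is not
    # mistaken for a child of "/project/docs" as a prefix test would allow.
    return posixpath.commonpath([root, resolved]) != root


ROOT = "/project/docs"
ENCODED = "guide/%2e%2e/%2e%2e/secret.md"  # no literal "../" in the link text

print(escapes_root(ROOT, "guide/setup.md", "../index.md"))  # False
print(escapes_root(ROOT, "index.md", ENCODED, decode=False))  # False (false negative)
print(escapes_root(ROOT, "index.md", ENCODED))                # True
```

This sketch decodes only once, so a double-encoded sequence such as `%252e%252e` would still pass. Whether later stages decode it again is the kind of question a Type D session investigates.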


4. Governance Badge

AI-Tested / Human-Governed

This badge, visible in the Zenzic README, signals:

  1. AI was used — in adversarial sessions, as documented here.
  2. Humans decided — every strategic choice, every invariant, every merge decision.
  3. Transparency — you know exactly how AI was deployed in this project.

Every sprint that involves adversarial sessions records them in the Zenzic Ledger [ACTIVE SPRINT] entry: sessions run, violations found, and outcome.


5. What AI Does Not Decide

| Decision | Authority |
| --- | --- |
| Architectural Principles (Three Pillars) | Human (non-delegable) |
| Finding code semantics (Zxxx registry) | Human (ratified in core/codes.py) |
| Exit code contract (0/1/2/3) | Human (immutable) |
| Sprint scope and release schedule | Human |
| Whether an AI finding is a real violation | Human Integrity Gatekeeper |

AI proposes. AI attacks. AI does not ratify.


6. FAQ

Q: Is Zenzic "written by AI"?

No. Zenzic is stress-tested by AI. The Three Pillars, the VSM architecture, the Shield normalization pipeline, and the Blood Sentinel perimeter are human strategic choices. AI is used to test these choices by attempting to break them.

Q: Can I attribute a bug to the AI?

No. The human Integrity Gatekeeper merged the code. The Integrity Gatekeeper owns the bug. The AI's role was to catch it before merge — if it failed to, that is an audit protocol failure, not a liability transfer.

Q: Does the AI have access to secrets or production systems during adversarial sessions?

No. Adversarial sessions operate on the open-source codebase only. The AI is given source code and documentation. No credentials, no deployment keys, no production access. The adversarial model is purely analytical.

Q: Why document this?

To avoid information asymmetry. When you read a Zenzic ADR that says "the Shield has 8 normalization stages", you should know that those 8 stages survived a Type C adversarial session in which an AI attempted to construct bypass payloads for each one. The rigor is deliberate. The transparency is part of the security model.

Saga VI: The Governance of Quartz — read the chronicle