Zenzic v0.16.0: Engineering Determinism into Documentation Pipelines
Zenzic v0.16.0 "Magnetite" marks the transition from an experimental toolchain to production infrastructure. The changes in this release address three classes of problems that accumulate silently in long-running documentation pipelines: malformed configuration that allows analysis to start on an invalid basis, unused allowlist entries that widen the security perimeter without purpose, and inconsistent exception types that force CI adapters to implement defensive workarounds.
1. The Bootstrap Gate: Z001 CORE_CONFIG_STRUCTURE
Problem statement
Prior to v0.16.0, a type error in .zenzic.toml could cause Zenzic to start analysis, produce partial output, and exit with an ambiguous message — or, in some code paths, exit 0. A CI pipeline consuming that output would record a clean run on top of a structurally invalid configuration.
The specific failure mode is a type mismatch in a numeric field:
# examples/z001-config-error/.zenzic.toml
[governance]
suppression_cap = "high" # invalid: must be an integer
suppression_cap expects an integer. When the value is a string, the configuration parser previously applied a best-effort coercion path that masked the error. Analysis proceeded with an unvalidated governance ceiling.
The fix
Zenzic v0.16.0 introduces ZenzicConfigError, a typed subclass of ConfigurationError, hard-coded to code Z001:
# src/zenzic/core/exceptions.py
from typing import Any

class ZenzicConfigError(ConfigurationError):
    """Raised when the project configuration structure is semantically invalid (Z001)."""

    def __init__(self, message: str, context: dict[str, Any] | None = None) -> None:
        # The diagnostic code is fixed: callers cannot override Z001.
        super().__init__(message, context, code="Z001")
When this exception is raised, cli_main() intercepts it and — if the output format is sarif or json — serializes a structured error report before exiting:
{
  "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
  "version": "2.1.0",
  "runs": [{
    "tool": { "driver": { "name": "zenzic", "version": "0.16.0", "rules": [] } },
    "invocations": [{
      "executionSuccessful": false,
      "toolExecutionNotifications": [{
        "descriptor": { "id": "Z001" },
        "level": "error",
        "message": { "text": "suppression_cap: expected int, got str ('high')" }
      }]
    }],
    "results": []
  }]
}
The executionSuccessful: false field in the SARIF invocation block is the critical element. The GitHub Code Scanning Security tab reads this field and marks the run as failed — even when results is empty. Z001 surfaces in the Security tab with the exact configuration file and message, without requiring any finding in the scanned documents.
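A report of this shape can be assembled from the exception's code and message alone. The following is a minimal sketch, not Zenzic's actual serializer; the function name `build_sarif_error_report` and its parameters are hypothetical and exist only for illustration.

```python
import json
from typing import Any

SARIF_SCHEMA = "https://json.schemastore.org/sarif-2.1.0.json"


def build_sarif_error_report(code: str, message: str, tool_version: str) -> dict[str, Any]:
    """Build a SARIF 2.1.0 log with a failed invocation and no results.

    Hypothetical helper that mirrors the report shown above.
    """
    notification = {
        "descriptor": {"id": code},
        "level": "error",
        "message": {"text": message},
    }
    invocation = {
        # Marks the run as failed even though "results" is empty.
        "executionSuccessful": False,
        "toolExecutionNotifications": [notification],
    }
    run = {
        "tool": {"driver": {"name": "zenzic", "version": tool_version, "rules": []}},
        "invocations": [invocation],
        "results": [],
    }
    return {"$schema": SARIF_SCHEMA, "version": "2.1.0", "runs": [run]}


report = build_sarif_error_report(
    "Z001", "suppression_cap: expected int, got str ('high')", "0.16.0"
)
print(json.dumps(report, indent=2))
```

Keeping the notification under `invocations` rather than `results` matches the report above: the failure describes the tool run itself, not a finding in a scanned document.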
Key guarantee: Z001 fires before any Markdown file is opened. Analysis either starts from a validated configuration or does not start at all.
$ zenzic check all examples/z001-config-error/
# Exit: 1 (Z001 — configuration guard raised before analysis begins)
ADR-020 (Law of the Mirror): every diagnostic code must be reproducible via a physical fixture. The examples/z001-config-error/ directory provides the canonical reproduction case for Z001.
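The fail-fast behaviour described in this section can be sketched as a validation step that runs before any analysis callback. This is an illustration under stated assumptions, not Zenzic's implementation: `ConfigurationError` below is a stand-in for the real base class, and `validate_governance` and `run` are hypothetical names.

```python
from typing import Any, Callable, Optional


class ConfigurationError(Exception):
    """Stand-in for the real base class in zenzic.core.exceptions (assumed shape)."""

    def __init__(self, message: str, context: Optional[dict[str, Any]] = None,
                 code: Optional[str] = None) -> None:
        super().__init__(message)
        self.context = context or {}
        self.code = code


class ZenzicConfigError(ConfigurationError):
    def __init__(self, message: str, context: Optional[dict[str, Any]] = None) -> None:
        super().__init__(message, context, code="Z001")


def validate_governance(config: dict[str, Any]) -> None:
    """Hypothetical gate: reject a non-integer suppression_cap without coercion."""
    governance = config.get("governance", {})
    if "suppression_cap" not in governance:
        return
    value = governance["suppression_cap"]
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise ZenzicConfigError(
            f"suppression_cap: expected int, got {type(value).__name__} ({value!r})",
            context={"field": "governance.suppression_cap"},
        )


def run(config: dict[str, Any], analyze: Callable[[], None]) -> int:
    """Validate first; call analyze() only when the configuration is sound."""
    try:
        validate_governance(config)
    except ZenzicConfigError as exc:
        print(f"{exc.code}: {exc}")
        return 1
    analyze()
    return 0
```

With the invalid fixture value `"high"`, `run` returns 1 and the `analyze` callback is never invoked, which is the ordering the Z001 guarantee describes.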
2. Configuration Hygiene: Z110 STALE_ALLOWLIST_ENTRY
Problem statement
absolute_path_allowlist in .zenzic.toml permits specific absolute path prefixes in links without triggering Z105 ABSOLUTE_PATH. This allowlist exists for legacy migration paths. Once the migration is complete, the entries in the allowlist are no longer referenced by any link in the documentation.
An unused allowlist entry is a security-relevant configuration item. It represents a declared exemption from a rule for a path prefix that no longer exists in the codebase. It widens the rule's blind spot without benefit.
The fix
After every link scan, Z110 STALE_ALLOWLIST_ENTRY compares the set of absolute path prefixes actually used in links against the set of allowlisted prefixes. Any allowlist entry that matched zero links emits a warning:
.zenzic.toml:1: Stale absolute_path_allowlist entry:
'/legacy/unused/path/' is never referenced in links.
| Code | Severity | DQS Penalty | Exit (default) | Exit (--strict) |
|---|---|---|---|---|
| Z110 | warning | 1.0 (structural) | 0 | 1 |
Z110 is a warning by default, so the run exits 0. In repositories that set strict = true in .zenzic.toml, the run exits 1 and blocks the CI pipeline. A configuration item that serves no function should be removed, not tolerated.
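The stale-entry check amounts to a prefix match between the allowlist and the link targets collected during the scan. The sketch below assumes simple string-prefix matching; the function name `find_stale_allowlist_entries` and the sample paths other than '/legacy/unused/path/' are hypothetical.

```python
def find_stale_allowlist_entries(allowlist: list[str], link_targets: list[str]) -> list[str]:
    """Return allowlisted prefixes that no scanned link starts with (order preserved)."""
    return [
        prefix
        for prefix in allowlist
        if not any(target.startswith(prefix) for target in link_targets)
    ]


# Hypothetical scan result: one allowlisted prefix is still in use, one is not.
allowlist = ["/legacy/unused/path/", "/legacy/api/"]
links = ["/legacy/api/v1/index.md", "guide/setup.md"]

for entry in find_stale_allowlist_entries(allowlist, links):
    print(f"Stale absolute_path_allowlist entry: '{entry}' is never referenced in links.")
```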
Fix: remove the stale entry from .zenzic.toml.
3. Unified Exception Hierarchy
Problem statement
Before v0.16.0, distinct code paths in the scanner raised different exception types for semantically identical conditions — a rule violation. CI adapters implemented catch-by-type logic that required updates every time a new exception subclass was introduced.
The fix
ZenzicViolation is now the canonical exception for any condition where a rule or policy is violated. The full hierarchy is:
ZenzicError
├── ConfigurationError — missing / malformed config
│ ├── ZenzicConfigError — semantic invalidity (Z001)
│ └── EngineError — engine absent or incompatible
├── ZenzicViolation — rule or policy violation
├── CheckError — internal check machinery failure
├── NetworkError — transport-level HTTP failure
└── PluginContractError — plugin serialisability violation
The cli_main() top-level handler catches ZenzicError in a single branch. The structured context dictionary and code attribute on every exception provide the information needed to produce valid SARIF without pattern-matching on subclass names. Any CI adapter that catches ZenzicError at the boundary receives the full diagnostic payload regardless of which subclass was raised.
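A boundary handler of this kind might look like the following. It is a sketch under assumptions: the class bodies are stand-ins that only model the `code` and `context` attributes described above, and `run_at_boundary` is a hypothetical name, not part of Zenzic's API.

```python
from typing import Any, Callable, Optional


class ZenzicError(Exception):
    """Stand-in base class carrying `code` and `context` (assumed constructor)."""

    def __init__(self, message: str, context: Optional[dict[str, Any]] = None,
                 code: Optional[str] = None) -> None:
        super().__init__(message)
        self.context = context or {}
        self.code = code


class ZenzicViolation(ZenzicError):
    """Rule or policy violation."""


class NetworkError(ZenzicError):
    """Transport-level HTTP failure."""


def run_at_boundary(task: Callable[[], None]) -> dict[str, Any]:
    """Catch the base class once; read code and context instead of the subclass name."""
    try:
        task()
    except ZenzicError as exc:
        return {"ok": False, "code": exc.code, "message": str(exc), "context": exc.context}
    return {"ok": True}


def failing_task() -> None:
    # Z105 ABSOLUTE_PATH is used here purely as sample data.
    raise ZenzicViolation("absolute path in link", {"file": "docs/index.md"}, code="Z105")


print(run_at_boundary(failing_task))
```

Because the handler never inspects the concrete type, introducing a new subclass of `ZenzicError` requires no change to the adapter.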
4. The Great Decoupling: ADR-075
The Zenzic Core engine produces SARIF, JSON, and text output, and reports its status through a stable, platform-agnostic exit-code contract:
| Exit | Meaning | Suppressible? |
|---|---|---|
| 0 | All checks passed | — |
| 1 | Documentation findings | Yes (--exit-zero) |
| 2 | Credential detected (Z201/Z204) | Never |
| 3 | Path traversal (Z202/Z203) | Never |
The Core contains no code that produces GitHub Annotations, calls the GitHub API, reads GITHUB_OUTPUT, or writes to GITHUB_STEP_SUMMARY. This is a design constraint documented in ADR-075:
Logic that maps Zenzic output to a CI platform's native format must live in the Adapter, never in the Core. Exit codes 2 and 3 propagate unchanged through every adapter.
The Zenzic Action v2.3.0 is the official GitHub Adapter. It reads the Core's exit code and SARIF output, then performs all GitHub-specific operations. The same Core binary can be consumed by a GitLab CI adapter, a Bitbucket Pipeline script, or a local pre-commit hook without any changes to the Core.
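The exit-code contract can be modelled in a few lines from an adapter's point of view. The sketch below is an assumption about how an adapter might apply the table, not code from the Zenzic Action; `adapter_exit_code` and its `exit_zero` parameter are hypothetical names.

```python
# Exit codes that must pass through every adapter unchanged (credentials, path traversal).
NEVER_SUPPRESSED = frozenset({2, 3})


def adapter_exit_code(core_exit: int, exit_zero: bool = False) -> int:
    """Map the Core's exit code to the adapter's final exit code."""
    if core_exit in NEVER_SUPPRESSED:
        # Security exits propagate regardless of any suppression setting.
        return core_exit
    if core_exit == 1 and exit_zero:
        # Documentation findings are the only suppressible outcome.
        return 0
    return core_exit


for code in (0, 1, 2, 3):
    print(code, "->", adapter_exit_code(code, exit_zero=True))
```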
5. RE2 and DFA Guarantees
The Zenzic scanner uses an RE2-compatible regex engine for all pattern matching. RE2 compiles patterns to a Deterministic Finite Automaton (DFA), which provides O(N) time complexity with respect to input length N.
Adding new rules to the scanner does not change the worst-case scan time per file. A repository with 10,000 Markdown files scanned against 50 rules takes the same time per-file as a repository scanned against 5 rules — scan time scales linearly with file count, not rule count.
Any plugin rule that introduces a backtracking-capable pattern is rejected by the Z902 RULE_TIMEOUT guard, which enforces a per-file time limit.
| Property | Guarantee |
|---|---|
| Time complexity per file | O(N) — linear in input length |
| Regex engine | RE2/DFA — no backtracking |
| Rule timeout guard | Z902 — emitted if per-file limit exceeded |
| New rules affect per-file time | No |
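The linear bound follows from how a DFA consumes input: one table lookup per character, with no state to revisit. The toy automaton below is hand-written for illustration and is unrelated to RE2's internals or to any Zenzic rule; it accepts strings over the alphabet {a, b} that end in "ab".

```python
# Transition table for a three-state DFA: state 0 = start,
# state 1 = last char was "a", state 2 = input so far ends in "ab".
TRANSITIONS = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 1, (2, "b"): 0,
}
ACCEPTING = {2}


def dfa_match(text: str) -> tuple[bool, int]:
    """Run the DFA over text; return (accepted, number of transitions taken).

    Characters outside {a, b} raise KeyError in this toy version.
    """
    state = 0
    steps = 0
    for ch in text:
        state = TRANSITIONS[(state, ch)]  # exactly one lookup per character
        steps += 1
    return state in ACCEPTING, steps


accepted, steps = dfa_match("a" * 1000 + "b")
print(accepted, steps)
```

The step count always equals the input length, whatever the input looks like, which is the property a backtracking engine cannot guarantee.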
6. Zenzic Action v2.3.0 and the Integration Blueprints
Aligned with Core v0.16.0, the Action introduces three documented CI blueprints.
Blueprint 1 — Baseline Check: standard link and topology validation on every push and PR.
- name: Run Zenzic Baseline
  uses: PythonWoods/zenzic-action@v2
  with:
    version: "0.16.0"
    format: text
    fail-on-error: "true"
Blueprint 2 — Security Hardening: credential and path-traversal pre-gate with SARIF upload to GitHub Code Scanning.
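- name: Run Hardened Zenzic Audit
  uses: PythonWoods/zenzic-action@v2
  with:
    version: "0.16.0"
    format: sarif
    upload-sarif: "true"
    guard-scan: "true"

# Set at job or workflow level, not on the step; security-events: write is required for the SARIF upload.
permissions:
  contents: read
  security-events: write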
Blueprint 3 — PR Governance: DQS regression blocking against a baseline snapshot, with inline annotations and step summary output.
- name: Run Zenzic PR Quality Gate
  id: zenzic
  uses: PythonWoods/zenzic-action@v2
  with:
    version: "0.16.0"
    format: json
    diff-base: .zenzic-score.json
    fail-on-error: "true"
- name: Report DQS
  if: always()
  run: |
    echo "**DQS:** ${{ steps.zenzic.outputs.score }}/100" >> $GITHUB_STEP_SUMMARY
    echo "**Debt:** ${{ steps.zenzic.outputs.suppression-debt-pts }} pts" >> $GITHUB_STEP_SUMMARY
7. The Road to v0.17.0
Zenzic v0.17.0 will introduce the AST Walker, a traversal layer that operates on the parsed abstract syntax tree of each Markdown file rather than on raw text. This changes the scanning model from pattern matching over strings to structural queries over a typed document model, providing access to node-level metadata not available to a regex scanner: heading hierarchy, list nesting depth, table cell count, and link context.
The RE2/DFA O(N) guarantee is preserved. The AST Walker does not introduce pattern matching; it provides a structured input layer that the existing rule engine traverses in O(N) with respect to document node count.
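Since the AST Walker is not yet released, the following is purely a conceptual sketch of a traversal that visits each node exactly once. The `Node` type, its fields, and the `walk` and `heading_levels` functions are all hypothetical and say nothing about the eventual v0.17.0 API.

```python
from dataclasses import dataclass, field
from typing import Iterator


@dataclass
class Node:
    """Hypothetical document node: a kind, an optional heading level, and children."""
    kind: str
    level: int = 0
    children: list["Node"] = field(default_factory=list)


def walk(root: Node) -> Iterator[Node]:
    """Yield every node once, in document order, using an explicit stack."""
    stack = [root]
    while stack:
        node = stack.pop()
        yield node
        # Reverse so the first child is popped first.
        stack.extend(reversed(node.children))


def heading_levels(root: Node) -> list[int]:
    """A structural query: heading levels in the order they appear."""
    return [node.level for node in walk(root) if node.kind == "heading"]


doc = Node("document", children=[
    Node("heading", 1),
    Node("list", children=[Node("list_item"), Node("list_item")]),
    Node("heading", 2),
])
print(heading_levels(doc))
```

Each node is pushed and popped once, so the work grows linearly with the number of nodes in the document.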
No release timeline is committed. The AST Walker becomes a merge candidate when its integration test suite achieves the same coverage level as the current text-based scanner.