Zenzic — Architectural Gaps & Roadmap
"What is not documented, does not exist; what is documented poorly, is an ambush."
This page tracks gaps that were closed during the v0.7.0 cycle and those that remain open for the v0.8.0 roadmap. It is a living document — updated each sprint.
Open — Target v0.8.0
GAP-001 — Auto-Fix Engine
Component: cli/_check.py, new core/fixer.py
Description: Zenzic detects but does not repair. A contributor who receives a Z501
(placeholder) or Z502 (short content) finding must locate and edit the file manually.
An Auto-Fix engine would apply safe, reversible patches directly to source files —
replacing placeholder tokens, stubbing short sections, and reporting what was changed.
Planned semantics:
zenzic fix all # dry-run by default: shows diff, writes nothing
zenzic fix all --apply # writes changes; staged via git diff
zenzic fix links # fixes only Z101/Z104 (dead links) — renames or stubs
Design constraints:
- Auto-fix must never touch files that triggered Z201 (Shield secret) — those require human judgment.
- Exit code semantics are unchanged: `--apply` still exits 1 if unfixed findings remain.
- Pure Python, no subprocess (Pillar 2).
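The dry-run-by-default flow could be sketched as follows. This is a design-phase illustration, not merged code: `apply_fixes` and its `fixer` callback are hypothetical names, and the real engine would dispatch per finding code rather than take a single text-transform function.

```python
import difflib
from pathlib import Path


def apply_fixes(path: Path, fixer, apply: bool = False) -> str:
    """Hypothetical dry-run-first fix loop: always compute and return a
    unified diff; write to disk only when apply=True."""
    original = path.read_text(encoding="utf-8")
    fixed = fixer(original)
    diff = "".join(difflib.unified_diff(
        original.splitlines(keepends=True),
        fixed.splitlines(keepends=True),
        fromfile=str(path), tofile=f"{path} (fixed)",
    ))
    if apply and fixed != original:
        # Changes are plain file writes, so they stay reversible via git diff.
        path.write_text(fixed, encoding="utf-8")
    return diff
```

Keeping the diff computation on both paths means `--apply` is literally the dry run plus a write, so the two modes cannot drift apart.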
Status: Design phase. No code merged.
GAP-002 — Dynamic Navbar/Footer Plugin Support
Component: core/adapters/_docusaurus.py, _parse_config_navigation()
Description: Docusaurus supports navbar items declared via @docusaurus/plugin-*
plugins (e.g. plugin-content-docs multi-instance, custom navbar components). When
the navbar is populated dynamically at build time, Zenzic's static regex parser cannot
extract those paths — it falls back to treating all files as REACHABLE.
Impact: Low false-positive risk (the fallback is conservative), but some true orphans may be missed in plugin-heavy configurations.
Planned resolution: A structured warning (::warning annotation in CI mode) when
Zenzic detects dynamic navbar plugins, indicating that orphan detection may be
incomplete. The user can suppress it with dynamic_nav_plugins = true in zenzic.toml.
Status: Tracked. RFC open.
Closed in v0.7.0 — Operation Obsidian Stress
Before the v0.7.0 release, four AI agents were instructed to break Zenzic's Shield (credential scanner) using realistic bypass techniques. They found four real vectors. All were closed before stable release.
See the full technical post-mortem: AI Red Team Attacks Code Linter
ZRT-001 — Unicode Normalization Bypass (Shield)
Identified by: AI Red Team agent "Alpha" during Operation Obsidian Stress
Component: core/shield.py, scan_lines_with_lookback()
Description: The Shield's regex patterns matched ASCII credential shapes. An attacker
controlling a Markdown file could insert a Unicode lookalike character (e.g. ghp_…
using fullwidth Latin letters) into what appeared to be a token. The Shield would not
fire because the byte sequence did not match the ASCII pattern.
Resolution: Unicode normalization (NFKC) is applied to each line before pattern
matching. unicodedata.normalize("NFKC", line) collapses fullwidth, superscript,
enclosed, and other Unicode lookalikes to their ASCII canonical form. The original line
content is preserved for output; only the normalized copy is matched against.
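The bypass and the fix can be demonstrated in a few lines. The token regex below is a simplified stand-in for illustration, not Zenzic's actual Shield pattern:

```python
import re
import unicodedata

# Simplified GitHub-token shape for illustration only.
TOKEN_RE = re.compile(r"ghp_[A-Za-z0-9]{8,}")


def shield_match(line: str):
    # Match against the NFKC-normalized copy; the original line is kept
    # untouched for reporting.
    normalized = unicodedata.normalize("NFKC", line)
    return TOKEN_RE.search(normalized)


# Fullwidth Latin lookalikes (U+FF47 etc.) collapse to ASCII under NFKC,
# so this line bypasses a raw ASCII regex but not the normalized scan.
bypass_attempt = "token = ｇｈｐ＿ＡＢＣＤ12345678"
```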
Lesson: Regex-based credential detection must normalize input. The attack surface is not the pattern — it is the encoding.
Closed in: v0.7.0 sprint D038.
ZRT-002 — Lookback Buffer Escape (Shield)
Identified by: AI Red Team agent "Bravo" during Operation Obsidian Stress
Component: core/shield.py, scan_lines_with_lookback()
Description: The Shield's lookback buffer was used to detect multi-line credential
constructs (e.g. a password: key on one line, the value on the next). Agent Bravo
inserted a sufficiently long "filler" block (> buffer size) between the key and value
lines. The buffer emptied before the value line was scanned, breaking the association
and suppressing the Z201 finding.
Resolution: The lookback buffer size was validated against the maximum known
multi-line credential pattern length in the registry (codes.py). The buffer is now
guaranteed to span the maximum pattern window. Additionally, the buffer is flushed on
file boundaries only — never mid-file.
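The sizing rule can be sketched like this. The pattern table and function are hypothetical simplifications of the registry in codes.py; the point is that the buffer length is derived from the worst-case pattern window rather than hard-coded:

```python
import re
from collections import deque

# Hypothetical multi-line patterns: (key regex, value regex, max line gap).
MULTILINE_PATTERNS = [
    (re.compile(r"password\s*:\s*$"), re.compile(r"^\s*\S{8,}\s*$"), 4),
]
# Size the buffer formally against the worst-case window, not a guess.
MAX_WINDOW = max(gap for _, _, gap in MULTILINE_PATTERNS)


def scan_multiline(lines):
    findings = []
    lookback = deque(maxlen=MAX_WINDOW)  # flushed only at file boundaries
    for lineno, line in enumerate(lines, 1):
        for key_re, value_re, _gap in MULTILINE_PATTERNS:
            if value_re.match(line) and any(key_re.search(p) for p in lookback):
                findings.append(lineno)
        lookback.append(line)
    return findings
```

A filler block longer than `MAX_WINDOW` still evicts the key line, but only for gaps the registry itself declares impossible.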
Lesson: Buffer-based detection requires formal sizing against the worst-case pattern. An informal "large enough" buffer is not a security guarantee.
Closed in: v0.7.0 sprint D039.
ZRT-003 — HTML Entity Obfuscation (Shield)
Identified by: AI Red Team agent "Charlie" during Operation Obsidian Stress
Component: core/shield.py
Description: The Shield scanned raw Markdown bytes. Agent Charlie used HTML entity
encoding (`&#103;hp_…` for `ghp_…`) inside fenced code blocks. The Shield's patterns
did not match the entity-encoded form, allowing a fake credential to pass undetected.
Resolution: A lightweight HTML entity decoder is applied to each line before Shield
pattern matching (after NFKC normalization). The decoder handles numeric (`&#103;`) and
named (`&amp;`) entities. XML/HTML character references are normalized to their Unicode
codepoints before the regex runs.
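The layered normalization pipeline can be sketched with the standard library's `html.unescape`, which handles both numeric and named character references (the real decoder may differ in scope):

```python
import html
import re
import unicodedata

# Simplified token shape for illustration only.
TOKEN_RE = re.compile(r"ghp_[A-Za-z0-9]{8,}")


def normalize_for_shield(line: str) -> str:
    # Layer 1: collapse Unicode lookalikes (the ZRT-001 fix).
    line = unicodedata.normalize("NFKC", line)
    # Layer 2: decode numeric and named HTML character references (ZRT-003).
    return html.unescape(line)
```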
Lesson: Multi-encoding defense requires layered normalization. A single normalization pass (NFKC only) is insufficient when HTML rendering is part of the content pipeline.
Closed in: v0.7.0 sprint D040.
ZRT-004 — Fenced Block Scope Confusion (Shield)
Identified by: AI Red Team agent "Delta" during Operation Obsidian Stress
Component: core/shield.py, fenced block state machine
Description: The Shield originally skipped scanning inside triple-backtick fenced
blocks, reasoning that code examples are not live secrets. Agent Delta embedded a
ghp_ pattern inside a bash fenced block. The Shield did not fire.
Resolution after deliberation: The "skip fenced blocks" heuristic was reversed.
The Shield now scans all lines, including fenced code blocks. The rationale: a
documentation file that leaks a real credential inside a bash block is still leaking
a real credential. The example nature of the block is irrelevant to the security outcome.
A `# zenzic: ignore-next-line` comment is the authorized mechanism for authors who need
to include a credential-shaped string in a documented example (e.g. showing the format
of a GitHub token without using a real one). The examples/matrix/red-team/ fixtures
demonstrate this pattern.
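The reversed policy reduces to a small state machine: scan every line, and let the only exclusion be the explicit directive. This is an illustrative sketch, not the Shield's actual implementation:

```python
import re

# Simplified token shape for illustration only.
TOKEN_RE = re.compile(r"ghp_[A-Za-z0-9]{8,}")
IGNORE_DIRECTIVE = "# zenzic: ignore-next-line"


def scan_all_lines(lines):
    """Scan every line, fenced code blocks included; the only exclusion
    is the author-authorized ignore directive on the preceding line."""
    findings, skip = [], False
    for lineno, line in enumerate(lines, 1):
        if skip:
            skip = False
            continue
        if IGNORE_DIRECTIVE in line:
            skip = True
            continue
        if TOKEN_RE.search(line):
            findings.append(lineno)
    return findings
```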
Lesson: Heuristic scope exclusions that reduce false positives often create false negatives under adversarial conditions. Security-critical passes should default to scanning everything and authorize exceptions explicitly.
Closed in: v0.7.0 sprint D041.
Closed Earlier (Pre-v0.7.0)
ZRT-005 — Bootstrap Paradox
Component: core/scanner.py
Description: zenzic init crashed with a configuration error when invoked in an
empty directory. The find_repo_root() function had no fallback, making it impossible
to initialize a project that did not yet have a .git or zenzic.toml marker.
Resolution: fallback_to_cwd=True parameter added to find_repo_root(), used
exclusively by zenzic init. See ADR 003.
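The fix amounts to an upward marker search with an opt-in fallback; a minimal sketch (the marker set and error message are illustrative):

```python
from pathlib import Path


def find_repo_root(start: Path, fallback_to_cwd: bool = False) -> Path:
    """Walk upward looking for a repo marker. With fallback_to_cwd=True
    (used only by `zenzic init`) return the starting directory instead of
    crashing in an unmarked tree."""
    for candidate in (start, *start.parents):
        if (candidate / ".git").exists() or (candidate / "zenzic.toml").exists():
            return candidate
    if fallback_to_cwd:
        return start
    raise FileNotFoundError(f"no .git or zenzic.toml marker above {start}")
```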
Closed in: v0.6.0a4.
ZRT-006 — VSM Bypass: Absolute Slug Links Skipped Silently
Component: core/validator.py — Phase 2 link validation loop
Description: When a Docusaurus project declares routeBasePath-owned prefixes
(e.g. /blog/) via get_absolute_url_prefixes(), the validator suppresses Z105
(ABSOLUTE_PATH) for links starting with those prefixes. The suppression was
implemented as a bare continue, which exited the per-link iteration before the
VSM lookup — making Z001 impossible to fire on absolute prefix-owned links.
A second compounding issue: DocusaurusAdapter.set_slug_map() was never called
during validate_links_async(), so the slug map was empty at VSM construction time.
Blog posts declaring slug: my-post in frontmatter were routed via filename
derivation instead (e.g. 2026-04-29-my-post → /blog/my-post/), producing a VSM
that diverged from the URLs Docusaurus actually served.
Combined effect: A link /blog/wrong-slug where the real slug was
/blog/correct-slug produced no finding from Zenzic, while docusaurus build failed
with a broken-link error. The sentinel was blind to the most common post-rename failure
mode.
Resolution: Two coordinated fixes in core/validator.py:
- Lifecycle ordering — `adapter.set_slug_map(md_contents)` is now called (via `hasattr` guard for cross-engine safety) immediately before `build_vsm()`. The VSM is built on the correct virtual identity, not the physical filename.
- Scoped VSM lookup — After Z105 suppression, the validator checks whether the matched prefix has at least one route in the VSM (`_scanned_vsm_prefixes`). If so, it performs a `dict.get()` lookup and reports `FILE_NOT_FOUND` when the route is absent. Prefixes with no VSM entries (sibling plugins whose markdown is outside the scan scope) retain the unconditional bypass — Zero-Config invariant preserved.
Cross-engine impact: MkDocs, Zensical, and Standalone adapters do not implement
set_slug_map(). The hasattr guard makes the call a no-op for those engines — no
behaviour change.
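The scoped-bypass logic can be condensed into one decision function. Names and the return-code convention below are illustrative, not Zenzic's exact signatures:

```python
def check_absolute_link(url, vsm, scanned_vsm_prefixes, absolute_url_prefixes):
    """Sketch: classify an absolute link against prefix ownership and the VSM.
    Returns a finding code string, or None for a clean link."""
    for prefix in absolute_url_prefixes:
        if url.startswith(prefix):
            if prefix in scanned_vsm_prefixes:
                # Prefix has in-scope routes: perform the VSM lookup instead
                # of the old bare `continue`.
                return None if url in vsm else "FILE_NOT_FOUND"
            # Out-of-scope sibling plugin: unconditional bypass preserved.
            return None
    return "ABSOLUTE_PATH"  # Z105 for absolute links no prefix owns
```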
Regression lock: tests/test_docusaurus_blog_vsm.py — class
TestAbsoluteSlugMismatch — two new tests:

- `test_absolute_broken_blog_link_is_detected` — wrong slug raises `FILE_NOT_FOUND`
- `test_correct_absolute_slug_link_is_clean` — correct slug produces no error
Closed in: v0.7.0.
D100 — Privacy Gate Migration: .zenzic.dev.toml → .zenzic.local.toml
Component: cli/_standalone.py, models/config.py, core/shield.py, core/codes.py
Description: The original D002 Environmental Privacy Gate (_scaffold_dev_toml) created
.zenzic.dev.toml with a [development_gate] table that held forbidden_patterns for
export redaction. This file was not integrated into the Shield scanning pipeline —
it served only as a local redaction hint for export tooling. The patterns were never
checked against documentation content, so a developer could inadvertently publish a
document containing a forbidden code-name without any Zenzic warning.
This gap created a false sense of security: users configured forbidden_patterns
expecting Zenzic to block those terms from documentation, but the scan never happened.
Resolution (Sprint D100 — v0.7.0):
- New canonical file: `.zenzic.local.toml` replaces `.zenzic.dev.toml` as the machine-local, git-ignored privacy configuration. It is a flat TOML file with a top-level `forbidden_patterns = [...]` key.
- Automatic `.gitignore` management: `zenzic init` now always scaffolds `.zenzic.local.toml` and appends the filename to `.gitignore` if the file exists and the entry is absent. No manual step required.
- Config deep-merge: `ZenzicConfig.load()` performs an additive merge of `forbidden_patterns` from `.zenzic.local.toml` after loading the primary config (`zenzic.toml` or `[tool.zenzic]`). Duplicates are removed; insertion order is preserved.
- Z204 FORBIDDEN_TERM — Exit 2: `scan_line_for_forbidden_terms()` in core/shield.py performs a case-insensitive verbatim substring scan against the merged `forbidden_patterns` list. Any match on any line of any documentation file is emitted as a `SecurityFinding` with `secret_type="FORBIDDEN_TERM"`. The scanner bridges this to Z204 (not Z201), preserving clear separation between credential leaks and forbidden-term violations.
- Backward compatibility: `_scaffold_dev_toml()` is retained as a shim that delegates to `_scaffold_local_toml()`. No external callers need updating.
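The merge and scan semantics can be sketched in two small functions; these are illustrative reductions, not the actual `ZenzicConfig.load()` or shield implementations:

```python
def merge_forbidden_patterns(primary: list[str], local: list[str]) -> list[str]:
    # Additive merge: primary config first, then the local overlay.
    # dict.fromkeys removes duplicates while preserving insertion order.
    return list(dict.fromkeys([*primary, *local]))


def scan_line_for_forbidden_terms(line: str, patterns: list[str]) -> list[str]:
    # Case-insensitive verbatim substring scan: any pattern appearing
    # anywhere on the line is a Z204 violation.
    low = line.lower()
    return [p for p in patterns if p.lower() in low]
```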
Brand Integrity Shield — Two-Layer Design: The Z204 Privacy Gate and the Z905 Brand Obsolescence Guard form a two-layer architecture:

- Z204 (`forbidden_patterns` in `.zenzic.local.toml`): exit 2, non-suppressible. Designed for private terms that must never appear in any published doc.
- Z905 (`obsolete_names` in `zenzic.toml`): exit 1, suppressible with `zenzic:ignore Z905`. Designed for deprecated brand terms where historical references in CHANGELOG files are acceptable.
Closed in: v0.7.0 sprint D100.