Writing a New Check

Zenzic's checks live in src/zenzic/core/. Each check is a standalone function in either scanner.py (filesystem traversal) or validator.py (content validation). CLI wiring is in the cli/ package (src/zenzic/cli/).


Six-Step Checklist

  1. Implement the logic in the appropriate core module (zenzic.core.scanner or zenzic.core.validator).
  2. Delegate resolution to InMemoryPathResolver — never call os.path.exists(), Path.is_file(), or any other filesystem probe inside a per-link loop. The resolver is instantiated once before the loop; re-instantiation per file defeats the pre-computed _lookup_map and drops throughput from 430 000+ to below 30 000 resolutions/s.
  3. Test i18n — if the check involves file paths, test it in all three i18n configurations (none, folder mode, suffix mode).
  4. Wire the CLI — add a corresponding command or sub-command in the cli/ package. See the CLI Architecture reference.
  5. Write tests in tests/ covering both passing and failing cases, including a performance baseline (5 000 links resolved in < 100 ms against a mock in-memory corpus).
  6. Update examples in examples/ to exercise the new check — Zenzic validates its own examples on every commit.

Performance contract: the zenzic.core hot path must remain allocation-free. No Path object construction, no syscalls, and no relative_to() calls inside the resolution loop.
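
The once-per-run rule from step 2 and the baseline from step 5 can be sketched together. `ToyResolver` below is a hypothetical stand-in, not Zenzic's `InMemoryPathResolver`; its constructor and `resolve()` signature are assumptions made for illustration. Only the shape matters: build the lookup map once, then do nothing but dictionary lookups inside the loop.

```python
from __future__ import annotations

import time


class ToyResolver:
    """Hypothetical stand-in for a resolver with a pre-computed lookup map."""

    def __init__(self, md_contents: dict[str, str]) -> None:
        # Built once, before any per-link loop runs.
        self._lookup_map = {path: path for path in md_contents}

    def resolve(self, target: str) -> str | None:
        # Pure dict lookup: no Path construction, no syscalls.
        return self._lookup_map.get(target)


def find_broken(links: list[str], resolver: ToyResolver) -> list[str]:
    """Return every link target the resolver cannot find."""
    return [link for link in links if resolver.resolve(link) is None]


# Mock in-memory corpus: 1 000 pages, 5 000 links, every 50th link broken.
corpus = {f"guide/page-{i}.md": "" for i in range(1000)}
links = [
    "missing.md" if i % 50 == 0 else f"guide/page-{i % 1000}.md"
    for i in range(5000)
]

resolver = ToyResolver(corpus)  # instantiated once, outside the loop
start = time.perf_counter()
broken = find_broken(links, resolver)
elapsed_ms = (time.perf_counter() - start) * 1000

print(f"{len(broken)} broken links found in {elapsed_ms:.2f} ms")
```

A real baseline test would assert on `elapsed_ms` against the 100 ms budget and run against the actual resolver rather than this toy.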


Core Laws (non-negotiable)

These rules protect the performance and determinism guarantees of src/zenzic/core/. A PR that violates any of them will be rejected regardless of test coverage.

Zero I/O in the Hot Path

src/zenzic/core/ must never call Path.exists(), Path.is_file(), open(), or any other filesystem or subprocess operation inside a per-link or per-file loop.

The two permitted I/O phases are:

| Phase | Where | What |
| --- | --- | --- |
| Pass 1 | validate_links_async preamble | rglob traversal to build md_contents and known_assets |
| InMemoryPathResolver construction | __init__ | Building _lookup_map from the pre-read content dict |

Everything after Pass 1 must use only in-memory data structures:

  • Internal .md resolution → InMemoryPathResolver.resolve()
  • Non-.md asset resolution → asset_str in known_assets (frozenset[str], O(1))
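
The split between the I/O phase and the in-memory phase might look like the following sketch. `pass_one` and `missing_assets` are hypothetical helper names, not functions from zenzic.core; the point is that all disk access happens before the per-link work starts, and asset checks afterwards are plain set membership.

```python
from __future__ import annotations

import tempfile
from pathlib import Path


def pass_one(docs_root: Path) -> tuple[dict[str, str], frozenset[str]]:
    """The only I/O phase: read every file once and record what exists."""
    md_contents: dict[str, str] = {}
    assets: set[str] = set()
    for path in docs_root.rglob("*"):
        if not path.is_file():
            continue
        rel = path.relative_to(docs_root).as_posix()
        if path.suffix == ".md":
            md_contents[rel] = path.read_text(encoding="utf-8")
        else:
            assets.add(rel)
    return md_contents, frozenset(assets)


def missing_assets(asset_links: list[str], known_assets: frozenset[str]) -> list[str]:
    """Pass 2: set membership only, no filesystem access."""
    return [asset for asset in asset_links if asset not in known_assets]


# Build a throwaway docs tree so the sketch is runnable on its own.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "img").mkdir()
    (root / "index.md").write_text("# Home\n", encoding="utf-8")
    (root / "img" / "logo.png").write_bytes(b"")
    md_contents, known_assets = pass_one(root)

# The temporary directory is gone here, so nothing below can touch the disk.
print(missing_assets(["img/logo.png", "img/absent.png"], known_assets))
# → ['img/absent.png']
```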

i18n Determinism

src/zenzic/core/ must produce identical findings and identical exit codes in all three i18n configurations:

| Configuration | Root structure |
| --- | --- |
| No i18n | docs/*.md only |
| Folder mode | docs/ + i18n/<locale>/docusaurus-plugin-content-docs/current/ |
| Suffix mode | docs/*.md + docs/*.it.md |

Any check that produces different findings depending on locale configuration has a bug. Locale detection happens in the adapter layer; core must be locale-agnostic.
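
One way to test this property is to run the same check over three in-memory corpora and compare the findings. Everything in this sketch is hypothetical: `canonical()` is a deliberately simplified stand-in for the adapter layer's locale detection (it simply drops translated copies), and `broken_links()` stands in for a core check. It is not how Zenzic's adapters work; it only illustrates the shape of the assertion.

```python
from __future__ import annotations

import re

FOLDER_PREFIX = re.compile(r"^i18n/[^/]+/docusaurus-plugin-content-docs/current/")
LOCALE_SUFFIX = re.compile(r"\.[a-z]{2}\.md$")


def canonical(path: str) -> str | None:
    """Toy adapter step: return a default-locale key, or None for translations."""
    if FOLDER_PREFIX.match(path) or LOCALE_SUFFIX.search(path):
        return None
    return path.removeprefix("docs/")


def broken_links(pages: dict[str, list[str]]) -> list[tuple[str, str]]:
    """Locale-agnostic core check: report (page, target) for unknown targets."""
    return sorted(
        (page, target)
        for page, targets in pages.items()
        for target in targets
        if target not in pages
    )


def run(raw: dict[str, list[str]]) -> list[tuple[str, str]]:
    """Normalise in the adapter step, then hand plain keys to the core check."""
    pages = {}
    for path, targets in raw.items():
        key = canonical(path)
        if key is not None:
            pages[key] = targets
    return broken_links(pages)


base = {"docs/index.md": ["guide.md"], "docs/guide.md": ["missing.md"]}
folder_copy = "i18n/it/docusaurus-plugin-content-docs/current/guide.md"
CONFIGS = {
    "none": base,
    "folder": {**base, folder_copy: ["missing.md"]},
    "suffix": {**base, "docs/guide.it.md": ["missing.md"]},
}

results = {name: run(raw) for name, raw in CONFIGS.items()}
assert results["none"] == results["folder"] == results["suffix"]
print(results["none"])  # → [('guide.md', 'missing.md')]
```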

Ghost Route Awareness

Any check that validates links or routes must query the VSM, not the filesystem:

# ❌ Grade-1 violation: asks the filesystem, misses Ghost Routes
if not (docs_root / resolved_path).exists():
    yield Finding(...)

# ✅ Correct: asks the VSM
if route_info.status == RouteStatus.ORPHAN_AND_ABSENT:
    yield Finding(...)

Ghost Routes are pages generated by Docusaurus at build time (tag listings, paginated indexes, author pages) that have no physical Markdown source on disk. A filesystem check always reports them as broken.
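
The difference can be shown with a toy model. In this sketch only the name `RouteStatus.ORPHAN_AND_ABSENT` comes from the text above; the `REACHABLE` member, the `RouteInfo` dataclass, and the dictionary used as a VSM are assumptions made so the example runs on its own.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum, auto


class RouteStatus(Enum):
    REACHABLE = auto()  # hypothetical member, for illustration only
    ORPHAN_AND_ABSENT = auto()


@dataclass(frozen=True)
class RouteInfo:
    status: RouteStatus


# Toy VSM: every route the site will serve, including one with no source file.
VSM = {
    "/docs/intro": RouteInfo(RouteStatus.REACHABLE),
    "/blog/tags/release": RouteInfo(RouteStatus.REACHABLE),  # Ghost Route
}
# Stand-in for what a filesystem probe would see: only pages with a source.
FILES_ON_DISK = {"/docs/intro"}

ABSENT = RouteInfo(RouteStatus.ORPHAN_AND_ABSENT)


def broken_by_vsm(links: list[str]) -> list[str]:
    """Ask the VSM: unknown routes are treated as absent."""
    return [
        link
        for link in links
        if VSM.get(link, ABSENT).status is RouteStatus.ORPHAN_AND_ABSENT
    ]


def broken_by_filesystem(links: list[str]) -> list[str]:
    """Ask the (simulated) filesystem: Ghost Routes look broken."""
    return [link for link in links if link not in FILES_ON_DISK]


links = ["/docs/intro", "/blog/tags/release", "/docs/gone"]
print(broken_by_vsm(links))         # → ['/docs/gone']
print(broken_by_filesystem(links))  # → ['/blog/tags/release', '/docs/gone']
```

The filesystem variant reports the tag listing as broken even though the built site will serve it, which is the false positive this rule exists to prevent.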

VSM Sovereignty

When building or querying the navigation model:

  • Use only the adapter's get_nav_paths() / get_route_info() surface.
  • Never parse mkdocs.yml, docusaurus.config.ts, or any other engine config file directly inside a check. That responsibility belongs exclusively to the adapter.
  • Never call subprocess to run the build engine. Zenzic reads config as data, not as executable code.
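
A check written against the adapter surface alone is also easy to test with an in-memory double. The sketch below assumes signatures for `get_nav_paths()` and `get_route_info()`; only the two method names come from this page, and `RouteInfo`, `FakeAdapter`, and `pages_missing_from_nav` are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterable, Protocol


@dataclass(frozen=True)
class RouteInfo:
    status: str  # simplified to a string for this sketch


class Adapter(Protocol):
    """Assumed shape of the adapter surface a check is allowed to use."""

    def get_nav_paths(self) -> Iterable[str]: ...

    def get_route_info(self, rel_path: str) -> RouteInfo: ...


class FakeAdapter:
    """In-memory test double: no config parsing, no subprocess."""

    def __init__(self, nav: list[str], routes: dict[str, str]) -> None:
        self._nav = nav
        self._routes = routes

    def get_nav_paths(self) -> list[str]:
        return list(self._nav)

    def get_route_info(self, rel_path: str) -> RouteInfo:
        return RouteInfo(self._routes.get(rel_path, "absent"))


def pages_missing_from_nav(adapter: Adapter, pages: list[str]) -> list[str]:
    """Example check: it sees only the adapter, never an engine config file."""
    nav = set(adapter.get_nav_paths())
    return [page for page in pages if page not in nav]


adapter = FakeAdapter(nav=["index.md", "guide.md"], routes={"index.md": "reachable"})
print(pages_missing_from_nav(adapter, ["index.md", "guide.md", "draft.md"]))
# → ['draft.md']
```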

Adapter Contract

When a check needs adapter data:

# ✅ Correct: use the adapter
route_info = adapter.get_route_info(rel_path)

# ❌ Wrong: never parse mkdocs.yml for locale data inside a check
with open("mkdocs.yml") as f:
    config = yaml.safe_load(f)
locale = config.get("plugins", {}).get("i18n", {}).get("default_locale", "en")

Credential Scanner Obligations

If your check touches the credential scanner or harvest(), see the dedicated Credential Scanner Obligations reference. The four obligations (Worker Timeout, Regex-Canary, Dual-Stream Invariant, Mutation Score ≥ 90%) are enforced on every PR touching src/zenzic/core/.


Finding Codes

Every new check must emit findings using a code registered in FROZEN_CODES. Before adding a new code:

  1. Run zenzic inspect codes — confirm the code does not already exist.
  2. Add the code to FROZEN_CODES in the appropriate tier (Core, Structure, or Governance).
  3. Update CHANGELOG.md with the new code in the same commit.

Do not reuse retired codes. Retired codes stay in FROZEN_CODES with status retired.
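
The no-reuse rule can be enforced mechanically at registration time. The registry layout below (code mapped to a tier and a status) and the codes `X001` to `X003` are placeholders invented for this sketch; the real FROZEN_CODES structure may differ.

```python
from __future__ import annotations

# Hypothetical layout: code -> (tier, status).
FROZEN_CODES: dict[str, tuple[str, str]] = {
    "X001": ("Core", "active"),
    "X002": ("Structure", "retired"),
}
TIERS = ("Core", "Structure", "Governance")


def register_code(code: str, tier: str) -> None:
    """Add a new active code, refusing duplicates and retired codes alike."""
    if tier not in TIERS:
        raise ValueError(f"unknown tier: {tier}")
    if code in FROZEN_CODES:
        _, status = FROZEN_CODES[code]
        raise ValueError(f"{code} already registered (status: {status})")
    FROZEN_CODES[code] = (tier, "active")


register_code("X003", "Governance")

try:
    register_code("X002", "Core")  # retired code: must not be reused
except ValueError as exc:
    print(exc)  # → X002 already registered (status: retired)
```

Because retired entries stay in the registry, the same membership test covers both duplicates and retired codes.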