Security maintenance
Checkstack keeps its dependency graph patched through a daily, decoupled security-maintenance pipeline rather than ad-hoc bumps on each PR. The pipeline scans the published image and the full dependency graph, opens CI-gated pull requests for fixes it can apply automatically, prunes overrides the graph no longer needs, and tracks the rest in a single GitHub issue. This page is the reference for that machinery and for the override-drift check you will hit in PR CI.
The daily workflow
security-maintenance.yml
runs at 06:00 UTC (and on workflow_dispatch). It is intentionally separate
from PR CI so a newly-disclosed CVE or a newly-available upstream fix surfaces
even when nobody opens a PR. The workflow has two jobs.
The maintain job:
- Scans the published image (ghcr.io/<owner>/checkstack:latest) and the filesystem with Trivy at CRITICAL,HIGH,MEDIUM. Both scans run with exit-code: 0 - the job never fails on a finding; it reports and opens PRs/issues instead.
- Normalizes the two scans into one vulns.json ([{ id, pkg, installed, fixed, sev }]) and uploads it as an artifact for the remediate job.
- Re-resolves every managed override with it removed (audit:overrides:prune) and opens a chore(security): prune redundant managed overrides PR when the graph no longer needs one.
- Upserts a single tracking issue (label security-maintenance) listing every fixable finding, and closes it automatically once nothing is fixable.
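The normalization step can be sketched roughly as follows. This is a minimal illustration, not the workflow's actual code: the field names read from the Trivy JSON report (Results, Vulnerabilities, VulnerabilityID, PkgName, InstalledVersion, FixedVersion, Severity) are Trivy's own, while the function name normalizeScans, the de-duplication key, and the use of null for a missing fix are assumptions made for the example.

```typescript
// One normalized finding, matching the { id, pkg, installed, fixed, sev } shape.
interface Finding {
  id: string;
  pkg: string;
  installed: string;
  fixed: string | null; // assumption: null when Trivy reports no fixed version
  sev: string;
}

// The subset of a Trivy JSON report that this sketch reads.
interface TrivyReport {
  Results?: Array<{
    Vulnerabilities?: Array<{
      VulnerabilityID: string;
      PkgName: string;
      InstalledVersion: string;
      FixedVersion?: string;
      Severity: string;
    }>;
  }>;
}

// Hypothetical helper: merge any number of reports into one de-duplicated list.
function normalizeScans(...reports: TrivyReport[]): Finding[] {
  const seen = new Map<string, Finding>();
  for (const report of reports) {
    for (const result of report.Results ?? []) {
      for (const v of result.Vulnerabilities ?? []) {
        // Assumption: the same advisory on the same installed package is one finding,
        // whether it came from the image scan or the filesystem scan.
        const key = `${v.VulnerabilityID}|${v.PkgName}|${v.InstalledVersion}`;
        if (seen.has(key)) continue;
        seen.set(key, {
          id: v.VulnerabilityID,
          pkg: v.PkgName,
          installed: v.InstalledVersion,
          fixed: v.FixedVersion ? v.FixedVersion : null,
          sev: v.Severity,
        });
      }
    }
  }
  return [...seen.values()];
}
```

A finding counts as fixable in this sketch when fixed is not null, which is the subset the tracking issue and the remediate job care about.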
The remediate job runs in its own clean checkout (so override-pruning and
auto-remediation never share a working tree) and turns the fixable findings
into a fix(security): auto-remediate fixable vulnerabilities PR.
The pipeline generates, never merges. Every auto-PR must pass the full CI (typecheck, lint, test, build, and the security gate) before a human merges it. Version bumps can break consumers in ways unit tests do not catch, so a maintainer is always in the loop.
The managed-override manifest
security/managed-overrides.json
is the registry of every overrides/resolutions entry in the root
package.json, split by intent:
- security - a transitive pin added to remediate a vulnerability. Each entry carries a safeFloor (the lowest version that is not vulnerable) plus severity, advisory, reason, addedAt, and removeWhen. These are re-resolved daily and auto-pruned once redundant, so they MUST carry a safeFloor.
- intentional - a deliberate, permanent pin (version alignment, singletons such as the single React instance for Module Federation). These are documented but NEVER auto-removed, so they only record reason and addedAt.
The pinned version is not stored in this file. It lives in package.json
overrides/resolutions, which is the single source of truth; the manifest
only records intent and the safe floor.
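To make the two buckets concrete, here is a sketch of what the manifest could look like as TypeScript types, with a small field check. The field names come from the description above; everything else is an assumption for illustration: that entries are keyed by package name under top-level security and intentional keys, the helper name missingFields, and every value in the example object (the package name, advisory ID, and dates are placeholders, not real entries).

```typescript
// Fields a security entry is described as carrying.
interface SecurityEntry {
  safeFloor: string;
  severity: string;
  advisory: string;
  reason: string;
  addedAt: string;
  removeWhen: string;
}

// Fields an intentional entry is described as carrying.
interface IntentionalEntry {
  reason: string;
  addedAt: string;
}

// Assumed layout: two buckets, each keyed by package name.
// Note there is no pinned version here; that lives in package.json.
interface ManagedOverrides {
  security: Record<string, Partial<SecurityEntry>>;
  intentional: Record<string, Partial<IntentionalEntry>>;
}

const SECURITY_FIELDS = ["safeFloor", "severity", "advisory", "reason", "addedAt", "removeWhen"] as const;
const INTENTIONAL_FIELDS = ["reason", "addedAt"] as const;

// Hypothetical helper: list every required field that is absent or empty.
function missingFields(manifest: ManagedOverrides): string[] {
  const problems: string[] = [];
  for (const [pkg, entry] of Object.entries(manifest.security)) {
    for (const field of SECURITY_FIELDS) {
      if (!entry[field]) problems.push(`security.${pkg}: missing ${field}`);
    }
  }
  for (const [pkg, entry] of Object.entries(manifest.intentional)) {
    for (const field of INTENTIONAL_FIELDS) {
      if (!entry[field]) problems.push(`intentional.${pkg}: missing ${field}`);
    }
  }
  return problems;
}

// Placeholder data only, to show the shape.
const example: ManagedOverrides = {
  security: {
    "example-transitive-pkg": {
      safeFloor: "1.2.3",
      severity: "HIGH",
      advisory: "GHSA-xxxx-xxxx-xxxx",
      reason: "placeholder: remediates a transitive vulnerability",
      addedAt: "2025-01-01",
      removeWhen: "placeholder: upstream resolves >= 1.2.3 on its own",
    },
  },
  intentional: {
    react: { reason: "single React instance for Module Federation", addedAt: "2025-01-01" },
  },
};

console.log(missingFields(example)); // an empty list means every entry is fully documented
```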
How a fix is applied
scripts/remediate-vulns.ts
groups the findings by package and decides one remediation per package. The
invariants are deliberate:
- Lowest in-major fix. For each package it picks the lowest fixed version in the SAME major as what is installed, high enough to clear every advisory for that package.
- Never an automatic major bump. When a package’s only fix is in a higher major, it is listed as manual and left for a human. This avoids the @grpc/proto-loader trap, where forcing protobufjs 8 breaks a consumer pinned to ^7.
- Direct dependency -> range bump. When the package is declared in one or more workspace manifests, its declared range is raised to ^<target> in every owning manifest and every dependency block, including peerDependencies. Bumping all declarations together keeps syncpack’s unified version groups consistent.
- Existing security override -> raise the override. When the package is already pinned by a managed security override, the override floor is raised (never lowered) and the curated metadata is preserved. A package that is both directly declared AND security-overridden gets both: the range bump for npm consumers and the raised override for the image.
- Transitive -> root override. When the package is neither declared nor already overridden, it is pinned in the root overrides/resolutions and a fully-documented entry is added to the manifest’s security section.
- Intentional pin -> manual. A package in the intentional section (react, drizzle-orm, …) is never auto-bumped; auto-bumping would fight the intent, so it is routed to manual.
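The first two invariants and the routing rules can be sketched as below. This is an illustration under stated assumptions, not the code in scripts/remediate-vulns.ts: the names pickTarget, route, Plan, and PackageContext are hypothetical, versions are compared as plain major.minor.patch numbers (pre-release tags are ignored), each advisory is assumed to arrive with a list of versions that fix it, and checking the intentional bucket first is an assumed precedence.

```typescript
// Parse "1.2.3" into numeric parts, ignoring any pre-release or build suffix.
function parse(version: string): number[] {
  const parts = version.trim().split(/[-+]/)[0].split(".").map(Number);
  return [parts[0] || 0, parts[1] || 0, parts[2] || 0];
}

// Negative when a < b, positive when a > b, zero when equal.
function compare(a: string, b: string): number {
  const pa = parse(a);
  const pb = parse(b);
  for (let i = 0; i < 3; i++) {
    if (pa[i] !== pb[i]) return pa[i] - pb[i];
  }
  return 0;
}

type Plan = { kind: "bump"; target: string } | { kind: "manual"; reason: string };

// fixesPerAdvisory: one list of fixed versions per advisory against the package.
function pickTarget(installed: string, fixesPerAdvisory: string[][]): Plan {
  const major = parse(installed)[0];
  let target: string | null = null;
  for (const fixes of fixesPerAdvisory) {
    const inMajor = fixes.filter((f) => parse(f)[0] === major).sort(compare);
    // Never an automatic major bump: no in-major fix means a human decides.
    if (inMajor.length === 0) return { kind: "manual", reason: "only fixed in a different major" };
    const lowest = inMajor[0];
    // The target must clear every advisory, so keep the highest per-advisory minimum.
    if (target === null || compare(lowest, target) > 0) target = lowest;
  }
  return target === null ? { kind: "manual", reason: "no fixed version" } : { kind: "bump", target };
}

interface PackageContext {
  declared: boolean; // listed in at least one workspace manifest
  securityOverride: boolean; // already pinned by a managed security override
  intentional: boolean; // listed in the manifest's intentional section
}

type Route = "manual" | "range-bump" | "raise-override" | "range-bump+raise-override" | "new-root-override";

function route(ctx: PackageContext): Route {
  if (ctx.intentional) return "manual";
  if (ctx.declared && ctx.securityOverride) return "range-bump+raise-override";
  if (ctx.declared) return "range-bump";
  if (ctx.securityOverride) return "raise-override";
  return "new-root-override";
}
```

For example, with 7.2.4 installed and two advisories fixed in [7.2.5, 8.0.1] and [7.3.0] respectively, pickTarget returns a bump to 7.3.0: the lowest version in major 7 that clears both. If the only fix were 8.0.1, it would return manual instead.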
Shipping the fix in a release
Editing manifests is not enough - a release has to version and publish the
affected packages (and rebuild the Docker image) for the fix to reach
operators. The remediator therefore writes a changeset at
.changeset/auto-security-remediation.md that bumps every publishable package
whose manifest it range-bumped. When a round touches only transitive deps (no
publishable package declares them), it falls back to bumping a single platform
package (@checkstack/backend) so a release still ships the image rebuild that
carries the override fixes. Per the beta never-major rule, the changeset is
always a patch.
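As a rough sketch of that rule, the changeset body could be produced like this. The front-matter layout (quoted package name, colon, bump type, between --- fences) is the standard Changesets file format; the function name renderChangeset, the sorting, and the summary text are assumptions for illustration, not the remediator's actual output.

```typescript
// Fallback platform package, as named in the text above.
const FALLBACK_PACKAGE = "@checkstack/backend";

// Hypothetical helper: render the changeset for one remediation round.
// rangeBumped: publishable packages whose manifests were range-bumped.
function renderChangeset(rangeBumped: string[], summary: string): string {
  const unique = [...new Set(rangeBumped)].sort();
  // Transitive-only round: still bump one package so the image is rebuilt.
  const packages = unique.length > 0 ? unique : [FALLBACK_PACKAGE];
  // Always a patch, never minor or major.
  const frontMatter = packages.map((name) => `"${name}": patch`).join("\n");
  return `---\n${frontMatter}\n---\n\n${summary}\n`;
}

// A round that only touched transitive dependencies:
console.log(renderChangeset([], "Auto-remediate fixable vulnerabilities"));
```

The fallback keeps the release pipeline as the only delivery path: even an override-only change goes out through a normal versioned release rather than an out-of-band image rebuild.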
The override-drift check
scripts/audit-overrides.ts
--check runs in PR CI (the fast drift guard). It fails when:
- a package.json override is undocumented (not in either bucket), or documented in both buckets;
- a managed entry is missing from, or mismatched between, overrides and resolutions (this repo mirrors them, so they must be identical);
- a security pin’s range floor sits below its recorded safeFloor.
If you add or change an override and CI’s audit:overrides:check fails, fix it
by making package.json and security/managed-overrides.json agree: add the
override to exactly one bucket of the manifest, mirror it in both overrides
and resolutions, and ensure the pinned range is at or above the safeFloor.
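The three failure conditions can be pictured with a simplified check like the one below. It is a sketch, not scripts/audit-overrides.ts: the names checkDrift and rangeFloor, the manifest layout (buckets keyed by package name), and the error strings are assumptions, and the range handling only understands simple forms such as ^1.2.3, ~1.2.3, >=1.2.3, or a bare version.

```typescript
type Pins = Record<string, string | undefined>;

interface RootPackageJson {
  overrides?: Pins;
  resolutions?: Pins;
}

// Assumed manifest layout: two buckets keyed by package name.
interface Manifest {
  security: Record<string, { safeFloor: string }>;
  intentional: Record<string, unknown>;
}

function parse(version: string): number[] {
  const parts = version.trim().split(/[-+]/)[0].split(".").map(Number);
  return [parts[0] || 0, parts[1] || 0, parts[2] || 0];
}

function compare(a: string, b: string): number {
  const pa = parse(a);
  const pb = parse(b);
  for (let i = 0; i < 3; i++) {
    if (pa[i] !== pb[i]) return pa[i] - pb[i];
  }
  return 0;
}

// Lowest version a simple range admits: strip a leading ^, ~, >= or =.
function rangeFloor(range: string): string {
  return range.replace(/^[\^~>=\s]+/, "");
}

function checkDrift(pkg: RootPackageJson, manifest: Manifest): string[] {
  const errors: string[] = [];
  const overrides = pkg.overrides ?? {};
  const resolutions = pkg.resolutions ?? {};
  const names = new Set([
    ...Object.keys(overrides),
    ...Object.keys(resolutions),
    ...Object.keys(manifest.security),
    ...Object.keys(manifest.intentional),
  ]);
  for (const name of names) {
    const inSecurity = name in manifest.security;
    const inIntentional = name in manifest.intentional;
    const o = overrides[name];
    const r = resolutions[name];
    if (!inSecurity && !inIntentional) errors.push(`${name}: undocumented override`);
    if (inSecurity && inIntentional) errors.push(`${name}: documented in both buckets`);
    if (inSecurity || inIntentional) {
      if (o === undefined || r === undefined) errors.push(`${name}: missing from overrides or resolutions`);
      else if (o !== r) errors.push(`${name}: overrides and resolutions differ`);
    }
    const range = o ?? r;
    if (inSecurity && range !== undefined && compare(rangeFloor(range), manifest.security[name].safeFloor) < 0) {
      errors.push(`${name}: pinned below safeFloor`);
    }
  }
  return errors;
}
```

An empty result corresponds to a passing check; each string maps onto one of the fixes described above.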
The heavier --redundant mode (run by the daily workflow) spins up a throwaway
git worktree per override, removes that single override, runs bun install,
and reads the resolved versions back from the worktree’s lockfile. An override
is redundant when, without it, every resolved copy still sits at or above the
safeFloor (or the package left the graph entirely). --prune applies that
result, removing redundant security overrides from both package.json and the
manifest.
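The redundancy rule itself is small once the resolved versions have been read back. A minimal sketch, assuming the versions are plain major.minor.patch strings already extracted from the worktree's lockfile (the worktree, install, and lockfile-parsing steps are left out, and the name isRedundant is hypothetical):

```typescript
function parse(version: string): number[] {
  const parts = version.trim().split(/[-+]/)[0].split(".").map(Number);
  return [parts[0] || 0, parts[1] || 0, parts[2] || 0];
}

function compare(a: string, b: string): number {
  const pa = parse(a);
  const pb = parse(b);
  for (let i = 0; i < 3; i++) {
    if (pa[i] !== pb[i]) return pa[i] - pb[i];
  }
  return 0;
}

// resolvedWithoutOverride: every copy of the package that resolved
// after the single override was removed and the install re-run.
function isRedundant(resolvedWithoutOverride: string[], safeFloor: string): boolean {
  // An empty list means the package left the graph entirely;
  // Array.prototype.every returns true for it, which matches the rule.
  return resolvedWithoutOverride.every((v) => compare(v, safeFloor) >= 0);
}
```

One copy below the floor is enough to keep the override, since the pin exists to protect every resolved copy, not just the newest one.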
Running it locally
The scripts are exposed as root package.json tasks. To check for override
drift exactly as PR CI does:
    bun run audit:overrides:check
To see which overrides the graph no longer needs (the heavy re-resolution audit; spins up worktrees and runs installs):
    bun run audit:overrides:redundant
To plan a remediation against a Trivy findings file without changing anything, run the remediator in dry-run mode:
    bun run security:remediate path/to/vulns.json --dry-run
Drop --dry-run to apply the range bumps, override edits, changeset, and
lockfile refresh into your working tree - the same change set the daily
remediate job opens as a PR.
See also
- Changesets - how the auto-remediation changeset versions and ships the fix.
- Dependency linter - the syncpack-based version-consistency check that the range bumps keep green.