New Feature·Pushed May 1, 2026·M

Gitpulse adds incremental daily run processing

GitHub Actions runs now fetch the prior manifest from the deployed site and only process new commits — daily digests scale with velocity instead of repository history depth.

Gitpulse daily runs are now incremental. Each execution fetches the manifest from the deployed GitHub Pages site, restores prior story JSONs in parallel, then filters local commits against that manifest. Only commits not yet in the manifest pass through the LLM. On the first run, when no manifest exists, the full bootstrap window is processed. Subsequent runs process only the commits that arrived since the last run and record a cursor for the last processed commit. The LLM cost of a daily digest now depends on how many commits are new, not on how much history the repository has. Output paths moved to [[code]]site/public/data/[[/code]] so the JSON files are HTTP-fetchable after deploy. The old [[code]]src/content/[[/code]] directory is gone.
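
As a rough illustration of the filtering step described above, here is a minimal sketch. The `Commit` and `Manifest` shapes and the `filterNewCommits` name are assumptions made for this example; the real gitpulse types may differ. A `null` manifest stands in for the first-run (bootstrap) case.

```typescript
// Hypothetical shapes for illustration only; not taken from the gitpulse source.
interface Commit {
  sha: string;
  committedDate: string; // ISO 8601 timestamp
}

interface Manifest {
  stories: { id: string; sha: string }[];
}

// Returns the commits that still need LLM processing.
// A null manifest models bootstrap mode: every commit in the window passes.
function filterNewCommits(
  windowCommits: Commit[],
  manifest: Manifest | null,
): Commit[] {
  if (manifest === null) {
    return windowCommits;
  }
  // A Set gives constant-time membership checks per commit.
  const seen = new Set(manifest.stories.map((story) => story.sha));
  return windowCommits.filter((commit) => !seen.has(commit.sha));
}
```

With this shape, the work per run is proportional to the window size for the filter and to the number of unseen commits for the LLM calls.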
Technical description
This PR implements Phase 3 of an incremental processing system for gitpulse. The core change transforms daily GitHub Actions runs from processing all commits in the window to processing only new commits. Each run now proceeds in four steps:

1. **Fetch prior state**: [[code ref=3]]SiteFetcher[[/code]] fetches [[code]]data/manifest.json[[/code]] from the deployed site. If it is missing, the run enters bootstrap mode. The site URL defaults to [[code]]https://{owner}.github.io/{repo}/[[/code]], with [[code]]GITPULSE_SITE_URL[[/code]] as an override.
2. **Restore prior stories**: When a manifest exists, [[code ref=3]]SiteFetcher.restorePriorStories()[[/code]] downloads each [[code]]data/stories/<id>.json[[/code]] in parallel using [[code]]pMap[[/code]]. This repopulates the working tree so subsequent writes produce a complete merged set. Failed fetches are counted but don't block the run.
3. **Filter commits**: The [[code]]walkCommits()[[/code]] call returns all commits in the window. These are filtered against the manifest's SHA set, so only commits not in the manifest proceed to LLM processing. The log output shows both the total window size and the number of new commits left after filtering.
4. **Write merged state**: After processing completes, [[code ref=6]]readAllStories()[[/code]] reads all story JSONs from disk (both restored and newly written), sorted by commit date. [[code ref=5]]buildManifestFromStories()[[/code]] and [[code ref=4]]buildStateFromStories()[[/code]] generate fresh manifest and state files; the state file tracks the cursor (lastCommitSha, lastCommittedDate, lastRunAt).
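
The final step, deriving the cursor from the merged story set, might look roughly like the sketch below. Only the field names lastCommitSha, lastCommittedDate, and lastRunAt come from the description above. The `Story` shape, the `buildStateSketch` name, the ascending sort order, and the use of `null` for an empty set are all assumptions for illustration, not the actual `buildStateFromStories()` implementation.

```typescript
// Hypothetical story shape; the real story JSON likely carries more fields.
interface Story {
  id: string;
  sha: string;
  committedDate: string; // ISO 8601 timestamp
}

interface State {
  lastCommitSha: string | null;
  lastCommittedDate: string | null;
  lastRunAt: string;
}

// Sorts a copy of the stories oldest-first (assumed order) without mutating the input.
function sortByCommitDate(stories: Story[]): Story[] {
  return [...stories].sort(
    (a, b) => Date.parse(a.committedDate) - Date.parse(b.committedDate),
  );
}

// Builds the cursor from restored plus newly written stories.
// `now` is passed in so the function stays deterministic and easy to test.
function buildStateSketch(stories: Story[], now: Date): State {
  const sorted = sortByCommitDate(stories);
  const newest = sorted.length > 0 ? sorted[sorted.length - 1] : undefined;
  return {
    lastCommitSha: newest ? newest.sha : null,
    lastCommittedDate: newest ? newest.committedDate : null,
    lastRunAt: now.toISOString(),
  };
}
```

Because the state is rebuilt from every story on disk rather than patched in place, a restored story and a newly written one are treated identically.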
````mermaid
graph LR
    A["Fetch manifest from site"] --> B{"Manifest exists?"}
    B -->|"Yes"| C["Restore prior stories in parallel"]
    B -->|"No (bootstrap)"| D["Process full window"]
    C --> E["Filter commits by SHA"]
    D --> E
    E --> F["Process new commits via LLM"]
    F --> G["Write stories to disk"]
    C --> G
    G --> H["Rebuild manifest + state"]
````

**Config changes**: [[code ref=1]]RuntimeConfig[[/code]] replaces [[code]]outDir[[/code]] with [[code]]dataDir[[/code]] and [[code]]storiesDir[[/code]]. A new [[code]]siteUrl[[/code]] field is auto-derived from the repo full name or set via [[code]]GITPULSE_SITE_URL[[/code]].

**Output paths**: All JSON files now write to [[code]]site/public/data/[[/code]]: [[code]]repo.json[[/code]], [[code]]state.json[[/code]], [[code]]manifest.json[[/code]], and [[code]]stories/<id>.json[[/code]]. This makes them HTTP-fetchable after Pages deploy. The [[code]]src/content/[[/code]] directory is removed.

**Site loaders updated**: [[code]]repo.ts[[/code]] and [[code]]stories-loader.ts[[/code]] now read from [[code]]public/data/[[/code]] instead of [[code]]src/content/[[/code]].

**Files at a Glance**:

- [[code]]action/src/config.ts[[/code]]: Runtime config with new dataDir, storiesDir, siteUrl fields
- [[code]]action/src/index.ts[[/code]]: Main orchestrator: fetch → restore → filter → process → write
- [[code]]action/src/site-fetcher.ts[[/code]]: SiteFetcher class for HTTP fetch + restorePriorStories
- [[code]]action/src/state.ts[[/code]]: Manifest/state types, write functions, build functions from stories
- [[code]]site/src/lib/repo.ts[[/code]]: Updated path to public/data/repo.json
- [[code]]site/src/lib/stories-loader.ts[[/code]]: Updated path to public/data/stories
- [[code]]site/src/content/.gitkeep[[/code]]: Deleted (old content directory removed)
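
To make the site URL derivation mentioned under Config changes concrete, here is one possible sketch. The default pattern and the `GITPULSE_SITE_URL` override come from the description above. The `resolveSiteUrl` and `manifestUrl` names, the explicit `env` parameter, and the trailing-slash normalization are assumptions for this example and may not match `action/src/config.ts`.

```typescript
// Environment is passed in explicitly (instead of reading process.env)
// so the sketch stays runtime-agnostic and easy to test.
type Env = Record<string, string | undefined>;

// Resolves the deployed site URL: an explicit override wins, otherwise the
// default GitHub Pages project URL is derived from "owner/repo".
function resolveSiteUrl(repoFullName: string, env: Env): string {
  const override = (env["GITPULSE_SITE_URL"] ?? "").trim();
  if (override !== "") {
    // Assumed normalization: always end with "/" so relative paths can be appended.
    return override.endsWith("/") ? override : override + "/";
  }
  const [owner, repo] = repoFullName.split("/");
  if (!owner || !repo) {
    throw new Error(`Expected "owner/repo", got "${repoFullName}"`);
  }
  return `https://${owner}.github.io/${repo}/`;
}

// Builds the manifest location under the site root.
function manifestUrl(siteUrl: string): string {
  return siteUrl + "data/manifest.json";
}
```

For example, `manifestUrl(resolveSiteUrl("octo/demo", {}))` yields `https://octo.github.io/demo/data/manifest.json`, while setting the override points the fetcher at a custom domain instead.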

Categories

  • New Feature (55%): New incremental processing system: fetch prior manifest from deployed site, restore stories, filter commits by SHA against manifest, write merged state and manifest
  • Performance (35%): Primary user value: daily runs scale O(new commits) instead of O(history depth); bootstrap detection for first-run vs incremental runs
  • Refactoring (10%): Output paths moved from site/src/content/* to site/public/data/*; site loaders updated to read from public/data/; removed old content directory