Repo Map
Canonical reference for what each top-level area owns and why it exists. For placement rules see 01-code-placement.md. For package categories see 02-package-categories.md. For feature source-of-truth decisions see 07-feature-source-of-truth.md. For generated output policy and exceptions see 12-generated-artifacts.md. For naming conventions see 13-naming-conventions.md. For current architecture gaps and follow-up cleanup, see 06-current-gaps.md.
Top-level layout
    apps/
      dashboard/             TanStack Start dashboard; fullstack Cloudflare Worker
      marketing/             Astro public site; Cloudflare Worker
      docs/                  Starlight docs site; Cloudflare Pages

    workers/
      ingestion/             Queue consumer; ORCID/OpenAlex/Crossref enrichment pipeline
      consumer-api/          Hono Consumer API (api.legaciti.org)
      rate-limiter/          Cloudflare Durable Object; per-client rate limiting
      log-processor/         Cloudflare log tail consumer

    packages/
      db/                    Drizzle ORM schema + D1 shard configs (5 shards)
      schemas/               Shared Zod input/output schemas
      types/                 Shared TypeScript interfaces
      utils/                 Pure utilities: DOI normalization, sharding, JSON Patch
      api-contract/          OpenAPI contract for the integrations API
      integration-core/      Integration request/response runtime helpers
      platform-cloudflare/   Cloudflare-specific runtime helpers shared across workers
      platform-query/        TanStack Query keys, retry config, shell-level query keys
      platform-telemetry/    Route-level metrics helpers
      platform-events/       Client event bus, registry, cache invalidation
      platform-i18n/         i18next bootstrap + shell-level locale files
      host-tsc-shims/        TypeScript path shims so feature packages compile standalone
      feature-activity/      Domain: activity feed
      feature-people/        Domain: researcher/people management
      feature-publications/  Domain: publication management

    cli/                     Admin CLI (legaciti binary); internal tooling only
    migrations/              Canonical D1 SQL schema bootstrap (applied to all 5 shards)
    tests/                   Node-based E2E harness (NOT Playwright); hits live deployments
    scripts/                 Bash + mjs scripts: deploy, DB, CI, locale checks
    docs/                    Runbooks, ADRs, architecture docs, operational guides
    plans/                   In-flight design docs and refactor plans (non-canonical)
    thoughts/                Exploratory notes and design sketches (non-canonical)
    tooling/                 Shared Vitest coverage preset

Deployable taxonomy
apps/ is organized by deployable ownership, not by build tool.

- apps/ contains user-facing deployables (dashboard, marketing, docs).
- workers/ contains backend/runtime deployables (Cloudflare Workers not directly user-facing).
- packages/ contains reusable code that should not own deployment entrypoints.
apps/dashboard
- Runtime: Cloudflare Worker (TanStack Start)
- Domains: my.legaciti.org, dash.legaciti.org
- Owns: dashboard UI, all server functions, auth, admin views
- Not a pure SPA: it does SSR and uses createServerFn() for all API calls
- State: TanStack Query (server state) + URL search params (filter/pagination)
- Detailed structure: 04-dashboard-structure.md
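Keeping filter and pagination state in URL search params means every route has to turn raw, untrusted query values into a typed, fully populated object. The following is a minimal sketch of that step, not the dashboard's actual code: the `ListSearch` shape, its field names, the defaults, and the `parseListSearch` helper are all hypothetical.

```typescript
// Hypothetical search-param shape for a paginated, filterable list view.
interface ListSearch {
  page: number; // 1-based page index
  pageSize: number; // rows per page
  q: string; // free-text filter; empty string means "no filter"
}

// Assumed defaults and cap; the real values are not stated in this document.
const DEFAULTS: ListSearch = { page: 1, pageSize: 25, q: "" };
const MAX_PAGE_SIZE = 100;

// Coerce an unknown value to a positive integer, or fall back to a default.
function toPositiveInt(value: unknown, fallback: number): number {
  const n = typeof value === "string" ? Number(value) : value;
  return typeof n === "number" && Number.isInteger(n) && n > 0 ? n : fallback;
}

// Turn raw search params into a ListSearch with every field populated.
function parseListSearch(raw: Record<string, unknown>): ListSearch {
  return {
    page: toPositiveInt(raw.page, DEFAULTS.page),
    pageSize: Math.min(toPositiveInt(raw.pageSize, DEFAULTS.pageSize), MAX_PAGE_SIZE),
    q: typeof raw.q === "string" ? raw.q.trim() : DEFAULTS.q,
  };
}
```

A function of this shape takes a plain record in and returns a typed object, so it could be plugged into a route-level search validator and reused when building query keys. Invalid or missing values fall back to defaults rather than throwing, so a hand-edited URL still renders a page.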
apps/marketing
- Runtime: Cloudflare Worker (Astro)
- Domain: legaciti.org
- Owns: public marketing pages only
- No dashboard UI, no auth, no DB access
apps/docs
- Runtime: Cloudflare Pages (Starlight / Astro)
- Domain: docs.legaciti.org
- Owns: product documentation, API reference, integration guides, runbooks
- Source of truth for public-facing docs: apps/docs/src/content/docs/
workers/ingestion-orchestrator
- Runtime: Cloudflare Worker (Queue consumer + scheduled control plane)
- Domain: ingest.legaciti.org
- Owns: ingress queue dispatch, ORCID sync scheduling, orchestrator-to-processor handoff
- Structure: src/ worker entrypoint + monitoring wrapper around queue policy
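The orchestrator-to-processor handoff can be pictured as a consumer that validates each ingress message and forwards a job to the process queue. The sketch below is illustrative only. The local `QueueMessage`, `MessageBatch`, and `QueueProducer` interfaces mirror just the parts of the Cloudflare Queues API it uses, while the message shapes (`IngressMessage`, `ProcessJob`), the `PROCESS_QUEUE` binding name, and the ack/retry policy are assumptions, not the repo's real types or policy.

```typescript
// Local structural types mirroring the parts of the Cloudflare Queues API used here.
interface QueueMessage<T> {
  body: T;
  ack(): void;
  retry(): void;
}
interface MessageBatch<T> {
  messages: readonly QueueMessage<T>[];
}
interface QueueProducer<T> {
  send(body: T): Promise<void>;
}

// Hypothetical payloads; the real message shapes are not described in this document.
interface IngressMessage {
  personId?: string;
  provider?: string;
}
const PROVIDERS = ["orcid", "openalex", "crossref"] as const;
interface ProcessJob {
  personId: string;
  provider: (typeof PROVIDERS)[number];
}

// Pure step: map a valid ingress message to a process job, or null if malformed.
function toProcessJob(msg: IngressMessage): ProcessJob | null {
  const provider = PROVIDERS.find((p) => p === msg.provider);
  if (!msg.personId || provider === undefined) return null;
  return { personId: msg.personId, provider };
}

// Hypothetical binding for the producer side of worker-ingestion-process-queue.
interface Env {
  PROCESS_QUEUE: QueueProducer<ProcessJob>;
}

// Consumer for worker-ingestion-queue: forward valid jobs, drop malformed ones.
async function handleIngressBatch(batch: MessageBatch<IngressMessage>, env: Env): Promise<void> {
  for (const message of batch.messages) {
    const job = toProcessJob(message.body);
    if (job === null) {
      message.ack(); // malformed input: a retry cannot fix it
      continue;
    }
    try {
      await env.PROCESS_QUEUE.send(job);
      message.ack();
    } catch {
      message.retry(); // assumed transient failure: let the queue redeliver
    }
  }
}
```

Keeping `toProcessJob` pure separates the dispatch decision from queue I/O, which makes the mapping easy to unit test without a Workers runtime.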
workers/ingestion-process
- Runtime: Cloudflare Worker (Queue consumer)
- Domain: internal only; no public route
- Owns: provider fetch/enrich/persist workload plus linking, retry, investigation, and app-event listener consumers
- Structure: src/ worker entrypoint delegating to shared ingestion feature modules
workers/notification
- Runtime: Cloudflare Worker (Queue consumer)
- Domain: internal only; no public route
- Owns: email notification delivery, Discord webhook delivery, and notification DLQ replay operations
- Structure: src/ worker entrypoint reusing shared notification feature modules from packages/platform-ingestion
workers/consumer-api
- Runtime: Cloudflare Worker (Hono)
- Domain: api.legaciti.org
- Owns: unauthenticated read-only publication API (/v1/publications, /v1/doi/:doi)
- No writes, no auth beyond API key header check
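The contract above (two read-only routes guarded by an API key header) can be sketched without Hono or a Workers runtime. This is an illustration under stated assumptions, not the worker's implementation: the `ApiRequest` shape, the `x-api-key` header name, the status codes, and the `route` function are all hypothetical.

```typescript
// Minimal request shape so the sketch runs without Hono or a Workers runtime.
interface ApiRequest {
  method: string;
  path: string;
  headers: Record<string, string>; // header names assumed to be lower-cased
}

type RouteResult =
  | { status: 200; route: "publications" }
  | { status: 200; route: "doi"; doi: string }
  | { status: 401 | 404 | 405 };

// Hypothetical header name; the document only says "API key header check".
const API_KEY_HEADER = "x-api-key";
const DOI_PREFIX = "/v1/doi/";

// Decode a path fragment, returning null for malformed percent-encoding.
function safeDecode(fragment: string): string | null {
  try {
    return decodeURIComponent(fragment);
  } catch {
    return null;
  }
}

function route(req: ApiRequest, validKeys: ReadonlySet<string>): RouteResult {
  if (req.method !== "GET") return { status: 405 }; // read-only API: no writes
  const key = req.headers[API_KEY_HEADER];
  if (key === undefined || !validKeys.has(key)) return { status: 401 };

  if (req.path === "/v1/publications") return { status: 200, route: "publications" };

  if (req.path.startsWith(DOI_PREFIX)) {
    // DOIs contain "/", so this sketch treats the rest of the path as the DOI.
    const doi = safeDecode(req.path.slice(DOI_PREFIX.length));
    if (doi) return { status: 200, route: "doi", doi };
  }
  return { status: 404 };
}
```

Treating everything after /v1/doi/ as the DOI is a design choice made for this sketch; a router that matches `:doi` as a single path segment would instead require callers to percent-encode the slash.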
workers/rate-limiter
- Runtime: Cloudflare Durable Object
- Owns: per-client request rate limiting shared by ingestion and public API
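As an illustration of per-client rate limiting, here is a minimal fixed-window counter. The document does not say which algorithm the Durable Object uses, so the fixed-window choice, the `FixedWindowLimiter` class, and its parameters are all assumptions. The sketch also keeps state in memory, whereas a Durable Object would hold each client's counter in its own object instance.

```typescript
// Fixed-window counter: at most `limit` requests per client per `windowMs`.
// Hypothetical sketch; not the algorithm or API of workers/rate-limiter.
class FixedWindowLimiter {
  private readonly windows = new Map<string, { start: number; count: number }>();
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly now: () => number;

  // `now` is injectable so the limiter can be tested with a fake clock.
  constructor(limit: number, windowMs: number, now: () => number = () => Date.now()) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.now = now;
  }

  // Returns true and records the request if allowed; false if over the limit.
  allow(clientId: string): boolean {
    const t = this.now();
    let w = this.windows.get(clientId);
    if (w === undefined || t - w.start >= this.windowMs) {
      // First request from this client, or the previous window has expired.
      w = { start: t, count: 0 };
      this.windows.set(clientId, w);
    }
    if (w.count >= this.limit) return false;
    w.count += 1;
    return true;
  }
}
```

A fixed window is the simplest option but permits short bursts across a window boundary; a sliding window or token bucket would smooth that out at the cost of more state per client.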
workers/log-processor
- Runtime: Cloudflare log tail consumer
- Owns: tail-based log processing and forwarding
Build and deploy split
Not every deployable follows the same pipeline.
- apps/dashboard, apps/marketing, and apps/docs are the user-facing deployables most visible in the main Turbo workflow.
- workers/* deployables are operated primarily through Wrangler and root worker deployment scripts.
- This split is intentional. The repo groups by ownership and runtime surface, not by requiring every deployable to share identical build orchestration.
Package categories (summary)
Full details and decision rules in 02-package-categories.md.
| Category | Current packages / locations |
|---|---|
| app | apps/dashboard, apps/marketing, apps/docs |
| worker | workers/ingestion-orchestrator, workers/ingestion-process, workers/notification, workers/consumer-api, workers/rate-limiter, workers/log-processor |
| feature | packages/feature-* (extracted), apps/dashboard/src/features/* (in-app) |
| domain | packages/db, packages/utils |
| platform | packages/platform-cloudflare, packages/platform-query, packages/platform-telemetry, packages/platform-events, packages/platform-i18n |
| contract | packages/schemas, packages/types, packages/api-contract, packages/integration-core |
| shared UI | apps/dashboard/src/components/ (in-app only) |
| tooling | packages/host-tsc-shims, root tooling/, cli/ |
Non-source areas
| Area | What it is |
|---|---|
| migrations/ | Canonical D1 schema SQL; source of truth, intentionally versioned |
| tests/ | Node E2E harness; .mjs files; NOT Playwright |
| scripts/ | Operational bash/mjs scripts; not imported by app code |
| docs/ | Operational runbooks, ADRs, architecture guidance |
| plans/ | In-flight design docs; not canonical; subject to deletion |
| thoughts/ | Exploratory notes; not canonical |
| cli/ | legaciti binary for admin tasks; TypeScript + tsup |
| src/ | Deleted (2026-04-23). Tracing primitives moved to packages/platform-ingestion/src/infra/telemetry/. |
| tooling/ | Shared Vitest coverage preset |
Runtime topology
    browser
    └─ apps/dashboard (TanStack Start / Cloudflare Worker)
       ├─ createServerFn() → D1 (via packages/db, 5 shards)
       ├─ Better Auth (sessions in D1)
       └─ enqueues to → worker-ingestion-queue

    worker-ingestion-queue (Cloudflare Queue)
    └─ workers/ingestion-orchestrator
       └─ enqueues to → worker-ingestion-process-queue

    worker-ingestion-process-queue (Cloudflare Queue)
    └─ workers/ingestion-process
       ├─ ORCID API (external)
       ├─ OpenAlex API (external)
       ├─ Crossref API (external)
       └─ D1 (upsert, 5 shards)

    worker-ingestion-notification-queue / worker-ingestion-notification-discord-queue
    └─ workers/notification
       ├─ SES / configured email provider
       └─ Discord webhook

    internet
    └─ workers/consumer-api (Hono / Cloudflare Worker)
       └─ D1 reads (visible publications only)

    workers/rate-limiter (Durable Object)
    └─ used by worker-consumer-api, worker-ingestion-process, and worker-notification

    workers/log-processor (log tail)
    └─ receives logs from all workers

Cloudflare binding names
All 5 D1 shards are bound as DB_0…DB_4 in every worker that needs them. The KV cache is bound as PUB_CACHE. The primary ingress queue is worker-ingestion-queue, which hands off to worker-ingestion-process-queue. All of these are defined in the relevant worker wrangler.jsonc files.
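With five shard bindings, a worker needs a deterministic way to map a record key to one of DB_0…DB_4. The sketch below shows one possible approach, a hash taken modulo the shard count. It is an assumption for illustration: the repo's actual sharding rule lives in packages/utils and may differ, and `fnv1a`, `shardIndexFor`, `shardBindingFor`, and `getShard` are hypothetical names.

```typescript
const SHARD_COUNT = 5; // DB_0 through DB_4

// FNV-1a-style 32-bit hash over UTF-16 code units; chosen for illustration only.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

// Map a key to a shard index in [0, SHARD_COUNT).
function shardIndexFor(key: string): number {
  return fnv1a(key) % SHARD_COUNT;
}

// Map a key to the name of its D1 binding, e.g. "DB_3".
function shardBindingFor(key: string): string {
  return `DB_${shardIndexFor(key)}`;
}

// Look up the shard's database handle on a worker env object.
// D stands in for the D1 database type so the sketch needs no Workers typings.
function getShard<D>(env: Record<string, D | undefined>, key: string): D {
  const name = shardBindingFor(key);
  const db = env[name];
  if (db === undefined) throw new Error(`Missing D1 binding ${name}`);
  return db;
}
```

Because the mapping depends only on the key, every worker that binds all five shards resolves the same record to the same database. Changing `SHARD_COUNT` would remap most keys, so a scheme like this assumes the shard count stays fixed.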