Migrated from root technical docs.

Canonical reference for what each top-level area owns and why it exists. For placement rules see 01-code-placement.md. For package categories see 02-package-categories.md. For feature source-of-truth decisions see 07-feature-source-of-truth.md. For generated output policy and exceptions see 12-generated-artifacts.md. For naming conventions see 13-naming-conventions.md. For current architecture gaps and follow-up cleanup, see 06-current-gaps.md.


apps/
  dashboard/             TanStack Start dashboard, fullstack Cloudflare Worker
  marketing/             Astro public site, Cloudflare Worker
  docs/                  Starlight docs site, Cloudflare Pages
workers/
  ingestion/             Queue consumer: ORCID/OpenAlex/Crossref enrichment pipeline
  consumer-api/          Hono Consumer API (api.legaciti.org)
  rate-limiter/          Cloudflare Durable Object: per-client rate limiting
  log-processor/         Cloudflare log tail consumer
packages/
  db/                    Drizzle ORM schema + D1 shard configs (5 shards)
  schemas/               Shared Zod input/output schemas
  types/                 Shared TypeScript interfaces
  utils/                 Pure utilities: DOI normalization, sharding, JSON Patch
  api-contract/          OpenAPI contract for the integrations API
  integration-core/      Integration request/response runtime helpers
  platform-cloudflare/   Cloudflare-specific runtime helpers shared across workers
  platform-query/        TanStack Query keys, retry config, shell-level query keys
  platform-telemetry/    Route-level metrics helpers
  platform-events/       Client event bus, registry, cache invalidation
  platform-i18n/         i18next bootstrap + shell-level locale files
  host-tsc-shims/        TypeScript path shims so feature packages compile standalone
  feature-activity/      Domain: activity feed
  feature-people/        Domain: researcher/people management
  feature-publications/  Domain: publication management
cli/                     Admin CLI (legaciti binary), internal tooling only
migrations/              Canonical D1 SQL schema bootstrap (applied to all 5 shards)
tests/                   Node-based E2E harness (NOT Playwright); hits live deployments
scripts/                 Bash + mjs scripts: deploy, DB, CI, locale checks
docs/                    Runbooks, ADRs, architecture docs, operational guides
plans/                   In-flight design docs and refactor plans (non-canonical)
thoughts/                Exploratory notes and design sketches (non-canonical)
tooling/                 Shared Vitest coverage preset
  • apps/ is organized by deployable ownership, not by build tool.
  • apps/ contains user-facing deployables (dashboard, marketing, docs).
  • workers/ contains backend/runtime deployables (Cloudflare Workers not directly user-facing).
  • packages/ contains reusable code that should not own deployment entrypoints.

apps/dashboard
  • Runtime: Cloudflare Worker (TanStack Start)
  • Domains: my.legaciti.org, dash.legaciti.org
  • Owns: dashboard UI, all server functions, auth, admin views
  • Not a pure SPA: it does SSR and uses createServerFn() for all API calls
  • State: TanStack Query (server state) + URL search params (filter/pagination)
  • Detailed structure: 04-dashboard-structure.md

apps/marketing
  • Runtime: Cloudflare Worker (Astro)
  • Domain: legaciti.org
  • Owns: public marketing pages only
  • No dashboard UI, no auth, no DB access

apps/docs
  • Runtime: Cloudflare Pages (Starlight / Astro)
  • Domain: docs.legaciti.org
  • Owns: product documentation, API reference, integration guides, runbooks
  • Source of truth for public-facing docs: apps/docs/src/content/docs/

workers/ingestion-orchestrator
  • Runtime: Cloudflare Worker (Queue consumer + scheduled control plane)
  • Domain: ingest.legaciti.org
  • Owns: ingress queue dispatch, ORCID sync scheduling, orchestrator-to-processor handoff
  • Structure: src/ worker entrypoint + monitoring wrapper around queue policy

workers/ingestion-process
  • Runtime: Cloudflare Worker (Queue consumer)
  • Domain: internal only; no public route
  • Owns: provider fetch/enrich/persist workload plus linking, retry, investigation, and app-event listener consumers
  • Structure: src/ worker entrypoint delegating to shared ingestion feature modules

workers/notification
  • Runtime: Cloudflare Worker (Queue consumer)
  • Domain: internal only; no public route
  • Owns: email notification delivery, Discord webhook delivery, and notification DLQ replay operations
  • Structure: src/ worker entrypoint reusing shared notification feature modules from packages/platform-ingestion

workers/consumer-api
  • Runtime: Cloudflare Worker (Hono)
  • Domain: api.legaciti.org
  • Owns: unauthenticated read-only publication API (/v1/publications, /v1/doi/:doi)
  • No writes, no auth beyond API key header check

workers/rate-limiter
  • Runtime: Cloudflare Durable Object
  • Owns: per-client request rate limiting shared by ingestion and public API

workers/log-processor
  • Runtime: Cloudflare log tail consumer
  • Owns: tail-based log processing and forwarding

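The consumer API's access rules (read-only, with no auth beyond an API key header check) can be illustrated with a small, dependency-free sketch. This is not the worker's actual Hono code: the header name x-api-key, the status codes, and the function gateRequest are hypothetical choices made for illustration. Only the two route paths come from the description above.

```typescript
// Hypothetical sketch: header name, status codes, and function names are
// assumptions for illustration, not the real consumer-api implementation.
type GateResult =
  | { ok: true; route: "publications" }
  | { ok: true; route: "doi"; doi: string }
  | { ok: false; status: 401 | 404 | 405 };

const API_KEY_HEADER = "x-api-key"; // assumed header name

function gateRequest(
  method: string,
  pathname: string,
  headers: Record<string, string | undefined>,
  validKeys: ReadonlySet<string>,
): GateResult {
  // Read-only surface: anything other than GET is rejected.
  if (method.toUpperCase() !== "GET") return { ok: false, status: 405 };

  // The only access control is the API key header check.
  const key = headers[API_KEY_HEADER];
  if (!key || !validKeys.has(key)) return { ok: false, status: 401 };

  if (pathname === "/v1/publications") return { ok: true, route: "publications" };

  const doiPrefix = "/v1/doi/";
  if (pathname.startsWith(doiPrefix) && pathname.length > doiPrefix.length) {
    // DOIs contain "/", so everything after the prefix is treated as the DOI.
    return { ok: true, route: "doi", doi: pathname.slice(doiPrefix.length) };
  }
  return { ok: false, status: 404 };
}
```

Keeping the gate a pure function of method, path, and headers makes it testable without a Workers runtime; in a Hono app the same checks would typically live in middleware.
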
Not every deployable follows the same pipeline.

  • apps/dashboard, apps/marketing, and apps/docs are the user-facing deployables most visible in the main Turbo workflow.
  • workers/* deployables are operated primarily through Wrangler and root worker deployment scripts.
  • This split is intentional. The repo groups by ownership and runtime surface, not by requiring every deployable to share identical build orchestration.

Full details and decision rules for the package categories below are in 02-package-categories.md.

Category    Current packages / locations
app         apps/dashboard, apps/marketing, apps/docs
worker      workers/ingestion-orchestrator, workers/ingestion-process, workers/notification, workers/consumer-api, workers/rate-limiter, workers/log-processor
feature     packages/feature-* (extracted), apps/dashboard/src/features/* (in-app)
domain      packages/db, packages/utils
platform    packages/platform-cloudflare, packages/platform-query, packages/platform-telemetry, packages/platform-events, packages/platform-i18n
contract    packages/schemas, packages/types, packages/api-contract, packages/integration-core
shared UI   apps/dashboard/src/components/ (in-app only)
tooling     packages/host-tsc-shims, root tooling/, cli/

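The category mapping above can be expressed as an ordered prefix lookup. The sketch below is illustrative only: categoryOf and the rule list are hypothetical, and treating every packages/platform-* and packages/feature-* directory as belonging to that category is a simplifying assumption. The authoritative rules are in 02-package-categories.md.

```typescript
type Category =
  | "app" | "worker" | "feature" | "domain"
  | "platform" | "contract" | "shared UI" | "tooling";

// Ordered rules: the first matching prefix wins, so in-app feature and
// component folders are listed before the generic "apps/" rule.
const CATEGORY_RULES: ReadonlyArray<readonly [string, Category]> = [
  ["apps/dashboard/src/features/", "feature"],
  ["apps/dashboard/src/components/", "shared UI"],
  ["apps/", "app"],
  ["workers/", "worker"],
  ["packages/feature-", "feature"],   // assumption: prefix covers all feature packages
  ["packages/platform-", "platform"], // assumption: prefix covers all platform packages
  ["packages/db/", "domain"],
  ["packages/utils/", "domain"],
  ["packages/schemas/", "contract"],
  ["packages/types/", "contract"],
  ["packages/api-contract/", "contract"],
  ["packages/integration-core/", "contract"],
  ["packages/host-tsc-shims/", "tooling"],
  ["tooling/", "tooling"],
  ["cli/", "tooling"],
];

// Returns the category for a repo-relative path, or undefined for areas that
// are not packages (migrations/, tests/, scripts/, and so on).
function categoryOf(repoPath: string): Category | undefined {
  // Append a trailing slash so "packages/db" matches the "packages/db/" rule
  // while "packages/dbx" does not.
  const normalized = repoPath.endsWith("/") ? repoPath : repoPath + "/";
  for (const [prefix, category] of CATEGORY_RULES) {
    if (normalized.startsWith(prefix)) return category;
  }
  return undefined;
}
```

A lookup like this could back a lint rule or CI check that flags new directories with no category, though whether the repo enforces categories this way is not stated here.
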
Area          What it is
migrations/   Canonical D1 schema SQL; source of truth, intentionally versioned
tests/        Node E2E harness; .mjs files; NOT Playwright
scripts/      Operational bash/mjs scripts; not imported by app code
docs/         Operational runbooks, ADRs, architecture guidance
plans/        In-flight design docs; not canonical; subject to deletion
thoughts/     Exploratory notes; not canonical
cli/          legaciti binary for admin tasks; TypeScript + tsup
src/          Deleted (2026-04-23). Tracing primitives moved to packages/platform-ingestion/src/infra/telemetry/.
tooling/      Shared Vitest coverage preset

browser
  └─ apps/dashboard (TanStack Start / Cloudflare Worker)
       ├─ createServerFn() → D1 (via packages/db, 5 shards)
       ├─ Better Auth (sessions in D1)
       └─ enqueues to → worker-ingestion-queue

worker-ingestion-queue (Cloudflare Queue)
  └─ workers/ingestion-orchestrator
       └─ enqueues to → worker-ingestion-process-queue

worker-ingestion-process-queue (Cloudflare Queue)
  └─ workers/ingestion-process
       ├─ ORCID API (external)
       ├─ OpenAlex API (external)
       ├─ Crossref API (external)
       └─ D1 (upsert, 5 shards)

worker-ingestion-notification-queue / worker-ingestion-notification-discord-queue
  └─ workers/notification
       ├─ SES / configured email provider
       └─ Discord webhook

internet
  └─ workers/consumer-api (Hono / Cloudflare Worker)
       └─ D1 reads (visible publications only)

workers/rate-limiter (Durable Object)
  └─ used by worker-consumer-api, worker-ingestion-process, and worker-notification

workers/log-processor (log tail)
  └─ receives logs from all workers

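The two-stage handoff in the diagram (ingress queue, then orchestrator, then process queue) can be sketched as a queue consumer that forwards each message to a second queue. The interfaces below are local stand-ins that mirror the shape of Cloudflare Queues (send on the producer, ack and retry on each message). The message shapes, the PROCESS_QUEUE binding name, and handleIngressBatch are hypothetical and do not come from the repo.

```typescript
// Local structural stand-ins for Cloudflare Queues types, declared here so the
// sketch runs without @cloudflare/workers-types.
interface QueueMessage<T> {
  readonly body: T;
  ack(): void;
  retry(): void;
}
interface MessageBatch<T> {
  readonly messages: ReadonlyArray<QueueMessage<T>>;
}
interface QueueProducer<T> {
  send(body: T): Promise<void>;
}

// Hypothetical payloads; the real message schemas are not described here.
interface IngressJob { personId: string; }
interface ProcessJob { personId: string; stage: "process"; }

// Hypothetical binding name for the producer side of
// worker-ingestion-process-queue.
interface OrchestratorEnv { PROCESS_QUEUE: QueueProducer<ProcessJob>; }

// Forwards each ingress message to the process queue. A message is acked only
// after its handoff succeeds; on failure it is marked for redelivery.
async function handleIngressBatch(
  batch: MessageBatch<IngressJob>,
  env: OrchestratorEnv,
): Promise<void> {
  for (const message of batch.messages) {
    try {
      await env.PROCESS_QUEUE.send({ personId: message.body.personId, stage: "process" });
      message.ack();
    } catch {
      message.retry();
    }
  }
}
```

Acking per message rather than per batch means one failed handoff does not force redelivery of messages that were already forwarded.
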
All 5 D1 shards are bound as DB_0 through DB_4 in every worker that needs them. The KV cache is bound as PUB_CACHE. The primary ingress queue is worker-ingestion-queue, which hands off to worker-ingestion-process-queue. These bindings are defined in the relevant worker wrangler.jsonc files.
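
With five shard bindings on the environment, a worker needs a deterministic way to pick one per key. The following is a minimal sketch under stated assumptions: the hash function (an FNV-1a-style hash over UTF-16 code units), the modulo strategy, and the helper names are illustrative. The repo's actual sharding logic lives in packages/utils and may differ. D1Like is a local stand-in for the D1 binding type.

```typescript
const SHARD_COUNT = 5;

// Local stand-in for the D1 database binding type.
interface D1Like { prepare(sql: string): unknown; }

type ShardBindingName = "DB_0" | "DB_1" | "DB_2" | "DB_3" | "DB_4";
type ShardEnv = Record<ShardBindingName, D1Like>;

// FNV-1a-style 32-bit hash over UTF-16 code units. Assumed choice for the
// sketch; any stable hash would serve.
function fnv1a32(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep as unsigned 32-bit
  }
  return hash >>> 0;
}

// Maps a key (for example a DOI or person id) to a shard index in [0, 4].
function shardIndexFor(key: string): number {
  return fnv1a32(key) % SHARD_COUNT;
}

// Converts a shard index to its binding name, rejecting out-of-range values.
function shardBindingName(index: number): ShardBindingName {
  if (!Number.isInteger(index) || index < 0 || index >= SHARD_COUNT) {
    throw new RangeError(`shard index out of range: ${index}`);
  }
  return `DB_${index}` as ShardBindingName;
}

// Picks the D1 binding responsible for a key.
function shardFor(env: ShardEnv, key: string): D1Like {
  return env[shardBindingName(shardIndexFor(key))];
}
```

The same key always resolves to the same binding, so reads and upserts for one record land on one shard. Changing SHARD_COUNT would remap existing keys, which is a known limitation of plain modulo sharding.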