Worker Structure
Conventions shared by all Cloudflare Workers in this repo.
Applies to: ingestion, consumer-api, rate-limiter,
log-processor.
Standard layout
```
workers/{name}/
  src/
    index.ts              Entrypoint: exports default Worker/Queue/DurableObject/Tail handler
    features/             Worker application logic, grouped by capability/domain
      {domain}/
        {domain}.ts       Orchestration and feature logic
        {domain}.test.ts
    infra/                Cloudflare-specific adapters and infrastructure code
      db/                 D1 query helpers, shard routing
      http/               External API clients, fetch wrappers, retry logic
      kv/                 KV-backed stores and adapters
      rate-limiter/       Durable Object client stubs
      telemetry/          Worker-side monitoring: monitoring.ts, tracing.ts
    shared/               Low-level worker-safe types and helpers reused across layers
      types.ts            Shared ambient or local types
  wrangler.jsonc          Worker config: routes, bindings, queues
  package.json
  tsconfig.json
  vitest.config.ts        (if worker has tests)
  shared-env.d.ts         Global Env interface declaration
```

Layer ownership
index.ts - entrypoint
- Owns runtime dispatch: `fetch`, `queue`, `tail`, `scheduled` handlers
- Bootstraps monitoring and resets circuit breakers before delegating
- Should remain thin: routes to feature handlers, does not implement logic inline
- Acceptable inline: error wrappers, monitoring init/flush, routing by queue name
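
As an illustration of a thin entrypoint, the sketch below routes a queue batch to a feature handler by queue name and keeps all processing out of `index.ts`. The queue names are taken from the queue binding table in this document; the handler functions, the `resolveQueueHandler` helper, and the minimal `Batch` type (a stand-in for the runtime's `MessageBatch`) are hypothetical.

```typescript
// Minimal stand-in for the runtime's MessageBatch type so the sketch is self-contained.
interface Batch {
  queue: string;
  messages: { body: unknown; ack(): void }[];
}

type QueueHandler = (batch: Batch) => Promise<void>;

// Hypothetical feature handlers; in a real worker these would live under features/.
const processWorkBatch: QueueHandler = async (batch) => {
  for (const message of batch.messages) message.ack();
};
const processRetryBatch: QueueHandler = async (batch) => {
  for (const message of batch.messages) message.ack();
};

// Routing table: queue name -> feature handler.
const queueRoutes: Record<string, QueueHandler> = {
  "worker-ingestion-process-queue": processWorkBatch,
  "worker-ingestion-retry-queue": processRetryBatch,
};

function resolveQueueHandler(queueName: string): QueueHandler {
  const handler = queueRoutes[queueName];
  if (!handler) throw new Error(`No handler registered for queue "${queueName}"`);
  return handler;
}

// The entrypoint only dispatches. A real index.ts would `export default worker`
// and wrap the call with monitoring init/flush.
const worker = {
  async queue(batch: Batch): Promise<void> {
    await resolveQueueHandler(batch.queue)(batch);
  },
};
```

Keeping the routing table as data means adding a queue is a one-line change, and an unknown queue name fails loudly instead of being silently dropped.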
features/ - worker application logic
- Owns use-case orchestration, processing flows, and capability-oriented modules
- Examples: `ingestion/`, `orcid/`, `enrichment/`, `email/`, `notifications/`, `log-processor/`
- May import from `infra/` adapters and `shared/`
- Must not directly access Cloudflare bindings (`env.DB_*`, `env.KV_*`, etc.)
- Pure formatting, parsing, and domain logic belongs here
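
A minimal sketch of how a feature can stay free of bindings: the feature function receives a narrow adapter interface instead of `env`, so it can be unit-tested with an in-memory fake. `WorkStore`, `normalizeDoi`, and `ingestWork` are hypothetical names used only for illustration and are not taken from the repo.

```typescript
// Port implemented by an infra/db adapter; the feature never sees env.DB_*.
interface WorkStore {
  upsertWork(work: { doi: string; title: string }): Promise<void>;
}

// Pure domain logic: belongs in features/ because it touches no bindings.
function normalizeDoi(raw: string): string {
  return raw.trim().toLowerCase().replace(/^https?:\/\/(dx\.)?doi\.org\//, "");
}

// Orchestration: validates input, then delegates persistence to the adapter.
async function ingestWork(
  store: WorkStore,
  raw: { doi: string; title: string },
): Promise<string> {
  const doi = normalizeDoi(raw.doi);
  if (doi === "") throw new Error("DOI is required");
  await store.upsertWork({ doi, title: raw.title.trim() });
  return doi;
}
```

Under this arrangement, `index.ts` (or an `infra/` factory) would build the concrete `WorkStore` from the D1 binding and pass it in.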
infra/ - Cloudflare adapters and infrastructure
- Owns all Cloudflare runtime binding usage: D1, KV, Queue, DO, R2
- `infra/db/`: D1 query functions, shard routing, upsert helpers
- `infra/http/`: external API clients (ORCID, OpenAlex, Crossref, Discord webhooks), retry wrappers
- `infra/kv/`: KV-backed persistence adapters
- `infra/rate-limiter/`: Durable Object stubs, rate-limit check helpers
- `infra/telemetry/`: `monitoring.ts` (Sentry bootstrap wrapper), `tracing.ts` (span helpers)
- Should not contain business/domain logic
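
Shard routing is one of the responsibilities listed for `infra/db/`. The sketch below shows one possible approach: hash a key to pick among numbered `DB_0`, `DB_1`, ... bindings. The shard count, the hash function, and the `D1Like` placeholder type are assumptions; the repo's actual routing scheme is not described here.

```typescript
// Placeholder for the D1 database binding type so the sketch compiles on its own.
interface D1Like {
  prepare(query: string): unknown;
}

// Assumed binding shape: DB_0, DB_1, ... as hinted by env.DB_* in this document.
type ShardEnv = Record<string, D1Like | undefined>;

// Simple polynomial string hash, kept within unsigned 32-bit range.
function hashKey(key: string): number {
  let hash = 0;
  for (let i = 0; i < key.length; i++) {
    hash = (hash * 31 + key.charCodeAt(i)) >>> 0;
  }
  return hash;
}

// Maps a key to a shard number in [0, shardCount).
function shardIndex(key: string, shardCount: number): number {
  if (shardCount < 1 || Math.floor(shardCount) !== shardCount) {
    throw new Error("shardCount must be a positive integer");
  }
  return hashKey(key) % shardCount;
}

// The only function here that touches the binding object directly.
function getShard(env: ShardEnv, key: string, shardCount: number): D1Like {
  const name = `DB_${shardIndex(key, shardCount)}`;
  const db = env[name];
  if (!db) throw new Error(`Missing D1 binding ${name}`);
  return db;
}
```

Because the mapping is deterministic, the same key always resolves to the same shard, which is what lets query helpers in `infra/db/` read back what they wrote.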
shared/ - narrow worker-local support
- Low-level types, constants, and pure utilities reused by multiple layers inside the same worker
- Must not depend on feature internals or Cloudflare bindings
- Example: `shared/types.ts` for ambient Env and event type declarations
Dependency direction
```
index.ts
  └─→ features/{domain}/
        └─→ infra/{adapter}/
              └─→ (Cloudflare runtime: D1, KV, Queue, DO)
        └─→ shared/
  └─→ infra/{adapter}/   (direct infra use at entry only for bootstrap)
```

- `shared/` has no upward dependencies
- `infra/` does not import from `features/`
- `features/` does not import from `index.ts`
- Only `index.ts` and `infra/` may reference Cloudflare runtime bindings
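
The import rules above can be expressed as a small lookup table, for example as the core of a lint check. This is a sketch under assumptions: the helper names are hypothetical, and allowing `infra/` to import from `shared/` (and any layer to import from itself) is an interpretation of the rules rather than something the document states outright.

```typescript
type Layer = "index" | "features" | "infra" | "shared";

// Which layers each layer may import from. infra -> shared and same-layer
// imports are assumptions; the remaining entries follow the listed rules.
const allowedImports: Record<Layer, readonly Layer[]> = {
  index: ["features", "infra", "shared"],
  features: ["features", "infra", "shared"],
  infra: ["infra", "shared"],
  shared: ["shared"],
};

// Classify a path relative to src/, e.g. "features/orcid/orcid.ts".
function layerOf(path: string): Layer {
  if (path === "index.ts") return "index";
  const top = path.split("/")[0];
  if (top === "features" || top === "infra" || top === "shared") return top;
  throw new Error(`Path is outside the known layers: ${path}`);
}

// True when a file at fromPath is allowed to import a file at toPath.
function isAllowedImport(fromPath: string, toPath: string): boolean {
  return allowedImports[layerOf(fromPath)].indexOf(layerOf(toPath)) !== -1;
}
```

For instance, `isAllowedImport("features/orcid/orcid.ts", "infra/db/queries.ts")` is allowed, while the reverse direction is rejected.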
Entrypoint patterns
Hono HTTP worker (worker-consumer-api):

```typescript
import { Hono } from "hono";

const app = new Hono<{ Bindings: Env }>();
// ... route registrations
export default app;
```

Queue consumer (worker-ingestion-process or worker-ingestion-orchestrator):

```typescript
export default {
  async queue(batch: MessageBatch<QueueMessage>, env: Env): Promise<void> {
    // process batch.messages
  },
} satisfies ExportedHandler<Env>;
```

Tail worker (worker-log-processor):

```typescript
export default {
  async tail(events: TailInvocation[], env: Env, ctx: ExecutionContext) {
    // process tail events
  },
};
```

Durable Object (worker-rate-limiter):

```typescript
export class RateLimiter implements DurableObject {
  // ...
}

export default { fetch: () => new Response("ok") };
```

Testing
- Unit tests: co-located alongside source files as `{name}.test.ts`
- Integration/E2E: in `tests/features/*.mjs`; these run against live Cloudflare deployments, not local workers
- Workers are excluded from the Turbo pipeline; run each worker package's tests directly:

```bash
pnpm --filter @workers/ingestion-process test
pnpm --filter @workers/ingestion-orchestrator test
```

Deployment
Workers are deployed directly with `wrangler deploy`; they are NOT in the
`turbo.json` pipeline. Deploy scripts live in `scripts/deploy/`.

```bash
scripts/deploy/deploy-ingestion.sh
scripts/deploy/deploy-consumer-api.sh
```

See the root AGENTS.md for the full deploy command reference.
Worker-specific notes
ingestion
Fully normalized. Layout:

```
src/
  index.ts              Thin queue router + fetch handler, with monitoring wrappers
  features/
    ingestion/          Core ingestion pipeline, work processing, linking, retry queues
    enrichment/         Metadata enrichment from OpenAlex, Crossref
    orcid/              ORCID API client, sync queue handler, record parser
    providers/          Provider-specific data fetching (crossref, openalex, orcid): HTTP + transform
    email/              Notification email queue processor
    notifications/      Discord notification queue processor
    activity/           Activity event writing
  infra/
    db/                 D1 upsert helpers, shard routing, researcher/publication/work queries
    http/               Fetch retry helpers, circuit breaker
    rate-limiter/       DO stub, rate-limit check
    telemetry/
      monitoring.ts     Sentry wrapper (captureException, captureMessage, initMonitoring, flushMonitoring)
      tracing.ts        Worker-local span/trace system (Span, Trace classes)
```

Note: `features/providers/` contains both HTTP fetching and data transformation for each
external provider (OpenAlex, Crossref, ORCID). The HTTP client concern is mixed with provider
data mapping; this is intentional for now and is documented as current behaviour in the worker AGENTS.md.
consumer-api
Partially normalized. Layout:

```
src/
  index.ts                    Hono app with ALL route handlers inline (2359 lines)
  docs/
    openapi.ts                OpenAPI 3.0 spec
  infra/
    telemetry/
      monitoring.ts           Sentry wrapper (same pattern as worker-ingestion)
  __mocks__/
    cloudflare-workers.ts     Vitest alias for cloudflare:workers
```

Temporary exception: `src/index.ts` is a 2359-line monolith that mixes Hono app setup,
middleware, type definitions, inline DB queries, and all route handlers. The worker AGENTS.md
explicitly documents “single file” as the current convention. Splitting it into proper
`features/` and `infra/db/` layers is the preferred target but is deferred: the risk of
regressions in the complex auth/rate-limit/sharding logic is too high for a structural-only pass.
Target structure (when ready to split):
```
src/
  index.ts              Thin Hono app setup + middleware registration + export
  features/
    publications/       Publication query handlers
    people/             People query handlers
    works/              Works query handlers
    ingest/             Ingest POST handler
    integrations/       Integration auth + credential handlers
  infra/
    db/                 D1 fan-out helpers, shard routing
    middleware/         Auth, rate-limit, workspace-check middleware
    telemetry/
      monitoring.ts
```

rate-limiter
Single-file worker. A flat layout is appropriate: the entire worker is a single Durable Object implementation with no domain separation needed. No structural changes required.
log-processor
Normalized in this pass. Layout:

```
src/
  index.ts              Thin tail handler: filters error events, delegates to features/infra
  features/
    log-processor/
      formatting.ts     Log parsing, alert formatting, structured log helpers
  infra/
    kv/
      persistence.ts    KV-backed event persistence (LOG_STORE)
    http/
      discord.ts        Discord webhook HTTP adapter
  shared/
    types.ts            Env, TailInvocation, StructuredLog types
```

Binding access
- Bindings (`env.DB_0`, `env.PUB_CACHE`, `env.RATE_LIMITER`, `env.LOG_STORE`, etc.) are accessed only in `src/index.ts` or `src/infra/`
- The `Env` type is declared globally via `shared-env.d.ts` in each worker package (ambient global, no import required)
Service binding topology
Internal worker-to-worker communication uses Cloudflare’s native binding mechanisms, never public URL fetch calls. The binding graph is as follows:
Durable Object bindings (durable_objects.bindings[].script_name)
| Caller worker | Binding name | DO class | Callee worker |
|---|---|---|---|
| worker-ingestion-process | RATE_LIMITER | RateLimiter | worker-rate-limiter |
| worker-notification | RATE_LIMITER | RateLimiter | worker-rate-limiter |
| consumer-api | RATE_LIMITER | RateLimiter | worker-rate-limiter |
| my-legaciti-dashboard | RATE_LIMITER | RateLimiter | worker-rate-limiter |
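
For callers of the `RATE_LIMITER` binding, an `infra/rate-limiter/` helper might look like the sketch below. `idFromName`, `get`, and the stub's `fetch` are the standard Durable Object namespace methods; the request URL, the JSON response shape, the fail-closed policy, and the `checkRateLimit` helper itself are assumptions, since the `RateLimiter` class's protocol is not documented here. Minimal local interfaces stand in for the runtime's Durable Object types.

```typescript
// Local stand-ins for the runtime types, reduced to what the sketch uses.
interface StubLike {
  fetch(
    input: string,
    init?: { method?: string; body?: string },
  ): Promise<{ ok: boolean; json(): Promise<unknown> }>;
}
interface NamespaceLike<Id> {
  idFromName(name: string): Id;
  get(id: Id): StubLike;
}

// Hypothetical helper: asks the RateLimiter DO whether `key` may proceed.
async function checkRateLimit<Id>(ns: NamespaceLike<Id>, key: string): Promise<boolean> {
  // The same name always maps to the same Durable Object instance.
  const stub = ns.get(ns.idFromName(key));
  // The hostname is a placeholder; the path and body shape are assumed.
  const res = await stub.fetch("https://rate-limiter/check", {
    method: "POST",
    body: JSON.stringify({ key }),
  });
  if (!res.ok) return false; // Assumed policy: fail closed when the DO errors.
  const data = (await res.json()) as { allowed?: boolean };
  return data.allowed === true;
}
```

A feature would receive the result (or a function wrapping this call) from `infra/`, rather than touching `env.RATE_LIMITER` itself.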
Tail consumers (tail_consumers[].service)
| Producer worker | Tail consumer |
|---|---|
| worker-ingestion-orchestrator | worker-log-processor |
| worker-ingestion-process | worker-log-processor |
| worker-notification | worker-log-processor |
| consumer-api | worker-log-processor |
| worker-rate-limiter | worker-log-processor |
Queue bindings (producer → consumer)
| Queue name | Producer workers | Consumer worker |
|---|---|---|
| worker-ingestion-queue | my-legaciti-dashboard, consumer-api | worker-ingestion-orchestrator |
| worker-ingestion-process-queue | worker-ingestion-orchestrator | worker-ingestion-process |
| worker-ingestion-orcid-sync-queue | my-legaciti-dashboard, worker-ingestion-orchestrator | worker-ingestion-orchestrator |
| worker-ingestion-notification-queue | my-legaciti-dashboard, worker-ingestion-process, worker-authz | worker-notification |
| worker-ingestion-notification-discord-queue | my-legaciti-dashboard, worker-ingestion-process | worker-notification |
| worker-ingestion-linking-queue | worker-ingestion-process | worker-ingestion-process |
| worker-ingestion-investigation-queue | worker-ingestion-process | worker-ingestion-process |
| worker-ingestion-retry-queue | worker-ingestion-process | worker-ingestion-process |
| worker-ingestion-app-event-listener-queue | worker-ingestion-process | worker-ingestion-process |
Deploy order rule
Callees must be deployed before callers. Deploying out of order causes binding resolution failures in Cloudflare (the DO class or service name won’t exist yet).
Required order:
1. worker-log-processor (callee only, no outbound bindings)
2. worker-rate-limiter (callee for ingestion + consumer-api + dashboard; tail → log-processor)
3. worker-notification (caller of rate-limiter; tail → log-processor)
4. worker-ingestion-process (caller of rate-limiter; queue producer to worker-notification; tail → log-processor)
5. worker-ingestion-orchestrator (queue producer to worker-ingestion-process; tail → log-processor)
6. consumer-api (caller of rate-limiter; queue producer to worker-ingestion-orchestrator)
7. my-legaciti-dashboard (caller of rate-limiter; queue producer to worker-ingestion-orchestrator and worker-notification queues)

This order is enforced by `scripts/deploy/index.mjs` (workers target).
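
The required order is a topological ordering of the binding graph. As a rough sketch (not the contents of `scripts/deploy/index.mjs`, which is not shown here), it can be derived from a caller-to-callees map with a depth-first sort. The `bindings` map transcribes the Durable Object, tail consumer, and queue tables in this document; `deployOrder` is a hypothetical helper.

```typescript
// caller -> callees (DO bindings, tail consumers, and queue consumers).
// Queues that a worker both produces and consumes impose no ordering and are
// omitted. worker-authz is left out because it is not in the required order list.
const bindings: Record<string, string[]> = {
  "consumer-api": ["worker-rate-limiter", "worker-ingestion-orchestrator", "worker-log-processor"],
  "my-legaciti-dashboard": ["worker-rate-limiter", "worker-ingestion-orchestrator", "worker-notification"],
  "worker-ingestion-orchestrator": ["worker-ingestion-process", "worker-log-processor"],
  "worker-ingestion-process": ["worker-rate-limiter", "worker-notification", "worker-log-processor"],
  "worker-notification": ["worker-rate-limiter", "worker-log-processor"],
  "worker-rate-limiter": ["worker-log-processor"],
  "worker-log-processor": [],
};

// Depth-first topological sort: every callee is emitted before its callers.
function deployOrder(graph: Record<string, string[]>): string[] {
  const order: string[] = [];
  const state: Record<string, "visiting" | "done"> = {};

  function visit(worker: string): void {
    if (state[worker] === "done") return;
    if (state[worker] === "visiting") throw new Error(`Binding cycle at ${worker}`);
    state[worker] = "visiting";
    for (const callee of graph[worker] ?? []) visit(callee);
    state[worker] = "done";
    order.push(worker);
  }

  for (const worker of Object.keys(graph)) visit(worker);
  return order;
}
```

With the map as written, `deployOrder(bindings)` yields the same seven-step sequence as the list above. `consumer-api` and `my-legaciti-dashboard` do not depend on each other, so a different key order in the map could swap those two without violating the rule.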