# Cloudflare Workers conventions

Migrated from root technical docs.

Conventions shared by all Cloudflare Workers in this repo. Applies to: ingestion, consumer-api, rate-limiter, log-processor.


## Directory layout

```
workers/{name}/
  src/
    index.ts            Entrypoint — exports default Worker/Queue/DurableObject/Tail handler
    features/           Worker application logic, grouped by capability/domain
      {domain}/
        {domain}.ts         Orchestration and feature logic
        {domain}.test.ts
    infra/              Cloudflare-specific adapters and infrastructure code
      db/                   D1 query helpers, shard routing
      http/                 External API clients, fetch wrappers, retry logic
      kv/                   KV-backed stores and adapters
      rate-limiter/         Durable Object client stubs
      telemetry/            Worker-side monitoring: monitoring.ts, tracing.ts
    shared/             Low-level worker-safe types and helpers reused across layers
      types.ts              Shared ambient or local types
  wrangler.jsonc        Worker config: routes, bindings, queues
  package.json
  tsconfig.json
  vitest.config.ts      (if worker has tests)
  shared-env.d.ts       Global Env interface declaration
```

### index.ts — entrypoint

- Owns runtime dispatch: `fetch`, `queue`, `tail`, `scheduled` handlers
- Bootstraps monitoring and resets circuit breakers before delegating
- Should remain thin — routes to feature handlers, does not implement logic inline
- Acceptable inline: error wrappers, monitoring init/flush, routing by queue name (see the sketch below)
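
A minimal sketch of that dispatch shape, assuming the monitoring helpers named later in this doc (`initMonitoring`, `flushMonitoring`, `captureException`; exact signatures are assumptions) and a hypothetical feature handler:

```ts
// src/index.ts (illustrative sketch only; handleIngestMessage is hypothetical)
import { initMonitoring, flushMonitoring, captureException } from "./infra/telemetry/monitoring";
import { handleIngestMessage } from "./features/ingestion/ingestion";

export default {
  async queue(batch: MessageBatch<unknown>, env: Env, ctx: ExecutionContext): Promise<void> {
    initMonitoring(env); // assumed signature
    try {
      // Inline work is limited to routing by queue name; logic lives in features/.
      switch (batch.queue) {
        case "worker-ingestion-process-queue":
          for (const msg of batch.messages) await handleIngestMessage(msg.body, env);
          break;
        default:
          throw new Error(`Unhandled queue: ${batch.queue}`);
      }
    } catch (err) {
      captureException(err);
      throw err;
    } finally {
      ctx.waitUntil(flushMonitoring());
    }
  },
} satisfies ExportedHandler<Env>;
```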
### features/ — domain and capability logic

- Owns use-case orchestration, processing flows, and capability-oriented modules
- Examples: `ingestion/`, `orcid/`, `enrichment/`, `email/`, `notifications/`, `log-processor/`
- May import from `infra/` adapters and `shared/`
- Must not directly access Cloudflare bindings (`env.DB_*`, `env.KV_*`, etc.); see the sketch below
- Pure formatting, parsing, and domain logic belong here
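
A hedged sketch of how a feature stays binding-free: it depends on a narrow adapter interface, and the entrypoint (or an infra module) supplies the concrete implementation. All names here are illustrative, not the repo's actual code:

```ts
// features/enrichment/enrichment.ts (illustrative sketch; names are hypothetical)
// The feature sees only an adapter interface, never env.DB_* or env.KV_*.
export interface PublicationStore {
  upsertPublication(doi: string, title: string): Promise<void>;
}

export async function enrichPublication(
  raw: { doi: string; title?: string },
  store: PublicationStore,
): Promise<void> {
  // Pure domain logic first: normalize the record.
  const title = (raw.title ?? "").trim() || "(untitled)";
  // Then persist through the injected adapter.
  await store.upsertPublication(raw.doi.toLowerCase(), title);
}
```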

### infra/ — Cloudflare adapters and infrastructure

- Owns all Cloudflare runtime binding usage: D1, KV, Queue, DO, R2
- `infra/db/` — D1 query functions, shard routing, upsert helpers
- `infra/http/` — external API clients (ORCID, OpenAlex, Crossref, Discord webhooks), retry wrappers
- `infra/kv/` — KV-backed persistence adapters
- `infra/rate-limiter/` — Durable Object stubs, rate-limit check helpers
- `infra/telemetry/` — `monitoring.ts` (Sentry bootstrap wrapper), `tracing.ts` (span helpers)
- Should not contain business/domain logic
### shared/ — worker-safe types and utilities

- Low-level types, constants, and pure utilities reused by multiple layers inside the same worker
- Must not depend on feature internals or Cloudflare bindings
- Example: `shared/types.ts` for ambient Env and event type declarations

### Dependency direction

```
index.ts
 ├─→ features/{domain}/
 │     ├─→ infra/{adapter}/
 │     │     └─→ (Cloudflare runtime: D1, KV, Queue, DO)
 │     └─→ shared/
 └─→ infra/{adapter}/    (direct infra use at entry only for bootstrap)
```
- `shared/` has no upward dependencies
- `infra/` does not import from `features/`
- `features/` does not import from `index.ts`
- Only `index.ts` and `infra/` may reference Cloudflare runtime bindings (see the adapter sketch below)
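
Complementing the feature sketch above, a hedged sketch of the infra side of that boundary. The binding, table, and column names are assumptions:

```ts
// infra/db/publications.ts (illustrative sketch; table and columns are hypothetical)
export async function upsertPublication(db: D1Database, doi: string, title: string): Promise<void> {
  // D1 prepared statement with bound parameters; ON CONFLICT keeps the write idempotent.
  await db
    .prepare(
      "INSERT INTO publications (doi, title) VALUES (?1, ?2) " +
        "ON CONFLICT (doi) DO UPDATE SET title = excluded.title",
    )
    .bind(doi, title)
    .run();
}
```

Only the entry and infra layers ever see the binding: index.ts can satisfy the feature's `PublicationStore` port with `{ upsertPublication: (doi, title) => upsertPublication(env.DB_0, doi, title) }`.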

## Entrypoint patterns

Hono HTTP worker (worker-consumer-api):

```ts
import { Hono } from "hono";

const app = new Hono<{ Bindings: Env }>();
// ... route registrations
export default app;
```

Queue consumer (worker-ingestion-process or worker-ingestion-orchestrator):

```ts
export default {
  async queue(batch: MessageBatch<QueueMessage>, env: Env): Promise<void> {
    // process batch.messages
  },
} satisfies ExportedHandler<Env>;
```

Tail worker (worker-log-processor):

```ts
export default {
  async tail(events: TailInvocation[], env: Env, ctx: ExecutionContext) {
    // process tail events
  },
};
```

Durable Object (worker-rate-limiter):

```ts
export class RateLimiter implements DurableObject {
  // ...
}

export default { fetch: () => new Response("ok") };
```
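
On the caller side, `infra/rate-limiter/` holds the Durable Object stub helpers. A hedged sketch of a check call (the request path and the 200/429 convention are assumptions, not the real contract):

```ts
// infra/rate-limiter/client.ts (illustrative sketch; path and semantics are hypothetical)
export async function checkRateLimit(ns: DurableObjectNamespace, key: string): Promise<boolean> {
  // One DO instance per key: idFromName routes every call for a key to the same object.
  const stub = ns.get(ns.idFromName(key));
  const res = await stub.fetch("https://rate-limiter/check");
  return res.ok; // assume 200 = allowed, 429 = limited
}
```

Callers would pass the namespace from their own binding, e.g. `checkRateLimit(env.RATE_LIMITER, workspaceId)`.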

## Testing

- Unit tests: co-located alongside source files as `{name}.test.ts`
- Integration/E2E tests: in `tests/features/*.mjs`; they run against live Cloudflare deployments, not local workers
- Workers are excluded from the Turbo pipeline; run tests directly in each worker package:
```sh
pnpm --filter @workers/ingestion-process test
pnpm --filter @workers/ingestion-orchestrator test
```

## Deployment

Workers are deployed via `wrangler deploy` directly — they are NOT in the turbo.json pipeline. Deploy scripts live in `scripts/deploy/`.

```sh
scripts/deploy/deploy-ingestion.sh
scripts/deploy/deploy-consumer-api.sh
```

See the root AGENTS.md for the full deploy command reference.


## Per-worker status

### ingestion

Fully normalized. Layout:

```
src/
  index.ts          Thin queue router + fetch handler, with monitoring wrappers
  features/
    ingestion/        Core ingestion pipeline, work processing, linking, retry queues
    enrichment/       Metadata enrichment from OpenAlex, Crossref
    orcid/            ORCID API client, sync queue handler, record parser
    providers/        Provider-specific data fetching (crossref, openalex, orcid) — HTTP + transform
    email/            Notification email queue processor
    notifications/    Discord notification queue processor
    activity/         Activity event writing
  infra/
    db/               D1 upsert helpers, shard routing, researcher/publication/work queries
    http/             Fetch retry helpers, circuit breaker
    rate-limiter/     DO stub, rate-limit check
    telemetry/
      monitoring.ts   Sentry wrapper (captureException, captureMessage, initMonitoring, flushMonitoring)
      tracing.ts      Worker-local span/trace system (Span, Trace classes)
```

Note: features/providers/ contains both HTTP fetching and data transformation for each external provider (OpenAlex, Crossref, ORCID). The HTTP client concern is mixed with provider data mapping — this is intentional for now, kept as documented behaviour in the worker AGENTS.md.
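
A hedged sketch of that mixed shape for one provider. The endpoint and response fields follow the public OpenAlex API, but the internal type and mapping are illustrative, not the repo's actual code:

```ts
// features/providers/openalex.ts (illustrative sketch; EnrichedWork is hypothetical)
interface EnrichedWork {
  doi: string | null;
  title: string;
  year: number | null;
}

// HTTP fetching and data transformation live side by side, as documented above.
export async function fetchOpenAlexWork(openAlexId: string): Promise<EnrichedWork> {
  const res = await fetch(`https://api.openalex.org/works/${openAlexId}`);
  if (!res.ok) throw new Error(`OpenAlex returned ${res.status} for ${openAlexId}`);
  const body = (await res.json()) as { doi?: string; title?: string; publication_year?: number };
  return {
    doi: body.doi ?? null,
    title: body.title ?? "(untitled)",
    year: body.publication_year ?? null,
  };
}
```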

### consumer-api

Partially normalized. Layout:

```
src/
  index.ts          Hono app with ALL route handlers inline (2359 lines)
  docs/
    openapi.ts        OpenAPI 3.0 spec
  infra/
    telemetry/
      monitoring.ts   Sentry wrapper (same pattern as worker-ingestion)
  __mocks__/
    cloudflare-workers.ts   Vitest alias for cloudflare:workers
```

Temporary exception: src/index.ts is a 2359-line monolith that mixes Hono app setup, middleware, type definitions, inline DB queries, and all route handlers. The worker AGENTS.md explicitly documents “single file” as the current convention. Splitting it into proper features/ and infra/db/ layers is the preferred target but deferred — the risk of regressions in the complex auth/rate-limit/sharding logic is too high for a structural-only pass.

Target structure (when ready to split):

```
src/
  index.ts          Thin Hono app setup + middleware registration + export
  features/
    publications/     Publication query handlers
    people/           People query handlers
    works/            Works query handlers
    ingest/           Ingest POST handler
    integrations/     Integration auth + credential handlers
  infra/
    db/               D1 fan-out helpers, shard routing
    middleware/       Auth, rate-limit, workspace-check middleware
    telemetry/
      monitoring.ts
```
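
For the middleware layer in that target structure, a hedged sketch of what an extracted Hono middleware could look like. The header handling and error shape are assumptions; the real auth logic is far more involved:

```ts
// infra/middleware/auth.ts (illustrative sketch only)
import { createMiddleware } from "hono/factory";

export const requireApiKey = createMiddleware<{ Bindings: Env }>(async (c, next) => {
  const key = c.req.header("Authorization")?.replace(/^Bearer /, "");
  if (!key) return c.json({ error: "missing credentials" }, 401);
  // Real validation (key lookup, rate-limit, workspace check) would happen here.
  await next();
});
```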

### rate-limiter

Single-file worker. The flat layout is appropriate: the entire worker is a single Durable Object implementation with no domain separation needed. No structural changes required.

### log-processor

Normalized in this pass. Layout:

```
src/
  index.ts          Thin tail handler — filters error events, delegates to features/infra
  features/
    log-processor/
      formatting.ts   Log parsing, alert formatting, structured log helpers
  infra/
    kv/
      persistence.ts  KV-backed event persistence (LOG_STORE)
    http/
      discord.ts      Discord webhook HTTP adapter
  shared/
    types.ts          Env, TailInvocation, StructuredLog types
```
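
A hedged sketch of how that thin tail handler composes the pieces; the helper names beyond the files listed above are assumptions:

```ts
// src/index.ts for log-processor (illustrative sketch; helper names are hypothetical)
import { parseTailEvents } from "./features/log-processor/formatting";
import { persistEvents } from "./infra/kv/persistence";
import { sendDiscordAlert } from "./infra/http/discord";
import type { TailInvocation } from "./shared/types";

export default {
  async tail(events: TailInvocation[], env: Env, ctx: ExecutionContext) {
    // Parse in features/, filter to errors, then persist and alert via infra/.
    const errors = parseTailEvents(events).filter((e) => e.level === "error");
    if (errors.length === 0) return;
    ctx.waitUntil(persistEvents(env.LOG_STORE, errors));
    ctx.waitUntil(sendDiscordAlert(env, errors));
  },
};
```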

## Bindings and the Env type

- Bindings (`env.DB_0`, `env.PUB_CACHE`, `env.RATE_LIMITER`, `env.LOG_STORE`, etc.) are accessed only in `src/index.ts` or `src/infra/`
- The `Env` type is declared globally via `shared-env.d.ts` in each worker package (ambient global — no import required; see the sketch below)
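
A minimal sketch of that ambient declaration, reusing the binding names above. The concrete types are assumptions; each worker declares its own set:

```ts
// shared-env.d.ts (illustrative sketch). A plain .d.ts with no imports/exports,
// so the interface is ambient and globally visible without an import.
interface Env {
  DB_0: D1Database;                     // D1 shard (type assumed)
  PUB_CACHE: KVNamespace;               // KV cache (type assumed)
  RATE_LIMITER: DurableObjectNamespace; // DO stub namespace
  LOG_STORE: KVNamespace;               // tail-event persistence
}
```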

## Worker-to-worker bindings

Internal worker-to-worker communication uses Cloudflare’s native binding mechanisms — never public URL fetch calls. The binding graph is as follows:

### Durable Object bindings (`durable_objects.bindings[].script_name`)

| Caller worker            | Binding name | DO class    | Callee worker       |
| ------------------------ | ------------ | ----------- | ------------------- |
| worker-ingestion-process | RATE_LIMITER | RateLimiter | worker-rate-limiter |
| worker-notification      | RATE_LIMITER | RateLimiter | worker-rate-limiter |
| consumer-api             | RATE_LIMITER | RateLimiter | worker-rate-limiter |
| my-legaciti-dashboard    | RATE_LIMITER | RateLimiter | worker-rate-limiter |
### Tail consumers (`tail_consumers`)

| Producer worker               | Tail consumer        |
| ----------------------------- | -------------------- |
| worker-ingestion-orchestrator | worker-log-processor |
| worker-ingestion-process      | worker-log-processor |
| worker-notification           | worker-log-processor |
| consumer-api                  | worker-log-processor |
| worker-rate-limiter           | worker-log-processor |
### Queues (`queues.producers` / `queues.consumers`)

| Queue name | Producer workers | Consumer worker |
| --- | --- | --- |
| worker-ingestion-queue | my-legaciti-dashboard, consumer-api | worker-ingestion-orchestrator |
| worker-ingestion-process-queue | worker-ingestion-orchestrator | worker-ingestion-process |
| worker-ingestion-orcid-sync-queue | my-legaciti-dashboard, worker-ingestion-orchestrator | worker-ingestion-orchestrator |
| worker-ingestion-notification-queue | my-legaciti-dashboard, worker-ingestion-process, worker-authz | worker-notification |
| worker-ingestion-notification-discord-queue | my-legaciti-dashboard, worker-ingestion-process | worker-notification |
| worker-ingestion-linking-queue | worker-ingestion-process | worker-ingestion-process |
| worker-ingestion-investigation-queue | worker-ingestion-process | worker-ingestion-process |
| worker-ingestion-retry-queue | worker-ingestion-process | worker-ingestion-process |
| worker-ingestion-app-event-listener-queue | worker-ingestion-process | worker-ingestion-process |
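
To illustrate how these graphs appear in config, a hedged wrangler.jsonc fragment for worker-ingestion-process combining one row from each table above. IDs and other fields are omitted, and the `RETRY_QUEUE` binding name is hypothetical; treat this as a sketch, not the repo's actual config:

```jsonc
// workers/ingestion-process/wrangler.jsonc (illustrative fragment only)
{
  "name": "worker-ingestion-process",
  "main": "src/index.ts",
  // DO binding: the RateLimiter class lives in worker-rate-limiter.
  "durable_objects": {
    "bindings": [
      { "name": "RATE_LIMITER", "class_name": "RateLimiter", "script_name": "worker-rate-limiter" }
    ]
  },
  // All invocation traces stream to the tail worker.
  "tail_consumers": [{ "service": "worker-log-processor" }],
  "queues": {
    // Produces onto its own retry queue; consumes the process queue.
    "producers": [{ "queue": "worker-ingestion-retry-queue", "binding": "RETRY_QUEUE" }],
    "consumers": [{ "queue": "worker-ingestion-process-queue", "max_batch_size": 10 }]
  }
}
```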

## Deploy order

Callees must be deployed before callers. Failure to follow this order causes binding resolution failures in Cloudflare (the referenced DO class or service name won’t exist yet).

Required order:

1. worker-log-processor (callee only — no outbound bindings)
2. worker-rate-limiter (callee for ingestion + consumer-api + dashboard; tail → log-processor)
3. worker-notification (caller of rate-limiter; tail → log-processor)
4. worker-ingestion-process (caller of rate-limiter; queue producer to worker-notification; tail → log-processor)
5. worker-ingestion-orchestrator (queue producer to worker-ingestion-process; tail → log-processor)
6. consumer-api (caller of rate-limiter; queue producer to worker-ingestion-orchestrator)
7. my-legaciti-dashboard (caller of rate-limiter; queue producer to worker-ingestion-orchestrator and worker-notification queues)

This order is enforced by scripts/deploy/index.mjs (workers target).