
Resync Unverified Works

Migrated from root technical docs.

Scans all stored ORCID items and queues ingestion for every person who has at least one work whose type is now eligible. The ingestion pipeline then re-classifies those items as works instead of leaving them in the unverified review queue.

Script: scripts/db/resync-unverified-works.mjs
npm alias: pnpm resync:unverified-works

Token is loaded automatically from .env at the repo root (CLOUDFLARE_API_TOKEN or CF_API_TOKEN).
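The precedence between the two variable names is not stated here. The following is a minimal sketch of the lookup, assuming CLOUDFLARE_API_TOKEN is checked first and that the .env values have already been loaded into the environment object. resolveApiToken is a hypothetical helper, not the script's actual code.

```javascript
// Hypothetical helper: pick the Cloudflare API token from an env-like object.
// Assumption: CLOUDFLARE_API_TOKEN takes precedence over CF_API_TOKEN.
function resolveApiToken(env = process.env) {
  const token = env.CLOUDFLARE_API_TOKEN?.trim() || env.CF_API_TOKEN?.trim();
  if (!token) {
    throw new Error("Set CLOUDFLARE_API_TOKEN or CF_API_TOKEN in .env at the repo root");
  }
  return token;
}
```

Failing fast with a clear message is a design choice of this sketch; the real script may report a missing token differently.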


Run this after expanding the ORCID_WORK_TYPE_KEYWORDS list. Existing orcid_works rows that were stored before the change still carry their original classification. This script finds those rows and queues ingestion jobs for the affected ORCIDs so the pipeline re-processes them with the updated keyword list.
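The selection step can be pictured as a filter over stored rows followed by de-duplication per ORCID. The sketch below is illustrative only: the row shape ({ orcid, workType, classification }), the "unverified" label, the case-insensitive exact match against the keyword list, and the sample identifiers are all assumptions, not the script's actual schema or query.

```javascript
// Hypothetical sketch: find ORCIDs whose stored rows would now classify as works.
// Assumed row shape: { orcid: string, workType: string, classification: string }.
function planResync(rows, eligibleTypes) {
  const eligible = new Set(eligibleTypes.map((t) => t.toLowerCase()));
  const orcids = new Set();
  const breakdown = {};
  for (const row of rows) {
    const type = row.workType.toLowerCase();
    // Only rows still held for review, whose type is now on the list, need a resync.
    if (row.classification !== "unverified" || !eligible.has(type)) continue;
    orcids.add(row.orcid); // one ingestion job per person, however many rows match
    breakdown[type] = (breakdown[type] ?? 0) + 1;
  }
  return { orcids: [...orcids], breakdown };
}

// Made-up sample rows, for illustration only.
const plan = planResync(
  [
    { orcid: "0000-0000-0000-0001", workType: "dataset", classification: "unverified" },
    { orcid: "0000-0000-0000-0001", workType: "Dataset", classification: "unverified" },
    { orcid: "0000-0000-0000-0002", workType: "patent", classification: "unverified" },
  ],
  ["dataset"],
);
console.log(plan.orcids.length, plan.breakdown);
```

The returned ORCID list and per-type counts correspond to the affected ORCID count and work-type breakdown that a dry run prints.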

Works ingested after the keyword change is deployed are classified with the updated list automatically, so they need no manual resync.


Prints the affected ORCID count and a work-type breakdown without sending anything:

pnpm resync:unverified-works --dry-run

Run without flags to queue the ingestion jobs:

pnpm resync:unverified-works

This defaults to low-pressure ingestion mode (skip_existing=true, concurrency=1, enqueue-delay-ms=1000) to reduce worker invocation pressure.
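To make the effect of the concurrency and enqueue-delay-ms settings concrete, here is a minimal sketch of chunked sending. It assumes the chunk size equals the concurrency value and that the delay applies between chunks; send is a hypothetical stand-in for the real queue call, and enqueueInChunks is not the script's actual function.

```javascript
// Hypothetical sketch of low-pressure enqueueing; `send` stands in for the real queue call.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function enqueueInChunks(orcids, send, { concurrency = 1, enqueueDelayMs = 1000 } = {}) {
  let sent = 0;
  for (let i = 0; i < orcids.length; i += concurrency) {
    const chunk = orcids.slice(i, i + concurrency);
    // At most `concurrency` send requests are in flight at once.
    await Promise.all(chunk.map((orcid) => send(orcid)));
    sent += chunk.length;
    // Pause between chunks, but not after the last one.
    if (i + concurrency < orcids.length) await sleep(enqueueDelayMs);
  }
  return sent;
}
```

Under these assumptions, the defaults send one message per second, which is why raising the concurrency or lowering the delay increases worker invocation pressure.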

Set the entity ID written into each sync job message (default: cesam)

pnpm resync:unverified-works --entity-id=cesam

Set a watch-mode timeout in seconds (default: none)

pnpm resync:unverified-works --watch-timeout=300

By default, watch mode has no timeout and waits until all queued jobs reach a terminal state (complete or failed).
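The waiting behaviour can be sketched as a polling loop that stops once every job is complete or failed, or once an optional deadline passes. In this sketch fetchStatuses is a hypothetical stand-in for the real ingestion_jobs query, and the polling interval is an assumed value that this page does not document.

```javascript
// Hypothetical watch loop; `fetchStatuses` stands in for the real ingestion_jobs query.
const TERMINAL = new Set(["complete", "failed"]);
const pause = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function watchJobs(fetchStatuses, { timeoutSeconds = null, intervalMs = 2000 } = {}) {
  // No timeout means the deadline is never reached.
  const deadline = timeoutSeconds == null ? Infinity : Date.now() + timeoutSeconds * 1000;
  for (;;) {
    const statuses = await fetchStatuses(); // one status string per queued job
    if (statuses.every((s) => TERMINAL.has(s))) return { done: true, statuses };
    if (Date.now() >= deadline) return { done: false, statuses };
    await pause(intervalMs);
  }
}
```

Returning a done flag rather than throwing on timeout lets the caller decide whether an unfinished run is an error.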

Use local D1 (for testing against local state)

pnpm resync:unverified-works --local

Control queue send concurrency (default: 1)

pnpm resync:unverified-works --concurrency=10

Control delay between queue chunks (default: 1000ms)

pnpm resync:unverified-works --enqueue-delay-ms=2000

Re-enrich already-known publications (higher load)

pnpm resync:unverified-works --no-skip-existing

| Flag | Default | Description |
| --- | --- | --- |
| --dry-run | off | Print plan only; no messages sent |
| --remote | on | Use remote D1 databases |
| --local | off | Use local D1 databases |
| --entity-id=<id> | cesam | Entity ID written into each sync job message |
| --concurrency=<n> | 1 | Concurrent queue send requests |
| --enqueue-delay-ms=<n> | 1000 | Delay between queue chunks in milliseconds |
| --no-skip-existing | off | Re-enrich already-known publications (higher load) |
| --no-watch | off | Queue jobs only, skip live status polling |
| --watch-timeout | off | Optional timeout in seconds for watch mode |
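As a reading aid, the flags and defaults listed above can be expressed as a small parser. This is a sketch under assumptions: parseFlags and the option names it returns are hypothetical, and the real script may parse and validate its arguments differently.

```javascript
// Hypothetical parser mirroring the documented flags; not the script's actual code.
function parseFlags(argv) {
  const opts = {
    dryRun: false,
    local: false, // false means remote D1, the documented default
    entityId: "cesam",
    concurrency: 1,
    enqueueDelayMs: 1000,
    skipExisting: true,
    watch: true,
    watchTimeout: null, // seconds; null means wait until all jobs are terminal
  };
  for (const arg of argv) {
    const [name, value] = arg.split("=", 2);
    switch (name) {
      case "--dry-run": opts.dryRun = true; break;
      case "--remote": opts.local = false; break;
      case "--local": opts.local = true; break;
      case "--entity-id": opts.entityId = value; break;
      case "--concurrency": opts.concurrency = Number(value); break;
      case "--enqueue-delay-ms": opts.enqueueDelayMs = Number(value); break;
      case "--no-skip-existing": opts.skipExisting = false; break;
      case "--no-watch": opts.watch = false; break;
      case "--watch-timeout": opts.watchTimeout = Number(value); break;
      default: throw new Error(`Unknown flag: ${name}`);
    }
  }
  return opts;
}
```

In a script this would typically be called as parseFlags(process.argv.slice(2)). Rejecting unknown flags is a choice made for the sketch so that typos surface early.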

Each queued message triggers the ingestion pipeline for one ORCID. The worker fetches ORCID works and runs classifyOrcidWork with the updated keyword list. Items that now resolve to "work" are written to the works table via handleNonDoiOrcidWork and will no longer appear in the unverified review queue.
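The re-classification step can be illustrated with a simplified stand-in. The code below is not the real classifyOrcidWork: the item shape ({ type }), the "unverified" label, and the case-insensitive exact match against the keyword list are assumptions made for this sketch.

```javascript
// Hypothetical stand-in for the worker's classification step; not the real classifyOrcidWork.
// Assumption: a work type is eligible when it equals a keyword, ignoring case.
function classifySketch(workType, keywords) {
  const type = String(workType ?? "").toLowerCase();
  return keywords.some((k) => k.toLowerCase() === type) ? "work" : "unverified";
}

// Split fetched items into those that become works and those that stay in review.
function partitionWorks(items, keywords) {
  const result = { work: [], unverified: [] };
  for (const item of items) result[classifySketch(item.type, keywords)].push(item);
  return result;
}
```

Expanding the keyword list moves items from the unverified bucket to the work bucket the next time the same ORCID is processed, which is the effect the resync relies on.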

The script also polls ingestion_jobs live and prints counts of missing, queued, processing, complete, and failed jobs, so you can see whether processing has started and whether it has finished.
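The printed counts can be thought of as a tally over the queued ORCIDs. In the sketch below, jobsByOrcid is a hypothetical Map from ORCID to the status read from ingestion_jobs, and treating an ORCID with no job row as "missing" is an assumption about what that label means.

```javascript
// Hypothetical tally of job states for the ORCIDs queued in this run.
const STATES = ["missing", "queued", "processing", "complete", "failed"];

function summarizeJobs(queuedOrcids, jobsByOrcid) {
  const counts = Object.fromEntries(STATES.map((s) => [s, 0]));
  for (const orcid of queuedOrcids) {
    // Assumption: an ORCID with no job row yet is counted as "missing".
    const status = jobsByOrcid.get(orcid) ?? "missing";
    if (!Object.hasOwn(counts, status)) throw new Error(`Unexpected status: ${status}`);
    counts[status] += 1;
  }
  return counts;
}
```

Read this way, a run is finished when the complete and failed counts add up to the number of queued ORCIDs.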