The Worker (Agent Backend)
worker/index.ts — a Cloudflare Worker that is the only thing holding API
keys. The browser never talks to Cerebras/OpenRouter/Gemini directly; every
model call goes through here.
Responsibilities
- Inject API keys from Worker secrets — never shipped to the browser.
- Resolve the provider → an AI SDK model (Cerebras/OpenRouter/NVIDIA NIM via `@ai-sdk/openai-compatible`, Gemini via `@ai-sdk/google`). OpenRouter and NVIDIA are interchangeable GPU-hosted challengers, selectable per-run from the lobby.
- Run `streamObject` + Zod so the model emits schema-validated JSON while still streaming, preserving the live tokens/sec speedometer.
- Re-wrap the JSON deltas as OpenAI-shaped SSE so the client streaming code works unchanged.
One symmetric code path for every provider = a fair race.
Endpoints
| Route | Method | Purpose |
|---|---|---|
| `/api/health` | GET | liveness probe |
| `/api/config` | GET | which providers are wired (readiness booleans + model ids + a placeholder flag per provider; never keys) |
| `/api/chat` | POST | run one agent step via `streamObject`, stream back as SSE |
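As an illustration of the `/api/config` row, here is a minimal sketch of how a readiness payload could be assembled from Worker secrets without ever echoing them. The secret names, the model ids, and the reading of the placeholder flag as "a key is set but looks like a template value" are all assumptions of this sketch, not taken from `worker/index.ts`.

```typescript
type Provider = "cerebras" | "openrouter" | "nvidia" | "gemini";
type SecretName =
  | "CEREBRAS_API_KEY" | "OPENROUTER_API_KEY" | "NVIDIA_API_KEY" | "GEMINI_API_KEY";

// Hypothetical secret names; the real Worker bindings may be named differently.
type Env = Partial<Record<SecretName, string>>;

interface ProviderStatus {
  ready: boolean;       // a usable key is present
  placeholder: boolean; // a key is set but looks like a template value (assumed meaning)
  model: string;        // model id only, never the key
}

// Hypothetical model ids, for illustration only.
const PROVIDERS: Record<Provider, { secret: SecretName; model: string }> = {
  cerebras:   { secret: "CEREBRAS_API_KEY",   model: "example-cerebras-model" },
  openrouter: { secret: "OPENROUTER_API_KEY", model: "example-openrouter-model" },
  nvidia:     { secret: "NVIDIA_API_KEY",     model: "example-nvidia-model" },
  gemini:     { secret: "GEMINI_API_KEY",     model: "example-gemini-model" },
};

const looksLikePlaceholder = (key: string): boolean =>
  /^(changeme|placeholder|your[-_].*|x{3,})$/i.test(key.trim());

function buildConfig(env: Env): Record<Provider, ProviderStatus> {
  const out = {} as Record<Provider, ProviderStatus>;
  for (const name of Object.keys(PROVIDERS) as Provider[]) {
    const { secret, model } = PROVIDERS[name];
    const key = env[secret] ?? "";
    const placeholder = key !== "" && looksLikePlaceholder(key);
    // Only booleans and the model id leave this function; the key itself never does.
    out[name] = { ready: key !== "" && !placeholder, placeholder, model };
  }
  return out;
}
```

Because the key is only ever inspected and never copied into the result, a test can serialize the response and assert that no secret substring appears in it.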
How /api/chat works
The request body carries the schema identity:
{ "provider": "gemini", "role": "worker", "taskTypeId": "label-parse",
"messages": [...], "temperature": 0.2, "max_tokens": 512 }
The Worker:
- Validates `provider` (`cerebras|openrouter|nvidia|gemini`) and `role`.
- Validates `messages` against a Zod schema: roles are constrained to `system|user`, and `image_url.url` must be a `data:` URL or an allowlisted asset host. (This closes the SSRF vector where the provider would fetch an arbitrary URL server-side.)
- Resolves the schema by `(role, taskTypeId)`: workers key off the task id; router/checker/escalation have fixed schemas.
- Builds the model via `buildModel(env, provider, modelOverride)`.
- Transforms the messages with `toModelMessages()`: the system prompt is lifted out of the array and passed via the `system` option (the AI SDK rejects a `role:'system'` message), and OpenAI-style `{type:'image_url'}` parts are converted to the SDK's `{type:'image', image}` shape. Without this, every live call fails prompt standardization.
- Calls `streamObject({ model, schema, system, messages, temperature, maxOutputTokens })` and pipes its `textStream` through `wrapStreamAsSse()`, which emits `data: {choices:[{delta:{content}}]}` frames followed by `data: [DONE]`.
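The schema lookup, image URL check, and message transformation steps above can be sketched without any dependencies. This is a hand-rolled stand-in, not the Worker's Zod schema or its actual `toModelMessages()`: the `schemaKey` helper, the `ASSET_HOSTS` allowlist, the error strings, and the https-only rule are hypothetical, and only the input and output shapes follow the description.

```typescript
type Role = "worker" | "router" | "checker" | "escalation";
type TextPart = { type: "text"; text: string };
type ImageUrlPart = { type: "image_url"; image_url: { url: string } };
type ChatMessage = {
  role: "system" | "user";
  content: string | Array<TextPart | ImageUrlPart>;
};
// Target shapes, as described for the AI SDK call.
type SdkPart = TextPart | { type: "image"; image: string };
type SdkMessage = { role: "user"; content: string | SdkPart[] };

// Hypothetical allowlist; the real asset host(s) are not named in this document.
const ASSET_HOSTS = ["assets.example.com"];

// Workers key off the task id; every other role has one fixed schema.
function schemaKey(role: Role, taskTypeId?: string): string {
  if (role !== "worker") return role;
  if (!taskTypeId) throw new Error("missing_task_type");
  return `worker:${taskTypeId}`;
}

// Accept data: URLs and allowlisted hosts only, so the provider is never
// asked to fetch an arbitrary URL server-side.
function isAllowedImageUrl(url: string): boolean {
  if (url.startsWith("data:")) return true;
  try {
    const parsed = new URL(url);
    // Requiring https is an extra assumption of this sketch.
    return parsed.protocol === "https:" && ASSET_HOSTS.includes(parsed.hostname);
  } catch {
    return false; // not a parseable absolute URL
  }
}

function toModelMessages(input: ChatMessage[]): { system?: string; messages: SdkMessage[] } {
  const system: string[] = [];
  const messages: SdkMessage[] = [];
  for (const msg of input) {
    if (typeof msg.content === "string") {
      if (msg.role === "system") system.push(msg.content);
      else messages.push({ role: "user", content: msg.content });
      continue;
    }
    const parts = msg.content.map((part): SdkPart => {
      if (part.type === "text") return part;
      if (!isAllowedImageUrl(part.image_url.url)) throw new Error("invalid_image_url");
      return { type: "image", image: part.image_url.url };
    });
    if (msg.role === "system") {
      // Only the text of a multi-part system message is kept (assumption).
      system.push(parts.map((p) => (p.type === "text" ? p.text : "")).join(""));
    } else {
      messages.push({ role: "user", content: parts });
    }
  }
  return { system: system.length > 0 ? system.join("\n\n") : undefined, messages };
}
```

The returned `system` and `messages` values are what would then be handed to `streamObject` alongside the schema selected by `schemaKey`.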
`temperature` is clamped to [0, 2]; `max_tokens` (default 512) is clamped to [1, 8192] and passed on as `maxOutputTokens`.
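A minimal sketch of that clamping follows. The `normalizeSampling` name is hypothetical, and three behaviours are assumptions the text does not specify: non-numeric input falls back to the default, `max_tokens` is floored to an integer, and a missing `temperature` is left undefined.

```typescript
const clamp = (value: number, min: number, max: number): number =>
  Math.min(max, Math.max(min, value));

const isFiniteNumber = (v: unknown): v is number =>
  typeof v === "number" && Number.isFinite(v);

function normalizeSampling(body: { temperature?: unknown; max_tokens?: unknown }): {
  temperature: number | undefined;
  maxOutputTokens: number;
} {
  const rawTemperature = body.temperature;
  const rawMaxTokens = body.max_tokens;
  return {
    // Left undefined when absent or invalid (assumption: no default is documented).
    temperature: isFiniteNumber(rawTemperature) ? clamp(rawTemperature, 0, 2) : undefined,
    // Default 512, then clamped to [1, 8192] and renamed for the AI SDK call.
    maxOutputTokens: clamp(isFiniteNumber(rawMaxTokens) ? Math.floor(rawMaxTokens) : 512, 1, 8192),
  };
}
```

Clamping rather than rejecting means a client sending an oversized `max_tokens` still gets a response, with the limit applied on the server.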
Why SSE re-wrapping
The AI SDK's streamObject yields partial JSON text deltas. The client
(src/agents/streaming.ts) parses OpenAI-shaped SSE. So the Worker re-wraps each
delta as an OpenAI delta.content frame — the client reconstructs the full JSON
and parses it. Because streamObject emits schema-valid JSON, the assembled
string parses cleanly.
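The round trip can be sketched as a pair of pure functions: one frames each delta the way `wrapStreamAsSse()` is described as doing, the other reassembles the text the way an OpenAI-style client would. The names `sseFrame`, `wrapDeltasAsSse`, and `reassemble` are hypothetical. The sketch also works on an array of deltas to stay dependency-free, whereas the Worker would apply the same per-delta framing inside a stream.

```typescript
// Frame one JSON text delta as an OpenAI-style chat-completion chunk.
// JSON.stringify escapes any newline in the delta, so each frame stays on one line.
function sseFrame(delta: string): string {
  return `data: ${JSON.stringify({ choices: [{ delta: { content: delta } }] })}\n\n`;
}

const SSE_DONE = "data: [DONE]\n\n";

// Server side: frame every delta, then append the terminator.
function wrapDeltasAsSse(deltas: string[]): string {
  return deltas.map(sseFrame).join("") + SSE_DONE;
}

// Client side: concatenate delta.content from each frame until [DONE].
function reassemble(sse: string): string {
  let text = "";
  for (const line of sse.split("\n")) {
    if (!line.startsWith("data: ")) continue; // skip blank separator lines
    const payload = line.slice("data: ".length);
    if (payload === "[DONE]") break;
    const chunk = JSON.parse(payload) as {
      choices?: Array<{ delta?: { content?: string } }>;
    };
    text += chunk.choices?.[0]?.delta?.content ?? "";
  }
  return text;
}
```

Individual deltas are arbitrary slices of the JSON text and are not valid JSON on their own; only the concatenated string is parsed.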
Security posture
- Keys are server-only. `/api/config` returns readiness booleans, never secrets. The test suite asserts no key material appears in the response.
- Errors never leak. All error paths log detail server-side and return a stable code (`upstream_error` / `provider_not_configured`). AI-SDK errors can embed request URLs / echoed auth, so `String(err)` is never sent to the client.
- `APP_TOKEN` is a weak public-proxy gate (it ships in the client bundle when set, so it's publicly recoverable). Real protection = provider spend caps + Cloudflare rate-limiting. See Security.
- No `dangerouslySetInnerHTML` anywhere; model output is React-escaped text.
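The "errors never leak" rule can be illustrated with a small mapping function. This is a sketch under assumptions: `toClientError`, the `code` tag on thrown errors, and the HTTP status codes are hypothetical. Only the two stable codes and the log-server-side, return-code-only behaviour come from the list above.

```typescript
type ErrorCode = "upstream_error" | "provider_not_configured";

interface ClientError {
  status: number;
  body: { error: ErrorCode };
}

// Assumption: configuration failures are thrown as objects tagged with a `code` field.
const isNotConfigured = (err: unknown): boolean =>
  typeof err === "object" &&
  err !== null &&
  (err as { code?: unknown }).code === "provider_not_configured";

function toClientError(err: unknown, log: (detail: string) => void): ClientError {
  // Full detail goes to the server log only; it may contain URLs or echoed auth.
  log(String(err));
  if (isNotConfigured(err)) {
    return { status: 503, body: { error: "provider_not_configured" } };
  }
  // Anything else collapses to one stable code; String(err) is never returned.
  return { status: 502, body: { error: "upstream_error" } };
}
```

Since the response body is built only from fixed string literals, nothing from the original error can reach the client, however the upstream message is worded.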
See Providers for adding a new model provider.