Programmatic Agent (BYO Anthropic key)

Build BitBadges collections from natural-language prompts in Node/TypeScript. You bring your own Anthropic or OpenAI API key; BitBadges never sees your key and never proxies your requests.

This is the scriptable counterpart to the MCP Builder Tools path. Pick whichever fits:

|  | No-code UI | MCP Builder | Programmatic Agent |
| --- | --- | --- | --- |
| Where it runs | bitbadges.io/create | Claude Desktop / Cursor / Claude Code | Your Node process |
| LLM key | BitBadges-managed (billed credits) | Your Claude subscription | Your Anthropic or OpenAI key |
| Good for | End users, one-off builds | Power users, exploratory work | Dapps, bots, games, CI, fine-tuning |

Install

Install the SDK plus the LLM provider you want to use. Both providers ship as optional peer dependencies, so install whichever you'll use; you don't need both.

# Anthropic (default)
npm install bitbadges @anthropic-ai/sdk

# OpenAI
npm install bitbadges openai

The SDK never bundles either provider; your key stays in your process.

Anthropic / OpenAI keys are required ONLY for BitBadgesBuilderAgent, the Node-side self-driving build loop on this page. The MCP server (bitbadges-builder bin used by Cursor / Claude Desktop / Claude Code / Cline / OpenAI Codex / Gemini Code Assist) is model-agnostic and does not read these env vars. If you're just installing the MCP, skip ahead to Builder Tools (MCP): your IDE / agent already provides the model.

# Pick one: Anthropic (default) or OpenAI
export ANTHROPIC_API_KEY=sk-ant-...
export OPENAI_API_KEY=sk-proj-...

# Optional: only needed if prompts trigger query/search/simulate tools
export BITBADGES_API_KEY=bb-...

Zero-config

Anthropic (default)

OpenAI

Both providers run the same self-driving loop: same tools, same validation, same review pass, same auto-token-type inference. The dispatcher translates the Anthropic-style internal message format to/from OpenAI's chat-completions shape at the API boundary.

Token-type inference parity: both providers run a fast classifier (Anthropic Haiku / OpenAI gpt-4o-mini) before each build to auto-pick the right token-type skill. OpenAI uses native structured outputs (response_format: json_schema, strict: true) so the JSON contract is server-enforced. The two paths are interchangeable, so pick whichever matches your stack.
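For reference, an OpenAI structured-output request fragment of the kind mentioned above has roughly the following shape. This is a minimal sketch: the schema, the `token_type_pick` name, and the `skillId` / `confidence` fields are hypothetical and are not the SDK's actual classifier contract.

```typescript
// Hypothetical classifier output schema; field names are illustrative only.
const tokenTypeSchema = {
  type: 'object',
  properties: {
    skillId: { type: ['string', 'null'] }, // null means "no confident match"
    confidence: { type: 'number' }
  },
  required: ['skillId', 'confidence'], // strict mode requires every property to be listed
  additionalProperties: false // strict mode also requires this to be false
};

// Fragment passed as `response_format` on a chat-completions request.
const responseFormat = {
  type: 'json_schema',
  json_schema: { name: 'token_type_pick', strict: true, schema: tokenTypeSchema }
};
```

With `strict: true`, the server rejects or constrains output that does not match the schema, which is what makes the contract server-enforced rather than prompt-enforced.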

Auth modes

Anthropic

OpenAI

Env vars are auto-read when no explicit creds are passed:

  • Anthropic: ANTHROPIC_API_KEY, ANTHROPIC_OAUTH_TOKEN, ANTHROPIC_AUTH_TOKEN

  • OpenAI: OPENAI_API_KEY

  • Shared: BITBADGES_API_KEY, BITBADGES_API_URL
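The lookup described above could be sketched as follows. This is an illustration under assumptions, not the SDK's resolver: `resolveCreds` is a hypothetical helper, and both the "explicit beats env" rule and the listed-order precedence among the Anthropic variables are assumed here.

```typescript
type Env = Record<string, string | undefined>;

interface ResolvedCreds {
  provider: 'anthropic' | 'openai';
  source: string; // which env var (or 'explicit') supplied the credential
  value: string;
}

// Assumption: explicit credentials win, then env vars are checked in the listed order.
function resolveCreds(
  provider: 'anthropic' | 'openai',
  env: Env,
  explicit?: string
): ResolvedCreds | undefined {
  if (explicit) return { provider, source: 'explicit', value: explicit };
  const names =
    provider === 'anthropic'
      ? ['ANTHROPIC_API_KEY', 'ANTHROPIC_OAUTH_TOKEN', 'ANTHROPIC_AUTH_TOKEN']
      : ['OPENAI_API_KEY'];
  for (const name of names) {
    const value = env[name];
    if (value) return { provider, source: name, value };
  }
  return undefined; // caller decides how to report missing credentials
}
```

In a Node process you would pass `process.env` as the `env` argument.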

Customization

Hook contract

  • onTokenUsage is load-bearing: it's awaited, and rejections propagate out of build(). Throw from it to enforce per-build quotas (the BitBadges indexer does this with its TokenLedger).

  • onCompletion fires exactly once per build(), on success AND on error paths, so cleanup logic runs either way.

  • onToolCall / onStatusUpdate / onLog are fire-and-forget observability hooks; rejections are swallowed so a misbehaving logger can't hang a build.

  • onLog receives { type: 'info' | 'ai_text' | 'validation' | 'error', label, data } entries: round boundaries, the LLM's text responses, and validation-gate pass/fail. Useful for live-tail dev consoles and audit log persistence.
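Because onTokenUsage is awaited and its failures propagate, a quota can be enforced by throwing from the hook. A minimal sketch, assuming a per-round payload with `inputTokens` and `outputTokens` (the real payload shape may differ); `makeQuotaHook` and `QuotaExceededError` are hypothetical names.

```typescript
// Hypothetical per-round usage payload; the real hook argument may differ.
interface RoundUsage {
  inputTokens: number;
  outputTokens: number;
}

class QuotaExceededError extends Error {
  constructor(used: number, limit: number) {
    super(`Token quota exceeded: ${used} > ${limit}`);
    this.name = 'QuotaExceededError';
  }
}

// Returns a hook that accumulates usage and throws once the budget is exceeded.
// A synchronous throw still surfaces as a rejection when the caller awaits the hook.
function makeQuotaHook(limit: number): (usage: RoundUsage) => void {
  let used = 0;
  return (usage) => {
    used += usage.inputTokens + usage.outputTokens;
    if (used > limit) throw new QuotaExceededError(used, limit);
  };
}
```

The same pattern does not work for onToolCall, onStatusUpdate, or onLog, since their rejections are swallowed.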

Validation modes

  • 'strict' (default): throws ValidationFailedError if hard errors remain after the fix loop.

  • 'lenient': always returns; result.valid is false with hard errors surfaced in result.errors.

  • 'off': skips the gate entirely. Use only for experimentation.
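The three modes can be pictured as a small gate function. This is a sketch of the behavior described above, not the SDK's code: `applyGate` is hypothetical, the `ValidationFailedError` class here is a local stand-in, and reporting `valid: true` when the gate is off is an assumption.

```typescript
type ValidationMode = 'strict' | 'lenient' | 'off';

interface GateResult {
  valid: boolean;
  errors: string[];
}

// Local stand-in for the SDK's ValidationFailedError, so the sketch is self-contained.
class ValidationFailedError extends Error {
  errors: string[];
  constructor(errors: string[]) {
    super(`Validation failed with ${errors.length} hard error(s)`);
    this.name = 'ValidationFailedError';
    this.errors = errors;
  }
}

// hardErrors = errors still present after the fix loop has run.
function applyGate(mode: ValidationMode, hardErrors: string[]): GateResult {
  if (mode === 'off') return { valid: true, errors: [] }; // assumption: nothing is checked or reported
  if (hardErrors.length === 0) return { valid: true, errors: [] };
  if (mode === 'strict') throw new ValidationFailedError(hardErrors);
  return { valid: false, errors: hardErrors }; // lenient: always returns
}
```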

Skills

Two inputs, two levels:

agent.listSkills() returns every available skill (filtered by the constructor whitelist when set). agent.describeSkill(id) returns one by ID. Discovery is code-only; there's no public marketplace endpoint. If you pass an unknown ID to selectedSkills, it's dropped silently (no build failure); enable debug: true to log dropped IDs.
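The silent-drop behavior amounts to a filter against the known skill IDs. A minimal sketch, with `filterSkills` as a hypothetical helper and the optional `log` callback standing in for what debug: true would print:

```typescript
// Keeps known skill IDs and drops unknown ones without failing.
function filterSkills(
  selected: string[],
  known: ReadonlySet<string>,
  log?: (message: string) => void
): string[] {
  const kept = selected.filter((id) => known.has(id));
  const dropped = selected.filter((id) => !known.has(id));
  if (log && dropped.length > 0) log(`Dropped unknown skill IDs: ${dropped.join(', ')}`);
  return kept;
}
```

If you want typos to fail loudly instead, compare the IDs you pass against agent.listSkills() before calling build().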

How skill content is injected by mode

| Build mode | What gets injected |
| --- | --- |
| create | Full skill instructions with build recipes. The LLM follows them to construct a new collection from scratch. |
| update / refine | Summaries only, with an explicit "don't rebuild the collection to match the skill" warning. The collection already exists on-chain; skills are reference context, not a blueprint. |

If you're hitting the agent for an update and seeing over-aggressive rewriting, this is the dial to turn: drop skills from the per-build call entirely and the agent falls back to generic DOMAIN_KNOWLEDGE guidance.

Smart token-type inference (auto-pick)

When the caller doesn't supply a token-type skill, the agent classifies the prompt and prepends one high-confidence pick, or builds freestyle if nothing is confidently a match. See the dedicated Smart Token-Type Detection page for the full contract.

Quick shape:

Inference is skipped entirely when selectedSkills already has a token-type entry; explicit picks win. Non-token-type skills (community, additional-context) don't block inference.
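The skip rule reduces to one check over the selected skills. A sketch under assumptions: `shouldInferTokenType` and the `categoryOf` lookup are hypothetical, and the category labels are taken from the wording above rather than from the SDK's types.

```typescript
type SkillCategory = 'token-type' | 'community' | 'additional-context';

// Inference runs only when no selected skill is a token-type skill.
function shouldInferTokenType(
  selectedSkills: string[],
  categoryOf: (id: string) => SkillCategory | undefined
): boolean {
  return !selectedSkills.some((id) => categoryOf(id) === 'token-type');
}
```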

Community skills (power-user)

promptSkillIds injects community-contributed skill docs stored on BitBadges. No discovery UI ships; callers bring their own IDs (shared via URL, Discord, internal registry). Requires a BitBadges API key.

The fetcher calls GET /api/v0/builder/community-skills?ids=... and silently returns an empty array on any failure (missing key, network error, timeout). Your build still runs; it just loses the community injection.

Local dev: when bitbadgesApiUrl points at localhost, 127.0.0.1, or *.localhost, the fetcher skips the API-key requirement. This mirrors the indexer's own relaxed auth for local development, so you can iterate against a local BitBadges indexer without a production key.
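The local-dev exemption can be expressed as a hostname check. This is a sketch of the rule as described, not the SDK's implementation; `isLocalDevUrl` and `canFetchCommunitySkills` are hypothetical names.

```typescript
// True when the API URL targets a local indexer: localhost, 127.0.0.1, or *.localhost.
function isLocalDevUrl(apiUrl: string): boolean {
  let hostname: string;
  try {
    hostname = new URL(apiUrl).hostname.toLowerCase();
  } catch {
    return false; // unparseable URL: treat as non-local
  }
  return hostname === 'localhost' || hostname === '127.0.0.1' || hostname.endsWith('.localhost');
}

// The community-skill fetch needs an API key unless the target is local.
function canFetchCommunitySkills(apiUrl: string, apiKey?: string): boolean {
  return isLocalDevUrl(apiUrl) || Boolean(apiKey);
}
```

Matching on the parsed hostname rather than the raw string avoids treating hosts such as `notlocalhost.com` as local.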

Prompt-injection guard on the system-prompt slots

systemPromptAppend (additive) and systemPrompt (full replace) both run through an injection-pattern check at agent construction. If either contains obvious "ignore all previous instructions" / "you are now a…" style payloads, the constructor throws a BitBadgesBuilderAgentError with code INVALID_SYSTEM_PROMPT_APPEND or INVALID_SYSTEM_PROMPT. Hosted/server deployments that accept end-user input into these slots should still run their own containsInjection check at the trust boundary: the SDK's check is defense in depth, not a replacement.
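A trust-boundary check of the kind recommended above might look like this. The pattern list is illustrative only and is not the SDK's list; `looksLikeInjection` is a hypothetical helper, and a production filter would need to be broader.

```typescript
// Illustrative patterns only; real deployments should maintain a broader list.
const INJECTION_PATTERNS: RegExp[] = [
  /ignore\s+(all\s+)?(previous|prior)\s+instructions/i,
  /you\s+are\s+now\s+(a|an)\b/i,
  /disregard\s+(the\s+)?system\s+prompt/i
];

// Returns true when the text matches any known injection phrasing.
function looksLikeInjection(text: string): boolean {
  return INJECTION_PATTERNS.some((pattern) => pattern.test(text));
}
```

Run a check like this on end-user input before it reaches systemPromptAppend or systemPrompt, and reject the request rather than trying to sanitize it.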

Custom tools

Add tools on top of the builtins, or filter them out:

Session stores

Conversation messages + token counters are persisted so refinement works across HTTP requests.

Pass the same sessionId across .build() calls to continue a session (e.g., for refinement).

Result shape

Errors dispatch on instanceof:
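A minimal sketch of that dispatch follows. The two classes are local stand-ins so the example runs on its own; real code would use the SDK's exported classes instead. The `errors` and `code` fields follow the names used on this page, while `describeFailure` is hypothetical and no inheritance relationship between the two classes is assumed.

```typescript
// Local stand-ins for the SDK's error classes (self-contained sketch).
class ValidationFailedError extends Error {
  errors: string[];
  constructor(errors: string[]) {
    super('Validation failed');
    Object.setPrototypeOf(this, new.target.prototype); // keeps instanceof working on older compile targets
    this.errors = errors;
  }
}

class BitBadgesBuilderAgentError extends Error {
  code: string;
  constructor(message: string, code: string) {
    super(message);
    Object.setPrototypeOf(this, new.target.prototype);
    this.code = code;
  }
}

// Most specific classes are checked first, with a generic fallback last.
function describeFailure(err: unknown): string {
  if (err instanceof ValidationFailedError) return `validation: ${err.errors.join('; ')}`;
  if (err instanceof BitBadgesBuilderAgentError) return `agent error: ${err.code}`;
  if (err instanceof Error) return `unexpected: ${err.message}`;
  return 'unexpected: non-Error value thrown';
}
```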

Image placeholders

The agent has two image-handling modes. Pick whichever fits your pipeline.

1. Real URLs in the prompt (simplest)

If you already have hosted images, just tell the agent the URLs in the prompt. The LLM emits them verbatim into metadataPlaceholders entries.

No post-processing needed. Downside: the LLM has to faithfully copy the URLs, so keep them short and well-formed.

2. Placeholders + post-build substitution (for dynamically-uploaded images)

When the user is still choosing/uploading images at build time, use symbolic placeholders and swap them in after the build:

The LLM wires IMAGE_N tokens into the metadata; you resolve them at the end. Matches the hosted frontend's flow exactly.
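The substitution step can be sketched as a recursive walk over the transaction JSON. This assumes placeholders are whole string values of the form IMAGE_1, IMAGE_2, and so on; `substituteImages` is a hypothetical helper, not an SDK function.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Assumption: a placeholder is an entire string value such as "IMAGE_1".
const PLACEHOLDER = /^IMAGE_\d+$/;

// Returns a copy of `value` with every resolvable placeholder replaced by its URL.
function substituteImages(value: Json, urls: Record<string, string>): Json {
  if (typeof value === 'string') {
    if (!PLACEHOLDER.test(value)) return value;
    const url = urls[value];
    return url === undefined ? value : url; // unresolved placeholders are left in place
  }
  if (Array.isArray(value)) return value.map((item) => substituteImages(item, urls));
  if (value !== null && typeof value === 'object') {
    const out: { [key: string]: Json } = {};
    for (const [key, item] of Object.entries(value)) out[key] = substituteImages(item, urls);
    return out;
  }
  return value; // numbers, booleans, null
}
```

Returning a copy keeps the agent's original output intact, which is convenient if an upload fails and you need to retry the substitution.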

Detecting stragglers

agent.collectImageReferences(tx) returns every IMAGE_N token still in the transaction. Useful as a pre-broadcast sanity check: anything that comes back from this call is a placeholder that never got a real value and will land on-chain as-is.
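If you need the same check outside the agent (for example in a separate broadcast service), a standalone scan is easy to write. This sketch is not the implementation of agent.collectImageReferences; `findImagePlaceholders` is hypothetical and assumes tokens of the form IMAGE_1, IMAGE_2, and so on.

```typescript
// Scans the serialized transaction for leftover IMAGE_<number> tokens.
function findImagePlaceholders(tx: unknown): string[] {
  const text = JSON.stringify(tx) ?? '';
  const matches = text.match(/\bIMAGE_\d+\b/g) ?? [];
  return Array.from(new Set(matches)).sort(); // de-duplicated, stable order
}
```

Refuse to broadcast when the returned array is non-empty.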

Health check

Validate without building

Export as a single prompt for no-tools LLMs

When you want to hand the build to Claude.ai, ChatGPT, or Gemini (no tools available there), use agent.exportPrompt() to assemble the no-tools variant of the system prompt concatenated with the user message. The LLM emits the final transaction JSON directly.

No LLM API call is made. No validation, no simulation, no fix loop: this is a pure prompt-assembly helper. It is a best-effort path; the SDK-agent build() flow remains the quality-gated path.

Abort

Cancellation + streaming

  • Cancellation: supported via abortSignal or agent.abort().

  • Streaming: not in v1. The agent returns when the build completes or throws; use onTokenUsage / onToolCall hooks for live progress.
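Cancellation through a signal is cooperative: the caller owns an AbortController and the loop checks the signal between rounds. The sketch below uses a hypothetical `runRounds` stand-in for a multi-round loop; it illustrates the pattern only and does not describe how the agent itself reacts to an abort.

```typescript
// Hypothetical stand-in for a multi-round loop that honors an AbortSignal.
function runRounds(
  maxRounds: number,
  signal: AbortSignal,
  onRound: (round: number) => void
): number {
  let completed = 0;
  for (let round = 1; round <= maxRounds; round++) {
    if (signal.aborted) break; // stop before starting another round
    onRound(round);
    completed = round;
  }
  return completed;
}

const controller = new AbortController();
const completedRounds = runRounds(5, controller.signal, (round) => {
  if (round === 2) controller.abort(); // e.g. the user clicked "cancel" during round 2
});
// Round 2 finishes, round 3 never starts.
```

In a server handler, the controller would typically be aborted from the request's close event or a timeout.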

Prompt caching (automatic)

The agent uses Anthropic's prompt caching on the stable prefix (system prompt, tool schemas, and inlined skill instructions), so subsequent builds inside a 5-minute window read those tokens from cache at ~10% of the regular input-token cost. Cache-creation tokens cost ~1.25x regular input on the miss; one hit afterwards pays the miss back and every hit after that is profit.

Caching is on by default. There's nothing to configure. Skill ordering is canonicalized (alphabetical) so ['nft', 'subscription'] and ['subscription', 'nft'] hit the same cache key.
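Canonicalization of this kind is a sort before the key is built. A minimal sketch: `canonicalSkillKey` is hypothetical, and de-duplicating the IDs is an added assumption.

```typescript
// Order-insensitive (and, by assumption, duplicate-insensitive) key for a skill set.
function canonicalSkillKey(skills: string[]): string {
  return Array.from(new Set(skills)).sort().join(',');
}
```

Both `canonicalSkillKey(['nft', 'subscription'])` and `canonicalSkillKey(['subscription', 'nft'])` produce the same string, so either ordering maps to the same cached prefix.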

When caching actually pays off

The stable prefix (system prompt + tool schemas + inlined skills) is typically 10–15% of the per-build token count; the rest is dynamic user context and the tool-calling round trips. Numbers to keep in mind:

  • First build with a new skill set: cache miss. You pay 1.25x on the prefix tokens and nothing comes back. Net: slightly more expensive than no-cache, by a few cents at most for a typical build.

  • Second build within 5 minutes with the same skill set: cache hit. The prefix tokens now cost 10% of full rate. This is roughly where you break even against the miss.

  • Steady-state (several builds / hour with overlapping skill sets): cache-read tokens dominate the input count on result.trace. Real-world savings are 40–60% on the total input-token bill, not 90% (because the prefix is only part of the request).

For one-off scripts that run a single build and exit, you pay the 1.25x write premium with no recovery. The absolute cost delta is small (cents), so leave caching on, but don't count on it as a headline optimization for low-volume use.
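The break-even arithmetic above can be checked with a few lines. This sketch uses the 1.25x write and 0.1x read multipliers quoted on this page and, like the text, treats each build as one read of the prefix; the function names and the 10,000-token prefix are hypothetical.

```typescript
const WRITE_MULTIPLIER = 1.25; // cache-creation premium on the miss
const READ_MULTIPLIER = 0.1; // cache-read rate on each hit

// Prefix cost, in input-token equivalents, for consecutive builds inside the TTL.
function prefixCostWithCache(prefixTokens: number, builds: number): number {
  if (builds <= 0) return 0;
  return prefixTokens * (WRITE_MULTIPLIER + READ_MULTIPLIER * (builds - 1));
}

function prefixCostWithoutCache(prefixTokens: number, builds: number): number {
  return prefixTokens * Math.max(builds, 0);
}

// One build: the cached path costs more (write premium, no recovery).
const singleBuildPenalty = prefixCostWithCache(10_000, 1) - prefixCostWithoutCache(10_000, 1);
// Two builds: the cached path is already cheaper.
const twoBuildSaving = prefixCostWithoutCache(10_000, 2) - prefixCostWithCache(10_000, 2);
```

For a 10,000-token prefix this works out to a penalty of about 2,500 token-equivalents on a single build and a saving of about 6,500 over two builds.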

Observability

The onTokenUsage hook reports cache counters per round:

result.trace.cacheReadTokens and result.trace.cacheCreationTokens also carry cumulative counts for the whole build. A healthy steady-state setup has cacheReadTokens >> inputTokens.
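One way to turn those counters into a single health number is the cache-read share of all input-side tokens. A sketch, assuming `inputTokens` excludes cached tokens (as in Anthropic's usage reporting); `cacheReadShare` is a hypothetical helper and `TraceCounters` models only the three fields named on this page.

```typescript
interface TraceCounters {
  inputTokens: number; // assumed to exclude cached tokens
  cacheReadTokens: number;
  cacheCreationTokens: number;
}

// Share of all input-side tokens served from cache (0 when nothing was sent).
function cacheReadShare(trace: TraceCounters): number {
  const total = trace.inputTokens + trace.cacheReadTokens + trace.cacheCreationTokens;
  return total === 0 ? 0 : trace.cacheReadTokens / total;
}
```

A share near zero across consecutive builds usually means the prefix is changing between builds; the list of invalidation causes is the place to look.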

What invalidates the cache

  • 5-minute TTL since the last hit.

  • Any change to the system prompt (e.g. systemPromptAppend edit).

  • Any change to the tool set (tools.add / tools.remove).

  • Any change to the canonical skill set.

The per-request tail (request header, metadata, prompt text, refinement history) is never cached; it's expected to vary.

/internals: unstable primitives

For users who want to run their own loop (different LLM, custom strategy, fine-tuning data collection):

Not covered by semver: anything here may be renamed or removed in a minor release. Use bitbadges/builder/agent (the stable path) whenever possible.

Examples

Full runnable scripts at bitbadgesjs/packages/bitbadgesjs-sdk/examples/builder-agent/:

  • zero-config.ts: the 5-line sample

  • middle-tier.ts: hooks, skills, file store, typed errors

  • diy-internals.ts: OpenAI-via-/internals wiring (unsupported)

Troubleshooting

  • "PeerDependencyError: @anthropic-ai/sdk is required": run npm install @anthropic-ai/sdk.

  • "Anthropic credentials are required": set ANTHROPIC_API_KEY or pass anthropicKey/anthropicAuthToken to the constructor.

  • "ValidationFailedError after 3 fix rounds": the validation fix loop gave up. Inspect err.errors for structured causes and err.advisoryNotes for design concerns the agent considered but didn't resolve. Raising fixLoopMaxRounds rarely helps; usually the prompt needs more constraints.

  • Simulation says "jsonToTxBytes error": this is an encode-time advisory, not a chain failure. The transaction is typically still broadcast-safe.
