Concepts

The managed AI gateway


By default, the assistant routes voice AI through Extentos's managed gateway on Extentos's keys — content is relayed but never stored, usage is metered, and BYOK is the opt-out. Handler-code AI you call yourself never touches the gateway at all.

There are two distinct ways AI shows up in an Extentos app, and the privacy and billing story is different for each. Getting them straight is the whole point of this page.

| | Managed-gateway AI (default) | Handler-code BYOK (your own calls) |
|---|---|---|
| What it is | The Phase 4 assistant runtime (glasses.assistant.*): real-time voice AI | Any AI call you make from your own handler (Anthropic, a vision model, a translation API, …) |
| Whose key | Extentos's provider key, by default | Yours, wired into your handler code |
| Is Extentos in the path? | Yes: it relays the call (but never stores the content) | No: your app talks to your provider directly |
| Metered by Extentos? | Yes: token usage and cost recorded per project | No: Extentos never sees it |

The rest of this page is about the managed gateway (the left column). For the right column — calling your own provider from a handler — see credentials; Extentos is not involved in those calls at all.

What the managed gateway is

When your app uses the Phase 4 assistant runtime (glasses.assistant.start(...)), your app never holds an AI provider API key. Instead, the SDK opens a WebSocket to Extentos's gateway (wss://api.extentos.com/v1/realtime), and the gateway relays the session to the AI provider on Extentos's key. The default provider is OpenAI Realtime; the default model is gpt-realtime-2.

This is deliberate: it means a developer can build and ship a voice-AI glasses app with zero AI configuration — no key to obtain, store, rotate, or leak. The assistant just works.

// No API key anywhere in your code — the gateway supplies it.
val session = glasses.assistant.start(AssistantProvider.OpenAi()) {
    instructions = "You are a running companion…"
    tool("get_pace", "The runner's current pace") { ToolResult.Ok(pace()) }
}

Heads-up: the assistant runtime ships in the 1.4.0-phase4-dogfood preview snapshot (via mavenLocal()), not yet on Maven Central. See assistant and SDK install for the snapshot setup.

What the gateway does and doesn't do with your data

The gateway is a transient relay, not a data store:

  • Audio frames, transcripts, and model responses are forwarded verbatim in both directions and never persisted. Binary audio is never even read.
  • The only thing recorded is the provider's usage object at the end of each response: token counts (input/output, text/audio/cached splits), the model, and the computed cost_usd. No conversation content is stored.
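
As a rough illustration of the relay-versus-store split above, here is a minimal Kotlin sketch. Frame, AudioFrame, EventFrame, Usage, and Relay are hypothetical names invented for this example, not Extentos types, and the usage field names are assumptions. Only the categories of data (token splits, model, cost) come from the text.

```kotlin
// Sketch only: every type here is hypothetical, not part of the Extentos SDK or gateway.
sealed interface Frame

// Binary audio: forwarded as opaque bytes.
class AudioFrame(val bytes: ByteArray) : Frame

// A text event. `usage` is assumed to be non-null only on the provider's
// end-of-response event.
data class EventFrame(val json: String, val usage: Usage? = null) : Frame

// The only data kept per response: token counts, model, and computed cost.
data class Usage(
    val model: String,
    val inputTextTokens: Long,
    val inputAudioTokens: Long,
    val cachedInputTokens: Long,
    val outputTextTokens: Long,
    val outputAudioTokens: Long,
    val costUsd: Double,
)

class Relay(private val forward: (Frame) -> Unit) {
    val recorded = mutableListOf<Usage>()

    fun onFrame(frame: Frame) {
        forward(frame) // every frame passes through unchanged
        // Audio bytes are never inspected; only usage metadata is kept.
        (frame as? EventFrame)?.usage?.let { recorded.add(it) }
    }
}

fun main() {
    val sent = mutableListOf<Frame>()
    val relay = Relay { sent.add(it) }
    relay.onFrame(AudioFrame(byteArrayOf(1, 2, 3)))
    relay.onFrame(EventFrame("{}", Usage("example-model", 10, 20, 0, 5, 15, 0.001)))
    println("forwarded=${sent.size} recorded=${relay.recorded.size}")
}
```

In this sketch the conversation content only ever reaches `forward`; the `recorded` list holds usage metadata and nothing else.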

So the precise framing is: for the managed gateway, Extentos is in the path but never inspects or stores the content — it reads the usage object and forwards everything else. (The older docs claimed Extentos is "never in the path / never proxies AI." That was only ever true for handler-code BYOK — the right column above — and it's the framing this page corrects.) See security for the full data-handling commitments and the content/PII boundary.

BYOK — bringing your own key (the opt-out)

If you'd rather the AI run on your provider account, BYOK is the opt-out — and it's configured in the dashboard, not in code:

  1. Upload your OpenAI key in the project's Credentials section. It's stored encrypted, scoped to your project.
  2. The gateway swaps your key in server-side at call time. Your app still never holds the key.
  3. The call now bills to your OpenAI account. Extentos still relays it (and still records token-usage metadata), but the spend isn't on Extentos's key.

The SDK surface is identical either way — your handler code doesn't change when you switch between managed and BYOK. (Note: BYOK currently disables the persistent memory profile, because that profile lives on the Extentos backend behind the gateway.)
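
The server-side key swap can be pictured roughly as follows. This is a sketch under stated assumptions: resolveKey, ResolvedKey, and BilledTo are hypothetical names, and the real gateway logic is not documented here. It only mirrors the behavior described above: an uploaded key takes over billing and turns off the persistent memory profile.

```kotlin
// Sketch only: hypothetical names, not the gateway's actual implementation.
enum class BilledTo { EXTENTOS, DEVELOPER }

data class ResolvedKey(
    val apiKey: String,
    val billedTo: BilledTo,
    val memoryProfileEnabled: Boolean,
)

// If the project has uploaded a BYOK key, use it; otherwise fall back to the managed key.
fun resolveKey(byokKey: String?, managedKey: String): ResolvedKey =
    if (byokKey != null) {
        // BYOK: spend lands on the developer's provider account, memory profile is off.
        ResolvedKey(byokKey, BilledTo.DEVELOPER, memoryProfileEnabled = false)
    } else {
        ResolvedKey(managedKey, BilledTo.EXTENTOS, memoryProfileEnabled = true)
    }

fun main() {
    println(resolveKey(byokKey = null, managedKey = "managed-key").billedTo)
    println(resolveKey(byokKey = "dev-key", managedKey = "managed-key").billedTo)
}
```

Because the choice happens on the server in this model, the client code is the same in both cases, which matches the identical SDK surface described above.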

Identity: how the gateway knows who's calling

Every gateway connection presents an attestation JWT (Authorization: Bearer). In production that token comes from platform attestation — Play Integrity on Android, App Attest on iOS — carrying the project id, an anonymous device id, environment, and platform. In the browser simulator the backend mints a session-scoped token, so simulator sessions reach the gateway without device attestation.

A sideloaded development build can't attest, so it can't use Extentos's managed key — it falls back to BYOK (or the simulator's session token). This keeps the managed key usable only from builds Extentos can account for.
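
One way to picture the routing rule above is the Kotlin sketch below. The claim fields follow the list in this section, but the TokenSource field, the KeyRoute enum, and routeFor are hypothetical. In particular, rejecting an unattested build that has no BYOK key is an assumption of this example, not documented behavior.

```kotlin
// Sketch only: hypothetical model of the identity check, not the gateway's real code.
enum class TokenSource { PLAY_INTEGRITY, APP_ATTEST, SIMULATOR_SESSION, UNATTESTED }
enum class KeyRoute { MANAGED, BYOK, REJECTED }

data class Claims(
    val projectId: String,
    val deviceId: String,    // anonymous device id
    val environment: String,
    val platform: String,
    val source: TokenSource, // hypothetical: how the token was minted
)

fun routeFor(claims: Claims, hasByokKey: Boolean): KeyRoute = when {
    // An uploaded BYOK key is used regardless of how the build identified itself.
    hasByokKey -> KeyRoute.BYOK
    // Assumption: with no attestation and no BYOK key there is nothing to fall back to.
    claims.source == TokenSource.UNATTESTED -> KeyRoute.REJECTED
    // Attested production builds and simulator sessions may use the managed key.
    else -> KeyRoute.MANAGED
}

fun main() {
    val sideloaded = Claims("proj", "dev-1", "development", "android", TokenSource.UNATTESTED)
    println(routeFor(sideloaded, hasByokKey = true))
    println(routeFor(sideloaded, hasByokKey = false))
}
```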

Metering and billing

Metering is live. Every managed-gateway response is priced at the provider's list price (no markup) and recorded against your project. You can see usage per project in the dashboard.
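
To make "list price, no markup" concrete, here is a small Kotlin sketch of the arithmetic. ListPrice and costUsd are hypothetical names, and the example collapses the text/audio/cached splits into just input and output tokens. The prices in main are placeholders, not real provider rates.

```kotlin
import java.math.BigDecimal

// Sketch only: a simplified two-rate price list, quoted per million tokens.
data class ListPrice(
    val inputPerMillionUsd: BigDecimal,
    val outputPerMillionUsd: BigDecimal,
)

// Cost is tokens times the provider's list rate, with no markup factor applied.
fun costUsd(inputTokens: Long, outputTokens: Long, price: ListPrice): BigDecimal {
    val million = BigDecimal(1_000_000)
    val input = BigDecimal(inputTokens).multiply(price.inputPerMillionUsd)
    val output = BigDecimal(outputTokens).multiply(price.outputPerMillionUsd)
    // Dividing by a power of ten always terminates, so exact division is safe here.
    return input.add(output).divide(million)
}

fun main() {
    // Placeholder rates for illustration only.
    val price = ListPrice(BigDecimal("4.00"), BigDecimal("16.00"))
    println(costUsd(inputTokens = 1_000, outputTokens = 500, price = price))
}
```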

Billing is planned, not yet charging. The model: a prepaid credit balance with a free allowance for every account, after which managed-gateway usage draws down credits (top up, or switch to BYOK to move the spend to your own provider account). This runs on a dedicated Extentos billing account, separate from everything else.

What this means today: managed-gateway usage is measured and shown, but you are not charged — there's a provisional safety cap, not a paywall. When credit billing goes live it will be announced in the changelog, and the free-forever surfaces (MCP tools, on-device simulation, real-hardware testing) stay free regardless. See pricing for the full money story.
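
The planned credit model can be sketched as a simple draw-down: the free allowance is consumed first, then prepaid credits. Since billing is not live, everything here is illustrative. Account, Charge, and drawDown are hypothetical names, and what happens to any uncovered remainder is deliberately left open.

```kotlin
import java.math.BigDecimal

// Sketch only: a hypothetical model of the planned credit balance.
data class Account(val freeAllowanceUsd: BigDecimal, val creditsUsd: BigDecimal)

// The account after a charge, plus any cost neither allowance nor credits could cover.
data class Charge(val account: Account, val uncoveredUsd: BigDecimal)

fun drawDown(account: Account, costUsd: BigDecimal): Charge {
    // Spend the free allowance first.
    val fromAllowance = costUsd.min(account.freeAllowanceUsd)
    val afterAllowance = costUsd - fromAllowance
    // Then draw the rest from prepaid credits, as far as they go.
    val fromCredits = afterAllowance.min(account.creditsUsd)
    return Charge(
        Account(account.freeAllowanceUsd - fromAllowance, account.creditsUsd - fromCredits),
        uncoveredUsd = afterAllowance - fromCredits,
    )
}

fun main() {
    val start = Account(BigDecimal("5.00"), BigDecimal("10.00"))
    println(drawDown(start, BigDecimal("7.00")))
}
```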

When to use which

  • Default (managed gateway): you want voice AI with zero configuration, and you're fine running on Extentos's provider key with usage metered and, once credit billing launches, drawn from a credit balance. This is the path most apps take.
  • BYOK: you already have an OpenAI account you want the spend and rate limits on, you need a specific model/tier the gateway doesn't expose, or your compliance posture requires the provider relationship be yours. Upload the key in the dashboard; nothing else changes.
  • Handler-code BYOK (a different thing): you're calling a non-assistant AI — a vision model on a captured photo, a translation API — from your own handler. That never touches the gateway; wire the key yourself per credentials.

Related

  • Assistant runtime: the glasses.assistant.* API the gateway powers
  • Pricing: free surfaces, metered gateway, the coming credit model
  • Security: the data-handling boundary and what's collected
  • Credentials: Meta DAT setup and BYOK provider keys