The managed AI gateway

The assistant routes voice AI through Extentos's managed gateway on Extentos's keys: content is relayed but never stored, and usage is metered. The assistant always runs on the managed gateway; there is no bring-your-own-key option. Handler-code AI you call yourself never touches the gateway at all.

There are two distinct ways AI shows up in an Extentos app, and the privacy and billing story is different for each. Getting them straight is the whole point of this page.

| | Managed-gateway AI (the assistant) | Your own AI calls (from your handler code) |
| --- | --- | --- |
| What it is | The Phase 4 assistant runtime (`glasses.assistant.*`): real-time voice AI | Any AI call you make from your own handler (Anthropic, a vision model, a translation API, …) |
| Whose key | Extentos's provider key, always | Yours, wired into your handler code |
| Is Extentos in the path? | Yes: it relays the call (but never stores the content) | No: your app talks to your provider directly |
| Metered by Extentos? | Yes: usage and cost recorded per project | No: Extentos never sees it |

The rest of this page is about the managed gateway (the left column). For the right column — calling your own provider from a handler — see credentials; Extentos is not involved in those calls at all.

What the managed gateway is

When your app uses the Phase 4 assistant runtime (glasses.assistant.start(...)), your app holds no API key for the voice-AI traffic. Instead, the SDK opens a WebSocket to Extentos's gateway (wss://api.extentos.com/v1/realtime), and the gateway relays the session to the AI provider on Extentos's key.

The default provider is OpenAI Realtime (default model gpt-realtime-2). The gateway also routes xAI Grok and Google Gemini Live voice models: switch to one in the dashboard's Agent settings and the gateway relays to that vendor on Extentos's key, with no change to your app. Gemini 3.1 Flash Live additionally accepts streaming video input (sendVideoFrame), which makes it the only realtime lineup entry that can see what the glasses see. Each realtime model brings its own voices; they aren't interchangeable across vendors, so the dashboard scopes the voice picker to the model you pick.

This is deliberate: it means a developer can build and ship a voice-AI glasses app with zero AI configuration — no key to obtain, store, rotate, or leak. The assistant just works.

// No API key anywhere in your code — the gateway supplies it.
val session = glasses.assistant.start(AssistantProvider.Managed()) {
    instructions = "You are a running companion…"
    tool("get_pace", "The runner's current pace") { ToolResult.Ok(pace()) }
}

Heads-up: the assistant runtime has shipped in com.extentos:glasses since 1.4.0 (install the current release — see below) and in the iOS SDK's GlassesCore product from github.com/extentos/swift-glasses. See assistant, Android install, and iOS install.

Compaction-model providers

The gateway also proxies the assistant's memory-compaction model — the chat model that summarizes older turns when a conversation outgrows the context window (chosen per project in the dashboard's Agent section). Alongside OpenAI you can pick Google Gemini or Anthropic Claude compaction models: the gateway routes each to its provider on Extentos's managed keys (Gemini via Google's OpenAI-compatible endpoint, Claude via Anthropic's Messages API) and meters the usage exactly like any other gateway call, at the provider's list price.
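The routing described here can be pictured as a small lookup table. This is only an illustrative sketch: `CompactionVendor`, `CompactionRoute`, and `routeFor` are hypothetical names, not part of the Extentos SDK or the gateway's real code, and the OpenAI entry's wire format is an assumption (the text only names the Gemini and Claude paths).

```kotlin
// Hypothetical model of compaction routing; none of these names are SDK API.
enum class CompactionVendor { OPENAI, GOOGLE_GEMINI, ANTHROPIC_CLAUDE }

data class CompactionRoute(
    val vendor: CompactionVendor,
    val upstreamApi: String,            // which API shape the gateway speaks upstream
    val keyOwner: String = "Extentos",  // always the managed key, never the app's
)

fun routeFor(vendor: CompactionVendor): CompactionRoute = when (vendor) {
    // Assumption: OpenAI models go to OpenAI's own API.
    CompactionVendor.OPENAI -> CompactionRoute(vendor, "OpenAI API")
    // Stated in the text: Gemini is reached via Google's OpenAI-compatible endpoint.
    CompactionVendor.GOOGLE_GEMINI -> CompactionRoute(vendor, "Google OpenAI-compatible endpoint")
    // Stated in the text: Claude is reached via Anthropic's Messages API.
    CompactionVendor.ANTHROPIC_CLAUDE -> CompactionRoute(vendor, "Anthropic Messages API")
}
```

Whichever vendor is picked, the key owner stays the same, which is why switching compaction models needs no change in the app.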

What the gateway does and doesn't do with your data

The gateway is a transient relay, not a data store:

Audio frames, transcripts, and model responses are forwarded verbatim in both directions and never persisted. Binary audio is never even read.
The only thing recorded is the provider's usage object at the end of each response: token counts (input/output, text/audio/cached splits), the model, and the computed cost_usd — or, for per-minute-billed voice models (xAI Grok voice), the billed session seconds. No conversation content is stored.

So the precise framing is: for the managed gateway, Extentos is in the path but never inspects or stores the content — it reads the usage object and forwards everything else. (The older docs claimed Extentos is "never in the path / never proxies AI." That was only ever true for handler-code BYOK — the right column above — and it's the framing this page corrects.) See security for the full data-handling commitments and the content/PII boundary.
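To make "in the path but not storing content" concrete, here is a minimal sketch of a relay that forwards every frame untouched and keeps only usage records. All names (`Frame`, `Relay`, `UsageRecord`, `extractUsage`) are hypothetical and chosen for illustration; the gateway's actual implementation is not documented on this page.

```kotlin
// Hypothetical types: a sketch of "forward everything, record only usage".
sealed interface Frame
class BinaryFrame(val bytes: ByteArray) : Frame   // audio: forwarded, never read
data class TextFrame(val json: String) : Frame    // events, transcripts, responses

// Only these fields are kept; no conversation content.
data class UsageRecord(
    val model: String,
    val inputTokens: Long,
    val outputTokens: Long,
    val costUsd: Double,
)

class Relay(
    private val forward: (Frame) -> Unit,
    // Returns a record only when the frame carries a provider usage object.
    private val extractUsage: (String) -> UsageRecord?,
) {
    val recorded = mutableListOf<UsageRecord>()

    fun onProviderFrame(frame: Frame) {
        // Text frames are checked for a usage object; binary audio is skipped.
        if (frame is TextFrame) {
            extractUsage(frame.json)?.let { recorded += it }
        }
        // Every frame is passed on verbatim, whether or not usage was found.
        forward(frame)
    }
}
```

The point of the shape is that `recorded` is the only state the relay accumulates: frames go straight through `forward` and nothing else holds on to them.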

There is no bring-your-own-key option

The assistant's voice AI always runs on Extentos's provider key, metered to your account. You cannot supply your own key, and this is not a roadmap item — it is the design. Holding one provider relationship is what lets the gateway meter uniformly, add models without per-customer key management, and keep the SDK free of any credential.

If the provider relationship needs to be yours — an existing commitment, a specific tier, a compliance posture — talk to us about whether Extentos is the right fit, rather than waiting for a feature that isn't coming.

Calling a non-assistant AI from your own handler code (a vision model over a captured photo, a translation API) is a separate thing and always works — that is your app making its own request with its own key, which Extentos neither routes nor meters.

Identity: how the gateway knows who's calling

Every gateway connection presents an attestation JWT (Authorization: Bearer). In production that token comes from platform attestation — Play Integrity on Android, App Attest on iOS — carrying the project id, an anonymous device id, environment, and platform. In the browser simulator the backend mints a session-scoped token, so simulator sessions reach the gateway without device attestation.
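When debugging a connection, it can help to look at what a token carries. A JWT's payload is base64url-encoded JSON, so it can be read without any library. The sketch below only decodes the payload and does not verify the signature; the function names are hypothetical, and the claim names in the comment are illustrative guesses, since the real token layout is not documented here.

```kotlin
import java.util.Base64

// The gateway expects the token in a standard Bearer header.
fun bearerHeader(jwt: String): String = "Bearer $jwt"

// Decode the middle (payload) segment of a JWT for inspection only.
// This does NOT validate the signature and must not be used for trust decisions.
fun payloadJson(jwt: String): String {
    val segments = jwt.split(".")
    require(segments.size == 3) { "expected header.payload.signature" }
    // java.util.Base64's URL decoder accepts input with or without padding.
    return String(Base64.getUrlDecoder().decode(segments[1]), Charsets.UTF_8)
}

// A decoded payload might look like this (claim names are hypothetical):
// {"project_id":"...","device_id":"...","environment":"...","platform":"android"}
```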

A sideloaded development build can't attest, so on the production backend it can't reach the managed gateway at all. The dev backend instead accepts the project key baked into the build (the dev-tier lane: still Extentos's managed key, metered to your account), and the simulator uses its minted session token. Either way, the managed key is usable only from builds Extentos can account for.
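The three lanes can be summarized as a decision function. This is a sketch of the rules as stated above, with hypothetical enum and function names; how an attested build behaves against the dev backend is not covered by the text, so that branch is an assumption.

```kotlin
// Hypothetical names; this models the prose, not the backend's real code.
enum class Backend { PRODUCTION, DEV }
enum class Client { ATTESTED_BUILD, SIDELOADED_BUILD, BROWSER_SIMULATOR }
enum class Credential { ATTESTATION_JWT, BAKED_PROJECT_KEY, SIMULATOR_SESSION_TOKEN }

// Returns the credential a client presents, or null when it cannot reach the gateway.
fun credentialFor(client: Client, backend: Backend): Credential? = when (client) {
    // Assumption: an attested build always presents its attestation token.
    Client.ATTESTED_BUILD -> Credential.ATTESTATION_JWT
    // The simulator uses the session-scoped token the backend mints for it.
    Client.BROWSER_SIMULATOR -> Credential.SIMULATOR_SESSION_TOKEN
    Client.SIDELOADED_BUILD -> when (backend) {
        Backend.DEV -> Credential.BAKED_PROJECT_KEY  // the dev-tier lane
        Backend.PRODUCTION -> null                   // cannot attest, no access
    }
}
```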

Metering and billing

Metering and billing are live. Every managed-gateway response is priced at the provider's list price (no markup) and recorded against your project — token counts for token-billed models, or connected minutes for per-minute voice models (xAI Grok voice). You can see usage per project in the dashboard's Agent section, and combined across projects in the account Billing hub.
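As a rough illustration of the two billing shapes, the sketch below prices a token-billed response and a per-minute session. The function names are hypothetical and the rates used in any call are placeholders, not real provider prices. It also simplifies: real usage objects split tokens into text, audio, and cached buckets, and prorating per-minute sessions by the second is an assumption.

```kotlin
import java.math.BigDecimal
import java.math.RoundingMode

// Token-billed models: rates are quoted per million tokens (placeholder values only).
fun tokenCostUsd(
    inputTokens: Long,
    outputTokens: Long,
    usdPerMillionInput: BigDecimal,
    usdPerMillionOutput: BigDecimal,
): BigDecimal {
    val total = usdPerMillionInput * inputTokens.toBigDecimal() +
        usdPerMillionOutput * outputTokens.toBigDecimal()
    return total.divide(BigDecimal(1_000_000), 6, RoundingMode.HALF_UP)
}

// Per-minute voice models: assumes billed seconds are prorated against a per-minute rate.
fun sessionCostUsd(billedSeconds: Long, usdPerMinute: BigDecimal): BigDecimal =
    (usdPerMinute * billedSeconds.toBigDecimal())
        .divide(BigDecimal(60), 6, RoundingMode.HALF_UP)
```

`BigDecimal` is used instead of `Double` so that small per-response amounts add up without floating-point drift.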

It's a prepaid credit model. Every account starts with $2 of free managed-gateway credit; after that, usage draws down a prepaid balance you top up in the Billing hub (buy any amount; optional auto-reload). When the balance reaches zero, managed-gateway sessions stop until you add credits — prepaid, so there's never a surprise invoice. All managed usage counts toward the balance regardless of environment (simulator, dev, and real hardware alike).
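The drawdown behavior can be sketched as a tiny ledger. `CreditBalance` and its methods are hypothetical names for illustration, not an Extentos API. Clamping the balance at zero when a charge overshoots is an assumption, since the text only says sessions stop when the balance reaches zero; auto-reload is left out.

```kotlin
import java.math.BigDecimal

// Hypothetical ledger: starts with the $2 free credit described in the text.
class CreditBalance(initialUsd: BigDecimal = BigDecimal("2.00")) {
    var balanceUsd: BigDecimal = initialUsd
        private set

    // Managed-gateway sessions are allowed only while credit remains.
    fun canStartSession(): Boolean = balanceUsd > BigDecimal.ZERO

    // Usage from any environment (simulator, dev, hardware) draws down the same balance.
    fun charge(costUsd: BigDecimal) {
        require(costUsd >= BigDecimal.ZERO) { "cost cannot be negative" }
        // Assumption: the balance is clamped at zero rather than going negative.
        balanceUsd = (balanceUsd - costUsd).max(BigDecimal.ZERO)
    }

    // Top-ups can be any positive amount.
    fun topUp(amountUsd: BigDecimal) {
        require(amountUsd > BigDecimal.ZERO) { "top-up must be positive" }
        balanceUsd += amountUsd
    }
}
```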

The free-forever surfaces (discovery, validation, and guidance MCP tools, on-device simulation, real-hardware DAT testing) stay free regardless. See pricing for the full money story.

When to use which

Managed gateway (the assistant's AI): voice AI with zero config, running on Extentos's provider with usage metered against your credit balance. This is the path every voice app takes today.
Your own AI calls (a different thing): you're calling a non-assistant AI — a vision model on a captured photo, a translation API — from your own handler. That never touches the gateway; wire the key yourself per credentials.

Related

Assistant runtime: the glasses.assistant.* API the gateway powers
Pricing: free surfaces, metered gateway, the prepaid credit model
Security: the data-handling boundary and what's collected
Credentials: Meta DAT setup and handler-code provider keys
