The managed AI gateway
By default, the assistant routes voice AI through Extentos's managed gateway on Extentos's keys — content is relayed but never stored, usage is metered, and BYOK is the opt-out. Handler-code AI you call yourself never touches the gateway at all.
There are two distinct ways AI shows up in an Extentos app, and the privacy and billing story is different for each. Getting them straight is the whole point of this page.
| | Managed-gateway AI (default) | Handler-code BYOK (your own calls) |
|---|---|---|
| What it is | The Phase 4 assistant runtime (glasses.assistant.*) — real-time voice AI | Any AI call you make from your own handler (Anthropic, a vision model, a translation API, …) |
| Whose key | Extentos's provider key, by default | Yours, wired into your handler code |
| Is Extentos in the path? | Yes — it relays the call (but never stores the content) | No — your app talks to your provider directly |
| Metered by Extentos? | Yes — token usage + cost recorded per project | No — Extentos never sees it |
The rest of this page is about the managed gateway (the left column). For the right column — calling your own provider from a handler — see credentials; Extentos is not involved in those calls at all.
What the managed gateway is
When your app uses the Phase 4 assistant runtime (glasses.assistant.start(...)), the voice-AI traffic does not carry an API key in your app. Instead, the SDK opens a WebSocket to Extentos's gateway (wss://api.extentos.com/v1/realtime), and the gateway relays the session to the AI provider on Extentos's key. The default provider is OpenAI Realtime; the default model is gpt-realtime-2.
This is deliberate: it means a developer can build and ship a voice-AI glasses app with zero AI configuration — no key to obtain, store, rotate, or leak. The assistant just works.
    // No API key anywhere in your code — the gateway supplies it.
    val session = glasses.assistant.start(AssistantProvider.OpenAi()) {
        instructions = "You are a running companion…"
        tool("get_pace", "The runner's current pace") { ToolResult.Ok(pace()) }
    }

Heads-up: the assistant runtime ships in the 1.4.0-phase4-dogfood preview snapshot (via mavenLocal()), not yet on Maven Central. See assistant and SDK install for the snapshot setup.
What the gateway does and doesn't do with your data
The gateway is a transient relay, not a data store:
- Audio frames, transcripts, and model responses are forwarded verbatim in both directions and never persisted. Binary audio is never even read.
- The only thing recorded is the provider's usage object at the end of each response: token counts (input/output, text/audio/cached splits), the model, and the computed cost_usd. No conversation content is stored.
So the precise framing is: for the managed gateway, Extentos is in the path but never inspects or stores the content — it reads the usage object and forwards everything else. (The older docs claimed Extentos is "never in the path / never proxies AI." That was only ever true for handler-code BYOK — the right column above — and it's the framing this page corrects.) See security for the full data-handling commitments and the content/PII boundary.
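To make that boundary concrete, here is a minimal sketch of what a metered record could look like and how a cost might be computed from it. Everything in it is an assumption for illustration: the field names, the idea that cached tokens are counted separately from other input tokens, and the placeholder prices are not the gateway's actual schema or any provider's real price list. The point is structural: the record has no field that could hold audio, transcripts, or prompts.

```kotlin
// Hypothetical shape of one metered response. Field names are illustrative,
// not the gateway's real schema. There is deliberately no content field.
data class UsageRecord(
    val model: String,
    val inputTextTokens: Long,
    val inputAudioTokens: Long,
    val cachedInputTokens: Long, // assumed to be counted separately from the two above
    val outputTextTokens: Long,
    val outputAudioTokens: Long,
)

// Placeholder prices in USD per one million tokens. Invented for this example.
data class PriceList(
    val inputText: Double,
    val inputAudio: Double,
    val cachedInput: Double,
    val outputText: Double,
    val outputAudio: Double,
)

// Sums each token bucket at its own per-million price and returns the cost in USD.
fun costUsd(u: UsageRecord, p: PriceList): Double {
    val perMillion = 1_000_000.0
    return (u.inputTextTokens * p.inputText +
        u.inputAudioTokens * p.inputAudio +
        u.cachedInputTokens * p.cachedInput +
        u.outputTextTokens * p.outputText +
        u.outputAudioTokens * p.outputAudio) / perMillion
}
```

A caller would build one `UsageRecord` per response from the provider's usage object and store only that record plus the result of `costUsd`.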
BYOK — bringing your own key (the opt-out)
If you'd rather the AI run on your provider account, BYOK is the opt-out — and it's configured in the dashboard, not in code:
- Upload your OpenAI key in the project's Credentials section. It's stored encrypted, scoped to your project.
- The gateway swaps your key in server-side at call time. Your app still never holds the key.
- The call now bills to your OpenAI account. Extentos still relays it (and still records token-usage metadata), but the spend isn't on Extentos's key.
The SDK surface is identical either way — your handler code doesn't change when you switch between managed and BYOK. (Note: BYOK currently disables the persistent memory profile, because that profile lives on the Extentos backend behind the gateway.)
Identity: how the gateway knows who's calling
Every gateway connection presents an attestation JWT (Authorization: Bearer). In production that token comes from platform attestation — Play Integrity on Android, App Attest on iOS — carrying the project id, an anonymous device id, environment, and platform. In the browser simulator the backend mints a session-scoped token, so simulator sessions reach the gateway without device attestation.
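When debugging, it can help to look at what such a token carries. The sketch below decodes the payload segment of a JWT using only the JDK's Base64 decoder. It relies on the standard three-part JWT layout; the claim names in the sample token are invented for illustration and are not the names Extentos uses. It reads the claims only and does not verify the signature, so it is not a substitute for real validation.

```kotlin
import java.util.Base64

// Decodes the middle (payload) segment of a JWT into its JSON text.
// This only reads the claims; it does NOT verify the signature.
fun jwtPayloadJson(jwt: String): String {
    val parts = jwt.split(".")
    require(parts.size == 3) { "expected header.payload.signature" }
    return String(Base64.getUrlDecoder().decode(parts[1]), Charsets.UTF_8)
}

fun main() {
    // Build a fake, unsigned token. The claim names are hypothetical.
    val enc = Base64.getUrlEncoder().withoutPadding()
    val header = enc.encodeToString("""{"alg":"none"}""".toByteArray())
    val claims = """{"project_id":"demo","device_id":"anon-123","env":"prod","platform":"android"}"""
    val token = "$header.${enc.encodeToString(claims.toByteArray())}.sig"
    println(jwtPayloadJson(token))
}
```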
A sideloaded development build can't attest, so it can't use Extentos's managed key — it falls back to BYOK (or the simulator's session token). This keeps the managed key usable only from builds Extentos can account for.
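The rules in this section and the BYOK section can be summarized as a small decision function. This is a sketch of the behavior as described on this page, not gateway code: the type names are hypothetical, and two details are assumptions, namely that an uploaded BYOK key always takes precedence and that an unattested build with no BYOK key is simply refused.

```kotlin
// Illustrative types only; none of these exist in the Extentos SDK or gateway.
enum class Caller { ATTESTED_DEVICE, SIMULATOR_SESSION, UNATTESTED_BUILD }
enum class KeyChoice { MANAGED, PROJECT_BYOK, NONE }

fun chooseKey(caller: Caller, projectHasByokKey: Boolean): KeyChoice = when {
    // Assumption: an uploaded project key is swapped in whenever it exists.
    projectHasByokKey -> KeyChoice.PROJECT_BYOK
    // A build that cannot attest may not use the managed key.
    caller == Caller.UNATTESTED_BUILD -> KeyChoice.NONE
    // Attested devices and simulator sessions run on the managed key by default.
    else -> KeyChoice.MANAGED
}
```

For example, `chooseKey(Caller.UNATTESTED_BUILD, projectHasByokKey = true)` yields `PROJECT_BYOK`, which is the fallback a sideloaded build relies on.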
Metering and billing
Metering is live. Every managed-gateway response is priced at the provider's list price (no markup) and recorded against your project. You can see usage per project in the dashboard.
Billing is planned, not yet charging. The model: a prepaid credit balance with a free allowance for every account, after which managed-gateway usage draws down credits (top up, or switch to BYOK to move the spend to your own provider account). This runs on a dedicated Extentos billing account, separate from everything else.
What this means today: managed-gateway usage is measured and shown, but you are not charged — there's a provisional safety cap, not a paywall. When credit billing goes live it will be announced in the changelog, and the free-forever surfaces (MCP tools, on-device simulation, real-hardware testing) stay free regardless. See pricing for the full money story.
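As a rough illustration of the planned model (free allowance first, then prepaid credits), the sketch below applies one usage charge to an account. It is not a real Extentos API: the type, the field names, and the choice to track money in whole micro-dollars are all assumptions, and it lets the credit balance go negative only to signal that a top-up or a switch to BYOK would be needed. How the live system will behave at zero is not specified here.

```kotlin
// Hypothetical account state, in micro-dollars (1 USD = 1_000_000) to avoid float rounding.
data class Account(val freeAllowanceMicros: Long, val creditMicros: Long)

// Spends the free allowance first, then draws the remainder from prepaid credits.
fun applyUsage(a: Account, costMicros: Long): Account {
    require(costMicros >= 0) { "cost must be non-negative" }
    val fromFree = minOf(a.freeAllowanceMicros, costMicros)
    val fromCredits = costMicros - fromFree
    return Account(a.freeAllowanceMicros - fromFree, a.creditMicros - fromCredits)
}
```

Applying a 0.30 USD charge to an account with 0.50 USD of allowance and 2.00 USD of credits leaves 0.20 USD of allowance and the credits untouched; a further 0.70 USD charge uses up the allowance and takes 0.50 USD from credits.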
When to use which
- Default (managed gateway): you want voice AI with zero config, you're fine running on Extentos's provider with usage metered, and (eventually) drawing down a credit balance. This is the path most apps take.
- BYOK: you already have an OpenAI account you want the spend and rate limits on, you need a specific model/tier the gateway doesn't expose, or your compliance posture requires the provider relationship be yours. Upload the key in the dashboard; nothing else changes.
- Handler-code BYOK (a different thing): you're calling a non-assistant AI — a vision model on a captured photo, a translation API — from your own handler. That never touches the gateway; wire the key yourself per credentials.
Related
- Assistant runtime — the glasses.assistant.* API the gateway powers
- Pricing — free surfaces, metered gateway, the coming credit model
- Security — the data-handling boundary and what's collected
- Credentials — Meta DAT setup and BYOK provider keys