---
title: The managed AI gateway
description: How AI runs in an Extentos app — the assistant routes voice AI through the managed gateway by default. Content relayed, never stored; metered. BYOK opts out.
type: concept
platform: all
related:
  - /docs/concepts/assistant
  - /docs/resources/pricing
  - /docs/resources/security
  - /docs/mcp-server/credentials-checklist
  - /docs/concepts/capabilities
---

**By default, the assistant routes voice AI through Extentos's managed gateway on Extentos's keys — content is relayed but never stored, usage is metered, and BYOK is the opt-out.** Handler-code AI you call yourself never touches the gateway at all.

There are **two distinct ways AI shows up** in an Extentos app, and the privacy and billing story is different for each. Getting them straight is the whole point of this page.

| | **Managed-gateway AI** (default) | **Handler-code BYOK** (your own calls) |
|---|---|---|
| What it is | The Phase 4 assistant runtime (`glasses.assistant.*`) — real-time voice AI | Any AI call you make from your own handler (Anthropic, a vision model, a translation API, …) |
| Whose key | Extentos's provider key, by default | Yours, wired into your handler code |
| Is Extentos in the path? | **Yes** — it relays the call (but never stores the content) | **No** — your app talks to your provider directly |
| Metered by Extentos? | Yes — token usage + cost recorded per project | No — Extentos never sees it |

The rest of this page is about the **managed gateway** (the left column). For the right column — calling your own provider from a handler — see [credentials](/docs/mcp-server/credentials-checklist); Extentos is not involved in those calls at all.

## What the managed gateway is

When your app uses the [Phase 4 assistant runtime](/docs/concepts/assistant) (`glasses.assistant.start(...)`), your app does **not** hold or send a provider API key. Instead, the SDK opens a WebSocket to Extentos's gateway (`wss://api.extentos.com/v1/realtime`), and the gateway relays the session to the AI provider on **Extentos's** key. The default provider is OpenAI Realtime; the default model is `gpt-realtime-2`.

This is deliberate: it means a developer can build and ship a voice-AI glasses app with **zero AI configuration** — no key to obtain, store, rotate, or leak. The assistant just works.

```kotlin
// No API key anywhere in your code — the gateway supplies it.
val session = glasses.assistant.start(AssistantProvider.OpenAi()) {
    instructions = "You are a running companion…"
    tool("get_pace", "The runner's current pace") { ToolResult.Ok(pace()) }
}
```

> **Heads-up:** the assistant runtime ships in the `1.4.0-phase4-dogfood` preview snapshot (via `mavenLocal()`), not yet on Maven Central. See [assistant](/docs/concepts/assistant) and [SDK install](/docs/sdk/android/install) for the snapshot setup.

## What the gateway does and doesn't do with your data

The gateway is a **transient relay**, not a data store:

- Audio frames, transcripts, and model responses are **forwarded verbatim in both directions and never persisted.** Binary audio is never even read.
- The **only** thing recorded is the provider's `usage` object at the end of each response: token counts (input/output, text/audio/cached splits), the model, and the computed `cost_usd`. **No conversation content is stored.**
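
As a rough illustration of the two points above, here is a minimal sketch of a relay that forwards every frame unchanged and keeps only usage metadata. All of the type names (`Frame`, `Relay`, `UsageRecord`, `Price`) and the prices are hypothetical, and the usage object is assumed to arrive already parsed; the real gateway's internals are not described on this page.

```kotlin
// Hypothetical frame model; the real gateway's types are not documented here.
sealed interface Frame
class AudioFrame(val bytes: ByteArray) : Frame
data class TextFrame(val json: String, val usage: Usage? = null) : Frame

// Token counts as reported by the provider at the end of a response.
data class Usage(val model: String, val inputTokens: Long, val outputTokens: Long)

// Illustrative per-million-token prices, not real provider pricing.
data class Price(val inputPerMillionUsd: Double, val outputPerMillionUsd: Double)

// The only thing persisted: model, counts, and computed cost. No content.
data class UsageRecord(
    val model: String,
    val inputTokens: Long,
    val outputTokens: Long,
    val costUsd: Double,
)

fun costUsd(usage: Usage, price: Price): Double =
    usage.inputTokens * price.inputPerMillionUsd / 1_000_000 +
        usage.outputTokens * price.outputPerMillionUsd / 1_000_000

class Relay(private val price: Price) {
    val forwarded = mutableListOf<Frame>()      // stands in for the outbound socket
    val records = mutableListOf<UsageRecord>()  // stands in for the metering store

    fun onFrame(frame: Frame) {
        forwarded += frame // every frame is passed through unchanged
        val usage = (frame as? TextFrame)?.usage
        if (usage != null) {
            records += UsageRecord(
                usage.model, usage.inputTokens, usage.outputTokens, costUsd(usage, price),
            )
        }
    }
}

fun main() {
    val relay = Relay(Price(inputPerMillionUsd = 4.0, outputPerMillionUsd = 16.0))
    relay.onFrame(AudioFrame(ByteArray(320)))   // audio: forwarded, never inspected
    relay.onFrame(TextFrame("{}", Usage("gpt-realtime-2", 1_000_000, 500_000)))
    println(relay.records)
}
```

Note that `UsageRecord` has no field that could hold a transcript or audio, which is the property the bullets describe.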

So the precise framing is: for the managed gateway, Extentos is **in the path but never inspects or stores the content**. It reads only the usage object and forwards everything else. (Older docs claimed Extentos is "never in the path / never proxies AI." That was only ever true for handler-code BYOK, the right column above, and it is the framing this page corrects.) See [security](/docs/resources/security) for the full data-handling commitments and the content/PII boundary.

## BYOK — bringing your own key (the opt-out)

If you'd rather the AI run on **your** provider account, BYOK is the opt-out — and it's configured in the **dashboard**, not in code:

1. Upload your OpenAI key in the project's **Credentials** section. It's stored encrypted, scoped to your project.
2. The gateway swaps your key in **server-side** at call time. Your app still never holds the key.
3. The call now bills to **your** OpenAI account. Extentos still relays it (and still records token-usage metadata), but the spend isn't on Extentos's key.
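
The server-side key swap in the steps above can be pictured as a simple lookup. This is only a sketch under assumptions: `Credentials`, `ResolvedKey`, and `BilledTo` are hypothetical names, and the in-memory map stands in for the encrypted, project-scoped credential store.

```kotlin
enum class BilledTo { EXTENTOS, DEVELOPER }

data class ResolvedKey(val apiKey: String, val billedTo: BilledTo)

// Hypothetical server-side resolver; keys never leave the backend.
class Credentials(
    private val managedKey: String,
    private val byokKeys: Map<String, String>, // projectId -> uploaded key
) {
    fun resolve(projectId: String): ResolvedKey {
        val own = byokKeys[projectId]
        return if (own != null) {
            ResolvedKey(own, BilledTo.DEVELOPER)        // BYOK: spend lands on the developer's account
        } else {
            ResolvedKey(managedKey, BilledTo.EXTENTOS)  // default: Extentos's managed key
        }
    }
}

fun main() {
    val creds = Credentials(
        managedKey = "managed-key-placeholder",
        byokKeys = mapOf("proj-byok" to "developer-key-placeholder"),
    )
    println(creds.resolve("proj-byok").billedTo)
    println(creds.resolve("proj-default").billedTo)
}
```

Because the choice happens inside the gateway, the app-side call is the same in both cases.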

The SDK surface is identical either way: your app code doesn't change when you switch between managed and BYOK. (Note: BYOK currently disables the [persistent memory](/docs/concepts/assistant#memory) profile, because that profile lives on the Extentos backend behind the gateway.)

## Identity: how the gateway knows who's calling

Every gateway connection presents an **attestation JWT** (`Authorization: Bearer`). In production that token comes from platform attestation — Play Integrity on Android, App Attest on iOS — carrying the project id, an anonymous device id, environment, and platform. In the [browser simulator](/docs/concepts/transport-vs-app) the backend mints a session-scoped token, so simulator sessions reach the gateway without device attestation.
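
To make the token contents concrete, the sketch below decodes the payload segment of a JWT and pulls out the four fields named above. The claim names (`project_id`, `device_id`, `environment`, `platform`) are assumptions for illustration, the sample token is unsigned and built locally, and this only decodes: signature verification, which a real gateway would have to perform, is out of scope here.

```kotlin
import java.util.Base64

// Hypothetical claim names; the real token's field names are not documented here.
data class GatewayClaims(
    val projectId: String,
    val deviceId: String,
    val environment: String,
    val platform: String,
)

// Decodes the payload only. It does NOT verify the signature.
fun decodeClaims(jwt: String): GatewayClaims {
    val parts = jwt.split(".")
    require(parts.size == 3) { "expected header.payload.signature" }
    val payload = String(Base64.getUrlDecoder().decode(parts[1]), Charsets.UTF_8)

    // Tiny string-claim extractor to keep the sketch dependency-free;
    // real code would use a JSON parser.
    fun claim(name: String): String =
        Regex("\"" + Regex.escape(name) + "\"\\s*:\\s*\"([^\"]*)\"")
            .find(payload)?.groupValues?.get(1)
            ?: error("missing claim: $name")

    return GatewayClaims(
        claim("project_id"), claim("device_id"), claim("environment"), claim("platform"),
    )
}

// Builds an unsigned sample token purely for the demo.
fun sampleToken(): String {
    val enc = Base64.getUrlEncoder().withoutPadding()
    val header = enc.encodeToString("""{"alg":"none"}""".toByteArray())
    val payload = enc.encodeToString(
        """{"project_id":"proj-a","device_id":"anon-123","environment":"production","platform":"android"}"""
            .toByteArray(),
    )
    return "$header.$payload.sig"
}

fun main() {
    println(decodeClaims(sampleToken()))
}
```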

A sideloaded **development** build can't attest, so it can't use Extentos's managed key — it falls back to BYOK (or the simulator's session token). This keeps the managed key usable only from builds Extentos can account for.

## Metering and billing

**Metering is live.** Every managed-gateway response is priced at the provider's list price (no markup) and recorded against your project. You can see usage per project in the dashboard.

**Billing is planned, not yet charging.** The model: a prepaid **credit** balance with a **free allowance** for every account, after which managed-gateway usage draws down credits (top up, or switch to BYOK to move the spend to your own provider account). This runs on a dedicated Extentos billing account, separate from everything else.
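
Since billing is not live, the following is only a sketch of the arithmetic the planned model implies: usage consumes the free allowance first, then prepaid credits. `Balance` and `drawDown` are hypothetical names, amounts are in cents to avoid floating-point drift, and what happens when credits run out is not specified on this page, so the sketch simply lets the credit figure go negative.

```kotlin
// Hypothetical balance model; amounts are in US cents.
data class Balance(val freeAllowanceCents: Long, val creditCents: Long)

// Applies one metered charge: free allowance first, then prepaid credits.
fun drawDown(balance: Balance, costCents: Long): Balance {
    require(costCents >= 0) { "cost must be non-negative" }
    val fromFree = minOf(balance.freeAllowanceCents, costCents)
    val fromCredits = costCents - fromFree
    return Balance(
        freeAllowanceCents = balance.freeAllowanceCents - fromFree,
        creditCents = balance.creditCents - fromCredits,
    )
}

fun main() {
    val start = Balance(freeAllowanceCents = 500, creditCents = 1_000)
    println(drawDown(start, costCents = 700))
}
```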

What this means **today**: managed-gateway usage is measured and shown, but you are **not charged** — there's a provisional safety cap, not a paywall. When credit billing goes live it will be announced in the [changelog](/docs/resources/changelog), and the free-forever surfaces (MCP tools, on-device simulation, real-hardware testing) stay free regardless. See [pricing](/docs/resources/pricing) for the full money story.

## When to use which

- **Default (managed gateway):** you want voice AI with zero config, and you're fine running on Extentos's provider key with usage metered and, once billing goes live, drawing down a credit balance. This is the path most apps take.
- **BYOK:** you already have an OpenAI account you want the spend and rate limits on, you need a specific model/tier the gateway doesn't expose, or your compliance posture requires the provider relationship be yours. Upload the key in the dashboard; nothing else changes.
- **Handler-code BYOK (a different thing):** you're calling a *non-assistant* AI — a vision model on a captured photo, a translation API — from your own handler. That never touches the gateway; wire the key yourself per [credentials](/docs/mcp-server/credentials-checklist).

## Related

- [Assistant runtime](/docs/concepts/assistant) — the `glasses.assistant.*` API the gateway powers
- [Pricing](/docs/resources/pricing) — free surfaces, metered gateway, the coming credit model
- [Security](/docs/resources/security) — the data-handling boundary and what's collected
- [Credentials](/docs/mcp-server/credentials-checklist) — Meta DAT setup and BYOK provider keys