The display capability

Render UI on the Ray-Ban Display with the glasses.display builder DSL in Kotlin and Swift — text, images, buttons, media, and Neural Band input. Gated per device. Beta.

Display needs the Meta vendor module. com.extentos:glasses carries no vendor SDK, so add implementation("com.extentos:glasses-meta") alongside it — see install. Without it your build still succeeds and voice still works, but display.isAvailable is false and display calls are silent no-ops. The SDK logs a warning at startup when it spots that combination.

Display support is in beta. The full display surface — the builder DSL, select/back input, per-device gating — ships in both SDKs and is fully tested in the simulator on both platforms. On-glasses rendering on Meta Ray-Ban Display hardware is in its verification phase; APIs are stable, with refinements expected as hardware verification completes.

The display capability (glasses.display.*) renders UI on a glasses screen — today, the Ray-Ban Display. You compose a node tree with a small builder DSL (Kotlin or Swift) and hand it to glasses.display.show { ... }; the connected device renders it. On real hardware the tree is delivered over Meta DAT's mwdat-display path; in the browser simulator the same tree renders in the browser tab. There is no WebView, no canvas, no local-bitmap drawing — you declare a tree of containers and leaves, and the platform renders it.

Ships in both SDKs; fully sim-testable on both. glasses.display.* ships in the Kotlin and Swift SDKs, and the DSL, input routing, and event log are exercised end-to-end in the simulator on both platforms. On-glasses rendering via Meta DAT (mwdat-display, DAT 0.8.0) is live on Android; iOS on-glasses delivery is in progress. The local-media roots image(photo) / video(clip) and their helpers (prepareVideo, forgetHostedVideo / forgetHostedImage) are Android-only; see "The same tree in Swift" for the Swift shapes.

Per-device, never-throw

Display is a capability of the specific device model, not the vendor. Within Meta Ray-Ban, the Ray-Ban Display has a screen and the Ray-Ban Meta / Oakley models do not. So you branch on a runtime signal, never on a model name:

val isAvailable: Boolean = glasses.display.isAvailable   // true only on a display-capable device

if (glasses.display.isAvailable) {
    glasses.display.show { /* ... rich display UX ... */ }
} else {
    glasses.audio.speak("Here's what I found …")          // graceful voice fallback
}

The model is never-throw: show() on a display-less device is a safe silent no-op — degradation never crashes and never diverges between the simulator and hardware. The gate is live: in the simulator, switching the device model pushes the new profile to the running app with no reconnect, so the same session can see isAvailable flip from true to false mid-run. Good UX guards rich display features with isAvailable and falls back to speech.
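Because the gate is live, it is safer to read isAvailable at each render than to cache it at startup. The sketch below illustrates that idea with hypothetical stand-ins (DisplaySurface, SpeechSurface, ResultPresenter, and the two fakes are not SDK types). The fakes are synchronous for brevity, whereas the real show() is suspend and takes a builder lambda.

```kotlin
// Hypothetical stand-ins for the two SDK surfaces used here. The real
// glasses.display.show() is suspend and takes a builder lambda; these
// synchronous fakes only model the gating behaviour.
interface DisplaySurface {
    val isAvailable: Boolean
    fun show(text: String)
}

interface SpeechSurface {
    fun speak(text: String)
}

// Reads isAvailable on every call instead of caching it, because the
// gate can flip while the session is running.
class ResultPresenter(
    private val display: DisplaySurface,
    private val speech: SpeechSurface,
) {
    fun present(message: String) {
        if (display.isAvailable) display.show(message) else speech.speak(message)
    }
}

// Recording fakes so the routing can be checked without a device.
class FakeDisplay(override var isAvailable: Boolean) : DisplaySurface {
    val shown = mutableListOf<String>()
    override fun show(text: String) {
        // Mirrors the never-throw contract: a display-less device ignores the call.
        if (isAvailable) shown += text
    }
}

class FakeSpeech : SpeechSurface {
    val spoken = mutableListOf<String>()
    override fun speak(text: String) {
        spoken += text
    }
}
```

Flipping FakeDisplay.isAvailable between two present() calls sends the first message to the display and the second to speech, which is the mid-run switch the simulator can produce.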

The builder DSL

show(onBack) { ... } runs against a DisplayRootScope. You nest containers and place leaves; show {} replaces the entire display each call (there are no incremental updates — own your view state app-side and re-render whole views on every transition). show() and clear() are suspend.

Containers:

column(...) { ... } — a vertical stack.
row(...) { ... } — a horizontal stack.

Both take gap, padding, mainAlign, crossAlign, background, and an optional onClick that makes the whole container tappable. A clickable container needs an explicit id for a stable, agent-targetable handle — otherwise it falls back to a positional id (box-0, box-1, …) that shifts when you reorder the tree.
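To see why positional ids are fragile, here is a small model of the fallback described above. Card and resolveIds are illustrative names, not SDK types. The assumption that the box-N counter advances only for containers without an explicit id is mine, not something the SDK documents.

```kotlin
// Hypothetical model of a clickable container: an optional explicit id
// plus a label. This is not the SDK's node type.
data class Card(val label: String, val explicitId: String? = null)

// Assigns ids the way the text describes: an explicit id wins, and every
// container without one gets the next positional "box-N" id.
// Assumption: the counter only advances for containers lacking an id.
fun resolveIds(cards: List<Card>): Map<String, String> {
    var nextBox = 0
    return cards.associate { card ->
        val id = card.explicitId ?: "box-${nextBox++}"
        card.label to id
    }
}
```

With no explicit ids, reversing a two-card list swaps which card answers to box-0, so a script that targets box-0 would hit a different card. With explicit ids such as note-1, the handle follows the card wherever it moves.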

Leaves:

text(text, style, color, align)
image(url, size, cornerRadius, align) — a remote image fetched by URL.
button(text, style, icon, align, id, onClick) — pass an explicit id.
icon(name, style, align) — a vendor-specific icon by name.

Root-only, full-surface (these render as the entire display and must be the sole top-level node — nesting them inside a column / row is a compile error, and the builder scope inside a container simply doesn't expose them):

video(url) — a remote MP4 by URL.
video(clip) — a locally-recorded VideoClip (the SDK hosts it; see below). Android-only today.
image(photo) — a locally-captured Photo rendered full-surface (the SDK hosts it). Android-only today.

glasses.display.show(onBack = { scope.launch { glasses.display.clear() } }) {
    column(gap = 8, padding = 12) {
        text("My notes", style = TextStyle.HEADING)
        column(
            background = Background.CARD,
            onClick = { scope.launch { openNote(note) } },
            id = "note-${note.id}",          // stable identity-based id
        ) {
            text(note.title, style = TextStyle.BODY)
            text(note.preview, style = TextStyle.CAPTION, color = TextColor.SECONDARY)
        }
    }
}
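Since each show replaces the whole display, one workable pattern is to keep the view state in the app and derive every view from it. The following is a minimal sketch of that pattern. ViewState, Input, next, and describe are hypothetical names, and describe returns a plain string where a real app would issue one show { ... } or clear() call.

```kotlin
// App-side view state. The names here are illustrative, not SDK types.
sealed interface ViewState {
    object Browse : ViewState
    data class Detail(val noteId: String) : ViewState
    object Cleared : ViewState
}

sealed interface Input {
    data class Select(val noteId: String) : Input
    object Back : Input
}

// Pure transition function: every input produces the next full view state.
fun next(state: ViewState, input: Input): ViewState = when (input) {
    is Input.Select -> ViewState.Detail(input.noteId)
    Input.Back -> when (state) {
        is ViewState.Detail -> ViewState.Browse   // detail goes back to the list
        ViewState.Browse -> ViewState.Cleared     // root view clears the display
        ViewState.Cleared -> ViewState.Cleared
    }
}

// Each state maps to one complete view; in an app this is where a single
// show { ... } (or clear()) call per transition would go.
fun describe(state: ViewState): String = when (state) {
    ViewState.Browse -> "show: note list"
    is ViewState.Detail -> "show: note ${state.noteId}"
    ViewState.Cleared -> "clear"
}
```

Keeping the transition pure means the same state always produces the same tree, so there is no hidden display state to diff against.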

Enum vocabulary

Enum	Cases
`TextStyle`	`HEADING`, `BODY`, `CAPTION`
`TextColor`	`PRIMARY`, `SECONDARY`
`ButtonStyle`	`PRIMARY`, `SECONDARY`, `OUTLINE`
`Background`	`NONE`, `CARD`
`Alignment`	`START`, `CENTER`, `END`, `STRETCH`
`ImageSize`	`ICON`, `FILL`
`CornerRadius`	`NONE`, `SMALL`, `MEDIUM`

The same tree in Swift

The Swift SDK builds the identical node tree — the tree vocabulary, wire JSON, and root-normalization rules are core-owned, so a tree composed in Swift renders exactly like its Kotlin twin. The builder differs in a few idioms:

show(onBack:content:) and clear() are async — await glasses.display.show { ... }.
Containers: one flexBox(direction:) method (.column / .row) instead of the column / row sugar — same gap, padding (an EdgeInsets), mainAlign, crossAlign, background, and optional onClick.
Leaves: text(_:style:color:align:), image(url:size:cornerRadius:align:), button(_:style:icon:align:onClick:), icon(_:style:align:) — and video(url:) on the root scope.
Ids auto-assign. There is no explicit id: parameter — tappable nodes get stable-within-a-show ids automatically each show.

await glasses.display.show(onBack: { Task { await glasses.display.clear() } }) { root in
    root.flexBox(direction: .column, gap: 8,
                 padding: EdgeInsets(top: 12, right: 12, bottom: 12, left: 12)) { col in
        col.text("My notes", style: .heading)
        col.button("Open latest") { Task { await openLatest() } }
    }
}

Android-only today: the column / row container sugar, the local-media roots image(photo) / video(clip), prepareVideo, and forgetHostedVideo / forgetHostedImage. URL-based image(url:) and video(url:) work on both platforms.

Additive-light design rule

The Ray-Ban Display is a waveguide that adds light to the world: it can only brighten, never occlude. Dark pixels render transparent. So design bright-on-dark: bright text and strokes read well; dark fills disappear into the scene. This applies doubly to video: bright-on-dark footage reads, dark scenes mostly vanish. The simulator composites the display additively (a screen blend) so previews match the optics.
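For intuition, the standard screen blend for one colour channel can be written in a line. This is the textbook formula, given here as an assumption about how an additive composite behaves rather than as the simulator's actual code.

```kotlin
// Standard "screen" blend for one colour channel, with values in 0.0..1.0.
// Assumption: the simulator's additive composite behaves like this formula.
fun screen(scene: Double, display: Double): Double =
    1.0 - (1.0 - scene) * (1.0 - display)
```

A black display pixel (0.0) leaves the scene value unchanged, which is why dark fills read as transparent. Any brighter display value can only raise the result toward 1.0.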

One consequence to know: a dead or unfetchable media URL emits no light, so the panel shows nothing even while the tree-level checks look green — the one failure additive optics can't surface visually. See Errors and observability.

Full-surface video and photos, with hosting

video(...) is the one true-motion channel third-party content has — declarative tree updates are link-bound (a few Hz), but a root video plays on the glasses themselves. Playback auto-starts once the tree is sent and stops on the next show() / clear().

The glasses can only fetch an http(s) URL — a local data: / file: clip is rejected on hardware. So:

video(url) / image(url): for media already hosted at an http(s) URL.
video(clip) / image(photo): for a clip or photo you just captured. The SDK uploads a copy to Extentos's media host at show() time and renders the hosted URL; you never hand-roll hosting. A photo hosts in well under a second; a multi-MB clip takes a few seconds (you'll see a brief "Preparing…" only for video).

// Show a just-captured photo full-surface — the SDK hosts the local capture.
glasses.display.show(onBack = { scope.launch { showList() } }) {
    image(photo)
}

The hosted copy intentionally outlives the on-screen display, so re-showing the same clip/photo reuses it instead of re-uploading. Release it when the user deletes the underlying item:

forgetHostedVideo(clip) / forgetHostedImage(photo) — tear down the hosted copy. Idempotent and safe (a no-op if never hosted, already forgotten, or no display-video delivery is wired); does not touch the local file.
prepareVideo(clip): Boolean — speculatively pre-upload a clip (e.g. when you render a list of recordings) so a later show { video(clip) } plays instantly. Best-effort, idempotent, uploads at most once. There's no prepareImage sibling — photos host fast enough that there's no latency to move off the critical path.

Build the Photo / VideoClip the same way for show and for forget, so both resolve to the same hosted copy. Use platform-hosted test clips for quick verification: https://extentos.com/sample-videos/mountain-bike.mp4 and https://extentos.com/sample-videos/run-in-city.mp4.
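The hosting lifecycle above (prepare at most once, reuse on show, idempotent forget) can be pictured with a toy model. HostedMediaModel is not the SDK's implementation. The string key, the upload counter, and the placeholder URL are all assumptions made for illustration.

```kotlin
// Toy model of the hosting behaviour described above. It is not the SDK's
// implementation: the key scheme and the upload counter are assumptions.
class HostedMediaModel {
    private val hosted = mutableMapOf<String, String>()  // local key -> hosted URL
    var uploads = 0
        private set

    // Uploads at most once per key; later calls reuse the hosted copy.
    fun prepare(localKey: String): String =
        hosted.getOrPut(localKey) {
            uploads++
            "https://media.example/$localKey"  // placeholder URL, not a real host
        }

    // Showing reuses a prepared copy or hosts on demand.
    fun show(localKey: String): String = prepare(localKey)

    // Idempotent: forgetting an unknown or already-forgotten key is a no-op.
    fun forget(localKey: String) {
        hosted.remove(localKey)
    }

    fun isHosted(localKey: String): Boolean = localKey in hosted
}
```

The model also shows why the same identity matters for show and forget: if the two calls used different keys, the forget would miss the hosted copy and leave it behind.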

Input model

The display has two input rails, both driven the same way in the simulator (injectInput) as on hardware:

Selection — a Neural Band index-pinch on hardware (injectInput(action: "select", targetId) in the sim) fires the node's registered onClick, routed by node id. Give every clickable node a stable id.
Back — the back gesture (Neural Band mid-pinch on hardware; injectInput(action: "back") in the sim) fires the current show's onBack. Back is view-contextual, so register it per show: a detail view's onBack typically re-renders the browse view; a root view's onBack clears the display; a show with no onBack ignores the gesture. clear() drops the current handlers so a stale onBack can't fire after a wipe.

onClick / onBack are plain () -> Unit — launch a coroutine inside them for the suspend show() / clear() calls. The slide (focus highlight) moves device/sim-side; your app doesn't see focus changes, but the focus target is observable in the event log. Keep an explicit Back button alongside onBack for discoverability. The back-gesture wiring is sim-verified; the real-hardware mid-pinch lands with the DAT display validation pass.
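The routing rules in this section can be summarized in a small dispatcher model. InputRouter is a hypothetical class that mimics the described behaviour. It does not reflect the SDK's internals, and its show() takes a plain map of handlers instead of a builder tree.

```kotlin
// Toy dispatcher modelling the routing rules above; not the SDK's internals.
class InputRouter {
    private var clickHandlers: Map<String, () -> Unit> = emptyMap()
    private var backHandler: (() -> Unit)? = null

    // Each show replaces the previous handlers wholesale.
    fun show(
        onBack: (() -> Unit)? = null,
        clicks: Map<String, () -> Unit> = emptyMap(),
    ) {
        clickHandlers = clicks
        backHandler = onBack
    }

    // clear() drops every handler so nothing stale can fire afterwards.
    fun clear() {
        clickHandlers = emptyMap()
        backHandler = null
    }

    // Selection is routed by node id; unknown ids are ignored.
    fun select(targetId: String) {
        clickHandlers[targetId]?.invoke()
    }

    // Back fires the current show's onBack, or does nothing if none is set.
    fun back() {
        backHandler?.invoke()
    }
}
```

In this model a back gesture after clear() does nothing, matching the guarantee that a stale onBack cannot fire after a wipe.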

Agent-driven verification reads the live tree with getDisplayState and drives input with injectInput — see the MCP tools reference.

Permissions

The display capability contributes zero extra Android permissions and no iOS plist keys: rendering is outbound over the existing DAT connection (the standard Bluetooth set every glasses app already declares). If your app already connects to the glasses, you can render to the display. On real hardware, the scaffold also flips the DAM_ENABLED manifest flag for you when Display is in your declared capability footprint (the Extentos SDK ships with the flag off by default).

Errors and observability

There is no DisplayError return type. show() never throws; display delivery problems surface as runtime log events on glasses.runtime.events — display.video_delivery_failed, display.image_delivery_failed, display.video_error — and on the errors chip in the simulator event log (they are warn severity). A dead media URL additionally raises a visible "video failed to load" placeholder in the simulator viewport and a display_error entry on the errors chip — when a display flow looks green but shows nothing, check that chip first. App code can react to display.video_error (e.g. fall back to a text view). See the error reference.
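One way to keep the reaction logic testable is to map event names to an app-level decision in a pure function. The sketch below uses the event names listed above. Reaction and reactionFor are hypothetical, and how events arrive on glasses.runtime.events (and what payload they carry) is deliberately not modelled.

```kotlin
// What the app does in response to a display log event. Illustrative only.
enum class Reaction { FALL_BACK_TO_TEXT, LOG_AND_CONTINUE, IGNORE }

// Maps the event names listed above to a reaction. How events are delivered
// on glasses.runtime.events (and their payload) is not modelled here.
fun reactionFor(eventName: String): Reaction = when (eventName) {
    "display.video_error" -> Reaction.FALL_BACK_TO_TEXT
    "display.video_delivery_failed",
    "display.image_delivery_failed" -> Reaction.LOG_AND_CONTINUE
    else -> Reaction.IGNORE
}
```

The choice of which events trigger a text fallback is a design decision for the app; the mapping here is only one plausible policy.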

Render on the display — the task guide: capture → store → browse, plus the assistant-driven variant
Capabilities — the full vendor-agnostic SDK vocabulary and the per-device capability model
The assistant runtime — assistant tools can drive glasses.display.* by voice
Error reference — the no-DisplayError model and the runtime log events
Vendors: Meta Ray-Ban — the device family and the Ray-Ban Display
