Work / 02 · 2022–2024

Leading Google's Assistant into the Gemini era

As Google reinvented its assistant around Gemini, I led one of the broadest engineering portfolios in that effort: on-device intelligence, the server-side fulfillment that turns a request into an action, Gemini Live's natural conversation, the protocol that hands a session between phone and cloud without a seam, and the reliability and unification of the whole stack. I managed managers and was one of the pillars of a ~150-person org.

1B+

people across Assistant & Gemini surfaces

~150

person org I helped lead and calibrate

device→cloud

one portfolio across the full stack

~800

engineers on the platform we unified

The portfolio

Not one layer — the whole assistant stack

I came to Zürich in 2022 to lead assistant engineering, and the remit grew fast. Generative AI arrived mid-flight, Assistant became Gemini, and the work I owned landed at the center of the company's top priority. What I led wasn't a single team or a single layer — it was a portfolio that ran the length of the stack:

On-device intelligence: The assistant running locally on phones, watches, cars, and the home — fast, private, and useful even offline.
Server fulfillment: The layer that turns an understood request into a real action — the difference between answering and doing.
Gemini Live: Natural, free-flowing conversation you can interrupt and redirect, instead of one rigid command at a time.
Device ↔ cloud handoff: One conversational protocol so a session moves between phone and server seamlessly — light work local, heavy work in the cloud.
Reliability: The readiness and quality of assistive experiences end to end, across surfaces and conditions.
Stack unification: Collapsing parallel Assistant and Gemini stacks into one, so a capability built once works everywhere.

The hard problem

Make two stacks one — and make the seam disappear

The through-line was unification. Years of Assistant and the new world of Gemini had grown into parallel stacks; my mandate was to make them one — so a capability built once worked everywhere, and a conversation could begin on a watch, continue on a phone, and reach into the cloud for a heavy action without the user ever feeling the boundary.

The best handoff is the one nobody notices: light work on the device, heavy work in the cloud, one conversation throughout.

Fig. 01. Device ↔ cloud: one conversation, no seam

[Diagram: the device side (fast, private, offline-capable) handles light work; the cloud side (real actions, larger models) handles heavy work; between them is one conversation whose state moves with the session. Begin on a watch, continue on a phone, reach the cloud, without ever feeling the boundary.]

The structural bet underneath it was consolidating custom, siloed systems into one core-supported platform. Roughly 800 engineers across Google adopted it, it cut redundant infrastructure cost by as much as half, and it standardized on Kotlin so the same code could serve both server and device. The on-device half kept its own discipline: every capability had to fit inside per-device-class budgets for latency, battery, and memory, because running models where your data lives is the strongest privacy promise a consumer AI platform can make.
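To make the idea of per-device-class budgets concrete, here is a minimal sketch in Python (chosen for brevity; the platform itself was standardized on Kotlin). It is an illustration under stated assumptions, not the production system: the `Budget` type, the `over_budget` helper, the device-class names, and every number are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Budget:
    """Resource figures for one capability on one device class."""
    latency_ms: float
    battery_mwh: float
    memory_mb: float


# Hypothetical budgets; the real values are not given in this write-up.
BUDGETS = {
    "watch": Budget(latency_ms=300, battery_mwh=0.5, memory_mb=64),
    "phone": Budget(latency_ms=200, battery_mwh=2.0, memory_mb=512),
}


def over_budget(device_class: str, measured: Budget) -> list[str]:
    """Return the names of the dimensions where `measured` exceeds the budget."""
    budget = BUDGETS[device_class]
    return [
        name
        for name in ("latency_ms", "battery_mwh", "memory_mb")
        if getattr(measured, name) > getattr(budget, name)
    ]


# A capability measured on a watch: fine on latency and battery, too much memory.
print(over_budget("watch", Budget(latency_ms=250, battery_mwh=0.4, memory_mb=80)))
# prints ['memory_mb']
```

Keeping the budget as data per device class, rather than as thresholds scattered through the code, means the same check can run for a watch, a phone, or a car with only the table changing.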

Reliability

Make “production-ready” something you can measure

As capabilities multiplied across surfaces, quality threatened to become anecdote. I chartered a reliability and quality effort that turned “is it ready?” from a judgment call into a measured bar — so an assistive experience shipped when the numbers said it was trustworthy, not when the demo went well.

Correctness: Does it do what was asked? Measured per capability, with budgets attached.
Reliability: Does it do so every time — across devices, locales, and network conditions?
Latency & health: Inside the device-class budget, without degrading the system around it?
Debuggability: When it fails, can an engineer see why? Failures you can't explain are failures you can't fix.

The pillars fed a continuous-readiness bar with launch gating. On Family Link I had learned this lesson after a crisis; here we built it in before one.
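One way to picture a measured bar with launch gating is the sketch below. It is hypothetical throughout: the metric names, the `Bar` thresholds, and the `gate` function are assumptions made for illustration and do not describe the actual tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Readiness:
    """Measured values for one assistive experience (all fields hypothetical)."""
    correctness: float         # fraction of requests that did what was asked
    reliability: float         # success rate across devices, locales, networks
    p95_latency_ms: float      # latency compared against the device-class budget
    explained_failures: float  # fraction of failures with a diagnosable cause


@dataclass(frozen=True)
class Bar:
    """The thresholds an experience must meet before it ships."""
    min_correctness: float
    min_reliability: float
    max_p95_latency_ms: float
    min_explained_failures: float


def gate(m: Readiness, bar: Bar) -> list[str]:
    """Return the pillars that miss the bar; an empty list means ready."""
    failing = []
    if m.correctness < bar.min_correctness:
        failing.append("correctness")
    if m.reliability < bar.min_reliability:
        failing.append("reliability")
    if m.p95_latency_ms > bar.max_p95_latency_ms:
        failing.append("latency")
    if m.explained_failures < bar.min_explained_failures:
        failing.append("debuggability")
    return failing


# Illustrative numbers only.
bar = Bar(min_correctness=0.98, min_reliability=0.995,
          max_p95_latency_ms=800, min_explained_failures=0.90)
measured = Readiness(correctness=0.99, reliability=0.990,
                     p95_latency_ms=650, explained_failures=0.95)
print(gate(measured, bar))  # prints ['reliability']
```

Returning the list of failing pillars, not just a yes or no, keeps the gate's decision explainable: a blocked launch names the number that blocked it.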

Leadership

Managing managers, holding an org together

I led this portfolio by managing managers and senior staff, not by touching every change — and I was one of the pillars of the ~150-person organization around it, helping run its operating rhythm and its performance calibration. That breadth mattered most when the ground moved: this era brought layoffs, the Assistant-to-Gemini pivot, and reorganizations.

I did the unglamorous parts well: rebuilding a struggling team's leadership end to end, divesting systems we shouldn't own, winding down scope my own teams had built when the strategy genuinely changed, and protecting morale and retention through the uncertainty. What I learned leading at this scale: set the direction in writing, measure what “good” means, and when the plan changes, say so plainly and early. Teams forgive a changed strategy; they don't forgive a hidden one.
