Google’s Gemma series continues to throw up all kinds of interesting models.
The latest is Magenta RealTime 2 (MRT2), an open-weights model from Google’s long-running Magenta research team that lets musicians run live, low-latency music synthesis directly on a MacBook, with no cloud, TPU, or dedicated hardware required. The model has 2.4 billion parameters and comes bundled with a C++ inference engine, a Python library, and a set of ready-to-use applications that can drop into a digital audio workstation (DAW) or run standalone.

What It Does
The key distinction with MRT2 is that it is not a generative music tool in the way most people currently understand that category. Tools like Google’s Lyria 3, Suno, or Udio take a prompt and produce a finished track. MRT2 is something different: a live instrument. You control it with MIDI input, audio clips, or text descriptions, and it generates music continuously in response — effectively acting as a playable AI synthesizer.
The number that matters here is latency. The original Magenta RealTime had a control latency of around three seconds, which made it interesting as a research artifact but difficult to use as an actual instrument. MRT2 brings that down to roughly 200 milliseconds, with a frame size of 40 milliseconds. That is still not quite the sub-10ms response of a hardware synthesizer, but it is within a range where real musical interaction becomes plausible.
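To put those figures in musical terms, here is a small back-of-the-envelope calculation. The 40 ms, 200 ms, and 3 s values come from the paragraph above; expressing latency as a number of frames and comparing it to a sixteenth note at 120 BPM are illustrative assumptions, not something the Magenta team has published.

```python
# Figures quoted in the article (approximate).
FRAME_MS = 40            # MRT2 frame size
MRT2_LATENCY_MS = 200    # MRT2 control latency
MRT1_LATENCY_MS = 3000   # original Magenta RealTime control latency


def frames_per_second(frame_ms: float) -> float:
    """How many audio frames are generated per second of output."""
    return 1000 / frame_ms


def latency_in_frames(latency_ms: float, frame_ms: float) -> float:
    """Control latency expressed as a count of frames (illustrative framing)."""
    return latency_ms / frame_ms


def sixteenth_note_ms(bpm: float) -> float:
    """Duration of a sixteenth note: a quarter of one beat."""
    return 60_000 / bpm / 4


if __name__ == "__main__":
    print("frames per second:", frames_per_second(FRAME_MS))
    print("MRT2 latency in frames:", latency_in_frames(MRT2_LATENCY_MS, FRAME_MS))
    print("MRT1 latency in frames:", latency_in_frames(MRT1_LATENCY_MS, FRAME_MS))
    print("speed-up:", MRT1_LATENCY_MS / MRT2_LATENCY_MS)
    print("latency in sixteenth notes at 120 BPM:",
          MRT2_LATENCY_MS / sixteenth_note_ms(120))
```

By this rough measure, a change in control input lands a little over one sixteenth note late at 120 BPM, compared with about six full beats for the original model.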
The Technical Approach
MRT2 is a codec language model operating on audio tokens from the SpectroStream codec. It uses frame-level autoregression with frame-aligned conditioning — meaning MIDI and style prompts are injected at every generation step, so the model can respond to input changes within a single 40ms frame rather than waiting to complete a longer sequence.
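The idea of frame-aligned conditioning can be sketched with a toy loop. Everything here is hypothetical: `Control`, `toy_frame_model`, and `generate` are made-up names, and the "model" is a trivial arithmetic stand-in that emits one integer per frame, whereas a real codec language model would emit several audio tokens per frame. The only point is the shape of the loop: the current control is passed in at every step, so a change shows up in the very next frame generated.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Control:
    """Conditioning for a single frame (hypothetical structure)."""
    midi_notes: Tuple[int, ...]  # MIDI pitches held during this frame
    style: str                   # active style prompt


def toy_frame_model(history: Sequence[int], control: Control) -> int:
    """Stand-in for the real model: one fake token per frame.

    The output depends on the previous frame (autoregression) and on
    the control for the current frame (frame-aligned conditioning).
    """
    prev = history[-1] if history else 0
    return (prev + sum(control.midi_notes) + len(control.style)) % 1024


def generate(
    controls: Sequence[Control],
    model: Callable[[Sequence[int], Control], int] = toy_frame_model,
) -> List[int]:
    """Generate one frame per control, injecting the control at each step."""
    frames: List[int] = []
    for control in controls:
        frames.append(model(frames, control))
    return frames


if __name__ == "__main__":
    c_major = Control((60, 64, 67), "strings")
    d_minor = Control((62, 65, 69), "strings")

    held = generate([c_major] * 6)
    changed = generate([c_major] * 3 + [d_minor] * 3)

    # Identical up to the chord change, different from that frame onward.
    print(held[:3] == changed[:3], held[3] != changed[3])
```

In a design where conditioning is supplied only once per longer chunk, the chord change above would not be heard until the next chunk boundary; supplying it per frame is what bounds the reaction time to a single 40 ms step in principle.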
To enable continuous streaming without runaway memory consumption, the model uses a causal sliding window attention mechanism alongside learnable attention embeddings. The latter are designed to improve generation quality over long sessions, addressing the kind of artifacts — feedback, tonal drift — that can emerge when context gets evicted from a sliding window.
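A minimal sketch of the sliding-window part, assuming a window measured in frames. The function and class names are hypothetical, and the learnable attention embeddings mentioned above are not modeled here because the article does not describe how they work. The mask shows which past frames each frame may attend to, and the bounded cache shows why memory stays constant however long the session runs.

```python
from collections import deque
from typing import Any, List


def sliding_window_mask(num_frames: int, window: int) -> List[List[bool]]:
    """mask[q][k] is True when query frame q may attend to key frame k.

    Causal: k must not be in the future (k <= q).
    Windowed: k must be among the most recent `window` frames.
    """
    return [
        [(k <= q) and (q - k < window) for k in range(num_frames)]
        for q in range(num_frames)
    ]


class BoundedCache:
    """Keeps only the most recent `window` frames of cached state."""

    def __init__(self, window: int) -> None:
        self._frames: deque = deque(maxlen=window)

    def append(self, frame: Any) -> None:
        # When full, the oldest frame is evicted automatically.
        self._frames.append(frame)

    def visible(self) -> List[Any]:
        return list(self._frames)


if __name__ == "__main__":
    for row in sliding_window_mask(num_frames=5, window=3):
        print("".join("x" if allowed else "." for allowed in row))

    cache = BoundedCache(window=3)
    for frame_index in range(10):
        cache.append(frame_index)
    print(cache.visible())
```

The eviction in `append` is exactly the event the paragraph above refers to: once a frame leaves the window the model can no longer see it, which is where long-session artifacts can creep in.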
For inference, the team built a C++ engine powered by Apple’s MLX framework. MRT2’s weights and computational graph are compiled into a .mlxfn container file, which the C++ engine loads and executes on Apple Silicon GPUs. This is what allows the model to run without an internet connection or a dedicated compute cluster. The base 2.4B model requires at least an M3 Pro or M2 Max; the smaller 230M variant runs on any Apple Silicon MacBook, including the Air.
The Applications
Google is shipping four example applications alongside the model:
Jam is the most straightforward entry point — a standalone app with style presets and MIDI control for anyone who wants to start playing immediately.
Collider takes a more experimental approach, letting users mix and interpolate prompts on a two-dimensional surface to blend genres and textures.
MRT2 Plugin is the DAW integration, shipping as an Audio Unit (AU) plugin. Musicians can drop it onto a MIDI track in Logic, Ableton, or any AU-compatible DAW, and the model generates audio in response to what is being played.
Creative Coding Extensions ship for Max/MSP, PureData, and SuperCollider — three environments widely used by electronic musicians, sound designers, and researchers who build custom instruments and performance tools.
The feature set across these applications includes MIDI steering (holding a chord causes the model to generate an ensemble following that harmony), text-to-synth (type “string ensemble” or “disco funk” to generate a playable instrument), audio cloning from short snippets, prompt mixing between audio and text styles, and gesture control via LFO, MIDI controllers, or even a camera.
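One plausible way to picture prompt mixing on a two-dimensional surface, as in Collider, is bilinear interpolation between four style embeddings placed at the corners of a unit square. This is an assumption for illustration only: the article does not say how Collider or the model's prompt mixing is implemented, and the function names and the tiny two-dimensional "embeddings" below are made up.

```python
from typing import Dict, List

Vector = List[float]


def lerp(a: Vector, b: Vector, t: float) -> Vector:
    """Linear interpolation between two equal-length vectors."""
    return [(1 - t) * x + t * y for x, y in zip(a, b)]


def blend_2d(corners: Dict[str, Vector], x: float, y: float) -> Vector:
    """Bilinear blend of four corner embeddings at position (x, y).

    `corners` must have the keys 'bottom_left', 'bottom_right',
    'top_left', and 'top_right'; x and y must lie in [0, 1].
    """
    if not (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0):
        raise ValueError("x and y must be within [0, 1]")
    bottom = lerp(corners["bottom_left"], corners["bottom_right"], x)
    top = lerp(corners["top_left"], corners["top_right"], x)
    return lerp(bottom, top, y)


if __name__ == "__main__":
    # Placeholder vectors standing in for real style embeddings.
    styles = {
        "bottom_left": [1.0, 0.0],   # e.g. "string ensemble"
        "bottom_right": [0.0, 1.0],  # e.g. "disco funk"
        "top_left": [1.0, 1.0],
        "top_right": [0.0, 0.0],
    }
    print(blend_2d(styles, 0.0, 0.0))  # exactly the bottom-left style
    print(blend_2d(styles, 0.5, 0.5))  # equal mix of all four
```

Moving a cursor, an LFO, or a MIDI controller across the surface would then amount to changing `x` and `y` over time and feeding the blended vector to the model as its style conditioning.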
A Decade of Magenta
The Magenta project has been running since around 2016, with an explicit philosophy that AI should augment musicians rather than replace them. NSynth, released in 2017, was an early neural synthesizer built into playable hardware. DDSP and Piano Genie followed. The first Magenta RealTime was the team’s debut live music model, but it required high-power GPU or TPU access and had latency that made live performance impractical.
MRT2 is meaningfully different in that it targets hardware that musicians actually own. The decision to build on MLX and ship a C++ inference engine rather than a Python-only library reflects a real commitment to production usability, not just research demonstration. The AU plugin format is not glamorous, but it is what gets software into a professional recording workflow.
What Comes Next
Google says finetuning support is coming, which would allow anyone to train the model on their own audio data and generate instruments in their own sonic signature. The team will also be at the Music Technology Hackathon in Boston in the coming days, presenting a challenge centered on MRT2.
The broader context here is that Google has been pushing hard across music AI on multiple fronts simultaneously — Lyria 3 in Gemini for casual consumers, Lyria RealTime via the Gemini API for developers, and now MRT2 for musicians who want a proper instrument. These are genuinely different use cases with different requirements, and the strategy of addressing all three separately rather than with a single product reflects a more mature understanding of how music software actually gets used.
MRT2’s code and weights are fully open source, available on GitHub under the magenta/magenta-realtime repository. The bundle — apps, plugin, and creative coding extensions — is available for download on macOS with Apple Silicon.