Mira Murati: Multimodality and the Modern AI Product Stack

As models moved from text-only to multimodal, and from demos to infrastructure, product leadership became a major influence on how they reached users. This post frames Mira Murati’s relevance through two threads: the shift to multimodal assistants and the realities of deploying them.

Multimodality changes the interface

When models can accept and produce multiple modalities (text, images, audio), the product becomes an interface layer that coordinates inputs and outputs, tool calls, and user expectations. Capability is no longer a single benchmark score; it’s end-to-end task success, meaning whether the user's whole request was completed correctly.
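To make the gap between per-stage scores and end-to-end success concrete, here is a minimal sketch. The `TaskRun` record, its three stage names, and the sample data are all hypothetical and chosen only for illustration; a real assistant pipeline would define its own stages and checks.

```python
from dataclasses import dataclass


@dataclass
class TaskRun:
    """One attempt at a user task. All three stage checks are hypothetical."""
    understood_input: bool  # e.g. the image or audio was interpreted correctly
    tool_call_ok: bool      # the tool call was well-formed and returned data
    answer_ok: bool         # the final response satisfied the request

    def succeeded(self) -> bool:
        # A task counts as a success only if every stage worked.
        return self.understood_input and self.tool_call_ok and self.answer_ok


def stage_accuracy(runs, stage):
    """Fraction of runs in which a single named stage passed."""
    return sum(getattr(r, stage) for r in runs) / len(runs)


def end_to_end_success(runs):
    """Fraction of runs in which all stages passed."""
    return sum(r.succeeded() for r in runs) / len(runs)


# Illustrative data: each stage fails once, in a different run.
runs = [
    TaskRun(True, True, True),
    TaskRun(True, False, True),
    TaskRun(False, True, True),
    TaskRun(True, True, False),
]

for stage in ("understood_input", "tool_call_ok", "answer_ok"):
    print(stage, stage_accuracy(runs, stage))
print("end_to_end", end_to_end_success(runs))
```

In this toy data every stage passes in three of four runs, yet only one run in four completes the whole task, which is why a single per-component score can overstate what users actually experience.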

Reliability as an engineering problem

Shipping assistants at scale requires managing hallucinations, latency, cost, and policy compliance. Many improvements come from evaluation, careful post-training, and product guardrails rather than from architecture alone.
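One way to treat these concerns as an engineering problem is to track them together in an evaluation summary and gate releases on thresholds. The sketch below assumes a hypothetical `EvalRecord` with one field per concern; the metric names, sample records, and limits are illustrative assumptions, not figures from any real deployment.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class EvalRecord:
    """Result of one evaluated response. All fields are hypothetical."""
    grounded: bool     # False if the answer contained an unsupported claim
    latency_s: float   # wall-clock time to respond, in seconds
    cost_usd: float    # estimated serving cost of the request
    policy_ok: bool    # False if the response violated a content policy


def summarize(records):
    """Aggregate per-response records into the four tracked metrics."""
    n = len(records)
    return {
        "hallucination_rate": sum(not r.grounded for r in records) / n,
        "max_latency_s": max(r.latency_s for r in records),
        "mean_cost_usd": mean(r.cost_usd for r in records),
        "policy_violation_rate": sum(not r.policy_ok for r in records) / n,
    }


def release_gate(summary, limits):
    """Return the metrics that exceed their limit; an empty list means pass."""
    return [name for name, limit in limits.items() if summary[name] > limit]


# Illustrative evaluation set and thresholds.
records = [
    EvalRecord(True, 0.5, 0.01, True),
    EvalRecord(True, 1.0, 0.02, True),
    EvalRecord(False, 2.0, 0.03, True),
    EvalRecord(True, 1.5, 0.02, True),
]
limits = {
    "hallucination_rate": 0.1,
    "max_latency_s": 3.0,
    "mean_cost_usd": 0.05,
    "policy_violation_rate": 0.0,
}

failed = release_gate(summarize(records), limits)
print(failed)
```

With these sample values the only metric over its limit is the hallucination rate, so the gate would block the release on that metric alone. Keeping all four metrics in one summary makes trade-offs visible, for example a guardrail that lowers hallucinations but raises latency or cost.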

Connections

For image-generation roots, see DALL·E. For assistant shaping methods, see RLHF and Post-Training. For the broader organizational timeline, see OpenAI.