Microsoft has unveiled Mu, a compact AI language model designed to run entirely on a PC's Neural Processing Unit (NPU). Built for speed and privacy, Mu lets users perform natural-language tasks on Copilot+ PCs without relying on cloud connectivity.
Unlike cloud-dependent models, Mu processes user prompts locally, enabling faster and more private interactions. Microsoft has optimised the model to work with the Settings agent in Windows Insider builds, allowing users to adjust system settings such as brightness and power modes simply by asking in natural language.
“Mu is built with only 330 million parameters, but its performance rivals much larger models,” Microsoft said. The model uses a transformer-based encoder-decoder architecture, which handles input processing and output generation as separate stages, making inference more efficient than reprocessing the full sequence at every step.
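A toy cost model can illustrate why the encoder-decoder split helps on short, prompt-heavy tasks. The sketch below is purely illustrative and is not Microsoft's implementation; it ignores real optimisations such as KV caching and simply counts attention work per step, under the assumption that the encoder processes the prompt once while each decoded token reuses that fixed encoding.

```python
def encoder_decoder_cost(prompt_len: int, output_len: int) -> int:
    # Encode the prompt once (cost ~ prompt_len), then pay a flat
    # per-token decode cost that attends to the cached encoding.
    encode = prompt_len
    decode = output_len  # one step per generated token
    return encode + decode


def decoder_only_cost(prompt_len: int, output_len: int) -> int:
    # In this simplified accounting (no KV cache), each new token
    # reprocesses the entire growing sequence.
    total = 0
    seq_len = prompt_len
    for _ in range(output_len):
        total += seq_len
        seq_len += 1
    return total


# A Settings-style workload: long-ish prompt, short structured answer.
print(encoder_decoder_cost(500, 20))  # 520
print(decoder_only_cost(500, 20))    # 10190
```

Under this simplified accounting, the encoder-decoder arrangement does far less work for a long prompt and short reply, which is consistent with the first-token latency gains Microsoft reports.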
Testing on Qualcomm’s Hexagon NPU showed a 47 percent reduction in the time to generate the first token and decoding speeds nearly five times faster than similar decoder-only models. On optimised devices like the Surface Laptop 7, Mu can produce over 200 tokens per second in real time, ensuring minimal latency and a smooth user experience.
Beyond speed, Mu prioritises user privacy, performing all computation locally on the device. This means personal data remains on the machine, an increasingly important consideration for users and enterprises alike.
The company also noted that Mu has been specifically fine-tuned to handle hundreds of system-level commands. Whether it’s tweaking screen settings or managing power preferences, users can control the system with straightforward instructions.
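As a hypothetical illustration of what handling such commands might look like, the sketch below maps a structured intent (the kind of output a small local model could produce) to a system-level action. The intent names, dispatch function, and actions here are invented for illustration and are not Microsoft's actual Settings-agent API.

```python
# Hypothetical dispatch table: intent name -> action formatter.
# These names are assumptions for illustration, not real Windows APIs.
SETTINGS_ACTIONS = {
    "brightness": lambda value: f"set display brightness to {value}%",
    "power_mode": lambda value: f"switch power mode to {value}",
}


def dispatch(intent: str, value: str) -> str:
    # A local model would turn "make my screen brighter" into a
    # structured pair like ("brightness", "70"); dispatch then maps
    # that pair onto the corresponding system action.
    if intent not in SETTINGS_ACTIONS:
        raise ValueError(f"unsupported setting: {intent}")
    return SETTINGS_ACTIONS[intent](value)


print(dispatch("brightness", "70"))  # set display brightness to 70%
print(dispatch("power_mode", "battery saver"))  # switch power mode to battery saver
```

The design point is that the language model only needs to emit a constrained, structured intent; the operating system retains control of which actions are actually permitted.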
This announcement comes just a week after Microsoft revealed it is upgrading its European cloud infrastructure to reinforce data sovereignty, ensuring that customer data remains within European legal jurisdictions.