Qwen 3 now supports ARM and MLX (alizila.com)
55 points by dworks 3 months ago | 8 comments


[June 2025]


The unfortunate thing about their Qwen3-Next naming is that it doesn't reflect the fact that the architecture is completely different from Qwen3. The gap is even bigger than the one between Qwen2 and Qwen3.

So support is likely to take quite some time, because it's not just regular transformer blocks stacked on top of each other but a brand-new hybrid architecture that mixes in SSM-style layers.
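Roughly, the structural difference looks something like this toy sketch. It is purely illustrative: the layer types, the attention-to-SSM ratio, and the dimensions are placeholders, not the actual Qwen3-Next design.

    # Toy "hybrid" stack: mostly recurrent/SSM-style blocks with an occasional
    # full-attention block, to show why a plain stacked-transformer code path
    # can't be reused as-is. All details here are made up for illustration.
    import torch
    import torch.nn as nn

    class ToySSMBlock(nn.Module):
        """Stand-in for a linear-attention / state-space style layer."""
        def __init__(self, dim):
            super().__init__()
            self.norm = nn.LayerNorm(dim)
            self.rnn = nn.GRU(dim, dim, batch_first=True)  # placeholder recurrence

        def forward(self, x):
            out, _ = self.rnn(self.norm(x))
            return x + out

    class ToyAttnBlock(nn.Module):
        """Ordinary self-attention block with a residual connection."""
        def __init__(self, dim, heads=4):
            super().__init__()
            self.norm = nn.LayerNorm(dim)
            self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

        def forward(self, x):
            h = self.norm(x)
            out, _ = self.attn(h, h, h, need_weights=False)
            return x + out

    class ToyHybridModel(nn.Module):
        """Most layers are SSM-style; every Nth layer is full attention."""
        def __init__(self, dim=64, layers=8, attn_every=4):
            super().__init__()
            self.blocks = nn.ModuleList(
                ToyAttnBlock(dim) if (i + 1) % attn_every == 0 else ToySSMBlock(dim)
                for i in range(layers)
            )

        def forward(self, x):
            for block in self.blocks:
                x = block(x)
            return x

    x = torch.randn(1, 16, 64)  # (batch, sequence, hidden)
    print(ToyHybridModel()(x).shape)  # torch.Size([1, 16, 64])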


From https://github.com/ggml-org/llama.cpp/issues/15940#issuecomm...:

> This is a massive task, likely 2-3 months of full-time work for a highly specialized engineer. Until the Qwen team contributes the implementation, there are no quick fixes.


It's already supported in vLLM, SGLang and MLX.
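If you have a vLLM build that already includes the Qwen team's PR, it's the usual offline-inference API. Sketch only, and the model id below is an assumption; use whatever checkpoint you actually want to serve.

    # Minimal vLLM sketch for a Qwen3-Next checkpoint (model id assumed).
    from vllm import LLM, SamplingParams

    llm = LLM(model="Qwen/Qwen3-Next-80B-A3B-Instruct")  # assumed model id
    params = SamplingParams(temperature=0.7, max_tokens=128)
    outputs = llm.generate(["Explain hybrid SSM/attention models in one paragraph."], params)
    print(outputs[0].outputs[0].text)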


The Qwen team made sure to land PRs to vLLM and SGLang on the first day, which is nice.



Yeah, I was already using Qwen3 on MLX back in July.
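With mlx-lm it's something like the below. The model id is just an example of an MLX-converted Qwen3 checkpoint; swap in whichever quantized variant you actually use.

    # Quick mlx-lm sketch for running a Qwen3 model on Apple Silicon.
    from mlx_lm import load, generate

    model, tokenizer = load("mlx-community/Qwen3-8B-4bit")  # example checkpoint
    print(generate(model, tokenizer, prompt="Hello, Qwen!", max_tokens=64))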


It's also supported on the Apple Neural Engine via Anemll: https://github.com/Anemll/Anemll




