How Flash-MoE Runs a 397B Parameter Model on a MacBook Pro at 4.4 tok/s
A developer ran Qwen3.5-397B—a model bigger than GPT-4—on a laptop with no Python and no frameworks. Here's exactly how.
March 23, 2026 · 12 min read