Liquid AI Ships LFM2.5-230M with llama.cpp, MLX, vLLM, SGLang, and ONNX Support for On-Device Inference
Liquid AI released LFM2.5-230M, its smallest model yet. The 230M-parameter, open-weight model runs on-device at 213 tok/s on a Galaxy S25 Ultra and 42 on a Raspberry Pi 5. Built on the LFM2 architecture, it targets tool use and data extraction, beating larger models like Qwen3.5-0.8B and Gemma 3 1B on instruction following. The post Liquid AI Ships LFM2.5-230M with llama.cpp, MLX, vLLM, SGLang,…
This is a summary curated by AIFuture. Read the complete article at the original source:
Read the full story on MarkTechPost