Product · MarkTechPost · June 28, 2026

Liquid AI Ships LFM2.5-230M with llama.cpp, MLX, vLLM, SGLang, and ONNX Support for On-Device Inference

Liquid AI released LFM2.5-230M, its smallest model yet. The 230M-parameter, open-weight model runs on-device at 213 tok/s on a Galaxy S25 Ultra and 42 tok/s on a Raspberry Pi 5. Built on the LFM2 architecture, it targets tool use and data extraction, and it beats larger models such as Qwen3.5-0.8B and Gemma 3 1B on instruction following.

This is a summary curated by AIFuture. Read the complete article at the original source, MarkTechPost.
