ByteDance's "iLLaDA" is a diffusion language model that keeps up with Qwen2.5

Researchers from Renmin University and ByteDance have released iLLaDA, an 8B language model that generates text differently than ChatGPT. It matches Qwen2.5 at the base level but falls behind after fine-tuning. The article ByteDance's "iLLaDA" is a diffusion language model that keeps up with Qwen2.5 appeared first on The Decoder.
This is a summary curated by AIFuture. Read the complete article at the original source:
Read the full story on The Decoder