- モデル
- Qwen3 VL

Qwen3 VL
by Alibaba
Qwen3-VL is a powerful open-source vision-language model with 235B parameters. It delivers comprehensive capabilities across text, image, and video understanding with a native 256K token context window. Key features include Visual Agent functionality for operating computer and mobile GUIs, advanced OCR in 32 languages, enhanced spatial perception and 3D grounding, and visual coding that generates Draw.io/HTML/CSS/JS from images and videos. The model excels at long-video comprehension, object localization, and embodied AI tasks.