Vision-language

3 Articles
Vision-language model creates plans for automated inspection of environments
Tech

Figure showing the pipeline of the team’s method. The input to their method includes a text description and a 3D environmental map, and...

Vision-language models gain spatial reasoning skills through artificial worlds and 3D scene descriptions
Tech

On the left, the simulated environment containing a cuboid placed on a plane and observed by a camera, placed directly above the object...

Vision-language models can’t handle queries with negation words, study shows
Tech

We present NegBench with image retrieval and multiple-choice tasks to evaluate negation understanding. CLIP-based models frequently misinterpret negation in both tasks, but we...