Grounding Visual Illusions in Language: Do Vision-Language Models Perceive Illusions Like Humans?

Yichi Zhang, Jiayi Pan, Yuchen Zhou, Rui Pan, Joyce Chai

October 2023

Abstract

Yichi Zhang, Jiayi Pan, Yuchen Zhou, Rui Pan, Joyce Chai. EMNLP 2023.

Do Vision-Language Models, an emergent human-computer interface, experience visual illusions similarly to humans, or do they accurately depict reality? We created the GVIL dataset to study this question. Among other findings, we show that larger models align more closely with human perception.

Type

Conference paper

Publication

The 2023 Conference on Empirical Methods in Natural Language Processing

Jiayi Pan

潘家怡