Hi 👋

My name is Jiayi Pan, and I am a first-year PhD student at Berkeley AI Research (BAIR) advised by Prof. Alane Suhr. My current research focuses on language grounding to vision and robotics. I received an Outstanding Paper Award at ACL 2023 and won the Amazon Alexa Prize SimBot Challenge in 2023.

I completed my undergrad in 2023 with dual degrees at the University of Michigan and Shanghai Jiao Tong University. During those wonderful years, I worked with Professors Joyce Chai, Dmitry Berenson, and Fan Wu.

Feedback is always welcome!

Publications & Manuscripts

* denotes equal contribution
Grounding Visual Illusions in Language: Do Vision-Language Models Perceive Illusions Like Humans?

Yichi Zhang, Jiayi Pan, Yuchen Zhou, Rui Pan, Joyce Chai. EMNLP 2023.

In this paper, we ask: do Vision-Language Models (VLMs), an emergent human-computer interface, perceive visual illusions like humans, or do they faithfully represent reality? We built VL-Illusion, a new dataset that systematically evaluates this question. Among other exciting findings, we found that although the models' humanlike rate under illusion is low, larger models are more susceptible to visual illusions and closer to human perception.
World-to-Words: Grounded Open Vocabulary Acquisition through Fast Mapping in Vision-Language Models

Ziqiao Ma*, Jiayi Pan*, Joyce Chai. ⭐️ ACL 2023 Outstanding Paper.

We introduce Grounded Open Vocabulary Acquisition (GOVA) to examine grounding and bootstrapping in open-world language learning. We also propose object-oriented BERT (OctoBERT), a visually grounded language model that highlights grounding as an objective. Our experiments demonstrate that OctoBERT is a more coherent and faster grounded word learner, and that its grounding ability helps the model learn unseen words more rapidly and robustly.
SEAGULL: An Embodied Agent for Instruction Following through Situated Dialog

Team SEAGULL at UMich. 🏆 1st Place in the inaugural Alexa Prize SimBot Challenge.

We introduce SEAGULL, an interactive embodied agent designed for the Alexa Prize SimBot Challenge, which can complete complex tasks in the Arena simulation environment through dialog with users. SEAGULL is engineered to be efficient, user-centric, and continuously improving.
Data-Efficient Learning of Natural Language to Linear Temporal Logic Translators for Robot Task Specification

Jiayi Pan, Glen Chou, Dmitry Berenson. ICRA 2023.

We present a learning-based approach that translates natural language commands into LTL specifications with very limited human-labeled training data by leveraging Large Language Models. Our model translates natural language commands at 75% accuracy with about 12 annotations and, when given full training data, achieves state-of-the-art performance. We also show how its outputs can be used to plan long-horizon, multi-stage tasks on a 12D quadrotor.
DANLI: Deliberative Agent for Following Natural Language Instructions

Yichi Zhang, Jianing Yang, Jiayi Pan, Shane Storks, Nikhil Devraj, Ziqiao Ma, Keunwoo Peter Yu, Yuwei Bao, Joyce Chai. EMNLP 2022, Oral.

We propose a neuro-symbolic deliberative agent that, while following language instructions, proactively applies reasoning and planning based on its neural and symbolic representations acquired from past experience. Our deliberative agent achieves a 70% improvement over reactive baselines on the challenging TEACh benchmark. Moreover, the underlying reasoning and planning processes, together with our modular framework, make the agent's behavior transparent and explainable.

Contact

  • Email: jiayipan [AT] berkeley [DOT] edu