Multimodal Chain-of-Thought Reasoning in Language Models

  • Paper
  • Feb 17, 2023
  • #ComputerScience
George Karypis
@GeorgeKarypis
(Author)
Alexander J. Smola
@smolix
(Author)
Aston Zhang
@astonzhangAZ
(Author)
arxiv.org
Large language models (LLMs) have shown impressive performance on complex reasoning by leveraging chain-of-thought (CoT) prompting to generate intermediate reasoning chains as the rationale to infer the answer. However, existing CoT studies have focused on the language modality. We propose Multimodal-CoT, which incorporates language (text) and vision (images) modalities into a two-stage framework that separates rationale generation and answer inference. In this way, answer inference can leverage better generated rationales that are based on multimodal information. With Multimodal-CoT, our model under 1 billion parameters outperforms the previous state-of-the-art LLM (GPT-3.5) by 16 percentage points (75.17% → 91.68% accuracy) on the ScienceQA benchmark and even surpasses human performance. Code is publicly available at this https URL.
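The two-stage separation described in the abstract can be pictured with a minimal sketch. This is not the authors' implementation: the `Model` type, the `multimodal_cot` function, the prompt format, and the toy stand-in models are all hypothetical names introduced for illustration, and passing the image to both stages is an assumption that the abstract does not spell out.

```python
from typing import Callable, Optional, Tuple

# Hypothetical interface: a model maps (text, optional image features) to text.
Model = Callable[[str, Optional[object]], str]


def multimodal_cot(
    question: str,
    image: Optional[object],
    rationale_model: Model,
    answer_model: Model,
) -> Tuple[str, str]:
    """Run rationale generation, then answer inference, as two separate stages."""
    # Stage 1: generate a rationale from the question text and the image.
    rationale = rationale_model(question, image)
    # Stage 2: infer the answer from the question plus the generated rationale.
    # Passing the image again here is an assumption of this sketch.
    answer = answer_model(f"{question}\nRationale: {rationale}", image)
    return rationale, answer


# Toy stand-ins so the sketch runs without any trained model.
def toy_rationale_model(text: str, image: Optional[object]) -> str:
    return "The picture shows a magnet attracting iron."


def toy_answer_model(text: str, image: Optional[object]) -> str:
    return "(A)" if "magnet" in text else "(B)"


rationale, answer = multimodal_cot(
    "Which object is attracted?", None, toy_rationale_model, toy_answer_model
)

# Reported ScienceQA gain in percentage points, from the abstract's figures.
gain = round(91.68 - 75.17, 2)
```

With the toy models, the second stage only reaches its answer because the first stage's rationale is included in its input, which is the dependency the two-stage design relies on. The `gain` value is the exact difference behind the abstract's "16 percentage points".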

Mentions
Michael Spencer @AISupremacyNews · Feb 21, 2023
  • Post
  • From Twitter
Xavi says: This paper is interesting in many ways. By using multimodal fine-tuning of small(er) (i.e. 1B parameter) models like T5, the authors show that they can beat GPT-3.5 in visual QA tasks, surpassing human performance in several tasks.
Cemre Ucar @CemreUcar · May 24, 2023
  • Curated in Artificial Intelligence Tools and Sustainability Implementation Workshop Resources - Şirince 2023
Collections
  • Cemre Ucar
    • Collection
    Artificial Intelligence Tools and Sustainability Implementation Workshop Resources - Şirince 2023
    5 curations