Multimodal C4: An Open, Billion-scale Corpus of Images Interleaved With Text

Paper
Apr 14, 2023
#ComputerScience

Wanrong Zhu

In-context vision and language models like Flamingo support arbitrarily interleaved sequences of images and text as input. This format not only enables few-shot learning via interle...

Mentions

Xin Eric Wang @xwang_lk · Apr 17, 2023

A large OPEN dataset for vision and language model training (e.g., GPT-4). Great work by @ZhuWanrong!