BigDocs

Wed, 01 Jan 2025 00:00:00 +0000

BigDocs is a large-scale, open, permissively licensed dataset for training multimodal models on document understanding and code generation tasks. Published at ICLR 2025.

SYNTHIA

Wed, 01 Jun 2016 00:00:00 +0000

SYNTHIA is a large collection of synthetic images for semantic segmentation of urban scenes, generated using a video game engine. Published at CVPR 2016 and widely adopted in the autonomous driving research community. Licensed for commercial use by Intel, Audi, Huawei, Toyota, and Samsung.

Datasets | David Vázquez
