The Multimodal IMDb (MM-IMDb) dataset

A multimodal dataset of around 26,000 movies, including images, plots and other metadata


The MM-IMDb dataset comprises 25,959 movies, each with its plot, poster, genres and 50 additional metadata fields such as year, language, writer, director and aspect ratio. Additional information can be found in the paper.

Source code

GitHub repository

Download Dataset

Raw dataset: mmimdb.tar.gz [8.1G]

Fuel format dataset: multimodal_imdb.hdf5 [15G], metadata.npy [62M]
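
Since the raw archive is several gigabytes, it can be useful to look at its first few entries before unpacking everything. The following is a minimal sketch using only Python's standard tarfile module. The helper name peek_archive and the local path are assumptions made for illustration, and nothing here presumes a particular directory layout inside the archive.

```python
import itertools
import tarfile


def peek_archive(path, limit=10):
    """Return (name, size_in_bytes) for the first `limit` members of a tar archive.

    `peek_archive` is a hypothetical helper, not part of the dataset's own code.
    """
    entries = []
    # "r:*" lets tarfile detect the compression (gzip for a .tar.gz file).
    with tarfile.open(path, "r:*") as archive:
        # Member headers are read lazily during iteration, so stopping early
        # avoids scanning the whole multi-gigabyte file.
        for member in itertools.islice(archive, limit):
            entries.append((member.name, member.size))
    return entries
```

Assuming the archive was saved to the current directory, peek_archive("mmimdb.tar.gz") returns a short list of member names and sizes that shows how the files are organised, without writing anything to disk.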