
Visual Text Correction

2017 – 2018

A vision-and-language task for automatically detecting and correcting falsified words in video descriptions.

vision-and-language, video, text-correction

Introduces the Visual Text Correction (VTC) task: given a short video clip and a caption, detect the incorrect word and replace it with the correct one. We built a dataset on top of LSMDC by automatically falsifying captions, then trained a model that jointly localises the error and proposes a correction using video–text semantics.
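The automatic falsification step can be illustrated with a minimal sketch. The description above does not say how the replaced word or its substitute was chosen, so this example assumes the simplest scheme: pick one token at random and swap it for a different word from a small vocabulary. The function name `falsify_caption`, the example caption, and the vocabulary are all hypothetical and are not taken from the project's code or data.

```python
import random


def falsify_caption(tokens, vocabulary, rng):
    """Replace one randomly chosen token with a different vocabulary word.

    Returns (falsified_tokens, error_index, original_word), which together
    form one training example: the corrupted caption, the position a model
    should detect, and the word it should restore.
    """
    if not tokens:
        raise ValueError("caption must contain at least one token")

    error_index = rng.randrange(len(tokens))
    original_word = tokens[error_index]

    # Exclude the original word so the caption is guaranteed to change.
    candidates = [w for w in vocabulary if w != original_word]
    if not candidates:
        raise ValueError("vocabulary has no word different from the original")

    falsified = list(tokens)  # copy so the input caption is left untouched
    falsified[error_index] = rng.choice(candidates)
    return falsified, error_index, original_word


# Hypothetical caption and vocabulary, for illustration only.
caption = "someone opens the door and walks outside".split()
vocabulary = ["closes", "window", "runs", "inside", "car", "door"]

falsified, error_index, original_word = falsify_caption(
    caption, vocabulary, random.Random(0)
)
print(" ".join(falsified), "| error at", error_index, "| was:", original_word)
```

Each call yields a labelled pair for both halves of the task: `error_index` is the detection target and `original_word` is the correction target. A realistic pipeline would likely constrain the substitute further (for example, to the same part of speech) so that errors are not trivially detectable from text alone, but that is an assumption beyond what the description states.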

Publication

Links

Project site, Video