Visual Text Correction
ECCV 2018
Amir Mazaheri, Mubarak Shah
2017 – 2018
A vision-and-language task for automatically detecting and correcting falsified words in video descriptions.
We introduce the Visual Text Correction (VTC) task: given a short video clip and a caption, detect the incorrect word and replace it with the correct one. We build a dataset on top of LSMDC by automatically falsifying captions, then train a model that jointly localises the error and proposes a correction using video–text semantics.
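To make the task format concrete, here is a minimal sketch of how a caption might be falsified and then corrected. It assumes that falsification means swapping one randomly chosen word for a different word from a fixed vocabulary; the description above does not specify the actual procedure, so this is an illustration only. The function names `falsify_caption` and `correct_caption`, the example caption, and the toy vocabulary are all hypothetical, and the sketch ignores the video entirely.

```python
import random


def falsify_caption(caption, vocabulary, rng):
    """Swap one randomly chosen word for a different vocabulary word.

    Returns (falsified caption, index of the changed word, original word).
    Assumption: one word is replaced per caption; the real dataset
    construction may use a different strategy.
    """
    words = caption.split()
    idx = rng.randrange(len(words))
    original = words[idx]
    # Exclude the original word so the caption is guaranteed to change.
    candidates = [w for w in vocabulary if w != original]
    words[idx] = rng.choice(candidates)
    return " ".join(words), idx, original


def correct_caption(caption, idx, replacement):
    """Apply a correction: put `replacement` at word position `idx`."""
    words = caption.split()
    words[idx] = replacement
    return " ".join(words)


# Hypothetical example data, not taken from LSMDC.
rng = random.Random(0)
caption = "she opens the door and walks outside"
vocabulary = ["closes", "window", "runs", "inside", "he", "door"]

falsified, idx, original = falsify_caption(caption, vocabulary, rng)
restored = correct_caption(falsified, idx, original)
print(falsified)
print(restored)
```

In this framing, a VTC model would have to predict both `idx` (detection) and the replacement word (correction) from the falsified caption and the video, whereas the sketch simply reuses the stored ground truth.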