Amir Mazaheri

Computer Vision Research Scientist · PhD

Staff ML Engineer at Warner Bros. Discovery (HBO). Deep expertise in large-scale video understanding, Vision-Language Models, and multimodal AI.

Projects · CV · GitHub · LinkedIn · amirmazaheri1990@gmail.com

About

I'm a Computer Vision Research Scientist with deep expertise in large-scale video understanding, Vision-Language Models (VLMs), and multimodal AI systems. I currently work as a Staff ML Engineer at Warner Bros. Discovery (HBO), where I lead video understanding and content moderation systems for the streaming platform.

I hold a PhD from the Center for Research in Computer Vision (CRCV) at the University of Central Florida (UCF), where I was advised by Prof. Mubarak Shah. My dissertation focused on Video Content Understanding Using Text. I have authored publications at CVPR, ICCV, ECCV, EMNLP, and AAAI, and hold multiple US patents.

Current focus

Selected projects

MMFT-BERT: Multimodal Fusion for Video QA

2019 – 2020

A multimodal fusion transformer with BERT encodings that achieved state-of-the-art results on the TVQA dataset at the time of publication.

Findings of EMNLP 2020

Visual Text Correction

2017 – 2018

A vision-and-language task for automatically detecting and correcting falsified words in video descriptions.

ECCV 2018

See all projects →