Projects

Research papers, patents, and engineering work — each with its own page.

A domain-adaptable few-shot framework that distinguishes healer from non-healer wounds through temporal image analysis.

A deep learning framework that monitors student performance and detects anomalies in large-scale active learning courses.

Pioneering text-to-video generation on realistic datasets using latent path construction for temporal modeling.

Production-scale scene and shot boundary detection system for video content at Tubi.

A multimodal fusion transformer with BERT encodings that achieves state-of-the-art results on the TVQA dataset.

Robustifying deep visuomotor policies through task-focused visual attention guided by natural language instructions.

A vision-and-language task for automatically detecting and correcting falsified words in video descriptions.

A latent-variable model for retrieving videos by multiple concepts supplied directly by users or inferred from queries.

Bidirectional LSTMs with spatial-temporal attention that predict missing words in video descriptions.

Dual deep networks for precise cropping and super-resolution enhancement of embedded images.

Machine learning techniques for advanced frequency management in streaming media delivery.