Projects
Research papers, patents, and engineering work — each with its own page.
WoundNet: Few-Shot Wound Healing Assessment
2022 – 2023
A domain-adaptable few-shot framework distinguishing healer from non-healer wounds through temporal image analysis.
Context-Aware Group Anomaly Detection in Education
2022 – 2023
A deep learning framework that monitors student performance and detects anomalies in large-scale active learning courses.
Text-to-Video Generation via Latent Path Construction
2021 – 2022
Pioneering text-to-video generation on realistic datasets using latent path construction for temporal modeling.
Multimedia Scene Break Detection
2021 – 2024
Production-scale scene and shot boundary detection system for video content at Tubi.
MMFT-BERT: Multimodal Fusion for Video QA
2019 – 2020
A multimodal fusion transformer with BERT encodings that achieves state-of-the-art results on the TVQA dataset.
Task-Focused Attention for Robotic Manipulation
2018 – 2019
Robustifying deep visuomotor policies through task-focused visual attention guided by natural language instructions.
Visual Text Correction
2017 – 2018
A vision-and-language task for automatically detecting and correcting falsified words in video descriptions.
Multi-Concept Video Retrieval
2016 – 2017
A latent-variable model for retrieving videos by multiple concepts supplied directly by users or inferred from queries.
Video Fill In the Blank
2016 – 2017
Bidirectional LSTMs with spatial-temporal attention to predict missing words in video descriptions.
Deep Photo Cropper and Enhancement
2019 – 2020
Dual deep networks for precise cropping and super-resolution enhancement of embedded images.
ML for Advanced Frequency Management
2022 – 2024
Machine learning techniques for advanced frequency management in streaming media delivery.