Selected work
2026
2025
ConViS-Bench: Estimating Video Similarity Through Semantic Concepts
ImageNet-trained CNNs are not biased towards texture: Revisiting feature reliance through controlled suppression
Increasing the Utility of Synthetic Images through Chamfer Guidance
When Semantics Mislead Vision: Mitigating Large Multimodal Models Hallucinations in Scene Text Spotting and Understanding
2025
AlignCAT: Visual-Linguistic Alignment of Category and Attributefor Weakly Supervised Visual Grounding
Dynamic Scoring with Enhanced Semantics for Training-Free Human-Object Interaction Detection
2025
On Large Multimodal Models as Open-World Image Classifiers