Our Publications

Selected work

3DV

2026

Dense Motion Captioning

NeurIPS

2025

ConViS-Bench: Estimating Video Similarity Through Semantic Concepts

ImageNet-trained CNNs are not biased towards texture: Revisiting feature reliance through controlled suppression

Tom Burgert, Oliver Stoll, Paolo Rota, Begüm Demir
Increasing the Utility of Synthetic Images through Chamfer Guidance

Nicola Dall'Asen, Xiaofeng Zhang, Reyhane Askari Hemmat, Melissa Hall, Jakob Verbeek, Adriana Romero-Soriano, Michal Drozdzal
When Semantics Mislead Vision: Mitigating Large Multimodal Models Hallucinations in Scene Text Spotting and Understanding

Yan Shu, Hangui Lin, Yexin Liu, Yan Zhang, Gangyan Zeng, Yan Li, Yu Zhou, Ser-Nam Lim, Harry Yang, Niculae Sebe

ACM Multimedia

2025

AlignCAT: Visual-Linguistic Alignment of Category and Attribute for Weakly Supervised Visual Grounding

Yidan Wang, Chenyi Zhuang, Wutao Liu, Pan Gao, Niculae Sebe
Dynamic Scoring with Enhanced Semantics for Training-Free Human-Object Interaction Detection

ICCV

2025

On Large Multimodal Models as Open-World Image Classifiers
