University of Trento · DISI

Multimedia & Human Understanding Group

A research group at the University of Trento working on computer vision, video understanding, and multimodal AI — building models that see, listen, and reason about the world.

50 Members

244 Papers

3 Active Projects

Est. 2009

What we do

Research Areas

Vision-Language and Multimodal Models

Vision-language alignment with large multimodal foundation models.

3D Vision and Spatial Scene Understanding

Spatial scene reconstruction and semantic segmentation.

Geometric Deep Learning and Non-Euclidean Networks

Representation learning with non-Euclidean and geometric neural networks.

Generative AI

Image synthesis and degradation-agnostic restoration with attention mechanisms and manifold regularization.

Human-Centric Analysis and Motion Understanding

Human motion analysis and privacy-preserving vision with action-aligned representations.

Trustworthy AI and Adversarial Security

Trustworthy AI and adversarial defense with vision-language and multi-view systems.

Ongoing work

Featured Projects

See all projects

ELIAS

2023 – 2027

ELLIOT

2025 – 2029

MUR

GUIDANCE

2025 – 2028

Updates

Latest News

All news

May 21, 2026

News: Our Group is Heading to CVPR!

Come talk with us at our poster sessions. Here's the full schedule.