Papers published at top conferences and journals.
Automatic benchmarking of large multimodal models via iterative experiment programming
Diversified in-domain synthesis with efficient fine-tuning for few-shot classification
Evaluating Attribute Confusion in Fashion Text-to-Image Generation
Can Text-to-Video Generation help Video-Language Alignment?
Classifier-to-Bias: Toward Unsupervised Automatic Bias Detection for Visual Classifiers
Compositional Caching for Training-free Open-vocabulary Attribute Detection
Multi-focal Conditioned Latent Diffusion for Person Image Synthesis
Not Only Text: Exploring Compositionality of Visual Representations in Vision-Language Models
Seeing the abstract: Translating the abstract language for vision language models
3D Part Segmentation via Geometric Aggregation of 2D Visual Features
One VLM to keep it learning: Generation and balancing for data-free continual visual question answering
Understanding Matrix Function Normalizations in Covariance Pooling from the Lens of Riemannian Geometry
Group-robust Machine Unlearning
Unlearning Personal Data from a Single Image