VLSlice: Interactive Vision-and-Language Slice Discovery

Slyman, Eric; Kahng, Minsuk; Lee, Stefan

arXiv:2309.06703 (cs)

[Submitted on 13 Sep 2023]

Abstract: Recent work in vision-and-language demonstrates that large-scale pretraining can learn generalizable models that are efficiently transferable to downstream tasks. While this may improve dataset-scale aggregate metrics, analyzing performance around hand-crafted subgroups targeting specific bias dimensions reveals systemic undesirable behaviors. However, this subgroup analysis is frequently stalled by annotation efforts, which require extensive time and resources to collect the necessary data. Prior art attempts to automatically discover subgroups to circumvent these constraints, but it typically leverages model behavior on existing task-specific annotations and rapidly degrades on more complex inputs beyond "tabular" data, and no such approach studies vision-and-language models. This paper presents VLSlice, an interactive system enabling user-guided discovery of coherent representation-level subgroups with consistent visiolinguistic behavior, denoted as vision-and-language slices, from unlabeled image sets. We show that VLSlice enables users to quickly generate diverse high-coherency slices in a user study (n=22) and release the tool publicly.

Comments:	Conference paper at ICCV 2023. 17 pages, 11 figures. this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG)
ACM classes:	I.4.10; I.2.7; J.4
Cite as:	arXiv:2309.06703 [cs.CV]
	(or arXiv:2309.06703v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2309.06703

Submission history

From: Eric Slyman
[v1] Wed, 13 Sep 2023 04:02:38 UTC (22,099 KB)
