Computer Vision Videos

Xiaolei Diao – Towards data-centric AI (55:26)
31/05/2023
Data quality is critical for multimedia tasks, yet recent work has identified various types of systematic flaws in image benchmark datasets. In particular, the semantic gap problem leads to a many-to-many mapping between the information extracted from an image and its linguistic description. This unavoidable bias further leads to poor performance on current computer vision tasks. To address this issue, we introduce a Knowledge Representation (KR)-based methodology to provide guidelines driving the labeling process, thereby indirectly introducing intended semantics in ML models. Specifically, an iterative refinement-based annotation method is proposed to optimize data labeling by organizing objects in a classification hierarchy according to their visual properties, ensuring that they align with their linguistic descriptions. Preliminary results verify the effectiveness of the proposed method.
Slides

Mayukh Bagchi – Aligning Visual and Lexical Semantics (58:43)
11/01/2023
The presentation will discuss two kinds of semantics relevant to semantics-enhanced Computer Vision (CV) systems: Visual Semantics and Lexical Semantics. While visual semantics focuses on how humans build concepts when using vision to perceive a target reality, lexical semantics focuses on how humans build concepts of the same target reality through the use of language. The lack of coincidence between visual and lexical semantics, in turn, has a major impact on CV systems in the form of the Semantic Gap Problem (SGP). After illustrating this lack of coincidence with examples, the presentation will propose a general methodology to enforce one-to-one semantic alignment between visual and lexical semantics.
Slides

Xiaolei Diao – Incremental image labeling via Human-in-the-loop iterative refinement (40:07)
23/11/2022
Recent work in data-driven machine learning has discussed various types of systematic flaws in image benchmark datasets. In particular, the semantic gap problem leads to a many-to-many mapping between the information extracted from an image and its linguistic description. This unavoidable bias further leads to poor performance on current computer vision tasks. To address this issue, we introduce a Knowledge Representation (KR)-based methodology to provide guidelines driving the labeling process, thereby indirectly introducing intended semantics in ML models. Specifically, a novel human-in-the-loop iterative refinement process is proposed to optimize data labeling by organizing objects in a classification hierarchy according to their visual properties, ensuring that they are aligned with their linguistic descriptions. Preliminary results verify the effectiveness of the proposed method.
Slides

Xiutiao Ye – A brief introduction to the paper: “From ImageNet to Image Classification: Contextualizing Progress on Benchmarks”
KnowDive Seminars (1:02:46)
26/01/2022
Find the paper