Computer Vision Videos

Mayukh Bagchi – Aligning Visual and Lexical Semantics (58:43)
The presentation discusses two kinds of semantics relevant to semantics-enhanced Computer Vision (CV) systems: visual semantics and lexical semantics. Visual semantics concerns how humans build concepts when using vision to perceive a target reality, while lexical semantics concerns how humans build concepts of the same target reality through language. The lack of coincidence between the two has a major impact on CV systems in the form of the Semantic Gap Problem (SGP). After illustrating this lack of coincidence with examples, the presentation proposes a general methodology to enforce one-to-one alignment between visual and lexical semantics.

Xiaolei Diao – Incremental image labeling via Human-in-the-loop iterative refinement (40:07)
Recent work in data-driven machine learning has discussed various types of systematic flaws in image benchmark datasets. In particular, the semantic gap problem leads to a many-to-many mapping between the information extracted from an image and its linguistic description. This unavoidable bias in turn leads to poor performance on current computer vision tasks. To address this issue, we introduce a Knowledge Representation (KR)-based methodology that provides guidelines for the labeling process, thereby indirectly introducing the intended semantics into machine learning models. Specifically, we propose a novel human-in-the-loop iterative refinement process that optimizes data labeling by organizing objects in a classification hierarchy according to their visual properties, ensuring that they are aligned with their linguistic descriptions. Preliminary results support the effectiveness of the proposed method.

Xiutiao Ye – A brief introduction to the paper: “From ImageNet to Image Classification: Contextualizing Progress on Benchmarks”
KnowDive Seminars (1:02:46)
26/01/2022