How to do research
Introduction to the course
Research Methodology Practice (20:59)
The KnowDive group is starting a new course, Research Methodology Practice, taught by Prof. Fausto Giunchiglia. The first lecture introduced the course organization and the “WHAT, HOW, and WHY” of research.
Xiaolei Diao – Towards data-centric AI (55:26)
Data quality is critical for multimedia tasks, yet various types of systematic flaws have been found in image benchmark datasets, as discussed in recent work. In particular, the semantic gap problem leads to a many-to-many mapping between the information extracted from an image and its linguistic description. This unavoidable bias in turn leads to poor performance on current computer vision tasks. To address this issue, we introduce a Knowledge Representation (KR)-based methodology that provides guidelines driving the labeling process, thereby indirectly introducing the intended semantics into ML models. Specifically, we propose an iterative refinement-based annotation method that optimizes data labeling by organizing objects in a classification hierarchy according to their visual properties, ensuring they align with their linguistic descriptions. Preliminary results verify the effectiveness of the proposed method.
Simone Bocca – Data quality & Interoperability – The DataScientia approach (55:33)
Data quality and interoperability play a crucial role in data generation and exploitation. The cost of “noisy” (or “dirty”) data, as well as of a low level of data interoperability, is paid at different stages of the data lifecycle: it affects data retrieval, data interpretation, data adaptation and integration, and therefore the capacity to exploit the data; in other words, the value of the data itself. To limit such costs, quality and interoperability principles have been defined, such as the 5-star Open Data scheme and the FAIR principles. Nevertheless, a further step can be taken towards data resources that can be smartly exploited in different contexts. This seminar summarizes the current approaches to data quality and interoperability, and then describes the approach defined by the DataScientia foundation to take this additional step towards higher-quality, more interoperable data.
Human Machine Symbiosis
Andrea Bontempelli – Concept-level Debugging of Part-Prototype Networks (57:19)
Part-prototype Networks (ProtoPNets) are concept-based classifiers designed to achieve the same performance as black-box models without compromising transparency. ProtoPNets compute predictions based on similarity to class-specific part-prototypes learned to recognize parts of training examples, making it easy to faithfully determine what examples are responsible for any target prediction and why. However, like other models, they are prone to picking up confounders and shortcuts from the data, thus suffering from compromised prediction accuracy and limited generalization. We propose ProtoPDebug, an effective concept-level debugger for ProtoPNets in which a human supervisor, guided by the model’s explanations, supplies feedback in the form of what part-prototypes must be forgotten or kept, and the model is fine-tuned to align with this supervision. Our experimental evaluation shows that ProtoPDebug outperforms state-of-the-art debuggers for a fraction of the annotation cost. An online experiment with laypeople confirms the simplicity of the feedback requested to the users and the effectiveness of the collected feedback for learning confounder-free part-prototypes. ProtoPDebug is a promising tool for trustworthy interactive learning in critical applications, as suggested by a preliminary evaluation on a medical decision making task.
Mayukh Bagchi – From Knowledge Representation to Knowledge Organization and Back (52:42)
Knowledge Representation (KR) and facet-analytical Knowledge Organization (KO) have been the two most prominent methodologies of data and knowledge modelling in the Artificial Intelligence community and the Information Science community, respectively. KR boasts a robust and scalable ecosystem of technologies to support knowledge modelling while often underemphasizing the quality of its models (and model-based data). KO, on the other hand, is less technology-driven but has developed a robust framework of guiding principles (canons) for ensuring modelling (and model-based data) quality. This seminar will present both the KR and facet-analytical KO methodologies at a high level and will provide a functional mapping between them. Building on this mapping, the seminar will propose an integrated KO-enriched KR methodology with all the standard components of a KR methodology plus the guiding canons of modelling quality provided by KO.
AI for Healthcare
Gábor Bella – Sharing health data for research: Technical perspective
InteropEHRate Final Conference (14:48)
28/09/2022, Venue: Université de Liège
In the context of the InteropEHRate project, the presentation described a novel data sharing protocol focused on health data for research purposes. It gives an overview of all the protocol steps and explains how the protocol is supported by a dedicated technological infrastructure. Finally, the presentation reports how the Research Data Sharing protocol was concretely implemented during the InteropEHRate project’s pilots.
Alessio Zamboni – Agile Container @ Knowdive (57:41)
This presentation introduces the computing resources available in our research group and the technologies used to deploy applications in production. It covers different platforms but concentrates mostly on the leading state-of-the-art orchestration technology: Kubernetes. We will see how Kubernetes works, what its limits are, and discuss best practices for development.
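As a flavour of the kind of orchestration the talk covers, here is a minimal Kubernetes Deployment manifest (a hypothetical example for illustration, not taken from the talk; the application name and image are placeholders):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-app            # hypothetical application name
spec:
  replicas: 2               # Kubernetes keeps two pods running at all times
  selector:
    matchLabels:
      app: demo-app
  template:
    metadata:
      labels:
        app: demo-app
    spec:
      containers:
        - name: demo-app
          image: nginx:1.25          # placeholder container image
          ports:
            - containerPort: 80
          resources:                 # requests help the scheduler place pods
            requests:
              cpu: 100m
              memory: 128Mi
```

Applying this with `kubectl apply -f deployment.yaml` lets the cluster reconcile the desired state (two replicas) against reality, which is the core idea behind Kubernetes orchestration.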