How to do research
Introduction to the course
Research Methodology Practice (20:59)
KnowDive is starting a new course, Research Methodology Practice, taught by Prof. Fausto Giunchiglia. The first lecture introduced the course organization and the “WHAT, HOW, and WHY” of research.
Find the slide
More videos on this topic
Xiaolei Diao – Towards data-centric AI (55:26)
Data quality is critical for multimedia tasks, while various types of systematic flaws are found in image benchmark datasets, as discussed in recent work. In particular, the semantic gap problem leads to a many-to-many mapping between the information extracted from an image and its linguistic description. This unavoidable bias further leads to poor performance on current computer vision tasks. To address this issue, we introduce a Knowledge Representation (KR)-based methodology to provide guidelines driving the labeling process, thereby indirectly introducing intended semantics in ML models. Specifically, an iterative refinement-based annotation method is proposed to optimize data labeling by organizing objects in a classification hierarchy according to their visual properties, ensuring they align with their linguistic descriptions. Preliminary results verify the effectiveness of the proposed method.
Simone Bocca – Data quality & Interoperability – The DataScientia approach (55:33)
Data quality and interoperability play a crucial role in data generation and exploitation. The cost of “noisy” (or “dirty”) data, as well as of a low level of data interoperability, is paid at different stages of the data lifecycle. It affects data retrieval, data interpretation, and data adaptation and integration, and therefore the capacity to exploit the data: in other words, the value of the data itself. To limit such costs, quality and interoperability principles have been defined, such as the 5-star Open Data scheme and the FAIR principles. Nevertheless, one more step can be taken towards data resources that can be smartly exploited in different contexts. This seminar summarizes the current approaches to data quality and interoperability, and then describes the approach defined by the DataScientia foundation to take this additional step toward higher-quality, interoperable data.
Human Machine Symbiosis
Andrea Bontempelli – Concept-level Debugging of Part-Prototype Networks (57:19)
Part-prototype Networks (ProtoPNets) are concept-based classifiers designed to achieve the same performance as black-box models without compromising transparency. ProtoPNets compute predictions based on similarity to class-specific part-prototypes learned to recognize parts of training examples, making it easy to faithfully determine which examples are responsible for any target prediction and why. However, like other models, they are prone to picking up confounders and shortcuts from the data, thus suffering from compromised prediction accuracy and limited generalization. We propose ProtoPDebug, an effective concept-level debugger for ProtoPNets in which a human supervisor, guided by the model’s explanations, supplies feedback on which part-prototypes must be forgotten or kept, and the model is fine-tuned to align with this supervision. Our experimental evaluation shows that ProtoPDebug outperforms state-of-the-art debuggers at a fraction of the annotation cost. An online experiment with laypeople confirms the simplicity of the feedback requested from the users and the effectiveness of the collected feedback for learning confounder-free part-prototypes. ProtoPDebug is a promising tool for trustworthy interactive learning in critical applications, as suggested by a preliminary evaluation on a medical decision-making task.
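As a rough illustration of the prediction step described in this abstract, the sketch below scores an input by its similarity to class-specific part-prototypes. It is a simplified toy, not the authors' implementation: the function names, the two-dimensional "patch" vectors and the log-ratio similarity are assumptions made for the example, and the learned feature extractor and final weighting layer of a real ProtoPNet are left out.

```python
import math


def prototype_similarity(patches, prototype, eps=1e-4):
    """Similarity of an input to one part-prototype (hypothetical helper).

    Takes the smallest squared L2 distance between the prototype and any
    patch of the input, then maps it to a score that grows as the distance
    shrinks (an assumed log-ratio form).
    """
    min_dist = min(
        sum((p - q) ** 2 for p, q in zip(patch, prototype))
        for patch in patches
    )
    return math.log((min_dist + 1.0) / (min_dist + eps))


def class_scores(patches, prototypes_by_class):
    """Sum the similarities to each class's prototypes (unweighted, for brevity)."""
    return {
        cls: sum(prototype_similarity(patches, proto) for proto in protos)
        for cls, protos in prototypes_by_class.items()
    }


# Toy input made of two patch embeddings, and one prototype per class.
patches = [[0.0, 0.0], [1.0, 1.0]]
prototypes_by_class = {
    "bird": [[1.0, 1.0]],  # matches the second patch exactly
    "car": [[5.0, 5.0]],   # far from every patch
}

scores = class_scores(patches, prototypes_by_class)
predicted = max(scores, key=scores.get)
print(predicted, scores)
```

Because each score can be traced back to one prototype and one patch, a supervisor can inspect which prototype drove a prediction, which is the kind of explanation the debugging feedback described above relies on.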
Vincenzo Maltese – Cataloguing experts by competences: the Digital University project (12:22)
Subject indexing of non-book resources International Conference
06/02/2023 Venue: Roma, Aula Odeion, Sapienza Università di Roma
Knowledge Organization methodologies are traditionally used to catalog bibliographic material. However, they also prove effective for the creation of controlled vocabularies and for the cataloging and search of non-library resources (for example audio, video, museum items, etc.). I will present the experience gained at the University of Trento as part of the Digital University initiative. By applying Knowledge Organization, Knowledge Representation and Data Integration methodologies, we developed a system that allows indexing and searching experts by their competences (https://webapps.unitn.it/du/en/Esperti). The controlled vocabulary currently consists of about 3000 concepts in Italian and English, and is constantly growing. The vocabulary is managed by applying Ranganathan’s analytico-synthetic approach, and in particular the subdivision rule by genus-species. In Digital University, competences are only one of the metadata we use to describe people (for example, we also manage the name, surname, email and telephone contacts, affiliations, …). In turn, people are only one of the kinds of objects that we manage, which also include publications, theses, courses and research projects. The methodologies that enable these objects to be managed effectively and efficiently were first presented at ISKO UK in 2013 and published in the Knowledge Organization journal in 2014 [1]. The developed solution was published in the Cataloging & Classification Quarterly journal in 2019 [2].
[1] Giunchiglia, Dutta, Maltese, “From Knowledge Organization to Knowledge Representation” in Knowledge Organization, v. 2014, n. 41 (1) (2014), p. 44-56.
[2] Maltese, “Digital Transformation Challenges for Universities: Ensuring Information Consistency Across Digital Services” in Cataloging & Classification Quarterly, v. 2019, 56, n. 7 (2019), p. 1-15. https://www.tandfonline.com/doi/abs/10.1080/01639374.2018.1504847.
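To make the genus-species subdivision mentioned in this abstract concrete, the sketch below shows one way a search by competence could use such a hierarchy: a query for a broad concept also returns experts indexed under any of its narrower concepts. The vocabulary entries, expert names and function names are invented for illustration and are not taken from the Digital University system.

```python
# Hypothetical fragment of a controlled vocabulary: each concept (species)
# points to its broader concept (genus).
BROADER = {
    "machine learning": "artificial intelligence",
    "knowledge representation": "artificial intelligence",
    "artificial intelligence": "computer science",
}

# Hypothetical experts, each indexed by a set of competences.
EXPERTS = {
    "Expert A": {"machine learning"},
    "Expert B": {"knowledge representation"},
    "Expert C": {"computer science"},
}


def is_narrower_or_equal(concept, target, broader=BROADER):
    """Walk up the genus-species chain from concept, looking for target."""
    while concept is not None:
        if concept == target:
            return True
        concept = broader.get(concept)
    return False


def find_experts(target, experts=EXPERTS, broader=BROADER):
    """Return experts whose competences equal the target or are narrower than it."""
    return sorted(
        name
        for name, competences in experts.items()
        if any(is_narrower_or_equal(c, target, broader) for c in competences)
    )


print(find_experts("artificial intelligence"))
print(find_experts("machine learning"))
```

The design choice here is to store only the broader-concept link for each entry, so the vocabulary can grow by adding new species under an existing genus without re-indexing the experts.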
AI for Healthcare
Gábor Bella – Sharing health data for research: Technical perspective
InteropEHRate Final Conference (14:48)
28/09/2022 Venue: Université de Liège
As part of the InteropEHRate project, the presentation describes a novel protocol for sharing health data for research purposes. It gives an overview of all the protocol steps and explains how they are supported by a dedicated technological infrastructure. Finally, it reports how the Research Data Sharing protocol was concretely implemented in the InteropEHRate project’s pilots.
Check more IEHR information on Knowdive website
Alessio Zamboni – Agile Container @ Knowdive (57:41)
This presentation introduces the computing resources available in our research group and the technologies used to deploy applications in production. It covers different platforms but concentrates mostly on the leading orchestration technology, Kubernetes. We will see how Kubernetes works and what its limits are, and we will discuss best practices for development.