
Bias Assessments of Benchmarks for Link Predictions over Knowledge Graphs

Abstract

Link prediction (LP) aims to tackle the challenge of predicting new facts by reasoning over a knowledge graph (KG). Different machine learning architectures have been proposed to solve the task of LP, several of them competing for better performance on a few de facto benchmarks. This thesis addresses the characterization of LP datasets with regard to their structural bias properties and the effects of those properties on attained performance results. We provide a domain-agnostic framework that assesses the network topology, test leakage bias, and sample selection bias in LP datasets. The framework includes SPARQL queries that can be reused in the exploratory data analysis of KGs to uncover unusual patterns. We apply our framework to characterize seven common benchmarks used for assessing the task of LP. In our experiments, we use a trained TransE model to show how the two bias types affect prediction results. Our analysis reveals problematic patterns in most of the benchmark datasets. Especially critical are the findings regarding the state-of-the-art benchmarks FB15k-237, WN18RR, and YAGO3-10.
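The test leakage bias mentioned in the abstract is commonly checked by looking for test triples whose entity pair already appears in the training split, possibly under an inverse relation. The following is a minimal sketch of that idea; the triples and relation names are illustrative assumptions, not data from the thesis or its benchmarks.

```python
# Sketch: detecting test-leakage bias in a link-prediction split.
# A test triple (h, r, t) is flagged as "leaky" when the training set
# already contains the same entity pair in either direction under any
# relation, since a model can then answer the test query by memorizing
# that edge rather than by genuine reasoning.

def leaky_test_triples(train, test):
    """Return test triples whose (head, tail) pair also occurs in training."""
    train_pairs = set()
    for h, _, t in train:
        train_pairs.add((h, t))
        train_pairs.add((t, h))  # the inverse direction leaks as well
    return [(h, r, t) for h, r, t in test if (h, t) in train_pairs]

# Illustrative toy split (hypothetical triples).
train = [("berlin", "capital_of", "germany"),
         ("paris", "capital_of", "france")]
test = [("germany", "has_capital", "berlin"),   # inverse of a train edge
        ("rome", "capital_of", "italy")]        # genuinely unseen pair

print(leaky_test_triples(train, test))
# → [('germany', 'has_capital', 'berlin')]
```

This is the kind of pattern that motivated the construction of FB15k-237 from FB15k, where inverse-relation leakage inflated reported scores.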

Institutional repository of Leibniz Universität Hannover

Last updated on 20/06/2023
