Overview

This workshop explores the quality of graph data and of graph models in the context of graph mining and machine learning on graphs. Like all data, graphs may be noisy. For example, links and nodes may be missing or carry wrong attribute values (random noise), there may be structural errors in the collection or coding (bias), this bias may be related to protected groups (fairness), or the graph may have been tampered with (adversarial changes). Further, calibrated noise may be added purposefully to address privacy concerns. Because the objects that comprise a graph are structured and non-i.i.d., special methods are needed to address these issues.

To address data quality issues for graphs, the first step is to have methods that identify possible problems and present the resulting insights. That is, methods are needed specifically to identify bias, outliers, noise, unexpected values, etc. Such methods may be fully automatic or include a human-in-the-loop component, for example by employing graph visualisation for interaction. Secondly, modeling and prediction typically depend on a machine learning step that fits a model to the graph. Robust methods that learn models on graphs with noisy data must be developed. Further, the evaluation of bias and fairness in the model's outcome should be balanced against the model's prediction quality. Moreover, little research has explored the interaction between data quality and model quality.

To summarize, the workshop covers research on and methods for:

  1. The assessment, quantification, and identification of data and model quality problems on graphs.
  2. The interplay between problems in data quality and model quality.
  3. The (semi-)automatic improvement of data and model quality.

More specifically, we would like to cover the following topics at the workshop:

  • Anomaly detection on graphs
  • Assessment of fairness and bias in the context of graphs, graph models (including representations), and subsequent tasks such as link prediction
  • Explainable graph models and predictions
  • Graph models and representation learning in the context of missing data and noise
  • Privacy-preserving data mining and machine learning for relational data
  • Probabilistic methods and uncertainty estimation on networks
  • Algorithms and metrics for quality preservation on relational data

This workshop aims to connect researchers working in this area, provide a platform for presenting the latest developments in new methods, and stimulate discussion of these issues.