System Design to Utilize Domain Expertise for Visual Exploratory Data Analysis release_poqm5p662barnkn5iyigor5qtu

by Tristan Langer, Tobias Meisen

Published in Information by MDPI AG.

2021   Volume 12, p140

Abstract

Exploratory data analysis (EDA) is an iterative process where data scientists interact with data to extract information about their quality and shape as well as derive knowledge and new insights into the related domain of the dataset. However, data scientists are rarely experienced domain experts who have tangible knowledge about a domain. Integrating domain knowledge into the analytic process is a complex challenge that usually requires constant communication between data scientists and domain experts. For this reason, it is desirable to reuse the domain insights from exploratory analyses in similar use cases. With this objective in mind, we present a conceptual system design on how to extract domain expertise while performing EDA and utilize it to guide other data scientists in similar use cases. Our system design introduces two concepts, interaction storage and analysis context storage, to record user interaction and interesting data points during an exploratory analysis. For new use cases, it identifies historical interactions from similar use cases and facilitates the recorded data to construct candidate interaction sequences and predict their potential insight—i.e., the insight generated from performing the sequence. Based on these predictions, the system recommends the sequences with the highest predicted insight to data scientist. We implement a prototype to test the general feasibility of our system design and enable further research in this area. Within the prototype, we present an exemplary use case that demonstrates the usefulness of recommended interactions. Finally, we give a critical reflection of our first prototype and discuss research opportunities resulting from our system design.
In application/xml+jats format

Archived Files and Locations

application/pdf  1.6 MB
file_nylbuod2nbcvzna3lqyccwwtbq
res.mdpi.com (publisher)
web.archive.org (webarchive)
Read Archived PDF
Preserved and Accessible
Type  article-journal
Stage   published
Date   2021-03-24
Language   en ?
Container Metadata
Open Access Publication
In DOAJ
In ISSN ROAD
In Keepers Registry
ISSN-L:  2078-2489
Work Entity
access all versions, variants, and formats of this works (eg, pre-prints)
Catalog Record
Revision: 7b75754c-a273-4232-b247-63f6bd9f1064
API URL: JSON