-
pdf Keshif: Rapid and Expressive Tabular Data Exploration for Novices ↗
Click to read abstract
General purpose graphical interfaces for data exploration are typically based on manual visualization and interaction specifications. While designing manual specification can be very expressive, it demands high efforts to make effective decisions, therefore reducing exploratory speed. Instead, principled automated designs can increase exploratory speed, decrease learning efforts, help avoid ineffective decisions, and therefore better support data analytics novices. Towards these goals, we present Keshif, a new systematic design for tabular data exploration. To summarize a given dataset, Keshif aggregates records by value within attribute summaries, and visualizes aggregate characteristics using a consistent design based on data types. To reveal data distribution details, Keshif features three complementary linked selections: highlighting, filtering, and comparison. Keshif further increases expressiveness through aggregate metrics, absolute/part-of scale modes, calculated attributes, and saved selections, all working in synchrony. Its automated design approach also simplifies authoring of dashboards composed of summaries and individual records from raw data using fluid interaction. We show examples selected from 160+ datasets from diverse domains. Our study with novices shows that after exploring raw data for 15 minutes, our participants reached close to 30 data insights on average, comparable to other studies with skilled users using more complex tools.
-
pdf Raising the Bars: Evaluating Treemaps vs. Wrapped Bars for Dense Visualization of Sorted Numeric Data ↗
Click to read abstract
A standard (single-column) bar chart can effectively visualize a sorted list of numeric records. However, the chart height limits the number of visible records. To show more records, the bars could be made thinner (which could hinder identifying records individually), and scrolling requires interaction to see the overview. Treemaps have been used in practice in non-hierarchical settings for dense visualization of numeric data. Alternatively, we consider wrapped bars, a multi-column bar chart that uses length instead of area to encode numeric values. We compare treemaps and wrapped bars based on their design characteristics, and graphical perception performance for comparison, ranking, and overview tasks using crowdsourced experiments. Our analysis found that wrapped bars perceptually outperform treemaps in all three tasks for dense visualization of non-hierarchical, sorted numeric data.
-
pdf AggreSet: Rich and Scalable Set Exploration using Visualizations of Element Aggregations ↗
Click to read abstract
Datasets commonly include multi-value (set-typed) attributes that describe set memberships over elements, such as genres per movie or courses taken per student. Set-typed attributes describe rich relations across elements, sets, and the set intersections. Increasing the number of sets results in a combinatorial growth of relations and creates scalability challenges. Exploratory tasks (e.g. selection, comparison) have commonly been designed in separation for set-typed attributes, which reduces interface consistency. To improve on scalability and to support rich, contextual exploration of set-typed data, we present AggreSet. AggreSet creates aggregations for each data dimension: sets, set-degrees, set-pair intersections, and other attributes. It visualizes the element count per aggregate using a matrix plot for set-pair intersections, and histograms for set lists, set-degrees and other attributes. Its non-overlapping visual design is scalable to numerous and large sets. AggreSet supports selection, filtering, and comparison as core exploratory tasks. It allows analysis of set relations inluding subsets, disjoint sets and set intersection strength, and also features perceptual set ordering for detecting patterns in set matrices. Its interaction is designed for rich and rapid data exploration. We demonstrate results on a wide range of datasets from different domains with varying characteristics, and report on expert reviews and a case study using student enrollment and degree data with assistant deans at a major public university.