-
pdf Revealing Perceptual Proxies with Adversarial Examples ↗
Click to read abstract
Data visualizations convert numbers into visual marks so that our visual system can extract data from an image instead of raw numbers. Clearly, the visual system does not compute these values as a computer would, as an arithmetic mean or a correlation.Instead, it extracts these patterns using perceptual proxies; heuristic shortcuts of the visual marks, such as a center of mass or a shape envelope. Understanding which proxies people use would lead to more effective visualizations. We present the results of a series of crowdsourced experiments that measure how powerfully a set of candidate proxies can explain human performance when comparing the mean and range of pairs of data series presented as bar charts. We generated datasets where the correct answer---the series with the larger arithmetic mean or range---was pitted against an "adversarial" series that should be seen as larger if the viewer uses a particular candidate proxy. We used both Bayesian logistic regression models and a robust Bayesian mixed-effects linear model to measure how strongly each adversarial proxy could drive viewers to answer incorrectly and whether different individuals may use different proxies. Finally, we attempt to construct adversarial datasets from scratch, using an iterative crowdsourcing procedure to perform black-box optimization.
-
pdf The Perceptual Proxies of Visual Comparison ↗
Click to read abstract
Perceptual tasks in visualizations often involve comparisons. Of two sets of values depicted in two charts, which set had values that were the highest overall? Which had the widest range? Prior empirical work found that the performance on different visual comparison tasks (e.g., "biggest delta", "biggest correlation") varied widely across different combinations of marks and spatial arrangements. In this paper, we expand upon these combinations in an empirical evaluation of two new comparison tasks: the "biggest mean" and "biggest range" between two sets of values. We used a staircase procedure to titrate the difficulty of the data comparison to assess which arrangements produced the most precise comparisons for each task. We find visual comparisons of biggest mean and biggest range are supported by some chart arrangements more than others, and that this pattern is substantially different from the pattern for other tasks. To synthesize these dissonant findings, we argue that we must understand which features of a visualization are actually used by the human visual system to solve a given task. We call these perceptual proxies. For example, when comparing the means of two bar charts, the visual system might use a "Mean length" proxy that isolates the actual lengths of the bars and then constructs a true average across these lengths. Alternatively, it might use a "Hull Area" proxy that perceives an implied hull bounded by the bars of each chart and then compares the areas of these hulls. We propose a series of potential proxies across different tasks, marks, and spatial arrangements. Simple models of these proxies can be empirically evaluated for their explanatory power by matching their performance to human performance across these marks, arrangements, and tasks. We use this process to highlight candidates for perceptual proxies that might scale more broadly to explain performance in visual comparison.
-
pdf Face to Face: Evaluating Visual Comparison ↗
Click to read abstract
Data are often viewed as a single set of values, but those values frequently must be compared with another set. The existing evaluations of designs that facilitate these comparisons tend to be based on intuitive reasoning, rather than quantifiable measures. We build on this work with a series of crowdsourced experiments that use low-level perceptual comparison tasks that arise frequently in comparisons within data visualizations (e.g., which value changes the most between the two sets of data?). Participants completed these tasks across a variety of layouts: overlaid, two arrangements of juxtaposed small multiples, mirror-symmetric small multiples, and animated transitions. A staircase procedure sought the difficulty level (e.g., value change delta) that led to equivalent accuracy for each layout. Confirming prior intuition, we observe high levels of performance for overlaid versus standard small multiples. However, we also find performance improvements for both mirror symmetric small multiples and animated transitions. While some results are incongruent with common wisdom in data visualization, they align with previous work in perceptual psychology, and thus have potentially strong implications for visual comparison designs.