Center for Anytime Anywhere Analytics

Journal Paper

#81

@article{Newburger2022,
  title = {Fitting Bell Curves to Data Distributions using Visualization},
  author = {Eric Newburger and Michael Correll and Niklas Elmqvist},
  url = {https://users.umiacs.umd.edu/~elm/projects/fitting-bells/fitting-bells.pdf},
  year = {2022},
  date = {2022-10-01},
  journal = {IEEE Transactions on Visualization \& Computer Graphics},
  abstract = {Idealized probability distributions, such as normal or other curves, lie at the root of confirmatory statistical tests. But how well do people understand these idealized curves? In practical terms, does the human visual system allow us to match sample data distributions with hypothesized population distributions from which those samples might have been drawn? And how do different visualization techniques impact this capability? This paper shares the results of a crowdsourced experiment that tested the ability of respondents to fit normal curves to four different data distribution visualizations: bar histograms, dotplot histograms, strip plots, and boxplots. We find that the crowd can estimate the center (mean) of a distribution with some success and little bias. We also find that people generally overestimate the standard deviation---which we dub the ``umbrella effect'' because people tend to want to cover the whole distribution using the curve, as if sheltering it from the heavens above---and that strip plots yield the best accuracy.},
  keywords = {}
}

IEEE Transactions on Visualization & Computer Graphics • 2022

Fitting Bell Curves to Data Distributions using Visualization

Eric Newburger

Michael Correll

Niklas Elmqvist


Idealized probability distributions, such as normal or other curves, lie at the root of confirmatory statistical tests. But how well do people understand these idealized curves? In practical terms, does the human visual system allow us to match sample data distributions with hypothesized population distributions from which those samples might have been drawn? And how do different visualization techniques impact this capability? This paper shares the results of a crowdsourced experiment that tested the ability of respondents to fit normal curves to four different data distribution visualizations: bar histograms, dotplot histograms, strip plots, and boxplots. We find that the crowd can estimate the center (mean) of a distribution with some success and little bias. We also find that people generally overestimate the standard deviation, which we dub the “umbrella effect” because people tend to want to cover the whole distribution using the curve, as if sheltering it from the heavens above, and that strip plots yield the best accuracy.
