Anscombe's Quartet
Anscombe's Quartet comprises four datasets that have nearly identical simple statistical properties, yet appear very different when graphed. Each dataset consists of eleven (x, y) points. They were constructed in 1973 by the statistician Francis Anscombe to demonstrate both the importance of graphing data before analyzing it and the effect of outliers on statistical properties. He described the article as being intended to counter the impression among statisticians that "numerical calculations are exact, but graphs are rough."
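The claim of nearly identical summary statistics can be checked directly. The following is a minimal sketch, not necessarily the code used in the notebook; it assumes the built-in `anscombe` data frame that ships with R's datasets package (columns x1 to x4 and y1 to y4), and the name `summary_stats` is chosen here purely for illustration.

```r
# Assumption: R's built-in `anscombe` data frame (datasets package),
# with columns x1..x4 and y1..y4.
data(anscombe)

# Compute the same summary statistics for each of the four datasets.
summary_stats <- do.call(rbind, lapply(1:4, function(i) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  data.frame(
    dataset = i,
    mean_x  = mean(x),
    var_x   = var(x),
    mean_y  = mean(y),
    var_y   = var(y),
    cor_xy  = cor(x, y)
  )
}))

# One row per dataset; the columns should agree closely across rows.
print(round(summary_stats, 2))
```

Each row should show a mean of x equal to 9, a variance of x equal to 11, a mean of y close to 7.5, and a correlation close to 0.816, which is why the numbers alone cannot tell the four datasets apart.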
Here we use R inside a Jupyter Notebook to briefly explore and then graph these datasets, highlighting the importance of visually inspecting data.
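As a rough illustration of the graphing step, the sketch below draws the four datasets in a 2 x 2 grid using base R graphics and overlays a least-squares line on each panel. It again assumes the built-in `anscombe` data frame; the axis limits, colours, and the name `fits` are illustrative choices and may differ from what the notebook does.

```r
# Assumption: R's built-in `anscombe` data frame (datasets package).
data(anscombe)

# Arrange the four scatter plots in a 2 x 2 grid; keep the old settings.
op <- par(mfrow = c(2, 2), mar = c(4, 4, 2, 1))

# Plot each dataset on shared axis limits and fit a simple linear model.
fits <- lapply(1:4, function(i) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  plot(x, y,
       xlim = c(3, 20), ylim = c(2, 14), pch = 19,
       main = paste("Dataset", i), xlab = "x", ylab = "y")
  fit <- lm(y ~ x)          # ordinary least-squares regression of y on x
  abline(fit, col = "red")  # overlay the fitted line
  fit
})

# Restore the previous graphics settings.
par(op)

# Intercept and slope for each dataset, one column per dataset.
print(round(sapply(fits, coef), 2))
```

The fitted lines are almost the same in every panel (intercept near 3, slope near 0.5), while the point patterns differ: a roughly linear cloud, a curve, a line with one outlier, and a vertical cluster with one high-leverage point.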
The notebook can be viewed here.
The GitHub repository for this project can be viewed here.