Datavis with R:
Drawing a Cleveland dot plot with ggplot2


Cleveland dot plots are a great alternative to a simple bar chart, particularly if you have more than a few items. It doesn’t take much for a bar chart to look cluttered. In the same amount of space, a dot plot can show many more values, and it’s easier to read as well. Base R has a built-in function, dotchart(), but because the graph is so easy to draw, building it “from scratch” in ggplot2 or base graphics allows for more customization. Here is a dot plot showing fertility data from the built-in swiss dataset drawn with ggplot2:

Cleveland dot plot with ggplot2

```{r ggdot, fig.height = 6, fig.width = 5}
# chunk options set the figure height and width in inches
library(dplyr)    # data manipulation package that works well with ggplot2
library(ggplot2)  # data visualization package based on the grammar of graphics

# create a theme for dot plots, which can be reused
theme_dotplot <- theme_bw(14) +                           # white background, 14 pt base font
    theme(axis.text.y = element_text(size = rel(.75)),    # Province names at 75% of default size
        axis.ticks.y = element_blank(),                   # remove y-axis tick marks
        axis.title.x = element_text(size = rel(.75)),     # x-axis label at 75% of default size
        panel.grid.major.x = element_blank(),             # remove major vertical gridlines
        panel.grid.major.y = element_line(size = 0.5),    # darken major horizontal gridlines (theme default is 0.2)
        panel.grid.minor.x = element_blank())             # remove minor vertical gridlines

# move row names to a dataframe column named "Province", since ggplot2 ignores row names
# (add_rownames() is deprecated in later dplyr releases; tibble::rownames_to_column() replaces it)
df <- swiss %>% add_rownames("Province")

# create the plot
ggplot(df, aes(x = Fertility,                             # map "Fertility" to the x-axis
               y = reorder(Province, Fertility))) +       # order provinces by Fertility, ascending from bottom to top
    geom_point(color = "blue") +                          # draw the dots
    scale_x_continuous(limits = c(35, 95),                # x-axis range: 35 to 95
        breaks = seq(40, 90, 10)) +                       # labeled ticks at multiples of 10 from 40 to 90
    theme_dotplot +                                       # theme created above; no parens since it is not a function
    xlab("\nannual live births per 1,000 women aged 15-44") +   # leading "\n" moves the label down
    ylab("French-speaking provinces\n") +                 # trailing "\n" moves the label to the left
    ggtitle("Standardized Fertility Measure\nSwitzerland, 1888")  # "\n" breaks the title into two lines
```
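
For comparison, the base R route mentioned at the start can be sketched with dotchart(). This is only a minimal sketch, not code from the original post: the object name `fert`, the plotting character, and the text size are assumptions chosen to roughly mirror the ggplot2 version.

```r
# Sort by Fertility: dotchart() draws the first row at the bottom,
# so ascending order puts the highest value at the top,
# matching the ordering of the ggplot2 version.
fert <- swiss[order(swiss$Fertility), ]   # `fert` is an illustrative name

dotchart(fert$Fertility,
         labels = rownames(fert),          # province names come from the row names
         xlim = c(35, 95),                 # same x-axis range as the ggplot2 plot
         pch = 19,                         # solid dots
         cex = 0.75,                       # smaller text so all 47 labels fit
         xlab = "annual live births per 1,000 women aged 15-44",
         main = "Standardized Fertility Measure\nSwitzerland, 1888")
```

The base version needs no extra packages, but gridlines and fonts are adjusted through dotchart() arguments and par() settings rather than through a reusable theme object.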

For more information on dot plots, see:
Naomi Robbins, “Dot Plots: A Useful Alternative to Bar Charts”

Webinar April 28, 10am PST: Effective Graphs with Microsoft R Open


(reposted from:

Naomi Robbins, author of Creating More Effective Graphs and Forbes contributor, has teamed up with her daughter, Dr. Joyce Robbins, to present a new webinar this Thursday, April 28: Creating Effective Graphs with Microsoft R Open. The webinar will demonstrate how to create a variety of useful graphics with R: comparisons, distributions, trends over time, relationships, divisions of a whole, and much more.


This webinar will be useful for anyone who wants to learn how to display data graphically with the greatest impact. The webinar will use Microsoft R Open, but since it’s 100% compatible with R, the code provided during the webinar can be used with any edition of R. The webinar will begin at 10AM Pacific time, and I’ll be hosting and passing your questions to the presenters. Even if you can’t make the live event, sign up to receive a link to the slides and replay, plus a free copy of a new 50-page e-book by the presenters.

To register for the webinar, follow the link below.

Microsoft Advanced Analytics and IoT: Creating Effective Graphs with Microsoft R Open


How to Improve Your Graphs in R


I teamed up with my mom, Naomi Robbins, to write Effective Graphs with Microsoft R Open, which we just completed and which is available for download. I enjoyed returning to coding after a long break, and it was a great experience to be part of a mother-daughter team. Naomi took responsibility for the theory on drawing good graphs, and I did the R coding. The guide is written at an advanced beginner level: it assumes that readers have basic knowledge of data manipulation in R, but our hope is that anyone with an interest in improving their graphs will find something useful in it. We divided the material into sections based on the type of data: direct comparisons, distributions, trends over time, relationships, percents, and “special cases”: diverging stacked bar charts and linked micromaps. Code is available on GitHub.

Two Worlds


I’m a sociologist with a geeky side. In the blog posts to follow, I’ll share my observations on society, interspersed with how-tos on programming, mainly with R. If you have any ideas on what you’d like me to blog about, please contact me.



“Specialists without spirit, sensualists without heart; this nullity imagines that it has attained a level of civilization never before achieved.”
-Max Weber