Some satisfaction with R

Casual readers of this blog will know that I’ve been using R (a statistical programming language) to sort out some data and plot some graphs. It’s a minor dabbling, and one which is often frustrating. I’m not a natural coder, and I certainly don’t relish the challenges it throws up. But, having used R on and off for a few years now, I’m starting to get a grip on a few things.

Take, for example, a diagram I produced in R for the upcoming GHGT-12 conference, Figure 1. I’ve deliberately taken off anything which might say what the data relates to; partly because it’s not relevant, and partly to keep things under wraps until the conference paper is published in a few months. Not that it’s of great importance, mind you, but at this point in my PhD I’m not sure which cats I can let out of the bag yet.

Figure 1. Faceted histogram of concentration data

Anyway, we can see that it’s a bunch of histograms of three sets of concentration data (green, purple, pink), subdivided by some other properties using the facet_grid() function in ggplot2. Using facet_grid() is very straightforward: you set up an individual plot, and then use facet_grid() to split your data up into panels based on one or two variables, as in the snippet of code below:

library(ggplot2)    #for plotting graph
library(scales)     #for log scale labelling

#Sets the resolution of the printed PNG image file
ppi <- 300          #assumed value; the original line is missing here

#sets up to plot as PNG image
png("Facet Histogram.png", width=7.2*ppi, height=7.2*ppi, res=ppi)   

#plot function. Note variable1 is a substitute for what I actually called it
ggplot(filename, aes(x=Concentration, fill=variable1)) +
  geom_histogram(binwidth=0.25) +    #call restored; any other arguments were lost
  scale_fill_manual(values=c("#16c922", "#e587ff", "#c724f4"),
                    labels=c("1", "2", "3")) +    #I removed these labels from the diagram

#scales the x axis on a log 10 scale, and formats the numbers with superscript powers
  scale_x_log10(breaks=trans_breaks("log10", function(x) 10^x),
                labels=trans_format("log10", math_format(10^.x))) +  

#The magic facet_grid function, variable names are substitutes
  facet_grid(variable2 ~ variable3) +
  ylab("Counts") +
  theme_bw() +

#My own theme settings, not included here

So, there I was, all pleased with the output, when my supervisor had a look and suggested that I should add an indicator of the analytical detection limits, since there are lots of values in the data which sit at these limits. “Bugger”, was my first thought. “I’m going to have to manually draw in 27 different lines or arrows” was the second. But then I remembered that I can use ggplot2 to draw vertical lines using geom_vline(), and that the lines can be given values based on the data in a file. So, the moment of satisfaction for me was working out how to add that data in, and ensuring the lines plotted out in the right facets. So long as the newly added data (as a new data frame) has the same column names as those used to plot the facets, then job’s a good ‘un. You only need to add the geom_vline() into the histogram plot code, and the facet_grid() function takes care of the rest, Figure 2. The code is below the figure.

Figure 2. Faceted histogram with added detection limit reference lines.

#I've chopped off the top and bottom parts to show where the geom_vline was inserted

scale_x_log10(breaks=trans_breaks("log10", function(x) 10^x),
                labels=trans_format("log10", math_format(10^.x))) +  

#In it goes...
  geom_vline(data=lod, aes(xintercept=LOD),  #adds vertical dotted line with detection limit
             linetype="dotted", size=0.4) +

#The lod data frame has the same variable names as previously used
  facet_grid(variable2 ~ variable3) + 
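
Since the snippets above are chopped up and lean on data I can't share, here is a minimal self-contained sketch of the same idea using made-up numbers. Everything in it is an assumption for illustration only: the data frames conc and lod, the columns analyte and site, and the detection limit values are all hypothetical stand-ins, not what is in the conference paper. The point it shows is that the second data frame carries the same faceting column names as the first, so each vertical line ends up in its own panel.

```r
library(ggplot2)

set.seed(42)

# Synthetic stand-in for the concentration data (all names are hypothetical):
# three analytes, each measured at two sites, 100 values per combination.
conc <- data.frame(
  analyte       = rep(c("A", "B", "C"), each = 200),
  site          = rep(c("North", "South"), times = 300),
  Concentration = 10^rnorm(600, mean = 0, sd = 1)
)

# One made-up detection limit per panel. The faceting columns (analyte, site)
# use the same names as in conc, which is what lets facet_grid() place each line.
lod <- expand.grid(analyte = c("A", "B", "C"),
                   site    = c("North", "South"),
                   stringsAsFactors = FALSE)
lod$LOD <- c(0.01, 0.02, 0.05, 0.01, 0.03, 0.05)

p <- ggplot(conc, aes(x = Concentration, fill = analyte)) +
  geom_histogram(binwidth = 0.25) +   # binwidth is in log10 units on this scale
  scale_x_log10() +
  # geom_vline() takes its own data frame; one dotted line per row of lod
  geom_vline(data = lod, aes(xintercept = LOD), linetype = "dotted") +
  facet_grid(site ~ analyte)

print(p)
```

Because lod has one row per analyte and site combination, the six panels each get exactly one line. If a panel had no row in lod, it would simply be drawn without a line.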


So it turns out this was very satisfying for me, mostly because it shows that I’m actually learning something about coding, even if it’s something fairly simple and nondescript!

