Coding Standards in R

I write a lot of R code. It's my primary language.

I have been frustrated for a while by the lack of coding standards at eSpark. The Ruby developers have extensive best practices and all check each other's code with dedicated review. The R developers are few and far between. Currently there are only two of us writing actively, and we each work on completely separate projects with no review. So I have tried to carry the standards I learned working on Ruby projects over into R. That is, until I stumbled across Google's style guide for R.

I love this and I am going to adopt these guidelines on my future projects.

Some highlights:


Use <-, not =, for assignment.

GOOD: x <- 5
BAD: x = 5

This I already follow. It makes me wonder why the = is used at all as an assignment tool when its real-world meaning is so different. The <- as an assigner is much more visually clear.
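
There is also a practical difference between the two, as far as I understand R's semantics: inside a function call, = matches an argument by name, while <- actually assigns. Here is a minimal sketch. DemoAssignment is a made-up helper for illustration, not something from the style guide.

```r
DemoAssignment <- function() {
  # `=` inside a call matches mean()'s argument named x;
  # no local variable called x is created.
  avg.named <- mean(x = c(1, 2, 3))
  created.by.equals <- exists("x", envir = environment(), inherits = FALSE)

  # `<-` inside a call assigns to x in this function's environment,
  # then passes the value on to mean().
  avg.assigned <- mean(x <- c(1, 2, 3))
  created.by.arrow <- exists("x", envir = environment(), inherits = FALSE)

  c(equals = created.by.equals, arrow = created.by.arrow)
}

result <- DemoAssignment()
```

So keeping <- for assignment and = for named arguments means each symbol does exactly one job.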


When indenting your code, use two spaces. Never use tabs or mix tabs and spaces. 
Exception: When a line break occurs inside parentheses, align the wrapped line with the first character inside the parenthesis.

The default in the R community is four spaces, which I find incredibly obnoxious. Two spaces is so much nicer and I'm glad that Google also uses two spaces. Now I can justify my maverick behavior.
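
To make the rule and its exception concrete, here is a small sketch of how I read it. DescribeClicks is a hypothetical function I made up for this post.

```r
DescribeClicks <- function(clicks) {
  # Nested blocks are indented by two spaces per level.
  if (length(clicks) == 0) {
    return("no clicks recorded")
  }
  # The wrapped line is aligned with the first character
  # inside the opening parenthesis of paste().
  paste("clicks:", length(clicks),
        "mean:", mean(clicks))
}

description <- DescribeClicks(c(2, 4))
```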


Don't use underscores ( _ ) or hyphens ( - ) in identifiers. Identifiers should be named according to the following conventions. The preferred form for variable names is all lower case letters and words separated with dots (variable.name), but variableName is also accepted; function names have initial capital letters and no dots (FunctionName); constants are named like functions but with an initial k.
variable.name is preferred, variableName is accepted
GOOD: avg.clicks 
OK: avgClicks 
BAD: avg_Clicks
GOOD: CalculateAvgClicks 
BAD: calculate_avg_clicks , calculateAvgClicks 
Make function names verbs. 
Exception: When creating a classed object, the function name (constructor) and class should match (e.g., lm).

I have played around a lot with variable and function names.
These conventions make sense, though I wonder why they actively banned underscores.
I've also seen code where all variable names start with underscores and function names are camelCase.
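
Putting the three naming rules side by side helps me see how they fit together. This is only a sketch, and every name in it is one I invented for the example.

```r
# Constant: named like a function, with an initial k.
kSecondsPerMinute <- 60

# Function: initial capital letters, no dots, and a verb.
ConvertMinutesToSeconds <- function(minutes) {
  minutes * kSecondsPerMinute
}

# Variable: all lower case, words separated with dots.
session.minutes <- c(1.5, 2, 10)
session.seconds <- ConvertMinutesToSeconds(session.minutes)
```

At a glance I can tell which names are data, which are functions, and which are constants, and that is the whole point of the convention.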

Function Documentation

Functions should contain a comments section immediately below the function definition line. These comments should consist of a one-sentence description of the function; a list of the function's arguments, denoted by Args:, with a description of each (including the data type); and a description of the return value, denoted by Returns:. The comments should be descriptive enough that a caller can use the function without reading any of the function's code.

Example Function

CalculateSampleCovariance <- function(x, y, verbose = TRUE) {
  # Computes the sample covariance between two vectors.
  #
  # Args:
  #   x: One of two vectors whose sample covariance is to be calculated.
  #   y: The other vector. x and y must have the same length, greater than one,
  #      with no missing values.
  #   verbose: If TRUE, prints sample covariance; if not, not. Default is TRUE.
  #
  # Returns:
  #   The sample covariance between x and y.
  n <- length(x)
  # Error handling
  if (n <= 1 || n != length(y)) {
    stop("Arguments x and y have different lengths: ",
         length(x), " and ", length(y), ".")
  }
  if (TRUE %in% is.na(x) || TRUE %in% is.na(y)) {
    stop("Arguments x and y must not have missing values.")
  }
  covariance <- var(x, y)
  if (verbose)
    cat("Covariance = ", round(covariance, 4), ".\n", sep = "")
  return(covariance)
}

This is my favorite part. My mentor when I was learning Rails was really against function comments. He said that if you need to comment on a function, then you should write a better function name. That stigma has stuck with me and I rarely write comments on my functions. But R is a much less expressive language, and comments are needed. Google's function commenting style is consistent and will improve the readability of my code greatly.
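
Here is how I would apply the same template to a small function of my own. It is a sketch: CalculateAvgClicks and its arguments are hypothetical, and the error check is just one reasonable choice.

```r
CalculateAvgClicks <- function(clicks, na.rm = FALSE) {
  # Computes the average number of clicks per session.
  #
  # Args:
  #   clicks: Numeric vector of click counts, one element per session.
  #   na.rm: If TRUE, missing values are dropped before averaging.
  #          Default is FALSE.
  #
  # Returns:
  #   The mean of clicks as a single numeric value.
  if (!is.numeric(clicks)) {
    stop("Argument clicks must be numeric.")
  }
  return(mean(clicks, na.rm = na.rm))
}

# The comment block alone tells a caller how to use the function.
avg.clicks <- CalculateAvgClicks(c(3, 5, NA, 10), na.rm = TRUE)
```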


I want to write our own coding standards documentation at some point, but for now I'll stick with Google's.