60+ R resources to improve your data skills

This list was originally published as part of the Computerworld Beginner's Guide to R but has since been expanded to also include resources for advanced beginner and intermediate users. If you're just starting out with R, I recommend first heading to the Beginner's Guide.

These websites, videos, blogs, social media/communities, software and books/ebooks can help you do more with R.

Books and e-books

R Cookbook. Like the rest of the O'Reilly Cookbook series, this one offers how-to "recipes" for doing lots of different tasks, from the basics of R installation and creating simple data objects to generating probabilities, graphics and linear regressions. It has the added bonus of being well written. If you like learning by example or are seeking a good R reference book, this is well worth adding to your reference library. By Paul Teetor, a quantitative developer working in the financial sector.

R Graphics Cookbook. If you want to do beyond-the-basics graphics in R, this is a useful resource both for its graphics recipes and brief introduction to ggplot2. While this goes way beyond the graphics capabilities that I need in R, I'd recommend this if you're looking to move beyond advanced-beginner plotting. By Winston Chang, a software engineer at RStudio.

R in Action: Data analysis and graphics with R. This book aims at all levels of users, with sections for beginning, intermediate and advanced R ranging from "Exploring R data structures" to running regressions and conducting factor analyses. The beginner's section may be a bit tough to follow if you haven't had any exposure to R, but it offers a good foundation in data types, imports and reshaping once you've had a bit of experience. There are some particularly useful explanations and examples for aggregating, restructuring and subsetting data, as well as a lot of applied statistics. Note that if your interest in graphics is learning ggplot2, there's relatively little on that here compared with base R graphics and the lattice package. You can see an excerpt from the book online: Aggregation and restructuring data. By Robert I. Kabacoff.

The Art of R Programming. For those who want to move beyond using R "in an ad hoc way ... to develop[ing] software in R." This is best if you're already at least moderately proficient in another programming language. It's a good resource for systematically learning fundamentals such as types of objects, control statements (unlike many R purists, the author doesn't actively discourage for loops), variable scope, classes and debugging -- in fact, there's nearly as large a chapter on debugging as there is on graphics. With some robust examples of solving real-world statistical problems in R. By Norman Matloff.

R in a Nutshell. A reasonably readable guide to R that teaches the language's fundamentals -- syntax, functions, data structures and so on -- as well as how-to statistical and graphics tasks. Useful if you want to start writing robust R programs, as it includes sections on functions, object-oriented programming and high-performance R. By Joseph Adler, a senior data scientist at LinkedIn.

Visualize This. Note: Most of this book is not about R, but there are several examples of visualizing data with R. And there's so much other interesting info here about how to tell stories with data that it's worth a read. By Nathan Yau, who runs the popular Flowing Data blog and whose doctoral dissertation was on "personal data collection and how we can use visualization to learn about ourselves."

R For Dummies. I haven't had a chance to read this one, but it's garnered some good reviews on Amazon.com. If you're familiar with the Dummies series and have found them helpful in the past, you might want to check this one out. You can get a taste of the authors' style in the Programming in R section of Dummies.com, which has more than 100 short sections such as How to construct vectors in R and How to use the apply family of functions in R. By Joris Meys and Andrie de Vries.
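
For a rough idea of what those two topics involve, here is a minimal base R sketch. It is not taken from the Dummies.com sections; the vectors, list and matrix below are made up purely for illustration.

```r
# Hypothetical data, invented for illustration.

# Constructing vectors: c() combines values, seq() generates a sequence.
scores <- c(math = 90, art = 75, history = 82)  # named numeric vector
evens <- seq(2, 10, by = 2)                     # 2 4 6 8 10

# sapply() applies a function to each element and simplifies to a vector.
squares <- sapply(evens, function(x) x^2)       # 4 16 36 64 100

# lapply() does the same but always returns a list.
lengths_list <- lapply(list(a = 1:3, b = 1:5), length)

# apply() works over the rows (1) or columns (2) of a matrix.
m <- matrix(1:6, nrow = 2)    # filled column by column
row_sums <- apply(m, 1, sum)  # 9 12
```

The practical difference is mostly in what comes back: a simplified vector from sapply(), a list from lapply(), and one result per row or column from apply().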

Introduction to Data Science. It's highly readable, packed with useful examples and free -- what more could you want? This e-book isn't technically an "R book," but it uses R for all of its examples as it teaches concepts of data analysis. If you're familiar with that topic you may find some of the explanations rather basic, but there's still a lot of R code for things like analyzing tweet rates (including a helpful section on how to get Twitter OAuth authorization working in R), simple map mashups and basic linear regression. Although author Jeffrey Stanton calls this an "electronic textbook," Introduction to Data Science has a conversational style that's pleasantly un-textbook-like. There used to be a downloadable PDF, but now the only versions are for OS X or iOS.

R for Everyone. Author Jared P. Lander promises to go over "20% of the functionality needed to accomplish 80% of the work." And in fact, the topics that are actually covered are covered pretty well, but be warned that some topics appearing in the table of contents get only thin treatment. This is still a well-organized reference, though, with sections on topics beginning and intermediate users might want to know: importing data, generating graphs, grouping and reshaping data, working with basic stats and more.

Statistical Analysis With R: Beginner's Guide. This book has you "pretend" you're a strategist for an ancient Chinese kingdom analyzing military strategies with R. If you find that idea hokey, move along to another resource; if not, you'll get a beginner-level introduction to various tasks in R, including tasks you don't always see in an intro text, such as multiple linear regressions and forecasting. Note: My early e-version had a considerable number of bad spaces in my Kindle app, but it was still certainly readable and usable.

Reproducible Research with R and RStudio. Although categorized as a "bioinformatics" textbook (and priced that way -- even the Kindle edition is more than $50), this is more general advice on steps to make sure you can document and present your work. This includes numerous sections on creating report documents using the knitr package, LaTeX and Markdown -- tasks not often covered in depth in general R books. The author has posted source code for generating the book on GitHub, though, if you want to create an electronic version of it yourself.

Exploring Everyday Things with R and Ruby. This book oddly goes from a couple of basic introductory chapters to some fairly robust, beyond-beginner programming examples; for those who are just starting to code, much of the book may be tough to follow at the outset. However, the intro to R is one of the better ones I've read, including a lot of language fundamentals and basics of graphing with ggplot2. Plus, experienced programmers can see how author Sau Sheong Chang splits up tasks between a general language like Ruby and the statistics-focused R.

Online references

4 data wrangling tasks in R for advanced beginners. This follow-up to our Beginner's Guide outlines how to do several specific data tasks in R: add columns to an existing data frame, get summaries, sort results and reshape data. With sample code and explanations.
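
To give a flavor of those four tasks, here is a minimal sketch in base R. It is not the code from that article: the sales data frame and all of its column names are hypothetical, and packages other than base R would work just as well.

```r
# Hypothetical sample data; none of these names come from the article.
sales <- data.frame(
  region  = c("East", "West", "East", "West"),
  quarter = c("Q1", "Q1", "Q2", "Q2"),
  revenue = c(100, 80, 120, 90),
  cost    = c(60, 50, 70, 65),
  stringsAsFactors = FALSE
)

# 1. Add a column to an existing data frame.
sales$profit <- sales$revenue - sales$cost

# 2. Get summaries: mean profit for each region.
profit_by_region <- aggregate(profit ~ region, data = sales, FUN = mean)

# 3. Sort results: highest profit first.
sales_sorted <- sales[order(-sales$profit), ]

# 4. Reshape data from long to wide: one profit column per quarter.
wide <- reshape(
  sales[, c("region", "quarter", "profit")],
  idvar = "region", timevar = "quarter", direction = "wide"
)

print(profit_by_region)
print(wide)
```

The wide result has one row per region and columns named profit.Q1 and profit.Q2, which is the kind of layout you'd often want for a side-by-side comparison.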

Cookbook for R. Not to be confused with the R Cookbook book mentioned above, this website by software engineer Winston Chang (author of the R Graphics Cookbook) offers how-to's for tasks such as data input and output, statistical analysis and creating graphs. It has a format similar to an O'Reilly Cookbook and, while not as complete, can be helpful for answering some "How do I do that?" questions.

Quick-R. This site has a fair amount of samples and brief explanations grouped by major category and then specific items. For example, you'd head to "Stats" and then "Frequencies and crosstabs" to get an explainer of the table() function. This ranges from basics (including useful how-to's for customizing R startup) through beyond-beginner statistics (matrix algebra, anyone?) and graphics. By Robert I. Kabacoff, author of R in Action.
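
As a quick illustration of the kind of thing that table() explainer covers, here is a small sketch. The survey vectors are invented for this example and do not come from Quick-R.

```r
# Hypothetical survey responses, invented for illustration.
gender <- c("F", "M", "F", "F", "M")
answer <- c("yes", "no", "yes", "no", "no")

# One-way frequency table: how often each answer occurs.
freq <- table(answer)

# Two-way crosstab: answers broken down by gender.
xtab <- table(gender, answer)

# Row proportions: the share of each answer within each gender.
row_props <- prop.table(xtab, margin = 1)

print(freq)
print(xtab)
```

Passing one vector gives simple counts; passing two gives a cross-tabulation with the first vector as rows and the second as columns.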

R Reference Card. If you want help remembering function names and formats for various tasks, this 4-page PDF is quite useful despite its age (2004) and the fact that a link to what's supposed to be the latest version no longer works. By Tom Short, an engineer at the Electric Power Research Institute.

A short list of the most useful R commands. Commands grouped by function such as input, "moving around" and "statistics and transformations." This offers minimal explanations, but there's also a link to a longer guide to Using R for psychological research. HTML format makes it easy to cut and paste commands. Also somewhat old, from 2005. By William Revelle, psychology professor at Northwestern University.

Chart Chooser in R. This has numerous examples of R visualizations and sample code to go with them, including bar, column, stacked bar & column, bubble charts and more. It also breaks down the visualizations by categories like comparison, distribution and trend. By Greg Lamp, based on Juice Labs' Chart Chooser for Excel and PowerPoint.

Frequently Asked Questions about R. Some basics about reading, writing, sorting and shaping data as well as a lineup of how to do various statistical operations and a few specialized graphics such as spaghetti plots. From UCLA's Institute for Digital Research and Education.

R Reference Card for Data Mining. This is a task-oriented compilation of useful R packages and functions for things ranging from text mining and time series analysis to more general subjects like graphics and data manipulation. Since descriptions are somewhat bare-bones, this will likely be more useful to either remind you of functions you've seen before or give you suggestions for things to try. For much more on the subject, head to the author's R and Data Mining website, which includes examples and other documentation, including a substantial portion of his book R and Data Mining, published by Elsevier in 2012. By Yanchang Zhao.

Spatial Cheat Sheet. For those doing GIS and spatial analysis work, this list offers some key functions and packages for working with spatial vector and raster data. By Barry Stephen Rowlingson at Lancaster University in the U.K.

Online tools

Web interface for ggplot2. This online tool by UCLA Ph.D. candidate Jeroen Ooms creates an interactive front end for ggplot2, allowing users to input tasks they want to do and get a plot plus R code in return. Useful for those who want to learn more about using ggplot2 for graphics without having to read through lengthy documentation.

Ten Things You Can Do in R That You Would've Done in Microsoft Excel. From the R for Dummies website, these code samples aim to help Excel users feel more comfortable with R.

Videos

Twotorials. You'll either enjoy these snappy 2-minute "twotorial" videos or find them, oh, corny or over the top. I think they're both informative and fun, a welcome antidote to the typically dry how-to's you often find in statistical programming. Analyst Anthony Damico takes on R in 2-minute chunks, from "how to create a variable with R" to "how to plot residuals from a regression in R"; he also tackles an occasional problem such as "how to calculate your ten, fifteen, or twenty thousandth day on earth with R." I'd strongly recommend giving this a look if textbook-style instruction leaves you cold.
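
The "thousandth day on earth" problem comes down to date arithmetic, which base R handles directly. Here is a minimal sketch of one way to do it; this is not Damico's code, the birthday is hypothetical, and it assumes you count days elapsed since birth rather than treating the birth date as day one.

```r
# Hypothetical birthday; substitute your own.
birthday <- as.Date("1990-01-15")

# Milestones expressed as days elapsed since birth.
milestones <- c(10000, 15000, 20000)

# Adding a number to a Date object moves it forward by that many days.
milestone_dates <- birthday + milestones

# Show each milestone next to its calendar date and day of the week.
result <- data.frame(
  days    = milestones,
  date    = milestone_dates,
  weekday = weekdays(milestone_dates)
)
print(result)
```

If you'd rather count the day you were born as day one, subtract 1 from each milestone before adding.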

Google Developers' Intro to R. This series of 21 short YouTube videos includes some basic R concepts, a few lessons on reshaping data and some info on loops. In addition, six videos focus on a topic that's often missing in R intros: working with and writing your own functions. This YouTube playlist offers a good programmer's introduction to the language -- just note that if you're looking to learn more about visualizations with R, that's not one of the topics covered.

Up and Running with R. This lynda.com video class covers the basics of topics such as using the R environment, reading in data, creating charts and calculating statistics. The curriculum is limited, but presenter Barton Poulson tries to explain what he's doing and why, not simply run commands. He also has a more in-depth 6-hour class, R Statistics Essential Training. Lynda.com is a subscription service that starts at $25/month, but several of the videos are available free for you to view and see if you like the instruction style, and there's a 7-day free trial available.

Coursera: Computing for Data Analysis. Coursera's free online classes are time-sensitive: You've got to enroll while they're taking place or you're out of luck. However, if there's no session starting soon, instructor Roger Peng, associate professor of biostatistics at Johns Hopkins University, posted his lectures on YouTube; Revolution Analytics then collected them on a handy single page. While I found some of these a bit difficult to follow at times, they are packed with information, and you may find them useful.

Coursera: Data Analysis. This was more of an applied statistics class that uses R as opposed to one that teaches R; but if you've got the R basics down and want to see it in action, this might be a good choice. There are no upcoming scheduled sessions for this at Coursera, but instructor Jeff Leek, an assistant professor of biostatistics at Johns Hopkins, posted his lecture videos on YouTube, and Revolution Analytics collected links to them all by week.
