1: Perhaps a decent resource for R package developers, not end-users
This is a strange little book in that it seems somewhat directed toward statisticians who want to develop R packages. The OOP section takes up 50 pages and discusses the "S3 and S4" implementations of OOP in R in great detail, all of which is no doubt important for those few dozen accomplished statisticians who wish to write packages. However, by the time you are ready to actually write an R function that other people will use, I can't imagine you wouldn't already be familiar with some of the basic commands discussed elsewhere in this book. So I am wondering who the intended audience is.
I think the majority of R users (biologists and programmers) want to run through some common statistical routines in a procedural fashion and produce reports that perform some analysis and show some graphs. The difficulty with R is learning how to massage data into a form that an existing statistical function will accept. That will invariably involve R-specific helper functions that do not exist in other programming languages (e.g. unsplit) or that require a precise understanding of their input (e.g. xtabs), and statistical routines that almost never return meaningful errors (glm). Manipulating data structures in R is not particularly intuitive (e.g. as.numeric(levels(f))[f]), so tons of examples are a must. However, this book simply does not include enough R code - probably fewer than 250 lines.
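To show the kind of example I mean, here is a minimal sketch of the three idioms mentioned above. The data and variable names are made up for illustration and are not taken from the book.

```r
# A factor whose labels happen to look like numbers (hypothetical data)
f <- factor(c("10", "20", "10", "30"))

# as.numeric(f) gives the internal integer codes, not the labels
codes <- as.numeric(f)

# Indexing the numeric levels by the factor recovers the actual values
values <- as.numeric(levels(f))[f]

# split() / unsplit(): center x within each group, then put the
# results back in the original row order
df <- data.frame(group = factor(c("a", "b", "a", "b")),
                 x = c(1, 2, 3, 4))
pieces <- split(df$x, df$group)
centered <- lapply(pieces, function(v) v - mean(v))
df$x_centered <- unsplit(centered, df$group)

# xtabs(): count rows per group using the formula interface
counts <- xtabs(~ group, data = df)
```

The factor line is the one that trips people up: calling as.numeric(f) directly returns the codes 1, 2, 1, 3 without any warning, which is exactly the sort of thing a worked example makes obvious.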
In some instances commands are discussed at length in the space it would take to simply show the command. For example, a beginner would want to know how to save a data frame. Instead of providing a useful example like:
save(myDataFrame,file="myDataFrame.frame.RData",compress=TRUE)
there is a bizarre paragraph called "Working with R's binary format", in which save and load are discussed in theory as if they are planned for a distant release.
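A slightly fuller sketch of what I would have liked to see is below. It assumes a small made-up data frame and writes to a temporary directory; the file names are my own choice, not the book's.

```r
# Hypothetical data frame to save
myDataFrame <- data.frame(id = 1:3, score = c(2.5, 3.1, 4.8))
original <- myDataFrame

# save() writes one or more named objects to a binary .RData file
path <- file.path(tempdir(), "myDataFrame.RData")
save(myDataFrame, file = path, compress = TRUE)

# Remove the object, then restore it; load() recreates it under its
# original name and returns the names of the objects it restored
rm(myDataFrame)
restored_names <- load(path)

# saveRDS() / readRDS() store a single object without its name,
# so the caller decides what to call it when reading it back
rds_path <- file.path(tempdir(), "myDataFrame.rds")
saveRDS(myDataFrame, file = rds_path)
copy <- readRDS(rds_path)
```

Note that load() silently overwrites any existing object with the same name, which is one reason some people prefer the saveRDS/readRDS pair for single objects.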
There is no chapter on using Sweave to develop PDF reports, despite the book itself being written in Sweave. The author is more focused on "vignettes", which appear to be for documentation akin to POD files.
This book does include excellent sections on string manipulation, connecting to databases, and C integration. I learned some things about some neat Bioconductor functions that are available, but a dedicated chapter would be nice.
At no point do you ever sense the author does not know what he is talking about - he just doesn't know who he is talking to. I hope in the future "R Programming For Bioinformatics" is split into two more comprehensive books: "Developing R Packages" and "R for Biologists".