Project Botticelli

R Vendor Revolution Analytics to be Acquired by Microsoft

23 January 2015

Microsoft’s Joseph Sirosh, Corporate Vice President of Machine Learning, has just announced that Revolution Analytics will be acquired. I am very pleased to hear this news, not only because I use R every time I work with a client, but also because I think it is a good development for R and its huge, immensely skilled and intelligent community.

For those of you who follow my data science videos, it is clear that R is a necessary component of any advanced analytical system, as it provides the indispensable statistical functionality. I use it, first of all, for a quick analysis of the data I am about to model, looking at descriptive statistics such as distributions. This helps me see any issues in the data early on, and it helps my SQL data experts fix them before we continue with modelling.

I also model in R. I am a decade-long fan of highly visual approaches to modelling, which SQL Server data mining offers and which it even supports through Excel’s UI. However, for additional breadth and power, there is nothing else to turn to but R, although, of course, Azure Machine Learning is steadily adding new algorithms (most recently for image analysis). There are some 6219 packages of add-on software, all free of charge, available for R. So if you need to do something, for example resample the underlying data while looking for and modelling anomalies or fraud, which I seem to be doing a lot recently, R will provide you with not just one package, but six! I use ROSE and sampling, in case you wondered.
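
As a rough illustration of why resampling matters for fraud data, here is a minimal sketch of the simplest variant, plain random oversampling of the rare class. It is written in Python purely for illustration and is not what the ROSE or sampling packages do internally (ROSE, for instance, generates synthetic examples rather than duplicating rows). The function name `oversample_minority`, the `fraud` field, and the transactions are all made up for this example.

```python
import random


def oversample_minority(rows, label_key, seed=0):
    """Duplicate randomly chosen rows of the smaller classes until
    every class has as many rows as the largest one."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable

    # Group the rows by their class label.
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_key], []).append(row)

    target = max(len(group) for group in by_class.values())

    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        # Top up the smaller classes by sampling with replacement.
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))

    rng.shuffle(balanced)
    return balanced


# Hypothetical data: 95 ordinary transactions and 5 fraudulent ones.
transactions = (
    [{"amount": 10 + i, "fraud": 0} for i in range(95)]
    + [{"amount": 900 + i, "fraud": 1} for i in range(5)]
)

balanced = oversample_minority(transactions, "fraud")
fraud_rows = sum(1 for row in balanced if row["fraud"] == 1)
print(len(balanced), fraud_rows)
```

With these made-up numbers the balanced set holds 95 rows of each class. Resampling of this kind is normally applied to the training data only, so that the model is still evaluated against the real class proportions.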

I really like R for plotting charts. ggplot2 is my tool of choice. The “gg” in its name stands for grammar of graphics, which aptly explains how you build accurate, trustworthy plots: by describing what you want on the plot in a script that has its own grammar. I’d like some axes, a background, a scale, colour, and some pretty scattered dots, please. Yes, it might be slightly faster to chart in Excel, but R plots look right and they tend not to offend basic rules of visually communicating quantitative data, such as arbitrarily missing out ticks or labels on an axis, or displaying data in the wrong aspect ratio and exaggerating growth…

I use R when I need to do more involved calculations, such as fitting a curve to some data in order to find a point of intersection (root solving). I use it to solve equations that tell me when a certain alarm condition should be raised. I even use it for my avocation, black and white film photography, to find the optimal film developing times.
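
To make the curve-fitting and root-solving idea concrete, here is a minimal sketch under stated assumptions. It fits an exponential trend to a handful of made-up readings and then uses bisection to find when the fitted curve crosses an alarm threshold. It is written in Python purely for illustration; in R the root-finding step would typically be handled by the built-in `uniroot` function. The readings, the threshold, the exponential model, and the helper names `fit_exponential` and `bisect_root` are all assumptions, not taken from any real project.

```python
import math


def fit_exponential(ts, ys):
    """Least-squares fit of y = a * exp(b * t), done as a
    straight-line fit to log(y) against t."""
    n = len(ts)
    logs = [math.log(y) for y in ys]
    t_mean = sum(ts) / n
    l_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in ts)
    sxy = sum((t - t_mean) * (l - l_mean) for t, l in zip(ts, logs))
    b = sxy / sxx
    a = math.exp(l_mean - b * t_mean)
    return a, b


def bisect_root(f, lo, hi, tol=1e-9, max_iter=200):
    """Find t in [lo, hi] where f(t) = 0, assuming f is continuous
    and changes sign on the interval."""
    f_lo, f_hi = f(lo), f(hi)
    if f_lo == 0:
        return lo
    if f_hi == 0:
        return hi
    if f_lo * f_hi > 0:
        raise ValueError("f must change sign between lo and hi")
    for _ in range(max_iter):
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if f_mid == 0 or (hi - lo) / 2 < tol:
            return mid
        if f_lo * f_mid < 0:
            hi = mid  # the sign change is in the lower half
        else:
            lo, f_lo = mid, f_mid  # the sign change is in the upper half
    return (lo + hi) / 2


# Hypothetical hourly readings from some monitored quantity.
hours = [0, 1, 2, 3, 4]
readings = [2.0, 2.6, 3.5, 4.4, 5.9]
threshold = 10.0  # made-up alarm level

a, b = fit_exponential(hours, readings)

# The alarm time is the root of "fitted curve minus threshold".
alarm_time = bisect_root(lambda t: a * math.exp(b * t) - threshold, 0.0, 24.0)
print(f"Fitted curve reaches the threshold after about {alarm_time:.2f} hours")
```

Bisection is slow but dependable: as long as the fitted curve is below the threshold at one end of the interval and above it at the other, it will find the crossing point.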

But the free version of R is not without concerns, especially related to performance or ease of use. This is where a few eager, smaller companies can help. Two of them stand out as pretty major contributors, and they are also benefactors of the R Foundation. RStudio makes an excellent integrated development environment for R, which I use: if you have tried debugging using the core, free version of R, you’re in for a treat with RStudio. Revolution Analytics has a popular implementation of R that solves performance problems by adding much-needed parallelism to utilise the multicore, multithreaded, and in-memory abilities of today’s hardware. Although those things are missing from the default behaviour of the R core, that is not necessarily a fault of R: it allows the development of the core to remain focused on its analytical and statistical strengths, and to maintain backwards compatibility.

What I have been waiting for is Microsoft implementing its own R engine not only in Azure ML, as they have already done, but also directly in the SQL Server relational engine, in Excel (statistical calculations and better plotting please), and perhaps in Power BI. It would take them a long time to do it from scratch. I am hopeful that this acquisition might be a way to get there sooner. But in the meantime, it is great to have the support of a mature software vendor for what has become the leading platform for statistical software. It is a great sign of mutual respect that a company like Microsoft, especially with its history, wants to play in the open source world without demanding full control over the amazing process that keeps R developing so very well. That process ought to continue within its academic core, and to remain free of the negative influence that commercial pressure can have on creativity, something that has clearly had a bad impact on the once-mighty SAS, making theirs the most-likely-to-be-discontinued BI product in Gartner’s October 2014 magic quadrant…

Being realistic about my expectations, however, I think that Microsoft could sell more Revolution R than Revolution ever could, which must be a nice feeling for the small company and its very nimble team. But if Microsoft could also make the next version of Revolution something that would answer the on-premises needs of those customers of mine who want Azure ML but just cannot go to the cloud (not yet, not for 5–10 years), it would be a professional dream-come-true.

If you want to learn more about R, Azure ML and the related technologies, please have a look at our just-launched and fast-growing data science course. The next module, a 1-hour 40-minute detailed overview of Azure ML (and some R!), will be live next week, and there are plenty more in the pipeline. Above all, enjoy discovering great things for yourselves and for your customers by using your skills, the might of the community, and this superbly powerful software.

Rafal 

[Updated 24 January 2015. The original version of this article was titled “R Maker Revolution Analytics to be Acquired by Microsoft”. I am grateful to Hadley Wickham for suggesting this change, so that the title no longer suggests that Revolution Analytics actually made the R language; that has been achieved by a rather extensive team of volunteers. I hope it continues to be so for a long time to come, and I am happy that so many people and organisations can mutually benefit from their serious collaboration efforts.]