Project Botticelli

Azure ML, Azure Machine Learning just Announced!

17 June 2014 · 5 comments · 5244 views

The day I've been awaiting for over 2 years

Microsoft have just announced their biggest-ever foray into data science: Azure Machine Learning, or simply Azure ML. [Click here for the Brief Intro to Azure ML, which I wrote 1 month after this article]

I had the great honour of meeting the engineering team behind this project over two years ago, and I have been following its various early versions, code-named Project Passau (and a few other cooler but sadly unmarketable names), all that time. I’ve been dying to mention it publicly, but I remain under embargo on anything beyond what has already been made public until the day it is released into an open preview, which, according to this Microsoft web page, will be in July (you can see the date if you register for the preview). In the meantime, let me share a few early observations with you.

My customers and I have been relying on the Microsoft data mining technology contained in SQL Server Analysis Services for a decade; we even have the best online training course on this subject right here. During these ten years many had hoped that Microsoft would develop their data mining technology further, and while it took a good while longer than expected, the day is almost upon us. As the world has moved towards the cloud, we will be seeing this new predictive analytical, statistical and machine learning capability appear in the Azure cloud. While I would have liked to have an on-premises version, too, I realise that this may have to wait a good while longer. This is because Azure ML is not just a cool bunch of important open source machine learning frameworks, plus R, pleasantly integrated on top of a cloud infrastructure. It also makes excellent use of Microsoft’s somewhat internal data treasure chest, aka Bing, and of the powerful algorithms developed for it and for Xbox. Like Power BI Q&A, it lets Microsoft further improve the quality of the system thanks to our online use of its modelling technologies and, like it or not, thanks to the telemetry that every cloud service obviously collects, which is far superior to the feedback that engineering gets from on-premises software.

In my day-to-day job I use four data science tools a lot. By the way, I am still not sure if I like that term: when I studied it at Imperial College in London it was called Foundations of Advanced Information Technology, or FAIT, but I suppose Data Science sounds sexier. I digress. The key tools of my consulting life are SQL Server (not just for data mining, but also for data wrangling), R, Excel, and Python. I also use, to an extent, the very good Mahout machine learning and data mining library, which runs well over Hadoop. I use Hadoop both on-prem, on Linux and Windows (as HDP), and in the cloud, as HDInsight and Elastic MapReduce, when I need it quickly for a larger job at a customer site that has neither the on-prem infrastructure nor one of the Microsoft Analytics Platform System appliances.

What I have been dreaming of is having all of those put together into a neat tool which works and thinks like a data scientist: one that lets me build data-flow-like experiment designs (somewhat like RapidMiner), cope with a lot of data, use extensive open source machine learning libraries, validate the results statistically (which I almost always do in R), and do it all on someone else’s suitably massive chunk of fast hardware that I don’t have to look after. I just want to focus on getting the answers to my customers’ complex problems, like how to reduce fraud, make better recommendations, understand their customers better, market more effectively, or understand and predict the fluctuations in their sales, without having to spend hours, or days, building the kit for doing that, every single time. Azure ML is about to make that a reality.

I have been committed to analytics with data mining and machine learning for a decade. I can honestly tell you that as long as the public general availability release is a successful one (fingers crossed!) this will be very important to all of us from a professional perspective, and perhaps even academically, not to mention having some data geek street cred. And it makes me feel glad I have faithfully stuck by Microsoft all those years (OK, that bit was easy, thanks to SQL Server’s popularity).

As soon as the technology hits the road, I will write about it in depth (read here!), and you will also find detailed training videos on our site. If you are not yet a member, please join now. I promise to keep you up to date over the many months to come.


mandeep.goraya · 22 June 2014

Sounds great. I hope to learn much more from you when it is finally available for preview.

Thanks for the update.

aneesh · 16 July 2014

Tried a couple of experiments. Supported algorithms, R support, Designer, multiple model testing, publishing support: it all looks great.
My next step: dig deep with Rafal Lukawiecki. I look forward to a wonderful course on Azure ML by you soon!

Rafal Lukawiecki · 16 July 2014

Thanks, Aneesh. I am planning the course modules and the demos right now, and I can't wait to start recording them! I hope to release them throughout the rest of this year and early next, hopefully starting in Sept/Oct.

aneesh · 2 November 2014

When can we expect your course on Azure Machine Learning?

Rafal Lukawiecki · 3 November 2014

Dear Aneesh, thanks for asking. I am currently writing the new Azure ML course and the intro PPTs are ready, but before I record them I would like to see the product exit the current "preview" and enter general release and full availability. Because of the expected last-minute changes, I would prefer not to have to re-record it too soon, as it takes a few weeks of full-time work to create one hour of video. Having said that, I hope to have an introduction to Azure ML, in some form, before the end of this year.