Project Botticelli

Why Cluster and Segment Data?

31 May 2013

A Brief Introduction to Clustering

Cluster-based data segmentation

Clustering is a popular data mining technique, often used for segmentation and for outlier (exception) detection. In this short, free video, Rafal introduces these concepts, focusing on why clustering is useful for finding non-traditional segments. Log in or get a free account to watch it!

For example, you may be used to seeing customer sales segmented by geographical region. That is a common way to discuss and compare financial results at company meetings. Unfortunately, looking at sales this way might not show you what is really happening, in a way that could help your company improve its performance. What if there were a completely different way to segment your customers, one that would reveal major, yet otherwise unknown, differences between sales?

You can find new ways to cluster data using SQL Server Data Mining. We will explain this process in the next, full-length video in this series. Once you have found your clusters, you need to analyse them so that you understand them and can give them meaningful names. Then, once you have tested your clustering model and confirmed that it works, you will be able to apply it to any similar data to categorise it automatically.

In the demo you can see how Excel data is categorised by using a clustering model: all you need to do is use the Query button from the free Microsoft Data Mining Add-ins for Office. Excel queries the model (which runs in SQL Server Analysis Services) and asks it to predict the name of the cluster to which each row in your sheet should belong. This process is fast and useful, and it is also an important step in getting to know your clusters, because applying a model to different sets of data helps you verify that the names you have given the clusters make sense. Indeed, look out for the comment in Rafal’s video which shows that an even better name could have been given to one of the clusters shown!
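The workflow described above (find clusters, name them, then apply the model to new rows) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses a plain k-means written with the standard library as a stand-in, not the algorithm or API of SQL Server Analysis Services, and the customer figures, the cluster names and the helper functions `nearest`, `kmeans` and `categorise` are all invented for the example.

```python
import math
import random

def nearest(point, centroids):
    """Return the index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))

def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means (illustrative stand-in): returns a list of k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen rows
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Move each centroid to the mean of its group; keep it if the group is empty.
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids

# Hypothetical customers: (orders per year, average order value).
customers = [(2, 500.0), (3, 450.0), (1, 520.0), (40, 20.0), (35, 25.0), (45, 18.0)]

# Step 1: find the clusters.
centroids = kmeans(customers, k=2)

# Step 2: analyse each cluster and give it a meaningful (here, made-up) name.
names = {
    i: "Frequent small-basket" if c[0] > 10 else "Occasional big-spender"
    for i, c in enumerate(centroids)
}

# Step 3: apply the model to new, similar rows to categorise them automatically.
def categorise(row):
    return names[nearest(row, centroids)]

new_rows = [(38, 22.0), (2, 480.0)]
labels = [categorise(r) for r in new_rows]
print(labels)
```

Running the naming step against several different data sets, as the text suggests, is a cheap way to check whether the names you chose still describe the rows that land in each cluster.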

If you are interested in clustering, make sure to watch the 1-hour 50-minute, in-depth module Clustering in Depth, and please also review the remaining modules in this online course, starting with the Introduction to Data Mining.


