This is Part 3 of my series of tutorials about the math behind Support Vector Machines.
If you did not read the previous articles, you might want to take a look before reading this one:
SVM - Understanding the math
Part 1: What is the goal of the Support Vector Machine (SVM)?
Part 2: How to compute the margin?
Part 3: How to find the optimal hyperplane?
What is this article about?
The main focus of this article is to show you the reasoning allowing us to select the optimal hyperplane.
Here is a quick summary of what we will see:
- How can we find the optimal hyperplane?
- How do we calculate the distance between two hyperplanes?
- What is the SVM optimization problem?
How to find the optimal hyperplane?
At the end of Part 2 we computed the distance between a point and a hyperplane, and we then used it to compute the margin.
However, even though that hyperplane did quite a good job of separating the data, it was not the optimal hyperplane.
Figure 1: The margin we calculated in Part 2 is shown as M1
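As a quick refresher on the Part 2 computation, the distance from a point to a hyperplane defined by w · x + b = 0 can be sketched in Python. This is my own minimal illustration (numpy only; the variable names and the example hyperplane are not from the article):

```python
import numpy as np

def distance_to_hyperplane(w, b, x):
    """Distance from point x to the hyperplane w . x + b = 0."""
    return abs(np.dot(w, x) + b) / np.linalg.norm(w)

# Hypothetical example: the hyperplane with w = (2, 1) and b = 0
w = np.array([2.0, 1.0])
b = 0.0
p = np.array([3.0, 4.0])

print(distance_to_hyperplane(w, b, p))  # |2*3 + 1*4| / sqrt(5) ~ 4.4721
```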
In this tutorial I will show you how to classify text with SVM in R.
The main steps to classify text in R are:
- Create a new RStudio project
- Install the required packages
- Read the data
- Prepare the data
- Create and train the SVM model
- Predict with new data
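The article carries out these steps in R, but the pipeline itself (read the data, prepare it, train an SVM, predict) is language-agnostic. As an illustration only, here is a hedged sketch of the same steps in Python with scikit-learn; the tiny corpus and labels are invented for the example:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

# Toy corpus, invented for illustration
texts = ["cheap pills buy now", "meeting at noon",
         "win money now", "lunch with the team"]
labels = ["spam", "ham", "spam", "ham"]

# Prepare the data: turn the text into TF-IDF vectors
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(texts)

# Create and train the SVM model
model = LinearSVC()
model.fit(X, labels)

# Predict with new data
print(model.predict(vectorizer.transform(["buy cheap pills"])))
```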
Step 1: Create a new RStudio Project
To begin with, you will need to download and install the RStudio development environment.
Once you have installed it, you can create a new project by clicking on "Project: (None)" at the top right of the screen:
Create a new project in R Studio
In the first part of this SVM tutorial about the math, we saw what the aim of the Support Vector Machine is. Its goal is to find the hyperplane which maximizes the margin.
But how do we calculate this margin?
SVM = Support VECTOR Machine
In Support Vector Machine, there is the word vector.
That means it is important to understand vectors well and to know how to use them.
Here is a short summary of what we will see today:
- What is a vector?
- How to add and subtract vectors?
- What is the dot product?
- How to project a vector onto another?
Once we have all these tools in our toolbox, we will then see:
- What is the equation of the hyperplane?
- How to compute the margin?
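To make these vector operations concrete before the derivations, here is a small numpy sketch of my own (the vectors are made up for illustration): addition and subtraction are component-wise, the dot product is the sum of component-wise products, and the projection of u onto v is (u · v / ||v||²) v.

```python
import numpy as np

u = np.array([3.0, 4.0])
v = np.array([1.0, 0.0])

# Adding and subtracting vectors is done component-wise
s = u + v   # (4, 4)
d = u - v   # (2, 4)

# The dot product is the sum of the component-wise products
dot = np.dot(u, v)  # 3*1 + 4*0 = 3

# Projection of u onto v: (u . v / ||v||^2) * v
proj = (np.dot(u, v) / np.dot(v, v)) * v  # (3, 0)
```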
This is the first article in a series I will be writing about the math behind SVM. There is a lot to talk about, and a lot of mathematical background is often necessary. However, I will try to keep a slow pace and give in-depth explanations, so that everything is crystal clear, even for beginners.
What is the goal of the Support Vector Machine (SVM)?
The goal of a support vector machine is to find the optimal separating hyperplane which maximizes the margin of the training data.
The first thing we can see from this definition is that an SVM needs training data, which means it is a supervised learning algorithm.
It is also important to know that SVM is a classification algorithm, which means we will use it to predict if something belongs to a particular class.
For instance, we can have the training data below:
We have plotted the size and weight of several people, and there is also a way to distinguish between men and women.
With such data, using a SVM will allow us to answer the following question:
Given a particular data point (weight and size), is the person a man or a woman?
For instance: if someone is 175 cm tall and weighs 80 kg, is that person a man or a woman?
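To illustrate this question in code, here is a minimal sketch in Python with scikit-learn; the (height, weight) training points and their labels are invented for the example, not taken from the article's plot:

```python
from sklearn.svm import SVC

# Toy training data, invented for illustration: (height cm, weight kg)
X = [[160, 55], [165, 60], [158, 50],
     [180, 85], [185, 90], [178, 80]]
y = ["woman", "woman", "woman", "man", "man", "man"]

# Train a linear SVM on the labeled points
clf = SVC(kernel="linear")
clf.fit(X, y)

# Ask the question from the article: 175 cm, 80 kg?
print(clf.predict([[175, 80]]))
```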
In this article I will show how to use R to perform Support Vector Regression.
We will first do a simple linear regression, then move to the Support Vector Regression so that you can see how the two behave with the same data.
A simple data set
To begin with we will use this simple data set:
I just put some data in Excel. I prefer that over using an existing well-known data set, because the purpose of the article is not the data itself, but the models we will use.
As you can see, there seems to be some kind of relation between our two variables X and Y, and it looks like we could fit a line which would pass near each point.
Let's do that in R!
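The article does this in R; as a side-by-side illustration of the same idea, here is a hedged Python sketch with scikit-learn, fitting a simple linear regression and a (linear-kernel) Support Vector Regression to toy data I invented for the example:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVR

# Toy data, invented for illustration: roughly linear with a bit of noise
X = np.array([[1], [2], [3], [4], [5], [6], [7], [8]])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.0, 13.8, 16.2])

# Fit a simple linear regression and a Support Vector Regression
lin = LinearRegression().fit(X, y)
svr = SVR(kernel="linear", C=100, epsilon=0.1).fit(X, y)

# Both models can now predict Y for a new X
print(lin.predict([[9]]), svr.predict([[9]]))
```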
The Support Vector Machine can be viewed as a kernel machine. As a result, you can change its behavior by using a different kernel function.
The most popular kernel functions are:
- the linear kernel
- the polynomial kernel
- the RBF (Gaussian) kernel
- the string kernel
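To make the first three concrete, here is a sketch of their textbook forms in Python (the string kernel is text-specific and omitted here; the example vectors and the degree/gamma values are my own choices, not from the article):

```python
import numpy as np

def linear_kernel(x, y):
    # Plain dot product
    return np.dot(x, y)

def polynomial_kernel(x, y, degree=2, c=1.0):
    # (x . y + c)^degree
    return (np.dot(x, y) + c) ** degree

def rbf_kernel(x, y, gamma=0.5):
    # exp(-gamma * ||x - y||^2), the Gaussian kernel
    return np.exp(-gamma * np.linalg.norm(x - y) ** 2)

x, y = np.array([1.0, 2.0]), np.array([2.0, 0.0])
print(linear_kernel(x, y))      # 1*2 + 2*0 = 2.0
print(polynomial_kernel(x, y))  # (2 + 1)^2 = 9.0
print(rbf_kernel(x, x))         # 1.0 for identical points
```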
The linear kernel is often recommended for text classification
It is interesting to note that :
The original optimal hyperplane algorithm proposed by Vapnik in 1963 was a linear classifier 
It was only 30 years later that the kernel trick was introduced.
If it is the simplest algorithm, why is the linear kernel recommended for text classification?
Text is often linearly separable
Most text classification problems are linearly separable
Linear kernel works well with linearly separable data
SVM Tutorial : Classify text in C#
In this tutorial I will show you how to classify text with SVM in C#.
The main steps to classify text in C# are:
- Create a new project
- Install the SVM package with Nuget
- Prepare the data
- Read the data
- Generate a problem
- Train the model
Step 1: Create the Project
Create a new Console application.
Step 2: Install the SVM package with NuGet
In the solution explorer, right click on "References" and click on "Manage NuGet Packages..."
Select "Online" and in the search box type "SVM".
You should now see the libsvm.net package. Click on Install, and that's it!
There are several libsvm implementations in C#. We will use libsvm.net because it is the most up to date and is easily downloadable via NuGet.