SVM Tutorial: How to classify text in R

July 26, 2020 / November 23, 2014 by Alexandre KOWALCZYK

In this tutorial I will show you how to classify text with SVM in R.

The main steps to classify text in R are:

1. Create a new RStudio project
2. Install the required packages
3. Read the data
4. Prepare the data
5. Create and train the SVM model
6. Predict with new data

Step 1: Create a new RStudio Project

To begin with, you will need to download and install the RStudio development environment.

Once you have installed it, you can create a new project by clicking on "Project: (None)" in the top-right corner of the screen:

[Figure: Create a new project in RStudio]

Alexandre KOWALCZYK

I am passionate about machine learning and Support Vector Machine. I like to explain things simply to share my knowledge with people from around the world.

Support Vector Regression with R

August 19, 2021 / October 23, 2014 by Alexandre KOWALCZYK

In this article I will show how to use R to perform a Support Vector Regression.
We will first do a simple linear regression, then move to the Support Vector Regression so that you can see how the two behave with the same data.

A simple data set

To begin with we will use this simple data set:

I just entered some data in Excel. I prefer that to using a well-known existing data set, because this article is not about the data but about the models we will use.

As you can see, there seems to be some kind of relation between our two variables X and Y, and it looks like we could fit a line that passes near each point.
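To make "fitting a line" concrete, here is a minimal sketch of an ordinary least-squares fit. It is written in Python rather than R, purely as an illustration of the idea. The data points are made up for this example (they are not the article's data set), and the helper names `fit_line` and `rmse` are hypothetical.

```python
import math

# Hypothetical (x, y) points, invented for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]


def fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def rmse(xs, ys, slope, intercept):
    """Root mean squared error of the line on the given points."""
    squared = [(y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(sum(squared) / len(squared))


slope, intercept = fit_line(xs, ys)
error = rmse(xs, ys, slope, intercept)
print(f"y = {slope:.2f} * x + {intercept:.2f}, RMSE = {error:.3f}")
```

The error value gives a single number for how close the line passes to the points, which is handy when comparing two models on the same data.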

Let's do that in R!

Linear Kernel: Why is it recommended for text classification?

July 26, 2020 / October 19, 2014 by Alexandre KOWALCZYK

The Support Vector Machine can be viewed as a kernel machine. As a result, you can change its behavior by using a different kernel function.

The most popular kernel functions are:

the linear kernel
the polynomial kernel
the RBF (Gaussian) kernel
the string kernel
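As a rough illustration of what these kernels compute, here is a minimal Python sketch. The function names and the default parameters (`degree`, `coef0`, `gamma`, `k`) are choices made for this example, not values taken from any particular SVM library. For the string kernel, the sketch uses the k-spectrum kernel, which is only one of several string kernels.

```python
import math
from collections import Counter


def linear_kernel(x, z):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, z))


def polynomial_kernel(x, z, degree=2, coef0=1.0):
    """(x . z + coef0) ** degree"""
    return (linear_kernel(x, z) + coef0) ** degree


def rbf_kernel(x, z, gamma=0.5):
    """exp(-gamma * ||x - z||^2), the Gaussian kernel."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def spectrum_kernel(s, t, k=2):
    """k-spectrum string kernel: dot product of substring counts of length k."""
    grams_s = Counter(s[i:i + k] for i in range(len(s) - k + 1))
    grams_t = Counter(t[i:i + k] for i in range(len(t) - k + 1))
    return sum(count * grams_t[gram] for gram, count in grams_s.items())


x, z = [1.0, 2.0], [3.0, 4.0]
print(linear_kernel(x, z))
print(polynomial_kernel(x, z))
print(rbf_kernel(x, z))
print(spectrum_kernel("abab", "bab"))
```

Each function takes two inputs and returns a single similarity value. Swapping one function for another is what "using a different kernel" means in practice.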

The linear kernel is often recommended for text classification

It is interesting to note that:

The original optimal hyperplane algorithm proposed by Vapnik in 1963 was a linear classifier [1].

It was only about 30 years later that the kernel trick was introduced.

If it is the simpler algorithm, why is the linear kernel recommended for text classification?

Text is often linearly separable

Most text classification problems are linearly separable [2].

Linear kernel works well with linearly separable data
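A toy example can show what linear separability means for text. The sketch below is in Python and uses a perceptron rather than an SVM, because the perceptron is guaranteed to stop making mistakes on its training data when that data is linearly separable. The four documents and their labels are invented for this example, and the helper names are hypothetical.

```python
from collections import Counter

# Hypothetical toy corpus: +1 = spam, -1 = not spam.
docs = [
    ("cheap pills buy now", 1),
    ("win money now", 1),
    ("meeting agenda for monday now", -1),
    ("project report due monday", -1),
]


def bag_of_words(text):
    """Sparse word-count representation of a document."""
    return Counter(text.lower().split())


def score(weights, bias, features):
    """Linear decision function: w . x + b over the sparse features."""
    return bias + sum(weights.get(w, 0.0) * c for w, c in features.items())


def train_perceptron(examples, max_epochs=100):
    """Return (weights, bias, converged) after perceptron training."""
    weights, bias = {}, 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for features, label in examples:
            if label * score(weights, bias, features) <= 0:
                # Misclassified: move the hyperplane toward this example.
                for word, count in features.items():
                    weights[word] = weights.get(word, 0.0) + label * count
                bias += label
                mistakes += 1
        if mistakes == 0:
            return weights, bias, True  # a separating hyperplane was found
    return weights, bias, False


examples = [(bag_of_words(text), label) for text, label in docs]
weights, bias, converged = train_perceptron(examples)
print("linearly separable on this toy data:", converged)
```

With one dimension per word, even a short document lives in a high-dimensional space. That makes a separating hyperplane easy to find, as this toy corpus shows.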

How to classify text using SVM in C#

July 26, 2020 / October 19, 2014 by Alexandre KOWALCZYK

SVM Tutorial: Classify text in C#

In this tutorial I will show you how to classify text with SVM in C#.

The main steps to classify text in C# are:

1. Create a new project
2. Install the SVM package with NuGet
3. Prepare the data
4. Read the data
5. Generate a problem
6. Train the model
7. Predict
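To give an idea of what preparing text data for an SVM library can look like, here is a minimal Python sketch that turns labeled sentences into the sparse text format used by LIBSVM tools (`<label> <index>:<value> ...`, with 1-based indices in ascending order). Whether this tutorial's "Generate a problem" step goes through such a file is an assumption. The sample sentences and the helper names `build_vocabulary` and `to_libsvm_line` are hypothetical.

```python
from collections import Counter


def build_vocabulary(texts):
    """Map each distinct word to a 1-based feature index."""
    vocab = {}
    for text in texts:
        for word in text.lower().split():
            if word not in vocab:
                vocab[word] = len(vocab) + 1
    return vocab


def to_libsvm_line(label, text, vocab):
    """Format one document as '<label> <index>:<count> ...'."""
    # Words missing from the vocabulary are ignored.
    counts = Counter(w for w in text.lower().split() if w in vocab)
    # LIBSVM expects feature indices in ascending order.
    pairs = sorted((vocab[w], c) for w, c in counts.items())
    features = " ".join(f"{index}:{count}" for index, count in pairs)
    return f"{label} {features}".rstrip()


# Hypothetical training sentences, invented for illustration.
training = [(1, "good good movie"), (-1, "bad movie")]
vocab = build_vocabulary(text for _, text in training)
lines = [to_libsvm_line(label, text, vocab) for label, text in training]
print("\n".join(lines))
```

The same vocabulary has to be reused when converting new text for prediction, so that each word keeps the feature index it had during training.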

Step 1: Create the Project

Create a new Console application.

Step 2: Install the SVM package with NuGet

In the Solution Explorer, right-click on "References" and click on "Manage NuGet Packages..."

Select "Online" and in the search box type "SVM".

You should now see the libsvm.net package. Click on Install, and that's it!

There are several libsvm implementations in C#. We will use libsvm.net because it is the most up to date and is easy to install via NuGet.
