Lecture 5: Support Vector Machines
Today’s topic: Support Vector Machines (SVMs)
Support vector machines (SVMs) are used for classification and build on some of the ideas from logistic regression. As large-margin classifiers, SVMs can deal with noisy data in which some labels are misclassified.
Support vector machines can also perform non-linear classification using kernels. A kernel corresponds to a non-linear transformation of the input data into a higher-dimensional space, one in which linear separation, as in logistic regression, becomes possible.
This sounds very abstract, but the basic mathematics is not very complicated. Support vector machines are very powerful; before you try neural networks, try SVMs with different kernels.
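To make this concrete, here is a minimal sketch (assuming scikit-learn is installed) of the "try different kernels" advice: fitting SVMs with a linear and an RBF kernel on a toy dataset that no straight line can separate.

```python
# A sketch comparing SVM kernels on non-linearly-separable data.
# Assumes scikit-learn is available; make_circles and SVC are sklearn APIs.
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Two concentric circles: the classes cannot be separated by a line.
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

linear_acc = SVC(kernel="linear").fit(X, y).score(X, y)
rbf_acc = SVC(kernel="rbf").fit(X, y).score(X, y)

print(f"linear kernel accuracy: {linear_acc:.2f}")
print(f"rbf kernel accuracy:    {rbf_acc:.2f}")
```

The linear kernel is stuck near chance on this data, while the Gaussian (RBF) kernel separates the circles almost perfectly, because it implicitly works in a space where the two rings become linearly separable.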
These are the slides that I use.
Reading Guide
- Hundred-Page Machine Learning Book: Chapter 3, section 3.4; Chapter 1, sections 1.3 and 1.4 (again); and Chapter 7, sections 7.1, 7.2, and 7.3.
- The Wikipedia page on the Kernel Method is a good starting point for the kernel trick and a reference for different kernels. The Wikipedia article on SVMs is also a good starting point and includes a very good set of references.
What should I know by the end of this lecture?
- What are Support Vector Machines?
- What is a large margin classifier?
- What is the hinge loss function?
- What are kernels? You should know some common kernels, including the polynomial kernel and the Gaussian kernel (or radial basis kernel).
- What is the Kernel trick?
- How does the learning algorithm work for SVMs?
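Two of the points above, the hinge loss and the kernel trick, can be illustrated with a few lines of plain Python. The sketch below (illustrative code, not from the lecture slides) defines the hinge loss and the two common kernels, and then checks the kernel trick for the degree-2 polynomial kernel: the kernel value equals a dot product in an explicit feature space that is never constructed during training.

```python
import math

def hinge_loss(y, score):
    """Hinge loss: zero when the example is on the correct side of the
    margin (y * score >= 1), and linear in the violation otherwise."""
    return max(0.0, 1.0 - y * score)

def polynomial_kernel(x, z, degree=2, c=0.0):
    """Polynomial kernel K(x, z) = (x . z + c)^degree."""
    return (sum(a * b for a, b in zip(x, z)) + c) ** degree

def gaussian_kernel(x, z, sigma=1.0):
    """Gaussian (RBF) kernel K(x, z) = exp(-||x - z||^2 / (2 sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def phi(x):
    """Explicit feature map for (x . z)^2 in two dimensions:
    phi(x) = (x1^2, sqrt(2) * x1 * x2, x2^2)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

x, z = (1.0, 2.0), (3.0, -1.0)
# The kernel trick: the kernel value equals the dot product of the
# mapped vectors, computed without ever building phi(x) explicitly.
lhs = polynomial_kernel(x, z)                       # (1*3 + 2*(-1))^2 = 1.0
rhs = sum(a * b for a, b in zip(phi(x), phi(z)))    # phi(x) . phi(z) = 1.0
print(lhs, rhs)

print(hinge_loss(+1, 2.0))   # correct, outside the margin -> 0.0
print(hinge_loss(+1, 0.5))   # correct but inside the margin -> 0.5
print(hinge_loss(-1, 0.5))   # misclassified -> 1.5
```

This is exactly why kernels scale: an SVM only ever needs kernel values K(x, z) between pairs of points, so it can work in a very high-dimensional (for the Gaussian kernel, infinite-dimensional) feature space at the cost of a dot product.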