Friday, March 1, 2002:
Roundtable discussion of support vector machines.
(click on further readings above for more papers!)
Paper: "A tutorial on support vector machines for pattern recognition." (download)
Burges, 1998, Data Mining and Knowledge Discovery 2(2):121-167.
In a problem of classifying points in a plane into two groups A and
B, if the groups can be separated by a straight line, the support
vector algorithm chooses the line such that the distance between that
line and the points in each group closest to it is maximised. The
points in each group closest to the dividing line are called support
vectors. It turns out that the method uses the points only through
scalar products between pairs of them. If the points are not linearly
separable, a trick is to map
them into a higher dimensional space in which they become linearly
separable, and then apply the support vector algorithm.
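
The two ideas so far, the margin of a separating line and the map into a
higher dimensional space, can be illustrated with a minimal sketch. The
toy points, the two candidate lines, and the helper names (margin, lift)
are assumptions made up for illustration; they are not taken from the
paper, and the sketch only compares given lines rather than running the
support vector optimisation itself.

```python
import math

def signed_distance(w, b, x):
    """Signed distance from point x to the hyperplane w.x + b = 0."""
    dot = sum(wi * xi for wi, xi in zip(w, x))
    return (dot + b) / math.sqrt(sum(wi * wi for wi in w))

def margin(w, b, group_a, group_b):
    """Distance from the hyperplane to the closest point, or None if the
    hyperplane does not put A on its positive and B on its negative side."""
    distances = [signed_distance(w, b, x) for x in group_a]
    distances += [-signed_distance(w, b, x) for x in group_b]
    closest = min(distances)
    return closest if closest > 0 else None

# Toy data (made up): two groups that a straight line can separate.
group_a = [(2.0, 2.0), (3.0, 3.0)]
group_b = [(0.0, 0.0), (-1.0, 0.0)]

# Two candidate lines, each given as (w, b). Both separate the groups,
# but the second leaves more room between the line and the closest points.
margin_1 = margin((1.0, 0.0), -0.5, group_a, group_b)  # line x1 = 0.5
margin_2 = margin((1.0, 1.0), -2.0, group_a, group_b)  # line x1 + x2 = 2

def lift(x):
    """Hypothetical map from the plane into 3-D: append the squared radius."""
    return (x[0], x[1], x[0] ** 2 + x[1] ** 2)

# Toy data (made up): group B surrounds group A, so no straight line in
# the plane separates them, but the plane z = 2 does after lifting.
inner = [(0.5, 0.0), (0.0, -0.5), (-0.5, 0.5)]
outer = [(2.0, 0.0), (0.0, 2.0), (-2.0, 0.0), (0.0, -2.0)]
lifted_margin = margin((0.0, 0.0, -1.0), 2.0,
                       [lift(x) for x in inner],
                       [lift(x) for x in outer])

print(margin_1, margin_2, lifted_margin)
```

For these points the second line is the better choice under the maximum
margin criterion, and the points of each group that lie closest to it
would play the role of the support vectors.
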
Unfortunately, calculating the scalar product in the higher
dimensional space may be intractable because of the large number of
dimensions. Fortunately, the higher dimensional scalar product is
sometimes a simple function, or "kernel", of the low dimensional
scalar product. Using this kernel trick, the high dimensional scalar
product can be calculated without ever mapping the points explicitly.
There are arguments from statistical learning theory that the support
vector algorithm tends to fit the training data well without
overfitting it, and hence should perform well on test data.
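
As a small check of the kernel idea, the sketch below compares a scalar
product computed explicitly in a higher dimensional space with the same
quantity obtained from the low dimensional scalar product alone. The
particular map phi and the squared-dot-product kernel are a common
textbook choice assumed here for illustration, and the two sample points
are made up.

```python
import math

def dot(u, v):
    """Ordinary scalar product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def phi(x):
    """Explicit map from the plane into 3-D (assumed for illustration)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

def kernel(x, y):
    """The same high dimensional scalar product, computed as a function
    of the low dimensional scalar product only."""
    return dot(x, y) ** 2

x, y = (1.0, 2.0), (3.0, -1.0)
explicit = dot(phi(x), phi(y))   # maps both points first, then multiplies
via_kernel = kernel(x, y)        # never leaves the plane

print(explicit, via_kernel, math.isclose(explicit, via_kernel))
```

Here the saving is negligible because the higher dimensional space has
only three dimensions, but the kernel costs the same however large the
implicit space becomes, which is what makes the trick useful.
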
1:30pm - 3:00pm, HSE 810.