1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605
|
Libsvm is a simple, easy-to-use, and efficient software for SVM
classification and regression. It solves C-SVM classification, nu-SVM
classification, one-class-SVM, epsilon-SVM regression, and nu-SVM
regression. It also provides an automatic model selection tool for
C-SVM classification. This document explains the use of libsvm.
Libsvm is available at
http://www.csie.ntu.edu.tw/~cjlin/libsvm
Please read the COPYRIGHT file before using libsvm.
Table of Contents
=================
- Quick Start
- Installation
- `svm-train' Usage
- `svm-predict' Usage
- Tips on practical use
- Examples
- Precomputed Kernels
- Library Usage
- Java Version
- Building Windows Binaries
- Additional Tools: Model Selection, Sub-sampling, etc.
- Python Interface
- Additional Information
Quick Start
===========
If you are new to SVM and if the data is not large, please go to
`tools' directory and use easy.py after installation. It does
everything automatic -- from data scaling to parameter selection.
Usage: easy.py training_file [testing_file]
More information about parameter selection can be found in
tools/README.
Installation
============
On Unix systems, type `make' to build the `svm-train' and `svm-predict'
programs. Run them without arguments to show the usages of them.
On other systems, consult `Makefile' to build them (e.g., see
'Building Windows binaries' in this file) or use the pre-built
binaries (Windows binaries are in the directory `windows').
The format of training and testing data file is:
<label> <index1>:<value1> <index2>:<value2> ...
.
.
.
For classification, <label> is an integer indicating the class label
(multi-class is supported). For regression, <label> is
the target value which can be any real number. For one-class SVM, it's
not used so can be any number. Except using precomputed kernels
(explained in another section), <index>:<value> gives a feature
(attribute) value. <index> is an integer starting from 1 and <value>
is a real number. Indices must be in an ASCENDING order. Labels in the
testing file are only used to calculate accuracy or errors. If they
are unknown, just fill the first column with any numbers.
A sample classification data included in this package is `heart_scale'.
Type `svm-train heart_scale', and the program will read the training
data and output the model file `heart_scale.model'. If you have a test
set called heart_scale.t, then type `svm-predict heart_scale.t
heart_scale.model output' to see the prediction accuracy. The `output'
file contains the predicted class labels.
There are some other useful programs in this package.
svm-scale:
This is a tool for scaling input data file.
svm-toy:
This is a simple graphical interface which shows how SVM
separate data in a plane. You can click in the window to
draw data points. Use "change" button to choose class
1, 2 or 3 (i.e., up to three classes are supported), "load"
button to load data from a file, "save" button to save data to
a file, "run" button to obtain an SVM model, and "clear"
button to clear the window.
You can enter options in the bottom of the window, the syntax of
options is the same as `svm-train'.
Note that "load" and "save" consider data in the
classification but not the regression case. Each data point
has one label (the color) which must be 1, 2, or 3 and two
attributes (x-axis and y-axis values) in [0,1].
Type `make' in respective directories to build them.
You need Qt library to build the Qt version.
(available from http://www.trolltech.com)
You need GTK+ library to build the GTK version.
(available from http://www.gtk.org)
We use Visual C++ to build the Windows version.
The pre-built Windows binaries are in the windows directory.
`svm-train' Usage
=================
Usage: svm-train [options] training_set_file [model_file]
options:
-s svm_type : set type of SVM (default 0)
0 -- C-SVC
1 -- nu-SVC
2 -- one-class SVM
3 -- epsilon-SVR
4 -- nu-SVR
-t kernel_type : set type of kernel function (default 2)
0 -- linear: u'*v
1 -- polynomial: (gamma*u'*v + coef0)^degree
2 -- radial basis function: exp(-gamma*|u-v|^2)
3 -- sigmoid: tanh(gamma*u'*v + coef0)
4 -- precomputed kernel (kernel values in training_set_file)
-d degree : set degree in kernel function (default 3)
-g gamma : set gamma in kernel function (default 1/k)
-r coef0 : set coef0 in kernel function (default 0)
-c cost : set the parameter C of C-SVC, epsilon-SVR, and nu-SVR (default 1)
-n nu : set the parameter nu of nu-SVC, one-class SVM, and nu-SVR (default 0.5)
-p epsilon : set the epsilon in loss function of epsilon-SVR (default 0.1)
-m cachesize : set cache memory size in MB (default 100)
-e epsilon : set tolerance of termination criterion (default 0.001)
-h shrinking: whether to use the shrinking heuristics, 0 or 1 (default 1)
-b probability_estimates: whether to train an SVC or SVR model for probability estimates, 0 or 1 (default 0)
-wi weight: set the parameter C of class i to weight*C in C-SVC (default 1)
-v n: n-fold cross validation mode
The k in the -g option means the number of attributes in the input data.
option -v randomly splits the data into n parts and calculates cross
validation accuracy/mean squared error on them.
`svm-predict' Usage
===================
Usage: svm-predict [options] test_file model_file output_file
options:
-b probability_estimates: whether to predict probability estimates, 0 or 1 (default 0); for one-class SVM only 0 is supported
model_file is the model file generated by svm-train.
test_file is the test data you want to predict.
svm-predict will produce output in the output_file.
Tips on Practical Use
=====================
* Scale your data. For example, scale each attribute to [0,1] or [-1,+1].
* For C-SVC, consider using the model selection tool in the tools directory.
* nu in nu-SVC/one-class-SVM/nu-SVR approximates the fraction of training
errors and support vectors.
* If data for classification are unbalanced (e.g. many positive and
few negative), try different penalty parameters C by -wi (see
examples below).
* Specify larger cache size (i.e., larger -m) for huge problems.
Examples
========
> svm-scale -l -1 -u 1 -s range train > train.scale
> svm-scale -r range test > test.scale
Scale each feature of the training data to be in [-1,1]. Scaling
factors are stored in the file range and then used for scaling the
test data.
> svm-train -s 0 -c 1000 -t 2 -g 0.5 -e 0.00001 data_file
Train a classifier with RBF kernel exp(-0.5|u-v|^2) and stopping
tolerance 0.00001
> svm-train -s 3 -p 0.1 -t 0 -c 10 data_file
Solve SVM regression with linear kernel u'v and C=10, and epsilon = 0.1
in the loss function.
> svm-train -s 0 -c 10 -w1 1 -w-1 5 data_file
Train a classifier with penalty 10 for class 1 and penalty 50
for class -1.
> svm-train -s 0 -c 500 -g 0.1 -v 5 data_file
Do five-fold cross validation for the classifier using
the parameters C = 500 and gamma = 0.1
> svm-train -s 0 -b 1 data_file
> svm-predict -b 1 test_file data_file.model output_file
Obtain a model with probability information and predict test data with
probability estimates
Precomputed Kernels
===================
Users may precompute kernel values and input them as training and
testing files. Then libsvm does not need the original
training/testing sets.
Assume there are L training instances x1, ..., xL and.
Let K(x, y) be the kernel
value of two instances x and y. The input formats
are:
New training instance for xi:
<label> 0:i 1:K(xi,x1) ... L:K(xi,xL)
New testing instance for any x:
<label> 0:? 1:K(x,x1) ... L:K(x,xL)
That is, in the training file the first column must be the "ID" of
xi. In testing, ? can be any value.
All kernel values including ZEROs must be explicitly provided. Any
permutation or random subsets of the training/testing files are also
valid (see examples below).
Note: the format is slightly different from the precomputed kernel
package released in libsvmtools earlier.
Examples:
Assume the original training data has three four-feature
instances and testing data has one instance:
1 1:1 2:1 3:1 4:1
3 2:3 4:3
2 3:1
1 1:1 3:1
If the linear kernel is used, we have the following new
training/testing sets:
1 0:1 1:4 2:6 3:1
3 0:2 1:6 2:18 3:0
2 0:3 1:1 2:0 3:1
1 0:? 1:2 2:0 3:1
? can be any value.
Any subset of the above training file is also valid. For example,
2 0:3 1:1 2:0 3:1
3 0:2 1:6 2:18 3:0
implies that the kernel matrix is
0 1
18 0
Library Usage
=============
These functions and structures are declared in the header file `svm.h'.
You need to #include "svm.h" in your C/C++ source files and link your
program with `svm.cpp'. You can see `svm-train.c' and `svm-predict.c'
for examples showing how to use them.
Before you classify test data, you need to construct an SVM model
(`svm_model') using training data. A model can also be saved in
a file for later use. Once an SVM model is available, you can use it
to classify new data.
- Function: struct svm_model *svm_train(const struct svm_problem *prob,
const struct svm_parameter *param);
This function constructs and returns an SVM model according to
the given training data and parameters.
struct svm_problem describes the problem:
struct svm_problem
{
int l;
double *y;
struct svm_node **x;
};
where `l' is the number of training data, and `y' is an array containing
their target values. (integers in classification, real numbers in
regression) `x' is an array of pointers, each of which points to a sparse
representation (array of svm_node) of one training vector.
For example, if we have the following training data:
LABEL ATTR1 ATTR2 ATTR3 ATTR4 ATTR5
----- ----- ----- ----- ----- -----
1 0 0.1 0.2 0 0
2 0 0.1 0.3 -1.2 0
1 0.4 0 0 0 0
2 0 0.1 0 1.4 0.5
3 -0.1 -0.2 0.1 1.1 0.1
then the components of svm_problem are:
l = 5
y -> 1 2 1 2 3
x -> [ ] -> (2,0.1) (3,0.2) (-1,?)
[ ] -> (2,0.1) (3,0.3) (4,-1.2) (-1,?)
[ ] -> (1,0.4) (-1,?)
[ ] -> (2,0.1) (4,1.4) (5,0.5) (-1,?)
[ ] -> (1,-0.1) (2,-0.2) (3,0.1) (4,1.1) (5,0.1) (-1,?)
where (index,value) is stored in the structure `svm_node':
struct svm_node
{
int index;
double value;
};
index = -1 indicates the end of one vector.
struct svm_parameter describes the parameters of an SVM model:
struct svm_parameter
{
int svm_type;
int kernel_type;
int degree; /* for poly */
double gamma; /* for poly/rbf/sigmoid */
double coef0; /* for poly/sigmoid */
/* these are for training only */
double cache_size; /* in MB */
double eps; /* stopping criteria */
double C; /* for C_SVC, EPSILON_SVR, and NU_SVR */
int nr_weight; /* for C_SVC */
int *weight_label; /* for C_SVC */
double* weight; /* for C_SVC */
double nu; /* for NU_SVC, ONE_CLASS, and NU_SVR */
double p; /* for EPSILON_SVR */
int shrinking; /* use the shrinking heuristics */
int probability; /* do probability estimates */
};
svm_type can be one of C_SVC, NU_SVC, ONE_CLASS, EPSILON_SVR, NU_SVR.
C_SVC: C-SVM classification
NU_SVC: nu-SVM classification
ONE_CLASS: one-class-SVM
EPSILON_SVR: epsilon-SVM regression
NU_SVR: nu-SVM regression
kernel_type can be one of LINEAR, POLY, RBF, SIGMOID.
LINEAR: u'*v
POLY: (gamma*u'*v + coef0)^degree
RBF: exp(-gamma*|u-v|^2)
SIGMOID: tanh(gamma*u'*v + coef0)
PRECOMPUTED: kernel values in training_set_file
cache_size is the size of the kernel cache, specified in megabytes.
C is the cost of constraints violation. (we usually use 1 to 1000)
eps is the stopping criterion. (we usually use 0.00001 in nu-SVC,
0.001 in others). nu is the parameter in nu-SVM, nu-SVR, and
one-class-SVM. p is the epsilon in epsilon-insensitive loss function
of epsilon-SVM regression. shrinking = 1 means shrinking is conducted;
= 0 otherwise. probability = 1 means model with probability
information is obtained; = 0 otherwise.
nr_weight, weight_label, and weight are used to change the penalty
for some classes (If the weight for a class is not changed, it is
set to 1). This is useful for training classifier using unbalanced
input data or with asymmetric misclassification cost.
nr_weight is the number of elements in the array weight_label and
weight. Each weight[i] corresponds to weight_label[i], meaning that
the penalty of class weight_label[i] is scaled by a factor of weight[i].
If you do not want to change penalty for any of the classes,
just set nr_weight to 0.
*NOTE* Because svm_model contains pointers to svm_problem, you can
not free the memory used by svm_problem if you are still using the
svm_model produced by svm_train().
- Function: double svm_predict(const struct svm_model *model,
const struct svm_node *x);
This function does classification or regression on a test vector x
given a model.
For a classification model, the predicted class for x is returned.
For a regression model, the function value of x calculated using
the model is returned. For an one-class model, +1 or -1 is
returned.
- Function: void svm_cross_validation(const struct svm_problem *prob,
const struct svm_parameter *param, int nr_fold, double *target);
This function conducts cross validation. Data are separated to
nr_fold folds. Under given parameters, sequentially each fold is
validated using the model from training the remaining. Predicted
labels in the validation process are stored in the array called
target.
The format of svm_prob is same as that for svm_train().
- Function: int svm_get_svm_type(const struct svm_model *model);
This function gives svm_type of the model. Possible values of
svm_type are defined in svm.h.
- Function: int svm_get_nr_class(const svm_model *model);
For a classification model, this function gives the number of
classes. For a regression or an one-class model, 2 is returned.
- Function: void svm_get_labels(const svm_model *model, int* label)
For a classification model, this function outputs the name of
labels into an array called label. For regression and one-class
models, label is unchanged.
- Function: double svm_get_svr_probability(const struct svm_model *model);
For a regression model with probability information, this function
outputs a value sigma > 0. For test data, we consider the
probability model: target value = predicted value + z, z: Laplace
distribution e^(-|z|/sigma)/(2sigma)
If the model is not for svr or does not contain required
information, 0 is returned.
- Function: void svm_predict_values(const svm_model *model,
const svm_node *x, double* dec_values)
This function gives decision values on a test vector x given a
model.
For a classification model with nr_class classes, this function
gives nr_class*(nr_class-1)/2 decision values in the array
dec_values, where nr_class can be obtained from the function
svm_get_nr_class. The order is label[0] vs. label[1], ...,
label[0] vs. label[nr_class-1], label[1] vs. label[2], ...,
label[nr_class-2] vs. label[nr_class-1], where label can be
obtained from the function svm_get_labels.
For a regression model, label[0] is the function value of x
calculated using the model. For one-class model, label[0] is +1 or
-1.
- Function: double svm_predict_probability(const struct svm_model *model,
const struct svm_node *x, double* prob_estimates);
This function does classification or regression on a test vector x
given a model with probability information.
For a classification model with probability information, this
function gives nr_class probability estimates in the array
prob_estimates. nr_class can be obtained from the function
svm_get_nr_class. The class with the highest probability is
returned. For all other situations, the array prob_estimates is
unchanged and the returned value is the same as that of
svm_predict.
- Function: const char *svm_check_parameter(const struct svm_problem *prob,
const struct svm_parameter *param);
This function checks whether the parameters are within the feasible
range of the problem. This function should be called before calling
svm_train(). It returns NULL if the parameters are feasible,
otherwise an error message is returned.
- Function: int svm_check_probability_model(const struct svm_model *model);
This function checks whether the model contains required
information to do probability estimates. If so, it returns
+1. Otherwise, 0 is returned. This function should be called
before calling svm_get_svr_probability and
svm_predict_probability.
- Function: int svm_save_model(const char *model_file_name,
const struct svm_model *model);
This function saves a model to a file; returns 0 on success, or -1
if an error occurs.
- Function: struct svm_model *svm_load_model(const char *model_file_name);
This function returns a pointer to the model read from the file,
or a null pointer if the model could not be loaded.
- Function: void svm_destroy_model(struct svm_model *model);
This function frees the memory used by a model.
- Function: void svm_destroy_param(struct svm_parameter *param);
This function frees the memory used by a parameter set.
Java Version
============
The pre-compiled java class archive `libsvm.jar' and its source files are
in the java directory. To run the programs, use
java -classpath libsvm.jar svm_train <arguments>
java -classpath libsvm.jar svm_predict <arguments>
java -classpath libsvm.jar svm_toy
You may need to add Java runtime library (like classes.zip) to the classpath.
You may need to increase maximum Java heap size.
Library usages are similar to the C version. These functions are available:
public class svm {
public static svm_model svm_train(svm_problem prob, svm_parameter param);
public static void svm_cross_validation(svm_problem prob, svm_parameter param, int nr_fold, double[] target);
public static int svm_get_svm_type(svm_model model);
public static int svm_get_nr_class(svm_model model);
public static void svm_get_labels(svm_model model, int[] label);
public static double svm_get_svr_probability(svm_model model);
public static void svm_predict_values(svm_model model, svm_node[] x, double[] dec_values);
public static double svm_predict(svm_model model, svm_node[] x);
public static double svm_predict_probability(svm_model model, svm_node[] x, double[] prob_estimates);
public static void svm_save_model(String model_file_name, svm_model model) throws IOException
public static svm_model svm_load_model(String model_file_name) throws IOException
public static String svm_check_parameter(svm_problem prob, svm_parameter param);
public static int svm_check_probability_model(svm_model model);
}
The library is in the "libsvm" package.
Note that in Java version, svm_node[] is not ended with a node whose index = -1.
Building Windows Binaries
=========================
Windows binaries are in the directory `windows'. To build them via
Visual C++, use the following steps:
1. Open a dos command box and change to libsvm directory. If
environment variables of VC++ have not been set, type
"C:\Program Files\Microsoft Visual Studio .NET 2003\Vc7\bin\vcvars32.bat"
You may have to modify the above according which version of VC++or
where it is installed.
2. Type
nmake -f Makefile.win clean all
3. (optional) To build python interface, download and install Python.
Edit Makefile.win and change PYTHON_INC and PYTHON_LIB to your python
installation. Type
nmake -f Makefile.win python
and then copy windows\python\svmc.dll to the python directory.
Another way is to build them from Visual C++ environment. See details
in libsvm faq.
Additional Tools: Model Selection, Sub-sampling, etc.
====================================================
See the README file in the tools directory.
Python Interface
================
See the README file in python directory.
Additional Information
======================
If you find LIBSVM helpful, please cite it as
Chih-Chung Chang and Chih-Jen Lin, LIBSVM: a library for
support vector machines, 2001.
Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm
LIBSVM implementation document is available at
http://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf
For any questions and comments, please send your email to
cjlin@csie.ntu.edu.tw
Acknowledgments:
This work was supported in part by the National Science
Council of Taiwan via the grant NSC 89-2213-E-002-013.
The authors thank their group members and users
for many helpful discussions and comments. They are listed in
http://www.csie.ntu.edu.tw/~cjlin/libsvm/acknowledgements
|