Tutorial
========

A Simple Example
----------------

In this example, the performance of an SVM classifier is evaluated
using a stratified k-fold resampling scheme.

First, import the NumPy and mlpy modules:

.. code-block:: python
   
   >>> import numpy as np
   >>> import mlpy


Then, load a data file (*data.dat*) containing 30 samples described by
100 features (*x*) and labels (*y*):

.. code-block:: python

   >>> x, y = mlpy.data_fromfile('data.dat') # import data file
   >>> x.shape
   (30, 100)
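
If *data.dat* is not at hand, a synthetic stand-in with the same shape
can be used to follow the rest of the tutorial. The sketch below is only
an illustration: the two-class labels (-1 and 1), the random seed, and
the shifted features are assumptions, not the contents of the original
file.

.. code-block:: python

   import numpy as np

   rng = np.random.RandomState(0)         # fixed seed for reproducibility
   n_samples, n_features = 30, 100

   # Hypothetical stand-in for data.dat: two classes labelled -1 and 1.
   y = np.repeat([-1, 1], n_samples // 2)
   x = rng.normal(size=(n_samples, n_features))
   x[y == 1, :5] += 1.5                   # shift a few features so the classes differ

   print(x.shape)   # (30, 100)
   print(y.shape)   # (30,)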

Initialize the SVM classifier, specifying the kernel type (*linear*) and
the regularization parameter (*C*):

.. code-block:: python

   >>> classifier = mlpy.Svm(kernel='linear', C=1.0)  # initialize the SVM classifier
   
Define a stratified 10-fold resampling scheme, where *idx* is a list of
(train, test) pairs of sample indexes, one pair per fold:

.. code-block:: python
   
   >>> idx = mlpy.kfoldS(cl = y, sets = 10)
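
To make the structure of *idx* concrete, here is a minimal NumPy sketch
of stratified k-fold splitting. It is not mlpy's implementation: the
function name ``stratified_kfold``, its ``seed`` parameter, and the
round-robin assignment are assumptions chosen for illustration. Each
class is shuffled and dealt out to the folds in turn, so every fold
keeps roughly the class proportions of the full data set.

.. code-block:: python

   import numpy as np

   def stratified_kfold(labels, sets, seed=0):
       """Return a list of (train_indexes, test_indexes) pairs (hypothetical helper)."""
       labels = np.asarray(labels)
       rng = np.random.RandomState(seed)
       fold_of = np.empty(labels.shape[0], dtype=int)
       for cls in np.unique(labels):
           members = np.where(labels == cls)[0]
           rng.shuffle(members)
           # Deal the samples of this class out to the folds in turn.
           fold_of[members] = np.arange(members.shape[0]) % sets
       all_idx = np.arange(labels.shape[0])
       return [(all_idx[fold_of != k], all_idx[fold_of == k])
               for k in range(sets)]

   y_demo = np.repeat([-1, 1], 15)        # 15 samples per class
   idx_demo = stratified_kfold(y_demo, sets=5)
   train_idx, test_idx = idx_demo[0]
   print(len(idx_demo))                   # 5
   print(len(train_idx), len(test_idx))   # 24 6

Every sample appears in exactly one test set, and each test set here
holds three samples from each class.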

For each fold, build the training and test data, train the model on
*xtr*, and test it on *xts*. Performance is measured as the prediction
error averaged over the folds:

.. code-block:: python

   >>> pred_err = 0.0
   >>> for idxtr, idxts in idx:
   ...     xtr, xts = x[idxtr], x[idxts]       # split the samples into train and test
   ...     ytr, yts = y[idxtr], y[idxts]       # split the labels the same way
   ...     ret = classifier.compute(xtr, ytr)  # train the model
   ...     pred = classifier.predict(xts)      # predict the test labels
   ...     pred_err += mlpy.err(yts, pred)     # accumulate the prediction error
   >>> av_pred_err = pred_err / len(idx)       # compute the average prediction error
   >>> av_pred_err
   0.17499999999999999
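
The averaging step can be illustrated without mlpy. The sketch below
assumes that the prediction error of a fold is the fraction of
misclassified test samples; ``error_rate`` is a hypothetical helper
written for this illustration, not an mlpy function, and the labels are
made up.

.. code-block:: python

   import numpy as np

   def error_rate(true_labels, predicted_labels):
       """Fraction of samples whose predicted label differs from the true one."""
       true_labels = np.asarray(true_labels)
       predicted_labels = np.asarray(predicted_labels)
       return float(np.mean(true_labels != predicted_labels))

   # Two toy folds: one of four test samples with one mistake, one of two with none.
   fold_errors = [
       error_rate([1, -1, 1, -1], [1, 1, 1, -1]),
       error_rate([1, -1], [1, -1]),
   ]
   average_error = sum(fold_errors) / len(fold_errors)
   print(fold_errors)     # [0.25, 0.0]
   print(average_error)   # 0.125

Note that averaging per-fold errors weights each fold equally, so when
folds differ in size the result need not equal the overall fraction of
misclassified samples.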