File: data_transforms.rst

package info (click to toggle)
scikit-learn 0.23.2-5
  • links: PTS, VCS
  • area: main
  • in suites: bookworm, bullseye, sid
  • size: 21,892 kB
  • sloc: python: 132,020; cpp: 5,765; javascript: 2,201; ansic: 831; makefile: 213; sh: 44
file content (37 lines) | stat: -rw-r--r-- 1,324 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
.. include:: includes/big_toc_css.rst

.. _data-transforms:

Dataset transformations
-----------------------

scikit-learn provides a library of transformers, which may clean (see
:ref:`preprocessing`), reduce (see :ref:`data_reduction`), expand (see
:ref:`kernel_approximation`) or generate (see :ref:`feature_extraction`)
feature representations.

Like other estimators, these are represented by classes with a ``fit`` method,
which learns model parameters (e.g. mean and standard deviation for
normalization) from a training set, and a ``transform`` method which applies
this transformation model to unseen data. ``fit_transform`` may be more
convenient and efficient for modelling and transforming the training data
simultaneously.

Combining such transformers, either in parallel or series is covered in
:ref:`combining_estimators`. :ref:`metrics` covers transforming feature
spaces into affinity matrices, while :ref:`preprocessing_targets` considers
transformations of the target space (e.g. categorical labels) for use in
scikit-learn.

.. toctree::
    :maxdepth: 2

    modules/compose
    modules/feature_extraction
    modules/preprocessing
    modules/impute
    modules/unsupervised_reduction
    modules/random_projection
    modules/kernel_approximation
    modules/metrics
    modules/preprocessing_targets