.. currentmodule:: altair

.. _user-guide-impute-transform:

Impute
~~~~~~
The impute transform allows you to fill in missing entries in a dataset.
As an example, consider the following data, which includes missing values
that we filter out of the long-form representation (see :ref:`data-long-vs-wide`
for more on this):

.. altair-plot::
   :output: repr

   import numpy as np
   import pandas as pd

   data = pd.DataFrame({
       't': range(5),
       'x': [2, np.nan, 3, 1, 3],
       'y': [5, 7, 5, np.nan, 4]
   }).melt('t').dropna()
   data

Notice the result: the ``x`` series has no entry at ``t=1``, and the ``y``
series has no entry at ``t=3``. If we visualize this data directly with
Altair, each line passes straight over its missing entry:

.. altair-plot::

   import altair as alt

   raw = alt.Chart(data).mark_line(point=True).encode(
       x='t:Q',
       y='value:Q',
       color='variable:N'
   )
   raw

This is not always desirable, because (particularly for a line plot with
no points) it can imply the existence of data that is not there.
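
Before imputing, it can help to see exactly which entries are absent. The
following is a minimal sketch in plain Python (no Altair or pandas required)
that lists the missing ``(variable, t)`` pairs. The ``rows`` list mirrors the
long-form data above, and ``find_gaps`` is a hypothetical helper written for
this illustration, not part of Altair:

```python
from itertools import product

# Long-form rows matching the DataFrame above after melt('t').dropna()
rows = [
    {"t": 0, "variable": "x", "value": 2.0},
    {"t": 2, "variable": "x", "value": 3.0},
    {"t": 3, "variable": "x", "value": 1.0},
    {"t": 4, "variable": "x", "value": 3.0},
    {"t": 0, "variable": "y", "value": 5.0},
    {"t": 1, "variable": "y", "value": 7.0},
    {"t": 2, "variable": "y", "value": 5.0},
    {"t": 4, "variable": "y", "value": 4.0},
]


def find_gaps(rows, key="t", group="variable"):
    """Return the sorted (group, key) pairs that have no row."""
    present = {(r[group], r[key]) for r in rows}
    groups = sorted({r[group] for r in rows})
    keys = sorted({r[key] for r in rows})
    # Every group is expected to have one row per key value seen anywhere
    return [pair for pair in product(groups, keys) if pair not in present]


gaps = find_gaps(rows)
print(gaps)  # [('x', 1), ('y', 3)]
```

These two pairs are the entries that the impute transform fills in.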

Impute via Encodings
^^^^^^^^^^^^^^^^^^^^
To address this, you can use the ``impute`` method of the encoding channel.
For example, we can impute using a constant value (we'll show the raw chart
lightly in the background for reference):

.. altair-plot::

   background = raw.encode(opacity=alt.value(0.2))
   chart = alt.Chart(data).mark_line(point=True).encode(
       x='t:Q',
       y=alt.Y('value:Q').impute(value=0),
       color='variable:N'
   )
   background + chart
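
To make the effect of constant imputation concrete, here is a rough sketch of
the rows the chart ends up drawing. It is plain Python under the assumption
that every series should have one row per ``t`` value seen anywhere in the
data. The helper ``impute_constant`` and its parameter names are hypothetical
and do not reflect how Altair or Vega-Lite implement the transform:

```python
# Known values per series, keyed by t (same data as above)
x = {0: 2.0, 2: 3.0, 3: 1.0, 4: 3.0}
y = {0: 5.0, 1: 7.0, 2: 5.0, 4: 4.0}
rows = [
    {"t": t, "variable": name, "value": v}
    for name, series in (("x", x), ("y", y))
    for t, v in series.items()
]


def impute_constant(rows, impute, key, group, value):
    """Add one row per missing (group, key) pair, filled with ``value``."""
    keys = sorted({r[key] for r in rows})
    groups = sorted({r[group] for r in rows})
    present = {(r[group], r[key]) for r in rows}
    filled = list(rows)
    for g in groups:
        for k in keys:
            if (g, k) not in present:
                filled.append({key: k, group: g, impute: value})
    # Sort so each series reads in key order
    return sorted(filled, key=lambda r: (r[group], r[key]))


filled = impute_constant(rows, impute="value", key="t", group="variable", value=0)
print([r["value"] for r in filled if r["variable"] == "x"])
# [2.0, 0, 3.0, 1.0, 3.0]
```

The original eight rows are untouched; two rows with ``value=0`` are added,
one per gap.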

Or we can impute using any supported aggregate:

.. altair-plot::

   chart = alt.Chart(data).mark_line(point=True).encode(
       x='t:Q',
       y=alt.Y('value:Q').impute(method='mean'),
       color='variable:N'
   )
   background + chart
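
As a sanity check on what ``method='mean'`` produces for this dataset, the
sketch below computes the fill values by hand. It assumes the mean is taken
per series (one per ``variable``) over all known values of that series. The
function ``impute_mean`` is a hypothetical stand-in written for this
illustration:

```python
from statistics import mean

# Known values per series, keyed by t (same data as above)
series = {
    "x": {0: 2.0, 2: 3.0, 3: 1.0, 4: 3.0},
    "y": {0: 5.0, 1: 7.0, 2: 5.0, 4: 4.0},
}
all_keys = sorted({t for values in series.values() for t in values})


def impute_mean(series, all_keys):
    """Fill each series' gaps with the mean of that series' known values."""
    out = {}
    for name, values in series.items():
        fill = mean(values.values())
        out[name] = {t: values.get(t, fill) for t in all_keys}
    return out


imputed = impute_mean(series, all_keys)
print(imputed["x"][1])  # 2.25
print(imputed["y"][3])  # 5.25
```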

Impute via Transform
^^^^^^^^^^^^^^^^^^^^
Similar to the :ref:`user-guide-bin-transform` and :ref:`user-guide-aggregate-transform`,
the impute transform can also be specified outside the encoding, as a
standalone transform. For example, here are the equivalents of the two charts above:

.. altair-plot::

   chart = alt.Chart(data).transform_impute(
       impute='value',
       key='t',
       value=0,
       groupby=['variable']
   ).mark_line(point=True).encode(
       x='t:Q',
       y='value:Q',
       color='variable:N'
   )
   background + chart

.. altair-plot::

   chart = alt.Chart(data).transform_impute(
       impute='value',
       key='t',
       method='mean',
       groupby=['variable']
   ).mark_line(point=True).encode(
       x='t:Q',
       y='value:Q',
       color='variable:N'
   )
   background + chart

If you would like to use more localized imputed values, you can specify a
``frame`` parameter, similar to that of the :ref:`user-guide-window-transform`,
which controls which values are used for the imputation. For example, here
we impute missing values using the mean of the neighboring points on either
side:

.. altair-plot::

   chart = alt.Chart(data).transform_impute(
       impute='value',
       key='t',
       method='mean',
       frame=[-1, 1],
       groupby=['variable']
   ).mark_line(point=True).encode(
       x='t:Q',
       y='value:Q',
       color='variable:N'
   )
   background + chart
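
The effect of ``frame=[-1, 1]`` on this dataset can be approximated by hand.
The sketch below assumes the frame is a positional window over the sorted key
values, centered on the missing entry, and that only known values inside the
window contribute to the mean. ``impute_framed_mean`` is a hypothetical helper
for illustration and may differ from the underlying implementation in edge
cases, such as adjacent gaps:

```python
from statistics import mean

# Known values per series, keyed by t (same data as above)
x = {0: 2.0, 2: 3.0, 3: 1.0, 4: 3.0}
y = {0: 5.0, 1: 7.0, 2: 5.0, 4: 4.0}
all_keys = sorted(set(x) | set(y))


def impute_framed_mean(values, all_keys, frame=(-1, 1)):
    """Fill gaps with the mean of known values inside a positional window."""
    lo, hi = frame
    out = {}
    for i, t in enumerate(all_keys):
        if t in values:
            out[t] = values[t]
            continue
        # Window of key positions around the gap, clipped at the start
        window = all_keys[max(0, i + lo): i + hi + 1]
        known = [values[k] for k in window if k in values]
        out[t] = mean(known) if known else None
    return out


print(impute_framed_mean(x, all_keys)[1])  # 2.5
print(impute_framed_mean(y, all_keys)[3])  # 4.5
```

Compared with the whole-series mean, the framed mean follows the local trend
of each line more closely.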

Transform Options
^^^^^^^^^^^^^^^^^
The :meth:`~Chart.transform_impute` method is built on the :class:`~ImputeTransform`
class, which has the following options:

.. altair-object-table:: altair.ImputeTransform