File: lookup.rst

package info (click to toggle)
python-altair 5.0.1-2
  • links: PTS, VCS
  • area: main
  • in suites: forky, sid, trixie
  • size: 6,952 kB
  • sloc: python: 25,649; sh: 14; makefile: 5
file content (130 lines) | stat: -rw-r--r-- 4,464 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
.. currentmodule:: altair

.. _user-guide-lookup-transform:

Lookup
~~~~~~
The Lookup transform extends a primary data source by looking up values from
another data source; it is similar to a one-sided join. A lookup can be added
at the top level of a chart using the :meth:`Chart.transform_lookup` method.

By way of example, imagine you have two sources of data that you would like
to combine and plot: one is a list of names of people along with their height
and weight, and the other is some information about which groups they belong
to. This example data is available in ``vega_datasets``:

.. altair-plot::
   :output: none

   from vega_datasets import data
   people = data.lookup_people()
   groups = data.lookup_groups()

We know how to visualize each of these datasets separately; for example:

.. altair-plot::

    import altair as alt

    top = alt.Chart(people).mark_square(size=200).encode(
        x=alt.X('age:Q').scale(zero=False),
        y=alt.Y('height:Q').scale(zero=False),
        color='name:N',
        tooltip='name:N'
    ).properties(
        width=400, height=200
    )

    bottom = alt.Chart(groups).mark_rect().encode(
        x='person:N',
        y='group:O'
    ).properties(
        width=400, height=100
    )

    alt.vconcat(top, bottom)

If we would like to plot features that reference both datasets (for example, the
average age within each group), we need to combine the two datasets.
This can be done either as a data preprocessing step, using tools available
in Pandas, or as part of the visualization using a :class:`~LookupTransform`
in Altair.

Combining Datasets with pandas.merge
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Pandas provides a wide range of tools for merging and joining datasets; see
`Merge, Join, and Concatenate <https://pandas.pydata.org/pandas-docs/stable/merging.html>`_
for some detailed examples.
For the above data, we can merge the data and create a combined chart as follows:

.. altair-plot::

    import pandas as pd
    merged = pd.merge(groups, people, how='left',
                      left_on='person', right_on='name')

    alt.Chart(merged).mark_bar().encode(
        x='mean(age):Q',
        y='group:O'
    )

We specify a left join, meaning that for each entry of the "person" column in
the groups, we seek the "name" column in people and add the entry to the data.
From this, we can easily create a bar chart representing the mean age in each group.

Combining Datasets with a Lookup Transform
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
For some data sources (e.g. data available at a URL, or data that is streaming),
it is desirable to have a means of joining data without having to download
it for pre-processing in Pandas.
This is where Altair's :meth:`~Chart.transform_lookup` comes in.
To reproduce the above combined plot by combining datasets within the
chart specification itself, we can do the following:

.. altair-plot::

    alt.Chart(groups).mark_bar().encode(
        x='mean(age):Q',
        y='group:O'
    ).transform_lookup(
        lookup='person',
        from_=alt.LookupData(data=people, key='name',
                             fields=['age', 'height'])
    )

Here ``lookup`` names the field in the groups dataset on which we will match,
and the ``from_`` argument specifies a :class:`~LookupData` structure where
we supply the second dataset, the lookup key, and the fields we would like to
extract.

Example: Lookup Transforms for Geographical Visualization
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Lookup transforms are often particularly important for geographic visualization,
where it is common to combine tabular datasets with datasets that specify
geographic boundaries to be visualized; for example, here is a visualization
of unemployment rates per county in the US:

.. altair-plot::

    import altair as alt
    from vega_datasets import data

    counties = alt.topo_feature(data.us_10m.url, 'counties')
    unemp_data = data.unemployment.url

    alt.Chart(counties).mark_geoshape().encode(
        color='rate:Q'
    ).transform_lookup(
        lookup='id',
        from_=alt.LookupData(unemp_data, 'id', ['rate'])
    ).properties(
        projection={'type': 'albersUsa'},
        width=500, height=300
    )

Transform Options
^^^^^^^^^^^^^^^^^
The :meth:`~Chart.transform_lookup` method is built on the :class:`~LookupTransform`
class, which has the following options:

.. altair-object-table:: altair.LookupTransform