.. currentmodule:: altair

.. _user-guide-time:

Times and Dates
===============
Working with dates, times, and timezones is often one of the more challenging
aspects of data analysis. In Altair, the difficulties are compounded by the
fact that users are writing Python code, which outputs JSON-serialized
timestamps, which are interpreted by JavaScript and then rendered by your
browser. At each of these steps, there are things that can go wrong, but
Altair and Vega-Lite do their best to ensure that dates are interpreted and
visualized in a consistent way.

Altair and Pandas Datetimes
---------------------------
Altair is designed to work best with `Pandas timeseries`_. A standard
timezone-agnostic date/time column in a Pandas dataframe will be both
interpreted and displayed as local user time. For example, here is a dataset
containing hourly temperatures measured in Seattle:

.. altair-plot::
    :output: repr

    import altair as alt
    from vega_datasets import data

    temps = data.seattle_temps()
    temps.head()

We can see from the ``dtypes`` attribute that the times are encoded as a standard
64-bit datetime, without any specified timezone:

.. altair-plot::
    :output: repr

    temps.dtypes

We can use Altair to visualize this datetime data; for clarity in this
example, we'll limit ourselves to the first two weeks of data:

.. altair-plot::

    # Keep only the first two weeks of January 2010.
    temps = temps[temps.date < '2010-01-15']

    alt.Chart(temps).mark_line().encode(
        x='date:T',
        y='temp:Q'
    )

Notice that for date/time values we use ``T`` to indicate a temporal
encoding. While this is optional for pandas datetime input, it is good practice
to specify the type explicitly; see :ref:`encoding-data-types` for more discussion.

For date-time inputs like these, it can sometimes be useful to extract particular
time units (e.g. hours of the day, dates of the month, etc.).
In Altair, this can be done with a time unit transform, discussed in detail in
:ref:`user-guide-timeunit-transform`.
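
To make the idea of a time unit concrete, here is a minimal sketch in plain
Python, using only the standard library. It is not how Altair or Vega-Lite
implement time units; the helper names ``hours_minutes`` and ``month_date`` and
the sample timestamps are hypothetical, chosen to mirror the ``hoursminutes``
and ``monthdate`` units by discarding every other component of each timestamp.

```python
from datetime import datetime, timedelta

# Hypothetical sample: 48 hourly timestamps starting at midnight on 2010-01-01,
# standing in for the ``date`` column of the Seattle dataset.
start = datetime(2010, 1, 1, 0, 0)
timestamps = [start + timedelta(hours=h) for h in range(48)]


def hours_minutes(ts):
    """Rough analogue of the ``hoursminutes`` unit: keep only hour and minute."""
    return (ts.hour, ts.minute)


def month_date(ts):
    """Rough analogue of the ``monthdate`` unit: keep only month and day."""
    return (ts.month, ts.day)


# Each distinct (date, time of day) pair corresponds to one cell of a heatmap.
cells = {(month_date(ts), hours_minutes(ts)) for ts in timestamps}

print(len(cells))                                   # 48
print(sorted({month_date(ts) for ts in timestamps}))  # [(1, 1), (1, 2)]
```

Two days of hourly data produce 48 distinct cells: one per hour of the day for
each date.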
For example, we might decide we want a heatmap with hour of the day on the
x-axis, and day of the month on the y-axis:

.. altair-plot::

    alt.Chart(temps).mark_rect().encode(
        alt.X('hoursminutes(date):O').title('hour of day'),
        alt.Y('monthdate(date):O').title('date'),
        alt.Color('temp:Q').title('temperature (F)')
    )

Unless you are using a non-ES6 browser (see :ref:`note-browser-compliance`),
you will notice that the chart created by this code reflects hours starting
at 00:00:00 on January 1st, just as in the data we input.
This is because both the input timestamps and the plot outputs are using
local time.

Specifying Time Zones
---------------------
If you are viewing the above visualizations in a supported browser (see
:ref:`note-browser-compliance`), the times are both serialized and
rendered in local time, so that the ``January 1st 00:00:00`` row renders in
the chart as ``00:00`` on ``January 1st``.

In Altair, simple dates without an explicit timezone are treated as local time,
and in Vega-Lite, unless otherwise specified, times are rendered in the local
time of the browser that does the rendering.

If you would like your dates to instead be time-zone aware, you can set the
timezone explicitly in the input dataframe. Since Seattle is in the
``US/Pacific`` timezone, we can localize the timestamps in Pandas as follows:

.. altair-plot::
    :output: repr

    temps['date_pacific'] = temps['date'].dt.tz_localize('US/Pacific')
    temps.dtypes

Notice that the timezone is now part of the pandas datatype.
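
As a rough illustration of what localizing does, the sketch below uses only the
standard library. It assumes a fixed UTC-8 offset as a stand-in for
``US/Pacific`` in January (the name ``PACIFIC_WINTER`` is hypothetical), and it
shows ISO 8601 strings for illustration only; the exact strings Altair
serializes may differ in detail. The point is that attaching a timezone keeps
the wall-clock time and adds an offset, which pins the value to one instant.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC-8 offset stands in for ``US/Pacific`` in January
# (Pacific Standard Time). A real timezone database also handles daylight saving.
PACIFIC_WINTER = timezone(timedelta(hours=-8), "PST")

naive = datetime(2010, 1, 1, 0, 0)                 # no timezone attached
localized = naive.replace(tzinfo=PACIFIC_WINTER)   # same wall-clock time, now zone-aware

print(naive.isoformat())      # 2010-01-01T00:00:00
print(localized.isoformat())  # 2010-01-01T00:00:00-08:00

# The localized value identifies a single instant, which can be expressed in UTC.
print(localized.astimezone(timezone.utc).isoformat())  # 2010-01-01T08:00:00+00:00
```

The naive string carries no offset, so whoever reads it must decide which
timezone it refers to; the localized string leaves no such choice.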
If we repeat the above chart with this timezone-aware data, the result will
render **according to the timezone of the browser rendering it**:

.. altair-plot::

    alt.Chart(temps).mark_rect().encode(
        alt.X('hoursminutes(date_pacific):O').title('hour of day'),
        alt.Y('monthdate(date_pacific):O').title('date'),
        alt.Color('temp:Q').title('temperature (F)')
    )

If you are viewing this chart on a computer whose time is set to the west coast
of the US, it should appear identical to the first version. If you are rendering
the chart in any other timezone, it will render using a timezone correction
computed from the location set in your system.

.. _explicit-utc-time:

Using UTC Time
--------------
This user-local rendering can sometimes be confusing, because it leads to the
same output being visualized differently by different users.

If you want timezone-aware data to appear the same to every user regardless of
location, the best approach is to adopt a standard timezone in which to render
the data. One commonly-used standard is `Coordinated Universal Time (UTC)`_.

In Altair, any of the ``timeUnit`` bins can be prefixed with ``utc`` in
order to extract UTC time units.

Here is the above chart visualized in UTC time, which will render the same way
regardless of the system location:

.. altair-plot::

    alt.Chart(temps).mark_rect().encode(
        alt.X('utchoursminutes(date_pacific):O').title('UTC hour of day'),
        alt.Y('utcmonthdate(date_pacific):O').title('UTC date'),
        alt.Color('temp:Q').title('temperature (F)')
    )
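
To see why the ``utc``-prefixed units give every viewer the same result, the
following standard-library sketch compares the hour of day seen by two
hypothetical viewers. The fixed offsets ``PACIFIC`` and ``EASTERN`` and the
helpers ``local_hour`` and ``utc_hour`` are assumptions made for illustration;
they only approximate what the ``hours`` and ``utchours`` units compute in the
browser.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed winter offsets stand in for real timezone rules.
PACIFIC = timezone(timedelta(hours=-8))
EASTERN = timezone(timedelta(hours=-5))

# Midnight on January 1st in Seattle, as a timezone-aware instant.
instant = datetime(2010, 1, 1, 0, 0, tzinfo=PACIFIC)


def local_hour(ts, viewer_tz):
    """Hour of day as seen by a viewer in ``viewer_tz`` (roughly ``hours``)."""
    return ts.astimezone(viewer_tz).hour


def utc_hour(ts):
    """Hour of day in UTC, independent of the viewer (roughly ``utchours``)."""
    return ts.astimezone(timezone.utc).hour


for name, tz in [("Pacific", PACIFIC), ("Eastern", EASTERN)]:
    print(name, local_hour(instant, tz), utc_hour(instant))
# Pacific 0 8
# Eastern 3 8
```

The local hour depends on who is looking, while the UTC hour is 8 for both
viewers.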
To make your charts as portable as possible (even in non-ES6 browsers which parse
timezone-agnostic times as UTC), you can explicitly work
in UTC time, both on the Pandas side and on the Vega-Lite side:

.. altair-plot::

    temps['date_utc'] = temps['date'].dt.tz_localize('UTC')

    alt.Chart(temps).mark_rect().encode(
        alt.X('utchoursminutes(date_utc):O').title('hour of day'),
        alt.Y('utcmonthdate(date_utc):O').title('date'),
        alt.Color('temp:Q').title('temperature (F)')
    )

This is somewhat less convenient than the default behavior for timezone-agnostic
dates, in which both Pandas and Vega-Lite assume times are local
(except in non-ES6 browsers; see :ref:`note-browser-compliance`),
but it gets around browser incompatibilities by explicitly working in UTC, which
gives similar results even in older browsers.

.. _note-browser-compliance:

Note on Browser Compliance
--------------------------

.. note:: Warning about non-ES6 Browsers

   The discussion above applies to modern browsers which support `ECMAScript 6`_,
   in which time strings like ``"2018-01-01T12:00:00"`` without a trailing ``"Z"``
   are treated as local time rather than `Coordinated Universal Time (UTC)`_.
   For example, recent versions of Chrome and Firefox are ES6-compliant,
   while Safari 11 is not.
   If you are using a non-ES6 browser, this means that times displayed in Altair
   charts may be rendered with a timezone offset, unless you explicitly use
   UTC time (see :ref:`explicit-utc-time`).
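
The size of that offset can be illustrated outside the browser. The sketch
below is a Python analogy, not the JavaScript parsing code itself: it reads the
same offset-free string under both rules, assuming a hypothetical viewer at
UTC-8 (the name ``viewer`` is made up for this example). An explicit
``+00:00`` offset is used in place of the trailing ``"Z"`` because older Python
versions do not accept ``"Z"`` in ``datetime.fromisoformat``.

```python
from datetime import datetime, timedelta, timezone

# A time string without an offset parses to a naive datetime ...
naive = datetime.fromisoformat("2018-01-01T12:00:00")
# ... while an explicit UTC offset yields a timezone-aware one.
explicit_utc = datetime.fromisoformat("2018-01-01T12:00:00+00:00")

# Hypothetical viewer whose system clock is at UTC-8.
viewer = timezone(timedelta(hours=-8))

as_local = naive.replace(tzinfo=viewer)      # ES6-style reading: local time
as_utc = naive.replace(tzinfo=timezone.utc)  # non-ES6 reading: UTC

print(naive.tzinfo)            # None
print(as_utc == explicit_utc)  # True
print(as_local - as_utc)       # 8:00:00
```

Under the non-ES6 reading the offset-free string is indistinguishable from the
explicit UTC one, and for this viewer the two readings differ by eight hours.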
The following chart will help you determine if your browser parses dates in the
way that Altair expects:

.. altair-plot::
    :links: none

    import altair as alt
    import pandas as pd

    df = pd.DataFrame({'local': ['2018-01-01T00:00:00'],
                       'utc': ['2018-01-01T00:00:00Z']})

    alt.Chart(df).transform_calculate(
        compliant="hours(datum.local) != hours(datum.utc) ? true : false",
    ).mark_text(size=20, baseline='middle').encode(
        text=alt.condition('datum.compliant', alt.value('OK'), alt.value('not OK')),
        color=alt.condition('datum.compliant', alt.value('green'), alt.value('red'))
    ).properties(width=80, height=50)

If the above output contains a red "not OK":

.. altair-plot::
    :hide-code:
    :links: none

    alt.Chart(df).mark_text(size=10, baseline='middle').encode(
        alt.TextValue('not OK'),
        alt.ColorValue('red')
    ).properties(width=40, height=25)

it means that your browser's date parsing is not ES6-compliant, or that your
system timezone coincides with UTC+0 (GMT). In the latter case both strings
yield the same hour, so this check cannot tell the two parsing behaviors apart.

If it contains a green "OK":

.. altair-plot::
    :hide-code:
    :links: none

    alt.Chart(df).mark_text(size=10, baseline='middle').encode(
        alt.TextValue('OK'),
        alt.ColorValue('green')
    ).properties(width=40, height=25)

then it means that your browser parses dates as Altair expects: time strings
without a trailing ``"Z"`` are treated as local time.

.. _Coordinated Universal Time (UTC): https://en.wikipedia.org/wiki/Coordinated_Universal_Time
.. _Pandas timeseries: https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html
.. _ECMAScript 6: http://www.ecma-international.org/ecma-262/6.0/