File: weather-climate.rst

package info (click to toggle)
python-xarray 2025.08.0-1
  • links: PTS, VCS
  • area: main
  • in suites: sid
  • size: 11,796 kB
  • sloc: python: 115,416; makefile: 258; sh: 47
file content (281 lines) | stat: -rw-r--r-- 10,037 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
.. currentmodule:: xarray

.. _weather-climate:

Weather and climate data
========================

.. jupyter-execute::
    :hide-code:

    import xarray as xr
    import numpy as np

Xarray can leverage metadata that follows the `Climate and Forecast (CF) conventions`_ if present. Examples include :ref:`automatic labelling of plots<plotting>` with descriptive names and units if proper metadata is present and support for non-standard calendars used in climate science through the ``cftime`` module (explained in the :ref:`CFTimeIndex` section). There are also a number of :ref:`geosciences-focused projects that build on xarray<ecosystem>`.

.. _Climate and Forecast (CF) conventions: https://cfconventions.org

.. _cf_variables:

Related Variables
-----------------

Several CF variable attributes contain lists of other variables
associated with the variable with the attribute.  A few of these are
now parsed by xarray, with the attribute value popped to encoding on
read and the variables in that value interpreted as non-dimension
coordinates:

- ``coordinates``
- ``bounds``
- ``grid_mapping``
- ``climatology``
- ``geometry``
- ``node_coordinates``
- ``node_count``
- ``part_node_count``
- ``interior_ring``
- ``cell_measures``
- ``formula_terms``

This decoding is controlled by the ``decode_coords`` kwarg to
:py:func:`open_dataset` and :py:func:`open_mfdataset`.

The CF attribute ``ancillary_variables`` was not included in the list
due to the variables listed there being associated primarily with the
variable with the attribute, rather than with the dimensions.

.. _metpy_accessor:

CF-compliant coordinate variables
---------------------------------

`MetPy`_ adds a ``metpy`` accessor that allows accessing coordinates with appropriate CF metadata using generic names ``x``, ``y``, ``vertical`` and ``time``. There is also a ``cartopy_crs`` attribute that provides projection information, parsed from the appropriate CF metadata, as a `Cartopy`_ projection object. See the `metpy documentation`_ for more information.

.. _`MetPy`: https://unidata.github.io/MetPy/dev/index.html
.. _`metpy documentation`:	https://unidata.github.io/MetPy/dev/tutorials/xarray_tutorial.html#coordinates
.. _`Cartopy`: https://scitools.org.uk/cartopy/docs/latest/reference/crs.html

.. _CFTimeIndex:

Non-standard calendars and dates outside the precision range
------------------------------------------------------------

Through the standalone ``cftime`` library and a custom subclass of
:py:class:`pandas.Index`, xarray supports a subset of the indexing
functionality enabled through the standard :py:class:`pandas.DatetimeIndex` for
dates from non-standard calendars commonly used in climate science or dates
using a standard calendar, but outside the `precision range`_ and dates prior to `1582-10-15`_.

.. note::

   As of xarray version 0.11, by default, :py:class:`cftime.datetime` objects
   will be used to represent times (either in indexes, as a
   :py:class:`~xarray.CFTimeIndex`, or in data arrays with dtype object) if
   any of the following are true:

   - The dates are from a non-standard calendar
   - Any dates are outside the nanosecond-precision range (prior xarray version 2025.01.2)
   - Any dates are outside the time span limited by the resolution (from xarray version 2025.01.2)

   Otherwise pandas-compatible dates from a standard calendar will be
   represented with the ``np.datetime64[unit]`` data type (where unit can be one of ``"s"``, ``"ms"``, ``"us"``, ``"ns"``), enabling the use of a :py:class:`pandas.DatetimeIndex` or arrays with dtype ``np.datetime64[unit]`` and their full set of associated features.

   As of pandas version 2.0.0, pandas supports non-nanosecond precision datetime
   values. From xarray version 2025.01.2 on, non-nanosecond precision datetime values are also supported in xarray (this can be parameterized via :py:class:`~xarray.coders.CFDatetimeCoder` and ``decode_times`` kwarg). See also :ref:`internals.timecoding`.

For example, you can create a DataArray indexed by a time
coordinate with dates from a no-leap calendar and a
:py:class:`~xarray.CFTimeIndex` will automatically be used:

.. jupyter-execute::

    from itertools import product
    from cftime import DatetimeNoLeap

    dates = [
        DatetimeNoLeap(year, month, 1)
        for year, month in product(range(1, 3), range(1, 13))
    ]
    da = xr.DataArray(np.arange(24), coords=[dates], dims=["time"], name="foo")

Xarray also includes a :py:func:`~xarray.date_range` function, which enables
creating a :py:class:`~xarray.CFTimeIndex` with regularly-spaced dates.  For
instance, we can create the same dates and DataArray we created above using
(note that ``use_cftime=True`` is not mandatory to return a
:py:class:`~xarray.CFTimeIndex` for non-standard calendars, but can be nice
to use to be explicit):

.. jupyter-execute::

    dates = xr.date_range(
        start="0001", periods=24, freq="MS", calendar="noleap", use_cftime=True
    )
    da = xr.DataArray(np.arange(24), coords=[dates], dims=["time"], name="foo")

Mirroring pandas' method with the same name, :py:meth:`~xarray.infer_freq` allows one to
infer the sampling frequency of a :py:class:`~xarray.CFTimeIndex` or a 1-D
:py:class:`~xarray.DataArray` containing cftime objects. It also works transparently with
``np.datetime64`` and ``np.timedelta64`` data (with "s", "ms", "us" or "ns" resolution).

.. jupyter-execute::

    xr.infer_freq(dates)

With :py:meth:`~xarray.CFTimeIndex.strftime` we can also easily generate formatted strings from
the datetime values of a :py:class:`~xarray.CFTimeIndex` directly or through the
``dt`` accessor for a :py:class:`~xarray.DataArray`
using the same formatting as the standard `datetime.strftime`_ convention .

.. _datetime.strftime: https://docs.python.org/3/library/datetime.html#strftime-strptime-behavior

.. jupyter-execute::

    dates.strftime("%c")

.. jupyter-execute::

    da["time"].dt.strftime("%Y%m%d")

Conversion between non-standard calendar and to/from pandas DatetimeIndexes is
facilitated with the :py:meth:`xarray.Dataset.convert_calendar` method (also available as
:py:meth:`xarray.DataArray.convert_calendar`). Here, like elsewhere in xarray, the ``use_cftime``
argument controls which datetime backend is used in the output. The default (``None``) is to
use ``pandas`` when possible, i.e. when the calendar is ``standard``/``gregorian`` and dates starting with `1582-10-15`_. There is no such restriction when converting to a ``proleptic_gregorian`` calendar.

.. _1582-10-15: https://en.wikipedia.org/wiki/Gregorian_calendar

.. jupyter-execute::

    dates = xr.date_range(
        start="2001", periods=24, freq="MS", calendar="noleap", use_cftime=True
    )
    da_nl = xr.DataArray(np.arange(24), coords=[dates], dims=["time"], name="foo")
    da_std = da.convert_calendar("standard", use_cftime=True)

The data is unchanged, only the timestamps are modified. Further options are implemented
for the special ``"360_day"`` calendar and for handling missing dates. There is also
:py:meth:`xarray.Dataset.interp_calendar` (and :py:meth:`xarray.DataArray.interp_calendar`)
for interpolating data between calendars.

For data indexed by a :py:class:`~xarray.CFTimeIndex` xarray currently supports:

- `Partial datetime string indexing`_:

.. jupyter-execute::

    da.sel(time="0001")

.. jupyter-execute::

    da.sel(time=slice("0001-05", "0002-02"))

.. note::


   For specifying full or partial datetime strings in cftime
   indexing, xarray supports two versions of the `ISO 8601 standard`_, the
   basic pattern (YYYYMMDDhhmmss) or the extended pattern
   (YYYY-MM-DDThh:mm:ss), as well as the default cftime string format
   (YYYY-MM-DD hh:mm:ss).  This is somewhat more restrictive than pandas;
   in other words, some datetime strings that would be valid for a
   :py:class:`pandas.DatetimeIndex` are not valid for an
   :py:class:`~xarray.CFTimeIndex`.

- Access of basic datetime components via the ``dt`` accessor (in this case
  just "year", "month", "day", "hour", "minute", "second", "microsecond",
  "season", "dayofyear", "dayofweek", and "days_in_month") with the addition
  of "calendar", absent from pandas:

.. jupyter-execute::

    da.time.dt.year

.. jupyter-execute::

    da.time.dt.month

.. jupyter-execute::

    da.time.dt.season

.. jupyter-execute::

    da.time.dt.dayofyear

.. jupyter-execute::

    da.time.dt.dayofweek

.. jupyter-execute::

    da.time.dt.days_in_month

.. jupyter-execute::

    da.time.dt.calendar

- Rounding of datetimes to fixed frequencies via the ``dt`` accessor:

.. jupyter-execute::

    da.time.dt.ceil("3D").head()

.. jupyter-execute::

    da.time.dt.floor("5D").head()

.. jupyter-execute::

    da.time.dt.round("2D").head()

- Group-by operations based on datetime accessor attributes (e.g. by month of
  the year):

.. jupyter-execute::

    da.groupby("time.month").sum()

- Interpolation using :py:class:`cftime.datetime` objects:

.. jupyter-execute::

    da.interp(time=[DatetimeNoLeap(1, 1, 15), DatetimeNoLeap(1, 2, 15)])

- Interpolation using datetime strings:

.. jupyter-execute::

    da.interp(time=["0001-01-15", "0001-02-15"])

- Differentiation:

.. jupyter-execute::

    da.differentiate("time")

- Serialization:

.. jupyter-execute::

    da.to_netcdf("example-no-leap.nc")
    reopened = xr.open_dataset("example-no-leap.nc")
    reopened

.. jupyter-execute::
    :hide-code:

    import os

    reopened.close()
    os.remove("example-no-leap.nc")

- And resampling along the time dimension for data indexed by a :py:class:`~xarray.CFTimeIndex`:

.. jupyter-execute::

    da.resample(time="81min", closed="right", label="right", offset="3min").mean()

.. _precision range: https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#timestamp-limitations
.. _ISO 8601 standard: https://en.wikipedia.org/wiki/ISO_8601
.. _partial datetime string indexing: https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#partial-string-indexing