.. currentmodule:: pandas

.. ipython:: python
   :suppress:

   import pandas as pd
   import numpy as np

.. _boolean:

**************************
Nullable Boolean data type
**************************

.. note::

   BooleanArray is currently experimental. Its API or implementation may
   change without warning.

.. versionadded:: 1.0.0


.. _boolean.indexing:

Indexing with NA values
-----------------------

pandas allows indexing with a boolean array that contains ``NA`` values; the ``NA`` entries are treated as ``False``.

.. versionchanged:: 1.0.2

.. ipython:: python
   :okexcept:

   s = pd.Series([1, 2, 3])
   mask = pd.array([True, False, pd.NA], dtype="boolean")
   s[mask]

If you would prefer to keep the elements where the mask is ``NA``, fill the mask manually with ``fillna(True)``.

.. ipython:: python

   s[mask.fillna(True)]
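
Masks containing ``NA`` often arise from comparisons on nullable data rather than
being written by hand. The following is a minimal sketch of that situation; the
names ``ages`` and ``over_30`` are illustrative and not part of the pandas API.

.. ipython:: python

   import pandas as pd

   ages = pd.Series([25, pd.NA, 40], dtype="Int64")
   # Comparing a nullable integer Series yields a "boolean" mask with NA
   over_30 = ages > 30
   over_30
   # The NA position is treated as False and therefore dropped
   ages[over_30]
   # Filling the mask with True keeps the element at the NA position
   ages[over_30.fillna(True)]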

.. _boolean.kleene:

Kleene logical operations
-------------------------

:class:`arrays.BooleanArray` implements `Kleene Logic`_ (sometimes called three-valued logic) for
logical operations like ``&`` (and), ``|`` (or) and ``^`` (exclusive-or).

This table demonstrates the results for every combination. These operations are symmetrical,
so flipping the left- and right-hand side makes no difference in the result.

================= =========
Expression        Result
================= =========
``True & True``   ``True``
``True & False``  ``False``
``True & NA``     ``NA``
``False & False`` ``False``
``False & NA``    ``False``
``NA & NA``       ``NA``
``True | True``   ``True``
``True | False``  ``True``
``True | NA``     ``True``
``False | False`` ``False``
``False | NA``    ``NA``
``NA | NA``       ``NA``
``True ^ True``   ``False``
``True ^ False``  ``True``
``True ^ NA``     ``NA``
``False ^ False`` ``False``
``False ^ NA``    ``NA``
``NA ^ NA``       ``NA``
================= =========

When an ``NA`` is present in an operation, the output value is ``NA`` only if
the result cannot be determined solely based on the other input. For example,
``True | NA`` is ``True``, because both ``True | True`` and ``True | False``
are ``True``. In that case, we don't actually need to consider the value
of the ``NA``.

On the other hand, ``True & NA`` is ``NA``. The result depends on whether
the ``NA`` really is ``True`` or ``False``, since ``True & True`` is ``True``,
but ``True & False`` is ``False``, so we can't determine the output.
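
The table can be reproduced directly, which is a convenient way to check the
reasoning above. The following is a minimal sketch; the names ``left``, ``right``
and ``table`` are illustrative only, and ``None`` is used for the missing entries,
which ``pd.array`` stores as ``NA`` for the ``"boolean"`` dtype.

.. ipython:: python

   import pandas as pd

   # Every pairing of True, False and NA
   left = pd.array(
       [True, True, True, False, False, False, None, None, None], dtype="boolean"
   )
   right = pd.array(
       [True, False, None, True, False, None, True, False, None], dtype="boolean"
   )
   # Element-wise Kleene results for each operator
   table = pd.DataFrame(
       {
           "left": left,
           "right": right,
           "and": left & right,
           "or": left | right,
           "xor": left ^ right,
       }
   )
   table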


This differs from how ``np.nan`` behaves in logical operations: pandas treats
``np.nan`` as *always false in the output*.

In ``or``

.. ipython:: python

   pd.Series([True, False, np.nan], dtype="object") | True
   pd.Series([True, False, np.nan], dtype="boolean") | True

In ``and``

.. ipython:: python

   pd.Series([True, False, np.nan], dtype="object") & True
   pd.Series([True, False, np.nan], dtype="boolean") & True
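
Existing object-dtype data can be moved to Kleene semantics by converting it to
the ``"boolean"`` dtype first. The following is a minimal sketch, assuming the
object column holds only ``True``, ``False`` and ``np.nan``; the names ``legacy``
and ``nullable`` are illustrative.

.. ipython:: python

   import numpy as np
   import pandas as pd

   legacy = pd.Series([True, False, np.nan], dtype="object")
   # np.nan becomes NA during the conversion
   nullable = legacy.astype("boolean")
   nullable
   # True | NA is True, so no missing value remains
   nullable | True
   # True & NA cannot be determined, so the last element stays NA
   nullable & True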

.. _Kleene Logic: https://en.wikipedia.org/wiki/Three-valued_logic#Kleene_and_Priest_logics