1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213
|
.. _whatsnew_0171:
Version 0.17.1 (November 21, 2015)
----------------------------------
{{ header }}
.. note::
We are proud to announce that *pandas* has become a sponsored project of the (`NumFOCUS organization`_). This will help ensure the success of development of *pandas* as a world-class open-source project.
.. _numfocus organization: http://www.numfocus.org/blog/numfocus-announces-new-fiscally-sponsored-project-pandas
This is a minor bug-fix release from 0.17.0 and includes a large number of
bug fixes along several new features, enhancements, and performance improvements.
We recommend that all users upgrade to this version.
Highlights include:
- Support for Conditional HTML Formatting, see :ref:`here <whatsnew_0171.style>`
- Releasing the GIL on the csv reader & other ops, see :ref:`here <whatsnew_0171.performance>`
- Fixed regression in ``DataFrame.drop_duplicates`` from 0.16.2, causing incorrect results on integer values (:issue:`11376`)
.. contents:: What's new in v0.17.1
:local:
:backlinks: none
New features
~~~~~~~~~~~~
.. _whatsnew_0171.style:
Conditional HTML formatting
^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. warning::
This is a new feature and is under active development.
We'll be adding features an possibly making breaking changes in future
releases. Feedback is welcome in :issue:`11610`
We've added *experimental* support for conditional HTML formatting:
the visual styling of a DataFrame based on the data.
The styling is accomplished with HTML and CSS.
Accesses the styler class with the :attr:`pandas.DataFrame.style`, attribute,
an instance of :class:`.Styler` with your data attached.
Here's a quick example:
.. ipython:: python
np.random.seed(123)
df = pd.DataFrame(np.random.randn(10, 5), columns=list("abcde"))
html = df.style.background_gradient(cmap="viridis", low=0.5)
We can render the HTML to get the following table.
.. raw:: html
:file: whatsnew_0171_html_table.html
:class:`.Styler` interacts nicely with the Jupyter Notebook.
See the :ref:`documentation </user_guide/style.ipynb>` for more.
.. _whatsnew_0171.enhancements:
Enhancements
~~~~~~~~~~~~
- ``DatetimeIndex`` now supports conversion to strings with ``astype(str)`` (:issue:`10442`)
- Support for ``compression`` (gzip/bz2) in :meth:`pandas.DataFrame.to_csv` (:issue:`7615`)
- ``pd.read_*`` functions can now also accept :class:`python:pathlib.Path`, or :class:`py:py._path.local.LocalPath`
objects for the ``filepath_or_buffer`` argument. (:issue:`11033`)
- The ``DataFrame`` and ``Series`` functions ``.to_csv()``, ``.to_html()`` and ``.to_latex()`` can now handle paths beginning with tildes (e.g. ``~/Documents/``) (:issue:`11438`)
- ``DataFrame`` now uses the fields of a ``namedtuple`` as columns, if columns are not supplied (:issue:`11181`)
- ``DataFrame.itertuples()`` now returns ``namedtuple`` objects, when possible. (:issue:`11269`, :issue:`11625`)
- Added ``axvlines_kwds`` to parallel coordinates plot (:issue:`10709`)
- Option to ``.info()`` and ``.memory_usage()`` to provide for deep introspection of memory consumption. Note that this can be expensive to compute and therefore is an optional parameter. (:issue:`11595`)
.. ipython:: python
df = pd.DataFrame({"A": ["foo"] * 1000}) # noqa: F821
df["B"] = df["A"].astype("category")
# shows the '+' as we have object dtypes
df.info()
# we have an accurate memory assessment (but can be expensive to compute this)
df.info(memory_usage="deep")
- ``Index`` now has a ``fillna`` method (:issue:`10089`)
.. ipython:: python
pd.Index([1, np.nan, 3]).fillna(2)
- Series of type ``category`` now make ``.str.<...>`` and ``.dt.<...>`` accessor methods / properties available, if the categories are of that type. (:issue:`10661`)
.. ipython:: python
s = pd.Series(list("aabb")).astype("category")
s
s.str.contains("a")
date = pd.Series(pd.date_range("1/1/2015", periods=5)).astype("category")
date
date.dt.day
- ``pivot_table`` now has a ``margins_name`` argument so you can use something other than the default of 'All' (:issue:`3335`)
- Implement export of ``datetime64[ns, tz]`` dtypes with a fixed HDF5 store (:issue:`11411`)
- Pretty printing sets (e.g. in DataFrame cells) now uses set literal syntax (``{x, y}``) instead of
Legacy Python syntax (``set([x, y])``) (:issue:`11215`)
- Improve the error message in :func:`pandas.io.gbq.to_gbq` when a streaming insert fails (:issue:`11285`)
and when the DataFrame does not match the schema of the destination table (:issue:`11359`)
.. _whatsnew_0171.api:
API changes
~~~~~~~~~~~
- raise ``NotImplementedError`` in ``Index.shift`` for non-supported index types (:issue:`8038`)
- ``min`` and ``max`` reductions on ``datetime64`` and ``timedelta64`` dtyped series now
result in ``NaT`` and not ``nan`` (:issue:`11245`).
- Indexing with a null key will raise a ``TypeError``, instead of a ``ValueError`` (:issue:`11356`)
- ``Series.ptp`` will now ignore missing values by default (:issue:`11163`)
.. _whatsnew_0171.deprecations:
Deprecations
^^^^^^^^^^^^
- The ``pandas.io.ga`` module which implements ``google-analytics`` support is deprecated and will be removed in a future version (:issue:`11308`)
- Deprecate the ``engine`` keyword in ``.to_csv()``, which will be removed in a future version (:issue:`11274`)
.. _whatsnew_0171.performance:
Performance improvements
~~~~~~~~~~~~~~~~~~~~~~~~
- Checking monotonic-ness before sorting on an index (:issue:`11080`)
- ``Series.dropna`` performance improvement when its dtype can't contain ``NaN`` (:issue:`11159`)
- Release the GIL on most datetime field operations (e.g. ``DatetimeIndex.year``, ``Series.dt.year``), normalization, and conversion to and from ``Period``, ``DatetimeIndex.to_period`` and ``PeriodIndex.to_timestamp`` (:issue:`11263`)
- Release the GIL on some rolling algos: ``rolling_median``, ``rolling_mean``, ``rolling_max``, ``rolling_min``, ``rolling_var``, ``rolling_kurt``, ``rolling_skew`` (:issue:`11450`)
- Release the GIL when reading and parsing text files in ``read_csv``, ``read_table`` (:issue:`11272`)
- Improved performance of ``rolling_median`` (:issue:`11450`)
- Improved performance of ``to_excel`` (:issue:`11352`)
- Performance bug in repr of ``Categorical`` categories, which was rendering the strings before chopping them for display (:issue:`11305`)
- Performance improvement in ``Categorical.remove_unused_categories``, (:issue:`11643`).
- Improved performance of ``Series`` constructor with no data and ``DatetimeIndex`` (:issue:`11433`)
- Improved performance of ``shift``, ``cumprod``, and ``cumsum`` with groupby (:issue:`4095`)
.. _whatsnew_0171.bug_fixes:
Bug fixes
~~~~~~~~~
- ``SparseArray.__iter__()`` now does not cause ``PendingDeprecationWarning`` in Python 3.5 (:issue:`11622`)
- Regression from 0.16.2 for output formatting of long floats/nan, restored in (:issue:`11302`)
- ``Series.sort_index()`` now correctly handles the ``inplace`` option (:issue:`11402`)
- Incorrectly distributed .c file in the build on ``PyPi`` when reading a csv of floats and passing ``na_values=<a scalar>`` would show an exception (:issue:`11374`)
- Bug in ``.to_latex()`` output broken when the index has a name (:issue:`10660`)
- Bug in ``HDFStore.append`` with strings whose encoded length exceeded the max unencoded length (:issue:`11234`)
- Bug in merging ``datetime64[ns, tz]`` dtypes (:issue:`11405`)
- Bug in ``HDFStore.select`` when comparing with a numpy scalar in a where clause (:issue:`11283`)
- Bug in using ``DataFrame.ix`` with a MultiIndex indexer (:issue:`11372`)
- Bug in ``date_range`` with ambiguous endpoints (:issue:`11626`)
- Prevent adding new attributes to the accessors ``.str``, ``.dt`` and ``.cat``. Retrieving such
a value was not possible, so error out on setting it. (:issue:`10673`)
- Bug in tz-conversions with an ambiguous time and ``.dt`` accessors (:issue:`11295`)
- Bug in output formatting when using an index of ambiguous times (:issue:`11619`)
- Bug in comparisons of Series vs list-likes (:issue:`11339`)
- Bug in ``DataFrame.replace`` with a ``datetime64[ns, tz]`` and a non-compat to_replace (:issue:`11326`, :issue:`11153`)
- Bug in ``isnull`` where ``numpy.datetime64('NaT')`` in a ``numpy.array`` was not determined to be null(:issue:`11206`)
- Bug in list-like indexing with a mixed-integer Index (:issue:`11320`)
- Bug in ``pivot_table`` with ``margins=True`` when indexes are of ``Categorical`` dtype (:issue:`10993`)
- Bug in ``DataFrame.plot`` cannot use hex strings colors (:issue:`10299`)
- Regression in ``DataFrame.drop_duplicates`` from 0.16.2, causing incorrect results on integer values (:issue:`11376`)
- Bug in ``pd.eval`` where unary ops in a list error (:issue:`11235`)
- Bug in ``squeeze()`` with zero length arrays (:issue:`11230`, :issue:`8999`)
- Bug in ``describe()`` dropping column names for hierarchical indexes (:issue:`11517`)
- Bug in ``DataFrame.pct_change()`` not propagating ``axis`` keyword on ``.fillna`` method (:issue:`11150`)
- Bug in ``.to_csv()`` when a mix of integer and string column names are passed as the ``columns`` parameter (:issue:`11637`)
- Bug in indexing with a ``range``, (:issue:`11652`)
- Bug in inference of numpy scalars and preserving dtype when setting columns (:issue:`11638`)
- Bug in ``to_sql`` using unicode column names giving UnicodeEncodeError with (:issue:`11431`).
- Fix regression in setting of ``xticks`` in ``plot`` (:issue:`11529`).
- Bug in ``holiday.dates`` where observance rules could not be applied to holiday and doc enhancement (:issue:`11477`, :issue:`11533`)
- Fix plotting issues when having plain ``Axes`` instances instead of ``SubplotAxes`` (:issue:`11520`, :issue:`11556`).
- Bug in ``DataFrame.to_latex()`` produces an extra rule when ``header=False`` (:issue:`7124`)
- Bug in ``df.groupby(...).apply(func)`` when a func returns a ``Series`` containing a new datetimelike column (:issue:`11324`)
- Bug in ``pandas.json`` when file to load is big (:issue:`11344`)
- Bugs in ``to_excel`` with duplicate columns (:issue:`11007`, :issue:`10982`, :issue:`10970`)
- Fixed a bug that prevented the construction of an empty series of dtype ``datetime64[ns, tz]`` (:issue:`11245`).
- Bug in ``read_excel`` with MultiIndex containing integers (:issue:`11317`)
- Bug in ``to_excel`` with openpyxl 2.2+ and merging (:issue:`11408`)
- Bug in ``DataFrame.to_dict()`` produces a ``np.datetime64`` object instead of ``Timestamp`` when only datetime is present in data (:issue:`11327`)
- Bug in ``DataFrame.corr()`` raises exception when computes Kendall correlation for DataFrames with boolean and not boolean columns (:issue:`11560`)
- Bug in the link-time error caused by C ``inline`` functions on FreeBSD 10+ (with ``clang``) (:issue:`10510`)
- Bug in ``DataFrame.to_csv`` in passing through arguments for formatting ``MultiIndexes``, including ``date_format`` (:issue:`7791`)
- Bug in ``DataFrame.join()`` with ``how='right'`` producing a ``TypeError`` (:issue:`11519`)
- Bug in ``Series.quantile`` with empty list results has ``Index`` with ``object`` dtype (:issue:`11588`)
- Bug in ``pd.merge`` results in empty ``Int64Index`` rather than ``Index(dtype=object)`` when the merge result is empty (:issue:`11588`)
- Bug in ``Categorical.remove_unused_categories`` when having ``NaN`` values (:issue:`11599`)
- Bug in ``DataFrame.to_sparse()`` loses column names for MultiIndexes (:issue:`11600`)
- Bug in ``DataFrame.round()`` with non-unique column index producing a Fatal Python error (:issue:`11611`)
- Bug in ``DataFrame.round()`` with ``decimals`` being a non-unique indexed Series producing extra columns (:issue:`11618`)
.. _whatsnew_0.17.1.contributors:
Contributors
~~~~~~~~~~~~
.. contributors:: v0.17.0..v0.17.1
|