.. _whatsnew_0130:

Version 0.13.0 (January 3, 2014)
--------------------------------

{{ header }}



This is a major release from 0.12.0 and includes a number of API changes, several new features and
enhancements along with a large number of bug fixes.

Highlights include:

- support for a new index type ``Float64Index``, and other Indexing enhancements
- ``HDFStore`` has a new string based syntax for query specification
- support for new methods of interpolation
- updated ``timedelta`` operations
- a new string manipulation method ``extract``
- Nanosecond support for Offsets
- ``isin`` for DataFrames

Several experimental features are added, including:

- new ``eval/query`` methods for expression evaluation
- support for ``msgpack`` serialization
- an i/o interface to Google's ``BigQuery``

There are several new or updated docs sections including:

- :ref:`Comparison with SQL<compare_with_sql>`, which should be useful for those familiar with SQL but still learning pandas.
- :ref:`Comparison with R<compare_with_r>`, idiom translations from R to pandas.
- :ref:`Enhancing Performance<enhancingperf>`, ways to enhance pandas performance with ``eval/query``.

.. warning::

   In 0.13.0 ``Series`` has internally been refactored to no longer sub-class ``ndarray``
   but instead subclass ``NDFrame``, similar to the rest of the pandas containers. This should be
   a transparent change with only very limited API implications. See :ref:`Internal Refactoring<whatsnew_0130.refactoring>`.

API changes
~~~~~~~~~~~

- ``read_excel`` now supports an integer in its ``sheetname`` argument giving
  the index of the sheet to read in (:issue:`4301`).
- Text parser now treats anything that reads like inf ("inf", "Inf", "-Inf",
  "iNf", etc.) as infinity. (:issue:`4220`, :issue:`4219`), affecting
  ``read_table``, ``read_csv``, etc.
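
  As a rough illustration of the case-insensitive ``inf`` handling, the sketch
  below parses a small in-memory CSV. The column name ``x`` and the sample
  values are made up for the example.

  .. code-block:: python

     import io

     import numpy as np
     import pandas as pd

     # Hypothetical single-column CSV mixing several spellings of infinity.
     csv_text = "x\ninf\nInf\n-Inf\niNf\n"
     parsed = pd.read_csv(io.StringIO(csv_text))

     # Every spelling should be read as a floating point infinity.
     print(parsed["x"].tolist())
     print(np.isinf(parsed["x"]).all())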
- ``pandas`` is now Python 2/3 compatible without the need for 2to3, thanks to
  @jtratner. As a result, pandas now uses iterators more extensively. This
  also led to the introduction of substantive parts of Benjamin
  Peterson's ``six`` library into compat. (:issue:`4384`, :issue:`4375`,
  :issue:`4372`)
- ``pandas.util.compat`` and ``pandas.util.py3compat`` have been merged into
  ``pandas.compat``. ``pandas.compat`` now includes many functions allowing
  2/3 compatibility. It contains both list and iterator versions of range,
  filter, map and zip, plus other necessary elements for Python 3
  compatibility. ``lmap``, ``lzip``, ``lrange`` and ``lfilter`` all produce
  lists instead of iterators, for compatibility with ``numpy``, subscripting
  and ``pandas`` constructors. (:issue:`4384`, :issue:`4375`, :issue:`4372`)
- ``Series.get`` with negative indexers now returns the same as ``[]`` (:issue:`4390`)
- Changes to how ``Index`` and ``MultiIndex`` handle metadata (``levels``,
  ``labels``, and ``names``) (:issue:`4039`):

  .. code-block:: python

     # previously, you would have set levels or labels directly
     >>> index.levels = [[1, 2, 3, 4], [1, 2, 4, 4]]

     # now, you use the set_levels or set_labels methods
     >>> index = index.set_levels([[1, 2, 3, 4], [1, 2, 4, 4]])

     # similarly, for names, you can rename the object
     # but setting names is not deprecated
     >>> index = index.set_names(["bob", "cranberry"])

     # and all methods take an inplace kwarg - but return None
     >>> index.set_names(["bob", "cranberry"], inplace=True)
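
  A minimal, self-contained sketch of the setter methods, assuming a small
  two-level ``MultiIndex`` built purely for illustration (the level values and
  names are hypothetical):

  .. code-block:: python

     import pandas as pd

     # Hypothetical two-level index built only for illustration.
     index = pd.MultiIndex.from_product([[1, 2], ["a", "b"]])

     # Each setter returns a new index; the original is left unchanged.
     named = index.set_names(["number", "letter"])
     relabeled = named.set_levels([[10, 20], ["x", "y"]])

     print(list(index.names))
     print(list(relabeled.names))
     print(relabeled.tolist())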

- **All** division with ``NDFrame`` objects is now *true division*, regardless
  of the future import. This means that operating on pandas objects will by default
  use *floating point* division, and return a floating point dtype.
  You can use ``//`` and ``floordiv`` to do integer division.

  Integer division

  .. code-block:: ipython

     In [3]: arr = np.array([1, 2, 3, 4])

     In [4]: arr2 = np.array([5, 3, 2, 1])

     In [5]: arr / arr2
     Out[5]: array([0, 0, 1, 4])

     In [6]: pd.Series(arr) // pd.Series(arr2)
     Out[6]:
     0    0
     1    0
     2    1
     3    4
     dtype: int64

  True Division

  .. code-block:: ipython

      In [7]: pd.Series(arr) / pd.Series(arr2)  # no future import required
      Out[7]:
      0    0.200000
      1    0.666667
      2    1.500000
      3    4.000000
      dtype: float64

- Infer and downcast dtype if ``downcast='infer'`` is passed to ``fillna/ffill/bfill`` (:issue:`4604`)
- ``__nonzero__`` for all NDFrame objects will now raise a ``ValueError``; this reverts to the
  behavior of (:issue:`1073`, :issue:`4633`). See :ref:`gotchas<gotchas.truth>` for a more detailed discussion.

  This prevents doing boolean comparison on *entire* pandas objects, which is inherently ambiguous. Each of the following raises a ``ValueError``.

  .. code-block:: python

     >>> df = pd.DataFrame({'A': np.random.randn(10),
     ...                    'B': np.random.randn(10),
     ...                    'C': pd.date_range('20130101', periods=10)
     ...                    })
     ...
     >>> if df:
     ...     pass
     ...
     Traceback (most recent call last):
         ...
     ValueError: The truth value of a DataFrame is ambiguous.  Use a.empty,
     a.bool(), a.item(), a.any() or a.all().

     >>> df1 = df
     >>> df2 = df
     >>> df1 and df2
     Traceback (most recent call last):
         ...
     ValueError: The truth value of a DataFrame is ambiguous.  Use a.empty,
     a.bool(), a.item(), a.any() or a.all().

     >>> d = [1, 2, 3]
     >>> s1 = pd.Series(d)
     >>> s2 = pd.Series(d)
     >>> s1 and s2
     Traceback (most recent call last):
         ...
     ValueError: The truth value of a Series is ambiguous.  Use a.empty,
     a.bool(), a.item(), a.any() or a.all().

  Added the ``.bool()`` method to ``NDFrame`` objects to facilitate evaluating single-element boolean Series and DataFrames:

  .. code-block:: python

     >>> pd.Series([True]).bool()
     True
     >>> pd.Series([False]).bool()
     False
     >>> pd.DataFrame([[True]]).bool()
     True
     >>> pd.DataFrame([[False]]).bool()
     False
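
  For objects with more than one element, the error message points to explicit
  reductions instead. The following is a small sketch of those alternatives;
  the ``flags`` Series is a made-up example.

  .. code-block:: python

     import pandas as pd

     flags = pd.Series([True, False, True])

     # Using the whole Series as a condition is ambiguous and raises.
     try:
         if flags:
             pass
     except ValueError as err:
         message = str(err)

     # Explicit reductions state what is actually being asked.
     has_any = bool(flags.any())    # at least one element is True
     all_true = bool(flags.all())   # every element is True
     is_empty = flags.empty         # the Series has no elements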

- All non-Index NDFrames (``Series``, ``DataFrame``, ``Panel``, ``Panel4D``,
  ``SparsePanel``, etc.), now support the entire set of arithmetic operators
  and arithmetic flex methods (add, sub, mul, etc.). ``SparsePanel`` does not
  support ``pow`` or ``mod`` with non-scalars. (:issue:`3765`)
- ``Series`` and ``DataFrame`` now have a ``mode()`` method to calculate the
  statistical mode(s) by axis/Series. (:issue:`5367`)
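
  A short sketch of ``mode()`` on both containers, using made-up data. Because
  ties are possible, the result is a Series or DataFrame rather than a scalar.

  .. code-block:: python

     import pandas as pd

     values = pd.Series([1, 2, 2, 3, 3])
     # 2 and 3 are tied, so both are returned.
     series_modes = values.mode()

     frame = pd.DataFrame({"x": [1, 1, 2], "y": ["a", "b", "b"]})
     # For a DataFrame the mode is computed column by column.
     frame_modes = frame.mode()

     print(series_modes.tolist())
     print(frame_modes)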

- Chained assignment will now by default warn if the user is assigning to a copy. This can be changed
  with the option ``mode.chained_assignment``; allowed values are ``raise/warn/None``. See :ref:`the docs<indexing.view_versus_copy>`.

  .. ipython:: python

     dfc = pd.DataFrame({'A': ['aaa', 'bbb', 'ccc'], 'B': [1, 2, 3]})
     pd.set_option('chained_assignment', 'warn')

  The following warning / exception will show if this is attempted.

  .. ipython:: python
     :okwarning:

     dfc.loc[0]['A'] = 1111

  ::

     Traceback (most recent call last)
        ...
     SettingWithCopyWarning:
        A value is trying to be set on a copy of a slice from a DataFrame.
        Try using .loc[row_index,col_indexer] = value instead

  Here is the correct method of assignment.

  .. ipython:: python

     dfc.loc[0, 'A'] = 11
     dfc

- ``Panel.reindex`` has the following call signature ``Panel.reindex(items=None, major_axis=None, minor_axis=None, **kwargs)``
  to conform with other ``NDFrame`` objects. See :ref:`Internal Refactoring<whatsnew_0130.refactoring>` for more information.

- ``Series.argmin`` and ``Series.argmax`` are now aliased to ``Series.idxmin`` and ``Series.idxmax``. These return the *index* of the
  min or max element respectively. Prior to 0.13.0 these would return the position of the min / max element. (:issue:`6214`)
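
  The difference between an index label and a position can be sketched as
  follows. The example calls ``idxmin``/``idxmax`` directly and takes the
  position from the underlying NumPy array, so it does not depend on how
  ``argmin`` is aliased; the ``prices`` Series is hypothetical.

  .. code-block:: python

     import pandas as pd

     prices = pd.Series([3.0, 1.0, 2.0], index=["a", "b", "c"])

     # Label of the smallest / largest value.
     label_of_min = prices.idxmin()
     label_of_max = prices.idxmax()

     # Position of the smallest value, taken from the raw NumPy array.
     position_of_min = int(prices.values.argmin())

     print(label_of_min, label_of_max, position_of_min)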

Prior version deprecations/changes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

These changes were announced in 0.12 or earlier and take effect as of 0.13.0:

- Remove deprecated ``Factor`` (:issue:`3650`)
- Remove deprecated ``set_printoptions/reset_printoptions`` (:issue:`3046`)
- Remove deprecated ``_verbose_info`` (:issue:`3215`)
- Remove deprecated ``read_clipboard/to_clipboard/ExcelFile/ExcelWriter`` from ``pandas.io.parsers`` (:issue:`3717`).
  These are available as functions in the main pandas namespace (e.g. ``pd.read_clipboard``).
- default for ``tupleize_cols`` is now ``False`` for both ``to_csv`` and ``read_csv``. Fair warning was given in 0.12 (:issue:`3604`)
- default for ``display.max_seq_len`` is now 100 rather than ``None``. This activates
  truncated display ("...") of long sequences in various places. (:issue:`3391`)

Deprecations
~~~~~~~~~~~~

Deprecated in 0.13.0

- deprecated ``iterkv``, which will be removed in a future release (this was
  an alias of iteritems used to bypass ``2to3``'s changes).
  (:issue:`4384`, :issue:`4375`, :issue:`4372`)
- deprecated the string method ``match``, whose role is now performed more
  idiomatically by ``extract``. In a future release, the default behavior
  of ``match`` will change to become analogous to ``contains``, which returns
  a boolean indexer. (The two differ in strictness: ``match`` relies on
  ``re.match`` while ``contains`` relies on ``re.search``.) In this release,
  the deprecated behavior is the default, but the new behavior is available
  through the keyword argument ``as_indexer=True``.
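
  The strictness difference comes from the standard library ``re`` functions
  themselves. The sketch below uses only ``re`` (not the pandas string
  methods), with a made-up pattern and input:

  .. code-block:: python

     import re

     pattern = r"\d+"
     text = "abc123"

     # re.match only succeeds if the pattern matches at the start.
     anchored = re.match(pattern, text)

     # re.search succeeds if the pattern matches anywhere in the string.
     anywhere = re.search(pattern, text)

     print(anchored)
     print(anywhere.group())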

Indexing API changes
~~~~~~~~~~~~~~~~~~~~

Prior to 0.13, it was impossible to use a label indexer (``.loc/.ix``) to set a value that
was not contained in the index of a particular axis (:issue:`2578`). This is now supported.
See :ref:`the docs<indexing.basics.partial_setting>`.

In the ``Series`` case this is effectively an appending operation:

.. ipython:: python

   s = pd.Series([1, 2, 3])
   s
   s[5] = 5.
   s

.. ipython:: python

   dfi = pd.DataFrame(np.arange(6).reshape(3, 2),
                      columns=['A', 'B'])
   dfi

This would previously raise a ``KeyError``:

.. ipython:: python

   dfi.loc[:, 'C'] = dfi.loc[:, 'A']
   dfi

This is like an ``append`` operation.

.. ipython:: python

   dfi.loc[3] = 5
   dfi
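
The contrast between reading and setting a missing label can be sketched as
follows. This is an illustrative example written with ``.loc`` only; the
variable ``missing_read_raises`` is introduced just for the demonstration.

.. code-block:: python

   import pandas as pd

   s = pd.Series([1, 2, 3])

   # Reading a label that is absent still raises a KeyError.
   try:
       s.loc[5]
       missing_read_raises = False
   except KeyError:
       missing_read_raises = True

   # Assigning to the absent label enlarges the Series instead.
   s.loc[5] = 5.0

   print(missing_read_raises)
   print(s)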

A Panel setting operation on an arbitrary axis aligns the input to the Panel:

.. code-block:: ipython

   In [20]: p = pd.Panel(np.arange(16).reshape(2, 4, 2),
      ....:              items=['Item1', 'Item2'],
      ....:              major_axis=pd.date_range('2001/1/12', periods=4),
      ....:              minor_axis=['A', 'B'], dtype='float64')
      ....:

   In [21]: p
   Out[21]:
   <class 'pandas.core.panel.Panel'>
   Dimensions: 2 (items) x 4 (major_axis) x 2 (minor_axis)
   Items axis: Item1 to Item2
   Major_axis axis: 2001-01-12 00:00:00 to 2001-01-15 00:00:00
   Minor_axis axis: A to B

   In [22]: p.loc[:, :, 'C'] = pd.Series([30, 32], index=p.items)

   In [23]: p
   Out[23]:
   <class 'pandas.core.panel.Panel'>
   Dimensions: 2 (items) x 4 (major_axis) x 3 (minor_axis)
   Items axis: Item1 to Item2
   Major_axis axis: 2001-01-12 00:00:00 to 2001-01-15 00:00:00
   Minor_axis axis: A to C

   In [24]: p.loc[:, :, 'C']
   Out[24]:
               Item1  Item2
   2001-01-12   30.0   32.0
   2001-01-13   30.0   32.0
   2001-01-14   30.0   32.0
   2001-01-15   30.0   32.0

Float64Index API change
~~~~~~~~~~~~~~~~~~~~~~~

- Added a new index type, ``Float64Index``. This will be automatically created when passing floating values in index creation.
  This enables a pure label-based slicing paradigm that makes ``[],ix,loc`` for scalar indexing and slicing work exactly the
  same. (:issue:`263`)

  A ``Float64Index`` is created by default when the index values are floating point.

  .. ipython:: python

     index = pd.Index([1.5, 2, 3, 4.5, 5])
     index
     s = pd.Series(range(5), index=index)
     s

  Scalar selection for ``[],.ix,.loc`` will always be label based. An integer will match an equal float index (e.g. ``3`` is equivalent to ``3.0``)

  .. ipython:: python

     s[3]
     s.loc[3]

  The only positional indexing is via ``iloc``

  .. ipython:: python

     s.iloc[3]

  A scalar index that is not found will raise a ``KeyError``.

  Slicing is ALWAYS on the values of the index for ``[],ix,loc``, and ALWAYS positional with ``iloc``:

  .. ipython:: python
     :okwarning:

     s[2:4]
     s.loc[2:4]
     s.iloc[2:4]

  In float indexes, slicing using floats is allowed:

  .. ipython:: python

     s[2.1:4.6]
     s.loc[2.1:4.6]
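
  The label-based versus positional rules above can be checked with a small
  sketch. It uses only ``.loc`` and ``.iloc`` so that each lookup is
  unambiguous; the expected rows in the comments follow from the index values.

  .. code-block:: python

     import pandas as pd

     s = pd.Series(range(5), index=[1.5, 2, 3, 4.5, 5])

     by_label = s.loc[2:4]            # labels 2.0 and 3.0 (both ends inclusive)
     by_float_label = s.loc[2.1:4.6]  # labels 3.0 and 4.5
     by_position = s.iloc[2:4]        # third and fourth rows
     scalar = s.loc[3]                # integer 3 matches the label 3.0

     print(by_label.tolist(), by_float_label.tolist(), by_position.tolist(), scalar)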

- Indexing on other index types is preserved (including positional fallback for ``[],ix``), with the exception that
  floating point slicing on a non-``Float64Index`` index will now raise a ``TypeError``.

  .. code-block:: ipython

     In [1]: pd.Series(range(5))[3.5]
     TypeError: the label [3.5] is not a proper indexer for this index type (Int64Index)

     In [2]: pd.Series(range(5))[3.5:4.5]
     TypeError: the slice start [3.5] is not a proper indexer for this index type (Int64Index)

  Using a scalar float indexer will be deprecated in a future version, but is allowed for now.

  .. code-block:: ipython

     In [3]: pd.Series(range(5))[3.0]
     Out[3]: 3

HDFStore API changes
~~~~~~~~~~~~~~~~~~~~

- Query Format Changes. A much more string-like query format is now supported. See :ref:`the docs<io.hdf5-query>`.

  .. ipython:: python

     path = 'test.h5'
     dfq = pd.DataFrame(np.random.randn(10, 4),
                        columns=list('ABCD'),
                        index=pd.date_range('20130101', periods=10))
     dfq.to_hdf(path, key='dfq', format='table', data_columns=True)

  Use boolean expressions, with in-line function evaluation.

  .. ipython:: python

     pd.read_hdf(path, 'dfq',
                 where="index>Timestamp('20130104') & columns=['A', 'B']")

  Use an inline column reference

  .. ipython:: python

     pd.read_hdf(path, 'dfq',
                 where="A>0 or C>0")

  .. ipython:: python
     :suppress:

     import os
     os.remove(path)

- the ``format`` keyword now replaces the ``table`` keyword; allowed values are ``fixed(f)`` or ``table(t)``.
  The same defaults as prior to 0.13.0 remain, e.g. ``put`` implies ``fixed`` format and ``append`` implies
  ``table`` format. This default format can be set as an option by setting ``io.hdf.default_format``.

  .. ipython:: python

     path = 'test.h5'
     df = pd.DataFrame(np.random.randn(10, 2))
     df.to_hdf(path, key='df_table', format='table')
     df.to_hdf(path, key='df_table2', append=True)
     df.to_hdf(path, key='df_fixed')
     with pd.HDFStore(path) as store:
         print(store)

  .. ipython:: python
     :suppress:

     import os
     os.remove(path)

- Significant table writing performance improvements
- handle a passed ``Series`` in table format (:issue:`4330`)
- can now serialize a ``timedelta64[ns]`` dtype in a table (:issue:`3577`), See :ref:`the docs<io.hdf5-timedelta>`.
- added an ``is_open`` property to indicate if the underlying file handle is open;
  a closed store will now report 'CLOSED' when viewing the store (rather than raising an error)
  (:issue:`4409`)
- closing an ``HDFStore`` now closes that instance of the ``HDFStore``,
  but will only close the actual file if the ref count (by ``PyTables``) w.r.t. all of the open handles
  is 0. Essentially you have a local instance of ``HDFStore`` referenced by a variable. Once you
  close it, it will report closed. Other references (to the same file) will continue to operate
  until they themselves are closed. Performing an action on a closed file will raise
  ``ClosedFileError``.

  .. ipython:: python

     path = 'test.h5'
     df = pd.DataFrame(np.random.randn(10, 2))
     store1 = pd.HDFStore(path)
     store2 = pd.HDFStore(path)
     store1.append('df', df)
     store2.append('df2', df)

     store1
     store2
     store1.close()
     store2
     store2.close()
     store2

  .. ipython:: python
     :suppress:

     import os
     os.remove(path)

- removed the ``_quiet`` attribute, replaced by a ``DuplicateWarning`` if retrieving
  duplicate rows from a table (:issue:`4367`)
- removed the ``warn`` argument from ``open``. Instead a ``PossibleDataLossError`` exception will
  be raised if you try to use ``mode='w'`` with an OPEN file handle (:issue:`4367`)
- allow a passed locations array or mask as a ``where`` condition (:issue:`4467`).
  See :ref:`the docs<io.hdf5-where_mask>` for an example.
- add the keyword ``dropna=True`` to ``append`` to control whether rows that are ALL nan are written
  to the store (default is ``True``, so ALL nan rows are NOT written), also settable
  via the option ``io.hdf.dropna_table`` (:issue:`4625`)
- pass through store creation arguments; can be used to support in-memory stores

DataFrame repr changes
~~~~~~~~~~~~~~~~~~~~~~

The HTML and plain text representations of :class:`DataFrame` now show
a truncated view of the table once it exceeds a certain size, rather
than switching to the short info view (:issue:`4886`, :issue:`5550`).
This makes the representation more consistent as small DataFrames get
larger.

.. image:: ../_static/df_repr_truncated.png
   :alt: Truncated HTML representation of a DataFrame

To get the info view, call :meth:`DataFrame.info`. If you prefer the
info view as the repr for large DataFrames, you can set this by running
``set_option('display.large_repr', 'info')``.
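
A small sketch of switching between the two representations, assuming a
made-up 100-row frame and a deliberately low ``display.max_rows`` so that the
frame counts as large. ``option_context`` is used so the settings are restored
afterwards.

.. code-block:: python

   import numpy as np
   import pandas as pd

   df = pd.DataFrame(np.zeros((100, 3)), columns=list("abc"))

   # Truncated table once max_rows is exceeded.
   with pd.option_context("display.max_rows", 10,
                          "display.large_repr", "truncate"):
       truncated_view = repr(df)

   # Summary-style info view for the same frame.
   with pd.option_context("display.max_rows", 10,
                          "display.large_repr", "info"):
       info_view = repr(df)

   print(truncated_view)
   print(info_view)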

Enhancements
~~~~~~~~~~~~

- ``df.to_clipboard()`` learned a new ``excel`` keyword that lets you
  paste df data directly into Excel (enabled by default) (:issue:`5070`).
- ``read_html`` now raises a ``URLError`` instead of catching and raising a
  ``ValueError`` (:issue:`4303`, :issue:`4305`)
- Added a test for ``read_clipboard()`` and ``to_clipboard()`` (:issue:`4282`)
- Clipboard functionality now works with PySide (:issue:`4282`)
- Added a more informative error message when plot arguments contain
  overlapping color and style arguments (:issue:`4402`)
- ``to_dict`` now takes ``records`` as a possible out type.  Returns an array
  of column-keyed dictionaries. (:issue:`4936`)

- ``NaN`` handling in ``get_dummies`` (:issue:`4446`) with ``dummy_na``

  .. ipython:: python

     # previously, nan was erroneously counted as 2 here
     # now it is not counted at all
     pd.get_dummies([1, 2, np.nan])

     # unless requested
     pd.get_dummies([1, 2, np.nan], dummy_na=True)


- ``timedelta64[ns]`` operations. See :ref:`the docs<timedeltas.timedeltas_convert>`.

  .. warning::

     Most of these operations require ``numpy >= 1.7``

  Using the new top-level ``to_timedelta``, you can convert a scalar or array from the standard
  timedelta format (produced by ``to_csv``) into a timedelta type (``np.timedelta64`` in ``nanoseconds``).

  .. ipython:: python

     pd.to_timedelta('1 days 06:05:01.00003')
     pd.to_timedelta('15.5us')
     pd.to_timedelta(['1 days 06:05:01.00003', '15.5us', 'nan'])
     pd.to_timedelta(np.arange(5), unit='s')
     pd.to_timedelta(np.arange(5), unit='d')

  A Series of dtype ``timedelta64[ns]`` can now be divided by another
  ``timedelta64[ns]`` object, or astyped to yield a ``float64`` dtyped Series. This
  is frequency conversion. See :ref:`the docs<timedeltas.timedeltas_convert>` for details.

  .. ipython:: python

     import datetime
     td = pd.Series(pd.date_range('20130101', periods=4)) - pd.Series(
         pd.date_range('20121201', periods=4))
     td[2] += np.timedelta64(datetime.timedelta(minutes=5, seconds=3))
     td[3] = np.nan
     td

  .. code-block:: ipython

     # to days
     In [63]: td / np.timedelta64(1, 'D')
     Out[63]:
     0    31.000000
     1    31.000000
     2    31.003507
     3          NaN
     dtype: float64

     In [64]: td.astype('timedelta64[D]')
     Out[64]:
     0    31.0
     1    31.0
     2    31.0
     3     NaN
     dtype: float64

     # to seconds
     In [65]: td / np.timedelta64(1, 's')
     Out[65]:
     0    2678400.0
     1    2678400.0
     2    2678703.0
     3          NaN
     dtype: float64

     In [66]: td.astype('timedelta64[s]')
     Out[66]:
     0    2678400.0
     1    2678400.0
     2    2678703.0
     3          NaN
     dtype: float64

  A ``timedelta64[ns]`` Series can be divided or multiplied by an integer or integer Series:

  .. ipython:: python

     td * -1
     td * pd.Series([1, 2, 3, 4])

  Absolute ``DateOffset`` objects can act equivalently to ``timedeltas``

  .. ipython:: python

     from pandas import offsets
     td + offsets.Minute(5) + offsets.Milli(5)

  Fillna is now supported for timedeltas

  .. ipython:: python

     td.fillna(pd.Timedelta(0))
     td.fillna(datetime.timedelta(days=1, seconds=5))

  You can do numeric reduction operations on timedeltas.

  .. ipython:: python

     td.mean()
     td.quantile(.1)

- ``plot(kind='kde')`` now accepts the optional parameters ``bw_method`` and
  ``ind``, passed to scipy.stats.gaussian_kde() (for scipy >= 0.11.0) to set
  the bandwidth, and to gkde.evaluate() to specify the indices at which it
  is evaluated, respectively. See scipy docs. (:issue:`4298`)

- DataFrame constructor now accepts a numpy masked record array (:issue:`3478`)

- The new vectorized string method ``extract`` returns regular expression
  matches more conveniently.

  .. ipython:: python
     :okwarning:

     pd.Series(['a1', 'b2', 'c3']).str.extract('[ab](\\d)')

  Elements that do not match return ``NaN``. Extracting a regular expression
  with more than one group returns a DataFrame with one column per group.


  .. ipython:: python
     :okwarning:

     pd.Series(['a1', 'b2', 'c3']).str.extract('([ab])(\\d)')

  Elements that do not match return a row of ``NaN``.
  Thus, a Series of messy strings can be *converted* into a
  like-indexed Series or DataFrame of cleaned-up or more useful strings,
  without necessitating ``get()`` to access tuples or ``re.match`` objects.

  Named groups like

  .. ipython:: python
     :okwarning:

     pd.Series(['a1', 'b2', 'c3']).str.extract(
         '(?P<letter>[ab])(?P<digit>\\d)')

  and optional groups can also be used.

  .. ipython:: python
     :okwarning:

      pd.Series(['a1', 'b2', '3']).str.extract(
          '(?P<letter>[ab])?(?P<digit>\\d)')

- ``read_stata`` now accepts Stata 13 format (:issue:`4291`)

- ``read_fwf`` now infers the column specifications from the first 100 rows of
  the file if the data has correctly separated and properly aligned columns
  using the delimiter provided to the function (:issue:`4488`).
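
  A minimal sketch of the inference on an in-memory table, assuming
  whitespace-separated, aligned columns. The column names and values are
  invented for the example, and no ``colspecs`` are passed.

  .. code-block:: python

     import io

     import pandas as pd

     # Hypothetical fixed-width text; columns are aligned and space-separated.
     lines = [
         "name   qty  price",
         "apple   10   1.50",
         "pear     3  12.25",
     ]
     table = pd.read_fwf(io.StringIO("\n".join(lines)))

     print(table)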

- support for nanosecond times as an offset

  .. warning::

     These operations require ``numpy >= 1.7``

  Period conversions in the range of seconds and below were reworked and extended
  up to nanoseconds. Periods in the nanosecond range are now available.

  .. code-block:: ipython

     In [79]: pd.date_range('2013-01-01', periods=5, freq='5N')
     Out[79]:
     DatetimeIndex([          '2013-01-01 00:00:00',
                    '2013-01-01 00:00:00.000000005',
                    '2013-01-01 00:00:00.000000010',
                    '2013-01-01 00:00:00.000000015',
                    '2013-01-01 00:00:00.000000020'],
                   dtype='datetime64[ns]', freq='5N')

  or with frequency as offset

  .. ipython:: python

     pd.date_range('2013-01-01', periods=5, freq=pd.offsets.Nano(5))

  Timestamps can be modified in the nanosecond range

  .. ipython:: python

     t = pd.Timestamp('20130101 09:01:02')
     t + pd.tseries.offsets.Nano(123)

- A new method, ``isin`` for DataFrames, which plays nicely with boolean indexing. The argument to ``isin``, what we're comparing the DataFrame to, can be a DataFrame, Series, dict, or array of values. See :ref:`the docs<indexing.basics.indexing_isin>` for more.

  To get the rows where any of the conditions are met:

  .. ipython:: python

     dfi = pd.DataFrame({'A': [1, 2, 3, 4], 'B': ['a', 'b', 'f', 'n']})
     dfi
     other = pd.DataFrame({'A': [1, 3, 3, 7], 'B': ['e', 'f', 'f', 'e']})
     mask = dfi.isin(other)
     mask
     dfi[mask.any(axis=1)]
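
  When the argument is a dict, each key names a column and its value lists the
  entries allowed in that column. A short sketch with made-up data:

  .. code-block:: python

     import pandas as pd

     df = pd.DataFrame({"A": [1, 2, 3, 4], "B": ["a", "b", "f", "n"]})

     # Per-column membership test.
     mask = df.isin({"A": [1, 3], "B": ["a", "n"]})

     rows_any = df[mask.any(axis=1)]   # at least one column matches
     rows_all = df[mask.all(axis=1)]   # every column matches

     print(rows_any)
     print(rows_all)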

- ``Series`` now supports a ``to_frame`` method to convert it to a single-column DataFrame (:issue:`5164`)

- All R datasets listed here http://stat.ethz.ch/R-manual/R-devel/library/datasets/html/00Index.html can now be loaded into pandas objects

  .. code-block:: python

     # note that pandas.rpy was deprecated in v0.16.0
     import pandas.rpy.common as com
     com.load_data('Titanic')

- ``tz_localize`` can infer a fall daylight savings transition based on the structure
  of the unlocalized data (:issue:`4230`), see :ref:`the docs<timeseries.timezone>`

- ``DatetimeIndex`` is now in the API documentation, see :ref:`the docs<api.datetimeindex>`

- :meth:`~pandas.io.json.json_normalize` is a new method to allow you to create a flat table
  from semi-structured JSON data. See :ref:`the docs<io.json_normalize>` (:issue:`1067`)
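
  A small sketch of flattening nested records. The records are invented, and
  the import is written defensively on the assumption that the function may be
  exposed either at the top level (as in later pandas versions) or under
  ``pandas.io.json``.

  .. code-block:: python

     import pandas as pd

     # Assumption: use the top-level function if present, else the io.json one.
     try:
         json_normalize = pd.json_normalize
     except AttributeError:
         from pandas.io.json import json_normalize

     records = [
         {"id": 1, "name": {"first": "Ann", "last": "Lee"}},
         {"id": 2, "name": {"first": "Bob"}},
     ]

     # Nested keys become dotted column names; missing keys become NaN.
     flat = json_normalize(records)
     print(flat)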

- Added PySide support for the qtpandas DataFrameModel and DataFrameWidget.

- Python csv parser now supports usecols (:issue:`4335`)

- Frequencies gained several new offsets:

  * ``LastWeekOfMonth`` (:issue:`4637`)
  * ``FY5253``, and ``FY5253Quarter`` (:issue:`4511`)


- DataFrame has a new ``interpolate`` method, similar to Series (:issue:`4434`, :issue:`1892`)

  .. ipython:: python

      df = pd.DataFrame({'A': [1, 2.1, np.nan, 4.7, 5.6, 6.8],
                        'B': [.25, np.nan, np.nan, 4, 12.2, 14.4]})
      df.interpolate()

  Additionally, the ``method`` argument to ``interpolate`` has been expanded
  to include ``'nearest', 'zero', 'slinear', 'quadratic', 'cubic',
  'barycentric', 'krogh', 'piecewise_polynomial', 'pchip', 'polynomial', 'spline'``.
  The new methods require scipy_. Consult the SciPy reference guide_ and documentation_ for more information
  about when the various methods are appropriate. See :ref:`the docs<missing_data.interpolate>`.

  Interpolate now also accepts a ``limit`` keyword argument.
  This works similarly to ``fillna``'s limit:

  .. ipython:: python

    ser = pd.Series([1, 3, np.nan, np.nan, np.nan, 11])
    ser.interpolate(limit=2)

- Added ``wide_to_long`` panel data convenience function. See :ref:`the docs<reshaping.melt>`.

  .. ipython:: python

    np.random.seed(123)
    df = pd.DataFrame({"A1970" : {0 : "a", 1 : "b", 2 : "c"},
                       "A1980" : {0 : "d", 1 : "e", 2 : "f"},
                       "B1970" : {0 : 2.5, 1 : 1.2, 2 : .7},
                       "B1980" : {0 : 3.2, 1 : 1.3, 2 : .1},
                       "X"     : dict(zip(range(3), np.random.randn(3)))
                      })
    df["id"] = df.index
    df
    pd.wide_to_long(df, ["A", "B"], i="id", j="year")

.. _scipy: http://www.scipy.org
.. _documentation: http://docs.scipy.org/doc/scipy/reference/interpolate.html#univariate-interpolation
.. _guide: https://docs.scipy.org/doc/scipy/tutorial/interpolate.html

- ``to_csv`` now takes a ``date_format`` keyword argument that specifies how
  output datetime objects should be formatted. Datetimes encountered in the
  index, columns, and values will all have this formatting applied. (:issue:`4313`)
- ``DataFrame.plot`` will scatter plot x versus y by passing ``kind='scatter'`` (:issue:`2215`)
- Added support for Google Analytics v3 API segment IDs that also supports v2 IDs. (:issue:`5271`)
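
As a sketch of the ``date_format`` keyword of ``to_csv`` described above, the
following writes a small, made-up frame with a datetime index to a string and
formats the dates without separators:

.. code-block:: python

   import pandas as pd

   # Hypothetical frame with a daily DatetimeIndex
   df = pd.DataFrame({'value': [1.5, 2.5]},
                     index=pd.date_range('2013-01-01', periods=2))
   # With no path given, to_csv returns the CSV text as a string
   csv_text = df.to_csv(date_format='%Y%m%d')
   print(csv_text)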

.. _whatsnew_0130.experimental:

Experimental
~~~~~~~~~~~~

- The new :func:`~pandas.eval` function implements expression evaluation using
  ``numexpr`` behind the scenes. This results in large speedups for
  complicated expressions involving large DataFrames/Series. For example,

  .. ipython:: python

     nrows, ncols = 20000, 100
     df1, df2, df3, df4 = [pd.DataFrame(np.random.randn(nrows, ncols))
                           for _ in range(4)]

  .. ipython:: python

     # eval with NumExpr backend
     %timeit pd.eval('df1 + df2 + df3 + df4')

  .. ipython:: python

     # pure Python evaluation
     %timeit df1 + df2 + df3 + df4

  For more details, see :ref:`the docs<enhancingperf.eval>`.

- Similar to ``pandas.eval``, :class:`~pandas.DataFrame` has a new
  ``DataFrame.eval`` method that evaluates an expression in the context of
  the ``DataFrame``. For example,

  .. ipython:: python
     :suppress:

     try:
         del a  # noqa: F821
     except NameError:
         pass

     try:
         del b  # noqa: F821
     except NameError:
         pass

  .. ipython:: python

     df = pd.DataFrame(np.random.randn(10, 2), columns=['a', 'b'])
     df.eval('a + b')

- A new :meth:`~pandas.DataFrame.query` method has been added that allows
  you to select elements of a ``DataFrame`` using a natural query syntax
  nearly identical to Python syntax. For example,

  .. ipython:: python
     :suppress:

     try:
         del a  # noqa: F821
     except NameError:
         pass

     try:
         del b  # noqa: F821
     except NameError:
         pass

     try:
         del c  # noqa: F821
     except NameError:
         pass

  .. ipython:: python

     n = 20
     df = pd.DataFrame(np.random.randint(n, size=(n, 3)), columns=['a', 'b', 'c'])
     df.query('a < b < c')

  selects all the rows of ``df`` where ``a < b < c`` evaluates to ``True``.
  For more details see :ref:`the docs<indexing.query>`.

- ``pd.read_msgpack()`` and ``pd.to_msgpack()`` are now a supported method of serialization
  of arbitrary pandas (and Python) objects in a lightweight portable binary format. See :ref:`the docs<io.msgpack>`.

  .. warning::

     Since this is an EXPERIMENTAL LIBRARY, the storage format may not be stable until a future release.

  .. code-block:: python

     df = pd.DataFrame(np.random.rand(5, 2), columns=list('AB'))
     df.to_msgpack('foo.msg')
     pd.read_msgpack('foo.msg')

     s = pd.Series(np.random.rand(5), index=pd.date_range('20130101', periods=5))
     pd.to_msgpack('foo.msg', df, s)
     pd.read_msgpack('foo.msg')

  You can pass ``iterator=True`` to iterate over the unpacked results:

  .. code-block:: python

     for o in pd.read_msgpack('foo.msg', iterator=True):
         print(o)

  .. ipython:: python
     :suppress:
     :okexcept:

     os.remove('foo.msg')

- ``pandas.io.gbq`` provides a simple way to extract from, and load data into,
  Google's BigQuery Data Sets by way of pandas DataFrames. BigQuery is a high
  performance SQL-like database service, useful for performing ad-hoc queries
  against extremely large datasets. :ref:`See the docs <io.bigquery>`

  .. code-block:: python

     from pandas.io import gbq

     # A query to select the average monthly temperatures in the
     # year 2000 across the USA. The dataset,
     # publicdata:samples.gsod, is available on all BigQuery accounts,
     # and is based on NOAA gsod data.

     query = """SELECT station_number as STATION,
     month as MONTH, AVG(mean_temp) as MEAN_TEMP
     FROM publicdata:samples.gsod
     WHERE YEAR = 2000
     GROUP BY STATION, MONTH
     ORDER BY STATION, MONTH ASC"""

     # Fetch the result set for this query

     # Your Google BigQuery Project ID
     # To find this, see your dashboard:
     # https://console.developers.google.com/iam-admin/projects?authuser=0
     projectid = 'xxxxxxxxx'
     df = gbq.read_gbq(query, project_id=projectid)

     # Use pandas to process and reshape the dataset

     df2 = df.pivot(index='STATION', columns='MONTH', values='MEAN_TEMP')
     df3 = pd.concat([df2.min(), df2.mean(), df2.max()],
                     axis=1, keys=["Min Temp", "Mean Temp", "Max Temp"])

  The resulting DataFrame is::

     > df3
                Min Temp  Mean Temp    Max Temp
      MONTH
      1     -53.336667  39.827892   89.770968
      2     -49.837500  43.685219   93.437932
      3     -77.926087  48.708355   96.099998
      4     -82.892858  55.070087   97.317240
      5     -92.378261  61.428117  102.042856
      6     -77.703334  65.858888  102.900000
      7     -87.821428  68.169663  106.510714
      8     -89.431999  68.614215  105.500000
      9     -86.611112  63.436935  107.142856
      10    -78.209677  56.880838   92.103333
      11    -50.125000  48.861228   94.996428
      12    -50.332258  42.286879   94.396774

  .. warning::

     To use this module, you will need a BigQuery account. See
     <https://cloud.google.com/products/big-query> for details.

     As of 10/10/13, there is a bug in Google's API preventing result sets
     from being larger than 100,000 rows. A patch is scheduled for the week of
     10/14/13.

.. _whatsnew_0130.refactoring:

Internal refactoring
~~~~~~~~~~~~~~~~~~~~

In 0.13.0 there is a major refactor primarily to subclass ``Series`` from
``NDFrame``, which is the base class currently for ``DataFrame`` and ``Panel``,
to unify methods and behaviors. Series formerly subclassed directly from
``ndarray``. (:issue:`4080`, :issue:`3862`, :issue:`816`)

.. warning::

   There are several potential incompatibilities with versions earlier than 0.13.0:

   - Using certain numpy functions would previously return a ``Series`` if passed a ``Series``
     as an argument. This seems only to affect ``np.ones_like``, ``np.empty_like``,
     ``np.diff`` and ``np.where``. These now return ``ndarrays``.

     .. ipython:: python

        s = pd.Series([1, 2, 3, 4])

     Numpy Usage

     .. ipython:: python

        np.ones_like(s)
        np.diff(s)
        np.where(s > 1, s, np.nan)

     Pandonic Usage

     .. ipython:: python

        pd.Series(1, index=s.index)
        s.diff()
        s.where(s > 1)

   - Passing a ``Series`` directly to a cython function expecting an ``ndarray`` type will no
     longer work; you must pass ``Series.values`` instead. See :ref:`Enhancing Performance<enhancingperf.ndarray>`

   - ``Series(0.5)`` would previously return the scalar ``0.5``; it now returns a 1-element ``Series``

   - This change breaks ``rpy2<=2.3.8``. An issue has been opened against rpy2 and a workaround
     is detailed in :issue:`5698`. Thanks @JanSchulz.
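
   The scalar-constructor change and the ``Series.values`` requirement listed
   above can be checked with a short sketch (the value is arbitrary):

   .. code-block:: python

      import numpy as np
      import pandas as pd

      # A scalar passed to the constructor yields a 1-element Series
      s = pd.Series(0.5)
      print(type(s).__name__, len(s))

      # The underlying ndarray, suitable for code that expects a plain array
      arr = s.values
      print(type(arr).__name__)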

- Pickle compatibility is preserved for pickles created prior to 0.13. These must be unpickled with ``pd.read_pickle``, see :ref:`Pickling<io.pickle>`.

- Refactor of series.py/frame.py/panel.py to move common code to generic.py

  - added ``_setup_axes`` to create generic NDFrame structures
  - moved methods

    - ``from_axes,_wrap_array,axes,ix,loc,iloc,shape,empty,swapaxes,transpose,pop``
    - ``__iter__,keys,__contains__,__len__,__neg__,__invert__``
    - ``convert_objects,as_blocks,as_matrix,values``
    - ``__getstate__,__setstate__`` (compat remains in frame/panel)
    - ``__getattr__,__setattr__``
    - ``_indexed_same,reindex_like,align,where,mask``
    - ``fillna,replace`` (``Series`` replace is now consistent with ``DataFrame``)
    - ``filter`` (also added axis argument to selectively filter on a different axis)
    - ``reindex,reindex_axis,take``
    - ``truncate`` (moved to become part of ``NDFrame``)

- These are API changes which make ``Panel`` more consistent with ``DataFrame``

  - ``swapaxes`` on a ``Panel`` with the same axes specified now returns a copy
  - support attribute access for setting
  - filter supports the same API as the original ``DataFrame`` filter

- Reindex called with no arguments will now return a copy of the input object

- ``TimeSeries`` is now an alias for ``Series``. The property ``is_time_series``
  can be used to distinguish between them (if desired)

- Refactor of Sparse objects to use BlockManager

  - Created a new block type in internals, ``SparseBlock``, which can hold multi-dtypes
    and is non-consolidatable. ``SparseSeries`` and ``SparseDataFrame`` now inherit
    more methods from their hierarchy (Series/DataFrame), and no longer inherit
    from ``SparseArray`` (which instead is the object of the ``SparseBlock``)
  - Sparse suite now supports integration with non-sparse data. Non-float sparse
    data is supportable (partially implemented)
  - Operations on sparse structures within DataFrames should preserve sparseness,
    merging type operations will convert to dense (and back to sparse), so might
    be somewhat inefficient
  - enable setitem on ``SparseSeries`` for boolean/integer/slices
  - ``SparsePanels`` implementation is unchanged (i.e. not using BlockManager, needs work)

- Added ``ftypes`` method to Series/DataFrame, similar to ``dtypes``, but indicates
  whether the underlying data is sparse or dense (as well as the dtype)
- All ``NDFrame`` objects can now use ``__finalize__()`` to specify various
  values to propagate to new objects from an existing one (e.g. ``name`` in ``Series`` will
  follow more automatically now)
- Internal type checking is now done via a suite of generated classes, allowing ``isinstance(value, klass)``
  without having to directly import the klass, courtesy of @jtratner
- Bug in Series update where the parent frame is not updating its cache based on
  changes (:issue:`4080`) or types (:issue:`3217`), fillna (:issue:`3386`)
- Indexing with dtype conversions fixed (:issue:`4463`, :issue:`4204`)
- Refactor ``Series.reindex`` to core/generic.py (:issue:`4604`, :issue:`4618`), allow ``method=`` in reindexing
  on a Series to work
- ``Series.copy`` no longer accepts the ``order`` parameter and is now consistent with ``NDFrame`` copy
- Refactor ``rename`` methods to core/generic.py; fixes ``Series.rename`` for (:issue:`4605`), and adds ``rename``
  with the same signature for ``Panel``
- Refactor ``clip`` methods to core/generic.py (:issue:`4798`)
- Refactor of ``_get_numeric_data/_get_bool_data`` to core/generic.py, allowing Series/Panel functionality
- ``Series`` (for index) / ``Panel`` (for items) now allow attribute access to their elements (:issue:`1903`)

  .. ipython:: python

     s = pd.Series([1, 2, 3], index=list('abc'))
     s.b
     s.a = 5
     s

.. _release.bug_fixes-0.13.0:

Bug fixes
~~~~~~~~~

- ``HDFStore``

  - raising an invalid ``TypeError`` rather than ``ValueError`` when
    appending with a different block ordering (:issue:`4096`)
  - ``read_hdf`` was not respecting a passed ``mode`` (:issue:`4504`)
  - appending a 0-len table will work correctly (:issue:`4273`)
  - ``to_hdf`` was raising when passing both arguments ``append`` and
    ``table`` (:issue:`4584`)
  - reading from a store with duplicate columns across dtypes would raise
    (:issue:`4767`)
  - Fixed a bug where ``ValueError`` wasn't correctly raised when column
    names weren't strings (:issue:`4956`)
  - A zero length series written in Fixed format not deserializing properly.
    (:issue:`4708`)
  - Fixed decoding performance issue on Python 3 (:issue:`5441`)
  - Validate levels in a MultiIndex before storing (:issue:`5527`)
  - Correctly handle ``data_columns`` with a Panel (:issue:`5717`)
- Fixed bug in tslib.tz_convert(vals, tz1, tz2): it could raise IndexError
  exception while trying to access trans[pos + 1] (:issue:`4496`)
- The ``by`` argument now works correctly with the ``layout`` argument
  (:issue:`4102`, :issue:`4014`) in ``*.hist`` plotting methods
- Fixed bug in ``PeriodIndex.map`` where using ``str`` would return the str
  representation of the index (:issue:`4136`)
- Fixed test failure ``test_time_series_plot_color_with_empty_kwargs`` when
  using custom matplotlib default colors (:issue:`4345`)
- Fix running of stata IO tests. Now uses temporary files to write
  (:issue:`4353`)
- Fixed an issue where ``DataFrame.sum`` was slower than ``DataFrame.mean``
  for integer valued frames (:issue:`4365`)
- ``read_html`` tests now work with Python 2.6 (:issue:`4351`)
- Fixed bug where ``network`` testing was throwing ``NameError`` because a
  local variable was undefined (:issue:`4381`)
- In ``to_json``, raise if a passed ``orient`` would cause loss of data
  because of a duplicate index (:issue:`4359`)
- In ``to_json``, fix date handling so milliseconds are the default timestamp
  as the docstring says (:issue:`4362`).
- ``as_index`` is no longer ignored when doing groupby apply (:issue:`4648`,
  :issue:`3417`)
- JSON NaT handling fixed, NaTs are now serialized to ``null`` (:issue:`4498`)
- Fixed JSON handling of escapable characters in JSON object keys
  (:issue:`4593`)
- Fixed passing ``keep_default_na=False`` when ``na_values=None``
  (:issue:`4318`)
- Fixed bug with ``values`` raising an error on a DataFrame with duplicate
  columns and mixed dtypes, surfaced in (:issue:`4377`)
- Fixed bug with duplicate columns and type conversion in ``read_json`` when
  ``orient='split'`` (:issue:`4377`)
- Fixed JSON bug where locales with decimal separators other than '.' threw
  exceptions when encoding / decoding certain values. (:issue:`4918`)
- Fix ``.iat`` indexing with a ``PeriodIndex`` (:issue:`4390`)
- Fixed an issue where ``PeriodIndex`` joining with self was returning a new
  instance rather than the same instance (:issue:`4379`); also adds a test
  for this for the other index types
- Fixed a bug with all the dtypes being converted to object when using the
  CSV cparser with the usecols parameter (:issue:`3192`)
- Fix an issue in merging blocks where the resulting DataFrame had partially
  set _ref_locs (:issue:`4403`)
- Fixed an issue where hist subplots were being overwritten when they were
  called using the top level matplotlib API (:issue:`4408`)
- Fixed a bug where calling ``Series.astype(str)`` would truncate the string
  (:issue:`4405`, :issue:`4437`)
- Fixed a py3 compat issue where bytes were being repr'd as tuples
  (:issue:`4455`)
- Fixed Panel attribute naming conflict if item is named 'a'
  (:issue:`3440`)
- Fixed an issue where duplicate indexes were raising when plotting
  (:issue:`4486`)
- Fixed an issue where cumsum and cumprod didn't work with bool dtypes
  (:issue:`4170`, :issue:`4440`)
- Fixed Panel slicing issue in ``xs`` that was returning an object with
  incorrect dimensions (:issue:`4016`)
- Fix resampling bug where custom reduce function not used if only one group
  (:issue:`3849`, :issue:`4494`)
- Fixed Panel assignment with a transposed frame (:issue:`3830`)
- Raise on set indexing with a Panel and a Panel as a value which needs
  alignment (:issue:`3777`)
- frozenset objects now raise in the ``Series`` constructor (:issue:`4482`,
  :issue:`4480`)
- Fixed issue with sorting a duplicate MultiIndex that has multiple dtypes
  (:issue:`4516`)
- Fixed bug in ``DataFrame.set_values`` which was causing name attributes to
  be lost when expanding the index. (:issue:`3742`, :issue:`4039`)
- Fixed issue where individual ``names``, ``levels`` and ``labels`` could be
  set on ``MultiIndex`` without validation (:issue:`3714`, :issue:`4039`)
- Fixed (:issue:`3334`) in pivot_table. Margins did not compute if values is
  the index.
- Fix bug in having a rhs of ``np.timedelta64`` or ``pd.offsets.DateOffset``
  when operating with datetimes (:issue:`4532`)
- Fix arithmetic with series/datetimeindex and ``np.timedelta64`` not working
  the same (:issue:`4134`) and buggy timedelta in NumPy 1.6 (:issue:`4135`)
- Fix bug in ``pd.read_clipboard`` on windows with PY3 (:issue:`4561`); not
  decoding properly
- ``tslib.get_period_field()`` and ``tslib.get_period_field_arr()`` now raise
  if code argument out of range (:issue:`4519`, :issue:`4520`)
- Fix boolean indexing on an empty series loses index names (:issue:`4235`),
  infer_dtype works with empty arrays.
- Fix reindexing with multiple axes: a matching axis was not replacing the
  current axis, leading to a possible lazy frequency inference issue
  (:issue:`3317`)
- Fixed issue where ``DataFrame.apply`` was reraising exceptions incorrectly
  (causing the original stack trace to be truncated).
- Fix selection with ``ix/loc`` and non_unique selectors (:issue:`4619`)
- Fix assignment with ``iloc/loc`` involving a dtype change in an existing column
  (:issue:`4312`, :issue:`5702`); the internal ``setitem_with_indexer`` in
  core/indexing now uses ``Block.setitem``
- Fixed bug where thousands operator was not handled correctly for floating
  point numbers in csv_import (:issue:`4322`)
- Fix an issue with CacheableOffset not properly being used by many
  DateOffset; this prevented the DateOffset from being cached (:issue:`4609`)
- Fix boolean comparison with a DataFrame on the lhs, and a list/tuple on the
  rhs (:issue:`4576`)
- Fix error/dtype conversion with setitem of ``None`` on ``Series/DataFrame``
  (:issue:`4667`)
- Fix decoding based on a passed in non-default encoding in ``pd.read_stata``
  (:issue:`4626`)
- Fix ``DataFrame.from_records`` with a plain-vanilla ``ndarray``.
  (:issue:`4727`)
- Fix some inconsistencies with ``Index.rename`` and ``MultiIndex.rename``,
  etc. (:issue:`4718`, :issue:`4628`)
- Bug in using ``iloc/loc`` with a cross-sectional and duplicate indices
  (:issue:`4726`)
- Bug with using ``QUOTE_NONE`` with ``to_csv`` causing ``Exception``.
  (:issue:`4328`)
- Bug with Series indexing not raising an error when the right-hand-side has
  an incorrect length (:issue:`2702`)
- Bug in MultiIndexing with a partial string selection as one part of a
  MultiIndex (:issue:`4758`)
- Bug with reindexing on the index with a non-unique index will now raise
  ``ValueError`` (:issue:`4746`)
- Bug in setting with ``loc/ix`` a single indexer with a MultiIndex axis and
  a NumPy array, related to (:issue:`3777`)
- Bug in concatenation with duplicate columns across dtypes not merging with
  axis=0 (:issue:`4771`, :issue:`4975`)
- Bug in ``iloc`` with a slice index failing (:issue:`4771`)
- Incorrect error message with no colspecs or width in ``read_fwf``.
  (:issue:`4774`)
- Fix bugs in indexing in a Series with a duplicate index (:issue:`4548`,
  :issue:`4550`)
- Fixed bug with reading compressed files with ``read_fwf`` in Python 3.
  (:issue:`3963`)
- Fixed an issue with a duplicate index and assignment with a dtype change
  (:issue:`4686`)
- Fixed bug with reading compressed files in as ``bytes`` rather than ``str``
  in Python 3. Simplifies bytes-producing file-handling in Python 3
  (:issue:`3963`, :issue:`4785`).
- Fixed an issue related to ticklocs/ticklabels with log scale bar plots
  across different versions of matplotlib (:issue:`4789`)
- Suppressed DeprecationWarning associated with internal calls issued by
  repr() (:issue:`4391`)
- Fixed an issue with a duplicate index and duplicate selector with ``.loc``
  (:issue:`4825`)
- Fixed an issue with ``DataFrame.sort_index`` where, when sorting by a
  single column and passing a list for ``ascending``, the argument for
  ``ascending`` was being interpreted as ``True`` (:issue:`4839`,
  :issue:`4846`)
- Fixed ``Panel.tshift`` not working. Added ``freq`` support to ``Panel.shift``
  (:issue:`4853`)
- Fix an issue in TextFileReader w/ Python engine (i.e. PythonParser)
  with thousands != "," (:issue:`4596`)
- Bug in getitem with a duplicate index when using where (:issue:`4879`)
- Fix type inference code coercing a float column into datetime (:issue:`4601`)
- Fixed ``_ensure_numeric`` not checking for complex numbers
  (:issue:`4902`)
- Fixed a bug in ``Series.hist`` where two figures were being created when
  the ``by`` argument was passed (:issue:`4112`, :issue:`4113`).
- Fixed a bug in ``convert_objects`` for > 2 ndims (:issue:`4937`)
- Fixed a bug in DataFrame/Panel cache insertion and subsequent indexing
  (:issue:`4939`, :issue:`5424`)
- Fixed string methods for ``FrozenNDArray`` and ``FrozenList``
  (:issue:`4929`)
- Fixed a bug with setting invalid or out-of-range values in indexing
  enlargement scenarios (:issue:`4940`)
- Tests for fillna on empty Series (:issue:`4346`), thanks @immerrr
- Fixed ``copy()`` to shallow copy axes/indices as well and thereby keep
  separate metadata. (:issue:`4202`, :issue:`4830`)
- Fixed skiprows option in Python parser for read_csv (:issue:`4382`)
- Fixed bug preventing ``cut`` from working with ``np.inf`` levels without
  explicitly passing labels (:issue:`3415`)
- Fixed wrong check for overlapping in ``DatetimeIndex.union``
  (:issue:`4564`)
- Fixed conflict between thousands separator and date parser in csv_parser
  (:issue:`4678`)
- Fix appending when dtypes are not the same (error showing mixing
  float/np.datetime64) (:issue:`4993`)
- Fix repr for DateOffset. No longer show duplicate entries in kwds.
  Removed unused offset fields. (:issue:`4638`)
- Fixed wrong index name during read_csv if using usecols. Applies to c
  parser only. (:issue:`4201`)
- ``Timestamp`` objects can now appear in the left hand side of a comparison
  operation with a ``Series`` or ``DataFrame`` object (:issue:`4982`).
- Fix a bug when indexing with ``np.nan`` via ``iloc/loc`` (:issue:`5016`)
- Fixed a bug where low memory c parser could create different types in
  different chunks of the same file. Now coerces to numerical type or raises
  warning. (:issue:`3866`)
- Fix a bug where reshaping a ``Series`` to its own shape raised
  ``TypeError`` (:issue:`4554`) and other reshaping issues.
- Bug in setting with ``ix/loc`` and a mixed int/string index (:issue:`4544`)
- Make sure series-series boolean comparisons are label based (:issue:`4947`)
- Bug in multi-level indexing with a Timestamp partial indexer
  (:issue:`4294`)
- Tests/fix for MultiIndex construction of an all-nan frame (:issue:`4078`)
- Fixed a bug where :func:`~pandas.read_html` wasn't correctly inferring
  values of tables with commas (:issue:`5029`)
- Fixed a bug where :func:`~pandas.read_html` wasn't providing a stable
  ordering of returned tables (:issue:`4770`, :issue:`5029`).
- Fixed a bug where :func:`~pandas.read_html` was incorrectly parsing when
  passed ``index_col=0`` (:issue:`5066`).
- Fixed a bug where :func:`~pandas.read_html` was incorrectly inferring the
  type of headers (:issue:`5048`).
- Fixed a bug where ``DatetimeIndex`` joins with ``PeriodIndex`` caused a
  stack overflow (:issue:`3899`).
- Fixed a bug where ``groupby`` objects didn't allow plots (:issue:`5102`).
- Fixed a bug where ``groupby`` objects weren't tab-completing column names
  (:issue:`5102`).
- Fixed a bug where ``groupby.plot()`` and friends were duplicating figures
  multiple times (:issue:`5102`).
- Provide automatic conversion of ``object`` dtypes on fillna, related
  (:issue:`5103`)
- Fixed a bug where default options were being overwritten in the option
  parser cleaning (:issue:`5121`).
- Treat a list/ndarray identically for ``iloc`` indexing with list-like
  (:issue:`5006`)
- Fix ``MultiIndex.get_level_values()`` with missing values (:issue:`5074`)
- Fix bound checking for Timestamp() with datetime64 input (:issue:`4065`)
- Fix a bug where ``TestReadHtml`` wasn't calling the correct ``read_html()``
  function (:issue:`5150`).
- Fix a bug with ``NDFrame.replace()`` which made replacement appear as
  though it was (incorrectly) using regular expressions (:issue:`5143`).
- Better error message for ``to_datetime`` (:issue:`4928`)
- Made sure different locales are tested on travis-ci (:issue:`4918`). Also
  adds a couple of utilities for getting locales and setting locales with a
  context manager.
- Fixed segfault on ``isnull(MultiIndex)`` (now raises an error instead)
  (:issue:`5123`, :issue:`5125`)
- Allow duplicate indices when performing operations that align
  (:issue:`5185`, :issue:`5639`)
- Compound dtypes in a constructor raise ``NotImplementedError``
  (:issue:`5191`)
- Bug in comparing duplicate frames (:issue:`4421`) related
- Bug in describe on duplicate frames
- Bug in ``to_datetime`` with a format and ``coerce=True`` not raising
  (:issue:`5195`)
- Bug in ``loc`` setting with multiple indexers and a rhs of a Series that
  needs broadcasting (:issue:`5206`)
- Fixed bug where inplace setting of levels or labels on ``MultiIndex`` would
  not clear cached ``values`` property and therefore return wrong ``values``.
  (:issue:`5215`)
- Fixed bug where filtering a grouped DataFrame or Series did not maintain
  the original ordering (:issue:`4621`).
- Fixed ``Period`` with a business date freq to always roll-forward if on a
  non-business date. (:issue:`5203`)
- Fixed bug in Excel writers where frames with duplicate column names weren't
  written correctly. (:issue:`5235`)
- Fixed issue with ``drop`` and a non-unique index on Series (:issue:`5248`)
- Fixed segfault in C parser caused by passing more names than columns in
  the file. (:issue:`5156`)
- Fix ``Series.isin`` with date/time-like dtypes (:issue:`5021`)
- C and Python Parser can now handle the more common MultiIndex column
  format which doesn't have a row for index names (:issue:`4702`)
- Bug when trying to use an out-of-bounds date as an object dtype
  (:issue:`5312`)
- Bug when trying to display an embedded PandasObject (:issue:`5324`)
- Allow operations on Timestamps to return a datetime if the result is out of bounds,
  related to (:issue:`5312`)
- Fix return value/type signature of ``initObjToJSON()`` to be compatible
  with numpy's ``import_array()`` (:issue:`5334`, :issue:`5326`)
- Bug when renaming then set_index on a DataFrame (:issue:`5344`)
- Test suite no longer leaves around temporary files when testing graphics. (:issue:`5347`)
  (thanks for catching this @yarikoptic!)
- Fixed html tests on win32. (:issue:`4580`)
- Make sure that ``head/tail`` are ``iloc`` based, (:issue:`5370`)
- Fixed bug for ``PeriodIndex`` string representation if there are 1 or 2
  elements. (:issue:`5372`)
- The GroupBy methods ``transform`` and ``filter`` can be used on Series
  and DataFrames that have repeated (non-unique) indices. (:issue:`4620`)
- Fix empty series not printing name in repr (:issue:`4651`)
- Make tests create temp files in temp directory by default. (:issue:`5419`)
- ``pd.to_timedelta`` of a scalar returns a scalar (:issue:`5410`)
- ``pd.to_timedelta`` accepts ``NaN`` and ``NaT``, returning ``NaT`` instead of raising (:issue:`5437`)
- Performance improvements in ``isnull`` on larger size pandas objects
- Fixed various setitem with 1d ndarray that does not have a matching
  length to the indexer (:issue:`5508`)
- Bug in getitem with a MultiIndex and ``iloc`` (:issue:`5528`)
- Bug in delitem on a Series (:issue:`5542`)
- Bug fix in apply when using custom function and objects are not mutated (:issue:`5545`)
- Bug in selecting from a non-unique index with ``loc`` (:issue:`5553`)
- Bug in groupby returning non-consistent types when user function returns a ``None``, (:issue:`5592`)
- Work around regression in numpy 1.7.0 which erroneously raises IndexError from ``ndarray.item`` (:issue:`5666`)
- Bug in repeated indexing of object with resultant non-unique index (:issue:`5678`)
- Bug in fillna with Series and a passed series/dict (:issue:`5703`)
- Bug in groupby transform with a datetime-like grouper (:issue:`5712`)
- Bug in MultiIndex selection in PY3 when using certain keys (:issue:`5725`)
- Row-wise concat of differing dtypes failing in certain cases (:issue:`5754`)
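
Two of the fixes listed above, the scalar behaviour of ``pd.to_timedelta`` and
the serialization of ``NaT`` to ``null`` in JSON, can be checked with a short
sketch. It uses arbitrary values and assumes a recent pandas:

.. code-block:: python

   import numpy as np
   import pandas as pd

   # A scalar input gives a scalar Timedelta, not an array-like
   one_day = pd.to_timedelta('1 days')
   print(type(one_day).__name__)

   # NaN is accepted and converted to NaT instead of raising
   missing = pd.to_timedelta(np.nan)
   print(pd.isna(missing))

   # NaT values are written as null by to_json
   json_text = pd.Series([pd.Timestamp('2013-01-01'), pd.NaT]).to_json()
   print(json_text)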

.. _whatsnew_0.13.0.contributors:

Contributors
~~~~~~~~~~~~

.. contributors:: v0.12.0..v0.13.0