.. include:: _contributors.rst

.. currentmodule:: sklearn

.. _release_notes_1_4:

===========
Version 1.4
===========

For a short description of the main highlights of the release, please refer to
:ref:`sphx_glr_auto_examples_release_highlights_plot_release_highlights_1_4_0.py`.

.. include:: changelog_legend.inc

.. _changes_1_4_2:

Version 1.4.2
=============

**April 2024**

This release only includes support for numpy 2.

.. _changes_1_4_1:

Version 1.4.1.post1
===================

**February 2024**

.. note::
    The 1.4.1.post1 release includes a packaging fix requiring `numpy<2` to account for
    incompatibilities with NumPy 2.0 ABI. Note that the 1.4.1 release is not available
    on PyPI and conda-forge.

Metadata Routing
----------------

- |Fix| Fix routing issue with :class:`~compose.ColumnTransformer` when used
  inside another meta-estimator.
  :pr:`28188` by `Adrin Jalali`_.

- |Fix| No error is raised when no metadata is passed to a meta-estimator that
  includes a sub-estimator which doesn't support metadata routing.
  :pr:`28256` by `Adrin Jalali`_.

DataFrame Support
-----------------

- |Enhancement| |Fix| Pandas and Polars dataframes are validated directly without
  ducktyping checks.
  :pr:`28195` by `Thomas Fan`_.

Changes impacting many modules
------------------------------

- |Efficiency| |Fix| Partial revert of :pr:`28191` to avoid a performance regression for
  estimators relying on euclidean pairwise computation with
  sparse matrices. The impacted estimators are:

  - :func:`sklearn.metrics.pairwise_distances_argmin`
  - :func:`sklearn.metrics.pairwise_distances_argmin_min`
  - :class:`sklearn.cluster.AffinityPropagation`
  - :class:`sklearn.cluster.Birch`
  - :class:`sklearn.cluster.SpectralClustering`
  - :class:`sklearn.neighbors.KNeighborsClassifier`
  - :class:`sklearn.neighbors.KNeighborsRegressor`
  - :class:`sklearn.neighbors.RadiusNeighborsClassifier`
  - :class:`sklearn.neighbors.RadiusNeighborsRegressor`
  - :class:`sklearn.neighbors.LocalOutlierFactor`
  - :class:`sklearn.neighbors.NearestNeighbors`
  - :class:`sklearn.manifold.Isomap`
  - :class:`sklearn.manifold.TSNE`
  - :func:`sklearn.manifold.trustworthiness`

  :pr:`28235` by :user:`Julien Jerphanion <jjerphan>`.

- |Fix| Fixes a bug for all scikit-learn transformers when using `set_output` with
  `transform` set to `pandas` or `polars`. The bug could lead to wrong naming of the
  columns of the returned dataframe.
  :pr:`28262` by :user:`Guillaume Lemaitre <glemaitre>`.

- |Fix| When users try to use a method in :class:`~ensemble.StackingClassifier`,
  :class:`~ensemble.StackingRegressor`,
  :class:`~feature_selection.SelectFromModel`, :class:`~feature_selection.RFE`,
  :class:`~semi_supervised.SelfTrainingClassifier`,
  :class:`~multiclass.OneVsOneClassifier`, :class:`~multiclass.OutputCodeClassifier` or
  :class:`~multiclass.OneVsRestClassifier` that their sub-estimators don't implement,
  the original `AttributeError` is now re-raised in the traceback.
  :pr:`28167` by :user:`Stefanie Senger <StefanieSenger>`.

Metadata Routing
----------------

- |Fix| Fix :class:`multioutput.MultiOutputRegressor` and
  :class:`multioutput.MultiOutputClassifier` to work with estimators that don't
  consume any metadata when metadata routing is enabled.
  :pr:`28240` by `Adrin Jalali`_.

Changelog
---------

:mod:`sklearn.calibration`
..........................

- |Fix| :class:`calibration.CalibratedClassifierCV` supports :term:`predict_proba` with
  float32 output from the inner estimator. :pr:`28247` by `Thomas Fan`_.

:mod:`sklearn.cluster`
......................

- |Fix| :class:`cluster.AffinityPropagation` now avoids assigning multiple different
  clusters for equal points.
  :pr:`28121` by :user:`Pietro Peterlongo <pietroppeter>` and
  :user:`Yao Xiao <Charlie-XIAO>`.

- |Fix| Avoid infinite loop in :class:`cluster.KMeans` when the number of clusters is
  larger than the number of non-duplicate samples.
  :pr:`28165` by :user:`Jérémie du Boisberranger <jeremiedbb>`.

:mod:`sklearn.compose`
......................

- |Fix| :class:`compose.ColumnTransformer` now transforms into a polars dataframe when
  `verbose_feature_names_out=True` and the transformers internally use the same
  columns several times. Previously, it would raise an error due to duplicated column names.
  :pr:`28262` by :user:`Guillaume Lemaitre <glemaitre>`.

:mod:`sklearn.ensemble`
.......................

- |Fix| Fixes :class:`ensemble.HistGradientBoostingClassifier` and
  :class:`ensemble.HistGradientBoostingRegressor` when fitted on a `pandas` `DataFrame`
  with extension dtypes, for example `pd.Int64Dtype`.
  :pr:`28385` by :user:`Loïc Estève <lesteve>`.

- |Fix| Fixes error message raised by :class:`ensemble.VotingClassifier` when the
  target is multilabel or multiclass-multioutput in a DataFrame format.
  :pr:`27702` by :user:`Guillaume Lemaitre <glemaitre>`.

:mod:`sklearn.impute`
.....................

- |Fix| :class:`impute.SimpleImputer` now raises an error in `.fit` and
  `.transform` if `fill_value` cannot be cast to input value dtype with
  `casting='same_kind'`.
  :pr:`28365` by :user:`Leo Grinsztajn <LeoGrin>`.

:mod:`sklearn.inspection`
.........................

- |Fix| :func:`inspection.permutation_importance` now properly handles `sample_weight`
  together with subsampling (i.e. `max_samples` < 1.0).
  :pr:`28184` by :user:`Michael Mayer <mayer79>`.

:mod:`sklearn.linear_model`
...........................

- |Fix| :class:`linear_model.ARDRegression` now handles pandas input types
  for `predict(X, return_std=True)`.
  :pr:`28377` by :user:`Eddie Bergman <eddiebergman>`.

:mod:`sklearn.preprocessing`
............................

- |Fix| Make :class:`preprocessing.FunctionTransformer` more lenient and overwrite
  output column names with the `get_feature_names_out` in the following cases:
  (i) the input and output column names remain the same (happens when using NumPy
  `ufunc`); (ii) the input column names are numbers; (iii) the output will be set to
  Pandas or Polars dataframe.
  :pr:`28241` by :user:`Guillaume Lemaitre <glemaitre>`.

- |Fix| :class:`preprocessing.FunctionTransformer` now also warns when `set_output`
  is called with `transform="polars"` and `func` does not return a Polars dataframe or
  `feature_names_out` is not specified.
  :pr:`28263` by :user:`Guillaume Lemaitre <glemaitre>`.

- |Fix| :class:`preprocessing.TargetEncoder` no longer fails when
  `target_type="continuous"` and the input is read-only. In particular, it now
  works with pandas copy-on-write mode enabled.
  :pr:`28233` by :user:`John Hopfensperger <s-banach>`.

:mod:`sklearn.tree`
...................

- |Fix| :class:`tree.DecisionTreeClassifier` and
  :class:`tree.DecisionTreeRegressor` now handle missing values properly. The internal
  criterion was not initialized when no missing values were present in the data, leading
  to potentially wrong criterion values.
  :pr:`28295` by :user:`Guillaume Lemaitre <glemaitre>` and
  :pr:`28327` by :user:`Adam Li <adam2392>`.

:mod:`sklearn.utils`
....................

- |Enhancement| |Fix| :func:`utils.metaestimators.available_if` now reraises the error
  from the `check` function as the cause of the `AttributeError`.
  :pr:`28198` by `Thomas Fan`_.

- |Fix| :func:`utils._safe_indexing` now raises a `ValueError` when `X` is a Python list
  and `axis=1`, as documented in the docstring.
  :pr:`28222` by :user:`Guillaume Lemaitre <glemaitre>`.

.. _changes_1_4:

Version 1.4.0
=============

**January 2024**

Changed models
--------------

The following estimators and functions, when fit with the same data and
parameters, may produce different models from the previous version. This often
occurs due to changes in the modelling logic (bug fixes or enhancements), or in
random sampling procedures.

- |Efficiency| :class:`linear_model.LogisticRegression` and
  :class:`linear_model.LogisticRegressionCV` now have much better convergence for
  solvers `"lbfgs"` and `"newton-cg"`. Both solvers can now reach much higher precision
  for the coefficients depending on the specified `tol`. Additionally, lbfgs can
  make better use of `tol`, i.e., stop sooner or reach higher precision.
  Note: lbfgs is the default solver, so this change might affect many models.
  This change also means that with this new version of scikit-learn, the resulting
  coefficients `coef_` and `intercept_` of your models will change for these two
  solvers (when fit on the same data again). The amount of change depends on the
  specified `tol`; smaller values give more precise results.
  :pr:`26721` by :user:`Christian Lorentzen <lorentzenchr>`.

- |Fix| Fixes a memory leak seen in PyPy for estimators using the Cython loss functions.
  :pr:`27670` by :user:`Guillaume Lemaitre <glemaitre>`.

Changes impacting all modules
-----------------------------

- |MajorFeature| Transformers now support polars output with
  `set_output(transform="polars")`.
  :pr:`27315` by `Thomas Fan`_.

- |Enhancement| All estimators now recognize the column names from any dataframe
  that adopts the
  `DataFrame Interchange Protocol <https://data-apis.org/dataframe-protocol/latest/purpose_and_scope.html>`__.
  Dataframes that return a correct representation through `np.asarray(df)` are expected
  to work with our estimators and functions.
  :pr:`26464` by `Thomas Fan`_.

- |Enhancement| The HTML representation of estimators now includes a link to the
  documentation and is color-coded to denote whether the estimator is fitted or
  not (unfitted estimators are orange, fitted estimators are blue).
  :pr:`26616` by :user:`Riccardo Cappuzzo <rcap107>`,
  :user:`Ines Ibnukhsein <Ines1999>`, :user:`Gael Varoquaux <GaelVaroquaux>`,
  `Joel Nothman`_ and :user:`Lilian Boulard <LilianBoulard>`.

- |Fix| Fixed a bug in most estimators and functions where setting a parameter to
  a large integer would cause a `TypeError`.
  :pr:`26648` by :user:`Naoise Holohan <naoise-h>`.

Metadata Routing
----------------

The following models now support metadata routing in one or more of their
methods. Refer to the :ref:`Metadata Routing User Guide <metadata_routing>` for
more details.

- |Feature| :class:`linear_model.LarsCV` and :class:`linear_model.LassoLarsCV` now support metadata
  routing in their `fit` method and route metadata to the CV splitter.
  :pr:`27538` by :user:`Omar Salman <OmarManzoor>`.

- |Feature| :class:`multiclass.OneVsRestClassifier`,
  :class:`multiclass.OneVsOneClassifier` and
  :class:`multiclass.OutputCodeClassifier` now support metadata routing in
  their ``fit`` and ``partial_fit``, and route metadata to the underlying
  estimator's ``fit`` and ``partial_fit``.
  :pr:`27308` by :user:`Stefanie Senger <StefanieSenger>`.

- |Feature| :class:`pipeline.Pipeline` now supports metadata routing according
  to :ref:`metadata routing user guide <metadata_routing>`.
  :pr:`26789` by `Adrin Jalali`_.

- |Feature| :func:`~model_selection.cross_validate`,
  :func:`~model_selection.cross_val_score`, and
  :func:`~model_selection.cross_val_predict` now support metadata routing. The
  metadata are routed to the estimator's `fit`, the scorer, and the CV
  splitter's `split`. The metadata is accepted via the new `params` parameter.
  `fit_params` is deprecated and will be removed in version 1.6. `groups`
  parameter is also not accepted as a separate argument when metadata routing
  is enabled and should be passed via the `params` parameter.
  :pr:`26896` by `Adrin Jalali`_.

- |Feature| :class:`~model_selection.GridSearchCV`,
  :class:`~model_selection.RandomizedSearchCV`,
  :class:`~model_selection.HalvingGridSearchCV`, and
  :class:`~model_selection.HalvingRandomSearchCV` now support metadata routing
  in their ``fit`` and ``score``, and route metadata to the underlying
  estimator's ``fit``, the CV splitter, and the scorer.
  :pr:`27058` by `Adrin Jalali`_.

- |Feature| :class:`~compose.ColumnTransformer` now supports metadata routing
  according to :ref:`metadata routing user guide <metadata_routing>`.
  :pr:`27005` by `Adrin Jalali`_.

- |Feature| :class:`linear_model.LogisticRegressionCV` now supports
  metadata routing. :meth:`linear_model.LogisticRegressionCV.fit` now
  accepts ``**params`` which are passed to the underlying splitter and
  scorer. :meth:`linear_model.LogisticRegressionCV.score` now accepts
  ``**score_params`` which are passed to the underlying scorer.
  :pr:`26525` by :user:`Omar Salman <OmarManzoor>`.

- |Feature| :class:`feature_selection.SelectFromModel` now supports metadata
  routing in `fit` and `partial_fit`.
  :pr:`27490` by :user:`Stefanie Senger <StefanieSenger>`.

- |Feature| :class:`linear_model.OrthogonalMatchingPursuitCV` now supports
  metadata routing. Its `fit` now accepts ``**fit_params``, which are passed to
  the underlying splitter.
  :pr:`27500` by :user:`Stefanie Senger <StefanieSenger>`.

- |Feature| :class:`linear_model.ElasticNetCV`, :class:`linear_model.LassoCV`,
  :class:`linear_model.MultiTaskElasticNetCV` and :class:`linear_model.MultiTaskLassoCV`
  now support metadata routing and route metadata to the CV splitter.
  :pr:`27478` by :user:`Omar Salman <OmarManzoor>`.

- |Fix| All meta-estimators for which metadata routing is not yet implemented
  now raise a `NotImplementedError` on `get_metadata_routing` and on `fit` if
  metadata routing is enabled and any metadata is passed to them.
  :pr:`27389` by `Adrin Jalali`_.


Support for SciPy sparse arrays
-------------------------------

Several estimators now support SciPy sparse arrays. The following functions
and classes are impacted:

**Functions:**

- :func:`cluster.compute_optics_graph` in :pr:`27104` by
  :user:`Maren Westermann <marenwestermann>` and in :pr:`27250` by
  :user:`Yao Xiao <Charlie-XIAO>`;
- :func:`cluster.kmeans_plusplus` in :pr:`27179` by :user:`Nurseit Kamchyev <Bncer>`;
- :func:`decomposition.non_negative_factorization` in :pr:`27100` by
  :user:`Isaac Virshup <ivirshup>`;
- :func:`feature_selection.f_regression` in :pr:`27239` by
  :user:`Yaroslav Korobko <Tialo>`;
- :func:`feature_selection.r_regression` in :pr:`27239` by
  :user:`Yaroslav Korobko <Tialo>`;
- :func:`manifold.trustworthiness` in :pr:`27250` by :user:`Yao Xiao <Charlie-XIAO>`;
- :func:`manifold.spectral_embedding` in :pr:`27240` by :user:`Yao Xiao <Charlie-XIAO>`;
- :func:`metrics.pairwise_distances` in :pr:`27250` by :user:`Yao Xiao <Charlie-XIAO>`;
- :func:`metrics.pairwise_distances_chunked` in :pr:`27250` by
  :user:`Yao Xiao <Charlie-XIAO>`;
- :func:`metrics.pairwise.pairwise_kernels` in :pr:`27250` by
  :user:`Yao Xiao <Charlie-XIAO>`;
- :func:`utils.multiclass.type_of_target` in :pr:`27274` by
  :user:`Yao Xiao <Charlie-XIAO>`.

**Classes:**

- :class:`cluster.HDBSCAN` in :pr:`27250` by :user:`Yao Xiao <Charlie-XIAO>`;
- :class:`cluster.KMeans` in :pr:`27179` by :user:`Nurseit Kamchyev <Bncer>`;
- :class:`cluster.MiniBatchKMeans` in :pr:`27179` by :user:`Nurseit Kamchyev <Bncer>`;
- :class:`cluster.OPTICS` in :pr:`27104` by
  :user:`Maren Westermann <marenwestermann>` and in :pr:`27250` by
  :user:`Yao Xiao <Charlie-XIAO>`;
- :class:`cluster.SpectralClustering` in :pr:`27161` by
  :user:`Bharat Raghunathan <bharatr21>`;
- :class:`decomposition.MiniBatchNMF` in :pr:`27100` by
  :user:`Isaac Virshup <ivirshup>`;
- :class:`decomposition.NMF` in :pr:`27100` by :user:`Isaac Virshup <ivirshup>`;
- :class:`feature_extraction.text.TfidfTransformer` in :pr:`27219` by
  :user:`Yao Xiao <Charlie-XIAO>`;
- :class:`manifold.Isomap` in :pr:`27250` by :user:`Yao Xiao <Charlie-XIAO>`;
- :class:`manifold.SpectralEmbedding` in :pr:`27240` by :user:`Yao Xiao <Charlie-XIAO>`;
- :class:`manifold.TSNE` in :pr:`27250` by :user:`Yao Xiao <Charlie-XIAO>`;
- :class:`impute.SimpleImputer` in :pr:`27277` by :user:`Yao Xiao <Charlie-XIAO>`;
- :class:`impute.IterativeImputer` in :pr:`27277` by :user:`Yao Xiao <Charlie-XIAO>`;
- :class:`impute.KNNImputer` in :pr:`27277` by :user:`Yao Xiao <Charlie-XIAO>`;
- :class:`kernel_approximation.PolynomialCountSketch` in :pr:`27301` by
  :user:`Lohit SundaramahaLingam <lohitslohit>`;
- :class:`neural_network.BernoulliRBM` in :pr:`27252` by
  :user:`Yao Xiao <Charlie-XIAO>`;
- :class:`preprocessing.PolynomialFeatures` in :pr:`27166` by
  :user:`Mohit Joshi <work-mohit>`;
- :class:`random_projection.GaussianRandomProjection` in :pr:`27314` by
  :user:`Stefanie Senger <StefanieSenger>`;
- :class:`random_projection.SparseRandomProjection` in :pr:`27314` by
  :user:`Stefanie Senger <StefanieSenger>`.

Support for Array API
---------------------

Several estimators and functions support the
`Array API <https://data-apis.org/array-api/latest/>`_. Such changes allow for using
the estimators and functions with other libraries such as JAX, CuPy, and PyTorch.
This therefore enables some GPU-accelerated computations.

See :ref:`array_api` for more details.

**Functions:**

- :func:`sklearn.metrics.accuracy_score` and :func:`sklearn.metrics.zero_one_loss` in
  :pr:`27137` by :user:`Edoardo Abati <EdAbati>`;
- :func:`sklearn.model_selection.train_test_split` in :pr:`26855` by `Tim Head`_;
- :func:`~utils.multiclass.is_multilabel` in :pr:`27601` by
  :user:`Yaroslav Korobko <Tialo>`.

**Classes:**

- :class:`decomposition.PCA` for the `full` and `randomized` solvers (with QR power
  iterations) in :pr:`26315`, :pr:`27098` and :pr:`27431` by
  :user:`Mateusz Sokół <mtsokol>`, :user:`Olivier Grisel <ogrisel>` and
  :user:`Edoardo Abati <EdAbati>`;
- :class:`preprocessing.KernelCenterer` in :pr:`27556` by
  :user:`Edoardo Abati <EdAbati>`;
- :class:`preprocessing.MaxAbsScaler` in :pr:`27110` by :user:`Edoardo Abati <EdAbati>`;
- :class:`preprocessing.MinMaxScaler` in :pr:`26243` by `Tim Head`_;
- :class:`preprocessing.Normalizer` in :pr:`27558` by :user:`Edoardo Abati <EdAbati>`.

Private Loss Function Module
----------------------------

- |Fix| The gradient computation of the binomial log loss is now numerically
  more stable for very large, in absolute value, input (raw predictions). Before, it
  could result in `np.nan`. Among the models that profit from this change are
  :class:`ensemble.GradientBoostingClassifier`,
  :class:`ensemble.HistGradientBoostingClassifier` and
  :class:`linear_model.LogisticRegression`.
  :pr:`28048` by :user:`Christian Lorentzen <lorentzenchr>`.
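
To illustrate the kind of instability involved, the following sketch compares a naive
evaluation of the binomial log loss gradient, ``sigmoid(raw) - y``, with one based on
`scipy.special.expit`. It is not the code of the private loss module; the formula used
for the naive variant and the input values are assumptions chosen for the example.

```python
import numpy as np
from scipy.special import expit

raw = np.array([-800.0, 0.0, 800.0])  # raw predictions, two of them extreme
y = np.array([0.0, 1.0, 1.0])

# Naive sigmoid: exp(800) overflows to inf, and inf / inf evaluates to nan.
with np.errstate(over="ignore", invalid="ignore"):
    naive_grad = np.exp(raw) / (1.0 + np.exp(raw)) - y

# expit evaluates the sigmoid without overflowing for large |raw|.
stable_grad = expit(raw) - y

print(naive_grad)
print(stable_grad)
```

The naive variant returns `nan` for the large positive input, whereas the stable
variant stays finite for all three values.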

Changelog
---------

..
    Entries should be grouped by module (in alphabetic order) and prefixed with
    one of the labels: |MajorFeature|, |Feature|, |Efficiency|, |Enhancement|,
    |Fix| or |API| (see whats_new.rst for descriptions).
    Entries should be ordered by those labels (e.g. |Fix| after |Efficiency|).
    Changes not specific to a module should be listed under *Multiple Modules*
    or *Miscellaneous*.
    Entries should end with:
    :pr:`123456` by :user:`Joe Bloggs <joeongithub>`.
    where 123456 is the *pull request* number, not the issue number.


:mod:`sklearn.base`
...................

- |Enhancement| :meth:`base.ClusterMixin.fit_predict` and
  :meth:`base.OutlierMixin.fit_predict` now accept ``**kwargs`` which are
  passed to the ``fit`` method of the estimator.
  :pr:`26506` by `Adrin Jalali`_.

- |Enhancement| :meth:`base.TransformerMixin.fit_transform` and
  :meth:`base.OutlierMixin.fit_predict` now raise a warning if ``transform`` /
  ``predict`` consume metadata, but no custom ``fit_transform`` / ``fit_predict``
  is defined in the class inheriting from them correspondingly.
  :pr:`26831` by `Adrin Jalali`_.

- |Enhancement| :func:`base.clone` now supports `dict` as input and creates a
  copy.
  :pr:`26786` by `Adrin Jalali`_.

- |API| :func:`~utils.metadata_routing.process_routing` now has a different
  signature. The first two (the object and the method) are positional only,
  and all metadata are passed as keyword arguments.
  :pr:`26909` by `Adrin Jalali`_.

:mod:`sklearn.calibration`
..........................

- |Enhancement| The internal objective and gradient of the `sigmoid` method
  of :class:`calibration.CalibratedClassifierCV` have been replaced by the
  private loss module.
  :pr:`27185` by :user:`Omar Salman <OmarManzoor>`.

:mod:`sklearn.cluster`
......................

- |Fix| The `degree` parameter in the :class:`cluster.SpectralClustering`
  constructor now accepts real values instead of only integral values in
  accordance with the `degree` parameter of the
  :func:`sklearn.metrics.pairwise.polynomial_kernel`.
  :pr:`27668` by :user:`Nolan McMahon <NolantheNerd>`.

- |Fix| Fixes a bug in :class:`cluster.OPTICS` where the cluster correction based
  on predecessor was not using the right indexing. It would lead to inconsistent results
  dependent on the order of the data.
  :pr:`26459` by :user:`Haoying Zhang <stevezhang1999>` and
  :user:`Guillaume Lemaitre <glemaitre>`.

- |Fix| Improve error message when checking the number of connected components
  in the `fit` method of :class:`cluster.HDBSCAN`.
  :pr:`27678` by :user:`Ganesh Tata <tataganesh>`.

- |Fix| Create copy of precomputed sparse matrix within the
  `fit` method of :class:`cluster.DBSCAN` to avoid in-place modification of
  the sparse matrix.
  :pr:`27651` by :user:`Ganesh Tata <tataganesh>`.

- |Fix| :class:`cluster.HDBSCAN` now raises a proper `ValueError` when
  `metric="precomputed"` and storing centers is requested via the parameter
  `store_centers`.
  :pr:`27898` by :user:`Guillaume Lemaitre <glemaitre>`.

- |API| `kdtree` and `balltree` values are now deprecated and are renamed as
  `kd_tree` and `ball_tree` respectively for the `algorithm` parameter of
  :class:`cluster.HDBSCAN` ensuring consistency in naming convention.
  `kdtree` and `balltree` values will be removed in 1.6.
  :pr:`26744` by :user:`Shreesha Kumar Bhat <Shreesha3112>`.

- |API| The option `metric=None` in
  :class:`cluster.AgglomerativeClustering` and :class:`cluster.FeatureAgglomeration`
  is deprecated in version 1.4 and will be removed in version 1.6. Use the default
  value instead.
  :pr:`27828` by :user:`Guillaume Lemaitre <glemaitre>`.

:mod:`sklearn.compose`
......................

- |MajorFeature| Adds `polars <https://www.pola.rs>`__ input support to
  :class:`compose.ColumnTransformer` through the `DataFrame Interchange Protocol
  <https://data-apis.org/dataframe-protocol/latest/purpose_and_scope.html>`__.
  The minimum supported version for polars is `0.19.12`.
  :pr:`26683` by `Thomas Fan`_.

- |Fix| :func:`cluster.spectral_clustering` and :class:`cluster.SpectralClustering`
  now raise an explicit error message indicating that sparse matrices and arrays
  with `np.int64` indices are not supported.
  :pr:`27240` by :user:`Yao Xiao <Charlie-XIAO>`.

- |API| outputs that use pandas extension dtypes and contain `pd.NA` in
  :class:`~compose.ColumnTransformer` now result in a `FutureWarning` and will
  cause a `ValueError` in version 1.6, unless the output container has been
  configured as "pandas" with `set_output(transform="pandas")`. Before, such
  outputs resulted in numpy arrays of dtype `object` containing `pd.NA` which
  could not be converted to numpy floats and caused errors when passed to other
  scikit-learn estimators.
  :pr:`27734` by :user:`Jérôme Dockès <jeromedockes>`.

:mod:`sklearn.covariance`
.........................

- |Enhancement| Allow :func:`covariance.shrunk_covariance` to process
  multiple covariance matrices at once by handling nd-arrays.
  :pr:`25275` by :user:`Quentin Barthélemy <qbarthelemy>`.

- |API| |Fix| :class:`~compose.ColumnTransformer` now replaces `"passthrough"`
  with a corresponding :class:`~preprocessing.FunctionTransformer` in the
  fitted ``transformers_`` attribute.
  :pr:`27204` by `Adrin Jalali`_.

:mod:`sklearn.datasets`
.......................

- |Enhancement| :func:`datasets.make_sparse_spd_matrix` now uses a more
  memory-efficient sparse layout. It also accepts a new keyword `sparse_format` that allows
  specifying the output format of the sparse matrix. By default `sparse_format=None`,
  which returns a dense numpy ndarray as before.
  :pr:`27438` by :user:`Yao Xiao <Charlie-XIAO>`.

- |Fix| :func:`datasets.dump_svmlight_file` now does not raise `ValueError` when `X`
  is read-only, e.g., a `numpy.memmap` instance.
  :pr:`28111` by :user:`Yao Xiao <Charlie-XIAO>`.

- |API| :func:`datasets.make_sparse_spd_matrix` deprecated the keyword argument ``dim``
  in favor of ``n_dim``. ``dim`` will be removed in version 1.6.
  :pr:`27718` by :user:`Adam Li <adam2392>`.

:mod:`sklearn.decomposition`
............................

- |Feature| :class:`decomposition.PCA` now supports :class:`scipy.sparse.sparray`
  and :class:`scipy.sparse.spmatrix` inputs when using the `arpack` solver.
  When used on sparse data like :func:`datasets.fetch_20newsgroups_vectorized` this
  can lead to speed-ups of 100x (single threaded) and 70x lower memory usage.
  Based on :user:`Alexander Tarashansky <atarashansky>`'s implementation in
  `scanpy <https://github.com/scverse/scanpy>`_.
  :pr:`18689` by :user:`Isaac Virshup <ivirshup>` and
  :user:`Andrey Portnoy <andportnoy>`.

- |Enhancement| An "auto" option was added to the `n_components` parameter of
  :func:`decomposition.non_negative_factorization`, :class:`decomposition.NMF` and
  :class:`decomposition.MiniBatchNMF` to automatically infer the number of components
  from W or H shapes when using a custom initialization. The default value of this
  parameter will change from `None` to `auto` in version 1.6.
  :pr:`26634` by :user:`Alexandre Landeau <AlexL>` and :user:`Alexandre Vigny <avigny>`.

- |Fix| :func:`decomposition.dict_learning_online` no longer ignores the parameter
  `max_iter`.
  :pr:`27834` by :user:`Guillaume Lemaitre <glemaitre>`.

- |Fix| The `degree` parameter in the :class:`decomposition.KernelPCA`
  constructor now accepts real values instead of only integral values in
  accordance with the `degree` parameter of the
  :func:`sklearn.metrics.pairwise.polynomial_kernel`.
  :pr:`27668` by :user:`Nolan McMahon <NolantheNerd>`.

- |API| The option `max_iter=None` in
  :class:`decomposition.MiniBatchDictionaryLearning`,
  :class:`decomposition.MiniBatchSparsePCA`, and
  :func:`decomposition.dict_learning_online` is deprecated and will be removed in
  version 1.6. Use the default value instead.
  :pr:`27834` by :user:`Guillaume Lemaitre <glemaitre>`.

:mod:`sklearn.ensemble`
.......................

- |MajorFeature| :class:`ensemble.RandomForestClassifier` and
  :class:`ensemble.RandomForestRegressor` support missing values when
  the criterion is `gini`, `entropy`, or `log_loss`,
  for classification or `squared_error`, `friedman_mse`, or `poisson`
  for regression.
  :pr:`26391` by `Thomas Fan`_.

- |MajorFeature| :class:`ensemble.HistGradientBoostingClassifier` and
  :class:`ensemble.HistGradientBoostingRegressor` support
  `categorical_features="from_dtype"`, which treats columns with Pandas or
  Polars Categorical dtype as categories in the algorithm.
  `categorical_features="from_dtype"` will become the default in v1.6.
  Categorical features no longer need to be encoded with numbers. When
  categorical features are numbers, the maximum value no longer needs to be
  smaller than `max_bins`; only the number of (unique) categories must be
  smaller than `max_bins`.
  :pr:`26411` by `Thomas Fan`_ and :pr:`27835` by :user:`Jérôme Dockès <jeromedockes>`.

- |MajorFeature| :class:`ensemble.HistGradientBoostingClassifier` and
  :class:`ensemble.HistGradientBoostingRegressor` got the new parameter
  `max_features` to specify the proportion of randomly chosen features considered
  in each split.
  :pr:`27139` by :user:`Christian Lorentzen <lorentzenchr>`.

- |Feature| :class:`ensemble.RandomForestClassifier`,
  :class:`ensemble.RandomForestRegressor`, :class:`ensemble.ExtraTreesClassifier`
  and :class:`ensemble.ExtraTreesRegressor` now support monotonic constraints,
  useful when features are supposed to have a positive/negative effect on the target.
  Missing values in the train data and multi-output targets are not supported.
  :pr:`13649` by :user:`Samuel Ronsin <samronsin>`,
  initiated by :user:`Patrick O'Reilly <pat-oreilly>`.

- |Efficiency| :class:`ensemble.HistGradientBoostingClassifier` and
  :class:`ensemble.HistGradientBoostingRegressor` are now a bit faster by reusing
  the parent node's histogram as a child node's histogram in the subtraction trick.
  In effect, less memory has to be allocated and deallocated.
  :pr:`27865` by :user:`Christian Lorentzen <lorentzenchr>`.

- |Efficiency| :class:`ensemble.GradientBoostingClassifier` is faster
  for binary and in particular for multiclass problems, thanks to the private loss
  function module.
  :pr:`26278` and :pr:`28095` by :user:`Christian Lorentzen <lorentzenchr>`.

- |Efficiency| Improves runtime and memory usage for
  :class:`ensemble.GradientBoostingClassifier` and
  :class:`ensemble.GradientBoostingRegressor` when trained on sparse data.
  :pr:`26957` by `Thomas Fan`_.

- |Efficiency| :class:`ensemble.HistGradientBoostingClassifier` and
  :class:`ensemble.HistGradientBoostingRegressor` are now faster when `scoring`
  is a predefined metric listed in :func:`metrics.get_scorer_names` and
  early stopping is enabled.
  :pr:`26163` by `Thomas Fan`_.

- |Enhancement| A fitted property, ``estimators_samples_``, was added to all forest
  estimators, including
  :class:`ensemble.RandomForestClassifier`, :class:`ensemble.RandomForestRegressor`,
  :class:`ensemble.ExtraTreesClassifier` and :class:`ensemble.ExtraTreesRegressor`.
  It allows retrieving the training sample indices used for each tree estimator.
  :pr:`26736` by :user:`Adam Li <adam2392>`.

- |Fix| Fixes :class:`ensemble.IsolationForest` when the input is a sparse matrix and
  `contamination` is set to a float value.
  :pr:`27645` by :user:`Guillaume Lemaitre <glemaitre>`.

- |Fix| Raises a `ValueError` in :class:`ensemble.RandomForestRegressor` and
  :class:`ensemble.ExtraTreesRegressor` when requesting the OOB score with a
  multioutput model whose targets are all rounded to integers. It was recognized
  as a multiclass problem.
  :pr:`27817` by :user:`Daniele Ongari <danieleongari>`.

- |Fix| Changes estimator tags to acknowledge that
  :class:`ensemble.VotingClassifier`, :class:`ensemble.VotingRegressor`,
  :class:`ensemble.StackingClassifier`, and :class:`ensemble.StackingRegressor`
  support missing values if all `estimators` support missing values.
  :pr:`27710` by :user:`Guillaume Lemaitre <glemaitre>`.

- |Fix| Support loading pickles of :class:`ensemble.HistGradientBoostingClassifier` and
  :class:`ensemble.HistGradientBoostingRegressor` when the pickle has
  been generated on a platform with a different bitness. A typical example is
  to train and pickle the model on a 64-bit machine and load the model on a
  32-bit machine for prediction.
  :pr:`28074` by :user:`Christian Lorentzen <lorentzenchr>` and
  :user:`Loïc Estève <lesteve>`.

- |API| In :class:`ensemble.AdaBoostClassifier`, the `algorithm` argument `SAMME.R` was
  deprecated and will be removed in 1.6.
  :pr:`26830` by :user:`Stefanie Senger <StefanieSenger>`.

:mod:`sklearn.feature_extraction`
.................................

- |API| Changed error type from :class:`AttributeError` to
  :class:`exceptions.NotFittedError` in unfitted instances of
  :class:`feature_extraction.DictVectorizer` for the following methods:
  :func:`feature_extraction.DictVectorizer.inverse_transform`,
  :func:`feature_extraction.DictVectorizer.restrict`,
  :func:`feature_extraction.DictVectorizer.transform`.
  :pr:`24838` by :user:`Lorenz Hertel <LoHertel>`.

:mod:`sklearn.feature_selection`
................................

- |Enhancement| :class:`feature_selection.SelectKBest`,
  :class:`feature_selection.SelectPercentile`, and
  :class:`feature_selection.GenericUnivariateSelect` now support unsupervised
  feature selection by providing a `score_func` taking `X` and `y=None`.
  :pr:`27721` by :user:`Guillaume Lemaitre <glemaitre>`.

- |Enhancement| :class:`feature_selection.SelectKBest` and
  :class:`feature_selection.GenericUnivariateSelect` with `mode='k_best'`
  now show a warning when `k` is greater than the number of features.
  :pr:`27841` by `Thomas Fan`_.

- |Fix| :class:`feature_selection.RFE` and :class:`feature_selection.RFECV` do
  not check for nans during input validation.
  :pr:`21807` by `Thomas Fan`_.

:mod:`sklearn.inspection`
.........................

- |Enhancement| :class:`inspection.DecisionBoundaryDisplay` now accepts a parameter
  `class_of_interest` to select the class of interest when plotting the response
  provided by `response_method="predict_proba"` or
  `response_method="decision_function"`. It allows plotting the decision boundary for
  both binary and multiclass classifiers.
  :pr:`27291` by :user:`Guillaume Lemaitre <glemaitre>`.

- |Fix| :meth:`inspection.DecisionBoundaryDisplay.from_estimator` and
  :meth:`inspection.PartialDependenceDisplay.from_estimator` now return the correct
  type for subclasses.
  :pr:`27675` by :user:`John Cant <johncant>`.

- |API| :class:`inspection.DecisionBoundaryDisplay` raises an `AttributeError` instead
  of a `ValueError` when an estimator does not implement the requested response method.
  :pr:`27291` by :user:`Guillaume Lemaitre <glemaitre>`.

:mod:`sklearn.kernel_ridge`
...........................

- |Fix| The `degree` parameter in the :class:`kernel_ridge.KernelRidge`
  constructor now accepts real values instead of only integral values in
  accordance with the `degree` parameter of the
  :func:`sklearn.metrics.pairwise.polynomial_kernel`.
  :pr:`27668` by :user:`Nolan McMahon <NolantheNerd>`.

:mod:`sklearn.linear_model`
...........................

- |Efficiency| :class:`linear_model.LogisticRegression` and
  :class:`linear_model.LogisticRegressionCV` now have much better convergence for
  solvers `"lbfgs"` and `"newton-cg"`. Both solvers can now reach much higher precision
  for the coefficients depending on the specified `tol`. Additionally, lbfgs can
  make better use of `tol`, i.e., stop sooner or reach higher precision. This is
  accomplished by better scaling of the objective function, i.e., using average per
  sample losses instead of sum of per sample losses.
  :pr:`26721` by :user:`Christian Lorentzen <lorentzenchr>`.

- |Efficiency| :class:`linear_model.LogisticRegression` and
  :class:`linear_model.LogisticRegressionCV` with solver `"newton-cg"` can now be
  considerably faster for some data and parameter settings. This is accomplished by a
  better line search convergence check for negligible loss improvements that takes into
  account gradient information.
  :pr:`26721` by :user:`Christian Lorentzen <lorentzenchr>`.

- |Efficiency| Solver `"newton-cg"` in :class:`linear_model.LogisticRegression` and
  :class:`linear_model.LogisticRegressionCV` uses a little less memory. The effect is
  proportional to the number of coefficients (`n_features * n_classes`).
  :pr:`27417` by :user:`Christian Lorentzen <lorentzenchr>`.

- |Fix| Ensure that the `sigma_` attribute of
  :class:`linear_model.ARDRegression` and :class:`linear_model.BayesianRidge`
  always has a `float32` dtype when fitted on `float32` data, even with the
  type promotion rules of NumPy 2.
  :pr:`27899` by :user:`Olivier Grisel <ogrisel>`.

- |API| The attribute `loss_function_` of :class:`linear_model.SGDClassifier` and
  :class:`linear_model.SGDOneClassSVM` has been deprecated and will be removed in
  version 1.6.
  :pr:`27979` by :user:`Christian Lorentzen <lorentzenchr>`.

:mod:`sklearn.metrics`
......................

- |Efficiency| Computing pairwise distances via :class:`metrics.DistanceMetric`
  for CSR x CSR, Dense x CSR, and CSR x Dense datasets is now 1.5x faster.
  :pr:`26765` by :user:`Meekail Zain <micky774>`.

- |Efficiency| Computing distances via :class:`metrics.DistanceMetric`
  for CSR x CSR, Dense x CSR, and CSR x Dense now uses ~50% less memory,
  and outputs distances in the same dtype as the provided data.
  :pr:`27006` by :user:`Meekail Zain <micky774>`.

- |Enhancement| Improve the rendering of the plot obtained with the
  :class:`metrics.PrecisionRecallDisplay` and :class:`metrics.RocCurveDisplay`
  classes. The x- and y-axis limits are set to [0, 1] and the aspect ratio between
  both axes is set to 1 to get a square plot.
  :pr:`26366` by :user:`Mojdeh Rastgoo <mrastgoo>`.

- |Enhancement| Added `neg_root_mean_squared_log_error_scorer` as a scorer.
  :pr:`26734` by :user:`Alejandro Martin Gil <101AlexMartin>`.

- |Enhancement| :func:`metrics.confusion_matrix` now warns when only one label was
  found in `y_true` and `y_pred`.
  :pr:`27650` by :user:`Lucy Liu <lucyleeow>`.

- |Fix| Computing pairwise distances with :func:`metrics.pairwise.euclidean_distances`
  no longer raises an exception when `X` is provided as a `float64` array and
  `X_norm_squared` as a `float32` array.
  :pr:`27624` by :user:`Jérôme Dockès <jeromedockes>`.

- |Fix| :func:`metrics.f1_score` now provides correct values when handling various
  cases in which division by zero occurs by using a formulation that does not
  depend on the precision and recall values.
  :pr:`27577` by :user:`Omar Salman <OmarManzoor>` and
  :user:`Guillaume Lemaitre <glemaitre>`.

- |Fix| :func:`metrics.make_scorer` now raises an error when using a regressor on a
  scorer requesting a non-thresholded decision function (from `decision_function` or
  `predict_proba`). Such scorers are specific to classification.
  :pr:`26840` by :user:`Guillaume Lemaitre <glemaitre>`.

- |Fix| :meth:`metrics.DetCurveDisplay.from_predictions`,
  :meth:`metrics.PrecisionRecallDisplay.from_predictions`,
  :meth:`metrics.PredictionErrorDisplay.from_predictions`, and
  :meth:`metrics.RocCurveDisplay.from_predictions` now return the correct type
  for subclasses.
  :pr:`27675` by :user:`John Cant <johncant>`.

- |API| Deprecated `needs_threshold` and `needs_proba` from :func:`metrics.make_scorer`.
  These parameters will be removed in version 1.6. Instead, use `response_method` that
  accepts `"predict"`, `"predict_proba"` or `"decision_function"` or a list of such
  values. `needs_proba=True` is equivalent to `response_method="predict_proba"` and
  `needs_threshold=True` is equivalent to
  `response_method=("decision_function", "predict_proba")`.
  :pr:`26840` by :user:`Guillaume Lemaitre <glemaitre>`.

- |API| The `squared` parameter of :func:`metrics.mean_squared_error` and
  :func:`metrics.mean_squared_log_error` is deprecated and will be removed in 1.6.
  Use the new functions :func:`metrics.root_mean_squared_error` and
  :func:`metrics.root_mean_squared_log_error` instead.
  :pr:`26734` by :user:`Alejandro Martin Gil <101AlexMartin>`.

:mod:`sklearn.model_selection`
..............................

- |Enhancement| :func:`model_selection.learning_curve` raises a warning when
  every cross validation fold fails.
  :pr:`26299` by :user:`Rahil Parikh <rprkh>`.

- |Fix| :class:`model_selection.GridSearchCV`,
  :class:`model_selection.RandomizedSearchCV`, and
  :class:`model_selection.HalvingGridSearchCV` now don't change the given
  object in the parameter grid if it's an estimator.
  :pr:`26786` by `Adrin Jalali`_.

:mod:`sklearn.multioutput`
..........................

- |Enhancement| Add method `predict_log_proba` to :class:`multioutput.ClassifierChain`.
  :pr:`27720` by :user:`Guillaume Lemaitre <glemaitre>`.

:mod:`sklearn.neighbors`
........................

- |Efficiency| :meth:`sklearn.neighbors.KNeighborsRegressor.predict` and
  :meth:`sklearn.neighbors.KNeighborsClassifier.predict_proba` now efficiently support
  pairs of dense and sparse datasets.
  :pr:`27018` by :user:`Julien Jerphanion <jjerphan>`.

- |Efficiency| The performance of :meth:`neighbors.RadiusNeighborsClassifier.predict`
  and of :meth:`neighbors.RadiusNeighborsClassifier.predict_proba` has been improved
  when `radius` is large and `algorithm="brute"` with non-Euclidean metrics.
  :pr:`26828` by :user:`Omar Salman <OmarManzoor>`.

- |Fix| Improve error message for :class:`neighbors.LocalOutlierFactor`
  when it is invoked with `n_samples=n_neighbors`.
  :pr:`23317` by :user:`Bharat Raghunathan <bharatr21>`.

- |Fix| :meth:`neighbors.KNeighborsClassifier.predict` and
  :meth:`neighbors.KNeighborsClassifier.predict_proba` now raise an error when the
  weights of all neighbors of some sample are zero. This can happen when `weights`
  is a user-defined function.
  :pr:`26410` by :user:`Yao Xiao <Charlie-XIAO>`.

- |API| :class:`neighbors.KNeighborsRegressor` now accepts
  :class:`metrics.DistanceMetric` objects directly via the `metric` keyword
  argument allowing for the use of accelerated third-party
  :class:`metrics.DistanceMetric` objects.
  :pr:`26267` by :user:`Meekail Zain <micky774>`.

:mod:`sklearn.preprocessing`
............................

- |Efficiency| :class:`preprocessing.OrdinalEncoder` avoids calculating
  missing indices twice to improve efficiency.
  :pr:`27017` by :user:`Xuefeng Xu <xuefeng-xu>`.

- |Efficiency| Improves efficiency in :class:`preprocessing.OneHotEncoder` and
  :class:`preprocessing.OrdinalEncoder` in checking `nan`.
  :pr:`27760` by :user:`Xuefeng Xu <xuefeng-xu>`.

- |Enhancement| Improves warnings in :class:`preprocessing.FunctionTransformer` when
  `func` returns a pandas dataframe and the output is configured to be pandas.
  :pr:`26944` by `Thomas Fan`_.

- |Enhancement| :class:`preprocessing.TargetEncoder` now supports `target_type`
  'multiclass'.
  :pr:`26674` by :user:`Lucy Liu <lucyleeow>`.

- |Fix| :class:`preprocessing.OneHotEncoder` and :class:`preprocessing.OrdinalEncoder`
  raise an exception when `nan` is a category and is not the last in the user's
  provided categories.
  :pr:`27309` by :user:`Xuefeng Xu <xuefeng-xu>`.

- |Fix| :class:`preprocessing.OneHotEncoder` and :class:`preprocessing.OrdinalEncoder`
  raise an exception if the user provided categories contain duplicates.
  :pr:`27328` by :user:`Xuefeng Xu <xuefeng-xu>`.

- |Fix| :class:`preprocessing.FunctionTransformer` raises an error at `transform` if
  the output of `get_feature_names_out` is not consistent with the column names of the
  output container if those are defined.
  :pr:`27801` by :user:`Guillaume Lemaitre <glemaitre>`.

- |Fix| Raise a `NotFittedError` in :class:`preprocessing.OrdinalEncoder` when calling
  `transform` without calling `fit` since `categories` always needs to be checked.
  :pr:`27821` by :user:`Guillaume Lemaitre <glemaitre>`.

:mod:`sklearn.tree`
...................

- |Feature| :class:`tree.DecisionTreeClassifier`, :class:`tree.DecisionTreeRegressor`,
  :class:`tree.ExtraTreeClassifier` and :class:`tree.ExtraTreeRegressor` now support
  monotonic constraints, useful when features are supposed to have a positive/negative
  effect on the target. Missing values in the train data and multi-output targets are
  not supported.
  :pr:`13649` by :user:`Samuel Ronsin <samronsin>`, initiated by
  :user:`Patrick O'Reilly <pat-oreilly>`.

:mod:`sklearn.utils`
....................

- |Enhancement| :func:`sklearn.utils.estimator_html_repr` dynamically adapts
  diagram colors based on the browser's `prefers-color-scheme`, providing
  improved adaptability to dark mode environments.
  :pr:`26862` by :user:`Andrew Goh Yisheng <9y5>`, `Thomas Fan`_, `Adrin
  Jalali`_.

- |Enhancement| :class:`~utils.metadata_routing.MetadataRequest` and
  :class:`~utils.metadata_routing.MetadataRouter` now have a ``consumes`` method
  which can be used to check whether a given set of parameters would be consumed.
  :pr:`26831` by `Adrin Jalali`_.

- |Enhancement| Make :func:`sklearn.utils.check_array` attempt to output
  `int32`-indexed CSR and COO arrays when converting from DIA arrays if the number of
  non-zero entries is small enough. This ensures that estimators that are implemented
  in Cython and do not accept `int64`-indexed sparse data structures now consistently
  accept the same sparse input formats for SciPy sparse matrices and arrays.
  :pr:`27372` by :user:`Guillaume Lemaitre <glemaitre>`.

- |Fix| :func:`sklearn.utils.check_array` should accept both matrix and array from
  the sparse SciPy module. The previous implementation would fail if `copy=True` by
  calling the NumPy-specific `np.may_share_memory`, which does not work with SciPy
  sparse arrays and does not return the correct result for SciPy sparse matrices.
  :pr:`27336` by :user:`Guillaume Lemaitre <glemaitre>`.

- |Fix| :func:`~utils.estimator_checks.check_estimators_pickle` with
  `readonly_memmap=True` now relies on joblib's own capability to allocate
  aligned memory mapped arrays when loading a serialized estimator instead of
  calling a dedicated private function that would crash when OpenBLAS
  misdetects the CPU architecture.
  :pr:`27614` by :user:`Olivier Grisel <ogrisel>`.

- |Fix| Error message in :func:`~utils.check_array` when a sparse matrix was
  passed but `accept_sparse` is `False` now suggests using `.toarray()` and not
  `X.toarray()`.
  :pr:`27757` by :user:`Lucy Liu <lucyleeow>`.

- |Fix| Fix the function :func:`~utils.check_array` to output the right error message
  when the input is a Series instead of a DataFrame.
  :pr:`28090` by :user:`Stan Furrer <stanFurrer>` and :user:`Yao Xiao <Charlie-XIAO>`.

- |API| :func:`sklearn.utils.extmath.log_logistic` is deprecated and will be removed in 1.6.
  Use `-np.logaddexp(0, -x)` instead.
  :pr:`27544` by :user:`Christian Lorentzen <lorentzenchr>`.
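
The suggested replacement for the deprecated function can be sketched with
NumPy alone. The wrapper name ``log_logistic`` below is a hypothetical local
helper, not the deprecated function itself.

```python
import numpy as np


def log_logistic(x):
    # Hypothetical local helper: log(1 / (1 + exp(-x))) computed stably.
    x = np.asarray(x, dtype=float)
    return -np.logaddexp(0, -x)


x = np.array([-5.0, 0.0, 5.0])
values = log_logistic(x)

# Naive formula, only safe for moderate inputs; used here as a cross-check.
naive = np.log(1.0 / (1.0 + np.exp(-x)))

# The logaddexp form stays finite where the naive formula would overflow.
extreme = log_logistic(-1000.0)
```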

.. rubric:: Code and documentation contributors

Thanks to everyone who has contributed to the maintenance and improvement of
the project since version 1.3, including:

101AlexMartin, Abhishek Singh Kushwah, Adam Li, Adarsh Wase, Adrin Jalali,
Advik Sinha, Alex, Alexander Al-Feghali, Alexis IMBERT, AlexL, Alex Molas, Anam
Fatima, Andrew Goh, andyscanzio, Aniket Patil, Artem Kislovskiy, Arturo Amor,
ashah002, avm19, Ben Holmes, Ben Mares, Benoit Chevallier-Mames, Bharat
Raghunathan, Binesh Bannerjee, Brendan Lu, Brevin Kunde, Camille Troillard,
Carlo Lemos, Chad Parmet, Christian Clauss, Christian Lorentzen, Christian
Veenhuis, Christos Aridas, Cindy Liang, Claudio Salvatore Arcidiacono, Connor
Boyle, cynthias13w, DaminK, Daniele Ongari, Daniel Schmitz, Daniel Tinoco,
David Brochart, Deborah L. Haar, DevanshKyada27, Dimitri Papadopoulos Orfanos,
Dmitry Nesterov, DUONG, Edoardo Abati, Eitan Hemed, Elabonga Atuo, Elisabeth
Günther, Emma Carballal, Emmanuel Ferdman, epimorphic, Erwan Le Floch, Fabian
Egli, Filip Karlo Došilović, Florian Idelberger, Franck Charras, Gael
Varoquaux, Ganesh Tata, Gleb Levitski, Guillaume Lemaitre, Haoying Zhang,
Harmanan Kohli, Ily, ioangatop, IsaacTrost, Isaac Virshup, Iwona Zdzieblo,
Jakub Kaczmarzyk, James McDermott, Jarrod Millman, JB Mountford, Jérémie du
Boisberranger, Jérôme Dockès, Jiawei Zhang, Joel Nothman, John Cant, John
Hopfensperger, Jona Sassenhagen, Jon Nordby, Julien Jerphanion, Kennedy Waweru,
kevin moore, Kian Eliasi, Kishan Ved, Konstantinos Pitas, Koustav Ghosh, Kushan
Sharma, ldwy4, Linus, Lohit SundaramahaLingam, Loic Esteve, Lorenz, Louis
Fouquet, Lucy Liu, Luis Silvestrin, Lukáš Folwarczný, Lukas Geiger, Malte
Londschien, Marcus Fraaß, Marek Hanuš, Maren Westermann, Mark Elliot, Martin
Larralde, Mateusz Sokół, mathurinm, mecopur, Meekail Zain, Michael Higgins,
Miki Watanabe, Milton Gomez, MN193, Mohammed Hamdy, Mohit Joshi, mrastgoo,
Naman Dhingra, Naoise Holohan, Narendra Singh dangi, Noa Malem-Shinitski,
Nolan, Nurseit Kamchyev, Oleksii Kachaiev, Olivier Grisel, Omar Salman, partev,
Peter Hull, Peter Steinbach, Pierre de Fréminville, Pooja Subramaniam, Puneeth
K, qmarcou, Quentin Barthélemy, Rahil Parikh, Rahul Mahajan, Raj Pulapakura,
Raphael, Ricardo Peres, Riccardo Cappuzzo, Roman Lutz, Salim Dohri, Samuel O.
Ronsin, Sandip Dutta, Sayed Qaiser Ali, scaja, scikit-learn-bot, Sebastian
Berg, Shreesha Kumar Bhat, Shubhal Gupta, Søren Fuglede Jørgensen, Stefanie
Senger, Tamara, Tanjina Afroj, THARAK HEGDE, thebabush, Thomas J. Fan, Thomas
Roehr, Tialo, Tim Head, tongyu, Venkatachalam N, Vijeth Moudgalya, Vincent M,
Vivek Reddy P, Vladimir Fokow, Xiao Yuan, Xuefeng Xu, Yang Tao, Yao Xiao,
Yuchen Zhou, Yuusuke Hiramatsu