File: maintainer.rst.template

package info (click to toggle)
scikit-learn 1.7.2%2Bdfsg-3
  • links: PTS, VCS
  • area: main
  • in suites: forky, sid
  • size: 25,752 kB
  • sloc: python: 219,120; cpp: 5,790; ansic: 846; makefile: 191; javascript: 110
file content (486 lines) | stat: -rw-r--r-- 22,487 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
Maintainer Information
======================

Releasing
---------

This section is about preparing a major/minor release, a release candidate (RC), or a
bug-fix release. We follow `PEP440 <https://www.python.org/dev/peps/pep-0440/>`_ for
the version scheme and to indicate different types of releases. Our convention is to
follow the "major.minor.micro" scheme, although in practice there is no fundamental
difference between major and minor releases and micro releases are bug-fix releases.

We adopted the following release schedule:

- Major/Minor releases every 6 months, usually in May and November. These releases
  are numbered `X.Y.0` and are preceded by one or more release candidates `X.Y.0rcN`.
- Bug-fix releases are done as needed between major/minor releases and only apply to
  the last stable version. These releases are numbered `X.Y.Z`.

.. rubric:: Preparation

- Confirm that all blockers tagged for the milestone have been resolved, and that other
  issues tagged for the milestone can be postponed.

- Make sure the deprecations, FIXMEs, and TODOs tagged for the release have been taken
  care of.

- Make sure that the minimum supported versions of our dependencies have been bumped, see
  :ref:`bumping_dependencies_guideline` for details.

- For major/minor final releases, make sure that a *Release Highlights* page has been
  done as a runnable example and check that its HTML rendering looks correct. It should
  be linked from the what's new file for the new version of scikit-learn.

.. rubric:: Permissions

- The release manager must be a **maintainer** of the
  https://github.com/scikit-learn/scikit-learn repository to be able to publish on
  `pypi.org` and `test.pypi.org` (via a manual trigger of a dedicated Github Actions
  workflow).

- The release manager must be a **maintainer** of the
  https://github.com/conda-forge/scikit-learn-feedstock repository to be able to publish
  on `conda-forge`. This can be changed by editing the `recipe/meta.yaml` file in the
  first release pull request.

Reference Steps
^^^^^^^^^^^^^^^

.. tab-set::

  {% for key in ["rc", "final", "bf"] %}
  {%- if key == "rc" -%}
    {%- set title = "Major/Minor RC" -%}
  {%- elif key == "final" -%}
    {%- set title = "Major/Minor Final" -%}
  {%- else -%}
    {%- set title = "Bug-fix" -%}
  {%- endif -%}

  {%- set version_full = inferred["version_full"][key] -%}
  {%- set version_short = inferred["version_short"][key] -%}
  {%- set previous_tag = inferred["previous_tag"][key] -%}

  .. tab-item:: {{ title }}
    :class-label: tab-4

    Suppose that we are preparing the release `{{ version_full }}`.

    {% if key == "rc" %}
    The first RC ideally counts as a **feature freeze**. Each coming release candidate
    and the final release afterwards should include only minor documentation changes
    and bug fixes. Any major enhancement or new feature should be excluded.

    - Create the release branch `{{ version_short }}.X` directly in the main repository,
      where `X` is really the letter X, **not a placeholder**. The development for the
      final and subsequent bug-fix releases of `{{ version_short }}` should also happen
      under this branch with different tags.

      .. prompt:: bash

        git fetch upstream main
        git checkout upstream/main
        git checkout -b {{ version_short }}.X
        git push --set-upstream upstream {{ version_short }}.X
    {% endif %}

    {% if key != "rc" %}
    - Create a new branch from the `main` branch, then start an interactive rebase from
      `{{ version_short }}.X` to select the commits that need to be backported:

      .. prompt:: bash

        git rebase -i upstream/{{ version_short }}.X

      This will open an interactive rebase with the `git-rebase-todo` containing all the
      latest commits on `main`. At this stage, you have to perform this interactive
      rebase with at least someone else (to not forget something and to avoid doubts).

      - Do not remove lines but drop commit by replacing `pick` with `drop`.
      - Commits to pick for a bug-fix release are *generally* prefixed with `FIX`, `CI`,
        and `DOC`. They should at least include all the commits of the merged PRs that
        were milestoned for this release.
      - Commits to `drop` for a bug-fix release are *generally* prefixed with `FEAT`,
        `MAINT`, `ENH`, and `API`. Reasons for not including them are to prevent change
        of behavior (which should only happen in major/minor releases).
      - After having dropped or picked commits, **do not exit** but paste the content of
        the `git-rebase-todo` message in the PR. This file is located at
        `.git/rebase-merge/git-rebase-todo`.
      - Save and exit to start the interactive rebase. Resolve merge conflicts when
        necessary.
    {% endif %}

    - Create a PR targeting the `{{ version_short }}.X` branch.
      Copy the following release checklist to the description of this PR to track the
      progress.

      .. code-block:: markdown

        {% if key == "rc" -%}
        * [ ] Update the sklearn dev0 version in main branch
        {%- endif %}
        * [ ] Set the version number in the release branch
        {%- if key == "rc" %}
        * [ ] Set an upper bound on build dependencies in the release branch
        {%- endif %}
        * [ ] Generate the changelog in the release branch
        * [ ] Check that the wheels for the release can be built successfully
        * [ ] Merge the PR with `[cd build]` commit message to upload wheels to the staging repo
        * [ ] Upload the wheels and source tarball to https://test.pypi.org
        * [ ] Create tag on the main repo
        * [ ] Confirm bot detected at https://github.com/conda-forge/scikit-learn-feedstock
              and wait for merge
        * [ ] Upload the wheels and source tarball to PyPI
        {%- if key != "rc" %}
        * [ ] Update news and what's new date in main branch
        * [ ] Backport news and what's new date in release branch
        {%- endif %}
        {%- if key == "final" %}
        * [ ] Update symlink for stable in https://github.com/scikit-learn/scikit-learn.github.io
        {%- endif %}
        {%- if key != "rc" %}
        * [ ] Publish to https://github.com/scikit-learn/scikit-learn/releases
        {%- endif %}
        * [ ] Announce on mailing list and on social media platforms (LinkedIn, Bluesky, etc.)
        {%- if key != "rc" %}
        * [ ] Update SECURITY.md in main branch
        {%- endif %}

    {% if key == "rc" %}
    - Create a PR from `main` and targeting `main` to prepare for the next version. In
      this PR you need to:

      - Increment the dev0 `__version__` variable in `sklearn/__init__.py`. This means
        that while we are in the release candidate period, the latest stable is two
        versions behind the `main` branch, instead of one.

      - Include a new what's new file under the `doc/whats_new/` directory. Don't forget
        to add an entry for this new file in `doc/whats_new.rst`.

      - Change the what's new file to the newly created one in the `filename` field of
        the `tool.towncrier` section in `pyproject.toml`.
    {% endif %}

    - In the release branch, change the version number `__version__` in
      `sklearn/__init__.py` to `{{ version_full }}`.

    {% if key == "rc" %}
    - Still in the release branch, set or update the upper bound on the build
      dependencies in the `[build-system]` section of `pyproject.toml`. The goal is to
      prevent future backward incompatible releases of the dependencies to break the
      build in the maintenance branch.

      The upper bounds should match the latest already-released minor versions of the
      dependencies and should allow future micro (bug-fix) versions. For instance, if
      numpy 2.2.5 is the most recent version, its upper bound should be set to <2.3.0.
    {% endif %}

    - In the release branch, generate the changelog for the incoming version, i.e.,
      `doc/whats_new/{{ version_short }}.rst`.
      {%- if key == "rc" %}
      During the RC period we want to keep the fragments when we generate the changelog
      because we'll generate it again for the final release, including the changes that
      may happen in between:

      .. prompt:: bash

        towncrier build --keep --version {{ version_short }}.0

      {%- else %}
      For a non RC release, push a commit where you:

      - Generate the changelog, not keeping the fragments.

        .. prompt:: bash

          towncrier build --version {{ version_full }}

      {% if key == "final" -%}
      - Link the release highlights example.
      {% endif -%}

      - Add the list of contributor names. Suppose that the tag of the last release in
        the previous major/minor version is `{{ previous_tag }}`, then you can use the
        following command to retrieve the list of contributor names:

        .. prompt:: bash

          git shortlog -s {{ previous_tag }}.. |
            cut -f2- |
            sort --ignore-case |
            tr "\n" ";" |
            sed "s/;/, /g;s/, $//" |
            fold -s

      Then create a PR targeting the `main` branch and cherry-pick this commit there.
      {%- endif %}

    - Trigger the wheel builder with the `[cd build]` commit marker. See also the
      `workflow runs of the wheel builder
      <https://github.com/scikit-learn/scikit-learn/actions/workflows/wheels.yml>`_.

      .. prompt:: bash

        git commit --allow-empty -m "[cd build] Trigger wheel builder workflow"

      .. note::

        The acronym CD in `[cd build]` stands for `Continuous Delivery
        <https://en.wikipedia.org/wiki/Continuous_delivery>`_ and refers to the
        automation used to generate the release artifacts (binary and source
        packages). This can be seen as an extension to CI which stands for `Continuous
        Integration <https://en.wikipedia.org/wiki/Continuous_integration>`_. The CD
        workflow on GitHub Actions is also used to automatically create nightly builds
        and publish packages for the development branch of scikit-learn. See also
        :ref:`install_nightly_builds`.

    - Once all the CD jobs have completed successfully in the PR, merge it with the
      `[cd build]` marker in the commit message. This time the results will be
      uploaded to the staging area. You should then be able to upload the generated
      artifacts (`.tar.gz` and `.whl` files) to https://test.pypi.org/ using the "Run
      workflow" form for the `PyPI publishing workflow
      <https://github.com/scikit-learn/scikit-learn/actions/workflows/publish_pypi.yml>`_.

      .. warning::

        This PR should be merged with the rebase mode instead of the usual squash mode
        because we want to keep the history in the `{{ version_short }}.X` branch close
        to the history of the main branch which will help for future bug fix releases.

        In addition if on merging, the last commit, containing the `[cd build]` marker,
        is empty, the CD jobs won't be triggered. In this case, you can directly push
        a commit with the marker in the `{{ version_short }}.X` branch to trigger them.

    - If the steps above went fine, proceed **with caution** to create a new tag for the
      release. This should be done only when you are almost certain that the release is
      ready, since adding a new tag to the main repository can trigger certain automated
      processes.

      .. prompt:: bash

        git tag -a {{ version_full }}  # in the {{ version_short }}.X branch
        git push git@github.com:scikit-learn/scikit-learn.git {{ version_full }}

      .. warning::

        Don't use the github interface for publishing the release as a way to create the
        tag because it will automatically send notifications to all users that follow
        the repo even though the website isn't updated and wheels aren't uploaded yet.

    - Confirm that the bot has detected the tag on the conda-forge feedstock repository
      https://github.com/conda-forge/scikit-learn-feedstock. If not, submit a PR for the
      release, targeting the `{% if key == "rc" %}rc{% else %}main{% endif %}` branch.

      {%- if key == "rc" %}
      Make sure to update the PR such that it will be synchronized with the `main`
      branch. In particular, backport migrations that may have been added since the last
      release.
      {% endif %}

    - Trigger the `PyPI publishing workflow
      <https://github.com/scikit-learn/scikit-learn/actions/workflows/publish_pypi.yml>`_
      again, but this time to upload the artifacts to the real https://pypi.org/. To do
      so, replace `testpypi` with `pypi` in the "Run workflow" form.

      **Alternatively**, it is possible to collect locally the generated binary wheel
      packages and source tarball and upload them all to PyPI.

      .. dropdown:: Uploading artifacts from local

        Check out at the release tag and run the following commands.

        .. prompt:: bash

          rm -r dist
          python -m pip install -U wheelhouse_uploader twine
          python -m wheelhouse_uploader fetch \
            --version {{ version_full }} --local-folder dist scikit-learn \
            https://pypi.anaconda.org/scikit-learn-wheels-staging/simple/scikit-learn/

        These commands will download all the binary packages accumulated in the `staging
        area on the anaconda.org hosting service
        <https://anaconda.org/scikit-learn-wheels-staging/scikit-learn/files>`_ and put
        them in your local `./dist` folder. Check the contents of the `./dist` folder:
        it should contain all the wheels along with the source tarball `.tar.gz`. Make
        sure you do not have developer versions or older versions of the scikit-learn
        package in that folder. Before uploading to PyPI, you can test uploading to
        `test.pypi.org` first.

        .. prompt:: bash

          twine upload --verbose --repository-url https://test.pypi.org/legacy/ dist/*

        Then upload everything at once to `pypi.org`.

        .. prompt:: bash

          twine upload dist/*

    {% if key != "rc" %}
    - In the `main` branch, edit `doc/templates/index.html` to change the "News" section
      in the landing page, along with the month of the release.
      {%- if key == "final" %}
      Do not forget to remove old entries (two years or three releases ago) and update
      the "On-going development" entry.
      {%- endif %}
      Then cherry-pick it in the release branch.
    {% endif %}

    {% if key == "final" %}
    - Update the symlink for `stable` and the `latestStable` variable in
      `versionwarning.js` in https://github.com/scikit-learn/scikit-learn.github.io.

      .. prompt:: bash

        cd /tmp
        git clone --depth 1 --no-checkout git@github.com:scikit-learn/scikit-learn.github.io.git
        cd scikit-learn.github.io
        echo stable > .git/info/sparse-checkout
        git checkout main
        rm stable
        ln -s {{ version_short }} stable
        sed -i "s/latestStable = '.*/latestStable = '{{ version_short }}';/" versionwarning.js
        git add stable versionwarning.js
        git commit -m "Update stable to point to {{ version_short }}"
        git push origin main
    {% endif %}

    {% if key != "rc" %}
    - Publish the release at https://github.com/scikit-learn/scikit-learn/releases and
      announce it on the mailing list and social networks. Remember to add a link to the
      changelog in the release note. Ideally, only perform this step once the package
      is available both on PyPI and conda-forge and once the website is up to date.
    {% endif %}

    {% if key != "rc" %}
    - Update `SECURITY.md` to reflect the latest supported version `{{ version_full }}`.
    {% endif %}
  {% endfor %}

Updating Authors List
---------------------

This section is about updating :ref:`authors`. First create a `classic token on GitHub
<https://github.com/settings/tokens/new>`_ with the `read:org` permission. Then run the
following script and enter the token when prompted:

.. prompt:: bash

  cd build_tools
  make authors  # Enter the token when prompted

.. _bumping_dependencies_guideline:

Guideline for bumping minimum versions of our dependencies
----------------------------------------------------------

- **minimum Python version**: at the time of a minor scikit-learn release (`X.Y.0`),
  we drop the Python version with an initial release date of more than 4 years
  ago. In other words, our minimum Python version is between 3 and 4 years old.
- **compiled dependencies** (numpy, scipy, as well as compiled optional
  dependencies (pandas, matplotlib, pyamg, pillow, ...): we take the oldest minor
  release (`X.Y.0`) that has wheels for our minimum Python version. In practice
  this means that our minimum supported version is around 3 years old, maybe a
  bit less.
- **pure Python dependencies** (joblib, threadpoolctl): at the time of the
  scikit-learn release our minimum supported version is the most recent minor
  release (`X.Y.0`) that is at least 2 years old.
- we may decide to be less conservative than this guideline in some edge cases.
  These edge cases include: a security bugfix in one of our dependencies or a
  critical bugfix in one of our dependencies makes it too costly to support it in
  terms of maintenance.

`maint_tools/bump-dependencies-versions.py` implements these rules and can be
used to give the new minimum dependency versions. It takes as input the
expected scikit-learn release date, for example:

.. code:: bash

    python maint_tools/bump-dependencies-versions.py 2025-12-01

Merging Pull Requests
---------------------

Individual commits are squashed when a PR is merged on GitHub. Before merging:

- The resulting commit title can be edited if necessary. Note that this will rename the
  PR title by default.
- The detailed description, containing the titles of all the commits, can be edited or
  deleted.
- For PRs with multiple code contributors, care must be taken to keep the
  `Co-authored-by: name <name@example.com>` tags in the detailed description. This will
  mark the PR as having `multiple co-authors
  <https://help.github.com/en/github/committing-changes-to-your-project/creating-a-commit-with-multiple-authors>`_.
  Whether code contributions are significantly enough to merit co-authorship is left to
  the maintainer's discretion, same as for the what's new entry.

The `scikit-learn.org` Website
------------------------------

The scikit-learn website (https://scikit-learn.org) is hosted on GitHub, but should
rarely be updated manually by pushing to the
https://github.com/scikit-learn/scikit-learn.github.io repository. Most updates can be
made by pushing to `main` (for `/dev`) or a release branch `A.B.X`, from which Circle CI
builds and uploads the documentation automatically.

Experimental Features
---------------------

The :mod:`sklearn.experimental` module was introduced in 0.21 and contains
experimental features and estimators that are subject to change without
deprecation cycle.

To create an experimental module, refer to the contents of `enable_halving_search_cv.py
<https://github.com/scikit-learn/scikit-learn/blob/362cb92bb2f5b878229ea4f59519ad31c2fcee76/sklearn/experimental/enable_halving_search_cv.py>`__,
or `enable_iterative_imputer.py
<https://github.com/scikit-learn/scikit-learn/blob/c9c89cfc85dd8dfefd7921c16c87327d03140a06/sklearn/experimental/enable_iterative_imputer.py>`__.

.. note::

  These are permalinks as in 0.24, where these estimators are still experimental. They
  might be stable at the time of reading, hence the permalink. See below for
  instructions on the transition from experimental to stable.

Note that the public import path must be to a public subpackage (like `sklearn/ensemble`
or `sklearn/impute`), not just a `.py` module. Also, the (private) experimental features
that are imported must be in a submodule/subpackage of the public subpackage, e.g.
`sklearn/ensemble/_hist_gradient_boosting/` or `sklearn/impute/_iterative.py`. This is
needed so that pickles still work in the future when the features aren't experimental
anymore.

To avoid type checker (e.g. `mypy`) errors a direct import of experimental estimators
should be done in the parent module, protected by the `if typing.TYPE_CHECKING` check.
See `sklearn/ensemble/__init__.py
<https://github.com/scikit-learn/scikit-learn/blob/c9c89cfc85dd8dfefd7921c16c87327d03140a06/sklearn/ensemble/__init__.py>`__,
or `sklearn/impute/__init__.py
<https://github.com/scikit-learn/scikit-learn/blob/c9c89cfc85dd8dfefd7921c16c87327d03140a06/sklearn/impute/__init__.py>`__
for an example. Please also write basic tests following those in
`test_enable_hist_gradient_boosting.py
<https://github.com/scikit-learn/scikit-learn/blob/c9c89cfc85dd8dfefd7921c16c87327d03140a06/sklearn/experimental/tests/test_enable_hist_gradient_boosting.py>`__.

Make sure every user-facing code you write explicitly mentions that the feature is
experimental, and add a `# noqa` comment to avoid PEP8-related warnings::

  # To use this experimental feature, we need to explicitly ask for it
  from sklearn.experimental import enable_iterative_imputer  # noqa
  from sklearn.impute import IterativeImputer

For the docs to render properly, please also import `enable_my_experimental_feature` in
`doc/conf.py`, otherwise sphinx will not be able to detect and import the corresponding
modules. Note that using `from sklearn.experimental import *` **does not work**.

.. note::

  Some experimental classes and functions may not be included in the
  :mod:`sklearn.experimental` module, e.g., `sklearn.datasets.fetch_openml`.

Once the feature becomes stable, remove all occurrences of
`enable_my_experimental_feature` in the scikit-learn code base and make the
`enable_my_experimental_feature` a no-op that just raises a warning, as in
`enable_hist_gradient_boosting.py
<https://github.com/scikit-learn/scikit-learn/blob/main/sklearn/experimental/enable_hist_gradient_boosting.py>`__.
The file should stay there indefinitely as we do not want to break users' code; we just
incentivize them to remove that import with the warning. Also remember to update the
tests accordingly, see `test_enable_hist_gradient_boosting.py
<https://github.com/scikit-learn/scikit-learn/blob/main/sklearn/experimental/tests/test_enable_hist_gradient_boosting.py>`__.