NLTK testing
============

1. Obtain the NLTK source code.
2. Install virtualenv and tox::

       pip install virtualenv
       pip install tox

3. Make sure the currently supported Python versions
   and PyPy executables are on the system PATH. It is OK not to have all the
   executables; tests will be executed for the available interpreters.

4. Make sure all NLTK data is downloaded (see ``nltk.download()``).

5. Run the ``tox`` command from the root nltk folder. It will install dependencies
   and run ``pytest`` for all available interpreters.
   You may also pass pytest options after ``--`` (for example, ``-v`` for verbose output).

The first run may take a long time, but subsequent runs will
be much faster.

Please consult https://tox.wiki for more info about the tox tool.
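
Step 3 above can be checked before launching tox. The following is a minimal
sketch, not part of NLTK: the executable names in ``CANDIDATE_NAMES`` are
assumptions chosen for illustration and should be adjusted to the interpreters
that ``tox.ini`` actually targets (on Windows the names will differ).

```python
import shutil

# Hypothetical executable names; edit to match the environments in tox.ini.
CANDIDATE_NAMES = ["python3.12", "python3.13", "pypy3"]


def available_interpreters(names):
    """Map each executable name that is found on PATH to its full path."""
    found = {}
    for name in names:
        path = shutil.which(name)  # returns None when the name is not on PATH
        if path is not None:
            found[name] = path
    return found


if __name__ == "__main__":
    for name, path in sorted(available_interpreters(CANDIDATE_NAMES).items()):
        print(name, "->", path)
```

Names missing from the output are simply not on PATH, so no tests would run
for those interpreters.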

Examples
--------

Run tests for Python 3.12 in verbose mode, executing only the tests
that failed in the last test run::

    tox -e py312 -- -v --last-failed

Run tree doctests for all available interpreters::

    tox -- tree.doctest

Run a selected unit test for Python 3.12::

    tox -e py312 -- -v nltk.test.unit.test_seekable_unicode_stream_reader

By default, numpy, scipy and scikit-learn are installed in tox virtualenvs.
This is slow, requires a working build toolchain and is not always feasible.
To skip numpy and friends, use the ``..-nodeps`` environments::

    tox -e py312-nodeps,py312,pypy

It is also possible to run tests without tox. This way NLTK is tested
under only a single interpreter, but it may be easier to have numpy and other
libraries installed this way. To run tests without tox, first
``pip install -r test-requirements.txt`` and then run ``pytest``::

    pytest nltk/test/


Writing tests
-------------

Unlike most open-source projects, the NLTK test suite is doctest-based.
This format is very expressive, and doctests are usually read
as documentation. We don't want to rewrite them as unittests;
if you're contributing code to NLTK, please prefer doctests
for testing.

Doctests are located in the ``nltk/test/*.doctest`` text files and
in docstrings for modules, classes, methods and functions.
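
As an illustration of a docstring doctest, here is a minimal sketch. The
``bigrams`` helper is a hypothetical function written for this example (it is
not NLTK's own implementation), and ``check_docstring`` runs the example with
the standard library ``doctest`` module directly rather than through pytest.

```python
import doctest


def bigrams(tokens):
    """Return adjacent token pairs (hypothetical helper for illustration).

    >>> bigrams(["a", "b", "c"])
    [('a', 'b'), ('b', 'c')]
    """
    return list(zip(tokens, tokens[1:]))


def check_docstring(func):
    """Run the examples embedded in ``func.__doc__`` and return the results."""
    parser = doctest.DocTestParser()
    # The globals dict makes the function visible to its own examples.
    test = parser.get_doctest(
        func.__doc__, {func.__name__: func}, func.__name__, None, 0
    )
    return doctest.DocTestRunner(verbose=False).run(test)


results = check_docstring(bigrams)
print("failed:", results.failed, "attempted:", results.attempted)
```

The same ``>>>`` syntax is used in the standalone ``.doctest`` text files.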

That said, doctests have their limitations and sometimes it is better to use
unittests. A test should be written as a unittest if any of the following apply:

* the test deals with non-ASCII Unicode and Python 2.x support is required;
* the test is a regression test that is not needed for documentation purposes.

Unittests currently reside in ``nltk/test/unit/test_*.py`` files; pytest
is used to run them.

If a test should be written as a unittest but also has documentation value,
then it should be duplicated as a doctest, but with the ``# doctest: +SKIP`` option.
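
The effect of the ``+SKIP`` directive can be seen in a small sketch using the
standard library ``doctest`` module. The doctest text and the name
``fetch_remote_corpus`` are hypothetical; the function is deliberately left
undefined so that the example would fail with ``NameError`` if it were ever
executed.

```python
import doctest

# Hypothetical doctest text: the second example is documentation only.
SKIPPED_EXAMPLE = """
>>> 1 + 1
2
>>> fetch_remote_corpus()  # doctest: +SKIP
'shown in the docs, never run'
"""


def run_doctest_text(text, name="skip_demo"):
    """Parse ``text`` as a doctest, run it, and return the results."""
    test = doctest.DocTestParser().get_doctest(text, {}, name, None, 0)
    return doctest.DocTestRunner(verbose=False).run(test)


results = run_doctest_text(SKIPPED_EXAMPLE)
print("failed examples:", results.failed)
```

Because the skipped example is never run, it still reads as documentation
while the real check lives in the corresponding unittest.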

There are some gotchas with NLTK doctests (and with doctests in general):

* Use ``print("foo")``, not ``print "foo"``: NLTK doctests act
  as if ``from __future__ import print_function`` is in use.

* Don't write ``+ELLIPSIS``, ``+NORMALIZE_WHITESPACE``,
  ``+IGNORE_EXCEPTION_DETAIL`` flags (they are already ON by default in NLTK).

* Do not write doctests that have non-ASCII output (they are not supported in
  Python 2.x). Incorrect::

      >>> greeting
      u'Привет'

  The proper way is to rewrite such a doctest as a unittest.

* To conditionally skip a doctest in a separate
  ``nltk/test/foo.doctest`` file, create an ``nltk/test/foo_fixt.py``
  file from the following template::

      # <a comment describing why the test should be skipped>

      def setup_module(module):
          import pytest

          if some_condition:
              pytest.skip("foo.doctest is skipped because <...>")

* To conditionally skip all doctests from the module/class/function
  docstrings, put the following function in the top-level module namespace::

      # <a comment describing why the tests from this module should be skipped>

      def setup_module(module):
          import pytest

          if some_condition:
              pytest.skip("doctests from nltk.<foo>.<bar> are skipped because <...>")

  It is a good idea to define ``__all__`` in such a module and omit
  ``setup_module`` from ``__all__``.

  It is not possible to conditionally skip only some doctests from a module.

* Do not expect exact float output; this may fail on some machines::

      >>> some_float_constant
      0.867

  Use an ellipsis in this case to make the test robust (or compare the values)::

      >>> some_float_constant
      0.867...

      >>> abs(some_float_constant - 0.867) < 1e-6
      True

* Do not rely on dictionary or set item order. Incorrect::

      >>> some_dict
      {'x': 10, 'y': 20}

  The proper way is to sort the items and print them::

      >>> for key, value in sorted(some_dict.items()):
      ...     print(key, value)
      x 10
      y 20
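
The float and ordering gotchas above can be tried out with the standard
library ``doctest`` module. This is a minimal sketch: the doctest text and the
names ``ratio`` and ``counts`` are made up for illustration, and the option
flags are passed explicitly here to stand in for the defaults that the NLTK
test setup is described as enabling.

```python
import doctest

# Hypothetical doctest text; ``ratio`` and ``counts`` are illustrative names.
EXAMPLES = """
>>> ratio = 2 / 3
>>> ratio
0.666...
>>> counts = {"y": 20, "x": 10}
>>> for key, value in sorted(counts.items()):
...     print(key, value)
x 10
y 20
"""


def run_examples(text, optionflags=0):
    """Parse ``text`` as a doctest and run it with the given option flags."""
    test = doctest.DocTestParser().get_doctest(text, {}, "gotchas", None, 0)
    runner = doctest.DocTestRunner(verbose=False, optionflags=optionflags)
    return runner.run(test)


# With ELLIPSIS enabled, "0.666..." matches the full float repr.
with_flags = run_examples(EXAMPLES, doctest.ELLIPSIS | doctest.NORMALIZE_WHITESPACE)

# Without the flag, "..." is compared literally, so the float example fails
# and doctest prints a failure report for it.
without_flags = run_examples(EXAMPLES, 0)

print("failures with flags:", with_flags.failed)
print("failures without flags:", without_flags.failed)
```

The sorted loop passes in both runs, since its output does not depend on the
order in which the dictionary was built.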

If the code requires external dependencies, then:

* tests for this code should be skipped if the dependencies are not available:
  use ``setup_module`` for doctests (as described above) and the
  ``@pytest.mark.skipif`` / ``@pytest.mark.skip`` decorators or a ``pytest.skip()``
  call for unittests;
* if the dependency is a Python package, it should be added to ``tox.ini``
  (but not to the ``..-nodeps`` environments).
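
One possible way to fill in ``some_condition`` from the ``setup_module``
templates is to check whether the dependency can be located. The sketch below
is an assumption about how such a check might look, not code taken from NLTK:
``REQUIRED_PACKAGE`` and ``is_installed`` are hypothetical names, and
``"numpy"`` is used only as an example dependency.

```python
import importlib.util

# Hypothetical dependency name, used only for illustration.
REQUIRED_PACKAGE = "numpy"


def is_installed(package_name):
    """Return True if a top-level package can be located without importing it."""
    return importlib.util.find_spec(package_name) is not None


def setup_module(module=None):
    # pytest is imported lazily, mirroring the templates in this document.
    import pytest

    if not is_installed(REQUIRED_PACKAGE):
        pytest.skip(
            f"doctests are skipped because {REQUIRED_PACKAGE} is not installed"
        )
```

For a unittest, the same predicate could be passed to the decorator form, for
example ``@pytest.mark.skipif(not is_installed("numpy"), reason="numpy is not installed")``.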