File: formats.rst

package info (click to toggle)
python-tablib 3.9.0-1
  • links: PTS, VCS
  • area: main
  • in suites: forky, sid
  • size: 744 kB
  • sloc: python: 3,827; makefile: 125
file content (347 lines) | stat: -rw-r--r-- 10,348 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
.. _formats:

=======
Formats
=======

Tablib supports a wide variety of different tabular formats, both for input and
output. Moreover, you can :ref:`register your own formats <newformats>`.

cli
===

The ``cli`` format is currently export-only. The exports produce a representation
table suited to a terminal.

When exporting to a CLI you can pass the table format  with the ``tablefmt``
parameter, the supported formats are::

    >>> import tabulate
    >>> list(tabulate._table_formats)
    ['simple', 'plain', 'grid', 'fancy_grid', 'github', 'pipe', 'orgtbl',
     'jira', 'presto', 'psql', 'rst', 'mediawiki', 'moinmoin', 'youtrack',
     'html', 'latex', 'latex_raw', 'latex_booktabs', 'tsv', 'textile']

For example::

    dataset.export("cli", tablefmt="github")
    dataset.export("cli", tablefmt="grid")

This format is optional, install Tablib with ``pip install "tablib[cli]"`` to
make the format available.

csv
===

When you import CSV data, you can specify if the first line of your data source
is headers with the ``headers`` boolean parameter (defaults to ``True``)::

    import tablib

    tablib.import_set(your_data_stream, format='csv', headers=False)

It is also possible to provide the ``skip_lines`` parameter for the number of
lines that should be skipped before starting to read data.

.. versionchanged:: 3.1.0

    The ``skip_lines`` parameter was added.

When exporting with the ``csv`` format, the top row will contain headers, if
they have been set. Otherwise, the top row will contain the first row of the
dataset.

When importing a CSV data source or exporting a dataset as CSV, you can pass any
parameter supported by the :py:func:`csv.reader` and :py:func:`csv.writer`
functions. For example::

    tablib.import_set(your_data_stream, format='csv', dialect='unix')

    dataset.export('csv', delimiter=' ', quotechar='|')

.. admonition:: Line endings

     Exporting uses \\r\\n line endings by default so, make sure to include
     ``newline=''`` otherwise you will get a blank line between each row
     when you open the file in Excel::

         with open('output.csv', 'w', newline='') as f:
             f.write(dataset.export('csv'))

     If you do not do this, and you export the file on Windows, your
     CSV file will open in Excel with a blank line between each row.

dbf
===

Import/export using the dBASE_ format.

.. admonition:: Binary Warning

    The ``dbf`` format contains binary data, so make sure to write in binary
    mode::

        with open('output.dbf', 'wb') as f:
            f.write(dataset.export('dbf')

.. _dBASE: https://en.wikipedia.org/wiki/DBase

df (DataFrame)
==============

Import/export using the pandas_ DataFrame format. This format is optional,
install Tablib with ``pip install "tablib[pandas]"`` to make the format available.

.. _pandas: https://pandas.pydata.org/

html
====

The exports produce an HTML page with the data in a ``<table>``. If headers have
been set, they will be used as table headers (``thead``).

When you import HTML, you can specify a specific table to import by providing
the ``table_id`` argument::

    import tablib

    tablib.import_set(your_html, format='html', table_id='some_table_id')

Otherwise, the first table found will be imported.

.. versionchanged:: 3.6.0

    The ability to import HTML was added. The dependency on MarkupPy was dropped.

jira
====

The ``jira`` format is currently export-only. Exports format the dataset
according to the Jira table syntax::

    ||heading 1||heading 2||heading 3||
    |col A1|col A2|col A3|
    |col B1|col B2|col B3|

json
====

Import/export using the JSON_ format. If headers have been set, a JSON list of
objects will be returned. If no headers have been set, a JSON list of lists
(rows) will be returned instead.

Import assumes (for now) that headers exist.

.. _JSON: http://json.org/

latex
=====

Import/export using the LaTeX_ format. This format is export-only.
If a title has been set, it will be exported as the table caption.

.. _LaTeX: https://www.latex-project.org/

ods
===

Import/export data in OpenDocument Spreadsheet format.

.. versionadded:: 3.6.0

    Import functionality was added.

This format is optional, install Tablib with ``pip install "tablib[ods]"`` to
make the format available.

The ``import_set()`` method also supports a ``skip_lines`` parameter that you
can set to a number of lines that should be skipped before starting to read
data.

.. admonition:: Binary Warning

    :class:`Dataset.ods` contains binary data, so make sure to write in binary mode::

        with open('output.ods', 'wb') as f:
            f.write(data.ods)

rst
===

Export data as a reStructuredText_ table representation of a dataset. The
``rst`` format is export-only.

Exporting returns a simple table if the text in the first column is never
wrapped, otherwise returns a grid table::

    >>> from tablib import Dataset
    >>> bits = ((0, 0), (1, 0), (0, 1), (1, 1))
    >>> data = Dataset()
    >>> data.headers = ['A', 'B', 'A and B']
    >>> for a, b in bits:
    ...     data.append([bool(a), bool(b), bool(a * b)])
    >>> table = data.export('rst')
    >>> table.split('\\n') == [
    ...     '=====  =====  =====',
    ...     '  A      B    A and',
    ...     '                B  ',
    ...     '=====  =====  =====',
    ...     'False  False  False',
    ...     'True   False  False',
    ...     'False  True   False',
    ...     'True   True   True ',
    ...     '=====  =====  =====',
    ... ]
    True

.. _reStructuredText: http://docutils.sourceforge.net/rst.html

tsv
===

A variant of the csv_ format with tabulators as fields separators.

xls
===

Import/export data in Legacy Excel Spreadsheet representation.

This format is optional, install Tablib with ``pip install "tablib[xls]"`` to
make the format available.

Its ``import_set()`` method also supports a ``skip_lines`` parameter that you
can set to a number of lines that should be skipped before starting to read
data.

.. versionchanged:: 3.1.0

    The ``skip_lines`` parameter for ``import_set()`` was added.

.. note::

    XLS files are limited to a maximum of 65,000 rows. Use xlsx_ to avoid this
    limitation.

.. admonition:: Binary Warning

    The ``xls`` file format is binary, so make sure to write in binary mode::

        with open('output.xls', 'wb') as f:
            f.write(data.export('xls'))

xlsx
====

Import/export data in Excel 07+ Spreadsheet representation.

This format is optional, install Tablib with ``pip install "tablib[xlsx]"`` to
make the format available.

The ``import_set()`` and ``import_book()`` methods accept keyword
argument ``read_only``.  If its value is ``True`` (the default), the
XLSX data source is read lazily.  Lazy reading generally reduces time
and memory consumption, especially for large spreadsheets.  However,
it relies on the XLSX data source declaring correct dimensions.  Some
programs generate XLSX files with incorrect dimensions.  Such files
may need to be loaded with this optimization turned off by passing
``read_only=False``.

The ``import_set()`` method also supports a ``skip_lines`` parameter that you
can set to a number of lines that should be skipped before starting to read
data.

The ``export_set()`` method supports a ``column_width`` parameter. Depending
on the value passed, the column width will be set accordingly. It can be
either ``None``, an integer, or default "adaptive". If "adaptive" is passed,
the column width will be unique and will be calculated based on values' length.
For example::

    data = tablib.Dataset()
    data.export('xlsx', column_width='adaptive')

This works with ``Databook`` as well::

    data = tablib.Databook()
    data.export('xlsx', column_width='adaptive')

The adaptive width will be calculated for each sheet in the databook.

.. versionchanged:: 3.8.0
    The ``column_width`` parameter for ``export_set()`` was added.

.. versionchanged:: 3.1.0

    The ``skip_lines`` parameter for ``import_set()`` was added.

.. note::

    When reading an ``xlsx`` file containing formulas in its cells, Tablib will
    read the cell values, not the cell formulas.

.. versionchanged:: 2.0.0

    Reads cell values instead of formulas.

You can export data to xlsx format by calling :meth:`export('xlsx') <.export>`.
There are optional parameters to control the export.
For available parameters, see :meth:`tablib.formats._xlsx.XLSXFormat.export_set`.

.. admonition:: Binary Warning

    The ``xlsx`` file format is binary, so make sure to write in binary mode::

        with open('output.xlsx', 'wb') as f:
            f.write(data.export('xlsx'))

yaml
====

Import/export data in the YAML_ format.
When exporting, if headers have been set, a YAML list of objects will be
returned. If no headers have been set, a YAML list of lists (rows) will be
returned instead.

Import assumes (for now) that headers exist.

This format is optional, install Tablib with ``pip install "tablib[yaml]"`` to
make the format available.

.. _YAML: https://yaml.org

sql
===

.. versionadded:: 3.9.0

The ``sql`` format is export-only. It produces SQL INSERT statements (one per row)
assuming the target table already exists with the same columns.
The table name can be passed as an argument or will be taken from the dataset's title (or defaults to ``export_table``).
Columns can be passed as an argument or will be taken from the dataset's headers.
Values are rendered as ANSI SQL literals.
Additionally the argument ``commit`` can be passed to add a ``COMMIT;`` statement at the end.

- ``NULL`` for null values
- ``TRUE``/``FALSE`` for booleans
- ``DATE 'YYYY-MM-DD'`` for date values
- ``TIMESTAMP 'YYYY-MM-DD HH:MM:SS'`` for timestamp values
- Numeric literals for ints/floats/decimals
- Single-quoted strings with embedded quotes escaped

Example::

    import datetime
    from tablib import Dataset

    data = Dataset(title='users')
    data.headers = ['id', 'name', 'joined']
    data.append([1, 'Alice', datetime.date(2021,1,1)])

    print(data.export('sql'))
    print(data.export('sql', table='\"User_Updates\"', columns=['id', 'username', 'update_date'], commit=True))

Output::

    INSERT INTO users (id,name,joined) VALUES (1, 'Alice', DATE '2021-01-01');

    INSERT INTO "User_Updates" (id,username,update_date) VALUES (1, 'Alice', DATE '2021-01-01');
    COMMIT;