.. _unified_table_text:

Text (CSV, fixed-width, HTML, and specialized)
==============================================

The :meth:`~astropy.table.Table.read` and :meth:`~astropy.table.Table.write` methods can
be used to read and write text-based table data in a wide variety of `supported
formats`_. In addition to common formats like `CSV
<https://en.wikipedia.org/wiki/Comma-separated_values>`__ and :ref:`fixed-width
<fixed_width_gallery>`, the unified interface also supports `specialized formats`_ like
`LaTeX <https://en.wikipedia.org/wiki/LaTeX>`_ tables and the AAS :class:`MRT
<astropy.io.ascii.Mrt>` format.

Most of the formats are provided by :ref:`table_io_ascii`, which is a flexible and
powerful interface for reading and writing text tables. In addition, the interface
provides wrappers around select I/O functions in the `pandas`_ library for additional
flexibility and performance.

.. note::

   For reading large CSV files, the astropy :ref:`PyArrow CSV <table_io_pyarrow_csv>`
   reader is a good option to consider since it can be up to 15 times faster than other
   readers.

Supported Formats
-----------------

Character-delimited Formats
^^^^^^^^^^^^^^^^^^^^^^^^^^^
These formats use a character delimiter to separate columns. This is most commonly a
comma (CSV) or a whitespace character like space or tab.

===========================  =====  ======  ============================================================================================
           Format            Write  Suffix                                          Description
===========================  =====  ======  ============================================================================================
                      ascii    Yes          ASCII table in most supported formats (uses guessing)
                ascii.basic    Yes          :class:`~astropy.io.ascii.Basic`: Basic table with custom delimiters
     ascii.commented_header    Yes          :class:`~astropy.io.ascii.CommentedHeader`: Column names in a commented line
                  ascii.csv    Yes    .csv  :class:`~astropy.io.ascii.Csv`: Basic table with comma-separated values
                 ascii.ecsv    Yes   .ecsv  :class:`~astropy.io.ascii.Ecsv`: Basic table with Enhanced CSV (supporting metadata)
            ascii.no_header    Yes          :class:`~astropy.io.ascii.NoHeader`: Basic table with no headers
                  ascii.rdb    Yes    .rdb  :class:`~astropy.io.ascii.Rdb`: Tab-separated with a type definition header line
                  ascii.tab    Yes          :class:`~astropy.io.ascii.Tab`: Basic table with tab-separated values
                 ascii.tdat    Yes   .tdat  :class:`~astropy.io.ascii.Tdat`: Transportable Database Aggregate Table format
                 pandas.csv    Yes          :func:`pandas.read_csv` and :meth:`pandas.DataFrame.to_csv`
                pyarrow.csv     No          :func:`~astropy.io.misc.pyarrow.csv.read_csv`: Performant CSV reader
===========================  =====  ======  ============================================================================================

Fixed-width Formats
^^^^^^^^^^^^^^^^^^^
These formats use fixed-width columns, where each column has a fixed width in characters.
This can be useful for tables that are intended to also be read by humans.

===========================  =====  ======  ============================================================================================
           Format            Write  Suffix                                          Description
===========================  =====  ======  ============================================================================================
          ascii.fixed_width    Yes          :class:`~astropy.io.ascii.FixedWidth`: Fixed width
ascii.fixed_width_no_header    Yes          :class:`~astropy.io.ascii.FixedWidthNoHeader`: Fixed width with no header
 ascii.fixed_width_two_line    Yes          :class:`~astropy.io.ascii.FixedWidthTwoLine`: Fixed width with second header line
                 pandas.fwf     No          :func:`pandas.read_fwf` (fixed width format)
===========================  =====  ======  ============================================================================================

HTML and JSON Formats
^^^^^^^^^^^^^^^^^^^^^
===========================  =====  ======  ============================================================================================
           Format            Write  Suffix                                          Description
===========================  =====  ======  ============================================================================================
                 ascii.html    Yes   .html  :class:`~astropy.io.ascii.HTML`: HTML table
                   jsviewer    Yes          JavaScript viewer format (write-only)
                pandas.html    Yes          :func:`pandas.read_html` and :meth:`pandas.DataFrame.to_html`
                pandas.json    Yes          :func:`pandas.read_json` and :meth:`pandas.DataFrame.to_json`
===========================  =====  ======  ============================================================================================

Specialized Formats
^^^^^^^^^^^^^^^^^^^^
===========================  =====  ======  ============================================================================================
           Format            Write  Suffix                                          Description
===========================  =====  ======  ============================================================================================
               ascii.aastex    Yes          :class:`~astropy.io.ascii.AASTex`: AASTeX deluxetable used for AAS journals
                  ascii.cds     No          :class:`~astropy.io.ascii.Cds`: CDS format table
              ascii.daophot     No          :class:`~astropy.io.ascii.Daophot`: IRAF DAOphot format table
                 ascii.ipac    Yes          :class:`~astropy.io.ascii.Ipac`: IPAC format table
                ascii.latex    Yes    .tex  :class:`~astropy.io.ascii.Latex`: LaTeX table
                  ascii.mrt    Yes          :class:`~astropy.io.ascii.Mrt`: AAS Machine-Readable Table format
                  ascii.qdp    Yes    .qdp  :class:`~astropy.io.ascii.QDP`: Quick and Dandy Plotter files
                  ascii.rst    Yes    .rst  :class:`~astropy.io.ascii.RST`: reStructuredText simple format table
           ascii.sextractor     No          :class:`~astropy.io.ascii.SExtractor`: SExtractor format table
===========================  =====  ======  ============================================================================================

.. _table_io_ascii:

`astropy.io.ascii`
------------------
The :ref:`astropy.io.ascii <io-ascii>` sub-package provides read and write support for
:ref:`many different formats <supported_formats>`, including astronomy-specific formats
like AAS `Machine-Readable Tables (MRT) <https://journals.aas.org/mrt-standards/>`_.

We **strongly recommend** using the unified interface for reading and writing tables via
the :ref:`astropy.io.ascii <io-ascii>` sub-package. This is done by prefixing the
:ref:`format name <supported_formats>` with ``ascii.``. For example, to read a
DAOphot table, use:

.. doctest-skip::

    >>> from astropy.table import Table
    >>> t = Table.read('photometry.dat', format='ascii.daophot')

Use ``format='ascii'`` to read a table and guess its format by successively
trying most of the available formats in a specific order. This can be slow and is not
recommended for large tables.

.. doctest-skip::

  >>> t = Table.read('astropy/io/ascii/tests/t/latex1.tex', format='ascii')
  >>> print(t)
  cola colb colc
  ---- ---- ----
     a    1    2
     b    3    4

When writing a table with ``format='ascii'`` the output is a basic
space-delimited file with a single header line containing the
column names.

All additional arguments are passed to the `astropy.io.ascii`
:func:`~astropy.io.ascii.read` and :func:`~astropy.io.ascii.write`
functions. Further details are available in the sections on
:ref:`io_ascii_read_parameters` and :ref:`io_ascii_write_parameters`. For
example, to change the column delimiter and the output format for the ``colc``
column use:

.. doctest-skip::

  >>> import sys
  >>> t.write(sys.stdout, format='ascii', delimiter='|', formats={'colc': '%0.2f'})
  cola|colb|colc
  a|1|2.00
  b|3|4.00

.. attention:: **ECSV is recommended**

   For writing and reading tables to text in a way that fully reproduces the table data,
   types, and metadata (i.e., the table will "round-trip"), we highly recommend using
   the :ref:`ecsv_format` with ``format="ascii.ecsv"``. This writes the actual data in a
   space- or comma-delimited format that most text table readers can parse, but also
   includes metadata encoded in a comment block that allows full reconstruction of the
   original columns. This includes support for :ref:`ecsv_format_mixin_columns` (such as
   `~astropy.coordinates.SkyCoord` or `~astropy.time.Time`) and
   :ref:`ecsv_format_masked_columns`.

..
  EXAMPLE END

.. _table_io_pandas:

Pandas
------

.. _pandas: https://pandas.pydata.org/pandas-docs/stable/index.html

The ``astropy`` `~astropy.table.Table` class can read and write tables using some of
the `I/O methods <https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html>`_
available within pandas_. The interface provides convenient wrappers to
the following functions and methods:

.. csv-table::
    :header: "Format name", "Data Description", "Reader", "Writer"
    :widths: 25, 25, 25, 25

    ``pandas.csv``,`CSV <https://en.wikipedia.org/wiki/Comma-separated_values>`__,`read_csv() <https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html#io-read-csv-table>`_,`to_csv() <https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html#io-store-in-csv>`_
    ``pandas.json``,`JSON <http://www.json.org/>`__,`read_json() <https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html#io-json-reader>`_,`to_json() <https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html#io-json-writer>`_
    ``pandas.html``,`HTML <https://en.wikipedia.org/wiki/HTML>`__,`read_html() <https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html#io-read-html>`_,`to_html() <https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html#io-html>`_
    ``pandas.fwf``,Fixed Width,`read_fwf() <https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_fwf.html#pandas.read_fwf>`_,

**Notes**:

- This is subject to the limitations discussed in :ref:`astropy-table-and-dataframes`.
- There is no fixed-width writer in pandas_.
- Reading HTML requires `BeautifulSoup4 <https://pypi.org/project/beautifulsoup4/>`_ and
  `html5lib <https://pypi.org/project/html5lib/>`_ to be installed.

When reading or writing a table, any keyword arguments apart from the
``format`` and file name are passed through to pandas, for instance:

.. doctest-skip::

  >>> t.write('data.csv', format='pandas.csv', sep=' ', header=False)
  >>> t2 = Table.read('data.csv', format='pandas.csv', sep=' ', names=['a', 'b', 'c'])

.. _table_io_pyarrow_csv:

PyArrow CSV
-----------

.. _pyarrow: https://arrow.apache.org/docs/python/

The `pyarrow`_ library provides a high-performance CSV reader that can be used in
Astropy with ``Table.read(input_file, format="pyarrow.csv", ...)``. This can be up to 15
times faster and more memory-efficient than the :ref:`astropy.io.ascii <io-ascii>` fast
reader or the default ``pandas.csv`` reader. The best performance is achieved for files
with only numeric data types, but even for files with mixed data types, the performance
is still better than the standard :ref:`astropy.io.ascii <io-ascii>` fast CSV reader.

This reader uses the :func:`~astropy.io.misc.pyarrow.csv.read_csv` function, which in
turn uses the `PyArrow CSV reader <https://arrow.apache.org/docs/python/csv.html>`__ and
sets the various options to ``pyarrow.csv.read_csv()`` appropriately. The interface is
designed to be similar to the :ref:`io.ascii read interface <io_ascii_read_parameters>`
where possible, but there are differences, most notably:

- Input can only be a string file name, `pathlib.Path`, or a binary file-like object.
- Whitespace in string data fields and header column names is preserved.
- Use ``dtypes`` instead of ``converters`` to specify the column data types.
- Use ``null_values`` instead of ``fill_values`` to specify the null (missing) values.
- No ``guess`` parameter and no guessing of the table format (e.g., ``delimiter``).
- No ``data_end`` parameter.
- No ``exclude_names`` parameter.
- Columns consisting of only string values ``True`` and ``False`` are parsed as
  boolean data.
- Columns with ISO 8601 date/time strings are parsed as shown below:

  - ``12:13:14.123456``: ``object[datetime.time]``
  - ``2025-01-01``: ``np.datetime64[D]``
  - ``2025-01-01T01:02:03``: ``np.datetime64[s]``
  - ``2025-01-01T01:02:03.123456``: ``np.datetime64[ns]``

- Timestamp parsing behavior can be customized with the ``timestamp_parsers``
  parameter.

Using the PyArrow CSV reader directly
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The :mod:`astropy.io.misc.pyarrow.csv` module also provides the
:func:`~astropy.io.misc.pyarrow.csv.convert_pa_table_to_astropy_table` function to
allow converting a ``pyarrow.Table`` to an `astropy.table.Table`. This allows using
the `PyArrow CSV reader <https://arrow.apache.org/docs/python/csv.html>`__ directly
with custom options that are not available in the astropy interface.