File: gdb.rst

package info (click to toggle)
apache-arrow 23.0.1-1
  • links: PTS
  • area: main
  • in suites: sid
  • size: 76,220 kB
  • sloc: cpp: 654,608; python: 70,522; ruby: 45,964; ansic: 18,742; sh: 7,365; makefile: 669; javascript: 125; xml: 41
file content (167 lines) | stat: -rw-r--r-- 6,620 bytes parent folder | download | duplicates (4)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
.. Licensed to the Apache Software Foundation (ASF) under one
.. or more contributor license agreements.  See the NOTICE file
.. distributed with this work for additional information
.. regarding copyright ownership.  The ASF licenses this file
.. to you under the Apache License, Version 2.0 (the
.. "License"); you may not use this file except in compliance
.. with the License.  You may obtain a copy of the License at

..   http://www.apache.org/licenses/LICENSE-2.0

.. Unless required by applicable law or agreed to in writing,
.. software distributed under the License is distributed on an
.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
.. KIND, either express or implied.  See the License for the
.. specific language governing permissions and limitations
.. under the License.

.. default-domain:: cpp
.. highlight:: console

==========================
Debugging code using Arrow
==========================

.. _cpp_gdb_extension:

GDB extension for Arrow C++
===========================

By default, when asked to print the value of a C++ object,
`GDB <https://www.sourceware.org/gdb/>`_ displays the contents of its
member variables.  However, for C++ objects this does not often yield
a very useful output, as C++ classes tend to hide their implementation details
behind methods and accessors.

For example, here is how a :class:`arrow::Status` instance may be displayed
by GDB::

   $3 = {
     <arrow::util::EqualityComparable<arrow::Status>> = {<No data fields>},
     <arrow::util::ToStringOstreamable<arrow::Status>> = {<No data fields>},
     members of arrow::Status:
     state_ = 0x0
   }

and here is a :class:`arrow::Decimal128Scalar`::

   $4 = (arrow::Decimal128Scalar) {
     <arrow::DecimalScalar<arrow::Decimal128Type, arrow::Decimal128>> = {
       <arrow::internal::PrimitiveScalarBase> = {
         <arrow::Scalar> = {
           <arrow::util::EqualityComparable<arrow::Scalar>> = {<No data fields>},
           members of arrow::Scalar:
           _vptr.Scalar = 0x7ffff6870e78 <vtable for arrow::Decimal128Scalar+16>,
           type = std::shared_ptr<arrow::DataType> (use count 1, weak count 0) = {
             get() = 0x555555ce58a0
           },
           is_valid = true
         }, <No data fields>},
       members of arrow::DecimalScalar<arrow::Decimal128Type, arrow::Decimal128>:
       value = {
         <arrow::BasicDecimal128> = {
           <arrow::GenericBasicDecimal<arrow::BasicDecimal128, 128, 2>> = {
             static kHighWordIndex = <optimized out>,
             static kBitWidth = 128,
             static kByteWidth = 16,
             static LittleEndianArray = <optimized out>,
             array_ = {
               _M_elems = {[0] = 1234567, [1] = 0}
             }
           },
           members of arrow::BasicDecimal128:
           static kMaxPrecision = 38,
           static kMaxScale = 38
         }, <No data fields>}
     }, <No data fields>}

Fortunately, GDB also allows custom extensions to override the default printing
for specific types.  We provide a
`GDB extension <https://github.com/apache/arrow/blob/main/cpp/gdb_arrow.py>`_
written in Python that enables pretty-printing for common Arrow C++ classes,
so as to enable a more productive debugging experience.  For example,
here is how the aforementioned :class:`arrow::Status` instance will be
displayed::

   $5 = arrow::Status::OK()

and here is the same :class:`arrow::Decimal128Scalar` instance as above::

   $6 = arrow::Decimal128Scalar of value 123.4567 [precision=10, scale=4]


Manual loading
--------------

To enable the GDB extension for Arrow, you can simply
`download it <https://github.com/apache/arrow/blob/main/cpp/gdb_arrow.py>`_
somewhere on your computer and ``source`` it from the GDB prompt::

   (gdb) source path/to/gdb_arrow.py

You will have to ``source`` it on each new GDB session.  You might want to
make this implicit by adding the ``source`` invocation in a
`gdbinit <https://sourceware.org/gdb/onlinedocs/gdb/gdbinit-man.html>`_ file.


Automatic loading
-----------------

GDB provides a facility to automatically load scripts or extensions for each
object file or library that is involved in a debugging session.  You will need
to:

1. Find out what the *auto-load* locations are for your GDB install.
   This can be determined using ``show`` subcommands on the GDB prompt;
   the answer will depend on the operating system.

   Here is an example on Ubuntu::

      (gdb) show auto-load scripts-directory
      List of directories from which to load auto-loaded scripts is $debugdir:$datadir/auto-load.
      (gdb) show data-directory
      GDB's data directory is "/usr/share/gdb".
      (gdb) show debug-file-directory
      The directory where separate debug symbols are searched for is "/usr/lib/debug".

   This tells you that the directories used for auto-loading are
   ``$debugdir`` and ``$datadir/auto-load``, which expand to
   ``/usr/lib/debug/`` and ``/usr/share/gdb/auto-load`` respectively.

2. Find out the full path to the Arrow C++ DLL, *with all symlinks resolved*.
   For example, you might have installed Arrow 7.0 in ``/usr/local`` and the
   path to the Arrow C++ DLL could then be ``/usr/local/lib/libarrow.so.700.0.0``.

3. Determine the actual auto-load script path.  It is computed by *a)* taking
   the path of the auto-load directory of your choice, *b)* appending the full
   path to the Arrow C++ DLL, *c)* appending ``-gdb.py`` at the tail.

   In the example above, if we choose ``/usr/share/gdb/auto-load`` as auto-load
   directory, the full path to the auto-load script will have to be
   ``/usr/share/gdb/auto-load/usr/local/lib/libarrow.so.700.0.0-gdb.py``.

4. Either copy or symlink the `GDB extension`_ to the file path determined
   in step 3 above.

If everything went well, then as soon as GDB encounters the Arrow C++ DLL,
it will automatically load the Arrow GDB extension so as to pretty-print
Arrow C++ classes on the display prompt.


Supported classes
-----------------

The Arrow GDB extension provides pretty-printing for the core Arrow C++ classes:

* :class:`arrow::DataType` and subclasses
* :class:`arrow::Field`, :class:`arrow::Schema` and :class:`arrow::KeyValueMetadata`
* :class:`arrow::ArrayData`, :class:`arrow::Array` and subclasses
* :class:`arrow::Scalar` and subclasses
* :class:`arrow::ChunkedArray`, :class:`arrow::RecordBatch` and :class:`arrow::Table`
* :class:`arrow::Datum`

Important utility classes are also covered:

* :class:`arrow::Status` and :class:`arrow::Result`
* :class:`arrow::Buffer` and subclasses
* :class:`arrow::Decimal128`, :class:`arrow::Decimal256`