1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248
|
\section{Introduction}
Since long glish bindings to the \href{http://casacore.googlecode.com}{casacore} system have been in
place. Quite recently Python bindings have been created in the general
casapy framework using tools like CCMTools, Xerces, Xalan, and
IDL. Albeit very flexible, it is quite complicated and it is not
straightforward to build on other systems than RedHat and OS-X.
Therefore an attempt has been made to make a simpler Python binding
using \href{http://www.boost.org/libs/python/doc}{Boost.Python}.
This proved to be very easy and succesful.
The binding consists of two parts:
\begin{itemize}
\item Converters to translate objects between Python and C++.
\item Class wrappers to map a C++ class and its functions to Python.
\end{itemize}
The Python numarray and numpy (version 1.0 or higher) packages are
supported. At build time one can choose which ones should be used.
\section{Converters}
Boost.Python offers a nice way to convert objects to and from Python.
Ralf W. Grosse-Kunstleve {\tt<rwgk@yahoo.com>} of
\href{http://www.lbl.gov}{Lawrence Berkeley National Laboratory}
has built converters for standard STL containers. This has been
extended to convert to/from other objects.
\\The following C++ objects are currently supported:
\begin{itemize}
\item scalars (bool, integer, real, complex)
\item \texttt{std::string}
\item \texttt{casa::String}
\item \texttt{std::vector<T>}
\item \texttt{casa::Vector<T>}
\item \texttt{casa::IPosition} which defines an array's shape or index.
\item \texttt{casa::Record} which is a \tt{dict}-like object.
\item \texttt{casa::ValueHolder} which is a kind of \tt{any} object
holding a scalar, casa::Array or casa::Record.
\item exceptions (\texttt{casa::IterError} and \texttt{std::exception})
\end{itemize}
These C++ objects can usually be created from several types of Python
objects and are converted to a specific Python object.
\\Note that the all kinds of numpy (and numarray) objects are
handled by the converters. These are:
\begin{itemize}
\item An array is an N-dimensional (N>0) array of a given type
(bool, int, float, complex, string).
\item A scalar array is a 0-dimensional array constructed like
\tt{numpy.array(value)}. Such an array cannot be indexed in Python.
\item An array scalar is an element from an array. The converters
treat them as normal scalars, which very few Python bindings do.
\end{itemize}
The C++ objects map in the following way to Python objects.
\begin{itemize}
\item
A vector or IPosition object is converted to a Python list.
\\It can be constructed from the following Python objects:
\begin{itemize}
\item scalar
\item list or tuple
\item numarray scalar array or 1-dim array
\item numpy scalar array or 1-dim array
\end{itemize}
Note that a list or tuple of arbitrary objects can be given. For
example, it is possible to get a \texttt{Vector<TableProxy>} from Python.
\item
A casa::Record is mapped to a Python dict.
\item
Every C++ exception is mapped to a Python \texttt{RunTimeError}
exception. However, \texttt{casa::IterError} is special and
is mapped to an end-of-iteration exception (\texttt{StopIteration}) in Python.
\item
A casa::ValueHolder is a special \href{http://casacore.googlecode.com}{casacore} object that can hold a record or
a scalar value or N-dim array of many types (bool, numeric, string).
It is meant to conceal the actual type which is useful in functions
that can accept a variety of types (like \texttt{getcell}
in the table binding).
\\Converting a ValueHolder to Python creates the appropriate Python
scalar, array, or dict object. When converting from
Python to ValueHolder, the appropriate internal ValueHolder value
is constructed; a list, tuple, and array object are converted to an
\href{http://casacore.googlecode.com}{casacore} array in the ValueHolder.
\end{itemize}
It means there is no direct Array conversion to/from Python. A
ValueHolder object is always needed to do the conversion. Note that
this is a cheap operation, as it uses Array reference semantics.
ValueHolder has functions to convert between types, so one can get
out an Array with the required type.
\subsection{Array conversion to/from numpy and numarray}
\href{http://casacore.googlecode.com}{casacore} arrays are kept in Fortran-order, while Python arrays are kept
in C-order. It was felt that the Python interface should be as
pythonic as possible. Therefore it was decided that the array axes are
reversed when converting to/from Python.
The values in an IPosition object (describing shape or
position) are also reversed when converting to/from Python.
\\Note that although numarray and numpy have Fortran-array provisions
by setting the appropriate internal strides,
they do not really support them. When adding, for instance, the scalar
value 0 to a Fortran-array, the result is a transposed version of the
original (which can be a quite expensive operation). Also operating on
a Fortran-array is twice as slow as operating on a C-array.
A function binding could be such that shape information is passed via, say, a
\texttt{Record} and not via an \texttt{IPosition} object.
In that case its values
are not reversed automatically, so the programmer is
responsible for doing it.
An \href{http://casacore.googlecode.com}{casacore} array is returned to Python as an array object
containing a copy of the \href{http://casacore.googlecode.com}{casacore} array data. If pyrap has
been built with support for only one Python array package (numpy or numarray),
it is clear which array type is returned. If support for both
packages has been built in, by default an array of the imported
package is returned. If both or no array packages have been imported, a numpy
array is returned.
\\Note that there is no support for the old Numeric package.
An \href{http://casacore.googlecode.com}{casacore} array constructed from a Python array is regarded as a
temporary object. So if possible, the \href{http://casacore.googlecode.com}{casacore} array refers to the
Python array data to avoid a needless copy. This is not possible if
the element size in Python differs from \href{http://casacore.googlecode.com}{casacore}. It is also not
possible if the Python array is not contiguous (or not aligned or
byte swapped). In those cases a copy is made.
A few more numarray/numpy specific issues are dealt with:
\begin{itemize}
\item An empty N-dim \href{http://casacore.googlecode.com}{casacore} array (i.e. an array containing no elements) is
returned as an empty N-dim Python array. If the dimensionality is
zero, it is returned as an empty 1-dim array, to prevent numarray/numpy
from treating it as a scalar value.
\item In numarray \texttt{array()} results in \texttt{Py\_None}. This
is accepted by the converters as an empty 1-dim array.
\item Empty arrays can be constructed in Python using empty lists. For
example, \texttt{array([[]])} results in an empty 2-dim array. The
converters accept such empty N-dim Python arrays. The type of an empty
array is set to Int by numarray and to Double by numpy.
\item Because the type of an empty Python array cannot easily be set,
the converters can convert an empty integer or real array to any type.
\item The converters accept a numpy string array. However, it is
returned to Python as the special \texttt{dict} object described
above.
\item Numpy array scalars (elements in an array) are treated by the
converters as ordinary scalar values.
\end{itemize}
\section{Class wrappers}
Usually a binding to an existing Proxy class is made, for example
\texttt{TableProxy}, which should be the same class used in the
glish-binding.
For a simple binding, only some simple C++ code has to be written in
pyrap\_xx/src/pyxx.cc, where XX is the name of the package/class.
\begin{verbatim}
// Include files for converters being used.
#include <casacore/python/Converters/PycExcp.h>
#include <casacore/python/Converters/PycBasicData.h>
#include <casacore/python/Converters/PycRecord.h>
// Include file for boost python.
#include <boost/python.hpp>
using namespace boost::python;
namespace casa { namespace pyrap {
void wrap_xx()
{
// Define the class; "xx" is the class name in Python.
class_<XX> ("xx")
// Define the constructor.
// Multiple constructors can be defined.
// They have to have different number of arguments.
.def (init<>())
// Add a .def line for each function to be wrapped.
// An arg line should be added for each argument giving
// its name and possibly default value.
.def ("func1", &XX::func1,
(boost::python::arg("arg1"),
boost::python::arg("arg2")=0))
;
}
}}
BOOST_PYTHON_MODULE(_xx)
{
// Register the conversion functions.
casa::pyrap::register_convert_excp();
casa::pyrap::register_convert_basicdata();
casa::pyrap::register_convert_casa_record();
// Initialize the wrapping.
casa::pyrap::wrap_xx();
}
\end{verbatim}
Python requires for each package a file \texttt{\_\_init\_\_.py},
so such an empty file should be created as well.
\subsection{More complicated wrappers}
Sometimes a C++ function cannot be wrapped directly, because the argument
order needs to be changed or because some extra Python checks are
necessary.
In such a case the class needs to be implemented in Python itself.
\\The C++ wrapped class name needs to get a different name,
usually by preceeding it with an underscore like:
\begin{verbatim}
class_<XX> ("_xx")
\end{verbatim}
The Python class should be derived from it and implement the
constructor by calling the constructor of \_xx.
\begin{verbatim}
class xx(_xx):
def __init__(self):
_xx.__init__(self)
\end{verbatim}
Now \texttt{xx} inherits all functions from \texttt{\_xx}.
The required function can be written in Python like
\begin{verbatim}
def func1 (self, arg1, arg2):
return self._func1 (arg2, arg1);
\end{verbatim}
Note that in the wrapper the function name also needs to be
preceeded by an underscore to make it different.
\subsection{Combining multiple classes}
Sometimes one wants to combine multiple classes in a package. A
example is package \texttt{pyrap\_tables} which contains the classes
\texttt{table}, \texttt{tablecolumn}, \texttt{tablerow},
\texttt{tableiter}, and \texttt{tableindex}. One is referred to the
code of this package to see how
to do it.
\section{Python specifics}
Besides an array being in C-order, there are a few more Python
specific issues.
\begin{itemize}
\item Indexing starts at 0 (vs. 1 in glish).
\item The end value in a range like \texttt{[10:20]} is exclusive
(vs. inclusive in glish). Furthermore Python supports a step and
reversed ranges.
\item Where useful, the function \texttt{\_\_str\_\_} should be added
giving the name of the object. This function is used when printing
an object.
\item Where useful, the functions \texttt{\_\_len\_\_},
\texttt{\_\_setitem\_\_(index, value)}, and
\texttt{\_\_getitem\_\_(index)} should
be added to make it possible that a user indexes an object directly like
\texttt{tabcol[i]} or \texttt{tabcol[start:stop:step]}.
\item When these functions are added, Python supports iteration in an
object. Explicit iteration can also be done by adding the functions
\texttt{\_\_iter\_\_} and \texttt{next}. At the end \texttt{next}
should raise the Python \texttt{StopIteration} exception
(or throw \texttt{casa::IterError}
when implemented in C++) to stop the iteration.
\end{itemize}
|