.. _coding:
Coding a new module
===================
.. _constructor:
There are three different types of pipeline modules: :class:`~pynpoint.core.processing.ReadingModule`, :class:`~pynpoint.core.processing.WritingModule`, and :class:`~pynpoint.core.processing.ProcessingModule`. The concept is similar for these three module types, so here we will only explain how to code a processing module.
Class constructor
-----------------
First, we need to import the interface (i.e. abstract class) :class:`~pynpoint.core.processing.ProcessingModule`:
.. code-block:: python
from pynpoint.core.processing import ProcessingModule
All pipeline modules are classes that contain the parameters of the pipeline step and its input and/or output ports. So let's create a simple ``ExampleModule`` class that inherits from the :class:`~pynpoint.core.processing.ProcessingModule` interface:
.. code-block:: python
class ExampleModule(ProcessingModule):
The abstract class :class:`~pynpoint.core.processing.ProcessingModule` has some abstract methods (e.g., ``__init__`` and ``run``) which have to be implemented by its child classes, so an IDE like *PyCharm* will show a warning that all abstract methods must be implemented in the ``ExampleModule`` class. We start by implementing the ``__init__`` method (i.e., the constructor of our module):
.. code-block:: python
def __init__(self,
name_in='example',
in_tag_1='in_tag_1',
in_tag_2='in_tag_2',
out_tag_1='out_tag_1',
out_tag_2='out_tag_2',
parameter_1=0,
parameter_2='value'):
Each ``__init__`` method of a :class:`~pynpoint.core.processing.PypelineModule` requires a ``name_in`` argument, which is used by the pipeline to run individual modules by name. Furthermore, the input and output tags, which are used to access data in the central database, have to be defined. The constructor starts with a call of the :class:`~pynpoint.core.processing.ProcessingModule` interface:
.. code-block:: python
super().__init__(name_in)
Next, the input and output ports behind the database tags need to be defined:
.. code-block:: python
self.m_in_port_1 = self.add_input_port(in_tag_1)
self.m_in_port_2 = self.add_input_port(in_tag_2)
self.m_out_port_1 = self.add_output_port(out_tag_1)
self.m_out_port_2 = self.add_output_port(out_tag_2)
Reading from and writing to the central database should always be done with the ``add_input_port`` and ``add_output_port`` methods and not by manually creating an instance of :class:`~pynpoint.core.dataio.InputPort` or :class:`~pynpoint.core.dataio.OutputPort`.
Finally, the module parameters should be saved as attributes of the ``ExampleModule`` instance:
.. code-block:: python
self.m_parameter_1 = parameter_1
self.m_parameter_2 = parameter_2
That's it! The constructor of the ``ExampleModule`` is ready.
.. _run_method:
Run method
----------
We can now add the functionality of the module to the ``run`` method, which will be called by the pipeline:
.. code-block:: python
def run(self):
The input ports of the module are used to load data from the central database into memory, either with slicing or with the ``get_all`` method:
.. code-block:: python
data1 = self.m_in_port_1.get_all()
data2 = self.m_in_port_2[0:4]
We want to avoid using the ``get_all`` method because data sets obtained in the L' and M' bands typically consist of thousands of images, so loading all images into the computer memory at once might not be possible. Instead, it is recommended to process the data in subsets whose size is given by the ``MEMORY`` attribute that is specified in the configuration file (see :ref:`configuration`), as sketched below.
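For illustration, a minimal sketch of such a chunked read (assuming ``MEMORY`` is set to a positive integer, i.e. the number of images per subset) could look as follows:

.. code-block:: python

    # Minimal sketch (illustrative, not part of the ExampleModule):
    # process the input data in subsets of MEMORY images instead of using get_all
    memory = self._m_config_port.get_attribute('MEMORY')
    nimages = self.m_in_port_1.get_shape()[0]

    for i in range(0, nimages, memory):
        # load only a subset of the images into memory
        images = self.m_in_port_1[i:i+memory, ]

        # process the subset and append the result to the output port
        self.m_out_port_1.append(images)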
Attributes of a dataset can be read as follows:
.. code-block:: python
parang = self.m_in_port_1.get_attribute('PARANG')
pixscale = self.m_in_port_2.get_attribute('PIXSCALE')
Attributes of the central configuration are accessed through the :class:`~pynpoint.core.dataio.ConfigPort`:
.. code-block:: python
memory = self._m_config_port.get_attribute('MEMORY')
cpu = self._m_config_port.get_attribute('CPU')
More information on importing data can be found in the API documentation of :class:`~pynpoint.core.dataio.InputPort`.
Next, the processing steps are implemented:
.. code-block:: python
result1 = 10.*self.m_parameter_1
result2 = 20.*self.m_parameter_1
result3 = [1, 2, 3]
attribute = self.m_parameter_2
The output ports are used to write the results to the central database:
.. code-block:: python
self.m_out_port_1.set_all(result1)
self.m_out_port_1.append(result2)
self.m_out_port_2[0:2] = result2
self.m_out_port_2.add_attribute(name='new_attribute', value=attribute)
More information on storing data can be found in the API documentation of :class:`~pynpoint.core.dataio.OutputPort`.
The attributes of the input port need to be copied to the output ports and history information should be added. These steps should be repeated for each output port:
.. code-block:: python
self.m_out_port_1.copy_attributes(self.m_in_port_1)
self.m_out_port_1.add_history('ExampleModule', 'history text')
self.m_out_port_2.copy_attributes(self.m_in_port_1)
self.m_out_port_2.add_history('ExampleModule', 'history text')
Finally, the central database and all the open ports are closed:
.. code-block:: python
self.m_out_port_1.close_port()
.. important::
It is enough to close only one port because all other ports will be closed automatically.
.. _apply_function:
Apply function to images
------------------------
A processing module often applies a specific method to each image of an input port. Therefore, the :func:`~pynpoint.core.processing.ProcessingModule.apply_function_to_images` function has been implemented to apply a function to all images of an input port. This function uses the ``CPU`` and ``MEMORY`` parameters from the configuration file to automatically process subsets of images in parallel. An example of the implementation can be found in the code of the bad pixel cleaning with a sigma filter: :class:`~pynpoint.processing.badpixel.BadPixelSigmaFilterModule`.
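As an illustration, a minimal sketch of such a call (the helper function ``_subtract_offset`` is hypothetical and the exact calling convention should be checked in the API documentation) could look like this:

.. code-block:: python

    def run(self):

        # hypothetical helper: applied to one image at a time, with the
        # extra arguments passed through func_args
        def _subtract_offset(image, offset):
            return image - offset

        self.apply_function_to_images(_subtract_offset,
                                      self.m_in_port_1,
                                      self.m_out_port_1,
                                      'Subtracting the offset',
                                      func_args=(self.m_parameter_1, ))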
.. _example_module:
Example module
--------------
The full code for the ``ExampleModule`` from above is:
.. code-block:: python
from pynpoint.core.processing import ProcessingModule
class ExampleModule(ProcessingModule):
def __init__(self,
name_in='example',
in_tag_1='in_tag_1',
in_tag_2='in_tag_2',
out_tag_1='out_tag_1',
out_tag_2='out_tag_2',
parameter_1=0,
parameter_2='value'):
super().__init__(name_in)
self.m_in_port_1 = self.add_input_port(in_tag_1)
self.m_in_port_2 = self.add_input_port(in_tag_2)
self.m_out_port_1 = self.add_output_port(out_tag_1)
self.m_out_port_2 = self.add_output_port(out_tag_2)
self.m_parameter_1 = parameter_1
self.m_parameter_2 = parameter_2
def run(self):
data1 = self.m_in_port_1.get_all()
data2 = self.m_in_port_2[0:4]
parang = self.m_in_port_1.get_attribute('PARANG')
pixscale = self.m_in_port_2.get_attribute('PIXSCALE')
memory = self._m_config_port.get_attribute('MEMORY')
cpu = self._m_config_port.get_attribute('CPU')
result1 = 10.*self.m_parameter_1
result2 = 20.*self.m_parameter_1
result3 = [1, 2, 3]
attribute = self.m_parameter_2
self.m_out_port_1.set_all(result1)
self.m_out_port_1.append(result2)
self.m_out_port_2[0:2] = result2
self.m_out_port_2.add_attribute(name='new_attribute', value=attribute)
self.m_out_port_1.copy_attributes(self.m_in_port_1)
self.m_out_port_1.add_history('ExampleModule', 'history text')
self.m_out_port_2.copy_attributes(self.m_in_port_1)
self.m_out_port_2.add_history('ExampleModule', 'history text')
self.m_out_port_1.close_port()
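To illustrate the use of the ``name_in`` value, here is a minimal sketch (the folder paths are placeholders) of adding the ``ExampleModule`` to a :class:`~pynpoint.core.pypeline.Pypeline` and running it by name:

.. code-block:: python

    from pynpoint import Pypeline

    pipeline = Pypeline(working_place_in='/path/to/working_place',
                        input_place_in='/path/to/input_place',
                        output_place_in='/path/to/output_place')

    module = ExampleModule(name_in='example',
                           in_tag_1='in_tag_1',
                           in_tag_2='in_tag_2',
                           out_tag_1='out_tag_1',
                           out_tag_2='out_tag_2',
                           parameter_1=5,
                           parameter_2='value')

    pipeline.add_module(module)
    pipeline.run_module('example')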