=========
OS-Faults
=========

**OpenStack fault-injection library**

The library performs destructive actions inside an OpenStack cloud and provides
an abstraction layer over different types of cloud deployments. The actions
are implemented as drivers (e.g. DevStack driver, Fuel driver, Libvirt driver,
IPMI driver, Universal driver).

* Free software: Apache license
* Documentation: https://os-faults.readthedocs.io/
* Source: https://opendev.org/performa/os-faults/
* Bugs: https://bugs.launchpad.net/os-faults


Installation
------------

Requirements
~~~~~~~~~~~~

Ansible is required and should be installed manually, either system-wide or in a
virtual environment. Please refer to [https://docs.ansible.com/ansible/latest/installation_guide/intro_installation.html]
for installation instructions.

Regular installation::

    pip install os-faults

The library contains an optional libvirt driver [https://pypi.org/project/libvirt-python/]. If you plan to use it,
install os-faults together with the extra dependency using the following command::

    pip install os-faults libvirt-python


Configuration
-------------

The cloud deployment configuration is specified in JSON/YAML format or as a Python dictionary.

The library operates with 3 types of objects:
 * `service` - software that runs in the cloud, e.g. `nova-api`
 * `container` - software that runs in a container in the cloud, e.g. `neutron_api`
 * `nodes` - nodes that host the cloud, e.g. a server with a hostname


Example 1. DevStack
~~~~~~~~~~~~~~~~~~~

Connection to DevStack can be specified using the following YAML file:

.. code-block:: yaml

    cloud_management:
      driver: devstack
      args:
        address: devstack.local
        auth:
          username: stack
          private_key_file: cloud_key
        iface: enp0s8

The OS-Faults library will connect to DevStack at the address `devstack.local` as user `stack`,
using the SSH key located in the file `cloud_key`. The default networking interface is specified with
the parameter `iface`. Note that the user should have sudo permissions (by default the DevStack user has them).

The DevStack driver is responsible for service discovery. For more details please refer
to the driver documentation: http://os-faults.readthedocs.io/en/latest/drivers.html#devstack-systemd-devstackmanagement
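
Since the configuration may equally be given as JSON or as a Python dictionary,
the YAML above can be sketched as a dictionary and serialized with the standard
library. The file name `os-faults.json` and the temporary directory are
illustrative assumptions only; the settings themselves are copied from the YAML example.

```python
import json
import os
import tempfile

# The same DevStack settings as in the YAML example, as a Python dictionary.
devstack_config = {
    'cloud_management': {
        'driver': 'devstack',
        'args': {
            'address': 'devstack.local',
            'auth': {
                'username': 'stack',
                'private_key_file': 'cloud_key',
            },
            'iface': 'enp0s8',
        },
    },
}

# Serialize to JSON; a temporary directory is used only for illustration.
config_path = os.path.join(tempfile.mkdtemp(), 'os-faults.json')
with open(config_path, 'w') as f:
    json.dump(devstack_config, f, indent=2)

# Reading the file back yields an identical dictionary.
with open(config_path) as f:
    loaded_config = json.load(f)
```

Either form carries the same information, so the choice between a file and an
in-memory dictionary depends only on how os-faults is being driven.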

Example 2. An OpenStack with services, containers and power management
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

An arbitrary OpenStack deployment can also be handled with the help of the `universal` driver.
In this example, os-faults is used as a Python library.

.. code-block:: python

    cloud_config = {
        'cloud_management': {
            'driver': 'universal',
        },
        'node_discover': {
            'driver': 'node_list',
            'args': [
                {
                    'ip': '192.168.5.127',
                    'auth': {
                        'username': 'root',
                        'private_key_file': 'openstack_key',
                    }
                },
                {
                    'ip': '192.168.5.128',
                    'auth': {
                        'username': 'root',
                        'private_key_file': 'openstack_key',
                    }
                }
            ]
        },
        'services': {
            'memcached': {
                'driver': 'system_service',
                'args': {
                    'service_name': 'memcached',
                    'grep': 'memcached',
                }
            }
        },
        'containers': {
            'neutron_api': {
                'driver': 'docker_container',
                'args': {
                    'container_name': 'neutron_api',
                }
            }
        },
        'power_managements': [
            {
                'driver': 'libvirt',
                'args': {
                    'connection_uri': 'qemu+unix:///system',
                }
            },
        ]
    }

The config contains all OpenStack nodes with credentials and all
services/containers. OS-Faults will automatically figure out the mapping
between services/containers and nodes. Power management configuration is
flexible and supports mixed bare-metal / virtualized deployments.
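
When many nodes share the same credentials, the `args` list of `node_discover`
can be generated instead of written by hand. The helper `make_node_list` below
is hypothetical and not part of os-faults; it only produces the plain
dictionaries used in the configuration above.

```python
def make_node_list(ips, username, private_key_file):
    """Build 'node_discover' args for nodes that share credentials.

    Hypothetical helper, not part of os-faults.
    """
    return [
        {
            'ip': ip,
            # A fresh 'auth' dict per node, so entries can be edited independently.
            'auth': {
                'username': username,
                'private_key_file': private_key_file,
            },
        }
        for ip in ips
    ]


node_discover = {
    'driver': 'node_list',
    'args': make_node_list(
        ['192.168.5.127', '192.168.5.128'],
        username='root',
        private_key_file='openstack_key',
    ),
}
```

The resulting `node_discover` value can be placed into `cloud_config` under the
key of the same name.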

First let's establish a connection to the cloud and verify it:

.. code-block:: python

    import os_faults

    cloud_management = os_faults.connect(cloud_config)
    cloud_management.verify()

The library can also read configuration from a file in YAML or JSON format.
The configuration file can be specified in the `OS_FAULTS_CONFIG` environment
variable. By default, the library searches for a file named `os-faults.{json,yaml,yml}`
in one of the following locations:

  * current directory
  * ~/.config/os-faults
  * /etc/openstack
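
The lookup described above can be sketched roughly as follows. This is not the
library's own implementation: the function names are hypothetical, and both the
precedence of `OS_FAULTS_CONFIG` over the default locations and the order in
which the three file extensions are tried are assumptions.

```python
import os

CONFIG_NAMES = ('os-faults.json', 'os-faults.yaml', 'os-faults.yml')


def candidate_config_paths(environ=None, cwd=None):
    """List config paths in the order this sketch would try them."""
    environ = os.environ if environ is None else environ
    cwd = os.getcwd() if cwd is None else cwd

    # Assumption: an explicit OS_FAULTS_CONFIG setting replaces the defaults.
    explicit = environ.get('OS_FAULTS_CONFIG')
    if explicit:
        return [explicit]

    directories = [
        cwd,
        os.path.expanduser('~/.config/os-faults'),
        '/etc/openstack',
    ]
    return [os.path.join(d, name) for d in directories for name in CONFIG_NAMES]


def find_config(environ=None, cwd=None):
    """Return the first candidate that exists on disk, or None."""
    for path in candidate_config_paths(environ, cwd):
        if os.path.isfile(path):
            return path
    return None
```

Calling `find_config()` with no arguments inspects the real environment and
working directory; passing `environ` and `cwd` explicitly makes the sketch easy
to test.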

Now let's perform some destructive actions:

.. code-block:: python

    cloud_management.get_service(name='memcached').kill()
    cloud_management.get_container(name='neutron_api').restart()


Human API
---------

Human API is simplified and self-descriptive. It includes multiple commands
that are written like normal English sentences.

A **service-oriented** command performs the specified `action` against a `service` on
all nodes, on one random node, or on the node specified by FQDN::

    <action> <service> service [on (random|one|single|<fqdn> node[s])]

Examples:
    * `Restart Keystone service` - restarts Keystone service on all nodes.
    * `kill nova-api service on one node` - kills Nova API on one
      randomly-picked node.
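
To make the grammar concrete, the following sketch splits a service-oriented
command into its parts with a regular expression. The pattern and the function
`parse_service_command` are hypothetical illustrations written from the grammar
above; they are not the parser that os-faults itself uses.

```python
import re

# Hypothetical pattern mirroring the grammar above; not the library's parser.
SERVICE_COMMAND = re.compile(
    r'^(?P<action>\w+)\s+'
    r'(?P<service>\S+)\s+service'
    r'(?:\s+on\s+(?P<node>\S+)\s+nodes?)?$',
    re.IGNORECASE,
)


def parse_service_command(command):
    """Split a service-oriented command into its parts.

    Returns a dict with 'action', 'service' and 'node' keys; 'node' is
    None when the command applies to all nodes.
    """
    match = SERVICE_COMMAND.match(command.strip())
    if match is None:
        raise ValueError('not a service-oriented command: %r' % command)
    parts = match.groupdict()
    parts['action'] = parts['action'].lower()
    return parts
```

For instance, `parse_service_command('kill nova-api service on one node')`
yields the action `kill`, the service `nova-api` and the node selector `one`.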

A **container-oriented** command performs the specified `action` against a `container`
on all nodes, on one random node, or on the node specified by FQDN::

    <action> <container> container [on (random|one|single|<fqdn> node[s])]

Examples:
    * `Restart neutron_ovs_agent container` - restarts neutron_ovs_agent
      container on all nodes.
    * `Terminate neutron_api container on one node` - stops Neutron API
      container on one randomly-picked node.

A **node-oriented** command performs the specified `action` on the node specified by FQDN
or on the set of nodes that run a given service::

    <action> [random|one|single|<fqdn>] node[s] [with <service> service]

Examples:
    * `Reboot one node with mysql service` - reboots one random node with MySQL.
    * `Reset node-2.domain.tld node` - resets node `node-2.domain.tld`.

A **network-oriented** command is a subset of the node-oriented commands and performs a
network management operation on the selected nodes::

    <action> <network> network on [random|one|single|<fqdn>] node[s]
        [with <service> service]

Examples:
    * `Disconnect management network on nodes with rabbitmq service` - shuts
      down management network interface on all nodes where rabbitmq runs.
    * `Connect storage network on node-1.domain.tld node` - enables storage
      network interface on node-1.domain.tld.
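
Because the network-oriented form extends the node-oriented one, a single
pattern with an optional network part can describe both. The sketch below is
a hypothetical illustration derived from the two grammars above, not the
parser used by os-faults; the name `parse_node_command` is an assumption.

```python
import re

# Hypothetical pattern covering node- and network-oriented commands.
NODE_COMMAND = re.compile(
    r'^(?P<action>\w+)\s+'
    r'(?:(?P<network>\S+)\s+network\s+on\s+)?'   # optional network part
    r'(?:(?P<node>\S+)\s+)?nodes?'               # optional node selector
    r'(?:\s+with\s+(?P<service>\S+)\s+service)?$',
    re.IGNORECASE,
)


def parse_node_command(command):
    """Split a node- or network-oriented command into its parts.

    Keys that are absent from the command are returned as None.
    """
    match = NODE_COMMAND.match(command.strip())
    if match is None:
        raise ValueError('not a node-oriented command: %r' % command)
    parts = match.groupdict()
    parts['action'] = parts['action'].lower()
    return parts
```

With this sketch, `Reset node-2.domain.tld node` produces no network and no
service, while `Disconnect management network on nodes with rabbitmq service`
produces the network `management`, the service `rabbitmq` and no node selector.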


Extended API
------------

1. Service actions
~~~~~~~~~~~~~~~~~~

Get a service and restart it:

.. code-block:: python

    cloud_management = os_faults.connect(cloud_config)
    service = cloud_management.get_service(name='glance-api')
    service.restart()

Available actions:
 * `start` - start Service
 * `terminate` - terminate Service gracefully
 * `restart` - restart Service
 * `kill` - terminate Service abruptly
 * `unplug` - unplug Service out of network
 * `plug` - plug Service into network

2. Container actions
~~~~~~~~~~~~~~~~~~~~

Get a container and restart it:

.. code-block:: python

    cloud_management = os_faults.connect(cloud_config)
    container = cloud_management.get_container(name='neutron_api')
    container.restart()

Available actions:
 * `start` - start Container
 * `terminate` - terminate Container gracefully
 * `restart` - restart Container

3. Node actions
~~~~~~~~~~~~~~~

Get all nodes in the cloud and reboot them:

.. code-block:: python

    nodes = cloud_management.get_nodes()
    nodes.reboot()

Available actions:
 * `reboot` - reboot all nodes gracefully
 * `poweroff` - power off all nodes abruptly
 * `reset` - reset (cold restart) all nodes
 * `disconnect` - disable network with the specified name on all nodes
 * `connect` - enable network with the specified name on all nodes

4. Operate with nodes
~~~~~~~~~~~~~~~~~~~~~

Get all nodes where a service runs, pick one of them and reset it:

.. code-block:: python

    nodes = service.get_nodes()
    one = nodes.pick()
    one.reset()

Get nodes where l3-agent runs and disable the management network on them:

.. code-block:: python

    fqdns = neutron.l3_agent_list_hosting_router(router_id)
    nodes = cloud_management.get_nodes(fqdns=fqdns)
    nodes.disconnect(network_name='management')

5. Operate with services
~~~~~~~~~~~~~~~~~~~~~~~~

Restart a service on a single node:

.. code-block:: python

    service = cloud_management.get_service(name='keystone')
    nodes = service.get_nodes().pick()
    service.restart(nodes)

6. Operate with containers
~~~~~~~~~~~~~~~~~~~~~~~~~~

Terminate a container on a random node:

.. code-block:: python

    container = cloud_management.get_container(name='neutron_ovs_agent')
    nodes = container.get_nodes().pick()
    container.terminate(nodes)


License notes
-------------

Ansible is distributed under the GPL-3.0 license, and thus all programs
that link with its code are subject to GPL restrictions [1].
However, these restrictions do not apply to the os-faults library,
since it invokes Ansible as a separate process [2][3].

Ansible modules are provided under the Apache license (compatible with the GPL) [4].
Those modules import part of the Ansible runtime (the modules API) and are executed
on remote hosts. The os-faults library does not import these modules,
either statically or dynamically.

 [1] https://www.gnu.org/licenses/gpl-faq.html#GPLModuleLicense
 [2] https://www.gnu.org/licenses/gpl-faq.html#GPLPlugins
 [3] https://www.gnu.org/licenses/gpl-faq.html#MereAggregation
 [4] https://www.apache.org/licenses/GPL-compatibility.html