File: resize-and-cold-migrate.rst

package info (click to toggle)
nova 2%3A31.0.0-7
  • links: PTS, VCS
  • area: main
  • in suites: forky, sid
  • size: 50,892 kB
  • sloc: python: 412,488; pascal: 1,845; sh: 992; makefile: 166; xml: 83
file content (167 lines) | stat: -rw-r--r-- 7,062 bytes parent folder | download | duplicates (2)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
=======================
Resize and cold migrate
=======================

The `resize API`_ and `cold migrate API`_ are commonly confused in nova because
the internal `API code`_, `conductor code`_ and `compute code`_ use the same
methods. This document explains some of the differences in what
happens between a resize and cold migrate operation.

For the most part this document describes
:term:`same-cell resize <Same-Cell Resize>`.
For details on :term:`cross-cell resize <Cross-Cell Resize>`, refer to
:doc:`/admin/configuration/cross-cell-resize`.

High level
~~~~~~~~~~

:doc:`Cold migrate </admin/migration>` is an operation performed by an
administrator to power off and move a server from one host to a **different**
host using the **same** flavor. Volumes and network interfaces are disconnected
from the source host and connected on the destination host. The type of file
system between the hosts and image backend determine if the server files and
disks have to be copied. If copy is necessary then root and ephemeral disks are
copied and swap disks are re-created.

:doc:`Resize </user/resize>` is an operation which can be performed by a
non-administrative owner of the server (the user) with a **different** flavor.
The new flavor can change certain aspects of the server such as the number of
CPUS, RAM and disk size. Otherwise for the most part the internal details are
the same as a cold migration.

Scheduling
~~~~~~~~~~

Depending on how the API is configured for
:oslo.config:option:`allow_resize_to_same_host`, the server may be able to be
resized on the current host. *All* compute drivers support *resizing* to the
same host but *only* the vCenter driver supports *cold migrating* to the same
host. Enabling resize to the same host is necessary for features such as
strict affinity server groups where there are more than one server in the same
affinity group.

Starting with `microversion 2.56`_ an administrator can specify a destination
host for the cold migrate operation. Resize does not allow specifying a
destination host.

Flavor
~~~~~~

As noted above, with resize the flavor *must* change and with cold migrate the
flavor *will not* change.

Resource claims
~~~~~~~~~~~~~~~

Both resize and cold migration perform a `resize claim`_ on the destination
node. Historically the resize claim was meant as a safety check on the selected
node to work around race conditions in the scheduler. Since the scheduler
started `atomically claiming`_ VCPU, MEMORY_MB and DISK_GB allocations using
Placement the role of the resize claim has been reduced to detecting the same
conditions but for resources like PCI devices and NUMA topology which, at least
as of the 20.0.0 (Train) release, are not modeled in Placement and as such are
not atomic.

If this claim fails, the operation can be rescheduled to an alternative
host, if there are any. The number of possible alternative hosts is determined
by the :oslo.config:option:`scheduler.max_attempts` configuration option.

Allocations
~~~~~~~~~~~

Since the 16.0.0 (Pike) release, the scheduler uses the `placement service`_
to filter compute nodes (resource providers) based on information in the flavor
and image used to build the server. Once the scheduler runs through its filters
and weighers and picks a host, resource class `allocations`_ are atomically
consumed in placement with the server as the consumer.

During both resize and cold migrate operations, the allocations held by the
server consumer against the source compute node resource provider are `moved`_
to a `migration record`_ and the scheduler will create allocations, held by the
instance consumer, on the selected destination compute node resource provider.
This is commonly referred to as `migration-based allocations`_ which were
introduced in the 17.0.0 (Queens) release.

If the operation is successful and confirmed, the source node allocations held
by the migration record are `dropped`_. If the operation fails or is reverted,
the source compute node resource provider allocations held by the migration
record are `reverted`_ back to the instance consumer and the allocations
against the destination compute node resource provider are dropped.

Summary of differences
~~~~~~~~~~~~~~~~~~~~~~

.. list-table::
   :header-rows: 1

   * -
     - Resize
     - Cold migrate
   * - New flavor
     - Yes
     - No
   * - Authorization (default)
     - Admin or owner (user)

       Policy rule: ``os_compute_api:servers:resize``
     - Admin only

       Policy rule: ``os_compute_api:os-migrate-server:migrate``
   * - Same host
     - Maybe
     - Only vCenter
   * - Can specify target host
     - No
     - Yes (microversion >= 2.56)

Sequence Diagrams
~~~~~~~~~~~~~~~~~

The following diagrams are current as of the 21.0.0 (Ussuri) release.

Resize
------

This is the sequence of calls to get the server to ``VERIFY_RESIZE`` status.

.. image:: /_static/images/resize/resize.svg
   :alt: Resize standard workflow

Confirm resize
--------------

This is the sequence of calls when confirming `or deleting`_ a server in
``VERIFY_RESIZE`` status.

Note that in the below diagram, if confirming a resize while deleting a server
the API synchronously calls the source compute service.

.. image:: /_static/images/resize/resize_confirm.svg
   :alt: Resize confirm workflow

Revert resize
-------------

This is the sequence of calls when reverting a server in ``VERIFY_RESIZE``
status.

.. image:: /_static/images/resize/resize_revert.svg
   :alt: Resize revert workflow


.. _resize API: https://docs.openstack.org/api-ref/compute/#resize-server-resize-action
.. _cold migrate API: https://docs.openstack.org/api-ref/compute/#migrate-server-migrate-action
.. _API code: https://opendev.org/openstack/nova/src/tag/19.0.0/nova/compute/api.py#L3568
.. _conductor code: https://opendev.org/openstack/nova/src/tag/19.0.0/nova/conductor/manager.py#L297
.. _compute code: https://opendev.org/openstack/nova/src/tag/19.0.0/nova/compute/manager.py#L4445
.. _microversion 2.56: https://docs.openstack.org/nova/latest/reference/api-microversion-history.html#id52
.. _resize claim: https://opendev.org/openstack/nova/src/tag/19.0.0/nova/compute/resource_tracker.py#L248
.. _atomically claiming: https://opendev.org/openstack/nova/src/tag/19.0.0/nova/scheduler/filter_scheduler.py#L239
.. _moved: https://opendev.org/openstack/nova/src/tag/19.0.0/nova/conductor/tasks/migrate.py#L28
.. _placement service: https://docs.openstack.org/placement/latest/
.. _allocations: https://docs.openstack.org/api-ref/placement/#allocations
.. _migration record: https://docs.openstack.org/api-ref/compute/#migrations-os-migrations
.. _migration-based allocations: https://specs.openstack.org/openstack/nova-specs/specs/queens/implemented/migration-allocations.html
.. _dropped: https://opendev.org/openstack/nova/src/tag/19.0.0/nova/compute/manager.py#L4048
.. _reverted: https://opendev.org/openstack/nova/src/tag/19.0.0/nova/compute/manager.py#L4233
.. _or deleting: https://opendev.org/openstack/nova/src/tag/19.0.0/nova/compute/api.py#L2135