---
layout: default
title: Fractal Application (3/3)
service_jsonnet: https://github.com/google/jsonnet/blob/master/case_studies/fractal/service.jsonnet
packer_jsonnet: https://github.com/google/jsonnet/blob/master/case_studies/fractal/lib/packer.libsonnet
terraform_jsonnet: https://github.com/google/jsonnet/blob/master/case_studies/fractal/lib/terraform.libsonnet
cassandra_jsonnet: https://github.com/google/jsonnet/blob/master/case_studies/fractal/lib/cassandra.libsonnet
makefile: https://github.com/google/jsonnet/blob/master/case_studies/fractal/Makefile
---
<div class="hgroup">
<div class="hgroup-inline">
<div class="panel">
<h1 id=top>
<p class=jump_to_page_top>
Pages <a href="fractal.1.html">1</a>,
<a href="fractal.2.html">2</a>,
3
</p>
Fractal Application
</h1>
</div>
<div style="clear: both"></div>
</div>
</div>
<div class="hgroup">
<div class="hgroup-inline">
<div class="panel">
<h2 id=using>Using The Configuration</h2>
</div>
<div style="clear: both"></div>
</div>
</div>
<div class="hgroup">
<div class="hgroup-inline">
<div class="panel">
<p>
The rest of this article describes simple methodologies for deploying and managing the fractal
application by editing the Jsonnet configuration and then applying those changes. To
reproduce this case study, there are a few prerequisites:
</p>
<ul>
<li>
A Linux or OS X system with GNU Make.
</li>
<li>
Packer, built from GitHub, in your <tt>$PATH</tt>.
</li>
<li>
Terraform, built from GitHub, in your <tt>$PATH</tt>.
</li>
<li>
Jsonnet, built from GitHub (the repository is also where you will find the configuration and
source code).
</li>
<li>
An account on Google Cloud Platform (hosting the application will incur charges).
</li>
</ul>
<p>
Once those are satisfied, follow these steps:
</p>
<ol>
<li>
In the Google Cloud Platform console, open your project, go to APIs and Auth /
credentials, and create a new service account. This will automatically download a P12 key,
which you can delete as we will not be using it. Instead, click the button to download a
JSON key for the new service account, move it to the fractal directory, and call it
<tt>service_account_key.json</tt>.
</li>
<li>
Create a <tt>credentials.jsonnet</tt> file based on the provided template, fill in your GCP
project name, and make up some unique passwords (see the sketch below).
</li>
</ol>
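<p>
The resulting <tt>credentials.jsonnet</tt> might look something like the sketch below. The
field names here are purely illustrative; use the ones from the template shipped with the
fractal configuration.
</p>
<pre class="medium">// Illustrative sketch of credentials.jsonnet; copy the real template.
{
    project: "my-gcp-project",     // Your GCP project name.
    databasePass: "xxxxxxxxxxxx",  // Make up unique passwords.
    rootPass: "xxxxxxxxxxxx",
}</pre>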
</div>
<div style="clear: both"></div>
</div>
</div>
<div class="hgroup">
<div class="hgroup-inline">
<div class="panel">
<h3 id=using_deploy>Initial Deployment and Tear Down</h3>
</div>
<div style="clear: both"></div>
</div>
</div>
<div class="hgroup">
<div class="hgroup-inline">
<div class="panel">
<p>
To deploy the application, run <tt>make -j</tt>. This should start running three Packer builds in
parallel. In a separate terminal, use <tt>tail -f *.log</tt> to watch their progress. When
the images are built, Terraform will show you the proposed changes (a long list of resources
to be created). Enter <tt>y</tt> to confirm the changes. In time, the application will be
deployed. Now you only need the appserv IP address to connect to it. You can get this
using <tt>gcloud compute addresses list</tt> or by navigating the Google Cloud Platform console to
"networks". Opening that IP in a web browser should take you to the fractal application
itself.
</p>
<p>
The application can be brought down again by running <tt>terraform destroy</tt>. Terraform
remembers the resources it created via the <tt>terraform.tfstate</tt> file. This will not
destroy the Packer images; they can be deleted from the console or with <tt>gcloud</tt>.
</p>
<p>
Managing a production web service usually means making continual changes to it rather than
bringing the whole thing down and up again, as we will shortly discuss. However, it is still
useful to bring up a fresh instance of the application for testing or development purposes.
A copy or variant of the production service can be brought up alongside it (e.g., in a
different project). This can be useful for QA, automated integration testing,
or load testing. It is also useful for training new ops staff or rehearsing a complex
production change in a safe environment.
</p>
</div>
<div style="clear: both"></div>
</div>
</div>
<div class="hgroup">
<div class="hgroup-inline">
<div class="panel">
<h3 id=using_cassandra>Add / Remove Cassandra Nodes</h3>
</div>
<div style="clear: both"></div>
</div>
</div>
<div class="hgroup">
<div class="hgroup-inline">
<div class="panel">
<p>
Managing the Cassandra cluster requires a combination of configuration changes (to
control the underlying compute resources) and use of the Cassandra command line tool
<tt>nodetool</tt> on the instances themselves. For example, <tt>nodetool status fractal</tt> on any
Cassandra instance will give information about the whole cluster.
</p>
<p>
To add a new node (expand the cluster), simply edit <a href="{{ page.service_jsonnet
}}"><code>service.jsonnet</code></a> to add another
instance to the Terraform configuration, and run <tt>make -j</tt>. Confirm the changes (the only
change should be the new instance, e.g., db4). It will start up and soon become part of the
cluster.
</p>
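<p>
For example, if the database instances are defined individually in <a href="{{ page.service_jsonnet }}"><code>service.jsonnet</code></a>,
adding <tt>db4</tt> amounts to one more resource block. The sketch below is illustrative: the
<tt>CassandraInstance</tt> template name is assumed and the remaining fields are elided.
</p>
<pre class="medium">// Illustrative sketch: a fourth Cassandra node.
db4: resource.CassandraInstance(4) {
    name: "db4",
    ...
},</pre>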
<p>
To remove a node, first decommission it using <tt>nodetool -h HOSTNAME decommission</tt>. When that
is complete, destroy the actual instance by updating <a href="{{ page.service_jsonnet
}}"><code>service.jsonnet</code></a> to remove the
resource, and run <tt>make -j</tt> again. Confirm the removal of the instance. It is OK to remove
the first node, but its replacement should use <tt>GcpTopUpMixin</tt> instead of
<tt>GcpStarterMixin</tt>. You can recycle all of the nodes if you do it one at a time,
which is actually necessary for emergency kernel upgrades.
</p>
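<p>
Concretely, replacing the first node might look like the sketch below. The
<tt>CassandraInstance</tt> template and the <tt>cassandra.</tt> namespace are illustrative;
the point is simply that the replacement is created with <tt>GcpTopUpMixin</tt> rather than
<tt>GcpStarterMixin</tt>, as noted above.
</p>
<pre class="medium">// Illustrative sketch: db1 rebuilt with the top-up mixin.
db1: resource.CassandraInstance(1) + cassandra.GcpTopUpMixin {
    name: "db1",
    ...
},</pre>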
<p>
If a node is permanently and unexpectedly lost (e.g., due to a disk error), or you removed it
without first decommissioning it, the cluster will remain in a state where it expects the
dead node to return at some point (as if it were temporarily powered down or on the wrong
side of a network split). This situation can be rectified with <tt>nodetool removenode UUID</tt>,
run from any other node in the cluster. In this case it is probably also necessary to run
<tt>nodetool repair</tt> on the other nodes to ensure data is properly distributed.
</p>
</div>
<div style="clear: both"></div>
</div>
</div>
<div class="hgroup">
<div class="hgroup-inline">
<div class="panel">
<h3 id=using_appserv>Canary A Change To The Application Server</h3>
</div>
<div style="clear: both"></div>
</div>
</div>
<div class="hgroup">
<div class="hgroup-inline">
<div class="panel">
<p>
To introduce new functionality to the application server, it is useful to divert a small
proportion of user traffic to the new code to ensure it is working properly. After this
initial "canary" test has passed, the remaining traffic can be confidently transferred
to the new code. The same applies to the tile generation service (e.g., to update the
C++ code).
</p>
<p>
The model used by this example is that the application server logic and static content are
embedded in the application server image. Canarying consists of building a new image and
then rolling it out gradually one instance at a time. Each step of this methodology
consists of a small modification to <a href="{{ page.service_jsonnet
}}"><code>service.jsonnet</code></a> and then running make -j.
</p>
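<p>
For example, the first step below amounts to little more than a one-line change to the Packer
image name embedded in <a href="{{ page.service_jsonnet }}"><code>service.jsonnet</code></a>.
In this sketch the <tt>ApplicationImage</tt> template name is illustrative; the image names
are the ones used in the examples that follow.
</p>
<pre class="medium">"appserv.packer.json": ApplicationImage {
    // Bump the embedded date to give the new image a fresh name.
    name: "appserv-v20150102-1200",  // previously "appserv-v20141222-0300"
    ...
},</pre>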
<ol>
<li>
Edit the <tt>appserv.packer.json</tt> Packer configuration in <a href="{{
page.service_jsonnet }}"><code>service.jsonnet</code></a> to update the date embedded in
the <tt>name</tt> field to the current date, and also make any desired changes to the
configuration of the image or any of the referenced Python / HTML / CSS files.
</li>
<li>
Run <tt>make -j</tt> to build the new image. Note that the previous image is still available under
the old name, which means it is possible to create new instances using either the old or
the new image. This is essential for ongoing maintenance of the cluster if the change has
to be rolled back.
</li>
<li>
Create a single instance with the new image by adding it to the
<tt>google_compute_instance</tt> section of <a href="{{ page.service_jsonnet
}}"><code>service.jsonnet</code></a>. The easiest way to do this is to copy-paste the
existing definition and modify the image name in the copy to reflect the new image. This
makes it easy to roll back (by deleting the copied code) and, later, to transition the
rest of the nodes (by deleting the original code). Thus, the duplication is only temporary.
At this point the configuration may look like this:
<pre class="medium">google_compute_instance: {
["appserv" + k]: resource.FractalInstance(k) {
name: "appserv" + k,
image: "appserv-v20141222-0300",
...
}
for k in [1, 2, 3]
} + {
["appserv" + k]: resource.FractalInstance(k) {
name: "appserv" + k,
image: "appserv-v20150102-1200",
...
}
for k in [4]
} + ...</pre>
Also modify the appserv target pool to add the new instance 4, thus ensuring it receives
traffic.
<pre class="medium">appserv: {
name: "appserv",
health_checks: ["${google_compute_http_health_check.fractal.name}"],
instances: [ "%s/appserv%d" % [zone(k), k] for k in [1, 2, 3, 4] ],
}, </pre>
</li>
<li>
Run <tt>make -j</tt> to effect those changes, and monitor the situation to ensure that there is no
spike in errors. It is also possible to run <tt>make -j</tt> between the above two steps if you
wish to interact with the new instance before directing user traffic at it.
</li>
<li>
If there is a problem, pull instance 4 out of the target pool and re-run <tt>make -j</tt>. That
will leave the instance up (useful for investigation) but it will no longer receive user traffic.
Otherwise, add more instances (5, 6, ...) and add them to the target pool. You can now
start pulling the old instances out of the target pool, ensuring that there is always
sufficient capacity for your traffic load; the end state of the target pool is sketched after
this list. As always, <tt>make -j</tt> punctuates the steps.
</li>
<li>
Once the old instances are drained of user traffic, they can be destroyed. You can do
this in batches or one at a time. Eventually, the configuration will look like this, at
which point the first block no longer contributes anything to the configuration and can
be deleted:
<pre class="medium">google_compute_instance: {
["appserv" + k]: resource.FractalInstance(k) {
name: "appserv" + k,
image: "appserv-v20141222-0300",
...
}
for k in []
} + {
["appserv" + k]: resource.FractalInstance(k) {
name: "appserv" + k,
image: "appserv-v20150102-1200",
...
}
for k in [4, 5, 6]
} + ... </pre>
</li>
</ol>
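<p>
Correspondingly, the <tt>appserv</tt> target pool from the earlier step ends up referencing
only the new instances, along these lines:
</p>
<pre class="medium">appserv: {
    name: "appserv",
    health_checks: ["${google_compute_http_health_check.fractal.name}"],
    instances: [ "%s/appserv%d" % [zone(k), k] for k in [4, 5, 6] ],
},</pre>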
</div>
<div style="clear: both"></div>
</div>
</div>
<div class="hgroup">
<div class="hgroup-inline">
<div class="panel">
<h2 id=conclusion>Conclusion</h2>
</div>
<div style="clear: both"></div>
</div>
</div>
<div class="hgroup">
<div class="hgroup-inline">
<div class="panel">
<p>
We have shown how Jsonnet can be used to centralize, unify, and manage the configuration of a
realistic cloud application. We have demonstrated how programming language abstraction
techniques make the configuration very concise, with reusable elements separated into
template libraries. Complexity remains under control in spite of the variety of
configurations, formats, and tasks involved.
</p>
<p>
Finally, we demonstrated how, with a little procedural glue to drive other processes (the
humble UNIX make), we were able to build an operations methodology in which many aspects of
the service can be controlled centrally by editing a single Jsonnet file and issuing
<tt>make -j</tt> update commands.
</p>
</div>
<div style="clear: both"></div>
</div>
</div>
<div class="hgroup">
<div class="hgroup-inline">
<div class="panel">
<p class=jump_to_page>
Pages <a href="fractal.1.html">1</a>,
<a href="fractal.2.html">2</a>,
3
</p>
</div>
<div style="clear: both"></div>
</div>
</div>