<!DOCTYPE html>
<html class="writer-html5" lang="en" >
<head>
  <meta charset="utf-8" /><meta name="generator" content="Docutils 0.19: https://docutils.sourceforge.io/" />

  <meta name="viewport" content="width=device-width, initial-scale=1.0" />
  <title>Storage Considerations &mdash; Cyrus IMAP 3.10.2 documentation</title>
      <link rel="stylesheet" href="../../../_static/pygments.css" type="text/css" />
      <link rel="stylesheet" href="../../../_static/css/theme.css" type="text/css" />
      <link rel="stylesheet" href="../../../_static/graphviz.css" type="text/css" />
      <link rel="stylesheet" href="../../../_static/cyrus.css" type="text/css" />
  
        <script data-url_root="../../../" id="documentation_options" src="../../../_static/documentation_options.js"></script>
        <script src="../../../_static/jquery.js"></script>
        <script src="../../../_static/underscore.js"></script>
        <script src="../../../_static/_sphinx_javascript_frameworks_compat.js"></script>
        <script src="../../../_static/doctools.js"></script>
        <script src="../../../_static/sphinx_highlight.js"></script>
    <script src="../../../_static/js/theme.js"></script>
    <link rel="index" title="Index" href="../../../genindex.html" />
    <link rel="search" title="Search" href="../../../search.html" />
    <link rel="next" title="Supported Platforms and System Requirements" href="supported-platforms.html" />
    <link rel="prev" title="Performance Recommendations" href="performance_recommendations.html" /> 
</head>

<body class="wy-body-for-nav"> 
  <div class="wy-grid-for-nav">
    <nav data-toggle="wy-nav-shift" class="wy-nav-side">
      <div class="wy-side-scroll">
        <div class="wy-side-nav-search" >

          
          
          <a href="../../../index.html" class="icon icon-home">
            Cyrus IMAP
          </a>
              <div class="version">
                3.10.2
              </div>
<div role="search">
  <form id="rtd-search-form" class="wy-form" action="../../../search.html" method="get">
    <input type="text" name="q" placeholder="Search docs" aria-label="Search docs" />
    <input type="hidden" name="check_keywords" value="yes" />
    <input type="hidden" name="area" value="default" />
  </form>
</div>
        </div><div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="Navigation menu">
              <p class="caption" role="heading"><span class="caption-text">Cyrus IMAP</span></p>
<ul class="current">
<li class="toctree-l1"><a class="reference internal" href="../../../download.html">Download</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../../quickstart.html">Quickstart Guide</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../../overview.html">Overview</a></li>
<li class="toctree-l1 current"><a class="reference internal" href="../../../setup.html">Setup</a><ul class="current">
<li class="toctree-l2"><a class="reference internal" href="../../developer/compiling.html">Compiling</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../installing.html">Installing Cyrus</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../download/upgrade.html">Upgrading to 3.10</a></li>
<li class="toctree-l2 current"><a class="reference internal" href="../deployment.html">Configuration Guide</a><ul class="current">
<li class="toctree-l3"><a class="reference internal" href="deployment_scenarios.html">Deployment Scenarios</a></li>
<li class="toctree-l3"><a class="reference internal" href="deployment_scenarios.html#cyrus-murder-server-aggregation">Cyrus Murder: Server aggregation</a></li>
<li class="toctree-l3"><a class="reference internal" href="deployment_scenarios.html#cyrus-replication">Cyrus Replication</a></li>
<li class="toctree-l3"><a class="reference internal" href="deployment_scenarios.html#hosted-environments">Hosted Environments</a></li>
<li class="toctree-l3"><a class="reference internal" href="databases.html">Databases</a></li>
<li class="toctree-l3"><a class="reference internal" href="mailbox_creation_distribution.html">Mailbox Creation Distribution</a></li>
<li class="toctree-l3"><a class="reference internal" href="known_protocol_limitations.html">Known Protocol Limitations</a></li>
<li class="toctree-l3"><a class="reference internal" href="authentication_and_authorization.html">Authentication and Authorization</a></li>
<li class="toctree-l3"><a class="reference internal" href="performance_recommendations.html">Performance Recommendations</a></li>
<li class="toctree-l3 current"><a class="current reference internal" href="#">Storage Considerations</a></li>
<li class="toctree-l3"><a class="reference internal" href="supported-platforms.html">Supported Platforms and System Requirements</a></li>
</ul>
</li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../../../operations.html">Operations</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../../developers.html">Developers</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../../support.html">Support/Community</a></li>
</ul>
<p class="caption" role="heading"><span class="caption-text">Cyrus SASL</span></p>
<ul>
<li class="toctree-l1"><a class="reference external" href="http://www.cyrusimap.org/sasl">Cyrus SASL</a></li>
</ul>

        </div>
      </div>
    </nav>

    <section data-toggle="wy-nav-shift" class="wy-nav-content-wrap"><nav class="wy-nav-top" aria-label="Mobile navigation menu" >
          <i data-toggle="wy-nav-top" class="fa fa-bars"></i>
          <a href="../../../index.html">Cyrus IMAP</a>
      </nav>

      <div class="wy-nav-content">
        <div class="rst-content">
          <div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
           <div itemprop="articleBody">
             
  <section id="storage-considerations">
<span id="imap-deployment-storage"></span><h1>Storage Considerations<a class="headerlink" href="#storage-considerations" title="Permalink to this heading"></a></h1>
<p>Storage considerations are a complex matter, as the various options
provide or restrict one's ability to adjust the necessary parameters as
the need arises. The foremost challenge is to clearly articulate and
prioritize the criteria for storage, and to map the theory onto a
practical implementation design.</p>
<p>This article intends to provide information and outline details, and
sometimes opinions and recommendations, but it is not a guide to
providing you with the storage solution that you want or require.</p>
<p>Generally, the most important considerations for storage include:</p>
<p><a class="reference internal" href="#imap-deployment-storage-redundancy"><span class="std std-ref">Redundancy</span></a>,</p>
<blockquote>
<div><p>because nothing is as humiliating as losing all your data.</p>
</div></blockquote>
<p><a class="reference internal" href="#imap-deployment-storage-availability"><span class="std std-ref">Availability</span></a>,</p>
<blockquote>
<div><p>because nothing is more stressful than none of your data being
available.</p>
</div></blockquote>
<p><a class="reference internal" href="#imap-deployment-storage-performance"><span class="std std-ref">Performance</span></a>,</p>
<blockquote>
<div><p>because nothing is as annoying as waiting, followed by some more
waiting.</p>
</div></blockquote>
<p><a class="reference internal" href="#imap-deployment-storage-scalability"><span class="std std-ref">Scalability</span></a>,</p>
<blockquote>
<div><p>because <code class="docutils literal notranslate"><span class="pre">-ENOSPC</span></code> is good only when it applies to your stomach.</p>
</div></blockquote>
<p><a class="reference internal" href="#imap-deployment-storage-capacity"><span class="std std-ref">Capacity</span></a>,</p>
<blockquote>
<div><p>because your data must be available, backed up and archived.</p>
</div></blockquote>
<p><a class="reference internal" href="#imap-deployment-storage-cost"><span class="std std-ref">Cost</span></a>,</p>
<blockquote>
<div><p>because you can't buy a beer or feed a family with an empty wallet.</p>
</div></blockquote>
<p>Storage is not a part of Cyrus IMAP, in that Cyrus IMAP does not ship
a particular storage solution as part of the product, and it has no
particular requirements for storage either.</p>
<p>As such, your SAN, NAS, local disk, local array of disks, network
share, or even the flash drive of a Raspberry Pi could be used, although
the following considerations are important:</p>
<ul class="simple">
<li><p>The Cyrus IMAP spool is I/O intensive (large volumes of data are read
and written).</p></li>
<li><p>The Cyrus IMAP spool consists of many small files.</p></li>
</ul>
<p>As such, we recommend you take into account:</p>
<ul class="simple">
<li><p>The available bandwidth between the IMAP server and the storage
provider, if the storage is accessed over the network at all.</p></li>
<li><p>The (network) protocol overhead, if any, should file-level read
and/or write locking be required or implied.</p></li>
<li><p>Atomic file operations.</p></li>
<li><p>Parallel access (such as shared mailboxes or multiple clients
accessing the same mailbox).</p></li>
</ul>
<section id="general-notes-on-storage">
<h2>General Notes on Storage<a class="headerlink" href="#general-notes-on-storage" title="Permalink to this heading"></a></h2>
<p>The aforementioned considerations
<a class="reference internal" href="#imap-deployment-storage-redundancy"><span class="std std-ref">Redundancy</span></a>,
<a class="reference internal" href="#imap-deployment-storage-availability"><span class="std std-ref">Availability</span></a>,
<a class="reference internal" href="#imap-deployment-storage-performance"><span class="std std-ref">Performance</span></a>,
<a class="reference internal" href="#imap-deployment-storage-scalability"><span class="std std-ref">Scalability</span></a>,
<a class="reference internal" href="#imap-deployment-storage-capacity"><span class="std std-ref">Capacity</span></a> and
<a class="reference internal" href="#imap-deployment-storage-cost"><span class="std std-ref">Cost</span></a>
are not all equally important -- not to every organization, and not for
every set of requirements once the priorities are weighed against the
implied cost of the supposed ideal solution.</p>
<p>They are also not mutually exclusive in that, for example, redundancy
may partly address some of the availability concerns -- depending on the
exact nature of the final deployment of course, and backup/recovery
capabilities in turn may partly address redundancy requirements. Neither
necessarily directly addresses availability concerns, however.</p>
<p>What is deemed acceptable is another culprit -- more often than not,
operational cost, familiarity of staff with a particular storage
solution, or flexibility of a storage solution (or lack thereof) may get
in the way of an otherwise appropriate storage solution.</p>
<p>We believe, however, that given a sufficient amount of accurate
information, you are able to make an informed choice, and that an informed
choice is always better than an ill-informed one.</p>
</section>
<section id="redundancy">
<span id="imap-deployment-storage-redundancy"></span><h2>Redundancy<a class="headerlink" href="#redundancy" title="Permalink to this heading"></a></h2>
<p>Storage redundancy is achieved through replication of data. It is
important to understand that, as a matter of design principle,
redundancy does not in and of itself provide increased availability.</p>
<p>How redundancy could increase availability depends on the exact
implementation, and the various options for practical implementation
each have their own set of implications for cases of failure and the
need to, under certain circumstances, failover and/or recover.</p>
<p>How redundancy is achieved in an &quot;acceptable&quot; manner is another subject
open to interpretation; it is sometimes deemed acceptable to create
backups daily, and therefore potentially accept the loss of up to one
day's worth of information from live spools -- which may or may not be
recoverable through different means. More commonly, however, nothing
less than real-time replication of data is deemed acceptable.</p>
<p>While storage ultimately amounts to disks, it is important to understand
that a number of (virtual) devices, channels, links and interfaces exist
between an application operating on data on disk <a class="footnote-reference brackets" href="#id7" id="id1" role="doc-noteref"><span class="fn-bracket">[</span>1<span class="fn-bracket">]</span></a>, and the physical
sectors and blocks or cells of storage on that disk. In a way, this
number of layers can be compared with the <a class="reference external" href="http://en.wikipedia.org/wiki/OSI_model">OSI model for networking</a> --
but it is not the same at all.</p>
<p>This section addresses the most commonly used levels at which
replication can be applied.</p>
<section id="storage-volume-level-replication">
<h3>Storage Volume Level Replication<a class="headerlink" href="#storage-volume-level-replication" title="Permalink to this heading"></a></h3>
<p>When using the term <a class="reference internal" href="../../../glossary.html#term-storage-volume-level-replication"><span class="xref std std-term">storage volume level replication</span></a> we mean to
indicate the replication of <a class="reference internal" href="../../../glossary.html#term-disk-volumes"><span class="xref std std-term">disk volumes</span></a> as a whole. A
simplistic replication scenario of a data disk between two nodes could
look as follows:</p>
<div class="graphviz"><img src="../../../_images/graphviz-3d117e14f1a624d46409d9f38d14363039c26823.png" alt="digraph drbd {
        rankdir = LR;
        splines = true;
        overlap = prism;

        edge [color=gray50, fontname=Calibri, fontsize=11];
        node [style=filled, shape=record, fontname=Calibri, fontsize=11];

        subgraph cluster_master {
                label = &quot;Master&quot;;

                color = &quot;#BBFFBB&quot;;
                fontname = Calibri;
                rankdir = TB;
                style = filled;

                &quot;OS Disk 0&quot; [label=&quot;OS Disk&quot;,color=&quot;green&quot;];
                &quot;Data Disk 0&quot; [label=&quot;Data Disk&quot;,color=&quot;green&quot;];
            }

        subgraph cluster_slave {
                label = &quot;Slave&quot;;

                color = &quot;#FFBBBB&quot;;
                fontname = Calibri;
                rankdir = TB;
                style = filled;

                &quot;OS Disk 1&quot; [label=&quot;OS Disk&quot;,color=&quot;green&quot;];
                &quot;Data Disk 1&quot; [label=&quot;Data Disk&quot;,color=&quot;red&quot;];
            }

        &quot;Data Disk 0&quot; -&gt; &quot;Data Disk 1&quot; [label=&quot;One-Way Replication&quot;];
    }" class="graphviz" /></div>
<p>For a fully detailed picture of the internal structures, please see the
<a class="reference external" href="http://www.drbd.org/">DRBD</a> website; the DRBD project is the canonical authority on this level of replication.</p>
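<p>As a rough sketch of what such a two-node setup can look like, a DRBD
8.4-style resource definition might resemble the following. The resource
name <code class="docutils literal notranslate"><span class="pre">r0</span></code>, the host names, the addresses and the device paths are all
placeholders assumed for illustration, and the exact syntax should be
checked against the DRBD documentation for the version in use.</p>
<div class="highlight-none notranslate"><div class="highlight"><pre># /etc/drbd.d/r0.res -- illustrative only; every name, path and address
# below is an assumption, not taken from a real deployment.
resource r0 {
  on alice {                    # host name as reported by `uname -n`
    device    /dev/drbd0;       # replicated block device seen by the OS
    disk      /dev/sdb1;        # local backing device (the "Data Disk")
    address   10.0.0.1:7789;    # replication link endpoint on this node
    meta-disk internal;
  }
  on bob {
    device    /dev/drbd0;
    disk      /dev/sdb1;
    address   10.0.0.2:7789;
    meta-disk internal;
  }
}
</pre></div>
</div>
<p>Only the node currently holding the primary role would mount the
filesystem on the replicated device; the other node merely receives the
replicated blocks.</p>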
<p>Storage-level replication normally works in a way that can be compared
to a distributed version of a RAID-1 array. This incurs limitations that
need to be evaluated carefully.</p>
<p>In a hardware RAID-1 array, storage is physically constrained to a
single node, and pairs of replicated disks are treated as one. In a
software RAID-1 array, it is the operating system's software RAID driver
that can (must) address the individual disks, but makes the array appear
as a single disk to all higher-level software. Here too, the disks are
physically constrained to one physical node.</p>
<p>In both cases, a <em>single point of control</em> exists with full and
exclusive access to the physical disk device(s), namely the interface
for <em>all higher-level software</em> to interact with the storage.</p>
<p>This is the underlying cause of the storage-level replication conundrum.</p>
<p>To illustrate the conundrum, we use a software RAID-1 array. The
individual disk volumes that make up the RAID-1 array are not hidden
from the rest of the operating system, but more importantly, direct
access to the underlying device is not prohibited. With an example pair
<code class="docutils literal notranslate"><span class="pre">sda2</span></code> and <code class="docutils literal notranslate"><span class="pre">sdb2</span></code>, nothing prevents you from executing <code class="docutils literal notranslate"><span class="pre">mkfs.ext4</span></code>
on <code class="docutils literal notranslate"><span class="pre">/dev/sdb2</span></code> thereby corrupting the array -- other than perhaps not
having the necessary administrative privileges.</p>
<p>To further illustrate, position one disk in the RAID-1 array on the
other side of a network (as in a <a class="reference external" href="http://www.drbd.org/">DRBD</a> topology, as illustrated).
Since two nodes now participate in maintaining the mirrored volume, two
points of control exist -- each node controls the access to its local
disk device(s).</p>
<p>Participating nodes are <strong>required</strong> to successfully coordinate their
I/O with one another, which on the level of entire storage volumes is a
very impractical effort with high latency and enormous overhead, should
more than one node be allowed to access the replicated device <a class="footnote-reference brackets" href="#id8" id="id2" role="doc-noteref"><span class="fn-bracket">[</span>2<span class="fn-bracket">]</span></a>.</p>
<p>It is therefore understood that, when using storage-level replication:</p>
<ul class="simple">
<li><p>Only one side of the mirrored volume can be active (master), and the
other side must remain passive (slave).</p></li>
<li><p>The active and passive nodes therefore have a cluster solution
implemented to manage the application's failover and the change in
replication topology (a slave becomes the I/O master, the former
master becomes the replication slave, and other slaves, if any, learn
about the new master to replicate from).</p></li>
<li><p>Failover implementations include fencing (the STONITH principle),
ensuring that no two nodes perform I/O on the same volume in parallel
at any given time.</p></li>
</ul>
<div class="admonition warning">
<p class="admonition-title">Warning</p>
<p>Storage volume level replication does not protect against filesystem
or payload corruption -- the replication happily mirrors the
&quot;faulty&quot; bits as it is completely agnostic to the bits' meaning and
relevance.</p>
</div>
<div class="admonition warning">
<p class="admonition-title">Warning</p>
<p>For the reasons outlined in this section, storage volume level
replication has only a limited number of Cyrus IMAP deployment
scenarios for which it would be recommended -- such as <em>Disaster
Recovery Failover</em>.</p>
</div>
</section>
<section id="integrated-storage-protocol-level-replication">
<span id="imap-deployment-storage-integrated-storage-protocol-level-replication"></span><h3>Integrated Storage Protocol Level Replication<a class="headerlink" href="#integrated-storage-protocol-level-replication" title="Permalink to this heading"></a></h3>
<p>Integrated storage protocol level replication is a different approach to
making storage volumes redundant, applying the replication on a
different level.</p>
<p>Integrated storage protocol level replication isn't necessarily limited
to replication for the purposes of redundancy only. It may also
include parallel access controls and distribution across multiple storage
nodes (each providing a part of the total storage volume available),
which enables the use of cheap commodity hardware to provide the individual
parts (called &quot;bricks&quot;) that make up the larger volume.</p>
<p>Additional features may include the use of a geographically oriented set
of parameters for the calculation and assignment of replicated chunks of
data (i.e. &quot;brick replication topology&quot;).</p>
<div class="graphviz"><img src="../../../_images/graphviz-b94689a1c06f2d38b8797872ac6d90bc2bea88ca.png" alt="digraph {
        rankdir = TB;
        splines = true;
        overlap = prism;

        edge [color=gray50, fontname=Calibri, fontsize=11];
        node [style=filled, shape=record, fontname=Calibri, fontsize=11];

        &quot;Storage Client #1&quot; -&gt; &quot;Storage Access Point&quot; [dir=back,color=green];
        &quot;Storage Client #2&quot; -&gt; &quot;Storage Access Point&quot; [dir=back,color=green];
        &quot;Storage Client #3&quot; -&gt; &quot;Storage Access Point&quot; [dir=back,color=green];
        &quot;Storage Client #4&quot; -&gt; &quot;Storage Access Point&quot; [dir=back,color=green];

        subgraph cluster_storage {
                color = green;
                label = &quot;Distributed and/or Replicated Volume Manager w/ Integrated Distributed (File-) Locking&quot;;

                &quot;Storage Access Point&quot; [shape=point,color=green];

                &quot;Brick #1&quot; [color=green];
                &quot;Brick #2&quot; [color=green];
                &quot;Brick #3&quot; [color=green];
                &quot;Brick #4&quot; [color=green];

                &quot;Storage Access Point&quot; -&gt; &quot;Brick #1&quot; [color=green];
                &quot;Storage Access Point&quot; -&gt; &quot;Brick #2&quot; [color=green];
                &quot;Storage Access Point&quot; -&gt; &quot;Brick #3&quot; [color=green];
                &quot;Storage Access Point&quot; -&gt; &quot;Brick #4&quot; [color=green];
            }
    }" class="graphviz" /></div>
<p>Current implementations of this type of technology include <a class="reference external" href="http://www.glusterfs.org">GlusterFS</a>
and <a class="reference external" href="http://ceph.com">Ceph</a>. Put way too simplistically, both technologies apply very
smart ways of storing individual objects, sometimes with additional
facilities for certain object types. How they work exactly is far beyond
the scope of this document.</p>
<p>Both technologies, however, are considered more efficient at handling a
small number of large objects than a large number of small objects. Both
storage solutions also tend to be more efficient at addressing individual
objects directly, rather than hierarchies of objects (for listing).</p>
<p>This is meant to indicate that while both solutions scale up to millions
of objects, they facilitate a particular <strong>I/O pattern</strong> better than the
I/O pattern typically associated with a large volume of messages in IMAP
spools. More frequent and very short-lived I/O against individual
objects in a filesystem mounted directly causes a significant amount of
overhead in negotiating the access to the objects across the storage
cluster <a class="footnote-reference brackets" href="#id8" id="id3" role="doc-noteref"><span class="fn-bracket">[</span>2<span class="fn-bracket">]</span></a>.</p>
<p>Both technologies are perfectly suitable for large clusters with
relatively small filesystems (see <a class="reference external" href="https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Global_File_System_2/ch-considerations.html#s2-fssize-gfs2">Filesystems: Smaller is Better</a>)
if they are mounted directly from the storage clients. They are
particularly feasible if not too many parallel write operations to
individual objects (files) are likely to occur (think, for example, of
web application servers and (asset-)caching proxies).</p>
<p>Alternatively, fewer larger objects could be stored -- such as disk
images for a virtualization environment. The I/O patterns internal to
the virtual machine would remain the same, but the I/O pattern of the
storage client (the hypervisor) is the equivalent of a single
lock-and-open when the virtual machine starts.</p>
<p>It is therefore understood that, especially in deployments of a larger
scale, one should not mount a GlusterFS or CephFS filesystem directly
from within an IMAP server, as an individual IMAP mail spool consists of
many very small objects, each individually addressed frequently and in
short-lived I/O operations. Instead, consider the use of these distributed
filesystems for a different level of object storage, such as disk images
for a virtualization environment:</p>
<div class="graphviz"><img src="../../../_images/graphviz-afb8d78934110ec33e73c0a576318971d3e3aa3c.png" alt="digraph {
           rankdir = TB;
           splines = true;
           overlap = prism;

           edge [color=gray50, fontname=Calibri, fontsize=11];
           node [style=filled, shape=record, fontname=Calibri, fontsize=11];

           subgraph cluster_guests {
                   label = &quot;Guest Nodes&quot;;

                   &quot;Guest #1&quot;;
                   &quot;Guest #2&quot;;
                   &quot;Guest #3&quot;;
               }

           subgraph cluster_hypervisors {
                   label = &quot;Virtualization Platform&quot;;

                   &quot;Hypervisor #1&quot;;
                   &quot;Hypervisor #2&quot;;
               }

           subgraph cluster_storage {
                   color = green;
                   label = &quot;Distributed and/or Replicated Volume
Manager w/ Integrated Distributed (File-) Locking&quot;;

                   subgraph cluster_replbricks1 {
                           label = &quot;Replicated Bricks&quot;;

                           &quot;Brick #1&quot; [color=green];
                           &quot;Brick #3&quot; [color=green];
                       }

                   subgraph cluster_replbricks2 {
                           label = &quot;Replicated Bricks&quot;;

                           &quot;Brick #2&quot; [color=green];
                           &quot;Brick #4&quot; [color=green];
                       }

               }

           &quot;Guest #1&quot; -&gt; &quot;Hypervisor #1&quot; [dir=both,color=green];
           &quot;Guest #2&quot; -&gt; &quot;Hypervisor #1&quot; [dir=both,color=green];
           &quot;Guest #3&quot; -&gt; &quot;Hypervisor #2&quot; [dir=both,color=green];

           &quot;Hypervisor #1&quot; -&gt; &quot;Brick #4&quot; [dir=both,label=&quot;Guest #1&quot;];
           &quot;Hypervisor #1&quot; -&gt; &quot;Brick #3&quot; [dir=both,label=&quot;Guest #2&quot;];
           &quot;Hypervisor #2&quot; -&gt; &quot;Brick #3&quot; [dir=both,label=&quot;Guest #3&quot;];
       }" class="graphviz" /></div>
<p>In this illustration, <em>Hypervisor #1</em> and <em>Hypervisor #2</em> are storage
clients, and replicated bricks hold the disk images of each guest.</p>
<p>Each hypervisor can, in parallel, perform I/O against each individual
disk image, allowing (for example) both <em>Hypervisor #1</em> and
<em>Hypervisor #2</em> to run guests with disk images for which <em>Brick #3</em> has
been selected as the authoritative copy.</p>
</section>
<section id="application-level-replication">
<span id="deployment-application-replication"></span><h3>Application Level Replication<a class="headerlink" href="#application-level-replication" title="Permalink to this heading"></a></h3>
<p>Yet another means to provide redundancy of data is to use
application-level replication where available.</p>
<p>Well-known examples include LDAP replication and database server
replication, where one or more MySQL masters are used for write
operations and one or more MySQL slaves are used for read operations.</p>
<p>Cyrus IMAP can also replicate its mail spools to other systems, such
that multiple backends hold the payload served to your users.</p>
</section>
<section id="shared-storage-generic">
<h3>Shared Storage (Generic)<a class="headerlink" href="#shared-storage-generic" title="Permalink to this heading"></a></h3>
<p>Contrary to popular belief, shared storage technologies -- NFS, iSCSI
and FC alike -- are <strong>not</strong> storage devices. They are <em>network protocols</em> for
which the application just so happens to be storage -- with perhaps the
exception to the rule being Fibre Channel, which does not strictly adhere to the
<a class="reference external" href="http://en.wikipedia.org/wiki/OSI_model">OSI model for networking</a>, although its own 5-layer model is comparable.</p>
<p>iSCSI and Fibre Channel LUNs, however, are <em>mapped</em> to storage devices by
your favorite operating system's drivers for each technology, or
possibly by a hardware device (an <a class="reference internal" href="../../../glossary.html#term-HBA"><span class="xref std std-term">HBA</span></a>, or in iSCSI, an
<em>initiator</em>).</p>
<p>As such, use of these network protocols for which the purpose just so
happens to be storage does <strong>not</strong> provide redundancy.</p>
<p>It is imperative that this is understood and applied equally well when
planning storage infrastructure, or that you can trust the judgement of
your storage appliance vendor or consultancy partner.</p>
</section>
<section id="shared-storage-nfs">
<h3>Shared Storage (NFS)<a class="headerlink" href="#shared-storage-nfs" title="Permalink to this heading"></a></h3>
<p>Use of the Network File System (NFS) in and by itself does <strong>not</strong>
provide redundancy, although the underlying storage volume might be
replicated.</p>
<p>For a variety of reasons, the use of <a class="reference external" href="http://www.time-travellers.org/shane/papers/NFS_considered_harmful.html">NFS is considered harmful</a>, and it is
therefore most definitely not recommended for the IMAP spool storage of
Cyrus IMAP, or for any other storage related to
functional components of Cyrus IMAP itself -- IMAP, LDAP, SQL, etc.</p>
<p>Most individual concerns can be addressed separately, and some should or
must already be resolved to address other potentially problematic areas
of a given infrastructure, regardless of the use of NFS.</p>
<p>A couple of concerns however only have <em>workarounds</em>, not solutions --
such as disabling locking -- and a number of concerns have no solution
at all.</p>
<p>One penalty to address is the inability of NFS-mounted volumes to cache
I/O in memory, known as in-memory buffer caching.</p>
<p>A technology called <a class="reference external" href="https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Storage_Administration_Guide/ch-fscache.html">FS Cache</a> can help eliminate the
network latency incurred by round trips, but it is still a filesystem-backed solution
(for which filesystem the local kernel applies buffer caching), requires
yet another daemon, and introduces yet another layer of synchronization to
be maintained -- aside from <a class="reference external" href="https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Storage_Administration_Guide/fscachelimitnfs.html">other limitations</a>.</p>
<p>An NFS-backed storage volume can still be used for fewer, larger files,
such as guest disk images.</p>
</section>
<section id="shared-storage-iscsi-or-fc-luns">
<h3>Shared Storage (iSCSI or FC LUNs)<a class="headerlink" href="#shared-storage-iscsi-or-fc-luns" title="Permalink to this heading"></a></h3>
<p>Both iSCSI LUNs and Fibre Channel LUNs facilitate attaching a networked
block storage device as if it were a local disk (creating devices
similar to <code class="docutils literal notranslate"><span class="pre">/dev/sd{a,b,c,d}</span></code> etc.).</p>
<p>Since such a LUN is available over a &quot;network&quot; infrastructure, it may be
shared between multiple nodes, but when it is, the nodes need to coordinate
their I/O on some other level.</p>
<p>With an example case of hypervisors, either <a class="reference external" href="https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Logical_Volume_Manager_Administration/LVM_Cluster_Overview.html">Cluster LVM</a> <a class="footnote-reference brackets" href="#id9" id="id4" role="doc-noteref"><span class="fn-bracket">[</span>3<span class="fn-bracket">]</span></a> or
<a class="reference external" href="http://en.wikipedia.org/wiki/GFS2">GFS</a> <a class="footnote-reference brackets" href="#id10" id="id5" role="doc-noteref"><span class="fn-bracket">[</span>4<span class="fn-bracket">]</span></a> could be used to protect against corruption of the LUN.</p>
</section>
</section>
<section id="availability">
<span id="imap-deployment-storage-availability"></span><h2>Availability<a class="headerlink" href="#availability" title="Permalink to this heading"></a></h2>
<p>Availability of storage, too, can be achieved via multiple routes. In one
of the aforementioned technologies, replicated bricks that are both
available in real time and online, in a parallel read-write capacity,
provide high availability through redundancy (see
<a class="reference internal" href="#imap-deployment-storage-integrated-storage-protocol-level-replication"><span class="std std-ref">Integrated Storage Protocol Level Replication</span></a>).</p>
<p>An existing chunk of storage you might have is likely backed by a level
of RAID, with redundancy through mirroring individual disk volumes
and/or the inline calculation of parity, and perhaps also some spare
disks to replace those that are kicked or fall out of line.</p>
<p>Further features might include battery-backed I/O controllers, redundant
power supplies connected to different power groups, a further UPS and
a diesel generator (you start up once a month, right?).</p>
<p>The availability features of a data center are beyond the scope of this
document, but when we speak of availability with regard to storage, we
intend to speak of immediate, instant, online availability with
automated failover (such as the RAID array) -- and more prominently,
without interruption.</p>
<section id="multipath">
<h3>Multipath<a class="headerlink" href="#multipath" title="Permalink to this heading"></a></h3>
<p>Multipath is an enhancement technique in which multiple paths that are
available to the storage can be balanced, shaped and failed over
automatically. Imagine the following networking diagram:</p>
<div class="graphviz"><img src="../../../_images/graphviz-94191218877563f76f61e4eafc1cbe44611f1b1e.png" alt="digraph {
        rankdir = TB;
        splines = true;
        overlab = prism;

        edge [color=gray50, fontname=Calibri, fontsize=11];
        node [style=filled, shape=record, fontname=Calibri, fontsize=11];

        &quot;Node&quot;;

        &quot;Switch #1&quot;; &quot;Switch #2&quot;;

        &quot;Canister #1&quot;; &quot;Canister #2&quot;;

        &quot;iSCSI Target #1&quot;, &quot;iSCSI Target #2&quot;;

        &quot;Node&quot; -&gt; &quot;Switch #1&quot; [dir=none]
        &quot;Node&quot; -&gt; &quot;Switch #2&quot; [dir=none];

        &quot;Switch #1&quot; -&gt; &quot;Canister #1&quot; [dir=none];
        &quot;Switch #1&quot; -&gt; &quot;Canister #2&quot; [dir=none];

        &quot;Switch #2&quot; -&gt; &quot;Canister #1&quot; [dir=none];
        &quot;Switch #2&quot; -&gt; &quot;Canister #2&quot; [dir=none];

        &quot;Canister #1&quot; -&gt; &quot;iSCSI Target #1&quot; [dir=none];
        &quot;Canister #1&quot; -&gt; &quot;iSCSI Target #2&quot; [dir=none];

        &quot;Canister #2&quot; -&gt; &quot;iSCSI Target #1&quot; [dir=none];
        &quot;Canister #2&quot; -&gt; &quot;iSCSI Target #2&quot; [dir=none];
    }" class="graphviz" /></div>
<p>The <em>null</em> situation is depicted in the previous wiring diagram. When
multipath kicks in, primary and secondary paths are chosen for each
individual (unique) target. However, the system maintains a list
of potential paths, and continuously monitors all paths for their
viability.</p>
<p>In the example, <em>Node</em> attaching to <em>iSCSI Target #1</em> results in up
to 4 paths to <em>iSCSI Target #1</em> -- <em>4</em> paths, not <em>8</em>, because the
networking of <em>Switch #1</em> and <em>Switch #2</em> is not considered a path with
iSCSI -- <em>two nodes</em> and <em>two send targets each</em>.</p>
<p>Multipath chooses one path to the available storage:</p>
<div class="graphviz"><img src="../../../_images/graphviz-bcf8927eec26409fd3af5aaaf76d6d8e349f78d9.png" alt="digraph {
        rankdir = TB;
        splines = true;
        overlab = prism;

        edge [color=gray50, fontname=Calibri, fontsize=11];
        node [style=filled, shape=record, fontname=Calibri, fontsize=11];

        &quot;Node&quot;;

        &quot;Switch #1&quot; [color=green];
        &quot;Switch #2&quot;;

        &quot;Canister #1&quot;;
        &quot;Canister #2&quot; [color=green];

        &quot;iSCSI Target #1&quot; [color=green];
        &quot;iSCSI Target #2&quot;;

        &quot;Node&quot; -&gt; &quot;Switch #1&quot; [dir=none,color=green]
        &quot;Node&quot; -&gt; &quot;Switch #2&quot; [dir=none];

        &quot;Switch #1&quot; -&gt; &quot;Canister #1&quot; [dir=none];
        &quot;Switch #1&quot; -&gt; &quot;Canister #2&quot; [dir=none,color=green];

        &quot;Switch #2&quot; -&gt; &quot;Canister #1&quot; [dir=none];
        &quot;Switch #2&quot; -&gt; &quot;Canister #2&quot; [dir=none];

        &quot;Canister #1&quot; -&gt; &quot;iSCSI Target #1&quot; [dir=none];
        &quot;Canister #1&quot; -&gt; &quot;iSCSI Target #2&quot; [dir=none];

        &quot;Canister #2&quot; -&gt; &quot;iSCSI Target #1&quot; [dir=none,color=green];
        &quot;Canister #2&quot; -&gt; &quot;iSCSI Target #2&quot; [dir=none];
    }" class="graphviz" /></div>
<p>Should one port, bridge, controller, switch or cable fail, then the I/O
can fall back on to any of the remaining available paths.</p>
<p>As per the example, this might mean the following (with <em>Canister #2</em>
failing):</p>
<div class="graphviz"><img src="../../../_images/graphviz-a8c8ff6e847d79fecdbe8cb1424b8b50a9e36934.png" alt="digraph {
        rankdir = TB;
        splines = true;
        overlab = prism;

        edge [color=gray50, fontname=Calibri, fontsize=11];
        node [style=filled, shape=record, fontname=Calibri, fontsize=11];

        &quot;Node&quot;;

        &quot;Switch #1&quot; [color=green];
        &quot;Switch #2&quot;;

        &quot;Canister #1&quot; [color=green];
        &quot;Canister #2&quot; [color=red];

        &quot;iSCSI Target #1&quot; [color=green];
        &quot;iSCSI Target #2&quot;;

        &quot;Node&quot; -&gt; &quot;Switch #1&quot; [dir=none,color=green]
        &quot;Node&quot; -&gt; &quot;Switch #2&quot; [dir=none];

        &quot;Switch #1&quot; -&gt; &quot;Canister #1&quot; [dir=none,color=green];
        &quot;Switch #1&quot; -&gt; &quot;Canister #2&quot; [dir=none,color=red];

        &quot;Switch #2&quot; -&gt; &quot;Canister #1&quot; [dir=none];
        &quot;Switch #2&quot; -&gt; &quot;Canister #2&quot; [dir=none];

        &quot;Canister #1&quot; -&gt; &quot;iSCSI Target #1&quot; [dir=none,color=green];
        &quot;Canister #1&quot; -&gt; &quot;iSCSI Target #2&quot; [dir=none];

        &quot;Canister #2&quot; -&gt; &quot;iSCSI Target #1&quot; [dir=none,color=red];
        &quot;Canister #2&quot; -&gt; &quot;iSCSI Target #2&quot; [dir=none];
    }" class="graphviz" /></div>
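<p>The path counting and the failover described above can be sketched by
walking the wiring diagram as a graph. This is a minimal illustration
only, not how a multipath daemon works internally: the <code class="docutils literal notranslate"><span class="pre">WIRING</span></code>
dictionary and the <code class="docutils literal notranslate"><span class="pre">find_paths</span></code> helper are hypothetical names
introduced for this example, and the edges are copied from the diagram.</p>

```python
# Edges copied from the wiring diagram. The dictionary layout is an
# assumption made for this sketch, not how multipath stores its state.
WIRING = {
    "Node": ["Switch #1", "Switch #2"],
    "Switch #1": ["Canister #1", "Canister #2"],
    "Switch #2": ["Canister #1", "Canister #2"],
    "Canister #1": ["iSCSI Target #1", "iSCSI Target #2"],
    "Canister #2": ["iSCSI Target #1", "iSCSI Target #2"],
}


def find_paths(start, goal, failed=frozenset()):
    """Return every route from start to goal that avoids failed components."""
    if start in failed:
        return []
    if start == goal:
        return [[start]]
    paths = []
    for neighbour in WIRING.get(start, []):
        for tail in find_paths(neighbour, goal, failed):
            paths.append([start] + tail)
    return paths


all_paths = find_paths("Node", "iSCSI Target #1")
surviving = find_paths("Node", "iSCSI Target #1", failed={"Canister #2"})

print(len(all_paths))  # routes with every component healthy
print(len(surviving))  # routes left once Canister #2 has failed
for path in surviving:
    print(" - ".join(path))
```

<p>With every component healthy the sketch finds four routes to
<em>iSCSI Target #1</em>; with <em>Canister #2</em> marked as failed, the two
routes through <em>Canister #1</em> remain available to fall back on.</p>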
</section>
</section>
<section id="performance">
<span id="imap-deployment-storage-performance"></span><h2>Performance<a class="headerlink" href="#performance" title="Permalink to this heading"></a></h2>
<section id="storage-tiering">
<h3>Storage Tiering<a class="headerlink" href="#storage-tiering" title="Permalink to this heading"></a></h3>
<p>Storage tiering includes the combination of different types of storage
or storage volumes with different performance expectations within the
infrastructure, so that a larger volume of slower, cheaper storage can
be used for items that are not used that much, and/or are not that
important for day-to-day operations, while a smaller volume of faster,
more expensive storage can be used for items that are frequently
accessed and have significant importance to everyday use.</p>
<p>The Cyrus IMAP administrator guide has a section on using
<a class="reference internal" href="../../reference/admin/tweaking.html#admin-tweaking-cyrus-imapd-storage-tiering"><span class="std std-ref">Storage Tiering</span></a> to tweak Cyrus IMAP
performance, to illustrate various opportunities to make optimal use of
your storage.</p>
<p>As a general rule of thumb, you could divide
<a class="reference internal" href="../../../glossary.html#term-operating-system-disks"><span class="xref std std-term">operating system disks</span></a> and <a class="reference internal" href="../../../glossary.html#term-payload-disks"><span class="xref std std-term">payload disks</span></a>; the operating
system disk could hold your base installation of a node, including
everything in the root (<code class="docutils literal notranslate"><span class="pre">/</span></code>) filesystem, while your payload disk(s)
hold the files and directories that contain the system's service(s)
payload (such as <code class="docutils literal notranslate"><span class="pre">/var/lib/mysql/</span></code>, <code class="docutils literal notranslate"><span class="pre">/var/spool/cyrus/</span></code>,
<code class="docutils literal notranslate"><span class="pre">/var/lib/imap/</span></code>, <code class="docutils literal notranslate"><span class="pre">/var/lib/dirsrv/</span></code>, etc.).</p>
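<p>Whether a payload directory really sits on its own disk can be checked
roughly by comparing device numbers. The sketch below assumes a
POSIX system and uses only the directories named above; the helper name
<code class="docutils literal notranslate"><span class="pre">on_separate_device</span></code> is hypothetical, and setups such as bind mounts
or subvolumes may make the comparison misleading.</p>

```python
import os

# Payload directories taken from the examples in the text above.
PAYLOAD_DIRS = [
    "/var/lib/mysql/",
    "/var/spool/cyrus/",
    "/var/lib/imap/",
    "/var/lib/dirsrv/",
]


def on_separate_device(path, reference="/"):
    """True when path lives on a different device than reference."""
    return os.stat(path).st_dev != os.stat(reference).st_dev


for directory in PAYLOAD_DIRS:
    if not os.path.isdir(directory):
        print(directory, "is not present on this node")
    elif on_separate_device(directory):
        print(directory, "is on its own device")
    else:
        print(directory, "shares a device with the root filesystem")
```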
<p>Distributing what is and what is not frequently used may be a cumbersome
task for administrators. Some storage vendors' appliances offer
automated storage tiering, where some disks in the appliance are SSDs,
while others are SATA or SAS HDDs, and the appliance itself tiers the
storage.</p>
<p>A similar solution is available to Linux nodes, through <a class="reference external" href="http://en.wikipedia.org/wiki/Dm-cache">dm-cache</a>,
provided they run a recent kernel.</p>
</section>
<section id="disk-buffering">
<h3>Disk Buffering<a class="headerlink" href="#disk-buffering" title="Permalink to this heading"></a></h3>
<p>Reading from a disk is considered very, very slow when compared to
accessing a node's (real) memory. While dependent on the particular I/O
pattern of an application, it is not uncommon at all for an application
to read the same part of a disk volume several times during a relatively
short period of time.</p>
<p>In Cyrus IMAP, for example, while a user is logged in, a mail
folder's <code class="file docutils literal notranslate"><span class="pre">cyrus.index</span></code> is read more frequently than it is
written to -- such as when refreshing the folder view, when opening a
message in the folder, when replying to a message, etc.</p>
<p>It is important to appreciate the impact of <a class="reference external" href="http://www.tldp.org/LDP/sag/html/buffer-cache.html">memory-based buffer cache</a>
for this type of I/O on the overall performance of the environment.</p>
<p>Should no (local) memory-based buffer cache be available, because for
example you are using a network filesystem (NFS, GlusterFS, etc.), then
it is extremely important to appreciate the consequences in terms of the
performance expectations.</p>
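<p>On a Linux node, the amount of memory currently used for buffer and page
cache is reported in <code class="docutils literal notranslate"><span class="pre">/proc/meminfo</span></code>. The following sketch parses
text in that format; the function name <code class="docutils literal notranslate"><span class="pre">cache_summary</span></code> is
hypothetical and the sample figures are made up purely for illustration.</p>

```python
def cache_summary(meminfo_text):
    """Extract MemTotal, Buffers and Cached (in kB) from /proc/meminfo text."""
    wanted = {"MemTotal", "Buffers", "Cached"}
    values = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        if name in wanted:
            values[name] = int(rest.split()[0])
    return values


# Made-up sample in the format of /proc/meminfo, not a measurement.
SAMPLE = """MemTotal:       16384000 kB
MemFree:         1024000 kB
Buffers:          512000 kB
Cached:          8192000 kB
SwapCached:            0 kB
"""

summary = cache_summary(SAMPLE)
cache_kb = summary["Buffers"] + summary["Cached"]
print(cache_kb, "kB of", summary["MemTotal"], "kB hold cached disk contents")
```

<p>On a real node, pass the contents of <code class="docutils literal notranslate"><span class="pre">/proc/meminfo</span></code> instead of the
sample text.</p>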
</section>
<section id="readahead">
<h3>Readahead<a class="headerlink" href="#readahead" title="Permalink to this heading"></a></h3>
<p>Reading ahead is a feature in which -- in a future-predicting,
anticipatory fashion -- a chunk of storage is read in addition to the
chunk of storage actually being requested.</p>
<p>A common application of read-ahead is to record all files accessed
during the boot process of a node, such that later boot sequences can
read files from disk, and insert them into the
<a class="reference external" href="http://www.tldp.org/LDP/sag/html/buffer-cache.html">memory-based buffer cache</a> ahead of software actually issuing the call
to read the file. The file's contents can then be served from the
faster (real) memory rather than from the slow disk.</p>
<p>Readahead generally does not matter for small files, unless read
operations work on a collection of aggregated message files. It does,
however, matter for devices attached to infrastructural components such
as hypervisors, where what is being read are entire disk image files or
block devices (for the guest).</p>
<p>The ideal setting for readahead depends on a variety of factors and can
usually only be established by monitoring an environment and tweaking
the setting (followed by some more monitoring).</p>
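<p>As a starting point for such monitoring, the current read-ahead window of
a block device on Linux can be read from sysfs. This is a sketch under
the assumption of a Linux node; the helper names are hypothetical and
<code class="docutils literal notranslate"><span class="pre">sda</span></code> is only an example device name.</p>

```python
from pathlib import Path


def read_ahead_kb(device, sysfs_root="/sys/block"):
    """Return the read-ahead window of a block device in KiB, or None."""
    setting = Path(sysfs_root) / device / "queue" / "read_ahead_kb"
    if not setting.is_file():
        return None
    return int(setting.read_text().strip())


def kb_to_sectors(kib):
    """Convert KiB to 512-byte sectors, the unit blockdev --getra reports."""
    return kib * 2


value = read_ahead_kb("sda")  # "sda" is only an example device name
if value is None:
    print("no such block device on this node")
else:
    print(value, "KiB, or", kb_to_sectors(value), "sectors")
```

<p>Recording this value before and after each adjustment makes it easier to
relate changes in observed performance to the setting in effect.</p>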
</section>
</section>
<section id="scalability">
<span id="imap-deployment-storage-scalability"></span><h2>Scalability<a class="headerlink" href="#scalability" title="Permalink to this heading"></a></h2>
<p>When originally planning for storage capacity, a few things are to be
taken into account. We'll point these out and address them later in
this section.</p>
<p>Generically speaking, when storage capacity is planned for initially,
a certain period of time is used to establish how much storage might be
required (for that duration).</p>
<p>However, let's suppose regulatory provisions dictate a period of 10
years of business communications need to be preserved. How does one
accurately predict the volume of communications over the next 10 years?</p>
<p>Let's suppose your organization is in flux, expanding or contracting as
businesses do at times, or that it incurs budget cuts or unexpected
extra tasks, or that it takes over or otherwise incorporates another
organization.</p>
<p>Since today's storage comes with a certain price tag, and tomorrow's with a
different one, it can be an interesting exercise to plan for storage to
grow organically as needed, rather than make large investments to provide
capacity that may only be used years from today, or not be used at all,
or turn out to still not be sufficient.</p>
<p>One may also consider planning for the future expansion of the storage
solution chosen today, possibly including significant changes in
requirements (larger mailboxes).</p>
<section id="data-retention">
<h3>Data Retention<a class="headerlink" href="#data-retention" title="Permalink to this heading"></a></h3>
<p>Cyrus IMAP by default does not delete IMAP spool contents from the
filesystem for a period of 69 days.</p>
<p>This means that when 100 users each have 1 GB of quota, the actual
data footprint might grow well beyond 100 GB on disk.</p>
<p>Depending on the nature of how you use Cyrus IMAP, a reasonable
expectation can be formulated and used for calculating the likely amount
of disk space used in addition to the content that continues to count
towards quota.</p>
<p>For example, if a large amount of message traffic ends up in a shared
folder that many users read messages from and respond to (such as might
be the case for an <a class="reference external" href="mailto:info&#37;&#52;&#48;example&#46;org">info<span>&#64;</span>example<span>&#46;</span>org</a> email address), then around triple
the amount of traffic per month will continue to be stored on disk, plus
what you decide is still current and not deleted by users (the &quot;live
size&quot;).</p>
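<p>The rule of thumb above can be written down as a small calculation. This
is a rough sketch, not a sizing formula from Cyrus IMAP itself: the
function name is hypothetical, the factor of three is the "around triple"
from the paragraph above, and the example figures are invented.</p>

```python
def estimated_footprint_gb(live_gb, monthly_traffic_gb, retained_months=3.0):
    """Live size plus the traffic that is deleted but still kept on disk."""
    return live_gb + monthly_traffic_gb * retained_months


# Invented example: 20 GB considered current, 5 GB of new traffic a month.
print(estimated_footprint_gb(live_gb=20, monthly_traffic_gb=5))
```

<p>Adjust <code class="docutils literal notranslate"><span class="pre">retained_months</span></code> if your retention period differs from the
default described in this section.</p>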
</section>
<section id="shared-folders">
<h3>Shared Folders<a class="headerlink" href="#shared-folders" title="Permalink to this heading"></a></h3>
<p>Shared folders (primarily those to which mail is delivered) do not, by
default, have any quota on them. They are also not purged by default. As
such, they could grow infinitely (until disks run out of space).</p>
<p>A busy mailing list used for human communications, such as
<a class="reference external" href="mailto:devel&#37;&#52;&#48;lists&#46;fedoraproject&#46;org">devel<span>&#64;</span>lists<span>&#46;</span>fedoraproject<span>&#46;</span>org</a>, can be expected to grow to as much as 1
GB of data footprint on disk over a period of 3 years -- at a message
rate of fewer than ~100 a day and without purging.</p>
<p>A mailing list with automated messages generated from applications, such
as <a class="reference external" href="mailto:bugs-list&#37;&#52;&#48;kde&#46;org">bugs-list<span>&#64;</span>kde<span>&#46;</span>org</a>, which is notified of all ticket changes for KDE's
upstream Bugzilla, can be expected to grow to up to 3.5 GB over the same
period -- at a message rate of ~300 per day and without purging.</p>
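<p>The two figures above imply an average on-disk size per message, which
can in turn be used to project the growth of your own unpurged shared
folders. The sketch below assumes that 1 GB means 1024<sup>3</sup> bytes and
that the message rate is constant; the function names are hypothetical.
Because the first list receives <em>fewer</em> than ~100 messages a day, its
result is a lower bound on the average message size.</p>

```python
GIB = 1024 ** 3  # assumption: "GB" in the text is read as GiB


def average_message_bytes(total_bytes, messages_per_day, years):
    """Average on-disk size per message implied by a growth figure."""
    return total_bytes / (messages_per_day * 365 * years)


def projected_bytes(avg_bytes, messages_per_day, years):
    """Footprint of an unpurged folder after the given number of years."""
    return avg_bytes * messages_per_day * 365 * years


human_list = average_message_bytes(1 * GIB, 100, 3)
automated_list = average_message_bytes(3.5 * GIB, 300, 3)

print(round(human_list), "bytes per message on the human mailing list")
print(round(automated_list), "bytes per message on the automated list")
```

<p>Both lists work out to roughly 10 KB per message, so message rate rather
than message size accounts for most of the difference in growth.</p>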
</section>
<section id="user-s-groupware-folders">
<h3>User's Groupware Folders<a class="headerlink" href="#user-s-groupware-folders" title="Permalink to this heading"></a></h3>
<p>Users tend not to clean up their calendars by removing old appointments
that no longer have any bearing on today's views/operations. These
appointments do count towards a user's quota.</p>
</section>
</section>
<section id="capacity">
<span id="imap-deployment-storage-capacity"></span><h2>Capacity<a class="headerlink" href="#capacity" title="Permalink to this heading"></a></h2>
<p>Regardless of the total volume of storage, this section relates to
the volume of storage allocated to any one specific purpose in
Cyrus IMAP, and to capacity planning in light of that (not the layer
providing the storage).</p>
<p>Archiving and e-Discovery notwithstanding, the largest chunks of data
volume you are going to find in Cyrus IMAP are the live IMAP
spools.</p>
<p>Let each individual IMAP spool be considered a volume, or part of a
volume if you feel inclined to share volumes across Cyrus IMAP backend
instances. Regardless, you need a filesystem <strong>somewhere</strong> (even if the
bricks building the volume are distributed), and the recommended
restrictions on the individual chunks of storage apply to that
filesystem.</p>
<p>Saturating the argument to make a point, imagine, if you will, a million
users with one gigabyte of data each. Just the original, minimal data
footprint is now around one petabyte.</p>
<p>Performing a filesystem check (<strong class="command">fsck.ext4</strong> comes to mind) on a
single one-petabyte volume will be prohibitively expensive, simply
considering the time the command needs to complete, let alone to
complete successfully: your <strong>million</strong> users will not have access to
their data until the command has finished.</p>
<p><strong>For that reason alone</strong> you would require a second copy of the groupware
payload, raising the minimal data footprint to <strong>two</strong> petabytes.</p>
<div class="admonition note">
<p class="admonition-title">Note</p>
<p>Also note that the two replicas of one petabyte would, if the
replication occurs at the storage volume level, run the risk of
corrupting both replicas' filesystems.</p>
</div>
<p>Your requirements for data redundancy aside, filesystem checks need
to be performed at least regularly <a class="footnote-reference brackets" href="#id11" id="id6" role="doc-noteref"><span class="fn-bracket">[</span>5<span class="fn-bracket">]</span></a>, if not because actual
recovery is required. Each check
should therefore be restricted to a volume of data that enables it to
complete in a timely fashion, and possibly (when no data redundancy is
implemented) even within a timeframe during which you feel comfortable
holding off your users/customers while they have no access to their data.</p>
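<p>The reasoning in this section can be turned into a back-of-the-envelope
calculation. The sketch below uses decimal units (one petabyte is one
million gigabytes) and hypothetical function names. The check rate in
particular is an invented placeholder: the time a filesystem check takes
depends on much more than volume size, so measure it in your own
environment.</p>

```python
import math

GB_PER_PB = 1_000_000  # decimal units, matching the example in the text


def footprint_pb(users, gb_per_user, copies=1):
    """Minimal data footprint in petabytes."""
    return users * gb_per_user * copies / GB_PER_PB


def largest_checkable_volume_gb(window_hours, check_rate_gb_per_hour):
    """Largest volume a check can cover within an acceptable window."""
    return window_hours * check_rate_gb_per_hour


def volumes_needed(total_gb, max_volume_gb):
    """Number of volumes required when each is capped at max_volume_gb."""
    return math.ceil(total_gb / max_volume_gb)


print(footprint_pb(1_000_000, 1))            # one copy
print(footprint_pb(1_000_000, 1, copies=2))  # with a second copy

# Invented figures: a 4 hour window and a check rate of 500 GB per hour.
cap_gb = largest_checkable_volume_gb(4, 500)
print(volumes_needed(1_000_000, cap_gb))
```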
<p>That filesystem checks need to happen regularly is not to say that
such filesystem checks require the node to be taken offline. Should you
use Logical Volume Management (LVM) for example, and not allocate 100%
of the volume group to the logical volume that holds the IMAP spool,
then intermediate filesystem checks can be executed on a snapshot of
said logical volume instead, and while the node remains online, to give
you a generic impression of the filesystem's health. Given this
information, you can schedule a service window should you feel the need
to check the actual filesystem.</p>
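<p>The snapshot approach described above could be scripted along the
following lines. This is a sketch under several assumptions: an ext4
filesystem, the standard <code class="docutils literal notranslate"><span class="pre">lvcreate</span></code>, <code class="docutils literal notranslate"><span class="pre">fsck.ext4</span></code> and
<code class="docutils literal notranslate"><span class="pre">lvremove</span></code> commands, and hypothetical volume group, logical volume and
snapshot names. By default it only prints the commands it would run.</p>

```python
import subprocess


def snapshot_check_commands(vg, lv, snapshot="spoolcheck", size="10G"):
    """Build the commands for checking a snapshot instead of the live volume."""
    origin = f"/dev/{vg}/{lv}"
    snap = f"/dev/{vg}/{snapshot}"
    return [
        # Needs unallocated space in the volume group to hold the snapshot.
        ["lvcreate", "--snapshot", "--size", size, "--name", snapshot, origin],
        # -f forces the check, -n opens read-only and answers "no" to repairs.
        ["fsck.ext4", "-f", "-n", snap],
        ["lvremove", "--force", snap],
    ]


def run_snapshot_check(vg, lv, dry_run=True):
    """Print the commands, or run them and return the exit status of fsck."""
    create, check, remove = snapshot_check_commands(vg, lv)
    if dry_run:
        for command in (create, check, remove):
            print(" ".join(command))
        return None
    subprocess.run(create, check=True)
    try:
        # A non-zero status means the check found something to look into.
        return subprocess.run(check).returncode
    finally:
        # Always drop the snapshot, even when the check reports problems.
        subprocess.run(remove, check=True)


run_snapshot_check("vg_data", "lv_spool")  # hypothetical names, dry run only
```

<p>Because the snapshot is taken while the filesystem is mounted, treat the
result as an impression of the filesystem's health rather than as a
definitive verdict.</p>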
<p>A good article on filesystems, the corruption of data and their causes
and mitigation strategies has been written up by <a class="reference external" href="http://lwn.net">LWN</a>,
<a class="reference external" href="http://lwn.net/Articles/190222/">The 2006 Linux Filesystem Workshop</a>. This article also explains what
a filesystem check actually does, and why it is usually configured
to be run after either a certain amount of time or a certain number of mounts.</p>
</section>
<section id="cost">
<span id="imap-deployment-storage-cost"></span><h2>Cost<a class="headerlink" href="#cost" title="Permalink to this heading"></a></h2>
<p>When cost is of no concern, multiple vendors of storage solutions will
tell you precisely what you need to hear -- I think we've all been
there.</p>
<p>When cost is a concern, however, cheaper disks are often slower, fail
faster, and sometimes also do not provide the
<a class="reference internal" href="#imap-deployment-storage-capacity"><span class="std std-ref">Capacity</span></a> desired.</p>
<p>On the other hand, stuffing many consumer-grade SATA III disks into
some commodity hardware likely raises run-time costs -- energy.</p>
<p>However, a chassis of a storage solution usually comes at a higher
price point, and therefore expands capacity with relatively large
chunks, which may not be what you require at that moment.</p>
<p class="rubric">Footnotes</p>
<aside class="footnote-list brackets">
<aside class="footnote brackets" id="id7" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#id1">1</a><span class="fn-bracket">]</span></span>
<p>Applications may also operate on data not stored on disk at all,
which is another common avenue potentially resulting in loss of data
-- or <em>corruption</em>, which is merely a type of data-loss.</p>
</aside>
<aside class="footnote brackets" id="id8" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#id2">2</a><span class="fn-bracket">]</span></span>
<p>With read operations, the other node(s) must be blocked from
writing, and with write operations, the other node(s) must be
blocked from reading and writing.</p>
</aside>
<aside class="footnote brackets" id="id9" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#id4">3</a><span class="fn-bracket">]</span></span>
<p>When using ClusterLVM, you would use logical volumes as disks for
your guests.</p>
</aside>
<aside class="footnote brackets" id="id10" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#id5">4</a><span class="fn-bracket">]</span></span>
<p>When using GFS, you would mount the GFS filesystem partition on each
hypervisor and use disk image files.</p>
</aside>
<aside class="footnote brackets" id="id11" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#id6">5</a><span class="fn-bracket">]</span></span>
<p>Execute filesystem checks regularly to increase your level of
confidence that, should emergency repairs need to be performed, you
are able to recover swiftly.</p>
<p>The <a class="reference internal" href="../../../glossary.html#term-MTBF"><span class="xref std std-term">MTBF</span></a> of a stable filesystem has most often been subject
to the failure of the underlying disk, with the filesystem unable to
recover (in time) from the underlying disk failing (partly).</p>
</aside>
</aside>
</section>
</section>


           </div>
          </div>
          <footer><div class="rst-footer-buttons" role="navigation" aria-label="Footer">
        <a href="performance_recommendations.html" class="btn btn-neutral float-left" title="Performance Recommendations" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
        <a href="supported-platforms.html" class="btn btn-neutral float-right" title="Supported Platforms and System Requirements" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
    </div>

  <hr/>

  <div role="contentinfo">
    <p>&#169; Copyright 1993–2025, The Cyrus Team.</p>
  </div>

   

</footer>
        </div>
      </div>
    </section>
  </div>
  <script>
      jQuery(function () {
          SphinxRtdTheme.Navigation.enable(true);
      });
  </script>
 



</body>
</html>