<!DOCTYPE html>
<html class="writer-html5" lang="en" >
<head>
  <meta charset="utf-8" /><meta name="generator" content="Docutils 0.19: https://docutils.sourceforge.io/" />

  <meta name="viewport" content="width=device-width, initial-scale=1.0" />
  <title>Storage Considerations &mdash; Cyrus IMAP 3.10.2 documentation</title>
      <link rel="stylesheet" href="../../../_static/pygments.css" type="text/css" />
      <link rel="stylesheet" href="../../../_static/css/theme.css" type="text/css" />
      <link rel="stylesheet" href="../../../_static/graphviz.css" type="text/css" />
      <link rel="stylesheet" href="../../../_static/cyrus.css" type="text/css" />
  
        <script data-url_root="../../../" id="documentation_options" src="../../../_static/documentation_options.js"></script>
        <script src="../../../_static/jquery.js"></script>
        <script src="../../../_static/underscore.js"></script>
        <script src="../../../_static/_sphinx_javascript_frameworks_compat.js"></script>
        <script src="../../../_static/doctools.js"></script>
        <script src="../../../_static/sphinx_highlight.js"></script>
    <script src="../../../_static/js/theme.js"></script>
    <link rel="index" title="Index" href="../../../genindex.html" />
    <link rel="search" title="Search" href="../../../search.html" />
    <link rel="next" title="Supported Platforms and System Requirements" href="supported-platforms.html" />
    <link rel="prev" title="Performance Recommendations" href="performance_recommendations.html" /> 
</head>

<body class="wy-body-for-nav"> 
  <div class="wy-grid-for-nav">

    <section data-toggle="wy-nav-shift" class="wy-nav-content-wrap"><nav class="wy-nav-top" aria-label="Mobile navigation menu" >
          <i data-toggle="wy-nav-top" class="fa fa-bars"></i>
          <a href="../../../index.html">Cyrus IMAP</a>
      </nav>

      <div class="wy-nav-content">
        <div class="rst-content">
          <div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
           <div itemprop="articleBody">
             
  <section id="storage-considerations">
<span id="imap-deployment-storage"></span><h1>Storage Considerations<a class="headerlink" href="#storage-considerations" title="Permalink to this heading"></a></h1>
<p>Storage considerations are a complex matter, as the various options
provide or restrict one's ability to adjust the necessary parameters as
the need arises. It is foremost a challenge to clearly articulate and
prioritize the criteria for storage, and to map the theory onto a
practical implementation design.</p>
<p>This article intends to provide information and outline details, and
sometimes opinions and recommendations, but it is not a guide to
providing you with the storage solution that you want or require.</p>
<p>Generally, the most important considerations for storage include:</p>
<p><a class="reference internal" href="#imap-deployment-storage-redundancy"><span class="std std-ref">Redundancy</span></a>,</p>
<blockquote>
<div><p>because nothing is as humiliating as losing all your data.</p>
</div></blockquote>
<p><a class="reference internal" href="#imap-deployment-storage-availability"><span class="std std-ref">Availability</span></a>,</p>
<blockquote>
<div><p>because nothing is more stressful than none of your data being
available.</p>
</div></blockquote>
<p><a class="reference internal" href="#imap-deployment-storage-performance"><span class="std std-ref">Performance</span></a>,</p>
<blockquote>
<div><p>because nothing is as annoying as waiting, followed by some more
waiting.</p>
</div></blockquote>
<p><a class="reference internal" href="#imap-deployment-storage-scalability"><span class="std std-ref">Scalability</span></a>,</p>
<blockquote>
<div><p>because <code class="docutils literal notranslate"><span class="pre">-ENOSPC</span></code> is good only when it applies to your stomach.</p>
</div></blockquote>
<p><a class="reference internal" href="#imap-deployment-storage-capacity"><span class="std std-ref">Capacity</span></a>,</p>
<blockquote>
<div><p>because your data must be available, backed up and archived.</p>
</div></blockquote>
<p><a class="reference internal" href="#imap-deployment-storage-cost"><span class="std std-ref">Cost</span></a>,</p>
<blockquote>
<div><p>because you can't buy a beer or feed a family with an empty wallet.</p>
</div></blockquote>
<p>Storage is not a part of Cyrus IMAP, in that Cyrus IMAP does not ship
a particular storage solution as part of the product, and it has no
particular requirements for storage either.</p>
<p>As such, your SAN, NAS, local disk, local array of disks or network
share or even the flash drive of a Raspberry Pi could be used, although
the following considerations are important:</p>
<ul class="simple">
<li><p>The Cyrus IMAP spool is I/O intensive (large volumes of data are read
and written).</p></li>
<li><p>The Cyrus IMAP spool consists of many small files.</p></li>
</ul>
<p>As such, we recommend you take into account:</p>
<ul class="simple">
<li><p>The available bandwidth between the IMAP server and the storage
provider, if the storage is on the network at all.</p></li>
<li><p>The (network) protocol overhead, if any, should file-level read
and/or write locking be required or implied.</p></li>
<li><p>Atomic file operations.</p></li>
<li><p>Parallel access (such as shared mailboxes or multiple clients
accessing the same mailbox).</p></li>
</ul>
<section id="general-notes-on-storage">
<h2>General Notes on Storage<a class="headerlink" href="#general-notes-on-storage" title="Permalink to this heading"></a></h2>
<p>The aforementioned considerations
<a class="reference internal" href="#imap-deployment-storage-redundancy"><span class="std std-ref">Redundancy</span></a>,
<a class="reference internal" href="#imap-deployment-storage-availability"><span class="std std-ref">Availability</span></a>,
<a class="reference internal" href="#imap-deployment-storage-performance"><span class="std std-ref">Performance</span></a>,
<a class="reference internal" href="#imap-deployment-storage-scalability"><span class="std std-ref">Scalability</span></a>,
<a class="reference internal" href="#imap-deployment-storage-capacity"><span class="std std-ref">Capacity</span></a> and
<a class="reference internal" href="#imap-deployment-storage-cost"><span class="std std-ref">Cost</span></a>
are not all equally important -- not to all organizations, and
not to all requirements when the priorities are set out against the
implied cost of the supposed ideal solution.</p>
<p>They are also not mutually exclusive in that, for example, redundancy
may partly address some of the availability concerns -- depending on the
exact nature of the final deployment of course, and backup/recovery
capabilities in turn may partly address redundancy requirements. Neither
necessarily directly addresses availability concerns, however.</p>
<p>What is deemed acceptable is another culprit -- more often than not,
operational cost, familiarity of staff with a particular storage
solution, or flexibility of a storage solution (or lack thereof) may get
in the way of an otherwise appropriate storage solution.</p>
<p>We believe that, provided with a sufficient amount of accurate information,
however, you are able to make an informed choice, and that an informed
choice is always better than an ill-informed one.</p>
</section>
<section id="redundancy">
<span id="imap-deployment-storage-redundancy"></span><h2>Redundancy<a class="headerlink" href="#redundancy" title="Permalink to this heading"></a></h2>
<p>Storage redundancy is achieved through replication of data. It is
important to understand that, as a matter of design principle,
redundancy does not in and of itself provide increased availability.</p>
<p>How redundancy could increase availability depends on the exact
implementation, and the various options for practical implementation
each have their own set of implications for cases of failure and the
need to, under certain circumstances, failover and/or recover.</p>
<p>How redundancy is achieved in an &quot;acceptable&quot; manner is another subject
open to interpretation; it is sometimes deemed acceptable to create
backups daily, and therefore potentially accept the loss of up to one
day's worth of information from live spools -- which may or may not be
recoverable through different means. More commonly, however, nothing less
than real-time replication of data is deemed acceptable.</p>
<p>While storage ultimately amounts to disks, it is important to understand
that a number of (virtual) devices, channels, links and interfaces exist
between an application operating on data on disk <a class="footnote-reference brackets" href="#id7" id="id1" role="doc-noteref"><span class="fn-bracket">[</span>1<span class="fn-bracket">]</span></a>, and the physical
sectors and blocks or cells of storage on that disk. In a way, this
number of layers can be compared with the <a class="reference external" href="http://en.wikipedia.org/wiki/OSI_model">OSI model for networking</a> --
but it is not the same at all.</p>
<p>This section addresses the most commonly used levels at which
replication can be applied.</p>
<section id="storage-volume-level-replication">
<h3>Storage Volume Level Replication<a class="headerlink" href="#storage-volume-level-replication" title="Permalink to this heading"></a></h3>
<p>When using the term <a class="reference internal" href="../../../glossary.html#term-storage-volume-level-replication"><span class="xref std std-term">storage volume level replication</span></a> we mean to
indicate the replication of <a class="reference internal" href="../../../glossary.html#term-disk-volumes"><span class="xref std std-term">disk volumes</span></a> as a whole. A
simplistic replication scenario of a data disk between two nodes could
look as follows:</p>
<div class="graphviz"><img src="../../../_images/graphviz-3d117e14f1a624d46409d9f38d14363039c26823.png" alt="digraph drbd {
        rankdir = LR;
        splines = true;
        overlab = prism;

        edge [color=gray50, fontname=Calibri, fontsize=11];
        node [style=filled, shape=record, fontname=Calibri, fontsize=11];

        subgraph cluster_master {
                label = &quot;Master&quot;;

                color = &quot;#BBFFBB&quot;;
                fontname = Calibri;
                rankdir = TB;
                style = filled;

                &quot;OS Disk 0&quot; [label=&quot;OS Disk&quot;,color=&quot;green&quot;];
                &quot;Data Disk 0&quot; [label=&quot;Data Disk&quot;,color=&quot;green&quot;];
            }

        subgraph cluster_slave {
                label = &quot;Slave&quot;;

                color = &quot;#FFBBBB&quot;;
                fontname = Calibri;
                rankdir = TB;
                style = filled;

                &quot;OS Disk 1&quot; [label=&quot;OS Disk&quot;,color=&quot;green&quot;];
                &quot;Data Disk 1&quot; [label=&quot;Data Disk&quot;,color=&quot;red&quot;];
            }

        &quot;Data Disk 0&quot; -&gt; &quot;Data Disk 1&quot; [label=&quot;One-Way Replication&quot;];
    }" class="graphviz" /></div>
<p>For a fully detailed picture of the internal structures, please see the
<a class="reference external" href="http://www.drbd.org/">DRBD</a> website, the canonical experts on this level of replication.</p>
<p>Normally, storage-level replication occurs in such a
fashion that it can be compared with a distributed version of a RAID-1
array. This incurs limitations that need to be evaluated carefully.</p>
<p>In a hardware RAID-1 array, storage is physically constrained to a
single node, and pairs of replicated disks are treated as one. In a
software RAID-1 array, it is the operating system's software RAID driver
that can (must) address the individual disks, but makes the array appear
as a single disk to all higher-level software. Here too, the disks are
physically constrained to one physical node.</p>
<p>In both cases, a <em>single point of control</em> exists with full and
exclusive access to the physical disk device(s), namely the interface
for <em>all higher-level software</em> to interact with the storage.</p>
<p>This is the underlying cause of the storage-level replication conundrum.</p>
<p>To illustrate the conundrum, we use a software RAID-1 array. The
individual disk volumes that make up the RAID-1 array are not hidden
from the rest of the operating system, but more importantly, direct
access to the underlying device is not prohibited. With an example pair
<code class="docutils literal notranslate"><span class="pre">sda2</span></code> and <code class="docutils literal notranslate"><span class="pre">sdb2</span></code>, nothing prevents you from executing <code class="docutils literal notranslate"><span class="pre">mkfs.ext4</span></code>
on <code class="docutils literal notranslate"><span class="pre">/dev/sdb2</span></code> thereby corrupting the array -- other than perhaps not
having the necessary administrative privileges.</p>
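<p>The following is a minimal Python sketch of this conundrum, not a model of
how any real software RAID driver works. The class names <code class="docutils literal notranslate"><span class="pre">Disk</span></code> and
<code class="docutils literal notranslate"><span class="pre">Raid1</span></code> are hypothetical and exist only for illustration. Writes that go
through the array keep both members identical, while a write issued directly
against one member silently breaks the mirror:</p>

```python
class Disk:
    """A toy block device: a fixed number of blocks."""

    def __init__(self, name, blocks=4):
        self.name = name
        self.blocks = [b"\x00"] * blocks

    def write(self, index, data):
        self.blocks[index] = data


class Raid1:
    """A toy mirror: the single point of control for its members."""

    def __init__(self, first, second):
        self.members = (first, second)

    def write(self, index, data):
        # Every write through the array lands on all members.
        for disk in self.members:
            disk.write(index, data)

    def in_sync(self):
        first, second = self.members
        return first.blocks == second.blocks


sda2 = Disk("sda2")
sdb2 = Disk("sdb2")
md0 = Raid1(sda2, sdb2)

md0.write(0, b"mail")        # goes through the array: members stay identical
synced_before = md0.in_sync()

sdb2.write(0, b"mkfs")       # bypasses the array, like mkfs.ext4 on /dev/sdb2
synced_after = md0.in_sync()

print(synced_before, synced_after)
```

<p>The array object cannot prevent the second write because nothing forces
callers to go through it, which is the point the paragraph above makes.</p>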
<p>To further illustrate, position one disk in the RAID-1 array on the
other side of a network (such as in a <a class="reference external" href="http://www.drbd.org/">DRBD</a> topology, as illustrated).
Since now two nodes participate in nurturing the mirrored volume, two
points of control exist -- each node controls the access to its local
disk device(s).</p>
<p>Participating nodes are <strong>required</strong> to successfully coordinate their
I/O with one another, which on the level of entire storage volumes is a
very impractical effort with high latency and enormous overhead, should
more than one node be allowed to access the replicated device <a class="footnote-reference brackets" href="#id8" id="id2" role="doc-noteref"><span class="fn-bracket">[</span>2<span class="fn-bracket">]</span></a>.</p>
<p>It is therefore understood that, when using storage level replication:</p>
<ul class="simple">
<li><p>Only one side of the mirrored volume can be active (master), and the
other side must remain passive (slave),</p></li>
<li><p>The active and passive nodes therefore have a cluster solution
implemented to manage the application's failover and the change in
replication topology (a slave becomes the I/O master, the
former master becomes the replication slave, and other slaves, if
any, learn about the new master to replicate from),</p></li>
<li><p>Failover implementations include fencing (the STONITH principle),
ensuring that no two nodes perform I/O on the same volume in parallel
at any given time.</p></li>
</ul>
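<p>The failover sequence described in this list can be sketched in Python as
follows. This is an illustration under simplifying assumptions, not the
behaviour of any particular cluster manager: the <code class="docutils literal notranslate"><span class="pre">Node</span></code> class, the
<code class="docutils literal notranslate"><span class="pre">failover</span></code> function and the node names are hypothetical, and fencing is
reduced to a flag that blocks further writes.</p>

```python
class Node:
    """A toy cluster node that may or may not be allowed to do I/O."""

    def __init__(self, name):
        self.name = name
        self.role = "slave"
        self.fenced = False

    def write(self, volume, data):
        # Only an unfenced master may touch the replicated volume.
        if self.fenced or self.role != "master":
            raise PermissionError(f"{self.name} may not write")
        volume.append((self.name, data))


def failover(old_master, new_master):
    """Fence the old master first, then promote the new one."""
    old_master.fenced = True       # make sure it can no longer perform I/O
    old_master.role = "slave"
    new_master.role = "master"


volume = []
node_a, node_b = Node("node-a"), Node("node-b")
node_a.role = "master"

node_a.write(volume, "message 1")
failover(node_a, node_b)
node_b.write(volume, "message 2")

try:
    node_a.write(volume, "stale write")
    stale_write_accepted = True
except PermissionError:
    stale_write_accepted = False

print(volume, stale_write_accepted)
```

<p>Fencing happens before promotion so that, at no point in the sequence, two
nodes are both permitted to write to the same volume.</p>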
<div class="admonition warning">
<p class="admonition-title">Warning</p>
<p>Storage volume level replication does not protect against filesystem
or payload corruption -- the replication happily mirrors the
&quot;faulty&quot; bits as it is completely agnostic to the bits' meaning and
relevance.</p>
</div>
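<p>A small Python sketch of why this is the case, assuming a hypothetical
<code class="docutils literal notranslate"><span class="pre">replicate</span></code> function that stands in for block-level mirroring. Blocks
are copied verbatim and never interpreted, so a damaged block is mirrored just
as faithfully as a healthy one:</p>

```python
def replicate(primary, replica):
    """Copy every block verbatim; the content is never interpreted."""
    replica[:] = primary


primary = [b"From: alice", b"Subject: hello"]
replica = []
replicate(primary, replica)

# A faulty write damages a block on the primary ...
primary[1] = b"\xff\xfe garbage"
# ... and the next replication pass mirrors the damage faithfully.
replicate(primary, replica)

print(replica == primary, replica[1])
```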
<div class="admonition warning">
<p class="admonition-title">Warning</p>
<p>For the reasons outlined in this section, storage volume level
replication has only a limited number of Cyrus IMAP deployment
scenarios for which it would be recommended -- such as <em>Disaster
Recovery Failover</em>.</p>
</div>
</section>
<section id="integrated-storage-protocol-level-replication">
<span id="imap-deployment-storage-integrated-storage-protocol-level-replication"></span><h3>Integrated Storage Protocol Level Replication<a class="headerlink" href="#integrated-storage-protocol-level-replication" title="Permalink to this heading"></a></h3>
<p>Integrated storage protocol level replication is a different approach to
making storage volumes redundant, applying the replication on a
different level.</p>
<p>Integrated storage protocol level replication isn't necessarily limited
to replication for the purposes of redundancy only, as it may already
include parallel access controls, distribution across multiple storage
nodes (each providing a part of the total storage volume available),
enabling the use of cheap commodity hardware to provide the individual
parts (called &quot;bricks&quot;) that make up the larger volume.</p>
<p>Additional features may include the use of a geographically oriented set
of parameters for the calculation and assignment of replicated chunks of
data (i.e. &quot;brick replication topology&quot;).</p>
<div class="graphviz"><img src="../../../_images/graphviz-b94689a1c06f2d38b8797872ac6d90bc2bea88ca.png" alt="digraph {
        rankdir = TB;
        splines = true;
        overlab = prism;

        edge [color=gray50, fontname=Calibri, fontsize=11];
        node [style=filled, shape=record, fontname=Calibri, fontsize=11];

        &quot;Storage Client #1&quot; -&gt; &quot;Storage Access Point&quot; [dir=back,color=green];
        &quot;Storage Client #2&quot; -&gt; &quot;Storage Access Point&quot; [dir=back,color=green];
        &quot;Storage Client #3&quot; -&gt; &quot;Storage Access Point&quot; [dir=back,color=green];
        &quot;Storage Client #4&quot; -&gt; &quot;Storage Access Point&quot; [dir=back,color=green];

        subgraph cluster_storage {
                color = green;
                label = &quot;Distributed and/or Replicated Volume Manager w/ Integrated Distributed (File-) Locking&quot;;

                &quot;Storage Access Point&quot; [shape=point,color=green];

                &quot;Brick #1&quot; [color=green];
                &quot;Brick #2&quot; [color=green];
                &quot;Brick #3&quot; [color=green];
                &quot;Brick #4&quot; [color=green];

                &quot;Storage Access Point&quot; -&gt; &quot;Brick #1&quot; [color=green];
                &quot;Storage Access Point&quot; -&gt; &quot;Brick #2&quot; [color=green];
                &quot;Storage Access Point&quot; -&gt; &quot;Brick #3&quot; [color=green];
                &quot;Storage Access Point&quot; -&gt; &quot;Brick #4&quot; [color=green];
            }
    }" class="graphviz" /></div>
<p>Current implementations of this type of technology include <a class="reference external" href="http://www.glusterfs.org">GlusterFS</a>
and <a class="reference external" href="http://ceph.com">Ceph</a>. Put way too simplistically, both technologies apply very
smart ways of storing individual objects, sometimes with additional
facilities for certain object types. How they work exactly is far beyond
the scope of this document.</p>
<p>Both technologies however are considered more efficient for fewer,
larger objects, than they are for more, smaller objects. Both storage
solutions also tend to be more efficient at addressing individual
objects directly, rather than hierarchies of objects (for listing).</p>
<p>This is meant to indicate that while both solutions scale up to millions
of objects, they facilitate a particular <strong>I/O pattern</strong> better than the
I/O pattern typically associated with a large volume of messages in IMAP
spools. More frequent and very short-lived I/O against individual
objects in a filesystem mounted directly causes a significant amount of
overhead in negotiating the access to the objects across the storage
cluster <a class="footnote-reference brackets" href="#id8" id="id3" role="doc-noteref"><span class="fn-bracket">[</span>2<span class="fn-bracket">]</span></a>.</p>
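<p>The effect of per-object overhead can be made visible with some simple
arithmetic. The Python sketch below uses hypothetical figures (2 ms of
negotiation per object and 100 MB/s of usable bandwidth) that are assumptions
chosen for illustration only, not measurements of GlusterFS or Ceph. It
compares moving the same 10 GB as one million small messages and as ten large
disk images:</p>

```python
def transfer_seconds(object_count, object_bytes,
                     per_object_overhead_s, throughput_bytes_per_s):
    """Total time: a fixed negotiation cost per object plus raw transfer."""
    negotiation = object_count * per_object_overhead_s
    payload = object_count * object_bytes / throughput_bytes_per_s
    return negotiation + payload


# Hypothetical figures, chosen only to make the arithmetic visible.
OVERHEAD_S = 0.002                  # assumed lock/lookup cost per object
THROUGHPUT = 100 * 1000 * 1000      # assumed 100 MB/s of usable bandwidth

# One million 10 kB messages versus ten 1 GB disk images (10 GB each way).
many_small = transfer_seconds(1_000_000, 10_000, OVERHEAD_S, THROUGHPUT)
few_large = transfer_seconds(10, 1_000_000_000, OVERHEAD_S, THROUGHPUT)

print(f"many small objects: {many_small:.2f} s")
print(f"few large objects:  {few_large:.2f} s")
```

<p>Under these assumptions the payload transfer time is identical in both
cases, so the entire difference comes from negotiating access once per
object.</p>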
<p>Both technologies are perfectly suitable for large clusters with
relatively small filesystems (see <a class="reference external" href="https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Global_File_System_2/ch-considerations.html#s2-fssize-gfs2">Filesystems: Smaller is Better</a>)
if they are mounted directly from the storage clients. They are
particularly feasible if not too many parallel write operations to
individual objects (files) are likely to occur (think, for example, of
web application servers and (asset-)caching proxies).</p>
<p>Alternatively, fewer larger objects could be stored -- such as disk
images for a virtualization environment. The I/O patterns internal to
the virtual machine would remain the same, but the I/O pattern of the
storage client (the hypervisor) is the equivalent of a single
lock-and-open when the virtual machine starts.</p>
<p>It is therefore understood that, especially in deployments of a larger
scale, one should not mount a GlusterFS or CephFS filesystem directly
from within an IMAP server, as an individual IMAP mail spool consists of
many very small objects, each individually addressed frequently and in
short-lived I/O operations. Instead, consider the use of these distributed
filesystems for a different level of object storage, such as disk images
for a virtualization environment:</p>
<div class="graphviz"><img src="../../../_images/graphviz-afb8d78934110ec33e73c0a576318971d3e3aa3c.png" alt="digraph {
           rankdir = TB;
           splines = true;
           overlab = prism;

           edge [color=gray50, fontname=Calibri, fontsize=11];
           node [style=filled, shape=record, fontname=Calibri, fontsize=11];

           subgraph cluster_guests {
                   label = &quot;Guest Nodes&quot;;

                   &quot;Guest #1&quot;;
                   &quot;Guest #2&quot;;
                   &quot;Guest #3&quot;;
               }

           subgraph cluster_hypervisors {
                   label = &quot;Virtualization Platform&quot;;

                   &quot;Hypervisor #1&quot;;
                   &quot;Hypervisor #2&quot;;
               }

           subgraph cluster_storage {
                   color = green;
                   label = &quot;Distributed and/or Replicated Volume
Manager w/ Integrated Distributed (File-) Locking&quot;;

                   subgraph cluster_replbricks1 {
                           label = &quot;Replicated Bricks&quot;;

                           &quot;Brick #1&quot; [color=green];
                           &quot;Brick #3&quot; [color=green];
                       }

                   subgraph cluster_replbricks2 {
                           label = &quot;Replicated Bricks&quot;;

                           &quot;Brick #2&quot; [color=green];
                           &quot;Brick #4&quot; [color=green];
                       }

               }

           &quot;Guest #1&quot; -&gt; &quot;Hypervisor #1&quot; [dir=both,color=green];
           &quot;Guest #2&quot; -&gt; &quot;Hypervisor #1&quot; [dir=both,color=green];
           &quot;Guest #3&quot; -&gt; &quot;Hypervisor #2&quot; [dir=both,color=green];

           &quot;Hypervisor #1&quot; -&gt; &quot;Brick #4&quot; [dir=both,label=&quot;Guest #1&quot;];
           &quot;Hypervisor #1&quot; -&gt; &quot;Brick #3&quot; [dir=both,label=&quot;Guest #2&quot;];
           &quot;Hypervisor #2&quot; -&gt; &quot;Brick #3&quot; [dir=both,label=&quot;Guest #3&quot;];
       }" class="graphviz" /></div>
<p>In this illustration, <em>Hypervisor #1</em> and <em>Hypervisor #2</em> are storage
clients, and replicated bricks hold the disk images of each guest.</p>
<p>Each hypervisor can, in parallel, perform I/O against each individual
disk image, allowing (for example) both <em>Hypervisor #1</em> and
<em>Hypervisor #2</em> to run guests with disk images for which <em>Brick #3</em> has
been selected as the authoritative copy.</p>
</section>
<section id="application-level-replication">
<span id="deployment-application-replication"></span><h3>Application Level Replication<a class="headerlink" href="#application-level-replication" title="Permalink to this heading"></a></h3>
<p>Yet another means to provide redundancy of data is to use
application-level replication where available.</p>
<p>Famous examples include database server replication, where one or more
MySQL masters are used for write operations, and one or more MySQL
slaves are used for read operations, and LDAP replication.</p>
<p>Cyrus IMAP can also replicate its mail spools to other systems, such
that multiple backends hold the payload served to your users.</p>
</section>
<section id="shared-storage-generic">
<h3>Shared Storage (Generic)<a class="headerlink" href="#shared-storage-generic" title="Permalink to this heading"></a></h3>
<p>Contrary to popular belief, shared storage technologies -- NFS, iSCSI and FC
alike -- are <strong>not</strong> storage devices. They are <em>network protocols</em> for
which the application just so happens to be storage -- with perhaps the
exception to the rule being Fiber-Channel, which does not strictly adhere to the
<a class="reference external" href="http://en.wikipedia.org/wiki/OSI_model">OSI model for networking</a>, although its own 5-layer model is comparable.</p>
<p>iSCSI and Fiber-Channel LUNs however are <em>mapped</em> to storage devices by
your favorite operating system's drivers for each technology, or
possibly by a hardware device (an <a class="reference internal" href="../../../glossary.html#term-HBA"><span class="xref std std-term">HBA</span></a>, or in iSCSI, an
<em>initiator</em>).</p>
<p>As such, use of these network protocols for which the purpose just so
happens to be storage does <strong>not</strong> provide redundancy.</p>
<p>It is imperative this is understood and equally well applied in planning
for storage infrastructure, or that your storage appliance vendor or
consultancy partner is trusted in their judgement.</p>
</section>
<section id="shared-storage-nfs">
<h3>Shared Storage (NFS)<a class="headerlink" href="#shared-storage-nfs" title="Permalink to this heading"></a></h3>
<p>Use of the Network File System (NFS) in and of itself does <strong>not</strong>
provide redundancy, although the underlying storage volume might be
replicated.</p>
<p>For a variety of reasons, the use of <a class="reference external" href="http://www.time-travellers.org/shane/papers/NFS_considered_harmful.html">NFS is considered harmful</a> and is
therefore most definitely not recommended for Cyrus IMAP spool storage,
or for any other storage related to
functional components of Cyrus IMAP itself -- IMAP, LDAP, SQL, etc.</p>
<p>Most individual concerns can be addressed separately, and some should or
must already be resolved to address other potentially problematic areas
of a given infrastructure, regardless of the use of NFS.</p>
<p>A couple of concerns however only have <em>workarounds</em>, not solutions --
such as disabling locking -- and a number of concerns have no solution
at all.</p>
<p>One penalty to address is the inability of NFS-mounted volumes to cache
I/O, known as in-memory buffer caching.</p>
<p>A technology called <a class="reference external" href="https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Storage_Administration_Guide/ch-fscache.html">FS Cache</a> can help eliminate the latency incurred by
network round trips, but is still a filesystem-backed solution
(for which filesystem the local kernel applies buffer caching), requires
yet another daemon, and introduces yet another layer of synchronicity to
be maintained -- aside from <a class="reference external" href="https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Storage_Administration_Guide/fscachelimitnfs.html">other limitations</a>.</p>
<p>An NFS-backed storage volume can still be used for fewer, larger files,
such as guest disk images.</p>
</section>
<section id="shared-storage-iscsi-or-fc-luns">
<h3>Shared Storage (iSCSI or FC LUNs)<a class="headerlink" href="#shared-storage-iscsi-or-fc-luns" title="Permalink to this heading"></a></h3>
<p>Both iSCSI LUNs and Fibre Channel LUNs facilitate attaching a networked
block storage device as if it were a local disk (creating devices
similar to <code class="docutils literal notranslate"><span class="pre">/dev/sd{a,b,c,d}</span></code> etc.).</p>
<p>Since such a LUN is available over a &quot;network&quot; infrastructure, it may be
shared between multiple nodes, but when it is, the nodes need to
coordinate their I/O on some other level.</p>
<p>With an example case of hypervisors, either <a class="reference external" href="https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Logical_Volume_Manager_Administration/LVM_Cluster_Overview.html">Cluster LVM</a> <a class="footnote-reference brackets" href="#id9" id="id4" role="doc-noteref"><span class="fn-bracket">[</span>3<span class="fn-bracket">]</span></a> or
<a class="reference external" href="http://en.wikipedia.org/wiki/GFS2">GFS</a> <a class="footnote-reference brackets" href="#id10" id="id5" role="doc-noteref"><span class="fn-bracket">[</span>4<span class="fn-bracket">]</span></a> could be used to protect against corruption of the LUN.</p>
</section>
</section>
<section id="availability">
<span id="imap-deployment-storage-availability"></span><h2>Availability<a class="headerlink" href="#availability" title="Permalink to this heading"></a></h2>
<p>Availability of storage, too, can be achieved via multiple routes. In one
of the aforementioned technologies, replicated bricks that are both
available in real time and online, in a parallel read-write capacity,
provide high availability through redundancy (see
<a class="reference internal" href="#imap-deployment-storage-integrated-storage-protocol-level-replication"><span class="std std-ref">Integrated Storage Protocol Level Replication</span></a>).</p>
<p>An existing chunk of storage you might have is likely backed by a level
of RAID, with redundancy through mirroring individual disk volumes
and/or the inline calculation of parity, and perhaps also some spare
disks to replace those that are kicked or fall out of line.</p>
<p>Further features might include battery-backed I/O controllers, redundant
power supplies connected to different power groups, a further UPS and
a diesel generator (you start up once a month, right?).</p>
<p>The availability features of a data center are beyond the scope of this
document, but when we speak of availability with regard to storage, we
intend to speak of immediate, instant, online availability with
automated failover (such as the RAID array) -- and more prominently,
without interruption.</p>
<section id="multipath">
<h3>Multipath<a class="headerlink" href="#multipath" title="Permalink to this heading"></a></h3>
<p>Multipath is an enhancement technique in which multiple paths that are
available to the storage can be balanced, shaped and failed over
automatically. Imagine the following networking diagram:</p>
<div class="graphviz"><img src="../../../_images/graphviz-94191218877563f76f61e4eafc1cbe44611f1b1e.png" alt="digraph {
        rankdir = TB;
        splines = true;
        overlab = prism;

        edge [color=gray50, fontname=Calibri, fontsize=11];
        node [style=filled, shape=record, fontname=Calibri, fontsize=11];

        &quot;Node&quot;;

        &quot;Switch #1&quot;; &quot;Switch #2&quot;;

        &quot;Canister #1&quot;; &quot;Canister #2&quot;;

        &quot;iSCSI Target #1&quot;, &quot;iSCSI Target #2&quot;;

        &quot;Node&quot; -&gt; &quot;Switch #1&quot; [dir=none]
        &quot;Node&quot; -&gt; &quot;Switch #2&quot; [dir=none];

        &quot;Switch #1&quot; -&gt; &quot;Canister #1&quot; [dir=none];
        &quot;Switch #1&quot; -&gt; &quot;Canister #2&quot; [dir=none];

        &quot;Switch #2&quot; -&gt; &quot;Canister #1&quot; [dir=none];
        &quot;Switch #2&quot; -&gt; &quot;Canister #2&quot; [dir=none];

        &quot;Canister #1&quot; -&gt; &quot;iSCSI Target #1&quot; [dir=none];
        &quot;Canister #1&quot; -&gt; &quot;iSCSI Target #2&quot; [dir=none];

        &quot;Canister #2&quot; -&gt; &quot;iSCSI Target #1&quot; [dir=none];
        &quot;Canister #2&quot; -&gt; &quot;iSCSI Target #2&quot; [dir=none];
    }" class="graphviz" /></div>
<p>The <em>null</em> situation is depicted in the previous wiring diagram. When
multipath kicks in, primary and secondary paths are chosen for each
unique target. The system nevertheless maintains a list of all potential
paths, and continuously monitors each of them for viability.</p>
<p>In the example, <em>Node</em> attaching to <em>iSCSI Target #1</em> results in up
to 4 paths to <em>iSCSI Target #1</em> -- <em>4</em> paths, not <em>8</em>, because the
networking of <em>Switch #1</em> and <em>Switch #2</em> is not considered a path with
iSCSI -- <em>two nodes</em> and <em>two send targets each</em>.</p>
<p>Multipath chooses one path to the available storage:</p>
<div class="graphviz"><img src="../../../_images/graphviz-bcf8927eec26409fd3af5aaaf76d6d8e349f78d9.png" alt="digraph {
        rankdir = TB;
        splines = true;
        overlab = prism;

        edge [color=gray50, fontname=Calibri, fontsize=11];
        node [style=filled, shape=record, fontname=Calibri, fontsize=11];

        &quot;Node&quot;;

        &quot;Switch #1&quot; [color=green];
        &quot;Switch #2&quot;;

        &quot;Canister #1&quot;;
        &quot;Canister #2&quot; [color=green];

        &quot;iSCSI Target #1&quot; [color=green];
        &quot;iSCSI Target #2&quot;;

        &quot;Node&quot; -&gt; &quot;Switch #1&quot; [dir=none,color=green]
        &quot;Node&quot; -&gt; &quot;Switch #2&quot; [dir=none];

        &quot;Switch #1&quot; -&gt; &quot;Canister #1&quot; [dir=none];
        &quot;Switch #1&quot; -&gt; &quot;Canister #2&quot; [dir=none,color=green];

        &quot;Switch #2&quot; -&gt; &quot;Canister #1&quot; [dir=none];
        &quot;Switch #2&quot; -&gt; &quot;Canister #2&quot; [dir=none];

        &quot;Canister #1&quot; -&gt; &quot;iSCSI Target #1&quot; [dir=none];
        &quot;Canister #1&quot; -&gt; &quot;iSCSI Target #2&quot; [dir=none];

        &quot;Canister #2&quot; -&gt; &quot;iSCSI Target #1&quot; [dir=none,color=green];
        &quot;Canister #2&quot; -&gt; &quot;iSCSI Target #2&quot; [dir=none];
    }" class="graphviz" /></div>
<p>Should one port, bridge, controller, switch or cable fail, then the I/O
can fall back on to any of the remaining available paths.</p>
<p>As per the example, this might mean the following (with <em>Canister #2</em>
failing):</p>
<div class="graphviz"><img src="../../../_images/graphviz-a8c8ff6e847d79fecdbe8cb1424b8b50a9e36934.png" alt="digraph {
        rankdir = TB;
        splines = true;
        overlab = prism;

        edge [color=gray50, fontname=Calibri, fontsize=11];
        node [style=filled, shape=record, fontname=Calibri, fontsize=11];

        &quot;Node&quot;;

        &quot;Switch #1&quot; [color=green];
        &quot;Switch #2&quot;;

        &quot;Canister #1&quot; [color=green];
        &quot;Canister #2&quot; [color=red];

        &quot;iSCSI Target #1&quot; [color=green];
        &quot;iSCSI Target #2&quot;;

        &quot;Node&quot; -&gt; &quot;Switch #1&quot; [dir=none,color=green]
        &quot;Node&quot; -&gt; &quot;Switch #2&quot; [dir=none];

        &quot;Switch #1&quot; -&gt; &quot;Canister #1&quot; [dir=none,color=green];
        &quot;Switch #1&quot; -&gt; &quot;Canister #2&quot; [dir=none,color=red];

        &quot;Switch #2&quot; -&gt; &quot;Canister #1&quot; [dir=none];
        &quot;Switch #2&quot; -&gt; &quot;Canister #2&quot; [dir=none];

        &quot;Canister #1&quot; -&gt; &quot;iSCSI Target #1&quot; [dir=none,color=green];
        &quot;Canister #1&quot; -&gt; &quot;iSCSI Target #2&quot; [dir=none];

        &quot;Canister #2&quot; -&gt; &quot;iSCSI Target #1&quot; [dir=none,color=red];
        &quot;Canister #2&quot; -&gt; &quot;iSCSI Target #2&quot; [dir=none];
    }" class="graphviz" /></div>
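<p>The failover shown in the diagrams above can be sketched as a toy model
in Python. This illustrates the example topology only and does not describe
how a multipath implementation works internally: the function names
<code class="docutils literal notranslate"><span class="pre">enumerate_paths</span></code> and <code class="docutils literal notranslate"><span class="pre">choose_path</span></code> are hypothetical, counting one path
per switch and canister combination is one reading of the figure of four
paths, and picking the first path that avoids all failed components is an
assumed selection policy.</p>
<div class="highlight-python notranslate"><div class="highlight"><pre>from itertools import product

# Components of the example topology in the diagrams.
SWITCHES = ["Switch #1", "Switch #2"]
CANISTERS = ["Canister #1", "Canister #2"]


def enumerate_paths(target):
    """Return every (switch, canister, target) route from the node."""
    return [(s, c, target) for s, c in product(SWITCHES, CANISTERS)]


def choose_path(paths, failed=frozenset()):
    """Pick the first path that avoids every failed component (assumed policy)."""
    for path in paths:
        if not failed.intersection(path):
            return path
    return None  # no viable path is left


paths = enumerate_paths("iSCSI Target #1")
print(len(paths))
print(choose_path(paths, failed={"Canister #2"}))
</pre></div>
</div>
<p>With <em>Canister #2</em> marked as failed, the sketch falls back to a route via
<em>Canister #1</em>, as in the last diagram. Only when both canisters are
unavailable does it report that no path is left.</p>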
</section>
</section>
<section id="performance">
<span id="imap-deployment-storage-performance"></span><h2>Performance<a class="headerlink" href="#performance" title="Permalink to this heading"></a></h2>
<section id="storage-tiering">
<h3>Storage Tiering<a class="headerlink" href="#storage-tiering" title="Permalink to this heading"></a></h3>
<p>Storage tiering includes the combination of different types of storage
or storage volumes with different performance expectations within the
infrastructure, so that a larger volume of slower, cheaper storage can
be used for items that are not used that much, and/or are not that
important for day-to-day operations, while a smaller volume of faster,
more expensive storage can be used for items that are frequently
accessed and have significant importance to everyday use.</p>
<p>The Cyrus IMAP administrator guide has a section on using
<a class="reference internal" href="../../reference/admin/tweaking.html#admin-tweaking-cyrus-imapd-storage-tiering"><span class="std std-ref">Storage Tiering</span></a> to tweak Cyrus IMAP
performance, to illustrate various opportunities to make optimal use of
your storage.</p>
<p>As a general rule of thumb, you could divide
<a class="reference internal" href="../../../glossary.html#term-operating-system-disks"><span class="xref std std-term">operating system disks</span></a> and <a class="reference internal" href="../../../glossary.html#term-payload-disks"><span class="xref std std-term">payload disks</span></a>; the operating
system disk could hold your base installation of a node, including
everything in the root (<code class="docutils literal notranslate"><span class="pre">/</span></code>) filesystem, while your payload disk(s)
hold the files and directories that contain the system's service(s)
payload (such as <code class="docutils literal notranslate"><span class="pre">/var/lib/mysql/</span></code>, <code class="docutils literal notranslate"><span class="pre">/var/spool/cyrus/</span></code>,
<code class="docutils literal notranslate"><span class="pre">/var/lib/imap/</span></code>, <code class="docutils literal notranslate"><span class="pre">/var/lib/dirsrv/</span></code>, etc.).</p>
<p>Distributing what is and what is not frequently used may be a cumbersome
task for administrators. Some storage vendors' appliances offer
automated storage tiering, where some disks in the appliance are SSDs,
while others are SATA or SAS HDDs, and the appliance itself tiers the
storage.</p>
<p>A similar solution is available to Linux nodes, through <a class="reference external" href="http://en.wikipedia.org/wiki/Dm-cache">dm-cache</a>,
provided they run a recent kernel.</p>
</section>
<section id="disk-buffering">
<h3>Disk Buffering<a class="headerlink" href="#disk-buffering" title="Permalink to this heading"></a></h3>
<p>Reading from a disk is considered very, very slow when compared to
accessing a node's (real) memory. While dependent on the particular I/O
pattern of an application, it is not uncommon at all for an application
to read the same part of a disk volume several times during a relatively
short period of time.</p>
<p>In Cyrus IMAP, for example, while a user is logged in, a mail
folder's <code class="file docutils literal notranslate"><span class="pre">cyrus.index</span></code> is read more frequently than it is
written to -- such as when refreshing the folder view, when opening a
message in the folder, when replying to a message, etc.</p>
<p>It is important to appreciate the impact of <a class="reference external" href="http://www.tldp.org/LDP/sag/html/buffer-cache.html">memory-based buffer cache</a>
for this type of I/O on the overall performance of the environment.</p>
<p>Should no (local) memory-based buffer cache be available, because for
example you are using a network filesystem (NFS, GlusterFS, etc.), then
it is extremely important to appreciate the consequences in terms of the
performance expectations.</p>
</section>
<section id="readahead">
<h3>Readahead<a class="headerlink" href="#readahead" title="Permalink to this heading"></a></h3>
<p>Reading ahead is a feature in which -- in a future-predicting,
anticipatory fashion -- a chunk of storage is read in addition to the
chunk of storage actually being requested.</p>
<p>A common application of read-ahead is to record all files accessed
during the boot process of a node, such that later boot sequences can
read files from disk, and insert them into the
<a class="reference external" href="http://www.tldp.org/LDP/sag/html/buffer-cache.html">memory-based buffer cache</a> ahead of software actually issuing the call
to read the file. The file's contents can now be reproduced from the
faster (real) memory rather than from the slow disk.</p>
<p>Readahead generally does not matter for small files, unless read
operations work on a collective of aggregate message files. It does
however matter for attached devices on infrastructural components such
as hypervisors, where entire block devices (for the guest) are the files
or block devices being read.</p>
<p>The ideal setting for readahead depends on a variety of factors and can
usually only be established by monitoring an environment and tweaking
the setting (followed by some more monitoring).</p>
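<p>As a starting point for such monitoring on a Linux node, the current
readahead window of a block device can be read from sysfs. The following
is a minimal sketch, assuming the <code class="docutils literal notranslate"><span class="pre">read_ahead_kb</span></code> attribute under
<code class="docutils literal notranslate"><span class="pre">/sys/block/</span></code>; the helper name and the device name <code class="docutils literal notranslate"><span class="pre">sda</span></code> are
illustrative assumptions.</p>
<div class="highlight-python notranslate"><div class="highlight"><pre>from pathlib import Path


def read_ahead_kb(device, sysfs_root="/sys/block"):
    """Return the readahead window (in KiB) the kernel reports for a device."""
    path = Path(sysfs_root) / device / "queue" / "read_ahead_kb"
    return int(path.read_text().strip())


# Example (assumes a block device named "sda" exists on this node):
#     print(read_ahead_kb("sda"))
</pre></div>
</div>
<p>Recording this value alongside your I/O measurements before and after each
change makes it easier to tell which setting was in effect.</p>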
</section>
</section>
<section id="scalability">
<span id="imap-deployment-storage-scalability"></span><h2>Scalability<a class="headerlink" href="#scalability" title="Permalink to this heading"></a></h2>
<p>When originally planning for storage capacity, a few things are to be
taken into account. We'll point these out and address them later in
this section.</p>
<p>Generically speaking, when storage capacity is planned for initially,
a certain period of time is used to establish how much storage might be
required (for that duration).</p>
<p>However, let's suppose regulatory provisions dictate a period of 10
years of business communications need to be preserved. How does one
accurately predict the volume of communications over the next 10 years?</p>
<p>Let's suppose your organization is in flux, expanding or contracting as
businesses do at times, or that budget cuts or unexpected extra tasks
are imposed on your organization, or that the organization takes over
or otherwise incorporates another.</p>
<p>Today's storage coming with a certain price-tag, and tomorrow's with a
different one, it can be an interesting exercise to plan for storage to
grow organically as needed, rather than make large investments to provide
capacity that may only be used years from today, or not be used at all,
or turn out to still not be sufficient.</p>
<p>One may also consider planning for the future expansion of the storage
solution chosen today, possibly including significant changes in
requirements (larger mailboxes).</p>
<section id="data-retention">
<h3>Data Retention<a class="headerlink" href="#data-retention" title="Permalink to this heading"></a></h3>
<p>Cyrus IMAP by default does not delete IMAP spool contents from the
filesystem for a period of 69 days.</p>
<p>This means that when 100 users each have 1 GB of quota, the actual
data footprint might grow way beyond 100 GB on disk.</p>
<p>Depending on the nature of how you use Cyrus IMAP, a reasonable
expectation can be formulated and used for calculating the likely amount
of disk space used in addition to the content that continues to count
towards quota.</p>
<p>For example, if a large amount of message traffic ends up in a shared
folder that many users read messages from and respond to (such as might
be the case for an <a class="reference external" href="mailto:info&#37;&#52;&#48;example&#46;org">info<span>&#64;</span>example<span>&#46;</span>org</a> email address), then around triple
the amount of traffic per month will continue to be stored on disk, plus
what you decide is still current and not deleted by users (the &quot;live
size&quot;).</p>
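<p>The rule of thumb in this example can be written down as a small
calculation. This is a sketch only: the function name and the traffic and
live-size figures are made-up illustrations, and the factor of three is
taken from the paragraph above rather than from any Cyrus IMAP
setting.</p>
<div class="highlight-python notranslate"><div class="highlight"><pre>def estimate_footprint_gb(monthly_traffic_gb, live_size_gb, retained_months=3):
    """Estimate disk usage: deleted-but-retained traffic plus the live size."""
    return monthly_traffic_gb * retained_months + live_size_gb


# Hypothetical shared folder: 2 GB of traffic per month, 5 GB kept as current.
print(estimate_footprint_gb(monthly_traffic_gb=2, live_size_gb=5))  # prints 11
</pre></div>
</div>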
</section>
<section id="shared-folders">
<h3>Shared Folders<a class="headerlink" href="#shared-folders" title="Permalink to this heading"></a></h3>
<p>Shared folders (primarily those to which mail is delivered) do not, by
default, have any quota on them. They are also not purged by default. As
such, they could grow infinitely (until disks run out of space).</p>
<p>A busy mailing list used for human communications, such as
<a class="reference external" href="mailto:devel&#37;&#52;&#48;lists&#46;fedoraproject&#46;org">devel<span>&#64;</span>lists<span>&#46;</span>fedoraproject<span>&#46;</span>org</a>, can be expected to grow to as much as 1
GB of data footprint on disk over a period of 3 years -- at a message
rate of fewer than ~100 a day and without purging.</p>
<p>A mailing list with automated messages generated from applications, such
as <a class="reference external" href="mailto:bugs-list&#37;&#52;&#48;kde&#46;org">bugs-list<span>&#64;</span>kde<span>&#46;</span>org</a>, which is notified of all ticket changes for KDE's
upstream Bugzilla, can be expected to grow to up to 3.5 GB over the same
period -- at a message rate of ~300 per day and without purging.</p>
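<p>The two figures above imply a similar average message size, which can be
checked with a short calculation. This sketch assumes decimal units
(1 GB = 1,000,000 KB) and 365-day years; the function name is hypothetical.
Because the first list sees <em>fewer</em> than ~100 messages a day, its result is
a lower bound.</p>
<div class="highlight-python notranslate"><div class="highlight"><pre>def implied_avg_message_kb(total_gb, messages_per_day, years):
    """Average on-disk size per message implied by a growth figure."""
    messages = messages_per_day * 365 * years
    return total_gb * 1_000_000 / messages


# Figures quoted in the text for the two example mailing lists.
print(round(implied_avg_message_kb(1.0, 100, 3), 1))
print(round(implied_avg_message_kb(3.5, 300, 3), 1))
</pre></div>
</div>
<p>Both lists work out to roughly 10 KB per message, which can serve as a
first estimate when projecting the growth of your own shared folders.</p>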
</section>
<section id="user-s-groupware-folders">
<h3>User's Groupware Folders<a class="headerlink" href="#user-s-groupware-folders" title="Permalink to this heading"></a></h3>
<p>Users tend not to clean up their calendars by removing old appointments
that no longer have any bearing on today's views/operations. These
appointments do count towards a user's quota.</p>
</section>
</section>
<section id="capacity">
<span id="imap-deployment-storage-capacity"></span><h2>Capacity<a class="headerlink" href="#capacity" title="Permalink to this heading"></a></h2>
<p>Regardless of the volume of storage in total, this section relates to
the volume of storage allocated to any one singular specific purpose in
Cyrus IMAP, and capacity planning in light of that (not the layer
providing the storage).</p>
<p>Archiving and e-Discovery notwithstanding, the largest chunks of data
volume you are going to find in Cyrus IMAP are the live IMAP
spools.</p>
<p>Let each individual IMAP spool be considered a volume, or part of a
volume if you feel inclined to share volumes across Cyrus IMAP backend
instances. Regardless, you need a filesystem <strong>somewhere</strong> (even if the
bricks building the volume are distributed) -- and it is there that the
recommended restrictions on the individual chunks of storage lie.</p>
<p>Saturating the argument to make a point, imagine, if you will, a million
users with one gigabyte of data each. Just the original, minimal data
footprint is now around one petabyte.</p>
<p>Performing a filesystem check (<strong class="command">fsck.ext4</strong> comes to mind) on a
single one-petabyte volume will be prohibitively expensive, simply
considering the time the command needs to complete, let alone to
complete successfully: your <strong>million</strong> users will not have access to
their data until the command has finished.</p>
<p><strong>For that reason alone</strong> you would require a second copy of the groupware
payload, bringing the minimal data footprint to <strong>two</strong> petabytes.</p>
<div class="admonition note">
<p class="admonition-title">Note</p>
<p>Also note that the two replicas of one petabyte would, if the
replication occurs at the storage volume level, run the risk of
corrupting both replicas' filesystems.</p>
</div>
<p>Your requirements for data redundancy aside, filesystem checks needing
to be performed at least regularly <a class="footnote-reference brackets" href="#id11" id="id6" role="doc-noteref"><span class="fn-bracket">[</span>5<span class="fn-bracket">]</span></a>, if not for the level of
likelihood they need to happen because actual recovery is required,
should be restricted to a volume of data that enables the check to
complete in a timely fashion, and possibly (when no data redundancy is
implemented) even within a timeframe you feel comfortable you can hold
off your users/customers while they have no access to their data.</p>
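<p>The arithmetic behind this reasoning can be sketched as follows. The
function names and the 2,000 GB ceiling per volume are hypothetical; the
right ceiling depends on how long a filesystem check takes in your
environment, which you would need to measure.</p>
<div class="highlight-python notranslate"><div class="highlight"><pre>import math


def total_footprint_gb(users, quota_gb, copies=1):
    """Minimal data footprint for a number of users, quota and copies."""
    return users * quota_gb * copies


def volumes_needed(total_gb, max_volume_gb):
    """Number of volumes when no single volume may exceed max_volume_gb."""
    return math.ceil(total_gb / max_volume_gb)


one_copy = total_footprint_gb(users=1_000_000, quota_gb=1)
print(one_copy)                         # 1000000 GB, about one petabyte
print(volumes_needed(one_copy, 2_000))  # assumed ceiling of 2,000 GB per volume
</pre></div>
</div>
<p>Splitting the payload this way means a filesystem check only ever takes one
of these smaller volumes, and the users on it, out of service at a time.</p>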
<p>That filesystem checks need to happen regularly is not to say that
such filesystem checks require the node to be taken offline. Should you
use Logical Volume Management (LVM) for example, and not allocate 100%
of the volume group to the logical volume that holds the IMAP spool,
then intermediate filesystem checks can be executed on a snapshot of
said logical volume instead, while the node remains online, to give
you a generic impression of the filesystem's health. Given this
information, you can schedule a service window should you feel the need
to check the actual filesystem.</p>
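<p>The snapshot-based check described above could look roughly like the
following sequence of commands. The sketch below only builds and prints
the commands and does not execute anything. It assumes an ext4 filesystem,
and the volume group name <code class="docutils literal notranslate"><span class="pre">vg0</span></code>, the logical volume name
<code class="docutils literal notranslate"><span class="pre">imapspool</span></code>, the snapshot size and the helper name are all
hypothetical.</p>
<div class="highlight-python notranslate"><div class="highlight"><pre>import shlex


def snapshot_check_commands(vg, lv, snapshot_size="10G"):
    """Build the commands for a read-only check on an LVM snapshot."""
    snap = f"{lv}-fsck"
    return [
        # Create a snapshot from the free space left in the volume group.
        ["lvcreate", "--snapshot", "--size", snapshot_size,
         "--name", snap, f"/dev/{vg}/{lv}"],
        # Force a check, but answer "no" to every repair prompt.
        ["fsck.ext4", "-f", "-n", f"/dev/{vg}/{snap}"],
        # Drop the snapshot once the check has completed.
        ["lvremove", "--force", f"/dev/{vg}/{snap}"],
    ]


for argv in snapshot_check_commands("vg0", "imapspool"):
    print(shlex.join(argv))
</pre></div>
</div>
<p>The snapshot needs enough space to absorb the writes that hit the original
volume while the check runs, so size it according to your write rate.</p>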
<p>A good article on filesystems, the corruption of data and their causes
and mitigation strategies has been written up by <a class="reference external" href="http://lwn.net">LWN</a>,
<a class="reference external" href="http://lwn.net/Articles/190222/">The 2006 Linux Filesystem Workshop</a>. This article also explains what
a filesystem check actually does, and why it is usually configured
to be run after either a certain amount of time or number of mounts.</p>
</section>
<section id="cost">
<span id="imap-deployment-storage-cost"></span><h2>Cost<a class="headerlink" href="#cost" title="Permalink to this heading"></a></h2>
<p>When cost is of no concern, multiple vendors of storage solutions will
tell you precisely what you need to hear -- I think we've all been
there.</p>
<p>When cost is a concern, however, cheaper disks are often slower, fail
faster, and sometimes also do not provide the
<a class="reference internal" href="#imap-deployment-storage-capacity"><span class="std std-ref">Capacity</span></a> desired.</p>
<p>On the other hand, stuffing many consumer-grade SATA III disks into
some commodity hardware likely raises run-time costs -- energy.</p>
<p>However, a chassis of a storage solution usually comes at a higher
price point, and therefore expands capacity with relatively large
chunks, which may not be what you require at that moment.</p>
<p class="rubric">Footnotes</p>
<aside class="footnote-list brackets">
<aside class="footnote brackets" id="id7" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#id1">1</a><span class="fn-bracket">]</span></span>
<p>Applications may also operate on data not stored on disk at all,
which is another common avenue potentially resulting in loss of data
-- or <em>corruption</em>, which is merely a type of data-loss.</p>
</aside>
<aside class="footnote brackets" id="id8" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#id2">2</a><span class="fn-bracket">]</span></span>
<p>With read operations, the other node(s) must be blocked from
writing, and with write operations, the other node(s) must be
blocked from reading and writing.</p>
</aside>
<aside class="footnote brackets" id="id9" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#id4">3</a><span class="fn-bracket">]</span></span>
<p>When using ClusterLVM, you would use logical volumes as disks for
your guests.</p>
</aside>
<aside class="footnote brackets" id="id10" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#id5">4</a><span class="fn-bracket">]</span></span>
<p>When using GFS, you would mount the GFS filesystem partition on each
hypervisor and use disk image files.</p>
</aside>
<aside class="footnote brackets" id="id11" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#id6">5</a><span class="fn-bracket">]</span></span>
<p>Execute filesystem checks regularly to increase your level of
confidence that, should emergency repairs need to be performed, you
are able to recover swiftly.</p>
<p>The <a class="reference internal" href="../../../glossary.html#term-MTBF"><span class="xref std std-term">MTBF</span></a> of a stable filesystem has most often been subject
to the failure of the underlying disk, with the filesystem unable to
recover (in time) from the underlying disk failing (partly).</p>
</aside>
</aside>
</section>
</section>


           </div>
          </div>
          <footer><div class="rst-footer-buttons" role="navigation" aria-label="Footer">
        <a href="performance_recommendations.html" class="btn btn-neutral float-left" title="Performance Recommendations" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
        <a href="supported-platforms.html" class="btn btn-neutral float-right" title="Supported Platforms and System Requirements" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
    </div>

  <hr/>

  <div role="contentinfo">
    <p>&#169; Copyright 1993–2025, The Cyrus Team.</p>
  </div>

  Built with <a href="https://www.sphinx-doc.org/">Sphinx</a> using a
    <a href="https://github.com/readthedocs/sphinx_rtd_theme">theme</a>
    provided by <a href="https://readthedocs.org">Read the Docs</a>.
   

</footer>
        </div>
      </div>
    </section>
  </div>
  <script>
      jQuery(function () {
          SphinxRtdTheme.Navigation.enable(true);
      });
  </script>
 



</body>
</html>