<html>
<head>
<title>MySQL myreplication moodss module</title>
</head>
<body>
<p>This is a module for monitoring a pool of replicated master and slave MySQL database servers (latest versions in the 4 series) and ensuring that the servers are synchronized.
<p>On UNIX-type platforms, it requires the mysqltcl package (at <a href="http://www.xdobry.de/mysqltcl/">http://www.xdobry.de/mysqltcl/</a>) for connection via the native MySQL protocol, or the tclodbc package (at <a href="http://tclodbc.sourceforge.net/">http://tclodbc.sourceforge.net/</a>) for connection via ODBC (Open DataBase Connectivity).
<br>On Windows platforms, please read the <i>install.txt</i> file.
<p>Data is drawn from the <b>SHOW MASTER STATUS</b>, <b>SHOW SLAVE STATUS</b>, <b>SHOW VARIABLES</b> and <b>SHOW MASTER LOGS</b> query results and initially displayed in a single table:
<pre><img src="myreplication.gif" alt="view of the myreplication module table"></pre>
<p><b>There are 9 data columns</b>:
<ul>
<li>The database server <b>host</b> name (or IP address) or ODBC DSN (Data Source Name)
<li>The unique <b>id</b>entifier of the server instance in the community of replication partners (all values in this column should be different)
<li>The replication <b>delay</b> in seconds (how far behind the server is in the replication process; <i>0</i> if the server is synchronized with the master(s))
<li>The database server <b>role</b> (<i>master</i> or <i>slave</i> or <i>both</i>)
<li>The <b>master</b> database server host name (or IP address) (only if a slave, empty for a master)
<li>Whether the slave thread is <b>running</b> (<i>yes</i> or <i>no</i>, only if a slave, empty for a master)
<li>The binary update <b>log</b> file name (the non-index part of which should be common to all servers)
<li>The replication <b>position</b> in the binary log
<li>The latest replication-related <b>error</b> message
</ul>
The <b>delay</b> is calculated as follows:
<ol>
<li>Among all the monitored servers, the highest log file index, together with the greatest position in that log file, is used as the reference (this method is forward compatible, since it does not depend on whether there are one or several masters, for example)
<li>When a server is found to be late (its log file index or position does not match the reference determined above), the poll time is added to its delay value (which means that the precision of the delay is the poll time)
<li>When a server is no longer late, its delay is reset to <i>0</i>
</ol>
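<p>The three steps above can be sketched in Python as follows. This is only an illustration, not the module's actual code: the function name <i>update_delays</i>, the dictionaries and the (log file index, position) pairs are assumptions made for the example.

```python
def update_delays(servers, delays, poll_time):
    """Update the per-host delays (in seconds) after one poll.

    servers maps a host name to a (log_index, position) tuple.
    delays maps a host name to its accumulated delay and is updated in place.
    Both structures are hypothetical, chosen for this sketch only.
    """
    # Reference: highest log file index, then greatest position within it.
    # Tuples compare element by element, so max() gives exactly that.
    reference = max(servers.values())
    for host, state in servers.items():
        if state == reference:
            delays[host] = 0  # not late (any more): reset the delay
        else:
            delays[host] = delays.get(host, 0) + poll_time  # late: add one poll
    return reference


delays = {}
update_delays({"master": (7, 1200), "slave": (7, 900)}, delays, 30)
update_delays({"master": (7, 1500), "slave": (7, 1200)}, delays, 30)
# delays is now {"master": 0, "slave": 60}
update_delays({"master": (7, 1500), "slave": (7, 1500)}, delays, 30)
# delays is now {"master": 0, "slave": 0}
```

<p>In this sketch the slave stays behind for two polls of <i>30</i> seconds, so its delay reaches <i>60</i> before being reset to <i>0</i> once it catches up.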
<p><b>Error handling:</b>
<p>When an error occurs (while communicating with the database servers, or of any other type), cells in the <i>id</i>, <i>delay</i> and <i>position</i> columns are set to void numeric values (displayed as <i>?</i>), while all other cells are emptied. A descriptive error message is also generated in such a case.
<p><b>Module options:</b>
<ul>
<li><i>--dsns</i>
<br>Comma-separated list of ODBC Data Source Names (see your database/system administrator if in doubt). In this case, ODBC is used for connecting to the database servers. This option is incompatible with the <i>--hosts</i> option.
<li><i>--hosts</i>
<br>Comma-separated list of host names or IP addresses. It does not matter whether they are <i>slave</i>s, <i>master</i>s or <i>both</i>; they are displayed in the specified order, with the first on the top row. Each entry can include the port number used to connect to the database server, using the traditional <i>host:port</i> notation (see examples below). Port <i>3306</i> is used if none is specified.
<li><i>--user</i>
<br>Database server user name (must be common to all monitored servers, defaults to the current user).
<li><i>--password</i>
<br>Database password for the user (must be common to all monitored servers, no default).
</ul>
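<p>As an illustration of the <i>host:port</i> notation accepted by <i>--hosts</i>, here is a minimal Python sketch. The <i>parse_hosts</i> function and the <i>DEFAULT_PORT</i> name are hypothetical and do not come from the module itself; only the comma-separated format and the <i>3306</i> default are taken from the description above.

```python
DEFAULT_PORT = 3306  # port used when an entry does not name one


def parse_hosts(option):
    """Split a --hosts value into (host, port) pairs, keeping the given order."""
    pairs = []
    for entry in option.split(","):
        # partition() yields an empty separator when there is no ":" in the entry
        host, separator, port = entry.strip().partition(":")
        pairs.append((host, int(port) if separator else DEFAULT_PORT))
    return pairs


print(parse_hosts("master.company.com,slave.company.com:3308"))
# [('master.company.com', 3306), ('slave.company.com', 3308)]
```

<p>The sketch assumes host names or IPv4 addresses only; it makes no attempt to handle addresses that themselves contain colons.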
<p><b>Thresholds:</b>
<p>In order to monitor the replicated servers pool, you could, for example:
<ul>
<li>Set a threshold (<i>up</i> type, compared to <i>300</i>) on the delay of each slave server. For example, if you have chosen a poll time of <i>30</i> seconds, a threshold of <i>300</i> seconds on each slave's delay ensures that the administrator is warned when any of the slaves is more than <i>5</i> minutes late in the replication process
<li>Set a threshold (<i>differ</i> type, compared to an empty string) on the error message of each slave server. This ensures that the administrator is warned when any of the slaves is in error, which generally means that it is no longer replicating
<li>Set a threshold (<i>differ</i> type, compared to <i>yes</i>) on the running state of each slave server
<li>Contribute other examples once you yourself become a replication master ;-)
</ul>
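<p>The checks that the first three example thresholds perform on a slave row can be sketched in Python as follows. Thresholds are actually defined in moodss itself, not in code; the <i>triggered_thresholds</i> function and the row dictionary are hypothetical, and the sketch assumes that an <i>up</i> threshold fires when the value exceeds the limit.

```python
def triggered_thresholds(row, max_delay=300):
    """Return the names of the columns whose example threshold fires."""
    fired = []
    if row["delay"] > max_delay:   # "up" type, compared to 300
        fired.append("delay")
    if row["error"] != "":         # "differ" type, compared to an empty string
        fired.append("error")
    if row["running"] != "yes":    # "differ" type, compared to "yes"
        fired.append("running")
    return fired


healthy = {"delay": 0, "error": "", "running": "yes"}
stalled = {"delay": 330, "error": "", "running": "no"}
print(triggered_thresholds(healthy))  # []
print(triggered_thresholds(stalled))  # ['delay', 'running']
```

<p>With a poll time of <i>30</i> seconds the delay only takes multiples of <i>30</i>, so in this sketch the delay threshold first fires at <i>330</i> seconds, that is on the eleventh consecutive late poll.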
<p><b>Examples:</b>
<pre>$ moodss myreplication --hosts 192.168.0.1,192.168.0.10,192.168.0.11
$ moodss myreplication --hosts master.company.com,slave.company.com:3308
$ moodss myreplication --hosts master.company.com,slave.company.com,backup.company.com --user replicator --password xxxxxx
$ moodss myreplication --dsns mymaster,myslave1,myslave2,myslave3 --user jdoe --password xxxxxx</pre>
</body>
</html>