Changes in CTDB 2.5.4
=====================
User-visible changes
--------------------
* New command "ctdb detach" to detach a database.
* Support for TDB robust mutexes. To enable, set the TDBMutexEnabled
  tunable to 1. This setting is per-node.
* New manual page ctdb-statistics.7.
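  As an illustrative sketch of the two features above (the configuration
  file path and database name are placeholders; tunables set this way
  apply only to the local node):

```shell
# Enable TDB robust mutexes for this node. Tunables are set with the
# CTDB_SET_ prefix in the CTDB configuration file, commonly
# /etc/sysconfig/ctdb (the path varies by distribution):
CTDB_SET_TDBMutexEnabled=1

# Detach a database from the cluster; "test.tdb" is a placeholder
# database name:
ctdb detach test.tdb
```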
Important bug fixes
-------------------
* Verify policy routing configuration when starting up to make sure that policy
routing tables do not override default routing tables.
* "ctdb scriptstatus" should correctly list the number of scripts executed.
* Do not run eventscripts at real-time priority.
* Make sure "ctdb restoredb" and "ctdb wipedb" cannot affect an ongoing
recovery.
* If a read-only record revocation fails, CTDB no longer aborts. It will
  retry the revocation.
* The pending_calls statistic is now updated correctly.
Important internal changes
--------------------------
* Vacuuming performance has been improved.
* Fix the order of setting recovery mode and freezing databases.
* Remove NAT gateway "monitor" event.
* Added a per-database queue for lock requests. This improves lock
  scheduling performance.
* When processing dmaster packets (DMASTER_REQUEST and DMASTER_REPLY),
  defer all call processing for that record. This avoids a temporary
  inconsistency in dmaster information that causes call requests to
  bounce rapidly between two nodes.
* Correctly capture the output from lock helper processes, so it can be logged.
* Many test improvements and additions.
Changes in CTDB 2.5.3
=====================
User-visible changes
--------------------
* New configuration variable CTDB_NATGW_STATIC_ROUTES allows NAT
gateway feature to create static host/network routes instead of
default routes. See the documentation. Use with care.
Important bug fixes
-------------------
* ctdbd no longer crashes when tickles are processed after reloading
the nodes file.
* "ctdb reloadips" works as expected because the DEL_PUBLIC_IP control
now waits until public IP addresses are released before returning.
Important internal changes
--------------------------
* Vacuuming performance has been improved.
* Record locking now compares records based on their hashes to avoid
scheduling multiple requests for records on the same hashchain.
* An internal timeout for revoking read-only record delegations has
  been changed from a hard-coded 5 seconds to the value of the
  ControlTimeout tunable. This makes it less likely that ctdbd will
  abort.
* Many test improvements and additions.
Changes in CTDB 2.5.2
=====================
User-visible changes
--------------------
* Much improved manpages from CTDB 2.5 are now installed and packaged.
Important bug fixes
-------------------
* "ctdb reloadips" now waits for replies to addip/delip controls
before returning.
Important internal changes
--------------------------
* The event scripts are now executed using vfork(2) and a helper
  binary instead of fork(2), providing a performance improvement.
* "ctdb reloadips" now works if some nodes are inactive. This
  means that public IP addresses can be reconfigured even if nodes
  are stopped.
Changes in CTDB 2.5.1
=====================
Important bug fixes
-------------------
* The locking code now correctly implements a per-database active
locks limit. Whole database lock requests can no longer be denied
because there are too many active locks - this is particularly
important for freezing databases during recovery.
* The debug_locks.sh script now locks against itself, so if it is
  already running then subsequent invocations will exit immediately.
* ctdb tool commands that operate on databases now work correctly when
a database ID is given.
* Various code fixes for issues found by Coverity.
Important internal changes
--------------------------
* statd-callout has been updated so that statd client information is
always up-to-date across the cluster. This is implemented by
storing the client information in a persistent database using a new
"ctdb ptrans" command.
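  For illustration, a persistent-database update via the new command
  might look like the following sketch (the database name and key/value
  pairs are hypothetical; check ctdb(1) for the exact input format):

```shell
# Apply a set of key/value updates to a persistent database in a
# single transaction; ctdb ptrans reads the pairs from stdin:
ctdb ptrans some_persistent.tdb <<EOF
"key1" "value1"
"key2" "value2"
EOF
```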
* The transaction code for persistent databases now retries until it
  is able to take the transaction lock. This makes the transaction
  semantics compatible with Samba's implementation.
* Locking helpers are created with vfork(2) instead of fork(2),
providing a performance improvement.
* config.guess has been updated to the latest upstream version so CTDB
should build on more platforms.
Changes in CTDB 2.5
===================
User-visible changes
--------------------
* The default location of the ctdbd socket is now:
/var/run/ctdb/ctdbd.socket
If you currently set CTDB_SOCKET in configuration then unsetting it
will probably do what you want.
* The default location of CTDB TDB databases is now:
/var/lib/ctdb
If you only set CTDB_DBDIR (to the old default of /var/ctdb) then
you probably want to move your databases to /var/lib/ctdb, drop your
setting of CTDB_DBDIR and just use the default.
To maintain the database files in /var/ctdb you will need to set
CTDB_DBDIR, CTDB_DBDIR_PERSISTENT and CTDB_DBDIR_STATE, since all of
these have moved.
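  One possible migration sequence, run while CTDB is stopped on the
  node, assuming the old default layout under /var/ctdb (the
  subdirectory names are the usual defaults; verify them against your
  configuration before moving anything):

```shell
# Stop CTDB before moving database files.
service ctdb stop

# Move volatile, persistent and state databases to the new default
# locations:
mkdir -p /var/lib/ctdb
mv /var/ctdb/*.tdb.* /var/lib/ctdb/
mv /var/ctdb/persistent /var/lib/ctdb/persistent
mv /var/ctdb/state /var/lib/ctdb/state

service ctdb start
```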
* Use of CTDB_OPTIONS to set ctdbd command-line options is no longer
supported. Please use individual configuration variables instead.
* Obsolete tunables VacuumDefaultInterval, VacuumMinInterval and
  VacuumMaxInterval have been removed. Setting them had no effect, but
  if you now try to set them in a configuration file via CTDB_SET_X=Y
  then CTDB will not start.
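  For example, a configuration fragment like the following (the file
  location varies by distribution) would now prevent CTDB from starting
  and should be removed; lines for tunables that still exist remain
  valid:

```shell
# /etc/sysconfig/ctdb (or your distribution's equivalent)
# This tunable no longer exists -- remove it, or ctdbd will not start:
CTDB_SET_VacuumDefaultInterval=10
# Tunables that still exist keep the same CTDB_SET_ syntax:
CTDB_SET_ControlTimeout=60
```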
* Much improved manual pages. Added new manpages ctdb(7),
ctdbd.conf(5), ctdb-tunables(7). Still some work to do.
* Most CTDB-specific configuration can now be set in
/etc/ctdb/ctdbd.conf.
This avoids cluttering distribution-specific configuration files,
such as /etc/sysconfig/ctdb. It also means that we can say: see
ctdbd.conf(5) for more details. :-)
* Configuration variable NFS_SERVER_MODE is deprecated and has been
replaced by CTDB_NFS_SERVER_MODE. See ctdbd.conf(5) for more
details.
* "ctdb reloadips" is much improved and should be used for reloading
the public IP configuration.
  This command attempts to yield much more predictable IP allocations
than using sequences of delip and addip commands. See ctdb(1) for
details.
* Ability to pass comma-separated string to ctdb(1) tool commands via
the -n option is now documented and works for most commands. See
ctdb(1) for details.
* "ctdb rebalancenode" is now a debugging command and should not be
used in normal operation. See ctdb(1) for details.
* "ctdb ban 0" is now invalid.
This was documented as causing a permanent ban. However, this was
not implemented and caused an "unban" instead. To avoid confusion,
0 is now an invalid ban duration. To administratively "ban" a node
use "ctdb stop" instead.
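  A sketch of the before/after administrative commands (ban durations
  are in seconds):

```shell
# No longer valid -- previously (mis)documented as a permanent ban:
#   ctdb ban 0

# Administratively take a node out of service and bring it back:
ctdb stop
ctdb continue

# Timed bans still work, e.g. ban a node for 60 seconds:
ctdb ban 60
```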
* The systemd configuration now puts the PID file in /run/ctdb (rather
than /run/ctdbd) for consistency with the initscript and other uses
of /var/run/ctdb.
Important bug fixes
-------------------
* Traverse regression fixed.
* The default recovery method for persistent databases has been
changed to use database sequence numbers instead of doing
record-by-record recovery (using record sequence numbers). This
fixes issues including registry corruption.
* Banned nodes are no longer told to run the "ipreallocated" event
during a takeover run, when in fallback mode with nodes that don't
support the IPREALLOCATED control.
Important internal changes
--------------------------
* Persistent transactions are now compatible with Samba and work
reliably.
* The recovery master role has been made more stable by resetting the
priority time each time a node becomes inactive. This means that
nodes that are active for a long time are more likely to retain the
recovery master role.
* The incomplete libctdb library has been removed.
* Test suite now starts ctdbd with the --sloppy-start option to speed
up startup. However, this should not be done in production.
Changes in CTDB 2.4
===================
User-visible changes
--------------------
* A missing network interface now causes monitoring to fail and the
node to become unhealthy.
* Changed ctdb command's default control timeout from 3s to 10s.
* debug-hung-script.sh now includes the output of "ctdb scriptstatus"
to provide more information.
Important bug fixes
-------------------
* Starting the CTDB daemon by running ctdbd directly no longer removes
  an existing Unix domain socket unconditionally.
* ctdbd once again successfully kills client processes on releasing
public IPs. It was checking for them as tracked child processes
and not finding them, so wasn't killing them.
* ctdbd_wrapper now exports CTDB_SOCKET so that child processes of
ctdbd (such as uses of ctdb in eventscripts) use the correct socket.
* Always use Jenkins hash when creating volatile databases. There
were a few places where TDBs would be attached with the wrong flags.
* Vacuuming code fixes in CTDB 2.2 introduced bugs in the new code
  that led to header corruption for empty records. This resulted in
  inconsistent headers on two nodes, causing requests for such records
  to bounce between nodes indefinitely and to log "High hopcount"
  messages. It also caused performance degradation.
* ctdbd was losing log messages at shutdown because they weren't being
given time to flush. ctdbd now sleeps for a second during shutdown
to allow time to flush log messages.
* Improved socket handling introduced in CTDB 2.2 caused ctdbd to
  process a large number of packets available on a single FD before
  polling other FDs. Fixed-size queue buffers are now used to allow
  fair scheduling across multiple FDs.
Important internal changes
--------------------------
* A node that fails to take/release multiple IPs will only incur a
  single banning credit. This makes a brief failure less likely to
  cause a node to be banned.
* ctdb killtcp has been changed to read connections from stdin and
10.interface now uses this feature to improve the time taken to kill
connections.
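  As a hedged sketch of the new killtcp interface (the exact line
  format expected on stdin should be checked against ctdb(1); the
  endpoints below are illustrative):

```shell
# Kill several TCP connections in one invocation; connections are
# supplied on stdin rather than as command-line arguments:
ctdb killtcp <<EOF
10.0.0.10:445 10.0.0.1:1025
10.0.0.10:445 10.0.0.2:1026
EOF
```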
* Improvements to hot records statistics in ctdb dbstatistics.
* Recovery daemon now assembles up-to-date node flags information
from remote nodes before checking if any flags are inconsistent and
forcing a recovery.
* ctdbd no longer creates multiple lock sub-processes for the same
key. This reduces the number of lock sub-processes substantially.
* Changed the nfsd RPC check failure policy to failover quickly
instead of trying to repair a node first by restarting NFS. Such
restarts would often hang if the cause of the RPC check failure was
the cluster filesystem or storage.
* Logging improvements relating to high hopcounts and sticky records.
* Make sure lower level tdb messages are logged correctly.
* CTDB commands disable/enable/stop/continue are now resilient to
individual control failures and retry in case of failures.
Changes in CTDB 2.3
===================
User-visible changes
--------------------
* Two new configuration variables for the 60.nfs eventscript:
- CTDB_MONITOR_NFS_THREAD_COUNT
- CTDB_NFS_DUMP_STUCK_THREADS
See ctdb.sysconfig for details.
* Removed the DeadlockTimeout tunable. To enable debugging of locking
  issues, set
  CTDB_DEBUG_LOCKS=/etc/ctdb/debug_locks.sh
* In overall statistics and database statistics, lock buckets have been
  updated to use the following timings:
  < 1ms, < 10ms, < 100ms, < 1s, < 2s, < 4s, < 8s, < 16s, < 32s, < 64s, >= 64s
* Initscript is now simplified with most CTDB-specific functionality
split out to ctdbd_wrapper, which is used to start and stop ctdbd.
* Add systemd support.
* CTDB subprocesses are now given informative names to allow them to
be easily distinguished when using programs like "top" or "perf".
Important bug fixes
-------------------
* The ctdb tool no longer exits from its retry loop if a control times
  out (e.g. under high load). This simple fix stops an exit from the
  retry loop on any error.
* When updating flags on all nodes, use the correct updated flags. This
should avoid wrong flag change messages in the logs.
* The recovery daemon will not ban other nodes if the current node
is banned.
* ctdb dbstatistics command now correctly outputs database statistics.
* Fixed a panic with overlapping shutdowns (regression in 2.2).
* Fixed 60.ganesha "monitor" event (regression in 2.2).
* Fixed a buffer overflow in the "reloadips" implementation.
* Fixed segmentation faults in ping_pong (called with incorrect
argument) and test binaries (called when ctdbd not running).
Important internal changes
--------------------------
* The recovery daemon on a stopped or banned node will stop
  participating in any cluster activity.
* Improved cluster-wide database traverse by sending records directly
  from the traverse child process to the requesting node.
* TDB checking and dropping of all IPs moved from initscript to "init"
event in 00.ctdb.
* To avoid "rogue IPs" the release IP callback now fails if the
released IP is still present on an interface.
Changes in CTDB 2.2
===================
User-visible changes
--------------------
* The "stopped" event has been removed.
The "ipreallocated" event is now run when a node is stopped. Use
this instead of "stopped".
* New --pidfile option for ctdbd, used by the initscript.
* The 60.nfs eventscript now uses configuration files in
/etc/ctdb/nfs-rpc-checks.d/ for timeouts and actions instead of
hardcoding them into the script.
* Notification handler scripts can now be dropped into /etc/ctdb/notify.d/.
* The NoIPTakeoverOnDisabled tunable has been renamed to
NoIPHostOnAllDisabled and now works properly when set on individual
nodes.
* New ctdb subcommand "runstate" prints the current internal runstate.
Runstates are used for serialising startup.
Important bug fixes
-------------------
* The Unix domain socket is now set to non-blocking after the
connection succeeds. This avoids connections failing with EAGAIN
and not being retried.
* Fetching from the log ringbuffer now succeeds if the buffer is full.
* Fix a severe recovery bug that can lead to data corruption for SMB clients.
* The statd-callout script now runs as root via sudo.
* "ctdb delip" no longer fails if it is unable to move the IP.
* A race in the ctdb tool's ipreallocate code was fixed. This fixes
potential bugs in the "disable", "enable", "stop", "continue",
"ban", "unban", "ipreallocate" and "sync" commands.
* The monitor cancellation code could sometimes hang indefinitely.
This could cause "ctdb stop" and "ctdb shutdown" to fail.
Important internal changes
--------------------------
* The socket I/O handling has been optimised to improve performance.
* IPs will not be assigned to nodes during CTDB initialisation. They
will only be assigned to nodes that are in the "running" runstate.
* Improved database locking code. One improvement is to use a
  standalone locking helper executable - this avoids creating many
  forked copies of ctdbd and potentially running a node out of memory.
* New control CTDB_CONTROL_IPREALLOCATED is now used to generate
"ipreallocated" events.
* Message handlers are now indexed, providing a significant
performance improvement.