7. Using MySQL++ in a Multithreaded Program

MySQL++ doesn’t fall out of the box ready to be used with threads. Furthermore, once you build a thread-aware program with MySQL++, it isn’t “thread safe” in an absolute sense: there exist incorrect usage patterns which will cause errors. This section will discuss these issues, and give advice on how to avoid problems.

7.1. Build Issues

Before you can safely use MySQL++ with threads, there are several things you must do to get a thread-aware build:

  1. Build MySQL++ itself with thread awareness turned on.

    On platforms that use the configure script (Linux, Mac OS X, *BSD, Solaris, Cygwin...) you need to explicitly ask for thread support. And beware, this is only a request to the configure script to look for thread support on your system, not a requirement to do or die: if the script doesn’t find what it needs to do threading, MySQL++ will just get built without thread support. See README-Unix.txt for more details.

    When building MySQL++ with the Visual C++ project files or the MinGW Makefile that comes with the MySQL++ distribution, threading is always turned on, due to the nature of Windows.

    If you build MySQL++ in some unsupported way, such as with Dev-Cpp (based on MinGW) you’re on your own to enable this.

  2. Link your program to a thread-aware build of the MySQL C API library.

    Depending on your platform, you might have to build this yourself (e.g. Cygwin), or you might get only one library which is always thread-aware (e.g. Visual C++), or there might be two different MySQL C API libraries, one of which is thread-aware and the other not (e.g. Linux). See the README-*.txt file for your particular platform, and also the MySQL developer documentation.

  3. Enable threading in your program’s build options.

    This is different for every platform, but it’s usually the case that you don’t get thread-aware builds by default. You might have to turn on a compiler option, or link your program to a different library, or some combination of both. See your development environment’s documentation, or study how MySQL++ itself turns on thread-aware build options when requested.

7.2. Connection Management

The MySQL C API underpinning MySQL++ does not allow multiple concurrent queries on a single connection. You can run into this problem in a single-threaded program, too, which is why we cover the details elsewhere, in Section 3.18, “Concurrent Queries on a Connection”. It’s a thornier problem when using threads, though.

The simple fix is to just create a separarate Connection object for each thread that needs to make database queries. This works well if you have a small number of threads that need to make queries, and each thread uses its connection often enough that the server doesn’t time out waiting for queries. (By default, current MySQL servers have an 8 hour idle timeout on connections. It’s a configuration option, though, so your server may be set differently.)

If you have lots of threads or the frequency of queries is low, the connection management overhead will be excessive. To avoid that, we created the ConnectionPool class. It manages a pool of Connections like library books: a thread checks one out, uses it, and then returns it to the pool when it’s done with it. This keeps the number of active connections as low as possible.

ConnectionPool has three methods that you need to override in a subclass to make it concrete: create(), destroy(), and max_idle_time(). These overrides let the base class delegate operations it can’t successfully do itself to its subclass. The ConnectionPool can’t know how to create() the Connection objects, because that depends on how your program gets login parameters, server information, etc. ConnectionPool also makes the subclass destroy() the Connection objects it created; it could assume that they’re simply allocated on the heap with new, but it can’t be sure, so the base class delegates destruction, too. Finally, the base class can’t know what the connection idle timeout policy in the client would make the most sense, so it asks its subclass via the max_idle_time() method.

In designing your ConnectionPool derivative, you might consider making it a Singleton (see Gamma et al.), since there should only be one pool in a program.

Here is an example showing how to use connection pools with threads:

#include "cmdline.h"
#include "threads.h"

#include <iostream>

using namespace std;


// Define a concrete ConnectionPool derivative.  Takes connection
// parameters as inputs to its ctor, which it uses to create the
// connections we're called upon to make.  Note that we also declare
// a global pointer to an object of this type, which we create soon
// after startup; this should be a common usage pattern, as what use
// are multiple pools?
class SimpleConnectionPool : public mysqlpp::ConnectionPool
{
public:
    // The object's only constructor
    SimpleConnectionPool(const char* db, const char* server,
            const char* user, const char* password) :
    db_(db ? db : ""),
    server_(server ? server : ""),
    user_(user ? user : ""),
    password_(password ? password : "")
    {
    }

    // The destructor.  We _must_ call ConnectionPool::clear() here,
    // because our superclass can't do it for us.
    ~SimpleConnectionPool()
    {
        clear();
    }

protected:
    // Superclass overrides
    mysqlpp::Connection* create()
    {
        // Create connection using the parameters we were passed upon
        // creation.  This could be something much more complex, but for
        // the purposes of the example, this suffices.
        cout.put('C'); cout.flush(); // indicate connection creation
        return new mysqlpp::Connection(
                db_.empty() ? 0 : db_.c_str(),
                server_.empty() ? 0 : server_.c_str(),
                user_.empty() ? 0 : user_.c_str(),
                password_.empty() ? "" : password_.c_str());
    }

    void destroy(mysqlpp::Connection* cp)
    {
        // Our superclass can't know how we created the Connection, so
        // it delegates destruction to us, to be safe.
        cout.put('D'); cout.flush(); // indicate connection destruction
        delete cp;
    }

    unsigned int max_idle_time()
    {
        // Set our idle time at an example-friendly 3 seconds.  A real
        // pool would return some fraction of the server's connection
        // idle timeout instead.
        return 3;
    }

private:
    // Our connection parameters
    std::string db_, server_, user_, password_;
};
SimpleConnectionPool* poolptr = 0;


#if defined(HAVE_THREADS)
static thread_return_t CALLBACK_SPECIFIER
worker_thread(thread_arg_t running_flag)
{
    // Pull data from the sample table a bunch of times, releasing the
    // connection we use each time.
    for (size_t i = 0; i < 6; ++i) {
        // Go get a free connection from the pool, or create a new one
        // if there are no free conns yet.
        mysqlpp::Connection* cp = poolptr->grab();
        if (!cp) {
            cerr << "Failed to get a connection from the pool!" << endl;
            break;
        }

        // Pull a copy of the sample stock table and print a dot for
        // each row in the result set.
        mysqlpp::Query query(cp->query("select * from stock"));
        mysqlpp::StoreQueryResult res = query.store();
        for (size_t j = 0; j < res.num_rows(); ++j) {
            cout.put('.');
        }

        // Immediately release the connection once we're done using it.
        // If we don't, the pool can't detect idle connections reliably.
        poolptr->release(cp);

        // Delay 1-4 seconds before doing it again.  Because this can
        // delay longer than the idle timeout, we'll occasionally force
        // the creation of a new connection on the next loop.
        sleep(rand() % 4 + 1);  
    }

    // Tell main() that this thread is no longer running
    *reinterpret_cast<bool*>(running_flag) = false;
    
    return 0;
}
#endif


int
main(int argc, char *argv[])
{
#if defined(HAVE_THREADS)
    // Get database access parameters from command line
    const char* db = 0, *server = 0, *user = 0, *pass = "";
    if (!parse_command_line(argc, argv, &db, &server, &user, &pass)) {
        return 1;
    }

    // Create the pool and grab a connection.  We do it partly to test
    // that the parameters are good before we start doing real work, and
    // partly because we need a Connection object to call thread_aware()
    // on to check that it's okay to start doing that real work.  This
    // latter check should never fail on Windows, but will fail on most
    // other systems unless you take positive steps to build with thread
    // awareness turned on.  See README-*.txt for your platform.
    poolptr = new SimpleConnectionPool(db, server, user, pass);
    try {
        mysqlpp::Connection* cp = poolptr->grab();
        if (!cp->thread_aware()) {
            cerr << "MySQL++ wasn't built with thread awareness!  " <<
                    argv[0] << " can't run without it." << endl;
            return 1;
        }
        poolptr->release(cp);
    }
    catch (mysqlpp::Exception& e) {
        cerr << "Failed to set up initial pooled connection: " <<
                e.what() << endl;
        return 1;
    }

    // Setup complete.  Now let's spin some threads...
    cout << endl << "Pool created and working correctly.  Now to do "
            "some real work..." << endl;
    srand(time(0));
    bool running[] = {
            true, true, true, true, true, true, true,
            true, true, true, true, true, true, true };
    const size_t num_threads = sizeof(running) / sizeof(running[0]);
    size_t i;
    for (i = 0; i < num_threads; ++i) {
        if (int err = create_thread(worker_thread, running + i)) {
            cerr << "Failed to create thread " << i <<
                    ": error code " << err << endl;
            return 1;
        }
    }

    // Test the 'running' flags every second until we find that they're
    // all turned off, indicating that all threads are stopped.
    cout.put('W'); cout.flush(); // indicate waiting for completion
    do {
        sleep(1);
        i = 0;
        while (i < num_threads && !running[i]) ++i;
    }
    while (i < num_threads);
    cout << endl << "All threads stopped!" << endl;

    // Shut it all down...
    delete poolptr;
    cout << endl;
#else
    (void)argc;     // warning squisher
    cout << argv[0] << " requires that threads be enabled!" << endl;
#endif

    return 0;
}

The example works with both Windows native threads and with POSIX threads. (The file examples/threads.h contains a few macros and such to abstract away the differences between the two threading models.) Because thread-enabled builds are only the default on Windows, it’s quite possible for this program to do nothing on other platforms. See your platform’s README-*.txt file for instructions on enabling a thread-aware build.

If you write your code without checks for thread support like you see in the code above and link it to a build of MySQL++ that isn’t thread-aware, it won’t immediately fail. The threading mechanisms just fall back to a single-threaded mode. A particular danger is that the mutex lock mechanism used to keep the pool’s internal data consistent while multiple threads access it will just quietly become a no-op if MySQL++ is built without thread support. We do it this way because we don’t want to make thread support a MySQL++ prerequisite. And, although it would be of limited value, this lets you use ConnectionPool in single-threaded programs.

You might wonder why we don’t just work around this weakness in the C API transparently in MySQL++ instead of mandating design guidelines to avoid it. We’d like to do just that, but how?

If you consider just the threaded case, you could argue for the use of mutexes to protect a connection from trying to execute two queries at once. The cure is worse than the disease: it turns a design error into a performance sap, as the second thread is blocked indefinitely waiting for the connection to free up. Much better to let the program get the “Commands out of sync” error, which will guide you to this section of the manual, which tells you how to avoid the error with a better design.

Another option would be to bury ConnectionPool functionality within MySQL++ itself, so the library could create new connections at need. That’s no good because the above example is the most complex in MySQL++, so if it were mandatory to use connection pools, the whole library would be that much more complex to use. The whole point of MySQL++ is to make using the database easier. MySQL++ offers the connection pool mechanism for those that really need it, but an option it must remain.

7.3. Helper Functions

Connection has several thread-related methods you might care about when using MySQL++ with threads.

You can call Connection::thread_aware() to determine whether MySQL++ and the underlying C API library were both built to be thread-aware. Again, I stress that thread awareness is not the same thing as thread safety: it’s still up to you to make your code thread-safe. If this method returns true, it just means it’s possible to achieve thread-safety.

If your program’s connection-management strategy allows a thread to use a Connection object that another thread created, you must call Connection::thread_start() from these threads before they do anything with MySQL++. It’s safe for the thread that created the Connection object to call it, too, but unnecessary. This is because the underlying C API library takes care of it for you when you try to establish your first connection from that thread. So, if you use the simple Connection-per-thread strategy lined out above, you never need to call this method, but if you use something more complex like ConnectionPool, you do.

Finally, there’s the complementary method, Connection::thread_end(). Strictly speaking, it’s not necessary to call this. However, as alluded above, the underlying C API library allocates some per-thread memory for each thread that calls Connection::thread_start() or establishes connections. It’s not very much memory, it doesn’t grow over time, and a typical program is going to need this memory for its entire run time anyway. Memory debuggers aren’t smart enough to know all this, though, so they will gripe about a memory leak unless you call this from each thread that uses MySQL++ before that thread exits.

It’s not relevant to this chapter’s topic, so to be clear I want to point out that Connection::thread_id() has to do with threads in the database server, not client-side threads.

7.4. Sharing MySQL++ Data Structures

We’re in the process of making it safer to share MySQL++’s data structures across threads.

By way of illustration, let me explain a problem we had up until MySQL++ v3.0. When you issue a database query that returns rows, you also get information about the columns in each row. Since the column information is the same for each row in the result set, older versions of MySQL++ kept this information in the result set object, and each Row kept a pointer back to the result set object that created it so it could access this common data at need. This was fine as long as each result set object outlived the Row objects it returned. It required uncommon usage patterns to run into trouble in this area in a single-threaded program, but in a multi-threaded program it was easy. For example, there’s frequently a desire to let one connection do the queries, and other threads process the results. You can see how avoiding lifetime problems here would require a careful locking strategy.

We got around this in MySQL++ v3.0 by giving these shared data structures a lifetime independent of the result set object that intitially creates it. These shared data structures stick around until the last object needing them gets destroyed.

Although this is now a solved problem, I bring it up because there are likely other similar lifetime and sequencing problems waiting to be discovered inside MySQL++. If you would like to help us find these, by all means, share data between threads willy-nilly. We welcome your crash reports on the MySQL++ mailing list. But if you’d prefer to avoid problems, it’s better to keep all data about a query within a single thread. Between this and the previous section’s advice, you should be able to use threads with MySQL++ without trouble.