Documentation for Ice 3.5. The latest release is Ice 3.7. Refer to the space directory for other releases.

So far, we have mostly ignored the implementations of the getNumber and setNumber operations we defined in our discussion of object creation. Obviously, getNumber and setNumber must be interlocked against concurrent access — without this interlock, concurrent requests from clients could result in one thread writing to the _phNum member while another thread is reading it, with unpredictable results. (Conversely, the name operation need not have an interlock because the name of a phone entry is immutable.) To interlock getNumber and setNumber, we can add a mutex _m to PhoneEntryI:

C++
class PhoneEntryI : public PhoneEntry {
public:
    // ...

    static PhoneEntryFactoryIPtr _factory;

private:
    const string _name;
    string _phNum;
    Mutex  _m;
};

The getNumber and setNumber implementations then lock _m to protect _phNum from concurrent access:

C++
string
PhoneEntryI::name(const Current&)
{
    return _name;
}

string
PhoneEntryI::getNumber(const Current&)
{
    // Incorrect implementation!

    Mutex::Lock lock(_m);

    return _phNum;
}

void
PhoneEntryI::setNumber(const string& phNum, const Current&)
{
    // Incorrect implementation!

    Mutex::Lock lock(_m);

    _phNum = phNum;
}

This looks good but, as it turns out, destroy throws a spanner in the works: as shown, this code suffers from a rare, but real, race condition. Consider the situation where a client calls destroy at the same time as another client calls setNumber. In a server with a thread pool with more than one thread, the calls can be dispatched in separate threads and can therefore execute concurrently.

The following sequence of events can occur:

  • The thread dispatching the setNumber call locates the servant, enters the operation implementation, and is suspended by the scheduler immediately on entry to the operation, before it can lock _m.
  • The thread dispatching the destroy call locates the servant, enters destroy, successfully removes the servant from the Active Servant Map (ASM) and the _names set, and returns.
  • The thread that was suspended in setNumber is scheduled again, locks _m, and now operates on a conceptually already-destroyed Ice object.

Note that, even if we lock _m in destroy, the problem persists:

C++
void
PhoneEntryI::destroy(const Current& c)
{
    // Note: still not correct.

    IceUtil::Mutex::Lock lock(_m);

    c.adapter?>remove(_name);
    _factory?>remove(_name, c.adapter);
}

Even though destroy now locks _m and so cannot run concurrently with getNumber and setNumber, the preceding scenario can still arise. The problem here is that a thread can enter the servant and be suspended before it gets a chance to acquire a lock. With the code as it stands, this is not a problem: setNumber will simply update the _phNum member variable in a servant that no longer has an ASM entry. In other words, the Ice object is already destroyed — it just so happens that the servant for that Ice object is still hanging around because there is still an operation executing inside it. Any updates to the servant will succeed (even though they are useless because the servant's destructor will run as soon as the last invocation leaves the servant.)

Note that this scenario is not unique to C++ and can arise even with Java synchronized operations: in that case, a thread can be suspended just after the Ice run time has identified the target servant, but before it actually calls the operation on the target servant. While the thread is suspended, another thread can execute destroy.

While this race condition does not affect our implementation, it does affect more complex applications, particularly if the servant modifies external state, such as a file system or database. For example, setNumber could modify a file in the file system; in that case, destroy would delete that file and probably close a file descriptor or stream. If we were to allow setNumber to continue executing after destroy has already done its job, we would likely encounter problems: setNumber might not find the file where it expects it to be or try to use the closed file descriptor and return an error; or worse, setNumber might end up re-creating the file in the process of updating the already-destroyed entry's phone number. (What exactly happens depends on how we write the code for each operation.)

Of course, we can try to anticipate these scenarios and handle the error conditions appropriately, but doing this for complex systems with complex servants rapidly gets out of hand: in each operation, we would have to ask ourselves what might happen if the servant is destroyed concurrently and, if so, take appropriate recovery action.

It is preferable to instead deal with interleaved invocations of destroy and other operations in a systematic fashion. We can do this by adding a _destroyed member to the PhoneEntryI servant. This member is initialized to false by the constructor and set to true by destroy. On entry to every operation (including destroy), we lock the mutex, test the _destroyed flag, and throw ObjectNotExistException if the flag is set:

C++
class PhoneEntryI : public PhoneEntry {
public:
    // ...

    static PhoneEntryFactoryIPtr _factory;

private:
    const string _name;
    string _phNum;
    bool   _destroyed;
    Mutex  _m;
};

PhoneEntryI::PhoneEntryI(const string& name, const string& phNum)
    : _name(name), _phNum(phNum), _destroyed(false)
{
}

string
PhoneEntryI::name(const Current&)
{
    Mutex::Lock lock(_m);

    if (_destroyed)
        throw ObjectNotExistException(__FILE__, __LINE__);

    return _name;
}

string
PhoneEntryI::getNumber(const Current&)
{
    Mutex::Lock lock(_m);

    if (_destroyed)
        throw ObjectNotExistException(__FILE__, __LINE__);

    return _phNum;
}

void
PhoneEntryI::setNumber(const string& phNum, const Current&)
{
    Mutex::Lock lock(_m);

    if (_destroyed)
        throw ObjectNotExistException(__FILE__, __LINE__);

    _phNum = phNum;
}

void
PhoneEntryI::destroy(const Current& c)
{
    Mutex::Lock lock(_m);

    if (_destroyed)
        throw ObjectNotExistException(__FILE__, __LINE__);

    _destroyed = true;
    c.adapter?>remove(_name);
    _factory?>remove(_name, c.adapter); // Dubious!
}

If you are concerned about the repeated code on entry to every operation, you can put that code into a member function or base class to make it reusable (although the benefits of doing so are probably too minor to make this worthwhile).

Using the _destroyed flag, if an operation is dispatched and suspended before it can lock the mutex and, meanwhile, destroy runs to completion in another thread, it becomes impossible for an operation to operate on the state of such a "zombie" servant: the test on entry to each operation ensures that any operation that runs after destroy immediately raises ObjectNotExistException.

Also note the "dubious" comment in destroy: the operation first locks _m and, while holding that lock, calls remove on the factory, which in turn locks its own _namesMutex. This is not wrong as such, but as we will see shortly, it can easily lead to deadlocks if we modify the application later.

See Also
  • No labels