Using Database Locking

A J2EE Application Server only provides locking mechanisms for Entity EJBs. Many J2EE systems do not use EJBs at all or only use Session EJBs and perform all of their concurrency control using the mechanisms built into the underlying database system. This page describes the different approaches taken by different database vendors and shows how to use them from Java.

JDBC Isolation Level

Most database systems operate by default with a Read-Committed isolation level. If an application requires a higher level of isolation, then it can change isolation level in a portable manner using the setTransactionIsolation call on a normal JDBC Connection; for example:

import java.sql.Connection;
Connection c = getConnection();
c.setTransactionIsolation(Connection.TRANSACTION_SERIALIZABLE);

Most database systems do not allow the isolation level of a connection to be modified when a transaction is in progress and so the setTransactionIsolation()call should be the very first operation performed. This can be problematic when using managed connections as other components may have used the connection already during this transaction. As a result, when using a J2EE application server, the transaction isolation level would normally be set as part of DataSource configuration.

Configuring Isolation Level for a DataSource

The mechanisms used to set the isolation level for a DataSource vary from vendor to vendor. For the JBoss server, it can be defined as a property in the -ds.xml file:


  
    GenericDS
    [jdbc: url]
    [class name]
    x
    y
TRANSACTION_SERIALIZABLE

This will cause the connection pool to set the isolation level the first time the connection is returned during a transaction.

Avoiding Deadlocks

Unfortunately simply setting the isolation level for the transaction is not sufficient to avoid the data concurrency issue. By raising the isolation level above Read-Committed, increase the potential for a deadlock when two overlapping transactions try to update the same data at the same time. It is important to realize that this deadlock can occur even if you take the care to access data in a consistent order.

For example, consider the following scenario:

  • Transaction T1 reads the Name field of Person with id 123. This will result in a read-lock being placed on that data to prevent other transactions modifiying it.
  • Transaction T2 also reads the Name field of Person with id 123. Again, this will result in a read-lock being requested, this time by thread T2. This lock will be granted immediately as the multiple transactions are permitted to hold read-locks at the same time.
  • Transaction T1 attempts to update the Name field of Person with id 123. The update requires an exclusive lock on that data, which cannot be granted as T2 has a read lock already. This causes T1 to block.
  • Transaction T2 also attempts to update the Name field of Person with id 123. Again this requires an exclusive lock but this cannot be granted because T1 holds a read lock and is waiting for an exclusive lock. This causes T2 to block and we have a dadlock.

To avoid this problem, database systems allow you to specify in a read operation that you are reading data with the intent to update it later in the same transaction. The ISO standard syntax for this is to use a FOR UPDATE cursor:

DECLARE c CURSOR FOR 
  SELECT NAME FROM PERSON WHERE ID = 123
  FOR UPDATE

This will typically cause the database to take exclusive locks on the rows identified by the query when they are read rather than waiting until the data is updated. Hence the scenario above becomes:

  • Transaction T1 reads the Name field of Person with id 123 indicating FOR UPDATE. This will result in an exclusive lock on that data.
  • Transaction T2 also reads the Name field of Person with id 123 indicating FOR UPDATE. This will also attempt to acquire an exclusive lock and the transaction will block.
  • Transaction T1 updates the Name field of Person with id 123, which suceeds as T1 already hold an exclusive lock on this data.
  • Transaction T1 commits, releasing the exclusive lock and allowing T2 to proceed (holding an exclusive lock).

Because T1 requested the lock on read, the deadlock was avoided.

The standard JDBC way of specifying this is to use an updatable ResultSet, for example:

ps = conn.createStatement(
    "SELECT NAME FROM PERSON WHERE ID=?",
    ResultSet.TYPE_FORWARD_ONLY,
    ResultSet.CONCUR_UPDATABLE
    );

which causes the driver to generate the required SQL. However, not all JDBC driver implementations support this functionality (it is an optional feature of the specification), or support it in the same manner.

Many databases have adopted a "SELECT FOR UPDATE" syntax where the for-update clause can be added to a normal SELECT statement, for example:

ps = conn.createStatement(
    "SELECT NAME FROM PERSON WHERE ID=? FOR UPDATE OF NAME"
    );

This version uses the optional OF clause to indicate precisely which columns are going to be updated which is useful if the query uses multiple tables as exclusive locks are only taken on the tables specified; if this clause is omitted then the database will lock all the rows used in all tables in the FROM clause.

Microsoft SQL Server 2000 uses a different approach where you specify the type of lock to take on a table in the FROM clause, for example:

ps = conn.createStatement(
    "SELECT NAME FROM PERSON WITH (UPDLOCK) WHERE ID=?"
    );

When this query is executed, the database takes an update lock on the selected rows rather than the normal shared read lock. This allows other readers to continue to read the data but all other writers will block; when you actually perform the update, then the lock will be escalated to an exclusive lock until you commit.

Commit Failures

Some database systems avoid the deadlock issue described above by versioning the data used during a transaction. Although this eliminates the potential for the deadlock, it may introduce the possibility of other failures.

The Oracle database, for example, uses a multi-versioning system that eliminates the need for readers to obtain locks to ensure read-repeatability. However, this can lead to other potential failures and transaction rollback:

  1. A 'Snapshot Too Old' error can occur when there are insufficient resources available to store all the required versions. This typically happens with long-running transactions rather than with the short one typical in OLTP applications. One workaround is to dedicate a separate, large rollback segment to long running operations or batch jobs.
  2. An 'Unable to Serialize Transaction' error can be returned at commit time if two overlapping transactions read then write data as in the deadlock example above. This can be avoided by using an updatable cursor or SELECT FOR UPDATE statement.

A similar multi-versioning system is used by the InnoDB engine of MySQL.

Another system that takes this even futher is the fully multi-versioned McKoi database. McKoi never locks data, even on write, but detects serialization problems at commit time and rolls back transactions that do not meet the serialization criteria. This provides high performance in low contention situations, but at the expense of having to rollback and have the application redo the work if a concurrency violation occurs.