EJB Query Language

EJB-QL is a language for writing the queries that CMP must execute for Finder and ejbSelect. It is similar to the SQL language used for relational databases, but just different enough that subtle differences can lead to confusion. EJB-QL is intended to describe queries operating on a collection of Objects and not on the simple table/row/column structure people are typically familiar with. It is similar to SQL, but is closer to the Object-Relational model of SQL-99 than to traditional SQL-92.

Because of this, translation from EJB-QL to the SQL dialect used by a particular database is not trivial. JBoss uses a query compiler to parse a query and translate the functionality into SQL taking into account the differences between different database vendors. However, some features of EJB-QL require constructs that are not always implemented by the database; for example, the MEMBER OF operator requires a sub-query which is not a feature supported by the MySQL database server. When this happens, JBoss tries to generate the closest equivalent SQL. These cases are documented in the appropriate sections below.

Query Domain

A query can include any of the persistent attributes of any CMP entities in a single ejb-jar. This comprises:

  • The CMP entities themselves
  • Any cmp-fields on those entities
  • Any cmr-fields on those entities

You cannot write queries against databases tables, against BMP entities, or queries which use CMP entities from different ejb-jars (even if they happen to be in the same ear).

CMP Entities are referenced using their abstract schema name as declared in ejb-jar.xml:

  
Order
order

Is is very common for an EJB's abstract-schema-name to be the same as its ejb-name but this is not required.

cmp-fields and cmr-fields are referenced in path expressions.

Path Expressions

Path expressions are used to reference a cmp-field or cmr-field in query. There are three types of path:

  • cmp-field, which terminates in a cmp-field and hence is always single valued
  • single-valued cmr-field, which terminates in a cmr-field declared using an EJB local component interface
  • collection-valued cmr-field, which terminates in a cmr-field declared using a Set or Collection

A path expression itself consists of a three parts:

  • A root, which is an alias name declared in the FROM clause
  • A series of zero or more single-valued cmr-fields
  • A final value, which is a cmp-field or cmr-field as described above

Each part is separated by a period. Examples of valid path expressions rooted in the Order EJB are:

Path Path Type Evaluation
o.orderDate cmp-field Returns the value of the orderDate cmp-field of the OrderEJB
o.shipAddress single-valued cmr-field Follows the shipAddress cmr-field to an Address EJB and returns a reference to it using the local component interface. If there is no related Address entity, returns null.
o.shipAddress.city cmp-field From the Order EJB, follows the shipAddress cmr-field to an Address EJB and then returns the city cmp-field from Address. If o.shipAddress is null, the entire expression evaluates to unknown.
o.lineItems collection-valued cmr-field Returns the collection of LineItem EJBs related to an order.

A path expression cannot navigate through a collection. For example, the path "o.lineItems.product" is not valid because "o.lineItems" is a collection.

Path expressions with a non-terminal cmr-field (more than one ".") create an "inner-join" between the related entities. This can implicitly change the number of results returned by the query. For a more extensive discussion on this behaviour see IS NULL Expression.

Literal Values

Unlike SQL, literals in EJB-QL are strongly typed. For example, the string value '2' is not compatible with the integer value 2.

String literals are a series of characters surrounded by single quotes; for example, 'literal'. The syntax is similar to SQL, with two single quote characters being used to represent a single quote; for example, 'literal''s'. EJB-QL does not understand Java escape sequences like "\r".

Exact numeric literals follow the Java syntax for integer literals and support the same range of numbers as a Java long. The following are all equivalent:

    51966
51966L 0xcaFe 0XCAFE 0145376

Approximate numeric literals following the Java syntax for floating-point literals and support the same range of numbers are a Java double. The following are all equivalent:

    123.456
1.23456e2
.123456E3f

EJB-QL does not support an exact decimal type equivalent to an SQL DECIMAL.

A boolean literal is either TRUE or FALSE (case-insensitive).

EJB-QL does not have any form of date/time literal.

Parameters

The arguments of the finder or ejbSelect method can be used as parameters to a query. Parameters are denoted using the syntax "?n", where n is the position of the argument to the method. For example, if the finder

   public Collection findBigOrdersForState(String state, double value) throws FinderException;

was called with the arguments:

   Collection orders = findBigOrdersForState("CA", 100.00);

Then ?1 would be a string typed parameter with value "CA" and ?2 would be an approximate numeric parameter with value 100.00.

The type of the parameter is derived from the class of the argument:

Argument Class Parameter Type
byte, short, int, long, float, double
Byte, Short, Integer, Long, Float, Double
numeric
String string
boolean, Boolean boolean
java.util.Date or any sub-class (including java.sql.Date, java.sql.Time, java.sql.Timestamp) date/time
EJB component interface (any class implementing javax.ejb.EJBObject or javax.ejb.EJBLocalObject) entity

Any other argument class is considered a value-class.

From Clause

The From clause defines the range of entities that the query covers. The set of potential entities is the cross join (or product join) of all the entities listed. The From clause is also used to define aliases for the entities, and unlike SQL an alias must be defined. Unlike ejb-names, aliases are not case sensitive, and an alias cannot be the ejb-name or abstract-schema-name of any EJB in the ejb-jar.

The clause:

FROM Order AS o, LineItem L

defines a query across all Order EJBs (with the alias o) and all LineItems (with the alias L, the "AS" keyword is optional).

EJB-QL defines an IN operator that can be use in the From clause to perform an inner join between Entities. This can greatly simplify the query's syntax and lead to better SQL being generated. For example, the query:

SELECT OBJECT(l) FROM Order o, LineItem l WHERE l MEMBER OF o.lineItems

which will generate the SQL:

SELECT t0_l.ITEM_ID
FROM ORDER_DATA t1_o, LINEITEM t0_l
WHERE EXISTS (SELECT t2_o_lineItems.ITEM_ID
FROM LINEITEM t2_o_lineItems
WHERE t1_o.ORDER_ID=t2_o_lineItems.FK_ORDER_ID
AND t2_o_lineItems.ITEM_ID = t0_l.ITEM_ID)

would be better written as

SELECT OBJECT(l) FROM Order o, IN(o.lineItems) l

because this will generate the SQL

SELECT t0_l.ITEM_ID
FROM ORDER_DATA t1_o, LINEITEM t0_l
WHERE t1_o.ORDER_ID=t0_l.FK_ORDER_ID

which is easier to understand, easier to optimize and is likely to produce a more efficient execution plan.

Select Clause

For a Finder, the select clause must contain an OBJECT() clause for the entity being returned by the finder. For example, for a finder returning Order EJBs, the select clause must be:

SELECT OBJECT(o) FROM Order AS o

The OBJECT() operator is actually redundant but is required by the specification.

For an ejbSelect method, the select clause determines the value returned by the query. This can be:

  • An EJB reference to an entity declared in the FROM clause; this requires an OBJECT() operator
  • An EJB reference obtained from a single-valued cmr-field path expression
  • The value of a cmp-field path expression

Simple examples are:

SELECT OBJECT(l) FROM Order o, IN(o.lineItems) l
SELECT l.product FROM Order o, IN(o.lineItems) l
SELECT o.shipAddress.state FROM Order o

One of the unfortunate restrictions of EJB-QL is that only a single column can be returned. To obtain multiple field values, you must select the appropriate EJB and use the cmp-field accessors to extract the values. Such queries can benefit greatly from on-find read-ahead tuning as described in CMP Tuning.

DISTINCT Operator

The DISTINCT operator can be used to prevent a query from returning duplicate values. It is implicitly used when the finder or ejbSelect method returns a Set as, by definition, a Set cannot contain duplicates.

Used with care, a DISTINCT operator can improve the performance of a query by eliminating duplicate values inside the database server before they are returned to JBoss. However, doing to uses the database's equality comparison and not the Java object's which can lead to subtle differences in semantic.

Depending on the sophistication of the database, using DISTINCT may actually hurt performance. The presence of a DISTINCT clause in the database may cause it to create a copy of the data internally to ensure any duplicates are eliminated and this can add a significant cost to the query. This is especially true if the DISTINCT is implicitly added by requesting a Set result from a complex query that the developer knows cannot return duplicate values.

Using DISTINCT indiscriminately is common trick to eliminate duplicate values from poorly written queries; it works, but it can impact the performance of the database server. Similarly, using Set where the semantic of Collection is adequate also works but can result in more expensive queries.

Where Clause

As in SQL, the where clause is a filter which determines which of the possible rows generated by the from clause will be included in the final result. A possible row will only be included if the where clause evaluates to true, an excluded if the clause evaluated to false or unknown (unknown happens as the result of comparing a value with null).

The where clause is built up of several comparison expressions combined together with the traditional AND, OR and NOT operators.

Comparison Expressions

Unlike SQL, EJB-QL is a type-strict language which supports different comparison expressions for different data types.

String Comparisons

EJB-QL Supported Operators: = <>
JBoss-QL Supported Operators: = <> > < >= <=

In EJB-QL the only operators supported for String comparisons are equals and not-equals; this is to avoid internationalization issues that might occur due to different collation sequences in Java and in the database. JBoss-QL relaxes this restriction and allows greater-than and less-than comparisons as well.

In EJB-QL, the left hand side of the comparison must either be a cmp-field path expression, or the result of the string functions CONCAT or SUBSTRING; it cannot be a literal string or a parameter. A trick to work around this in strict EJB-QL is to use one of the functions; for example:
WHERE ?1 = 'X'
is invalid, but can be written as
WHERE CONCAT(?1, "") = 'X'
However unless the database has function-based indexes using such constructs is likely to result in full table scans.

JBoss-QL relaxes this restriction and allows literals and parameters on both left and right hand sides. It also supports LCASE and UCASE functions for converting strings to lower and upper case respectively.

Boolean Comparisons

EJB-QL Supported Operators: = <>
JBoss-QL Supported Operators: = <>

Both EJB-QL and JBoss-QL support equals and not-equals operators for boolean values. EJB-QL restricts the left hand side to cmp-field path expressions only; JBoss-QL also allows parameters and literals.

As SQL-92 does not define a Boolean data type, different databases may use different values to indicate true and false. For example, the default mapping for Oracle is to store false as zero and true as one. JBoss supports the mapping of true and false to different physical values as described in Type Mappings.

DateTime Comparisons

EJB-QL Supported Operators: = <> < >
JBoss-QL Supported Operators: = <> < > <= >=

EJB-QL does not support less-than-or-equal and greater-than-or-equal operator for date/time values (which is odd as NOT less-than is the same as greater-than-or-equal, and vice-versa); JBoss-QL does not have this restriction. EJB-QL restricts the left hand side to cmp-field path expressions only; JBoss-QL also allows parameters.

Due to internationalization issues, neither EJB-QL or JBoss-QL support date/time literals. EJB-QL recommends that all date/time values be passed using the long millisecond value obtained from a Date or Calendar object. JBoss-QL also allows the use of java.util.Date, java.sql.Date, java.sql.Time and java.sql.Timestamp values.

Arithmetic Comparisons

EJB-QL Supported Operators: = <> < > <= >=
JBoss-QL Supported Operators: = <> < > <= >=

Both EJB-QL and JBoss-QL allow a full range of operators. EJB-QL restricts the left hand side to cmp-field path expressions or the result of the numeric functions LENGTH, LOCATE, ABS and SQRT; JBoss-QL allows parameters and literals. Numeric expressions can only be used on the right hand side.

EJB-QL does not support fixed-precision numeric values, treating all values with a decimal point as approximate numeric. The JBoss implementation passes all numeric literals down to the database for processing and does support the use of java.math.BigDecimal values for both cmp-fields and parameters.

Entity Comparisons

EJB-QL Supported Operators: = <>
JBoss-QL Supported Operators: = <>

Both EJB-QL and JBoss-QL allow equals and not-equals comparisons for Entities. Entities are considered equal if their primary keys are the same. The EJB-QL expression:

SELECT OBJECT(l) FROM Order o, IN(o.lineItems) l WHERE o = ?1

would result in the SQL:

SELECT t0_l.ITEM_ID
FROM ORDER_DATA t1_o, LINEITEM t0_l
WHERE ((t1_o.ORDER_ID=?)) AND (t1_o.ORDER_ID=t0_l.FK_ORDER_ID)

with the bind parameter being the primary key of the Order EJB instance passed in the call to the finder.

Value-Class Comparisons

EJB-QL Supported Operators: None
JBoss-QL Supported Operators: = <>

JBoss-QL provides the ability to compare value-class types. If the class of the cmp-field is a declared dependent value-class, then EJB-QL will determine if two values are the same by comparing all the properties; if any property is null, then the entire comparison is unknown. A <> B is equivalent to NOT (A = B) to ensure nulls are handled correctly and allow for database indexing.

BETWEEN Expression

EJB-QL only supports a BETWEEN expression for arithmetic values; for example,
  product.price BETWEEN ?1 AND ?2

It is possible to simulate a BETWEEN expression for date/time values by using the form
  NOT (order.shipDate < ?1 OR order.shipDate > ?2 )
which makes you wonder why is was not included in the specification.

JBoss-QL extends BETWEEN to support arithmetic, string and datetime values.

LIKE Expression

EJB-QL supports a LIKE expression for string values. Like SQL, it uses the '_' character to denote a single wildcard and the '%" character to denote zero or more wildcards. An escape character can be specified to escape the special meaning of these characters. Rather restrictively, EJB-QL only allows string literals for the pattern and escape strings, and only allows cmp-fields to be used in the comparison.

JBoss-QL relaxes these restrictions and allows parameters and literals on the left hand side, and parameters and cmp-fields to be used as the pattern and escape values.

IN Expression

EJB-QL provides a rudimentary form of IN expression which allows a string valued cmp-field to be compared to a set of string literals. For example:
  address.state IN ('CA', 'WA', 'OR')

It does not allow a parameter to be used on the left hand side, or for parameters or other cmp-fields to be used in the IN list.

JBoss-QL relaxes these constraints and allows literals and parameters on the left and allows parameters, functions and cmp-fields on the right. JBoss-QL also allows arithmetic and datetime values to be used.

IS NULL Expression

The EJB-QL IS NULL expression provides a mechanism for determining if a cmp-field or cmr-field value has a value or is unknown. As with SQL it evaluates to TRUE if the value of the field is a null, or to false if it has a value.

However, the presence of path expressions makes the handling of null values slightly more complicated than with SQL-92. For paths without a non-terminal cmr-field (ones with a single "."), the determination uses the value of the terminal cmp-field or cmr-field. However, if the path contains a non-terminal cmr-field (ones with more than one "."), the possibility exists that the non-terminal cmr-field may itself be null.

In EJB 2.0, the specification is confusing. In one place (11.2.6.6 for those interested), it states that path expressions should have "inner-join" semantics; however, in others (11.2.7.10 and 11.2.9) it states that the result of a path expression which contains a null value during evaluation is null.

To illustrate this confusion, consider the EJB-QL query:

  SELECT OBJECT(o) FROM Order o 
WHERE o.customer.firstName IS NULL OR o.status = ?1

Given the path evaluation clause, it could be expected that this would return Orders with out a Customer (o.customer is null), Orders where the Customer's first name was unknown (o.customer.firstName is null) or which had a certain status (o.status = ?), and that each order could only be returned once.

Although this would be true in SQL-99 (for example if o.customer was a reference), it is not true in EJB-QL.

Following the "inner-join" semantics, this can be re-written as

  SELECT OBJECT(o) FROM Order o, Customer c 
WHERE (o.customer IS NULL OR o.customer = c AND c.firstName IS NULL) OR o.status = ?1

which this means that Orders where the clause "o.status = ?1" is true will be returned multiple times (as many times as there are Customers).

The EJB 2.1 Public Final Draft attempts to clarify this issue by removing the requirement that the result of a null during evaluation is null. This allows the query to be re-written as:

  SELECT OBJECT(o) FROM Order o, Customer c 
  WHERE (o.customer = c AND c.firstName IS NULL) OR o.status = ?1

but the possibility of duplicate results remains. In some circumstances, this issue can be avoided by using a DISTINCT clause, either in the SELECT or, for EJB 2.1, in any aggregate operator. This will not be fully resolved until EJB adds an "outer-join" mechanism.

IS EMPTY Expression

The EJB-QL IS EMPTY expression provides a mechanism for determining if a collection-valued cmr-field contains any values or "is empty". It can be used to determine if the corresponding get accessor for the cmr-field would return an empty Collection or Set. You cannot use the same path expression in an IS EMPTY clause as is used in any IN() clause in the FROM clause; by definition such an expression cannot be empty.

MEMBER OF Expression

The EJB-QL MEMBER OF expression can be used to detemine if an entity is a member of a collection-valued cmr-field. This is similar to the IN sub-query operator in SQL.

JBoss-QL Extensions

General Extensions

JBoss-QL relaxes some of the stricter EJB-QL restrictions. These are detailed above, but in general:

  • It typically allows values, parameters and literals to be used interchangeably.
  • It supports a larger range of operators for most datatypes.
  • It supports some operations on dependent value classes.

New Functions

JBoss-QL adds additional functions to the grammar.

UCASE String Function

Syntax: UCASE '(' string_expression ')'

Uses a database specific function to convert a string-expression to upper case; returns a string typed value.

LCASE String Function

Syntax: LCASE '(' string_expression ')'

Uses a database specific function to convert a string-expression to lower case; returns a string typed value.

ORDER BY Clause

Syntax:
orderby_clause ::= ORDER BY orderby_item [ ',' orderby_item ]*
orderby_item ::= cmp_path_expression [ ASC | DESC ]

JBoss-QL supports an ORDER BY clause that can be used to specify the order of results returned in a Collection by generating the corresponding SQL. Although the ORDER BY clause will be always generated, the current implementation may lose the implicit order when a Set is returned.

This is compatible with the orderby_clause added to EJB 2.1, with the exception that the ordering of Sets is lost; this will be addressed in a future version.

LIMIT Clause

Syntax:
limit_clause :: = [ OFFSET input_parameter ] LIMIT input_parameter

The LIMIT Clause allows the number of potential results from a query to be constrained; the OFFSET clause will cause a certain number of results to be skipped before results are loaded. This is intended to support a basic paging mechanism but is subject to changes in the database content or in the order that rows are returned (which can be addressed by using an ORDER BY clause).

Due to the vast differences in the way databases implement this feature, JBoss does not attempt to convert this construct into SQL. Instead, it executes the entire query, iterates over the result for offset rows and then loads results for the next limit rows. This does require rows to be loaded from the database, but does not require unneeded rows to be loaded into memory. Future versions may allow this functionality to be delegated to the database.