EJB Query Language
EJB-QL is a language for writing the queries that CMP must execute for Finder and ejbSelect. It is similar to the SQL language used for relational databases, but just different enough that subtle differences can lead to confusion. EJB-QL is intended to describe queries operating on a collection of Objects and not on the simple table/row/column structure people are typically familiar with. It is similar to SQL, but is closer to the Object-Relational model of SQL-99 than to traditional SQL-92.
Because of this, translation from EJB-QL to the SQL dialect used by a particular database is not trivial. JBoss uses a query compiler to parse a query and translate the functionality into SQL taking into account the differences between different database vendors. However, some features of EJB-QL require constructs that are not always implemented by the database; for example, the MEMBER OF operator requires a sub-query which is not a feature supported by the MySQL database server. When this happens, JBoss tries to generate the closest equivalent SQL. These cases are documented in the appropriate sections below.
A query can include any of the persistent attributes of any CMP entities in a single ejb-jar. This comprises:
You cannot write queries against databases tables, against BMP entities, or queries which use CMP entities from different ejb-jars (even if they happen to be in the same ear).
CMP Entities are referenced using their abstract schema name as declared in ejb-jar.xml:
Is is very common for an EJB's abstract-schema-name to be the same as its ejb-name but this is not required.
cmp-fields and cmr-fields are referenced in path expressions.
Path expressions are used to reference a cmp-field or cmr-field in query. There are three types of path:
A path expression itself consists of a three parts:
Each part is separated by a period. Examples of valid path expressions rooted in the Order EJB are:
A path expression cannot navigate through a collection. For example, the path "o.lineItems.product" is not valid because "o.lineItems" is a collection.
Path expressions with a non-terminal cmr-field (more than one ".") create an "inner-join" between the related entities. This can implicitly change the number of results returned by the query. For a more extensive discussion on this behaviour see IS NULL Expression.
Unlike SQL, literals in EJB-QL are strongly typed. For example, the string value '2' is not compatible with the integer value 2.
String literals are a series of characters surrounded by single quotes; for example, 'literal'. The syntax is similar to SQL, with two single quote characters being used to represent a single quote; for example, 'literal''s'. EJB-QL does not understand Java escape sequences like "\r".
Exact numeric literals follow the Java syntax for integer literals and support
the same range of numbers as a Java
Approximate numeric literals following the Java syntax for floating-point
literals and support the same range of numbers are a Java
EJB-QL does not support an exact decimal type equivalent to an SQL DECIMAL.
A boolean literal is either TRUE or FALSE (case-insensitive).
EJB-QL does not have any form of date/time literal.
The arguments of the finder or ejbSelect method can be used as parameters to a query. Parameters are denoted using the syntax "?n", where n is the position of the argument to the method. For example, if the finder
public Collection findBigOrdersForState(String state, double value) throws FinderException;
was called with the arguments:
Collection orders = findBigOrdersForState("CA", 100.00);
Then ?1 would be a string typed parameter with value "CA" and ?2 would be an approximate numeric parameter with value 100.00.
The type of the parameter is derived from the class of the argument:
Any other argument class is considered a value-class.
The From clause defines the range of entities that the query covers. The set of potential entities is the cross join (or product join) of all the entities listed. The From clause is also used to define aliases for the entities, and unlike SQL an alias must be defined. Unlike ejb-names, aliases are not case sensitive, and an alias cannot be the ejb-name or abstract-schema-name of any EJB in the ejb-jar.
FROM Order AS o, LineItem L
defines a query across all Order EJBs (with the alias o) and all LineItems (with the alias L, the "AS" keyword is optional).
EJB-QL defines an IN operator that can be use in the From clause to perform an inner join between Entities. This can greatly simplify the query's syntax and lead to better SQL being generated. For example, the query:
SELECT OBJECT(l) FROM Order o, LineItem l WHERE l MEMBER OF o.lineItems
which will generate the SQL:
would be better written as
SELECT OBJECT(l) FROM Order o, IN(o.lineItems) l
because this will generate the SQL
which is easier to understand, easier to optimize and is likely to produce a more efficient execution plan.
For a Finder, the select clause must contain an OBJECT() clause for the entity being returned by the finder. For example, for a finder returning Order EJBs, the select clause must be:
SELECT OBJECT(o) FROM Order AS o
The OBJECT() operator is actually redundant but is required by the specification.
For an ejbSelect method, the select clause determines the value returned by the query. This can be:
Simple examples are:
SELECT OBJECT(l) FROM Order o, IN(o.lineItems) l
One of the unfortunate restrictions of EJB-QL is that only a single column can be returned. To obtain multiple field values, you must select the appropriate EJB and use the cmp-field accessors to extract the values. Such queries can benefit greatly from on-find read-ahead tuning as described in CMP Tuning.
The DISTINCT operator can be used to prevent a query from returning duplicate values. It is implicitly used when the finder or ejbSelect method returns a Set as, by definition, a Set cannot contain duplicates.
Used with care, a DISTINCT operator can improve the performance of a query by eliminating duplicate values inside the database server before they are returned to JBoss. However, doing to uses the database's equality comparison and not the Java object's which can lead to subtle differences in semantic.
Depending on the sophistication of the database, using DISTINCT may actually hurt performance. The presence of a DISTINCT clause in the database may cause it to create a copy of the data internally to ensure any duplicates are eliminated and this can add a significant cost to the query. This is especially true if the DISTINCT is implicitly added by requesting a Set result from a complex query that the developer knows cannot return duplicate values.
Using DISTINCT indiscriminately is common trick to eliminate duplicate values from poorly written queries; it works, but it can impact the performance of the database server. Similarly, using Set where the semantic of Collection is adequate also works but can result in more expensive queries.
As in SQL, the where clause is a filter which determines which of the possible rows generated by the from clause will be included in the final result. A possible row will only be included if the where clause evaluates to true, an excluded if the clause evaluated to false or unknown (unknown happens as the result of comparing a value with null).
The where clause is built up of several comparison expressions combined together with the traditional AND, OR and NOT operators.
Unlike SQL, EJB-QL is a type-strict language which supports different comparison expressions for different data types.
EJB-QL Supported Operators: = <>
In EJB-QL the only operators supported for String comparisons are equals and not-equals; this is to avoid internationalization issues that might occur due to different collation sequences in Java and in the database. JBoss-QL relaxes this restriction and allows greater-than and less-than comparisons as well.
In EJB-QL, the left hand side of the comparison must either be a cmp-field
path expression, or the result of the string functions CONCAT or SUBSTRING;
it cannot be a literal string or a parameter. A trick to work around this
in strict EJB-QL is to use one of the functions;
JBoss-QL relaxes this restriction and allows literals and parameters on both left and right hand sides. It also supports LCASE and UCASE functions for converting strings to lower and upper case respectively.
EJB-QL Supported Operators: = <>
Both EJB-QL and JBoss-QL support equals and not-equals operators for boolean values. EJB-QL restricts the left hand side to cmp-field path expressions only; JBoss-QL also allows parameters and literals.
As SQL-92 does not define a Boolean data type, different databases may use different values to indicate true and false. For example, the default mapping for Oracle is to store false as zero and true as one. JBoss supports the mapping of true and false to different physical values as described in Type Mappings.
EJB-QL Supported Operators: = <> < >
EJB-QL does not support less-than-or-equal and greater-than-or-equal operator for date/time values (which is odd as NOT less-than is the same as greater-than-or-equal, and vice-versa); JBoss-QL does not have this restriction. EJB-QL restricts the left hand side to cmp-field path expressions only; JBoss-QL also allows parameters.
Due to internationalization issues, neither EJB-QL or JBoss-QL support date/time literals. EJB-QL recommends that all date/time values be passed using the long millisecond value obtained from a Date or Calendar object. JBoss-QL also allows the use of java.util.Date, java.sql.Date, java.sql.Time and java.sql.Timestamp values.
EJB-QL Supported Operators: = <> < > <= >=
Both EJB-QL and JBoss-QL allow a full range of operators. EJB-QL restricts the left hand side to cmp-field path expressions or the result of the numeric functions LENGTH, LOCATE, ABS and SQRT; JBoss-QL allows parameters and literals. Numeric expressions can only be used on the right hand side.
EJB-QL does not support fixed-precision numeric values, treating all values with a decimal point as approximate numeric. The JBoss implementation passes all numeric literals down to the database for processing and does support the use of java.math.BigDecimal values for both cmp-fields and parameters.
EJB-QL Supported Operators: = <>
Both EJB-QL and JBoss-QL allow equals and not-equals comparisons for Entities. Entities are considered equal if their primary keys are the same. The EJB-QL expression:
SELECT OBJECT(l) FROM Order o, IN(o.lineItems) l WHERE o = ?1
would result in the SQL:
with the bind parameter being the primary key of the Order EJB instance passed in the call to the finder.
EJB-QL Supported Operators: None
JBoss-QL provides the ability to compare value-class types. If the class of the cmp-field is a declared dependent value-class, then EJB-QL will determine if two values are the same by comparing all the properties; if any property is null, then the entire comparison is unknown. A <> B is equivalent to NOT (A = B) to ensure nulls are handled correctly and allow for database indexing.
EJB-QL only supports a BETWEEN expression for arithmetic values; for example,
JBoss-QL extends BETWEEN to support arithmetic, string and datetime values.
EJB-QL supports a LIKE expression for string values. Like SQL, it uses the '_' character to denote a single wildcard and the '%" character to denote zero or more wildcards. An escape character can be specified to escape the special meaning of these characters. Rather restrictively, EJB-QL only allows string literals for the pattern and escape strings, and only allows cmp-fields to be used in the comparison.
JBoss-QL relaxes these restrictions and allows parameters and literals on the left hand side, and parameters and cmp-fields to be used as the pattern and escape values.
EJB-QL provides a rudimentary form of IN expression which allows a string
valued cmp-field to be compared to a set of string literals. For example:
JBoss-QL relaxes these constraints and allows literals and parameters on the left and allows parameters, functions and cmp-fields on the right. JBoss-QL also allows arithmetic and datetime values to be used.
The EJB-QL IS NULL expression provides a mechanism for determining if a cmp-field or cmr-field value has a value or is unknown. As with SQL it evaluates to TRUE if the value of the field is a null, or to false if it has a value.
However, the presence of path expressions makes the handling of null values slightly more complicated than with SQL-92. For paths without a non-terminal cmr-field (ones with a single "."), the determination uses the value of the terminal cmp-field or cmr-field. However, if the path contains a non-terminal cmr-field (ones with more than one "."), the possibility exists that the non-terminal cmr-field may itself be null.
In EJB 2.0, the specification is confusing. In one place (220.127.116.11 for those interested), it states that path expressions should have "inner-join" semantics; however, in others (18.104.22.168 and 11.2.9) it states that the result of a path expression which contains a null value during evaluation is null.
To illustrate this confusion, consider the EJB-QL query:
SELECT OBJECT(o) FROM Order o
Given the path evaluation clause, it could be expected that this would return Orders with out a Customer (o.customer is null), Orders where the Customer's first name was unknown (o.customer.firstName is null) or which had a certain status (o.status = ?), and that each order could only be returned once.
Although this would be true in SQL-99 (for example if o.customer was a reference), it is not true in EJB-QL.
Following the "inner-join" semantics, this can be re-written as
SELECT OBJECT(o) FROM Order o, Customer c
which this means that Orders where the clause "
The EJB 2.1 Public Final Draft attempts to clarify this issue by removing the requirement that the result of a null during evaluation is null. This allows the query to be re-written as:
SELECT OBJECT(o) FROM Order o, Customer c WHERE (o.customer = c AND c.firstName IS NULL) OR o.status = ?1
but the possibility of duplicate results remains. In some circumstances, this issue can be avoided by using a DISTINCT clause, either in the SELECT or, for EJB 2.1, in any aggregate operator. This will not be fully resolved until EJB adds an "outer-join" mechanism.
IS EMPTY Expression
The EJB-QL IS EMPTY expression provides a mechanism for determining if a collection-valued cmr-field contains any values or "is empty". It can be used to determine if the corresponding get accessor for the cmr-field would return an empty Collection or Set. You cannot use the same path expression in an IS EMPTY clause as is used in any IN() clause in the FROM clause; by definition such an expression cannot be empty.
MEMBER OF Expression
The EJB-QL MEMBER OF expression can be used to detemine if an entity is a member of a collection-valued cmr-field. This is similar to the IN sub-query operator in SQL.
JBoss-QL relaxes some of the stricter EJB-QL restrictions. These are detailed above, but in general:
JBoss-QL adds additional functions to the grammar.
UCASE String Function
Uses a database specific function to convert a string-expression to upper case; returns a string typed value.
LCASE String Function
Uses a database specific function to convert a string-expression to lower case; returns a string typed value.
ORDER BY Clause
JBoss-QL supports an ORDER BY clause that can be used to specify the order of results returned in a Collection by generating the corresponding SQL. Although the ORDER BY clause will be always generated, the current implementation may lose the implicit order when a Set is returned.
This is compatible with the orderby_clause added to EJB 2.1, with the exception that the ordering of Sets is lost; this will be addressed in a future version.
The LIMIT Clause allows the number of potential results from a query to be constrained; the OFFSET clause will cause a certain number of results to be skipped before results are loaded. This is intended to support a basic paging mechanism but is subject to changes in the database content or in the order that rows are returned (which can be addressed by using an ORDER BY clause).
Due to the vast differences in the way databases implement this feature, JBoss does not attempt to convert this construct into SQL. Instead, it executes the entire query, iterates over the result for offset rows and then loads results for the next limit rows. This does require rows to be loaded from the database, but does not require unneeded rows to be loaded into memory. Future versions may allow this functionality to be delegated to the database.
© 2003 Core Developers Network Ltd
"Core Developers Network", the stylized apple logo and "Core Associates" are trademarks of Core Developers Network Ltd. All other trademarks are held by their respective owners. Core Developers Network Ltd is not affiliated with any of the respective trademark owners.