openEHR logo

Archetype Query Language (AQL)

Issuer: openEHR Specification Program

Release: Release-1.0.0

Status: STABLE

Revision: [latest_issue]

Date: [latest_issue_date]

Keywords: query, AQL, archetype, Xpath

openEHR components
© 2008 - 2018 The openEHR Foundation

The openEHR Foundation is an independent, non-profit community organisation, facilitating the sharing of health records by consumers and clinicians via open-source, standards-based implementations.

Licence

image Creative Commons Attribution-NoDerivs 3.0 Unported. https://creativecommons.org/licenses/by-nd/3.0/

Support

Issues: Problem Reports
Web: www.openehr.org

Amendment Record

Issue Details Raiser Completed

R E L E A S E     1.0.2

1.0.0

SPECQUERY-6 Publish original AQL wiki page as formal specification;
Various reformatting, adjustment of headings;
Fix missing square brackets on query example. (SPECPR-236)

openEHR SEC

10 Jan 2017

Add Antlr grammar for AQL

B Lah

13 Jan 2012

Initial Writing, based on openEHR AQL wiki page.

C Ma
H Frankel
T Beale

10 Dec 2008

Acknowledgements

Primary Author

  • Chunlan Ma, MD, Ocean Informatics, Australia.

Contributors

This specification benefited from wide formal and informal input from the openEHR and wider health informatics community. The openEHR Foundation would like to recognise the following people for their contributions.

  • Heath Frankel, Ocean Informatics, Australia

  • Bostjan Lah, Architect, Marand, Slovenia

Support

The work reported in this paper has been funded by the following organisations:

  • Ocean Informatics, Australia

  • Marand, Slovenia

Trademarks

  • 'openEHR' is a trademark of the openEHR Foundation

  • 'Java' is a registered trademark of Oracle Corporation

  • 'Microsoft' and '.Net' are trademarks of the Microsoft Corporation

1. Preface

1.1. Purpose

This document describes the syntax of the openEHR Archetype Query Language (AQL).

1.2. Status

This specification is in the STABLE state. The latest development version of this document can be found at https://specifications.openehr.org/releases/QUERY/Release-1.0.0/AQL.html.

Known omissions or questions are indicated in the text with a 'to be determined' paragraph, as follows:

TBD: (example To Be Determined paragraph)

1.3. Feedback

Feedback may be provided on the technical mailing list.

Issues may be raised on the specifications Problem Report tracker.

To see changes made due to previously reported issues, see the QUERY component Change Request tracker.

1.4. Original Source

The text of this specification was originally posted on the openEHR wiki, and was used by the global openEHR community in that form.

1.5. Tooling status

1.5.1. AQL features not yet supported

The following features are likely to be needed, but are not yet implemented in tools:

  • TOP

  • XOR. Currently AND, OR, NOT, EXISTS are supported in WHERE clause.

  • functions are not supported.

  • embedded query is not supported

  • ORDER BY

  • TIMEWINDOW

1.5.2. AQL features that are proposed, but not finalised

1.5.2.1. Arithmetic functions

It has been proposed that AQL would support some basic arithmetic functions, such as addition (plus) , subtraction (minus) . Multiplication, and division may be supported as well. We do not have any scenario requiring for these two functions yet.

Most cases, it is required that the left operand and the right operand must be the same type in order to do the calculation, e.g. numeric types. openEHR RM supports addition and subtraction for the types of: DV_QUANTITY, DV_COUNT, DV_INTERVAL, DV_PROPORTION, DV_DATE_TIME, DV_DATE, and DV_TIME. Again, it is required that the left and right operands must be the same type. However, one exception to this is that openEHR RM also supports subtraction between all subtypes of DV_TEMPORAL and type of DV_DURATION. AQL would support these arithmetic functions that openEHR RM supports.

The example below shows a scenario which requires that a Composition report (c1) is issued in the last year. It illustrates that a type of ISO 8601 date string (current-date()) subtracts a type of ISO 8601 interval string (P1Y).

c1/context/other_context/items[at0006]/items[at0013]/value > current-date() - P1Y
1.5.2.2. Other functions

It is proposed that AQL may also support other functions, such as:

  • current-date(): a build-in function returning the current date value in ISO date string format.

  • current-date-time(): a build-in function returning the current date time value in ISO date/time string format.

  • max: a build-in function returning the max value out of an expression.

1.6. Further discussions

The existing AQL grammar and syntax could be further enhanced in the following areas:

  • Reduce the length of the query statement. The use of archetype path in AQL query makes the query lengthy and hard to read. Local variables with meaningful names that are assigned with path can be used to reduce the length of the query as well as improve readability of the query. One example is shown below:

let $systolic_bp="data[at0001]/events[at0006]/data[at0003]/items[at0004]/value/magnitude"
let $diastolic_bp="data[at0001]/events[at0006]/data[at0003]/items[at0005]/value/magnitude"

SELECT
    obs/$systolic_bp, obs/$diastolic_bp
FROM
    EHR [ehr_id/value=$ehrUid] CONTAINS COMPOSITION [openEHR-EHR-COMPOSITION.encounter.v1]
        CONTAINS OBSERVATION obs [openEHR-EHR-OBSERVATION.blood_pressure.v1]
WHERE
    obs/$systolic_bp>= 140 OR obs/$diastolic_bp>=90

2. Overview

2.1. Existing Query Languages

Currently, the available query languages, such as SQL, XQuery, or Object-Oriented Query Language, have dependencies on particular system data structure and working environment. Users must know the persistence data structure of an EHR in order to write an appropriate query. The query statement cannot be used by other systems which have different data store. Consequently, none of the these languages meet the aforementioned requirements and none of these can be used directly as the query language required by integrated care EHRs

2.2. What is AQL?

Archetype Query Language (AQL) is a declarative query language developed specifically for expressing queries used for searching and retrieving the clinical data found in archetype-based EHRs. It is applied to the openEHR EHR Reference Model (RM) and the openEHR clinical archetypes, but the syntax is independent of applications, programming languages, system environment, and storage models. The minimum requirement for data to be querying with AQL (including with archetype structures and terminology) is for the data to be marked at a fine granularity with the appropriate archetype codes and terminology codes. This may be native openEHR-structured data, or legacy system data to which the relevant data markers (mainly archetype paths and terminology codes) have been added. Unlike other query languages, such as SQL or XQuery, AQL expresses the queries at the archetype level, i.e. semantic level, other than at the data instance level. This is the key in achieving sharing queries across system boundaries or enterprise boundaries.

AQL has the following distinctive features:

  1. the utilization of openEHR archetype path syntax in AQL. openEHR path syntax is used to locate clinical statements and data values within them using Archetypes. This path syntax is used to represent the query criteria and returned results. It allows setting query criteria using archetype and node identifiers, data values within the archetypes, and class attributes defined within the openEHR RM. It also allows the returned results to be top-level archetyped RM objects, data items within the archetypes or RM attribute values.

  2. the utilization of containment mechanisms to indicate the data hierarchy and constrain the source data to which the query is applied.

  3. the utilization of ADL-like operator syntaxes, such as matches, exists, in, negation.

  4. Neutral expression syntax. AQL does not have any dependencies on the underlying RM of the archetypes. It is neutral to system implementation and environment.

  5. Supporting queries with logical time-based data rollback.

AQL has some other features which can be found from other query languages: . Supporting naming returned results. . Supporting query criteria parameters. . Supporting arithmetic operations (such as count, addition, subtraction, multiplication, and division), relational operations (>, >=, =, !=, <=, <) and Boolean operations (OR, AND, XOR, NOT). . Supporting some functions that are supported in XQuery, such as current-date(). . Users can specify their preference on the retrieved data, such as ordering preferences, or total number of retrieved results. . Supporting queries for individual clinical subjects at the point of care, administrative purposes and clinical research purposes.

2.3. Structure & Example

Like SQL, AQL has five clauses: SELECT, FROM, WHERE, ORDER BY, and TIMEWINDOW. The SELECT clause specifies the data elements to be returned. The FROM clause specifies the result source and the corresponding containment criteria. The WHERE clause specifies data value criteria within the result source. The ORDER BY clause indicates the data items used to order the returned result set. TIMEWINDOW clause is to constrain the query to data that was available in the system within the specified time criteria.

Below is an example of AQL statement. This statement returns all blood pressure values where systolic value is greater or equal to 140 or whose diastolic value is greater or equal to 90 within a specified EHR. AQL syntax is a synthesis of SQL structural syntax and openEHR path syntax.

SELECT                                                          -- Select clause
    o/data[at0001]/.../items[at0004]/value AS systolic,         -- Identified path with named result
    o/data[at0001]/.../items[at0005]/value AS systolic,
FROM                                                            -- From clause
    EHR[ehr_id=$ehrId]                                          -- RM class expression
        CONTAINS                                                -- containment
            COMPOSITION c                                       -- RM class expression
                [openEHR-EHR-COMPOSITION.encounter.v1]          -- archetype predicate
        CONTAINS
            OBSERVATION o
                [openEHR-EHR-OBSERVATION.blood_pressure.v1]
WHERE                                                           -- Where clause
    o/data[at0001]/.../items[at0004]/value/value >= 140 OR      -- value comparison
    o/data[at0001]/.../items[at0005]/value/value >= 90

3. AQL Syntax Description

3.1. Overview

AQL has SELECT, FROM, WHERE and ORDER BY clauses. openEHR path syntax is used to locate any node or data values within archetypes. The SELECT clause uses openEHR path syntax to indicate expected archetypes, elements, or data values that are need to be returned. The FROM clause uses containment mechanisms indicating the scope of the query data source. WHERE clause utilises archetype path to set query criteria on archetypes or any node within the archetypes. The following sections introduce the AQL syntax in detail.

3.2. Reserved words and characters

AQL has the following reserved keywords and characters:

  • SELECT, FROM, WHERE, ORDER BY, TOP, AND, OR

  • CONTAINS, max, now, current-date, current-date-time

  • TIMEWINDOW (need to be confirmed)

  • matches, in, exists, not in

  • " and ': double and single quote characters are used to delimit string values;

  • |: bar characters are used to delimit intervals;

  • []: brackets are used to delimit coded terms, archetype id values or openEHR reference model class attribute values.

  • {}: brackets are used with matches delimiting matches criteria

  • <, >, =, %, !, $

3.3. openEHR path syntax

The openEHR path syntax has been described in a detail (see the Architecture Overview specification). AQL utilises the basic openEHR path syntax to represent both coarse-grained and fine-grained archetype nodes and openEHR class instances and attributes. The syntax details are not repeated here. A set of openEHR path examples are provided.

3.3.1. openEHR path examples - archetype path

An archetype path is the path referring to any node within an archetype. The following is an ADL workbench screen shot of apgar observation node map. The table below shows some examples.

Archetype RM type Archetype node name Path syntax

OBSERVATION

Apgar score

/

OBSERVATION

1 minute

/data[at0002]/events[at0003]

OBSERVATION

Heart rate

/data[at0002]/events[at0003]/data[at0001]/items[at0005] x

3.3.2. openEHR path examples - path to RM class attributes

Another type of openEHR path is the path pointing to an attribute of openEHR Reference Model classes. The examples shown below uses the attributes from the COMPOSITION class.

OpenEHR RM attribute name Path syntax

COMPOSITION.category

/category

COMPOSITION.context.start_time

/context/start_time

COMPOSITION.uid.value

/uid/value

3.4. AQL variables

AQL variables are used to refer a specific archetype or openEHR class defined within an AQL statement. It has the following features:

  • AQL variable names must be defined in an AQL FROM clause.

  • A variable name must be unique within an AQL statement.

  • Not all classes or archetypes defined within an AQL statement need a variable name. It is required only when other clauses within the AQL statement need to refer to the class or archetype defined in the FROM clause. For instance, the example doesn’t define a variable name for EHR class. It has a variable name c indicating the Composition archetype openEHR-EHR-COMPOSITION.encounter.v1, and varialbe name o referring to the Observation archetype openEHR-EHR-OBSERVATION.blood_pressure.v1. Variable name o is used by both WHERE clause and SELECT clause.

  • Variable names are used by other clauses within the AQL statement along with openEHR path to form a complete AQL path - Identified path. Refer back to Section 2.2. example, o/data[at0001/events[at0006]/data[at0003]/items[at0004]/value/value] points to the systolic value node of the Observation archetype openEHR-EHR-OBSERVATION.blood_pressure.v1.

  • Except for the reserved words, any characters, whitespace, letters, Alphanumeric, and underscore can be used for a variable name. Variable names are not case sensitive. Digits are not recommended at the beginning of a variable name even though there is no formal restriction.

3.5. Parameters

AQL syntax supports parameterisation. AQL parameters are used to indicate criteria, which will be substituted with values at run time. Supporting parameterisation is one of the mechanisms used to realise sharing AQL statements within a system or across system boundaries.

The parameters can be EHR specific (such as the parameter of EHR identifier or EHR creation date time), archetype specific (such as an archetype identifier, or archetype constraints), or template specific (such as a template identifier or template constraints).

3.5.1. When parameters are needed

Parameters are needed when the same AQL query statement is used with different criteria values. This AQL example is to return all abnormal blood pressure values for a single specific EHR. This query has a parameter $ehrUid. This parameter will be substituted by a specific EHR.ehr_id value at run time. Consequently, this query can be reused for all EHRs either within an EHR system or by other EHR systems (assuming these EHR systems support AQL).

3.5.2. Where and how parameters are resolved

AQL query parameters can be resolved at application level, or EHR system level. It depends on what the query parameters are used for and the design/implementation of the system or components. Query parameters would be normally resolved outside of a query engine.

If a query needs to be reusable across different EHR systems, any query parameters normally need to be registered in these EHR systems so that they can be resolved with real values from each environment. A typical example of this type of query parameter is $ehrUid. If the query is only used within an application, then the query parameters would be resolved by the application, such as the parameter of healthcare facility identifier or template identifier.

There are no specific guidelines on how to resolve query parameters. Generally speaking, a parameter name is used as a key (or a key is associated with a parameter name) and the key needs to be unique within the boundary where the parameters are resolved. The EHR system or application needs to have the API functions to get the real value with a given parameter name or key.

3.5.3. Parameter syntax

A parameter always starts with a dollar sign $ followed by the parameter name, e.g. $ehrUid. The parameter name can consist of letters, digits and underscores. It cannot have spaces and it cannot be an AQL reserved word.

A parameter can be used for any criteria values within an AQL statement, e.g.:

  • within a predicate: [$archetypeId], [at0003, $nameValue], [ehr_id/value=$ehrId]

  • outside a predicate: o/data[at0001]/events[at0006]/data[at0003]/items[at0004]/value/value > $systolicCriteria

3.6. Predicates

AQL has three types of predicates: the standard predicate, the archetype predicate, and the node predicate.

3.6.1. Standard predicate

Standard predicates always have a left operand, operator and right operand, e.g. [ehr_id/value='123456'].

  • The left operand is normally an openEHR path, such as ehr_id/value, name/value.

  • The right operand is normally a criterion value or a parameter, such as '123456', $ehrUid. It can also be an openEHR path (based on the BNF).

  • The operator can be one of the following:

    >, >=, =, <, <=, !=

3.6.2. Archetype predicate

An archetype predicate is a shortcut of a standard predicate, i.e. the predicate does not have the left operand and operator. It only has an archetype id, e.g. [openEHR-EHR-COMPOSITION.encounter.v1]. The archetype predicate is a specific type of query criterion indicating what archetype instances are relevant to this query. It is used to scope the the data source from which the query expected data is to be retrieved. Therefore, an archetype predicate is only used within an AQL FROM clause, for example,

FROM EHR [ehr_id/value='1234']
CONTAINS COMPOSITION c [openEHR-EHR-COMPOSITION.encounter.v1]
CONTAINS OBSERVATION o [openEHR-EHR-OBSERVATION.blood_pressure.v1]

3.6.3. Node predicate

A node predicate is also a shortcut of a standard predicate. It has the following forms:

  • Containing an archetype_node_id (i.e. an at-code) only, e.g.:

    [at0002]
  • Containing an archetype_node_id and a name value/criterion, e.g.:

    [at0002 and name/value=$nameValue]
    [at0002 and name/value='real name value']
  • Containing an archetype_node_id and a shortcut of a name/value criterion, e.g.:

    [at0002, $nameValue]
    [at0002, 'real name value']
  • The above three forms are the most common node predicates. A more advanced form is to include a general criterion instead of the name/value criterion within the predicate. The general criterion consists of left operand, operator, and right operand, e.g.:

    [at0002 and value/defining_code/terminology_id/value=$terminologyId]

A node predicate defines criteria on fine-grained data. It is only used within an identified path.

3.7. Operators

3.7.1. Basic operators

The table below shows the basic AQL operators, meaning and example.

Operator Meaning Example

=

Equal

name/value = $nameValue

>

Greater than

o/data[at0001]/…​/data[at0003]/items[at0004]/value/value >140

>=

Greater than or equal to

o/data[at0001]/…​./data[at0003]/items[at0004]/value/value >=140

<

Smaller than

o/data[at0001]/…​/data[at0003]/items[at0004]/value/value <160

<=

Smaller than or equal to

o/data[at0001]/…​/data[at0003]/items[at0004]/value/value ⇐160

!=

not equals to

c/archetype_details/template_id/value != ''

3.7.2. Advanced operators

3.7.2.1. matches

The matches operator is used in the WHERE clause. This operator needs left operand and right operand. Left operand is an AQL identified path. Right operand is enclosed within curly braces. Right operand of matches operator has three forms:

  1. type of constraint ADL (cADL): this type of right operand can be a value list or a value range. Value range is not supported in the current AQL grammar. Below is an example with matches a string value list:

    SELECT
        o/data[at0002]/events[at0003]/data/items[at0015]/items[at0018]/name
    FROM
        Ehr [uid=$ehrUid]
            CONTAINS Composition c
                CONTAINS Observation o[openEHR-EHR-OBSERVATION.microbiology.v1]
    WHERE
        o/data[at0002]/events[at0003]/data/items[at0015]/items[at0018]/items[at0019]/items[at0021]/name/defining_code/code_string matches {'18919-1', '18961-3', '19000-9'}
  2. URI: can be a terminology URI, openEHR EHR URI, or other URI. An example with a terminology URI is shown below:

    SELECT
        e/ehr_status/subject/external_ref/id/value, diagnosis/data/items[at0002.1]/value
    FROM
        EHR e
            CONTAINS Composition c[openEHR-EHR-COMPOSITION.problem_list.v1]
                CONTAINS Evaluation diagnosis[openEHR-EHR-EVALUATION.problem-diagnosis.v1]
    WHERE
        c/name/value='Current Problems'
        AND diagnosis/data/items[at0002.1]/value/defining_code matches { terminology://Snomed-CT/hierarchy?rootConceptId=50043002 }
3.7.2.2. EXISTS

EXISTS is a unary operator, which is always followed by a single operand. The single operand is an AQL identified path. This operator returns Boolean results: true means the data associated with the specified path exists, false otherwise. An EXISTS expression is a single identified expression. The latest grammar only allows this operator in WHERE clause. It may be required as part of openEHR path in future.

The example below means retrieving the identifiers of the EHRs which do not have discharge admin entry instance.

SELECT
    e/ehr_id/value
FROM
    EHR e
        CONTAINS COMPOSITION c[openEHR-EHR-COMPOSITION.administrative_encounter.v1]
            CONTAINS ADMIN_ENTRY admission[openEHR-EHR-ADMIN_ENTRY.admission.v1]
WHERE
    NOT EXISTS c/content[openEHR-EHR-ADMIN_ENTRY.discharge.v1]

This example below means from a particular EHR, retrieving all composition uid values and the contained instruction instances if the instruction instances have links with particular EHR URI value.

SELECT
    c/uid/value, instruction
FROM
    EHR e [ehr_id/value=$ehrid]
        CONTAINS COMPOSITION c
            CONTAINS INSTRUCTION instruction[openEHR-EHR-INSTRUCTION.referral.v1]
WHERE
    EXISTS instruction/links[target='ehr://327000002/87284370-2D4B-4e3d-A3F3-F303D2F4F34B@latest_trunk_version']
3.7.2.3. NOT

NOT is a unary Boolean operator and it is always followed by either a single identified expression or boolean identified expression. It returns Boolean results: true means the followed expression (single or Boolean expression) is false.

The example below uses the NOT operator followed by a single identified expression

SELECT
    e/ehr_id/value
FROM
    EHR e
        CONTAINS COMPOSITION c[openEHR-EHR-COMPOSITION.administrative_encounter.v1]
            CONTAINS ADMIN_ENTRY admission[openEHR-EHR-ADMIN_ENTRY.admission.v1]
WHERE
    NOT EXISTS c/content[openEHR-EHR-ADMIN_ENTRY.discharge.v1]

The example below uses NOT operator followed by a boolean identified expression.

SELECT
    e/ehr_id/value
FROM
    EHR e
        CONTAINS COMPOSITION c[openEHR-EHR-COMPOSITION.administrative_encounter.v1]
            CONTAINS ADMIN_ENTRY admission[openEHR-EHR-ADMIN_ENTRY.admission.v1]
WHERE
    NOT (EXISTS c/content[openEHR-EHR-ADMIN_ENTRY.discharge.v1]
    AND e/ehr_status/subject/external_ref/namespace='CEC')

The above example is equivallent to the two expressions shown below:

SELECT
    e/ehr_id/value
FROM
    EHR e
        CONTAINS COMPOSITION c[openEHR-EHR-COMPOSITION.administrative_encounter.v1]
            CONTAINS ADMIN_ENTRY admission[openEHR-EHR-ADMIN_ENTRY.admission.v1]
WHERE
    NOT EXISTS c/content[openEHR-EHR-ADMIN_ENTRY.discharge.v1]
    OR NOT e/ehr_status/subject/external_ref/namespace = 'CEC'


SELECT
    e/ehr_id/value
FROM
    EHR e
        CONTAINS COMPOSITION c[openEHR-EHR-COMPOSITION.administrative_encounter.v1]
            CONTAINS ADMIN_ENTRY admission[openEHR-EHR-ADMIN_ENTRY.admission.v1]
WHERE
    NOT EXISTS c/content[openEHR-EHR-ADMIN_ENTRY.discharge.v1]
    OR e/ehr_status/subject/external_ref/namespace != 'CEC'

These advanced operators are not yet supported by the grammar. The operator syntax is borrowed from ADL specifications. These are proposed to improve the richness and flexibility of AQL so that AQL syntax supports more complicated query scenarios.

3.8. AQL identified paths

3.8.1. Usage

Identified paths are used to locate data items within an archetype RM class. Except for the FROM clause, an identified path could appears in any clauses of an AQL statement. For example, it can be used to indicate the data to be returned in a SELECT clause, or the data item on which query criteria are applied in a WHERE clause.

3.8.2. Syntax

AQL identified path can take any of the following forms:

  • an AQL variable name defined within the FROM clause, followed by an openEHR path, e.g.

    o/data[at0001]/.../data[at0003]/items[at0004]/value/value
  • an AQL variable name followed by a predicate, e.g.

    o[name/value=$nameValue]
  • an AQL variable name followed by a predicate and an openEHR path, e.g.

    o[name/value=$nameValue]/data[at0001]/.../data[at0003]/items[at0004]/value/value

3.9. Leaf Data

3.9.1. Primitive Types

3.9.1.1. String Data

All strings are enclosed in double quotes or single quotes. Line breaks are not supported.

3.9.1.2. Integer data

Integers are represented as numberic literals, such as 1, 2, 365. Commas or periods for breaking long numbers are not allowed. Hex integers are not supported.

3.9.1.3. Real data

Real numbers are the decimal literals that include a decimal point, such as 3.1415926. Commas or periods for breaking long numbers are not allowed.

3.9.1.4. Boolean data

Boolean values are indicated using the case-insensitive literals true or false.

3.9.1.5. Dates and Times

No matter whether the date/date time is complete or partial, it is allowed in AQL as long as it is in ISO8601 date/date time format. Quotes (either single quotes or double quotes) are required to enclose the date/date time value. In AQL grammar, it is treated as a date value if the value is a complete calendar date represented in ISO8601 extended format and if single quotes are used. If the value is in other ISO8601 date/date time format, AQL treats it as a string value, but the AQL query engine processes it differently from a normal string value. It is a temporary solution. In future, a date/date time value should be identified as date/date time value if it is a valid ISO8601 date/date time. Examples are as below:

WHERE composition/context/start_time > '19860101' // a complete date. AQL grammar identifies it as a string value.
WHERE composition/context/start_time < "120000+0930"  // ISO8601 time. AQL grammar identifies it as a string value.
WHERE composition/context/start_time <= '1986-01-01'  // complete calendar date in ISO8601 extended format. AQL grammar identifies this value as a date value.

3.9.2. Other Built-in Types

3.9.2.1. URI

URI leaf data is enclosed within curly braces after matches operator. URI can be expressed as a normal URI described in [rfc3986]. URIs are not case sensitive. In AQL, a URI can be either terminology URI or an EHR URI.

3.9.2.2. Terminology URI

A terminology URI consists of the following components:

  1. terminology: the URI schemes value;

  2. terminology service: the URI authority value, such as SNOMED-CT;

  3. terminology function name: the URI path, e.g. "hierarchy" is the function name in the example shown below;

  4. argument values required by the terminology functions - URI queries;

This is an example of a terminology URI:

    terminology://snomed-CT/hierarchy?rootConceptId=50043002
    \_________/   \_______/ \_______/ \___________/ \______/
        |             |         |           |__________|
     scheme       authority   path          | queries  |
        |             |         |           |          |
    terminology  terminology function    argument   argument
       uri         service                 name      value
3.9.2.3. Value List

Value list leaf data is enclosed within curly braces after matches operator. Each item is separated by comma if there are multiple items in the list. Value list can be string value list, date time value list, integer value list, and real value list. Quotes are required for string value list and date time value list.

A value list is only used in AQL` WHERE` clause when the criteria is to match one item of the list. The relationships among these value list items are "OR".

Some examples of value list:

matches {'string item 1', 'string item 2', 'string item3'}  // a string list. It Equivalent to
                                                            // matches  string item1' or ..
                                                            // matches 'string item 2' or ..
                                                            // matches 'string item 3'
matches {'20060101', '20070101', '20080101'}                // a date value list
matches {1, 2, 3}                                           // a integer list
matches {1.1, 2.5, 3.8}                                     // a real value list
3.9.2.4. Value Range

It is not supported in the current AQL grammar.

Value range leaf data is enclosed within curly braces after matches operator. It is only used in AQL WHERE clause to specify a criteria within a range. The syntax is similar to intervals of ordered primitive types detailed in Section 4.5.2, adl.pdf. Some examples:

matches {|3.0..5.0|} // means 3.0 <= x <=5.0
matches {|<8|}       // means x < 8

3.10. Query structure

3.10.1. Overview

AQL structure has been briefly introduced in Section 2.2. This section describes the syntax in more formal detail.

AQL structure has the following clauses and these clauses must be listed in the same order as the list below in an AQL statement. An AQL statement must contain SELECT and FROM clauses.

  • SELECT (mandatory)

  • FROM (mandatory)

  • WHERE (optional)

  • ORDER BY (optional)

  • TIMEWINDOW (optional)

3.10.2. FROM

The FROM clause utilises class expressions and a set of containment criteria to specify the data source from which the query required data is to be retrieved. Its function is similar as the FROM clause of an SQL expression.

3.10.2.1. FROM Syntax

A simple FROM clause consists of three parts: keyword FROM, class expression and/or containment constraints, e.g.

FROM
    EHR e [ehr_id/value=$ehrId]
        CONTAINS COMPOSITION c[openEHR-EHR-COMPOSITION.report.v1]
3.10.2.2. Class expressions

Two examples of a class expression are shown below:

EHR e [ehr_id/value=$ehrId]                        \\EHR class, class identifier/variable, and a standard predicate
COMPOSITION c[openEHR-EHR-COMPOSITION.report.v1]   \\COMPOSITION class, class identifier/variable, and an archetype predicate

Class expressions are used for two purposes:

  1. indicating the constraints on RM classes so as to scope the data source for the query. For instance, EHR e[ehr_id/value='123456'] indicates that the required data must be from a specific EHR with ehr_id value '123456'; while COMPOSITION c[openEHR-EHR-COMPOSITION.report.v1] indicates the required data must be from or must be associated with a Composition instance with archetype id - openEHR-EHR-COMPOSITION.report.v1.

  2. defining a RM class variable that may be used by other clauses to indicate the required data or data items on which query criteria are applied. The example below uses the class expression to define a variable e which is used by the SELECT clause indicating all relevant ehr_id values are retrieved, and a variable c used by WHERE clause indicating that the query criteria is set on the composition template id value.

    SELECT
        e/ehr_id/value
    FROM
        EHR e
            CONTAINS COMPOSITION c[openEHR-EHR-COMPOSITION.referral.v1]
    WHERE
        c/archetype_details/template_id/value=$templateId

Class expressions syntax include three parts. A class expression must have part one and at least one of part two or part three.

  1. part one (mandatory): openEHR RM class name, such as EHR, COMPOSITION, OBSERVATION etc.

  2. part two (optional): AQL variable name

  3. part three (optional): a standard predicate or an archetype predicate.

3.10.2.3. Containment

Since archetypes are in hierarchical structure, AQL has a containment constraint which specifies the hierarchical relationships between parent and child data items. The FROM clause utilises this hierarchical constraint along with class expression to determine the data source to which the AQL query is applied.

The syntax of containment constraint is very simple: using keyword CONTAINS between two class expressions. Left class expression is the the parent object of the right class expression, e.g.

EHR e CONTAINS COMPOSITION c [openEHR-EHR-COMPOSITION.referral.v1]

Boolean operators (AND, OR, NOT) and parentheses are used when multiple containment constrains are required, e.g.

EHR e
    CONTAINS COMPOSITION c [openEHR-EHR-COMPOSITION.referral.v1] AND COMPOSITION c1 [openEHR-EHR-COMPOSITION.report.v1]

EHR e
    CONTAINS COMPOSITION c [openEHR-EHR-COMPOSITION.referral.v1]
        CONTAINS (OBSERVATION o [openEHR-EHR-OBSERVATION-laboratory-hba1c.v1] AND OBSERVATION o1 [openEHR-EHR-OBSERVATION-laboratory-glucose.v1])

3.10.3. WHERE

An AQL WHERE clause is used to represent further criteria applied to the data items within the objects declared in the FROM clause. A WHERE clause expresses the query criteria that cannot be represented in other AQL clauses, such as criteria on archetype id, composition committal date time, and the criteria on in which order the returned results should be listed.

3.10.3.1. Syntax

The WHERE clause syntax has the following parts (in order): keyword WHERE and identified expression(s). Boolean operators (AND, OR, NOT) and parenthesis can be used to represent multiple identified expressions. Examples:

WHERE
    c/name/value=$nameValue AND c/archetype_details/template_id/value=$templateId

WHERE
    (c/name/value=$nameValue OR c/archetype_details/template_id/value=$templateId)
    AND o/data[at0001]/events[at0006]/data[at0003]/items[at0004]/value/value >= 140
3.10.3.2. Identified expression

Identified expression specifies the criteria within WHERE clause. It consists of left operand, operator and right operand.

  1. Left operand is normally an identified path,

  2. Operator is normally the basic operator,

  3. Right operand is the criteria value, which normally is primitive type, such as String, Integer, Boolean, Double, or Float. When it is a string value, single quotation marks or double quotation marks are required. Right operand can also be a parameter or an identified path as well.

Examples:

  • Left operand is an identified path and right operand is a real criteria value - primitive type:

    o/data[at0001]/events[at0006]/data[at0003]/items[at0004]/value/value >= 140
    c/archetype_details/template_id/value='health_encounter'
  • Left operand is an identified path and right operand is a parameter:

    c/archetype_details/template_id/value=$templateParameter
  • Both left operand and right operand are an identified path (this is an advanced feature):

    o/data[at0001]/events[at0006]/data[at0003]/items[at0004]/value/value >
    o1/data[at0001]/events[at0006]/data[at0003]/items[at0004]/value/value

3.10.4. SELECT

A SELECT clause specifies what data is to be retrieved by the AQL query. The data can be any types from openEHR RM and any primitive data types. In addition, the SELECT clause uses TOP to indicate the number of result setS that should be returned and name alias to rename the retrieved data.

The SELECT syntax always starts with keyword SELECT, followed by TOP (optional), and identified path(s) or variable name(s) defined in the FROM clause. Each variable name or an identified path may have a name alias renaming the associated data.

If variable name(s) is(are) required, the full object of the type associated with the variable is retrieved, such as a COMPOSITION, an OBSERVATION object etc. If identified paths are required, the path-associated data items are returned. Multiple identifiers or identified paths are separated using a comma.

Some examples are shown below.

Example 1: retrieve all compositions' name value, context start time and composer name from a specific EHR.

SELECT
    c/name/value AS Name, c/context/start_time AS date_time, c/composer/name AS Composer
FROM
    EHR e[ehr_id/value=$ehrUid] CONTAINS COMPOSITION c

Example 2: Retrieve all composition objects of a specific EHR.

SELECT
    c
FROM
    EHR e[ehr_id/value=$ehrUid] CONTAINS COMPOSITION c
3.10.4.1. TOP

The TOP syntax was borrowed from SQL language for representing the number of result sets that should be returned by the AQL query. It uses BACKWARD and FORWARD to indicate the direction where to start to get the number of results to be returned.

It starts with keyword TOP, followed by an integer number and/or the direction (i.e. BACKWARD, FORWARD), e.g.

SELECT
    TOP 10 c/name/value AS Name, c/context/start_time AS date_time, c/composer/name AS Composer
FROM
    EHR e[ehr_id/value=$ehrUid]
        CONTAINS COMPOSITION c
3.10.4.2. Name alias

As in SQL, AQL supports the use of a name alias for the retrieved data. This is domne with the keyword AS, followed by the name which conforms to the syntax rule of AQL variable.

3.10.5. ORDER BY

The ORDER BY clause is used to sort the returned results. The keyword ORDER BY is followed by an identified path and the keyword DESC, DESCENDING, ASCE, or ASCENDING, e.g.

ORDER BY c/name/value

3.10.6. TIMEWINDOW

In addition to the FROM clause, TIMEWINDOW helps to scope the data source from which the required data is retrieved. TIMEWINDOW is an addition query clause used in AQL to constrain the query to data that was available in the system (also know the data committal time) within the specified time criteria. This supports a timebased logical system rollback allowing a query to be executed as though it was performed at that specified time, which is essential for medico-legal reporting.

It starts with the keyword TIMEWINDOW, and followed by a string compatible with the ISO 8601 representation of time interval.

The first example below constrains the query source to data committed to the system before 2006-01-01.

TIMEWINDOW /2006-01-01

The second example constrains the query source to data committed within the period of two years before 2006-01-01.

TIMEWINDOW P2Y/2006-01-01

4. Result structure

The result structure of an AQL query is described as part of the openEHR EHR service definition.

5. AQL Operator Syntax Examples

The following sub-sections illustrate various AQL operators with examples.

5.1. ADL matches operator

The following examples show how the ADL matches operator may be used within AQL statements.

Note
the ADL expressions on the right-hand side of the matches operator are in ADL 1.4 format.
Clinical Scenario 1 Blood glucose > 11 mmol/L or blood glucose >= 200 mg/dL

EHR path

o/data[at0001]/events[at0002 and name/value='Any event']/data[at0003]/items[at0013.1]

AQL Fragment

ehr_path/value
    matches {
        C_DV_QUANTITY<
            list = <
               ["1"] = <
                   units = <"mmol/L">
                   magnitude = <|>=11|>
               >
               ["2"] = <
                   units=<"mg/dL">
                   magnitude=<|>=200|>
               >
            >
        >
    }
Clinical Scenario 2 Blood glucose level is between 5-6 mmol/L or between 90-110 mg/dL

EHR path

o/data[at0001]/events[at0002 and name/value='Any event']/data[at0003]/items[at0013.1]

AQL Fragment

ehr_path/value
    matches {
        C_DV_QUANTITY<
            list = <
                ["1"] = <
                    units = <"mmol/L">
                    magnitude = <|5.0..6.0|>
                >
                ["2"] = <
                    units = <"mg/dL">
                    magnitude = <|90..110|>
                >
            >
        >
    }
Clinical Scenario 3 HbA1c > 7.0%

EHR path

o/data[at0001]/events[at0002 and name/value='Any event']/data[at0003]/items[at0013.1]

AQL Fragment

ehr_path/value matches {
    DV_PROPORTION matches {
        numerator matches {|>7.0|}
        denominator matches {|100.0|}
    }
}
Clinical Scenario 4 Total cholesterol >= 5.0 mmol/L or LDL-C >= 3.0 mmol/L

Total cholesterol EHR path

o/data[at0001]/events[at0002 and name/value='Any event']/data[at0003]/items[at0013.1]

LDL-C EHR path

o/data[at0001]/events[at0002 and name/value='Any event']/data[at0003]/items[at0011.1 and
name/value='Fractions']/items[at0013.4 and name/value='LDL-Cholesterol']

AQL Fragment

Total cholesterol EHR path/value matches {
    DV_QUANTITY matches {
        units matches {"mmol/L"}
        magnitude matches {|>=5.0|}
    }
}

OR

LDL-C EHR path/value matches {
    DV_QUANTITY matches {
        units matches {"mmol/L"}
        magnitude matches {|>=3.0|}
    }
}
Clinical Scenario 5 Require all HbA1c values done in 1999

Total cholesterol EHR path

o/data/origin

AQL Fragment
(alt 1)

EHR path/value matches {"^1999"}

AQL Fragment
(alt 2)

EHR path/value matches {|1999-??-??|

AQL Fragment
(alt 3)

EHR path/value matches {|1999-01-01..1999-12-31|
Clinical Scenario 6 Require all blood glucose values done in the morning of 1st December, 1999

EHR path

o/data/origin

AQL Fragment
(alt 1)

EHR path/value matches {"^19991201T(((0[0-9])|10|11|)(([0-5][0-9])?([0-5][0-9])?)|(1200))$"}

AQL Fragment
(alt 2)

EHR path/value matches {|1999-12-01T00:00:00..1999-12-01T11:59:59|}
Clinical Scenario 7 Require all blood glucose values done in the morning of December, 1999

EHR path

o/data/origin

AQL Fragment
(alt 1)

EHR path/value matches {"^199912(([0-2][1-9])|10|20|30|31)?(T((0[0-5])|10|11)(([0-5][0-9])?([0-5][0-9])?))$"}

AQL Fragment
(alt 2)

EHR path/value matches {|1999-12-ddT00:00:00..1999-12-ddT11:59:59|}
Clinical Scenario 8 Require all HbA1c values done after 1999

EHR path

o/data/origin

AQL Fragment
(alt 1)

EHR path/value matches {"^[2-9]\d\d\d"}

AQL Fragment
(alt 2)

EHR path/value matches {|>1999-??-??|}
Clinical Scenario 9 Require all HbA1c values done before 1999

EHR path

o/data/origin

AQL Fragment
(alt 1)

EHR path/value matches {"^1\d\d[0-8]"}

AQL Fragment
(alt 2)

EHR path/value matches {|<1999-??-??|}
Clinical Scenario 10 Requires all reports composed by patient self

AQL Fragment

SELECT c
    FROM EHR e[ehr_id=$ehrUid] CONTAINS COMPOSITION c
    WHERE c/composer matches {PARTY_SELF}

5.2. Arithmetic operators

The following example shows how arithmetic operators may be used within AQL statements.

Clinical Scenario 1 For each recorded administration of ampicillin check for problem diagnosis of skin rash that is within 2 days of the administration date

AQL Fragment

SELECT e/ehr_id

FROM EHR e CONTAINS (COMPOSITION c1
       CONTAINS ACTION a [openEHR-EHR-ACTION.medication.v1]
           CONTAINS ITEM_TREE it [openEHR-EHR-ITEM_TREE.medication.v1]) AND
       CONTAINS (COMPOSITION c2 CONTAINS EVALUATION eval [openEHR-EHR-EVALUATION.problem-diagnosis.v1])

WHERE
    it/description[openEHR-EHR-ITEM_TREE.medication.v1]/items[at0001]/value matches {"SNOMED::31087008"} AND
    eval/data[at0001]/items[at0002.1]/value/value/defining_code matches {
          CODE_PHRASE matches {[SNOMED::294506009, 21626009]}
    } AND
    eval/data[at0001]/items[at0010]/value - it/description[openEHR-EHR-ITEM_TREE.medication.v1]/items[at0018]/items[at0019]/value
    matches {
      DV_DURATION matches {
          value matches{<=P2d}
      }
    }

5.3. Nest Query and 'Not in'

The following example shows the use of a nested query and the not-in operator.

Clinical Scenario 1 All patients who have not been discharged

AQL Fragment

SELECT e/ehr_id
FROM
    EHR e CONTAINS
        ADMIN_ENTRY ae1 [openEHR-EHR-ADMIN_ENTRY.admission.v1]
WHERE
    ae1/encounter_id/value not in (
        SELECT ae2/encounter_id/value
        FROM
            EHR e CONTAINS
                ADMIN_ENTRY ae2 [openEHR-EHR-ADMIN_ENTRY.discharge.v1]
    )

5.4. Exists operator

The following example shows how the exists operator may be used within AQL statements.

Clinical Scenario 1 Require all blood pressure values with position recorded

AQL Fragment

SELECT o
FROM
    EHR e CONTAINS
        COMPOSITION CONTAINS
            OBSERVATION o [openEHR-EHR-OBSERVATION.blood_pressure.v1]
 WHERE
    exists {"o//*/state[at0007]/items[at0008]"}

6. How to write an AQL query statement manually

This section describes how to build an AQL query manually. The process described here gives may provide some hints for AQL builder tools.

6.1. Query scenario

An example is used to illustrate how to write an AQL statement. The query scenario is:

"Get all abnormal blood pressure values that are recorded in a health encounter for a specific patient."

6.2. Step 1. The FROM clause

A FROM clause is to scope the data source for the query. Normally the first step is to determine the FROM clause, which has EHR class expression and archetype class expressions.

EHR class expression

This determines whether this query is applied to a single EHR or all EHRs. The latter is called a population query. If it is for all EHRs, there is no need to specify ehr_id/value in the FROM clause. Otherwise, you need to either specify a specific value or use a query parameter for ehr_id/value in a FROM clause. For this particular scenario, it is for a single EHR. A parameter called ehrUid is used for the ehr_id/value, giving this AQL query statement:

FROM EHR [ehr_id/value=$ehrUid]

Archetype expressions

  • To write archetype expressions in the FROM clause, the archetypes required for the scenario must be determined, since archetypes are used to represent clinical concepts. The easy way to identify archetypes for a scenario is to identify clinical concepts mentioned in the scenario. Each clinical concept is generally associated with one archetype. Two clinical concepts are mentioned in the above scenario: 'blood pressure' and 'health encounter'. From these concepts the two archetypes used in query are identified: an Observation archetype openEHR-EHR-OBSERVATION.blood_pressure.v1, and a Composition archetype openEHR-EHR-COMPOSITION.encounter.v1.

  • Determine whether a variable name is needed for each archetype class expression or not. This is useful if the reference to the archetype class expression is required by other AQL clauses. Without using variable names for the archetype expressions, the two archetype expressions are:

COMPOSITION [openEHR-EHR-COMPOSITION.encounter.v1]
OBSERVATION [openEHR-EHR-OBSERVATION.blood_pressure.v1]

Containment expression.

The openEHR RM is used to identify the hierarchical relationships among the found archetypes so as to define the containment expression. For example, a Composition archetype is the parent of an Observation archetype, so the AQL query looks as follows:

FROM EHR [ehr_id/value=$ehrUid] CONTAINS COMPOSITION [openEHR-EHR-COMPOSITION.encounter.v1]
CONTAINS OBSERVATION [openEHR-EHR-OBSERVATION.blood_pressure.v1]

6.3. Step 2. The WHERE clause

The WHERE clause represents all query criteria that cannot be represented in the FROM clause. To write WHERE clause expression, the following is needed:

  1. Find the criteria. The criteria required by the scenario are abnormal blood pressure values. Different guidelines may have different definitions for abnormal blood pressure values. Here abnormal blood pressure is interpreted to mean mean 1) the 'systolic pressure' value is greater than or equal to 140; OR 2) the 'diastolic pressure' value is greater than or equal to 90.

  2. Write an identified expression for criterion 1).

    1. Write the Identified path for the systolic data value. A class variable name is needed as the reference to the blood pressure class expression defined in the FROM clause, so a variable name obs is added into the FROM clause. A path to systolic data value is also required to form the identified path. Two tools can be used to get the path: the openEHR ADL Workbench or Template Designer. The path to the systolic data value is /data[at0001]/events[at0006]/data[at0003]/items[at0004]/value/value.

    2. Get appropriate operator: >= is chosen for criterion 1.

    3. Get the criteria value - 140. The query statement including the identified expression for criterion 1) is shown below:

      FROM
          EHR [ehr_id/value=$ehrUid] CONTAINS COMPOSITION [openEHR-EHR-COMPOSITION.encounter.v1]
              CONTAINS OBSERVATION obs [openEHR-EHR-OBSERVATION.blood_pressure.v1]
      WHERE
          obs/data[at0001]/events[at0006]/data[at0003]/items[at0004]/value/value >= 140
  3. Write an identified expression for criterion 2). Following the previous instruction to write the identified expression for criterion 2) which is shown as:

    obs/data[at0001]/events[at0006]/data[at0003]/items[at0005]/value/value >= 90
  4. Find the appropriate boolean operator to join criterion 1) and 2) - OR. The query statement looks like:

    FROM
        EHR [ehr_id/value=$ehrUid] CONTAINS COMPOSITION [openEHR-EHR-COMPOSITION.encounter.v1]
            CONTAINS OBSERVATION obs [openEHR-EHR-OBSERVATION.blood_pressure.v1]
    WHERE obs/data[at0001]/events[at0006]/data[at0003]/items[at0004]/value/value >= 140 OR
        obs/data[at0001]/events[at0006]/data[at0003]/items[at0005]/value/value >= 90

6.4. Step 3. The SELECT clause

Writing a SELECT clause depends on determining:

  1. What return data is required. The afore-mentioned scenario requires abnormal blood pressure values as return values, including both systolic pressure and diastolic pressure.

  2. Write the appropriate identified path to the required item (variable name is required if the class object is required). Multiple identified path is separated using comma. For this particular scenario, two identified paths are found for systolic and diastolic data value by following Step 2. The completed query statement looks like:

    SELECT
        obs/data[at0001]/events[at0006]/data[at0003]/items[at0004]/value/magnitude,
        obs/data[at0001]/events[at0006]/data[at0003]/items[at0005]/value/magnitude
    
    FROM
        EHR [ehr_id/value=$ehrUid] CONTAINS COMPOSITION [openEHR-EHR-COMPOSITION.encounter.v1]
            CONTAINS OBSERVATION obs [openEHR-EHR-OBSERVATION.blood_pressure.v1]
    
    WHERE
        obs/data[at0001]/events[at0006]/data[at0003]/items[at0004]/value/magnitude>= 140 OR
        obs/data[at0001]/events[at0006]/data[at0003]/items[at0005]/value/magnitude>= 90

7. AQL Syntax Specification

The following Antlr grammar expresses the AQL syntax.

// Author: Bostjan Lah
// (c) Copyright, Marand, http://www.marand.com
// Licensed under LGPL: http://www.gnu.org/copyleft/lesser.html
// Based on AQL grammar by Ocean Informatics: http://www.openehr.org/wiki/download/attachments/2949295/EQL_v0.6.grm?version=1&modificationDate=1259650833000
grammar Aql;

options {
	output=AST;
	ASTLabelType=CommonTree;
	backtrack=true;
	memoize=true;
}

tokens {
	ORDERDESC;
	ORDERASC;
}

@header {
package com.marand.thinkehr.aql.antlr;
import com.marand.thinkehr.aql.AqlRecognitionException;
}

@lexer::header{
package com.marand.thinkehr.aql.antlr;
import com.marand.thinkehr.aql.AqlRecognitionException;
}

@members{
public void displayRecognitionError(String[] tokenNames, RecognitionException e)
{
  throw new AqlRecognitionException(e);
}
}

// Rule Definitions
//<Query> ::= <Select> <From>
//           | <Select> <From> <Where>
//            | <Select> <From> <OrderBy>   ! is this allowed?
//            | <Select> <From> <Where> <OrderBy>
query	:	select from where? orderBy? ';'!? EOF!;

//<Select> ::= 'SELECT' <SelectExpr>
//         | 'SELECT' <TOP> <SelectExpr>
select	:	SELECT top? selectExpr -> ^(SELECT top? selectExpr);

//<Top> ::= 'TOP' Integer
//          | 'TOP' Integer 'BACKWARD'
//          | 'TOP' Integer 'FORWARD'
top	:	TOP INTEGER FORWARD? -> ^(TOP INTEGER)
	|	TOP INTEGER BACKWARD -> ^(TOP INTEGER BACKWARD);


//<Where> ::= 'WHERE' <IdentifiedExpr>
where	:	WHERE identifiedExpr -> ^(WHERE identifiedExpr);

//<OrderBy> ::= 'ORDER BY' <OrderBySeq>
orderBy	:	ORDERBY orderBySeq -> ^(ORDERBY orderBySeq);

//<OrderBySeq>  ::= <OrderByExpr>
//		| <OrderByExpr> ',' <OrderBySeq>
orderBySeq
	:	orderByExpr (','! orderByExpr)*;

//<OrderByExpr> ::= <IdentifiedPath>
//		| <IdentifiedPath> 'DESCENDING'
//		| <IdentifiedPath> 'ASCENDING'
//		| <IdentifiedPath> 'DESC'
//		| <IdentifiedPath> 'ASC'
orderByExpr
	:	identifiedPath (DESCENDING|DESC) -> ^(ORDERDESC identifiedPath)
	|	identifiedPath (ASCENDING|ASC)? -> ^(ORDERASC identifiedPath);

//<SelectExpr> ::= <IdentifiedPathSeq>
selectExpr
	:	identifiedPathSeq;

//! When multiple paths provided, each IdentifiedPath must represent an object of type DataValue
//<IdentifiedPathSeq> ::= <IdentifiedPath//>
//			| <IdentifiedPath> 'as' Identifier
//			| <IdentifiedPath> ',' <IdentifiedPathSeq>
//			| <IdentifiedPath> 'as' Identifier ',' <IdentifiedPathSeq>
identifiedPathSeq
	:	selectVar (','! selectVar)*;

selectVar
	:	identifiedPath^ asIdentifier?;

asIdentifier
	:	AS IDENTIFIER;

//<From> ::=   'FROM' <FromExpr>		! stop or/and without root class
//	    | 'FROM' <FromEHR> <ContainsExpr>
//           | 'FROM' <FromEHR>
//!          'FROM' <ContainsOr>
from	: FROM fromExpr -> ^(FROM fromExpr)
	| FROM ehrContains -> ^(FROM ehrContains);

//<FromEHR> ::= 'EHR' <StandardPredicate>
//              | 'EHR' Identifier <StandardPredicate>
//              | 'EHR' Identifier
fromEHR	: EHR IDENTIFIER -> ^(EHR IDENTIFIER)
	| EHR IDENTIFIER standardPredicate -> ^(EHR IDENTIFIER standardPredicate)
	| EHR standardPredicate -> ^(EHR standardPredicate);

ehrContains
	: fromEHR (CONTAINS^ contains)?;

//<IdentifiedExpr> ::= <IdentifiedEquality>
//                   | <IdentifiedExprBoolean>
//                   |'(' <IdentifiedExprBoolean> ')'
//identifiedExpr
//	: identifiedExprBoolean;
//        | '('! identifiedExprBoolean ')'!;

//<IdentifiedExprBoolean> ::= <IdentifiedExpr> 'OR' <IdentifiedExpr>
//                              | <IdentifiedExpr> 'AND' <IdentifiedExpr>
//                              | <IdentifiedExpr> 'XOR' <IdentifiedExpr>
//                              | 'NOT''(' <IdentifiedExprBoolean> ')'
//                              | 'NOT' <IdentifiedEquality>
identifiedExpr
 	: identifiedExprAnd ((OR|XOR)^ identifiedExprAnd)*;

identifiedExprAnd
	: identifiedEquality (AND^ identifiedEquality)*;


//<IdentifiedEquality> ::= <IdentifiedOperand> ComparableOperator <IdentifiedOperand>
//			     | <IdentifiedOperand> 'matches' '{' <MatchesOperand> '}'
//                          | <IdentifiedOperand> 'matches' RegExPattern
//                          | 'EXISTS' <IdentifiedPath>
identifiedEquality
 	: identifiedOperand ((MATCHES^ '{'! matchesOperand '}'!)|(COMPARABLEOPERATOR^ identifiedOperand))
        | EXISTS identifiedPath -> ^(EXISTS identifiedPath)
        | '('! identifiedExpr ')'!
        | NOT^ identifiedEquality
 	;
// 	| identifiedOperand 'matches' '{' matchesOperand '}'
//        | identifiedOperand 'matches' REGEXPATTERN
//        | 'EXISTS' identifiedPath;

//<IdentifiedOperand> ::= <Operand> | <IdentifiedPath>
identifiedOperand
 	: operand | identifiedPath;
//!<IdentifiedOperand> ::= <Operand> | <RelativePath>

//<IdentifiedPath>::= Identifier
//                    | Identifier <Predicate>
//                    | Identifier '/' <ObjectPath>
//                    | Identifier <Predicate> '/' <ObjectPath>
identifiedPath
	 : IDENTIFIER predicate? ('/' objectPath)? -> ^(IDENTIFIER predicate? objectPath?);
//!		| Identifer <AbsolutePath>
//!		| Identifer <Predicate> <AbsolutePath>


//<Predicate> ::= <NodePredicate>
predicate
 	: nodePredicate;

//<NodePredicate> ::= '['<NodePredicateOr>']'
nodePredicate
 	: OPENBRACKET nodePredicateOr CLOSEBRACKET;

//<NodePredicateOr> ::= <NodePredicateAnd>
//                 | <NodePredicateOr> 'or' <NodePredicateAnd>
nodePredicateOr
 	: nodePredicateAnd (OR^ nodePredicateAnd)*;

//<NodePredicateAnd> ::= <NodePredicateComparable>
//                 | <NodePredicateAnd> 'and' <NodePredicateComparable>
nodePredicateAnd
 	: nodePredicateComparable (AND^ nodePredicateComparable)*;

//<NodePredicateComparable> ::= <PredicateOperand> ComparableOperator <PredicateOperand>
//                          | NodeId
//                          | NodeId ',' String        ! <NodeId> and name/value = <String> shortcut
//                          | NodeId ',' parameter     ! <NodeId> and name/value = <Parameter> shortcut
//                          | <NodePredicateRegEx>     ! /items[{/at0001.*/}], /items[at0001 and name/value matches {//}]
//                          | ArchetypeId
//                          | ArchetypeId ',' String        ! <NodeId> and name/value = <String> shortcut
//                          | ArchetypeId ',' parameter     ! <NodeId> and name/value = <Parameter> shortcut
nodePredicateComparable
 	: NODEID (COMMA^ (STRING|PARAMETER))?
 	| ARCHETYPEID (COMMA^ (STRING|PARAMETER))?
 	| predicateOperand ((COMPARABLEOPERATOR^ predicateOperand)|(MATCHES^ REGEXPATTERN))
        | REGEXPATTERN     //! /items[{/at0001.*/}], /items[at0001 and name/value matches {//}]
        ;

//<NodePredicateRegEx>    ::= RegExPattern
//                          | <PredicateOperand> 'matches' RegExPattern
nodePredicateRegEx
 	: REGEXPATTERN
 	| predicateOperand MATCHES^ REGEXPATTERN;

//<MatchesOperand> ::= <ValueListItems>
//			| UriValue
matchesOperand
 	: valueListItems | URIVALUE;

//! <ValueList> ::= '{'<ValueListItems>'}'
//<ValueListItems> ::= <Operand>
//                     |<Operand> ',' <ValueListItems>
valueListItems
 	: operand (','! operand)*;

//<URI>     ::= '{' UriValue '}'
uri 	: '{' URIVALUE '}';


//<ArchetypePredicate> ::= '[' ArchetypeId ']'
//			 | '[' Parameter ']'
//                      | '[' RegExPattern ']'
archetypePredicate
 	: OPENBRACKET (archetypeId|PARAMETER|REGEXPATTERN) CLOSEBRACKET;

archetypeId
	:	ARCHETYPEID;

//<VersionPredicate> ::= '[' <VersionPredicateOptions> ']'
versionPredicate
 	: OPENBRACKET versionPredicateOptions CLOSEBRACKET;

//<VersionPredicateOptions> ::= 'latest_version' | 'all_versions'
versionPredicateOptions
 	: 'latest_version' | ALL_VERSIONS;

//<StandardPredicate> ::= '[' <PredicateExpr> ']'
standardPredicate
 	: '['! predicateExpr ']'!;

//<PredicateExpr> ::= <PredicateOr>
predicateExpr
 	: predicateOr;

//<PredicateOr> ::= <PredicateAnd>
//                 | <PredicateOr> 'or' <PredicateAnd>
predicateOr
 	: predicateAnd (OR^ predicateAnd)*;
// 	: (predicateOr 'or')? predicateAnd; !!!

//<PredicateAnd> ::= <PredicateEquality>
//                 | <PredicateAnd> 'and' <PredicateEquality>
predicateAnd
 	: predicateEquality (AND^ predicateEquality)*;
// 	: (predicateAnd 'and')? predicateEquality; !!!

//<PredicateEquality> ::= <PredicateOperand> ComparableOperator <PredicateOperand>
predicateEquality
 	: predicateOperand COMPARABLEOPERATOR^ predicateOperand;

//<PredicateOperand> ::= !Identifier
//			!| Identifier PathItem
//                     | <ObjectPath>
//                     | <Operand>
predicateOperand
 	: objectPath | operand;

//<Operand> ::= String | Integer | Float | Date | Parameter | Boolean
operand: STRING | INTEGER | FLOAT | DATE | PARAMETER | BOOLEAN;


//<ObjectPath> ::=  <PathPart>
//                | <PathPart> '/' <ObjectPath>
objectPath
 	: pathPart ('/' pathPart)*;


//<PathPart> ::= Identifier
//           | Identifier <Predicate>
pathPart
 	: IDENTIFIER predicate?;

//<FromExpr> ::=  <SimpleClassExpr>
//		| <SimpleClassExpr> <ContainsExpr>
fromExpr: containsExpression;

contains:	simpleClassExpr (CONTAINS^ containsExpression)?;

//! Check thislass
//<ContainsExpr>::= 'CONTAINS' <ContainsExpression>
//                  !'CONTAINS' <ContainsOr>
//<ContainsExpression> ::= <ClassExpr>
//                        | <ContainExpressionBoolean>
//                        |'(' <ContainExpressionBoolean> ')'
containsExpression
 	: containExpressionBool (boolOp containsExpression)?
// 	| '(' containExpressionBool ')' -> ^(OPEN containExpressionBool)
        ;

//<ContainExpressionBoolean> ::= <ContainsExpression> 'OR' <ContainsExpression>
//                              | <ContainsExpression> 'AND' <ContainsExpression>
//                              | <ContainsExpression> 'XOR' <ContainsExpression>
containExpressionBool
 	: contains
 	| '(' containsExpression ')' -> ^(OPEN containsExpression CLOSE);

boolOp	:	OR|XOR|AND;

//<ClassExpr>::=   <SimpleClassExpr>
//		   | '(' <SimpleClassExpr> <ContainsExpr> ')'
//		   | <SimpleClassExpr> <ContainsExpr>
//classExpr
// 	: '(' simpleClassExpr ')'
//	| simpleClassExpr
//	;

//<SimpleClassExpr>::= Identifier							! RM_TYPE_NAME
//               | Identifier Identifier					! RM_TYPE_NAME variable
//               | <ArchetypedClassExpr>
//		 | <VersionedClassExpr>
//		 | <VersionClassExpr>
//		 ! | <IdentifiedObjectExpr>                           ! need to be used once VersionedClassExpr is removed
simpleClassExpr
	: IDENTIFIER IDENTIFIER?					//! RM_TYPE_NAME .. RM_TYPE_NAME variable
        | archetypedClassExpr
        | versionedClassExpr
	| versionClassExpr;
//		 ! | <IdentifiedObjectExpr>                           ! need to be used once VersionedClassExpr is removed

//<ArchetypedClassExpr>::= Identifier <ArchetypePredicate>	! RM_TYPE_NAME [archetype_id]
//               | Identifier Identifier <ArchetypePredicate>	! RM_TYPE_NAME variable [archetype_id]
archetypedClassExpr
 	: IDENTIFIER^ IDENTIFIER? archetypePredicate;	//! RM_TYPE_NAME variable? [archetype_id]

//! need to be used once VersionedClassExpr is removed
//!<IdentifiedObjectExpr>::= Identifier <StandardPredicate>	! RM_TYPE_NAME [path operator operand]
//!               | Identifier Identifier <StandardPredicate>	! RM_TYPE_NAME variable [path operator operand]
//<VersionedClassExpr>::= 'VERSIONED_OBJECT'
//               | 'VERSIONED_OBJECT' Identifier
//               | 'VERSIONED_OBJECT' <StandardPredicate>
//               | 'VERSIONED_OBJECT' Identifier <StandardPredicate>
versionedClassExpr
 	: VERSIONED_OBJECT^ IDENTIFIER? standardPredicate?;

//<VersionClassExpr>::= 'VERSION'
//               | 'VERSION' Identifier
//               | 'VERSION' <StandardPredicate>
//               | 'VERSION' Identifier <StandardPredicate>
//		 | 'VERSION' <VersionPredicate>
//               | 'VERSION' Identifier <VersionPredicate>
versionClassExpr
 	: VERSION^ IDENTIFIER? (standardPredicate|versionPredicate)?;

//
// LEXER PATTERNS
//

WS  :   ( ' '
        | '\t'
        | '\r'
        | '\n'
        ) {$channel=HIDDEN;}
    ;

SELECT : ('S'|'s')('E'|'e')('L'|'l')('E'|'e')('C'|'c')('T'|'t') ;
TOP : ('T'|'t')('O'|'o')('P'|'p') ;
FORWARD : ('F'|'f')('O'|'o')('R'|'r')('W'|'w')('A'|'a')('R'|'r')('D'|'d') ;
BACKWARD : ('B'|'b')('A'|'a')('C'|'c')('K'|'k')('W'|'w')('A'|'a')('R'|'r')('D'|'d') ;
AS : ('A'|'a')('S'|'s') ;
CONTAINS : ('C'|'c')('O'|'o')('N'|'n')('T'|'t')('A'|'a')('I'|'i')('N'|'n')('S'|'s') ;
WHERE : ('W'|'w')('H'|'h')('E'|'e')('R'|'r')('E'|'e') ;
ORDERBY : ('O'|'o')('R'|'r')('D'|'d')('E'|'e')('R'|'r')(' ')('B'|'b')('Y'|'y') ;
FROM : ('F'|'f')('R'|'r')('O'|'o')('M'|'m') ;
DESCENDING : ('D'|'d')('E'|'e')('S'|'s')('C'|'c')('E'|'e')('N'|'n')('D'|'d')('I'|'i')('N'|'n')('G'|'g') ;
ASCENDING : ('A'|'a')('S'|'s')('C'|'c')('E'|'e')('N'|'n')('D'|'d')('I'|'i')('N'|'n')('G'|'g') ;
DESC : ('D'|'d')('E'|'e')('S'|'s')('C'|'c') ;
ASC : ('A'|'a')('S'|'s')('C'|'c') ;
EHR : 'EHR';
AND : ('A'|'a')('N'|'n')('D'|'d') ;
OR : ('O'|'o')('R'|'r') ;
XOR : ('X'|'x')('O'|'o')('R'|'r') ;
NOT : ('N'|'n')('O'|'o')('T'|'t') ;
MATCHES : ('M'|'m')('A'|'a')('T'|'t')('C'|'c')('H'|'h')('E'|'e')('S'|'s') ;
EXISTS: ('E'|'e')('X'|'x')('I'|'i')('S'|'s')('T'|'t')('S'|'s') ;
VERSION	:	'VERSION';
VERSIONED_OBJECT	:	'VERSIONED_OBJECT';
ALL_VERSIONS
	:	'all_versions';

fragment
ESC_SEQ
    :   '\\' ('b'|'t'|'n'|'f'|'r'|'\"'|'\''|'\\')
    |   UNICODE_ESC
    |   OCTAL_ESC
    ;

fragment
OCTAL_ESC
    :   '\\' ('0'..'3') ('0'..'7') ('0'..'7')
    |   '\\' ('0'..'7') ('0'..'7')
    |   '\\' ('0'..'7')
    ;

fragment
UNICODE_ESC
    :   '\\' 'u' HEX_DIGIT HEX_DIGIT HEX_DIGIT HEX_DIGIT
    ;

fragment
HEX_DIGIT : ('0'..'9'|'a'..'f'|'A'..'F') ;

QUOTE	:	'\'';

fragment
DIGIT	:	'0'..'9';

fragment
HEXCHAR	:	 DIGIT|'a'|'A'|'b'|'B'|'c'|'C'|'d'|'D'|'e'|'E'|'f'|'F';

fragment
LETTER
	:	'a'..'z'|'A'..'Z';

fragment
ALPHANUM
	:	LETTER|DIGIT;

fragment
LETTERMINUSA
	:	'b'..'z'|'B'..'Z';

fragment
LETTERMINUST
	:	'a'..'s'|'A'..'S'|'u'..'z'|'U'..'Z';

fragment
IDCHAR	:	ALPHANUM|'_';

fragment
IDCHARMINUST
	:	LETTERMINUST|DIGIT|'_';

fragment
URISTRING
	:	ALPHANUM|'_'|'-'|'/'|':'|'.'|'?'|'&'|'%'|'$'|'#'|'@'|'!'|'+'|'='|'*';

fragment
REGEXCHAR
	:	URISTRING|'('|')'|'\\'|'^'|'{'|'}'|']'|'[';


// Terminal Definitions

//Boolean     = 'true' | 'false'
BOOLEAN	:	'true' | 'false' | 'TRUE' | 'FALSE' ;

//!NodeId      = 'a''t'{Digit}{Digit}{Digit}{Digit}
//! conflict with Identifier
//!NodeId	     = 'at'({Digit}{Digit}{Digit}{Digit}('.0'*('.'{NonZeroDigit}{Digit}*)+|('.'{NonZeroDigit}{Digit}*)*)|'0''.0'*('.'{NonZeroDigit}{Digit}*)+|('.'{NonZeroDigit}{Digit}*)+)
//NodeId	     = 'at'({Digit}+('.'{Digit}+)*)
//NODEID	:	'at' DIGIT+ ('.' DIGIT+)*;
NODEID	:	'at' DIGIT+ ('.' DIGIT+)*; // DIGIT DIGIT DIGIT DIGIT;

//!Identifier  = {Letter}({Alphanumeric}|'_')*   ! Conflicts with UID
//!Identifier  = {Letter}{IdChar}*   ! Conflicts with extended NodeId
//! restricted to allow only letters after the 4th character due to conflict with extended NodeId
//!Identifier  = {Letter}{IdChar}?{IdChar}?{IdChar}?({Letter}|'_')*  !Conficts with NodeId which may have any length of digit, such as at0.9
//Identifier = {LetterMinusA}{IdCharMinusT}?{IdChar}* | 'a''t'?(({letter}|'_')*|{LetterMinusT}{Alphanumeric}*)
// ???
IDENTIFIER
	:	('a'|'A') (ALPHANUM|'_')*
	| 	LETTERMINUSA IDCHAR*
	;

//!PathItem = '/'{Letter}({Alphanumeric}|'_')*

//Integer     = {Digit}+
INTEGER	:	'-'? DIGIT+;

//Float       = {Digit}+'.'{Digit}+
FLOAT	:	'-'? DIGIT+ '.' DIGIT+;

//Date        = ''{Digit}{Digit}{Digit}{Digit}'-'{Digit}{Digit}'-'{Digit}{Digit}''
DATE	:	'\'' DIGIT DIGIT DIGIT DIGIT DIGIT DIGIT DIGIT DIGIT 'T' DIGIT DIGIT DIGIT DIGIT DIGIT DIGIT '.' DIGIT DIGIT DIGIT '+' DIGIT DIGIT DIGIT DIGIT '\'';

//!Parameter   = '$'{letter}({Alphanumeric}|'_')*
//Parameter   = '$'{letter}{IdChar}*
PARAMETER
	:	'$' LETTER IDCHAR*;

//! could constrain UID further
//UniqueId    = {digit}+('.'{digit}+)+'.'{digit}+  ! OID
//          | {Hex Char}+('-'{Hex Char}+)+       ! UUID
UNIQUEID:	DIGIT+ ('.' DIGIT+)+ '.' DIGIT+  // OID
            | HEXCHAR+ ('-' HEXCHAR+)+       // UUID
	;

//! could constrain ArchetypeId further
//!ArchetypeId = {Letter}+'-'{Letter}+'-'({Letter}|'_')+'.'({Letter}|'_'|'-')+'.v'{Digit}+('.'{Digit}+)?  ! not allow a number in archetype id concept, such as openEHR-EHR-OBSERVATION.laboratory-hba1c.v1
//ArchetypeId = {Letter}+'-'{Letter}+'-'({Letter}|'_')+'.'({IdChar}|'-')+'.v'{Digit}+('.'{Digit}+)?
ARCHETYPEID
	:	LETTER+ '-' LETTER+ '-' (LETTER|'_')+ '.' (IDCHAR|'-')+ '.v' DIGIT+ ('.' DIGIT+)?
	;

//ComparableOperator = '=' | '!=' | '>' | '>=' | '<' | '<='
COMPARABLEOPERATOR
	:	'=' | '!=' | '>' | '>=' | '<' | '<='
	;

//UriValue   = {Letter}+'://'({UriString}|'['|']'|', '''|'')*
//            |{Letter}+':'({UriString}|'['|']'|'')*
URIVALUE: LETTER+ '://' (URISTRING|'['|']'|', \''|'\'')*
//	| LETTER+ ':' (URISTRING|'['|']'|'\'')*
        ;

//RegExPattern = '{/'{RegExChar}+'/}'
REGEXPATTERN
	:	'{/' REGEXCHAR+ '/}';

//String      = '"'{String Char}*'"'
  //          | ''{String Char}*''
STRING
    	:  '\'' ( ESC_SEQ | ~('\\'|'\'') )* '\''
    	|  '"' ( ESC_SEQ | ~('\\'|'"') )* '"'
    	;

SLASH	:	'/';

COMMA	:	',';

OPENBRACKET
	:	'[';

CLOSEBRACKET
	:	']';

OPEN	:	'(';
CLOSE	:	')';