
Fundamentals of the SELECT Statement

Retrieving column data and constant values

The common syntax of the tQL Select clause is:

SELECT <select_list>

<select_list> ::= { * | { key_name | prefix.key_name | nested_obj.nested_key | const_field
| expression | script(painless_script) } [ AS key_alias ] }

Specific columns are retrieved from the index by listing the index's key names in the Select clause. tQL also supports alias names for better readability and for naming calculated fields.

The simplest form of the select statement is:


Select * from ticket limit 0, 100

Always use limit when you want to preview data samples from the index; it makes the query execute faster.

Usually, the select statement contains a specific list of keys that the user needs to retrieve from the index:

select typeCode as "PEW type"
, id as "PEW count"
, tasks.id
, chars.project /*prefix key*/
, custom.worklogs.workerId /*nested obj*/
, '2' as constant
, script('if (doc["closedDate"].size() !=0)
doc["closedDate"].value;
else "2022-03-04T01:16:43.046+0000"
') as closedDate /*painless script*/
from ticket
limit 2
PEW type  PEW count  tasks.id  chars.project  custom.worklogs.workerId  constant  closedDate
Subtask  d60a76d4-7bfd-44aa-8aed-bbe956a89b8a  [ "Activity_12lmqzd:26d59f31-8fd6-11ec-a49e-1646d069437a" ]  null  null  2  2022-03-04T01:16:43.046+0000
Chyba  d62a2456-80c7-462a-beb8-ae5cfa9a1f6b  [ "Activity_12lmqzd:3946ae33-8ff7-11ec-a49e-1646d069437a" ]  null  null  2  2021-08-20T10:36:17.000Z

Calculate and/or transform data

For linear data calculation, tQL supports simple arithmetic expressions (+, -, *, /) and more powerful painless scripts.

select typeCode as "PEW type", 
5+7 as test,
chars.marketing.number/2 as testCalculation,
script('ZonedDateTime now;
if (doc["creationDate"].size() !=0)
(System.currentTimeMillis()-doc["creationDate"].value.toInstant().toEpochMilli())/1000/60;
else 0') + ' min' as Duration
from ticket
where
chars.marketing.number is not null
limit 1
PEW type test testCalculation Duration
Osmid.Testovaci.1 12 3.0525 59038 min

Quickly find specific data by filtering on specific criteria (conditions)

tQL supports Where clauses in the standard form with the operators and, or, not, in, between, like, =, !=, >, <, >=, <=.

Where <search_condition>

<search_condition> ::=
[ NOT ] <predicate>
[ { AND | OR } [ NOT ] { <predicate> } ]
[ ...n ]

<predicate> ::=
{ expression { = | <> | != | > | >= | !> | < | <= | !< } expression
| string_expression [ NOT ] LIKE string_expression
| expression [ NOT ] BETWEEN expression AND expression
| expression [ NOT ] IN (expression, expression, …)
| expression IS [ NOT ] NULL }
select typeCode as "PEW type",
script('
if (doc["closedDate"].size() !=0)
doc["closedDate"].value;
else "2022-03-04T01:16:43.046+0000"
') as closedDate
from ticket
where typeCode in ('Ukol', 'UserStory')
limit 2
PEW type  closedDate
Standardni  2022-03-04T01:16:43.046+0000
Standardni  2022-03-04T01:16:43.046+0000

The is null and is not null operators are used to filter on null values.

select typeCode as "PEW type"
, id as "PEW count"
, chars.project
from ticket
where chars is not null and typeCode like 'Chyba%'
PEW type  PEW count  chars.project
Chyba  188cccef-2010-4255-b1b9-7b1cfac40b7e  TSMTTS
Chyba  ee860d3f-ea86-44a3-b768-33d7f395fd30  TSMTTS

like operator

The like operator behaves similarly to the standard SQL like; you can use both % and * characters to match any substring.

  • where name like 'cat' - is the same as name = 'cat'
  • where name like 'cat%' - finds all records starting with 'cat'; this creates a prefix query with optimal performance.
  • where name like '%cat%' - finds all records containing 'cat'. Please note that this query is quite resource intensive.
  • where name fuzzy like 'cat' - finds all documents with a name similar to 'cat'.
  • where name regexp like 'cat[0-9]*' - finds all documents matching the regexp. Please note that this query is very resource intensive.
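
For illustration, a minimal sketch of the fuzzy variant in a full query (the name values matched are assumed sample data, not from the reference dataset):

```sql
/* finds names within a small edit distance of 'cat', e.g. 'cot' or 'cart' */
select key, name
from ticket
where name fuzzy like 'cat'
limit 10
```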

concat

Concatenate multiple strings.

where name like concat('%', :search, '%')
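
A complete query using concat might look like the following sketch, assuming :search is a named parameter bound by the caller (for example 'cat'):

```sql
/* :search is a named parameter supplied at execution time */
select key, name
from ticket
where name like concat('%', :search, '%')
limit 10
```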

Below are several examples of using where conditions in simple queries based on the sample dataset.

Find all tickets whose status contains "Uzavřen", whose priority is "Kritická" or "Nejnižší", and that were created between '2025-01-01' and '2025-10-01'.

select key, statusName, priorityName, format(creationDate, 'dd.MM.yy') as "creation date"
from tickettasklist
where statusName like '%Uzavřen%'
and priorityName in ('Kritická', 'Nejnižší')
and creationDate between '2025-01-01T18:23:30.181+0000' and '2025-10-01T18:23:30.181+0000'
key statusName priorityName Creation Date
POC3-5 Uzavřen Kritická 04.01.25
ADR_TIK1 Uzavřen Kritická 08.01.25
JBP-1 Uzavřen Nejnižší 07.01.25

Find all tickets where name contains "Test" or "Fix" and chars is null. Show only 5 results starting from the 5th record.

select key, name, statusname, chars from ticket
where (name like '%Test%' or name like '%Fix%')
and chars is null
limit 5,5;

key name statusName chars
ZE_4 test exportu Nový
ZE_1 test Nový
WEATHER-1 SpickaTest Nový
TICK_6 Test AAA Nový
TICK_7 Spicka test - ticket Nový

NOTICE. Retrieving all tickets that contain tasks returns a WRONG result this way, because tasks is a nested object. The right way to filter by nested object fields is shown in the next chapter.

select key, name, statusname, chars , tasks.id
from ticket
where tasks is null
limit 1

key  name  statusName  tasks.id
ZE_3  test  Nový  ["0a7e5263-3455-11ef-a796-3e5d7e594a64"]

Retrieve all tickets that contain additional information about projects. As you can see, selecting both individual properties (tasks.id, chars.AdrCharakteristikaJSON.AdrCharakteristikaForm) and whole objects (tasks, chars) is allowed.

select key, name, statusname, chars , tasks 
from ticket
where chars.adrcharakteristikajson.adrcharakteristikaform like 'Adr charakterist%';
key name statusName chars tasks
ADR_12393  1346 s Adr charakteristikou  Uzavřen  { "AdrCharakteristikaJSON": { "AdrCharakteristikaForm": "Adr charakteristika 1346" } }  [ { "dueDate": null, "description": "<p></p>", "userGroupId": null, "skills": [], "activityId": "Activity_1ed4tpv", "routing": null, "taskType": "USER_TASK", "plannedStartTime": null, "tsmTaskDefinitionCode": null, "commitedStartTimeTo": null, "startTime": "2024-12-16T12:47:04.069+0000", "id": "Activity_1ed4tpv:d9dc9bfc-bbab-11ef-a07c-063c6f009893", "_timestamp": null, "processInstanceId": "d9db6377-bbab-11ef-a07c-063c6f009893", "processDefinitionId": "6822ecc6-bba2-11ef-a07c-063c6f009893", "userGroupCode": null, "entityType": "HISTORIC_ACTIVITY_INSTANCE", "module": "ticket", "custom": null, "active": false, "history": [], "charsTyped": null, "userId": "d83df3cd-15cf-46b0-bbbd-d4f30409896a", "commitedStartTimeFrom": null, "taskDefinitionKey": "Activity_1ed4tpv", "executionId": "d9db6377-bbab-11ef-a07c-063c6f009893", "canceled": false, "taskSpecification": null, "plannedEndTime": null, "name": "Řešení", "wfmUserId": null, "endTime": "2024-12-16T12:57:27.084+0000", "incident": null, "chars": null, "status": null, "followupDate": null } ]

Aggregation

A multi-value metrics aggregation that computes stats over numeric values extracted from the aggregated documents. These values can be extracted either from specific numeric fields in the documents or be generated by a provided script.

The stats that are returned consist of: min, max, sum, count, avg, first and collect.

avg

A single-value metrics aggregation that computes the average of numeric values that are extracted from the aggregated documents.

select avg(custom.worklogs.duration)
from nested(ticket, custom.worklogs)
group by custom.worklogs[]

count

The COUNT() function returns the number of rows that match a specified criterion.

select count(id) from ticket
count_id
4755

first

Returns the first non-null value (if such exists) of the field_name input column sorted by the ordering_field_name column. If ordering_field_name is not provided, only the field_name column is used for the sorting. tQL gets the results from the aggregation and then selects the first row from the bucket and returns the required column.
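
A minimal sketch of the two-argument form, assumed from the description above (name and creationDate are sample keys from earlier examples):

```sql
/* first non-null name per status, sorted by creationDate */
select statusName, first(name, creationDate) as firstName
from ticket
group by statusName
```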

collect

Returns collection of the field_name input column.

select collect(keyNormalized), customerSegmentId from customer group by customerSegmentId

max

A single-value metrics aggregation that keeps track and returns the maximum value among the numeric values extracted from the aggregated documents. These values can be extracted either from specific numeric fields in the documents or be generated by a provided script.

min

A single-value metrics aggregation that keeps track and returns the minimum value among the numeric values extracted from the aggregated documents. These values can be extracted either from specific numeric fields in the documents or be generated by a provided script.

sum

A single-value metrics aggregation that sums up numeric values that are extracted from the aggregated documents. These values can be extracted either from specific numeric fields in the documents or be generated by a provided script.

select  sum(custom.worklogs.duration) as Sum_val
, avg(custom.worklogs.duration) as Average
, max(custom.worklogs.duration) as Maximum
, min(custom.worklogs.duration) as Minimum
, first(custom.worklogs.duration) as First_val
, count(custom.worklogs.duration) as Count_val
from nested(ticket, custom.worklogs)
group by custom.worklogs[]
Sum_val  Average  Maximum  Minimum  First_val  Count_val
2521293  160.95  11540  0  0  15665

histogram

This multi-bucket aggregation is similar to the normal histogram, but it can only be used with date or date range values. Like the histogram, values are rounded down to the closest bucket.

select format(histogram(creationDate, 'month'),'MMM yyyy') 
from ticket
group by histogram(creationDate, 'month')
calc2
Apr 2024
Jun 2024
Jul 2024
Sep 2024
Oct 2024
Nov 2024
Dec 2024
Jan 2025

Notice. The result is returned in milliseconds, which is why it is necessary to use the format function; see format.

Group by and having clause

Elasticsearch organizes aggregations into three categories:

  • Metric aggregations that calculate metrics, such as sum or average, from field values.
  • Bucket aggregations that group documents into buckets also called bins, based on field values, ranges, or other criteria.
  • Pipeline aggregations that take input from other aggregations instead of documents or fields.

For metric aggregations use aggregation functions in tQL. Bucket aggregations are realized in Group by clauses.

Group by clause supported syntax:

Group by <group_list>

<group_list> ::= { * | { key_name | prefix.key_name | nested_obj[].nested_key | const_field | expression | script(painless_script) } }

Notice that group_list must contain all select_list keys except aggregates. However, group_list may also contain additional keys if extra bucketing logic is necessary.
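
For instance, a sketch of a group_list with an additional bucketing key that is not in the select_list (typeCode here is an assumed extra key):

```sql
/* typeCode is used only for bucketing; it does not appear in the select_list */
select statusName, count(id) as "Tasks"
from ticket
group by statusName, typeCode
```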

A simple group by resembles the standard SQL:

select statusName, count(id) as "Tasks"
from ticket
group by statusName
order by statusName desc
limit 1,3
statusName Tasks
zpracovani 1
uzavren 12
novy 53
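
The having clause mentioned in this chapter's title filters the aggregated buckets; a minimal sketch, assuming standard SQL-like having semantics:

```sql
/* keep only statuses with more than 10 tickets */
select statusName, count(id) as "Tasks"
from ticket
group by statusName
having count(id) > 10
```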

It is possible to use group by with flattened fields; it is better to check beforehand that the key exists in the collection:

select chars.charsLog, count(id) as "Tasks"
from ticket
where chars.charsLog is not null
group by chars.charsLog
order by chars.charsLog desc
limit 1,3
chars.charsLog Tasks
zalozeni 1
vytvoreno 1
vyreseno 2

You can use group by to get distinct values, such as:

select statusName
from ticket
group by statusName
statusName
novy
uzavren
zpracovani

Ordering a collection of JSON documents by fields

The ORDER BY clause is used to sort the results of SELECT by one or more expressions: ORDER BY expression [ ASC | DESC ] [, ...], where the expression represents an input column or an output column.

Select typeCode as "PEW type"
, id as "PEW count"
, chars.project
from ticket
where creationDate between '2022-03-01T00:00:00.000+0000' and '2022-03-05T00:00:00.000+0000'
order by chars.project asc
Limit 5
PEW type  PEW count  chars.project
Story  e0b3ba37-cccd-4ee9-8e3e-1598bdec6a73  ADAPSCET
Story  ae1e9265-cda2-4d70-ba47-41c921e51fd7  ADAPSCET
Subtask  ca9ace47-28c2-4881-b8dc-c71820f0b9d9  CET
Subtask  15aa1cd5-aa3b-4961-8426-74eb82565971  CET
Story  1318e692-d78e-43cc-8ebe-81345e4ad155  CET
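
Multiple sort expressions can be combined; a sketch using keys from the examples above (the descending tiebreaker is an assumption, not from the reference output):

```sql
/* sort by project ascending, then by creationDate descending within each project */
select typeCode, chars.project, creationDate
from ticket
order by chars.project asc, creationDate desc
limit 5
```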