tQL - tSM Query Language
What is tQL?
tQL is a query language for querying tSM entities with ElasticSearch index support. It is similar to ElasticSearch SQL and OpenSearch SQL with some differences:
- Enhanced support for nested objects
- tQL consults tSM Column Definitions for data types and other configuration
- Post-processing of elastic result (calculations)
- Conversion of elastic result using tSM Converters (e.g. translate priorityId to priority name)
Please note that although tQL has a similar syntax to SQL, there are many differences; please see the Elasticsearch documentation for a detailed explanation.
tQL is used extensively in tSM:
- tSM Listings
- tSM Reporting and Dashboards (aggregations, charts)
- Dedicated tQL Console to run and debug queries
- API queries
- Access rules definition (filters)
- SLA calculation (filters)
Select Queries
tQL supports retrieving row/column data from existing indexes. Like other SQL-like languages, tQL uses the standard SELECT statement syntax with the main clauses SELECT, FROM, WHERE, GROUP BY, ORDER BY, and LIMIT.
SELECT select_list FROM index [ WHERE search_condition ] [ GROUP BY group_by_expression ] [ HAVING search_condition ] [ ORDER BY order_expression [ ASC | DESC ] [NULLS FIRST | LAST] ] [LIMIT max_doc_count]
The general execution of SELECT is as follows:
- All elements in the FROM list are computed. Currently, FROM supports exactly one index; however, note that the index name can be a pattern.
- If the WHERE clause is specified, all rows that do not satisfy the condition are eliminated from the output.
- If the GROUP BY clause is specified, or if there are aggregate function calls, the output is combined into groups of rows that match one or more values, and the results of aggregate functions are computed. If the HAVING clause is present, it eliminates groups that do not satisfy the given condition.
- The actual output rows are computed using the SELECT output expressions for each selected row or row group.
- If the ORDER BY clause is specified, the returned rows are sorted in the specified order. If ORDER BY is not given, the rows are returned in whatever order the system finds the fastest to produce.
- If LIMIT is specified, the SELECT statement returns only a subset of the result rows.
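As a sketch, the execution steps above can be combined in one query. Field names are taken from the sample documents shown later in this document; the literal values and thresholds are illustrative only:

```sql
-- Count in-progress tickets per type, keep only frequent types,
-- sort by type in descending order, return at most 20 rows.
select typeCode, count(id) as "Tickets"
from ticket
where statusCode = '1_INPROGRESS'
group by typeCode
having count(id) > 10
order by typeCode desc
limit 20
```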
tQL supports all general types of queries:
- Retrieving column data and constant values from a table
- Calculating or summarizing data
- Quickly finding specific data by filtering on criteria (conditions)
- Ordering collections of JSON documents by fields
For the examples below, we use a collection of JSON documents like this:
{
  "id": "e0b3ba37-cccd-4ee9-8e3e-1598bdec6a73",
  "key": "ADAPSCET-153",
  "name": "Podpora a exekuce testování aplikací DL 2022",
  "typeId": "4ee6f437-3c32-4e0b-be11-afc189be5659",
  "typename": "Story",
  "statusId": "f4961557-06c0-4b5c-bdc1-2a76efc1c2e2",
  "statusCode": "1_INPROGRESS",
  "statusName": "In progress",
  "priorityId": "6e71a823-806e-3d03-11ee-0f0e2d55061d",
  "priorityName": "Major",
  "creationDate": "2022-03-04T01:16:43.046+0000",
  "creationUserId": "0511ca9b-e20f-43c5-b59c-f864385b46d7",
  "creationGroupId": null,
  "tasks": [
    {
      "dueDate": null,
      "description": "<p></p>",
      "skills": [],
      "endTime": null,
      "incident": null,
      "chars": {},
      "status": null,
      "followupDate": null
    }
  ],
  "dataTags": [],
  "process": {
    "processDefinitionVersionTag": "0",
    "code": "Ticket-Story",
    "version": 9
  },
  "custom": {
    "worklogs": [
      {
        "duration": 120,
        "workerId": "0511ca9b-e20f-43c5-b59c-f864385b46d7",
        "endDate": "2022-01-03T03:51:00.000+0000",
        "id": "af45d2bb-29a1-4276-a784-2644dffab1ca",
        "type": null,
        "startDate": "2022-01-03T01:51:00.000+0000"
      }
    ]
  },
  "chars": {
    "project": "ADAPSCET"
  }
}
Retrieving column data and constant values
Common syntax of the tQL Select clause is:
SELECT <select_list>
<select_list> ::= { * | { key_name | prefix.key_name | nested_obj.nested_key | const_field
| expression | script(painless_script) } [ AS key_alias ] }
To retrieve specific columns from the index, list the index's key names in the Select clause. tQL also supports alias names for better presentation or for naming calculated fields.
The simplest select statement form is:
Select * from ticket limit 100
Always use limit when you want to preview data samples from the index; the query executes faster.
Usually the select statement contains a specific list of keys that the user needs to retrieve from the index:
select typeCode as "PEW type"
, id as "PEW count"
, tasks.id
, chars.project /*prefix key*/
, custom.worklogs.workerId /*nested obj*/
, '2' as constant
, script('if (doc["closedDate"].size() !=0)
doc["closedDate"].value;
else "2022-03-04T01:16:43.046+0000"
') as closedDate /*painless script*/
from ticket
limit 2
PEW type | PEW count | tasks.id | chars.project | custom.worklogs.workerId | constant | closedDate |
Subtask | d60a76d4-7bfd-44aa-8aed-bbe956a89b8a | [ "Activity_12lmqzd:26d59f31-8fd6-11ec-a49e-1646d069437a" ] | null | null | 2 | 2022-03-04T01:16:43.046+0000 |
Chyba | d62a2456-80c7-462a-beb8-ae5cfa9a1f6b | [ "Activity_12lmqzd:3946ae33-8ff7-11ec-a49e-1646d069437a" ] | null | null | 2 | 2021-08-20T10:36:17.000Z |
Calculate and/or transform data
For linear data calculation, tQL supports simple arithmetical expressions +, -, *, / and more powerful painless scripts.
select typeCode as "PEW type"
, script('if (doc["closedDate"].size() !=0)
doc["closedDate"].value;
else "2022-03-04T01:16:43.046+0000"
') as closedDate
, typeCode+ ', priority: '+ priorityName
from ticket
limit 3
PEW type | closedDate | calc5 |
Ukol | 2022-03-10T09:35:51.000Z | Ukol, priority: Major |
Subtask | 2022-03-09T12:25:57.000Z | Subtask, priority: Major |
Subtask | 2022-03-08T08:39:03.000Z | Subtask, priority: Major |
Quickly finding specific data by filtering on criteria (conditions)
tQL supports Where clauses in the standard form with the operators and, or, not, in, between, like, =, !=, >, <, >=, <=.
Where <search_condition>
<search_condition> ::=
[ NOT ] <predicate>
[ { AND | OR } [ NOT ] { <predicate> } ]
[ ...n ]
<predicate> ::=
{ expression { = | <> | != | > | >= | !> | < | <= | !< } expression
| string_expression [ NOT ] LIKE string_expression
| expression [ NOT ] BETWEEN expression AND expression
| expression [ NOT ] IN (expression, expression, …)
| expression IS [ NOT ] NULL
select typeCode as "PEW type",
script('
if (doc["closedDate"].size() !=0)
doc["closedDate"].value;
else "2022-03-04T01:16:43.046+0000"
') as closedDate
from ticket
where typeCode in ('Ukol', 'UserStory')
limit 2
PEW type | closedDate |
locality | 2022-03-04T01:16:43.046+0000 |
locality | 2022-03-04T01:16:43.046+0000 |
The operators is null and is not null are used for null value filters.
Select typeCode as "PEW type"
, id as "PEW count"
, chars.project
from ticket
where chars.project is not null and typeCode like 'Chyba%'
PEW type | PEW count | chars.project |
Chyba | 188cccef-2010-4255-b1b9-7b1cfac40b7e | TSMTTS |
Chyba | ee860d3f-ea86-44a3-b768-33d7f395fd30 | TSMTTS |
Note that flattened predicates are not supported by default. If this feature is needed, please contact support to change the type of the listing's field to flattened.
Group by and having clause
Elasticsearch organizes aggregations into three categories:
- Metric aggregations that calculate metrics, such as sum or average, from field values.
- Bucket aggregations that group documents into buckets, also called bins, based on field values, ranges, or other criteria.
- Pipeline aggregations that take input from other aggregations instead of documents or fields.
For metric aggregations, use aggregation functions in tQL. Bucket aggregations are realized by the Group by clause.
Group by clause supported syntax:
Group by <group_list>
<group_list> ::= { * | { key_name | prefix.key_name | nested_obj[].nested_key | const_field | expression | script(painless_script) }
Notice that group_list must contain all select_list keys except aggregates. However, group_list may also contain additional keys if further bucketing logic is necessary.
A simple group by resembles the standard SQL:
select chars.project, count(id) as "Tasks"
from ticket
group by chars.project
A nested Group by with a calc field
When nested object fields are grouped, use the bracket syntax [ ], e.g. nestedObjectName[].fieldName, in the Group by clause.
select custom.worklogs.workerId,
sum('doc["custom.worklogs.duration"].value / 60.0 / 8') as "Worklog MD"
from ticket
group by custom.worklogs[].workerId
It is also possible to combine nested and non-nested fields:
select converted priorityId as priority
, statusId as Status
, count(custom.worklogs.workerId)
from ticket
group by priorityId, statusId, custom.worklogs[]
order by statusId desc
Filtering nested objects by Group by
There are three ways to apply filtering with the Group by clause:
- The Where clause, which as usual supports all predicates, to create the range of documents for bucketing
- The Having clause, to filter aggregation results of the sum, min, max, avg, and count functions
- A bucket filter for nested objects
The first one is simple:
select converted priorityId as priority
, statusId as Status
, count(custom.worklogs.workerId)
from ticket
where priorityId is not null and statusId is not null
group by priorityId, statusId, custom.worklogs[]
limit 5
priority | Status | Worklogs - workerId |
Major | New | 0 |
Major | Canceled | 24 |
Major | Postponed | 60 |
Major | Closed | 5288 |
Major | In progress | 5948 |
The second one should only be used for aggregation results:
select converted priorityId as priority
, statusId as status
, count(id) as worklogs
from ticket
Where priorityId is not null and statusId is not null
group by priorityId, statusId, custom.worklogs[]
Having count(id) >60
The third one, the bucket filter for nested objects, gives two different results for nested field aggregation depending on where the filter is placed.
If we use the filter in the Where clause, tQL selects the range of documents whose nested objects match the conditions and then calculates aggregates over all nested values in this range.
select converted priorityId as priority
, statusId as status
, custom.worklogs.workerId
, count(custom.worklogs.duration) as worklogs
from ticket
Where custom.worklogs[workerId like '0511ca9b-e20f-43c5-b59c-f864385b46d7']
group by priorityId, statusId, custom.worklogs[].workerId
priority | status | Worklogs - workerId | worklogs |
Major | Closed | Michal Jandura | 74 |
Major | Closed | Marek Schmidt | 19 |
Major | Closed | Jaroslav Lehar | 8 |
Major | Closed | Petr Kropík | 8 |
Major | Closed | Jan Tecl | 5 |
Major | Closed | Jira Synchronizace | 4 |
Major | Closed | Petr Žoček | 3 |
Major | Closed | Martin Tměj | 3 |
When we use a bucket filter in the Group by clause, only the nested values that satisfy the condition expression are aggregated:
select converted priorityId as priority
, statusId as status
, custom.worklogs.workerId
, count(custom.worklogs.duration) as worklogs
from ticket
Where custom.worklogs[workerId like '0511ca9b-e20f-43c5-b59c-f864385b46d7']
group by priorityId, statusId, custom.worklogs[workerId like '0511ca9b-e20f-43c5-b59c-f864385b46d7'].workerId
Priority | status | Worklogs - workerId | worklogs |
Major | Closed | Michal Jandura | 74 |
Major | In progress | Michal Jandura | 100 |
Blocker | Closed | Michal Jandura | 30 |
Minor | Closed | Michal Jandura | 14 |
Critical | Closed | Michal Jandura | 40 |
Ordering collection of JSON-doc by fields
The ORDER BY clause is used to sort the results of SELECT by one or more expressions: ORDER BY expression [ ASC | DESC ] [, ...], where the expression represents an input column or an output column.
Select typeCode as "PEW type"
, id as "PEW count"
, chars.project
from ticket
where creationDate between '2022-03-01T00:00:00.000+0000' and '2022-03-05T00:00:00.000+0000'
order by chars.project asc
Limit 5
PEW type | PEW count | chars.project |
Story | e0b3ba37-cccd-4ee9-8e3e-1598bdec6a73 | ADAPSCET |
Story | ae1e9265-cda2-4d70-ba47-41c921e51fd7 | ADAPSCET |
Subtask | ca9ace47-28c2-4881-b8dc-c71820f0b9d9 | CET |
Subtask | 15aa1cd5-aa3b-4961-8426-74eb82565971 | CET |
Story | 1318e692-d78e-43cc-8ebe-81345e4ad155 | CET |
Nested objects
The strength of ElasticSearch is the speed of searching documents of the same type. No join is generally available. To support common queries like:
- Give me all tickets where one of the tasks is assigned to me
- Give me the sum of worklogs on tickets of project ‘X’
- Give me all tasks on tickets of project ‘X’
You need to use nested objects in ElasticSearch. Please consult the ElasticSearch documentation.
We will use nested documents of worklogs inside the ticket index in the following example:
"custom": {
  "worklogs": [
    {
      "duration": 180,
      "workerId": "0b4b7f40-4731-4ffd-b1b2-8c4ec7030ff7",
      "endDate": "2022-01-14T19:16:00.000+0000",
      "id": "54a21df3-eb77-4ee0-87d5-d7bbf54127ea",
      "type": null,
      "startDate": "2022-01-14T16:16:00.000+0000"
    }
  ]
}
Return nested rows
The result of a standard query is always a list of selected entities (e.g. tickets). If you want to return nested documents, you need to use the special syntax “from nested(entity, nestedPath)”
select key, custom.worklogs.workerId, custom.worklogs.startDate, custom.worklogs.duration
from nested(ticket, custom.worklogs)
where key = 'TSMTTS-1384'
key | custom.worklogs.workerId | custom.worklogs.startDate | custom.worklogs.duration |
TSMTTS-1384 | d825efd0-9b9a-4a7c-baa5-a4e9a8d4d414 | 2021-12-13T08:48:00.000+0000 | 15 |
TSMTTS-1384 | 6f30f0d3-1074-45a9-bfe9-93f0c802a10a | 2022-01-04T07:12:00.000+0000 | 240 |
TSMTTS-1384 | 6f30f0d3-1074-45a9-bfe9-93f0c802a10a | 2022-01-03T03:13:00.000+0000 | 240 |
TSMTTS-1384 | 6c570ee6-139e-47b0-ac8f-8c9ae2d203ee | 2022-01-04T10:51:00.000+0000 | 30 |
TSMTTS-1384 | 6c570ee6-139e-47b0-ac8f-8c9ae2d203ee | 2022-01-06T06:47:00.000+0000 | 60 |
TSMTTS-1384 | d825efd0-9b9a-4a7c-baa5-a4e9a8d4d414 | 2022-01-20T08:13:00.000+0000 | 15 |
This query is executed as a special aggregation query with results from nested aggregation subquery in the background.
Nested in select
If you select a field from a nested document, it always returns an array of all values from the nested document:
select key, custom.worklogs.startDate, custom.worklogs.workerId
from ticket
where key = 'TSMTTS-1384'
key | custom.worklogs.startDate | custom.worklogs.workerId |
TSMTTS-1384 | 2021-12-13T08:48:00.000+0000,2022-01-04T07:12:00.000+0000,2022-01-03T03:13:00.000+0000,2022-01-04T10:51:00.000+0000,2022-01-06T06:47:00.000+0000,2022-01-20T08:13:00.000+0000 | d825efd0-9b9a-4a7c-baa5-a4e9a8d4d414,6f30f0d3-1074-45a9-bfe9-93f0c802a10a,6f30f0d3-1074-45a9-bfe9-93f0c802a10a,6c570ee6-139e-47b0-ac8f-8c9ae2d203ee,6c570ee6-139e-47b0-ac8f-8c9ae2d203ee,d825efd0-9b9a-4a7c-baa5-a4e9a8d4d414 |
But you may filter the results using the nested filter operator []:
select key, custom.worklogs[workerId = 'd825efd0-9b9a-4a7c-baa5-a4e9a8d4d414'].startDate, custom.worklogs.workerId
from ticket
where key = 'TSMTTS-1384'
Internally, this is translated to a script query:
select key, script('return
params._source["custom"]["worklogs"]
?.stream()
?.filter(
__collectionItem -> __collectionItem.workerId == "d825efd0-9b9a-4a7c-baa5-a4e9a8d4d414"
)
?.map(
__collectionItem -> __collectionItem.startDate
)
?.collect(Collectors.toList())
') as custom.worklogs.startDate
, custom.worklogs.workerId
from ticket
where key = 'TSMTTS-1384'
Both queries give the same result, with startDate filtered to the worklogs of one workerId:
key | custom.worklogs.startDate | custom.worklogs.workerId |
TSMTTS-1384 | 2021-12-13T08:48:00.000+0000 | d825efd0-9b9a-4a7c-baa5-a4e9a8d4d414,6f30f0d3-1074-45a9-bfe9-93f0c802a10a,6f30f0d3-1074-45a9-bfe9-93f0c802a10a,6c570ee6-139e-47b0-ac8f-8c9ae2d203ee,6c570ee6-139e-47b0-ac8f-8c9ae2d203ee,d825efd0-9b9a-4a7c-baa5-a4e9a8d4d414 |
Please note that because we did not filter the workerId column, it returns values from all nested records.
Notice: for empty filters, the nested indicator is optional in the select query. Both columns return the same value:
select custom.worklogs.workerId, custom.worklogs[].workerId from ticket
This is not the case for where / group by / order by clauses, where brackets are mandatory even for empty filters.
Nested in the WHERE condition
tQL contains special syntax for filtering in nested documents, similar to the filter in the select clause:
select key, custom.worklogs.startDate, custom.worklogs.workerId, custom.worklogs.duration
from ticket
where custom.worklogs[startDate is not null and workerId = 'd825efd0-9b9a-4a7c-baa5-a4e9a8d4d414']
key | custom.worklogs.startDate | custom.worklogs.workerId | custom.worklogs.duration |
TSMTTS-1390 | 2021-12-15T09:24:00.000+0000,2022-02-14T14:00:00.000+0000 | d825efd0-9b9a-4a7c-baa5-a4e9a8d4d414,d825efd0-9b9a-4a7c-baa5-a4e9a8d4d414 | 45,60 |
TSMTTS-1372 | 2021-12-09T05:30:00.000+0000,2021-12-08T10:00:00.000+0000,2021-12-10T13:08:00.000+0000 | d825efd0-9b9a-4a7c-baa5-a4e9a8d4d414,3255f56a-95aa-4c5f-90d9-e5925684ea55,d825efd0-9b9a-4a7c-baa5-a4e9a8d4d414 | 45,20,15 |
It is important to understand the difference between using nested filters in the where condition and in the select. The where condition filters “tickets”, not “worklogs”. If you want to find tickets with worklogs matching the filter and return only the matching worklog records, you need to combine both approaches:
select key,
custom.worklogs[startDate is not null and workerId = 'd825efd0-9b9a-4a7c-baa5-a4e9a8d4d414'].startDate,
custom.worklogs[startDate is not null and workerId = 'd825efd0-9b9a-4a7c-baa5-a4e9a8d4d414'].workerId,
custom.worklogs[startDate is not null and workerId = 'd825efd0-9b9a-4a7c-baa5-a4e9a8d4d414'].duration
from ticket
where custom.worklogs[startDate is not null and workerId = 'd825efd0-9b9a-4a7c-baa5-a4e9a8d4d414']
key | custom.worklogs.startDate | custom.worklogs.workerId | custom.worklogs.duration |
TSMTTS-1384 | 2021-12-13T08:48:00.000+0000,2022-01-20T08:13:00.000+0000 | d825efd0-9b9a-4a7c-baa5-a4e9a8d4d414,d825efd0-9b9a-4a7c-baa5-a4e9a8d4d414 | 15,15 |
TSMTTS-1390 | 2021-12-15T09:24:00.000+0000,2022-02-14T14:00:00.000+0000 | d825efd0-9b9a-4a7c-baa5-a4e9a8d4d414,d825efd0-9b9a-4a7c-baa5-a4e9a8d4d414 | 45,60 |
TSMTTS-1379 | 2021-12-13T14:18:00.000+0000 | d825efd0-9b9a-4a7c-baa5-a4e9a8d4d414 | 150 |
Nested in a Group by
If you want to group by a nested field, you always need to use the nested “[]” syntax:
select converted chars.project , statusName, priorityId,
sum(custom.worklogs.duration), count(custom.worklogs.id)
from ticket
group by statusName, chars.project, priorityId,
custom.worklogs[workerId = '04f316cd-4897-45e3-8145-7bf9e951c9a4']
Project | statusName | Priority | custom.worklogs.duration | custom.worklogs.id |
RMD | closed | Major | 0 | 0 |
SPM | closed | Major | 6090 | 50 |
SPM | closed | Blocker | 480 | 7 |
select custom.worklogs.workerId,
count(custom.worklogs.id) AS "Count worklogs", count(id) AS "Count tickets"
from ticket
group by custom.worklogs[].workerId
custom.worklogs.workerId | Count worklogs | Count tickets |
04f316cd-4897-45e3-8145-7bf9e951c9a4 | 381 | 75 |
0511ca9b-e20f-43c5-b59c-f864385b46d7 | 258 | 76 |
055b7d5c-e40c-4565-94d7-5cf620f1abe9 | 381 | 98 |
0b4b7f40-4731-4ffd-b1b2-8c4ec7030ff7 | 684 | 255 |
0e6cfa34-dc2b-4157-b1df-a55278621aa0 | 186 | 49 |
Also notice the different counts of nested documents (worklogs) and main documents (tickets). You may also want to filter the worklogs for grouping:
select custom.worklogs.workerId,
count(custom.worklogs.id) AS "Count worklogs", count(id) AS "Count tickets"
from ticket
group by custom.worklogs[workerId = '04f316cd-4897-45e3-8145-7bf9e951c9a4'].workerId
custom.worklogs.workerId | Count worklogs | Count tickets |
04f316cd-4897-45e3-8145-7bf9e951c9a4 | 381 | 75 |
This is not the same as filtering the tickets and then grouping (first select all tickets containing at least one worklog by this worker, then group by all worklogs from these tickets):
select custom.worklogs.workerId,
count(custom.worklogs.id) AS "Count worklogs", count(id) AS "Count tickets"
from ticket
where custom.worklogs[workerId = '04f316cd-4897-45e3-8145-7bf9e951c9a4']
group by custom.worklogs[].workerId
custom.worklogs.workerId | Count worklogs | Count tickets |
04f316cd-4897-45e3-8145-7bf9e951c9a4 | 381 | 75 |
0511ca9b-e20f-43c5-b59c-f864385b46d7 | 13 | 5 |
055b7d5c-e40c-4565-94d7-5cf620f1abe9 | 10 | 2 |
A more complex example:
select custom.worklogs.workerId,
round(sum(custom.worklogs.duration)/60/8, 1) AS "Sum duration MD"
from ticket
group by custom.worklogs[startDate > '2022-01-01T00:00:00.000+0000'].workerId
custom.worklogs.workerId | Sum duration MD |
6f30f0d3-1074-45a9-bfe9-93f0c802a10a | 618.9 |
d8769f98-9758-4fcd-80a2-410c9d915b52 | 92.5 |
e7f5865c-7719-4a14-beef-4bce2aae34f9 | 30.6 |
bd4cabfe-17b5-4d43-9020-7b8d04187567 | 74.5 |
Or even:
select custom.worklogs[].workerId,
format(histogram(custom.worklogs[startDate > '2022-01-01T00:00:00.000+0000'].startDate, 'month'), 'MM') as "Month",
round(sum(custom.worklogs.duration)/60/8, 1) AS "Sum duration MD"
from ticket
group by custom.worklogs[].workerId
, histogram(custom.worklogs[startDate > '2022-01-01T00:00:00.000+0000'].startDate, 'month')
custom.worklogs.workerId | Month | Sum duration MD |
6f30f0d3-1074-45a9-bfe9-93f0c802a10a | 01 | 118 |
e7f5865c-7719-4a14-beef-4bce2aae34f9 | 01 | 11.3 |
6c570ee6-139e-47b0-ac8f-8c9ae2d203ee | 01 | 16.6 |
d8769f98-9758-4fcd-80a2-410c9d915b52 | 01 | 16.4 |
Nested in order by
You can sort by a nested value in the same format as in where/group by:
select key,
format(custom.worklogs[workerId = '04f316cd-4897-45e3-8145-7bf9e951c9a4'].startDate, 'dd.MM.yyyy HH:mm') as "When",
custom.worklogs[workerId = '04f316cd-4897-45e3-8145-7bf9e951c9a4'].duration as "Duration minutes"
from ticket
where custom.worklogs[workerId = '04f316cd-4897-45e3-8145-7bf9e951c9a4']
order by custom.worklogs[workerId = '04f316cd-4897-45e3-8145-7bf9e951c9a4'].startDate
key | When | Duration minutes |
TSMTTS-868 | 14.07.2021 07:04 | 360 |
TSMTTS-850 | 15.07.2021 05:01, 29.07.2021 13:20, 30.07.2021 10:45, 04.08.2021 08:00 | 120,180,180,330 |
TSMTTS-881 | 15.07.2021 10:00, 16.07.2021 04:43, 19.07.2021 12:46, 21.07.2021 11:51 | 90,120,120,120 |
TSMTTS-883 | 15.07.2021 12:01, 16.07.2021 06:57, 19.07.2021 10:59, 20.07.2021 05:34, 20.07.2021 09:02, 21.07.2021 06:06, 02.08.2021 10:01 | 60,150,180,210,210,240,270 |
The query may look a bit tricky. You need to specify the filtering clause on several levels:
- where - select tickets that contain at least one worklog matching the criteria
- order by - order those tickets by the start date of the earliest matching worklog of each ticket
- select - return the start date/duration of all worklogs of the ticket, filtered by the criteria (internally calculated by an ElasticSearch painless script)
If this does not suit your needs, you probably want to return nested rows instead. For example:
select key, custom.worklogs.startDate, custom.worklogs.duration
from nested(ticket, custom.worklogs)
where custom.worklogs[workerId = '04f316cd-4897-45e3-8145-7bf9e951c9a4']
order by custom.worklogs.startDate
key | custom.worklogs.startDate | custom.worklogs.duration |
TSMTTS-868 | 2021-07-14T07:04:00.000+0000 | 360 |
TSMTTS-850 | 2021-07-15T05:01:00.000+0000 | 180 |
TSMTTS-881 | 2021-07-15T10:00:00.000+0000 | 120 |
TSMTTS-883 | 2021-07-15T12:01:00.000+0000 | 60 |
TSMTTS-888 | 2021-07-15T15:35:00.000+0000 | 120 |
Nested in nested
Post-Processing
All tQL queries run against the ElasticSearch index, and query results are then translated to tabular form. They are post-processed before being returned to the client.
Converted
The general expression for post-processing is “converted”.
select converted key, name, priorityId, statusId from ticket
key | name | priorityId | statusId |
TSMTTS-1430 | Display modification | Blocker | Closed |
TSMTTS-1429 | Another issue | Major | Closed |
In this case, the tSM Listing Column Definition is consulted for each column to find its display converter, and the found values are then translated (typical converters are “register value id” -> “register value name”, boolean to “Yes”/“No”, etc.).
Please note that the result is the same as if you applied the convert function to all fields:
select convert(key), convert(name), convert(priorityId), convert(statusId) from ticket
The conversion supports optional parameters:
select converted(withOrigins) * from ticket
- withOrigins - adds the original values as new columns at the end with the _origin prefix. It is useful for aggregation reports, where we need information about the original value before conversion.
- useConvertOnBackend - consults the column definition for the convertOnBackend flag and does the conversion only if the flag is set. We use this setting from the tSM UI application to select which fields should be converted on the backend (e.g. ID to name translation) and which on the frontend only (e.g. user icon).
- useExportFields - if exportField is defined in the listing column definition, this column is used instead of the requested field. We use it for export to Excel.
- skipConvert - switches off conversion. It can be used in combination with other parameters.
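As a sketch, the parameters can be combined; the assumption here (suggested by the withOrigins example above, and stated for skipConvert) is that multiple parameters are passed as a comma-separated list:

```sql
-- Hypothetical combination: skip the normal conversion but still
-- substitute export fields, e.g. when preparing an Excel export.
select converted(skipConvert, useExportFields) * from ticket limit 10
```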
Internals
If you execute a query like this:
select priorityCode, round(count(id) / 100, 1) from ticket group by priorityCode
priorityCode | calc4 |
| 0.8 |
blocker | 3.4 |
critical | 6.2 |
major | 33.6 |
What happens in the background is:
- internal column “__internalCalc1” is created for count(id)
- internal column “__internalCalc2” is created for “__internalCalc1” / 100
- internal column “__internalCalc3” is created for “round(__internalCalc2)”
- Only “__internalCalc1” makes it to the ElasticSearch query
- For each ElasticSearch result row, the internal calculation for __internalCalc2 and __internalCalc3 is evaluated
- The result is returned as calc4 (from __internalCalc3)
Functions
A list of functions supported in tQL.
Select transformation
script
Script functions support painless scripts for extended data transformation.
For additional information about painless see https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-scripting-painless.html
For example, it could be used simultaneously to check null values and transform data:
select typeCode as "PEW type"
, script('if (doc["closedDate"].size() !=0)
doc["closedDate"].value.millis / (60.0 * 10.0*60);
else "2022-03-04T01:16:43.046+0000"
') as closedDate
from ticket
limit 2
PEW type | closedDate |
| 2022-03-04T01:16:43.046+0000 |
Subtask | 45648841.277777776 |
script_array
Elasticsearch always returns script results as an array. The default tQL behavior is that if the array contains only one element, the extracted value is returned. But if you know that the result should be an array, you can force it with this function instead of the standard script function.
select typeCode as "PEW type"
,script_array(
'def array = new def[] {2, 3, 4, 6};
return array ') as closedDate
from ticket
where closedDate between '2022-03-04T01:16:43.046+0000' and
'2022-03-14T01:16:43.046+0000'
limit 30
PEW type | closedDate |
Ukol | 2,3,4,6 |
Subtask | 2,3,4,6 |
Subtask | 2,3,4,6 |
Subtask | 2,3,4,6 |
Subtask | 2,3,4,6 |
searchHit
searchHit() returns the whole object from ElasticSearch. This function is mainly used for internal tSM purposes, but you can also use it to inspect the whole JSON document content.
select searchHit() from ticket
{
  "_index": "dataliteticket_20220516_064313",
  "_type": "_doc",
  "_id": "null",
  "_score": 1.0,
  "_source": {
    "custom": {
      "worklogs": [
        {
          "_index": "dataliteticket_20220516_064313",
          "_type": "_doc",
          "duration": 60,
          "workerId": "04f316cd-4897-45e3-8145-7bf9e951c9a4",
          "endDate": "2021-09-16T12:16:0"
        }
      ]
    }
  }
}
JSON
Constructs a JSON object as the result; it can be assembled from multiple columns.
select
-- property names are generated automatically if not specified
json(key, name),
-- key can be identifier or double quoted string
json(myKey: key, "My complex key": name),
-- example of constant and complex value in json function call
json(name: 'myScript', scriptValue:
script('return params._source["tasks"]?.stream()?.collect(Collectors.toList())')
)
from ticket
Datediff
Returns the difference between two dates. The first parameter specifies the result units.
select converted key as "ID"
, tasks[].id
, tasks.history.userGroupCode
, dateDiff('ms', tasks[].startTime, first(tasks[].history[changes = 'USER_GROUP' and userId is not null].startTime)) as Date_Diff
from ticket
where tasks[].history[changes = 'USER_GROUP'] and key = 'PEW-3973'
group by key
, tasks[].id
, tasks[].history[changes = 'USER_GROUP'].userGroupCode
, tasks[].startTime
ID | history - user group | Date_Diff |
pew-3972 | ZPEW | 192 |
pew-3973 | ZPEW | 1596 |
pew-3974 | ZPEW | 911 |
pew-3975 | ZPEW | 1525 |
pew-3976 | ZPEW | 192 |
… | … | … |
You can use these unit values: 'mcs', 'ms', 'ss', 'mi', 'hh', 'dd', 'wi', 'mo', 'qq', and 'yyyy'.
Notice: the result is a truncated integer value of the time interval (for example, 22 h -> 0 d).
select datediff('mcs','2019-12-31T23:59:59.00+0000', '2020-01-01T00:00:00.00+0000') as Interval_mcs
, datediff('ms','2019-12-31T23:59:59.00+0000', '2020-01-01T00:00:00.00+0000') as Interval_ms
, datediff('ss','2019-12-31T23:59:59.00+0000', '2020-01-01T00:00:00.00+0000') as Interval_ss
, datediff('mi','2019-12-31T23:59:59.00+0000', '2020-01-01T00:00:00.00+0000') as Interval_mi
, datediff('hh','2019-12-31T23:59:59.00+0000', '2020-01-01T00:00:00.00+0000') as Interval_hh
, datediff('dd','2019-12-31T23:59:59.00+0000', '2020-01-01T00:00:00.00+0000') as Interval_dd
, datediff('wi','2019-12-31T23:59:59.00+0000', '2020-01-01T00:00:00.00+0000') as Interval_wi
, datediff('mo','2019-12-31T23:59:59.00+0000', '2020-01-01T00:00:00.00+0000') as Interval_mo
, datediff('qq','2019-12-31T23:59:59.00+0000', '2020-01-01T00:00:00.00+0000') as Interval_qq
, datediff('yyyy','2019-12-31T23:59:59.00+0000', '2020-01-01T00:00:00.00+0000') as Interval_yyyy
from ticket
limit 1
Interval_mcs | Interval_ms | Interval_ss | Interval_mi | Interval_hh | Interval_dd | Interval_wi | Interval_mo | Interval_qq | Interval_yyyy |
1000000 | 1000 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
If you need a fractional result, you can transform milliseconds ('ms') with an arithmetic expression like
dateDiff('ms', tasks[].startTime, tasks[].history[].startTime) / 1000.0 / 60.0 / 60.0
and use the format function.
select round(datediff('ms',tasks.history.startTime,tasks.history.endTime)/1000.0/60.0 , 1)
from ticket
group by tasks[].history[].startTime, tasks[].history[].endTime
Duration |
13.1 |
0.7 |
0 |
91.2 |
200.1 |
8 |
1213.4 |
13.4 |
… |
Note that if the datediff function is used with nested values, you have to use a group by clause to reduce the dates to single values (since nested values can contain an array of dates); otherwise the result will be null.
Wrong result (missing group by):
select round(datediff('ms',tasks.history.startTime,tasks.history.endTime)/1000.0/60.0 , 1)
from ticket
Duration |
null |
null |
null |
… |
Aggregation
Aggregation functions compute stats over numeric values extracted from the aggregated documents. These values can be extracted either from specific numeric fields in the documents or be generated by a provided script.
The supported functions are: min, max, sum, count, avg, and collect.
avg
A single-value metrics aggregation that computes the average of numeric values that are extracted from the aggregated documents.
select avg(custom.worklogs.duration)
from nested(ticket, custom.worklogs)
group by custom.worklogs[]
count
The COUNT() function returns the number of rows that match a specified criterion.
select count(id) from ticket
count_id |
4755 |
first
Returns the first non-null value (if one exists) of the field_name input column, sorted by the ordering_field_name column. If ordering_field_name is not provided, the field_name column itself is used for the sorting. tQL gets the results from the aggregation, then selects the first row from the bucket and returns the required column.
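A minimal sketch based on the description above. The two-argument form first(field_name, ordering_field_name) is an assumption derived from the description; the one-argument form appears in the combined aggregation example later in this section:

```sql
-- First non-null duration per bucket, ordered by the worklog start date
-- (assumed signature: first(field_name, ordering_field_name)).
select first(custom.worklogs.duration, custom.worklogs.startDate) as "First duration"
from nested(ticket, custom.worklogs)
group by custom.worklogs[]
```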
collect
Returns a collection of all values of the field_name input column.
select collect(keyNormalized), customerSegmentId from customer group by customerSegmentId
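The collect-per-group behaviour resembles SQL's array_agg; a minimal Python sketch (the sample rows and segment values are illustrative):

```python
# Sketch: collect() gathers all values of a column for each group key.
from collections import defaultdict

def collect_by(rows, value_field, group_field):
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_field]].append(row[value_field])
    return dict(groups)

customers = [
    {"keyNormalized": "acme",   "customerSegmentId": "B2B"},
    {"keyNormalized": "smith",  "customerSegmentId": "B2C"},
    {"keyNormalized": "globex", "customerSegmentId": "B2B"},
]
print(collect_by(customers, "keyNormalized", "customerSegmentId"))
# {'B2B': ['acme', 'globex'], 'B2C': ['smith']}
```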
max
A single-value metrics aggregation that keeps track and returns the maximum value among the numeric values extracted from the aggregated documents. These values can be extracted either from specific numeric fields in the documents or be generated by a provided script.
min
A single-value metrics aggregation that keeps track and returns the minimum value among the numeric values extracted from the aggregated documents. These values can be extracted either from specific numeric fields in the documents or be generated by a provided script.
sum
A single-value metrics aggregation that sums up numeric values that are extracted from the aggregated documents. These values can be extracted either from specific numeric fields in the documents or be generated by a provided script.
select sum(custom.worklogs.duration) as Sum_val
, avg(custom.worklogs.duration) as Average
, max(custom.worklogs.duration) as Maximum
, min(custom.worklogs.duration) as Minimum
, first(custom.worklogs.duration) as First_val
, count(custom.worklogs.duration) as Count_val
from nested(ticket, custom.worklogs)
group by custom.worklogs[]
Sum_val | Average | Maximum | Minimum | First_val | Count_val |
2521293 | 160.95 | 11540 | 0 | 0 | 15665 |
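For reference, the same statistics computed in plain Python over a small sample list of durations (the values below are illustrative, not the data behind the table above):

```python
# Sketch: sum/avg/max/min/first/count over a sample list of durations.
durations = [0, 30, 120, 11540, 15]

stats = {
    "Sum_val":   sum(durations),
    "Average":   sum(durations) / len(durations),
    "Maximum":   max(durations),
    "Minimum":   min(durations),
    "First_val": durations[0],          # first value in input order
    "Count_val": len(durations),
}
print(stats)
```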
histogram
This multi-bucket aggregation is similar to the normal histogram, but it can only be used with date or date-range values. Like the histogram, values are rounded down to the closest bucket.
select histogram(creationDate, 'month') from ticket
month |
1643673600000 |
1646092800000 |
1648771200000 |
1651363200000 |
Notice: the result is returned in milliseconds, which is why it is necessary to use a format function; see format.
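To see what those millisecond bucket keys represent, here is a plain-Python sketch of the conversion that format performs (using the first bucket value from the result above):

```python
# Sketch: epoch milliseconds -> "year-month" label, as format() would produce.
from datetime import datetime, timezone

def month_label(epoch_ms: int) -> str:
    return datetime.fromtimestamp(epoch_ms / 1000, tz=timezone.utc).strftime("%Y-%m")

print(month_label(1643673600000))  # 2022-02
print(month_label(1646092800000))  # 2022-03
```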
Where condition
like operator
The like operator behaves similarly to standard SQL LIKE; you can use both % and * characters to match any substring.
- where name like 'cat' - is the same as name = cat
- where name like 'cat%' - finds all records starting with 'cat'; this creates a prefix query with optimal performance.
- where name like '%cat%' - finds all records containing 'cat'. Please note that this query is quite resource intensive.
- where name fuzzy like 'cat' - finds all documents with a name similar to 'cat'.
- where name regexp like 'cat[0-9]*' - finds all documents matching the regular expression. Please note that this query is very resource intensive.
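The matching semantics of the % and * wildcards can be sketched in plain Python (this mirrors what the patterns match, not the actual Elasticsearch query tQL generates):

```python
# Sketch: like-pattern matching -- % (or *) matches any substring.
import re

def like(value: str, pattern: str) -> bool:
    # Escape regex metacharacters, then turn % and * into ".*".
    regex = re.escape(pattern).replace("%", ".*").replace(r"\*", ".*")
    return re.fullmatch(regex, value) is not None

print(like("cat", "cat"))       # True  (name like 'cat' ~ name = cat)
print(like("catalog", "cat%"))  # True  (prefix match)
print(like("bobcat", "%cat%"))  # True  (substring match)
print(like("dog", "%cat%"))     # False
```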
concat
Concatenate multiple strings.
where name like concat('%', :search, '%')
Post-Processing
round
select round(avg(custom.worklogs.duration), 2) as rand_avg
from nested(ticket, custom.worklogs)
group by custom.worklogs[]
rand_avg |
160.95 |
format
format(whenInserted, 'dd.MM.yyyy')
select format(histogram(creationDate, 'month'), 'yyyy-MM') as month
from ticket
month |
2022-02 |
2022-03 |
2022-04 |
2022-05 |
- "dd" represents the day of the month
- "MM" represents the month
- "yyyy" represents the year
- "HH" represents the hour in 24-hour format
- "mm" represents the minutes
- "ss" represents the seconds
(case sensitive)
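To illustrate why the case of each token matters, here is a sketch that maps the Java-style tokens above onto Python's strftime directives (tQL itself uses the Java date-format tokens, not strftime; the mapping is for illustration only):

```python
# Sketch: Java-style date pattern tokens -> Python strftime directives.
# Case matters: MM is month, mm is minutes; HH is 24-hour clock.
from datetime import datetime

TOKEN_MAP = [("yyyy", "%Y"), ("MM", "%m"), ("dd", "%d"),
             ("HH", "%H"), ("mm", "%M"), ("ss", "%S")]

def java_pattern_to_strftime(pattern: str) -> str:
    for token, directive in TOKEN_MAP:
        pattern = pattern.replace(token, directive)
    return pattern

d = datetime(2022, 2, 1, 9, 30, 5)
print(d.strftime(java_pattern_to_strftime("dd.MM.yyyy")))  # 01.02.2022
print(d.strftime(java_pattern_to_strftime("HH:mm:ss")))    # 09:30:05
```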
convert
select
  convert(priority), // default converter configured for column priority
  convert(myBooleanCol, 'booleanTranslateValueConvertor'), // convert with a boolean converter
Use the converted keyword for post-processing; a converter translates one value into another.
For example:
- priorityId - keyword, indexed as is. There is a registered converter to translate GUID to name in tSM
- priorityName - indexed with an ASCII-folding and lowercase normalizer, so results are normalized. This column has no converter in tSM, so no post-processing happens. See https://www.elastic.co/guide/en/elasticsearch/reference/current/normalizer.html
Converter rules are defined in the Listings -> Fields configuration.
Select convert(priorityId), priorityName
from ticket
group by priorityId, priorityName
Priority | priorityName |
Major | major |
Trivial | trivial |
Blocker | blocker |
Minor | minor |
Critical | critical |
Without conversion, the result looks a little bit different
Select priorityId, priorityName
from ticket
group by priorityId, priorityName
priorityId | priorityName |
6e71a823-806e-3d03-11ee-0f0e2d55061d | major |
6fd9ef55-bcec-f460-1244-0b43b3605b34 | trivial |
795c96a5-4b5b-7b8a-9f57-428b33aa0534 | blocker |
910c2db8-77da-2e2b-590b-b00f929492fe | minor |
c9c96175-67b8-e8c0-6f4b-59a67a53f808 | critical |
Notice: convert is not applied in the WHERE clause, because it is a post-processing operation.
select convert(priorityId) as priority, convert(statusId) as Status, custom.worklogs.workerId from ticket
where priorityName like 'major'
order by statusId desc
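The execution order behind this notice can be sketched in plain Python: Elasticsearch evaluates WHERE against the raw indexed values first, and converters run afterwards on the result rows. The GUIDs below are taken from the table above; the converter mapping itself is illustrative.

```python
# Sketch: WHERE filters raw values first; convert() is applied afterwards.
rows = [
    {"priorityId": "6e71a823-806e-3d03-11ee-0f0e2d55061d", "priorityName": "major"},
    {"priorityId": "6fd9ef55-bcec-f460-1244-0b43b3605b34", "priorityName": "trivial"},
]
converter = {
    "6e71a823-806e-3d03-11ee-0f0e2d55061d": "Major",
    "6fd9ef55-bcec-f460-1244-0b43b3605b34": "Trivial",
}

# 1) WHERE runs first, on raw indexed values:
filtered = [r for r in rows if r["priorityName"] == "major"]
# 2) convert() runs afterwards, as post-processing on the result set:
result = [{**r, "priority": converter[r["priorityId"]]} for r in filtered]
print(result)  # one row, with priority 'Major'
```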
Additional samples of the complex usage of tQL clauses for data retrieval and transformation
Aggregation
The sum, count, max, min, and avg aggregations implemented in tQL can also be applied to complex Painless scripts:
select count(id)
, max(
'def month, d;
if (doc["creationDate"].size()!=0)
{ Instant instant = Instant.ofEpochMilli(Long
.parseLong(doc["creationDate"]
.value.millis.toString()));
ZonedDateTime zdt = ZonedDateTime.ofInstant(instant, ZoneId.of("Z"));
month = zdt.getMonthValue();
}
return month') as max_month
, round(avg(
'def month, d;
if (doc["creationDate"].size()!=0)
{ Instant instant = Instant.ofEpochMilli(Long
.parseLong(doc["creationDate"]
.value.millis.toString()));
ZonedDateTime zdt = ZonedDateTime.ofInstant(instant, ZoneId.of("Z"));
month = zdt.getMonthValue();
}
return month'
) , 2) as avg_duration
from ticket
where statusName like 'Closed'
Nested Grouping
select converted chars.project
, tasks.taskType
, sum('def d ;
if (doc["custom.worklogs.duration"].size()!=0)
{d=doc["custom.worklogs.duration"].value; }
else d=0;
return d')
, count(tasks.id)
, count(priorityName)
from ticket
where tasks[active = false]
group by chars.project, tasks[active = false and id is not null].taskType, custom.worklogs[id is not null]
select converted chars.project
, tasks.taskType
, sum('def d ;
if (doc["custom.worklogs.duration"].size()!=0)
{d=doc["custom.worklogs.duration"].value; }
else d=0;
return d')
, count(tasks.id)
, count(priorityName)
from ticket
group by chars.project, tasks[active = false and id is not null].taskType, custom.worklogs[id is not null]
For a standard query, we can ask Elasticsearch to return the _source field (https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-source-field.html), which contains the original value. We use it for standard select queries, so 'select priorityName from ticket' returns original values. This is not possible with GROUP BY: 'select priorityName from ticket group by priorityName' returns the grouped (normalized) value.
select name,
sum('doc["custom.worklogs.duration"].value / 60.0 / 8') as "Worklog MD"
from ticket
where creationDate between '2022-02-01T00:00:00.000+0000' and '2022-02-28T00:00:00.000+0000'
group by name, custom.worklogs[]
select name
, custom.worklogs.workerID
, count(id)
, sum('doc["custom.worklogs.duration"].value / 60.0 / 8') as "Worklog MD"
from ticket
group by name, custom.worklogs[].workerID
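The /60.0/8 arithmetic in the script above converts worklog minutes into 8-hour man-days (MD); a plain-Python sketch of the same conversion:

```python
# Sketch: worklog minutes -> man-days (60 minutes per hour, 8 hours per MD).
def minutes_to_md(minutes: float) -> float:
    return minutes / 60.0 / 8

print(minutes_to_md(480))  # 1.0 (one full 8-hour day)
print(minutes_to_md(240))  # 0.5
```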
Data transformation with the painless script (instead of Format-function)
select converted statusId as "Status", priorityId as priority, creationDate,
script( ' def month;
if (doc["creationDate"].size()!=0)
{
Instant instant = Instant.ofEpochMilli(Long.parseLong(doc["creationDate"].value.millis.toString()));
ZonedDateTime zdt = ZonedDateTime.ofInstant(instant, ZoneId.of("Z"));
month = zdt.getMonthValue();
}
else month = 0;
return month ') as month
from ticket
where creationDate between '2022-02-01T00:00:00.000+0000' and '2022-02-28T00:00:00.000+0000'
order by statusId desc
Aggregate with two nested objects
select count(tasks.id), sum(custom.worklogs.duration), priorityName
from ticket
group by priorityName, custom.worklogs[], tasks[]
Drivers
You can connect to tQL remotely using the standard drivers.
JDBC Driver
Download the latest OpenSearch JDBC Driver and use it with your favorite tool:
- Connect string: jdbc:opensearch://https://tsm.datalite.cloud/tsm-ticket/api/opensearch
- Authentication: use an existing tSM User with “System.Tql.Remote” privilege
ODBC Driver
Download the latest OpenSearch ODBC Driver and configure Windows ODBC DataSource using the OpenSearch manual.
- Host: https://tsm.datalite.cloud/tsm-ticket/api/opensearch
- Authentication: BASIC, use an existing tSM User with “System.Tql.Remote” privilege
- Advanced options -> Enable SSL (for https servers)
Additional connectors for ODBC
Using ODBC driver, you can connect:
- Power BI Desktop: OpenSearch.mez
- Tableau: opensearch_sql_odbc.taco
For more details consult https://github.com/opensearch-project/sql/tree/main/sql-odbc.
To connect Power BI Desktop using the ODBC driver:
- Click Home > Get Data > More > Other, select ODBC, and click Connect.
- Choose your ODBC DSN and enter a query for the index data in the advanced options.
- Click OK.
- A data preview will be loaded.