Sample dataset

For sampling purposes, we use a collection of JSON documents like this one:

{
  "id": "e0b3ba37-cccd-4ee9-8e3e-1598bdec6a73",
  "key": "ADAPSCET-153",
  "name": "Podpora a exekuce testování aplikací DL 2022",
  "typeId": "4ee6f437-3c32-4e0b-be11-afc189be5659",
  "typename": "Story",
  "statusId": "f4961557-06c0-4b5c-bdc1-2a76efc1c2e2",
  "statusCode": "1_INPROGRESS",
  "statusName": "In progress",
  "priorityId": "6e71a823-806e-3d03-11ee-0f0e2d55061d",
  "priorityName": "Major",
  "creationDate": "2022-03-04T01:16:43.046+0000",
  "creationUserId": "0511ca9b-e20f-43c5-b59c-f864385b46d7",
  "creationGroupId": null,
  "tasks": [
    {
      "dueDate": null,
      "description": "<p></p>",
      "skills": [],
      "activityId": "Activity_UT_Analyza",
      "routing": null,
      "taskType": "USER_TASK",
      "plannedStartTime": null,
      "tsmTaskDefinitionCode": null,
      "commitedStartTimeTo": null,
      "startTime": "2024-11-04T11:12:11.879+0000",
      "id": "a3b3d832-9a9d-11ef-97fb-76f77ba300dd",
      "_timestamp": null,
      "processInstanceId": "a3b2edcb-9a9d-11ef-97fb-76f77ba300dd",
      "userGroupCode": "Analytici",
      "module": "ticket",
      "custom": null,
      "active": true,
      "history": [
        {
          "duration": 0,
          "userGroupCode": "Analytici",
          "changes": ["USER_GROUP"],
          "startTime": "2024-11-04T11:12:12.165+0000",
          "endTime": "2024-11-04T11:12:12.209+0000",
          "userId": null,
          "status": null
        },
        {
          "duration": 1,
          "userGroupCode": "Analytici",
          "changes": ["STATUS_CODE"],
          "startTime": "2024-11-04T11:12:12.209+0000",
          "endTime": "2024-11-04T11:13:00.267+0000",
          "userId": null,
          "status": "NEW"
        },
        {
          "duration": 0,
          "userGroupCode": "Analytici",
          "changes": ["USER"],
          "startTime": "2024-11-04T11:13:00.267+0000",
          "endTime": "2024-11-04T11:13:00.283+0000",
          "userId": "412a2a1a-ba76-4dcb-b892-49f3e787db81",
          "status": "NEW"
        }
      ],
      "charsTyped": {},
      "userId": "412a2a1a-ba76-4dcb-b892-49f3e787db81",
      "commitedStartTimeFrom": null,
      "taskDefinitionKey": "Activity_UT_Analyza",
      "executionId": "a3b2edcb-9a9d-11ef-97fb-76f77ba300dd",
      "canceled": false,
      "taskSpecification": "Virtual.Buddy.Feedback.Analyza",
      "plannedEndTime": null,
      "name": "Analyza feedbacku",
      "wfmUserId": null,
      "endTime": null,
      "incident": null,
      "chars": {
        "vstupniData": {
          "datum": "2024-11-05T11:11:57.000Z",
          "zakaznik": "sd",
          "select": "Negativní - neví",
          "zadavatelEmail": "dd@post.cz",
          "zadavatelPozice": "d",
          "komentarZadavatele": "dsf",
          "puvodniDotaz": "dsf",
          "puvodniOdpoved": "dsf",
          "zadavatelJmeno": "d"
        }
      },
      "status": "IN_PROGRESS",
      "followupDate": null
    }
  ],
  "dataTags": [],
  "process": {
    "processDefinitionVersionTag": "0",
    "code": "Ticket-Story",
    "version": 9
  },
  "custom": {
    "worklogs": [
      {
        "duration": 120,
        "workerId": "0511ca9b-e20f-43c5-b59c-f864385b46d7",
        "endDate": "2022-01-03T03:51:00.000+0000",
        "id": "af45d2bb-29a1-4276-a784-2644dffab1ca",
        "type": null,
        "startDate": "2022-01-03T01:51:00.000+0000"
      }
    ]
  },
  "chars": {
    "project": "ADAPSCET"
  }
}

Documents like this one are stored in a ticket index in Elasticsearch. The index mappings define the structure and data types of the fields, guiding how Elasticsearch interprets, stores, and searches the data. Below is a breakdown of the main elements of the mappings, including field types and structure.

1. General Structure:

The index is called "ticket". Its mappings contain a properties section that defines the fields (and nested objects) associated with each ticket.

2. Field Types:

Keyword: These fields store structured, unanalyzed data. They are best suited for exact matching, sorting, and aggregations. For example, categoryId, priorityName, name, statusCode.

Text: Fields that contain full-text data and are typically used for search and analysis. They are analyzed (broken down into terms) during indexing. However, no text fields are explicitly defined in this mapping.

Date: These fields store date-time values. For example, creationDate, closedDate, plannedStartDate.

Nested: These fields store arrays of objects and index each array element as its own hidden document, so a query can require several conditions to match within the same element. For example, advices, worklogs, tasks, history.

Object: A flexible field type that can store complex nested structures. For example, custom, scriptFields, filteredNested.

Flattened: Similar to the object type, but optimized for cases where the structure of data is dynamic and the number of keys is large. For example, chars.

Join: Defines parent/child relationships between documents in the same index (loosely similar to foreign keys in relational databases). For example, joinField defines a relationship between a "Ticket" and "RelatedEntity".
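The field types above can be pictured as a small mapping fragment. The following is only a sketch: the field names are taken from this section, but the date format string, the sub-properties, and the direction of the join relation are assumptions made for illustration, not the actual "ticket" mapping. The helper field_types is hypothetical and simply lists the type of every field path.

```python
# Hypothetical mapping fragment: one field per type discussed above.
# Field names come from the text; the date format, the sub-properties and
# the join relation layout are assumptions, not the real "ticket" mapping.
ticket_mapping_sketch = {
    "properties": {
        "statusCode": {"type": "keyword"},
        "creationDate": {
            "type": "date",
            "format": "yyyy-MM-dd'T'HH:mm:ss.SSSZ",  # assumed format
        },
        "tasks": {
            "type": "nested",
            "properties": {
                "status": {"type": "keyword"},
                "history": {
                    "type": "nested",
                    "properties": {"status": {"type": "keyword"}},
                },
            },
        },
        "custom": {
            "type": "object",
            "properties": {"worklogs": {"type": "nested"}},
        },
        "chars": {"type": "flattened"},
        "joinField": {
            "type": "join",
            "relations": {"Ticket": "RelatedEntity"},  # assumed direction
        },
    }
}


def field_types(mapping, prefix=""):
    """Return {dotted field path: type} for a mapping fragment."""
    result = {}
    for name, spec in mapping.get("properties", {}).items():
        path = prefix + name
        # A field with sub-properties but no explicit type is an object.
        result[path] = spec.get("type", "object")
        result.update(field_types(spec, path + "."))
    return result


types = field_types(ticket_mapping_sketch)
for path in sorted(types):
    print(f"{path}: {types[path]}")
```

Listing the paths this way makes the hierarchy visible at a glance, for instance that tasks.history is a nested field inside another nested field.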

3. Fields and Their Purpose:

__allParentIds: Stores IDs of parent tickets (type keyword).

advices: A nested field containing advice records with several subfields such as dateFrom, duration, status, and more.

assignments: Stores a list of assigned users or tasks (type keyword).

categoryId and categoryName: Represent the ticket's category, used for filtering or grouping (type keyword).

chars: A flattened field used for storing dynamically structured data, possibly key-value pairs.

creationDate: A date field indicating when the ticket was created.

custom: Contains custom properties and nested fields such as changeRequest, worklogs, etc.

statusName: Stores the name of the ticket's status (type keyword).

tasks: A nested field for task data, with various attributes such as activityId, dueDate, and status.

type: Represents the type of ticket (type keyword).

priorityName, severityName: Represent the ticket's priority and severity.

whenEdited and whenInserted: Represent when the ticket was last edited and when it was inserted.

whoEdited and whoInserted: Represent the user who last edited or inserted the ticket.

Several fields are date types with a specific format, such as creationDate, closedDate, plannedStartDate, and plannedEndDate.
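As an illustration of how keyword, flattened, and date fields are typically filtered together, here is a minimal sketch that builds a query body as a plain Python dictionary. It assumes that the date fields accept timestamps in the same shape as the sample document (for example "2022-03-04T01:16:43.046+0000") and that chars is mapped as flattened, so its keys can be addressed with dot notation. The function names are hypothetical and no Elasticsearch client is involved.

```python
from datetime import datetime

# strptime/strftime pattern matching the timestamps in the sample document.
SAMPLE_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"


def parse_ts(value):
    """Parse a timestamp such as '2022-03-04T01:16:43.046+0000'."""
    return datetime.strptime(value, SAMPLE_FORMAT)


def format_ts(dt):
    """Format an aware datetime with millisecond precision, as in the sample."""
    millis = dt.microsecond // 1000
    return dt.strftime("%Y-%m-%dT%H:%M:%S") + f".{millis:03d}" + dt.strftime("%z")


def tickets_created_since(project, status_code, since):
    """Build a filter-only query body (hypothetical helper)."""
    return {
        "query": {
            "bool": {
                "filter": [
                    # Key inside the flattened chars field.
                    {"term": {"chars.project": project}},
                    # Exact match on a keyword field.
                    {"term": {"statusCode": status_code}},
                    # Assumes the mapping accepts this timestamp shape.
                    {"range": {"creationDate": {"gte": format_ts(since)}}},
                ]
            }
        }
    }


created = parse_ts("2022-03-04T01:16:43.046+0000")
query = tickets_created_since("ADAPSCET", "1_INPROGRESS", created)
```

Using filter clauses rather than must clauses skips relevance scoring, which suits exact-match conditions like these.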

4. Complex Nested Fields

advices: A nested field that holds advice records, with each record containing multiple properties such as status, type, duration, and more.

worklogs: Another nested field under custom that tracks work logs, with properties like duration, startDate, endDate, workerId.

tasks: A nested field that tracks tasks associated with the ticket, with several properties such as active, status, startTime, endTime, taskType. Each task includes a nested history field, which stores information about task modifications. Together these form a three-level hierarchy within the index: ticket, task, and history entry.
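To make the ticket, task, and history hierarchy more concrete, the sketch below builds a two-level nested query and also emulates its matching rule on an in-memory document. It assumes that history is mapped as nested inside tasks and that userId and status are keyword subfields; both function names are hypothetical.

```python
def history_entry_query(user_id, status):
    """Query body for tickets with one history entry matching both values."""
    return {
        "query": {
            "nested": {
                "path": "tasks",
                "query": {
                    "nested": {
                        "path": "tasks.history",
                        "query": {
                            "bool": {
                                "filter": [
                                    {"term": {"tasks.history.userId": user_id}},
                                    {"term": {"tasks.history.status": status}},
                                ]
                            }
                        },
                    }
                },
            }
        }
    }


def matches_locally(ticket, user_id, status):
    """Emulate nested semantics: both values must occur in the same entry."""
    return any(
        entry.get("userId") == user_id and entry.get("status") == status
        for task in ticket.get("tasks", [])
        for entry in task.get("history") or []
    )


# Trimmed version of the sample document's task history.
sample = {
    "tasks": [
        {
            "history": [
                {"userId": None, "status": None},
                {"userId": None, "status": "NEW"},
                {"userId": "412a2a1a-ba76-4dcb-b892-49f3e787db81", "status": "NEW"},
            ]
        }
    ]
}

# The two values appear only in different entries, so this must not match.
split = {
    "tasks": [
        {"history": [{"userId": "u1", "status": None},
                     {"userId": None, "status": "NEW"}]}
    ]
}

body = history_entry_query("412a2a1a-ba76-4dcb-b892-49f3e787db81", "NEW")
```

The second document shows why the nested type matters: with a plain object mapping, the two values from different history entries could be combined into a false match.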

5. Additional Notes

The index is designed for efficient querying and retrieval, and it contains many nested and flattened fields for storing complex, hierarchical data. Some fields are disabled for indexing (e.g., _class), meaning their values are not indexed for search purposes. Whether such a field can still be used for sorting or aggregations depends on whether doc values remain enabled for it.