Import of standalone tasks cause NPEs due to missing flownodeID #4735

Closed
3 tasks done
PHWaechtler opened this issue Oct 21, 2024 · 3 comments
Labels
group:support All requests that are linked to a customer request. DRI: Tassilo scope:core-api Changes to the core API: engine, dmn-engine, feel-engine, REST API, OpenAPI type:bug Issues that describe a user-facing bug in the project. version:7.21.7 version:7.22.2 version:7.23.0

Comments

@PHWaechtler
Contributor

PHWaechtler commented Oct 21, 2024

Environment (Required on creation)

Optimize 7

Description (Required on creation; please attach any relevant screenshots, stacktraces, log files, etc. to the ticket)

During the import of flownode data for standalone tasks, the Optimize importer throws a NullPointerException because the flowNodeId is null where a non-null value is required. For more details, please refer to the support ticket.

Steps to reproduce (Required on creation)

  1. Start Optimize 7 environment
  2. Start standalone tasks
  3. Wait for Optimize to import data
  4. Observe Optimize log

Observed Behavior (Required on creation)

Exception during import:

13:08:28.108 [EngineImportScheduler-1] ERROR o.c.o.s.i.e.m.CompletedUserTaskEngineImportMediator - Was not able to import next page, retrying after sleeping for 5063ms.
java.lang.NullPointerException: flowNodeId is marked non-null but is null
at org.camunda.optimize.dto.optimize.query.event.process.FlowNodeInstanceDto.<init>(FlowNodeInstanceDto.java:98)
at org.camunda.optimize.service.importing.engine.service.CompletedUserTaskInstanceImportService.mapEngineEntityToOptimizeEntity(CompletedUserTaskInstanceImportService.java:99)

Expected behavior (Required on creation)

No exception during import.

Root Cause (Required on prioritization)

Standalone tasks have no flownode ID (TaskDefinitionKey) in the engine, but the FlowNodeInstanceDto constructor in Optimize marks this field as @NonNull, so creating the DTO fails with a NullPointerException.
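
To illustrate the root cause, here is a minimal sketch that reproduces the same failure mode outside of Optimize. The class `FlowNodeInstance` is a hypothetical stand-in for `FlowNodeInstanceDto`, and the explicit `Objects.requireNonNull` call is an assumption that approximates the generated null check; only the exception message is taken from the stack trace above.

```java
import java.util.Objects;

public class StandaloneTaskNpeDemo {

    // Hypothetical stand-in for Optimize's FlowNodeInstanceDto; only the null check is modeled.
    static final class FlowNodeInstance {
        final String flowNodeId;
        final String userTaskInstanceId;

        FlowNodeInstance(String flowNodeId, String userTaskInstanceId) {
            // Assumption: approximates the non-null check, using the message from the stack trace.
            this.flowNodeId = Objects.requireNonNull(
                flowNodeId, "flowNodeId is marked non-null but is null");
            this.userTaskInstanceId = userTaskInstanceId;
        }
    }

    // Returns the exception message, or null if the instance could be created.
    static String tryCreate(String taskDefinitionKey, String taskId) {
        try {
            new FlowNodeInstance(taskDefinitionKey, taskId);
            return null;
        } catch (NullPointerException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // A task that belongs to a process has a task definition key.
        System.out.println(tryCreate("approveInvoice", "task-1"));
        // A standalone task has no task definition key, so construction fails.
        System.out.println(tryCreate(null, "task-2"));
    }
}
```

The second call fails in the constructor, which matches the top frame of the reported stack trace (`FlowNodeInstanceDto.<init>`).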

Solution Ideas

For now, let's just focus on removing the exception rather than enabling Optimize to import standalone tasks in a way that makes them usable for report analysis.

Some potential approaches detailed here. Specifically, we could consider these two options (or a suitable alternative):

Option 1: Optimize removes the @NonNull constraint on the flownode ID and imports this data regardless
This would avoid the exception and the need for the manual workaround. However, Optimize would then keep flownode
data that is not very useful for reporting: since the ID is missing, Optimize cannot, for example, aggregate this flownode
data for flownode reports. I would also need to have a closer look at all our flownode import pipelines to determine whether
Optimize would keep multiple entries or overwrite one entry per standalone user task import. In general, I think we should
avoid importing data that will not be useful for report analysis, but it could be a quick "fix".

Option 2: Optimize keeps the @NonNull constraint on the flownode ID but skips importing flownodes with no ID
Since flownode data without an ID is of limited use to Optimize reporting, we could also consider skipping the import of
user task data that comes from the engine without an ID. This would mean there is no data in Optimize for standalone
tasks (unless other related data, like identity link logs, are imported). As with Option 1, standalone tasks would
not be available for report analysis, but at least we would avoid importing unnecessary data.
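
A minimal sketch of what Option 2 could look like, assuming a simplified mapping step. The types `EngineUserTask` and `FlowNodeInstance` and the method `mapToOptimizeEntities` are hypothetical and not the actual Optimize classes; the sketch only shows a null filter applied before the non-null constructor is called.

```java
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

public class StandaloneTaskFilterSketch {

    // Hypothetical, simplified view of a user task as returned by the engine.
    static final class EngineUserTask {
        final String id;
        final String taskDefinitionKey; // null for standalone tasks

        EngineUserTask(String id, String taskDefinitionKey) {
            this.id = id;
            this.taskDefinitionKey = taskDefinitionKey;
        }
    }

    // Hypothetical stand-in for the Optimize entity with a non-null flowNodeId.
    static final class FlowNodeInstance {
        final String flowNodeId;
        final String userTaskInstanceId;

        FlowNodeInstance(String flowNodeId, String userTaskInstanceId) {
            this.flowNodeId = Objects.requireNonNull(
                flowNodeId, "flowNodeId is marked non-null but is null");
            this.userTaskInstanceId = userTaskInstanceId;
        }
    }

    // Skips tasks without a task definition key before mapping, so the
    // non-null constructor is never called with a null flowNodeId.
    static List<FlowNodeInstance> mapToOptimizeEntities(List<EngineUserTask> engineTasks) {
        return engineTasks.stream()
            .filter(task -> task.taskDefinitionKey != null)
            .map(task -> new FlowNodeInstance(task.taskDefinitionKey, task.id))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<EngineUserTask> page = List.of(
            new EngineUserTask("task-1", "approveInvoice"),
            new EngineUserTask("task-2", null)); // standalone task
        System.out.println(mapToOptimizeEntities(page).size());
    }
}
```

With this kind of filter, the standalone task is dropped silently; a real implementation would probably also log how many entities were skipped.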

Current tendency: Option 2, as it avoids unnecessary data import. However, we need to double-check whether there are use cases where Option 1 would be preferred or where Option 2 doesn't work.

After discussion and getting more context on what standalone tasks are, we decided that there is no value in importing these to Optimize. Therefore, we should go for option 2 above or the following Option 3:

Option 3: Engine filters out standalone tasks on Optimize API
When Optimize fetches data for flownodes (/usertasks), the engine API only returns data for non-standalone tasks, so Optimize does not need to do any additional filtering during its import. This could potentially be a more performant solution.

Hints

  • Consider filtering out all null values, not only those in task-related fields

Workaround (test on lower environment first)

Workaround 1. Disable standalone tasks in the engine.
Workaround 2. Remove history related to standalone tasks.
Workaround 3. Set a default placeholder (e.g. workaroundStandaloneTasks) for TASK_DEF_KEY_ and ACT_INST_ID_ in the ACT_HI_TASKINST table. If PROC_DEF_KEY_ and PROC_DEF_ID_ are null, populate them as well.

Links

https://jira.camunda.com/browse/SUPPORT-24021

Breakdown

Pull Requests

  1. bot:backport:7.21 bot:backport:7.22 ci:all-db ci:default-build ci:h2
    PHWaechtler

Dev2QA handover

  • Does this ticket need a QA test and the testing goals are not clear from the description? Add a Dev2QA handover comment
@PHWaechtler PHWaechtler added type:bug Issues that describe a user-facing bug in the project. group:support All requests that are linked to a customer request. DRI: Tassilo scope:optimize Changes to Optimize. labels Oct 21, 2024
@yanavasileva
Member

yanavasileva commented Nov 22, 2024

Option 2 - fix on the Optimize side

  • pros
    • easy fix by adding a filter when mapping engine entities to optimize entities
    • good learning opportunity for onboarding
  • cons
    • unnecessary data imported by engine and filtering it for a second time
    • there's no out of the box option to test import of standalone tasks in IT

Option 3 - fix on the engine side

  • pros
    • easy pick to change the mybatis query and test it in the engine (JUnit)
    • query only the data that is needed for Optimize
    • good learning opportunity for onboarding

Manual testing should be done regardless of the solution.
Fetchers are independent of each other. Since there are no other reported issues about null values related to standalone tasks (for example, in operation logs), it's safe to assume that the issue occurs only for tasks.

Backport is straightforward for both options.

@yanavasileva
Member

Decision:

  • We will implement the fix on the engine side. The customer agrees to apply the patch for the engine instead of Optimize.

@yanavasileva yanavasileva added version:7.23.0 potential:7.22.2 scope:core-api Changes to the core API: engine, dmn-engine, feel-engine, REST API, OpenAPI potential:7.21.7 and removed version:optimize 3.15.0 potential:optimize 3.14.2 scope:optimize Changes to Optimize. labels Dec 4, 2024
@PHWaechtler
Contributor Author

Affected Optimize imports:

  • RunningUserTaskInstanceImportService
  • CompletedUserTaskInstanceImportService
