Launches left IN PROGRESS when using agent-java-cucumber2:5.1.5 #46
Comments
Hi @amalbaccar-hub ! |
Hi @AmsterGet :) Thank you for your response and for transferring the issue. Yeah, you're right, I'm still not able to find the root cause. Hopefully I'll get help here soon, because it's a blocking issue for us. |
@amalbaccar-hub Any exception in the console? Any weird stuff? |
@HardNorth I got this error in service-api log: Could this be the root cause? Please tell me if you need other logs. Thanks! |
@amalbaccar-hub According to your stack trace that happens on call to Do you call it somewhere in your code? I ask because the client method has the same name as the server one. |
@HardNorth Yes, I have a listener/subscriber for the TestRunStarted and TestRunFinished Cucumber events where I call the getLaunchByUuid method as follows (we need it to show the RP launch URL): LaunchResource launch = Objects.requireNonNull(rpService.getClient()).getLaunchByUuid(launchUuid).blockingGet(); However, the original issue (the launch not finishing) happened even before I added this call. |
@amalbaccar-hub I would recommend you double-check that you don't fail in these hooks. That can definitely be the cause. Just disable them completely and observe, because failures in hooks are not something the Cucumber agent can handle due to implementation issues. |
@HardNorth As I said, the issue is still there even after disabling the hooks. I commented out the code and ran a test, but the launch got stuck in the IN PROGRESS state. However, you can see in the screenshot that the previous launch finished properly. It took almost 3 minutes after executing the last step in the feature file, which is quite a long wait. |
@HardNorth This is the stack trace I got when I paused the debugger (while it was hanging): |
@amalbaccar-hub It's for execution with several forked processes. You can actually disable it with If it's really the source of the problem I would like to see a stack trace to fix that. |
@HardNorth Great, thank you so much! It was the source of the problem.
|
@HardNorth However, now when running a whole test suite I get 2 parallel launches. While you work on a fix, is there a workaround to make it report to only one launch? |
@amalbaccar-hub Server error won't help me, since it looks like client issue. I thought you caught something on the client side. I can guess it is a filesystem read/write permission issue, since client uses it by default for synchronising Launch ID. Also there is a known issue that this mechanism does not work on Windows Subsystem for Linux (WSL). For workaround you can try return back |
@HardNorth might be similar issue reportportal/reportportal#1250 (comment) |
@DzmitryHumianiuk Thank you for your response, I've read the discussion you shared. I agree with your analysis and now I'm trying to check if the launch is present in the database. I attached the service api log, I see some errors caused by RabbitMQ (I think I got these errors when I used the workaround provided by @HardNorth: rp.client.join=true & rp.client.join.mode=SOCKET and when I also tried to use rerun feature ). Please take a look. |
@HardNorth Thank you so much for the workaround, and sorry for the late reply; I was busy with other things. The workaround works when I run in debug mode from IntelliJ IDEA. However, when we run the release JAR file from the command line, the problem reappears and the launches are not finished. I tried to collect a thread dump/stack trace, please take a look! I copied the stack trace while the launch was stuck in the IN PROGRESS state: |
@amalbaccar-hub What kind of system do you use for release run? And can you share command example? |
@HardNorth We use Jenkins to generate releases, and our application is a Java CLI app built with Gradle and running on Java 8 (Windows).
java -Drp.endpoint=http://rp.evf.us/ |
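A minimal sketch of how the join settings mentioned earlier in the thread (rp.client.join=true and rp.client.join.mode=SOCKET) might be passed on the command line in the same way as rp.endpoint above. It assumes the agent picks up rp.* values from JVM system properties, as the -Drp.endpoint flag suggests; the class name and the jar path are hypothetical placeholders, not part of the original command.

```java
import java.util.ArrayList;
import java.util.List;

final class JoinModeCommand {
    /**
     * Builds a "java ... -jar <jarPath>" command with the join workaround
     * properties from this thread. JVM options must come before -jar.
     */
    static List<String> javaCommand(String jarPath) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-Drp.client.join=true");        // value taken from the thread
        cmd.add("-Drp.client.join.mode=SOCKET"); // value taken from the thread
        cmd.add("-jar");
        cmd.add(jarPath);                        // hypothetical jar path
        return cmd;
    }
}
```

The resulting list can be handed to a ProcessBuilder or simply printed to check the flags before running the release JAR.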
@DzmitryHumianiuk Still getting this error although the launch with the mentioned ID(uuid) is created in the database: Please check screenshot: So your analysis here reportportal/reportportal#1250 (comment) seems to not be the case. Could you please analyse this issue further? This is really blocking us and I need to resolve ASAP. Thanks! |
@amalbaccar-hub So, in other words, you're using Jenkins pre-steps to create a launch in ReportPortal via an API call, take the received UUID, and share it with the other jobs or the current job, letting the subsequent executions report into the provided UUID of the launch? Did I get that right? In this case, I'm starting to think about the processing speed. The thing is, creating a launch is a synchronous operation. And if you've received a response from the server with your UUID, it's possible that it hasn't yet been dropped or saved in the database. And the very first request to save a child element throws an error. Although, again, it seems to me that in such a case, the request for writing should go into a queue for retries. @amalbaccar-hub, try waiting an additional couple of seconds after creating the launch before handing off the UUID to other processes. I see from the picture that the object exists in the database. And I'm trying to understand why, at the moment of the create request by the child API, it still considers it not to be there. |
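The "wait a couple of seconds and try again" idea above could be sketched as a small generic retry helper. This is only an illustration under stated assumptions: the class and method names are hypothetical, it is not part of the ReportPortal client, and the action passed in stands for whatever call fails while the launch is not yet visible.

```java
import java.util.concurrent.Callable;

final class RetryWithDelay {
    /**
     * Calls the action until it succeeds or maxAttempts is reached,
     * sleeping delayMillis between failed attempts.
     * Rethrows the last failure if every attempt fails.
     */
    static <T> T retry(Callable<T> action, int maxAttempts, long delayMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis); // give the server time to persist the launch
                }
            }
        }
        throw last;
    }
}
```

For example, `RetryWithDelay.retry(() -> reportFirstItem(), 5, 2000L)` would try a hypothetical `reportFirstItem()` call up to five times, two seconds apart.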
@DzmitryHumianiuk No, we're just using Jenkins to generate new builds, and we are not using API calls to integrate with ReportPortal; instead, we integrate the RP agent inside our test framework. @DzmitryHumianiuk I wanted to share my new findings on the issue. I tried to redeploy with a lower RP version (5.6.3; before that I used version 5.7.4) and got the same behavior, but I noticed that the API service log shows Rabbit messages and the exchange with the API, and that FINISH_LAUNCH wasn't logged. However, it seems that all child items were processed, as I see Rabbit messages indicating FINISH_TEST for all items. On the test framework side, I'm sure that finishRQ was sent to ReportPortal. You can see in the following screenshot from the stack trace that finishRQ was received by RP: Actually, we discussed this stack trace before with @HardNorth; he noted that the issue is due to a limitation of the multi-process launch functionality with WSL and proposed this workaround: set rp.client.join=true and rp.client.join.mode=SOCKET. Unfortunately, it's not resolving the problem! @DzmitryHumianiuk Do you have any other assumptions or proposals for how to fix this? As I said, it's a blocker for our team and it makes ReportPortal unusable for us; if I don't fix it ASAP, I think we will decide not to use it. Thank you in advance! |
@amalbaccar-hub Unfortunately, if To work around that you need to implement a special running script yourself. As the first step you need to create a Launch and save its UUID. Then pass it as environment variable Maybe I should think of writing an article describing the workaround in detail. |
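The second half of that workaround, handing one saved launch UUID to every test process, might look roughly like the sketch below. The environment variable name is not given in this thread, so EXAMPLE_LAUNCH_UUID is a purely hypothetical placeholder, as are the class and method names; creating the launch itself is left out.

```java
import java.util.List;

final class SharedLaunchRunner {
    // Hypothetical name: the thread does not say which variable the agent reads.
    static final String LAUNCH_UUID_VAR = "EXAMPLE_LAUNCH_UUID";

    /**
     * Prepares a child process that sees the already-created launch UUID
     * in its environment. The caller decides when to start() it.
     */
    static ProcessBuilder childProcess(String launchUuid, List<String> command) {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.environment().put(LAUNCH_UUID_VAR, launchUuid);
        pb.inheritIO(); // show the child's console output in the parent
        return pb;
    }
}
```

A running script would create the launch once, call `childProcess(...)` for each test run with the same UUID, wait for all of them, and only then finish the launch.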
@HardNorth Thank you for the workaround, I'll try it and let you know. I would also appreciate it if you wrote that article and shared it with us :) Thanks in advance! |
@HardNorth @DzmitryHumianiuk I wanted to share some news with you. I tried to scale up the API service by doubling the value of the RP_AMQP_QUEUES env var (it was set to the default value of 10): - RP_AMQP_QUEUES=20 As you can see from the screenshot, the first launch was super fast (1 minute). The second launch took 5 minutes, and on the third attempt I got a Wait for Launch start timed out exception. When I checked the DB, I noticed that the third launch was not created (only the first two), and I got this exception in the service-api log: Launch '2ea7a1b3-cad9-4ca5-b965-bb0bdbb81ac4' not found. Did you use correct Launch ID? I guess this is a performance problem and I probably need to scale up my API service further. Am I wrong? |
@amalbaccar-hub ping me in the i'll open a trial account for you on our saas, to see if this is caused by the instance performance. |
@DzmitryHumianiuk Okay sure I'll ping you, thank you so much :) |
Sorry, I forgot to comment: the issue was resolved for us. It was a configuration problem; both reporters (step and scenario) were in use at the same time, which caused the issue. So I'm closing the issue. Thanks for the support. |
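A small guard against that misconfiguration could be sketched as follows. It assumes the two reporters are registered as Cucumber plugins under the class names com.epam.reportportal.cucumber.ScenarioReporter and com.epam.reportportal.cucumber.StepReporter; those names are not quoted in this thread, and the helper itself is hypothetical.

```java
import java.util.List;

final class ReporterConfigCheck {
    // Assumed plugin class names of the two ReportPortal Cucumber reporters.
    static final String SCENARIO = "com.epam.reportportal.cucumber.ScenarioReporter";
    static final String STEP = "com.epam.reportportal.cucumber.StepReporter";

    /** Returns true when both reporters appear in the same plugin list. */
    static boolean hasConflictingReporters(List<String> plugins) {
        return plugins.contains(SCENARIO) && plugins.contains(STEP);
    }
}
```

Running such a check against the plugin list at startup would flag the double-reporter setup before any launch is opened.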
I am having this issue: launches keep showing IN PROGRESS state while they should've completed.
This issue isn't always reproducible. We are using this agent: agent-java-cucumber2:5.1.5
I noticed this new release which came as a fix for a similar issue to the one I'm reporting (reportportal/agent-js-webdriverio#53 and reportportal/agent-js-webdriverio#47):
https://github.com/reportportal/agent-js-webdriverio/releases/tag/v5.1.0
@AlexGalichenko @DzmitryHumianiuk @HardNorth
Probably the same fix is needed for the Cucumber agent as well.