Watchdog timeout on large point count N4 Supervisors w/ read op #14
Comments
Hi @tblong, that is a lot of points! I haven't tested this before with a station of that size, and no one else has reported using nHaystack in an application with that many points. I'll need to have a look at the code and see what we can do.
Hi @tblong, I have done an initial investigation into this situation, and making a change to the threading arrangement for the servlet isn't as simple as I first thought. I am going to try to set up a test station with the number of components you have, make some changes, and see what happens. This is quite a significant change, so I want to proceed carefully.
@tblong I have tried a couple of different setups today. The first setup I built had 250,000 points. I didn't get a watchdog timeout, but I did hit out-of-memory issues. For the second setup I lowered the station to 150,000 points. The 'read' query with your filter worked over the REST API, however it is holding on to a lot of memory. I'm doing all this on a Windows Virtual Machine on my Mac, with 16GB RAM and 4 cores allocated and the default memory settings for the station JVM. Can you provide more details on your Supervisor configuration? I think there is a problem, but it's more around memory management at this scale. I'm not seeing watchdog timeouts and station restarts as you describe. I am using the latest code, though I don't think that should make much of a difference.
@tblong also I just tested the use of the
@ci-richard-mcelhinney Thanks very much for the help digging in here. So it seems this might just be a max-heap setting issue? Is there a possibility for memory improvements in how nHaystack crawls through the station during a read op as it gathers the response data? I will be on holiday from 6/30 to 7/7 but will work on gathering all the station metrics and config settings I can on my return.
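For context on the kind of memory improvement being asked about here, below is a minimal, hypothetical sketch (not nHaystack's actual code; `ReadOpSketch`, `readAllRows`, `streamRows`, and `encodeRow` are made-up names) contrasting a read op that accumulates the whole response in memory with one that streams rows out as the station is crawled.

```java
// Minimal sketch (not nHaystack's actual code) contrasting two ways a read op
// could emit its result rows. All names here are hypothetical stand-ins.
import java.io.IOException;
import java.io.Writer;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class ReadOpSketch {

    // Accumulating approach: the whole result set lives on the heap at once,
    // which is what appears to hurt at 150k+ points.
    static List<String> readAllRows(Iterator<String> pointIds) {
        List<String> rows = new ArrayList<>();
        while (pointIds.hasNext()) {
            rows.add(encodeRow(pointIds.next()));
        }
        return rows;
    }

    // Streaming approach: each row is encoded and written immediately,
    // so memory use stays roughly constant regardless of point count.
    static void streamRows(Iterator<String> pointIds, Writer out) throws IOException {
        while (pointIds.hasNext()) {
            out.write(encodeRow(pointIds.next()));
            out.write('\n');
        }
        out.flush();
    }

    // Placeholder for turning a point into a Zinc/JSON row.
    static String encodeRow(String pointId) {
        return "@" + pointId + ", point, cur";
    }
}
```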
@tblong I've also determined that the REST API requests are not serviced on the Engine Thread in the latest code. I'm not sure about the version you are using. So if you can upgrade, you should get similar results to me and hopefully won't see watchdog timeouts.
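To illustrate what servicing requests off the engine thread means in general terms, here is a hedged sketch (not the actual nHaystack servlet; `OffEngineThreadSketch`, `handleReadRequest`, and `runQuery` are illustrative names) of handing a slow query to a dedicated executor so it cannot block a shared control/engine thread.

```java
// Illustrative sketch only: one way a long-running REST request can be
// serviced on a dedicated thread pool instead of a shared "engine" thread.
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class OffEngineThreadSketch {

    // Dedicated pool for REST requests so slow queries cannot starve the
    // thread that runs station control logic.
    private static final ExecutorService REST_POOL = Executors.newFixedThreadPool(4);

    public static String handleReadRequest(String filter) throws Exception {
        Future<String> result = REST_POOL.submit(() -> runQuery(filter));
        return result.get(); // the calling HTTP thread waits, not the engine thread
    }

    // Placeholder for the expensive station crawl.
    private static String runQuery(String filter) {
        return "grid for filter: " + filter;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(handleReadRequest("point and cur"));
        REST_POOL.shutdown();
    }
}
```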
@ci-richard-mcelhinney I gathered the station metrics below today. We only had browser access for this session, so we were not able to confirm the actual max-heap setting, but we were still able to capture the station's memory metrics. The nHaystack version is v3.2.0. [station memory metrics omitted] Let me know if there is any other metric I can grab that would help further.
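If the max-heap setting cannot be read from the platform directly, one generic JVM approach (not Niagara-specific; whether it can be run inside the station depends on your access) is to query the memory MXBean:

```java
// Generic JVM approach to capturing the max-heap (-Xmx) and current heap
// usage when the platform configuration is not directly accessible.
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class HeapMetrics {
    public static void main(String[] args) {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long mb = 1024L * 1024L;
        System.out.println("max heap (-Xmx): " + heap.getMax() / mb + " MB");
        System.out.println("committed heap : " + heap.getCommitted() / mb + " MB");
        System.out.println("used heap      : " + heap.getUsed() / mb + " MB");
    }
}
```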
Related to nHaystack v3.0.1+. When performing a read operation such as

`read?filter=point+and+cur`

against a large point count N4 Supervisor (as above), we have seen the watchdog timeout get triggered and the station restart. The watchdog event occurs even when adding the optional `limit` parameter with a low value.
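For reproduction purposes, a minimal client request might look like the sketch below. The host name and the `/haystack/` servlet path are assumptions that depend on the station's service configuration, and authentication is omitted.

```java
// Hedged example of issuing the read op over the REST API with a low limit.
// Host and servlet path are assumptions; authentication is omitted.
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class ReadRequestExample {
    public static void main(String[] args) throws Exception {
        String filter = URLEncoder.encode("point and cur", StandardCharsets.UTF_8);
        URI uri = URI.create("http://supervisor.example.com/haystack/read?filter="
                + filter + "&limit=10");

        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());

        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```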
Questions:

1. Is the `read` operation currently executed on the main engine thread within Niagara?
2. Is there an opportunity to optimize the `read` op here, or to ensure the op performs its work off the main engine thread?