SDK 1.7.4 does not stream all events in one chunk in a distributed search #541
Comments
Hi @FritzWittwer, we are trying to reproduce the issue but are not sure about the indentation. Could you please provide the script with proper indentation?
Sorry for the delay, I was on vacation and did not see your comments. Below is the code with the indentation as it is used (with a slightly newer build number, as I did some more tests).
Hi @FritzWittwer, sorry for the delay in response. We are looking into this and will provide an update soon.
...an update: we assumed we could use Splunklib 1.6.12, but when we started to use it with production data, we learned that it also fails as soon as the result set streamed to it reaches a certain size. This size is larger than the 50'000 events we used to test the behavior, so it went unnoticed. "Large" in this context seems to mean the number of bytes, not the number of events. Splunklib 1.6.12 does not split the events into chunks as Splunklib 1.7.2 does, it just hangs. I suspect this was the reason something was changed so that 1.7.2 now splits the streamed events into chunks.
@FritzWittwer We appreciate the update with additional information. The team is investigating this issue.
We assumed we could use SDK 1.6.12 with non-distributed searches, and we only learned last week that this also fails with Python 3 and our real data. The test data we had used was good enough to show the different behavior between SDK 1.6.12 on Python 2 and SDK 1.7.4 on Python 3, but it did not show the blocking behavior of SDK 1.6.12 on Python 3 (see also my earlier comments). Thinking back, I assume the problem with the SDK blocking might be quite old: more than two years ago I tried to use Python 3 and the custom command was just hanging. We did not investigate further as there was no need to use Python 3 at the time.
Describe the bug
A distributed streaming custom command does not receive all events of the search in one invocation. Instead, it gets the events in several chunks.
We have a distributed streaming custom command which searches for hits of URLs in a list of IOCs. It has to process 10 million events containing a URL and compare them against a list of about a million IOCs. The custom command builds an optimized cache data structure to be able to process this load. This cache is stored in a local file on the indexer and loaded each time the command is invoked on an indexer.
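To make the caching pattern described above concrete, here is a minimal sketch. It is not the actual command: the real "optimized cache data structure" is not shown in this issue, so a plain `frozenset` persisted with `pickle` stands in for it. The function names, the `url` field name, and the lowercase normalization are all assumptions made for illustration.

```python
import os
import pickle
import tempfile


def build_cache(ioc_urls, cache_path):
    """Normalize IOC URLs into a frozenset and persist it to a local file.

    Assumption: lowercasing and stripping is enough normalization for the
    sketch; a real command may need case-sensitive paths or smarter parsing.
    """
    cache = frozenset(u.strip().lower() for u in ioc_urls if u.strip())
    with open(cache_path, "wb") as fh:
        pickle.dump(cache, fh, protocol=pickle.HIGHEST_PROTOCOL)
    return cache


def load_cache(cache_path):
    """Load the cache file; this is the expensive step paid per invocation.

    Only unpickle files that the app itself has written.
    """
    with open(cache_path, "rb") as fh:
        return pickle.load(fh)


def find_hits(events, cache, field="url"):
    """Yield only the events whose URL field is present in the IOC cache."""
    for event in events:
        if event.get(field, "").strip().lower() in cache:
            yield event


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "ioc_cache.pkl")  # hypothetical location
    build_cache(["http://bad.example/a"], path)
    cache = load_cache(path)
    print(list(find_hits([{"url": "http://bad.example/a"}, {"url": "http://ok.example/"}], cache)))
```

The point of the pattern is that `load_cache` runs once and `find_hits` then processes the whole event stream, which only pays off if all events arrive in a single invocation.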
This works well in our current environment, which consists of 96 indexers with Splunk 8.2.9, using Python 2 and splunklib 1.6.12. On average, each indexer processes about 100'000 events each time the custom command is called.
When switching to Python 3 and splunklib 1.7.4 in preparation for our planned migration to Splunk 9.x, we found the command failing. Further diagnosis showed that the command is called several times on each indexer, with just a small batch of events streamed on each invocation. This makes the whole caching mechanism useless, as it takes about 120 seconds to load the cache while processing the 100'000 events takes only a few seconds.
Further investigation shows that the problem lies in splunklib 1.7.4. Version 1.6.12 works as expected as long as the size of the streamed data is not too big; otherwise SDK 1.6.12 just hangs.
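A rough cost model shows why the chunked invocation hurts so much. The 120 second cache load and the 100'000 events per indexer come from the description above. The processing rate of 20'000 events per second (that is, 5 seconds for 100'000 events) and the batch size of 10'000 events are assumptions chosen only for illustration, since the actual batch sizes are not stated here.

```python
def estimate(events, events_per_invocation, cache_load_s=120.0, events_per_second=20_000.0):
    """Return (invocations, total_seconds) for one indexer.

    Each invocation pays the full cache load once; the processing time
    depends only on the total number of events.
    """
    invocations = (events + events_per_invocation - 1) // events_per_invocation
    total = invocations * cache_load_s + events / events_per_second
    return invocations, total


if __name__ == "__main__":
    # All events in a single invocation, as expected by the command.
    print(estimate(100_000, 100_000))
    # Same events split into hypothetical batches of 10'000 events.
    print(estimate(100_000, 10_000))
```

Under these assumptions the cache load dominates the run time as soon as the events are split across more than a handful of invocations.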
To Reproduce
Create an app containing a custom command p3s7lookupurldebugempty:
In ./bin, add the Python file p3s7lookupurldebugempty.py
Add it to commands.conf:
Add splunklib as ./bin/splunklib
Prepare test data:
Run the search:
See the results in /tmp/debug on the indexers (only one will show it as only one indexer will have to process the events)
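The debug script itself is not reproduced on this page. As a hedged illustration of the kind of logging that makes the chunking visible, the helper below passes records through unchanged and appends one line with the process ID and the number of events it saw. The helper name, the log format, and the log path argument are assumptions; in a splunklib `StreamingCommand`, the `stream(self, records)` method could return the result of such a wrapper.

```python
import os
import time


def count_and_log(records, log_path):
    """Yield records unchanged, then append one summary line to log_path.

    The line is written only after the input has been fully consumed, so
    each completed call produces exactly one line with its event count.
    """
    count = 0
    for record in records:
        count += 1
        yield record
    with open(log_path, "a", encoding="utf-8") as fh:
        fh.write(f"{time.time():.3f} pid={os.getpid()} events={count}\n")


if __name__ == "__main__":
    import tempfile

    path = os.path.join(tempfile.mkdtemp(), "debug.log")  # hypothetical log file
    list(count_and_log(iter([{"url": "a"}, {"url": "b"}]), path))
    with open(path, encoding="utf-8") as fh:
        print(fh.read(), end="")
```

Several lines with small event counts, instead of a single line with the full count, would indicate that the events arrived in batches. The process ID helps to tell repeated calls within one process apart from separate processes.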
Expected behavior
Expected result would be:
Splunk:
SDK (please complete the following information):
Additional context
This test was executed in the following environment:
All Splunk 8.2.0 (the same error also occurs on Splunk 8.2.9)