What's new in StormCrawler 2.5
Disclaimer
This is a Pre-ASF release and did not undergo a formal review by the PMC.
In a nutshell
- various dependency upgrades (JSoup, CrawlerCommons, Tika, Elasticsearch)
- Java 11
- bugfix: AggregationSpout sometimes does not release the IsInQuery boolean
- various improvements to the URLFrontier module
In more detail
- FEATURE-964: custom crawl delay per page by @juli-alvarez in #967
- Issue 970 HttpProtocol doesn't consider http.content.limit in test for filesize by @wowasa in #972
- Add ChannelManager for local channel management and constants to Spout.java by @FelixEngl in #982
- Fix error when spaces in path to test-resources of StatusBoltTest in ElasticSearch-Module by @FelixEngl in #985
- Add unit test basics for URLFrontier. by @FelixEngl in #984
- Fix starvation and busy waiting of StatusUpdaterBolt.java, add Constants. by @FelixEngl in #983
- Fix starvation and busy waiting of ES StatusUpdaterBolt (Fixes #986) by @FelixEngl in #988
- Fix starvation and busy waiting of ES IndexerBolt by @FelixEngl in #989
- HttpProtocol use the md protocol.set-headers to add custom header by url by @Mikwiss in #993
Full Changelog: 2.4...storm-crawler-2.5