forked from Aloisius/nutch
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Notable changes:
- NUTCH-2959 upgrade Tika to 2.9.0
- NUTCH-2990 HttpRobotRulesParser to follow 5 redirects as specified by RFC 9309
- NUTCH-3011 HttpRobotRulesParser: handle HTTP 429 Too Many Requests same as server errors (HTTP 5xx)
- NUTCH-3002 Protocol-okhttp HttpResponse: HTTP header metadata lookup should be case-insensitive

Notes:
- the upgrade to Tika 2.9.0 is based on a shaded package to get around a conflict with the Hadoop-provided dependency to commons-io (Hadoop ships with 2.8.0 but Tika requires 2.11.0)
- because no module-level shaded Tika packages are available, Nutch core for now already includes the Tika standards parsers package and parse-tika relies on the package provided via Nutch core
- cf. the comments and modifications in ivy/ivy.xml and src/plugin/parse-tika/{ivy,plugin}.xml
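Three of the behaviours named in the commit message (the redirect limit of NUTCH-2990, the HTTP 429 handling of NUTCH-3011, and the case-insensitive header lookup of NUTCH-3002) can be illustrated with a minimal Java sketch. This is not the code from the commit: the class `CommitNotesSketch` and all of its method names are hypothetical, and the sketch only assumes what the message states, namely a limit of 5 redirects, 429 grouped with 5xx, and header names compared without regard to case.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical illustration of the commit message; not taken from Nutch. */
public class CommitNotesSketch {

  /** Redirect limit for robots.txt named in the commit message (NUTCH-2990). */
  static final int MAX_ROBOTS_REDIRECTS = 5;

  /** True while another robots.txt redirect may still be followed. */
  static boolean mayFollowRedirect(int redirectsFollowed) {
    return redirectsFollowed < MAX_ROBOTS_REDIRECTS;
  }

  /** True if the status is handled like a server error: 5xx, or 429 (NUTCH-3011). */
  static boolean isHandledAsServerError(int status) {
    return status == 429 || (status >= 500 && status < 600);
  }

  /** A header map whose keys compare case-insensitively (NUTCH-3002). */
  static Map<String, String> newHeaderMap() {
    return new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
  }

  public static void main(String[] args) {
    Map<String, String> headers = newHeaderMap();
    headers.put("Content-Type", "text/html");
    // The lookup succeeds although the case of the name differs.
    System.out.println(headers.get("content-type"));
    System.out.println(isHandledAsServerError(429));
    System.out.println(mayFollowRedirect(5));
  }
}
```

A `TreeMap` ordered by `String.CASE_INSENSITIVE_ORDER` is only one way to get case-insensitive keys; the approach actually taken in protocol-okhttp may differ.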
Showing 221 changed files with 2,149 additions and 5,612 deletions.
@@ -0,0 +1,37 @@
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

name: master pr build

on:
  schedule:
    - cron: '0 0 * * *' # every day at midnight

jobs:
  dependency-check:
    strategy:
      matrix:
        java: ['11']
        os: [ubuntu-latest]
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4
      - name: Set up JDK ${{ matrix.java }}
        uses: actions/setup-java@v3
        with:
          java-version: ${{ matrix.java }}
          distribution: 'temurin'
      - name: Dependency check
        run: ant clean dependency-check -buildfile build.xml
@@ -27,3 +27,5 @@ naivebayes-model
csvindexwriter
lib/spotbugs-*
ivy/dependency-check-ant/*
.gradle*
ivy/apache-rat-*