Don't try to bulk index empty body #1304

Merged
andersju merged 1 commit into develop on Sep 11, 2023

andersju
Member

Reindexing QA failed:

Reindex failed with:
whelk.exception.WhelkRuntimeException: Task threw exception: whelk.exception.UnexpectedHttpStatusException: 400: {"error":{"root_cause":[{"type":"parse_exception","reason":"request body is required"}],"type":"parse_exception","reason":"request body is required"},"status":400}
callstack:
null
PANIC ABORT, unhandled exception:
whelk.exception.WhelkRuntimeException: Task threw exception: whelk.exception.UnexpectedHttpStatusException: 400: {"error":{"root_cause":[{"type":"parse_exception","reason":"request body is required"}],"type":"parse_exception","reason":"request body is required"},"status":400}
whelk.exception.WhelkRuntimeException: Task threw exception: whelk.exception.UnexpectedHttpStatusException: 400: {"error":{"root_cause":[{"type":"parse_exception","reason":"request body is required"}],"type":"parse_exception","reason":"request body is required"},"status":400}
        at whelk.util.BlockingThreadPool$Queue.checkResult(BlockingThreadPool.java:109)
        at whelk.util.BlockingThreadPool$Queue.awaitAll(BlockingThreadPool.java:96)
        at whelk.util.BlockingThreadPool$SimplePool.awaitAllAndShutdown(BlockingThreadPool.java:143)
        at whelk.util.BlockingThreadPool$SimplePool$awaitAllAndShutdown$0.call(Unknown Source)
        at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:47)
        at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:125)
        at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:130)
        at whelk.reindexer.ElasticReindexer.reindex(ElasticReindexer.groovy:99)
        at whelk.reindexer.ElasticReindexer$reindex.call(Unknown Source)

The cause: a bunch of docs (~11k) had null as the value of descriptionLastModifiers.@id, so each of them failed to index and at least one bulk request ended up with an empty body:

2023-09-09T09:01:37,503 [SimplePool-14] ERROR whelk.component.ElasticSearch - Failed to index brgstdh08j6l9jkn in elastic: java.lang.NullPointerException: Cannot invoke "String.startsWith(String)" because "id" is null
java.lang.NullPointerException: Cannot invoke "String.startsWith(String)" because "id" is null
        at whelk.JsonLd.getReferencedBNodes(JsonLd.groovy:1395) ~[xlimporter.jar:?]
        at whelk.JsonLd.getReferencedBNodes(JsonLd.groovy:1404) ~[xlimporter.jar:?]
        at whelk.JsonLd.getReferencedBNodes(JsonLd.groovy:1404) ~[xlimporter.jar:?]
        at whelk.JsonLd.cleanUp(JsonLd.groovy:1381) ~[xlimporter.jar:?]
        at whelk.JsonLd.frame(JsonLd.groovy:1263) ~[xlimporter.jar:?]

So, don't send a bulk index request to Elasticsearch if the request body is empty; log a warning instead. We'll have to look at the offending records separately.
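
For illustration, here is a minimal Java sketch of the guard described above. It is not the code from this PR: `BulkIndexSketch`, `toBulkLines`, `buildBulkBody`, `bulkIndex` and the `sender` callback are hypothetical names, and the per-document conversion is a stand-in that fails on a null `@id` much like the NullPointerException in the log. It also assumes ids need no JSON escaping.

```java
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.function.Consumer;

public class BulkIndexSketch {

    // Hypothetical stand-in for the per-document conversion step.
    // Throws NullPointerException when "@id" is null.
    static String toBulkLines(Map<String, String> doc) {
        String id = Objects.requireNonNull(doc.get("@id"), "id is null");
        // Bulk format: one action line, then one source line, each newline-terminated.
        return "{\"index\":{\"_id\":\"" + id + "\"}}\n"
             + "{\"@id\":\"" + id + "\"}\n";
    }

    // Builds the bulk body, logging and skipping documents that fail to convert.
    static String buildBulkBody(List<Map<String, String>> docs) {
        StringBuilder body = new StringBuilder();
        for (Map<String, String> doc : docs) {
            try {
                body.append(toBulkLines(doc));
            } catch (RuntimeException e) {
                System.err.println("Failed to index document: " + e);
            }
        }
        return body.toString();
    }

    // Sends the bulk body through `sender` unless it is empty.
    // Returns true if a request was sent, false if it was skipped.
    static boolean bulkIndex(List<Map<String, String>> docs, Consumer<String> sender) {
        String body = buildBulkBody(docs);
        if (body.isEmpty()) {
            // Every document in the batch failed; sending "" would give a 400.
            System.err.println("Skipping bulk index of " + docs.size()
                    + " docs: request body is empty");
            return false;
        }
        sender.accept(body);
        return true;
    }
}
```

With a guard like this, a batch in which every document fails produces a warning rather than a request that Elasticsearch rejects with "request body is required", so one bad batch no longer aborts the whole reindex.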

Contributor

kwahlin left a comment

LGTM!

andersju merged commit 74489ae into develop on Sep 11, 2023
1 check passed