-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: allow non-Elasticsearch search engines when reindexing courses [FC-0062] #35743
base: master
Are you sure you want to change the base?
fix: allow non-Elasticsearch search engines when reindexing courses [FC-0062] #35743
Conversation
Thanks for the pull request, @pomegranited! What's next?Please work through the following steps to get your changes ready for engineering review: 🔘 Get product approvalIf you haven't already, check this list to see if your contribution needs to go through the product review process.
🔘 Provide contextTo help your reviewers and other members of the community understand the purpose and larger context of your changes, feel free to add as much of the following information to the PR description as you can:
🔘 Get a green buildIf one or more checks are failing, continue working on your changes until this is no longer the case and your build turns green. 🔘 Let us know that your PR is ready for review:Who will review my changes?This repository is currently maintained by Where can I find more information?If you'd like to get more details on all aspects of the review process for open source pull requests (OSPRs), check out the following resources:
When can I expect my changes to be merged?Our goal is to get community contributions seen and reviewed as efficiently as possible. However, the amount of time that it takes to review and merge a PR can vary significantly based on factors such as:
💡 As a result it may take up to several weeks or months to complete a review and merge your PR. |
cms/djangoapps/contentstore/management/commands/reindex_course.py
Outdated
Show resolved
Hide resolved
Hi, @pomegranited! This crashed the first two times that I ran it. First run
Second run
The third run went "fine." We had a problem with transcripts (the same happens with our studio_content), but there was also another in another course for which I don't know the cause.
|
Tutor will handle this step for us.
Good catches! Thank you for trying this on an empty Meilisearch instance, I had missed that. I ended up following Regis's advice in this comment by removing the Meilisearch-specific code from this script and just handled these exceptions.
I don't know what's wrong with that course document either? I only see strings + datetimes in the log line for that course, and MeilisearchEngine handles serializing datetimes, so not sure what field it's balking at. But if this is a problem for people running real courses, we'll need to fix it in edx-search, likely in process_document()), as part of a separate PR. Could you try again with the updated code and instructions, and let me know if you see anything else weird? |
cms/djangoapps/contentstore/management/commands/reindex_course.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍
Thank you for your work, @pomegranited!
- I tested this using the instructions from the PR
- I read through the code
-
I checked for accessibility issuesN/A - Includes documentation
The index creation still throws an error, but it is not related to this PR.
The issue is here:
https://github.com/openedx/edx-search/blob/91686e96699cb8678bd12e2ef42ac51128c13f42/search/meilisearch.py#L233-L237
The client.create_index
call is async. It returns a task that you should wait to make sure the index is created. As we are calling client.get_index
right after, the index has not been created yet.
We have a helper function that does that here:
https://github.com/open-craft/edx-platform/blob/7a99d592725c32d4730c0efa9960d59e23def7dc/openedx/core/djangoapps/content/search/api.py#L132-L148
Good point @rpenido. I also faced this issue in student_notes. I have a fix which I think is simpler: https://github.com/openedx/edx-notes-api/pull/444/files
@pomegranited EDIT: there you go openedx/edx-search#166 |
In `search.meilisearch.create_indexes`, we were not waiting for the index creation tasks to complete. This was causing a potential race condition, where the `create_indexes` function would fail because it took a few seconds for the index creation to succeed. See the relevant conversation here: openedx/edx-platform#35743 (comment)
In `search.meilisearch.create_indexes`, we were not waiting for the index creation tasks to complete. This was causing a potential race condition, where the `create_indexes` function would fail because it took a few seconds for the index creation to succeed. See the relevant conversation here: openedx/edx-platform#35743 (comment)
In `search.meilisearch.create_indexes`, we were not waiting for the index creation tasks to complete. This was causing a potential race condition, where the `create_indexes` function would fail because it took a few seconds for the index creation to succeed. See the relevant conversation here: openedx/edx-platform#35743 (comment)
In `search.meilisearch.create_indexes`, we were not waiting for the index creation tasks to complete. This was causing a potential race condition, where the `create_indexes` function would fail because it took a few seconds for the index creation to succeed. See the relevant conversation here: openedx/edx-platform#35743 (comment)
In `search.meilisearch.create_indexes`, we were not waiting for the index creation tasks to complete. This was causing a potential race condition, where the `create_indexes` function would fail because it took a few seconds for the index creation to succeed. See the relevant conversation here: openedx/edx-platform#35743 (comment)
In `search.meilisearch.create_indexes`, we were not waiting for the index creation tasks to complete. This was causing a potential race condition, where the `create_indexes` function would fail because it took a few seconds for the index creation to succeed. See the relevant conversation here: openedx/edx-platform#35743 (comment)
Brilliant, thank you for your help @regisb and @rpenido ! I've reduced this change down to the bare minimum, wrapping the ES-specific logic and removing all the Meilisearch-specific stuff. Also bumped the edx-search version to fix the race condition error noted in comments. @rpenido do you want to run another round of tests to make sure that issue is resolved? |
@pomegranited I faced the same issue as @rpenido while running Also noticed that the reindex command (meilisearch api) raises error while indexing empty courses i.e. empty lists.
|
Same here. It seems that I tried the following:
Checking version with: |
Addressed with 2e8d7ed.
The package update has been merged: #35755 So I've merged latest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here. It seems that edx-seach is not updating from 4.1.0 to 4.1.1
@rpenido You'll need to rebuild openedx-dev image and launch with --skip-build
option for updating the package permanently OR you need to run tutor dev exec cms make requirements
(notice the use of exec
instead of run
) to update requirements in your currently running container but it will be lost once the container is deleted.
run
command creates a new one-off container so it will have the old version till you rebuild the image and any updates using this one-off container is lost as soon as the run command completes so next run of run
command will not reflect any updates.
@pomegranited Nice work! I don't see any errors now.
- I tested this: (reindexed all courses)
- I read through the code
-
I checked for accessibility issues -
Includes documentation
Description
openedx/edx-search#164 introduced the MeilisearchEngine to replace Elasticsearch for Sumac and later.
However, the management command used to set up the course indexes has Elasticsearch-engine specific code in it, which fails when run with the Meilisearch engine.
This PR only fixes the error so that this management command can be used when indexing courses on Meilisearch, but it does not set up the Meilisearch indexes -- this will be handled by Tutor.
Supporting information
Relates to: overhangio/tutor#1141
Part of: openedx/modular-learning#236
Private-ref: FAL-3903
Testing instructions
Deadline
ASAP -- also needs to be backported to Sumac.
Other information
create_indexes()
when using--setup
, but decided not to pollute the script with more engine-specific logic, and instead let Tutor handle this step.