fix: allow non-Elasticsearch search engines when reindexing courses [FC-0062] #35743

Open
wants to merge 7 commits into
base: master

pomegranited (Contributor) commented Oct 30, 2024

Description

openedx/edx-search#164 introduced the MeilisearchEngine to replace Elasticsearch for Sumac and later.

However, the management command used to set up the course indexes has Elasticsearch-engine specific code in it, which fails when run with the Meilisearch engine.

This PR only fixes the error so that this management command can be used when indexing courses on Meilisearch, but it does not set up the Meilisearch indexes -- this will be handled by Tutor.
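
A minimal sketch of the kind of guard this implies, assuming the command can tell which engine is configured. The class names and the `create_es_index` method are hypothetical stand-ins for illustration; they are not the actual edx-search or edx-platform code.

```python
class SearchEngine:
    """Hypothetical stand-in for a generic search engine base class."""


class ElasticSearchEngine(SearchEngine):
    """Hypothetical stand-in for the Elasticsearch-backed engine."""

    def __init__(self):
        self.created = []

    def create_es_index(self, name):
        # Hypothetical Elasticsearch-only setup step.
        self.created.append(name)


class MeilisearchEngine(SearchEngine):
    """Hypothetical stand-in; its indexes are created elsewhere (e.g. by Tutor)."""


def setup_index(engine, index_name):
    """Run the Elasticsearch-only setup when applicable.

    Returns True if setup ran, False if it was skipped for another engine.
    """
    if not isinstance(engine, ElasticSearchEngine):
        return False
    engine.create_es_index(index_name)
    return True
```

With a guard of this shape, `--setup` becomes a no-op for engines the command does not know how to prepare, instead of failing.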

Supporting information

Relates to: overhangio/tutor#1141
Part of: openedx/modular-learning#236
Private-ref: FAL-3903

Testing instructions

  1. Set up your Tutor dev environment, either by using the branch from overhangio/tutor#1141 (feat: replace Elasticsearch by Meilisearch) or by installing and enabling the Meilisearch plugin.
  2. Enable course search indexing, e.g. with this plugin:
    name: courseware_search
    version: 0.1.0
    patches:
      openedx-common-settings: |
        SEARCH_ENGINE = "search.meilisearch.MeilisearchEngine"
      lms-env-features: |
        ENABLE_COURSEWARE_INDEX: true
        ENABLE_COURSEWARE_SEARCH: true
      cms-env-features: |
        ENABLE_COURSEWARE_INDEX: true
        ENABLE_COURSEWARE_SEARCH: true
  3. Create the Meilisearch indexes:
    tutor dev run cms ./manage.py cms shell -c "from search.meilisearch import create_indexes; create_indexes()"
    
  4. Run the index setup command, and confirm that all courses are indexed:
    tutor dev run cms ./manage.py cms reindex_course --all --setup
    tutor dev run cms ./manage.py cms reindex_course --all
    
  5. Confirm that you can reindex a course's content in Studio.
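
For step 4, one way to confirm what was indexed is to scan the command output for the `edx.course.index.reindexed` tracking events that the reindex emits. The sketch below assumes the log line layout seen in this thread (a prefix, then ` - `, then a JSON payload); `reindexed_counts` is a hypothetical helper name, not part of edx-platform.

```python
import json


def reindexed_counts(log_lines):
    """Return (event name, indexed_count) pairs found in tracking log lines."""
    results = []
    for line in log_lines:
        if "[tracking]" not in line:
            continue
        # The JSON payload follows the first " - " separator in the line.
        _, _, payload = line.partition(" - ")
        try:
            event = json.loads(payload)
        except json.JSONDecodeError:
            continue
        if not isinstance(event, dict):
            continue
        if event.get("name", "").endswith(".reindexed"):
            results.append((event["name"], event["event"]["indexed_count"]))
    return results
```

Feeding it the captured output of `reindex_course --all` gives one pair per tracking event, which can be compared against the number of courses reported in the "Reindexing N courses..." line.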

Deadline

ASAP -- also needs to be backported to Sumac.

Other information

  1. I originally proposed running create_indexes() when --setup is used, but decided not to pollute the script with more engine-specific logic, and to let Tutor handle this step instead.

@openedx-webhooks added the open-source-contribution label (PR author is not from Axim or 2U) Oct 30, 2024
@openedx-webhooks

Thanks for the pull request, @pomegranited!

What's next?

Please work through the following steps to get your changes ready for engineering review:

🔘 Get product approval

If you haven't already, check this list to see if your contribution needs to go through the product review process.

  • If it does, you'll need to submit a product proposal for your contribution, and have it reviewed by the Product Working Group.
    • This process (including the steps you'll need to take) is documented here.
  • If it doesn't, simply proceed with the next step.

🔘 Provide context

To help your reviewers and other members of the community understand the purpose and larger context of your changes, feel free to add as much of the following information to the PR description as you can:

  • Dependencies

    This PR must be merged before / after / at the same time as ...

  • Blockers

    This PR is waiting for OEP-1234 to be accepted.

  • Timeline information

    This PR must be merged by XX date because ...

  • Partner information

    This is for a course on edx.org.

  • Supporting documentation
  • Relevant Open edX discussion forum threads

🔘 Get a green build

If one or more checks are failing, continue working on your changes until this is no longer the case and your build turns green.

🔘 Let us know that your PR is ready for review:

Who will review my changes?

This repository is currently maintained by @openedx/wg-maintenance-edx-platform. Tag them in a comment and let them know that your changes are ready for review.

Where can I find more information?

If you'd like to get more details on all aspects of the review process for open source pull requests (OSPRs), check out the following resources:

When can I expect my changes to be merged?

Our goal is to get community contributions seen and reviewed as efficiently as possible.

However, the amount of time that it takes to review and merge a PR can vary significantly based on factors such as:

  • The size and impact of the changes that it introduces
  • The need for product review
  • Maintenance status of the parent repository

💡 As a result it may take up to several weeks or months to complete a review and merge your PR.

@pomegranited changed the title from "fix: support Meilisearch search engine during setup" to "fix: support Meilisearch search engine during course index setup" Oct 30, 2024
rpenido (Contributor) commented Oct 30, 2024

Hi, @pomegranited!

This crashed the first two times that I ran it.

First run
2024-10-30 18:24:44,908 WARNING 1 [py.warnings] [user None] [ip None] warnings.py:110 - /openedx/venv/lib/python3.11/site-packages/pgpy/constants.py:5: DeprecationWarning: 'imghdr' is deprecated and slated for removal in Python 3.13
  import imghdr

2024-10-30 18:24:45,676 WARNING 1 [py.warnings] [user None] [ip None] warnings.py:110 - <frozen importlib._bootstrap>:1047: ImportWarning: _SixMetaPathImporter.find_spec() not found; falling back to find_module()

2024-10-30 18:24:45,676 WARNING 1 [py.warnings] [user None] [ip None] warnings.py:110 - <frozen importlib._bootstrap>:673: ImportWarning: _SixMetaPathImporter.exec_module() not found; falling back to load_module()

2024-10-30 18:24:46,417 INFO 1 [root] [user None] [ip None] reindex_course.py:103 - Creating meilisearch index for courseware_content
Traceback (most recent call last):
  File "/openedx/venv/lib/python3.11/site-packages/meilisearch/_httprequests.py", line 135, in __validate
    request.raise_for_status()
  File "/openedx/venv/lib/python3.11/site-packages/requests/models.py", line 1024, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: http://meilisearch:7700/indexes/tutor_courseware_content

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/openedx/venv/lib/python3.11/site-packages/search/meilisearch.py", line 229, in create_indexes
    index = client.get_index(meilisearch_index_name)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/venv/lib/python3.11/site-packages/meilisearch/client.py", line 181, in get_index
    return Index(self.config, uid).fetch_info()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/venv/lib/python3.11/site-packages/meilisearch/index.py", line 131, in fetch_info
    index_dict = self.http.get(f"{self.config.paths.index}/{self.uid}")
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/venv/lib/python3.11/site-packages/meilisearch/_httprequests.py", line 76, in get
    return self.send_request(requests.get, path)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/venv/lib/python3.11/site-packages/meilisearch/_httprequests.py", line 68, in send_request
    return self.__validate(request)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/venv/lib/python3.11/site-packages/meilisearch/_httprequests.py", line 138, in __validate
    raise MeilisearchApiError(str(err), request) from err
meilisearch.errors.MeilisearchApiError: MeilisearchApiError. Error code: index_not_found. Error message: Index `tutor_courseware_content` not found. Error documentation: https://docs.meilisearch.com/errors#index_not_found Error type: invalid_request

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/openedx/venv/lib/python3.11/site-packages/meilisearch/_httprequests.py", line 135, in __validate
    request.raise_for_status()
  File "/openedx/venv/lib/python3.11/site-packages/requests/models.py", line 1024, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: http://meilisearch:7700/indexes/tutor_courseware_content

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/openedx/edx-platform/./manage.py", line 106, in <module>
    execute_from_command_line([sys.argv[0]] + django_args)
  File "/openedx/venv/lib/python3.11/site-packages/django/core/management/__init__.py", line 442, in execute_from_command_line
    utility.execute()
  File "/openedx/venv/lib/python3.11/site-packages/django/core/management/__init__.py", line 436, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/openedx/venv/lib/python3.11/site-packages/django/core/management/base.py", line 412, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/openedx/venv/lib/python3.11/site-packages/django/core/management/base.py", line 458, in execute
    output = self.handle(*args, **options)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/edx-platform/cms/djangoapps/contentstore/management/commands/reindex_course.py", line 104, in handle
    create_indexes({index_name: INDEX_FILTERABLES[index_name]})
  File "/openedx/venv/lib/python3.11/site-packages/search/meilisearch.py", line 237, in create_indexes
    index = client.get_index(meilisearch_index_name)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/venv/lib/python3.11/site-packages/meilisearch/client.py", line 181, in get_index
    return Index(self.config, uid).fetch_info()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/venv/lib/python3.11/site-packages/meilisearch/index.py", line 131, in fetch_info
    index_dict = self.http.get(f"{self.config.paths.index}/{self.uid}")
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/venv/lib/python3.11/site-packages/meilisearch/_httprequests.py", line 76, in get
    return self.send_request(requests.get, path)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/venv/lib/python3.11/site-packages/meilisearch/_httprequests.py", line 68, in send_request
    return self.__validate(request)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/venv/lib/python3.11/site-packages/meilisearch/_httprequests.py", line 138, in __validate
    raise MeilisearchApiError(str(err), request) from err
meilisearch.errors.MeilisearchApiError: MeilisearchApiError. Error code: index_not_found. Error message: Index `tutor_courseware_content` not found. Error documentation: https://docs.meilisearch.com/errors#index_not_found Error type: invalid_request
Error: Command failed with status 1: docker compose -f /home/rpenido/.local/share/tutor-nightly/env/local/docker-compose.yml -f /home/rpenido/.local/share/tutor-nightly/env/dev/docker-compose.yml --project-name tutor_nightly_dev run --rm cms ./manage.py cms reindex_course --all --setup
Second run
2024-10-30 18:29:30,586 WARNING 1 [py.warnings] [user None] [ip None] warnings.py:110 - /openedx/venv/lib/python3.11/site-packages/pgpy/constants.py:5: DeprecationWarning: 'imghdr' is deprecated and slated for removal in Python 3.13
  import imghdr

2024-10-30 18:29:31,365 WARNING 1 [py.warnings] [user None] [ip None] warnings.py:110 - <frozen importlib._bootstrap>:1047: ImportWarning: _SixMetaPathImporter.find_spec() not found; falling back to find_module()

2024-10-30 18:29:31,365 WARNING 1 [py.warnings] [user None] [ip None] warnings.py:110 - <frozen importlib._bootstrap>:673: ImportWarning: _SixMetaPathImporter.exec_module() not found; falling back to load_module()

2024-10-30 18:29:32,104 INFO 1 [root] [user None] [ip None] reindex_course.py:103 - Creating meilisearch index for courseware_content
2024-10-30 18:29:32,118 INFO 1 [root] [user None] [ip None] reindex_course.py:103 - Creating meilisearch index for course_info
Traceback (most recent call last):
  File "/openedx/venv/lib/python3.11/site-packages/meilisearch/_httprequests.py", line 135, in __validate
    request.raise_for_status()
  File "/openedx/venv/lib/python3.11/site-packages/requests/models.py", line 1024, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: http://meilisearch:7700/indexes/tutor_course_info

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/openedx/venv/lib/python3.11/site-packages/search/meilisearch.py", line 229, in create_indexes
    index = client.get_index(meilisearch_index_name)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/venv/lib/python3.11/site-packages/meilisearch/client.py", line 181, in get_index
    return Index(self.config, uid).fetch_info()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/venv/lib/python3.11/site-packages/meilisearch/index.py", line 131, in fetch_info
    index_dict = self.http.get(f"{self.config.paths.index}/{self.uid}")
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/venv/lib/python3.11/site-packages/meilisearch/_httprequests.py", line 76, in get
    return self.send_request(requests.get, path)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/venv/lib/python3.11/site-packages/meilisearch/_httprequests.py", line 68, in send_request
    return self.__validate(request)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/venv/lib/python3.11/site-packages/meilisearch/_httprequests.py", line 138, in __validate
    raise MeilisearchApiError(str(err), request) from err
meilisearch.errors.MeilisearchApiError: MeilisearchApiError. Error code: index_not_found. Error message: Index `tutor_course_info` not found. Error documentation: https://docs.meilisearch.com/errors#index_not_found Error type: invalid_request

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/openedx/venv/lib/python3.11/site-packages/meilisearch/_httprequests.py", line 135, in __validate
    request.raise_for_status()
  File "/openedx/venv/lib/python3.11/site-packages/requests/models.py", line 1024, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: http://meilisearch:7700/indexes/tutor_course_info

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/openedx/edx-platform/./manage.py", line 106, in <module>
    execute_from_command_line([sys.argv[0]] + django_args)
  File "/openedx/venv/lib/python3.11/site-packages/django/core/management/__init__.py", line 442, in execute_from_command_line
    utility.execute()
  File "/openedx/venv/lib/python3.11/site-packages/django/core/management/__init__.py", line 436, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/openedx/venv/lib/python3.11/site-packages/django/core/management/base.py", line 412, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/openedx/venv/lib/python3.11/site-packages/django/core/management/base.py", line 458, in execute
    output = self.handle(*args, **options)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/edx-platform/cms/djangoapps/contentstore/management/commands/reindex_course.py", line 104, in handle
    create_indexes({index_name: INDEX_FILTERABLES[index_name]})
  File "/openedx/venv/lib/python3.11/site-packages/search/meilisearch.py", line 237, in create_indexes
    index = client.get_index(meilisearch_index_name)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/venv/lib/python3.11/site-packages/meilisearch/client.py", line 181, in get_index
    return Index(self.config, uid).fetch_info()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/venv/lib/python3.11/site-packages/meilisearch/index.py", line 131, in fetch_info
    index_dict = self.http.get(f"{self.config.paths.index}/{self.uid}")
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/venv/lib/python3.11/site-packages/meilisearch/_httprequests.py", line 76, in get
    return self.send_request(requests.get, path)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/venv/lib/python3.11/site-packages/meilisearch/_httprequests.py", line 68, in send_request
    return self.__validate(request)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/venv/lib/python3.11/site-packages/meilisearch/_httprequests.py", line 138, in __validate
    raise MeilisearchApiError(str(err), request) from err
meilisearch.errors.MeilisearchApiError: MeilisearchApiError. Error code: index_not_found. Error message: Index `tutor_course_info` not found. Error documentation: https://docs.meilisearch.com/errors#index_not_found Error type: invalid_request
Error: Command failed with status 1: docker compose -f /home/rpenido/.local/share/tutor-nightly/env/local/docker-compose.yml -f /home/rpenido/.local/share/tutor-nightly/env/dev/docker-compose.yml --project-name tutor_nightly_dev run --rm cms ./manage.py cms reindex_course --all --setup
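
Both tracebacks above share a shape: `get_index` fails with `index_not_found`, and a second `get_index` call made while handling that error fails the same way. Each run also gets one index further than the previous one. One possible reading, not confirmed in this thread, is that the index is looked up again before its asynchronous creation has finished. The sketch below illustrates a "get or create, then wait" pattern under that assumption. `FakeClient` and `IndexNotFound` are hypothetical in-memory stand-ins, not the `meilisearch` package's API.

```python
class IndexNotFound(Exception):
    """Hypothetical stand-in for the client's index_not_found API error."""


class FakeClient:
    """In-memory stand-in for a client whose index creation is asynchronous."""

    def __init__(self):
        self._indexes = set()
        self._pending = {}
        self._next_task = 0

    def get_index(self, name):
        if name not in self._indexes:
            raise IndexNotFound(name)
        return name

    def create_index(self, name):
        # Enqueue the creation and return a task id; the index is not visible yet.
        self._next_task += 1
        self._pending[self._next_task] = name
        return self._next_task

    def wait_for_task(self, task_id):
        # Completing the task is what makes the index visible to get_index.
        self._indexes.add(self._pending.pop(task_id))


def get_or_create_index(client, name):
    """Return the index, creating it and waiting for creation if it is missing."""
    try:
        return client.get_index(name)
    except IndexNotFound:
        task_id = client.create_index(name)
        client.wait_for_task(task_id)
        return client.get_index(name)
```

Without the `wait_for_task` step, the final `get_index` in this stand-in raises `IndexNotFound` again, which mirrors the chained errors in the two runs.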

The third run went "fine." We had a problem with transcripts (the same happens with our studio_content), but there was also another error, in a different course, whose cause I don't know.

2024-10-30 18:30:50,222 WARNING 1 [py.warnings] [user None] [ip None] warnings.py:110 - /openedx/venv/lib/python3.11/site-packages/pgpy/constants.py:5: DeprecationWarning: 'imghdr' is deprecated and slated for removal in Python 3.13
  import imghdr

2024-10-30 18:30:51,024 WARNING 1 [py.warnings] [user None] [ip None] warnings.py:110 - <frozen importlib._bootstrap>:1047: ImportWarning: _SixMetaPathImporter.find_spec() not found; falling back to find_module()

2024-10-30 18:30:51,024 WARNING 1 [py.warnings] [user None] [ip None] warnings.py:110 - <frozen importlib._bootstrap>:673: ImportWarning: _SixMetaPathImporter.exec_module() not found; falling back to load_module()

2024-10-30 18:30:51,762 INFO 1 [root] [user None] [ip None] reindex_course.py:103 - Creating meilisearch index for courseware_content
2024-10-30 18:30:51,771 INFO 1 [root] [user None] [ip None] reindex_course.py:103 - Creating meilisearch index for course_info
2024-10-30 18:30:51,890 WARNING 1 [root] [user None] [ip None] reindex_course.py:144 - Reindexing 10 courses...
2024-10-30 18:30:51,941 WARNING 1 [edx_toggles.toggles.internal.waffle.flag] [user None] [ip None] flag.py:79 - Flag 'teams.content_groups_for_teams' accessed without a request, which is likely in the context of a celery task.
2024-10-30 18:30:51,988 INFO 1 [search.meilisearch] [user None] [ip None] meilisearch.py:133 - Index request: index=tutor_course_info sources=[{'id': 'course-v1:org1+demo01+demo011', 'course': 'course-v1:org1+demo01+demo011', 'content': {'display_name': 'Demo Copy Course', 'overview': ' About This Course Include your long course description here. The long course description should contain 150-400 words. This is paragraph 2 of the long course description. Add more paragraphs as needed. Make sure to enclose them in paragraph tags. Requirements Add information about the skills and knowledge students need to take this course. Course Staff Staff Member #1 Biography of instructor/staff member #1 Staff Member #2 Biography of instructor/staff member #2 Frequently Asked Questions What web browser should I use? The Open edX platform works best with current versions of Chrome, Edge, Firefox, or Safari. See our list of supported browsers for the most up-to-date information. Question #2 Your answer would be displayed here. ', 'number': 'demo01'}, 'image_url': '/asset-v1:org1+demo01+demo011+type@asset+block@images_course_image.jpg', 'start': datetime.datetime(2030, 1, 1, 0, 0, tzinfo=<bson.tz_util.FixedOffset object at 0x70bcec97bcd0>), 'number': 'demo01', 'org': 'org1', 'modes': ['audit'], 'language': 'en', 'catalog_visibility': 'both'}] kwargs={}
2024-10-30 18:30:52,058 INFO 1 [search.meilisearch] [user None] [ip None] meilisearch.py:133 - Index request: index=tutor_courseware_content sources=[{'course': 'course-v1:org1+demo01+demo011', 'org': 'org1', 'content': {'display_name': 'Subsection'}, 'content_type': 'Sequence', 'id': 'block-v1:org1+demo01+demo011+type@sequential+block@3e1dc089013d4a1994d36bad47965810', 'start_date': datetime.datetime(2030, 1, 1, 0, 0, tzinfo=tzlocal()), 'content_groups': None, 'course_name': 'Demo Copy Course', 'location': ['Section', 'Subsection']}, {'course': 'course-v1:org1+demo01+demo011', 'org': 'org1', 'content': {'display_name': 'Subsection abc'}, 'content_type': 'Sequence', 'id': 'block-v1:org1+demo01+demo011+type@sequential+block@3e1a4c0f3d824feeafe0202f587f48eb', 'start_date': datetime.datetime(2030, 1, 1, 0, 0, tzinfo=tzlocal()), 'content_groups': None, 'course_name': 'Demo Copy Course', 'location': ['Section', 'Subsection abc']}, {'course': 'course-v1:org1+demo01+demo011', 'org': 'org1', 'content': {'display_name': 'Subsection 3'}, 'content_type': 'Sequence', 'id': 'block-v1:org1+demo01+demo011+type@sequential+block@833c54b5fcd74726aca01cd0b5e75dee', 'start_date': datetime.datetime(2030, 1, 1, 0, 0, tzinfo=tzlocal()), 'content_groups': None, 'course_name': 'Demo Copy Course', 'location': ['Section', 'Subsection 3']}, {'course': 'course-v1:org1+demo01+demo011', 'org': 'org1', 'content': {'display_name': 'Section'}, 'content_type': 'Sequence', 'id': 'block-v1:org1+demo01+demo011+type@chapter+block@7cb992a02cc54167a2043d0128f8d642', 'start_date': datetime.datetime(2030, 1, 1, 0, 0, tzinfo=tzlocal()), 'content_groups': None, 'course_name': 'Demo Copy Course', 'location': ['Section']}] kwargs={'request_timeout': 180}
2024-10-30 18:30:52,066 INFO 1 [search.meilisearch] [user None] [ip None] meilisearch.py:177 - Remove request: index=tutor_courseware_content, doc_ids=[] kwargs={}
2024-10-30 18:30:52,066 INFO 1 [tracking] [user None] [ip None] logger.py:41 - {"name": "edx.course.index.reindexed", "context": {}, "username": "", "session": "", "ip": "", "agent": "", "host": "", "referer": "", "accept_language": "", "event": {"indexed_count": 4, "category": "courseware_index"}, "time": "2024-10-30T18:30:52.066390+00:00", "event_type": "edx.course.index.reindexed", "event_source": "server", "page": null}
2024-10-30 18:30:52,066 INFO 1 [tracking] [user None] [ip None] logger.py:41 - {"name": "edx.course_info.index.reindexed", "context": {}, "username": "", "session": "", "ip": "", "agent": "", "host": "", "referer": "", "accept_language": "", "event": {"indexed_count": 4, "category": "course_info"}, "time": "2024-10-30T18:30:52.066799+00:00", "event_type": "edx.course_info.index.reindexed", "event_source": "server", "page": null}
2024-10-30 18:30:52,090 WARNING 1 [edx_toggles.toggles.internal.waffle.flag] [user None] [ip None] flag.py:79 - Flag 'teams.content_groups_for_teams' accessed without a request, which is likely in the context of a celery task.
2024-10-30 18:30:52,103 INFO 1 [search.meilisearch] [user None] [ip None] meilisearch.py:133 - Index request: index=tutor_course_info sources=[{'id': 'course-v1:org1+empty1+001', 'course': 'course-v1:org1+empty1+001', 'content': {'display_name': 'Empty Course', 'overview': ' About This Course Include your long course description here. The long course description should contain 150-400 words. This is paragraph 2 of the long course description. Add more paragraphs as needed. Make sure to enclose them in paragraph tags. Requirements Add information about the skills and knowledge students need to take this course. Course Staff Staff Member #1 Biography of instructor/staff member #1 Staff Member #2 Biography of instructor/staff member #2 Frequently Asked Questions What web browser should I use? The Open edX platform works best with current versions of Chrome, Edge, Firefox, or Safari. See our list of supported browsers for the most up-to-date information. Question #2 Your answer would be displayed here. ', 'number': 'empty1'}, 'image_url': '/asset-v1:org1+empty1+001+type@asset+block@images_course_image.jpg', 'start': datetime.datetime(2030, 1, 1, 0, 0, tzinfo=<bson.tz_util.FixedOffset object at 0x70bcec9ec8d0>), 'number': 'empty1', 'org': 'org1', 'modes': ['audit'], 'language': 'en', 'catalog_visibility': 'both'}] kwargs={}
2024-10-30 18:30:52,113 INFO 1 [search.meilisearch] [user None] [ip None] meilisearch.py:133 - Index request: index=tutor_courseware_content sources=[{'course': 'course-v1:org1+empty1+001', 'org': 'org1', 'content': {'display_name': 'Section'}, 'content_type': 'Sequence', 'id': 'block-v1:org1+empty1+001+type@chapter+block@19ba8743d1ae498abe279f4ef0d16efe', 'start_date': datetime.datetime(2030, 1, 1, 0, 0, tzinfo=tzlocal()), 'content_groups': None, 'course_name': 'Empty Course', 'location': ['Section']}] kwargs={'request_timeout': 180}
2024-10-30 18:30:52,131 INFO 1 [search.meilisearch] [user None] [ip None] meilisearch.py:177 - Remove request: index=tutor_courseware_content, doc_ids=[] kwargs={}
2024-10-30 18:30:52,131 INFO 1 [tracking] [user None] [ip None] logger.py:41 - {"name": "edx.course.index.reindexed", "context": {}, "username": "", "session": "", "ip": "", "agent": "", "host": "", "referer": "", "accept_language": "", "event": {"indexed_count": 1, "category": "courseware_index"}, "time": "2024-10-30T18:30:52.131449+00:00", "event_type": "edx.course.index.reindexed", "event_source": "server", "page": null}
2024-10-30 18:30:52,132 INFO 1 [tracking] [user None] [ip None] logger.py:41 - {"name": "edx.course_info.index.reindexed", "context": {}, "username": "", "session": "", "ip": "", "agent": "", "host": "", "referer": "", "accept_language": "", "event": {"indexed_count": 1, "category": "course_info"}, "time": "2024-10-30T18:30:52.131908+00:00", "event_type": "edx.course_info.index.reindexed", "event_source": "server", "page": null}
2024-10-30 18:30:52,171 WARNING 1 [edx_toggles.toggles.internal.waffle.flag] [user None] [ip None] flag.py:79 - Flag 'teams.content_groups_for_teams' accessed without a request, which is likely in the context of a celery task.
2024-10-30 18:30:52,185 INFO 1 [search.meilisearch] [user None] [ip None] meilisearch.py:133 - Index request: index=tutor_course_info sources=[{'id': 'course-v1:SampleTaxonomyOrg1+STC1+2023_1', 'course': 'course-v1:SampleTaxonomyOrg1+STC1+2023_1', 'content': {'display_name': 'Sample Taxonomy Course', 'overview': ' About This Course Include your long course description here. The long course description should contain 150-400 words. This is paragraph 2 of the long course description. Add more paragraphs as needed. Make sure to enclose them in paragraph tags. Requirements Add information about the skills and knowledge students need to take this course. Course Staff Staff Member #1 Biography of instructor/staff member #1 Staff Member #2 Biography of instructor/staff member #2 Frequently Asked Questions What web browser should I use? The Open edX platform works best with current versions of Chrome, Edge, Firefox, or Safari. See our list of supported browsers for the most up-to-date information. Question #2 Your answer would be displayed here. ', 'number': 'STC1'}, 'image_url': '/asset-v1:SampleTaxonomyOrg1+STC1+2023_1+type@asset+block@images_course_image.jpg', 'start': datetime.datetime(2030, 1, 1, 0, 0, tzinfo=<bson.tz_util.FixedOffset object at 0x70bcebf58710>), 'number': 'STC1', 'org': 'SampleTaxonomyOrg1', 'modes': ['audit'], 'language': 'en', 'catalog_visibility': 'both'}] kwargs={}
2024-10-30 18:30:52,201 INFO 1 [search.meilisearch] [user None] [ip None] meilisearch.py:133 - Index request: index=tutor_courseware_content sources=[{'course': 'course-v1:SampleTaxonomyOrg1+STC1+2023_1', 'org': 'SampleTaxonomyOrg1', 'content': {'display_name': 'Subsection 1.1'}, 'content_type': 'Sequence', 'id': 'block-v1:SampleTaxonomyOrg1+STC1+2023_1+type@sequential+block@92e3e9ca156c44fa8a735f0e9e7c854f', 'start_date': datetime.datetime(2030, 1, 1, 0, 0, tzinfo=tzlocal()), 'content_groups': None, 'course_name': 'Sample Taxonomy Course', 'location': ['Section 1', 'Subsection 1.1']}, {'course': 'course-v1:SampleTaxonomyOrg1+STC1+2023_1', 'org': 'SampleTaxonomyOrg1', 'content': {'display_name': 'Subsection 1.2'}, 'content_type': 'Sequence', 'id': 'block-v1:SampleTaxonomyOrg1+STC1+2023_1+type@sequential+block@5ec9c5f6dc784deea6b2dc351a2aae48', 'start_date': datetime.datetime(2030, 1, 1, 0, 0, tzinfo=tzlocal()), 'content_groups': None, 'course_name': 'Sample Taxonomy Course', 'location': ['Section 1', 'Subsection 1.2']}, {'course': 'course-v1:SampleTaxonomyOrg1+STC1+2023_1', 'org': 'SampleTaxonomyOrg1', 'content': {'display_name': 'Section 1'}, 'content_type': 'Sequence', 'id': 'block-v1:SampleTaxonomyOrg1+STC1+2023_1+type@chapter+block@c7077c8cafcf420dbc0b440bf27bad04', 'start_date': datetime.datetime(2030, 1, 1, 0, 0, tzinfo=tzlocal()), 'content_groups': None, 'course_name': 'Sample Taxonomy Course', 'location': ['Section 1']}, {'course': 'course-v1:SampleTaxonomyOrg1+STC1+2023_1', 'org': 'SampleTaxonomyOrg1', 'content': {'display_name': 'Subsection 2.1'}, 'content_type': 'Sequence', 'id': 'block-v1:SampleTaxonomyOrg1+STC1+2023_1+type@sequential+block@4c6782011b5b49e69fa1c0f2cbc2f9d7', 'start_date': datetime.datetime(2030, 1, 1, 0, 0, tzinfo=tzlocal()), 'content_groups': None, 'course_name': 'Sample Taxonomy Course', 'location': ['Section 2', 'Subsection 2.1']}, {'course': 'course-v1:SampleTaxonomyOrg1+STC1+2023_1', 'org': 'SampleTaxonomyOrg1', 'content': {'display_name': 'Section 2'}, 'content_type': 'Sequence', 'id': 'block-v1:SampleTaxonomyOrg1+STC1+2023_1+type@chapter+block@abf68d95259b497ab4324a7cea267e1a', 'start_date': datetime.datetime(2030, 1, 1, 0, 0, tzinfo=tzlocal()), 'content_groups': None, 'course_name': 'Sample Taxonomy Course', 'location': ['Section 2']}] kwargs={'request_timeout': 180}
2024-10-30 18:30:52,210 INFO 1 [search.meilisearch] [user None] [ip None] meilisearch.py:177 - Remove request: index=tutor_courseware_content, doc_ids=[] kwargs={}
2024-10-30 18:30:52,211 INFO 1 [tracking] [user None] [ip None] logger.py:41 - {"name": "edx.course.index.reindexed", "context": {}, "username": "", "session": "", "ip": "", "agent": "", "host": "", "referer": "", "accept_language": "", "event": {"indexed_count": 5, "category": "courseware_index"}, "time": "2024-10-30T18:30:52.210898+00:00", "event_type": "edx.course.index.reindexed", "event_source": "server", "page": null}
2024-10-30 18:30:52,211 INFO 1 [tracking] [user None] [ip None] logger.py:41 - {"name": "edx.course_info.index.reindexed", "context": {}, "username": "", "session": "", "ip": "", "agent": "", "host": "", "referer": "", "accept_language": "", "event": {"indexed_count": 5, "category": "course_info"}, "time": "2024-10-30T18:30:52.211219+00:00", "event_type": "edx.course_info.index.reindexed", "event_source": "server", "page": null}
2024-10-30 18:30:52,230 WARNING 1 [edx_toggles.toggles.internal.waffle.flag] [user None] [ip None] flag.py:79 - Flag 'teams.content_groups_for_teams' accessed without a request, which is likely in the context of a celery task.
2024-10-30 18:30:52,237 INFO 1 [search.meilisearch] [user None] [ip None] meilisearch.py:133 - Index request: index=tutor_course_info sources=[{'id': 'course-v1:org1+empty2+002', 'course': 'course-v1:org1+empty2+002', 'content': {'display_name': 'Empty Course', 'overview': ' About This Course Include your long course description here. The long course description should contain 150-400 words. This is paragraph 2 of the long course description. Add more paragraphs as needed. Make sure to enclose them in paragraph tags. Requirements Add information about the skills and knowledge students need to take this course. Course Staff Staff Member #1 Biography of instructor/staff member #1 Staff Member #2 Biography of instructor/staff member #2 Frequently Asked Questions What web browser should I use? The Open edX platform works best with current versions of Chrome, Edge, Firefox, or Safari. See our list of supported browsers for the most up-to-date information. Question #2 Your answer would be displayed here. ', 'number': 'empty2'}, 'image_url': '/asset-v1:org1+empty2+002+type@asset+block@images_course_image.jpg', 'start': datetime.datetime(2030, 1, 1, 0, 0, tzinfo=<bson.tz_util.FixedOffset object at 0x70bcebf3c990>), 'number': 'empty2', 'org': 'org1', 'modes': ['audit'], 'language': 'en', 'catalog_visibility': 'both'}] kwargs={}
2024-10-30 18:30:52,248 INFO 1 [search.meilisearch] [user None] [ip None] meilisearch.py:133 - Index request: index=tutor_courseware_content sources=[{'course': 'course-v1:org1+empty2+002', 'org': 'org1', 'content': {'display_name': 'Subsection navigate'}, 'content_type': 'Sequence', 'id': 'block-v1:org1+empty2+002+type@sequential+block@0b25ba7ba1694032a493fec244fd9bb2', 'start_date': datetime.datetime(2030, 1, 1, 0, 0, tzinfo=tzlocal()), 'content_groups': None, 'course_name': 'Empty Course', 'location': ['Section ABCD', 'Subsection navigate']}, {'course': 'course-v1:org1+empty2+002', 'org': 'org1', 'content': {'display_name': 'Section ABCD'}, 'content_type': 'Sequence', 'id': 'block-v1:org1+empty2+002+type@chapter+block@77f5bd6524834d3983f2b39d9afdd711', 'start_date': datetime.datetime(2030, 1, 1, 0, 0, tzinfo=tzlocal()), 'content_groups': None, 'course_name': 'Empty Course', 'location': ['Section ABCD']}] kwargs={'request_timeout': 180}
2024-10-30 18:30:52,259 INFO 1 [search.meilisearch] [user None] [ip None] meilisearch.py:177 - Remove request: index=tutor_courseware_content, doc_ids=[] kwargs={}
2024-10-30 18:30:52,259 INFO 1 [tracking] [user None] [ip None] logger.py:41 - {"name": "edx.course.index.reindexed", "context": {}, "username": "", "session": "", "ip": "", "agent": "", "host": "", "referer": "", "accept_language": "", "event": {"indexed_count": 2, "category": "courseware_index"}, "time": "2024-10-30T18:30:52.259795+00:00", "event_type": "edx.course.index.reindexed", "event_source": "server", "page": null}
2024-10-30 18:30:52,260 INFO 1 [tracking] [user None] [ip None] logger.py:41 - {"name": "edx.course_info.index.reindexed", "context": {}, "username": "", "session": "", "ip": "", "agent": "", "host": "", "referer": "", "accept_language": "", "event": {"indexed_count": 2, "category": "course_info"}, "time": "2024-10-30T18:30:52.260117+00:00", "event_type": "edx.course_info.index.reindexed", "event_source": "server", "page": null}
2024-10-30 18:30:52,286 WARNING 1 [edx_toggles.toggles.internal.waffle.flag] [user None] [ip None] flag.py:79 - Flag 'teams.content_groups_for_teams' accessed without a request, which is likely in the context of a celery task.
2024-10-30 18:30:52,293 INFO 1 [search.meilisearch] [user None] [ip None] meilisearch.py:133 - Index request: index=tutor_course_info sources=[{'id': 'course-v1:SampleTaxonomyOrg2+STC1+2023_1', 'course': 'course-v1:SampleTaxonomyOrg2+STC1+2023_1', 'content': {'display_name': 'Sample Taxonomy Course', 'overview': ' About This Course Include your long course description here. The long course description should contain 150-400 words. This is paragraph 2 of the long course description. Add more paragraphs as needed. Make sure to enclose them in paragraph tags. Requirements Add information about the skills and knowledge students need to take this course. Course Staff Staff Member #1 Biography of instructor/staff member #1 Staff Member #2 Biography of instructor/staff member #2 Frequently Asked Questions What web browser should I use? The Open edX platform works best with current versions of Chrome, Edge, Firefox, or Safari. See our list of supported browsers for the most up-to-date information. Question #2 Your answer would be displayed here. ', 'number': 'STC1'}, 'image_url': '/asset-v1:SampleTaxonomyOrg2+STC1+2023_1+type@asset+block@images_course_image.jpg', 'start': datetime.datetime(2030, 1, 1, 0, 0, tzinfo=<bson.tz_util.FixedOffset object at 0x70bcebf4b950>), 'number': 'STC1', 'org': 'SampleTaxonomyOrg2', 'modes': ['audit'], 'language': 'en', 'catalog_visibility': 'both'}] kwargs={}
2024-10-30 18:30:52,306 INFO 1 [search.meilisearch] [user None] [ip None] meilisearch.py:133 - Index request: index=tutor_courseware_content sources=[{'course': 'course-v1:SampleTaxonomyOrg2+STC1+2023_1', 'org': 'SampleTaxonomyOrg2', 'content': {'display_name': 'Subsection 1.1 A'}, 'content_type': 'Sequence', 'id': 'block-v1:SampleTaxonomyOrg2+STC1+2023_1+type@sequential+block@92e3e9ca156c44fa8a735f0e9e7c854f', 'start_date': datetime.datetime(2030, 1, 1, 0, 0, tzinfo=tzlocal()), 'content_groups': None, 'course_name': 'Sample Taxonomy Course', 'location': ['Section 1', 'Subsection 1.1 A']}, {'course': 'course-v1:SampleTaxonomyOrg2+STC1+2023_1', 'org': 'SampleTaxonomyOrg2', 'content': {'display_name': 'Subsection 1.2'}, 'content_type': 'Sequence', 'id': 'block-v1:SampleTaxonomyOrg2+STC1+2023_1+type@sequential+block@5ec9c5f6dc784deea6b2dc351a2aae48', 'start_date': datetime.datetime(2030, 1, 1, 0, 0, tzinfo=tzlocal()), 'content_groups': None, 'course_name': 'Sample Taxonomy Course', 'location': ['Section 1', 'Subsection 1.2']}, {'course': 'course-v1:SampleTaxonomyOrg2+STC1+2023_1', 'org': 'SampleTaxonomyOrg2', 'content': {'display_name': 'Section 1'}, 'content_type': 'Sequence', 'id': 'block-v1:SampleTaxonomyOrg2+STC1+2023_1+type@chapter+block@c7077c8cafcf420dbc0b440bf27bad04', 'start_date': datetime.datetime(2030, 1, 1, 0, 0, tzinfo=tzlocal()), 'content_groups': None, 'course_name': 'Sample Taxonomy Course', 'location': ['Section 1']}, {'course': 'course-v1:SampleTaxonomyOrg2+STC1+2023_1', 'org': 'SampleTaxonomyOrg2', 'content': {'display_name': 'Subsection 2.1'}, 'content_type': 'Sequence', 'id': 'block-v1:SampleTaxonomyOrg2+STC1+2023_1+type@sequential+block@4c6782011b5b49e69fa1c0f2cbc2f9d7', 'start_date': datetime.datetime(2030, 1, 1, 0, 0, tzinfo=tzlocal()), 'content_groups': None, 'course_name': 'Sample Taxonomy Course', 'location': ['Section 2', 'Subsection 2.1']}, {'course': 'course-v1:SampleTaxonomyOrg2+STC1+2023_1', 'org': 'SampleTaxonomyOrg2', 'content': 
{'display_name': 'Section 2'}, 'content_type': 'Sequence', 'id': 'block-v1:SampleTaxonomyOrg2+STC1+2023_1+type@chapter+block@abf68d95259b497ab4324a7cea267e1a', 'start_date': datetime.datetime(2030, 1, 1, 0, 0, tzinfo=tzlocal()), 'content_groups': None, 'course_name': 'Sample Taxonomy Course', 'location': ['Section 2']}] kwargs={'request_timeout': 180}
2024-10-30 18:30:52,313 INFO 1 [search.meilisearch] [user None] [ip None] meilisearch.py:177 - Remove request: index=tutor_courseware_content, doc_ids=[] kwargs={}
2024-10-30 18:30:52,314 INFO 1 [tracking] [user None] [ip None] logger.py:41 - {"name": "edx.course.index.reindexed", "context": {}, "username": "", "session": "", "ip": "", "agent": "", "host": "", "referer": "", "accept_language": "", "event": {"indexed_count": 5, "category": "courseware_index"}, "time": "2024-10-30T18:30:52.313978+00:00", "event_type": "edx.course.index.reindexed", "event_source": "server", "page": null}
2024-10-30 18:30:52,314 INFO 1 [tracking] [user None] [ip None] logger.py:41 - {"name": "edx.course_info.index.reindexed", "context": {}, "username": "", "session": "", "ip": "", "agent": "", "host": "", "referer": "", "accept_language": "", "event": {"indexed_count": 5, "category": "course_info"}, "time": "2024-10-30T18:30:52.314209+00:00", "event_type": "edx.course_info.index.reindexed", "event_source": "server", "page": null}
2024-10-30 18:30:52,332 WARNING 1 [edx_toggles.toggles.internal.waffle.flag] [user None] [ip None] flag.py:79 - Flag 'teams.content_groups_for_teams' accessed without a request, which is likely in the context of a celery task.
2024-10-30 18:30:52,341 INFO 1 [search.meilisearch] [user None] [ip None] meilisearch.py:133 - Index request: index=tutor_course_info sources=[{'id': 'course-v1:org1+dsad+dada', 'course': 'course-v1:org1+dsad+dada', 'content': {'display_name': 'tesda', 'overview': ' About This Course Include your long course description here. The long course description should contain 150-400 words. This is paragraph 2 of the long course description. Add more paragraphs as needed. Make sure to enclose them in paragraph tags. Requirements Add information about the skills and knowledge students need to take this course. Course Staff Staff Member #1 Biography of instructor/staff member #1 Staff Member #2 Biography of instructor/staff member #2 Frequently Asked Questions What web browser should I use? The Open edX platform works best with current versions of Chrome, Edge, Firefox, or Safari. See our list of supported browsers for the most up-to-date information. Question #2 Your answer would be displayed here. ', 'number': 'dsad'}, 'image_url': '/asset-v1:org1+dsad+dada+type@asset+block@images_course_image.jpg', 'start': datetime.datetime(2030, 1, 1, 0, 0, tzinfo=<bson.tz_util.FixedOffset object at 0x70bcebf7cdd0>), 'number': 'dsad', 'org': 'org1', 'modes': ['audit'], 'language': 'en', 'catalog_visibility': 'both'}] kwargs={}
2024-10-30 18:30:52,349 INFO 1 [search.meilisearch] [user None] [ip None] meilisearch.py:133 - Index request: index=tutor_courseware_content sources=[] kwargs={'request_timeout': 180}
2024-10-30 18:30:52,354 ERROR 1 [edx.modulestore] [user None] [ip None] courseware_index.py:263 - Indexing error encountered, courseware index may be out of date course-v1:org1+dsad+dada - MeilisearchApiError("The `json` payload provided is malformed. `Couldn't serialize document value: data are neither an object nor a list of objects`.")
Traceback (most recent call last):
  File "/openedx/venv/lib/python3.11/site-packages/meilisearch/_httprequests.py", line 135, in __validate
    request.raise_for_status()
  File "/openedx/venv/lib/python3.11/site-packages/requests/models.py", line 1024, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: http://meilisearch:7700/indexes/tutor_courseware_content/documents

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/openedx/edx-platform/cms/djangoapps/contentstore/courseware_index.py", line 259, in index
    searcher.index(items_index, request_timeout=timeout)
  File "/openedx/venv/lib/python3.11/site-packages/search/meilisearch.py", line 140, in index
    self.meilisearch_index.add_documents(
  File "/openedx/venv/lib/python3.11/site-packages/meilisearch/index.py", line 448, in add_documents
    add_document_task = self.http.post(url, documents, serializer=serializer)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/venv/lib/python3.11/site-packages/meilisearch/_httprequests.py", line 88, in post
    return self.send_request(requests.post, path, body, content_type, serializer=serializer)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/venv/lib/python3.11/site-packages/meilisearch/_httprequests.py", line 68, in send_request
    return self.__validate(request)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/venv/lib/python3.11/site-packages/meilisearch/_httprequests.py", line 138, in __validate
    raise MeilisearchApiError(str(err), request) from err
meilisearch.errors.MeilisearchApiError: MeilisearchApiError. Error code: malformed_payload. Error message: The `json` payload provided is malformed. `Couldn't serialize document value: data are neither an object nor a list of objects`. Error documentation: https://docs.meilisearch.com/errors#malformed_payload Error type: invalid_request
2024-10-30 18:30:52,359 ERROR 1 [root] [user None] [ip None] reindex_course.py:162 - Error indexing course course-v1:org1+dsad+dada due to the error: Error(s) present during indexing.
Traceback (most recent call last):
  File "/openedx/edx-platform/cms/djangoapps/contentstore/management/commands/reindex_course.py", line 154, in handle
    CoursewareSearchIndexer.do_course_reindex(store, course_key)
  File "/openedx/edx-platform/cms/djangoapps/contentstore/courseware_index.py", line 373, in do_course_reindex
    return cls._do_reindex(modulestore, course_key)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/edx-platform/cms/djangoapps/contentstore/courseware_index.py", line 383, in _do_reindex
    indexed_count = super()._do_reindex(modulestore, structure_key)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/edx-platform/cms/djangoapps/contentstore/courseware_index.py", line 281, in _do_reindex
    indexed_count = cls.index(modulestore, structure_key, timeout=180)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/edx-platform/cms/djangoapps/contentstore/courseware_index.py", line 271, in index
    raise SearchIndexingError('Error(s) present during indexing', error_list)
cms.djangoapps.contentstore.courseware_index.SearchIndexingError: Error(s) present during indexing
2024-10-30 18:30:52,394 WARNING 1 [edx_toggles.toggles.internal.waffle.flag] [user None] [ip None] flag.py:79 - Flag 'teams.content_groups_for_teams' accessed without a request, which is likely in the context of a celery task.
2024-10-30 18:30:52,406 INFO 1 [search.meilisearch] [user None] [ip None] meilisearch.py:133 - Index request: index=tutor_course_info sources=[{'id': 'course-v1:org1+123+1234', 'course': 'course-v1:org1+123+1234', 'content': {'display_name': 'Teste', 'overview': ' About This Course Include your long course description here. The long course description should contain 150-400 words. This is paragraph 2 of the long course description. Add more paragraphs as needed. Make sure to enclose them in paragraph tags. Requirements Add information about the skills and knowledge students need to take this course. Course Staff Staff Member #1 Biography of instructor/staff member #1 Staff Member #2 Biography of instructor/staff member #2 Frequently Asked Questions What web browser should I use? The Open edX platform works best with current versions of Chrome, Edge, Firefox, or Safari. See our list of supported browsers for the most up-to-date information. Question #2 Your answer would be displayed here. ', 'number': '123'}, 'image_url': '/asset-v1:org1+123+1234+type@asset+block@images_course_image.jpg', 'start': datetime.datetime(2030, 1, 1, 0, 0, tzinfo=<bson.tz_util.FixedOffset object at 0x70bcebf4c350>), 'number': '123', 'org': 'org1', 'modes': ['audit'], 'language': 'en', 'catalog_visibility': 'both'}] kwargs={}
2024-10-30 18:30:52,419 INFO 1 [search.meilisearch] [user None] [ip None] meilisearch.py:133 - Index request: index=tutor_courseware_content sources=[{'course': 'course-v1:org1+123+1234', 'org': 'org1', 'content': {'display_name': 'Subsection'}, 'content_type': 'Sequence', 'id': 'block-v1:org1+123+1234+type@sequential+block@77f81ffd16444b73a84f0a012f07d236', 'start_date': datetime.datetime(2030, 1, 1, 0, 0, tzinfo=tzlocal()), 'content_groups': None, 'course_name': 'Teste', 'location': ['Section', 'Subsection']}, {'course': 'course-v1:org1+123+1234', 'org': 'org1', 'content': {'display_name': 'Subsection'}, 'content_type': 'Sequence', 'id': 'block-v1:org1+123+1234+type@sequential+block@9f95fd1f48c247e29ae17db8e32c5e1a', 'start_date': datetime.datetime(2030, 1, 1, 0, 0, tzinfo=tzlocal()), 'content_groups': None, 'course_name': 'Teste', 'location': ['Section', 'Subsection']}, {'course': 'course-v1:org1+123+1234', 'org': 'org1', 'content': {'display_name': 'Section'}, 'content_type': 'Sequence', 'id': 'block-v1:org1+123+1234+type@chapter+block@3606793418f8410c8874354a7d592cb7', 'start_date': datetime.datetime(2030, 1, 1, 0, 0, tzinfo=tzlocal()), 'content_groups': None, 'course_name': 'Teste', 'location': ['Section']}] kwargs={'request_timeout': 180}
2024-10-30 18:30:52,434 INFO 1 [search.meilisearch] [user None] [ip None] meilisearch.py:177 - Remove request: index=tutor_courseware_content, doc_ids=[] kwargs={}
2024-10-30 18:30:52,434 INFO 1 [tracking] [user None] [ip None] logger.py:41 - {"name": "edx.course.index.reindexed", "context": {}, "username": "", "session": "", "ip": "", "agent": "", "host": "", "referer": "", "accept_language": "", "event": {"indexed_count": 3, "category": "courseware_index"}, "time": "2024-10-30T18:30:52.434405+00:00", "event_type": "edx.course.index.reindexed", "event_source": "server", "page": null}
2024-10-30 18:30:52,435 INFO 1 [tracking] [user None] [ip None] logger.py:41 - {"name": "edx.course_info.index.reindexed", "context": {}, "username": "", "session": "", "ip": "", "agent": "", "host": "", "referer": "", "accept_language": "", "event": {"indexed_count": 3, "category": "course_info"}, "time": "2024-10-30T18:30:52.434903+00:00", "event_type": "edx.course_info.index.reindexed", "event_source": "server", "page": null}
2024-10-30 18:30:52,504 WARNING 1 [edx_toggles.toggles.internal.waffle.flag] [user None] [ip None] flag.py:79 - Flag 'teams.content_groups_for_teams' accessed without a request, which is likely in the context of a celery task.
2024-10-30 18:30:52,518 INFO 1 [search.meilisearch] [user None] [ip None] meilisearch.py:133 - Index request: index=tutor_course_info sources=[{'id': 'course-v1:edX+DemoX+Demo_Course', 'course': 'course-v1:edX+DemoX+Demo_Course', 'content': {'display_name': 'Demonstration Course', 'overview': ' About This Course Include your long course description here. The long course description should contain 150-400 words. This is paragraph 2 of the long course description. Add more paragraphs as needed. Make sure to enclose them in paragraph tags. Prerequisites Add information about course prerequisites here. Course Staff Staff Member #1 Biography of instructor/staff member #1 Staff Member #2 Biography of instructor/staff member #2 Frequently Asked Questions What web browser should I use? The Open edX platform works best with current versions of Chrome, Firefox or Safari, or with Internet Explorer version 9 and above. See our list of supported browsers for the most up-to-date information. Question #2 Your answer would be displayed here. ', 'number': 'DemoX'}, 'image_url': '/asset-v1:edX+DemoX+Demo_Course+type@asset+block@images_course_image.jpg', 'start': datetime.datetime(2013, 2, 5, 5, 0, tzinfo=<bson.tz_util.FixedOffset object at 0x70bceb510fd0>), 'number': 'DemoX', 'enrollment_start': datetime.datetime(2013, 2, 5, 0, 0, tzinfo=<bson.tz_util.FixedOffset object at 0x70bceb510fd0>), 'org': 'edX', 'modes': ['audit'], 'catalog_visibility': 'both'}] kwargs={}
2024-10-30 18:30:52,620 ERROR 1 [edxval.api] [user None] [ip None] api.py:295 - [edx-val] Error while retrieving transcript for video=5ef286e8-8e15-4b9c-a02c-f41cd38cb0b6 -- language_code=en
Traceback (most recent call last):
  File "/openedx/venv/lib/python3.11/site-packages/edxval/api.py", line 293, in get_video_transcript_data
    return dict(file_name=video_transcript.filename, content=video_transcript.transcript.file.read())
                                                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/venv/lib/python3.11/site-packages/django/db/models/fields/files.py", line 48, in _get_file
    self._file = self.storage.open(self.name, "rb")
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/venv/lib/python3.11/site-packages/django/core/files/storage/base.py", line 22, in open
    return self._open(name, mode)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/venv/lib/python3.11/site-packages/django/core/files/storage/filesystem.py", line 68, in _open
    return File(open(self.path(name), mode))
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/openedx/media/video-transcripts/3fb0dfd76a7d42408eca73e73a09e96b.srt'
2024-10-30 18:30:52,621 ERROR 1 [edx.modulestore] [user None] [ip None] courseware_index.py:263 - Indexing error encountered, courseware index may be out of date course-v1:edX+DemoX+Demo_Course - FileNotFoundError(2, 'No such file or directory')
Traceback (most recent call last):
  File "/openedx/edx-platform/cms/djangoapps/contentstore/courseware_index.py", line 258, in index
    prepare_item_index(item, groups_usage_info=groups_usage_info)
  File "/openedx/edx-platform/cms/djangoapps/contentstore/courseware_index.py", line 218, in prepare_item_index
    prepare_item_index(
  File "/openedx/edx-platform/cms/djangoapps/contentstore/courseware_index.py", line 218, in prepare_item_index
    prepare_item_index(
  File "/openedx/edx-platform/cms/djangoapps/contentstore/courseware_index.py", line 218, in prepare_item_index
    prepare_item_index(
  File "/openedx/edx-platform/cms/djangoapps/contentstore/courseware_index.py", line 181, in prepare_item_index
    item_index_dictionary = item.index_dictionary()
                            ^^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/edx-platform/xmodule/video_block/video_block.py", line 1146, in index_dictionary
    _update_transcript_for_index(language)
  File "/openedx/edx-platform/xmodule/video_block/video_block.py", line 1134, in _update_transcript_for_index
    transcript = get_transcript(self, lang=language, output_format=Transcript.TXT)[0].replace("\n", " ")
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/edx-platform/xmodule/video_block/transcripts_utils.py", line 1172, in get_transcript
    return get_transcript_from_val(edx_video_id, lang, output_format)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/edx-platform/xmodule/video_block/transcripts_utils.py", line 71, in wrapper
    return func(*args, **kwds)
           ^^^^^^^^^^^^^^^^^^^
  File "/openedx/edx-platform/xmodule/video_block/transcripts_utils.py", line 938, in get_transcript_from_val
    transcript = get_video_transcript_content(edx_video_id, lang)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/edx-platform/xmodule/video_block/transcripts_utils.py", line 636, in get_video_transcript_content
    transcript = edxval_api.get_video_transcript_data(edx_video_id, language_code)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/venv/lib/python3.11/site-packages/edxval/api.py", line 293, in get_video_transcript_data
    return dict(file_name=video_transcript.filename, content=video_transcript.transcript.file.read())
                                                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/venv/lib/python3.11/site-packages/django/db/models/fields/files.py", line 48, in _get_file
    self._file = self.storage.open(self.name, "rb")
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/venv/lib/python3.11/site-packages/django/core/files/storage/base.py", line 22, in open
    return self._open(name, mode)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/venv/lib/python3.11/site-packages/django/core/files/storage/filesystem.py", line 68, in _open
    return File(open(self.path(name), mode))
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/openedx/media/video-transcripts/3fb0dfd76a7d42408eca73e73a09e96b.srt'
2024-10-30 18:30:52,623 ERROR 1 [root] [user None] [ip None] reindex_course.py:162 - Error indexing course course-v1:edX+DemoX+Demo_Course due to the error: Error(s) present during indexing.
Traceback (most recent call last):
  File "/openedx/edx-platform/cms/djangoapps/contentstore/management/commands/reindex_course.py", line 154, in handle
    CoursewareSearchIndexer.do_course_reindex(store, course_key)
  File "/openedx/edx-platform/cms/djangoapps/contentstore/courseware_index.py", line 373, in do_course_reindex
    return cls._do_reindex(modulestore, course_key)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/edx-platform/cms/djangoapps/contentstore/courseware_index.py", line 383, in _do_reindex
    indexed_count = super()._do_reindex(modulestore, structure_key)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/edx-platform/cms/djangoapps/contentstore/courseware_index.py", line 281, in _do_reindex
    indexed_count = cls.index(modulestore, structure_key, timeout=180)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/edx-platform/cms/djangoapps/contentstore/courseware_index.py", line 271, in index
    raise SearchIndexingError('Error(s) present during indexing', error_list)
cms.djangoapps.contentstore.courseware_index.SearchIndexingError: Error(s) present during indexing
2024-10-30 18:30:52,645 WARNING 1 [edx_toggles.toggles.internal.waffle.flag] [user None] [ip None] flag.py:79 - Flag 'teams.content_groups_for_teams' accessed without a request, which is likely in the context of a celery task.
2024-10-30 18:30:52,652 INFO 1 [search.meilisearch] [user None] [ip None] meilisearch.py:133 - Index request: index=tutor_course_info sources=[{'id': 'course-v1:org1+123+123', 'course': 'course-v1:org1+123+123', 'content': {'display_name': 'Test 123', 'overview': ' About This Course Include your long course description here. The long course description should contain 150-400 words. This is paragraph 2 of the long course description. Add more paragraphs as needed. Make sure to enclose them in paragraph tags. Requirements Add information about the skills and knowledge students need to take this course. Course Staff Staff Member #1 Biography of instructor/staff member #1 Staff Member #2 Biography of instructor/staff member #2 Frequently Asked Questions What web browser should I use? The Open edX platform works best with current versions of Chrome, Edge, Firefox, or Safari. See our list of supported browsers for the most up-to-date information. Question #2 Your answer would be displayed here. ', 'number': '123'}, 'image_url': '/asset-v1:org1+123+123+type@asset+block@images_course_image.jpg', 'start': datetime.datetime(2030, 1, 1, 0, 0, tzinfo=<bson.tz_util.FixedOffset object at 0x70bcebf7e450>), 'number': '123', 'org': 'org1', 'modes': ['audit'], 'language': 'en', 'catalog_visibility': 'both'}] kwargs={}
2024-10-30 18:30:52,697 INFO 1 [search.meilisearch] [user None] [ip None] meilisearch.py:133 - Index request: index=tutor_courseware_content sources=[{'course': 'course-v1:org1+123+123', 'org': 'org1', 'content': {'display_name': 'Video'}, 'content_type': 'Video', 'id': 'block-v1:org1+123+123+type@video+block@video1', 'start_date': datetime.datetime(2030, 1, 1, 0, 0, tzinfo=tzlocal()), 'content_groups': None, 'course_name': 'Test 123', 'location': ['Section', 'Subsection', 'Unit']}, {'course': 'course-v1:org1+123+123', 'org': 'org1', 'content': {'display_name': 'Problem Multi Select', 'capa_content': 'Question 1 Answer A Answer B Answer C'}, 'content_type': 'CAPA', 'problem_types': ['choiceresponse'], 'id': 'block-v1:org1+123+123+type@problem+block@problem1', 'start_date': datetime.datetime(2030, 1, 1, 0, 0, tzinfo=tzlocal()), 'content_groups': None, 'course_name': 'Test 123', 'location': ['Section', 'Subsection', 'Unit']}, {'course': 'course-v1:org1+123+123', 'org': 'org1', 'content': {'display_name': 'Multi-select', 'capa_content': 'Question Answer A Answer B Answer C'}, 'content_type': 'CAPA', 'problem_types': ['choiceresponse'], 'id': 'block-v1:org1+123+123+type@problem+block@6c8aee566426419cb289ff1771e3d525', 'start_date': datetime.datetime(2030, 1, 1, 0, 0, tzinfo=tzlocal()), 'content_groups': None, 'course_name': 'Test 123', 'location': ['Section', 'Subsection', 'Unit']}, {'course': 'course-v1:org1+123+123', 'org': 'org1', 'content': {'display_name': 'Multi-select', 'capa_content': 'Question Answer A Answer B Answer C'}, 'content_type': 'CAPA', 'problem_types': ['choiceresponse'], 'id': 'block-v1:org1+123+123+type@problem+block@problem2', 'start_date': datetime.datetime(2030, 1, 1, 0, 0, tzinfo=tzlocal()), 'content_groups': None, 'course_name': 'Test 123', 'location': ['Section', 'Subsection', 'Unit']}, {'course': 'course-v1:org1+123+123', 'org': 'org1', 'content': {'display_name': 'Text', 'html_content': ' Text with some content '}, 'content_type': 'Text', 
'id': 'block-v1:org1+123+123+type@html+block@html1', 'start_date': datetime.datetime(2030, 1, 1, 0, 0, tzinfo=tzlocal()), 'content_groups': None, 'course_name': 'Test 123', 'location': ['Section', 'Subsection', 'Unit']}, {'course': 'course-v1:org1+123+123', 'org': 'org1', 'content': {'display_name': 'Unit'}, 'content_type': 'Sequence', 'id': 'block-v1:org1+123+123+type@vertical+block@dac7ff21ab1b40d2a91f3f8f09a3f635', 'start_date': datetime.datetime(2030, 1, 1, 0, 0, tzinfo=tzlocal()), 'content_groups': None, 'course_name': 'Test 123', 'location': ['Section', 'Subsection', 'Unit']}, {'course': 'course-v1:org1+123+123', 'org': 'org1', 'content': {'display_name': 'Subsection'}, 'content_type': 'Sequence', 'id': 'block-v1:org1+123+123+type@sequential+block@aab467c89de54961b687840c95ac9af9', 'start_date': datetime.datetime(2030, 1, 1, 0, 0, tzinfo=tzlocal()), 'content_groups': None, 'course_name': 'Test 123', 'location': ['Section', 'Subsection']}, {'course': 'course-v1:org1+123+123', 'org': 'org1', 'content': {'display_name': 'Section'}, 'content_type': 'Sequence', 'id': 'block-v1:org1+123+123+type@chapter+block@300d438e614f4ed9875935308ab1095e', 'start_date': datetime.datetime(2030, 1, 1, 0, 0, tzinfo=tzlocal()), 'content_groups': None, 'course_name': 'Test 123', 'location': ['Section']}] kwargs={'request_timeout': 180}
2024-10-30 18:30:52,705 INFO 1 [search.meilisearch] [user None] [ip None] meilisearch.py:177 - Remove request: index=tutor_courseware_content, doc_ids=[] kwargs={}
2024-10-30 18:30:52,706 INFO 1 [tracking] [user None] [ip None] logger.py:41 - {"name": "edx.course.index.reindexed", "context": {}, "username": "", "session": "", "ip": "", "agent": "", "host": "", "referer": "", "accept_language": "", "event": {"indexed_count": 8, "category": "courseware_index"}, "time": "2024-10-30T18:30:52.706014+00:00", "event_type": "edx.course.index.reindexed", "event_source": "server", "page": null}
2024-10-30 18:30:52,706 INFO 1 [tracking] [user None] [ip None] logger.py:41 - {"name": "edx.course_info.index.reindexed", "context": {}, "username": "", "session": "", "ip": "", "agent": "", "host": "", "referer": "", "accept_language": "", "event": {"indexed_count": 8, "category": "course_info"}, "time": "2024-10-30T18:30:52.706526+00:00", "event_type": "edx.course_info.index.reindexed", "event_source": "server", "page": null}
2024-10-30 18:30:52,734 WARNING 1 [edx_toggles.toggles.internal.waffle.flag] [user None] [ip None] flag.py:79 - Flag 'teams.content_groups_for_teams' accessed without a request, which is likely in the context of a celery task.
2024-10-30 18:30:52,745 INFO 1 [search.meilisearch] [user None] [ip None] meilisearch.py:133 - Index request: index=tutor_course_info sources=[{'id': 'course-v1:org1+123+1231', 'course': 'course-v1:org1+123+1231', 'content': {'display_name': 'Course ABC', 'overview': ' About This Course Include your long course description here. The long course description should contain 150-400 words. This is paragraph 2 of the long course description. Add more paragraphs as needed. Make sure to enclose them in paragraph tags. Requirements Add information about the skills and knowledge students need to take this course. Course Staff Staff Member #1 Biography of instructor/staff member #1 Staff Member #2 Biography of instructor/staff member #2 Frequently Asked Questions What web browser should I use? The Open edX platform works best with current versions of Chrome, Edge, Firefox, or Safari. See our list of supported browsers for the most up-to-date information. Question #2 Your answer would be displayed here. ', 'number': '123'}, 'image_url': '/asset-v1:org1+123+1231+type@asset+block@images_course_image.jpg', 'start': datetime.datetime(2030, 1, 1, 0, 0, tzinfo=<bson.tz_util.FixedOffset object at 0x70bceab0ff10>), 'number': '123', 'org': 'org1', 'modes': ['audit'], 'language': 'en', 'catalog_visibility': 'both'}] kwargs={}
2024-10-30 18:30:52,757 INFO 1 [search.meilisearch] [user None] [ip None] meilisearch.py:133 - Index request: index=tutor_courseware_content sources=[{'course': 'course-v1:org1+123+1231', 'org': 'org1', 'content': {'display_name': 'Subsection'}, 'content_type': 'Sequence', 'id': 'block-v1:org1+123+1231+type@sequential+block@d0e9433200ba4363b58461dff8b5813a', 'start_date': datetime.datetime(2030, 1, 1, 0, 0, tzinfo=tzlocal()), 'content_groups': None, 'course_name': 'Course ABC', 'location': ['Section Test Tag ABCD', 'Subsection']}, {'course': 'course-v1:org1+123+1231', 'org': 'org1', 'content': {'display_name': 'Section Test Tag ABCD'}, 'content_type': 'Sequence', 'id': 'block-v1:org1+123+1231+type@chapter+block@7d566f536e51428a99551a14253be062', 'start_date': datetime.datetime(2030, 1, 1, 0, 0, tzinfo=tzlocal()), 'content_groups': None, 'course_name': 'Course ABC', 'location': ['Section Test Tag ABCD']}] kwargs={'request_timeout': 180}
2024-10-30 18:30:52,770 INFO 1 [search.meilisearch] [user None] [ip None] meilisearch.py:177 - Remove request: index=tutor_courseware_content, doc_ids=[] kwargs={}
2024-10-30 18:30:52,771 INFO 1 [tracking] [user None] [ip None] logger.py:41 - {"name": "edx.course.index.reindexed", "context": {}, "username": "", "session": "", "ip": "", "agent": "", "host": "", "referer": "", "accept_language": "", "event": {"indexed_count": 2, "category": "courseware_index"}, "time": "2024-10-30T18:30:52.770813+00:00", "event_type": "edx.course.index.reindexed", "event_source": "server", "page": null}
2024-10-30 18:30:52,771 INFO 1 [tracking] [user None] [ip None] logger.py:41 - {"name": "edx.course_info.index.reindexed", "context": {}, "username": "", "session": "", "ip": "", "agent": "", "host": "", "referer": "", "accept_language": "", "event": {"indexed_count": 2, "category": "course_info"}, "time": "2024-10-30T18:30:52.771491+00:00", "event_type": "edx.course_info.index.reindexed", "event_source": "server", "page": null}
2024-10-30 18:30:52,772 WARNING 1 [root] [user None] [ip None] reindex_course.py:159 - 8 courses reindexed in 0.9 seconds. 0 remaining...
2024-10-30 18:30:52,772 WARNING 1 [root] [user None] [ip None] reindex_course.py:165 - 8 of 10 courses reindexed succesfully. Total running time: 0.9 seconds.
2024-10-30 18:30:52,772 WARNING 1 [root] [user None] [ip None] reindex_course.py:167 - Reindex failed for 2 courses:
2024-10-30 18:30:52,773 WARNING 1 [root] [user None] [ip None] reindex_course.py:169 - course-v1:org1+dsad+dada
2024-10-30 18:30:52,773 WARNING 1 [root] [user None] [ip None] reindex_course.py:169 - course-v1:edX+DemoX+Demo_Course

@pomegranited
Contributor Author

@rpenido

> This crashed the first two times that I ran it.

Good catches! Thank you for trying this on an empty Meilisearch instance, I had missed that.

I ended up following Regis's advice in this comment by removing the Meilisearch-specific code from this script and just handling these exceptions.

> The third run went "fine." We had a problem with transcripts (the same happens with our studio_content), but there was also another error in a different course, and I don't know its cause.

I don't know what's wrong with that course document either. I only see strings and datetimes in the log line for that course, and MeilisearchEngine handles serializing datetimes, so I'm not sure which field it's balking at. But if this is a problem for people running real courses, we'll need to fix it in edx-search (likely in process_document()) as part of a separate PR.

Could you try again with the updated code and instructions, and let me know if you see anything else weird?

Contributor

@rpenido rpenido left a comment

LGTM 👍
Thank you for your work, @pomegranited!

  • I tested this using the instructions from the PR
  • I read through the code
  • I checked for accessibility issues N/A
  • Includes documentation

The index creation still throws an error, but it is not related to this PR.

The issue is here:
https://github.com/openedx/edx-search/blob/91686e96699cb8678bd12e2ef42ac51128c13f42/search/meilisearch.py#L233-L237

The client.create_index call is async: it returns a task that you must wait on to make sure the index has been created. Because we call client.get_index right after it, the index may not exist yet.

We have a helper function that does that here:
https://github.com/open-craft/edx-platform/blob/7a99d592725c32d4730c0efa9960d59e23def7dc/openedx/core/djangoapps/content/search/api.py#L132-L148

@regisb
Contributor

regisb commented Oct 31, 2024

Good point @rpenido. I also faced this issue in student_notes. I have a fix which I think is simpler: https://github.com/openedx/edx-notes-api/pull/444/files

self.meilisearch_client.wait_for_task(task.task_uid, timeout_in_ms=2000)

@pomegranited I suggest you add a call to wait_for_task in index creation. I'll add a call to wait_for_task after index creation in edx-search.

EDIT: there you go openedx/edx-search#166
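
The race discussed above can be avoided by blocking on the task returned by create_index before calling get_index. Below is a minimal sketch, not the code from openedx/edx-search#166. It assumes a meilisearch Python client version whose create_index returns an object with a task_uid attribute and whose wait_for_task returns an object with a status attribute. The helper name create_index_and_wait and the FakeClient stand-in (included only so the snippet runs without a Meilisearch server) are hypothetical.

```python
from types import SimpleNamespace


def create_index_and_wait(client, index_name, primary_key="id", timeout_in_ms=2000):
    """Create an index, then block until the creation task has finished."""
    # create_index only enqueues a task; the index may not exist yet.
    task_info = client.create_index(index_name, {"primaryKey": primary_key})
    # Block until the task has been processed (or the timeout expires).
    task = client.wait_for_task(task_info.task_uid, timeout_in_ms=timeout_in_ms)
    if task.status != "succeeded":
        raise RuntimeError(
            f"Creating index {index_name!r} ended with status {task.status!r}"
        )
    return client.get_index(index_name)


class FakeClient:
    """Hypothetical in-memory stand-in for meilisearch.Client (illustration only)."""

    def __init__(self):
        self.pending = {}   # task uid -> index uid, not yet processed
        self.indexes = {}   # index uid -> index object
        self.next_uid = 0

    def create_index(self, uid, options=None):
        self.next_uid += 1
        self.pending[self.next_uid] = uid
        return SimpleNamespace(task_uid=self.next_uid)

    def wait_for_task(self, task_uid, timeout_in_ms=5000):
        # The index only becomes visible once its task has been processed.
        uid = self.pending.pop(task_uid)
        self.indexes[uid] = SimpleNamespace(uid=uid)
        return SimpleNamespace(status="succeeded")

    def get_index(self, uid):
        return self.indexes[uid]


index = create_index_and_wait(FakeClient(), "tutor_course_info")
print(index.uid)
```

With a real client, calling get_index without the wait_for_task step is what can fail on a fresh instance, which matches the "worked on the third try" behaviour reported in this thread.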

regisb added a commit to regisb/edx-search that referenced this pull request Oct 31, 2024
In `search.meilisearch.create_indexes`, we were not waiting for the
index creation tasks to complete. This was causing a potential race
condition, where the `create_indexes` function would fail because it
took a few seconds for the index creation to succeed.

See the relevant conversation here:
openedx/edx-platform#35743 (comment)
ormsbee pushed a commit to openedx/edx-search that referenced this pull request Oct 31, 2024
In `search.meilisearch.create_indexes`, we were not waiting for the
index creation tasks to complete. This was causing a potential race
condition, where the `create_indexes` function would fail because it
took a few seconds for the index creation to succeed.

See the relevant conversation here:
openedx/edx-platform#35743 (comment)
@pomegranited pomegranited changed the title fix: support Meilisearch search engine during course index setup fix: allow non-Elasticsearch search engines when reindexing courses [FC-0062] Nov 1, 2024
@pomegranited
Contributor Author

Brilliant, thank you for your help @regisb and @rpenido !

I've reduced this change down to the bare minimum, wrapping the ES-specific logic and removing all the Meilisearch-specific stuff. Also bumped the edx-search version to fix the race condition error noted in comments.

@rpenido do you want to run another round of tests to make sure that issue is resolved?

@pomegranited pomegranited added the FC Relates to an Axim Funded Contribution project label Nov 1, 2024
@navinkarkera
Contributor

navinkarkera commented Nov 1, 2024

@pomegranited I hit the same issue as @rpenido while running the tutor dev run cms ./manage.py cms shell -c "from search.meilisearch import create_indexes; create_indexes()" command; it worked on the third try without any errors.

I also noticed that the reindex command (via the Meilisearch API) raises an error when indexing empty courses, i.e. when the list of documents to index is empty.

======================================= SearchIndexerBase#index items_index: []
2024-11-01 06:35:22,547 INFO 1 [search.meilisearch] [user None] [ip None] meilisearch.py:133 - Index request: index=tutor_courseware_content sources=[] kwargs={'request_timeout': 180}
2024-11-01 06:35:22,549 ERROR 1 [edx.modulestore] [user None] [ip None] courseware_index.py:265 - Indexing error encountered, courseware index may be out of date course-v1:unix+ux1+2024_T2 - MeilisearchApiError("The `json` payload provided is malformed. `Couldn't serialize document value: data are neither an object nor a list of objects`.")
Traceback (most recent call last):
  File "/openedx/venv/lib/python3.11/site-packages/meilisearch/_httprequests.py", line 135, in __validate
    request.raise_for_status()
  File "/openedx/venv/lib/python3.11/site-packages/requests/models.py", line 1024, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: http://meilisearch:7700/indexes/tutor_courseware_content/documents

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/openedx/edx-platform/cms/djangoapps/contentstore/courseware_index.py", line 261, in index
    searcher.index(items_index, request_timeout=timeout)
  File "/openedx/venv/lib/python3.11/site-packages/search/meilisearch.py", line 140, in index
    self.meilisearch_index.add_documents(
  File "/openedx/venv/lib/python3.11/site-packages/meilisearch/index.py", line 448, in add_documents
    add_document_task = self.http.post(url, documents, serializer=serializer)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/venv/lib/python3.11/site-packages/meilisearch/_httprequests.py", line 88, in post
    return self.send_request(requests.post, path, body, content_type, serializer=serializer)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/venv/lib/python3.11/site-packages/meilisearch/_httprequests.py", line 68, in send_request
    return self.__validate(request)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/venv/lib/python3.11/site-packages/meilisearch/_httprequests.py", line 138, in __validate
    raise MeilisearchApiError(str(err), request) from err
meilisearch.errors.MeilisearchApiError: MeilisearchApiError. Error code: malformed_payload. Error message: The `json` payload provided is malformed. `Couldn't serialize document value: data are neither an object nor a list of objects`. Error documentation: https://docs.meilisearch.com/errors#malformed_payload Error type: invalid_request
2024-11-01 06:35:22,550 ERROR 1 [root] [user None] [ip None] reindex_course.py:156 - Error indexing course course-v1:unix+ux1+2024_T2 due to the error: Error(s) present during indexing.
Traceback (most recent call last):
  File "/openedx/edx-platform/cms/djangoapps/contentstore/management/commands/reindex_course.py", line 148, in handle
    CoursewareSearchIndexer.do_course_reindex(store, course_key)
  File "/openedx/edx-platform/cms/djangoapps/contentstore/courseware_index.py", line 375, in do_course_reindex
    return cls._do_reindex(modulestore, course_key)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/edx-platform/cms/djangoapps/contentstore/courseware_index.py", line 385, in _do_reindex
    indexed_count = super()._do_reindex(modulestore, structure_key)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/edx-platform/cms/djangoapps/contentstore/courseware_index.py", line 283, in _do_reindex
    indexed_count = cls.index(modulestore, structure_key, timeout=180)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/openedx/edx-platform/cms/djangoapps/contentstore/courseware_index.py", line 273, in index
    raise SearchIndexingError('Error(s) present during indexing', error_list)
cms.djangoapps.contentstore.courseware_index.SearchIndexingError: Error(s) present during indexing
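
One way to avoid the malformed_payload error above is to skip the index request when there is nothing to send. This is only a sketch of that idea and not necessarily how the PR addresses it. The helper index_documents and the RecordingSearcher stand-in are hypothetical; the only call taken from the traceback is searcher.index(items, request_timeout=...).

```python
def index_documents(searcher, documents, **kwargs):
    """Send documents to the search engine, skipping empty batches."""
    if not documents:
        # Nothing to index, so do not send a request at all.
        return 0
    searcher.index(documents, **kwargs)
    return len(documents)


class RecordingSearcher:
    """Hypothetical stand-in for a search engine; records each index call."""

    def __init__(self):
        self.calls = []

    def index(self, sources, **kwargs):
        self.calls.append((list(sources), kwargs))


searcher = RecordingSearcher()
print(index_documents(searcher, [], request_timeout=180))
print(index_documents(searcher, [{"id": "block-1"}], request_timeout=180))
print(len(searcher.calls))
```

Placing the guard in the caller keeps it independent of the search engine, so it behaves the same for Elasticsearch and Meilisearch.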

@rpenido
Contributor

rpenido commented Nov 1, 2024

Same here. It seems that edx-search is not updating from 4.1.0 to 4.1.1.

I tried the following:

tutor dev run cms make requirements
tutor dev run lms make requirements
tutor dev run cms make upgrade-package package=edx-search

Checking version with: tutor dev run cms pip show edx-search.
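
The same check can be done from a Python shell inside the container. This is a small sketch using the standard library's importlib.metadata; the helper name installed_version is hypothetical, and the result depends on which container the shell is started in.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


print(installed_version("edx-search"))
```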

@pomegranited
Contributor Author

@navinkarkera

> Also noticed that the reindex command (meilisearch api) raises error while indexing empty courses i.e. empty lists.

Addressed with 2e8d7ed.

> I faced the same issue as @rpenido while running tutor dev run cms ./manage.py cms shell -c "from search.meilisearch import create_indexes; create_indexes()" command, it worked on third try without any errors.

> Same here. It seems that edx-search is not updating from 4.1.0 to 4.1.1.

The package update has been merged: #35755

So I've merged latest master here too. Hopefully that addresses it?

Contributor

@navinkarkera navinkarkera left a comment

> Same here. It seems that edx-search is not updating from 4.1.0 to 4.1.1

@rpenido To update the package permanently, you'll need to rebuild the openedx-dev image and launch with the --skip-build option. Alternatively, run tutor dev exec cms make requirements (note the use of exec instead of run) to update the requirements in your currently running container, but that update will be lost once the container is deleted.

The run command creates a new one-off container, so it will have the old version until you rebuild the image. Any updates made in this one-off container are lost as soon as the run command completes, so the next invocation of run will not reflect them.

@pomegranited Nice work! I don't see any errors now.

  • I tested this: (reindexed all courses)
  • I read through the code
  • I checked for accessibility issues
  • Includes documentation

Labels
FC Relates to an Axim Funded Contribution project open-source-contribution PR author is not from Axim or 2U
Projects
Status: In Eng Review
5 participants