feat: Video editor supports transcripts [FC-0076] #36058

Open
wants to merge 17 commits into
base: master
from

Conversation

ChrisChV
Contributor

@ChrisChV ChrisChV commented Dec 25, 2024

Description

  • Add error handler on save video to avoid creating sjson
  • Support transcripts without edx_video_id in definition_to_xml
  • When copying a video from a library to a course: Create a new edx_video_id
  • Save transcripts as static assets in a video in a library when adding a new transcript.
  • Delete transcripts as static assets in a video in a library when deleting transcripts.
  • Support download transcript in a video in a library.
  • Support replace transcript in a video in a library.
  • Support updating transcripts in video in a library.
  • Refactor the code of downloading YouTube transcripts to enable this feature in libraries.
  • Support copy from a library to a course and a course to a library.
  • Which edX user roles will this change impact? "Course Author"

Supporting information

Testing instructions

Follow the testing instructions at: openedx/frontend-app-authoring#1596

Deadline

No rush

Other information

* Add error handler on save video to avoid create sjson
* Support transcripts without edx_video_id in definition_to_xml
@openedx-webhooks openedx-webhooks added the open-source-contribution (PR author is not from Axim or 2U) label Dec 25, 2024
openedx-webhooks commented Dec 25, 2024

Thanks for the pull request, @ChrisChV!

This repository is currently maintained by @openedx/wg-maintenance-edx-platform.

Once you've gone through the following steps feel free to tag them in a comment and let them know that your changes are ready for engineering review.

🔘 Get product approval

If you haven't already, check this list to see if your contribution needs to go through the product review process.

  • If it does, you'll need to submit a product proposal for your contribution, and have it reviewed by the Product Working Group.
    • This process (including the steps you'll need to take) is documented here.
  • If it doesn't, simply proceed with the next step.

🔘 Provide context

To help your reviewers and other members of the community understand the purpose and larger context of your changes, feel free to add as much of the following information to the PR description as you can:

  • Dependencies

    This PR must be merged before / after / at the same time as ...

  • Blockers

    This PR is waiting for OEP-1234 to be accepted.

  • Timeline information

    This PR must be merged by XX date because ...

  • Partner information

    This is for a course on edx.org.

  • Supporting documentation
  • Relevant Open edX discussion forum threads

🔘 Get a green build

If one or more checks are failing, continue working on your changes until this is no longer the case and your build turns green.


Where can I find more information?

If you'd like to get more details on all aspects of the review process for open source pull requests (OSPRs), check out the following resources:

When can I expect my changes to be merged?

Our goal is to get community contributions seen and reviewed as efficiently as possible.

However, the amount of time that it takes to review and merge a PR can vary significantly based on factors such as:

  • The size and impact of the changes that it introduces
  • The need for product review
  • Maintenance status of the parent repository

💡 As a result it may take up to several weeks or months to complete a review and merge your PR.

@ChrisChV ChrisChV marked this pull request as draft December 25, 2024 21:16
@ChrisChV ChrisChV changed the title from "feat: Video editor supports transcripts" to "feat: Video editor supports transcripts [FC-0076]" Dec 25, 2024
@mphilbrick211 mphilbrick211 added the FC (Relates to an Axim Funded Contribution project) label Dec 27, 2024
Contributor

@pomegranited pomegranited left a comment

Hi @ChrisChV, this is working well for the most part. Good job dealing with the old transcript code!

But I found a bug with the upstream/downstream syncing, and left a few nits/change requests too.

@@ -616,6 +630,24 @@ def _import_file_into_course(
    if thumbnail_content is not None:
        content.thumbnail_location = thumbnail_location
    contentstore().save(content)
    if usage_key.block_type == 'video':
        # Adding transcripts to VAL using the nex edx_video_id

nit: typo

Suggested change
- # Adding transcripts to VAL using the nex edx_video_id
+ # Adding transcripts to VAL using the new edx_video_id

response = self.replace_transcript(self.library_block.usage_key, self.youtube_id)

# Verify the response
self.assert_response(response, expected_status_code=200, expected_message='Success')

Could you also test that the transcript is replaced as expected?

@@ -81,13 +84,17 @@ def link_video_to_component(video_component, user):
    edx_video_id = clean_video_id(video_component.edx_video_id)
    if not edx_video_id:
        edx_video_id = create_external_video(display_name='external video')

    if isinstance(video_component.usage_key, UsageKeyV2):
        return edx_video_id

I don't understand why we're returning early here. Could you add a comment to clarify?

except NotFoundError:
log.debug("Can't find transcripts in storage for youtube id: %s", youtube_id)

#check youtube local and server transcripts for equality

nit: pylint should have caught this?

Suggested change
- #check youtube local and server transcripts for equality
+ # check youtube local and server transcripts for equality

edx_video_id,
input_format,
transcript_content,
language_code='en',

I know you're dealing with the old transcripts code here, which is really convoluted, but why is the language code hard-coded here (and below in save_video_transcript)? A comment would help :)

@@ -1952,3 +1952,6 @@ def import_blocks_create_task(library_key, course_key, use_course_key_as_block_i
    log.info(f"Import block task created: import_task={import_task} "
             f"celery_task={result.id}")
    return import_task

# To enable use content library permissions as public API

nit: grammar

Suggested change
- # To enable use content library permissions as public API
+ # Allow content library permissions to be used in the public API

Comment on lines +511 to +512
    except AttributeError:
        pass

Why does this error need to be caught now? Seems a little dangerous.

@@ -497,6 +498,7 @@ def studio_transcript(self, request, dispatch):
"""
_ = self.runtime.service(self, "i18n").ugettext

# pylint: disable=too-many-nested-blocks

nit: pylint is not wrong :/

Could this massive method be refactored into _studio_transcript_upload, _studio_transcript_delete, and _studio_transcript_get? Unless that means we have to add more tests.
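
The split suggested above could look roughly like the sketch below. The three helper names come from the comment; the dispatch on HTTP method, the `FakeRequest` type, the `TranscriptHandlerSketch` class, and the tuple return values are assumptions made for illustration and are not the actual `studio_transcript` logic.

```python
from dataclasses import dataclass


@dataclass
class FakeRequest:
    """Hypothetical request object; only the HTTP method matters here."""
    method: str


class TranscriptHandlerSketch:
    """Sketch: one thin dispatcher plus one small method per operation."""

    def studio_transcript(self, request, dispatch):
        # Map each HTTP method to a dedicated helper instead of nesting
        # if/elif blocks inside a single large method.
        handlers = {
            "POST": self._studio_transcript_upload,
            "DELETE": self._studio_transcript_delete,
            "GET": self._studio_transcript_get,
        }
        handler = handlers.get(request.method)
        if handler is None:
            return (404, "unsupported method")
        return handler(request, dispatch)

    # The bodies below are placeholders; the real logic would move here.
    def _studio_transcript_upload(self, request, dispatch):
        return (201, f"upload:{dispatch}")

    def _studio_transcript_delete(self, request, dispatch):
        return (200, f"delete:{dispatch}")

    def _studio_transcript_get(self, request, dispatch):
        return (200, f"get:{dispatch}")
```

Each helper stays flat enough that the `too-many-nested-blocks` suppression may no longer be needed, and existing tests that call `studio_transcript` would keep exercising the same entry point.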

Comment on lines +578 to +579
transcript_file_path = f"static/{self.transcripts.pop(language, None)}"
lib_api.delete_library_block_static_asset_file(self.scope_ids.usage_id, transcript_file_path)

If transcript_file_path == "static/None", I think delete_library_block_static_asset_file raises a validation error. By contrast, the remove_subs_from_store method just passes if the transcript file isn't found.

So maybe we should check for that here (at the risk of enraging pylint further..):

Suggested change
- transcript_file_path = f"static/{self.transcripts.pop(language, None)}"
- lib_api.delete_library_block_static_asset_file(self.scope_ids.usage_id, transcript_file_path)
+ transcript_path = self.transcripts.pop(language, None)
+ if transcript_path:
+     lib_api.delete_library_block_static_asset_file(self.scope_ids.usage_id, f"static/{transcript_path}")
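
To make the concern concrete, here is a minimal, self-contained sketch of the two variants. `FakeLibraryApi.delete_static_asset` is a hypothetical stand-in for `lib_api.delete_library_block_static_asset_file`; the assumption that the real function raises on an unknown path is taken from the comment above and has not been verified against the library API.

```python
class FakeLibraryApi:
    """Hypothetical stand-in for the library static-asset API."""

    def __init__(self, files):
        self.files = set(files)

    def delete_static_asset(self, path):
        # Assumption: the real API rejects paths that do not exist.
        if path not in self.files:
            raise ValueError(f"Unknown static asset: {path}")
        self.files.remove(path)


def delete_transcript_unguarded(api, transcripts, language):
    # pop() returns None for a missing language, so the f-string
    # produces the literal path "static/None".
    api.delete_static_asset(f"static/{transcripts.pop(language, None)}")


def delete_transcript_guarded(api, transcripts, language):
    # Only call the delete API when a transcript file was recorded.
    transcript_path = transcripts.pop(language, None)
    if transcript_path:
        api.delete_static_asset(f"static/{transcript_path}")
        return True
    return False
```

With this stand-in, the unguarded variant raises for a language that has no transcript, while the guarded variant simply does nothing, which matches the behavior described for `remove_subs_from_store`.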

@@ -10,6 +10,7 @@
import re

I'm seeing a bug when I sync a LibraryBlock video with transcripts from an upstream video.

Steps to reproduce:

  1. Create a library video with transcripts (here, I imported them from the example youtube video).
  2. Publish the library video.
  3. Copy it to the clipboard.
  4. Paste into a course.
    Note that the transcripts are displaying fine here.
  5. Re-edit the library video, and replace a transcript. (Here, I replaced the English one; I don't know whether replacing others causes the same issue.)
  6. Return to the course LibraryBlock, and refresh to see the "updates available" button. Click it.
    Note that the upstream video preview shows its transcripts fine, but the downstream (course) video preview doesn't show its transcripts anymore.
  7. Accept changes.
    Note that the course video no longer shows its transcripts, but if you edit it, you can see they're still there.
Syncing.upstream.video.breaks.transcripts.mp4

Labels
FC: Relates to an Axim Funded Contribution project
open-source-contribution: PR author is not from Axim or 2U
Projects
Status: Waiting on Author
4 participants