chore: refactoring of the codebase is needed #99

masterchief164 · 2023-05-12T10:51:05Z

No description provided.

adamjonas · 2023-05-13T10:43:38Z

Can you describe what you believe needs to be done?

masterchief164 · 2023-05-14T17:54:56Z

Adding multiple transcribers with options such as diarization, summary creation, and topic detection has bloated the code considerably. There are many if-else ladders that can be simplified, and the methods do not follow the SOLID principles. I'd like to address those issues and make the code more maintainable.

SarcasticNastik · 2023-06-06T14:58:36Z

Refactor Strategy

Here, I outline a general idea for refactoring the codebase using an object-oriented approach.

Application Workflow

Pre-process source files

Given a source (video, audio, or YouTube link), conditionally pre-process (download and convert) the source files.

The current approach to detecting and specifying the source type, and the corresponding processing, is manual and relies on source_type.
Automatic detection of the file type, followed by the appropriate processing, could be achieved using either:

Pattern matching using match (supported by Python versions $\ge$ 3.10).
Conditional matching.

Transcription

Transcribe the audio input to text using either deepgram or whisper.

The corresponding codebase can be easily ported.
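One possible way to keep the choice between the two services out of if-else ladders is to put them behind a common interface. The sketch below is only an assumption about how that could look: the `Transcriber` base class, the registry, and `get_transcriber` are hypothetical names, and the method bodies are placeholders rather than real calls to whisper or deepgram.

```python
from abc import ABC, abstractmethod


class Transcriber(ABC):
    """Common interface every transcription service implements."""

    @abstractmethod
    def transcribe(self, audio_path: str) -> str:
        ...


class WhisperTranscriber(Transcriber):
    def transcribe(self, audio_path: str) -> str:
        # Placeholder: the existing whisper code would be ported here.
        return f"[whisper] transcript of {audio_path}"


class DeepgramTranscriber(Transcriber):
    def transcribe(self, audio_path: str) -> str:
        # Placeholder: the existing deepgram code would be ported here.
        return f"[deepgram] transcript of {audio_path}"


# Hypothetical registry: supporting a new service means adding one entry.
TRANSCRIBERS = {
    "whisper": WhisperTranscriber,
    "deepgram": DeepgramTranscriber,
}


def get_transcriber(name: str) -> Transcriber:
    """Look up a transcriber by name instead of branching on it."""
    try:
        return TRANSCRIBERS[name]()
    except KeyError:
        raise ValueError(f"Unknown transcription service: {name!r}") from None
```

With this shape, options such as diarization or summary creation could live inside the service that supports them, rather than in the calling code.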

Post-process generated files

Store the generated transcription from the previous section.
Optionally, generate a PR for the transcription.
Optionally, upload model outputs to AWS S3.

Again, all the corresponding functionalities can be easily ported.

`App` class

Methods

These correspond to the workflow stages in the previous section.

pre_process (needs a better name)
- detect_source_type
- download (conditionally)
- convert (conditionally)
transcribe
post_process (needs a better name)
- write_transcription_to_md
- create_pr (conditionally)
- upload_to_s3 (conditionally)
process: Complete workflow. (end-user API)
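The method list above could translate into a skeleton along these lines. This is a sketch under assumptions, not the actual implementation: the method names come from the list, but the flags `create_pr_enabled` and `upload_enabled`, the `log` attribute, and every method body are stand-ins that only record which step ran.

```python
from dataclasses import dataclass, field


@dataclass
class App:
    create_pr_enabled: bool = False  # hypothetical flag
    upload_enabled: bool = False     # hypothetical flag
    log: list = field(default_factory=list)  # records executed steps for illustration

    # --- pre-processing -------------------------------------------------
    def detect_source_type(self, source: str) -> str:
        self.log.append("detect_source_type")
        if source.startswith(("http://", "https://")):
            return "youtube"
        return "audio" if source.endswith(".mp3") else "video"

    def download(self, source: str) -> str:
        self.log.append("download")
        return "downloaded.mp4"  # placeholder path

    def convert(self, source: str) -> str:
        self.log.append("convert")
        return "converted.mp3"  # placeholder path

    def pre_process(self, source: str) -> str:
        source_type = self.detect_source_type(source)
        if source_type == "youtube":
            source = self.download(source)
        if source_type != "audio":
            source = self.convert(source)
        return source

    # --- transcription --------------------------------------------------
    def transcribe(self, audio_path: str) -> str:
        self.log.append("transcribe")
        return f"transcript of {audio_path}"  # placeholder result

    # --- post-processing ------------------------------------------------
    def write_transcription_to_md(self, transcript: str) -> None:
        self.log.append("write_transcription_to_md")

    def create_pr(self) -> None:
        self.log.append("create_pr")

    def upload_to_s3(self) -> None:
        self.log.append("upload_to_s3")

    def post_process(self, transcript: str) -> None:
        self.write_transcription_to_md(transcript)
        if self.create_pr_enabled:
            self.create_pr()
        if self.upload_enabled:
            self.upload_to_s3()

    # --- end-user API ---------------------------------------------------
    def process(self, source: str) -> str:
        audio_path = self.pre_process(source)
        transcript = self.transcribe(audio_path)
        self.post_process(transcript)
        return transcript
```

For example, `App(upload_enabled=True).process("https://youtu.be/x")` would run download, convert, transcribe, write the Markdown file, and upload, while skipping PR creation.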

kouloumos · 2024-04-15T07:12:59Z

The codebase has been refactored using an object-oriented approach into a structured four-stage process. Most of the work was done in #118 and 2764a7f, with additional changes in miscellaneous commits that can be found in the git history.

adamjonas added the HOLD for now label May 16, 2023

rejeses added this to The Bitcoin Development Project Roadmap Apr 12, 2024

rejeses moved this to ✅ Done in The Bitcoin Development Project Roadmap Apr 12, 2024

rejeses moved this from ✅ Done to 📋 Backlog in The Bitcoin Development Project Roadmap Apr 12, 2024

kouloumos closed this as completed Apr 15, 2024

github-project-automation bot moved this from 📋 Backlog to ✅ Done in The Bitcoin Development Project Roadmap Apr 15, 2024

kouloumos removed the HOLD for now label Apr 15, 2024
