GitHub - codenamewei/youtube2text: Converts Youtube URLs to Text with Speech Recognition

🔊 Converts Youtube URLs to Text with Speech Recognition

💡 What does the library do?

Youtube -> Text: Transcribes Youtube URLs to a text file (csv)
Youtube -> Audio: Downloads Youtube URLs as audio files (wav, flac)
Audio -> Text: Transcribes an audio file (wav, flac) to a text file (csv)

Three folders will be created to store the output files.

<Own Path> or <HOME_DIRECTORY>/youtube2text
│
├── audio/
│   └── 2022Jan02_011802.flac
│
├── audio-chunks/
│   └── 2022Jan02_011802
│       ├── chunk1.flac
│       ├── chunk2.flac
│       └── chunk3.flac
│   
└── text/
    └── 2022Jan02_011802.csv
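
When a script needs to locate the outputs of a run, the layout above can be mirrored in a small helper. This is only a sketch and not part of the library: `output_paths` is a hypothetical function, and it assumes the three folder names shown in the tree and that the audio file, the chunk folder, and the CSV share one file stem.

```python
from pathlib import Path


def output_paths(base_dir, stem, audioformat="flac"):
    """Return where a run named `stem` is expected to store its files.

    Hypothetical helper: it only mirrors the folder tree shown above
    and never calls youtube2text.
    """
    base = Path(base_dir)
    return {
        "audio": base / "audio" / f"{stem}.{audioformat}",
        "chunks": base / "audio-chunks" / stem,
        "text": base / "text" / f"{stem}.csv",
    }


# Example with the default location and the stem used in the tree above.
paths = output_paths(Path.home() / "youtube2text", "2022Jan02_011802")
for name, path in paths.items():
    print(f"{name}: {path}")
```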

📦 How to install

Install and update using pip

pip install youtube2text

🔧 Build from source

git clone <this_repo>
cd <this_repo>
python setup.py install

✨ How to use

Using the library requires an internet connection, both to download Youtube videos and to run speech recognition.

from youtube2text import Youtube2Text

converter = Youtube2Text()

converter.url2text(urlpath="https://www.youtube.com/watch?v=Ad9Q8rM0Am0&t=114s")
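
The download and the transcription can also be run as two separate calls, which helps when one of them fails and needs a retry. The sketch below assumes only the `url2audio` and `audio2text` signatures documented under Functions. The wrapper name `transcribe_in_two_steps` and the file names `talk.flac` and `talk.csv` are made up for illustration.

```python
def transcribe_in_two_steps(converter, url, audiofile, textfile):
    """Download the audio first, then transcribe it.

    `converter` is expected to be a Youtube2Text instance; only its
    url2audio and audio2text methods are used.
    """
    converter.url2audio(urlpath=url, audiofile=audiofile)
    converter.audio2text(audiofile=audiofile, textfile=textfile)
    return textfile


# Usage (needs `pip install youtube2text` and an internet connection):
#
#   from youtube2text import Youtube2Text
#   transcribe_in_two_steps(
#       Youtube2Text(),
#       "https://www.youtube.com/watch?v=Ad9Q8rM0Am0&t=114s",
#       "talk.flac",   # hypothetical file name
#       "talk.csv",    # hypothetical file name
#   )
```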

See howtouse.ipynb for more examples.

📌 Functions

Supported audio output formats:
- wav
- flac

Automatic speech recognition is provided through the speech-recognition library.

Youtube -> Text

def url2text(self, urlpath, outfile = None, audioformat = "flac", audiosamplingrate=16000):
    '''
    Convert youtube url to text

    Parameters:
        urlpath (str): Youtube url
        outfile (str, optional): File path/name of output file (.csv)
        audioformat (str, optional): Audioformat supported in self.__audioextension
        audiosamplingrate (int, optional): Audio sampling rate
    '''
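
To show how the optional parameters fit together, here is a minimal sketch that checks the arguments before passing them on. `url2text_kwargs` is a hypothetical helper, not something the library provides. The checks only restate what this README documents (wav or flac audio, a .csv output file), and the library may validate differently.

```python
SUPPORTED_AUDIO_FORMATS = ("wav", "flac")  # formats listed in this README


def url2text_kwargs(urlpath, outfile=None, audioformat="flac", audiosamplingrate=16000):
    """Check the arguments and return them as keyword arguments for url2text."""
    if audioformat not in SUPPORTED_AUDIO_FORMATS:
        raise ValueError(
            f"audioformat must be one of {SUPPORTED_AUDIO_FORMATS}, got {audioformat!r}"
        )
    if outfile is not None and not str(outfile).lower().endswith(".csv"):
        raise ValueError("outfile should end with .csv")
    if audiosamplingrate <= 0:
        raise ValueError("audiosamplingrate must be a positive integer")
    return {
        "urlpath": urlpath,
        "outfile": outfile,
        "audioformat": audioformat,
        "audiosamplingrate": audiosamplingrate,
    }


# Usage, assuming `converter = Youtube2Text()` as in "How to use":
#   converter.url2text(**url2text_kwargs(url, outfile="talk.csv", audioformat="wav"))
```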

Youtube -> Audio

def url2audio(self, urlpath, audiofile = None, audiosamplingrate=16000):
    '''
    Convert youtube url to audiofile

    Parameters:
        urlpath (str): Youtube url
        audiofile (str, optional): File path/name to save audio file
        audiosamplingrate (int, optional): Audio sampling rate
    '''

Audio -> Text

def audio2text(self, audiofile, textfile = None):
    '''
    Convert audio to csv file

    Parameters:
        audiofile (str): File path/name of audio file
        textfile (str, optional): File path/name of text file (*.csv)
    '''
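
The resulting CSV can be read back with Python's standard `csv` module. The sketch below makes no assumption about the columns the library writes. `parse_transcript` is a hypothetical helper, and the inline sample rows are invented so that the snippet runs without a real transcript.

```python
import csv


def parse_transcript(lines):
    """Parse CSV lines (an open file or any iterable of strings) into rows."""
    return [row for row in csv.reader(lines) if row]  # drop blank lines


# Invented sample data; the real column layout may differ.
sample = ["text,wav\n", '"hello, world",chunk1.flac\n']
for row in parse_transcript(sample):
    print(row)
```

For a real file, open it with `open(path, newline="", encoding="utf-8")` and pass the file object to `parse_transcript`.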

🚨 Note

This repository depends heavily on Pytube to download Youtube videos, and Pytube is buggy at times. Workarounds are often posted on the issues page of the Pytube repository or of this repository. Please take the initiative to file issues to help others who use this repository.

📝 Article

Read the article below to learn how to use the repository.

Youtube to Text with Speech Recognition in Python

📩 Reach out to me

This repository grew out of personal use: retrieving audio files for conversational speech recognition and audio classification.

For custom functionality development support, enterprise support and other related questions, reach out at
