Added serverless API #282

Open: wants to merge 5 commits into base: master
Changes from 2 commits
4 changes: 4 additions & 0 deletions README.md
@@ -73,6 +73,10 @@ It is possible to make minor changes to the generated speech.

In addition, it is required to use `--rate=-50%` instead of `--rate -50%` (note the lack of an equal sign) otherwise the `-50%` would be interpreted as just another argument.
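
The difference between the two forms can be reproduced with a small sketch, assuming an argparse-style parser. The `tts-demo` program name and the `make_parser` and `try_parse` helpers are hypothetical and exist only for illustration; they are not part of `edge-tts`.

```python
import argparse
import contextlib
import io


def make_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in parser with a single --rate option.
    parser = argparse.ArgumentParser(prog="tts-demo")
    parser.add_argument("--rate", default="+0%")
    return parser


def try_parse(argv):
    """Return the parsed rate, or None if argparse rejects the arguments."""
    try:
        # argparse prints its usage error to stderr before exiting; silence it.
        with contextlib.redirect_stderr(io.StringIO()):
            return make_parser().parse_args(argv).rate
    except SystemExit:
        return None


# The value is attached to the option, so the leading "-" is harmless.
print(try_parse(["--rate=-50%"]))
# "-50%" looks like another option, so --rate is left without a value.
print(try_parse(["--rate", "-50%"]))
```

In this sketch the second call is rejected because the parser sees `-50%` as an unknown option rather than as the value of `--rate`.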

### Deploying to Serverless API

See the [examples/serverless-api](examples/serverless-api) folder for more information on how to deploy to [Cerebrium](https://www.cerebrium.ai).

### Note on the `edge-playback` command

`edge-playback` is just a wrapper around `edge-tts` that plays back the generated speech. It takes the same arguments as the `edge-tts` command.
33 changes: 33 additions & 0 deletions examples/serverless-api/README.md
@@ -0,0 +1,33 @@
# Serverless Edge TTS API

This project demonstrates how to run Edge TTS as a serverless API using [Cerebrium](https://www.cerebrium.ai).

## Overview

The `main.py` file defines a function `run` that takes a text input and an optional voice parameter and generates audio and subtitles using Edge TTS. It reads the Edge TTS output as a stream and returns the collected audio and subtitles in a single response.

## Installation

1. `pip install cerebrium`
2. `cerebrium login`
3. Make sure you are in the `serverless-api` folder and run `cerebrium deploy`.

## Usage

Once deployed, you should be able to make a curl request similar to the one below. You can find the URL on your Cerebrium dashboard.
```
curl --location 'https://api.cortex.cerebrium.ai/v4/p-xxxxxx/serverless-api/run' \
--header 'Authorization: Bearer <AUTH_TOKEN>' \
--header 'Content-Type: application/json' \
--data '{"text": "Tell me something"}'
```

The `run` function takes two parameters:

- `text` (str): The text to be converted to speech
- `voice` (str, optional): The voice to use for TTS (default: "en-GB-SoniaNeural")

It returns a dictionary containing:

- `audio_data`: The generated audio as a base64-encoded string
- `subtitles`: The generated subtitles in WebVTT format
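
The same request can be sketched from Python using only the standard library. This is a minimal sketch under a few assumptions: the helper names `build_request`, `extract_audio`, and `synthesize` are hypothetical, the response body is assumed to be JSON that either is the returned dictionary or nests it under a `result` key, and the audio is assumed to be MP3 (hence the `speech.mp3` file name).

```python
import base64
import json
import urllib.request
from typing import Optional, Tuple


def build_request(url: str, token: str, text: str,
                  voice: Optional[str] = None) -> urllib.request.Request:
    """Build a POST request that mirrors the curl call above."""
    payload = {"text": text}
    if voice is not None:
        payload["voice"] = voice
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_audio(body: dict) -> Tuple[bytes, str]:
    """Return (audio bytes, subtitles) from a decoded JSON response."""
    # Assumption: the platform may nest the function's return value under "result".
    result = body.get("result", body)
    return base64.b64decode(result["audio_data"]), result["subtitles"]


def synthesize(url: str, token: str, text: str, out_path: str = "speech.mp3") -> str:
    """Call the deployed endpoint, write the audio to out_path, return the subtitles."""
    with urllib.request.urlopen(build_request(url, token, text)) as resp:
        audio, subtitles = extract_audio(json.load(resp))
    with open(out_path, "wb") as f:
        f.write(audio)
    return subtitles
```

Calling `synthesize(url, token, "Tell me something")` with the URL and token from the Cerebrium dashboard would write the decoded audio to disk and return the subtitle text.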
19 changes: 19 additions & 0 deletions examples/serverless-api/cerebrium.toml
@@ -0,0 +1,19 @@
[cerebrium.deployment]
name = "serverless-api"
python_version = "3.11"
docker_base_image_url = "debian:bookworm-slim"
include = "[./*, main.py, cerebrium.toml]"
exclude = "[.*]"

[cerebrium.hardware]
cpu = 2
memory = 12.0
compute = "CPU"

[cerebrium.scaling]
min_replicas = 0
max_replicas = 5
cooldown = 30

[cerebrium.dependencies.pip]
"edge-tts" = "latest"
21 changes: 21 additions & 0 deletions examples/serverless-api/main.py
@@ -0,0 +1,21 @@

import base64

import edge_tts


async def run(text: str, voice: str = "en-GB-SoniaNeural"):
    communicate = edge_tts.Communicate(text, voice)
    submaker = edge_tts.SubMaker()
    audio_data = bytearray()

    # Collect audio bytes and word timings as Edge TTS streams them back.
    async for chunk in communicate.stream():
        if chunk["type"] == "audio":
            audio_data.extend(chunk["data"])
        elif chunk["type"] == "WordBoundary":
            submaker.create_sub((chunk["offset"], chunk["duration"]), chunk["text"])

    return {
        # Base64 keeps the binary audio JSON-safe, matching the README.
        "audio_data": base64.b64encode(audio_data).decode("ascii"),
        "subtitles": submaker.generate_subs(),
    }
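
To illustrate how the `offset` and `duration` values passed to `create_sub` relate to subtitle timestamps, here is a minimal sketch. It assumes both values are expressed in 100-nanosecond units. The helpers `ticks_to_vtt_timestamp` and `cue_for` are hypothetical names for illustration, not part of `edge_tts`, and this is not necessarily how `SubMaker` formats its output.

```python
def ticks_to_vtt_timestamp(ticks: int) -> str:
    """Format a count of 100-nanosecond ticks as HH:MM:SS.mmm."""
    total_ms = ticks // 10_000  # 10,000 ticks per millisecond
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def cue_for(offset: int, duration: int, text: str) -> str:
    """Build one WebVTT-style cue from a word boundary (assumed tick units)."""
    start = ticks_to_vtt_timestamp(offset)
    end = ticks_to_vtt_timestamp(offset + duration)
    return f"{start} --> {end}\n{text}"


# A word starting at 1.0 s and lasting 0.5 s.
print(cue_for(10_000_000, 5_000_000, "Hello"))
```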