Welcome to the FastAPI Language Prediction API! This project is a machine-learning-powered web service that predicts the language of a given sentence using an SVM classifier. It uses FastAPI for serving the model and TfidfVectorizer for text transformation.
- Language Prediction: Detects the language of an input sentence.
- Clean Text Functionality: Automatically removes noise like HTML tags, URLs, stopwords, and accents.
- Machine Learning: Uses Support Vector Machine (SVM) to classify text.
- Swagger UI: Comprehensive API documentation using Swagger.
- Modular Code Structure: Well-organized codebase for easy maintainability and scalability.
- Docker Support: Seamlessly deploy the application using Docker.
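To make the clean-text feature concrete, here is a minimal sketch of such a cleaner using only the standard library. The function name `clean_text`, the regular expressions, and the tiny English stopword set are assumptions for illustration; the project's actual implementation may differ.

```python
import html
import re
import string
import unicodedata

# Tiny illustrative stopword set (assumption); a real service would likely
# load fuller, per-language lists.
STOPWORDS = {"the", "a", "an", "is", "of", "and", "to", "in"}

TAG_RE = re.compile(r"<[^>]+>")
URL_RE = re.compile(r"(?:https?://|www\.)\S+")
PUNCT_TABLE = str.maketrans({ch: " " for ch in string.punctuation})


def clean_text(text: str) -> str:
    """Hypothetical cleaner: strips HTML, URLs, accents, punctuation, stopwords."""
    text = TAG_RE.sub(" ", text)   # drop HTML tags
    text = html.unescape(text)     # decode entities such as &amp;
    text = URL_RE.sub(" ", text)   # drop URLs
    # Decompose accented characters, then discard the combining marks.
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Lowercase and replace ASCII punctuation with spaces.
    text = text.lower().translate(PUNCT_TABLE)
    tokens = [tok for tok in text.split() if tok not in STOPWORDS]
    return " ".join(tokens)
```

Calling `clean_text("<p>The café is at https://example.com, naïve!</p>")` yields `"cafe at naive"` under these assumptions.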
├── app/
│   ├── api/
│   │   ├── __init__.py         # Package initialization
│   │   └── v1/
│   │       ├── __init__.py     # API versioning
│   │       └── endpoints.py    # API routes and endpoints
│   ├── core/
│   │   ├── __init__.py         # Core initialization
│   │   └── config.py           # Application configuration settings
│   ├── models/
│   │   └── __init__.py         # Machine learning models (SVM, Tfidf, etc.)
│   ├── services/
│   │   ├── __init__.py         # Services initialization
│   │   └── prediction.py       # Business logic for language prediction
│   └── main.py                 # FastAPI initialization and application bootstrap
├── dataset/
│   └── dataset.csv             # Training dataset (ignored in git)
├── Dockerfile                  # Dockerfile for containerization
├── docker-compose.yml          # Docker Compose configuration
├── requirements.txt            # Python dependencies
└── README.md                   # Project documentation (you're reading this!)
Follow these steps to run the application locally or deploy it using Docker.
git clone https://github.com/Damieee/language-prediction-api
cd language-prediction-api
Create a .env file in the root directory and add the following values:
AWS_ACCESS_KEY_ID=your-access-key
AWS_SECRET_ACCESS_KEY=your-secret-key
AWS_REGION=your-region
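As a rough illustration of how these values could be consumed, the sketch below reads the three variables from the process environment and fails early if any is missing. It assumes the contents of .env have already been exported into the environment (for example by Docker Compose or a loader such as python-dotenv). The names `AWSSettings` and `load_aws_settings` are hypothetical and not taken from the project's config.py.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AWSSettings:
    """Hypothetical container for the three AWS values listed above."""
    access_key_id: str
    secret_access_key: str
    region: str


def load_aws_settings(env=None) -> AWSSettings:
    """Read the AWS variables from a mapping, defaulting to os.environ."""
    env = os.environ if env is None else env
    names = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_REGION")
    # Treat unset and empty values the same way: both count as missing.
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return AWSSettings(
        access_key_id=env["AWS_ACCESS_KEY_ID"],
        secret_access_key=env["AWS_SECRET_ACCESS_KEY"],
        region=env["AWS_REGION"],
    )
```

Failing at startup rather than at first use tends to make a misconfigured deployment easier to diagnose.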
Make sure you have Python 3.8+ installed, then install the required packages:
pip install -r requirements.txt
To run the FastAPI app locally, execute:
uvicorn app.main:app --reload
The API documentation will be available at http://localhost:8000/docs.
To deploy the application using Docker, run the following command:
docker-compose up --build
This will build the Docker image and run the container. You can access the API at http://localhost:8000.
- Text Cleaning: Removes HTML, URLs, accents, punctuation, and stopwords.
- TF-IDF Vectorization: Converts text to numerical data.
- SVM Classifier: Classifies the input text into one of the supported languages.
- Label Encoding: Encodes languages for the model.
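The vectorization, classification, and label-encoding steps can be sketched with scikit-learn roughly as follows. This is an illustration under assumptions, not the project's training code: the toy sentences are made up, `LinearSVC` stands in for whichever SVM variant the project uses, and the character n-gram settings are a guess.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import LabelEncoder
from sklearn.svm import LinearSVC

# Toy training data, invented for illustration; the real dataset.csv is not shown.
sentences = [
    "good morning my friend",
    "where is the train station",
    "bonjour mon ami",
    "je voudrais un café",
    "buenos días mi amigo",
    "dónde está la estación",
]
languages = ["English", "English", "French", "French", "Spanish", "Spanish"]

# Label encoding: map language names to integer class ids.
encoder = LabelEncoder()
y = encoder.fit_transform(languages)

# TF-IDF over character n-grams feeding a linear SVM (both settings are assumptions).
model = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(1, 3)),
    LinearSVC(),
)
model.fit(sentences, y)


def predict_language(sentence: str) -> str:
    """Predict a class id, then decode it back to a language name."""
    label_id = model.predict([sentence])[0]
    return str(encoder.inverse_transform([label_id])[0])
```

In a real service the input would pass through the text-cleaning step before reaching `predict_language`, and the fitted vectorizer, classifier, and encoder would be loaded from disk rather than trained at import time.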
Request:
{
"sentence": "Enter the sentence here"
}
Response:
{
"sentence": "Enter the sentence here",
"predicted_language": "English"
}
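A small client for this request/response shape might look like the sketch below, using only the standard library. The route path in `PREDICT_URL` is an assumption, since the exact path is not stated here; check the Swagger UI at /docs for the real one. The helper names are hypothetical.

```python
import json
import urllib.request

# Assumption: hypothetical route path; replace it with the one listed at /docs.
PREDICT_URL = "http://localhost:8000/api/v1/predict"


def build_request(sentence: str, url: str = PREDICT_URL) -> urllib.request.Request:
    """Build a JSON POST request carrying the sentence to classify."""
    body = json.dumps({"sentence": sentence}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_response(raw: bytes) -> str:
    """Extract the predicted language from a JSON response body."""
    payload = json.loads(raw.decode("utf-8"))
    return payload["predicted_language"]


def predict(sentence: str, url: str = PREDICT_URL) -> str:
    """Send the request to a running server and return the prediction."""
    with urllib.request.urlopen(build_request(sentence, url)) as resp:
        return parse_response(resp.read())
```

With the server running locally, `predict("Enter the sentence here")` would return the value of the `predicted_language` field.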
This project is configured to run in production using Docker. In production, you should ensure the following:
- Environment Variables: Ensure that the .env file or Docker environment contains production-specific keys and configurations.
- Docker Compose: The Docker Compose configuration allows you to spin up the application easily in production environments.
Feel free to open issues or submit pull requests for any feature improvements or bug fixes.
- Fork the repository
- Create a new feature branch (git checkout -b feature/your-feature)
- Commit your changes (git commit -m 'Add feature')
- Push to the branch (git push origin feature/your-feature)
- Open a pull request
- Maintainer: Dare Ezekiel
- Email: [email protected]