Hello! I'm an enthusiastic AI Engineer with a strong passion for the intersection of computer science, economics, and applied mathematics. My focus is on machine learning and its practical applications in real-world scenarios.
- Machine Learning
- Statistical Modeling
- NLP & LLMs
- Algorithm Design
- Software Engineering (certification)
- Macroeconomics
- Finance
- HPC

Here are some of the key projects I've worked on:

| Project Name | Description | Technologies Used | Link |
|---|---|---|---|
| Investment opportunities identification - Ardian | Implementation of a search engine leveraging BERT and additional data to identify firms with high acquisition potential. | Python, BERT, Azure | GitHub Repo |
| Document Chat Application (RAG) | A web application that lets users upload documents and chat about their content using retrieval-augmented generation. Built with FastAPI, Firebase authentication, and OpenAI's GPT models. | OpenAI, FastAPI, Docker, Kubernetes | GitHub Repo |
| Analysis of LLMs at Small Scale - INRIA | Implementation and training of small-scale language models (<100M parameters) with the Transformers library on AWS GPU instances, trained on the full English Wikipedia. | Python, Transformers, PyTorch, wandb.ai | GitHub Repo |

| Hackathon Name | Description | Technologies Used | Link |
|---|---|---|---|
| Hackathon Banque de France (WINNER) | Designed a solution that automatically identifies, from internal documentation, the legal topics currently handled by the business, then generates legal monitoring content on those topics (articles, news, codes of conduct, and European legislation) for distribution via a newsletter. | Python, Azure, React, RSS feeds, GPT API, TF-IDF | Not available (sharing the solution is not permitted) |
| H-Gen AI 2025 (WINNER) | Document analysis tool developed for Gide, a leading international law firm. The application streamlines the audit process by automatically analyzing PDF documents with Large Language Models (LLMs) and generating structured audit reports in Word format from predefined templates. | Python, AWS, RAG | GitHub Repo |
| H!Paris | Model trained to predict water-table levels over time. | Python, XGBoost, LSTM | None |

| Project Name | Description | Technologies Used | Link |
|---|---|---|---|
| Double Descent | The project explores the double descent phenomenon, where test error decreases again once a model becomes overparameterized, using linear regression, random Fourier features (RFF), and neural networks. The experiments suggest that implicit biases let overparameterized models generalize effectively, challenging the traditional view of overfitting. | Python, PyTorch, Git | GitHub Repo |
| Bayesian Statistics: Optimal Bayesian Estimation of Student's t Mixtures with a Growing Number of Components | The project extends Bayesian estimation for Gaussian mixture models to Student's t mixtures, which are better suited to heavy-tailed data. Although the heavy tails of the Student's t distribution raise theoretical challenges, empirical simulations show that Bayesian methods perform robustly, particularly on complex or heavy-tailed distributions. | Python, Git | GitHub Repo |
| Time Series Analysis of the French Industrial Production Index for Electricity Production | Data cleaning, transformation to stationarity, model selection, and validation using ARMA and ARIMA models. | R | GitHub Repo |
| Sentiment Analysis | Web scraping to collect data, followed by sentiment analysis of the top 100 box office films. | Python, Selenium, NLTK, spaCy, scikit-learn, pandas | GitHub Repo |
- LinkedIn: My LinkedIn
- Email: Vincent.gimenes@gmail.com

Feel free to explore the repository and reach out if you'd like to collaborate or discuss exciting ideas!

Application is the alchemy that transforms your acquired knowledge into gold

