This reproducibility package accompanies the paper "Everyone's a Winner! On Hyperparameter Tuning of Recommendation Models", submitted to the ACM RecSys '23 conference. The results reported in the paper were obtained with the Elliot framework; please refer to its official web documentation and GitHub page for further information.
This repository contains the configuration files and datasets used in our analysis. In particular, the "config_files" folder contains two types of configuration files: the "tuned" configuration files specify ranges of hyperparameters and are used to tune the algorithms, while the "untuned" configuration files contain randomly selected hyperparameter values. The optimal hyperparameter values for each model and dataset, as well as the performance of the algorithms on additional metrics, can be found in the "Additional material" folder. Moreover, the list of papers examined for the analysis in Section 2 of the paper and the results with additional metrics can be seen here.
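As a rough illustration of the difference between the two kinds of files, a "tuned" configuration lists a search space per hyperparameter, whereas an "untuned" one fixes each value. The following is a generic Elliot-style sketch, not one of the actual configuration files in this repository; model, paths, and values are placeholders:

```yaml
# Illustrative Elliot-style configuration (not a file from this repository).
experiment:
  dataset: movielens_1m
  data_config:
    strategy: dataset
    dataset_path: ../data/movielens_1m/dataset.tsv
  top_k: 10
  evaluation:
    simple_metrics: [nDCG]
  models:
    ItemKNN:
      meta:
        hyper_opt_alg: grid      # "tuned": search over the listed values
        save_recs: True
      neighbors: [50, 100, 200]  # a value range in a "tuned" file,
      similarity: cosine         # a single fixed value in an "untuned" file
```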
Elliot works with the following operating systems:
- Linux
- Windows 10
- macOS
Elliot requires Python version 3.6 or later.
Elliot requires TensorFlow version 2.3.2 or later. If you want to use Elliot with GPU support, please ensure that CUDA (cudatoolkit) version 10.1 and cuDNN version 7.6 or later are installed, together with a matching NVIDIA driver (on Linux and Windows 10).
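If you are unsure whether the GPU setup is picked up correctly, a quick check such as the following can be run inside the environment (an illustrative snippet, not part of the repository):

```python
# Sanity check: print the installed TensorFlow version and the GPUs it can see.
import tensorflow as tf

print("TensorFlow version:", tf.__version__)
print("GPUs visible to TensorFlow:", tf.config.list_physical_devices("GPU"))
```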
- Open the Anaconda command prompt and move to the directory where you want to store the GitHub repo "RecSys2023_hyperparameter_tuning"
- Run the following command to clone the repository

git clone https://github.com/Faisalse/RecSys2023_hyperparameter_tuning.git

- Run the following command to create the "elliot_env" environment

conda create --name elliot_env python=3.8
Activate the conda environment
conda activate elliot_env
Change to the cloned "RecSys2023_hyperparameter_tuning" directory and run this line to install the required packages
pip install -r requirements.txt
After the required packages are installed, run this line to reproduce the results
python reproducibility.py
By default, the "Untuned_movieLenz.yml" configuration file is used to reproduce the results for the MovieLens dataset. The other configuration files used in our experiments are available in the "config_files" folder. To reproduce the results for a different setting, pick the desired configuration file, write its name in the "reproducibility.py" file, and run the script again.
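We do not reproduce the contents of "reproducibility.py" here; as a rough orientation, Elliot experiments are typically launched by passing a configuration file path to `run_experiment`, so switching configurations amounts to changing a single path. A minimal sketch, assuming the standard Elliot entry point and the default file name from this repository:

```python
# Minimal sketch of launching an Elliot experiment from Python.
# The exact content of reproducibility.py may differ.
from elliot.run import run_experiment

# Default configuration used in this package; replace the path with any
# other file from the "config_files" folder to reproduce a different setting.
run_experiment("config_files/Untuned_movieLenz.yml")
```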
In the "data" folder, you can find the datasets used in this reproducibility study (MovieLens 1M, Amazon Digital Music, Epinions). We also share the original data file and the split version of each dataset so that the experiments can be carried out with exactly the same setting. The table below shows the statistics of the selected datasets. For the MovieLens 1M and Amazon Digital Music datasets, the same p-core and threshold values are used as in [1]. The p-core filter iteratively removes users and items with fewer interactions than the given core value, while the threshold parameter applies a single system-wide threshold to discard irrelevant transactions, i.e., those with ratings < 4 (see the sketch after the table below).
[1] Anelli, Vito Walter, et al. "Top-n recommendation algorithms: A quest for the state-of-the-art." Proceedings of the 30th ACM Conference on User Modeling, Adaptation and Personalization. 2022.
Dataset | Original transactions | Original users | Original items | p-core | threshold |
---|---|---|---|---|---|
MovieLens 1M | 1000209 | 6040 | 3706 | 10 | 4 |
Amazon Digital Music | 1584082 | 840372 | 840372 | 5 | 4 |
Epinions | 300548 | 8514 | 8510 | 2 | / |
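For clarity, the following is a small illustrative sketch of how the rating threshold and an iterative p-core filter can be applied to a ratings file with pandas. It is not the code used in this package; the file path and column names are assumptions:

```python
import pandas as pd

# Assumed column layout: user, item, rating, timestamp (tab-separated).
df = pd.read_csv("data/movielens_1m/dataset.tsv", sep="\t",
                 names=["user", "item", "rating", "timestamp"])

# Threshold: keep only transactions with rating >= 4.
df = df[df["rating"] >= 4]

# Iterative p-core: repeatedly drop users and items with fewer than k interactions
# until every remaining user and item has at least k interactions.
k = 10
while True:
    user_counts = df.groupby("user")["item"].transform("size")
    item_counts = df.groupby("item")["user"].transform("size")
    mask = (user_counts >= k) & (item_counts >= k)
    if mask.all():
        break
    df = df[mask]
```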