forked from chwoong/LiRE

Listwise Reward Estimation for Offline Preference-based Reinforcement Learning (ICML 2024)

Listwise Reward Estimation for Offline Preference-based Reinforcement Learning

This is the official implementation of LiRE.

This repository contains the offline RL datasets and the scripts needed to reproduce the experiments.

The code is based on:

  • CORL: an offline reinforcement learning library that provides single-file implementations of offline RL algorithms.
  • PEBBLE: an online preference-based reinforcement learning codebase. We used its SAC implementation to create the new offline preference-based RL datasets.

Please visit our paper and project page for more details.

Installation

1. Install from the conda env file
  conda env create -f LiRE.yml
  pip install git+https://github.com/Farama-Foundation/Metaworld.git@master#egg=metaworld
  pip install git+https://github.com/denisyarats/dmc2gym.git
  pip install gdown
  sudo apt install unzip
2. Install packages manually
  conda create -n LiRE python=3.9
  conda activate LiRE
  conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
  pip install "gym[mujoco_py,classic_control]==0.23.0"
  pip install pyrallis rich tqdm==4.64.0 wandb==0.12.21
  pip install git+https://github.com/denisyarats/dmc2gym.git
  pip install git+https://github.com/Farama-Foundation/Metaworld.git@master#egg=metaworld
  pip install gdown
  sudo apt install unzip
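
After either installation route, a quick check that the packages can be located may save a failed run later. The following is a minimal sketch, not part of the repository: the module names are assumed from the install commands above, and `missing_modules` is a hypothetical helper.

```python
import importlib.util

# Import names assumed from the install commands above; adjust if your setup differs.
REQUIRED_MODULES = [
    "torch", "gym", "metaworld", "dmc2gym",
    "pyrallis", "rich", "tqdm", "wandb", "gdown",
]


def missing_modules(names):
    """Return the top-level module names that cannot be located in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

This only checks that each module can be found; it does not import them, so it will not catch version conflicts or a broken MuJoCo setup.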
  • Troubleshooting
    • AttributeError: module 'numpy' has no attribute 'int'
      • In .../LiRE/lib/python3.9/site-packages/dmc2gym/wrappers.py, change dim = np.int(np.prod(s.shape)) to dim = int(np.prod(s.shape)). The np.int alias was removed in NumPy 1.24.
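
The reason the fix works: `np.int` was only an alias for the built-in `int`, deprecated in NumPy 1.20 and removed in 1.24, so the built-in is a drop-in replacement. A minimal sketch of the corrected expression follows; `flat_dim` is a hypothetical name used for illustration, not a function from dmc2gym.

```python
import numpy as np


def flat_dim(shape):
    """Number of elements in an array of the given shape, as a plain Python int."""
    # np.prod returns a NumPy scalar; the built-in int() converts it,
    # which is what the removed np.int alias used to do.
    return int(np.prod(shape))
```

For example, `flat_dim((3, 4))` gives 12 on both old and new NumPy versions.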

Algorithms

In this repo, you can run MR, SeqRank, and LiRE.

For the other baselines, we ran experiments with the following repositories:

Algorithm  URL
PT         https://github.com/csmile-1006/PreferenceTransformer
DPPO       https://github.com/snu-mllab/DPPO
IPL        https://github.com/jhejna/inverse-preference-learning
Dataset

For more details, please see here

  • MetaWorld
  • DMControl

Scripts

Please see here
