Listwise Reward Estimation for Offline Preference-based Reinforcement Learning

This is the official implementation of LiRE.

This repository contains the offline RL dataset and the scripts needed to reproduce the experiments.

The code is based on:

- CORL library: an offline reinforcement learning library that provides single-file implementations of offline RL algorithms.
- PEBBLE: an online preference-based reinforcement learning codebase. We used its SAC implementation to create a new offline preference-based RL dataset.

Please visit our paper and project page for more details.

Installation

1. Install with the conda env file

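  conda env create -f LiRE.yml
  # Activate the new environment before the pip installs below.
  # Assumes the environment defined in LiRE.yml is named LiRE.
  conda activate LiRE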
  pip install git+https://github.com/Farama-Foundation/Metaworld.git@master#egg=metaworld
  pip install git+https://github.com/denisyarats/dmc2gym.git
  pip install gdown
  sudo apt install unzip

2. Install from the package list

  conda create -n LiRE python=3.9
  # Activate the environment so the following installs go into it.
  conda activate LiRE
  conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
  pip install "gym[mujoco_py,classic_control]==0.23.0"
  pip install pyrallis rich tqdm==4.64.0 wandb==0.12.21
  pip install git+https://github.com/denisyarats/dmc2gym.git
  pip install git+https://github.com/Farama-Foundation/Metaworld.git@master#egg=metaworld
  pip install gdown
  sudo apt install unzip
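
After either installation route, a quick check that the packages are visible to the active Python interpreter can save a failed run later. The following is a minimal sketch, not part of this repository: the helper name `check_packages` is hypothetical, and the module names in `REQUIRED` are assumed to match the import names of the packages installed above.

```python
import importlib.util

# Import names for the packages installed above.
# Assumption: each import name matches its pip/conda package name
# (PyTorch is imported as "torch").
REQUIRED = ["torch", "gym", "metaworld", "dmc2gym",
            "pyrallis", "rich", "tqdm", "wandb", "gdown"]


def check_packages(names):
    """Return {name: bool} telling whether each top-level module can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_packages(REQUIRED).items():
        print(f"{name:10s} {'ok' if found else 'MISSING'}")
```

Run it inside the activated LiRE environment. The check only confirms that each module can be located and does not import it, so a package reported as found may still fail at import time, as in the troubleshooting case below.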

Troubleshooting
- AttributeError: module 'numpy' has no attribute 'int'
  - NumPy 1.24 removed the np.int alias. In .../LiRE/lib/python3.9/site-packages/dmc2gym/wrappers.py, change dim = np.int(np.prod(s.shape)) to dim = int(np.prod(s.shape)).
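
If you prefer not to edit the file by hand, the same substitution can be scripted. This is a rough sketch under stated assumptions: `patch_np_int` and `patch_file` are hypothetical helpers, not part of dmc2gym or this repository, and you must supply the full path to your own copy of `wrappers.py`.

```python
import re
from pathlib import Path

# Matches calls to the removed alias `np.int(` only;
# `np.int32(`, `np.int64(` and `np.int_(` are left untouched.
_NP_INT_CALL = re.compile(r"\bnp\.int\(")


def patch_np_int(source: str) -> str:
    """Replace every `np.int(` call in the given source text with the builtin `int(`."""
    return _NP_INT_CALL.sub("int(", source)


def patch_file(path: str) -> bool:
    """Patch a file in place; return True if anything was changed."""
    target = Path(path)
    original = target.read_text()
    patched = patch_np_int(original)
    if patched != original:
        target.write_text(patched)
        return True
    return False
```

Call `patch_file` with the location of `dmc2gym/wrappers.py` inside your environment's `site-packages` directory. Reinstalling dmc2gym overwrites the file, so the patch would need to be applied again.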

Algorithms

This repository can run MR, SeqRank, and LiRE.

For the other baselines, we ran experiments with the following repositories:

Algorithm	URL
PT	https://github.com/csmile-1006/PreferenceTransformer
DPPO	https://github.com/snu-mllab/DPPO
IPL	https://github.com/jhejna/inverse-preference-learning

Dataset

For more details, please see here

MetaWorld
DMControl

Scripts

Please see here
