OUTRE

This repository contains the code for "OUTRE: An Out-of-core De-Redundancy Framework for GNN Training on Massive Graphs within A Single Server". OUTRE is built on top of Ginex, an existing GNN training framework. The Bloom filter implementation used in OUTRE is taken from here.

Setup:

Disable read_ahead on Linux.

sudo -s
echo 0 > /sys/block/$block_device_name/queue/read_ahead_kb
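
To confirm that the setting took effect, the value can be read back from the same sysfs file. The following is a minimal sketch, not part of OUTRE; the helper names and the `sysfs_root` parameter are assumptions introduced here for illustration.

```python
from pathlib import Path


def read_ahead_kb(device: str, sysfs_root: str = "/sys/block") -> int:
    """Return the read-ahead window (in KB) currently set for a block device.

    `device` is the same name used for $block_device_name in the command above.
    `sysfs_root` is a hypothetical parameter that only exists to make the
    helper easy to test against a fake directory tree.
    """
    path = Path(sysfs_root) / device / "queue" / "read_ahead_kb"
    return int(path.read_text().strip())


def read_ahead_disabled(device: str, sysfs_root: str = "/sys/block") -> bool:
    """True when read-ahead is set to 0 for the device."""
    return read_ahead_kb(device, sysfs_root) == 0
```

Calling `read_ahead_disabled(name)` with your block device name should return `True` after the `echo 0` command has run. Reading the file does not require root.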

Install necessary Linux packages.
1. sudo apt-get install -y build-essential
2. sudo apt-get install -y cgroup-tools
3. sudo apt-get install -y unzip
4. sudo apt-get install -y python3-pip, then pip3 install --upgrade pip
5. A compatible NVIDIA CUDA driver and toolkit

Install Python packages.
1. PyTorch
2. ogb
3. PyG
4. DGL (version >= 1.0)
5. Any other packages that are required

Install ninja.

sudo wget https://github.com/ninja-build/ninja/releases/download/v1.8.2/ninja-linux.zip
sudo unzip ninja-linux.zip -d /usr/local/bin/
sudo update-alternatives --install /usr/bin/ninja ninja /usr/local/bin/ninja 1 --force
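
Before moving on, it can help to check that the command-line tools installed above are on the PATH. This is a small sketch using only the Python standard library; the function name and the tool list are assumptions chosen for illustration (gcc comes with build-essential, cgcreate and cgexec with cgroup-tools).

```python
import shutil

# Executables expected after the setup steps above (illustrative list).
REQUIRED_TOOLS = ["gcc", "cgcreate", "cgexec", "unzip", "pip3", "ninja"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the subset of `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]
```

An empty list from `missing_tools()` suggests the packages were installed; any name it returns points to the step that should be repeated.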

Use cgroup to limit the host memory available to training. For example, the commands below limit the host memory to 64GB.

sudo -s
cgcreate -g memory:64gb
echo 64000000000 > /sys/fs/cgroup/memory/64gb/memory.limit_in_bytes
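
The value written to memory.limit_in_bytes is a plain byte count. The sketch below shows the arithmetic for the 64GB example and for other sizes; the constant and function names are assumptions introduced here. Note that the command above uses decimal gigabytes (10^9 bytes), which is slightly less than 64 GiB.

```python
GB = 10**9    # decimal gigabyte, the unit used by the echo command above
GIB = 2**30   # binary gibibyte, shown only for comparison


def limit_in_bytes(size: int, unit: int = GB) -> int:
    """Convert a memory size to the byte count expected by memory.limit_in_bytes."""
    return size * unit


decimal_limit = limit_in_bytes(64)        # 64000000000, as in the example
binary_limit = limit_in_bytes(64, GIB)    # 68719476736
```

To use a different limit, for instance 32GB, write `limit_in_bytes(32)` to the corresponding file of a cgroup created with a matching name.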

Allocate enough swap area.

Run on mag240m-cite:

Prepare dataset

python3 prepare_dataset_mag.py --dataset mag240m

Partition the original graph

python3 partition_fennel_twolevel.py --dataset mag240m

Create neighbor cache

python3 create_neigh_cache.py --neigh-cache-size 10000000000

Get PYTHONPATH
```
python3 get_pythonpath.py
```

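Run OUTRE on mag240m-cite. In the commands below, replace the xxx in PYTHONPATH=xxx with the output of step 4 (Get PYTHONPATH). Run the profiling command first, then the main training command.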

sudo PYTHONPATH=xxx cgexec -g memory:64gb python3 -W ignore run_profiling.py --neigh-cache-size 10000000000 --feature-cache-size 30000000000 --dataset mag240m

sudo PYTHONPATH=xxx cgexec -g memory:64gb python3 -W ignore run_main.py --neigh-cache-size 10000000000 --feature-cache-size 30000000000 --num-epochs 1 --dataset mag240m
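
Both commands share the same prefix and differ only in the script and its flags, so they can be assembled programmatically. The following is a sketch under stated assumptions: `build_command` is a hypothetical helper, not part of OUTRE, and it uses only the flags that appear in the commands above.

```python
import shlex


def build_command(script, pythonpath, cgroup="memory:64gb", **flags):
    """Assemble the cgexec command line for one of the OUTRE scripts.

    Keyword arguments become command-line flags: underscores are turned
    into hyphens, so neigh_cache_size=... becomes --neigh-cache-size ...
    """
    cmd = ["sudo", f"PYTHONPATH={pythonpath}", "cgexec", "-g", cgroup,
           "python3", "-W", "ignore", script]
    for name, value in flags.items():
        cmd += [f"--{name.replace('_', '-')}", str(value)]
    return cmd


# "xxx" is the same placeholder as above; substitute the output of step 4.
profiling_cmd = build_command(
    "run_profiling.py", "xxx",
    neigh_cache_size=10_000_000_000,
    feature_cache_size=30_000_000_000,
    dataset="mag240m",
)
main_cmd = build_command(
    "run_main.py", "xxx",
    neigh_cache_size=10_000_000_000,
    feature_cache_size=30_000_000_000,
    num_epochs=1,
    dataset="mag240m",
)
print(shlex.join(profiling_cmd))
print(shlex.join(main_cmd))
```

The printed lines match the two commands above. The lists can also be passed to `subprocess.run` directly, which avoids shell quoting issues.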
