Choose a way to install Rust:
- Native Rust:
Install Rust:
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
Enter a new shell and test:
rustc --version
When running on a Mac with Intel hardware (not M1), you may run into _clang: error: the clang compiler does not support '-march=native'_ during pip install. If so, set your ARCHFLAGS during pip install, e.g.:
ARCHFLAGS="-arch x86_64" pip3 install -r requirements.txt
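If you are unsure which architecture your Python is running under, a minimal check (plain Python, no extra packages assumed) is:
import platform

# "x86_64" means Intel hardware (the ARCHFLAGS workaround may apply);
# "arm64" means Apple Silicon (M1/M2), where the flag is not needed.
print(platform.machine())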
If you encounter an error while building a wheel during the pip install process, you may need to install a C++ compiler on your computer.
Set up the environment:
conda create -n h2ogpt python=3.10
conda activate h2ogpt
pip install -r requirements.txt
- Conda Rust:
If native Rust does not work, try the conda approach by creating a conda environment with Python 3.10 and Rust.
conda create -n h2ogpt python=3.10 rust
conda activate h2ogpt
pip install -r requirements.txt
To run in CPU mode with the default model, do:
python generate.py --base_model='llama' --prompt_type=wizard2 --score_model=None --langchain_mode='UserData' --user_path=user_path
For the above, ignore the CLI output saying 0.0.0.0, and instead point your browser at http://localhost:7860 (for Windows/Mac) or the public live URL printed by the server (disable the shared link with --share=False). To support document Q/A, jump to Install Optional Dependencies.
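To confirm the server is actually serving on localhost rather than on the 0.0.0.0 address printed in the CLI output, a quick check from the standard library (a sketch, assuming the default port 7860) is:
from urllib.request import urlopen

# Fetch the UI root; an HTTP 200 means the Gradio interface is up.
with urlopen("http://localhost:7860", timeout=5) as resp:
    print(resp.status)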
Note: A few fixes required for MPS support are not available in torch 2.0, so make sure to install torch 2.1 from the nightly build as shown below.
- Create a conda environment with Python 3.10 and Rust.
conda create -n h2ogpt python=3.10 rust
conda activate h2ogpt
- Install torch dependencies from the nightly build to get the latest MPS support
pip install --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cpu
- To verify whether torch uses MPS, run the Python script below.
import torch

if torch.backends.mps.is_available():
    mps_device = torch.device("mps")
    x = torch.ones(1, device=mps_device)
    print(x)
else:
    print("MPS device not found.")
Output:
tensor([1.], device='mps:0')
- Install other h2ogpt requirements
pip install -r requirements.txt
- Run h2oGPT (without document Q/A):
python generate.py --base_model=h2oai/h2ogpt-gm-oasst1-en-2048-open-llama-7b --cli=True
For the above, ignore the CLI output saying 0.0.0.0, and instead point your browser at http://localhost:7860 (for Windows/Mac) or the public live URL printed by the server (disable the shared link with --share=False). To support document Q/A, jump to Install Optional Dependencies.
# Required for Doc Q/A: LangChain:
pip install -r reqs_optional/requirements_optional_langchain.txt
# Required for CPU: LLaMa/GPT4All:
pip install -r reqs_optional/requirements_optional_gpt4all.txt
# Optional: PyMuPDF/ArXiv:
pip install -r reqs_optional/requirements_optional_langchain.gpllike.txt
# Optional: Selenium/PlayWright:
pip install -r reqs_optional/requirements_optional_langchain.urls.txt
# Optional: for supporting unstructured package
python -m nltk.downloader all
For supporting Word and Excel documents, download LibreOffice: https://www.libreoffice.org/download/download-libreoffice/. To support OCR, install Tesseract and other dependencies:
brew install libmagic
brew link libmagic
brew install poppler
brew install tesseract --all-languages
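To sanity-check that the brew packages above are visible to your environment, a small stdlib-only sketch (the exact binary names are assumptions based on what poppler and tesseract typically install) is:
import shutil
from ctypes.util import find_library

# poppler provides pdftoppm/pdftotext; tesseract provides the tesseract CLI.
for tool in ("tesseract", "pdftoppm", "pdftotext"):
    print(tool, "->", shutil.which(tool) or "NOT FOUND")

# libmagic is loaded as a shared library rather than a CLI tool.
print("libmagic ->", find_library("magic") or "NOT FOUND")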
Then for document Q/A with UI using CPU:
python generate.py --base_model='llama' --prompt_type=wizard2 --score_model=None --langchain_mode='UserData' --user_path=user_path
or MPS:
python generate.py --base_model=h2oai/h2ogpt-gm-oasst1-en-2048-open-llama-7b --langchain_mode=MyData --score_model=None
For the above, ignore the CLI output saying 0.0.0.0, and instead point your browser at http://localhost:7860 (for Windows/Mac) or the public live URL printed by the server (disable the shared link with --share=False).
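The --user_path=user_path option in the CPU command above points h2oGPT at a local folder of documents to index for the UserData collection. A minimal sketch to stage a document before launching (the folder name and sample file below are just illustrative, not required names):
from pathlib import Path
import shutil

# Create the folder passed via --user_path and drop a document into it.
user_path = Path("user_path")
user_path.mkdir(exist_ok=True)
# Hypothetical source file; replace with any PDF/Word/text document you want to query.
shutil.copy(Path.home() / "Documents" / "example.pdf", user_path)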
See CPU and GPU for other general aspects of using h2oGPT on CPU or GPU, such as which models to try.
Note: Currently llama-cpp-python only supports v3 GGML 4-bit quantized models for MPS, so use LLaMa model files whose names end with ggmlv3 and q4_x.bin.
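A quick way to check whether a downloaded model file matches that naming convention before pointing h2oGPT at it is a small sketch like the following (the helper and filenames are hypothetical, just to illustrate the pattern):
import re

def looks_mps_compatible(filename: str) -> bool:
    # v3 GGML, 4-bit quantization, e.g. "...ggmlv3.q4_0.bin" or "...ggmlv3.q4_1.bin"
    return re.search(r"ggmlv3\.q4_\w+\.bin$", filename) is not None

print(looks_mps_compatible("WizardLM-7B-uncensored.ggmlv3.q4_0.bin"))  # True
print(looks_mps_compatible("WizardLM-7B-uncensored.ggmlv3.q8_0.bin"))  # False (8-bit, not q4)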
- Install dependencies
# Required for Doc Q/A: LangChain:
pip install -r reqs_optional/requirements_optional_langchain.txt
# Required for CPU: LLaMa/GPT4All:
pip install -r reqs_optional/requirements_optional_gpt4all.txt
- Install the latest llama-cpp-python, which supports the macOS Metal GPU as of version 0.1.62 (you should end up with llama-cpp-python v0.1.62 or higher installed)
pip uninstall llama-cpp-python -y
CMAKE_ARGS="-DLLAMA_METAL=on" FORCE_CMAKE=1 pip install -U llama-cpp-python --no-cache-dir
- Edit the settings below in .env_gpt4all (a sanity-check sketch follows this list):
  - Uncomment the line with n_gpu_layers=20
  - Change the model name to your preferred model on the line with model_path_llama=WizardLM-7B-uncensored.ggmlv3.q8_0.bin
- Run the LLaMa model
python generate.py --base_model='llama' --cli=True
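As referenced above, a minimal sketch to sanity-check the edited .env_gpt4all (assuming it is a simple file of key=value lines with # marking commented-out entries):
from pathlib import Path

# Print only the active (uncommented) settings so you can confirm
# n_gpu_layers and model_path_llama are set as intended.
for line in Path(".env_gpt4all").read_text().splitlines():
    line = line.strip()
    if line and not line.startswith("#") and "=" in line:
        print(line)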