Install and run LLMs locally with text-generation-webui on AMD GPUs!
Let's set up and run large language models similar to ChatGPT locally on our AMD GPUs!

### Installing ROCm

```bash
sudo apt update
sudo apt install git python3-pip python3-venv python3-dev libstdc++-12-dev
wget https://repo.radeon.com/amdgpu-install/5.7.1/ubuntu/jammy/amdgpu-install_5.7.50701-1_all.deb
sudo apt install ./amdgpu-install_5.7.50701-1_all.deb
sudo amdgpu-install --usecase=graphics,rocm
sudo usermod -aG video $USER
sudo usermod -aG render $USER
sudo reboot
```

### Installing Text Generation Webui

```bash
mkdir ~/gpt
cd ~/gpt
git clone https://github.com/oobabooga/text-generation-webui
cd text-generation-webui

# Set up a virtual environment
python3 -m venv venv
source venv/bin/activate

# Install the torch build for ROCm 5.6
pip3 install torch --index-url https://download.pytorch.org/whl/rocm5.6

# Install the rest of the dependencies
pip3 install -r requirements_amd.txt

# Or, if your CPU has no AVX2 support, install these instead
pip3 install -r requirements_amd_noavx2.txt

# Create a launch script
nano launch.sh
```

Inside the launch script, paste:

```bash
#!/bin/bash
source venv/bin/activate
export HIP_VISIBLE_DEVICES=0
# 11.0.0 targets RDNA3 (gfx1100) cards; adjust for your GPU generation
export HSA_OVERRIDE_GFX_VERSION=11.0.0
python3 ./server.py --listen
```

Save and exit your launch script, then:

```bash
# Make the script executable
chmod +x launch.sh

# Now you can launch the webui with your script
./launch.sh
```

Model from the video: TheBloke/Wizard-Vicuna-7B-Uncensored-GPTQ

Some settings to check if models are not loading:

- Disable exllama
- float32
- fp4

The Transformers loader works most of the time, though it is not always performant. Generally, `load_in_8bit` and `load_in_4bit` will not work: they rely on bitsandbytes, which has poor ROCm support.
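If the webui can't see your GPU at all, it's worth confirming the ROCm install itself before digging into loader settings. A minimal sanity check, using the standard utilities installed by the ROCm packages above:

```bash
# Both tools ship with the ROCm runtime installed earlier
rocminfo | grep -i gfx   # should list your GPU's gfx target, e.g. gfx1100
rocm-smi                 # shows temperature, clock speeds, and VRAM usage
```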
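You can also verify that the ROCm build of PyTorch actually sees the GPU from inside the virtual environment. A quick sketch; note that ROCm builds of PyTorch expose the GPU through the regular `torch.cuda` API:

```bash
# Run inside the activated venv (source venv/bin/activate)
python3 - <<'EOF'
import torch
print(torch.__version__)            # should include a +rocm suffix
print(torch.cuda.is_available())    # True if the GPU is visible
if torch.cuda.is_available():
    # ROCm devices appear through the CUDA-compatible torch API
    print(torch.cuda.get_device_name(0))
EOF
```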
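Finally, one way to fetch the model from the video is the `download-model.py` script bundled with text-generation-webui, which saves models into the repo's `models/` folder (a sketch; you can also paste the model name into the webui's Model tab instead):

```bash
# Run from the text-generation-webui directory, inside the venv
python3 download-model.py TheBloke/Wizard-Vicuna-7B-Uncensored-GPTQ
```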