Learn how to run HuggingChat, an open-source ChatGPT alternative, locally (on a VM) and interact with the Open Assistant model, or any other Large Language Model (LLM), in two variants.
Variant 1: Run just the Chat-UI locally and utilize a remote inference endpoint from Hugging Face
Variant 2: Run the whole stack, the Chat-UI, the Text Generation Inference Server and the (Open Assistant) LLM on your Virtual Machine
Installing HuggingChat with the Installation Scripts created in this video
If you want the HuggingChat installation scripts that we created in the course of this video, feel free to purchase and download them.
Alternatively, if you want to get your hands dirty, you will find the scripts at the bottom of this page.
NEW! Installing HuggingChat with aitom8 and the HuggingChat aitom8 plugin
New: In the meantime we have created aitom8, a professional AI automation tool that automates the setup of a variety of open-source projects (optionally in virtual environments such as conda). For HuggingChat there is an aitom8 plugin available that lets you install HuggingChat with a single command.
aitom8 huggingchat install
You can get aitom8 and the HuggingChat aitom8 plugin here:
NEW! Code Llama 34B model with Inference and HuggingChat | Local Setup Guide (VM) and Live Demo
New: This video shows variant 3, which is required for downloading Llama models with your local inference server.
NEW! Talk to your documents with HuggingChat and the aitomChat extension
Learn everything, from Chat UI to Inference and Retrieval Augmented Generation (RAG) in the YouTube video below:
Get aitomChat here:
Installing HuggingChat manually
If you want to get your hands dirty, feel free to set up HuggingChat with the instructions and scripts below.
Prepare your Linux VM
Install Curl:
sudo apt install curl
Install NVM (Node Version manager):
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.3/install.sh | bash
nvm -v
Install the latest LTS release of Node.js and npm:
nvm install --lts
node -v
npm -v
Install and run the HuggingChat UI locally
Create new npm project (AI):
mkdir -p ~/dev/AI
cd ~/dev/AI
npm init
Update package.json:
{
  "name": "ai",
  "version": "1.0.0",
  "description": "Start Apps",
  "main": "index.js",
  "scripts": {
    "start-mongodb": "docker run --rm --name mongodb -p 27017:27017 -d -v ~/dev/mongo:/data/db mongo",
    "stop-mongodb": "docker stop mongodb",
    "install-chat-ui": "cd ./scripts && ./install-chat-ui.sh",
    "update-chat-ui": "cd ../chat-ui && git pull",
    "start-chat-ui": "cd ../chat-ui && npm run dev -- --host 127.0.0.1",
    "list-mongodb-collections": "docker exec -i mongodb sh -c 'mongosh --eval \"db.getCollectionNames()\" chat-ui'",
    "list-conversations": "docker exec -i mongodb sh -c 'mongosh --eval \"db.conversations.find()\" chat-ui'",
    "drop-database": "docker exec -i mongodb sh -c 'mongosh --eval \"db.dropDatabase()\" chat-ui'",
    "start-inference": "cd ./scripts && ./start-text-generation-inference.sh",
    "show-filesystem": "sudo df -Th && echo && sudo lsblk && echo && docker system df"
  },
  "author": "",
  "license": "ISC"
}
Create scripts directory:
mkdir ~/dev/AI/scripts
Create this script in the scripts directory:
install-chat-ui.sh
#!/usr/bin/env bash
sudo apt-get install git-lfs
sudo rm -R ../../chat-ui
cd ../.. && git clone https://huggingface.co/spaces/huggingchat/chat-ui
cd ./chat-ui && npm install
if [[ -f "../AI/data/chat-ui.env" ]]; then
  cp -v ../AI/data/chat-ui.env .env.local
fi
chmod u+x ~/dev/AI/scripts/install-chat-ui.sh
npm run install-chat-ui
Copy .env file to .env.local:
cp ~/dev/chat-ui/.env ~/dev/chat-ui/.env.local
Create the MongoDB (with npm and Docker):
npm run start-mongodb
Adapt ~/dev/chat-ui/.env.local file to your needs:
MONGODB_URL=mongodb://localhost:27017/
HF_ACCESS_TOKEN=hf_<token> # your token from https://huggingface.co/settings/token
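As a quick sanity check before starting the Chat-UI, a small helper like the one below can verify that both required settings are present. This is our own sketch, not part of chat-ui: the function name `check_env` and the default path are illustrative assumptions.

```shell
#!/usr/bin/env bash
# check_env [FILE]: sketch of a sanity check that the Chat-UI env file
# defines the two settings this guide relies on. Function name and
# default path are illustrative assumptions, not part of chat-ui itself.
check_env() {
  local file="${1:-$HOME/dev/chat-ui/.env.local}"
  local key
  for key in MONGODB_URL HF_ACCESS_TOKEN; do
    if ! grep -q "^${key}=" "$file"; then
      echo "missing: $key"
      return 1
    fi
  done
  echo "env file looks complete"
}
```

Run it as `check_env ~/dev/chat-ui/.env.local`; a non-zero exit code tells you which key is missing.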
Copy your .env.local file as chat-ui.env file into the ~/dev/AI/data directory (to allow fully automated reinstalls):
mkdir ~/dev/AI/data
cp ~/dev/chat-ui/.env.local ~/dev/AI/data/chat-ui.env
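The two commands above can also be wrapped in a small idempotent helper so the backup step can be re-run safely. The function name `backup_env` is our own sketch; the default paths follow the layout used in this guide.

```shell
#!/usr/bin/env bash
# backup_env [SRC] [DEST_DIR]: sketch of a re-runnable backup of the
# Chat-UI env file into the data directory, so automated reinstalls can
# restore it. Creates DEST_DIR if needed and skips silently when SRC is
# missing. Names and default paths are our own assumptions.
backup_env() {
  local src="${1:-$HOME/dev/chat-ui/.env.local}"
  local dest_dir="${2:-$HOME/dev/AI/data}"
  mkdir -p "$dest_dir"
  if [ -f "$src" ]; then
    cp -v "$src" "$dest_dir/chat-ui.env"
  fi
}
```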
Run the Chat-UI:
npm run start-chat-ui
Install and run the Text Generation Inference Server locally
Create this script in the scripts directory:
start-text-generation-inference.sh (Important: if you are not running an NVIDIA A100 GPU, you need to pass the parameter --disable-custom-kernels)
#!/usr/bin/env bash
#model=bigscience/bloom-560m
model=OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5
num_shard=2
# share a volume with the Docker container to avoid downloading weights every run
volume=$PWD/../../inference-data
name="text-generation-inference"
docker run --rm --name $name --gpus all --shm-size 1g -p 8081:80 \
  -v $volume:/data \
  ghcr.io/huggingface/text-generation-inference:latest \
  --model-id $model --num-shard $num_shard \
  --disable-custom-kernels
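`num_shard` should match the number of GPUs you want to shard the model across. As a sketch, the value could be derived from `nvidia-smi` instead of being hard-coded; this assumes the NVIDIA driver tools are installed and falls back to 1 otherwise.

```shell
#!/usr/bin/env bash
# Sketch: derive num_shard from the number of visible NVIDIA GPUs.
# nvidia-smi -L prints one line per GPU; fall back to 1 when the tool
# is not available (e.g. on a CPU-only machine).
if command -v nvidia-smi >/dev/null 2>&1; then
  num_shard=$(nvidia-smi -L | wc -l)
else
  num_shard=1
fi
echo "num_shard=$num_shard"
```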
chmod u+x ~/dev/AI/scripts/start-text-generation-inference.sh
Run the Inference Server:
npm run start-inference
Test the Inference Server:
docker exec -it text-generation-inference text-generation-launcher --help
docker exec -it text-generation-inference text-generation-launcher --env
docker exec -it text-generation-inference text-generation-launcher --version

curl 127.0.0.1:8081/generate \
    -X POST \
    -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":17}}' \
    -H 'Content-Type: application/json'

curl 127.0.0.1:8081/generate_stream \
    -X POST \
    -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":17}}' \
    -H 'Content-Type: application/json'
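To pull just the generated text out of a /generate response, the JSON can be piped through jq. This is a sketch assuming jq is installed; the canned payload below mirrors the response shape of the inference server's /generate endpoint.

```shell
#!/usr/bin/env bash
# Sketch: extract the generated_text field from a /generate response.
# Assumes jq is installed; a real call would pipe curl output instead
# of the canned example payload.
extract_text() {
  jq -r '.generated_text'
}

echo '{"generated_text": "Deep Learning is a subset of Machine Learning"}' | extract_text
```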
Add a new model to the MODELS json array in your ~/dev/AI/data/chat-ui.env file:
MODELS=`[{"name": "...", "endpoints": [{"url": "http://127.0.0.1:8081/generate_stream"}]}]`
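For reference, a fuller MODELS entry might look like the sketch below. The "name" and "endpoints" keys are the ones this guide relies on; the "parameters" block is an assumption modeled on the defaults shipped in chat-ui's own .env at the time of writing, so check that file for the current schema.

```
MODELS=`[
  {
    "name": "OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5",
    "endpoints": [{"url": "http://127.0.0.1:8081/generate_stream"}],
    "parameters": {
      "temperature": 0.9,
      "top_p": 0.95,
      "max_new_tokens": 1024
    }
  }
]`
```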
Re-install the Chat-UI so the updated chat-ui.env is copied to .env.local:
npm run install-chat-ui
Re-run the Chat-UI:
npm run start-chat-ui
Need further support or consulting?
Please check out our Consulting hours.