Learn how you can run HuggingChat, an open-source ChatGPT alternative, locally (on a VM) and interact with the Open Assistant model, or with any other Large Language Model (LLM), in two variants.
Variant 1: Run just the Chat-UI locally and utilize a remote inference endpoint from Hugging Face
Variant 2: Run the whole stack, the Chat-UI, the Text Generation Inference Server and the (Open Assistant) LLM on your Virtual Machine
Installing HuggingChat with the Installation Scripts created in this video
If you want to get the HuggingChat Installation Scripts that we created in the course of this video, feel free to purchase and download them.
Alternatively, if you want to get your hands dirty, you can find the scripts at the bottom of this page.
NEW! Installing HuggingChat with aitom8 and the HuggingChat aitom8 plugin
New: In the meantime we have created aitom8, a professional AI automation software that automates the setup of a variety of open-source projects (optionally in virtual environments like conda). For HuggingChat there is an aitom8 plugin available that lets you install HuggingChat with just one command:
aitom8 huggingchat install
You can get aitom8 and the HuggingChat aitom8 plugin here:
NEW! Code Llama 34B model with Inference and HuggingChat | Local Setup Guide (VM) and Live Demo
New: In this video you can see a third variant (Variant 3), which is required for downloading Llama models with your local inference server.
NEW! Talk to your documents with HuggingChat and the aitomChat extension
Learn everything, from Chat UI to Inference and Retrieval Augmented Generation (RAG), in the YouTube video below:
Get aitomChat here:
Installing HuggingChat manually
If you want to get your hands dirty, feel free to set up HuggingChat with the instructions and scripts below.
Prepare your Linux VM
Install Curl:
sudo apt install curl
Install NVM (Node Version Manager):
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.3/install.sh | bash
nvm -v
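If the version check fails with "nvm: command not found", the installer has added nvm to your shell profile but the current session has not picked it up yet; reloading the profile (assuming bash, the default on most Ubuntu VMs) fixes this:

# reload the shell profile so the nvm function added by the installer becomes available (assumes bash)
source ~/.bashrc
nvm -v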
Install the latest LTS release of Node.js and npm:
nvm install --lts
node -v
npm -v
Install and run the HuggingChat UI locally
Create new npm project (AI):
mkdir -p ~/dev/AI
cd ~/dev/AI
npm init
Update package.json:
{ "name": "ai", "version": "1.0.0", "description": "Start Apps", "main": "index.js", "scripts": { "start-mongodb": "docker run --rm --name mongodb -p 27017:27017 -d -v ~/dev/mongo:/data/db mongo", "stop-mongodb": "docker stop mongodb", "install-chat-ui": "cd ./scripts && ./install-chat-ui.sh", "update-chat-ui": "cd ../chat-ui && git pull", "start-chat-ui": "cd ../chat-ui && npm run dev -- --host 127.0.0.1", "list-mongodb-collections": "docker exec -i mongodb sh -c 'mongosh --eval \"db.getCollectionNames()\" chat-ui'", "list-conversations": "docker exec -i mongodb sh -c 'mongosh --eval \"db.conversations.find()\" chat-ui'", "drop-database": "docker exec -i mongodb sh -c 'mongosh --eval \"db.dropDatabase()\" chat-ui'", "start-inference": "cd ./scripts && ./start-text-generation-inference.sh", "show-filesystem": "sudo df -Th && echo && sudo lsblk && echo && docker system df" }, "author": "", "license": "ISC" }
Create scripts directory:
mkdir ~/dev/AI/scripts
Create this script in the scripts directory:
install-chat-ui.sh
#!/usr/bin/env bash
sudo apt-get install git-lfs
sudo rm -R ../../chat-ui
cd ../.. && git clone https://huggingface.co/spaces/huggingchat/chat-ui
cd ./chat-ui && npm install
if [[ -f "../AI/data/chat-ui.env" ]]; then
  cp -v ../AI/data/chat-ui.env .env.local
fi
chmod u+x ~/dev/AI/scripts/install-chat-ui.sh
npm run install-chat-ui
Copy .env file to .env.local:
cp ~/dev/chat-ui/.env ~/dev/chat-ui/.env.local
Create the MongoDB (with npm and Docker):
npm run start-mongodb
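To verify that the MongoDB container is up before continuing, you can use the standard Docker commands below (they are not part of the npm scripts above):

docker ps --filter name=mongodb   # the mongodb container should be listed with port 27017 mapped
docker logs mongodb               # check the startup logs if the container is not running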
Adapt ~/dev/chat-ui/.env.local file to your needs:
MONGODB_URL=mongodb://localhost:27017/
HF_ACCESS_TOKEN=hf_<token> # from https://huggingface.co/settings/token
Copy your .env.local file as chat-ui.env into the ~/dev/AI/data directory (to allow fully automated reinstalls):
mkdir ~/dev/AI/data
cp ~/dev/chat-ui/.env.local ~/dev/AI/data/chat-ui.env
Run the Chat-UI:
npm run start-chat-ui
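The dev server prints the local URL it is listening on; with the --host 127.0.0.1 flag from the start-chat-ui script this is typically http://127.0.0.1:5173 (the default Vite port, an assumption to verify against the terminal output). A quick check from the shell:

curl -I http://127.0.0.1:5173   # should return an HTTP response once the Chat-UI dev server is up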
Install and run the Text Generation Inference Server locally
Create this script in the scripts directory:
start-text-generation-inference.sh (Important: if you are not running an NVIDIA A100 GPU, you need to pass the parameter --disable-custom-kernels)
#!/usr/bin/env bash
#model=bigscience/bloom-560m
model=OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5
num_shard=2
volume=$PWD/../../inference-data # share a volume with the Docker container to avoid downloading weights every run
name="text-generation-inference"

docker run --rm --name $name --gpus all --shm-size 1g -p 8081:80 \
  -v $volume:/data \
  ghcr.io/huggingface/text-generation-inference:latest \
  --model-id $model --num-shard $num_shard \
  --disable-custom-kernels
chmod u+x ~/dev/AI/scripts/start-text-generation-inference.sh
Run the Inference Server:
npm run start-inference
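The first start can take a while because the model weights are downloaded into the shared inference-data volume; you can follow the progress with a standard Docker command:

docker logs -f text-generation-inference   # watch the model download and shard startup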
Test the Inference Server:
docker exec -it text-generation-inference text-generation-launcher --help
docker exec -it text-generation-inference text-generation-launcher --env
docker exec -it text-generation-inference text-generation-launcher --version

curl 127.0.0.1:8081/generate \
    -X POST \
    -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":17}}' \
    -H 'Content-Type: application/json'

curl 127.0.0.1:8081/generate_stream \
    -X POST \
    -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":17}}' \
    -H 'Content-Type: application/json'
Add a new model to the MODELS JSON array in your ~/dev/AI/data/chat-ui.env file:
MODELS=`[{"name": "...", "endpoints": [{"url": "http://127.0.0.1:8081/generate_stream"}]}]`
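As a sketch of what a more complete entry could look like for the Open Assistant model used above (the exact set of supported fields depends on your chat-ui version, so verify the parameters against the chat-ui documentation or the MODELS entries in its .env):

MODELS=`[
  {
    "name": "OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5",
    "endpoints": [{"url": "http://127.0.0.1:8081/generate_stream"}],
    "parameters": {
      "temperature": 0.9,
      "top_p": 0.95,
      "max_new_tokens": 1024
    }
  }
]`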
Re-install the Chat-UI (this copies your updated chat-ui.env into the chat-ui as .env.local):
npm run install-chat-ui
Re-Run the Chat-UI:
npm run start-chat-ui
Need further support or consulting?
Please check out our Consulting hours.
You’re missing a step. After creating the install-chat-ui.sh file you must give it execute permission:
chmod u+x ~/dev/AI/scripts/install-chat-ui.sh
Hi Dan, thanks for the heads up! I’ve updated it accordingly.
Thanks! Another issue. The last line of install-chat-ui.sh is referencing a directory / file you don’t create until a few steps later (AI/data/chat-ui.env). I guess you’re creating that copy to have it outside of the chat-ui install to persist it? Not clear if it needs to be recopied after the “Re-Install the Chat-UI” step when using your own model.
Dan, sorry for the late reply. Yes, I create the chat-ui.env file outside of the chat-ui to allow fully automated reinstallations. It automatically gets copied into the chat-ui in the “Re-Install the Chat-UI” step, therefore no manual action is required on your end when reinstalling the chat-ui. However, I now check the existence of the chat-ui.env in the install-chat-ui.sh file (also note the added bash shebang) before copying it, and I changed the order of the steps a little bit to make the process more understandable and immediately applicable.
Do you need the HF API key even for the local model and inference server? (I guess not.) So it can be run without internet if everything is pre-downloaded?
You need the HF API key if you want to use the remote HF inference endpoint (Variant 1 in my video).
If you want to run your own inference server (Variant 2 in my video), then the HF API key is not required (for most models) and the model gets downloaded when you start the server.
However, for certain models, like Llama 2 from Meta for example, the HF API key is required even for downloading the model with your inference server (Variant 3); you can see this in my video about running Code Llama locally here: https://youtu.be/mhq6BQX0_P0
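For such gated models the token is passed to the Text Generation Inference container as an environment variable; here is a sketch based on the start script above (HUGGING_FACE_HUB_TOKEN is the usual Hugging Face Hub variable name and meta-llama/Llama-2-7b-chat-hf is just an example model id, so verify both against the TGI documentation for your image version):

# sketch: pass your HF token so the container can download gated models such as Llama 2
docker run --rm --name $name --gpus all --shm-size 1g -p 8081:80 \
  -e HUGGING_FACE_HUB_TOKEN=hf_<token> \
  -v $volume:/data \
  ghcr.io/huggingface/text-generation-inference:latest \
  --model-id meta-llama/Llama-2-7b-chat-hf --num-shard $num_shard \
  --disable-custom-kernels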