
Running AI Models Locally with WSL, Ollama, Open Web-UI, and Docker

This is a comprehensive guide to installing WSL on a Windows 10/11 machine, deploying Docker, and using Ollama to run AI models locally. Before starting this tutorial, make sure your system has reasonably strong resources; this will ensure smooth operation and good performance throughout.


Prerequisites:

- A relatively strong system with good CPU and RAM resources

- Administrative privileges on the Windows machine

- A capable GPU is a plus and will improve performance

- Latest version of Windows 10/11 (may require updates)




System Requirements

According to the official Ollama.ai documentation, the recommended system requirements for running Ollama are:


- Operating System: Linux (Ubuntu 18.04 or later) or macOS (11 Big Sur or later)

- RAM: 8GB for running 3B models, 16GB for running 7B models, 32GB for running 13B models

- Disk Space: 12GB for installing Ollama and the base models; additional space is required for storing model data, depending on the models you use

- CPU: any modern CPU with at least 4 cores is recommended; for running 13B models, a CPU with at least 8 cores is recommended

- GPU (optional): a GPU is not required for running Ollama, but it can improve performance, especially when running larger models


In addition to the above, Ollama also requires a working internet connection to download the base models and install updates.


Step 1 - Install the Windows Subsystem for Linux (WSL)


1. Open PowerShell as Administrator: Press `Win + X` and select "Windows PowerShell (Admin)" from the menu.


2. Check if WSL is enabled: Type the following command and press Enter:

wsl --list --verbose

If the output lists a distribution running under WSL 1 or 2, WSL is already installed and you can skip ahead to Step 2. If not, continue with the following steps.
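
For reference, on a machine where WSL is already set up, the output of this command looks roughly like this (names, states, and versions will vary):

  NAME      STATE           VERSION
* Ubuntu    Running         2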


3. Enable the Windows Subsystem for Linux feature:

- Type `optionalfeatures` and press Enter to open the "Windows Features" dialog.

- In the "Windows Features" window, scroll down to find "Windows Subsystem for Linux," check the box next to it, and click on the "OK" button at the bottom.

- After the installation is complete, reopen PowerShell as Administrator.


4. Install your preferred Linux distribution:

- Type `wsl --install` and press Enter. This command will install the default Ubuntu distribution. You can also choose a different one: list the distributions available for download with `wsl --list --online`, then install one by name using the `--distribution` (`-d`) flag:

wsl --install -d <distribution name>

- For example, for Ubuntu 20.04 LTS:

wsl --install -d Ubuntu-20.04

5. Launch your Linux distribution: After installation is complete, you can open it by typing the following command in PowerShell and pressing Enter:

wsl -d <distribution name>

- If you installed Ubuntu 20.04 LTS, for example, the command would be `wsl -d Ubuntu-20.04`.


6. Set a username and password: Follow the on-screen instructions to set up your Linux environment, including setting a username and password.


7. Reboot if necessary: Some Linux distributions may require a reboot for changes to take effect. If you're prompted to restart your system, do so. Once rebooted, you can open your Linux distribution from the Start menu or from PowerShell with `wsl -d <distribution name>`.

8. It's good practice to update your Linux system packages. Please use the following commands:

 sudo apt update
 sudo apt upgrade -y
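
Once your packages are up to date, it's worth sanity-checking the resources WSL can actually see against the System Requirements section above. These are standard Linux commands:

nproc     # number of CPU cores available to WSL
free -h   # total and available RAM
df -h /   # free disk space on the root filesystem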


Now that WSL is installed and your preferred Linux distribution is running on your Windows machine, you're ready to start exploring the command line and working with various tools!



Step 2 - Install Ollama


Install Ollama by executing the following command in your WSL terminal:

curl -fsSL https://ollama.com/install.sh | sh

If the installation is successful, here is what to expect:


1. If Ollama identifies an NVIDIA GPU during the installation process, it will automatically configure the environment for using the GPU for machine learning tasks.


2. If your system doesn't have an NVIDIA GPU, or the installation script didn't detect one, Ollama will fall back to the CPU and RAM available on your machine. Ensure that you have at least 16 GB of RAM to run machine learning workloads efficiently. (Docker is not required for Ollama itself; the install script sets it up as a native service. You will, however, need Docker later for Open Web-UI.)
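
If you want to confirm that WSL can see your NVIDIA GPU before relying on GPU acceleration, `nvidia-smi` is available inside WSL 2 whenever the NVIDIA driver is installed on the Windows side:

nvidia-smi   # should print a table listing your GPU; if the command is missing, Ollama will run on CPU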


To verify the installation, you can run `ollama --version` in your terminal, which should display the version information of Ollama. You can also go to http://localhost:11434/ to test that Ollama is running.
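
The same check can be done from the command line; the root endpoint returns a short status string, and the /api/tags endpoint lists the models you have pulled so far:

curl http://localhost:11434/          # responds with "Ollama is running"
curl http://localhost:11434/api/tags  # JSON list of locally installed models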




Step 3 - Install AI Models


To use the Ollama AI models, you can browse their library at https://ollama.com/library.

1. Pull a model using the command-line interface:

 ollama pull <Model Name>:<Version> # e.g. ollama pull gemma2:2b

Replace `<Model Name>` with the name of the AI model you're interested in and `<Version>` with the desired version tag, if applicable. As an example, `ollama pull gemma2:2b` downloads the 2B-parameter variant of the Gemma 2 model.

2. Make sure to check the system requirements for each model before pulling it, to ensure it can be successfully installed and run on your machine!

3. Run your model by using the following command:

 ollama run <Model Name>:<Version> # e.g. ollama run gemma2:2b

4. Test your model by asking questions directly in the terminal and watch it go!
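
A few other `ollama` subcommands come in handy at this stage, and `ollama run` also accepts a prompt directly for a quick one-off answer:

ollama list                                  # show models installed locally
ollama rm gemma2:2b                          # remove a model you no longer need
ollama run gemma2:2b "Why is the sky blue?"  # one-shot prompt, no interactive session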




Step 4 - Set Up Your Docker Container and Install Open Web-UI


1. Open a terminal on your Ubuntu machine.


2. Add Docker GPG Key:

sudo apt-get update
sudo apt-get install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings 
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc

3. Add the repository to Apt sources:

echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update

4. Install Docker:

sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
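
Optionally, add your user to the `docker` group so you can run `docker` commands without `sudo` (close and reopen your WSL session for this to take effect):

sudo usermod -aG docker $USER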

5. To verify that Docker has been installed correctly, run:

docker --version
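
For a fuller smoke test, you can also run Docker's standard hello-world image, which pulls a tiny container and prints a confirmation message:

sudo docker run hello-world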

6. Run Open WebUI in a Docker container:

sudo docker run -d --network=host -v open-webui:/app/backend/data -e OLLAMA_BASE_URL=http://127.0.0.1:11434 --name open-webui --restart always ghcr.io/open-webui/open-webui:main

Open WebUI will be available at http://localhost:8080. The first time you open it, you will be asked to create an account, which becomes the administrator account.
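
To confirm the container came up cleanly, check its status and follow its startup logs (press Ctrl+C to stop following):

sudo docker ps --filter name=open-webui   # the container should show as "Up"
sudo docker logs -f open-webui            # follow the startup logs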


I will cover administering your open web-ui environment in another article which will be linked here.


After signing up you can log into your account and select the model you would like to run.



In this tutorial example, Mistral will be my choice.



Conclusion


In conclusion, this guide offers step-by-step instructions for setting up WSL (Windows Subsystem for Linux), Ollama (a runtime for local large language models), Docker, and Open WebUI. By following them, you can establish an efficient local environment for running AI models and make the most of these popular tools. The aim is to simplify the setup process so you can focus on experimenting and exploring your creativity.


Some of the other benefits of running your AI locally are:


1. Speed and Efficiency: Running AI models locally allows for faster processing times compared to cloud-based solutions, as there is no need to wait for data to travel over the internet. This can significantly speed up workflows, particularly for real-time applications like facial recognition or speech-to-text conversion.

2. Data Privacy and Security: By running AI models locally, organizations can maintain control over their data and models. This is important for businesses handling sensitive information, as it reduces the risk of unauthorized access or data breaches that may occur when using third-party services.

3. Reduced Costs: While there are costs associated with setting up local infrastructure for AI model deployment, these can be offset by lower ongoing costs compared to cloud-based solutions. Over time, the cumulative savings from not paying for cloud storage and computing resources can add up to significant financial benefits for businesses. Additionally, local processing avoids potential extra charges for data transfer or usage that may apply with cloud services.


Bonus Notes


Useful commands



- Identify Containers: Open a terminal in your WSL distribution and list all running containers with the command:

    docker ps

or for all containers (including stopped ones):

    docker container ls -a

- Stop Containers: Stop the Open Web-UI container (or any other container) by executing:

    docker stop <container_id_or_name>

Replace `<container_id_or_name>` with the actual IDs or names of the containers you wish to stop.

- Remove Containers: Once stopped, remove the containers by running:

    docker rm <container_id_or_name>

Ensure you replace `<container_id_or_name>` with the correct identifiers for the containers you wish to remove.
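
As a shortcut, a running container can be stopped and removed in one step with the force flag:

docker rm -f open-webui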



- Start Containers: Restart a stopped container by running:

docker start <container_id_or_name>


- Start Ollama: The install script sets Ollama up as a systemd service, so it can be started with:

sudo systemctl start ollama

- Stop Ollama

sudo systemctl stop ollama
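
If you're unsure whether the Ollama service is running, systemd can tell you, and a restart often clears transient issues:

sudo systemctl status ollama    # show service state and recent log lines
sudo systemctl restart ollama   # stop and start the service in one step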
