So you're probably wondering what you can do with your old PC, or why you should keep running everything on someone else's hosted LLMs and feed them the very data they use to train even more LLMs. Wonder a little less. Your GPU doesn't care whether it's generating embeddings, serving Ollama models, or crunching tensors at 2 AM. That flexibility is the real win. "I want my own server" is the first step toward realizing it isn't an unachievable amount of work. If you start now, the most used Linux distribution at the moment is Ubuntu, and once you install Docker the possibilities are endless. You can absolutely "just install Ubuntu Server" and call it a day… until the day it doesn't boot, your GPU "vanishes," Docker can't see CUDA, or your box gets slapped by the internet because you left SSH on password auth like it's 2009. This article is the version I wish existed everywhere: not overly ceremonial, not half-baked, and definitely not written by someone who's never had GRUB stare into their soul at 2 AM. Instead, you can follow a few principles when setting up your own Linux server, whether it runs on VMware or your own hardware.
Choosing your Linux distribution
Choosing the right distribution is partly about how involved you want to be, but also about what you are trying to achieve. This quick synopsis looks at what we want out of the box and, incidentally, captures why trendflow is built: to help you discover new tech and new stacks! Once we have Ubuntu Server set up we can add Docker, because a properly configured NVIDIA stack gives you a platform that can host LLMs, APIs, dashboards, background jobs, and experiments, all on your terms. Let's start with the things most guides skip: bootloaders, firmware modes, kernel behavior, driver conflicts, and long-term survivability. This blog fills that gap.
Why Docker as the Local Orchestrator & Deployment Layer:
- Docker isolates your services, ensuring broken upgrades don’t nuke your data or config.
- Named volumes mean your models and chat history persist—even if you wipe containers.
- Running locally means zero latency, full control, and no egress of private prompts to a cloud.
- You can try any supported model—swap them in seconds, benchmark, compare, and experiment.
If you want “I just want it to work”:
- Ubuntu Server LTS (22.04 or 24.04): best docs, best community, easiest NVIDIA + Docker path, least drama. It is the best choice for a first distribution.
If you want “stable forever, updates move slow like a snail with bills”:
- Debian Stable: great for servers, but NVIDIA/CUDA setup can be slightly more “manual labor.”
If you want “enterprise-ish, RHEL-like”:
- Rocky Linux / AlmaLinux: great stability, but your “how-to” searches sometimes become archaeology.
If you want newer packages + modern defaults:
- Fedora Server: fast-moving. Great, but not my first pick if you want zero surprises.
If you want a hypervisor-first home lab:
- Proxmox VE (Debian-based): amazing for running multiple VMs/containers, especially if your “server” is really a host.
So, as we choose Ubuntu Server, here is what we want out of it:
- Boot that doesn’t randomly break (UEFI + correct partitioning + GRUB behaving)
- Remote access that doesn’t invite strangers (SSH keys + firewall)
- Updates that don’t surprise you (unattended upgrades, predictable reboots)
- GPU acceleration that actually works (nvidia-smi shows your GPU; containers can use it)
- Virtualized setup notes (VMware mode vs “real GPU passthrough” mode)
This means:
- Firmware / Boot layer: UEFI vs Legacy BIOS, Secure Boot, boot order, disk partitioning (GPT/ESP), GRUB.
- Base OS layer: users, SSH, updates, network, storage.
- Services layer: Docker/Podman, monitoring, backups, apps.
- Hardware acceleration layer: NVIDIA driver, persistence, CUDA tooling (optional), container runtime support.
Install Ubuntu Server (clean defaults)
During install:
- Use guided install unless you know you need custom partitioning.
- Make sure it creates an EFI partition (ESP).
Confirm you booted in UEFI mode
You want UEFI unless you have a very specific reason not to. In your motherboard firmware (BIOS/UEFI settings), set:
- Boot mode: UEFI
- Disk mode: AHCI (unless you know why you need RAID)
- Virtualization: Intel VT-x / AMD-V enabled
- IOMMU (for GPU passthrough): Intel VT-d / AMD-Vi enabled (optional, but nice)
- Secure Boot: OFF (recommended if you plan to use NVIDIA drivers without extra steps)
> Secure Boot can be used with NVIDIA, but it adds key enrollment + module signing fun.
> If you want less suffering: turn it off.
Next, this command confirms the machine is booted in UEFI mode:
test -d /sys/firmware/efi && echo "UEFI boot" || echo "Legacy boot"
If you’re on UEFI, the important pieces are:
- /boot/efi (EFI System Partition mount)
- grub-efi-amd64 packages
- EFI entries visible via efibootmgr
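To actually see those EFI entries, efibootmgr will list what the firmware has registered (entry names and order vary per machine):
# Lists UEFI boot entries and the current boot order
sudo efibootmgr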
This command confirms your EFI partition is mounted at /boot/efi.
findmnt /boot/efi
AUTH
SSH setup (key-based login)
On your local machine (Git Bash or a terminal): this command generates an SSH keypair (skip if you already have one).
ssh-keygen -t ed25519 -C "[email protected]"
Still on your local machine: this command copies your public SSH key to the server so you can log in without passwords.
ssh-copy-id admin@YOUR_SERVER_IP
On the server: this command opens the SSH config file so you can harden SSH safely.
sudo nano /etc/ssh/sshd_config
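What "harden" means here, at minimum, is turning off password logins once your key works. A sketch of the usual settings inside that file (double-check that key-based login works first, or you will lock yourself out):
# Inside /etc/ssh/sshd_config: the usual hardening settings
PasswordAuthentication no
PermitRootLogin prohibit-password
PubkeyAuthentication yes

# Apply the changes (the service is called "ssh" on Ubuntu)
sudo systemctl restart ssh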
Bonus: when connecting your server to a Git host using this same SSH approach (a best practice), you can define multiple identities in your ~/.ssh/config file like so:
Host github-primary
    HostName github.com
    User git
    IdentityFile ~/.ssh/id_ed25519
    IdentitiesOnly yes

Host github-secondary
    HostName github.com
    User git
    IdentityFile ~/.ssh/id_ed25519_2
    IdentitiesOnly yes
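To use one of those aliases, reference the Host name instead of github.com (the repository path below is just a placeholder):
# Clones over SSH using the "github-secondary" identity defined above
git clone git@github-secondary:your-user/your-repo.git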
GRUB basics (Bootloader)
What it is, why you care, and what NOT to do
GRUB is your bootloader. You don’t need to “tune” it—just know how to verify it’s the right one and regenerate config when needed. It’s the bouncer that decides which kernel boots.
If GRUB is messed up, your server will look at you like: “boot? never heard of her.”
Check GRUB install and config location
This command prints which GRUB package is installed and confirms you have the bootloader tools.
dpkg -l | grep -E "grub-efi|grub-pc"
Update GRUB after kernel changes
This command will show what NVIDIA driver Ubuntu recommends for your detected GPU.
ubuntu-drivers devices
This command regenerates the GRUB configuration file to reflect current kernels and boot entries.
sudo update-grub
This command installs the recommended NVIDIA driver automatically. Before running it, though, I recommend researching which driver version suits your Ubuntu release. Also, while any installation runs, review the logs for failure messages, since a corrupted driver installation can cause issues:
sudo ubuntu-drivers autoinstall
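If you want a quick sanity check that the driver module actually built and loaded before trusting nvidia-smi, DKMS and lsmod can tell you (this is a generic check, not something specific to one driver version):
# Lists DKMS modules and whether they built for your running kernel
dkms status

# Shows whether the nvidia kernel modules are currently loaded
lsmod | grep nvidia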
Drivers and GPU configuration: Nvidia Edition
This command proves the driver is loaded and the GPU is usable. If nvidia-smi fails, either you left Secure Boot ON or more troubleshooting is required.
nvidia-smi
If you see output like this, your drivers are installed (in this example it is NVIDIA-SMI 570.195.03) and you can check that all specifications are correct and your GPU is in a good state:
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.195.03             Driver Version: 570.195.03     CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------|
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA RTX ...                 Off |   00000000:01:00.0  On |                  Off |
| 30%   28C    P8             11W /  250W |     374MiB /  32760MiB |      0%      Default |
+-----------------------------------------------------------------------------------------+
If this is not visible, it's time to install drivers.
GPU Drivers
The goal is to get a driver that works well with your Linux distribution. Something like nvidia-driver-000-open is a solid choice, where "open" means the open kernel module variant, which works well with newer GPUs. Before installing, do this nuclear cleanup so that only one driver ends up installed. In some cases the automatic driver installation lets Ubuntu pull in multiple versions and create conflicts:
# 1) Remove all NVIDIA packages
sudo apt purge 'nvidia-*' -y
sudo apt purge 'libnvidia-*' -y
sudo apt purge 'linux-modules-nvidia-*' -y
sudo apt autoremove -y
sudo apt clean

# 2) Confirm nothing NVIDIA-ish is left
dpkg -l | grep -i nvidia
Pinning Drivers
Tell APT to stop "helping" with whatever drivers it sees and to install exactly what you require for your server, nothing more. This approach helps when driver installation keeps failing. Start by adding a pin that blocks the 000-series drivers from ever being pulled in, meaning any package whose version matches 000.* gets negative priority:
sudo nano /etc/apt/preferences.d/nvidia-pin
Substitute the version 000 with the one you want to avoid:
Package: *nvidia*
Pin: version 000.*
Pin-Priority: -1
Then refresh:
sudo apt update
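To confirm the pin actually took effect, apt-cache policy will show the priorities APT now assigns (again using 000 as the placeholder; substitute the series you blocked):
# Pinned versions should appear with priority -1 in the version table
apt-cache policy nvidia-driver-000-open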
Install the driver
Install the actual driver you want (000-open), or whatever else you researched as working best with your version:
sudo apt install nvidia-driver-000-open -y
Is it finished? If you see it build a DKMS module for your current kernel, then yes. Example:
Building initial module for 6.14.0-36-generic
...
Installing to /lib/modules/6.14.0-36-generic/updates/dkms/
Optional: Persistent NVIDIA Modules
Some setups need to enable persistence mode to keep the GPU “awake” for Docker containers:
sudo nvidia-smi -pm 1
To make persistence mode survive reboots, create a systemd service (optional):
/etc/systemd/system/nvidia-persistenced.service
[Unit]
Description=NVIDIA Persistence Daemon
After=multi-user.target

[Service]
Type=simple
ExecStart=/usr/bin/nvidia-persistenced --user root
Restart=always

[Install]
WantedBy=multi-user.target
sudo systemctl enable --now nvidia-persistenced
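After a reboot, you can confirm the daemon did its job by asking nvidia-smi for the persistence state:
# Should report: Persistence Mode : Enabled
nvidia-smi -q | grep -i "persistence mode"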
Networking, Firewall & Updates
- UFW (Uncomplicated Firewall):
  sudo ufw allow OpenSSH
  sudo ufw enable
- Static IP: set via netplan for servers that must be reachable (see the sketch after this list).
- Automatic security updates:
  sudo apt install unattended-upgrades
  sudo dpkg-reconfigure --priority=low unattended-upgrades
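Here is a minimal netplan sketch for that static IP. The filename, the interface name (enp3s0), the addresses, and the DNS servers are all example values; match them to your own network before applying:
# /etc/netplan/01-static.yaml (example values only)
network:
  version: 2
  ethernets:
    enp3s0:
      dhcp4: false
      addresses: [192.168.1.100/24]
      routes:
        - to: default
          via: 192.168.1.1
      nameservers:
        addresses: [1.1.1.1, 9.9.9.9]

# Test the change (auto-reverts if you lose connectivity), then apply it
sudo netplan try
sudo netplan apply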
Backups & Snapshots: Don’t Get Burned
Before major upgrades (kernel, drivers, etc.), always:
- Snapshot the VM (if virtualized)
- Use rsync or rsnapshot for /etc, /home, and data volumes
- For bare metal: clone your OS disk, or image /boot/efi and /boot
Example:
sudo rsync -aAXv /etc /home /data /mnt/backup/
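If you would rather not rely on remembering to run that by hand, a simple cron entry works (the 3 AM schedule, the paths, and the log file are just examples):
# Open root's crontab
sudo crontab -e

# Then add a line like this: nightly backup at 03:00
0 3 * * * rsync -aAX /etc /home /data /mnt/backup/ >> /var/log/backup-rsync.log 2>&1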
Onto the Fun Things: Docker, Ollama, and WebUI
Now for the actual reason you wanted a server in the first place: running LLMs. Ollama is a great choice right now, and Open WebUI is a perfect way to manage it, with full GPU acceleration, at home, no cloud fees, no data leaks. We will start with Docker and move all the way through loading your first model in the web interface.
Install Docker & Docker Compose: Docker is how we’ll launch Ollama and Open WebUI as isolated, easily-managed services.
# Install Docker
curl -fsSL https://get.docker.com | sudo bash

# Install Docker Compose (v2+)
sudo apt-get install docker-compose-plugin
Add your user to the docker group, so you don't need to run everything as root:
sudo usermod -aG docker $USER
newgrp docker
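One piece that is easy to miss: for containers to actually reach the GPU (the "container runtime support" from the hardware acceleration layer earlier, and the gpus: all line in the compose file below), Docker also needs NVIDIA's Container Toolkit. A minimal sketch, assuming you have already added NVIDIA's container-toolkit apt repository per their documentation:
# Install the toolkit and register the NVIDIA runtime with Docker
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

# Sanity check: a throwaway container should be able to run nvidia-smi
docker run --rm --gpus all ubuntu nvidia-smi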
Create the Services Directory & Compose File
Make a new directory to house your service definitions and persistent data:
mkdir -p ~/services
cd ~/services
Create your docker-compose.yaml
Then create a file named docker-compose.yaml with this content in the new ~/services folder. This defines both services and creates named volumes for persistent storage. All Ollama models, downloads, and chat history live inside these volumes, so you don't lose them if you redeploy:
services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    restart: unless-stopped
    pull_policy: always
    tty: true
    gpus: all
    environment:
      - NVIDIA_VISIBLE_DEVICES=all
      - NVIDIA_DRIVER_CAPABILITIES=compute,utility
    volumes:
      - ollama:/root/.ollama
    # ports:
    #   - "11434:11434"   # optional: expose API on the host

  open-webui:
    image: ghcr.io/open-webui/open-webui:${WEBUI_DOCKER_TAG-main}
    container_name: open-webui
    volumes:
      - open-webui:/app/backend/data
    depends_on:
      - ollama
    ports:
      - ${OPEN_WEBUI_PORT-3000}:8080
    environment:
      - 'OLLAMA_BASE_URL=http://ollama:11434'
      - 'WEBUI_SECRET_KEY='
    extra_hosts:
      - host.docker.internal:host-gateway
    restart: unless-stopped

volumes:
  ollama: {}
  open-webui: {}
Basic commands to spin this up:
# Go to your services directory
cd ~/services

# Pull images and launch
docker compose up -d

# To check logs for Ollama
docker logs -f ollama

# To shut down
docker compose down
Logs:
Logs can be watched with:
docker logs -f ollama
docker logs -f open-webui
To check if Ollama is using the GPU:
docker exec -it ollama nvidia-smi
> Ollama model files and data will persist in your Docker volume (by default under /var/lib/docker/volumes/ollama/_data; note that Compose may prefix the volume name with the project directory, e.g. services_ollama).
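If you want to confirm exactly where that data lives on the host, ask Docker (list first, since the volume may carry the services_ prefix mentioned above):
# Find the real volume name, then print its mountpoint on the host
docker volume ls | grep ollama
docker volume inspect --format '{{ .Mountpoint }}' ollama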
Access the WebUI and Connect to Ollama
- Open your browser and go to: http://localhost:3000 (or your server’s IP, e.g. http://192.168.1.100:3000)
- First-time setup: you may be prompted to create a password (depending on the Open WebUI version). Set this up as your admin login for the interface.
- Verify Ollama connection: in the WebUI’s settings, look for “Ollama server” or “Model backend” and confirm it points to http://ollama:11434 (the default in the compose file).
Pulling a Model Into Ollama
By default, Ollama doesn’t ship with a model installed.
To pull a model (e.g., llama3):
docker exec -it ollama ollama pull llama3
You can substitute llama3 with any supported Ollama model, like qwen2:7b, mistral, phi3, etc.
Why via docker exec?
This runs the ollama CLI inside the container, so you don’t need to install anything on the host.
You’ll see progress as the model downloads. Depending on your network and disk speed, this could take a few minutes (some models are several GB).
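Once the pull finishes, you can also chat with the model straight from the terminal, without the WebUI (llama3 here is whatever model you pulled above):
# Starts an interactive chat session inside the container (type /bye to exit)
docker exec -it ollama ollama run llama3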
Tip: If the model isn’t visible, refresh the page or restart the open-webui container:
docker compose restart open-webui
Quick Reference: Useful Commands
| Purpose | Command |
|------------------------|-----------------------------------------|
| Start stack | docker compose up -d |
| Stop stack | docker compose down |
| View logs | docker logs -f ollama |
| Shell in container | docker exec -it ollama bash |
| List downloaded models | docker exec -it ollama ollama list |
| Remove a model | docker exec -it ollama ollama rm <model> |
| Update all containers | docker compose pull && docker compose up -d |
Conclusion
Hopefully this guide gives you a realistic sense of the work it takes to set up your own mini server. The best part: with this setup, you own your stack: full GPU acceleration, secured access, stable boot, and robust recovery, all ready to serve LLMs like Ollama, with a web interface, at home. No more "hope it works": you control every piece, and you can now deploy new models or services in minutes, not hours.
Welcome to the world of self-hosted AI.
