cross-posted from: https://discuss.online/post/39090314

Full layout on my forum here. The basic idea is 3 thin clients that fit within:

  • 1U space requirement
  • Possibly adding a 2.5 GbE NIC to the PCIe low-profile slot of each HP t740 for Ceph shared storage.
  • Kubernetes and/or Proxmox experiments
    • Adding some sort of remote management via JetKVM, PiKVM, or similar.
    • Additional expansion no doubt needed for connecting to the three nodes.
+----------------------------------------------------------+
|                    CLUSTER TOTALS                        |
+----------------------+-----------------------------------+
| TOTAL CORES          | 12 physical / 24 threads          |
| TOTAL RAM            | 192 GB DDR4                       |
| TOTAL STORAGE        | 1.5 TB NVMe (local only)          |
| TOTAL NODES          | 3                                 |
| TOTAL NICs           | 3-7 (depending on 4-port NIC)     |
| GPU                  | None                              |
| KVM/IPMI             | None                              |
+----------------------+-----------------------------------+

All suggestions greatly appreciated! Never done this before, but hoping to save myself from the future hassle of having to drive to the data center… would much rather log in over WireGuard and address things remotely by rebooting, etc.
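For the remote-access piece, here's the rough WireGuard shape I have in mind (keys, addresses, and the wg0 name are all placeholders, not a working config):

  # On the cluster side; fill in real keys, then bring the tunnel up.
  cat <<'EOF' > /etc/wireguard/wg0.conf
  [Interface]
  Address = 10.9.0.1/24
  ListenPort = 51820
  PrivateKey = <server-private-key>

  [Peer]
  # the laptop I'd log in from
  PublicKey = <laptop-public-key>
  AllowedIPs = 10.9.0.2/32
  EOF
  wg-quick up wg0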

  • fruitycoder@sh.itjust.works · 14 hours ago

    I would recommend 4-5 nodes: 5 if you want true high availability (a 5-node etcd quorum survives two failures), while 4 still requires some intervention in case of failure.

    Just because it's bare metal, you've got to think about your Mean Time To Repair (MTTR), which is to say: if a whole node goes bust, how long will it take to order and install a new one?

    If you go Kubernetes (k8s) I would recommend rke2 or k3s. They are really straightforward to set up and pretty enterprise-ready out of the box.
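    To give a sense of how light the setup is, the quick-start is basically one command per node (pin versions before you rely on it; <first-node> and <token> are placeholders):

      # First server, with embedded etcd so the snapshot tooling below applies:
      curl -sfL https://get.k3s.io | sh -s - server --cluster-init
      # The join token lives here once it's up:
      cat /var/lib/rancher/k3s/server/node-token
      # Each additional server node:
      curl -sfL https://get.k3s.io | K3S_TOKEN=<token> sh -s - server --server https://<first-node>:6443
      # rke2's installer works the same way: curl -sfL https://get.rke2.io | sh -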

    If you have a hard requirement for Ceph I would recommend Rook-Ceph, which makes deployment and management a lot easier by letting k8s handle it. For simpler but (in my testing) less performant persistent volumes (PVs) than Ceph, Longhorn is really easy to deploy and manage.
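    Both ship official Helm charts, so the install is roughly this (chart and repo names are the upstream ones; you'll want to tune values for real use):

      # Rook: operator first, then a CephCluster via the companion chart:
      helm repo add rook-release https://charts.rook.io/release
      helm install rook-ceph rook-release/rook-ceph -n rook-ceph --create-namespace
      helm install rook-ceph-cluster rook-release/rook-ceph-cluster -n rook-ceph
      # Longhorn:
      helm repo add longhorn https://charts.longhorn.io
      helm install longhorn longhorn/longhorn -n longhorn-system --create-namespace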

    For backups, Velero is really nice for apps in your cluster, since it can be done per namespace and include PV data too. Rke2/k3s both have nice snapshotting and backup tools for etcd (the backing database for k8s) for full disaster recovery.
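    A sketch of both, assuming Velero is already installed with a storage location and its node agent, and that you're on embedded etcd ('myapp' is a made-up namespace):

      # One namespace, PV data included via file-system backup:
      velero backup create myapp-backup --include-namespaces myapp --default-volumes-to-fs-backup
      # Built-in etcd snapshots for full disaster recovery:
      k3s etcd-snapshot save --name before-upgrade
      k3s etcd-snapshot ls    # rke2 has the same subcommands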

    Rke2/k3s both have ways to auto-deploy Helm charts from the filesystem too: https://docs.rke2.io/add-ons/helm
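    For example, a file dropped into the manifests dir gets applied automatically; grafana here is just an arbitrary chart to show the shape (on k3s the path is /var/lib/rancher/k3s/server/manifests/ instead):

      cat <<'EOF' > /var/lib/rancher/rke2/server/manifests/grafana.yaml
      apiVersion: helm.cattle.io/v1
      kind: HelmChart
      metadata:
        name: grafana
        namespace: kube-system
      spec:
        repo: https://grafana.github.io/helm-charts
        chart: grafana
        targetNamespace: monitoring
        createNamespace: true
      EOF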

    This is a good stepping stone for GitOps imho, if that matters to you at all: start with just having a git dir for these files, then later move to something like ArgoCD.

    Also, since you are looking at hyperconverged storage, dedicated network links for it are generally recommended. So create a bond of two ports per node just for storage, tag them with their own VLAN, and in your setup of Rook or Longhorn specify that VLAN interface as the device for data to flow.
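    With iproute2 that's roughly this per node (interface names, VLAN 100, and the addressing are placeholders, and the LACP bond assumes a switch that supports it):

      ip link add bond0 type bond mode 802.3ad            # LACP bond of the two storage ports
      ip link set enp1s0f0 down; ip link set enp1s0f0 master bond0
      ip link set enp1s0f1 down; ip link set enp1s0f1 master bond0
      ip link add link bond0 name bond0.100 type vlan id 100
      ip addr add 10.0.100.11/24 dev bond0.100            # this node's storage address
      ip link set bond0 up; ip link set bond0.100 up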

    PXE boot is also nice at this scale: either set it up on your router (OpenWrt has decent support) or your maintenance laptop/machine, and/or do something like Tinkerbell (cloud-native PXE from your k8s cluster!). It's just nice to be able to blow away a node and rebuild if you are tinkering a lot.
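    On OpenWrt the built-in dnsmasq covers it with a few uci settings (the netboot.xyz image is just one convenient payload; tftp_root is wherever you keep it):

      uci set dhcp.@dnsmasq[0].enable_tftp='1'
      uci set dhcp.@dnsmasq[0].tftp_root='/srv/tftp'      # boot image lives here
      uci set dhcp.@dnsmasq[0].dhcp_boot='netboot.xyz.kpxe'
      uci commit dhcp && /etc/init.d/dnsmasq restart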

    Remember, cattle not pets, and welcome to the range, cowpoke!

    • kiol@discuss.online (OP) · 13 hours ago

      Seems the Mellanox ConnectX-3 Pro Dual Port 10G SFP+ Low Profile MCX312C-XCCT is a decent choice, which I can use for a 10 GbE triangle between the current 3 nodes. I was thinking of using JetKVM + an RS-232 expansion serial cable with a 4-port HDMI/USB switcher for controlling the nodes. Financially, a future expansion would be moving from the triangle to a 10 GbE switch, allowing for NAS or other node additions. Also, each node currently has an empty SATA SSD port. Updated the forum thread.
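      For the triangle I'm picturing a point-to-point /30 per link, something like this from node 1's side (interface names and the 10.10.x.x addressing are placeholders):

        ip addr add 10.10.12.1/30 dev enp2s0f0    # direct link to node 2 (10.10.12.2)
        ip addr add 10.10.13.1/30 dev enp2s0f1    # direct link to node 3 (10.10.13.2)
        # nodes 2 and 3 mirror this, so every pair gets its own dedicated 10 GbE path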

      • fruitycoder@sh.itjust.works · 10 hours ago

        I really enjoy the PiKVM and the switcher for my home lab. Redfish support gets fishy with a switcher if that is a concern though.

        I do love a good mesh for a cluster block though. My next next next project is using KubeOVN to turn my cluster block into a switch with "out" connections for connecting other devices (wifi, laptop, cameras, etc.) to it as my network router, with of course the modem and hotspot upstream for the Internet connection.

  • Decronym@lemmy.decronym.xyz (bot) · edited · 10 hours ago

    Acronyms, initialisms, abbreviations, contractions, and other phrases which expand to something larger, that I’ve seen in this thread:

    Fewer Letters    More Letters
    Git              Popular version control system, primarily for code
    IP               Internet Protocol
    NAS              Network-Attached Storage
    SSD              Solid State Drive mass storage
    k8s              Kubernetes container management package

    5 acronyms in this thread; the most compressed thread commented on today has 8 acronyms.

    [Thread #268 for this comm, first seen 1st May 2026, 16:40]

  • non_burglar@lemmy.world · 17 hours ago

    I'm not sure what data center will allow you to hodgepodge a 1U cluster of consumer-grade hardware; heat and power management alone will be a problem.

  • frongt@lemmy.zip · 1 day ago

    What's your plan for when you reboot and it doesn't come back up? I strongly recommend having some kind of IPMI and virtual console, whether that's PiKVM or whatever. Far too many times I've had a server go down, and having one has saved me from driving down to the datacenter to stand in a loud, frigid room troubleshooting the issue.