My Home Lab v5.0
An overview of my setup, including hardware, software, and network topology
This series explains how I run my websites and services at home using affordable hardware and containers.
I’ve spent significant time on my homelab, enjoying the process of hosting public-facing sites and services for experimentation and sharing. My focus is on efficient updates and minimizing manual deployment, while keeping costs low for this hobby project.
Hardware
Mini PC HM90
OS: Proxmox PVE
CPU: AMD Ryzen 9 4900H
RAM: 32 GB DDR4
Storage: 2x 128 GB SSD (RAIDZ1)
Eth: 2x 2.5 Gbit interface
Cisco Switch L3
Model CBS250-8T-D
Ports: 8 GE + 1 PoE
Features: VLAN, QoS, GVRP, MSTP, IGMP snooping
Security: ACL, 802.1X/RADIUS, SSH/SSL
DELL Optiplex 3010
OS: Proxmox PVE
CPU: i3-10105 @ 3.70GHz
RAM: 64 GB DDR4
Storage: 2x 256 GB M.2 SSD (RAIDZ1)
PCI-M2: 2x 512 GB M.2 SSD (RAIDZ1)
SATA-SSD: 1x 500 GB (L2ARC cache)
Eth: 2x 2.5 Gbit interface
Terramaster D6-320
Model: D6-320
OS: Dedicated VM (TrueNAS) on PVE
Storage: 3x 2 TB HDD (RAIDZ1)
Storage: 3x 512 GB SSD (RAIDZ1)
Interface: USB 3.2 (10 Gbps)
At the core of the setup is the Cisco Business 250 Series switch, which manages internal traffic and also connects directly to the WAN. Because the OPNsense virtual router attaches through the virtual interface vmbr0, it can migrate between cluster nodes if one fails, keeping network access uninterrupted.
The switch is set up with LACP in balance-tcp mode, distributing traffic and providing redundancy across multiple links. An SSD hosts the virtual machines and serves as cache, reducing IO delay on the node, while three 2 TB Western Digital HDDs are configured in a RAID-Z1 (ZFS) array for fault-tolerant data storage. The entire setup is connected via 2x 1 Gbit Ethernet.
Network Architecture Overview
Main LAN Configuration
The primary Local Area Network (LAN) is 10.100.0.0/24. A Layer 3 (L3) switch separates traffic into Virtual Local Area Networks (VLANs), which are configured as follows:
VLAN10: 10.100.10.0/24 | Subnet Mask: 255.255.255.0 | Purpose: LAN services.
VLAN20: 10.100.20.0/24 | Subnet Mask: 255.255.255.0 | Purpose: Internet access.
VLAN30: 10.100.30.0/24 | Subnet Mask: 255.255.255.0 | Purpose: IoT devices.
VLAN40: 10.100.40.0/24 | Subnet Mask: 255.255.255.0 | Purpose: WAN service servers.
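The addressing plan above can be sanity-checked with Python's standard `ipaddress` module. This is a minimal sketch: the VLAN IDs and subnets are taken from the list, while the `VLANS` dictionary and the helper names are only illustrative.

```python
import ipaddress

# VLAN IDs and subnets as listed above; the dictionary layout is just for this sketch.
VLANS = {
    10: ("10.100.10.0/24", "LAN services"),
    20: ("10.100.20.0/24", "Internet access"),
    30: ("10.100.30.0/24", "IoT devices"),
    40: ("10.100.40.0/24", "WAN service servers"),
}


def describe(vlans):
    """Return (vlan_id, network, netmask, usable_hosts) for each VLAN."""
    rows = []
    for vlan_id, (cidr, _purpose) in sorted(vlans.items()):
        net = ipaddress.ip_network(cidr)
        # Subtract the network and broadcast addresses to get usable hosts.
        rows.append((vlan_id, str(net), str(net.netmask), net.num_addresses - 2))
    return rows


def find_overlaps(vlans):
    """Return pairs of VLAN IDs whose subnets overlap (should be empty)."""
    nets = [(vid, ipaddress.ip_network(cidr)) for vid, (cidr, _) in sorted(vlans.items())]
    return [
        (a, b)
        for i, (a, net_a) in enumerate(nets)
        for b, net_b in nets[i + 1:]
        if net_a.overlaps(net_b)
    ]


for row in describe(VLANS):
    print(row)
print("overlaps:", find_overlaps(VLANS))
```

Each /24 yields the 255.255.255.0 mask shown in the list and 254 usable host addresses, and none of the four subnets overlap.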
Link Aggregation Configuration
The network interfaces are configured using Link Aggregation Control Protocol (LACP) in balance-tcp mode. This configuration allows for load balancing of traffic across multiple links by considering Layer 2 to Layer 4 data, such as destination MAC addresses, IP addresses, and TCP ports, enhancing throughput and redundancy in case of link failure.
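To illustrate the idea of hash-based load balancing, here is a simplified sketch of how a flow could be pinned to one link of a bond. It is not the actual hashing algorithm used by the bonding implementation; the CRC32 hash, the `pick_link` name, and the example addresses are assumptions chosen for clarity.

```python
import zlib


def pick_link(src_ip, dst_ip, src_port, dst_port, n_links):
    """Map a flow to a link index using a hash of its L3/L4 fields.

    Illustrative only: real bonding implementations use their own hash.
    """
    if n_links < 1:
        raise ValueError("at least one active link is required")
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}".encode()
    return zlib.crc32(key) % n_links


# The same flow always lands on the same link, so packets stay in order.
flow = ("10.100.10.5", "10.100.40.9", 51544, 443)
print(pick_link(*flow, n_links=2))

# If one link fails, all flows are rehashed onto the remaining link.
print(pick_link(*flow, n_links=1))
```

Because the hash is computed per flow, a single TCP connection never exceeds the speed of one link; the aggregate gain appears only when many flows are active.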
WAN Access Configuration
The switch is configured with the WAN port activated as a VLAN access port, allowing both nodes to maintain access to the WAN in the event of a failure of one node. This configuration ensures that the OPNsense firewall can seamlessly migrate from one node to another, maintaining WAN connectivity and service continuity.
Security and Connectivity
The homelab combines several security and connectivity measures to protect data and keep services available:
Cloudflare Integration: Serving as a reverse proxy, Cloudflare obscures the real IP address of the homelab, providing enhanced security features such as DDoS protection, rate limiting, and geolocation-based access controls.
Remote Access: Secure remote connectivity is established via a Tailscale VPN, ensuring encrypted access to the network from external locations. Port 41641 UDP is specifically opened for Tailscale to facilitate direct peer-to-peer connections without relying on relays, which is particularly important due to the complexities introduced by Carrier-Grade NAT (CGNAT). Allowing this port helps Tailscale establish a more efficient connection between devices on challenging networks, thereby improving performance and reducing latency.
Firewall Configuration: Firewall rules are meticulously configured to manage incoming and outgoing WAN traffic, effectively blocking unauthorized inter-VLAN communication and mitigating threats from known malicious IP addresses.
Security Monitoring: Deployment of a Wazuh Security Information and Event Management (SIEM) agent enables comprehensive monitoring for real-time threat detection.
Access Control Mechanisms
An Access Control List (ACL) is implemented for the reverse proxy web service, enhancing security by regulating access to sensitive resources based on predefined rules. Geographic blocking is also enforced to restrict access based on the origin of incoming requests, thereby reducing exposure to threats from high-risk regions.
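The logic of an IP-based ACL can be sketched in a few lines. This is a hypothetical illustration, not the reverse proxy's actual configuration: the rule format, the `is_allowed` helper, and the blocked range (a reserved documentation range) are all assumptions.

```python
import ipaddress

# Hypothetical rule set: (action, CIDR), evaluated top to bottom.
RULES = [
    ("allow", "10.100.10.0/24"),  # VLAN10 (LAN services)
    ("deny", "203.0.113.0/24"),   # documentation range standing in for a blocked source
]


def is_allowed(client_ip, rules=RULES, default="deny"):
    """Return True if the first matching rule allows the client; else apply the default."""
    addr = ipaddress.ip_address(client_ip)
    for action, cidr in rules:
        if addr in ipaddress.ip_network(cidr):
            return action == "allow"
    return default == "allow"


print(is_allowed("10.100.10.25"))   # matches the allow rule
print(is_allowed("203.0.113.7"))    # matches the deny rule
print(is_allowed("198.51.100.1"))   # no match, falls back to default deny
```

A default-deny policy means anything not explicitly listed is rejected, which is usually the safer choice for sensitive resources.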
Intrusion Detection and Prevention
An integrated Intrusion Prevention System (IPS) and Intrusion Detection System (IDS) within the router actively monitors network traffic for suspicious activity, providing automated responses to detected threats. This system is managed through OPNsense, which is accessible on port 3000 exclusively from the management VLAN, ensuring that only authorized personnel can configure and monitor the IPS/IDS functionalities.
Internal DNS Management
A local DNS server handles internal DNS management, which improves resolution speed through caching and allows custom records such as A, AAAA, and CNAME.
Monitoring Solutions
For system metrics visualization and alerting, Grafana is utilized with a customized dashboard (ID 1860). Critical event notifications are sent via email and Telegram alerts, while Uptime Robot monitors service availability.
Network Redundancy and Power Management
To ensure operational continuity and resilience:
Power Backup: An APC Back-UPS 400VA with a 132 Watt battery provides essential power backup during outages.
Dual Feed Power Supply: A dual feed system featuring an Automatic Transfer Switch (ATS) seamlessly transitions power supply between the home electrical grid and a solar panel setup with battery storage, enhancing reliability.
High Availability Configuration: The homelab employs a Proxmox cluster configured for High Availability (HA), facilitating automatic failover in case of node failure along with replication tasks to maintain synchronized virtual machine (VM) states across nodes.
Data Redundancy Strategies
Data redundancy is fortified through RAID-Z1, which offers single-disk fault tolerance through parity similar to traditional RAID 5 configurations but with enhanced efficiency due to ZFS's architecture. The most significant advantages of using ZFS include:
Data Integrity: ZFS employs checksumming for all data blocks, allowing it to detect and correct silent data corruption automatically.
Self-Healing Capabilities: When ZFS reads data from a RAID-Z1 pool, it checks against checksums; if it detects any corruption, it can use parity information to reconstruct the correct data.
Storage Efficiency: RAID-Z1 only sacrifices the capacity of one disk for parity while maximizing usable storage space within the array.
Dynamic Striping: ZFS uses dynamic stripe widths that allow each write operation to be handled efficiently without requiring read-modify-write cycles typical in traditional RAID systems.
These features make RAID-Z1 an attractive option for environments where moderate data protection is required without sacrificing too much storage capacity.
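The capacity trade-off and the parity-based reconstruction can be shown with a small sketch. It is deliberately simplified: it ignores ZFS metadata, padding, and the difference between TB and TiB, and it uses plain XOR over equal-sized blocks rather than ZFS's variable-width stripes. The function names are illustrative.

```python
from functools import reduce


def raidz1_usable(disks, size_per_disk):
    """Approximate usable capacity: one disk's worth of space goes to parity."""
    if disks < 2:
        raise ValueError("RAID-Z1 needs at least two disks")
    return (disks - 1) * size_per_disk


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# The two pools described in the hardware list.
print(raidz1_usable(3, 2))    # 3x 2 TB HDD, result in TB
print(raidz1_usable(3, 512))  # 3x 512 GB SSD, result in GB

# Single-parity reconstruction: lose one data block, rebuild it from the rest.
d0, d1 = b"homelab!", b"proxmox!"
parity = xor_blocks([d0, d1])
rebuilt_d1 = xor_blocks([d0, parity])
print(rebuilt_d1 == d1)
```

With three disks, roughly two thirds of the raw capacity is usable, and any single missing block can be recomputed from the surviving blocks plus parity.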
Containerization with Docker
To run microservices efficiently within my homelab environment, I utilize Docker, managed through Portainer on a dedicated virtual machine configured with MAC-VLAN networking. This setup allows each containerized microservice to have its own unique MAC address on the network, facilitating direct communication without interfering with other containers or services running on the same host. It is essential to disable promiscuous mode on the VM's network interface when using MAC-VLANs in Proxmox to prevent potential network issues such as duplicate packets or unexpected behavior that can arise from promiscuous mode being enabled by default on TAP interfaces.
Additionally, I utilize netbooting configured via DHCP for provisioning VM ISO images through a dedicated microservice running in Docker. This approach allows me to dynamically load ISO images over the network without needing local storage on each hypervisor node, streamlining the deployment process for new VMs.
Portainer simplifies Docker management by providing an intuitive web interface that allows easy deployment, monitoring, and management of containers across multiple hosts. This streamlined approach enhances visibility into container performance and resource usage while supporting best practices in microservices architecture.
Hypervisor Utilization
The hypervisor in use is Proxmox, which is an open-source virtualization management platform that allows for the efficient management of virtual machines (VMs) and containers. It supports various virtualization technologies such as KVM for full virtualization and LXC for lightweight container-based virtualization. Proxmox enables features like live migration of VMs between nodes without downtime, provided that both nodes share the same kernel version, ensuring seamless service continuity during maintenance or failures.
Quorum Management with Raspberry Pi
A Raspberry Pi (RPI) serves as a quorum device for the Proxmox cluster, ensuring that even with only two active nodes, quorum can be achieved without needing an additional physical server. This setup allows for High Availability by providing an odd number of votes necessary for cluster operations, thus enabling VMs to boot and run reliably even if one of the main Proxmox nodes experiences issues.
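The effect of the extra vote can be checked with simple majority arithmetic. This sketch assumes one vote per node and one vote for the quorum device; the function names are illustrative.

```python
def quorum_threshold(total_votes):
    """Smallest number of votes that forms a strict majority."""
    return total_votes // 2 + 1


def has_quorum(votes_present, total_votes):
    """True if the votes currently present reach the majority threshold."""
    return votes_present >= quorum_threshold(total_votes)


# Two nodes, no quorum device: losing one node loses quorum.
print(has_quorum(votes_present=1, total_votes=2))

# Two nodes plus the Raspberry Pi: one node and the Pi still form a majority.
print(has_quorum(votes_present=2, total_votes=3))
```

With only two votes the threshold is two, so a single surviving node cannot act alone; the third vote lowers the bar to two out of three.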
Wake-on-LAN Implementation
The Raspberry Pi also runs a cron job that periodically checks the status of the primary node (Node 100) by pinging it at regular intervals. If Node 100 becomes unresponsive, the RPI triggers Wake-on-LAN (WoL) functionality to remotely boot the secondary node designated as Node 200. This automated process ensures that operational continuity is maintained without manual intervention by sending a "magic packet" over the network to activate Node 200 from a powered-off state.
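A check like this could be sketched as follows. This is not the actual script running on the Raspberry Pi: the function names, the placeholder host and MAC address, and the use of the Linux `ping` flags `-c` and `-W` are assumptions. The magic packet format itself is standard: six `0xFF` bytes followed by the target MAC address repeated sixteen times.

```python
import socket
import subprocess


def build_magic_packet(mac):
    """Build a Wake-on-LAN magic packet: 6 x 0xFF, then the MAC repeated 16 times."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"invalid MAC address: {mac!r}")
    return b"\xff" * 6 + bytes.fromhex(hex_digits) * 16


def is_reachable(host, timeout_s=2):
    """Send one ICMP echo request using the Linux ping utility."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def send_magic_packet(mac, broadcast="255.255.255.255", port=9):
    """Broadcast the magic packet over UDP (port 9 is a common convention)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(build_magic_packet(mac), (broadcast, port))


def check_and_wake(primary_host, secondary_mac):
    """Wake the secondary node only if the primary does not answer. Returns True if woken."""
    if is_reachable(primary_host):
        return False
    send_magic_packet(secondary_mac)
    return True
```

A cron entry would then call `check_and_wake` with the address of Node 100 and the MAC address of Node 200 at the chosen interval. Node 200 needs Wake-on-LAN enabled in its firmware and on its network interface for the packet to have any effect.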
Memory Management Optimization
To optimize resource usage within the virtual environment, RAM ballooning is enabled on Proxmox. This feature allows dynamic adjustment of memory allocation among VMs based on their current needs, thereby conserving memory resources during low-demand periods while ensuring adequate performance during peak usage times.
External DNS Service
An external DNS service is maintained to ensure continued accessibility in the event of primary system downtime, thereby guaranteeing service reachability.
Cold Storage Solutions
Cold storage solutions are implemented for efficient long-term data backup, preserving older data securely while optimizing performance for active operations. This comprehensive architecture ensures that my homelab remains resilient, efficient, and sustainable amidst evolving technological demands and security threats.
Hosted Services
Wazuh:
A security information and event management (SIEM) system that includes intrusion detection, vulnerability detection, and security monitoring for threat detection and response.
Nextcloud:
A self-hosted, open-source file synchronization and sharing platform that provides cloud storage, document editing, and collaboration features.
Nginx Proxy Manager:
An Nginx-based reverse proxy server that facilitates routing and load balancing of web traffic to different web services, enhancing web server security and performance.
DNS (Pi-hole):
A DNS-based ad-blocking and DNS sinkhole service that filters out unwanted content and malicious domains by intercepting DNS requests.
Portainer:
A Docker container management interface that simplifies the creation, management, and monitoring of Docker containers and applications in a containerized environment.
AMP:
A simple-to-use, self-hosted web control panel (Application Management Panel) for game servers.