Network Segmentation

December 20, 2023

Ever since I began to understand the concepts of subnets, VLANs, and inter-network communication in my high school CCNA course, I've realized their power, when properly configured, to mitigate security risks on devices with less than adequate security features built in (or just to add an additional layer of security). This concept became immediately applicable during my internships in the Oil and Gas SCADA industry, where network-level controls often served as the primary defense against attackers due to challenges like the inability to patch, hard-coded credentials, and a prevailing view of security as an operational hindrance rather than a safeguard. These experiences laid the groundwork for my current approach to network segmentation, utilizing the ISA/IEC 62443 concept of Zones and Conduits as a guide.

Devices

Before diving into the specifics of each network segment, it's essential to understand the varied and complex ecosystem of devices that make up my homelab network. This environment encompasses everything from standard computing devices like phones and computers to more specialized devices such as smart IoT devices, physical security controllers, and a home automation system. Also integral to the network are management devices including switches, firewalls, and virtualization infrastructure, along with a dedicated VoIP system for communications. Each of these devices plays a unique role, comes with its own set of security and operational considerations, and consequently, demands a tailored approach to network segmentation and security.

Network Segments

I've spent a significant amount of time considering the most impactful way to segment my homelab network (well, maybe it's homeprod at this point). In the end, based on informal risk assessment, this is what I've settled on and have been running for the last couple of years:

Primary Network: This network is where "trusted" devices reside. This includes our phones and computers. This network has internet access, access to applications via the application DMZ, and some limited access to devices on the automation network.
IoT Network: Vendor cloud-connected devices are placed on this network. It provides full internet access with botnet/malware IP blocklists and anti-tracking DNS filters applied.
Automation Network: Highly restricted network where local home automation devices and security cameras reside. This network has no internet access and very limited inbound access.
Management Network: Management interfaces for infrastructure, such as switches, firewalls, wireless controllers, access points, and virtualization infrastructure are on this segment. This segment has no internet access and can only be accessed through the management jump host.
Management DMZ: Houses the management jump host to access the management network. To access the management zone, you must RDP to this jump host with Duo multifactor authentication.
Infrastructure Network: This zone contains shared services that are used across multiple networks. Currently, it hosts two Pi-hole DNS servers that are used by all devices on all the networks. All upstream DNS forwarding is encrypted via cloudflared (DNS over HTTPS, which runs over TLS).
Applications: Internal applications are hosted in this zone. Servers are micro-segmented internally via Proxmox firewall rules. Management is performed from the management jump host and applications are published via the applications DMZ.
Applications DMZ: This zone hosts reverse proxy servers that are used to provide secure access to the home automation system, security cameras, and applications in the applications network. The reverse proxy provides public Let's Encrypt certificates and multi-factor authentication to applications (whether accessed internally or externally). Servers in this zone have no internet access and very limited pinholes to applications for proxying.
Voice over IP: Network specifically for VoIP. It houses VoIP phones and intercoms throughout the house as well as the PBX to provide inbound and outbound connectivity over SIP trunks. This network has restricted internet access to the VoIP provider and limited inbound access so that our cellphones can connect to the PBX with softphone clients and so that the automation system can perform API calls on the intercoms to play messages.
Guests: Unfiltered internet access with the exception of botnet/malware IP blocklists. This zone uses the DNS servers in the infrastructure zone but has no other access to the other networks. Guest isolation is also utilized on the wireless controller.
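
To make the zone list above a little more concrete, here is a minimal Python sketch of the same idea expressed as zones and conduits with a default-deny check. The zone identifiers, the service labels (such as "web" and "admin"), and the exact set of conduits are my own illustrative reading of the descriptions above, not an export of the real firewall rules. In particular, the source of the RDP connection to the jump host is an assumption, and the table is deliberately partial.

```python
from dataclasses import dataclass

# Zone identifiers mirror the list above; "internet" is added as a pseudo-zone.
ZONES = {
    "primary", "iot", "automation", "management", "management_dmz",
    "infrastructure", "applications", "applications_dmz", "voip",
    "guests", "internet",
}


@dataclass(frozen=True)
class Conduit:
    """One permitted flow: source zone, destination zone, service label."""
    src: str
    dst: str
    service: str


# Illustrative, partial conduit table. Service labels are hypothetical.
CONDUITS = {
    Conduit("primary", "internet", "web"),
    Conduit("primary", "applications_dmz", "https"),
    Conduit("primary", "management_dmz", "rdp"),  # assumption: admin starts here
    Conduit("primary", "voip", "sip"),
    Conduit("iot", "internet", "any"),
    Conduit("guests", "internet", "any"),
    Conduit("management_dmz", "management", "admin"),
    Conduit("management_dmz", "automation", "admin"),
    Conduit("applications_dmz", "applications", "https"),
    Conduit("applications_dmz", "automation", "https"),
    Conduit("automation", "voip", "intercom-api"),
    Conduit("voip", "internet", "sip"),
}

# Every internal zone may query the shared DNS servers.
CONDUITS |= {
    Conduit(zone, "infrastructure", "dns")
    for zone in ZONES - {"infrastructure", "internet"}
}


def is_allowed(src: str, dst: str, service: str) -> bool:
    """Default deny: a flow passes only if an explicit conduit exists."""
    for zone in (src, dst):
        if zone not in ZONES:
            raise ValueError(f"unknown zone: {zone}")
    return Conduit(src, dst, service) in CONDUITS


print(is_allowed("management_dmz", "management", "admin"))
print(is_allowed("guests", "primary", "https"))
```

Writing the conduits down as data like this makes it easy to spot a flow that has no stated reason to exist.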

Risk Assessment and Mitigations

When designing these zones, I wanted to evaluate the risks of each device and attempt to group them together into devices that needed similar access or devices that needed to be able to communicate directly. Although I didn't do a formal ISA/IEC 62443 risk assessment, the design was informed by my professional usage of it. I looked at the confidentiality of the data and the integrity and availability requirements of the devices. From there, I worked to group them into zones of devices with similar requirements. Some of the questions I considered when developing my approach were:

What kinetic capabilities does the device have (turn on a light vs unlock a door)?
Do the devices have regular security patches from the vendor? Am I able to apply them in a timely manner, without interrupting a family vacation, if a new critical patch is released?
Does the vendor have a robust security program or are the devices likely to have (or have a history of) vulnerabilities?
What assurance do I have that the device is secured and configured properly? Do I have the visibility to get this assurance?
What kind of data is stored on the device? Is it sensitive or does it have a need to be high integrity?
What's the availability need of the device? Is it used in some way for safety or security?
Does putting this device in its own network segment provide enough additional security to justify the additional subnet?
Are there other mitigating controls I can use (such as Proxmox's firewall for micro-segmentation)?
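
As a rough illustration of how answers to the questions above can drive grouping, the Python sketch below records a few answers per device and buckets devices with identical answers together. This is not a formal ISA/IEC 62443 assessment and not my actual worksheet. The Device fields, the group_by_requirements helper, and every answer in the sample data are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    name: str
    kinetic: str              # "none", "low" (turn on a light), "high" (unlock a door)
    vendor_patches: bool      # does the vendor ship regular security patches?
    needs_vendor_cloud: bool  # is the device useless without its cloud service?
    sensitive_data: bool
    safety_critical: bool


def group_by_requirements(devices):
    """Bucket devices whose answers match exactly; each bucket is a candidate zone."""
    groups = defaultdict(list)
    for device in devices:
        key = (
            device.kinetic,
            device.vendor_patches,
            device.needs_vendor_cloud,
            device.sensitive_data,
            device.safety_critical,
        )
        groups[key].append(device.name)
    return dict(groups)


# Hypothetical answers, for illustration only.
DEVICES = [
    Device("thermostat", "low", True, True, False, False),
    Device("dehumidifier", "low", True, True, False, False),
    Device("door controller", "high", False, False, True, True),
    Device("alarm interface", "high", False, False, True, True),
    Device("laptop", "none", True, False, True, False),
]

for names in group_by_requirements(DEVICES).values():
    print(names)
```

Exact-match grouping tends to produce too many buckets on a real inventory. The last two questions in the list are what justify merging buckets or handling a difference with a compensating control instead of another subnet.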

Devices on the primary network are generally trusted and contain sensitive information in the form of user data (iMessages, camera rolls, web browser session tokens, etc.). They need access to internal applications and the home automation system (access to internal applications still requires that end-user devices first go through the reverse proxy server and authenticate). Internet access is provided only on common web ports and other specific ports required by services used on the devices. It is further limited by IP blocklists for malware hosts and DNS filtering for malware and tracking sites. Devices on this subnet are regularly patched and run anti-virus and other host-based protections where possible. End users on this subnet also expect a smooth user experience, requiring a balance between security and usability.
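
The three outbound controls described for the primary network (port allow-list, IP blocklist, DNS filtering) can be sketched as a single decision function. In practice these checks live in different places (the firewall and the DNS servers), so combining them here is purely for illustration. The port set, the blocked network (a documentation range), and the blocked domains are placeholders, not real list entries.

```python
import ipaddress

# Placeholder values; the real lists are far larger and come from feeds.
ALLOWED_PORTS = {80, 443}
BLOCKED_NETS = [ipaddress.ip_network("203.0.113.0/24")]
BLOCKED_DOMAINS = {"malware.example", "tracker.example"}


def dns_blocked(hostname: str) -> bool:
    """True if the name or any parent domain is on the DNS blocklist."""
    labels = hostname.lower().rstrip(".").split(".")
    return any(".".join(labels[i:]) in BLOCKED_DOMAINS for i in range(len(labels)))


def egress_allowed(hostname: str, dst_ip: str, dst_port: int) -> bool:
    """Apply DNS filtering, then the IP blocklist, then the port allow-list."""
    if dns_blocked(hostname):
        return False
    address = ipaddress.ip_address(dst_ip)
    if any(address in net for net in BLOCKED_NETS):
        return False
    return dst_port in ALLOWED_PORTS


print(egress_allowed("www.example.org", "198.51.100.7", 443))
print(egress_allowed("cdn.tracker.example", "198.51.100.7", 443))
```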

IoT network devices rely on a persistent connection to a cloud service provider. These devices include the thermostat, printer, dehumidifier, night lights, label maker, dishwasher, and other similar devices. Connection methods to the vendor cloud are typically not disclosed to the consumer and can result in the service provider being able to (inadvertently) communicate with other devices on the same network (particularly when reverse tunnels are used to establish communication). These devices often offer little visibility into how they have been secured by the vendor (if they have been secured at all) and cannot be hardened or controlled by the end user. In many cases, the vendor has intentionally disabled local access, rendering the devices useless without connectivity to the cloud. Because of their similar risk profiles and communication requirements, these devices are placed together on one network.

The automation network is home to the trusted, local automation system (OpenHAB). This is the core of my smarthome infrastructure. Devices here are capable of local control and include Arduino-based door controllers, a Z-Wave hub, a ZigBee hub, the alarm system interface, video security (30+ IP cameras, NVRs, and video object detection), and door intercoms. These devices perform sensitive operations and contain sensitive data related to the security and safety of the house, resulting in the need for additional protection at the network level. Due to the lack of available patches for some of the end devices, their sensitivity, and their criticality, this zone is treated like a process control network. No traffic is allowed outbound to the internet from this network, and access to this network is restricted to the management DMZ for privileged tasks and to the applications DMZ for end user access to applications.

Network devices, such as switches, firewalls, wireless controllers, access points, uninterruptible power supplies, power distribution units, and virtualization infrastructure have their management interfaces in the management network. This network segment is similar to the automation network but has different types of devices - the core of the network and virtualization stack. These devices are not only critical to the operation of the network, but, if maliciously misconfigured, could allow unintended access to other devices. To help mitigate the risk of exploitation, no internet access is available to this zone, and it can only be accessed from the management DMZ.

The management DMZ houses a single Windows server that is used as a jump host. Its only purpose is to be a sanitary place to perform administrative actions on the privileged networks. Duo Security's Windows agent enforces multifactor authentication on this server and requires a successful push authentication to log on. Outbound internet access is generally denied with the exception of Windows patching and Duo's small list of API servers. This design was modeled after Microsoft's concept of a secure administrative host.

Common services used across zones reside in the infrastructure network. This zone hosts services, like DNS, which are used by multiple zones on the network. These servers only allow inbound connections to their services and cannot initiate connections into the zones that use their services. Management interfaces of servers in this zone are restricted to the management DMZ to help minimize their attackable surface area.

Internal applications are hosted in the applications zone. This zone contains servers and containers, hosted on the virtual infrastructure stack, that run internal (non-public) applications. Inside the zone, the servers are micro-segmented through the use of Proxmox's firewall features to limit communications between virtual hosts, even though they are on the same network. End users reach applications via the applications DMZ, and these servers are managed from the jump host in the management DMZ. Servers in this zone have highly restricted internet access depending on the needs of the applications.

The applications DMZ is a zone housing reverse proxy servers that provide access to the applications in the applications zone. All access to applications must pass through this zone. This zone translates from internal SSL certificates to public Let's Encrypt certificates, enforces strong SSL/TLS ciphers, and enforces multifactor authentication -- all of which is done before proxying the user through to the application. Other mitigations are applied at this point too, including, but not limited to, IP filtering and URL restrictions. To limit the blast radius if it were compromised, this zone has only pinhole access to applications and no other access.
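
The checks described above can be pictured as a gate that every request must clear before it is proxied. The Python sketch below shows one possible ordering (IP filter, route lookup, URL restriction, then the multifactor check). It is not the configuration of the actual reverse proxy. The hostnames, upstream URLs, blocked path prefix, denied network, and the Route, Request, and gate names are all hypothetical, and the ordering itself is an assumption.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    upstream: str                 # internal application URL (hypothetical)
    blocked_prefixes: tuple = ()  # URL paths never exposed through the proxy


@dataclass(frozen=True)
class Request:
    client_ip: str
    host: str
    path: str
    mfa_verified: bool            # result of an earlier MFA step (assumed)


DENIED_NETS = [ipaddress.ip_network("203.0.113.0/24")]  # placeholder range
ROUTES = {
    "automation.home.example": Route("https://openhab.internal.example", ("/admin",)),
    "cameras.home.example": Route("https://nvr.internal.example"),
}


def gate(request: Request):
    """Return (status, upstream); upstream is set only when every check passes."""
    client = ipaddress.ip_address(request.client_ip)
    if any(client in net for net in DENIED_NETS):
        return 403, None          # IP filtering
    route = ROUTES.get(request.host)
    if route is None:
        return 404, None          # no pinhole exists for this hostname
    if any(request.path.startswith(prefix) for prefix in route.blocked_prefixes):
        return 403, None          # URL restriction
    if not request.mfa_verified:
        return 401, None          # must authenticate before being proxied
    return 200, route.upstream


print(gate(Request("198.51.100.5", "cameras.home.example", "/live", True)))
print(gate(Request("198.51.100.5", "cameras.home.example", "/live", False)))
```

Because the route table is the only place an upstream appears, an application without an entry is simply unreachable through the proxy, which matches the pinhole model.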

The voice over IP zone hosts the IP phones, IP intercoms, and the internal PBX. Internet access from this zone is limited to the subnets the VoIP provider requires to establish a SIP trunk. VoIP phones give us provider/path diversity in our ability to make calls if our cell phones are not functioning, as well as E911 service to make sure that our actual address (vs an approximate location with a cell phone) is sent to the 911 call center in the event of an emergency. To increase availability, this zone has no other internal dependencies. Incoming traffic into this zone is limited to API calls to the IP intercoms from the home automation server in the automation network and SIP traffic from iPhones (only) on the primary network so that they can use their softphone application to make calls.

The guest network is intended to provide open internet access to guests' devices, while keeping the devices isolated from each other and from the internal networks. The devices in this zone are typically not family-owned, and thus, we have no administrative control over them, their function, or their use. Because of this, they are untrusted for all purposes. This zone does utilize the infrastructure zone for its DNS servers, but that is the only non-internet access this zone has. Internet access is fully open with the exception of IPs on the malware/botnet IP blocklists and DNS entries blocked by the malware DNS lists (this zone does not utilize the anti-tracking DNS blocklists). Devices on this network are all wireless and connect on a dedicated SSID with device isolation enabled at the wireless controller to prevent device-to-device communication. As a note: this is also where company-owned devices for our work are attached, since they are untrusted with respect to our network.

Security vs Usability

Since my homelab also hosts the production network for my family, usability is a key requirement. The internet, home automation, and internal applications have to "just work" for them. Because of this, I've purposefully deviated from some "rules" to enhance usability or decrease complexity - all backed by risk-based analysis or the addition of other compensating countermeasures. A few of these situations are:

Apple Devices: All Apple devices were placed on the primary network, even devices such as HomePods and Apple TVs which are arguably considered IoT devices. This decision was made because of the level of access Apple has to our family's data already and the similarity of all these devices with respect to the operating systems they run (note: this placement on the primary network is only for Apple smarthome devices, not any other similar devices from other companies). If I trust my iPhone to be on the primary network, moving the HomePods to another segment doesn't provide many additional security benefits, but it does massively increase the complexity of inter-device communications with the protocols (like mDNS) that Apple devices use.
VPN vs Reverse Proxy: While VPN is typically considered to be the ideal way to provide secure access to internal applications from outside the network, it doesn't provide the best user experience when applications have a use-case to be accessed from non-owned devices. The reverse proxy provides additional layers of security (such as public encryption certificates and enforcing multifactor authentication) prior to passing the user through to the application. It also limits the attack surface to a single application exposed to the internet since users must authenticate to it before being passed through to the actual applications behind it.
SSL Interception: I had previously implemented SSL interception on the primary network to scan and filter internet traffic. As more and more applications adopted certificate pinning and other modern security measures, this became almost impossible to keep up with and constantly caused usability issues for my family. I have since stopped using this technology and now rely on IP and DNS malware blocklists to provide some level of protection. Desktops and laptops on the primary network also run a web filtering agent to provide additional threat intelligence-based web blocking.

Conclusion

Creating the network for my homelab has been a journey of finding the right balance between security and everyday usability. By applying professional methodologies in my home environment, I've carefully structured each network segment with a focus on risk assessment and mitigation. Choices highlighted in the Security vs Usability section reflect an ongoing effort to align necessary security measures with the practical realities of a production family network. While this setup is well suited to my current needs, it is designed to evolve and adapt, underscoring the importance of flexible and informed network design in any setting.

In summary, this network segmentation strategy not only ensures robust security but also maintains a high level of usability. It demonstrates that with thoughtful planning and a risk-based approach, creating a network environment that is both secure and user-friendly is possible. Achieving this balance is crucial in a setting where technology supports daily life without imposing undue complexity on users.