066 | Redundancy of Network Links Within a Single Building: Copper, Fiber, and Bonding | BigMike.help - IT support for startups, developers and business

Redundancy of Network Links Within a Single Building: Copper, Fiber, and Bonding

In the previous article, we explained why network link redundancy is not a luxury but a critical requirement for business continuity. Today we’ll focus on the most basic, but no less critical, level: communication between servers and network equipment within a single building, whether it’s a server room, a data center, or a regular office.

This is often where the most frustrating failures occur, even in what seems like a controlled environment: someone accidentally unplugs a cable, a switch port fails, or an entire switch goes down.

Specifics of Local Network Issues

Inside a single building (or data center), the traffic at stake is high-speed data transfer between application servers, databases, and storage systems (SAN/NAS) over internal networks. If such a link breaks, even for a second, it can cause application crashes, lost sessions, data corruption, and other serious issues.

Typical Solutions for Local Network Resilience

To minimize risks, the following approaches are used:

1. Redundant Network Interface Cards (NICs) on Servers

Nearly all modern servers come with multiple network ports, and that’s no coincidence. Connecting a server to the network through two or more NICs is the first and simplest step toward resilience: if one fails, another can take over. Note that two ports on the same physical card still share that card as a single point of failure.

2. Two Independent Network Switches

Connecting each server to two different, independent switches (or two separate switch stacks) is the gold standard in data centers. If one switch completely fails, the server can continue operating through the other.

How it looks: Each server has at least two network cables, each going to a separate physical switch.

3. Link Aggregation (Bonding / EtherChannel / LACP)

This is one of the most powerful and common solutions. Link aggregation (called bonding in Linux) allows multiple physical NICs on the server (and corresponding switch ports) to be combined into one logical interface. This offers two major benefits:

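Increased bandwidth: The aggregate throughput is the sum of the member links; for example, two 1 Gbps links provide up to 2 Gbps in total. A single connection is normally still limited to the speed of one link, because each flow is mapped to one physical interface.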
Resilience: If one physical cable or switch port fails, traffic is automatically switched to the remaining active links in the group.

Bonding modes:

active-backup (mode 1): The simplest and most common mode for fault tolerance. One interface is active, the other is on standby. If the active interface fails, the standby takes over as soon as link monitoring detects the failure. Does not increase bandwidth.
balance-xor (mode 2): Distributes traffic across all active interfaces using an XOR hash (by default, of the source and destination MAC addresses), so a given pair of hosts always uses the same link. Provides load balancing and resilience.
802.3ad (LACP - Link Aggregation Control Protocol, mode 4): A standardized protocol that dynamically negotiates aggregation with the switch. Because both sides negotiate the group, a miscabled or misconfigured port is kept out of the aggregate rather than silently dropping traffic. Provides load balancing and redundancy. Requires LACP support on the switch.
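The XOR hashing mentioned for balance-xor can be sketched in a few lines of Python. This is a simplified model of a layer-2 hash policy, not the kernel's exact implementation: it XORs only the last byte of each MAC address and ignores other fields. The function names and MAC addresses are hypothetical and chosen for illustration.

```python
def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into six raw bytes."""
    return bytes(int(part, 16) for part in mac.split(":"))


def pick_link(src_mac: str, dst_mac: str, link_count: int) -> int:
    """Simplified layer-2 policy: XOR the last byte of each MAC address,
    then take the result modulo the number of links in the bond."""
    src = mac_to_bytes(src_mac)
    dst = mac_to_bytes(dst_mac)
    return (src[5] ^ dst[5]) % link_count


# Hypothetical addresses, for illustration only.
server = "52:54:00:00:00:10"
peers = ["52:54:00:00:00:21", "52:54:00:00:00:22", "52:54:00:00:00:23"]

for peer in peers:
    # The same (server, peer) pair always maps to the same link index.
    print(peer, "-> link", pick_link(server, peer, link_count=2))
```

Because the result depends only on the address pair, traffic to different peers spreads across the links, while all traffic between two given hosts stays on one link. This is why aggregation raises total throughput without speeding up a single flow.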

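Configuration: What has to be configured depends on the mode. active-backup is set up on the Linux server alone and needs no special switch configuration. balance-xor expects the corresponding switch ports to be grouped into a static aggregation group, and 802.3ad requires LACP to be enabled on those ports so that the server and the switch can “negotiate” with each other.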

4. Using Different Types of Cables

In critical systems, consider using different cable types for separate connections:

Copper (UTP/STP): Cheap and convenient for short distances. Susceptible to electromagnetic interference and limited in range.
Fiber optic: More expensive but offers much higher speed, longer distance, and total immunity to EMI. Ideal for switch-to-switch connections and storage links.

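Using different types can protect against failures that are specific to one medium: for example, a source of electrical interference that disturbs a copper run will not affect a fiber run.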

5. Redundancy of SFP Modules

If you’re using fiber optic connections, don’t forget about SFP/SFP+ modules: they are separate components that can fail independently of the cable and the port. Always keep spare modules of each type you use.

6. Redundant Storage Connections (Multipathing I/O)

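For servers connected to shared storage systems, it is essential to configure multipath I/O. This gives the server multiple paths to the same disk or LUN on the storage system. If one path (e.g., cable, Fibre Channel HBA port, or storage port) fails, data continues to flow through the other, ensuring uninterrupted operation. Multipathing works at the block-device level, so it applies to SAN storage (Fibre Channel, iSCSI); NAS shares reached over NFS or SMB usually rely on the bonded network links described above instead.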

What Can Fail at This Level?

NIC failure on the server.
Switch port failure.
Patch cord break or damage.
Complete switch failure (rare but possible).
SFP module failure.

Example Configurations

Basic Resilience: A server with 2 NICs connected to 2 different switches. NIC1 to SW1, NIC2 to SW2. active-backup bonding is configured on the server.
High Availability and Performance: A server with 2 NICs bonded in LACP mode. Each cable connects to a separate switch, and the two switches are stacked or clustered (for example, with multi-chassis link aggregation) so that they appear as one logical unit for LACP.
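One way to sanity-check a design like the ones above is to list the components on each path and test whether any single failure cuts the server off. The following is a minimal sketch of that idea; the component names and the two topologies are hypothetical, and real networks have more shared dependencies (power feeds, uplinks) than this model shows.

```python
from typing import Dict, List, Set

# Each path lists the components traffic must cross (hypothetical names).
PATHS: Dict[str, List[str]] = {
    "path1": ["NIC1", "cable1", "SW1"],
    "path2": ["NIC2", "cable2", "SW2"],
}

# Counter-example: both NICs plugged into the same switch.
SHARED_SWITCH: Dict[str, List[str]] = {
    "path1": ["NIC1", "cable1", "SW1"],
    "path2": ["NIC2", "cable2", "SW1"],
}


def reachable(paths: Dict[str, List[str]], failed: Set[str]) -> bool:
    """The server stays reachable if at least one path has no failed component."""
    return any(all(c not in failed for c in comps) for comps in paths.values())


def single_points_of_failure(paths: Dict[str, List[str]]) -> List[str]:
    """Return every component whose failure alone disconnects the server."""
    components = sorted({c for comps in paths.values() for c in comps})
    return [c for c in components if not reachable(paths, {c})]


print(single_points_of_failure(PATHS))          # → []
print(single_points_of_failure(SHARED_SWITCH))  # → ['SW1']
```

The second topology shows why two NICs alone are not enough: the shared switch remains a single point of failure until each NIC goes to its own switch.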

Monitoring

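Effective redundancy is impossible without constant monitoring: otherwise a failed standby link goes unnoticed until the remaining link fails too. Things to watch: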

Port status: Are they active? Are there any port errors?
Bonding interface status: Which interface is active? Have any failovers occurred?
Gateway/remote IP reachability: Pings or availability checks to key network points.
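On Linux, the bonding driver exposes its state as text under /proc/net/bonding/, which makes the bonding checks above easy to script. Below is a minimal sketch of such a check. The sample text is an abbreviated, illustrative excerpt rather than real output, the interface names are assumptions, and the helper names are hypothetical.

```python
from typing import List

# Abbreviated, illustrative excerpt in the style of /proc/net/bonding/bond0.
SAMPLE = """\
Bonding Mode: fault-tolerance (active-backup)
Currently Active Slave: eth0
MII Status: up

Slave Interface: eth0
MII Status: up
Link Failure Count: 0

Slave Interface: eth1
MII Status: down
Link Failure Count: 3
"""


def parse_bond_status(text: str) -> dict:
    """Extract the mode, the active interface, and per-interface link state."""
    status = {"mode": None, "active": None, "slaves": {}}
    current = None  # name of the member interface section being read
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or a line without "key: value"
        key, value = key.strip(), value.strip()
        if key == "Bonding Mode":
            status["mode"] = value
        elif key == "Currently Active Slave":
            status["active"] = value
        elif key == "Slave Interface":
            current = value
            status["slaves"][current] = {}
        elif current is not None and key in ("MII Status", "Link Failure Count"):
            status["slaves"][current][key] = value
    return status


def degraded_links(status: dict) -> List[str]:
    """Return member interfaces whose link is not reported as up."""
    return [name for name, info in status["slaves"].items()
            if info.get("MII Status") != "up"]


status = parse_bond_status(SAMPLE)
print("active:", status["active"])
print("degraded:", degraded_links(status))
```

On a real host, the text would come from reading the file for the bond in question (for example /proc/net/bonding/bond0, assuming the bond is named bond0), and a non-empty list of degraded links would be a reason to raise an alert even though traffic is still flowing.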

Conclusion

Ensuring communication resilience within a single building is the foundation of a reliable IT infrastructure. Using redundant components (NICs, switches) in combination with link aggregation technologies (bonding/LACP) and the right cabling choices significantly increases the availability of your servers and applications. Don’t forget about multipath storage connectivity and continuous monitoring!

In the next article, we’ll move up a level and discuss how to ensure redundancy for communication links between geographically distributed offices.
