TRILL Switching for WISPs, Part 2
< Part 1… intro in case you missed it
Many WISPs start out with a bridged/switched network. It's cheap and easy to create (simply plug stuff in!) but doesn't scale well. Sooner or later, the only way to scale up is to scale out, using routers and not just switches and bridges.
I learned the practical limits of spanning tree the hard way, somewhere around the last century. Since then, I've been a proponent of Routing All The Bits.
So for me, moving back to 'switching,' even on a small test network segment, felt a bit grungy. We'll test a mesh of up to 10 TRILL switches from IgniteNet in a real deployment.
In Part 1, we saw that TRILL is an overlay network. It's actually a routed network underneath. In practice, it should be more efficient and resilient than a traditional bridged network.
Here in Part 2 we'll talk about how to configure TRILL (it's almost zero-config), and what effect it has on traffic paths. SPOILER: it picks the shortest path.
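To make "picks the shortest path" concrete, here's a minimal Python sketch of the kind of shortest-path-first (Dijkstra) calculation a link-state protocol such as IS-IS performs. This is not IgniteNet's implementation; the switch names, the topology, and the link metrics are all made up for illustration.

```python
import heapq


def shortest_path(links, src, dst):
    """Dijkstra over an undirected graph.

    links maps (switch_a, switch_b) -> metric. Returns (total_cost, path),
    or (None, []) if dst is unreachable from src.
    """
    graph = {}
    for (a, b), metric in links.items():
        graph.setdefault(a, []).append((b, metric))
        graph.setdefault(b, []).append((a, metric))

    best = {src: 0}          # lowest known cost to each switch
    prev = {}                # previous hop on the best known path
    queue = [(0, src)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == dst:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry, a cheaper route was already found
        for neighbor, metric in graph.get(node, []):
            new_cost = cost + metric
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                prev[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))

    if dst not in best:
        return None, []
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return best[dst], path[::-1]


# Hypothetical four-switch mesh; names and metrics are illustrative only.
LINKS = {
    ("sw1", "sw2"): 10,
    ("sw2", "sw4"): 10,
    ("sw1", "sw3"): 10,
    ("sw3", "sw4"): 30,
}

if __name__ == "__main__":
    print(shortest_path(LINKS, "sw1", "sw4"))
```

In this toy mesh, traffic from sw1 to sw4 goes via sw2 at a total cost of 20. Drop the sw2-sw4 link from the table and the same calculation falls back to the sw3 path at a cost of 40, which is the behavior a mesh relies on after a link failure.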
Design & Setup
Bridged Network: Foundation
We'll use a physical backhaul network of wireless and wired single links. We can use non-TRILL switches as well as TRILL switches; if a TRILL port sees Spanning Tree BPDUs, it lets STP do its thing and steps out of the way. Consequently, we disable STP wherever we want TRILL to manage the loop-free topology.
My original intent for proof-of-concept was to test gigabit-only links and TRILL-only switches. On the bench, this is easy to cable. In production, I used a mix of 1Gb copper ethernet, fiber, and a variety of wireless links:
- Mimosa B24 (24 GHz bridge)
- 1Gb MikroTik Wireless Wire (60 GHz bridge, RSTP disabled)
- 2.5Gb IgniteNet MetroLinq PTP/PTMP radios (60 GHz)
To augment testing with sub-gigabit connections, I added an SFP VDSL2+ link that tops out at about 110 Mb/s, converted to a 1Gb copper interface.
Each POP or hop contains a minimum of two backhauls or interconnections, a TRILL switch, and a router. We'll get to the routers in part 3.
Routed IS-IS Network — Port Configuration
Bridged TRILL Network — Status
Bridging Reconvergence
I simulated a link-down by pushing continuous UDP/TCP traffic across the mesh and disabling the RF side of test backhauls in the path. This causes packets to stop flowing, but does not physically signal an outage. (A switch would immediately detect a downed ethernet port.)
With the latest firmware, convergence occurs in 6-8 seconds on a link-down condition, and even faster on a link-up condition.
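One simple way to put a number on convergence is to log the arrival time of each packet in the continuous test stream and look for the longest silence. The Python sketch below assumes you already have such a list of timestamps; the function name and the sample data are hypothetical, and this is not necessarily how the figures above were measured.

```python
def longest_gap(arrival_times):
    """Return (gap_seconds, gap_start) for the longest silence in a stream.

    arrival_times is a list of packet arrival timestamps in seconds.
    With fewer than two packets there is no gap to measure.
    """
    if len(arrival_times) < 2:
        return 0.0, None
    times = sorted(arrival_times)
    # Pair each arrival with the next one and keep the widest spacing.
    gap, start = max(
        (later - earlier, earlier) for earlier, later in zip(times, times[1:])
    )
    return gap, start


# Made-up sample: packets every 0.5 s, with a silence after t = 1.0 s.
SAMPLE = [0.0, 0.5, 1.0, 8.0, 8.5]

if __name__ == "__main__":
    print(longest_gap(SAMPLE))
```

The resolution of this estimate is limited by the send interval of the test stream, so a faster stream gives a tighter bound on the outage.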
Load Balancing, kinda
Conclusions
What's working
- Self-configuring TRILL shortest-path-first mesh
- Reasonably stable, fast-ish failover
- Poor-man's full-duplexer/load balancer
- Metric can be manipulated via CLI (persistently?)
- Low cost for the feature set
What could use improvement
- Some outstanding issues with accessing management IPs on trunks
- No multipath/ECMP
- GUI limits max MTU below what hardware supports
- GUI doesn't allow setting interface metric
- No SNMP/API monitoring of TRILL adjacency
These views are my own; IgniteNet did not sponsor this post, but did answer copious questions during the debugging and trial phases.
…coming in part 3…
In part 3, we'll talk about:
- types of deployments where TRILL mesh is a good idea/bad idea
- various routing topologies when using a Layer 2 mesh
- additional types and methods of traffic steering
Testing Setup
- IgniteNet MetroLinq 2.1.1 and 2.2.0
- MikroTik RouterOS 6.43rc47 and later, for Wireless Wire (earlier revisions have an MTU bug)
- MikroTik RouterOS 6.42.1 and later, for routers
- Mimosa B24 1.5.1
- Managed and unmanaged fiber converters, various manufacturers and firmware revisions