The head of a mesh company argues mesh networks don't scale -- and one of the folks behind an open-source mesh software project examines the argument: MeshDynamics sells a multiple-radio solution for mesh networking, and the head of the firm wrote a brief article explaining why single-radio mesh networks can't work beyond a very small deployment. I asked Sascha Meinrath of the CUWiN project for his feedback on Francis daCosta's comments.
(After this item was posted, Jim Thompson contributed his own extensive thoughts about both daCosta and Sascha's statements, hence the "third" view of the revised title.)
Sascha writes:
While I do think that Francis daCosta brings up some potential pitfalls to wireless mesh networks, the doomsday picture he presents is based on a flawed understanding of how mesh networking topographies work. I'll explain below:
daCosta wrote:
1- Radio is a shared medium and forces everyone to stay silent while one person holds the stage. Wired networks, on the other hand, can and do hold multiple simultaneous conversations.
2- In a single radio ad hoc mesh network, the best you can do is (1/2)^n at each hop. So in a multi hop mesh network, the maximum bandwidth available to you degrades at the rate of 1/2, 1/4, 1/8. By the time you are 4 hops away the max you can get is 1/16 of the total available bandwidth.
This problem exists only when all transceivers within a mesh topography "see" each other. And herein is the flaw in the argument. Within a mesh network Request To Sends (RTSs) do silence nodes within range; however this degradation moves in waves--so if part of a mesh consisted of 7 nodes (of which G is connected to the Internet):
A ----- B ----- C ----- D ----- E ----- F ----- G ----- Internet Connection
Here's what would happen. A would pass a packet to B; when B passed a packet to C, A couldn't talk--thus the 1/2 reduction in throughput; when C passed it to D, the same problem would occur for both A & B (thus a 1/4 throughput); likewise for D to E (because D would silence A, B, & C), thus a 1/8th throughput. However, when E passes a packet to F, A is unaffected, when F passes a packet to G, both A & B are unaffected. Thus, in this solution, throughput would theoretically max out at 1/8th (which is probably still much more throughput than the average Internet connection--where the usual bottleneck resides).
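To make the two models concrete, here is a minimal Python sketch. It compares daCosta's halving rule, (1/2)^n after n hops, with the "wave" behavior described above, where only the relays that can silence the source cost it throughput. The function names and the `silencing_relays` parameter are hypothetical; the value 3 is taken from the example (B, C, and D silence A; E and F do not), not from any measurement.

```python
from fractions import Fraction


def dacosta_share(hops: int) -> Fraction:
    """daCosta's rule: usable bandwidth halves at every hop."""
    return Fraction(1, 2) ** hops


def wave_share(hops: int, silencing_relays: int = 3) -> Fraction:
    """Wave model from the A..G example (an assumption-laden sketch).

    The first hop (A to B) costs nothing; each later relay halves the
    source's throughput, but only while that relay is close enough to
    silence the source. silencing_relays=3 mirrors the example above.
    """
    relays = max(hops - 1, 0)
    return Fraction(1, 2) ** min(relays, silencing_relays)


for hops in range(1, 7):
    print(hops, dacosta_share(hops), wave_share(hops))
```

For the six-hop path from A to G, the wave model stays at 1/8 while the halving rule keeps shrinking to 1/64. Both are idealized upper bounds, not predictions for a real deployment.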
What this really points to is the need for power control in radios (which is something that CUWiN wants to work on), smart antennas, and other innovations that help to create wireless topographies where as few radios as possible "overlap." I've written about some of these solutions in a paper that I'm adapting for a book chapter -- you can download this.
3- That does not sound too bad when you are putting together a wireless sensor network with limited bandwidth and latency considerations. It is DISASTROUS if you wish to provide the level of latency/throughput people are accustomed to with their wired networks. Consider the case of just 10 client stations at each node of a 4 hop mesh network. The clients at the last rung will receive -at best- 1/(160,000) of the total bandwidth at the root.
This simply points out the need to separate inter- and intra-nodal communications architectures--a problem that CUWiN has both already identified and implemented.
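The arithmetic behind the 10-client example in point 3 can be checked with a short Python sketch. daCosta does not show his derivation, so both readings below are assumptions: one divides the bandwidth among 10 clients at every hop, the other only at the last node. The function name and the `divide_at_every_hop` flag are hypothetical.

```python
from fractions import Fraction

HOPS = 4
CLIENTS_PER_NODE = 10


def per_client_share(hops: int, clients_per_node: int,
                     divide_at_every_hop: bool) -> Fraction:
    """Best-case share of root bandwidth for one last-rung client."""
    radio_share = Fraction(1, 2) ** hops  # halving at each hop
    if divide_at_every_hop:
        # Assumed reading 1: clients contend at every hop along the path.
        return radio_share / clients_per_node ** hops
    # Assumed reading 2: only the last node's clients split the share.
    return radio_share / clients_per_node


print(per_client_share(HOPS, CLIENTS_PER_NODE, True))   # 1/160000
print(per_client_share(HOPS, CLIENTS_PER_NODE, False))  # 1/160
```

The first reading gives 1/160,000, which matches the digits of the quoted figure. The second gives 1/160. Either way, the result depends on clients and backhaul sharing one radio, which is the coupling that separating inter- and intra-nodal communications is meant to remove.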
4- Why has this not been noticed as yet? Because first there are not a lot of mesh networks around and second, they have not been tested under high usage situations. Browsing and email don't count. Try video -- where both latency and bandwidth matter -- or VOIP where the bandwidth is a measly 64Kbps but where latency matters. Even in a simple 4 hop ad hoc mesh network with 10 clients, VOIP phones won't work well beyond the first or second hop -- the latency and jitter caused by CSMA/CA contention windows (how wireless systems avoid collisions) will be unbearable.
I do agree that QoS problems continue to plague most mesh wireless networks. It's a problem that needs to be solved and that most deployments and commercial (and open source) solutions sidestep. I think Francis is wise to blow the whistle on this deployment problem; I think that many commercial mesh systems have been way oversold--which will only make the problem worse.
I am constantly amazed at how little most wireless companies know about the physics, software, and hardware of the networks they deploy. Most don't even realize that if they're using routing protocols that use Standard Link State they're going to crash and burn when they scale up. For a quick graphic of the problem, just check out page 29 (labeled page 26) of this link.
This is why CUWiN is creating an A-HSLS (Adaptive Hazy Sighted Link State) protocol (as far as we know, the only open source A-HSLS protocol). We believe that routing overhead will kill networks well before throughput does.
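For readers unfamiliar with the "hazy sighted" idea, here is a rough Python sketch of the update schedule usually described for plain HSLS: a node sends a link-state update with TTL 2 every period, TTL 4 every second period, TTL 8 every fourth period, and so on, so distant nodes hear about changes less often. This is an assumption based on the published HSLS description. It is not CUWiN's A-HSLS, whose details are not given here, and the function name and `max_ttl` cap are hypothetical.

```python
def hsls_ttl(tick: int, max_ttl: int = 16) -> int:
    """TTL of the link-state update sent at a given tick.

    tick counts update periods starting at 1. max_ttl is a hypothetical
    cap standing in for "far enough to reach the whole network".
    """
    if tick < 1:
        raise ValueError("tick must be >= 1")
    largest_power_of_two = tick & -tick  # largest power of 2 dividing tick
    return min(2 * largest_power_of_two, max_ttl)


schedule = [hsls_ttl(t) for t in range(1, 9)]
print(schedule)  # [2, 4, 2, 8, 2, 4, 2, 16]
```

Half of the updates in this schedule travel only two hops. Standard link state, by contrast, floods every update to every node.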
I am optimistic that solutions will be forthcoming. What we really need today are "altruistic venture capitalists"--folks who are interested in investing in the public good -- people who will support the development of CUWiN (or other open-source projects that are working on these solutions) so that we can build mesh wireless systems that not only work and scale, but exceed our current expectations of what we, today, believe is possible.