B.A.T.M.A.N. V status update

Added by Marek Lindner over 13 years ago

In November 2009, when B.A.T.M.A.N. developers and wireless enthusiasts came together in Brussels (WBMv2), the first concepts and ideas on how to improve the B.A.T.M.A.N. IV routing algorithm emerged, and after a couple of days of healthy discussion a first draft came to light. One year later, it is time to ask: What has been accomplished since then? What is working right now? Where are we going from here?

This overview tries to present the state of development regarding the various B.A.T.M.A.N. V ideas/concepts. Some are already part of a release, others are in the design / implementation phase and a couple more are in the state of "vague idea". Feel free to join any topic and discuss with us to get your idea heard as well!

Mesh bonding mode

Status: implemented/stable

The bonding mode was one of the first B.A.T.M.A.N. V features to be implemented. It was tested during the WBMv3 in Italy and found its way into the 201010 release.

However, our tests have shown that bonding mode achieves maximum throughput over single hops only. For multi-hop paths, interface alternating (see the next section) performs better.

Incoming interface based routing / Interface alternating

Status: implemented/stable

Our original concept of how the incoming interface based routing (later called "interface alternating") should work turned out to be susceptible to routing loops, so the ideas had to be revised. While implementing the bonding mode it became apparent that these two features share the same underlying problem: finding a neighbor that also has several interfaces with which a node can communicate simultaneously. As a result, the same functionality was achieved by implementing interface alternating as a variation of bonding. Consequently, bonding mode and interface alternating were released together. Check our documentation section if you are interested in finding out how to use these features and how they work internally.

Neighborhood Discovery Protocol

Status: implemented/experimental

B.A.T.M.A.N. IV uses a single message type (Originator Message - OGM) to solve two problems: measuring the link quality to direct (single hop) neighbors and propagating the path qualities through the mesh. Although simple, this approach leaves room for optimizations, especially in dense and/or mobile networks (see "overhead reduction" in the outlook). B.A.T.M.A.N. V introduces the Neighborhood Discovery Protocol (NDP) in addition to the OGMs. NDP takes care of the link quality measurements while OGMs take care of propagating the path qualities. This has the following advantages:
  • Modularization of the code
  • With an NDP interval shorter than the OGM interval, the change between two consecutive OGMs can be larger, so a single OGM has more influence.
  • Separate optimization strategies can then be used for NDP and OGMs individually.
  • The sparser the network is (number of single hop neighbors significantly smaller than the number of all nodes), the shorter the NDP interval can be chosen relative to the OGM interval, resulting in major convergence speed improvements in sparse networks.
  • As NDP messages are never rebroadcast, NDP also helps to reduce the overhead in dense networks (many single hop neighbors).
  • (Mobile nodes can then choose a faster OGM + NDP interval. This directly improves their transmit path, which was not the case with the OGM LQ measurements due to the echo-quality dependency, and indirectly improves their receive path a little, due to the asymmetric penalty.)
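
To make the overhead argument more tangible, here is a back-of-the-envelope sketch that counts transmissions per second. It rests on a deliberately simplified model that is our assumption, not part of the design: every node transmits each originator's OGM exactly once per OGM interval, while NDP messages are sent once per NDP interval and never rebroadcast. The function names, node count and intervals are purely illustrative.

```python
def ogm_tx_per_second(num_nodes: int, ogm_interval_s: float) -> float:
    """Transmissions per second caused by OGM flooding (simplified model).

    Assumption: each of the num_nodes originators emits one OGM per interval
    and every node (including the originator) transmits it exactly once.
    """
    return num_nodes * num_nodes / ogm_interval_s


def ndp_tx_per_second(num_nodes: int, ndp_interval_s: float) -> float:
    """Transmissions per second caused by NDP: one broadcast per node, no rebroadcast."""
    return num_nodes / ndp_interval_s


if __name__ == "__main__":
    nodes = 50  # illustrative network size
    # Single message type: link sensing and path propagation share a 1 s OGM interval.
    combined = ogm_tx_per_second(nodes, 1.0)
    # Split (hypothetical values): OGMs every 5 s, NDP probing every 0.5 s.
    split = ogm_tx_per_second(nodes, 5.0) + ndp_tx_per_second(nodes, 0.5)
    print(f"OGM only: {combined:.0f} tx/s, OGM + NDP: {split:.0f} tx/s")
```

In this model the NDP cost grows linearly with the number of nodes while the OGM cost grows quadratically, which is why link sensing can run faster once it no longer rides on flooded messages.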

The first set of patches has been posted but the work is not complete yet. We hope to have it ready for the WBMv4.

Weighted LQ measurements

Status: design phase

Currently, B.A.T.M.A.N. IV uses a plain, unweighted window for counting received packets and calculating the link quality from this information. However, these link quality measurements turned out to be the major convergence speed issue (with the default values, batman-adv needs 64 seconds to get from 0 to 100% link quality). The idea is to give newer packets more importance in the counting and measuring process, either via a weighted window or an exponentially weighted average.
  • Improves convergence speed due to more responsive link quality measurements in mobile scenarios.
  • NDP with faster intervals + weighted LQ measurements complement each other nicely for faster LQ measurements.
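
The difference between the two measurement styles can be sketched as follows. This is only an illustration, not the batman-adv implementation: it assumes a 64-slot window fed with one packet per second (consistent with the 64 seconds quoted above), and the smoothing factor `alpha` as well as all class and function names are hypothetical.

```python
from collections import deque


class SlidingWindowLQ:
    """Plain window: link quality is the fraction of received packets in the last N slots."""

    def __init__(self, window_size: int = 64) -> None:
        # Start with an empty history: every slot counts as "not received".
        self.window = deque([0] * window_size, maxlen=window_size)

    def update(self, received: bool) -> float:
        self.window.append(1 if received else 0)
        return sum(self.window) / len(self.window)


class EwmaLQ:
    """Exponentially weighted moving average: newer packets weigh more."""

    def __init__(self, alpha: float = 0.1) -> None:
        self.alpha = alpha  # hypothetical smoothing factor, not a batman-adv default
        self.quality = 0.0

    def update(self, received: bool) -> float:
        sample = 1.0 if received else 0.0
        # Move the estimate a fraction alpha towards the newest sample.
        self.quality += self.alpha * (sample - self.quality)
        return self.quality


def packets_until(estimator, threshold: float, limit: int = 1000) -> int:
    """Count how many consecutively received packets are needed to reach the threshold."""
    for count in range(1, limit + 1):
        if estimator.update(True) >= threshold:
            return count
    return limit


if __name__ == "__main__":
    print("window, 90%:", packets_until(SlidingWindowLQ(), 0.9))
    print("EWMA,   90%:", packets_until(EwmaLQ(alpha=0.1), 0.9))
```

With these illustrative parameters the weighted average crosses 90% after far fewer packets than the plain window. The price is that an exponentially weighted average only approaches 100% asymptotically and reacts more nervously to single losses, so the weighting would need careful tuning.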

Dead node fast path switching/invalidating

Status: vague idea

When a node notices the breakdown of a neighbor (see routing scenarios to get an idea about the conditions), it could redirect any data packet that it would usually send via this neighbor. One option is to send the packet to its second best next hop, if available (which is not always the case due to the OGM forwarding policies). Alternatively, it could send the packet back to the next hop towards the source. With the help of sequence numbers, any node on this 'backtracking' path (which can differ from the usual path in case of asymmetric links) could then notice very quickly that a path became invalid.
  • Quick response in case of node/link failures
  • Avoiding packet drops in case of rapid link failures
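
Since this is still a vague idea, the following is only a rough sketch of the forwarding decision described above. All names (`RouteEntry`, `Packet`, `pick_next_hop`) are hypothetical, and the sequence number handling that would let nodes on the backtracking path invalidate the route is left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class RouteEntry:
    """Hypothetical per-destination state: candidate next hops, best first."""
    next_hops: List[str] = field(default_factory=list)


@dataclass
class Packet:
    source: str
    destination: str
    backtracking: bool = False  # set once the packet travels back towards its source


def first_alive(entry: Optional[RouteEntry], dead: Set[str]) -> Optional[str]:
    """Return the best next hop that is not known to be dead, if any."""
    for hop in (entry.next_hops if entry else []):
        if hop not in dead:
            return hop
    return None


def pick_next_hop(packet: Packet, routes: Dict[str, RouteEntry],
                  dead: Set[str]) -> Optional[str]:
    # 1. Best (or second best) living next hop towards the destination.
    hop = first_alive(routes.get(packet.destination), dead)
    if hop is not None:
        return hop
    # 2. No alternative: send the packet back towards its source so that
    #    upstream nodes can learn that this path became invalid.
    packet.backtracking = True
    return first_alive(routes.get(packet.source), dead)


if __name__ == "__main__":
    routes = {"D": RouteEntry(["B", "C"]), "S": RouteEntry(["A"])}
    pkt = Packet(source="S", destination="D")
    print(pick_next_hop(pkt, routes, dead={"B"}))       # falls back to the second best hop
    print(pick_next_hop(pkt, routes, dead={"B", "C"}))  # backtracks towards the source
```

Either fallback lets the packet keep moving instead of being dropped while the routing tables catch up with the failure.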

Host Network Announcements (HNA) and Roaming

Status: (partly) implemented/experimental

So far, the MAC address of every host in the mesh network (either a bridged-in host or the batman-adv node itself on bat0) is announced proactively and periodically: these MAC addresses are simply attached to every OGM. A (semi-)reactive approach shall be introduced, as the HNA information usually does not change that frequently. It further has the following advantages:
  • Minimizing size of OGMs, reducing overhead
    -> Reduces overhead a lot in case of whole wired networks being bridged into the mesh network
  • Allowing many, many, many more hosts to be bridged into the network without affecting the routing protocol's performance in a negative way.
  • Fast HNA handover: Instead of having to wait for a new OGM and its HNA entries to notice the roaming of a host, a reactive technique shall minimize this time frame needed for the roaming.
  • Any layer 2 implementation (such as batman-adv) would be able to use this new mechanism to reduce the hand-over time to nearly zero as packets can be safely redirected from the old node in charge of the host to the new one (usually the new node in charge will notice the roaming earlier than the sender of data, as the new node is usually physically closer to the old node) until the mesh is in sync again.
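
The redirect idea from the last point could look roughly like the sketch below. It is not taken from the posted drafts: the class `MeshNode`, its methods and the returned action tuples are hypothetical names chosen for illustration, and the signalling between the nodes is omitted.

```python
from typing import Dict, Optional, Set, Tuple


class MeshNode:
    """Hypothetical mesh node that tracks which clients it is in charge of."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.local_clients: Set[str] = set()
        # Clients that recently roamed away: MAC address -> new node in charge.
        self.roamed_to: Dict[str, str] = {}

    def attach_client(self, mac: str) -> None:
        """A client showed up on this node (initially or after roaming)."""
        self.local_clients.add(mac)
        self.roamed_to.pop(mac, None)

    def client_roamed(self, mac: str, new_node: str) -> None:
        """The new node in charge told us that our client moved over to it."""
        self.local_clients.discard(mac)
        self.roamed_to[mac] = new_node

    def mesh_in_sync(self, mac: str) -> None:
        """All senders know the new location; the redirect entry can be dropped."""
        self.roamed_to.pop(mac, None)

    def handle_packet(self, dst_mac: str) -> Tuple[str, Optional[str]]:
        """Decide what to do with a packet that arrived here for dst_mac."""
        if dst_mac in self.local_clients:
            return ("deliver", self.name)
        if dst_mac in self.roamed_to:
            # The sender still uses stale information: pass the packet on.
            return ("redirect", self.roamed_to[dst_mac])
        return ("drop", None)


if __name__ == "__main__":
    old, new = MeshNode("A"), MeshNode("B")
    laptop = "02:00:00:00:00:01"
    old.attach_client(laptop)
    new.attach_client(laptop)          # the laptop roams to node B
    old.client_roamed(laptop, "B")
    print(old.handle_packet(laptop))   # packet is redirected instead of lost
```

The redirect entry only has to live until the rest of the mesh has learned the new location, which keeps the extra state on the old node small.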

This task is work in progress: a first draft describing the full mechanism has been posted, and a second draft has been posted since.

Multicast optimizations

Status: implemented/experimental

B.A.T.M.A.N. IV has no explicit multicast support, therefore batman-adv does not treat multicast packets any differently from broadcast packets: they just get flooded through the mesh. To make services like audio/video transfer feasible, the retransmissions should be reduced. Two different approaches were proposed around the same time:
  • The first scheme offers the possibility to 'flood' multicast packets on the corresponding unicast paths only, for nodes which are both sender and receiver of the same group.
  • The second multicast optimization approach is non-group-specific and works for both multicast and broadcast packets.

Both approaches share similar concepts but are not identical. The two groups have exchanged many ideas and will probably merge the best of both approaches in the weeks to come.

Thanks again for the tremendous input and sharing of experiences, ideas and code! Seeing all these ideas coming together is what makes this project worthwhile!

Happy routing,

The B.A.T.M.A.N. team