Bug #204

closed

Forwarding fragmented Packets partly broken in batman-adv 2014.4

Added by Ruben Kelevra about 9 years ago. Updated about 7 years ago.

Status: Rejected
Priority: Urgent
Assignee: -
Target version: -
Start date: 02/26/2015
Due date:
% Done: 0%
Estimated time:

Description

Using batman-adv 2014.4

Router A has a tun device with MTU 1312.

There is a dual-band mesh between Router A and Router B with MTU 1312. Both routers are TP-Link WDR3600s.

root@A:~# batctl f
enabled

root@B:~# batctl f
enabled

Ping from A to an IPv6 address:

root@A:~# ping6 fda0:747e:ab29:2144::c02 -s 1800
PING fda0:747e:ab29:2144::c02 (fda0:747e:ab29:2144::c02): 1800 data bytes
1808 bytes from fda0:747e:ab29:2144::c02: seq=0 ttl=64 time=13.813 ms

Ping from B to the same IPv6 address:

root@B:~# ping6 fda0:747e:ab29:2144::c02 -s 1800
PING fda0:747e:ab29:2144::c02 (fda0:747e:ab29:2144::c02): 1800 data bytes
^C
--- fda0:747e:ab29:2144::c02 ping statistics ---
3 packets transmitted, 0 packets received, 100% packet loss

The largest -s value that works from B is 1428:

root@B:~# ping6 fda0:747e:ab29:2144::c02 -s 1428
PING fda0:747e:ab29:2144::c02 (fda0:747e:ab29:2144::c02): 1428 data bytes
1436 bytes from fda0:747e:ab29:2144::c02: seq=0 ttl=64 time=18.166 ms

###

With Dualband-Mesh MTU 1532:

root@B:~#  ping6 fda0:747e:ab29:2144::c02 -s 1800
PING fda0:747e:ab29:2144::c02 (fda0:747e:ab29:2144::c02): 1800 data bytes
^C
--- fda0:747e:ab29:2144::c02 ping statistics ---
2 packets transmitted, 0 packets received, 100% packet loss
root@10feed40a81e:~# ping6 fda0:747e:ab29:2144::c02 -s 1428
PING fda0:747e:ab29:2144::c02 (fda0:747e:ab29:2144::c02): 1428 data bytes
1436 bytes from fda0:747e:ab29:2144::c02: seq=0 ttl=64 time=92.526 ms
1436 bytes from fda0:747e:ab29:2144::c02: seq=1 ttl=64 time=14.691 ms
^C
--- fda0:747e:ab29:2144::c02 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 14.691/53.608/92.526 ms
root@10feed40a81e:~# ping6 fda0:747e:ab29:2144::c02 -s 1429
PING fda0:747e:ab29:2144::c02 (fda0:747e:ab29:2144::c02): 1429 data bytes
^C
--- fda0:747e:ab29:2144::c02 ping statistics ---
3 packets transmitted, 0 packets received, 100% packet loss

So the mesh MTU does not change anything. Router A is able to send fragmented packets with -s above 1500, and Router B is not.
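For reference, the `-s` values above can be translated into IPv6 packet sizes. This is a minimal sketch, assuming the standard 8-byte ICMPv6 echo header and the fixed 40-byte IPv6 header with no extension headers. The function name and the 1500-byte comparison value (the client-interface MTU mentioned further down in this thread) are illustrative assumptions, not measurements.

```python
ICMPV6_ECHO_HEADER = 8   # bytes; consistent with "1808 bytes from ..." for -s 1800
IPV6_HEADER = 40         # bytes; fixed IPv6 header, no extension headers assumed


def ipv6_packet_size(ping_payload: int) -> int:
    """Size of the unfragmented IPv6 packet produced by `ping6 -s <ping_payload>`."""
    return ping_payload + ICMPV6_ECHO_HEADER + IPV6_HEADER


if __name__ == "__main__":
    mtu = 1500  # assumed comparison value (client-interface MTU), not measured
    for size in (1428, 1429, 1800):
        total = ipv6_packet_size(size)
        verdict = "fits" if total <= mtu else "needs IP fragmentation"
        print(f"-s {size}: {total} bytes ({verdict} at MTU {mtu})")
```

Under these assumptions both `-s 1428` and `-s 1429` stay below 1500 bytes, so the 1428 limit does not coincide with the point where IP fragmentation at MTU 1500 would start.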


Files

16«.png (27.1 KB), Ruben Kelevra, 03/13/2015 06:24 PM
nodeA_ping_filtered.pcap (3.36 KB), fastd packets of one ping round trip, Ruben Kelevra, 03/13/2015 06:31 PM
nodeB_ping_filtered.pcap (1.61 KB), fastd packets of one broken ping, Ruben Kelevra, 03/13/2015 06:31 PM
batman-adv package.pcapng (432 Bytes), Ruben Kelevra, 03/13/2015 07:04 PM
Actions #1

Updated by Marek Lindner about 9 years ago

I see various ping commands, but it is not clear to me what you think isn't working the way it should, and I need more information about your test setup. Would you mind adding that? Thanks!

Actions #2

Updated by Ruben Kelevra about 9 years ago

Hey Marek,

We have the following setup:

gateway -- (fastd) -- router A -- (adhoc wifi) -- router B

The client interfaces on batman-adv all have MTU 1500, so batman-adv has to be able to transport packets of up to 1500 bytes.

This worked on batman-adv 2014.2, but routers A and B were upgraded today.

Traffic over the fastd link has to be fragmented by batman-adv to fit MTU 1312. This seems to work, since this command works:

ping6 fda0:747e:ab29:2144::c02 -s 1800

The IPv6 fda0:747e:ab29:2144::c02 is the gateway.

On the second router, with the normal settings, the ad-hoc Wi-Fi link transmits the packet without any batman-adv fragmentation. Because the MTU on the client interface is 1500, the packet is first split to fit that MTU by IP fragmentation. After that, it is transmitted via batman-adv without fragmentation, because the ad-hoc Wi-Fi MTU is 1532.

When these packets reach router A, router A needs to fragment them, because the next MTU is 1312, so batman-adv should fragment them at this point. This does not seem to work.

In addition, I changed the ad-hoc Wi-Fi to MTU 1312, so that router B's batman-adv has to fragment the packets locally. That means router A only has to forward the already fragmented packets, but this doesn't work either.
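The IP fragmentation step described above can be sketched as follows. This assumes standard IPv6 fragmentation (a 40-byte IPv6 header plus an 8-byte Fragment header per fragment, with fragment data in multiples of 8 bytes except for the last fragment) and no other extension headers. It does not model batman-adv's own fragmentation, and `ipv6_fragment_sizes` is a hypothetical helper name.

```python
from typing import List

IPV6_HEADER = 40      # bytes, fixed IPv6 header
FRAGMENT_HEADER = 8   # bytes, IPv6 Fragment extension header


def ipv6_fragment_sizes(upper_layer_len: int, mtu: int) -> List[int]:
    """Return the IPv6 packet sizes needed to carry `upper_layer_len` bytes
    (e.g. ICMPv6 header + ping payload) over a link with the given MTU."""
    if upper_layer_len + IPV6_HEADER <= mtu:
        return [upper_layer_len + IPV6_HEADER]  # fits, no fragmentation

    # Every fragment except the last carries a multiple of 8 data bytes.
    max_data = (mtu - IPV6_HEADER - FRAGMENT_HEADER) // 8 * 8
    if max_data <= 0:
        raise ValueError("MTU too small to carry any fragment data")

    sizes = []
    remaining = upper_layer_len
    while remaining > 0:
        data = min(remaining, max_data)
        sizes.append(data + IPV6_HEADER + FRAGMENT_HEADER)
        remaining -= data
    return sizes


if __name__ == "__main__":
    # ping6 -s 1800 -> 1800 + 8 bytes of ICMPv6, client-interface MTU 1500
    print(ipv6_fragment_sizes(1800 + 8, 1500))
```

Under these assumptions, `ping6 -s 1800` at MTU 1500 yields two IPv6 packets of 1496 and 408 bytes. The first one is still larger than 1312 bytes, so it cannot cross the 1312 link without further fragmentation below the IP layer.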

Actions #3

Updated by Ruben Kelevra about 9 years ago

I'm sorry but I have to confirm this problem for 2014.3 too.

Gateway over tunnel, then one Wi-Fi hop (over a 2014.3 node) to a 2014.3 node:

[ruben@rig ~]$ ping6 fda0:747e:ab29:2144:ffff:10fe:ed40:a81e -s 1500
PING fda0:747e:ab29:2144:ffff:10fe:ed40:a81e(fda0:747e:ab29:2144:ffff:10fe:ed40:a81e) 1500 data bytes
^C
--- fda0:747e:ab29:2144:ffff:10fe:ed40:a81e ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 2000ms

Gateway over tunnel, then one Wi-Fi hop (over a 2014.2 node) to a 2014.2 node:

[ruben@rig ~]$ ping6 fda0:747e:ab29:2196:ffff:c04a:000b:6cc6 -s 1500
PING fda0:747e:ab29:2196:ffff:c04a:000b:6cc6(fda0:747e:ab29:2196:ffff:c04a:b:6cc6) 1500 data bytes
1508 bytes from fda0:747e:ab29:2196:ffff:c04a:b:6cc6: icmp_seq=1 ttl=64 time=88.7 ms
1508 bytes from fda0:747e:ab29:2196:ffff:c04a:b:6cc6: icmp_seq=3 ttl=64 time=75.4 ms
1508 bytes from fda0:747e:ab29:2196:ffff:c04a:b:6cc6: icmp_seq=4 ttl=64 time=141 ms
1508 bytes from fda0:747e:ab29:2196:ffff:c04a:b:6cc6: icmp_seq=5 ttl=64 time=244 ms
1508 bytes from fda0:747e:ab29:2196:ffff:c04a:b:6cc6: icmp_seq=6 ttl=64 time=75.1 ms
^C
--- fda0:747e:ab29:2196:ffff:c04a:000b:6cc6 ping statistics ---
6 packets transmitted, 5 received, 16% packet loss, time 5004ms
rtt min/avg/max/mdev = 75.159/125.077/244.183/64.404 ms

Actions #4

Updated by Ruben Kelevra about 9 years ago

I'm sorry, I have to admit that I hadn't confirmed that the right version was built; my last post still refers to batman-adv 2014.4. I am now building a new firmware with 2014.3 to test whether the problem was introduced with 2014.4.

Actions #5

Updated by Marek Lindner about 9 years ago

Thanks for the info. Let me try to describe the setup in my own words based on what I understood. Feel free to correct me.

We have a chain of batman-adv nodes using different MTUs (whether they are running fastd or not isn't all that relevant).

nodeA -- 1312 -- nodeB -- 1532 -- nodeC -- 1532 -- nodeD

Note that I am referring to the MTUs configured on the hard-interfaces to avoid confusion. The MTU on the bat0 interface is based on the lowest MTU of all configured hard-interfaces. If packets are forwarded from one hard-iface to another, the bat0 MTU does not matter.

In this setup, nodeB is the only batman-adv node with 2 different hard-iface MTUs.
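The chain above can be written down as data to check which nodes have hard-interfaces with differing MTUs. This is only an illustrative sketch: it assumes each link corresponds to one hard-interface on each of its two nodes, both configured with the MTU shown on the link, and the helper names are hypothetical.

```python
from collections import defaultdict

# Chain as drawn above: (node, node, MTU of the hard-interfaces on that link).
LINKS = [
    ("nodeA", "nodeB", 1312),
    ("nodeB", "nodeC", 1532),
    ("nodeC", "nodeD", 1532),
]


def hard_iface_mtus(links):
    """Map each node to the list of MTUs of its hard-interfaces."""
    mtus = defaultdict(list)
    for left, right, mtu in links:
        mtus[left].append(mtu)
        mtus[right].append(mtu)
    return dict(mtus)


def nodes_with_mixed_mtus(links):
    """Nodes whose hard-interfaces do not all share the same MTU."""
    per_node = hard_iface_mtus(links)
    return sorted(node for node, mtus in per_node.items() if len(set(mtus)) > 1)


if __name__ == "__main__":
    for node, mtus in hard_iface_mtus(LINKS).items():
        # The lowest value is the one the bat0 MTU is said to be based on.
        print(node, mtus, "lowest:", min(mtus))
    print("mixed MTUs:", nodes_with_mixed_mtus(LINKS))
```

For the chain as drawn, only nodeB is reported as having mixed MTUs, which matches the statement above.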

Here are my questions:
  • When pinging from nodeA towards nodeD, which node can you successfully ping with a standard MTU of 1500 bytes?
  • Why do you test with a packet size of 1800 bytes? To enforce fragmentation? Is that relevant for a real use case or is there another reason for that magic number?
  • Did you manage to verify whether or not the MTU problem is a regression? Or is that yet to come?
Actions #6

Updated by Ruben Kelevra about 9 years ago

Hey Marek,

you got one thing wrong: there is no nodeD.

We got:

nodeA -- 1312 -- nodeB -- 1532 -- nodeC

Apart from that, I just tested the same setup with batman-adv 2014.3, with the same result as 2014.4.

  • When pinging from nodeA towards nodeC, which node can you successfully ping with a standard MTU of 1500 bytes?
    Batman-adv 2014.4: nodeB
    Batman-adv 2014.3: nodeB
    Batman-adv 2014.2: nodeB and nodeC

  • Why do you test with a packet size of 1800 bytes? To enforce fragmentation? Is that relevant for a real use case or is there another reason for that magic number?
    1800 bytes is large enough to get two packets of nearly the same size ...

  • Did you manage to verify whether or not the MTU problem is a regression? Or is that yet to come?
    Next I am going to build the same OpenWRT with batman-adv 2014.2, which worked before, to be sure this is a bug in batman-adv itself.

Best regards

Ruben

Actions #7

Updated by Marek Lindner about 9 years ago

After checking the 2014.3 changelog / patches I still don't see how an MTU behavior change could be introduced. But it may still be there. Let me know how your test goes.

If you feel confident enough that I can observe the same problem without running fastd I can also try to replicate it.

Actions #8

Updated by Ruben Kelevra about 9 years ago

Yesterday I tested what exactly is being transmitted on the supernode's eth0, so I changed the fastd encryption to null and captured with tcpdump:

The ping from NodeA to the supernode is on the left, and the ping from NodeB to the supernode is on the right. Both are IPv6 with "-s 1452". As you can see, the second fragment was truncated, so the packet seems to have been altered, and since its size is not correct, it gets dropped (by batman-adv?).

Actions #10

Updated by Ruben Kelevra about 9 years ago

And this is the second part of the NodeB ping as an Ethernet/batman-adv packet, without the fastd, UDP, IP and Ethernet headers.

Actions #11

Updated by Marek Lindner about 9 years ago

Coming back to my questions:

  • Have you confirmed 2014.2 does indeed not exhibit the problem ? It would be immensely helpful to know if this is a regression or not.
  • Can we replicate the fragmentation issue without running fastd ? Or: What is the easiest setup to replicate the matter ?
Actions #12

Updated by Ruben Kelevra about 9 years ago

Please close this bug as invalid: with an older version of OpenWRT trunk this issue does not appear, so the problem lies there.

The latest version I tested that works is r43777.

Thanks for all your help, and sorry for the wasted time. I will now try to find this bug in the OpenWRT repo.

Actions #13

Updated by Marek Lindner about 9 years ago

  • Status changed from New to Closed

Ok, thanks for the info!

Actions #14

Updated by Sven Eckelmann about 7 years ago

  • Status changed from Closed to Rejected