Bug #334

B.A.T.M.A.N. V throughput meter apparently broken on v2017.0 on D-Link DIR-810l

Added by Alvaro Antelo over 2 years ago. Updated about 2 years ago.

Status: Closed
Priority: Normal
Assignee:
Target version:
Start date: 04/26/2017
Due date:
% Done: 0%


Description

Dear Sirs,

I have been experimenting with batman-adv V on various platforms (WDR-3600, WR-740, DIR-505 and DIR-810l) on LEDE 17.01 with great success. Recently I upgraded to the master branch of LEDE (r4018-4b195a6), compiled with batman-adv 2017.0, and the D-Link DIR-810l lost all connectivity, while the other (Atheros) platforms had no problems whatsoever.

I am not sure whether this problem is LEDE or batman related, but while poking around on the DIR-810l console I collected some logs, which I am posting here in the hope that you can shed some light on it. Reverting the DIR-810L firmware to LEDE 17.01 and batman 2016.5 solved the problem.

The affected device received ELP packets from its neighbours and showed unusually high throughput readings, but it could not batman-ping the other devices and none of them received any packets from it.
Attached are some logs and configs (OK and NOK).

The only thing that caught my attention was this log message:
ieee80211 phy0: rt2800_config_channel: Warning - Using incomplete support for external PA

The chipset detection also appears to be wrong:
On master branch: ieee80211 phy0: rt2x00_set_rt: Info - RT chipset 6352, rev 0500 detected
On 17.01: ieee80211 phy0: rt2x00_set_rt: Info - RT chipset 5390, rev 0500 detected

Best regards,

Alvaro Antelo

banner_NOK.txt View - LEDE Version NOK (474 Bytes) Alvaro Antelo, 04/26/2017 07:47 AM

banner_OK.txt View - LEDE Version OK (480 Bytes) Alvaro Antelo, 04/26/2017 07:47 AM

batctl_log_NOK.txt View (8.31 KB) Alvaro Antelo, 04/26/2017 07:47 AM

batctl_log_OK.txt View (7.61 KB) Alvaro Antelo, 04/26/2017 07:47 AM

batctl_NOK.txt View (1.82 KB) Alvaro Antelo, 04/26/2017 07:47 AM

batctl_p_NOK.txt View (824 Bytes) Alvaro Antelo, 04/26/2017 07:47 AM

batctl_OK.txt View (1.85 KB) Alvaro Antelo, 04/26/2017 07:47 AM

batctl_p_OK.txt View (726 Bytes) Alvaro Antelo, 04/26/2017 07:47 AM

batman_config.txt View - same batman-adv config for both versions (644 Bytes) Alvaro Antelo, 04/26/2017 07:47 AM

cpuinfo.txt View (731 Bytes) Alvaro Antelo, 04/26/2017 07:47 AM

History

#1 Updated by Sven Eckelmann over 2 years ago

  • Status changed from New to Feedback
  • Assignee changed from batman-adv developers to Alvaro Antelo

Sorry, but I cannot find anything throughput meter related in your ticket (besides the mention in the title). The "batctl o" output you have shown displays the data batman-adv got from cfg80211. So please check first whether `iw dev ... station dump` shows the right results or the same wrong "throughput" information.

PS: It looks like your driver misbehaves (as the rest of the logs show). I would therefore recommend investigating in this direction.

#2 Updated by Alvaro Antelo over 2 years ago

Sven,

Yes, you are right, it seems I have a problem with the driver. I will contact the LEDE forum and try to find which commits changed the driver and broke it.

Regarding throughput, it is probably only a symptom, as the main problem is a complete lack of connectivity.
As you can see below, figures of around 30 Gbit/s are reported (and selected), which is impossible:

root@node-6:~# batctl o
[B.A.T.M.A.N. adv 2017.0, MainIF/MAC: adhoc0/00:11:22:00:35:51 (bat0/b6:e2:5e:78:a6:94 BATMAN_V)]
  Originator last-seen ( throughput) Nexthop [outgoingIF]
  90:f6:52:ce:a3:9c 0.100s ( 5.8) f4:f2:6d:5a:87:c0 [ adhoc0]
  90:f6:52:ce:a3:9c 0.100s ( 10.6) ec:08:6b:ec:3f:a4 [ adhoc0]
  90:f6:52:ce:a3:9c 0.100s ( 11.7) f4:f2:6d:5a:74:ac [ adhoc0]
* 90:f6:52:ce:a3:9c 0.100s ( 29199.8) 90:f6:52:ce:a3:9c [ adhoc0]
  f4:f2:6d:5a:87:c0 0.060s ( 7.5) 90:f6:52:ce:a3:9c [ adhoc0]
  f4:f2:6d:5a:87:c0 0.060s ( 7.5) f4:f2:6d:5a:74:ac [ adhoc0]
  f4:f2:6d:5a:87:c0 0.060s ( 15.0) ec:08:6b:ec:3f:a4 [ adhoc0]
* f4:f2:6d:5a:87:c0 0.060s ( 29199.8) f4:f2:6d:5a:87:c0 [ adhoc0]
  ec:08:6b:ec:3f:a4 0.040s ( 12.4) 90:f6:52:ce:a3:9c [ adhoc0]
  ec:08:6b:ec:3f:a4 0.040s ( 11.0) f4:f2:6d:5a:87:c0 [ adhoc0]
  ec:08:6b:ec:3f:a4 0.040s ( 17.1) f4:f2:6d:5a:74:ac [ adhoc0]
* ec:08:6b:ec:3f:a4 0.040s ( 29199.8) ec:08:6b:ec:3f:a4 [ adhoc0]
  f4:f2:6d:5a:74:ac 0.310s ( 15.0) 90:f6:52:ce:a3:9c [ adhoc0]
  f4:f2:6d:5a:74:ac 0.310s ( 7.9) f4:f2:6d:5a:87:c0 [ adhoc0]
  f4:f2:6d:5a:74:ac 0.310s ( 15.9) ec:08:6b:ec:3f:a4 [ adhoc0]
* f4:f2:6d:5a:74:ac 0.310s ( 33371.2) f4:f2:6d:5a:74:ac [ adhoc0]
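
For anyone who wants to spot such readings automatically, here is a minimal Python sketch that parses "batctl o" style lines and flags implausible throughput values. The 1000 Mbit/s ceiling, the function names and the sample rows are assumptions made for this illustration; none of them come from batman-adv or batctl.

```python
import re

# Hypothetical plausibility ceiling in Mbit/s. This is an assumption for the
# sketch, not a constant taken from batman-adv.
MAX_PLAUSIBLE_MBIT = 1000.0

MAC = r"(?:[0-9a-f]{2}:){5}[0-9a-f]{2}"
LINE_RE = re.compile(
    r"^\s*(?P<best>\*)?\s*"            # optional '*' marks the selected route
    r"(?P<orig>" + MAC + r")\s+"       # originator address
    r"(?P<seen>[\d.]+)s\s+"            # last-seen in seconds
    r"\(\s*(?P<tput>[\d.]+)\)\s+"      # throughput column
    r"(?P<nexthop>" + MAC + r")\s+"    # next hop address
    r"\[\s*(?P<iface>[^\]\s]+)\]",     # outgoing interface
    re.IGNORECASE,
)

def parse_originators(text):
    """Return one dict per originator row; header lines are skipped."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            rows.append({
                "selected": m.group("best") is not None,
                "originator": m.group("orig"),
                "throughput": float(m.group("tput")),
                "nexthop": m.group("nexthop"),
                "iface": m.group("iface"),
            })
    return rows

def implausible(rows, limit=MAX_PLAUSIBLE_MBIT):
    """Rows whose throughput exceeds the assumed ceiling."""
    return [r for r in rows if r["throughput"] > limit]

SAMPLE = """\
  90:f6:52:ce:a3:9c 0.100s ( 5.8) f4:f2:6d:5a:87:c0 [ adhoc0]
* 90:f6:52:ce:a3:9c 0.100s ( 29199.8) 90:f6:52:ce:a3:9c [ adhoc0]
"""

if __name__ == "__main__":
    for row in implausible(parse_originators(SAMPLE)):
        print(row["originator"], row["throughput"], "selected" if row["selected"] else "")
```

With the two sample rows, only the direct (selected) route with 29199.8 is flagged, which matches the symptom described above.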

iw station dump showed unusual results as well, such as low tx bitrate for the signal level and complete lack of rx bitrate, also no expected throughput.

root@node-6:~# iw dev adhoc0 station dump
Station f4:f2:6d:5a:87:c0 (on adhoc0)
inactive time: 20 ms
rx bytes: 919569
rx packets: 11853
tx bytes: 63045
tx packets: 327
tx retries: 1267
tx failed: 317
rx drop misc: 0
signal: -58 dBm
signal avg: -57 dBm
tx bitrate: 6.5 MBit/s MCS 0
authorized: yes
authenticated: yes
associated: yes
preamble: long
WMM/WME: yes
MFP: no
TDLS peer: no
DTIM period: 0
beacon interval:100
connected time: 557 seconds
Station ec:08:6b:ec:3f:a4 (on adhoc0)
inactive time: 20 ms
rx bytes: 946494
rx packets: 12047
tx bytes: 66837
tx packets: 344
tx retries: 1340
tx failed: 336
rx drop misc: 0
signal: -24 dBm
signal avg: -24 dBm
tx bitrate: 6.5 MBit/s MCS 0
authorized: yes
authenticated: yes
associated: yes
preamble: long
WMM/WME: yes
MFP: no
TDLS peer: no
DTIM period: 0
beacon interval:100
connected time: 557 seconds
Station 90:f6:52:ce:a3:9c (on adhoc0)
inactive time: 10 ms
rx bytes: 627080
rx packets: 8225
tx bytes: 64092
tx packets: 330
tx retries: 1277
tx failed: 320
rx drop misc: 0
signal: -70 dBm
signal avg: -67 dBm
tx bitrate: 6.5 MBit/s MCS 0
authorized: yes
authenticated: yes
associated: yes
preamble: long
WMM/WME: yes
MFP: no
TDLS peer: no
DTIM period: 0
beacon interval:100
connected time: 557 seconds
Station f4:f2:6d:5a:74:ac (on adhoc0)
inactive time: 20 ms
rx bytes: 882220
rx packets: 11170
tx bytes: 65511
tx packets: 343
tx retries: 1345
tx failed: 337
rx drop misc: 0
signal: -36 dBm
signal avg: -37 dBm
tx bitrate: 6.5 MBit/s MCS 0
authorized: yes
authenticated: yes
associated: yes
preamble: long
WMM/WME: yes
MFP: no
TDLS peer: no
DTIM period: 0
beacon interval:100
connected time: 557 seconds
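
The missing fields can also be checked mechanically. The following Python sketch splits "iw ... station dump" text into per-station dictionaries and lists which of the fields "rx bitrate", "tx bitrate" and "expected throughput" are absent. The function names and the shortened sample are assumptions for this illustration; the parsing simply relies on the "key: value" layout visible in the dump above.

```python
# Fields that a healthy station entry is assumed to carry for this check.
EXPECTED_FIELDS = ("rx bitrate", "tx bitrate", "expected throughput")

def parse_station_dump(text):
    """Map each station MAC to a dict of its 'key: value' lines."""
    stations = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Station "):
            # Example: "Station f4:f2:6d:5a:87:c0 (on adhoc0)"
            mac = line.split()[1]
            current = stations.setdefault(mac, {})
        elif current is not None and ":" in line:
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip()
    return stations

def missing_fields(stations, fields=EXPECTED_FIELDS):
    """For every station, list the expected fields that are not present."""
    return {
        mac: [f for f in fields if f not in info]
        for mac, info in stations.items()
    }

SAMPLE = """\
Station f4:f2:6d:5a:87:c0 (on adhoc0)
inactive time: 20 ms
tx retries: 1267
tx failed: 317
signal: -58 dBm
tx bitrate: 6.5 MBit/s MCS 0
beacon interval:100
"""

if __name__ == "__main__":
    for mac, missing in missing_fields(parse_station_dump(SAMPLE)).items():
        print(mac, "missing:", ", ".join(missing) or "nothing")
```

For the shortened sample, the station is reported as missing "rx bitrate" and "expected throughput", the same gaps noted for the DIR-810l.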

Best regards,

Alvaro Antelo

#3 Updated by Sven Eckelmann about 2 years ago

Did you get any feedback why this stopped working?

Can you please test the following two patches to check if batman-adv can at least filter out the bogus values?

#4 Updated by Sven Eckelmann about 2 years ago

I have updated the second patch so that it also works on newer kernel versions:

#5 Updated by Alvaro Antelo about 2 years ago

Thank you Sven, I will test the patches as soon as possible and also report the driver problem to the LEDE forum (I have not done that yet).

Today I only had time to test with LEDE 17.01.2 (OK) and compiling from master r4403-3ff3158 without your patches (same driver problem and no connectivity).

Regards,

Alvaro Antelo

#6 Updated by Alvaro Antelo about 2 years ago

Sven,

Sorry for the delay.

Applied your patches and now DIR-810L reports default throughput:

[B.A.T.M.A.N. adv 2017.1, MainIF/MAC: adhoc0/00:11:22:00:35:51 (bat0/ce:38:05:61:1d:76 BATMAN_V)]
IF Neighbor last-seen
f4:f2:6d:5a:87:c0 0.360s ( 1.0) [ adhoc0]
90:f6:52:ce:a3:9c 0.420s ( 1.0) [ adhoc0]
f4:f2:6d:5a:74:ac 0.620s ( 1.0) [ adhoc0]
ec:08:6b:ec:3f:a4 0.450s ( 1.0) [ adhoc0]

Still I have the driver problem (unrelated to batman-adv)
Mon Jun 12 16:09:10 2017 kern.warn kernel: [ 25.122709] ieee80211 phy0: rt2800_config_channel: Warning - Using incomplete support for external PA
Mon Jun 12 16:09:10 2017 kern.info kernel: [ 25.203270] IPv6: ADDRCONF: adhoc0: link is not ready
Mon Jun 12 16:09:10 2017 kern.info kernel: [ 25.316852] adhoc0: Created IBSS using preconfigured BSSID 02:ca:fe:ca:00:12
Mon Jun 12 16:09:10 2017 kern.info kernel: [ 25.330980] adhoc0: Creating new IBSS network, BSSID 02:ca:fe:ca:00:12
Mon Jun 12 16:09:10 2017 kern.warn kernel: [ 25.346636] ieee80211 phy0: rt2800_config_channel: Warning - Using incomplete support for external PA

Incomplete station info remains
root@node-1:~# iwinfo adhoc0 assoclist
EC:08:6B:EC:3F:A4 -20 dBm / unknown (SNR -20) 170 ms ago
RX: unknown 8830 Pkts.
TX: 6.5 MBit/s, MCS 0, 20MHz 601 Pkts.
root@node-1:~# iw adhoc0 station dump
Station ec:08:6b:ec:3f:a4 (on adhoc0)
inactive time: 10 ms
rx bytes: 643529
rx packets: 8672
tx bytes: 121065
tx packets: 601
tx retries: 1603
tx failed: 402
rx drop misc: 0
signal: -20 dBm
signal avg: -19 dBm
tx bitrate: 6.5 MBit/s MCS 0
authorized: yes
authenticated: yes
associated: yes
preamble: long
WMM/WME: yes
MFP: no
TDLS peer: no
DTIM period: 0
beacon interval:100
connected time: 858 seconds

And there is still no connectivity:
root@node-1:~# batctl p ec:08:6b:ec:3f:a4
PING ec:08:6b:ec:3f:a4 (ec:08:6b:ec:3f:a4) 20(48) bytes of data
Reply from host ec:08:6b:ec:3f:a4 timed out
Reply from host ec:08:6b:ec:3f:a4 timed out
Reply from host ec:08:6b:ec:3f:a4 timed out

But I managed to track down the exact LEDE git commit that broke the driver on the DIR-810l:
https://github.com/lede-project/source/commit/61cfc8075b615c231cf6349b3708d0a7e073613e
If I revert to the previous commit https://github.com/lede-project/source/commit/3d71d1d9a98f55659acbfb8434406636310cb54b everything works as expected.
I will report this on the LEDE forum.

Best regards,
Alvaro Antelo

#7 Updated by Marek Lindner about 2 years ago

Sounds like you confirmed that the proposed patch works as expected? Can we close the issue? Or did I misunderstand?

#8 Updated by Sven Eckelmann about 2 years ago

  • Target version set to 2017.2
  • Status changed from Feedback to Resolved

Only the parts in batman-adv were fixed. Not the problems with his wifi driver.

#9 Updated by Alvaro Antelo about 2 years ago

Sven Eckelmann wrote:

Only the parts in batman-adv were fixed. Not the problems with his wifi driver.

Yes, the patch works as expected, thank you Sven and Marek.

#10 Updated by Marek Lindner about 2 years ago

  • Status changed from Resolved to Closed
