Bug #131


"unable to handle kernel NULL pointer dereference" in send_vis_packets

Added by Linus Lüssing about 14 years ago. Updated over 6 years ago.

Target version:
Start date:
Due date:
% Done:


Estimated time:


I got a nice backtrace of the crashing batman-adv-module. I'm using batman-adv self-compiled from the trunk rev. 1418 with the subgraph-patch with Simon's additions (mailing-list Sun Aug 30 19:55:53 UTC 2009). I was restarting tinc which is providing two interfaces for batman and then noticed, that batman still claimed those interfaces being inactive although they were up.


module-crash-01-09-09.log (2.63 KB) module-crash-01-09-09.log Linus Lüssing, 09/01/2009 10:01 PM
objdump-vis.o (37.2 KB) objdump-vis.o Linus Lüssing, 09/01/2009 10:19 PM
module-crash-02-09-09.log (2.88 KB) module-crash-02-09-09.log Linus Lüssing, 09/02/2009 11:44 PM
Actions #1

Updated by Linus Lüssing about 14 years ago

And the same problem again. What I did this time is pretty much the same as yesterday: tinc interface inside of batman-adv, then I restarted the tinc daemon while the interface was still used by batman.

Actions #2

Updated by Linus Lüssing about 14 years ago

Ok, it is absolutely reproduceable, here the steps I'm doing in detail:
First, starting tinc, which creates two tunnel interfaces '3micc' and '3micc-blochi'. The vis-server is on a batman-node behind '3micc'. Then I'm inserting the batman-adv-kernel module with 'insmod batman_adv.ko'. Next two steps are "echo 3micc > /proc/net/batman-adv/interfaces" and "echo 3micc-blochi > /proc/net/batman-adv/interfaces". Originators on this machine get listed from '3micc', but from '3micc-blochi' not (couldn't figure out why this is so yet). Now I'm restarting tinc with '/etc/init.d/tinc restart' and it responds with:
"Restarting tinc daemons: 3m 3micc 3micc-blochiA tincd is already running for net @3micc-blochi' with pid 3378.
3micc-maxi sellars2asta." (already something odd with 3micc-blochi???)
At that moment, I get the backtrace with dmesg. I'm also not able to stopp the tincd explicitly for 3micc-blochi with 'tincd -k -n 3micc-blochi', dmesg then gives me a lot of lines like "[ 889.600012] unregister_netdevice: waiting for 3micc-blochi to become free. Usage count = 1". So it looks like tincd crashes first which takes down batman-adv then.

Actions #3

Updated by Linus Lüssing about 14 years ago

Ok, and now run tincd in debug-mode again. It looks like it does not totally crash, but tries to close a connection... The problem also only occurs if I'm restarting really fast, so that it might not have the chance to close the connection. Though it is strange that this kills the batman-adv-module anyway.

/dev/net/tun is a Linux tun/tap device (tap mode)
Listening on :: port 3659
Can't bind to port 3659/tcp: Die Adresse wird bereits verwendet
Trying to connect to blochi ( port 587)
Connected to blochi ( port 587)
Connection with blochi ( port 587) activated
Got TERM signal
Closing connection with blochi ( port 587)
Closing connection with krtek (MYSELF)

Actions #4

Updated by Marek Lindner about 14 years ago

  • Status changed from New to In Progress

The patches submitted in r1425 and r1426 should fix the issues. Please close this ticket as soon as you can confirm it.

Actions #5

Updated by Linus Lüssing about 14 years ago

  • Status changed from In Progress to Closed

Yes, thanks Marek, they did :).

Actions #6

Updated by Anonymous about 13 years ago

Milestone 0.2 release deleted

Actions #7

Updated by Anonymous over 12 years ago

  • Category set to 2
Actions #8

Updated by Sven Eckelmann over 6 years ago

  • Target version set to 0.2

Also available in: Atom PDF