Bug #179

closed

lxc-shutdown: null pointer dereference+unable to handle kernel paging request

Added by Linus Lüssing over 10 years ago. Updated about 7 years ago.

Status: Closed
Priority: Normal
Assignee: -
Target version:
Start date: 11/29/2013
Due date:
% Done: 0%
Estimated time:

Description

On Debian Wheezy, I get a crash every time I shut down an LXC container with lxc-shutdown. A soft interface "bat1" was previously added to the corresponding LXC container (via lxc.network.type=phys).

picocom -b 115200 /dev/ttyUSB0
picocom v1.7

port is        : /dev/ttyUSB0
flowcontrol    : none
baudrate is    : 115200
parity is      : none
databits are   : 8
escape is      : C-a
local echo is  : no
noinit is      : no
noreset is     : no
nolock is      : no
send_cmd is    : sz -vv
receive_cmd is : rz -vv
imap is        :
omap is        :
emap is        : crcrlf,delbs,

Terminal ready
[  379.061816] kobject_add_internal failed for mesh with -EEXIST, don't try to register things with the same name in the same directory.
[  379.074844] batman_adv: bat1: Can't add sysfs directory: bat1/mesh
[  448.465618] BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
[  448.468015] IP: [<ffffffff8114f714>] sysfs_attr_ns+0x1/0x84
[  448.468015] PGD 79a94067 PUD 7a4b1067 PMD 0
[  448.468015] Oops: 0000 [#1] SMP
[  448.468015] CPU 2
[  448.468015] Modules linked in: tun nfsd nfs nfs_acl auth_rpcgss fscache lockd sunrpc macvlan batman_adv(O) crc32c libcrc32c loop i915 parport_pc evdev parport coretemp pcspkr psmouse iTCO_wdt iTCO_vendor_support video drm_kms_helper i2c_i801 drm snd_hda_intel i2c_algo_bit i2c_core serio_raw mperf snd_hda_codec snd_hwdep snd_pcm snd_page_alloc snd_timer snd soundcore button processor thermal_sys ext4 crc16 jbd2 mbcache dm_mod sg sd_mod crc_t10dif usb_storage uhci_hcd ahci libahci libata ehci_hcd scsi_mod usbcore r8169 usb_common mii [last unloaded: scsi_wait_scan]
[  448.468015]
[  448.468015] Pid: 142, comm: kworker/u:3 Tainted: G        W  O 3.2.0-4-amd64 #1 Debian 3.2.51-1                  /D525MW
[  448.468015] RIP: 0010:[<ffffffff8114f714>]  [<ffffffff8114f714>] sysfs_attr_ns+0x1/0x84
[  448.468015] RSP: 0018:ffff880037703d80  EFLAGS: 00010286
[  448.468015] RAX: 0000000000000000 RBX: ffffffffa0303930 RCX: ffffffff8168f0a0
[  448.468015] RDX: ffff880037703d98 RSI: ffffffffa0303930 RDI: 0000000000000000
[  448.468015] RBP: 0000000000000000 R08: 0000000000000200 R09: ffffffff8168f0e0
[  448.468015] R10: ffff88007b37d180 R11: ffff88007b37d180 R12: ffff880037703e00
[  448.468015] R13: ffff880037703e50 R14: ffff88007c087205 R15: ffffffff81659720
[  448.468015] FS:  0000000000000000(0000) GS:ffff88007ed00000(0000) knlGS:0000000000000000
[  448.468015] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  448.468015] CR2: 0000000000000030 CR3: 0000000079eb0000 CR4: 00000000000006e0
[  448.468015] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  448.468015] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  448.468015] Process kworker/u:3 (pid: 142, threadinfo ffff880037702000, task ffff88003703b850)
[  448.468015] Stack:
[  448.468015]  ffffffffa0303930 ffffffff8114f7ad ffff880037703d90 ffffffffa0304650
[  448.468015]  ffffffff81036628 ffffffffa03038a8 ffff880037786000 ffffffffa02fbd72
[  448.468015]  0000000000000000 ffffffffa0304650 ffff880037786000 ffffffffa02fa5d4
[  448.468015] Call Trace:
[  448.468015]  [<ffffffff8114f7ad>] ? sysfs_remove_file+0x16/0x32
[  448.468015]  [<ffffffff81036628>] ? should_resched+0x5/0x23
[  448.468015]  [<ffffffffa02fbd72>] ? batadv_sysfs_del_meshif+0x18/0x3b [batman_adv]
[  448.468015]  [<ffffffffa02fa5d4>] ? batadv_softif_destroy_netlink+0x3a/0x49 [batman_adv]
[  448.468015]  [<ffffffff8128f0ce>] ? default_device_exit_batch+0x3f/0x87
[  448.468015]  [<ffffffff8128aee6>] ? cleanup_net+0xf1/0x180
[  448.468015]  [<ffffffff8105b529>] ? process_one_work+0x161/0x269
[  448.468015]  [<ffffffff8105c4f2>] ? worker_thread+0xc2/0x145
[  448.468015]  [<ffffffff8105c430>] ? manage_workers.isra.25+0x15b/0x15b
[  448.468015]  [<ffffffff8105f631>] ? kthread+0x76/0x7e
[  448.468015]  [<ffffffff81356374>] ? kernel_thread_helper+0x4/0x10
[  448.468015]  [<ffffffff8105f5bb>] ? kthread_worker_fn+0x139/0x139
[  448.468015]  [<ffffffff81356370>] ? gs_change+0x13/0x13
[  448.468015] Code: 18 4c 89 e6 e8 76 4f fc ff 48 89 df 48 89 44 24 08 e8 30 ef 1f 00 48 8b 44 24 08 48 83 c4 28 5b 5d 41 5c 41 5d 41 5e 41 5f c3 53 <48> 8b 47 30 48 89 d3 48 85 c0 75 22 48 8b 0f 48 c7 c2 d2 ed 4d
[  448.468015] RIP  [<ffffffff8114f714>] sysfs_attr_ns+0x1/0x84
[  448.468015]  RSP <ffff880037703d80>
[  448.468015] CR2: 0000000000000030
[  448.805392] ---[ end trace 61a5f4682395bf4a ]---
[  448.814532] BUG: unable to handle kernel paging request at fffffffffffffff8
[  448.818312] IP: [<ffffffff8105f84a>] kthread_data+0x7/0xc
[  448.818312] PGD 1607067 PUD 1608067 PMD 0
[  448.818312] Oops: 0000 [#2] SMP
[  448.818312] CPU 2
[  448.818312] Modules linked in: tun nfsd nfs nfs_acl auth_rpcgss fscache lockd sunrpc macvlan batman_adv(O) crc32c libcrc32c loop i915 parport_pc evdev parport coretemp pcspkr psmouse iTCO_wdt iTCO_vendor_support video drm_kms_helper i2c_i801 drm snd_hda_intel i2c_algo_bit i2c_core serio_raw mperf snd_hda_codec snd_hwdep snd_pcm snd_page_alloc snd_timer snd soundcore button processor thermal_sys ext4 crc16 jbd2 mbcache dm_mod sg sd_mod crc_t10dif usb_storage uhci_hcd ahci libahci libata ehci_hcd scsi_mod usbcore r8169 usb_common mii [last unloaded: scsi_wait_scan]
[  448.818312]
[  448.818312] Pid: 142, comm: kworker/u:3 Tainted: G      D W  O 3.2.0-4-amd64 #1 Debian 3.2.51-1                  /D525MW
[  448.818312] RIP: 0010:[<ffffffff8105f84a>]  [<ffffffff8105f84a>] kthread_data+0x7/0xc
[  448.818312] RSP: 0018:ffff880037703a40  EFLAGS: 00010002
[  448.818312] RAX: 0000000000000000 RBX: ffff88007ed13780 RCX: 0000000000000002
[  448.818312] RDX: 0000000000000002 RSI: 0000000000000002 RDI: ffff88003703b850
[  448.818312] RBP: 0000000000000002 R08: 0000000000000400 R09: ffff88007bbf6cc0
[  448.818312] R10: dead000000200200 R11: ffff88007bbf6cc0 R12: ffff880037703b10
[  448.818312] R13: ffff88007eb83510 R14: 0000000000000002 R15: ffff88003703bb50
[  448.818312] FS:  0000000000000000(0000) GS:ffff88007ed00000(0000) knlGS:0000000000000000
[  448.818312] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  448.818312] CR2: fffffffffffffff8 CR3: 0000000079eb0000 CR4: 00000000000006e0
[  448.818312] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  448.818312] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  448.818312] Process kworker/u:3 (pid: 142, threadinfo ffff880037702000, task ffff88003703b850)
[  448.818312] Stack:
[  448.818312]  ffffffff8105c81e ffff88007ed13780 ffff88003703b850 ffff880037703b10
[  448.818312]  ffffffff8134d9d0 0000000000000202 ffff88007992c9c0 0000000000013780
[  448.818312]  ffff880037703fd8 ffff880037703fd8 ffff88003703b850 0000000000000282
[  448.818312] Call Trace:
[  448.818312]  [<ffffffff8105c81e>] ? wq_worker_sleeping+0xb/0x6f
[  448.818312]  [<ffffffff8134d9d0>] ? __schedule+0x138/0x610
[  448.818312]  [<ffffffff810eb124>] ? kmem_cache_free+0x2d/0x69
[  448.818312]  [<ffffffff8104a403>] ? do_exit+0x711/0x713
[  448.818312]  [<ffffffff8134f247>] ? _raw_spin_unlock_irqrestore+0xe/0xf
[  448.818312]  [<ffffffff8135004e>] ? oops_end+0xb1/0xb6
[  448.818312]  [<ffffffff81348204>] ? no_context+0x1ff/0x20e
[  448.818312]  [<ffffffff81352044>] ? do_page_fault+0x1b6/0x345
[  448.818312]  [<ffffffff81042084>] ? __cond_resched+0x1d/0x26
[  448.818312]  [<ffffffff8134f22f>] ? _raw_spin_lock_irq+0xa/0x14
[  448.818312]  [<ffffffff8105ac3e>] ? wait_on_work+0xfc/0x11c
[  448.818312]  [<ffffffffa02fd8b3>] ? batadv_tt_global_del_orig+0x97/0xaf [batman_adv]
[  448.818312]  [<ffffffff8134f209>] ? _raw_spin_lock_irqsave+0x9/0x25
[  448.818312]  [<ffffffff8134f7b5>] ? page_fault+0x25/0x30
[  448.818312]  [<ffffffff8114f714>] ? sysfs_attr_ns+0x1/0x84
[  448.818312]  [<ffffffff8114f7ad>] ? sysfs_remove_file+0x16/0x32
[  448.818312]  [<ffffffff81036628>] ? should_resched+0x5/0x23
[  448.818312]  [<ffffffffa02fbd72>] ? batadv_sysfs_del_meshif+0x18/0x3b [batman_adv]
[  448.818312]  [<ffffffffa02fa5d4>] ? batadv_softif_destroy_netlink+0x3a/0x49 [batman_adv]
[  448.818312]  [<ffffffff8128f0ce>] ? default_device_exit_batch+0x3f/0x87
[  448.818312]  [<ffffffff8128aee6>] ? cleanup_net+0xf1/0x180
[  448.818312]  [<ffffffff8105b529>] ? process_one_work+0x161/0x269
[  448.818312]  [<ffffffff8105c4f2>] ? worker_thread+0xc2/0x145
[  448.818312]  [<ffffffff8105c430>] ? manage_workers.isra.25+0x15b/0x15b
[  448.818312]  [<ffffffff8105f631>] ? kthread+0x76/0x7e
[  448.818312]  [<ffffffff81356374>] ? kernel_thread_helper+0x4/0x10
[  448.818312]  [<ffffffff8105f5bb>] ? kthread_worker_fn+0x139/0x139
[  448.818312]  [<ffffffff81356370>] ? gs_change+0x13/0x13
[  448.818312] Code: 3f 48 c1 e5 03 48 c1 e0 06 48 8d b0 e0 5d 40 81 48 29 ee e8 9d 32 fe ff 81 4b 14 00 00 00 04 41 59 5b 5d c3 48 8b 87 a8 02 00 00 <48> 8b 40 f8 c3 48 3b 3d 92 b8 72 00 75 08 0f bf 87 72 06 00 00
[  448.818312] RIP  [<ffffffff8105f84a>] kthread_data+0x7/0xc
[  448.818312]  RSP <ffff880037703a40>
[  448.818312] CR2: fffffffffffffff8
[  448.818312] ---[ end trace 61a5f4682395bf4b ]---
[  448.818312] Fixing recursive fault but reboot is needed!
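
A short sanity check on the two fault addresses in the log above, as a rough sketch rather than a full disassembly. It assumes the marked opcode bytes decode as `mov 0x30(%rdi),%rax` (bytes `48 8b 47 30`) and `mov -0x8(%rax),%rax` (bytes `48 8b 40 f8`), and takes RDI = 0 and RAX = 0 from the respective register dumps. The helper names are made up for this illustration.

```python
# Registers and opcode bytes are copied from the two oopses in the log.
MASK64 = (1 << 64) - 1


def disp8(byte):
    """Interpret one byte as a signed 8-bit displacement."""
    return byte - 0x100 if byte & 0x80 else byte


def fault_address(base, displacement_byte):
    """Effective address of `mov disp8(base), %rax`, wrapped to 64 bits."""
    return (base + disp8(displacement_byte)) & MASK64


# Oops #1: <48> 8b 47 30 -> mov 0x30(%rdi),%rax with RDI = 0
first = fault_address(0x0, 0x30)
# Oops #2: <48> 8b 40 f8 -> mov -0x8(%rax),%rax with RAX = 0
second = fault_address(0x0, 0xF8)

print(f"{first:016x}")   # compare with CR2 of the first oops
print(f"{second:016x}")  # compare with CR2 of the second oops
```

Both results match the CR2 values in the log (0000000000000030 and fffffffffffffff8), which suggests that both faults are dereferences of a NULL base pointer at a small offset.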

Distro: Debian Wheezy
Kernel: 3.2.51
Architecture: x86_64
batman-adv: batctl 2013.3.0 [batman-adv: 2013.3.0]

I'm going to try 2013.4.0 later. This crash probably has something to do with the LXC container (although, apart from the lxc-shutdown issue, the container seems to run fine). The interface added to bat1 is a macvlan interface of the same interface that was added to bat0; it was created on the host and not within the container / network namespace (not sure whether that matters).
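
To make the link to the container teardown easier to see, here is a small sketch that pulls the symbol names out of the first call trace and separates the batman_adv frames from the core kernel ones. The regular expression and the `frames` helper are assumptions made for this illustration, and the "?" markers in the trace mean the kernel itself does not consider these frames fully reliable.

```python
import re

# First six call trace lines of oops #1, copied from the log.
TRACE = """\
[  448.468015]  [<ffffffff8114f7ad>] ? sysfs_remove_file+0x16/0x32
[  448.468015]  [<ffffffff81036628>] ? should_resched+0x5/0x23
[  448.468015]  [<ffffffffa02fbd72>] ? batadv_sysfs_del_meshif+0x18/0x3b [batman_adv]
[  448.468015]  [<ffffffffa02fa5d4>] ? batadv_softif_destroy_netlink+0x3a/0x49 [batman_adv]
[  448.468015]  [<ffffffff8128f0ce>] ? default_device_exit_batch+0x3f/0x87
[  448.468015]  [<ffffffff8128aee6>] ? cleanup_net+0xf1/0x180
"""

# Matches "[<addr>] ? symbol+0xoff/0xlen [module]"; the module part is optional.
FRAME = re.compile(
    r"\[<[0-9a-f]+>\] (?:\? )?(?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?: \[(?P<mod>\w+)\])?"
)


def frames(text):
    """Return (symbol, module) pairs; frames without a module tag are 'kernel'."""
    result = []
    for line in text.splitlines():
        match = FRAME.search(line)
        if match:
            result.append((match.group("sym"), match.group("mod") or "kernel"))
    return result


parsed = frames(TRACE)
batman_frames = [sym for sym, mod in parsed if mod == "batman_adv"]
kernel_frames = [sym for sym, mod in parsed if mod == "kernel"]

print("batman_adv:", batman_frames)
print("kernel:", kernel_frames)
```

The kernel frames include cleanup_net and default_device_exit_batch, so the soft interface appears to be destroyed as part of the network namespace cleanup, which fits the suspicion that the LXC container is involved.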



Actions #1

Updated by Antonio Quartulli about 10 years ago

Have you been able to reproduce this? What about batman-adv-2014.0.0 + the patches existing in the maint branch?

Actions #2

Updated by Daniel Ehlers almost 10 years ago

I can reproduce the same oops by moving a batman interface between network namespaces and then removing the slave interface in the default netns. For some reason the batX interface is not created in the network namespace of the device that was added first. I traced the creation back to batadv_kobj_to_netdev() in sysfs.c. For some reason the netdev obtained there has the default netns, even though the device is actually in another netns.

The attached patch prohibits moving the batX interface between network namespaces.

Actions #3

Updated by Daniel Ehlers almost 10 years ago

Antonio Quartulli wrote:

Have you been able to reproduce this? What about batman-adv-2014.0.0 + the patches existing in the maint branch?

Btw. see attached file. Works for 2013.4.0 and 2014.0.0+

Actions #4

Updated by Antonio Quartulli about 9 years ago

Hi Daniel,
sorry for taking so long to reply to this ticket. Unfortunately, my experience with network namespaces is very limited, so I was not really able to review your patch.

What about sending this patch directly to netdev while CCing the batman-adv mailing list?
There you should find somebody who is able to judge this change and potentially provide comments.

Updated by Florian Steinel over 8 years ago

Both patches have been forward-ported to batman-adv tag v2015.1.
TODO:
- identify resources that must be allocated for every namespace
- send RFC email to netdev and CC batman-adv

The namespace support for TIPC seems to be a good guide
(overview of Linux namespaces)

Actions #7

Updated by Sven Eckelmann almost 8 years ago

  • Status changed from In Progress to Closed

The patches mentioned in this ticket were merged. There are still some problems regarding debugfs, but these are not related to the problem reported here.

Actions #8

Updated by Sven Eckelmann about 7 years ago

  • Target version set to 2016.2