Bug #343

BATMAN V: WARN_ON spam from neighbor compare API

Added by Sven Eckelmann about 2 years ago. Updated over 1 year ago.

Status: Closed
Priority: Normal
Target version:
Start date: 08/02/2017
Due date:
% Done: 0%


Description

The functions batadv_v_neigh_cmp and batadv_v_neigh_is_sob were implemented in b05bbab5e1fc ("batman-adv: B.A.T.M.A.N. V - implement neighbor comparison API calls"). These functions get the per interface neighbor information with RCU protected functions and no extra locking around them. It is therefore possible that the per interface information is no longer available in the ifinfo list when batadv_neigh_ifinfo_get is called.

Users have now reported that the WARN_ON check in these functions is flooding the kernel log on their systems. Steffen from Freifunk Chemnitz gave us the following log from one of their servers (but the same also happens on gluon nodes):

Aug  2 14:13:46 descartes kernel: [2434144.963905] ------------[ cut here ]------------
Aug  2 14:13:46 descartes kernel: [2434144.963932] WARNING: CPU: 1 PID: 8696 at /usr/src/batman-adv-2017.1/build/net/batman-adv/bat_v.c:652 batadv_v_neigh_is_sob+0x67/0x80 [batman_adv]()
Aug  2 14:13:46 descartes kernel: [2434144.963944] Modules linked in: batman_adv(O) tun ip_gre ip_tunnel gre ipt_MASQUERADE iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip_tables x_tables dummy bridge stp llc cfg80211 rfkill libcrc32c cpufreq_conservative cpufreq_userspace cpufreq_stats cpufreq_powersave snd_hda_codec_hdmi snd_hda_intel ttm snd_hda_controller snd_hda_codec snd_hwdep snd_pcm snd_timer drm_kms_helper drm snd i2c_algo_bit soundcore shpchp k8temp kvm_amd kvm edac_mce_amd sp5100_tco edac_core evdev serio_raw button processor thermal_sys autofs4 ext4 crc16 mbcache jbd2 crc32c_generic btrfs xor raid6_pq dm_mod raid1 md_mod sg sd_mod crc_t10dif crct10dif_generic crct10dif_common ata_generic ohci_pci floppy pata_atiixp ahci libahci ehci_pci ohci_hcd ehci_hcd i2c_piix4 i2c_core r8169 mii libata scsi_mod usbcore usb_common [last unloaded: batman_adv]
Aug  2 14:13:46 descartes kernel: [2434144.964136] CPU: 1 PID: 8696 Comm: fastd Tainted: G        W  O  3.16.0-4-amd64 #1 Debian 3.16.43-2+deb8u1
Aug  2 14:13:46 descartes kernel: [2434144.964147] Hardware name: MICRO-STAR INTERANTIONAL CO.,LTD MS-7368/MS-7368, BIOS V1.5B2 10/31/2007
Aug  2 14:13:46 descartes kernel: [2434144.964159]  0000000000000000 ffffffff81514281 0000000000000000 0000000000000009
Aug  2 14:13:46 descartes kernel: [2434144.964173]  ffffffff81068877 0000000000000000 0000000000000000 0000000000000000
Aug  2 14:13:46 descartes kernel: [2434144.964186]  ffff88011a8ec5c0 ffff88011a8d4040 ffffffffa057c407 ffff88011a8d4078
Aug  2 14:13:46 descartes kernel: [2434144.964696] Call Trace:
Aug  2 14:13:46 descartes kernel: [2434144.964703]  <IRQ>  [<ffffffff81514281>] ? dump_stack+0x5d/0x78
Aug  2 14:13:46 descartes kernel: [2434144.964720]  [<ffffffff81068877>] ? warn_slowpath_common+0x77/0x90
Aug  2 14:13:46 descartes kernel: [2434144.964730]  [<ffffffffa057c407>] ? batadv_v_neigh_is_sob+0x67/0x80 [batman_adv]
Aug  2 14:13:46 descartes kernel: [2434144.964744]  [<ffffffffa058d4c8>] ? batadv_find_router+0x148/0x3a0 [batman_adv]
Aug  2 14:13:46 descartes kernel: [2434144.964759]  [<ffffffffa058e9b0>] ? batadv_send_skb_to_orig+0x20/0x90 [batman_adv]
Aug  2 14:13:46 descartes kernel: [2434144.964772]  [<ffffffffa058ebc3>] ? batadv_send_skb_unicast+0x83/0xd0 [batman_adv]
Aug  2 14:13:46 descartes kernel: [2434144.964787]  [<ffffffffa058ec65>] ? batadv_send_skb_via_tt_generic+0x55/0x90 [batman_adv]
Aug  2 14:13:46 descartes kernel: [2434144.964801]  [<ffffffffa05901a0>] ? batadv_interface_tx+0x410/0x450 [batman_adv]
Aug  2 14:13:46 descartes kernel: [2434144.964814]  [<ffffffff81426d17>] ? dev_hard_start_xmit+0x2e7/0x610
Aug  2 14:13:46 descartes kernel: [2434144.964822]  [<ffffffff8142739e>] ? __dev_queue_xmit+0x35e/0x4d0
Aug  2 14:13:46 descartes kernel: [2434144.964831]  [<ffffffff8142f771>] ? neigh_resolve_output+0xf1/0x200
Aug  2 14:13:46 descartes kernel: [2434144.964839]  [<ffffffff814bfeb7>] ? ip6_finish_output2+0x147/0x440
Aug  2 14:13:46 descartes kernel: [2434144.964848]  [<ffffffff814d7793>] ? ndisc_send_skb+0x183/0x2c0
Aug  2 14:13:46 descartes kernel: [2434144.964856]  [<ffffffff814d80d6>] ? ndisc_send_na+0x166/0x220
Aug  2 14:13:46 descartes kernel: [2434144.964863]  [<ffffffff814d83a8>] ? ndisc_recv_ns+0x218/0x4e0
Aug  2 14:13:46 descartes kernel: [2434144.964871]  [<ffffffff814d9230>] ? ndisc_rcv+0x210/0xf50
Aug  2 14:13:46 descartes kernel: [2434144.964879]  [<ffffffff81413b22>] ? skb_checksum+0x22/0x30
Aug  2 14:13:46 descartes kernel: [2434144.964887]  [<ffffffff81413b70>] ? skb_push+0x40/0x40
Aug  2 14:13:46 descartes kernel: [2434144.964895]  [<ffffffff814e0810>] ? icmpv6_rcv+0x440/0x890
Aug  2 14:13:46 descartes kernel: [2434144.964904]  [<ffffffff814413e1>] ? fib_rules_lookup+0x111/0x160
Aug  2 14:13:46 descartes kernel: [2434144.964914]  [<ffffffff814f73f3>] ? fib6_rule_lookup+0x43/0x80
Aug  2 14:13:46 descartes kernel: [2434144.964922]  [<ffffffff814cf420>] ? ip6_pol_route.isra.42+0x460/0x460
Aug  2 14:13:46 descartes kernel: [2434144.964930]  [<ffffffff814c2c78>] ? ip6_input_finish+0xc8/0x440
Aug  2 14:13:46 descartes kernel: [2434144.964938]  [<ffffffff81425083>] ? __netif_receive_skb_core+0x563/0x770
Aug  2 14:13:46 descartes kernel: [2434144.964946]  [<ffffffff81425f15>] ? process_backlog+0x95/0x160
Aug  2 14:13:46 descartes kernel: [2434144.964954]  [<ffffffff81425699>] ? net_rx_action+0x129/0x250
Aug  2 14:13:46 descartes kernel: [2434144.964962]  [<ffffffff8106d921>] ? __do_softirq+0xf1/0x2d0
Aug  2 14:13:46 descartes kernel: [2434144.964971]  [<ffffffff8151c0bc>] ? do_softirq_own_stack+0x1c/0x30
Aug  2 14:13:46 descartes kernel: [2434144.964977]  <EOI>  [<ffffffff8106db9d>] ? do_softirq+0x4d/0x60
Aug  2 14:13:46 descartes kernel: [2434144.964988]  [<ffffffff81422490>] ? netif_rx_ni+0x30/0x90
Aug  2 14:13:46 descartes kernel: [2434144.964997]  [<ffffffffa0604697>] ? tun_get_user+0x447/0x8e0 [tun]
Aug  2 14:13:46 descartes kernel: [2434144.965005]  [<ffffffffa0604c2b>] ? tun_chr_aio_write+0x7b/0xa0 [tun]
Aug  2 14:13:46 descartes kernel: [2434144.965015]  [<ffffffff811aa64c>] ? do_sync_write+0x5c/0x90
Aug  2 14:13:46 descartes kernel: [2434144.965023]  [<ffffffff811aaf52>] ? vfs_write+0xb2/0x1f0
Aug  2 14:13:46 descartes kernel: [2434144.965030]  [<ffffffff811aba92>] ? SyS_write+0x42/0xa0
Aug  2 14:13:46 descartes kernel: [2434144.965040]  [<ffffffff8151a48d>] ? system_call_fast_compare_end+0x10/0x15
Aug  2 14:13:46 descartes kernel: [2434144.965047] ---[ end trace 5f28b990d49282ef ]---

Is the WARN_ON really necessary?
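
To illustrate the question, here is a minimal userspace sketch of a comparison that treats a failed lookup as a normal outcome instead of warning. This is not the batman-adv code: struct ifinfo, struct neigh, lookup_ifinfo and neigh_cmp are hypothetical stand-ins, the array replaces the RCU protected ifinfo list, and returning 0 ("no preference") on a missing entry is an assumption about what a quiet fallback could look like.

```c
#include <stddef.h>

/* Hypothetical stand-in for the per interface neighbor information. */
struct ifinfo {
	int if_id;               /* outgoing interface this entry belongs to */
	unsigned int throughput; /* metric used for the comparison */
};

/* Hypothetical neighbor; the array stands in for the RCU protected list. */
struct neigh {
	const struct ifinfo *entries;
	size_t count;
};

/*
 * Look up the entry for if_id. Returns NULL when no entry exists, which in
 * the scenario described above can legitimately happen when the entry was
 * removed before the lookup ran.
 */
static const struct ifinfo *lookup_ifinfo(const struct neigh *n, int if_id)
{
	for (size_t i = 0; i < n->count; i++) {
		if (n->entries[i].if_id == if_id)
			return &n->entries[i];
	}

	return NULL;
}

/*
 * Compare two neighbors by throughput: 1 if the first is better, -1 if the
 * second is better, 0 if equal. A missing entry is not treated as a bug;
 * the function silently returns 0 (assumed fallback: no preference).
 */
static int neigh_cmp(const struct neigh *n1, int if1,
		     const struct neigh *n2, int if2)
{
	const struct ifinfo *i1 = lookup_ifinfo(n1, if1);
	const struct ifinfo *i2 = lookup_ifinfo(n2, if2);

	if (!i1 || !i2)
		return 0;

	return (i1->throughput > i2->throughput) -
	       (i1->throughput < i2->throughput);
}
```

With this shape, a caller on the transmit path (such as the batadv_find_router call visible in the trace) would only see "no preference" for a neighbor whose entry vanished, and nothing would be written to the kernel log. Whether that fallback is acceptable for routing decisions is exactly what needs a statement from the maintainers.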

History

#1 Updated by Sven Eckelmann almost 2 years ago

  • Assignee changed from batman-adv developers to Antonio Quartulli

Hey Antonio,

we are now getting requests from a company (you know which one) about this. So it would be nice if you could make a statement.

#3 Updated by Sven Eckelmann almost 2 years ago

  • Status changed from New to Resolved
  • Target version set to 2017.4

#4 Updated by Sven Eckelmann over 1 year ago

  • Status changed from Resolved to Closed

The release (v2017.4) containing this patch was just published.
