* [e1000_netpoll] BUG: sleeping function called from invalid context at kernel/irq/manage.c:110
@ 2016-07-26  3:50 Fengguang Wu
  2016-07-26  9:14 ` Eric Dumazet
  0 siblings, 1 reply; 17+ messages in thread
From: Fengguang Wu @ 2016-07-26  3:50 UTC (permalink / raw)
  To: LKML
  Cc: netdev, Satyam Sharma, Thomas Gleixner, intel-wired-lan,
	Jeff Kirsher, Fengguang Wu, Ye Xiaolong

Greetings,

This BUG message can be found in recent kernels as well as v4.4 and
linux-stable. It happens when running

        modprobe netconsole netconsole=@/,$port@$server/

[ 39.937534] 22 Jul 13:30:40 ntpdate[440]: step time server 192.168.1.1 offset -673.833841 sec
[ 39.943285] netpoll: netconsole: local port 6665
[ 39.943436] netpoll: netconsole: local IPv4 address 0.0.0.0
[ 39.943609] netpoll: netconsole: interface 'eth0'
[ 39.943756] netpoll: netconsole: remote port 6672
[ 39.943913] netpoll: netconsole: remote IPv4 address 192.168.1.1
[ 39.944099] netpoll: netconsole: remote ethernet address ff:ff:ff:ff:ff:ff
[ 39.944311] netpoll: netconsole: local IP 192.168.1.193
[ 39.944514] BUG: sleeping function called from invalid context at kernel/irq/manage.c:110
[ 39.944515] in_atomic(): 1, irqs_disabled(): 1, pid: 448, name: modprobe
[ 39.944517] CPU: 6 PID: 448 Comm: modprobe Not tainted 4.7.0-rc7-wt-ath-10122-gf9b5ec2 #102
[ 39.944518] Hardware name: /DZ77BH-55K, BIOS BHZ7710H.86A.0097.2012.1228.1346 12/28/2012
[ 39.944522]  0000000000000000 ffffc90001f2f9e8 ffffffff813417d9 ffff88007faba5c0
[ 39.944524]  000000000000006e ffffc90001f2fa00 ffffffff810aec03 ffffffff81a25948
[ 39.944525]  ffffc90001f2fa28 ffffffff810aec9a ffff8803e5bd9400 ffff8803e50fbd68
[ 39.944526] Call Trace:
[ 39.944533]  [<ffffffff813417d9>] dump_stack+0x63/0x8a
[ 39.944536]  [<ffffffff810aec03>] ___might_sleep+0xd3/0x120
[ 39.944537]  [<ffffffff810aec9a>] __might_sleep+0x4a/0x80
[ 39.944541]  [<ffffffff810e4638>] synchronize_irq+0x38/0xa0
[ 39.944543]  [<ffffffff810e3c8e>] ? __irq_put_desc_unlock+0x1e/0x40
[ 39.944545]  [<ffffffff810e48e3>] ? __disable_irq_nosync+0x43/0x60
[ 39.944547]  [<ffffffff810e492c>] disable_irq+0x1c/0x20
[ 39.944559]  [<ffffffffa0220932>] e1000_netpoll+0xf2/0x120 [e1000e]
[ 39.944563]  [<ffffffff815f2bdc>] netpoll_poll_dev+0x5c/0x1a0
[ 39.944567]  [<ffffffff815bb361>] ? __kmalloc_reserve+0x31/0x90
[ 39.944569]  [<ffffffff815f2e8b>] netpoll_send_skb_on_dev+0x16b/0x250
[ 39.944572]  [<ffffffff815f325c>] netpoll_send_udp+0x2ec/0x450
[ 39.944576]  [<ffffffffa003cb62>] write_msg+0xb2/0xf0 [netconsole]
[ 39.944578]  [<ffffffff810e04e5>] call_console_drivers+0x115/0x120
[ 39.944580]  [<ffffffff810e1f13>] console_unlock+0x333/0x5c0
[ 39.944583]  [<ffffffff810e2c74>] register_console+0x1c4/0x380
[ 39.944586]  [<ffffffffa004f1c5>] init_netconsole+0x1c5/0x1000 [netconsole]
[ 39.944588]  [<ffffffffa004f000>] ? 0xffffffffa004f000
[ 39.944591]  [<ffffffff8100216d>] do_one_initcall+0x3d/0x150
[ 39.944592]  [<ffffffff810aec9a>] ? __might_sleep+0x4a/0x80
[ 39.944596]  [<ffffffff811f5098>] ? kmem_cache_alloc_trace+0x188/0x1e0
[ 39.944598]  [<ffffffff8118f871>] do_init_module+0x5f/0x1d8
[ 39.944602]  [<ffffffff81114009>] load_module+0x1429/0x1b40
[ 39.944604]  [<ffffffff81110cd0>] ? __symbol_put+0x40/0x40
[ 39.944607]  [<ffffffff8121f348>] ? kernel_read_file+0x178/0x1a0
[ 39.944608]  [<ffffffff8121f429>] ? kernel_read_file_from_fd+0x49/0x80
[ 39.944611]  [<ffffffff81114973>] SYSC_finit_module+0xc3/0xf0
[ 39.944614]  [<ffffffff811149be>] SyS_finit_module+0xe/0x10
[ 39.944617]  [<ffffffff816e5877>] entry_SYSCALL_64_fastpath+0x1a/0xa9
[ 39.946384] console [netcon0] enabled
[ 39.946514] netconsole: network logging started

Can this be possibly fixed?

Thanks,
Fengguang

^ permalink raw reply [flat|nested] 17+ messages in thread
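[Annotation] The trace in the report shows disable_irq() reaching synchronize_irq(), whose might-sleep check fires because the driver's netpoll hook was entered with interrupts disabled (in_atomic(): 1, irqs_disabled(): 1). The following is a minimal user-space sketch of that check, not kernel code. struct ctx_model and every model_* function are hypothetical names invented for this illustration; only the call ordering is taken from the trace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the per-CPU execution context. */
struct ctx_model {
	int  preempt_count;  /* > 0 models in_atomic() */
	bool irqs_disabled;  /* models irqs_disabled() */
	int  violations;     /* number of times the check fired */
};

/* Models the debug check: complain when a function that may sleep
 * is reached from a context where sleeping is not allowed. */
static void model_might_sleep(struct ctx_model *c, const char *where)
{
	assert(c != NULL);
	if (c->preempt_count > 0 || c->irqs_disabled) {
		c->violations++;
		fprintf(stderr,
			"BUG (model): sleeping function called from invalid context at %s\n",
			where);
	}
}

/* Models disable_irq(): the masking step is safe anywhere, but the
 * wait for in-flight handlers is annotated as possibly sleeping. */
static void model_disable_irq(struct ctx_model *c)
{
	model_might_sleep(c, "synchronize_irq");
}

/* Models the netpoll path from the trace: the poll hook runs with
 * interrupts off. Returns how many violations this call produced. */
static int model_netpoll_poll(struct ctx_model *c)
{
	int before = c->violations;

	c->irqs_disabled = true;
	c->preempt_count++;
	model_disable_irq(c);   /* what the unpatched hook does first */
	c->preempt_count--;
	c->irqs_disabled = false;

	return c->violations - before;
}
```

In this model, calling model_disable_irq() from an ordinary context is silent, while the same call made through model_netpoll_poll() trips the check once, which mirrors the single BUG line in the log.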
* Re: [e1000_netpoll] BUG: sleeping function called from invalid context at kernel/irq/manage.c:110 2016-07-26 3:50 [e1000_netpoll] BUG: sleeping function called from invalid context at kernel/irq/manage.c:110 Fengguang Wu @ 2016-07-26 9:14 ` Eric Dumazet [not found] ` <20160726093224.GA10339@wfg-t540p.sh.intel.com> ` (2 more replies) 0 siblings, 3 replies; 17+ messages in thread From: Eric Dumazet @ 2016-07-26 9:14 UTC (permalink / raw) To: Fengguang Wu Cc: LKML, netdev, Satyam Sharma, Thomas Gleixner, intel-wired-lan, Jeff Kirsher, Ye Xiaolong On Tue, 2016-07-26 at 11:50 +0800, Fengguang Wu wrote: > Greetings, > > This BUG message can be found in recent kernels as well as v4.4 and > linux-stable. It happens when running > > modprobe netconsole netconsole=@/,$port@$server/ > > [ 39.937534] 22 Jul 13:30:40 ntpdate[440]: step time server 192.168.1.1 offset -673.833841 sec > [ 39.943285] netpoll: netconsole: local port 6665 > [ 39.943436] netpoll: netconsole: local IPv4 address 0.0.0.0 > [ 39.943609] netpoll: netconsole: interface 'eth0' > [ 39.943756] netpoll: netconsole: remote port 6672 > [ 39.943913] netpoll: netconsole: remote IPv4 address 192.168.1.1 > [ 39.944099] netpoll: netconsole: remote ethernet address ff:ff:ff:ff:ff:ff > [ 39.944311] netpoll: netconsole: local IP 192.168.1.193 > [ 39.944514] BUG: sleeping function called from invalid context at kernel/irq/manage.c:110 > [ 39.944515] in_atomic(): 1, irqs_disabled(): 1, pid: 448, name: modprobe > [ 39.944517] CPU: 6 PID: 448 Comm: modprobe Not tainted 4.7.0-rc7-wt-ath-10122-gf9b5ec2 #102 > [ 39.944518] Hardware name: /DZ77BH-55K, BIOS BHZ7710H.86A.0097.2012.1228.1346 12/28/2012 > [ 39.944522] 0000000000000000 ffffc90001f2f9e8 ffffffff813417d9 ffff88007faba5c0 > [ 39.944524] 000000000000006e ffffc90001f2fa00 ffffffff810aec03 ffffffff81a25948 > [ 39.944525] ffffc90001f2fa28 ffffffff810aec9a ffff8803e5bd9400 ffff8803e50fbd68 > [ 39.944526] Call Trace: > [ 39.944533] [<ffffffff813417d9>] dump_stack+0x63/0x8a 
> [ 39.944536] [<ffffffff810aec03>] ___might_sleep+0xd3/0x120 > [ 39.944537] [<ffffffff810aec9a>] __might_sleep+0x4a/0x80 > [ 39.944541] [<ffffffff810e4638>] synchronize_irq+0x38/0xa0 > [ 39.944543] [<ffffffff810e3c8e>] ? __irq_put_desc_unlock+0x1e/0x40 > [ 39.944545] [<ffffffff810e48e3>] ? __disable_irq_nosync+0x43/0x60 > [ 39.944547] [<ffffffff810e492c>] disable_irq+0x1c/0x20 > [ 39.944559] [<ffffffffa0220932>] e1000_netpoll+0xf2/0x120 [e1000e] > [ 39.944563] [<ffffffff815f2bdc>] netpoll_poll_dev+0x5c/0x1a0 > [ 39.944567] [<ffffffff815bb361>] ? __kmalloc_reserve+0x31/0x90 > [ 39.944569] [<ffffffff815f2e8b>] netpoll_send_skb_on_dev+0x16b/0x250 > [ 39.944572] [<ffffffff815f325c>] netpoll_send_udp+0x2ec/0x450 > [ 39.944576] [<ffffffffa003cb62>] write_msg+0xb2/0xf0 [netconsole] > [ 39.944578] [<ffffffff810e04e5>] call_console_drivers+0x115/0x120 > [ 39.944580] [<ffffffff810e1f13>] console_unlock+0x333/0x5c0 > [ 39.944583] [<ffffffff810e2c74>] register_console+0x1c4/0x380 > [ 39.944586] [<ffffffffa004f1c5>] init_netconsole+0x1c5/0x1000 [netconsole] > [ 39.944588] [<ffffffffa004f000>] ? 0xffffffffa004f000 > [ 39.944591] [<ffffffff8100216d>] do_one_initcall+0x3d/0x150 > [ 39.944592] [<ffffffff810aec9a>] ? __might_sleep+0x4a/0x80 > [ 39.944596] [<ffffffff811f5098>] ? kmem_cache_alloc_trace+0x188/0x1e0 > [ 39.944598] [<ffffffff8118f871>] do_init_module+0x5f/0x1d8 > [ 39.944602] [<ffffffff81114009>] load_module+0x1429/0x1b40 > [ 39.944604] [<ffffffff81110cd0>] ? __symbol_put+0x40/0x40 > [ 39.944607] [<ffffffff8121f348>] ? kernel_read_file+0x178/0x1a0 > [ 39.944608] [<ffffffff8121f429>] ? kernel_read_file_from_fd+0x49/0x80 > [ 39.944611] [<ffffffff81114973>] SYSC_finit_module+0xc3/0xf0 > [ 39.944614] [<ffffffff811149be>] SyS_finit_module+0xe/0x10 > [ 39.944617] [<ffffffff816e5877>] entry_SYSCALL_64_fastpath+0x1a/0xa9 > [ 39.946384] console [netcon0] enabled > [ 39.946514] netconsole: network logging started > > Can this be possibly fixed? Could you try this ? 
diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c
index f42129d09e2c23ba9fdb5cde890d50ecb7166a42..a53c41c4c4f7d1fe52f95a2cab8784a938b3820b 100644
--- a/drivers/net/ethernet/intel/e1000/e1000_main.c
+++ b/drivers/net/ethernet/intel/e1000/e1000_main.c
@@ -5257,9 +5257,13 @@ static void e1000_netpoll(struct net_device *netdev)
 {
 	struct e1000_adapter *adapter = netdev_priv(netdev);
 
-	disable_irq(adapter->pdev->irq);
-	e1000_intr(adapter->pdev->irq, netdev);
-	enable_irq(adapter->pdev->irq);
+	if (napi_schedule_prep(&adapter->napi)) {
+		adapter->total_tx_bytes = 0;
+		adapter->total_tx_packets = 0;
+		adapter->total_rx_bytes = 0;
+		adapter->total_rx_packets = 0;
+		__napi_schedule(&adapter->napi);
+	}
 }
 #endif

^ permalink raw reply related [flat|nested] 17+ messages in thread
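[Annotation] The patch replaces the disable_irq()/handler/enable_irq() sequence with a guarded NAPI schedule. The guard matters because napi_schedule_prep() succeeds only when the NAPI context is neither being disabled nor already scheduled, so repeated netpoll calls cannot queue the same poll twice. Below is a user-space sketch of that claim-once behaviour using C11 atomics. It is a model under stated assumptions, not the kernel implementation: struct napi_model, the MODEL_STATE_* bits and all model_* functions are hypothetical names, and the polls counter merely stands in for __napi_schedule().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state bits, loosely modelled on "scheduled" and
 * "disable pending" flags. */
enum {
	MODEL_STATE_SCHED   = 1 << 0,
	MODEL_STATE_DISABLE = 1 << 1,
};

struct napi_model {
	atomic_uint state;
	unsigned    polls;   /* how many polls have been queued */
};

static void model_init(struct napi_model *n)
{
	assert(n != NULL);
	atomic_init(&n->state, 0u);
	n->polls = 0;
}

/* Claim the right to schedule: fails if a disable is pending or if
 * someone else already owns the SCHED bit. */
static bool model_schedule_prep(struct napi_model *n)
{
	if (atomic_load(&n->state) & MODEL_STATE_DISABLE)
		return false;
	/* atomic test-and-set of the SCHED bit */
	unsigned old = atomic_fetch_or(&n->state, (unsigned)MODEL_STATE_SCHED);
	return !(old & MODEL_STATE_SCHED);
}

/* Poll finished: release the SCHED bit so it can be claimed again. */
static void model_complete(struct napi_model *n)
{
	atomic_fetch_and(&n->state, ~(unsigned)MODEL_STATE_SCHED);
}

/* Shape of the patched poll hook: no IRQ masking, nothing that can
 * sleep. Returns true if this call queued a poll. */
static bool model_netpoll(struct napi_model *n)
{
	if (!model_schedule_prep(n))
		return false;
	n->polls++;          /* stands in for queueing the poll */
	return true;
}
```

In this model, two back-to-back model_netpoll() calls queue exactly one poll; a second poll can be queued only after model_complete() releases the bit.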
[parent not found: <20160726093224.GA10339@wfg-t540p.sh.intel.com>]
* Re: [PATCH] schedule function called for e1000 driver interrupt
       [not found]   ` <20160726093224.GA10339@wfg-t540p.sh.intel.com>
@ 2016-07-26  9:45     ` kbuild test robot
  2016-07-26  9:50     ` [e1000_netpoll] BUG: sleeping function called from invalid context at kernel/irq/manage.c:110 Thomas Gleixner
  2016-07-26  9:50     ` [PATCH] schedule function called for e1000 driver interrupt kbuild test robot
  2 siblings, 0 replies; 17+ messages in thread
From: kbuild test robot @ 2016-07-26  9:45 UTC (permalink / raw)
  To: Fengguang Wu
  Cc: kbuild-all, Eric Dumazet, nick, netdev, LKML, Ye Xiaolong,
	intel-wired-lan, Satyam Sharma, Thomas Gleixner, cc

[-- Attachment #1: Type: text/plain, Size: 1555 bytes --]

Hi,

[auto build test ERROR on jkirsher-next-queue/dev-queue]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Fengguang-Wu/schedule-function-called-for-e1000-driver-interrupt/20160726-173521
base:   https://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue.git dev-queue
config: x86_64-randconfig-x011-201630 (attached as .config)
compiler: gcc-6 (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64

All errors (new ones prefixed by >>):

   drivers/net/ethernet/intel/e1000/e1000_main.c: In function 'e1000_intr':
>> drivers/net/ethernet/intel/e1000/e1000_main.c:3800:22: error: 'struct e1000_adapter' has no member named 'watchdog_timer'; did you mean 'watchdog_task'?
       mod_timer(&adapter->watchdog_timer, jiffies + 1);
                         ^~

vim +3800 drivers/net/ethernet/intel/e1000/e1000_main.c

  3794			return IRQ_HANDLED;
  3795
  3796		if (unlikely(icr & (E1000_ICR_RXSEQ | E1000_ICR_LSC))) {
  3797			hw->get_link_status = 1;
  3798			/* guard against interrupt when we're going down */
  3799			if (!test_bit(__E1000_DOWN, &adapter->flags))
> 3800				mod_timer(&adapter->watchdog_timer, jiffies + 1);
  3801		}
  3802
  3803		/* disable interrupts, without the synchronize_irq bit */

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/octet-stream, Size: 26488 bytes --]

^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [e1000_netpoll] BUG: sleeping function called from invalid context at kernel/irq/manage.c:110
       [not found]   ` <20160726093224.GA10339@wfg-t540p.sh.intel.com>
  2016-07-26  9:45     ` [PATCH] schedule function called for e1000 driver interrupt kbuild test robot
@ 2016-07-26  9:50     ` Thomas Gleixner
       [not found]       ` <8578bb16-cd04-e8a5-c7f4-be061ede95b4@gmail.com>
  2016-07-26  9:50     ` [PATCH] schedule function called for e1000 driver interrupt kbuild test robot
  2 siblings, 1 reply; 17+ messages in thread
From: Thomas Gleixner @ 2016-07-26  9:50 UTC (permalink / raw)
  To: Fengguang Wu
  Cc: Eric Dumazet, LKML, netdev, Satyam Sharma, intel-wired-lan,
	Jeff Kirsher, Ye Xiaolong

On Tue, 26 Jul 2016, Fengguang Wu wrote:
> --- a/drivers/net/ethernet/intel/e1000/e1000_main.c
> +++ b/drivers/net/ethernet/intel/e1000/e1000_main.c
> @@ -3797,7 +3797,7 @@ static irqreturn_t e1000_intr(int irq, void *data)
>  		hw->get_link_status = 1;
>  		/* guard against interrupt when we're going down */
>  		if (!test_bit(__E1000_DOWN, &adapter->flags))
> -			schedule_delayed_work(&adapter->watchdog_task, 1);
> +			mod_timer(&adapter->watchdog_timer, jiffies + 1);

ROTFL ....

^ permalink raw reply [flat|nested] 17+ messages in thread
[parent not found: <8578bb16-cd04-e8a5-c7f4-be061ede95b4@gmail.com>]
* Re: [e1000_netpoll] BUG: sleeping function called from invalid context at kernel/irq/manage.c:110
       [not found]       ` <8578bb16-cd04-e8a5-c7f4-be061ede95b4@gmail.com>
@ 2016-07-28  7:45         ` Thomas Gleixner
  2016-07-28  9:46           ` Valdis.Kletnieks
  0 siblings, 1 reply; 17+ messages in thread
From: Thomas Gleixner @ 2016-07-28  7:45 UTC (permalink / raw)
  To: nick
  Cc: Fengguang Wu, Eric Dumazet, LKML, netdev, Satyam Sharma,
	intel-wired-lan, Jeff Kirsher, Ye Xiaolong

On Tue, 26 Jul 2016, nick wrote:
> diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c
> index f42129d..e1830af 100644
> --- a/drivers/net/ethernet/intel/e1000/e1000_main.c
> +++ b/drivers/net/ethernet/intel/e1000/e1000_main.c
> @@ -3797,7 +3797,7 @@ static irqreturn_t e1000_intr(int irq, void *data)
>  		hw->get_link_status = 1;
>  		/* guard against interrupt when we're going down */
>  		if (!test_bit(__E1000_DOWN, &adapter->flags))
> -			schedule_delayed_work(&adapter->watchdog_task, 1);
> +			mod_work(&adapter->watchdog_task, jiffies + 1);

And that's not even funny anymore. Are you using a random generator to create
these patches?

^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [e1000_netpoll] BUG: sleeping function called from invalid context at kernel/irq/manage.c:110
  2016-07-28  7:45         ` Thomas Gleixner
@ 2016-07-28  9:46           ` Valdis.Kletnieks
  0 siblings, 0 replies; 17+ messages in thread
From: Valdis.Kletnieks @ 2016-07-28  9:46 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: nick, Fengguang Wu, Eric Dumazet, LKML, netdev, Satyam Sharma,
	intel-wired-lan, Jeff Kirsher, Ye Xiaolong

[-- Attachment #1: Type: text/plain, Size: 924 bytes --]

On Thu, 28 Jul 2016 09:45:12 +0200, Thomas Gleixner said:
> On Tue, 26 Jul 2016, nick wrote:
> > diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c
> > index f42129d..e1830af 100644
> > --- a/drivers/net/ethernet/intel/e1000/e1000_main.c
> > +++ b/drivers/net/ethernet/intel/e1000/e1000_main.c
> > @@ -3797,7 +3797,7 @@ static irqreturn_t e1000_intr(int irq, void *data)
> >  		hw->get_link_status = 1;
> >  		/* guard against interrupt when we're going down */
> >  		if (!test_bit(__E1000_DOWN, &adapter->flags))
> > -			schedule_delayed_work(&adapter->watchdog_task, 1);
> > +			mod_work(&adapter->watchdog_task, jiffies + 1);
>
> And that's not even funny anymore. Are you using a random generator to create
> these patches?

At some point, we need to decide if the occasional accidentally-correct
trivial patch from Nick is worth all the wasted maintainer time.

[-- Attachment #2: Type: application/pgp-signature, Size: 848 bytes --]

^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] schedule function called for e1000 driver interrupt
       [not found]   ` <20160726093224.GA10339@wfg-t540p.sh.intel.com>
  2016-07-26  9:45     ` [PATCH] schedule function called for e1000 driver interrupt kbuild test robot
  2016-07-26  9:50     ` [e1000_netpoll] BUG: sleeping function called from invalid context at kernel/irq/manage.c:110 Thomas Gleixner
@ 2016-07-26  9:50     ` kbuild test robot
  2 siblings, 0 replies; 17+ messages in thread
From: kbuild test robot @ 2016-07-26  9:50 UTC (permalink / raw)
  To: Fengguang Wu
  Cc: kbuild-all, Eric Dumazet, nick, netdev, LKML, Ye Xiaolong,
	intel-wired-lan, Satyam Sharma, Thomas Gleixner, cc

[-- Attachment #1: Type: text/plain, Size: 1689 bytes --]

Hi,

[auto build test ERROR on jkirsher-next-queue/dev-queue]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Fengguang-Wu/schedule-function-called-for-e1000-driver-interrupt/20160726-173521
base:   https://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue.git dev-queue
config: sparc64-allyesconfig (attached as .config)
compiler: sparc64-linux-gnu-gcc (Debian 5.4.0-6) 5.4.0 20160609
reproduce:
        wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=sparc64

All errors (new ones prefixed by >>):

   drivers/net/ethernet/intel/e1000/e1000_main.c: In function 'e1000_intr':
>> drivers/net/ethernet/intel/e1000/e1000_main.c:3800:22: error: 'struct e1000_adapter' has no member named 'watchdog_timer'
       mod_timer(&adapter->watchdog_timer, jiffies + 1);
                         ^

vim +3800 drivers/net/ethernet/intel/e1000/e1000_main.c

  3794			return IRQ_HANDLED;
  3795
  3796		if (unlikely(icr & (E1000_ICR_RXSEQ | E1000_ICR_LSC))) {
  3797			hw->get_link_status = 1;
  3798			/* guard against interrupt when we're going down */
  3799			if (!test_bit(__E1000_DOWN, &adapter->flags))
> 3800				mod_timer(&adapter->watchdog_timer, jiffies + 1);
  3801		}
  3802
  3803		/* disable interrupts, without the synchronize_irq bit */

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/octet-stream, Size: 46493 bytes --]

^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [e1000_netpoll] BUG: sleeping function called from invalid context at kernel/irq/manage.c:110 2016-07-26 9:14 ` Eric Dumazet [not found] ` <20160726093224.GA10339@wfg-t540p.sh.intel.com> @ 2016-07-26 15:32 ` Fengguang Wu 2016-07-26 16:28 ` Eric Dumazet 2016-07-27 21:38 ` Jeff Kirsher 2 siblings, 1 reply; 17+ messages in thread From: Fengguang Wu @ 2016-07-26 15:32 UTC (permalink / raw) To: Eric Dumazet Cc: LKML, netdev, Satyam Sharma, Thomas Gleixner, intel-wired-lan, Jeff Kirsher, Ye Xiaolong Hi Eric, It works! On Tue, Jul 26, 2016 at 11:14:52AM +0200, Eric Dumazet wrote: >On Tue, 2016-07-26 at 11:50 +0800, Fengguang Wu wrote: >> Greetings, >> >> This BUG message can be found in recent kernels as well as v4.4 and >> linux-stable. It happens when running >> >> modprobe netconsole netconsole=@/,$port@$server/ >> >> [ 39.937534] 22 Jul 13:30:40 ntpdate[440]: step time server 192.168.1.1 offset -673.833841 sec >> [ 39.943285] netpoll: netconsole: local port 6665 >> [ 39.943436] netpoll: netconsole: local IPv4 address 0.0.0.0 >> [ 39.943609] netpoll: netconsole: interface 'eth0' >> [ 39.943756] netpoll: netconsole: remote port 6672 >> [ 39.943913] netpoll: netconsole: remote IPv4 address 192.168.1.1 >> [ 39.944099] netpoll: netconsole: remote ethernet address ff:ff:ff:ff:ff:ff >> [ 39.944311] netpoll: netconsole: local IP 192.168.1.193 >> [ 39.944514] BUG: sleeping function called from invalid context at kernel/irq/manage.c:110 >> [ 39.944515] in_atomic(): 1, irqs_disabled(): 1, pid: 448, name: modprobe >> [ 39.944517] CPU: 6 PID: 448 Comm: modprobe Not tainted 4.7.0-rc7-wt-ath-10122-gf9b5ec2 #102 >> [ 39.944518] Hardware name: /DZ77BH-55K, BIOS BHZ7710H.86A.0097.2012.1228.1346 12/28/2012 >> [ 39.944522] 0000000000000000 ffffc90001f2f9e8 ffffffff813417d9 ffff88007faba5c0 >> [ 39.944524] 000000000000006e ffffc90001f2fa00 ffffffff810aec03 ffffffff81a25948 >> [ 39.944525] ffffc90001f2fa28 ffffffff810aec9a ffff8803e5bd9400 ffff8803e50fbd68 >> [ 39.944526] Call Trace: 
>> [ 39.944533] [<ffffffff813417d9>] dump_stack+0x63/0x8a >> [ 39.944536] [<ffffffff810aec03>] ___might_sleep+0xd3/0x120 >> [ 39.944537] [<ffffffff810aec9a>] __might_sleep+0x4a/0x80 >> [ 39.944541] [<ffffffff810e4638>] synchronize_irq+0x38/0xa0 >> [ 39.944543] [<ffffffff810e3c8e>] ? __irq_put_desc_unlock+0x1e/0x40 >> [ 39.944545] [<ffffffff810e48e3>] ? __disable_irq_nosync+0x43/0x60 >> [ 39.944547] [<ffffffff810e492c>] disable_irq+0x1c/0x20 >> [ 39.944559] [<ffffffffa0220932>] e1000_netpoll+0xf2/0x120 [e1000e] >> [ 39.944563] [<ffffffff815f2bdc>] netpoll_poll_dev+0x5c/0x1a0 >> [ 39.944567] [<ffffffff815bb361>] ? __kmalloc_reserve+0x31/0x90 >> [ 39.944569] [<ffffffff815f2e8b>] netpoll_send_skb_on_dev+0x16b/0x250 >> [ 39.944572] [<ffffffff815f325c>] netpoll_send_udp+0x2ec/0x450 >> [ 39.944576] [<ffffffffa003cb62>] write_msg+0xb2/0xf0 [netconsole] >> [ 39.944578] [<ffffffff810e04e5>] call_console_drivers+0x115/0x120 >> [ 39.944580] [<ffffffff810e1f13>] console_unlock+0x333/0x5c0 >> [ 39.944583] [<ffffffff810e2c74>] register_console+0x1c4/0x380 >> [ 39.944586] [<ffffffffa004f1c5>] init_netconsole+0x1c5/0x1000 [netconsole] >> [ 39.944588] [<ffffffffa004f000>] ? 0xffffffffa004f000 >> [ 39.944591] [<ffffffff8100216d>] do_one_initcall+0x3d/0x150 >> [ 39.944592] [<ffffffff810aec9a>] ? __might_sleep+0x4a/0x80 >> [ 39.944596] [<ffffffff811f5098>] ? kmem_cache_alloc_trace+0x188/0x1e0 >> [ 39.944598] [<ffffffff8118f871>] do_init_module+0x5f/0x1d8 >> [ 39.944602] [<ffffffff81114009>] load_module+0x1429/0x1b40 >> [ 39.944604] [<ffffffff81110cd0>] ? __symbol_put+0x40/0x40 >> [ 39.944607] [<ffffffff8121f348>] ? kernel_read_file+0x178/0x1a0 >> [ 39.944608] [<ffffffff8121f429>] ? 
kernel_read_file_from_fd+0x49/0x80 >> [ 39.944611] [<ffffffff81114973>] SYSC_finit_module+0xc3/0xf0 >> [ 39.944614] [<ffffffff811149be>] SyS_finit_module+0xe/0x10 >> [ 39.944617] [<ffffffff816e5877>] entry_SYSCALL_64_fastpath+0x1a/0xa9 >> [ 39.946384] console [netcon0] enabled >> [ 39.946514] netconsole: network logging started >> >> Can this be possibly fixed? > >Could you try this ? > >diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c >index f42129d09e2c23ba9fdb5cde890d50ecb7166a42..a53c41c4c4f7d1fe52f95a2cab8784a938b3820b 100644 >--- a/drivers/net/ethernet/intel/e1000/e1000_main.c >+++ b/drivers/net/ethernet/intel/e1000/e1000_main.c >@@ -5257,9 +5257,13 @@ static void e1000_netpoll(struct net_device *netdev) > { > struct e1000_adapter *adapter = netdev_priv(netdev); > >- disable_irq(adapter->pdev->irq); >- e1000_intr(adapter->pdev->irq, netdev); >- enable_irq(adapter->pdev->irq); >+ if (napi_schedule_prep(&adapter->napi)) { >+ adapter->total_tx_bytes = 0; >+ adapter->total_tx_packets = 0; >+ adapter->total_rx_bytes = 0; >+ adapter->total_rx_packets = 0; >+ __napi_schedule(&adapter->napi); >+ } The machines are actually running e1000e driver, so I copied your approach to e1000e and it works: kern :info : [ 16.109647] netpoll: netconsole: local port 6665 kern :info : [ 16.109961] netpoll: netconsole: local IPv4 address 0.0.0.0 kern :info : [ 16.110346] netpoll: netconsole: interface 'eth0' kern :info : [ 16.110672] netpoll: netconsole: remote port 6676 kern :info : [ 16.110991] netpoll: netconsole: remote IPv4 address 192.168.2.1 kern :info : [ 16.111398] netpoll: netconsole: remote ethernet address ff:ff:ff:ff:ff:ff kern :info : [ 16.111845] netpoll: netconsole: local IP 192.168.2.3 kern :info : [ 16.114284] console [netcon0] enabled kern :info : [ 16.114550] netconsole: network logging started However I'm not sure if it'll have side effects, because this effectively disables the various checks in 
e1000_intr() and e1000_intr_msi().

diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index 9b4ec13..4f89873 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -6711,15 +6711,14 @@ static void e1000_netpoll(struct net_device *netdev)
 	case E1000E_INT_MODE_MSIX:
 		e1000_intr_msix(adapter->pdev->irq, netdev);
 		break;
-	case E1000E_INT_MODE_MSI:
-		disable_irq(adapter->pdev->irq);
-		e1000_intr_msi(adapter->pdev->irq, netdev);
-		enable_irq(adapter->pdev->irq);
-		break;
 	default: /* E1000E_INT_MODE_LEGACY */
-		disable_irq(adapter->pdev->irq);
-		e1000_intr(adapter->pdev->irq, netdev);
-		enable_irq(adapter->pdev->irq);
+		if (napi_schedule_prep(&adapter->napi)) {
+			adapter->total_tx_bytes = 0;
+			adapter->total_tx_packets = 0;
+			adapter->total_rx_bytes = 0;
+			adapter->total_rx_packets = 0;
+			__napi_schedule(&adapter->napi);
+		}
 		break;
 	}
 }

Thanks,
Fengguang

^ permalink raw reply related [flat|nested] 17+ messages in thread
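[Annotation] One consequence of the e1000e diff is easy to miss: because the explicit MSI case is deleted, MSI mode now falls into the default branch together with legacy mode, so both take the NAPI-scheduling path while MSI-X still calls its handler directly. The sketch below models only that dispatch. The enum names and model_netpoll_action() are hypothetical and chosen for illustration; they are not the driver's identifiers or values.

```c
#include <assert.h>

/* Hypothetical interrupt modes; the numeric values are arbitrary. */
enum model_int_mode {
	MODEL_INT_LEGACY,
	MODEL_INT_MSI,
	MODEL_INT_MSIX,
};

/* What the poll hook ends up doing for a given mode. */
enum model_action {
	MODEL_ACT_CALL_MSIX_HANDLER,
	MODEL_ACT_SCHEDULE_NAPI,
};

/* Dispatch shape after the diff: one explicit case, everything else
 * (legacy and MSI) shares the default branch. */
static enum model_action model_netpoll_action(enum model_int_mode mode)
{
	assert(mode >= MODEL_INT_LEGACY && mode <= MODEL_INT_MSIX);

	switch (mode) {
	case MODEL_INT_MSIX:
		return MODEL_ACT_CALL_MSIX_HANDLER;
	default: /* legacy and, since its case was removed, MSI */
		return MODEL_ACT_SCHEDULE_NAPI;
	}
}
```

Read this way, the concern raised in the message about skipping the checks in e1000_intr() and e1000_intr_msi() applies to both the legacy and the MSI modes.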
* Re: [e1000_netpoll] BUG: sleeping function called from invalid context at kernel/irq/manage.c:110 2016-07-26 15:32 ` [e1000_netpoll] BUG: sleeping function called from invalid context at kernel/irq/manage.c:110 Fengguang Wu @ 2016-07-26 16:28 ` Eric Dumazet 2016-07-27 15:01 ` Fengguang Wu 0 siblings, 1 reply; 17+ messages in thread From: Eric Dumazet @ 2016-07-26 16:28 UTC (permalink / raw) To: Fengguang Wu Cc: LKML, netdev, Satyam Sharma, Thomas Gleixner, intel-wired-lan, Jeff Kirsher, Ye Xiaolong On Tue, 2016-07-26 at 23:32 +0800, Fengguang Wu wrote: > Hi Eric, > > It works! > > On Tue, Jul 26, 2016 at 11:14:52AM +0200, Eric Dumazet wrote: > >On Tue, 2016-07-26 at 11:50 +0800, Fengguang Wu wrote: > >> Greetings, > >> > >> This BUG message can be found in recent kernels as well as v4.4 and > >> linux-stable. It happens when running > >> > >> modprobe netconsole netconsole=@/,$port@$server/ > >> > >> [ 39.937534] 22 Jul 13:30:40 ntpdate[440]: step time server 192.168.1.1 offset -673.833841 sec > >> [ 39.943285] netpoll: netconsole: local port 6665 > >> [ 39.943436] netpoll: netconsole: local IPv4 address 0.0.0.0 > >> [ 39.943609] netpoll: netconsole: interface 'eth0' > >> [ 39.943756] netpoll: netconsole: remote port 6672 > >> [ 39.943913] netpoll: netconsole: remote IPv4 address 192.168.1.1 > >> [ 39.944099] netpoll: netconsole: remote ethernet address ff:ff:ff:ff:ff:ff > >> [ 39.944311] netpoll: netconsole: local IP 192.168.1.193 > >> [ 39.944514] BUG: sleeping function called from invalid context at kernel/irq/manage.c:110 > >> [ 39.944515] in_atomic(): 1, irqs_disabled(): 1, pid: 448, name: modprobe > >> [ 39.944517] CPU: 6 PID: 448 Comm: modprobe Not tainted 4.7.0-rc7-wt-ath-10122-gf9b5ec2 #102 > >> [ 39.944518] Hardware name: /DZ77BH-55K, BIOS BHZ7710H.86A.0097.2012.1228.1346 12/28/2012 > >> [ 39.944522] 0000000000000000 ffffc90001f2f9e8 ffffffff813417d9 ffff88007faba5c0 > >> [ 39.944524] 000000000000006e ffffc90001f2fa00 ffffffff810aec03 ffffffff81a25948 > 
>> [ 39.944525] ffffc90001f2fa28 ffffffff810aec9a ffff8803e5bd9400 ffff8803e50fbd68 > >> [ 39.944526] Call Trace: > >> [ 39.944533] [<ffffffff813417d9>] dump_stack+0x63/0x8a > >> [ 39.944536] [<ffffffff810aec03>] ___might_sleep+0xd3/0x120 > >> [ 39.944537] [<ffffffff810aec9a>] __might_sleep+0x4a/0x80 > >> [ 39.944541] [<ffffffff810e4638>] synchronize_irq+0x38/0xa0 > >> [ 39.944543] [<ffffffff810e3c8e>] ? __irq_put_desc_unlock+0x1e/0x40 > >> [ 39.944545] [<ffffffff810e48e3>] ? __disable_irq_nosync+0x43/0x60 > >> [ 39.944547] [<ffffffff810e492c>] disable_irq+0x1c/0x20 > >> [ 39.944559] [<ffffffffa0220932>] e1000_netpoll+0xf2/0x120 [e1000e] > >> [ 39.944563] [<ffffffff815f2bdc>] netpoll_poll_dev+0x5c/0x1a0 > >> [ 39.944567] [<ffffffff815bb361>] ? __kmalloc_reserve+0x31/0x90 > >> [ 39.944569] [<ffffffff815f2e8b>] netpoll_send_skb_on_dev+0x16b/0x250 > >> [ 39.944572] [<ffffffff815f325c>] netpoll_send_udp+0x2ec/0x450 > >> [ 39.944576] [<ffffffffa003cb62>] write_msg+0xb2/0xf0 [netconsole] > >> [ 39.944578] [<ffffffff810e04e5>] call_console_drivers+0x115/0x120 > >> [ 39.944580] [<ffffffff810e1f13>] console_unlock+0x333/0x5c0 > >> [ 39.944583] [<ffffffff810e2c74>] register_console+0x1c4/0x380 > >> [ 39.944586] [<ffffffffa004f1c5>] init_netconsole+0x1c5/0x1000 [netconsole] > >> [ 39.944588] [<ffffffffa004f000>] ? 0xffffffffa004f000 > >> [ 39.944591] [<ffffffff8100216d>] do_one_initcall+0x3d/0x150 > >> [ 39.944592] [<ffffffff810aec9a>] ? __might_sleep+0x4a/0x80 > >> [ 39.944596] [<ffffffff811f5098>] ? kmem_cache_alloc_trace+0x188/0x1e0 > >> [ 39.944598] [<ffffffff8118f871>] do_init_module+0x5f/0x1d8 > >> [ 39.944602] [<ffffffff81114009>] load_module+0x1429/0x1b40 > >> [ 39.944604] [<ffffffff81110cd0>] ? __symbol_put+0x40/0x40 > >> [ 39.944607] [<ffffffff8121f348>] ? kernel_read_file+0x178/0x1a0 > >> [ 39.944608] [<ffffffff8121f429>] ? 
kernel_read_file_from_fd+0x49/0x80 > >> [ 39.944611] [<ffffffff81114973>] SYSC_finit_module+0xc3/0xf0 > >> [ 39.944614] [<ffffffff811149be>] SyS_finit_module+0xe/0x10 > >> [ 39.944617] [<ffffffff816e5877>] entry_SYSCALL_64_fastpath+0x1a/0xa9 > >> [ 39.946384] console [netcon0] enabled > >> [ 39.946514] netconsole: network logging started > >> > >> Can this be possibly fixed? > > > >Could you try this ? > > > >diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c > >index f42129d09e2c23ba9fdb5cde890d50ecb7166a42..a53c41c4c4f7d1fe52f95a2cab8784a938b3820b 100644 > >--- a/drivers/net/ethernet/intel/e1000/e1000_main.c > >+++ b/drivers/net/ethernet/intel/e1000/e1000_main.c > >@@ -5257,9 +5257,13 @@ static void e1000_netpoll(struct net_device *netdev) > > { > > struct e1000_adapter *adapter = netdev_priv(netdev); > > > >- disable_irq(adapter->pdev->irq); > >- e1000_intr(adapter->pdev->irq, netdev); > >- enable_irq(adapter->pdev->irq); > >+ if (napi_schedule_prep(&adapter->napi)) { > >+ adapter->total_tx_bytes = 0; > >+ adapter->total_tx_packets = 0; > >+ adapter->total_rx_bytes = 0; > >+ adapter->total_rx_packets = 0; > >+ __napi_schedule(&adapter->napi); > >+ } > > The machines are actually running e1000e driver, so I copied your > approach to e1000e and it works: > > kern :info : [ 16.109647] netpoll: netconsole: local port 6665 > kern :info : [ 16.109961] netpoll: netconsole: local IPv4 address 0.0.0.0 > kern :info : [ 16.110346] netpoll: netconsole: interface 'eth0' > kern :info : [ 16.110672] netpoll: netconsole: remote port 6676 > kern :info : [ 16.110991] netpoll: netconsole: remote IPv4 address 192.168.2.1 > kern :info : [ 16.111398] netpoll: netconsole: remote ethernet address ff:ff:ff:ff:ff:ff > kern :info : [ 16.111845] netpoll: netconsole: local IP 192.168.2.3 > kern :info : [ 16.114284] console [netcon0] enabled > kern :info : [ 16.114550] netconsole: network logging started > > However I'm not sure if 
it'll have side effects, because this
> effectively disables the various checks in e1000_intr() and
> e1000_intr_msi().
>

As far as netpoll is concerned, this should not matter.

We only want to drain packets from TX rings.

I have no idea why you hit this issue only recently, since this looks
a rather old bug to me ?

> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
> index 9b4ec13..4f89873 100644
> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
> @@ -6711,15 +6711,14 @@ static void e1000_netpoll(struct net_device *netdev)
>  	case E1000E_INT_MODE_MSIX:
>  		e1000_intr_msix(adapter->pdev->irq, netdev);
>  		break;
> -	case E1000E_INT_MODE_MSI:
> -		disable_irq(adapter->pdev->irq);
> -		e1000_intr_msi(adapter->pdev->irq, netdev);
> -		enable_irq(adapter->pdev->irq);
> -		break;
>  	default: /* E1000E_INT_MODE_LEGACY */
> -		disable_irq(adapter->pdev->irq);
> -		e1000_intr(adapter->pdev->irq, netdev);
> -		enable_irq(adapter->pdev->irq);
> +		if (napi_schedule_prep(&adapter->napi)) {
> +			adapter->total_tx_bytes = 0;
> +			adapter->total_tx_packets = 0;
> +			adapter->total_rx_bytes = 0;
> +			adapter->total_rx_packets = 0;
> +			__napi_schedule(&adapter->napi);
> +		}
>  		break;
>  	}
>  }
>
> Thanks,
> Fengguang

^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [e1000_netpoll] BUG: sleeping function called from invalid context at kernel/irq/manage.c:110 2016-07-26 16:28 ` Eric Dumazet @ 2016-07-27 15:01 ` Fengguang Wu 2016-07-27 18:50 ` Eric Dumazet 0 siblings, 1 reply; 17+ messages in thread From: Fengguang Wu @ 2016-07-27 15:01 UTC (permalink / raw) To: Eric Dumazet Cc: LKML, netdev, Satyam Sharma, Thomas Gleixner, intel-wired-lan, Jeff Kirsher, Ye Xiaolong On Tue, Jul 26, 2016 at 06:28:33PM +0200, Eric Dumazet wrote: >On Tue, 2016-07-26 at 23:32 +0800, Fengguang Wu wrote: >> Hi Eric, >> >> It works! >> >> On Tue, Jul 26, 2016 at 11:14:52AM +0200, Eric Dumazet wrote: >> >On Tue, 2016-07-26 at 11:50 +0800, Fengguang Wu wrote: >> >> Greetings, >> >> >> >> This BUG message can be found in recent kernels as well as v4.4 and >> >> linux-stable. It happens when running >> >> >> >> modprobe netconsole netconsole=@/,$port@$server/ >> >> >> >> [ 39.937534] 22 Jul 13:30:40 ntpdate[440]: step time server 192.168.1.1 offset -673.833841 sec >> >> [ 39.943285] netpoll: netconsole: local port 6665 >> >> [ 39.943436] netpoll: netconsole: local IPv4 address 0.0.0.0 >> >> [ 39.943609] netpoll: netconsole: interface 'eth0' >> >> [ 39.943756] netpoll: netconsole: remote port 6672 >> >> [ 39.943913] netpoll: netconsole: remote IPv4 address 192.168.1.1 >> >> [ 39.944099] netpoll: netconsole: remote ethernet address ff:ff:ff:ff:ff:ff >> >> [ 39.944311] netpoll: netconsole: local IP 192.168.1.193 >> >> [ 39.944514] BUG: sleeping function called from invalid context at kernel/irq/manage.c:110 >> >> [ 39.944515] in_atomic(): 1, irqs_disabled(): 1, pid: 448, name: modprobe >> >> [ 39.944517] CPU: 6 PID: 448 Comm: modprobe Not tainted 4.7.0-rc7-wt-ath-10122-gf9b5ec2 #102 >> >> [ 39.944518] Hardware name: /DZ77BH-55K, BIOS BHZ7710H.86A.0097.2012.1228.1346 12/28/2012 >> >> [ 39.944522] 0000000000000000 ffffc90001f2f9e8 ffffffff813417d9 ffff88007faba5c0 >> >> [ 39.944524] 000000000000006e ffffc90001f2fa00 ffffffff810aec03 ffffffff81a25948 >> >> 
[ 39.944525] ffffc90001f2fa28 ffffffff810aec9a ffff8803e5bd9400 ffff8803e50fbd68 >> >> [ 39.944526] Call Trace: >> >> [ 39.944533] [<ffffffff813417d9>] dump_stack+0x63/0x8a >> >> [ 39.944536] [<ffffffff810aec03>] ___might_sleep+0xd3/0x120 >> >> [ 39.944537] [<ffffffff810aec9a>] __might_sleep+0x4a/0x80 >> >> [ 39.944541] [<ffffffff810e4638>] synchronize_irq+0x38/0xa0 >> >> [ 39.944543] [<ffffffff810e3c8e>] ? __irq_put_desc_unlock+0x1e/0x40 >> >> [ 39.944545] [<ffffffff810e48e3>] ? __disable_irq_nosync+0x43/0x60 >> >> [ 39.944547] [<ffffffff810e492c>] disable_irq+0x1c/0x20 >> >> [ 39.944559] [<ffffffffa0220932>] e1000_netpoll+0xf2/0x120 [e1000e] >> >> [ 39.944563] [<ffffffff815f2bdc>] netpoll_poll_dev+0x5c/0x1a0 >> >> [ 39.944567] [<ffffffff815bb361>] ? __kmalloc_reserve+0x31/0x90 >> >> [ 39.944569] [<ffffffff815f2e8b>] netpoll_send_skb_on_dev+0x16b/0x250 >> >> [ 39.944572] [<ffffffff815f325c>] netpoll_send_udp+0x2ec/0x450 >> >> [ 39.944576] [<ffffffffa003cb62>] write_msg+0xb2/0xf0 [netconsole] >> >> [ 39.944578] [<ffffffff810e04e5>] call_console_drivers+0x115/0x120 >> >> [ 39.944580] [<ffffffff810e1f13>] console_unlock+0x333/0x5c0 >> >> [ 39.944583] [<ffffffff810e2c74>] register_console+0x1c4/0x380 >> >> [ 39.944586] [<ffffffffa004f1c5>] init_netconsole+0x1c5/0x1000 [netconsole] >> >> [ 39.944588] [<ffffffffa004f000>] ? 0xffffffffa004f000 >> >> [ 39.944591] [<ffffffff8100216d>] do_one_initcall+0x3d/0x150 >> >> [ 39.944592] [<ffffffff810aec9a>] ? __might_sleep+0x4a/0x80 >> >> [ 39.944596] [<ffffffff811f5098>] ? kmem_cache_alloc_trace+0x188/0x1e0 >> >> [ 39.944598] [<ffffffff8118f871>] do_init_module+0x5f/0x1d8 >> >> [ 39.944602] [<ffffffff81114009>] load_module+0x1429/0x1b40 >> >> [ 39.944604] [<ffffffff81110cd0>] ? __symbol_put+0x40/0x40 >> >> [ 39.944607] [<ffffffff8121f348>] ? kernel_read_file+0x178/0x1a0 >> >> [ 39.944608] [<ffffffff8121f429>] ? 
kernel_read_file_from_fd+0x49/0x80 >> >> [ 39.944611] [<ffffffff81114973>] SYSC_finit_module+0xc3/0xf0 >> >> [ 39.944614] [<ffffffff811149be>] SyS_finit_module+0xe/0x10 >> >> [ 39.944617] [<ffffffff816e5877>] entry_SYSCALL_64_fastpath+0x1a/0xa9 >> >> [ 39.946384] console [netcon0] enabled >> >> [ 39.946514] netconsole: network logging started >> >> >> >> Can this be possibly fixed? >> > >> >Could you try this ? >> > >> >diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c >> >index f42129d09e2c23ba9fdb5cde890d50ecb7166a42..a53c41c4c4f7d1fe52f95a2cab8784a938b3820b 100644 >> >--- a/drivers/net/ethernet/intel/e1000/e1000_main.c >> >+++ b/drivers/net/ethernet/intel/e1000/e1000_main.c >> >@@ -5257,9 +5257,13 @@ static void e1000_netpoll(struct net_device *netdev) >> > { >> > struct e1000_adapter *adapter = netdev_priv(netdev); >> > >> >- disable_irq(adapter->pdev->irq); >> >- e1000_intr(adapter->pdev->irq, netdev); >> >- enable_irq(adapter->pdev->irq); >> >+ if (napi_schedule_prep(&adapter->napi)) { >> >+ adapter->total_tx_bytes = 0; >> >+ adapter->total_tx_packets = 0; >> >+ adapter->total_rx_bytes = 0; >> >+ adapter->total_rx_packets = 0; >> >+ __napi_schedule(&adapter->napi); >> >+ } >> >> The machines are actually running e1000e driver, so I copied your >> approach to e1000e and it works: >> >> kern :info : [ 16.109647] netpoll: netconsole: local port 6665 >> kern :info : [ 16.109961] netpoll: netconsole: local IPv4 address 0.0.0.0 >> kern :info : [ 16.110346] netpoll: netconsole: interface 'eth0' >> kern :info : [ 16.110672] netpoll: netconsole: remote port 6676 >> kern :info : [ 16.110991] netpoll: netconsole: remote IPv4 address 192.168.2.1 >> kern :info : [ 16.111398] netpoll: netconsole: remote ethernet address ff:ff:ff:ff:ff:ff >> kern :info : [ 16.111845] netpoll: netconsole: local IP 192.168.2.3 >> kern :info : [ 16.114284] console [netcon0] enabled >> kern :info : [ 16.114550] netconsole: network logging 
started >> >> However I'm not sure if it'll have side effects, because this >> effectively disables the various checks in e1000_intr() and >> e1000_intr_msi(). >> > >As far as netpoll is concerned, this should not matter. > >We only want to drain packets from TX rings. OK. >I have no idea why you hit this issue only recently, since this looks a >rather old bug to me ? Yeah it's a rather old bug. It becomes obvious when we try to detect and filter out buggy kernels for the machines that are expected to run stable services. This BUG effectively blocks the stable machines from booting because no clean kernel (v4.6.4, v4.6, v4.5, v4.4, ...) are available at all. ;) Thanks, Fengguang >> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c >> index 9b4ec13..4f89873 100644 >> --- a/drivers/net/ethernet/intel/e1000e/netdev.c >> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c >> @@ -6711,15 +6711,14 @@ static void e1000_netpoll(struct net_device *netdev) >> case E1000E_INT_MODE_MSIX: >> e1000_intr_msix(adapter->pdev->irq, netdev); >> break; >> - case E1000E_INT_MODE_MSI: >> - disable_irq(adapter->pdev->irq); >> - e1000_intr_msi(adapter->pdev->irq, netdev); >> - enable_irq(adapter->pdev->irq); >> - break; >> default: /* E1000E_INT_MODE_LEGACY */ >> - disable_irq(adapter->pdev->irq); >> - e1000_intr(adapter->pdev->irq, netdev); >> - enable_irq(adapter->pdev->irq); >> + if (napi_schedule_prep(&adapter->napi)) { >> + adapter->total_tx_bytes = 0; >> + adapter->total_tx_packets = 0; >> + adapter->total_rx_bytes = 0; >> + adapter->total_rx_packets = 0; >> + __napi_schedule(&adapter->napi); >> + } >> break; >> } >> } >> >> Thanks, >> Fengguang > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [e1000_netpoll] BUG: sleeping function called from invalid context at kernel/irq/manage.c:110 2016-07-27 15:01 ` Fengguang Wu @ 2016-07-27 18:50 ` Eric Dumazet 0 siblings, 0 replies; 17+ messages in thread From: Eric Dumazet @ 2016-07-27 18:50 UTC (permalink / raw) To: Fengguang Wu Cc: LKML, netdev, Satyam Sharma, Thomas Gleixner, intel-wired-lan, Jeff Kirsher, Ye Xiaolong On Wed, 2016-07-27 at 23:01 +0800, Fengguang Wu wrote: > On Tue, Jul 26, 2016 at 06:28:33PM +0200, Eric Dumazet wrote: > >On Tue, 2016-07-26 at 23:32 +0800, Fengguang Wu wrote: > >> Hi Eric, > >> > >> It works! > >> > >> On Tue, Jul 26, 2016 at 11:14:52AM +0200, Eric Dumazet wrote: > >> >On Tue, 2016-07-26 at 11:50 +0800, Fengguang Wu wrote: > >> >> Greetings, > >> >> > >> >> This BUG message can be found in recent kernels as well as v4.4 and > >> >> linux-stable. It happens when running > >> >> > >> >> modprobe netconsole netconsole=@/,$port@$server/ > >> >> > >> >> [ 39.937534] 22 Jul 13:30:40 ntpdate[440]: step time server 192.168.1.1 offset -673.833841 sec > >> >> [ 39.943285] netpoll: netconsole: local port 6665 > >> >> [ 39.943436] netpoll: netconsole: local IPv4 address 0.0.0.0 > >> >> [ 39.943609] netpoll: netconsole: interface 'eth0' > >> >> [ 39.943756] netpoll: netconsole: remote port 6672 > >> >> [ 39.943913] netpoll: netconsole: remote IPv4 address 192.168.1.1 > >> >> [ 39.944099] netpoll: netconsole: remote ethernet address ff:ff:ff:ff:ff:ff > >> >> [ 39.944311] netpoll: netconsole: local IP 192.168.1.193 > >> >> [ 39.944514] BUG: sleeping function called from invalid context at kernel/irq/manage.c:110 > >> >> [ 39.944515] in_atomic(): 1, irqs_disabled(): 1, pid: 448, name: modprobe > >> >> [ 39.944517] CPU: 6 PID: 448 Comm: modprobe Not tainted 4.7.0-rc7-wt-ath-10122-gf9b5ec2 #102 > >> >> [ 39.944518] Hardware name: /DZ77BH-55K, BIOS BHZ7710H.86A.0097.2012.1228.1346 12/28/2012 > >> >> [ 39.944522] 0000000000000000 ffffc90001f2f9e8 ffffffff813417d9 ffff88007faba5c0 > >> >> [ 
39.944524] 000000000000006e ffffc90001f2fa00 ffffffff810aec03 ffffffff81a25948 > >> >> [ 39.944525] ffffc90001f2fa28 ffffffff810aec9a ffff8803e5bd9400 ffff8803e50fbd68 > >> >> [ 39.944526] Call Trace: > >> >> [ 39.944533] [<ffffffff813417d9>] dump_stack+0x63/0x8a > >> >> [ 39.944536] [<ffffffff810aec03>] ___might_sleep+0xd3/0x120 > >> >> [ 39.944537] [<ffffffff810aec9a>] __might_sleep+0x4a/0x80 > >> >> [ 39.944541] [<ffffffff810e4638>] synchronize_irq+0x38/0xa0 > >> >> [ 39.944543] [<ffffffff810e3c8e>] ? __irq_put_desc_unlock+0x1e/0x40 > >> >> [ 39.944545] [<ffffffff810e48e3>] ? __disable_irq_nosync+0x43/0x60 > >> >> [ 39.944547] [<ffffffff810e492c>] disable_irq+0x1c/0x20 > >> >> [ 39.944559] [<ffffffffa0220932>] e1000_netpoll+0xf2/0x120 [e1000e] > >> >> [ 39.944563] [<ffffffff815f2bdc>] netpoll_poll_dev+0x5c/0x1a0 > >> >> [ 39.944567] [<ffffffff815bb361>] ? __kmalloc_reserve+0x31/0x90 > >> >> [ 39.944569] [<ffffffff815f2e8b>] netpoll_send_skb_on_dev+0x16b/0x250 > >> >> [ 39.944572] [<ffffffff815f325c>] netpoll_send_udp+0x2ec/0x450 > >> >> [ 39.944576] [<ffffffffa003cb62>] write_msg+0xb2/0xf0 [netconsole] > >> >> [ 39.944578] [<ffffffff810e04e5>] call_console_drivers+0x115/0x120 > >> >> [ 39.944580] [<ffffffff810e1f13>] console_unlock+0x333/0x5c0 > >> >> [ 39.944583] [<ffffffff810e2c74>] register_console+0x1c4/0x380 > >> >> [ 39.944586] [<ffffffffa004f1c5>] init_netconsole+0x1c5/0x1000 [netconsole] > >> >> [ 39.944588] [<ffffffffa004f000>] ? 0xffffffffa004f000 > >> >> [ 39.944591] [<ffffffff8100216d>] do_one_initcall+0x3d/0x150 > >> >> [ 39.944592] [<ffffffff810aec9a>] ? __might_sleep+0x4a/0x80 > >> >> [ 39.944596] [<ffffffff811f5098>] ? kmem_cache_alloc_trace+0x188/0x1e0 > >> >> [ 39.944598] [<ffffffff8118f871>] do_init_module+0x5f/0x1d8 > >> >> [ 39.944602] [<ffffffff81114009>] load_module+0x1429/0x1b40 > >> >> [ 39.944604] [<ffffffff81110cd0>] ? __symbol_put+0x40/0x40 > >> >> [ 39.944607] [<ffffffff8121f348>] ? 
kernel_read_file+0x178/0x1a0 > >> >> [ 39.944608] [<ffffffff8121f429>] ? kernel_read_file_from_fd+0x49/0x80 > >> >> [ 39.944611] [<ffffffff81114973>] SYSC_finit_module+0xc3/0xf0 > >> >> [ 39.944614] [<ffffffff811149be>] SyS_finit_module+0xe/0x10 > >> >> [ 39.944617] [<ffffffff816e5877>] entry_SYSCALL_64_fastpath+0x1a/0xa9 > >> >> [ 39.946384] console [netcon0] enabled > >> >> [ 39.946514] netconsole: network logging started > >> >> > >> >> Can this be possibly fixed? > >> > > >> >Could you try this ? > >> > > >> >diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c > >> >index f42129d09e2c23ba9fdb5cde890d50ecb7166a42..a53c41c4c4f7d1fe52f95a2cab8784a938b3820b 100644 > >> >--- a/drivers/net/ethernet/intel/e1000/e1000_main.c > >> >+++ b/drivers/net/ethernet/intel/e1000/e1000_main.c > >> >@@ -5257,9 +5257,13 @@ static void e1000_netpoll(struct net_device *netdev) > >> > { > >> > struct e1000_adapter *adapter = netdev_priv(netdev); > >> > > >> >- disable_irq(adapter->pdev->irq); > >> >- e1000_intr(adapter->pdev->irq, netdev); > >> >- enable_irq(adapter->pdev->irq); > >> >+ if (napi_schedule_prep(&adapter->napi)) { > >> >+ adapter->total_tx_bytes = 0; > >> >+ adapter->total_tx_packets = 0; > >> >+ adapter->total_rx_bytes = 0; > >> >+ adapter->total_rx_packets = 0; > >> >+ __napi_schedule(&adapter->napi); > >> >+ } > >> > >> The machines are actually running e1000e driver, so I copied your > >> approach to e1000e and it works: > >> > >> kern :info : [ 16.109647] netpoll: netconsole: local port 6665 > >> kern :info : [ 16.109961] netpoll: netconsole: local IPv4 address 0.0.0.0 > >> kern :info : [ 16.110346] netpoll: netconsole: interface 'eth0' > >> kern :info : [ 16.110672] netpoll: netconsole: remote port 6676 > >> kern :info : [ 16.110991] netpoll: netconsole: remote IPv4 address 192.168.2.1 > >> kern :info : [ 16.111398] netpoll: netconsole: remote ethernet address ff:ff:ff:ff:ff:ff > >> kern :info : [ 16.111845] 
netpoll: netconsole: local IP 192.168.2.3 > >> kern :info : [ 16.114284] console [netcon0] enabled > >> kern :info : [ 16.114550] netconsole: network logging started > >> > >> However I'm not sure if it'll have side effects, because this > >> effectively disables the various checks in e1000_intr() and > >> e1000_intr_msi(). > >> > > > >As far as netpoll is concerned, this should not matter. > > > >We only want to drain packets from TX rings. > > OK. > > >I have no idea why you hit this issue only recently, since this looks a > >rather old bug to me ? > > Yeah it's a rather old bug. It becomes obvious when we try to detect > and filter out buggy kernels for the machines that are expected to run > stable services. This BUG effectively blocks the stable machines from > booting because no clean kernel (v4.6.4, v4.6, v4.5, v4.4, ...) are > available at all. ;) > > Thanks, > Fengguang > >> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c > >> index 9b4ec13..4f89873 100644 > >> --- a/drivers/net/ethernet/intel/e1000e/netdev.c > >> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c > >> @@ -6711,15 +6711,14 @@ static void e1000_netpoll(struct net_device *netdev) > >> case E1000E_INT_MODE_MSIX: > >> e1000_intr_msix(adapter->pdev->irq, netdev); > >> break; > >> - case E1000E_INT_MODE_MSI: > >> - disable_irq(adapter->pdev->irq); > >> - e1000_intr_msi(adapter->pdev->irq, netdev); > >> - enable_irq(adapter->pdev->irq); > >> - break; > >> default: /* E1000E_INT_MODE_LEGACY */ > >> - disable_irq(adapter->pdev->irq); > >> - e1000_intr(adapter->pdev->irq, netdev); > >> - enable_irq(adapter->pdev->irq); > >> + if (napi_schedule_prep(&adapter->napi)) { > >> + adapter->total_tx_bytes = 0; > >> + adapter->total_tx_packets = 0; > >> + adapter->total_rx_bytes = 0; > >> + adapter->total_rx_packets = 0; > >> + __napi_schedule(&adapter->napi); > >> + } > >> break; > >> } > >> } > >> > >> Thanks, > >> Fengguang > > > About all netpoll 
implementations use disable_irq(), so I guess netpoll is not compatible with threaded irqs. ^ permalink raw reply [flat|nested] 17+ messages in thread
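The "in_atomic(): 1, irqs_disabled(): 1" line in the report can be read as a simple context check. Below is a minimal user-space sketch of that check, not kernel code: `struct toy_context`, `toy_sleep_allowed()` and `toy_disable_irq_would_bug()` are hypothetical names, and the model assumes that disable_irq() may sleep because it waits for the handler (possibly a threaded one) to finish. The real check in ___might_sleep() looks at more state than these two flags.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the two flags printed in the BUG report. */
struct toy_context {
    bool in_atomic;      /* models in_atomic() */
    bool irqs_disabled;  /* models irqs_disabled() */
};

/* Toy rule: a call that may sleep is only legal when neither flag is set. */
static bool toy_sleep_allowed(const struct toy_context *ctx)
{
    return !ctx->in_atomic && !ctx->irqs_disabled;
}

/*
 * Toy disable_irq(): returns true when calling it from this context would
 * trip the "sleeping function called from invalid context" check.
 * Assumption: the real disable_irq() can sleep while waiting for a handler.
 */
static bool toy_disable_irq_would_bug(const struct toy_context *ctx)
{
    return !toy_sleep_allowed(ctx);
}
```

With both flags set, as on the console_unlock() path in the trace, the toy check reports a bug; with both clear it does not.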
* Re: [e1000_netpoll] BUG: sleeping function called from invalid context at kernel/irq/manage.c:110 2016-07-26 9:14 ` Eric Dumazet [not found] ` <20160726093224.GA10339@wfg-t540p.sh.intel.com> 2016-07-26 15:32 ` [e1000_netpoll] BUG: sleeping function called from invalid context at kernel/irq/manage.c:110 Fengguang Wu @ 2016-07-27 21:38 ` Jeff Kirsher 2016-07-28 5:43 ` Eric Dumazet 2 siblings, 1 reply; 17+ messages in thread From: Jeff Kirsher @ 2016-07-27 21:38 UTC (permalink / raw) To: Eric Dumazet, Fengguang Wu Cc: LKML, netdev, Satyam Sharma, Thomas Gleixner, intel-wired-lan, Ye Xiaolong [-- Attachment #1: Type: text/plain, Size: 1319 bytes --] On Tue, 2016-07-26 at 11:14 +0200, Eric Dumazet wrote: > Could you try this ? > > diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c > b/drivers/net/ethernet/intel/e1000/e1000_main.c > index > f42129d09e2c23ba9fdb5cde890d50ecb7166a42..a53c41c4c4f7d1fe52f95a2cab8784a > 938b3820b 100644 > --- a/drivers/net/ethernet/intel/e1000/e1000_main.c > +++ b/drivers/net/ethernet/intel/e1000/e1000_main.c > @@ -5257,9 +5257,13 @@ static void e1000_netpoll(struct net_device > *netdev) > { > struct e1000_adapter *adapter = netdev_priv(netdev); > > - disable_irq(adapter->pdev->irq); > - e1000_intr(adapter->pdev->irq, netdev); > - enable_irq(adapter->pdev->irq); > + if (napi_schedule_prep(&adapter->napi)) { > + adapter->total_tx_bytes = 0; > + adapter->total_tx_packets = 0; > + adapter->total_rx_bytes = 0; > + adapter->total_rx_packets = 0; > + __napi_schedule(&adapter->napi); > + } > } > #endif > Since this fixes the issue Fengguang saw, will you be submitting a formal patch Eric? (please) I can get this queued up for Dave's net tree as soon as I receive the formal patch. [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 819 bytes --] ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [e1000_netpoll] BUG: sleeping function called from invalid context at kernel/irq/manage.c:110 2016-07-27 21:38 ` Jeff Kirsher @ 2016-07-28 5:43 ` Eric Dumazet 2016-07-28 10:19 ` Sabrina Dubroca 2016-07-28 23:28 ` [Intel-wired-lan] " Francois Romieu 0 siblings, 2 replies; 17+ messages in thread From: Eric Dumazet @ 2016-07-28 5:43 UTC (permalink / raw) To: Jeff Kirsher Cc: Fengguang Wu, LKML, netdev, Satyam Sharma, Thomas Gleixner, intel-wired-lan, Ye Xiaolong On Wed, 2016-07-27 at 14:38 -0700, Jeff Kirsher wrote: > On Tue, 2016-07-26 at 11:14 +0200, Eric Dumazet wrote: > > Could you try this ? > > > > diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c > > b/drivers/net/ethernet/intel/e1000/e1000_main.c > > index > > f42129d09e2c23ba9fdb5cde890d50ecb7166a42..a53c41c4c4f7d1fe52f95a2cab8784a > > 938b3820b 100644 > > --- a/drivers/net/ethernet/intel/e1000/e1000_main.c > > +++ b/drivers/net/ethernet/intel/e1000/e1000_main.c > > @@ -5257,9 +5257,13 @@ static void e1000_netpoll(struct net_device > > *netdev) > > { > > struct e1000_adapter *adapter = netdev_priv(netdev); > > > > - disable_irq(adapter->pdev->irq); > > - e1000_intr(adapter->pdev->irq, netdev); > > - enable_irq(adapter->pdev->irq); > > + if (napi_schedule_prep(&adapter->napi)) { > > + adapter->total_tx_bytes = 0; > > + adapter->total_tx_packets = 0; > > + adapter->total_rx_bytes = 0; > > + adapter->total_rx_packets = 0; > > + __napi_schedule(&adapter->napi); > > + } > > } > > #endif > > > > Since this fixes the issue Fengguang saw, will you be submitting a formal > patch Eric? (please) I can get this queued up for Dave's net tree as soon > as I receive the formal patch. I would prefer having a definitive advice from Thomas Gleixner and/or others if disable_irq() is forbidden from IRQ path. As I said, about all netpoll() methods in net drivers use disable_irq() so a lot of patches would be needed. 
disable_irq() should then test this condition earlier, so that we can detect potential bug, even if the IRQ is not (yet) threaded. Thanks. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [e1000_netpoll] BUG: sleeping function called from invalid context at kernel/irq/manage.c:110 2016-07-28 5:43 ` Eric Dumazet @ 2016-07-28 10:19 ` Sabrina Dubroca 2016-07-28 12:21 ` Thomas Gleixner 2016-07-28 13:30 ` Fengguang Wu 2016-07-28 23:28 ` [Intel-wired-lan] " Francois Romieu 1 sibling, 2 replies; 17+ messages in thread From: Sabrina Dubroca @ 2016-07-28 10:19 UTC (permalink / raw) To: Eric Dumazet Cc: Jeff Kirsher, Fengguang Wu, LKML, netdev, Satyam Sharma, Thomas Gleixner, intel-wired-lan, Ye Xiaolong 2016-07-28, 07:43:55 +0200, Eric Dumazet wrote: > On Wed, 2016-07-27 at 14:38 -0700, Jeff Kirsher wrote: > > On Tue, 2016-07-26 at 11:14 +0200, Eric Dumazet wrote: > > > Could you try this ? > > > > > > diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c > > > b/drivers/net/ethernet/intel/e1000/e1000_main.c > > > index > > > f42129d09e2c23ba9fdb5cde890d50ecb7166a42..a53c41c4c4f7d1fe52f95a2cab8784a > > > 938b3820b 100644 > > > --- a/drivers/net/ethernet/intel/e1000/e1000_main.c > > > +++ b/drivers/net/ethernet/intel/e1000/e1000_main.c > > > @@ -5257,9 +5257,13 @@ static void e1000_netpoll(struct net_device > > > *netdev) > > > { > > > struct e1000_adapter *adapter = netdev_priv(netdev); > > > > > > - disable_irq(adapter->pdev->irq); > > > - e1000_intr(adapter->pdev->irq, netdev); > > > - enable_irq(adapter->pdev->irq); > > > + if (napi_schedule_prep(&adapter->napi)) { > > > + adapter->total_tx_bytes = 0; > > > + adapter->total_tx_packets = 0; > > > + adapter->total_rx_bytes = 0; > > > + adapter->total_rx_packets = 0; > > > + __napi_schedule(&adapter->napi); > > > + } > > > } > > > #endif > > > > > > > Since this fixes the issue Fengguang saw, will you be submitting a formal > > patch Eric? (please) I can get this queued up for Dave's net tree as soon > > as I receive the formal patch. > > I would prefer having a definitive advice from Thomas Gleixner and/or > others if disable_irq() is forbidden from IRQ path. 
> > As I said, about all netpoll() methods in net drivers use disable_irq() > so a lot of patches would be needed. > > disable_irq() should then test this condition earlier, so that we can > detect potential bug, even if the IRQ is not (yet) threaded. The idea when this first came up was to skip the sleeping part of disable_irq(): http://marc.info/?l=linux-netdev&m=142314159626052 This fell off my todolist and I didn't send the conversion patches, which would basically look like this: diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c index 41f32c0b341e..b022691e680b 100644 --- a/drivers/net/ethernet/intel/e1000e/netdev.c +++ b/drivers/net/ethernet/intel/e1000e/netdev.c @@ -6713,20 +6713,20 @@ static irqreturn_t e1000_intr_msix(int __always_unused irq, void *data) vector = 0; msix_irq = adapter->msix_entries[vector].vector; - disable_irq(msix_irq); - e1000_intr_msix_rx(msix_irq, netdev); + if (disable_hardirq(msix_irq)) + e1000_intr_msix_rx(msix_irq, netdev); enable_irq(msix_irq); vector++; msix_irq = adapter->msix_entries[vector].vector; - disable_irq(msix_irq); - e1000_intr_msix_tx(msix_irq, netdev); + if (disable_hardirq(msix_irq)) + e1000_intr_msix_tx(msix_irq, netdev); enable_irq(msix_irq); vector++; msix_irq = adapter->msix_entries[vector].vector; - disable_irq(msix_irq); - e1000_msix_other(msix_irq, netdev); + if (disable_hardirq(msix_irq)) + e1000_msix_other(msix_irq, netdev); enable_irq(msix_irq); } @@ -6750,13 +6750,13 @@ static void e1000_netpoll(struct net_device *netdev) e1000_intr_msix(adapter->pdev->irq, netdev); break; case E1000E_INT_MODE_MSI: - disable_irq(adapter->pdev->irq); - e1000_intr_msi(adapter->pdev->irq, netdev); + if (disable_hardirq(adapter->pdev->irq)) + e1000_intr_msi(adapter->pdev->irq, netdev); enable_irq(adapter->pdev->irq); break; default: /* E1000E_INT_MODE_LEGACY */ - disable_irq(adapter->pdev->irq); - e1000_intr(adapter->pdev->irq, netdev); + if 
(disable_hardirq(adapter->pdev->irq)) + e1000_intr(adapter->pdev->irq, netdev); enable_irq(adapter->pdev->irq); break; } -- Sabrina ^ permalink raw reply related [flat|nested] 17+ messages in thread
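The control flow of the conversion above can be looked at in isolation with a small user-space toy model. This is a sketch under stated assumptions, not kernel code: `struct toy_irq` and every `toy_*` function are hypothetical stand-ins, and the model assumes that the real disable_hardirq() does not sleep and returns false while a threaded handler for the line is still active.

```c
#include <stdbool.h>

/* Hypothetical stand-in for per-IRQ state; not the kernel's struct irq_desc. */
struct toy_irq {
    int disable_depth;   /* nesting count of disable calls */
    int threads_active;  /* threaded handlers currently running */
};

/* Models disable_irq(): returns true when the caller would have to sleep. */
static bool toy_disable_irq_would_sleep(struct toy_irq *irq)
{
    irq->disable_depth++;
    return irq->threads_active > 0;
}

/* Models disable_hardirq(): never sleeps, reports whether polling is safe. */
static bool toy_disable_hardirq(struct toy_irq *irq)
{
    irq->disable_depth++;
    return irq->threads_active == 0;
}

/* Models enable_irq(): undoes one disable call. */
static void toy_enable_irq(struct toy_irq *irq)
{
    irq->disable_depth--;
}

/*
 * Models the converted netpoll path: run the handler only when the disable
 * call says it is safe, and re-enable in either case, as in the diff.
 * Returns the disable depth left behind (0 means balanced).
 */
static int toy_netpoll(struct toy_irq *irq, int *handled)
{
    if (toy_disable_hardirq(irq))
        (*handled)++;  /* stands in for calling the interrupt handler */
    toy_enable_irq(irq);
    return irq->disable_depth;
}
```

In this model a poll on an idle line runs the handler once, while a poll on a line with an active threaded handler skips it; both leave the disable depth balanced.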
* Re: [e1000_netpoll] BUG: sleeping function called from invalid context at kernel/irq/manage.c:110 2016-07-28 10:19 ` Sabrina Dubroca @ 2016-07-28 12:21 ` Thomas Gleixner 2016-07-28 13:30 ` Fengguang Wu 1 sibling, 0 replies; 17+ messages in thread From: Thomas Gleixner @ 2016-07-28 12:21 UTC (permalink / raw) To: Sabrina Dubroca Cc: Eric Dumazet, Jeff Kirsher, Fengguang Wu, LKML, netdev, Satyam Sharma, intel-wired-lan, Ye Xiaolong On Thu, 28 Jul 2016, Sabrina Dubroca wrote: > 2016-07-28, 07:43:55 +0200, Eric Dumazet wrote: > > I would prefer having a definitive advice from Thomas Gleixner and/or > > others if disable_irq() is forbidden from IRQ path. Yes it is. Before we added threaded interrupt handlers it was not an issue, but with (possibly) threaded interrupts it's an absolute no-no. > > As I said, about all netpoll() methods in net drivers use disable_irq() > > so a lot of patches would be needed. > > > > disable_irq() should then test this condition earlier, so that we can > > detect potential bug, even if the IRQ is not (yet) threaded. > > The idea when this first came up was to skip the sleeping part of > disable_irq(): > > http://marc.info/?l=linux-netdev&m=142314159626052 > > This fell off my todolist and I didn't send the conversion patches, > which would basically look like this: > > diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c > index 41f32c0b341e..b022691e680b 100644 > --- a/drivers/net/ethernet/intel/e1000e/netdev.c > +++ b/drivers/net/ethernet/intel/e1000e/netdev.c > @@ -6713,20 +6713,20 @@ static irqreturn_t e1000_intr_msix(int __always_unused irq, void *data) > > vector = 0; > msix_irq = adapter->msix_entries[vector].vector; > - disable_irq(msix_irq); > - e1000_intr_msix_rx(msix_irq, netdev); > + if (disable_hardirq(msix_irq)) > + e1000_intr_msix_rx(msix_irq, netdev); > enable_irq(msix_irq); That'll work nicely even when one of the affected interrupts is threaded. 
Thanks, tglx ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [e1000_netpoll] BUG: sleeping function called from invalid context at kernel/irq/manage.c:110 2016-07-28 10:19 ` Sabrina Dubroca 2016-07-28 12:21 ` Thomas Gleixner @ 2016-07-28 13:30 ` Fengguang Wu 1 sibling, 0 replies; 17+ messages in thread From: Fengguang Wu @ 2016-07-28 13:30 UTC (permalink / raw) To: Sabrina Dubroca Cc: Eric Dumazet, Jeff Kirsher, LKML, netdev, Satyam Sharma, Thomas Gleixner, intel-wired-lan, Ye Xiaolong Hi Sabrina, >The idea when this first came up was to skip the sleeping part of >disable_irq(): > >http://marc.info/?l=linux-netdev&m=142314159626052 > >This fell off my todolist and I didn't send the conversion patches, >which would basically look like this: Yes it works in the several machines that had the BUG! [ 23.806847] netpoll: netconsole: local port 6665 [ 23.807145] netpoll: netconsole: local IPv4 address 0.0.0.0 [ 23.807494] netpoll: netconsole: interface 'eth0' [ 23.807799] netpoll: netconsole: remote port 6646 [ 23.808096] netpoll: netconsole: remote IPv4 address 192.168.1.1 [ 23.808474] netpoll: netconsole: remote ethernet address ff:ff:ff:ff:ff:ff [ 23.808910] netpoll: netconsole: local IP 192.168.1.161 [ 23.811680] 28 Jul 19:42:10 ntpdate[376]: step time server 192.168.1.1 offset 1696.257557 sec [ 23.811886] console [netcon0] enabled [ 23.812131] netconsole: network logging started Thanks, Fengguang > >diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c >index 41f32c0b341e..b022691e680b 100644 >--- a/drivers/net/ethernet/intel/e1000e/netdev.c >+++ b/drivers/net/ethernet/intel/e1000e/netdev.c >@@ -6713,20 +6713,20 @@ static irqreturn_t e1000_intr_msix(int __always_unused irq, void *data) > > vector = 0; > msix_irq = adapter->msix_entries[vector].vector; >- disable_irq(msix_irq); >- e1000_intr_msix_rx(msix_irq, netdev); >+ if (disable_hardirq(msix_irq)) >+ e1000_intr_msix_rx(msix_irq, netdev); > enable_irq(msix_irq); > > vector++; > msix_irq = 
adapter->msix_entries[vector].vector; >- disable_irq(msix_irq); >- e1000_intr_msix_tx(msix_irq, netdev); >+ if (disable_hardirq(msix_irq)) >+ e1000_intr_msix_tx(msix_irq, netdev); > enable_irq(msix_irq); > > vector++; > msix_irq = adapter->msix_entries[vector].vector; >- disable_irq(msix_irq); >- e1000_msix_other(msix_irq, netdev); >+ if (disable_hardirq(msix_irq)) >+ e1000_msix_other(msix_irq, netdev); > enable_irq(msix_irq); > } > >@@ -6750,13 +6750,13 @@ static void e1000_netpoll(struct net_device *netdev) > e1000_intr_msix(adapter->pdev->irq, netdev); > break; > case E1000E_INT_MODE_MSI: >- disable_irq(adapter->pdev->irq); >- e1000_intr_msi(adapter->pdev->irq, netdev); >+ if (disable_hardirq(adapter->pdev->irq)) >+ e1000_intr_msi(adapter->pdev->irq, netdev); > enable_irq(adapter->pdev->irq); > break; > default: /* E1000E_INT_MODE_LEGACY */ >- disable_irq(adapter->pdev->irq); >- e1000_intr(adapter->pdev->irq, netdev); >+ if (disable_hardirq(adapter->pdev->irq)) >+ e1000_intr(adapter->pdev->irq, netdev); > enable_irq(adapter->pdev->irq); > break; > } ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [Intel-wired-lan] [e1000_netpoll] BUG: sleeping function called from invalid context at kernel/irq/manage.c:110 2016-07-28 5:43 ` Eric Dumazet 2016-07-28 10:19 ` Sabrina Dubroca @ 2016-07-28 23:28 ` Francois Romieu 1 sibling, 0 replies; 17+ messages in thread From: Francois Romieu @ 2016-07-28 23:28 UTC (permalink / raw) To: Eric Dumazet Cc: Jeff Kirsher, netdev, LKML, Ye Xiaolong, intel-wired-lan, Satyam Sharma, Thomas Gleixner Eric Dumazet <eric.dumazet@gmail.com> : [...] > I would prefer having a definitive advice from Thomas Gleixner and/or > others if disable_irq() is forbidden from IRQ path. > > As I said, about all netpoll() methods in net drivers use disable_irq() > so a lot of patches would be needed. s/about all/many/ There has been a WARN_ONCE(!irqs_disabled() in netpoll_send_skb_on_dev for quite some time now but it's apparently screened by too many tests to be effective. :o/ -- Ueimor ^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2016-07-28 23:29 UTC]
Thread overview: 17+ messages
2016-07-26 3:50 [e1000_netpoll] BUG: sleeping function called from invalid context at kernel/irq/manage.c:110 Fengguang Wu
2016-07-26 9:14 ` Eric Dumazet
[not found] ` <20160726093224.GA10339@wfg-t540p.sh.intel.com>
2016-07-26 9:45 ` [PATCH] schedule function called for e1000 driver interrupt kbuild test robot
2016-07-26 9:50 ` [e1000_netpoll] BUG: sleeping function called from invalid context at kernel/irq/manage.c:110 Thomas Gleixner
[not found] ` <8578bb16-cd04-e8a5-c7f4-be061ede95b4@gmail.com>
2016-07-28 7:45 ` Thomas Gleixner
2016-07-28 9:46 ` Valdis.Kletnieks
2016-07-26 9:50 ` [PATCH] schedule function called for e1000 driver interrupt kbuild test robot
2016-07-26 15:32 ` [e1000_netpoll] BUG: sleeping function called from invalid context at kernel/irq/manage.c:110 Fengguang Wu
2016-07-26 16:28 ` Eric Dumazet
2016-07-27 15:01 ` Fengguang Wu
2016-07-27 18:50 ` Eric Dumazet
2016-07-27 21:38 ` Jeff Kirsher
2016-07-28 5:43 ` Eric Dumazet
2016-07-28 10:19 ` Sabrina Dubroca
2016-07-28 12:21 ` Thomas Gleixner
2016-07-28 13:30 ` Fengguang Wu
2016-07-28 23:28 ` [Intel-wired-lan] " Francois Romieu