linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Fengguang Wu <fengguang.wu@intel.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: LKML <linux-kernel@vger.kernel.org>,
	netdev@vger.kernel.org, Satyam Sharma <satyam@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	intel-wired-lan@lists.osuosl.org,
	Jeff Kirsher <jeffrey.t.kirsher@intel.com>,
	Ye Xiaolong <xiaolong.ye@intel.com>
Subject: Re: [e1000_netpoll] BUG: sleeping function called from invalid context at kernel/irq/manage.c:110
Date: Wed, 27 Jul 2016 23:01:46 +0800	[thread overview]
Message-ID: <20160727150146.GA22982@wfg-t540p.sh.intel.com> (raw)
In-Reply-To: <1469550513.17736.5.camel@edumazet-glaptop3.roam.corp.google.com>

On Tue, Jul 26, 2016 at 06:28:33PM +0200, Eric Dumazet wrote:
>On Tue, 2016-07-26 at 23:32 +0800, Fengguang Wu wrote:
>> Hi Eric,
>>
>> It works!
>>
>> On Tue, Jul 26, 2016 at 11:14:52AM +0200, Eric Dumazet wrote:
>> >On Tue, 2016-07-26 at 11:50 +0800, Fengguang Wu wrote:
>> >> Greetings,
>> >>
>> >> This BUG message can be found in recent kernels as well as v4.4 and
>> >> linux-stable. It happens when running
>> >>
>> >>         modprobe netconsole netconsole=@/,$port@$server/
>> >>
>> >> [   39.937534] 22 Jul 13:30:40 ntpdate[440]: step time server 192.168.1.1 offset -673.833841 sec
>> >> [   39.943285] netpoll: netconsole: local port 6665
>> >> [   39.943436] netpoll: netconsole: local IPv4 address 0.0.0.0
>> >> [   39.943609] netpoll: netconsole: interface 'eth0'
>> >> [   39.943756] netpoll: netconsole: remote port 6672
>> >> [   39.943913] netpoll: netconsole: remote IPv4 address 192.168.1.1
>> >> [   39.944099] netpoll: netconsole: remote ethernet address ff:ff:ff:ff:ff:ff
>> >> [   39.944311] netpoll: netconsole: local IP 192.168.1.193
>> >> [   39.944514] BUG: sleeping function called from invalid context at kernel/irq/manage.c:110
>> >> [   39.944515] in_atomic(): 1, irqs_disabled(): 1, pid: 448, name: modprobe
>> >> [   39.944517] CPU: 6 PID: 448 Comm: modprobe Not tainted 4.7.0-rc7-wt-ath-10122-gf9b5ec2 #102
>> >> [   39.944518] Hardware name:                  /DZ77BH-55K, BIOS BHZ7710H.86A.0097.2012.1228.1346 12/28/2012
>> >> [   39.944522]  0000000000000000 ffffc90001f2f9e8 ffffffff813417d9 ffff88007faba5c0
>> >> [   39.944524]  000000000000006e ffffc90001f2fa00 ffffffff810aec03 ffffffff81a25948
>> >> [   39.944525]  ffffc90001f2fa28 ffffffff810aec9a ffff8803e5bd9400 ffff8803e50fbd68
>> >> [   39.944526] Call Trace:
>> >> [   39.944533]  [<ffffffff813417d9>] dump_stack+0x63/0x8a
>> >> [   39.944536]  [<ffffffff810aec03>] ___might_sleep+0xd3/0x120
>> >> [   39.944537]  [<ffffffff810aec9a>] __might_sleep+0x4a/0x80
>> >> [   39.944541]  [<ffffffff810e4638>] synchronize_irq+0x38/0xa0
>> >> [   39.944543]  [<ffffffff810e3c8e>] ? __irq_put_desc_unlock+0x1e/0x40
>> >> [   39.944545]  [<ffffffff810e48e3>] ? __disable_irq_nosync+0x43/0x60
>> >> [   39.944547]  [<ffffffff810e492c>] disable_irq+0x1c/0x20
>> >> [   39.944559]  [<ffffffffa0220932>] e1000_netpoll+0xf2/0x120 [e1000e]
>> >> [   39.944563]  [<ffffffff815f2bdc>] netpoll_poll_dev+0x5c/0x1a0
>> >> [   39.944567]  [<ffffffff815bb361>] ? __kmalloc_reserve+0x31/0x90
>> >> [   39.944569]  [<ffffffff815f2e8b>] netpoll_send_skb_on_dev+0x16b/0x250
>> >> [   39.944572]  [<ffffffff815f325c>] netpoll_send_udp+0x2ec/0x450
>> >> [   39.944576]  [<ffffffffa003cb62>] write_msg+0xb2/0xf0 [netconsole]
>> >> [   39.944578]  [<ffffffff810e04e5>] call_console_drivers+0x115/0x120
>> >> [   39.944580]  [<ffffffff810e1f13>] console_unlock+0x333/0x5c0
>> >> [   39.944583]  [<ffffffff810e2c74>] register_console+0x1c4/0x380
>> >> [   39.944586]  [<ffffffffa004f1c5>] init_netconsole+0x1c5/0x1000 [netconsole]
>> >> [   39.944588]  [<ffffffffa004f000>] ? 0xffffffffa004f000
>> >> [   39.944591]  [<ffffffff8100216d>] do_one_initcall+0x3d/0x150
>> >> [   39.944592]  [<ffffffff810aec9a>] ? __might_sleep+0x4a/0x80
>> >> [   39.944596]  [<ffffffff811f5098>] ? kmem_cache_alloc_trace+0x188/0x1e0
>> >> [   39.944598]  [<ffffffff8118f871>] do_init_module+0x5f/0x1d8
>> >> [   39.944602]  [<ffffffff81114009>] load_module+0x1429/0x1b40
>> >> [   39.944604]  [<ffffffff81110cd0>] ? __symbol_put+0x40/0x40
>> >> [   39.944607]  [<ffffffff8121f348>] ? kernel_read_file+0x178/0x1a0
>> >> [   39.944608]  [<ffffffff8121f429>] ? kernel_read_file_from_fd+0x49/0x80
>> >> [   39.944611]  [<ffffffff81114973>] SYSC_finit_module+0xc3/0xf0
>> >> [   39.944614]  [<ffffffff811149be>] SyS_finit_module+0xe/0x10
>> >> [   39.944617]  [<ffffffff816e5877>] entry_SYSCALL_64_fastpath+0x1a/0xa9
>> >> [   39.946384] console [netcon0] enabled
>> >> [   39.946514] netconsole: network logging started
>> >>
>> >> Can this be possibly fixed?
>> >
>> >Could you try this ?
>> >
>> >diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c
>> >index f42129d09e2c23ba9fdb5cde890d50ecb7166a42..a53c41c4c4f7d1fe52f95a2cab8784a938b3820b 100644
>> >--- a/drivers/net/ethernet/intel/e1000/e1000_main.c
>> >+++ b/drivers/net/ethernet/intel/e1000/e1000_main.c
>> >@@ -5257,9 +5257,13 @@ static void e1000_netpoll(struct net_device *netdev)
>> > {
>> > 	struct e1000_adapter *adapter = netdev_priv(netdev);
>> >
>> >-	disable_irq(adapter->pdev->irq);
>> >-	e1000_intr(adapter->pdev->irq, netdev);
>> >-	enable_irq(adapter->pdev->irq);
>> >+	if (napi_schedule_prep(&adapter->napi)) {
>> >+		adapter->total_tx_bytes = 0;
>> >+		adapter->total_tx_packets = 0;
>> >+		adapter->total_rx_bytes = 0;
>> >+		adapter->total_rx_packets = 0;
>> >+		__napi_schedule(&adapter->napi);
>> >+	}
>>
>> The machines are actually running e1000e driver, so I copied your
>> approach to e1000e and it works:
>>
>> kern  :info  : [   16.109647] netpoll: netconsole: local port 6665
>> kern  :info  : [   16.109961] netpoll: netconsole: local IPv4 address 0.0.0.0
>> kern  :info  : [   16.110346] netpoll: netconsole: interface 'eth0'
>> kern  :info  : [   16.110672] netpoll: netconsole: remote port 6676
>> kern  :info  : [   16.110991] netpoll: netconsole: remote IPv4 address 192.168.2.1
>> kern  :info  : [   16.111398] netpoll: netconsole: remote ethernet address ff:ff:ff:ff:ff:ff
>> kern  :info  : [   16.111845] netpoll: netconsole: local IP 192.168.2.3
>> kern  :info  : [   16.114284] console [netcon0] enabled
>> kern  :info  : [   16.114550] netconsole: network logging started
>>
>> However I'm not sure if it'll have side effects, because this
>> effectively disables the various checks in e1000_intr() and
>> e1000_intr_msi().
>>
>
>As far as netpoll is concerned, this should not matter.
>
>We only want to drain packets from TX rings.

OK.

>I have no idea why you hit this issue only recently, since this looks a
>rather old bug to me ?

Yeah it's a rather old bug. It becomes obvious when we try to detect
and filter out buggy kernels for the machines that are expected to run
stable services. This BUG effectively blocks the stable machines from
booting because no clean kernel (v4.6.4, v4.6, v4.5, v4.4, ...) are
available at all. ;)

Thanks,
Fengguang

>> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
>> index 9b4ec13..4f89873 100644
>> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
>> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
>> @@ -6711,15 +6711,14 @@ static void e1000_netpoll(struct net_device *netdev)
>>  	case E1000E_INT_MODE_MSIX:
>>  		e1000_intr_msix(adapter->pdev->irq, netdev);
>>  		break;
>> -	case E1000E_INT_MODE_MSI:
>> -		disable_irq(adapter->pdev->irq);
>> -		e1000_intr_msi(adapter->pdev->irq, netdev);
>> -		enable_irq(adapter->pdev->irq);
>> -		break;
>>  	default:		/* E1000E_INT_MODE_LEGACY */
>> -		disable_irq(adapter->pdev->irq);
>> -		e1000_intr(adapter->pdev->irq, netdev);
>> -		enable_irq(adapter->pdev->irq);
>> +		if (napi_schedule_prep(&adapter->napi)) {
>> +			adapter->total_tx_bytes = 0;
>> +			adapter->total_tx_packets = 0;
>> +			adapter->total_rx_bytes = 0;
>> +			adapter->total_rx_packets = 0;
>> +			__napi_schedule(&adapter->napi);
>> +		}
>>  		break;
>>  	}
>>  }
>>
>> Thanks,
>> Fengguang
>

  reply	other threads:[~2016-07-27 15:01 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-07-26  3:50 [e1000_netpoll] BUG: sleeping function called from invalid context at kernel/irq/manage.c:110 Fengguang Wu
2016-07-26  9:14 ` Eric Dumazet
     [not found]   ` <20160726093224.GA10339@wfg-t540p.sh.intel.com>
2016-07-26  9:45     ` [PATCH] schedule function called for e1000 driver interrupt kbuild test robot
2016-07-26  9:50     ` [e1000_netpoll] BUG: sleeping function called from invalid context at kernel/irq/manage.c:110 Thomas Gleixner
     [not found]       ` <8578bb16-cd04-e8a5-c7f4-be061ede95b4@gmail.com>
2016-07-28  7:45         ` Thomas Gleixner
2016-07-28  9:46           ` Valdis.Kletnieks
2016-07-26  9:50     ` [PATCH] schedule function called for e1000 driver interrupt kbuild test robot
2016-07-26 15:32   ` [e1000_netpoll] BUG: sleeping function called from invalid context at kernel/irq/manage.c:110 Fengguang Wu
2016-07-26 16:28     ` Eric Dumazet
2016-07-27 15:01       ` Fengguang Wu [this message]
2016-07-27 18:50         ` Eric Dumazet
2016-07-27 21:38   ` Jeff Kirsher
2016-07-28  5:43     ` Eric Dumazet
2016-07-28 10:19       ` Sabrina Dubroca
2016-07-28 12:21         ` Thomas Gleixner
2016-07-28 13:30         ` Fengguang Wu
2016-07-28 23:28       ` [Intel-wired-lan] " Francois Romieu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160727150146.GA22982@wfg-t540p.sh.intel.com \
    --to=fengguang.wu@intel.com \
    --cc=eric.dumazet@gmail.com \
    --cc=intel-wired-lan@lists.osuosl.org \
    --cc=jeffrey.t.kirsher@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=satyam@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=xiaolong.ye@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).