* ixgbe: RTNL assertion failed
@ 2013-05-03 19:09 Stephen Hemminger
  2013-05-03 23:17 ` Skidmore, Donald C
  0 siblings, 1 reply; 8+ messages in thread
From: Stephen Hemminger @ 2013-05-03 19:09 UTC (permalink / raw)
  To: Jeff Kirsher; +Cc: netdev

Running 3.9 kernel, ixgbe is splatting on resume from suspend.

[26430.213254] ixgbe 0000:01:00.0: setting latency timer to 64
[26430.213257] RTNL: assertion failed at /build/buildd-linux_3.2.41-2-amd64-Wvc92F/linux-3.2.41/net/core/dev.c (1758)
[26430.213259] Pid: 7839, comm: kworker/u:1 Not tainted 3.2.0-4-amd64 #1 Debian 3.2.41-2
[26430.213261] Call Trace:
[26430.213266]  [<ffffffff8128b89e>] ? netif_set_real_num_tx_queues+0x5c/0x15e
[26430.213286]  [<ffffffffa0051749>] ? ixgbe_set_num_queues+0x208/0x221 [ixgbe]
[26430.213287] sd 0:0:0:0: [sda] Starting disk
[26430.213292]  [<ffffffffa0054949>] ? ixgbe_init_interrupt_scheme+0x16/0x790 [ixgbe]
[26430.213299]  [<ffffffffa0055aae>] ? ixgbe_resume+0x7a/0xe3 [ixgbe]
[26430.213303]  [<ffffffff81255b7c>] ? pm_op+0xa1/0x141
[26430.213305]  [<ffffffff81255f40>] ? device_resume+0xa2/0xfc
[26430.213307]  [<ffffffff81255fae>] ? async_resume+0x14/0x38
[26430.213311]  [<ffffffff810648c4>] ? async_run_entry_fn+0x96/0x142
[26430.213313]  [<ffffffff8105b225>] ? process_one_work+0x161/0x264
[26430.213316]  [<ffffffff8105c1e6>] ? worker_thread+0xc2/0x145
[26430.213318]  [<ffffffff8105c124>] ? manage_workers.isra.25+0x15b/0x15b
[26430.213320]  [<ffffffff8105f321>] ? kthread+0x76/0x7e
[26430.213323]  [<ffffffff81354ab4>] ? kernel_thread_helper+0x4/0x10
[26430.213325]  [<ffffffff8105f2ab>] ? kthread_worker_fn+0x139/0x139
[26430.213327]  [<ffffffff81354ab0>] ? gs_change+0x13/0x13


* RE: ixgbe: RTNL assertion failed
  2013-05-03 19:09 ixgbe: RTNL assertion failed Stephen Hemminger
@ 2013-05-03 23:17 ` Skidmore, Donald C
  2013-05-04  1:54   ` Ben Hutchings
  0 siblings, 1 reply; 8+ messages in thread
From: Skidmore, Donald C @ 2013-05-03 23:17 UTC (permalink / raw)
  To: Stephen Hemminger, Kirsher, Jeffrey T; +Cc: netdev

> -----Original Message-----
> From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org]
> On Behalf Of Stephen Hemminger
> Sent: Friday, May 03, 2013 12:09 PM
> To: Kirsher, Jeffrey T
> Cc: netdev@vger.kernel.org
> Subject: ixgbe: RTNL assertion failed
> 
> Running 3.9 kernel, ixgbe is splatting on resume from suspend.
> 
> [26430.213254] ixgbe 0000:01:00.0: setting latency timer to 64
> [26430.213257] RTNL: assertion failed at /build/buildd-linux_3.2.41-2-amd64-Wvc92F/linux-3.2.41/net/core/dev.c (1758)
> [26430.213259] Pid: 7839, comm: kworker/u:1 Not tainted 3.2.0-4-amd64 #1 Debian 3.2.41-2
> [26430.213261] Call Trace:
> [26430.213266]  [<ffffffff8128b89e>] ? netif_set_real_num_tx_queues+0x5c/0x15e
> [26430.213286]  [<ffffffffa0051749>] ? ixgbe_set_num_queues+0x208/0x221 [ixgbe]
> [26430.213287] sd 0:0:0:0: [sda] Starting disk
> [26430.213292]  [<ffffffffa0054949>] ? ixgbe_init_interrupt_scheme+0x16/0x790 [ixgbe]
> [26430.213299]  [<ffffffffa0055aae>] ? ixgbe_resume+0x7a/0xe3 [ixgbe]
> [26430.213303]  [<ffffffff81255b7c>] ? pm_op+0xa1/0x141
> [26430.213305]  [<ffffffff81255f40>] ? device_resume+0xa2/0xfc
> [26430.213307]  [<ffffffff81255fae>] ? async_resume+0x14/0x38
> [26430.213311]  [<ffffffff810648c4>] ? async_run_entry_fn+0x96/0x142
> [26430.213313]  [<ffffffff8105b225>] ? process_one_work+0x161/0x264
> [26430.213316]  [<ffffffff8105c1e6>] ? worker_thread+0xc2/0x145
> [26430.213318]  [<ffffffff8105c124>] ? manage_workers.isra.25+0x15b/0x15b
> [26430.213320]  [<ffffffff8105f321>] ? kthread+0x76/0x7e
> [26430.213323]  [<ffffffff81354ab4>] ? kernel_thread_helper+0x4/0x10
> [26430.213325]  [<ffffffff8105f2ab>] ? kthread_worker_fn+0x139/0x139
> [26430.213327]  [<ffffffff81354ab0>] ? gs_change+0x13/0x13

Hey Stephen,

I'm having a little problem finding a path where we call netif_set_real_num_tx_queues without holding RTNL in net-next.  While looking over the stack dump, one of our engineers noticed the text "Not tainted 3.2.0-4-amd64 #1 Debian 3.2.41-2".  Could this mean I'm looking over the wrong source?  It would make me feel better, as I'm not seeing anything as is. :)

Thanks,
-Don Skidmore <donald.c.skidmore@intel.com>


* Re: ixgbe: RTNL assertion failed
  2013-05-03 23:17 ` Skidmore, Donald C
@ 2013-05-04  1:54   ` Ben Hutchings
  2013-05-04  3:51     ` Stephen Hemminger
  2013-05-04 21:05     ` Skidmore, Donald C
  0 siblings, 2 replies; 8+ messages in thread
From: Ben Hutchings @ 2013-05-04  1:54 UTC (permalink / raw)
  To: Skidmore, Donald C; +Cc: Stephen Hemminger, Kirsher, Jeffrey T, netdev

On Fri, May 03, 2013 at 11:17:39PM +0000, Skidmore, Donald C wrote:
> > -----Original Message-----
> > From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org]
> > On Behalf Of Stephen Hemminger
> > Sent: Friday, May 03, 2013 12:09 PM
> > To: Kirsher, Jeffrey T
> > Cc: netdev@vger.kernel.org
> > Subject: ixgbe: RTNL assertion failed
> > 
> > Running 3.9 kernel, ixgbe is splatting on resume from suspend.
> > 
> > [26430.213254] ixgbe 0000:01:00.0: setting latency timer to 64
> > [26430.213257] RTNL: assertion failed at /build/buildd-linux_3.2.41-2-amd64-Wvc92F/linux-3.2.41/net/core/dev.c (1758)
> > [26430.213259] Pid: 7839, comm: kworker/u:1 Not tainted 3.2.0-4-amd64 #1 Debian 3.2.41-2
> > [26430.213261] Call Trace:
[...]
> I'm having a little problem finding a path where we call netif_set_real_num_tx_queues without holding RTNL in net-next.  While looking over the stack dump, one of our engineers noticed the text "Not tainted 3.2.0-4-amd64 #1 Debian 3.2.41-2".  Could this mean I'm looking over the wrong source?  It would make me feel better, as I'm not seeing anything as is. :)

Indeed, this is not 3.9.

The version of ixgbe in this Debian kernel has bql support backported,
but is otherwise the same as in 3.2.41.  I assume that this bug has
been fixed some time between 3.2 and 3.9, but no-one requested that
the fix be included in stable branches.  Please can you identify the
fix?

Ben.

-- 
Ben Hutchings
We get into the habit of living before acquiring the habit of thinking.
                                                              - Albert Camus


* Re: ixgbe: RTNL assertion failed
  2013-05-04  1:54   ` Ben Hutchings
@ 2013-05-04  3:51     ` Stephen Hemminger
  2013-05-04 21:05     ` Skidmore, Donald C
  1 sibling, 0 replies; 8+ messages in thread
From: Stephen Hemminger @ 2013-05-04  3:51 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: Skidmore, Donald C, Kirsher, Jeffrey T, netdev

On Sat, 4 May 2013 02:54:06 +0100
Ben Hutchings <ben@decadent.org.uk> wrote:

> On Fri, May 03, 2013 at 11:17:39PM +0000, Skidmore, Donald C wrote:
> > > -----Original Message-----
> > > From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org]
> > > On Behalf Of Stephen Hemminger
> > > Sent: Friday, May 03, 2013 12:09 PM
> > > To: Kirsher, Jeffrey T
> > > Cc: netdev@vger.kernel.org
> > > Subject: ixgbe: RTNL assertion failed
> > > 
> > > Running 3.9 kernel, ixgbe is splatting on resume from suspend.
> > > 
> > > [26430.213254] ixgbe 0000:01:00.0: setting latency timer to 64
> > > [26430.213257] RTNL: assertion failed at /build/buildd-linux_3.2.41-2-amd64-Wvc92F/linux-3.2.41/net/core/dev.c (1758)
> > > [26430.213259] Pid: 7839, comm: kworker/u:1 Not tainted 3.2.0-4-amd64 #1 Debian 3.2.41-2
> > > [26430.213261] Call Trace:
> [...]
> > I'm having a little problem finding a path where we call netif_set_real_num_tx_queues without holding RTNL in net-next.  While looking over the stack dump, one of our engineers noticed the text "Not tainted 3.2.0-4-amd64 #1 Debian 3.2.41-2".  Could this mean I'm looking over the wrong source?  It would make me feel better, as I'm not seeing anything as is. :)
> 
> Indeed, this is not 3.9.
> 
> The version of ixgbe in this Debian kernel has bql support backported,
> but is otherwise the same as in 3.2.41.  I assume that this bug has
> been fixed some time between 3.2 and 3.9, but no-one requested that
> the fix be included in stable branches.  Please can you identify the
> fix?
> 
> Ben.
> 

Yeah, it was a Debian kernel.  I thought it was 3.9; I must have been swapping kernels.


* RE: ixgbe: RTNL assertion failed
  2013-05-04  1:54   ` Ben Hutchings
  2013-05-04  3:51     ` Stephen Hemminger
@ 2013-05-04 21:05     ` Skidmore, Donald C
  2013-05-04 21:21       ` Ben Hutchings
  1 sibling, 1 reply; 8+ messages in thread
From: Skidmore, Donald C @ 2013-05-04 21:05 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: Stephen Hemminger, Kirsher, Jeffrey T, netdev



> -----Original Message-----
> From: Ben Hutchings [mailto:ben@decadent.org.uk]
> Sent: Friday, May 03, 2013 6:54 PM
> To: Skidmore, Donald C
> Cc: Stephen Hemminger; Kirsher, Jeffrey T; netdev@vger.kernel.org
> Subject: Re: ixgbe: RTNL assertion failed
> 
> On Fri, May 03, 2013 at 11:17:39PM +0000, Skidmore, Donald C wrote:
> > > -----Original Message-----
> > > From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org]
> > > On Behalf Of Stephen Hemminger
> > > Sent: Friday, May 03, 2013 12:09 PM
> > > To: Kirsher, Jeffrey T
> > > Cc: netdev@vger.kernel.org
> > > Subject: ixgbe: RTNL assertion failed
> > >
> > > Running 3.9 kernel, ixgbe is splatting on resume from suspend.
> > >
> > > [26430.213254] ixgbe 0000:01:00.0: setting latency timer to 64
> > > [26430.213257] RTNL: assertion failed at /build/buildd-linux_3.2.41-2-amd64-Wvc92F/linux-3.2.41/net/core/dev.c (1758)
> > > [26430.213259] Pid: 7839, comm: kworker/u:1 Not tainted 3.2.0-4-amd64 #1 Debian 3.2.41-2
> > > [26430.213261] Call Trace:
> [...]
> > I'm having a little problem finding a path where we call
> > netif_set_real_num_tx_queues without holding RTNL in net-next.  While
> > looking over the stack dump one of our engineers noticed the text "Not
> > tainted 3.2.0-4-amd64 #1 Debian 3.2.41-2 ".  Could this mean I'm
> > looking over the wrong source?  It would make me feel better as I'm
> > not seeing anything as is. :)
> 
> Indeed, this is not 3.9.
> 
> The version of ixgbe in this Debian kernel has bql support backported, but is
> otherwise the same as in 3.2.41.  I assume that this bug has been fixed some
> time between 3.2 and 3.9, but no-one requested that the fix be included in
> stable branches.  Please can you identify the fix?
> 
> Ben.
> 
> --
> Ben Hutchings
> We get into the habit of living before acquiring the habit of thinking.
>                                                               - Albert Camus

I believe this is the patch:

commit 34948a947d1a576c10afee6d14792fd237549577
Author: Benjamin Poirier <bpoirier@suse.de>
Date:   Fri Apr 6 07:20:21 2012 +0000

    ixgbe: add missing rtnl_lock in PM resume path

    Upon resume from standby, ixgbe may trigger the ASSERT_RTNL() in
    netif_set_real_num_tx_queues(). The call stack is:
        netif_set_real_num_tx_queues
        ixgbe_set_num_queues
        ixgbe_init_interrupt_scheme
        ixgbe_resume

    Signed-off-by: Benjamin Poirier <bpoirier@suse.de>
    Tested-by: Stephen Ko <stephen.s.ko@intel.com>
    Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/i
index dac7c01..9e2be8c 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -4836,7 +4836,9 @@ static int ixgbe_resume(struct pci_dev *pdev)

        pci_wake_from_d3(pdev, false);

+       rtnl_lock();
        err = ixgbe_init_interrupt_scheme(adapter);
+       rtnl_unlock();
        if (err) {
                e_dev_err("Cannot initialize interrupts for device\n");
                return err;
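
To illustrate why the two added lines silence the assertion, here is a minimal userspace sketch of the pattern. It is not kernel code: every `model_*` name is hypothetical, the "lock" is a plain flag rather than a real mutex, and the model is single-threaded. It only mirrors the call chain from the commit message and the fact that ASSERT_RTNL() complains but lets execution continue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the RTNL mutex: a flag, not a real lock. */
static bool model_rtnl_held;
/* Counts how often the modelled assertion fired. */
static int model_assert_failures;

static void model_rtnl_lock(void)
{
    assert(!model_rtnl_held);   /* the model does not support nesting */
    model_rtnl_held = true;
}

static void model_rtnl_unlock(void)
{
    assert(model_rtnl_held);    /* unlocking without holding is a bug */
    model_rtnl_held = false;
}

/* Modelled on ASSERT_RTNL(): warn and continue when the lock is not held. */
static void model_assert_rtnl(const char *func)
{
    if (!model_rtnl_held) {
        model_assert_failures++;
        fprintf(stderr, "RTNL: assertion failed in %s\n", func);
    }
}

/* Stand-in for netif_set_real_num_tx_queues(): requires the lock. */
static int model_set_real_num_tx_queues(int txq)
{
    model_assert_rtnl(__func__);
    return txq > 0 ? 0 : -1;
}

/* Stand-in for ixgbe_init_interrupt_scheme() -> ixgbe_set_num_queues(). */
static int model_init_interrupt_scheme(void)
{
    return model_set_real_num_tx_queues(4);
}

/* Resume path before the patch: no lock taken, so the assertion fires. */
static int model_resume_unfixed(void)
{
    return model_init_interrupt_scheme();
}

/* Resume path after the patch: the call is wrapped in lock/unlock. */
static int model_resume_fixed(void)
{
    int err;

    model_rtnl_lock();
    err = model_init_interrupt_scheme();
    model_rtnl_unlock();
    return err;
}
```

Calling `model_resume_unfixed()` from a `main()` bumps `model_assert_failures` and prints a warning to stderr, while `model_resume_fixed()` leaves the counter unchanged. Both return 0, which matches the report: the splat is a warning, not a failed resume.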


Thanks,
-Don Skidmore <donald.c.skidmore@intel.com> 


* Re: ixgbe: RTNL assertion failed
  2013-05-04 21:05     ` Skidmore, Donald C
@ 2013-05-04 21:21       ` Ben Hutchings
  2013-05-06 14:58         ` Stephen Hemminger
  0 siblings, 1 reply; 8+ messages in thread
From: Ben Hutchings @ 2013-05-04 21:21 UTC (permalink / raw)
  To: Skidmore, Donald C, Stephen Hemminger; +Cc: netdev, Kirsher, Jeffrey T


On Sat, 2013-05-04 at 21:05 +0000, Skidmore, Donald C wrote:
[...]
> > The version of ixgbe in this Debian kernel has bql support backported, but is
> > otherwise the same as in 3.2.41.  I assume that this bug has been fixed some
> > time between 3.2 and 3.9, but no-one requested that the fix be included in
> > stable branches.  Please can you identify the fix?
> > 
> > Ben.
> > 
> > --
> > Ben Hutchings
> > We get into the habit of living before acquiring the habit of thinking.
> >                                                               - Albert Camus
> 
> I believe this is the patch:
> 
> commit 34948a947d1a576c10afee6d14792fd237549577
> Author: Benjamin Poirier <bpoirier@suse.de>
> Date:   Fri Apr 6 07:20:21 2012 +0000
> 
>     ixgbe: add missing rtnl_lock in PM resume path
[...]

Looks like it.  And it applies cleanly to 3.2.y.  Stephen, could you
test this on top of 3.2.y and then nominate it for stable?

Ben.

-- 
Ben Hutchings
Lowery's Law:
             If it jams, force it. If it breaks, it needed replacing anyway.


* Re: ixgbe: RTNL assertion failed
  2013-05-04 21:21       ` Ben Hutchings
@ 2013-05-06 14:58         ` Stephen Hemminger
  2013-05-10  4:36           ` Ben Hutchings
  0 siblings, 1 reply; 8+ messages in thread
From: Stephen Hemminger @ 2013-05-06 14:58 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: Skidmore, Donald C, netdev, Kirsher, Jeffrey T


On Sat, 04 May 2013 22:21:32 +0100
Ben Hutchings <ben@decadent.org.uk> wrote:

> On Sat, 2013-05-04 at 21:05 +0000, Skidmore, Donald C wrote:
> [...]
> > > The version of ixgbe in this Debian kernel has bql support backported, but is
> > > otherwise the same as in 3.2.41.  I assume that this bug has been fixed some
> > > time between 3.2 and 3.9, but no-one requested that the fix be included in
> > > stable branches.  Please can you identify the fix?
> > > 
> > > Ben.
> > > 
> > > --
> > > Ben Hutchings
> > > We get into the habit of living before acquiring the habit of thinking.
> > >                                                               - Albert Camus
> > 
> > I believe this is the patch:
> > 
> > commit 34948a947d1a576c10afee6d14792fd237549577
> > Author: Benjamin Poirier <bpoirier@suse.de>
> > Date:   Fri Apr 6 07:20:21 2012 +0000
> > 
> >     ixgbe: add missing rtnl_lock in PM resume path
> [...]
> 
> Looks like it.  And it applies cleanly to 3.2.y.  Stephen, could you
> test this on top of 3.2.y and then nominate it for stable?
> 
> Ben.
> 

Patch works.
Tested with 3.2.44 with this patch and there is no problem.


* Re: ixgbe: RTNL assertion failed
  2013-05-06 14:58         ` Stephen Hemminger
@ 2013-05-10  4:36           ` Ben Hutchings
  0 siblings, 0 replies; 8+ messages in thread
From: Ben Hutchings @ 2013-05-10  4:36 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Skidmore, Donald C, netdev, Kirsher, Jeffrey T


On Mon, 2013-05-06 at 07:58 -0700, Stephen Hemminger wrote:
> On Sat, 04 May 2013 22:21:32 +0100
> Ben Hutchings <ben@decadent.org.uk> wrote:
> 
> > On Sat, 2013-05-04 at 21:05 +0000, Skidmore, Donald C wrote:
> > [...]
> > > > The version of ixgbe in this Debian kernel has bql support backported, but is
> > > > otherwise the same as in 3.2.41.  I assume that this bug has been fixed some
> > > > time between 3.2 and 3.9, but no-one requested that the fix be included in
> > > > stable branches.  Please can you identify the fix?
> > > > 
> > > > Ben.
> > > > 
> > > > --
> > > > Ben Hutchings
> > > > We get into the habit of living before acquiring the habit of thinking.
> > > >                                                               - Albert Camus
> > > 
> > > I believe this is the patch:
> > > 
> > > commit 34948a947d1a576c10afee6d14792fd237549577
> > > Author: Benjamin Poirier <bpoirier@suse.de>
> > > Date:   Fri Apr 6 07:20:21 2012 +0000
> > > 
> > >     ixgbe: add missing rtnl_lock in PM resume path
> > [...]
> > 
> > Looks like it.  And it applies cleanly to 3.2.y.  Stephen, could you
> > test this on top of 3.2.y and then nominate it for stable?
> > 
> > Ben.
> > 
> 
> Patch works.
> Tested with 3.2.44 with this patch and there is no problem.

Thanks, I've added this to my queue.

Ben.

-- 
Ben Hutchings
For every action, there is an equal and opposite criticism. - Harrison
