netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net RFC 0/2] net: BUG napi_disable sleep while atomic
@ 2013-09-18 20:19 Jacob Keller
  2013-09-18 20:20 ` [PATCH net RFC 1/2] netdevice: add might_sleep() call to napi_disable Jacob Keller
  2013-09-18 20:20 ` [PATCH net RFC 2/2] ixgbe: fix sleep bug caused by napi_disable inside local_bh_disable()d context Jacob Keller
  0 siblings, 2 replies; 12+ messages in thread
From: Jacob Keller @ 2013-09-18 20:19 UTC (permalink / raw)
  To: netdev

The following series implements a fix for a possible sleep while atomic bug in
the ixgbe driver, caused because napi_disable might sleep. The first patch adds
a might_sleep() call in order to make future sleep while atomic bugs easier to
discover, (as napi_disable does not always sleep!). The second patch is my best
guess at an actual fix for the bug. There is likely a similar bug in every
driver which implemented NET_RX_BUSY_POLL similar to ixgbe. The first patch
will likely cause BUG splats to appear in those drivers unless they are fixed.

---

Jacob Keller (2):
      netdevice: add might_sleep() call to napi_disable
      ixgbe: fix sleep bug caused by napi_disable inside local_bh_disable()d context


 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |    4 ++--
 include/linux/netdevice.h                     |    2 ++
 2 files changed, 4 insertions(+), 2 deletions(-)

--
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH net RFC 1/2] netdevice: add might_sleep() call to napi_disable
  2013-09-18 20:19 [PATCH net RFC 0/2] net: BUG napi_disable sleep while atomic Jacob Keller
@ 2013-09-18 20:20 ` Jacob Keller
  2013-10-08 12:39   ` Amir Vadai
  2013-09-18 20:20 ` [PATCH net RFC 2/2] ixgbe: fix sleep bug caused by napi_disable inside local_bh_disable()d context Jacob Keller
  1 sibling, 1 reply; 12+ messages in thread
From: Jacob Keller @ 2013-09-18 20:20 UTC (permalink / raw)
  To: netdev
  Cc: Alex Duyck, Hyong-Youb Kim, Dmitry Kravkov, Amir Vadai, Eliezer Tamir

napi_disable potentially calls msleep(1) if the NAPI_STATE_SCHED bit is
previously set by something else. Because it does not always call msleep, it
was difficult to track down a bug related to calling napi_disable within
local_bh_disable()d context. This patch adds a might_sleep() call to the
napi_disable routine in order to aid in the future debugging of similar issues.
This will cause a BUG in drivers which have implemented busy polling in a
similar fashion to ixgbe, and which call napi_disable inside the
local_bh_disable()d section where the vector napi lock is taken.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Cc: Eliezer Tamir <eliezer.tamir@linux.intel.com>
Cc: Alex Duyck <alexander.h.duyck@intel.com>
Cc: Hyong-Youb Kim <hykim@myri.com>
Cc: Amir Vadai <amirv@mellanox.com>
Cc: Dmitry Kravkov <dmitry@broadcom.com>
---
 include/linux/netdevice.h |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 3de49ac..81dab00 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -483,6 +483,8 @@ extern void napi_hash_del(struct napi_struct *napi);
  */
 static inline void napi_disable(struct napi_struct *n)
 {
+	might_sleep();
+
 	set_bit(NAPI_STATE_DISABLE, &n->state);
 	while (test_and_set_bit(NAPI_STATE_SCHED, &n->state))
 		msleep(1);

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH net RFC 2/2] ixgbe: fix sleep bug caused by napi_disable inside local_bh_disable()d context
  2013-09-18 20:19 [PATCH net RFC 0/2] net: BUG napi_disable sleep while atomic Jacob Keller
  2013-09-18 20:20 ` [PATCH net RFC 1/2] netdevice: add might_sleep() call to napi_disable Jacob Keller
@ 2013-09-18 20:20 ` Jacob Keller
  2013-09-18 23:09   ` Francois Romieu
  1 sibling, 1 reply; 12+ messages in thread
From: Jacob Keller @ 2013-09-18 20:20 UTC (permalink / raw)
  To: netdev
  Cc: Alexander Duyck, Hyong-Youb Kim, Dmitry Kravkov, Amir Vadai,
	Eliezer Tamir

This patch fixes a bug caused by calling napi_disable after local_bh_disable.
It is possible for napi_disable to sleep, (though not guarunteed) so it could
cause an atomic sleep bug during the  schedule() call in msleep. This patch
resolves the issue by moving the local_bh_disable() calls inside the for loop
in ixgbe_napi_disable_all().

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Cc: Eliezer Tamir <eliezer.tamir@linux.intel.com>
Cc: Alexander Duyck <alexander.duyck@intel.com>
Cc: Hyong-Youb Kim <hykim@myri.com>
Cc: Amir Vadai <amirv@mellanox.com>
Cc: Dmitry Kravkov <dmitry@broadcom.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 0ade0cd..85dbf58 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -3893,15 +3893,15 @@ static void ixgbe_napi_disable_all(struct ixgbe_adapter *adapter)
 {
 	int q_idx;
 
-	local_bh_disable(); /* for ixgbe_qv_lock_napi() */
 	for (q_idx = 0; q_idx < adapter->num_q_vectors; q_idx++) {
 		napi_disable(&adapter->q_vector[q_idx]->napi);
+		local_bh_disable(); /* for ixgbe_qv_lock_napi() */
 		while (!ixgbe_qv_lock_napi(adapter->q_vector[q_idx])) {
 			pr_info("QV %d locked\n", q_idx);
 			mdelay(1);
 		}
+		local_bh_enable();
 	}
-	local_bh_enable();
 }
 
 #ifdef CONFIG_IXGBE_DCB

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH net RFC 2/2] ixgbe: fix sleep bug caused by napi_disable inside local_bh_disable()d context
  2013-09-18 20:20 ` [PATCH net RFC 2/2] ixgbe: fix sleep bug caused by napi_disable inside local_bh_disable()d context Jacob Keller
@ 2013-09-18 23:09   ` Francois Romieu
  2013-09-18 23:24     ` Keller, Jacob E
  0 siblings, 1 reply; 12+ messages in thread
From: Francois Romieu @ 2013-09-18 23:09 UTC (permalink / raw)
  To: Jacob Keller
  Cc: netdev, Alexander Duyck, Hyong-Youb Kim, Dmitry Kravkov,
	Amir Vadai, Eliezer Tamir

Jacob Keller <jacob.e.keller@intel.com> :
> This patch fixes a bug caused by calling napi_disable after local_bh_disable.
> It is possible for napi_disable to sleep, (though not guarunteed) so it could
> cause an atomic sleep bug during the  schedule() call in msleep. This patch
> resolves the issue by moving the local_bh_disable() calls inside the for loop
> in ixgbe_napi_disable_all().

Why couldn't you take mdelay(1) outside of the locally disabled bh section ?

-- 
Ueimor

^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: [PATCH net RFC 2/2] ixgbe: fix sleep bug caused by napi_disable inside local_bh_disable()d context
  2013-09-18 23:09   ` Francois Romieu
@ 2013-09-18 23:24     ` Keller, Jacob E
  2013-09-19 21:32       ` Francois Romieu
  0 siblings, 1 reply; 12+ messages in thread
From: Keller, Jacob E @ 2013-09-18 23:24 UTC (permalink / raw)
  To: Francois Romieu
  Cc: netdev, Duyck, Alexander H, Hyong-Youb Kim, Dmitry Kravkov,
	Amir Vadai, Eliezer Tamir

> -----Original Message-----
> From: Francois Romieu [mailto:romieu@fr.zoreil.com]
> Sent: Wednesday, September 18, 2013 4:10 PM
> To: Keller, Jacob E
> Cc: netdev@vger.kernel.org; Duyck, Alexander H; Hyong-Youb Kim;
> Dmitry Kravkov; Amir Vadai; Eliezer Tamir
> Subject: Re: [PATCH net RFC 2/2] ixgbe: fix sleep bug caused by
> napi_disable inside local_bh_disable()d context
> 
> Jacob Keller <jacob.e.keller@intel.com> :
> > This patch fixes a bug caused by calling napi_disable after
> local_bh_disable.
> > It is possible for napi_disable to sleep, (though not guarunteed) so it
> could
> > cause an atomic sleep bug during the  schedule() call in msleep. This
> patch
> > resolves the issue by moving the local_bh_disable() calls inside the for
> loop
> > in ixgbe_napi_disable_all().
> 
> Why couldn't you take mdelay(1) outside of the locally disabled bh
> section ?

I actually am not sure. Originally the local_bh_disable was put around the entire for loop, over all of the
napi_disable and the qv_lock_napi code.

I know that we need local_bh_disable around the qv_lock_napi code because it uses spin_lock instead
of spin_lock_bh. I believe the reason we need bh disabled around the entire context is so that the small
window between failed calls to qv_lock_napi don't get interrupted and continue to have busy_poll lock
the q_vector over and over.

I have to move the local_bh_disable in order to put napi_disable outside of the call since napi_disable
could sleep, causing a scheduling while atomic BUG.

Regards, Jake

> 
> --
> Ueimor

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net RFC 2/2] ixgbe: fix sleep bug caused by napi_disable inside local_bh_disable()d context
  2013-09-18 23:24     ` Keller, Jacob E
@ 2013-09-19 21:32       ` Francois Romieu
  2013-09-19 22:28         ` Keller, Jacob E
  0 siblings, 1 reply; 12+ messages in thread
From: Francois Romieu @ 2013-09-19 21:32 UTC (permalink / raw)
  To: Keller, Jacob E
  Cc: netdev, Duyck, Alexander H, Hyong-Youb Kim, Dmitry Kravkov,
	Amir Vadai, Eliezer Tamir

Keller, Jacob E <jacob.e.keller@intel.com> :
[...]
> I know that we need local_bh_disable around the qv_lock_napi code because
> it uses spin_lock instead of spin_lock_bh. I believe the reason we need bh
> disabled around the entire context is so that the small window between
> failed calls to qv_lock_napi don't get interrupted and continue to have
> busy_poll lock the q_vector over and over.

Thanks for explaining the intent. I'll do my homework.

> I have to move the local_bh_disable in order to put napi_disable outside
> of the call since napi_disable could sleep, causing a scheduling while
> atomic BUG.

I am in violent agreement with this part.

--
Ueimor

^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: [PATCH net RFC 2/2] ixgbe: fix sleep bug caused by napi_disable inside local_bh_disable()d context
  2013-09-19 21:32       ` Francois Romieu
@ 2013-09-19 22:28         ` Keller, Jacob E
  2013-10-01 12:11           ` Yuval Mintz
  0 siblings, 1 reply; 12+ messages in thread
From: Keller, Jacob E @ 2013-09-19 22:28 UTC (permalink / raw)
  To: Francois Romieu
  Cc: netdev, Duyck, Alexander H, Hyong-Youb Kim, Dmitry Kravkov,
	Amir Vadai, Eliezer Tamir

> -----Original Message-----
> From: Francois Romieu [mailto:romieu@fr.zoreil.com]
> Sent: Thursday, September 19, 2013 2:33 PM
> To: Keller, Jacob E
> Cc: netdev@vger.kernel.org; Duyck, Alexander H; Hyong-Youb Kim;
> Dmitry Kravkov; Amir Vadai; Eliezer Tamir
> Subject: Re: [PATCH net RFC 2/2] ixgbe: fix sleep bug caused by
> napi_disable inside local_bh_disable()d context
> 
> Keller, Jacob E <jacob.e.keller@intel.com> :
> [...]
> > I know that we need local_bh_disable around the qv_lock_napi code
> because
> > it uses spin_lock instead of spin_lock_bh. I believe the reason we need
> bh
> > disabled around the entire context is so that the small window between
> > failed calls to qv_lock_napi don't get interrupted and continue to have
> > busy_poll lock the q_vector over and over.
> 
> Thanks for explaining the intent. I'll do my homework.
> 

Thanks. Again, I am not entirely sure this is the right solution, (hence the RFC) because I am not that familiar with the code here.

> > I have to move the local_bh_disable in order to put napi_disable
> outside
> > of the call since napi_disable could sleep, causing a scheduling while
> > atomic BUG.
> 
> I am in violent agreement with this part.
> --
> Ueimor

Regards,
Jake

^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: [PATCH net RFC 2/2] ixgbe: fix sleep bug caused by napi_disable inside local_bh_disable()d context
  2013-09-19 22:28         ` Keller, Jacob E
@ 2013-10-01 12:11           ` Yuval Mintz
  2013-10-01 20:05             ` Keller, Jacob E
  2013-10-01 20:08             ` Keller, Jacob E
  0 siblings, 2 replies; 12+ messages in thread
From: Yuval Mintz @ 2013-10-01 12:11 UTC (permalink / raw)
  To: Keller, Jacob E, Francois Romieu
  Cc: netdev, Duyck, Alexander H, Hyong-Youb Kim, Dmitry Kravkov,
	Amir Vadai, Eliezer Tamir

> > > I have to move the local_bh_disable in order to put napi_disable
> > outside
> > > of the call since napi_disable could sleep, causing a scheduling while
> > > atomic BUG.
> >
> > I am in violent agreement with this part.
> > --
> > Ueimor
> 
> Regards,
> Jake
> --

It seem like we've hit the same issue with the bnx2x driver.
Is there anything new about the RFC?

Thanks,
Yuval

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net RFC 2/2] ixgbe: fix sleep bug caused by napi_disable inside local_bh_disable()d context
  2013-10-01 12:11           ` Yuval Mintz
@ 2013-10-01 20:05             ` Keller, Jacob E
  2013-10-01 20:08             ` Keller, Jacob E
  1 sibling, 0 replies; 12+ messages in thread
From: Keller, Jacob E @ 2013-10-01 20:05 UTC (permalink / raw)
  To: Yuval Mintz
  Cc: Francois Romieu, netdev, Duyck, Alexander H, Hyong-Youb Kim,
	Dmitry Kravkov, Amir Vadai, Eliezer Tamir

On Tue, 2013-10-01 at 12:11 +0000, Yuval Mintz wrote:
> > > > I have to move the local_bh_disable in order to put napi_disable
> > > outside
> > > > of the call since napi_disable could sleep, causing a scheduling while
> > > > atomic BUG.
> > >
> > > I am in violent agreement with this part.
> > > --
> > > Ueimor
> > 
> > Regards,
> > Jake
> > --
> 
> It seem like we've hit the same issue with the bnx2x driver.
> Is there anything new about the RFC?
> 
> Thanks,
> Yuval
> 
> 
> 
> 
> 

The napi_disable call for might_sleep() is the same. The solution in the
ixgbe driver is different. I completely re-wrote the segment about how
to disable the q_vector by adding a new state, rather than abusing the
QV_LOCKED_NAPI state. In addition I refactored it so that the
qv_lock_napi used spin_lock_bh() instead of plain spin_lock(), and
changed it so that we didn't need the local_bh_disable() call in
ixgbe_napi_disable_all.

This is a much cleaner solution than what I originally proposed.

Regards,
Jake

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net RFC 2/2] ixgbe: fix sleep bug caused by napi_disable inside local_bh_disable()d context
  2013-10-01 12:11           ` Yuval Mintz
  2013-10-01 20:05             ` Keller, Jacob E
@ 2013-10-01 20:08             ` Keller, Jacob E
  1 sibling, 0 replies; 12+ messages in thread
From: Keller, Jacob E @ 2013-10-01 20:08 UTC (permalink / raw)
  To: Yuval Mintz
  Cc: Francois Romieu, netdev, Duyck, Alexander H, Hyong-Youb Kim,
	Dmitry Kravkov, Amir Vadai, Eliezer Tamir

On Tue, 2013-10-01 at 12:11 +0000, Yuval Mintz wrote:
> > > > I have to move the local_bh_disable in order to put napi_disable
> > > outside
> > > > of the call since napi_disable could sleep, causing a scheduling while
> > > > atomic BUG.
> > >
> > > I am in violent agreement with this part.
> > > --
> > > Ueimor
> > 
> > Regards,
> > Jake
> > --
> 
> It seem like we've hit the same issue with the bnx2x driver.
> Is there anything new about the RFC?
> 
> Thanks,
> Yuval
> 
> 
> 
> 
> 

FYI, the new patch is currently going through internal validation here,
and once Jeff Kirsher mails it, I will ensure that you are Cc'ed on it.

Regards,
Jake

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net RFC 1/2] netdevice: add might_sleep() call to napi_disable
  2013-09-18 20:20 ` [PATCH net RFC 1/2] netdevice: add might_sleep() call to napi_disable Jacob Keller
@ 2013-10-08 12:39   ` Amir Vadai
  0 siblings, 0 replies; 12+ messages in thread
From: Amir Vadai @ 2013-10-08 12:39 UTC (permalink / raw)
  To: Jacob Keller
  Cc: netdev, Alex Duyck, Hyong-Youb Kim, Dmitry Kravkov, Eliezer Tamir

On 18/09/2013 23:20, Jacob Keller wrote:
> napi_disable potentially calls msleep(1) if the NAPI_STATE_SCHED bit is
> previously set by something else. Because it does not always call msleep, it
> was difficult to track down a bug related to calling napi_disable within
> local_bh_disable()d context. This patch adds a might_sleep() call to the
> napi_disable routine in order to aid in the future debugging of similar issues.
> This will cause a BUG in drivers which have implemented busy polling in a
> similar fashion to ixgbe, and which call napi_disable inside the
> local_bh_disable()d section where the vector napi lock is taken.
> 
> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
> Cc: Eliezer Tamir <eliezer.tamir@linux.intel.com>
> Cc: Alex Duyck <alexander.h.duyck@intel.com>
> Cc: Hyong-Youb Kim <hykim@myri.com>
> Cc: Amir Vadai <amirv@mellanox.com>
> Cc: Dmitry Kravkov <dmitry@broadcom.com>
> ---
>  include/linux/netdevice.h |    2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 3de49ac..81dab00 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -483,6 +483,8 @@ extern void napi_hash_del(struct napi_struct *napi);
>   */
>  static inline void napi_disable(struct napi_struct *n)
>  {
> +	might_sleep();
> +
>  	set_bit(NAPI_STATE_DISABLE, &n->state);
>  	while (test_and_set_bit(NAPI_STATE_SCHED, &n->state))
>  		msleep(1);
> 

Works ok with mlx4_en driver.

Acked-By: Amir Vadai <amirv@mellanox.com>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH net RFC 1/2] netdevice: add might_sleep() call to napi_disable
  2013-09-17  0:59 [PATCH net RFC 0/2] Series short description Jacob Keller
@ 2013-09-17  0:59 ` Jacob Keller
  0 siblings, 0 replies; 12+ messages in thread
From: Jacob Keller @ 2013-09-17  0:59 UTC (permalink / raw)
  To: netdev

napi_disable potentially calls msleep(1) if the NAPI_STATE_SCHED bit is
previously set by something else. Because it does not always call msleep, it
was difficult to track down a bug related to calling napi_disable within
local_bh_disable()d context. This patch adds a might_sleep() call to the
napi_disable routine in order to aid in the future debugging of similar issues.
This will cause a BUG in drivers which have implemented busy polling in a
similar fashion to ixgbe, and which call napi_disable inside the
local_bh_disable()d section where the vector napi lock is taken.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Cc: Eliezer Tamir <eliezer.tamir@linux.intel.com>
Cc: Alex Duyck <alexander.h.duyck@intel.com>
Cc: Hyong-Youb Kim <hykim@myri.com>
Cc: Amir Vadai <amirv@mellanox.com>
Cc: Dmitry Kravkov <dmitry@broadcom.com>
---
 include/linux/netdevice.h |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 041b42a..5e3ef61 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -483,6 +483,8 @@ extern void napi_hash_del(struct napi_struct *napi);
  */
 static inline void napi_disable(struct napi_struct *n)
 {
+	might_sleep();
+
 	set_bit(NAPI_STATE_DISABLE, &n->state);
 	while (test_and_set_bit(NAPI_STATE_SCHED, &n->state))
 		msleep(1);

^ permalink raw reply related	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2013-10-08 12:39 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-09-18 20:19 [PATCH net RFC 0/2] net: BUG napi_disable sleep while atomic Jacob Keller
2013-09-18 20:20 ` [PATCH net RFC 1/2] netdevice: add might_sleep() call to napi_disable Jacob Keller
2013-10-08 12:39   ` Amir Vadai
2013-09-18 20:20 ` [PATCH net RFC 2/2] ixgbe: fix sleep bug caused by napi_disable inside local_bh_disable()d context Jacob Keller
2013-09-18 23:09   ` Francois Romieu
2013-09-18 23:24     ` Keller, Jacob E
2013-09-19 21:32       ` Francois Romieu
2013-09-19 22:28         ` Keller, Jacob E
2013-10-01 12:11           ` Yuval Mintz
2013-10-01 20:05             ` Keller, Jacob E
2013-10-01 20:08             ` Keller, Jacob E
  -- strict thread matches above, loose matches on Subject: below --
2013-09-17  0:59 [PATCH net RFC 0/2] Series short description Jacob Keller
2013-09-17  0:59 ` [PATCH net RFC 1/2] netdevice: add might_sleep() call to napi_disable Jacob Keller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).