* [PATCH net RFC 0/2] net: BUG napi_disable sleep while atomic
@ 2013-09-18 20:19 Jacob Keller
2013-09-18 20:20 ` [PATCH net RFC 1/2] netdevice: add might_sleep() call to napi_disable Jacob Keller
2013-09-18 20:20 ` [PATCH net RFC 2/2] ixgbe: fix sleep bug caused by napi_disable inside local_bh_disable()d context Jacob Keller
0 siblings, 2 replies; 12+ messages in thread
From: Jacob Keller @ 2013-09-18 20:19 UTC (permalink / raw)
To: netdev
The following series implements a fix for a possible sleep while atomic bug in
the ixgbe driver, caused because napi_disable might sleep. The first patch adds
a might_sleep() call in order to make future sleep while atomic bugs easier to
discover, (as napi_disable does not always sleep!). The second patch is my best
guess at an actual fix for the bug. There is likely a similar bug in every
driver which implemented NET_RX_BUSY_POLL similar to ixgbe. The first patch
will likely cause BUG splats to appear in those drivers unless they are fixed.
---
Jacob Keller (2):
netdevice: add might_sleep() call to napi_disable
ixgbe: fix sleep bug caused by napi_disable inside local_bh_disable()d context
drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 4 ++--
include/linux/netdevice.h | 2 ++
2 files changed, 4 insertions(+), 2 deletions(-)
--
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH net RFC 1/2] netdevice: add might_sleep() call to napi_disable
2013-09-18 20:19 [PATCH net RFC 0/2] net: BUG napi_disable sleep while atomic Jacob Keller
@ 2013-09-18 20:20 ` Jacob Keller
2013-10-08 12:39 ` Amir Vadai
2013-09-18 20:20 ` [PATCH net RFC 2/2] ixgbe: fix sleep bug caused by napi_disable inside local_bh_disable()d context Jacob Keller
1 sibling, 1 reply; 12+ messages in thread
From: Jacob Keller @ 2013-09-18 20:20 UTC (permalink / raw)
To: netdev
Cc: Alex Duyck, Hyong-Youb Kim, Dmitry Kravkov, Amir Vadai, Eliezer Tamir
napi_disable potentially calls msleep(1) if the NAPI_STATE_SCHED bit is
previously set by something else. Because it does not always call msleep, it
was difficult to track down a bug related to calling napi_disable within
local_bh_disable()d context. This patch adds a might_sleep() call to the
napi_disable routine in order to aid in the future debugging of similar issues.
This will cause a BUG in drivers which have implemented busy polling in a
similar fashion to ixgbe, and which call napi_disable inside the
local_bh_disable()d section where the vector napi lock is taken.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Cc: Eliezer Tamir <eliezer.tamir@linux.intel.com>
Cc: Alex Duyck <alexander.h.duyck@intel.com>
Cc: Hyong-Youb Kim <hykim@myri.com>
Cc: Amir Vadai <amirv@mellanox.com>
Cc: Dmitry Kravkov <dmitry@broadcom.com>
---
include/linux/netdevice.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 3de49ac..81dab00 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -483,6 +483,8 @@ extern void napi_hash_del(struct napi_struct *napi);
*/
static inline void napi_disable(struct napi_struct *n)
{
+ might_sleep();
+
set_bit(NAPI_STATE_DISABLE, &n->state);
while (test_and_set_bit(NAPI_STATE_SCHED, &n->state))
msleep(1);
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH net RFC 2/2] ixgbe: fix sleep bug caused by napi_disable inside local_bh_disable()d context
2013-09-18 20:19 [PATCH net RFC 0/2] net: BUG napi_disable sleep while atomic Jacob Keller
2013-09-18 20:20 ` [PATCH net RFC 1/2] netdevice: add might_sleep() call to napi_disable Jacob Keller
@ 2013-09-18 20:20 ` Jacob Keller
2013-09-18 23:09 ` Francois Romieu
1 sibling, 1 reply; 12+ messages in thread
From: Jacob Keller @ 2013-09-18 20:20 UTC (permalink / raw)
To: netdev
Cc: Alexander Duyck, Hyong-Youb Kim, Dmitry Kravkov, Amir Vadai,
Eliezer Tamir
This patch fixes a bug caused by calling napi_disable after local_bh_disable.
It is possible for napi_disable to sleep, (though not guarunteed) so it could
cause an atomic sleep bug during the schedule() call in msleep. This patch
resolves the issue by moving the local_bh_disable() calls inside the for loop
in ixgbe_napi_disable_all().
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Cc: Eliezer Tamir <eliezer.tamir@linux.intel.com>
Cc: Alexander Duyck <alexander.duyck@intel.com>
Cc: Hyong-Youb Kim <hykim@myri.com>
Cc: Amir Vadai <amirv@mellanox.com>
Cc: Dmitry Kravkov <dmitry@broadcom.com>
---
drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 0ade0cd..85dbf58 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -3893,15 +3893,15 @@ static void ixgbe_napi_disable_all(struct ixgbe_adapter *adapter)
{
int q_idx;
- local_bh_disable(); /* for ixgbe_qv_lock_napi() */
for (q_idx = 0; q_idx < adapter->num_q_vectors; q_idx++) {
napi_disable(&adapter->q_vector[q_idx]->napi);
+ local_bh_disable(); /* for ixgbe_qv_lock_napi() */
while (!ixgbe_qv_lock_napi(adapter->q_vector[q_idx])) {
pr_info("QV %d locked\n", q_idx);
mdelay(1);
}
+ local_bh_enable();
}
- local_bh_enable();
}
#ifdef CONFIG_IXGBE_DCB
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH net RFC 2/2] ixgbe: fix sleep bug caused by napi_disable inside local_bh_disable()d context
2013-09-18 20:20 ` [PATCH net RFC 2/2] ixgbe: fix sleep bug caused by napi_disable inside local_bh_disable()d context Jacob Keller
@ 2013-09-18 23:09 ` Francois Romieu
2013-09-18 23:24 ` Keller, Jacob E
0 siblings, 1 reply; 12+ messages in thread
From: Francois Romieu @ 2013-09-18 23:09 UTC (permalink / raw)
To: Jacob Keller
Cc: netdev, Alexander Duyck, Hyong-Youb Kim, Dmitry Kravkov,
Amir Vadai, Eliezer Tamir
Jacob Keller <jacob.e.keller@intel.com> :
> This patch fixes a bug caused by calling napi_disable after local_bh_disable.
> It is possible for napi_disable to sleep, (though not guarunteed) so it could
> cause an atomic sleep bug during the schedule() call in msleep. This patch
> resolves the issue by moving the local_bh_disable() calls inside the for loop
> in ixgbe_napi_disable_all().
Why couldn't you take mdelay(1) outside of the locally disabled bh section ?
--
Ueimor
^ permalink raw reply [flat|nested] 12+ messages in thread
* RE: [PATCH net RFC 2/2] ixgbe: fix sleep bug caused by napi_disable inside local_bh_disable()d context
2013-09-18 23:09 ` Francois Romieu
@ 2013-09-18 23:24 ` Keller, Jacob E
2013-09-19 21:32 ` Francois Romieu
0 siblings, 1 reply; 12+ messages in thread
From: Keller, Jacob E @ 2013-09-18 23:24 UTC (permalink / raw)
To: Francois Romieu
Cc: netdev, Duyck, Alexander H, Hyong-Youb Kim, Dmitry Kravkov,
Amir Vadai, Eliezer Tamir
> -----Original Message-----
> From: Francois Romieu [mailto:romieu@fr.zoreil.com]
> Sent: Wednesday, September 18, 2013 4:10 PM
> To: Keller, Jacob E
> Cc: netdev@vger.kernel.org; Duyck, Alexander H; Hyong-Youb Kim;
> Dmitry Kravkov; Amir Vadai; Eliezer Tamir
> Subject: Re: [PATCH net RFC 2/2] ixgbe: fix sleep bug caused by
> napi_disable inside local_bh_disable()d context
>
> Jacob Keller <jacob.e.keller@intel.com> :
> > This patch fixes a bug caused by calling napi_disable after
> local_bh_disable.
> > It is possible for napi_disable to sleep, (though not guarunteed) so it
> could
> > cause an atomic sleep bug during the schedule() call in msleep. This
> patch
> > resolves the issue by moving the local_bh_disable() calls inside the for
> loop
> > in ixgbe_napi_disable_all().
>
> Why couldn't you take mdelay(1) outside of the locally disabled bh
> section ?
I actually am not sure. Originally the local_bh_disable was put around the entire for loop, over all of the
napi_disable and the qv_lock_napi code.
I know that we need local_bh_disable around the qv_lock_napi code because it uses spin_lock instead
of spin_lock_bh. I believe the reason we need bh disabled around the entire context is so that the small
window between failed calls to qv_lock_napi don't get interrupted and continue to have busy_poll lock
the q_vector over and over.
I have to move the local_bh_disable in order to put napi_disable outside of the call since napi_disable
could sleep, causing a scheduling while atomic BUG.
Regards, Jake
>
> --
> Ueimor
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net RFC 2/2] ixgbe: fix sleep bug caused by napi_disable inside local_bh_disable()d context
2013-09-18 23:24 ` Keller, Jacob E
@ 2013-09-19 21:32 ` Francois Romieu
2013-09-19 22:28 ` Keller, Jacob E
0 siblings, 1 reply; 12+ messages in thread
From: Francois Romieu @ 2013-09-19 21:32 UTC (permalink / raw)
To: Keller, Jacob E
Cc: netdev, Duyck, Alexander H, Hyong-Youb Kim, Dmitry Kravkov,
Amir Vadai, Eliezer Tamir
Keller, Jacob E <jacob.e.keller@intel.com> :
[...]
> I know that we need local_bh_disable around the qv_lock_napi code because
> it uses spin_lock instead of spin_lock_bh. I believe the reason we need bh
> disabled around the entire context is so that the small window between
> failed calls to qv_lock_napi don't get interrupted and continue to have
> busy_poll lock the q_vector over and over.
Thanks for explaining the intent. I'll do my homework.
> I have to move the local_bh_disable in order to put napi_disable outside
> of the call since napi_disable could sleep, causing a scheduling while
> atomic BUG.
I am in violent agreement with this part.
--
Ueimor
^ permalink raw reply [flat|nested] 12+ messages in thread
* RE: [PATCH net RFC 2/2] ixgbe: fix sleep bug caused by napi_disable inside local_bh_disable()d context
2013-09-19 21:32 ` Francois Romieu
@ 2013-09-19 22:28 ` Keller, Jacob E
2013-10-01 12:11 ` Yuval Mintz
0 siblings, 1 reply; 12+ messages in thread
From: Keller, Jacob E @ 2013-09-19 22:28 UTC (permalink / raw)
To: Francois Romieu
Cc: netdev, Duyck, Alexander H, Hyong-Youb Kim, Dmitry Kravkov,
Amir Vadai, Eliezer Tamir
> -----Original Message-----
> From: Francois Romieu [mailto:romieu@fr.zoreil.com]
> Sent: Thursday, September 19, 2013 2:33 PM
> To: Keller, Jacob E
> Cc: netdev@vger.kernel.org; Duyck, Alexander H; Hyong-Youb Kim;
> Dmitry Kravkov; Amir Vadai; Eliezer Tamir
> Subject: Re: [PATCH net RFC 2/2] ixgbe: fix sleep bug caused by
> napi_disable inside local_bh_disable()d context
>
> Keller, Jacob E <jacob.e.keller@intel.com> :
> [...]
> > I know that we need local_bh_disable around the qv_lock_napi code
> because
> > it uses spin_lock instead of spin_lock_bh. I believe the reason we need
> bh
> > disabled around the entire context is so that the small window between
> > failed calls to qv_lock_napi don't get interrupted and continue to have
> > busy_poll lock the q_vector over and over.
>
> Thanks for explaining the intent. I'll do my homework.
>
Thanks. Again, I am not entirely sure this is the right solution, (hence the RFC) because I am not that familiar with the code here.
> > I have to move the local_bh_disable in order to put napi_disable
> outside
> > of the call since napi_disable could sleep, causing a scheduling while
> > atomic BUG.
>
> I am in violent agreement with this part.
> --
> Ueimor
Regards,
Jake
^ permalink raw reply [flat|nested] 12+ messages in thread
* RE: [PATCH net RFC 2/2] ixgbe: fix sleep bug caused by napi_disable inside local_bh_disable()d context
2013-09-19 22:28 ` Keller, Jacob E
@ 2013-10-01 12:11 ` Yuval Mintz
2013-10-01 20:05 ` Keller, Jacob E
2013-10-01 20:08 ` Keller, Jacob E
0 siblings, 2 replies; 12+ messages in thread
From: Yuval Mintz @ 2013-10-01 12:11 UTC (permalink / raw)
To: Keller, Jacob E, Francois Romieu
Cc: netdev, Duyck, Alexander H, Hyong-Youb Kim, Dmitry Kravkov,
Amir Vadai, Eliezer Tamir
> > > I have to move the local_bh_disable in order to put napi_disable
> > outside
> > > of the call since napi_disable could sleep, causing a scheduling while
> > > atomic BUG.
> >
> > I am in violent agreement with this part.
> > --
> > Ueimor
>
> Regards,
> Jake
> --
It seem like we've hit the same issue with the bnx2x driver.
Is there anything new about the RFC?
Thanks,
Yuval
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net RFC 2/2] ixgbe: fix sleep bug caused by napi_disable inside local_bh_disable()d context
2013-10-01 12:11 ` Yuval Mintz
@ 2013-10-01 20:05 ` Keller, Jacob E
2013-10-01 20:08 ` Keller, Jacob E
1 sibling, 0 replies; 12+ messages in thread
From: Keller, Jacob E @ 2013-10-01 20:05 UTC (permalink / raw)
To: Yuval Mintz
Cc: Francois Romieu, netdev, Duyck, Alexander H, Hyong-Youb Kim,
Dmitry Kravkov, Amir Vadai, Eliezer Tamir
On Tue, 2013-10-01 at 12:11 +0000, Yuval Mintz wrote:
> > > > I have to move the local_bh_disable in order to put napi_disable
> > > outside
> > > > of the call since napi_disable could sleep, causing a scheduling while
> > > > atomic BUG.
> > >
> > > I am in violent agreement with this part.
> > > --
> > > Ueimor
> >
> > Regards,
> > Jake
> > --
>
> It seem like we've hit the same issue with the bnx2x driver.
> Is there anything new about the RFC?
>
> Thanks,
> Yuval
>
>
>
>
>
The napi_disable call for might_sleep() is the same. The solution in the
ixgbe driver is different. I completely re-wrote the segment about how
to disable the q_vector by adding a new state, rather than abusing the
QV_LOCKED_NAPI state. In addition I refactored it so that the
qv_lock_napi used spin_lock_bh() instead of plain spin_lock(), and
changed it so that we didn't need the local_bh_disable() call in
ixgbe_napi_disable_all.
This is a much cleaner solution than what I originally proposed.
Regards,
Jake
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net RFC 2/2] ixgbe: fix sleep bug caused by napi_disable inside local_bh_disable()d context
2013-10-01 12:11 ` Yuval Mintz
2013-10-01 20:05 ` Keller, Jacob E
@ 2013-10-01 20:08 ` Keller, Jacob E
1 sibling, 0 replies; 12+ messages in thread
From: Keller, Jacob E @ 2013-10-01 20:08 UTC (permalink / raw)
To: Yuval Mintz
Cc: Francois Romieu, netdev, Duyck, Alexander H, Hyong-Youb Kim,
Dmitry Kravkov, Amir Vadai, Eliezer Tamir
On Tue, 2013-10-01 at 12:11 +0000, Yuval Mintz wrote:
> > > > I have to move the local_bh_disable in order to put napi_disable
> > > outside
> > > > of the call since napi_disable could sleep, causing a scheduling while
> > > > atomic BUG.
> > >
> > > I am in violent agreement with this part.
> > > --
> > > Ueimor
> >
> > Regards,
> > Jake
> > --
>
> It seem like we've hit the same issue with the bnx2x driver.
> Is there anything new about the RFC?
>
> Thanks,
> Yuval
>
>
>
>
>
FYI, the new patch is currently going through internal validation here,
and once Jeff Kirsher mails it, I will ensure that you are Cc'ed on it.
Regards,
Jake
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net RFC 1/2] netdevice: add might_sleep() call to napi_disable
2013-09-18 20:20 ` [PATCH net RFC 1/2] netdevice: add might_sleep() call to napi_disable Jacob Keller
@ 2013-10-08 12:39 ` Amir Vadai
0 siblings, 0 replies; 12+ messages in thread
From: Amir Vadai @ 2013-10-08 12:39 UTC (permalink / raw)
To: Jacob Keller
Cc: netdev, Alex Duyck, Hyong-Youb Kim, Dmitry Kravkov, Eliezer Tamir
On 18/09/2013 23:20, Jacob Keller wrote:
> napi_disable potentially calls msleep(1) if the NAPI_STATE_SCHED bit is
> previously set by something else. Because it does not always call msleep, it
> was difficult to track down a bug related to calling napi_disable within
> local_bh_disable()d context. This patch adds a might_sleep() call to the
> napi_disable routine in order to aid in the future debugging of similar issues.
> This will cause a BUG in drivers which have implemented busy polling in a
> similar fashion to ixgbe, and which call napi_disable inside the
> local_bh_disable()d section where the vector napi lock is taken.
>
> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
> Cc: Eliezer Tamir <eliezer.tamir@linux.intel.com>
> Cc: Alex Duyck <alexander.h.duyck@intel.com>
> Cc: Hyong-Youb Kim <hykim@myri.com>
> Cc: Amir Vadai <amirv@mellanox.com>
> Cc: Dmitry Kravkov <dmitry@broadcom.com>
> ---
> include/linux/netdevice.h | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 3de49ac..81dab00 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -483,6 +483,8 @@ extern void napi_hash_del(struct napi_struct *napi);
> */
> static inline void napi_disable(struct napi_struct *n)
> {
> + might_sleep();
> +
> set_bit(NAPI_STATE_DISABLE, &n->state);
> while (test_and_set_bit(NAPI_STATE_SCHED, &n->state))
> msleep(1);
>
Works ok with mlx4_en driver.
Acked-By: Amir Vadai <amirv@mellanox.com>
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH net RFC 1/2] netdevice: add might_sleep() call to napi_disable
2013-09-17 0:59 [PATCH net RFC 0/2] Series short description Jacob Keller
@ 2013-09-17 0:59 ` Jacob Keller
0 siblings, 0 replies; 12+ messages in thread
From: Jacob Keller @ 2013-09-17 0:59 UTC (permalink / raw)
To: netdev
napi_disable potentially calls msleep(1) if the NAPI_STATE_SCHED bit is
previously set by something else. Because it does not always call msleep, it
was difficult to track down a bug related to calling napi_disable within
local_bh_disable()d context. This patch adds a might_sleep() call to the
napi_disable routine in order to aid in the future debugging of similar issues.
This will cause a BUG in drivers which have implemented busy polling in a
similar fashion to ixgbe, and which call napi_disable inside the
local_bh_disable()d section where the vector napi lock is taken.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Cc: Eliezer Tamir <eliezer.tamir@linux.intel.com>
Cc: Alex Duyck <alexander.h.duyck@intel.com>
Cc: Hyong-Youb Kim <hykim@myri.com>
Cc: Amir Vadai <amirv@mellanox.com>
Cc: Dmitry Kravkov <dmitry@broadcom.com>
---
include/linux/netdevice.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 041b42a..5e3ef61 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -483,6 +483,8 @@ extern void napi_hash_del(struct napi_struct *napi);
*/
static inline void napi_disable(struct napi_struct *n)
{
+ might_sleep();
+
set_bit(NAPI_STATE_DISABLE, &n->state);
while (test_and_set_bit(NAPI_STATE_SCHED, &n->state))
msleep(1);
^ permalink raw reply related [flat|nested] 12+ messages in thread
end of thread, other threads:[~2013-10-08 12:39 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-09-18 20:19 [PATCH net RFC 0/2] net: BUG napi_disable sleep while atomic Jacob Keller
2013-09-18 20:20 ` [PATCH net RFC 1/2] netdevice: add might_sleep() call to napi_disable Jacob Keller
2013-10-08 12:39 ` Amir Vadai
2013-09-18 20:20 ` [PATCH net RFC 2/2] ixgbe: fix sleep bug caused by napi_disable inside local_bh_disable()d context Jacob Keller
2013-09-18 23:09 ` Francois Romieu
2013-09-18 23:24 ` Keller, Jacob E
2013-09-19 21:32 ` Francois Romieu
2013-09-19 22:28 ` Keller, Jacob E
2013-10-01 12:11 ` Yuval Mintz
2013-10-01 20:05 ` Keller, Jacob E
2013-10-01 20:08 ` Keller, Jacob E
-- strict thread matches above, loose matches on Subject: below --
2013-09-17 0:59 [PATCH net RFC 0/2] Series short description Jacob Keller
2013-09-17 0:59 ` [PATCH net RFC 1/2] netdevice: add might_sleep() call to napi_disable Jacob Keller
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).