All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net] napi: fix race inside napi_enable
@ 2021-09-18  6:48 Xuan Zhuo
  2021-09-18  7:19 ` Yunsheng Lin
  0 siblings, 1 reply; 2+ messages in thread
From: Xuan Zhuo @ 2021-09-18  6:48 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Jakub Kicinski, Eric Dumazet, Daniel Borkmann,
	Antoine Tenart, Alexander Lobakin, Wei Wang, Taehee Yoo,
	Björn Töpel, Arnd Bergmann, Kumar Kartikeya Dwivedi,
	dust.li

The process will cause napi.state to contain NAPI_STATE_SCHED and
not in the poll_list, which will cause napi_disable() to get stuck.

The prefix "NAPI_STATE_" is removed in the figure below, and
NAPI_STATE_HASHED is ignored in napi.state.

                      CPU0       |                   CPU1       | napi.state
===============================================================================
napi_disable()                   |                              | SCHED | NPSVC
napi_enable()                    |                              |
{                                |                              |
    smp_mb__before_atomic();     |                              |
    clear_bit(SCHED, &n->state); |                              | NPSVC
                                 | napi_schedule_prep()         | SCHED | NPSVC
                                 | napi_poll()                  |
                                 |   napi_complete_done()       |
                                 |   {                          |
                                 |      if (n->state & (NPSVC | | (1)
                                 |               _BUSY_POLL)))  |
                                 |           return false;      |
                                 |     ................         |
                                 |   }                          | SCHED | NPSVC
                                 |                              |
    clear_bit(NPSVC, &n->state); |                              | SCHED
}                                |                              |
                                 |                              |
napi_schedule_prep()             |                              | SCHED | MISSED (2)

(1) Here return direct. Because of NAPI_STATE_NPSVC exists.
(2) NAPI_STATE_SCHED exists. So not add napi.poll_list to sd->poll_list

Since NAPI_STATE_SCHED already exists and napi is not in the
sd->poll_list queue, NAPI_STATE_SCHED cannot be cleared and will always
exist.

1. This will cause this queue to no longer receive packets.
2. If you encounter napi_disable under the protection of rtnl_lock, it
   will cause the entire rtnl_lock to be locked, affecting the overall
   system.

This patch uses cmpxchg to implement napi_enable(), which ensures that
there will be no race due to the separation of clear two bits.

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Reviewed-by: Dust Li <dust.li@linux.alibaba.com>
---
 net/core/dev.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 74fd402d26dd..3457ae964b8c 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6923,12 +6923,19 @@ EXPORT_SYMBOL(napi_disable);
  */
 void napi_enable(struct napi_struct *n)
 {
+	unsigned long val, new;
+
 	BUG_ON(!test_bit(NAPI_STATE_SCHED, &n->state));
-	smp_mb__before_atomic();
-	clear_bit(NAPI_STATE_SCHED, &n->state);
-	clear_bit(NAPI_STATE_NPSVC, &n->state);
-	if (n->dev->threaded && n->thread)
-		set_bit(NAPI_STATE_THREADED, &n->state);
+
+	do {
+		val = READ_ONCE(n->state);
+
+		new = val & ~(NAPIF_STATE_SCHED | NAPIF_STATE_NPSVC);
+
+		if (n->dev->threaded && n->thread)
+			new |= NAPIF_STATE_THREADED;
+
+	} while (cmpxchg(&n->state, val, new) != val);
 }
 EXPORT_SYMBOL(napi_enable);
 
-- 
2.31.0


^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: [PATCH net] napi: fix race inside napi_enable
  2021-09-18  6:48 [PATCH net] napi: fix race inside napi_enable Xuan Zhuo
@ 2021-09-18  7:19 ` Yunsheng Lin
  0 siblings, 0 replies; 2+ messages in thread
From: Yunsheng Lin @ 2021-09-18  7:19 UTC (permalink / raw)
  To: Xuan Zhuo, netdev
  Cc: David S. Miller, Jakub Kicinski, Eric Dumazet, Daniel Borkmann,
	Antoine Tenart, Alexander Lobakin, Wei Wang, Taehee Yoo,
	Björn Töpel, Arnd Bergmann, Kumar Kartikeya Dwivedi,
	dust.li

On 2021/9/18 14:48, Xuan Zhuo wrote:
> The process will cause napi.state to contain NAPI_STATE_SCHED and
> not in the poll_list, which will cause napi_disable() to get stuck.
> 
> The prefix "NAPI_STATE_" is removed in the figure below, and
> NAPI_STATE_HASHED is ignored in napi.state.
> 
>                       CPU0       |                   CPU1       | napi.state
> ===============================================================================
> napi_disable()                   |                              | SCHED | NPSVC
> napi_enable()                    |                              |
> {                                |                              |
>     smp_mb__before_atomic();     |                              |
>     clear_bit(SCHED, &n->state); |                              | NPSVC
>                                  | napi_schedule_prep()         | SCHED | NPSVC
>                                  | napi_poll()                  |
>                                  |   napi_complete_done()       |
>                                  |   {                          |
>                                  |      if (n->state & (NPSVC | | (1)
>                                  |               _BUSY_POLL)))  |
>                                  |           return false;      |
>                                  |     ................         |
>                                  |   }                          | SCHED | NPSVC
>                                  |                              |
>     clear_bit(NPSVC, &n->state); |                              | SCHED
> }                                |                              |
>                                  |                              |
> napi_schedule_prep()             |                              | SCHED | MISSED (2)
> 
> (1) Here return direct. Because of NAPI_STATE_NPSVC exists.
> (2) NAPI_STATE_SCHED exists. So not add napi.poll_list to sd->poll_list
> 
> Since NAPI_STATE_SCHED already exists and napi is not in the
> sd->poll_list queue, NAPI_STATE_SCHED cannot be cleared and will always
> exist.
> 
> 1. This will cause this queue to no longer receive packets.
> 2. If you encounter napi_disable under the protection of rtnl_lock, it
>    will cause the entire rtnl_lock to be locked, affecting the overall
>    system.
> 
> This patch uses cmpxchg to implement napi_enable(), which ensures that
> there will be no race due to the separation of clear two bits.
> 

a Fixes tag is needed here.

> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> Reviewed-by: Dust Li <dust.li@linux.alibaba.com>
> ---
>  net/core/dev.c | 17 ++++++++++++-----
>  1 file changed, 12 insertions(+), 5 deletions(-)
> 
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 74fd402d26dd..3457ae964b8c 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -6923,12 +6923,19 @@ EXPORT_SYMBOL(napi_disable);
>   */
>  void napi_enable(struct napi_struct *n)
>  {
> +	unsigned long val, new;
> +
>  	BUG_ON(!test_bit(NAPI_STATE_SCHED, &n->state));
> -	smp_mb__before_atomic();
> -	clear_bit(NAPI_STATE_SCHED, &n->state);
> -	clear_bit(NAPI_STATE_NPSVC, &n->state);
> -	if (n->dev->threaded && n->thread)
> -		set_bit(NAPI_STATE_THREADED, &n->state);
> +
> +	do {
> +		val = READ_ONCE(n->state);

It seems we might need to move the above BUG_ON here to preserve
the orginial BUG_ON behavior?

> +
> +		new = val & ~(NAPIF_STATE_SCHED | NAPIF_STATE_NPSVC);
> +
> +		if (n->dev->threaded && n->thread)
> +			new |= NAPIF_STATE_THREADED;
> +
> +	} while (cmpxchg(&n->state, val, new) != val);
>  }
>  EXPORT_SYMBOL(napi_enable);
>  
> 

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2021-09-18  7:19 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-18  6:48 [PATCH net] napi: fix race inside napi_enable Xuan Zhuo
2021-09-18  7:19 ` Yunsheng Lin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.