* [PATCH net v2] net: core: make napi_disable more robust
@ 2021-04-14  8:08 Lijun Pan
  2021-04-14  8:45 ` Yunsheng Lin
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Lijun Pan @ 2021-04-14  8:08 UTC (permalink / raw)
  To: netdev; +Cc: Lijun Pan

There are cases where napi_disable can be called twice by a NIC driver.
This can deadlock. For example,
the first napi_disable will spin until NAPI_STATE_SCHED is cleared
by napi_complete_done, then set it again.
When napi_disable is called the second time, it will loop forever
because no napi->poll will be running to clear NAPI_STATE_SCHED.
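
To illustrate, a simplified sketch of the current loop (abridged from
net/core/dev.c, not verbatim):

	void napi_disable(struct napi_struct *n)
	{
		might_sleep();
		set_bit(NAPI_STATE_DISABLE, &n->state);

		/* 1st call: napi_complete_done() eventually clears
		 * NAPI_STATE_SCHED, so test_and_set_bit() succeeds and
		 * re-sets it.  2nd call: no poll is running anymore,
		 * nothing ever clears NAPI_STATE_SCHED, and this loop
		 * spins forever.
		 */
		while (test_and_set_bit(NAPI_STATE_SCHED, &n->state))
			msleep(1);
		...
	}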

Though it is the driver writer's responsibility to make sure it is
called only once, making napi_disable more robust does not hurt, and
it can prevent a buggy driver from crashing the system.
So, we check the napi state bits to make sure that if napi is already
disabled, we exit the call early enough to avoid spinning forever.

Fixes: bea3348eef27 ("[NET]: Make NAPI polling independent of struct net_device objects.")
Signed-off-by: Lijun Pan <lijunp213@gmail.com>
---
v2: justify that this patch makes napi_disable more robust.

 net/core/dev.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/net/core/dev.c b/net/core/dev.c
index 1f79b9aa9a3f..fa0aa212b7bb 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6830,6 +6830,24 @@ EXPORT_SYMBOL(netif_napi_add);
 void napi_disable(struct napi_struct *n)
 {
 	might_sleep();
+
+	/* Make sure napi_disable() runs only once.
+	 * When napi is already disabled, the state bits look like this:
+	 * NAPI_STATE_SCHED (set by the previous napi_disable)
+	 * NAPI_STATE_NPSVC (set by the previous napi_disable)
+	 * NAPI_STATE_DISABLE (cleared by the previous napi_disable)
+	 * NAPI_STATE_PREFER_BUSY_POLL (cleared by the previous napi_complete_done)
+	 * NAPI_STATE_MISSED (cleared by the previous napi_complete_done)
+	 */
+
+	if (napi_disable_pending(n))
+		return;
+	if (test_bit(NAPI_STATE_SCHED, &n->state) &&
+	    test_bit(NAPI_STATE_NPSVC, &n->state) &&
+	    !test_bit(NAPI_STATE_MISSED, &n->state) &&
+	    !test_bit(NAPI_STATE_PREFER_BUSY_POLL, &n->state))
+		return;
+
 	set_bit(NAPI_STATE_DISABLE, &n->state);
 
 	while (test_and_set_bit(NAPI_STATE_SCHED, &n->state))
-- 
2.23.0



* Re: [PATCH net v2] net: core: make napi_disable more robust
  2021-04-14  8:08 [PATCH net v2] net: core: make napi_disable more robust Lijun Pan
@ 2021-04-14  8:45 ` Yunsheng Lin
  2021-04-14 17:31   ` Lijun Pan
  2021-04-14 23:21 ` Jakub Kicinski
  2021-04-15  6:47 ` Eric Dumazet
  2 siblings, 1 reply; 6+ messages in thread
From: Yunsheng Lin @ 2021-04-14  8:45 UTC (permalink / raw)
  To: Lijun Pan, netdev

On 2021/4/14 16:08, Lijun Pan wrote:
> There are cases where napi_disable can be called twice by a NIC driver.
> This can deadlock. For example,
> the first napi_disable will spin until NAPI_STATE_SCHED is cleared
> by napi_complete_done, then set it again.
> When napi_disable is called the second time, it will loop forever
> because no napi->poll will be running to clear NAPI_STATE_SCHED.
> 
> Though it is the driver writer's responsibility to make sure it is
> called only once, making napi_disable more robust does not hurt, and
> it can prevent a buggy driver from crashing the system.
> So, we check the napi state bits to make sure that if napi is already
> disabled, we exit the call early enough to avoid spinning forever.
> 
> Fixes: bea3348eef27 ("[NET]: Make NAPI polling independent of struct net_device objects.")
> Signed-off-by: Lijun Pan <lijunp213@gmail.com>
> ---
> v2: justify that this patch makes napi_disable more robust.
> 
>  net/core/dev.c | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)
> 
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 1f79b9aa9a3f..fa0aa212b7bb 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -6830,6 +6830,24 @@ EXPORT_SYMBOL(netif_napi_add);
>  void napi_disable(struct napi_struct *n)
>  {
>  	might_sleep();
> +
> +	/* Make sure napi_disable() runs only once.
> +	 * When napi is already disabled, the state bits look like this:
> +	 * NAPI_STATE_SCHED (set by the previous napi_disable)
> +	 * NAPI_STATE_NPSVC (set by the previous napi_disable)
> +	 * NAPI_STATE_DISABLE (cleared by the previous napi_disable)
> +	 * NAPI_STATE_PREFER_BUSY_POLL (cleared by the previous napi_complete_done)
> +	 * NAPI_STATE_MISSED (cleared by the previous napi_complete_done)
> +	 */
> +
> +	if (napi_disable_pending(n))
> +		return;
> +	if (test_bit(NAPI_STATE_SCHED, &n->state) &&
> +	    test_bit(NAPI_STATE_NPSVC, &n->state) &&
> +	    !test_bit(NAPI_STATE_MISSED, &n->state) &&
> +	    !test_bit(NAPI_STATE_PREFER_BUSY_POLL, &n->state))
> +		return;

NAPI_STATE_DISABLE is cleared at the end of napi_disable(),
so if a buggy driver/hw triggers an interrupt and the driver calls
napi_schedule_irqoff(), napi_schedule_prep() may set NAPI_STATE_MISSED
when NAPI_STATE_SCHED is already set. The above check does not seem
to handle that case?
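
For reference, napi_schedule_prep() does roughly this (simplified
sketch, not verbatim kernel code):

	bool napi_schedule_prep(struct napi_struct *n)
	{
		unsigned long val, new;

		do {
			val = READ_ONCE(n->state);
			if (unlikely(val & NAPIF_STATE_DISABLE))
				return false;
			new = val | NAPIF_STATE_SCHED;
			/* already scheduled: record a missed round */
			if (val & NAPIF_STATE_SCHED)
				new |= NAPIF_STATE_MISSED;
		} while (cmpxchg(&n->state, val, new) != val);

		return !(val & NAPIF_STATE_SCHED);
	}

Since napi_disable() clears NAPI_STATE_DISABLE before returning, this
path is not blocked afterwards and can still set NAPI_STATE_MISSED.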

> +
>  	set_bit(NAPI_STATE_DISABLE, &n->state);
>  
>  	while (test_and_set_bit(NAPI_STATE_SCHED, &n->state))
> 



* Re: [PATCH net v2] net: core: make napi_disable more robust
  2021-04-14  8:45 ` Yunsheng Lin
@ 2021-04-14 17:31   ` Lijun Pan
  0 siblings, 0 replies; 6+ messages in thread
From: Lijun Pan @ 2021-04-14 17:31 UTC (permalink / raw)
  To: Yunsheng Lin; +Cc: netdev

On Wed, Apr 14, 2021 at 3:45 AM Yunsheng Lin <linyunsheng@huawei.com> wrote:
>
> On 2021/4/14 16:08, Lijun Pan wrote:
> > There are cases where napi_disable can be called twice by a NIC driver.
> > This can deadlock. For example,
> > the first napi_disable will spin until NAPI_STATE_SCHED is cleared
> > by napi_complete_done, then set it again.
> > When napi_disable is called the second time, it will loop forever
> > because no napi->poll will be running to clear NAPI_STATE_SCHED.
> >
> > Though it is the driver writer's responsibility to make sure it is
> > called only once, making napi_disable more robust does not hurt, and
> > it can prevent a buggy driver from crashing the system.
> > So, we check the napi state bits to make sure that if napi is already
> > disabled, we exit the call early enough to avoid spinning forever.
> >
> > Fixes: bea3348eef27 ("[NET]: Make NAPI polling independent of struct net_device objects.")
> > Signed-off-by: Lijun Pan <lijunp213@gmail.com>
> > ---
> > v2: justify that this patch makes napi_disable more robust.
> >
> >  net/core/dev.c | 18 ++++++++++++++++++
> >  1 file changed, 18 insertions(+)
> >
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index 1f79b9aa9a3f..fa0aa212b7bb 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -6830,6 +6830,24 @@ EXPORT_SYMBOL(netif_napi_add);
> >  void napi_disable(struct napi_struct *n)
> >  {
> >       might_sleep();
> > +
> > +     /* Make sure napi_disable() runs only once.
> > +      * When napi is already disabled, the state bits look like this:
> > +      * NAPI_STATE_SCHED (set by the previous napi_disable)
> > +      * NAPI_STATE_NPSVC (set by the previous napi_disable)
> > +      * NAPI_STATE_DISABLE (cleared by the previous napi_disable)
> > +      * NAPI_STATE_PREFER_BUSY_POLL (cleared by the previous napi_complete_done)
> > +      * NAPI_STATE_MISSED (cleared by the previous napi_complete_done)
> > +      */
> > +
> > +     if (napi_disable_pending(n))
> > +             return;
> > +     if (test_bit(NAPI_STATE_SCHED, &n->state) &&
> > +         test_bit(NAPI_STATE_NPSVC, &n->state) &&
> > +         !test_bit(NAPI_STATE_MISSED, &n->state) &&
> > +         !test_bit(NAPI_STATE_PREFER_BUSY_POLL, &n->state))
> > +             return;
>
> NAPI_STATE_DISABLE is cleared at the end of napi_disable(),
> so if a buggy driver/hw triggers an interrupt and the driver calls
> napi_schedule_irqoff(), napi_schedule_prep() may set NAPI_STATE_MISSED
> when NAPI_STATE_SCHED is already set. The above check does not seem
> to handle that case?

What I described in the commit message is napi_disable() being called
twice from the same instance, on the same cpu, e.g.:
funcA {
    napi_disable();
    ...
    funcB {
        if (blah)
            napi_disable();
        ...
    }
    funcC;
}

The scenario you mentioned above seems to have napi already enabled
and scheduled, such that napi_schedule_prep() would set NAPI_STATE_MISSED.
The two scenarios are different, per my understanding. Is there a way
that your scenario can eventually lead into mine?
Let me know if I understand you correctly.

Maybe testing the NAPI_STATE_MISSED bit is not needed,
because that bit is not reliable.

Lijun


* Re: [PATCH net v2] net: core: make napi_disable more robust
  2021-04-14  8:08 [PATCH net v2] net: core: make napi_disable more robust Lijun Pan
  2021-04-14  8:45 ` Yunsheng Lin
@ 2021-04-14 23:21 ` Jakub Kicinski
  2021-04-15  6:46   ` Eric Dumazet
  2021-04-15  6:47 ` Eric Dumazet
  2 siblings, 1 reply; 6+ messages in thread
From: Jakub Kicinski @ 2021-04-14 23:21 UTC (permalink / raw)
  To: Lijun Pan; +Cc: netdev

On Wed, 14 Apr 2021 03:08:45 -0500 Lijun Pan wrote:
> There are cases where napi_disable can be called twice by a NIC driver.
> This can deadlock. For example,
> the first napi_disable will spin until NAPI_STATE_SCHED is cleared
> by napi_complete_done, then set it again.
> When napi_disable is called the second time, it will loop forever
> because no napi->poll will be running to clear NAPI_STATE_SCHED.
> 
> Though it is the driver writer's responsibility to make sure it is
> called only once, making napi_disable more robust does not hurt, and
> it can prevent a buggy driver from crashing the system.
> So, we check the napi state bits to make sure that if napi is already
> disabled, we exit the call early enough to avoid spinning forever.

You've already been told by Eric & Dave to fix the driver instead.

Your check is _not_ correct - SCHED && NPSVC && !MISSED && !BUSY_POLL 
can well arise without disabling the NAPI.
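
One way to get there: netpoll. A rough sketch of its service path
(paraphrased from net/core/netpoll.c, not verbatim):

	static void poll_one_napi(struct napi_struct *napi)
	{
		/* NPSVC is set for the duration of netpoll servicing.
		 * If the NAPI happens to be scheduled at this moment,
		 * SCHED is set as well, and MISSED / PREFER_BUSY_POLL
		 * may both be clear - matching your "already disabled"
		 * signature without any napi_disable() call.
		 */
		if (test_and_set_bit(NAPI_STATE_NPSVC, &napi->state))
			return;

		napi->poll(napi, 0);	/* budget 0: Tx cleanup only */

		clear_bit(NAPI_STATE_NPSVC, &napi->state);
	}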

But regardless, a driver bug should be relatively easy to identify with
the task getting stuck in napi_disable(). We don't provide "protection"
for taking spin locks or ref counts twice either. Unless you can show
a strong use case, please stop posting new versions of this patch.


* Re: [PATCH net v2] net: core: make napi_disable more robust
  2021-04-14 23:21 ` Jakub Kicinski
@ 2021-04-15  6:46   ` Eric Dumazet
  0 siblings, 0 replies; 6+ messages in thread
From: Eric Dumazet @ 2021-04-15  6:46 UTC (permalink / raw)
  To: Jakub Kicinski, Lijun Pan; +Cc: netdev



On 4/15/21 1:21 AM, Jakub Kicinski wrote:
> On Wed, 14 Apr 2021 03:08:45 -0500 Lijun Pan wrote:
>> There are cases where napi_disable can be called twice by a NIC driver.
>> This can deadlock. For example,
>> the first napi_disable will spin until NAPI_STATE_SCHED is cleared
>> by napi_complete_done, then set it again.
>> When napi_disable is called the second time, it will loop forever
>> because no napi->poll will be running to clear NAPI_STATE_SCHED.
>>
>> Though it is the driver writer's responsibility to make sure it is
>> called only once, making napi_disable more robust does not hurt, and
>> it can prevent a buggy driver from crashing the system.
>> So, we check the napi state bits to make sure that if napi is already
>> disabled, we exit the call early enough to avoid spinning forever.
> 
> You've already been told by Eric & Dave to fix the driver instead.
> 
> Your check is _not_ correct - SCHED && NPSVC && !MISSED && !BUSY_POLL 
> can well arise without disabling the NAPI.
> 
> But regardless, a driver bug should be relatively easy to identify with
> the task getting stuck in napi_disable(). We don't provide "protection"
> for taking spin locks or ref counts twice either. Unless you can show
> a strong use case, please stop posting new versions of this patch.
> 

+222

I notice this v2 does not even mention which driver has the issue.

I suspect an out-of-tree driver.



* Re: [PATCH net v2] net: core: make napi_disable more robust
  2021-04-14  8:08 [PATCH net v2] net: core: make napi_disable more robust Lijun Pan
  2021-04-14  8:45 ` Yunsheng Lin
  2021-04-14 23:21 ` Jakub Kicinski
@ 2021-04-15  6:47 ` Eric Dumazet
  2 siblings, 0 replies; 6+ messages in thread
From: Eric Dumazet @ 2021-04-15  6:47 UTC (permalink / raw)
  To: Lijun Pan, netdev



On 4/14/21 10:08 AM, Lijun Pan wrote:
> There are cases where napi_disable can be called twice by a NIC driver.
> This can deadlock. For example,
> the first napi_disable will spin until NAPI_STATE_SCHED is cleared
> by napi_complete_done, then set it again.
> When napi_disable is called the second time, it will loop forever
> because no napi->poll will be running to clear NAPI_STATE_SCHED.
> 
> Though it is the driver writer's responsibility to make sure it is
> called only once, making napi_disable more robust does not hurt, and
> it can prevent a buggy driver from crashing the system.

This is hard to digest. A buggy driver has plenty of ways to crash the system.

If you need help to fix the buggy driver, please ask for help.



