* net/sched: latent livelock in dev_deactivate_many() due to yield() usage
@ 2017-04-02  4:28 Mike Galbraith
  2017-04-04 22:39 ` Cong Wang
  2017-04-06  0:31 ` Stephen Hemminger
  0 siblings, 2 replies; 12+ messages in thread
From: Mike Galbraith @ 2017-04-02  4:28 UTC (permalink / raw)
  To: netdev; +Cc: LKML, Peter Zijlstra

Greetings network wizards,

Quoting kernel/sched/core.c:
/**
 * yield - yield the current processor to other threads.
 *
 * Do not ever use this function, there's a 99% chance you're doing it wrong.
 *
 * The scheduler is at all times free to pick the calling task as the most
 * eligible task to run, if removing the yield() call from your code breaks
 * it, it's already broken.
 *
 * Typical broken usage is:
 *
 * while (!event)
 *	yield();
 *
 * where one assumes that yield() will let 'the other' process run that will
 * make event true. If the current task is a SCHED_FIFO task that will never
 * happen. Never use yield() as a progress guarantee!!
 *
 * If you want to use yield() to wait for something, use wait_event().
 * If you want to use yield() to be 'nice' for others, use cond_resched().
 * If you still want to use yield(), do not!
 */

Livelock can be triggered by setting kworkers to SCHED_FIFO, then
suspend/resume... you come back from sleepy-land with a spinning
kworker.  For whatever reason, I can only reproduce it with an
enterprise-like config; my standard config refuses to play.  But no
matter, it's "Typical broken usage".

(yield() should be rendered dead)

	-Mike

* Re: net/sched: latent livelock in dev_deactivate_many() due to yield() usage
  2017-04-02  4:28 net/sched: latent livelock in dev_deactivate_many() due to yield() usage Mike Galbraith
@ 2017-04-04 22:39 ` Cong Wang
  2017-04-05  3:20   ` Mike Galbraith
  2017-04-06  0:31 ` Stephen Hemminger
  1 sibling, 1 reply; 12+ messages in thread
From: Cong Wang @ 2017-04-04 22:39 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: netdev, LKML, Peter Zijlstra

On Sat, Apr 1, 2017 at 9:28 PM, Mike Galbraith <efault@gmx.de> wrote:
> Greetings network wizards,
>
> Quoting kernel/sched/core.c:
> /**
>  * yield - yield the current processor to other threads.
>  *
>  * Do not ever use this function, there's a 99% chance you're doing it wrong.
>  *
>  * The scheduler is at all times free to pick the calling task as the most
>  * eligible task to run, if removing the yield() call from your code breaks
>  * it, its already broken.
>  *
>  * Typical broken usage is:
>  *
>  * while (!event)
>  *      yield();
>  *
>  * where one assumes that yield() will let 'the other' process run that will
>  * make event true. If the current task is a SCHED_FIFO task that will never
>  * happen. Never use yield() as a progress guarantee!!
>  *
>  * If you want to use yield() to wait for something, use wait_event().
>  * If you want to use yield() to be 'nice' for others, use cond_resched().
>  * If you still want to use yield(), do not!
>  */
>
> Livelock can be triggered by setting kworkers to SCHED_FIFO, then
> suspend/resume.. you come back from sleepy-land with a spinning
> kworker.  For whatever reason, I can only do that with an enterprise
> like config, my standard config refuses to play, but no matter, it's
> "Typical broken usage".
>
> (yield() should be rendered dead)


Thanks for the report! Looks like a quick solution here is to replace
this yield() with cond_resched(); it is harder to really wait for
all qdiscs to transmit all packets.

* Re: net/sched: latent livelock in dev_deactivate_many() due to yield() usage
  2017-04-04 22:39 ` Cong Wang
@ 2017-04-05  3:20   ` Mike Galbraith
  2017-04-05  5:25     ` Cong Wang
  0 siblings, 1 reply; 12+ messages in thread
From: Mike Galbraith @ 2017-04-05  3:20 UTC (permalink / raw)
  To: Cong Wang; +Cc: netdev, LKML, Peter Zijlstra

On Tue, 2017-04-04 at 15:39 -0700, Cong Wang wrote:

> Thanks for the report! Looks like a quick solution here is to replace
> this yield() with cond_resched(), it is harder to really wait for
> all qdisc's to transmit all packets.

No, cond_resched() won't help.  What I did is below, but I suspect net
wizards will do something better.

---
 net/sched/sch_generic.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -16,6 +16,7 @@
 #include <linux/types.h>
 #include <linux/kernel.h>
 #include <linux/sched.h>
+#include <linux/swait.h>
 #include <linux/string.h>
 #include <linux/errno.h>
 #include <linux/netdevice.h>
@@ -901,6 +902,7 @@ static bool some_qdisc_is_busy(struct ne
  */
 void dev_deactivate_many(struct list_head *head)
 {
+	DECLARE_SWAIT_QUEUE_HEAD_ONSTACK(swait);
 	struct net_device *dev;
 	bool sync_needed = false;
 
@@ -924,8 +926,7 @@ void dev_deactivate_many(struct list_hea
 
 	/* Wait for outstanding qdisc_run calls. */
 	list_for_each_entry(dev, head, close_list)
-		while (some_qdisc_is_busy(dev))
-			yield();
+		swait_event_timeout(swait, !some_qdisc_is_busy(dev), 1);
 }
 
 void dev_deactivate(struct net_device *dev)

* Re: net/sched: latent livelock in dev_deactivate_many() due to yield() usage
  2017-04-05  3:20   ` Mike Galbraith
@ 2017-04-05  5:25     ` Cong Wang
  2017-04-05  6:12       ` Mike Galbraith
  2017-04-06 10:26       ` Peter Zijlstra
  0 siblings, 2 replies; 12+ messages in thread
From: Cong Wang @ 2017-04-05  5:25 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: netdev, LKML, Peter Zijlstra

On Tue, Apr 4, 2017 at 8:20 PM, Mike Galbraith <efault@gmx.de> wrote:
> -               while (some_qdisc_is_busy(dev))
> -                       yield();
> +               swait_event_timeout(swait, !some_qdisc_is_busy(dev), 1);
>  }

I don't see why this is an improvement, even if I don't care about the
hardcoded timeout for now... Why can the scheduler make a better
decision with swait_event_timeout() than with cond_resched()?

* Re: net/sched: latent livelock in dev_deactivate_many() due to yield() usage
  2017-04-05  5:25     ` Cong Wang
@ 2017-04-05  6:12       ` Mike Galbraith
  2017-04-05 23:55         ` Cong Wang
  2017-04-06 10:26       ` Peter Zijlstra
  1 sibling, 1 reply; 12+ messages in thread
From: Mike Galbraith @ 2017-04-05  6:12 UTC (permalink / raw)
  To: Cong Wang; +Cc: netdev, LKML, Peter Zijlstra

On Tue, 2017-04-04 at 22:25 -0700, Cong Wang wrote:
> On Tue, Apr 4, 2017 at 8:20 PM, Mike Galbraith <efault@gmx.de> wrote:
> > -               while (some_qdisc_is_busy(dev))
> > -                       yield();
> > +               swait_event_timeout(swait,
> > !some_qdisc_is_busy(dev), 1);
> >  }
> 
> I don't see why this is an improvement even if I don't care about the
> hardcoded timeout for now... Why the scheduler can make a better
> decision with swait_event_timeout() than with cond_resched()?

Because sleeping gets you out of the way?  There is no other decision
the scheduler can make while a SCHED_FIFO task is trying to yield when
it is the one and only task at its priority.  The scheduler is doing
exactly what it is supposed to do; the problem is that people calling
yield() tend to think it does something it does not do, which is why it
is decorated with "if you think you want yield(), think again".

Yes, yield() semantics suck rocks; they basically don't exist.  Hop in
your time machine and slap whoever you find claiming responsibility :)

	-Mike

* Re: net/sched: latent livelock in dev_deactivate_many() due to yield() usage
  2017-04-05  6:12       ` Mike Galbraith
@ 2017-04-05 23:55         ` Cong Wang
  2017-04-06  1:08           ` Mike Galbraith
  0 siblings, 1 reply; 12+ messages in thread
From: Cong Wang @ 2017-04-05 23:55 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: netdev, LKML, Peter Zijlstra

On Tue, Apr 4, 2017 at 11:12 PM, Mike Galbraith <efault@gmx.de> wrote:
> On Tue, 2017-04-04 at 22:25 -0700, Cong Wang wrote:
>> On Tue, Apr 4, 2017 at 8:20 PM, Mike Galbraith <efault@gmx.de> wrote:
>> > -               while (some_qdisc_is_busy(dev))
>> > -                       yield();
>> > +               swait_event_timeout(swait,
>> > !some_qdisc_is_busy(dev), 1);
>> >  }
>>
>> I don't see why this is an improvement even if I don't care about the
>> hardcoded timeout for now... Why the scheduler can make a better
>> decision with swait_event_timeout() than with cond_resched()?
>
> Because sleeping gets you out of the way?  There is no other decision
> the scheduler can make while a SCHED_FIFO task is trying to yield when
> it is the one and only task at it's priority.  The scheduler is doing
> exactly what it is supposed to do, problem is people calling yield()
> tend to think it does something it does not do, which is why it is
> decorated with "if you think you want yield(), think again"
>
> Yes, yield semantics suck rocks, basically don't exist.  Hop in your
> time machine and slap whoever you find claiming responsibility :)

I am not trying to defend yield(), I am trying to understand when
cond_resched() is a right solution to replace yield() and when it is
not.  For me, the dev_deactivate_many() case is one where it is,
because I interpret "be nice" differently.

Thanks.

* Re: net/sched: latent livelock in dev_deactivate_many() due to yield() usage
  2017-04-02  4:28 net/sched: latent livelock in dev_deactivate_many() due to yield() usage Mike Galbraith
  2017-04-04 22:39 ` Cong Wang
@ 2017-04-06  0:31 ` Stephen Hemminger
  2017-04-06  1:28   ` Mike Galbraith
  2017-04-06 10:28   ` Peter Zijlstra
  1 sibling, 2 replies; 12+ messages in thread
From: Stephen Hemminger @ 2017-04-06  0:31 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: netdev, LKML, Peter Zijlstra

On Sun, 02 Apr 2017 06:28:41 +0200
Mike Galbraith <efault@gmx.de> wrote:

> Livelock can be triggered by setting kworkers to SCHED_FIFO, then
> suspend/resume.. you come back from sleepy-land with a spinning
> kworker.  For whatever reason, I can only do that with an enterprise
> like config, my standard config refuses to play, but no matter, it's
> "Typical broken usage".
> 
> (yield() should be rendered dead)

The kernel is not normally built to have kworkers run at SCHED_FIFO.
The user has to take some action to alter the process priorities.

I classify this as user error. We don't support killing kworker threads
either.

* Re: net/sched: latent livelock in dev_deactivate_many() due to yield() usage
  2017-04-05 23:55         ` Cong Wang
@ 2017-04-06  1:08           ` Mike Galbraith
  0 siblings, 0 replies; 12+ messages in thread
From: Mike Galbraith @ 2017-04-06  1:08 UTC (permalink / raw)
  To: Cong Wang; +Cc: netdev, LKML, Peter Zijlstra

On Wed, 2017-04-05 at 16:55 -0700, Cong Wang wrote:
> On Tue, Apr 4, 2017 at 11:12 PM, Mike Galbraith <efault@gmx.de> wrote:
> > On Tue, 2017-04-04 at 22:25 -0700, Cong Wang wrote:
> > > On Tue, Apr 4, 2017 at 8:20 PM, Mike Galbraith <efault@gmx.de> wrote:
> > > > -               while (some_qdisc_is_busy(dev))
> > > > -                       yield();
> > > > +               swait_event_timeout(swait,
> > > > !some_qdisc_is_busy(dev), 1);
> > > >  }
> > > 
> > > I don't see why this is an improvement even if I don't care about the
> > > hardcoded timeout for now... Why the scheduler can make a better
> > > decision with swait_event_timeout() than with cond_resched()?
> > 
> > Because sleeping gets you out of the way?  There is no other decision
> > the scheduler can make while a SCHED_FIFO task is trying to yield when
> > it is the one and only task at it's priority.  The scheduler is doing
> > exactly what it is supposed to do, problem is people calling yield()
> > tend to think it does something it does not do, which is why it is
> > decorated with "if you think you want yield(), think again"
> > 
> > Yes, yield semantics suck rocks, basically don't exist.  Hop in your
> > time machine and slap whoever you find claiming responsibility :)
> 
> I am not trying to defend for yield(), I am trying to understand when
> cond_resched() is not a right solution to replace yield() and when it is.
> For me, the dev_deactivate_many() case is, because I interpret
> "be nice" differently.

Yeah, I know you weren't defending it, just as I know that the net-fu
masters don't need that comment held close to their noses in order to
be able to read it... waving it about wasn't for their benefit ;-)

	-Mike

* Re: net/sched: latent livelock in dev_deactivate_many() due to yield() usage
  2017-04-06  0:31 ` Stephen Hemminger
@ 2017-04-06  1:28   ` Mike Galbraith
  2017-04-06 10:28   ` Peter Zijlstra
  1 sibling, 0 replies; 12+ messages in thread
From: Mike Galbraith @ 2017-04-06  1:28 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev, LKML, Peter Zijlstra

On Wed, 2017-04-05 at 17:31 -0700, Stephen Hemminger wrote:
> On Sun, 02 Apr 2017 06:28:41 +0200
> Mike Galbraith <efault@gmx.de> wrote:
> 
> > Livelock can be triggered by setting kworkers to SCHED_FIFO, then
> > suspend/resume.. you come back from sleepy-land with a spinning
> > kworker.  For whatever reason, I can only do that with an enterprise
> > like config, my standard config refuses to play, but no matter, it's
> > "Typical broken usage".
> > 
> > (yield() should be rendered dead)
> 
> The kernel is not normally built to have kworkers run at SCHED_FIFO.
> The user has do some action to alter the process priorities.
> 
> I classify this as user error. We don't support killing kworker threads
> either.

We'll have to agree to disagree on that.  I assert that any thread that
must run as SCHED_OTHER in order to be safe is in fact broken.

	-Mike

* Re: net/sched: latent livelock in dev_deactivate_many() due to yield() usage
  2017-04-05  5:25     ` Cong Wang
  2017-04-05  6:12       ` Mike Galbraith
@ 2017-04-06 10:26       ` Peter Zijlstra
  1 sibling, 0 replies; 12+ messages in thread
From: Peter Zijlstra @ 2017-04-06 10:26 UTC (permalink / raw)
  To: Cong Wang; +Cc: Mike Galbraith, netdev, LKML

On Tue, Apr 04, 2017 at 10:25:19PM -0700, Cong Wang wrote:
> On Tue, Apr 4, 2017 at 8:20 PM, Mike Galbraith <efault@gmx.de> wrote:
> > -               while (some_qdisc_is_busy(dev))
> > -                       yield();
> > +               swait_event_timeout(swait, !some_qdisc_is_busy(dev), 1);
> >  }
> 
> I don't see why this is an improvement even if I don't care about the
> hardcoded timeout for now... Why the scheduler can make a better
> decision with swait_event_timeout() than with cond_resched()?

cond_resched() might be a no-op.

And doing yield() will result in a priority-inversion livelock: imagine
the task doing yield() being the top-priority (fifo99) task in the
system.  It will simply spin forever, never giving the task that is
required to make your condition true a chance to run.

* Re: net/sched: latent livelock in dev_deactivate_many() due to yield() usage
  2017-04-06  0:31 ` Stephen Hemminger
  2017-04-06  1:28   ` Mike Galbraith
@ 2017-04-06 10:28   ` Peter Zijlstra
  2017-04-06 11:08     ` Peter Zijlstra
  1 sibling, 1 reply; 12+ messages in thread
From: Peter Zijlstra @ 2017-04-06 10:28 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Mike Galbraith, netdev, LKML

On Wed, Apr 05, 2017 at 05:31:05PM -0700, Stephen Hemminger wrote:
> On Sun, 02 Apr 2017 06:28:41 +0200
> Mike Galbraith <efault@gmx.de> wrote:
> 
> > Livelock can be triggered by setting kworkers to SCHED_FIFO, then
> > suspend/resume.. you come back from sleepy-land with a spinning
> > kworker.  For whatever reason, I can only do that with an enterprise
> > like config, my standard config refuses to play, but no matter, it's
> > "Typical broken usage".
> > 
> > (yield() should be rendered dead)
> 
> The kernel is not normally built to have kworkers run at SCHED_FIFO.
> The user has do some action to alter the process priorities.
> 
> I classify this as user error. We don't support killing kworker threads
> either.

PI can boost anybody to FIFO, all you need is to be holding a lock.

* Re: net/sched: latent livelock in dev_deactivate_many() due to yield() usage
  2017-04-06 10:28   ` Peter Zijlstra
@ 2017-04-06 11:08     ` Peter Zijlstra
  0 siblings, 0 replies; 12+ messages in thread
From: Peter Zijlstra @ 2017-04-06 11:08 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Mike Galbraith, netdev, LKML

On Thu, Apr 06, 2017 at 12:28:44PM +0200, Peter Zijlstra wrote:
> On Wed, Apr 05, 2017 at 05:31:05PM -0700, Stephen Hemminger wrote:
> > On Sun, 02 Apr 2017 06:28:41 +0200
> > Mike Galbraith <efault@gmx.de> wrote:
> > 
> > > Livelock can be triggered by setting kworkers to SCHED_FIFO, then
> > > suspend/resume.. you come back from sleepy-land with a spinning
> > > kworker.  For whatever reason, I can only do that with an enterprise
> > > like config, my standard config refuses to play, but no matter, it's
> > > "Typical broken usage".
> > > 
> > > (yield() should be rendered dead)
> > 
> > The kernel is not normally built to have kworkers run at SCHED_FIFO.
> > The user has do some action to alter the process priorities.
> > 
> > I classify this as user error. We don't support killing kworker threads
> > either.
> 
> PI can boost anybody to FIFO, all you need is to be holding a lock.

Note that this extends to rcu_read_lock(), which can cause boosting
under some cases.

Thread overview: 12 messages
2017-04-02  4:28 net/sched: latent livelock in dev_deactivate_many() due to yield() usage Mike Galbraith
2017-04-04 22:39 ` Cong Wang
2017-04-05  3:20   ` Mike Galbraith
2017-04-05  5:25     ` Cong Wang
2017-04-05  6:12       ` Mike Galbraith
2017-04-05 23:55         ` Cong Wang
2017-04-06  1:08           ` Mike Galbraith
2017-04-06 10:26       ` Peter Zijlstra
2017-04-06  0:31 ` Stephen Hemminger
2017-04-06  1:28   ` Mike Galbraith
2017-04-06 10:28   ` Peter Zijlstra
2017-04-06 11:08     ` Peter Zijlstra
