* [RFC] sched: low latency feedback to userspace
From: Maximilian Krüger @ 2016-05-09 10:18 UTC
  To: Ingo Molnar, Peter Zijlstra, linux-kernel

I am planning to extend CFS as part of my master's thesis. I want to
allow user threads to decide whether to enter a critical section or
call sched_yield().
For tightly synchronized workloads it might be useful to start certain
short tasks only when they can still be completed in the current time
slice without being interrupted by the scheduler.
Since low latency is key, my current plan is to use a shared mapped page
for signaling and to use a syscall only for the setup. I'd be curious
whether you find this useful in general and whether there is a chance of
getting this accepted upstream, provided my benchmarks can show that it
gives the intended benefits.
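
The user-space side of such a scheme might look roughly like the sketch
below. It rests entirely on assumptions: no such interface exists in the
kernel, and struct slice_feedback, its slice_end_ns field, and
fits_in_slice() are hypothetical names invented for illustration. The
thread only reads a value that the kernel is assumed to publish in the
shared page and decides whether a short task still fits.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * HYPOTHETICAL layout of the page shared with the kernel. Nothing like
 * this exists upstream; the field name and its meaning are assumptions.
 */
struct slice_feedback {
	/* Assumed: monotonic time (ns) at which the current slice ends. */
	_Atomic uint64_t slice_end_ns;
};

/*
 * Return true if a task expected to take cost_ns could still finish
 * before the slice ends, judged from the value currently in the page.
 */
static bool fits_in_slice(struct slice_feedback *fb, uint64_t now_ns,
			  uint64_t cost_ns)
{
	uint64_t end;

	assert(fb != NULL);
	end = atomic_load_explicit(&fb->slice_end_ns, memory_order_acquire);
	return end > now_ns && end - now_ns >= cost_ns;
}
```

The result is only a hint: the thread can still be preempted at any
time, so a critical section entered after a positive answer must remain
correct if it is interrupted.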

While these are very general questions, I also have one detailed question
concerning the chances of getting this upstream: would you prefer that I
implement the setup as an additional syscall, or as an extension of
existing syscalls (probably sched_setattr, sched_getattr)?

Greetings,
Maximilian Krüger
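
For context on the "extension of existing syscalls" option: struct
sched_attr starts with a size field, which is how the interface
distinguishes structure versions. The sketch below reads the calling
thread's attributes through the raw syscall, on the assumption that a
libc wrapper may not be available. struct sched_attr_v0 and
read_sched_attr() are local names chosen for this example; the layout
mirrors the original 48-byte version of the kernel structure.

```c
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Local copy of the first (48-byte) version of struct sched_attr. */
struct sched_attr_v0 {
	uint32_t size;		/* size of this structure, set by the caller */
	uint32_t sched_policy;	/* SCHED_OTHER, SCHED_FIFO, ... */
	uint64_t sched_flags;
	int32_t  sched_nice;	/* used by SCHED_OTHER and SCHED_BATCH */
	uint32_t sched_priority;	/* used by SCHED_FIFO and SCHED_RR */
	uint64_t sched_runtime;	/* the next three: SCHED_DEADLINE */
	uint64_t sched_deadline;
	uint64_t sched_period;
};

/* Fill *attr for the calling thread; returns 0, or -1 with errno set. */
static int read_sched_attr(struct sched_attr_v0 *attr)
{
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	/* Arguments: pid 0 (calling thread), buffer, buffer size, flags 0. */
	return (int)syscall(SYS_sched_getattr, 0, attr,
			    (unsigned int)sizeof(*attr), 0U);
}
```

A setup step added this way would presumably append fields and grow the
size, which is an assumption about one possible design and not
something proposed in this thread.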

* Re: [RFC] sched: low latency feedback to userspace
From: Peter Zijlstra @ 2016-05-09 11:18 UTC
  To: Maximilian Krüger; +Cc: Ingo Molnar, linux-kernel

On Mon, May 09, 2016 at 12:18:54PM +0200, Maximilian Krüger wrote:
> I am planning to extend CFS as part of my master's thesis. I want to allow
> user threads to decide whether to enter a critical section or call
> sched_yield().

sched_yield() for anything other than SCHED_FIFO / SCHED_DEADLINE is a
'bug'. That is, calling sched_yield() outside of those two cases is
undefined behaviour and the kernel is free to eat your granny and set
your pet on fire.
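
A minimal sketch of how a program could check that rule at run time
before relying on sched_yield(). The helper names yield_is_defined() and
current_policy_allows_yield() are made up for this example; the set of
policies is taken directly from the statement above and is not an
official kernel guarantee.

```c
#include <sched.h>
#include <stdbool.h>

/* Fallbacks for libc headers that hide these without _GNU_SOURCE. */
#ifndef SCHED_DEADLINE
#define SCHED_DEADLINE 6
#endif
#ifndef SCHED_RESET_ON_FORK
#define SCHED_RESET_ON_FORK 0x40000000
#endif

/* True only for the two policies named above: FIFO and DEADLINE. */
static bool yield_is_defined(int policy)
{
	/* sched_getscheduler() may report this flag ORed into the policy. */
	policy &= ~SCHED_RESET_ON_FORK;
	return policy == SCHED_FIFO || policy == SCHED_DEADLINE;
}

/* Apply the check to the calling thread; false if the query fails. */
static bool current_policy_allows_yield(void)
{
	int policy = sched_getscheduler(0);

	return policy >= 0 && yield_is_defined(policy);
}
```

A thread running under the default policy gets false here, which is
the case the original proposal would have to deal with.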

> For tightly synchronized workloads it might be useful to start certain
> short tasks only when they can still be completed in the current time slice
> without being interrupted by the scheduler.
> Since low latency is key, my current plan is to use a shared mapped page for
> signaling and to use a syscall only for the setup. I'd be curious whether you
> find this useful in general and whether there is a chance of getting this
> accepted upstream, provided my benchmarks can show that it gives the intended
> benefits.

Schemes like this have been proposed many times (Google might find them
for you in your favourite LKML archive) and shot down the same number of
times.

Such proposals always end up needing to define a 'limit', which is
artificial and subject to creep, also such soft preempt-disable or boost
schemes have very open ended semantics. They also become useless if
_everyone_ requests them at the same time -- something not unlikely
since every userspace program thinks it is the most important thing
under the sun.


Would something like:

  http://lkml.kernel.org/r/20151027235635.16059.11630.stgit@pjt-glaptop.roam.corp.google.com

and

  http://lkml.kernel.org/r/1459789313-4917-1-git-send-email-mathieu.desnoyers@efficios.com

work for what you want to achieve? If not; please explain in more detail
why you want what you propose.

* Re: [RFC] sched: low latency feedback to userspace
From: Maximilian Krüger @ 2016-05-09 13:03 UTC
  To: Peter Zijlstra; +Cc: Ingo Molnar, linux-kernel


> sched_yield() for anything other than SCHED_FIFO / SCHED_DEADLINE is a 
> 'bug'. That is, calling sched_yield() outside of those two cases is 
> undefined behaviour and the kernel is free to eat your granny and set 
> your pet on fire. 

Okay, but yield_task_fair() in fair.c does not exactly look as if it
would set my granny on fire or eat my pet, and neither does
man 2 sched_yield; correct me if I'm wrong.

> Schemes like this have been proposed many times (Google might find them
> for you in your favourite LKML archive) and shot down the same number of
> times.
>
> Such proposals always end up needing to define a 'limit', which is
> artificial and subject to creep, also such soft preempt-disable or boost
> schemes have very open ended semantics. They also become useless if
> _everyone_ requests them at the same time -- something not unlikely
> since every userspace program thinks it is the most important thing
> under the sun.

I'm totally with you on this point, which is why I will under no
circumstances prevent preemption or sacrifice scheduler fairness so
that these tasks can run longer uninterrupted, even though this is
still up for discussion. That is why I would argue that tasks gain
nothing from using this merely because they consider themselves more
important than everyone else.

>
>
> Would something like:
>
>    http://lkml.kernel.org/r/20151027235635.16059.11630.stgit@pjt-glaptop.roam.corp.google.com
>
> and
>
>    http://lkml.kernel.org/r/1459789313-4917-1-git-send-email-mathieu.desnoyers@efficios.com
>
> work for what you want to achieve? If not; please explain in more detail
> why you want what you propose.

The second patchset actually looks useful to me, but I very much agree
with the comments that it looks overly complicated. So yes, the
interface I have in mind will be similar on an abstract level, but I
don't intend a generic interface or anything close to this complexity.

Thanks for the quick feedback,
Max

* Re: [RFC] sched: low latency feedback to userspace
From: Peter Zijlstra @ 2016-05-09 13:42 UTC
  To: Maximilian Krüger; +Cc: Ingo Molnar, linux-kernel

On Mon, May 09, 2016 at 03:03:59PM +0200, Maximilian Krüger wrote:
> 
> >sched_yield() for anything other than SCHED_FIFO / SCHED_DEADLINE is a
> >'bug'. That is, calling sched_yield() outside of those two cases is
> >undefined behaviour and the kernel is free to eat your granny and set your
> >pet on fire.
> 
> Okay, but yield_task_fair() in fair.c does not exactly look as if it would
> set my granny on fire or eat my pet, and neither does man 2 sched_yield;
> correct me if I'm wrong.

So barring the co-operative multitasking model (which is still somewhat
employed in places), sched_yield() is basically undefined except for
FIFO / DEADLINE.

Now, every OS does 'something', and we do too. But that something could
really include anything from nothing to setting pet on fire.

That said; there is a lot of crufty code out there that calls
sched_yield() for all sorts of reasons, but mostly bad ones.

I have seen code like:

	while (!event)
		sched_yield();

to wait for completion of 'event'. Now imagine what would happen if this
code ends up running at a real-time priority?

And, no, you really cannot say that will never happen as an application
developer, because the application user might start you with a different
scheduling class (for whatever reason), or the system might have
unconditional priority inheritance and you get caught in a boost chain,
or whatever.

The worst part is; we've had people reject patches when we fixed their
yield abuse for them :/
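
The yield loop quoted above is dangerous at a real-time priority
because, under SCHED_FIFO, sched_yield() only lets tasks of equal
priority run; if the thread that sets 'event' has a lower priority on
the same CPU, the loop can spin forever. One common alternative is to
block until the event is signalled. The sketch below uses a POSIX
mutex and condition variable; struct completion and the completion_*()
helpers are names chosen for this example and are not taken from any
code discussed in this thread.

```c
#include <pthread.h>
#include <stdbool.h>

/* A one-shot event that waiters sleep on instead of polling. */
struct completion {
	pthread_mutex_t lock;
	pthread_cond_t  cond;
	bool            done;
};

static void completion_init(struct completion *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->cond, NULL);
	c->done = false;
}

/* Sleep until completion_signal() has been called. */
static void completion_wait(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	while (!c->done)	/* loop guards against spurious wakeups */
		pthread_cond_wait(&c->cond, &c->lock);
	pthread_mutex_unlock(&c->lock);
}

/* Mark the event as complete and wake every waiter. */
static void completion_signal(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = true;
	pthread_cond_broadcast(&c->cond);
	pthread_mutex_unlock(&c->lock);
}
```

The waiter is taken off the CPU while it sleeps, so the signalling
thread can run regardless of the waiter's scheduling class.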
