* [RFC] sched: low latency feedback to userspace
From: Maximilian Krüger @ 2016-05-09 10:18 UTC (permalink / raw)
To: Ingo Molnar, Peter Zijlstra, linux-kernel
I am planning to extend CFS as part of my master's thesis. I want to
allow user threads to decide whether to enter a critical section or
call sched_yield().
For tightly synchronized workloads it might be useful to start certain
short tasks only when they can still be completed within the current
time slice, without being interrupted by the scheduler.
Since low latency is key, my current plan is to use a shared mapped page
for signaling and to use a syscall only for the setup. I'd be curious
whether you would find this useful in general, and whether there is a
chance of getting it accepted upstream, provided my benchmarks show that
it gives the intended benefits.
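
To make the idea concrete, here is a rough sketch of what the userspace
side of such a check could look like. Everything in it is hypothetical:
the page layout, the remaining_ns field and both helper names are
assumptions made for illustration, and no kernel interface like this
exists. The anonymous shared mapping only stands in for a page that the
kernel would have to fill in after the setup syscall.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Hypothetical layout of the shared page; not an existing kernel ABI. */
struct slice_page {
	_Atomic uint64_t remaining_ns;	/* assumed: time left in the current slice */
};

/*
 * Stand-in for the setup step. A real implementation would obtain the
 * page from the kernel; here it is just an anonymous shared mapping.
 */
static struct slice_page *slice_page_map(void)
{
	void *p = mmap(NULL, sizeof(struct slice_page),
		       PROT_READ | PROT_WRITE,
		       MAP_SHARED | MAP_ANONYMOUS, -1, 0);

	return p == MAP_FAILED ? NULL : p;
}

/* Returns 1 if a task needing needed_ns would fit, 0 otherwise. No syscall. */
static int fits_in_slice(struct slice_page *pg, uint64_t needed_ns)
{
	uint64_t left = atomic_load_explicit(&pg->remaining_ns,
					     memory_order_relaxed);

	return left >= needed_ns;
}
```

A thread would call fits_in_slice() right before a short critical
section and back off if the answer is 0. The value is only a hint: it
can be stale by the time the caller acts on it.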
While these are very general questions, I have one specific question
that affects the chances of getting this upstream. Would you prefer
that I implement the setup as an additional syscall, or as an
extension of existing syscalls (probably sched_setattr, sched_getattr)?
greetings,
Maximilian Krüger
* Re: [RFC] sched: low latency feedback to userspace
From: Peter Zijlstra @ 2016-05-09 11:18 UTC (permalink / raw)
To: Maximilian Krüger; +Cc: Ingo Molnar, linux-kernel
On Mon, May 09, 2016 at 12:18:54PM +0200, Maximilian Krüger wrote:
> I am planning to extend the CFS as part of my master thesis. I want
> user-threads to allow to decide whether to enter a critical section or call
> sched_yield()
sched_yield() for anything other than SCHED_FIFO / SCHED_DEADLINE is a
'bug'. That is, calling sched_yield() outside of those two cases is
undefined behaviour and the kernel is free to eat your granny and set
your pet on fire.
> For tight synchronized workloads it might be useful, to only start certain
> short tasks, when they still can be completed in the current time slice
> without being interrupted by the scheduler.
> Since low latency is key, my current plan is to use a shared-mapped page for
> signaling and only use a syscall for the setup. I'd be curious, if you might
> find this useful in general and if there is a chance for getting this
> accepted upstream, given my benchmarks can prove gives the intended
> benefits.
Schemes like this have been proposed many times (Google might find them
for you in your favourite LKML archive) and shot down the same number of
times.
Such proposals always end up needing to define a 'limit', which is
artificial and subject to creep, also such soft preempt-disable or boost
schemes have very open ended semantics. They also become useless if
_everyone_ requests them at the same time -- something not unlikely
since every userspace program thinks it is the most important thing
under the sun.
Would something like:
http://lkml.kernel.org/r/20151027235635.16059.11630.stgit@pjt-glaptop.roam.corp.google.com
and
http://lkml.kernel.org/r/1459789313-4917-1-git-send-email-mathieu.desnoyers@efficios.com
work for what you want to achieve? If not; please explain in more detail
why you want what you propose.
* Re: [RFC] sched: low latency feedback to userspace
From: Maximilian Krüger @ 2016-05-09 13:03 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: Ingo Molnar, linux-kernel
> sched_yield() for anything other than SCHED_FIFO / SCHED_DEADLINE is a
> 'bug'. That is, calling sched_yield() outside of those two cases is
> undefined behaviour and the kernel is free to eat your granny and set
> your pet on fire.
Okay. fair.c/yield_task_fair() does not exactly sound as if it would
set my granny on fire or eat my pet, nor does man 2 sched_yield, but
correct me if I'm wrong.
> Schemes like this have been proposed many times (Google might find them
> for you in your favourite LKML archive) and shot down the same number of
> times.
>
> Such proposals always end up needing to define a 'limit', which is
> artificial and subject to creep, also such soft preempt-disable or boost
> schemes have very open ended semantics. They also become useless if
> _everyone_ requests them at the same time -- something not unlikely
> since every userspace program thinks it is the most important thing
> under the sun.
I'm totally with you on this point, which is why I will under no
circumstances prevent preemption or sacrifice scheduler fairness so
that these tasks can run longer uninterrupted, even though this is
still up for discussion. That is also why I would argue that tasks
would gain nothing from using this just because they think they are
more important than anyone else.
>
>
> Would something like:
>
> http://lkml.kernel.org/r/20151027235635.16059.11630.stgit@pjt-glaptop.roam.corp.google.com
>
> and
>
> http://lkml.kernel.org/r/1459789313-4917-1-git-send-email-mathieu.desnoyers@efficios.com
>
> work for what you want to achieve? If not; please explain in more detail
> why you want what you propose.
The second patchset actually looks useful to me, but I very much agree
with the comments that this thing looks overly complicated. So yes, the
interface I have in mind will be similar on an abstract level, but I
don't intend any generic interface or anything close to this complexity.
thanks for the quick feedback,
Max
* Re: [RFC] sched: low latency feedback to userspace
From: Peter Zijlstra @ 2016-05-09 13:42 UTC (permalink / raw)
To: Maximilian Krüger; +Cc: Ingo Molnar, linux-kernel
On Mon, May 09, 2016 at 03:03:59PM +0200, Maximilian Krüger wrote:
>
> >sched_yield() for anything other than SCHED_FIFO / SCHED_DEADLINE is a
> >'bug'. That is, calling sched_yield() outside of those two cases is
> >undefined behaviour and the kernel is free to eat your granny and set your
> >pet on fire.
>
> okay, fair.c/yield_task_fair() does not exactly sound, as if it would set my
> granny on fire or eat my pet, nor does man 2 yield, but correct me if I'm
> wrong.
So barring the co-operative multitasking model (which is still somewhat
employed in places), sched_yield() is basically undefined except for
FIFO / DEADLINE.
Now, every OS does 'something', and we do too. But that something could
really include anything from nothing to setting your pet on fire.
That said; there is a lot of crufty code out there that calls
sched_yield() for all sorts of reasons, but mostly bad ones.
I have seen code like:
	while (!event)
		sched_yield();
to wait for completion of 'event'. Now imagine what would happen if this
code ends up running at a real-time priority?
And, no, you really cannot say that will never happen as an application
developer, because the application user might start you with a different
scheduling class (for whatever reason), or the system might have
unconditional priority inheritance and you get caught in a boost chain,
or whatever.
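
For comparison, a blocking wait does not have this problem. Under
SCHED_FIFO, sched_yield() only gives way to runnable tasks of the same
priority, so if the task that sets 'event' has a lower priority, the
loop above keeps the CPU and the event may never be set on that CPU. A
minimal sketch of the blocking alternative, assuming POSIX threads; the
completion_flag names and run_demo() are made up for illustration and
are not taken from any existing code:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper type: an event a thread can sleep on. */
struct completion_flag {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

static void completion_flag_init(struct completion_flag *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->cond, NULL);
	c->done = false;
}

/* Sleep until the event is signalled; no spinning, no sched_yield(). */
static void completion_flag_wait(struct completion_flag *c)
{
	pthread_mutex_lock(&c->lock);
	while (!c->done)	/* loop guards against spurious wakeups */
		pthread_cond_wait(&c->cond, &c->lock);
	pthread_mutex_unlock(&c->lock);
}

/* Mark the event as complete and wake every waiter. */
static void completion_flag_signal(struct completion_flag *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = true;
	pthread_cond_broadcast(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

static void *producer(void *arg)
{
	completion_flag_signal(arg);
	return NULL;
}

/* Returns 1 if the waiter observed the event, 0 on failure. */
static int run_demo(void)
{
	struct completion_flag c;
	pthread_t t;

	completion_flag_init(&c);
	if (pthread_create(&t, NULL, producer, &c) != 0)
		return 0;
	completion_flag_wait(&c);
	pthread_join(t, NULL);
	return c.done ? 1 : 0;
}
```

The waiter is off the runqueue while it sleeps, so the producer gets to
run no matter which scheduling class or priority the waiter was started
with.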
The worst part is; we've had people reject patches when we fixed their
yield abuse for them :/