From: Khalid Aziz <khalid.aziz@oracle.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Andi Kleen <andi@firstfloor.org>,
	One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>,
	"H. Peter Anvin" <hpa@zytor.com>, Ingo Molnar <mingo@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Oleg Nesterov <oleg@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [RFC] [PATCH] Pre-emption control for userspace
Date: Thu, 06 Mar 2014 09:32:11 -0700
Message-ID: <5318A30B.4020702@oracle.com>
In-Reply-To: <alpine.DEB.2.02.1403052301160.18573@ionos.tec.linutronix.de>

On 03/06/2014 04:14 AM, Thomas Gleixner wrote:
> We understand that you want to avoid preemption in the first place and
> not getting into the contention handling case.
>
> But, what you're trying to do is essentially creating an ABI which we
> have to support and maintain forever. And that definitely is worth a
> few serious questions.

Fair enough. I agree a new ABI should not be created lightly.

>
> Let's ignore the mm related issues for now as those can be solved. That's
> the least of my worries.
>
> Right now you are using this for a single use case with a well defined
> environment, where all related threads reside in the same scheduling
> class (FAIR). But that's one of a gazillion of use cases of Linux.
>

Creating a new ABI for a single use case or a special case is something 
I would argue against as well. I am with you on that. What I am saying 
is that databases and the JVM happen to be two real-world examples of a 
scenario where CFS can inadvertently cause a convoy problem for a 
well-designed critical section that represents a small portion of the 
overall execution, simply because of where in the current timeslice the 
critical section is hit. If others have come across further examples, I 
would love to hear them. If we can indeed say this is a very special 
case for an uncommon workload, I would completely agree with refusing 
to create a new ABI.

> If we allow you to special case your database workload then we have no
> argument why we should not do the same thing for realtime workloads
> where the SCHED_FAIR housekeeping thread can hold a lock shortly to
> access some important data in the SCHED_FIFO realtime computation
> thread. Of course the RT people want to avoid the lock contention as
> much as you do, just for different reasons.
>
> Add SCHED_EDF, cgroups and hierarchical scheduling to the picture and
> hell breaks loose.

Realtime and deadline scheduler policies are supposed to be higher 
priority than CFS. A thread running under CFS that can impact threads 
running with realtime policies is a bad thing, agreed? What I am 
proposing actually allows a thread running under CFS to get out of the 
way of threads running with realtime policies quicker. In your specific 
example, the SCHED_FAIR housekeeping thread gets a chance to get out of 
the SCHED_FIFO threads' way by giving its critical section a better 
chance to complete execution, before it causes a convoy problem and 
while its cache is still hot, using the exact same mechanism I am 
proposing. The logic is not onerous. A thread asks for amnesty from one 
context switch if and only if a rescheduling point happens in the 
middle of its timeslice. If no rescheduling point occurs during its 
critical section, the thread takes the request back and life goes on as 
if nothing had changed. If a rescheduling point does happen in the 
middle of the thread's critical section, the thread gets the amnesty, 
but it yields the processor as soon as it is done with the critical 
section. Any thread that does not play nice gets penalized the next 
time it asks for immunity (as hpa suggested).
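
For illustration only, the userspace side of that flow might look like 
the sketch below. The flag word, its bit layout, and how it is shared 
with the kernel are assumptions made up for this sketch, not the 
interface from the patch; the point is just the request/withdraw/yield 
protocol described above.

/* Hypothetical sketch of the request/withdraw/yield protocol. In the
 * real mechanism the flag would live somewhere the kernel can see
 * (assumed here); a plain static is enough to compile the sketch. */
#include <sched.h>
#include <stdatomic.h>

#define DELAY_REQUESTED	0x1u	/* thread asks to skip one preemption */
#define DELAY_GRANTED	0x2u	/* amnesty was used; a yield is owed */

static _Atomic unsigned int preempt_delay_flag;

static void run_critical_section(void (*work)(void))
{
	/* Ask for amnesty from one context switch before entering. */
	atomic_fetch_or(&preempt_delay_flag, DELAY_REQUESTED);

	work();		/* the short critical section itself */

	/*
	 * Withdraw the request. If a rescheduling point hit while we were
	 * inside (DELAY_GRANTED set), yield now so we do not keep the CPU
	 * beyond what we asked for.
	 */
	if (atomic_exchange(&preempt_delay_flag, 0) & DELAY_GRANTED)
		sched_yield();
}

A thread that takes the amnesty and then skips the sched_yield() call 
is exactly the misbehaving case that would be penalized on its next 
request.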

Thanks,
Khalid



Thread overview: 67+ messages
2014-03-03 18:07 [RFC] [PATCH] Pre-emption control for userspace Khalid Aziz
2014-03-03 21:51 ` Davidlohr Bueso
2014-03-03 23:29   ` Khalid Aziz
2014-03-04 13:56 ` Oleg Nesterov
2014-03-04 17:44   ` Khalid Aziz
2014-03-04 18:38     ` Al Viro
2014-03-04 19:01       ` Khalid Aziz
2014-03-04 19:03     ` Oleg Nesterov
2014-03-04 20:14       ` Khalid Aziz
2014-03-05 14:38         ` Oleg Nesterov
2014-03-05 16:12           ` Oleg Nesterov
2014-03-05 17:10             ` Khalid Aziz
2014-03-04 21:12 ` H. Peter Anvin
2014-03-04 21:39   ` Khalid Aziz
2014-03-04 22:23     ` One Thousand Gnomes
2014-03-04 22:44       ` Khalid Aziz
2014-03-05  0:39         ` Thomas Gleixner
2014-03-05  0:51           ` Andi Kleen
2014-03-05 11:10             ` Peter Zijlstra
2014-03-05 17:29               ` Khalid Aziz
2014-03-05 19:58               ` Khalid Aziz
2014-03-06  9:57                 ` Peter Zijlstra
2014-03-06 16:08                   ` Khalid Aziz
2014-03-06 11:14                 ` Thomas Gleixner
2014-03-06 16:32                   ` Khalid Aziz [this message]
2014-03-05 14:54             ` Oleg Nesterov
2014-03-05 15:56               ` Andi Kleen
2014-03-05 16:36                 ` Oleg Nesterov
2014-03-05 17:22                   ` Khalid Aziz
2014-03-05 23:13                     ` David Lang
2014-03-05 23:48                       ` Khalid Aziz
2014-03-05 23:56                         ` H. Peter Anvin
2014-03-06  0:02                           ` Khalid Aziz
2014-03-06  0:13                             ` H. Peter Anvin
2014-03-05 23:59                         ` David Lang
2014-03-06  0:17                           ` Khalid Aziz
2014-03-06  0:36                             ` David Lang
2014-03-06  1:22                               ` Khalid Aziz
2014-03-06 14:23                                 ` David Lang
2014-03-06 12:13             ` Kevin Easton
2014-03-06 13:59               ` Peter Zijlstra
2014-03-06 22:41                 ` Andi Kleen
2014-03-06 14:25               ` David Lang
2014-03-06 16:12                 ` Khalid Aziz
2014-03-06 13:24   ` Rasmus Villemoes
2014-03-06 13:34     ` Peter Zijlstra
2014-03-06 13:45       ` Rasmus Villemoes
2014-03-06 14:02         ` Peter Zijlstra
2014-03-06 14:33           ` Thomas Gleixner
2014-03-06 14:34             ` H. Peter Anvin
2014-03-06 14:04         ` Thomas Gleixner
2014-03-25 17:17 ` [PATCH v2] " Khalid Aziz
2014-03-25 17:44   ` Andrew Morton
2014-03-25 17:56     ` Khalid Aziz
2014-03-25 18:14       ` Andrew Morton
2014-03-25 17:46   ` Oleg Nesterov
2014-03-25 17:59     ` Khalid Aziz
2014-03-25 18:20   ` Andi Kleen
2014-03-25 18:47     ` Khalid Aziz
2014-03-25 19:47       ` Andi Kleen
2014-03-25 18:59   ` Eric W. Biederman
2014-03-25 19:15     ` Khalid Aziz
2014-03-25 20:31       ` Eric W. Biederman
2014-03-25 21:37         ` Khalid Aziz
2014-03-26  6:03     ` Mike Galbraith
2014-03-25 23:01 ` [RFC] [PATCH] " Davidlohr Bueso
2014-03-25 23:29   ` Khalid Aziz
