From: Vivek Goyal <vgoyal@redhat.com>
To: Chad Talbott <ctalbott@google.com>
Cc: jaxboe@fusionio.com, linux-kernel@vger.kernel.org,
	mrubin@google.com, teravest@google.com
Subject: Re: [PATCH 0/3] cfq-iosched: Fair cross-group preemption
Date: Tue, 22 Mar 2011 14:12:31 -0400
Message-ID: <20110322181231.GJ3757@redhat.com>
In-Reply-To: <AANLkTinTiEAFG1F1df380BiDtVFVr=nCsSqhM9__XdQ4@mail.gmail.com>

On Tue, Mar 22, 2011 at 10:39:36AM -0700, Chad Talbott wrote:
> On Tue, Mar 22, 2011 at 8:09 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
> > Why not simply implement RT class groups and always allow an RT
> > group to preempt a BE class? Same thing we do for cfq queues. I would
> > not worry too much about a runaway application consuming all the
> > bandwidth. If that's a concern we could use the blkio controller to
> > limit the IO rate of a latency sensitive application to make sure it
> > does not starve BE applications.
> 
> That is not quite the same semantics.  This limited preemption patch
> is still work-conserving.  If the RT task is the only task on the
> system with IO, it will be able to use all available disk time.
> 

It is not the same semantics, but it feels like too much special casing
for a single use case.

You are using the generic notion of an RT thread (which in general means
that it gets all the CPU or all the disk ahead of BE tasks). But you have
changed the definition of RT for this special use case. And now the
group RT definition is also different from the queue RT definition.

Why not have a similar mechanism for the CPU scheduler too, then? This
application should first be able to get CPU bandwidth in the same
predictable manner before it gets the disk bandwidth.

And I think your generation number patch should address this issue to
a great extent, shouldn't it? If a latency sensitive task is not using
its fair quota, it will get a lower vdisktime and get to dispatch soon.
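
To illustrate the vdisktime point, here is a toy sketch. It is not the
CFQ code: it assumes both groups are always backlogged, that the group
with the smallest vdisktime dispatches next, and that each dispatch is
charged as time used divided by weight. The group names, weights and
per-turn usage are made up for the example.

```python
import heapq

def simulate(groups, rounds):
    """groups maps name -> (weight, ms of disk time used per turn).

    Returns the order in which groups get to dispatch.
    """
    # Each heap entry is (vdisktime, name); smallest vdisktime goes next.
    heap = [(0.0, name) for name in sorted(groups)]
    heapq.heapify(heap)
    order = []
    for _ in range(rounds):
        vdisktime, name = heapq.heappop(heap)
        weight, used_ms = groups[name]
        order.append(name)
        # Charge only the time actually used, scaled inversely by weight.
        heapq.heappush(heap, (vdisktime + used_ms / weight, name))
    return order

# Hypothetical groups: "latency" uses 10 ms per turn, "bulk" uses 100 ms.
order = simulate({"latency": (1, 10), "bulk": (1, 100)}, rounds=12)
print(order)
```

In this model the light group is picked ten times for every turn the
heavy group gets, because its vdisktime advances ten times more slowly.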

If that is not soon enough, then we could operate with a reduced base
slice length so that we allocate smaller slices to groups and get better
IO latencies at the cost of total throughput.
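
A back-of-the-envelope sketch of that tradeoff, under simplifying
assumptions that are mine and not measured: equal-weight groups that are
all backlogged, each using its full slice, and a fixed cost paid on
every switch between groups (the 8 ms figure below is hypothetical).

```python
def worst_case_wait_ms(num_groups, slice_ms):
    # A group that just missed its turn waits for every other group's slice.
    return (num_groups - 1) * slice_ms

def useful_fraction(slice_ms, switch_cost_ms):
    # Share of disk time spent doing IO rather than switching between groups.
    return slice_ms / (slice_ms + switch_cost_ms)

for slice_ms in (100, 25):
    wait = worst_case_wait_ms(num_groups=4, slice_ms=slice_ms)
    frac = useful_fraction(slice_ms, switch_cost_ms=8)
    print(f"slice={slice_ms}ms worst wait={wait}ms useful={frac:.2f}")
```

Shrinking the slice cuts the worst-case wait proportionally, while the
fraction of disk time doing useful work drops.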

> > If RT starving BE is an issue, then it is an issue with plain cfq queue
> > also. First we shall have to fix it there.
> >
> > This definition that a latency sensitive task get prioritized only
> > till it is consuming its fair share and if task starts using more than
> > fair share then CFQ automatically stops prioritizing it sounds little
> > odd to me. If you are looking for predictability, then we lost it. We
> > shall have to very well know that task is not eating more than its
> > fair share before we can guarantee any kind of latencies to that task. And
> > if we know that task is not hogging the disk, there is anyway no risk
> > of it starving other groups/tasks completely.
> 
> In a shared environment, we have to be a little bit defensive.  We
> hope that a latency sensitive task is well characterized and won't
> exceed its share of the disk, and that we haven't over-committed the
> disk.  If the app does do more IO than expected, then we'd like them
> to bear the burden.  We have a choice of two outcomes.  A single job
> sometimes failing to achieve low disk latency when it's very busy.  Or
> all jobs on a disk sometimes being very slow when another (unrelated)
> job is very busy.  The first is easier to understand and debug.

To me, you are trying to come up with a new scheduling class which is
not RT, and you are overloading the meaning of RT for your use case.
That's the issue I have.

Coming up with a new scheduling class is also not desirable, as that
will demand another service tree and we already have too many. It
should probably also be done for tasks and not just groups, otherwise
extending this concept to a hierarchical setup will get complicated.
Queues and groups will just not gel well.

Frankly speaking, the problem you are having should be solved by your
generation number patch and by having smaller base slices.

Or you could put latency sensitive applications in an RT class and
then throttle them using the blkio controller. That way you get good
latencies and you don't starve other tasks.

But I don't think overloading the meaning of RT for this specific use
case is a good idea.

Thanks
Vivek
