From: Nauman Rafique <nauman@google.com>
To: Vivek Goyal <vgoyal@redhat.com>
Cc: Corrado Zoccolo <czoccolo@gmail.com>,
	linux-kernel@vger.kernel.org, jens.axboe@oracle.com,
	dpshah@google.com, lizf@cn.fujitsu.com, ryov@valinux.co.jp,
	fernando@oss.ntt.co.jp, s-uchida@ap.jp.nec.com,
	taka@valinux.co.jp, guijianfeng@cn.fujitsu.com,
	jmoyer@redhat.com, balbir@linux.vnet.ibm.com,
	righi.andrea@gmail.com, m-ikeda@ds.jp.nec.com,
	akpm@linux-foundation.org, riel@redhat.com,
	kamezawa.hiroyu@jp.fujitsu.com
Subject: Re: [RFC] Workload type Vs Groups (Was: Re: [PATCH 02/20] blkio:  Change CFQ to use CFS like queue time stamps)
Date: Mon, 9 Nov 2009 09:33:47 -0800
Message-ID: <e98e18940911090933l33544686j64358573bb659592@mail.gmail.com>
In-Reply-To: <20091106222257.GB2969@redhat.com>

On Fri, Nov 6, 2009 at 2:22 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Wed, Nov 04, 2009 at 10:18:15PM +0100, Corrado Zoccolo wrote:
>> Hi Vivek,
>> On Wed, Nov 4, 2009 at 12:43 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
>> > o Previously CFQ had one service tree where queues of all three prio classes
>> >  were being queued. One side effect of this time stamping approach is that
>> >  now the single tree approach might not work and we need to keep separate service
>> >  trees for the three prio classes.
>> >
>> Single service tree is no longer true in cfq for-2.6.33.
>> Now we have a matrix of service trees, with first dimension being the
>> priority class, and second dimension being the workload type
>> (synchronous idle, synchronous no-idle, async).
>> You can have a look at the series: http://lkml.org/lkml/2009/10/26/482 .
>> It may have other interesting influences on your work, as the idle
>> introduced at the end of the synchronous no-idle tree, that provides
>> fairness also for seeky or high-think-time queues.
>>
>
> Hi All,
>
> I am now rebasing my patches to the for-2.6.33 branch. There are a significant
> number of changes in that branch; especially the changes from Corrado bring
> up an interesting question.
>
> Currently Corrado has introduced the functionality of grouping the
> cfq queues based on workload type and giving time slots to these sub
> groups (sync-idle, sync-noidle, async).
>
> I was thinking of placing groups on top of this model, so that we select
> the group first and then select the type of workload and then finally
> the queue to run.
>
> Corrado came up with an interesting suggestion (in a private mail): what
> if we implement workload type at the top and divide the share among
> groups with-in each workload type?
>
> So one would first select the workload to run and then select group
> with-in workload and then cfq queue with-in group.
>
> The advantages of this approach are:
>
> - for sync-noidle group, we will not idle per group. We will idle
>  only at root level. (Well if we don't idle on the group once it becomes
>  empty, we will not see fairness for group. So it will be fairness vs
>  throughput call).
>
> - It allows us to limit system wide share of workload type. So for
>  example, one can kind of fix system wide share of async queues.
>  Generally it might not be very prudent to allocate a group 50% of
>  disk share and then that group decides to just do async IO and sync
>  IO in rest of the groups suffer.
>
> Disadvantage
>
> - The definition of fairness becomes a bit murkier. Now fairness will be
>  achieved for a group with-in the workload type. So if a group is doing
>  IO of type sync-idle as well as sync-noidle and other group is doing
>  IO of type only sync-noidle, then first group will get overall more
>  disk time even if both the groups have same weight.
>
> Looking for some feedback about which approach makes more sense before I
> write patches.

At first look, the first option did make some sense. But isn't the
whole point of adding cgroups to support fairness, or isolation? If
we are adding cgroups support in a way that does not support
isolation, there is not much point to the whole effort.
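
To make the fairness concern from the quoted "Disadvantage" concrete, here
is a minimal arithmetic sketch of the two-group example given there. It is
only an illustration: the function names are hypothetical, and it assumes
that disk time is split evenly between the two active workload types, which
the original mail does not state.

```python
from fractions import Fraction

def workload_on_top(groups, workload_share):
    """Disk time per group when the workload type is selected first.

    groups: {name: (weight, set_of_workload_types_the_group_uses)}
    workload_share: {workload_type: fraction_of_disk_time}  (assumed split)
    """
    totals = {name: Fraction(0) for name in groups}
    for wl, share in workload_share.items():
        # Only groups that actually issue this type of IO compete for it.
        active = {n: w for n, (w, types) in groups.items() if wl in types}
        total_weight = sum(active.values())
        for n, w in active.items():
            totals[n] += share * Fraction(w, total_weight)
    return totals

def group_on_top(groups):
    """Disk time per group when the group is selected first."""
    total_weight = sum(w for w, _ in groups.values())
    return {n: Fraction(w, total_weight) for n, (w, _) in groups.items()}

# Two groups with equal weight, as in the quoted example.
groups = {
    "A": (100, {"sync-idle", "sync-noidle"}),
    "B": (100, {"sync-noidle"}),
}
# Assumption: the two active workload types each get half of the disk time.
shares = {"sync-idle": Fraction(1, 2), "sync-noidle": Fraction(1, 2)}

print(workload_on_top(groups, shares))  # A gets 3/4, B gets 1/4
print(group_on_top(groups))             # A and B each get 1/2
```

Under these assumptions, group A ends up with three quarters of the disk
time when workload type sits on top, even though both groups have the same
weight; with groups on top the split is even.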

The first approach seems to be directed towards keeping good overall
throughput. Fairness and isolation might always come with the
possibility of a loss in overall throughput. The assumption is that
once someone is using cgroups, overall system efficiency is a
concern secondary to the performance we are supporting for
each cgroup.

Also, the second approach is a cleaner design. For each cgroup, we will
need one data structure, instead of having 3, one for each workload
type. And all the new functionality should still live under a config
option; so if someone does not want cgroups, they can just turn them
off and we will be back to just one set of threads for each workload
type.
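
The data-structure point can be sketched as follows. This is a rough
illustration only: the class and function names are hypothetical and do not
correspond to actual CFQ structures. It just counts how many scheduling
entities each cgroup needs under the two layouts.

```python
from dataclasses import dataclass, field

WORKLOADS = ("sync-idle", "sync-noidle", "async")

@dataclass
class GroupEntity:
    """Groups on top: one entity per cgroup, holding all workload types."""
    name: str
    weight: int
    queues: dict = field(default_factory=lambda: {wl: [] for wl in WORKLOADS})

@dataclass
class PerWorkloadEntity:
    """Workload type on top: one entity per (cgroup, workload type) pair."""
    name: str
    weight: int
    workload: str
    queues: list = field(default_factory=list)

def build_groups_on_top(cgroups):
    # A single group tree; each cgroup appears exactly once.
    return [GroupEntity(name, weight) for name, weight in cgroups.items()]

def build_workload_on_top(cgroups):
    # One group tree per workload type; each cgroup appears in all three.
    return {
        wl: [PerWorkloadEntity(name, weight, wl) for name, weight in cgroups.items()]
        for wl in WORKLOADS
    }

cgroups = {"A": 100, "B": 100}
print(len(build_groups_on_top(cgroups)))                             # 2 entities
print(sum(len(v) for v in build_workload_on_top(cgroups).values()))  # 6 entities
```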

>
> Thanks
> Vivek
>

