From: Ryo Tsuruta <ryov@valinux.co.jp>
To: vgoyal@redhat.com
Cc: dm-devel@redhat.com, agk@redhat.com,
	linux-kernel@vger.kernel.org,
	containers@lists.linux-foundation.org, nauman@google.com,
	dpshah@google.com, lizf@cn.fujitsu.com, mikew@google.com,
	fchecconi@gmail.com, paolo.valente@unimore.it,
	jens.axboe@oracle.com, fernando@intellilink.co.jp,
	s-uchida@ap.jp.nec.com, taka@valinux.co.jp,
	guijianfeng@cn.fujitsu.com, arozansk@redhat.com,
	jmoyer@redhat.com, riel@redhat.com, peterz@infradead.org,
	menage@google.com, balbir@linux.vnet.ibm.com,
	dhaval@linux.vnet.ibm.com, chrisw@redhat.com
Subject: 2-Level IO scheduling (Re: [dm-devel] [PATCH 1/2] dm-ioband: I/O bandwidth controller v1.10.0: Source code and patch)
Date: Thu, 29 Jan 2009 12:36:44 +0900 (JST)
Message-ID: <20090129.123644.28802208.ryov@valinux.co.jp>
In-Reply-To: <20090126162951.GI31802@redhat.com>

Hi Vivek,

I split this mail thread into three topics:
  o 2-Level IO scheduling
  o Hierarchical grouping facility for IO controller
  o Implement IO controller as a dm-driver

This mail is about 2-Level IO scheduling.

> Just because device mapper framework allows one to implement IO controller
> in a separate module, we should not implement it there. It will be
> difficult to take care of issues like, configuration, breaking underlying IO
> scheduler's assumptions, capability to treat tasks and groups at same level
> etc.

If you are satisfied with low-accuracy bandwidth control by an IO
scheduler, you don't need to use dm-ioband. If you want to use
dm-ioband with an IO scheduler, it can work with any type of IO
scheduler, including the one you are developing.

> > > - Suppose there is one task of io priority 0 in a cgroup and the rest of
> > >   the tasks are of io prio 7, all in the best effort class. If the tasks of
> > >   lower priority (7) do a lot of IO, then due to buffering there is a chance
> > >   that IO from the lower prio tasks is seen by CFQ first and IO from the
> > >   higher prio task is not seen by CFQ for quite some time, so that task does
> > >   not get its fair share within the cgroup. A similar situation can arise
> > >   with RT tasks also.
> > 
> > Whether using dm-ioband or not, if the tasks of IO priority 7 do a lot
> > of IO, then the device queue is going to be full and tasks which try
> > to issue IOs are blocked until the queue gets a slot. The IOs are
> > backlogged even if they are issued from the task of IO priority 0.
> > I don't understand why you think it's the biggest issue. The same
> > thing is going to happen without dm-ioband.
> > 
> 
> True, even limited availability of request descriptors can be a
> bottleneck and can lead to the same kind of issues, but my contention is
> that you are aggravating the problem. Putting a 2nd layer on top can break
> the IO scheduler's assumptions even before the underlying request queue is full.

I don't think so. dm-ioband does not break the IO scheduler's assumptions.
In CFQ's case, the priority order is not changed within a cgroup.

> So second level solution on top will increase the frequency of such
> incidents where a lower priority task can run away with more job done than
> high priority task because there are no separate queues for different
> priority tasks and release of buffered bio is FIFO.
> 
> Secondly what happens to tasks of RT class? dm-ioband does not have any
> notion of handling the RT cgroup or RT tasks.

It's not an issue; it's a question of how to decide on a policy.
I think giving priority to the cgroup policy over the I/O scheduler
policy is more flexible.
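
To make the FIFO concern quoted above concrete, here is a minimal toy
model in Python. It is only a sketch and is not taken from dm-ioband or
CFQ: the function name dispatch_order and the batch parameter are
hypothetical. batch stands in for any limit on how many bios the
priority scheduler can see at once, whether that limit comes from a
buffering layer on top or from a full request queue.

```python
from collections import deque

def dispatch_order(bios, batch=None):
    """Return the order in which bios are dispatched.

    bios:  list of (task, ioprio) in submission order; a lower ioprio
           value is served first.
    batch: None means the scheduler sees every bio at once; otherwise
           bios reach the scheduler in FIFO batches of this size.
    """
    pending = deque(bios)
    order = []
    while pending:
        n = len(pending) if batch is None else min(batch, len(pending))
        visible = [pending.popleft() for _ in range(n)]
        # sorted() is stable, so equal priorities keep submission order.
        order.extend(sorted(visible, key=lambda bio: bio[1]))
    return order

# Eight bios from prio-7 tasks are submitted before one bio from a prio-0 task.
bios = [(f"low{i}", 7) for i in range(8)] + [("high", 0)]

direct = dispatch_order(bios)            # scheduler sees everything
layered = dispatch_order(bios, batch=4)  # scheduler sees 4 bios at a time

print(direct.index(("high", 0)))   # 0
print(layered.index(("high", 0)))  # 8
```

In this model the prio-0 bio is delayed whenever it is submitted behind
more lower priority bios than the scheduler can see at once. The model
does not say which layer imposes that limit.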

> Thirdly, doing any kind of resource control at higher level takes away the
> capability to treat task and groups at same level. I have had this
> discussion in other offline thread also where you are copied. I think
> it is a good idea to treat tasks and groups at same level where possible
> (depends if IO scheduler creates separate queues for tasks or not, cfq
> does.) 
> 
> > If I were you, I would create two cgroups, put the lower priority tasks
> > in one and the higher priority tasks in the other, and give higher
> > bandwidth to the cgroup that holds the higher priority tasks. What do
> > you think about this approach?
> 
> I think this is not practical. What we are saying is that task
> priority does not have any meaning. If we want a service difference between
> two tasks, we need to pack them in separate cgroups, otherwise we can't
> guarantee things. If we need to pack every task in a separate cgroup, then
> why even have the notion of task priority?

It is possible to modify dm-ioband to cooperate with CFQ, but I'm not
sure it's really meaningful. What do you do when a task of RT class
issues a lot of I/O? Do you always give priority to the I/Os from the
RT class task regardless of the assigned bandwidth? Which one do you
give priority to: the assigned bandwidth or the RT class?
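
The two possible answers to that question can be sketched as two
dispatch policies in Python. This is only an illustration under stated
assumptions: split_slots, the group names A and B, and the weights are
hypothetical, and the code is not how dm-ioband or CFQ is implemented.
Weights are assumed to be positive integers.

```python
def split_slots(slots, demand, weights, rt_group=None):
    """Split `slots` dispatch slots between groups.

    demand, weights: dicts keyed by group name (weights must be > 0).
    rt_group: a group whose IO is RT class and is always dispatched
              first, or None to honour the assigned weights only.
    """
    served = dict.fromkeys(demand, 0)
    left = dict(demand)
    if rt_group is not None:
        # Class-first policy: RT IO is dispatched before anything else.
        n = min(slots, left[rt_group])
        served[rt_group] += n
        left[rt_group] -= n
        slots -= n
    # Bandwidth policy: each round a group may use up to `weight` slots.
    while slots > 0 and any(left.values()):
        for group, weight in weights.items():
            n = min(weight, left[group], slots)
            served[group] += n
            left[group] -= n
            slots -= n
    return served

demand = {"A": 1000, "B": 1000}  # both groups issue a lot of I/O
weights = {"A": 4, "B": 1}       # B is assigned 20% of the bandwidth

print(split_slots(100, demand, weights))                # {'A': 80, 'B': 20}
print(split_slots(100, demand, weights, rt_group="B"))  # {'A': 0, 'B': 100}
```

With the weights honoured, group B stays at its assigned 20%. With the
RT class honoured first, a busy RT task in group B takes every slot and
the assigned bandwidth of group A has no effect.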

Thanks,
Ryo Tsuruta
