From: Weiping Zhang <zwp10758@gmail.com>
To: Tejun Heo <tj@kernel.org>
Cc: Keith Busch <kbusch@kernel.org>, Jens Axboe <axboe@kernel.dk>,
	Christoph Hellwig <hch@lst.de>,
	Bart Van Assche <bvanassche@acm.org>,
	Minwoo Im <minwoo.im.dev@gmail.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ming Lei <ming.lei@redhat.com>,
	"Nadolski, Edmund" <edmund.nadolski@intel.com>,
	linux-block@vger.kernel.org, cgroups@vger.kernel.org,
	linux-nvme@lists.infradead.org
Subject: Re: [PATCH v5 0/4] Add support Weighted Round Robin for blkcg and nvme
Date: Tue, 31 Mar 2020 23:47:41 +0800
Message-ID: <CAA70yB51=VQrL+2wC+DL8cYmGVACb2_w5UHc4XFn7MgZjUJaeg@mail.gmail.com>
In-Reply-To: <20200331143635.GS162390@mtj.duckdns.org>

On Tue, Mar 31, 2020 at 10:36 PM, Tejun Heo <tj@kernel.org> wrote:
>
> Hello, Weiping.
>
> On Tue, Mar 31, 2020 at 02:17:06PM +0800, Weiping Zhang wrote:
> > Recently I did some cgroup io weight testing:
> > https://github.com/dublio/iotrack/wiki/cgroup-io-weight-test
> > I think a proper io weight policy should consider a high-weight cgroup's
> > iops and latency and also take the whole disk's throughput into account;
> > that is to say, the policy should trade off more carefully between a
> > cgroup's IO performance and the whole disk's throughput. I know one policy
> > cannot do all things perfectly, but from the test results nvme-wrr works well.
>
> That's w/o iocost QoS targets configured, right? iocost should be able to
> achieve similar results as wrr with QoS configured.
>
Yes, I have not set the QoS target.
> > From the following test results, nvme-wrr works well for both the cgroup's
> > latency and iops and the whole disk's throughput.
>
> As I wrote before, the issues I see with wrr are the following:
>
> * Hardware dependent. Some will work ok or even fantastic. Many others will do
>   horribly.
>
> * Lack of configuration granularity. We can't configure it granular enough to
>   serve hierarchical configuration.
>
> * Likely not a huge problem with the deep QD of nvmes but lack of queue depth
>   control can lead to loss of latency control and thus loss of protection for
>   low concurrency workloads when pitched against workloads which can saturate
>   QD.
>
> All that said, given the feature is available, I don't see any reason not to
> allow its use, but I don't think it fits the cgroup interface model given the
> hardware dependency and coarse granularity. For these cases, I think the right
> thing to do is to use cgroups to provide tagging information - i.e. build a
> dedicated interface which takes a cgroup fd or ino as the tag and associate
> configurations that way. There already are other use cases which use cgroups
> this way (e.g. perf).
>
Do you mean dropping the "io.wrr" / "blkio.wrr" files from cgroup, and using a
dedicated interface like /dev/xxx or /proc/xxx instead?

I see the perf code does:

	/* perf resolves the cgroup from a directory fd passed in by userspace */
	struct fd f = fdget(fd);
	struct cgroup_subsys_state *css =
		css_tryget_online_from_dir(f.file->f_path.dentry,
					   &perf_event_cgrp_subsys);

It looks like the same approach can be applied to the block cgroup.
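A rough, untested sketch of how that could look for the io controller, assuming
a dedicated configuration entry point that takes a cgroup directory fd:
css_tryget_online_from_dir(), fdget()/fdput(), css_put(), io_cgrp_subsys and
css_to_blkcg() are existing kernel symbols, while wrr_set_weight_by_cgroup_fd()
and nvme_wrr_set_weight() are made-up placeholders for the new interface and
the per-policy hook:

static int wrr_set_weight_by_cgroup_fd(int fd, unsigned int weight)
{
	struct cgroup_subsys_state *css;
	struct fd f = fdget(fd);
	int ret;

	if (!f.file)
		return -EBADF;

	/* resolve the io (blkcg) css from the cgroup directory dentry */
	css = css_tryget_online_from_dir(f.file->f_path.dentry,
					 &io_cgrp_subsys);
	if (IS_ERR(css)) {
		ret = PTR_ERR(css);
		goto out;
	}

	/* nvme_wrr_set_weight() stands in for the real weight-setting hook */
	ret = nvme_wrr_set_weight(css_to_blkcg(css), weight);

	css_put(css);
out:
	fdput(f);
	return ret;
}

That would keep the weight configuration out of the cgroupfs files while still
using the cgroup itself as the tag, as you suggested.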

Thanks for your help.

Thread overview: 46+ messages
2020-02-04  3:30 [PATCH v5 0/4] Add support Weighted Round Robin for blkcg and nvme Weiping Zhang
2020-02-04  3:31 ` [PATCH v5 1/4] block: add weighted round robin for blkcgroup Weiping Zhang
2020-02-04  3:31 ` [PATCH v5 2/4] nvme: add get_ams for nvme_ctrl_ops Weiping Zhang
2020-02-04  3:31 ` [PATCH v5 3/4] nvme-pci: rename module parameter write_queues to read_queues Weiping Zhang
2020-02-04  3:31 ` [PATCH v5 4/4] nvme: add support weighted round robin queue Weiping Zhang
2020-02-04 15:42 ` [PATCH v5 0/4] Add support Weighted Round Robin for blkcg and nvme Keith Busch
2020-02-16  8:09   ` Weiping Zhang
2020-03-31  6:17     ` Weiping Zhang
2020-03-31 10:29       ` Paolo Valente
2020-03-31 14:36       ` Tejun Heo
2020-03-31 15:47         ` Weiping Zhang [this message]
2020-03-31 15:51           ` Tejun Heo
2020-03-31 15:52             ` Christoph Hellwig
2020-03-31 15:54               ` Tejun Heo
2020-03-31 16:31               ` Weiping Zhang
2020-03-31 16:33                 ` Christoph Hellwig
2020-03-31 16:52                   ` Weiping Zhang
