Linux-Block Archive on lore.kernel.org
 help / color / Atom feed
From: Tejun Heo <tj@kernel.org>
To: Alexei Starovoitov <ast@fb.com>
Cc: Quentin Monnet <quentin.monnet@netronome.com>,
	"axboe@kernel.dk" <axboe@kernel.dk>, Andy Newell <newella@fb.com>,
	Chris Mason <clm@fb.com>,
	"josef@toxicpanda.com" <josef@toxicpanda.com>,
	Dennis Zhou <dennisz@fb.com>,
	"lizefan@huawei.com" <lizefan@huawei.com>,
	"hannes@cmpxchg.org" <hannes@cmpxchg.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	Kernel Team <Kernel-team@fb.com>,
	"cgroups@vger.kernel.org" <cgroups@vger.kernel.org>,
	"ast@kernel.org" <ast@kernel.org>,
	"daniel@iogearbox.net" <daniel@iogearbox.net>,
	Martin Lau <kafai@fb.com>, Song Liu <songliubraving@fb.com>,
	Yonghong Song <yhs@fb.com>,
	"bpf@vger.kernel.org" <bpf@vger.kernel.org>
Subject: Re: [PATCH 10/10] blkcg: implement BPF_PROG_TYPE_IO_COST
Date: Fri, 14 Jun 2019 10:09:14 -0700
Message-ID: <20190614170914.GF538958@devbig004.ftw2.facebook.com> (raw)
In-Reply-To: <bed0a66a-7aa6-ac36-9182-31a4937257e5@fb.com>

Hello, Alexei.

On Fri, Jun 14, 2019 at 04:35:35PM +0000, Alexei Starovoitov wrote:
> the example bpf prog looks flexible enough to allow some degree
> of experiments. The question is what kind of new algorithms you envision
> it will do? what other inputs it would need to make a decision?
> I think it's ok to start with what it does now and extend further
> when need arises.

I'm not sure right now.  The linear model worked a lot better than I
originally expected and looks like it can cover most of the current
use cases.  It could easily be that we just haven't seen enough
different cases yet.

At one point, quadratic model was on the table in case the linear
model wasn't good enough.  Also, one area which may need improvements
could be factoring in r/w mixture into consideration.  Some SSDs'
performance nose-dive when r/w commands are mixed in certain
proportions.  Right now, we just deal with that by adjusting global
performance ratio (vrate) but I can imagine a model which considers
the issue history in the past X seconds of the cgroup and bumps the
overall cost according to r/w mixture.

> > * Is block ioctl the right mechanism to attach these programs?
> 
> imo ioctl is a bit weird, but since its only one program per block
> device it's probably ok? Unless you see it being cgroup scoped in
> the future? Then cgroup-bpf style hooks will be more suitable
> and allow a chain of programs.

As this is a device property, I think there should only be one program
per block device.

> > * Are there more parameters that need to be exposed to the programs?
> > 
> > * It'd be great to have efficient access to per-blockdev and
> >    per-blockdev-cgroup-pair storages available to these programs so
> >    that they can keep track of history.  What'd be the best of way of
> >    doing that considering the fact that these programs will be called
> >    per each IO and the overhead can add up quickly?
> 
> Martin's socket local storage solved that issue for sockets.
> Something very similar can work for per-blockdev-per-cgroup.

Cool, that sounds great in case we need to develop this further.  Andy
had this self-learning model which didn't need any external input and
could tune itself solely based on device saturation state.  If the
prog can remember states cheaply, it'd be pretty cool to experiment
with things like that in bpf.

Thanks.

-- 
tejun

  reply index

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-14  1:56 [PATCHSET block/for-next] IO cost model based work-conserving porportional controller Tejun Heo
2019-06-14  1:56 ` [PATCH 01/10] blkcg: pass @q and @blkcg into blkcg_pol_alloc_pd_fn() Tejun Heo
2019-06-14  1:56 ` [PATCH 02/10] blkcg: make ->cpd_init_fn() optional Tejun Heo
2019-06-14  1:56 ` [PATCH 03/10] blkcg: separate blkcg_conf_get_disk() out of blkg_conf_prep() Tejun Heo
2019-06-14  1:56 ` [PATCH 04/10] block/rq_qos: add rq_qos_merge() Tejun Heo
2019-06-14  1:56 ` [PATCH 05/10] block/rq_qos: implement rq_qos_ops->queue_depth_changed() Tejun Heo
2019-06-14  1:56 ` [PATCH 06/10] blkcg: s/RQ_QOS_CGROUP/RQ_QOS_LATENCY/ Tejun Heo
2019-06-14  1:56 ` [PATCH 07/10] blk-mq: add optional request->pre_start_time_ns Tejun Heo
2019-06-14  1:56 ` [PATCH 08/10] blkcg: implement blk-ioweight Tejun Heo
2019-06-14 12:17   ` Toke Høiland-Jørgensen
2019-06-14 15:09     ` Tejun Heo
2019-06-14 20:50       ` Toke Høiland-Jørgensen
2019-06-15 15:57         ` Tejun Heo
2019-06-14  1:56 ` [PATCH 09/10] blkcg: add tools/cgroup/monitor_ioweight.py Tejun Heo
2019-06-14  1:56 ` [PATCH 10/10] blkcg: implement BPF_PROG_TYPE_IO_COST Tejun Heo
2019-06-14 11:32   ` Quentin Monnet
2019-06-14 14:52     ` Tejun Heo
2019-06-14 16:35       ` Alexei Starovoitov
2019-06-14 17:09         ` Tejun Heo [this message]
2019-06-14 17:56 ` [PATCHSET block/for-next] IO cost model based work-conserving porportional controller Tejun Heo

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190614170914.GF538958@devbig004.ftw2.facebook.com \
    --to=tj@kernel.org \
    --cc=Kernel-team@fb.com \
    --cc=ast@fb.com \
    --cc=ast@kernel.org \
    --cc=axboe@kernel.dk \
    --cc=bpf@vger.kernel.org \
    --cc=cgroups@vger.kernel.org \
    --cc=clm@fb.com \
    --cc=daniel@iogearbox.net \
    --cc=dennisz@fb.com \
    --cc=hannes@cmpxchg.org \
    --cc=josef@toxicpanda.com \
    --cc=kafai@fb.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lizefan@huawei.com \
    --cc=newella@fb.com \
    --cc=quentin.monnet@netronome.com \
    --cc=songliubraving@fb.com \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-Block Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-block/0 linux-block/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-block linux-block/ https://lore.kernel.org/linux-block \
		linux-block@vger.kernel.org linux-block@archiver.kernel.org
	public-inbox-index linux-block


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-block


AGPL code for this site: git clone https://public-inbox.org/ public-inbox