From: Sagi Grimberg <sagi@grimberg.me>
To: Ming Lei <ming.lei@redhat.com>, Rachit Agarwal <rach4x0r@gmail.com>
Cc: Jens Axboe <axboe@kernel.dk>, Qizhe Cai <qc228@cornell.edu>,
	Rachit Agarwal <ragarwal@cornell.edu>,
	linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org,
	linux-block@vger.kernel.org,
	Midhul Vuppalapati <mvv25@cornell.edu>,
	Jaehyun Hwang <jaehyun.hwang@cornell.edu>,
	Rachit Agarwal <ragarwal@cs.cornell.edu>,
	Keith Busch <kbusch@kernel.org>,
	Sagi Grimberg <sagi@lightbitslabs.com>,
	Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH] iosched: Add i10 I/O Scheduler
Date: Fri, 13 Nov 2020 12:58:10 -0800
Message-ID: <44d5bcb0-689e-50c8-fa8e-a7d2b569f75c@grimberg.me>
In-Reply-To: <20201113145912.GA1074955@T590>


> blk-mq actually has built-in batching(or sort of) mechanism, which is enabled
> if the hw queue is busy(hctx->dispatch_busy is > 0). We use EWMA to compute
> hctx->dispatch_busy, and it is adaptive, even though the implementation is quite
> coarse. But there should be much space to improve, IMO.

You are correct; however, nvme-tcp should be getting to
dispatch_busy > 0, IIUC.
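
For readers unfamiliar with the EWMA-based dispatch_busy estimate mentioned in the quoted text, here is a minimal userspace sketch of that kind of update rule. The function name and the two constants are assumptions made for illustration; they are modeled on the blk-mq approach but should be checked against block/blk-mq.c rather than taken as the kernel's exact code.

```c
#include <stdbool.h>

/* Assumed constants for this sketch; verify against block/blk-mq.c. */
#define EWMA_WEIGHT 8   /* old estimate keeps (WEIGHT - 1) / WEIGHT of its value */
#define EWMA_FACTOR 4   /* a busy dispatch adds 1 << FACTOR before averaging */

/*
 * Return the new busy estimate, given the previous estimate and whether
 * the latest dispatch attempt found the hardware queue busy.
 */
static unsigned int ewma_update(unsigned int ewma, bool busy)
{
	/* Idle queue that stays idle: nothing to update. */
	if (!ewma && !busy)
		return 0;

	ewma *= EWMA_WEIGHT - 1;
	if (busy)
		ewma += 1u << EWMA_FACTOR;
	return ewma / EWMA_WEIGHT;
}
```

With these constants, a single busy dispatch moves the estimate from 0 to 2, and two consecutive non-busy dispatches bring it back to 0. A driver that never reports a busy queue therefore leaves the estimate at 0, and batching that keys off a nonzero estimate never engages.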

> It is reported that this way improves SQ high-end SCSI SSD very much[1],
> and MMC performance gets improved too[2].
> 
> [1] https://lore.kernel.org/linux-block/3cc3e03901dc1a63ef32e036182521af@mail.gmail.com/
> [2] https://lore.kernel.org/linux-block/CADBw62o9eTQDJ9RvNgEqSpXmg6Xcq=2TxH0Hfxhp29uF2W=TXA@mail.gmail.com/

Yes, the guys paid attention to the MMC related improvements that you
made.

>> The i10 I/O scheduler builds upon recent work on [6]. We have tested the i10 I/O
>> scheduler with nvme-tcp optimizations [2,3] and batching dispatch [4], varying number
>> of cores, varying read/write ratios, and varying request sizes, and with NVMe SSD and
>> RAM block device. For NVMe SSDs, the i10 I/O scheduler achieves ~60% improvements in
>> terms of IOPS per core over "noop" I/O scheduler. These results are available at [5],
>> and many additional results are presented in [6].
> 
> In case of none scheduler, basically nvme driver won't provide any queue busy
> feedback, so the built-in batching dispatch doesn't work simply.

Exactly.

> kyber scheduler uses io latency feedback to throttle and build io batch,
> can you compare i10 with kyber on nvme/nvme-tcp?

I assume that should be simple to get; I'll let Rachit/Jaehyun comment.

>> While other schedulers may also batch I/O (e.g., mq-deadline), the optimization target
>> in the i10 I/O scheduler is throughput maximization. Hence there is no latency target
>> nor a need for a global tracking context, so a new scheduler is needed rather than
>> to build this functionality to an existing scheduler.
>>
>> We currently use fixed default values as batching thresholds (e.g., 16 for #requests,
>> 64KB for #bytes, and 50us for timeout). These default values are based on sensitivity
>> tests in [6]. For our future work, we plan to support adaptive batching according to
> 
> Frankly speaking, hardcode 16 #requests or 64KB may not work everywhere,
> and product environment could be much complicated than your sensitivity
> tests. If possible, please start with adaptive batching.

That was my feedback as well, for sure. But given that this is a
scheduler one would opt in to anyway, adaptive batching won't be a
must-have initially. I'm not sure if the guys have made progress with
this yet; I'll let them comment.
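
To make the fixed thresholds under discussion concrete, here is a minimal userspace sketch of a batch that is dispatched once it reaches 16 requests, 64KB, or 50us, whichever comes first. All names are hypothetical and this is not the code from the i10 patch; in particular, the sketch checks the timeout by polling a caller-supplied clock, whereas a real scheduler would more likely arm a timer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; default values are those quoted from the patch description. */
#define BATCH_MAX_REQS   16u
#define BATCH_MAX_BYTES  (64u * 1024u)
#define BATCH_TIMEOUT_US 50u

struct batch_state {
	unsigned int nr_reqs;   /* requests queued in the current batch */
	unsigned int nr_bytes;  /* total payload bytes queued */
	uint64_t start_us;      /* time the first request entered the batch */
};

/* Add one request; the first request of a batch starts the timeout window. */
static void batch_add(struct batch_state *b, unsigned int bytes, uint64_t now_us)
{
	if (b->nr_reqs == 0)
		b->start_us = now_us;
	b->nr_reqs++;
	b->nr_bytes += bytes;
}

/* True once any of the three thresholds is reached; an empty batch never dispatches. */
static bool batch_should_dispatch(const struct batch_state *b, uint64_t now_us)
{
	if (b->nr_reqs == 0)
		return false;
	return b->nr_reqs >= BATCH_MAX_REQS ||
	       b->nr_bytes >= BATCH_MAX_BYTES ||
	       now_us - b->start_us >= BATCH_TIMEOUT_US;
}
```

An adaptive variant would replace the three compile-time constants with per-queue values updated from observed load, which is the direction the review comment above asks for.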
