From: Jens Axboe <axboe@kernel.dk>
To: Bart Van Assche <bvanassche@acm.org>,
Damien Le Moal <damien.lemoal@opensource.wdc.com>
Cc: linux-block@vger.kernel.org, Jaegeuk Kim <jaegeuk@kernel.org>,
Avri Altman <avri.altman@wdc.com>,
Damien Le Moal <damien.lemoal@wdc.com>
Subject: Re: [PATCH 4/8] block/mq-deadline: Only use zone locking if necessary
Date: Mon, 9 Jan 2023 18:03:45 -0700
Message-ID: <86eef990-0725-9669-6b7e-1fe935a6b648@kernel.dk>
In-Reply-To: <31d32f69-4c14-c9be-494f-7071112073f9@acm.org>

On 1/9/23 5:56 PM, Bart Van Assche wrote:
> On 1/9/23 16:48, Jens Axboe wrote:
>> On 1/9/23 5:44 PM, Bart Van Assche wrote:
>>> On 1/9/23 16:41, Jens Axboe wrote:
>>>> Or, probably better, a stacked scheduler where the bottom one can be zone
>>>> aware. Then we can get rid of littering the entire stack and IO schedulers
>>>> with silly blk_queue_pipeline_zoned_writes() or blk_is_zoned_write() etc.
>>>
>>> Hi Jens,
>>>
>>> Isn't one of Damien's viewpoints that an I/O scheduler should not do
>>> the reordering of write requests since reordering of write requests
>>> may involve waiting for write requests, write requests that will never
>>> be received if all tags have been allocated?
>>
>> It should be work conserving, it should not wait for anything. If
>> there are holes or gaps, then there's nothing the scheduler can do.
>>
>> My point is that the strict ordering was pretty hacky when it went in,
>> and rather than get better, it's proliferating. That's not a good
>> direction.
>
> Hi Jens,
>
> As you know one of the deeply embedded design choices in the blk-mq
> code is that reordering can happen at any time between submission of a
> request to the blk-mq code and request completion. I agree with that
> design choice.

Indeed. And getting rid of any ordering ops like barriers greatly
simplified things and fixed a number of issues related to that.

> For the use cases I'm looking at the sequential write required zone
> type works best. This zone type works best since it guarantees that
> data on the storage medium is sequential. This results in optimal
> sequential read performance.

That's a given.

> Combining these two approaches is not ideal and I agree that the
> combination of these two approaches adds some complexity. Personally I
> prefer to add a limited amount of complexity rather than implementing
> a new block layer from scratch.

I'm not talking about a new block layer at all; ordered devices are not
nearly important enough to warrant that kind of attention. Nor would it
be a good solution even if they were. I'm merely saying that I'm getting
more and more disgruntled with the direction that is being taken to
cater to these kinds of devices, and perhaps a much better idea is to
contain that complexity in a separate scheduler (be it stacked or not).

Because I'm really not thrilled to see the addition of various "is this
device ordered" checks all over the place, and now we are getting "is this
device ordered AND pipelined". Do you see what I mean? It's making
things _worse_, not better, and we really should be making it better
rather than piling more stuff on top of it.

--
Jens Axboe
Thread overview: 37+ messages
2023-01-09 23:27 [PATCH 0/8] Enable zoned write pipelining for UFS devices Bart Van Assche
2023-01-09 23:27 ` [PATCH 1/8] block: Document blk_queue_zone_is_seq() and blk_rq_zone_is_seq() Bart Van Assche
2023-01-09 23:36 ` Damien Le Moal
2023-01-09 23:27 ` [PATCH 2/8] block: Introduce the blk_rq_is_seq_zone_write() function Bart Van Assche
2023-01-09 23:38 ` Damien Le Moal
2023-01-09 23:52 ` Bart Van Assche
2023-01-10 9:52 ` Niklas Cassel
2023-01-10 11:54 ` Damien Le Moal
2023-01-10 12:13 ` Niklas Cassel
2023-01-10 12:41 ` Damien Le Moal
2023-01-09 23:27 ` [PATCH 3/8] block: Introduce a request queue flag for pipelining zoned writes Bart Van Assche
2023-01-09 23:27 ` [PATCH 4/8] block/mq-deadline: Only use zone locking if necessary Bart Van Assche
2023-01-09 23:46 ` Damien Le Moal
2023-01-09 23:51 ` Bart Van Assche
2023-01-09 23:56 ` Damien Le Moal
2023-01-10 0:19 ` Bart Van Assche
2023-01-10 0:32 ` Damien Le Moal
2023-01-10 0:38 ` Jens Axboe
2023-01-10 0:41 ` Jens Axboe
2023-01-10 0:44 ` Bart Van Assche
2023-01-10 0:48 ` Jens Axboe
2023-01-10 0:56 ` Bart Van Assche
2023-01-10 1:03 ` Jens Axboe [this message]
2023-01-10 1:17 ` Bart Van Assche
2023-01-10 1:48 ` Jens Axboe
2023-01-10 2:24 ` Damien Le Moal
2023-01-10 3:00 ` Jens Axboe
2023-01-09 23:27 ` [PATCH 5/8] block/null_blk: Refactor null_queue_rq() Bart Van Assche
2023-01-09 23:27 ` [PATCH 6/8] block/null_blk: Add support for pipelining zoned writes Bart Van Assche
2023-01-09 23:27 ` [PATCH 7/8] scsi: Retry unaligned " Bart Van Assche
2023-01-09 23:51 ` Damien Le Moal
2023-01-09 23:55 ` Bart Van Assche
2023-01-09 23:27 ` [PATCH 8/8] scsi: ufs: Enable zoned write pipelining Bart Van Assche
2023-01-10 9:16 ` Avri Altman
2023-01-10 17:42 ` Bart Van Assche
2023-01-10 12:23 ` Bean Huo
2023-01-10 17:41 ` Bart Van Assche