From: Damien Le Moal <damien.lemoal@opensource.wdc.com>
To: Bart Van Assche <bvanassche@acm.org>, Khazhy Kumykov <khazhy@google.com>
Cc: Jens Axboe <axboe@kernel.dk>,
	linux-block@vger.kernel.org, Christoph Hellwig <hch@lst.de>,
	Damien Le Moal <damien.lemoal@wdc.com>,
	"Martin K . Petersen" <martin.petersen@oracle.com>,
	Jaegeuk Kim <jaegeuk@kernel.org>
Subject: Re: [PATCH 2/5] scsi: Retry unaligned zoned writes
Date: Wed, 15 Jun 2022 08:50:20 +0900
Message-ID: <c8e55085-4b7d-ef11-a22a-39ac71e63227@opensource.wdc.com>
In-Reply-To: <186833db-bb36-e3c3-5670-ac8ff0b2906b@acm.org>

On 6/15/22 07:39, Bart Van Assche wrote:
> On 6/14/22 14:47, Khazhy Kumykov wrote:
>> On Tue, Jun 14, 2022 at 10:49 AM Bart Van Assche <bvanassche@acm.org> wrote:
>>>
>>>  From ZBC-2: "The device server terminates with CHECK CONDITION status, with
>>> the sense key set to ILLEGAL REQUEST, and the additional sense code set to
>>> UNALIGNED WRITE COMMAND a write command, other than an entire medium write
>>> same command, that specifies: a) the starting LBA in a sequential write
>>> required zone set to a value that is not equal to the write pointer for that
>>> sequential write required zone; or b) an ending LBA that is not equal to the
>>> last logical block within a physical block (see SBC-5)."
>>>
>>> I am not aware of any other conditions that may trigger the UNALIGNED
>>> WRITE COMMAND response.
>>>
>>> Retry unaligned writes in preparation of removing zone locking.
>> Is /just/ retrying effective here? A series of writes to the same zone
>> would all need to be sent in order - in the worst case (requests
>> somehow ordered in reverse order) this becomes quadratic as only 1
>> request "succeeds" out of the N outstanding requests, with the rest
>> all needing to retry. (Imagine a user writes an entire "zone" - which
>> could be split into hundreds of requests).
>>
>> Block layer / schedulers are free to do this reordering, which I
>> understand does happen whenever we need to requeue - and would result
>> in a retry of all writes after the first re-ordered request. (side
>> note: fwiw "requests somehow in reverse order" can happen - bfq
>> inherited cfq's odd behavior of sometimes issuing sequential IO in
>> reverse order due to back_seek, e.g.)
> 
> Hi Khazhy,
> 
> For zoned block devices I propose to only support those I/O schedulers 
> that either preserve the LBA order or fix the LBA order if two or more 
> out-of-order requests are received by the I/O scheduler.

We tried that "fix" in the work for zoned btrfs. It does not work. Even
adding a delay to wait for out-of-order requests (when there is a hole in
a write sequence) does not work reliably, because file systems may
sometimes take tens of seconds to issue all of the write requests that
could be ordered into a single sequential write stream. Even with that
delay increased to minutes, we were still seeing unaligned write errors.
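
To illustrate why a fixed wait cannot be made reliable, here is a minimal
sketch in Python. It is not the code used in the zoned btrfs work; the
function name, the single-block writes, and the arrival times are all
assumptions made up for the example. A held write counts as an error when
the hole below it is still open once the delay has expired.

```python
def count_timed_out_writes(arrival_time_by_lba, max_delay):
    """Count writes that would be dispatched while a hole is still open.

    arrival_time_by_lba[k] is the (hypothetical) time in seconds at which
    the single-block write for LBA k reaches the scheduler. A write can
    only be dispatched aligned once every lower LBA has arrived, so it
    waits until the latest arrival among LBAs 0..k.
    """
    errors = 0
    latest_arrival = float("-inf")
    for arrival in arrival_time_by_lba:
        latest_arrival = max(latest_arrival, arrival)
        # The write has been held longer than the scheduler is willing to
        # wait: it is sent anyway and the device rejects it as unaligned.
        if latest_arrival - arrival > max_delay:
            errors += 1
    return errors


# LBA 0 shows up 200 seconds after LBAs 1..3 (made-up numbers).
arrivals = [200.0, 0.0, 0.1, 0.2]
short_wait_errors = count_timed_out_writes(arrivals, max_delay=1.0)
two_minute_errors = count_timed_out_writes(arrivals, max_delay=120.0)
long_wait_errors = count_timed_out_writes(arrivals, max_delay=300.0)
```

Whatever delay is picked, a submitter that is late by more than that delay
still produces unaligned writes, and every increase of the delay adds
latency for all the writes held behind the hole.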

> 
> I agree that in the worst case the number of retries is proportional to 
> the square of the number of pending requests. However, for the use case 
> that matters most to me, F2FS on top of a UFS device, we haven't seen 
> any retries in our tests without I/O scheduler. This is probably because 
> of how F2FS submits writes combined with the UFS controller only 
> supporting a single hardware queue. I expect to see a small number of 
> retries once UFS controllers become available that support multiple 
> hardware queues.
> 
> Thanks,
> 
> Bart.


-- 
Damien Le Moal
Western Digital Research
