linux-block.vger.kernel.org archive mirror
From: "Javier González" <javier@javigon.com>
To: Damien Le Moal <Damien.LeMoal@wdc.com>
Cc: Keith Busch <kbusch@kernel.org>,
	"linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>,
	"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	"hch@lst.de" <hch@lst.de>, "sagi@grimberg.me" <sagi@grimberg.me>,
	"axboe@kernel.dk" <axboe@kernel.dk>,
	SelvaKumar S <selvakuma.s1@samsung.com>,
	Kanchan Joshi <joshi.k@samsung.com>,
	Nitesh Shetty <nj.shetty@samsung.com>
Subject: Re: [PATCH 6/6] nvme: Add consistency check for zone count
Date: Fri, 26 Jun 2020 09:29:00 +0200
Message-ID: <20200626072900.rjigm3wiya4sdufv@mpHalley.localdomain>
In-Reply-To: <CY4PR04MB3751A7165FC7F068C1ECBB5EE7930@CY4PR04MB3751.namprd04.prod.outlook.com>

On 26.06.2020 07:09, Damien Le Moal wrote:
>On 2020/06/26 15:55, Javier González wrote:
>> On 26.06.2020 06:49, Damien Le Moal wrote:
>>> On 2020/06/26 15:13, Javier González wrote:
>>>> On 26.06.2020 00:04, Damien Le Moal wrote:
>>>>> On 2020/06/26 6:49, Keith Busch wrote:
>>>>>> On Thu, Jun 25, 2020 at 02:21:52PM +0200, Javier González wrote:
>>>>>>>  drivers/nvme/host/zns.c | 7 +++++++
>>>>>>>  1 file changed, 7 insertions(+)
>>>>>>>
>>>>>>> diff --git a/drivers/nvme/host/zns.c b/drivers/nvme/host/zns.c
>>>>>>> index 7d8381fe7665..de806788a184 100644
>>>>>>> --- a/drivers/nvme/host/zns.c
>>>>>>> +++ b/drivers/nvme/host/zns.c
>>>>>>> @@ -234,6 +234,13 @@ static int nvme_ns_report_zones(struct nvme_ns *ns, sector_t sector,
>>>>>>>  		sector += ns->zsze * nz;
>>>>>>>  	}
>>>>>>>
>>>>>>> +	if (nr_zones < 0 && zone_idx != ns->nr_zones) {
>>>>>>> +		dev_err(ns->ctrl->device, "inconsistent zone count %u/%u\n",
>>>>>>> +				zone_idx, ns->nr_zones);
>>>>>>> +		ret = -EINVAL;
>>>>>>> +		goto out_free;
>>>>>>> +	}
>>>>>>> +
>>>>>>>  	ret = zone_idx;
>>>>>>
>>>>>> nr_zones is unsigned, so it's never < 0.
>>>>>>
>>>>>> The API we're providing doesn't require zone_idx equal the namespace's
>>>>>> nr_zones at the end, though. A subset of the total number of zones can
>>>>>> be requested here.
>>>>>>
>>>>
>>>> I did see nr_zones coming with -1; guess it is my compiler.
>>>
>>> See include/linux/blkdev.h. -1 is:
>>>
>>> #define BLK_ALL_ZONES  ((unsigned int)-1)
>>>
>>> Which is documented in block/blk-zoned.c:
>>>
>>> /**
>>> * blkdev_report_zones - Get zones information
>>> * @bdev:       Target block device
>>> * @sector:     Sector from which to report zones
>>> * @nr_zones:   Maximum number of zones to report
>>> * @cb:         Callback function called for each reported zone
>>> * @data:       Private data for the callback
>>> *
>>> * Description:
>>> *    Get zone information starting from the zone containing @sector for at most
>>> *    @nr_zones, and call @cb for each zone reported by the device.
>>> *    To report all zones in a device starting from @sector, the BLK_ALL_ZONES
>>> *    constant can be passed to @nr_zones.
>>> *    Returns the number of zones reported by the device, or a negative errno
>>> *    value in case of failure.
>>> *
>>> *    Note: The caller must use memalloc_noXX_save/restore() calls to control
>>> *    memory allocations done within this function.
>>> */
>>> int blkdev_report_zones(struct block_device *bdev, sector_t sector,
>>>                        unsigned int nr_zones, report_zones_cb cb, void *data)
>>>
>>>>
>>>>>
>>>>> Yes, absolutely. zone_idx is not an absolute zone number. It is the index of the
>>>>> reported zone descriptor in the current report range requested by the user,
>>>>> which is not necessarily for the entire drive (i.e., provided nr zones is less
>>>>> than the total number of zones of the disk and/or start sector is > 0). So
>>>>> zone_idx indicates the actual number of zones reported, it is not the total
>>>>
>>>> I see. I believed that when nr_zones comes in undefined we could
>>>> assume that zone_idx is absolute, but I may be wrong.
>>>
>>> No. zone_idx is *always* the index of the zone in the current report. Whatever
>>> that report is, regardless of the report starting point and number of zones
>>> requested. E.g. For a single zone report (nr_zones = 1), you will always see
>>> zone_idx = 0. For a full report, zone_idx will correspond to the zone number.
>>> This is used for example in blk_revalidate_disk_zones() to initialize the zone
>>> bitmaps.
>>>
>>>> Does it make sense to support this check with an additional counter and
>>>> an explicit nr_zones initialization when undefined, or do you
>>>> prefer to just remove it as Matias suggested?
>>>
>>> The check is not needed at all.
>>>
>>> If the device is buggy and reports more zones than the device capacity or any
>>> other bugs, the driver can catch that when it processes the report.
>>> blk_revalidate_disk_zones() also has many checks.
>>
>> I have managed to create a QEMU ZNS device that gave me a headache with
>> a little bit of extra capacity that triggered an additional zone report.
>> This was the motivation for the patch.
>
>The device emulation sounds buggy... If the capacity is wrong, then the report
>will be too since zones are all supposed to be sequential (no holes between
>zones) and up to the disk capacity only (last zone start + len = capacity + 1)
>
>If one or the other is wrong, this should be easy to detect. Normally,
>blk_revalidate_disk_zones() should be able to catch that.

We have the capability to select the reported device capacity manually
for a number of reasons. One of the different test configurations in our
CI did go through.
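
The layout rule described above (zones are sequential, have no holes, and
end at the device capacity) can be sketched as a small user-space check.
This is only an illustration, not the code in blk_revalidate_disk_zones():
struct zone_desc and check_zone_layout() are hypothetical names, and the
sketch assumes capacity is expressed as a sector count, so the last zone
must end exactly at that value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical zone descriptor: start and length in sectors. */
struct zone_desc {
	uint64_t start;
	uint64_t len;
};

/*
 * Returns 0 if the zones are contiguous from sector 0 and end exactly at
 * capacity, -1 on a hole or overlap, -2 if the end does not match capacity.
 * Assumption: capacity is a sector count, not the last valid sector.
 */
static int check_zone_layout(const struct zone_desc *zones, size_t nr,
			     uint64_t capacity)
{
	uint64_t expected = 0;
	size_t i;

	assert(zones != NULL || nr == 0);

	for (i = 0; i < nr; i++) {
		/* Each zone must start where the previous one ended. */
		if (zones[i].start != expected)
			return -1;
		expected += zones[i].len;
	}

	return expected == capacity ? 0 : -2;
}
```

With a manually enlarged capacity, as in the test configuration mentioned
here, a check of this kind would report a mismatch at the end of the layout.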

But it is OK, I will remove the check in V2.
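
For reference, the two points raised in the review can be shown with a
small user-space model. It is a sketch under stated assumptions, not the
nvme driver code: model_report_zones() and is_full_report() are
hypothetical helpers, and only the BLK_ALL_ZONES definition is taken from
the header quoted above. Because nr_zones is unsigned, a test such as
nr_zones < 0 is always false, and the returned count is relative to the
requested range rather than an absolute zone number.

```c
#include <assert.h>

/* Same definition as quoted above from include/linux/blkdev.h. */
#define BLK_ALL_ZONES ((unsigned int)-1)

/*
 * Hypothetical model of a zone report: walk zones starting at start_zone
 * and stop after nr_zones descriptors or at the end of the device.
 * The return value plays the role of zone_idx.
 */
static unsigned int model_report_zones(unsigned int total_zones,
					unsigned int start_zone,
					unsigned int nr_zones)
{
	unsigned int zone_idx = 0;

	assert(start_zone <= total_zones);

	while (zone_idx < nr_zones && start_zone + zone_idx < total_zones)
		zone_idx++;

	return zone_idx;
}

/* Returns 1 only when every zone is requested starting from zone 0. */
static int is_full_report(unsigned int start_zone, unsigned int nr_zones)
{
	/* Compare against the constant; nr_zones can never be negative. */
	return start_zone == 0 && nr_zones == BLK_ALL_ZONES;
}
```

In this model, a report that starts at zone 40 of a 100-zone device with
BLK_ALL_ZONES returns 60, so comparing the result with the total zone count
is only meaningful when is_full_report() holds.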

Javier

Thread overview: 70+ messages
2020-06-25 12:21 [PATCH 0/6] ZNS: Extra features for current patches Javier González
2020-06-25 12:21 ` [PATCH 1/6] block: introduce IOCTL for zone mgmt Javier González
2020-06-26  1:17   ` Damien Le Moal
2020-06-26  6:01     ` Javier González
2020-06-26  6:37       ` Damien Le Moal
2020-06-26  6:51         ` Javier González
2020-06-26  7:03           ` Damien Le Moal
2020-06-26  7:08             ` Javier González
2020-06-25 12:21 ` [PATCH 2/6] block: add support for selecting all zones Javier González
2020-06-26  1:27   ` Damien Le Moal
2020-06-26  5:58     ` Javier González
2020-06-26  6:35       ` Damien Le Moal
2020-06-26  6:52         ` Javier González
2020-06-26  7:06           ` Damien Le Moal
2020-06-25 12:21 ` [PATCH 3/6] block: add support for zone offline transition Javier González
2020-06-25 14:12   ` Matias Bjørling
2020-06-25 19:48     ` Javier González
2020-06-26  1:14       ` Damien Le Moal
2020-06-26  6:18         ` Javier González
2020-06-26  9:11         ` hch
2020-06-26  9:15           ` Damien Le Moal
2020-06-26  9:17             ` hch
2020-06-26 10:02               ` Javier González
2020-06-26  9:07     ` Christoph Hellwig
2020-06-26  1:34   ` Damien Le Moal
2020-06-26  6:08     ` Javier González
2020-06-26  6:42       ` Damien Le Moal
2020-06-26  6:58         ` Javier González
2020-06-26  7:17           ` Damien Le Moal
2020-06-26  7:26             ` Javier González
2020-06-25 12:21 ` [PATCH 4/6] block: introduce IOCTL to report dev properties Javier González
2020-06-25 13:10   ` Matias Bjørling
2020-06-25 19:42     ` Javier González
2020-06-25 19:58       ` Matias Bjørling
2020-06-26  6:24         ` Javier González
2020-06-25 20:25       ` Keith Busch
2020-06-26  6:28         ` Javier González
2020-06-26 15:52           ` Keith Busch
2020-06-26 16:25             ` Javier González
2020-06-26  0:57       ` Damien Le Moal
2020-06-26  6:27         ` Javier González
2020-06-26  1:38   ` Damien Le Moal
2020-06-26  6:22     ` Javier González
2020-06-25 12:21 ` [PATCH 5/6] block: add zone attr. to zone mgmt IOCTL struct Javier González
2020-06-25 15:13   ` Matias Bjørling
2020-06-25 19:51     ` Javier González
2020-06-26  1:45   ` Damien Le Moal
2020-06-26  6:03     ` Javier González
2020-06-26  6:38       ` Damien Le Moal
2020-06-26  6:49         ` Javier González
2020-06-26  9:14   ` Christoph Hellwig
2020-06-26 10:01     ` Javier González
2020-06-25 12:21 ` [PATCH 6/6] nvme: Add consistency check for zone count Javier González
2020-06-25 13:16   ` Matias Bjørling
2020-06-25 19:45     ` Javier González
2020-06-25 21:49   ` Keith Busch
2020-06-26  0:04     ` Damien Le Moal
2020-06-26  6:13       ` Javier González
2020-06-26  6:49         ` Damien Le Moal
2020-06-26  6:55           ` Javier González
2020-06-26  7:09             ` Damien Le Moal
2020-06-26  7:29               ` Javier González [this message]
2020-06-26  7:42                 ` Damien Le Moal
2020-06-26  9:16   ` Christoph Hellwig
2020-06-26 10:03     ` Javier González
2020-06-25 13:04 ` [PATCH 0/6] ZNS: Extra features for current patches Matias Bjørling
2020-06-25 14:48   ` Matias Bjørling
2020-06-25 19:39     ` Javier González
2020-06-25 19:53       ` Matias Bjørling
2020-06-26  6:26         ` Javier González
