linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Damien Le Moal <Damien.LeMoal@wdc.com>
To: "Javier González" <javier@javigon.com>
Cc: "linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>,
	"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	"hch@lst.de" <hch@lst.de>,
	"kbusch@kernel.org" <kbusch@kernel.org>,
	"sagi@grimberg.me" <sagi@grimberg.me>,
	"axboe@kernel.dk" <axboe@kernel.dk>,
	SelvaKumar S <selvakuma.s1@samsung.com>,
	Kanchan Joshi <joshi.k@samsung.com>,
	Nitesh Shetty <nj.shetty@samsung.com>
Subject: Re: [PATCH 2/6] block: add support for selecting all zones
Date: Fri, 26 Jun 2020 07:06:34 +0000	[thread overview]
Message-ID: <CY4PR04MB3751CD082BC0F77A5284EC34E7930@CY4PR04MB3751.namprd04.prod.outlook.com> (raw)
In-Reply-To: 20200626065241.u4fd3m7624kdsonw@mpHalley.localdomain

On 2020/06/26 15:52, Javier González wrote:
> On 26.06.2020 06:35, Damien Le Moal wrote:
>> On 2020/06/26 14:58, Javier González wrote:
>>> On 26.06.2020 01:27, Damien Le Moal wrote:
>>>> On 2020/06/25 21:22, Javier González wrote:
>>>>> From: Javier González <javier.gonz@samsung.com>
>>>>>
>>>>> Add flag to allow selecting all zones on a single zone management
>>>>> operation
>>>>>
>>>>> Signed-off-by: Javier González <javier.gonz@samsung.com>
>>>>> Signed-off-by: SelvaKumar S <selvakuma.s1@samsung.com>
>>>>> Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
>>>>> Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
>>>>> ---
>>>>>  block/blk-zoned.c             | 3 +++
>>>>>  include/linux/blk_types.h     | 3 ++-
>>>>>  include/uapi/linux/blkzoned.h | 9 +++++++++
>>>>>  3 files changed, 14 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/block/blk-zoned.c b/block/blk-zoned.c
>>>>> index e87c60004dc5..29194388a1bb 100644
>>>>> --- a/block/blk-zoned.c
>>>>> +++ b/block/blk-zoned.c
>>>>> @@ -420,6 +420,9 @@ int blkdev_zone_mgmt_ioctl(struct block_device *bdev, fmode_t mode,
>>>>>  		return -ENOTTY;
>>>>>  	}
>>>>>
>>>>> +	if (zmgmt.flags & BLK_ZONE_SELECT_ALL)
>>>>> +		op |= REQ_ZONE_ALL;
>>>>> +
>>>>>  	return blkdev_zone_mgmt(bdev, op, zmgmt.sector, zmgmt.nr_sectors,
>>>>>  				GFP_KERNEL);
>>>>>  }
>>>>> diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
>>>>> index ccb895f911b1..16b57fb2b99c 100644
>>>>> --- a/include/linux/blk_types.h
>>>>> +++ b/include/linux/blk_types.h
>>>>> @@ -351,6 +351,7 @@ enum req_flag_bits {
>>>>>  	 * work item to avoid such priority inversions.
>>>>>  	 */
>>>>>  	__REQ_CGROUP_PUNT,
>>>>> +	__REQ_ZONE_ALL,		/* apply zone operation to all zones */
>>>>>
>>>>>  	/* command specific flags for REQ_OP_WRITE_ZEROES: */
>>>>>  	__REQ_NOUNMAP,		/* do not free blocks when zeroing */
>>>>> @@ -378,7 +379,7 @@ enum req_flag_bits {
>>>>>  #define REQ_BACKGROUND		(1ULL << __REQ_BACKGROUND)
>>>>>  #define REQ_NOWAIT		(1ULL << __REQ_NOWAIT)
>>>>>  #define REQ_CGROUP_PUNT		(1ULL << __REQ_CGROUP_PUNT)
>>>>> -
>>>>> +#define REQ_ZONE_ALL		(1ULL << __REQ_ZONE_ALL)
>>>>>  #define REQ_NOUNMAP		(1ULL << __REQ_NOUNMAP)
>>>>>  #define REQ_HIPRI		(1ULL << __REQ_HIPRI)
>>>>>
>>>>> diff --git a/include/uapi/linux/blkzoned.h b/include/uapi/linux/blkzoned.h
>>>>> index 07b5fde21d9f..a8c89fe58f97 100644
>>>>> --- a/include/uapi/linux/blkzoned.h
>>>>> +++ b/include/uapi/linux/blkzoned.h
>>>>> @@ -157,6 +157,15 @@ enum blk_zone_action {
>>>>>  	BLK_ZONE_MGMT_RESET	= 0x4,
>>>>>  };
>>>>>
>>>>> +/**
>>>>> + * enum blk_zone_mgmt_flags - Flags for blk_zone_mgmt
>>>>> + *
>>>>> + * BLK_ZONE_SELECT_ALL: Select all zones for current zone action
>>>>> + */
>>>>> +enum blk_zone_mgmt_flags {
>>>>> +	BLK_ZONE_SELECT_ALL	= 1 << 0,
>>>>> +};
>>>>> +
>>>>>  /**
>>>>>   * struct blk_zone_mgmt - Extended zoned management
>>>>>   *
>>>>>
>>>>
>>>> NACK.
>>>>
>>>> Details:
>>>> 1) REQ_OP_ZONE_RESET together with REQ_ZONE_ALL is the same as
>>>> REQ_OP_ZONE_RESET_ALL, isn't it ? You are duplicating a functionality that
>>>> already exists.
>>>> 2) The patch introduces REQ_ZONE_ALL at the block layer only without defining
>>>> how it ties into SCSI and NVMe driver use of it. Is REQ_ZONE_ALL indicating that
>>>> the zone management commands are to be executed with the ALL bit set ? If yes,
>>>> that will break device-mapper. See the special code for handling
>>>> REQ_OP_ZONE_RESET_ALL. That code is in place for a reason: the target block
>>>> device may not be an entire physical device. In that case, applying a zone
>>>> management command to all zones of the physical drive is wrong.
>>>> 3) REQ_ZONE_ALL seems completely equivalent to specifying a sector range of [0
>>>> .. drive capacity]. So what is the point ? The current interface handles that.
>>>> That is how we chose between REQ_OP_ZONE_RESET and REQ_OP_ZONE_RESET_ALL right now.
>>>> 4) Without any in-kernel user, I do not see the point. And for applications, I
>>>> do not see any good use case for doing open all, close all, offline all or
>>>> finish all. If you have any such good use case, please elaborate.
>>>>
>>>
>>> The main use if reset all, but without having to look through all zones,
>>> as it imposes an overhead when we have a large number of zones. Having
>>> the possibility to offload it to HW is more efficient.
>>>
>>> I had not thought about the device mapper use case. Would it be an
>>> option to translate this into REQ_OP_ZONE_RESET_ALL when we have a
>>> device mapper (or any other case where this might break) and then leave
>>> the bit go to the driver if it applies to the whole device?
>>
>> But REQ_OP_ZONE_RESET_ALL is already implemented and supported and will reset
>> all zones of a drive using a single command if the ioctl is called for the
>> entire sector range of a physical drive. For device mapper with a partial
>> mapping, the per zone reset loop will be used. If you have no other use case for
>> the REQ_ZONE_ALL flag, what is the point here ? Reset is already optimized for
>> the all zones case
> 
> OK. I might have missed this. I thought we were sending several commands
> instead of a single reset with the bit. I will check again. Thanks for
> pointing at this.

In block/blk-zoned.c, there is:

static inline bool blkdev_allow_reset_all_zones(struct block_device *bdev,
                                                sector_t sector,
                                                sector_t nr_sectors)
{
        if (!blk_queue_zone_resetall(bdev_get_queue(bdev)))
                return false;

        /*
         * REQ_OP_ZONE_RESET_ALL can be executed only if the number of sectors
         * of the applicable zone range is the entire disk.
         */
        return !sector && nr_sectors == get_capacity(bdev->bd_disk);
}

And:

int blkdev_zone_mgmt(struct block_device *bdev, enum req_opf op,
                     sector_t sector, sector_t nr_sectors,
                     gfp_t gfp_mask)
{
	...

	while (sector < end_sector) {
                bio = blk_next_bio(bio, 0, gfp_mask);
                bio_set_dev(bio, bdev);

                /*
                 * Special case for the zone reset operation that reset all
                 * zones, this is useful for applications like mkfs.
                 */
                if (op == REQ_OP_ZONE_RESET &&
                    blkdev_allow_reset_all_zones(bdev, sector, nr_sectors)) {
                        bio->bi_opf = REQ_OP_ZONE_RESET_ALL;
                        break;
			^^^^^ This means one command only...

                }

                bio->bi_opf = op | REQ_SYNC;
                bio->bi_iter.bi_sector = sector;
                sector += zone_sectors;

                /* This may take a while, so be nice to others */
                cond_resched();
        }

        ret = submit_bio_wait(bio);
        bio_put(bio);

        return ret;
}

And see scsi/sd_zbc.c and zns.c. REQ_OP_ZONE_RESET_ALL end up setting the ALL
bit for reset command.


> 
> Javier
> 


-- 
Damien Le Moal
Western Digital Research

  reply	other threads:[~2020-06-26  7:06 UTC|newest]

Thread overview: 70+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-25 12:21 [PATCH 0/6] ZNS: Extra features for current patches Javier González
2020-06-25 12:21 ` [PATCH 1/6] block: introduce IOCTL for zone mgmt Javier González
2020-06-26  1:17   ` Damien Le Moal
2020-06-26  6:01     ` Javier González
2020-06-26  6:37       ` Damien Le Moal
2020-06-26  6:51         ` Javier González
2020-06-26  7:03           ` Damien Le Moal
2020-06-26  7:08             ` Javier González
2020-06-25 12:21 ` [PATCH 2/6] block: add support for selecting all zones Javier González
2020-06-26  1:27   ` Damien Le Moal
2020-06-26  5:58     ` Javier González
2020-06-26  6:35       ` Damien Le Moal
2020-06-26  6:52         ` Javier González
2020-06-26  7:06           ` Damien Le Moal [this message]
2020-06-25 12:21 ` [PATCH 3/6] block: add support for zone offline transition Javier González
2020-06-25 14:12   ` Matias Bjørling
2020-06-25 19:48     ` Javier González
2020-06-26  1:14       ` Damien Le Moal
2020-06-26  6:18         ` Javier González
2020-06-26  9:11         ` hch
2020-06-26  9:15           ` Damien Le Moal
2020-06-26  9:17             ` hch
2020-06-26 10:02               ` Javier González
2020-06-26  9:07     ` Christoph Hellwig
2020-06-26  1:34   ` Damien Le Moal
2020-06-26  6:08     ` Javier González
2020-06-26  6:42       ` Damien Le Moal
2020-06-26  6:58         ` Javier González
2020-06-26  7:17           ` Damien Le Moal
2020-06-26  7:26             ` Javier González
2020-06-25 12:21 ` [PATCH 4/6] block: introduce IOCTL to report dev properties Javier González
2020-06-25 13:10   ` Matias Bjørling
2020-06-25 19:42     ` Javier González
2020-06-25 19:58       ` Matias Bjørling
2020-06-26  6:24         ` Javier González
2020-06-25 20:25       ` Keith Busch
2020-06-26  6:28         ` Javier González
2020-06-26 15:52           ` Keith Busch
2020-06-26 16:25             ` Javier González
2020-06-26  0:57       ` Damien Le Moal
2020-06-26  6:27         ` Javier González
2020-06-26  1:38   ` Damien Le Moal
2020-06-26  6:22     ` Javier González
2020-06-25 12:21 ` [PATCH 5/6] block: add zone attr. to zone mgmt IOCTL struct Javier González
2020-06-25 15:13   ` Matias Bjørling
2020-06-25 19:51     ` Javier González
2020-06-26  1:45   ` Damien Le Moal
2020-06-26  6:03     ` Javier González
2020-06-26  6:38       ` Damien Le Moal
2020-06-26  6:49         ` Javier González
2020-06-26  9:14   ` Christoph Hellwig
2020-06-26 10:01     ` Javier González
2020-06-25 12:21 ` [PATCH 6/6] nvme: Add consistency check for zone count Javier González
2020-06-25 13:16   ` Matias Bjørling
2020-06-25 19:45     ` Javier González
2020-06-25 21:49   ` Keith Busch
2020-06-26  0:04     ` Damien Le Moal
2020-06-26  6:13       ` Javier González
2020-06-26  6:49         ` Damien Le Moal
2020-06-26  6:55           ` Javier González
2020-06-26  7:09             ` Damien Le Moal
2020-06-26  7:29               ` Javier González
2020-06-26  7:42                 ` Damien Le Moal
2020-06-26  9:16   ` Christoph Hellwig
2020-06-26 10:03     ` Javier González
2020-06-25 13:04 ` [PATCH 0/6] ZNS: Extra features for current patches Matias Bjørling
2020-06-25 14:48   ` Matias Bjørling
2020-06-25 19:39     ` Javier González
2020-06-25 19:53       ` Matias Bjørling
2020-06-26  6:26         ` Javier González

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CY4PR04MB3751CD082BC0F77A5284EC34E7930@CY4PR04MB3751.namprd04.prod.outlook.com \
    --to=damien.lemoal@wdc.com \
    --cc=axboe@kernel.dk \
    --cc=hch@lst.de \
    --cc=javier@javigon.com \
    --cc=joshi.k@samsung.com \
    --cc=kbusch@kernel.org \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-nvme@lists.infradead.org \
    --cc=nj.shetty@samsung.com \
    --cc=sagi@grimberg.me \
    --cc=selvakuma.s1@samsung.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).