All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH V4 0/9] nvmet: add ZBD backend support
@ 2020-12-02  6:22 ` Chaitanya Kulkarni
  0 siblings, 0 replies; 54+ messages in thread
From: Chaitanya Kulkarni @ 2020-12-02  6:22 UTC (permalink / raw)
  To: linux-block, linux-nvme
  Cc: sagi, hch, damien.lemoal, johannes.thumshirn, Chaitanya Kulkarni

Hi,

NVMeOF Host is capable of handling the NVMe Protocol based Zoned Block
Devices (ZBD) in the Zoned Namespaces (ZNS) mode with the passthru
backend. There is no support for a generic block device backend to
handle the ZBD devices which are not NVMe protocol compliant yet provide
NVMe ZNS interface.

This adds support to export the ZBDs (which are not NVMe drives) to host
the from target via NVMeOF using the host side ZNS interface.

The patch series is generated in bottom-top manner where, it first adds
prep patch and ZNS command-specific handlers on the top of genblk and 
updates the data structures, then one by one it wires up the admin cmds
in the order host calls them in namespace initializing sequence. Once
everything is ready, it wires-up the I/O command handlers. See below for 
patch-series overview.

All the testcases are passing for the ZoneFS with ZBD exported with NVMeOF
backed by null_blk ZBD and null_blk ZBD without NVMeOF.

Please consider this for nvme-5.11.

Regards,
Chaitanya

Changes from V3:-

1.  Get rid of the bio_max_zasl check.
2.  Remove extra lines.
4.  Remove the block layer api export patch.
5.  Remove the bvec check in the bio_iov_iter_get_pages() for
    REQ_OP_ZONE_APPEND so that we can reuse the code.

Changes from V2:-

1.  Move conventional zone bitmap check into 
    nvmet_bdev_validate_zns_zones(). 
2.  Don't use report zones call to check the runt zone.
3.  Trim nvmet_zasl() helper.
4.  Fix typo in the nvmet_zns_update_zasl().
5.  Remove the comment and fix the mdts calculation in
    nvmet_execute_identify_cns_cs_ctrl().
6.  Use u64 for bufsize in nvmet_bdev_execute_zone_mgmt_recv().
7.  Remove nvmet_zones_to_desc_size() and fix the nr_zones
    calculation.
8.  Remove the op variable in nvmet_bdev_execute_zone_append().
9.  Fix the nr_zones calculation nvmet_bdev_execute_zone_mgmt_recv().
10. Update cover letter subject.

Changes from V1:-

1.  Remove the nvmet-$(CONFIG_BLK_DEV_ZONED) += zns.o.
2.  Mark helpers inline.
3.  Fix typos in the comments and update the comments.
4.  Get rid of the curly brackets.
5.  Don't allow drives with last smaller zones.
6.  Calculate the zasl as a function of ax_zone_append_sectors,
    bio_max_pages so we don't have to split the bio.
7.  Add global subsys->zasl and update the zasl when new namespace
    is enabled.
8.  Rmove the loop in the nvmet_bdev_execute_zone_mgmt_recv() and
    move functionality in to the report zone callback.
9.  Add goto for default case in nvmet_bdev_execute_zone_mgmt_send().
10. Allocate the zones buffer with zones size instead of bdev nr_zones.

Chaitanya Kulkarni (9):
  block: allow bvec for zone append get pages
  nvmet: add ZNS support for bdev-ns
  nvmet: trim down id-desclist to use req->ns
  nvmet: add NVME_CSI_ZNS in ns-desc for zbdev
  nvmet: add cns-cs-ctrl in id-ctrl for ZNS bdev
  nvmet: add cns-cs-ns in id-ctrl for ZNS bdev
  nvmet: add zns cmd effects to support zbdev
  nvmet: add zns bdev config support
  nvmet: add ZNS based I/O cmds handlers

 block/bio.c                       |   2 -
 drivers/nvme/target/Makefile      |   2 +-
 drivers/nvme/target/admin-cmd.c   |  38 ++-
 drivers/nvme/target/io-cmd-bdev.c |  12 +
 drivers/nvme/target/io-cmd-file.c |   2 +-
 drivers/nvme/target/nvmet.h       |  19 ++
 drivers/nvme/target/zns.c         | 417 ++++++++++++++++++++++++++++++
 7 files changed, 475 insertions(+), 17 deletions(-)
 create mode 100644 drivers/nvme/target/zns.c


Test Results :-
# nvme zns id-ctrl /dev/nvme
NVMe ZNS Identify Controller:
zasl    : 4
#
#
# nvme zns id-ctrl /dev/nvme1n1
NVMe ZNS Identify Controller:
zasl    : 4
# nvme zns id-ns /dev/nvme1n1
#
#
ZNS Command Set Identify Namespace:
zoc     : 0
ozcs    : 0
mar     : 0
mor     : 0
rrl     : 0
frl     : 0
lbafe  0: zsze:0x10000 zdes:0 (in use)
# 
#
#
# lsblk | grep null
nullb0            252:0    0    1G  0 disk 
# 
# 
# 
# ./zonefs-tests.sh /dev/nullb0 
Gathering information on /dev/nullb0...
zonefs-tests on /dev/nullb0:
  4 zones (0 conventional zones, 4 sequential zones)
  524288 512B sectors zone size (256 MiB)
  0 max open zones
Running tests
  Test 0010:  mkzonefs (options)                                   ... PASS
  Test 0011:  mkzonefs (force format)                              ... PASS
  Test 0012:  mkzonefs (invalid device)                            ... PASS
  Test 0013:  mkzonefs (super block zone state)                    ... PASS
  Test 0020:  mount (default)                                      ... PASS
  Test 0021:  mount (invalid device)                               ... PASS
  Test 0022:  mount (check mount directory sub-directories)        ... PASS
  Test 0023:  mount (options)                                      ... PASS
  Test 0030:  Number of files (default)                            ... PASS
  Test 0031:  Number of files (aggr_cnv)                           ... skip
  Test 0032:  Number of files using stat (default)                 ... PASS
  Test 0033:  Number of files using stat (aggr_cnv)                ... PASS
  Test 0034:  Number of blocks using stat (default)                ... PASS
  Test 0035:  Number of blocks using stat (aggr_cnv)               ... PASS
  Test 0040:  Files permissions (default)                          ... PASS
  Test 0041:  Files permissions (aggr_cnv)                         ... skip
  Test 0042:  Files permissions (set value)                        ... PASS
  Test 0043:  Files permissions (set value + aggr_cnv)             ... skip
  Test 0050:  Files owner (default)                                ... PASS
  Test 0051:  Files owner (aggr_cnv)                               ... skip
  Test 0052:  Files owner (set value)                              ... PASS
  Test 0053:  Files owner (set value + aggr_cnv)                   ... skip
  Test 0060:  Files size (default)                                 ... PASS
  Test 0061:  Files size (aggr_cnv)                                ... skip
  Test 0070:  Conventional file truncate                           ... skip
  Test 0071:  Conventional file truncate (aggr_cnv)                ... skip
  Test 0072:  Conventional file unlink                             ... skip
  Test 0073:  Conventional file unlink (aggr_cnv)                  ... skip
  Test 0074:  Conventional file random write                       ... skip
  Test 0075:  Conventional file random write (direct)              ... skip
  Test 0076:  Conventional file random write (aggr_cnv)            ... skip
  Test 0077:  Conventional file random write (aggr_cnv, direct)    ... skip
  Test 0078:  Conventional file mmap read/write                    ... skip
  Test 0079:  Conventional file mmap read/write (aggr_cnv)         ... skip
  Test 0080:  Sequential file truncate                             ... PASS
  Test 0081:  Sequential file unlink                               ... PASS
  Test 0082:  Sequential file buffered write IO                    ... PASS
  Test 0083:  Sequential file overwrite                            ... PASS
  Test 0084:  Sequential file unaligned write (sync IO)            ... PASS
  Test 0085:  Sequential file unaligned write (async IO)           ... PASS
  Test 0086:  Sequential file append (sync)                        ... PASS
  Test 0087:  Sequential file append (async)                       ... PASS
  Test 0088:  Sequential file random read                          ... PASS
  Test 0089:  Sequential file mmap read/write                      ... PASS
  Test 0090:  sequential file 4K synchronous write                 ... PASS
  Test 0091:  Sequential file large synchronous write              ... PASS

46 / 46 tests passed
# 
# 
# 
# ./zonefs-tests.sh /dev/nvme1n1 
Gathering information on /dev/nvme1n1...
zonefs-tests on /dev/nvme1n1:
  4 zones (0 conventional zones, 4 sequential zones)
  524288 512B sectors zone size (256 MiB)
  1 max open zones
Running tests
  Test 0010:  mkzonefs (options)                                   ... PASS
  Test 0011:  mkzonefs (force format)                              ... PASS
  Test 0012:  mkzonefs (invalid device)                            ... PASS
  Test 0013:  mkzonefs (super block zone state)                    ... PASS
  Test 0020:  mount (default)                                      ... PASS
  Test 0021:  mount (invalid device)                               ... PASS
  Test 0022:  mount (check mount directory sub-directories)        ... PASS
  Test 0023:  mount (options)                                      ... PASS
  Test 0030:  Number of files (default)                            ... PASS
  Test 0031:  Number of files (aggr_cnv)                           ... skip
  Test 0032:  Number of files using stat (default)                 ... PASS
  Test 0033:  Number of files using stat (aggr_cnv)                ... PASS
  Test 0034:  Number of blocks using stat (default)                ... PASS
  Test 0035:  Number of blocks using stat (aggr_cnv)               ... PASS
  Test 0040:  Files permissions (default)                          ... PASS
  Test 0041:  Files permissions (aggr_cnv)                         ... skip
  Test 0042:  Files permissions (set value)                        ... PASS
  Test 0043:  Files permissions (set value + aggr_cnv)             ... skip
  Test 0050:  Files owner (default)                                ... PASS
  Test 0051:  Files owner (aggr_cnv)                               ... skip
  Test 0052:  Files owner (set value)                              ... PASS
  Test 0053:  Files owner (set value + aggr_cnv)                   ... skip
  Test 0060:  Files size (default)                                 ... PASS
  Test 0061:  Files size (aggr_cnv)                                ... skip
  Test 0070:  Conventional file truncate                           ... skip
  Test 0071:  Conventional file truncate (aggr_cnv)                ... skip
  Test 0072:  Conventional file unlink                             ... skip
  Test 0073:  Conventional file unlink (aggr_cnv)                  ... skip
  Test 0074:  Conventional file random write                       ... skip
  Test 0075:  Conventional file random write (direct)              ... skip
  Test 0076:  Conventional file random write (aggr_cnv)            ... skip
  Test 0077:  Conventional file random write (aggr_cnv, direct)    ... skip
  Test 0078:  Conventional file mmap read/write                    ... skip
  Test 0079:  Conventional file mmap read/write (aggr_cnv)         ... skip
  Test 0080:  Sequential file truncate                             ... PASS
  Test 0081:  Sequential file unlink                               ... PASS
  Test 0082:  Sequential file buffered write IO                    ... PASS
  Test 0083:  Sequential file overwrite                            ... PASS
  Test 0084:  Sequential file unaligned write (sync IO)            ... PASS
  Test 0085:  Sequential file unaligned write (async IO)           ... PASS
  Test 0086:  Sequential file append (sync)                        ... PASS
  Test 0087:  Sequential file append (async)                       ... PASS
  Test 0088:  Sequential file random read                          ... PASS
  Test 0089:  Sequential file mmap read/write                      ... PASS
  Test 0090:  sequential file 4K synchronous write                 ... PASS
  Test 0091:  Sequential file large synchronous write              ... PASS

46 / 46 tests passed
-- 
2.22.1


^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH V4 0/9] nvmet: add ZBD backend support
@ 2020-12-02  6:22 ` Chaitanya Kulkarni
  0 siblings, 0 replies; 54+ messages in thread
From: Chaitanya Kulkarni @ 2020-12-02  6:22 UTC (permalink / raw)
  To: linux-block, linux-nvme
  Cc: johannes.thumshirn, damien.lemoal, sagi, Chaitanya Kulkarni, hch

Hi,

NVMeOF Host is capable of handling the NVMe Protocol based Zoned Block
Devices (ZBD) in the Zoned Namespaces (ZNS) mode with the passthru
backend. There is no support for a generic block device backend to
handle the ZBD devices which are not NVMe protocol compliant yet provide
NVMe ZNS interface.

This adds support to export the ZBDs (which are not NVMe drives) to host
the from target via NVMeOF using the host side ZNS interface.

The patch series is generated in bottom-top manner where, it first adds
prep patch and ZNS command-specific handlers on the top of genblk and 
updates the data structures, then one by one it wires up the admin cmds
in the order host calls them in namespace initializing sequence. Once
everything is ready, it wires-up the I/O command handlers. See below for 
patch-series overview.

All the testcases are passing for the ZoneFS with ZBD exported with NVMeOF
backed by null_blk ZBD and null_blk ZBD without NVMeOF.

Please consider this for nvme-5.11.

Regards,
Chaitanya

Changes from V3:-

1.  Get rid of the bio_max_zasl check.
2.  Remove extra lines.
4.  Remove the block layer api export patch.
5.  Remove the bvec check in the bio_iov_iter_get_pages() for
    REQ_OP_ZONE_APPEND so that we can reuse the code.

Changes from V2:-

1.  Move conventional zone bitmap check into 
    nvmet_bdev_validate_zns_zones(). 
2.  Don't use report zones call to check the runt zone.
3.  Trim nvmet_zasl() helper.
4.  Fix typo in the nvmet_zns_update_zasl().
5.  Remove the comment and fix the mdts calculation in
    nvmet_execute_identify_cns_cs_ctrl().
6.  Use u64 for bufsize in nvmet_bdev_execute_zone_mgmt_recv().
7.  Remove nvmet_zones_to_desc_size() and fix the nr_zones
    calculation.
8.  Remove the op variable in nvmet_bdev_execute_zone_append().
9.  Fix the nr_zones calculation nvmet_bdev_execute_zone_mgmt_recv().
10. Update cover letter subject.

Changes from V1:-

1.  Remove the nvmet-$(CONFIG_BLK_DEV_ZONED) += zns.o.
2.  Mark helpers inline.
3.  Fix typos in the comments and update the comments.
4.  Get rid of the curly brackets.
5.  Don't allow drives with last smaller zones.
6.  Calculate the zasl as a function of ax_zone_append_sectors,
    bio_max_pages so we don't have to split the bio.
7.  Add global subsys->zasl and update the zasl when new namespace
    is enabled.
8.  Rmove the loop in the nvmet_bdev_execute_zone_mgmt_recv() and
    move functionality in to the report zone callback.
9.  Add goto for default case in nvmet_bdev_execute_zone_mgmt_send().
10. Allocate the zones buffer with zones size instead of bdev nr_zones.

Chaitanya Kulkarni (9):
  block: allow bvec for zone append get pages
  nvmet: add ZNS support for bdev-ns
  nvmet: trim down id-desclist to use req->ns
  nvmet: add NVME_CSI_ZNS in ns-desc for zbdev
  nvmet: add cns-cs-ctrl in id-ctrl for ZNS bdev
  nvmet: add cns-cs-ns in id-ctrl for ZNS bdev
  nvmet: add zns cmd effects to support zbdev
  nvmet: add zns bdev config support
  nvmet: add ZNS based I/O cmds handlers

 block/bio.c                       |   2 -
 drivers/nvme/target/Makefile      |   2 +-
 drivers/nvme/target/admin-cmd.c   |  38 ++-
 drivers/nvme/target/io-cmd-bdev.c |  12 +
 drivers/nvme/target/io-cmd-file.c |   2 +-
 drivers/nvme/target/nvmet.h       |  19 ++
 drivers/nvme/target/zns.c         | 417 ++++++++++++++++++++++++++++++
 7 files changed, 475 insertions(+), 17 deletions(-)
 create mode 100644 drivers/nvme/target/zns.c


Test Results :-
# nvme zns id-ctrl /dev/nvme
NVMe ZNS Identify Controller:
zasl    : 4
#
#
# nvme zns id-ctrl /dev/nvme1n1
NVMe ZNS Identify Controller:
zasl    : 4
# nvme zns id-ns /dev/nvme1n1
#
#
ZNS Command Set Identify Namespace:
zoc     : 0
ozcs    : 0
mar     : 0
mor     : 0
rrl     : 0
frl     : 0
lbafe  0: zsze:0x10000 zdes:0 (in use)
# 
#
#
# lsblk | grep null
nullb0            252:0    0    1G  0 disk 
# 
# 
# 
# ./zonefs-tests.sh /dev/nullb0 
Gathering information on /dev/nullb0...
zonefs-tests on /dev/nullb0:
  4 zones (0 conventional zones, 4 sequential zones)
  524288 512B sectors zone size (256 MiB)
  0 max open zones
Running tests
  Test 0010:  mkzonefs (options)                                   ... PASS
  Test 0011:  mkzonefs (force format)                              ... PASS
  Test 0012:  mkzonefs (invalid device)                            ... PASS
  Test 0013:  mkzonefs (super block zone state)                    ... PASS
  Test 0020:  mount (default)                                      ... PASS
  Test 0021:  mount (invalid device)                               ... PASS
  Test 0022:  mount (check mount directory sub-directories)        ... PASS
  Test 0023:  mount (options)                                      ... PASS
  Test 0030:  Number of files (default)                            ... PASS
  Test 0031:  Number of files (aggr_cnv)                           ... skip
  Test 0032:  Number of files using stat (default)                 ... PASS
  Test 0033:  Number of files using stat (aggr_cnv)                ... PASS
  Test 0034:  Number of blocks using stat (default)                ... PASS
  Test 0035:  Number of blocks using stat (aggr_cnv)               ... PASS
  Test 0040:  Files permissions (default)                          ... PASS
  Test 0041:  Files permissions (aggr_cnv)                         ... skip
  Test 0042:  Files permissions (set value)                        ... PASS
  Test 0043:  Files permissions (set value + aggr_cnv)             ... skip
  Test 0050:  Files owner (default)                                ... PASS
  Test 0051:  Files owner (aggr_cnv)                               ... skip
  Test 0052:  Files owner (set value)                              ... PASS
  Test 0053:  Files owner (set value + aggr_cnv)                   ... skip
  Test 0060:  Files size (default)                                 ... PASS
  Test 0061:  Files size (aggr_cnv)                                ... skip
  Test 0070:  Conventional file truncate                           ... skip
  Test 0071:  Conventional file truncate (aggr_cnv)                ... skip
  Test 0072:  Conventional file unlink                             ... skip
  Test 0073:  Conventional file unlink (aggr_cnv)                  ... skip
  Test 0074:  Conventional file random write                       ... skip
  Test 0075:  Conventional file random write (direct)              ... skip
  Test 0076:  Conventional file random write (aggr_cnv)            ... skip
  Test 0077:  Conventional file random write (aggr_cnv, direct)    ... skip
  Test 0078:  Conventional file mmap read/write                    ... skip
  Test 0079:  Conventional file mmap read/write (aggr_cnv)         ... skip
  Test 0080:  Sequential file truncate                             ... PASS
  Test 0081:  Sequential file unlink                               ... PASS
  Test 0082:  Sequential file buffered write IO                    ... PASS
  Test 0083:  Sequential file overwrite                            ... PASS
  Test 0084:  Sequential file unaligned write (sync IO)            ... PASS
  Test 0085:  Sequential file unaligned write (async IO)           ... PASS
  Test 0086:  Sequential file append (sync)                        ... PASS
  Test 0087:  Sequential file append (async)                       ... PASS
  Test 0088:  Sequential file random read                          ... PASS
  Test 0089:  Sequential file mmap read/write                      ... PASS
  Test 0090:  sequential file 4K synchronous write                 ... PASS
  Test 0091:  Sequential file large synchronous write              ... PASS

46 / 46 tests passed
# 
# 
# 
# ./zonefs-tests.sh /dev/nvme1n1 
Gathering information on /dev/nvme1n1...
zonefs-tests on /dev/nvme1n1:
  4 zones (0 conventional zones, 4 sequential zones)
  524288 512B sectors zone size (256 MiB)
  1 max open zones
Running tests
  Test 0010:  mkzonefs (options)                                   ... PASS
  Test 0011:  mkzonefs (force format)                              ... PASS
  Test 0012:  mkzonefs (invalid device)                            ... PASS
  Test 0013:  mkzonefs (super block zone state)                    ... PASS
  Test 0020:  mount (default)                                      ... PASS
  Test 0021:  mount (invalid device)                               ... PASS
  Test 0022:  mount (check mount directory sub-directories)        ... PASS
  Test 0023:  mount (options)                                      ... PASS
  Test 0030:  Number of files (default)                            ... PASS
  Test 0031:  Number of files (aggr_cnv)                           ... skip
  Test 0032:  Number of files using stat (default)                 ... PASS
  Test 0033:  Number of files using stat (aggr_cnv)                ... PASS
  Test 0034:  Number of blocks using stat (default)                ... PASS
  Test 0035:  Number of blocks using stat (aggr_cnv)               ... PASS
  Test 0040:  Files permissions (default)                          ... PASS
  Test 0041:  Files permissions (aggr_cnv)                         ... skip
  Test 0042:  Files permissions (set value)                        ... PASS
  Test 0043:  Files permissions (set value + aggr_cnv)             ... skip
  Test 0050:  Files owner (default)                                ... PASS
  Test 0051:  Files owner (aggr_cnv)                               ... skip
  Test 0052:  Files owner (set value)                              ... PASS
  Test 0053:  Files owner (set value + aggr_cnv)                   ... skip
  Test 0060:  Files size (default)                                 ... PASS
  Test 0061:  Files size (aggr_cnv)                                ... skip
  Test 0070:  Conventional file truncate                           ... skip
  Test 0071:  Conventional file truncate (aggr_cnv)                ... skip
  Test 0072:  Conventional file unlink                             ... skip
  Test 0073:  Conventional file unlink (aggr_cnv)                  ... skip
  Test 0074:  Conventional file random write                       ... skip
  Test 0075:  Conventional file random write (direct)              ... skip
  Test 0076:  Conventional file random write (aggr_cnv)            ... skip
  Test 0077:  Conventional file random write (aggr_cnv, direct)    ... skip
  Test 0078:  Conventional file mmap read/write                    ... skip
  Test 0079:  Conventional file mmap read/write (aggr_cnv)         ... skip
  Test 0080:  Sequential file truncate                             ... PASS
  Test 0081:  Sequential file unlink                               ... PASS
  Test 0082:  Sequential file buffered write IO                    ... PASS
  Test 0083:  Sequential file overwrite                            ... PASS
  Test 0084:  Sequential file unaligned write (sync IO)            ... PASS
  Test 0085:  Sequential file unaligned write (async IO)           ... PASS
  Test 0086:  Sequential file append (sync)                        ... PASS
  Test 0087:  Sequential file append (async)                       ... PASS
  Test 0088:  Sequential file random read                          ... PASS
  Test 0089:  Sequential file mmap read/write                      ... PASS
  Test 0090:  sequential file 4K synchronous write                 ... PASS
  Test 0091:  Sequential file large synchronous write              ... PASS

46 / 46 tests passed
-- 
2.22.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH V4 1/9] block: allow bvec for zone append get pages
  2020-12-02  6:22 ` Chaitanya Kulkarni
@ 2020-12-02  6:22   ` Chaitanya Kulkarni
  -1 siblings, 0 replies; 54+ messages in thread
From: Chaitanya Kulkarni @ 2020-12-02  6:22 UTC (permalink / raw)
  To: linux-block, linux-nvme
  Cc: sagi, hch, damien.lemoal, johannes.thumshirn, Chaitanya Kulkarni

Remove the bvec check in the bio_iov_iter_get_pages() for
REQ_OP_ZONE_APPEND so that we can reuse the code and build iter from
bvec.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 block/bio.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index fa01bef35bb1..54b532bb39f3 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1110,8 +1110,6 @@ int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 
 	do {
 		if (bio_op(bio) == REQ_OP_ZONE_APPEND) {
-			if (WARN_ON_ONCE(is_bvec))
-				return -EINVAL;
 			ret = __bio_iov_append_get_pages(bio, iter);
 		} else {
 			if (is_bvec)
-- 
2.22.1


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH V4 1/9] block: allow bvec for zone append get pages
@ 2020-12-02  6:22   ` Chaitanya Kulkarni
  0 siblings, 0 replies; 54+ messages in thread
From: Chaitanya Kulkarni @ 2020-12-02  6:22 UTC (permalink / raw)
  To: linux-block, linux-nvme
  Cc: johannes.thumshirn, damien.lemoal, sagi, Chaitanya Kulkarni, hch

Remove the bvec check in the bio_iov_iter_get_pages() for
REQ_OP_ZONE_APPEND so that we can reuse the code and build iter from
bvec.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 block/bio.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index fa01bef35bb1..54b532bb39f3 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1110,8 +1110,6 @@ int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 
 	do {
 		if (bio_op(bio) == REQ_OP_ZONE_APPEND) {
-			if (WARN_ON_ONCE(is_bvec))
-				return -EINVAL;
 			ret = __bio_iov_append_get_pages(bio, iter);
 		} else {
 			if (is_bvec)
-- 
2.22.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH V4 2/9] nvmet: add ZNS support for bdev-ns
  2020-12-02  6:22 ` Chaitanya Kulkarni
@ 2020-12-02  6:22   ` Chaitanya Kulkarni
  -1 siblings, 0 replies; 54+ messages in thread
From: Chaitanya Kulkarni @ 2020-12-02  6:22 UTC (permalink / raw)
  To: linux-block, linux-nvme
  Cc: sagi, hch, damien.lemoal, johannes.thumshirn, Chaitanya Kulkarni

Add zns-bdev-config, id-ctrl, id-ns, zns-cmd-effects, zone-mgmt-send,
zone-mgmt-recv and zone-append handlers for NVMeOF target to enable ZNS
support for bdev.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 drivers/nvme/target/Makefile      |   2 +-
 drivers/nvme/target/admin-cmd.c   |   4 +-
 drivers/nvme/target/io-cmd-file.c |   2 +-
 drivers/nvme/target/nvmet.h       |  19 ++
 drivers/nvme/target/zns.c         | 417 ++++++++++++++++++++++++++++++
 5 files changed, 440 insertions(+), 4 deletions(-)
 create mode 100644 drivers/nvme/target/zns.c

diff --git a/drivers/nvme/target/Makefile b/drivers/nvme/target/Makefile
index ebf91fc4c72e..d050f829b43a 100644
--- a/drivers/nvme/target/Makefile
+++ b/drivers/nvme/target/Makefile
@@ -10,7 +10,7 @@ obj-$(CONFIG_NVME_TARGET_FCLOOP)	+= nvme-fcloop.o
 obj-$(CONFIG_NVME_TARGET_TCP)		+= nvmet-tcp.o
 
 nvmet-y		+= core.o configfs.o admin-cmd.o fabrics-cmd.o \
-			discovery.o io-cmd-file.o io-cmd-bdev.o
+		   zns.o discovery.o io-cmd-file.o io-cmd-bdev.o
 nvmet-$(CONFIG_NVME_TARGET_PASSTHRU)	+= passthru.o
 nvme-loop-y	+= loop.o
 nvmet-rdma-y	+= rdma.o
diff --git a/drivers/nvme/target/admin-cmd.c b/drivers/nvme/target/admin-cmd.c
index dca34489a1dc..509fd8dcca0c 100644
--- a/drivers/nvme/target/admin-cmd.c
+++ b/drivers/nvme/target/admin-cmd.c
@@ -579,8 +579,8 @@ static void nvmet_execute_identify_nslist(struct nvmet_req *req)
 	nvmet_req_complete(req, status);
 }
 
-static u16 nvmet_copy_ns_identifier(struct nvmet_req *req, u8 type, u8 len,
-				    void *id, off_t *off)
+u16 nvmet_copy_ns_identifier(struct nvmet_req *req, u8 type, u8 len,
+			     void *id, off_t *off)
 {
 	struct nvme_ns_id_desc desc = {
 		.nidt = type,
diff --git a/drivers/nvme/target/io-cmd-file.c b/drivers/nvme/target/io-cmd-file.c
index 0abbefd9925e..2bd10960fa50 100644
--- a/drivers/nvme/target/io-cmd-file.c
+++ b/drivers/nvme/target/io-cmd-file.c
@@ -89,7 +89,7 @@ int nvmet_file_ns_enable(struct nvmet_ns *ns)
 	return ret;
 }
 
-static void nvmet_file_init_bvec(struct bio_vec *bv, struct scatterlist *sg)
+void nvmet_file_init_bvec(struct bio_vec *bv, struct scatterlist *sg)
 {
 	bv->bv_page = sg_page(sg);
 	bv->bv_offset = sg->offset;
diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h
index 592763732065..eee7866ae512 100644
--- a/drivers/nvme/target/nvmet.h
+++ b/drivers/nvme/target/nvmet.h
@@ -81,6 +81,10 @@ struct nvmet_ns {
 	struct pci_dev		*p2p_dev;
 	int			pi_type;
 	int			metadata_size;
+#ifdef CONFIG_BLK_DEV_ZONED
+	struct nvme_id_ns_zns	id_zns;
+	unsigned int		zasl;
+#endif
 };
 
 static inline struct nvmet_ns *to_nvmet_ns(struct config_item *item)
@@ -251,6 +255,10 @@ struct nvmet_subsys {
 	unsigned int		admin_timeout;
 	unsigned int		io_timeout;
 #endif /* CONFIG_NVME_TARGET_PASSTHRU */
+
+#ifdef CONFIG_BLK_DEV_ZONED
+	struct nvme_id_ctrl_zns	id_ctrl_zns;
+#endif
 };
 
 static inline struct nvmet_subsys *to_subsys(struct config_item *item)
@@ -603,4 +611,15 @@ static inline bool nvmet_ns_has_pi(struct nvmet_ns *ns)
 	return ns->pi_type && ns->metadata_size == sizeof(struct t10_pi_tuple);
 }
 
+void nvmet_execute_identify_cns_cs_ctrl(struct nvmet_req *req);
+void nvmet_execute_identify_cns_cs_ns(struct nvmet_req *req);
+u16 nvmet_process_zns_cis(struct nvmet_req *req, off_t *off);
+bool nvmet_bdev_zns_enable(struct nvmet_ns *ns);
+void nvmet_bdev_execute_zone_mgmt_recv(struct nvmet_req *req);
+void nvmet_bdev_execute_zone_mgmt_send(struct nvmet_req *req);
+void nvmet_bdev_execute_zone_append(struct nvmet_req *req);
+void nvmet_zns_add_cmd_effects(struct nvme_effects_log *log);
+u16 nvmet_copy_ns_identifier(struct nvmet_req *req, u8 type, u8 len,
+			     void *id, off_t *off);
+void nvmet_file_init_bvec(struct bio_vec *bv, struct scatterlist *sg);
 #endif /* _NVMET_H */
diff --git a/drivers/nvme/target/zns.c b/drivers/nvme/target/zns.c
new file mode 100644
index 000000000000..300e85a02854
--- /dev/null
+++ b/drivers/nvme/target/zns.c
@@ -0,0 +1,417 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * NVMe ZNS-ZBD command implementation.
+ * Copyright (c) 2020-2021 HGST, a Western Digital Company.
+ */
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+#include <linux/uio.h>
+#include <linux/nvme.h>
+#include <linux/xarray.h>
+#include <linux/blkdev.h>
+#include <linux/module.h>
+#include "nvmet.h"
+
+#ifdef CONFIG_BLK_DEV_ZONED
+#define NVMET_MPSMIN_SHIFT	12
+
+static u16 nvmet_bdev_zns_checks(struct nvmet_req *req)
+{
+	u16 status = 0;
+
+	if (!bdev_is_zoned(req->ns->bdev)) {
+		status = NVME_SC_INVALID_NS | NVME_SC_DNR;
+		goto out;
+	}
+
+	if (req->cmd->zmr.zra != NVME_ZRA_ZONE_REPORT) {
+		status = NVME_SC_INVALID_FIELD;
+		goto out;
+	}
+
+	if (req->cmd->zmr.zrasf != NVME_ZRASF_ZONE_REPORT_ALL) {
+		status = NVME_SC_INVALID_FIELD;
+		goto out;
+	}
+
+	if (req->cmd->zmr.pr != NVME_REPORT_ZONE_PARTIAL)
+		status = NVME_SC_INVALID_FIELD;
+out:
+	return status;
+}
+
+static inline struct block_device *nvmet_bdev(struct nvmet_req *req)
+{
+	return req->ns->bdev;
+}
+
+static inline u64 nvmet_sect_to_lba(struct nvmet_ns *ns, sector_t sect)
+{
+	return sect >> (ns->blksize_shift - SECTOR_SHIFT);
+}
+
+static inline sector_t nvmet_lba_to_sect(struct nvmet_ns *ns, __le64 lba)
+{
+	return le64_to_cpu(lba) << (ns->blksize_shift - SECTOR_SHIFT);
+}
+
+/*
+ *  ZNS related command implementation and helpers.
+ */
+
+u16 nvmet_process_zns_cis(struct nvmet_req *req, off_t *off)
+{
+	u16 nvme_cis_zns = NVME_CSI_ZNS;
+
+	if (!bdev_is_zoned(nvmet_bdev(req)))
+		return NVME_SC_SUCCESS;
+
+	return nvmet_copy_ns_identifier(req, NVME_NIDT_CSI, NVME_NIDT_CSI_LEN,
+					&nvme_cis_zns, off);
+}
+
+void nvmet_zns_add_cmd_effects(struct nvme_effects_log *log)
+{
+	log->iocs[nvme_cmd_zone_append]		= cpu_to_le32(1 << 0);
+	log->iocs[nvme_cmd_zone_mgmt_send]	= cpu_to_le32(1 << 0);
+	log->iocs[nvme_cmd_zone_mgmt_recv]	= cpu_to_le32(1 << 0);
+}
+
+static inline bool nvmet_bdev_validate_zns_zones(struct nvmet_ns *ns)
+{
+	if (ns->bdev->bd_disk->queue->conv_zones_bitmap) {
+		pr_err("block devices with conventional zones are not supported.");
+		return false;
+	}
+
+	return !(get_capacity(ns->bdev->bd_disk) &
+			(bdev_zone_sectors(ns->bdev) - 1));
+}
+
+static inline u8 nvmet_zasl(unsigned int zone_append_sects)
+{
+	/*
+	 * Zone Append Size Limit is the value experessed in the units
+	 * of minimum memory page size (i.e. 12) and is reported power of 2.
+	 */
+	return ilog2((zone_append_sects << 9) >> NVMET_MPSMIN_SHIFT);
+}
+
+static inline void nvmet_zns_update_zasl(struct nvmet_ns *ns)
+{
+	struct request_queue *q = ns->bdev->bd_disk->queue;
+	struct nvmet_ns *ins;
+	unsigned long idx;
+	u8 min_zasl;
+
+	/*
+	 * Calculate new ctrl->zasl value when enabling the new ns. This value
+	 * has to be the minimum of the max_zone_append values from available
+	 * namespaces.
+	 */
+	min_zasl = ns->zasl = nvmet_zasl(queue_max_zone_append_sectors(q));
+
+	xa_for_each(&(ns->subsys->namespaces), idx, ins) {
+		struct request_queue *iq = ins->bdev->bd_disk->queue;
+		unsigned int imax_za_sects = queue_max_zone_append_sectors(iq);
+		u8 izasl = nvmet_zasl(imax_za_sects);
+
+		if (!bdev_is_zoned(ins->bdev))
+			continue;
+
+		min_zasl = min_zasl > izasl ? izasl : min_zasl;
+	}
+
+	ns->subsys->id_ctrl_zns.zasl = min_zasl;
+}
+
+bool nvmet_bdev_zns_enable(struct nvmet_ns *ns)
+{
+	if (!nvmet_bdev_validate_zns_zones(ns))
+		return false;
+
+	/*
+	 * For ZBC and ZAC devices, writes into sequential zones must be aligned
+	 * to the device physical block size. So use this value as the logical
+	 * block size to avoid errors.
+	 */
+	ns->blksize_shift = blksize_bits(bdev_physical_block_size(ns->bdev));
+
+	nvmet_zns_update_zasl(ns);
+
+	return true;
+}
+
+/*
+ * ZNS related Admin and I/O command handlers.
+ */
+void nvmet_execute_identify_cns_cs_ctrl(struct nvmet_req *req)
+{
+	u8 zasl = req->sq->ctrl->subsys->id_ctrl_zns.zasl;
+	struct nvmet_ctrl *ctrl = req->sq->ctrl;
+	struct nvme_id_ctrl_zns *id;
+	u16 status;
+
+	id = kzalloc(sizeof(*id), GFP_KERNEL);
+	if (!id) {
+		status = NVME_SC_INTERNAL;
+		goto out;
+	}
+
+	if (ctrl->ops->get_mdts)
+		id->zasl = min_t(u8, ctrl->ops->get_mdts(ctrl), zasl);
+	else
+		id->zasl = zasl;
+
+	status = nvmet_copy_to_sgl(req, 0, id, sizeof(*id));
+
+	kfree(id);
+out:
+	nvmet_req_complete(req, status);
+}
+
+void nvmet_execute_identify_cns_cs_ns(struct nvmet_req *req)
+{
+	struct nvme_id_ns_zns *id_zns;
+	u16 status = 0;
+	u64 zsze;
+
+	if (le32_to_cpu(req->cmd->identify.nsid) == NVME_NSID_ALL) {
+		req->error_loc = offsetof(struct nvme_identify, nsid);
+		status = NVME_SC_INVALID_NS | NVME_SC_DNR;
+		goto out;
+	}
+
+	id_zns = kzalloc(sizeof(*id_zns), GFP_KERNEL);
+	if (!id_zns) {
+		status = NVME_SC_INTERNAL;
+		goto out;
+	}
+
+	req->ns = nvmet_find_namespace(req->sq->ctrl, req->cmd->identify.nsid);
+	if (!req->ns) {
+		status = NVME_SC_INTERNAL;
+		goto done;
+	}
+
+	if (!bdev_is_zoned(nvmet_bdev(req))) {
+		req->error_loc = offsetof(struct nvme_identify, nsid);
+		status = NVME_SC_INVALID_NS | NVME_SC_DNR;
+		goto done;
+	}
+
+	nvmet_ns_revalidate(req->ns);
+	zsze = (bdev_zone_sectors(nvmet_bdev(req)) << 9) >>
+					req->ns->blksize_shift;
+	id_zns->lbafe[0].zsze = cpu_to_le64(zsze);
+	id_zns->mor = cpu_to_le32(bdev_max_open_zones(nvmet_bdev(req)));
+	id_zns->mar = cpu_to_le32(bdev_max_active_zones(nvmet_bdev(req)));
+
+done:
+	status = nvmet_copy_to_sgl(req, 0, id_zns, sizeof(*id_zns));
+	kfree(id_zns);
+out:
+	nvmet_req_complete(req, status);
+}
+
+struct nvmet_report_zone_data {
+	struct nvmet_ns *ns;
+	struct nvme_zone_report *rz;
+};
+
+static int nvmet_bdev_report_zone_cb(struct blk_zone *z, unsigned int idx,
+				     void *data)
+{
+	struct nvmet_report_zone_data *report_zone_data = data;
+	struct nvme_zone_descriptor *entries = report_zone_data->rz->entries;
+	struct nvmet_ns *ns = report_zone_data->ns;
+
+	entries[idx].zcap = cpu_to_le64(nvmet_sect_to_lba(ns, z->capacity));
+	entries[idx].zslba = cpu_to_le64(nvmet_sect_to_lba(ns, z->start));
+	entries[idx].wp = cpu_to_le64(nvmet_sect_to_lba(ns, z->wp));
+	entries[idx].za = z->reset ? 1 << 2 : 0;
+	entries[idx].zt = z->type;
+	entries[idx].zs = z->cond << 4;
+
+	return 0;
+}
+
+void nvmet_bdev_execute_zone_mgmt_recv(struct nvmet_req *req)
+{
+	u64 bufsize = (le32_to_cpu(req->cmd->zmr.numd) + 1) << 2;
+	struct nvmet_report_zone_data data = { .ns = req->ns };
+	struct nvme_zone_mgmt_recv_cmd *zmr = &req->cmd->zmr;
+	sector_t sect = nvmet_lba_to_sect(req->ns, le64_to_cpu(zmr->slba));
+	unsigned int nr_zones;
+	int reported_zones;
+	u16 status;
+
+	nr_zones = (bufsize - sizeof(struct nvme_zone_report)) /
+			sizeof(struct nvme_zone_descriptor);
+
+	status = nvmet_bdev_zns_checks(req);
+	if (status)
+		goto out;
+
+	data.rz = __vmalloc(bufsize, GFP_KERNEL | __GFP_NORETRY);
+	if (!data.rz) {
+		status = NVME_SC_INTERNAL;
+		goto out;
+	}
+
+	reported_zones = blkdev_report_zones(nvmet_bdev(req), sect, nr_zones,
+					     nvmet_bdev_report_zone_cb,
+					     &data);
+	if (reported_zones < 0) {
+		status = NVME_SC_INTERNAL;
+		goto out_free_report_zones;
+	}
+
+	data.rz->nr_zones = cpu_to_le64(reported_zones);
+
+	status = nvmet_copy_to_sgl(req, 0, data.rz, bufsize);
+
+out_free_report_zones:
+	kvfree(data.rz);
+out:
+	nvmet_req_complete(req, status);
+}
+
+void nvmet_bdev_execute_zone_mgmt_send(struct nvmet_req *req)
+{
+	sector_t nr_sect = bdev_zone_sectors(nvmet_bdev(req));
+	struct nvme_zone_mgmt_send_cmd *c = &req->cmd->zms;
+	enum req_opf op = REQ_OP_LAST;
+	u16 status = NVME_SC_SUCCESS;
+	sector_t sect;
+	int ret;
+
+	sect = nvmet_lba_to_sect(req->ns, le64_to_cpu(req->cmd->zms.slba));
+
+	if (c->select_all)
+		nr_sect = get_capacity(nvmet_bdev(req)->bd_disk);
+
+	switch (c->zsa) {
+	case NVME_ZONE_OPEN:
+		op = REQ_OP_ZONE_OPEN;
+		break;
+	case NVME_ZONE_CLOSE:
+		op = REQ_OP_ZONE_CLOSE;
+		break;
+	case NVME_ZONE_FINISH:
+		op = REQ_OP_ZONE_FINISH;
+		break;
+	case NVME_ZONE_RESET:
+		op = REQ_OP_ZONE_RESET;
+		break;
+	default:
+		status = NVME_SC_INVALID_FIELD;
+		goto out;
+	}
+
+	ret = blkdev_zone_mgmt(nvmet_bdev(req), op, sect, nr_sect, GFP_KERNEL);
+	if (ret)
+		status = NVME_SC_INTERNAL;
+out:
+	nvmet_req_complete(req, status);
+}
+
+void nvmet_bdev_execute_zone_append(struct nvmet_req *req)
+{
+	u64 slba = le64_to_cpu(req->cmd->rw.slba);
+	sector_t sect = nvmet_lba_to_sect(req->ns, slba);
+	unsigned long bv_cnt = req->sg_cnt;
+	u16 status = NVME_SC_SUCCESS;
+	size_t mapped_data_len = 0;
+	int sg_cnt = req->sg_cnt;
+	struct scatterlist *sg;
+	struct iov_iter iter;
+	struct bio_vec *bvec;
+	size_t mapped_cnt;
+	struct bio *bio;
+	int ret;
+
+	if (!nvmet_check_transfer_len(req, nvmet_rw_data_len(req)))
+		return;
+
+	if (!req->sg_cnt)
+		goto out;
+
+	if (WARN_ON(req->sg_cnt > BIO_MAX_PAGES)) {
+		status = NVME_SC_INTERNAL | NVME_SC_DNR;
+		goto out;
+	}
+
+	bvec = kmalloc_array(bv_cnt, sizeof(*bvec), GFP_KERNEL);
+	if (!bvec) {
+		status = NVME_SC_INTERNAL | NVME_SC_DNR;
+		goto out;
+	}
+
+	for_each_sg(req->sg, sg, req->sg_cnt, mapped_cnt) {
+		nvmet_file_init_bvec(&bvec[mapped_cnt], sg);
+		mapped_data_len += bvec[mapped_cnt].bv_len;
+		sg_cnt--;
+		if (mapped_cnt == bv_cnt)
+			break;
+	}
+
+	if (WARN_ON(sg_cnt)) {
+		status = NVME_SC_INTERNAL | NVME_SC_DNR;
+		goto out;
+	}
+
+	iov_iter_bvec(&iter, WRITE, bvec, mapped_cnt, mapped_data_len);
+
+	bio = bio_alloc(GFP_KERNEL, bv_cnt);
+	bio_set_dev(bio, nvmet_bdev(req));
+	bio->bi_iter.bi_sector = sect;
+	bio->bi_opf = REQ_OP_ZONE_APPEND | REQ_SYNC | REQ_IDLE;
+
+	ret = bio_iov_iter_get_pages(bio, &iter);
+	if (unlikely(ret)) {
+		status = NVME_SC_INTERNAL | NVME_SC_DNR;
+		bio_io_error(bio);
+		goto bvec_free;
+	}
+
+	ret = submit_bio_wait(bio);
+	status = ret < 0 ? NVME_SC_INTERNAL : status;
+	bio_put(bio);
+
+	sect += (mapped_data_len >> 9);
+	req->cqe->result.u64 = le64_to_cpu(nvmet_sect_to_lba(req->ns, sect));
+
+bvec_free:
+	kfree(bvec);
+out:
+	nvmet_req_complete(req, status);
+}
+
+#else  /* CONFIG_BLK_DEV_ZONED */
+void nvmet_execute_identify_cns_cs_ctrl(struct nvmet_req *req)
+{
+}
+void nvmet_execute_identify_cns_cs_ns(struct nvmet_req *req)
+{
+}
+u16 nvmet_process_zns_cis(struct nvmet_req *req, off_t *off)
+{
+	return 0;
+}
+bool nvmet_bdev_zns_config(struct nvmet_ns *ns)
+{
+	return false;
+}
+void nvmet_bdev_execute_zone_mgmt_recv(struct nvmet_req *req)
+{
+}
+void nvmet_bdev_execute_zone_mgmt_send(struct nvmet_req *req)
+{
+}
+void nvmet_bdev_execute_zone_append(struct nvmet_req *req)
+{
+}
+void nvmet_zns_add_cmd_effects(struct nvme_effects_log *log)
+{
+}
+#endif /* CONFIG_BLK_DEV_ZONED */
-- 
2.22.1


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH V4 2/9] nvmet: add ZNS support for bdev-ns
@ 2020-12-02  6:22   ` Chaitanya Kulkarni
  0 siblings, 0 replies; 54+ messages in thread
From: Chaitanya Kulkarni @ 2020-12-02  6:22 UTC (permalink / raw)
  To: linux-block, linux-nvme
  Cc: johannes.thumshirn, damien.lemoal, sagi, Chaitanya Kulkarni, hch

Add zns-bdev-config, id-ctrl, id-ns, zns-cmd-effects, zone-mgmt-send,
zone-mgmt-recv and zone-append handlers for NVMeOF target to enable ZNS
support for bdev.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 drivers/nvme/target/Makefile      |   2 +-
 drivers/nvme/target/admin-cmd.c   |   4 +-
 drivers/nvme/target/io-cmd-file.c |   2 +-
 drivers/nvme/target/nvmet.h       |  19 ++
 drivers/nvme/target/zns.c         | 417 ++++++++++++++++++++++++++++++
 5 files changed, 440 insertions(+), 4 deletions(-)
 create mode 100644 drivers/nvme/target/zns.c

diff --git a/drivers/nvme/target/Makefile b/drivers/nvme/target/Makefile
index ebf91fc4c72e..d050f829b43a 100644
--- a/drivers/nvme/target/Makefile
+++ b/drivers/nvme/target/Makefile
@@ -10,7 +10,7 @@ obj-$(CONFIG_NVME_TARGET_FCLOOP)	+= nvme-fcloop.o
 obj-$(CONFIG_NVME_TARGET_TCP)		+= nvmet-tcp.o
 
 nvmet-y		+= core.o configfs.o admin-cmd.o fabrics-cmd.o \
-			discovery.o io-cmd-file.o io-cmd-bdev.o
+		   zns.o discovery.o io-cmd-file.o io-cmd-bdev.o
 nvmet-$(CONFIG_NVME_TARGET_PASSTHRU)	+= passthru.o
 nvme-loop-y	+= loop.o
 nvmet-rdma-y	+= rdma.o
diff --git a/drivers/nvme/target/admin-cmd.c b/drivers/nvme/target/admin-cmd.c
index dca34489a1dc..509fd8dcca0c 100644
--- a/drivers/nvme/target/admin-cmd.c
+++ b/drivers/nvme/target/admin-cmd.c
@@ -579,8 +579,8 @@ static void nvmet_execute_identify_nslist(struct nvmet_req *req)
 	nvmet_req_complete(req, status);
 }
 
-static u16 nvmet_copy_ns_identifier(struct nvmet_req *req, u8 type, u8 len,
-				    void *id, off_t *off)
+u16 nvmet_copy_ns_identifier(struct nvmet_req *req, u8 type, u8 len,
+			     void *id, off_t *off)
 {
 	struct nvme_ns_id_desc desc = {
 		.nidt = type,
diff --git a/drivers/nvme/target/io-cmd-file.c b/drivers/nvme/target/io-cmd-file.c
index 0abbefd9925e..2bd10960fa50 100644
--- a/drivers/nvme/target/io-cmd-file.c
+++ b/drivers/nvme/target/io-cmd-file.c
@@ -89,7 +89,7 @@ int nvmet_file_ns_enable(struct nvmet_ns *ns)
 	return ret;
 }
 
-static void nvmet_file_init_bvec(struct bio_vec *bv, struct scatterlist *sg)
+void nvmet_file_init_bvec(struct bio_vec *bv, struct scatterlist *sg)
 {
 	bv->bv_page = sg_page(sg);
 	bv->bv_offset = sg->offset;
diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h
index 592763732065..eee7866ae512 100644
--- a/drivers/nvme/target/nvmet.h
+++ b/drivers/nvme/target/nvmet.h
@@ -81,6 +81,10 @@ struct nvmet_ns {
 	struct pci_dev		*p2p_dev;
 	int			pi_type;
 	int			metadata_size;
+#ifdef CONFIG_BLK_DEV_ZONED
+	struct nvme_id_ns_zns	id_zns;
+	unsigned int		zasl;
+#endif
 };
 
 static inline struct nvmet_ns *to_nvmet_ns(struct config_item *item)
@@ -251,6 +255,10 @@ struct nvmet_subsys {
 	unsigned int		admin_timeout;
 	unsigned int		io_timeout;
 #endif /* CONFIG_NVME_TARGET_PASSTHRU */
+
+#ifdef CONFIG_BLK_DEV_ZONED
+	struct nvme_id_ctrl_zns	id_ctrl_zns;
+#endif
 };
 
 static inline struct nvmet_subsys *to_subsys(struct config_item *item)
@@ -603,4 +611,15 @@ static inline bool nvmet_ns_has_pi(struct nvmet_ns *ns)
 	return ns->pi_type && ns->metadata_size == sizeof(struct t10_pi_tuple);
 }
 
+void nvmet_execute_identify_cns_cs_ctrl(struct nvmet_req *req);
+void nvmet_execute_identify_cns_cs_ns(struct nvmet_req *req);
+u16 nvmet_process_zns_cis(struct nvmet_req *req, off_t *off);
+bool nvmet_bdev_zns_enable(struct nvmet_ns *ns);
+void nvmet_bdev_execute_zone_mgmt_recv(struct nvmet_req *req);
+void nvmet_bdev_execute_zone_mgmt_send(struct nvmet_req *req);
+void nvmet_bdev_execute_zone_append(struct nvmet_req *req);
+void nvmet_zns_add_cmd_effects(struct nvme_effects_log *log);
+u16 nvmet_copy_ns_identifier(struct nvmet_req *req, u8 type, u8 len,
+			     void *id, off_t *off);
+void nvmet_file_init_bvec(struct bio_vec *bv, struct scatterlist *sg);
 #endif /* _NVMET_H */
diff --git a/drivers/nvme/target/zns.c b/drivers/nvme/target/zns.c
new file mode 100644
index 000000000000..300e85a02854
--- /dev/null
+++ b/drivers/nvme/target/zns.c
@@ -0,0 +1,417 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * NVMe ZNS-ZBD command implementation.
+ * Copyright (c) 2020-2021 HGST, a Western Digital Company.
+ */
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+#include <linux/uio.h>
+#include <linux/nvme.h>
+#include <linux/xarray.h>
+#include <linux/blkdev.h>
+#include <linux/module.h>
+#include "nvmet.h"
+
+#ifdef CONFIG_BLK_DEV_ZONED
+#define NVMET_MPSMIN_SHIFT	12
+
+static u16 nvmet_bdev_zns_checks(struct nvmet_req *req)
+{
+	u16 status = 0;
+
+	if (!bdev_is_zoned(req->ns->bdev)) {
+		status = NVME_SC_INVALID_NS | NVME_SC_DNR;
+		goto out;
+	}
+
+	if (req->cmd->zmr.zra != NVME_ZRA_ZONE_REPORT) {
+		status = NVME_SC_INVALID_FIELD;
+		goto out;
+	}
+
+	if (req->cmd->zmr.zrasf != NVME_ZRASF_ZONE_REPORT_ALL) {
+		status = NVME_SC_INVALID_FIELD;
+		goto out;
+	}
+
+	if (req->cmd->zmr.pr != NVME_REPORT_ZONE_PARTIAL)
+		status = NVME_SC_INVALID_FIELD;
+out:
+	return status;
+}
+
+static inline struct block_device *nvmet_bdev(struct nvmet_req *req)
+{
+	return req->ns->bdev;
+}
+
+static inline u64 nvmet_sect_to_lba(struct nvmet_ns *ns, sector_t sect)
+{
+	return sect >> (ns->blksize_shift - SECTOR_SHIFT);
+}
+
+static inline sector_t nvmet_lba_to_sect(struct nvmet_ns *ns, __le64 lba)
+{
+	return le64_to_cpu(lba) << (ns->blksize_shift - SECTOR_SHIFT);
+}
+
+/*
+ *  ZNS related command implementation and helpers.
+ */
+
+u16 nvmet_process_zns_cis(struct nvmet_req *req, off_t *off)
+{
+	u16 nvme_cis_zns = NVME_CSI_ZNS;
+
+	if (!bdev_is_zoned(nvmet_bdev(req)))
+		return NVME_SC_SUCCESS;
+
+	return nvmet_copy_ns_identifier(req, NVME_NIDT_CSI, NVME_NIDT_CSI_LEN,
+					&nvme_cis_zns, off);
+}
+
+void nvmet_zns_add_cmd_effects(struct nvme_effects_log *log)
+{
+	log->iocs[nvme_cmd_zone_append]		= cpu_to_le32(1 << 0);
+	log->iocs[nvme_cmd_zone_mgmt_send]	= cpu_to_le32(1 << 0);
+	log->iocs[nvme_cmd_zone_mgmt_recv]	= cpu_to_le32(1 << 0);
+}
+
+static inline bool nvmet_bdev_validate_zns_zones(struct nvmet_ns *ns)
+{
+	if (ns->bdev->bd_disk->queue->conv_zones_bitmap) {
+		pr_err("block devices with conventional zones are not supported.");
+		return false;
+	}
+
+	return !(get_capacity(ns->bdev->bd_disk) &
+			(bdev_zone_sectors(ns->bdev) - 1));
+}
+
+static inline u8 nvmet_zasl(unsigned int zone_append_sects)
+{
+	/*
+	 * Zone Append Size Limit is the value experessed in the units
+	 * of minimum memory page size (i.e. 12) and is reported power of 2.
+	 */
+	return ilog2((zone_append_sects << 9) >> NVMET_MPSMIN_SHIFT);
+}
+
+static inline void nvmet_zns_update_zasl(struct nvmet_ns *ns)
+{
+	struct request_queue *q = ns->bdev->bd_disk->queue;
+	struct nvmet_ns *ins;
+	unsigned long idx;
+	u8 min_zasl;
+
+	/*
+	 * Calculate new ctrl->zasl value when enabling the new ns. This value
+	 * has to be the minimum of the max_zone_append values from available
+	 * namespaces.
+	 */
+	min_zasl = ns->zasl = nvmet_zasl(queue_max_zone_append_sectors(q));
+
+	xa_for_each(&(ns->subsys->namespaces), idx, ins) {
+		struct request_queue *iq = ins->bdev->bd_disk->queue;
+		unsigned int imax_za_sects = queue_max_zone_append_sectors(iq);
+		u8 izasl = nvmet_zasl(imax_za_sects);
+
+		if (!bdev_is_zoned(ins->bdev))
+			continue;
+
+		min_zasl = min_zasl > izasl ? izasl : min_zasl;
+	}
+
+	ns->subsys->id_ctrl_zns.zasl = min_zasl;
+}
+
+bool nvmet_bdev_zns_enable(struct nvmet_ns *ns)
+{
+	if (!nvmet_bdev_validate_zns_zones(ns))
+		return false;
+
+	/*
+	 * For ZBC and ZAC devices, writes into sequential zones must be aligned
+	 * to the device physical block size. So use this value as the logical
+	 * block size to avoid errors.
+	 */
+	ns->blksize_shift = blksize_bits(bdev_physical_block_size(ns->bdev));
+
+	nvmet_zns_update_zasl(ns);
+
+	return true;
+}
+
+/*
+ * ZNS related Admin and I/O command handlers.
+ */
+void nvmet_execute_identify_cns_cs_ctrl(struct nvmet_req *req)
+{
+	u8 zasl = req->sq->ctrl->subsys->id_ctrl_zns.zasl;
+	struct nvmet_ctrl *ctrl = req->sq->ctrl;
+	struct nvme_id_ctrl_zns *id;
+	u16 status;
+
+	id = kzalloc(sizeof(*id), GFP_KERNEL);
+	if (!id) {
+		status = NVME_SC_INTERNAL;
+		goto out;
+	}
+
+	if (ctrl->ops->get_mdts)
+		id->zasl = min_t(u8, ctrl->ops->get_mdts(ctrl), zasl);
+	else
+		id->zasl = zasl;
+
+	status = nvmet_copy_to_sgl(req, 0, id, sizeof(*id));
+
+	kfree(id);
+out:
+	nvmet_req_complete(req, status);
+}
+
+void nvmet_execute_identify_cns_cs_ns(struct nvmet_req *req)
+{
+	struct nvme_id_ns_zns *id_zns;
+	u16 status = 0;
+	u64 zsze;
+
+	if (le32_to_cpu(req->cmd->identify.nsid) == NVME_NSID_ALL) {
+		req->error_loc = offsetof(struct nvme_identify, nsid);
+		status = NVME_SC_INVALID_NS | NVME_SC_DNR;
+		goto out;
+	}
+
+	id_zns = kzalloc(sizeof(*id_zns), GFP_KERNEL);
+	if (!id_zns) {
+		status = NVME_SC_INTERNAL;
+		goto out;
+	}
+
+	req->ns = nvmet_find_namespace(req->sq->ctrl, req->cmd->identify.nsid);
+	if (!req->ns) {
+		status = NVME_SC_INTERNAL;
+		goto done;
+	}
+
+	if (!bdev_is_zoned(nvmet_bdev(req))) {
+		req->error_loc = offsetof(struct nvme_identify, nsid);
+		status = NVME_SC_INVALID_NS | NVME_SC_DNR;
+		goto done;
+	}
+
+	nvmet_ns_revalidate(req->ns);
+	zsze = (bdev_zone_sectors(nvmet_bdev(req)) << 9) >>
+					req->ns->blksize_shift;
+	id_zns->lbafe[0].zsze = cpu_to_le64(zsze);
+	id_zns->mor = cpu_to_le32(bdev_max_open_zones(nvmet_bdev(req)));
+	id_zns->mar = cpu_to_le32(bdev_max_active_zones(nvmet_bdev(req)));
+
+done:
+	status = nvmet_copy_to_sgl(req, 0, id_zns, sizeof(*id_zns));
+	kfree(id_zns);
+out:
+	nvmet_req_complete(req, status);
+}
+
+struct nvmet_report_zone_data {
+	struct nvmet_ns *ns;
+	struct nvme_zone_report *rz;
+};
+
+static int nvmet_bdev_report_zone_cb(struct blk_zone *z, unsigned int idx,
+				     void *data)
+{
+	struct nvmet_report_zone_data *report_zone_data = data;
+	struct nvme_zone_descriptor *entries = report_zone_data->rz->entries;
+	struct nvmet_ns *ns = report_zone_data->ns;
+
+	entries[idx].zcap = cpu_to_le64(nvmet_sect_to_lba(ns, z->capacity));
+	entries[idx].zslba = cpu_to_le64(nvmet_sect_to_lba(ns, z->start));
+	entries[idx].wp = cpu_to_le64(nvmet_sect_to_lba(ns, z->wp));
+	entries[idx].za = z->reset ? 1 << 2 : 0;
+	entries[idx].zt = z->type;
+	entries[idx].zs = z->cond << 4;
+
+	return 0;
+}
+
+void nvmet_bdev_execute_zone_mgmt_recv(struct nvmet_req *req)
+{
+	u64 bufsize = (le32_to_cpu(req->cmd->zmr.numd) + 1) << 2;
+	struct nvmet_report_zone_data data = { .ns = req->ns };
+	struct nvme_zone_mgmt_recv_cmd *zmr = &req->cmd->zmr;
+	sector_t sect = nvmet_lba_to_sect(req->ns, le64_to_cpu(zmr->slba));
+	unsigned int nr_zones;
+	int reported_zones;
+	u16 status;
+
+	nr_zones = (bufsize - sizeof(struct nvme_zone_report)) /
+			sizeof(struct nvme_zone_descriptor);
+
+	status = nvmet_bdev_zns_checks(req);
+	if (status)
+		goto out;
+
+	data.rz = __vmalloc(bufsize, GFP_KERNEL | __GFP_NORETRY);
+	if (!data.rz) {
+		status = NVME_SC_INTERNAL;
+		goto out;
+	}
+
+	reported_zones = blkdev_report_zones(nvmet_bdev(req), sect, nr_zones,
+					     nvmet_bdev_report_zone_cb,
+					     &data);
+	if (reported_zones < 0) {
+		status = NVME_SC_INTERNAL;
+		goto out_free_report_zones;
+	}
+
+	data.rz->nr_zones = cpu_to_le64(reported_zones);
+
+	status = nvmet_copy_to_sgl(req, 0, data.rz, bufsize);
+
+out_free_report_zones:
+	kvfree(data.rz);
+out:
+	nvmet_req_complete(req, status);
+}
+
+void nvmet_bdev_execute_zone_mgmt_send(struct nvmet_req *req)
+{
+	sector_t nr_sect = bdev_zone_sectors(nvmet_bdev(req));
+	struct nvme_zone_mgmt_send_cmd *c = &req->cmd->zms;
+	enum req_opf op = REQ_OP_LAST;
+	u16 status = NVME_SC_SUCCESS;
+	sector_t sect;
+	int ret;
+
+	sect = nvmet_lba_to_sect(req->ns, le64_to_cpu(req->cmd->zms.slba));
+
+	if (c->select_all)
+		nr_sect = get_capacity(nvmet_bdev(req)->bd_disk);
+
+	switch (c->zsa) {
+	case NVME_ZONE_OPEN:
+		op = REQ_OP_ZONE_OPEN;
+		break;
+	case NVME_ZONE_CLOSE:
+		op = REQ_OP_ZONE_CLOSE;
+		break;
+	case NVME_ZONE_FINISH:
+		op = REQ_OP_ZONE_FINISH;
+		break;
+	case NVME_ZONE_RESET:
+		op = REQ_OP_ZONE_RESET;
+		break;
+	default:
+		status = NVME_SC_INVALID_FIELD;
+		goto out;
+	}
+
+	ret = blkdev_zone_mgmt(nvmet_bdev(req), op, sect, nr_sect, GFP_KERNEL);
+	if (ret)
+		status = NVME_SC_INTERNAL;
+out:
+	nvmet_req_complete(req, status);
+}
+
+void nvmet_bdev_execute_zone_append(struct nvmet_req *req)
+{
+	u64 slba = le64_to_cpu(req->cmd->rw.slba);
+	sector_t sect = nvmet_lba_to_sect(req->ns, slba);
+	unsigned long bv_cnt = req->sg_cnt;
+	u16 status = NVME_SC_SUCCESS;
+	size_t mapped_data_len = 0;
+	int sg_cnt = req->sg_cnt;
+	struct scatterlist *sg;
+	struct iov_iter iter;
+	struct bio_vec *bvec;
+	size_t mapped_cnt;
+	struct bio *bio;
+	int ret;
+
+	if (!nvmet_check_transfer_len(req, nvmet_rw_data_len(req)))
+		return;
+
+	if (!req->sg_cnt)
+		goto out;
+
+	if (WARN_ON(req->sg_cnt > BIO_MAX_PAGES)) {
+		status = NVME_SC_INTERNAL | NVME_SC_DNR;
+		goto out;
+	}
+
+	bvec = kmalloc_array(bv_cnt, sizeof(*bvec), GFP_KERNEL);
+	if (!bvec) {
+		status = NVME_SC_INTERNAL | NVME_SC_DNR;
+		goto out;
+	}
+
+	for_each_sg(req->sg, sg, req->sg_cnt, mapped_cnt) {
+		nvmet_file_init_bvec(&bvec[mapped_cnt], sg);
+		mapped_data_len += bvec[mapped_cnt].bv_len;
+		sg_cnt--;
+		if (mapped_cnt == bv_cnt)
+			break;
+	}
+
+	if (WARN_ON(sg_cnt)) {
+		status = NVME_SC_INTERNAL | NVME_SC_DNR;
+		goto out;
+	}
+
+	iov_iter_bvec(&iter, WRITE, bvec, mapped_cnt, mapped_data_len);
+
+	bio = bio_alloc(GFP_KERNEL, bv_cnt);
+	bio_set_dev(bio, nvmet_bdev(req));
+	bio->bi_iter.bi_sector = sect;
+	bio->bi_opf = REQ_OP_ZONE_APPEND | REQ_SYNC | REQ_IDLE;
+
+	ret = bio_iov_iter_get_pages(bio, &iter);
+	if (unlikely(ret)) {
+		status = NVME_SC_INTERNAL | NVME_SC_DNR;
+		bio_io_error(bio);
+		goto bvec_free;
+	}
+
+	ret = submit_bio_wait(bio);
+	status = ret < 0 ? NVME_SC_INTERNAL : status;
+	bio_put(bio);
+
+	sect += (mapped_data_len >> 9);
+	req->cqe->result.u64 = le64_to_cpu(nvmet_sect_to_lba(req->ns, sect));
+
+bvec_free:
+	kfree(bvec);
+out:
+	nvmet_req_complete(req, status);
+}
+
+#else  /* CONFIG_BLK_DEV_ZONED */
+void nvmet_execute_identify_cns_cs_ctrl(struct nvmet_req *req)
+{
+}
+void nvmet_execute_identify_cns_cs_ns(struct nvmet_req *req)
+{
+}
+u16 nvmet_process_zns_cis(struct nvmet_req *req, off_t *off)
+{
+	return 0;
+}
+bool nvmet_bdev_zns_config(struct nvmet_ns *ns)
+{
+	return false;
+}
+void nvmet_bdev_execute_zone_mgmt_recv(struct nvmet_req *req)
+{
+}
+void nvmet_bdev_execute_zone_mgmt_send(struct nvmet_req *req)
+{
+}
+void nvmet_bdev_execute_zone_append(struct nvmet_req *req)
+{
+}
+void nvmet_zns_add_cmd_effects(struct nvme_effects_log *log)
+{
+}
+#endif /* CONFIG_BLK_DEV_ZONED */
-- 
2.22.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH V4 3/9] nvmet: trim down id-desclist to use req->ns
  2020-12-02  6:22 ` Chaitanya Kulkarni
@ 2020-12-02  6:22   ` Chaitanya Kulkarni
  -1 siblings, 0 replies; 54+ messages in thread
From: Chaitanya Kulkarni @ 2020-12-02  6:22 UTC (permalink / raw)
  To: linux-block, linux-nvme
  Cc: sagi, hch, damien.lemoal, johannes.thumshirn, Chaitanya Kulkarni

In this prep patch we remove the extra local variable struct nvmet_ns
in nvmet_execute_identify_desclist() since req already has the member
that can be reused, this also eliminates the explicit call to
nvmet_put_namespace() which is already present in the request
completion path.

This reduces the arguments to the function in the following patch to
implement the ZNS for bdev-ns so we can get away with passing the req
argument instead of req and ns.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 drivers/nvme/target/admin-cmd.c | 20 +++++++++-----------
 1 file changed, 9 insertions(+), 11 deletions(-)

diff --git a/drivers/nvme/target/admin-cmd.c b/drivers/nvme/target/admin-cmd.c
index 509fd8dcca0c..c64b40c631e0 100644
--- a/drivers/nvme/target/admin-cmd.c
+++ b/drivers/nvme/target/admin-cmd.c
@@ -603,37 +603,35 @@ u16 nvmet_copy_ns_identifier(struct nvmet_req *req, u8 type, u8 len,
 
 static void nvmet_execute_identify_desclist(struct nvmet_req *req)
 {
-	struct nvmet_ns *ns;
 	u16 status = 0;
 	off_t off = 0;
 
-	ns = nvmet_find_namespace(req->sq->ctrl, req->cmd->identify.nsid);
-	if (!ns) {
+	req->ns = nvmet_find_namespace(req->sq->ctrl, req->cmd->identify.nsid);
+	if (!req->ns) {
 		req->error_loc = offsetof(struct nvme_identify, nsid);
 		status = NVME_SC_INVALID_NS | NVME_SC_DNR;
 		goto out;
 	}
 
-	if (memchr_inv(&ns->uuid, 0, sizeof(ns->uuid))) {
+	if (memchr_inv(&req->ns->uuid, 0, sizeof(req->ns->uuid))) {
 		status = nvmet_copy_ns_identifier(req, NVME_NIDT_UUID,
 						  NVME_NIDT_UUID_LEN,
-						  &ns->uuid, &off);
+						  &req->ns->uuid, &off);
 		if (status)
-			goto out_put_ns;
+			goto out;
 	}
-	if (memchr_inv(ns->nguid, 0, sizeof(ns->nguid))) {
+	if (memchr_inv(req->ns->nguid, 0, sizeof(req->ns->nguid))) {
 		status = nvmet_copy_ns_identifier(req, NVME_NIDT_NGUID,
 						  NVME_NIDT_NGUID_LEN,
-						  &ns->nguid, &off);
+						  &req->ns->nguid, &off);
 		if (status)
-			goto out_put_ns;
+			goto out;
 	}
 
 	if (sg_zero_buffer(req->sg, req->sg_cnt, NVME_IDENTIFY_DATA_SIZE - off,
 			off) != NVME_IDENTIFY_DATA_SIZE - off)
 		status = NVME_SC_INTERNAL | NVME_SC_DNR;
-out_put_ns:
-	nvmet_put_namespace(ns);
+
 out:
 	nvmet_req_complete(req, status);
 }
-- 
2.22.1


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH V4 3/9] nvmet: trim down id-desclist to use req->ns
@ 2020-12-02  6:22   ` Chaitanya Kulkarni
  0 siblings, 0 replies; 54+ messages in thread
From: Chaitanya Kulkarni @ 2020-12-02  6:22 UTC (permalink / raw)
  To: linux-block, linux-nvme
  Cc: johannes.thumshirn, damien.lemoal, sagi, Chaitanya Kulkarni, hch

In this prep patch we remove the extra local variable struct nvmet_ns
in nvmet_execute_identify_desclist() since req already has the member
that can be reused, this also eliminates the explicit call to
nvmet_put_namespace() which is already present in the request
completion path.

This reduces the arguments to the function in the following patch to
implement the ZNS for bdev-ns so we can get away with passing the req
argument instead of req and ns.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 drivers/nvme/target/admin-cmd.c | 20 +++++++++-----------
 1 file changed, 9 insertions(+), 11 deletions(-)

diff --git a/drivers/nvme/target/admin-cmd.c b/drivers/nvme/target/admin-cmd.c
index 509fd8dcca0c..c64b40c631e0 100644
--- a/drivers/nvme/target/admin-cmd.c
+++ b/drivers/nvme/target/admin-cmd.c
@@ -603,37 +603,35 @@ u16 nvmet_copy_ns_identifier(struct nvmet_req *req, u8 type, u8 len,
 
 static void nvmet_execute_identify_desclist(struct nvmet_req *req)
 {
-	struct nvmet_ns *ns;
 	u16 status = 0;
 	off_t off = 0;
 
-	ns = nvmet_find_namespace(req->sq->ctrl, req->cmd->identify.nsid);
-	if (!ns) {
+	req->ns = nvmet_find_namespace(req->sq->ctrl, req->cmd->identify.nsid);
+	if (!req->ns) {
 		req->error_loc = offsetof(struct nvme_identify, nsid);
 		status = NVME_SC_INVALID_NS | NVME_SC_DNR;
 		goto out;
 	}
 
-	if (memchr_inv(&ns->uuid, 0, sizeof(ns->uuid))) {
+	if (memchr_inv(&req->ns->uuid, 0, sizeof(req->ns->uuid))) {
 		status = nvmet_copy_ns_identifier(req, NVME_NIDT_UUID,
 						  NVME_NIDT_UUID_LEN,
-						  &ns->uuid, &off);
+						  &req->ns->uuid, &off);
 		if (status)
-			goto out_put_ns;
+			goto out;
 	}
-	if (memchr_inv(ns->nguid, 0, sizeof(ns->nguid))) {
+	if (memchr_inv(req->ns->nguid, 0, sizeof(req->ns->nguid))) {
 		status = nvmet_copy_ns_identifier(req, NVME_NIDT_NGUID,
 						  NVME_NIDT_NGUID_LEN,
-						  &ns->nguid, &off);
+						  &req->ns->nguid, &off);
 		if (status)
-			goto out_put_ns;
+			goto out;
 	}
 
 	if (sg_zero_buffer(req->sg, req->sg_cnt, NVME_IDENTIFY_DATA_SIZE - off,
 			off) != NVME_IDENTIFY_DATA_SIZE - off)
 		status = NVME_SC_INTERNAL | NVME_SC_DNR;
-out_put_ns:
-	nvmet_put_namespace(ns);
+
 out:
 	nvmet_req_complete(req, status);
 }
-- 
2.22.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH V4 4/9] nvmet: add NVME_CSI_ZNS in ns-desc for zbdev
  2020-12-02  6:22 ` Chaitanya Kulkarni
@ 2020-12-02  6:22   ` Chaitanya Kulkarni
  -1 siblings, 0 replies; 54+ messages in thread
From: Chaitanya Kulkarni @ 2020-12-02  6:22 UTC (permalink / raw)
  To: linux-block, linux-nvme
  Cc: sagi, hch, damien.lemoal, johannes.thumshirn, Chaitanya Kulkarni

When discovering the ZNS, the host-side looks for the NVME_CSI_ZNS
value in the ns-desc. Update the nvmet_execute_identify_desclist()
such that it can now update the ns-desc with NVME_CSI_ZNS if bdev is
zoned.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 drivers/nvme/target/admin-cmd.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/nvme/target/admin-cmd.c b/drivers/nvme/target/admin-cmd.c
index c64b40c631e0..d4fc1bb1a318 100644
--- a/drivers/nvme/target/admin-cmd.c
+++ b/drivers/nvme/target/admin-cmd.c
@@ -628,6 +628,10 @@ static void nvmet_execute_identify_desclist(struct nvmet_req *req)
 			goto out;
 	}
 
+	status = nvmet_process_zns_cis(req, &off);
+	if (status)
+		goto out;
+
 	if (sg_zero_buffer(req->sg, req->sg_cnt, NVME_IDENTIFY_DATA_SIZE - off,
 			off) != NVME_IDENTIFY_DATA_SIZE - off)
 		status = NVME_SC_INTERNAL | NVME_SC_DNR;
-- 
2.22.1


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH V4 4/9] nvmet: add NVME_CSI_ZNS in ns-desc for zbdev
@ 2020-12-02  6:22   ` Chaitanya Kulkarni
  0 siblings, 0 replies; 54+ messages in thread
From: Chaitanya Kulkarni @ 2020-12-02  6:22 UTC (permalink / raw)
  To: linux-block, linux-nvme
  Cc: johannes.thumshirn, damien.lemoal, sagi, Chaitanya Kulkarni, hch

When discovering the ZNS, the host-side looks for the NVME_CSI_ZNS
value in the ns-desc. Update the nvmet_execute_identify_desclist()
such that it can now update the ns-desc with NVME_CSI_ZNS if bdev is
zoned.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 drivers/nvme/target/admin-cmd.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/nvme/target/admin-cmd.c b/drivers/nvme/target/admin-cmd.c
index c64b40c631e0..d4fc1bb1a318 100644
--- a/drivers/nvme/target/admin-cmd.c
+++ b/drivers/nvme/target/admin-cmd.c
@@ -628,6 +628,10 @@ static void nvmet_execute_identify_desclist(struct nvmet_req *req)
 			goto out;
 	}
 
+	status = nvmet_process_zns_cis(req, &off);
+	if (status)
+		goto out;
+
 	if (sg_zero_buffer(req->sg, req->sg_cnt, NVME_IDENTIFY_DATA_SIZE - off,
 			off) != NVME_IDENTIFY_DATA_SIZE - off)
 		status = NVME_SC_INTERNAL | NVME_SC_DNR;
-- 
2.22.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH V4 5/9] nvmet: add cns-cs-ctrl in id-ctrl for ZNS bdev
  2020-12-02  6:22 ` Chaitanya Kulkarni
@ 2020-12-02  6:22   ` Chaitanya Kulkarni
  -1 siblings, 0 replies; 54+ messages in thread
From: Chaitanya Kulkarni @ 2020-12-02  6:22 UTC (permalink / raw)
  To: linux-block, linux-nvme
  Cc: sagi, hch, damien.lemoal, johannes.thumshirn, Chaitanya Kulkarni

Update the nvmet_execute_identify() such that it can now handle
NVME_ID_CNS_CS_CTRL when identify.cis is set to ZNS. This allows
host to identify the support for ZNS.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 drivers/nvme/target/admin-cmd.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/nvme/target/admin-cmd.c b/drivers/nvme/target/admin-cmd.c
index d4fc1bb1a318..e7d2b96cda6b 100644
--- a/drivers/nvme/target/admin-cmd.c
+++ b/drivers/nvme/target/admin-cmd.c
@@ -650,6 +650,10 @@ static void nvmet_execute_identify(struct nvmet_req *req)
 		return nvmet_execute_identify_ns(req);
 	case NVME_ID_CNS_CTRL:
 		return nvmet_execute_identify_ctrl(req);
+	case NVME_ID_CNS_CS_CTRL:
+		if (req->cmd->identify.csi == NVME_CSI_ZNS)
+			return nvmet_execute_identify_cns_cs_ctrl(req);
+		break;
 	case NVME_ID_CNS_NS_ACTIVE_LIST:
 		return nvmet_execute_identify_nslist(req);
 	case NVME_ID_CNS_NS_DESC_LIST:
-- 
2.22.1


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH V4 5/9] nvmet: add cns-cs-ctrl in id-ctrl for ZNS bdev
@ 2020-12-02  6:22   ` Chaitanya Kulkarni
  0 siblings, 0 replies; 54+ messages in thread
From: Chaitanya Kulkarni @ 2020-12-02  6:22 UTC (permalink / raw)
  To: linux-block, linux-nvme
  Cc: johannes.thumshirn, damien.lemoal, sagi, Chaitanya Kulkarni, hch

Update the nvmet_execute_identify() such that it can now handle
NVME_ID_CNS_CS_CTRL when identify.cis is set to ZNS. This allows
host to identify the support for ZNS.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 drivers/nvme/target/admin-cmd.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/nvme/target/admin-cmd.c b/drivers/nvme/target/admin-cmd.c
index d4fc1bb1a318..e7d2b96cda6b 100644
--- a/drivers/nvme/target/admin-cmd.c
+++ b/drivers/nvme/target/admin-cmd.c
@@ -650,6 +650,10 @@ static void nvmet_execute_identify(struct nvmet_req *req)
 		return nvmet_execute_identify_ns(req);
 	case NVME_ID_CNS_CTRL:
 		return nvmet_execute_identify_ctrl(req);
+	case NVME_ID_CNS_CS_CTRL:
+		if (req->cmd->identify.csi == NVME_CSI_ZNS)
+			return nvmet_execute_identify_cns_cs_ctrl(req);
+		break;
 	case NVME_ID_CNS_NS_ACTIVE_LIST:
 		return nvmet_execute_identify_nslist(req);
 	case NVME_ID_CNS_NS_DESC_LIST:
-- 
2.22.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH V4 6/9] nvmet: add cns-cs-ns in id-ctrl for ZNS bdev
  2020-12-02  6:22 ` Chaitanya Kulkarni
@ 2020-12-02  6:22   ` Chaitanya Kulkarni
  -1 siblings, 0 replies; 54+ messages in thread
From: Chaitanya Kulkarni @ 2020-12-02  6:22 UTC (permalink / raw)
  To: linux-block, linux-nvme
  Cc: sagi, hch, damien.lemoal, johannes.thumshirn, Chaitanya Kulkarni

Update the nvmet_execute_identify() such that it can now handle
NVME_ID_CNS_CS_NS when identify.cis is set to ZNS. This allows
host to identify the ns with ZNS capabilities.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 drivers/nvme/target/admin-cmd.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/nvme/target/admin-cmd.c b/drivers/nvme/target/admin-cmd.c
index e7d2b96cda6b..cd368cbe3855 100644
--- a/drivers/nvme/target/admin-cmd.c
+++ b/drivers/nvme/target/admin-cmd.c
@@ -648,6 +648,10 @@ static void nvmet_execute_identify(struct nvmet_req *req)
 	switch (req->cmd->identify.cns) {
 	case NVME_ID_CNS_NS:
 		return nvmet_execute_identify_ns(req);
+	case NVME_ID_CNS_CS_NS:
+		if (req->cmd->identify.csi == NVME_CSI_ZNS)
+			return nvmet_execute_identify_cns_cs_ns(req);
+		break;
 	case NVME_ID_CNS_CTRL:
 		return nvmet_execute_identify_ctrl(req);
 	case NVME_ID_CNS_CS_CTRL:
-- 
2.22.1


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH V4 6/9] nvmet: add cns-cs-ns in id-ctrl for ZNS bdev
@ 2020-12-02  6:22   ` Chaitanya Kulkarni
  0 siblings, 0 replies; 54+ messages in thread
From: Chaitanya Kulkarni @ 2020-12-02  6:22 UTC (permalink / raw)
  To: linux-block, linux-nvme
  Cc: johannes.thumshirn, damien.lemoal, sagi, Chaitanya Kulkarni, hch

Update the nvmet_execute_identify() such that it can now handle
NVME_ID_CNS_CS_NS when identify.cis is set to ZNS. This allows
host to identify the ns with ZNS capabilities.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 drivers/nvme/target/admin-cmd.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/nvme/target/admin-cmd.c b/drivers/nvme/target/admin-cmd.c
index e7d2b96cda6b..cd368cbe3855 100644
--- a/drivers/nvme/target/admin-cmd.c
+++ b/drivers/nvme/target/admin-cmd.c
@@ -648,6 +648,10 @@ static void nvmet_execute_identify(struct nvmet_req *req)
 	switch (req->cmd->identify.cns) {
 	case NVME_ID_CNS_NS:
 		return nvmet_execute_identify_ns(req);
+	case NVME_ID_CNS_CS_NS:
+		if (req->cmd->identify.csi == NVME_CSI_ZNS)
+			return nvmet_execute_identify_cns_cs_ns(req);
+		break;
 	case NVME_ID_CNS_CTRL:
 		return nvmet_execute_identify_ctrl(req);
 	case NVME_ID_CNS_CS_CTRL:
-- 
2.22.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH V4 7/9] nvmet: add zns cmd effects to support zbdev
  2020-12-02  6:22 ` Chaitanya Kulkarni
@ 2020-12-02  6:22   ` Chaitanya Kulkarni
  -1 siblings, 0 replies; 54+ messages in thread
From: Chaitanya Kulkarni @ 2020-12-02  6:22 UTC (permalink / raw)
  To: linux-block, linux-nvme
  Cc: sagi, hch, damien.lemoal, johannes.thumshirn, Chaitanya Kulkarni

Update the target side command effects logs with support for
ZNS commands for zbdev.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 drivers/nvme/target/admin-cmd.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/nvme/target/admin-cmd.c b/drivers/nvme/target/admin-cmd.c
index cd368cbe3855..0099275951da 100644
--- a/drivers/nvme/target/admin-cmd.c
+++ b/drivers/nvme/target/admin-cmd.c
@@ -191,6 +191,8 @@ static void nvmet_execute_get_log_cmd_effects_ns(struct nvmet_req *req)
 	log->iocs[nvme_cmd_dsm]			= cpu_to_le32(1 << 0);
 	log->iocs[nvme_cmd_write_zeroes]	= cpu_to_le32(1 << 0);
 
+	nvmet_zns_add_cmd_effects(log);
+
 	status = nvmet_copy_to_sgl(req, 0, log, sizeof(*log));
 
 	kfree(log);
-- 
2.22.1


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH V4 7/9] nvmet: add zns cmd effects to support zbdev
@ 2020-12-02  6:22   ` Chaitanya Kulkarni
  0 siblings, 0 replies; 54+ messages in thread
From: Chaitanya Kulkarni @ 2020-12-02  6:22 UTC (permalink / raw)
  To: linux-block, linux-nvme
  Cc: johannes.thumshirn, damien.lemoal, sagi, Chaitanya Kulkarni, hch

Update the target side command effects logs with support for
ZNS commands for zbdev.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 drivers/nvme/target/admin-cmd.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/nvme/target/admin-cmd.c b/drivers/nvme/target/admin-cmd.c
index cd368cbe3855..0099275951da 100644
--- a/drivers/nvme/target/admin-cmd.c
+++ b/drivers/nvme/target/admin-cmd.c
@@ -191,6 +191,8 @@ static void nvmet_execute_get_log_cmd_effects_ns(struct nvmet_req *req)
 	log->iocs[nvme_cmd_dsm]			= cpu_to_le32(1 << 0);
 	log->iocs[nvme_cmd_write_zeroes]	= cpu_to_le32(1 << 0);
 
+	nvmet_zns_add_cmd_effects(log);
+
 	status = nvmet_copy_to_sgl(req, 0, log, sizeof(*log));
 
 	kfree(log);
-- 
2.22.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH V4 8/9] nvmet: add zns bdev config support
  2020-12-02  6:22 ` Chaitanya Kulkarni
@ 2020-12-02  6:22   ` Chaitanya Kulkarni
  -1 siblings, 0 replies; 54+ messages in thread
From: Chaitanya Kulkarni @ 2020-12-02  6:22 UTC (permalink / raw)
  To: linux-block, linux-nvme
  Cc: sagi, hch, damien.lemoal, johannes.thumshirn, Chaitanya Kulkarni

For zbd based bdev backend we need to override the ns->blksize_shift
with the physical block size instead of using the logical block size
so that SMR drives will not result in an error.

Update the nvmet_bdev_ns_enable() to reflect that.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 drivers/nvme/target/io-cmd-bdev.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/nvme/target/io-cmd-bdev.c b/drivers/nvme/target/io-cmd-bdev.c
index 125dde3f410e..e1f6d59dd341 100644
--- a/drivers/nvme/target/io-cmd-bdev.c
+++ b/drivers/nvme/target/io-cmd-bdev.c
@@ -86,6 +86,9 @@ int nvmet_bdev_ns_enable(struct nvmet_ns *ns)
 	if (IS_ENABLED(CONFIG_BLK_DEV_INTEGRITY_T10))
 		nvmet_bdev_ns_enable_integrity(ns);
 
+	if (bdev_is_zoned(ns->bdev) && !nvmet_bdev_zns_enable(ns))
+		return -EINVAL;
+
 	return 0;
 }
 
-- 
2.22.1


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH V4 8/9] nvmet: add zns bdev config support
@ 2020-12-02  6:22   ` Chaitanya Kulkarni
  0 siblings, 0 replies; 54+ messages in thread
From: Chaitanya Kulkarni @ 2020-12-02  6:22 UTC (permalink / raw)
  To: linux-block, linux-nvme
  Cc: johannes.thumshirn, damien.lemoal, sagi, Chaitanya Kulkarni, hch

For zbd based bdev backend we need to override the ns->blksize_shift
with the physical block size instead of using the logical block size
so that SMR drives will not result in an error.

Update the nvmet_bdev_ns_enable() to reflect that.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 drivers/nvme/target/io-cmd-bdev.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/nvme/target/io-cmd-bdev.c b/drivers/nvme/target/io-cmd-bdev.c
index 125dde3f410e..e1f6d59dd341 100644
--- a/drivers/nvme/target/io-cmd-bdev.c
+++ b/drivers/nvme/target/io-cmd-bdev.c
@@ -86,6 +86,9 @@ int nvmet_bdev_ns_enable(struct nvmet_ns *ns)
 	if (IS_ENABLED(CONFIG_BLK_DEV_INTEGRITY_T10))
 		nvmet_bdev_ns_enable_integrity(ns);
 
+	if (bdev_is_zoned(ns->bdev) && !nvmet_bdev_zns_enable(ns))
+		return -EINVAL;
+
 	return 0;
 }
 
-- 
2.22.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH V4 9/9] nvmet: add ZNS based I/O cmds handlers
  2020-12-02  6:22 ` Chaitanya Kulkarni
@ 2020-12-02  6:22   ` Chaitanya Kulkarni
  -1 siblings, 0 replies; 54+ messages in thread
From: Chaitanya Kulkarni @ 2020-12-02  6:22 UTC (permalink / raw)
  To: linux-block, linux-nvme
  Cc: sagi, hch, damien.lemoal, johannes.thumshirn, Chaitanya Kulkarni

Add zone-mgmt-send, zone-mgmt-recv and zone-zppend handlers for the
bdev backend so that it can support zbd.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 drivers/nvme/target/io-cmd-bdev.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/nvme/target/io-cmd-bdev.c b/drivers/nvme/target/io-cmd-bdev.c
index e1f6d59dd341..25dcd0544d5d 100644
--- a/drivers/nvme/target/io-cmd-bdev.c
+++ b/drivers/nvme/target/io-cmd-bdev.c
@@ -453,6 +453,15 @@ u16 nvmet_bdev_parse_io_cmd(struct nvmet_req *req)
 	case nvme_cmd_write_zeroes:
 		req->execute = nvmet_bdev_execute_write_zeroes;
 		return 0;
+	case nvme_cmd_zone_append:
+		req->execute = nvmet_bdev_execute_zone_append;
+		return 0;
+	case nvme_cmd_zone_mgmt_recv:
+		req->execute = nvmet_bdev_execute_zone_mgmt_recv;
+		return 0;
+	case nvme_cmd_zone_mgmt_send:
+		req->execute = nvmet_bdev_execute_zone_mgmt_send;
+		return 0;
 	default:
 		pr_err("unhandled cmd %d on qid %d\n", cmd->common.opcode,
 		       req->sq->qid);
-- 
2.22.1


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH V4 9/9] nvmet: add ZNS based I/O cmds handlers
@ 2020-12-02  6:22   ` Chaitanya Kulkarni
  0 siblings, 0 replies; 54+ messages in thread
From: Chaitanya Kulkarni @ 2020-12-02  6:22 UTC (permalink / raw)
  To: linux-block, linux-nvme
  Cc: johannes.thumshirn, damien.lemoal, sagi, Chaitanya Kulkarni, hch

Add zone-mgmt-send, zone-mgmt-recv and zone-zppend handlers for the
bdev backend so that it can support zbd.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 drivers/nvme/target/io-cmd-bdev.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/nvme/target/io-cmd-bdev.c b/drivers/nvme/target/io-cmd-bdev.c
index e1f6d59dd341..25dcd0544d5d 100644
--- a/drivers/nvme/target/io-cmd-bdev.c
+++ b/drivers/nvme/target/io-cmd-bdev.c
@@ -453,6 +453,15 @@ u16 nvmet_bdev_parse_io_cmd(struct nvmet_req *req)
 	case nvme_cmd_write_zeroes:
 		req->execute = nvmet_bdev_execute_write_zeroes;
 		return 0;
+	case nvme_cmd_zone_append:
+		req->execute = nvmet_bdev_execute_zone_append;
+		return 0;
+	case nvme_cmd_zone_mgmt_recv:
+		req->execute = nvmet_bdev_execute_zone_mgmt_recv;
+		return 0;
+	case nvme_cmd_zone_mgmt_send:
+		req->execute = nvmet_bdev_execute_zone_mgmt_send;
+		return 0;
 	default:
 		pr_err("unhandled cmd %d on qid %d\n", cmd->common.opcode,
 		       req->sq->qid);
-- 
2.22.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* Re: [PATCH V4 1/9] block: allow bvec for zone append get pages
  2020-12-02  6:22   ` Chaitanya Kulkarni
@ 2020-12-02  8:34     ` Johannes Thumshirn
  -1 siblings, 0 replies; 54+ messages in thread
From: Johannes Thumshirn @ 2020-12-02  8:34 UTC (permalink / raw)
  To: Chaitanya Kulkarni, linux-block, linux-nvme; +Cc: sagi, hch, Damien Le Moal

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH V4 1/9] block: allow bvec for zone append get pages
@ 2020-12-02  8:34     ` Johannes Thumshirn
  0 siblings, 0 replies; 54+ messages in thread
From: Johannes Thumshirn @ 2020-12-02  8:34 UTC (permalink / raw)
  To: Chaitanya Kulkarni, linux-block, linux-nvme; +Cc: Damien Le Moal, sagi, hch

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH V4 1/9] block: allow bvec for zone append get pages
  2020-12-02  6:22   ` Chaitanya Kulkarni
@ 2020-12-02  8:55     ` Christoph Hellwig
  -1 siblings, 0 replies; 54+ messages in thread
From: Christoph Hellwig @ 2020-12-02  8:55 UTC (permalink / raw)
  To: Chaitanya Kulkarni
  Cc: linux-block, linux-nvme, sagi, hch, damien.lemoal, johannes.thumshirn

On Tue, Dec 01, 2020 at 10:22:19PM -0800, Chaitanya Kulkarni wrote:
> Remove the bvec check in the bio_iov_iter_get_pages() for
> REQ_OP_ZONE_APPEND so that we can reuse the code and build iter from
> bvec.

We should do the same optimization for bvec pages that the normal path
does.  That being said using bio_iov_iter_get_pages in nvmet does not
	make any sense to me whatsover.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH V4 1/9] block: allow bvec for zone append get pages
@ 2020-12-02  8:55     ` Christoph Hellwig
  0 siblings, 0 replies; 54+ messages in thread
From: Christoph Hellwig @ 2020-12-02  8:55 UTC (permalink / raw)
  To: Chaitanya Kulkarni
  Cc: damien.lemoal, sagi, johannes.thumshirn, linux-nvme, linux-block, hch

On Tue, Dec 01, 2020 at 10:22:19PM -0800, Chaitanya Kulkarni wrote:
> Remove the bvec check in the bio_iov_iter_get_pages() for
> REQ_OP_ZONE_APPEND so that we can reuse the code and build iter from
> bvec.

We should do the same optimization for bvec pages that the normal path
does.  That being said using bio_iov_iter_get_pages in nvmet does not
	make any sense to me whatsover.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH V4 2/9] nvmet: add ZNS support for bdev-ns
  2020-12-02  6:22   ` Chaitanya Kulkarni
@ 2020-12-02  9:07     ` Christoph Hellwig
  -1 siblings, 0 replies; 54+ messages in thread
From: Christoph Hellwig @ 2020-12-02  9:07 UTC (permalink / raw)
  To: Chaitanya Kulkarni
  Cc: linux-block, linux-nvme, sagi, hch, damien.lemoal, johannes.thumshirn

On Tue, Dec 01, 2020 at 10:22:20PM -0800, Chaitanya Kulkarni wrote:
> Add zns-bdev-config, id-ctrl, id-ns, zns-cmd-effects, zone-mgmt-send,
> zone-mgmt-recv and zone-append handlers

If you think you need to speel out command please do so as in the spec
instead of uisng strange abbreviations.

That being said the commit log should document the why and the overall
architecture considerations, not low-level details like this.

> +#ifdef CONFIG_BLK_DEV_ZONED
> +	struct nvme_id_ns_zns	id_zns;
> +	unsigned int		zasl;
> +#endif

This wastes a lot of space for all normal non-zns uses.  Please have some
sort of private data field that zns can use.  Why do we even need to store
the whole data structure instead of the few native format fields we need?

>  static inline struct nvmet_ns *to_nvmet_ns(struct config_item *item)
> @@ -251,6 +255,10 @@ struct nvmet_subsys {
>  	unsigned int		admin_timeout;
>  	unsigned int		io_timeout;
>  #endif /* CONFIG_NVME_TARGET_PASSTHRU */
> +
> +#ifdef CONFIG_BLK_DEV_ZONED
> +	struct nvme_id_ctrl_zns	id_ctrl_zns;
> +#endif

Same here.

> +#define NVMET_MPSMIN_SHIFT	12

Needs some documentation on the why.

> +static inline struct block_device *nvmet_bdev(struct nvmet_req *req)
> +{
> +	return req->ns->bdev;
> +}

I don't really see the point of this helper.

> +/*
> + *  ZNS related command implementation and helpers.
> + */
> +
> +u16 nvmet_process_zns_cis(struct nvmet_req *req, off_t *off)
> +{
> +	u16 nvme_cis_zns = NVME_CSI_ZNS;
> +
> +	if (!bdev_is_zoned(nvmet_bdev(req)))
> +		return NVME_SC_SUCCESS;
> +
> +	return nvmet_copy_ns_identifier(req, NVME_NIDT_CSI, NVME_NIDT_CSI_LEN,
> +					&nvme_cis_zns, off);
> +}

This looks weird.  We can want to support the command set identifier in
general, so this should go into common code, and just look up the command
set identifier in the nvmet_ns structure.

> +void nvmet_zns_add_cmd_effects(struct nvme_effects_log *log)
> +{
> +	log->iocs[nvme_cmd_zone_append]		= cpu_to_le32(1 << 0);
> +	log->iocs[nvme_cmd_zone_mgmt_send]	= cpu_to_le32(1 << 0);
> +	log->iocs[nvme_cmd_zone_mgmt_recv]	= cpu_to_le32(1 << 0);
> +}

Just add this to the caller under an if.  For the if a helper that checks
IS_ENABLED() and the csi would be useful.

> +
> +static inline bool nvmet_bdev_validate_zns_zones(struct nvmet_ns *ns)
> +{
> +	if (ns->bdev->bd_disk->queue->conv_zones_bitmap) {
> +		pr_err("block devices with conventional zones are not supported.");
> +		return false;
> +	}
> +
> +	return !(get_capacity(ns->bdev->bd_disk) &
> +			(bdev_zone_sectors(ns->bdev) - 1));
> +}

I think this should be open coded in the caller.

> +static inline u8 nvmet_zasl(unsigned int zone_append_sects)
> +{
> +	/*
> +	 * Zone Append Size Limit is the value experessed in the units
> +	 * of minimum memory page size (i.e. 12) and is reported power of 2.
> +	 */
> +	return ilog2((zone_append_sects << 9) >> NVMET_MPSMIN_SHIFT);
> +}
> +
> +static inline void nvmet_zns_update_zasl(struct nvmet_ns *ns)
> +{
> +	struct request_queue *q = ns->bdev->bd_disk->queue;
> +	struct nvmet_ns *ins;
> +	unsigned long idx;
> +	u8 min_zasl;
> +
> +	/*
> +	 * Calculate new ctrl->zasl value when enabling the new ns. This value
> +	 * has to be the minimum of the max_zone_append values from available
> +	 * namespaces.
> +	 */
> +	min_zasl = ns->zasl = nvmet_zasl(queue_max_zone_append_sectors(q));
> +
> +	xa_for_each(&(ns->subsys->namespaces), idx, ins) {
> +		struct request_queue *iq = ins->bdev->bd_disk->queue;
> +		unsigned int imax_za_sects = queue_max_zone_append_sectors(iq);
> +		u8 izasl = nvmet_zasl(imax_za_sects);
> +
> +		if (!bdev_is_zoned(ins->bdev))
> +			continue;
> +
> +		min_zasl = min_zasl > izasl ? izasl : min_zasl;
> +	}
> +
> +	ns->subsys->id_ctrl_zns.zasl = min_zasl;
> +}

This will change the limit when a new namespaces is added.  I think we need
to just pick the value of the first namespaces and refuse adding a new
one if the limit is lower to not completely break hosts.

> +void nvmet_bdev_execute_zone_append(struct nvmet_req *req)

This whole function looks weird.  I'd expect that we mostly (if not
entirely) reuse nvmet_bdev_execute_rw, just using bio_add_hw_page
instead of bio_add_page and setting up the proper op field.

> +#else  /* CONFIG_BLK_DEV_ZONED */

We really do try to avoid these kinds of ifdefs in .c files.  Either
put the stubs inline into the header, of use IS_ENABLED() magic in
the callers.  In this case I think the new helper I mentioned above
which checks IS_ENABLED + the csi seems like the right way.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH V4 2/9] nvmet: add ZNS support for bdev-ns
@ 2020-12-02  9:07     ` Christoph Hellwig
  0 siblings, 0 replies; 54+ messages in thread
From: Christoph Hellwig @ 2020-12-02  9:07 UTC (permalink / raw)
  To: Chaitanya Kulkarni
  Cc: damien.lemoal, sagi, johannes.thumshirn, linux-nvme, linux-block, hch

On Tue, Dec 01, 2020 at 10:22:20PM -0800, Chaitanya Kulkarni wrote:
> Add zns-bdev-config, id-ctrl, id-ns, zns-cmd-effects, zone-mgmt-send,
> zone-mgmt-recv and zone-append handlers

If you think you need to speel out command please do so as in the spec
instead of uisng strange abbreviations.

That being said the commit log should document the why and the overall
architecture considerations, not low-level details like this.

> +#ifdef CONFIG_BLK_DEV_ZONED
> +	struct nvme_id_ns_zns	id_zns;
> +	unsigned int		zasl;
> +#endif

This wastes a lot of space for all normal non-zns uses.  Please have some
sort of private data field that zns can use.  Why do we even need to store
the whole data structure instead of the few native format fields we need?

>  static inline struct nvmet_ns *to_nvmet_ns(struct config_item *item)
> @@ -251,6 +255,10 @@ struct nvmet_subsys {
>  	unsigned int		admin_timeout;
>  	unsigned int		io_timeout;
>  #endif /* CONFIG_NVME_TARGET_PASSTHRU */
> +
> +#ifdef CONFIG_BLK_DEV_ZONED
> +	struct nvme_id_ctrl_zns	id_ctrl_zns;
> +#endif

Same here.

> +#define NVMET_MPSMIN_SHIFT	12

Needs some documentation on the why.

> +static inline struct block_device *nvmet_bdev(struct nvmet_req *req)
> +{
> +	return req->ns->bdev;
> +}

I don't really see the point of this helper.

> +/*
> + *  ZNS related command implementation and helpers.
> + */
> +
> +u16 nvmet_process_zns_cis(struct nvmet_req *req, off_t *off)
> +{
> +	u16 nvme_cis_zns = NVME_CSI_ZNS;
> +
> +	if (!bdev_is_zoned(nvmet_bdev(req)))
> +		return NVME_SC_SUCCESS;
> +
> +	return nvmet_copy_ns_identifier(req, NVME_NIDT_CSI, NVME_NIDT_CSI_LEN,
> +					&nvme_cis_zns, off);
> +}

This looks weird.  We can want to support the command set identifier in
general, so this should go into common code, and just look up the command
set identifier in the nvmet_ns structure.

> +void nvmet_zns_add_cmd_effects(struct nvme_effects_log *log)
> +{
> +	log->iocs[nvme_cmd_zone_append]		= cpu_to_le32(1 << 0);
> +	log->iocs[nvme_cmd_zone_mgmt_send]	= cpu_to_le32(1 << 0);
> +	log->iocs[nvme_cmd_zone_mgmt_recv]	= cpu_to_le32(1 << 0);
> +}

Just add this to the caller under an if.  For the if a helper that checks
IS_ENABLED() and the csi would be useful.

> +
> +static inline bool nvmet_bdev_validate_zns_zones(struct nvmet_ns *ns)
> +{
> +	if (ns->bdev->bd_disk->queue->conv_zones_bitmap) {
> +		pr_err("block devices with conventional zones are not supported.");
> +		return false;
> +	}
> +
> +	return !(get_capacity(ns->bdev->bd_disk) &
> +			(bdev_zone_sectors(ns->bdev) - 1));
> +}

I think this should be open coded in the caller.

> +static inline u8 nvmet_zasl(unsigned int zone_append_sects)
> +{
> +	/*
> +	 * Zone Append Size Limit is the value experessed in the units
> +	 * of minimum memory page size (i.e. 12) and is reported power of 2.
> +	 */
> +	return ilog2((zone_append_sects << 9) >> NVMET_MPSMIN_SHIFT);
> +}
> +
> +static inline void nvmet_zns_update_zasl(struct nvmet_ns *ns)
> +{
> +	struct request_queue *q = ns->bdev->bd_disk->queue;
> +	struct nvmet_ns *ins;
> +	unsigned long idx;
> +	u8 min_zasl;
> +
> +	/*
> +	 * Calculate new ctrl->zasl value when enabling the new ns. This value
> +	 * has to be the minimum of the max_zone_append values from available
> +	 * namespaces.
> +	 */
> +	min_zasl = ns->zasl = nvmet_zasl(queue_max_zone_append_sectors(q));
> +
> +	xa_for_each(&(ns->subsys->namespaces), idx, ins) {
> +		struct request_queue *iq = ins->bdev->bd_disk->queue;
> +		unsigned int imax_za_sects = queue_max_zone_append_sectors(iq);
> +		u8 izasl = nvmet_zasl(imax_za_sects);
> +
> +		if (!bdev_is_zoned(ins->bdev))
> +			continue;
> +
> +		min_zasl = min_zasl > izasl ? izasl : min_zasl;
> +	}
> +
> +	ns->subsys->id_ctrl_zns.zasl = min_zasl;
> +}

This will change the limit when a new namespaces is added.  I think we need
to just pick the value of the first namespaces and refuse adding a new
one if the limit is lower to not completely break hosts.

> +void nvmet_bdev_execute_zone_append(struct nvmet_req *req)

This whole function looks weird.  I'd expect that we mostly (if not
entirely) reuse nvmet_bdev_execute_rw, just using bio_add_hw_page
instead of bio_add_page and setting up the proper op field.

> +#else  /* CONFIG_BLK_DEV_ZONED */

We really do try to avoid these kinds of ifdefs in .c files.  Either
put the stubs inline into the header, of use IS_ENABLED() magic in
the callers.  In this case I think the new helper I mentioned above
which checks IS_ENABLED + the csi seems like the right way.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH V4 3/9] nvmet: trim down id-desclist to use req->ns
  2020-12-02  6:22   ` Chaitanya Kulkarni
@ 2020-12-02  9:08     ` Christoph Hellwig
  -1 siblings, 0 replies; 54+ messages in thread
From: Christoph Hellwig @ 2020-12-02  9:08 UTC (permalink / raw)
  To: Chaitanya Kulkarni
  Cc: linux-block, linux-nvme, sagi, hch, damien.lemoal, johannes.thumshirn

This should probably go to the front of the queue.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH V4 3/9] nvmet: trim down id-desclist to use req->ns
@ 2020-12-02  9:08     ` Christoph Hellwig
  0 siblings, 0 replies; 54+ messages in thread
From: Christoph Hellwig @ 2020-12-02  9:08 UTC (permalink / raw)
  To: Chaitanya Kulkarni
  Cc: damien.lemoal, sagi, johannes.thumshirn, linux-nvme, linux-block, hch

This should probably go to the front of the queue.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH V4 4/9] nvmet: add NVME_CSI_ZNS in ns-desc for zbdev
  2020-12-02  6:22   ` Chaitanya Kulkarni
@ 2020-12-02  9:09     ` Christoph Hellwig
  -1 siblings, 0 replies; 54+ messages in thread
From: Christoph Hellwig @ 2020-12-02  9:09 UTC (permalink / raw)
  To: Chaitanya Kulkarni
  Cc: linux-block, linux-nvme, sagi, hch, damien.lemoal, johannes.thumshirn

As mentioned before CSI support is a feature independen of ZNS and should
be supported on all controllers, and in a separate prep patch.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH V4 4/9] nvmet: add NVME_CSI_ZNS in ns-desc for zbdev
@ 2020-12-02  9:09     ` Christoph Hellwig
  0 siblings, 0 replies; 54+ messages in thread
From: Christoph Hellwig @ 2020-12-02  9:09 UTC (permalink / raw)
  To: Chaitanya Kulkarni
  Cc: damien.lemoal, sagi, johannes.thumshirn, linux-nvme, linux-block, hch

As mentioned before CSI support is a feature independen of ZNS and should
be supported on all controllers, and in a separate prep patch.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH V4 5/9] nvmet: add cns-cs-ctrl in id-ctrl for ZNS bdev
  2020-12-02  6:22   ` Chaitanya Kulkarni
@ 2020-12-02  9:10     ` Christoph Hellwig
  -1 siblings, 0 replies; 54+ messages in thread
From: Christoph Hellwig @ 2020-12-02  9:10 UTC (permalink / raw)
  To: Chaitanya Kulkarni
  Cc: linux-block, linux-nvme, sagi, hch, damien.lemoal, johannes.thumshirn

On Tue, Dec 01, 2020 at 10:22:23PM -0800, Chaitanya Kulkarni wrote:
> Update the nvmet_execute_identify() such that it can now handle
> NVME_ID_CNS_CS_CTRL when identify.cis is set to ZNS. This allows
> host to identify the support for ZNS.

The changs in this and all following patches really belong into the current
patch 2.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH V4 5/9] nvmet: add cns-cs-ctrl in id-ctrl for ZNS bdev
@ 2020-12-02  9:10     ` Christoph Hellwig
  0 siblings, 0 replies; 54+ messages in thread
From: Christoph Hellwig @ 2020-12-02  9:10 UTC (permalink / raw)
  To: Chaitanya Kulkarni
  Cc: damien.lemoal, sagi, johannes.thumshirn, linux-nvme, linux-block, hch

On Tue, Dec 01, 2020 at 10:22:23PM -0800, Chaitanya Kulkarni wrote:
> Update the nvmet_execute_identify() such that it can now handle
> NVME_ID_CNS_CS_CTRL when identify.cis is set to ZNS. This allows
> host to identify the support for ZNS.

The changs in this and all following patches really belong into the current
patch 2.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH V4 0/9] nvmet: add ZBD backend support
  2020-12-02  6:22 ` Chaitanya Kulkarni
@ 2020-12-02  9:20   ` Christoph Hellwig
  -1 siblings, 0 replies; 54+ messages in thread
From: Christoph Hellwig @ 2020-12-02  9:20 UTC (permalink / raw)
  To: Chaitanya Kulkarni
  Cc: linux-block, linux-nvme, sagi, hch, damien.lemoal, johannes.thumshirn

Unless I'm missing something this fails to advertise multiple command
support in the CAP property, as well as the enablement in the CC
property.  How does the host manage to even use this?

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH V4 0/9] nvmet: add ZBD backend support
@ 2020-12-02  9:20   ` Christoph Hellwig
  0 siblings, 0 replies; 54+ messages in thread
From: Christoph Hellwig @ 2020-12-02  9:20 UTC (permalink / raw)
  To: Chaitanya Kulkarni
  Cc: damien.lemoal, sagi, johannes.thumshirn, linux-nvme, linux-block, hch

Unless I'm missing something this fails to advertise multiple command
support in the CAP property, as well as the enablement in the CC
property.  How does the host manage to even use this?

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH V4 1/9] block: allow bvec for zone append get pages
  2020-12-02  8:55     ` Christoph Hellwig
@ 2020-12-04  2:43       ` Chaitanya Kulkarni
  -1 siblings, 0 replies; 54+ messages in thread
From: Chaitanya Kulkarni @ 2020-12-04  2:43 UTC (permalink / raw)
  To: hch
  Cc: linux-block, linux-nvme, sagi, hch, Damien Le Moal, Johannes Thumshirn

On 12/2/20 00:55, Christoph Hellwig wrote:
> On Tue, Dec 01, 2020 at 10:22:19PM -0800, Chaitanya Kulkarni wrote:
>> Remove the bvec check in the bio_iov_iter_get_pages() for
>> REQ_OP_ZONE_APPEND so that we can reuse the code and build iter from
>> bvec.
> We should do the same optimization for bvec pages that the normal path
> does.  That being said using bio_iov_iter_get_pages in nvmet does not
> 	make any sense to me whatsover.
>
Are you referring to the inline bvec ? then yes, I'll add it in next
version.

I did not understand bio_iov_iter_get_pages() comment though.

Reimplementing the bio loop over sg with the use of bio_add_hw_page() seems

repetition of the code which we already have in bio_iov_iter_get_pages().


Can you please explain why bio_iov_iter_get_pages() not the right way ?



^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH V4 1/9] block: allow bvec for zone append get pages
@ 2020-12-04  2:43       ` Chaitanya Kulkarni
  0 siblings, 0 replies; 54+ messages in thread
From: Chaitanya Kulkarni @ 2020-12-04  2:43 UTC (permalink / raw)
  To: hch
  Cc: Damien Le Moal, sagi, Johannes Thumshirn, linux-nvme, linux-block, hch

On 12/2/20 00:55, Christoph Hellwig wrote:
> On Tue, Dec 01, 2020 at 10:22:19PM -0800, Chaitanya Kulkarni wrote:
>> Remove the bvec check in the bio_iov_iter_get_pages() for
>> REQ_OP_ZONE_APPEND so that we can reuse the code and build iter from
>> bvec.
> We should do the same optimization for bvec pages that the normal path
> does.  That being said using bio_iov_iter_get_pages in nvmet does not
> 	make any sense to me whatsover.
>
Are you referring to the inline bvec ? then yes, I'll add it in next
version.

I did not understand bio_iov_iter_get_pages() comment though.

Reimplementing the bio loop over sg with the use of bio_add_hw_page() seems

repetition of the code which we already have in bio_iov_iter_get_pages().


Can you please explain why bio_iov_iter_get_pages() not the right way ?



_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH V4 2/9] nvmet: add ZNS support for bdev-ns
  2020-12-02  9:07     ` Christoph Hellwig
@ 2020-12-04  3:13       ` Chaitanya Kulkarni
  -1 siblings, 0 replies; 54+ messages in thread
From: Chaitanya Kulkarni @ 2020-12-04  3:13 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-block, linux-nvme, sagi, Damien Le Moal, Johannes Thumshirn

On 12/2/20 01:07, Christoph Hellwig wrote:
> On Tue, Dec 01, 2020 at 10:22:20PM -0800, Chaitanya Kulkarni wrote:
>> Add zns-bdev-config, id-ctrl, id-ns, zns-cmd-effects, zone-mgmt-send,
>> zone-mgmt-recv and zone-append handlers
> If you think you need to speel out command please do so as in the spec
> instead of uisng strange abbreviations.
>
> That being said the commit log should document the why and the overall
> architecture considerations, not low-level details like this.

The commit log is really bad, let me rewrite in a more meaningful way.

>
>> +#ifdef CONFIG_BLK_DEV_ZONED
>> +	struct nvme_id_ns_zns	id_zns;
>> +	unsigned int		zasl;
>> +#endif
> This wastes a lot of space for all normal non-zns uses.  Please have some
> sort of private data field that zns can use.  Why do we even need to store
> the whole data structure instead of the few native format fields we need?
No need, this should use the fields than data structure.
>>  static inline struct nvmet_ns *to_nvmet_ns(struct config_item *item)
>> @@ -251,6 +255,10 @@ struct nvmet_subsys {
>>  	unsigned int		admin_timeout;
>>  	unsigned int		io_timeout;
>>  #endif /* CONFIG_NVME_TARGET_PASSTHRU */
>> +
>> +#ifdef CONFIG_BLK_DEV_ZONED
>> +	struct nvme_id_ctrl_zns	id_ctrl_zns;
>> +#endif
> Same here.
Yep.
>
>> +#define NVMET_MPSMIN_SHIFT	12
> Needs some documentation on the why.
Okay.
>> +static inline struct block_device *nvmet_bdev(struct nvmet_req *req)
>> +{
>> +	return req->ns->bdev;
>> +}
> I don't really see the point of this helper.
Okay, I'll remove it.
>> +/*
>> + *  ZNS related command implementation and helpers.
>> + */
>> +
>> +u16 nvmet_process_zns_cis(struct nvmet_req *req, off_t *off)
>> +{
>> +	u16 nvme_cis_zns = NVME_CSI_ZNS;
>> +
>> +	if (!bdev_is_zoned(nvmet_bdev(req)))
>> +		return NVME_SC_SUCCESS;
>> +
>> +	return nvmet_copy_ns_identifier(req, NVME_NIDT_CSI, NVME_NIDT_CSI_LEN,
>> +					&nvme_cis_zns, off);
>> +}
> This looks weird.  We can want to support the command set identifier in
> general, so this should go into common code, and just look up the command
> set identifier in the nvmet_ns structure.

ZNS is the only user for this, so I've added to the zns code. I'll move to

admin-cmd.

>> +void nvmet_zns_add_cmd_effects(struct nvme_effects_log *log)
>> +{
>> +	log->iocs[nvme_cmd_zone_append]		= cpu_to_le32(1 << 0);
>> +	log->iocs[nvme_cmd_zone_mgmt_send]	= cpu_to_le32(1 << 0);
>> +	log->iocs[nvme_cmd_zone_mgmt_recv]	= cpu_to_le32(1 << 0);
>> +}
> Just add this to the caller under an if.  For the if a helper that checks
> IS_ENABLED() and the csi would be useful.
Okay.
>> +
>> +static inline bool nvmet_bdev_validate_zns_zones(struct nvmet_ns *ns)
>> +{
>> +	if (ns->bdev->bd_disk->queue->conv_zones_bitmap) {
>> +		pr_err("block devices with conventional zones are not supported.");
>> +		return false;
>> +	}
>> +
>> +	return !(get_capacity(ns->bdev->bd_disk) &
>> +			(bdev_zone_sectors(ns->bdev) - 1));
>> +}
> I think this should be open coded in the caller.
>
Okay.
>> +static inline u8 nvmet_zasl(unsigned int zone_append_sects)
>> +{
>> +	/*
>> +	 * Zone Append Size Limit is the value experessed in the units
>> +	 * of minimum memory page size (i.e. 12) and is reported power of 2.
>> +	 */
>> +	return ilog2((zone_append_sects << 9) >> NVMET_MPSMIN_SHIFT);
>> +}
>> +
>> +static inline void nvmet_zns_update_zasl(struct nvmet_ns *ns)
>> +{
>> +	struct request_queue *q = ns->bdev->bd_disk->queue;
>> +	struct nvmet_ns *ins;
>> +	unsigned long idx;
>> +	u8 min_zasl;
>> +
>> +	/*
>> +	 * Calculate new ctrl->zasl value when enabling the new ns. This value
>> +	 * has to be the minimum of the max_zone_append values from available
>> +	 * namespaces.
>> +	 */
>> +	min_zasl = ns->zasl = nvmet_zasl(queue_max_zone_append_sectors(q));
>> +
>> +	xa_for_each(&(ns->subsys->namespaces), idx, ins) {
>> +		struct request_queue *iq = ins->bdev->bd_disk->queue;
>> +		unsigned int imax_za_sects = queue_max_zone_append_sectors(iq);
>> +		u8 izasl = nvmet_zasl(imax_za_sects);
>> +
>> +		if (!bdev_is_zoned(ins->bdev))
>> +			continue;
>> +
>> +		min_zasl = min_zasl > izasl ? izasl : min_zasl;
>> +	}
>> +
>> +	ns->subsys->id_ctrl_zns.zasl = min_zasl;
>> +}
> This will change the limit when a new namespaces is added.  I think we need
> to just pick the value of the first namespaces and refuse adding a new
> one if the limit is lower to not completely break hosts.

But that will force users to add ns with highest zasl first no matter what.

Isn't there should be a way to update the host with async event

so that host can refresh the ctrl->zasl when ns addition async notification

is generated ?

>> +void nvmet_bdev_execute_zone_append(struct nvmet_req *req)
> This whole function looks weird.  I'd expect that we mostly (if not
> entirely) reuse nvmet_bdev_execute_rw, just using bio_add_hw_page
> instead of bio_add_page and setting up the proper op field.

that was the initial choice, please see the reply to the your comment

on the first patch.

>> +#else  /* CONFIG_BLK_DEV_ZONED */
> We really do try to avoid these kinds of ifdefs in .c files.  Either
> put the stubs inline into the header, of use IS_ENABLED() magic in
> the callers.  In this case I think the new helper I mentioned above
> which checks IS_ENABLED + the csi seems like the right way.
>
With addition of the these empty stubs these functions are getting added

to all the transport code which has nothing to with the backend that felt

wrong to me. But if you are okay with that I'll make this change.




^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH V4 2/9] nvmet: add ZNS support for bdev-ns
@ 2020-12-04  3:13       ` Chaitanya Kulkarni
  0 siblings, 0 replies; 54+ messages in thread
From: Chaitanya Kulkarni @ 2020-12-04  3:13 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-block, Johannes Thumshirn, Damien Le Moal, sagi, linux-nvme

On 12/2/20 01:07, Christoph Hellwig wrote:
> On Tue, Dec 01, 2020 at 10:22:20PM -0800, Chaitanya Kulkarni wrote:
>> Add zns-bdev-config, id-ctrl, id-ns, zns-cmd-effects, zone-mgmt-send,
>> zone-mgmt-recv and zone-append handlers
> If you think you need to speel out command please do so as in the spec
> instead of uisng strange abbreviations.
>
> That being said the commit log should document the why and the overall
> architecture considerations, not low-level details like this.

The commit log is really bad, let me rewrite in a more meaningful way.

>
>> +#ifdef CONFIG_BLK_DEV_ZONED
>> +	struct nvme_id_ns_zns	id_zns;
>> +	unsigned int		zasl;
>> +#endif
> This wastes a lot of space for all normal non-zns uses.  Please have some
> sort of private data field that zns can use.  Why do we even need to store
> the whole data structure instead of the few native format fields we need?
No need, this should use the fields than data structure.
>>  static inline struct nvmet_ns *to_nvmet_ns(struct config_item *item)
>> @@ -251,6 +255,10 @@ struct nvmet_subsys {
>>  	unsigned int		admin_timeout;
>>  	unsigned int		io_timeout;
>>  #endif /* CONFIG_NVME_TARGET_PASSTHRU */
>> +
>> +#ifdef CONFIG_BLK_DEV_ZONED
>> +	struct nvme_id_ctrl_zns	id_ctrl_zns;
>> +#endif
> Same here.
Yep.
>
>> +#define NVMET_MPSMIN_SHIFT	12
> Needs some documentation on the why.
Okay.
>> +static inline struct block_device *nvmet_bdev(struct nvmet_req *req)
>> +{
>> +	return req->ns->bdev;
>> +}
> I don't really see the point of this helper.
Okay, I'll remove it.
>> +/*
>> + *  ZNS related command implementation and helpers.
>> + */
>> +
>> +u16 nvmet_process_zns_cis(struct nvmet_req *req, off_t *off)
>> +{
>> +	u16 nvme_cis_zns = NVME_CSI_ZNS;
>> +
>> +	if (!bdev_is_zoned(nvmet_bdev(req)))
>> +		return NVME_SC_SUCCESS;
>> +
>> +	return nvmet_copy_ns_identifier(req, NVME_NIDT_CSI, NVME_NIDT_CSI_LEN,
>> +					&nvme_cis_zns, off);
>> +}
> This looks weird.  We can want to support the command set identifier in
> general, so this should go into common code, and just look up the command
> set identifier in the nvmet_ns structure.

ZNS is the only user for this, so I've added to the zns code. I'll move to

admin-cmd.

>> +void nvmet_zns_add_cmd_effects(struct nvme_effects_log *log)
>> +{
>> +	log->iocs[nvme_cmd_zone_append]		= cpu_to_le32(1 << 0);
>> +	log->iocs[nvme_cmd_zone_mgmt_send]	= cpu_to_le32(1 << 0);
>> +	log->iocs[nvme_cmd_zone_mgmt_recv]	= cpu_to_le32(1 << 0);
>> +}
> Just add this to the caller under an if.  For the if a helper that checks
> IS_ENABLED() and the csi would be useful.
Okay.
>> +
>> +static inline bool nvmet_bdev_validate_zns_zones(struct nvmet_ns *ns)
>> +{
>> +	if (ns->bdev->bd_disk->queue->conv_zones_bitmap) {
>> +		pr_err("block devices with conventional zones are not supported.");
>> +		return false;
>> +	}
>> +
>> +	return !(get_capacity(ns->bdev->bd_disk) &
>> +			(bdev_zone_sectors(ns->bdev) - 1));
>> +}
> I think this should be open coded in the caller.
>
Okay.
>> +static inline u8 nvmet_zasl(unsigned int zone_append_sects)
>> +{
>> +	/*
>> +	 * Zone Append Size Limit is the value experessed in the units
>> +	 * of minimum memory page size (i.e. 12) and is reported power of 2.
>> +	 */
>> +	return ilog2((zone_append_sects << 9) >> NVMET_MPSMIN_SHIFT);
>> +}
>> +
>> +static inline void nvmet_zns_update_zasl(struct nvmet_ns *ns)
>> +{
>> +	struct request_queue *q = ns->bdev->bd_disk->queue;
>> +	struct nvmet_ns *ins;
>> +	unsigned long idx;
>> +	u8 min_zasl;
>> +
>> +	/*
>> +	 * Calculate new ctrl->zasl value when enabling the new ns. This value
>> +	 * has to be the minimum of the max_zone_append values from available
>> +	 * namespaces.
>> +	 */
>> +	min_zasl = ns->zasl = nvmet_zasl(queue_max_zone_append_sectors(q));
>> +
>> +	xa_for_each(&(ns->subsys->namespaces), idx, ins) {
>> +		struct request_queue *iq = ins->bdev->bd_disk->queue;
>> +		unsigned int imax_za_sects = queue_max_zone_append_sectors(iq);
>> +		u8 izasl = nvmet_zasl(imax_za_sects);
>> +
>> +		if (!bdev_is_zoned(ins->bdev))
>> +			continue;
>> +
>> +		min_zasl = min_zasl > izasl ? izasl : min_zasl;
>> +	}
>> +
>> +	ns->subsys->id_ctrl_zns.zasl = min_zasl;
>> +}
> This will change the limit when a new namespaces is added.  I think we need
> to just pick the value of the first namespaces and refuse adding a new
> one if the limit is lower to not completely break hosts.

But that will force users to add ns with highest zasl first no matter what.

Isn't there should be a way to update the host with async event

so that host can refresh the ctrl->zasl when ns addition async notification

is generated ?

>> +void nvmet_bdev_execute_zone_append(struct nvmet_req *req)
> This whole function looks weird.  I'd expect that we mostly (if not
> entirely) reuse nvmet_bdev_execute_rw, just using bio_add_hw_page
> instead of bio_add_page and setting up the proper op field.

that was the initial choice, please see the reply to the your comment

on the first patch.

>> +#else  /* CONFIG_BLK_DEV_ZONED */
> We really do try to avoid these kinds of ifdefs in .c files.  Either
> put the stubs inline into the header, of use IS_ENABLED() magic in
> the callers.  In this case I think the new helper I mentioned above
> which checks IS_ENABLED + the csi seems like the right way.
>
With addition of the these empty stubs these functions are getting added

to all the transport code which has nothing to with the backend that felt

wrong to me. But if you are okay with that I'll make this change.




_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH V4 3/9] nvmet: trim down id-desclist to use req->ns
  2020-12-02  9:08     ` Christoph Hellwig
@ 2020-12-04  3:14       ` Chaitanya Kulkarni
  -1 siblings, 0 replies; 54+ messages in thread
From: Chaitanya Kulkarni @ 2020-12-04  3:14 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-block, linux-nvme, sagi, Damien Le Moal, Johannes Thumshirn

On 12/2/20 01:08, Christoph Hellwig wrote:
> This should probably go to the front of the queue.
>
Yes.


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH V4 3/9] nvmet: trim down id-desclist to use req->ns
@ 2020-12-04  3:14       ` Chaitanya Kulkarni
  0 siblings, 0 replies; 54+ messages in thread
From: Chaitanya Kulkarni @ 2020-12-04  3:14 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-block, Johannes Thumshirn, Damien Le Moal, sagi, linux-nvme

On 12/2/20 01:08, Christoph Hellwig wrote:
> This should probably go to the front of the queue.
>
Yes.


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH V4 4/9] nvmet: add NVME_CSI_ZNS in ns-desc for zbdev
  2020-12-02  9:09     ` Christoph Hellwig
@ 2020-12-04  3:14       ` Chaitanya Kulkarni
  -1 siblings, 0 replies; 54+ messages in thread
From: Chaitanya Kulkarni @ 2020-12-04  3:14 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-block, linux-nvme, sagi, Damien Le Moal, Johannes Thumshirn

On 12/2/20 01:09, Christoph Hellwig wrote:
> As mentioned before CSI support is a feature independen of ZNS and should
> be supported on all controllers, and in a separate prep patch.
>
Okay.



^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH V4 4/9] nvmet: add NVME_CSI_ZNS in ns-desc for zbdev
@ 2020-12-04  3:14       ` Chaitanya Kulkarni
  0 siblings, 0 replies; 54+ messages in thread
From: Chaitanya Kulkarni @ 2020-12-04  3:14 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-block, Johannes Thumshirn, Damien Le Moal, sagi, linux-nvme

On 12/2/20 01:09, Christoph Hellwig wrote:
> As mentioned before CSI support is a feature independen of ZNS and should
> be supported on all controllers, and in a separate prep patch.
>
Okay.



_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH V4 5/9] nvmet: add cns-cs-ctrl in id-ctrl for ZNS bdev
  2020-12-02  9:10     ` Christoph Hellwig
@ 2020-12-04  3:20       ` Chaitanya Kulkarni
  -1 siblings, 0 replies; 54+ messages in thread
From: Chaitanya Kulkarni @ 2020-12-04  3:20 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-block, linux-nvme, sagi, Damien Le Moal, Johannes Thumshirn

On 12/2/20 01:10, Christoph Hellwig wrote:
> On Tue, Dec 01, 2020 at 10:22:23PM -0800, Chaitanya Kulkarni wrote:
>> Update the nvmet_execute_identify() such that it can now handle
>> NVME_ID_CNS_CS_CTRL when identify.cis is set to ZNS. This allows
>> host to identify the support for ZNS.
> The changs in this and all following patches really belong into the current
> patch 2.
>

This and each following patch doing one specific thing for which host

side component is responsible, with one patch for backend handlers and

wiring up of the same on for admin-cmd.c and io-cmd-bdev.c seems

we are packing too much into one, unlike what we have has a nice

bottom-up flow for the implementation, why not we keep that nice flow ?


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH V4 5/9] nvmet: add cns-cs-ctrl in id-ctrl for ZNS bdev
@ 2020-12-04  3:20       ` Chaitanya Kulkarni
  0 siblings, 0 replies; 54+ messages in thread
From: Chaitanya Kulkarni @ 2020-12-04  3:20 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-block, Johannes Thumshirn, Damien Le Moal, sagi, linux-nvme

On 12/2/20 01:10, Christoph Hellwig wrote:
> On Tue, Dec 01, 2020 at 10:22:23PM -0800, Chaitanya Kulkarni wrote:
>> Update the nvmet_execute_identify() such that it can now handle
>> NVME_ID_CNS_CS_CTRL when identify.cis is set to ZNS. This allows
>> host to identify the support for ZNS.
> The changs in this and all following patches really belong into the current
> patch 2.
>

This and each following patch doing one specific thing for which host

side component is responsible, with one patch for backend handlers and

wiring up of the same on for admin-cmd.c and io-cmd-bdev.c seems

we are packing too much into one, unlike what we have has a nice

bottom-up flow for the implementation, why not we keep that nice flow ?


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH V4 1/9] block: allow bvec for zone append get pages
  2020-12-04  2:43       ` Chaitanya Kulkarni
@ 2020-12-04  8:46         ` hch
  -1 siblings, 0 replies; 54+ messages in thread
From: hch @ 2020-12-04  8:46 UTC (permalink / raw)
  To: Chaitanya Kulkarni
  Cc: hch, linux-block, linux-nvme, sagi, hch, Damien Le Moal,
	Johannes Thumshirn

On Fri, Dec 04, 2020 at 02:43:10AM +0000, Chaitanya Kulkarni wrote:
> >> Remove the bvec check in the bio_iov_iter_get_pages() for
> >> REQ_OP_ZONE_APPEND so that we can reuse the code and build iter from
> >> bvec.
> > We should do the same optimization for bvec pages that the normal path
> > does.  That being said using bio_iov_iter_get_pages in nvmet does not
> > 	make any sense to me whatsover.
> >
> Are you referring to the inline bvec ? then yes, I'll add it in next
> version.
> 
> I did not understand bio_iov_iter_get_pages() comment though.
> 
> Reimplementing the bio loop over sg with the use of bio_add_hw_page() seems
> 
> repetition of the code which we already have in bio_iov_iter_get_pages().
> 
> 
> Can you please explain why bio_iov_iter_get_pages() not the right way ?

bio_iov_iter_get_pages is highly inefficient for this use case, as we'll
need to allocate two sets of bio_vecs.  It also is rather redundant as
Zone Append should be able to just largely reuse the write path.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH V4 1/9] block: allow bvec for zone append get pages
@ 2020-12-04  8:46         ` hch
  0 siblings, 0 replies; 54+ messages in thread
From: hch @ 2020-12-04  8:46 UTC (permalink / raw)
  To: Chaitanya Kulkarni
  Cc: linux-block, Damien Le Moal, sagi, Johannes Thumshirn,
	linux-nvme, hch, hch

On Fri, Dec 04, 2020 at 02:43:10AM +0000, Chaitanya Kulkarni wrote:
> >> Remove the bvec check in the bio_iov_iter_get_pages() for
> >> REQ_OP_ZONE_APPEND so that we can reuse the code and build iter from
> >> bvec.
> > We should do the same optimization for bvec pages that the normal path
> > does.  That being said using bio_iov_iter_get_pages in nvmet does not
> > 	make any sense to me whatsover.
> >
> Are you referring to the inline bvec ? then yes, I'll add it in next
> version.
> 
> I did not understand bio_iov_iter_get_pages() comment though.
> 
> Reimplementing the bio loop over sg with the use of bio_add_hw_page() seems
> 
> repetition of the code which we already have in bio_iov_iter_get_pages().
> 
> 
> Can you please explain why bio_iov_iter_get_pages() not the right way ?

bio_iov_iter_get_pages is highly inefficient for this use case, as we'll
need to allocate two sets of bio_vecs.  It also is rather redundant as
Zone Append should be able to just largely reuse the write path.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH V4 2/9] nvmet: add ZNS support for bdev-ns
  2020-12-04  3:13       ` Chaitanya Kulkarni
@ 2020-12-04  9:27         ` Christoph Hellwig
  -1 siblings, 0 replies; 54+ messages in thread
From: Christoph Hellwig @ 2020-12-04  9:27 UTC (permalink / raw)
  To: Chaitanya Kulkarni
  Cc: Christoph Hellwig, linux-block, linux-nvme, sagi, Damien Le Moal,
	Johannes Thumshirn

On Fri, Dec 04, 2020 at 03:13:40AM +0000, Chaitanya Kulkarni wrote:
> >> +	if (!bdev_is_zoned(nvmet_bdev(req)))
> >> +		return NVME_SC_SUCCESS;
> >> +
> >> +	return nvmet_copy_ns_identifier(req, NVME_NIDT_CSI, NVME_NIDT_CSI_LEN,
> >> +					&nvme_cis_zns, off);
> >> +}
> > This looks weird.  We can want to support the command set identifier in
> > general, so this should go into common code, and just look up the command
> > set identifier in the nvmet_ns structure.
> 
> ZNS is the only user for this, so I've added to the zns code. I'll move to
> 
> admin-cmd.

CSI is a generic feature, ZNS is just the first thing that requires it.

> > This will change the limit when a new namespaces is added.  I think we need
> > to just pick the value of the first namespaces and refuse adding a new
> > one if the limit is lower to not completely break hosts.
> 
> But that will force users to add ns with highest zasl first no matter what.
> 
> Isn't there should be a way to update the host with async event
> 
> so that host can refresh the ctrl->zasl when ns addition async notification
> 
> is generated ?

I'd rather not go too dynamic and change capabilities down dynamically.
I think that will cause all kinds of fun problems with I/Os already
queue up in the block layer while the limit changes.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH V4 2/9] nvmet: add ZNS support for bdev-ns
@ 2020-12-04  9:27         ` Christoph Hellwig
  0 siblings, 0 replies; 54+ messages in thread
From: Christoph Hellwig @ 2020-12-04  9:27 UTC (permalink / raw)
  To: Chaitanya Kulkarni
  Cc: Damien Le Moal, sagi, Johannes Thumshirn, linux-nvme,
	linux-block, Christoph Hellwig

On Fri, Dec 04, 2020 at 03:13:40AM +0000, Chaitanya Kulkarni wrote:
> >> +	if (!bdev_is_zoned(nvmet_bdev(req)))
> >> +		return NVME_SC_SUCCESS;
> >> +
> >> +	return nvmet_copy_ns_identifier(req, NVME_NIDT_CSI, NVME_NIDT_CSI_LEN,
> >> +					&nvme_cis_zns, off);
> >> +}
> > This looks weird.  We can want to support the command set identifier in
> > general, so this should go into common code, and just look up the command
> > set identifier in the nvmet_ns structure.
> 
> ZNS is the only user for this, so I've added to the zns code. I'll move to
> 
> admin-cmd.

CSI is a generic feature, ZNS is just the first thing that requires it.

> > This will change the limit when a new namespaces is added.  I think we need
> > to just pick the value of the first namespaces and refuse adding a new
> > one if the limit is lower to not completely break hosts.
> 
> But that will force users to add ns with highest zasl first no matter what.
> 
> Isn't there should be a way to update the host with async event
> 
> so that host can refresh the ctrl->zasl when ns addition async notification
> 
> is generated ?

I'd rather not go too dynamic and change capabilities down dynamically.
I think that will cause all kinds of fun problems with I/Os already
queue up in the block layer while the limit changes.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH V4 5/9] nvmet: add cns-cs-ctrl in id-ctrl for ZNS bdev
  2020-12-04  3:20       ` Chaitanya Kulkarni
@ 2020-12-04  9:28         ` Christoph Hellwig
  -1 siblings, 0 replies; 54+ messages in thread
From: Christoph Hellwig @ 2020-12-04  9:28 UTC (permalink / raw)
  To: Chaitanya Kulkarni
  Cc: Christoph Hellwig, linux-block, linux-nvme, sagi, Damien Le Moal,
	Johannes Thumshirn

On Fri, Dec 04, 2020 at 03:20:40AM +0000, Chaitanya Kulkarni wrote:
> On 12/2/20 01:10, Christoph Hellwig wrote:
> > On Tue, Dec 01, 2020 at 10:22:23PM -0800, Chaitanya Kulkarni wrote:
> >> Update the nvmet_execute_identify() such that it can now handle
> >> NVME_ID_CNS_CS_CTRL when identify.cis is set to ZNS. This allows
> >> host to identify the support for ZNS.
> > The changs in this and all following patches really belong into the current
> > patch 2.
> >
> 
> This and each following patch doing one specific thing for which host
> 
> side component is responsible, with one patch for backend handlers and
> 
> wiring up of the same on for admin-cmd.c and io-cmd-bdev.c seems
> 
> we are packing too much into one, unlike what we have has a nice
> 
> bottom-up flow for the implementation, why not we keep that nice flow ?

Patches should do one thing at time.  With this series patch 2
is very incomplete, which makes both review and later reading the
commit history very hard.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH V4 5/9] nvmet: add cns-cs-ctrl in id-ctrl for ZNS bdev
@ 2020-12-04  9:28         ` Christoph Hellwig
  0 siblings, 0 replies; 54+ messages in thread
From: Christoph Hellwig @ 2020-12-04  9:28 UTC (permalink / raw)
  To: Chaitanya Kulkarni
  Cc: Damien Le Moal, sagi, Johannes Thumshirn, linux-nvme,
	linux-block, Christoph Hellwig

On Fri, Dec 04, 2020 at 03:20:40AM +0000, Chaitanya Kulkarni wrote:
> On 12/2/20 01:10, Christoph Hellwig wrote:
> > On Tue, Dec 01, 2020 at 10:22:23PM -0800, Chaitanya Kulkarni wrote:
> >> Update the nvmet_execute_identify() such that it can now handle
> >> NVME_ID_CNS_CS_CTRL when identify.cis is set to ZNS. This allows
> >> host to identify the support for ZNS.
> > The changs in this and all following patches really belong into the current
> > patch 2.
> >
> 
> This and each following patch doing one specific thing for which host
> 
> side component is responsible, with one patch for backend handlers and
> 
> wiring up of the same on for admin-cmd.c and io-cmd-bdev.c seems
> 
> we are packing too much into one, unlike what we have has a nice
> 
> bottom-up flow for the implementation, why not we keep that nice flow ?

Patches should do one thing at time.  With this series patch 2
is very incomplete, which makes both review and later reading the
commit history very hard.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH V4 0/9] nvmet: add ZBD backend support
  2020-12-02  9:20   ` Christoph Hellwig
@ 2020-12-10  3:07     ` Chaitanya Kulkarni
  -1 siblings, 0 replies; 54+ messages in thread
From: Chaitanya Kulkarni @ 2020-12-10  3:07 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-block, linux-nvme, sagi, Damien Le Moal, Johannes Thumshirn

On 12/2/20 01:20, Christoph Hellwig wrote:
> Unless I'm missing something this fails to advertise multiple command
> support in the CAP property, as well as the enablement in the CC
> property.  How does the host manage to even use this?
>
Yes, it is because host side doesn't check for the controller cap
property it
only checks for the ns->head->ids.csi == NVME_CSI_ZNS that is set from the
ns-desclist call, so this series got awaywithout cap/cc and CSI target side
support.

I think something like following (totally untested) will help to avoid the
scenarios like this for ZNS drives so we can rejects the buggy controllers
early to make sure we are spec compliant :-

# git diff# git diff
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index d9b152bae19d..7b196299c9b7 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2166,6 +2166,11 @@ static int nvme_update_ns_info(struct nvme_ns
*ns, struct nvme_id_ns *id)
        nvme_set_queue_limits(ns->ctrl, ns->queue);
 
        if (ns->head->ids.csi == NVME_CSI_ZNS) {
+               if (!(NVME_CAP_CSS(ns->ctrl->cap) & NVME_CAP_CSS_CSI)) {
+                       pr_err("zns ns found with ctrl support for CSI");
+                       goto out_unfreeze;
+               }
+
                ret = nvme_update_zone_info(ns, lbaf);
                if (ret)
                        goto out_unfreeze;



^ permalink raw reply related	[flat|nested] 54+ messages in thread

* Re: [PATCH V4 0/9] nvmet: add ZBD backend support
@ 2020-12-10  3:07     ` Chaitanya Kulkarni
  0 siblings, 0 replies; 54+ messages in thread
From: Chaitanya Kulkarni @ 2020-12-10  3:07 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-block, Johannes Thumshirn, Damien Le Moal, sagi, linux-nvme

On 12/2/20 01:20, Christoph Hellwig wrote:
> Unless I'm missing something this fails to advertise multiple command
> support in the CAP property, as well as the enablement in the CC
> property.  How does the host manage to even use this?
>
Yes, it is because host side doesn't check for the controller cap
property it
only checks for the ns->head->ids.csi == NVME_CSI_ZNS that is set from the
ns-desclist call, so this series got awaywithout cap/cc and CSI target side
support.

I think something like following (totally untested) will help to avoid the
scenarios like this for ZNS drives so we can rejects the buggy controllers
early to make sure we are spec compliant :-

# git diff# git diff
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index d9b152bae19d..7b196299c9b7 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2166,6 +2166,11 @@ static int nvme_update_ns_info(struct nvme_ns
*ns, struct nvme_id_ns *id)
        nvme_set_queue_limits(ns->ctrl, ns->queue);
 
        if (ns->head->ids.csi == NVME_CSI_ZNS) {
+               if (!(NVME_CAP_CSS(ns->ctrl->cap) & NVME_CAP_CSS_CSI)) {
+                       pr_err("zns ns found with ctrl support for CSI");
+                       goto out_unfreeze;
+               }
+
                ret = nvme_update_zone_info(ns, lbaf);
                if (ret)
                        goto out_unfreeze;



_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* Re: [PATCH V4 0/9] nvmet: add ZBD backend support
  2020-12-10  3:07     ` Chaitanya Kulkarni
@ 2020-12-10 15:15       ` Keith Busch
  -1 siblings, 0 replies; 54+ messages in thread
From: Keith Busch @ 2020-12-10 15:15 UTC (permalink / raw)
  To: Chaitanya Kulkarni
  Cc: Christoph Hellwig, linux-block, Johannes Thumshirn,
	Damien Le Moal, sagi, linux-nvme

On Thu, Dec 10, 2020 at 03:07:32AM +0000, Chaitanya Kulkarni wrote:
> I think something like following (totally untested) will help to avoid the
> scenarios like this for ZNS drives so we can rejects the buggy controllers
> early to make sure we are spec compliant :-
> 
> # git diff# git diff
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index d9b152bae19d..7b196299c9b7 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -2166,6 +2166,11 @@ static int nvme_update_ns_info(struct nvme_ns
> *ns, struct nvme_id_ns *id)
>         nvme_set_queue_limits(ns->ctrl, ns->queue);
>  
>         if (ns->head->ids.csi == NVME_CSI_ZNS) {
> +               if (!(NVME_CAP_CSS(ns->ctrl->cap) & NVME_CAP_CSS_CSI)) {
> +                       pr_err("zns ns found with ctrl support for CSI");
> +                       goto out_unfreeze;
> +               }

I think you want to use nvme_multi_css() for the 'if()' check, but
otherwise this looks like a valid sanity check.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH V4 0/9] nvmet: add ZBD backend support
@ 2020-12-10 15:15       ` Keith Busch
  0 siblings, 0 replies; 54+ messages in thread
From: Keith Busch @ 2020-12-10 15:15 UTC (permalink / raw)
  To: Chaitanya Kulkarni
  Cc: Damien Le Moal, sagi, Johannes Thumshirn, linux-nvme,
	linux-block, Christoph Hellwig

On Thu, Dec 10, 2020 at 03:07:32AM +0000, Chaitanya Kulkarni wrote:
> I think something like following (totally untested) will help to avoid the
> scenarios like this for ZNS drives so we can rejects the buggy controllers
> early to make sure we are spec compliant :-
> 
> # git diff# git diff
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index d9b152bae19d..7b196299c9b7 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -2166,6 +2166,11 @@ static int nvme_update_ns_info(struct nvme_ns
> *ns, struct nvme_id_ns *id)
>         nvme_set_queue_limits(ns->ctrl, ns->queue);
>  
>         if (ns->head->ids.csi == NVME_CSI_ZNS) {
> +               if (!(NVME_CAP_CSS(ns->ctrl->cap) & NVME_CAP_CSS_CSI)) {
> +                       pr_err("zns ns found with ctrl support for CSI");
> +                       goto out_unfreeze;
> +               }

I think you want to use nvme_multi_css() for the 'if()' check, but
otherwise this looks like a valid sanity check.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 54+ messages in thread

end of thread, other threads:[~2020-12-10 15:16 UTC | newest]

Thread overview: 54+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-02  6:22 [PATCH V4 0/9] nvmet: add ZBD backend support Chaitanya Kulkarni
2020-12-02  6:22 ` Chaitanya Kulkarni
2020-12-02  6:22 ` [PATCH V4 1/9] block: allow bvec for zone append get pages Chaitanya Kulkarni
2020-12-02  6:22   ` Chaitanya Kulkarni
2020-12-02  8:34   ` Johannes Thumshirn
2020-12-02  8:34     ` Johannes Thumshirn
2020-12-02  8:55   ` Christoph Hellwig
2020-12-02  8:55     ` Christoph Hellwig
2020-12-04  2:43     ` Chaitanya Kulkarni
2020-12-04  2:43       ` Chaitanya Kulkarni
2020-12-04  8:46       ` hch
2020-12-04  8:46         ` hch
2020-12-02  6:22 ` [PATCH V4 2/9] nvmet: add ZNS support for bdev-ns Chaitanya Kulkarni
2020-12-02  6:22   ` Chaitanya Kulkarni
2020-12-02  9:07   ` Christoph Hellwig
2020-12-02  9:07     ` Christoph Hellwig
2020-12-04  3:13     ` Chaitanya Kulkarni
2020-12-04  3:13       ` Chaitanya Kulkarni
2020-12-04  9:27       ` Christoph Hellwig
2020-12-04  9:27         ` Christoph Hellwig
2020-12-02  6:22 ` [PATCH V4 3/9] nvmet: trim down id-desclist to use req->ns Chaitanya Kulkarni
2020-12-02  6:22   ` Chaitanya Kulkarni
2020-12-02  9:08   ` Christoph Hellwig
2020-12-02  9:08     ` Christoph Hellwig
2020-12-04  3:14     ` Chaitanya Kulkarni
2020-12-04  3:14       ` Chaitanya Kulkarni
2020-12-02  6:22 ` [PATCH V4 4/9] nvmet: add NVME_CSI_ZNS in ns-desc for zbdev Chaitanya Kulkarni
2020-12-02  6:22   ` Chaitanya Kulkarni
2020-12-02  9:09   ` Christoph Hellwig
2020-12-02  9:09     ` Christoph Hellwig
2020-12-04  3:14     ` Chaitanya Kulkarni
2020-12-04  3:14       ` Chaitanya Kulkarni
2020-12-02  6:22 ` [PATCH V4 5/9] nvmet: add cns-cs-ctrl in id-ctrl for ZNS bdev Chaitanya Kulkarni
2020-12-02  6:22   ` Chaitanya Kulkarni
2020-12-02  9:10   ` Christoph Hellwig
2020-12-02  9:10     ` Christoph Hellwig
2020-12-04  3:20     ` Chaitanya Kulkarni
2020-12-04  3:20       ` Chaitanya Kulkarni
2020-12-04  9:28       ` Christoph Hellwig
2020-12-04  9:28         ` Christoph Hellwig
2020-12-02  6:22 ` [PATCH V4 6/9] nvmet: add cns-cs-ns " Chaitanya Kulkarni
2020-12-02  6:22   ` Chaitanya Kulkarni
2020-12-02  6:22 ` [PATCH V4 7/9] nvmet: add zns cmd effects to support zbdev Chaitanya Kulkarni
2020-12-02  6:22   ` Chaitanya Kulkarni
2020-12-02  6:22 ` [PATCH V4 8/9] nvmet: add zns bdev config support Chaitanya Kulkarni
2020-12-02  6:22   ` Chaitanya Kulkarni
2020-12-02  6:22 ` [PATCH V4 9/9] nvmet: add ZNS based I/O cmds handlers Chaitanya Kulkarni
2020-12-02  6:22   ` Chaitanya Kulkarni
2020-12-02  9:20 ` [PATCH V4 0/9] nvmet: add ZBD backend support Christoph Hellwig
2020-12-02  9:20   ` Christoph Hellwig
2020-12-10  3:07   ` Chaitanya Kulkarni
2020-12-10  3:07     ` Chaitanya Kulkarni
2020-12-10 15:15     ` Keith Busch
2020-12-10 15:15       ` Keith Busch

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.