* [PATCH V11 0/4] nvmet: add ZBD backend support
From: Chaitanya Kulkarni @ 2021-03-11  4:39 UTC
To: linux-nvme; +Cc: hch, kbusch, sagi, damien.lemoal, Chaitanya Kulkarni

Hi,

The NVMeOF host is capable of handling NVMe-protocol-based zoned block
devices (ZBDs) in Zoned Namespace (ZNS) mode with the passthru backend,
but there is no generic block device backend support for ZBDs that are
not NVMe protocol compliant. This series adds support to export ZBDs
(which are not NVMe drives) from the target to the host via NVMeOF,
using the host-side ZNS interface.

Note: This patch-series is based on nvme-5.13.

Following scenarios tested successfully:-

* Zonefs tests with dm-linear on top of an SMR HDD exported over NVMeOF.
* nvme and zonefs tests with CONFIG_BLK_DEV_ZONED on a membacked
  null_blk zoned target.
* nvme tests on the generic bdev target without CONFIG_BLK_DEV_ZONED.

-ck

Changes from V10:-

1.  Move the CONFIG_BLK_DEV_ZONED check into the caller of
    nvmet_set_csi_zns_effects().
2.  Move the ZNS-related csi code from the csi patch into its own ZNS
    backend patch.
3.  For the ZNS command effects log, set the default command effects
    with nvmet_set_csi_nvm_effects() along with
    nvmet_set_csi_zns_effects().
4.  Use goto for the failure case in nvmet_set_csi_zns_effects().
5.  Return directly from the switch in
    nvmet_execute_identify_desclist_csi().
6.  Merge the 2nd/3rd patches into one and move the ZNS-related code
    into its own patch:-
    [PATCH V10 2/8] nvmet: add NVM Command Set Identifier support
    [PATCH V10 3/8] nvmet: add command set supported ctrl cap
    merged into a new patch minus the ZNS calls:-
    [PATCH V11 1/4] nvmet: add NVM Command Set Identifier support
7.  Move the req->cmd->identify.csi == NVME_CSI_ZNS checks into the
    respective callers in nvmet_execute_identify().
8.  Update the error log page in nvmet_bdev_zns_checks().
9.  Remove the ternary expression in nvmet_zns_update_zasl().
10. Drop the following patches:-
    [PATCH V10 1/8] nvmet: trim args for nvmet_copy_ns_identifier()
    [PATCH V10 6/8] nvme-core: check ctrl css before setting up zns
    [PATCH V10 7/8] nvme-core: add a helper to print css related error

Changes from V9:-

1.  Use bio_add_zone_append_page() for REQ_OP_ZONE_APPEND.
2.  Add a separate prep patch to reduce the arguments for
    nvmet_copy_ns_identifier().
3.  Add a separate patch for nvmet CSI support.
4.  Add a separate patch for nvmet CSS support.
5.  Move nvmet_cc_css_check() to the multi-css support patch.
6.  Return an error from the csi cmd effects helper when
    !CONFIG_BLK_DEV_ZONED.
7.  Return an error from the id desc list helper when
    !CONFIG_BLK_DEV_ZONED.
8.  Remove goto and return from nvmet_bdev_zns_checks().
9.  Move the nr_zones calculation near the call to
    blkdev_report_zones() in nvmet_bdev_execute_zone_mgmt_recv().
10. Split the command effects log into respective CSI helpers.
11. Don't use local variables to pass NVME_CSI_XXX values; instead use
    req->ns->csi. Also move this from the ZBD support patch to the
    nvmet CSI support patch.
12. Fix the bug that was checking the cns value instead of csi in
    identify.
13. bdev_is_zoned() is stubbed out without CONFIG_BLK_DEV_ZONED, so
    remove the check for CONFIG_BLK_DEV_ZONED before calling
    bdev_is_zoned().
14. Drop the following patches:-
    [PATCH V9 1/9] block: export bio_add_hw_pages().
    [PATCH V9 5/9] nvmet: add bio get helper.
    [PATCH V9 6/9] nvmet: add bio init helper.
    [PATCH V9 8/9] nvmet: add common I/O length check helper.
    [PATCH V9 9/9] nvmet: call nvmet_bio_done() for zone append.
15. Add a patch to check the ctrl bits to make sure the ctrl supports
    multi css on the host side when setting up a Zoned Namespace.
16. Add a documentation patch for the host side when calculating the
    buffer allocation size for report zones.
17. Rebase and retest on 5.12-rc1.

Changes from V8:-

1. Rebase and retest on latest nvme-5.11.
2. Export ctrl->cap csi support only if CONFIG_BLK_DEV_ZONED is set.
3. Add a fix to the admin ns-desc list handler for handling the default
   csi.

Changes from V7:-

1. Just like the block layer provides an API for bio_init(), provide
   nvmet_bio_init() so that we can move the bio initialization code for
   nvme-read-write commands from the bdev and zns backends into a
   centralized helper.
2. With bdev/zns/file we now have three backends that check
   req->sg_cnt and call nvmet_check_transfer_len() before processing
   nvme-read-write commands. Move this duplicate code from the three
   backends into a helper.
3. Export and use the nvmet_bio_done() callback in
   nvmet_execute_zone_append() instead of open coding the function.
   This also avoids code duplication for bio & request completion with
   the error log page update.
4. Add a zonefs test log for a dm-linear device created on top of an
   SMR HDD exported with the NVMeOF ZNS backend with the help of
   nvme-loop.

Changes from V6:-

1. Instead of calling report zones to find conventional zones in a
   loop, use the loop inside the LLD,
   blkdev_report_zones()->LLD_report_zones; this also simplifies the
   report zone callback.
2. Fix the bug in nvmet_bdev_has_conv_zones().
3. Remove the conditional operators in nvmet_bdev_execute_zone_append().

Changes from V5:-

1. Use bio->bi_iter.bi_sector for the result of the REQ_OP_ZONE_APPEND
   command.
2. Add endianness to the helper nvmet_sect_to_lba().
3. Make bufsize u32 in the zone mgmt recv command handler.
4. Add __GFP_ZERO for the report zone data buffer to return a clean
   buffer.

Changes from V4:-

1.  Don't use bio_iov_iter_get_pages(); instead add a patch to export
    bio_add_hw_page() and call it directly for zone append.
2.  Add inline vector optimization for the append bio.
3.  Update the commit logs for the patches.
4.  Remove the ZNS-related identify data structures, use individual
    members.
5.  Add a comment for the macro NVMET_MPSMIN_SHIFT.
6.  Remove the nvmet_bdev() helper.
7.  Move the command set identifier code into common code.
8.  Use IS_ENABLED() and move helpers from zns.c into common code.
9.  Add a patch to support Command Set Identifiers.
10. Open code nvmet_bdev_validate_zns_zones().
11. Remove the per-namespace min zasl calculation and don't allow
    namespaces with a zasl value > the first ns zasl value.
12. Move the stubs into the header file.
13. Add lba to/from sector conversion helpers and update io-cmd-bdev.c
    to avoid code duplication.
14. Add everything into one patch for the zns command handlers and the
    respective calls from the target code.
15. Remove the trim ns-desclist admin callback patch from this series.
16. Add bio get and put helper patches to reduce the duplicate code in
    the generic bdev, passthru, and generic zns backends.

Changes from V3:-

1. Get rid of the bio_max_zasl check.
2. Remove extra lines.
3. Remove the block layer api export patch.
4. Remove the bvec check in bio_iov_iter_get_pages() for
   REQ_OP_ZONE_APPEND so that we can reuse the code.

Changes from V2:-

1. Move the conventional zone bitmap check into
   nvmet_bdev_validate_zns_zones().
2. Don't use a report zones call to check the runt zone.
3.  Trim the nvmet_zasl() helper.
4.  Fix a typo in nvmet_zns_update_zasl().
5.  Remove the comment and fix the mdts calculation in
    nvmet_execute_identify_cns_cs_ctrl().
6.  Use u64 for bufsize in nvmet_bdev_execute_zone_mgmt_recv().
7.  Remove nvmet_zones_to_desc_size() and fix the nr_zones calculation.
8.  Remove the op variable in nvmet_bdev_execute_zone_append().
9.  Fix the nr_zones calculation in nvmet_bdev_execute_zone_mgmt_recv().
10. Update the cover letter subject.

Changes from V1:-

1.  Remove the nvmet-$(CONFIG_BLK_DEV_ZONED) += zns.o.
2.  Mark helpers inline.
3.  Fix typos in the comments and update the comments.
4.  Get rid of the curly brackets.
5.  Don't allow drives with a smaller last zone.
6.  Calculate the zasl as a function of max_zone_append_sectors and
    bio_max_pages so we don't have to split the bio.
7.  Add a global subsys->zasl and update the zasl when a new namespace
    is enabled.
8.  Remove the loop in nvmet_bdev_execute_zone_mgmt_recv() and move the
    functionality into the report zone callback.
9.  Add goto for the default case in
    nvmet_bdev_execute_zone_mgmt_send().
10. Allocate the zones buffer with the zones size instead of bdev
    nr_zones.

Chaitanya Kulkarni (4):
  nvmet: add NVM Command Set Identifier support
  nvmet: add ZBD over ZNS backend support
  nvmet: add nvmet_req_bio put helper for backends
  nvme: add comments to nvme_zns_alloc_report_buffer

 drivers/nvme/host/zns.c           |  22 ++
 drivers/nvme/target/Makefile      |   1 +
 drivers/nvme/target/admin-cmd.c   |  74 ++++++-
 drivers/nvme/target/core.c        |  16 +-
 drivers/nvme/target/io-cmd-bdev.c |  37 +++-
 drivers/nvme/target/nvmet.h       |  45 +++++
 drivers/nvme/target/passthru.c    |   3 +-
 drivers/nvme/target/zns.c         | 326 ++++++++++++++++++++++++++++++
 include/linux/nvme.h              |   1 +
 9 files changed, 504 insertions(+), 21 deletions(-)
 create mode 100644 drivers/nvme/target/zns.c

* Zonefs test log with dm-linear on top of an SMR HDD:-
--------------------------------------------------------------------------------
1. Test Zoned Block Device info :-
--------------------------------------------------------------------------------
# fdisk -l /dev/sdh
Disk /dev/sdh: 13.64 TiB, 15000173281280 bytes, 3662151680 sectors
Disk model: HGST HSH721415AL
Units: sectors of 1 * 4096 = 4096 bytes
Sector size (logical/physical): 4096 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes

# cat /sys/block/sdh/queue/nr_zones
55880
# cat /sys/block/sdh/queue/zoned
host-managed
# cat /sys/block/sdh/queue/zone_append_max_bytes
688128

2. Creating NVMeOF target backed by dm-linear on top of the ZBD
--------------------------------------------------------------------------------
# ./zbdev.sh 1 dm-zbd
++ NQN=dm-zbd
++ echo '0 29022486528 linear /dev/sdh 274726912' | dmsetup create cksdh
9 directories, 4 files
++ mkdir /sys/kernel/config/nvmet/subsystems/dm-zbd
++ mkdir /sys/kernel/config/nvmet/subsystems/dm-zbd/namespaces/1
++ echo -n /dev/dm-0
++ cat /sys/kernel/config/nvmet/subsystems/dm-zbd/namespaces/1/device_path
/dev/dm-0
++ echo 1
++ mkdir /sys/kernel/config/nvmet/ports/1/
++ echo -n loop
++ echo -n 1
++ ln -s /sys/kernel/config/nvmet/subsystems/dm-zbd /sys/kernel/config/nvmet/ports/1/subsystems/
++ sleep 1
++ echo transport=loop,nqn=dm-zbd
++ sleep 1
++ dmesg -c
[233450.572565] nvmet: adding nsid 1 to subsystem dm-zbd
[233452.269477] nvmet: creating controller 1 for subsystem dm-zbd for NQN nqn.2014-08.org.nvmexpress:uuid:853d7e82-8018-44ce-8784-ab81e7465ad9.
[233452.283352] nvme nvme0: Please enable CONFIG_NVME_MULTIPATH for full support of multi-port devices.
[233452.292805] nvme nvme0: creating 8 I/O queues. [233452.299210] nvme nvme0: new ctrl: "dm-zbd" 3. dm-linear and backend SMR HDD association :- -------------------------------------------------------------------------------- # cat /sys/kernel/config/nvmet/subsystems/dm-zbd/namespaces/1/device_path /dev/dm-0 # dmsetup ls --tree cksdh (252:0) └─ (8:112) # lsblk | head -3 NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT sdh 8:112 0 13.6T 0 disk └─cksdh 252:0 0 13.5T 0 dm 4. NVMeOF controller :- -------------------------------------------------------------------------------- # nvme list | tr -s ' ' ' ' Node SN Model Namespace Usage Format FW Rev /dev/nvme0n1 8c6f348dcd64404c Linux 1 14.86 TB / 14.86 TB 4 KiB + 0 B 5.10.0nv 5. Zonefs tests results :- -------------------------------------------------------------------------------- # ./zonefs-tests.sh /dev/nvme0n1 Gathering information on /dev/nvme0n1... zonefs-tests on /dev/nvme0n1: 55356 zones (0 conventional zones, 55356 sequential zones) 524288 512B sectors zone size (256 MiB) 1 max open zones Running tests Test 0010: mkzonefs (options) ... PASS Test 0011: mkzonefs (force format) ... PASS Test 0012: mkzonefs (invalid device) ... PASS Test 0013: mkzonefs (super block zone state) ... PASS Test 0020: mount (default) ... PASS Test 0021: mount (invalid device) ... PASS Test 0022: mount (check mount directory sub-directories) ... PASS Test 0023: mount (options) ... PASS Test 0030: Number of files (default) ... PASS Test 0031: Number of files (aggr_cnv) ... skip Test 0032: Number of files using stat (default) ... PASS Test 0033: Number of files using stat (aggr_cnv) ... PASS Test 0034: Number of blocks using stat (default) ... PASS Test 0035: Number of blocks using stat (aggr_cnv) ... PASS Test 0040: Files permissions (default) ... PASS Test 0041: Files permissions (aggr_cnv) ... skip Test 0042: Files permissions (set value) ... PASS Test 0043: Files permissions (set value + aggr_cnv) ... skip Test 0050: Files owner (default) ... PASS Test 0051: Files owner (aggr_cnv) ... skip Test 0052: Files owner (set value) ... PASS Test 0053: Files owner (set value + aggr_cnv) ... skip Test 0060: Files size (default) ... PASS Test 0061: Files size (aggr_cnv) ... skip Test 0070: Conventional file truncate ... skip Test 0071: Conventional file truncate (aggr_cnv) ... skip Test 0072: Conventional file unlink ... skip Test 0073: Conventional file unlink (aggr_cnv) ... skip Test 0074: Conventional file random write ... skip Test 0075: Conventional file random write (direct) ... skip Test 0076: Conventional file random write (aggr_cnv) ... skip Test 0077: Conventional file random write (aggr_cnv, direct) ... skip Test 0078: Conventional file mmap read/write ... skip Test 0079: Conventional file mmap read/write (aggr_cnv) ... skip Test 0080: Sequential file truncate ... PASS Test 0081: Sequential file unlink ... PASS Test 0082: Sequential file buffered write IO ... PASS Test 0083: Sequential file overwrite ... PASS Test 0084: Sequential file unaligned write (sync IO) ... PASS Test 0085: Sequential file unaligned write (async IO) ... PASS Test 0086: Sequential file append (sync) ... PASS Test 0087: Sequential file append (async) ... PASS Test 0088: Sequential file random read ... PASS Test 0089: Sequential file mmap read/write ... PASS Test 0090: sequential file 4K synchronous write ... PASS Test 0091: Sequential file large synchronous write ... 
PASS 46 / 46 tests passed * With CONFIG_BLK_DEV_ZONED nvme and zonefs tests on membacked null_blk zoned :- -------------------------------------------------------------------------------- # grep -i blk_dev_zoned .config CONFIG_BLK_DEV_ZONED=y # make M=drivers/nvme/ clean CLEAN drivers/nvme//Module.symvers # make M=drivers/nvme/ CC [M] drivers/nvme//host/core.o CC [M] drivers/nvme//host/trace.o CC [M] drivers/nvme//host/lightnvm.o CC [M] drivers/nvme//host/zns.o CC [M] drivers/nvme//host/hwmon.o LD [M] drivers/nvme//host/nvme-core.o CC [M] drivers/nvme//host/pci.o LD [M] drivers/nvme//host/nvme.o CC [M] drivers/nvme//host/fabrics.o LD [M] drivers/nvme//host/nvme-fabrics.o CC [M] drivers/nvme//host/rdma.o LD [M] drivers/nvme//host/nvme-rdma.o CC [M] drivers/nvme//host/fc.o LD [M] drivers/nvme//host/nvme-fc.o CC [M] drivers/nvme//host/tcp.o LD [M] drivers/nvme//host/nvme-tcp.o CC [M] drivers/nvme//target/core.o CC [M] drivers/nvme//target/configfs.o CC [M] drivers/nvme//target/admin-cmd.o CC [M] drivers/nvme//target/fabrics-cmd.o CC [M] drivers/nvme//target/discovery.o CC [M] drivers/nvme//target/io-cmd-file.o CC [M] drivers/nvme//target/io-cmd-bdev.o CC [M] drivers/nvme//target/passthru.o CC [M] drivers/nvme//target/zns.o CC [M] drivers/nvme//target/trace.o LD [M] drivers/nvme//target/nvmet.o CC [M] drivers/nvme//target/loop.o LD [M] drivers/nvme//target/nvme-loop.o CC [M] drivers/nvme//target/rdma.o LD [M] drivers/nvme//target/nvmet-rdma.o CC [M] drivers/nvme//target/fc.o LD [M] drivers/nvme//target/nvmet-fc.o CC [M] drivers/nvme//target/fcloop.o LD [M] drivers/nvme//target/nvme-fcloop.o CC [M] drivers/nvme//target/tcp.o LD [M] drivers/nvme//target/nvmet-tcp.o MODPOST drivers/nvme//Module.symvers CC [M] drivers/nvme//host/nvme-core.mod.o LD [M] drivers/nvme//host/nvme-core.ko CC [M] drivers/nvme//host/nvme-fabrics.mod.o LD [M] drivers/nvme//host/nvme-fabrics.ko CC [M] drivers/nvme//host/nvme-fc.mod.o LD [M] drivers/nvme//host/nvme-fc.ko CC [M] drivers/nvme//host/nvme-rdma.mod.o LD [M] drivers/nvme//host/nvme-rdma.ko CC [M] drivers/nvme//host/nvme-tcp.mod.o LD [M] drivers/nvme//host/nvme-tcp.ko CC [M] drivers/nvme//host/nvme.mod.o LD [M] drivers/nvme//host/nvme.ko CC [M] drivers/nvme//target/nvme-fcloop.mod.o LD [M] drivers/nvme//target/nvme-fcloop.ko CC [M] drivers/nvme//target/nvme-loop.mod.o LD [M] drivers/nvme//target/nvme-loop.ko CC [M] drivers/nvme//target/nvmet-fc.mod.o LD [M] drivers/nvme//target/nvmet-fc.ko CC [M] drivers/nvme//target/nvmet-rdma.mod.o LD [M] drivers/nvme//target/nvmet-rdma.ko CC [M] drivers/nvme//target/nvmet-tcp.mod.o LD [M] drivers/nvme//target/nvmet-tcp.ko CC [M] drivers/nvme//target/nvmet.mod.o LD [M] drivers/nvme//target/nvmet.ko # # cdblktests # ./check tests/nvme/ nvme/002 (create many subsystems and test discovery) [passed] runtime 24.378s ... 24.636s nvme/003 (test if we're sending keep-alives to a discovery controller) [passed] runtime 10.133s ... 10.152s nvme/004 (test nvme and nvmet UUID NS descriptors) [passed] runtime 2.463s ... 2.478s nvme/005 (reset local loopback target) [not run] nvme_core module does not have parameter multipath nvme/006 (create an NVMeOF target with a block device-backed ns) [passed] runtime 0.095s ... 0.122s nvme/007 (create an NVMeOF target with a file-backed ns) [passed] runtime 0.065s ... 0.079s nvme/008 (create an NVMeOF host with a block device-backed ns) [passed] runtime 2.473s ... 2.501s nvme/009 (create an NVMeOF host with a file-backed ns) [passed] runtime 2.460s ... 
2.424s nvme/010 (run data verification fio job on NVMeOF block device-backed ns) [passed] runtime 24.526s ... 28.015s nvme/011 (run data verification fio job on NVMeOF file-backed ns) [passed] runtime 265.967s ... 282.717s nvme/012 (run mkfs and data verification fio job on NVMeOF block device-backed ns) [passed] runtime 44.665s ... 48.124s nvme/013 (run mkfs and data verification fio job on NVMeOF file-backed ns) [passed] runtime 261.739s ... 352.331s nvme/014 (flush a NVMeOF block device-backed ns) [passed] runtime 21.268s ... 22.013s nvme/015 (unit test for NVMe flush for file backed ns) [passed] runtime 18.820s ... 22.104s nvme/016 (create/delete many NVMeOF block device-backed ns and test discovery) [passed] runtime 13.899s ... 14.322s nvme/017 (create/delete many file-ns and test discovery) [passed] runtime 14.322s ... 14.031s nvme/018 (unit test NVMe-oF out of range access on a file backend) [passed] runtime 2.450s ... 2.444s nvme/019 (test NVMe DSM Discard command on NVMeOF block-device ns) [passed] runtime 2.475s ... 2.489s nvme/020 (test NVMe DSM Discard command on NVMeOF file-backed ns) [passed] runtime 2.410s ... 2.448s nvme/021 (test NVMe list command on NVMeOF file-backed ns) [passed] runtime 2.441s ... 2.439s nvme/022 (test NVMe reset command on NVMeOF file-backed ns) [passed] runtime 2.864s ... 2.863s nvme/023 (test NVMe smart-log command on NVMeOF block-device ns) [passed] runtime 2.465s ... 2.446s nvme/024 (test NVMe smart-log command on NVMeOF file-backed ns) [passed] runtime 2.416s ... 2.411s nvme/025 (test NVMe effects-log command on NVMeOF file-backed ns) [passed] runtime 2.419s ... 2.748s nvme/026 (test NVMe ns-descs command on NVMeOF file-backed ns) [passed] runtime 2.422s ... 2.410s nvme/027 (test NVMe ns-rescan command on NVMeOF file-backed ns) [passed] runtime 2.456s ... 2.462s nvme/028 (test NVMe list-subsys command on NVMeOF file-backed ns) [passed] runtime 2.427s ... 2.429s nvme/029 (test userspace IO via nvme-cli read/write interface) [passed] runtime 2.751s ... 2.755s nvme/030 (ensure the discovery generation counter is updated appropriately) [passed] runtime 0.346s ... 0.357s nvme/031 (test deletion of NVMeOF controllers immediately after setup) [passed] runtime 13.601s ... 13.591s nvme/038 (test deletion of NVMeOF subsystem without enabling) [passed] runtime 0.039s ... 0.059s # # cdzonefstest # ./zonefs-tests.sh /dev/nvme1n1 Gathering information on /dev/nvme1n1... zonefs-tests on /dev/nvme1n1: 16 zones (0 conventional zones, 16 sequential zones) 131072 512B sectors zone size (64 MiB) 1 max open zones Running tests Test 0010: mkzonefs (options) ... PASS Test 0011: mkzonefs (force format) ... PASS Test 0012: mkzonefs (invalid device) ... PASS Test 0013: mkzonefs (super block zone state) ... PASS Test 0020: mount (default) ... PASS Test 0021: mount (invalid device) ... PASS Test 0022: mount (check mount directory sub-directories) ... PASS Test 0023: mount (options) ... PASS Test 0030: Number of files (default) ... PASS Test 0031: Number of files (aggr_cnv) ... skip Test 0032: Number of files using stat (default) ... PASS Test 0033: Number of files using stat (aggr_cnv) ... PASS Test 0034: Number of blocks using stat (default) ... PASS Test 0035: Number of blocks using stat (aggr_cnv) ... PASS Test 0040: Files permissions (default) ... PASS Test 0041: Files permissions (aggr_cnv) ... skip Test 0042: Files permissions (set value) ... PASS Test 0043: Files permissions (set value + aggr_cnv) ... skip Test 0050: Files owner (default) ... 
PASS Test 0051: Files owner (aggr_cnv) ... skip Test 0052: Files owner (set value) ... PASS Test 0053: Files owner (set value + aggr_cnv) ... skip Test 0060: Files size (default) ... PASS Test 0061: Files size (aggr_cnv) ... skip Test 0070: Conventional file truncate ... skip Test 0071: Conventional file truncate (aggr_cnv) ... skip Test 0072: Conventional file unlink ... skip Test 0073: Conventional file unlink (aggr_cnv) ... skip Test 0074: Conventional file random write ... skip Test 0075: Conventional file random write (direct) ... skip Test 0076: Conventional file random write (aggr_cnv) ... skip Test 0077: Conventional file random write (aggr_cnv, direct) ... skip Test 0078: Conventional file mmap read/write ... skip Test 0079: Conventional file mmap read/write (aggr_cnv) ... skip Test 0080: Sequential file truncate ... PASS Test 0081: Sequential file unlink ... PASS Test 0082: Sequential file buffered write IO ... PASS Test 0083: Sequential file overwrite ... PASS Test 0084: Sequential file unaligned write (sync IO) ... PASS Test 0085: Sequential file unaligned write (async IO) ... PASS Test 0086: Sequential file append (sync) ... PASS Test 0087: Sequential file append (async) ... PASS Test 0088: Sequential file random read ... PASS Test 0089: Sequential file mmap read/write ... PASS Test 0090: sequential file 4K synchronous write ... PASS Test 0091: Sequential file large synchronous write ... PASS 46 / 46 tests passed * Without CONFIG_BLK_DEV_ZONED nvme tests :- -------------------------------------------------------------------------------- # # grep -i blk_dev_zoned .config # CONFIG_BLK_DEV_ZONED is not set # makej M=drivers/nvme/ clean CLEAN drivers/nvme//Module.symvers # makej M=drivers/nvme/ CC [M] drivers/nvme//host/core.o CC [M] drivers/nvme//host/trace.o CC [M] drivers/nvme//host/lightnvm.o CC [M] drivers/nvme//target/core.o CC [M] drivers/nvme//host/hwmon.o CC [M] drivers/nvme//target/configfs.o CC [M] drivers/nvme//host/pci.o CC [M] drivers/nvme//target/admin-cmd.o CC [M] drivers/nvme//host/fabrics.o CC [M] drivers/nvme//host/rdma.o CC [M] drivers/nvme//target/fabrics-cmd.o CC [M] drivers/nvme//target/discovery.o CC [M] drivers/nvme//host/fc.o CC [M] drivers/nvme//target/io-cmd-file.o CC [M] drivers/nvme//host/tcp.o CC [M] drivers/nvme//target/io-cmd-bdev.o CC [M] drivers/nvme//target/passthru.o CC [M] drivers/nvme//target/trace.o CC [M] drivers/nvme//target/loop.o CC [M] drivers/nvme//target/rdma.o CC [M] drivers/nvme//target/fc.o CC [M] drivers/nvme//target/fcloop.o CC [M] drivers/nvme//target/tcp.o LD [M] drivers/nvme//target/nvme-loop.o LD [M] drivers/nvme//target/nvme-fcloop.o LD [M] drivers/nvme//target/nvmet-tcp.o LD [M] drivers/nvme//host/nvme-fabrics.o LD [M] drivers/nvme//host/nvme.o LD [M] drivers/nvme//host/nvme-rdma.o LD [M] drivers/nvme//target/nvmet-rdma.o LD [M] drivers/nvme//target/nvmet.o LD [M] drivers/nvme//target/nvmet-fc.o LD [M] drivers/nvme//host/nvme-tcp.o LD [M] drivers/nvme//host/nvme-fc.o LD [M] drivers/nvme//host/nvme-core.o MODPOST drivers/nvme//Module.symvers CC [M] drivers/nvme//host/nvme-core.mod.o CC [M] drivers/nvme//host/nvme-fabrics.mod.o CC [M] drivers/nvme//host/nvme-fc.mod.o CC [M] drivers/nvme//host/nvme-rdma.mod.o CC [M] drivers/nvme//host/nvme-tcp.mod.o CC [M] drivers/nvme//host/nvme.mod.o CC [M] drivers/nvme//target/nvme-fcloop.mod.o CC [M] drivers/nvme//target/nvme-loop.mod.o CC [M] drivers/nvme//target/nvmet-fc.mod.o CC [M] drivers/nvme//target/nvmet-rdma.mod.o CC [M] drivers/nvme//target/nvmet-tcp.mod.o CC [M] 
drivers/nvme//target/nvmet.mod.o LD [M] drivers/nvme//target/nvme-fcloop.ko LD [M] drivers/nvme//host/nvme-tcp.ko LD [M] drivers/nvme//host/nvme-core.ko LD [M] drivers/nvme//target/nvmet-tcp.ko LD [M] drivers/nvme//target/nvme-loop.ko LD [M] drivers/nvme//target/nvmet-fc.ko LD [M] drivers/nvme//host/nvme-fabrics.ko LD [M] drivers/nvme//host/nvme-fc.ko LD [M] drivers/nvme//target/nvmet-rdma.ko LD [M] drivers/nvme//host/nvme-rdma.ko LD [M] drivers/nvme//host/nvme.ko LD [M] drivers/nvme//target/nvmet.ko # # cdblktests # ./check tests/nvme/ nvme/002 (create many subsystems and test discovery) [passed] runtime ... 27.640s nvme/003 (test if we're sending keep-alives to a discovery controller) [passed] runtime 10.145s ... 10.147s nvme/004 (test nvme and nvmet UUID NS descriptors) [passed] runtime 1.713s ... 1.712s nvme/005 (reset local loopback target) [not run] nvme_core module does not have parameter multipath nvme/006 (create an NVMeOF target with a block device-backed ns) [passed] runtime 0.111s ... 0.115s nvme/007 (create an NVMeOF target with a file-backed ns) [passed] runtime 0.081s ... 0.069s nvme/008 (create an NVMeOF host with a block device-backed ns) [passed] runtime 1.690s ... 1.727s nvme/009 (create an NVMeOF host with a file-backed ns) [passed] runtime 1.659s ... 1.661s nvme/010 (run data verification fio job on NVMeOF block device-backed ns) [passed] runtime 28.781s ... 30.166s nvme/011 (run data verification fio job on NVMeOF file-backed ns) [passed] runtime 253.831s ... 238.774s nvme/012 (run mkfs and data verification fio job on NVMeOF block device-backed ns) [passed] runtime 40.003s ... 68.076s nvme/013 (run mkfs and data verification fio job on NVMeOF file-backed ns) [passed] runtime 272.649s ... 283.720s nvme/014 (flush a NVMeOF block device-backed ns) [passed] runtime 21.772s ... 21.397s nvme/015 (unit test for NVMe flush for file backed ns) [passed] runtime 21.908s ... 18.622s nvme/016 (create/delete many NVMeOF block device-backed ns and test discovery) [passed] runtime 15.860s ... 18.313s nvme/017 (create/delete many file-ns and test discovery) [passed] runtime 16.470s ... 18.374s nvme/018 (unit test NVMe-oF out of range access on a file backend) [passed] runtime 1.665s ... 1.890s nvme/019 (test NVMe DSM Discard command on NVMeOF block-device ns) [passed] runtime 1.681s ... 1.982s nvme/020 (test NVMe DSM Discard command on NVMeOF file-backed ns) [passed] runtime 1.645s ... 1.913s nvme/021 (test NVMe list command on NVMeOF file-backed ns) [passed] runtime 1.648s ... 1.956s nvme/022 (test NVMe reset command on NVMeOF file-backed ns) [passed] runtime 2.063s ... 2.553s nvme/023 (test NVMe smart-log command on NVMeOF block-device ns) [passed] runtime 1.692s ... 2.588s nvme/024 (test NVMe smart-log command on NVMeOF file-backed ns) [passed] runtime 1.643s ... 1.656s nvme/025 (test NVMe effects-log command on NVMeOF file-backed ns) [passed] runtime 1.640s ... 1.668s nvme/026 (test NVMe ns-descs command on NVMeOF file-backed ns) [passed] runtime 1.643s ... 1.961s nvme/027 (test NVMe ns-rescan command on NVMeOF file-backed ns) [passed] runtime 1.641s ... 1.677s nvme/028 (test NVMe list-subsys command on NVMeOF file-backed ns) [passed] runtime 1.648s ... 1.868s nvme/029 (test userspace IO via nvme-cli read/write interface) [passed] runtime 1.982s ... 2.703s nvme/030 (ensure the discovery generation counter is updated appropriately) [passed] runtime 0.308s ... 0.328s nvme/031 (test deletion of NVMeOF controllers immediately after setup) [passed] runtime 5.432s ... 
7.495s
nvme/038 (test deletion of NVMeOF subsystem without enabling) [passed]
    runtime  0.053s  ...  0.046s
-- 
2.22.1
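As a worked example of the ZASL math in this series: the test log above
reports zone_append_max_bytes = 688128 for the SMR HDD, and patch 2/4
derives the reported Zone Append Size Limit from that value in
nvmet_zasl(). The standalone C sketch below is an illustration only (it
is not part of the series; ilog2_u32() is a local stand-in for the
kernel's ilog2()):

#include <stdio.h>

#define NVMET_MPSMIN_SHIFT	12	/* MPSMIN = 0 -> 4K minimum page */

/* local stand-in for the kernel's ilog2(): floor(log2(v)) */
static unsigned int ilog2_u32(unsigned int v)
{
	unsigned int l = 0;

	while (v >>= 1)
		l++;
	return l;
}

int main(void)
{
	/* zone_append_max_bytes reported by the SMR HDD in the log above */
	unsigned int zone_append_max_bytes = 688128;
	unsigned int zone_append_sects = zone_append_max_bytes >> 9;
	/* mirrors nvmet_zasl() from patch 2/4 */
	unsigned int zasl = ilog2_u32((zone_append_sects << 9) >>
				      NVMET_MPSMIN_SHIFT);

	printf("zasl = %u -> %u bytes\n", zasl,
	       (1u << zasl) << NVMET_MPSMIN_SHIFT);
	return 0;
}

This prints "zasl = 7 -> 524288 bytes": because ilog2() rounds down, the
advertised limit (512 KiB here) never exceeds the device's 672 KiB
zone-append limit, so the target never has to split an append bio.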
* [PATCH V11 1/4] nvmet: add NVM Command Set Identifier support
From: Chaitanya Kulkarni @ 2021-03-11  4:39 UTC
To: linux-nvme; +Cc: hch, kbusch, sagi, damien.lemoal, Chaitanya Kulkarni

NVMe TP 4056 allows the controller to support different command sets.
The NVMeoF target currently only supports namespaces that contain
traditional logical blocks that may be randomly read and written. In
some applications there is value in exposing namespaces that contain
logical blocks that have special access rules (e.g. sequential-write-
required namespaces such as Zoned Namespaces (ZNS)).

In order to support the Zoned Block Device (ZBD) backend, the
controller needs support for the ZNS Command Set Identifier (CSI).

In this preparation patch, we adjust the code so that it can support
the default command set identifier. We update the namespace data
structure to store the CSI value, which defaults to NVME_CSI_NVM,
representing the traditional logical block namespace type.

The CSI support is required to implement the ZBD backend for NVMeOF
with the host-side NVMe ZNS interface, since ZNS commands belong to a
different command set than the default one.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 drivers/nvme/target/admin-cmd.c | 47 +++++++++++++++++++++++++++------
 drivers/nvme/target/core.c      | 16 ++++++++++-
 drivers/nvme/target/nvmet.h     |  1 +
 include/linux/nvme.h            |  1 +
 4 files changed, 56 insertions(+), 9 deletions(-)

diff --git a/drivers/nvme/target/admin-cmd.c b/drivers/nvme/target/admin-cmd.c
index f4cc32674edd..176c8593d341 100644
--- a/drivers/nvme/target/admin-cmd.c
+++ b/drivers/nvme/target/admin-cmd.c
@@ -162,15 +162,8 @@ static void nvmet_execute_get_log_page_smart(struct nvmet_req *req)
 	nvmet_req_complete(req, status);
 }
 
-static void nvmet_execute_get_log_cmd_effects_ns(struct nvmet_req *req)
+static void nvmet_set_csi_nvm_effects(struct nvme_effects_log *log)
 {
-	u16 status = NVME_SC_INTERNAL;
-	struct nvme_effects_log *log;
-
-	log = kzalloc(sizeof(*log), GFP_KERNEL);
-	if (!log)
-		goto out;
-
 	log->acs[nvme_admin_get_log_page] = cpu_to_le32(1 << 0);
 	log->acs[nvme_admin_identify] = cpu_to_le32(1 << 0);
 	log->acs[nvme_admin_abort_cmd] = cpu_to_le32(1 << 0);
@@ -184,9 +177,31 @@ static void nvmet_execute_get_log_cmd_effects_ns(struct nvmet_req *req)
 	log->iocs[nvme_cmd_flush] = cpu_to_le32(1 << 0);
 	log->iocs[nvme_cmd_dsm] = cpu_to_le32(1 << 0);
 	log->iocs[nvme_cmd_write_zeroes] = cpu_to_le32(1 << 0);
+}
+
+static void nvmet_execute_get_log_cmd_effects_ns(struct nvmet_req *req)
+{
+	struct nvme_effects_log *log;
+	u16 status = NVME_SC_SUCCESS;
+
+	log = kzalloc(sizeof(*log), GFP_KERNEL);
+	if (!log) {
+		status = NVME_SC_INTERNAL;
+		goto out;
+	}
+
+	switch (req->cmd->get_log_page.csi) {
+	case NVME_CSI_NVM:
+		nvmet_set_csi_nvm_effects(log);
+		break;
+	default:
+		status = NVME_SC_INVALID_LOG_PAGE;
+		goto free;
+	}
 
 	status = nvmet_copy_to_sgl(req, 0, log, sizeof(*log));
+free:
 	kfree(log);
 out:
 	nvmet_req_complete(req, status);
@@ -611,6 +626,18 @@ static u16 nvmet_copy_ns_identifier(struct nvmet_req *req, u8 type, u8 len,
 	return 0;
 }
 
+static u16 nvmet_execute_identify_desclist_csi(struct nvmet_req *req, off_t *o)
+{
+	switch (req->ns->csi) {
+	case NVME_CSI_NVM:
+		return nvmet_copy_ns_identifier(req, NVME_NIDT_CSI,
+						NVME_NIDT_CSI_LEN,
+						&req->ns->csi, o);
+	}
+
+	return NVME_SC_INVALID_IO_CMD_SET;
+}
+
 static void nvmet_execute_identify_desclist(struct nvmet_req *req)
 {
 	off_t off = 0;
@@ -635,6 +662,10 @@ static void nvmet_execute_identify_desclist(struct nvmet_req *req)
 		goto out;
 	}
 
+	status = nvmet_execute_identify_desclist_csi(req, &off);
+	if (status)
+		goto out;
+
 	if (sg_zero_buffer(req->sg, req->sg_cnt, NVME_IDENTIFY_DATA_SIZE - off,
 			off) != NVME_IDENTIFY_DATA_SIZE - off)
 		status = NVME_SC_INTERNAL | NVME_SC_DNR;
diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c
index adbede9ab7f3..4abe0b542c96 100644
--- a/drivers/nvme/target/core.c
+++ b/drivers/nvme/target/core.c
@@ -693,6 +693,7 @@ struct nvmet_ns *nvmet_ns_alloc(struct nvmet_subsys *subsys, u32 nsid)
 
 	uuid_gen(&ns->uuid);
 	ns->buffered_io = false;
+	ns->csi = NVME_CSI_NVM;
 
 	return ns;
 }
@@ -1113,6 +1114,17 @@ static inline u8 nvmet_cc_iocqes(u32 cc)
 	return (cc >> NVME_CC_IOCQES_SHIFT) & 0xf;
 }
 
+static inline bool nvmet_cc_css_check(u8 cc_css)
+{
+	switch (cc_css <<= NVME_CC_CSS_SHIFT) {
+	case NVME_CC_CSS_NVM:
+	case NVME_CC_CSS_CSI:
+		return true;
+	default:
+		return false;
+	}
+}
+
 static void nvmet_start_ctrl(struct nvmet_ctrl *ctrl)
 {
 	lockdep_assert_held(&ctrl->lock);
@@ -1121,7 +1133,7 @@ static void nvmet_start_ctrl(struct nvmet_ctrl *ctrl)
 	    nvmet_cc_iocqes(ctrl->cc) != NVME_NVM_IOCQES ||
 	    nvmet_cc_mps(ctrl->cc) != 0 ||
 	    nvmet_cc_ams(ctrl->cc) != 0 ||
-	    nvmet_cc_css(ctrl->cc) != 0) {
+	    !nvmet_cc_css_check(nvmet_cc_css(ctrl->cc))) {
 		ctrl->csts = NVME_CSTS_CFS;
 		return;
 	}
@@ -1172,6 +1184,8 @@ static void nvmet_init_cap(struct nvmet_ctrl *ctrl)
 {
 	/* command sets supported: NVMe command set: */
 	ctrl->cap = (1ULL << 37);
+	/* Controller supports one or more I/O Command Sets */
+	ctrl->cap |= (1ULL << 43);
 	/* CC.EN timeout in 500msec units: */
 	ctrl->cap |= (15ULL << 24);
 	/* maximum queue entries supported: */
diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h
index 24e261bf153a..ee5999920155 100644
--- a/drivers/nvme/target/nvmet.h
+++ b/drivers/nvme/target/nvmet.h
@@ -81,6 +81,7 @@ struct nvmet_ns {
 	struct pci_dev		*p2p_dev;
 	int			pi_type;
 	int			metadata_size;
+	u8			csi;
 };
 
 static inline struct nvmet_ns *to_nvmet_ns(struct config_item *item)
diff --git a/include/linux/nvme.h b/include/linux/nvme.h
index b08787cd0881..f09fbbb7876b 100644
--- a/include/linux/nvme.h
+++ b/include/linux/nvme.h
@@ -1494,6 +1494,7 @@ enum {
 	NVME_SC_NS_WRITE_PROTECTED	= 0x20,
 	NVME_SC_CMD_INTERRUPTED		= 0x21,
 	NVME_SC_TRANSIENT_TR_ERR	= 0x22,
+	NVME_SC_INVALID_IO_CMD_SET	= 0x2C,
 
 	NVME_SC_LBA_RANGE		= 0x80,
 	NVME_SC_CAP_EXCEEDED		= 0x81,
-- 
2.22.1
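As a quick illustration of what the CC.CSS change above accepts and
rejects, here is a self-contained C model of nvmet_cc_css_check(). This
is a sketch only; the NVME_CC_CSS_* constants are assumed to match the
values in include/linux/nvme.h (CSS is bits 6:4 of CC), so double-check
them against the header:

#include <stdbool.h>
#include <stdio.h>

#define NVME_CC_CSS_SHIFT	4
#define NVME_CC_CSS_NVM		(0 << NVME_CC_CSS_SHIFT)	/* NVM command set only */
#define NVME_CC_CSS_CSI		(6 << NVME_CC_CSS_SHIFT)	/* all supported I/O command sets */

/* extract the 3-bit CSS field from CC, as nvmet_cc_css() does */
static unsigned char nvmet_cc_css(unsigned int cc)
{
	return (cc >> NVME_CC_CSS_SHIFT) & 0x7;
}

/* same shape as nvmet_cc_css_check() in the patch */
static bool cc_css_check(unsigned char cc_css)
{
	switch (cc_css << NVME_CC_CSS_SHIFT) {
	case NVME_CC_CSS_NVM:
	case NVME_CC_CSS_CSI:
		return true;
	default:
		return false;
	}
}

int main(void)
{
	unsigned int cc[] = { 0 << 4, 6 << 4, 3 << 4 };	/* NVM, CSI, reserved */

	for (int i = 0; i < 3; i++)
		printf("css=%u -> %s\n", nvmet_cc_css(cc[i]),
		       cc_css_check(nvmet_cc_css(cc[i])) ? "enable" : "CSTS.CFS");
	return 0;
}

With CAP bit 43 set by this patch, a multi-css aware host enables the
controller with CSS = 110b, a legacy host uses 000b, and any reserved
encoding now fails controller enable with CSTS.CFS, matching the
nvmet_start_ctrl() behavior above.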
* [PATCH V11 2/4] nvmet: add ZBD over ZNS backend support
From: Chaitanya Kulkarni @ 2021-03-11  4:39 UTC
To: linux-nvme; +Cc: hch, kbusch, sagi, damien.lemoal, Chaitanya Kulkarni

NVMe TP 4053 – Zoned Namespaces (ZNS) allows host software to
communicate with a non-volatile memory subsystem using zones for
NVMe protocol-based controllers. NVMeOF already supports ZNS NVMe
protocol-compliant devices on the target in passthru mode. Generic
zoned block devices such as Shingled Magnetic Recording (SMR) HDDs,
however, are not based on the NVMe protocol.

This patch adds a ZNS backend to support ZBDs for the NVMeOF target.

This support includes implementing the new command set NVME_CSI_ZNS
and adding handlers for the ZNS command set: NVMe Identify Controller,
NVMe Identify Namespace, NVMe Zone Append, NVMe Zone Management Send
and NVMe Zone Management Receive.

With the new command set identifier, we also update the target command
effects logs to reflect the ZNS-compliant commands.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 drivers/nvme/target/Makefile      |   1 +
 drivers/nvme/target/admin-cmd.c   |  27 +++
 drivers/nvme/target/io-cmd-bdev.c |  34 +++-
 drivers/nvme/target/nvmet.h       |  38 ++++
 drivers/nvme/target/zns.c         | 327 ++++++++++++++++++++++++++++++
 5 files changed, 419 insertions(+), 8 deletions(-)
 create mode 100644 drivers/nvme/target/zns.c

diff --git a/drivers/nvme/target/Makefile b/drivers/nvme/target/Makefile
index ebf91fc4c72e..9837e580fa7e 100644
--- a/drivers/nvme/target/Makefile
+++ b/drivers/nvme/target/Makefile
@@ -12,6 +12,7 @@ obj-$(CONFIG_NVME_TARGET_TCP)		+= nvmet-tcp.o
 nvmet-y		+= core.o configfs.o admin-cmd.o fabrics-cmd.o \
 			discovery.o io-cmd-file.o io-cmd-bdev.o
 nvmet-$(CONFIG_NVME_TARGET_PASSTHRU)	+= passthru.o
+nvmet-$(CONFIG_BLK_DEV_ZONED)		+= zns.o
 nvme-loop-y	+= loop.o
 nvmet-rdma-y	+= rdma.o
 nvmet-fc-y	+= fc.o
diff --git a/drivers/nvme/target/admin-cmd.c b/drivers/nvme/target/admin-cmd.c
index 176c8593d341..bf4876df624a 100644
--- a/drivers/nvme/target/admin-cmd.c
+++ b/drivers/nvme/target/admin-cmd.c
@@ -179,6 +179,13 @@ static void nvmet_set_csi_nvm_effects(struct nvme_effects_log *log)
 	log->iocs[nvme_cmd_write_zeroes] = cpu_to_le32(1 << 0);
 }
 
+static void nvmet_set_csi_zns_effects(struct nvme_effects_log *log)
+{
+	log->iocs[nvme_cmd_zone_append] = cpu_to_le32(1 << 0);
+	log->iocs[nvme_cmd_zone_mgmt_send] = cpu_to_le32(1 << 0);
+	log->iocs[nvme_cmd_zone_mgmt_recv] = cpu_to_le32(1 << 0);
+}
+
 static void nvmet_execute_get_log_cmd_effects_ns(struct nvmet_req *req)
 {
 	struct nvme_effects_log *log;
@@ -194,6 +201,15 @@ static void nvmet_execute_get_log_cmd_effects_ns(struct nvmet_req *req)
 	case NVME_CSI_NVM:
 		nvmet_set_csi_nvm_effects(log);
 		break;
+	case NVME_CSI_ZNS:
+		if (!IS_ENABLED(CONFIG_BLK_DEV_ZONED)) {
+			status = NVME_SC_INVALID_IO_CMD_SET;
+			goto free;
+		}
+
+		nvmet_set_csi_nvm_effects(log);
+		nvmet_set_csi_zns_effects(log);
+		break;
 	default:
 		status = NVME_SC_INVALID_LOG_PAGE;
 		goto free;
@@ -630,6 +646,13 @@ static u16
nvmet_execute_identify_desclist_csi(struct nvmet_req *req, off_t *o) { switch (req->ns->csi) { case NVME_CSI_NVM: + return nvmet_copy_ns_identifier(req, NVME_NIDT_CSI, + NVME_NIDT_CSI_LEN, + &req->ns->csi, o); + case NVME_CSI_ZNS: + if (!IS_ENABLED(CONFIG_BLK_DEV_ZONED)) + return NVME_SC_INVALID_IO_CMD_SET; + return nvmet_copy_ns_identifier(req, NVME_NIDT_CSI, NVME_NIDT_CSI_LEN, &req->ns->csi, o); @@ -682,8 +705,12 @@ static void nvmet_execute_identify(struct nvmet_req *req) switch (req->cmd->identify.cns) { case NVME_ID_CNS_NS: return nvmet_execute_identify_ns(req); + case NVME_ID_CNS_CS_NS: + return nvmet_execute_identify_cns_cs_ns(req); case NVME_ID_CNS_CTRL: return nvmet_execute_identify_ctrl(req); + case NVME_ID_CNS_CS_CTRL: + return nvmet_execute_identify_cns_cs_ctrl(req); case NVME_ID_CNS_NS_ACTIVE_LIST: return nvmet_execute_identify_nslist(req); case NVME_ID_CNS_NS_DESC_LIST: diff --git a/drivers/nvme/target/io-cmd-bdev.c b/drivers/nvme/target/io-cmd-bdev.c index 9a8b3726a37c..ada0215f5e56 100644 --- a/drivers/nvme/target/io-cmd-bdev.c +++ b/drivers/nvme/target/io-cmd-bdev.c @@ -63,6 +63,14 @@ static void nvmet_bdev_ns_enable_integrity(struct nvmet_ns *ns) } } +void nvmet_bdev_ns_disable(struct nvmet_ns *ns) +{ + if (ns->bdev) { + blkdev_put(ns->bdev, FMODE_WRITE | FMODE_READ); + ns->bdev = NULL; + } +} + int nvmet_bdev_ns_enable(struct nvmet_ns *ns) { int ret; @@ -86,15 +94,16 @@ int nvmet_bdev_ns_enable(struct nvmet_ns *ns) if (IS_ENABLED(CONFIG_BLK_DEV_INTEGRITY_T10)) nvmet_bdev_ns_enable_integrity(ns); - return 0; -} - -void nvmet_bdev_ns_disable(struct nvmet_ns *ns) -{ - if (ns->bdev) { - blkdev_put(ns->bdev, FMODE_WRITE | FMODE_READ); - ns->bdev = NULL; + /* bdev_is_zoned() is stubbed out of CONFIG_BLK_DEV_ZONED */ + if (bdev_is_zoned(ns->bdev)) { + if (!nvmet_bdev_zns_enable(ns)) { + nvmet_bdev_ns_disable(ns); + return -EINVAL; + } + ns->csi = NVME_CSI_ZNS; } + + return 0; } void nvmet_bdev_ns_revalidate(struct nvmet_ns *ns) @@ -448,6 +457,15 @@ u16 nvmet_bdev_parse_io_cmd(struct nvmet_req *req) case nvme_cmd_write_zeroes: req->execute = nvmet_bdev_execute_write_zeroes; return 0; + case nvme_cmd_zone_append: + req->execute = nvmet_bdev_execute_zone_append; + return 0; + case nvme_cmd_zone_mgmt_recv: + req->execute = nvmet_bdev_execute_zone_mgmt_recv; + return 0; + case nvme_cmd_zone_mgmt_send: + req->execute = nvmet_bdev_execute_zone_mgmt_send; + return 0; default: return nvmet_report_invalid_opcode(req); } diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h index ee5999920155..f3fccc49de03 100644 --- a/drivers/nvme/target/nvmet.h +++ b/drivers/nvme/target/nvmet.h @@ -247,6 +247,10 @@ struct nvmet_subsys { unsigned int admin_timeout; unsigned int io_timeout; #endif /* CONFIG_NVME_TARGET_PASSTHRU */ + +#ifdef CONFIG_BLK_DEV_ZONED + u8 zasl; +#endif /* CONFIG_BLK_DEV_ZONED */ }; static inline struct nvmet_subsys *to_subsys(struct config_item *item) @@ -584,6 +588,40 @@ static inline struct nvme_ctrl *nvmet_passthru_ctrl(struct nvmet_subsys *subsys) } #endif /* CONFIG_NVME_TARGET_PASSTHRU */ +#ifdef CONFIG_BLK_DEV_ZONED +bool nvmet_bdev_zns_enable(struct nvmet_ns *ns); +void nvmet_execute_identify_cns_cs_ctrl(struct nvmet_req *req); +void nvmet_execute_identify_cns_cs_ns(struct nvmet_req *req); +void nvmet_bdev_execute_zone_mgmt_recv(struct nvmet_req *req); +void nvmet_bdev_execute_zone_mgmt_send(struct nvmet_req *req); +void nvmet_bdev_execute_zone_append(struct nvmet_req *req); +#else /* CONFIG_BLK_DEV_ZONED */ +static inline bool 
+nvmet_bdev_zns_enable(struct nvmet_ns *ns)
+{
+	return false;
+}
+static inline void
+nvmet_execute_identify_cns_cs_ctrl(struct nvmet_req *req)
+{
+}
+static inline void
+nvmet_execute_identify_cns_cs_ns(struct nvmet_req *req)
+{
+}
+static inline void
+nvmet_bdev_execute_zone_mgmt_recv(struct nvmet_req *req)
+{
+}
+static inline void
+nvmet_bdev_execute_zone_mgmt_send(struct nvmet_req *req)
+{
+}
+static inline void
+nvmet_bdev_execute_zone_append(struct nvmet_req *req)
+{
+}
+#endif /* CONFIG_BLK_DEV_ZONED */
+
 static inline struct nvme_ctrl *
 nvmet_req_passthru_ctrl(struct nvmet_req *req)
 {
diff --git a/drivers/nvme/target/zns.c b/drivers/nvme/target/zns.c
new file mode 100644
index 000000000000..8121b29df766
--- /dev/null
+++ b/drivers/nvme/target/zns.c
@@ -0,0 +1,327 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * NVMe ZNS-ZBD command implementation.
+ * Copyright (c) 2020-2021 HGST, a Western Digital Company.
+ */
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+#include <linux/nvme.h>
+#include <linux/blkdev.h>
+#include "nvmet.h"
+
+/*
+ * We set the Memory Page Size Minimum (MPSMIN) for the target controller
+ * to 0, which gets increased by 12 in nvme_enable_ctrl() and results in
+ * 2^12 = 4k as the page_shift value. When calculating the ZASL, shift
+ * by 12.
+ */
+#define NVMET_MPSMIN_SHIFT	12
+
+static u16 nvmet_bdev_zns_checks(struct nvmet_req *req)
+{
+	if (!bdev_is_zoned(req->ns->bdev))
+		return NVME_SC_INVALID_NS | NVME_SC_DNR;
+
+	if (req->cmd->zmr.zra != NVME_ZRA_ZONE_REPORT) {
+		req->error_loc = offsetof(struct nvme_zone_mgmt_recv_cmd, zra);
+		return NVME_SC_INVALID_FIELD | NVME_SC_DNR;
+	}
+
+	if (req->cmd->zmr.zrasf != NVME_ZRASF_ZONE_REPORT_ALL) {
+		req->error_loc =
+			offsetof(struct nvme_zone_mgmt_recv_cmd, zrasf);
+		return NVME_SC_INVALID_FIELD | NVME_SC_DNR;
+	}
+
+	if (req->cmd->zmr.pr != NVME_REPORT_ZONE_PARTIAL) {
+		req->error_loc = offsetof(struct nvme_zone_mgmt_recv_cmd, pr);
+		return NVME_SC_INVALID_FIELD | NVME_SC_DNR;
+	}
+
+	return NVME_SC_SUCCESS;
+}
+
+static inline u8 nvmet_zasl(unsigned int zone_append_sects)
+{
+	/*
+	 * Zone Append Size Limit is the value expressed in units of the
+	 * minimum memory page size (i.e. 12) and is reported as a power
+	 * of 2.
+	 */
+	return ilog2((zone_append_sects << 9) >> NVMET_MPSMIN_SHIFT);
+}
+
+static inline bool nvmet_zns_update_zasl(struct nvmet_ns *ns)
+{
+	struct request_queue *q = ns->bdev->bd_disk->queue;
+	u8 zasl = nvmet_zasl(queue_max_zone_append_sectors(q));
+
+	if (ns->subsys->zasl)
+		return ns->subsys->zasl < zasl;
+
+	ns->subsys->zasl = zasl;
+	return true;
+}
+
+static int nvmet_bdev_validate_zns_zones_cb(struct blk_zone *z,
+					    unsigned int i, void *data)
+{
+	if (z->type == BLK_ZONE_TYPE_CONVENTIONAL)
+		return -EOPNOTSUPP;
+	return 0;
+}
+
+static bool nvmet_bdev_has_conv_zones(struct block_device *bdev)
+{
+	int ret;
+
+	if (bdev->bd_disk->queue->conv_zones_bitmap)
+		return true;
+
+	ret = blkdev_report_zones(bdev, 0, blkdev_nr_zones(bdev->bd_disk),
+				  nvmet_bdev_validate_zns_zones_cb, NULL);
+
+	return ret <= 0 ?
true : false; +} + +bool nvmet_bdev_zns_enable(struct nvmet_ns *ns) +{ + if (nvmet_bdev_has_conv_zones(ns->bdev)) + return false; + + ns->blksize_shift = blksize_bits(bdev_physical_block_size(ns->bdev)); + + if (!nvmet_zns_update_zasl(ns)) + return false; + + return !(get_capacity(ns->bdev->bd_disk) & + (bdev_zone_sectors(ns->bdev) - 1)); +} + +void nvmet_execute_identify_cns_cs_ctrl(struct nvmet_req *req) +{ + u8 zasl = req->sq->ctrl->subsys->zasl; + struct nvmet_ctrl *ctrl = req->sq->ctrl; + struct nvme_id_ctrl_zns *id; + u16 status; + + if (req->cmd->identify.csi != NVME_CSI_ZNS) { + req->error_loc = offsetof(struct nvme_common_command, opcode); + status = NVME_SC_INVALID_OPCODE | NVME_SC_DNR; + goto out; + } + + id = kzalloc(sizeof(*id), GFP_KERNEL); + if (!id) { + status = NVME_SC_INTERNAL; + goto out; + } + + if (ctrl->ops->get_mdts) + id->zasl = min_t(u8, ctrl->ops->get_mdts(ctrl), zasl); + else + id->zasl = zasl; + + status = nvmet_copy_to_sgl(req, 0, id, sizeof(*id)); + + kfree(id); +out: + nvmet_req_complete(req, status); +} + +void nvmet_execute_identify_cns_cs_ns(struct nvmet_req *req) +{ + struct nvme_id_ns_zns *id_zns; + u64 zsze; + u16 status; + + if (req->cmd->identify.csi != NVME_CSI_ZNS) { + req->error_loc = offsetof(struct nvme_common_command, opcode); + status = NVME_SC_INVALID_OPCODE | NVME_SC_DNR; + goto out; + } + + if (le32_to_cpu(req->cmd->identify.nsid) == NVME_NSID_ALL) { + req->error_loc = offsetof(struct nvme_identify, nsid); + status = NVME_SC_INVALID_NS | NVME_SC_DNR; + goto out; + } + + id_zns = kzalloc(sizeof(*id_zns), GFP_KERNEL); + if (!id_zns) { + status = NVME_SC_INTERNAL; + goto out; + } + + status = nvmet_req_find_ns(req); + if (status) { + status = NVME_SC_INTERNAL; + goto done; + } + + if (!bdev_is_zoned(req->ns->bdev)) { + req->error_loc = offsetof(struct nvme_identify, nsid); + status = NVME_SC_INVALID_NS | NVME_SC_DNR; + goto done; + } + + nvmet_ns_revalidate(req->ns); + zsze = (bdev_zone_sectors(req->ns->bdev) << 9) >> + req->ns->blksize_shift; + id_zns->lbafe[0].zsze = cpu_to_le64(zsze); + id_zns->mor = cpu_to_le32(bdev_max_open_zones(req->ns->bdev)); + id_zns->mar = cpu_to_le32(bdev_max_active_zones(req->ns->bdev)); + +done: + status = nvmet_copy_to_sgl(req, 0, id_zns, sizeof(*id_zns)); + kfree(id_zns); +out: + nvmet_req_complete(req, status); +} + +struct nvmet_report_zone_data { + struct nvmet_ns *ns; + struct nvme_zone_report *rz; +}; + +static int nvmet_bdev_report_zone_cb(struct blk_zone *z, unsigned i, void *d) +{ + struct nvmet_report_zone_data *report_zone_data = d; + struct nvme_zone_descriptor *entries = report_zone_data->rz->entries; + struct nvmet_ns *ns = report_zone_data->ns; + + entries[i].zcap = nvmet_sect_to_lba(ns, z->capacity); + entries[i].zslba = nvmet_sect_to_lba(ns, z->start); + entries[i].wp = nvmet_sect_to_lba(ns, z->wp); + entries[i].za = z->reset ? 
1 << 2 : 0; + entries[i].zt = z->type; + entries[i].zs = z->cond << 4; + + return 0; +} + +void nvmet_bdev_execute_zone_mgmt_recv(struct nvmet_req *req) +{ + sector_t sect = nvmet_lba_to_sect(req->ns, req->cmd->zmr.slba); + u32 bufsize = (le32_to_cpu(req->cmd->zmr.numd) + 1) << 2; + struct nvmet_report_zone_data data = { .ns = req->ns }; + unsigned int nr_zones; + int reported_zones; + u16 status; + + status = nvmet_bdev_zns_checks(req); + if (status) + goto out; + + data.rz = __vmalloc(bufsize, GFP_KERNEL | __GFP_NORETRY | __GFP_ZERO); + if (!data.rz) { + status = NVME_SC_INTERNAL; + goto out; + } + + nr_zones = (bufsize - sizeof(struct nvme_zone_report)) / + sizeof(struct nvme_zone_descriptor); + reported_zones = blkdev_report_zones(req->ns->bdev, sect, nr_zones, + nvmet_bdev_report_zone_cb, &data); + if (reported_zones < 0) { + status = NVME_SC_INTERNAL; + goto out_free_report_zones; + } + + data.rz->nr_zones = cpu_to_le64(reported_zones); + + status = nvmet_copy_to_sgl(req, 0, data.rz, bufsize); + +out_free_report_zones: + kvfree(data.rz); +out: + nvmet_req_complete(req, status); +} + +void nvmet_bdev_execute_zone_mgmt_send(struct nvmet_req *req) +{ + sector_t sect = nvmet_lba_to_sect(req->ns, req->cmd->zms.slba); + sector_t nr_sect = bdev_zone_sectors(req->ns->bdev); + u16 status = NVME_SC_SUCCESS; + u8 zsa = req->cmd->zms.zsa; + enum req_opf op; + int ret; + const unsigned int zsa_to_op[] = { + [NVME_ZONE_OPEN] = REQ_OP_ZONE_OPEN, + [NVME_ZONE_CLOSE] = REQ_OP_ZONE_CLOSE, + [NVME_ZONE_FINISH] = REQ_OP_ZONE_FINISH, + [NVME_ZONE_RESET] = REQ_OP_ZONE_RESET, + }; + + if (zsa > ARRAY_SIZE(zsa_to_op) || !zsa_to_op[zsa]) { + status = NVME_SC_INVALID_FIELD; + goto out; + } + + op = zsa_to_op[zsa]; + + if (req->cmd->zms.select_all) + nr_sect = get_capacity(req->ns->bdev->bd_disk); + + ret = blkdev_zone_mgmt(req->ns->bdev, op, sect, nr_sect, GFP_KERNEL); + if (ret) + status = NVME_SC_INTERNAL; +out: + nvmet_req_complete(req, status); +} + +void nvmet_bdev_execute_zone_append(struct nvmet_req *req) +{ + sector_t sect = nvmet_lba_to_sect(req->ns, req->cmd->rw.slba); + u16 status = NVME_SC_SUCCESS; + unsigned int total_len = 0; + struct scatterlist *sg; + int ret = 0, sg_cnt; + struct bio *bio; + + if (!nvmet_check_transfer_len(req, nvmet_rw_data_len(req))) + return; + + if (!req->sg_cnt) { + nvmet_req_complete(req, 0); + return; + } + + if (req->transfer_len <= NVMET_MAX_INLINE_DATA_LEN) { + bio = &req->b.inline_bio; + bio_init(bio, req->inline_bvec, ARRAY_SIZE(req->inline_bvec)); + } else { + bio = bio_alloc(GFP_KERNEL, req->sg_cnt); + } + + bio_set_dev(bio, req->ns->bdev); + bio->bi_iter.bi_sector = sect; + bio->bi_opf = REQ_OP_ZONE_APPEND | REQ_SYNC | REQ_IDLE; + if (req->cmd->rw.control & cpu_to_le16(NVME_RW_FUA)) + bio->bi_opf |= REQ_FUA; + + for_each_sg(req->sg, sg, req->sg_cnt, sg_cnt) { + struct page *p = sg_page(sg); + unsigned int l = sg->length; + unsigned int o = sg->offset; + + ret = bio_add_zone_append_page(bio, p, l, o); + if (ret != sg->length) { + status = NVME_SC_INTERNAL; + goto out_bio_put; + } + + total_len += sg->length; + } + + if (total_len != nvmet_rw_data_len(req)) { + status = NVME_SC_INTERNAL | NVME_SC_DNR; + goto out_bio_put; + } + + ret = submit_bio_wait(bio); + req->cqe->result.u64 = nvmet_sect_to_lba(req->ns, + bio->bi_iter.bi_sector); + +out_bio_put: + if (bio != &req->b.inline_bio) + bio_put(bio); + nvmet_req_complete(req, ret < 0 ? 
NVME_SC_INTERNAL : status);
+}
-- 
2.22.1
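The buffer sizing in nvmet_bdev_execute_zone_mgmt_recv() above follows
the Zone Management Receive wire format: NUMD is a 0's based dword
count, and the report consists of a 64-byte header followed by 64-byte
zone descriptors. A small C sketch of that arithmetic (illustration
only; the struct layouts restate the report format the patch uses
instead of pulling in the kernel headers):

#include <stdint.h>
#include <stdio.h>

struct zone_report_hdr {
	uint64_t nr_zones;
	uint8_t  rsvd8[56];	/* pads the header to 64 bytes */
};

struct zone_descriptor {
	uint8_t  zt, zs, za;
	uint8_t  rsvd3[5];
	uint64_t zcap, zslba, wp;
	uint8_t  rsvd32[32];	/* pads the descriptor to 64 bytes */
};

int main(void)
{
	uint32_t numd = 8191;			/* example: host asks for 32 KiB */
	uint32_t bufsize = (numd + 1) << 2;	/* dwords -> bytes, 0's based */
	uint32_t nr_zones = (bufsize - sizeof(struct zone_report_hdr)) /
			    sizeof(struct zone_descriptor);

	printf("bufsize=%u nr_zones=%u\n", (unsigned)bufsize,
	       (unsigned)nr_zones);
	return 0;
}

For a 32 KiB data buffer this yields nr_zones = 511, which is the cap
the handler passes to blkdev_report_zones() so the report callback can
never overflow the __vmalloc'd buffer.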
* Re: [PATCH V11 2/4] nvmet: add ZBD over ZNS backend support 2021-03-11 4:39 ` [PATCH V11 2/4] nvmet: add ZBD over ZNS backend support Chaitanya Kulkarni @ 2021-03-11 5:14 ` Damien Le Moal 2021-03-11 5:29 ` Chaitanya Kulkarni 0 siblings, 1 reply; 13+ messages in thread From: Damien Le Moal @ 2021-03-11 5:14 UTC (permalink / raw) To: Chaitanya Kulkarni, linux-nvme; +Cc: hch, kbusch, sagi On 2021/03/11 13:39, Chaitanya Kulkarni wrote: > NVMe TP 4053 – Zoned Namespaces (ZNS) allows host software to > communicate with a non-volatile memory subsystem using zones for > NVMe protocol-based controllers. NVMeOF already support the ZNS NVMe > Protocol compliant devices on the target in the passthru mode. There > are Generic zoned block devices like Shingled Magnetic Recording (SMR) > HDDs that are not based on the NVMe protocol. > > This patch adds ZNS backend to support the ZBDs for NVMeOF target. > > This support includes implementing the new command set NVME_CSI_ZNS, > adding different command handlers for ZNS command set such as > NVMe Identify Controller, NVMe Identify Namespace, NVMe Zone Append, > NVMe Zone Management Send and NVMe Zone Management Receive. > > With the new command set identifier, we also update the target command > effects logs to reflect the ZNS compliant commands. > > Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> > --- > drivers/nvme/target/Makefile | 1 + > drivers/nvme/target/admin-cmd.c | 27 +++ > drivers/nvme/target/io-cmd-bdev.c | 34 +++- > drivers/nvme/target/nvmet.h | 38 ++++ > drivers/nvme/target/zns.c | 327 ++++++++++++++++++++++++++++++ > 5 files changed, 419 insertions(+), 8 deletions(-) > create mode 100644 drivers/nvme/target/zns.c > > diff --git a/drivers/nvme/target/Makefile b/drivers/nvme/target/Makefile > index ebf91fc4c72e..9837e580fa7e 100644 > --- a/drivers/nvme/target/Makefile > +++ b/drivers/nvme/target/Makefile > @@ -12,6 +12,7 @@ obj-$(CONFIG_NVME_TARGET_TCP) += nvmet-tcp.o > nvmet-y += core.o configfs.o admin-cmd.o fabrics-cmd.o \ > discovery.o io-cmd-file.o io-cmd-bdev.o > nvmet-$(CONFIG_NVME_TARGET_PASSTHRU) += passthru.o > +nvmet-$(CONFIG_BLK_DEV_ZONED) += zns.o > nvme-loop-y += loop.o > nvmet-rdma-y += rdma.o > nvmet-fc-y += fc.o > diff --git a/drivers/nvme/target/admin-cmd.c b/drivers/nvme/target/admin-cmd.c > index 176c8593d341..bf4876df624a 100644 > --- a/drivers/nvme/target/admin-cmd.c > +++ b/drivers/nvme/target/admin-cmd.c > @@ -179,6 +179,13 @@ static void nvmet_set_csi_nvm_effects(struct nvme_effects_log *log) > log->iocs[nvme_cmd_write_zeroes] = cpu_to_le32(1 << 0); > } > > +static void nvmet_set_csi_zns_effects(struct nvme_effects_log *log) > +{ > + log->iocs[nvme_cmd_zone_append] = cpu_to_le32(1 << 0); > + log->iocs[nvme_cmd_zone_mgmt_send] = cpu_to_le32(1 << 0); > + log->iocs[nvme_cmd_zone_mgmt_recv] = cpu_to_le32(1 << 0); > +} > + > static void nvmet_execute_get_log_cmd_effects_ns(struct nvmet_req *req) > { > struct nvme_effects_log *log; > @@ -194,6 +201,15 @@ static void nvmet_execute_get_log_cmd_effects_ns(struct nvmet_req *req) > case NVME_CSI_NVM: > nvmet_set_csi_nvm_effects(log); > break; > + case NVME_CSI_ZNS: > + if (!IS_ENABLED(CONFIG_BLK_DEV_ZONED)) { > + status = NVME_SC_INVALID_IO_CMD_SET; > + goto free; > + } > + > + nvmet_set_csi_nvm_effects(log); > + nvmet_set_csi_zns_effects(log); > + break; > default: > status = NVME_SC_INVALID_LOG_PAGE; > goto free; > @@ -630,6 +646,13 @@ static u16 nvmet_execute_identify_desclist_csi(struct nvmet_req *req, off_t *o) > { > switch (req->ns->csi) { > case 
NVME_CSI_NVM: > + return nvmet_copy_ns_identifier(req, NVME_NIDT_CSI, > + NVME_NIDT_CSI_LEN, > + &req->ns->csi, o); > + case NVME_CSI_ZNS: > + if (!IS_ENABLED(CONFIG_BLK_DEV_ZONED)) > + return NVME_SC_INVALID_IO_CMD_SET; > + > return nvmet_copy_ns_identifier(req, NVME_NIDT_CSI, > NVME_NIDT_CSI_LEN, > &req->ns->csi, o); > @@ -682,8 +705,12 @@ static void nvmet_execute_identify(struct nvmet_req *req) > switch (req->cmd->identify.cns) { > case NVME_ID_CNS_NS: > return nvmet_execute_identify_ns(req); > + case NVME_ID_CNS_CS_NS: > + return nvmet_execute_identify_cns_cs_ns(req); > case NVME_ID_CNS_CTRL: > return nvmet_execute_identify_ctrl(req); > + case NVME_ID_CNS_CS_CTRL: > + return nvmet_execute_identify_cns_cs_ctrl(req); > case NVME_ID_CNS_NS_ACTIVE_LIST: > return nvmet_execute_identify_nslist(req); > case NVME_ID_CNS_NS_DESC_LIST: > diff --git a/drivers/nvme/target/io-cmd-bdev.c b/drivers/nvme/target/io-cmd-bdev.c > index 9a8b3726a37c..ada0215f5e56 100644 > --- a/drivers/nvme/target/io-cmd-bdev.c > +++ b/drivers/nvme/target/io-cmd-bdev.c > @@ -63,6 +63,14 @@ static void nvmet_bdev_ns_enable_integrity(struct nvmet_ns *ns) > } > } > > +void nvmet_bdev_ns_disable(struct nvmet_ns *ns) > +{ > + if (ns->bdev) { > + blkdev_put(ns->bdev, FMODE_WRITE | FMODE_READ); > + ns->bdev = NULL; > + } > +} > + > int nvmet_bdev_ns_enable(struct nvmet_ns *ns) > { > int ret; > @@ -86,15 +94,16 @@ int nvmet_bdev_ns_enable(struct nvmet_ns *ns) > if (IS_ENABLED(CONFIG_BLK_DEV_INTEGRITY_T10)) > nvmet_bdev_ns_enable_integrity(ns); > > - return 0; > -} > - > -void nvmet_bdev_ns_disable(struct nvmet_ns *ns) > -{ > - if (ns->bdev) { > - blkdev_put(ns->bdev, FMODE_WRITE | FMODE_READ); > - ns->bdev = NULL; > + /* bdev_is_zoned() is stubbed out of CONFIG_BLK_DEV_ZONED */ > + if (bdev_is_zoned(ns->bdev)) { > + if (!nvmet_bdev_zns_enable(ns)) { > + nvmet_bdev_ns_disable(ns); > + return -EINVAL; > + } > + ns->csi = NVME_CSI_ZNS; > } > + > + return 0; > } > > void nvmet_bdev_ns_revalidate(struct nvmet_ns *ns) > @@ -448,6 +457,15 @@ u16 nvmet_bdev_parse_io_cmd(struct nvmet_req *req) > case nvme_cmd_write_zeroes: > req->execute = nvmet_bdev_execute_write_zeroes; > return 0; > + case nvme_cmd_zone_append: > + req->execute = nvmet_bdev_execute_zone_append; > + return 0; > + case nvme_cmd_zone_mgmt_recv: > + req->execute = nvmet_bdev_execute_zone_mgmt_recv; > + return 0; > + case nvme_cmd_zone_mgmt_send: > + req->execute = nvmet_bdev_execute_zone_mgmt_send; > + return 0; > default: > return nvmet_report_invalid_opcode(req); > } > diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h > index ee5999920155..f3fccc49de03 100644 > --- a/drivers/nvme/target/nvmet.h > +++ b/drivers/nvme/target/nvmet.h > @@ -247,6 +247,10 @@ struct nvmet_subsys { > unsigned int admin_timeout; > unsigned int io_timeout; > #endif /* CONFIG_NVME_TARGET_PASSTHRU */ > + > +#ifdef CONFIG_BLK_DEV_ZONED > + u8 zasl; > +#endif /* CONFIG_BLK_DEV_ZONED */ > }; > > static inline struct nvmet_subsys *to_subsys(struct config_item *item) > @@ -584,6 +588,40 @@ static inline struct nvme_ctrl *nvmet_passthru_ctrl(struct nvmet_subsys *subsys) > } > #endif /* CONFIG_NVME_TARGET_PASSTHRU */ > > +#ifdef CONFIG_BLK_DEV_ZONED > +bool nvmet_bdev_zns_enable(struct nvmet_ns *ns); > +void nvmet_execute_identify_cns_cs_ctrl(struct nvmet_req *req); > +void nvmet_execute_identify_cns_cs_ns(struct nvmet_req *req); > +void nvmet_bdev_execute_zone_mgmt_recv(struct nvmet_req *req); > +void nvmet_bdev_execute_zone_mgmt_send(struct nvmet_req *req); > +void 
nvmet_bdev_execute_zone_append(struct nvmet_req *req); > +#else /* CONFIG_BLK_DEV_ZONED */ > +static inline bool nvmet_bdev_zns_enable(struct nvmet_ns *ns) > +{ > + return false; > +} > +static inline void > +nvmet_execute_identify_cns_cs_ctrl(struct nvmet_req *req) > +{ > +} > +static inline void > +nvmet_execute_identify_cns_cs_ns(struct nvmet_req *req) > +{ > +} > +static inline void > +nvmet_bdev_execute_zone_mgmt_recv(struct nvmet_req *req) > +{ > +} > +static inline void > +nvmet_bdev_execute_zone_mgmt_send(struct nvmet_req *req) > +{ > +} > +static inline void > +nvmet_bdev_execute_zone_append(struct nvmet_req *req) > +{ > +} > +#endif /* CONFIG_BLK_DEV_ZONED */ > + > static inline struct nvme_ctrl * > nvmet_req_passthru_ctrl(struct nvmet_req *req) > { > diff --git a/drivers/nvme/target/zns.c b/drivers/nvme/target/zns.c > new file mode 100644 > index 000000000000..8121b29df766 > --- /dev/null > +++ b/drivers/nvme/target/zns.c > @@ -0,0 +1,327 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* > + * NVMe ZNS-ZBD command implementation. > + * Copyright (c) 2020-2021 HGST, a Western Digital Company. > + */ > +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt > +#include <linux/nvme.h> > +#include <linux/blkdev.h> > +#include "nvmet.h" > + > +/* > + * We set the Memory Page Size Minimum (MPSMIN) for target controller to 0 > + * which gets added by 12 in the nvme_enable_ctrl() which results in 2^12 = 4k > + * as page_shift value. When calculating the ZASL use shift by 12. > + */ > +#define NVMET_MPSMIN_SHIFT 12 > + > +static u16 nvmet_bdev_zns_checks(struct nvmet_req *req) > +{ > + if (!bdev_is_zoned(req->ns->bdev)) > + return NVME_SC_INVALID_NS | NVME_SC_DNR; > + > + if (req->cmd->zmr.zra != NVME_ZRA_ZONE_REPORT) { > + req->error_loc = offsetof(struct nvme_zone_mgmt_recv_cmd, zra); > + return NVME_SC_INVALID_FIELD | NVME_SC_DNR; > + } > + > + if (req->cmd->zmr.zrasf != NVME_ZRASF_ZONE_REPORT_ALL) { > + req->error_loc = > + offsetof(struct nvme_zone_mgmt_recv_cmd, zrasf); > + return NVME_SC_INVALID_FIELD | NVME_SC_DNR; > + } > + > + if (req->cmd->zmr.pr != NVME_REPORT_ZONE_PARTIAL) { > + req->error_loc = offsetof(struct nvme_zone_mgmt_recv_cmd, pr); > + return NVME_SC_INVALID_FIELD | NVME_SC_DNR; > + } > + > + return NVME_SC_SUCCESS; > +} > + > +static inline u8 nvmet_zasl(unsigned int zone_append_sects) > +{ > + /* > + * Zone Append Size Limit is the value expressed in the units > + * of minimum memory page size (i.e. 12) and is reported as a power of 2. > + */ > + return ilog2((zone_append_sects << 9) >> NVMET_MPSMIN_SHIFT); > +} > + > +static inline bool nvmet_zns_update_zasl(struct nvmet_ns *ns) > +{ > + struct request_queue *q = ns->bdev->bd_disk->queue; > + u8 zasl = nvmet_zasl(queue_max_zone_append_sectors(q)); > + > + if (ns->subsys->zasl) > + return ns->subsys->zasl < zasl; > + > + ns->subsys->zasl = zasl; > + return true; > +} > + > +static int nvmet_bdev_validate_zns_zones_cb(struct blk_zone *z, > + unsigned int i, void *data) > +{ > + if (z->type == BLK_ZONE_TYPE_CONVENTIONAL) > + return -EOPNOTSUPP; > + return 0; > +} > + > +static bool nvmet_bdev_has_conv_zones(struct block_device *bdev) > +{ > + int ret; > + > + if (bdev->bd_disk->queue->conv_zones_bitmap) > + return true; > + > + ret = blkdev_report_zones(bdev, 0, blkdev_nr_zones(bdev->bd_disk), > + nvmet_bdev_validate_zns_zones_cb, NULL); > + > + return ret <= 0 ?
true : false; > +} > + > +bool nvmet_bdev_zns_enable(struct nvmet_ns *ns) > +{ > + if (nvmet_bdev_has_conv_zones(ns->bdev)) > + return false; > + > + ns->blksize_shift = blksize_bits(bdev_physical_block_size(ns->bdev)); Shouldn't this be using logical block size ? Otherwise, zsze calculation in nvmet_execute_identify_cns_cs_ns() could be wrong. > + > + if (!nvmet_zns_update_zasl(ns)) > + return false; > + > + return !(get_capacity(ns->bdev->bd_disk) & > + (bdev_zone_sectors(ns->bdev) - 1)); > +} > + > +void nvmet_execute_identify_cns_cs_ctrl(struct nvmet_req *req) > +{ > + u8 zasl = req->sq->ctrl->subsys->zasl; > + struct nvmet_ctrl *ctrl = req->sq->ctrl; > + struct nvme_id_ctrl_zns *id; > + u16 status; > + > + if (req->cmd->identify.csi != NVME_CSI_ZNS) { > + req->error_loc = offsetof(struct nvme_common_command, opcode); > + status = NVME_SC_INVALID_OPCODE | NVME_SC_DNR; > + goto out; > + } > + > + id = kzalloc(sizeof(*id), GFP_KERNEL); > + if (!id) { > + status = NVME_SC_INTERNAL; > + goto out; > + } > + > + if (ctrl->ops->get_mdts) > + id->zasl = min_t(u8, ctrl->ops->get_mdts(ctrl), zasl); > + else > + id->zasl = zasl; > + > + status = nvmet_copy_to_sgl(req, 0, id, sizeof(*id)); > + > + kfree(id); > +out: > + nvmet_req_complete(req, status); > +} > + > +void nvmet_execute_identify_cns_cs_ns(struct nvmet_req *req) > +{ > + struct nvme_id_ns_zns *id_zns; > + u64 zsze; > + u16 status; > + > + if (req->cmd->identify.csi != NVME_CSI_ZNS) { > + req->error_loc = offsetof(struct nvme_common_command, opcode); > + status = NVME_SC_INVALID_OPCODE | NVME_SC_DNR; > + goto out; > + } > + > + if (le32_to_cpu(req->cmd->identify.nsid) == NVME_NSID_ALL) { > + req->error_loc = offsetof(struct nvme_identify, nsid); > + status = NVME_SC_INVALID_NS | NVME_SC_DNR; > + goto out; > + } > + > + id_zns = kzalloc(sizeof(*id_zns), GFP_KERNEL); > + if (!id_zns) { > + status = NVME_SC_INTERNAL; > + goto out; > + } > + > + status = nvmet_req_find_ns(req); > + if (status) { > + status = NVME_SC_INTERNAL; > + goto done; > + } > + > + if (!bdev_is_zoned(req->ns->bdev)) { > + req->error_loc = offsetof(struct nvme_identify, nsid); > + status = NVME_SC_INVALID_NS | NVME_SC_DNR; > + goto done; > + } > + > + nvmet_ns_revalidate(req->ns); > + zsze = (bdev_zone_sectors(req->ns->bdev) << 9) >> > + req->ns->blksize_shift; > + id_zns->lbafe[0].zsze = cpu_to_le64(zsze); > + id_zns->mor = cpu_to_le32(bdev_max_open_zones(req->ns->bdev)); > + id_zns->mar = cpu_to_le32(bdev_max_active_zones(req->ns->bdev)); > + > +done: > + status = nvmet_copy_to_sgl(req, 0, id_zns, sizeof(*id_zns)); > + kfree(id_zns); > +out: > + nvmet_req_complete(req, status); > +} > + > +struct nvmet_report_zone_data { > + struct nvmet_ns *ns; > + struct nvme_zone_report *rz; > +}; > + > +static int nvmet_bdev_report_zone_cb(struct blk_zone *z, unsigned i, void *d) > +{ > + struct nvmet_report_zone_data *report_zone_data = d; > + struct nvme_zone_descriptor *entries = report_zone_data->rz->entries; > + struct nvmet_ns *ns = report_zone_data->ns; > + > + entries[i].zcap = nvmet_sect_to_lba(ns, z->capacity); > + entries[i].zslba = nvmet_sect_to_lba(ns, z->start); > + entries[i].wp = nvmet_sect_to_lba(ns, z->wp); > + entries[i].za = z->reset ? 
1 << 2 : 0; > + entries[i].zt = z->type; > + entries[i].zs = z->cond << 4; > + > + return 0; > +} > + > +void nvmet_bdev_execute_zone_mgmt_recv(struct nvmet_req *req) > +{ > + sector_t sect = nvmet_lba_to_sect(req->ns, req->cmd->zmr.slba); > + u32 bufsize = (le32_to_cpu(req->cmd->zmr.numd) + 1) << 2; > + struct nvmet_report_zone_data data = { .ns = req->ns }; > + unsigned int nr_zones; > + int reported_zones; > + u16 status; > + > + status = nvmet_bdev_zns_checks(req); > + if (status) > + goto out; > + > + data.rz = __vmalloc(bufsize, GFP_KERNEL | __GFP_NORETRY | __GFP_ZERO); > + if (!data.rz) { > + status = NVME_SC_INTERNAL; > + goto out; > + } > + > + nr_zones = (bufsize - sizeof(struct nvme_zone_report)) / > + sizeof(struct nvme_zone_descriptor); What if nr_zones is 0 ? This should be failed. > + reported_zones = blkdev_report_zones(req->ns->bdev, sect, nr_zones, > + nvmet_bdev_report_zone_cb, &data); > + if (reported_zones < 0) { > + status = NVME_SC_INTERNAL; > + goto out_free_report_zones; > + } > + > + data.rz->nr_zones = cpu_to_le64(reported_zones); > + > + status = nvmet_copy_to_sgl(req, 0, data.rz, bufsize); > + > +out_free_report_zones: > + kvfree(data.rz); > +out: > + nvmet_req_complete(req, status); > +} > + > +void nvmet_bdev_execute_zone_mgmt_send(struct nvmet_req *req) > +{ > + sector_t sect = nvmet_lba_to_sect(req->ns, req->cmd->zms.slba); > + sector_t nr_sect = bdev_zone_sectors(req->ns->bdev); > + u16 status = NVME_SC_SUCCESS; > + u8 zsa = req->cmd->zms.zsa; > + enum req_opf op; > + int ret; > + const unsigned int zsa_to_op[] = { > + [NVME_ZONE_OPEN] = REQ_OP_ZONE_OPEN, > + [NVME_ZONE_CLOSE] = REQ_OP_ZONE_CLOSE, > + [NVME_ZONE_FINISH] = REQ_OP_ZONE_FINISH, > + [NVME_ZONE_RESET] = REQ_OP_ZONE_RESET, > + }; > + > + if (zsa > ARRAY_SIZE(zsa_to_op) || !zsa_to_op[zsa]) { > + status = NVME_SC_INVALID_FIELD; > + goto out; > + } > + > + op = zsa_to_op[zsa]; > + > + if (req->cmd->zms.select_all) > + nr_sect = get_capacity(req->ns->bdev->bd_disk); > + > + ret = blkdev_zone_mgmt(req->ns->bdev, op, sect, nr_sect, GFP_KERNEL); > + if (ret) > + status = NVME_SC_INTERNAL; > +out: > + nvmet_req_complete(req, status); > +} > + > +void nvmet_bdev_execute_zone_append(struct nvmet_req *req) > +{ > + sector_t sect = nvmet_lba_to_sect(req->ns, req->cmd->rw.slba); > + u16 status = NVME_SC_SUCCESS; > + unsigned int total_len = 0; > + struct scatterlist *sg; > + int ret = 0, sg_cnt; > + struct bio *bio; > + > + if (!nvmet_check_transfer_len(req, nvmet_rw_data_len(req))) > + return; > + > + if (!req->sg_cnt) { > + nvmet_req_complete(req, 0); > + return; > + } > + > + if (req->transfer_len <= NVMET_MAX_INLINE_DATA_LEN) { > + bio = &req->b.inline_bio; > + bio_init(bio, req->inline_bvec, ARRAY_SIZE(req->inline_bvec)); > + } else { > + bio = bio_alloc(GFP_KERNEL, req->sg_cnt); > + } > + > + bio_set_dev(bio, req->ns->bdev); > + bio->bi_iter.bi_sector = sect; > + bio->bi_opf = REQ_OP_ZONE_APPEND | REQ_SYNC | REQ_IDLE; > + if (req->cmd->rw.control & cpu_to_le16(NVME_RW_FUA)) > + bio->bi_opf |= REQ_FUA; > + > + for_each_sg(req->sg, sg, req->sg_cnt, sg_cnt) { > + struct page *p = sg_page(sg); > + unsigned int l = sg->length; > + unsigned int o = sg->offset; > + > + ret = bio_add_zone_append_page(bio, p, l, o); > + if (ret != sg->length) { > + status = NVME_SC_INTERNAL; > + goto out_bio_put; > + } > + > + total_len += sg->length; > + } > + > + if (total_len != nvmet_rw_data_len(req)) { > + status = NVME_SC_INTERNAL | NVME_SC_DNR; > + goto out_bio_put; > + } > + > + ret = submit_bio_wait(bio); > + 
req->cqe->result.u64 = nvmet_sect_to_lba(req->ns, > + bio->bi_iter.bi_sector); > + > +out_bio_put: > + if (bio != &req->b.inline_bio) > + bio_put(bio); > + nvmet_req_complete(req, ret < 0 ? NVME_SC_INTERNAL : status); > +} > -- Damien Le Moal Western Digital Research _______________________________________________ Linux-nvme mailing list Linux-nvme@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-nvme ^ permalink raw reply [flat|nested] 13+ messages in thread
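A quick aside on the ZASL arithmetic in nvmet_zasl() above, since it is easy to misread: the sketch below reproduces the math in stand-alone user-space C, assuming an example queue limit of 1024 zone append sectors; ilog2_u32() simply stands in for the kernel's ilog2().

#include <stdio.h>

/* User-space stand-in for the kernel's ilog2(). */
static unsigned int ilog2_u32(unsigned int v)
{
	unsigned int r = 0;

	while (v >>= 1)
		r++;
	return r;
}

int main(void)
{
	unsigned int zone_append_sects = 1024;	/* assumed queue limit: 512 KiB */
	unsigned int mpsmin_shift = 12;		/* MPSMIN 0 -> 4 KiB pages */

	/* Convert sectors to bytes, then to MPSMIN-sized pages. */
	unsigned int pages = (zone_append_sects << 9) >> mpsmin_shift;
	unsigned int zasl = ilog2_u32(pages);	/* ilog2(128) == 7 */

	/* ZASL 7 means 2^7 * 4 KiB = 512 KiB maximum zone append payload. */
	printf("zasl = %u -> max append = %u KiB\n", zasl, (1u << zasl) * 4);
	return 0;
}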
* Re: [PATCH V11 2/4] nvmet: add ZBD over ZNS backend support 2021-03-11 5:14 ` Damien Le Moal @ 2021-03-11 5:29 ` Chaitanya Kulkarni 2021-03-11 5:39 ` Damien Le Moal 0 siblings, 1 reply; 13+ messages in thread From: Chaitanya Kulkarni @ 2021-03-11 5:29 UTC (permalink / raw) To: Damien Le Moal, linux-nvme; +Cc: hch, kbusch, sagi >> + >> +bool nvmet_bdev_zns_enable(struct nvmet_ns *ns) >> +{ >> + if (nvmet_bdev_has_conv_zones(ns->bdev)) >> + return false; >> + >> + ns->blksize_shift = blksize_bits(bdev_physical_block_size(ns->bdev)); > Shouldn't this be using logical block size ? Otherwise, zsze calculation in > nvmet_execute_identify_cns_cs_ns() could be wrong. Okay, will send out V12 with that fix. >> + >> + if (!nvmet_zns_update_zasl(ns)) >> + return false; >> + >> + return !(get_capacity(ns->bdev->bd_disk) & >> + (bdev_zone_sectors(ns->bdev) - 1)); >> +} >> + >> +void nvmet_execute_identify_cns_cs_ctrl(struct nvmet_req *req) >> >> + >> +void nvmet_bdev_execute_zone_mgmt_recv(struct nvmet_req *req) >> +{ >> + sector_t sect = nvmet_lba_to_sect(req->ns, req->cmd->zmr.slba); >> + u32 bufsize = (le32_to_cpu(req->cmd->zmr.numd) + 1) << 2; >> + struct nvmet_report_zone_data data = { .ns = req->ns }; >> + unsigned int nr_zones; >> + int reported_zones; >> + u16 status; >> + >> + status = nvmet_bdev_zns_checks(req); >> + if (status) >> + goto out; >> + >> + data.rz = __vmalloc(bufsize, GFP_KERNEL | __GFP_NORETRY | __GFP_ZERO); >> + if (!data.rz) { >> + status = NVME_SC_INTERNAL; >> + goto out; >> + } >> + >> + nr_zones = (bufsize - sizeof(struct nvme_zone_report)) / >> + sizeof(struct nvme_zone_descriptor); > What if nr_zones is 0 ? This should be failed. blkdev_report_zones() already handles that check. I think the error condition below blkdev_report_zones() should include the <= 0 case instead of just <. Will send V12 with the <= fix. > >> + reported_zones = blkdev_report_zones(req->ns->bdev, sect, nr_zones, >> + nvmet_bdev_report_zone_cb, &data); >> + if (reported_zones < 0) { >> + status = NVME_SC_INTERNAL; >> + goto out_free_report_zones; >> + } >> + >> + data.rz->nr_zones = cpu_to_le64(reported_zones); >> + >> + status = nvmet_copy_to_sgl(req, 0, data.rz, bufsize); >> + >> +out_free_report_zones: >> + kvfree(data.rz); >> +out: >> + nvmet_req_complete(req, status); >> +} >> + >> _______________________________________________ Linux-nvme mailing list Linux-nvme@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-nvme ^ permalink raw reply [flat|nested] 13+ messages in thread
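For readers following along, a minimal sketch of the fix being agreed on here, assuming only the block size line changes (illustrative only, not the posted V12 hunk):

bool nvmet_bdev_zns_enable(struct nvmet_ns *ns)
{
	if (nvmet_bdev_has_conv_zones(ns->bdev))
		return false;

	/*
	 * Use the logical block size so that the zsze value derived from
	 * ns->blksize_shift in nvmet_execute_identify_cns_cs_ns() matches
	 * the LBA format the namespace actually exports to the host.
	 */
	ns->blksize_shift = blksize_bits(bdev_logical_block_size(ns->bdev));

	if (!nvmet_zns_update_zasl(ns))
		return false;

	return !(get_capacity(ns->bdev->bd_disk) &
			(bdev_zone_sectors(ns->bdev) - 1));
}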
* Re: [PATCH V11 2/4] nvmet: add ZBD over ZNS backend support 2021-03-11 5:29 ` Chaitanya Kulkarni @ 2021-03-11 5:39 ` Damien Le Moal 2021-03-11 5:41 ` Chaitanya Kulkarni 0 siblings, 1 reply; 13+ messages in thread From: Damien Le Moal @ 2021-03-11 5:39 UTC (permalink / raw) To: Chaitanya Kulkarni, linux-nvme; +Cc: hch, kbusch, sagi On 2021/03/11 14:29, Chaitanya Kulkarni wrote: >>> + >>> +bool nvmet_bdev_zns_enable(struct nvmet_ns *ns) >>> +{ >>> + if (nvmet_bdev_has_conv_zones(ns->bdev)) >>> + return false; >>> + >>> + ns->blksize_shift = blksize_bits(bdev_physical_block_size(ns->bdev)); >> Shouldn't this be using logical block size ? Otherwise, zsze calculation in >> nvmet_execute_identify_cns_cs_ns() could be wrong. > > Okay, will send out V12 with that fix. > >>> + >>> + if (!nvmet_zns_update_zasl(ns)) >>> + return false; >>> + >>> + return !(get_capacity(ns->bdev->bd_disk) & >>> + (bdev_zone_sectors(ns->bdev) - 1)); >>> +} >>> + >>> +void nvmet_execute_identify_cns_cs_ctrl(struct nvmet_req *req) >>> >>> + >>> +void nvmet_bdev_execute_zone_mgmt_recv(struct nvmet_req *req) >>> +{ >>> + sector_t sect = nvmet_lba_to_sect(req->ns, req->cmd->zmr.slba); >>> + u32 bufsize = (le32_to_cpu(req->cmd->zmr.numd) + 1) << 2; >>> + struct nvmet_report_zone_data data = { .ns = req->ns }; >>> + unsigned int nr_zones; >>> + int reported_zones; >>> + u16 status; >>> + >>> + status = nvmet_bdev_zns_checks(req); >>> + if (status) >>> + goto out; >>> + >>> + data.rz = __vmalloc(bufsize, GFP_KERNEL | __GFP_NORETRY | __GFP_ZERO); >>> + if (!data.rz) { >>> + status = NVME_SC_INTERNAL; >>> + goto out; >>> + } >>> + >>> + nr_zones = (bufsize - sizeof(struct nvme_zone_report)) / >>> + sizeof(struct nvme_zone_descriptor); >> What if nr_zones is 0 ? This should be failed. > > blkdev_report_zones() already handles that check. I think the error condition > below blkdev_report_zones() should include the <= 0 case instead of just <. Reporting 0 zones with a valid buffer size (nr_zones > 0) is a valid reply, not an error. This can happen depending on reporting options. Even though blkdev_report_zones() does not allow reporting options, it would be strange to fail empty reports. > > Will send V12 with the <= fix. > >> >>> + reported_zones = blkdev_report_zones(req->ns->bdev, sect, nr_zones, >>> + nvmet_bdev_report_zone_cb, &data); >>> + if (reported_zones < 0) { >>> + status = NVME_SC_INTERNAL; >>> + goto out_free_report_zones; >>> + } >>> + >>> + data.rz->nr_zones = cpu_to_le64(reported_zones); >>> + >>> + status = nvmet_copy_to_sgl(req, 0, data.rz, bufsize); >>> + >>> +out_free_report_zones: >>> + kvfree(data.rz); >>> +out: >>> + nvmet_req_complete(req, status); >>> +} >>> + >>> > > -- Damien Le Moal Western Digital Research _______________________________________________ Linux-nvme mailing list Linux-nvme@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-nvme ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH V11 2/4] nvmet: add ZBD over ZNS backend support 2021-03-11 5:39 ` Damien Le Moal @ 2021-03-11 5:41 ` Chaitanya Kulkarni 0 siblings, 0 replies; 13+ messages in thread From: Chaitanya Kulkarni @ 2021-03-11 5:41 UTC (permalink / raw) To: Damien Le Moal, linux-nvme; +Cc: hch, kbusch, sagi On 3/10/21 21:39, Damien Le Moal wrote: >>>> + nr_zones = (bufsize - sizeof(struct nvme_zone_report)) / >>>> + sizeof(struct nvme_zone_descriptor); >>> What if nr_zones is 0 ? This should be failed. >> blkdev_report_zones() already handles that check. I think the error condition >> below blkdev_report_zones() should include the <= 0 case instead of just <. > Reporting 0 zones with a valid buffer size (nr_zones > 0) is a valid reply, not > an error. This can happen depending on reporting options. Even though > blkdev_report_zones() does not allow reporting options, it would be strange to > fail empty reports. > I see, will add the check before calling blkdev_report_zones() for nr_zones == 0. Thanks for the clarification. _______________________________________________ Linux-nvme mailing list Linux-nvme@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-nvme ^ permalink raw reply [flat|nested] 13+ messages in thread
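One possible shape for that V12 check, sketched here for illustration and placed before the report buffer is allocated; the status code and error_loc choice are assumptions, not taken from a posted patch:

	/*
	 * Sketch: reject buffers that cannot hold the report header plus at
	 * least one zone descriptor, so nr_zones below can never be zero.
	 */
	if (bufsize < sizeof(struct nvme_zone_report) +
		      sizeof(struct nvme_zone_descriptor)) {
		req->error_loc = offsetof(struct nvme_zone_mgmt_recv_cmd, numd);
		status = NVME_SC_INVALID_FIELD | NVME_SC_DNR;
		goto out;
	}

This also avoids the unsigned underflow that the later nr_zones subtraction would hit if the host passed a numd smaller than the report header.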
* [PATCH V11 3/4] nvmet: add nvmet_req_bio put helper for backends 2021-03-11 4:39 [PATCH V11 0/4] nvmet: add ZBD backend support Chaitanya Kulkarni 2021-03-11 4:39 ` [PATCH V11 1/4] nvmet: add NVM Command Set Identifier support Chaitanya Kulkarni 2021-03-11 4:39 ` [PATCH V11 2/4] nvmet: add ZBD over ZNS backend support Chaitanya Kulkarni @ 2021-03-11 4:39 ` Chaitanya Kulkarni 2021-03-11 4:39 ` [PATCH V11 4/4] nvme: add comments to nvme_zns_alloc_report_buffer Chaitanya Kulkarni 3 siblings, 0 replies; 13+ messages in thread From: Chaitanya Kulkarni @ 2021-03-11 4:39 UTC (permalink / raw) To: linux-nvme; +Cc: hch, kbusch, sagi, damien.lemoal, Chaitanya Kulkarni With the addition of the zns backend we now have three different backends with the inline bio optimization. That leads to duplicate code for freeing the bio in all three backends: generic bdev, passthru and generic zns. Add a helper function to avoid duplicate code and update the respective backends. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> --- drivers/nvme/target/io-cmd-bdev.c | 3 +-- drivers/nvme/target/nvmet.h | 6 ++++++ drivers/nvme/target/passthru.c | 3 +-- drivers/nvme/target/zns.c | 3 +-- 4 files changed, 9 insertions(+), 6 deletions(-) diff --git a/drivers/nvme/target/io-cmd-bdev.c b/drivers/nvme/target/io-cmd-bdev.c index ada0215f5e56..ca39d787d71f 100644 --- a/drivers/nvme/target/io-cmd-bdev.c +++ b/drivers/nvme/target/io-cmd-bdev.c @@ -173,8 +173,7 @@ static void nvmet_bio_done(struct bio *bio) struct nvmet_req *req = bio->bi_private; nvmet_req_complete(req, blk_to_nvme_status(req, bio->bi_status)); - if (bio != &req->b.inline_bio) - bio_put(bio); + nvmet_req_bio_put(req, bio); } #ifdef CONFIG_BLK_DEV_INTEGRITY diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h index f3fccc49de03..2f1bd3ac34a2 100644 --- a/drivers/nvme/target/nvmet.h +++ b/drivers/nvme/target/nvmet.h @@ -654,4 +654,10 @@ static inline sector_t nvmet_lba_to_sect(struct nvmet_ns *ns, __le64 lba) return le64_to_cpu(lba) << (ns->blksize_shift - SECTOR_SHIFT); } +static inline void nvmet_req_bio_put(struct nvmet_req *req, struct bio *bio) +{ + if (bio != &req->b.inline_bio) + bio_put(bio); +} + #endif /* _NVMET_H */ diff --git a/drivers/nvme/target/passthru.c b/drivers/nvme/target/passthru.c index 26c587ccd152..011aeebace55 100644 --- a/drivers/nvme/target/passthru.c +++ b/drivers/nvme/target/passthru.c @@ -206,8 +206,7 @@ static int nvmet_passthru_map_sg(struct nvmet_req *req, struct request *rq) for_each_sg(req->sg, sg, req->sg_cnt, i) { if (bio_add_pc_page(rq->q, bio, sg_page(sg), sg->length, sg->offset) < sg->length) { - if (bio != &req->p.inline_bio) - bio_put(bio); + nvmet_req_bio_put(req, bio); return -EINVAL; } } diff --git a/drivers/nvme/target/zns.c b/drivers/nvme/target/zns.c index 8121b29df766..42f16ce55fa0 100644 --- a/drivers/nvme/target/zns.c +++ b/drivers/nvme/target/zns.c @@ -321,7 +321,6 @@ void nvmet_bdev_execute_zone_append(struct nvmet_req *req) bio->bi_iter.bi_sector); out_bio_put: - if (bio != &req->b.inline_bio) - bio_put(bio); + nvmet_req_bio_put(req, bio); nvmet_req_complete(req, ret < 0 ? NVME_SC_INTERNAL : status); } -- 2.22.1 _______________________________________________ Linux-nvme mailing list Linux-nvme@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-nvme ^ permalink raw reply related [flat|nested] 13+ messages in thread
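As an illustration of the pattern this helper encodes, a hypothetical completion callback in some new backend would follow the same shape as nvmet_bio_done() above; nvmet_example_bio_done() is made up for this sketch, and blk_to_nvme_status() is the bdev backend's local status translation helper:

static void nvmet_example_bio_done(struct bio *bio)
{
	struct nvmet_req *req = bio->bi_private;

	nvmet_req_complete(req, blk_to_nvme_status(req, bio->bi_status));
	/* Only calls bio_put() if this is not the request's inline bio. */
	nvmet_req_bio_put(req, bio);
}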
* [PATCH V11 4/4] nvme: add comments to nvme_zns_alloc_report_buffer 2021-03-11 4:39 [PATCH V11 0/4] nvmet: add ZBD backend support Chaitanya Kulkarni ` (2 preceding siblings ...) 2021-03-11 4:39 ` [PATCH V11 3/4] nvmet: add nvmet_req_bio put helper for backends Chaitanya Kulkarni @ 2021-03-11 4:39 ` Chaitanya Kulkarni 2021-03-11 5:08 ` Damien Le Moal 3 siblings, 1 reply; 13+ messages in thread From: Chaitanya Kulkarni @ 2021-03-11 4:39 UTC (permalink / raw) To: linux-nvme; +Cc: hch, kbusch, sagi, damien.lemoal, Chaitanya Kulkarni The report zone buffer calculation is dependent on the nvme report zones header, the nvme report zone descriptor and on the various block layer request queue attributes such as queue_max_hw_sectors(), queue_max_segments(). These queue_XXX attributes are calculated from different ctrl values in the nvme-core. Add clear comments about what values we are using and how they are calculated based on the controller's attributes. This is needed since, when referencing the code after a long time, it is not straightforward to understand how we calculate the buffer size, given that there are variables and ctrl attributes involved. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> --- drivers/nvme/host/zns.c | 22 ++++++++++++++++++++++ 1 file changed, 22 insertions(+) diff --git a/drivers/nvme/host/zns.c b/drivers/nvme/host/zns.c index bc2f344f0ae0..c2d08d9cc269 100644 --- a/drivers/nvme/host/zns.c +++ b/drivers/nvme/host/zns.c @@ -125,16 +125,38 @@ static void *nvme_zns_alloc_report_buffer(struct nvme_ns *ns, size_t bufsize; void *buf; + /* + * Set the minimum buffer size for report zone header and one zone + * descriptor. + */ const size_t min_bufsize = sizeof(struct nvme_zone_report) + sizeof(struct nvme_zone_descriptor); + /* + * Recalculate the number of zones based on the disk size and zone size. + */ nr_zones = min_t(unsigned int, nr_zones, get_capacity(ns->disk) >> ilog2(ns->zsze)); + /* + * Calculate the buffer size based on the report zone header and the + * number of zone descriptors required for each zone. + */ bufsize = sizeof(struct nvme_zone_report) + nr_zones * sizeof(struct nvme_zone_descriptor); + + /* + * Recalculate and limit the buffer size to the queue max hw sectors. + * For NVMe, queue max hw sectors are calculated based on the + * controller's Maximum Data Transfer Size (MDTS). + */ bufsize = min_t(size_t, bufsize, queue_max_hw_sectors(q) << SECTOR_SHIFT); + /* + * Recalculate and limit the buffer size to the queue max segments. For + * NVMe, queue max segments are calculated based on how many controller + * pages are needed to fit the max hw sectors. + */ bufsize = min_t(size_t, bufsize, queue_max_segments(q) << PAGE_SHIFT); while (bufsize >= min_bufsize) { -- 2.22.1 _______________________________________________ Linux-nvme mailing list Linux-nvme@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-nvme ^ permalink raw reply related [flat|nested] 13+ messages in thread
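To make the clamping sequence concrete, here is a stand-alone user-space sketch with assumed values: the 64-byte sizes match the NVMe report zones header and zone descriptor layouts, while the queue limits (a 512 KiB MDTS-derived transfer size and 127 segments of 4 KiB pages) are invented for illustration.

#include <stdio.h>

int main(void)
{
	/* All values below are assumptions for illustration only. */
	size_t hdr = 64;			/* report zones header */
	size_t desc = 64;			/* one zone descriptor */
	size_t nr_zones = 10000;		/* zones left on the disk */
	size_t mdts_bytes = 1024UL << 9;	/* MDTS -> 512 KiB max transfer */
	size_t seg_bytes = 127UL << 12;		/* 127 segments of 4 KiB pages */

	size_t bufsize = hdr + nr_zones * desc;	/* 640064 bytes requested */

	if (bufsize > mdts_bytes)
		bufsize = mdts_bytes;		/* clamped to 524288 */
	if (bufsize > seg_bytes)
		bufsize = seg_bytes;		/* clamped to 520192 */

	/* Prints: bufsize = 520192 bytes -> room for 8127 descriptors */
	printf("bufsize = %zu bytes -> room for %zu descriptors\n",
	       bufsize, (bufsize - hdr) / desc);
	return 0;
}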
* Re: [PATCH V11 4/4] nvme: add comments to nvme_zns_alloc_report_buffer 2021-03-11 4:39 ` [PATCH V11 4/4] nvme: add comments to nvme_zns_alloc_report_buffer Chaitanya Kulkarni @ 2021-03-11 5:08 ` Damien Le Moal 2021-03-11 5:36 ` Chaitanya Kulkarni 0 siblings, 1 reply; 13+ messages in thread From: Damien Le Moal @ 2021-03-11 5:08 UTC (permalink / raw) To: Chaitanya Kulkarni, linux-nvme; +Cc: hch, kbusch, sagi On 2021/03/11 13:39, Chaitanya Kulkarni wrote: > The report zone buffer calculation is dependent on the nvme report zones > header, the nvme report zone descriptor and on the various block > layer request queue attributes such as queue_max_hw_sectors(), > queue_max_segments(). These queue_XXX attributes are calculated from > different ctrl values in the nvme-core. > > Add clear comments about what values we are using and how they are > calculated based on the controller's attributes. > > This is needed since, when referencing the code after a long time, it is not > straightforward to understand how we calculate the buffer size, given > that there are variables and ctrl attributes involved. > > Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> > --- > drivers/nvme/host/zns.c | 22 ++++++++++++++++++++++ > 1 file changed, 22 insertions(+) > > diff --git a/drivers/nvme/host/zns.c b/drivers/nvme/host/zns.c > index bc2f344f0ae0..c2d08d9cc269 100644 > --- a/drivers/nvme/host/zns.c > +++ b/drivers/nvme/host/zns.c > @@ -125,16 +125,38 @@ static void *nvme_zns_alloc_report_buffer(struct nvme_ns *ns, > size_t bufsize; > void *buf; > > + /* > + * Set the minimum buffer size for report zone header and one zone > + * descriptor. > + */ > const size_t min_bufsize = sizeof(struct nvme_zone_report) + > sizeof(struct nvme_zone_descriptor); This seems unused. And not really useful anyway since if this function is being called, it should be clear already that at least one zone can be reported, that is, the report start sector is below capacity. > > + /* > + * Recalculate the number of zones based on the disk size and zone size. > + */ > nr_zones = min_t(unsigned int, nr_zones, > get_capacity(ns->disk) >> ilog2(ns->zsze)); > > + /* > + * Calculate the buffer size based on the report zone header and the > + * number of zone descriptors required for each zone. > + */ This comment is not really useful. > bufsize = sizeof(struct nvme_zone_report) + > nr_zones * sizeof(struct nvme_zone_descriptor); > + > + /* > + * Recalculate and limit the buffer size to the queue max hw sectors. > + * For NVMe, queue max hw sectors are calculated based on the > + * controller's Maximum Data Transfer Size (MDTS). > + */ What about combining this comment and the next into: /* * Limit the buffer size to the maximum data transfer size and to * the maximum number of segments allowed. */ Simpler in my opinion. > bufsize = min_t(size_t, bufsize, > queue_max_hw_sectors(q) << SECTOR_SHIFT); > + /* > + * Recalculate and limit the buffer size to the queue max segments. For > + * NVMe, queue max segments are calculated based on how many controller > + * pages are needed to fit the max hw sectors. > + */ > bufsize = min_t(size_t, bufsize, queue_max_segments(q) << PAGE_SHIFT); > > while (bufsize >= min_bufsize) { > -- Damien Le Moal Western Digital Research _______________________________________________ Linux-nvme mailing list Linux-nvme@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-nvme ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH V11 4/4] nvme: add comments to nvme_zns_alloc_report_buffer 2021-03-11 5:08 ` Damien Le Moal @ 2021-03-11 5:36 ` Chaitanya Kulkarni 2021-03-11 6:22 ` hch 0 siblings, 1 reply; 13+ messages in thread From: Chaitanya Kulkarni @ 2021-03-11 5:36 UTC (permalink / raw) To: Damien Le Moal, linux-nvme; +Cc: hch, kbusch, sagi On 3/10/21 21:08, Damien Le Moal wrote: > On 2021/03/11 13:39, Chaitanya Kulkarni wrote: >> The report zone buffer calculation is dependent on the nvme report zones >> header, the nvme report zone descriptor and on the various block >> layer request queue attributes such as queue_max_hw_sectors(), >> queue_max_segments(). These queue_XXX attributes are calculated from >> different ctrl values in the nvme-core. >> >> Add clear comments about what values we are using and how they are >> calculated based on the controller's attributes. >> >> This is needed since, when referencing the code after a long time, it is not >> straightforward to understand how we calculate the buffer size, given >> that there are variables and ctrl attributes involved. >> >> Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> >> --- >> drivers/nvme/host/zns.c | 22 ++++++++++++++++++++++ >> 1 file changed, 22 insertions(+) >> >> diff --git a/drivers/nvme/host/zns.c b/drivers/nvme/host/zns.c >> index bc2f344f0ae0..c2d08d9cc269 100644 >> --- a/drivers/nvme/host/zns.c >> +++ b/drivers/nvme/host/zns.c >> @@ -125,16 +125,38 @@ static void *nvme_zns_alloc_report_buffer(struct nvme_ns *ns, >> size_t bufsize; >> void *buf; >> >> + /* >> + * Set the minimum buffer size for report zone header and one zone >> + * descriptor. >> + */ >> const size_t min_bufsize = sizeof(struct nvme_zone_report) + >> sizeof(struct nvme_zone_descriptor); > This seems unused. And not really useful anyway since if this function is being > called, it should be clear already that at least one zone can be reported, that > is, the report start sector is below capacity. Okay, will remove the comment in V12. >> >> + /* >> + * Recalculate the number of zones based on the disk size and zone size. >> + */ >> nr_zones = min_t(unsigned int, nr_zones, >> get_capacity(ns->disk) >> ilog2(ns->zsze)); >> >> + /* >> + * Calculate the buffer size based on the report zone header and the >> + * number of zone descriptors required for each zone. >> + */ > This comment is not really useful. Okay, will remove it in V12. > >> bufsize = sizeof(struct nvme_zone_report) + >> nr_zones * sizeof(struct nvme_zone_descriptor); >> + >> + /* >> + * Recalculate and limit the buffer size to the queue max hw sectors. >> + * For NVMe, queue max hw sectors are calculated based on the >> + * controller's Maximum Data Transfer Size (MDTS). >> + */ Will remove the above comment. > What about combining this comment and the next into: > > /* > * Limit the buffer size to the maximum data transfer size and to > * the maximum number of segments allowed. > */ Will use the above comment. > Simpler in my opinion. >> bufsize = min_t(size_t, bufsize, >> queue_max_hw_sectors(q) << SECTOR_SHIFT); >> + /* >> + * Recalculate and limit the buffer size to the queue max segments. For >> + * NVMe, queue max segments are calculated based on how many controller >> + * pages are needed to fit the max hw sectors. >> + */ Will remove the above comment.
>> bufsize = min_t(size_t, bufsize, queue_max_segments(q) << PAGE_SHIFT); >> >> while (bufsize >= min_bufsize) { >> > _______________________________________________ Linux-nvme mailing list Linux-nvme@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-nvme ^ permalink raw reply [flat|nested] 13+ messages in thread
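Putting these review comments together, the hunk would presumably end up close to the following in the resend (a sketch of the expected result, not the actual follow-up patch):

	nr_zones = min_t(unsigned int, nr_zones,
			 get_capacity(ns->disk) >> ilog2(ns->zsze));

	bufsize = sizeof(struct nvme_zone_report) +
		nr_zones * sizeof(struct nvme_zone_descriptor);

	/*
	 * Limit the buffer size to the maximum data transfer size and to
	 * the maximum number of segments allowed.
	 */
	bufsize = min_t(size_t, bufsize,
			queue_max_hw_sectors(q) << SECTOR_SHIFT);
	bufsize = min_t(size_t, bufsize, queue_max_segments(q) << PAGE_SHIFT);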
* Re: [PATCH V11 4/4] nvme: add comments to nvme_zns_alloc_report_buffer 2021-03-11 5:36 ` Chaitanya Kulkarni @ 2021-03-11 6:22 ` hch 2021-03-11 6:22 ` Chaitanya Kulkarni 0 siblings, 1 reply; 13+ messages in thread From: hch @ 2021-03-11 6:22 UTC (permalink / raw) To: Chaitanya Kulkarni; +Cc: Damien Le Moal, linux-nvme, hch, kbusch, sagi On Thu, Mar 11, 2021 at 05:36:53AM +0000, Chaitanya Kulkarni wrote: > > This seems unused. And not really useful anyway since if this function is being > > called, it should be clear already that at least one zone can be reported, that > > is, the report start sector is below capacity. > > Okay, will remove the comment in V12. Please just resend this patch individually. It isn't really related to the rest. _______________________________________________ Linux-nvme mailing list Linux-nvme@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-nvme ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH V11 4/4] nvme: add comments to nvme_zns_alloc_report_buffer 2021-03-11 6:22 ` hch @ 2021-03-11 6:22 ` Chaitanya Kulkarni 0 siblings, 0 replies; 13+ messages in thread From: Chaitanya Kulkarni @ 2021-03-11 6:22 UTC (permalink / raw) To: hch; +Cc: Damien Le Moal, linux-nvme, kbusch, sagi On 3/10/21 22:22, hch@lst.de wrote: > On Thu, Mar 11, 2021 at 05:36:53AM +0000, Chaitanya Kulkarni wrote: >>> This seems unused. And not really useful anyway since if this function is being >>> called, it should be clear already that at least one zone can be reported, that >>> is, the report start sector is below capacity. >> Okay, will remove the comment in V12. > Please just resend this patch individually. It isn't really related > to the rest. > Okay. _______________________________________________ Linux-nvme mailing list Linux-nvme@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-nvme ^ permalink raw reply [flat|nested] 13+ messages in thread
Thread overview: 13+ messages
2021-03-11 4:39 [PATCH V11 0/4] nvmet: add ZBD backend support Chaitanya Kulkarni
2021-03-11 4:39 ` [PATCH V11 1/4] nvmet: add NVM Command Set Identifier support Chaitanya Kulkarni
2021-03-11 4:39 ` [PATCH V11 2/4] nvmet: add ZBD over ZNS backend support Chaitanya Kulkarni
2021-03-11 5:14 ` Damien Le Moal
2021-03-11 5:29 ` Chaitanya Kulkarni
2021-03-11 5:39 ` Damien Le Moal
2021-03-11 5:41 ` Chaitanya Kulkarni
2021-03-11 4:39 ` [PATCH V11 3/4] nvmet: add nvmet_req_bio put helper for backends Chaitanya Kulkarni
2021-03-11 4:39 ` [PATCH V11 4/4] nvme: add comments to nvme_zns_alloc_report_buffer Chaitanya Kulkarni
2021-03-11 5:08 ` Damien Le Moal
2021-03-11 5:36 ` Chaitanya Kulkarni
2021-03-11 6:22 ` hch
2021-03-11 6:22 ` Chaitanya Kulkarni