linux-nvme.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH V5 0/1] nvme: allow passthru cmd error logging
@ 2023-04-09 21:25 Chaitanya Kulkarni
  2023-04-09 21:25 ` [PATCH V5 1/1] " Chaitanya Kulkarni
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Chaitanya Kulkarni @ 2023-04-09 21:25 UTC (permalink / raw)
  To: alan.adamson; +Cc: linux-nvme, Chaitanya Kulkarni

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="a", Size: 11448 bytes --]

Hi Alan,

I spnet some time on the V4 to fix several nits and testing with keeping
your authorship, please have a look, feel free to send a next version
if you find something not right.

In nvme_end_req()_ we only log errors which are for non-passthru
commands. Add a helper function nvme_log_err_passthru() which allows us
to log error for passthru commands by decoding cdw10-cdw15 values of
nvme command.

Full disclouser there is a checkpatch warning on this patch :-

WARNING: Symbolic permissions 'S_IRUGO | S_IWUSR' are not preferred. Consider using octal permissions '0644'.
#146: FILE: drivers/nvme/host/core.c:3487:
+static DEVICE_ATTR(passthru_err_log, S_IRUGO | S_IWUSR,

This patch is using DEVICE_ATTR declaration style in nvme/host/core.c
instead of inventing new one to remove above warning.

This has several fixes on V4, see list below.

Below is short testlog :-

nvme (nvme-6.4) # echo "############ TEST Admin Command error ###################"
############ TEST Admin Command error ###################
nvme (nvme-6.4) # nvme telemetry-log -o /tmp/test /dev/nvme0  
NVMe status: Invalid Field in Command: A reserved coded value or an unsupported value in a defined field(0x4002)
Failed to acquire telemetry log 16386!
nvme (nvme-6.4) # dmesg  -c 
[50359.040637] nvme0: Get Log Page(0x2), Invalid Field in Command (sct 0x0 / sc 0x2) DNR cdw10=0x7f0107 cdw11=0x0 cdw12=0x0 cdw13=0x0 cdw14=0x0 cdw15=0x0
nvme (nvme-6.4) # echo "############ TEST I/O Command error ###################"
############ TEST I/O Command error ###################
nvme (nvme-6.4) # nvme write-zeroes -n 1 -s 0x200000 -c 10 /dev/nvme0
NVMe status: LBA Out of Range: The command references an LBA that exceeds the size of the namespace(0x4080)
nvme (nvme-6.4) # dmesg  -c
[134411.419290] nvme nvme0: using deprecated NVME_IOCTL_IO_CMD ioctl on the char device!
[134411.419484] nvme0n1: Write Zeros(0x8), LBA Out of Range (sct 0x0 / sc 0x80) DNR cdw10=0x200000 cdw11=0x0 cdw12=0xa cdw13=0x0 cdw14=0x0 cdw15=0x0
nvme (nvme-6.4) #

nvme (nvme-6.4) # 

-ck

Chaitanya Kulkarni (1):
  nvme: allow passthru cme error logging

 drivers/nvme/host/core.c | 83 +++++++++++++++++++++++++++++++++++++---
 drivers/nvme/host/nvme.h |  1 +
 2 files changed, 79 insertions(+), 5 deletions(-)

v5:

- Trim down code in the nvme_log_error_passthrough().
  Use following to get the disk name as an arg to
   pr_err_ratelimited() :-
	ns ? ns->disk->disk_name : dev_name(nr->ctrl->device),
  Use following to get the admin vs I/O opcode string as an arg to
  pr_err_ratelimited() :-
       	ns ? nvme_get_opcode_str(nr->cmd->common.opcode) :
       	     nvme_get_admin_opcode_str(nr->cmd->common.opcode),
- Rename nvme_log_error_passthrough() -> nvme_log_err_passthru().
- Remove else and return directly in nvme_passthru_err_log_show().
- Generate error on invalid values of the passthru_enable variable
  in nvme_passthru_log_store().
- Rename passthrough -> passthru.
- Rename sysfs attr from passthru_admin_err_logging -> passthru_log_err.

v4:
- Change sysfs attribute to passthru_admin_err_logging
- Only log passthrough admin commands.  IO passthrough commands will always be logged.

v3:
- Log a passthrough specific message that dumps CDW* contents.
- Enable/disable vis sysfs rather than debugfs.

v2:
- Included Pankaj Raghav's patch 'nvme: ignore starting sector while error logging for passthrough requests'
  with a couple changes.
- Moved error_logging flag to nvme_ctrl structure
- The entire nvme-debugfs.c does not need to be guarded by #ifdef CONFIG_FAULT_INJECTION_DEBUG_FS.
- Use IS_ENABLED((CONFIG_NVME_ERROR_LOGGING_DEBUG_FS)) to determine if error logging should be
  initialized.
- Various other nits.

nvme (nvme-6.4) # git am p/nvme-passthru-err-log/0001-nvme-allow-passthru-cme-error-logging.patch
Applying: nvme: allow passthru cmd error logging
nvme (nvme-6.4) # gitlog -2
aeae7179040c (HEAD -> nvme-6.4) nvme: allow passthru cmd error logging
803fc6f9ccee Merge branch 'nvme-6.4' of git://git.infradead.org/nvme into nvme-6.4
nvme (nvme-6.4) # ./compile_nvme.sh 
+ umount /mnt/nvme0n1
+ clear_dmesg
./compile_nvme.sh: line 3: clear_dmesg: command not found
umount: /mnt/nvme0n1: no mount point specified.
+ ./delete.sh
+ NQN=testnqn
+ nvme disconnect -n testnqn
NQN:testnqn disconnected 0 controller(s)

real	0m0.004s
user	0m0.001s
sys	0m0.002s
+ rm -fr '/sys/kernel/config/nvmet/ports/1/subsystems/*'
+ rmdir /sys/kernel/config/nvmet/ports/1
rmdir: failed to remove '/sys/kernel/config/nvmet/ports/1': No such file or directory
+ for subsys in /sys/kernel/config/nvmet/subsystems/*
+ for ns in ${subsys}/namespaces/*
+ echo 0
./delete.sh: line 14: /sys/kernel/config/nvmet/subsystems/*/namespaces/*/enable: No such file or directory
+ rmdir '/sys/kernel/config/nvmet/subsystems/*/namespaces/*'
rmdir: failed to remove '/sys/kernel/config/nvmet/subsystems/*/namespaces/*': No such file or directory
+ rmdir '/sys/kernel/config/nvmet/subsystems/*'
rmdir: failed to remove '/sys/kernel/config/nvmet/subsystems/*': No such file or directory
+ rmdir 'config/nullb/nullb*'
rmdir: failed to remove 'config/nullb/nullb*': No such file or directory
+ umount /mnt/nvme0n1
umount: /mnt/nvme0n1: no mount point specified.
+ umount /mnt/backend
umount: /mnt/backend: not mounted.
+ modprobe -r nvme_loop
+ modprobe -r nvme_fabrics
+ modprobe -r nvmet
+ modprobe -r nvme
+ modprobe -r null_blk
+ tree /sys/kernel/config
/sys/kernel/config
└── pci_ep
    ├── controllers
    └── functions

3 directories, 0 files
+ modprobe -r nvme-fabrics
+ modprobe -r nvme_loop
+ modprobe -r nvmet
+ modprobe -r nvme
+ sleep 1
+ modprobe -r nvme-core
+ lsmod
+ grep nvme
+ sleep 1
+ git diff
+ sleep 1
++ nproc
+ make -j 48 M=drivers/nvme/ modules
  CC [M]  drivers/nvme/common/auth.o
  CC [M]  drivers/nvme/host/core.o
  CC [M]  drivers/nvme/host/ioctl.o
  CC [M]  drivers/nvme/target/core.o
  CC [M]  drivers/nvme/target/configfs.o
  CC [M]  drivers/nvme/host/constants.o
  CC [M]  drivers/nvme/host/trace.o
  CC [M]  drivers/nvme/target/admin-cmd.o
  CC [M]  drivers/nvme/host/multipath.o
  CC [M]  drivers/nvme/target/fabrics-cmd.o
  CC [M]  drivers/nvme/host/zns.o
  CC [M]  drivers/nvme/target/discovery.o
  CC [M]  drivers/nvme/host/fault_inject.o
  CC [M]  drivers/nvme/target/io-cmd-file.o
  CC [M]  drivers/nvme/host/hwmon.o
  CC [M]  drivers/nvme/target/io-cmd-bdev.o
  CC [M]  drivers/nvme/host/auth.o
  CC [M]  drivers/nvme/target/passthru.o
  CC [M]  drivers/nvme/target/zns.o
  CC [M]  drivers/nvme/host/pci.o
  CC [M]  drivers/nvme/host/fabrics.o
  CC [M]  drivers/nvme/target/fabrics-cmd-auth.o
  CC [M]  drivers/nvme/target/auth.o
  CC [M]  drivers/nvme/target/trace.o
  CC [M]  drivers/nvme/host/rdma.o
  CC [M]  drivers/nvme/target/loop.o
  CC [M]  drivers/nvme/host/fc.o
  CC [M]  drivers/nvme/target/rdma.o
  CC [M]  drivers/nvme/host/tcp.o
  CC [M]  drivers/nvme/target/fc.o
  CC [M]  drivers/nvme/target/fcloop.o
  CC [M]  drivers/nvme/target/tcp.o
  LD [M]  drivers/nvme/common/nvme-common.o
  LD [M]  drivers/nvme/host/nvme-fabrics.o
  LD [M]  drivers/nvme/target/nvme-fcloop.o
  LD [M]  drivers/nvme/target/nvme-loop.o
  LD [M]  drivers/nvme/target/nvmet.o
  LD [M]  drivers/nvme/target/nvmet-fc.o
  LD [M]  drivers/nvme/host/nvme-rdma.o
  LD [M]  drivers/nvme/target/nvmet-tcp.o
  LD [M]  drivers/nvme/host/nvme-fc.o
  LD [M]  drivers/nvme/host/nvme.o
  LD [M]  drivers/nvme/host/nvme-tcp.o
  LD [M]  drivers/nvme/target/nvmet-rdma.o
  LD [M]  drivers/nvme/host/nvme-core.o
  MODPOST drivers/nvme/Module.symvers
  CC [M]  drivers/nvme/host/nvme-core.mod.o
  CC [M]  drivers/nvme/host/nvme.mod.o
  CC [M]  drivers/nvme/host/nvme-fabrics.mod.o
  CC [M]  drivers/nvme/host/nvme-rdma.mod.o
  CC [M]  drivers/nvme/host/nvme-fc.mod.o
  CC [M]  drivers/nvme/host/nvme-tcp.mod.o
  CC [M]  drivers/nvme/target/nvmet.mod.o
  CC [M]  drivers/nvme/target/nvme-loop.mod.o
  CC [M]  drivers/nvme/target/nvmet-rdma.mod.o
  CC [M]  drivers/nvme/target/nvmet-fc.mod.o
  CC [M]  drivers/nvme/target/nvme-fcloop.mod.o
  CC [M]  drivers/nvme/target/nvmet-tcp.mod.o
  CC [M]  drivers/nvme/common/nvme-common.mod.o
  LD [M]  drivers/nvme/host/nvme-core.ko
  LD [M]  drivers/nvme/host/nvme-fabrics.ko
  LD [M]  drivers/nvme/host/nvme.ko
  LD [M]  drivers/nvme/host/nvme-fc.ko
  LD [M]  drivers/nvme/host/nvme-rdma.ko
  LD [M]  drivers/nvme/target/nvmet-rdma.ko
  LD [M]  drivers/nvme/target/nvmet.ko
  LD [M]  drivers/nvme/host/nvme-tcp.ko
  LD [M]  drivers/nvme/target/nvme-loop.ko
  LD [M]  drivers/nvme/target/nvme-fcloop.ko
  LD [M]  drivers/nvme/target/nvmet-tcp.ko
  LD [M]  drivers/nvme/target/nvmet-fc.ko
  LD [M]  drivers/nvme/common/nvme-common.ko
+ HOST=drivers/nvme/host
+ TARGET=drivers/nvme/target
++ uname -r
+ HOST_DEST=/lib/modules/6.3.0-rc2nvme+/kernel/drivers/nvme/host/
++ uname -r
+ TARGET_DEST=/lib/modules/6.3.0-rc2nvme+/kernel/drivers/nvme/target/
+ cp drivers/nvme/host/nvme-core.ko drivers/nvme/host/nvme-fabrics.ko drivers/nvme/host/nvme-fc.ko drivers/nvme/host/nvme.ko drivers/nvme/host/nvme-rdma.ko drivers/nvme/host/nvme-tcp.ko /lib/modules/6.3.0-rc2nvme+/kernel/drivers/nvme/host//
+ cp drivers/nvme/target/nvme-fcloop.ko drivers/nvme/target/nvme-loop.ko drivers/nvme/target/nvmet-fc.ko drivers/nvme/target/nvmet.ko drivers/nvme/target/nvmet-rdma.ko drivers/nvme/target/nvmet-tcp.ko /lib/modules/6.3.0-rc2nvme+/kernel/drivers/nvme/target//
+ ls -lrth /lib/modules/6.3.0-rc2nvme+/kernel/drivers/nvme/host/ /lib/modules/6.3.0-rc2nvme+/kernel/drivers/nvme/target//
/lib/modules/6.3.0-rc2nvme+/kernel/drivers/nvme/host/:
total 7.5M
-rw-r--r--. 1 root root 3.5M Apr  8 14:59 nvme-core.ko
-rw-r--r--. 1 root root 477K Apr  8 14:59 nvme-fabrics.ko
-rw-r--r--. 1 root root 974K Apr  8 14:59 nvme-fc.ko
-rw-r--r--. 1 root root 783K Apr  8 14:59 nvme.ko
-rw-r--r--. 1 root root 925K Apr  8 14:59 nvme-rdma.ko
-rw-r--r--. 1 root root 900K Apr  8 14:59 nvme-tcp.ko

/lib/modules/6.3.0-rc2nvme+/kernel/drivers/nvme/target//:
total 7.3M
-rw-r--r--. 1 root root 529K Apr  8 14:59 nvme-fcloop.ko
-rw-r--r--. 1 root root 469K Apr  8 14:59 nvme-loop.ko
-rw-r--r--. 1 root root 799K Apr  8 14:59 nvmet-fc.ko
-rw-r--r--. 1 root root 4.0M Apr  8 14:59 nvmet.ko
-rw-r--r--. 1 root root 891K Apr  8 14:59 nvmet-rdma.ko
-rw-r--r--. 1 root root 751K Apr  8 14:59 nvmet-tcp.ko
+ sync
+ sync
+ sync
+ modprobe nvme
+ echo 'Press enter to continue ...'
Press enter to continue ...
nvme (nvme-6.4) # dmesg  -c > /dev/null
nvme (nvme-6.4) # 
nvme (nvme-6.4) # 
bash: cho: command not found...
nvme (nvme-6.4) # echo 1 > /sys/class/nvme/nvme0/passthru_err_log 
nvme (nvme-6.4) # echo "############ TEST Admin Command error ###################"
############ TEST Admin Command error ###################
nvme (nvme-6.4) # nvme telemetry-log -o /tmp/test /dev/nvme0  
NVMe status: Invalid Field in Command: A reserved coded value or an unsupported value in a defined field(0x4002)
Failed to acquire telemetry log 16386!
nvme (nvme-6.4) # dmesg  -c 
[50359.040637] nvme0: Get Log Page(0x2), Invalid Field in Command (sct 0x0 / sc 0x2) DNR cdw10=0x7f0107 cdw11=0x0 cdw12=0x0 cdw13=0x0 cdw14=0x0 cdw15=0x0
nvme (nvme-6.4) # echo "############ TEST I/O Command error ###################"
############ TEST I/O Command error ###################
nvme (nvme-6.4) # nvme write-zeroes -n 1 -s 0x200000 -c 10 /dev/nvme0
NVMe status: LBA Out of Range: The command references an LBA that exceeds the size of the namespace(0x4080)
nvme (nvme-6.4) # 
o

-- 
2.29.0



^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH V5 1/1] nvme: allow passthru cmd error logging
  2023-04-09 21:25 [PATCH V5 0/1] nvme: allow passthru cmd error logging Chaitanya Kulkarni
@ 2023-04-09 21:25 ` Chaitanya Kulkarni
  2023-05-22 22:58   ` alan.adamson
  2023-04-11  0:05 ` [PATCH V5 0/1] " alan.adamson
       [not found] ` <CGME20230412091843eucas1p22a4c7c34297c4f260ccc54cff9d67a52@eucas1p2.samsung.com>
  2 siblings, 1 reply; 13+ messages in thread
From: Chaitanya Kulkarni @ 2023-04-09 21:25 UTC (permalink / raw)
  To: alan.adamson; +Cc: linux-nvme, Chaitanya Kulkarni

From: Alan Adamson <alan.adamson@oracle.com>

Commit d7ac8dca938c ("nvme: quiet user passthrough command errors")
disabled error logging for user passthrough commands.  This commit
adds the ability to opt-in to passthrough admin error logging. IO
commands initiated as passthrough will always be logged.

The logging output for passthrough commands (Admin and IO) has been
changed to include CDWXX fields.

nvme0n1: Read(0x2), LBA Out of Range (sct 0x0 / sc 0x80) DNR cdw10=0x0 cdw11=0x1
        cdw12=0x70000 cdw13=0x0 cdw14=0x0 cdw15=0x0

Add a helper function nvme_log_err_passthru() which allows us to log
error for passthru commands by decoding cdw10-cdw15 values of nvme
command.

Add a new sysfs attr passthru_err_log that allows user to conditionally
enable passthru command logging, by default it is disabled.

To enable passthrough admin error logging:
        echo 1 > /sys/class/nvme/nvme0/passthru_err_log

To disable passthrough admin error logging:
        echo 0 > /sys/class/nvme/nvme0/passthru_err_log

Signed-off-by: Alan Adamson <alan.adamson@oracle.com>
[kch] fix sevaral nits and trim down code, details in cover-letter.
Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
---
 drivers/nvme/host/core.c | 83 +++++++++++++++++++++++++++++++++++++---
 drivers/nvme/host/nvme.h |  1 +
 2 files changed, 79 insertions(+), 5 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 954641a45e55..e32265adc034 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -337,6 +337,30 @@ static void nvme_log_error(struct request *req)
 			   nr->status & NVME_SC_DNR  ? "DNR "  : "");
 }
 
+static void nvme_log_err_passthru(struct request *req)
+{
+	struct nvme_ns *ns = req->q->queuedata;
+	struct nvme_request *nr = nvme_req(req);
+
+	pr_err_ratelimited("%s: %s(0x%x), %s (sct 0x%x / sc 0x%x) %s%s"
+		"cdw10=0x%x cdw11=0x%x cdw12=0x%x cdw13=0x%x cdw14=0x%x cdw15=0x%x\n",
+		ns ? ns->disk->disk_name : dev_name(nr->ctrl->device),
+		ns ? nvme_get_opcode_str(nr->cmd->common.opcode) :
+		     nvme_get_admin_opcode_str(nr->cmd->common.opcode),
+		nr->cmd->common.opcode,
+		nvme_get_error_status_str(nr->status),
+		nr->status >> 8 & 7,	/* Status Code Type */
+		nr->status & 0xff,	/* Status Code */
+		nr->status & NVME_SC_MORE ? "MORE " : "",
+		nr->status & NVME_SC_DNR  ? "DNR "  : "",
+		nr->cmd->common.cdw10,
+		nr->cmd->common.cdw11,
+		nr->cmd->common.cdw12,
+		nr->cmd->common.cdw13,
+		nr->cmd->common.cdw14,
+		nr->cmd->common.cdw14);
+}
+
 enum nvme_disposition {
 	COMPLETE,
 	RETRY,
@@ -381,8 +405,12 @@ static inline void nvme_end_req(struct request *req)
 {
 	blk_status_t status = nvme_error_status(nvme_req(req)->status);
 
-	if (unlikely(nvme_req(req)->status && !(req->rq_flags & RQF_QUIET)))
-		nvme_log_error(req);
+	if (unlikely(nvme_req(req)->status && !(req->rq_flags & RQF_QUIET))) {
+		if (blk_rq_is_passthrough(req))
+			nvme_log_err_passthru(req);
+		else
+			nvme_log_error(req);
+	}
 	nvme_end_req_zoned(req);
 	nvme_trace_bio_complete(req);
 	if (req->cmd_flags & REQ_NVME_MPATH)
@@ -666,10 +694,15 @@ static inline void nvme_clear_nvme_request(struct request *req)
 /* initialize a passthrough request */
 void nvme_init_request(struct request *req, struct nvme_command *cmd)
 {
+	struct nvme_request *nr = nvme_req(req);
+
 	if (req->q->queuedata)
 		req->timeout = NVME_IO_TIMEOUT;
-	else /* no queuedata implies admin queue */
+	else { /* no queuedata implies admin queue */
 		req->timeout = NVME_ADMIN_TIMEOUT;
+		if (!nr->ctrl->passthru_log_err)
+			req->rq_flags |= RQF_QUIET;
+	}
 
 	/* passthru commands should let the driver set the SGL flags */
 	cmd->common.flags &= ~NVME_CMD_SGL_ALL;
@@ -678,8 +711,8 @@ void nvme_init_request(struct request *req, struct nvme_command *cmd)
 	if (req->mq_hctx->type == HCTX_TYPE_POLL)
 		req->cmd_flags |= REQ_POLLED;
 	nvme_clear_nvme_request(req);
-	req->rq_flags |= RQF_QUIET;
-	memcpy(nvme_req(req)->cmd, cmd, sizeof(*cmd));
+
+	memcpy(nr->cmd, cmd, sizeof(*cmd));
 }
 EXPORT_SYMBOL_GPL(nvme_init_request);
 
@@ -3417,6 +3450,44 @@ static ssize_t nvme_sysfs_rescan(struct device *dev,
 }
 static DEVICE_ATTR(rescan_controller, S_IWUSR, NULL, nvme_sysfs_rescan);
 
+static ssize_t nvme_passthru_err_log_show(struct device *dev,
+		struct device_attribute *attr, char *buf)
+{
+	struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
+
+	if (ctrl->passthru_log_err)
+		return sysfs_emit(buf, "on\n");
+
+	return sysfs_emit(buf, "off\n");
+}
+
+static ssize_t nvme_passthru_err_log_store(struct device *dev,
+		struct device_attribute *attr, const char *buf, size_t count)
+{
+	struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
+	int passthru_enable, err;
+
+	err = kstrtoint(buf, 10, &passthru_enable);
+	if (err)
+		return -EINVAL;
+
+	switch (passthru_enable) {
+	case true:
+	case false:
+		ctrl->passthru_log_err = passthru_enable;
+		break;
+	default:
+		pr_err("invlid value %d for admin error logging [on:1 off:0]\n",
+			passthru_enable);
+		break;
+	}
+	return count;
+}
+
+static DEVICE_ATTR(passthru_err_log, S_IRUGO | S_IWUSR,
+	nvme_passthru_err_log_show,
+	nvme_passthru_err_log_store);
+
 static inline struct nvme_ns_head *dev_to_ns_head(struct device *dev)
 {
 	struct gendisk *disk = dev_to_disk(dev);
@@ -3925,6 +3996,7 @@ static struct attribute *nvme_dev_attrs[] = {
 	&dev_attr_dhchap_secret.attr,
 	&dev_attr_dhchap_ctrl_secret.attr,
 #endif
+	&dev_attr_passthru_err_log.attr,
 	NULL
 };
 
@@ -5124,6 +5196,7 @@ int nvme_init_ctrl(struct nvme_ctrl *ctrl, struct device *dev,
 	int ret;
 
 	ctrl->state = NVME_CTRL_NEW;
+	ctrl->passthru_log_err = false;
 	clear_bit(NVME_CTRL_FAILFAST_EXPIRED, &ctrl->flags);
 	spin_lock_init(&ctrl->lock);
 	mutex_init(&ctrl->scan_lock);
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index bf46f122e9e1..f5721ad8264c 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -248,6 +248,7 @@ struct nvme_ctrl {
 	bool comp_seen;
 	enum nvme_ctrl_state state;
 	bool identified;
+	bool passthru_log_err;
 	spinlock_t lock;
 	struct mutex scan_lock;
 	const struct nvme_ctrl_ops *ops;
-- 
2.29.0



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH V5 0/1] nvme: allow passthru cmd error logging
  2023-04-09 21:25 [PATCH V5 0/1] nvme: allow passthru cmd error logging Chaitanya Kulkarni
  2023-04-09 21:25 ` [PATCH V5 1/1] " Chaitanya Kulkarni
@ 2023-04-11  0:05 ` alan.adamson
  2023-04-11  5:23   ` Chaitanya Kulkarni
       [not found] ` <CGME20230412091843eucas1p22a4c7c34297c4f260ccc54cff9d67a52@eucas1p2.samsung.com>
  2 siblings, 1 reply; 13+ messages in thread
From: alan.adamson @ 2023-04-11  0:05 UTC (permalink / raw)
  To: Chaitanya Kulkarni; +Cc: linux-nvme


On 4/9/23 2:25 PM, Chaitanya Kulkarni wrote:
> Hi Alan,
>
> I spnet some time on the V4 to fix several nits and testing with keeping
> your authorship, please have a look, feel free to send a next version
> if you find something not right.
>
> In nvme_end_req()_ we only log errors which are for non-passthru
> commands. Add a helper function nvme_log_err_passthru() which allows us
> to log error for passthru commands by decoding cdw10-cdw15 values of
> nvme command.

CK,

Thanks for doing this. Looks good.  I've tested it in my environment.

Alan




^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH V5 0/1] nvme: allow passthru cmd error logging
  2023-04-11  0:05 ` [PATCH V5 0/1] " alan.adamson
@ 2023-04-11  5:23   ` Chaitanya Kulkarni
  2023-04-14 22:01     ` alan.adamson
  0 siblings, 1 reply; 13+ messages in thread
From: Chaitanya Kulkarni @ 2023-04-11  5:23 UTC (permalink / raw)
  To: alan.adamson, Chaitanya Kulkarni; +Cc: linux-nvme

On 4/10/23 17:05, alan.adamson@oracle.com wrote:
>
> On 4/9/23 2:25 PM, Chaitanya Kulkarni wrote:
>> Hi Alan,
>>
>> I spnet some time on the V4 to fix several nits and testing with keeping
>> your authorship, please have a look, feel free to send a next version
>> if you find something not right.
>>
>> In nvme_end_req()_ we only log errors which are for non-passthru
>> commands. Add a helper function nvme_log_err_passthru() which allows us
>> to log error for passthru commands by decoding cdw10-cdw15 values of
>> nvme command.
>
> CK,
>
> Thanks for doing this. Looks good.  I've tested it in my environment.
>
> Alan
>
>

Thanks for having a look, let's wait for others to have comments.

-ck



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH V5 0/1] nvme: allow passthru cmd error logging
       [not found] ` <CGME20230412091843eucas1p22a4c7c34297c4f260ccc54cff9d67a52@eucas1p2.samsung.com>
@ 2023-04-12  9:10   ` Pankaj Raghav
  2023-04-13  2:20     ` Chaitanya Kulkarni
  0 siblings, 1 reply; 13+ messages in thread
From: Pankaj Raghav @ 2023-04-12  9:10 UTC (permalink / raw)
  To: Chaitanya Kulkarni; +Cc: alan.adamson, linux-nvme

[-- Attachment #1: Type: text/plain, Size: 1452 bytes --]

On Sun, Apr 09, 2023 at 02:25:37PM -0700, Chaitanya Kulkarni wrote:
> 
> -ck
> 
> Chaitanya Kulkarni (1):
>   nvme: allow passthru cme error logging
> 
>  drivers/nvme/host/core.c | 83 +++++++++++++++++++++++++++++++++++++---
>  drivers/nvme/host/nvme.h |  1 +
>  2 files changed, 79 insertions(+), 5 deletions(-)
> 
> v5:
> 
> - Trim down code in the nvme_log_error_passthrough().
>   Use following to get the disk name as an arg to
>    pr_err_ratelimited() :-
> 	ns ? ns->disk->disk_name : dev_name(nr->ctrl->device),
>   Use following to get the admin vs I/O opcode string as an arg to
>   pr_err_ratelimited() :-
>        	ns ? nvme_get_opcode_str(nr->cmd->common.opcode) :
>        	     nvme_get_admin_opcode_str(nr->cmd->common.opcode),
> - Rename nvme_log_error_passthrough() -> nvme_log_err_passthru().
> - Remove else and return directly in nvme_passthru_err_log_show().
> - Generate error on invalid values of the passthru_enable variable
>   in nvme_passthru_log_store().
> - Rename passthrough -> passthru.

> - Rename sysfs attr from passthru_admin_err_logging -> passthru_log_err.
Curious to know why `admin` was removed from the sysfs attr. We enable
logging for admin commands using this attr but IO errors are always logged.

I tried out the patch and I was surprised to see IO err was logged for
passthru requests without me enabling it explicitly. Having the sysfs
attr to be passthru_log_admin_err would have been more clear IMO.

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH V5 0/1] nvme: allow passthru cmd error logging
  2023-04-12  9:10   ` Pankaj Raghav
@ 2023-04-13  2:20     ` Chaitanya Kulkarni
  0 siblings, 0 replies; 13+ messages in thread
From: Chaitanya Kulkarni @ 2023-04-13  2:20 UTC (permalink / raw)
  To: Pankaj Raghav, Chaitanya Kulkarni; +Cc: alan.adamson, linux-nvme

On 4/12/23 02:10, Pankaj Raghav wrote:
> On Sun, Apr 09, 2023 at 02:25:37PM -0700, Chaitanya Kulkarni wrote:
>> -ck
>>
>> Chaitanya Kulkarni (1):
>>    nvme: allow passthru cme error logging
>>
>>   drivers/nvme/host/core.c | 83 +++++++++++++++++++++++++++++++++++++---
>>   drivers/nvme/host/nvme.h |  1 +
>>   2 files changed, 79 insertions(+), 5 deletions(-)
>>
>> v5:
>>
>> - Trim down code in the nvme_log_error_passthrough().
>>    Use following to get the disk name as an arg to
>>     pr_err_ratelimited() :-
>> 	ns ? ns->disk->disk_name : dev_name(nr->ctrl->device),
>>    Use following to get the admin vs I/O opcode string as an arg to
>>    pr_err_ratelimited() :-
>>         	ns ? nvme_get_opcode_str(nr->cmd->common.opcode) :
>>         	     nvme_get_admin_opcode_str(nr->cmd->common.opcode),
>> - Rename nvme_log_error_passthrough() -> nvme_log_err_passthru().
>> - Remove else and return directly in nvme_passthru_err_log_show().
>> - Generate error on invalid values of the passthru_enable variable
>>    in nvme_passthru_log_store().
>> - Rename passthrough -> passthru.
>> - Rename sysfs attr from passthru_admin_err_logging -> passthru_log_err.
> Curious to know why `admin` was removed from the sysfs attr. We enable
> logging for admin commands using this attr but IO errors are always logged.
>
> I tried out the patch and I was surprised to see IO err was logged for
> passthru requests without me enabling it explicitly. Having the sysfs
> attr to be passthru_log_admin_err would have been more clear IMO.

Right now we only use this attr to log admin commands, as decision has
been made to log I/O errors irrespective of the attr value is set.
See nvme_ini_request() admin command timeout else in the patch where
we assign quiet flag.

That is not a surprise it's by design and Alan has explicitly made it
clear to have it this way in his reply, and there are no objections so
far. This keeps ctrl access away from fast path in nvme_init_request().

In future if we decide to change the policy and start I/O logging
based on the sysfs values (which might happen) then we will have to
1. Introduce a new attr (polluting sysfs not a good idea)       OR
2. Rename it to generic name that in the patch (not acceptable) OR
3. Overload existing attr to accommodate I/O command (not acceptable)

So why not use generic to start with ?

I can always send V6 with passthru_admin_err_log name but I'd
avoid using non-generic names unless someone guarantees we will never
ever make passthru I/O commands err logging optional...

-ck



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH V5 0/1] nvme: allow passthru cmd error logging
  2023-04-11  5:23   ` Chaitanya Kulkarni
@ 2023-04-14 22:01     ` alan.adamson
  2023-05-05 21:59       ` alan.adamson
  0 siblings, 1 reply; 13+ messages in thread
From: alan.adamson @ 2023-04-14 22:01 UTC (permalink / raw)
  To: Chaitanya Kulkarni; +Cc: linux-nvme


On 4/10/23 10:23 PM, Chaitanya Kulkarni wrote:
> On 4/10/23 17:05, alan.adamson@oracle.com wrote:
>> On 4/9/23 2:25 PM, Chaitanya Kulkarni wrote:
>>> Hi Alan,
>>>
>>> I spnet some time on the V4 to fix several nits and testing with keeping
>>> your authorship, please have a look, feel free to send a next version
>>> if you find something not right.
>>>
>>> In nvme_end_req()_ we only log errors which are for non-passthru
>>> commands. Add a helper function nvme_log_err_passthru() which allows us
>>> to log error for passthru commands by decoding cdw10-cdw15 values of
>>> nvme command.
>> CK,
>>
>> Thanks for doing this. Looks good.  I've tested it in my environment.
>>
>> Alan
>>
>>
> Thanks for having a look, let's wait for others to have comments.
>
> -ck

I'll be on vacation next week so we will move forward with the patch 
when I get back on 4/24.


Alan




^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH V5 0/1] nvme: allow passthru cmd error logging
  2023-04-14 22:01     ` alan.adamson
@ 2023-05-05 21:59       ` alan.adamson
  2023-05-05 23:56         ` Chaitanya Kulkarni
  0 siblings, 1 reply; 13+ messages in thread
From: alan.adamson @ 2023-05-05 21:59 UTC (permalink / raw)
  To: Chaitanya Kulkarni; +Cc: linux-nvme


>>>> In nvme_end_req()_ we only log errors which are for non-passthru
>>>> commands. Add a helper function nvme_log_err_passthru() which 
>>>> allows us
>>>> to log error for passthru commands by decoding cdw10-cdw15 values of
>>>> nvme command.
>>> CK,
>>>
>>> Thanks for doing this. Looks good.  I've tested it in my environment.
>>>
>>> Alan
>>>
>>>
>> Thanks for having a look, let's wait for others to have comments.
>>
>> -ck
>
> I'll be on vacation next week so we will move forward with the patch 
> when I get back on 4/24.

I'm back from vacation.  We can continue with the review.


Thanks,

Alan



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH V5 0/1] nvme: allow passthru cmd error logging
  2023-05-05 21:59       ` alan.adamson
@ 2023-05-05 23:56         ` Chaitanya Kulkarni
  2023-05-13 10:13           ` Chaitanya Kulkarni
  0 siblings, 1 reply; 13+ messages in thread
From: Chaitanya Kulkarni @ 2023-05-05 23:56 UTC (permalink / raw)
  To: alan.adamson; +Cc: linux-nvme, Christoph Hellwig, kbusch, Sagi Grimberg

On 5/5/23 14:59, alan.adamson@oracle.com wrote:
>
>>>>> In nvme_end_req()_ we only log errors which are for non-passthru
>>>>> commands. Add a helper function nvme_log_err_passthru() which 
>>>>> allows us
>>>>> to log error for passthru commands by decoding cdw10-cdw15 values of
>>>>> nvme command.
>>>> CK,
>>>>
>>>> Thanks for doing this. Looks good.  I've tested it in my environment.
>>>>
>>>> Alan
>>>>
>>>>
>>> Thanks for having a look, let's wait for others to have comments.
>>>
>>> -ck
>>
>> I'll be on vacation next week so we will move forward with the patch 
>> when I get back on 4/24.
>
> I'm back from vacation.  We can continue with the review.
>
>
> Thanks,
>
> Alan
>

since you've tested V5 in your environment, I think we can pick this up 
before
next PR if there is no major objection ...

-ck



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH V5 0/1] nvme: allow passthru cmd error logging
  2023-05-05 23:56         ` Chaitanya Kulkarni
@ 2023-05-13 10:13           ` Chaitanya Kulkarni
  0 siblings, 0 replies; 13+ messages in thread
From: Chaitanya Kulkarni @ 2023-05-13 10:13 UTC (permalink / raw)
  To: Christoph Hellwig, kbusch, Sagi Grimberg; +Cc: alan.adamson, linux-nvme

On 5/5/23 16:56, Chaitanya Kulkarni wrote:
> On 5/5/23 14:59, alan.adamson@oracle.com wrote:
>>>>>> In nvme_end_req()_ we only log errors which are for non-passthru
>>>>>> commands. Add a helper function nvme_log_err_passthru() which
>>>>>> allows us
>>>>>> to log error for passthru commands by decoding cdw10-cdw15 values of
>>>>>> nvme command.
>>>>> CK,
>>>>>
>>>>> Thanks for doing this. Looks good.  I've tested it in my environment.
>>>>>
>>>>> Alan
>>>>>
>>>>>
>>>> Thanks for having a look, let's wait for others to have comments.
>>>>
>>>> -ck
>>> I'll be on vacation next week so we will move forward with the patch
>>> when I get back on 4/24.
>> I'm back from vacation.  We can continue with the review.
>>
>>
>> Thanks,
>>
>> Alan
>>
> since you've tested V5 in your environment, I think we can pick this up
> before
> next PR if there is no major objection ...
>
> -ck
>
>

Keith/Christoph/Sagi,

Any objection on this version ?

-ck



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH V5 1/1] nvme: allow passthru cmd error logging
  2023-04-09 21:25 ` [PATCH V5 1/1] " Chaitanya Kulkarni
@ 2023-05-22 22:58   ` alan.adamson
  2023-05-23  0:08     ` Chaitanya Kulkarni
  2023-06-15 16:48     ` alan.adamson
  0 siblings, 2 replies; 13+ messages in thread
From: alan.adamson @ 2023-05-22 22:58 UTC (permalink / raw)
  To: Chaitanya Kulkarni; +Cc: linux-nvme


On 4/9/23 2:25 PM, Chaitanya Kulkarni wrote:
> From: Alan Adamson <alan.adamson@oracle.com>
>
> Commit d7ac8dca938c ("nvme: quiet user passthrough command errors")
> disabled error logging for user passthrough commands.  This commit
> adds the ability to opt-in to passthrough admin error logging. IO
> commands initiated as passthrough will always be logged.
>
> The logging output for passthrough commands (Admin and IO) has been
> changed to include CDWXX fields.
>
> nvme0n1: Read(0x2), LBA Out of Range (sct 0x0 / sc 0x80) DNR cdw10=0x0 cdw11=0x1
>          cdw12=0x70000 cdw13=0x0 cdw14=0x0 cdw15=0x0
>
> Add a helper function nvme_log_err_passthru() which allows us to log
> error for passthru commands by decoding cdw10-cdw15 values of nvme
> command.
>
> Add a new sysfs attr passthru_err_log that allows user to conditionally
> enable passthru command logging, by default it is disabled.
>
> To enable passthrough admin error logging:
>          echo 1 > /sys/class/nvme/nvme0/passthru_err_log
>
> To disable passthrough admin error logging:
>          echo 0 > /sys/class/nvme/nvme0/passthru_err_log
>
> Signed-off-by: Alan Adamson <alan.adamson@oracle.com>
> [kch] fix sevaral nits and trim down code, details in cover-letter.
> Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
> ---
>   drivers/nvme/host/core.c | 83 +++++++++++++++++++++++++++++++++++++---
>   drivers/nvme/host/nvme.h |  1 +
>   2 files changed, 79 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 954641a45e55..e32265adc034 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -337,6 +337,30 @@ static void nvme_log_error(struct request *req)
>   			   nr->status & NVME_SC_DNR  ? "DNR "  : "");
>   }
>   
> +static void nvme_log_err_passthru(struct request *req)
> +{
> +	struct nvme_ns *ns = req->q->queuedata;
> +	struct nvme_request *nr = nvme_req(req);
> +
> +	pr_err_ratelimited("%s: %s(0x%x), %s (sct 0x%x / sc 0x%x) %s%s"
> +		"cdw10=0x%x cdw11=0x%x cdw12=0x%x cdw13=0x%x cdw14=0x%x cdw15=0x%x\n",
> +		ns ? ns->disk->disk_name : dev_name(nr->ctrl->device),
> +		ns ? nvme_get_opcode_str(nr->cmd->common.opcode) :
> +		     nvme_get_admin_opcode_str(nr->cmd->common.opcode),
> +		nr->cmd->common.opcode,
> +		nvme_get_error_status_str(nr->status),
> +		nr->status >> 8 & 7,	/* Status Code Type */
> +		nr->status & 0xff,	/* Status Code */
> +		nr->status & NVME_SC_MORE ? "MORE " : "",
> +		nr->status & NVME_SC_DNR  ? "DNR "  : "",
> +		nr->cmd->common.cdw10,
> +		nr->cmd->common.cdw11,
> +		nr->cmd->common.cdw12,
> +		nr->cmd->common.cdw13,
> +		nr->cmd->common.cdw14,
> +		nr->cmd->common.cdw14);
> +}
> +
>   enum nvme_disposition {
>   	COMPLETE,
>   	RETRY,
> @@ -381,8 +405,12 @@ static inline void nvme_end_req(struct request *req)
>   {
>   	blk_status_t status = nvme_error_status(nvme_req(req)->status);
>   
> -	if (unlikely(nvme_req(req)->status && !(req->rq_flags & RQF_QUIET)))
> -		nvme_log_error(req);
> +	if (unlikely(nvme_req(req)->status && !(req->rq_flags & RQF_QUIET))) {
> +		if (blk_rq_is_passthrough(req))
> +			nvme_log_err_passthru(req);
> +		else
> +			nvme_log_error(req);
> +	}
>   	nvme_end_req_zoned(req);
>   	nvme_trace_bio_complete(req);
>   	if (req->cmd_flags & REQ_NVME_MPATH)
> @@ -666,10 +694,15 @@ static inline void nvme_clear_nvme_request(struct request *req)
>   /* initialize a passthrough request */
>   void nvme_init_request(struct request *req, struct nvme_command *cmd)
>   {
> +	struct nvme_request *nr = nvme_req(req);
> +
>   	if (req->q->queuedata)
>   		req->timeout = NVME_IO_TIMEOUT;
> -	else /* no queuedata implies admin queue */
> +	else { /* no queuedata implies admin queue */
>   		req->timeout = NVME_ADMIN_TIMEOUT;
> +		if (!nr->ctrl->passthru_log_err)
> +			req->rq_flags |= RQF_QUIET;
> +	}
>   
>   	/* passthru commands should let the driver set the SGL flags */
>   	cmd->common.flags &= ~NVME_CMD_SGL_ALL;
> @@ -678,8 +711,8 @@ void nvme_init_request(struct request *req, struct nvme_command *cmd)
>   	if (req->mq_hctx->type == HCTX_TYPE_POLL)
>   		req->cmd_flags |= REQ_POLLED;
>   	nvme_clear_nvme_request(req);
> -	req->rq_flags |= RQF_QUIET;
> -	memcpy(nvme_req(req)->cmd, cmd, sizeof(*cmd));
> +
> +	memcpy(nr->cmd, cmd, sizeof(*cmd));
>   }
>   EXPORT_SYMBOL_GPL(nvme_init_request);
>   
> @@ -3417,6 +3450,44 @@ static ssize_t nvme_sysfs_rescan(struct device *dev,
>   }
>   static DEVICE_ATTR(rescan_controller, S_IWUSR, NULL, nvme_sysfs_rescan);
>   
> +static ssize_t nvme_passthru_err_log_show(struct device *dev,
> +		struct device_attribute *attr, char *buf)
> +{
> +	struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
> +
> +	if (ctrl->passthru_log_err)
> +		return sysfs_emit(buf, "on\n");
> +
> +	return sysfs_emit(buf, "off\n");
> +}
> +
> +static ssize_t nvme_passthru_err_log_store(struct device *dev,
> +		struct device_attribute *attr, const char *buf, size_t count)
> +{
> +	struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
> +	int passthru_enable, err;
> +
> +	err = kstrtoint(buf, 10, &passthru_enable);
> +	if (err)
> +		return -EINVAL;
> +
> +	switch (passthru_enable) {
> +	case true:
> +	case false:
> +		ctrl->passthru_log_err = passthru_enable;
> +		break;
> +	default:
> +		pr_err("invlid value %d for admin error logging [on:1 off:0]\n",
> +			passthru_enable);
> +		break;
> +	}
> +	return count;
> +}
> +
> +static DEVICE_ATTR(passthru_err_log, S_IRUGO | S_IWUSR,
> +	nvme_passthru_err_log_show,
> +	nvme_passthru_err_log_store);
> +
>   static inline struct nvme_ns_head *dev_to_ns_head(struct device *dev)
>   {
>   	struct gendisk *disk = dev_to_disk(dev);
> @@ -3925,6 +3996,7 @@ static struct attribute *nvme_dev_attrs[] = {
>   	&dev_attr_dhchap_secret.attr,
>   	&dev_attr_dhchap_ctrl_secret.attr,
>   #endif
> +	&dev_attr_passthru_err_log.attr,
>   	NULL
>   };
>   
> @@ -5124,6 +5196,7 @@ int nvme_init_ctrl(struct nvme_ctrl *ctrl, struct device *dev,
>   	int ret;
>   
>   	ctrl->state = NVME_CTRL_NEW;
> +	ctrl->passthru_log_err = false;
>   	clear_bit(NVME_CTRL_FAILFAST_EXPIRED, &ctrl->flags);
>   	spin_lock_init(&ctrl->lock);
>   	mutex_init(&ctrl->scan_lock);
> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> index bf46f122e9e1..f5721ad8264c 100644
> --- a/drivers/nvme/host/nvme.h
> +++ b/drivers/nvme/host/nvme.h
> @@ -248,6 +248,7 @@ struct nvme_ctrl {
>   	bool comp_seen;
>   	enum nvme_ctrl_state state;
>   	bool identified;
> +	bool passthru_log_err;
>   	spinlock_t lock;
>   	struct mutex scan_lock;
>   	const struct nvme_ctrl_ops *ops;

This v5 version of the patch has been tested with the latest upstream.  
Any objections?

Alan




^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH V5 1/1] nvme: allow passthru cmd error logging
  2023-05-22 22:58   ` alan.adamson
@ 2023-05-23  0:08     ` Chaitanya Kulkarni
  2023-06-15 16:48     ` alan.adamson
  1 sibling, 0 replies; 13+ messages in thread
From: Chaitanya Kulkarni @ 2023-05-23  0:08 UTC (permalink / raw)
  To: hch, Keith Busch, Sagi Grimberg
  Cc: linux-nvme, Chaitanya Kulkarni, alan.adamson

On 5/22/2023 3:58 PM, alan.adamson@oracle.com wrote:
> 
> On 4/9/23 2:25 PM, Chaitanya Kulkarni wrote:
>> From: Alan Adamson <alan.adamson@oracle.com>
>>
>> Commit d7ac8dca938c ("nvme: quiet user passthrough command errors")
>> disabled error logging for user passthrough commands.  This commit
>> adds the ability to opt-in to passthrough admin error logging. IO
>> commands initiated as passthrough will always be logged.
>>
>> The logging output for passthrough commands (Admin and IO) has been
>> changed to include CDWXX fields.
>>
>> nvme0n1: Read(0x2), LBA Out of Range (sct 0x0 / sc 0x80) DNR cdw10=0x0 
>> cdw11=0x1
>>          cdw12=0x70000 cdw13=0x0 cdw14=0x0 cdw15=0x0
>>
>> Add a helper function nvme_log_err_passthru() which allows us to log
>> error for passthru commands by decoding cdw10-cdw15 values of nvme
>> command.
>>
>> Add a new sysfs attr passthru_err_log that allows user to conditionally
>> enable passthru command logging, by default it is disabled.
>>
>> To enable passthrough admin error logging:
>>          echo 1 > /sys/class/nvme/nvme0/passthru_err_log
>>
>> To disable passthrough admin error logging:
>>          echo 0 > /sys/class/nvme/nvme0/passthru_err_log
>>
>> Signed-off-by: Alan Adamson <alan.adamson@oracle.com>
>> [kch] fix sevaral nits and trim down code, details in cover-letter.
>> Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
>> ---
>>   drivers/nvme/host/core.c | 83 +++++++++++++++++++++++++++++++++++++---
>>   drivers/nvme/host/nvme.h |  1 +
>>   2 files changed, 79 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>> index 954641a45e55..e32265adc034 100644
>> --- a/drivers/nvme/host/core.c
>> +++ b/drivers/nvme/host/core.c
>> @@ -337,6 +337,30 @@ static void nvme_log_error(struct request *req)
>>                  nr->status & NVME_SC_DNR  ? "DNR "  : "");
>>   }
>> +static void nvme_log_err_passthru(struct request *req)
>> +{
>> +    struct nvme_ns *ns = req->q->queuedata;
>> +    struct nvme_request *nr = nvme_req(req);
>> +
>> +    pr_err_ratelimited("%s: %s(0x%x), %s (sct 0x%x / sc 0x%x) %s%s"
>> +        "cdw10=0x%x cdw11=0x%x cdw12=0x%x cdw13=0x%x cdw14=0x%x 
>> cdw15=0x%x\n",
>> +        ns ? ns->disk->disk_name : dev_name(nr->ctrl->device),
>> +        ns ? nvme_get_opcode_str(nr->cmd->common.opcode) :
>> +             nvme_get_admin_opcode_str(nr->cmd->common.opcode),
>> +        nr->cmd->common.opcode,
>> +        nvme_get_error_status_str(nr->status),
>> +        nr->status >> 8 & 7,    /* Status Code Type */
>> +        nr->status & 0xff,    /* Status Code */
>> +        nr->status & NVME_SC_MORE ? "MORE " : "",
>> +        nr->status & NVME_SC_DNR  ? "DNR "  : "",
>> +        nr->cmd->common.cdw10,
>> +        nr->cmd->common.cdw11,
>> +        nr->cmd->common.cdw12,
>> +        nr->cmd->common.cdw13,
>> +        nr->cmd->common.cdw14,
>> +        nr->cmd->common.cdw14);
>> +}
>> +
>>   enum nvme_disposition {
>>       COMPLETE,
>>       RETRY,
>> @@ -381,8 +405,12 @@ static inline void nvme_end_req(struct request *req)
>>   {
>>       blk_status_t status = nvme_error_status(nvme_req(req)->status);
>> -    if (unlikely(nvme_req(req)->status && !(req->rq_flags & RQF_QUIET)))
>> -        nvme_log_error(req);
>> +    if (unlikely(nvme_req(req)->status && !(req->rq_flags & 
>> RQF_QUIET))) {
>> +        if (blk_rq_is_passthrough(req))
>> +            nvme_log_err_passthru(req);
>> +        else
>> +            nvme_log_error(req);
>> +    }
>>       nvme_end_req_zoned(req);
>>       nvme_trace_bio_complete(req);
>>       if (req->cmd_flags & REQ_NVME_MPATH)
>> @@ -666,10 +694,15 @@ static inline void 
>> nvme_clear_nvme_request(struct request *req)
>>   /* initialize a passthrough request */
>>   void nvme_init_request(struct request *req, struct nvme_command *cmd)
>>   {
>> +    struct nvme_request *nr = nvme_req(req);
>> +
>>       if (req->q->queuedata)
>>           req->timeout = NVME_IO_TIMEOUT;
>> -    else /* no queuedata implies admin queue */
>> +    else { /* no queuedata implies admin queue */
>>           req->timeout = NVME_ADMIN_TIMEOUT;
>> +        if (!nr->ctrl->passthru_log_err)
>> +            req->rq_flags |= RQF_QUIET;
>> +    }
>>       /* passthru commands should let the driver set the SGL flags */
>>       cmd->common.flags &= ~NVME_CMD_SGL_ALL;
>> @@ -678,8 +711,8 @@ void nvme_init_request(struct request *req, struct 
>> nvme_command *cmd)
>>       if (req->mq_hctx->type == HCTX_TYPE_POLL)
>>           req->cmd_flags |= REQ_POLLED;
>>       nvme_clear_nvme_request(req);
>> -    req->rq_flags |= RQF_QUIET;
>> -    memcpy(nvme_req(req)->cmd, cmd, sizeof(*cmd));
>> +
>> +    memcpy(nr->cmd, cmd, sizeof(*cmd));
>>   }
>>   EXPORT_SYMBOL_GPL(nvme_init_request);
>> @@ -3417,6 +3450,44 @@ static ssize_t nvme_sysfs_rescan(struct device 
>> *dev,
>>   }
>>   static DEVICE_ATTR(rescan_controller, S_IWUSR, NULL, 
>> nvme_sysfs_rescan);
>> +static ssize_t nvme_passthru_err_log_show(struct device *dev,
>> +        struct device_attribute *attr, char *buf)
>> +{
>> +    struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
>> +
>> +    if (ctrl->passthru_log_err)
>> +        return sysfs_emit(buf, "on\n");
>> +
>> +    return sysfs_emit(buf, "off\n");
>> +}
>> +
>> +static ssize_t nvme_passthru_err_log_store(struct device *dev,
>> +        struct device_attribute *attr, const char *buf, size_t count)
>> +{
>> +    struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
>> +    int passthru_enable, err;
>> +
>> +    err = kstrtoint(buf, 10, &passthru_enable);
>> +    if (err)
>> +        return -EINVAL;
>> +
>> +    switch (passthru_enable) {
>> +    case true:
>> +    case false:
>> +        ctrl->passthru_log_err = passthru_enable;
>> +        break;
>> +    default:
>> +        pr_err("invlid value %d for admin error logging [on:1 off:0]\n",
>> +            passthru_enable);
>> +        break;
>> +    }
>> +    return count;
>> +}
>> +
>> +static DEVICE_ATTR(passthru_err_log, S_IRUGO | S_IWUSR,
>> +    nvme_passthru_err_log_show,
>> +    nvme_passthru_err_log_store);
>> +
>>   static inline struct nvme_ns_head *dev_to_ns_head(struct device *dev)
>>   {
>>       struct gendisk *disk = dev_to_disk(dev);
>> @@ -3925,6 +3996,7 @@ static struct attribute *nvme_dev_attrs[] = {
>>       &dev_attr_dhchap_secret.attr,
>>       &dev_attr_dhchap_ctrl_secret.attr,
>>   #endif
>> +    &dev_attr_passthru_err_log.attr,
>>       NULL
>>   };
>> @@ -5124,6 +5196,7 @@ int nvme_init_ctrl(struct nvme_ctrl *ctrl, 
>> struct device *dev,
>>       int ret;
>>       ctrl->state = NVME_CTRL_NEW;
>> +    ctrl->passthru_log_err = false;
>>       clear_bit(NVME_CTRL_FAILFAST_EXPIRED, &ctrl->flags);
>>       spin_lock_init(&ctrl->lock);
>>       mutex_init(&ctrl->scan_lock);
>> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
>> index bf46f122e9e1..f5721ad8264c 100644
>> --- a/drivers/nvme/host/nvme.h
>> +++ b/drivers/nvme/host/nvme.h
>> @@ -248,6 +248,7 @@ struct nvme_ctrl {
>>       bool comp_seen;
>>       enum nvme_ctrl_state state;
>>       bool identified;
>> +    bool passthru_log_err;
>>       spinlock_t lock;
>>       struct mutex scan_lock;
>>       const struct nvme_ctrl_ops *ops;
> 
> This v5 version of the patch has been tested with the latest upstream. 
> Any objections?
> 
> Alan
> 
> 

Sagi/Christoph/Keith,

any objections ?

-ck



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH V5 1/1] nvme: allow passthru cmd error logging
  2023-05-22 22:58   ` alan.adamson
  2023-05-23  0:08     ` Chaitanya Kulkarni
@ 2023-06-15 16:48     ` alan.adamson
  1 sibling, 0 replies; 13+ messages in thread
From: alan.adamson @ 2023-06-15 16:48 UTC (permalink / raw)
  To: linux-nvme, Christoph Hellwig, kbusch, Sagi Grimberg


On 5/22/23 3:58 PM, alan.adamson@oracle.com wrote:
>
> On 4/9/23 2:25 PM, Chaitanya Kulkarni wrote:
>> From: Alan Adamson <alan.adamson@oracle.com>
>>
>> Commit d7ac8dca938c ("nvme: quiet user passthrough command errors")
>> disabled error logging for user passthrough commands.  This commit
>> adds the ability to opt-in to passthrough admin error logging. IO
>> commands initiated as passthrough will always be logged.
>>
>> The logging output for passthrough commands (Admin and IO) has been
>> changed to include CDWXX fields.
>>
>> nvme0n1: Read(0x2), LBA Out of Range (sct 0x0 / sc 0x80) DNR 
>> cdw10=0x0 cdw11=0x1
>>          cdw12=0x70000 cdw13=0x0 cdw14=0x0 cdw15=0x0
>>
>> Add a helper function nvme_log_err_passthru() which allows us to log
>> error for passthru commands by decoding cdw10-cdw15 values of nvme
>> command.
>>
>> Add a new sysfs attr passthru_err_log that allows user to conditionally
>> enable passthru command logging, by default it is disabled.
>>
>> To enable passthrough admin error logging:
>>          echo 1 > /sys/class/nvme/nvme0/passthru_err_log
>>
>> To disable passthrough admin error logging:
>>          echo 0 > /sys/class/nvme/nvme0/passthru_err_log
>>
>> Signed-off-by: Alan Adamson <alan.adamson@oracle.com>
>> [kch] fix sevaral nits and trim down code, details in cover-letter.
>> Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
>> ---
>>   drivers/nvme/host/core.c | 83 +++++++++++++++++++++++++++++++++++++---
>>   drivers/nvme/host/nvme.h |  1 +
>>   2 files changed, 79 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>> index 954641a45e55..e32265adc034 100644
>> --- a/drivers/nvme/host/core.c
>> +++ b/drivers/nvme/host/core.c
>> @@ -337,6 +337,30 @@ static void nvme_log_error(struct request *req)
>>                  nr->status & NVME_SC_DNR  ? "DNR "  : "");
>>   }
>>   +static void nvme_log_err_passthru(struct request *req)
>> +{
>> +    struct nvme_ns *ns = req->q->queuedata;
>> +    struct nvme_request *nr = nvme_req(req);
>> +
>> +    pr_err_ratelimited("%s: %s(0x%x), %s (sct 0x%x / sc 0x%x) %s%s"
>> +        "cdw10=0x%x cdw11=0x%x cdw12=0x%x cdw13=0x%x cdw14=0x%x 
>> cdw15=0x%x\n",
>> +        ns ? ns->disk->disk_name : dev_name(nr->ctrl->device),
>> +        ns ? nvme_get_opcode_str(nr->cmd->common.opcode) :
>> + nvme_get_admin_opcode_str(nr->cmd->common.opcode),
>> +        nr->cmd->common.opcode,
>> +        nvme_get_error_status_str(nr->status),
>> +        nr->status >> 8 & 7,    /* Status Code Type */
>> +        nr->status & 0xff,    /* Status Code */
>> +        nr->status & NVME_SC_MORE ? "MORE " : "",
>> +        nr->status & NVME_SC_DNR  ? "DNR "  : "",
>> +        nr->cmd->common.cdw10,
>> +        nr->cmd->common.cdw11,
>> +        nr->cmd->common.cdw12,
>> +        nr->cmd->common.cdw13,
>> +        nr->cmd->common.cdw14,
>> +        nr->cmd->common.cdw14);
>> +}
>> +
>>   enum nvme_disposition {
>>       COMPLETE,
>>       RETRY,
>> @@ -381,8 +405,12 @@ static inline void nvme_end_req(struct request 
>> *req)
>>   {
>>       blk_status_t status = nvme_error_status(nvme_req(req)->status);
>>   -    if (unlikely(nvme_req(req)->status && !(req->rq_flags & 
>> RQF_QUIET)))
>> -        nvme_log_error(req);
>> +    if (unlikely(nvme_req(req)->status && !(req->rq_flags & 
>> RQF_QUIET))) {
>> +        if (blk_rq_is_passthrough(req))
>> +            nvme_log_err_passthru(req);
>> +        else
>> +            nvme_log_error(req);
>> +    }
>>       nvme_end_req_zoned(req);
>>       nvme_trace_bio_complete(req);
>>       if (req->cmd_flags & REQ_NVME_MPATH)
>> @@ -666,10 +694,15 @@ static inline void 
>> nvme_clear_nvme_request(struct request *req)
>>   /* initialize a passthrough request */
>>   void nvme_init_request(struct request *req, struct nvme_command *cmd)
>>   {
>> +    struct nvme_request *nr = nvme_req(req);
>> +
>>       if (req->q->queuedata)
>>           req->timeout = NVME_IO_TIMEOUT;
>> -    else /* no queuedata implies admin queue */
>> +    else { /* no queuedata implies admin queue */
>>           req->timeout = NVME_ADMIN_TIMEOUT;
>> +        if (!nr->ctrl->passthru_log_err)
>> +            req->rq_flags |= RQF_QUIET;
>> +    }
>>         /* passthru commands should let the driver set the SGL flags */
>>       cmd->common.flags &= ~NVME_CMD_SGL_ALL;
>> @@ -678,8 +711,8 @@ void nvme_init_request(struct request *req, 
>> struct nvme_command *cmd)
>>       if (req->mq_hctx->type == HCTX_TYPE_POLL)
>>           req->cmd_flags |= REQ_POLLED;
>>       nvme_clear_nvme_request(req);
>> -    req->rq_flags |= RQF_QUIET;
>> -    memcpy(nvme_req(req)->cmd, cmd, sizeof(*cmd));
>> +
>> +    memcpy(nr->cmd, cmd, sizeof(*cmd));
>>   }
>>   EXPORT_SYMBOL_GPL(nvme_init_request);
>>   @@ -3417,6 +3450,44 @@ static ssize_t nvme_sysfs_rescan(struct 
>> device *dev,
>>   }
>>   static DEVICE_ATTR(rescan_controller, S_IWUSR, NULL, 
>> nvme_sysfs_rescan);
>>   +static ssize_t nvme_passthru_err_log_show(struct device *dev,
>> +        struct device_attribute *attr, char *buf)
>> +{
>> +    struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
>> +
>> +    if (ctrl->passthru_log_err)
>> +        return sysfs_emit(buf, "on\n");
>> +
>> +    return sysfs_emit(buf, "off\n");
>> +}
>> +
>> +static ssize_t nvme_passthru_err_log_store(struct device *dev,
>> +        struct device_attribute *attr, const char *buf, size_t count)
>> +{
>> +    struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
>> +    int passthru_enable, err;
>> +
>> +    err = kstrtoint(buf, 10, &passthru_enable);
>> +    if (err)
>> +        return -EINVAL;
>> +
>> +    switch (passthru_enable) {
>> +    case true:
>> +    case false:
>> +        ctrl->passthru_log_err = passthru_enable;
>> +        break;
>> +    default:
>> +        pr_err("invlid value %d for admin error logging [on:1 
>> off:0]\n",
>> +            passthru_enable);
>> +        break;
>> +    }
>> +    return count;
>> +}
>> +
>> +static DEVICE_ATTR(passthru_err_log, S_IRUGO | S_IWUSR,
>> +    nvme_passthru_err_log_show,
>> +    nvme_passthru_err_log_store);
>> +
>>   static inline struct nvme_ns_head *dev_to_ns_head(struct device *dev)
>>   {
>>       struct gendisk *disk = dev_to_disk(dev);
>> @@ -3925,6 +3996,7 @@ static struct attribute *nvme_dev_attrs[] = {
>>       &dev_attr_dhchap_secret.attr,
>>       &dev_attr_dhchap_ctrl_secret.attr,
>>   #endif
>> +    &dev_attr_passthru_err_log.attr,
>>       NULL
>>   };
>>   @@ -5124,6 +5196,7 @@ int nvme_init_ctrl(struct nvme_ctrl *ctrl, 
>> struct device *dev,
>>       int ret;
>>         ctrl->state = NVME_CTRL_NEW;
>> +    ctrl->passthru_log_err = false;
>>       clear_bit(NVME_CTRL_FAILFAST_EXPIRED, &ctrl->flags);
>>       spin_lock_init(&ctrl->lock);
>>       mutex_init(&ctrl->scan_lock);
>> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
>> index bf46f122e9e1..f5721ad8264c 100644
>> --- a/drivers/nvme/host/nvme.h
>> +++ b/drivers/nvme/host/nvme.h
>> @@ -248,6 +248,7 @@ struct nvme_ctrl {
>>       bool comp_seen;
>>       enum nvme_ctrl_state state;
>>       bool identified;
>> +    bool passthru_log_err;
>>       spinlock_t lock;
>>       struct mutex scan_lock;
>>       const struct nvme_ctrl_ops *ops;
>
> This v5 version of the patch has been tested with the latest 
> upstream.  Any objections?
>
> Alan
>
>
Any objections to this change?


Alan Adamson




^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2023-06-15 16:49 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-04-09 21:25 [PATCH V5 0/1] nvme: allow passthru cmd error logging Chaitanya Kulkarni
2023-04-09 21:25 ` [PATCH V5 1/1] " Chaitanya Kulkarni
2023-05-22 22:58   ` alan.adamson
2023-05-23  0:08     ` Chaitanya Kulkarni
2023-06-15 16:48     ` alan.adamson
2023-04-11  0:05 ` [PATCH V5 0/1] " alan.adamson
2023-04-11  5:23   ` Chaitanya Kulkarni
2023-04-14 22:01     ` alan.adamson
2023-05-05 21:59       ` alan.adamson
2023-05-05 23:56         ` Chaitanya Kulkarni
2023-05-13 10:13           ` Chaitanya Kulkarni
     [not found] ` <CGME20230412091843eucas1p22a4c7c34297c4f260ccc54cff9d67a52@eucas1p2.samsung.com>
2023-04-12  9:10   ` Pankaj Raghav
2023-04-13  2:20     ` Chaitanya Kulkarni

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).