* [PATCH V9 0/1] nvme: allow passthru cmd error logging
@ 2024-01-30  0:19 Alan Adamson
  2024-01-30  0:19 ` [PATCH V9 1/1] " Alan Adamson
  0 siblings, 1 reply; 5+ messages in thread
From: Alan Adamson @ 2024-01-30  0:19 UTC (permalink / raw)
  To: linux-nvme; +Cc: alan.adamson, kch, kbusch, hch, sagi

In nvme_end_req() we only log errors for non-passthru commands. Add a
helper function nvme_log_err_passthru() that logs errors for passthru
commands, including the cdw10-cdw15 values of the NVMe command.

Below is a short test log:

* Admin Passthru error log off, no errors are printed
* Admin Passthru error log on, errors are printed
* IO Passthru error log off, no errors are printed
* IO Passthru error log on, errors are printed

v9:
- Move logging_enabled flag in device structure to nvme_ctrl and nvme_ns structures.
- Create separate show/store functions for nvme_dev_attrs and nvme_ns_attrs.

v8:
- Add a logging_enabled flag to device structure.
- Add a passthru_err_log_enabled sysfs attribute for namespaces to
  allow logging of passthru IO commands.

v7:
- Changed attribute/flag to passthru_err_log_enabled.
- Use kstrtobool rather than kstrtoint.

v6:
- Rebase, retest nvme-6.5 and add test log for admin and I/O
  passthru error log.

v5:
- Trim down code in the nvme_log_error_passthrough().
  Use following to get the disk name as an arg to
   pr_err_ratelimited() :-
        ns ? ns->disk->disk_name : dev_name(nr->ctrl->device),
  Use following to get the admin vs I/O opcode string as an arg to
  pr_err_ratelimited() :-
        ns ? nvme_get_opcode_str(nr->cmd->common.opcode) :
             nvme_get_admin_opcode_str(nr->cmd->common.opcode),
- Rename nvme_log_error_passthrough() -> nvme_log_err_passthru().
- Remove else and return directly in nvme_passthru_err_log_show().
- Generate error on invalid values of the passthru_enable variable
  in nvme_passthru_log_store().
- Rename passthrough -> passthru.
- Rename sysfs attr from passthru_admin_err_logging -> passthru_log_err.

v4:
- Change sysfs attribute to passthru_admin_err_logging
- Only log passthrough admin commands.  IO passthrough commands will
  always be logged.

v3:
- Log a passthrough specific message that dumps CDW* contents.
- Enable/disable via sysfs rather than debugfs.

v2:
- Included Pankaj Raghav's patch 'nvme: ignore starting sector while
  error logging for passthrough requests'
  with a couple changes.
- Moved error_logging flag to nvme_ctrl structure
- The entire nvme-debugfs.c does not need to be guarded by
  #ifdef CONFIG_FAULT_INJECTION_DEBUG_FS.
- Use IS_ENABLED((CONFIG_NVME_ERROR_LOGGING_DEBUG_FS)) to determine if
  error logging should be initialized.
- Various other nits.


Testing
-------
* Admin Passthru error log off, no errors are printed :-

[root@localhost ~]# echo 0 > /sys/class/nvme/nvme0/passthru_err_log_enabled
[root@localhost ~]# nvme telemetry-log -o /tmp/test /dev/nvme0
[root@localhost ~]# dmesg
[root@localhost ~]#


* Admin Passthru error log on, errors are printed :-

[root@localhost ~]# echo 1 > /sys/class/nvme/nvme0/passthru_err_log_enabled
[root@localhost ~]# nvme telemetry-log -o /tmp/test /dev/nvme0
[root@localhost ~]# dmesg
[ 2364.008105] nvme0: Get Log Page(0x2), Invalid Field in Command (sct 0x0 / sc 0x2) DNR cdw10=0x7f0107 cdw11=0x0 cdw12=0x0 cdw13=0x0 cdw14=0x0 cdw15=0x0


* IO Passthru error log off, no errors are printed :-

[root@localhost ~]# echo 0 > /sys/class/nvme/nvme0/nvme0n1/passthru_err_log_enabled
[root@localhost ~]# nvme write-zeroes -n 1 -s 0x200000 -c 10 /dev/nvme0
NVMe status: LBA Out of Range: The command references an LBA that exceeds the size of the namespace(0x4080)
[root@localhost ~]# dmesg
[ 3131.602986] nvme nvme0: using deprecated NVME_IOCTL_IO_CMD ioctl on the char device!
[root@localhost ~]#


* IO Passthru error log on, errors are printed :-

[root@localhost ~]# echo 1 > /sys/class/nvme/nvme0/nvme0n1/passthru_err_log_enabled
[root@localhost ~]# nvme write-zeroes -n 1 -s 0x200000 -c 10 /dev/nvme0
NVMe status: LBA Out of Range: The command references an LBA that exceeds the size of the namespace(0x4080)
[root@localhost ~]# dmesg
[ 2944.910393] nvme nvme0: using deprecated NVME_IOCTL_IO_CMD ioctl on the char device!
[ 2944.910423] nvme0n1: Write Zeroes(0x8), LBA Out of Range (sct 0x0 / sc 0x80) DNR cdw10=0x200000 cdw11=0x0 cdw12=0xa cdw13=0x0 cdw14=0x0 cdw15=0x0
[root@localhost ~]#
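
The status fields in the messages above can be checked by hand. The
following is a minimal userspace sketch (not part of the patch) that
applies the same bit arithmetic as nvme_log_err_passthru() to a status
value such as the 0x4080 printed by nvme-cli, assuming nvme-cli reports
the status in the same layout the driver uses. The NVME_SC_MORE and
NVME_SC_DNR values are assumed to match include/linux/nvme.h as of this
series; struct status_fields, decode_status() and format_status() are
hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed to mirror the kernel's definitions; not taken from this patch. */
#define NVME_SC_MORE	0x2000
#define NVME_SC_DNR	0x4000

/* Hypothetical container for the fields the log line prints. */
struct status_fields {
	unsigned int sct;	/* Status Code Type */
	unsigned int sc;	/* Status Code */
	bool more;
	bool dnr;
};

/* Split a status word using the same shifts and masks as the helper. */
static struct status_fields decode_status(uint16_t status)
{
	struct status_fields f = {
		.sct  = (status >> 8) & 7,
		.sc   = status & 0xff,
		.more = (status & NVME_SC_MORE) != 0,
		.dnr  = (status & NVME_SC_DNR) != 0,
	};

	return f;
}

/* Format the status portion the way the new log message lays it out. */
static int format_status(uint16_t status, char *buf, size_t len)
{
	struct status_fields f = decode_status(status);

	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "(sct 0x%x / sc 0x%x) %s%s",
			f.sct, f.sc, f.more ? "MORE " : "", f.dnr ? "DNR " : "");
}
```

For 0x4080, decode_status() yields sct 0x0 and sc 0x80 with DNR set,
which matches the Write Zeroes line in the log above.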

Alan Adamson (1):
  nvme: allow passthru cmd error logging

 drivers/nvme/host/core.c  | 59 +++++++++++++++++++++++++++++++++++----
 drivers/nvme/host/nvme.h  |  3 +-
 drivers/nvme/host/sysfs.c | 57 +++++++++++++++++++++++++++++++++++++
 3 files changed, 112 insertions(+), 7 deletions(-)

-- 
2.39.3




* [PATCH V9 1/1] nvme: allow passthru cmd error logging
  2024-01-30  0:19 [PATCH V9 0/1] nvme: allow passthru cmd error logging Alan Adamson
@ 2024-01-30  0:19 ` Alan Adamson
  2024-01-31  6:23   ` Christoph Hellwig
                     ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Alan Adamson @ 2024-01-30  0:19 UTC (permalink / raw)
  To: linux-nvme; +Cc: alan.adamson, kch, kbusch, hch, sagi

Commit d7ac8dca938c ("nvme: quiet user passthrough command errors")
disabled error logging for user passthrough commands.  This commit
adds the ability to opt in to error logging for passthrough commands,
both Admin and IO.

The logging output for passthrough commands (Admin and IO) has been
changed to include CDWXX fields.

nvme0n1: Read(0x2), LBA Out of Range (sct 0x0 / sc 0x80) DNR cdw10=0x0 cdw11=0x1
        cdw12=0x70000 cdw13=0x0 cdw14=0x0 cdw15=0x0

Add a helper function nvme_log_err_passthru() which logs errors for
passthru commands, including the cdw10-cdw15 values of the NVMe
command.

Add a new sysfs attr passthru_err_log_enabled that allows the user to
conditionally enable passthrough command error logging for either passthrough
Admin commands sent to the controller or passthrough IO commands sent to a
namespace.

By default, passthrough error logging is disabled.

To enable passthrough admin error logging:
        echo 1 > /sys/class/nvme/nvme0/passthru_err_log_enabled

To disable passthrough admin error logging:
        echo 0 > /sys/class/nvme/nvme0/passthru_err_log_enabled

To enable passthrough io error logging:
        echo 1 > /sys/class/nvme/nvme0/nvme0n1/passthru_err_log_enabled

To disable passthrough io error logging:
        echo 0 > /sys/class/nvme/nvme0/nvme0n1/passthru_err_log_enabled
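
The echo commands above can also be issued from a program. The
following is a minimal userspace sketch, assuming the attribute paths
shown in this commit message; set_passthru_err_log() is a hypothetical
helper and is not part of the patch or of nvme-cli.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: write "1" or "0" to a passthru_err_log_enabled
 * attribute file. Returns 0 on success and -1 on failure.
 */
static int set_passthru_err_log(const char *attr_path, int enable)
{
	FILE *f;
	int ok;

	assert(attr_path != NULL);

	f = fopen(attr_path, "w");
	if (!f)
		return -1;

	ok = fputs(enable ? "1\n" : "0\n", f) >= 0;

	/* fclose() flushes the buffer, so a rejected write can surface here. */
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

For example, calling
set_passthru_err_log("/sys/class/nvme/nvme0/passthru_err_log_enabled", 1)
would correspond to the first echo command, and it needs the same
privileges as the shell version.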

Signed-off-by: Alan Adamson <alan.adamson@oracle.com>
[kch] fix several nits and trim down code, details in cover-letter.
Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
---
 drivers/nvme/host/core.c  | 59 +++++++++++++++++++++++++++++++++++----
 drivers/nvme/host/nvme.h  |  3 +-
 drivers/nvme/host/sysfs.c | 57 +++++++++++++++++++++++++++++++++++++
 3 files changed, 112 insertions(+), 7 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 85ab0fcf9e88..76bf72e4b2ff 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -338,6 +338,30 @@ static void nvme_log_error(struct request *req)
 			   nr->status & NVME_SC_DNR  ? "DNR "  : "");
 }
 
+static void nvme_log_err_passthru(struct request *req)
+{
+	struct nvme_ns *ns = req->q->queuedata;
+	struct nvme_request *nr = nvme_req(req);
+
+	pr_err_ratelimited("%s: %s(0x%x), %s (sct 0x%x / sc 0x%x) %s%s"
+		"cdw10=0x%x cdw11=0x%x cdw12=0x%x cdw13=0x%x cdw14=0x%x cdw15=0x%x\n",
+		ns ? ns->disk->disk_name : dev_name(nr->ctrl->device),
+		ns ? nvme_get_opcode_str(nr->cmd->common.opcode) :
+		     nvme_get_admin_opcode_str(nr->cmd->common.opcode),
+		nr->cmd->common.opcode,
+		nvme_get_error_status_str(nr->status),
+		nr->status >> 8 & 7,	/* Status Code Type */
+		nr->status & 0xff,	/* Status Code */
+		nr->status & NVME_SC_MORE ? "MORE " : "",
+		nr->status & NVME_SC_DNR  ? "DNR "  : "",
+		nr->cmd->common.cdw10,
+		nr->cmd->common.cdw11,
+		nr->cmd->common.cdw12,
+		nr->cmd->common.cdw13,
+		nr->cmd->common.cdw14,
+		nr->cmd->common.cdw15);
+}
+
 enum nvme_disposition {
 	COMPLETE,
 	RETRY,
@@ -385,8 +409,12 @@ static inline void nvme_end_req(struct request *req)
 {
 	blk_status_t status = nvme_error_status(nvme_req(req)->status);
 
-	if (unlikely(nvme_req(req)->status && !(req->rq_flags & RQF_QUIET)))
-		nvme_log_error(req);
+	if (unlikely(nvme_req(req)->status && !(req->rq_flags & RQF_QUIET))) {
+		if (blk_rq_is_passthrough(req))
+			nvme_log_err_passthru(req);
+		else
+			nvme_log_error(req);
+	}
 	nvme_end_req_zoned(req);
 	nvme_trace_bio_complete(req);
 	if (req->cmd_flags & REQ_NVME_MPATH)
@@ -679,10 +707,21 @@ static inline void nvme_clear_nvme_request(struct request *req)
 /* initialize a passthrough request */
 void nvme_init_request(struct request *req, struct nvme_command *cmd)
 {
-	if (req->q->queuedata)
+	struct nvme_request *nr = nvme_req(req);
+	bool logging_enabled;
+
+	if (req->q->queuedata) {
+		struct nvme_ns *ns = req->q->disk->private_data;
+
+		logging_enabled = ns->passthru_err_log_enabled;
 		req->timeout = NVME_IO_TIMEOUT;
-	else /* no queuedata implies admin queue */
+	} else { /* no queuedata implies admin queue */
+		logging_enabled = nr->ctrl->passthru_err_log_enabled;
 		req->timeout = NVME_ADMIN_TIMEOUT;
+	}
+
+	if (!logging_enabled)
+		req->rq_flags |= RQF_QUIET;
 
 	/* passthru commands should let the driver set the SGL flags */
 	cmd->common.flags &= ~NVME_CMD_SGL_ALL;
@@ -691,8 +730,7 @@ void nvme_init_request(struct request *req, struct nvme_command *cmd)
 	if (req->mq_hctx->type == HCTX_TYPE_POLL)
 		req->cmd_flags |= REQ_POLLED;
 	nvme_clear_nvme_request(req);
-	req->rq_flags |= RQF_QUIET;
-	memcpy(nvme_req(req)->cmd, cmd, sizeof(*cmd));
+	memcpy(nr->cmd, cmd, sizeof(*cmd));
 }
 EXPORT_SYMBOL_GPL(nvme_init_request);
 
@@ -3651,6 +3689,7 @@ static void nvme_alloc_ns(struct nvme_ctrl *ctrl, struct nvme_ns_info *info)
 
 	ns->disk = disk;
 	ns->queue = disk->queue;
+	ns->passthru_err_log_enabled = false;
 
 	if (ctrl->opts && ctrl->opts->data_digest)
 		blk_queue_flag_set(QUEUE_FLAG_STABLE_WRITES, ns->queue);
@@ -3714,6 +3753,13 @@ static void nvme_alloc_ns(struct nvme_ctrl *ctrl, struct nvme_ns_info *info)
 	nvme_mpath_add_disk(ns, info->anagrpid);
 	nvme_fault_inject_init(&ns->fault_inject, ns->disk->disk_name);
 
+	/*
+	 * Set ns->disk->device->driver_data to ns so we can access
+	 * ns->passthru_err_log_enabled in nvme_io_passthru_err_log_enabled_store()
+	 * and nvme_io_passthru_err_log_enabled_show().
+	 */
+	dev_set_drvdata(disk_to_dev(ns->disk), ns);
+
 	return;
 
  out_cleanup_ns_from_list:
@@ -4514,6 +4560,7 @@ int nvme_init_ctrl(struct nvme_ctrl *ctrl, struct device *dev,
 	int ret;
 
 	WRITE_ONCE(ctrl->state, NVME_CTRL_NEW);
+	ctrl->passthru_err_log_enabled = false;
 	clear_bit(NVME_CTRL_FAILFAST_EXPIRED, &ctrl->flags);
 	spin_lock_init(&ctrl->lock);
 	mutex_init(&ctrl->scan_lock);
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 030c80818240..7f9f2a7472e1 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -263,6 +263,7 @@ enum nvme_ctrl_flags {
 struct nvme_ctrl {
 	bool comp_seen;
 	bool identified;
+	bool passthru_err_log_enabled;
 	enum nvme_ctrl_state state;
 	spinlock_t lock;
 	struct mutex scan_lock;
@@ -522,7 +523,7 @@ struct nvme_ns {
 	struct device		cdev_device;
 
 	struct nvme_fault_inject fault_inject;
-
+	bool			passthru_err_log_enabled;
 };
 
 /* NVMe ns supports metadata actions by the controller (generate/strip) */
diff --git a/drivers/nvme/host/sysfs.c b/drivers/nvme/host/sysfs.c
index 754e91111042..c038531a5a06 100644
--- a/drivers/nvme/host/sysfs.c
+++ b/drivers/nvme/host/sysfs.c
@@ -35,6 +35,61 @@ static ssize_t nvme_sysfs_rescan(struct device *dev,
 }
 static DEVICE_ATTR(rescan_controller, S_IWUSR, NULL, nvme_sysfs_rescan);
 
+static ssize_t nvme_adm_passthru_err_log_enabled_show(struct device *dev,
+		struct device_attribute *attr, char *buf)
+{
+	struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
+
+	return sysfs_emit(buf, ctrl->passthru_err_log_enabled ? "on" : "off");
+}
+
+static ssize_t nvme_adm_passthru_err_log_enabled_store(struct device *dev,
+		struct device_attribute *attr, const char *buf, size_t count)
+{
+	struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
+	int err;
+	bool passthru_err_log_enabled;
+
+	err = kstrtobool(buf, &passthru_err_log_enabled);
+	if (err)
+		return -EINVAL;
+
+	ctrl->passthru_err_log_enabled = passthru_err_log_enabled;
+
+	return count;
+}
+
+static ssize_t nvme_io_passthru_err_log_enabled_show(struct device *dev,
+		struct device_attribute *attr, char *buf)
+{
+	struct nvme_ns *n = dev_get_drvdata(dev);
+
+	return sysfs_emit(buf, n->passthru_err_log_enabled ? "on\n" : "off\n");
+}
+
+static ssize_t nvme_io_passthru_err_log_enabled_store(struct device *dev,
+		struct device_attribute *attr, const char *buf, size_t count)
+{
+	struct nvme_ns *ns = dev_get_drvdata(dev);
+	int err;
+	bool passthru_err_log_enabled;
+
+	err = kstrtobool(buf, &passthru_err_log_enabled);
+	if (err)
+		return -EINVAL;
+	ns->passthru_err_log_enabled = passthru_err_log_enabled;
+
+	return count;
+}
+
+static struct device_attribute dev_attr_adm_passthru_err_log_enabled = \
+	__ATTR(passthru_err_log_enabled, S_IRUGO | S_IWUSR, \
+	nvme_adm_passthru_err_log_enabled_show, nvme_adm_passthru_err_log_enabled_store);
+
+static struct device_attribute dev_attr_io_passthru_err_log_enabled = \
+	__ATTR(passthru_err_log_enabled, S_IRUGO | S_IWUSR, \
+	nvme_io_passthru_err_log_enabled_show, nvme_io_passthru_err_log_enabled_store);
+
 static inline struct nvme_ns_head *dev_to_ns_head(struct device *dev)
 {
 	struct gendisk *disk = dev_to_disk(dev);
@@ -208,6 +263,7 @@ static struct attribute *nvme_ns_attrs[] = {
 	&dev_attr_ana_grpid.attr,
 	&dev_attr_ana_state.attr,
 #endif
+	&dev_attr_io_passthru_err_log_enabled.attr,
 	NULL,
 };
 
@@ -655,6 +711,7 @@ static struct attribute *nvme_dev_attrs[] = {
 #ifdef CONFIG_NVME_TCP_TLS
 	&dev_attr_tls_key.attr,
 #endif
+	&dev_attr_adm_passthru_err_log_enabled.attr,
 	NULL
 };
 
-- 
2.39.3




* Re: [PATCH V9 1/1] nvme: allow passthru cmd error logging
  2024-01-30  0:19 ` [PATCH V9 1/1] " Alan Adamson
@ 2024-01-31  6:23   ` Christoph Hellwig
  2024-01-31 22:54   ` Chaitanya Kulkarni
  2024-02-01  1:09   ` Keith Busch
  2 siblings, 0 replies; 5+ messages in thread
From: Christoph Hellwig @ 2024-01-31  6:23 UTC (permalink / raw)
  To: Alan Adamson; +Cc: linux-nvme, kch, kbusch, hch, sagi

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>



* Re: [PATCH V9 1/1] nvme: allow passthru cmd error logging
  2024-01-30  0:19 ` [PATCH V9 1/1] " Alan Adamson
  2024-01-31  6:23   ` Christoph Hellwig
@ 2024-01-31 22:54   ` Chaitanya Kulkarni
  2024-02-01  1:09   ` Keith Busch
  2 siblings, 0 replies; 5+ messages in thread
From: Chaitanya Kulkarni @ 2024-01-31 22:54 UTC (permalink / raw)
  To: Alan Adamson, linux-nvme; +Cc: Chaitanya Kulkarni, kbusch, hch, sagi

> +static ssize_t nvme_adm_passthru_err_log_enabled_show(struct device *dev,
> +		struct device_attribute *attr, char *buf)
> +{
> +	struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
> +
> +	return sysfs_emit(buf, ctrl->passthru_err_log_enabled ? "on" : "off");
> +}
> +

Just like in nvme_io_passthru_err_log_enabled_show() (see below), we need
"\n" here to be consistent:

return sysfs_emit(buf, ctrl->passthru_err_log_enabled ? "on\n" : "off\n");

With that, looks good.

Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>

[...]

> +static ssize_t nvme_io_passthru_err_log_enabled_show(struct device *dev,
> +		struct device_attribute *attr, char *buf)
> +{
> +	struct nvme_ns *n = dev_get_drvdata(dev);
> +
> +	return sysfs_emit(buf, n->passthru_err_log_enabled ? "on\n" : "off\n");
> +}
> +
>

-ck




* Re: [PATCH V9 1/1] nvme: allow passthru cmd error logging
  2024-01-30  0:19 ` [PATCH V9 1/1] " Alan Adamson
  2024-01-31  6:23   ` Christoph Hellwig
  2024-01-31 22:54   ` Chaitanya Kulkarni
@ 2024-02-01  1:09   ` Keith Busch
  2 siblings, 0 replies; 5+ messages in thread
From: Keith Busch @ 2024-02-01  1:09 UTC (permalink / raw)
  To: Alan Adamson; +Cc: linux-nvme, kch, hch, sagi

On Mon, Jan 29, 2024 at 04:19:38PM -0800, Alan Adamson wrote:
> Commit d7ac8dca938c ("nvme: quiet user passthrough command errors")
> disabled error logging for user passthrough commands.  This commit
> adds the ability to opt-in to passthrough admin error logging. IO
> commands initiated as passthrough will always be logged.
> 
> The logging output for passthrough commands (Admin and IO) has been
> changed to include CDWXX fields.
> 
> nvme0n1: Read(0x2), LBA Out of Range (sct 0x0 / sc 0x80) DNR cdw10=0x0 cdw11=0x1
>         cdw12=0x70000 cdw13=0x0 cdw14=0x0 cdw15=0x0
> 
> Add a helper function nvme_log_err_passthru() which allows us to log
> error for passthru commands by decoding cdw10-cdw15 values of nvme
> command.
> 
> Add a new sysfs attr passthru_err_log_enabled that allows user to conditionally
> enable passthrough command logging for either passthrough Admin commands sent to
> the controller or passthrough IO commands sent to a namespace.

Thanks, applied to nvme-6.8 with Chaitanya's suggestion folded in.


