* [PATCH V4 0/1] nvme: Add verbose error logging
@ 2022-01-24 18:51 Alan Adamson
2022-01-24 18:51 ` [PATCH V4 1/1] " Alan Adamson
0 siblings, 1 reply; 8+ messages in thread
From: Alan Adamson @ 2022-01-24 18:51 UTC (permalink / raw)
To: linux-nvme; +Cc: alan.adamson, kbusch, hch, sagi
V4:
- Adding logging for admin commands.
- Display NVMe Command Opcode value rather than generic block opcode.
- Various other nits.
V3:
- Don't populate nvme_errors[] array with NULL strings.
V2:
- Change nvme_errors[] to a sparse array where the status is used as an index into the array.
- Change pr_err() to pr_err_ratelimited().
- By enabling CONFIG_NVME_VERBOSE_ERRORS, the verbose status error is displayed. If it is not
enabled, then a message will still be displayed, but without the verbose status.
- Remove call to nvme_error_status() when determining whether to call nvme_error_log(). Speeds
up the fast path just a bit.
This patch improves logging for NVMe errors. Currently, we only get a
a vague idea as to why commands fail since only the block layer status
is captured on error. This patch allows us to see why a command was failed
by the controller. This is very useful when debugging problems in the field.
An example of an improved logged error:
[ 183.333734] nvme0n1: Read(0x2) @ LBA 0, 1 blocks, Unrecovered Read Error (sct 0x2 / sc 0x81) DNR
[ 227.767945] nvme0: Activate Firmware(0x10), Invalid Field in Command (sct 0x0 / sc 0x2) DNR
Alan Adamson (1):
nvme: Add verbose error logging
drivers/nvme/host/Kconfig | 8 ++
drivers/nvme/host/Makefile | 2 +-
drivers/nvme/host/core.c | 3 +
drivers/nvme/host/errors.c | 230 +++++++++++++++++++++++++++++++++++++
drivers/nvme/host/nvme.h | 2 +
include/linux/nvme.h | 1 +
6 files changed, 245 insertions(+), 1 deletion(-)
create mode 100644 drivers/nvme/host/errors.c
--
2.27.0
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH V4 1/1] nvme: Add verbose error logging
2022-01-24 18:51 [PATCH V4 0/1] nvme: Add verbose error logging Alan Adamson
@ 2022-01-24 18:51 ` Alan Adamson
2022-01-24 20:20 ` Keith Busch
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Alan Adamson @ 2022-01-24 18:51 UTC (permalink / raw)
To: linux-nvme; +Cc: alan.adamson, kbusch, hch, sagi
Improves logging of NVMe errors. If NVME_VERBOSE_ERRORS is configured,
a verbose description of the error is logged, otherwise only status codes/bits
is logged.
Verbose logging examples:
[ 183.333734] nvme0n1: Read(0x2) @ LBA 0, 1 blocks, Unrecovered Read Error (sct 0x2 / sc 0x81) DNR
[ 227.767945] nvme0: Activate Firmware(0x10), Invalid Field in Command (sct 0x0 / sc 0x2) DNR
Non-verbose logging examples:
[ 41.280509] nvme0n1: I/O Cmd(0x2) @ LBA 0, 1 blocks, I/O Error (sct 0x2 / sc 0x81) DNR
[ 77.249153] nvme0: Admin Cmd(0x10), I/O Error (sct 0x0 / sc 0x2) DNR
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Alan Adamson <alan.adamson@oracle.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
---
drivers/nvme/host/Kconfig | 8 ++
drivers/nvme/host/Makefile | 2 +-
drivers/nvme/host/core.c | 3 +
drivers/nvme/host/errors.c | 230 +++++++++++++++++++++++++++++++++++++
drivers/nvme/host/nvme.h | 2 +
include/linux/nvme.h | 1 +
6 files changed, 245 insertions(+), 1 deletion(-)
create mode 100644 drivers/nvme/host/errors.c
diff --git a/drivers/nvme/host/Kconfig b/drivers/nvme/host/Kconfig
index dc0450ca23a3..d6d056963c06 100644
--- a/drivers/nvme/host/Kconfig
+++ b/drivers/nvme/host/Kconfig
@@ -24,6 +24,14 @@ config NVME_MULTIPATH
/dev/nvmeXnY device will show up for each NVMe namespace,
even if it is accessible through multiple controllers.
+config NVME_VERBOSE_ERRORS
+ bool "NVMe verbose error reporting"
+ depends on NVME_CORE
+ help
+ This option enables verbose reporting for NVMe errors. The
+ error translation table will grow the kernel image size by
+ about 4 KB.
+
config NVME_HWMON
bool "NVMe hardware monitoring"
depends on (NVME_CORE=y && HWMON=y) || (NVME_CORE=m && HWMON)
diff --git a/drivers/nvme/host/Makefile b/drivers/nvme/host/Makefile
index dfaacd472e5d..ea3d702feb51 100644
--- a/drivers/nvme/host/Makefile
+++ b/drivers/nvme/host/Makefile
@@ -9,7 +9,7 @@ obj-$(CONFIG_NVME_RDMA) += nvme-rdma.o
obj-$(CONFIG_NVME_FC) += nvme-fc.o
obj-$(CONFIG_NVME_TCP) += nvme-tcp.o
-nvme-core-y := core.o ioctl.o
+nvme-core-y := core.o ioctl.o errors.o
nvme-core-$(CONFIG_TRACING) += trace.o
nvme-core-$(CONFIG_NVME_MULTIPATH) += multipath.o
nvme-core-$(CONFIG_BLK_DEV_ZONED) += zns.o
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 5e0bfda04bd7..ca5d36fcf4f8 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -354,6 +354,9 @@ void nvme_complete_rq(struct request *req)
switch (nvme_decide_disposition(req)) {
case COMPLETE:
+ if (unlikely(nvme_req(req)->status != NVME_SC_SUCCESS))
+ nvme_error_log(req);
+
nvme_end_req(req);
return;
case RETRY:
diff --git a/drivers/nvme/host/errors.c b/drivers/nvme/host/errors.c
new file mode 100644
index 000000000000..446831fc32ea
--- /dev/null
+++ b/drivers/nvme/host/errors.c
@@ -0,0 +1,230 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * NVM Express device driver verbose errors
+ * Copyright (c) 2021, Oracle and/or its affiliates
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ */
+
+#include <linux/blkdev.h>
+#include "nvme.h"
+
+struct nvme_string_table {
+ unsigned int code;
+ const unsigned char *string;
+};
+
+#ifdef CONFIG_NVME_VERBOSE_ERRORS
+static const struct nvme_string_table nvme_ops[] = {
+ { nvme_cmd_flush, "Flush" },
+ { nvme_cmd_write, "Write" },
+ { nvme_cmd_read, "Read" },
+ { nvme_cmd_write_uncor, "Write Uncorrectable" },
+ { nvme_cmd_compare, "Compare" },
+ { nvme_cmd_write_zeroes, "Write Zeros" },
+ { nvme_cmd_dsm, "Deallocate (DSM)" },
+ { nvme_cmd_verify, "Verify" },
+ { nvme_cmd_resv_register, "Reservation Register" },
+ { nvme_cmd_resv_report, "Reservation Report" },
+ { nvme_cmd_resv_acquire, "Reservation Acquire" },
+ { nvme_cmd_resv_release, "Reservation Release" },
+ { nvme_cmd_zone_mgmt_send, "Zone Management Send" },
+ { nvme_cmd_zone_mgmt_recv, "Zone Management Receive" },
+ { nvme_cmd_zone_append, "Zone Management Append" },
+};
+
+static const struct nvme_string_table nvme_admin_ops[] = {
+ { nvme_admin_delete_sq, "Delete SQ" },
+ { nvme_admin_create_sq, "Create SQ" },
+ { nvme_admin_get_log_page, "Get Log Page" },
+ { nvme_admin_delete_cq, "Delete CQ" },
+ { nvme_admin_create_cq, "Create CQ" },
+ { nvme_admin_identify, "Identify" },
+ { nvme_admin_abort_cmd, "Abort Command" },
+ { nvme_admin_set_features, "Set Features" },
+ { nvme_admin_get_features, "Get Features" },
+ { nvme_admin_async_event, "Async Event" },
+ { nvme_admin_ns_mgmt, "Namespace Management" },
+ { nvme_admin_activate_fw, "Activate Firmware" },
+ { nvme_admin_download_fw, "Download Firmware" },
+ { nvme_admin_dev_self_test, "Device Self Test" },
+ { nvme_admin_ns_attach, "Namespace Attach" },
+ { nvme_admin_keep_alive, "Keep Alive" },
+ { nvme_admin_directive_send, "Directive Send" },
+ { nvme_admin_directive_recv, "Directive Receive" },
+ { nvme_admin_virtual_mgmt, "Virtual Management" },
+ { nvme_admin_nvme_mi_send, "NVMe Send MI" },
+ { nvme_admin_nvme_mi_recv, "NVMe Receive MI" },
+ { nvme_admin_dbbuf, "Doorbell Buffer Config" },
+ { nvme_admin_format_nvm, "Format NVM" },
+ { nvme_admin_security_send, "Security Send" },
+ { nvme_admin_security_recv, "Security Receive" },
+ { nvme_admin_sanitize_nvm, "Sanitize NVM" },
+ { nvme_admin_get_lba_status, "Get LBA Status" },
+};
+
+static const char * const nvme_errors[] = {
+ [NVME_SC_SUCCESS] = "Success",
+ [NVME_SC_INVALID_OPCODE] = "Invalid Command Opcode",
+ [NVME_SC_INVALID_FIELD] = "Invalid Field in Command",
+ [NVME_SC_CMDID_CONFLICT] = "Command ID Conflict",
+ [NVME_SC_DATA_XFER_ERROR] = "Data Transfer Error",
+ [NVME_SC_POWER_LOSS] = "Commands Aborted due to Power Loss Notification",
+ [NVME_SC_INTERNAL] = "Internal Error",
+ [NVME_SC_ABORT_REQ] = "Command Abort Requested",
+ [NVME_SC_ABORT_QUEUE] = "Command Aborted due to SQ Deletion",
+ [NVME_SC_FUSED_FAIL] = "Command Aborted due to Failed Fused Command",
+ [NVME_SC_FUSED_MISSING] = "Command Aborted due to Missing Fused Command",
+ [NVME_SC_INVALID_NS] = "Invalid Namespace or Format",
+ [NVME_SC_CMD_SEQ_ERROR] = "Command Sequence Error",
+ [NVME_SC_SGL_INVALID_LAST] = "Invalid SGL Segment Descriptor",
+ [NVME_SC_SGL_INVALID_COUNT] = "Invalid Number of SGL Descriptors",
+ [NVME_SC_SGL_INVALID_DATA] = "Data SGL Length Invalid",
+ [NVME_SC_SGL_INVALID_METADATA] = "Metadata SGL Length Invalid",
+ [NVME_SC_SGL_INVALID_TYPE] = "SGL Descriptor Type Invalid",
+ [NVME_SC_CMB_INVALID_USE] = "Invalid Use of Controller Memory Buffer",
+ [NVME_SC_PRP_INVALID_OFFSET] = "PRP Offset Invalid",
+ [NVME_SC_ATOMIC_WU_EXCEEDED] = "Atomic Write Unit Exceeded",
+ [NVME_SC_OP_DENIED] = "Operation Denied",
+ [NVME_SC_SGL_INVALID_OFFSET] = "SGL Offset Invalid",
+ [NVME_SC_RESERVED] = "Reserved",
+ [NVME_SC_HOST_ID_INCONSIST] = "Host Identifier Inconsistent Format",
+ [NVME_SC_KA_TIMEOUT_EXPIRED] = "Keep Alive Timeout Expired",
+ [NVME_SC_KA_TIMEOUT_INVALID] = "Keep Alive Timeout Invalid",
+ [NVME_SC_ABORTED_PREEMPT_ABORT] = "Command Aborted due to Preempt and Abort",
+ [NVME_SC_SANITIZE_FAILED] = "Sanitize Failed",
+ [NVME_SC_SANITIZE_IN_PROGRESS] = "Sanitize In Progress",
+ [NVME_SC_SGL_INVALID_GRANULARITY] = "SGL Data Block Granularity Invalid",
+ [NVME_SC_CMD_NOT_SUP_CMB_QUEUE] = "Command Not Supported for Queue in CMB",
+ [NVME_SC_NS_WRITE_PROTECTED] = "Namespace is Write Protected",
+ [NVME_SC_CMD_INTERRUPTED] = "Command Interrupted",
+ [NVME_SC_TRANSIENT_TR_ERR] = "Transient Transport Error",
+ [NVME_SC_INVALID_IO_CMD_SET] = "Invalid IO Command Set",
+ [NVME_SC_LBA_RANGE] = "LBA Out of Range",
+ [NVME_SC_CAP_EXCEEDED] = "Capacity Exceeded",
+ [NVME_SC_NS_NOT_READY] = "Namespace Not Ready",
+ [NVME_SC_RESERVATION_CONFLICT] = "Reservation Conflict",
+ [NVME_SC_FORMAT_IN_PROGRESS] = "Format In Progress",
+ [NVME_SC_CQ_INVALID] = "Completion Queue Invalid",
+ [NVME_SC_QID_INVALID] = "Invalid Queue Identifier",
+ [NVME_SC_QUEUE_SIZE] = "Invalid Queue Size",
+ [NVME_SC_ABORT_LIMIT] = "Abort Command Limit Exceeded",
+ [NVME_SC_ABORT_MISSING] = "Reserved", /* XXX */
+ [NVME_SC_ASYNC_LIMIT] = "Asynchronous Event Request Limit Exceeded",
+ [NVME_SC_FIRMWARE_SLOT] = "Invalid Firmware Slot",
+ [NVME_SC_FIRMWARE_IMAGE] = "Invalid Firmware Image",
+ [NVME_SC_INVALID_VECTOR] = "Invalid Interrupt Vector",
+ [NVME_SC_INVALID_LOG_PAGE] = "Invalid Log Page",
+ [NVME_SC_INVALID_FORMAT] = "Invalid Format",
+ [NVME_SC_FW_NEEDS_CONV_RESET] = "Firmware Activation Requires Conventional Reset",
+ [NVME_SC_INVALID_QUEUE] = "Invalid Queue Deletion",
+ [NVME_SC_FEATURE_NOT_SAVEABLE] = "Feature Identifier Not Saveable",
+ [NVME_SC_FEATURE_NOT_CHANGEABLE] = "Feature Not Changeable",
+ [NVME_SC_FEATURE_NOT_PER_NS] = "Feature Not Namespace Specific",
+ [NVME_SC_FW_NEEDS_SUBSYS_RESET] = "Firmware Activation Requires NVM Subsystem Reset",
+ [NVME_SC_FW_NEEDS_RESET] = "Firmware Activation Requires Reset",
+ [NVME_SC_FW_NEEDS_MAX_TIME] = "Firmware Activation Requires Maximum Time Violation",
+ [NVME_SC_FW_ACTIVATE_PROHIBITED] = "Firmware Activation Prohibited",
+ [NVME_SC_OVERLAPPING_RANGE] = "Overlapping Range",
+ [NVME_SC_NS_INSUFFICIENT_CAP] = "Namespace Insufficient Capacity",
+ [NVME_SC_NS_ID_UNAVAILABLE] = "Namespace Identifier Unavailable",
+ [NVME_SC_NS_ALREADY_ATTACHED] = "Namespace Already Attached",
+ [NVME_SC_NS_IS_PRIVATE] = "Namespace Is Private",
+ [NVME_SC_NS_NOT_ATTACHED] = "Namespace Not Attached",
+ [NVME_SC_THIN_PROV_NOT_SUPP] = "Thin Provisioning Not Supported",
+ [NVME_SC_CTRL_LIST_INVALID] = "Controller List Invalid",
+ [NVME_SC_SELT_TEST_IN_PROGRESS] = "Device Self-test In Progress",
+ [NVME_SC_BP_WRITE_PROHIBITED] = "Boot Partition Write Prohibited",
+ [NVME_SC_CTRL_ID_INVALID] = "Invalid Controller Identifier",
+ [NVME_SC_SEC_CTRL_STATE_INVALID] = "Invalid Secondary Controller State",
+ [NVME_SC_CTRL_RES_NUM_INVALID] = "Invalid Number of Controller Resources",
+ [NVME_SC_RES_ID_INVALID] = "Invalid Resource Identifier",
+ [NVME_SC_PMR_SAN_PROHIBITED] = "Sanitize Prohibited",
+ [NVME_SC_ANA_GROUP_ID_INVALID] = "ANA Group Identifier Invalid",
+ [NVME_SC_ANA_ATTACH_FAILED] = "ANA Attach Failed",
+ [NVME_SC_BAD_ATTRIBUTES] = "Conflicting Attributes",
+ [NVME_SC_INVALID_PI] = "Invalid Protection Information",
+ [NVME_SC_READ_ONLY] = "Attempted Write to Read Only Range",
+ [NVME_SC_ONCS_NOT_SUPPORTED] = "ONCS Not Supported",
+ [NVME_SC_ZONE_BOUNDARY_ERROR] = "Zoned Boundary Error",
+ [NVME_SC_ZONE_FULL] = "Zone Is Full",
+ [NVME_SC_ZONE_READ_ONLY] = "Zone Is Read Only",
+ [NVME_SC_ZONE_OFFLINE] = "Zone Is Offline",
+ [NVME_SC_ZONE_INVALID_WRITE] = "Zone Invalid Write",
+ [NVME_SC_ZONE_TOO_MANY_ACTIVE] = "Too Many Active Zones",
+ [NVME_SC_ZONE_TOO_MANY_OPEN] = "Too Many Open Zones",
+ [NVME_SC_ZONE_INVALID_TRANSITION] = "Invalid Zone State Transition",
+ [NVME_SC_WRITE_FAULT] = "Write Fault",
+ [NVME_SC_READ_ERROR] = "Unrecovered Read Error",
+ [NVME_SC_GUARD_CHECK] = "End-to-end Guard Check Error",
+ [NVME_SC_APPTAG_CHECK] = "End-to-end Application Tag Check Error",
+ [NVME_SC_REFTAG_CHECK] = "End-to-end Reference Tag Check Error",
+ [NVME_SC_COMPARE_FAILED] = "Compare Failure",
+ [NVME_SC_ACCESS_DENIED] = "Access Denied",
+ [NVME_SC_UNWRITTEN_BLOCK] = "Deallocated or Unwritten Logical Block",
+ [NVME_SC_ANA_PERSISTENT_LOSS] = "Asymmetric Access Persistent Loss",
+ [NVME_SC_ANA_INACCESSIBLE] = "Asymmetric Access Inaccessible",
+ [NVME_SC_ANA_TRANSITION] = "Asymmetric Access Transition",
+ [NVME_SC_HOST_PATH_ERROR] = "Host Pathing Error",
+};
+#endif /* CONFIG_NVME_VERBOSE_ERRORS */
+
+void nvme_error_log(struct request *req)
+{
+ struct nvme_ns *ns = req->q->queuedata;
+ struct nvme_request *nr = nvme_req(req);
+ const unsigned char *op_str = "I/O Cmd";
+ const unsigned char *admin_op_str = "Admin Cmd";
+ const unsigned char *err_str = NULL;
+#ifdef CONFIG_NVME_VERBOSE_ERRORS
+ const struct nvme_string_table *entry;
+ unsigned int i;
+
+ if ((nr->status & 0x7ff) <= ARRAY_SIZE(nvme_errors))
+ err_str = nvme_errors[nr->status & 0x7ff];
+#endif
+ if (err_str == NULL)
+ err_str = "I/O Error";
+
+ if (ns) {
+#ifdef CONFIG_NVME_VERBOSE_ERRORS
+ for (i = 0, entry = nvme_ops ; i < ARRAY_SIZE(nvme_ops) ; i++)
+ if (entry[i].code == nr->cmd->common.opcode)
+ op_str = entry[i].string;
+#endif
+ pr_err_ratelimited("%s: %s(0x%x) @ LBA %llu, %llu blocks, %s (sct 0x%x / sc 0x%x) %s%s\n",
+ ns->disk ? ns->disk->disk_name : "?",
+ op_str,
+ nr->cmd->common.opcode,
+ (unsigned long long)nvme_sect_to_lba(ns, blk_rq_pos(req)),
+ (unsigned long long)blk_rq_bytes(req) >> ns->lba_shift,
+ err_str,
+ nr->status >> 8 & 7, /* Status Code Type */
+ nr->status & 0xff, /* Status Code */
+ nr->status & NVME_SC_MORE ? "MORE " : "",
+ nr->status & NVME_SC_DNR ? "DNR " : "");
+ } else {
+#ifdef CONFIG_NVME_VERBOSE_ERRORS
+ for (i = 0, entry = nvme_admin_ops ; i < ARRAY_SIZE(nvme_admin_ops) ; i++)
+ if (entry[i].code == nr->cmd->common.opcode)
+ admin_op_str = entry[i].string;
+#endif
+ pr_err_ratelimited("%s: %s(0x%x), %s (sct 0x%x / sc 0x%x) %s%s\n",
+ dev_name(nr->ctrl->device),
+ admin_op_str,
+ nr->cmd->common.opcode,
+ err_str,
+ nr->status >> 8 & 7, /* Status Code Type */
+ nr->status & 0xff, /* Status Code */
+ nr->status & NVME_SC_MORE ? "MORE " : "",
+ nr->status & NVME_SC_DNR ? "DNR " : "");
+ }
+}
+EXPORT_SYMBOL_GPL(nvme_error_log);
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index a162f6c6da6e..3216156b71b2 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -930,4 +930,6 @@ static inline bool nvme_multi_css(struct nvme_ctrl *ctrl)
return (ctrl->ctrl_config & NVME_CC_CSS_MASK) == NVME_CC_CSS_CSI;
}
+extern void nvme_error_log(struct request *req);
+
#endif /* _NVME_H */
diff --git a/include/linux/nvme.h b/include/linux/nvme.h
index 855dd9b3e84b..1f946e5bf7c1 100644
--- a/include/linux/nvme.h
+++ b/include/linux/nvme.h
@@ -1636,6 +1636,7 @@ enum {
NVME_SC_HOST_ABORTED_CMD = 0x371,
NVME_SC_CRD = 0x1800,
+ NVME_SC_MORE = 0x2000,
NVME_SC_DNR = 0x4000,
};
--
2.27.0
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH V4 1/1] nvme: Add verbose error logging
2022-01-24 18:51 ` [PATCH V4 1/1] " Alan Adamson
@ 2022-01-24 20:20 ` Keith Busch
2022-01-24 21:22 ` Alan Adamson
2022-01-25 10:02 ` Sagi Grimberg
2022-01-28 16:40 ` Hannes Reinecke
2 siblings, 1 reply; 8+ messages in thread
From: Keith Busch @ 2022-01-24 20:20 UTC (permalink / raw)
To: Alan Adamson; +Cc: linux-nvme, hch, sagi
On Mon, Jan 24, 2022 at 10:51:44AM -0800, Alan Adamson wrote:
> Improves logging of NVMe errors. If NVME_VERBOSE_ERRORS is configured,
> a verbose description of the error is logged, otherwise only status codes/bits
> is logged.
>
> Verbose logging examples:
> [ 183.333734] nvme0n1: Read(0x2) @ LBA 0, 1 blocks, Unrecovered Read Error (sct 0x2 / sc 0x81) DNR
>
> [ 227.767945] nvme0: Activate Firmware(0x10), Invalid Field in Command (sct 0x0 / sc 0x2) DNR
>
> Non-verbose logging examples:
> [ 41.280509] nvme0n1: I/O Cmd(0x2) @ LBA 0, 1 blocks, I/O Error (sct 0x2 / sc 0x81) DNR
>
> [ 77.249153] nvme0: Admin Cmd(0x10), I/O Error (sct 0x0 / sc 0x2) DNR
>
> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
> Signed-off-by: Alan Adamson <alan.adamson@oracle.com>
> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Looks good to me.
Reviewed-by: Keith Busch <kbusch@kernel.org>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH V4 1/1] nvme: Add verbose error logging
2022-01-24 20:20 ` Keith Busch
@ 2022-01-24 21:22 ` Alan Adamson
0 siblings, 0 replies; 8+ messages in thread
From: Alan Adamson @ 2022-01-24 21:22 UTC (permalink / raw)
To: Keith Busch; +Cc: linux-nvme, Christoph Hellwig, sagi
Thanks for all the feedback. I think the patch evolved nicely.
Alan
> On Jan 24, 2022, at 12:20 PM, Keith Busch <kbusch@kernel.org> wrote:
>
> On Mon, Jan 24, 2022 at 10:51:44AM -0800, Alan Adamson wrote:
>> Improves logging of NVMe errors. If NVME_VERBOSE_ERRORS is configured,
>> a verbose description of the error is logged, otherwise only status codes/bits
>> is logged.
>>
>> Verbose logging examples:
>> [ 183.333734] nvme0n1: Read(0x2) @ LBA 0, 1 blocks, Unrecovered Read Error (sct 0x2 / sc 0x81) DNR
>>
>> [ 227.767945] nvme0: Activate Firmware(0x10), Invalid Field in Command (sct 0x0 / sc 0x2) DNR
>>
>> Non-verbose logging examples:
>> [ 41.280509] nvme0n1: I/O Cmd(0x2) @ LBA 0, 1 blocks, I/O Error (sct 0x2 / sc 0x81) DNR
>>
>> [ 77.249153] nvme0: Admin Cmd(0x10), I/O Error (sct 0x0 / sc 0x2) DNR
>>
>> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
>> Signed-off-by: Alan Adamson <alan.adamson@oracle.com>
>> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
>
> Looks good to me.
>
> Reviewed-by: Keith Busch <kbusch@kernel.org>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH V4 1/1] nvme: Add verbose error logging
2022-01-24 18:51 ` [PATCH V4 1/1] " Alan Adamson
2022-01-24 20:20 ` Keith Busch
@ 2022-01-25 10:02 ` Sagi Grimberg
[not found] ` <BB566CE6-19C1-472F-978B-564A2EC4C4EC@oracle.com>
2022-01-28 16:40 ` Hannes Reinecke
2 siblings, 1 reply; 8+ messages in thread
From: Sagi Grimberg @ 2022-01-25 10:02 UTC (permalink / raw)
To: Alan Adamson, linux-nvme; +Cc: kbusch, hch
On 1/24/22 20:51, Alan Adamson wrote:
> Improves logging of NVMe errors. If NVME_VERBOSE_ERRORS is configured,
> a verbose description of the error is logged, otherwise only status codes/bits
> is logged.
>
> Verbose logging examples:
> [ 183.333734] nvme0n1: Read(0x2) @ LBA 0, 1 blocks, Unrecovered Read Error (sct 0x2 / sc 0x81) DNR
>
> [ 227.767945] nvme0: Activate Firmware(0x10), Invalid Field in Command (sct 0x0 / sc 0x2) DNR
>
> Non-verbose logging examples:
> [ 41.280509] nvme0n1: I/O Cmd(0x2) @ LBA 0, 1 blocks, I/O Error (sct 0x2 / sc 0x81) DNR
>
> [ 77.249153] nvme0: Admin Cmd(0x10), I/O Error (sct 0x0 / sc 0x2) DNR
>
> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
> Signed-off-by: Alan Adamson <alan.adamson@oracle.com>
> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
> ---
> drivers/nvme/host/Kconfig | 8 ++
> drivers/nvme/host/Makefile | 2 +-
> drivers/nvme/host/core.c | 3 +
> drivers/nvme/host/errors.c | 230 +++++++++++++++++++++++++++++++++++++
> drivers/nvme/host/nvme.h | 2 +
> include/linux/nvme.h | 1 +
> 6 files changed, 245 insertions(+), 1 deletion(-)
> create mode 100644 drivers/nvme/host/errors.c
>
> diff --git a/drivers/nvme/host/Kconfig b/drivers/nvme/host/Kconfig
> index dc0450ca23a3..d6d056963c06 100644
> --- a/drivers/nvme/host/Kconfig
> +++ b/drivers/nvme/host/Kconfig
> @@ -24,6 +24,14 @@ config NVME_MULTIPATH
> /dev/nvmeXnY device will show up for each NVMe namespace,
> even if it is accessible through multiple controllers.
>
> +config NVME_VERBOSE_ERRORS
> + bool "NVMe verbose error reporting"
> + depends on NVME_CORE
> + help
> + This option enables verbose reporting for NVMe errors. The
> + error translation table will grow the kernel image size by
> + about 4 KB.
> +
> config NVME_HWMON
> bool "NVMe hardware monitoring"
> depends on (NVME_CORE=y && HWMON=y) || (NVME_CORE=m && HWMON)
> diff --git a/drivers/nvme/host/Makefile b/drivers/nvme/host/Makefile
> index dfaacd472e5d..ea3d702feb51 100644
> --- a/drivers/nvme/host/Makefile
> +++ b/drivers/nvme/host/Makefile
> @@ -9,7 +9,7 @@ obj-$(CONFIG_NVME_RDMA) += nvme-rdma.o
> obj-$(CONFIG_NVME_FC) += nvme-fc.o
> obj-$(CONFIG_NVME_TCP) += nvme-tcp.o
>
> -nvme-core-y := core.o ioctl.o
> +nvme-core-y := core.o ioctl.o errors.o
> nvme-core-$(CONFIG_TRACING) += trace.o
> nvme-core-$(CONFIG_NVME_MULTIPATH) += multipath.o
> nvme-core-$(CONFIG_BLK_DEV_ZONED) += zns.o
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 5e0bfda04bd7..ca5d36fcf4f8 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -354,6 +354,9 @@ void nvme_complete_rq(struct request *req)
>
> switch (nvme_decide_disposition(req)) {
> case COMPLETE:
> + if (unlikely(nvme_req(req)->status != NVME_SC_SUCCESS))
> + nvme_error_log(req);
> +
> nvme_end_req(req);
> return;
> case RETRY:
> diff --git a/drivers/nvme/host/errors.c b/drivers/nvme/host/errors.c
> new file mode 100644
> index 000000000000..446831fc32ea
> --- /dev/null
> +++ b/drivers/nvme/host/errors.c
> @@ -0,0 +1,230 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * NVM Express device driver verbose errors
> + * Copyright (c) 2021, Oracle and/or its affiliates
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
> + * more details.
> + */
> +
> +#include <linux/blkdev.h>
> +#include "nvme.h"
> +
> +struct nvme_string_table {
> + unsigned int code;
> + const unsigned char *string;
> +};
> +
> +#ifdef CONFIG_NVME_VERBOSE_ERRORS
> +static const struct nvme_string_table nvme_ops[] = {
> + { nvme_cmd_flush, "Flush" },
> + { nvme_cmd_write, "Write" },
> + { nvme_cmd_read, "Read" },
> + { nvme_cmd_write_uncor, "Write Uncorrectable" },
> + { nvme_cmd_compare, "Compare" },
> + { nvme_cmd_write_zeroes, "Write Zeros" },
> + { nvme_cmd_dsm, "Deallocate (DSM)" },
> + { nvme_cmd_verify, "Verify" },
> + { nvme_cmd_resv_register, "Reservation Register" },
> + { nvme_cmd_resv_report, "Reservation Report" },
> + { nvme_cmd_resv_acquire, "Reservation Acquire" },
> + { nvme_cmd_resv_release, "Reservation Release" },
> + { nvme_cmd_zone_mgmt_send, "Zone Management Send" },
> + { nvme_cmd_zone_mgmt_recv, "Zone Management Receive" },
> + { nvme_cmd_zone_append, "Zone Management Append" },
> +};
> +
> +static const struct nvme_string_table nvme_admin_ops[] = {
> + { nvme_admin_delete_sq, "Delete SQ" },
> + { nvme_admin_create_sq, "Create SQ" },
> + { nvme_admin_get_log_page, "Get Log Page" },
> + { nvme_admin_delete_cq, "Delete CQ" },
> + { nvme_admin_create_cq, "Create CQ" },
> + { nvme_admin_identify, "Identify" },
> + { nvme_admin_abort_cmd, "Abort Command" },
> + { nvme_admin_set_features, "Set Features" },
> + { nvme_admin_get_features, "Get Features" },
> + { nvme_admin_async_event, "Async Event" },
> + { nvme_admin_ns_mgmt, "Namespace Management" },
> + { nvme_admin_activate_fw, "Activate Firmware" },
> + { nvme_admin_download_fw, "Download Firmware" },
> + { nvme_admin_dev_self_test, "Device Self Test" },
> + { nvme_admin_ns_attach, "Namespace Attach" },
> + { nvme_admin_keep_alive, "Keep Alive" },
> + { nvme_admin_directive_send, "Directive Send" },
> + { nvme_admin_directive_recv, "Directive Receive" },
> + { nvme_admin_virtual_mgmt, "Virtual Management" },
> + { nvme_admin_nvme_mi_send, "NVMe Send MI" },
> + { nvme_admin_nvme_mi_recv, "NVMe Receive MI" },
> + { nvme_admin_dbbuf, "Doorbell Buffer Config" },
> + { nvme_admin_format_nvm, "Format NVM" },
> + { nvme_admin_security_send, "Security Send" },
> + { nvme_admin_security_recv, "Security Receive" },
> + { nvme_admin_sanitize_nvm, "Sanitize NVM" },
> + { nvme_admin_get_lba_status, "Get LBA Status" },
> +};
> +
> +static const char * const nvme_errors[] = {
> + [NVME_SC_SUCCESS] = "Success",
Given that you have success here, maybe a better name for the array
is nvme_statuses.
> + [NVME_SC_INVALID_OPCODE] = "Invalid Command Opcode",
> + [NVME_SC_INVALID_FIELD] = "Invalid Field in Command",
> + [NVME_SC_CMDID_CONFLICT] = "Command ID Conflict",
> + [NVME_SC_DATA_XFER_ERROR] = "Data Transfer Error",
> + [NVME_SC_POWER_LOSS] = "Commands Aborted due to Power Loss Notification",
> + [NVME_SC_INTERNAL] = "Internal Error",
> + [NVME_SC_ABORT_REQ] = "Command Abort Requested",
> + [NVME_SC_ABORT_QUEUE] = "Command Aborted due to SQ Deletion",
> + [NVME_SC_FUSED_FAIL] = "Command Aborted due to Failed Fused Command",
> + [NVME_SC_FUSED_MISSING] = "Command Aborted due to Missing Fused Command",
> + [NVME_SC_INVALID_NS] = "Invalid Namespace or Format",
> + [NVME_SC_CMD_SEQ_ERROR] = "Command Sequence Error",
> + [NVME_SC_SGL_INVALID_LAST] = "Invalid SGL Segment Descriptor",
> + [NVME_SC_SGL_INVALID_COUNT] = "Invalid Number of SGL Descriptors",
> + [NVME_SC_SGL_INVALID_DATA] = "Data SGL Length Invalid",
> + [NVME_SC_SGL_INVALID_METADATA] = "Metadata SGL Length Invalid",
> + [NVME_SC_SGL_INVALID_TYPE] = "SGL Descriptor Type Invalid",
> + [NVME_SC_CMB_INVALID_USE] = "Invalid Use of Controller Memory Buffer",
> + [NVME_SC_PRP_INVALID_OFFSET] = "PRP Offset Invalid",
> + [NVME_SC_ATOMIC_WU_EXCEEDED] = "Atomic Write Unit Exceeded",
> + [NVME_SC_OP_DENIED] = "Operation Denied",
> + [NVME_SC_SGL_INVALID_OFFSET] = "SGL Offset Invalid",
> + [NVME_SC_RESERVED] = "Reserved",
> + [NVME_SC_HOST_ID_INCONSIST] = "Host Identifier Inconsistent Format",
> + [NVME_SC_KA_TIMEOUT_EXPIRED] = "Keep Alive Timeout Expired",
> + [NVME_SC_KA_TIMEOUT_INVALID] = "Keep Alive Timeout Invalid",
> + [NVME_SC_ABORTED_PREEMPT_ABORT] = "Command Aborted due to Preempt and Abort",
> + [NVME_SC_SANITIZE_FAILED] = "Sanitize Failed",
> + [NVME_SC_SANITIZE_IN_PROGRESS] = "Sanitize In Progress",
> + [NVME_SC_SGL_INVALID_GRANULARITY] = "SGL Data Block Granularity Invalid",
> + [NVME_SC_CMD_NOT_SUP_CMB_QUEUE] = "Command Not Supported for Queue in CMB",
> + [NVME_SC_NS_WRITE_PROTECTED] = "Namespace is Write Protected",
> + [NVME_SC_CMD_INTERRUPTED] = "Command Interrupted",
> + [NVME_SC_TRANSIENT_TR_ERR] = "Transient Transport Error",
> + [NVME_SC_INVALID_IO_CMD_SET] = "Invalid IO Command Set",
> + [NVME_SC_LBA_RANGE] = "LBA Out of Range",
> + [NVME_SC_CAP_EXCEEDED] = "Capacity Exceeded",
> + [NVME_SC_NS_NOT_READY] = "Namespace Not Ready",
> + [NVME_SC_RESERVATION_CONFLICT] = "Reservation Conflict",
> + [NVME_SC_FORMAT_IN_PROGRESS] = "Format In Progress",
> + [NVME_SC_CQ_INVALID] = "Completion Queue Invalid",
> + [NVME_SC_QID_INVALID] = "Invalid Queue Identifier",
> + [NVME_SC_QUEUE_SIZE] = "Invalid Queue Size",
> + [NVME_SC_ABORT_LIMIT] = "Abort Command Limit Exceeded",
> + [NVME_SC_ABORT_MISSING] = "Reserved", /* XXX */
> + [NVME_SC_ASYNC_LIMIT] = "Asynchronous Event Request Limit Exceeded",
> + [NVME_SC_FIRMWARE_SLOT] = "Invalid Firmware Slot",
> + [NVME_SC_FIRMWARE_IMAGE] = "Invalid Firmware Image",
> + [NVME_SC_INVALID_VECTOR] = "Invalid Interrupt Vector",
> + [NVME_SC_INVALID_LOG_PAGE] = "Invalid Log Page",
> + [NVME_SC_INVALID_FORMAT] = "Invalid Format",
> + [NVME_SC_FW_NEEDS_CONV_RESET] = "Firmware Activation Requires Conventional Reset",
> + [NVME_SC_INVALID_QUEUE] = "Invalid Queue Deletion",
> + [NVME_SC_FEATURE_NOT_SAVEABLE] = "Feature Identifier Not Saveable",
> + [NVME_SC_FEATURE_NOT_CHANGEABLE] = "Feature Not Changeable",
> + [NVME_SC_FEATURE_NOT_PER_NS] = "Feature Not Namespace Specific",
> + [NVME_SC_FW_NEEDS_SUBSYS_RESET] = "Firmware Activation Requires NVM Subsystem Reset",
> + [NVME_SC_FW_NEEDS_RESET] = "Firmware Activation Requires Reset",
> + [NVME_SC_FW_NEEDS_MAX_TIME] = "Firmware Activation Requires Maximum Time Violation",
> + [NVME_SC_FW_ACTIVATE_PROHIBITED] = "Firmware Activation Prohibited",
> + [NVME_SC_OVERLAPPING_RANGE] = "Overlapping Range",
> + [NVME_SC_NS_INSUFFICIENT_CAP] = "Namespace Insufficient Capacity",
> + [NVME_SC_NS_ID_UNAVAILABLE] = "Namespace Identifier Unavailable",
> + [NVME_SC_NS_ALREADY_ATTACHED] = "Namespace Already Attached",
> + [NVME_SC_NS_IS_PRIVATE] = "Namespace Is Private",
> + [NVME_SC_NS_NOT_ATTACHED] = "Namespace Not Attached",
> + [NVME_SC_THIN_PROV_NOT_SUPP] = "Thin Provisioning Not Supported",
> + [NVME_SC_CTRL_LIST_INVALID] = "Controller List Invalid",
> + [NVME_SC_SELT_TEST_IN_PROGRESS] = "Device Self-test In Progress",
> + [NVME_SC_BP_WRITE_PROHIBITED] = "Boot Partition Write Prohibited",
> + [NVME_SC_CTRL_ID_INVALID] = "Invalid Controller Identifier",
> + [NVME_SC_SEC_CTRL_STATE_INVALID] = "Invalid Secondary Controller State",
> + [NVME_SC_CTRL_RES_NUM_INVALID] = "Invalid Number of Controller Resources",
> + [NVME_SC_RES_ID_INVALID] = "Invalid Resource Identifier",
> + [NVME_SC_PMR_SAN_PROHIBITED] = "Sanitize Prohibited",
> + [NVME_SC_ANA_GROUP_ID_INVALID] = "ANA Group Identifier Invalid",
> + [NVME_SC_ANA_ATTACH_FAILED] = "ANA Attach Failed",
> + [NVME_SC_BAD_ATTRIBUTES] = "Conflicting Attributes",
> + [NVME_SC_INVALID_PI] = "Invalid Protection Information",
> + [NVME_SC_READ_ONLY] = "Attempted Write to Read Only Range",
> + [NVME_SC_ONCS_NOT_SUPPORTED] = "ONCS Not Supported",
> + [NVME_SC_ZONE_BOUNDARY_ERROR] = "Zoned Boundary Error",
> + [NVME_SC_ZONE_FULL] = "Zone Is Full",
> + [NVME_SC_ZONE_READ_ONLY] = "Zone Is Read Only",
> + [NVME_SC_ZONE_OFFLINE] = "Zone Is Offline",
> + [NVME_SC_ZONE_INVALID_WRITE] = "Zone Invalid Write",
> + [NVME_SC_ZONE_TOO_MANY_ACTIVE] = "Too Many Active Zones",
> + [NVME_SC_ZONE_TOO_MANY_OPEN] = "Too Many Open Zones",
> + [NVME_SC_ZONE_INVALID_TRANSITION] = "Invalid Zone State Transition",
> + [NVME_SC_WRITE_FAULT] = "Write Fault",
> + [NVME_SC_READ_ERROR] = "Unrecovered Read Error",
> + [NVME_SC_GUARD_CHECK] = "End-to-end Guard Check Error",
> + [NVME_SC_APPTAG_CHECK] = "End-to-end Application Tag Check Error",
> + [NVME_SC_REFTAG_CHECK] = "End-to-end Reference Tag Check Error",
> + [NVME_SC_COMPARE_FAILED] = "Compare Failure",
> + [NVME_SC_ACCESS_DENIED] = "Access Denied",
> + [NVME_SC_UNWRITTEN_BLOCK] = "Deallocated or Unwritten Logical Block",
> + [NVME_SC_ANA_PERSISTENT_LOSS] = "Asymmetric Access Persistent Loss",
> + [NVME_SC_ANA_INACCESSIBLE] = "Asymmetric Access Inaccessible",
> + [NVME_SC_ANA_TRANSITION] = "Asymmetric Access Transition",
> + [NVME_SC_HOST_PATH_ERROR] = "Host Pathing Error",
> +};
> +#endif /* CONFIG_NVME_VERBOSE_ERRORS */
> +
> +void nvme_error_log(struct request *req)
> +{
> + struct nvme_ns *ns = req->q->queuedata;
> + struct nvme_request *nr = nvme_req(req);
> + const unsigned char *op_str = "I/O Cmd";
> + const unsigned char *admin_op_str = "Admin Cmd";
> + const unsigned char *err_str = NULL;
> +#ifdef CONFIG_NVME_VERBOSE_ERRORS
> + const struct nvme_string_table *entry;
> + unsigned int i;
> +
> + if ((nr->status & 0x7ff) <= ARRAY_SIZE(nvme_errors))
> + err_str = nvme_errors[nr->status & 0x7ff];
Maybe we need:
if ((nr->status & 0x7ff) <= ARRAY_SIZE(nvme_errors) &&
nvme_errors[nr->status & 0x7ff] != NULL)
err_str = nvme_errors[nr->status & 0x7ff];
else
err_str = "Unknown"
btw, shouldn't the index check be "<" and not "<=" ?
We are looking to see if this is a valid entry don't we?
Array of size 1000 should only accept access to entries 0-999...
> +#endif
> + if (err_str == NULL)
> + err_str = "I/O Error";
> +
> + if (ns) {
> +#ifdef CONFIG_NVME_VERBOSE_ERRORS
> + for (i = 0, entry = nvme_ops ; i < ARRAY_SIZE(nvme_ops) ; i++)
> + if (entry[i].code == nr->cmd->common.opcode)
> + op_str = entry[i].string;
Same here, the loop looks redundant to me:
if (nr->cmd->common.opcode <= ARRAY_SIZE(nvme_ops) &&
nvme_ops[nr->cmd->common.opcode] != NULL)
op_str = nvme_ops[nr->cmd->common.opcode].string;
else
op_str = "Unknown"
> +#endif
> + pr_err_ratelimited("%s: %s(0x%x) @ LBA %llu, %llu blocks, %s (sct 0x%x / sc 0x%x) %s%s\n",
> + ns->disk ? ns->disk->disk_name : "?",
> + op_str,
> + nr->cmd->common.opcode,
> + (unsigned long long)nvme_sect_to_lba(ns, blk_rq_pos(req)),
> + (unsigned long long)blk_rq_bytes(req) >> ns->lba_shift,
> + err_str,
> + nr->status >> 8 & 7, /* Status Code Type */
> + nr->status & 0xff, /* Status Code */
> + nr->status & NVME_SC_MORE ? "MORE " : "",
> + nr->status & NVME_SC_DNR ? "DNR " : "");
> + } else {
> +#ifdef CONFIG_NVME_VERBOSE_ERRORS
> + for (i = 0, entry = nvme_admin_ops ; i < ARRAY_SIZE(nvme_admin_ops) ; i++)
> + if (entry[i].code == nr->cmd->common.opcode)
> + admin_op_str = entry[i].string;
Same practice here.
> +#endif
> + pr_err_ratelimited("%s: %s(0x%x), %s (sct 0x%x / sc 0x%x) %s%s\n",
> + dev_name(nr->ctrl->device),
> + admin_op_str,
> + nr->cmd->common.opcode,
> + err_str,
> + nr->status >> 8 & 7, /* Status Code Type */
> + nr->status & 0xff, /* Status Code */
> + nr->status & NVME_SC_MORE ? "MORE " : "",
> + nr->status & NVME_SC_DNR ? "DNR " : "");
> + }
> +}
> +EXPORT_SYMBOL_GPL(nvme_error_log);
> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> index a162f6c6da6e..3216156b71b2 100644
> --- a/drivers/nvme/host/nvme.h
> +++ b/drivers/nvme/host/nvme.h
> @@ -930,4 +930,6 @@ static inline bool nvme_multi_css(struct nvme_ctrl *ctrl)
> return (ctrl->ctrl_config & NVME_CC_CSS_MASK) == NVME_CC_CSS_CSI;
> }
>
> +extern void nvme_error_log(struct request *req);
> +
> #endif /* _NVME_H */
> diff --git a/include/linux/nvme.h b/include/linux/nvme.h
> index 855dd9b3e84b..1f946e5bf7c1 100644
> --- a/include/linux/nvme.h
> +++ b/include/linux/nvme.h
> @@ -1636,6 +1636,7 @@ enum {
> NVME_SC_HOST_ABORTED_CMD = 0x371,
>
> NVME_SC_CRD = 0x1800,
> + NVME_SC_MORE = 0x2000,
> NVME_SC_DNR = 0x4000,
> };
>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH V4 1/1] nvme: Add verbose error logging
[not found] ` <BB566CE6-19C1-472F-978B-564A2EC4C4EC@oracle.com>
@ 2022-01-25 22:54 ` Sagi Grimberg
2022-01-25 23:18 ` Alan Adamson
0 siblings, 1 reply; 8+ messages in thread
From: Sagi Grimberg @ 2022-01-25 22:54 UTC (permalink / raw)
To: Alan Adamson; +Cc: linux-nvme, kbusch, hch
> +#endif
> + if (err_str == NULL)
> + err_str = "I/O Error";
> +
> + if (ns) {
> +#ifdef CONFIG_NVME_VERBOSE_ERRORS
> + for (i = 0, entry = nvme_ops ; i < ARRAY_SIZE(nvme_ops) ; i++)
> + if (entry[i].code == nr->cmd->common.opcode)
> + op_str = entry[i].string;
>
> Same here, the loop looks redundant to me:
> if (nr->cmd->common.opcode <= ARRAY_SIZE(nvme_ops) &&
> nvme_ops[nr->cmd->common.opcode] != NULL)
> op_str = nvme_ops[nr->cmd->common.opcode].string;
> else
> op_str = "Unknown"
>
>
> Currently the nvme_ops and nvme_admin_ops arrays are not built by index so I need to walk the array to find the
> appropriate opcode/name. Are you asking to change these arrays to be build by index?
>
> static const char * const nvme_admin_ops[] = {
> [nvme_admin_delete_sq] = "Delete SQ",
> [nvme_admin_create_sq] = "Create SQ",
> [nvme_admin_get_log_page] = "Get Log Page”,
>
> Rather than:
>
> static const struct nvme_string_table nvme_admin_ops[] = {
> { nvme_admin_delete_sq, "Delete SQ" },
> { nvme_admin_create_sq, "Create SQ" },
> { nvme_admin_get_log_page, "Get Log Page" },
Yes, is that a problem?
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH V4 1/1] nvme: Add verbose error logging
2022-01-25 22:54 ` Sagi Grimberg
@ 2022-01-25 23:18 ` Alan Adamson
0 siblings, 0 replies; 8+ messages in thread
From: Alan Adamson @ 2022-01-25 23:18 UTC (permalink / raw)
To: Sagi Grimberg; +Cc: linux-nvme, kbusch, hch
> On Jan 25, 2022, at 2:54 PM, Sagi Grimberg <sagi@grimberg.me> wrote:
>
>
>> +#endif
>> + if (err_str == NULL)
>> + err_str = "I/O Error";
>> +
>> + if (ns) {
>> +#ifdef CONFIG_NVME_VERBOSE_ERRORS
>> + for (i = 0, entry = nvme_ops ; i < ARRAY_SIZE(nvme_ops) ; i++)
>> + if (entry[i].code == nr->cmd->common.opcode)
>> + op_str = entry[i].string;
>> Same here, the loop looks redundant to me:
>> if (nr->cmd->common.opcode <= ARRAY_SIZE(nvme_ops) &&
>> nvme_ops[nr->cmd->common.opcode] != NULL)
>> op_str = nvme_ops[nr->cmd->common.opcode].string;
>> else
>> op_str = "Unknown"
>> Currently the nvme_ops and nvme_admin_ops arrays are not built by index so I need to walk the array to find the
>> appropriate opcode/name. Are you asking to change these arrays to be build by index?
>> static const char * const nvme_admin_ops[] = {
>> [nvme_admin_delete_sq] = "Delete SQ",
>> [nvme_admin_create_sq] = "Create SQ",
>> [nvme_admin_get_log_page] = "Get Log Page”,
>> Rather than:
>> static const struct nvme_string_table nvme_admin_ops[] = {
>> { nvme_admin_delete_sq, "Delete SQ" },
>> { nvme_admin_create_sq, "Create SQ" },
>> { nvme_admin_get_log_page, "Get Log Page" },
>
> Yes, is that a problem?
No, not a problem.
Alan
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH V4 1/1] nvme: Add verbose error logging
2022-01-24 18:51 ` [PATCH V4 1/1] " Alan Adamson
2022-01-24 20:20 ` Keith Busch
2022-01-25 10:02 ` Sagi Grimberg
@ 2022-01-28 16:40 ` Hannes Reinecke
2 siblings, 0 replies; 8+ messages in thread
From: Hannes Reinecke @ 2022-01-28 16:40 UTC (permalink / raw)
To: linux-nvme
On 1/24/22 19:51, Alan Adamson wrote:
> Improves logging of NVMe errors. If NVME_VERBOSE_ERRORS is configured,
> a verbose description of the error is logged, otherwise only status codes/bits
> is logged.
>
> Verbose logging examples:
> [ 183.333734] nvme0n1: Read(0x2) @ LBA 0, 1 blocks, Unrecovered Read Error (sct 0x2 / sc 0x81) DNR
>
> [ 227.767945] nvme0: Activate Firmware(0x10), Invalid Field in Command (sct 0x0 / sc 0x2) DNR
>
> Non-verbose logging examples:
> [ 41.280509] nvme0n1: I/O Cmd(0x2) @ LBA 0, 1 blocks, I/O Error (sct 0x2 / sc 0x81) DNR
>
> [ 77.249153] nvme0: Admin Cmd(0x10), I/O Error (sct 0x0 / sc 0x2) DNR
>
> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
> Signed-off-by: Alan Adamson <alan.adamson@oracle.com>
> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
> ---
> drivers/nvme/host/Kconfig | 8 ++
> drivers/nvme/host/Makefile | 2 +-
> drivers/nvme/host/core.c | 3 +
> drivers/nvme/host/errors.c | 230 +++++++++++++++++++++++++++++++++++++
> drivers/nvme/host/nvme.h | 2 +
> include/linux/nvme.h | 1 +
> 6 files changed, 245 insertions(+), 1 deletion(-)
> create mode 100644 drivers/nvme/host/errors.c
>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2022-01-28 16:40 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-24 18:51 [PATCH V4 0/1] nvme: Add verbose error logging Alan Adamson
2022-01-24 18:51 ` [PATCH V4 1/1] " Alan Adamson
2022-01-24 20:20 ` Keith Busch
2022-01-24 21:22 ` Alan Adamson
2022-01-25 10:02 ` Sagi Grimberg
[not found] ` <BB566CE6-19C1-472F-978B-564A2EC4C4EC@oracle.com>
2022-01-25 22:54 ` Sagi Grimberg
2022-01-25 23:18 ` Alan Adamson
2022-01-28 16:40 ` Hannes Reinecke
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.