All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/1] add smart_critical_warning property for nvme
@ 2021-01-12  7:49 zhenwei pi
  2021-01-12  7:49 ` [PATCH v2 1/1] hw/block/nvme: add smart_critical_warning property zhenwei pi
  0 siblings, 1 reply; 3+ messages in thread
From: zhenwei pi @ 2021-01-12  7:49 UTC (permalink / raw)
  To: kbusch, its, kwolf, mreitz; +Cc: zhenwei pi, philmd, qemu-devel, qemu-block

v1 -> v2:
Suggested by Philippe & Klaus, set/get smart_critical_warning by QMP.

v1:
Add smart_critical_warning for nvme device which can be set by QEMU
command line to emulate hardware error.

Zhenwei Pi (1):
  hw/block/nvme: add smart_critical_warning property

 hw/block/nvme.c | 28 ++++++++++++++++++++++++++++
 hw/block/nvme.h |  1 +
 2 files changed, 29 insertions(+)

-- 
2.25.1



^ permalink raw reply	[flat|nested] 3+ messages in thread

* [PATCH v2 1/1] hw/block/nvme: add smart_critical_warning property
  2021-01-12  7:49 [PATCH v2 0/1] add smart_critical_warning property for nvme zhenwei pi
@ 2021-01-12  7:49 ` zhenwei pi
  2021-01-12 10:39   ` Klaus Jensen
  0 siblings, 1 reply; 3+ messages in thread
From: zhenwei pi @ 2021-01-12  7:49 UTC (permalink / raw)
  To: kbusch, its, kwolf, mreitz; +Cc: zhenwei pi, philmd, qemu-devel, qemu-block

There is a very low probability that hitting physical NVMe disk
hardware critical warning case, it's hard to write & test a monitor
agent service.

For debugging purposes, add a new 'smart_critical_warning' property
to emulate this situation.

The orignal version of this change is implemented by adding a fixed
property which could be initialized by QEMU command line. Suggested
by Philippe & Klaus, rework like current version.

Test with this patch:
1, change smart_critical_warning property for a running VM:
 #virsh qemu-monitor-command nvme-upstream '{ "execute": "qom-set",
  "arguments": { "path": "/machine/peripheral-anon/device[0]",
  "property": "smart_critical_warning", "value":16 } }'
2, run smartctl in guest
 #smartctl -H -l error /dev/nvme0n1

  === START OF SMART DATA SECTION ===
  SMART overall-health self-assessment test result: FAILED!
  - volatile memory backup device has failed

Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
---
 hw/block/nvme.c | 28 ++++++++++++++++++++++++++++
 hw/block/nvme.h |  1 +
 2 files changed, 29 insertions(+)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 27d2c72716..a98757b6a1 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -1214,6 +1214,7 @@ static uint16_t nvme_smart_info(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
     }
 
     trans_len = MIN(sizeof(smart) - off, buf_len);
+    smart.critical_warning = n->smart_critical_warning;
 
     smart.data_units_read[0] = cpu_to_le64(DIV_ROUND_UP(stats.units_read,
                                                         1000));
@@ -2827,6 +2828,29 @@ static Property nvme_props[] = {
     DEFINE_PROP_END_OF_LIST(),
 };
 
+
+static void nvme_get_smart_warning(Object *obj, Visitor *v, const char *name,
+                                   void *opaque, Error **errp)
+{
+    NvmeCtrl *s = NVME(obj);
+    uint8_t value = s->smart_critical_warning;
+
+    visit_type_uint8(v, name, &value, errp);
+}
+
+static void nvme_set_smart_warning(Object *obj, Visitor *v, const char *name,
+                                   void *opaque, Error **errp)
+{
+    NvmeCtrl *s = NVME(obj);
+    uint8_t value;
+
+    if (!visit_type_uint8(v, name, &value, errp)) {
+        return;
+    }
+
+    s->smart_critical_warning = value;
+}
+
 static const VMStateDescription nvme_vmstate = {
     .name = "nvme",
     .unmigratable = 1,
@@ -2857,6 +2881,10 @@ static void nvme_instance_init(Object *obj)
                                       "bootindex", "/namespace@1,0",
                                       DEVICE(obj));
     }
+
+    object_property_add(obj, "smart_critical_warning", "uint8",
+                        nvme_get_smart_warning,
+                        nvme_set_smart_warning, NULL, NULL);
 }
 
 static const TypeInfo nvme_info = {
diff --git a/hw/block/nvme.h b/hw/block/nvme.h
index e080a2318a..64e3497244 100644
--- a/hw/block/nvme.h
+++ b/hw/block/nvme.h
@@ -139,6 +139,7 @@ typedef struct NvmeCtrl {
     uint64_t    timestamp_set_qemu_clock_ms;    /* QEMU clock time */
     uint64_t    starttime_ms;
     uint16_t    temperature;
+    uint8_t     smart_critical_warning;
 
     HostMemoryBackend *pmrdev;
 
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH v2 1/1] hw/block/nvme: add smart_critical_warning property
  2021-01-12  7:49 ` [PATCH v2 1/1] hw/block/nvme: add smart_critical_warning property zhenwei pi
@ 2021-01-12 10:39   ` Klaus Jensen
  0 siblings, 0 replies; 3+ messages in thread
From: Klaus Jensen @ 2021-01-12 10:39 UTC (permalink / raw)
  To: zhenwei pi; +Cc: kwolf, qemu-block, qemu-devel, mreitz, kbusch, philmd

[-- Attachment #1: Type: text/plain, Size: 3799 bytes --]

On Jan 12 15:49, zhenwei pi wrote:
> There is a very low probability that hitting physical NVMe disk
> hardware critical warning case, it's hard to write & test a monitor
> agent service.
> 
> For debugging purposes, add a new 'smart_critical_warning' property
> to emulate this situation.
> 
> The orignal version of this change is implemented by adding a fixed
> property which could be initialized by QEMU command line. Suggested
> by Philippe & Klaus, rework like current version.
> 
> Test with this patch:
> 1, change smart_critical_warning property for a running VM:
>  #virsh qemu-monitor-command nvme-upstream '{ "execute": "qom-set",
>   "arguments": { "path": "/machine/peripheral-anon/device[0]",
>   "property": "smart_critical_warning", "value":16 } }'
> 2, run smartctl in guest
>  #smartctl -H -l error /dev/nvme0n1
> 
>   === START OF SMART DATA SECTION ===
>   SMART overall-health self-assessment test result: FAILED!
>   - volatile memory backup device has failed
> 
> Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>

I think we also need to check the asynchronous event configuration and
issue an AEN if required like we do when the temperature threshold
changes.

Philippe, what are the locking semantics here? This runs under the big
lock?

> ---
>  hw/block/nvme.c | 28 ++++++++++++++++++++++++++++
>  hw/block/nvme.h |  1 +
>  2 files changed, 29 insertions(+)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index 27d2c72716..a98757b6a1 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -1214,6 +1214,7 @@ static uint16_t nvme_smart_info(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
>      }
>  
>      trans_len = MIN(sizeof(smart) - off, buf_len);
> +    smart.critical_warning = n->smart_critical_warning;
>  
>      smart.data_units_read[0] = cpu_to_le64(DIV_ROUND_UP(stats.units_read,
>                                                          1000));
> @@ -2827,6 +2828,29 @@ static Property nvme_props[] = {
>      DEFINE_PROP_END_OF_LIST(),
>  };
>  
> +
> +static void nvme_get_smart_warning(Object *obj, Visitor *v, const char *name,
> +                                   void *opaque, Error **errp)
> +{
> +    NvmeCtrl *s = NVME(obj);
> +    uint8_t value = s->smart_critical_warning;
> +
> +    visit_type_uint8(v, name, &value, errp);
> +}
> +
> +static void nvme_set_smart_warning(Object *obj, Visitor *v, const char *name,
> +                                   void *opaque, Error **errp)
> +{
> +    NvmeCtrl *s = NVME(obj);
> +    uint8_t value;
> +
> +    if (!visit_type_uint8(v, name, &value, errp)) {
> +        return;
> +    }
> +
> +    s->smart_critical_warning = value;
> +}
> +
>  static const VMStateDescription nvme_vmstate = {
>      .name = "nvme",
>      .unmigratable = 1,
> @@ -2857,6 +2881,10 @@ static void nvme_instance_init(Object *obj)
>                                        "bootindex", "/namespace@1,0",
>                                        DEVICE(obj));
>      }
> +
> +    object_property_add(obj, "smart_critical_warning", "uint8",
> +                        nvme_get_smart_warning,
> +                        nvme_set_smart_warning, NULL, NULL);
>  }
>  
>  static const TypeInfo nvme_info = {
> diff --git a/hw/block/nvme.h b/hw/block/nvme.h
> index e080a2318a..64e3497244 100644
> --- a/hw/block/nvme.h
> +++ b/hw/block/nvme.h
> @@ -139,6 +139,7 @@ typedef struct NvmeCtrl {
>      uint64_t    timestamp_set_qemu_clock_ms;    /* QEMU clock time */
>      uint64_t    starttime_ms;
>      uint16_t    temperature;
> +    uint8_t     smart_critical_warning;
>  
>      HostMemoryBackend *pmrdev;
>  
> -- 
> 2.25.1
> 
> 

-- 
One of us - No more doubt, silence or taboo about mental illness.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-01-12 10:54 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-12  7:49 [PATCH v2 0/1] add smart_critical_warning property for nvme zhenwei pi
2021-01-12  7:49 ` [PATCH v2 1/1] hw/block/nvme: add smart_critical_warning property zhenwei pi
2021-01-12 10:39   ` Klaus Jensen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.