All of lore.kernel.org
* [RFC PATCH 1/2] hw/nvme: add mi device
       [not found] <CGME20210709135651epcas5p1c544dec5377413bfa4b2eeab6ee43f26@epcas5p1.samsung.com>
@ 2021-07-09 13:55 ` Padmakar Kalghatgi
  2021-07-09 15:58   ` Keith Busch
  2021-07-12 11:03     ` Stefan Hajnoczi
  0 siblings, 2 replies; 13+ messages in thread
From: Padmakar Kalghatgi @ 2021-07-09 13:55 UTC (permalink / raw)
  To: qemu-devel
  Cc: fam, kwolf, jg123.choi, qemu-block, k.jensen, d.palani,
	linux-nvme, mreitz, its, u.kishore, stefanha, kbusch,
	javier.gonz, prakash.v, mohit.kap

[-- Attachment #1: Type: text/plain, Size: 34165 bytes --]

The enclosed patch implements a subset of the commands of the NVMe-MI
specification. The MI commands are useful to manage/configure/monitor
the device. Even though the MI commands can be sent via the in-band
NVMe-MI send/receive commands, the idea here is to emulate the
sideband interface for MI.

Since the NVMe-MI specification communicates with the NVMe subsystem
via a sideband interface, this QEMU implementation uses a virtual
vsock for the sideband communication: the guest VM needs to connect
to the specific CID of the vsock exposed by the QEMU host.

One needs to add the following options to the QEMU invocation to
create the nvme-mi device and set up the vsock with the given CID:
-device nvme-mi,bus=<nvme bus number>
-device vhost-vsock-pci,guest-cid=<vsock cid>
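
For example, a complete invocation could look like the following
(hypothetical sketch: the image path, serial, ids and guest-cid are
placeholders, and the bus= value must match the name of the bus
created by the nvme controller in your setup):

  qemu-system-x86_64 -machine q35 -m 2G \
      -drive file=nvm.img,if=none,id=nvm0 \
      -device nvme,serial=nvme-mi-test,drive=nvm0,id=nvme0 \
      -device nvme-mi,bus=nvme0 \
      -device vhost-vsock-pci,guest-cid=3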

The following commands were tested with nvme-cli by connecting to the
CID of the vsock as shown above and using socket send/receive calls
to issue the commands and fetch the responses.

We are planning to push the corresponding changes for nvme-cli as well
to test the MI functionality.
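
For reference, a minimal guest-side client along those lines could
look like the sketch below (untested; it assumes the host side listens
on port 1342 as hard-coded in nvme-mi.c and is reachable from the
guest at VMADDR_CID_HOST, and it sends an opaque placeholder packet
rather than a properly framed MI request):

  #include <stdio.h>
  #include <stdint.h>
  #include <unistd.h>
  #include <sys/socket.h>
  #include <linux/vm_sockets.h>

  int main(void)
  {
      struct sockaddr_vm sa = {
          .svm_family = AF_VSOCK,
          .svm_cid = VMADDR_CID_HOST, /* assumption: QEMU listens as the host */
          .svm_port = 1342,           /* port hard-coded in nvme-mi.c */
      };
      uint8_t req[64] = { 0 };        /* placeholder: a real MI request packet */
      uint8_t rsp[4096];
      int fd = socket(AF_VSOCK, SOCK_STREAM, 0);

      if (fd < 0 || connect(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
          perror("vsock connect");
          return 1;
      }
      send(fd, req, sizeof(req), 0);
      ssize_t n = recv(fd, rsp, sizeof(rsp), 0);
      printf("received %zd bytes\n", n);
      close(fd);
      return 0;
  }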

As the connection can be established by the guest VM at any point,
we have created a thread which is looking for a connection request.
Please suggest if there is a native/better way to handle this.

This module makes use of the NvmeCtrl structure of the nvme module to
fetch relevant information about the nvme device, which is used in
some of the MI commands. Even though certain commands might require
modifications to the nvme module, we have currently refrained from
making changes to it.

Since this is our first foray into implementing a new module in qemu,
we will be happy to receive comments and suggestions to improve the
current implementation.

cmd-name                              cmd-type
*************************************************
read nvme-mi ds                        nvme-mi
configuration set                      nvme-mi
configuration get                      nvme-mi
vpd read                               nvme-mi
vpd write                              nvme-mi
controller health status poll          nvme-mi
nvme subsystem health status poll      nvme-mi
identify                               nvme-admin
get log page                           nvme-admin
get features                           nvme-admin

Signed-off-by: Padmakar Kalghatgi

---
 hw/nvme/meson.build |   2 +-
 hw/nvme/nvme-mi.c   | 676 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 hw/nvme/nvme-mi.h   | 288 ++++++++++++++++++++++
 3 files changed, 965 insertions(+), 1 deletion(-)
 create mode 100644 hw/nvme/nvme-mi.c
 create mode 100644 hw/nvme/nvme-mi.h

diff --git a/hw/nvme/meson.build b/hw/nvme/meson.build
index 3cf4004..8dac50e 100644
--- a/hw/nvme/meson.build
+++ b/hw/nvme/meson.build
@@ -1 +1 @@
-softmmu_ss.add(when: 'CONFIG_NVME_PCI', if_true: files('ctrl.c', 'dif.c', 'ns.c', 'subsys.c'))
+softmmu_ss.add(when: 'CONFIG_NVME_PCI', if_true: files('ctrl.c', 'dif.c', 'ns.c', 'subsys.c', 'nvme-mi.c'))
diff --git a/hw/nvme/nvme-mi.c b/hw/nvme/nvme-mi.c
new file mode 100644
index 0000000..5e93417
--- /dev/null
+++ b/hw/nvme/nvme-mi.c
@@ -0,0 +1,676 @@
+/*
+ * QEMU NVMe-MI Controller
+ *
+ * Copyright (c) 2021, Samsung Electronics co Ltd.
+ *
+ * Written by Padmakar Kalghatgi <p.kalghatgi@samsung.com>
+ *            Arun Kumar Agasar <arun.ag@partner.samsung.com>
+ *            Saurav Kumar <saurav.29@partner.samsung.com>
+ *
+ * This code is licensed under the GNU GPL v2 or later.
+ */
+
+/**
+ * Reference Specs: http://www.nvmexpress.org,
+ *
+ *
+ * Usage
+ * -----
+ * The nvme-mi device must be used together with an nvme device.
+ *
+ * Add options:
+ *    -device  nvme-mi,bus=<nvme bus number>
+ *    -device  vhost-vsock-pci,guest-cid=<vsock cid>
+ *
+ * The cid is used by the guest to connect to the vsock.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/qdev-core.h"
+#include "hw/block/block.h"
+#include "hw/pci/msix.h"
+#include "nvme.h"
+#include "nvme-mi.h"
+#include "qemu/crc32c.h"
+
+#define NVME_TEMPERATURE 0x143
+#define NVME_TEMPERATURE_WARNING 0x157
+#define NVME_TEMPERATURE_CRITICAL 0x175
+
+static void nvme_mi_send_resp(NvmeMiCtrl *ctrl_mi, uint8_t *resp, uint32_t size)
+{
+    uint32_t crc_value = crc32c(0xFFFFFFFF, resp, size);
+    size += 4;
+    uint32_t retries = 5;
+    uint32_t offset = 0;
+    uint32_t som = 1;
+    uint32_t eom = 0;
+    uint32_t pktseq = 0;
+    uint32_t mtus = ctrl_mi->mctp_unit_size;
+    while (size > 0) {
+        uint32_t sizesent = size > mtus ? mtus : size;
+        size -= sizesent;
+        eom = size > 0 ? 0 : 1;
+        g_autofree uint8_t *buf = (uint8_t *)g_malloc(sizesent + 8);
+        buf[2] = sizesent + 5;
+        buf[7] = (som << 7) | (eom << 6) | (pktseq << 5);
+        som = 0;
+        memcpy(buf + 8, resp + offset, sizesent);
+        offset += sizesent;
+        if (size <= 0) {
+            memcpy(buf + 4 + offset , &crc_value, sizeof(crc_value));
+        }
+        retries = 5;
+        while (retries > 0) {
+            int32_t nsend = send(ctrl_mi->sock_desc, buf, sizesent + 8, 0);
+            if (nsend < 0) {
+                retries--;
+            } else {
+                break;
+            }
+        }
+    }
+}
+
+static void nvme_mi_resp_hdr_init(NvmeMIResponse *resp, bool bAdminCommand)
+{
+    resp->msg_header.msgtype = 4;
+    resp->msg_header.ic = 1;
+    resp->msg_header.csi = 0;
+    resp->msg_header.reserved = 0;
+    resp->msg_header.nmimt = bAdminCommand ? 2 : 1;
+    resp->msg_header.ror = 1;
+    resp->msg_header.reserved1 = 0;
+}
+static void nvme_mi_nvm_subsys_ds(NvmeMiCtrl *ctrl_mi, NvmeMIRequest *req)
+{
+    NvmeMIResponse resp;
+    NvmMiSubsysInfoDs ds;
+    ds.nump = 1;
+    ds.mjr = (ctrl_mi->n->bar.vs & 0xFF0000) >> 16;
+    ds.mnr = (ctrl_mi->n->bar.vs & 0xFF00) >> 8;
+
+    nvme_mi_resp_hdr_init(&resp , false);
+    resp.status = SUCCESS;
+    resp.mgmt_resp = sizeof(ds);
+    uint32_t total_size = sizeof(resp) + sizeof(ds);
+    uint8_t resp_message[total_size];
+    memcpy(resp_message, &resp, sizeof(resp));
+    memcpy(resp_message + sizeof(resp), &ds, sizeof(ds));
+
+    nvme_mi_send_resp(ctrl_mi, resp_message, total_size);
+}
+
+static void nvme_mi_opt_supp_cmd_list(NvmeMiCtrl *ctrl_mi, NvmeMIRequest *req)
+{
+    NvmeMIResponse resp;
+    nvme_mi_resp_hdr_init(&resp , false);
+    resp.status = SUCCESS;
+
+    uint16_t mi_opt_cmd_cnt = sizeof(NvmeMiCmdOptSupList) /
+                              sizeof(uint32_t);
+    uint16_t admin_mi_opt_cmd_cnt = sizeof(NvmeMiAdminCmdOptSupList) /
+                                    sizeof(uint32_t);
+    uint32_t offset = 0;
+    uint16_t total_commands = mi_opt_cmd_cnt + admin_mi_opt_cmd_cnt;
+    uint32_t size = 2 * (total_commands + 1);
+
+    g_autofree uint8_t *cmd_supp_list = (uint8_t *)g_malloc0(size);
+
+    memcpy(cmd_supp_list, &total_commands, sizeof(uint16_t));
+    offset += sizeof(uint16_t);
+    for (uint32_t i = 0; i < mi_opt_cmd_cnt; i++) {
+        memcpy(cmd_supp_list + offset, &NvmeMiCmdOptSupList[i],
+               sizeof(uint8_t));
+        cmd_supp_list[offset + 1] = 1;
+        offset += 2;
+    }
+
+    for (uint32_t i = 0; i < admin_mi_opt_cmd_cnt; i++) {
+        memcpy(cmd_supp_list + offset, &NvmeMiAdminCmdOptSupList[i],
+               sizeof(uint8_t));
+        cmd_supp_list[offset + 1] = 1;
+        offset += 2;
+    }
+
+    resp.mgmt_resp = size;
+    uint32_t total_size = sizeof(resp) + size;
+    uint8_t resp_message[total_size];
+    memcpy(resp_message, &resp, sizeof(resp));
+    memcpy(resp_message + sizeof(resp), cmd_supp_list, size);
+
+    nvme_mi_send_resp(ctrl_mi, resp_message, total_size);
+}
+
+static void nvme_mi_controller_health_ds(NvmeMiCtrl *ctrl_mi,
+                                         NvmeMIRequest *req)
+{
+    uint32_t dword0 = req->dword0;
+    uint32_t dword1 = req->dword1;
+    uint32_t maxrent = (dword0 >> 16) & 0xFF;
+    uint32_t reportall = (dword0 >> 31) & 0x1;
+    uint32_t incvf = (dword0 >> 26) & 0x1;
+    uint32_t incpf = (dword0 >> 25) & 0x1;
+    uint32_t incf = (dword0 >> 24) & 0x1;
+
+    NvmeMIResponse resp;
+    nvme_mi_resp_hdr_init(&resp , false);
+
+    if (maxrent > 255 || (reportall == 0) || incvf || incpf || (incf == 0)) {
+        resp.status = INVALID_PARAMETER;
+        return nvme_mi_send_resp(ctrl_mi, (uint8_t *)&resp, sizeof(resp));
+    }
+    NvmeMiCtrlHealthDs nvme_mi_chds = {}; /* zero-init: fields are OR-ed in below */
+    if (dword1 & 0x1) {
+        nvme_mi_chds.csts.rdy = ctrl_mi->n->bar.csts & 0x1;
+        nvme_mi_chds.csts.cfs |= ctrl_mi->n->bar.csts & 0x2;
+        nvme_mi_chds.csts.shst |= ctrl_mi->n->bar.csts & 0xa;
+        nvme_mi_chds.csts.nssro |= ctrl_mi->n->bar.csts & 0x10;
+        nvme_mi_chds.csts.en |= ctrl_mi->n->bar.cc & 0x1 << 5;
+    }
+    if (dword1 & 0x2) {
+        nvme_mi_chds.ctemp = ctrl_mi->n->temperature;
+    }
+    if (((ctrl_mi->n->temperature >= ctrl_mi->n->features.temp_thresh_hi) ||
+        (ctrl_mi->n->temperature <= ctrl_mi->n->features.temp_thresh_low)) &&
+         (dword1 & 0x2)) {
+        nvme_mi_chds.cwarn.temp_above_or_under_thresh = 0x1;
+    }
+    printf("size = %lu\n", sizeof(resp) + sizeof(NvmeMiCtrlHealthDs));
+    g_autofree uint8_t *resp_buf = (uint8_t *)g_malloc(sizeof(resp) +
+                                                       sizeof(NvmeMiCtrlHealthDs));
+    resp.mgmt_resp = 1 << 0x10;
+    memcpy(resp_buf, &resp, sizeof(resp));
+    memcpy(resp_buf + sizeof(resp), &nvme_mi_chds, sizeof(nvme_mi_chds));
+    nvme_mi_send_resp(ctrl_mi, resp_buf, sizeof(resp) + sizeof(NvmeMiCtrlHealthDs));
+}
+
+static void nvme_mi_read_nvme_mi_ds(NvmeMiCtrl *ctrl_mi, NvmeMIRequest *req)
+{
+    ReadNvmeMiDs ds;
+    ds.cntrlid = req->dword0 & 0xFFFF;
+    ds.portlid = (req->dword0 & 0xFF0000) >> 16;
+    ds.dtyp = (req->dword0 & ~0xFF) >> 24;
+    int dtyp = ds.dtyp;
+    switch (dtyp) {
+    case NVM_SUBSYSTEM_INFORMATION:
+        nvme_mi_nvm_subsys_ds(ctrl_mi, req);
+        break;
+    case OPT_SUPP_CMD_LIST:
+        nvme_mi_opt_supp_cmd_list(ctrl_mi, req);
+        break;
+    }
+}
+
+static void nvme_mi_configuration_get(NvmeMiCtrl *ctrl_mi, NvmeMIRequest *req)
+{
+    uint8_t config_identifier = (req->dword0 & 0xFF);
+    NvmeMIResponse resp;
+
+    uint32_t total_size = sizeof(resp);
+    uint8_t resp_message[total_size];
+    switch (config_identifier) {
+    case SMBUS_I2C_FREQ: {
+       nvme_mi_resp_hdr_init(&resp, false);
+       resp.status = SUCCESS;
+       resp.mgmt_resp = ctrl_mi->smbus_freq;
+       memcpy(resp_message, &resp, sizeof(resp));
+
+       nvme_mi_send_resp(ctrl_mi, resp_message, total_size);
+    }
+    break;
+    case MCTP_TRANS_UNIT_SIZE: {
+        nvme_mi_resp_hdr_init(&resp, false);
+        resp.status = SUCCESS;
+        resp.mgmt_resp = ctrl_mi->mctp_unit_size;
+        memcpy(resp_message, &resp, sizeof(resp));
+
+        nvme_mi_send_resp(ctrl_mi, resp_message, total_size);
+    }
+    break;
+    }
+}
+
+static void nvme_mi_configuration_set(NvmeMiCtrl *ctrl_mi, NvmeMIRequest *req)
+{
+    uint8_t config_identifier = (req->dword0 & 0xFF);
+    NvmeMIResponse resp;
+    uint32_t total_size = sizeof(resp);
+    uint8_t resp_message[total_size];
+    switch (config_identifier) {
+    case SMBUS_I2C_FREQ: {
+        nvme_mi_resp_hdr_init(&resp, false);
+        resp.status = SUCCESS;
+        resp.mgmt_resp = 0;
+        ctrl_mi->smbus_freq = (req->dword0 & 0xF00) >> 8;
+        memcpy(resp_message, &resp, sizeof(resp));
+
+        nvme_mi_send_resp(ctrl_mi, resp_message, total_size);
+    }
+    break;
+    case MCTP_TRANS_UNIT_SIZE: {
+        nvme_mi_resp_hdr_init(&resp, false);
+        resp.status = SUCCESS;
+        ctrl_mi->mctp_unit_size = (req->dword1 & 0xFFFF);
+        memcpy(resp_message, &resp, sizeof(resp));
+
+        nvme_mi_send_resp(ctrl_mi, resp_message, total_size);
+    }
+    break;
+    default:
+        nvme_mi_resp_hdr_init(&resp, false);
+        resp.status = INVALID_PARAMETER;
+        memcpy(resp_message, &resp, sizeof(resp));
+        nvme_mi_send_resp(ctrl_mi, resp_message, total_size);
+    }
+
+}
+
+static void nvme_mi_vpd_read(NvmeMiCtrl *ctrl_mi, NvmeMIRequest *req)
+{
+    uint16_t dofst = (req->dword0 & 0xFFFF);
+    uint16_t dlen = (req->dword1 & 0xFFFF);
+    NvmeMIResponse resp;
+    nvme_mi_resp_hdr_init(&resp, false);
+    if ((dofst + dlen) > sizeof(NvmeMiVpdElements)) {
+        resp.status = INVALID_PARAMETER;
+        nvme_mi_send_resp(ctrl_mi, (uint8_t *)&resp, sizeof(resp));
+    } else {
+        resp.status = SUCCESS;
+        g_autofree uint8_t *resp_buf = (uint8_t *) g_malloc(dlen + sizeof(resp));
+        memcpy(resp_buf, &resp, sizeof(resp));
+        memcpy(resp_buf + sizeof(resp), &ctrl_mi->vpd_data + dofst, dlen);
+        nvme_mi_send_resp(ctrl_mi, resp_buf, dlen + sizeof(resp));
+    }
+}
+
+static void nvme_mi_vpd_write(NvmeMiCtrl *ctrl_mi, NvmeMIRequest *req,
+                              uint8_t *buf)
+{
+    uint16_t dofst = (req->dword0 & 0xFFFF);
+    uint16_t dlen = (req->dword1 & 0xFFFF);
+    NvmeMIResponse resp;
+    nvme_mi_resp_hdr_init(&resp, false);
+    if ((dofst + dlen) > sizeof(NvmeMiVpdElements)) {
+        resp.status = INVALID_PARAMETER;
+        nvme_mi_send_resp(ctrl_mi, (uint8_t *)&resp, sizeof(resp));
+    } else {
+        resp.status = SUCCESS;
+        memcpy(&ctrl_mi->vpd_data + dofst, buf + 16 , dlen);
+        nvme_mi_send_resp(ctrl_mi, (uint8_t *)&resp, sizeof(resp));
+    }
+}
+
+static void nvme_mi_nvm_subsys_health_status_poll(NvmeMiCtrl *ctrl_mi,
+                                                  NvmeMIRequest *req)
+{
+    NvmeMIResponse resp;
+    NvmeMiNvmSubsysHspds nshds = {}; /* zero-init: ccs bits are OR-ed in below */
+    nvme_mi_resp_hdr_init(&resp, false);
+    for (uint32_t cntlid = 1; cntlid < ARRAY_SIZE(ctrl_mi->n->subsys->ctrls);
+         cntlid++) {
+
+        NvmeCtrl *ctrl = nvme_subsys_ctrl(ctrl_mi->n->subsys, cntlid);
+        if (!ctrl) {
+            continue;
+        }
+
+        if ((ctrl->bar.csts & 0x1) == 0x1) {
+            nshds.ccs = 0x1;
+        }
+        if ((ctrl->bar.csts & 0x2) == 0x2) {
+            nshds.ccs |= 0x2;
+        }
+        if ((ctrl->bar.csts & 0x10) == 0x10) {
+            nshds.ccs |= 0x10;
+        }
+        if (find_first_bit(ctrl->changed_nsids, NVME_CHANGED_NSID_SIZE) !=
+            NVME_CHANGED_NSID_SIZE) {
+                nshds.ccs |= 0x40;
+        }
+        if ((ctrl->temperature >= ctrl->features.temp_thresh_hi) ||
+           (ctrl->temperature <= ctrl->features.temp_thresh_low)) {
+            nshds.ccs |= 0x200;
+        }
+    }
+
+
+    g_autofree uint8_t *resp_buf = (uint8_t *)g_malloc(sizeof(resp) + sizeof(nshds));
+    memcpy(resp_buf, &resp, sizeof(resp));
+    memcpy(resp_buf + sizeof(resp), &nshds, sizeof(nshds));
+    nvme_mi_send_resp(ctrl_mi, resp_buf, sizeof(resp) + sizeof(nshds));
+}
+
+static void nvme_mi_admin_identify_ns(NvmeMiCtrl *ctrl_mi,
+                                      NvmeAdminMIRequest *req,
+                                      uint32_t dofst, uint32_t dlen)
+{
+    NvmeIdNs *id_ns;
+    uint32_t nsid = req->sqentry1;
+    NvmeMIAdminResponse resp;
+    nvme_mi_resp_hdr_init((NvmeMIResponse *)&resp, true);
+    resp.status = SUCCESS;
+    NvmeNamespace *ns = nvme_ns(ctrl_mi->n, nsid);
+    if (!ns) {
+        resp.cqdword0 = 0;
+        resp.cqdword1 = 0;
+        resp.cqdword3 = NVME_INVALID_NSID << 16;
+            nvme_mi_send_resp(ctrl_mi, (uint8_t *)&resp, sizeof(NvmeMIAdminResponse));
+        return ;
+    }
+
+    id_ns = &ns->id_ns;
+
+    g_autofree uint8_t *resp_buff = g_malloc0(sizeof(NvmeMIAdminResponse) + dlen);
+    memcpy(resp_buff, &resp, sizeof(NvmeMIAdminResponse));
+    memcpy(resp_buff + sizeof(NvmeMIAdminResponse), id_ns + dofst, dlen);
+
+    nvme_mi_send_resp(ctrl_mi, (uint8_t *)resp_buff,
+                      sizeof(NvmeMIAdminResponse) + dlen);
+}
+static void nvme_mi_admin_identify_ctrl(NvmeMiCtrl *ctrl_mi,
+                                        NvmeAdminMIRequest *req,
+                                        uint32_t dofst, uint32_t dlen)
+{
+    NvmeMIAdminResponse resp;
+    nvme_mi_resp_hdr_init((NvmeMIResponse *)&resp, true);
+    resp.status = SUCCESS;
+    g_autofree uint8_t *resp_buff = g_malloc0(sizeof(NvmeMIAdminResponse) + dlen);
+    memcpy(resp_buff, &resp, sizeof(NvmeMIAdminResponse));
+    memcpy(resp_buff + sizeof(NvmeMIAdminResponse), &ctrl_mi->n->id_ctrl + dofst, dlen);
+
+    nvme_mi_send_resp(ctrl_mi, (uint8_t *)resp_buff, sizeof(NvmeMIAdminResponse) + dlen);
+}
+static void nvme_mi_admin_identify(NvmeMiCtrl *ctrl_mi, NvmeAdminMIRequest *req)
+{
+    uint32_t cns = req->sqentry10 & 0xFF;
+    uint32_t cflags = req->cmdflags;
+    uint32_t dofst = req->dataofst;
+    uint32_t dlen = req->datalen;
+    NvmeMIResponse resp;
+    if (dofst + dlen > 4096) {
+        nvme_mi_resp_hdr_init(&resp, true);
+        resp.status = INVALID_PARAMETER;
+        return nvme_mi_send_resp(ctrl_mi, (uint8_t *)&resp, sizeof(resp));
+    }
+    if ((cflags & 0x1) == 0) {
+        dlen = 4096;
+    }
+    if (!(cflags & 0x2)) {
+        dofst = 0;
+    }
+    switch (cns) {
+    case 0x00:
+        return nvme_mi_admin_identify_ns(ctrl_mi, req, dofst, dlen);
+    case 0x1:
+        return nvme_mi_admin_identify_ctrl(ctrl_mi, req, dofst, dlen);
+    default:
+    {
+        NvmeMIAdminResponse resp;
+        nvme_mi_resp_hdr_init((NvmeMIResponse *)&resp, true);
+        resp.status = SUCCESS;
+        resp.cqdword3 = NVME_INVALID_FIELD << 16;
+        nvme_mi_send_resp(ctrl_mi, (uint8_t *)&resp, sizeof(resp));
+    }
+    }
+}
+static void nvme_mi_admin_error_info_log(NvmeMiCtrl *ctrl_mi,
+                                         NvmeAdminMIRequest *req,
+                                         uint32_t dofst, uint32_t dlen)
+{
+    NvmeMIAdminResponse resp;
+    NvmeErrorLog errlog;
+    memset(&errlog, 0x0, sizeof(errlog));
+    nvme_mi_resp_hdr_init((NvmeMIResponse *)&resp, true);
+    resp.status = SUCCESS;
+    g_autofree uint8_t *resp_buff = g_malloc0(sizeof(NvmeMIAdminResponse) + dlen);
+    memcpy(resp_buff, &resp, sizeof(NvmeMIAdminResponse));
+    memcpy(resp_buff + sizeof(NvmeMIAdminResponse), &errlog + dofst, dlen);
+    nvme_mi_send_resp(ctrl_mi, (uint8_t *)resp_buff, sizeof(NvmeMIAdminResponse) + dlen);
+}
+
+static void nvme_mi_admin_get_log_page(NvmeMiCtrl *ctrl_mi,
+                                       NvmeAdminMIRequest *req)
+{
+    uint32_t lid = req->sqentry10;
+    uint32_t cflags = req->cmdflags;
+    uint32_t dofst = req->dataofst;
+    uint32_t dlen = req->datalen;
+    NvmeMIResponse resp;
+
+    switch (lid) {
+    case 0x00:
+        if (dofst + dlen > sizeof(NvmeErrorLog)) {
+            nvme_mi_resp_hdr_init(&resp, true);
+            resp.status = INVALID_PARAMETER;
+            return nvme_mi_send_resp(ctrl_mi, (uint8_t *)&resp, sizeof(resp));
+        }
+        if ((cflags & 0x1) == 0) {
+            dlen = sizeof(NvmeErrorLog);
+        }
+        if (!(cflags & 0x2)) {
+            dofst = 0;
+        }
+        if (dofst + dlen > sizeof(NvmeErrorLog)) {
+            nvme_mi_resp_hdr_init(&resp, true);
+            resp.status = INVALID_PARAMETER;
+            return nvme_mi_send_resp(ctrl_mi, (uint8_t *)&resp, sizeof(resp));
+        }
+
+        return nvme_mi_admin_error_info_log(ctrl_mi, req, dofst, dlen);
+    default:
+    {
+        NvmeMIAdminResponse resp;
+        nvme_mi_resp_hdr_init((NvmeMIResponse *)&resp, true);
+        resp.status = SUCCESS;
+        resp.cqdword3 = NVME_INVALID_FIELD << 16;
+        nvme_mi_send_resp(ctrl_mi, (uint8_t *)&resp, sizeof(resp));
+    }
+    }
+}
+
+static void nvme_mi_admin_get_features(NvmeMiCtrl *ctrl_mi,
+                                       NvmeAdminMIRequest *req)
+{
+    uint32_t fid = req->sqentry10 & 0xFF;
+    uint32_t dofst = req->dataofst;
+    uint32_t dlen = req->datalen;
+
+    if (dofst || dlen) {
+        NvmeMIResponse resp;
+        nvme_mi_resp_hdr_init(&resp, true);
+        resp.status = INVALID_PARAMETER;
+        return nvme_mi_send_resp(ctrl_mi, (uint8_t *)&resp, sizeof(resp));
+    }
+
+    NvmeMIAdminResponse resp;
+    nvme_mi_resp_hdr_init((NvmeMIResponse *)&resp, true);
+    resp.status = SUCCESS;
+
+    switch (fid) {
+    case NVME_TEMPERATURE_THRESHOLD:
+        resp.cqdword0 = 0;
+
+        if (NVME_TEMP_TMPSEL(req->sqentry11) != NVME_TEMP_TMPSEL_COMPOSITE) {
+            break;
+        }
+
+        if (NVME_TEMP_THSEL(req->sqentry11) == NVME_TEMP_THSEL_OVER) {
+            resp.cqdword0 = NVME_TEMPERATURE_WARNING;
+        }
+        break;
+    case NVME_NUMBER_OF_QUEUES:
+        resp.cqdword0 = (ctrl_mi->n->params.max_ioqpairs - 1) |
+                        ((ctrl_mi->n->params.max_ioqpairs - 1) << 16);
+        break;
+    default:
+        resp.cqdword3 = NVME_INVALID_FIELD << 16;
+        return nvme_mi_send_resp(ctrl_mi, (uint8_t *)&resp, sizeof(resp));
+    }
+
+    nvme_mi_send_resp(ctrl_mi, (uint8_t *)&resp, sizeof(NvmeMIAdminResponse));
+}
+
+static void nvme_mi_admin_command(NvmeMiCtrl *ctrl_mi, void* req_arg)
+{
+    uint8_t *msg  = ((uint8_t *)req_arg);
+    NvmeMiMessageHeader msghdr;
+    memcpy(&msghdr, msg, sizeof(NvmeMiMessageHeader));
+    if (msghdr.nmimt == 1) {
+        NvmeMIRequest *req = (NvmeMIRequest *) (msg);
+        switch (req->opc) {
+        case READ_NVME_MI_DS:
+            nvme_mi_read_nvme_mi_ds(ctrl_mi, req);
+            break;
+        case CHSP:
+            nvme_mi_controller_health_ds(ctrl_mi, req);
+            break;
+        case NVM_SHSP:
+            nvme_mi_nvm_subsys_health_status_poll(ctrl_mi, req);
+            break;
+        case CONFIGURATION_SET:
+            nvme_mi_configuration_set(ctrl_mi, req);
+            break;
+        case CONFIGURATION_GET:
+            nvme_mi_configuration_get(ctrl_mi, req);
+            break;
+        case VPD_READ:
+            nvme_mi_vpd_read(ctrl_mi, req);
+            break;
+        case VPD_WRITE:
+            nvme_mi_vpd_write(ctrl_mi, req, msg);
+            break;
+        default:
+        {
+            NvmeMIResponse resp;
+            nvme_mi_resp_hdr_init(&resp, false);
+            resp.status = INVALID_COMMAND_OPCODE;
+            nvme_mi_send_resp(ctrl_mi, (uint8_t *)&resp, sizeof(resp));
+            break;
+        }
+        }
+    } else {
+        NvmeAdminMIRequest *req = (NvmeAdminMIRequest *) (msg);
+        switch  (req->opc) {
+        case NVME_ADM_MI_CMD_IDENTIFY:
+            printf("identify\n");
+            nvme_mi_admin_identify(ctrl_mi, req);
+            break;
+        case NVME_ADM_MI_CMD_GET_LOG_PAGE:
+            printf("get log page\n");
+            nvme_mi_admin_get_log_page(ctrl_mi, req);
+            break;
+        case NVME_ADM_MI_CMD_GET_FEATURES:
+            printf("get features\n");
+            nvme_mi_admin_get_features(ctrl_mi, req);
+            break;
+        default:
+        {
+            NvmeMIResponse resp;
+            nvme_mi_resp_hdr_init(&resp, true);
+            resp.status = INVALID_COMMAND_OPCODE;
+            nvme_mi_send_resp(ctrl_mi, (uint8_t *)&resp, sizeof(resp));
+            break;
+        }
+        }
+    }
+
+    return;
+}
+
+static void nvme_mi_init(NvmeMiCtrl *ctrl_mi)
+{
+    pthread_t vsock_tid;
+    pthread_create(&vsock_tid, NULL, vsock_server_start, ctrl_mi);
+    pthread_detach(vsock_tid);
+}
+
+void *vsock_server_start(void *arg)
+{
+    NvmeMiCtrl *ctrl_mi = (NvmeMiCtrl *)arg;
+    int listenfd = 0, c = 0;
+    struct sockaddr_vm sa = {
+                        .svm_family = AF_VSOCK,
+                        .svm_cid = VMADDR_CID_ANY,
+                        .svm_port = 1342,
+                    };
+    struct sockaddr_vm client;
+    listenfd = socket(AF_VSOCK, SOCK_STREAM, 0);
+    if (listenfd == -1) {
+        pthread_exit(NULL);
+    }
+    if (bind(listenfd, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
+        pthread_exit(NULL);
+    }
+    listen(listenfd, 1);
+    puts("Waiting for incoming connections...");
+    c = sizeof(struct sockaddr_vm);
+    while (1) {
+        ctrl_mi->sock_desc = accept(listenfd, (struct sockaddr *)&client,
+                                       (socklen_t *)&c);
+        if ((ctrl_mi->sock_desc) < 0) {
+            continue;
+        }
+
+        NvmeMiMctpHeader mctp_hdr = {}; /* zero-init so the EOM loop below starts defined */
+        uint32_t total_len = 0;
+        uint8_t *getdata_buf = NULL;
+        while (!mctp_hdr.EOM) {
+            uint8_t buf[7];
+            recv(ctrl_mi->sock_desc, buf, 7, 0);
+            mctp_hdr.byte_count = buf[1];
+            mctp_hdr.EOM = (buf[6] & 0x40) >> 6;
+            mctp_hdr.SOM = (buf[6] & 0x80) >> 7;
+            mctp_hdr.pckt_seq = (buf[6] & 0x20) >> 5;
+
+            uint32_t curr_len = total_len;
+            total_len = total_len + (mctp_hdr.byte_count - 5);
+
+            getdata_buf = (uint8_t *)g_realloc(getdata_buf, total_len);
+            recv(ctrl_mi->sock_desc, getdata_buf + curr_len,
+                        (mctp_hdr.byte_count - 5), 0);
+        }
+        NvmeMiMessageHeader msghdr;
+        memcpy(&msghdr, getdata_buf, sizeof(NvmeMiMessageHeader));
+        nvme_mi_admin_command(ctrl_mi, getdata_buf);
+    }
+    pthread_exit(0);
+}
+
+static void nvme_mi_realize(DeviceState *dev, Error **errp)
+{
+    NvmeMiCtrl *ctrl_mi = (NvmeMiCtrl *)malloc(sizeof(NvmeMiCtrl));
+    BusState *s = qdev_get_parent_bus(dev);
+    ctrl_mi->n = NVME(s->parent);
+    ctrl_mi->mctp_unit_size = 64;
+    ctrl_mi->smbus_freq = 0x01;
+    ctrl_mi->sock_desc = 0;
+    memset(&ctrl_mi->vpd_data, 0, sizeof(ctrl_mi->vpd_data));
+
+    nvme_mi_init(ctrl_mi);
+}
+
+static void nvme_mi_class_init(ObjectClass *oc, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(oc);
+
+    set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
+
+    dc->bus_type = TYPE_NVME_BUS;
+    dc->realize = nvme_mi_realize;
+    dc->desc = "nvme mi";
+}
+
+static const TypeInfo nvme_mi = {
+    .name = TYPE_NVME_MI,
+    .parent = TYPE_DEVICE,
+    .instance_size = sizeof(NvmeMiCtrl),
+    .class_init = nvme_mi_class_init,
+};
+
+static void register_types(void)
+{
+    type_register_static(&nvme_mi);
+}
+
+type_init(register_types);
diff --git a/hw/nvme/nvme-mi.h b/hw/nvme/nvme-mi.h
new file mode 100644
index 0000000..4c155a0
--- /dev/null
+++ b/hw/nvme/nvme-mi.h
@@ -0,0 +1,288 @@
+/*
+ * QEMU NVMe-MI
+ *
+ * Copyright (c) 2021 Samsung Electronics Co., Ltd.
+ *
+ * Authors:
+ *   Padmakar Kalghatgi      <p.kalghatgi@samsung.com>
+ *   Arun Kumar Agasar       <arun.ag@partner.samsung.com>
+ *   Saurav Kumar            <saurav.29@partner.samsung.com>
+ *
+ * This code is licensed under the GNU GPL v2 or later.
+ */
+
+#ifndef NVME_MI_H
+#define NVME_MI_H
+
+#include <sys/socket.h>
+#include <linux/vm_sockets.h>
+#include <stdio.h>
+#include <string.h>
+#include <unistd.h>
+#include <stdint.h>
+#include <stdbool.h>
+#include <pthread.h>
+
+#define TYPE_NVME_MI "nvme-mi"
+
+#define NVM_SUBSYSTEM_INFORMATION 0
+#define PORT_INFORMATION 1
+#define CONTROLLER_LIST 2
+#define CONTROLLER_INFORMATION 3
+#define OPT_SUPP_CMD_LIST 4
+#define MGMT_EPT_BUFF_CMD_SUPP_LIST 5
+
+typedef struct NvmeMiVpdElements {
+    long common_header;
+} NvmeMiVpdElements;
+
+typedef struct NvmeMiCtrl {
+   int32_t sock_desc;
+   uint32_t mctp_unit_size;
+   uint32_t smbus_freq;
+   NvmeCtrl *n;
+   NvmeMiVpdElements vpd_data;
+   u_char dummy[1000];
+} NvmeMiCtrl;
+
+enum NvmeMiMngmtInterfaceCmdSetsOpcodes {
+   READ_NVME_MI_DS                   = 0x00,
+   NVM_SHSP                          = 0x01,
+   CHSP                              = 0x02,
+   CONFIGURATION_SET                 = 0x03,
+   CONFIGURATION_GET                 = 0x04,
+   VPD_READ                          = 0x05,
+   VPD_WRITE                         = 0x06,
+   MI_RESET                          = 0x07,
+   SES_RECEIVE                       = 0x08,
+   SES_SEND                          = 0x09,
+   MANAGEMENT_ENDPOINT_BUFFER_READ   = 0x0A,
+   MANAGEMENT_ENDPOINT_BUFFER_WRITE  = 0x0B,
+   MI_RESERVED                       = 0x0C,
+   VENDOR_SPECIFIC                   = 0xC0
+};
+
+enum NvmeMiControlPrimitiveOpcodes {
+   PAUSE                             = 0x00,
+   RESUME                            = 0x01,
+   ABORT                             = 0x02,
+   GET_STATE                         = 0x03,
+   REPLAY                            = 0x04,
+   CTRL_PRIMITIVE_RESERVED           = 0x05,
+   CTRL_PRIMITIVE_VENDOR_SPECIFIC    = 0xF0
+};
+
+enum NvmeMiAdminCommands {
+   NVME_ADM_MI_CMD_DELETE_SQ      = 0x00,
+   NVME_ADM_MI_CMD_CREATE_SQ      = 0x01,
+   NVME_ADM_MI_CMD_GET_LOG_PAGE   = 0x02,
+   NVME_ADM_MI_CMD_DELETE_CQ      = 0x04,
+   NVME_ADM_MI_CMD_CREATE_CQ      = 0x05,
+   NVME_ADM_MI_CMD_IDENTIFY       = 0x06,
+   NVME_ADM_MI_CMD_ABORT          = 0x08,
+   NVME_ADM_MI_CMD_SET_FEATURES   = 0x09,
+   NVME_ADM_MI_CMD_GET_FEATURES   = 0x0a,
+   NVME_ADM_MI_CMD_ASYNC_EV_REQ   = 0x0c,
+   NVME_ADM_MI_CMD_NS_MANAGEMENT  = 0x0d,
+   NVME_ADM_MI_CMD_ACTIVATE_FW    = 0x10,
+   NVME_ADM_MI_CMD_DOWNLOAD_FW    = 0x11,
+   NVME_ADM_MI_CMD_DST            = 0x14,
+   NVME_ADM_MI_CMD_NS_ATTACHMENT  = 0x15,
+   NVME_ADM_MI_CMD_KEEP_ALIVE     = 0x18,
+   NVME_ADM_MI_CMD_DIRECTIVE_SEND = 0x19,
+   NVME_ADM_MI_CMD_DIRECTIVE_RECV = 0x1A,
+   NVME_ADM_MI_CMD_VIRTUALIZATION = 0x1C,
+   NVME_ADM_MI_CMD_DB_BUF_CONIF   = 0x7C,
+   NVME_ADM_MI_CMD_FORMAT_NVM     = 0x80,
+   NVME_ADM_MI_CMD_SECURITY_SEND  = 0x81,
+   NVME_ADM_MI_CMD_SECURITY_RECV  = 0x82,
+   NVME_ADM_MI_CMD_SANITIZE       = 0x84,
+   NVME_ADM_MI_CMD_GET_LBA_STATUS = 0x86,
+};
+
+enum NvmeMiConfigGetResponseValue {
+   DEFAULT_MCTP_SIZE   = 64,
+   DEFAULT_SMBUS_FREQ  = 1,
+   SET_SMBUS_FREQ      = 129,
+   SET_7BITS           = 255,
+   SET_4BITS           = 15,
+   SET_16BITS          = 65535
+};
+
+enum NvmeMiConfigurationIdentifier {
+   SMBUS_I2C_FREQ = 1,
+   HEALTH_STATUS_CHG,
+   MCTP_TRANS_UNIT_SIZE,
+};
+
+enum NvmeMiResponseMessageStatus {
+   SUCCESS,
+   MORE_PROCESSING_REQUIRED,
+   INTERNAL_ERROR,
+   INVALID_COMMAND_OPCODE,
+   INVALID_PARAMETER,
+   INVALID_COMMAND_SIZE,
+   INVALID_COMMAND_INPUT_DATA_SIZE,
+   ACCESS_DENIED,
+   VPD_UPDATES_EXCEEDED = 0x20,
+   PCIe_INACCESSIBLE
+};
+
+typedef struct NvmeMiMctpHeader {
+   uint32_t byte_count:8;
+   uint32_t SOM:1;
+   uint32_t EOM:1;
+   uint32_t pckt_seq:2;
+} NvmeMiMctpHeader;
+
+typedef struct NvmeMiMessageHeader {
+   uint32_t msgtype:7;
+   uint32_t ic:1;
+   uint32_t csi:1;
+   uint32_t reserved:2;
+   uint32_t nmimt:4;
+   uint32_t ror:1;
+   uint32_t reserved1:16;
+} NvmeMiMessageHeader;
+
+typedef struct NvmeMIRequest {
+   NvmeMiMessageHeader msg_header;
+   uint32_t               opc:8;
+   uint32_t               rsvd:24;
+   uint32_t               dword0;
+   uint32_t               dword1;
+   uint8_t                *buf;
+   uint32_t               mic;
+} NvmeMIRequest;
+
+typedef struct NvmeAdminMIRequest {
+   NvmeMiMessageHeader msg_header;
+   uint8_t                opc;
+   uint8_t                cmdflags;
+   uint16_t               cntlid;
+   uint32_t               sqentry1;
+   uint32_t               sqentry2;
+   uint32_t               sqentry3;
+   uint32_t               sqentry4;
+   uint32_t               sqentry5;
+   uint32_t               dataofst;
+   uint32_t               datalen;
+   uint32_t               reserved[2];
+   uint32_t               sqentry10;
+   uint32_t               sqentry11;
+   uint32_t               sqentry12;
+   uint32_t               sqentry13;
+   uint32_t               sqentry14;
+   uint32_t               sqentry15;
+   uint8_t                *buf;
+   uint32_t               mic;
+} NvmeAdminMIRequest;
+
+typedef struct ReadNvmeMiDs {
+    uint16_t cntrlid;
+    uint8_t  portlid;
+    uint8_t  dtyp;
+}  ReadNvmeMiDs;
+
+typedef struct NvmeMiConfigurationSet {
+    uint8_t conf_identifier_dword_0;
+    uint16_t conf_identifier_specific_dword_0;
+    uint16_t conf_identifier_specific_dword_1;
+}  MiConfigurationSet;
+
+typedef struct NvmeMiNvmSubsysHspds {
+    uint8_t nss;
+    uint8_t sw;
+    uint8_t ctemp;
+    uint8_t pdlu;
+    uint16_t ccs;
+    uint16_t reserved;
+} NvmeMiNvmSubsysHspds;
+
+typedef struct NvmeMiControlPrimitives {
+    uint32_t nmh;
+    uint32_t cpo;
+    uint32_t tag;
+    uint32_t cpsp;
+    uint32_t mic;
+} NvmeMiControlPrimitives;
+
+typedef struct NvmMiSubsysInfoDs {
+    uint8_t nump;
+    uint8_t mjr;
+    uint8_t mnr;
+    uint8_t rsvd[29];
+} NvmMiSubsysInfoDs;
+
+typedef struct NvmeMiCwarnStruct {
+    uint8_t spare_thresh:1;
+    uint8_t temp_above_or_under_thresh:1;
+    uint8_t rel_degraded:1;
+    uint8_t read_only:1;
+    uint8_t vol_mem_bup_fail:1;
+    uint8_t reserved:3;
+} NvmeMiCwarnStruct;
+
+typedef struct NvmeMiCstsStruct {
+    uint16_t rdy:1;
+    uint16_t cfs:1;
+    uint16_t shst:2;
+    uint16_t nssro:1;
+    uint16_t en:1;
+    uint16_t nssac:1;
+    uint16_t fwact:1;
+    uint16_t reserved:8;
+} NvmeMiCstsStruct;
+
+typedef struct NvmeMiCtrlHealthDs {
+   uint16_t ctlid;
+   NvmeMiCstsStruct csts;
+   uint16_t ctemp;
+   uint16_t pdlu;
+   uint8_t spare;
+   NvmeMiCwarnStruct cwarn;
+   uint8_t reserved[7];
+} NvmeMiCtrlHealthDs;
+
+typedef struct NvmeMIResponse {
+   NvmeMiMessageHeader msg_header;
+   uint8_t                status;
+   uint32_t               mgmt_resp:24;
+} NvmeMIResponse;
+
+typedef struct NvmeMIAdminResponse {
+   NvmeMiMessageHeader msg_header;
+   uint32_t status:8;
+   uint32_t mgmt_resp:24;
+   uint32_t cqdword0;
+   uint32_t cqdword1;
+   uint32_t cqdword3;
+} NvmeMIAdminResponse;
+
+uint32_t NvmeMiCmdOptSupList[] = {
+  /*
+   * MANAGEMENT_ENDPOINT_BUFFER_READ,
+   * MANAGEMENT_ENDPOINT_BUFFER_WRITE,
+   */
+};
+
+uint32_t NvmeMiAdminCmdOptSupList[] = {
+   /*
+    *  NVME_ADM_CMD_DST,
+    *  NVME_ADM_CMD_DOWNLOAD_FW,
+    *  NVME_ADM_CMD_ACTIVATE_FW,
+    *  NVME_ADM_CMD_FORMAT_NVM,
+    *  NVME_ADM_CMD_NS_MANAGEMENT,
+    *  NVME_ADM_CMD_NS_ATTACHMENT,
+    *  NVME_ADM_CMD_DIRECTIVE_SEND,
+    *  NVME_ADM_CMD_DIRECTIVE_RECV,
+    *  NVME_ADM_CMD_SET_FEATURES,
+    *  NVME_ADM_CMD_SANITIZE,
+    */
+};
+
+void *vsock_server_start(void *);
+void *ControlPrimitiveThread(void *);
+void *nvme_mi_admin_commandthread(void *);
+
+#endif
-- 
2.7.0.windows.1


[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [RFC PATCH 1/2] hw/nvme: add mi device
  2021-07-09 13:55 ` [RFC PATCH 1/2] hw/nvme: add mi device Padmakar Kalghatgi
@ 2021-07-09 15:58   ` Keith Busch
  2021-07-15 12:01     ` Padmakar Kalghatgi
  2021-07-12 11:03     ` Stefan Hajnoczi
  1 sibling, 1 reply; 13+ messages in thread
From: Keith Busch @ 2021-07-09 15:58 UTC (permalink / raw)
  To: Padmakar Kalghatgi
  Cc: fam, kwolf, jg123.choi, qemu-block, k.jensen, d.palani,
	qemu-devel, linux-nvme, mreitz, u.kishore, stefanha, its,
	javier.gonz, prakash.v, mohit.kap

On Fri, Jul 09, 2021 at 07:25:45PM +0530, Padmakar Kalghatgi wrote:
> The following commands are tested with nvme-cli by hooking
> to the cid of the vsock as shown above and use the socket
> send/recieve commands to issue the commands and get the response.

Why sockets? Shouldn't mi targets use smbus for that?


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC PATCH 1/2] hw/nvme: add mi device
  2021-07-09 13:55 ` [RFC PATCH 1/2] hw/nvme: add mi device Padmakar Kalghatgi
@ 2021-07-12 11:03     ` Stefan Hajnoczi
  2021-07-12 11:03     ` Stefan Hajnoczi
  1 sibling, 0 replies; 13+ messages in thread
From: Stefan Hajnoczi @ 2021-07-12 11:03 UTC (permalink / raw)
  To: Padmakar Kalghatgi
  Cc: qemu-devel, kbusch, its, fam, kwolf, mreitz, qemu-block,
	linux-nvme, k.jensen, javier.gonz, prakash.v, d.palani,
	u.kishore, mohit.kap, jg123.choi


[-- Attachment #1.1: Type: text/plain, Size: 2919 bytes --]

On Fri, Jul 09, 2021 at 07:25:45PM +0530, Padmakar Kalghatgi wrote:
> The enclosed patch contains the implementation of certain
> commands of nvme-mi specification.The MI commands are useful
> to manage/configure/monitor the device.Eventhough the MI commands
> can be sent via the inband NVMe-MI send/recieve commands, the idea here is
> to emulate the sideband interface for MI.
> 
> Since the nvme-mi specification deals in communicating
> to the nvme subsystem via. a sideband interface, in this
> qemu implementation, virtual-vsock is used for making the
> sideband communication, the guest VM needs to make the
> connection to the specific cid of the vsock of the qemu host.
> 
> One needs to specify the following command in the launch to
> specify the nvme-mi device, cid and to setup the vsock:
> -device nvme-mi,bus=<nvme bus number>
> -device vhost-vsock-pci, guest-cid=<vsock cid>
> 
> The following commands are tested with nvme-cli by hooking
> to the cid of the vsock as shown above and use the socket
> send/recieve commands to issue the commands and get the response.
> 
> we are planning to push the changes for nvme-cli as well to test the
> MI functionality.

Is the purpose of this feature (-device nvme-mi) testing MI with QEMU's
NVMe implementation?

My understanding is that instead of inventing an out-of-band interface
in the form of a new paravirtualized device, you decided to use vsock to
send MI commands from the guest to QEMU?

> As the connection can be established by the guest VM at any point,
> we have created a thread which is looking for a connection request.
> Please suggest if there is a native/better way to handle this.

QEMU has an event-driven architecture and uses threads sparingly. When
it uses threads it uses qemu_thread_create() instead of
pthread_create(), but I suggest using qemu_set_fd_handler() or a
coroutine with QIOChannel to integrate into the QEMU event loop instead.

I didn't see any thread synchronization, so I'm not sure if accessing
NVMe state from the MI thread is safe. Changing the code to use QEMU's
event loop can solve that problem since there's no separate thread.
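
For example, a rough (untested) sketch of the fd handler approach,
reusing the names from the patch (the listen_fd field is hypothetical
and the socket setup is elided):

  /* needs "qemu/main-loop.h" for qemu_set_fd_handler() */
  static void nvme_mi_accept_handler(void *opaque)
  {
      NvmeMiCtrl *ctrl_mi = opaque;
      int fd = accept(ctrl_mi->listen_fd, NULL, NULL); /* listen_fd: hypothetical field */

      if (fd < 0) {
          return;
      }
      ctrl_mi->sock_desc = fd;
      /* register sock_desc with qemu_set_fd_handler() too, and parse the
       * MCTP packets from its read handler, all in the main loop */
  }

  static void nvme_mi_init(NvmeMiCtrl *ctrl_mi)
  {
      /* socket()/bind()/listen() on the vsock as in vsock_server_start() ... */
      qemu_set_fd_handler(ctrl_mi->listen_fd, nvme_mi_accept_handler, NULL, ctrl_mi);
  }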

> This module makes use of the NvmeCtrl structure of the nvme module,
> to fetch relevant information of the nvme device which are used in
> some of the mi commands. Eventhough certain commands might require
> modification to the nvme module, currently we have currently refrained
> from making changes to the nvme module.

Why did you decide to implement -device nvme-mi as a device on
TYPE_NVME_BUS? If the NVMe spec somehow requires this then I'm surprised
that there's no NVMe bus interface (callbacks). It seems like this could
just as easily be a property of an NVMe controller -device
nvme,mi=on|off or -device nvme-subsys,mi=on|off? I'm probably just not
familiar enough with MI and NVMe architecture...

Stefan

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC PATCH 1/2] hw/nvme: add mi device
  2021-07-12 11:03     ` Stefan Hajnoczi
@ 2021-07-13  5:30       ` Christoph Hellwig
  -1 siblings, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2021-07-13  5:30 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Padmakar Kalghatgi, qemu-devel, kbusch, its, fam, kwolf, mreitz,
	qemu-block, linux-nvme, k.jensen, javier.gonz, prakash.v,
	d.palani, u.kishore, mohit.kap, jg123.choi

On Mon, Jul 12, 2021 at 12:03:27PM +0100, Stefan Hajnoczi wrote:
> Why did you decide to implement -device nvme-mi as a device on
> TYPE_NVME_BUS? If the NVMe spec somehow requires this then I'm surprised
> that there's no NVMe bus interface (callbacks). It seems like this could
> just as easily be a property of an NVMe controller -device
> nvme,mi=on|off or -device nvme-subsys,mi=on|off? I'm probably just not
> familiar enough with MI and NVMe architecture...

I'm too far away from qemu these days to understand what TYPE_NVME_BUS
is.  But NVMe-MI has three possible transports:

 1) out of band through smbus.  This seems like something that could be
    trivially modelled in qemu
 2) out of band over MCTP / PCIe VDM.
 3) in band using NVMe admin commands that pass through MI commands

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC PATCH 1/2] hw/nvme: add mi device
  2021-07-13  5:30       ` Christoph Hellwig
@ 2021-07-13  9:37         ` Stefan Hajnoczi
  -1 siblings, 0 replies; 13+ messages in thread
From: Stefan Hajnoczi @ 2021-07-13  9:37 UTC (permalink / raw)
  To: Christoph Hellwig, Padmakar Kalghatgi, its
  Cc: qemu-devel, kbusch, fam, kwolf, mreitz, qemu-block, linux-nvme,
	k.jensen, javier.gonz, prakash.v, d.palani, u.kishore, mohit.kap,
	jg123.choi


[-- Attachment #1.1: Type: text/plain, Size: 2317 bytes --]

On Tue, Jul 13, 2021 at 06:30:28AM +0100, Christoph Hellwig wrote:
> On Mon, Jul 12, 2021 at 12:03:27PM +0100, Stefan Hajnoczi wrote:
> > Why did you decide to implement -device nvme-mi as a device on
> > TYPE_NVME_BUS? If the NVMe spec somehow requires this then I'm surprised
> > that there's no NVMe bus interface (callbacks). It seems like this could
> > just as easily be a property of an NVMe controller -device
> > nvme,mi=on|off or -device nvme-subsys,mi=on|off? I'm probably just not
> > familiar enough with MI and NVMe architecture...
> 
> I'm too far away from qemu these days to understand what TYPE_NVME_BUS
> is.  Bt NVMe-MI has tree possible transports:
> 
>  1) out of band through smbus.  This seems something that could be
>     trivially modelled in qemu
>  2) out of band over MCTP / PCIe VDM.
>  3) in band using NVMe admin commands that pass through MI commands

Thanks for explaining!

Common NVMe-MI code can be shared by -device nvme-mi-smbus, in-band NVMe
MI commands (part of -device nvme), a vsock transport, etc. This patch
has nvme_mi_admin_command() as the entry point to common MI code, so not
much needs to be done to achieve this.

My question about why -device nvme-mi was because this "device" doesn't
implement any bus interface (callbacks). The bus effectively just serves
as an owner of this device. The guest does not access the device via the
bus. So I'm not sure a -device is appropriate; it's an unusual device.

If the device is kept, please name it -device nvme-mi-vsock so it's
clear this is the NVMe-MI vsock transport. I think the device could be
dropped and instead an -device nvme,mi-vsock=on|off property could be
added to enable the MI vsock transport on a specific NVMe controller.
This raises the question of whether the port number should be
configurable so multiple vsock Management Endpoints can coexist.
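
(Just as an untested sketch of what making the port a property could
look like, assuming NvmeMiCtrl becomes a proper qdev instance struct
and gains a uint32_t port field, which the current patch does not
have:)

  static Property nvme_mi_props[] = {
      DEFINE_PROP_UINT32("port", NvmeMiCtrl, port, 1342),
      DEFINE_PROP_END_OF_LIST(),
  };

  static void nvme_mi_class_init(ObjectClass *oc, void *data)
  {
      DeviceClass *dc = DEVICE_CLASS(oc);
      /* ... existing class_init code ... */
      device_class_set_props(dc, nvme_mi_props);
  }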

I don't have time to explore the architectural model, but here's the
link in case anyone wants to think through all the options for NVMe MI
Management Endpoints and how QEMU should model them:
"1.4 NVM Subsystem Architectural Model"
https://nvmexpress.org/wp-content/uploads/NVM-Express-Management-Interface-1.2-2021.06.02-Ratified.pdf

Stefan

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC PATCH 1/2] hw/nvme: add mi device
  2021-07-09 15:58   ` Keith Busch
@ 2021-07-15 12:01     ` Padmakar Kalghatgi
  0 siblings, 0 replies; 13+ messages in thread
From: Padmakar Kalghatgi @ 2021-07-15 12:01 UTC (permalink / raw)
  To: Keith Busch
  Cc: fam, kwolf, jg123.choi, qemu-block, k.jensen, d.palani,
	qemu-devel, linux-nvme, mreitz, u.kishore, stefanha, its,
	javier.gonz, prakash.v, mohit.kap

[-- Attachment #1: Type: text/plain, Size: 563 bytes --]

On Fri, Jul 09, 2021 at 08:58:42AM -0700, Keith Busch wrote:
>On Fri, Jul 09, 2021 at 07:25:45PM +0530, Padmakar Kalghatgi wrote:
>> The following commands are tested with nvme-cli by hooking
>> to the cid of the vsock as shown above and use the socket
>> send/recieve commands to issue the commands and get the response.
>
>Why sockets? Shouldn't mi targets use smbus for that?
>
vsock mimics the sideband communication, hence we used it.
However, we are working on the smbus/i2c implementation for nvme-mi
in qemu/nvme-cli; we will send the patch in a few days.


[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC PATCH 1/2] hw/nvme: add mi device
  2021-07-12 11:03     ` Stefan Hajnoczi
@ 2021-07-15 12:36       ` Padmakar Kalghatgi
  -1 siblings, 0 replies; 13+ messages in thread
From: Padmakar Kalghatgi @ 2021-07-15 12:36 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: qemu-devel, kbusch, its, fam, kwolf, mreitz, qemu-block,
	linux-nvme, k.jensen, javier.gonz, prakash.v, d.palani,
	u.kishore, mohit.kap, jg123.choi

[-- Attachment #1: Type: text/plain, Size: 3543 bytes --]

On Mon, Jul 12, 2021 at 12:03:27PM +0100, Stefan Hajnoczi wrote:
>On Fri, Jul 09, 2021 at 07:25:45PM +0530, Padmakar Kalghatgi wrote:
>> The enclosed patch contains the implementation of certain
>> commands of nvme-mi specification.The MI commands are useful
>> to manage/configure/monitor the device.Eventhough the MI commands
>> can be sent via the inband NVMe-MI send/recieve commands, the idea here is
>> to emulate the sideband interface for MI.
>>
>> Since the nvme-mi specification deals in communicating
>> to the nvme subsystem via. a sideband interface, in this
>> qemu implementation, virtual-vsock is used for making the
>> sideband communication, the guest VM needs to make the
>> connection to the specific cid of the vsock of the qemu host.
>>
>> One needs to specify the following command in the launch to
>> specify the nvme-mi device, cid and to setup the vsock:
>> -device nvme-mi,bus=<nvme bus number>
>> -device vhost-vsock-pci, guest-cid=<vsock cid>
>>
>> The following commands are tested with nvme-cli by hooking
>> to the cid of the vsock as shown above and use the socket
>> send/recieve commands to issue the commands and get the response.
>>
>> we are planning to push the changes for nvme-cli as well to test the
>> MI functionality.
>
>Is the purpose of this feature (-device nvme-mi) testing MI with QEMU's
>NVMe implementation?
>
>My understanding is that instead of inventing an out-of-band interface
>in the form of a new paravirtualized device, you decided to use vsock to
>send MI commands from the guest to QEMU?
>
>> As the connection can be established by the guest VM at any point,
>> we have created a thread which is looking for a connection request.
>> Please suggest if there is a native/better way to handle this.
>
>QEMU has an event-driven architecture and uses threads sparingly. When
>it uses threads it uses qemu_create_thread() instead of
>pthread_create(), but I suggest using qemu_set_fd_handler() or a
>coroutine with QIOChannel to integrate into the QEMU event loop instead.
>
>I didn't see any thread synchronization, so I'm not sure if accessing
>NVMe state from the MI thread is safe. Changing the code to use QEMU's
>event loop can solve that problem since there's no separate thread.
>
vsock mimics the sideband communication, hence we used it.
However, we are working on the smbus/i2c implementation for nvme-mi in
qemu/nvme-cli; we will send the patch in a few days. To communicate with
nvme-mi over smbus/i2c, the nvme-mi device needs to inherit from the i2c
class, which has callbacks for sending and receiving messages; this
approach would get rid of the threads.
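A rough sketch of that direction (type name, buffering and dispatch are
placeholders, not the actual follow-up patch):

  #include "qemu/osdep.h"
  #include "hw/i2c/i2c.h"

  #define TYPE_NVME_MI_I2C "nvme-mi-i2c"   /* hypothetical type name */

  typedef struct NvmeMiI2C {
      I2CSlave parent_obj;
      /* buffers for assembling/serving MCTP-over-SMBus packets would go here */
  } NvmeMiI2C;

  static int nvme_mi_i2c_send(I2CSlave *s, uint8_t byte)
  {
      /* master -> device: accumulate request bytes here */
      return 0;
  }

  static uint8_t nvme_mi_i2c_recv(I2CSlave *s)
  {
      /* device -> master: return the next byte of the prepared response */
      return 0x00;
  }

  static int nvme_mi_i2c_event(I2CSlave *s, enum i2c_event event)
  {
      /* I2C_START_SEND / I2C_START_RECV / I2C_FINISH delimit a message;
       * on I2C_FINISH the assembled request would be handed to
       * nvme_mi_admin_command() and a response prepared */
      return 0;
  }

  static void nvme_mi_i2c_class_init(ObjectClass *oc, void *data)
  {
      I2CSlaveClass *k = I2C_SLAVE_CLASS(oc);

      k->send = nvme_mi_i2c_send;
      k->recv = nvme_mi_i2c_recv;
      k->event = nvme_mi_i2c_event;
  }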

>> This module makes use of the NvmeCtrl structure of the nvme module,
>> to fetch relevant information of the nvme device which are used in
>> some of the mi commands. Eventhough certain commands might require
>> modification to the nvme module, currently we have currently refrained
>> from making changes to the nvme module.
>
>Why did you decide to implement -device nvme-mi as a device on
>TYPE_NVME_BUS? If the NVMe spec somehow requires this then I'm surprised
>that there's no NVMe bus interface (callbacks). It seems like this could
>just as easily be a property of an NVMe controller -device
>nvme,mi=on|off or -device nvme-subsys,mi=on|off? I'm probably just not
>familiar enough with MI and NVMe architecture...
>
>Stefan
Since nvme communication happens over pcie and nvme-mi happens over
smbus/i2c, nvme-mi cannot be a property of nvme; rather, it should be a
separate device which sits on the smbus/i2c.




^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC PATCH 1/2] hw/nvme: add mi device
  2021-07-13  9:37         ` Stefan Hajnoczi
@ 2021-07-15 14:37           ` Padmakar Kalghatgi
  0 siblings, 0 replies; 13+ messages in thread
From: Padmakar Kalghatgi @ 2021-07-15 14:37 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Christoph Hellwig, its, qemu-devel, kbusch, fam, kwolf, mreitz,
	qemu-block, linux-nvme, k.jensen, javier.gonz, prakash.v,
	d.palani, u.kishore, mohit.kap, jg123.choi

[-- Attachment #1: Type: text/plain, Size: 2567 bytes --]

On Tue, Jul 13, 2021 at 10:37:23AM +0100, Stefan Hajnoczi wrote:
>On Tue, Jul 13, 2021 at 06:30:28AM +0100, Christoph Hellwig wrote:
>> On Mon, Jul 12, 2021 at 12:03:27PM +0100, Stefan Hajnoczi wrote:
>> > Why did you decide to implement -device nvme-mi as a device on
>> > TYPE_NVME_BUS? If the NVMe spec somehow requires this then I'm surprised
>> > that there's no NVMe bus interface (callbacks). It seems like this could
>> > just as easily be a property of an NVMe controller -device
>> > nvme,mi=on|off or -device nvme-subsys,mi=on|off? I'm probably just not
>> > familiar enough with MI and NVMe architecture...
>>
>> I'm too far away from qemu these days to understand what TYPE_NVME_BUS
>> is.  But NVMe-MI has three possible transports:
>>
>>  1) out of band through smbus.  This seems something that could be
>>     trivially modelled in qemu
>>  2) out of band over MCTP / PCIe VDM.
>>  3) in band using NVMe admin commands that pass through MI commands
>
>Thanks for explaining!
>
>Common NVMe-MI code can be shared by -device nvme-mi-smbus, in-band NVMe
>MI commands (part of -device nvme), a vsock transport, etc. This patch
>has nvme_mi_admin_command() as the entry point to common MI code, so not
>much needs to be done to achieve this.
>
>My question about why -device nvme-mi was because this "device" doesn't
>implement any bus interface (callbacks). The bus effectively just serves
>as an owner of this device. The guest does not access the device via the
>bus. So I'm not sure a -device is appropriate, it's an unusual device.
>
>If the device is kept, please name it -device nvme-mi-vsock so it's
>clear this is the NVMe-MI vsock transport. I think the device could be
>dropped and instead an -device nvme,mi-vsock=on|off property could be
>added to enable the MI vsock transport on a specific NVMe controller.
>This raises the question of whether the port number should be
>configurable so multiple vsock Management Endpoints can coexist.
>
>I don't have time to explore the architectural model, but here's the
>link in case anyone wants to think through all the options for NVMe MI
>Management Endpoints and how QEMU should model them:
>"1.4 NVM Subsystem Architectural Model"
>https://nvmexpress.org/wp-content/uploads/NVM-Express-Management-Interface-1.2-2021.06.02-Ratified.pdf
>
>Stefan
Thanks Stefan for the suggestion.
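
A hypothetical sketch of the property-based alternative Stefan describes,
assuming a boolean controller property (the params.mi_vsock field is made
up for illustration and is not part of this patch):

    /* Sketch only: expose the MI vsock transport as a controller property,
     * i.e. -device nvme,mi-vsock=on|off on the existing nvme device.
     */
    static Property nvme_props[] = {
        /* ... existing nvme controller properties ... */
        DEFINE_PROP_BOOL("mi-vsock", NvmeCtrl, params.mi_vsock, false),
        DEFINE_PROP_END_OF_LIST(),
    };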



^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2021-07-15 15:00 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <CGME20210709135651epcas5p1c544dec5377413bfa4b2eeab6ee43f26@epcas5p1.samsung.com>
2021-07-09 13:55 ` [RFC PATCH 1/2] hw/nvme: add mi device Padmakar Kalghatgi
2021-07-09 15:58   ` Keith Busch
2021-07-15 12:01     ` Padmakar Kalghatgi
2021-07-12 11:03   ` Stefan Hajnoczi
2021-07-12 11:03     ` Stefan Hajnoczi
2021-07-13  5:30     ` Christoph Hellwig
2021-07-13  5:30       ` Christoph Hellwig
2021-07-13  9:37       ` Stefan Hajnoczi
2021-07-13  9:37         ` Stefan Hajnoczi
2021-07-15 14:37         ` Padmakar Kalghatgi
2021-07-15 14:37           ` Padmakar Kalghatgi
2021-07-15 12:36     ` Padmakar Kalghatgi
2021-07-15 12:36       ` Padmakar Kalghatgi

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.