All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dan Williams <dan.j.williams@intel.com>
To: Haozhong Zhang <haozhong.zhang@intel.com>
Cc: qemu-devel@nongnu.org, "Michael S. Tsirkin" <mst@redhat.com>,
	Igor Mammedov <imammedo@redhat.com>,
	Xiao Guangrong <guangrong.xiao@gmail.com>,
	Stefan Hajnoczi <stefanha@gmail.com>
Subject: Re: [Qemu-devel] [PATCH v2 2/4] nvdimm: warn if the backend is not a DAX device
Date: Tue, 6 Jun 2017 10:59:07 -0700	[thread overview]
Message-ID: <CAPcyv4hf+XAabjswqhkq7HsixBuBnivJuwgC7tvdzHC3_EDhxg@mail.gmail.com> (raw)
In-Reply-To: <20170606072229.9302-3-haozhong.zhang@intel.com>

On Tue, Jun 6, 2017 at 12:22 AM, Haozhong Zhang
<haozhong.zhang@intel.com> wrote:
> Applications in Linux guest that use device-dax never trigger flush
> that can be trapped by KVM/QEMU. Meanwhile, if the host backend is not
> device-dax, QEMU cannot guarantee the persistence of guest writes.
> Before solving this flushing problem, QEMU should warn users if the
> host backend is not device-dax.
>
> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
> Message-id: CAPcyv4hV2-ZW8SMCRtD0P_86KgR3DHOvNe+6T5SY2u7wXg3gEg@mail.gmail.com
> ---
>  hw/mem/nvdimm.c      |  6 ++++++
>  include/qemu/osdep.h |  9 ++++++++
>  util/osdep.c         | 61 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 76 insertions(+)
>
> diff --git a/hw/mem/nvdimm.c b/hw/mem/nvdimm.c
> index a9b0863f20..b23542fbdf 100644
> --- a/hw/mem/nvdimm.c
> +++ b/hw/mem/nvdimm.c
> @@ -26,6 +26,7 @@
>  #include "qapi/error.h"
>  #include "qapi/visitor.h"
>  #include "hw/mem/nvdimm.h"
> +#include "qemu/error-report.h"
>
>  static void nvdimm_get_label_size(Object *obj, Visitor *v, const char *name,
>                                    void *opaque, Error **errp)
> @@ -84,6 +85,11 @@ static void nvdimm_realize(PCDIMMDevice *dimm, Error **errp)
>      NVDIMMDevice *nvdimm = NVDIMM(dimm);
>      uint64_t align, pmem_size, size = memory_region_size(mr);
>
> +    if (!qemu_fd_is_dev_dax(memory_region_get_fd(mr))) {
> +        error_report("warning: nvdimm backend does not look like a DAX device, "
> +                     "unable to guarantee persistence of guest writes");
> +    }
> +
>      align = memory_region_get_alignment(mr);
>
>      pmem_size = size - nvdimm->label_size;
> diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
> index 1c9f5e260c..7f26af371e 100644
> --- a/include/qemu/osdep.h
> +++ b/include/qemu/osdep.h
> @@ -470,4 +470,13 @@ char *qemu_get_pid_name(pid_t pid);
>   */
>  pid_t qemu_fork(Error **errp);
>
> +/**
> + * qemu_fd_is_dev_dax:
> + *
> + * Check whether @fd describes a DAX device.
> + *
> + * Returns true if it is; otherwise, return false.
> + */
> +bool qemu_fd_is_dev_dax(int fd);
> +
>  #endif
> diff --git a/util/osdep.c b/util/osdep.c
> index a2863c8e53..02881f96bc 100644
> --- a/util/osdep.c
> +++ b/util/osdep.c
> @@ -471,3 +471,64 @@ writev(int fd, const struct iovec *iov, int iov_cnt)
>      return readv_writev(fd, iov, iov_cnt, true);
>  }
>  #endif
> +
> +#ifdef __linux__
> +static ssize_t qemu_dev_dax_sysfs_read(int fd, const char *entry,
> +                                       char *buf, size_t len)
> +{
> +    ssize_t read_bytes;
> +    struct stat st;
> +    unsigned int major, minor;
> +    char *path, *pos;
> +    int sysfs_fd;
> +
> +    if (fstat(fd, &st)) {
> +        return 0;
> +    }
> +
> +    major = major(st.st_rdev);
> +    minor = minor(st.st_rdev);
> +    path = g_strdup_printf("/sys/dev/char/%u:%u/%s", major, minor, entry);
> +
> +    sysfs_fd = open(path, O_RDONLY);
> +    g_free(path);
> +    if (sysfs_fd == -1) {
> +        return 0;
> +    }
> +
> +    read_bytes = read(sysfs_fd, buf, len - 1);
> +    close(sysfs_fd);
> +    if (read_bytes > 0) {
> +        buf[read_bytes] = '\0';
> +        pos = g_strstr_len(buf, read_bytes, "\n");
> +        if (pos) {
> +            *pos = '\0';
> +        }
> +    }
> +
> +    return read_bytes;
> +}
> +#endif /* __linux__ */
> +
> +bool qemu_fd_is_dev_dax(int fd)
> +{
> +    bool is_dax = false;
> +
> +#ifdef __linux__
> +    char devtype[7];
> +    ssize_t len;
> +
> +    if (fd == -1) {
> +        return false;
> +    }
> +
> +    len = qemu_dev_dax_sysfs_read(fd, "device/devtype",
> +                                  devtype, sizeof(devtype));
> +    if (len <= 0) {
> +        return false;
> +    }
> +    is_dax = !strncmp(devtype, "nd_dax", len);

There's no guarantee that device-dax instances are always parented by
an "nd_dax" device-type. A more reliable check is to see if
"/sys/dev/char/%u:%u/subsystem" points to "/sys/class/dax".

  reply	other threads:[~2017-06-06 17:59 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-06-06  7:22 [Qemu-devel] [PATCH v2 0/4] nvdimm: fixes for (non-)dax backends Haozhong Zhang
2017-06-06  7:22 ` [Qemu-devel] [PATCH v2 1/4] nvdimm: add a macro for property "label-size" Haozhong Zhang
2017-06-07 15:29   ` Stefan Hajnoczi
2017-06-06  7:22 ` [Qemu-devel] [PATCH v2 2/4] nvdimm: warn if the backend is not a DAX device Haozhong Zhang
2017-06-06 17:59   ` Dan Williams [this message]
2017-06-08  1:07     ` Haozhong Zhang
2017-06-07 15:14   ` Stefan Hajnoczi
2017-06-08  1:07     ` Haozhong Zhang
2017-06-08 12:51   ` Michael S. Tsirkin
2017-06-12  3:18     ` Haozhong Zhang
2017-06-12  3:25       ` Michael S. Tsirkin
2017-06-06  7:22 ` [Qemu-devel] [PATCH v2 3/4] nvdimm: add a boolean option "restrict" Haozhong Zhang
2017-06-07 15:27   ` Stefan Hajnoczi
2017-06-08  1:45     ` Haozhong Zhang
2017-06-08 10:19       ` Stefan Hajnoczi
2017-06-08  6:39     ` Haozhong Zhang
2017-06-08 10:18       ` Stefan Hajnoczi
2017-06-08 12:56   ` Michael S. Tsirkin
2017-06-09  9:34     ` Stefan Hajnoczi
2017-06-12  1:18     ` Haozhong Zhang
2017-06-06  7:22 ` [Qemu-devel] [PATCH v2 4/4] util/mmap-alloc: account for DAX device in qemu_fd_getpagesize Haozhong Zhang
2017-06-07 15:29   ` Stefan Hajnoczi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAPcyv4hf+XAabjswqhkq7HsixBuBnivJuwgC7tvdzHC3_EDhxg@mail.gmail.com \
    --to=dan.j.williams@intel.com \
    --cc=guangrong.xiao@gmail.com \
    --cc=haozhong.zhang@intel.com \
    --cc=imammedo@redhat.com \
    --cc=mst@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.