linux-kernel.vger.kernel.org archive mirror
* [RFC/PATCHSET 0/3] virtio: Implement virtio pstore device (v3)
@ 2016-08-20  8:07 Namhyung Kim
  2016-08-20  8:07 ` [PATCH 1/3] virtio: Basic implementation of virtio pstore driver Namhyung Kim
                   ` (3 more replies)
  0 siblings, 4 replies; 31+ messages in thread
From: Namhyung Kim @ 2016-08-20  8:07 UTC (permalink / raw)
  To: virtio-dev, kvm, qemu-devel, virtualization
  Cc: LKML, Paolo Bonzini, Radim Krčmář,
	Michael S. Tsirkin, Anthony Liguori, Anton Vorontsov,
	Colin Cross, Kees Cook, Tony Luck, Steven Rostedt, Ingo Molnar,
	Minchan Kim, Will Deacon

Hello,

This is another iteration of the virtio-pstore work.  In this patchset
I addressed most of the feedback on the previous version and dropped
support for PSTORE_TYPE_CONSOLE for simplicity.  It'll be added once the
basic implementation is in place.

 * changes in v3)
  - use QIOChannel API  (Stefan, Daniel)
  - add bound check for malicious guests  (Daniel)
  - drop support for PSTORE_TYPE_CONSOLE for now
  - update license to allow GPL v2 or later  (Michael)
  - limit number of pstore files on qemu

 * changes in v2)
  - update VIRTIO_ID_PSTORE to 22  (Cornelia, Stefan)
  - make buffer size configurable  (Cornelia)
  - support PSTORE_TYPE_CONSOLE  (Kees)
  - use separate virtqueues for read and write
  - support concurrent async write
  - manage pstore (file) id in device side
  - fix various mistakes in qemu device  (Stefan)

It started from the fact that dumping the ftrace buffer at kernel
oops/panic takes too much time.  Although there's a way to reduce the
size of the original data, sometimes I want to have as much information
as possible.  Maybe kexec/kdump can solve this problem, but it consumes
some portion of guest memory so I'd like to avoid it.  And I know that
qemu + crashtool can dump and analyze the whole guest memory, including
the ftrace buffer, without wasting guest memory, but it adds one more
layer and, as an out-of-tree tool, has some limitations such as not
being in sync with kernel changes.

So I think it'd be great to use the pstore interface to dump guest
kernel data on the host.  One can read the data on the host directly
or on the guest (at the next boot) using the pstore filesystem as usual.
While this patchset only implements dumping the kernel log buffer, it can
be extended to cover the ftrace buffer and probably more.

Patch 0001 implements the virtio pstore driver.  It has two virtqueues,
one for (sync) read and one for (async) write, a pstore buffer, and I/O
request and response structures.  The virtio_pstore_req struct describes
the current pstore operation.  The result is written to the
virtio_pstore_res struct.  A read operation also uses the
virtio_pstore_fileinfo struct.
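
As a rough illustration of the three structures, here is a user-space
sketch that mirrors the field layout from the uapi header in patch 0001.
The pstore_* names, the use of plain fixed-width integers in place of the
__virtio16/32/64 types, and the pstore_read_reply_min() helper are
assumptions made for this example only; they are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* User-space mirrors of the wire structures.  Fixed-width integers stand in
 * for the kernel's __virtio16/32/64 types (an assumption for illustration). */
struct pstore_req {
	uint16_t cmd;       /* VIRTIO_PSTORE_CMD_* */
	uint16_t type;      /* VIRTIO_PSTORE_TYPE_* */
	uint32_t flags;     /* VIRTIO_PSTORE_FL_* */
	uint64_t id;
	uint32_t count;
	uint32_t reserved;
};

struct pstore_res {
	uint16_t cmd;
	uint16_t type;
	uint32_t ret;
};

struct pstore_fileinfo {
	uint64_t id;
	uint32_t count;
	uint16_t type;
	uint16_t unused;
	uint32_t flags;
	uint32_t len;
	uint64_t time_sec;
	uint32_t time_nsec;
	uint32_t reserved;
};

/* The explicit reserved/unused fields mean no implicit padding is needed. */
static_assert(sizeof(struct pstore_req) == 24, "unexpected req size");
static_assert(sizeof(struct pstore_res) == 8, "unexpected res size");
static_assert(sizeof(struct pstore_fileinfo) == 40, "unexpected fileinfo size");

/* Hypothetical helper: the minimum number of bytes a read reply must carry,
 * i.e. the response header followed by the file info. */
size_t pstore_read_reply_min(void)
{
	return sizeof(struct pstore_res) + sizeof(struct pstore_fileinfo);
}
```

The driver's read path applies the same kind of length check before it
trusts the file info returned by the device.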

Patches 0002 and 0003 implement the virtio-pstore legacy PCI device on
qemu-kvm and kvmtool respectively.  I referenced the virtio-balloon and
virtio-rng implementations, and I don't know whether kvmtool supports the
modern virtio 1.0+ spec.  Other transports might be supported later.

For example, using virtio-pstore on qemu looks like below:

  $ qemu-system-x86_64 -enable-kvm -device virtio-pstore,directory=xxx

When the guest kernel panics, the log messages are saved under the
xxx directory.

  $ ls xxx
  dmesg-1.enc.z  dmesg-2.enc.z

As you can see, the pstore subsystem compresses the log data using zlib
(it now supports lzo and lz4 too).  The data can be extracted with the
following command:

  $ cat xxx/dmesg-1.enc.z | \
  > python -c 'import sys, zlib; print(zlib.decompress(sys.stdin.read()))'
  Oops#1 Part1
  <5>[    0.000000] Linux version 4.6.0kvm+ (namhyung@danjae) (gcc version 5.3.0 (GCC) ) #145 SMP Mon Jul 18 10:22:45 KST 2016
  <6>[    0.000000] Command line: root=/dev/vda console=ttyS0
  <6>[    0.000000] x86/fpu: Legacy x87 FPU detected.
  <6>[    0.000000] x86/fpu: Using 'eager' FPU context switches.
  <6>[    0.000000] e820: BIOS-provided physical RAM map:
  <6>[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
  <6>[    0.000000] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
  <6>[    0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
  <6>[    0.000000] BIOS-e820: [mem 0x0000000000100000-0x0000000007fddfff] usable
  <6>[    0.000000] BIOS-e820: [mem 0x0000000007fde000-0x0000000007ffffff] reserved
  <6>[    0.000000] BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved
  <6>[    0.000000] BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
  <6>[    0.000000] NX (Execute Disable) protection: active
  <6>[    0.000000] SMBIOS 2.8 present.
  <7>[    0.000000] DMI: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
  ...


Namhyung Kim (3):
  virtio: Basic implementation of virtio pstore driver
  qemu: Implement virtio-pstore device
  kvmtool: Implement virtio-pstore device

 drivers/virtio/Kconfig             |  10 +
 drivers/virtio/Makefile            |   1 +
 drivers/virtio/virtio_pstore.c     | 417 +++++++++++++++++++++++++++++++++++++
 include/uapi/linux/Kbuild          |   1 +
 include/uapi/linux/virtio_ids.h    |   1 +
 include/uapi/linux/virtio_pstore.h |  74 +++++++
 6 files changed, 504 insertions(+)
 create mode 100644 drivers/virtio/virtio_pstore.c
 create mode 100644 include/uapi/linux/virtio_pstore.h


Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Anthony Liguori <aliguori@amazon.com>
Cc: Anton Vorontsov <anton@enomsg.org>
Cc: Colin Cross <ccross@android.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: kvm@vger.kernel.org
Cc: qemu-devel@nongnu.org
Cc: virtualization@lists.linux-foundation.org

Thanks,
Namhyung


-- 
2.9.3


* [PATCH 1/3] virtio: Basic implementation of virtio pstore driver
  2016-08-20  8:07 [RFC/PATCHSET 0/3] virtio: Implement virtio pstore device (v3) Namhyung Kim
@ 2016-08-20  8:07 ` Namhyung Kim
  2016-09-13 15:19   ` Michael S. Tsirkin
  2016-11-10 16:39   ` Michael S. Tsirkin
  2016-08-20  8:07 ` [PATCH 2/3] qemu: Implement virtio-pstore device Namhyung Kim
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 31+ messages in thread
From: Namhyung Kim @ 2016-08-20  8:07 UTC (permalink / raw)
  To: virtio-dev, kvm, qemu-devel, virtualization
  Cc: LKML, Paolo Bonzini, Radim Krčmář,
	Michael S. Tsirkin, Anthony Liguori, Anton Vorontsov,
	Colin Cross, Kees Cook, Tony Luck, Steven Rostedt, Ingo Molnar,
	Minchan Kim

The virtio pstore driver provides an interface to the pstore subsystem so
that the guest kernel's log/dump messages can be saved on the host
machine.  Users can access the log files directly on the host, or on the
guest at the next boot using the pstore filesystem.  It currently deals
with the kernel log (printk) buffer only, but we can extend it to cover
other information (like the ftrace dump) later.

It supports the legacy PCI device using a single order-2 page buffer.  It
uses two virtqueues - one for (sync) read and another for (async) write.
Since it cannot wait for a write to finish, it supports up to 128
concurrent I/O requests.  The buffer size is configurable now.
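
To make the numbers above concrete, here is a small user-space sketch of
how a request slot can be picked from a fixed pool of 128 entries and how
the default buffer size follows from an order-2 allocation.  The slot_ring
type and slot_ring_next() are hypothetical names used only for this
example, and the 4 KiB page size is an assumption.

```c
#include <assert.h>
#include <stddef.h>

#define PSTORE_ORDER    2                        /* order-2: four pages */
#define PSTORE_BUFSIZE  (4096u << PSTORE_ORDER)  /* 16 KiB with 4 KiB pages */
#define PSTORE_NR_REQ   128                      /* request/response slots */

struct slot_ring {
	unsigned int req_id;  /* monotonically increasing request counter */
};

/* Return the slot index for the next request and advance the counter.
 * Slots are reused round-robin, so this is only safe when the host has
 * finished with a slot before the counter wraps around to it again. */
unsigned int slot_ring_next(struct slot_ring *ring)
{
	assert(ring != NULL);
	return ring->req_id++ % PSTORE_NR_REQ;
}
```

The write path never waits for completion, which is why the pool size
bounds the number of writes that can be in flight at once.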

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Anthony Liguori <aliguori@amazon.com>
Cc: Anton Vorontsov <anton@enomsg.org>
Cc: Colin Cross <ccross@android.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: kvm@vger.kernel.org
Cc: qemu-devel@nongnu.org
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 drivers/virtio/Kconfig             |  10 +
 drivers/virtio/Makefile            |   1 +
 drivers/virtio/virtio_pstore.c     | 417 +++++++++++++++++++++++++++++++++++++
 include/uapi/linux/Kbuild          |   1 +
 include/uapi/linux/virtio_ids.h    |   1 +
 include/uapi/linux/virtio_pstore.h |  74 +++++++
 6 files changed, 504 insertions(+)
 create mode 100644 drivers/virtio/virtio_pstore.c
 create mode 100644 include/uapi/linux/virtio_pstore.h

diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
index 77590320d44c..8f0e6c796c12 100644
--- a/drivers/virtio/Kconfig
+++ b/drivers/virtio/Kconfig
@@ -58,6 +58,16 @@ config VIRTIO_INPUT
 
 	 If unsure, say M.
 
+config VIRTIO_PSTORE
+	tristate "Virtio pstore driver"
+	depends on VIRTIO
+	depends on PSTORE
+	---help---
+	 This driver supports virtio pstore devices to save/restore
+	 panic and oops messages on the host.
+
+	 If unsure, say M.
+
  config VIRTIO_MMIO
 	tristate "Platform bus driver for memory mapped virtio devices"
 	depends on HAS_IOMEM && HAS_DMA
diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile
index 41e30e3dc842..bee68cb26d48 100644
--- a/drivers/virtio/Makefile
+++ b/drivers/virtio/Makefile
@@ -5,3 +5,4 @@ virtio_pci-y := virtio_pci_modern.o virtio_pci_common.o
 virtio_pci-$(CONFIG_VIRTIO_PCI_LEGACY) += virtio_pci_legacy.o
 obj-$(CONFIG_VIRTIO_BALLOON) += virtio_balloon.o
 obj-$(CONFIG_VIRTIO_INPUT) += virtio_input.o
+obj-$(CONFIG_VIRTIO_PSTORE) += virtio_pstore.o
diff --git a/drivers/virtio/virtio_pstore.c b/drivers/virtio/virtio_pstore.c
new file mode 100644
index 000000000000..0a63c7db4278
--- /dev/null
+++ b/drivers/virtio/virtio_pstore.c
@@ -0,0 +1,417 @@
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/pstore.h>
+#include <linux/virtio.h>
+#include <linux/virtio_config.h>
+#include <uapi/linux/virtio_ids.h>
+#include <uapi/linux/virtio_pstore.h>
+
+#define VIRT_PSTORE_ORDER    2
+#define VIRT_PSTORE_BUFSIZE  (4096 << VIRT_PSTORE_ORDER)
+#define VIRT_PSTORE_NR_REQ   128
+
+struct virtio_pstore {
+	struct virtio_device	*vdev;
+	struct virtqueue	*vq[2];
+	struct pstore_info	 pstore;
+	struct virtio_pstore_req req[VIRT_PSTORE_NR_REQ];
+	struct virtio_pstore_res res[VIRT_PSTORE_NR_REQ];
+	unsigned int		 req_id;
+
+	/* Waiting for host to ack */
+	wait_queue_head_t	acked;
+	int			failed;
+};
+
+#define TYPE_TABLE_ENTRY(_entry)				\
+	{ PSTORE_TYPE_##_entry, VIRTIO_PSTORE_TYPE_##_entry }
+
+struct type_table {
+	int pstore;
+	u16 virtio;
+} type_table[] = {
+	TYPE_TABLE_ENTRY(DMESG),
+};
+
+#undef TYPE_TABLE_ENTRY
+
+
+static u16 to_virtio_type(struct virtio_pstore *vps, enum pstore_type_id type)
+{
+	unsigned int i;
+
+	for (i = 0; i < ARRAY_SIZE(type_table); i++) {
+		if (type == type_table[i].pstore)
+			return cpu_to_virtio16(vps->vdev, type_table[i].virtio);
+	}
+
+	return cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_TYPE_UNKNOWN);
+}
+
+static enum pstore_type_id from_virtio_type(struct virtio_pstore *vps, u16 type)
+{
+	unsigned int i;
+
+	for (i = 0; i < ARRAY_SIZE(type_table); i++) {
+		if (virtio16_to_cpu(vps->vdev, type) == type_table[i].virtio)
+			return type_table[i].pstore;
+	}
+
+	return PSTORE_TYPE_UNKNOWN;
+}
+
+static void virtpstore_ack(struct virtqueue *vq)
+{
+	struct virtio_pstore *vps = vq->vdev->priv;
+
+	wake_up(&vps->acked);
+}
+
+static void virtpstore_check(struct virtqueue *vq)
+{
+	struct virtio_pstore *vps = vq->vdev->priv;
+	struct virtio_pstore_res *res;
+	unsigned int len;
+
+	res = virtqueue_get_buf(vq, &len);
+	if (res == NULL)
+		return;
+
+	if ((s32)virtio32_to_cpu(vq->vdev, res->ret) < 0)
+		vps->failed = 1;
+}
+
+static void virt_pstore_get_reqs(struct virtio_pstore *vps,
+				 struct virtio_pstore_req **preq,
+				 struct virtio_pstore_res **pres)
+{
+	unsigned int idx = vps->req_id++ % VIRT_PSTORE_NR_REQ;
+
+	*preq = &vps->req[idx];
+	*pres = &vps->res[idx];
+
+	memset(*preq, 0, sizeof(**preq));
+	memset(*pres, 0, sizeof(**pres));
+}
+
+static int virt_pstore_open(struct pstore_info *psi)
+{
+	struct virtio_pstore *vps = psi->data;
+	struct virtio_pstore_req *req;
+	struct virtio_pstore_res *res;
+	struct scatterlist sgo[1], sgi[1];
+	struct scatterlist *sgs[2] = { sgo, sgi };
+	unsigned int len;
+
+	virt_pstore_get_reqs(vps, &req, &res);
+
+	req->cmd = cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_CMD_OPEN);
+
+	sg_init_one(sgo, req, sizeof(*req));
+	sg_init_one(sgi, res, sizeof(*res));
+	virtqueue_add_sgs(vps->vq[0], sgs, 1, 1, vps, GFP_KERNEL);
+	virtqueue_kick(vps->vq[0]);
+
+	wait_event(vps->acked, virtqueue_get_buf(vps->vq[0], &len));
+	return virtio32_to_cpu(vps->vdev, res->ret);
+}
+
+static int virt_pstore_close(struct pstore_info *psi)
+{
+	struct virtio_pstore *vps = psi->data;
+	struct virtio_pstore_req *req;
+	struct virtio_pstore_res *res;
+	struct scatterlist sgo[1], sgi[1];
+	struct scatterlist *sgs[2] = { sgo, sgi };
+	unsigned int len;
+
+	virt_pstore_get_reqs(vps, &req, &res);
+
+	req->cmd = cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_CMD_CLOSE);
+
+	sg_init_one(sgo, req, sizeof(*req));
+	sg_init_one(sgi, res, sizeof(*res));
+	virtqueue_add_sgs(vps->vq[0], sgs, 1, 1, vps, GFP_KERNEL);
+	virtqueue_kick(vps->vq[0]);
+
+	wait_event(vps->acked, virtqueue_get_buf(vps->vq[0], &len));
+	return virtio32_to_cpu(vps->vdev, res->ret);
+}
+
+static ssize_t virt_pstore_read(u64 *id, enum pstore_type_id *type,
+				int *count, struct timespec *time,
+				char **buf, bool *compressed,
+				ssize_t *ecc_notice_size,
+				struct pstore_info *psi)
+{
+	struct virtio_pstore *vps = psi->data;
+	struct virtio_pstore_req *req;
+	struct virtio_pstore_res *res;
+	struct virtio_pstore_fileinfo info;
+	struct scatterlist sgo[1], sgi[3];
+	struct scatterlist *sgs[2] = { sgo, sgi };
+	unsigned int len;
+	unsigned int flags;
+	int ret;
+	void *bf;
+
+	virt_pstore_get_reqs(vps, &req, &res);
+
+	req->cmd = cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_CMD_READ);
+
+	sg_init_one(sgo, req, sizeof(*req));
+	sg_init_table(sgi, 3);
+	sg_set_buf(&sgi[0], res, sizeof(*res));
+	sg_set_buf(&sgi[1], &info, sizeof(info));
+	sg_set_buf(&sgi[2], psi->buf, psi->bufsize);
+	virtqueue_add_sgs(vps->vq[0], sgs, 1, 1, vps, GFP_KERNEL);
+	virtqueue_kick(vps->vq[0]);
+
+	wait_event(vps->acked, virtqueue_get_buf(vps->vq[0], &len));
+	if (len < sizeof(*res) + sizeof(info))
+		return -1;
+
+	ret = virtio32_to_cpu(vps->vdev, res->ret);
+	if (ret < 0)
+		return ret;
+
+	len = virtio32_to_cpu(vps->vdev, info.len);
+
+	bf = kmalloc(len, GFP_KERNEL);
+	if (bf == NULL)
+		return -ENOMEM;
+
+	*id    = virtio64_to_cpu(vps->vdev, info.id);
+	*type  = from_virtio_type(vps, info.type);
+	*count = virtio32_to_cpu(vps->vdev, info.count);
+
+	flags = virtio32_to_cpu(vps->vdev, info.flags);
+	*compressed = flags & VIRTIO_PSTORE_FL_COMPRESSED;
+
+	time->tv_sec  = virtio64_to_cpu(vps->vdev, info.time_sec);
+	time->tv_nsec = virtio32_to_cpu(vps->vdev, info.time_nsec);
+
+	memcpy(bf, psi->buf, len);
+	*buf = bf;
+
+	return len;
+}
+
+static int notrace virt_pstore_write(enum pstore_type_id type,
+				     enum kmsg_dump_reason reason,
+				     u64 *id, unsigned int part, int count,
+				     bool compressed, size_t size,
+				     struct pstore_info *psi)
+{
+	struct virtio_pstore *vps = psi->data;
+	struct virtio_pstore_req *req;
+	struct virtio_pstore_res *res;
+	struct scatterlist sgo[2], sgi[1];
+	struct scatterlist *sgs[2] = { sgo, sgi };
+	unsigned int flags = compressed ? VIRTIO_PSTORE_FL_COMPRESSED : 0;
+
+	if (vps->failed)
+		return -1;
+
+	*id = vps->req_id;
+	virt_pstore_get_reqs(vps, &req, &res);
+
+	req->cmd   = cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_CMD_WRITE);
+	req->type  = to_virtio_type(vps, type);
+	req->flags = cpu_to_virtio32(vps->vdev, flags);
+
+	sg_init_table(sgo, 2);
+	sg_set_buf(&sgo[0], req, sizeof(*req));
+	sg_set_buf(&sgo[1], pstore_get_buf(psi), size);
+	sg_init_one(sgi, res, sizeof(*res));
+	virtqueue_add_sgs(vps->vq[1], sgs, 1, 1, vps, GFP_ATOMIC);
+	virtqueue_kick(vps->vq[1]);
+
+	return 0;
+}
+
+static int virt_pstore_erase(enum pstore_type_id type, u64 id, int count,
+			     struct timespec time, struct pstore_info *psi)
+{
+	struct virtio_pstore *vps = psi->data;
+	struct virtio_pstore_req *req;
+	struct virtio_pstore_res *res;
+	struct scatterlist sgo[1], sgi[1];
+	struct scatterlist *sgs[2] = { sgo, sgi };
+	unsigned int len;
+
+	virt_pstore_get_reqs(vps, &req, &res);
+
+	req->cmd   = cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_CMD_ERASE);
+	req->type  = to_virtio_type(vps, type);
+	req->id	   = cpu_to_virtio64(vps->vdev, id);
+	req->count = cpu_to_virtio32(vps->vdev, count);
+
+	sg_init_one(sgo, req, sizeof(*req));
+	sg_init_one(sgi, res, sizeof(*res));
+	virtqueue_add_sgs(vps->vq[0], sgs, 1, 1, vps, GFP_KERNEL);
+	virtqueue_kick(vps->vq[0]);
+
+	wait_event(vps->acked, virtqueue_get_buf(vps->vq[0], &len));
+	return virtio32_to_cpu(vps->vdev, res->ret);
+}
+
+static int virt_pstore_init(struct virtio_pstore *vps)
+{
+	struct pstore_info *psinfo = &vps->pstore;
+	int err;
+
+	if (!psinfo->bufsize)
+		psinfo->bufsize = VIRT_PSTORE_BUFSIZE;
+
+	psinfo->buf = alloc_pages_exact(psinfo->bufsize, GFP_KERNEL);
+	if (!psinfo->buf) {
+		pr_err("cannot allocate pstore buffer\n");
+		return -ENOMEM;
+	}
+
+	psinfo->owner = THIS_MODULE;
+	psinfo->name  = "virtio";
+	psinfo->open  = virt_pstore_open;
+	psinfo->close = virt_pstore_close;
+	psinfo->read  = virt_pstore_read;
+	psinfo->erase = virt_pstore_erase;
+	psinfo->write = virt_pstore_write;
+	psinfo->flags = PSTORE_FLAGS_DMESG;
+
+	psinfo->data  = vps;
+	spin_lock_init(&psinfo->buf_lock);
+
+	err = pstore_register(psinfo);
+	if (err)
+		free_pages_exact(psinfo->buf, psinfo->bufsize);
+
+	return err;
+}
+
+static int virt_pstore_exit(struct virtio_pstore *vps)
+{
+	struct pstore_info *psinfo = &vps->pstore;
+
+	pstore_unregister(psinfo);
+
+	free_pages_exact(psinfo->buf, psinfo->bufsize);
+	psinfo->buf = NULL;
+	psinfo->bufsize = 0;
+
+	return 0;
+}
+
+static int virtpstore_init_vqs(struct virtio_pstore *vps)
+{
+	vq_callback_t *callbacks[] = { virtpstore_ack, virtpstore_check };
+	const char *names[] = { "pstore_read", "pstore_write" };
+
+	return vps->vdev->config->find_vqs(vps->vdev, 2, vps->vq,
+					   callbacks, names);
+}
+
+static void virtpstore_init_config(struct virtio_pstore *vps)
+{
+	u32 bufsize;
+
+	virtio_cread(vps->vdev, struct virtio_pstore_config, bufsize, &bufsize);
+
+	vps->pstore.bufsize = PAGE_ALIGN(bufsize);
+}
+
+static void virtpstore_confirm_config(struct virtio_pstore *vps)
+{
+	u32 bufsize = vps->pstore.bufsize;
+
+	virtio_cwrite(vps->vdev, struct virtio_pstore_config, bufsize,
+		     &bufsize);
+}
+
+static int virtpstore_probe(struct virtio_device *vdev)
+{
+	struct virtio_pstore *vps;
+	int err;
+
+	if (!vdev->config->get) {
+		dev_err(&vdev->dev, "driver init: config access disabled\n");
+		return -EINVAL;
+	}
+
+	vdev->priv = vps = kzalloc(sizeof(*vps), GFP_KERNEL);
+	if (!vps) {
+		err = -ENOMEM;
+		goto out;
+	}
+	vps->vdev = vdev;
+
+	err = virtpstore_init_vqs(vps);
+	if (err < 0)
+		goto out_free;
+
+	virtpstore_init_config(vps);
+
+	err = virt_pstore_init(vps);
+	if (err)
+		goto out_del_vq;
+
+	virtpstore_confirm_config(vps);
+
+	init_waitqueue_head(&vps->acked);
+
+	virtio_device_ready(vdev);
+
+	dev_info(&vdev->dev, "driver init: ok (bufsize = %luK, flags = %x)\n",
+		 vps->pstore.bufsize >> 10, vps->pstore.flags);
+
+	return 0;
+
+out_del_vq:
+	vdev->config->del_vqs(vdev);
+out_free:
+	kfree(vps);
+out:
+	dev_err(&vdev->dev, "driver init: failed with %d\n", err);
+	return err;
+}
+
+static void virtpstore_remove(struct virtio_device *vdev)
+{
+	struct virtio_pstore *vps = vdev->priv;
+
+	virt_pstore_exit(vps);
+
+	/* Now we reset the device so we can clean up the queues. */
+	vdev->config->reset(vdev);
+
+	vdev->config->del_vqs(vdev);
+
+	kfree(vps);
+}
+
+static unsigned int features[] = {
+};
+
+static struct virtio_device_id id_table[] = {
+	{ VIRTIO_ID_PSTORE, VIRTIO_DEV_ANY_ID },
+	{ 0 },
+};
+
+static struct virtio_driver virtio_pstore_driver = {
+	.driver.name         = KBUILD_MODNAME,
+	.driver.owner        = THIS_MODULE,
+	.feature_table       = features,
+	.feature_table_size  = ARRAY_SIZE(features),
+	.id_table            = id_table,
+	.probe               = virtpstore_probe,
+	.remove              = virtpstore_remove,
+};
+
+module_virtio_driver(virtio_pstore_driver);
+MODULE_DEVICE_TABLE(virtio, id_table);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Namhyung Kim <namhyung@kernel.org>");
+MODULE_DESCRIPTION("Virtio pstore driver");
diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild
index 6d4e92ccdc91..9bbb1554d8b2 100644
--- a/include/uapi/linux/Kbuild
+++ b/include/uapi/linux/Kbuild
@@ -449,6 +449,7 @@ header-y += virtio_ids.h
 header-y += virtio_input.h
 header-y += virtio_net.h
 header-y += virtio_pci.h
+header-y += virtio_pstore.h
 header-y += virtio_ring.h
 header-y += virtio_rng.h
 header-y += virtio_scsi.h
diff --git a/include/uapi/linux/virtio_ids.h b/include/uapi/linux/virtio_ids.h
index 77925f587b15..c72a9ab588c0 100644
--- a/include/uapi/linux/virtio_ids.h
+++ b/include/uapi/linux/virtio_ids.h
@@ -41,5 +41,6 @@
 #define VIRTIO_ID_CAIF	       12 /* Virtio caif */
 #define VIRTIO_ID_GPU          16 /* virtio GPU */
 #define VIRTIO_ID_INPUT        18 /* virtio input */
+#define VIRTIO_ID_PSTORE       22 /* virtio pstore */
 
 #endif /* _LINUX_VIRTIO_IDS_H */
diff --git a/include/uapi/linux/virtio_pstore.h b/include/uapi/linux/virtio_pstore.h
new file mode 100644
index 000000000000..f4b0d204d8ae
--- /dev/null
+++ b/include/uapi/linux/virtio_pstore.h
@@ -0,0 +1,74 @@
+#ifndef _LINUX_VIRTIO_PSTORE_H
+#define _LINUX_VIRTIO_PSTORE_H
+/* This header is BSD licensed so anyone can use the definitions to implement
+ * compatible drivers/servers.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ * 3. Neither the name of IBM nor the names of its contributors
+ *    may be used to endorse or promote products derived from this software
+ *    without specific prior written permission.
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL IBM OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE. */
+#include <linux/types.h>
+#include <linux/virtio_types.h>
+
+#define VIRTIO_PSTORE_CMD_NULL   0
+#define VIRTIO_PSTORE_CMD_OPEN   1
+#define VIRTIO_PSTORE_CMD_READ   2
+#define VIRTIO_PSTORE_CMD_WRITE  3
+#define VIRTIO_PSTORE_CMD_ERASE  4
+#define VIRTIO_PSTORE_CMD_CLOSE  5
+
+#define VIRTIO_PSTORE_TYPE_UNKNOWN  0
+#define VIRTIO_PSTORE_TYPE_DMESG    1
+
+#define VIRTIO_PSTORE_FL_COMPRESSED  1
+
+struct virtio_pstore_req {
+	__virtio16		cmd;
+	__virtio16		type;
+	__virtio32		flags;
+	__virtio64		id;
+	__virtio32		count;
+	__virtio32		reserved;
+};
+
+struct virtio_pstore_res {
+	__virtio16		cmd;
+	__virtio16		type;
+	__virtio32		ret;
+};
+
+struct virtio_pstore_fileinfo {
+	__virtio64		id;
+	__virtio32		count;
+	__virtio16		type;
+	__virtio16		unused;
+	__virtio32		flags;
+	__virtio32		len;
+	__virtio64		time_sec;
+	__virtio32		time_nsec;
+	__virtio32		reserved;
+};
+
+struct virtio_pstore_config {
+	__virtio32		bufsize;
+};
+
+#endif /* _LINUX_VIRTIO_PSTORE_H */
-- 
2.9.3


* [PATCH 2/3] qemu: Implement virtio-pstore device
  2016-08-20  8:07 [RFC/PATCHSET 0/3] virtio: Implement virtio pstore device (v3) Namhyung Kim
  2016-08-20  8:07 ` [PATCH 1/3] virtio: Basic implementation of virtio pstore driver Namhyung Kim
@ 2016-08-20  8:07 ` Namhyung Kim
  2016-08-24 22:00   ` Daniel P. Berrange
  2016-09-13 15:57   ` Michael S. Tsirkin
  2016-08-20  8:07 ` [PATCH 3/3] kvmtool: " Namhyung Kim
  2016-08-23 10:25 ` [RFC/PATCHSET 0/3] virtio: Implement virtio pstore device (v3) Joel Fernandes
  3 siblings, 2 replies; 31+ messages in thread
From: Namhyung Kim @ 2016-08-20  8:07 UTC (permalink / raw)
  To: virtio-dev, kvm, qemu-devel, virtualization
  Cc: LKML, Paolo Bonzini, Radim Krčmář,
	Michael S. Tsirkin, Anthony Liguori, Anton Vorontsov,
	Colin Cross, Kees Cook, Tony Luck, Steven Rostedt, Ingo Molnar,
	Minchan Kim, Daniel P . Berrange

Add a virtio pstore device to allow kernel log files to be saved on the
host.  It saves the log files in the directory given by the pstore device
option.

  $ qemu-system-x86_64 -device virtio-pstore,directory=dir-xx ...

  (guest) # echo c > /proc/sysrq-trigger

  $ ls dir-xx
  dmesg-1.enc.z  dmesg-2.enc.z

The log files are usually compressed using zlib.  Users can see the log
messages directly on the host or on the guest (using pstore filesystem).

The 'directory' property is required for the virtio-pstore device to work.
It also adds a 'bufsize' property to set the size of the pstore buffer.
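
For reference, the file names shown above follow a "<directory>/<prefix><id>"
pattern with an ".enc.z" suffix for compressed records.  A minimal
standalone sketch of that naming rule is given below; pstore_filename(),
file_prefix and PSTORE_FL_COMPRESSED are hypothetical names for this
example, and it uses snprintf() into a caller-supplied buffer rather than
the glib allocation used in the device code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define PSTORE_FL_COMPRESSED 1u  /* mirrors VIRTIO_PSTORE_FL_COMPRESSED */

/* The index matches the record type: 0 = unknown, 1 = dmesg. */
static const char *const file_prefix[] = { "unknown-", "dmesg-" };

/*
 * Hypothetical helper: format "<dir>/<prefix><id>[.enc.z]" into buf.
 * Returns the length snprintf() reports, or -1 on invalid arguments.
 * Types outside the prefix table fall back to "unknown-".
 */
int pstore_filename(char *buf, size_t len, const char *dir,
		    unsigned int type, unsigned long long id,
		    unsigned int flags)
{
	const char *prefix = "unknown-";

	if (buf == NULL || dir == NULL)
		return -1;
	if (type < sizeof(file_prefix) / sizeof(file_prefix[0]))
		prefix = file_prefix[type];

	return snprintf(buf, len, "%s/%s%llu%s", dir, prefix, id,
			(flags & PSTORE_FL_COMPRESSED) ? ".enc.z" : "");
}
```

Bounding the type against the table before indexing matters here because
the type value comes from the guest.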

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Anthony Liguori <aliguori@amazon.com>
Cc: Anton Vorontsov <anton@enomsg.org>
Cc: Colin Cross <ccross@android.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Daniel P. Berrange <berrange@redhat.com>
Cc: kvm@vger.kernel.org
Cc: qemu-devel@nongnu.org
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 hw/virtio/Makefile.objs                        |   2 +-
 hw/virtio/virtio-pci.c                         |  52 ++
 hw/virtio/virtio-pci.h                         |  14 +
 hw/virtio/virtio-pstore.c                      | 699 +++++++++++++++++++++++++
 include/hw/pci/pci.h                           |   1 +
 include/hw/virtio/virtio-pstore.h              |  36 ++
 include/standard-headers/linux/virtio_ids.h    |   1 +
 include/standard-headers/linux/virtio_pstore.h |  76 +++
 qdev-monitor.c                                 |   1 +
 9 files changed, 881 insertions(+), 1 deletion(-)
 create mode 100644 hw/virtio/virtio-pstore.c
 create mode 100644 include/hw/virtio/virtio-pstore.h
 create mode 100644 include/standard-headers/linux/virtio_pstore.h

diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs
index 3e2b175..aae7082 100644
--- a/hw/virtio/Makefile.objs
+++ b/hw/virtio/Makefile.objs
@@ -4,4 +4,4 @@ common-obj-y += virtio-bus.o
 common-obj-y += virtio-mmio.o
 
 obj-y += virtio.o virtio-balloon.o 
-obj-$(CONFIG_LINUX) += vhost.o vhost-backend.o vhost-user.o
+obj-$(CONFIG_LINUX) += vhost.o vhost-backend.o vhost-user.o virtio-pstore.o
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 755f921..c184823 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -2416,6 +2416,57 @@ static const TypeInfo virtio_host_pci_info = {
 };
 #endif
 
+/* virtio-pstore-pci */
+
+static void virtio_pstore_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
+{
+    VirtIOPstorePCI *vps = VIRTIO_PSTORE_PCI(vpci_dev);
+    DeviceState *vdev = DEVICE(&vps->vdev);
+    Error *err = NULL;
+
+    qdev_set_parent_bus(vdev, BUS(&vpci_dev->bus));
+    object_property_set_bool(OBJECT(vdev), true, "realized", &err);
+    if (err) {
+        error_propagate(errp, err);
+        return;
+    }
+}
+
+static void virtio_pstore_pci_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+    VirtioPCIClass *k = VIRTIO_PCI_CLASS(klass);
+    PCIDeviceClass *pcidev_k = PCI_DEVICE_CLASS(klass);
+
+    k->realize = virtio_pstore_pci_realize;
+    set_bit(DEVICE_CATEGORY_MISC, dc->categories);
+
+    pcidev_k->vendor_id = PCI_VENDOR_ID_REDHAT_QUMRANET;
+    pcidev_k->device_id = PCI_DEVICE_ID_VIRTIO_PSTORE;
+    pcidev_k->revision = VIRTIO_PCI_ABI_VERSION;
+    pcidev_k->class_id = PCI_CLASS_OTHERS;
+}
+
+static void virtio_pstore_pci_instance_init(Object *obj)
+{
+    VirtIOPstorePCI *dev = VIRTIO_PSTORE_PCI(obj);
+
+    virtio_instance_init_common(obj, &dev->vdev, sizeof(dev->vdev),
+                                TYPE_VIRTIO_PSTORE);
+    object_property_add_alias(obj, "directory", OBJECT(&dev->vdev),
+                              "directory", &error_abort);
+    object_property_add_alias(obj, "bufsize", OBJECT(&dev->vdev),
+                              "bufsize", &error_abort);
+}
+
+static const TypeInfo virtio_pstore_pci_info = {
+    .name          = TYPE_VIRTIO_PSTORE_PCI,
+    .parent        = TYPE_VIRTIO_PCI,
+    .instance_size = sizeof(VirtIOPstorePCI),
+    .instance_init = virtio_pstore_pci_instance_init,
+    .class_init    = virtio_pstore_pci_class_init,
+};
+
 /* virtio-pci-bus */
 
 static void virtio_pci_bus_new(VirtioBusState *bus, size_t bus_size,
@@ -2485,6 +2536,7 @@ static void virtio_pci_register_types(void)
 #ifdef CONFIG_VHOST_SCSI
     type_register_static(&vhost_scsi_pci_info);
 #endif
+    type_register_static(&virtio_pstore_pci_info);
 }
 
 type_init(virtio_pci_register_types)
diff --git a/hw/virtio/virtio-pci.h b/hw/virtio/virtio-pci.h
index 25fbf8a..354b2b7 100644
--- a/hw/virtio/virtio-pci.h
+++ b/hw/virtio/virtio-pci.h
@@ -31,6 +31,7 @@
 #ifdef CONFIG_VHOST_SCSI
 #include "hw/virtio/vhost-scsi.h"
 #endif
+#include "hw/virtio/virtio-pstore.h"
 
 typedef struct VirtIOPCIProxy VirtIOPCIProxy;
 typedef struct VirtIOBlkPCI VirtIOBlkPCI;
@@ -44,6 +45,7 @@ typedef struct VirtIOInputPCI VirtIOInputPCI;
 typedef struct VirtIOInputHIDPCI VirtIOInputHIDPCI;
 typedef struct VirtIOInputHostPCI VirtIOInputHostPCI;
 typedef struct VirtIOGPUPCI VirtIOGPUPCI;
+typedef struct VirtIOPstorePCI VirtIOPstorePCI;
 
 /* virtio-pci-bus */
 
@@ -324,6 +326,18 @@ struct VirtIOGPUPCI {
     VirtIOGPU vdev;
 };
 
+/*
+ * virtio-pstore-pci: This extends VirtioPCIProxy.
+ */
+#define TYPE_VIRTIO_PSTORE_PCI "virtio-pstore-pci"
+#define VIRTIO_PSTORE_PCI(obj) \
+        OBJECT_CHECK(VirtIOPstorePCI, (obj), TYPE_VIRTIO_PSTORE_PCI)
+
+struct VirtIOPstorePCI {
+    VirtIOPCIProxy parent_obj;
+    VirtIOPstore vdev;
+};
+
 /* Virtio ABI version, if we increment this, we break the guest driver. */
 #define VIRTIO_PCI_ABI_VERSION          0
 
diff --git a/hw/virtio/virtio-pstore.c b/hw/virtio/virtio-pstore.c
new file mode 100644
index 0000000..b8fb4be
--- /dev/null
+++ b/hw/virtio/virtio-pstore.c
@@ -0,0 +1,699 @@
+/*
+ * Virtio Pstore Device
+ *
+ * Copyright (C) 2016  LG Electronics
+ *
+ * Authors:
+ *  Namhyung Kim  <namhyung@gmail.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include <stdio.h>
+
+#include "qemu/osdep.h"
+#include "qemu/iov.h"
+#include "qemu-common.h"
+#include "qemu/cutils.h"
+#include "qemu/error-report.h"
+#include "sysemu/kvm.h"
+#include "qapi/visitor.h"
+#include "qapi-event.h"
+#include "io/channel-util.h"
+#include "trace.h"
+
+#include "hw/virtio/virtio.h"
+#include "hw/virtio/virtio-bus.h"
+#include "hw/virtio/virtio-access.h"
+#include "hw/virtio/virtio-pstore.h"
+
+#define PSTORE_DEFAULT_BUFSIZE   (16 * 1024)
+#define PSTORE_DEFAULT_FILE_MAX  5
+
+/* the index must match the type value */
+static const char *virtio_pstore_file_prefix[] = {
+    "unknown-",		/* VIRTIO_PSTORE_TYPE_UNKNOWN */
+    "dmesg-",		/* VIRTIO_PSTORE_TYPE_DMESG */
+};
+
+static char *virtio_pstore_to_filename(VirtIOPstore *s,
+                                       struct virtio_pstore_req *req)
+{
+    const char *basename;
+    unsigned long long id;
+    unsigned int type = le16_to_cpu(req->type);
+    unsigned int flags = le32_to_cpu(req->flags);
+
+    if (type < ARRAY_SIZE(virtio_pstore_file_prefix)) {
+        basename = virtio_pstore_file_prefix[type];
+    } else {
+        basename = "unknown-";
+    }
+
+    id = s->id++;
+    return g_strdup_printf("%s/%s%llu%s", s->directory, basename, id,
+                            flags & VIRTIO_PSTORE_FL_COMPRESSED ? ".enc.z" : "");
+}
+
+static char *virtio_pstore_from_filename(VirtIOPstore *s, char *name,
+                                         struct virtio_pstore_fileinfo *info)
+{
+    char *filename;
+    unsigned int idx;
+
+    filename = g_strdup_printf("%s/%s", s->directory, name);
+    if (filename == NULL)
+        return NULL;
+
+    for (idx = 0; idx < ARRAY_SIZE(virtio_pstore_file_prefix); idx++) {
+        if (g_str_has_prefix(name, virtio_pstore_file_prefix[idx])) {
+            info->type = idx;
+            name += strlen(virtio_pstore_file_prefix[idx]);
+            break;
+        }
+    }
+
+    if (idx == ARRAY_SIZE(virtio_pstore_file_prefix)) {
+        g_free(filename);
+        return NULL;
+    }
+
+    qemu_strtoull(name, NULL, 0, &info->id);
+
+    info->flags = 0;
+    if (g_str_has_suffix(name, ".enc.z")) {
+        info->flags |= VIRTIO_PSTORE_FL_COMPRESSED;
+    }
+
+    return filename;
+}
+
+static int prefix_idx;
+static int prefix_count;
+static int prefix_len;
+
+static int filter_pstore(const struct dirent *de)
+{
+    int i;
+
+    for (i = 0; i < prefix_count; i++) {
+        const char *prefix = virtio_pstore_file_prefix[prefix_idx + i];
+
+        if (g_str_has_prefix(de->d_name, prefix)) {
+            return 1;
+        }
+    }
+    return 0;
+}
+
+static int sort_pstore(const struct dirent **a, const struct dirent **b)
+{
+    uint64_t id_a, id_b;
+
+    qemu_strtoull((*a)->d_name + prefix_len, NULL, 0, &id_a);
+    qemu_strtoull((*b)->d_name + prefix_len, NULL, 0, &id_b);
+
+    return (id_a > id_b) - (id_a < id_b);
+}
+
+static int rotate_pstore_file(VirtIOPstore *s, unsigned short type)
+{
+    int ret = 0;
+    int i, num;
+    char *filename;
+    struct dirent **files;
+
+    if (type >= ARRAY_SIZE(virtio_pstore_file_prefix)) {
+        type = VIRTIO_PSTORE_TYPE_UNKNOWN;
+    }
+
+    prefix_idx = type;
+    prefix_len = strlen(virtio_pstore_file_prefix[type]);
+    prefix_count = 1;  /* only scan current type */
+
+    /* delete the oldest file in the same type */
+    num = scandir(s->directory, &files, filter_pstore, sort_pstore);
+    if (num < 0)
+        return num;
+    if (num < (int)s->file_max)
+        goto out;
+
+    filename = g_strdup_printf("%s/%s", s->directory, files[0]->d_name);
+    if (filename == NULL) {
+        ret = -1;
+        goto out;
+    }
+
+    ret = unlink(filename);
+    g_free(filename);
+out:
+    for (i = 0; i < num; i++) {
+        free(files[i]);    /* allocated by scandir() with malloc() */
+    }
+    free(files);
+
+    return ret;
+}
+
+static ssize_t virtio_pstore_do_open(VirtIOPstore *s)
+{
+    /* scan all pstore files */
+    prefix_idx = 0;
+    prefix_count = ARRAY_SIZE(virtio_pstore_file_prefix);
+
+    s->file_idx = 0;
+    s->num_file = scandir(s->directory, &s->files, filter_pstore, alphasort);
+
+    return s->num_file >= 0 ? 0 : -1;
+}
+
+static ssize_t virtio_pstore_do_close(VirtIOPstore *s)
+{
+    int i;
+
+    for (i = 0; i < s->num_file; i++) {
+        free(s->files[i]);    /* allocated by scandir() with malloc() */
+    }
+    free(s->files);
+    s->files = NULL;
+
+    s->num_file = 0;
+    return 0;
+}
+
+static ssize_t virtio_pstore_do_erase(VirtIOPstore *s,
+                                      struct virtio_pstore_req *req)
+{
+    char *filename;
+    int ret;
+
+    filename = virtio_pstore_to_filename(s, req);
+    if (filename == NULL)
+        return -1;
+
+    ret = unlink(filename);
+
+    g_free(filename);
+    return ret;
+}
+
+struct pstore_read_arg {
+    VirtIOPstore *vps;
+    VirtQueueElement *elem;
+    struct virtio_pstore_fileinfo info;
+    QIOChannel *ioc;
+};
+
+static gboolean pstore_async_read_fn(QIOChannel *ioc, GIOCondition condition,
+                                     gpointer data)
+{
+    struct pstore_read_arg *rarg = data;
+    struct virtio_pstore_fileinfo *info = &rarg->info;
+    VirtIOPstore *vps = rarg->vps;
+    VirtQueueElement *elem = rarg->elem;
+    struct virtio_pstore_res res;
+    size_t offset = sizeof(res) + sizeof(*info);
+    struct iovec *sg = elem->in_sg;
+    unsigned int sg_num = elem->in_num;
+    Error *err = NULL;
+    ssize_t len;
+    int ret;
+
+    /* skip res and fileinfo */
+    iov_discard_front(&sg, &sg_num, sizeof(res) + sizeof(*info));
+
+    len = qio_channel_readv(rarg->ioc, sg, sg_num, &err);
+    if (len < 0) {
+        if (errno == EAGAIN) {
+            len = 0;
+        }
+        ret = -1;
+    } else {
+        info->len = cpu_to_le32(len);
+        ret = 0;
+    }
+
+    res.cmd  = cpu_to_le16(VIRTIO_PSTORE_CMD_READ);
+    res.type = cpu_to_le16(VIRTIO_PSTORE_TYPE_UNKNOWN);
+    res.ret  = cpu_to_le32(ret);
+
+    /* now copy res and fileinfo */
+    iov_from_buf(elem->in_sg, elem->in_num, 0, &res, sizeof(res));
+    iov_from_buf(elem->in_sg, elem->in_num, sizeof(res), info, sizeof(*info));
+
+    len += offset;
+    virtqueue_push(vps->rvq, elem, len);
+    virtio_notify(VIRTIO_DEVICE(vps), vps->rvq);
+
+    return G_SOURCE_REMOVE;
+}
+
+static void free_rarg_fn(gpointer data)
+{
+    struct pstore_read_arg *rarg = data;
+
+    qio_channel_close(rarg->ioc, NULL);
+
+    g_free(rarg->elem);
+    g_free(rarg);
+}
+
+static ssize_t virtio_pstore_do_read(VirtIOPstore *s, VirtQueueElement *elem)
+{
+    char *filename = NULL;
+    int fd, idx;
+    struct stat stbuf;
+    struct pstore_read_arg *rarg = NULL;
+    Error *err = NULL;
+    int ret = -1;
+
+    if (s->file_idx >= s->num_file) {
+        return 0;
+    }
+
+    rarg = g_malloc0(sizeof(*rarg));
+    if (rarg == NULL) {
+        return -1;
+    }
+
+    idx = s->file_idx++;
+    filename = virtio_pstore_from_filename(s, s->files[idx]->d_name,
+                                           &rarg->info);
+    if (filename == NULL) {
+        goto out;
+    }
+
+    fd = open(filename, O_RDONLY);
+    if (fd < 0) {
+        error_report("cannot open %s", filename);
+        goto out;
+    }
+
+    if (fstat(fd, &stbuf) < 0) {
+        goto out;
+    }
+
+    rarg->vps            = s;
+    rarg->elem           = elem;
+    rarg->info.id        = cpu_to_le64(rarg->info.id);
+    rarg->info.type      = cpu_to_le16(rarg->info.type);
+    rarg->info.flags     = cpu_to_le32(rarg->info.flags);
+    rarg->info.time_sec  = cpu_to_le64(stbuf.st_ctim.tv_sec);
+    rarg->info.time_nsec = cpu_to_le32(stbuf.st_ctim.tv_nsec);
+
+    rarg->ioc = qio_channel_new_fd(fd, &err);
+    if (err) {
+        error_reportf_err(err, "cannot create io channel: ");
+        goto out;
+    }
+
+    qio_channel_set_blocking(rarg->ioc, false, &err);
+    qio_channel_add_watch(rarg->ioc, G_IO_IN, pstore_async_read_fn, rarg,
+                          free_rarg_fn);
+    g_free(filename);
+    return 1;
+
+out:
+    g_free(filename);
+    g_free(rarg);
+
+    return ret;
+}
+
+struct pstore_write_arg {
+    VirtIOPstore *vps;
+    VirtQueueElement *elem;
+    struct virtio_pstore_req req;    /* private copy of the request */
+    QIOChannel *ioc;
+};
+
+static gboolean pstore_async_write_fn(QIOChannel *ioc, GIOCondition condition,
+                                      gpointer data)
+{
+    struct pstore_write_arg *warg = data;
+    VirtIOPstore *vps = warg->vps;
+    VirtQueueElement *elem = warg->elem;
+    struct iovec *sg = elem->out_sg;
+    unsigned int sg_num = elem->out_num;
+    struct virtio_pstore_res res;
+    Error *err = NULL;
+    ssize_t len;
+    int ret;
+
+    /* we already consumed the req */
+    iov_discard_front(&sg, &sg_num, sizeof(warg->req));
+
+    len = qio_channel_writev(warg->ioc, sg, sg_num, &err);
+    if (len < 0) {
+        ret = -1;
+    } else {
+        ret = 0;
+    }
+
+    res.cmd  = cpu_to_le16(VIRTIO_PSTORE_CMD_WRITE);
+    res.type = warg->req.type;
+    res.ret  = cpu_to_le32(ret);
+
+    /* report the result to the guest */
+    iov_from_buf(elem->in_sg, elem->in_num, 0, &res, sizeof(res));
+
+    virtqueue_push(vps->wvq, elem, sizeof(res));
+    virtio_notify(VIRTIO_DEVICE(vps), vps->wvq);
+
+    return G_SOURCE_REMOVE;
+}
+
+static void free_warg_fn(gpointer data)
+{
+    struct pstore_write_arg *warg = data;
+
+    qio_channel_close(warg->ioc, NULL);
+
+    g_free(warg->elem);
+    g_free(warg);
+}
+
+static ssize_t virtio_pstore_do_write(VirtIOPstore *s, VirtQueueElement *elem,
+                                      struct virtio_pstore_req *req)
+{
+    unsigned short type = le16_to_cpu(req->type);
+    char *filename = NULL;
+    int fd;
+    int flags = O_WRONLY | O_CREAT | O_TRUNC;
+    struct pstore_write_arg *warg = NULL;
+    Error *err = NULL;
+    int ret = -1;
+
+    /* keep at most 'file-max' files of the same type */
+    rotate_pstore_file(s, type);
+
+    filename = virtio_pstore_to_filename(s, req);
+    if (filename == NULL) {
+        return -1;
+    }
+
+    warg = g_malloc(sizeof(*warg));
+    if (warg == NULL) {
+        goto out;
+    }
+
+    fd = open(filename, flags, 0644);
+    if (fd < 0) {
+        error_report("cannot open %s", filename);
+        ret = fd;
+        goto out;
+    }
+
+    warg->vps            = s;
+    warg->elem           = elem;
+    warg->req            = *req;    /* caller's req is on its stack */
+
+    warg->ioc = qio_channel_new_fd(fd, &err);
+    if (err) {
+        error_reportf_err(err, "cannot create io channel: ");
+        goto out;
+    }
+
+    qio_channel_set_blocking(warg->ioc, false, &err);
+    qio_channel_add_watch(warg->ioc, G_IO_OUT, pstore_async_write_fn, warg,
+                          free_warg_fn);
+    g_free(filename);
+    return 1;
+
+out:
+    g_free(filename);
+    g_free(warg);
+    return ret;
+}
+
+static void virtio_pstore_handle_io(VirtIODevice *vdev, VirtQueue *vq)
+{
+    VirtIOPstore *s = VIRTIO_PSTORE(vdev);
+    VirtQueueElement *elem;
+    struct virtio_pstore_req req;
+    struct virtio_pstore_res res;
+    ssize_t len = 0;
+    int ret;
+
+    for (;;) {
+        elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
+        if (!elem) {
+            return;
+        }
+
+        if (elem->out_num < 1 || elem->in_num < 1) {
+            error_report("request or response buffer is missing");
+            exit(1);
+        }
+
+        if (elem->out_num > 2 || elem->in_num > 3) {
+            error_report("invalid number of input/output buffer");
+            exit(1);
+        }
+
+        len = iov_to_buf(elem->out_sg, elem->out_num, 0, &req, sizeof(req));
+        if (len != (ssize_t)sizeof(req)) {
+            error_report("invalid request size: %ld", (long)len);
+            exit(1);
+        }
+        res.cmd  = req.cmd;
+        res.type = req.type;
+
+        switch (le16_to_cpu(req.cmd)) {
+        case VIRTIO_PSTORE_CMD_OPEN:
+            ret = virtio_pstore_do_open(s);
+            break;
+        case VIRTIO_PSTORE_CMD_CLOSE:
+            ret = virtio_pstore_do_close(s);
+            break;
+        case VIRTIO_PSTORE_CMD_ERASE:
+            ret = virtio_pstore_do_erase(s, &req);
+            break;
+        case VIRTIO_PSTORE_CMD_READ:
+            ret = virtio_pstore_do_read(s, elem);
+            if (ret == 1) {
+                /* async channel io */
+                continue;
+            }
+            break;
+        case VIRTIO_PSTORE_CMD_WRITE:
+            ret = virtio_pstore_do_write(s, elem, &req);
+            if (ret == 1) {
+                /* async channel io */
+                continue;
+            }
+            break;
+        default:
+            ret = -1;
+            break;
+        }
+
+        res.ret = cpu_to_le32(ret);
+
+        iov_from_buf(elem->in_sg, elem->in_num, 0, &res, sizeof(res));
+        virtqueue_push(vq, elem, sizeof(res));
+
+        virtio_notify(vdev, vq);
+        g_free(elem);
+
+        if (ret < 0) {
+            return;
+        }
+    }
+}
+
+static void virtio_pstore_device_realize(DeviceState *dev, Error **errp)
+{
+    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
+    VirtIOPstore *s = VIRTIO_PSTORE(dev);
+
+    virtio_init(vdev, "virtio-pstore", VIRTIO_ID_PSTORE,
+                sizeof(struct virtio_pstore_config));
+
+    s->id = 1;
+
+    if (!s->bufsize)
+        s->bufsize = PSTORE_DEFAULT_BUFSIZE;
+    if (!s->file_max)
+        s->file_max = PSTORE_DEFAULT_FILE_MAX;
+
+    s->rvq = virtio_add_queue(vdev, 128, virtio_pstore_handle_io);
+    s->wvq = virtio_add_queue(vdev, 128, virtio_pstore_handle_io);
+}
+
+static void virtio_pstore_device_unrealize(DeviceState *dev, Error **errp)
+{
+    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
+
+    virtio_cleanup(vdev);
+}
+
+static void virtio_pstore_get_config(VirtIODevice *vdev, uint8_t *config_data)
+{
+    VirtIOPstore *dev = VIRTIO_PSTORE(vdev);
+    struct virtio_pstore_config config;
+
+    config.bufsize = cpu_to_le32(dev->bufsize);
+
+    memcpy(config_data, &config, sizeof(struct virtio_pstore_config));
+}
+
+static void virtio_pstore_set_config(VirtIODevice *vdev,
+                                     const uint8_t *config_data)
+{
+    VirtIOPstore *dev = VIRTIO_PSTORE(vdev);
+    struct virtio_pstore_config config;
+
+    memcpy(&config, config_data, sizeof(struct virtio_pstore_config));
+
+    dev->bufsize = le32_to_cpu(config.bufsize);
+}
+
+static uint64_t get_features(VirtIODevice *vdev, uint64_t f, Error **errp)
+{
+    return f;
+}
+
+static void pstore_get_directory(Object *obj, Visitor *v,
+                                 const char *name, void *opaque,
+                                 Error **errp)
+{
+    VirtIOPstore *s = opaque;
+
+    visit_type_str(v, name, &s->directory, errp);
+}
+
+static void pstore_set_directory(Object *obj, Visitor *v,
+                                 const char *name, void *opaque,
+                                 Error **errp)
+{
+    VirtIOPstore *s = opaque;
+    Error *local_err = NULL;
+    char *value;
+
+    visit_type_str(v, name, &value, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        return;
+    }
+
+    g_free(s->directory);
+    s->directory = value;
+}
+
+static void pstore_release_directory(Object *obj, const char *name,
+                                     void *opaque)
+{
+    VirtIOPstore *s = opaque;
+
+    g_free(s->directory);
+    s->directory = NULL;
+}
+
+static void pstore_get_bufsize(Object *obj, Visitor *v,
+                               const char *name, void *opaque,
+                               Error **errp)
+{
+    VirtIOPstore *s = opaque;
+    uint64_t value = s->bufsize;
+
+    visit_type_size(v, name, &value, errp);
+}
+
+static void pstore_set_bufsize(Object *obj, Visitor *v,
+                               const char *name, void *opaque,
+                               Error **errp)
+{
+    VirtIOPstore *s = opaque;
+    Error *error = NULL;
+    uint64_t value;
+
+    visit_type_size(v, name, &value, &error);
+    if (error) {
+        error_propagate(errp, error);
+        return;
+    }
+
+    if (value < 4096) {
+        error_setg(&error, "buffer size too small: %"PRIu64, value);
+        error_propagate(errp, error);
+        return;
+    }
+
+    s->bufsize = value;
+}
+
+static void pstore_get_file_max(Object *obj, Visitor *v,
+                                const char *name, void *opaque,
+                                Error **errp)
+{
+    VirtIOPstore *s = opaque;
+    int64_t value = s->file_max;
+
+    visit_type_int(v, name, &value, errp);
+}
+
+static void pstore_set_file_max(Object *obj, Visitor *v,
+                                const char *name, void *opaque,
+                                Error **errp)
+{
+    VirtIOPstore *s = opaque;
+    Error *error = NULL;
+    int64_t value;
+
+    visit_type_int(v, name, &value, &error);
+    if (error) {
+        error_propagate(errp, error);
+        return;
+    }
+
+    s->file_max = value;
+}
+
+static Property virtio_pstore_properties[] = {
+    DEFINE_PROP_END_OF_LIST(),
+};
+
+static void virtio_pstore_instance_init(Object *obj)
+{
+    VirtIOPstore *s = VIRTIO_PSTORE(obj);
+
+    object_property_add(obj, "directory", "str",
+                        pstore_get_directory, pstore_set_directory,
+                        pstore_release_directory, s, NULL);
+    object_property_add(obj, "bufsize", "size",
+                        pstore_get_bufsize, pstore_set_bufsize, NULL, s, NULL);
+    object_property_add(obj, "file-max", "int",
+                        pstore_get_file_max, pstore_set_file_max, NULL, s, NULL);
+}
+
+static void virtio_pstore_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+    VirtioDeviceClass *vdc = VIRTIO_DEVICE_CLASS(klass);
+
+    dc->props = virtio_pstore_properties;
+    set_bit(DEVICE_CATEGORY_MISC, dc->categories);
+    vdc->realize = virtio_pstore_device_realize;
+    vdc->unrealize = virtio_pstore_device_unrealize;
+    vdc->get_config = virtio_pstore_get_config;
+    vdc->set_config = virtio_pstore_set_config;
+    vdc->get_features = get_features;
+}
+
+static const TypeInfo virtio_pstore_info = {
+    .name = TYPE_VIRTIO_PSTORE,
+    .parent = TYPE_VIRTIO_DEVICE,
+    .instance_size = sizeof(VirtIOPstore),
+    .instance_init = virtio_pstore_instance_init,
+    .class_init = virtio_pstore_class_init,
+};
+
+static void virtio_register_types(void)
+{
+    type_register_static(&virtio_pstore_info);
+}
+
+type_init(virtio_register_types)
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index 929ec2f..b31774a 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -79,6 +79,7 @@
 #define PCI_DEVICE_ID_VIRTIO_SCSI        0x1004
 #define PCI_DEVICE_ID_VIRTIO_RNG         0x1005
 #define PCI_DEVICE_ID_VIRTIO_9P          0x1009
+#define PCI_DEVICE_ID_VIRTIO_PSTORE      0x100a
 
 #define PCI_VENDOR_ID_REDHAT             0x1b36
 #define PCI_DEVICE_ID_REDHAT_BRIDGE      0x0001
diff --git a/include/hw/virtio/virtio-pstore.h b/include/hw/virtio/virtio-pstore.h
new file mode 100644
index 0000000..85b1828
--- /dev/null
+++ b/include/hw/virtio/virtio-pstore.h
@@ -0,0 +1,36 @@
+/*
+ * Virtio Pstore Support
+ *
+ * Authors:
+ *  Namhyung Kim      <namhyung@gmail.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef _QEMU_VIRTIO_PSTORE_H
+#define _QEMU_VIRTIO_PSTORE_H
+
+#include "standard-headers/linux/virtio_pstore.h"
+#include "hw/virtio/virtio.h"
+#include "hw/pci/pci.h"
+
+#define TYPE_VIRTIO_PSTORE "virtio-pstore-device"
+#define VIRTIO_PSTORE(obj) \
+        OBJECT_CHECK(VirtIOPstore, (obj), TYPE_VIRTIO_PSTORE)
+
+typedef struct VirtIOPstore {
+    VirtIODevice    parent_obj;
+    VirtQueue      *rvq;
+    VirtQueue      *wvq;
+    char           *directory;
+    int             file_idx;
+    int             num_file;
+    struct dirent **files;
+    uint64_t        id;
+    uint64_t        bufsize;
+    uint64_t        file_max;
+} VirtIOPstore;
+
+#endif
diff --git a/include/standard-headers/linux/virtio_ids.h b/include/standard-headers/linux/virtio_ids.h
index 77925f5..c72a9ab 100644
--- a/include/standard-headers/linux/virtio_ids.h
+++ b/include/standard-headers/linux/virtio_ids.h
@@ -41,5 +41,6 @@
 #define VIRTIO_ID_CAIF	       12 /* Virtio caif */
 #define VIRTIO_ID_GPU          16 /* virtio GPU */
 #define VIRTIO_ID_INPUT        18 /* virtio input */
+#define VIRTIO_ID_PSTORE       22 /* virtio pstore */
 
 #endif /* _LINUX_VIRTIO_IDS_H */
diff --git a/include/standard-headers/linux/virtio_pstore.h b/include/standard-headers/linux/virtio_pstore.h
new file mode 100644
index 0000000..2f91839
--- /dev/null
+++ b/include/standard-headers/linux/virtio_pstore.h
@@ -0,0 +1,76 @@
+#ifndef _LINUX_VIRTIO_PSTORE_H
+#define _LINUX_VIRTIO_PSTORE_H
+/* This header is BSD licensed so anyone can use the definitions to implement
+ * compatible drivers/servers.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ * 3. Neither the name of IBM nor the names of its contributors
+ *    may be used to endorse or promote products derived from this software
+ *    without specific prior written permission.
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL IBM OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE. */
+#include "standard-headers/linux/types.h"
+#include "standard-headers/linux/virtio_types.h"
+#include "standard-headers/linux/virtio_ids.h"
+#include "standard-headers/linux/virtio_config.h"
+
+#define VIRTIO_PSTORE_CMD_NULL   0
+#define VIRTIO_PSTORE_CMD_OPEN   1
+#define VIRTIO_PSTORE_CMD_READ   2
+#define VIRTIO_PSTORE_CMD_WRITE  3
+#define VIRTIO_PSTORE_CMD_ERASE  4
+#define VIRTIO_PSTORE_CMD_CLOSE  5
+
+#define VIRTIO_PSTORE_TYPE_UNKNOWN  0
+#define VIRTIO_PSTORE_TYPE_DMESG    1
+
+#define VIRTIO_PSTORE_FL_COMPRESSED  1
+
+struct virtio_pstore_req {
+    __virtio16 cmd;
+    __virtio16 type;
+    __virtio32 flags;
+    __virtio64 id;
+    __virtio32 count;
+    __virtio32 reserved;
+};
+
+struct virtio_pstore_res {
+    __virtio16 cmd;
+    __virtio16 type;
+    __virtio32 ret;
+};
+
+struct virtio_pstore_fileinfo {
+    __virtio64 id;
+    __virtio32 count;
+    __virtio16 type;
+    __virtio16 unused;
+    __virtio32 flags;
+    __virtio32 len;
+    __virtio64 time_sec;
+    __virtio32 time_nsec;
+    __virtio32 reserved;
+};
+
+struct virtio_pstore_config {
+    __virtio32 bufsize;
+};
+
+#endif /* _LINUX_VIRTIO_PSTORE_H */
diff --git a/qdev-monitor.c b/qdev-monitor.c
index e19617f..e1df5a9 100644
--- a/qdev-monitor.c
+++ b/qdev-monitor.c
@@ -73,6 +73,7 @@ static const QDevAlias qdev_alias_table[] = {
     { "virtio-serial-pci", "virtio-serial", QEMU_ARCH_ALL & ~QEMU_ARCH_S390X },
     { "virtio-tablet-ccw", "virtio-tablet", QEMU_ARCH_S390X },
     { "virtio-tablet-pci", "virtio-tablet", QEMU_ARCH_ALL & ~QEMU_ARCH_S390X },
+    { "virtio-pstore-pci", "virtio-pstore" },
     { }
 };
 
-- 
2.9.3

* [PATCH 3/3] kvmtool: Implement virtio-pstore device
  2016-08-20  8:07 [RFC/PATCHSET 0/3] virtio: Implement virtio pstore device (v3) Namhyung Kim
  2016-08-20  8:07 ` [PATCH 1/3] virtio: Basic implementation of virtio pstore driver Namhyung Kim
  2016-08-20  8:07 ` [PATCH 2/3] qemu: Implement virtio-pstore device Namhyung Kim
@ 2016-08-20  8:07 ` Namhyung Kim
  2016-08-23 10:25 ` [RFC/PATCHSET 0/3] virtio: Implement virtio pstore device (v3) Joel Fernandes
  3 siblings, 0 replies; 31+ messages in thread
From: Namhyung Kim @ 2016-08-20  8:07 UTC (permalink / raw)
  To: virtio-dev, kvm, qemu-devel, virtualization
  Cc: LKML, Namhyung Kim, Will Deacon

From: Namhyung Kim <namhyung@gmail.com>

Add a virtio pstore device so that guest kernel log messages can be
saved on the host.  With this patch, the log files are saved under the
directory given by the --pstore option.

  $ lkvm run --pstore=dir-xx

  (guest) # echo c > /proc/sysrq-trigger

  $ ls dir-xx
  dmesg-1.enc.z  dmesg-2.enc.z

The log files are usually compressed using zlib.  Users can easily read
the messages on the host or in the guest (using the pstore filesystem).
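
The file names above encode the record type, the id and whether the
record is compressed.  A minimal sketch of decoding such a name on the
host follows.  It is an illustration only, not code from this patch:
struct pstore_name_info and parse_pstore_name() are hypothetical
names, and the constants are copied from the virtio-pstore header in
this series.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Values copied from the virtio-pstore header in this series. */
#define VIRTIO_PSTORE_TYPE_UNKNOWN   0
#define VIRTIO_PSTORE_TYPE_DMESG     1
#define VIRTIO_PSTORE_FL_COMPRESSED  1

/* Hypothetical helper type, not part of the patch. */
struct pstore_name_info {
	unsigned int type;
	unsigned int flags;
	uint64_t id;
};

/*
 * Decode "<prefix><id>[.enc.z]", e.g. "dmesg-2.enc.z".
 * Returns 0 on success, -1 if the name does not follow the scheme.
 */
int parse_pstore_name(const char *name, struct pstore_name_info *info)
{
	char *end;

	if (!strncmp(name, "dmesg-", 6)) {
		info->type = VIRTIO_PSTORE_TYPE_DMESG;
		name += 6;
	} else if (!strncmp(name, "unknown-", 8)) {
		info->type = VIRTIO_PSTORE_TYPE_UNKNOWN;
		name += 8;
	} else {
		return -1;
	}

	/* the id must start with a digit (no sign, no whitespace) */
	if (*name < '0' || *name > '9')
		return -1;
	info->id = strtoull(name, &end, 10);

	info->flags = 0;
	if (!strcmp(end, ".enc.z"))
		info->flags |= VIRTIO_PSTORE_FL_COMPRESSED;
	else if (*end != '\0')
		return -1;	/* unexpected trailing characters */

	return 0;
}
```

For "dmesg-2.enc.z" this yields type VIRTIO_PSTORE_TYPE_DMESG, id 2 and
the VIRTIO_PSTORE_FL_COMPRESSED flag; a name without a known prefix is
rejected.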

Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
---
 Makefile                     |   1 +
 builtin-run.c                |   2 +
 include/kvm/kvm-config.h     |   1 +
 include/kvm/virtio-pci-dev.h |   2 +
 include/kvm/virtio-pstore.h  |  53 +++++
 include/linux/virtio_ids.h   |   1 +
 virtio/pstore.c              | 447 +++++++++++++++++++++++++++++++++++++++++++
 7 files changed, 507 insertions(+)
 create mode 100644 include/kvm/virtio-pstore.h
 create mode 100644 virtio/pstore.c

diff --git a/Makefile b/Makefile
index 1f0196f..d7462b9 100644
--- a/Makefile
+++ b/Makefile
@@ -67,6 +67,7 @@ OBJS	+= virtio/net.o
 OBJS	+= virtio/rng.o
 OBJS    += virtio/balloon.o
 OBJS	+= virtio/pci.o
+OBJS	+= virtio/pstore.o
 OBJS	+= disk/blk.o
 OBJS	+= disk/qcow.o
 OBJS	+= disk/raw.o
diff --git a/builtin-run.c b/builtin-run.c
index 72b878d..08c12dd 100644
--- a/builtin-run.c
+++ b/builtin-run.c
@@ -128,6 +128,8 @@ void kvm_run_set_wrapper_sandbox(void)
 			" rootfs"),					\
 	OPT_STRING('\0', "hugetlbfs", &(cfg)->hugetlbfs_path, "path",	\
 			"Hugetlbfs path"),				\
+	OPT_STRING('\0', "pstore", &(cfg)->pstore_path, "path",		\
+			"pstore data path"),				\
 									\
 	OPT_GROUP("Kernel options:"),					\
 	OPT_STRING('k', "kernel", &(cfg)->kernel_filename, "kernel",	\
diff --git a/include/kvm/kvm-config.h b/include/kvm/kvm-config.h
index 386fa8c..42b7651 100644
--- a/include/kvm/kvm-config.h
+++ b/include/kvm/kvm-config.h
@@ -45,6 +45,7 @@ struct kvm_config {
 	const char *hugetlbfs_path;
 	const char *custom_rootfs_name;
 	const char *real_cmdline;
+	const char *pstore_path;
 	struct virtio_net_params *net_params;
 	bool single_step;
 	bool vnc;
diff --git a/include/kvm/virtio-pci-dev.h b/include/kvm/virtio-pci-dev.h
index 48ae018..4339d94 100644
--- a/include/kvm/virtio-pci-dev.h
+++ b/include/kvm/virtio-pci-dev.h
@@ -15,6 +15,7 @@
 #define PCI_DEVICE_ID_VIRTIO_BLN		0x1005
 #define PCI_DEVICE_ID_VIRTIO_SCSI		0x1008
 #define PCI_DEVICE_ID_VIRTIO_9P			0x1009
+#define PCI_DEVICE_ID_VIRTIO_PSTORE		0x100a
 #define PCI_DEVICE_ID_VESA			0x2000
 #define PCI_DEVICE_ID_PCI_SHMEM			0x0001
 
@@ -34,5 +35,6 @@
 #define PCI_CLASS_RNG				0xff0000
 #define PCI_CLASS_BLN				0xff0000
 #define PCI_CLASS_9P				0xff0000
+#define PCI_CLASS_PSTORE			0xff0000
 
 #endif /* VIRTIO_PCI_DEV_H_ */
diff --git a/include/kvm/virtio-pstore.h b/include/kvm/virtio-pstore.h
new file mode 100644
index 0000000..9f52ffd
--- /dev/null
+++ b/include/kvm/virtio-pstore.h
@@ -0,0 +1,53 @@
+#ifndef KVM__PSTORE_VIRTIO_H
+#define KVM__PSTORE_VIRTIO_H
+
+#include <kvm/virtio.h>
+#include <sys/types.h>
+
+#define VIRTIO_PSTORE_CMD_NULL   0
+#define VIRTIO_PSTORE_CMD_OPEN   1
+#define VIRTIO_PSTORE_CMD_READ   2
+#define VIRTIO_PSTORE_CMD_WRITE  3
+#define VIRTIO_PSTORE_CMD_ERASE  4
+#define VIRTIO_PSTORE_CMD_CLOSE  5
+
+#define VIRTIO_PSTORE_TYPE_UNKNOWN  0
+#define VIRTIO_PSTORE_TYPE_DMESG    1
+
+#define VIRTIO_PSTORE_FL_COMPRESSED  1
+
+struct virtio_pstore_req {
+	__virtio16		cmd;
+	__virtio16		type;
+	__virtio32		flags;
+	__virtio64		id;
+	__virtio32		count;
+	__virtio32		reserved;
+};
+
+struct virtio_pstore_res {
+	__virtio16		cmd;
+	__virtio16		type;
+	__virtio32		ret;
+};
+
+struct virtio_pstore_fileinfo {
+	__virtio64		id;
+	__virtio32		count;
+	__virtio16		type;
+	__virtio16		unused;
+	__virtio32		flags;
+	__virtio32		len;
+	__virtio64		time_sec;
+	__virtio32		time_nsec;
+	__virtio32		reserved;
+};
+
+struct virtio_pstore_config {
+	__virtio32		bufsize;
+};
+
+int virtio_pstore__init(struct kvm *kvm);
+int virtio_pstore__exit(struct kvm *kvm);
+
+#endif /* KVM__PSTORE_VIRTIO_H */
diff --git a/include/linux/virtio_ids.h b/include/linux/virtio_ids.h
index 5f60aa4..40eabf7 100644
--- a/include/linux/virtio_ids.h
+++ b/include/linux/virtio_ids.h
@@ -40,5 +40,6 @@
 #define VIRTIO_ID_RPROC_SERIAL 11 /* virtio remoteproc serial link */
 #define VIRTIO_ID_CAIF	       12 /* Virtio caif */
 #define VIRTIO_ID_INPUT        18 /* virtio input */
+#define VIRTIO_ID_PSTORE       22 /* virtio pstore */
 
 #endif /* _LINUX_VIRTIO_IDS_H */
diff --git a/virtio/pstore.c b/virtio/pstore.c
new file mode 100644
index 0000000..fb9806f
--- /dev/null
+++ b/virtio/pstore.c
@@ -0,0 +1,447 @@
+#include "kvm/virtio-pstore.h"
+
+#include "kvm/virtio-pci-dev.h"
+
+#include "kvm/virtio.h"
+#include "kvm/util.h"
+#include "kvm/kvm.h"
+#include "kvm/threadpool.h"
+#include "kvm/guest_compat.h"
+
+#include <linux/virtio_ring.h>
+
+#include <linux/list.h>
+#include <fcntl.h>
+#include <dirent.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <pthread.h>
+#include <linux/kernel.h>
+#include <sys/eventfd.h>
+
+#define NUM_VIRT_QUEUES			2
+#define VIRTIO_PSTORE_QUEUE_SIZE	128
+
+struct io_thread_arg {
+	struct kvm		*kvm;
+	struct pstore_dev	*pdev;
+};
+
+struct pstore_dev {
+	struct list_head	list;
+	struct virtio_device	vdev;
+	pthread_t		io_thread;
+	int			io_efd;
+	int			done;
+
+	struct virtio_pstore_config *config;
+
+	int			fd;
+	DIR			*dir;
+	u64			id;
+
+	/* virtio queue */
+	struct virt_queue	vqs[NUM_VIRT_QUEUES];
+};
+
+static LIST_HEAD(pdevs);
+static int compat_id = -1;
+
+static u8 *get_config(struct kvm *kvm, void *dev)
+{
+	struct pstore_dev *pdev = dev;
+
+	return (u8*)pdev->config;
+}
+
+static u32 get_host_features(struct kvm *kvm, void *dev)
+{
+	/* Unused */
+	return 0;
+}
+
+static void set_guest_features(struct kvm *kvm, void *dev, u32 features)
+{
+	/* Unused */
+}
+
+static void virtio_pstore_to_filename(struct kvm *kvm, struct pstore_dev *pdev,
+				      char *buf, size_t sz,
+				      struct virtio_pstore_req *req)
+{
+	const char *basename;
+	unsigned long long id = 0;
+	unsigned int flags = virtio_guest_to_host_u32(pdev->vqs, req->flags);
+
+	switch (req->type) {
+	case VIRTIO_PSTORE_TYPE_DMESG:
+		basename = "dmesg";
+		id = pdev->id++;
+		break;
+	default:
+		basename = "unknown";
+		break;
+	}
+
+	snprintf(buf, sz, "%s/%s-%llu%s", kvm->cfg.pstore_path, basename, id,
+		 flags & VIRTIO_PSTORE_FL_COMPRESSED ? ".enc.z" : "");
+}
+
+static void virtio_pstore_from_filename(struct kvm *kvm, char *name,
+					char *buf, size_t sz,
+					struct virtio_pstore_fileinfo *info)
+{
+	size_t len = strlen(name);
+
+	snprintf(buf, sz, "%s/%s", kvm->cfg.pstore_path, name);
+
+	info->flags = 0;
+	if (len > 6 && !strncmp(name + len - 6, ".enc.z", 6))
+		info->flags |= VIRTIO_PSTORE_FL_COMPRESSED;
+
+	if (!strncmp(name, "dmesg-", 6)) {
+		info->type = VIRTIO_PSTORE_TYPE_DMESG;
+		name += strlen("dmesg-");
+	} else if (!strncmp(name, "unknown-", 8)) {
+		info->type = VIRTIO_PSTORE_TYPE_UNKNOWN;
+		name += strlen("unknown-");
+	}
+
+	info->id = strtoull(name, NULL, 0);
+}
+
+static int virtio_pstore_do_open(struct kvm *kvm, struct pstore_dev *pdev,
+				 struct virtio_pstore_req *req,
+				 struct iovec *iov)
+{
+	pdev->dir = opendir(kvm->cfg.pstore_path);
+	if (pdev->dir == NULL)
+		return -errno;
+
+	return 0;
+}
+
+static int virtio_pstore_do_close(struct kvm *kvm, struct pstore_dev *pdev,
+				  struct virtio_pstore_req *req,
+				  struct iovec *iov)
+{
+	if (pdev->dir == NULL)
+		return -1;
+
+	closedir(pdev->dir);
+	pdev->dir = NULL;
+
+	return 0;
+}
+
+static ssize_t virtio_pstore_do_read(struct kvm *kvm, struct pstore_dev *pdev,
+				     struct virtio_pstore_req *req,
+				     struct iovec *iov,
+				     struct virtio_pstore_fileinfo *info)
+{
+	char path[PATH_MAX];
+	FILE *fp;
+	ssize_t len = 0;
+	struct stat stbuf;
+	struct dirent *dent;
+
+	if (pdev->dir == NULL)
+		return 0;
+
+	dent = readdir(pdev->dir);
+	while (dent) {
+		if (dent->d_name[0] != '.')
+			break;
+		dent = readdir(pdev->dir);
+	}
+
+	if (dent == NULL)
+		return 0;
+
+	virtio_pstore_from_filename(kvm, dent->d_name, path, sizeof(path), info);
+	fp = fopen(path, "r");
+	if (fp == NULL)
+		return -1;
+
+	if (fstat(fileno(fp), &stbuf) < 0) {
+		len = -1;
+		goto out;
+	}
+
+	len = fread(iov[3].iov_base, 1, iov[3].iov_len, fp);
+	if (len < 0 && errno == EAGAIN) {
+		len = 0;
+		goto out;
+	}
+
+	info->id     = virtio_host_to_guest_u64(pdev->vqs, info->id);
+	info->type   = virtio_host_to_guest_u16(pdev->vqs, info->type);
+	info->flags  = virtio_host_to_guest_u32(pdev->vqs, info->flags);
+	info->len    = virtio_host_to_guest_u32(pdev->vqs, len);
+
+	info->time_sec  = virtio_host_to_guest_u64(pdev->vqs, stbuf.st_ctim.tv_sec);
+	info->time_nsec = virtio_host_to_guest_u32(pdev->vqs, stbuf.st_ctim.tv_nsec);
+
+	len += sizeof(*info);
+
+out:
+	fclose(fp);
+	return len;
+}
+
+static ssize_t virtio_pstore_do_write(struct kvm *kvm, struct pstore_dev *pdev,
+				      struct virtio_pstore_req *req,
+				      struct iovec *iov)
+{
+	char path[PATH_MAX];
+	FILE *fp;
+	ssize_t len = 0;
+
+	virtio_pstore_to_filename(kvm, pdev, path, sizeof(path), req);
+
+	fp = fopen(path, "a");
+	if (fp == NULL)
+		return -1;
+
+	len = fwrite(iov[1].iov_base, 1, iov[1].iov_len, fp);
+	if (len < 0 && errno == EAGAIN)
+		len = 0;
+
+	fclose(fp);
+	return 0;
+}
+
+static ssize_t virtio_pstore_do_erase(struct kvm *kvm, struct pstore_dev *pdev,
+				      struct virtio_pstore_req *req,
+				      struct iovec *iov)
+{
+	char path[PATH_MAX];
+
+	virtio_pstore_to_filename(kvm, pdev, path, sizeof(path), req);
+
+	return unlink(path);
+}
+
+static bool virtio_pstore_do_io_request(struct kvm *kvm, struct pstore_dev *pdev,
+					struct virt_queue *vq)
+{
+	struct iovec iov[VIRTIO_PSTORE_QUEUE_SIZE];
+	struct virtio_pstore_req *req;
+	struct virtio_pstore_res *res;
+	struct virtio_pstore_fileinfo *info;
+	ssize_t len = 0;
+	u16 out, in, head;
+	int ret = 0;
+
+	head = virt_queue__get_iov(vq, iov, &out, &in, kvm);
+
+	if (iov[0].iov_len != sizeof(*req) || iov[out].iov_len != sizeof(*res)) {
+		return false;
+	}
+
+	req = iov[0].iov_base;
+	res = iov[out].iov_base;
+
+	switch (virtio_guest_to_host_u16(vq, req->cmd)) {
+	case VIRTIO_PSTORE_CMD_OPEN:
+		ret = virtio_pstore_do_open(kvm, pdev, req, iov);
+		break;
+	case VIRTIO_PSTORE_CMD_READ:
+		info = iov[out + 1].iov_base;
+		ret = virtio_pstore_do_read(kvm, pdev, req, iov, info);
+		if (ret > 0) {
+			len = ret;
+			ret = 0;
+		}
+		break;
+	case VIRTIO_PSTORE_CMD_WRITE:
+		ret = virtio_pstore_do_write(kvm, pdev, req, iov);
+		break;
+	case VIRTIO_PSTORE_CMD_CLOSE:
+		ret = virtio_pstore_do_close(kvm, pdev, req, iov);
+		break;
+	case VIRTIO_PSTORE_CMD_ERASE:
+		ret = virtio_pstore_do_erase(kvm, pdev, req, iov);
+		break;
+	default:
+		return false;
+	}
+
+	res->cmd  = req->cmd;
+	res->type = req->type;
+	res->ret  = virtio_host_to_guest_u32(vq, ret);
+
+	virt_queue__set_used_elem(vq, head, sizeof(*res) + len);
+
+	return ret == 0;
+}
+
+static void virtio_pstore_do_io(struct kvm *kvm, struct pstore_dev *pdev,
+				struct virt_queue *vq)
+{
+	bool done = false;
+
+	while (virt_queue__available(vq)) {
+		virtio_pstore_do_io_request(kvm, pdev, vq);
+		done = true;
+	}
+
+	if (done)
+		pdev->vdev.ops->signal_vq(kvm, &pdev->vdev, vq - pdev->vqs);
+}
+
+static void *virtio_pstore_io_thread(void *arg)
+{
+	struct io_thread_arg *io_arg = arg;
+	struct pstore_dev *pdev = io_arg->pdev;
+	struct kvm *kvm = io_arg->kvm;
+	u64 data;
+	int r;
+
+	kvm__set_thread_name("virtio-pstore-io");
+
+	while (!pdev->done) {
+		r = read(pdev->io_efd, &data, sizeof(u64));
+		if (r < 0)
+			continue;
+
+		virtio_pstore_do_io(kvm, pdev, &pdev->vqs[0]);
+		virtio_pstore_do_io(kvm, pdev, &pdev->vqs[1]);
+	}
+	free(io_arg);
+
+	pthread_exit(NULL);
+	return NULL;
+}
+
+static int init_vq(struct kvm *kvm, void *dev, u32 vq, u32 page_size, u32 align,
+		   u32 pfn)
+{
+	struct pstore_dev *pdev = dev;
+	struct virt_queue *queue;
+	void *p;
+
+	compat__remove_message(compat_id);
+
+	queue		= &pdev->vqs[vq];
+	queue->pfn	= pfn;
+	p		= virtio_get_vq(kvm, queue->pfn, page_size);
+
+	vring_init(&queue->vring, VIRTIO_PSTORE_QUEUE_SIZE, p, align);
+
+	return 0;
+}
+
+static int notify_vq(struct kvm *kvm, void *dev, u32 vq)
+{
+	struct pstore_dev *pdev = dev;
+	u64 data = 1;
+	int r;
+
+	r = write(pdev->io_efd, &data, sizeof(data));
+	if (r < 0)
+		return r;
+
+	return 0;
+}
+
+static int get_pfn_vq(struct kvm *kvm, void *dev, u32 vq)
+{
+	struct pstore_dev *pdev = dev;
+
+	return pdev->vqs[vq].pfn;
+}
+
+static int get_size_vq(struct kvm *kvm, void *dev, u32 vq)
+{
+	return VIRTIO_PSTORE_QUEUE_SIZE;
+}
+
+static int set_size_vq(struct kvm *kvm, void *dev, u32 vq, int size)
+{
+	/* FIXME: dynamic */
+	return size;
+}
+
+static struct virtio_ops pstore_dev_virtio_ops = {
+	.get_config		= get_config,
+	.get_host_features	= get_host_features,
+	.set_guest_features	= set_guest_features,
+	.init_vq		= init_vq,
+	.notify_vq		= notify_vq,
+	.get_pfn_vq		= get_pfn_vq,
+	.get_size_vq		= get_size_vq,
+	.set_size_vq		= set_size_vq,
+};
+
+int virtio_pstore__init(struct kvm *kvm)
+{
+	struct pstore_dev *pdev;
+	struct io_thread_arg *io_arg = NULL;
+	int r;
+
+	if (!kvm->cfg.pstore_path)
+		return 0;
+
+	pdev = calloc(1, sizeof(*pdev));
+	if (pdev == NULL)
+		return -ENOMEM;
+
+	pdev->config = calloc(1, sizeof(*pdev->config));
+	if (pdev->config == NULL) {
+		r = -ENOMEM;
+		goto cleanup;
+	}
+
+	pdev->id = 1;
+
+	io_arg = malloc(sizeof(*io_arg));
+	if (io_arg == NULL) {
+		r = -ENOMEM;
+		goto cleanup;
+	}
+
+	pdev->io_efd = eventfd(0, 0);
+
+	*io_arg = (struct io_thread_arg) {
+		.pdev   = pdev,
+		.kvm    = kvm,
+	};
+	/* pthread_create() returns a positive error number on failure */
+	r = -pthread_create(&pdev->io_thread, NULL,
+			    virtio_pstore_io_thread, io_arg);
+	if (r < 0)
+		goto cleanup;
+
+	r = virtio_init(kvm, pdev, &pdev->vdev, &pstore_dev_virtio_ops,
+			VIRTIO_DEFAULT_TRANS(kvm), PCI_DEVICE_ID_VIRTIO_PSTORE,
+			VIRTIO_ID_PSTORE, PCI_CLASS_PSTORE);
+	if (r < 0)
+		goto cleanup;
+
+	list_add_tail(&pdev->list, &pdevs);
+
+	if (compat_id == -1)
+		compat_id = virtio_compat_add_message("virtio-pstore", "CONFIG_VIRTIO_PSTORE");
+	return 0;
+
+cleanup:
+	free(io_arg);
+	free(pdev->config);
+	free(pdev);
+
+	return r;
+}
+virtio_dev_init(virtio_pstore__init);
+
+int virtio_pstore__exit(struct kvm *kvm)
+{
+	struct pstore_dev *pdev, *tmp;
+
+	list_for_each_entry_safe(pdev, tmp, &pdevs, list) {
+		list_del(&pdev->list);
+		close(pdev->io_efd);
+		pdev->vdev.ops->exit(kvm, &pdev->vdev);
+		free(pdev);
+	}
+
+	return 0;
+}
+virtio_dev_exit(virtio_pstore__exit);
-- 
2.9.3
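
The kvmtool patch above encodes the record type, id and compression flag
in the file name and decodes them again on read.  Note that the erase path
appears to go through virtio_pstore_to_filename(), which takes a fresh id
from pdev->id++ instead of naming the record the guest asked to remove.
The following is only a sketch of a symmetric format/parse pair that takes
the id as an explicit argument.  struct pstore_name and both helpers are
hypothetical names for illustration, not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical description of one record; not a structure from the patch. */
struct pstore_name {
	const char *prefix;		/* "dmesg" or "unknown" */
	unsigned long long id;
	bool compressed;		/* true when the name ends in ".enc.z" */
};

/* Build "<prefix>-<id>[.enc.z]"; returns 0 on success, -1 on truncation. */
static int pstore_name_format(char *buf, size_t sz, const struct pstore_name *n)
{
	int r;

	assert(buf != NULL && n != NULL);
	r = snprintf(buf, sz, "%s-%llu%s", n->prefix, n->id,
		     n->compressed ? ".enc.z" : "");
	return (r < 0 || (size_t)r >= sz) ? -1 : 0;
}

/* Parse a name built above; returns 0 on success, -1 if it does not match. */
static int pstore_name_parse(const char *name, struct pstore_name *n)
{
	static const char *const prefixes[] = { "dmesg", "unknown" };
	const char *p = NULL;
	char *end;
	size_t i;

	for (i = 0; i < sizeof(prefixes) / sizeof(prefixes[0]); i++) {
		size_t plen = strlen(prefixes[i]);

		if (!strncmp(name, prefixes[i], plen) && name[plen] == '-') {
			n->prefix = prefixes[i];
			p = name + plen + 1;
			break;
		}
	}
	/* reject unrelated files and names without a numeric id */
	if (p == NULL || *p < '0' || *p > '9')
		return -1;

	n->id = strtoull(p, &end, 10);
	if (strcmp(end, ".enc.z") == 0)
		n->compressed = true;
	else if (*end == '\0')
		n->compressed = false;
	else
		return -1;
	return 0;
}
```

With helpers like these, write, read and erase could all build the path
from the same (prefix, id, compressed) triple, and unrelated files in the
pstore directory are rejected instead of being reported with an unset type.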


* Re: [RFC/PATCHSET 0/3] virtio: Implement virtio pstore device (v3)
  2016-08-20  8:07 [RFC/PATCHSET 0/3] virtio: Implement virtio pstore device (v3) Namhyung Kim
                   ` (2 preceding siblings ...)
  2016-08-20  8:07 ` [PATCH 3/3] kvmtool: " Namhyung Kim
@ 2016-08-23 10:25 ` Joel Fernandes
  2016-08-23 15:20   ` Namhyung Kim
  3 siblings, 1 reply; 31+ messages in thread
From: Joel Fernandes @ 2016-08-23 10:25 UTC (permalink / raw)
  To: namhyung; +Cc: linux-kernel

From: Namhyung Kim <namhyung@kernel.org>

> Hello,
>
> This is another iteration of the virtio-pstore work.  In this patchset
> I addressed most of the feedback from the previous version and dropped the
> support for PSTORE_TYPE_CONSOLE for simplicity.  It'll be added once the basic 

Hi Namhyung,

This looks like a useful pstore backend. Great work.

BTW, have you considered using -mem-path in QEMU for this purpose?
I was thinking about using it and then somehow having the kernel reserve
a part of physical memory for the pstore. Then after the crash, or
whenever you want to read the contents of the pstore on the host, you
could just extract that part of the mem-path file.

Any thoughts on this? With your approach, though, you wouldn't need a
backing mem-path file the size of the guest RAM. I wonder if the
mem-path file can be created sparse, and/or whether QEMU can configure
a certain part of guest RAM as file-backed memory and the rest as
anonymous memory (not backed by mem-path), so that the size of the
mem-path file can be kept to a minimum.

Thanks,
Joel


* Re: [RFC/PATCHSET 0/3] virtio: Implement virtio pstore device (v3)
  2016-08-23 10:25 ` [RFC/PATCHSET 0/3] virtio: Implement virtio pstore device (v3) Joel Fernandes
@ 2016-08-23 15:20   ` Namhyung Kim
  2016-08-24  7:10     ` Joel
  0 siblings, 1 reply; 31+ messages in thread
From: Namhyung Kim @ 2016-08-23 15:20 UTC (permalink / raw)
  To: Joel Fernandes; +Cc: linux-kernel, qemu-devel, Kees Cook

Hi Joel,

On Tue, Aug 23, 2016 at 7:25 PM, Joel Fernandes <agnel.joel@gmail.com> wrote:
> From: Namhyung Kim <namhyung@kernel.org>
>
>> Hello,
>>
>> This is another iteration of the virtio-pstore work.  In this patchset
>> I addressed most of the feedback from the previous version and dropped the
>> support for PSTORE_TYPE_CONSOLE for simplicity.  It'll be added once the basic
>
> Hi Namhyung,
>
> This looks like a useful pstore backend. Great work.

Thanks!

>
> BTW, Have you considered using -mem-path in Qemu for this purpose?
> I was thinking about using this, and then somehow have kernel reserve
> a part of physical memory for the pstore. Then after the crash, or
> whenever you want to read the contents of the pstore on the host, you
> could just extract that part of the mem-path file.

I wasn't aware of the -mem-path option and it seems that the existing
ramoops pstore backend can take care of it.

>
> Any thoughts on what you think about it? In your approach though, you
> wouldn't need a backing mem-path file which is the size of the guest
> RAM (which could be as big as the mem-path file). I wonder if the
> mem-path file can be created sparse, and/or Qemu has support to
> configure a certain part of guest RAM as file-backed memory and the
> rest of it from Anonymous memory (not backed by mem-path) so that
> the size of the mem-path file can be kept at a minimum.

The pstore (ramoops) backend requires that the memory region be preserved
across reboot.  Is that possible when -mem-path is used?  I think it's
better to use a separate region/file for pstore rather than making it part
of guest RAM or, as you said, it'd be great if qemu supported such a
hybrid mem-path.

Also my approach can handle streams of data bigger than the pstore
buffer size.  Although we can extract the contents of the mem-path file
periodically, it might be hard for an external process to know the right
time to extract, and there's a possibility of information loss IMHO.

Thanks,
Namhyung


* Re: [RFC/PATCHSET 0/3] virtio: Implement virtio pstore device (v3)
  2016-08-23 15:20   ` Namhyung Kim
@ 2016-08-24  7:10     ` Joel
  0 siblings, 0 replies; 31+ messages in thread
From: Joel @ 2016-08-24  7:10 UTC (permalink / raw)
  To: Namhyung Kim; +Cc: linux-kernel, qemu-devel, Kees Cook

Hi Namhyung,

> On Aug 23, 2016, at 8:20 AM, Namhyung Kim <namhyung@kernel.org> wrote:
> 
> Hi Joel,
> 
> On Tue, Aug 23, 2016 at 7:25 PM, Joel Fernandes <agnel.joel@gmail.com> wrote:
>> From: Namhyung Kim <namhyung@kernel.org>
> 
>> 
>> Any thoughts on what you think about it? In your approach though, you
>> wouldn't need a backing mem-path file which is the size of the guest
>> RAM (which could be as big as the mem-path file). I wonder if the
>> mem-path file can be created sparse, and/or Qemu has support to
>> configure a certain part of guest RAM as file-backed memory and the
>> rest of it from Anonymous memory (not backed by mem-path) so that
>> the size of the mem-path file can be kept at a minimum.
> 
> The pstore (ramoops) requires the region of the memory is preserved
> across reboot.  Is it possible when -mem-path is used?  I think it’s

I believe stock qemu won’t persist memory across a reboot on its own.
I found at least one post where someone was trying to make mem-path
persist across a reboot and claimed to succeed:
https://lists.gnu.org/archive/html/qemu-devel/2016-04/msg03476.html

> 
> Also my approach can handle streams of data bigger than the pstore
> buffer size.  Although we can extract the contents of mem-path file
> periodically, it might be hard for an external process to know the right
> time to extract and there's a possibility of information loss IMHO.
> 

I agree, your approach is better for an emulated environment.

Thanks,
Joel

> Thanks,
> Namhyung


* Re: [PATCH 2/3] qemu: Implement virtio-pstore device
  2016-08-20  8:07 ` [PATCH 2/3] qemu: Implement virtio-pstore device Namhyung Kim
@ 2016-08-24 22:00   ` Daniel P. Berrange
  2016-08-26  4:48     ` Namhyung Kim
  2016-09-13 15:57   ` Michael S. Tsirkin
  1 sibling, 1 reply; 31+ messages in thread
From: Daniel P. Berrange @ 2016-08-24 22:00 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: virtio-dev, kvm, qemu-devel, virtualization, LKML, Paolo Bonzini,
	Radim Krčmář,
	Michael S. Tsirkin, Anthony Liguori, Anton Vorontsov,
	Colin Cross, Kees Cook, Tony Luck, Steven Rostedt, Ingo Molnar,
	Minchan Kim


> diff --git a/hw/virtio/virtio-pstore.c b/hw/virtio/virtio-pstore.c
> new file mode 100644
> index 0000000..b8fb4be
> --- /dev/null
> +++ b/hw/virtio/virtio-pstore.c
> @@ -0,0 +1,699 @@
> +/*
> + * Virtio Pstore Device
> + *
> + * Copyright (C) 2016  LG Electronics
> + *
> + * Authors:
> + *  Namhyung Kim  <namhyung@gmail.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + *
> + */
> +
> +#include <stdio.h>
> +
> +#include "qemu/osdep.h"
> +#include "qemu/iov.h"
> +#include "qemu-common.h"
> +#include "qemu/cutils.h"
> +#include "qemu/error-report.h"
> +#include "sysemu/kvm.h"
> +#include "qapi/visitor.h"
> +#include "qapi-event.h"
> +#include "io/channel-util.h"
> +#include "trace.h"
> +
> +#include "hw/virtio/virtio.h"
> +#include "hw/virtio/virtio-bus.h"
> +#include "hw/virtio/virtio-access.h"
> +#include "hw/virtio/virtio-pstore.h"
> +
> +#define PSTORE_DEFAULT_BUFSIZE   (16 * 1024)
> +#define PSTORE_DEFAULT_FILE_MAX  5
> +
> +/* the index should match to the type value */
> +static const char *virtio_pstore_file_prefix[] = {
> +    "unknown-",		/* VIRTIO_PSTORE_TYPE_UNKNOWN */
> +    "dmesg-",		/* VIRTIO_PSTORE_TYPE_DMESG */
> +};
> +
> +static char *virtio_pstore_to_filename(VirtIOPstore *s,
> +                                       struct virtio_pstore_req *req)
> +{
> +    const char *basename;
> +    unsigned long long id;
> +    unsigned int type = le16_to_cpu(req->type);
> +    unsigned int flags = le32_to_cpu(req->flags);
> +
> +    if (type < ARRAY_SIZE(virtio_pstore_file_prefix)) {
> +        basename = virtio_pstore_file_prefix[type];
> +    } else {
> +        basename = "unknown-";
> +    }
> +
> +    id = s->id++;
> +    return g_strdup_printf("%s/%s%llu%s", s->directory, basename, id,
> +                            flags & VIRTIO_PSTORE_FL_COMPRESSED ? ".enc.z" : "");
> +}
> +
> +static char *virtio_pstore_from_filename(VirtIOPstore *s, char *name,
> +                                         struct virtio_pstore_fileinfo *info)
> +{
> +    char *filename;
> +    unsigned int idx;
> +
> +    filename = g_strdup_printf("%s/%s", s->directory, name);
> +    if (filename == NULL)
> +        return NULL;
> +
> +    for (idx = 0; idx < ARRAY_SIZE(virtio_pstore_file_prefix); idx++) {
> +        if (g_str_has_prefix(name, virtio_pstore_file_prefix[idx])) {
> +            info->type = idx;
> +            name += strlen(virtio_pstore_file_prefix[idx]);
> +            break;
> +        }
> +    }
> +
> +    if (idx == ARRAY_SIZE(virtio_pstore_file_prefix)) {
> +        g_free(filename);
> +        return NULL;
> +    }
> +
> +    qemu_strtoull(name, NULL, 0, &info->id);
> +
> +    info->flags = 0;
> +    if (g_str_has_suffix(name, ".enc.z")) {
> +        info->flags |= VIRTIO_PSTORE_FL_COMPRESSED;
> +    }
> +
> +    return filename;
> +}
> +
> +static int prefix_idx;
> +static int prefix_count;
> +static int prefix_len;
> +
> +static int filter_pstore(const struct dirent *de)
> +{
> +    int i;
> +
> +    for (i = 0; i < prefix_count; i++) {
> +        const char *prefix = virtio_pstore_file_prefix[prefix_idx + i];
> +
> +        if (g_str_has_prefix(de->d_name, prefix)) {
> +            return 1;
> +        }
> +    }
> +    return 0;
> +}
> +
> +static int sort_pstore(const struct dirent **a, const struct dirent **b)
> +{
> +    uint64_t id_a, id_b;
> +
> +    qemu_strtoull((*a)->d_name + prefix_len, NULL, 0, &id_a);
> +    qemu_strtoull((*b)->d_name + prefix_len, NULL, 0, &id_b);
> +
> +    return id_a - id_b;
> +}
> +
> +static int rotate_pstore_file(VirtIOPstore *s, unsigned short type)

AFAIK you're not actually doing file rotation here - that implies a
fixed base filename, with .0, .1, .2, etc suffixes where we rename
files each time. It looks like you are assuming separate filenames,
and are merely deleting the oldest each time.
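
For contrast with the delete-the-oldest scheme in the quoted patch, classic
rotation as described here could look roughly like the sketch below.  It
assumes a fixed base name and numeric suffixes.  rotate_files() and
make_name() are hypothetical helpers for illustration, not QEMU functions.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Build "base" (idx < 0) or "base.<idx>"; returns -1 on truncation. */
static int make_name(char *buf, size_t sz, const char *base, int idx)
{
	int r = idx < 0 ? snprintf(buf, sz, "%s", base)
			: snprintf(buf, sz, "%s.%d", base, idx);

	return (r < 0 || (size_t)r >= sz) ? -1 : 0;
}

/*
 * Classic rotation: base.(keep-2) -> base.(keep-1), ..., base.0 -> base.1,
 * base -> base.0.  The oldest file is overwritten by rename(), so at most
 * 'keep' numbered files remain.  Returns 0 on success, -1 on error.
 */
static int rotate_files(const char *base, int keep)
{
	char from[4096], to[4096];
	int i;

	assert(base != NULL && keep >= 1);

	for (i = keep - 1; i >= 0; i--) {
		if (make_name(from, sizeof(from), base, i - 1) < 0 ||
		    make_name(to, sizeof(to), base, i) < 0)
			return -1;
		/* a missing source just means that slot is not used yet */
		if (rename(from, to) < 0 && errno != ENOENT)
			return -1;
	}
	return 0;
}
```

Calling rotate_files("dmesg", 5) before creating a new "dmesg" would keep
dmesg.0 (newest) through dmesg.4 (oldest), whereas the patch keeps every
file under its original id and unlinks the one with the lowest id.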

> +{
> +    int ret = 0;
> +    int i, num;
> +    char *filename;
> +    struct dirent **files;
> +
> +    if (type >= ARRAY_SIZE(virtio_pstore_file_prefix)) {
> +        type = VIRTIO_PSTORE_TYPE_UNKNOWN;
> +    }
> +
> +    prefix_idx = type;
> +    prefix_len = strlen(virtio_pstore_file_prefix[type]);
> +    prefix_count = 1;  /* only scan current type */
> +
> +    /* delete the oldest file in the same type */
> +    num = scandir(s->directory, &files, filter_pstore, sort_pstore);
> +    if (num < 0)
> +        return num;
> +    if (num < (int)s->file_max)
> +        goto out;
> +
> +    filename = g_strdup_printf("%s/%s", s->directory, files[0]->d_name);
> +    if (filename == NULL) {
> +        ret = -1;
> +        goto out;
> +    }
> +
> +    ret = unlink(filename);





> +static gboolean pstore_async_read_fn(QIOChannel *ioc, GIOCondition condition,
> +                                     gpointer data)
> +{
> +    struct pstore_read_arg *rarg = data;
> +    struct virtio_pstore_fileinfo *info = &rarg->info;
> +    VirtIOPstore *vps = rarg->vps;
> +    VirtQueueElement *elem = rarg->elem;
> +    struct virtio_pstore_res res;
> +    size_t offset = sizeof(res) + sizeof(*info);
> +    struct iovec *sg = elem->in_sg;
> +    unsigned int sg_num = elem->in_num;
> +    Error *err = NULL;
> +    ssize_t len;
> +    int ret;
> +
> +    /* skip res and fileinfo */
> +    iov_discard_front(&sg, &sg_num, sizeof(res) + sizeof(*info));
> +
> +    len = qio_channel_readv(rarg->ioc, sg, sg_num, &err);
> +    if (len < 0) {
> +        if (errno == EAGAIN) {
> +            len = 0;
> +        }
> +        ret = -1;
> +    } else {
> +        info->len = cpu_to_le32(len);
> +        ret = 0;
> +    }
> +
> +    res.cmd  = cpu_to_le16(VIRTIO_PSTORE_CMD_READ);
> +    res.type = cpu_to_le16(VIRTIO_PSTORE_TYPE_UNKNOWN);
> +    res.ret  = cpu_to_le32(ret);
> +
> +    /* now copy res and fileinfo */
> +    iov_from_buf(elem->in_sg, elem->in_num, 0, &res, sizeof(res));
> +    iov_from_buf(elem->in_sg, elem->in_num, sizeof(res), info, sizeof(*info));
> +
> +    len += offset;
> +    virtqueue_push(vps->rvq, elem, len);
> +    virtio_notify(VIRTIO_DEVICE(vps), vps->rvq);
> +
> +    return G_SOURCE_REMOVE;

G_SOURCE_REMOVE was added in glib 2.32, but QEMU only permits
stuff that is present in 2.22. Just use "FALSE" instead.

> +static ssize_t virtio_pstore_do_read(VirtIOPstore *s, VirtQueueElement *elem)
> +{
> +    char *filename = NULL;
> +    int fd, idx;
> +    struct stat stbuf;
> +    struct pstore_read_arg *rarg = NULL;
> +    Error *err = NULL;
> +    int ret = -1;
> +
> +    if (s->file_idx >= s->num_file) {
> +        return 0;
> +    }
> +
> +    rarg = g_malloc(sizeof(*rarg));
> +    if (rarg == NULL) {
> +        return -1;
> +    }
> +
> +    idx = s->file_idx++;
> +    filename = virtio_pstore_from_filename(s, s->files[idx]->d_name,
> +                                           &rarg->info);
> +    if (filename == NULL) {
> +        goto out;
> +    }
> +
> +    fd = open(filename, O_RDONLY);
> +    if (fd < 0) {
> +        error_report("cannot open %s", filename);
> +        goto out;
> +    }
> +
> +    if (fstat(fd, &stbuf) < 0) {
> +        goto out;
> +    }
> +
> +    rarg->vps            = s;
> +    rarg->elem           = elem;
> +    rarg->info.id        = cpu_to_le64(rarg->info.id);
> +    rarg->info.type      = cpu_to_le16(rarg->info.type);
> +    rarg->info.flags     = cpu_to_le32(rarg->info.flags);
> +    rarg->info.time_sec  = cpu_to_le64(stbuf.st_ctim.tv_sec);
> +    rarg->info.time_nsec = cpu_to_le32(stbuf.st_ctim.tv_nsec);
> +
> +    rarg->ioc = qio_channel_new_fd(fd, &err);

You should just use qio_channel_file_new_path() and avoid the earlier
call to open()

> +    if (err) {
> +        error_reportf_err(err, "cannot create io channel: ");
> +        goto out;
> +    }
> +
> +    qio_channel_set_blocking(rarg->ioc, false, &err);
> +    qio_channel_add_watch(rarg->ioc, G_IO_IN, pstore_async_read_fn, rarg,
> +                          free_rarg_fn);
> +    g_free(filename);
> +    return 1;
> +
> +out:
> +    g_free(filename);
> +    g_free(rarg);
> +
> +    return ret;
> +}


> +static ssize_t virtio_pstore_do_write(VirtIOPstore *s, VirtQueueElement *elem,
> +                                      struct virtio_pstore_req *req)
> +{
> +    unsigned short type = le16_to_cpu(req->type);
> +    char *filename = NULL;
> +    int fd;
> +    int flags = O_WRONLY | O_CREAT | O_TRUNC;
> +    struct pstore_write_arg *warg = NULL;
> +    Error *err = NULL;
> +    int ret = -1;
> +
> +    /* do not keep same type of files more than 'file-max' */
> +    rotate_pstore_file(s, type);
> +
> +    filename = virtio_pstore_to_filename(s, req);
> +    if (filename == NULL) {
> +        return -1;
> +    }
> +
> +    warg = g_malloc(sizeof(*warg));
> +    if (warg == NULL) {
> +        goto out;
> +    }
> +
> +    fd = open(filename, flags, 0644);
> +    if (fd < 0) {
> +        error_report("cannot open %s", filename);
> +        ret = fd;
> +        goto out;
> +    }
> +
> +    warg->vps            = s;
> +    warg->elem           = elem;
> +    warg->req            = req;
> +
> +    warg->ioc = qio_channel_new_fd(fd, &err);

Same point about using qio_channel_file_new_path() instead of qio_channel_new_fd()

> +    if (err) {
> +        error_reportf_err(err, "cannot create io channel: ");
> +        goto out;
> +    }
> +
> +    qio_channel_set_blocking(warg->ioc, false, &err);
> +    qio_channel_add_watch(warg->ioc, G_IO_OUT, pstore_async_write_fn, warg,
> +                          free_warg_fn);
> +    g_free(filename);
> +    return 1;
> +
> +out:
> +    g_free(filename);
> +    g_free(warg);
> +    return ret;
> +}
> +
> +static void virtio_pstore_handle_io(VirtIODevice *vdev, VirtQueue *vq)
> +{
> +    VirtIOPstore *s = VIRTIO_PSTORE(vdev);
> +    VirtQueueElement *elem;
> +    struct virtio_pstore_req req;
> +    struct virtio_pstore_res res;
> +    ssize_t len = 0;
> +    int ret;
> +
> +    for (;;) {
> +        elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
> +        if (!elem) {
> +            return;
> +        }
> +
> +        if (elem->out_num < 1 || elem->in_num < 1) {
> +            error_report("request or response buffer is missing");
> +            exit(1);
> +        }
> +
> +        if (elem->out_num > 2 || elem->in_num > 3) {
> +            error_report("invalid number of input/output buffer");
> +            exit(1);
> +        }
> +
> +        len = iov_to_buf(elem->out_sg, elem->out_num, 0, &req, sizeof(req));
> +        if (len != (ssize_t)sizeof(req)) {
> +            error_report("invalid request size: %ld", (long)len);
> +            exit(1);
> +        }
> +        res.cmd  = req.cmd;
> +        res.type = req.type;
> +
> +        switch (le16_to_cpu(req.cmd)) {
> +        case VIRTIO_PSTORE_CMD_OPEN:
> +            ret = virtio_pstore_do_open(s);
> +            break;
> +        case VIRTIO_PSTORE_CMD_CLOSE:
> +            ret = virtio_pstore_do_close(s);
> +            break;
> +        case VIRTIO_PSTORE_CMD_ERASE:
> +            ret = virtio_pstore_do_erase(s, &req);
> +            break;
> +        case VIRTIO_PSTORE_CMD_READ:
> +            ret = virtio_pstore_do_read(s, elem);
> +            if (ret == 1) {
> +                /* async channel io */
> +                continue;
> +            }
> +            break;
> +        case VIRTIO_PSTORE_CMD_WRITE:
> +            ret = virtio_pstore_do_write(s, elem, &req);
> +            if (ret == 1) {
> +                /* async channel io */
> +                continue;
> +            }
> +            break;
> +        default:
> +            ret = -1;
> +            break;
> +        }
> +
> +        res.ret = ret;
> +
> +        iov_from_buf(elem->in_sg, elem->in_num, 0, &res, sizeof(res));
> +        virtqueue_push(vq, elem, sizeof(res) + len);
> +
> +        virtio_notify(vdev, vq);
> +        g_free(elem);
> +
> +        if (ret < 0) {
> +            return;
> +        }
> +    }
> +}

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|


* Re: [PATCH 2/3] qemu: Implement virtio-pstore device
  2016-08-24 22:00   ` Daniel P. Berrange
@ 2016-08-26  4:48     ` Namhyung Kim
  2016-08-26 12:27       ` Daniel P. Berrange
  0 siblings, 1 reply; 31+ messages in thread
From: Namhyung Kim @ 2016-08-26  4:48 UTC (permalink / raw)
  To: Daniel P. Berrange
  Cc: virtio-dev, kvm, qemu-devel, virtualization, LKML, Paolo Bonzini,
	Radim Krčmář,
	Michael S. Tsirkin, Anthony Liguori, Anton Vorontsov,
	Colin Cross, Kees Cook, Tony Luck, Steven Rostedt, Ingo Molnar,
	Minchan Kim

Hi Daniel,

On Wed, Aug 24, 2016 at 06:00:51PM -0400, Daniel P. Berrange wrote:
> 
> > diff --git a/hw/virtio/virtio-pstore.c b/hw/virtio/virtio-pstore.c
> > new file mode 100644
> > index 0000000..b8fb4be
> > --- /dev/null
> > +++ b/hw/virtio/virtio-pstore.c
> > @@ -0,0 +1,699 @@
> > +/*
> > + * Virtio Pstore Device
> > + *
> > + * Copyright (C) 2016  LG Electronics
> > + *
> > + * Authors:
> > + *  Namhyung Kim  <namhyung@gmail.com>
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> > + * See the COPYING file in the top-level directory.
> > + *
> > + */
> > +
> > +#include <stdio.h>
> > +
> > +#include "qemu/osdep.h"
> > +#include "qemu/iov.h"
> > +#include "qemu-common.h"
> > +#include "qemu/cutils.h"
> > +#include "qemu/error-report.h"
> > +#include "sysemu/kvm.h"
> > +#include "qapi/visitor.h"
> > +#include "qapi-event.h"
> > +#include "io/channel-util.h"
> > +#include "trace.h"
> > +
> > +#include "hw/virtio/virtio.h"
> > +#include "hw/virtio/virtio-bus.h"
> > +#include "hw/virtio/virtio-access.h"
> > +#include "hw/virtio/virtio-pstore.h"
> > +
> > +#define PSTORE_DEFAULT_BUFSIZE   (16 * 1024)
> > +#define PSTORE_DEFAULT_FILE_MAX  5
> > +
> > +/* the index should match to the type value */
> > +static const char *virtio_pstore_file_prefix[] = {
> > +    "unknown-",		/* VIRTIO_PSTORE_TYPE_UNKNOWN */
> > +    "dmesg-",		/* VIRTIO_PSTORE_TYPE_DMESG */
> > +};
> > +
> > +static char *virtio_pstore_to_filename(VirtIOPstore *s,
> > +                                       struct virtio_pstore_req *req)
> > +{
> > +    const char *basename;
> > +    unsigned long long id;
> > +    unsigned int type = le16_to_cpu(req->type);
> > +    unsigned int flags = le32_to_cpu(req->flags);
> > +
> > +    if (type < ARRAY_SIZE(virtio_pstore_file_prefix)) {
> > +        basename = virtio_pstore_file_prefix[type];
> > +    } else {
> > +        basename = "unknown-";
> > +    }
> > +
> > +    id = s->id++;
> > +    return g_strdup_printf("%s/%s%llu%s", s->directory, basename, id,
> > +                            flags & VIRTIO_PSTORE_FL_COMPRESSED ? ".enc.z" : "");
> > +}
> > +
> > +static char *virtio_pstore_from_filename(VirtIOPstore *s, char *name,
> > +                                         struct virtio_pstore_fileinfo *info)
> > +{
> > +    char *filename;
> > +    unsigned int idx;
> > +
> > +    filename = g_strdup_printf("%s/%s", s->directory, name);
> > +    if (filename == NULL)
> > +        return NULL;
> > +
> > +    for (idx = 0; idx < ARRAY_SIZE(virtio_pstore_file_prefix); idx++) {
> > +        if (g_str_has_prefix(name, virtio_pstore_file_prefix[idx])) {
> > +            info->type = idx;
> > +            name += strlen(virtio_pstore_file_prefix[idx]);
> > +            break;
> > +        }
> > +    }
> > +
> > +    if (idx == ARRAY_SIZE(virtio_pstore_file_prefix)) {
> > +        g_free(filename);
> > +        return NULL;
> > +    }
> > +
> > +    qemu_strtoull(name, NULL, 0, &info->id);
> > +
> > +    info->flags = 0;
> > +    if (g_str_has_suffix(name, ".enc.z")) {
> > +        info->flags |= VIRTIO_PSTORE_FL_COMPRESSED;
> > +    }
> > +
> > +    return filename;
> > +}
> > +
> > +static int prefix_idx;
> > +static int prefix_count;
> > +static int prefix_len;
> > +
> > +static int filter_pstore(const struct dirent *de)
> > +{
> > +    int i;
> > +
> > +    for (i = 0; i < prefix_count; i++) {
> > +        const char *prefix = virtio_pstore_file_prefix[prefix_idx + i];
> > +
> > +        if (g_str_has_prefix(de->d_name, prefix)) {
> > +            return 1;
> > +        }
> > +    }
> > +    return 0;
> > +}
> > +
> > +static int sort_pstore(const struct dirent **a, const struct dirent **b)
> > +{
> > +    uint64_t id_a, id_b;
> > +
> > +    qemu_strtoull((*a)->d_name + prefix_len, NULL, 0, &id_a);
> > +    qemu_strtoull((*b)->d_name + prefix_len, NULL, 0, &id_b);
> > +
> > +    return id_a - id_b;
> > +}
> > +
> > +static int rotate_pstore_file(VirtIOPstore *s, unsigned short type)
> 
> AFAIK you're not actually doing file rotation here - that implies a
> fixed base filename, with .0, .1, .2, etc suffixes where we rename
> files each time. It looks like you are assuming separate filenames,
> and are merely deleting the oldest each time.

Ah, right.  It's not rotation and I think it's enough for my purpose.
I need to change the name.
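
One detail that matters for a delete-the-oldest scheme is the ordering of
ids.  The quoted sort_pstore() returns id_a - id_b as an int, which
truncates a 64-bit difference: on common platforms ids 1 and 4294967297
differ by exactly 2^32 and would compare as equal.  Below is a minimal
sketch of an order-safe comparator plus a pruning count.  cmp_id() and
ids_to_prune() are hypothetical names, and the "at most file_max remain"
policy is an assumption rather than the exact rule in the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Order two 64-bit ids without truncating their difference to int. */
static int cmp_id(const void *a, const void *b)
{
	uint64_t id_a = *(const uint64_t *)a;
	uint64_t id_b = *(const uint64_t *)b;

	return (id_a > id_b) - (id_a < id_b);
}

/*
 * Sort ids in ascending order (oldest first) and return how many entries
 * at the front must be deleted so that at most file_max remain.
 */
static size_t ids_to_prune(uint64_t *ids, size_t n, size_t file_max)
{
	assert(ids != NULL);
	qsort(ids, n, sizeof(*ids), cmp_id);
	return n > file_max ? n - file_max : 0;
}
```

The same three-way comparison could be used in a scandir() comparator
after parsing the id from each d_name.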

> 
> > +{
> > +    int ret = 0;
> > +    int i, num;
> > +    char *filename;
> > +    struct dirent **files;
> > +
> > +    if (type >= ARRAY_SIZE(virtio_pstore_file_prefix)) {
> > +        type = VIRTIO_PSTORE_TYPE_UNKNOWN;
> > +    }
> > +
> > +    prefix_idx = type;
> > +    prefix_len = strlen(virtio_pstore_file_prefix[type]);
> > +    prefix_count = 1;  /* only scan current type */
> > +
> > +    /* delete the oldest file in the same type */
> > +    num = scandir(s->directory, &files, filter_pstore, sort_pstore);
> > +    if (num < 0)
> > +        return num;
> > +    if (num < (int)s->file_max)
> > +        goto out;
> > +
> > +    filename = g_strdup_printf("%s/%s", s->directory, files[0]->d_name);
> > +    if (filename == NULL) {
> > +        ret = -1;
> > +        goto out;
> > +    }
> > +
> > +    ret = unlink(filename);
> 
> 
> 
> 
> 
> > +static gboolean pstore_async_read_fn(QIOChannel *ioc, GIOCondition condition,
> > +                                     gpointer data)
> > +{
> > +    struct pstore_read_arg *rarg = data;
> > +    struct virtio_pstore_fileinfo *info = &rarg->info;
> > +    VirtIOPstore *vps = rarg->vps;
> > +    VirtQueueElement *elem = rarg->elem;
> > +    struct virtio_pstore_res res;
> > +    size_t offset = sizeof(res) + sizeof(*info);
> > +    struct iovec *sg = elem->in_sg;
> > +    unsigned int sg_num = elem->in_num;
> > +    Error *err = NULL;
> > +    ssize_t len;
> > +    int ret;
> > +
> > +    /* skip res and fileinfo */
> > +    iov_discard_front(&sg, &sg_num, sizeof(res) + sizeof(*info));
> > +
> > +    len = qio_channel_readv(rarg->ioc, sg, sg_num, &err);
> > +    if (len < 0) {
> > +        if (errno == EAGAIN) {
> > +            len = 0;
> > +        }
> > +        ret = -1;
> > +    } else {
> > +        info->len = cpu_to_le32(len);
> > +        ret = 0;
> > +    }
> > +
> > +    res.cmd  = cpu_to_le16(VIRTIO_PSTORE_CMD_READ);
> > +    res.type = cpu_to_le16(VIRTIO_PSTORE_TYPE_UNKNOWN);
> > +    res.ret  = cpu_to_le32(ret);
> > +
> > +    /* now copy res and fileinfo */
> > +    iov_from_buf(elem->in_sg, elem->in_num, 0, &res, sizeof(res));
> > +    iov_from_buf(elem->in_sg, elem->in_num, sizeof(res), info, sizeof(*info));
> > +
> > +    len += offset;
> > +    virtqueue_push(vps->rvq, elem, len);
> > +    virtio_notify(VIRTIO_DEVICE(vps), vps->rvq);
> > +
> > +    return G_SOURCE_REMOVE;
> 
> G_SOURCE_REMOVE was added in glib 2.32, but QEMU only permits
> stuff that is present in 2.22. Just use "FALSE" instead.

Didn't know that, will change.

> 
> > +static ssize_t virtio_pstore_do_read(VirtIOPstore *s, VirtQueueElement *elem)
> > +{
> > +    char *filename = NULL;
> > +    int fd, idx;
> > +    struct stat stbuf;
> > +    struct pstore_read_arg *rarg = NULL;
> > +    Error *err = NULL;
> > +    int ret = -1;
> > +
> > +    if (s->file_idx >= s->num_file) {
> > +        return 0;
> > +    }
> > +
> > +    rarg = g_malloc(sizeof(*rarg));
> > +    if (rarg == NULL) {
> > +        return -1;
> > +    }
> > +
> > +    idx = s->file_idx++;
> > +    filename = virtio_pstore_from_filename(s, s->files[idx]->d_name,
> > +                                           &rarg->info);
> > +    if (filename == NULL) {
> > +        goto out;
> > +    }
> > +
> > +    fd = open(filename, O_RDONLY);
> > +    if (fd < 0) {
> > +        error_report("cannot open %s", filename);
> > +        goto out;
> > +    }
> > +
> > +    if (fstat(fd, &stbuf) < 0) {
> > +        goto out;
> > +    }
> > +
> > +    rarg->vps            = s;
> > +    rarg->elem           = elem;
> > +    rarg->info.id        = cpu_to_le64(rarg->info.id);
> > +    rarg->info.type      = cpu_to_le16(rarg->info.type);
> > +    rarg->info.flags     = cpu_to_le32(rarg->info.flags);
> > +    rarg->info.time_sec  = cpu_to_le64(stbuf.st_ctim.tv_sec);
> > +    rarg->info.time_nsec = cpu_to_le32(stbuf.st_ctim.tv_nsec);
> > +
> > +    rarg->ioc = qio_channel_new_fd(fd, &err);
> 
> You should just use qio_channel_open_path() and avoid the earlier
> call to open()

I did it that way because I need to call fstat() on the fd, and I wanted
to keep the generic ioc pointer.


> 
> > +    if (err) {
> > +        error_reportf_err(err, "cannot create io channel: ");
> > +        goto out;
> > +    }
> > +
> > +    qio_channel_set_blocking(rarg->ioc, false, &err);
> > +    qio_channel_add_watch(rarg->ioc, G_IO_IN, pstore_async_read_fn, rarg,
> > +                          free_rarg_fn);
> > +    g_free(filename);
> > +    return 1;
> > +
> > +out:
> > +    g_free(filename);
> > +    g_free(rarg);
> > +
> > +    return ret;
> > +}
> 
> 
> > +static ssize_t virtio_pstore_do_write(VirtIOPstore *s, VirtQueueElement *elem,
> > +                                      struct virtio_pstore_req *req)
> > +{
> > +    unsigned short type = le16_to_cpu(req->type);
> > +    char *filename = NULL;
> > +    int fd;
> > +    int flags = O_WRONLY | O_CREAT | O_TRUNC;
> > +    struct pstore_write_arg *warg = NULL;
> > +    Error *err = NULL;
> > +    int ret = -1;
> > +
> > +    /* do not keep same type of files more than 'file-max' */
> > +    rotate_pstore_file(s, type);
> > +
> > +    filename = virtio_pstore_to_filename(s, req);
> > +    if (filename == NULL) {
> > +        return -1;
> > +    }
> > +
> > +    warg = g_malloc(sizeof(*warg));
> > +    if (warg == NULL) {
> > +        goto out;
> > +    }
> > +
> > +    fd = open(filename, flags, 0644);
> > +    if (fd < 0) {
> > +        error_report("cannot open %s", filename);
> > +        ret = fd;
> > +        goto out;
> > +    }
> > +
> > +    warg->vps            = s;
> > +    warg->elem           = elem;
> > +    warg->req            = req;
> > +
> > +    warg->ioc = qio_channel_new_fd(fd, &err);
> 
> Same point about using new_path() instead of new_fd()

OK.

> 
> > +    if (err) {
> > +        error_reportf_err(err, "cannot create io channel: ");
> > +        goto out;
> > +    }
> > +
> > +    qio_channel_set_blocking(warg->ioc, false, &err);
> > +    qio_channel_add_watch(warg->ioc, G_IO_OUT, pstore_async_write_fn, warg,
> > +                          free_warg_fn);
> > +    g_free(filename);
> > +    return 1;
> > +
> > +out:
> > +    g_free(filename);
> > +    g_free(warg);
> > +    return ret;
> > +}
> > +
> > +static void virtio_pstore_handle_io(VirtIODevice *vdev, VirtQueue *vq)
> > +{
> > +    VirtIOPstore *s = VIRTIO_PSTORE(vdev);
> > +    VirtQueueElement *elem;
> > +    struct virtio_pstore_req req;
> > +    struct virtio_pstore_res res;
> > +    ssize_t len = 0;
> > +    int ret;
> > +
> > +    for (;;) {
> > +        elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
> > +        if (!elem) {
> > +            return;
> > +        }
> > +
> > +        if (elem->out_num < 1 || elem->in_num < 1) {
> > +            error_report("request or response buffer is missing");
> > +            exit(1);
> > +        }
> > +
> > +        if (elem->out_num > 2 || elem->in_num > 3) {
> > +            error_report("invalid number of input/output buffer");
> > +            exit(1);
> > +        }
> > +
> > +        len = iov_to_buf(elem->out_sg, elem->out_num, 0, &req, sizeof(req));
> > +        if (len != (ssize_t)sizeof(req)) {
> > +            error_report("invalid request size: %ld", (long)len);
> > +            exit(1);
> > +        }
> > +        res.cmd  = req.cmd;
> > +        res.type = req.type;
> > +
> > +        switch (le16_to_cpu(req.cmd)) {
> > +        case VIRTIO_PSTORE_CMD_OPEN:
> > +            ret = virtio_pstore_do_open(s);
> > +            break;
> > +        case VIRTIO_PSTORE_CMD_CLOSE:
> > +            ret = virtio_pstore_do_close(s);
> > +            break;
> > +        case VIRTIO_PSTORE_CMD_ERASE:
> > +            ret = virtio_pstore_do_erase(s, &req);
> > +            break;
> > +        case VIRTIO_PSTORE_CMD_READ:
> > +            ret = virtio_pstore_do_read(s, elem);
> > +            if (ret == 1) {
> > +                /* async channel io */
> > +                continue;
> > +            }
> > +            break;
> > +        case VIRTIO_PSTORE_CMD_WRITE:
> > +            ret = virtio_pstore_do_write(s, elem, &req);
> > +            if (ret == 1) {
> > +                /* async channel io */
> > +                continue;
> > +            }
> > +            break;
> > +        default:
> > +            ret = -1;
> > +            break;
> > +        }
> > +
> > +        res.ret = ret;
> > +
> > +        iov_from_buf(elem->in_sg, elem->in_num, 0, &res, sizeof(res));
> > +        virtqueue_push(vq, elem, sizeof(res) + len);
> > +
> > +        virtio_notify(vdev, vq);
> > +        g_free(elem);
> > +
> > +        if (ret < 0) {
> > +            return;
> > +        }
> > +    }
> > +}
> 
> Regards,
> Daniel

As always, thanks for your review!

Thanks,
Namhyung


* Re: [PATCH 2/3] qemu: Implement virtio-pstore device
  2016-08-26  4:48     ` Namhyung Kim
@ 2016-08-26 12:27       ` Daniel P. Berrange
  0 siblings, 0 replies; 31+ messages in thread
From: Daniel P. Berrange @ 2016-08-26 12:27 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: virtio-dev, kvm, qemu-devel, virtualization, LKML, Paolo Bonzini,
	Radim Krčmář,
	Michael S. Tsirkin, Anthony Liguori, Anton Vorontsov,
	Colin Cross, Kees Cook, Tony Luck, Steven Rostedt, Ingo Molnar,
	Minchan Kim

On Fri, Aug 26, 2016 at 01:48:40PM +0900, Namhyung Kim wrote:
> Hi Daniel,
> 
> On Wed, Aug 24, 2016 at 06:00:51PM -0400, Daniel P. Berrange wrote:

> > > +    fd = open(filename, O_RDONLY);
> > > +    if (fd < 0) {
> > > +        error_report("cannot open %s", filename);
> > > +        goto out;
> > > +    }
> > > +
> > > +    if (fstat(fd, &stbuf) < 0) {
> > > +        goto out;
> > > +    }
> > > +
> > > +    rarg->vps            = s;
> > > +    rarg->elem           = elem;
> > > +    rarg->info.id        = cpu_to_le64(rarg->info.id);
> > > +    rarg->info.type      = cpu_to_le16(rarg->info.type);
> > > +    rarg->info.flags     = cpu_to_le32(rarg->info.flags);
> > > +    rarg->info.time_sec  = cpu_to_le64(stbuf.st_ctim.tv_sec);
> > > +    rarg->info.time_nsec = cpu_to_le32(stbuf.st_ctim.tv_nsec);
> > > +
> > > +    rarg->ioc = qio_channel_new_fd(fd, &err);
> > 
> > You should just use qio_channel_open_path() and avoid the earlier
> > call to open()
> 
> I did it that way because I need to call fstat() on the fd, and I wanted
> to keep the generic ioc pointer.

I'd suggest just using an inline cast, e.g.

  fstat(QIO_CHANNEL_FILE(ioc)->fd, &stbuf)
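
The underlying idea in plain POSIX terms, as a minimal sketch without any
QEMU types: open the file once and query the descriptor that is already
open, so no second path lookup is needed.  open_and_stat() is a made-up
name for illustration only.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Illustrative only: open 'path' read-only and fill 'st' from the open
 * descriptor.  Returns the fd on success.  On failure returns -1, and the
 * descriptor is closed so that it does not leak.
 */
static int open_and_stat(const char *path, struct stat *st)
{
    int fd = open(path, O_RDONLY);

    if (fd < 0) {
        return -1;
    }
    if (fstat(fd, st) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

The caller owns the returned descriptor and is expected to close it.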


Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|


* Re: [PATCH 1/3] virtio: Basic implementation of virtio pstore driver
  2016-08-20  8:07 ` [PATCH 1/3] virtio: Basic implementation of virtio pstore driver Namhyung Kim
@ 2016-09-13 15:19   ` Michael S. Tsirkin
  2016-09-16  9:05     ` Namhyung Kim
  2016-11-10 16:39   ` Michael S. Tsirkin
  1 sibling, 1 reply; 31+ messages in thread
From: Michael S. Tsirkin @ 2016-09-13 15:19 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: virtio-dev, kvm, qemu-devel, virtualization, LKML, Paolo Bonzini,
	Radim Krčmář,
	Anthony Liguori, Anton Vorontsov, Colin Cross, Kees Cook,
	Tony Luck, Steven Rostedt, Ingo Molnar, Minchan Kim

On Sat, Aug 20, 2016 at 05:07:42PM +0900, Namhyung Kim wrote:
> The virtio pstore driver provides an interface to the pstore subsystem so
> that the guest kernel's log/dump messages can be saved on the host
> machine.  Users can access the log file directly on the host, or on the
> guest at the next boot using the pstore filesystem.  It currently deals
> with the kernel log (printk) buffer only, but we can extend it to handle
> other information (like ftrace dumps) later.
> 
> It supports legacy PCI devices using a single order-2 page buffer.  It uses
> two virtqueues - one for (sync) read and another for (async) write.
> Since it cannot wait for a write to finish, it supports up to 128
> concurrent IOs.  The buffer size is configurable now.
> 
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Radim Krčmář <rkrcmar@redhat.com>
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Anthony Liguori <aliguori@amazon.com>
> Cc: Anton Vorontsov <anton@enomsg.org>
> Cc: Colin Cross <ccross@android.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Tony Luck <tony.luck@intel.com>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: kvm@vger.kernel.org
> Cc: qemu-devel@nongnu.org
> Cc: virtualization@lists.linux-foundation.org
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  drivers/virtio/Kconfig             |  10 +
>  drivers/virtio/Makefile            |   1 +
>  drivers/virtio/virtio_pstore.c     | 417 +++++++++++++++++++++++++++++++++++++
>  include/uapi/linux/Kbuild          |   1 +
>  include/uapi/linux/virtio_ids.h    |   1 +
>  include/uapi/linux/virtio_pstore.h |  74 +++++++
>  6 files changed, 504 insertions(+)
>  create mode 100644 drivers/virtio/virtio_pstore.c
>  create mode 100644 include/uapi/linux/virtio_pstore.h
> 
> diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
> index 77590320d44c..8f0e6c796c12 100644
> --- a/drivers/virtio/Kconfig
> +++ b/drivers/virtio/Kconfig
> @@ -58,6 +58,16 @@ config VIRTIO_INPUT
>  
>  	 If unsure, say M.
>  
> +config VIRTIO_PSTORE
> +	tristate "Virtio pstore driver"
> +	depends on VIRTIO
> +	depends on PSTORE
> +	---help---
> +	 This driver supports virtio pstore devices to save/restore
> +	 panic and oops messages on the host.
> +
> +	 If unsure, say M.
> +
>   config VIRTIO_MMIO
>  	tristate "Platform bus driver for memory mapped virtio devices"
>  	depends on HAS_IOMEM && HAS_DMA
> diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile
> index 41e30e3dc842..bee68cb26d48 100644
> --- a/drivers/virtio/Makefile
> +++ b/drivers/virtio/Makefile
> @@ -5,3 +5,4 @@ virtio_pci-y := virtio_pci_modern.o virtio_pci_common.o
>  virtio_pci-$(CONFIG_VIRTIO_PCI_LEGACY) += virtio_pci_legacy.o
>  obj-$(CONFIG_VIRTIO_BALLOON) += virtio_balloon.o
>  obj-$(CONFIG_VIRTIO_INPUT) += virtio_input.o
> +obj-$(CONFIG_VIRTIO_PSTORE) += virtio_pstore.o
> diff --git a/drivers/virtio/virtio_pstore.c b/drivers/virtio/virtio_pstore.c
> new file mode 100644
> index 000000000000..0a63c7db4278
> --- /dev/null
> +++ b/drivers/virtio/virtio_pstore.c
> @@ -0,0 +1,417 @@
> +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> +
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/pstore.h>
> +#include <linux/virtio.h>
> +#include <linux/virtio_config.h>
> +#include <uapi/linux/virtio_ids.h>
> +#include <uapi/linux/virtio_pstore.h>
> +
> +#define VIRT_PSTORE_ORDER    2
> +#define VIRT_PSTORE_BUFSIZE  (4096 << VIRT_PSTORE_ORDER)
> +#define VIRT_PSTORE_NR_REQ   128

Where do these numbers come from?
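
One reading of the constants, as a sketch rather than an answer from the
author: order 2 means 2^2 = 4 pages, and with the hard-coded 4096-byte
page that gives a 16 KiB buffer, while request slots are reused modulo
128.  The 4096 literal is itself an assumption about the page size, and
slot_index() is a hypothetical helper that only mirrors the modulo used in
virt_pstore_get_reqs().

```c
#define VIRT_PSTORE_ORDER    2
#define VIRT_PSTORE_BUFSIZE  (4096 << VIRT_PSTORE_ORDER)
#define VIRT_PSTORE_NR_REQ   128

/* 4 pages of 4096 bytes each: 16384 bytes (16 KiB). */
_Static_assert(VIRT_PSTORE_BUFSIZE == 16384, "order-2 buffer is 16 KiB");

/* Hypothetical helper: request ids wrap around the fixed slot array. */
static unsigned int slot_index(unsigned int req_id)
{
    return req_id % VIRT_PSTORE_NR_REQ;
}
```

Since slots wrap, request id 128 reuses slot 0, so a 129th write in flight
would overwrite the oldest pending request.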


> +
> +struct virtio_pstore {
> +	struct virtio_device	*vdev;
> +	struct virtqueue	*vq[2];
> +	struct pstore_info	 pstore;
> +	struct virtio_pstore_req req[VIRT_PSTORE_NR_REQ];
> +	struct virtio_pstore_res res[VIRT_PSTORE_NR_REQ];
> +	unsigned int		 req_id;
> +
> +	/* Waiting for host to ack */
> +	wait_queue_head_t	acked;
> +	int			failed;
> +};
> +
> +#define TYPE_TABLE_ENTRY(_entry)				\
> +	{ PSTORE_TYPE_##_entry, VIRTIO_PSTORE_TYPE_##_entry }
> +
> +struct type_table {
> +	int pstore;
> +	u16 virtio;
> +} type_table[] = {
> +	TYPE_TABLE_ENTRY(DMESG),
> +};
> +
> +#undef TYPE_TABLE_ENTRY

Let's not play preprocessor games until this becomes
a big issue. Simple
{ PSTORE_TYPE_DMESG, VIRTIO_PSTORE_TYPE_DMESG}
does the trick just as well for now.
Also see below.



> +
> +

single empty line pls.

> +static u16 to_virtio_type(struct virtio_pstore *vps, enum pstore_type_id type)
> +{
> +	unsigned int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(type_table); i++) {
> +		if (type == type_table[i].pstore)
> +			return cpu_to_virtio16(vps->vdev, type_table[i].virtio);
> +	}

Rather complex for something that always returns a single value.
Why do we need a table at all?
How about a switch statement?

static u16 to_virtio_type(struct virtio_pstore *vps, enum pstore_type_id type)
{
    switch (type) {
    case PSTORE_TYPE_DMESG:
        return VIRTIO_PSTORE_TYPE_DMESG;
    default:
        return VIRTIO_PSTORE_TYPE_UNKNOWN;
    }
}
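
The reverse direction can be written the same way.  A self-contained
sketch of both mappings follows.  The pstore enum here is a stand-in with
illustrative values, only the two VIRTIO_PSTORE_TYPE_* values are taken
from the proposed header, and the vps argument and endian conversion are
left out to keep it short.

```c
#include <stdint.h>

/* Stand-in for the kernel's enum pstore_type_id; values are illustrative. */
enum pstore_type_id {
    PSTORE_TYPE_DMESG   = 0,
    PSTORE_TYPE_UNKNOWN = 255,
};

/* These two values match the proposed virtio_pstore.h. */
#define VIRTIO_PSTORE_TYPE_UNKNOWN  0
#define VIRTIO_PSTORE_TYPE_DMESG    1

static uint16_t to_virtio_type(enum pstore_type_id type)
{
    switch (type) {
    case PSTORE_TYPE_DMESG:
        return VIRTIO_PSTORE_TYPE_DMESG;
    default:
        return VIRTIO_PSTORE_TYPE_UNKNOWN;
    }
}

static enum pstore_type_id from_virtio_type(uint16_t type)
{
    switch (type) {
    case VIRTIO_PSTORE_TYPE_DMESG:
        return PSTORE_TYPE_DMESG;
    default:
        return PSTORE_TYPE_UNKNOWN;
    }
}
```

Adding another type later means adding one case to each switch.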


> +
> +	return cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_TYPE_UNKNOWN);
> +}

This returns an incorrect type: cpu_to_virtio16() yields a __virtio16,
but the function is declared to return u16.


> +
> +static enum pstore_type_id from_virtio_type(struct virtio_pstore *vps, u16 type)
> +{
> +	unsigned int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(type_table); i++) {
> +		if (virtio16_to_cpu(vps->vdev, type) == type_table[i].virtio)
> +			return type_table[i].pstore;
> +	}
> +
> +	return PSTORE_TYPE_UNKNOWN;
> +}
> +
> +static void virtpstore_ack(struct virtqueue *vq)
> +{
> +	struct virtio_pstore *vps = vq->vdev->priv;
> +
> +	wake_up(&vps->acked);
> +}
> +
> +static void virtpstore_check(struct virtqueue *vq)
> +{
> +	struct virtio_pstore *vps = vq->vdev->priv;
> +	struct virtio_pstore_res *res;
> +	unsigned int len;
> +
> +	res = virtqueue_get_buf(vq, &len);
> +	if (res == NULL)
> +		return;
> +
> +	if (virtio32_to_cpu(vq->vdev, res->ret) < 0)
> +		vps->failed = 1;
> +}
> +
> +static void virt_pstore_get_reqs(struct virtio_pstore *vps,
> +				 struct virtio_pstore_req **preq,
> +				 struct virtio_pstore_res **pres)
> +{
> +	unsigned int idx = vps->req_id++ % VIRT_PSTORE_NR_REQ;
> +
> +	*preq = &vps->req[idx];
> +	*pres = &vps->res[idx];
> +
> +	memset(*preq, 0, sizeof(**preq));
> +	memset(*pres, 0, sizeof(**pres));
> +}
> +
> +static int virt_pstore_open(struct pstore_info *psi)
> +{
> +	struct virtio_pstore *vps = psi->data;
> +	struct virtio_pstore_req *req;
> +	struct virtio_pstore_res *res;
> +	struct scatterlist sgo[1], sgi[1];
> +	struct scatterlist *sgs[2] = { sgo, sgi };
> +	unsigned int len;
> +
> +	virt_pstore_get_reqs(vps, &req, &res);
> +
> +	req->cmd = cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_CMD_OPEN);
> +
> +	sg_init_one(sgo, req, sizeof(*req));
> +	sg_init_one(sgi, res, sizeof(*res));
> +	virtqueue_add_sgs(vps->vq[0], sgs, 1, 1, vps, GFP_KERNEL);
> +	virtqueue_kick(vps->vq[0]);
> +
> +	wait_event(vps->acked, virtqueue_get_buf(vps->vq[0], &len));
> +	return virtio32_to_cpu(vps->vdev, res->ret);
> +}
> +
> +static int virt_pstore_close(struct pstore_info *psi)
> +{
> +	struct virtio_pstore *vps = psi->data;
> +	struct virtio_pstore_req *req = &vps->req[vps->req_id];
> +	struct virtio_pstore_res *res = &vps->res[vps->req_id];
> +	struct scatterlist sgo[1], sgi[1];
> +	struct scatterlist *sgs[2] = { sgo, sgi };
> +	unsigned int len;
> +
> +	virt_pstore_get_reqs(vps, &req, &res);
> +
> +	req->cmd = cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_CMD_CLOSE);
> +
> +	sg_init_one(sgo, req, sizeof(*req));
> +	sg_init_one(sgi, res, sizeof(*res));
> +	virtqueue_add_sgs(vps->vq[0], sgs, 1, 1, vps, GFP_KERNEL);
> +	virtqueue_kick(vps->vq[0]);
> +
> +	wait_event(vps->acked, virtqueue_get_buf(vps->vq[0], &len));
> +	return virtio32_to_cpu(vps->vdev, res->ret);
> +}
> +
> +static ssize_t virt_pstore_read(u64 *id, enum pstore_type_id *type,
> +				int *count, struct timespec *time,
> +				char **buf, bool *compressed,
> +				ssize_t *ecc_notice_size,
> +				struct pstore_info *psi)
> +{
> +	struct virtio_pstore *vps = psi->data;
> +	struct virtio_pstore_req *req;
> +	struct virtio_pstore_res *res;
> +	struct virtio_pstore_fileinfo info;
> +	struct scatterlist sgo[1], sgi[3];
> +	struct scatterlist *sgs[2] = { sgo, sgi };
> +	unsigned int len;
> +	unsigned int flags;
> +	int ret;
> +	void *bf;
> +
> +	virt_pstore_get_reqs(vps, &req, &res);
> +
> +	req->cmd = cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_CMD_READ);
> +
> +	sg_init_one(sgo, req, sizeof(*req));
> +	sg_init_table(sgi, 3);
> +	sg_set_buf(&sgi[0], res, sizeof(*res));
> +	sg_set_buf(&sgi[1], &info, sizeof(info));
> +	sg_set_buf(&sgi[2], psi->buf, psi->bufsize);
> +	virtqueue_add_sgs(vps->vq[0], sgs, 1, 1, vps, GFP_KERNEL);
> +	virtqueue_kick(vps->vq[0]);
> +
> +	wait_event(vps->acked, virtqueue_get_buf(vps->vq[0], &len));
> +	if (len < sizeof(*res) + sizeof(info))
> +		return -1;
> +
> +	ret = virtio32_to_cpu(vps->vdev, res->ret);
> +	if (ret < 0)
> +		return ret;
> +
> +	len = virtio32_to_cpu(vps->vdev, info.len);
> +
> +	bf = kmalloc(len, GFP_KERNEL);
> +	if (bf == NULL)
> +		return -ENOMEM;
> +
> +	*id    = virtio64_to_cpu(vps->vdev, info.id);
> +	*type  = from_virtio_type(vps, info.type);
> +	*count = virtio32_to_cpu(vps->vdev, info.count);
> +
> +	flags = virtio32_to_cpu(vps->vdev, info.flags);
> +	*compressed = flags & VIRTIO_PSTORE_FL_COMPRESSED;
> +
> +	time->tv_sec  = virtio64_to_cpu(vps->vdev, info.time_sec);
> +	time->tv_nsec = virtio32_to_cpu(vps->vdev, info.time_nsec);
> +
> +	memcpy(bf, psi->buf, len);
> +	*buf = bf;
> +
> +	return len;
> +}
> +
> +static int notrace virt_pstore_write(enum pstore_type_id type,
> +				     enum kmsg_dump_reason reason,
> +				     u64 *id, unsigned int part, int count,
> +				     bool compressed, size_t size,
> +				     struct pstore_info *psi)
> +{
> +	struct virtio_pstore *vps = psi->data;
> +	struct virtio_pstore_req *req;
> +	struct virtio_pstore_res *res;
> +	struct scatterlist sgo[2], sgi[1];
> +	struct scatterlist *sgs[2] = { sgo, sgi };
> +	unsigned int flags = compressed ? VIRTIO_PSTORE_FL_COMPRESSED : 0;
> +
> +	if (vps->failed)
> +		return -1;
> +
> +	*id = vps->req_id;
> +	virt_pstore_get_reqs(vps, &req, &res);
> +
> +	req->cmd   = cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_CMD_WRITE);
> +	req->type  = to_virtio_type(vps, type);
> +	req->flags = cpu_to_virtio32(vps->vdev, flags);
> +
> +	sg_init_table(sgo, 2);
> +	sg_set_buf(&sgo[0], req, sizeof(*req));
> +	sg_set_buf(&sgo[1], pstore_get_buf(psi), size);
> +	sg_init_one(sgi, res, sizeof(*res));
> +	virtqueue_add_sgs(vps->vq[1], sgs, 1, 1, vps, GFP_ATOMIC);
> +	virtqueue_kick(vps->vq[1]);
> +
> +	return 0;
> +}
> +
> +static int virt_pstore_erase(enum pstore_type_id type, u64 id, int count,
> +			     struct timespec time, struct pstore_info *psi)
> +{
> +	struct virtio_pstore *vps = psi->data;
> +	struct virtio_pstore_req *req;
> +	struct virtio_pstore_res *res;
> +	struct scatterlist sgo[1], sgi[1];
> +	struct scatterlist *sgs[2] = { sgo, sgi };
> +	unsigned int len;
> +
> +	virt_pstore_get_reqs(vps, &req, &res);
> +
> +	req->cmd   = cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_CMD_ERASE);
> +	req->type  = to_virtio_type(vps, type);
> +	req->id	   = cpu_to_virtio64(vps->vdev, id);
> +	req->count = cpu_to_virtio32(vps->vdev, count);
> +
> +	sg_init_one(sgo, req, sizeof(*req));
> +	sg_init_one(sgi, res, sizeof(*res));
> +	virtqueue_add_sgs(vps->vq[0], sgs, 1, 1, vps, GFP_KERNEL);
> +	virtqueue_kick(vps->vq[0]);
> +
> +	wait_event(vps->acked, virtqueue_get_buf(vps->vq[0], &len));
> +	return virtio32_to_cpu(vps->vdev, res->ret);
> +}
> +
> +static int virt_pstore_init(struct virtio_pstore *vps)
> +{
> +	struct pstore_info *psinfo = &vps->pstore;
> +	int err;
> +
> +	if (!psinfo->bufsize)
> +		psinfo->bufsize = VIRT_PSTORE_BUFSIZE;
> +
> +	psinfo->buf = alloc_pages_exact(psinfo->bufsize, GFP_KERNEL);
> +	if (!psinfo->buf) {
> +		pr_err("cannot allocate pstore buffer\n");
> +		return -ENOMEM;
> +	}
> +
> +	psinfo->owner = THIS_MODULE;
> +	psinfo->name  = "virtio";
> +	psinfo->open  = virt_pstore_open;
> +	psinfo->close = virt_pstore_close;
> +	psinfo->read  = virt_pstore_read;
> +	psinfo->erase = virt_pstore_erase;
> +	psinfo->write = virt_pstore_write;
> +	psinfo->flags = PSTORE_FLAGS_DMESG;
> +
> +	psinfo->data  = vps;
> +	spin_lock_init(&psinfo->buf_lock);
> +
> +	err = pstore_register(psinfo);
> +	if (err)
> +		kfree(psinfo->buf);
> +
> +	return err;
> +}
> +
> +static int virt_pstore_exit(struct virtio_pstore *vps)
> +{
> +	struct pstore_info *psinfo = &vps->pstore;
> +
> +	pstore_unregister(psinfo);

I don't know enough about pstore - does this
actually ensure that
1. all existing users close the device
2. no new users can open it
somehow?

> +
> +	free_pages_exact(psinfo->buf, psinfo->bufsize);
> +	psinfo->buf = NULL;
> +	psinfo->bufsize = 0;
> +
> +	return 0;
> +}
> +
> +static int virtpstore_init_vqs(struct virtio_pstore *vps)
> +{
> +	vq_callback_t *callbacks[] = { virtpstore_ack, virtpstore_check };
> +	const char *names[] = { "pstore_read", "pstore_write" };
> +
> +	return vps->vdev->config->find_vqs(vps->vdev, 2, vps->vq,
> +					   callbacks, names);
> +}
> +
> +static void virtpstore_init_config(struct virtio_pstore *vps)
> +{
> +	u32 bufsize;
> +
> +	virtio_cread(vps->vdev, struct virtio_pstore_config, bufsize, &bufsize);
> +
> +	vps->pstore.bufsize = PAGE_ALIGN(bufsize);
> +}
> +
> +static void virtpstore_confirm_config(struct virtio_pstore *vps)
> +{
> +	u32 bufsize = vps->pstore.bufsize;
> +
> +	virtio_cwrite(vps->vdev, struct virtio_pstore_config, bufsize,
> +		     &bufsize);
> +}
> +
> +static int virtpstore_probe(struct virtio_device *vdev)
> +{
> +	struct virtio_pstore *vps;
> +	int err;
> +
> +	if (!vdev->config->get) {
> +		dev_err(&vdev->dev, "driver init: config access disabled\n");
> +		return -EINVAL;
> +	}
> +
> +	vdev->priv = vps = kzalloc(sizeof(*vps), GFP_KERNEL);
> +	if (!vps) {
> +		err = -ENOMEM;
> +		goto out;
> +	}
> +	vps->vdev = vdev;
> +
> +	err = virtpstore_init_vqs(vps);
> +	if (err < 0)
> +		goto out_free;
> +
> +	virtpstore_init_config(vps);
> +
> +	err = virt_pstore_init(vps);
> +	if (err)
> +		goto out_del_vq;
> +
> +	virtpstore_confirm_config(vps);
> +
> +	init_waitqueue_head(&vps->acked);
> +
> +	virtio_device_ready(vdev);
> +
> +	dev_info(&vdev->dev, "driver init: ok (bufsize = %luK, flags = %x)\n",
> +		 vps->pstore.bufsize >> 10, vps->pstore.flags);
> +
> +	return 0;
> +
> +out_del_vq:
> +	vdev->config->del_vqs(vdev);
> +out_free:
> +	kfree(vps);
> +out:
> +	dev_err(&vdev->dev, "driver init: failed with %d\n", err);
> +	return err;
> +}
> +
> +static void virtpstore_remove(struct virtio_device *vdev)
> +{
> +	struct virtio_pstore *vps = vdev->priv;
> +
> +	virt_pstore_exit(vps);
> +
> +	/* Now we reset the device so we can clean up the queues. */
> +	vdev->config->reset(vdev);
> +
> +	vdev->config->del_vqs(vdev);
> +
> +	kfree(vps);
> +}
> +
> +static unsigned int features[] = {
> +};
> +
> +static struct virtio_device_id id_table[] = {
> +	{ VIRTIO_ID_PSTORE, VIRTIO_DEV_ANY_ID },
> +	{ 0 },
> +};
> +
> +static struct virtio_driver virtio_pstore_driver = {

We need some way to avoid trying to load this
as a legacy device. There isn't a way to do it yet,
so I won't block your patch on this, but pls try to
come up with something, and I will, too.


> +	.driver.name         = KBUILD_MODNAME,
> +	.driver.owner        = THIS_MODULE,
> +	.feature_table       = features,
> +	.feature_table_size  = ARRAY_SIZE(features),
> +	.id_table            = id_table,
> +	.probe               = virtpstore_probe,
> +	.remove              = virtpstore_remove,

Won't this need freeze/restore callbacks?

> +};
> +
> +module_virtio_driver(virtio_pstore_driver);
> +MODULE_DEVICE_TABLE(virtio, id_table);
> +
> +MODULE_LICENSE("GPL");
> +MODULE_AUTHOR("Namhyung Kim <namhyung@kernel.org>");
> +MODULE_DESCRIPTION("Virtio pstore driver");
> diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild
> index 6d4e92ccdc91..9bbb1554d8b2 100644
> --- a/include/uapi/linux/Kbuild
> +++ b/include/uapi/linux/Kbuild
> @@ -449,6 +449,7 @@ header-y += virtio_ids.h
>  header-y += virtio_input.h
>  header-y += virtio_net.h
>  header-y += virtio_pci.h
> +header-y += virtio_pstore.h
>  header-y += virtio_ring.h
>  header-y += virtio_rng.h
>  header-y += virtio_scsi.h
> diff --git a/include/uapi/linux/virtio_ids.h b/include/uapi/linux/virtio_ids.h
> index 77925f587b15..c72a9ab588c0 100644
> --- a/include/uapi/linux/virtio_ids.h
> +++ b/include/uapi/linux/virtio_ids.h
> @@ -41,5 +41,6 @@
>  #define VIRTIO_ID_CAIF	       12 /* Virtio caif */
>  #define VIRTIO_ID_GPU          16 /* virtio GPU */
>  #define VIRTIO_ID_INPUT        18 /* virtio input */
> +#define VIRTIO_ID_PSTORE       22 /* virtio pstore */
>  
>  #endif /* _LINUX_VIRTIO_IDS_H */
> diff --git a/include/uapi/linux/virtio_pstore.h b/include/uapi/linux/virtio_pstore.h
> new file mode 100644
> index 000000000000..f4b0d204d8ae
> --- /dev/null
> +++ b/include/uapi/linux/virtio_pstore.h
> @@ -0,0 +1,74 @@
> +#ifndef _LINUX_VIRTIO_PSTORE_H
> +#define _LINUX_VIRTIO_PSTORE_H
> +/* This header is BSD licensed so anyone can use the definitions to implement
> + * compatible drivers/servers.
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions
> + * are met:
> + * 1. Redistributions of source code must retain the above copyright
> + *    notice, this list of conditions and the following disclaimer.
> + * 2. Redistributions in binary form must reproduce the above copyright
> + *    notice, this list of conditions and the following disclaimer in the
> + *    documentation and/or other materials provided with the distribution.
> + * 3. Neither the name of IBM nor the names of its contributors
> + *    may be used to endorse or promote products derived from this software
> + *    without specific prior written permission.
> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS ``AS IS'' AND
> + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
> + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
> + * ARE DISCLAIMED.  IN NO EVENT SHALL IBM OR CONTRIBUTORS BE LIABLE
> + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
> + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
> + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
> + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
> + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
> + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
> + * SUCH DAMAGE. */
> +#include <linux/types.h>
> +#include <linux/virtio_types.h>
> +
> +#define VIRTIO_PSTORE_CMD_NULL   0
> +#define VIRTIO_PSTORE_CMD_OPEN   1
> +#define VIRTIO_PSTORE_CMD_READ   2
> +#define VIRTIO_PSTORE_CMD_WRITE  3
> +#define VIRTIO_PSTORE_CMD_ERASE  4
> +#define VIRTIO_PSTORE_CMD_CLOSE  5
> +
> +#define VIRTIO_PSTORE_TYPE_UNKNOWN  0
> +#define VIRTIO_PSTORE_TYPE_DMESG    1
> +
> +#define VIRTIO_PSTORE_FL_COMPRESSED  1

Most other headers use _F_ and not _FL_
Also, we specify bit number and not the
bitmask. So:

#define VIRTIO_PSTORE_F_COMPRESSED  0

and

(0x1 << VIRTIO_PSTORE_F_COMPRESSED)
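
As a self-contained sketch of that convention: the header defines the bit
number, and users build the mask at the point of use.
pstore_flag_set() is a hypothetical helper, not something the patch or the
existing headers provide.

```c
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_PSTORE_F_COMPRESSED  0   /* bit number, not a mask */

/* Hypothetical helper: test a flag that is identified by its bit number. */
static bool pstore_flag_set(uint32_t flags, unsigned int bit)
{
    return (flags & (UINT32_C(1) << bit)) != 0;
}
```

A caller would then write
pstore_flag_set(flags, VIRTIO_PSTORE_F_COMPRESSED) instead of masking with
the constant directly.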


> +
> +struct virtio_pstore_req {
> +	__virtio16		cmd;
> +	__virtio16		type;
> +	__virtio32		flags;
> +	__virtio64		id;
> +	__virtio32		count;
> +	__virtio32		reserved;
> +};
> +
> +struct virtio_pstore_res {
> +	__virtio16		cmd;
> +	__virtio16		type;
> +	__virtio32		ret;
> +};
> +
> +struct virtio_pstore_fileinfo {
> +	__virtio64		id;
> +	__virtio32		count;
> +	__virtio16		type;
> +	__virtio16		unused;
> +	__virtio32		flags;
> +	__virtio32		len;
> +	__virtio64		time_sec;
> +	__virtio32		time_nsec;
> +	__virtio32		reserved;

Any reason one is reserved and the other is unused?
If not, just call them pad1, pad2?
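
For illustration, here is the same layout with plain fixed-width types and
the two padding fields renamed, plus compile-time checks that every field
sits at its natural offset.  This is only a userspace sketch:
struct fileinfo_sketch is a made-up name, and the real header would keep
the virtio/le types.

```c
#include <stddef.h>
#include <stdint.h>

/* Sketch of the fileinfo layout; field order follows the patch. */
struct fileinfo_sketch {
    uint64_t id;
    uint32_t count;
    uint16_t type;
    uint16_t pad1;       /* was "unused" */
    uint32_t flags;
    uint32_t len;
    uint64_t time_sec;
    uint32_t time_nsec;
    uint32_t pad2;       /* was "reserved" */
};

/* pad1 rounds "type" up to 4 bytes; pad2 rounds the struct up to 8. */
_Static_assert(offsetof(struct fileinfo_sketch, flags) == 16, "flags");
_Static_assert(offsetof(struct fileinfo_sketch, time_sec) == 24, "time_sec");
_Static_assert(sizeof(struct fileinfo_sketch) == 40, "total size");
```

Both fields play the same role, explicit padding, which is the argument
for giving them the same kind of name.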

> +};
> +
> +struct virtio_pstore_config {
> +	__virtio32		bufsize;
> +};
> +

The __virtio types exist for compatibility with legacy devices.
New devices should just use __le everywhere.

Let me post a patch that adds config space accessors
so you can do this.
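
What "always little-endian" means on the wire, as a minimal userspace
sketch.  In the kernel one would use the existing cpu_to_le32() and
le32_to_cpu() helpers rather than open-coding this; put_le32() and
get_le32() are illustrative names only.

```c
#include <stdint.h>

/* Store a 32-bit value as little-endian bytes, whatever the host order. */
static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)((v >> 8) & 0xff);
    p[2] = (uint8_t)((v >> 16) & 0xff);
    p[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Load a 32-bit value from little-endian bytes. */
static uint32_t get_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}
```

The byte layout is fixed by the format, so neither side needs to know the
other's native byte order.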


> +#endif /* _LINUX_VIRTIO_PSTORE_H */
> -- 
> 2.9.3


* Re: [PATCH 2/3] qemu: Implement virtio-pstore device
  2016-08-20  8:07 ` [PATCH 2/3] qemu: Implement virtio-pstore device Namhyung Kim
  2016-08-24 22:00   ` Daniel P. Berrange
@ 2016-09-13 15:57   ` Michael S. Tsirkin
  2016-09-16 10:05     ` Namhyung Kim
  1 sibling, 1 reply; 31+ messages in thread
From: Michael S. Tsirkin @ 2016-09-13 15:57 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: virtio-dev, kvm, qemu-devel, virtualization, LKML, Paolo Bonzini,
	Radim Krčmář,
	Anthony Liguori, Anton Vorontsov, Colin Cross, Kees Cook,
	Tony Luck, Steven Rostedt, Ingo Molnar, Minchan Kim,
	Daniel P . Berrange

On Sat, Aug 20, 2016 at 05:07:43PM +0900, Namhyung Kim wrote:
> Add a virtio pstore device to allow kernel log files to be saved on the
> host.  It will save the log files in the directory given by the pstore
> device option.
> 
>   $ qemu-system-x86_64 -device virtio-pstore,directory=dir-xx ...
> 
>   (guest) # echo c > /proc/sysrq-trigger
> 
>   $ ls dir-xx
>   dmesg-1.enc.z  dmesg-2.enc.z
> 
> The log files are usually compressed using zlib.  Users can see the log
> messages directly on the host or on the guest (using pstore filesystem).
> 
> The 'directory' property is required for the virtio-pstore device to work.
> It also adds a 'bufsize' property to set the size of the pstore buffer.
> 
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Radim Krčmář <rkrcmar@redhat.com>
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Anthony Liguori <aliguori@amazon.com>
> Cc: Anton Vorontsov <anton@enomsg.org>
> Cc: Colin Cross <ccross@android.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Tony Luck <tony.luck@intel.com>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: Daniel P. Berrange <berrange@redhat.com>
> Cc: kvm@vger.kernel.org
> Cc: qemu-devel@nongnu.org
> Cc: virtualization@lists.linux-foundation.org
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  hw/virtio/Makefile.objs                        |   2 +-
>  hw/virtio/virtio-pci.c                         |  52 ++
>  hw/virtio/virtio-pci.h                         |  14 +
>  hw/virtio/virtio-pstore.c                      | 699 +++++++++++++++++++++++++
>  include/hw/pci/pci.h                           |   1 +
>  include/hw/virtio/virtio-pstore.h              |  36 ++
>  include/standard-headers/linux/virtio_ids.h    |   1 +
>  include/standard-headers/linux/virtio_pstore.h |  76 +++
>  qdev-monitor.c                                 |   1 +
>  9 files changed, 881 insertions(+), 1 deletion(-)
>  create mode 100644 hw/virtio/virtio-pstore.c
>  create mode 100644 include/hw/virtio/virtio-pstore.h
>  create mode 100644 include/standard-headers/linux/virtio_pstore.h
> 
> diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs
> index 3e2b175..aae7082 100644
> --- a/hw/virtio/Makefile.objs
> +++ b/hw/virtio/Makefile.objs
> @@ -4,4 +4,4 @@ common-obj-y += virtio-bus.o
>  common-obj-y += virtio-mmio.o
>  
>  obj-y += virtio.o virtio-balloon.o 
> -obj-$(CONFIG_LINUX) += vhost.o vhost-backend.o vhost-user.o
> +obj-$(CONFIG_LINUX) += vhost.o vhost-backend.o vhost-user.o virtio-pstore.o
> diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> index 755f921..c184823 100644
> --- a/hw/virtio/virtio-pci.c
> +++ b/hw/virtio/virtio-pci.c
> @@ -2416,6 +2416,57 @@ static const TypeInfo virtio_host_pci_info = {
>  };
>  #endif
>  
> +/* virtio-pstore-pci */
> +
> +static void virtio_pstore_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
> +{
> +    VirtIOPstorePCI *vps = VIRTIO_PSTORE_PCI(vpci_dev);
> +    DeviceState *vdev = DEVICE(&vps->vdev);
> +    Error *err = NULL;
> +
> +    qdev_set_parent_bus(vdev, BUS(&vpci_dev->bus));
> +    object_property_set_bool(OBJECT(vdev), true, "realized", &err);
> +    if (err) {
> +        error_propagate(errp, err);
> +        return;
> +    }
> +}
> +
> +static void virtio_pstore_pci_class_init(ObjectClass *klass, void *data)
> +{
> +    DeviceClass *dc = DEVICE_CLASS(klass);
> +    VirtioPCIClass *k = VIRTIO_PCI_CLASS(klass);
> +    PCIDeviceClass *pcidev_k = PCI_DEVICE_CLASS(klass);
> +
> +    k->realize = virtio_pstore_pci_realize;
> +    set_bit(DEVICE_CATEGORY_MISC, dc->categories);
> +
> +    pcidev_k->vendor_id = PCI_VENDOR_ID_REDHAT_QUMRANET;
> +    pcidev_k->device_id = PCI_DEVICE_ID_VIRTIO_PSTORE;
> +    pcidev_k->revision = VIRTIO_PCI_ABI_VERSION;
> +    pcidev_k->class_id = PCI_CLASS_OTHERS;
> +}
> +
> +static void virtio_pstore_pci_instance_init(Object *obj)
> +{
> +    VirtIOPstorePCI *dev = VIRTIO_PSTORE_PCI(obj);
> +
> +    virtio_instance_init_common(obj, &dev->vdev, sizeof(dev->vdev),
> +                                TYPE_VIRTIO_PSTORE);
> +    object_property_add_alias(obj, "directory", OBJECT(&dev->vdev),
> +                              "directory", &error_abort);
> +    object_property_add_alias(obj, "bufsize", OBJECT(&dev->vdev),
> +                              "bufsize", &error_abort);
> +}
> +
> +static const TypeInfo virtio_pstore_pci_info = {
> +    .name          = TYPE_VIRTIO_PSTORE_PCI,
> +    .parent        = TYPE_VIRTIO_PCI,
> +    .instance_size = sizeof(VirtIOPstorePCI),
> +    .instance_init = virtio_pstore_pci_instance_init,
> +    .class_init    = virtio_pstore_pci_class_init,
> +};
> +
>  /* virtio-pci-bus */
>  
>  static void virtio_pci_bus_new(VirtioBusState *bus, size_t bus_size,
> @@ -2485,6 +2536,7 @@ static void virtio_pci_register_types(void)
>  #ifdef CONFIG_VHOST_SCSI
>      type_register_static(&vhost_scsi_pci_info);
>  #endif
> +    type_register_static(&virtio_pstore_pci_info);
>  }
>  
>  type_init(virtio_pci_register_types)
> diff --git a/hw/virtio/virtio-pci.h b/hw/virtio/virtio-pci.h
> index 25fbf8a..354b2b7 100644
> --- a/hw/virtio/virtio-pci.h
> +++ b/hw/virtio/virtio-pci.h
> @@ -31,6 +31,7 @@
>  #ifdef CONFIG_VHOST_SCSI
>  #include "hw/virtio/vhost-scsi.h"
>  #endif
> +#include "hw/virtio/virtio-pstore.h"
>  
>  typedef struct VirtIOPCIProxy VirtIOPCIProxy;
>  typedef struct VirtIOBlkPCI VirtIOBlkPCI;
> @@ -44,6 +45,7 @@ typedef struct VirtIOInputPCI VirtIOInputPCI;
>  typedef struct VirtIOInputHIDPCI VirtIOInputHIDPCI;
>  typedef struct VirtIOInputHostPCI VirtIOInputHostPCI;
>  typedef struct VirtIOGPUPCI VirtIOGPUPCI;
> +typedef struct VirtIOPstorePCI VirtIOPstorePCI;
>  
>  /* virtio-pci-bus */
>  
> @@ -324,6 +326,18 @@ struct VirtIOGPUPCI {
>      VirtIOGPU vdev;
>  };
>  
> +/*
> + * virtio-pstore-pci: This extends VirtioPCIProxy.
> + */
> +#define TYPE_VIRTIO_PSTORE_PCI "virtio-pstore-pci"
> +#define VIRTIO_PSTORE_PCI(obj) \
> +        OBJECT_CHECK(VirtIOPstorePCI, (obj), TYPE_VIRTIO_PSTORE_PCI)
> +
> +struct VirtIOPstorePCI {
> +    VirtIOPCIProxy parent_obj;
> +    VirtIOPstore vdev;
> +};
> +
>  /* Virtio ABI version, if we increment this, we break the guest driver. */
>  #define VIRTIO_PCI_ABI_VERSION          0
>  
> diff --git a/hw/virtio/virtio-pstore.c b/hw/virtio/virtio-pstore.c
> new file mode 100644
> index 0000000..b8fb4be
> --- /dev/null
> +++ b/hw/virtio/virtio-pstore.c
> @@ -0,0 +1,699 @@
> +/*
> + * Virtio Pstore Device
> + *
> + * Copyright (C) 2016  LG Electronics
> + *
> + * Authors:
> + *  Namhyung Kim  <namhyung@gmail.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + *
> + */
> +
> +#include <stdio.h>
> +
> +#include "qemu/osdep.h"
> +#include "qemu/iov.h"
> +#include "qemu-common.h"
> +#include "qemu/cutils.h"
> +#include "qemu/error-report.h"
> +#include "sysemu/kvm.h"
> +#include "qapi/visitor.h"
> +#include "qapi-event.h"
> +#include "io/channel-util.h"
> +#include "trace.h"
> +
> +#include "hw/virtio/virtio.h"
> +#include "hw/virtio/virtio-bus.h"
> +#include "hw/virtio/virtio-access.h"
> +#include "hw/virtio/virtio-pstore.h"
> +
> +#define PSTORE_DEFAULT_BUFSIZE   (16 * 1024)
> +#define PSTORE_DEFAULT_FILE_MAX  5
> +
> +/* the index should match to the type value */
> +static const char *virtio_pstore_file_prefix[] = {
> +    "unknown-",		/* VIRTIO_PSTORE_TYPE_UNKNOWN */

Is there value in treating everything unexpected as "unknown"
and rotating them as if they were logs?
It might be better to treat everything that's not known
as a guest error.


> +    "dmesg-",		/* VIRTIO_PSTORE_TYPE_DMESG */

Use named (designated) initializers for this instead of comments.
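
Something along these lines. This is a minimal standalone sketch, not
code from the patch: the PSTORE_TYPE_* names and pstore_prefix_for() are
made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the VIRTIO_PSTORE_TYPE_* values. */
enum {
    PSTORE_TYPE_UNKNOWN = 0,
    PSTORE_TYPE_DMESG   = 1,
    PSTORE_TYPE_MAX
};

/* Each entry names its own index, so the table cannot drift out of order. */
static const char *const pstore_file_prefix[PSTORE_TYPE_MAX] = {
    [PSTORE_TYPE_UNKNOWN] = "unknown-",
    [PSTORE_TYPE_DMESG]   = "dmesg-",
};

/* Return the prefix for a type, or NULL for a type the table does not know. */
static const char *pstore_prefix_for(unsigned int type)
{
    if (type >= PSTORE_TYPE_MAX) {
        return NULL;
    }
    /* An entry left out of the initializer would show up as NULL here. */
    assert(pstore_file_prefix[type] != NULL);
    return pstore_file_prefix[type];
}
```

Returning NULL for an out-of-range type also lets the caller report a
guest error instead of silently mapping it to "unknown-".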

> +};
> +
> +static char *virtio_pstore_to_filename(VirtIOPstore *s,
> +                                       struct virtio_pstore_req *req)
> +{
> +    const char *basename;
> +    unsigned long long id;
> +    unsigned int type = le16_to_cpu(req->type);
> +    unsigned int flags = le32_to_cpu(req->flags);
> +
> +    if (type < ARRAY_SIZE(virtio_pstore_file_prefix)) {
> +        basename = virtio_pstore_file_prefix[type];
> +    } else {
> +        basename = "unknown-";
> +    }
> +
> +    id = s->id++;
> +    return g_strdup_printf("%s/%s%llu%s", s->directory, basename, id,
> +                            flags & VIRTIO_PSTORE_FL_COMPRESSED ? ".enc.z" : "");
> +}
> +
> +static char *virtio_pstore_from_filename(VirtIOPstore *s, char *name,
> +                                         struct virtio_pstore_fileinfo *info)
> +{
> +    char *filename;
> +    unsigned int idx;
> +
> +    filename = g_strdup_printf("%s/%s", s->directory, name);
> +    if (filename == NULL)
> +        return NULL;
> +
> +    for (idx = 0; idx < ARRAY_SIZE(virtio_pstore_file_prefix); idx++) {
> +        if (g_str_has_prefix(name, virtio_pstore_file_prefix[idx])) {
> +            info->type = idx;
> +            name += strlen(virtio_pstore_file_prefix[idx]);
> +            break;
> +        }
> +    }
> +
> +    if (idx == ARRAY_SIZE(virtio_pstore_file_prefix)) {
> +        g_free(filename);
> +        return NULL;
> +    }
> +
> +    qemu_strtoull(name, NULL, 0, &info->id);

What if this fails?
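
A sketch of the kind of check I mean. It uses plain strtoull() rather
than qemu_strtoull() so that it stands alone, it parses base 10 only
(the patch passes base 0), and pstore_parse_id() is a made-up name:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse the decimal id at the start of `s` (e.g. "12.enc.z").
 * Returns 0 and stores the id on success, -1 if `s` does not start
 * with a digit or the value overflows.
 */
static int pstore_parse_id(const char *s, uint64_t *id)
{
    char *end;
    unsigned long long val;

    assert(s != NULL && id != NULL);

    /* strtoull() accepts leading whitespace and a sign; reject both. */
    if (s[0] < '0' || s[0] > '9') {
        return -1;
    }

    errno = 0;
    val = strtoull(s, &end, 10);
    if (errno == ERANGE || end == s) {
        return -1;
    }

    *id = val;
    return 0;
}
```

On failure the caller can skip the file instead of handing an
uninitialized id to the guest.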

> +
> +    info->flags = 0;
> +    if (g_str_has_suffix(name, ".enc.z")) {
> +        info->flags |= VIRTIO_PSTORE_FL_COMPRESSED;
> +    }
> +
> +    return filename;
> +}
> +
> +static int prefix_idx;
> +static int prefix_count;
> +static int prefix_len;
This does not work properly if there are multiple instances
of the device. Pls move everything into device state.
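
Note that scandir() callbacks take no context pointer, so one option (an
assumption on my side, not something the patch does) is to walk the
directory with readdir() and pass the state explicitly. A standalone
sketch of just the matching part, with a made-up struct pstore_scan:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-device scan state, replacing the file-scope statics. */
struct pstore_scan {
    const char *const *prefixes;  /* prefix table to match against */
    size_t first;                 /* index of the first prefix to try */
    size_t count;                 /* number of prefixes to try */
};

/* Return 1 if `name` starts with one of the selected prefixes, else 0. */
static int pstore_scan_match(const struct pstore_scan *scan, const char *name)
{
    size_t i;

    assert(scan != NULL && name != NULL);

    for (i = 0; i < scan->count; i++) {
        const char *prefix = scan->prefixes[scan->first + i];

        if (strncmp(name, prefix, strlen(prefix)) == 0) {
            return 1;
        }
    }
    return 0;
}
```

With the state passed in, two devices scanning at the same time no
longer overwrite each other's prefix_idx and prefix_count.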

> +
> +static int filter_pstore(const struct dirent *de)
> +{
> +    int i;
> +
> +    for (i = 0; i < prefix_count; i++) {
> +        const char *prefix = virtio_pstore_file_prefix[prefix_idx + i];
> +
> +        if (g_str_has_prefix(de->d_name, prefix)) {
> +            return 1;
> +        }
> +    }
> +    return 0;
> +}
> +
> +static int sort_pstore(const struct dirent **a, const struct dirent **b)
> +{
> +    uint64_t id_a, id_b;
> +
> +    qemu_strtoull((*a)->d_name + prefix_len, NULL, 0, &id_a);
> +    qemu_strtoull((*b)->d_name + prefix_len, NULL, 0, &id_b);
> +
> +    return id_a - id_b;
> +}
> +
> +static int rotate_pstore_file(VirtIOPstore *s, unsigned short type)
> +{
> +    int ret = 0;
> +    int i, num;
> +    char *filename;
> +    struct dirent **files;
> +
> +    if (type >= ARRAY_SIZE(virtio_pstore_file_prefix)) {
> +        type = VIRTIO_PSTORE_TYPE_UNKNOWN;
> +    }
> +
> +    prefix_idx = type;
> +    prefix_len = strlen(virtio_pstore_file_prefix[type]);
> +    prefix_count = 1;  /* only scan current type */
> +
> +    /* delete the oldest file in the same type */
> +    num = scandir(s->directory, &files, filter_pstore, sort_pstore);
> +    if (num < 0)
> +        return num;
> +    if (num < (int)s->file_max)
> +        goto out;
> +
> +    filename = g_strdup_printf("%s/%s", s->directory, files[0]->d_name);
> +    if (filename == NULL) {
> +        ret = -1;
> +        goto out;
> +    }
> +
> +    ret = unlink(filename);
> +
> +out:
> +    for (i = 0; i < num; i++) {
> +        g_free(files[i]);
> +    }
> +    g_free(files);
> +
> +    return ret;
> +}

Pls prefix everything with virtio_pstore or another
unique prefix. also below.

> +
> +static ssize_t virtio_pstore_do_open(VirtIOPstore *s)
> +{
> +    /* scan all pstore files */
> +    prefix_idx = 0;
> +    prefix_count = ARRAY_SIZE(virtio_pstore_file_prefix);
> +
> +    s->file_idx = 0;
> +    s->num_file = scandir(s->directory, &s->files, filter_pstore, alphasort);
> +
> +    return s->num_file >= 0 ? 0 : -1;
> +}
> +
> +static ssize_t virtio_pstore_do_close(VirtIOPstore *s)
> +{
> +    int i;
> +
> +    for (i = 0; i < s->num_file; i++) {
> +        g_free(s->files[i]);
> +    }
> +    g_free(s->files);
> +    s->files = NULL;
> +
> +    s->num_file = 0;
> +    return 0;
> +}
> +
> +static ssize_t virtio_pstore_do_erase(VirtIOPstore *s,
> +                                      struct virtio_pstore_req *req)
> +{
> +    char *filename;
> +    int ret;
> +
> +    filename = virtio_pstore_to_filename(s, req);
> +    if (filename == NULL)
> +        return -1;

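This can't happen: g_strdup_printf() aborts on allocation failure
instead of returning NULL.

Also, this is a coding style violation (no braces around the if body).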

> +
> +    ret = unlink(filename);
> +
> +    g_free(filename);
> +    return ret;
> +}
> +
> +struct pstore_read_arg {
> +    VirtIOPstore *vps;
> +    VirtQueueElement *elem;
> +    struct virtio_pstore_fileinfo info;
> +    QIOChannel *ioc;
> +};
> +
> +static gboolean pstore_async_read_fn(QIOChannel *ioc, GIOCondition condition,
> +                                     gpointer data)
> +{
> +    struct pstore_read_arg *rarg = data;
> +    struct virtio_pstore_fileinfo *info = &rarg->info;
> +    VirtIOPstore *vps = rarg->vps;
> +    VirtQueueElement *elem = rarg->elem;
> +    struct virtio_pstore_res res;
> +    size_t offset = sizeof(res) + sizeof(*info);
> +    struct iovec *sg = elem->in_sg;
> +    unsigned int sg_num = elem->in_num;
> +    Error *err = NULL;
> +    ssize_t len;
> +    int ret;
> +
> +    /* skip res and fileinfo */
> +    iov_discard_front(&sg, &sg_num, sizeof(res) + sizeof(*info));
> +
> +    len = qio_channel_readv(rarg->ioc, sg, sg_num, &err);
> +    if (len < 0) {
> +        if (errno == EAGAIN) {
> +            len = 0;
> +        }
> +        ret = -1;
> +    } else {
> +        info->len = cpu_to_le32(len);
> +        ret = 0;
> +    }
> +
> +    res.cmd  = cpu_to_le16(VIRTIO_PSTORE_CMD_READ);
> +    res.type = cpu_to_le16(VIRTIO_PSTORE_TYPE_UNKNOWN);
> +    res.ret  = cpu_to_le32(ret);
> +
> +    /* now copy res and fileinfo */
> +    iov_from_buf(elem->in_sg, elem->in_num, 0, &res, sizeof(res));
> +    iov_from_buf(elem->in_sg, elem->in_num, sizeof(res), info, sizeof(*info));
> +
> +    len += offset;
> +    virtqueue_push(vps->rvq, elem, len);
> +    virtio_notify(VIRTIO_DEVICE(vps), vps->rvq);
> +
> +    return G_SOURCE_REMOVE;
> +}
> +
> +static void free_rarg_fn(gpointer data)
> +{
> +    struct pstore_read_arg *rarg = data;
> +
> +    qio_channel_close(rarg->ioc, NULL);
> +
> +    g_free(rarg->elem);
> +    g_free(rarg);
> +}
> +
> +static ssize_t virtio_pstore_do_read(VirtIOPstore *s, VirtQueueElement *elem)
> +{
> +    char *filename = NULL;
> +    int fd, idx;
> +    struct stat stbuf;
> +    struct pstore_read_arg *rarg = NULL;
> +    Error *err = NULL;
> +    int ret = -1;
> +
> +    if (s->file_idx >= s->num_file) {
> +        return 0;
> +    }
> +
> +    rarg = g_malloc(sizeof(*rarg));
> +    if (rarg == NULL) {
> +        return -1;
> +    }
> +
> +    idx = s->file_idx++;
> +    filename = virtio_pstore_from_filename(s, s->files[idx]->d_name,
> +                                           &rarg->info);
> +    if (filename == NULL) {
> +        goto out;
> +    }
> +
> +    fd = open(filename, O_RDONLY);
> +    if (fd < 0) {
> +        error_report("cannot open %s", filename);
> +        goto out;
> +    }

I see open here but close nowhere. Does this leak fds?

> +
> +    if (fstat(fd, &stbuf) < 0) {

So we can stat, but can we e.g. read?


> +        goto out;
> +    }
> +
> +    rarg->vps            = s;
> +    rarg->elem           = elem;
> +    rarg->info.id        = cpu_to_le64(rarg->info.id);
> +    rarg->info.type      = cpu_to_le16(rarg->info.type);
> +    rarg->info.flags     = cpu_to_le32(rarg->info.flags);
> +    rarg->info.time_sec  = cpu_to_le64(stbuf.st_ctim.tv_sec);

Is this seconds since epoch?
Why ctim specifically?
Pls add comments.

> +    rarg->info.time_nsec = cpu_to_le32(stbuf.st_ctim.tv_nsec);

Not all hosts support nanosecond precision.
Do we need some way to tell guest what's reliable?

Unless you limit this to Linux hosts, you should care about things
like this (from man fstat):

    Since kernel 2.5.48, the stat structure supports nanosecond
    resolution for the three file timestamp fields.  The nanosecond
    components of each timestamp are available via names of the form
    st_atim.tv_nsec if the _BSD_SOURCE or _SVID_SOURCE feature test macro
    is defined.  Nanosecond timestamps are nowadays standardized,
    starting with POSIX.1-2008, and, starting with version 2.12, glibc also
    exposes the nanosecond component names if _POSIX_C_SOURCE is defined
    with the value 200809L or greater, or _XOPEN_SOURCE is defined with
    the value 700 or greater.  If none of the aforementioned macros are
    defined, then the nanosecond values are exposed with names of the form
    st_atimensec.




> +
> +    rarg->ioc = qio_channel_new_fd(fd, &err);
> +    if (err) {
> +        error_reportf_err(err, "cannot create io channel: ");
> +        goto out;
> +    }
> +
> +    qio_channel_set_blocking(rarg->ioc, false, &err);
> +    qio_channel_add_watch(rarg->ioc, G_IO_IN, pstore_async_read_fn, rarg,
> +                          free_rarg_fn);
> +    g_free(filename);
> +    return 1;
> +
> +out:
> +    g_free(filename);
> +    g_free(rarg);
> +
> +    return ret;
> +}
> +
> +struct pstore_write_arg {
> +    VirtIOPstore *vps;
> +    VirtQueueElement *elem;
> +    struct virtio_pstore_req *req;
> +    QIOChannel *ioc;
> +};
> +
> +static gboolean pstore_async_write_fn(QIOChannel *ioc, GIOCondition condition,
> +                                      gpointer data)
> +{
> +    struct pstore_write_arg *warg = data;
> +    VirtIOPstore *vps = warg->vps;
> +    VirtQueueElement *elem = warg->elem;
> +    struct iovec *sg = elem->out_sg;
> +    unsigned int sg_num = elem->out_num;
> +    struct virtio_pstore_res res;
> +    Error *err = NULL;
> +    ssize_t len;
> +    int ret;
> +
> +    /* we already consumed the req */
> +    iov_discard_front(&sg, &sg_num, sizeof(*warg->req));
> +
> +    len = qio_channel_writev(warg->ioc, sg, sg_num, &err);
> +    if (len < 0) {
> +        ret = -1;
> +    } else {
> +        ret = 0;
> +    }

A short write is treated as success here, so part of the data
can be silently discarded. Don't we care?
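
For reference, the usual loop shape looks like this. It is a standalone
sketch on a blocking fd with plain write(), and pstore_write_full() is a
made-up name. With the non-blocking QIOChannel used in the patch, the
remaining bytes would instead have to be carried over to the next
callback.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Write all `len` bytes of `buf` to a blocking fd, retrying after short
 * writes and EINTR. Returns 0 once everything is written, -1 on error.
 */
static int pstore_write_full(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    assert(fd >= 0 && (buf != NULL || len == 0));

    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n < 0) {
            if (errno == EINTR) {
                continue;   /* interrupted before writing anything */
            }
            return -1;
        }
        /* Advance past what was written; a short write just loops again. */
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```

Either way, the guest should only see success once the whole record is
on disk.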

> +
> +    res.cmd  = cpu_to_le16(VIRTIO_PSTORE_CMD_WRITE);
> +    res.type = warg->req->type;
> +    res.ret  = cpu_to_le32(ret);
> +
> +    /* tell the result to guest */
> +    iov_from_buf(elem->in_sg, elem->in_num, 0, &res, sizeof(res));
> +
> +    virtqueue_push(vps->wvq, elem, sizeof(res));
> +    virtio_notify(VIRTIO_DEVICE(vps), vps->wvq);
> +
> +    return G_SOURCE_REMOVE;
> +}
> +
> +static void free_warg_fn(gpointer data)
> +{
> +    struct pstore_write_arg *warg = data;
> +
> +    qio_channel_close(warg->ioc, NULL);
> +
> +    g_free(warg->elem);
> +    g_free(warg);
> +}
> +
> +static ssize_t virtio_pstore_do_write(VirtIOPstore *s, VirtQueueElement *elem,
> +                                      struct virtio_pstore_req *req)
> +{
> +    unsigned short type = le16_to_cpu(req->type);
> +    char *filename = NULL;
> +    int fd;
> +    int flags = O_WRONLY | O_CREAT | O_TRUNC;
> +    struct pstore_write_arg *warg = NULL;
> +    Error *err = NULL;
> +    int ret = -1;
> +
> +    /* do not keep same type of files more than 'file-max' */
> +    rotate_pstore_file(s, type);

If you don't care about failures, should this function
return a value? How about reporting it to the user?


> +
> +    filename = virtio_pstore_to_filename(s, req);
> +    if (filename == NULL) {
> +        return -1;
> +    }

this can't happen

> +
> +    warg = g_malloc(sizeof(*warg));
> +    if (warg == NULL) {
> +        goto out;
> +    }
> +
> +    fd = open(filename, flags, 0644);
> +    if (fd < 0) {
> +        error_report("cannot open %s", filename);
> +        ret = fd;
> +        goto out;
> +    }
> +
> +    warg->vps            = s;
> +    warg->elem           = elem;
> +    warg->req            = req;
> +
> +    warg->ioc = qio_channel_new_fd(fd, &err);
> +    if (err) {
> +        error_reportf_err(err, "cannot create io channel: ");
> +        goto out;
> +    }
> +
> +    qio_channel_set_blocking(warg->ioc, false, &err);
> +    qio_channel_add_watch(warg->ioc, G_IO_OUT, pstore_async_write_fn, warg,
> +                          free_warg_fn);
> +    g_free(filename);
> +    return 1;
> +
> +out:
> +    g_free(filename);
> +    g_free(warg);
> +    return ret;
> +}
> +
> +static void virtio_pstore_handle_io(VirtIODevice *vdev, VirtQueue *vq)
> +{
> +    VirtIOPstore *s = VIRTIO_PSTORE(vdev);
> +    VirtQueueElement *elem;
> +    struct virtio_pstore_req req;
> +    struct virtio_pstore_res res;
> +    ssize_t len = 0;
> +    int ret;
> +
> +    for (;;) {
> +        elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
> +        if (!elem) {
> +            return;
> +        }
> +
> +        if (elem->out_num < 1 || elem->in_num < 1) {
> +            error_report("request or response buffer is missing");
> +            exit(1);
> +        }
> +
> +        if (elem->out_num > 2 || elem->in_num > 3) {
> +            error_report("invalid number of input/output buffer");
> +            exit(1);
> +        }
> +
> +        len = iov_to_buf(elem->out_sg, elem->out_num, 0, &req, sizeof(req));
> +        if (len != (ssize_t)sizeof(req)) {
> +            error_report("invalid request size: %ld", (long)len);
> +            exit(1);
> +        }
> +        res.cmd  = req.cmd;
> +        res.type = req.type;
> +
> +        switch (le16_to_cpu(req.cmd)) {
> +        case VIRTIO_PSTORE_CMD_OPEN:
> +            ret = virtio_pstore_do_open(s);
> +            break;
> +        case VIRTIO_PSTORE_CMD_CLOSE:
> +            ret = virtio_pstore_do_close(s);
> +            break;
> +        case VIRTIO_PSTORE_CMD_ERASE:
> +            ret = virtio_pstore_do_erase(s, &req);
> +            break;
> +        case VIRTIO_PSTORE_CMD_READ:
> +            ret = virtio_pstore_do_read(s, elem);
> +            if (ret == 1) {
> +                /* async channel io */
> +                continue;
> +            }
> +            break;
> +        case VIRTIO_PSTORE_CMD_WRITE:
> +            ret = virtio_pstore_do_write(s, elem, &req);
> +            if (ret == 1) {
> +                /* async channel io */
> +                continue;
> +            }
> +            break;
> +        default:
> +            ret = -1;
> +            break;
> +        }
> +
> +        res.ret = ret;
> +
> +        iov_from_buf(elem->in_sg, elem->in_num, 0, &res, sizeof(res));
> +        virtqueue_push(vq, elem, sizeof(res) + len);
> +
> +        virtio_notify(vdev, vq);
> +        g_free(elem);
> +
> +        if (ret < 0) {
> +            return;

what does this do?

> +        }
> +    }
> +}
> +
> +static void virtio_pstore_device_realize(DeviceState *dev, Error **errp)
> +{
> +    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> +    VirtIOPstore *s = VIRTIO_PSTORE(dev);
> +
> +    virtio_init(vdev, "virtio-pstore", VIRTIO_ID_PSTORE,
> +                sizeof(struct virtio_pstore_config));
> +
> +    s->id = 1;
> +
> +    if (!s->bufsize)
> +        s->bufsize = PSTORE_DEFAULT_BUFSIZE;
> +    if (!s->file_max)
> +        s->file_max = PSTORE_DEFAULT_FILE_MAX;
> +
> +    s->rvq = virtio_add_queue(vdev, 128, virtio_pstore_handle_io);
> +    s->wvq = virtio_add_queue(vdev, 128, virtio_pstore_handle_io);
> +}
> +
> +static void virtio_pstore_device_unrealize(DeviceState *dev, Error **errp)
> +{
> +    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> +
> +    virtio_cleanup(vdev);
> +}
> +
> +static void virtio_pstore_get_config(VirtIODevice *vdev, uint8_t *config_data)
> +{
> +    VirtIOPstore *dev = VIRTIO_PSTORE(vdev);
> +    struct virtio_pstore_config config;

Add = {} here: if more fields are added later, you want them all
initialized, to avoid leaking uninitialized host memory to the guest.
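
A standalone sketch of the pattern. The struct layout and the `reserved`
field are hypothetical, and the endian conversion is left out. It uses
{ 0 } because empty braces are a GNU extension before C23:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical config layout; `reserved` stands in for a later addition. */
struct pstore_config {
    uint32_t bufsize;
    uint32_t reserved;
};

/* Fill `out` (at least sizeof(struct pstore_config) bytes) for the guest. */
static void pstore_fill_config(uint8_t *out, uint32_t bufsize)
{
    /* Zero-initialize so fields nobody assigns are 0, not stack garbage. */
    struct pstore_config config = { 0 };

    assert(out != NULL);

    config.bufsize = bufsize;
    memcpy(out, &config, sizeof(config));
}
```

Nobody assigns `reserved` here, yet the guest still reads zeroes for it.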

> +
> +    config.bufsize = cpu_to_le32(dev->bufsize);
> +
> +    memcpy(config_data, &config, sizeof(struct virtio_pstore_config));
> +}
> +
> +static void virtio_pstore_set_config(VirtIODevice *vdev,
> +                                     const uint8_t *config_data)
> +{
> +    VirtIOPstore *dev = VIRTIO_PSTORE(vdev);
> +    struct virtio_pstore_config config;
> +
> +    memcpy(&config, config_data, sizeof(struct virtio_pstore_config));
> +
> +    dev->bufsize = le32_to_cpu(config.bufsize);
> +}
> +
> +static uint64_t get_features(VirtIODevice *vdev, uint64_t f, Error **errp)
> +{
> +    return f;
> +}
> +
> +static void pstore_get_directory(Object *obj, Visitor *v,
> +                                 const char *name, void *opaque,
> +                                 Error **errp)
> +{
> +    VirtIOPstore *s = opaque;
> +
> +    visit_type_str(v, name, &s->directory, errp);
> +}
> +
> +static void pstore_set_directory(Object *obj, Visitor *v,
> +                                 const char *name, void *opaque,
> +                                 Error **errp)
> +{
> +    VirtIOPstore *s = opaque;
> +    Error *local_err = NULL;
> +    char *value;
> +
> +    visit_type_str(v, name, &value, &local_err);
> +    if (local_err) {
> +        error_propagate(errp, local_err);
> +        return;
> +    }
> +
> +    g_free(s->directory);
> +    s->directory = value;
> +}
> +
> +static void pstore_release_directory(Object *obj, const char *name,
> +                                     void *opaque)
> +{
> +    VirtIOPstore *s = opaque;
> +
> +    g_free(s->directory);
> +    s->directory = NULL;
> +}
> +
> +static void pstore_get_bufsize(Object *obj, Visitor *v,
> +                               const char *name, void *opaque,
> +                               Error **errp)
> +{
> +    VirtIOPstore *s = opaque;
> +    uint64_t value = s->bufsize;
> +
> +    visit_type_size(v, name, &value, errp);
> +}
> +
> +static void pstore_set_bufsize(Object *obj, Visitor *v,
> +                               const char *name, void *opaque,
> +                               Error **errp)
> +{
> +    VirtIOPstore *s = opaque;
> +    Error *error = NULL;
> +    uint64_t value;
> +
> +    visit_type_size(v, name, &value, &error);
> +    if (error) {
> +        error_propagate(errp, error);
> +        return;
> +    }
> +
> +    if (value < 4096) {
> +        error_setg(&error, "Warning: too small buffer size: %"PRIu64, value);
> +        error_propagate(errp, error);
> +        return;
> +    }
> +
> +    s->bufsize = value;
> +}
> +
> +static void pstore_get_file_max(Object *obj, Visitor *v,
> +                                const char *name, void *opaque,
> +                                Error **errp)
> +{
> +    VirtIOPstore *s = opaque;
> +    int64_t value = s->file_max;
> +
> +    visit_type_int(v, name, &value, errp);
> +}
> +
> +static void pstore_set_file_max(Object *obj, Visitor *v,
> +                                const char *name, void *opaque,
> +                                Error **errp)
> +{
> +    VirtIOPstore *s = opaque;
> +    Error *error = NULL;
> +    int64_t value;
> +
> +    visit_type_int(v, name, &value, &error);
> +    if (error) {
> +        error_propagate(errp, error);
> +        return;
> +    }
> +
> +    s->file_max = value;
> +}

Do you need dynamic properties? There are easier ways
to define an int property. Same for others.

> +
> +static Property virtio_pstore_properties[] = {
> +    DEFINE_PROP_END_OF_LIST(),
> +};
> +
> +static void virtio_pstore_instance_init(Object *obj)
> +{
> +    VirtIOPstore *s = VIRTIO_PSTORE(obj);
> +
> +    object_property_add(obj, "directory", "str",
> +                        pstore_get_directory, pstore_set_directory,
> +                        pstore_release_directory, s, NULL);
> +    object_property_add(obj, "bufsize", "size",
> +                        pstore_get_bufsize, pstore_set_bufsize, NULL, s, NULL);
> +    object_property_add(obj, "file-max", "int",
> +                        pstore_get_file_max, pstore_set_file_max, NULL, s, NULL);
> +}
> +
> +static void virtio_pstore_class_init(ObjectClass *klass, void *data)
> +{
> +    DeviceClass *dc = DEVICE_CLASS(klass);
> +    VirtioDeviceClass *vdc = VIRTIO_DEVICE_CLASS(klass);
> +
> +    dc->props = virtio_pstore_properties;
> +    set_bit(DEVICE_CATEGORY_MISC, dc->categories);
> +    vdc->realize = virtio_pstore_device_realize;
> +    vdc->unrealize = virtio_pstore_device_unrealize;
> +    vdc->get_config = virtio_pstore_get_config;
> +    vdc->set_config = virtio_pstore_set_config;
> +    vdc->get_features = get_features;
> +}
> +
> +static const TypeInfo virtio_pstore_info = {
> +    .name = TYPE_VIRTIO_PSTORE,
> +    .parent = TYPE_VIRTIO_DEVICE,
> +    .instance_size = sizeof(VirtIOPstore),
> +    .instance_init = virtio_pstore_instance_init,
> +    .class_init = virtio_pstore_class_init,
> +};
> +
> +static void virtio_register_types(void)
> +{
> +    type_register_static(&virtio_pstore_info);
> +}
> +
> +type_init(virtio_register_types)
> diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
> index 929ec2f..b31774a 100644
> --- a/include/hw/pci/pci.h
> +++ b/include/hw/pci/pci.h
> @@ -79,6 +79,7 @@
>  #define PCI_DEVICE_ID_VIRTIO_SCSI        0x1004
>  #define PCI_DEVICE_ID_VIRTIO_RNG         0x1005
>  #define PCI_DEVICE_ID_VIRTIO_9P          0x1009
> +#define PCI_DEVICE_ID_VIRTIO_PSTORE      0x100a
>  
>  #define PCI_VENDOR_ID_REDHAT             0x1b36
>  #define PCI_DEVICE_ID_REDHAT_BRIDGE      0x0001
> diff --git a/include/hw/virtio/virtio-pstore.h b/include/hw/virtio/virtio-pstore.h
> new file mode 100644
> index 0000000..85b1828
> --- /dev/null
> +++ b/include/hw/virtio/virtio-pstore.h
> @@ -0,0 +1,36 @@
> +/*
> + * Virtio Pstore Support
> + *
> + * Authors:
> + *  Namhyung Kim      <namhyung@gmail.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2.  See
> + * the COPYING file in the top-level directory.
> + *
> + */
> +
> +#ifndef _QEMU_VIRTIO_PSTORE_H
> +#define _QEMU_VIRTIO_PSTORE_H
> +
> +#include "standard-headers/linux/virtio_pstore.h"
> +#include "hw/virtio/virtio.h"
> +#include "hw/pci/pci.h"
> +
> +#define TYPE_VIRTIO_PSTORE "virtio-pstore-device"
> +#define VIRTIO_PSTORE(obj) \
> +        OBJECT_CHECK(VirtIOPstore, (obj), TYPE_VIRTIO_PSTORE)
> +
> +typedef struct VirtIOPstore {
> +    VirtIODevice    parent_obj;
> +    VirtQueue      *rvq;
> +    VirtQueue      *wvq;
> +    char           *directory;
> +    int             file_idx;
> +    int             num_file;
> +    struct dirent **files;
> +    uint64_t        id;
> +    uint64_t        bufsize;
> +    uint64_t        file_max;
> +} VirtIOPstore;
> +
> +#endif
> diff --git a/include/standard-headers/linux/virtio_ids.h b/include/standard-headers/linux/virtio_ids.h
> index 77925f5..c72a9ab 100644
> --- a/include/standard-headers/linux/virtio_ids.h
> +++ b/include/standard-headers/linux/virtio_ids.h
> @@ -41,5 +41,6 @@
>  #define VIRTIO_ID_CAIF	       12 /* Virtio caif */
>  #define VIRTIO_ID_GPU          16 /* virtio GPU */
>  #define VIRTIO_ID_INPUT        18 /* virtio input */
> +#define VIRTIO_ID_PSTORE       22 /* virtio pstore */
>  
>  #endif /* _LINUX_VIRTIO_IDS_H */
> diff --git a/include/standard-headers/linux/virtio_pstore.h b/include/standard-headers/linux/virtio_pstore.h
> new file mode 100644
> index 0000000..2f91839
> --- /dev/null
> +++ b/include/standard-headers/linux/virtio_pstore.h
> @@ -0,0 +1,76 @@
> +#ifndef _LINUX_VIRTIO_PSTORE_H
> +#define _LINUX_VIRTIO_PSTORE_H
> +/* This header is BSD licensed so anyone can use the definitions to implement
> + * compatible drivers/servers.
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions
> + * are met:
> + * 1. Redistributions of source code must retain the above copyright
> + *    notice, this list of conditions and the following disclaimer.
> + * 2. Redistributions in binary form must reproduce the above copyright
> + *    notice, this list of conditions and the following disclaimer in the
> + *    documentation and/or other materials provided with the distribution.
> + * 3. Neither the name of IBM nor the names of its contributors
> + *    may be used to endorse or promote products derived from this software
> + *    without specific prior written permission.
> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS ``AS IS'' AND
> + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
> + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
> + * ARE DISCLAIMED.  IN NO EVENT SHALL IBM OR CONTRIBUTORS BE LIABLE
> + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
> + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
> + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
> + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
> + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
> + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
> + * SUCH DAMAGE. */
> +#include "standard-headers/linux/types.h"
> +#include "standard-headers/linux/virtio_types.h"
> +#include "standard-headers/linux/virtio_ids.h"
> +#include "standard-headers/linux/virtio_config.h"
> +
> +#define VIRTIO_PSTORE_CMD_NULL   0
> +#define VIRTIO_PSTORE_CMD_OPEN   1
> +#define VIRTIO_PSTORE_CMD_READ   2
> +#define VIRTIO_PSTORE_CMD_WRITE  3
> +#define VIRTIO_PSTORE_CMD_ERASE  4
> +#define VIRTIO_PSTORE_CMD_CLOSE  5
> +
> +#define VIRTIO_PSTORE_TYPE_UNKNOWN  0
> +#define VIRTIO_PSTORE_TYPE_DMESG    1
> +
> +#define VIRTIO_PSTORE_FL_COMPRESSED  1
> +
> +struct virtio_pstore_req {
> +    __virtio16 cmd;
> +    __virtio16 type;
> +    __virtio32 flags;
> +    __virtio64 id;
> +    __virtio32 count;
> +    __virtio32 reserved;
> +};
> +
> +struct virtio_pstore_res {
> +    __virtio16 cmd;
> +    __virtio16 type;
> +    __virtio32 ret;
> +};
> +
> +struct virtio_pstore_fileinfo {
> +    __virtio64 id;
> +    __virtio32 count;
> +    __virtio16 type;
> +    __virtio16 unused;
> +    __virtio32 flags;
> +    __virtio32 len;
> +    __virtio64 time_sec;
> +    __virtio32 time_nsec;
> +    __virtio32 reserved;
> +};
> +
> +struct virtio_pstore_config {
> +    __virtio32 bufsize;
> +};
> +
> +#endif /* _LINUX_VIRTIO_PSTORE_H */
> diff --git a/qdev-monitor.c b/qdev-monitor.c
> index e19617f..e1df5a9 100644
> --- a/qdev-monitor.c
> +++ b/qdev-monitor.c
> @@ -73,6 +73,7 @@ static const QDevAlias qdev_alias_table[] = {
>      { "virtio-serial-pci", "virtio-serial", QEMU_ARCH_ALL & ~QEMU_ARCH_S390X },
>      { "virtio-tablet-ccw", "virtio-tablet", QEMU_ARCH_S390X },
>      { "virtio-tablet-pci", "virtio-tablet", QEMU_ARCH_ALL & ~QEMU_ARCH_S390X },
> +    { "virtio-pstore-pci", "virtio-pstore" },
>      { }
>  };
>  
> -- 
> 2.9.3


* Re: [PATCH 1/3] virtio: Basic implementation of virtio pstore driver
  2016-09-13 15:19   ` Michael S. Tsirkin
@ 2016-09-16  9:05     ` Namhyung Kim
  0 siblings, 0 replies; 31+ messages in thread
From: Namhyung Kim @ 2016-09-16  9:05 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev, kvm, qemu-devel, virtualization, LKML, Paolo Bonzini,
	Radim Krčmář,
	Anthony Liguori, Anton Vorontsov, Colin Cross, Kees Cook,
	Tony Luck, Steven Rostedt, Ingo Molnar, Minchan Kim

Hello Michael,

Thanks for your detailed review.  Btw are you ok with the overall
direction of the patch?


On Tue, Sep 13, 2016 at 06:19:41PM +0300, Michael S. Tsirkin wrote:
> On Sat, Aug 20, 2016 at 05:07:42PM +0900, Namhyung Kim wrote:
> > The virtio pstore driver provides interface to the pstore subsystem so
> > that the guest kernel's log/dump message can be saved on the host
> > machine.  Users can access the log file directly on the host, or on the
> > guest at the next boot using pstore filesystem.  It currently deals with
> > kernel log (printk) buffer only, but we can extend it to have other
> > information (like ftrace dump) later.
> > 
> > It supports legacy PCI device using single order-2 page buffer.  It uses
> > two virtqueues - one for (sync) read and another for (async) write.
> > Since it cannot wait for write finished, it supports up to 128
> > concurrent IO.  The buffer size is configurable now.
> > 
> > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > Cc: Radim Krčmář <rkrcmar@redhat.com>
> > Cc: "Michael S. Tsirkin" <mst@redhat.com>
> > Cc: Anthony Liguori <aliguori@amazon.com>
> > Cc: Anton Vorontsov <anton@enomsg.org>
> > Cc: Colin Cross <ccross@android.com>
> > Cc: Kees Cook <keescook@chromium.org>
> > Cc: Tony Luck <tony.luck@intel.com>
> > Cc: Steven Rostedt <rostedt@goodmis.org>
> > Cc: Ingo Molnar <mingo@kernel.org>
> > Cc: Minchan Kim <minchan@kernel.org>
> > Cc: kvm@vger.kernel.org
> > Cc: qemu-devel@nongnu.org
> > Cc: virtualization@lists.linux-foundation.org
> > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> > ---
> >  drivers/virtio/Kconfig             |  10 +
> >  drivers/virtio/Makefile            |   1 +
> >  drivers/virtio/virtio_pstore.c     | 417 +++++++++++++++++++++++++++++++++++++
> >  include/uapi/linux/Kbuild          |   1 +
> >  include/uapi/linux/virtio_ids.h    |   1 +
> >  include/uapi/linux/virtio_pstore.h |  74 +++++++
> >  6 files changed, 504 insertions(+)
> >  create mode 100644 drivers/virtio/virtio_pstore.c
> >  create mode 100644 include/uapi/linux/virtio_pstore.h
> > 
> > diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
> > index 77590320d44c..8f0e6c796c12 100644
> > --- a/drivers/virtio/Kconfig
> > +++ b/drivers/virtio/Kconfig
> > @@ -58,6 +58,16 @@ config VIRTIO_INPUT
> >  
> >  	 If unsure, say M.
> >  
> > +config VIRTIO_PSTORE
> > +	tristate "Virtio pstore driver"
> > +	depends on VIRTIO
> > +	depends on PSTORE
> > +	---help---
> > +	 This driver supports virtio pstore devices to save/restore
> > +	 panic and oops messages on the host.
> > +
> > +	 If unsure, say M.
> > +
> >   config VIRTIO_MMIO
> >  	tristate "Platform bus driver for memory mapped virtio devices"
> >  	depends on HAS_IOMEM && HAS_DMA
> > diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile
> > index 41e30e3dc842..bee68cb26d48 100644
> > --- a/drivers/virtio/Makefile
> > +++ b/drivers/virtio/Makefile
> > @@ -5,3 +5,4 @@ virtio_pci-y := virtio_pci_modern.o virtio_pci_common.o
> >  virtio_pci-$(CONFIG_VIRTIO_PCI_LEGACY) += virtio_pci_legacy.o
> >  obj-$(CONFIG_VIRTIO_BALLOON) += virtio_balloon.o
> >  obj-$(CONFIG_VIRTIO_INPUT) += virtio_input.o
> > +obj-$(CONFIG_VIRTIO_PSTORE) += virtio_pstore.o
> > diff --git a/drivers/virtio/virtio_pstore.c b/drivers/virtio/virtio_pstore.c
> > new file mode 100644
> > index 000000000000..0a63c7db4278
> > --- /dev/null
> > +++ b/drivers/virtio/virtio_pstore.c
> > @@ -0,0 +1,417 @@
> > +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> > +
> > +#include <linux/kernel.h>
> > +#include <linux/module.h>
> > +#include <linux/pstore.h>
> > +#include <linux/virtio.h>
> > +#include <linux/virtio_config.h>
> > +#include <uapi/linux/virtio_ids.h>
> > +#include <uapi/linux/virtio_pstore.h>
> > +
> > +#define VIRT_PSTORE_ORDER    2
> > +#define VIRT_PSTORE_BUFSIZE  (4096 << VIRT_PSTORE_ORDER)
> > +#define VIRT_PSTORE_NR_REQ   128
> 
> where are these numbers from?

The buffer size was chosen to be larger than the default kmsg_bytes
(10240) in the pstore platform code, and the request count is an
arbitrary value.
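
As a rough illustration (not part of the patch), the arithmetic can be
checked with a small standalone sketch.  It assumes 4096-byte pages as
in the define above; DEFAULT_KMSG_BYTES and virt_pstore_headroom() are
names made up for this example, standing in for the 10240 default
mentioned here.

```c
#include <assert.h>
#include <stddef.h>

#define VIRT_PSTORE_ORDER    2
#define VIRT_PSTORE_BUFSIZE  (4096 << VIRT_PSTORE_ORDER)  /* 16384 bytes */
#define VIRT_PSTORE_NR_REQ   128                          /* arbitrary */

/* Hypothetical name for the pstore default quoted above. */
#define DEFAULT_KMSG_BYTES   10240

/* The buffer has to hold at least one default-sized kmsg dump. */
static_assert(VIRT_PSTORE_BUFSIZE > DEFAULT_KMSG_BYTES,
              "pstore buffer smaller than default kmsg_bytes");

/* Hypothetical helper: room left after a default-sized dump. */
static size_t virt_pstore_headroom(void)
{
	return VIRT_PSTORE_BUFSIZE - DEFAULT_KMSG_BYTES;
}
```

With order 2 the buffer is 16K, which leaves 6K of headroom over the
default dump size.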

> 
> 
> > +
> > +struct virtio_pstore {
> > +	struct virtio_device	*vdev;
> > +	struct virtqueue	*vq[2];
> > +	struct pstore_info	 pstore;
> > +	struct virtio_pstore_req req[VIRT_PSTORE_NR_REQ];
> > +	struct virtio_pstore_res res[VIRT_PSTORE_NR_REQ];
> > +	unsigned int		 req_id;
> > +
> > +	/* Waiting for host to ack */
> > +	wait_queue_head_t	acked;
> > +	int			failed;
> > +};
> > +
> > +#define TYPE_TABLE_ENTRY(_entry)				\
> > +	{ PSTORE_TYPE_##_entry, VIRTIO_PSTORE_TYPE_##_entry }
> > +
> > +struct type_table {
> > +	int pstore;
> > +	u16 virtio;
> > +} type_table[] = {
> > +	TYPE_TABLE_ENTRY(DMESG),
> > +};
> > +
> > +#undef TYPE_TABLE_ENTRY
> 
> Let's not play preprocessor games until this becomes
> a big issue. Simple
> { PSTORE_TYPE_DMESG, VIRTIO_PSTORE_TYPE_DMESG}
> does the trick just as well for now.
> Also see below.
> 
> 
> 
> > +
> > +
> 
> single empty line pls.

Ok.

> 
> > +static u16 to_virtio_type(struct virtio_pstore *vps, enum pstore_type_id type)
> > +{
> > +	unsigned int i;
> > +
> > +	for (i = 0; i < ARRAY_SIZE(type_table); i++) {
> > +		if (type == type_table[i].pstore)
> > +			return cpu_to_virtio16(vps->vdev, type_table[i].virtio);
> > +	}
> 
> Rather complex for something that always returns a single value.
> why do we need a table at all?
> How about a switch statement?

pstore has 4 message types and I'd like to add a few more.  This
patch only implements the most popular one, dmesg, and the others
will be added later.

But I'm ok with the switch too.

> 
> static u16 to_virtio_type(struct virtio_pstore *vps, enum pstore_type_id type)
> {
>     switch (type) {
>     case PSTORE_TYPE_DMESG:
>         return VIRTIO_PSTORE_TYPE_DMESG;
>     default:
>         return VIRTIO_PSTORE_TYPE_UNKNOWN;
>     }
> }
> 
> 
> > +
> > +	return cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_TYPE_UNKNOWN);
> > +}
> 
> This returns an incorrect type.

Right.  But it was fixed in v5; unfortunately, you seem to be
reviewing an earlier version.

> 
> 
> > +
> > +static enum pstore_type_id from_virtio_type(struct virtio_pstore *vps, u16 type)
> > +{
> > +	unsigned int i;
> > +
> > +	for (i = 0; i < ARRAY_SIZE(type_table); i++) {
> > +		if (virtio16_to_cpu(vps->vdev, type) == type_table[i].virtio)
> > +			return type_table[i].pstore;
> > +	}
> > +
> > +	return PSTORE_TYPE_UNKNOWN;
> > +}
> > +
> > +static void virtpstore_ack(struct virtqueue *vq)
> > +{
> > +	struct virtio_pstore *vps = vq->vdev->priv;
> > +
> > +	wake_up(&vps->acked);
> > +}
> > +
> > +static void virtpstore_check(struct virtqueue *vq)
> > +{
> > +	struct virtio_pstore *vps = vq->vdev->priv;
> > +	struct virtio_pstore_res *res;
> > +	unsigned int len;
> > +
> > +	res = virtqueue_get_buf(vq, &len);
> > +	if (res == NULL)
> > +		return;
> > +
> > +	if (virtio32_to_cpu(vq->vdev, res->ret) < 0)
> > +		vps->failed = 1;
> > +}
> > +
> > +static void virt_pstore_get_reqs(struct virtio_pstore *vps,
> > +				 struct virtio_pstore_req **preq,
> > +				 struct virtio_pstore_res **pres)
> > +{
> > +	unsigned int idx = vps->req_id++ % VIRT_PSTORE_NR_REQ;
> > +
> > +	*preq = &vps->req[idx];
> > +	*pres = &vps->res[idx];
> > +
> > +	memset(*preq, 0, sizeof(**preq));
> > +	memset(*pres, 0, sizeof(**pres));
> > +}
> > +
> > +static int virt_pstore_open(struct pstore_info *psi)
> > +{
> > +	struct virtio_pstore *vps = psi->data;
> > +	struct virtio_pstore_req *req;
> > +	struct virtio_pstore_res *res;
> > +	struct scatterlist sgo[1], sgi[1];
> > +	struct scatterlist *sgs[2] = { sgo, sgi };
> > +	unsigned int len;
> > +
> > +	virt_pstore_get_reqs(vps, &req, &res);
> > +
> > +	req->cmd = cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_CMD_OPEN);
> > +
> > +	sg_init_one(sgo, req, sizeof(*req));
> > +	sg_init_one(sgi, res, sizeof(*res));
> > +	virtqueue_add_sgs(vps->vq[0], sgs, 1, 1, vps, GFP_KERNEL);
> > +	virtqueue_kick(vps->vq[0]);
> > +
> > +	wait_event(vps->acked, virtqueue_get_buf(vps->vq[0], &len));
> > +	return virtio32_to_cpu(vps->vdev, res->ret);
> > +}
> > +
> > +static int virt_pstore_close(struct pstore_info *psi)
> > +{
> > +	struct virtio_pstore *vps = psi->data;
> > +	struct virtio_pstore_req *req = &vps->req[vps->req_id];
> > +	struct virtio_pstore_res *res = &vps->res[vps->req_id];
> > +	struct scatterlist sgo[1], sgi[1];
> > +	struct scatterlist *sgs[2] = { sgo, sgi };
> > +	unsigned int len;
> > +
> > +	virt_pstore_get_reqs(vps, &req, &res);
> > +
> > +	req->cmd = cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_CMD_CLOSE);
> > +
> > +	sg_init_one(sgo, req, sizeof(*req));
> > +	sg_init_one(sgi, res, sizeof(*res));
> > +	virtqueue_add_sgs(vps->vq[0], sgs, 1, 1, vps, GFP_KERNEL);
> > +	virtqueue_kick(vps->vq[0]);
> > +
> > +	wait_event(vps->acked, virtqueue_get_buf(vps->vq[0], &len));
> > +	return virtio32_to_cpu(vps->vdev, res->ret);
> > +}
> > +
> > +static ssize_t virt_pstore_read(u64 *id, enum pstore_type_id *type,
> > +				int *count, struct timespec *time,
> > +				char **buf, bool *compressed,
> > +				ssize_t *ecc_notice_size,
> > +				struct pstore_info *psi)
> > +{
> > +	struct virtio_pstore *vps = psi->data;
> > +	struct virtio_pstore_req *req;
> > +	struct virtio_pstore_res *res;
> > +	struct virtio_pstore_fileinfo info;
> > +	struct scatterlist sgo[1], sgi[3];
> > +	struct scatterlist *sgs[2] = { sgo, sgi };
> > +	unsigned int len;
> > +	unsigned int flags;
> > +	int ret;
> > +	void *bf;
> > +
> > +	virt_pstore_get_reqs(vps, &req, &res);
> > +
> > +	req->cmd = cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_CMD_READ);
> > +
> > +	sg_init_one(sgo, req, sizeof(*req));
> > +	sg_init_table(sgi, 3);
> > +	sg_set_buf(&sgi[0], res, sizeof(*res));
> > +	sg_set_buf(&sgi[1], &info, sizeof(info));
> > +	sg_set_buf(&sgi[2], psi->buf, psi->bufsize);
> > +	virtqueue_add_sgs(vps->vq[0], sgs, 1, 1, vps, GFP_KERNEL);
> > +	virtqueue_kick(vps->vq[0]);
> > +
> > +	wait_event(vps->acked, virtqueue_get_buf(vps->vq[0], &len));
> > +	if (len < sizeof(*res) + sizeof(info))
> > +		return -1;
> > +
> > +	ret = virtio32_to_cpu(vps->vdev, res->ret);
> > +	if (ret < 0)
> > +		return ret;
> > +
> > +	len = virtio32_to_cpu(vps->vdev, info.len);
> > +
> > +	bf = kmalloc(len, GFP_KERNEL);
> > +	if (bf == NULL)
> > +		return -ENOMEM;
> > +
> > +	*id    = virtio64_to_cpu(vps->vdev, info.id);
> > +	*type  = from_virtio_type(vps, info.type);
> > +	*count = virtio32_to_cpu(vps->vdev, info.count);
> > +
> > +	flags = virtio32_to_cpu(vps->vdev, info.flags);
> > +	*compressed = flags & VIRTIO_PSTORE_FL_COMPRESSED;
> > +
> > +	time->tv_sec  = virtio64_to_cpu(vps->vdev, info.time_sec);
> > +	time->tv_nsec = virtio32_to_cpu(vps->vdev, info.time_nsec);
> > +
> > +	memcpy(bf, psi->buf, len);
> > +	*buf = bf;
> > +
> > +	return len;
> > +}
> > +
> > +static int notrace virt_pstore_write(enum pstore_type_id type,
> > +				     enum kmsg_dump_reason reason,
> > +				     u64 *id, unsigned int part, int count,
> > +				     bool compressed, size_t size,
> > +				     struct pstore_info *psi)
> > +{
> > +	struct virtio_pstore *vps = psi->data;
> > +	struct virtio_pstore_req *req;
> > +	struct virtio_pstore_res *res;
> > +	struct scatterlist sgo[2], sgi[1];
> > +	struct scatterlist *sgs[2] = { sgo, sgi };
> > +	unsigned int flags = compressed ? VIRTIO_PSTORE_FL_COMPRESSED : 0;
> > +
> > +	if (vps->failed)
> > +		return -1;
> > +
> > +	*id = vps->req_id;
> > +	virt_pstore_get_reqs(vps, &req, &res);
> > +
> > +	req->cmd   = cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_CMD_WRITE);
> > +	req->type  = to_virtio_type(vps, type);
> > +	req->flags = cpu_to_virtio32(vps->vdev, flags);
> > +
> > +	sg_init_table(sgo, 2);
> > +	sg_set_buf(&sgo[0], req, sizeof(*req));
> > +	sg_set_buf(&sgo[1], pstore_get_buf(psi), size);
> > +	sg_init_one(sgi, res, sizeof(*res));
> > +	virtqueue_add_sgs(vps->vq[1], sgs, 1, 1, vps, GFP_ATOMIC);
> > +	virtqueue_kick(vps->vq[1]);
> > +
> > +	return 0;
> > +}
> > +
> > +static int virt_pstore_erase(enum pstore_type_id type, u64 id, int count,
> > +			     struct timespec time, struct pstore_info *psi)
> > +{
> > +	struct virtio_pstore *vps = psi->data;
> > +	struct virtio_pstore_req *req;
> > +	struct virtio_pstore_res *res;
> > +	struct scatterlist sgo[1], sgi[1];
> > +	struct scatterlist *sgs[2] = { sgo, sgi };
> > +	unsigned int len;
> > +
> > +	virt_pstore_get_reqs(vps, &req, &res);
> > +
> > +	req->cmd   = cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_CMD_ERASE);
> > +	req->type  = to_virtio_type(vps, type);
> > +	req->id	   = cpu_to_virtio64(vps->vdev, id);
> > +	req->count = cpu_to_virtio32(vps->vdev, count);
> > +
> > +	sg_init_one(sgo, req, sizeof(*req));
> > +	sg_init_one(sgi, res, sizeof(*res));
> > +	virtqueue_add_sgs(vps->vq[0], sgs, 1, 1, vps, GFP_KERNEL);
> > +	virtqueue_kick(vps->vq[0]);
> > +
> > +	wait_event(vps->acked, virtqueue_get_buf(vps->vq[0], &len));
> > +	return virtio32_to_cpu(vps->vdev, res->ret);
> > +}
> > +
> > +static int virt_pstore_init(struct virtio_pstore *vps)
> > +{
> > +	struct pstore_info *psinfo = &vps->pstore;
> > +	int err;
> > +
> > +	if (!psinfo->bufsize)
> > +		psinfo->bufsize = VIRT_PSTORE_BUFSIZE;
> > +
> > +	psinfo->buf = alloc_pages_exact(psinfo->bufsize, GFP_KERNEL);
> > +	if (!psinfo->buf) {
> > +		pr_err("cannot allocate pstore buffer\n");
> > +		return -ENOMEM;
> > +	}
> > +
> > +	psinfo->owner = THIS_MODULE;
> > +	psinfo->name  = "virtio";
> > +	psinfo->open  = virt_pstore_open;
> > +	psinfo->close = virt_pstore_close;
> > +	psinfo->read  = virt_pstore_read;
> > +	psinfo->erase = virt_pstore_erase;
> > +	psinfo->write = virt_pstore_write;
> > +	psinfo->flags = PSTORE_FLAGS_DMESG;
> > +
> > +	psinfo->data  = vps;
> > +	spin_lock_init(&psinfo->buf_lock);
> > +
> > +	err = pstore_register(psinfo);
> > +	if (err)
> > +		kfree(psinfo->buf);
> > +
> > +	return err;
> > +}
> > +
> > +static int virt_pstore_exit(struct virtio_pstore *vps)
> > +{
> > +	struct pstore_info *psinfo = &vps->pstore;
> > +
> > +	pstore_unregister(psinfo);
> 
> I don't know enough about pstore - does this
> actually ensure that
> 1. all existing users close the device
> 2. no new users can open it
> somehow?

The pstore driver doesn't create a device node (except for the pmsg
device, which this patch doesn't deal with), so AFAIK it doesn't need
to worry about users.  The pstore callbacks (if any) are just called
on some system events (e.g. kmsg dump, console write and so on).

In fact, pstore is a pseudo file system that works in read-only mode.

> 
> > +
> > +	free_pages_exact(psinfo->buf, psinfo->bufsize);
> > +	psinfo->buf = NULL;
> > +	psinfo->bufsize = 0;
> > +
> > +	return 0;
> > +}
> > +
> > +static int virtpstore_init_vqs(struct virtio_pstore *vps)
> > +{
> > +	vq_callback_t *callbacks[] = { virtpstore_ack, virtpstore_check };
> > +	const char *names[] = { "pstore_read", "pstore_write" };
> > +
> > +	return vps->vdev->config->find_vqs(vps->vdev, 2, vps->vq,
> > +					   callbacks, names);
> > +}
> > +
> > +static void virtpstore_init_config(struct virtio_pstore *vps)
> > +{
> > +	u32 bufsize;
> > +
> > +	virtio_cread(vps->vdev, struct virtio_pstore_config, bufsize, &bufsize);
> > +
> > +	vps->pstore.bufsize = PAGE_ALIGN(bufsize);
> > +}
> > +
> > +static void virtpstore_confirm_config(struct virtio_pstore *vps)
> > +{
> > +	u32 bufsize = vps->pstore.bufsize;
> > +
> > +	virtio_cwrite(vps->vdev, struct virtio_pstore_config, bufsize,
> > +		     &bufsize);
> > +}
> > +
> > +static int virtpstore_probe(struct virtio_device *vdev)
> > +{
> > +	struct virtio_pstore *vps;
> > +	int err;
> > +
> > +	if (!vdev->config->get) {
> > +		dev_err(&vdev->dev, "driver init: config access disabled\n");
> > +		return -EINVAL;
> > +	}
> > +
> > +	vdev->priv = vps = kzalloc(sizeof(*vps), GFP_KERNEL);
> > +	if (!vps) {
> > +		err = -ENOMEM;
> > +		goto out;
> > +	}
> > +	vps->vdev = vdev;
> > +
> > +	err = virtpstore_init_vqs(vps);
> > +	if (err < 0)
> > +		goto out_free;
> > +
> > +	virtpstore_init_config(vps);
> > +
> > +	err = virt_pstore_init(vps);
> > +	if (err)
> > +		goto out_del_vq;
> > +
> > +	virtpstore_confirm_config(vps);
> > +
> > +	init_waitqueue_head(&vps->acked);
> > +
> > +	virtio_device_ready(vdev);
> > +
> > +	dev_info(&vdev->dev, "driver init: ok (bufsize = %luK, flags = %x)\n",
> > +		 vps->pstore.bufsize >> 10, vps->pstore.flags);
> > +
> > +	return 0;
> > +
> > +out_del_vq:
> > +	vdev->config->del_vqs(vdev);
> > +out_free:
> > +	kfree(vps);
> > +out:
> > +	dev_err(&vdev->dev, "driver init: failed with %d\n", err);
> > +	return err;
> > +}
> > +
> > +static void virtpstore_remove(struct virtio_device *vdev)
> > +{
> > +	struct virtio_pstore *vps = vdev->priv;
> > +
> > +	virt_pstore_exit(vps);
> > +
> > +	/* Now we reset the device so we can clean up the queues. */
> > +	vdev->config->reset(vdev);
> > +
> > +	vdev->config->del_vqs(vdev);
> > +
> > +	kfree(vps);
> > +}
> > +
> > +static unsigned int features[] = {
> > +};
> > +
> > +static struct virtio_device_id id_table[] = {
> > +	{ VIRTIO_ID_PSTORE, VIRTIO_DEV_ANY_ID },
> > +	{ 0 },
> > +};
> > +
> > +static struct virtio_driver virtio_pstore_driver = {
> 
> We need some way to avoid trying to load this
> as a legacy device. There isn't a way to do it yet
> so I won't block your patch on this but pls try to
> come up with something, and I'll do, too.

I have no idea what I can do.  Also, it seems kvmtools supports
only legacy devices (please correct me if I'm wrong).


> 
> 
> > +	.driver.name         = KBUILD_MODNAME,
> > +	.driver.owner        = THIS_MODULE,
> > +	.feature_table       = features,
> > +	.feature_table_size  = ARRAY_SIZE(features),
> > +	.id_table            = id_table,
> > +	.probe               = virtpstore_probe,
> > +	.remove              = virtpstore_remove,
> 
> Won't this need freeze/restore callbacks?

Probably.. :)  Will add.

> 
> > +};
> > +
> > +module_virtio_driver(virtio_pstore_driver);
> > +MODULE_DEVICE_TABLE(virtio, id_table);
> > +
> > +MODULE_LICENSE("GPL");
> > +MODULE_AUTHOR("Namhyung Kim <namhyung@kernel.org>");
> > +MODULE_DESCRIPTION("Virtio pstore driver");
> > diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild
> > index 6d4e92ccdc91..9bbb1554d8b2 100644
> > --- a/include/uapi/linux/Kbuild
> > +++ b/include/uapi/linux/Kbuild
> > @@ -449,6 +449,7 @@ header-y += virtio_ids.h
> >  header-y += virtio_input.h
> >  header-y += virtio_net.h
> >  header-y += virtio_pci.h
> > +header-y += virtio_pstore.h
> >  header-y += virtio_ring.h
> >  header-y += virtio_rng.h
> >  header-y += virtio_scsi.h
> > diff --git a/include/uapi/linux/virtio_ids.h b/include/uapi/linux/virtio_ids.h
> > index 77925f587b15..c72a9ab588c0 100644
> > --- a/include/uapi/linux/virtio_ids.h
> > +++ b/include/uapi/linux/virtio_ids.h
> > @@ -41,5 +41,6 @@
> >  #define VIRTIO_ID_CAIF	       12 /* Virtio caif */
> >  #define VIRTIO_ID_GPU          16 /* virtio GPU */
> >  #define VIRTIO_ID_INPUT        18 /* virtio input */
> > +#define VIRTIO_ID_PSTORE       22 /* virtio pstore */
> >  
> >  #endif /* _LINUX_VIRTIO_IDS_H */
> > diff --git a/include/uapi/linux/virtio_pstore.h b/include/uapi/linux/virtio_pstore.h
> > new file mode 100644
> > index 000000000000..f4b0d204d8ae
> > --- /dev/null
> > +++ b/include/uapi/linux/virtio_pstore.h
> > @@ -0,0 +1,74 @@
> > +#ifndef _LINUX_VIRTIO_PSTORE_H
> > +#define _LINUX_VIRTIO_PSTORE_H
> > +/* This header is BSD licensed so anyone can use the definitions to implement
> > + * compatible drivers/servers.
> > + *
> > + * Redistribution and use in source and binary forms, with or without
> > + * modification, are permitted provided that the following conditions
> > + * are met:
> > + * 1. Redistributions of source code must retain the above copyright
> > + *    notice, this list of conditions and the following disclaimer.
> > + * 2. Redistributions in binary form must reproduce the above copyright
> > + *    notice, this list of conditions and the following disclaimer in the
> > + *    documentation and/or other materials provided with the distribution.
> > + * 3. Neither the name of IBM nor the names of its contributors
> > + *    may be used to endorse or promote products derived from this software
> > + *    without specific prior written permission.
> > + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS ``AS IS'' AND
> > + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
> > + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
> > + * ARE DISCLAIMED.  IN NO EVENT SHALL IBM OR CONTRIBUTORS BE LIABLE
> > + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
> > + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
> > + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
> > + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
> > + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
> > + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
> > + * SUCH DAMAGE. */
> > +#include <linux/types.h>
> > +#include <linux/virtio_types.h>
> > +
> > +#define VIRTIO_PSTORE_CMD_NULL   0
> > +#define VIRTIO_PSTORE_CMD_OPEN   1
> > +#define VIRTIO_PSTORE_CMD_READ   2
> > +#define VIRTIO_PSTORE_CMD_WRITE  3
> > +#define VIRTIO_PSTORE_CMD_ERASE  4
> > +#define VIRTIO_PSTORE_CMD_CLOSE  5
> > +
> > +#define VIRTIO_PSTORE_TYPE_UNKNOWN  0
> > +#define VIRTIO_PSTORE_TYPE_DMESG    1
> > +
> > +#define VIRTIO_PSTORE_FL_COMPRESSED  1
> 
> Most other headers use _F_ and not _FL_
> Also, we specify bit number and not the
> bitmask. So:
> 
> #define VIRTIO_PSTORE_F_COMPRESSED  0
> 
> and
> 
> (0x1 << VIRTIO_PSTORE_F_COMPRESSED)

Ok.
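
A minimal sketch of that convention, assuming the flag is renamed as
suggested.  The two helpers are hypothetical and only show how a bit
number is turned into a mask and tested; they are not from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit number (not a mask), following the review suggestion. */
#define VIRTIO_PSTORE_F_COMPRESSED  0

static_assert(VIRTIO_PSTORE_F_COMPRESSED < 32,
              "flag bit must fit in the 32-bit flags field");

/* Hypothetical helper: build the flags word for a write request. */
static uint32_t pstore_flags_from_compressed(bool compressed)
{
	return compressed ? (UINT32_C(1) << VIRTIO_PSTORE_F_COMPRESSED) : 0;
}

/* Hypothetical helper: test the bit on the read side. */
static bool pstore_flags_is_compressed(uint32_t flags)
{
	return (flags >> VIRTIO_PSTORE_F_COMPRESSED) & 1;
}
```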

> 
> 
> > +
> > +struct virtio_pstore_req {
> > +	__virtio16		cmd;
> > +	__virtio16		type;
> > +	__virtio32		flags;
> > +	__virtio64		id;
> > +	__virtio32		count;
> > +	__virtio32		reserved;
> > +};
> > +
> > +struct virtio_pstore_res {
> > +	__virtio16		cmd;
> > +	__virtio16		type;
> > +	__virtio32		ret;
> > +};
> > +
> > +struct virtio_pstore_fileinfo {
> > +	__virtio64		id;
> > +	__virtio32		count;
> > +	__virtio16		type;
> > +	__virtio16		unused;
> > +	__virtio32		flags;
> > +	__virtio32		len;
> > +	__virtio64		time_sec;
> > +	__virtio32		time_nsec;
> > +	__virtio32		reserved;
> 
> Any reason one is reserved the other is unused?
> If not just calls them pad1, pad2?

No specific reason, will change.


> 
> > +};
> > +
> > +struct virtio_pstore_config {
> > +	__virtio32		bufsize;
> > +};
> > +
> 
> __virtio things are for compatibility things.
> New devices should just use __le everywhere.

Right.  Again, already fixed in v5. :)
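
For illustration only (this is not the v5 code), here is a userspace
sketch of the fileinfo layout with fixed little-endian fields and the
pad1/pad2 naming discussed above.  The le16_t/le32_t/le64_t typedefs
and le32_decode() are stand-ins invented for this example; the kernel
would use __le16/__le32/__le64 and le32_to_cpu().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's __le16/__le32/__le64. */
typedef uint16_t le16_t;
typedef uint32_t le32_t;
typedef uint64_t le64_t;

struct virtio_pstore_fileinfo {
	le64_t id;
	le32_t count;
	le16_t type;
	le16_t pad1;
	le32_t flags;
	le32_t len;
	le64_t time_sec;
	le32_t time_nsec;
	le32_t pad2;
};

/* The explicit pads keep the layout free of compiler-inserted holes. */
static_assert(sizeof(struct virtio_pstore_fileinfo) == 40,
              "unexpected fileinfo size");
static_assert(offsetof(struct virtio_pstore_fileinfo, time_sec) == 24,
              "unexpected time_sec offset");

/* Decode a little-endian 32-bit value byte by byte, on any host. */
static uint32_t le32_decode(const unsigned char *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}
```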


> 
> Let me post a patch that adds config space accessors
> so you can do this.

Please CC me on the patch.

Thanks,
Namhyung

> 
> 
> > +#endif /* _LINUX_VIRTIO_PSTORE_H */
> > -- 
> > 2.9.3


* Re: [PATCH 2/3] qemu: Implement virtio-pstore device
  2016-09-13 15:57   ` Michael S. Tsirkin
@ 2016-09-16 10:05     ` Namhyung Kim
  2016-11-10 22:50       ` Michael S. Tsirkin
  0 siblings, 1 reply; 31+ messages in thread
From: Namhyung Kim @ 2016-09-16 10:05 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev, kvm, qemu-devel, virtualization, LKML, Paolo Bonzini,
	Radim Krčmář,
	Anthony Liguori, Anton Vorontsov, Colin Cross, Kees Cook,
	Tony Luck, Steven Rostedt, Ingo Molnar, Minchan Kim,
	Daniel P . Berrange

On Tue, Sep 13, 2016 at 06:57:10PM +0300, Michael S. Tsirkin wrote:
> On Sat, Aug 20, 2016 at 05:07:43PM +0900, Namhyung Kim wrote:
> > Add virtio pstore device to allow kernel log files saved on the host.
> > It will save the log files on the directory given by pstore device
> > option.
> > 
> >   $ qemu-system-x86_64 -device virtio-pstore,directory=dir-xx ...
> > 
> >   (guest) # echo c > /proc/sysrq-trigger
> > 
> >   $ ls dir-xx
> >   dmesg-1.enc.z  dmesg-2.enc.z
> > 
> > The log files are usually compressed using zlib.  Users can see the log
> > messages directly on the host or on the guest (using pstore filesystem).
> > 
> > The 'directory' property is required for virtio-pstore device to work.
> > It also adds 'bufsize' property to set size of pstore buffer.
> > 
> > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > Cc: Radim Krčmář <rkrcmar@redhat.com>
> > Cc: "Michael S. Tsirkin" <mst@redhat.com>
> > Cc: Anthony Liguori <aliguori@amazon.com>
> > Cc: Anton Vorontsov <anton@enomsg.org>
> > Cc: Colin Cross <ccross@android.com>
> > Cc: Kees Cook <keescook@chromium.org>
> > Cc: Tony Luck <tony.luck@intel.com>
> > Cc: Steven Rostedt <rostedt@goodmis.org>
> > Cc: Ingo Molnar <mingo@kernel.org>
> > Cc: Minchan Kim <minchan@kernel.org>
> > Cc: Daniel P. Berrange <berrange@redhat.com>
> > Cc: kvm@vger.kernel.org
> > Cc: qemu-devel@nongnu.org
> > Cc: virtualization@lists.linux-foundation.org
> > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> > ---
> >  hw/virtio/Makefile.objs                        |   2 +-
> >  hw/virtio/virtio-pci.c                         |  52 ++
> >  hw/virtio/virtio-pci.h                         |  14 +
> >  hw/virtio/virtio-pstore.c                      | 699 +++++++++++++++++++++++++
> >  include/hw/pci/pci.h                           |   1 +
> >  include/hw/virtio/virtio-pstore.h              |  36 ++
> >  include/standard-headers/linux/virtio_ids.h    |   1 +
> >  include/standard-headers/linux/virtio_pstore.h |  76 +++
> >  qdev-monitor.c                                 |   1 +
> >  9 files changed, 881 insertions(+), 1 deletion(-)
> >  create mode 100644 hw/virtio/virtio-pstore.c
> >  create mode 100644 include/hw/virtio/virtio-pstore.h
> >  create mode 100644 include/standard-headers/linux/virtio_pstore.h
> > 
> > diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs
> > index 3e2b175..aae7082 100644
> > --- a/hw/virtio/Makefile.objs
> > +++ b/hw/virtio/Makefile.objs
> > @@ -4,4 +4,4 @@ common-obj-y += virtio-bus.o
> >  common-obj-y += virtio-mmio.o
> >  
> >  obj-y += virtio.o virtio-balloon.o 
> > -obj-$(CONFIG_LINUX) += vhost.o vhost-backend.o vhost-user.o
> > +obj-$(CONFIG_LINUX) += vhost.o vhost-backend.o vhost-user.o virtio-pstore.o
> > diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> > index 755f921..c184823 100644
> > --- a/hw/virtio/virtio-pci.c
> > +++ b/hw/virtio/virtio-pci.c
> > @@ -2416,6 +2416,57 @@ static const TypeInfo virtio_host_pci_info = {
> >  };
> >  #endif
> >  
> > +/* virtio-pstore-pci */
> > +
> > +static void virtio_pstore_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
> > +{
> > +    VirtIOPstorePCI *vps = VIRTIO_PSTORE_PCI(vpci_dev);
> > +    DeviceState *vdev = DEVICE(&vps->vdev);
> > +    Error *err = NULL;
> > +
> > +    qdev_set_parent_bus(vdev, BUS(&vpci_dev->bus));
> > +    object_property_set_bool(OBJECT(vdev), true, "realized", &err);
> > +    if (err) {
> > +        error_propagate(errp, err);
> > +        return;
> > +    }
> > +}
> > +
> > +static void virtio_pstore_pci_class_init(ObjectClass *klass, void *data)
> > +{
> > +    DeviceClass *dc = DEVICE_CLASS(klass);
> > +    VirtioPCIClass *k = VIRTIO_PCI_CLASS(klass);
> > +    PCIDeviceClass *pcidev_k = PCI_DEVICE_CLASS(klass);
> > +
> > +    k->realize = virtio_pstore_pci_realize;
> > +    set_bit(DEVICE_CATEGORY_MISC, dc->categories);
> > +
> > +    pcidev_k->vendor_id = PCI_VENDOR_ID_REDHAT_QUMRANET;
> > +    pcidev_k->device_id = PCI_DEVICE_ID_VIRTIO_PSTORE;
> > +    pcidev_k->revision = VIRTIO_PCI_ABI_VERSION;
> > +    pcidev_k->class_id = PCI_CLASS_OTHERS;
> > +}
> > +
> > +static void virtio_pstore_pci_instance_init(Object *obj)
> > +{
> > +    VirtIOPstorePCI *dev = VIRTIO_PSTORE_PCI(obj);
> > +
> > +    virtio_instance_init_common(obj, &dev->vdev, sizeof(dev->vdev),
> > +                                TYPE_VIRTIO_PSTORE);
> > +    object_property_add_alias(obj, "directory", OBJECT(&dev->vdev),
> > +                              "directory", &error_abort);
> > +    object_property_add_alias(obj, "bufsize", OBJECT(&dev->vdev),
> > +                              "bufsize", &error_abort);
> > +}
> > +
> > +static const TypeInfo virtio_pstore_pci_info = {
> > +    .name          = TYPE_VIRTIO_PSTORE_PCI,
> > +    .parent        = TYPE_VIRTIO_PCI,
> > +    .instance_size = sizeof(VirtIOPstorePCI),
> > +    .instance_init = virtio_pstore_pci_instance_init,
> > +    .class_init    = virtio_pstore_pci_class_init,
> > +};
> > +
> >  /* virtio-pci-bus */
> >  
> >  static void virtio_pci_bus_new(VirtioBusState *bus, size_t bus_size,
> > @@ -2485,6 +2536,7 @@ static void virtio_pci_register_types(void)
> >  #ifdef CONFIG_VHOST_SCSI
> >      type_register_static(&vhost_scsi_pci_info);
> >  #endif
> > +    type_register_static(&virtio_pstore_pci_info);
> >  }
> >  
> >  type_init(virtio_pci_register_types)
> > diff --git a/hw/virtio/virtio-pci.h b/hw/virtio/virtio-pci.h
> > index 25fbf8a..354b2b7 100644
> > --- a/hw/virtio/virtio-pci.h
> > +++ b/hw/virtio/virtio-pci.h
> > @@ -31,6 +31,7 @@
> >  #ifdef CONFIG_VHOST_SCSI
> >  #include "hw/virtio/vhost-scsi.h"
> >  #endif
> > +#include "hw/virtio/virtio-pstore.h"
> >  
> >  typedef struct VirtIOPCIProxy VirtIOPCIProxy;
> >  typedef struct VirtIOBlkPCI VirtIOBlkPCI;
> > @@ -44,6 +45,7 @@ typedef struct VirtIOInputPCI VirtIOInputPCI;
> >  typedef struct VirtIOInputHIDPCI VirtIOInputHIDPCI;
> >  typedef struct VirtIOInputHostPCI VirtIOInputHostPCI;
> >  typedef struct VirtIOGPUPCI VirtIOGPUPCI;
> > +typedef struct VirtIOPstorePCI VirtIOPstorePCI;
> >  
> >  /* virtio-pci-bus */
> >  
> > @@ -324,6 +326,18 @@ struct VirtIOGPUPCI {
> >      VirtIOGPU vdev;
> >  };
> >  
> > +/*
> > + * virtio-pstore-pci: This extends VirtioPCIProxy.
> > + */
> > +#define TYPE_VIRTIO_PSTORE_PCI "virtio-pstore-pci"
> > +#define VIRTIO_PSTORE_PCI(obj) \
> > +        OBJECT_CHECK(VirtIOPstorePCI, (obj), TYPE_VIRTIO_PSTORE_PCI)
> > +
> > +struct VirtIOPstorePCI {
> > +    VirtIOPCIProxy parent_obj;
> > +    VirtIOPstore vdev;
> > +};
> > +
> >  /* Virtio ABI version, if we increment this, we break the guest driver. */
> >  #define VIRTIO_PCI_ABI_VERSION          0
> >  
> > diff --git a/hw/virtio/virtio-pstore.c b/hw/virtio/virtio-pstore.c
> > new file mode 100644
> > index 0000000..b8fb4be
> > --- /dev/null
> > +++ b/hw/virtio/virtio-pstore.c
> > @@ -0,0 +1,699 @@
> > +/*
> > + * Virtio Pstore Device
> > + *
> > + * Copyright (C) 2016  LG Electronics
> > + *
> > + * Authors:
> > + *  Namhyung Kim  <namhyung@gmail.com>
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> > + * See the COPYING file in the top-level directory.
> > + *
> > + */
> > +
> > +#include <stdio.h>
> > +
> > +#include "qemu/osdep.h"
> > +#include "qemu/iov.h"
> > +#include "qemu-common.h"
> > +#include "qemu/cutils.h"
> > +#include "qemu/error-report.h"
> > +#include "sysemu/kvm.h"
> > +#include "qapi/visitor.h"
> > +#include "qapi-event.h"
> > +#include "io/channel-util.h"
> > +#include "trace.h"
> > +
> > +#include "hw/virtio/virtio.h"
> > +#include "hw/virtio/virtio-bus.h"
> > +#include "hw/virtio/virtio-access.h"
> > +#include "hw/virtio/virtio-pstore.h"
> > +
> > +#define PSTORE_DEFAULT_BUFSIZE   (16 * 1024)
> > +#define PSTORE_DEFAULT_FILE_MAX  5
> > +
> > +/* the index should match to the type value */
> > +static const char *virtio_pstore_file_prefix[] = {
> > +    "unknown-",		/* VIRTIO_PSTORE_TYPE_UNKNOWN */
> 
> Is there value in treating everything unexpected as "unknown"
> and rotating them as if they were logs?
> It might be better to treat everything that's not known
> as guest error.

I was thinking about a version mismatch between the kernel and qemu.
I'd like the device to be able to deal with a newer kernel that
might implement a new pstore message type.  It will be saved as
unknown, but the kernel can still read it back properly later.

> 
> 
> > +    "dmesg-",		/* VIRTIO_PSTORE_TYPE_DMESG */
> 
> use named initializers for this instead of comments.

Ok.
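
A minimal sketch of what the named-initializer version could look like.
The helper pstore_prefix_for_type() is a hypothetical name that does not
appear in the patch, and the type values are copied from the quoted
virtio_pstore.h so that the block stands alone:

```c
#include <stddef.h>
#include <string.h>

/* Type values as defined in the quoted virtio_pstore.h header. */
#define VIRTIO_PSTORE_TYPE_UNKNOWN  0
#define VIRTIO_PSTORE_TYPE_DMESG    1

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Each slot is tied to its type value by name, not by position. */
static const char *virtio_pstore_file_prefix[] = {
    [VIRTIO_PSTORE_TYPE_UNKNOWN] = "unknown-",
    [VIRTIO_PSTORE_TYPE_DMESG]   = "dmesg-",
};

/* Hypothetical helper: map a guest-supplied type to a file prefix,
 * falling back to "unknown-" for out-of-range values or table holes. */
static const char *pstore_prefix_for_type(unsigned int type)
{
    if (type < ARRAY_SIZE(virtio_pstore_file_prefix) &&
        virtio_pstore_file_prefix[type] != NULL) {
        return virtio_pstore_file_prefix[type];
    }
    return virtio_pstore_file_prefix[VIRTIO_PSTORE_TYPE_UNKNOWN];
}
```

With designated initializers, a type value that is skipped in the table
leaves a NULL slot, which is why the sketch checks for NULL as well as
for the array bound.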

> 
> > +};
> > +
> > +static char *virtio_pstore_to_filename(VirtIOPstore *s,
> > +                                       struct virtio_pstore_req *req)
> > +{
> > +    const char *basename;
> > +    unsigned long long id;
> > +    unsigned int type = le16_to_cpu(req->type);
> > +    unsigned int flags = le32_to_cpu(req->flags);
> > +
> > +    if (type < ARRAY_SIZE(virtio_pstore_file_prefix)) {
> > +        basename = virtio_pstore_file_prefix[type];
> > +    } else {
> > +        basename = "unknown-";
> > +    }
> > +
> > +    id = s->id++;
> > +    return g_strdup_printf("%s/%s%llu%s", s->directory, basename, id,
> > +                            flags & VIRTIO_PSTORE_FL_COMPRESSED ? ".enc.z" : "");
> > +}
> > +
> > +static char *virtio_pstore_from_filename(VirtIOPstore *s, char *name,
> > +                                         struct virtio_pstore_fileinfo *info)
> > +{
> > +    char *filename;
> > +    unsigned int idx;
> > +
> > +    filename = g_strdup_printf("%s/%s", s->directory, name);
> > +    if (filename == NULL)
> > +        return NULL;
> > +
> > +    for (idx = 0; idx < ARRAY_SIZE(virtio_pstore_file_prefix); idx++) {
> > +        if (g_str_has_prefix(name, virtio_pstore_file_prefix[idx])) {
> > +            info->type = idx;
> > +            name += strlen(virtio_pstore_file_prefix[idx]);
> > +            break;
> > +        }
> > +    }
> > +
> > +    if (idx == ARRAY_SIZE(virtio_pstore_file_prefix)) {
> > +        g_free(filename);
> > +        return NULL;
> > +    }
> > +
> > +    qemu_strtoull(name, NULL, 0, &info->id);
> 
> What if this fails?

Hmm, I'll add a check for the return value then.
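
Roughly, the check could look like the sketch below.  It uses plain
strtoull() as a stand-in for qemu_strtoull() so that it builds outside
the QEMU tree, and it parses base 10 only (the quoted code passes base
0).  pstore_parse_id() is a hypothetical name, not something in the
patch:

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: parse the numeric id at the start of `s`.
 * Returns 0 on success and stores the id; returns -1 if there are no
 * leading digits or the value overflows.  Trailing text such as
 * ".enc.z" is allowed and ignored. */
static int pstore_parse_id(const char *s, uint64_t *id)
{
    char *end;
    unsigned long long val;

    /* strtoull() would silently accept a sign or leading whitespace. */
    if (!isdigit((unsigned char)s[0])) {
        return -1;
    }

    errno = 0;
    val = strtoull(s, &end, 10);
    if (errno == ERANGE || end == s) {
        return -1;
    }

    *id = val;
    return 0;
}
```

A caller in virtio_pstore_from_filename() would then free the filename
and return NULL when the helper reports an error, instead of leaving
info->id unset.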

> 
> > +
> > +    info->flags = 0;
> > +    if (g_str_has_suffix(name, ".enc.z")) {
> > +        info->flags |= VIRTIO_PSTORE_FL_COMPRESSED;
> > +    }
> > +
> > +    return filename;
> > +}
> > +
> > +static int prefix_idx;
> > +static int prefix_count;
> > +static int prefix_len;
> This does not work properly if there are multiple instances
> of it. Pls move everything into device state.

The kernel (currently?) allows only a single active pstore device.  But I
think it'd be better to move them into device state anyway.
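
One thing to keep in mind: the scandir() filter callback takes no
context argument, so the prefix state cannot simply be passed through
it.  A sketch of one way around that, using an explicit opendir() and
readdir() loop.  struct pstore_scan and both function names are
hypothetical, not taken from the patch:

```c
#include <dirent.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-device scan state replacing the three file-scope
 * statics (prefix_idx, prefix_count, prefix_len). */
struct pstore_scan {
    const char *const *prefixes;  /* prefixes to accept */
    size_t nr_prefixes;
};

/* Return 1 if `name` starts with one of the prefixes in `scan`. */
static int pstore_scan_match(const struct pstore_scan *scan, const char *name)
{
    size_t i;

    for (i = 0; i < scan->nr_prefixes; i++) {
        size_t len = strlen(scan->prefixes[i]);

        if (strncmp(name, scan->prefixes[i], len) == 0) {
            return 1;
        }
    }
    return 0;
}

/* Count matching entries in `dir`; returns -1 if it cannot be opened. */
static int pstore_scan_count(const struct pstore_scan *scan, const char *dir)
{
    DIR *d = opendir(dir);
    struct dirent *de;
    int n = 0;

    if (d == NULL) {
        return -1;
    }
    while ((de = readdir(d)) != NULL) {
        n += pstore_scan_match(scan, de->d_name);
    }
    closedir(d);
    return n;
}
```

The scan state would live in (or be built from) VirtIOPstore, so two
device instances never share it.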

> 
> > +
> > +static int filter_pstore(const struct dirent *de)
> > +{
> > +    int i;
> > +
> > +    for (i = 0; i < prefix_count; i++) {
> > +        const char *prefix = virtio_pstore_file_prefix[prefix_idx + i];
> > +
> > +        if (g_str_has_prefix(de->d_name, prefix)) {
> > +            return 1;
> > +        }
> > +    }
> > +    return 0;
> > +}
> > +
> > +static int sort_pstore(const struct dirent **a, const struct dirent **b)
> > +{
> > +    uint64_t id_a, id_b;
> > +
> > +    qemu_strtoull((*a)->d_name + prefix_len, NULL, 0, &id_a);
> > +    qemu_strtoull((*b)->d_name + prefix_len, NULL, 0, &id_b);
> > +
> > +    return id_a - id_b;
> > +}
> > +
> > +static int rotate_pstore_file(VirtIOPstore *s, unsigned short type)
> > +{
> > +    int ret = 0;
> > +    int i, num;
> > +    char *filename;
> > +    struct dirent **files;
> > +
> > +    if (type >= ARRAY_SIZE(virtio_pstore_file_prefix)) {
> > +        type = VIRTIO_PSTORE_TYPE_UNKNOWN;
> > +    }
> > +
> > +    prefix_idx = type;
> > +    prefix_len = strlen(virtio_pstore_file_prefix[type]);
> > +    prefix_count = 1;  /* only scan current type */
> > +
> > +    /* delete the oldest file in the same type */
> > +    num = scandir(s->directory, &files, filter_pstore, sort_pstore);
> > +    if (num < 0)
> > +        return num;
> > +    if (num < (int)s->file_max)
> > +        goto out;
> > +
> > +    filename = g_strdup_printf("%s/%s", s->directory, files[0]->d_name);
> > +    if (filename == NULL) {
> > +        ret = -1;
> > +        goto out;
> > +    }
> > +
> > +    ret = unlink(filename);
> > +
> > +out:
> > +    for (i = 0; i < num; i++) {
> > +        g_free(files[i]);
> > +    }
> > +    g_free(files);
> > +
> > +    return ret;
> > +}
> 
> Pls prefix everything with virtio_pstore or another
> unique prefix. also below.

Ok.

> 
> > +
> > +static ssize_t virtio_pstore_do_open(VirtIOPstore *s)
> > +{
> > +    /* scan all pstore files */
> > +    prefix_idx = 0;
> > +    prefix_count = ARRAY_SIZE(virtio_pstore_file_prefix);
> > +
> > +    s->file_idx = 0;
> > +    s->num_file = scandir(s->directory, &s->files, filter_pstore, alphasort);
> > +
> > +    return s->num_file >= 0 ? 0 : -1;
> > +}
> > +
> > +static ssize_t virtio_pstore_do_close(VirtIOPstore *s)
> > +{
> > +    int i;
> > +
> > +    for (i = 0; i < s->num_file; i++) {
> > +        g_free(s->files[i]);
> > +    }
> > +    g_free(s->files);
> > +    s->files = NULL;
> > +
> > +    s->num_file = 0;
> > +    return 0;
> > +}
> > +
> > +static ssize_t virtio_pstore_do_erase(VirtIOPstore *s,
> > +                                      struct virtio_pstore_req *req)
> > +{
> > +    char *filename;
> > +    int ret;
> > +
> > +    filename = virtio_pstore_to_filename(s, req);
> > +    if (filename == NULL)
> > +        return -1;
> 
> this can't happen.

Why?  virtio_pstore_to_filename() calls g_strdup_printf().  Does that
mean I don't need to worry about memory allocation failure?

> 
> also this is a coding style violation.

Oh, I missed adding the {}, will fix.

> 
> > +
> > +    ret = unlink(filename);
> > +
> > +    g_free(filename);
> > +    return ret;
> > +}
> > +
> > +struct pstore_read_arg {
> > +    VirtIOPstore *vps;
> > +    VirtQueueElement *elem;
> > +    struct virtio_pstore_fileinfo info;
> > +    QIOChannel *ioc;
> > +};
> > +
> > +static gboolean pstore_async_read_fn(QIOChannel *ioc, GIOCondition condition,
> > +                                     gpointer data)
> > +{
> > +    struct pstore_read_arg *rarg = data;
> > +    struct virtio_pstore_fileinfo *info = &rarg->info;
> > +    VirtIOPstore *vps = rarg->vps;
> > +    VirtQueueElement *elem = rarg->elem;
> > +    struct virtio_pstore_res res;
> > +    size_t offset = sizeof(res) + sizeof(*info);
> > +    struct iovec *sg = elem->in_sg;
> > +    unsigned int sg_num = elem->in_num;
> > +    Error *err = NULL;
> > +    ssize_t len;
> > +    int ret;
> > +
> > +    /* skip res and fileinfo */
> > +    iov_discard_front(&sg, &sg_num, sizeof(res) + sizeof(*info));
> > +
> > +    len = qio_channel_readv(rarg->ioc, sg, sg_num, &err);
> > +    if (len < 0) {
> > +        if (errno == EAGAIN) {
> > +            len = 0;
> > +        }
> > +        ret = -1;
> > +    } else {
> > +        info->len = cpu_to_le32(len);
> > +        ret = 0;
> > +    }
> > +
> > +    res.cmd  = cpu_to_le16(VIRTIO_PSTORE_CMD_READ);
> > +    res.type = cpu_to_le16(VIRTIO_PSTORE_TYPE_UNKNOWN);
> > +    res.ret  = cpu_to_le32(ret);
> > +
> > +    /* now copy res and fileinfo */
> > +    iov_from_buf(elem->in_sg, elem->in_num, 0, &res, sizeof(res));
> > +    iov_from_buf(elem->in_sg, elem->in_num, sizeof(res), info, sizeof(*info));
> > +
> > +    len += offset;
> > +    virtqueue_push(vps->rvq, elem, len);
> > +    virtio_notify(VIRTIO_DEVICE(vps), vps->rvq);
> > +
> > +    return G_SOURCE_REMOVE;
> > +}
> > +
> > +static void free_rarg_fn(gpointer data)
> > +{
> > +    struct pstore_read_arg *rarg = data;
> > +
> > +    qio_channel_close(rarg->ioc, NULL);
> > +
> > +    g_free(rarg->elem);
> > +    g_free(rarg);
> > +}
> > +
> > +static ssize_t virtio_pstore_do_read(VirtIOPstore *s, VirtQueueElement *elem)
> > +{
> > +    char *filename = NULL;
> > +    int fd, idx;
> > +    struct stat stbuf;
> > +    struct pstore_read_arg *rarg = NULL;
> > +    Error *err = NULL;
> > +    int ret = -1;
> > +
> > +    if (s->file_idx >= s->num_file) {
> > +        return 0;
> > +    }
> > +
> > +    rarg = g_malloc(sizeof(*rarg));
> > +    if (rarg == NULL) {
> > +        return -1;
> > +    }
> > +
> > +    idx = s->file_idx++;
> > +    filename = virtio_pstore_from_filename(s, s->files[idx]->d_name,
> > +                                           &rarg->info);
> > +    if (filename == NULL) {
> > +        goto out;
> > +    }
> > +
> > +    fd = open(filename, O_RDONLY);
> > +    if (fd < 0) {
> > +        error_report("cannot open %s", filename);
> > +        goto out;
> > +    }
> 
> I see open here but close nowhere. Does this leak fds?

I guess so.  But this was changed to use the qio_channel_file API in v5,
and I hope that does it right.

> 
> > +
> > +    if (fstat(fd, &stbuf) < 0) {
> 
> So we can stat, but can we e.g. read?

I'm just being paranoid.  I think it should succeed, no?

> 
> 
> > +        goto out;
> > +    }
> > +
> > +    rarg->vps            = s;
> > +    rarg->elem           = elem;
> > +    rarg->info.id        = cpu_to_le64(rarg->info.id);
> > +    rarg->info.type      = cpu_to_le16(rarg->info.type);
> > +    rarg->info.flags     = cpu_to_le32(rarg->info.flags);
> > +    rarg->info.time_sec  = cpu_to_le64(stbuf.st_ctim.tv_sec);
> 
> Is this seconds since epoch?
> Why ctim specifically?
> Pls add comments.

I don't think it matters whether it's ctim or mtim.

> 
> > +    rarg->info.time_nsec = cpu_to_le32(stbuf.st_ctim.tv_nsec);
> 
> Not all hosts support nanosecond precision.
> Do we need some way to tell guest what's reliable?

In fact I'm not sure how much it affects users.  The pstore messages
are occasional and AFAIK pstore keeps the timestamp only for the user's
information.

> 
> Unless you limit this to linux host, you should care about things
> like this (in man fstat)
> 
>     Since kernel 2.5.48, the stat structure supports nanosecond
>     resolution for the three file timestamp fields.  The nanosecond
>     components of each timestamp are available via names of the form
>     st_atim.tv_nsec if the _BSD_SOURCE or _SVID_SOURCE feature test macro
>     is defined.  Nanosecond timestamps are nowadays standardized,
>     starting with POSIX.1-2008, and, starting with version 2.12, glibc also
>     exposes the nanosecond component names if _POSIX_C_SOURCE is defined
>     with the value 200809L or greater, or _XOPEN_SOURCE is defined with
>     the value 700 or greater.  If none of the aforementioned macros are
>     defined, then the nanosecond values are exposed with names of the form
>     st_atimensec.

Thanks for the info.
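
For reference, a small sketch of reading the timestamp through the
POSIX.1-2008 names described in the man page excerpt.  It assumes a
host whose libc exposes st_mtim, and it picks the modification time
rather than the change time purely for illustration.  struct
pstore_time and pstore_file_time() are hypothetical names:

```c
#define _POSIX_C_SOURCE 200809L  /* exposes st_mtim on glibc >= 2.12 */

#include <stdint.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical host-side timestamp, split the same way that
 * virtio_pstore_fileinfo carries it (time_sec / time_nsec). */
struct pstore_time {
    uint64_t sec;   /* seconds since the Unix epoch */
    uint32_t nsec;  /* stays 0 on filesystems without sub-second stamps */
};

/* Read the last-modification time of `path`.
 * Returns 0 on success, -1 if stat() fails. */
static int pstore_file_time(const char *path, struct pstore_time *t)
{
    struct stat st;

    if (stat(path, &st) < 0) {
        return -1;
    }
    t->sec = (uint64_t)st.st_mtim.tv_sec;
    t->nsec = (uint32_t)st.st_mtim.tv_nsec;
    return 0;
}
```

A host without these field names would need a fallback to st_mtime with
the nanosecond part set to 0.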

> 
> 
> 
> 
> > +
> > +    rarg->ioc = qio_channel_new_fd(fd, &err);
> > +    if (err) {
> > +        error_reportf_err(err, "cannot create io channel: ");
> > +        goto out;
> > +    }
> > +
> > +    qio_channel_set_blocking(rarg->ioc, false, &err);
> > +    qio_channel_add_watch(rarg->ioc, G_IO_IN, pstore_async_read_fn, rarg,
> > +                          free_rarg_fn);
> > +    g_free(filename);
> > +    return 1;
> > +
> > +out:
> > +    g_free(filename);
> > +    g_free(rarg);
> > +
> > +    return ret;
> > +}
> > +
> > +struct pstore_write_arg {
> > +    VirtIOPstore *vps;
> > +    VirtQueueElement *elem;
> > +    struct virtio_pstore_req *req;
> > +    QIOChannel *ioc;
> > +};
> > +
> > +static gboolean pstore_async_write_fn(QIOChannel *ioc, GIOCondition condition,
> > +                                      gpointer data)
> > +{
> > +    struct pstore_write_arg *warg = data;
> > +    VirtIOPstore *vps = warg->vps;
> > +    VirtQueueElement *elem = warg->elem;
> > +    struct iovec *sg = elem->out_sg;
> > +    unsigned int sg_num = elem->out_num;
> > +    struct virtio_pstore_res res;
> > +    Error *err = NULL;
> > +    ssize_t len;
> > +    int ret;
> > +
> > +    /* we already consumed the req */
> > +    iov_discard_front(&sg, &sg_num, sizeof(*warg->req));
> > +
> > +    len = qio_channel_writev(warg->ioc, sg, sg_num, &err);
> > +    if (len < 0) {
> > +        ret = -1;
> > +    } else {
> > +        ret = 0;
> > +    }
> 
> This can discard part of the data written.
> Don't we care?

Doing a partial write is better than failing outright.  But if it's
meaningful to add a retry loop, I'd like to do so.
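
Such a retry loop could be sketched as follows.  This version uses
plain write() on a blocking descriptor rather than the QIOChannel API
from the patch, so it is only an illustration of the idea.  A
non-blocking channel would also have to wait and retry on EAGAIN.
pstore_write_all() is a hypothetical name:

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Write all `len` bytes of `buf` to `fd`, retrying on short writes and
 * on EINTR.  Returns 0 on success, -1 on error (errno set by write()). */
static int pstore_write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n < 0) {
            if (errno == EINTR) {
                continue;   /* interrupted before any data was written */
            }
            return -1;
        }
        p += n;             /* skip what was written and try the rest */
        len -= (size_t)n;
    }
    return 0;
}
```

With a loop like this, success means the whole record reached the file
and an error is reported instead of a silently truncated record.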

> 
> > +
> > +    res.cmd  = cpu_to_le16(VIRTIO_PSTORE_CMD_WRITE);
> > +    res.type = warg->req->type;
> > +    res.ret  = cpu_to_le32(ret);
> > +
> > +    /* tell the result to guest */
> > +    iov_from_buf(elem->in_sg, elem->in_num, 0, &res, sizeof(res));
> > +
> > +    virtqueue_push(vps->wvq, elem, sizeof(res));
> > +    virtio_notify(VIRTIO_DEVICE(vps), vps->wvq);
> > +
> > +    return G_SOURCE_REMOVE;
> > +}
> > +
> > +static void free_warg_fn(gpointer data)
> > +{
> > +    struct pstore_write_arg *warg = data;
> > +
> > +    qio_channel_close(warg->ioc, NULL);
> > +
> > +    g_free(warg->elem);
> > +    g_free(warg);
> > +}
> > +
> > +static ssize_t virtio_pstore_do_write(VirtIOPstore *s, VirtQueueElement *elem,
> > +                                      struct virtio_pstore_req *req)
> > +{
> > +    unsigned short type = le16_to_cpu(req->type);
> > +    char *filename = NULL;
> > +    int fd;
> > +    int flags = O_WRONLY | O_CREAT | O_TRUNC;
> > +    struct pstore_write_arg *warg = NULL;
> > +    Error *err = NULL;
> > +    int ret = -1;
> > +
> > +    /* do not keep same type of files more than 'file-max' */
> > +    rotate_pstore_file(s, type);
> 
> If you don't care about failures, should this function
> return a value? How about reporting it to the user?

Did you mean when it fails to delete the oldest file?  (FYI it's not
really a 'rotate'.)  Hmm, I'll add an error check and report it.

> 
> 
> > +
> > +    filename = virtio_pstore_to_filename(s, req);
> > +    if (filename == NULL) {
> > +        return -1;
> > +    }
> 
> this can't happen
> 
> > +
> > +    warg = g_malloc(sizeof(*warg));
> > +    if (warg == NULL) {
> > +        goto out;
> > +    }
> > +
> > +    fd = open(filename, flags, 0644);
> > +    if (fd < 0) {
> > +        error_report("cannot open %s", filename);
> > +        ret = fd;
> > +        goto out;
> > +    }
> > +
> > +    warg->vps            = s;
> > +    warg->elem           = elem;
> > +    warg->req            = req;
> > +
> > +    warg->ioc = qio_channel_new_fd(fd, &err);
> > +    if (err) {
> > +        error_reportf_err(err, "cannot create io channel: ");
> > +        goto out;
> > +    }
> > +
> > +    qio_channel_set_blocking(warg->ioc, false, &err);
> > +    qio_channel_add_watch(warg->ioc, G_IO_OUT, pstore_async_write_fn, warg,
> > +                          free_warg_fn);
> > +    g_free(filename);
> > +    return 1;
> > +
> > +out:
> > +    g_free(filename);
> > +    g_free(warg);
> > +    return ret;
> > +}
> > +
> > +static void virtio_pstore_handle_io(VirtIODevice *vdev, VirtQueue *vq)
> > +{
> > +    VirtIOPstore *s = VIRTIO_PSTORE(vdev);
> > +    VirtQueueElement *elem;
> > +    struct virtio_pstore_req req;
> > +    struct virtio_pstore_res res;
> > +    ssize_t len = 0;
> > +    int ret;
> > +
> > +    for (;;) {
> > +        elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
> > +        if (!elem) {
> > +            return;
> > +        }
> > +
> > +        if (elem->out_num < 1 || elem->in_num < 1) {
> > +            error_report("request or response buffer is missing");
> > +            exit(1);
> > +        }
> > +
> > +        if (elem->out_num > 2 || elem->in_num > 3) {
> > +            error_report("invalid number of input/output buffer");
> > +            exit(1);
> > +        }
> > +
> > +        len = iov_to_buf(elem->out_sg, elem->out_num, 0, &req, sizeof(req));
> > +        if (len != (ssize_t)sizeof(req)) {
> > +            error_report("invalid request size: %ld", (long)len);
> > +            exit(1);
> > +        }
> > +        res.cmd  = req.cmd;
> > +        res.type = req.type;
> > +
> > +        switch (le16_to_cpu(req.cmd)) {
> > +        case VIRTIO_PSTORE_CMD_OPEN:
> > +            ret = virtio_pstore_do_open(s);
> > +            break;
> > +        case VIRTIO_PSTORE_CMD_CLOSE:
> > +            ret = virtio_pstore_do_close(s);
> > +            break;
> > +        case VIRTIO_PSTORE_CMD_ERASE:
> > +            ret = virtio_pstore_do_erase(s, &req);
> > +            break;
> > +        case VIRTIO_PSTORE_CMD_READ:
> > +            ret = virtio_pstore_do_read(s, elem);
> > +            if (ret == 1) {
> > +                /* async channel io */
> > +                continue;
> > +            }
> > +            break;
> > +        case VIRTIO_PSTORE_CMD_WRITE:
> > +            ret = virtio_pstore_do_write(s, elem, &req);
> > +            if (ret == 1) {
> > +                /* async channel io */
> > +                continue;
> > +            }
> > +            break;
> > +        default:
> > +            ret = -1;
> > +            break;
> > +        }
> > +
> > +        res.ret = ret;
> > +
> > +        iov_from_buf(elem->in_sg, elem->in_num, 0, &res, sizeof(res));
> > +        virtqueue_push(vq, elem, sizeof(res) + len);
> > +
> > +        virtio_notify(vdev, vq);
> > +        g_free(elem);
> > +
> > +        if (ret < 0) {
> > +            return;
> 
> what does this do?

If any processing fails, it reports the failure to the kernel and stops
processing later commands.  The kernel won't send the same kind of
command afterwards.

> 
> > +        }
> > +    }
> > +}
> > +
> > +static void virtio_pstore_device_realize(DeviceState *dev, Error **errp)
> > +{
> > +    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> > +    VirtIOPstore *s = VIRTIO_PSTORE(dev);
> > +
> > +    virtio_init(vdev, "virtio-pstore", VIRTIO_ID_PSTORE,
> > +                sizeof(struct virtio_pstore_config));
> > +
> > +    s->id = 1;
> > +
> > +    if (!s->bufsize)
> > +        s->bufsize = PSTORE_DEFAULT_BUFSIZE;
> > +    if (!s->file_max)
> > +        s->file_max = PSTORE_DEFAULT_FILE_MAX;
> > +
> > +    s->rvq = virtio_add_queue(vdev, 128, virtio_pstore_handle_io);
> > +    s->wvq = virtio_add_queue(vdev, 128, virtio_pstore_handle_io);
> > +}
> > +
> > +static void virtio_pstore_device_unrealize(DeviceState *dev, Error **errp)
> > +{
> > +    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> > +
> > +    virtio_cleanup(vdev);
> > +}
> > +
> > +static void virtio_pstore_get_config(VirtIODevice *vdev, uint8_t *config_data)
> > +{
> > +    VirtIOPstore *dev = VIRTIO_PSTORE(vdev);
> > +    struct virtio_pstore_config config;
> 
> Add {} here - you want all fields initialized
> if you add them, to avoid leaking them to guest.

Ok.
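
To illustrate the point, here is a sketch with a made-up two-field
config struct.  The real virtio_pstore_config has only bufsize today,
and pstore_config_example and pstore_fill_config() are hypothetical
names.  It uses { 0 }, which is standard C; the empty {} form is a GNU
extension before C23:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical config layout with a second field, to show what happens
 * when the struct grows after get_config() was written. */
struct pstore_config_example {
    uint32_t bufsize;
    uint32_t reserved;   /* added later; never assigned below */
};

/* Fill `out`, which must hold sizeof(struct pstore_config_example) bytes. */
static void pstore_fill_config(uint8_t *out, uint32_t bufsize)
{
    /* The initializer zeroes every member, so `reserved` cannot carry
     * stale stack contents to the guest. */
    struct pstore_config_example config = { 0 };

    config.bufsize = bufsize;
    memcpy(out, &config, sizeof(config));
}
```

Without the initializer, the memcpy() would copy whatever happened to be
on the stack in the unassigned field.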

> 
> > +
> > +    config.bufsize = cpu_to_le32(dev->bufsize);
> > +
> > +    memcpy(config_data, &config, sizeof(struct virtio_pstore_config));
> > +}
> > +
> > +static void virtio_pstore_set_config(VirtIODevice *vdev,
> > +                                     const uint8_t *config_data)
> > +{
> > +    VirtIOPstore *dev = VIRTIO_PSTORE(vdev);
> > +    struct virtio_pstore_config config;
> > +
> > +    memcpy(&config, config_data, sizeof(struct virtio_pstore_config));
> > +
> > +    dev->bufsize = le32_to_cpu(config.bufsize);
> > +}
> > +
> > +static uint64_t get_features(VirtIODevice *vdev, uint64_t f, Error **errp)
> > +{
> > +    return f;
> > +}
> > +
> > +static void pstore_get_directory(Object *obj, Visitor *v,
> > +                                 const char *name, void *opaque,
> > +                                 Error **errp)
> > +{
> > +    VirtIOPstore *s = opaque;
> > +
> > +    visit_type_str(v, name, &s->directory, errp);
> > +}
> > +
> > +static void pstore_set_directory(Object *obj, Visitor *v,
> > +                                 const char *name, void *opaque,
> > +                                 Error **errp)
> > +{
> > +    VirtIOPstore *s = opaque;
> > +    Error *local_err = NULL;
> > +    char *value;
> > +
> > +    visit_type_str(v, name, &value, &local_err);
> > +    if (local_err) {
> > +        error_propagate(errp, local_err);
> > +        return;
> > +    }
> > +
> > +    g_free(s->directory);
> > +    s->directory = value;
> > +}
> > +
> > +static void pstore_release_directory(Object *obj, const char *name,
> > +                                     void *opaque)
> > +{
> > +    VirtIOPstore *s = opaque;
> > +
> > +    g_free(s->directory);
> > +    s->directory = NULL;
> > +}
> > +
> > +static void pstore_get_bufsize(Object *obj, Visitor *v,
> > +                               const char *name, void *opaque,
> > +                               Error **errp)
> > +{
> > +    VirtIOPstore *s = opaque;
> > +    uint64_t value = s->bufsize;
> > +
> > +    visit_type_size(v, name, &value, errp);
> > +}
> > +
> > +static void pstore_set_bufsize(Object *obj, Visitor *v,
> > +                               const char *name, void *opaque,
> > +                               Error **errp)
> > +{
> > +    VirtIOPstore *s = opaque;
> > +    Error *error = NULL;
> > +    uint64_t value;
> > +
> > +    visit_type_size(v, name, &value, &error);
> > +    if (error) {
> > +        error_propagate(errp, error);
> > +        return;
> > +    }
> > +
> > +    if (value < 4096) {
> > +        error_setg(&error, "Warning: too small buffer size: %"PRIu64, value);
> > +        error_propagate(errp, error);
> > +        return;
> > +    }
> > +
> > +    s->bufsize = value;
> > +}
> > +
> > +static void pstore_get_file_max(Object *obj, Visitor *v,
> > +                                const char *name, void *opaque,
> > +                                Error **errp)
> > +{
> > +    VirtIOPstore *s = opaque;
> > +    int64_t value = s->file_max;
> > +
> > +    visit_type_int(v, name, &value, errp);
> > +}
> > +
> > +static void pstore_set_file_max(Object *obj, Visitor *v,
> > +                                const char *name, void *opaque,
> > +                                Error **errp)
> > +{
> > +    VirtIOPstore *s = opaque;
> > +    Error *error = NULL;
> > +    int64_t value;
> > +
> > +    visit_type_int(v, name, &value, &error);
> > +    if (error) {
> > +        error_propagate(errp, error);
> > +        return;
> > +    }
> > +
> > +    s->file_max = value;
> > +}
> 
> Do you need dynamic properties? There are easier ways
> to define an int property. Same for others.

It was due to my insufficient knowledge about the qemu code base.  I
don't think it needs to be dynamic.

Thanks,
Namhyung

> 
> > +
> > +static Property virtio_pstore_properties[] = {
> > +    DEFINE_PROP_END_OF_LIST(),
> > +};
> > +
> > +static void virtio_pstore_instance_init(Object *obj)
> > +{
> > +    VirtIOPstore *s = VIRTIO_PSTORE(obj);
> > +
> > +    object_property_add(obj, "directory", "str",
> > +                        pstore_get_directory, pstore_set_directory,
> > +                        pstore_release_directory, s, NULL);
> > +    object_property_add(obj, "bufsize", "size",
> > +                        pstore_get_bufsize, pstore_set_bufsize, NULL, s, NULL);
> > +    object_property_add(obj, "file-max", "int",
> > +                        pstore_get_file_max, pstore_set_file_max, NULL, s, NULL);
> > +}
> > +
> > +static void virtio_pstore_class_init(ObjectClass *klass, void *data)
> > +{
> > +    DeviceClass *dc = DEVICE_CLASS(klass);
> > +    VirtioDeviceClass *vdc = VIRTIO_DEVICE_CLASS(klass);
> > +
> > +    dc->props = virtio_pstore_properties;
> > +    set_bit(DEVICE_CATEGORY_MISC, dc->categories);
> > +    vdc->realize = virtio_pstore_device_realize;
> > +    vdc->unrealize = virtio_pstore_device_unrealize;
> > +    vdc->get_config = virtio_pstore_get_config;
> > +    vdc->set_config = virtio_pstore_set_config;
> > +    vdc->get_features = get_features;
> > +}
> > +
> > +static const TypeInfo virtio_pstore_info = {
> > +    .name = TYPE_VIRTIO_PSTORE,
> > +    .parent = TYPE_VIRTIO_DEVICE,
> > +    .instance_size = sizeof(VirtIOPstore),
> > +    .instance_init = virtio_pstore_instance_init,
> > +    .class_init = virtio_pstore_class_init,
> > +};
> > +
> > +static void virtio_register_types(void)
> > +{
> > +    type_register_static(&virtio_pstore_info);
> > +}
> > +
> > +type_init(virtio_register_types)
> > diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
> > index 929ec2f..b31774a 100644
> > --- a/include/hw/pci/pci.h
> > +++ b/include/hw/pci/pci.h
> > @@ -79,6 +79,7 @@
> >  #define PCI_DEVICE_ID_VIRTIO_SCSI        0x1004
> >  #define PCI_DEVICE_ID_VIRTIO_RNG         0x1005
> >  #define PCI_DEVICE_ID_VIRTIO_9P          0x1009
> > +#define PCI_DEVICE_ID_VIRTIO_PSTORE      0x100a
> >  
> >  #define PCI_VENDOR_ID_REDHAT             0x1b36
> >  #define PCI_DEVICE_ID_REDHAT_BRIDGE      0x0001
> > diff --git a/include/hw/virtio/virtio-pstore.h b/include/hw/virtio/virtio-pstore.h
> > new file mode 100644
> > index 0000000..85b1828
> > --- /dev/null
> > +++ b/include/hw/virtio/virtio-pstore.h
> > @@ -0,0 +1,36 @@
> > +/*
> > + * Virtio Pstore Support
> > + *
> > + * Authors:
> > + *  Namhyung Kim      <namhyung@gmail.com>
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2.  See
> > + * the COPYING file in the top-level directory.
> > + *
> > + */
> > +
> > +#ifndef _QEMU_VIRTIO_PSTORE_H
> > +#define _QEMU_VIRTIO_PSTORE_H
> > +
> > +#include "standard-headers/linux/virtio_pstore.h"
> > +#include "hw/virtio/virtio.h"
> > +#include "hw/pci/pci.h"
> > +
> > +#define TYPE_VIRTIO_PSTORE "virtio-pstore-device"
> > +#define VIRTIO_PSTORE(obj) \
> > +        OBJECT_CHECK(VirtIOPstore, (obj), TYPE_VIRTIO_PSTORE)
> > +
> > +typedef struct VirtIOPstore {
> > +    VirtIODevice    parent_obj;
> > +    VirtQueue      *rvq;
> > +    VirtQueue      *wvq;
> > +    char           *directory;
> > +    int             file_idx;
> > +    int             num_file;
> > +    struct dirent **files;
> > +    uint64_t        id;
> > +    uint64_t        bufsize;
> > +    uint64_t        file_max;
> > +} VirtIOPstore;
> > +
> > +#endif
> > diff --git a/include/standard-headers/linux/virtio_ids.h b/include/standard-headers/linux/virtio_ids.h
> > index 77925f5..c72a9ab 100644
> > --- a/include/standard-headers/linux/virtio_ids.h
> > +++ b/include/standard-headers/linux/virtio_ids.h
> > @@ -41,5 +41,6 @@
> >  #define VIRTIO_ID_CAIF	       12 /* Virtio caif */
> >  #define VIRTIO_ID_GPU          16 /* virtio GPU */
> >  #define VIRTIO_ID_INPUT        18 /* virtio input */
> > +#define VIRTIO_ID_PSTORE       22 /* virtio pstore */
> >  
> >  #endif /* _LINUX_VIRTIO_IDS_H */
> > diff --git a/include/standard-headers/linux/virtio_pstore.h b/include/standard-headers/linux/virtio_pstore.h
> > new file mode 100644
> > index 0000000..2f91839
> > --- /dev/null
> > +++ b/include/standard-headers/linux/virtio_pstore.h
> > @@ -0,0 +1,76 @@
> > +#ifndef _LINUX_VIRTIO_PSTORE_H
> > +#define _LINUX_VIRTIO_PSTORE_H
> > +/* This header is BSD licensed so anyone can use the definitions to implement
> > + * compatible drivers/servers.
> > + *
> > + * Redistribution and use in source and binary forms, with or without
> > + * modification, are permitted provided that the following conditions
> > + * are met:
> > + * 1. Redistributions of source code must retain the above copyright
> > + *    notice, this list of conditions and the following disclaimer.
> > + * 2. Redistributions in binary form must reproduce the above copyright
> > + *    notice, this list of conditions and the following disclaimer in the
> > + *    documentation and/or other materials provided with the distribution.
> > + * 3. Neither the name of IBM nor the names of its contributors
> > + *    may be used to endorse or promote products derived from this software
> > + *    without specific prior written permission.
> > + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS ``AS IS'' AND
> > + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
> > + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
> > + * ARE DISCLAIMED.  IN NO EVENT SHALL IBM OR CONTRIBUTORS BE LIABLE
> > + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
> > + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
> > + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
> > + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
> > + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
> > + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
> > + * SUCH DAMAGE. */
> > +#include "standard-headers/linux/types.h"
> > +#include "standard-headers/linux/virtio_types.h"
> > +#include "standard-headers/linux/virtio_ids.h"
> > +#include "standard-headers/linux/virtio_config.h"
> > +
> > +#define VIRTIO_PSTORE_CMD_NULL   0
> > +#define VIRTIO_PSTORE_CMD_OPEN   1
> > +#define VIRTIO_PSTORE_CMD_READ   2
> > +#define VIRTIO_PSTORE_CMD_WRITE  3
> > +#define VIRTIO_PSTORE_CMD_ERASE  4
> > +#define VIRTIO_PSTORE_CMD_CLOSE  5
> > +
> > +#define VIRTIO_PSTORE_TYPE_UNKNOWN  0
> > +#define VIRTIO_PSTORE_TYPE_DMESG    1
> > +
> > +#define VIRTIO_PSTORE_FL_COMPRESSED  1
> > +
> > +struct virtio_pstore_req {
> > +    __virtio16 cmd;
> > +    __virtio16 type;
> > +    __virtio32 flags;
> > +    __virtio64 id;
> > +    __virtio32 count;
> > +    __virtio32 reserved;
> > +};
> > +
> > +struct virtio_pstore_res {
> > +    __virtio16 cmd;
> > +    __virtio16 type;
> > +    __virtio32 ret;
> > +};
> > +
> > +struct virtio_pstore_fileinfo {
> > +    __virtio64 id;
> > +    __virtio32 count;
> > +    __virtio16 type;
> > +    __virtio16 unused;
> > +    __virtio32 flags;
> > +    __virtio32 len;
> > +    __virtio64 time_sec;
> > +    __virtio32 time_nsec;
> > +    __virtio32 reserved;
> > +};
> > +
> > +struct virtio_pstore_config {
> > +    __virtio32 bufsize;
> > +};
> > +
> > +#endif /* _LINUX_VIRTIO_PSTORE_H */
> > diff --git a/qdev-monitor.c b/qdev-monitor.c
> > index e19617f..e1df5a9 100644
> > --- a/qdev-monitor.c
> > +++ b/qdev-monitor.c
> > @@ -73,6 +73,7 @@ static const QDevAlias qdev_alias_table[] = {
> >      { "virtio-serial-pci", "virtio-serial", QEMU_ARCH_ALL & ~QEMU_ARCH_S390X },
> >      { "virtio-tablet-ccw", "virtio-tablet", QEMU_ARCH_S390X },
> >      { "virtio-tablet-pci", "virtio-tablet", QEMU_ARCH_ALL & ~QEMU_ARCH_S390X },
> > +    { "virtio-pstore-pci", "virtio-pstore" },
> >      { }
> >  };
> >  
> > -- 
> > 2.9.3

* Re: [PATCH 1/3] virtio: Basic implementation of virtio pstore driver
  2016-08-20  8:07 ` [PATCH 1/3] virtio: Basic implementation of virtio pstore driver Namhyung Kim
  2016-09-13 15:19   ` Michael S. Tsirkin
@ 2016-11-10 16:39   ` Michael S. Tsirkin
  2016-11-15  4:50     ` Namhyung Kim
  1 sibling, 1 reply; 31+ messages in thread
From: Michael S. Tsirkin @ 2016-11-10 16:39 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: virtio-dev, kvm, qemu-devel, virtualization, LKML, Paolo Bonzini,
	Radim Krčmář,
	Anthony Liguori, Anton Vorontsov, Colin Cross, Kees Cook,
	Tony Luck, Steven Rostedt, Ingo Molnar, Minchan Kim

On Sat, Aug 20, 2016 at 05:07:42PM +0900, Namhyung Kim wrote:
> The virtio pstore driver provides interface to the pstore subsystem so
> that the guest kernel's log/dump message can be saved on the host
> machine.  Users can access the log file directly on the host, or on the
> guest at the next boot using pstore filesystem.  It currently deals with
> kernel log (printk) buffer only, but we can extend it to have other
> information (like ftrace dump) later.
> 
> It supports legacy PCI device using single order-2 page buffer.

Do you mean a legacy virtio device? I don't see why
you would want to support pre-1.0 mode.
If you drop that, you can drop all cpu_to_virtio things
and just use __le accessors.
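
To illustrate the point, here is a minimal userspace sketch, not driver
code. It assumes a 1.0-only device, where every field is little-endian on
the wire regardless of the guest, so the conversion no longer needs the
vdev argument. The demo_* names are made up for this example; in the
kernel the equivalents would be cpu_to_le16() and le16_to_cpu() on __le16
fields.

```c
#include <stdint.h>

/* Hypothetical stand-in for the 'cmd' field of struct virtio_pstore_req. */
struct demo_req {
    uint8_t cmd_le[2];   /* always little-endian on the wire */
};

/* Store a CPU-order value as little-endian bytes (no device state needed). */
static void demo_put_le16(uint8_t out[2], uint16_t v)
{
    out[0] = (uint8_t)(v & 0xff);
    out[1] = (uint8_t)(v >> 8);
}

/* Load a little-endian 16-bit value back into CPU order. */
static uint16_t demo_get_le16(const uint8_t in[2])
{
    return (uint16_t)(in[0] | (in[1] << 8));
}
```

The helpers take no device pointer, which is the simplification
a modern-only driver would get.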

> It uses
> two virtqueues - one for (sync) read and another for (async) write.
> Since it cannot wait for write finished, it supports up to 128
> concurrent IO.  The buffer size is configurable now.
> 
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Radim Krčmář <rkrcmar@redhat.com>
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Anthony Liguori <aliguori@amazon.com>
> Cc: Anton Vorontsov <anton@enomsg.org>
> Cc: Colin Cross <ccross@android.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Tony Luck <tony.luck@intel.com>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: kvm@vger.kernel.org
> Cc: qemu-devel@nongnu.org
> Cc: virtualization@lists.linux-foundation.org
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  drivers/virtio/Kconfig             |  10 +
>  drivers/virtio/Makefile            |   1 +
>  drivers/virtio/virtio_pstore.c     | 417 +++++++++++++++++++++++++++++++++++++
>  include/uapi/linux/Kbuild          |   1 +
>  include/uapi/linux/virtio_ids.h    |   1 +
>  include/uapi/linux/virtio_pstore.h |  74 +++++++
>  6 files changed, 504 insertions(+)
>  create mode 100644 drivers/virtio/virtio_pstore.c
>  create mode 100644 include/uapi/linux/virtio_pstore.h
> 
> diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
> index 77590320d44c..8f0e6c796c12 100644
> --- a/drivers/virtio/Kconfig
> +++ b/drivers/virtio/Kconfig
> @@ -58,6 +58,16 @@ config VIRTIO_INPUT
>  
>  	 If unsure, say M.
>  
> +config VIRTIO_PSTORE
> +	tristate "Virtio pstore driver"
> +	depends on VIRTIO
> +	depends on PSTORE
> +	---help---
> +	 This driver supports virtio pstore devices to save/restore
> +	 panic and oops messages on the host.
> +
> +	 If unsure, say M.
> +
>   config VIRTIO_MMIO
>  	tristate "Platform bus driver for memory mapped virtio devices"
>  	depends on HAS_IOMEM && HAS_DMA
> diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile
> index 41e30e3dc842..bee68cb26d48 100644
> --- a/drivers/virtio/Makefile
> +++ b/drivers/virtio/Makefile
> @@ -5,3 +5,4 @@ virtio_pci-y := virtio_pci_modern.o virtio_pci_common.o
>  virtio_pci-$(CONFIG_VIRTIO_PCI_LEGACY) += virtio_pci_legacy.o
>  obj-$(CONFIG_VIRTIO_BALLOON) += virtio_balloon.o
>  obj-$(CONFIG_VIRTIO_INPUT) += virtio_input.o
> +obj-$(CONFIG_VIRTIO_PSTORE) += virtio_pstore.o
> diff --git a/drivers/virtio/virtio_pstore.c b/drivers/virtio/virtio_pstore.c
> new file mode 100644
> index 000000000000..0a63c7db4278
> --- /dev/null
> +++ b/drivers/virtio/virtio_pstore.c
> @@ -0,0 +1,417 @@
> +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> +
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/pstore.h>
> +#include <linux/virtio.h>
> +#include <linux/virtio_config.h>
> +#include <uapi/linux/virtio_ids.h>
> +#include <uapi/linux/virtio_pstore.h>
> +
> +#define VIRT_PSTORE_ORDER    2
> +#define VIRT_PSTORE_BUFSIZE  (4096 << VIRT_PSTORE_ORDER)
> +#define VIRT_PSTORE_NR_REQ   128
> +
> +struct virtio_pstore {
> +	struct virtio_device	*vdev;
> +	struct virtqueue	*vq[2];

I'd add named fields instead of an array here, vq[0]
vq[1] all over the place is hard to read.
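
Something along these lines, as a rough sketch only. The field names
read_vq and write_vq and the demo_* identifiers are assumptions for
illustration, and struct virtqueue is left opaque here. Since find_vqs()
fills an array, a local array can be copied into the named fields once at
init time.

```c
#include <stddef.h>

struct virtqueue;   /* opaque here; the real type lives in <linux/virtio.h> */

/* Hypothetical layout: the field names are assumptions, not from the patch. */
struct demo_virtio_pstore {
    struct virtqueue *read_vq;    /* was vq[0]: sync open/read/erase/close */
    struct virtqueue *write_vq;   /* was vq[1]: async write */
};

/* Copy the array filled by find_vqs() into the named fields. */
static void demo_assign_vqs(struct demo_virtio_pstore *vps,
                            struct virtqueue *vqs[2])
{
    vps->read_vq  = vqs[0];
    vps->write_vq = vqs[1];
}
```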

> +	struct pstore_info	 pstore;
> +	struct virtio_pstore_req req[VIRT_PSTORE_NR_REQ];
> +	struct virtio_pstore_res res[VIRT_PSTORE_NR_REQ];
> +	unsigned int		 req_id;
> +
> +	/* Waiting for host to ack */
> +	wait_queue_head_t	acked;
> +	int			failed;
> +};
> +
> +#define TYPE_TABLE_ENTRY(_entry)				\
> +	{ PSTORE_TYPE_##_entry, VIRTIO_PSTORE_TYPE_##_entry }
> +
> +struct type_table {
> +	int pstore;
> +	u16 virtio;
> +} type_table[] = {
> +	TYPE_TABLE_ENTRY(DMESG),
> +};
> +
> +#undef TYPE_TABLE_ENTRY

Let's avoid macros for now pls. In fact, I would just open-code this
in to_virtio_type below. We can always change our minds later if
lots of types are added.
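
An open-coded version could look roughly like the sketch below. It is a
standalone illustration: enum demo_pstore_type stands in for the kernel's
enum pstore_type_id, and the function returns a CPU-order value, leaving
the endian conversion to the caller.

```c
#include <stdint.h>

/* Values copied from the patch's virtio_pstore.h. */
#define VIRTIO_PSTORE_TYPE_UNKNOWN  0
#define VIRTIO_PSTORE_TYPE_DMESG    1

/* Stand-in for enum pstore_type_id; only the DMESG member matters here. */
enum demo_pstore_type {
    DEMO_PSTORE_TYPE_DMESG = 0,
    DEMO_PSTORE_TYPE_OTHER = 1,
};

/* No table, no macro: one case per supported type. */
static uint16_t demo_to_virtio_type(enum demo_pstore_type type)
{
    switch (type) {
    case DEMO_PSTORE_TYPE_DMESG:
        return VIRTIO_PSTORE_TYPE_DMESG;
    default:
        return VIRTIO_PSTORE_TYPE_UNKNOWN;
    }
}
```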

> +
> +

single empty line pls

> +static u16 to_virtio_type(struct virtio_pstore *vps, enum pstore_type_id type)
> +{
> +	unsigned int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(type_table); i++) {
> +		if (type == type_table[i].pstore)
> +			return cpu_to_virtio16(vps->vdev, type_table[i].virtio);
> +	}
> +
> +	return cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_TYPE_UNKNOWN);

This returns a __virtio16 value from a function declared as u16, so
sparse will warn if you enable endianness checks.
Pls fix that and generally, please make sure this is
clean from sparse warnings.
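
For reference, a small standalone sketch of how the typing is meant to
work. The demo_* names are invented for this example; the kernel's own
definitions are __bitwise, __force and __virtio16. The attributes are only
active when sparse runs (it defines __CHECKER__), and the sketch assumes a
little-endian wire format.

```c
#include <stdint.h>

#ifdef __CHECKER__
#define demo_bitwise __attribute__((bitwise))
#define demo_force   __attribute__((force))
#else
#define demo_bitwise
#define demo_force
#endif

/* A distinct type for wire-order values; sparse rejects mixing with u16. */
typedef uint16_t demo_bitwise demo_virtio16;

/* Convert CPU order to little-endian wire order. */
static demo_virtio16 demo_cpu_to_virtio16(uint16_t v)
{
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    v = (uint16_t)((v << 8) | (v >> 8));
#endif
    return (demo_force demo_virtio16)v;
}

/* The return type matches what is actually returned, so sparse stays quiet. */
static demo_virtio16 demo_to_virtio_type(int is_dmesg)
{
    return demo_cpu_to_virtio16(is_dmesg ? 1 : 0);
}
```

Running the build with C=1 and CF="-D__CHECK_ENDIAN__" should show the
remaining warnings, if any.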

> +}
> +
> +static enum pstore_type_id from_virtio_type(struct virtio_pstore *vps, u16 type)
> +{
> +	unsigned int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(type_table); i++) {
> +		if (virtio16_to_cpu(vps->vdev, type) == type_table[i].virtio)
> +			return type_table[i].pstore;
> +	}
> +
> +	return PSTORE_TYPE_UNKNOWN;
> +}
> +
> +static void virtpstore_ack(struct virtqueue *vq)
> +{
> +	struct virtio_pstore *vps = vq->vdev->priv;
> +
> +	wake_up(&vps->acked);
> +}
> +
> +static void virtpstore_check(struct virtqueue *vq)
> +{
> +	struct virtio_pstore *vps = vq->vdev->priv;
> +	struct virtio_pstore_res *res;
> +	unsigned int len;
> +
> +	res = virtqueue_get_buf(vq, &len);
> +	if (res == NULL)
> +		return;
> +
> +	if (virtio32_to_cpu(vq->vdev, res->ret) < 0)
> +		vps->failed = 1;
> +}
> +
> +static void virt_pstore_get_reqs(struct virtio_pstore *vps,
> +				 struct virtio_pstore_req **preq,
> +				 struct virtio_pstore_res **pres)
> +{
> +	unsigned int idx = vps->req_id++ % VIRT_PSTORE_NR_REQ;
> +
> +	*preq = &vps->req[idx];
> +	*pres = &vps->res[idx];
> +
> +	memset(*preq, 0, sizeof(**preq));
> +	memset(*pres, 0, sizeof(**pres));
> +}
> +
> +static int virt_pstore_open(struct pstore_info *psi)
> +{
> +	struct virtio_pstore *vps = psi->data;
> +	struct virtio_pstore_req *req;
> +	struct virtio_pstore_res *res;
> +	struct scatterlist sgo[1], sgi[1];
> +	struct scatterlist *sgs[2] = { sgo, sgi };
> +	unsigned int len;
> +
> +	virt_pstore_get_reqs(vps, &req, &res);
> +
> +	req->cmd = cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_CMD_OPEN);
> +
> +	sg_init_one(sgo, req, sizeof(*req));
> +	sg_init_one(sgi, res, sizeof(*res));
> +	virtqueue_add_sgs(vps->vq[0], sgs, 1, 1, vps, GFP_KERNEL);
> +	virtqueue_kick(vps->vq[0]);
> +
> +	wait_event(vps->acked, virtqueue_get_buf(vps->vq[0], &len));

Does this block userspace in an uninterruptible wait if
hardware is slow? That's not nice.

> +	return virtio32_to_cpu(vps->vdev, res->ret);
> +}
> +
> +static int virt_pstore_close(struct pstore_info *psi)
> +{
> +	struct virtio_pstore *vps = psi->data;
> +	struct virtio_pstore_req *req = &vps->req[vps->req_id];
> +	struct virtio_pstore_res *res = &vps->res[vps->req_id];
> +	struct scatterlist sgo[1], sgi[1];
> +	struct scatterlist *sgs[2] = { sgo, sgi };
> +	unsigned int len;
> +
> +	virt_pstore_get_reqs(vps, &req, &res);
> +
> +	req->cmd = cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_CMD_CLOSE);
> +
> +	sg_init_one(sgo, req, sizeof(*req));
> +	sg_init_one(sgi, res, sizeof(*res));
> +	virtqueue_add_sgs(vps->vq[0], sgs, 1, 1, vps, GFP_KERNEL);
> +	virtqueue_kick(vps->vq[0]);
> +
> +	wait_event(vps->acked, virtqueue_get_buf(vps->vq[0], &len));
> +	return virtio32_to_cpu(vps->vdev, res->ret);
> +}
> +
> +static ssize_t virt_pstore_read(u64 *id, enum pstore_type_id *type,
> +				int *count, struct timespec *time,
> +				char **buf, bool *compressed,
> +				ssize_t *ecc_notice_size,
> +				struct pstore_info *psi)
> +{
> +	struct virtio_pstore *vps = psi->data;
> +	struct virtio_pstore_req *req;
> +	struct virtio_pstore_res *res;
> +	struct virtio_pstore_fileinfo info;
> +	struct scatterlist sgo[1], sgi[3];
> +	struct scatterlist *sgs[2] = { sgo, sgi };
> +	unsigned int len;
> +	unsigned int flags;
> +	int ret;
> +	void *bf;
> +
> +	virt_pstore_get_reqs(vps, &req, &res);
> +
> +	req->cmd = cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_CMD_READ);
> +
> +	sg_init_one(sgo, req, sizeof(*req));
> +	sg_init_table(sgi, 3);
> +	sg_set_buf(&sgi[0], res, sizeof(*res));
> +	sg_set_buf(&sgi[1], &info, sizeof(info));
> +	sg_set_buf(&sgi[2], psi->buf, psi->bufsize);
> +	virtqueue_add_sgs(vps->vq[0], sgs, 1, 1, vps, GFP_KERNEL);
> +	virtqueue_kick(vps->vq[0]);
> +
> +	wait_event(vps->acked, virtqueue_get_buf(vps->vq[0], &len));
> +	if (len < sizeof(*res) + sizeof(info))
> +		return -1;
> +
> +	ret = virtio32_to_cpu(vps->vdev, res->ret);
> +	if (ret < 0)
> +		return ret;
> +
> +	len = virtio32_to_cpu(vps->vdev, info.len);
> +
> +	bf = kmalloc(len, GFP_KERNEL);
> +	if (bf == NULL)
> +		return -ENOMEM;
> +
> +	*id    = virtio64_to_cpu(vps->vdev, info.id);
> +	*type  = from_virtio_type(vps, info.type);
> +	*count = virtio32_to_cpu(vps->vdev, info.count);
> +
> +	flags = virtio32_to_cpu(vps->vdev, info.flags);
> +	*compressed = flags & VIRTIO_PSTORE_FL_COMPRESSED;
> +
> +	time->tv_sec  = virtio64_to_cpu(vps->vdev, info.time_sec);
> +	time->tv_nsec = virtio32_to_cpu(vps->vdev, info.time_nsec);
> +
> +	memcpy(bf, psi->buf, len);
> +	*buf = bf;
> +
> +	return len;
> +}
> +
> +static int notrace virt_pstore_write(enum pstore_type_id type,
> +				     enum kmsg_dump_reason reason,
> +				     u64 *id, unsigned int part, int count,
> +				     bool compressed, size_t size,
> +				     struct pstore_info *psi)
> +{
> +	struct virtio_pstore *vps = psi->data;
> +	struct virtio_pstore_req *req;
> +	struct virtio_pstore_res *res;
> +	struct scatterlist sgo[2], sgi[1];
> +	struct scatterlist *sgs[2] = { sgo, sgi };
> +	unsigned int flags = compressed ? VIRTIO_PSTORE_FL_COMPRESSED : 0;
> +
> +	if (vps->failed)
> +		return -1;
> +
> +	*id = vps->req_id;
> +	virt_pstore_get_reqs(vps, &req, &res);
> +
> +	req->cmd   = cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_CMD_WRITE);
> +	req->type  = to_virtio_type(vps, type);
> +	req->flags = cpu_to_virtio32(vps->vdev, flags);
> +
> +	sg_init_table(sgo, 2);
> +	sg_set_buf(&sgo[0], req, sizeof(*req));
> +	sg_set_buf(&sgo[1], pstore_get_buf(psi), size);
> +	sg_init_one(sgi, res, sizeof(*res));
> +	virtqueue_add_sgs(vps->vq[1], sgs, 1, 1, vps, GFP_ATOMIC);
> +	virtqueue_kick(vps->vq[1]);
> +
> +	return 0;
> +}
> +
> +static int virt_pstore_erase(enum pstore_type_id type, u64 id, int count,
> +			     struct timespec time, struct pstore_info *psi)
> +{
> +	struct virtio_pstore *vps = psi->data;
> +	struct virtio_pstore_req *req;
> +	struct virtio_pstore_res *res;
> +	struct scatterlist sgo[1], sgi[1];
> +	struct scatterlist *sgs[2] = { sgo, sgi };
> +	unsigned int len;
> +
> +	virt_pstore_get_reqs(vps, &req, &res);
> +
> +	req->cmd   = cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_CMD_ERASE);
> +	req->type  = to_virtio_type(vps, type);
> +	req->id	   = cpu_to_virtio64(vps->vdev, id);
> +	req->count = cpu_to_virtio32(vps->vdev, count);
> +
> +	sg_init_one(sgo, req, sizeof(*req));
> +	sg_init_one(sgi, res, sizeof(*res));
> +	virtqueue_add_sgs(vps->vq[0], sgs, 1, 1, vps, GFP_KERNEL);
> +	virtqueue_kick(vps->vq[0]);
> +
> +	wait_event(vps->acked, virtqueue_get_buf(vps->vq[0], &len));
> +	return virtio32_to_cpu(vps->vdev, res->ret);
> +}
> +
> +static int virt_pstore_init(struct virtio_pstore *vps)
> +{
> +	struct pstore_info *psinfo = &vps->pstore;
> +	int err;
> +
> +	if (!psinfo->bufsize)
> +		psinfo->bufsize = VIRT_PSTORE_BUFSIZE;
> +
> +	psinfo->buf = alloc_pages_exact(psinfo->bufsize, GFP_KERNEL);
> +	if (!psinfo->buf) {
> +		pr_err("cannot allocate pstore buffer\n");
> +		return -ENOMEM;
> +	}
> +
> +	psinfo->owner = THIS_MODULE;
> +	psinfo->name  = "virtio";
> +	psinfo->open  = virt_pstore_open;
> +	psinfo->close = virt_pstore_close;
> +	psinfo->read  = virt_pstore_read;
> +	psinfo->erase = virt_pstore_erase;
> +	psinfo->write = virt_pstore_write;
> +	psinfo->flags = PSTORE_FLAGS_DMESG;
> +
> +	psinfo->data  = vps;
> +	spin_lock_init(&psinfo->buf_lock);
> +
> +	err = pstore_register(psinfo);
> +	if (err)
> +		kfree(psinfo->buf);
> +
> +	return err;
> +}
> +
> +static int virt_pstore_exit(struct virtio_pstore *vps)
> +{
> +	struct pstore_info *psinfo = &vps->pstore;
> +
> +	pstore_unregister(psinfo);
> +
> +	free_pages_exact(psinfo->buf, psinfo->bufsize);
> +	psinfo->buf = NULL;
> +	psinfo->bufsize = 0;
> +
> +	return 0;
> +}
> +
> +static int virtpstore_init_vqs(struct virtio_pstore *vps)
> +{
> +	vq_callback_t *callbacks[] = { virtpstore_ack, virtpstore_check };
> +	const char *names[] = { "pstore_read", "pstore_write" };
> +
> +	return vps->vdev->config->find_vqs(vps->vdev, 2, vps->vq,
> +					   callbacks, names);
> +}
> +
> +static void virtpstore_init_config(struct virtio_pstore *vps)
> +{
> +	u32 bufsize;
> +
> +	virtio_cread(vps->vdev, struct virtio_pstore_config, bufsize, &bufsize);
> +
> +	vps->pstore.bufsize = PAGE_ALIGN(bufsize);
> +}
> +
> +static void virtpstore_confirm_config(struct virtio_pstore *vps)
> +{
> +	u32 bufsize = vps->pstore.bufsize;
> +
> +	virtio_cwrite(vps->vdev, struct virtio_pstore_config, bufsize,
> +		     &bufsize);
> +}
> +
> +static int virtpstore_probe(struct virtio_device *vdev)
> +{
> +	struct virtio_pstore *vps;
> +	int err;
> +
> +	if (!vdev->config->get) {
> +		dev_err(&vdev->dev, "driver init: config access disabled\n");
> +		return -EINVAL;
> +	}
> +
> +	vdev->priv = vps = kzalloc(sizeof(*vps), GFP_KERNEL);
> +	if (!vps) {
> +		err = -ENOMEM;
> +		goto out;
> +	}
> +	vps->vdev = vdev;
> +
> +	err = virtpstore_init_vqs(vps);
> +	if (err < 0)
> +		goto out_free;
> +
> +	virtpstore_init_config(vps);
> +
> +	err = virt_pstore_init(vps);
> +	if (err)
> +		goto out_del_vq;
> +
> +	virtpstore_confirm_config(vps);
> +
> +	init_waitqueue_head(&vps->acked);
> +
> +	virtio_device_ready(vdev);
> +
> +	dev_info(&vdev->dev, "driver init: ok (bufsize = %luK, flags = %x)\n",
> +		 vps->pstore.bufsize >> 10, vps->pstore.flags);
> +
> +	return 0;
> +
> +out_del_vq:
> +	vdev->config->del_vqs(vdev);
> +out_free:
> +	kfree(vps);
> +out:
> +	dev_err(&vdev->dev, "driver init: failed with %d\n", err);
> +	return err;
> +}
> +
> +static void virtpstore_remove(struct virtio_device *vdev)
> +{
> +	struct virtio_pstore *vps = vdev->priv;
> +
> +	virt_pstore_exit(vps);
> +
> +	/* Now we reset the device so we can clean up the queues. */
> +	vdev->config->reset(vdev);
> +
> +	vdev->config->del_vqs(vdev);
> +
> +	kfree(vps);
> +}
> +
> +static unsigned int features[] = {
> +};
> +
> +static struct virtio_device_id id_table[] = {
> +	{ VIRTIO_ID_PSTORE, VIRTIO_DEV_ANY_ID },
> +	{ 0 },
> +};
> +
> +static struct virtio_driver virtio_pstore_driver = {
> +	.driver.name         = KBUILD_MODNAME,
> +	.driver.owner        = THIS_MODULE,
> +	.feature_table       = features,
> +	.feature_table_size  = ARRAY_SIZE(features),
> +	.id_table            = id_table,
> +	.probe               = virtpstore_probe,
> +	.remove              = virtpstore_remove,
> +};
> +
> +module_virtio_driver(virtio_pstore_driver);
> +MODULE_DEVICE_TABLE(virtio, id_table);
> +
> +MODULE_LICENSE("GPL");
> +MODULE_AUTHOR("Namhyung Kim <namhyung@kernel.org>");
> +MODULE_DESCRIPTION("Virtio pstore driver");
> diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild
> index 6d4e92ccdc91..9bbb1554d8b2 100644
> --- a/include/uapi/linux/Kbuild
> +++ b/include/uapi/linux/Kbuild
> @@ -449,6 +449,7 @@ header-y += virtio_ids.h
>  header-y += virtio_input.h
>  header-y += virtio_net.h
>  header-y += virtio_pci.h
> +header-y += virtio_pstore.h
>  header-y += virtio_ring.h
>  header-y += virtio_rng.h
>  header-y += virtio_scsi.h
> diff --git a/include/uapi/linux/virtio_ids.h b/include/uapi/linux/virtio_ids.h
> index 77925f587b15..c72a9ab588c0 100644
> --- a/include/uapi/linux/virtio_ids.h
> +++ b/include/uapi/linux/virtio_ids.h
> @@ -41,5 +41,6 @@
>  #define VIRTIO_ID_CAIF	       12 /* Virtio caif */
>  #define VIRTIO_ID_GPU          16 /* virtio GPU */
>  #define VIRTIO_ID_INPUT        18 /* virtio input */
> +#define VIRTIO_ID_PSTORE       22 /* virtio pstore */
>  
>  #endif /* _LINUX_VIRTIO_IDS_H */
> diff --git a/include/uapi/linux/virtio_pstore.h b/include/uapi/linux/virtio_pstore.h
> new file mode 100644
> index 000000000000..f4b0d204d8ae
> --- /dev/null
> +++ b/include/uapi/linux/virtio_pstore.h
> @@ -0,0 +1,74 @@
> +#ifndef _LINUX_VIRTIO_PSTORE_H
> +#define _LINUX_VIRTIO_PSTORE_H
> +/* This header is BSD licensed so anyone can use the definitions to implement
> + * compatible drivers/servers.
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions
> + * are met:
> + * 1. Redistributions of source code must retain the above copyright
> + *    notice, this list of conditions and the following disclaimer.
> + * 2. Redistributions in binary form must reproduce the above copyright
> + *    notice, this list of conditions and the following disclaimer in the
> + *    documentation and/or other materials provided with the distribution.
> + * 3. Neither the name of IBM nor the names of its contributors
> + *    may be used to endorse or promote products derived from this software
> + *    without specific prior written permission.
> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS ``AS IS'' AND
> + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
> + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
> + * ARE DISCLAIMED.  IN NO EVENT SHALL IBM OR CONTRIBUTORS BE LIABLE
> + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
> + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
> + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
> + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
> + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
> + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
> + * SUCH DAMAGE. */
> +#include <linux/types.h>
> +#include <linux/virtio_types.h>
> +
> +#define VIRTIO_PSTORE_CMD_NULL   0
> +#define VIRTIO_PSTORE_CMD_OPEN   1
> +#define VIRTIO_PSTORE_CMD_READ   2
> +#define VIRTIO_PSTORE_CMD_WRITE  3
> +#define VIRTIO_PSTORE_CMD_ERASE  4
> +#define VIRTIO_PSTORE_CMD_CLOSE  5
> +
> +#define VIRTIO_PSTORE_TYPE_UNKNOWN  0
> +#define VIRTIO_PSTORE_TYPE_DMESG    1
> +
> +#define VIRTIO_PSTORE_FL_COMPRESSED  1
> +
> +struct virtio_pstore_req {
> +	__virtio16		cmd;
> +	__virtio16		type;
> +	__virtio32		flags;
> +	__virtio64		id;
> +	__virtio32		count;
> +	__virtio32		reserved;
> +};
> +
> +struct virtio_pstore_res {
> +	__virtio16		cmd;
> +	__virtio16		type;
> +	__virtio32		ret;
> +};
> +
> +struct virtio_pstore_fileinfo {
> +	__virtio64		id;
> +	__virtio32		count;
> +	__virtio16		type;
> +	__virtio16		unused;
> +	__virtio32		flags;
> +	__virtio32		len;
> +	__virtio64		time_sec;
> +	__virtio32		time_nsec;
> +	__virtio32		reserved;
> +};
> +
> +struct virtio_pstore_config {
> +	__virtio32		bufsize;
> +};
> +

What exactly does each field mean? I'm especially
interested in time fields - maintaining a consistent
time between host and guest is not a simple problem.

> +#endif /* _LINUX_VIRTIO_PSTORE_H */
> -- 
> 2.9.3

* Re: [PATCH 2/3] qemu: Implement virtio-pstore device
  2016-09-16 10:05     ` Namhyung Kim
@ 2016-11-10 22:50       ` Michael S. Tsirkin
  2016-11-15  6:23         ` Namhyung Kim
  0 siblings, 1 reply; 31+ messages in thread
From: Michael S. Tsirkin @ 2016-11-10 22:50 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: virtio-dev, kvm, qemu-devel, virtualization, LKML, Paolo Bonzini,
	Radim Krčmář,
	Anthony Liguori, Anton Vorontsov, Colin Cross, Kees Cook,
	Tony Luck, Steven Rostedt, Ingo Molnar, Minchan Kim,
	Daniel P . Berrange

On Fri, Sep 16, 2016 at 07:05:47PM +0900, Namhyung Kim wrote:
> On Tue, Sep 13, 2016 at 06:57:10PM +0300, Michael S. Tsirkin wrote:
> > On Sat, Aug 20, 2016 at 05:07:43PM +0900, Namhyung Kim wrote:
> > > Add virtio pstore device to allow kernel log files saved on the host.
> > > It will save the log files on the directory given by pstore device
> > > option.
> > > 
> > >   $ qemu-system-x86_64 -device virtio-pstore,directory=dir-xx ...
> > > 
> > >   (guest) # echo c > /proc/sysrq-trigger
> > > 
> > >   $ ls dir-xx
> > >   dmesg-1.enc.z  dmesg-2.enc.z
> > > 
> > > The log files are usually compressed using zlib.  Users can see the log
> > > messages directly on the host or on the guest (using pstore filesystem).
> > > 
> > > The 'directory' property is required for virtio-pstore device to work.
> > > It also adds 'bufsize' property to set size of pstore bufer.
> > > 
> > > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > > Cc: Radim Krčmář <rkrcmar@redhat.com>
> > > Cc: "Michael S. Tsirkin" <mst@redhat.com>
> > > Cc: Anthony Liguori <aliguori@amazon.com>
> > > Cc: Anton Vorontsov <anton@enomsg.org>
> > > Cc: Colin Cross <ccross@android.com>
> > > Cc: Kees Cook <keescook@chromium.org>
> > > Cc: Tony Luck <tony.luck@intel.com>
> > > Cc: Steven Rostedt <rostedt@goodmis.org>
> > > Cc: Ingo Molnar <mingo@kernel.org>
> > > Cc: Minchan Kim <minchan@kernel.org>
> > > Cc: Daniel P. Berrange <berrange@redhat.com>
> > > Cc: kvm@vger.kernel.org
> > > Cc: qemu-devel@nongnu.org
> > > Cc: virtualization@lists.linux-foundation.org
> > > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> > > ---
> > >  hw/virtio/Makefile.objs                        |   2 +-
> > >  hw/virtio/virtio-pci.c                         |  52 ++
> > >  hw/virtio/virtio-pci.h                         |  14 +
> > >  hw/virtio/virtio-pstore.c                      | 699 +++++++++++++++++++++++++
> > >  include/hw/pci/pci.h                           |   1 +
> > >  include/hw/virtio/virtio-pstore.h              |  36 ++
> > >  include/standard-headers/linux/virtio_ids.h    |   1 +
> > >  include/standard-headers/linux/virtio_pstore.h |  76 +++
> > >  qdev-monitor.c                                 |   1 +
> > >  9 files changed, 881 insertions(+), 1 deletion(-)
> > >  create mode 100644 hw/virtio/virtio-pstore.c
> > >  create mode 100644 include/hw/virtio/virtio-pstore.h
> > >  create mode 100644 include/standard-headers/linux/virtio_pstore.h
> > > 
> > > diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs
> > > index 3e2b175..aae7082 100644
> > > --- a/hw/virtio/Makefile.objs
> > > +++ b/hw/virtio/Makefile.objs
> > > @@ -4,4 +4,4 @@ common-obj-y += virtio-bus.o
> > >  common-obj-y += virtio-mmio.o
> > >  
> > >  obj-y += virtio.o virtio-balloon.o 
> > > -obj-$(CONFIG_LINUX) += vhost.o vhost-backend.o vhost-user.o
> > > +obj-$(CONFIG_LINUX) += vhost.o vhost-backend.o vhost-user.o virtio-pstore.o
> > > diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> > > index 755f921..c184823 100644
> > > --- a/hw/virtio/virtio-pci.c
> > > +++ b/hw/virtio/virtio-pci.c
> > > @@ -2416,6 +2416,57 @@ static const TypeInfo virtio_host_pci_info = {
> > >  };
> > >  #endif
> > >  
> > > +/* virtio-pstore-pci */
> > > +
> > > +static void virtio_pstore_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
> > > +{
> > > +    VirtIOPstorePCI *vps = VIRTIO_PSTORE_PCI(vpci_dev);
> > > +    DeviceState *vdev = DEVICE(&vps->vdev);
> > > +    Error *err = NULL;
> > > +
> > > +    qdev_set_parent_bus(vdev, BUS(&vpci_dev->bus));
> > > +    object_property_set_bool(OBJECT(vdev), true, "realized", &err);
> > > +    if (err) {
> > > +        error_propagate(errp, err);
> > > +        return;
> > > +    }
> > > +}
> > > +
> > > +static void virtio_pstore_pci_class_init(ObjectClass *klass, void *data)
> > > +{
> > > +    DeviceClass *dc = DEVICE_CLASS(klass);
> > > +    VirtioPCIClass *k = VIRTIO_PCI_CLASS(klass);
> > > +    PCIDeviceClass *pcidev_k = PCI_DEVICE_CLASS(klass);
> > > +
> > > +    k->realize = virtio_pstore_pci_realize;
> > > +    set_bit(DEVICE_CATEGORY_MISC, dc->categories);
> > > +
> > > +    pcidev_k->vendor_id = PCI_VENDOR_ID_REDHAT_QUMRANET;
> > > +    pcidev_k->device_id = PCI_DEVICE_ID_VIRTIO_PSTORE;
> > > +    pcidev_k->revision = VIRTIO_PCI_ABI_VERSION;
> > > +    pcidev_k->class_id = PCI_CLASS_OTHERS;
> > > +}
> > > +
> > > +static void virtio_pstore_pci_instance_init(Object *obj)
> > > +{
> > > +    VirtIOPstorePCI *dev = VIRTIO_PSTORE_PCI(obj);
> > > +
> > > +    virtio_instance_init_common(obj, &dev->vdev, sizeof(dev->vdev),
> > > +                                TYPE_VIRTIO_PSTORE);
> > > +    object_property_add_alias(obj, "directory", OBJECT(&dev->vdev),
> > > +                              "directory", &error_abort);
> > > +    object_property_add_alias(obj, "bufsize", OBJECT(&dev->vdev),
> > > +                              "bufsize", &error_abort);
> > > +}
> > > +
> > > +static const TypeInfo virtio_pstore_pci_info = {
> > > +    .name          = TYPE_VIRTIO_PSTORE_PCI,
> > > +    .parent        = TYPE_VIRTIO_PCI,
> > > +    .instance_size = sizeof(VirtIOPstorePCI),
> > > +    .instance_init = virtio_pstore_pci_instance_init,
> > > +    .class_init    = virtio_pstore_pci_class_init,
> > > +};
> > > +
> > >  /* virtio-pci-bus */
> > >  
> > >  static void virtio_pci_bus_new(VirtioBusState *bus, size_t bus_size,
> > > @@ -2485,6 +2536,7 @@ static void virtio_pci_register_types(void)
> > >  #ifdef CONFIG_VHOST_SCSI
> > >      type_register_static(&vhost_scsi_pci_info);
> > >  #endif
> > > +    type_register_static(&virtio_pstore_pci_info);
> > >  }
> > >  
> > >  type_init(virtio_pci_register_types)
> > > diff --git a/hw/virtio/virtio-pci.h b/hw/virtio/virtio-pci.h
> > > index 25fbf8a..354b2b7 100644
> > > --- a/hw/virtio/virtio-pci.h
> > > +++ b/hw/virtio/virtio-pci.h
> > > @@ -31,6 +31,7 @@
> > >  #ifdef CONFIG_VHOST_SCSI
> > >  #include "hw/virtio/vhost-scsi.h"
> > >  #endif
> > > +#include "hw/virtio/virtio-pstore.h"
> > >  
> > >  typedef struct VirtIOPCIProxy VirtIOPCIProxy;
> > >  typedef struct VirtIOBlkPCI VirtIOBlkPCI;
> > > @@ -44,6 +45,7 @@ typedef struct VirtIOInputPCI VirtIOInputPCI;
> > >  typedef struct VirtIOInputHIDPCI VirtIOInputHIDPCI;
> > >  typedef struct VirtIOInputHostPCI VirtIOInputHostPCI;
> > >  typedef struct VirtIOGPUPCI VirtIOGPUPCI;
> > > +typedef struct VirtIOPstorePCI VirtIOPstorePCI;
> > >  
> > >  /* virtio-pci-bus */
> > >  
> > > @@ -324,6 +326,18 @@ struct VirtIOGPUPCI {
> > >      VirtIOGPU vdev;
> > >  };
> > >  
> > > +/*
> > > + * virtio-pstore-pci: This extends VirtioPCIProxy.
> > > + */
> > > +#define TYPE_VIRTIO_PSTORE_PCI "virtio-pstore-pci"
> > > +#define VIRTIO_PSTORE_PCI(obj) \
> > > +        OBJECT_CHECK(VirtIOPstorePCI, (obj), TYPE_VIRTIO_PSTORE_PCI)
> > > +
> > > +struct VirtIOPstorePCI {
> > > +    VirtIOPCIProxy parent_obj;
> > > +    VirtIOPstore vdev;
> > > +};
> > > +
> > >  /* Virtio ABI version, if we increment this, we break the guest driver. */
> > >  #define VIRTIO_PCI_ABI_VERSION          0
> > >  
> > > diff --git a/hw/virtio/virtio-pstore.c b/hw/virtio/virtio-pstore.c
> > > new file mode 100644
> > > index 0000000..b8fb4be
> > > --- /dev/null
> > > +++ b/hw/virtio/virtio-pstore.c
> > > @@ -0,0 +1,699 @@
> > > +/*
> > > + * Virtio Pstore Device
> > > + *
> > > + * Copyright (C) 2016  LG Electronics
> > > + *
> > > + * Authors:
> > > + *  Namhyung Kim  <namhyung@gmail.com>
> > > + *
> > > + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> > > + * See the COPYING file in the top-level directory.
> > > + *
> > > + */
> > > +
> > > +#include <stdio.h>
> > > +
> > > +#include "qemu/osdep.h"
> > > +#include "qemu/iov.h"
> > > +#include "qemu-common.h"
> > > +#include "qemu/cutils.h"
> > > +#include "qemu/error-report.h"
> > > +#include "sysemu/kvm.h"
> > > +#include "qapi/visitor.h"
> > > +#include "qapi-event.h"
> > > +#include "io/channel-util.h"
> > > +#include "trace.h"
> > > +
> > > +#include "hw/virtio/virtio.h"
> > > +#include "hw/virtio/virtio-bus.h"
> > > +#include "hw/virtio/virtio-access.h"
> > > +#include "hw/virtio/virtio-pstore.h"
> > > +
> > > +#define PSTORE_DEFAULT_BUFSIZE   (16 * 1024)
> > > +#define PSTORE_DEFAULT_FILE_MAX  5
> > > +
> > > +/* the index should match to the type value */
> > > +static const char *virtio_pstore_file_prefix[] = {
> > > +    "unknown-",		/* VIRTIO_PSTORE_TYPE_UNKNOWN */
> > 
> > Is there value in treating everything unexpected as "unknown"
> > and rotating them as if they were logs?
> > It might be better to treat everything that's not known
> > as guest error.
> 
> I was thinking about the version mismatch between the kernel and qemu.
> I'd like to make the device can deal with a new kernel version which
> might implement a new pstore message type.  It will be saved as
> unknown but the kernel can read it properly later.

Well it'll have a different prefix. E.g. if kernel has
two different types they will end up in the same
file, hardly what was wanted.
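
One possible shape for this, as a sketch under my own assumptions rather
than a finished design: keep a prefix table with named initializers and
return NULL for any type the device does not know, so the caller can fail
the request as a guest error instead of filing it under "unknown-". The
demo_* names are hypothetical.

```c
#include <stddef.h>

/* Values copied from the patch's virtio_pstore.h. */
#define VIRTIO_PSTORE_TYPE_UNKNOWN  0
#define VIRTIO_PSTORE_TYPE_DMESG    1

/* Named initializers tie each prefix to its type value. */
static const char *const demo_file_prefix[] = {
    [VIRTIO_PSTORE_TYPE_DMESG] = "dmesg-",
};

/* Returns NULL for types without a prefix, including out-of-range ones. */
static const char *demo_prefix_for_type(unsigned int type)
{
    size_t n = sizeof(demo_file_prefix) / sizeof(demo_file_prefix[0]);

    if (type >= n)
        return NULL;
    return demo_file_prefix[type];   /* NULL for holes such as type 0 */
}
```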


> > 
> > 
> > > +    "dmesg-",		/* VIRTIO_PSTORE_TYPE_DMESG */
> > 
> > use named initializers for this instead of comments.
> 
> Ok.
> 
> > 
> > > +};
> > > +
> > > +static char *virtio_pstore_to_filename(VirtIOPstore *s,
> > > +                                       struct virtio_pstore_req *req)
> > > +{
> > > +    const char *basename;
> > > +    unsigned long long id;
> > > +    unsigned int type = le16_to_cpu(req->type);
> > > +    unsigned int flags = le32_to_cpu(req->flags);
> > > +
> > > +    if (type < ARRAY_SIZE(virtio_pstore_file_prefix)) {
> > > +        basename = virtio_pstore_file_prefix[type];
> > > +    } else {
> > > +        basename = "unknown-";
> > > +    }
> > > +
> > > +    id = s->id++;
> > > +    return g_strdup_printf("%s/%s%llu%s", s->directory, basename, id,
> > > +                            flags & VIRTIO_PSTORE_FL_COMPRESSED ? ".enc.z" : "");
> > > +}
> > > +
> > > +static char *virtio_pstore_from_filename(VirtIOPstore *s, char *name,
> > > +                                         struct virtio_pstore_fileinfo *info)
> > > +{
> > > +    char *filename;
> > > +    unsigned int idx;
> > > +
> > > +    filename = g_strdup_printf("%s/%s", s->directory, name);
> > > +    if (filename == NULL)
> > > +        return NULL;
> > > +
> > > +    for (idx = 0; idx < ARRAY_SIZE(virtio_pstore_file_prefix); idx++) {
> > > +        if (g_str_has_prefix(name, virtio_pstore_file_prefix[idx])) {
> > > +            info->type = idx;
> > > +            name += strlen(virtio_pstore_file_prefix[idx]);
> > > +            break;
> > > +        }
> > > +    }
> > > +
> > > +    if (idx == ARRAY_SIZE(virtio_pstore_file_prefix)) {
> > > +        g_free(filename);
> > > +        return NULL;
> > > +    }
> > > +
> > > +    qemu_strtoull(name, NULL, 0, &info->id);
> > 
> > What if this fails?
> 
> Hmm.. will add a check for return value then.
> 
> > 
> > > +
> > > +    info->flags = 0;
> > > +    if (g_str_has_suffix(name, ".enc.z")) {
> > > +        info->flags |= VIRTIO_PSTORE_FL_COMPRESSED;
> > > +    }
> > > +
> > > +    return filename;
> > > +}
> > > +
> > > +static int prefix_idx;
> > > +static int prefix_count;
> > > +static int prefix_len;
> > This does not work properly if there are multiple instances
> > of it. Pls move everything into device state.
> 
> Kernel (currently?) allows only a single pstore device active.  But I
> think it'd be better to move them into device state anyway.
> 
> > 
> > > +
> > > +static int filter_pstore(const struct dirent *de)
> > > +{
> > > +    int i;
> > > +
> > > +    for (i = 0; i < prefix_count; i++) {
> > > +        const char *prefix = virtio_pstore_file_prefix[prefix_idx + i];
> > > +
> > > +        if (g_str_has_prefix(de->d_name, prefix)) {
> > > +            return 1;
> > > +        }
> > > +    }
> > > +    return 0;
> > > +}
> > > +
> > > +static int sort_pstore(const struct dirent **a, const struct dirent **b)
> > > +{
> > > +    uint64_t id_a, id_b;
> > > +
> > > +    qemu_strtoull((*a)->d_name + prefix_len, NULL, 0, &id_a);
> > > +    qemu_strtoull((*b)->d_name + prefix_len, NULL, 0, &id_b);
> > > +
> > > +    return id_a - id_b;
> > > +}
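
A side note on the comparison in sort_pstore(): the difference of two uint64_t ids is converted to int on return, so ids that differ by a multiple of 2^32 compare as equal and large gaps can come out with the wrong sign.  A minimal sketch of a comparison that avoids this follows; compare_u64() is a hypothetical name, not something in the patch.

```c
#include <stdint.h>

/* Three-way comparison without truncation: yields -1, 0 or 1. */
static int compare_u64(uint64_t a, uint64_t b)
{
    return (a > b) - (a < b);
}
```

The scandir() comparator could return compare_u64(id_a, id_b) in place of the subtraction.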
> > > +
> > > +static int rotate_pstore_file(VirtIOPstore *s, unsigned short type)
> > > +{
> > > +    int ret = 0;
> > > +    int i, num;
> > > +    char *filename;
> > > +    struct dirent **files;
> > > +
> > > +    if (type >= ARRAY_SIZE(virtio_pstore_file_prefix)) {
> > > +        type = VIRTIO_PSTORE_TYPE_UNKNOWN;
> > > +    }
> > > +
> > > +    prefix_idx = type;
> > > +    prefix_len = strlen(virtio_pstore_file_prefix[type]);
> > > +    prefix_count = 1;  /* only scan current type */
> > > +
> > > +    /* delete the oldest file in the same type */
> > > +    num = scandir(s->directory, &files, filter_pstore, sort_pstore);
> > > +    if (num < 0)
> > > +        return num;
> > > +    if (num < (int)s->file_max)
> > > +        goto out;
> > > +
> > > +    filename = g_strdup_printf("%s/%s", s->directory, files[0]->d_name);
> > > +    if (filename == NULL) {
> > > +        ret = -1;
> > > +        goto out;
> > > +    }
> > > +
> > > +    ret = unlink(filename);
> > > +
> > > +out:
> > > +    for (i = 0; i < num; i++) {
> > > +        g_free(files[i]);
> > > +    }
> > > +    g_free(files);
> > > +
> > > +    return ret;
> > > +}
> > 
> > Pls prefix everything with virtio_pstore or another
> > unique prefix. also below.
> 
> Ok.
> 
> > 
> > > +
> > > +static ssize_t virtio_pstore_do_open(VirtIOPstore *s)
> > > +{
> > > +    /* scan all pstore files */
> > > +    prefix_idx = 0;
> > > +    prefix_count = ARRAY_SIZE(virtio_pstore_file_prefix);
> > > +
> > > +    s->file_idx = 0;
> > > +    s->num_file = scandir(s->directory, &s->files, filter_pstore, alphasort);
> > > +
> > > +    return s->num_file >= 0 ? 0 : -1;
> > > +}
> > > +
> > > +static ssize_t virtio_pstore_do_close(VirtIOPstore *s)
> > > +{
> > > +    int i;
> > > +
> > > +    for (i = 0; i < s->num_file; i++) {
> > > +        g_free(s->files[i]);
> > > +    }
> > > +    g_free(s->files);
> > > +    s->files = NULL;
> > > +
> > > +    s->num_file = 0;
> > > +    return 0;
> > > +}
> > > +
> > > +static ssize_t virtio_pstore_do_erase(VirtIOPstore *s,
> > > +                                      struct virtio_pstore_req *req)
> > > +{
> > > +    char *filename;
> > > +    int ret;
> > > +
> > > +    filename = virtio_pstore_to_filename(s, req);
> > > +    if (filename == NULL)
> > > +        return -1;
> > 
> > this can't happen.
> 
> Why?  virtio_pstore_to_filename() calls g_strdup_printf().  Does that
> mean I don't need to worry about memory allocation failure?
> 
> > 
> > also this is a coding style violation.
> 
> Oh, I missed adding the {}; will fix.
> 
> > 
> > > +
> > > +    ret = unlink(filename);
> > > +
> > > +    g_free(filename);
> > > +    return ret;
> > > +}
> > > +
> > > +struct pstore_read_arg {
> > > +    VirtIOPstore *vps;
> > > +    VirtQueueElement *elem;
> > > +    struct virtio_pstore_fileinfo info;
> > > +    QIOChannel *ioc;
> > > +};
> > > +
> > > +static gboolean pstore_async_read_fn(QIOChannel *ioc, GIOCondition condition,
> > > +                                     gpointer data)
> > > +{
> > > +    struct pstore_read_arg *rarg = data;
> > > +    struct virtio_pstore_fileinfo *info = &rarg->info;
> > > +    VirtIOPstore *vps = rarg->vps;
> > > +    VirtQueueElement *elem = rarg->elem;
> > > +    struct virtio_pstore_res res;
> > > +    size_t offset = sizeof(res) + sizeof(*info);
> > > +    struct iovec *sg = elem->in_sg;
> > > +    unsigned int sg_num = elem->in_num;
> > > +    Error *err = NULL;
> > > +    ssize_t len;
> > > +    int ret;
> > > +
> > > +    /* skip res and fileinfo */
> > > +    iov_discard_front(&sg, &sg_num, sizeof(res) + sizeof(*info));
> > > +
> > > +    len = qio_channel_readv(rarg->ioc, sg, sg_num, &err);
> > > +    if (len < 0) {
> > > +        if (errno == EAGAIN) {
> > > +            len = 0;
> > > +        }
> > > +        ret = -1;
> > > +    } else {
> > > +        info->len = cpu_to_le32(len);
> > > +        ret = 0;
> > > +    }
> > > +
> > > +    res.cmd  = cpu_to_le16(VIRTIO_PSTORE_CMD_READ);
> > > +    res.type = cpu_to_le16(VIRTIO_PSTORE_TYPE_UNKNOWN);
> > > +    res.ret  = cpu_to_le32(ret);
> > > +
> > > +    /* now copy res and fileinfo */
> > > +    iov_from_buf(elem->in_sg, elem->in_num, 0, &res, sizeof(res));
> > > +    iov_from_buf(elem->in_sg, elem->in_num, sizeof(res), info, sizeof(*info));
> > > +
> > > +    len += offset;
> > > +    virtqueue_push(vps->rvq, elem, len);
> > > +    virtio_notify(VIRTIO_DEVICE(vps), vps->rvq);
> > > +
> > > +    return G_SOURCE_REMOVE;
> > > +}
> > > +
> > > +static void free_rarg_fn(gpointer data)
> > > +{
> > > +    struct pstore_read_arg *rarg = data;
> > > +
> > > +    qio_channel_close(rarg->ioc, NULL);
> > > +
> > > +    g_free(rarg->elem);
> > > +    g_free(rarg);
> > > +}
> > > +
> > > +static ssize_t virtio_pstore_do_read(VirtIOPstore *s, VirtQueueElement *elem)
> > > +{
> > > +    char *filename = NULL;
> > > +    int fd, idx;
> > > +    struct stat stbuf;
> > > +    struct pstore_read_arg *rarg = NULL;
> > > +    Error *err = NULL;
> > > +    int ret = -1;
> > > +
> > > +    if (s->file_idx >= s->num_file) {
> > > +        return 0;
> > > +    }
> > > +
> > > +    rarg = g_malloc(sizeof(*rarg));
> > > +    if (rarg == NULL) {
> > > +        return -1;
> > > +    }
> > > +
> > > +    idx = s->file_idx++;
> > > +    filename = virtio_pstore_from_filename(s, s->files[idx]->d_name,
> > > +                                           &rarg->info);
> > > +    if (filename == NULL) {
> > > +        goto out;
> > > +    }
> > > +
> > > +    fd = open(filename, O_RDONLY);
> > > +    if (fd < 0) {
> > > +        error_report("cannot open %s", filename);
> > > +        goto out;
> > > +    }
> > 
> > I see open here but close nowhere. Does this leak fds?
> 
> I guess so.  But this was changed to use the qio_channel_file API in v5,
> and I hope it is done right there.
> 
> > 
> > > +
> > > +    if (fstat(fd, &stbuf) < 0) {
> > 
> > So we can stat, but can we e.g. read?
> 
> It's just being paranoid; I think it should succeed, no?
> 
> > 
> > 
> > > +        goto out;
> > > +    }
> > > +
> > > +    rarg->vps            = s;
> > > +    rarg->elem           = elem;
> > > +    rarg->info.id        = cpu_to_le64(rarg->info.id);
> > > +    rarg->info.type      = cpu_to_le16(rarg->info.type);
> > > +    rarg->info.flags     = cpu_to_le32(rarg->info.flags);
> > > +    rarg->info.time_sec  = cpu_to_le64(stbuf.st_ctim.tv_sec);
> > 
> > Is this seconds since epoch?
> > Why ctim specifically?
> > Pls add comments.
> 
> I don't think it matters whether it's ctim or mtim.
> 
> > 
> > > +    rarg->info.time_nsec = cpu_to_le32(stbuf.st_ctim.tv_nsec);
> > 
> > Not all hosts support nanosecond precision.
> > Do we need some way to tell guest what's reliable?
> 
> In fact I'm not sure how much it affects users.  The pstore messages
> are occasional and AFAIK pstore keeps it only for users' information.
> 
> > 
> > Unless you limit this to linux host, you should care about things
> > like this (in man fstat)
> > 
> >            Since kernel 2.5.48, the stat structure supports nanosecond
> >     resolution for the three file timestamp fields.  The nanosecond
> >     components of each timestamp are available via names of the form
> >     st_atim.tv_nsec if the _BSD_SOURCE or _SVID_SOURCE feature test macro
> >     is defined.  Nanosecond timestamps are nowadays standardized,
> >     starting with POSIX.1-2008, and, starting with version 2.12, glibc also
> >     exposes the nanosecond component names if _POSIX_C_SOURCE is defined
> >     with the value 200809L or greater, or _XOPEN_SOURCE is defined with
> >     the value 700 or greater.  If none of the aforementioned macros are
> >     defined, then the nanosecond values are exposed with names of the form
> >     st_atimensec.
> 
> Thanks for the info.
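
As a rough sketch of what the man page text means in practice: with _POSIX_C_SOURCE set to 200809L the POSIX.1-2008 field names are available, and the timestamp can be read as below.  file_mtime() is a hypothetical helper, and using st_mtim instead of the patch's st_ctim is only an assumption for the example.

```c
#define _POSIX_C_SOURCE 200809L

#include <stdint.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: read a file's modification time as seconds
 * since the epoch plus a nanosecond part.  Returns 0 or -1.
 */
static int file_mtime(const char *path, int64_t *sec, uint32_t *nsec)
{
    struct stat st;

    if (stat(path, &st) < 0) {
        return -1;
    }

    *sec = (int64_t)st.st_mtim.tv_sec;
    /* filesystems with coarser granularity typically report 0 here */
    *nsec = (uint32_t)st.st_mtim.tv_nsec;
    return 0;
}
```

A host without these field names would need a fallback, for example reporting only the seconds and leaving the nanosecond part at zero.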
> 
> > 
> > 
> > 
> > 
> > > +
> > > +    rarg->ioc = qio_channel_new_fd(fd, &err);
> > > +    if (err) {
> > > +        error_reportf_err(err, "cannot create io channel: ");
> > > +        goto out;
> > > +    }
> > > +
> > > +    qio_channel_set_blocking(rarg->ioc, false, &err);
> > > +    qio_channel_add_watch(rarg->ioc, G_IO_IN, pstore_async_read_fn, rarg,
> > > +                          free_rarg_fn);
> > > +    g_free(filename);
> > > +    return 1;
> > > +
> > > +out:
> > > +    g_free(filename);
> > > +    g_free(rarg);
> > > +
> > > +    return ret;
> > > +}
> > > +
> > > +struct pstore_write_arg {
> > > +    VirtIOPstore *vps;
> > > +    VirtQueueElement *elem;
> > > +    struct virtio_pstore_req *req;
> > > +    QIOChannel *ioc;
> > > +};
> > > +
> > > +static gboolean pstore_async_write_fn(QIOChannel *ioc, GIOCondition condition,
> > > +                                      gpointer data)
> > > +{
> > > +    struct pstore_write_arg *warg = data;
> > > +    VirtIOPstore *vps = warg->vps;
> > > +    VirtQueueElement *elem = warg->elem;
> > > +    struct iovec *sg = elem->out_sg;
> > > +    unsigned int sg_num = elem->out_num;
> > > +    struct virtio_pstore_res res;
> > > +    Error *err = NULL;
> > > +    ssize_t len;
> > > +    int ret;
> > > +
> > > +    /* we already consumed the req */
> > > +    iov_discard_front(&sg, &sg_num, sizeof(*warg->req));
> > > +
> > > +    len = qio_channel_writev(warg->ioc, sg, sg_num, &err);
> > > +    if (len < 0) {
> > > +        ret = -1;
> > > +    } else {
> > > +        ret = 0;
> > > +    }
> > 
> > This can discard part of the data written.
> > Don't we care?
> 
> Doing a partial write is better than failing outright.  But if it's
> meaningful to add a retry loop, I'd like to do so.
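
For reference, the usual shape of such a retry loop with plain POSIX write() is sketched below.  write_all() is a hypothetical helper and works on a raw, blocking file descriptor; the patch itself goes through QIOChannel with a non-blocking channel, where a would-block result would additionally mean waiting for the watch to fire again.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: keep calling write() until all len bytes are
 * written.  Returns 0 on success, -1 on error (errno set by write()).
 */
static int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n < 0) {
            if (errno == EINTR) {
                continue;   /* interrupted before any data went out */
            }
            return -1;
        }
        /* short write: advance past what was accepted and try again */
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```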
> 
> > 
> > > +
> > > +    res.cmd  = cpu_to_le16(VIRTIO_PSTORE_CMD_WRITE);
> > > +    res.type = warg->req->type;
> > > +    res.ret  = cpu_to_le32(ret);
> > > +
> > > +    /* tell the result to guest */
> > > +    iov_from_buf(elem->in_sg, elem->in_num, 0, &res, sizeof(res));
> > > +
> > > +    virtqueue_push(vps->wvq, elem, sizeof(res));
> > > +    virtio_notify(VIRTIO_DEVICE(vps), vps->wvq);
> > > +
> > > +    return G_SOURCE_REMOVE;
> > > +}
> > > +
> > > +static void free_warg_fn(gpointer data)
> > > +{
> > > +    struct pstore_write_arg *warg = data;
> > > +
> > > +    qio_channel_close(warg->ioc, NULL);
> > > +
> > > +    g_free(warg->elem);
> > > +    g_free(warg);
> > > +}
> > > +
> > > +static ssize_t virtio_pstore_do_write(VirtIOPstore *s, VirtQueueElement *elem,
> > > +                                      struct virtio_pstore_req *req)
> > > +{
> > > +    unsigned short type = le16_to_cpu(req->type);
> > > +    char *filename = NULL;
> > > +    int fd;
> > > +    int flags = O_WRONLY | O_CREAT | O_TRUNC;
> > > +    struct pstore_write_arg *warg = NULL;
> > > +    Error *err = NULL;
> > > +    int ret = -1;
> > > +
> > > +    /* do not keep same type of files more than 'file-max' */
> > > +    rotate_pstore_file(s, type);
> > 
> > If you don't care about failures, should this function
> > return a value? How about reporting it to the user?
> 
> Did you mean when it fails to delete the oldest file?  (FYI it's not
> really a 'rotate'.)  Hmm, I will add an error check and report it.
> 
> > 
> > 
> > > +
> > > +    filename = virtio_pstore_to_filename(s, req);
> > > +    if (filename == NULL) {
> > > +        return -1;
> > > +    }
> > 
> > this can't happen
> > 
> > > +
> > > +    warg = g_malloc(sizeof(*warg));
> > > +    if (warg == NULL) {
> > > +        goto out;
> > > +    }
> > > +
> > > +    fd = open(filename, flags, 0644);
> > > +    if (fd < 0) {
> > > +        error_report("cannot open %s", filename);
> > > +        ret = fd;
> > > +        goto out;
> > > +    }
> > > +
> > > +    warg->vps            = s;
> > > +    warg->elem           = elem;
> > > +    warg->req            = req;
> > > +
> > > +    warg->ioc = qio_channel_new_fd(fd, &err);
> > > +    if (err) {
> > > +        error_reportf_err(err, "cannot create io channel: ");
> > > +        goto out;
> > > +    }
> > > +
> > > +    qio_channel_set_blocking(warg->ioc, false, &err);
> > > +    qio_channel_add_watch(warg->ioc, G_IO_OUT, pstore_async_write_fn, warg,
> > > +                          free_warg_fn);
> > > +    g_free(filename);
> > > +    return 1;
> > > +
> > > +out:
> > > +    g_free(filename);
> > > +    g_free(warg);
> > > +    return ret;
> > > +}
> > > +
> > > +static void virtio_pstore_handle_io(VirtIODevice *vdev, VirtQueue *vq)
> > > +{
> > > +    VirtIOPstore *s = VIRTIO_PSTORE(vdev);
> > > +    VirtQueueElement *elem;
> > > +    struct virtio_pstore_req req;
> > > +    struct virtio_pstore_res res;
> > > +    ssize_t len = 0;
> > > +    int ret;
> > > +
> > > +    for (;;) {
> > > +        elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
> > > +        if (!elem) {
> > > +            return;
> > > +        }
> > > +
> > > +        if (elem->out_num < 1 || elem->in_num < 1) {
> > > +            error_report("request or response buffer is missing");
> > > +            exit(1);
> > > +        }
> > > +
> > > +        if (elem->out_num > 2 || elem->in_num > 3) {
> > > +            error_report("invalid number of input/output buffer");
> > > +            exit(1);
> > > +        }
> > > +
> > > +        len = iov_to_buf(elem->out_sg, elem->out_num, 0, &req, sizeof(req));
> > > +        if (len != (ssize_t)sizeof(req)) {
> > > +            error_report("invalid request size: %ld", (long)len);
> > > +            exit(1);
> > > +        }
> > > +        res.cmd  = req.cmd;
> > > +        res.type = req.type;
> > > +
> > > +        switch (le16_to_cpu(req.cmd)) {
> > > +        case VIRTIO_PSTORE_CMD_OPEN:
> > > +            ret = virtio_pstore_do_open(s);
> > > +            break;
> > > +        case VIRTIO_PSTORE_CMD_CLOSE:
> > > +            ret = virtio_pstore_do_close(s);
> > > +            break;
> > > +        case VIRTIO_PSTORE_CMD_ERASE:
> > > +            ret = virtio_pstore_do_erase(s, &req);
> > > +            break;
> > > +        case VIRTIO_PSTORE_CMD_READ:
> > > +            ret = virtio_pstore_do_read(s, elem);
> > > +            if (ret == 1) {
> > > +                /* async channel io */
> > > +                continue;
> > > +            }
> > > +            break;
> > > +        case VIRTIO_PSTORE_CMD_WRITE:
> > > +            ret = virtio_pstore_do_write(s, elem, &req);
> > > +            if (ret == 1) {
> > > +                /* async channel io */
> > > +                continue;
> > > +            }
> > > +            break;
> > > +        default:
> > > +            ret = -1;
> > > +            break;
> > > +        }
> > > +
> > > +        res.ret = ret;
> > > +
> > > +        iov_from_buf(elem->in_sg, elem->in_num, 0, &res, sizeof(res));
> > > +        virtqueue_push(vq, elem, sizeof(res) + len);
> > > +
> > > +        virtio_notify(vdev, vq);
> > > +        g_free(elem);
> > > +
> > > +        if (ret < 0) {
> > > +            return;
> > 
> > what does this do?
> 
> If any processing fails, it reports the failure to the kernel and stops
> processing later commands.  The kernel won't send the same kind of
> command afterwards.
> 
> > 
> > > +        }
> > > +    }
> > > +}
> > > +
> > > +static void virtio_pstore_device_realize(DeviceState *dev, Error **errp)
> > > +{
> > > +    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> > > +    VirtIOPstore *s = VIRTIO_PSTORE(dev);
> > > +
> > > +    virtio_init(vdev, "virtio-pstore", VIRTIO_ID_PSTORE,
> > > +                sizeof(struct virtio_pstore_config));
> > > +
> > > +    s->id = 1;
> > > +
> > > +    if (!s->bufsize)
> > > +        s->bufsize = PSTORE_DEFAULT_BUFSIZE;
> > > +    if (!s->file_max)
> > > +        s->file_max = PSTORE_DEFAULT_FILE_MAX;
> > > +
> > > +    s->rvq = virtio_add_queue(vdev, 128, virtio_pstore_handle_io);
> > > +    s->wvq = virtio_add_queue(vdev, 128, virtio_pstore_handle_io);
> > > +}
> > > +
> > > +static void virtio_pstore_device_unrealize(DeviceState *dev, Error **errp)
> > > +{
> > > +    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> > > +
> > > +    virtio_cleanup(vdev);
> > > +}
> > > +
> > > +static void virtio_pstore_get_config(VirtIODevice *vdev, uint8_t *config_data)
> > > +{
> > > +    VirtIOPstore *dev = VIRTIO_PSTORE(vdev);
> > > +    struct virtio_pstore_config config;
> > 
> > Add {} here - you want all fields initialized
> > if you add them, to avoid leaking them to guest.
> 
> Ok.
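
A small sketch of the point about initializing the whole struct.  demo_config and its future_field member are hypothetical stand-ins for a config struct that grows a field later; the byte-order conversion from the patch is left out to keep the example short.  The sketch uses {0}, which is portable C, where the review comment suggests {}.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for a device config struct; future_field is hypothetical. */
struct demo_config {
    uint32_t bufsize;
    uint32_t future_field;
};

static void demo_get_config(uint8_t *config_data, uint32_t bufsize)
{
    /* every member starts at zero, including ones never assigned below */
    struct demo_config config = {0};

    config.bufsize = bufsize;
    memcpy(config_data, &config, sizeof(config));
}
```

Without the initializer, future_field would carry whatever happened to be on the stack, and the memcpy() would hand that to the guest.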
> 
> > 
> > > +
> > > +    config.bufsize = cpu_to_le32(dev->bufsize);
> > > +
> > > +    memcpy(config_data, &config, sizeof(struct virtio_pstore_config));
> > > +}
> > > +
> > > +static void virtio_pstore_set_config(VirtIODevice *vdev,
> > > +                                     const uint8_t *config_data)
> > > +{
> > > +    VirtIOPstore *dev = VIRTIO_PSTORE(vdev);
> > > +    struct virtio_pstore_config config;
> > > +
> > > +    memcpy(&config, config_data, sizeof(struct virtio_pstore_config));
> > > +
> > > +    dev->bufsize = le32_to_cpu(config.bufsize);
> > > +}
> > > +
> > > +static uint64_t get_features(VirtIODevice *vdev, uint64_t f, Error **errp)
> > > +{
> > > +    return f;
> > > +}
> > > +
> > > +static void pstore_get_directory(Object *obj, Visitor *v,
> > > +                                 const char *name, void *opaque,
> > > +                                 Error **errp)
> > > +{
> > > +    VirtIOPstore *s = opaque;
> > > +
> > > +    visit_type_str(v, name, &s->directory, errp);
> > > +}
> > > +
> > > +static void pstore_set_directory(Object *obj, Visitor *v,
> > > +                                 const char *name, void *opaque,
> > > +                                 Error **errp)
> > > +{
> > > +    VirtIOPstore *s = opaque;
> > > +    Error *local_err = NULL;
> > > +    char *value;
> > > +
> > > +    visit_type_str(v, name, &value, &local_err);
> > > +    if (local_err) {
> > > +        error_propagate(errp, local_err);
> > > +        return;
> > > +    }
> > > +
> > > +    g_free(s->directory);
> > > +    s->directory = value;
> > > +}
> > > +
> > > +static void pstore_release_directory(Object *obj, const char *name,
> > > +                                     void *opaque)
> > > +{
> > > +    VirtIOPstore *s = opaque;
> > > +
> > > +    g_free(s->directory);
> > > +    s->directory = NULL;
> > > +}
> > > +
> > > +static void pstore_get_bufsize(Object *obj, Visitor *v,
> > > +                               const char *name, void *opaque,
> > > +                               Error **errp)
> > > +{
> > > +    VirtIOPstore *s = opaque;
> > > +    uint64_t value = s->bufsize;
> > > +
> > > +    visit_type_size(v, name, &value, errp);
> > > +}
> > > +
> > > +static void pstore_set_bufsize(Object *obj, Visitor *v,
> > > +                               const char *name, void *opaque,
> > > +                               Error **errp)
> > > +{
> > > +    VirtIOPstore *s = opaque;
> > > +    Error *error = NULL;
> > > +    uint64_t value;
> > > +
> > > +    visit_type_size(v, name, &value, &error);
> > > +    if (error) {
> > > +        error_propagate(errp, error);
> > > +        return;
> > > +    }
> > > +
> > > +    if (value < 4096) {
> > > +        error_setg(&error, "Warning: too small buffer size: %"PRIu64, value);
> > > +        error_propagate(errp, error);
> > > +        return;
> > > +    }
> > > +
> > > +    s->bufsize = value;
> > > +}
> > > +
> > > +static void pstore_get_file_max(Object *obj, Visitor *v,
> > > +                                const char *name, void *opaque,
> > > +                                Error **errp)
> > > +{
> > > +    VirtIOPstore *s = opaque;
> > > +    int64_t value = s->file_max;
> > > +
> > > +    visit_type_int(v, name, &value, errp);
> > > +}
> > > +
> > > +static void pstore_set_file_max(Object *obj, Visitor *v,
> > > +                                const char *name, void *opaque,
> > > +                                Error **errp)
> > > +{
> > > +    VirtIOPstore *s = opaque;
> > > +    Error *error = NULL;
> > > +    int64_t value;
> > > +
> > > +    visit_type_int(v, name, &value, &error);
> > > +    if (error) {
> > > +        error_propagate(errp, error);
> > > +        return;
> > > +    }
> > > +
> > > +    s->file_max = value;
> > > +}
> > 
> > Do you need dynamic properties? There are easier ways
> > to define an int property. Same for others.
> 
> It was due to my insufficient knowledge about the qemu code base.  I
> don't think it needs to be dynamic.
> 
> Thanks,
> Namhyung
> 
> > 
> > > +
> > > +static Property virtio_pstore_properties[] = {
> > > +    DEFINE_PROP_END_OF_LIST(),
> > > +};
> > > +
> > > +static void virtio_pstore_instance_init(Object *obj)
> > > +{
> > > +    VirtIOPstore *s = VIRTIO_PSTORE(obj);
> > > +
> > > +    object_property_add(obj, "directory", "str",
> > > +                        pstore_get_directory, pstore_set_directory,
> > > +                        pstore_release_directory, s, NULL);
> > > +    object_property_add(obj, "bufsize", "size",
> > > +                        pstore_get_bufsize, pstore_set_bufsize, NULL, s, NULL);
> > > +    object_property_add(obj, "file-max", "int",
> > > +                        pstore_get_file_max, pstore_set_file_max, NULL, s, NULL);
> > > +}
> > > +
> > > +static void virtio_pstore_class_init(ObjectClass *klass, void *data)
> > > +{
> > > +    DeviceClass *dc = DEVICE_CLASS(klass);
> > > +    VirtioDeviceClass *vdc = VIRTIO_DEVICE_CLASS(klass);
> > > +
> > > +    dc->props = virtio_pstore_properties;
> > > +    set_bit(DEVICE_CATEGORY_MISC, dc->categories);
> > > +    vdc->realize = virtio_pstore_device_realize;
> > > +    vdc->unrealize = virtio_pstore_device_unrealize;
> > > +    vdc->get_config = virtio_pstore_get_config;
> > > +    vdc->set_config = virtio_pstore_set_config;
> > > +    vdc->get_features = get_features;
> > > +}
> > > +
> > > +static const TypeInfo virtio_pstore_info = {
> > > +    .name = TYPE_VIRTIO_PSTORE,
> > > +    .parent = TYPE_VIRTIO_DEVICE,
> > > +    .instance_size = sizeof(VirtIOPstore),
> > > +    .instance_init = virtio_pstore_instance_init,
> > > +    .class_init = virtio_pstore_class_init,
> > > +};
> > > +
> > > +static void virtio_register_types(void)
> > > +{
> > > +    type_register_static(&virtio_pstore_info);
> > > +}
> > > +
> > > +type_init(virtio_register_types)
> > > diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
> > > index 929ec2f..b31774a 100644
> > > --- a/include/hw/pci/pci.h
> > > +++ b/include/hw/pci/pci.h
> > > @@ -79,6 +79,7 @@
> > >  #define PCI_DEVICE_ID_VIRTIO_SCSI        0x1004
> > >  #define PCI_DEVICE_ID_VIRTIO_RNG         0x1005
> > >  #define PCI_DEVICE_ID_VIRTIO_9P          0x1009
> > > +#define PCI_DEVICE_ID_VIRTIO_PSTORE      0x100a
> > >  
> > >  #define PCI_VENDOR_ID_REDHAT             0x1b36
> > >  #define PCI_DEVICE_ID_REDHAT_BRIDGE      0x0001
> > > diff --git a/include/hw/virtio/virtio-pstore.h b/include/hw/virtio/virtio-pstore.h
> > > new file mode 100644
> > > index 0000000..85b1828
> > > --- /dev/null
> > > +++ b/include/hw/virtio/virtio-pstore.h
> > > @@ -0,0 +1,36 @@
> > > +/*
> > > + * Virtio Pstore Support
> > > + *
> > > + * Authors:
> > > + *  Namhyung Kim      <namhyung@gmail.com>
> > > + *
> > > + * This work is licensed under the terms of the GNU GPL, version 2.  See
> > > + * the COPYING file in the top-level directory.
> > > + *
> > > + */
> > > +
> > > +#ifndef _QEMU_VIRTIO_PSTORE_H
> > > +#define _QEMU_VIRTIO_PSTORE_H
> > > +
> > > +#include "standard-headers/linux/virtio_pstore.h"
> > > +#include "hw/virtio/virtio.h"
> > > +#include "hw/pci/pci.h"
> > > +
> > > +#define TYPE_VIRTIO_PSTORE "virtio-pstore-device"
> > > +#define VIRTIO_PSTORE(obj) \
> > > +        OBJECT_CHECK(VirtIOPstore, (obj), TYPE_VIRTIO_PSTORE)
> > > +
> > > +typedef struct VirtIOPstore {
> > > +    VirtIODevice    parent_obj;
> > > +    VirtQueue      *rvq;
> > > +    VirtQueue      *wvq;
> > > +    char           *directory;
> > > +    int             file_idx;
> > > +    int             num_file;
> > > +    struct dirent **files;
> > > +    uint64_t        id;
> > > +    uint64_t        bufsize;
> > > +    uint64_t        file_max;
> > > +} VirtIOPstore;
> > > +
> > > +#endif
> > > diff --git a/include/standard-headers/linux/virtio_ids.h b/include/standard-headers/linux/virtio_ids.h
> > > index 77925f5..c72a9ab 100644
> > > --- a/include/standard-headers/linux/virtio_ids.h
> > > +++ b/include/standard-headers/linux/virtio_ids.h
> > > @@ -41,5 +41,6 @@
> > >  #define VIRTIO_ID_CAIF	       12 /* Virtio caif */
> > >  #define VIRTIO_ID_GPU          16 /* virtio GPU */
> > >  #define VIRTIO_ID_INPUT        18 /* virtio input */
> > > +#define VIRTIO_ID_PSTORE       22 /* virtio pstore */
> > >  
> > >  #endif /* _LINUX_VIRTIO_IDS_H */
> > > diff --git a/include/standard-headers/linux/virtio_pstore.h b/include/standard-headers/linux/virtio_pstore.h
> > > new file mode 100644
> > > index 0000000..2f91839
> > > --- /dev/null
> > > +++ b/include/standard-headers/linux/virtio_pstore.h
> > > @@ -0,0 +1,76 @@
> > > +#ifndef _LINUX_VIRTIO_PSTORE_H
> > > +#define _LINUX_VIRTIO_PSTORE_H
> > > +/* This header is BSD licensed so anyone can use the definitions to implement
> > > + * compatible drivers/servers.
> > > + *
> > > + * Redistribution and use in source and binary forms, with or without
> > > + * modification, are permitted provided that the following conditions
> > > + * are met:
> > > + * 1. Redistributions of source code must retain the above copyright
> > > + *    notice, this list of conditions and the following disclaimer.
> > > + * 2. Redistributions in binary form must reproduce the above copyright
> > > + *    notice, this list of conditions and the following disclaimer in the
> > > + *    documentation and/or other materials provided with the distribution.
> > > + * 3. Neither the name of IBM nor the names of its contributors
> > > + *    may be used to endorse or promote products derived from this software
> > > + *    without specific prior written permission.
> > > + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS ``AS IS'' AND
> > > + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
> > > + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
> > > + * ARE DISCLAIMED.  IN NO EVENT SHALL IBM OR CONTRIBUTORS BE LIABLE
> > > + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
> > > + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
> > > + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
> > > + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
> > > + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
> > > + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
> > > + * SUCH DAMAGE. */
> > > +#include "standard-headers/linux/types.h"
> > > +#include "standard-headers/linux/virtio_types.h"
> > > +#include "standard-headers/linux/virtio_ids.h"
> > > +#include "standard-headers/linux/virtio_config.h"
> > > +
> > > +#define VIRTIO_PSTORE_CMD_NULL   0
> > > +#define VIRTIO_PSTORE_CMD_OPEN   1
> > > +#define VIRTIO_PSTORE_CMD_READ   2
> > > +#define VIRTIO_PSTORE_CMD_WRITE  3
> > > +#define VIRTIO_PSTORE_CMD_ERASE  4
> > > +#define VIRTIO_PSTORE_CMD_CLOSE  5
> > > +
> > > +#define VIRTIO_PSTORE_TYPE_UNKNOWN  0
> > > +#define VIRTIO_PSTORE_TYPE_DMESG    1
> > > +
> > > +#define VIRTIO_PSTORE_FL_COMPRESSED  1
> > > +
> > > +struct virtio_pstore_req {
> > > +    __virtio16 cmd;
> > > +    __virtio16 type;
> > > +    __virtio32 flags;
> > > +    __virtio64 id;
> > > +    __virtio32 count;
> > > +    __virtio32 reserved;
> > > +};
> > > +
> > > +struct virtio_pstore_res {
> > > +    __virtio16 cmd;
> > > +    __virtio16 type;
> > > +    __virtio32 ret;
> > > +};
> > > +
> > > +struct virtio_pstore_fileinfo {
> > > +    __virtio64 id;
> > > +    __virtio32 count;
> > > +    __virtio16 type;
> > > +    __virtio16 unused;
> > > +    __virtio32 flags;
> > > +    __virtio32 len;
> > > +    __virtio64 time_sec;
> > > +    __virtio32 time_nsec;
> > > +    __virtio32 reserved;
> > > +};
> > > +
> > > +struct virtio_pstore_config {
> > > +    __virtio32 bufsize;
> > > +};
> > > +
> > > +#endif /* _LINUX_VIRTIO_PSTORE_H */
> > > diff --git a/qdev-monitor.c b/qdev-monitor.c
> > > index e19617f..e1df5a9 100644
> > > --- a/qdev-monitor.c
> > > +++ b/qdev-monitor.c
> > > @@ -73,6 +73,7 @@ static const QDevAlias qdev_alias_table[] = {
> > >      { "virtio-serial-pci", "virtio-serial", QEMU_ARCH_ALL & ~QEMU_ARCH_S390X },
> > >      { "virtio-tablet-ccw", "virtio-tablet", QEMU_ARCH_S390X },
> > >      { "virtio-tablet-pci", "virtio-tablet", QEMU_ARCH_ALL & ~QEMU_ARCH_S390X },
> > > +    { "virtio-pstore-pci", "virtio-pstore" },
> > >      { }
> > >  };
> > >  
> > > -- 
> > > 2.9.3


* Re: [PATCH 1/3] virtio: Basic implementation of virtio pstore driver
  2016-11-10 16:39   ` Michael S. Tsirkin
@ 2016-11-15  4:50     ` Namhyung Kim
  2016-11-15  5:06       ` Michael S. Tsirkin
  0 siblings, 1 reply; 31+ messages in thread
From: Namhyung Kim @ 2016-11-15  4:50 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev, kvm, qemu-devel, virtualization, LKML, Paolo Bonzini,
	Radim Krčmář,
	Anthony Liguori, Anton Vorontsov, Colin Cross, Kees Cook,
	Tony Luck, Steven Rostedt, Ingo Molnar, Minchan Kim

Hi Michael,

On Thu, Nov 10, 2016 at 06:39:55PM +0200, Michael S. Tsirkin wrote:
> On Sat, Aug 20, 2016 at 05:07:42PM +0900, Namhyung Kim wrote:
> > The virtio pstore driver provides an interface to the pstore subsystem so
> > that the guest kernel's log/dump messages can be saved on the host
> > machine.  Users can access the log files directly on the host, or on the
> > guest at the next boot using the pstore filesystem.  It currently deals with
> > the kernel log (printk) buffer only, but we can extend it to handle other
> > information (like ftrace dumps) later.
> > 
> > It supports a legacy PCI device using a single order-2 page buffer.
> 
> Do you mean a legacy virtio device? I don't see why
> you would want to support pre-1.0 mode.
> If you drop that, you can drop all cpu_to_virtio things
> and just use __le accessors.

I was thinking about kvmtool, which lacks 1.0 support AFAIK.  But
I think it'd be better to always use __le types anyway.  Will change.
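
To spell out what always-little-endian means for the wire format: virtio 1.0 fixes the byte order of these fields to little-endian regardless of guest endianness, so no per-device conversion state is needed.  The helpers below are a portable userspace sketch with hypothetical names (put_le32/get_le32); in the kernel the equivalent would be __le32 fields with cpu_to_le32()/le32_to_cpu().

```c
#include <stdint.h>

/* Store a 32-bit value in little-endian order on any host. */
static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)((v >> 8) & 0xff);
    p[2] = (uint8_t)((v >> 16) & 0xff);
    p[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Load a 32-bit little-endian value on any host. */
static uint32_t get_le32(const uint8_t *p)
{
    return (uint32_t)p[0] |
           ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) |
           ((uint32_t)p[3] << 24);
}
```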


> 
> > It uses
> > two virtqueues - one for (sync) read and another for (async) write.
> > Since it cannot wait for a write to finish, it supports up to 128
> > concurrent IOs.  The buffer size is configurable now.
> > 
> > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > Cc: Radim Krčmář <rkrcmar@redhat.com>
> > Cc: "Michael S. Tsirkin" <mst@redhat.com>
> > Cc: Anthony Liguori <aliguori@amazon.com>
> > Cc: Anton Vorontsov <anton@enomsg.org>
> > Cc: Colin Cross <ccross@android.com>
> > Cc: Kees Cook <keescook@chromium.org>
> > Cc: Tony Luck <tony.luck@intel.com>
> > Cc: Steven Rostedt <rostedt@goodmis.org>
> > Cc: Ingo Molnar <mingo@kernel.org>
> > Cc: Minchan Kim <minchan@kernel.org>
> > Cc: kvm@vger.kernel.org
> > Cc: qemu-devel@nongnu.org
> > Cc: virtualization@lists.linux-foundation.org
> > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> > ---
> >  drivers/virtio/Kconfig             |  10 +
> >  drivers/virtio/Makefile            |   1 +
> >  drivers/virtio/virtio_pstore.c     | 417 +++++++++++++++++++++++++++++++++++++
> >  include/uapi/linux/Kbuild          |   1 +
> >  include/uapi/linux/virtio_ids.h    |   1 +
> >  include/uapi/linux/virtio_pstore.h |  74 +++++++
> >  6 files changed, 504 insertions(+)
> >  create mode 100644 drivers/virtio/virtio_pstore.c
> >  create mode 100644 include/uapi/linux/virtio_pstore.h
> > 
> > diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
> > index 77590320d44c..8f0e6c796c12 100644
> > --- a/drivers/virtio/Kconfig
> > +++ b/drivers/virtio/Kconfig
> > @@ -58,6 +58,16 @@ config VIRTIO_INPUT
> >  
> >  	 If unsure, say M.
> >  
> > +config VIRTIO_PSTORE
> > +	tristate "Virtio pstore driver"
> > +	depends on VIRTIO
> > +	depends on PSTORE
> > +	---help---
> > +	 This driver supports virtio pstore devices to save/restore
> > +	 panic and oops messages on the host.
> > +
> > +	 If unsure, say M.
> > +
> >   config VIRTIO_MMIO
> >  	tristate "Platform bus driver for memory mapped virtio devices"
> >  	depends on HAS_IOMEM && HAS_DMA
> > diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile
> > index 41e30e3dc842..bee68cb26d48 100644
> > --- a/drivers/virtio/Makefile
> > +++ b/drivers/virtio/Makefile
> > @@ -5,3 +5,4 @@ virtio_pci-y := virtio_pci_modern.o virtio_pci_common.o
> >  virtio_pci-$(CONFIG_VIRTIO_PCI_LEGACY) += virtio_pci_legacy.o
> >  obj-$(CONFIG_VIRTIO_BALLOON) += virtio_balloon.o
> >  obj-$(CONFIG_VIRTIO_INPUT) += virtio_input.o
> > +obj-$(CONFIG_VIRTIO_PSTORE) += virtio_pstore.o
> > diff --git a/drivers/virtio/virtio_pstore.c b/drivers/virtio/virtio_pstore.c
> > new file mode 100644
> > index 000000000000..0a63c7db4278
> > --- /dev/null
> > +++ b/drivers/virtio/virtio_pstore.c
> > @@ -0,0 +1,417 @@
> > +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> > +
> > +#include <linux/kernel.h>
> > +#include <linux/module.h>
> > +#include <linux/pstore.h>
> > +#include <linux/virtio.h>
> > +#include <linux/virtio_config.h>
> > +#include <uapi/linux/virtio_ids.h>
> > +#include <uapi/linux/virtio_pstore.h>
> > +
> > +#define VIRT_PSTORE_ORDER    2
> > +#define VIRT_PSTORE_BUFSIZE  (4096 << VIRT_PSTORE_ORDER)
> > +#define VIRT_PSTORE_NR_REQ   128
> > +
> > +struct virtio_pstore {
> > +	struct virtio_device	*vdev;
> > +	struct virtqueue	*vq[2];
> 
> I'd add named fields instead of an array here, vq[0]
> vq[1] all over the place is hard to read.

Will change.

> 
> > +	struct pstore_info	 pstore;
> > +	struct virtio_pstore_req req[VIRT_PSTORE_NR_REQ];
> > +	struct virtio_pstore_res res[VIRT_PSTORE_NR_REQ];
> > +	unsigned int		 req_id;
> > +
> > +	/* Waiting for host to ack */
> > +	wait_queue_head_t	acked;
> > +	int			failed;
> > +};
> > +
> > +#define TYPE_TABLE_ENTRY(_entry)				\
> > +	{ PSTORE_TYPE_##_entry, VIRTIO_PSTORE_TYPE_##_entry }
> > +
> > +struct type_table {
> > +	int pstore;
> > +	u16 virtio;
> > +} type_table[] = {
> > +	TYPE_TABLE_ENTRY(DMESG),
> > +};
> > +
> > +#undef TYPE_TABLE_ENTRY
> 
> let's avoid macros for now pls. In fact, I would just open-code this
> in to_virtio_type below. We can always change our minds later if
> lots of types are added.

Yep.
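
Something along these lines, as a rough userspace sketch rather than
the final code.  The enum and the two VIRTIO_PSTORE_TYPE_* values below
are stand-ins chosen for the example (the real ones come from
<linux/pstore.h> and the uapi header), and the helpers return values in
CPU byte order so the conversion to the wire format stays at the call
site.

```c
#include <stdint.h>

/* Stand-in values for this sketch; the real ones live in kernel headers. */
enum pstore_type_id {
	PSTORE_TYPE_DMESG   = 0,
	PSTORE_TYPE_UNKNOWN = 255,
};

#define VIRTIO_PSTORE_TYPE_UNKNOWN  0
#define VIRTIO_PSTORE_TYPE_DMESG    1

/* Open-coded mapping: no table and no macro, one case per known type. */
static uint16_t to_virtio_type(enum pstore_type_id type)
{
	switch (type) {
	case PSTORE_TYPE_DMESG:
		return VIRTIO_PSTORE_TYPE_DMESG;
	default:
		return VIRTIO_PSTORE_TYPE_UNKNOWN;
	}
}

/* Reverse mapping; anything unrecognized becomes PSTORE_TYPE_UNKNOWN. */
static enum pstore_type_id from_virtio_type(uint16_t type)
{
	switch (type) {
	case VIRTIO_PSTORE_TYPE_DMESG:
		return PSTORE_TYPE_DMESG;
	default:
		return PSTORE_TYPE_UNKNOWN;
	}
}
```

Adding a type later is one extra case in each function, so going back
to a table only pays off once there are many of them.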

> 
> > +
> > +
> 
> single empty line pls

Ok.

> 
> > +static u16 to_virtio_type(struct virtio_pstore *vps, enum pstore_type_id type)
> > +{
> > +	unsigned int i;
> > +
> > +	for (i = 0; i < ARRAY_SIZE(type_table); i++) {
> > +		if (type == type_table[i].pstore)
> > +			return cpu_to_virtio16(vps->vdev, type_table[i].virtio);
> > +	}
> > +
> > +	return cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_TYPE_UNKNOWN);
> 
> This assigns u16 to __virtio type, sparse will warn
> if you enable endian-ness checks.
> Pls fix that and generally, please make sure this is
> clean from sparse warnings.

I'll run sparse before sending the patch next time.

> 
> > +}
> > +
> > +static enum pstore_type_id from_virtio_type(struct virtio_pstore *vps, u16 type)
> > +{
> > +	unsigned int i;
> > +
> > +	for (i = 0; i < ARRAY_SIZE(type_table); i++) {
> > +		if (virtio16_to_cpu(vps->vdev, type) == type_table[i].virtio)
> > +			return type_table[i].pstore;
> > +	}
> > +
> > +	return PSTORE_TYPE_UNKNOWN;
> > +}
> > +
> > +static void virtpstore_ack(struct virtqueue *vq)
> > +{
> > +	struct virtio_pstore *vps = vq->vdev->priv;
> > +
> > +	wake_up(&vps->acked);
> > +}
> > +
> > +static void virtpstore_check(struct virtqueue *vq)
> > +{
> > +	struct virtio_pstore *vps = vq->vdev->priv;
> > +	struct virtio_pstore_res *res;
> > +	unsigned int len;
> > +
> > +	res = virtqueue_get_buf(vq, &len);
> > +	if (res == NULL)
> > +		return;
> > +
> > +	if (virtio32_to_cpu(vq->vdev, res->ret) < 0)
> > +		vps->failed = 1;
> > +}
> > +
> > +static void virt_pstore_get_reqs(struct virtio_pstore *vps,
> > +				 struct virtio_pstore_req **preq,
> > +				 struct virtio_pstore_res **pres)
> > +{
> > +	unsigned int idx = vps->req_id++ % VIRT_PSTORE_NR_REQ;
> > +
> > +	*preq = &vps->req[idx];
> > +	*pres = &vps->res[idx];
> > +
> > +	memset(*preq, 0, sizeof(**preq));
> > +	memset(*pres, 0, sizeof(**pres));
> > +}
> > +
> > +static int virt_pstore_open(struct pstore_info *psi)
> > +{
> > +	struct virtio_pstore *vps = psi->data;
> > +	struct virtio_pstore_req *req;
> > +	struct virtio_pstore_res *res;
> > +	struct scatterlist sgo[1], sgi[1];
> > +	struct scatterlist *sgs[2] = { sgo, sgi };
> > +	unsigned int len;
> > +
> > +	virt_pstore_get_reqs(vps, &req, &res);
> > +
> > +	req->cmd = cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_CMD_OPEN);
> > +
> > +	sg_init_one(sgo, req, sizeof(*req));
> > +	sg_init_one(sgi, res, sizeof(*res));
> > +	virtqueue_add_sgs(vps->vq[0], sgs, 1, 1, vps, GFP_KERNEL);
> > +	virtqueue_kick(vps->vq[0]);
> > +
> > +	wait_event(vps->acked, virtqueue_get_buf(vps->vq[0], &len));
> 
> Does this block userspace in an uninterruptible wait if
> hardware is slow? That's not nice.

Yes, but it's not a common operation and I just wanted to make it
simple.
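
If the unbounded wait turns out to be a problem, the kernel already has
wait_event_interruptible() and wait_event_timeout() for this.  The
userspace sketch below only illustrates the bounded-wait idea; the
names wait_bounded() and countdown_done() are invented for the example
and it polls instead of sleeping on a wait queue.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical completion predicate: true once the request has finished. */
typedef bool (*done_fn)(void *ctx);

/*
 * Check 'done' at most 'max_polls' times.  Returns 0 on completion or
 * -ETIMEDOUT once the budget is used up, so the caller cannot hang
 * forever on a slow or broken device.
 */
static int wait_bounded(done_fn done, void *ctx, unsigned int max_polls)
{
	unsigned int i;

	for (i = 0; i < max_polls; i++) {
		if (done(ctx))
			return 0;
	}

	return -ETIMEDOUT;
}

/* Example predicate: reports completion once the counter reaches zero. */
static bool countdown_done(void *ctx)
{
	unsigned int *left = ctx;

	if (*left == 0)
		return true;

	(*left)--;
	return false;
}
```

The point is only that the caller gets an error code back instead of
blocking indefinitely; the open path would then have to handle it.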


> 
> > +	return virtio32_to_cpu(vps->vdev, res->ret);
> > +}
> > +

[SNIP]
> > +struct virtio_pstore_fileinfo {
> > +	__virtio64		id;
> > +	__virtio32		count;
> > +	__virtio16		type;
> > +	__virtio16		unused;
> > +	__virtio32		flags;
> > +	__virtio32		len;
> > +	__virtio64		time_sec;
> > +	__virtio32		time_nsec;
> > +	__virtio32		reserved;
> > +};
> > +
> > +struct virtio_pstore_config {
> > +	__virtio32		bufsize;
> > +};
> > +
> 
> What exactly does each field mean? I'm especially
> interested in time fields - maintaining a consistent
> time between host and guest is not a simple problem.

These are required by pstore and will be used to create corresponding
files in the pstore filesystem.  The time fields are for mtime and
ctime and, I think, they're just a hint for the user and don't require
strict consistency.

Thanks for your review!
Namhyung

> 
> > +#endif /* _LINUX_VIRTIO_PSTORE_H */
> > -- 
> > 2.9.3

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/3] virtio: Basic implementation of virtio pstore driver
  2016-11-15  4:50     ` Namhyung Kim
@ 2016-11-15  5:06       ` Michael S. Tsirkin
  2016-11-15  5:50         ` Namhyung Kim
  2016-11-15  9:57         ` Paolo Bonzini
  0 siblings, 2 replies; 31+ messages in thread
From: Michael S. Tsirkin @ 2016-11-15  5:06 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: virtio-dev, kvm, qemu-devel, virtualization, LKML, Paolo Bonzini,
	Radim Krčmář,
	Anthony Liguori, Anton Vorontsov, Colin Cross, Kees Cook,
	Tony Luck, Steven Rostedt, Ingo Molnar, Minchan Kim

On Tue, Nov 15, 2016 at 01:50:21PM +0900, Namhyung Kim wrote:
> Hi Michael,
> 
> On Thu, Nov 10, 2016 at 06:39:55PM +0200, Michael S. Tsirkin wrote:
> > On Sat, Aug 20, 2016 at 05:07:42PM +0900, Namhyung Kim wrote:
> > > The virtio pstore driver provides interface to the pstore subsystem so
> > > that the guest kernel's log/dump message can be saved on the host
> > > machine.  Users can access the log file directly on the host, or on the
> > > guest at the next boot using pstore filesystem.  It currently deals with
> > > kernel log (printk) buffer only, but we can extend it to have other
> > > information (like ftrace dump) later.
> > > 
> > > It supports legacy PCI device using single order-2 page buffer.
> > 
> > Do you mean a legacy virtio device? I don't see why
> > you would want to support pre-1.0 mode.
> > If you drop that, you can drop all cpu_to_virtio things
> > and just use __le accessors.
> 
> I was thinking about the kvmtools which lacks 1.0 support AFAIK.

Unless kvmtools wants to be left behind it has to go 1.0.

>  But
> I think it'd be better to always use __le type anyway.  Will change.
> 
> 
> > 
> > > It uses
> > > two virtqueues - one for (sync) read and another for (async) write.
> > > Since it cannot wait for write finished, it supports up to 128
> > > concurrent IO.  The buffer size is configurable now.
> > > 
> > > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > > Cc: Radim Krčmář <rkrcmar@redhat.com>
> > > Cc: "Michael S. Tsirkin" <mst@redhat.com>
> > > Cc: Anthony Liguori <aliguori@amazon.com>
> > > Cc: Anton Vorontsov <anton@enomsg.org>
> > > Cc: Colin Cross <ccross@android.com>
> > > Cc: Kees Cook <keescook@chromium.org>
> > > Cc: Tony Luck <tony.luck@intel.com>
> > > Cc: Steven Rostedt <rostedt@goodmis.org>
> > > Cc: Ingo Molnar <mingo@kernel.org>
> > > Cc: Minchan Kim <minchan@kernel.org>
> > > Cc: kvm@vger.kernel.org
> > > Cc: qemu-devel@nongnu.org
> > > Cc: virtualization@lists.linux-foundation.org
> > > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> > > ---
> > >  drivers/virtio/Kconfig             |  10 +
> > >  drivers/virtio/Makefile            |   1 +
> > >  drivers/virtio/virtio_pstore.c     | 417 +++++++++++++++++++++++++++++++++++++
> > >  include/uapi/linux/Kbuild          |   1 +
> > >  include/uapi/linux/virtio_ids.h    |   1 +
> > >  include/uapi/linux/virtio_pstore.h |  74 +++++++
> > >  6 files changed, 504 insertions(+)
> > >  create mode 100644 drivers/virtio/virtio_pstore.c
> > >  create mode 100644 include/uapi/linux/virtio_pstore.h
> > > 
> > > diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
> > > index 77590320d44c..8f0e6c796c12 100644
> > > --- a/drivers/virtio/Kconfig
> > > +++ b/drivers/virtio/Kconfig
> > > @@ -58,6 +58,16 @@ config VIRTIO_INPUT
> > >  
> > >  	 If unsure, say M.
> > >  
> > > +config VIRTIO_PSTORE
> > > +	tristate "Virtio pstore driver"
> > > +	depends on VIRTIO
> > > +	depends on PSTORE
> > > +	---help---
> > > +	 This driver supports virtio pstore devices to save/restore
> > > +	 panic and oops messages on the host.
> > > +
> > > +	 If unsure, say M.
> > > +
> > >   config VIRTIO_MMIO
> > >  	tristate "Platform bus driver for memory mapped virtio devices"
> > >  	depends on HAS_IOMEM && HAS_DMA
> > > diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile
> > > index 41e30e3dc842..bee68cb26d48 100644
> > > --- a/drivers/virtio/Makefile
> > > +++ b/drivers/virtio/Makefile
> > > @@ -5,3 +5,4 @@ virtio_pci-y := virtio_pci_modern.o virtio_pci_common.o
> > >  virtio_pci-$(CONFIG_VIRTIO_PCI_LEGACY) += virtio_pci_legacy.o
> > >  obj-$(CONFIG_VIRTIO_BALLOON) += virtio_balloon.o
> > >  obj-$(CONFIG_VIRTIO_INPUT) += virtio_input.o
> > > +obj-$(CONFIG_VIRTIO_PSTORE) += virtio_pstore.o
> > > diff --git a/drivers/virtio/virtio_pstore.c b/drivers/virtio/virtio_pstore.c
> > > new file mode 100644
> > > index 000000000000..0a63c7db4278
> > > --- /dev/null
> > > +++ b/drivers/virtio/virtio_pstore.c
> > > @@ -0,0 +1,417 @@
> > > +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> > > +
> > > +#include <linux/kernel.h>
> > > +#include <linux/module.h>
> > > +#include <linux/pstore.h>
> > > +#include <linux/virtio.h>
> > > +#include <linux/virtio_config.h>
> > > +#include <uapi/linux/virtio_ids.h>
> > > +#include <uapi/linux/virtio_pstore.h>
> > > +
> > > +#define VIRT_PSTORE_ORDER    2
> > > +#define VIRT_PSTORE_BUFSIZE  (4096 << VIRT_PSTORE_ORDER)
> > > +#define VIRT_PSTORE_NR_REQ   128
> > > +
> > > +struct virtio_pstore {
> > > +	struct virtio_device	*vdev;
> > > +	struct virtqueue	*vq[2];
> > 
> > I'd add named fields instead of an array here, vq[0]
> > vq[1] all over the place is hard to read.
> 
> Will change.
> 
> > 
> > > +	struct pstore_info	 pstore;
> > > +	struct virtio_pstore_req req[VIRT_PSTORE_NR_REQ];
> > > +	struct virtio_pstore_res res[VIRT_PSTORE_NR_REQ];
> > > +	unsigned int		 req_id;
> > > +
> > > +	/* Waiting for host to ack */
> > > +	wait_queue_head_t	acked;
> > > +	int			failed;
> > > +};
> > > +
> > > +#define TYPE_TABLE_ENTRY(_entry)				\
> > > +	{ PSTORE_TYPE_##_entry, VIRTIO_PSTORE_TYPE_##_entry }
> > > +
> > > +struct type_table {
> > > +	int pstore;
> > > +	u16 virtio;
> > > +} type_table[] = {
> > > +	TYPE_TABLE_ENTRY(DMESG),
> > > +};
> > > +
> > > +#undef TYPE_TABLE_ENTRY
> > 
> > let's avoid macros for now pls. In fact, I would just open-code this
> > in to_virtio_type below. We can always change our minds later if
> > lots of types are added.
> 
> Yep.
> 
> > 
> > > +
> > > +
> > 
> > > single empty line pls
> 
> Ok.
> 
> > 
> > > +static u16 to_virtio_type(struct virtio_pstore *vps, enum pstore_type_id type)
> > > +{
> > > +	unsigned int i;
> > > +
> > > +	for (i = 0; i < ARRAY_SIZE(type_table); i++) {
> > > +		if (type == type_table[i].pstore)
> > > +			return cpu_to_virtio16(vps->vdev, type_table[i].virtio);
> > > +	}
> > > +
> > > +	return cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_TYPE_UNKNOWN);
> > 
> > This assigns u16 to __virtio type, sparse will warn
> > if you enable endian-ness checks.
> > Pls fix that and generally, please make sure this is
> > clean from sparse warnings.
> 
> I'll run sparse before sending patch next time.
> 
> > 
> > > +}
> > > +
> > > +static enum pstore_type_id from_virtio_type(struct virtio_pstore *vps, u16 type)
> > > +{
> > > +	unsigned int i;
> > > +
> > > +	for (i = 0; i < ARRAY_SIZE(type_table); i++) {
> > > +		if (virtio16_to_cpu(vps->vdev, type) == type_table[i].virtio)
> > > +			return type_table[i].pstore;
> > > +	}
> > > +
> > > +	return PSTORE_TYPE_UNKNOWN;
> > > +}
> > > +
> > > +static void virtpstore_ack(struct virtqueue *vq)
> > > +{
> > > +	struct virtio_pstore *vps = vq->vdev->priv;
> > > +
> > > +	wake_up(&vps->acked);
> > > +}
> > > +
> > > +static void virtpstore_check(struct virtqueue *vq)
> > > +{
> > > +	struct virtio_pstore *vps = vq->vdev->priv;
> > > +	struct virtio_pstore_res *res;
> > > +	unsigned int len;
> > > +
> > > +	res = virtqueue_get_buf(vq, &len);
> > > +	if (res == NULL)
> > > +		return;
> > > +
> > > +	if (virtio32_to_cpu(vq->vdev, res->ret) < 0)
> > > +		vps->failed = 1;
> > > +}
> > > +
> > > +static void virt_pstore_get_reqs(struct virtio_pstore *vps,
> > > +				 struct virtio_pstore_req **preq,
> > > +				 struct virtio_pstore_res **pres)
> > > +{
> > > +	unsigned int idx = vps->req_id++ % VIRT_PSTORE_NR_REQ;
> > > +
> > > +	*preq = &vps->req[idx];
> > > +	*pres = &vps->res[idx];
> > > +
> > > +	memset(*preq, 0, sizeof(**preq));
> > > +	memset(*pres, 0, sizeof(**pres));
> > > +}
> > > +
> > > +static int virt_pstore_open(struct pstore_info *psi)
> > > +{
> > > +	struct virtio_pstore *vps = psi->data;
> > > +	struct virtio_pstore_req *req;
> > > +	struct virtio_pstore_res *res;
> > > +	struct scatterlist sgo[1], sgi[1];
> > > +	struct scatterlist *sgs[2] = { sgo, sgi };
> > > +	unsigned int len;
> > > +
> > > +	virt_pstore_get_reqs(vps, &req, &res);
> > > +
> > > +	req->cmd = cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_CMD_OPEN);
> > > +
> > > +	sg_init_one(sgo, req, sizeof(*req));
> > > +	sg_init_one(sgi, res, sizeof(*res));
> > > +	virtqueue_add_sgs(vps->vq[0], sgs, 1, 1, vps, GFP_KERNEL);
> > > +	virtqueue_kick(vps->vq[0]);
> > > +
> > > +	wait_event(vps->acked, virtqueue_get_buf(vps->vq[0], &len));
> > 
> > Does this block userspace in an uninterruptible wait if
> > hardware is slow? That's not nice.
> 
> Yes, but it's not a common operation and I just wanted to make it
> simple.
> 
> 
> > 
> > > +	return virtio32_to_cpu(vps->vdev, res->ret);
> > > +}
> > > +
> 
> [SNIP]
> > > +struct virtio_pstore_fileinfo {
> > > +	__virtio64		id;
> > > +	__virtio32		count;
> > > +	__virtio16		type;
> > > +	__virtio16		unused;
> > > +	__virtio32		flags;
> > > +	__virtio32		len;
> > > +	__virtio64		time_sec;
> > > +	__virtio32		time_nsec;
> > > +	__virtio32		reserved;
> > > +};
> > > +
> > > +struct virtio_pstore_config {
> > > +	__virtio32		bufsize;
> > > +};
> > > +
> > 
> > What exactly does each field mean? I'm especially
> > interested in time fields - maintaining a consistent
> > time between host and guest is not a simple problem.
> 
> These are required by pstore and will be used to create corresponding
> files in the pstore filesystem.  The time fields are for mtime and
> ctime and, I think, it's just a hint for user and doesn't require
> strict consistency.

Pls add documentation. I would just drop hints for now.

> 
> Thanks for your review!
> Namhyung
> 
> > 
> > > +#endif /* _LINUX_VIRTIO_PSTORE_H */
> > > -- 
> > > 2.9.3

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/3] virtio: Basic implementation of virtio pstore driver
  2016-11-15  5:06       ` Michael S. Tsirkin
@ 2016-11-15  5:50         ` Namhyung Kim
  2016-11-15 14:35           ` Michael S. Tsirkin
  2016-11-15  9:57         ` Paolo Bonzini
  1 sibling, 1 reply; 31+ messages in thread
From: Namhyung Kim @ 2016-11-15  5:50 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev, kvm, qemu-devel, virtualization, LKML, Paolo Bonzini,
	Radim Krčmář,
	Anthony Liguori, Anton Vorontsov, Colin Cross, Kees Cook,
	Tony Luck, Steven Rostedt, Ingo Molnar, Minchan Kim

On Tue, Nov 15, 2016 at 07:06:28AM +0200, Michael S. Tsirkin wrote:
> On Tue, Nov 15, 2016 at 01:50:21PM +0900, Namhyung Kim wrote:
> > On Thu, Nov 10, 2016 at 06:39:55PM +0200, Michael S. Tsirkin wrote:
> > [SNIP]
> > > > +struct virtio_pstore_fileinfo {
> > > > +	__virtio64		id;
> > > > +	__virtio32		count;
> > > > +	__virtio16		type;
> > > > +	__virtio16		unused;
> > > > +	__virtio32		flags;
> > > > +	__virtio32		len;
> > > > +	__virtio64		time_sec;
> > > > +	__virtio32		time_nsec;
> > > > +	__virtio32		reserved;
> > > > +};
> > > > +
> > > > +struct virtio_pstore_config {
> > > > +	__virtio32		bufsize;
> > > > +};
> > > > +
> > > 
> > > What exactly does each field mean? I'm especially
> > > interested in time fields - maintaining a consistent
> > > time between host and guest is not a simple problem.
> > 
> > These are required by pstore and will be used to create corresponding
> > files in the pstore filesystem.  The time fields are for mtime and
> > ctime and, I think, it's just a hint for user and doesn't require
> > strict consistency.
> 
> Pls add documentation. I would just drop hints for now.

Well, I'll add documentation.  But I think just dropping them might not
be good since they all carry host time and it's helpful to know their
relative difference in the guest.
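
As a starting point for that documentation, here is a sketch of the
layout with per-field comments.  It uses plain fixed-width integers in
place of the __virtio/__le types so it builds in userspace, the struct
name is changed to make that obvious, and the comments on count and
flags are my working assumptions rather than settled ABI.

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace mirror of struct virtio_pstore_fileinfo, for discussion only. */
struct virtio_pstore_fileinfo_sketch {
	uint64_t id;         /* record id, managed on the device side */
	uint32_t count;      /* assumed: pstore record counter */
	uint16_t type;       /* VIRTIO_PSTORE_TYPE_* of the record */
	uint16_t unused;     /* padding */
	uint32_t flags;      /* assumed: per-record flags */
	uint32_t len;        /* length of the record data in bytes */
	uint64_t time_sec;   /* host time, seconds; used for mtime/ctime */
	uint32_t time_nsec;  /* host time, nanoseconds part */
	uint32_t reserved;   /* padding */
};

/* The fields pack to 40 bytes without compiler-inserted padding. */
_Static_assert(sizeof(struct virtio_pstore_fileinfo_sketch) == 40,
	       "unexpected fileinfo size");
```

The explicit unused/reserved members keep every field naturally
aligned, so the layout is the same on 32-bit and 64-bit guests.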

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/3] qemu: Implement virtio-pstore device
  2016-11-10 22:50       ` Michael S. Tsirkin
@ 2016-11-15  6:23         ` Namhyung Kim
  2016-11-15 14:38           ` Michael S. Tsirkin
  0 siblings, 1 reply; 31+ messages in thread
From: Namhyung Kim @ 2016-11-15  6:23 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev, kvm, qemu-devel, virtualization, LKML, Paolo Bonzini,
	Radim Krčmář,
	Anthony Liguori, Anton Vorontsov, Colin Cross, Kees Cook,
	Tony Luck, Steven Rostedt, Ingo Molnar, Minchan Kim,
	Daniel P . Berrange

On Fri, Nov 11, 2016 at 12:50:03AM +0200, Michael S. Tsirkin wrote:
> On Fri, Sep 16, 2016 at 07:05:47PM +0900, Namhyung Kim wrote:
> > On Tue, Sep 13, 2016 at 06:57:10PM +0300, Michael S. Tsirkin wrote:
> > > On Sat, Aug 20, 2016 at 05:07:43PM +0900, Namhyung Kim wrote:
> > > > +
> > > > +/* the index should match to the type value */
> > > > +static const char *virtio_pstore_file_prefix[] = {
> > > > +    "unknown-",		/* VIRTIO_PSTORE_TYPE_UNKNOWN */
> > > 
> > > Is there value in treating everything unexpected as "unknown"
> > > and rotating them as if they were logs?
> > > It might be better to treat everything that's not known
> > > as guest error.
> > 
> > I was thinking about the version mismatch between the kernel and qemu.
> > I'd like to make the device can deal with a new kernel version which
> > might implement a new pstore message type.  It will be saved as
> > unknown but the kernel can read it properly later.
> 
> Well it'll have a different prefix. E.g. if kernel has
> two different types they will end up in the same
> file, hardly what was wanted.

Right, I think the device needs to add the 'type' value to the
filename for unknown types.
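
A rough sketch of what I have in mind, not the actual qemu change.  The
function name pstore_file_prefix(), the "dmesg-" prefix and the numeric
type values are assumptions for this example; only the "unknown-"
prefix comes from the patch.

```c
#include <stdint.h>
#include <stdio.h>

/* Stand-in values for this sketch. */
#define VIRTIO_PSTORE_TYPE_UNKNOWN  0
#define VIRTIO_PSTORE_TYPE_DMESG    1

/*
 * Write the filename prefix for a record type into buf.  Types the
 * device does not know get their numeric value embedded in the prefix,
 * so two different unknown types never share a file.
 * Returns what snprintf() returns.
 */
static int pstore_file_prefix(char *buf, size_t len, uint16_t type)
{
	switch (type) {
	case VIRTIO_PSTORE_TYPE_DMESG:
		return snprintf(buf, len, "dmesg-");
	default:
		return snprintf(buf, len, "unknown-%u-", (unsigned int)type);
	}
}
```

A newer kernel that later reads the record back can still find it by
type, even though the device never learned a name for it.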

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/3] virtio: Basic implementation of virtio pstore driver
  2016-11-15  5:06       ` Michael S. Tsirkin
  2016-11-15  5:50         ` Namhyung Kim
@ 2016-11-15  9:57         ` Paolo Bonzini
  2016-11-15 14:36           ` Namhyung Kim
  1 sibling, 1 reply; 31+ messages in thread
From: Paolo Bonzini @ 2016-11-15  9:57 UTC (permalink / raw)
  To: Michael S. Tsirkin, Namhyung Kim
  Cc: virtio-dev, Tony Luck, Kees Cook, kvm,
	Radim Krčmář,
	Anton Vorontsov, LKML, Steven Rostedt, qemu-devel, Minchan Kim,
	Anthony Liguori, Colin Cross, virtualization, Ingo Molnar



On 15/11/2016 06:06, Michael S. Tsirkin wrote:
> On Tue, Nov 15, 2016 at 01:50:21PM +0900, Namhyung Kim wrote:
>> Hi Michael,
>>
>> On Thu, Nov 10, 2016 at 06:39:55PM +0200, Michael S. Tsirkin wrote:
>>> On Sat, Aug 20, 2016 at 05:07:42PM +0900, Namhyung Kim wrote:
>>>> The virtio pstore driver provides interface to the pstore subsystem so
>>>> that the guest kernel's log/dump message can be saved on the host
>>>> machine.  Users can access the log file directly on the host, or on the
>>>> guest at the next boot using pstore filesystem.  It currently deals with
>>>> kernel log (printk) buffer only, but we can extend it to have other
>>>> information (like ftrace dump) later.
>>>>
>>>> It supports legacy PCI device using single order-2 page buffer.
>>>
>>> Do you mean a legacy virtio device? I don't see why
>>> you would want to support pre-1.0 mode.
>>> If you drop that, you can drop all cpu_to_virtio things
>>> and just use __le accessors.
>>
>> I was thinking about the kvmtools which lacks 1.0 support AFAIK.
> 
> Unless kvmtools wants to be left behind it has to go 1.0.

And it also has to go ACPI.  Is there any reason, apart from kvmtool, to
make a completely new virtio device, with no support in existing guests,
rather than implement ACPI ERST?

Paolo

>>  But
>> I think it'd be better to always use __le type anyway.  Will change.
>>
>>
>>>
>>>> It uses
>>>> two virtqueues - one for (sync) read and another for (async) write.
>>>> Since it cannot wait for write finished, it supports up to 128
>>>> concurrent IO.  The buffer size is configurable now.
>>>>
>>>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>>>> Cc: Radim Krčmář <rkrcmar@redhat.com>
>>>> Cc: "Michael S. Tsirkin" <mst@redhat.com>
>>>> Cc: Anthony Liguori <aliguori@amazon.com>
>>>> Cc: Anton Vorontsov <anton@enomsg.org>
>>>> Cc: Colin Cross <ccross@android.com>
>>>> Cc: Kees Cook <keescook@chromium.org>
>>>> Cc: Tony Luck <tony.luck@intel.com>
>>>> Cc: Steven Rostedt <rostedt@goodmis.org>
>>>> Cc: Ingo Molnar <mingo@kernel.org>
>>>> Cc: Minchan Kim <minchan@kernel.org>
>>>> Cc: kvm@vger.kernel.org
>>>> Cc: qemu-devel@nongnu.org
>>>> Cc: virtualization@lists.linux-foundation.org
>>>> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
>>>> ---
>>>>  drivers/virtio/Kconfig             |  10 +
>>>>  drivers/virtio/Makefile            |   1 +
>>>>  drivers/virtio/virtio_pstore.c     | 417 +++++++++++++++++++++++++++++++++++++
>>>>  include/uapi/linux/Kbuild          |   1 +
>>>>  include/uapi/linux/virtio_ids.h    |   1 +
>>>>  include/uapi/linux/virtio_pstore.h |  74 +++++++
>>>>  6 files changed, 504 insertions(+)
>>>>  create mode 100644 drivers/virtio/virtio_pstore.c
>>>>  create mode 100644 include/uapi/linux/virtio_pstore.h
>>>>
>>>> diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
>>>> index 77590320d44c..8f0e6c796c12 100644
>>>> --- a/drivers/virtio/Kconfig
>>>> +++ b/drivers/virtio/Kconfig
>>>> @@ -58,6 +58,16 @@ config VIRTIO_INPUT
>>>>  
>>>>  	 If unsure, say M.
>>>>  
>>>> +config VIRTIO_PSTORE
>>>> +	tristate "Virtio pstore driver"
>>>> +	depends on VIRTIO
>>>> +	depends on PSTORE
>>>> +	---help---
>>>> +	 This driver supports virtio pstore devices to save/restore
>>>> +	 panic and oops messages on the host.
>>>> +
>>>> +	 If unsure, say M.
>>>> +
>>>>   config VIRTIO_MMIO
>>>>  	tristate "Platform bus driver for memory mapped virtio devices"
>>>>  	depends on HAS_IOMEM && HAS_DMA
>>>> diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile
>>>> index 41e30e3dc842..bee68cb26d48 100644
>>>> --- a/drivers/virtio/Makefile
>>>> +++ b/drivers/virtio/Makefile
>>>> @@ -5,3 +5,4 @@ virtio_pci-y := virtio_pci_modern.o virtio_pci_common.o
>>>>  virtio_pci-$(CONFIG_VIRTIO_PCI_LEGACY) += virtio_pci_legacy.o
>>>>  obj-$(CONFIG_VIRTIO_BALLOON) += virtio_balloon.o
>>>>  obj-$(CONFIG_VIRTIO_INPUT) += virtio_input.o
>>>> +obj-$(CONFIG_VIRTIO_PSTORE) += virtio_pstore.o
>>>> diff --git a/drivers/virtio/virtio_pstore.c b/drivers/virtio/virtio_pstore.c
>>>> new file mode 100644
>>>> index 000000000000..0a63c7db4278
>>>> --- /dev/null
>>>> +++ b/drivers/virtio/virtio_pstore.c
>>>> @@ -0,0 +1,417 @@
>>>> +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
>>>> +
>>>> +#include <linux/kernel.h>
>>>> +#include <linux/module.h>
>>>> +#include <linux/pstore.h>
>>>> +#include <linux/virtio.h>
>>>> +#include <linux/virtio_config.h>
>>>> +#include <uapi/linux/virtio_ids.h>
>>>> +#include <uapi/linux/virtio_pstore.h>
>>>> +
>>>> +#define VIRT_PSTORE_ORDER    2
>>>> +#define VIRT_PSTORE_BUFSIZE  (4096 << VIRT_PSTORE_ORDER)
>>>> +#define VIRT_PSTORE_NR_REQ   128
>>>> +
>>>> +struct virtio_pstore {
>>>> +	struct virtio_device	*vdev;
>>>> +	struct virtqueue	*vq[2];
>>>
>>> I'd add named fields instead of an array here, vq[0]
>>> vq[1] all over the place is hard to read.
>>
>> Will change.
>>
>>>
>>>> +	struct pstore_info	 pstore;
>>>> +	struct virtio_pstore_req req[VIRT_PSTORE_NR_REQ];
>>>> +	struct virtio_pstore_res res[VIRT_PSTORE_NR_REQ];
>>>> +	unsigned int		 req_id;
>>>> +
>>>> +	/* Waiting for host to ack */
>>>> +	wait_queue_head_t	acked;
>>>> +	int			failed;
>>>> +};
>>>> +
>>>> +#define TYPE_TABLE_ENTRY(_entry)				\
>>>> +	{ PSTORE_TYPE_##_entry, VIRTIO_PSTORE_TYPE_##_entry }
>>>> +
>>>> +struct type_table {
>>>> +	int pstore;
>>>> +	u16 virtio;
>>>> +} type_table[] = {
>>>> +	TYPE_TABLE_ENTRY(DMESG),
>>>> +};
>>>> +
>>>> +#undef TYPE_TABLE_ENTRY
>>>
>>> let's avoid macros for now pls. In fact, I would just open-code this
>>> in to_virtio_type below. We can always change our minds later if
>>> lots of types are added.
>>
>> Yep.
>>
>>>
>>>> +
>>>> +
>>>
>>> single empty line pls
>>
>> Ok.
>>
>>>
>>>> +static u16 to_virtio_type(struct virtio_pstore *vps, enum pstore_type_id type)
>>>> +{
>>>> +	unsigned int i;
>>>> +
>>>> +	for (i = 0; i < ARRAY_SIZE(type_table); i++) {
>>>> +		if (type == type_table[i].pstore)
>>>> +			return cpu_to_virtio16(vps->vdev, type_table[i].virtio);
>>>> +	}
>>>> +
>>>> +	return cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_TYPE_UNKNOWN);
>>>
>>> This assigns u16 to __virtio type, sparse will warn
>>> if you enable endian-ness checks.
>>> Pls fix that and generally, please make sure this is
>>> clean from sparse warnings.
>>
>> I'll run sparse before sending patch next time.
>>
>>>
>>>> +}
>>>> +
>>>> +static enum pstore_type_id from_virtio_type(struct virtio_pstore *vps, u16 type)
>>>> +{
>>>> +	unsigned int i;
>>>> +
>>>> +	for (i = 0; i < ARRAY_SIZE(type_table); i++) {
>>>> +		if (virtio16_to_cpu(vps->vdev, type) == type_table[i].virtio)
>>>> +			return type_table[i].pstore;
>>>> +	}
>>>> +
>>>> +	return PSTORE_TYPE_UNKNOWN;
>>>> +}
>>>> +
>>>> +static void virtpstore_ack(struct virtqueue *vq)
>>>> +{
>>>> +	struct virtio_pstore *vps = vq->vdev->priv;
>>>> +
>>>> +	wake_up(&vps->acked);
>>>> +}
>>>> +
>>>> +static void virtpstore_check(struct virtqueue *vq)
>>>> +{
>>>> +	struct virtio_pstore *vps = vq->vdev->priv;
>>>> +	struct virtio_pstore_res *res;
>>>> +	unsigned int len;
>>>> +
>>>> +	res = virtqueue_get_buf(vq, &len);
>>>> +	if (res == NULL)
>>>> +		return;
>>>> +
>>>> +	if (virtio32_to_cpu(vq->vdev, res->ret) < 0)
>>>> +		vps->failed = 1;
>>>> +}
>>>> +
>>>> +static void virt_pstore_get_reqs(struct virtio_pstore *vps,
>>>> +				 struct virtio_pstore_req **preq,
>>>> +				 struct virtio_pstore_res **pres)
>>>> +{
>>>> +	unsigned int idx = vps->req_id++ % VIRT_PSTORE_NR_REQ;
>>>> +
>>>> +	*preq = &vps->req[idx];
>>>> +	*pres = &vps->res[idx];
>>>> +
>>>> +	memset(*preq, 0, sizeof(**preq));
>>>> +	memset(*pres, 0, sizeof(**pres));
>>>> +}
>>>> +
>>>> +static int virt_pstore_open(struct pstore_info *psi)
>>>> +{
>>>> +	struct virtio_pstore *vps = psi->data;
>>>> +	struct virtio_pstore_req *req;
>>>> +	struct virtio_pstore_res *res;
>>>> +	struct scatterlist sgo[1], sgi[1];
>>>> +	struct scatterlist *sgs[2] = { sgo, sgi };
>>>> +	unsigned int len;
>>>> +
>>>> +	virt_pstore_get_reqs(vps, &req, &res);
>>>> +
>>>> +	req->cmd = cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_CMD_OPEN);
>>>> +
>>>> +	sg_init_one(sgo, req, sizeof(*req));
>>>> +	sg_init_one(sgi, res, sizeof(*res));
>>>> +	virtqueue_add_sgs(vps->vq[0], sgs, 1, 1, vps, GFP_KERNEL);
>>>> +	virtqueue_kick(vps->vq[0]);
>>>> +
>>>> +	wait_event(vps->acked, virtqueue_get_buf(vps->vq[0], &len));
>>>
>>> Does this block userspace in an uninterruptible wait if
>>> hardware is slow? That's not nice.
>>
>> Yes, but it's not a common operation and I just wanted to make it
>> simple.
>>
>>
>>>
>>>> +	return virtio32_to_cpu(vps->vdev, res->ret);
>>>> +}
>>>> +
>>
>> [SNIP]
>>>> +struct virtio_pstore_fileinfo {
>>>> +	__virtio64		id;
>>>> +	__virtio32		count;
>>>> +	__virtio16		type;
>>>> +	__virtio16		unused;
>>>> +	__virtio32		flags;
>>>> +	__virtio32		len;
>>>> +	__virtio64		time_sec;
>>>> +	__virtio32		time_nsec;
>>>> +	__virtio32		reserved;
>>>> +};
>>>> +
>>>> +struct virtio_pstore_config {
>>>> +	__virtio32		bufsize;
>>>> +};
>>>> +
>>>
>>> What exactly does each field mean? I'm especially
>>> interested in time fields - maintaining a consistent
>>> time between host and guest is not a simple problem.
>>
>> These are required by pstore and will be used to create corresponding
>> files in the pstore filesystem.  The time fields are for mtime and
>> ctime and, I think, it's just a hint for user and doesn't require
>> strict consistency.
> 
> Pls add documentation. I would just drop hints for now.
> 
>>
>> Thanks for your review!
>> Namhyung
>>
>>>
>>>> +#endif /* _LINUX_VIRTIO_PSTORE_H */
>>>> -- 
>>>> 2.9.3

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/3] virtio: Basic implementation of virtio pstore driver
  2016-11-15  5:50         ` Namhyung Kim
@ 2016-11-15 14:35           ` Michael S. Tsirkin
  0 siblings, 0 replies; 31+ messages in thread
From: Michael S. Tsirkin @ 2016-11-15 14:35 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: virtio-dev, kvm, qemu-devel, virtualization, LKML, Paolo Bonzini,
	Radim Krčmář,
	Anthony Liguori, Anton Vorontsov, Colin Cross, Kees Cook,
	Tony Luck, Steven Rostedt, Ingo Molnar, Minchan Kim

On Tue, Nov 15, 2016 at 02:50:11PM +0900, Namhyung Kim wrote:
> On Tue, Nov 15, 2016 at 07:06:28AM +0200, Michael S. Tsirkin wrote:
> > On Tue, Nov 15, 2016 at 01:50:21PM +0900, Namhyung Kim wrote:
> > > On Thu, Nov 10, 2016 at 06:39:55PM +0200, Michael S. Tsirkin wrote:
> > > [SNIP]
> > > > > +struct virtio_pstore_fileinfo {
> > > > > +	__virtio64		id;
> > > > > +	__virtio32		count;
> > > > > +	__virtio16		type;
> > > > > +	__virtio16		unused;
> > > > > +	__virtio32		flags;
> > > > > +	__virtio32		len;
> > > > > +	__virtio64		time_sec;
> > > > > +	__virtio32		time_nsec;
> > > > > +	__virtio32		reserved;
> > > > > +};
> > > > > +
> > > > > +struct virtio_pstore_config {
> > > > > +	__virtio32		bufsize;
> > > > > +};
> > > > > +
> > > > 
> > > > What exactly does each field mean? I'm especially
> > > > interested in time fields - maintaining a consistent
> > > > time between host and guest is not a simple problem.
> > > 
> > > These are required by pstore and will be used to create corresponding
> > > files in the pstore filesystem.  The time fields are for mtime and
> > > ctime and, I think, it's just a hint for user and doesn't require
> > > strict consistency.
> > 
> > Pls add documentation. I would just drop hints for now.
> 
> Well, I'll add documentation.  But I think just dropping them might not
> be good, since they all carry host time and it's helpful to know their
> relative difference in the guest.
> 
> Thanks,
> Namhyung

If it's part of host/guest ABI it needs to be better defined.
"It's just a hint does not need to be exact" is too vague,
we need to specify what kind of change will or will not
break guests.

-- 
MST

* Re: [PATCH 1/3] virtio: Basic implementation of virtio pstore driver
  2016-11-15  9:57         ` Paolo Bonzini
@ 2016-11-15 14:36           ` Namhyung Kim
  2016-11-15 14:38             ` Paolo Bonzini
  0 siblings, 1 reply; 31+ messages in thread
From: Namhyung Kim @ 2016-11-15 14:36 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Michael S. Tsirkin, virtio-dev, Tony Luck, Kees Cook, kvm,
	Radim Krčmář,
	Anton Vorontsov, LKML, Steven Rostedt, qemu-devel, Minchan Kim,
	Anthony Liguori, Colin Cross, virtualization, Ingo Molnar

Hi,

On Tue, Nov 15, 2016 at 10:57:29AM +0100, Paolo Bonzini wrote:
> 
> 
> On 15/11/2016 06:06, Michael S. Tsirkin wrote:
> > On Tue, Nov 15, 2016 at 01:50:21PM +0900, Namhyung Kim wrote:
> >> Hi Michael,
> >>
> >> On Thu, Nov 10, 2016 at 06:39:55PM +0200, Michael S. Tsirkin wrote:
> >>> On Sat, Aug 20, 2016 at 05:07:42PM +0900, Namhyung Kim wrote:
> >>>> The virtio pstore driver provides interface to the pstore subsystem so
> >>>> that the guest kernel's log/dump message can be saved on the host
> >>>> machine.  Users can access the log file directly on the host, or on the
> >>>> guest at the next boot using pstore filesystem.  It currently deals with
> >>>> kernel log (printk) buffer only, but we can extend it to have other
> >>>> information (like ftrace dump) later.
> >>>>
> >>>> It supports legacy PCI device using single order-2 page buffer.
> >>>
> >>> Do you mean a legacy virtio device? I don't see why
> >>> you would want to support pre-1.0 mode.
> >>> If you drop that, you can drop all cpu_to_virtio things
> >>> and just use __le accessors.
> >>
> >> I was thinking about the kvmtools which lacks 1.0 support AFAIK.
> > 
> > Unless kvmtools wants to be left behind it has to go 1.0.
> 
> And it also has to go ACPI.  Is there any reason, apart from kvmtool, to
> make a completely new virtio device, with no support in existing guests,
> rather than implement ACPI ERST?

Well, I know nothing about ACPI.  It looks like a huge spec and I
don't want to dig into it just for this.

What I want is to speed up dumping guest kernel messages (especially
the ftrace dump).

Thanks,
Namhyung

* Re: [PATCH 1/3] virtio: Basic implementation of virtio pstore driver
  2016-11-15 14:36           ` Namhyung Kim
@ 2016-11-15 14:38             ` Paolo Bonzini
  2016-11-16  7:04               ` Namhyung Kim
  0 siblings, 1 reply; 31+ messages in thread
From: Paolo Bonzini @ 2016-11-15 14:38 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Michael S. Tsirkin, virtio-dev, Tony Luck, Kees Cook, kvm,
	Radim Krčmář,
	Anton Vorontsov, LKML, Steven Rostedt, qemu-devel, Minchan Kim,
	Anthony Liguori, Colin Cross, virtualization, Ingo Molnar



On 15/11/2016 15:36, Namhyung Kim wrote:
> Hi,
> 
> On Tue, Nov 15, 2016 at 10:57:29AM +0100, Paolo Bonzini wrote:
>>
>>
>> On 15/11/2016 06:06, Michael S. Tsirkin wrote:
>>> On Tue, Nov 15, 2016 at 01:50:21PM +0900, Namhyung Kim wrote:
>>>> Hi Michael,
>>>>
>>>> On Thu, Nov 10, 2016 at 06:39:55PM +0200, Michael S. Tsirkin wrote:
>>>>> On Sat, Aug 20, 2016 at 05:07:42PM +0900, Namhyung Kim wrote:
>>>>>> The virtio pstore driver provides interface to the pstore subsystem so
>>>>>> that the guest kernel's log/dump message can be saved on the host
>>>>>> machine.  Users can access the log file directly on the host, or on the
>>>>>> guest at the next boot using pstore filesystem.  It currently deals with
>>>>>> kernel log (printk) buffer only, but we can extend it to have other
>>>>>> information (like ftrace dump) later.
>>>>>>
>>>>>> It supports legacy PCI device using single order-2 page buffer.
>>>>>
>>>>> Do you mean a legacy virtio device? I don't see why
>>>>> you would want to support pre-1.0 mode.
>>>>> If you drop that, you can drop all cpu_to_virtio things
>>>>> and just use __le accessors.
>>>>
>>>> I was thinking about the kvmtools which lacks 1.0 support AFAIK.
>>>
>>> Unless kvmtools wants to be left behind it has to go 1.0.
>>
>> And it also has to go ACPI.  Is there any reason, apart from kvmtool, to
>> make a completely new virtio device, with no support in existing guests,
>> rather than implement ACPI ERST?
> 
> Well, I know nothing about ACPI.  It looks like a huge spec and I
> don't want to dig into it just for this.

ERST (error record serialization table) is a small subset of the ACPI spec.

Paolo

* Re: [PATCH 2/3] qemu: Implement virtio-pstore device
  2016-11-15  6:23         ` Namhyung Kim
@ 2016-11-15 14:38           ` Michael S. Tsirkin
  0 siblings, 0 replies; 31+ messages in thread
From: Michael S. Tsirkin @ 2016-11-15 14:38 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: virtio-dev, kvm, qemu-devel, virtualization, LKML, Paolo Bonzini,
	Radim Krčmář,
	Anthony Liguori, Anton Vorontsov, Colin Cross, Kees Cook,
	Tony Luck, Steven Rostedt, Ingo Molnar, Minchan Kim,
	Daniel P . Berrange

On Tue, Nov 15, 2016 at 03:23:36PM +0900, Namhyung Kim wrote:
> On Fri, Nov 11, 2016 at 12:50:03AM +0200, Michael S. Tsirkin wrote:
> > On Fri, Sep 16, 2016 at 07:05:47PM +0900, Namhyung Kim wrote:
> > > On Tue, Sep 13, 2016 at 06:57:10PM +0300, Michael S. Tsirkin wrote:
> > > > On Sat, Aug 20, 2016 at 05:07:43PM +0900, Namhyung Kim wrote:
> > > > > +
> > > > > +/* the index should match to the type value */
> > > > > +static const char *virtio_pstore_file_prefix[] = {
> > > > > +    "unknown-",		/* VIRTIO_PSTORE_TYPE_UNKNOWN */
> > > > 
> > > > Is there value in treating everything unexpected as "unknown"
> > > > and rotating them as if they were logs?
> > > > It might be better to treat everything that's not known
> > > > as guest error.
> > > 
> > > I was thinking about the version mismatch between the kernel and qemu.
> > > I'd like to make the device can deal with a new kernel version which
> > > might implement a new pstore message type.  It will be saved as
> > > unknown but the kernel can read it properly later.
> > 
> > Well it'll have a different prefix. E.g. if kernel has
> > two different types they will end up in the same
> > file, hardly what was wanted.
> 
> Right, I think it needs to add 'type' info to the filename for unknown
> type.
> 
> Thanks,
> Namhyung

And that opens all kinds of resource management issues, as the guest
might be able to open a ton of these unexpected types.
So don't try to predict the future: if you add a new type,
you add a feature flag. Ignore or error on things you
cannot handle.
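
A minimal sketch of that rule on the device side, assuming a single
known record type plus one feature-gated type.  The enum values, the
feature bit and the function name are all made up for illustration and
are not taken from the patch:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type numbering, for illustration only. */
enum {
    VIRTIO_PSTORE_TYPE_UNKNOWN = 0,
    VIRTIO_PSTORE_TYPE_DMESG   = 1,
    VIRTIO_PSTORE_TYPE_EXAMPLE = 2,   /* stands in for a future type */
};

/* Hypothetical feature bit that must be negotiated for the new type. */
#define VIRTIO_PSTORE_F_EXAMPLE (1ULL << 0)

/*
 * Return the file name prefix for a record type, or NULL when the
 * request must be rejected as a guest error.  No catch-all "unknown-"
 * bucket exists, so the guest cannot create files for arbitrary types.
 */
static const char *pstore_type_prefix(uint16_t type, uint64_t features)
{
    switch (type) {
    case VIRTIO_PSTORE_TYPE_DMESG:
        return "dmesg-";
    case VIRTIO_PSTORE_TYPE_EXAMPLE:
        /* Only valid once the matching feature has been negotiated. */
        return (features & VIRTIO_PSTORE_F_EXAMPLE) ? "example-" : NULL;
    default:
        return NULL;
    }
}
```

With this shape the set of files the device can create is bounded by
the set of negotiated features rather than by whatever the guest sends.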

-- 
MST

* Re: [PATCH 1/3] virtio: Basic implementation of virtio pstore driver
  2016-11-15 14:38             ` Paolo Bonzini
@ 2016-11-16  7:04               ` Namhyung Kim
  2016-11-16 12:10                 ` Paolo Bonzini
  0 siblings, 1 reply; 31+ messages in thread
From: Namhyung Kim @ 2016-11-16  7:04 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Michael S. Tsirkin, virtio-dev, Tony Luck, Kees Cook, KVM,
	Radim Krčmář,
	Anton Vorontsov, LKML, Steven Rostedt, qemu-devel, Minchan Kim,
	Anthony Liguori, Colin Cross, virtualization, Ingo Molnar

Hi,

On Tue, Nov 15, 2016 at 11:38 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
>
> On 15/11/2016 15:36, Namhyung Kim wrote:
>> Hi,
>>
>> On Tue, Nov 15, 2016 at 10:57:29AM +0100, Paolo Bonzini wrote:
>>>
>>>
>>> On 15/11/2016 06:06, Michael S. Tsirkin wrote:
>>>> On Tue, Nov 15, 2016 at 01:50:21PM +0900, Namhyung Kim wrote:
>>>>> Hi Michael,
>>>>>
>>>>> On Thu, Nov 10, 2016 at 06:39:55PM +0200, Michael S. Tsirkin wrote:
>>>>>> On Sat, Aug 20, 2016 at 05:07:42PM +0900, Namhyung Kim wrote:
>>>>>>> The virtio pstore driver provides interface to the pstore subsystem so
>>>>>>> that the guest kernel's log/dump message can be saved on the host
>>>>>>> machine.  Users can access the log file directly on the host, or on the
>>>>>>> guest at the next boot using pstore filesystem.  It currently deals with
>>>>>>> kernel log (printk) buffer only, but we can extend it to have other
>>>>>>> information (like ftrace dump) later.
>>>>>>>
>>>>>>> It supports legacy PCI device using single order-2 page buffer.
>>>>>>
>>>>>> Do you mean a legacy virtio device? I don't see why
>>>>>> you would want to support pre-1.0 mode.
>>>>>> If you drop that, you can drop all cpu_to_virtio things
>>>>>> and just use __le accessors.
>>>>>
>>>>> I was thinking about the kvmtools which lacks 1.0 support AFAIK.
>>>>
>>>> Unless kvmtools wants to be left behind it has to go 1.0.
>>>
>>> And it also has to go ACPI.  Is there any reason, apart from kvmtool, to
>>> make a completely new virtio device, with no support in existing guests,
>>> rather than implement ACPI ERST?
>>
>> Well, I know nothing about ACPI.  It looks like a huge spec and I
>> don't want to dig into it just for this.
>
> ERST (error record serialization table) is a small subset of the ACPI spec.

Not sure how independent ERST is from ACPI and other specs.  It looks
like it references the UEFI spec at least.  Btw, is the ERST used for
pstore only (in Linux)?

Also I need to control the pstore driver, like using a bigger buffer,
enabling specific message types and so on, if ERST supports that.  Is it
possible for ERST to provide such information?

Thanks,
Namhyung

* Re: [PATCH 1/3] virtio: Basic implementation of virtio pstore driver
  2016-11-16  7:04               ` Namhyung Kim
@ 2016-11-16 12:10                 ` Paolo Bonzini
  2016-11-18  3:32                   ` Namhyung Kim
  0 siblings, 1 reply; 31+ messages in thread
From: Paolo Bonzini @ 2016-11-16 12:10 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Michael S. Tsirkin, virtio-dev, Tony Luck, Kees Cook, KVM,
	Radim Krčmář,
	Anton Vorontsov, LKML, Steven Rostedt, qemu-devel, Minchan Kim,
	Anthony Liguori, Colin Cross, virtualization, Ingo Molnar

> Not sure how independent ERST is from ACPI and other specs.  It looks
> like referencing UEFI spec at least.

It is just the format of error records that comes from the UEFI spec
(include/linux/cper.h) but you can ignore it, I think.  It should be
handled by tools on the host side.  For you, the error log address
range contains a CPER header followed by a binary blob.  In practice,
you only need the record length field (bytes 20-23 of the header),
though it may be a good idea to validate the signature at the beginning
of the header.
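
As a rough sketch of that parsing step (the function and macro names
are invented here; the only facts relied on are the "CPER" signature at
the start of the header and the little-endian record length at bytes
20-23):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CPER_SIG         "CPER"  /* signature, bytes 0-3 of the header */
#define CPER_LEN_OFFSET  20      /* record length, bytes 20-23 */
#define CPER_MIN_HEADER  24      /* just enough to cover the length field */

/*
 * Return the record length found in a CPER header, or 0 if the buffer
 * is too short, the signature does not match, or the advertised length
 * does not fit in the buffer.
 */
static uint32_t cper_record_length(const uint8_t *buf, size_t buf_size)
{
    uint32_t len;

    if (buf_size < CPER_MIN_HEADER)
        return 0;
    if (memcmp(buf, CPER_SIG, 4) != 0)
        return 0;

    /* Assemble the little-endian field byte by byte (no alignment needs). */
    len = (uint32_t)buf[CPER_LEN_OFFSET] |
          ((uint32_t)buf[CPER_LEN_OFFSET + 1] << 8) |
          ((uint32_t)buf[CPER_LEN_OFFSET + 2] << 16) |
          ((uint32_t)buf[CPER_LEN_OFFSET + 3] << 24);

    /* Never trust a guest-supplied length beyond the buffer. */
    if (len < CPER_MIN_HEADER || len > buf_size)
        return 0;
    return len;
}
```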

> Btw, is the ERST used for pstore only (in Linux)?

Yes.  It can store various records, including dmesg and MCE.

There are other examples in QEMU of interfaces with ACPI.  They all use the
DSDT, but the logic is similar.  For example, docs/specs/acpi_mem_hotplug.txt
documents the memory hotplug interface. In all cases, ACPI tables contain small
programs that talk to specialized hardware registers, typically allocated to
hard-coded I/O ports.

In your case, the registers could occupy 16 consecutive I/O ports, like the
following:

     0x00       read/write   operation type (0=write,1=read,2=clear,3=dummy write)

     0x01       read-only    bit 7: if set, operation in progress

                             bit 0-6: operation status, see "Command Status Definition" in
                             the ACPI spec

     0x02       read-only    when read:

                             - read a 64-bit record id from the store to memory,
                               from the address that was last written to 0x08.

                             - if the id is valid and is not the last id in the store,
                               write the next 64-bit record id to the same address

                             - otherwise, write the first record id to the same address,
                               or 0xffffffffffffffff if the store is empty

     0x03                    unused, read as zero

     0x04-0x07  read/write   offset of the error record into the error log address range

     0x08-0x0b  read/write   when read, return number of stored records

                             when written, the written value is a 32-bit memory address,
                             which points to a 64-bit location used to communicate record ids.

     0x0c-0x0f  read/write   when read, always return -1 (together with the "mask" field
                             and READ_REGISTER, this lets ERST instructions return any value!)

                             when written, trigger the pstore operation:

                             - if the current operation is a dummy write, do nothing

                             - if the current operation is a write, write a new record, using
                             the written value as the base of the error log address range.  The
                             length must be parsed from the CPER header.

                             - if the current operation is a clear, read the record id
                             from the memory location that was last written to 0x08 and do the
                             operation.  The value written is ignored.

                             - if the current operation is a read, read the record id from the
                             memory location that was last written to 0x08, using the written
                             value as the base of the error log address range.
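
The layout above could be written down as constants along these lines.
This is only a sketch; every identifier is made up for illustration,
and the offsets are relative to whatever I/O port base ends up being
chosen:

```c
#include <stdbool.h>
#include <stdint.h>

/* Register offsets from the (yet to be chosen) I/O port base. */
enum erst_reg {
    ERST_REG_OP      = 0x00,  /* operation type, read/write */
    ERST_REG_STATUS  = 0x01,  /* busy bit and command status, read-only */
    ERST_REG_NEXT_ID = 0x02,  /* reading advances the record id in memory */
    ERST_REG_OFFSET  = 0x04,  /* record offset in the log range, 32 bits */
    ERST_REG_COUNT   = 0x08,  /* read: record count; write: id address */
    ERST_REG_EXEC    = 0x0c,  /* read: all ones; write: run the operation */
};

/* Values accepted by ERST_REG_OP. */
enum erst_op {
    ERST_OP_WRITE       = 0,
    ERST_OP_READ        = 1,
    ERST_OP_CLEAR       = 2,
    ERST_OP_DUMMY_WRITE = 3,
};

#define ERST_STATUS_BUSY  0x80  /* bit 7: operation in progress */
#define ERST_STATUS_MASK  0x7f  /* bits 0-6: command status */

/* True while the device is still working on the last operation. */
static bool erst_busy(uint8_t status)
{
    return (status & ERST_STATUS_BUSY) != 0;
}

/* Extract the command status code from the status register value. */
static uint8_t erst_status_code(uint8_t status)
{
    return status & ERST_STATUS_MASK;
}
```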

In addition, the firmware will need to reserve a few KB of RAM for the error log
address range (I checked a real system and it reserves 8KB).  The first eight
bytes are needed for the record identifier interface, because there's no such
thing as 64-bit I/O ports, and the rest can be used for the actual buffer.

QEMU already has an interface to allocate RAM and patch the address into an
ACPI table (bios_linker_loader_alloc).  Because this interface is actually meant
to load data from QEMU into the firmware (using the "fw_cfg" interface), you
would have to add a dummy 8KB file to fw_cfg using fw_cfg_add_file (for
example "etc/erst-memory"); it can be just full of zeros.

QEMU supports two chipsets, PIIX and ICH9, and the free I/O port ranges are
different.  You could use 0xa20 for ICH9 and 0xae20 for PIIX.

All in all, the contents of the ERST table would not be very different from a
non-virtual system, except that on real hardware the firmware would use SMIs
as the trap mechanism.  You almost have a one-to-one mapping between ERST
actions and register accesses:

   BEGIN_WRITE_OPERATION                  write value 0 to register at 0x00
   BEGIN_READ_OPERATION                   write value 1 to register at 0x00
   BEGIN_CLEAR_OPERATION                  write value 2 to register at 0x00
   BEGIN_DUMMY_WRITE_OPERATION            write value 3 to register at 0x00
   END_OPERATION                          no-op
   CHECK_BUSY_STATUS                      read register at 0x01 with mask 0x80
   GET_COMMAND_STATUS                     read register at 0x01 with mask 0x7f
   SET_RECORD_OFFSET                      write register at 0x04
   GET_RECORD_COUNT                       read register at 0x08
   EXECUTE_OPERATION                      write ERST memory base + 8 to 0x0c
   GET_ERROR_LOG_ADDRESS_RANGE            read register at 0x0c (with mask = ERST memory base + 8)
   GET_ERROR_LOG_ADDRESS_RANGE_LENGTH     read register at 0x0c (with mask = 8192 - 8 = 8184)
   GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES read register at 0x0c (with mask = 0)
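
To see why a register that always reads as -1 is enough for the last
three rows, here is a small sketch of how a READ_REGISTER instruction is
evaluated (shift by the bit offset, then AND with the mask).  The helper
name and the example base address used below are assumptions, not values
from QEMU:

```c
#include <stdint.h>

/*
 * Model of the value an ERST READ_REGISTER instruction yields: the raw
 * register contents are shifted right by the entry's bit offset and
 * then ANDed with the entry's mask.  bit_offset must be below 64.
 */
static uint64_t erst_read_register(uint64_t raw, unsigned int bit_offset,
                                   uint64_t mask)
{
    return (raw >> bit_offset) & mask;
}
```

Since the raw value is all ones, the result is simply the mask.  For
instance, with a hypothetical ERST memory base of 0x7ffe0000,
erst_read_register(0xffffffff, 0, 0x7ffe0000 + 8) yields 0x7ffe0008,
and a mask of 8184 yields the range length.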

Only the get/set record identifier instructions are a little harder:

   GET_RECORD_IDENTIFIER                  write ERST memory base to register at 0x08
                                          read register at 0x02
                                          read eight bytes at ERST memory base

   SET_RECORD_IDENTIFIER                  write ERST memory base to register at 0x08
                                          write eight bytes at ERST memory base
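
The behaviour behind the read of register 0x02 could look roughly like
this on the device side.  It is only a sketch: the store is modelled as
a plain array of ids, and the function name is hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

#define ERST_ID_NONE UINT64_MAX  /* 0xffffffffffffffff: store is empty */

/*
 * Given the record id currently held in guest memory, return the id
 * the device should write back:
 *  - the next id, if cur is valid and not the last one in the store;
 *  - otherwise the first id in the store;
 *  - ERST_ID_NONE if the store is empty.
 */
static uint64_t erst_next_record_id(const uint64_t *ids, size_t n,
                                    uint64_t cur)
{
    size_t i;

    if (n == 0)
        return ERST_ID_NONE;

    /* Stop before the last entry: the last id wraps around below. */
    for (i = 0; i + 1 < n; i++) {
        if (ids[i] == cur)
            return ids[i + 1];
    }
    return ids[0];
}
```

Wrapping around to the first id lets the guest enumerate every record
by reading register 0x02 repeatedly until it sees an id again.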

On top of this, you need to add the APEI UUID (see apei_osc_setup in Linux)
to build_q35_osc_method, and use "-M q35" when you start QEMU.  If you need
more help just ask.  I or others can help you with the ACPI glue, then you
can write the file backend yourself, based on your existing virtio-pstore code.

> Also I need to control pstore driver like using bigger buffer,
> enabling specific message types and so on if ERST supports.  Is it
> possible for ERST to provide such information?

It's the normal pstore driver, same as on a real server.  What exactly do you
need?

Paolo

* Re: [PATCH 1/3] virtio: Basic implementation of virtio pstore driver
  2016-11-16 12:10                 ` Paolo Bonzini
@ 2016-11-18  3:32                   ` Namhyung Kim
  2016-11-18  4:07                     ` Michael S. Tsirkin
  2016-11-18  9:45                     ` Paolo Bonzini
  0 siblings, 2 replies; 31+ messages in thread
From: Namhyung Kim @ 2016-11-18  3:32 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Michael S. Tsirkin, virtio-dev, Tony Luck, Kees Cook, KVM,
	Radim Krčmář,
	Anton Vorontsov, LKML, Steven Rostedt, qemu-devel, Minchan Kim,
	Anthony Liguori, Colin Cross, virtualization, Ingo Molnar

Hi,

Thanks for your detailed information,

On Wed, Nov 16, 2016 at 07:10:36AM -0500, Paolo Bonzini wrote:
> > Not sure how independent ERST is from ACPI and other specs.  It looks
> > like referencing UEFI spec at least.
> 
> It is just the format of error records that comes from the UEFI spec
> (include/linux/cper.h) but you can ignore it, I think.  It should be
> handled by tools on the host side.  For you, the error log address
> range contains a CPER header followed by a binary blob.  In practice,
> you only need the record length field (bytes 20-23 of the header),
> though it may be a good idea to validate the signature at the beginning
> of the header.
> 
> > Btw, is the ERST used for pstore only (in Linux)?
> 
> Yes.  It can store various records, including dmesg and MCE.
> 
> There are other examples in QEMU of interfaces with ACPI.  They all use the
> DSDT, but the logic is similar.  For example, docs/specs/acpi_mem_hotplug.txt
> documents the memory hotplug interface. In all cases, ACPI tables contain small
> programs that talk to specialized hardware registers, typically allocated to
> hard-coded I/O ports.
> 
> In your case, the registers could occupy 16 consecutive I/O ports, like the
> following:
> 
>      0x00       read/write   operation type (0=write,1=read,2=clear,3=dummy write)
> 
>      0x01       read-only    bit 7: if set, operation in progress
> 
>                              bit 0-6: operation status, see "Command Status Definition" in
>                              the ACPI spec
> 
>      0x02       read-only    when read:
> 
>                              - read a 64-bit record id from the store to memory,
>                                from the address that was last written to 0x08.
> 
>                              - if the id is valid and is not the last id in the store,
>                                write the next 64-bit record id to the same address
> 
>                              - otherwise, write the first record id to the same address,
>                                or 0xffffffffffffffff if the store is empty
> 
>      0x03                    unused, read as zero
> 
>      0x04-0x07  read/write   offset of the error record into the error log address range
> 
>      0x08-0x0b  read/write   when read, return number of stored records
> 
>                              when written, the written value is a 32-bit memory address,
>                              which points to a 64-bit location used to communicate record ids.
> 
>      0x0c-0x0f  read/write   when read, always return -1 (together with the "mask" field
>                              and READ_REGISTER, this lets ERST instructions return any value!)
> 
>                              when written, trigger the pstore operation:
> 
>                              - if the current operation is a dummy write, do nothing
> 
>                              - if the current operation is a write, write a new record, using
>                              the written value as the base of the error log address range.  The
>                              length must be parsed from the CPER header.
> 
>                              - if the current operation is a clear, read the record id
>                              from the memory location that was last written to 0x08 and do the
>                              operation.  the value written is ignored.
> 
>                              - if the current operation is a read, read the record id from the
>                              memory location that was last written to 0x08, using the written
>                              value as the base of the error log address range.
> 
> In addition, the firmware will need to reserve a few KB of RAM for the error log
> address range (I checked a real system and it reserves 8KB).  The first eight
> bytes are needed for the record identifier interface, because there's no such
> thing as 64-bit I/O ports, and the rest can be used for the actual buffer.

Is there a limit on the size?  It'd be great if it could use a few MB.

> 
> QEMU already has an interface to allocate RAM and patch the address into an
> ACPI table (bios_linker_loader_alloc).  Because this interface is actually meant
> to load data from QEMU into the firmware (using the "fw_cfg" interface), you
> would have to add a dummy 8KB file to fw_cfg using fw_cfg_add_file (for
> example "etc/erst-memory"), it can be just full of zeros.
> 
> QEMU supports two chipsets, PIIX and ICH9, and the free I/O port ranges are
> different.  You could use 0xa20 for ICH9 and 0xae20 for PIIX.
> 
> All in all, the contents of the ERST table would not be very different from a
> non-virtual system, except that on real hardware the firmware would use SMIs
> as the trap mechanism.  You almost have a one-to-one mapping between ERST
> actions and register accesses:
> 
>    BEGIN_WRITE_OPERATION                  write value 0 to register at 0x00
>    BEGIN_READ_OPERATION                   write value 1 to register at 0x00
>    BEGIN_CLEAR_OPERATION                  write value 2 to register at 0x00
>    BEGIN_DUMMY_WRITE_OPERATION            write value 3 to register at 0x00
>    END_OPERATION                          no-op
>    CHECK_BUSY_STATUS                      read register at 0x01 with mask 0x80
>    GET_COMMAND_STATUS                     read register at 0x01 with mask 0x7f
>    SET_RECORD_OFFSET                      write register at 0x04
>    GET_RECORD_COUNT                       read register at 0x08
>    EXECUTE_OPERATION                      write ERST memory base + 8 to 0x0c
>    GET_ERROR_LOG_ADDRESS_RANGE            read register at 0x0c (with mask = ERST memory base + 8)
>    GET_ERROR_LOG_ADDRESS_RANGE_LENGTH     read register at 0x0c (with mask = 8192 - 8 = 8184)
>    GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES read register at 0x0c (with mask = 0)
> 
> Only the get/set record identifier instructions are a little harder:
> 
>    GET_RECORD_IDENTIFIER                  write ERST memory base to register at 0x08
>                                           read register at 0x02
>                                           read eight bytes at ERST memory base
> 
>    SET_RECORD_IDENTIFIER                  write ERST memory base to register at 0x08
>                                           write eight bytes at ERST memory base
> 
> On top of this, you need to add the APEI UUID (see apei_osc_setup in Linux)
> to build_q35_osc_method, and use "-M q35" when you start QEMU.  If you need
> more help just ask.  I or others can help you with the ACPI glue, then you
> can write the file backend yourself, based on your existing virtio-pstore code.
> 
> > Also I need to control pstore driver like using bigger buffer,
> > enabling specific message types and so on if ERST supports.  Is it
> > possible for ERST to provide such information?
> 
> It's the normal pstore driver, same as on a real server.  What exactly do you
> need?

Well, I don't want to send additional pstore messages to the device if
it cannot handle them properly - for example, an ftrace message should
not overwrite a kmsg dump.  It'd be great if the device could somehow
expose the acceptable message types to the driver IMHO.

Btw I prefer using kvmtool for my kernel work since it's much
simpler.

Thanks,
Namhyung

* Re: [PATCH 1/3] virtio: Basic implementation of virtio pstore driver
  2016-11-18  3:32                   ` Namhyung Kim
@ 2016-11-18  4:07                     ` Michael S. Tsirkin
  2016-11-18  9:46                       ` [virtio-dev] " Paolo Bonzini
  2016-11-18  9:45                     ` Paolo Bonzini
  1 sibling, 1 reply; 31+ messages in thread
From: Michael S. Tsirkin @ 2016-11-18  4:07 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Paolo Bonzini, virtio-dev, Tony Luck, Kees Cook, KVM,
	Radim Krčmář,
	Anton Vorontsov, LKML, Steven Rostedt, qemu-devel, Minchan Kim,
	Anthony Liguori, Colin Cross, virtualization, Ingo Molnar

On Fri, Nov 18, 2016 at 12:32:06PM +0900, Namhyung Kim wrote:
> Btw I prefer using the kvmtool for my kernel work since it's much more
> simpler..
> 
> Thanks,
> Namhyung

Up to you but then you should extend that to support 1.0 spec.
I strongly object to adding to the list of legacy interfaces
we need to maintain.

-- 
MST

* Re: [PATCH 1/3] virtio: Basic implementation of virtio pstore driver
  2016-11-18  3:32                   ` Namhyung Kim
  2016-11-18  4:07                     ` Michael S. Tsirkin
@ 2016-11-18  9:45                     ` Paolo Bonzini
  1 sibling, 0 replies; 31+ messages in thread
From: Paolo Bonzini @ 2016-11-18  9:45 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Michael S. Tsirkin, virtio-dev, Tony Luck, Kees Cook, KVM,
	Radim Krčmář,
	Anton Vorontsov, LKML, Steven Rostedt, qemu-devel, Minchan Kim,
	Anthony Liguori, Colin Cross, virtualization, Ingo Molnar



On 18/11/2016 04:32, Namhyung Kim wrote:
>> In addition, the firmware will need to reserve a few KB of RAM for the error log
>> address range (I checked a real system and it reserves 8KB).  The first eight
>> bytes are needed for the record identifier interface, because there's no such
>> thing as 64-bit I/O ports, and the rest can be used for the actual buffer.
> 
> Is there a limit on the size?  It'd be great if it can use a few MB..

Yes, you can make it customizable.

>>> Also I need to control pstore driver like using bigger buffer,
>>> enabling specific message types and so on if ERST supports.  Is it
>>> possible for ERST to provide such information?
>>
>> It's the normal pstore driver, same as on a real server.  What exactly do you
>> need?
> 
> Well, I don't want to send additional pstore messages to the device if
> it cannot handle them properly - for example, ftrace message should not
> overwrite kmsg dump.  It'd be great if device somehow could expose
> acceptable message types to the driver IMHO.

This is something that you have to do in the usual kernel pstore
infrastructure.  It should not be specific to virtualization.

Paolo

> Btw I prefer using the kvmtool for my kernel work since it's much more
> simpler..

* Re: [virtio-dev] Re: [PATCH 1/3] virtio: Basic implementation of virtio pstore driver
  2016-11-18  4:07                     ` Michael S. Tsirkin
@ 2016-11-18  9:46                       ` Paolo Bonzini
  0 siblings, 0 replies; 31+ messages in thread
From: Paolo Bonzini @ 2016-11-18  9:46 UTC (permalink / raw)
  To: Michael S. Tsirkin, Namhyung Kim
  Cc: virtio-dev, Tony Luck, Kees Cook, KVM,
	Radim Krčmář,
	Anton Vorontsov, LKML, Steven Rostedt, qemu-devel, Minchan Kim,
	Anthony Liguori, Colin Cross, virtualization, Ingo Molnar



On 18/11/2016 05:07, Michael S. Tsirkin wrote:
> On Fri, Nov 18, 2016 at 12:32:06PM +0900, Namhyung Kim wrote:
>> Btw I prefer using the kvmtool for my kernel work since it's much more
>> simpler..
> 
> Up to you but then you should extend that to support 1.0 spec.
> I strongly object to adding to the list of legacy interfaces
> we need to maintain.

I object to adding paravirtualization unless there is a good reason why
the usual mechanisms for physical machines cannot be used.  The cost of
maintaining a spec, two device implementations (kvmtool+qemu) and a
driver is not small, plus it will not work on older kernels.

Paolo

Thread overview: 31+ messages
2016-08-20  8:07 [RFC/PATCHSET 0/3] virtio: Implement virtio pstore device (v3) Namhyung Kim
2016-08-20  8:07 ` [PATCH 1/3] virtio: Basic implementation of virtio pstore driver Namhyung Kim
2016-09-13 15:19   ` Michael S. Tsirkin
2016-09-16  9:05     ` Namhyung Kim
2016-11-10 16:39   ` Michael S. Tsirkin
2016-11-15  4:50     ` Namhyung Kim
2016-11-15  5:06       ` Michael S. Tsirkin
2016-11-15  5:50         ` Namhyung Kim
2016-11-15 14:35           ` Michael S. Tsirkin
2016-11-15  9:57         ` Paolo Bonzini
2016-11-15 14:36           ` Namhyung Kim
2016-11-15 14:38             ` Paolo Bonzini
2016-11-16  7:04               ` Namhyung Kim
2016-11-16 12:10                 ` Paolo Bonzini
2016-11-18  3:32                   ` Namhyung Kim
2016-11-18  4:07                     ` Michael S. Tsirkin
2016-11-18  9:46                       ` [virtio-dev] " Paolo Bonzini
2016-11-18  9:45                     ` Paolo Bonzini
2016-08-20  8:07 ` [PATCH 2/3] qemu: Implement virtio-pstore device Namhyung Kim
2016-08-24 22:00   ` Daniel P. Berrange
2016-08-26  4:48     ` Namhyung Kim
2016-08-26 12:27       ` Daniel P. Berrange
2016-09-13 15:57   ` Michael S. Tsirkin
2016-09-16 10:05     ` Namhyung Kim
2016-11-10 22:50       ` Michael S. Tsirkin
2016-11-15  6:23         ` Namhyung Kim
2016-11-15 14:38           ` Michael S. Tsirkin
2016-08-20  8:07 ` [PATCH 3/3] kvmtool: " Namhyung Kim
2016-08-23 10:25 ` [RFC/PATCHSET 0/3] virtio: Implement virtio pstore device (v3) Joel Fernandes
2016-08-23 15:20   ` Namhyung Kim
2016-08-24  7:10     ` Joel
