linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Namhyung Kim <namhyung@kernel.org>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: kvm@vger.kernel.org, qemu-devel@nongnu.org,
	virtualization@lists.linux-foundation.org,
	LKML <linux-kernel@vger.kernel.org>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Radim Krčmář" <rkrcmar@redhat.com>,
	"Anthony Liguori" <aliguori@amazon.com>,
	"Anton Vorontsov" <anton@enomsg.org>,
	"Colin Cross" <ccross@android.com>,
	"Kees Cook" <keescook@chromium.org>,
	"Tony Luck" <tony.luck@intel.com>,
	"Steven Rostedt" <rostedt@goodmis.org>,
	"Ingo Molnar" <mingo@kernel.org>,
	"Minchan Kim" <minchan@kernel.org>
Subject: Re: [PATCH 6/7] qemu: Implement virtio-pstore device
Date: Thu, 28 Jul 2016 14:39:53 +0900	[thread overview]
Message-ID: <20160728053953.GB4371@sejong> (raw)
In-Reply-To: <20160728023254-mutt-send-email-mst@kernel.org>

On Thu, Jul 28, 2016 at 03:02:54AM +0300, Michael S. Tsirkin wrote:
> On Thu, Jul 28, 2016 at 12:08:30AM +0900, Namhyung Kim wrote:
> > Add virtio pstore device to allow kernel log files saved on the host.
> > It will save the log files on the directory given by pstore device
> > option.
> > 
> >   $ qemu-system-x86_64 -device virtio-pstore,directory=dir-xx ...
> > 
> >   (guest) # echo c > /proc/sysrq-trigger
> 
> So if the point is handling system crashes, I suspect
> virtio is the wrong protocol to use. ATM it's rather
> elaborate for performance, it's likely not to work when
> linux is crashing. I think you want something very very
> simple that will still work when you crash. Like maybe
> a serial device?

Well, I dont' know.  As you know, the kernel oops dump is already sent
to serial device but it's rather slow.  As I wrote in the cover
letter, enabling ftrace_dump_on_oops makes it even worse..  Also
pstore saves the (compressed) binary data so I thought it'd be better
to have a dedicated IO channel.

I know that we can't rely on anything in kernel when the kernel is
crashing.  In the virtualization environment, I think it'd be great if
it can dump the crash data in the host directly so I chose the virtio.
I thought the virtio is a relatively thin layer and independent from
other kernel features.  Even if it doesn't guarantee to work 100%,
it's worth trying IMHO.


> 
> >   $ ls dir-xx
> >   dmesg-1.enc.z  dmesg-2.enc.z
> > 
> > The log files are usually compressed using zlib.  Users can see the log
> > messages directly on the host or on the guest (using pstore filesystem).
> 
> So this lacks all management tools that a regular storage device
> has, and these are necessary if people are to use this
> in production.
> 
> For example, some kind of provision for limiting the amount of host disk
> this can consume seems called for. Rate-limiting disk writes
> on host also seems necessary. Handling host disk errors always
> propagates error to guest but an option to e.g. stop vm might
> be useful to avoid corruption.

Yes, it needs that kind of management.  I'd like to look at the
existing implementation.  But basically I thought it as a debugging
feature and it won't produce much data for the default setting.  Maybe
I can limit the file size not to exceed the buffer size and keep up to
pre-configured number of files for each type.  Now host disk error
will be propagated to guest.


> 
> > 
> > The 'directory' property is required for virtio-pstore device to work.
> > It also adds 'bufsize' and 'console' (boolean) properties.
> 
> No idea what these do. Seem to be RW by guest but have no effect
> otherwise.

The 'directory' is a pathname which it saves the data files.  The
'bufsize' sets the size of buffer used by pstore.  The 'console'
enables saving pstore console type data.


> 
> 
> > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > Cc: Radim Krčmář <rkrcmar@redhat.com>
> > Cc: "Michael S. Tsirkin" <mst@redhat.com>
> > Cc: Anthony Liguori <aliguori@amazon.com>
> > Cc: Anton Vorontsov <anton@enomsg.org>
> > Cc: Colin Cross <ccross@android.com>
> > Cc: Kees Cook <keescook@chromium.org>
> > Cc: Tony Luck <tony.luck@intel.com>
> > Cc: Steven Rostedt <rostedt@goodmis.org>
> > Cc: Ingo Molnar <mingo@kernel.org>
> > Cc: Minchan Kim <minchan@kernel.org>
> > Cc: kvm@vger.kernel.org
> > Cc: qemu-devel@nongnu.org
> > Cc: virtualization@lists.linux-foundation.org
> > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> > ---

[SNIP]
> > diff --git a/hw/virtio/virtio-pstore.c b/hw/virtio/virtio-pstore.c
> > new file mode 100644
> > index 0000000..2ca7786
> > --- /dev/null
> > +++ b/hw/virtio/virtio-pstore.c
> > @@ -0,0 +1,477 @@
> > +/*
> > + * Virtio Pstore Device
> > + *
> > + * Copyright (C) 2016  LG Electronics
> > + *
> > + * Authors:
> > + *  Namhyung Kim  <namhyung@gmail.com>
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2.
> 
> We generally ask new code to be v2 or later.

Ok, will update.


> 
> >  See
> > + * the COPYING file in the top-level directory.
> > + *
> > + */
> > +
> > +#include <stdio.h>
> > +
> > +#include "qemu/osdep.h"
> > +#include "qemu/iov.h"
> > +#include "qemu-common.h"
> > +#include "qemu/cutils.h"
> > +#include "qemu/error-report.h"
> > +#include "sysemu/kvm.h"
> > +#include "qapi/visitor.h"
> > +#include "qapi-event.h"
> > +#include "trace.h"
> > +
> > +#include "hw/virtio/virtio.h"
> > +#include "hw/virtio/virtio-bus.h"
> > +#include "hw/virtio/virtio-access.h"
> > +#include "hw/virtio/virtio-pstore.h"
> > +
> > +
> > +static void virtio_pstore_to_filename(VirtIOPstore *s, char *buf, size_t sz,
> > +                                      struct virtio_pstore_req *req)
> > +{
> > +    const char *basename;
> > +    unsigned long long id = 0;
> > +    unsigned int flags = le32_to_cpu(req->flags);
> > +
> > +    switch (le16_to_cpu(req->type)) {
> > +    case VIRTIO_PSTORE_TYPE_DMESG:
> > +        basename = "dmesg";
> > +        id = s->id++;
> > +        break;
> > +    case VIRTIO_PSTORE_TYPE_CONSOLE:
> > +        basename = "console";
> > +        if (s->console_id) {
> > +            id = s->console_id;
> > +        } else {
> > +            id = s->console_id = s->id++;
> > +        }
> > +        break;
> > +    default:
> > +        basename = "unknown";
> > +        break;
> > +    }
> > +
> > +    snprintf(buf, sz, "%s/%s-%llu%s", s->directory, basename, id,
> > +             flags & VIRTIO_PSTORE_FL_COMPRESSED ? ".enc.z" : "");
> > +}
> > +
> > +static void virtio_pstore_from_filename(VirtIOPstore *s, char *name,
> > +                                        char *buf, size_t sz,
> > +                                        struct virtio_pstore_fileinfo *info)
> > +{
> > +    snprintf(buf, sz, "%s/%s", s->directory, name);
> 
> if this does not fit, buf will not be 0 terminated and can
> generally be corrupted.

Will change it to use malloc instead.

> 
> 
> > +
> > +    if (g_str_has_prefix(name, "dmesg-")) {
> > +        info->type = VIRTIO_PSTORE_TYPE_DMESG;
> > +        name += strlen("dmesg-");
> > +    } else if (g_str_has_prefix(name, "console-")) {
> > +        info->type = VIRTIO_PSTORE_TYPE_CONSOLE;
> > +        name += strlen("console-");
> > +    } else if (g_str_has_prefix(name, "unknown-")) {
> > +        info->type = VIRTIO_PSTORE_TYPE_UNKNOWN;
> > +        name += strlen("unknown-");
> > +    }
> > +
> > +    qemu_strtoull(name, NULL, 0, &info->id);
> > +
> > +    info->flags = 0;
> > +    if (g_str_has_suffix(name, ".enc.z")) {
> > +        info->flags |= VIRTIO_PSTORE_FL_COMPRESSED;
> > +    }
> > +}

[SNIP]
> > +static ssize_t virtio_pstore_do_write(VirtIOPstore *s, struct iovec *out_sg,
> > +                                      unsigned int out_num,
> > +                                      struct virtio_pstore_req *req)
> > +{
> > +    char path[PATH_MAX];
> > +    int fd;
> > +    ssize_t len;
> > +    unsigned short type;
> > +    int flags = O_WRONLY | O_CREAT;
> > +
> > +    /* we already consume the req */
> > +    iov_discard_front(&out_sg, &out_num, sizeof(*req));
> > +
> > +    virtio_pstore_to_filename(s, path, sizeof(path), req);
> > +
> > +    type = le16_to_cpu(req->type);
> > +
> > +    if (type == VIRTIO_PSTORE_TYPE_DMESG) {
> > +        flags |= O_TRUNC;
> > +    } else if (type == VIRTIO_PSTORE_TYPE_CONSOLE) {
> > +        flags |= O_APPEND;
> > +    }
> > +
> > +    fd = open(path, flags, 0644);
> > +    if (fd < 0) {
> > +        error_report("cannot open %s", path);
> > +        return -1;
> > +    }
> > +    len = writev(fd, out_sg, out_num);
> > +    close(fd);
> > +
> > +    return len;
> 
> All this is blocking VM until host io completes.

Hmm.. I don't know about the internals of qemu.  So does it make guest
stop?  If so, that's what I want to do for _DMESG. :)  As it's called
only on kernel oops I think it's admittable.  But for _CONSOLE, it
needs to do asynchronously.  Maybe I can add a thread to do the work.


> 
> 
> > +}
> > +
> > +static ssize_t virtio_pstore_do_close(VirtIOPstore *s)
> > +{
> > +    if (s->dirp == NULL) {
> > +        return 0;
> > +    }
> > +
> > +    closedir(s->dirp);
> > +    s->dirp = NULL;
> > +
> > +    return 0;
> > +}
> > +
> > +static ssize_t virtio_pstore_do_erase(VirtIOPstore *s,
> > +                                      struct virtio_pstore_req *req)
> > +{
> > +    char path[PATH_MAX];
> > +
> > +    virtio_pstore_to_filename(s, path, sizeof(path), req);
> > +
> > +    return unlink(path);
> > +}
> > +
> > +static void virtio_pstore_handle_io(VirtIODevice *vdev, VirtQueue *vq)
> > +{
> > +    VirtIOPstore *s = VIRTIO_PSTORE(vdev);
> > +    VirtQueueElement *elem;
> > +    struct virtio_pstore_req req;
> > +    struct virtio_pstore_res res;
> > +    ssize_t len = 0;
> > +    int ret;
> > +
> > +    for (;;) {
> > +        elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
> > +        if (!elem) {
> > +            return;
> > +        }
> > +
> > +        if (elem->out_num < 1 || elem->in_num < 1) {
> > +            error_report("request or response buffer is missing");
> > +            exit(1);
> > +        }
> > +
> > +        len = iov_to_buf(elem->out_sg, elem->out_num, 0, &req, sizeof(req));
> > +        if (len != (ssize_t)sizeof(req)) {
> > +            error_report("invalid request size: %ld", (long)len);
> > +            exit(1);
> > +        }
> > +        res.cmd  = req.cmd;
> > +        res.type = req.type;
> > +
> > +        switch (le16_to_cpu(req.cmd)) {
> > +        case VIRTIO_PSTORE_CMD_OPEN:
> > +            ret = virtio_pstore_do_open(s);
> > +            break;
> > +        case VIRTIO_PSTORE_CMD_READ:
> > +            ret = virtio_pstore_do_read(s, elem->in_sg, elem->in_num, &res);
> > +            if (ret > 0) {
> > +                len = ret;
> > +                ret = 0;
> > +            }
> > +            break;
> > +        case VIRTIO_PSTORE_CMD_WRITE:
> > +            ret = virtio_pstore_do_write(s, elem->out_sg, elem->out_num, &req);
> > +            break;
> > +        case VIRTIO_PSTORE_CMD_CLOSE:
> > +            ret = virtio_pstore_do_close(s);
> > +            break;
> > +        case VIRTIO_PSTORE_CMD_ERASE:
> > +            ret = virtio_pstore_do_erase(s, &req);
> > +            break;
> > +        default:
> > +            ret = -1;
> > +            break;
> > +        }
> > +
> > +        res.ret  = ret;
> > +
> > +        iov_from_buf(elem->in_sg, elem->in_num, 0, &res, sizeof(res));
> > +        virtqueue_push(vq, elem, sizeof(res) + len);
> 
> this is wrong - len should be # of bytes written into guest memory.

???

All command except READ only write the virtio_pstore_ret so len is 0.
For READ, virtio_pstore_do_read() returns the actual bytes it wrote
(minus sizeof(res) of course), so I think it's fine, no?

Anyway, thanks for your review!

Thanks,
Namhyung


> 
> > +
> > +        virtio_notify(vdev, vq);
> > +        g_free(elem);
> > +
> > +        if (ret < 0) {
> > +            return;
> > +        }
> > +    }
> > +}

  reply	other threads:[~2016-07-28  5:40 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-07-27 15:08 [RFC/PATCHSET 0/7] virtio: Implement virtio pstore device (v2) Namhyung Kim
2016-07-27 15:08 ` [PATCH 1/7] pstore: Split pstore fragile flags Namhyung Kim
2016-07-27 15:08 ` [PATCH 2/7] pstore/ram: Set pstore flags dynamically Namhyung Kim
2016-07-27 15:08 ` [PATCH 3/7] pstore: Manage buffer position for async write Namhyung Kim
2016-07-27 15:08 ` [PATCH 4/7] virtio: Basic implementation of virtio pstore driver Namhyung Kim
2016-07-27 15:08 ` [PATCH 5/7] virtio-pstore: Support PSTORE_TYPE_CONSOLE Namhyung Kim
2016-07-27 15:08 ` [PATCH 6/7] qemu: Implement virtio-pstore device Namhyung Kim
2016-07-28  0:02   ` Michael S. Tsirkin
2016-07-28  5:39     ` Namhyung Kim [this message]
2016-07-28 12:56       ` Stefan Hajnoczi
2016-07-28 13:08         ` [Qemu-devel] " Daniel P. Berrange
2016-07-30  8:38           ` Namhyung Kim
2016-08-01  9:21             ` Daniel P. Berrange
2016-08-01 15:34               ` Namhyung Kim
2016-07-28 14:38       ` Steven Rostedt
2016-07-28 13:22   ` [Qemu-devel] " Daniel P. Berrange
2016-07-30  8:57     ` Namhyung Kim
2016-08-01  9:24       ` Daniel P. Berrange
2016-08-01 15:52         ` Namhyung Kim
2016-07-27 15:08 ` [PATCH 7/7] kvmtool: " Namhyung Kim
2016-07-27 22:18 ` [RFC/PATCHSET 0/7] virtio: Implement virtio pstore device (v2) Michael S. Tsirkin
2016-07-28  2:46   ` Namhyung Kim

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160728053953.GB4371@sejong \
    --to=namhyung@kernel.org \
    --cc=aliguori@amazon.com \
    --cc=anton@enomsg.org \
    --cc=ccross@android.com \
    --cc=keescook@chromium.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=minchan@kernel.org \
    --cc=mingo@kernel.org \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=rkrcmar@redhat.com \
    --cc=rostedt@goodmis.org \
    --cc=tony.luck@intel.com \
    --cc=virtualization@lists.linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).