From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:59583)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1csle3-0004BK-Pi for qemu-devel@nongnu.org;
	Tue, 28 Mar 2017 03:34:25 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1csle0-0007Q1-LM for qemu-devel@nongnu.org;
	Tue, 28 Mar 2017 03:34:23 -0400
Received: from mx0a-001b2d01.pphosted.com ([148.163.156.1]:49362)
	by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71)
	(envelope-from ) id 1csle0-0007Px-Bv for qemu-devel@nongnu.org;
	Tue, 28 Mar 2017 03:34:20 -0400
Received: from pps.filterd (m0098396.ppops.net [127.0.0.1])
	by mx0a-001b2d01.pphosted.com (8.16.0.20/8.16.0.20) with SMTP
	id v2S7Vosf145785 for ; Tue, 28 Mar 2017 03:34:18 -0400
Received: from e06smtp14.uk.ibm.com (e06smtp14.uk.ibm.com [195.75.94.110])
	by mx0a-001b2d01.pphosted.com with ESMTP id 29fkh1g5xe-1
	(version=TLSv1.2 cipher=AES256-SHA bits=256 verify=NOT)
	for ; Tue, 28 Mar 2017 03:34:18 -0400
Received: from localhost by e06smtp14.uk.ibm.com with IBM ESMTP SMTP Gateway:
	Authorized Use Only! Violators will be prosecuted for from ;
	Tue, 28 Mar 2017 08:34:15 +0100
Date: Tue, 28 Mar 2017 09:34:11 +0200
From: Cornelia Huck
In-Reply-To: <20170327211728-mutt-send-email-mst@kernel.org>
References: <149063674781.4447.14258971700726134711.stgit@bahia.lan>
	<149063676337.4447.2095575576822297032.stgit@bahia.lan>
	<20170327211728-mutt-send-email-mst@kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Message-Id: <20170328093411.7535b59f.cornelia.huck@de.ibm.com>
Subject: Re: [Qemu-devel] [PATCH 1/5] virtio: Error object based virtio_error()
To: "Michael S. Tsirkin"
Cc: Greg Kurz , Stefano Stabellini , qemu-devel@nongnu.org

On Mon, 27 Mar 2017 21:20:56 +0300
"Michael S.
Tsirkin" wrote:

> On Mon, Mar 27, 2017 at 07:46:03PM +0200, Greg Kurz wrote:
> > This introduces an Error object based implementation of virtio_error(). It
> > allows to implement virtio_error() wrappers in device-specific code.
> >
> > Signed-off-by: Greg Kurz
> > ---
> >  hw/virtio/virtio.c         | 21 ++++++++++++++++-----
> >  include/hw/virtio/virtio.h | 1 +
> >  2 files changed, 17 insertions(+), 5 deletions(-)
> >
> > diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> > index 03592c542a55..4036f4816038 100644
> > --- a/hw/virtio/virtio.c
> > +++ b/hw/virtio/virtio.c
> > @@ -2443,6 +2443,16 @@ void virtio_device_set_child_bus_name(VirtIODevice *vdev, char *bus_name)
> >      vdev->bus_name = g_strdup(bus_name);
> >  }
> >
> > +static void virtio_device_set_broken(VirtIODevice *vdev)
> > +{
> > +    vdev->broken = true;
> > +
> > +    if (virtio_vdev_has_feature(vdev, VIRTIO_F_VERSION_1)) {
> > +        virtio_set_status(vdev, vdev->status | VIRTIO_CONFIG_S_NEEDS_RESET);
> > +        virtio_notify_config(vdev);
> > +    }
> > +}
> > +
> >  void GCC_FMT_ATTR(2, 3) virtio_error(VirtIODevice *vdev, const char *fmt, ...)
> >  {
> >      va_list ap;
>
> It's worth pondering whether we can set this for versions < 1.0 too.

I'm a bit torn there. In theory, setting an unknown status bit should
not really do harm; but we can't be sure that there aren't legacy
drivers out there that will crash when they notice an unknown status
bit, and I'm not sure we want that.

> >
> > @@ -2451,12 +2461,13 @@ void GCC_FMT_ATTR(2, 3) virtio_error(VirtIODevice *vdev, const char *fmt, ...)
> >      error_vreport(fmt, ap);
> >      va_end(ap);
> >
> > -    vdev->broken = true;
> > +    virtio_device_set_broken(vdev);
> > +}
> >
> > -    if (virtio_vdev_has_feature(vdev, VIRTIO_F_VERSION_1)) {
> > -        virtio_set_status(vdev, vdev->status | VIRTIO_CONFIG_S_NEEDS_RESET);
> > -        virtio_notify_config(vdev);
> > -    }
> > +void virtio_error_err(VirtIODevice *vdev, Error *err)
> > +{
> > +    error_report_err(err);
> > +    virtio_device_set_broken(vdev);
> >  }
> >
> >  static void virtio_memory_listener_commit(MemoryListener *listener)
>
> Should this skip error report if device is already broken?
> Otherwise we'll get a ton of errors in the log.

One would hope that qemu stops processing broken devices, but a check
might be better.

>
> Also, whether to stop the device, or the VM, or just warn,
> seems like a policy decision. Why not set it on command line
> like we do for other storage?

I would trust the device implementation to make the decision: Can we
recover, can we start using the device again after a reset, or are we
so broken that we want to terminate the vm? Note that all of this
already applies to the existing virtio_error(); I think we should
discuss this independently of this patch.
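
To make the "skip the report if the device is already broken" idea
concrete, here is a rough standalone sketch. It is not QEMU code and not
part of the patch: the struct, the feature helper and the constants are
simplified stand-ins so it compiles on its own, and virtio_error_once()
and the reports counter are hypothetical names used only for
illustration.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Simplified stand-ins for the QEMU definitions; values follow virtio 1.0. */
#define VIRTIO_CONFIG_S_NEEDS_RESET 0x40
#define VIRTIO_F_VERSION_1          32

typedef struct VirtIODevice {
    bool broken;
    uint8_t status;
    uint64_t guest_features;
    int reports;            /* hypothetical counter, only for this sketch */
} VirtIODevice;

static bool virtio_vdev_has_feature(const VirtIODevice *vdev, unsigned fbit)
{
    return (vdev->guest_features & (1ULL << fbit)) != 0;
}

static void virtio_device_set_broken(VirtIODevice *vdev)
{
    vdev->broken = true;

    if (virtio_vdev_has_feature(vdev, VIRTIO_F_VERSION_1)) {
        /* The real code goes through virtio_set_status() and then
         * notifies the guest with virtio_notify_config(). */
        vdev->status |= VIRTIO_CONFIG_S_NEEDS_RESET;
    }
}

/* Hypothetical variant: only the first error on a device is reported. */
static void virtio_error_once(VirtIODevice *vdev, const char *msg)
{
    if (vdev->broken) {
        return;             /* already marked broken, keep the log quiet */
    }

    fprintf(stderr, "%s\n", msg);
    vdev->reports++;
    virtio_device_set_broken(vdev);
}
```

With this, calling virtio_error_once() twice on the same device prints a
single message, and a device that has not negotiated VIRTIO_F_VERSION_1
is marked broken without touching its status field.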