From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:44736) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YkR52-00066p-4S for qemu-devel@nongnu.org; Tue, 21 Apr 2015 01:50:45 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1YkR4y-0002uo-Qr for qemu-devel@nongnu.org; Tue, 21 Apr 2015 01:50:44 -0400 Received: from mx1.redhat.com ([209.132.183.28]:43414) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YkR4y-0002uY-JF for qemu-devel@nongnu.org; Tue, 21 Apr 2015 01:50:40 -0400 Date: Tue, 21 Apr 2015 13:50:33 +0800 From: Fam Zheng Message-ID: <20150421055033.GB21030@fam-t430.nay.redhat.com> References: <1429257573-7359-1-git-send-email-famz@redhat.com> <20150420175905-mutt-send-email-mst@redhat.com> <20150421023700.GC8048@fam-t430.nay.redhat.com> <20150421070941-mutt-send-email-mst@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20150421070941-mutt-send-email-mst@redhat.com> Content-Transfer-Encoding: quoted-printable Subject: Re: [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET" List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: "Michael S. Tsirkin" Cc: Kevin Wolf , Rusty Russell , qemu-devel@nongnu.org, virtualization@lists.linux-foundation.org, "Aneesh Kumar K.V" , Stefan Hajnoczi , Amit Shah , Paolo Bonzini On Tue, 04/21 07:22, Michael S. Tsirkin wrote: > On Tue, Apr 21, 2015 at 10:37:00AM +0800, Fam Zheng wrote: > > On Mon, 04/20 19:36, Michael S. Tsirkin wrote: > > > On Fri, Apr 17, 2015 at 03:59:15PM +0800, Fam Zheng wrote: > > > > Currently, virtio code chooses to kill QEMU if the guest passes a= ny invalid > > > > data with vring. > > > > That has drawbacks such as losing unsaved data (e.g. when > > > > guest user is writing a very long email), or possible denial of s= ervice in > > > > a nested vm use case where virtio device is passed through. > > > >=20 > > > > virtio-1 has introduced a new status bit "NEEDS RESET" which coul= d be used to > > > > improve this by communicating the error state between virtio devi= ces and > > > > drivers. The device notifies guest upon setting the bit, then the= guest driver > > > > should detect this bit and report to userspace, or recover the de= vice by > > > > resetting it. > > >=20 > > > Unfortunately, virtio 1 spec does not have a conformance statement > > > that requires driver to recover. We merely have a non-normative loo= king > > > text: > > > Note: For example, the driver can=E2=80=99t assume requests in fli= ght > > > will be completed if DEVICE_NEEDS_RESET is set, nor can it assume = that > > > they have not been completed. A good implementation will try to re= cover > > > by issuing a reset. > > >=20 > > > Implementing this reset for all devices in a race-free manner might= also > > > be far from trivial. I think we'd need a feature bit for this. > > > OTOH as long as we make this a new feature, would an ability to > > > reset a single VQ be a better match for what you are trying to > > > achieve? > >=20 > > I think that is too complicated as a recovery measure, a device level= resetting > > will be better to get to a deterministic state, at least. >=20 > Question would be, how hard is it to stop host from using all queues, > retrieve all host OS state and re-program it into the device. > If we need to shadow all OS state within the driver, then that's a lot > of not well tested code with a possibility of introducing more bugs. I don't understand the question. In this series the virtio-blk device wil= l not pop any more requests, and as long as the reset is properly handled, both= guest and host should go back to a good state. >=20 > > >=20 > > > > This series makes necessary changes in virtio core code, based on= which > > > > virtio-blk is converted. Other devices now keep the existing beha= vior by > > > > passing in "error_abort". They will be converted in following ser= ies. The Linux > > > > driver part will also be worked on. > > > >=20 > > > > One concern with this behavior change is that it's now harder to = notice the > > > > actual driver bug that caused the error, as the guest continues t= o run. To > > > > address that, we could probably add a new error action option to = virtio > > > > devices, similar to the "read/write werror" in block layer, so t= he vm could be > > > > paused and the management will get an event in QMP like pvpanic. = This work can > > > > be done on top. > > >=20 > > > At the architectural level, that's only one concern. Others would b= e > > > - workloads such as openstack handle guest crash better than > > > a guest that's e.g. slow because of a memory leak > >=20 > > What memory leak are you referring to? >=20 > That was just an example. If host detects a malformed ring, it will > crash. But often it doesn't, result is buffers not being used, so gues= t > can't free them up. >=20 > > > - it's easier for guests to probe host for security issues > > > if guest isn't killed > > > - guest can flood host log with guest-triggered errors > >=20 > > We can still abort() if guest is triggering error too quickly. >=20 >=20 > Absolutely, and if it looked like I'm against error detection and > recovery, this was not my intent. >=20 > I am merely saying we can't apply this patchset as is, deferring > addressing the issues to patches on top. >=20 > But I have an idea: refactor the code to use error_abort.=20 That is patch 1-9 of this series. Or do you mean also refactor and pass error_abort to the memory core? Fam >This way we > can apply the patchset without making functional changes, and you can > make progress to complete this, on top. From mboxrd@z Thu Jan 1 00:00:00 1970 From: Fam Zheng Subject: Re: [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET" Date: Tue, 21 Apr 2015 13:50:33 +0800 Message-ID: <20150421055033.GB21030@fam-t430.nay.redhat.com> References: <1429257573-7359-1-git-send-email-famz@redhat.com> <20150420175905-mutt-send-email-mst@redhat.com> <20150421023700.GC8048@fam-t430.nay.redhat.com> <20150421070941-mutt-send-email-mst@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 Return-path: Content-Disposition: inline In-Reply-To: <20150421070941-mutt-send-email-mst@redhat.com> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: virtualization-bounces@lists.linux-foundation.org Errors-To: virtualization-bounces@lists.linux-foundation.org To: "Michael S. Tsirkin" Cc: qemu-devel@nongnu.org, virtualization@lists.linux-foundation.org, "Aneesh Kumar K.V" , Stefan Hajnoczi , Amit Shah , Paolo Bonzini List-Id: virtualization@lists.linuxfoundation.org T24gVHVlLCAwNC8yMSAwNzoyMiwgTWljaGFlbCBTLiBUc2lya2luIHdyb3RlOgo+IE9uIFR1ZSwg QXByIDIxLCAyMDE1IGF0IDEwOjM3OjAwQU0gKzA4MDAsIEZhbSBaaGVuZyB3cm90ZToKPiA+IE9u IE1vbiwgMDQvMjAgMTk6MzYsIE1pY2hhZWwgUy4gVHNpcmtpbiB3cm90ZToKPiA+ID4gT24gRnJp LCBBcHIgMTcsIDIwMTUgYXQgMDM6NTk6MTVQTSArMDgwMCwgRmFtIFpoZW5nIHdyb3RlOgo+ID4g PiA+IEN1cnJlbnRseSwgdmlydGlvIGNvZGUgY2hvb3NlcyB0byBraWxsIFFFTVUgaWYgdGhlIGd1 ZXN0IHBhc3NlcyBhbnkgaW52YWxpZAo+ID4gPiA+IGRhdGEgd2l0aCB2cmluZy4KPiA+ID4gPiBU aGF0IGhhcyBkcmF3YmFja3Mgc3VjaCBhcyBsb3NpbmcgdW5zYXZlZCBkYXRhIChlLmcuIHdoZW4K PiA+ID4gPiBndWVzdCB1c2VyIGlzIHdyaXRpbmcgYSB2ZXJ5IGxvbmcgZW1haWwpLCBvciBwb3Nz aWJsZSBkZW5pYWwgb2Ygc2VydmljZSBpbgo+ID4gPiA+IGEgbmVzdGVkIHZtIHVzZSBjYXNlIHdo ZXJlIHZpcnRpbyBkZXZpY2UgaXMgcGFzc2VkIHRocm91Z2guCj4gPiA+ID4gCj4gPiA+ID4gdmly dGlvLTEgaGFzIGludHJvZHVjZWQgYSBuZXcgc3RhdHVzIGJpdCAiTkVFRFMgUkVTRVQiIHdoaWNo IGNvdWxkIGJlIHVzZWQgdG8KPiA+ID4gPiBpbXByb3ZlIHRoaXMgYnkgY29tbXVuaWNhdGluZyB0 aGUgZXJyb3Igc3RhdGUgYmV0d2VlbiB2aXJ0aW8gZGV2aWNlcyBhbmQKPiA+ID4gPiBkcml2ZXJz LiBUaGUgZGV2aWNlIG5vdGlmaWVzIGd1ZXN0IHVwb24gc2V0dGluZyB0aGUgYml0LCB0aGVuIHRo ZSBndWVzdCBkcml2ZXIKPiA+ID4gPiBzaG91bGQgZGV0ZWN0IHRoaXMgYml0IGFuZCByZXBvcnQg dG8gdXNlcnNwYWNlLCBvciByZWNvdmVyIHRoZSBkZXZpY2UgYnkKPiA+ID4gPiByZXNldHRpbmcg aXQuCj4gPiA+IAo+ID4gPiBVbmZvcnR1bmF0ZWx5LCB2aXJ0aW8gMSBzcGVjIGRvZXMgbm90IGhh dmUgYSBjb25mb3JtYW5jZSBzdGF0ZW1lbnQKPiA+ID4gdGhhdCByZXF1aXJlcyBkcml2ZXIgdG8g cmVjb3Zlci4gV2UgbWVyZWx5IGhhdmUgYSBub24tbm9ybWF0aXZlIGxvb2tpbmcKPiA+ID4gdGV4 dDoKPiA+ID4gCU5vdGU6IEZvciBleGFtcGxlLCB0aGUgZHJpdmVyIGNhbuKAmXQgYXNzdW1lIHJl cXVlc3RzIGluIGZsaWdodAo+ID4gPiAJd2lsbCBiZSBjb21wbGV0ZWQgaWYgREVWSUNFX05FRURT X1JFU0VUIGlzIHNldCwgbm9yIGNhbiBpdCBhc3N1bWUgdGhhdAo+ID4gPiAJdGhleSBoYXZlIG5v dCBiZWVuIGNvbXBsZXRlZC4gQSBnb29kIGltcGxlbWVudGF0aW9uIHdpbGwgdHJ5IHRvIHJlY292 ZXIKPiA+ID4gCWJ5IGlzc3VpbmcgYSByZXNldC4KPiA+ID4gCj4gPiA+IEltcGxlbWVudGluZyB0 aGlzIHJlc2V0IGZvciBhbGwgZGV2aWNlcyBpbiBhIHJhY2UtZnJlZSBtYW5uZXIgbWlnaHQgYWxz bwo+ID4gPiBiZSBmYXIgZnJvbSB0cml2aWFsLiAgSSB0aGluayB3ZSdkIG5lZWQgYSBmZWF0dXJl IGJpdCBmb3IgdGhpcy4KPiA+ID4gT1RPSCBhcyBsb25nIGFzIHdlIG1ha2UgdGhpcyBhIG5ldyBm ZWF0dXJlLCB3b3VsZCBhbiBhYmlsaXR5IHRvCj4gPiA+IHJlc2V0IGEgc2luZ2xlIFZRIGJlIGEg YmV0dGVyIG1hdGNoIGZvciB3aGF0IHlvdSBhcmUgdHJ5aW5nIHRvCj4gPiA+IGFjaGlldmU/Cj4g PiAKPiA+IEkgdGhpbmsgdGhhdCBpcyB0b28gY29tcGxpY2F0ZWQgYXMgYSByZWNvdmVyeSBtZWFz dXJlLCBhIGRldmljZSBsZXZlbCByZXNldHRpbmcKPiA+IHdpbGwgYmUgYmV0dGVyIHRvIGdldCB0 byBhIGRldGVybWluaXN0aWMgc3RhdGUsIGF0IGxlYXN0Lgo+IAo+IFF1ZXN0aW9uIHdvdWxkIGJl LCBob3cgaGFyZCBpcyBpdCB0byBzdG9wIGhvc3QgZnJvbSB1c2luZyBhbGwgcXVldWVzLAo+IHJl dHJpZXZlIGFsbCBob3N0IE9TIHN0YXRlIGFuZCByZS1wcm9ncmFtIGl0IGludG8gdGhlIGRldmlj ZS4KPiBJZiB3ZSBuZWVkIHRvIHNoYWRvdyBhbGwgT1Mgc3RhdGUgd2l0aGluIHRoZSBkcml2ZXIs IHRoZW4gdGhhdCdzIGEgbG90Cj4gb2Ygbm90IHdlbGwgdGVzdGVkIGNvZGUgd2l0aCBhIHBvc3Np YmlsaXR5IG9mIGludHJvZHVjaW5nIG1vcmUgYnVncy4KCkkgZG9uJ3QgdW5kZXJzdGFuZCB0aGUg cXVlc3Rpb24uIEluIHRoaXMgc2VyaWVzIHRoZSB2aXJ0aW8tYmxrIGRldmljZSB3aWxsIG5vdApw b3AgYW55IG1vcmUgcmVxdWVzdHMsIGFuZCBhcyBsb25nIGFzIHRoZSByZXNldCBpcyBwcm9wZXJs eSBoYW5kbGVkLCBib3RoIGd1ZXN0CmFuZCBob3N0IHNob3VsZCBnbyBiYWNrIHRvIGEgZ29vZCBz dGF0ZS4KPiAKPiA+ID4gCj4gPiA+ID4gVGhpcyBzZXJpZXMgbWFrZXMgbmVjZXNzYXJ5IGNoYW5n ZXMgaW4gdmlydGlvIGNvcmUgY29kZSwgYmFzZWQgb24gd2hpY2gKPiA+ID4gPiB2aXJ0aW8tYmxr IGlzIGNvbnZlcnRlZC4gT3RoZXIgZGV2aWNlcyBub3cga2VlcCB0aGUgZXhpc3RpbmcgYmVoYXZp b3IgYnkKPiA+ID4gPiBwYXNzaW5nIGluICJlcnJvcl9hYm9ydCIuIFRoZXkgd2lsbCBiZSBjb252 ZXJ0ZWQgaW4gZm9sbG93aW5nIHNlcmllcy4gVGhlIExpbnV4Cj4gPiA+ID4gZHJpdmVyIHBhcnQg d2lsbCBhbHNvIGJlIHdvcmtlZCBvbi4KPiA+ID4gPiAKPiA+ID4gPiBPbmUgY29uY2VybiB3aXRo IHRoaXMgYmVoYXZpb3IgY2hhbmdlIGlzIHRoYXQgaXQncyBub3cgaGFyZGVyIHRvIG5vdGljZSB0 aGUKPiA+ID4gPiBhY3R1YWwgZHJpdmVyIGJ1ZyB0aGF0IGNhdXNlZCB0aGUgZXJyb3IsIGFzIHRo ZSBndWVzdCBjb250aW51ZXMgdG8gcnVuLiAgVG8KPiA+ID4gPiBhZGRyZXNzIHRoYXQsIHdlIGNv dWxkIHByb2JhYmx5IGFkZCBhIG5ldyBlcnJvciBhY3Rpb24gb3B0aW9uIHRvIHZpcnRpbwo+ID4g PiA+IGRldmljZXMsICBzaW1pbGFyIHRvIHRoZSAicmVhZC93cml0ZSB3ZXJyb3IiIGluIGJsb2Nr IGxheWVyLCBzbyB0aGUgdm0gY291bGQgYmUKPiA+ID4gPiBwYXVzZWQgYW5kIHRoZSBtYW5hZ2Vt ZW50IHdpbGwgZ2V0IGFuIGV2ZW50IGluIFFNUCBsaWtlIHB2cGFuaWMuICBUaGlzIHdvcmsgY2Fu Cj4gPiA+ID4gYmUgZG9uZSBvbiB0b3AuCj4gPiA+IAo+ID4gPiBBdCB0aGUgYXJjaGl0ZWN0dXJh bCBsZXZlbCwgdGhhdCdzIG9ubHkgb25lIGNvbmNlcm4uIE90aGVycyB3b3VsZCBiZQo+ID4gPiAt IHdvcmtsb2FkcyBzdWNoIGFzIG9wZW5zdGFjayBoYW5kbGUgZ3Vlc3QgY3Jhc2ggYmV0dGVyIHRo YW4KPiA+ID4gICBhIGd1ZXN0IHRoYXQncyBlLmcuIHNsb3cgYmVjYXVzZSBvZiBhIG1lbW9yeSBs ZWFrCj4gPiAKPiA+IFdoYXQgbWVtb3J5IGxlYWsgYXJlIHlvdSByZWZlcnJpbmcgdG8/Cj4gCj4g VGhhdCB3YXMganVzdCBhbiBleGFtcGxlLiAgSWYgaG9zdCBkZXRlY3RzIGEgbWFsZm9ybWVkIHJp bmcsIGl0IHdpbGwKPiBjcmFzaC4gIEJ1dCBvZnRlbiBpdCBkb2Vzbid0LCByZXN1bHQgaXMgYnVm ZmVycyBub3QgYmVpbmcgdXNlZCwgc28gZ3Vlc3QKPiBjYW4ndCBmcmVlIHRoZW0gdXAuCj4gCj4g PiA+IC0gaXQncyBlYXNpZXIgZm9yIGd1ZXN0cyB0byBwcm9iZSBob3N0IGZvciBzZWN1cml0eSBp c3N1ZXMKPiA+ID4gICBpZiBndWVzdCBpc24ndCBraWxsZWQKPiA+ID4gLSBndWVzdCBjYW4gZmxv b2QgaG9zdCBsb2cgd2l0aCBndWVzdC10cmlnZ2VyZWQgZXJyb3JzCj4gPiAKPiA+IFdlIGNhbiBz dGlsbCBhYm9ydCgpIGlmIGd1ZXN0IGlzIHRyaWdnZXJpbmcgZXJyb3IgdG9vIHF1aWNrbHkuCj4g Cj4gCj4gQWJzb2x1dGVseSwgYW5kIGlmIGl0IGxvb2tlZCBsaWtlIEknbSBhZ2FpbnN0IGVycm9y IGRldGVjdGlvbiBhbmQKPiByZWNvdmVyeSwgdGhpcyB3YXMgbm90IG15IGludGVudC4KPiAKPiBJ IGFtIG1lcmVseSBzYXlpbmcgd2UgY2FuJ3QgYXBwbHkgdGhpcyBwYXRjaHNldCBhcyBpcywgZGVm ZXJyaW5nCj4gYWRkcmVzc2luZyB0aGUgaXNzdWVzIHRvIHBhdGNoZXMgb24gdG9wLgo+IAo+IEJ1 dCBJIGhhdmUgYW4gaWRlYTogcmVmYWN0b3IgdGhlIGNvZGUgdG8gdXNlIGVycm9yX2Fib3J0LiAK ClRoYXQgaXMgcGF0Y2ggMS05IG9mIHRoaXMgc2VyaWVzLiBPciBkbyB5b3UgbWVhbiBhbHNvIHJl ZmFjdG9yIGFuZCBwYXNzCmVycm9yX2Fib3J0IHRvIHRoZSBtZW1vcnkgY29yZT8KCkZhbQoKPlRo aXMgd2F5IHdlCj4gY2FuIGFwcGx5IHRoZSBwYXRjaHNldCB3aXRob3V0IG1ha2luZyBmdW5jdGlv bmFsIGNoYW5nZXMsIGFuZCB5b3UgY2FuCj4gbWFrZSBwcm9ncmVzcyB0byBjb21wbGV0ZSB0aGlz LCBvbiB0b3AuCl9fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19f ClZpcnR1YWxpemF0aW9uIG1haWxpbmcgbGlzdApWaXJ0dWFsaXphdGlvbkBsaXN0cy5saW51eC1m b3VuZGF0aW9uLm9yZwpodHRwczovL2xpc3RzLmxpbnV4Zm91bmRhdGlvbi5vcmcvbWFpbG1hbi9s aXN0aW5mby92aXJ0dWFsaXphdGlvbg==