From: Paul Durrant
Date: Fri, 16 Oct 2015 16:53:44 +0000
Message-ID: <9AAE0902D5BC7E449B7C8E4E778ABCD02F616F4B@AMSPEX01CL01.citrite.net>
References: <20151014094727.GE4281@noname.str.redhat.com>
 <561E3865.1010500@m2r.biz>
 <9AAE0902D5BC7E449B7C8E4E778ABCD02F6114DD@AMSPEX01CL01.citrite.net>
 <20151016140416.GD4185@noname.redhat.com>
 <9AAE0902D5BC7E449B7C8E4E778ABCD02F616715@AMSPEX01CL01.citrite.net>
 <20151016150226.GF4185@noname.redhat.com>
 <9AAE0902D5BC7E449B7C8E4E778ABCD02F616927@AMSPEX01CL01.citrite.net>
 <20151016161152.GG4185@noname.redhat.com>
 <9AAE0902D5BC7E449B7C8E4E778ABCD02F616CC1@AMSPEX01CL01.citrite.net>
 <20151016164257.GH4185@noname.redhat.com>
In-Reply-To: <20151016164257.GH4185@noname.redhat.com>
Subject: Re: [Qemu-devel] Question about xen disk unplug support for ahci missed in qemu
To: Kevin Wolf
Cc: "qemu-block@nongnu.org", "qemu-devel@nongnu.org", "xen-devel@lists.xen.org", Fabio Fantoni, Stefano Stabellini, Anthony Perard, John Snow

> -----Original Message-----
> From: Kevin Wolf [mailto:kwolf@redhat.com]
> Sent: 16 October 2015 17:43
> To: Paul Durrant
> Cc: Fabio Fantoni; Stefano Stabellini; John Snow; Anthony Perard;
>     qemu-devel@nongnu.org; xen-devel@lists.xen.org; qemu-block@nongnu.org
> Subject: Re: [Qemu-devel] Question about xen disk unplug support for ahci
>     missed in qemu
>
> On 16.10.2015 at 18:20, Paul Durrant wrote:
> > > -----Original Message-----
> > > From: Kevin Wolf [mailto:kwolf@redhat.com]
> > > Sent: 16 October 2015 17:12
> > > To: Paul Durrant
> > > Cc: Fabio Fantoni; Stefano Stabellini; John Snow; Anthony Perard;
> > >     qemu-devel@nongnu.org; xen-devel@lists.xen.org; qemu-block@nongnu.org
> > > Subject: Re: [Qemu-devel] Question about xen disk unplug support for ahci
> > >     missed in qemu
> > >
> > > On 16.10.2015 at 17:10, Paul Durrant wrote:
> > > > > -----Original Message-----
> > > > > From: Kevin Wolf [mailto:kwolf@redhat.com]
> > > > > Sent: 16 October 2015 16:02
> > > > > To: Paul Durrant
> > > > > Cc: Fabio Fantoni; Stefano Stabellini; John Snow; Anthony Perard;
> > > > >     qemu-devel@nongnu.org; xen-devel@lists.xen.org; qemu-block@nongnu.org
> > > > > Subject: Re: [Qemu-devel] Question about xen disk unplug support for
> > > > >     ahci missed in qemu
> > > > >
> > > > > On 16.10.2015 at 16:24, Paul Durrant wrote:
> > > > > > > -----Original Message-----
> > > > > > > From: Kevin Wolf [mailto:kwolf@redhat.com]
> > > > > > > Sent: 16 October 2015 15:04
> > > > > > > To: Paul Durrant
> > > > > > > Cc: Fabio Fantoni; Stefano Stabellini; John Snow; Anthony Perard;
> > > > > > >     qemu-devel@nongnu.org; xen-devel@lists.xen.org; qemu-block@nongnu.org
> > > > > > > Subject: Re: [Qemu-devel] Question about xen disk unplug support
> > > > > > >     for ahci missed in qemu
> > > > > > >
> > > > > > > On 14.10.2015 at 14:48, Paul Durrant wrote:
> > > > > > > > > -----Original Message-----
> > > > > > > > > From: Fabio Fantoni [mailto:fabio.fantoni@m2r.biz]
> > > > > > > > > Sent: 14 October 2015 12:12
> > > > > > > > > To: Kevin Wolf; Stefano Stabellini
> > > > > > > > > Cc: John Snow; Anthony Perard; qemu-devel@nongnu.org;
> > > > > > > > >     xen-devel@lists.xen.org; qemu-block@nongnu.org; Paul Durrant
> > > > > > > > > Subject: Re: [Qemu-devel] Question about xen disk unplug
> > > > > > > > >     support for ahci missed in qemu
> > > > > > > > >
> > > > > > > > > On 14/10/2015 11:47, Kevin Wolf wrote:
> > > > > > > > > > [ CC qemu-block ]
> > > > > > > > > >
> > > > > > > > > > On 13.10.2015 at 19:10, Stefano Stabellini wrote:
> > > > > > > > > >> On Tue, 13 Oct 2015, John Snow wrote:
> > > > > > > > > >>> On 10/13/2015 11:55 AM, Fabio Fantoni wrote:
> > > > > > > > > >>>> I added AHCI disk support in libxl and, after using it
> > > > > > > > > >>>> for a week, it seemed to be OK. After a reply from
> > > > > > > > > >>>> Stefano Stabellini, it seems that Xen disk unplug
> > > > > > > > > >>>> supports only IDE disks:
> > > > > > > > > >>>>
> > > > > > > > > >>>> http://git.qemu.org/?p=qemu.git;a=commitdiff;h=679f4f8b178e7c66fbc2f39c905374ee8663d5d8
> > > > > > > > > >>>>
> > > > > > > > > >>>> Today Paul Durrant told me that, even though the PV disk
> > > > > > > > > >>>> also works with AHCI, leaving the emulated disk merely
> > > > > > > > > >>>> offline can be a risk:
> > > > > > > > > >>>> http://lists.xenproject.org/archives/html/win-pv-devel/2015-10/msg00021.html
> > > > > > > > > >>>>
> > > > > > > > > >>>> I took a quick look at the QEMU code, but I do not
> > > > > > > > > >>>> understand what is needed to add Xen disk unplug support
> > > > > > > > > >>>> for AHCI as well. Can someone do it, or give me some
> > > > > > > > > >>>> useful information for doing it, please?
> > > > > > > > > >>>>
> > > > > > > > > >>>> Thanks for any reply and sorry for my bad English.
> > > > > > > > > >>>>
> > > > > > > > > >>> I'm not entirely sure what features you need AHCI to
> > > > > > > > > >>> support in order for Xen to be happy.
> > > > > > > > > >>>
> > > > > > > > > >>> I'd guess hotplugging, but where I get confused is that
> > > > > > > > > >>> IDE disks don't support hotplugging either, so I guess
> > > > > > > > > >>> I'm not sure what you need.
> > > > > > > > > >>>
> > > > > > > > > >>> Stefano, can you help bridge my Xen knowledge gap?
> > > > > > > > > >>
> > > > > > > > > >> Hi John,
> > > > > > > > > >>
> > > > > > > > > >> we need something like
> > > > > > > > > >> hw/i386/xen/xen_platform.c:unplug_disks but that can
> > > > > > > > > >> unplug AHCI disks. And by unplug, I mean "make disappear"
> > > > > > > > > >> like pci_piix3_xen_ide_unplug does for ide.
> > > > > > > > > > Maybe this would be the right time to stop the craziness
> > > > > > > > > > with your hybrid IDE/xendisk setup. It's a horrible thing
> > > > > > > > > > that would never happen on real hardware.
> > > > > > > >
> > > > > > > > Unfortunately, it's going to be difficult to remove such
> > > > > > > > 'craziness' when you don't know a priori whether the VM has PV
> > > > > > > > drivers or not.
> > > > > > >
> > > > > > > Why wouldn't you know that beforehand? I mean, even on real
> > > > > > > hardware you can have different disk interfaces (IDE, AHCI, SCSI)
> > > > > > > and you install the exact driver that your hardware needs. You
> > > > > > > just do the same thing on a VM: If your hardware is PV, you
> > > > > > > install a PV driver. If your hardware is IDE, you install an IDE
> > > > > > > driver. Whether it's PV or IDE is something that you, the user,
> > > > > > > decided when configuring the VM, so you definitely know.
> > > > > > >
> > > > > >
> > > > > > That's not necessarily true. The host admin that provisions the VM
> > > > > > does not necessarily know what OS the user of that VM will install.
> > > > > > The admin may just be providing a generic VM with an emulated CD
> > > > > > drive that the user can point at any ISO they want.
> > > > > >
> > > > > > So, as a host admin, if you provide a VM with only PV backends and
> > > > > > your user is trying to boot an OS with no PV drivers they are not
> > > > > > going to be happy, so you provide emulated devices. Then, at some
> > > > > > point later, when the user installs PV drivers, there really should
> > > > > > be some way for those drivers to start up without any need to
> > > > > > contact the host admin and have the VM reconfigured.
> > > > >
> > > > > Why only IDE and xendisk then? Maybe I have an OS that works great
> > > > > with AHCI, or virtio-blk, or an LSI SCSI controller, or a Megasas SCSI
> > > > > controller, or USB sticks, or... (and IDE will hardly ever be the
> > > > > optimal one)
> > > > >
> > > > > What about network cards? My OS might support the Xen PV one, or it
> > > > > might support rtl8139, or e1000, or virtio-net, or pcnet, or...
> > > > >
> > > > > Should we always put all of the hardware that can possibly be emulated
> > > > > in a VM just so that the one right device is definitely included even
> > > > > though we don't know what OS will be running?
> > > > >
> > > > > This is ridiculous.
> > > >
> > > > It might be, but to some extent it's reality. The reason that the
> > > > default emulated network device chosen by xl is rtl8139 is that it has
> > > > drivers in just about every OS. The same reason applies to IDE being
> > > > the default choice for storage.
> > >
> > > So what does this mean for a justification for the AHCI + xendisk hybrid
> > > proposal?
> > >
> > > > > Just tell your admin what virtual hardware you really need. (Or tell
> > > > > them to give you a proper interface to configure your VMs yourself.)
> > > > >
> > > >
> > > > My point is that the virtual hardware that the OS user wants will
> > > > change.
> > > > Before they install PV drivers, they will need an emulated
> > > > device. After installing PV drivers they will want PV devices. Should
> > > > they really have to contact their cloud provider to make the switch,
> > > > when at the moment it happens automatically and transparently (the
> > > > AHCI problem aside)?
> > >
> > > My point is that such a magic change shouldn't happen. It doesn't happen
> > > on real hardware either and people still get things installed to non-IDE
> > > disks.
> > >
> > > There is no reason to install the OS onto a different device than will
> > > be used later. With Linux, it's no problem at all because the PV drivers
> > > are already included on the installation media anyway, and on Windows or
> > > presumably any other OS you can load and install the drivers right from
> > > the beginning.
> > >
> > > In fact, I would be surprised if using xendisk instead of IDE for
> > > installing Windows didn't result in a noticeably faster installation.
> > >
> >
> > It most certainly would, but requiring users to do it this way is likely
> > to meet some resistance, I suspect.
>
> Why do you think so? Installing the PV drivers afterwards doesn't seem
> easier than just providing them during the installation.
>

My experience of XenServer customers tells me that any form of manual
intervention during guest install is likely to meet with resistance,
unfortunately.

> > > Now, if you really insist on providing a legacy interface even to guests
> > > that eventually use PV drivers, there actually are sane ways to
> > > implement this. It will be tricky to make that transition now without
> > > breaking compatibility, but it could have been done from the start.
> > >
> > > Sane means for example that you don't open the same image twice (and
> > > even read-write!) at the same time. This is a recipe for disaster and
> > > it's surprising that you don't see corrupted images more often.
> > >
> >
> > We don't because unplug is supposed to ensure the emulated device is
> > gone before the PV frontend is started
>
> The important part is the backend, but it seems that you open the second
> instance of the image only when starting the PV frontend?

I believe this is the case, yes.

>
> As long as you don't enable the user to use most of qemu's functionality
> like starting block jobs (which would keep the IDE instance around even
> after unplugging the disk), it might actually be safe assuming that the
> guest cooperates. Not sure what a malicious guest could do, though, as
> nobody seems to check whether IDE is really unplugged before the second
> instance is opened.

The Windows drivers do check. After the unplug Windows is asked to
re-enumerate the IDE buses and we make sure the disks we expect to be gone
are really gone.

> raw and qcow2 should be safe these days, but in
> earlier times it would probably have been possible for the guest to
> overwrite the image header and access arbitrary files on the host as
> backing file. It might still be true for other image formats.
>
> > > So if you wanted to have a clean solution, try to think how real
> > > hardware would solve the problem. If you want me to suggest something
> > > off the top of my head, I would come up with an extended IDE device (one
> > > single device!) that provides the IDE I/O ports and additionally some
> > > MMIO BAR that enables access to PV functionality.
> > >
> > > Once you enable PV functionality, the IDE ports stop working; device
> > > reset disables the PV ring and goes back to IDE mode. No hard disk
> > > suddenly disappearing from the machine, no image corruption if the IDE
> > > device is written to before enabling PV, etc.
> > >
> >
> > That's not sufficient though. The IDE device must not be enumerated by
> > the OS and, in Windows at least, that enumeration occurs before the PV
> > frontend has started up.
>
> The trick is that it's only a single device, so there is no second
> device that must be prevented from being enumerated. You provide a
> driver for this specific IDE controller, so Windows wouldn't even try
> the generic IDE driver when your driver is available.
>

But the whole point is that we want Windows to use the generic IDE driver.
If we had a driver in Windows from the outset then it would be pure PV and
there'd be no problem :-)

  Paul

> It's kind of the same sort of IDE controller extension as Bus Master
> DMA, which just added a new BAR. If you had an old driver, it would just
> ignore the new registers. If you had a new one, it would use them. But
> in no way would the old appearance of the device simply disappear, you
> just use an extended register set on the same device.
>
> > > But it's your choice. You can keep your broken hack in IDE. Just don't
> > > expect anyone to support adding new broken hacks to other devices.
> > >
> >
> > I'd prefer to have a cleaner solution and I believe I can achieve that in
> > Windows by obscuring the emulated disks using filter drivers, so that's
> > the way I'll probably go.
>
> I wouldn't consider anything that works with two distinct disk devices
> and two separate BlockDriverStates for the same image file a clean
> solution.
>
> Kevin
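
For illustration only, the single-device idea quoted above (one device with
legacy IDE ports plus an extra register that switches it to PV mode, and a
reset that switches it back) could be modelled roughly as below. This is a toy
sketch, not QEMU or Xen code: the class, the method names and the in-memory
"backing" list are all hypothetical, and the real proposal would involve a PCI
BAR and a PV ring rather than Python methods.

```python
from enum import Enum


class Mode(Enum):
    IDE = "ide"
    PV = "pv"


class HybridDiskModel:
    """Toy model of one device that exposes either IDE ports or a PV ring.

    All names are hypothetical; nothing here is a QEMU or Xen interface.
    """

    def __init__(self, sectors):
        # A single backing store, opened once and shared by both modes,
        # so the image is never open twice at the same time.
        self.backing = [0] * sectors
        self.mode = Mode.IDE

    def enable_pv(self):
        # Stands in for a PV-aware driver writing to a control register
        # in the additional MMIO BAR.
        self.mode = Mode.PV

    def reset(self):
        # Device reset disables the PV ring and restores legacy IDE mode.
        self.mode = Mode.IDE

    def ide_write(self, sector, value):
        # The IDE ports stop working once PV mode has been enabled.
        if self.mode is not Mode.IDE:
            return False
        self.backing[sector] = value
        return True

    def pv_write(self, sector, value):
        # The PV path only accepts requests after it has been enabled.
        if self.mode is not Mode.PV:
            return False
        self.backing[sector] = value
        return True
```

In this model a guest without the new driver only ever calls ide_write() and
never notices the extra register, while a guest with the new driver calls
enable_pv() and uses pv_write() from then on. It does not address the
objection raised above that Windows would still bind its generic IDE driver to
the device before any PV driver exists.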