From mboxrd@z Thu Jan 1 00:00:00 1970
From: Fabio Fantoni
Subject: Re: [Qemu-devel] Question about xen disk unplug support for ahci missed in qemu
Date: Mon, 19 Oct 2015 15:42:09 +0200
Message-ID: <5624F331.2020908__44422.5980005742$1445262236$gmane$org@m2r.biz>
References: <20151014094727.GE4281@noname.str.redhat.com>
 <561E3865.1010500@m2r.biz>
 <9AAE0902D5BC7E449B7C8E4E778ABCD02F6114DD@AMSPEX01CL01.citrite.net>
 <20151016140416.GD4185@noname.redhat.com>
 <9AAE0902D5BC7E449B7C8E4E778ABCD02F616715@AMSPEX01CL01.citrite.net>
 <20151016150226.GF4185@noname.redhat.com>
 <9AAE0902D5BC7E449B7C8E4E778ABCD02F616927@AMSPEX01CL01.citrite.net>
 <20151016161152.GG4185@noname.redhat.com>
 <9AAE0902D5BC7E449B7C8E4E778ABCD02F616CC1@AMSPEX01CL01.citrite.net>
 <20151016164257.GH4185@noname.redhat.com>
 <9AAE0902D5BC7E449B7C8E4E778ABCD02F616F4B@AMSPEX01CL01.citrite.net>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
In-Reply-To: <9AAE0902D5BC7E449B7C8E4E778ABCD02F616F4B@AMSPEX01CL01.citrite.net>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Paul Durrant, Kevin Wolf
Cc: "qemu-block@nongnu.org", "qemu-devel@nongnu.org", "xen-devel@lists.xen.org", Stefano Stabellini, Anthony Perard, John Snow
List-Id: xen-devel@lists.xenproject.org

On 16/10/2015 18:53, Paul Durrant wrote:
>> -----Original Message-----
>> From: Kevin Wolf [mailto:kwolf@redhat.com]
>> Sent: 16 October 2015 17:43
>> To: Paul Durrant
>> Cc: Fabio Fantoni; Stefano Stabellini; John Snow; Anthony Perard; qemu-devel@nongnu.org; xen-devel@lists.xen.org; qemu-block@nongnu.org
>> Subject: Re: [Qemu-devel] Question about xen disk unplug support for ahci missed in qemu
>>
>> On 16.10.2015 at 18:20, Paul Durrant wrote:
>>>> -----Original Message-----
>>>> From: Kevin Wolf [mailto:kwolf@redhat.com]
>>>> Sent: 16 October 2015 17:12
>>>> To: Paul Durrant
>>>> Cc: Fabio Fantoni; Stefano Stabellini; John Snow; Anthony Perard; qemu-devel@nongnu.org; xen-devel@lists.xen.org; qemu-block@nongnu.org
>>>> Subject: Re: [Qemu-devel] Question about xen disk unplug support for ahci missed in qemu
>>>>
>>>> On 16.10.2015 at 17:10, Paul Durrant wrote:
>>>>>> -----Original Message-----
>>>>>> From: Kevin Wolf [mailto:kwolf@redhat.com]
>>>>>> Sent: 16 October 2015 16:02
>>>>>> To: Paul Durrant
>>>>>> Cc: Fabio Fantoni; Stefano Stabellini; John Snow; Anthony Perard; qemu-devel@nongnu.org; xen-devel@lists.xen.org; qemu-block@nongnu.org
>>>>>> Subject: Re: [Qemu-devel] Question about xen disk unplug support for ahci missed in qemu
>>>>>>
>>>>>> On 16.10.2015 at 16:24, Paul Durrant wrote:
>>>>>>>> -----Original Message-----
>>>>>>>> From: Kevin Wolf [mailto:kwolf@redhat.com]
>>>>>>>> Sent: 16 October 2015 15:04
>>>>>>>> To: Paul Durrant
>>>>>>>> Cc: Fabio Fantoni; Stefano Stabellini; John Snow; Anthony Perard; qemu-devel@nongnu.org; xen-devel@lists.xen.org; qemu-block@nongnu.org
>>>>>>>> Subject: Re: [Qemu-devel] Question about xen disk unplug support for ahci missed in qemu
>>>>>>>>
>>>>>>>> On 14.10.2015 at 14:48, Paul Durrant wrote:
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: Fabio Fantoni [mailto:fabio.fantoni@m2r.biz]
>>>>>>>>>> Sent: 14 October 2015 12:12
>>>>>>>>>> To: Kevin Wolf; Stefano Stabellini
>>>>>>>>>> Cc: John Snow; Anthony Perard; qemu-devel@nongnu.org; xen-devel@lists.xen.org; qemu-block@nongnu.org; Paul Durrant
>>>>>>>>>> Subject: Re: [Qemu-devel] Question about xen disk unplug support for ahci missed in qemu
>>>>>>>>>>
>>>>>>>>>> On 14/10/2015 11:47, Kevin Wolf wrote:
>>>>>>>>>>> [ CC qemu-block ]
>>>>>>>>>>>
>>>>>>>>>>> On 13.10.2015 at 19:10, Stefano Stabellini wrote:
>>>>>>>>>>>> On Tue, 13 Oct 2015, John Snow wrote:
>>>>>>>>>>>>> On 10/13/2015 11:55 AM, Fabio Fantoni wrote:
>>>>>>>>>>>>>> I added ahci disk support in libxl and using it for week seems that was
>>>>>>>>>>>>>> ok, after a reply of Stefano Stabellini seems that xen disk unplug
>>>>>>>>>>>>>> support only ide disks:
>>>>>>>>>>>>>> http://git.qemu.org/?p=qemu.git;a=commitdiff;h=679f4f8b178e7c66fbc2f39c905374ee8663d5d8
>>>>>>>>>>>>>> Today Paul Durrant told me that even if pv disk is ok also with ahci and
>>>>>>>>>>>>>> the emulated one is offline can be a risk:
>>>>>>>>>>>>>> http://lists.xenproject.org/archives/html/win-pv-devel/2015-10/msg00021.html
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I tried to take a fast look in qemu code but I not understand the needed
>>>>>>>>>>>>>> thing for add the xen disk unplug support also for ahci, can someone do
>>>>>>>>>>>>>> it or tell me useful information for do it please?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks for any reply and sorry for my bad english.
>>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm not entirely sure what features you need AHCI to support in order
>>>>>>>>>>>>> for Xen to be happy.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'd guess hotplugging, but where I get confused is that IDE disks don't
>>>>>>>>>>>>> support hotplugging either, so I guess I'm not sure what you need.
>>>>>>>>>>>>> Stefano, can you help bridge my Xen knowledge gap?
>>>>>>>>>>>> Hi John,
>>>>>>>>>>>>
>>>>>>>>>>>> we need something like hw/i386/xen/xen_platform.c:unplug_disks but that
>>>>>>>>>>>> can unplug AHCI disk. And by unplug, I mean "make disappear" like
>>>>>>>>>>>> pci_piix3_xen_ide_unplug does for ide.
>>>>>>>>>>> Maybe this would be the right time to stop the craziness with your
>>>>>>>>>>> hybrid IDE/xendisk setup. It's a horrible thing that would never happen
>>>>>>>>>>> on real hardware.
>>>>>>>>> Unfortunately, it's going to be difficult to remove such 'craziness' when you don't know a priori whether the VM has PV drivers or not.
>>>>>>>>
>>>>>>>> Why wouldn't you know that beforehand? I mean, even on real hardware you
>>>>>>>> can have different disk interfaces (IDE, AHCI, SCSI) and you install
>>>>>>>> the exact driver that your hardware needs. You just do the same thing on
>>>>>>>> VM: If your hardware is PV, you install a PV driver. If your hardware is
>>>>>>>> IDE, you install an IDE driver. Whether it's PV or IDE is something that
>>>>>>>> you, the user, decided when configuring the VM, so you definitely know.
>>>>>>> That's not necessarily true. The host admin that provisions the VM does not necessarily know what OS the user of that VM will install. The admin may just be providing a generic VM with an emulated CD drive that the user can point at any ISO they want.
>>>>>>> So, as a host admin, if you provide a VM with only PV backends and your user is trying to boot an OS with no PV drivers they are not going to be happy, so you provide emulated devices. Then, at some point later, when the user installs PV drivers, there really should be some way for those drivers to start up without any need to contact the host admin and have the VM reconfigured.
>>>>>>
>>>>>> Why only IDE and xendisk then? Maybe I have an OS that works great with
>>>>>> AHCI, or virtio-blk, or an LSI SCSI controller, or a Megasas SCSI
>>>>>> controller, or USB sticks, or... (and IDE will hardly ever be the
>>>>>> optimal one)
>>>>>>
>>>>>> What about network cards? My OS might support the Xen PV one, or it
>>>>>> might support rtl8139, or e1000, or virtio-net, or pcnet, or...
>>>>>>
>>>>>> Should we always put all of the hardware that can possibly be emulated
>>>>>> in a VM just so that the one right device is definitely included even
>>>>>> though we don't know what OS will be running?
>>>>>>
>>>>>> This is ridiculous.
>>>>> It might be, but to some extent it's reality. The reason that the
>>>>> default emulated network device chosen by xl is rtl8139 is that it has
>>>>> drivers in just about every OS. The same reason for IDE being the
>>>>> default choice for storage.
>>>> So what does this mean for a justification for the AHCI + xendisk hybrid
>>>> proposal?
>>>>
>>>>>> Just tell your admin what virtual hardware you really need. (Or tell
>>>>>> them to give you a proper interface to configure your VMs yourself.)
>>>>>>
>>>>> My point is that the virtual hardware that the OS user wants will
>>>>> change. Before they install PV drivers, they will need emulated
>>>>> device. After installing PV drivers they will want PV devices. Should
>>>>> they really have to contact their cloud provider to make the switch,
>>>>> when at the moment it happens automatically and transparently (the
>>>>> AHCI problem aside)?
>>>> My point is that such a magic change shouldn't happen. It doesn't happen
>>>> on real hardware either and people still get things installed to non-IDE
>>>> disks.
>>>>
>>>> There is no reason to install the OS onto a different device than will
>>>> be used later. With Linux, it's no problem at all because the PV drivers
>>>> are already included on the installation media anyway, and on Windows or
>>>> presumably any other OS you can load and install the drivers right from
>>>> the beginning.
>>>>
>>>> In fact, I would be surprised if using xendisk instead of IDE for
>>>> installing Windows didn't result in a noticeably faster installation.
>>>>
>>> It most certainly would, but requiring users do it this way is likely to meet some resistance I suspect.
>>
>> Why do you think so? Installing the PV drivers afterwards doesn't seem
>> easier than just providing them during the installation.
>>
> My experience of XenServer customers tells me that any form of manual intervention during guest install is likely to meet with resistance, unfortunately.
>
>>>> Now, if you really insist on providing a legacy interface even to guests
>>>> that eventually use PV drivers, there actually are sane ways to
>>>> implement this. It will be tricky to make that transition now without
>>>> breaking compatibility, but it could have been done from the start.
>>>>
>>>> Sane means for example that you don't open the same image twice (and
>>>> even read-write!) at the same time. This is a recipe for disaster and
>>>> it's surprising that you don't see corrupted images more often.
>>>>
>>> We don't because unplug is supposed to ensure the emulated device is
>>> gone before the PV frontend is started
>> The important part is the backend, but it seems that you open the second
>> instance of the image only when starting the PV frontend?
> I believe this is the case, yes.
>
>> As long as you don't enable the user to use most of qemu's functionality
>> like starting block jobs (which would keep the IDE instance around even
>> after unplugging the disk), it might actually be safe assuming that the
>> guest cooperates. Not sure what a malicious guest could do, though, as
>> nobody seems to check whether IDE is really unplugged before the second
>> instance is opened.
> The Windows drivers do check. After the unplug Windows is asked to re-enumerate the IDE buses and we make sure the disks we expect to be gone are really gone.
>
>> raw and qcow2 should be safe these days, but in
>> earlier times it would probably have been possible for the guest to
>> overwrite the image header and access arbitrary files on the host as
>> backing file. It might still be true for other image formats.
>>
>>>> So if you wanted to have a clean solution, try to think how real
>>>> hardware would solve the problem. If you want me to suggest something
>>>> off the top of my head, I would come up with an extended IDE device (one
>>>> single device!) that provides the IDE I/O ports and additionally some
>>>> MMIO BAR that enables access to PV functionality.
>>>>
>>>> Once you enable PV functionality, the IDE ports stop working; device
>>>> reset disables the PV ring and goes back to IDE mode. No hard disk
>>>> suddenly disappearing from the machine, no image corruption if the IDE
>>>> device is written to before enabling PV, etc.
>>>>
>>> That's not sufficient though. The IDE device must not be enumerated by
>>> the OS and, in Windows at least, that enumeration occurs before the PV
>>> frontend has started up.
>> The trick is that it's only a single device, so there is no second
>> device that must be prevented from being enumerated. You provide a
>> driver for this specific IDE controller, so Windows wouldn't even try
>> the generic IDE driver when your driver is available.
>>
> But the whole point is that we want Windows to use the generic IDE driver. If we had a driver in Windows from the outset then it would be pure PV and there'd be no problem :-)
>
> Paul

I understand the goals of the current 'hybrid' Xen PV setup for disks and network: it provides older emulated hardware (compatible with most common out-of-the-box systems) together with the advantage of switching to PV without changing the domU's settings. It would be a great thing if it worked without problems in most cases, but unfortunately it does not.

With Linux HVM domUs I haven't found any problems with it, apart from the boot speed (which can be solved by using AHCI), while with Windows domUs (which are the majority of my guests) the problems are always there. Specifically, the problems I have are almost always with installing, updating or removing the PV drivers.
Using the GPLPV drivers, and taking great care when removing old versions and installing new ones, I found an acceptable way to get the task done. With the new WinPV drivers, problems arise. Some of them were fixed by Paul Durrant, while others seem to be persistent, and it is very difficult, if not impossible, to collect logs to report back. Despite all my reports, there do not seem to be any solutions to this kind of problem.

The problems I've mentioned are mostly about disks, where I'm unable to boot Windows domUs because of blue screens etc., but I've also had problems on the network side (i.e. the PV network device is still present but not working, and the emulated network device is working instead). These kinds of problems are becoming very frequent on Windows 10 guests, and even more so with unsigned PV drivers.

For these and other reasons, I think the current 'hybrid' solution is not a very good one. I suppose that on XenServer these kinds of problems are fixed by specific operations performed by the installer, aren't they? I'm wondering whether the problems I've described are only 'mine', or whether there are methods or solutions out there that I haven't been able to find by myself. For example, some months ago I reported in a win-pv-devel mail what I thought was a valid solution to the install/upgrade problems, but I later saw that it wasn't:
http://fantu.info/xen/Notes_new_xen_winpv_drivers.7z

On the other hand, I gave KVM with virtio a try and found that, even though I must install dedicated drivers to install or boot the system, I've never had problems installing/updating the drivers, booting the guests, or with network adapters and so on.

Thanks for any reply, and sorry for my bad English.

>
>> It's kind of the same sort of IDE controller extension as Bus Master
>> DMA, which just added a new BAR. If you had an old driver, it would just
>> ignore the new registers. If you had a new one, it would use them. But
>> in no way would the old appearance of the device simply disappear, you
>> just use an extended register set on the same device.
>>
>>>> But it's your choice. You can keep your broken hack in IDE. Just don't
>>>> expect anyone to support adding new broken hacks to other devices.
>>>>
>>> I'd prefer to have a cleaner solution and I believe can achieve that in Windows by obscuring the emulated disks using filter drivers, so that's the way I'll probably go.
>>
>> I wouldn't consider anything that works with two distinct disk devices
>> and two separate BlockDriverStates for the same image file a clean
>> solution.
>>
>> Kevin
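
For reference, the single-device idea Kevin describes in the quoted text (one device with IDE ports plus an extra BAR that switches it to PV mode, with reset going back to IDE) could be modelled roughly as below. This is only a toy sketch in Python to make the behaviour concrete. It is not QEMU or Xen code, and every name in it (HybridDisk, enable_pv, ide_write, pv_write) is hypothetical.

```python
from enum import Enum


class Mode(Enum):
    IDE = "ide"
    PV = "pv"


class HybridDisk:
    """Toy model of ONE device that exposes both an IDE and a PV interface.

    Hypothetical illustration only; nothing here exists in QEMU or Xen.
    """

    def __init__(self, sectors, sector_size=512):
        # A single backing store, opened once and shared by both interfaces,
        # so the same image is never opened twice.
        self.sector_size = sector_size
        self.image = bytearray(sectors * sector_size)
        self.mode = Mode.IDE

    def enable_pv(self):
        # Stands in for a guest write to a register in the extra MMIO BAR.
        self.mode = Mode.PV

    def reset(self):
        # Device reset disables the PV side and returns to IDE mode.
        self.mode = Mode.IDE

    def _write(self, lba, data):
        if len(data) != self.sector_size:
            raise ValueError("data must be exactly one sector")
        start = lba * self.sector_size
        if lba < 0 or start + self.sector_size > len(self.image):
            raise IndexError("sector out of range")
        self.image[start:start + self.sector_size] = data

    def ide_write(self, lba, data):
        # The IDE ports stop working once PV mode is enabled.
        if self.mode is not Mode.IDE:
            return False
        self._write(lba, data)
        return True

    def pv_write(self, lba, data):
        # The PV path is only usable after the guest has enabled it.
        if self.mode is not Mode.PV:
            return False
        self._write(lba, data)
        return True

    def read(self, lba):
        start = lba * self.sector_size
        return bytes(self.image[start:start + self.sector_size])
```

In this model only one interface accepts writes at any time and nothing disappears from the machine, which is the property Kevin argues for. It does not address Paul's objection that Windows should keep using the generic IDE driver.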