From mboxrd@z Thu Jan 1 00:00:00 1970
From: Anthony Liguori
Subject: Re: KVM usability
Date: Mon, 01 Mar 2010 09:33:47 -0600
Message-ID: <4B8BDE5B.8090201@codemonkey.ws>
References: <1267089644.12790.74.camel@laptop>
 <1267152599.1726.76.camel@localhost>
 <20100226090147.GH15885@elte.hu>
 <4B879A2F.50203@redhat.com>
 <20100226103545.GA7463@elte.hu>
 <4B87A6BF.3090301@redhat.com>
 <20100226111734.GE7463@elte.hu>
 <4B8813F2.8090208@redhat.com>
 <20100227105643.GA17425@elte.hu>
 <4B893B2B.40301@redhat.com>
 <20100227172546.GA31472@elte.hu>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Zachary Amsden, Avi Kivity, "Zhang, Yanmin", Peter Zijlstra,
 ming.m.lin@intel.com, sheng.yang@intel.com, Jes Sorensen,
 KVM General, Gleb Natapov, Arnaldo Carvalho de Melo,
 Frédéric Weisbecker, Thomas Gleixner, "H. Peter Anvin",
 Peter Zijlstra, Arjan van de Ven
To: Ingo Molnar
Return-path:
Received: from qw-out-2122.google.com ([74.125.92.25]:58222 "EHLO
 qw-out-2122.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1751281Ab0CAPdz (ORCPT ); Mon, 1 Mar 2010 10:33:55 -0500
Received: by qw-out-2122.google.com with SMTP id 8so508384qwh.37 for ;
 Mon, 01 Mar 2010 07:33:54 -0800 (PST)
In-Reply-To: <20100227172546.GA31472@elte.hu>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 02/27/2010 11:25 AM, Ingo Molnar wrote:
> * Zachary Amsden wrote:
>
> [...]
>
>> Second, it's not over-modularized. The modules are the individual
>> components of the architecture. How would you propose to put it
>> differently. They really can't naturally combine. And with the
>> code quality of qemu in general being problematic by Linux kernel
>> standards, it's not natural to move the device emulation directly
>> into the kernel module. So this is why we are where we are today.
>>
> I'm not talking about moving it into a kernel _module_ - albeit that
> alone is a worthwhile thing to do for any performance sensitive hw
> component.
>
> I was talking about the option of a clean, stripped down Qemu base
> hosted in the kernel proper, in linux/tools/kvm/ or so. If i were
> running a virtualization effort it would be the first place i'd
> consider to put my tooling into.
>

Let's ignore the suggestion of hosting it in the kernel. There's no
reason it couldn't be as successful hosted as a separate project.

Let's consider what you would strip out of qemu. You would obviously
pull out TCG and the device emulation that isn't useful for KVM. You
can't compile out TCG today, but you can already compile out most device
emulation, so this doesn't actually buy you much. It certainly doesn't
fix any of the problems you outlined.

The GUI wouldn't change at all. You still have the same fundamental
problem: whatever this userspace executable is, it is not the place
where you need to implement a user-friendly GUI. That has to be a
separate process. Maybe you could integrate that separate process into
the same repository as the core process, but we can still do this with
qemu.

> It would be a no-brainer: most of the devs come from the KVM side, and
> KVM itself makes little sense without Qemu, and Qemu makes little sense
> without KVM these days. (and i know about the non-KVM and non-x86
> roots of Qemu - still, it's not a significant piece of usage today)
>

Do you have statistics to back this up? You would probably be surprised
at how many people use TCG.

To be honest, every KVM developer including myself has considered and
even prototyped exactly what you described. We've all independently come
to the same conclusion: it's easier to incrementally improve qemu than
it is to split the code base and try to maintain the fork. And a lot of
the other vendors who have decided to fork qemu in the past have learned
the hard way that it's more difficult to maintain a fork and are now
merging back to upstream qemu.

We could certainly make the same argument about forking the kernel to
make it optimized for virtualization.
If we took Linux and added it to the qemu git tree, we would instantly
have transparent large page support for users instead of having to wait
years to get it properly integrated. We could also add gang scheduling
and hard scheduler limits to the kernel. But we know better and even
though the process is more painful and drawn out, we end up with a much
better solution in the long run by including the input and feedback from
people like you.

Xen clearly made a different decision and is still suffering the
consequences. They've done the same thing with qemu as you describe and
have now realized it was a mistake and are working to merge their
changes into upstream qemu.

There are *plenty* of usability issues (like transparent large pages)
that need to be addressed at the KVM/kernel level. Today, a user has to
choose between a ~30% decrease in performance on Java workloads or the
ability to overcommit memory. It's a pretty significant problem and
there's been a lot of resistance within the kernel community to fixing
it.

Likewise, I'm seeing a good number of people hit problems with lock
holder pre-emption in the field. It's absolutely a usability problem
when a user sees catastrophically bad performance running an 8-VCPU
virtual machine on a 2-socket host. Whether it's gang scheduling or
directed yields + pause loop detection, we definitely need some
scheduler changes to fix this problem.

Not having an option enabled by default is an annoyance that a user
eventually overcomes with the help of documentation. Performance
problems are deal breakers that lead users to switch to another
virtualization technology.

Just stripping down qemu and putting the result in the kernel source
tree doesn't fix anything. We have plenty of hard problems to solve
already.

Regards,

Anthony Liguori