On Fri, 2019-08-02 at 16:07 +0300, Andrii Anisov wrote:
> On 02.08.19 12:15, Julien Grall wrote:
> > From the list below it is not clear what is the split between
> > hypervisor time and guest time. See some of the examples below.
>
> I guess your question is *why* do I split hyp/guest time in such a
> way.
>
> So for the guest I count time spent in the guest mode. Plus time
> spent in hypervisor mode to serve explicit requests by guest.
>
From an accuracy, but also from a fairness, perspective:
- what a guest does directly (in guest mode),
- what the hypervisor does on behalf of a guest, no matter whether
  requested explicitly or not,
should all be accounted to the guest, in the sense that the guest
should be charged for it.

Actually, the concepts of "guest time" and "hypervisor time" are
orthogonal to the accounting, at least ideally. In fact, when a guest
does a hypercall, the time that we spend inside Xen performing the
hypercall itself:
* is hypervisor time;
* should be charged to the guest that did the hypercall.

If we don't charge the guest for this activity, in theory a guest can
start doing a lot of hypercalls and generating a lot of interrupts...
Since most of its time is spent in the hypervisor, its runtime (from
the scheduler's point of view) increases only a little, so the
scheduler will continue to run it, and it will continue to generate
hypercalls and interrupts, until it starves/DoSes the system!

In fact, right now this can't happen, because we always charge guests
for the time spent doing these things. The problem is that we often
charge _the_wrong_ guest. This somewhat manages to prevent a DoS
situation (or makes it very unlikely), but it is indeed unfair, and
may cause problems (especially in RT scenarios).

> That time may be quite deterministic from the guest's point of view.
>
> But the time spent by hypervisor to handle interrupts, update the
> hardware state is not requested by the guest itself. It is a
> virtualization overhead.
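To make the "hypervisor time, charged to the calling guest" idea
concrete, here is a minimal C sketch (all names are made up for
illustration; this is not actual Xen code). Each vCPU keeps two
accounting buckets, and the scheduler-visible runtime sums both, so a
hypercall-heavy guest cannot hide its consumption:

```c
#include <stdint.h>

/* Hypothetical per-vCPU accounting state (not Xen's real structures). */
struct vcpu_acct {
    uint64_t guest_ns;  /* time spent executing in guest mode */
    uint64_t hyp_ns;    /* hypervisor time spent on this vCPU's behalf */
};

/*
 * Runtime as seen by the scheduler: both buckets count, so time spent
 * in Xen servicing this vCPU's hypercalls still ages it in the
 * runqueue, exactly like time it burns in guest mode.
 */
static uint64_t sched_runtime(const struct vcpu_acct *v)
{
    return v->guest_ns + v->hyp_ns;
}

/*
 * On hypercall entry/exit the hypervisor would read timestamps; here
 * the measured delta is passed in directly to keep the sketch
 * self-contained.  The key point: it goes to the caller's bucket,
 * not to some global "hypervisor" bucket (and not to another guest).
 */
static void charge_hypercall(struct vcpu_acct *v, uint64_t delta_ns)
{
    v->hyp_ns += delta_ns;
}
```

Because sched_runtime() includes hyp_ns, the hypercall storm described
above inflates the caller's own runtime, and the scheduler deschedules
it like any other busy vCPU.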
>
Yes, but still, when it is the guest that causes such overhead, it is
important that the guest itself gets to pay for it. Just as an
example (although you don't have this problem on ARM), if I have an
HVM guest, ideally I would charge to the guest the time that QEMU
executes in dom0!

On the other hand, the time that we spend in the scheduler, for
instance doing load balancing among the various runqueues, or the
time that we spend in Xen (on x86) for time synchronization
rendezvouses, should not be charged to any guest.

> And the overhead heavily depends on the system configuration (e.g.
> how many guests are running).
> That overhead may be accounted for a guest or for hyp, depending on
> the model agreed.
>
Load balancing within the scheduler indeed depends on how busy the
system is, and I agree that time should not be charged to any specific
guest. Saving and restoring the register state of a guest, however,
does not depend on how many other guests there are around, and I
think it should be accounted against the guest itself.

> My idea is as following:
> Accounting that overhead for guests is quite OK for server
> applications, you put server overhead time on guests and charge money
> from their budget.
>
I disagree. The benefits of more accurate and correct time accounting
and charging are not workload or use case dependent. If we decide to
charge the guest for the hypercalls it does and the interrupts it
receives, then we should do that both for servers and for embedded RT
systems.

> Yet for RT applications you will have more accurate view on the guest
> execution time if you drop that overhead.
>
> Our target is XEN in safety critical systems. So I chosen more
> deterministic (from my point of view) approach.
>
As said, I believe this is one of those cases where we want a unified
approach. And not because it's easier, or because "Xen has to work
both on servers and embedded" (which, BTW, is true), but because it is
the right thing to do, IMO.
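The split being argued for above can be sketched as a simple
classification of hypervisor work items: overhead a specific guest
causes (hypercalls, its interrupts, its context save/restore) is
charged to that guest; genuinely system-wide work (runqueue load
balancing, the x86 time rendezvous) is charged to no guest. This is a
hypothetical illustration, not how Xen actually structures its
accounting:

```c
/* Hypothetical classification of hypervisor-mode work (not Xen code). */
enum hyp_work {
    WORK_HYPERCALL,        /* explicitly requested by the guest     */
    WORK_GUEST_IRQ,        /* interrupt targeting a specific guest  */
    WORK_CTXT_SWITCH,      /* save/restore of one guest's registers */
    WORK_LOAD_BALANCE,     /* scheduler runqueue balancing          */
    WORK_TIME_RENDEZVOUS,  /* x86 time synchronization rendezvous   */
};

/*
 * Returns 1 if the work should be charged to the guest that caused
 * it, 0 if it is system overhead accounted to the hypervisor itself.
 * Note the split does not follow the "explicitly requested" line:
 * context switching is charged to the guest even though the guest
 * never asked for it, because its cost does not depend on how many
 * other guests exist.
 */
static int charge_to_guest(enum hyp_work w)
{
    switch (w) {
    case WORK_HYPERCALL:
    case WORK_GUEST_IRQ:
    case WORK_CTXT_SWITCH:
        return 1;
    default:
        return 0;
    }
}
```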
> Well, I suppose we may add granularity to the time accounting, and
> then decide at the scheduler level what we count for the guest
> execution time.
>
> But it is so far from the end, and we are here to discuss and agree
> the stuff.
>
Indeed. :-)

Regards
--
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<> (Raistlin Majere)