On Fri, 2019-08-02 at 16:07 +0300, Andrii Anisov wrote:
> On 02.08.19 12:15, Julien Grall wrote:
> > From the list below it is not clear what is the split between
> > hypervisor time and guest time. See some of the examples below.
>
> I guess your question is *why* do I split hyp/guest time in such a
> way.
>
> So for the guest I count time spent in the guest mode. Plus time
> spent in hypervisor mode to serve explicit requests by guest.
>
From an accuracy, but also from a fairness, perspective:
- what a guest does directly (in guest mode),
- what the hypervisor does on behalf of a guest, no matter whether
  requested explicitly or not,
should all be accounted to the guest, in the sense that the guest
should be charged for it.

Actually, the concepts of "guest time" and "hypervisor time" are
orthogonal to the accounting, at least ideally. In fact, when a guest
does a hypercall, the time that we spend inside Xen performing the
hypercall itself:
* is hypervisor time;
* should be charged to the guest that did the hypercall.

If we don't charge the guest for this activity, in theory a guest can
start doing a lot of hypercalls and generating a lot of interrupts...
Since most of its time is spent in the hypervisor, its runtime (from
the scheduler's point of view) increases only a little, so the
scheduler will continue to run it, and it will continue to generate
hypercalls and interrupts, until it starves/DoSes the system!

In fact, right now this can't happen, because we always charge guests
for the time spent doing these things. The problem is that we often
charge _the_wrong_ guest. This somewhat manages to prevent a DoS
situation (or makes it very unlikely), but it is indeed unfair, and
may cause problems (especially in RT scenarios).

> That time may be quite deterministic from the guest's point of view.
>
> But the time spent by hypervisor to handle interrupts, update the
> hardware state is not requested by the guest itself. It is a
> virtualization overhead.
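To make the "hypervisor time, charged to the calling guest" idea
concrete, here is a minimal C sketch (all names are made up for
illustration; this is not actual Xen code). Each vCPU keeps two
accounting buckets, and the scheduler-visible runtime sums both, so a
hypercall-heavy guest cannot hide its consumption:

```c
#include <stdint.h>

/* Hypothetical per-vCPU accounting state (not Xen's real structures). */
struct vcpu_acct {
    uint64_t guest_ns;  /* time spent executing in guest mode */
    uint64_t hyp_ns;    /* hypervisor time spent on this vCPU's behalf */
};

/*
 * Runtime as seen by the scheduler: both buckets count, so time spent
 * in Xen servicing this vCPU's hypercalls still ages it in the
 * runqueue, exactly like time it burns in guest mode.
 */
static uint64_t sched_runtime(const struct vcpu_acct *v)
{
    return v->guest_ns + v->hyp_ns;
}

/*
 * On hypercall entry/exit the hypervisor would read timestamps; here
 * the measured delta is passed in directly to keep the sketch
 * self-contained.  The key point: it goes to the caller's bucket,
 * not to some global "hypervisor" bucket (and not to another guest).
 */
static void charge_hypercall(struct vcpu_acct *v, uint64_t delta_ns)
{
    v->hyp_ns += delta_ns;
}
```

Because sched_runtime() includes hyp_ns, the hypercall storm described
above inflates the caller's own runtime, and the scheduler deschedules
it like any other busy vCPU.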
>
Yes, but still, when it is the guest that causes such overhead, it is
important that the guest itself gets to pay for it. Just as an
example (although you don't have this problem on ARM), if I have an
HVM guest, ideally I would charge to the guest the time that QEMU
executes in dom0!

On the other hand, the time that we spend in the scheduler, for
instance doing load balancing among the various runqueues, or the
time that we spend in Xen (on x86) for time synchronization
rendezvouses, should not be charged to any guest.

> And the overhead heavily depends on the system configuration (e.g.
> how many guests are running).
> That overhead may be accounted for a guest or for hyp, depending on
> the model agreed.
>
Load balancing within the scheduler indeed depends on how busy the
system is, and I agree that time should not be charged to any specific
guest. Saving and restoring the register state of a guest, however,
does not depend on how many other guests there are around, and I
think it should be accounted against the guest itself.

> My idea is as following:
> Accounting that overhead for guests is quite OK for server
> applications, you put server overhead time on guests and charge money
> from their budget.
>
I disagree. The benefits of more accurate and correct time accounting
and charging are not workload or use case dependent. If we decide to
charge the guest for the hypercalls it does and the interrupts it
receives, then we should do that both for servers and for embedded RT
systems.

> Yet for RT applications you will have more accurate view on the guest
> execution time if you drop that overhead.
>
> Our target is XEN in safety critical systems. So I chosen more
> deterministic (from my point of view) approach.
>
As said, I believe this is one of those cases where we want a unified
approach. And not because it's easier, or because "Xen has to work
both on servers and embedded" (which, BTW, is true), but because it is
the right thing to do, IMO.
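The split being argued for above can be sketched as a simple
classification of hypervisor work items: overhead a specific guest
causes (hypercalls, its interrupts, its context save/restore) is
charged to that guest; genuinely system-wide work (runqueue load
balancing, the x86 time rendezvous) is charged to no guest. This is a
hypothetical illustration, not how Xen actually structures its
accounting:

```c
/* Hypothetical classification of hypervisor-mode work (not Xen code). */
enum hyp_work {
    WORK_HYPERCALL,        /* explicitly requested by the guest     */
    WORK_GUEST_IRQ,        /* interrupt targeting a specific guest  */
    WORK_CTXT_SWITCH,      /* save/restore of one guest's registers */
    WORK_LOAD_BALANCE,     /* scheduler runqueue balancing          */
    WORK_TIME_RENDEZVOUS,  /* x86 time synchronization rendezvous   */
};

/*
 * Returns 1 if the work should be charged to the guest that caused
 * it, 0 if it is system overhead accounted to the hypervisor itself.
 * Note the split does not follow the "explicitly requested" line:
 * context switching is charged to the guest even though the guest
 * never asked for it, because its cost does not depend on how many
 * other guests exist.
 */
static int charge_to_guest(enum hyp_work w)
{
    switch (w) {
    case WORK_HYPERCALL:
    case WORK_GUEST_IRQ:
    case WORK_CTXT_SWITCH:
        return 1;
    default:
        return 0;
    }
}
```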
> Well, I suppose we may add granularity to the time accounting, and
> then decide at the scheduler level what we count for the guest
> execution time.
>
> But it is so far from the end, and we are here to discuss and agree
> the stuff.
>
Indeed. :-)

Regards
--
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<> (Raistlin Majere)