From: Juergen Gross <jgross@suse.com>
To: Dario Faggioli <dfaggioli@suse.com>,
	Andrii Anisov <andrii.anisov@gmail.com>,
	xen-devel@lists.xenproject.org
Cc: Stefano Stabellini <sstabellini@kernel.org>,
	Andrii Anisov <andrii_anisov@epam.com>,
	George Dunlap <george.dunlap@eu.citrix.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Tim Deegan <tim@xen.org>, Julien Grall <julien.grall@arm.com>,
	Jan Beulich <jbeulich@suse.com>
Subject: Re: [Xen-devel] [RFC 0/6] XEN scheduling hardening
Date: Fri, 26 Jul 2019 14:14:41 +0200
Message-ID: <25dfa166-c7a4-c2dd-0c1d-58faf5ffc296@suse.com>
In-Reply-To: <79b01757ee19b57ac43649a4d94e378891152887.camel@suse.com>

On 26.07.19 13:56, Dario Faggioli wrote:
> [Adding George plus others x86, ARM and core-Xen people]
> 
> Hi Andrii,
> 
> First of all, thanks a lot for this series!
> 
> The problem you mention is a long standing one, and I'm glad we're
> eventually starting to properly look into it.
> 
> I already have one comment: I think I can see where this comes from,
> but I don't think 'XEN scheduling hardening' is what we're doing in
> this series... I'd go for something like "xen: sched: improve idle
> and vcpu time accounting precision", or something like that.
> 
> On Fri, 2019-07-26 at 13:37 +0300, Andrii Anisov wrote:
>> One of the scheduling problems is a misleading CPU idle time concept.
>> Currently, the idle vcpu run time is taken as the CPU idle time. But
>> idle vcpu run time includes IRQ processing, softirq processing,
>> tasklet processing, etc. Those tasks are not actually idle, and
>> accounting them as idle may mislead CPU freq governors that rely on
>> the CPU idle time.
>>
> Indeed! And I agree this is quite bad.
> 
>> The other problem is that the execution time of pure hypervisor tasks
>> is charged to the guest vcpu budget.
>>
> Yep, equally bad.
> 
>> For example, IRQ and softirq processing time is charged to the
>> current vcpu budget, which is likely a guest vcpu. This is quite
>> unfair and may break scheduling reliability.
>> It is proposed to charge guest vcpus for the guest's actual run time
>> and the time taken to serve the guest's hypercalls and accesses to
>> emulated iomem. All the rest is accounted as hypervisor run time
>> (IRQ and softirq processing, branch prediction hardening, etc.)
>>
> Right.
> 
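
The charging rule proposed above can be sketched roughly as follows. This
is only an illustration, not code from the series: the reason labels, the
vcpu names and the account() helper are all made up for the example.

```python
from collections import defaultdict

# Hypothetical labels for what the CPU was doing; not Xen identifiers.
GUEST_CHARGED = {"guest_run", "hypercall", "emulated_iomem"}

def account(intervals):
    """Split (vcpu, reason, ns) intervals into guest and hypervisor time."""
    guest_ns = defaultdict(int)
    hyp_ns = 0
    for vcpu, reason, ns in intervals:
        if reason in GUEST_CHARGED:
            guest_ns[vcpu] += ns  # work done by, or on behalf of, the guest
        else:
            hyp_ns += ns          # IRQs, softirqs, hardening, etc.
    return dict(guest_ns), hyp_ns

# Made-up trace: the vcpu that was current when each interval elapsed.
trace = [
    ("d1v0", "guest_run", 700),
    ("d1v0", "irq", 50),
    ("d1v0", "hypercall", 120),
    ("d1v0", "softirq", 30),
    ("d2v0", "emulated_iomem", 40),
]
guest, hyp = account(trace)
print(guest, hyp)
```

With this rule d1v0 is charged 820 of the 900 ns during which it was the
current vcpu; the remaining 80 ns count as hypervisor time.
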
>> While the series is an early RFC, several points are still
>> untouched:
>>   - Now the time elapsed since the last rescheduling is not fully
>>     charged to the current vcpu budget. Are there any changes needed
>>     in the existing scheduling algorithms?
>>
> I'll think about it, but off the top of my head, I don't see how
> this can be a problem. Scheduling algorithms (should!) base their logic
> and their calculations on actual vcpus' runtime, not much on idle
> vcpus' one.
> 
>>   - How to avoid the absolute top priority of tasklets (which is
>>     obeyed by all schedulers so far). Should the idle vcpu be scheduled
>>     like the normal guest vcpus (through queues, priorities, etc.)?
>>
> Now, this is something to think about, and try to understand if
> anything would break if we go for it. I mean, I see why you'd want to
> do that, but tasklets and softirqs have worked the way they do, in Xen,
> since they were introduced, I believe.
> 
> Therefore, even if there wouldn't be any subsystem explicitly relying
> on the current behavior (which should be verified), I think we are at
> high risk of breaking things, if we change.

We'd break things IMO.

Tasklets are sometimes used to perform async actions which can't be done
in guest vcpu context. Like switching a domain to shadow mode for L1TF
mitigation, or marshalling all cpus for stop_machine(). You don't want
to be able to block tasklets, you want them to run as soon as possible.
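
To illustrate the difference, here is a toy model of the two policies. It
is a sketch under simplifying assumptions, not how any Xen scheduler is
written: pick_next(), pick_next_queued(), the runqueue layout and the
priority numbers are all hypothetical.

```python
def pick_next(runqueue, tasklet_pending):
    """Current behavior: pending tasklet work always wins.

    runqueue is a list of (priority, vcpu_name); higher priority runs first.
    """
    if tasklet_pending or not runqueue:
        return "idle"  # the idle vcpu is what runs the tasklet
    return max(runqueue, key=lambda entry: entry[0])[1]

def pick_next_queued(runqueue, tasklet_pending, idle_priority):
    """Alternative: the idle vcpu competes like any other vcpu."""
    candidates = list(runqueue)
    if tasklet_pending:
        candidates.append((idle_priority, "idle"))
    if not candidates:
        return "idle"
    return max(candidates, key=lambda entry: entry[0])[1]

busy = [(10, "d1v0"), (7, "d2v0")]
print(pick_next(busy, tasklet_pending=True))
print(pick_next_queued(busy, tasklet_pending=True, idle_priority=5))
```

In the second variant a busy guest vcpu with a higher priority keeps the
cpu and the tasklet has to wait, which is exactly the blocking we don't
want.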

> 
> That's not to mean it would not be a good change, or that it is
> impossible... It's, rather, just to raise some awareness. :-)
> 
>>   - Idle vcpu naming is quite misleading. It is a kind of system
>>     (hypervisor) task which is responsible for some hypervisor work.
>>     Should it be renamed/reconsidered?
>>
> Well, that's a design question, even for this very series, isn't it? I
> mean, I see two ways of achieving proper idle time accounting:
> 1) you leave things as they are --i.e., idle does not only do idling,
>     it also does all these other things, but you make sure you don't
>     count the time they take as idle time;
> 2) you move all these activities out of idle, and into some other
>     context, and you let idle just do the idling. At that point, time
>     accounted to idle will be only actual idle time, as the time it
>     took Xen to do all the other things is now accounted to the new
>     execution context which is running them.

And here we are coming back to the idea of a "hypervisor domain" I
suggested about 10 years ago and which was rejected at that time...
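
For what it's worth, the two options give the same idle figure and differ
only in where the non-idle time ends up. A small sketch with made-up
numbers; the event kinds and the "hypervisor" context name are assumptions
for the example, not existing Xen concepts:

```python
# Made-up slices of time spent while the idle vcpu was scheduled, in ns.
events = [("idle", 500), ("softirq", 40), ("tasklet", 60), ("idle", 400)]

def option1(events):
    """Everything still runs in idle, but only real idling counts as idle."""
    total = sum(ns for _, ns in events)
    work = sum(ns for kind, ns in events if kind != "idle")
    return {"idle_vcpu_runtime": total, "idle": total - work}

def option2(events):
    """Non-idle work is accounted to a separate execution context."""
    ctx = {"idle": 0, "hypervisor": 0}
    for kind, ns in events:
        ctx["idle" if kind == "idle" else "hypervisor"] += ns
    return ctx

print(option1(events))
print(option2(events))
```

Both report 900 ns of idle time here. With option 2 the remaining 100 ns
are visible as the runtime of the separate context instead of having to be
subtracted from the idle vcpu's runtime.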


Juergen

