From: Juergen Gross <jgross@suse.com>
To: Dario Faggioli <dfaggioli@suse.com>,
Andrii Anisov <andrii.anisov@gmail.com>,
xen-devel@lists.xenproject.org
Cc: Stefano Stabellini <sstabellini@kernel.org>,
Andrii Anisov <andrii_anisov@epam.com>,
George Dunlap <george.dunlap@eu.citrix.com>,
Andrew Cooper <andrew.cooper3@citrix.com>,
Tim Deegan <tim@xen.org>, Julien Grall <julien.grall@arm.com>,
Jan Beulich <jbeulich@suse.com>
Subject: Re: [Xen-devel] [RFC 0/6] XEN scheduling hardening
Date: Fri, 26 Jul 2019 14:14:41 +0200
Message-ID: <25dfa166-c7a4-c2dd-0c1d-58faf5ffc296@suse.com>
In-Reply-To: <79b01757ee19b57ac43649a4d94e378891152887.camel@suse.com>
On 26.07.19 13:56, Dario Faggioli wrote:
> [Adding George plus others x86, ARM and core-Xen people]
>
> Hi Andrii,
>
> First of all, thanks a lot for this series!
>
> The problem you mention is a long standing one, and I'm glad we're
> eventually starting to properly look into it.
>
> I already have one comment: I think I can see where this comes from,
> but I don't think 'XEN scheduling hardening' is what we're doing in
> this series... I'd go for something like "xen: sched: improve idle
> and vcpu time accounting precision".
>
> On Fri, 2019-07-26 at 13:37 +0300, Andrii Anisov wrote:
>> One of the scheduling problems is a misleading CPU idle time concept.
>> Currently, the idle vcpu run time is taken as the CPU idle time. But
>> idle vcpu run time includes IRQ processing, softirq processing,
>> tasklet processing, etc. Those tasks are not actual idle, and
>> accounting them as idle may mislead CPU freq governors that rely on
>> the CPU idle time.
>>
> Indeed! And I agree this is quite bad.
>
>> The other problem is that pure hypervisor task execution time is
>> charged to the guest vcpu budget.
>>
> Yep, equally bad.
>
>> For example, IRQ and softirq processing time is charged to the
>> current vcpu budget, which is likely the guest vcpu. This is quite
>> unfair and may break scheduling reliability.
>> It is proposed to charge guest vcpus for the guest's actual run time
>> and the time to serve the guest's hypercalls and accesses to emulated
>> iomem. All the rest is accounted as hypervisor run time (IRQ and
>> softirq processing, branch prediction hardening, etc.).
>>
> Right.
>
>> While the series is an early RFC, several points are still untouched:
>> - Now the time elapsed since the last rescheduling is not fully
>>   charged to the current vcpu budget. Are any changes needed in the
>>   existing scheduling algorithms?
>>
> I'll think about it, but off the top of my head, I don't see how
> this can be a problem. Scheduling algorithms (should!) base their logic
> and their calculations on actual vcpus' runtime, not much on idle
> vcpus' one.
>
>> - How to avoid the absolute top priority of tasklets (which is obeyed
>>   by all schedulers so far). Should the idle vcpu be scheduled like
>>   normal guest vcpus (through queues, priorities, etc.)?
>>
> Now, this is something to think about, and try to understand if
> anything would break if we go for it. I mean, I see why you'd want to
> do that, but tasklets and softirqs have worked the way they do, in Xen,
> ever since they were introduced, I believe.
>
> Therefore, even if there isn't any subsystem explicitly relying on the
> current behavior (which should be verified), I think we are at high
> risk of breaking things if we change it.
We'd break things IMO.
Tasklets are sometimes used to perform async actions which can't be done
in guest vcpu context. Like switching a domain to shadow mode for L1TF
mitigation, or marshalling all cpus for stop_machine(). You don't want
to be able to block tasklets, you want them to run as soon as possible.
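
To illustrate what "absolute top priority" means here, a rough sketch
(not Xen's actual scheduler code; struct unit, pick_next() and the
tasklet_work_pending flag are names made up for this example): when
tasklet work is pending, the idle context is selected regardless of
what sits in the run queue, so nothing can block the tasklet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical schedulable entity; not a Xen structure. */
struct unit {
    int prio;       /* higher value means more urgent */
    bool is_idle;   /* true for the per-CPU idle context */
};

/*
 * Pick the next unit to run on a CPU.
 * Pending tasklet work short-circuits the normal priority comparison:
 * the idle context (which runs tasklets) always wins.
 */
static const struct unit *pick_next(const struct unit *runq, size_t n,
                                    const struct unit *idle,
                                    bool tasklet_work_pending)
{
    const struct unit *best = NULL;

    assert(idle != NULL && idle->is_idle);

    if (tasklet_work_pending)
        return idle;

    /* Otherwise take the highest-priority runnable unit, if any. */
    for (size_t i = 0; i < n; i++) {
        if (best == NULL || runq[i].prio > best->prio)
            best = &runq[i];
    }

    return best != NULL ? best : idle;
}
```

Scheduling idle "through queues, priorities, etc." would amount to
dropping the early return above, which is exactly where the async
users of tasklets would be affected.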
>
> That's not to mean it would not be a good change, or that it is
> impossible... It's, rather, just to raise some awareness. :-)
>
>> - Idle vcpu naming is quite misleading. It is a kind of system
>> (hypervisor)
>> task which is responsible for some hypervisor work. Should it be
>> renamed/reconsidered?
>>
> Well, that's a design question, even for this very series, isn't it? I
> mean, I see two ways of achieving proper idle time accounting:
> 1) you leave things as they are --i.e., idle does not only do idling,
> it also does all these other things, but you make sure you don't
> count the time they take as idle time;
> 2) you move all these activities out of idle, and into some other
> context, and you let idle just do the idling. At that point, time
> accounted to idle will be only actual idle time, as the time it
> took Xen to do all the other things is now accounted to the new
> execution context which is running them.
And here we are coming back to the idea of a "hypervisor domain" I
suggested about 10 years ago and which was rejected at that time...
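
For reference, a minimal sketch of what option 1 boils down to,
assuming per-CPU accounting with three buckets. All names here
(pcpu_state, pcpu_acct, acct_init(), acct_switch()) are hypothetical
and not taken from the series or from Xen; timestamps are plain
nanosecond counters supplied by the caller.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-CPU execution states; not Xen's actual types. */
enum pcpu_state { PCPU_GUEST, PCPU_HYP, PCPU_IDLE, PCPU_NR_STATES };

struct pcpu_acct {
    enum pcpu_state state;           /* state the CPU is currently in */
    uint64_t since;                  /* time (ns) of the last state change */
    uint64_t total[PCPU_NR_STATES];  /* accumulated ns per state */
};

static void acct_init(struct pcpu_acct *a, enum pcpu_state s, uint64_t now)
{
    a->state = s;
    a->since = now;
    for (int i = 0; i < PCPU_NR_STATES; i++)
        a->total[i] = 0;
}

/*
 * Charge the time elapsed since the last change to the state being
 * left, then enter the next state. IRQ, softirq and tasklet work would
 * switch to PCPU_HYP, so it lands neither in the guest's budget nor in
 * the idle bucket.
 */
static void acct_switch(struct pcpu_acct *a, enum pcpu_state next,
                        uint64_t now)
{
    assert(now >= a->since);
    a->total[a->state] += now - a->since;
    a->state = next;
    a->since = now;
}
```

With this split, a freq governor would only look at total[PCPU_IDLE],
while the scheduler would only burn guest budget from
total[PCPU_GUEST].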
Juergen