xen-devel.lists.xenproject.org archive mirror
From: George Dunlap <George.Dunlap@citrix.com>
To: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>,
	"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
	"Dario Faggioli" <dfaggioli@suse.com>,
	Meng Xu <mengxu@cis.upenn.edu>, Ian Jackson <iwj@xenproject.org>,
	Jan Beulich <jbeulich@suse.com>, Julien Grall <julien@xen.org>,
	Stefano Stabellini <sstabellini@kernel.org>, Wei Liu <wl@xen.org>
Subject: Re: [RFC PATCH 00/10] Preemption in hypervisor (ARM only)
Date: Mon, 1 Mar 2021 14:39:26 +0000	[thread overview]
Message-ID: <F8E5E0F1-0E6C-4659-95E3-01CA52CFCC3F@citrix.com> (raw)
In-Reply-To: <87k0qx9gw0.fsf@epam.com>



> On Feb 24, 2021, at 11:37 PM, Volodymyr Babchuk <volodymyr_babchuk@epam.com> wrote:
> 
> 
>> Hypervisor/virt properties are different to both a kernel-only-RTOS, and
>> regular userspace.  This was why I gave you some specific extra scenarios
>> to do latency testing with, so you could make a fair comparison of
>> "extra overhead caused by Xen" separate from "overhead due to
>> fundamental design constraints of using virt".
> 
> I can't see any fundamental constraints there. I see how virtualization
> architecture can influence context switch time: how many actions you
> need to switch from one vCPU to another. I have in mind low-level things
> there: reprogramming the MMU to use another set of tables, reprogramming your
> interrupt controller, timer, etc. Of course, you can't get latency lower
> than context switch time. This is the only fundamental constraint I can
> see.

Well, suppose you have two domains, A and B, both of which control hardware with hard real-time requirements.

And suppose that A has just started handling a latency-sensitive interrupt when a latency-sensitive interrupt comes in for B.  You might well preempt A and let B run for a full timeslice, causing A’s interrupt handler to be delayed by a significant amount.

Preventing that sort of thing would be a much trickier problem to get right.

>> If you want timely interrupt handling, you either need to partition your
>> workloads by the long-running-ness of their hypercalls, or not have
>> long-running hypercalls.
> 
> ... or do long-running tasks asynchronously. I believe that for most
> domctls and sysctls there is no need to hold the calling vCPU in hypervisor
> mode at all.
> 
>> I remain unconvinced that preemption is a sensible fix to the problem
>> you're trying to solve.
> 
> Well, this is the purpose of this little experiment. I want to discuss
> different approaches and to estimate the amount of effort required. By the
> way, from the x86 point of view, how hard is it to switch vCPU context while it is
> running in hypervisor mode?

I’m not necessarily opposed to introducing preemption, but the more we dig into this, the more complex things begin to look.  The idea of introducing an async framework to deal with long-running hypercalls is a huge engineering and design effort, not just for Xen, but for all future callers of the interface.

The claim in the cover letter was that “[s]ome hypercalls can not be preempted at all”.  I looked at the reference, and it looks like you’re referring to this:

"I brooded over ways to make [alloc_domheap_pages()] preemptible. But it is a) located deep in call chain and b) used not only by hypercalls. So I can't see an easy way to make it preemptible."

Let’s assume for the sake of argument that preventing delays due to alloc_domheap_pages() would require significant rearchitecting of the code.  And let’s even assume that there are 2-3 other such knotty issues making for unacceptably long hypercalls.  Will identifying and tracking down those issues really be more effort than introducing preemption, introducing async operations, and all the other things we’ve been talking about?

One thing that might be interesting is to add some sort of metrics (disabled in Kconfig by default); e.g.:

1. On entry to a hypercall, take a timestamp

2. On every hypercall_preempt() call, take another timestamp, see how much time has passed without a preempt, and reset the timestamp; also do the same check on exit from the hypercall

We could start by gathering stats and figuring out which hypercalls go the longest without preemption, as a way to guide the optimization effort.  Then, as we get that number down, we could add ASSERT()s that the time is never longer than a certain amount, and add runs like that to osstest to make sure no regressions are introduced.
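
To make the idea concrete, here is a minimal sketch of what such instrumentation could look like.  Everything below is illustrative rather than existing Xen code: the struct, field names, and the hcall_latency_*() helpers are made up for the example; only s_time_t and NOW() are real.

    /* Hypothetical per-vCPU bookkeeping; names are illustrative only. */
    struct hcall_latency {
        s_time_t last_checkpoint;  /* last entry / preempt-check timestamp */
        s_time_t worst_gap;        /* longest observed gap without a preempt */
    };

    /* Call on entry to a hypercall. */
    static inline void hcall_latency_enter(struct hcall_latency *l)
    {
        l->last_checkpoint = NOW();
    }

    /* Call from every preemption check, and again on hypercall exit. */
    static inline void hcall_latency_checkpoint(struct hcall_latency *l)
    {
        s_time_t now = NOW();
        s_time_t gap = now - l->last_checkpoint;

        if ( gap > l->worst_gap )
            l->worst_gap = gap;    /* feed into stats, later an ASSERT() */

        l->last_checkpoint = now;
    }

With something like that behind a Kconfig option, the worst-case gap per hypercall would give exactly the ranking needed to guide the optimization effort.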

I agree that hypercall continuations are complex; and you’re right that, because a continuation may never actually be re-executed, they limit where preemption can happen.  But making the entire hypervisor preemption-friendly is also quite complex in its own way; it’s not immediately obvious to me from this thread that hypervisor preemption is the less complex option.
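
For reference, the continuation pattern under discussion roughly has the shape below.  This is a simplified sketch: hypercall_preempt_check() and hypercall_create_continuation() are the real helpers, but the hypercall number, arguments, and the work-loop helpers are placeholders for illustration.

    /* Simplified shape of a preemptible hypercall handler. */
    long do_example_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
    {
        long rc = 0;

        while ( more_work_to_do(cmd) )          /* placeholder */
        {
            process_one_chunk(cmd, arg);        /* placeholder */

            if ( hypercall_preempt_check() )
            {
                /*
                 * Arrange for the guest to re-issue the hypercall with
                 * updated arguments.  Note the handler may never actually
                 * be re-entered, which is what limits where a handler can
                 * safely stop.
                 */
                rc = hypercall_create_continuation(
                    __HYPERVISOR_example_op, "ih", cmd, arg);
                break;
            }
        }

        return rc;
    }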

 -George


Thread overview: 28+ messages
2021-02-23  2:34 [RFC PATCH 00/10] Preemption in hypervisor (ARM only) Volodymyr Babchuk
2021-02-23  2:34 ` [RFC PATCH 01/10] sched: core: save IRQ state during locking Volodymyr Babchuk
2021-02-23  8:52   ` Jürgen Groß
2021-02-23 11:15     ` Volodymyr Babchuk
2021-02-24 18:29   ` Andrew Cooper
2021-02-23  2:34 ` [RFC PATCH 03/10] sched: credit2: " Volodymyr Babchuk
2021-02-23  2:34 ` [RFC PATCH 02/10] sched: rt: " Volodymyr Babchuk
2021-02-23  2:34 ` [RFC PATCH 04/10] preempt: use atomic_t to for preempt_count Volodymyr Babchuk
2021-02-23  2:34 ` [RFC PATCH 05/10] preempt: add try_preempt() function Volodymyr Babchuk
2021-02-23  2:34 ` [RFC PATCH 07/10] sched: core: remove ASSERT_NOT_IN_ATOMIC and disable preemption[!] Volodymyr Babchuk
2021-02-23  2:34 ` [RFC PATCH 06/10] arm: setup: disable preemption during startup Volodymyr Babchuk
2021-02-23  2:34 ` [RFC PATCH 08/10] arm: context_switch: allow to run with IRQs already disabled Volodymyr Babchuk
2021-02-23  2:34 ` [RFC PATCH 10/10] [HACK] alloc pages: enable preemption early Volodymyr Babchuk
2021-02-23  2:34 ` [RFC PATCH 09/10] arm: traps: try to preempt before leaving IRQ handler Volodymyr Babchuk
2021-02-23  9:02 ` [RFC PATCH 00/10] Preemption in hypervisor (ARM only) Julien Grall
2021-02-23 12:06   ` Volodymyr Babchuk
2021-02-24 10:08     ` Julien Grall
2021-02-24 20:57       ` Volodymyr Babchuk
2021-02-24 22:31         ` Julien Grall
2021-02-24 23:58           ` Volodymyr Babchuk
2021-02-25  0:39             ` Andrew Cooper
2021-02-25 12:51               ` Volodymyr Babchuk
2021-03-05  9:31                 ` Volodymyr Babchuk
2021-02-24 18:07 ` Andrew Cooper
2021-02-24 23:37   ` Volodymyr Babchuk
2021-03-01 14:39     ` George Dunlap [this message]
     [not found] <161405394665.5977.17427402181939884734@c667a6b167f6>
2021-02-23 20:29 ` Stefano Stabellini
2021-02-24  0:19   ` Volodymyr Babchuk
