linux-kernel.vger.kernel.org archive mirror
From: David Woodhouse <dwmw2@infradead.org>
To: Joao Martins <joao.m.martins@oracle.com>
Cc: "bp@alien8.de" <bp@alien8.de>, "x86@kernel.org" <x86@kernel.org>,
	"boris.ostrovsky@oracle.com" <boris.ostrovsky@oracle.com>,
	"hpa@zytor.com" <hpa@zytor.com>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"ankur.a.arora@oracle.com" <ankur.a.arora@oracle.com>,
	"rkrcmar@redhat.com" <rkrcmar@redhat.com>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [EXTERNAL] [PATCH RFC 12/39] KVM: x86/xen: store virq when assigning evtchn
Date: Thu, 10 Feb 2022 15:23:10 +0000	[thread overview]
Message-ID: <776740ea7c05a6c17fa05a809b4cbeb824b5afa2.camel@infradead.org> (raw)
In-Reply-To: <97bdf580-c1ff-0f2e-989c-da73a2115e7b@oracle.com>

On Thu, 2022-02-10 at 12:17 +0000, Joao Martins wrote:
> On 2/8/22 16:17, Woodhouse, David wrote:
> > And then we have the *outbound* events, which the guest can invoke with
> > the EVTCHNOP_send hypercall. Those are either:
> >  • IPI, raising the same port# on the guest
> >  • Interdomain looped back to a different port# on the guest
> >  • Interdomain triggering an eventfd.
> > 
> 
> /me nods
> 
> I am forgetting why one would do this on Xen:
> 
> * Interdomain looped back to a different port# on the guest

It's one of the few things we had to fix up when we started running PV
guests in the 'shim' under KVM. I don't know that it actually sends
loopback events via the true Xen (or KVM) host, but it does at least
register them so that the port# is 'reserved' and the host won't
allocate that port for anything else. It does it at least for the
console port.

For the inbound vs. outbound thing.... I did ponder a really simple API
design in which outbound ports are *only* ever associated with an
eventfd, and for IPIs the VMM would be expected to bind those as IRQFD
to an inbound event on the same port#.

You pointed out that it was quite inefficient, but... we already have
complex hacks to bypass the eventfd for posted interrupts when the
source and destination "match", and perhaps we could do something
similar to allow EVTCHNOP_send to deliver directly to a local port#
without having to go through all the eventfd code?
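Roughly what I mean, as a purely illustrative sketch (none of these names exist in KVM; the port table and the "eventfd" counter are stand-ins for the real machinery):

```c
/* Sketch only: an EVTCHNOP_send path with a fast case that raises a
 * local inbound port directly, and a slow case that goes through an
 * eventfd-like signal.  Unknown ports punt to userspace. */
#include <assert.h>
#include <stdint.h>

#define NPORTS 64

enum { SEND_UNKNOWN, SEND_LOCAL, SEND_EVENTFD };

static int send_kind[NPORTS];     /* how EVTCHNOP_send on this port routes */
static int send_dest[NPORTS];     /* local inbound port#, for SEND_LOCAL */
static uint64_t pending_bitmap;   /* guest-visible pending ports */
static int eventfd_signals;       /* stand-in for eventfd_signal() */

static int evtchnop_send(int port)
{
	if (port < 0 || port >= NPORTS)
		return -1;

	switch (send_kind[port]) {
	case SEND_LOCAL:
		/* Fast path: set the pending bit directly, no eventfd hop. */
		pending_bitmap |= 1ULL << send_dest[port];
		return 0;
	case SEND_EVENTFD:
		/* Slow path: signal the eventfd (here just a counter). */
		eventfd_signals++;
		return 0;
	default:
		return 1;	/* unknown port#: exit to userspace */
	}
}
```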

But the implementation of that would end up being awful, *and* the
userspace API isn't even that nice despite being "simple", because it
would force userspace to allocate a whole bunch of eventfds and use
space in the IRQ routing table for them. So it didn't seem worth it.
Let's just let userspace tell us explicitly the vcpu/port/prio instead
of having to jump through hoops to magically work it out from matching
eventfds.
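That is, something along these lines, where the struct and field names are invented for illustration rather than being the real uAPI:

```c
/* Sketch of the "tell the kernel explicitly" approach: userspace
 * describes each outbound port up front (type, target vcpu/port,
 * priority) rather than the kernel inferring it from matching
 * eventfds.  All names here are illustrative. */
#include <assert.h>
#include <stdint.h>

enum evtchn_type { EVTCHN_UNBOUND, EVTCHN_IPI, EVTCHN_LOOPBACK, EVTCHN_EVENTFD };

struct evtchn_desc {
	enum evtchn_type type;
	uint32_t vcpu;		/* target vCPU for IPI delivery */
	uint32_t dest_port;	/* port# to raise on the guest (loopback) */
	uint32_t priority;	/* event priority (2-level vs. FIFO) */
};

#define MAX_PORTS 64
static struct evtchn_desc port_table[MAX_PORTS];

/* What a SET_ATTR-style call might do: validate and store the binding. */
static int evtchn_assign(uint32_t port, const struct evtchn_desc *d)
{
	if (port >= MAX_PORTS || d->type == EVTCHN_UNBOUND)
		return -1;
	if (d->type == EVTCHN_LOOPBACK && d->dest_port >= MAX_PORTS)
		return -1;
	port_table[port] = *d;
	return 0;
}
```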

> > In the last case, that eventfd can be set up with IRQFD for direct
> > event channel delivery to a different KVM/Xen guest.
> > 
> > I've used your implementation, with an idr for the outbound port# space
> > intercepting EVTCHNOP_send for known ports and only letting userspace
> > see the hypercall if it's for a port# the kernel doesn't know. Looks a
> > bit like
> > https://git.infradead.org/users/dwmw2/linux.git/commitdiff/b4fbc49218a
> > 
> > 
> > 
> > But I *don't* want to do the VIRQ part shown above, "spotting" the VIRQ
> > in that outbound port# space and squirreling the information away into
> > the kvm_vcpu for when we need to deliver a timer event.
> > 
> > The VIRQ isn't part of the *outbound* port# space; it isn't a port to
> > which a Xen guest can use EVTCHNOP_send to send an event.
> 
> But it is still an event channel whose port is unique regardless of port
> type/space, hence (...)
> 
> > If anything,
> > it would be part of the *inbound* port# space, in the KVM IRQ routing
> > table. So perhaps we could have a similar snippet in
> > kvm_xen_setup_evtchn() which spots a VIRQ and says "aha, now I know
> > where to deliver timer events for this vCPU".
> > 
> 
> (...) The thinking at the time was mainly simplicity so our way of saying
> 'offload the evtchn to KVM' was through the machinery that offloads the outbound
> part (using your terminology). I don't think even using XEN_EVENTFD as proposed
> here that that one could send an VIRQ via EVTCHNOP_send (I could be wrong as
> it has been a long time).

I confess I didn't test it but it *looked* like you could, while true
Xen wouldn't permit that.

> Regardless, I think you have a good point to split the semantics and (...)

> > 
> > So I think I'm going to make the timer VIRQ (port#, priority) into an
> > explicit KVM_XEN_VCPU_ATTR_TYPE.
> 
> (...) thus this makes sense. Do you particularly care about
> VIRQ_DEBUG?


Not really. Especially not as something to accelerate in KVM.

Our environment doesn't have any way to deliver that to guests,
although we *do* have an API call to deliver "diagnostic interrupt"
which maps to an NMI, and we *have* occasionally hacked the VMM to
deliver VIRQ_DEBUG to Xen guests instead of that NMI. Mostly back when
I was busy being confused about ->vcpu_id vs. ->vcpu_idx vs. the
Xen/ACPI CPU# and where the hell my interrupts were going.

> > Along with the *actual* timer expiry,
> > which we need to extract/restore for LU/LM too, don't we?
> > /me nods
> 
> I haven't thought that one through well for Live Update / Live Migration, but
> I wonder if it wouldn't be better to instead have a general 'xen state'
> attr type, should you need more than just the pending timer expiry. Albeit
> considering that the VMM has everything it needs (?), perhaps the Xen PV
> timer looks to be the oddball missing piece, and we don't need to go to
> that extent.

Yeah, the VMM knows most of this stuff already, as it *told* the kernel
in the first place. Userspace is still responsible for all the setup
and admin, and the kernel just handles the bare minimum of the fast
path.

So on live update/migrate we only really need to read out the runstate
data from the kernel... and now the current timer expiry. 
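The arithmetic on the restore side is simple enough; a sketch (illustrative only, times in guest ns):

```c
/* Sketch of the save/restore arithmetic for the Xen PV timer: on live
 * update, read out the absolute expiry; on resume, re-arm a relative
 * timer from it, clamped so an expiry that passed while we were
 * migrating fires immediately.  Purely illustrative. */
#include <assert.h>
#include <stdint.h>

/* Saved state: absolute expiry in guest time, 0 if the timer is unarmed. */
static uint64_t saved_expires_ns;

static void timer_save(uint64_t expires_ns)
{
	saved_expires_ns = expires_ns;
}

/* Returns the relative delay to re-arm with, or 0 if nothing to re-arm. */
static uint64_t timer_restore(uint64_t now_ns)
{
	if (!saved_expires_ns)
		return 0;		/* timer wasn't armed */
	if (saved_expires_ns <= now_ns)
		return 1;		/* already expired: fire ASAP */
	return saved_expires_ns - now_ns;
}
```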

On the *resume* side it's still lots of syscalls, and perhaps in the
end we might decide we want to do a KVM_XEN_HVM_SET_ATTR_MULTI which
takes an array of them? But I think that's a premature optimisation for
now.
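If we ever did do it, I'd imagine something like this illustrative sketch (the struct and function are invented; no such ioctl exists today):

```c
/* Sketch of a hypothetical KVM_XEN_HVM_SET_ATTR_MULTI: one call applies
 * an array of attributes instead of one syscall each.  Validating the
 * whole batch first makes it all-or-nothing, which a loop of individual
 * syscalls cannot guarantee.  Layout and names are made up. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical attribute record, loosely modelled on kvm_xen_hvm_attr. */
struct xen_attr {
	uint32_t type;
	uint64_t value;
};

static int set_attr_multi(uint64_t *state, size_t nstate,
			  const struct xen_attr *attrs, size_t n)
{
	/* Validate everything up front so a bad entry rejects the batch. */
	for (size_t i = 0; i < n; i++)
		if (attrs[i].type >= nstate)
			return -1;
	for (size_t i = 0; i < n; i++)
		state[attrs[i].type] = attrs[i].value;
	return 0;
}
```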


Thread overview: 126+ messages
2019-02-20 20:15 [PATCH RFC 00/39] x86/KVM: Xen HVM guest support Joao Martins
2019-02-20 20:15 ` [PATCH RFC 01/39] KVM: x86: fix Xen hypercall page msr handling Joao Martins
2019-02-22  1:30   ` Sean Christopherson
2019-02-22 11:47     ` Joao Martins
2019-02-22 12:51     ` Paolo Bonzini
2020-11-30 10:39       ` David Woodhouse
2020-11-30 11:03         ` Paolo Bonzini
2020-11-30 11:27           ` David Woodhouse
2019-02-20 20:15 ` [PATCH RFC 02/39] KVM: x86/xen: intercept xen hypercalls if enabled Joao Martins
2019-02-21 18:29   ` Sean Christopherson
2019-02-21 20:56     ` Joao Martins
2019-02-22  0:30       ` Sean Christopherson
2019-02-22 12:50         ` Paolo Bonzini
2020-12-01  9:48   ` David Woodhouse
2020-12-01 11:19     ` David Woodhouse
2020-12-02 11:17       ` Joao Martins
2020-12-02 12:12         ` David Woodhouse
2020-12-02  5:19     ` Ankur Arora
2020-12-02  8:03       ` David Woodhouse
2020-12-02 18:20         ` Ankur Arora
2019-02-20 20:15 ` [PATCH RFC 03/39] KVM: x86/xen: register shared_info page Joao Martins
2020-12-01 13:07   ` David Woodhouse
2020-12-02  0:40     ` Ankur Arora
2020-12-02  1:26       ` David Woodhouse
2020-12-02  5:17         ` Ankur Arora
2020-12-02 10:50           ` Joao Martins
2020-12-02 10:44       ` Joao Martins
2020-12-02 12:20         ` David Woodhouse
2020-12-02 20:32           ` Ankur Arora
2020-12-03 10:16             ` David Woodhouse
2020-12-04 17:30               ` Sean Christopherson
2020-12-02 20:33         ` Ankur Arora
2020-12-12 12:07       ` David Woodhouse
2019-02-20 20:15 ` [PATCH RFC 04/39] KVM: x86/xen: setup pvclock updates Joao Martins
2019-02-20 20:15 ` [PATCH RFC 05/39] KVM: x86/xen: update wallclock region Joao Martins
2019-02-20 20:15 ` [PATCH RFC 06/39] KVM: x86/xen: register vcpu info Joao Martins
2019-02-20 20:15 ` [PATCH RFC 07/39] KVM: x86/xen: register vcpu time info region Joao Martins
2019-02-20 20:15 ` [PATCH RFC 08/39] KVM: x86/xen: register steal clock Joao Martins
2019-02-20 20:15 ` [PATCH RFC 09/39] KVM: x86: declare Xen HVM guest capability Joao Martins
2019-02-20 20:15 ` [PATCH RFC 10/39] KVM: x86/xen: support upcall vector Joao Martins
2020-12-02 11:17   ` David Woodhouse
2020-12-02 13:12     ` Joao Martins
2020-12-02 16:47       ` David Woodhouse
2020-12-02 18:34         ` Joao Martins
2020-12-02 19:02           ` David Woodhouse
2020-12-02 20:12             ` Joao Martins
2020-12-02 20:37               ` David Woodhouse
2020-12-03  1:08             ` Ankur Arora
2020-12-08 16:08             ` David Woodhouse
2020-12-09  6:35               ` Ankur Arora
2020-12-09 10:27                 ` David Woodhouse
2020-12-09 10:51                   ` Joao Martins
2020-12-09 11:39                     ` David Woodhouse
2020-12-09 13:26                       ` Joao Martins
2020-12-09 15:41                         ` David Woodhouse
2020-12-09 16:12                           ` Joao Martins
2021-01-01 14:33           ` David Woodhouse
2021-01-05 12:11             ` Joao Martins
2021-01-05 13:23               ` David Woodhouse
2019-02-20 20:15 ` [PATCH RFC 11/39] KVM: x86/xen: evtchn signaling via eventfd Joao Martins
2020-11-30  9:41   ` David Woodhouse
2020-11-30 12:17     ` Joao Martins
2020-11-30 12:55       ` David Woodhouse
2020-11-30 15:08         ` Joao Martins
2020-11-30 16:48           ` David Woodhouse
2020-11-30 17:15             ` Joao Martins
2020-11-30 18:01               ` David Woodhouse
2020-11-30 18:41                 ` Joao Martins
2020-11-30 19:04                   ` David Woodhouse
2020-11-30 19:25                     ` Joao Martins
2021-11-23 13:15           ` David Woodhouse
2019-02-20 20:15 ` [PATCH RFC 12/39] KVM: x86/xen: store virq when assigning evtchn Joao Martins
     [not found]   ` <b750291466f3c89e0a393e48079c087704b217a5.camel@amazon.co.uk>
2022-02-10 12:17     ` Joao Martins
2022-02-10 15:23       ` David Woodhouse [this message]
2019-02-20 20:15 ` [PATCH RFC 13/39] KVM: x86/xen: handle PV timers oneshot mode Joao Martins
2019-02-20 20:15 ` [PATCH RFC 14/39] KVM: x86/xen: handle PV IPI vcpu yield Joao Martins
2019-02-20 20:15 ` [PATCH RFC 15/39] KVM: x86/xen: handle PV spinlocks slowpath Joao Martins
2022-02-08 12:36   ` David Woodhouse
2022-02-10 12:17     ` Joao Martins
2022-02-10 14:11       ` David Woodhouse
2019-02-20 20:15 ` [PATCH RFC 16/39] KVM: x86: declare Xen HVM evtchn offload capability Joao Martins
2019-02-20 20:15 ` [PATCH RFC 17/39] x86/xen: export vcpu_info and shared_info Joao Martins
2019-02-20 20:15 ` [PATCH RFC 18/39] x86/xen: make hypercall_page generic Joao Martins
2019-02-20 20:15 ` [PATCH RFC 19/39] xen/xenbus: xenbus uninit support Joao Martins
2019-02-20 20:15 ` [PATCH RFC 20/39] xen-blkback: module_exit support Joao Martins
2019-02-25 18:57   ` Konrad Rzeszutek Wilk
2019-02-26 11:20     ` Joao Martins
2019-02-20 20:15 ` [PATCH RFC 21/39] KVM: x86/xen: domid allocation Joao Martins
2019-02-20 20:15 ` [PATCH RFC 22/39] KVM: x86/xen: grant table init Joao Martins
2019-02-20 20:15 ` [PATCH RFC 23/39] KVM: x86/xen: grant table grow support Joao Martins
2019-02-20 20:15 ` [PATCH RFC 24/39] KVM: x86/xen: backend hypercall support Joao Martins
2019-02-20 20:15 ` [PATCH RFC 25/39] KVM: x86/xen: grant map support Joao Martins
2019-02-20 20:15 ` [PATCH RFC 26/39] KVM: x86/xen: grant unmap support Joao Martins
2019-02-20 20:15 ` [PATCH RFC 27/39] KVM: x86/xen: grant copy support Joao Martins
2019-02-20 20:15 ` [PATCH RFC 28/39] KVM: x86/xen: interdomain evtchn support Joao Martins
2019-02-20 20:15 ` [PATCH RFC 29/39] KVM: x86/xen: evtchn unmask support Joao Martins
2019-02-20 20:16 ` [PATCH RFC 30/39] KVM: x86/xen: add additional evtchn ops Joao Martins
2019-02-20 20:16 ` [PATCH RFC 31/39] xen-shim: introduce shim domain driver Joao Martins
2019-02-20 20:16 ` [PATCH RFC 32/39] xen/balloon: xen_shim_domain() support Joao Martins
2019-02-20 20:16 ` [PATCH RFC 33/39] xen/grant-table: " Joao Martins
2019-02-20 20:16 ` [PATCH RFC 34/39] xen/gntdev: " Joao Martins
2019-02-20 20:16 ` [PATCH RFC 35/39] xen/xenbus: " Joao Martins
2019-02-20 20:16 ` [PATCH RFC 36/39] drivers/xen: " Joao Martins
2019-02-20 20:16 ` [PATCH RFC 37/39] xen-netback: " Joao Martins
2019-02-20 20:16 ` [PATCH RFC 38/39] xen-blkback: " Joao Martins
2019-02-20 20:16 ` [PATCH RFC 39/39] KVM: x86: declare Xen HVM Dom0 capability Joao Martins
2019-02-20 21:09 ` [PATCH RFC 00/39] x86/KVM: Xen HVM guest support Paolo Bonzini
2019-02-21  0:29   ` Ankur Arora
2019-02-21 11:45   ` Joao Martins
2019-02-22 16:59     ` Paolo Bonzini
2019-03-12 17:14       ` Joao Martins
2019-04-08  6:44         ` Juergen Gross
2019-04-08 10:36           ` Joao Martins
2019-04-08 10:42             ` Juergen Gross
2019-04-08 17:31               ` Joao Martins
2019-04-09  0:35                 ` Stefano Stabellini
2019-04-10  5:50                   ` [Xen-devel] " Ankur Arora
2019-04-10 20:45                     ` Stefano Stabellini
2019-04-09  5:04                 ` Juergen Gross
2019-04-10  6:55                   ` Ankur Arora
2019-04-10  7:14                     ` Juergen Gross
2019-02-20 23:39 ` [Xen-devel] " Marek Marczykowski-Górecki
2019-02-21  0:31   ` Ankur Arora
2019-02-21  7:57   ` Juergen Gross
2019-02-21 12:00     ` Joao Martins
2019-02-21 11:55   ` Joao Martins
