linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
From: David Gibson <david@gibson.dropbear.id.au>
To: "Cédric Le Goater" <clg@kaod.org>
Cc: kvm@vger.kernel.org, kvm-ppc@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH 06/19] KVM: PPC: Book3S HV: add a GET_ESB_FD control to the XIVE native device
Date: Mon, 11 Feb 2019 13:38:42 +1100	[thread overview]
Message-ID: <20190211023842.GE7230@umbus.fritz.box> (raw)
In-Reply-To: <8c915ed7-5aa5-1276-6598-d5dcd115dd56@kaod.org>

[-- Attachment #1: Type: text/plain, Size: 4507 bytes --]

On Sat, Feb 09, 2019 at 10:41:38AM +0100, Cédric Le Goater wrote:
> On 2/8/19 10:53 PM, Paul Mackerras wrote:
> > On Fri, Feb 08, 2019 at 08:58:14AM +0100, Cédric Le Goater wrote:
> >> On 2/8/19 6:15 AM, David Gibson wrote:
> >>> On Thu, Feb 07, 2019 at 10:03:15AM +0100, Cédric Le Goater wrote:
> >>>> That's the plan I have in mind as suggested by Paul if I understood it well.
> >>>> The mechanics are more complex than the patch zapping the PTEs from the VMA
> >>>> but it's also safer.
> >>>
> >>> Well, yes, where "safer" means "has the possibility to be correct".
> >>
> >> Well, the only problem with the kernel approach is keeping a pointer on 
> >> the VMA. If we could call find_vma(), it would be perfectly safe and much 
> >> more simpler.
> > 
> > You seem to be assuming that the kernel can easily work out a single
> > virtual address which will be the only place where a given set of
> > interrupt pages are mapped.  But that is really not possible in the
> > general case, because userspace could have mapped the fd at many
> > different offsets in many different places.
> > 
> > QEMU doesn't do that; in QEMU, the mmaps are sufficiently limited that
> > it can work out a single virtual address that needs to be changed.
> > The way that QEMU should tell the kernel what that address is and what
> > the mapping should be changed to, is via the existing munmap()/mmap()
> > interface.
> 
> Yes. We agreed on that. QEMU should handle these mappings somewhere in 
> VFIO. It's me grumbling, that's all.
> 
> The discussion has moved to the mmap() interface of the KVM device. The 
> current proposal adds controls on the device creating fds to mmap() the 
> TIMA pages and the ESB pages. David is proposing to use directly the fd 
> of the KVM device to mmap() these pages with a different offset for each 
> set. 
> 
> I think that should work pretty well, for passthrough also. The fault 
> handler should take care of populating the VMA(s) with the appropriate 
> pages. 
> 
> We might support END notification one day, so we should have room for 
> these pages. And nested might require IRQ space extensions at L1. 
> something to keep in mind.

I had some more thoughts on this topic.  I think there's been some
confusion because there are more ways of tackling this than I
previously realized:

1) All in kernel

The offset always maps directly to guest irq number and the kernel
somehow binds it either to an IPI or a host irq as necessary.
Cédric's original code attempts this, but the mechanism of keeping a
pointer to the VMA can't work.

But.. remapping the irqs should be sufficiently infrequent that it
might be ok to consider simply stepping through all the hosting
process's VMAs to do this.

2) Remapped in qemu (using memory regions)

I _think_ (in hindsight) was Cédric's been discussing as the
alternative in more recent posts.

Qemu maps the IPI pages at one place and the passthrough IRQ pages
somewhere else.  The IPIs are mapped into the guest as one memory
region, then any passthrough IRQ pages are mapped over that using
overlapping memory regions.

I don't think this approach will work well, because it could require a
bunch of separate KVM memory slots, which are fairly scarce.

3) Remapped in qemu (using mmap())

This is the approach I (and I think Paul) have been suggested in
contrast to (1).

Qemu maps the IPI pages and maps those into the guest.  When we need
to set up a passthrough IRQ, qemu mmap()s its pages directly over the
IPI pages, and it remains mapped into the guest with the same memory
region / memslot as the IPIs are already using.  If the passthrough
device is removed we have to remap the IPI pages back into place.

4) Dedicated irq numbers

We never re-use regular guest irq numbers for passthrough irqs,
instead we put them somewhere else and keep those mapped to the
passthrough irq pages.

I was favouring this approach, but it does mean there will be a guest
visible difference between kernel_irqchip=on and off which isn't
great.


(1) is the most elegant _interface_, but as we've seen it's
problematic to implement.  Looking at the for_all_vmas() approach
could be interesting, but otherwise option (3) might be the most
practical.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  reply	other threads:[~2019-02-11  6:28 UTC|newest]

Thread overview: 135+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-07 18:43 [PATCH 00/19] KVM: PPC: Book3S HV: add XIVE native exploitation mode Cédric Le Goater
2019-01-07 18:43 ` [PATCH 01/19] powerpc/xive: export flags for the XIVE native exploitation mode hcalls Cédric Le Goater
2019-01-09  3:33   ` David Gibson
2019-01-09 13:08   ` Michael Ellerman
2019-01-09 13:38     ` Cédric Le Goater
2019-01-07 18:43 ` [PATCH 02/19] powerpc/xive: add OPAL extensions for the XIVE native exploitation support Cédric Le Goater
2019-01-09  4:26   ` David Gibson
2019-01-07 18:43 ` [PATCH 03/19] KVM: PPC: Book3S HV: check the IRQ controller type Cédric Le Goater
2019-01-09  4:27   ` David Gibson
2019-01-22  4:56   ` Paul Mackerras
2019-01-23 16:24     ` Cédric Le Goater
2019-02-04  0:50       ` David Gibson
2019-02-04 10:16         ` Cédric Le Goater
2019-01-07 18:43 ` [PATCH 04/19] KVM: PPC: Book3S HV: export services for the XIVE native exploitation device Cédric Le Goater
2019-01-11  4:09   ` David Gibson
2019-01-07 18:43 ` [PATCH 05/19] KVM: PPC: Book3S HV: add a new KVM device for the XIVE native exploitation mode Cédric Le Goater
2019-01-22  5:05   ` Paul Mackerras
2019-01-23 16:28     ` Cédric Le Goater
2019-01-28 17:35     ` Cédric Le Goater
2019-01-30  4:29       ` Paul Mackerras
2019-01-30  7:01         ` Cédric Le Goater
2019-01-31  3:01           ` Paul Mackerras
2019-02-01 17:03             ` Cédric Le Goater
2019-02-04  4:25   ` David Gibson
2019-02-04 11:19     ` Cédric Le Goater
2019-02-05  5:26       ` David Gibson
2019-01-07 18:43 ` [PATCH 06/19] KVM: PPC: Book3S HV: add a GET_ESB_FD control to the XIVE native device Cédric Le Goater
2019-01-22  5:09   ` Paul Mackerras
2019-01-23 16:48     ` Cédric Le Goater
2019-02-04  4:45   ` David Gibson
2019-02-04 11:30     ` Cédric Le Goater
2019-02-05  5:28       ` David Gibson
2019-02-05 12:55         ` Cédric Le Goater
2019-02-06  1:23           ` David Gibson
2019-02-06  7:21             ` Cédric Le Goater
2019-02-07  2:49               ` David Gibson
2019-02-07  9:03                 ` Cédric Le Goater
2019-02-08  5:15                   ` David Gibson
2019-02-08  7:58                     ` Cédric Le Goater
2019-02-08 21:53                       ` Paul Mackerras
2019-02-09  9:41                         ` Cédric Le Goater
2019-02-11  2:38                           ` David Gibson [this message]
2019-02-11  6:42                             ` Benjamin Herrenschmidt
2019-02-12 22:07                               ` Cédric Le Goater
2019-01-07 18:43 ` [PATCH 07/19] KVM: PPC: Book3S HV: add a GET_TIMA_FD control to " Cédric Le Goater
2019-01-07 18:43 ` [PATCH 08/19] KVM: PPC: Book3S HV: add a VC_BASE control to the " Cédric Le Goater
2019-01-22  5:14   ` Paul Mackerras
2019-01-23 16:56     ` Cédric Le Goater
2019-02-04  4:49       ` David Gibson
2019-02-04 15:36         ` Cédric Le Goater
2019-01-07 18:43 ` [PATCH 09/19] KVM: PPC: Book3S HV: add a SET_SOURCE " Cédric Le Goater
2019-02-04  4:57   ` David Gibson
2019-02-04 19:07     ` Cédric Le Goater
2019-02-05  5:35       ` David Gibson
2019-02-05 13:39         ` Cédric Le Goater
2019-01-07 18:43 ` [PATCH 10/19] KVM: PPC: Book3S HV: add a EISN attribute to kvmppc_xive_irq_state Cédric Le Goater
2019-01-07 18:43 ` [PATCH 11/19] KVM: PPC: Book3S HV: add support for the XIVE native exploitation mode hcalls Cédric Le Goater
2019-01-22  5:23   ` Paul Mackerras
2019-01-23  6:44     ` Benjamin Herrenschmidt
2019-01-23  8:48       ` Cédric Le Goater
2019-01-23 10:26         ` Paul Mackerras
2019-01-23 10:48           ` Cédric Le Goater
2019-01-23 21:23           ` Benjamin Herrenschmidt
2019-01-07 18:43 ` [PATCH 12/19] KVM: PPC: Book3S HV: record guest queue page address Cédric Le Goater
2019-02-04  5:15   ` David Gibson
2019-02-04 15:37     ` Cédric Le Goater
2019-01-07 18:43 ` [PATCH 13/19] KVM: PPC: Book3S HV: add a SYNC control for the XIVE native migration Cédric Le Goater
2019-02-04  5:17   ` David Gibson
2019-02-04 15:39     ` Cédric Le Goater
2019-01-07 18:43 ` [PATCH 14/19] KVM: PPC: Book3S HV: add a control to make the XIVE EQ pages dirty Cédric Le Goater
2019-02-04  5:18   ` David Gibson
2019-02-04 15:46     ` Cédric Le Goater
2019-02-05  5:30       ` David Gibson
2019-01-07 18:43 ` [PATCH 15/19] KVM: PPC: Book3S HV: add get/set accessors for the source configuration Cédric Le Goater
2019-02-04  5:21   ` David Gibson
2019-02-04 16:07     ` Cédric Le Goater
2019-02-05  5:32       ` David Gibson
2019-02-05 13:03         ` Cédric Le Goater
2019-02-06  1:23           ` David Gibson
2019-02-06  1:24             ` David Gibson
2019-02-06  7:07               ` Cédric Le Goater
2019-02-07  2:48                 ` David Gibson
2019-02-07  9:13                   ` Cédric Le Goater
2019-02-08  5:15                     ` David Gibson
2019-02-14 16:50                       ` Cédric Le Goater
2019-01-07 18:43 ` [PATCH 16/19] KVM: PPC: Book3S HV: add get/set accessors for the EQ configuration Cédric Le Goater
2019-02-04  5:24   ` David Gibson
2019-02-05 17:45     ` Cédric Le Goater
2019-01-07 19:10 ` [PATCH 17/19] KVM: PPC: Book3S HV: add get/set accessors for the VP XIVE state Cédric Le Goater
2019-01-07 19:10   ` [PATCH 18/19] KVM: PPC: Book3S HV: add passthrough support Cédric Le Goater
2019-01-22  5:26     ` Paul Mackerras
2019-01-23  6:45       ` Benjamin Herrenschmidt
2019-01-23 10:30         ` Paul Mackerras
2019-01-23 11:07           ` Cédric Le Goater
2019-01-28  6:13             ` Paul Mackerras
2019-01-28 18:26               ` Cédric Le Goater
2019-01-29  2:45                 ` Paul Mackerras
2019-01-29 13:47                   ` Cédric Le Goater
2019-01-30  6:20                     ` Paul Mackerras
2019-01-30 15:54                       ` Cédric Le Goater
2019-01-31  2:48                         ` Paul Mackerras
2019-01-29  4:12                 ` Paul Mackerras
2019-01-29 17:44                   ` Cédric Le Goater
2019-01-30  5:55                     ` Paul Mackerras
2019-01-30  7:06                       ` Cédric Le Goater
2019-01-23 21:25           ` Benjamin Herrenschmidt
2019-01-24  8:41             ` Cédric Le Goater
2019-01-28  4:43             ` Paul Mackerras
2019-01-29 13:46               ` Cédric Le Goater
2019-01-07 19:10   ` [PATCH 19/19] KVM: introduce a KVM_DELETE_DEVICE ioctl Cédric Le Goater
2019-01-22  5:42     ` Paul Mackerras
2019-01-23 18:39       ` Cédric Le Goater
2019-01-23 21:32         ` Benjamin Herrenschmidt
2019-02-04  5:26   ` [PATCH 17/19] KVM: PPC: Book3S HV: add get/set accessors for the VP XIVE state David Gibson
2019-02-04 18:57     ` Cédric Le Goater
2019-02-05  5:33       ` David Gibson
2019-02-05 11:58         ` Cédric Le Goater
2019-02-06  1:19           ` David Gibson
2019-01-22  4:46 ` [PATCH 00/19] KVM: PPC: Book3S HV: add XIVE native exploitation mode Paul Mackerras
2019-01-23 19:07   ` Cédric Le Goater
2019-01-23 21:35     ` Benjamin Herrenschmidt
2019-01-26  8:25       ` Cédric Le Goater
2019-02-04  5:36         ` David Gibson
2019-02-05 11:31           ` Cédric Le Goater
2019-02-05 22:13             ` Paul Mackerras
2019-02-06  1:18               ` David Gibson
2019-02-06  7:35                 ` Cédric Le Goater
2019-02-07  2:51                   ` David Gibson
2019-02-07  8:31                     ` Cédric Le Goater
2019-02-08  5:07                       ` David Gibson
2019-02-08  7:38                         ` Cédric Le Goater
2019-01-28  5:51     ` Paul Mackerras
2019-01-29 13:51       ` Cédric Le Goater
2019-01-30  5:40         ` Paul Mackerras
2019-01-30 15:36           ` Cédric Le Goater

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190211023842.GE7230@umbus.fritz.box \
    --to=david@gibson.dropbear.id.au \
    --cc=clg@kaod.org \
    --cc=kvm-ppc@vger.kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).