Linux-PCI Archive on lore.kernel.org
 help / color / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Julia Suvorova <jusual@redhat.com>
Cc: Bjorn Helgaas <helgaas@kernel.org>,
	linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
	Bjorn Helgaas <bhelgaas@google.com>,
	Matthew Wilcox <willy@infradead.org>,
	Vitaly Kuznetsov <vkuznets@redhat.com>
Subject: Re: [PATCH] x86/PCI: Use MMCONFIG by default for KVM guests
Date: Wed, 29 Jul 2020 10:43:07 -0400
Message-ID: <20200729102929-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <CAMDeoFUCanfvrxAmW4_QH=L9BExCAydCifE_tvRaW_XTd0OQXw@mail.gmail.com>

On Sun, Jul 26, 2020 at 08:58:45PM +0200, Julia Suvorova wrote:
> On Thu, Jul 23, 2020 at 1:19 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> >
> > On Wed, Jul 22, 2020 at 02:15:13AM +0200, Julia Suvorova wrote:
> > > Scanning for PCI devices at boot takes a long time for KVM guests. It
> > > can be reduced if KVM will handle all configuration space accesses for
> > > non-existent devices without going to userspace [1]. But for this to
> > > work, all accesses must go through MMCONFIG.
> > > This change allows to use pci_mmcfg as raw_pci_ops for 64-bit KVM
> > > guests making MMCONFIG the default access method.
> >
> > The above *looks* like it's intended to be two paragraphs, which would
> > be easier to read with a blank line between.
> >
> > The last sentence should say what the patch actually *does*, e.g.,
> > "Use pci_mmcfg as raw_pci_ops ..."
> >
> > > [1] https://lkml.org/lkml/2020/5/14/936
> >
> > Please use a lore.kernel.org URL instead because it's more usable and
> > I'd rather depend on kernel.org than lkml.org.
> >
> > > Signed-off-by: Julia Suvorova <jusual@redhat.com>
> > > ---
> > >  arch/x86/pci/direct.c      | 5 +++++
> > >  arch/x86/pci/mmconfig_64.c | 3 +++
> > >  2 files changed, 8 insertions(+)
> > >
> > > diff --git a/arch/x86/pci/direct.c b/arch/x86/pci/direct.c
> > > index a51074c55982..8ff6b65d8f48 100644
> > > --- a/arch/x86/pci/direct.c
> > > +++ b/arch/x86/pci/direct.c
> > > @@ -6,6 +6,7 @@
> > >  #include <linux/pci.h>
> > >  #include <linux/init.h>
> > >  #include <linux/dmi.h>
> > > +#include <linux/kvm_para.h>
> > >  #include <asm/pci_x86.h>
> > >
> > >  /*
> > > @@ -264,6 +265,10 @@ void __init pci_direct_init(int type)
> > >  {
> > >       if (type == 0)
> > >               return;
> > > +
> > > +     if (raw_pci_ext_ops && kvm_para_available())
> > > +             return;
> > >       printk(KERN_INFO "PCI: Using configuration type %d for base access\n",
> > >                type);
> > >       if (type == 1) {
> > > diff --git a/arch/x86/pci/mmconfig_64.c b/arch/x86/pci/mmconfig_64.c
> > > index 0c7b6e66c644..9eb772821766 100644
> > > --- a/arch/x86/pci/mmconfig_64.c
> > > +++ b/arch/x86/pci/mmconfig_64.c
> > > @@ -10,6 +10,7 @@
> > >  #include <linux/init.h>
> > >  #include <linux/acpi.h>
> > >  #include <linux/bitmap.h>
> > > +#include <linux/kvm_para.h>
> > >  #include <linux/rcupdate.h>
> > >  #include <asm/e820/api.h>
> > >  #include <asm/pci_x86.h>
> > > @@ -122,6 +123,8 @@ int __init pci_mmcfg_arch_init(void)
> > >               }
> > >
> > >       raw_pci_ext_ops = &pci_mmcfg;
> > > +     if (kvm_para_available())
> > > +             raw_pci_ops = &pci_mmcfg;
> >
> > The idea of using MMCONFIG for *all* config space, not just extended
> > config space, makes sense to me, although the very long discussion at
> > https://lore.kernel.org/lkml/20071225032605.29147200@laptopd505.fenrus.org/
> > makes me wary.  Of course I realize you're talking specifically about
> > KVM, not doing this in general.
> >
> > But it doesn't seem right to make this specific to KVM, since it's not
> > obvious to me that there's a basis in PCI for making this distinction.
> 
> Bugs that were fixed (or more accurately, avoided) by a0ca99096094
> ("PCI x86: always use conf1 to access config space below 256 bytes")
> are still present. And to enable MMCONFIG for the entire config space,
> we need to re-introduce all these fixes or at least identify affected
> devices, which may be impossible.

What *is* about KVM here is that there's no real benefit
to this change if not running on x86 within a hypervisor.
And this should be better documented in a code comment and
commit log.



What is *not* about KVM here is that it's known to be
safe when running on QEMU and on the specific implementation.
Other implementations - even if they are using kvm -
might freeze if you disable memory of the pci host device,
or try to size BARs so they overlap the MMCONFIG.

So to proceed with your approach, I would say either we limit this to
just a known good QEMU device, or disable this when poking at unsafe
registers.

But I have another idea: isn't it true that we can get a large part of
the benefit simply by reading the device/vendor ID through MMCONFIG?
That is almost sure to be safe anyway, even though I would limit it to
just KVM simply because other systems do not benefit.

> 
> We can avoid KVM-specific changes in the generic PCI code by
> implementing x86_init.pci.arch_init inside KVM code, as Vitaly
> suggested. What do you think?
> 
> Best regards, Julia Suvorova.

Makes sense.

-- 
MST


      reply index

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-22  0:15 Julia Suvorova
2020-07-22  2:59 ` Matthew Wilcox
2020-07-22  9:43 ` Vitaly Kuznetsov
2020-07-26 18:35   ` Julia Suvorova
2020-07-27 14:04   ` Andy Shevchenko
2020-07-27 15:55     ` Vitaly Kuznetsov
2020-07-22 10:59 ` Michael S. Tsirkin
2020-07-22 23:19 ` Bjorn Helgaas
2020-07-26 18:58   ` Julia Suvorova
2020-07-29 14:43     ` Michael S. Tsirkin [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200729102929-mutt-send-email-mst@kernel.org \
    --to=mst@redhat.com \
    --cc=bhelgaas@google.com \
    --cc=helgaas@kernel.org \
    --cc=jusual@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=vkuznets@redhat.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-PCI Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-pci/0 linux-pci/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-pci linux-pci/ https://lore.kernel.org/linux-pci \
		linux-pci@vger.kernel.org
	public-inbox-index linux-pci

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-pci


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git