KVM Archive on lore.kernel.org
* [PATCH] KVM: x86: Use MMCONFIG for all PCI config space accesses
@ 2020-07-30 19:35 Julia Suvorova
  2020-07-30 19:50 ` Andy Shevchenko
  0 siblings, 1 reply; 5+ messages in thread
From: Julia Suvorova @ 2020-07-30 19:35 UTC (permalink / raw)
  To: kvm
  Cc: linux-kernel, Bjorn Helgaas, Michael S. Tsirkin, Matthew Wilcox,
	Vitaly Kuznetsov, Andy Shevchenko, Sean Christopherson,
	Paolo Bonzini, Thomas Gleixner, Julia Suvorova

Using MMCONFIG instead of I/O ports cuts the number of config space
accesses in half, which is faster on KVM and opens the door for
additional optimizations such as Vitaly's "[PATCH 0/3] KVM: x86: KVM
MEM_PCI_HOLE memory":
https://lore.kernel.org/kvm/20200728143741.2718593-1-vkuznets@redhat.com

However, this change will not bring significant performance improvement
unless it is running on x86 within a hypervisor. Moreover, allowing
MMCONFIG access for addresses < 256 can be dangerous for some devices:
see commit a0ca99096094 ("PCI x86: always use conf1 to access config
space below 256 bytes"). That is why a special feature flag is needed.

Introduce KVM_FEATURE_PCI_GO_MMCONFIG, which can be enabled when the
configuration is known to be safe (e.g. in QEMU).

Signed-off-by: Julia Suvorova <jusual@redhat.com>
---
 Documentation/virt/kvm/cpuid.rst     |  4 ++++
 arch/x86/include/uapi/asm/kvm_para.h |  1 +
 arch/x86/kernel/kvm.c                | 14 ++++++++++++++
 3 files changed, 19 insertions(+)

diff --git a/Documentation/virt/kvm/cpuid.rst b/Documentation/virt/kvm/cpuid.rst
index a7dff9186bed..711f2074877b 100644
--- a/Documentation/virt/kvm/cpuid.rst
+++ b/Documentation/virt/kvm/cpuid.rst
@@ -92,6 +92,10 @@ KVM_FEATURE_ASYNC_PF_INT          14          guest checks this feature bit
                                               async pf acknowledgment msr
                                               0x4b564d07.
 
+KVM_FEATURE_PCI_GO_MMCONFIG       15          guest checks this feature bit
+                                              before using MMCONFIG for all
+                                              PCI config accesses
+
 KVM_FEATURE_CLOCSOURCE_STABLE_BIT 24          host will warn if no guest-side
                                               per-cpu warps are expeced in
                                               kvmclock
diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
index 812e9b4c1114..5793f372cae0 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -32,6 +32,7 @@
 #define KVM_FEATURE_POLL_CONTROL	12
 #define KVM_FEATURE_PV_SCHED_YIELD	13
 #define KVM_FEATURE_ASYNC_PF_INT	14
+#define KVM_FEATURE_PCI_GO_MMCONFIG	15
 
 #define KVM_HINTS_REALTIME      0
 
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index df63786e7bfa..1ec73e6f25ce 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -33,6 +33,7 @@
 #include <asm/hypervisor.h>
 #include <asm/tlb.h>
 #include <asm/cpuidle_haltpoll.h>
+#include <asm/pci_x86.h>
 
 DEFINE_STATIC_KEY_FALSE(kvm_async_pf_enabled);
 
@@ -715,6 +716,18 @@ static uint32_t __init kvm_detect(void)
 	return kvm_cpuid_base();
 }
 
+static int __init kvm_pci_arch_init(void)
+{
+	if (raw_pci_ext_ops &&
+	    kvm_para_has_feature(KVM_FEATURE_PCI_GO_MMCONFIG)) {
+		pr_info("PCI: Using MMCONFIG for base access\n");
+		raw_pci_ops = raw_pci_ext_ops;
+		return 0;
+	}
+
+	return 1;
+}
+
 static void __init kvm_apic_init(void)
 {
 #if defined(CONFIG_SMP)
@@ -726,6 +739,7 @@ static void __init kvm_apic_init(void)
 static void __init kvm_init_platform(void)
 {
 	kvmclock_init();
+	x86_init.pci.arch_init = kvm_pci_arch_init;
 	x86_platform.apic_post_init = kvm_apic_init;
 }
 
-- 
2.25.4
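
The commit message's claim that MMCONFIG halves the number of accesses can be illustrated with a small user-space sketch. It is not kernel code: the helper names conf1_address() and mmconfig_offset() are made up for illustration, and the bit layouts are the standard PCI configuration mechanism #1 and ECAM encodings. The legacy path needs one write to the address port and one access to the data port, while MMCONFIG reaches the register with a single memory access.

```c
#include <stdint.h>

/* Legacy "conf1" mechanism: the target address is written to port 0xCF8
 * and the data is then moved through port 0xCFC, so every config access
 * costs two port operations. */
#define CONF1_ADDR_PORT 0xCF8u
#define CONF1_DATA_PORT 0xCFCu

enum { CONF1_ACCESSES_PER_READ = 2, MMCONFIG_ACCESSES_PER_READ = 1 };

/* Hypothetical helper: value written to 0xCF8 for bus/device/function/reg.
 * Only the first 256 bytes are addressable; bits 1:0 of reg are dropped. */
static uint32_t conf1_address(uint8_t bus, uint8_t dev, uint8_t fn, uint8_t reg)
{
	return 0x80000000u | ((uint32_t)bus << 16) |
	       ((uint32_t)(dev & 0x1f) << 11) |
	       ((uint32_t)(fn & 0x7) << 8) | (reg & 0xfcu);
}

/* Hypothetical helper: offset into the MMCONFIG (ECAM) window. Each
 * function owns 4 KiB, so one memory access at base + offset is enough. */
static uint64_t mmconfig_offset(uint8_t bus, uint8_t dev, uint8_t fn, uint16_t reg)
{
	return ((uint64_t)bus << 20) | ((uint64_t)(dev & 0x1f) << 15) |
	       ((uint64_t)(fn & 0x7) << 12) | (reg & 0xfffu);
}
```

For device 3 on bus 0, register 0x10, the sketch yields the port value 0x80001810 for conf1 and the window offset 0x18010 for MMCONFIG.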



* Re: [PATCH] KVM: x86: Use MMCONFIG for all PCI config space accesses
  2020-07-30 19:35 [PATCH] KVM: x86: Use MMCONFIG for all PCI config space accesses Julia Suvorova
@ 2020-07-30 19:50 ` Andy Shevchenko
  2020-07-31  9:22   ` Vitaly Kuznetsov
  0 siblings, 1 reply; 5+ messages in thread
From: Andy Shevchenko @ 2020-07-30 19:50 UTC (permalink / raw)
  To: Julia Suvorova
  Cc: open list:VFIO DRIVER, Linux Kernel Mailing List, Bjorn Helgaas,
	Michael S. Tsirkin, Matthew Wilcox, Vitaly Kuznetsov,
	Sean Christopherson, Paolo Bonzini, Thomas Gleixner

On Thu, Jul 30, 2020 at 10:37 PM Julia Suvorova <jusual@redhat.com> wrote:
>
> Using MMCONFIG instead of I/O ports cuts the number of config space
> accesses in half, which is faster on KVM and opens the door for
> additional optimizations such as Vitaly's "[PATCH 0/3] KVM: x86: KVM
> MEM_PCI_HOLE memory":

> https://lore.kernel.org/kvm/20200728143741.2718593-1-vkuznets@redhat.com

You may use a Link: tag for this.

> However, this change will not bring significant performance improvement
> unless it is running on x86 within a hypervisor. Moreover, allowing
> MMCONFIG access for addresses < 256 can be dangerous for some devices:
> see commit a0ca99096094 ("PCI x86: always use conf1 to access config
> space below 256 bytes"). That is why a special feature flag is needed.
>
> Introduce KVM_FEATURE_PCI_GO_MMCONFIG, which can be enabled when the
> configuration is known to be safe (e.g. in QEMU).

...

> +static int __init kvm_pci_arch_init(void)
> +{
> +       if (raw_pci_ext_ops &&
> +           kvm_para_has_feature(KVM_FEATURE_PCI_GO_MMCONFIG)) {

Better to use the traditional pattern, i.e.
  if (not_supported)
    return bail_out;

  ...do useful things...
  return 0;
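
As a rough illustration of the pattern suggested above, here is a user-space sketch of the function restructured with an early return. The kernel symbols are replaced by stand-ins: host_offers_mmconfig models kvm_para_has_feature(KVM_FEATURE_PCI_GO_MMCONFIG), and the bail-out value 1 is simply the one the patch already returns, not a reviewed choice.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for kernel state; in the patch these are raw_pci_ops,
 * raw_pci_ext_ops and the KVM feature-bit check. */
struct pci_raw_ops { const char *name; };

static struct pci_raw_ops conf1_ops = { "conf1" };
static struct pci_raw_ops mmcfg_ops = { "mmconfig" };
static const struct pci_raw_ops *raw_pci_ops = &conf1_ops;
static const struct pci_raw_ops *raw_pci_ext_ops;
static bool host_offers_mmconfig;	/* models the feature bit */

static int kvm_pci_arch_init_sketch(void)
{
	/* Bail out first: no MMCONFIG ops available or no feature bit. */
	if (!raw_pci_ext_ops || !host_offers_mmconfig)
		return 1;

	/* Useful work follows without extra nesting. */
	raw_pci_ops = raw_pci_ext_ops;
	return 0;
}
```

The behaviour is meant to match the posted version; only the control flow is flattened.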

> +               pr_info("PCI: Using MMCONFIG for base access\n");
> +               raw_pci_ops = raw_pci_ext_ops;
> +               return 0;
> +       }

> +       return 1;

Hmm... I don't remember what positive codes mean there. Perhaps you
need to return an error code instead?

> +}

-- 
With Best Regards,
Andy Shevchenko


* Re: [PATCH] KVM: x86: Use MMCONFIG for all PCI config space accesses
  2020-07-30 19:50 ` Andy Shevchenko
@ 2020-07-31  9:22   ` Vitaly Kuznetsov
  2020-07-31  9:41     ` Andy Shevchenko
  2020-07-31 18:23     ` Julia Suvorova
  0 siblings, 2 replies; 5+ messages in thread
From: Vitaly Kuznetsov @ 2020-07-31  9:22 UTC (permalink / raw)
  To: Andy Shevchenko, Julia Suvorova
  Cc: open list:VFIO DRIVER, Linux Kernel Mailing List, Bjorn Helgaas,
	Michael S. Tsirkin, Matthew Wilcox, Sean Christopherson,
	Paolo Bonzini, Thomas Gleixner

Andy Shevchenko <andy.shevchenko@gmail.com> writes:

> On Thu, Jul 30, 2020 at 10:37 PM Julia Suvorova <jusual@redhat.com> wrote:
>>
>> Using MMCONFIG instead of I/O ports cuts the number of config space
>> accesses in half, which is faster on KVM and opens the door for
>> additional optimizations such as Vitaly's "[PATCH 0/3] KVM: x86: KVM
>> MEM_PCI_HOLE memory":
>
>> https://lore.kernel.org/kvm/20200728143741.2718593-1-vkuznets@redhat.com
>
> You may use a Link: tag for this.
>
>> However, this change will not bring significant performance improvement
>> unless it is running on x86 within a hypervisor. Moreover, allowing
>> MMCONFIG access for addresses < 256 can be dangerous for some devices:
>> see commit a0ca99096094 ("PCI x86: always use conf1 to access config
>> space below 256 bytes"). That is why a special feature flag is needed.
>>
>> Introduce KVM_FEATURE_PCI_GO_MMCONFIG, which can be enabled when the
>> configuration is known to be safe (e.g. in QEMU).
>
> ...
>
>> +static int __init kvm_pci_arch_init(void)
>> +{
>> +       if (raw_pci_ext_ops &&
>> +           kvm_para_has_feature(KVM_FEATURE_PCI_GO_MMCONFIG)) {
>
> Better to use the traditional pattern, i.e.
>   if (not_supported)
>     return bail_out;
>
>   ...do useful things...
>   return 0;
>
>> +               pr_info("PCI: Using MMCONFIG for base access\n");
>> +               raw_pci_ops = raw_pci_ext_ops;
>> +               return 0;
>> +       }
>
>> +       return 1;
>
> Hmm... I don't remember what positive codes mean there. Perhaps you
> need to return an error code instead?

If I'm reading the code correctly,

pci_arch_init() has the following:

        if (x86_init.pci.arch_init && !x86_init.pci.arch_init())
                return 0;


so returning '1' here means 'continue' and this seems to be
correct. (E.g. Hyper-V's hv_pci_init() does the same). What I'm not sure
about is 'return 0' above as this will result in skipping the rest of
pci_arch_init(). Was this desired or should we return '1' in both cases?
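
The gate quoted above can be modelled in a few lines of user-space C. This is a simplified sketch, not the kernel function: pci_arch_init_model() and the steps_run counter are invented for illustration, and the counter stands in for everything pci_arch_init() does after the callback.

```c
#include <stddef.h>

typedef int (*arch_init_fn)(void);

static int steps_run;	/* counts how often the later init steps executed */

static int returns_zero(void) { return 0; }
static int returns_one(void)  { return 1; }

/* Simplified model of the check in pci_arch_init(). */
static int pci_arch_init_model(arch_init_fn arch_init)
{
	if (arch_init && !arch_init())
		return 0;	/* callback returned 0: skip the rest */

	steps_run++;		/* callback absent or returned non-zero */
	return 0;
}
```

With no callback or with returns_one() the later steps run; with returns_zero() they are skipped.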

-- 
Vitaly



* Re: [PATCH] KVM: x86: Use MMCONFIG for all PCI config space accesses
  2020-07-31  9:22   ` Vitaly Kuznetsov
@ 2020-07-31  9:41     ` Andy Shevchenko
  2020-07-31 18:23     ` Julia Suvorova
  1 sibling, 0 replies; 5+ messages in thread
From: Andy Shevchenko @ 2020-07-31  9:41 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Julia Suvorova, open list:VFIO DRIVER, Linux Kernel Mailing List,
	Bjorn Helgaas, Michael S. Tsirkin, Matthew Wilcox,
	Sean Christopherson, Paolo Bonzini, Thomas Gleixner

On Fri, Jul 31, 2020 at 12:22 PM Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> Andy Shevchenko <andy.shevchenko@gmail.com> writes:
> > On Thu, Jul 30, 2020 at 10:37 PM Julia Suvorova <jusual@redhat.com> wrote:

...

> >> +static int __init kvm_pci_arch_init(void)
> >> +{
> >> +       if (raw_pci_ext_ops &&

> >> +               return 0;
> >> +       }
> >
> >> +       return 1;
> >
> > Hmm... I don't remember what positive codes mean there. Perhaps you
> > need to return an error code instead?
>
> If I'm reading the code correctly,
>
> pci_arch_init() has the following:
>
>         if (x86_init.pci.arch_init && !x86_init.pci.arch_init())
>                 return 0;
>
>
> so returning '1' here means 'continue' and this seems to be
> correct. (E.g. Hyper-V's hv_pci_init() does the same). What I'm not sure
> about is 'return 0' above as this will result in skipping the rest of
> pci_arch_init(). Was this desired or should we return '1' in both cases?

I think it depends on what you want. In complex cases we recognize three
possibilities:

-ERRNO: function failed, we have to stop and bail out with the error from the callee
0: function OK, stop and return 0
1: function OK, continue with the rest in the caller

Do we need this, or is the current scheme enough for all existing callees?
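
The three-way convention listed above could look roughly like this in a caller. It is a hypothetical sketch: init_with_hook(), the HOOK_* names and the sample hooks are invented here, and the existing pci_arch_init() only distinguishes zero from non-zero.

```c
#include <errno.h>
#include <stddef.h>

enum { HOOK_DONE = 0, HOOK_CONTINUE = 1 };

typedef int (*hook_fn)(void);

static int hook_fails(void)    { return -ENODEV; }
static int hook_done(void)     { return HOOK_DONE; }
static int hook_continue(void) { return HOOK_CONTINUE; }

static int generic_init_ran;	/* counts runs of the generic path */

static int init_with_hook(hook_fn hook)
{
	int ret = hook ? hook() : HOOK_CONTINUE;

	if (ret < 0)
		return ret;	/* failure: propagate the callee's error */
	if (ret == HOOK_DONE)
		return 0;	/* handled by the hook: stop here */

	generic_init_ran++;	/* HOOK_CONTINUE: run the generic path */
	return 0;
}
```

A negative return is the only case in which the caller itself reports failure; both non-negative values count as success and differ only in whether the generic path runs.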

-- 
With Best Regards,
Andy Shevchenko


* Re: [PATCH] KVM: x86: Use MMCONFIG for all PCI config space accesses
  2020-07-31  9:22   ` Vitaly Kuznetsov
  2020-07-31  9:41     ` Andy Shevchenko
@ 2020-07-31 18:23     ` Julia Suvorova
  1 sibling, 0 replies; 5+ messages in thread
From: Julia Suvorova @ 2020-07-31 18:23 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Andy Shevchenko, open list:VFIO DRIVER,
	Linux Kernel Mailing List, Bjorn Helgaas, Michael S. Tsirkin,
	Matthew Wilcox, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner

On Fri, Jul 31, 2020 at 11:22 AM Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>
> Andy Shevchenko <andy.shevchenko@gmail.com> writes:
>
> > On Thu, Jul 30, 2020 at 10:37 PM Julia Suvorova <jusual@redhat.com> wrote:
> >>
> >> Using MMCONFIG instead of I/O ports cuts the number of config space
> >> accesses in half, which is faster on KVM and opens the door for
> >> additional optimizations such as Vitaly's "[PATCH 0/3] KVM: x86: KVM
> >> MEM_PCI_HOLE memory":
> >
> >> https://lore.kernel.org/kvm/20200728143741.2718593-1-vkuznets@redhat.com
> >
> > You may use a Link: tag for this.
> >
> >> However, this change will not bring significant performance improvement
> >> unless it is running on x86 within a hypervisor. Moreover, allowing
> >> MMCONFIG access for addresses < 256 can be dangerous for some devices:
> >> see commit a0ca99096094 ("PCI x86: always use conf1 to access config
> >> space below 256 bytes"). That is why a special feature flag is needed.
> >>
> >> Introduce KVM_FEATURE_PCI_GO_MMCONFIG, which can be enabled when the
> >> configuration is known to be safe (e.g. in QEMU).
> >
> > ...
> >
> >> +static int __init kvm_pci_arch_init(void)
> >> +{
> >> +       if (raw_pci_ext_ops &&
> >> +           kvm_para_has_feature(KVM_FEATURE_PCI_GO_MMCONFIG)) {
> >
> > Better to use the traditional pattern, i.e.
> >   if (not_supported)
> >     return bail_out;
> >
> >   ...do useful things...
> >   return 0;
> >
> >> +               pr_info("PCI: Using MMCONFIG for base access\n");
> >> +               raw_pci_ops = raw_pci_ext_ops;
> >> +               return 0;
> >> +       }
> >
> >> +       return 1;
> >
> > Hmm... I don't remember what positive codes mean there. Perhaps you
> > need to return an error code instead?
>
> If I'm reading the code correctly,
>
> pci_arch_init() has the following:
>
>         if (x86_init.pci.arch_init && !x86_init.pci.arch_init())
>                 return 0;
>
>
> so returning '1' here means 'continue' and this seems to be
> correct. (E.g. Hyper-V's hv_pci_init() does the same). What I'm not sure
> about is 'return 0' above as this will result in skipping the rest of
> pci_arch_init(). Was this desired or should we return '1' in both cases?

This is intentional: if pci_arch_init() continued, pci_direct_init()
would overwrite raw_pci_ops. And since there are no QEMU entries in
pciprobe_dmi_table, it is safe to skip the DMI checks as well.
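
The reasoning above can be checked against a small user-space model. It is only a sketch under stated assumptions: direct_init_model() stands in for pci_direct_init() with configuration type 1 detected, hook() stands in for the callback from the patch with the feature bit set, and none of the names below are kernel symbols except raw_pci_ops.

```c
#include <stddef.h>

struct pci_raw_ops { const char *name; };

static const struct pci_raw_ops conf1 = { "conf1" };
static const struct pci_raw_ops mmcfg = { "mmconfig" };
static const struct pci_raw_ops *raw_pci_ops;

/* Models the hook with the feature bit set: it installs the MMCONFIG
 * ops and returns whatever value the experiment asks for. */
static int hook(int ret)
{
	raw_pci_ops = &mmcfg;
	return ret;
}

/* Models pci_direct_init() when configuration type 1 was detected. */
static void direct_init_model(void)
{
	raw_pci_ops = &conf1;
}

/* Returns the ops that remain installed after the modelled init. */
static const struct pci_raw_ops *arch_init_model(int hook_ret)
{
	raw_pci_ops = NULL;
	if (!hook(hook_ret))
		return raw_pci_ops;	/* rest skipped: MMCONFIG stays */

	direct_init_model();		/* rest runs: conf1 takes over */
	return raw_pci_ops;
}
```

In this model, returning 0 from the hook leaves the MMCONFIG ops in place, whereas returning 1 lets the direct-probe step replace them with conf1.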

Best regards, Julia Suvorova.

