* [PATCH] KVM: x86: Use MMCONFIG for all PCI config space accesses
@ 2020-07-30 19:35 Julia Suvorova
2020-07-30 19:50 ` Andy Shevchenko
0 siblings, 1 reply; 5+ messages in thread
From: Julia Suvorova @ 2020-07-30 19:35 UTC (permalink / raw)
To: kvm
Cc: linux-kernel, Bjorn Helgaas, Michael S. Tsirkin, Matthew Wilcox,
Vitaly Kuznetsov, Andy Shevchenko, Sean Christopherson,
Paolo Bonzini, Thomas Gleixner, Julia Suvorova
Using MMCONFIG instead of I/O ports cuts the number of config space
accesses in half, which is faster on KVM and opens the door for
additional optimizations such as Vitaly's "[PATCH 0/3] KVM: x86: KVM
MEM_PCI_HOLE memory":
https://lore.kernel.org/kvm/20200728143741.2718593-1-vkuznets@redhat.com
However, this change will not bring significant performance improvement
unless it is running on x86 within a hypervisor. Moreover, allowing
MMCONFIG access for addresses < 256 can be dangerous for some devices:
see commit a0ca99096094 ("PCI x86: always use conf1 to access config
space below 256 bytes"). That is why a special feature flag is needed.
Introduce KVM_FEATURE_PCI_GO_MMCONFIG, which can be enabled when the
configuration is known to be safe (e.g. in QEMU).
Signed-off-by: Julia Suvorova <jusual@redhat.com>
---
Documentation/virt/kvm/cpuid.rst | 4 ++++
arch/x86/include/uapi/asm/kvm_para.h | 1 +
arch/x86/kernel/kvm.c | 14 ++++++++++++++
3 files changed, 19 insertions(+)
diff --git a/Documentation/virt/kvm/cpuid.rst b/Documentation/virt/kvm/cpuid.rst
index a7dff9186bed..711f2074877b 100644
--- a/Documentation/virt/kvm/cpuid.rst
+++ b/Documentation/virt/kvm/cpuid.rst
@@ -92,6 +92,10 @@ KVM_FEATURE_ASYNC_PF_INT 14 guest checks this feature bit
async pf acknowledgment msr
0x4b564d07.
+KVM_FEATURE_PCI_GO_MMCONFIG 15 guest checks this feature bit
+ before using MMCONFIG for all
+ PCI config accesses
+
KVM_FEATURE_CLOCSOURCE_STABLE_BIT 24 host will warn if no guest-side
per-cpu warps are expeced in
kvmclock
diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
index 812e9b4c1114..5793f372cae0 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -32,6 +32,7 @@
#define KVM_FEATURE_POLL_CONTROL 12
#define KVM_FEATURE_PV_SCHED_YIELD 13
#define KVM_FEATURE_ASYNC_PF_INT 14
+#define KVM_FEATURE_PCI_GO_MMCONFIG 15
#define KVM_HINTS_REALTIME 0
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index df63786e7bfa..1ec73e6f25ce 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -33,6 +33,7 @@
#include <asm/hypervisor.h>
#include <asm/tlb.h>
#include <asm/cpuidle_haltpoll.h>
+#include <asm/pci_x86.h>
DEFINE_STATIC_KEY_FALSE(kvm_async_pf_enabled);
@@ -715,6 +716,18 @@ static uint32_t __init kvm_detect(void)
return kvm_cpuid_base();
}
+static int __init kvm_pci_arch_init(void)
+{
+	if (raw_pci_ext_ops &&
+	    kvm_para_has_feature(KVM_FEATURE_PCI_GO_MMCONFIG)) {
+		pr_info("PCI: Using MMCONFIG for base access\n");
+		raw_pci_ops = raw_pci_ext_ops;
+		return 0;
+	}
+
+	return 1;
+}
+
static void __init kvm_apic_init(void)
{
#if defined(CONFIG_SMP)
@@ -726,6 +739,7 @@ static void __init kvm_apic_init(void)
static void __init kvm_init_platform(void)
{
kvmclock_init();
+ x86_init.pci.arch_init = kvm_pci_arch_init;
x86_platform.apic_post_init = kvm_apic_init;
}
--
2.25.4
^ permalink raw reply related [flat|nested] 5+ messages in thread
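Note on the "half as many accesses" claim in the commit message above: it follows from how the two mechanisms address config space. The sketch below is illustrative only and is not part of the patch. The helper names conf1_address() and ecam_offset() and the two enum constants are made up for this note. It assumes the standard configuration mechanism #1 layout (address written to port 0xCF8, data transferred through port 0xCFC) and the standard MMCONFIG/ECAM layout (a single memory access at a computed offset).

```c
#include <assert.h>
#include <stdint.h>

/* Accesses needed per config read or write (assumed, per the layouts above). */
enum {
	CONF1_ACCESSES_PER_OP = 2,	/* address port write + data port access */
	MMCONFIG_ACCESSES_PER_OP = 1	/* one memory-mapped access */
};

/* Value written to the address port (0xCF8) before touching the data port. */
static uint32_t conf1_address(unsigned bus, unsigned dev, unsigned fn,
			      unsigned reg)
{
	/* Plain conf1 only reaches the first 256 bytes of config space. */
	assert(bus < 256 && dev < 32 && fn < 8 && reg < 256);
	return 0x80000000u | (bus << 16) | (dev << 11) | (fn << 8) |
	       (reg & 0xfcu);
}

/* Byte offset of a register inside the MMCONFIG window. */
static uint64_t ecam_offset(unsigned bus, unsigned dev, unsigned fn,
			    unsigned reg)
{
	/* MMCONFIG covers the full 4096-byte extended config space. */
	assert(bus < 256 && dev < 32 && fn < 8 && reg < 4096);
	return ((uint64_t)bus << 20) | (dev << 15) | (fn << 12) | reg;
}
```

If each emulated access costs the guest one exit to the hypervisor, which is an assumption about the setup and not something the patch states, going from two accesses to one is where the saving would come from.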
* Re: [PATCH] KVM: x86: Use MMCONFIG for all PCI config space accesses
2020-07-30 19:35 [PATCH] KVM: x86: Use MMCONFIG for all PCI config space accesses Julia Suvorova
@ 2020-07-30 19:50 ` Andy Shevchenko
2020-07-31 9:22 ` Vitaly Kuznetsov
0 siblings, 1 reply; 5+ messages in thread
From: Andy Shevchenko @ 2020-07-30 19:50 UTC (permalink / raw)
To: Julia Suvorova
Cc: open list:VFIO DRIVER, Linux Kernel Mailing List, Bjorn Helgaas,
Michael S. Tsirkin, Matthew Wilcox, Vitaly Kuznetsov,
Sean Christopherson, Paolo Bonzini, Thomas Gleixner
On Thu, Jul 30, 2020 at 10:37 PM Julia Suvorova <jusual@redhat.com> wrote:
>
> Using MMCONFIG instead of I/O ports cuts the number of config space
> accesses in half, which is faster on KVM and opens the door for
> additional optimizations such as Vitaly's "[PATCH 0/3] KVM: x86: KVM
> MEM_PCI_HOLE memory":
> https://lore.kernel.org/kvm/20200728143741.2718593-1-vkuznets@redhat.com
You may use a Link: tag for this.
> However, this change will not bring significant performance improvement
> unless it is running on x86 within a hypervisor. Moreover, allowing
> MMCONFIG access for addresses < 256 can be dangerous for some devices:
> see commit a0ca99096094 ("PCI x86: always use conf1 to access config
> space below 256 bytes"). That is why a special feature flag is needed.
>
> Introduce KVM_FEATURE_PCI_GO_MMCONFIG, which can be enabled when the
> configuration is known to be safe (e.g. in QEMU).
...
> +static int __init kvm_pci_arch_init(void)
> +{
> + if (raw_pci_ext_ops &&
> + kvm_para_has_feature(KVM_FEATURE_PCI_GO_MMCONFIG)) {
Better to use traditional pattern, i.e.
if (not_supported)
return bail_out;
...do useful things...
return 0;
> + pr_info("PCI: Using MMCONFIG for base access\n");
> + raw_pci_ops = raw_pci_ext_ops;
> + return 0;
> + }
> + return 1;
Hmm... I don't remember what positive codes mean there. Perhaps you
need to return an error code instead?
> +}
--
With Best Regards,
Andy Shevchenko
^ permalink raw reply [flat|nested] 5+ messages in thread
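Note on the early-return pattern suggested in the review above: one possible shape is sketched here as a user-space model, not as a revised patch. The struct raw_ops type, the *_model variables and kvm_pci_arch_init_model() are hypothetical stand-ins for raw_pci_ops, raw_pci_ext_ops, kvm_para_has_feature() and the function under review. The return values are kept exactly as in the posted patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel objects named in the patch. */
struct raw_ops { const char *name; };

static const struct raw_ops *raw_pci_ops_model;
static const struct raw_ops *raw_pci_ext_ops_model;
static bool has_go_mmconfig_model;	/* models the feature-bit check */

static int kvm_pci_arch_init_model(void)
{
	/* Bail out first when the switch is not possible or not allowed. */
	if (!raw_pci_ext_ops_model || !has_go_mmconfig_model)
		return 1;

	/* Useful work at the top indentation level. */
	raw_pci_ops_model = raw_pci_ext_ops_model;
	return 0;
}
```

The behaviour is meant to be identical to the nested version; only the control flow is flattened.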
* Re: [PATCH] KVM: x86: Use MMCONFIG for all PCI config space accesses
2020-07-30 19:50 ` Andy Shevchenko
@ 2020-07-31 9:22 ` Vitaly Kuznetsov
2020-07-31 9:41 ` Andy Shevchenko
2020-07-31 18:23 ` Julia Suvorova
0 siblings, 2 replies; 5+ messages in thread
From: Vitaly Kuznetsov @ 2020-07-31 9:22 UTC (permalink / raw)
To: Andy Shevchenko, Julia Suvorova
Cc: open list:VFIO DRIVER, Linux Kernel Mailing List, Bjorn Helgaas,
Michael S. Tsirkin, Matthew Wilcox, Sean Christopherson,
Paolo Bonzini, Thomas Gleixner
Andy Shevchenko <andy.shevchenko@gmail.com> writes:
> On Thu, Jul 30, 2020 at 10:37 PM Julia Suvorova <jusual@redhat.com> wrote:
>>
>> Using MMCONFIG instead of I/O ports cuts the number of config space
>> accesses in half, which is faster on KVM and opens the door for
>> additional optimizations such as Vitaly's "[PATCH 0/3] KVM: x86: KVM
>> MEM_PCI_HOLE memory":
>
>> https://lore.kernel.org/kvm/20200728143741.2718593-1-vkuznets@redhat.com
>
> You may use Link: tag for this.
>
>> However, this change will not bring significant performance improvement
>> unless it is running on x86 within a hypervisor. Moreover, allowing
>> MMCONFIG access for addresses < 256 can be dangerous for some devices:
>> see commit a0ca99096094 ("PCI x86: always use conf1 to access config
>> space below 256 bytes"). That is why a special feature flag is needed.
>>
>> Introduce KVM_FEATURE_PCI_GO_MMCONFIG, which can be enabled when the
>> configuration is known to be safe (e.g. in QEMU).
>
> ...
>
>> +static int __init kvm_pci_arch_init(void)
>> +{
>> + if (raw_pci_ext_ops &&
>> + kvm_para_has_feature(KVM_FEATURE_PCI_GO_MMCONFIG)) {
>
> Better to use traditional pattern, i.e.
> if (not_supported)
> return bail_out;
>
> ...do useful things...
> return 0;
>
>> + pr_info("PCI: Using MMCONFIG for base access\n");
>> + raw_pci_ops = raw_pci_ext_ops;
>> + return 0;
>> + }
>
>> + return 1;
>
> Hmm... I don't remember what positive codes means there. Perhaps you
> need to return a rather error code?
If I'm reading the code correctly,
pci_arch_init() has the following:
if (x86_init.pci.arch_init && !x86_init.pci.arch_init())
return 0;
so returning '1' here means 'continue' and this seems to be
correct. (E.g. Hyper-V's hv_pci_init() does the same). What I'm not sure
about is 'return 0' above as this will result in skipping the rest of
pci_arch_init(). Was this desired or should we return '1' in both cases?
--
Vitaly
^ permalink raw reply [flat|nested] 5+ messages in thread
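Note on the return convention described above: the check quoted from pci_arch_init() can be modelled in a few lines. This is a simplified user-space sketch; model_pci_arch_init(), rest_of_init_ran and the two sample hooks are hypothetical names, and the real function does more than set a flag.

```c
#include <stddef.h>

typedef int (*arch_init_hook)(void);

/* Stands in for everything pci_arch_init() does after the hook check. */
static int rest_of_init_ran;

static int model_pci_arch_init(arch_init_hook hook)
{
	rest_of_init_ran = 0;

	/* Same shape as the quoted check: a hook returning 0 ends init here. */
	if (hook && !hook())
		return 0;

	rest_of_init_ran = 1;
	return 0;
}

static int hook_returns_0(void) { return 0; }	/* "done, skip the rest" */
static int hook_returns_1(void) { return 1; }	/* "continue" */
```

With no hook installed, or with a hook that returns 1, the model runs the rest of the init; a hook that returns 0 skips it.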
* Re: [PATCH] KVM: x86: Use MMCONFIG for all PCI config space accesses
2020-07-31 9:22 ` Vitaly Kuznetsov
@ 2020-07-31 9:41 ` Andy Shevchenko
2020-07-31 18:23 ` Julia Suvorova
1 sibling, 0 replies; 5+ messages in thread
From: Andy Shevchenko @ 2020-07-31 9:41 UTC (permalink / raw)
To: Vitaly Kuznetsov
Cc: Julia Suvorova, open list:VFIO DRIVER, Linux Kernel Mailing List,
Bjorn Helgaas, Michael S. Tsirkin, Matthew Wilcox,
Sean Christopherson, Paolo Bonzini, Thomas Gleixner
On Fri, Jul 31, 2020 at 12:22 PM Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> Andy Shevchenko <andy.shevchenko@gmail.com> writes:
> > On Thu, Jul 30, 2020 at 10:37 PM Julia Suvorova <jusual@redhat.com> wrote:
...
> >> +static int __init kvm_pci_arch_init(void)
> >> +{
> >> + if (raw_pci_ext_ops &&
> >> + return 0;
> >> + }
> >
> >> + return 1;
> >
> > Hmm... I don't remember what positive codes means there. Perhaps you
> > need to return a rather error code?
>
> If I'm reading the code correctly,
>
> pci_arch_init() has the following:
>
> if (x86_init.pci.arch_init && !x86_init.pci.arch_init())
> return 0;
>
>
> so returning '1' here means 'continue' and this seems to be
> correct. (E.g. Hyper-V's hv_pci_init() does the same). What I'm not sure
> about is 'return 0' above as this will result in skipping the rest of
> pci_arch_init(). Was this desired or should we return '1' in both cases?
I think it depends on what you want. In complex cases we recognize three
possibilities:
-ERRNO: function failed, we have to stop and bail out with the error from the callee
0: function OK, stop and return 0
1: function OK, continue with the rest in the caller
Do we need this here, or is the current convention enough for all existing callees?
--
With Best Regards,
Andy Shevchenko
^ permalink raw reply [flat|nested] 5+ messages in thread
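Note on the three-way convention listed above: a caller that distinguishes all three outcomes could look roughly like the sketch below. This is hypothetical; run_with_hook() and the sample hooks are invented for illustration, and nothing in the thread says pci_arch_init() was changed to work this way.

```c
#include <errno.h>
#include <stddef.h>

typedef int (*init_hook)(void);

/*
 * Calls the hook and applies the three-way convention:
 *   negative errno: stop and propagate the error
 *   0:              hook handled everything, stop and return 0
 *   1:              hook is fine, continue with the rest of the caller
 * *rest_ran reports whether the caller's remaining work was executed.
 */
static int run_with_hook(init_hook hook, int *rest_ran)
{
	int ret = hook ? hook() : 1;

	*rest_ran = 0;
	if (ret < 0)
		return ret;
	if (ret == 0)
		return 0;

	*rest_ran = 1;	/* stands in for the caller's remaining work */
	return 0;
}

static int hook_fails(void) { return -ENODEV; }
static int hook_handles_all(void) { return 0; }
static int hook_defers(void) { return 1; }
```

The existing check in pci_arch_init() only tests for zero versus nonzero, so under it a negative return would be treated the same as 1.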
* Re: [PATCH] KVM: x86: Use MMCONFIG for all PCI config space accesses
2020-07-31 9:22 ` Vitaly Kuznetsov
2020-07-31 9:41 ` Andy Shevchenko
@ 2020-07-31 18:23 ` Julia Suvorova
1 sibling, 0 replies; 5+ messages in thread
From: Julia Suvorova @ 2020-07-31 18:23 UTC (permalink / raw)
To: Vitaly Kuznetsov
Cc: Andy Shevchenko, open list:VFIO DRIVER,
Linux Kernel Mailing List, Bjorn Helgaas, Michael S. Tsirkin,
Matthew Wilcox, Sean Christopherson, Paolo Bonzini,
Thomas Gleixner
On Fri, Jul 31, 2020 at 11:22 AM Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>
> Andy Shevchenko <andy.shevchenko@gmail.com> writes:
>
> > On Thu, Jul 30, 2020 at 10:37 PM Julia Suvorova <jusual@redhat.com> wrote:
> >>
> >> Using MMCONFIG instead of I/O ports cuts the number of config space
> >> accesses in half, which is faster on KVM and opens the door for
> >> additional optimizations such as Vitaly's "[PATCH 0/3] KVM: x86: KVM
> >> MEM_PCI_HOLE memory":
> >
> >> https://lore.kernel.org/kvm/20200728143741.2718593-1-vkuznets@redhat.com
> >
> > You may use Link: tag for this.
> >
> >> However, this change will not bring significant performance improvement
> >> unless it is running on x86 within a hypervisor. Moreover, allowing
> >> MMCONFIG access for addresses < 256 can be dangerous for some devices:
> >> see commit a0ca99096094 ("PCI x86: always use conf1 to access config
> >> space below 256 bytes"). That is why a special feature flag is needed.
> >>
> >> Introduce KVM_FEATURE_PCI_GO_MMCONFIG, which can be enabled when the
> >> configuration is known to be safe (e.g. in QEMU).
> >
> > ...
> >
> >> +static int __init kvm_pci_arch_init(void)
> >> +{
> >> + if (raw_pci_ext_ops &&
> >> + kvm_para_has_feature(KVM_FEATURE_PCI_GO_MMCONFIG)) {
> >
> > Better to use traditional pattern, i.e.
> > if (not_supported)
> > return bail_out;
> >
> > ...do useful things...
> > return 0;
> >
> >> + pr_info("PCI: Using MMCONFIG for base access\n");
> >> + raw_pci_ops = raw_pci_ext_ops;
> >> + return 0;
> >> + }
> >
> >> + return 1;
> >
> > Hmm... I don't remember what positive codes means there. Perhaps you
> > need to return a rather error code?
>
> If I'm reading the code correctly,
>
> pci_arch_init() has the following:
>
> if (x86_init.pci.arch_init && !x86_init.pci.arch_init())
> return 0;
>
>
> so returning '1' here means 'continue' and this seems to be
> correct. (E.g. Hyper-V's hv_pci_init() does the same). What I'm not sure
> about is 'return 0' above as this will result in skipping the rest of
> pci_arch_init(). Was this desired or should we return '1' in both cases?
This is intentional because pci_direct_init() is about to overwrite
raw_pci_ops. And since QEMU doesn't have anything in
pciprobe_dmi_table, it is safe to skip it.
Best regards, Julia Suvorova.
^ permalink raw reply [flat|nested] 5+ messages in thread
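Note on why the hook returns 0 after switching to MMCONFIG: per the reply above, the remainder of pci_arch_init() reaches pci_direct_init(), which would overwrite raw_pci_ops. The sketch below models only that one effect. All names (struct raw_ops, conf1_ops, mmconfig_ops, model_direct_init(), model_boot()) are hypothetical, and the real init path does more than this.

```c
#include <stddef.h>

struct raw_ops { const char *name; };

static const struct raw_ops conf1_ops = { "conf1" };
static const struct raw_ops mmconfig_ops = { "mmconfig" };
static const struct raw_ops *active_ops;

/* Models the later init step that installs the port-based accessors. */
static void model_direct_init(void)
{
	active_ops = &conf1_ops;
}

/*
 * Models a boot where the hook has already installed MMCONFIG and then
 * returns hook_ret. Any nonzero value lets the later step run.
 */
static const struct raw_ops *model_boot(int hook_ret)
{
	active_ops = &mmconfig_ops;
	if (hook_ret != 0)
		model_direct_init();
	return active_ops;
}
```

In this model, returning 1 from the success path would leave the guest on the port-based accessors again, which matches the reason given for returning 0.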