linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ajay Kaher <akaher@vmware.com>
To: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: "x86@kernel.org" <x86@kernel.org>,
	"hpa@zytor.com" <hpa@zytor.com>,
	"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"rostedt@goodmis.org" <rostedt@goodmis.org>,
	Srivatsa Bhat <srivatsab@vmware.com>,
	"srivatsa@csail.mit.edu" <srivatsa@csail.mit.edu>,
	Alexey Makhalov <amakhalov@vmware.com>,
	Vasavi Sirnapalli <vsirnapalli@vmware.com>,
	"er.ajay.kaher@gmail.com" <er.ajay.kaher@gmail.com>,
	"willy@infradead.org" <willy@infradead.org>,
	Nadav Amit <namit@vmware.com>,
	"linux-hyperv@vger.kernel.org" <linux-hyperv@vger.kernel.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"jailhouse-dev@googlegroups.com" <jailhouse-dev@googlegroups.com>,
	"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
	"acrn-dev@lists.projectacrn.org" <acrn-dev@lists.projectacrn.org>,
	"helgaas@kernel.org" <helgaas@kernel.org>,
	"bhelgaas@google.com" <bhelgaas@google.com>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"bp@alien8.de" <bp@alien8.de>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	Alexander Graf <graf@amazon.com>
Subject: Re: [PATCH v2] x86/PCI: Prefer MMIO over PIO on all hypervisor
Date: Thu, 29 Sep 2022 05:36:46 +0000	[thread overview]
Message-ID: <B64FD502-E794-4E94-A267-D690476C57EE@vmware.com> (raw)
In-Reply-To: <87zgf3pfd1.fsf@redhat.com>


> On 13/09/22, 7:05 PM, "Vitaly Kuznetsov" <vkuznets@redhat.com> wrote:
>>
>> Thanks Vitaly for your response.
>>
>> 1. we have multiple objects of struct pci_raw_ops, 2. adding 'priority' field to struct pci_raw_ops
>> doesn't seems to be appropriate as need to take decision which object of struct pci_raw_ops has
>> to be used, not something with-in struct pci_raw_ops.
>
> I'm not sure I follow, you have two instances of 'struct pci_raw_ops'
> which are called 'raw_pci_ops' and 'raw_pci_ext_ops'. What if you do
> something like (completely untested):
>
> diff --git a/arch/x86/include/asm/pci_x86.h b/arch/x86/include/asm/pci_x86.h
> index 70533fdcbf02..fb8270fa6c78 100644
> --- a/arch/x86/include/asm/pci_x86.h
> +++ b/arch/x86/include/asm/pci_x86.h
> @@ -116,6 +116,7 @@ extern void (*pcibios_disable_irq)(struct pci_dev *dev);
> extern bool mp_should_keep_irq(struct device *dev);
>
> struct pci_raw_ops {
> +       int rating;
>          int (*read)(unsigned int domain, unsigned int bus, unsigned int devfn,
>                                                int reg, int len, u32 *val);
>          int (*write)(unsigned int domain, unsigned int bus, unsigned int devfn,
> diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
> index ddb798603201..e9965fd11576 100644
> --- a/arch/x86/pci/common.c
> +++ b/arch/x86/pci/common.c
> @@ -40,7 +40,8 @@ const struct pci_raw_ops *__read_mostly raw_pci_ext_ops;
>  int raw_pci_read(unsigned int domain, unsigned int bus, unsigned int devfn,
>                                                 int reg, int len, u32 *val)
> {
> -       if (domain == 0 && reg < 256 && raw_pci_ops)
> +       if (domain == 0 && reg < 256 && raw_pci_ops &&
> +           (!raw_pci_ext_ops || raw_pci_ext_ops->rating <= raw_pci_ops->rating))
>                 return raw_pci_ops->read(domain, bus, devfn, reg, len, val);
>         if (raw_pci_ext_ops)
>                 return raw_pci_ext_ops->read(domain, bus, devfn, reg, len, val);
> @@ -50,7 +51,8 @@ int raw_pci_read(unsigned int domain, unsigned int bus, unsigned int devfn,
>  int raw_pci_write(unsigned int domain, unsigned int bus, unsigned int devfn,
>                                                 int reg, int len, u32 val)
> {
> -       if (domain == 0 && reg < 256 && raw_pci_ops)
> +       if (domain == 0 && reg < 256 && raw_pci_ops &&
> +           (!raw_pci_ext_ops || raw_pci_ext_ops->rating <= raw_pci_ops->rating))
>                 return raw_pci_ops->write(domain, bus, devfn, reg, len, val);
>          if (raw_pci_ext_ops)
>                 return raw_pci_ext_ops->write(domain, bus, devfn, reg, len, val);
>
> and then somewhere in Vmware hypervisor initialization code
> (arch/x86/kernel/cpu/vmware.c) you do
>
>  raw_pci_ext_ops->rating = 100;

Thanks Vitaly, for your review and helping us to improve the code.

I was working to make changes as you suggested, but before sending v3 would like to
discuss on following:

If we add rating with-in struct pci_raw_ops then we can't have pci_mmcfg as const,
and following change is must in arch/x86/pci/mmconfig_64.c:

-const struct pci_raw_ops pci_mmcfg = {
+struct pci_raw_ops pci_mmcfg = {
 	.read =		pci_mmcfg_read,
 	.write =	pci_mmcfg_write,
};

So to avoid this change, is it fine to have global bool prefer_raw_pci_ext_ops?

And raw_pci_read() will have following change:

-	if (domain == 0 && reg < 256 && raw_pci_ops)
+	if (domain == 0 && reg < 256 && raw_pci_ops &&
+	     (!prefer_raw_pci_ext_ops ||  !raw_pci_ext_ops)

>
> why wouldn't it work?
>
> (diclaimer: completely untested, raw_pci_ops/raw_pci_ext_ops
> initialization has to be checked so 'rating' is not garbage).
>
>>
>> It's a generic solution for all hypervisor (sorry for earlier wrong
>> Subject), not specific to VMware. Further looking for feedback if it's
>> impacting to any hypervisor.
>
> That's the tricky part. We can check modern hypervisor versions, but
> what about all other versions in existence? How can we know that there's
> no QEMU/Hyper-V/... version out there where MMIO path is broken? I'd
> suggest we limit the change to Vmware hypervisor, other hypervisors may
> use the same mechanism (like the one above) later (but the person
> suggesting the patch is always responsible for the research why it is
> safe to do so).

Ok, as of now we will make this change specific to VMware hypervisor.

- Ajay




  reply	other threads:[~2022-09-29  5:36 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-13 12:47 [PATCH v2] x86/PCI: Prefer MMIO over PIO on all hypervisor Ajay Kaher
2022-09-13 13:34 ` Vitaly Kuznetsov
2022-09-29  5:36   ` Ajay Kaher [this message]
2022-09-29  9:12     ` Alexander Graf
2022-10-03 15:03     ` Vitaly Kuznetsov
2022-10-03 17:34       ` Nadav Amit
2022-10-03 21:06         ` H. Peter Anvin
2022-10-03 21:28           ` Nadav Amit
2022-10-03 23:51             ` H. Peter Anvin
2022-10-04  0:19               ` Nadav Amit
2022-10-04  8:22         ` Alexander Graf
2022-10-04 18:48           ` Nadav Amit
2022-10-10 14:58             ` Nadav Amit
2022-10-10 17:05             ` Michael S. Tsirkin
2022-10-04  8:30         ` Vitaly Kuznetsov
2022-10-03 21:01       ` H. Peter Anvin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=B64FD502-E794-4E94-A267-D690476C57EE@vmware.com \
    --to=akaher@vmware.com \
    --cc=acrn-dev@lists.projectacrn.org \
    --cc=amakhalov@vmware.com \
    --cc=bhelgaas@google.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=er.ajay.kaher@gmail.com \
    --cc=graf@amazon.com \
    --cc=helgaas@kernel.org \
    --cc=hpa@zytor.com \
    --cc=jailhouse-dev@googlegroups.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-hyperv@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=namit@vmware.com \
    --cc=rostedt@goodmis.org \
    --cc=srivatsa@csail.mit.edu \
    --cc=srivatsab@vmware.com \
    --cc=tglx@linutronix.de \
    --cc=vkuznets@redhat.com \
    --cc=vsirnapalli@vmware.com \
    --cc=willy@infradead.org \
    --cc=x86@kernel.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).