From: Megha Dey <megha.dey@intel.com> To: tglx@linutronix.de Cc: linux-kernel@vger.kernel.org, dave.jiang@intel.com, ashok.raj@intel.com, kevin.tian@intel.com, dwmw@amazon.co.uk, x86@kernel.org, tony.luck@intel.com, dan.j.williams@intel.com, megha.dey@intel.com, jgg@mellanox.com, kvm@vger.kernel.org, iommu@lists.linux-foundation.org, alex.williamson@redhat.com, bhelgaas@google.com, maz@kernel.org, linux-pci@vger.kernel.org, baolu.lu@linux.intel.com, ravi.v.shankar@intel.com, Leon Romanovsky <leon@kernel.org> Subject: [PATCH 11/12] platform-msi: Add platform check for subdevice irq domain Date: Wed, 3 Feb 2021 12:56:44 -0800 [thread overview] Message-ID: <1612385805-3412-12-git-send-email-megha.dey@intel.com> (raw) In-Reply-To: <1612385805-3412-1-git-send-email-megha.dey@intel.com> From: Lu Baolu <baolu.lu@linux.intel.com> The pci_subdevice_msi_create_irq_domain() should fail if the underlying platform is not able to support IMS (Interrupt Message Storage). Otherwise, the isolation of interrupt is not guaranteed. For x86, IMS is only supported on bare metal for now. We could enable it in the virtualization environments in the future if interrupt HYPERCALL domain is supported or the hardware has the capability of interrupt isolation for subdevices. Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Leon Romanovsky <leon@kernel.org> Cc: Kevin Tian <kevin.tian@intel.com> Suggested-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/linux-pci/87pn4nk7nn.fsf@nanos.tec.linutronix.de/ Link: https://lore.kernel.org/linux-pci/877dqrnzr3.fsf@nanos.tec.linutronix.de/ Link: https://lore.kernel.org/linux-pci/877dqqmc2h.fsf@nanos.tec.linutronix.de/ Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Megha Dey <megha.dey@intel.com> --- arch/x86/pci/common.c | 74 +++++++++++++++++++++++++++++++++++++++++++++ drivers/base/platform-msi.c | 8 +++++ include/linux/msi.h | 1 + 3 files changed, 83 insertions(+) diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c index 3507f45..263ccf6 100644 --- a/arch/x86/pci/common.c +++ b/arch/x86/pci/common.c @@ -12,6 +12,8 @@ #include <linux/init.h> #include <linux/dmi.h> #include <linux/slab.h> +#include <linux/iommu.h> +#include <linux/msi.h> #include <asm/acpi.h> #include <asm/segment.h> @@ -724,3 +726,75 @@ struct pci_dev *pci_real_dma_dev(struct pci_dev *dev) return dev; } #endif + +#ifdef CONFIG_DEVICE_MSI +/* + * We want to figure out which context we are running in. But the hardware + * does not introduce a reliable way (instruction, CPUID leaf, MSR, whatever) + * which can be manipulated by the VMM to let the OS figure out where it runs. + * So we go with the below probably on_bare_metal() function as a replacement + * for definitely on_bare_metal() to go forward only for the very simple reason + * that this is the only option we have. + */ +static const char * const vmm_vendor_name[] = { + "QEMU", "Bochs", "KVM", "Xen", "VMware", "VMW", "VMware Inc.", + "innotek GmbH", "Oracle Corporation", "Parallels", "BHYVE" +}; + +static void read_type0_virtual_machine(const struct dmi_header *dm, void *p) +{ + u8 *data = (u8 *)dm + 0x13; + + /* BIOS Information (Type 0) */ + if (dm->type != 0 || dm->length < 0x14) + return; + + /* Bit 4 of BIOS Characteristics Extension Byte 2*/ + if (*data & BIT(4)) + *((bool *)p) = true; +} + +static bool smbios_virtual_machine(void) +{ + bool bit_present = false; + + dmi_walk(read_type0_virtual_machine, &bit_present); + + return bit_present; +} + +static bool on_bare_metal(struct device *dev) +{ + int i; + + if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) + return false; + + if (smbios_virtual_machine()) + return false; + + if (iommu_capable(dev->bus, IOMMU_CAP_VIOMMU_HINT)) + return false; + + for (i = 0; i < ARRAY_SIZE(vmm_vendor_name); i++) + if (dmi_match(DMI_SYS_VENDOR, vmm_vendor_name[i])) + return false; + + pr_info("System running on bare metal, report to bugzilla.kernel.org if not the case."); + + return true; +} + +bool arch_support_pci_device_ims(struct pci_dev *pdev) +{ + /* + * When we are running in a VMM context, the device IMS could only be + * enabled when the underlying hardware supports interrupt isolation + * of the subdevice, or any mechanism (trap, hypercall) is added so + * that changes in the interrupt message store could be managed by the + * VMM. For now, we only support the device IMS when we are running on + * the bare metal. + */ + return on_bare_metal(&pdev->dev); +} +#endif diff --git a/drivers/base/platform-msi.c b/drivers/base/platform-msi.c index 6127b3b..d5ae26f 100644 --- a/drivers/base/platform-msi.c +++ b/drivers/base/platform-msi.c @@ -519,6 +519,11 @@ struct irq_domain *device_msi_create_irq_domain(struct fwnode_handle *fn, #ifdef CONFIG_PCI #include <linux/pci.h> +bool __weak arch_support_pci_device_ims(struct pci_dev *pdev) +{ + return false; +} + /** * pci_subdevice_msi_create_irq_domain - Create an irq domain for subdevices * @pdev: Pointer to PCI device for which the subdevice domain is created @@ -530,6 +535,9 @@ struct irq_domain *pci_subdevice_msi_create_irq_domain(struct pci_dev *pdev, struct irq_domain *domain, *pdev_msi; struct fwnode_handle *fn; + if (!arch_support_pci_device_ims(pdev)) + return NULL; + /* * Retrieve the MSI domain of the underlying PCI device's MSI * domain. The PCI device domain's parent domain is also the parent diff --git a/include/linux/msi.h b/include/linux/msi.h index a6b419d..fa02542 100644 --- a/include/linux/msi.h +++ b/include/linux/msi.h @@ -478,6 +478,7 @@ struct irq_domain *device_msi_create_irq_domain(struct fwnode_handle *fn, struct irq_domain *parent); # ifdef CONFIG_PCI +bool arch_support_pci_device_ims(struct pci_dev *pdev); struct irq_domain *pci_subdevice_msi_create_irq_domain(struct pci_dev *pdev, struct msi_domain_info *info); # endif -- 2.7.4
WARNING: multiple messages have this Message-ID (diff)
From: Megha Dey <megha.dey@intel.com> To: tglx@linutronix.de Cc: alex.williamson@redhat.com, kevin.tian@intel.com, tony.luck@intel.com, dave.jiang@intel.com, ashok.raj@intel.com, kvm@vger.kernel.org, ravi.v.shankar@intel.com, maz@kernel.org, x86@kernel.org, linux-kernel@vger.kernel.org, iommu@lists.linux-foundation.org, jgg@mellanox.com, megha.dey@intel.com, linux-pci@vger.kernel.org, bhelgaas@google.com, dan.j.williams@intel.com, Leon Romanovsky <leon@kernel.org>, dwmw@amazon.co.uk Subject: [PATCH 11/12] platform-msi: Add platform check for subdevice irq domain Date: Wed, 3 Feb 2021 12:56:44 -0800 [thread overview] Message-ID: <1612385805-3412-12-git-send-email-megha.dey@intel.com> (raw) In-Reply-To: <1612385805-3412-1-git-send-email-megha.dey@intel.com> From: Lu Baolu <baolu.lu@linux.intel.com> The pci_subdevice_msi_create_irq_domain() should fail if the underlying platform is not able to support IMS (Interrupt Message Storage). Otherwise, the isolation of interrupt is not guaranteed. For x86, IMS is only supported on bare metal for now. We could enable it in the virtualization environments in the future if interrupt HYPERCALL domain is supported or the hardware has the capability of interrupt isolation for subdevices. Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Leon Romanovsky <leon@kernel.org> Cc: Kevin Tian <kevin.tian@intel.com> Suggested-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/linux-pci/87pn4nk7nn.fsf@nanos.tec.linutronix.de/ Link: https://lore.kernel.org/linux-pci/877dqrnzr3.fsf@nanos.tec.linutronix.de/ Link: https://lore.kernel.org/linux-pci/877dqqmc2h.fsf@nanos.tec.linutronix.de/ Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Megha Dey <megha.dey@intel.com> --- arch/x86/pci/common.c | 74 +++++++++++++++++++++++++++++++++++++++++++++ drivers/base/platform-msi.c | 8 +++++ include/linux/msi.h | 1 + 3 files changed, 83 insertions(+) diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c index 3507f45..263ccf6 100644 --- a/arch/x86/pci/common.c +++ b/arch/x86/pci/common.c @@ -12,6 +12,8 @@ #include <linux/init.h> #include <linux/dmi.h> #include <linux/slab.h> +#include <linux/iommu.h> +#include <linux/msi.h> #include <asm/acpi.h> #include <asm/segment.h> @@ -724,3 +726,75 @@ struct pci_dev *pci_real_dma_dev(struct pci_dev *dev) return dev; } #endif + +#ifdef CONFIG_DEVICE_MSI +/* + * We want to figure out which context we are running in. But the hardware + * does not introduce a reliable way (instruction, CPUID leaf, MSR, whatever) + * which can be manipulated by the VMM to let the OS figure out where it runs. + * So we go with the below probably on_bare_metal() function as a replacement + * for definitely on_bare_metal() to go forward only for the very simple reason + * that this is the only option we have. + */ +static const char * const vmm_vendor_name[] = { + "QEMU", "Bochs", "KVM", "Xen", "VMware", "VMW", "VMware Inc.", + "innotek GmbH", "Oracle Corporation", "Parallels", "BHYVE" +}; + +static void read_type0_virtual_machine(const struct dmi_header *dm, void *p) +{ + u8 *data = (u8 *)dm + 0x13; + + /* BIOS Information (Type 0) */ + if (dm->type != 0 || dm->length < 0x14) + return; + + /* Bit 4 of BIOS Characteristics Extension Byte 2*/ + if (*data & BIT(4)) + *((bool *)p) = true; +} + +static bool smbios_virtual_machine(void) +{ + bool bit_present = false; + + dmi_walk(read_type0_virtual_machine, &bit_present); + + return bit_present; +} + +static bool on_bare_metal(struct device *dev) +{ + int i; + + if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) + return false; + + if (smbios_virtual_machine()) + return false; + + if (iommu_capable(dev->bus, IOMMU_CAP_VIOMMU_HINT)) + return false; + + for (i = 0; i < ARRAY_SIZE(vmm_vendor_name); i++) + if (dmi_match(DMI_SYS_VENDOR, vmm_vendor_name[i])) + return false; + + pr_info("System running on bare metal, report to bugzilla.kernel.org if not the case."); + + return true; +} + +bool arch_support_pci_device_ims(struct pci_dev *pdev) +{ + /* + * When we are running in a VMM context, the device IMS could only be + * enabled when the underlying hardware supports interrupt isolation + * of the subdevice, or any mechanism (trap, hypercall) is added so + * that changes in the interrupt message store could be managed by the + * VMM. For now, we only support the device IMS when we are running on + * the bare metal. + */ + return on_bare_metal(&pdev->dev); +} +#endif diff --git a/drivers/base/platform-msi.c b/drivers/base/platform-msi.c index 6127b3b..d5ae26f 100644 --- a/drivers/base/platform-msi.c +++ b/drivers/base/platform-msi.c @@ -519,6 +519,11 @@ struct irq_domain *device_msi_create_irq_domain(struct fwnode_handle *fn, #ifdef CONFIG_PCI #include <linux/pci.h> +bool __weak arch_support_pci_device_ims(struct pci_dev *pdev) +{ + return false; +} + /** * pci_subdevice_msi_create_irq_domain - Create an irq domain for subdevices * @pdev: Pointer to PCI device for which the subdevice domain is created @@ -530,6 +535,9 @@ struct irq_domain *pci_subdevice_msi_create_irq_domain(struct pci_dev *pdev, struct irq_domain *domain, *pdev_msi; struct fwnode_handle *fn; + if (!arch_support_pci_device_ims(pdev)) + return NULL; + /* * Retrieve the MSI domain of the underlying PCI device's MSI * domain. The PCI device domain's parent domain is also the parent diff --git a/include/linux/msi.h b/include/linux/msi.h index a6b419d..fa02542 100644 --- a/include/linux/msi.h +++ b/include/linux/msi.h @@ -478,6 +478,7 @@ struct irq_domain *device_msi_create_irq_domain(struct fwnode_handle *fn, struct irq_domain *parent); # ifdef CONFIG_PCI +bool arch_support_pci_device_ims(struct pci_dev *pdev); struct irq_domain *pci_subdevice_msi_create_irq_domain(struct pci_dev *pdev, struct msi_domain_info *info); # endif -- 2.7.4 _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
next prev parent reply other threads:[~2021-02-03 21:04 UTC|newest] Thread overview: 33+ messages / expand[flat|nested] mbox.gz Atom feed top 2021-02-03 20:56 [PATCH 00/12] Introduce dev-msi and interrupt message store Megha Dey 2021-02-03 20:56 ` Megha Dey 2021-02-03 20:56 ` [PATCH 01/12] x86/irq: Add DEV_MSI allocation type Megha Dey 2021-02-03 20:56 ` Megha Dey 2021-02-03 20:56 ` [PATCH 02/12] x86/msi: Rename and rework pci_msi_prepare() to cover non-PCI MSI Megha Dey 2021-02-03 20:56 ` Megha Dey 2021-02-03 20:56 ` [PATCH 03/12] platform-msi: Provide default irq_chip:: Ack Megha Dey 2021-02-03 20:56 ` Megha Dey 2021-02-03 20:56 ` [PATCH 04/12] genirq/proc: Take buslock on affinity write Megha Dey 2021-02-03 20:56 ` Megha Dey 2021-02-03 20:56 ` [PATCH 05/12] genirq/msi: Provide and use msi_domain_set_default_info_flags() Megha Dey 2021-02-03 20:56 ` Megha Dey 2021-02-03 20:56 ` [PATCH 06/12] platform-msi: Add device MSI infrastructure Megha Dey 2021-02-03 20:56 ` Megha Dey 2021-02-03 20:56 ` [PATCH 07/12] irqdomain/msi: Provide msi_alloc/free_store() callbacks Megha Dey 2021-02-03 20:56 ` Megha Dey 2021-02-03 20:56 ` [PATCH 08/12] genirq: Set auxiliary data for an interrupt Megha Dey 2021-02-03 20:56 ` Megha Dey 2021-02-03 20:56 ` [PATCH 09/12] iommu/vt-d: Add DEV-MSI support Megha Dey 2021-02-03 20:56 ` Megha Dey 2021-02-03 20:56 ` [PATCH 10/12] iommu: Add capability IOMMU_CAP_VIOMMU_HINT Megha Dey 2021-02-03 20:56 ` Megha Dey 2021-02-03 20:56 ` Megha Dey [this message] 2021-02-03 20:56 ` [PATCH 11/12] platform-msi: Add platform check for subdevice irq domain Megha Dey 2021-02-04 12:14 ` Jason Gunthorpe 2021-02-04 12:14 ` Jason Gunthorpe 2021-02-04 13:22 ` Lu Baolu 2021-02-04 13:22 ` Lu Baolu 2021-02-08 8:21 ` Leon Romanovsky 2021-02-08 8:21 ` Leon Romanovsky 2021-02-09 0:36 ` Lu Baolu 2021-02-03 20:56 ` [PATCH 12/12] irqchip: Add IMS (Interrupt Message Store) driver Megha Dey 2021-02-03 20:56 ` Megha Dey
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=1612385805-3412-12-git-send-email-megha.dey@intel.com \ --to=megha.dey@intel.com \ --cc=alex.williamson@redhat.com \ --cc=ashok.raj@intel.com \ --cc=baolu.lu@linux.intel.com \ --cc=bhelgaas@google.com \ --cc=dan.j.williams@intel.com \ --cc=dave.jiang@intel.com \ --cc=dwmw@amazon.co.uk \ --cc=iommu@lists.linux-foundation.org \ --cc=jgg@mellanox.com \ --cc=kevin.tian@intel.com \ --cc=kvm@vger.kernel.org \ --cc=leon@kernel.org \ --cc=linux-kernel@vger.kernel.org \ --cc=linux-pci@vger.kernel.org \ --cc=maz@kernel.org \ --cc=ravi.v.shankar@intel.com \ --cc=tglx@linutronix.de \ --cc=tony.luck@intel.com \ --cc=x86@kernel.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.