* [RFC KERNEL PATCH v6 0/3] Support device passthrough when dom0 is PVH on Xen
@ 2024-04-19  3:36 Jiqian Chen
  2024-04-19  3:36 ` [KERNEL PATCH v6 1/3] xen/pci: Add xen_reset_device_state function Jiqian Chen
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Jiqian Chen @ 2024-04-19  3:36 UTC (permalink / raw)
  To: Juergen Gross, Stefano Stabellini, Bjorn Helgaas,
	Rafael J . Wysocki, Roger Pau Monné
  Cc: xen-devel, linux-pci, linux-kernel, linux-acpi, Huang Rui, Jiqian Chen

Hi All,
This is the v6 series to support passthrough on Xen when dom0 is PVH.
v5->v6 changes:
* patch#3: add a new syscall (a privcmd ioctl) to translate irq to gsi, instead of adding a new gsi sysfs node.


Best regards,
Jiqian Chen


v4->v5 changes:
* patch#1: Add Reviewed-by Stefano
* patch#2: Add Reviewed-by Stefano
* patch#3: No changes


v3->v4 changes:
* patch#1: change the comment of PHYSDEVOP_pci_device_state_reset; use a new function pcistub_reset_device_state to wrap __pci_reset_function_locked and xen_reset_device_state, and call pcistub_reset_device_state in pci_stub.c
* patch#2: remove map_pirq from xen_pvh_passthrough_gsi


v2->v3 changes:
* patch#1: add a condition so that xen_reset_device_state is only called for non-PV domains in pcistub_init_device.
* patch#2: abandon the previous implementation that called unmask_irq; instead set up the gsi and map the pirq for the passthrough device in pcistub_init_device.
* patch#3: abandon the previous implementation that added a new syscall to get the gsi from the irq; instead add a new sysfs node for the gsi, so that userspace can get the gsi number from sysfs.


Below is the description of v2 cover letter:
This series of patches is the v2 of the implementation of passthrough when dom0 is PVH on Xen.
We sent v1 upstream before, but it had many problems and we received a lot of suggestions.
Below I describe the issues these patches try to fix and the differences between v1 and v2.

Issues we encountered:
1. pci_stub failed to write bar for a passthrough device.
Problem: when we run "sudo xl pci-assignable-add <sbdf>" to assign a device, pci_stub will call "pcistub_init_device() -> pci_restore_state() -> pci_restore_config_space() ->
pci_restore_config_space_range() -> pci_restore_config_dword() -> pci_write_config_dword()". The pci config write will trigger an io interrupt to bar_write() in Xen, but because
bar->enabled was set before, the write is not allowed. Then, when Qemu configures the
passthrough device in xen_pt_realize(), it gets invalid bar values.

Reason: we don't tell vPCI that the device has been reset, so the cached state in pdev->vpci is out of date and differs from the real device state.

Solution: to solve this problem, the first patch of kernel(xen/pci: Add xen_reset_device_state
function) and the first patch of xen(xen/vpci: Clear all vpci status of device) add a new hypercall to reset the state stored in vPCI when the state of the real device has changed.
Thanks to Roger for suggesting this v2 approach. It differs from v1 (https://lore.kernel.org/xen-devel/20230312075455.450187-3-ray.huang@amd.com/): v1 simply allowed domU to write the pci bar, which does not comply with the design principles of vPCI.

2. failed to do PHYSDEVOP_map_pirq when dom0 is PVH
Problem: HVM domU will do PHYSDEVOP_map_pirq for a passthrough device by using gsi. See xen_pt_realize->xc_physdev_map_pirq and pci_add_dm_done->xc_physdev_map_pirq. Then xc_physdev_map_pirq will call into Xen, but in hvm_physdev_op(), PHYSDEVOP_map_pirq is not allowed.

Reason: In hvm_physdev_op(), the variable "currd" is PVH dom0 and PVH has no X86_EMU_USE_PIRQ flag, so it fails the has_pirq check.

Solution: I think we may need to allow PHYSDEVOP_map_pirq when "currd" is dom0 (at present dom0 is PVH). The second patch of xen(x86/pvh: Open PHYSDEVOP_map_pirq for PVH dom0) allows PVH dom0 to do PHYSDEVOP_map_pirq. This v2 patch is better than v1, which simply removed the has_pirq check (xen https://lore.kernel.org/xen-devel/20230312075455.450187-4-ray.huang@amd.com/).

3. the gsi of a passthrough device is not unmasked
 3.1 failed to check the permission of pirq
 3.2 the gsi of passthrough device was not registered in PVH dom0

Problem:
3.1 the callback function pci_add_dm_done() is called when qemu configures a passthrough device for domU.
This function calls xc_domain_irq_permission()-> pirq_access_permitted() to check whether the gsi has a corresponding mapping in dom0. It doesn't, so the check fails. See XEN_DOMCTL_irq_permission->pirq_access_permitted: "current" is PVH dom0 and the returned irq is 0.
3.2 it's possible for a gsi (iow: vIO-APIC pin) to never get registered on PVH dom0, because the devices of PVH are using MSI(-X) interrupts. However, the IO-APIC pin must be configured for it to be able to be mapped into a domU.

Reason: After searching the code, I found that "map_pirq" and "register_gsi" are done in function vioapic_write_redirent->vioapic_hwdom_map_gsi when the gsi (aka the ioapic's pin) is unmasked in PVH dom0.
So both problems come down to the gsi of a passthrough device not being unmasked.

Solution: to solve these problems, the second patch of kernel(xen/pvh: Unmask irq for passthrough device in PVH dom0) calls unmask_irq() when we assign a device for passthrough, so that passthrough devices have a gsi mapping on PVH dom0 and the gsi can be registered. This v2 patch differs from v1 (kernel https://lore.kernel.org/xen-devel/20230312120157.452859-5-ray.huang@amd.com/
and xen https://lore.kernel.org/xen-devel/20230312075455.450187-5-ray.huang@amd.com/):
v1 performed "map_pirq" and "register_gsi" on all pci devices on PVH dom0, which is unnecessary and may cause multiple registrations.

4. failed to map pirq for gsi
Problem: qemu calls xc_physdev_map_pirq() to map a passthrough device's gsi to a pirq in function xen_pt_realize(), but the call fails.

Reason: According to the implementation of xc_physdev_map_pirq(), it needs a gsi rather than an irq, but qemu passes an irq and treats it as a gsi; the value is read from the file /sys/bus/pci/devices/xxxx:xx:xx.x/irq in function xen_host_pci_device_get(). The gsi number is not actually equal to the irq. On PVH dom0, irqs are allocated for gsis dynamically in function acpi_register_gsi_ioapic(), on a first come, first served basis. If you debug the kernel code (see function __irq_alloc_descs), you will find that irq numbers are allocated in increasing order, but gsis are not requested in increasing order: gsi 38 may be requested before gsi 28, so gsi 38 gets a smaller irq number than gsi 28, and then gsi != irq.
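
The allocation order described above can be reproduced with a small model. This is a sketch, not kernel code: struct irq_allocator and the toy_* functions are hypothetical names, and the starting irq number used in any caller is an arbitrary choice. The model hands out irq numbers in increasing order as gsis are requested, and it records each pair, since that record is the only way to recover the gsi from the irq afterwards.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_GSI_MAP 16

/* One recorded (gsi, irq) pair. */
struct gsi_irq_pair {
	unsigned int gsi;
	unsigned int irq;
};

/* Hypothetical first come, first served irq allocator. */
struct irq_allocator {
	unsigned int next_irq;	/* next free irq number */
	size_t count;		/* number of recorded pairs */
	struct gsi_irq_pair map[MAX_GSI_MAP];
};

/* Return the irq for a gsi, allocating the next free irq on first request. */
static int toy_register_gsi(struct irq_allocator *a, unsigned int gsi)
{
	size_t i;

	assert(a != NULL);
	for (i = 0; i < a->count; i++)
		if (a->map[i].gsi == gsi)
			return (int)a->map[i].irq;

	if (a->count == MAX_GSI_MAP)
		return -1;

	a->map[a->count].gsi = gsi;
	a->map[a->count].irq = a->next_irq;
	a->count++;
	return (int)a->next_irq++;
}

/* Reverse lookup: which gsi was given this irq? Returns -1 if unknown. */
static int toy_gsi_from_irq(const struct irq_allocator *a, unsigned int irq)
{
	size_t i;

	assert(a != NULL);
	for (i = 0; i < a->count; i++)
		if (a->map[i].irq == irq)
			return (int)a->map[i].gsi;
	return -1;
}
```

If gsi 38 is registered before gsi 28 with next_irq starting at 24, gsi 38 receives irq 24 and gsi 28 receives irq 25, so reading the irq and using it as a gsi gives the wrong number for both.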

Solution: we can record the relation between gsi and irq; then, when userspace (qemu) wants to use the gsi, we can do a translation. The third patch of kernel(xen/privcmd: Add new syscall to get gsi from irq) records all the relations in acpi_register_gsi_xen_pvh() when dom0 initializes pci devices, and provides a syscall for userspace to get the gsi from the irq. The third patch of xen(tools: Add new function to get gsi from irq) adds a new function xc_physdev_gsi_from_irq() to call the new syscall added on the kernel side.
Userspace can then use that function to get the gsi, and xc_physdev_map_pirq() will succeed. This v2 patch is the same as v1 (kernel https://lore.kernel.org/xen-devel/20230312120157.452859-6-ray.huang@amd.com/ and xen https://lore.kernel.org/xen-devel/20230312075455.450187-6-ray.huang@amd.com/).

About the v2 patch of qemu: it only changes an included header file; the rest is similar to v1 (qemu https://lore.kernel.org/xen-devel/20230312092244.451465-19-ray.huang@amd.com/), which just calls
xc_physdev_gsi_from_irq() to get the gsi from the irq.

Jiqian Chen (3):
  xen/pci: Add xen_reset_device_state function
  xen/pvh: Setup gsi for passthrough device
  xen/privcmd: Add new syscall to get gsi from irq

 arch/x86/include/asm/apic.h        |  8 +++
 arch/x86/include/asm/xen/pci.h     |  5 ++
 arch/x86/kernel/acpi/boot.c        |  2 +-
 arch/x86/pci/xen.c                 | 21 +++++++
 arch/x86/xen/enlighten_pvh.c       | 92 ++++++++++++++++++++++++++++++
 drivers/acpi/pci_irq.c             |  2 +-
 drivers/xen/events/events_base.c   | 39 +++++++++++++
 drivers/xen/pci.c                  | 12 ++++
 drivers/xen/privcmd.c              | 19 ++++++
 drivers/xen/xen-pciback/pci_stub.c | 26 ++++++++-
 include/linux/acpi.h               |  1 +
 include/uapi/xen/privcmd.h         |  7 +++
 include/xen/acpi.h                 |  6 ++
 include/xen/events.h               |  5 ++
 include/xen/interface/physdev.h    |  7 +++
 include/xen/pci.h                  |  6 ++
 16 files changed, 253 insertions(+), 5 deletions(-)

-- 
2.34.1



* [KERNEL PATCH v6 1/3] xen/pci: Add xen_reset_device_state function
  2024-04-19  3:36 [RFC KERNEL PATCH v6 0/3] Support device passthrough when dom0 is PVH on Xen Jiqian Chen
@ 2024-04-19  3:36 ` Jiqian Chen
  2024-04-19  3:36 ` [RFC KERNEL PATCH v6 2/3] xen/pvh: Setup gsi for passthrough device Jiqian Chen
  2024-04-19  3:36 ` [RFC KERNEL PATCH v6 3/3] xen/privcmd: Add new syscall to get gsi from irq Jiqian Chen
  2 siblings, 0 replies; 16+ messages in thread
From: Jiqian Chen @ 2024-04-19  3:36 UTC (permalink / raw)
  To: Juergen Gross, Stefano Stabellini, Bjorn Helgaas,
	Rafael J . Wysocki, Roger Pau Monné
  Cc: xen-devel, linux-pci, linux-kernel, linux-acpi, Huang Rui,
	Jiqian Chen, Huang Rui

When a device has been reset on the dom0 side, the vpci on the Xen
side won't get a notification, so the cached state in vpci is
out of date compared with the real device state.
To solve that problem, add a new function to clear all vpci
device state when the device is reset on the dom0 side.

Call that function in pcistub_init_device, because when
"pci-assignable-add" is used to assign a passthrough device in
Xen, the device is reset and the vpci state becomes out of
date, and the device then fails to restore its bar state.

Co-developed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
---
 drivers/xen/pci.c                  | 12 ++++++++++++
 drivers/xen/xen-pciback/pci_stub.c | 18 +++++++++++++++---
 include/xen/interface/physdev.h    |  7 +++++++
 include/xen/pci.h                  |  6 ++++++
 4 files changed, 40 insertions(+), 3 deletions(-)

diff --git a/drivers/xen/pci.c b/drivers/xen/pci.c
index 72d4e3f193af..e9b30bc09139 100644
--- a/drivers/xen/pci.c
+++ b/drivers/xen/pci.c
@@ -177,6 +177,18 @@ static int xen_remove_device(struct device *dev)
 	return r;
 }
 
+int xen_reset_device_state(const struct pci_dev *dev)
+{
+	struct physdev_pci_device device = {
+		.seg = pci_domain_nr(dev->bus),
+		.bus = dev->bus->number,
+		.devfn = dev->devfn
+	};
+
+	return HYPERVISOR_physdev_op(PHYSDEVOP_pci_device_state_reset, &device);
+}
+EXPORT_SYMBOL_GPL(xen_reset_device_state);
+
 static int xen_pci_notifier(struct notifier_block *nb,
 			    unsigned long action, void *data)
 {
diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
index e34b623e4b41..46c40ec8a18e 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -89,6 +89,16 @@ static struct pcistub_device *pcistub_device_alloc(struct pci_dev *dev)
 	return psdev;
 }
 
+static int pcistub_reset_device_state(struct pci_dev *dev)
+{
+	__pci_reset_function_locked(dev);
+
+	if (!xen_pv_domain())
+		return xen_reset_device_state(dev);
+	else
+		return 0;
+}
+
 /* Don't call this directly as it's called by pcistub_device_put */
 static void pcistub_device_release(struct kref *kref)
 {
@@ -107,7 +117,7 @@ static void pcistub_device_release(struct kref *kref)
 	/* Call the reset function which does not take lock as this
 	 * is called from "unbind" which takes a device_lock mutex.
 	 */
-	__pci_reset_function_locked(dev);
+	pcistub_reset_device_state(dev);
 	if (dev_data &&
 	    pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
 		dev_info(&dev->dev, "Could not reload PCI state\n");
@@ -284,7 +294,7 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
 	 * (so it's ready for the next domain)
 	 */
 	device_lock_assert(&dev->dev);
-	__pci_reset_function_locked(dev);
+	pcistub_reset_device_state(dev);
 
 	dev_data = pci_get_drvdata(dev);
 	ret = pci_load_saved_state(dev, dev_data->pci_saved_state);
@@ -420,7 +430,9 @@ static int pcistub_init_device(struct pci_dev *dev)
 		dev_err(&dev->dev, "Could not store PCI conf saved state!\n");
 	else {
 		dev_dbg(&dev->dev, "resetting (FLR, D3, etc) the device\n");
-		__pci_reset_function_locked(dev);
+		err = pcistub_reset_device_state(dev);
+		if (err)
+			goto config_release;
 		pci_restore_state(dev);
 	}
 	/* Now disable the device (this also ensures some private device
diff --git a/include/xen/interface/physdev.h b/include/xen/interface/physdev.h
index a237af867873..8609770e28f5 100644
--- a/include/xen/interface/physdev.h
+++ b/include/xen/interface/physdev.h
@@ -256,6 +256,13 @@ struct physdev_pci_device_add {
  */
 #define PHYSDEVOP_prepare_msix          30
 #define PHYSDEVOP_release_msix          31
+/*
+ * Notify the hypervisor that a PCI device has been reset, so that any
+ * internally cached state is regenerated.  Should be called after any
+ * device reset performed by the hardware domain.
+ */
+#define PHYSDEVOP_pci_device_state_reset     32
+
 struct physdev_pci_device {
     /* IN */
     uint16_t seg;
diff --git a/include/xen/pci.h b/include/xen/pci.h
index b8337cf85fd1..b2e2e856efd6 100644
--- a/include/xen/pci.h
+++ b/include/xen/pci.h
@@ -4,10 +4,16 @@
 #define __XEN_PCI_H__
 
 #if defined(CONFIG_XEN_DOM0)
+int xen_reset_device_state(const struct pci_dev *dev);
 int xen_find_device_domain_owner(struct pci_dev *dev);
 int xen_register_device_domain_owner(struct pci_dev *dev, uint16_t domain);
 int xen_unregister_device_domain_owner(struct pci_dev *dev);
 #else
+static inline int xen_reset_device_state(const struct pci_dev *dev)
+{
+	return -1;
+}
+
 static inline int xen_find_device_domain_owner(struct pci_dev *dev)
 {
 	return -1;
-- 
2.34.1



* [RFC KERNEL PATCH v6 2/3] xen/pvh: Setup gsi for passthrough device
  2024-04-19  3:36 [RFC KERNEL PATCH v6 0/3] Support device passthrough when dom0 is PVH on Xen Jiqian Chen
  2024-04-19  3:36 ` [KERNEL PATCH v6 1/3] xen/pci: Add xen_reset_device_state function Jiqian Chen
@ 2024-04-19  3:36 ` Jiqian Chen
  2024-05-10  7:48   ` Juergen Gross
  2024-04-19  3:36 ` [RFC KERNEL PATCH v6 3/3] xen/privcmd: Add new syscall to get gsi from irq Jiqian Chen
  2 siblings, 1 reply; 16+ messages in thread
From: Jiqian Chen @ 2024-04-19  3:36 UTC (permalink / raw)
  To: Juergen Gross, Stefano Stabellini, Bjorn Helgaas,
	Rafael J . Wysocki, Roger Pau Monné
  Cc: xen-devel, linux-pci, linux-kernel, linux-acpi, Huang Rui,
	Jiqian Chen, Huang Rui

In PVH dom0, the gsis don't get registered, but the gsi of
a passthrough device must be configured for it to be able to be
mapped into a domU.

When assigning a device for passthrough, proactively set up the
gsi of the device during that process.

Co-developed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
---
 arch/x86/xen/enlighten_pvh.c       | 92 ++++++++++++++++++++++++++++++
 drivers/acpi/pci_irq.c             |  2 +-
 drivers/xen/xen-pciback/pci_stub.c |  8 +++
 include/linux/acpi.h               |  1 +
 include/xen/acpi.h                 |  6 ++
 5 files changed, 108 insertions(+), 1 deletion(-)

diff --git a/arch/x86/xen/enlighten_pvh.c b/arch/x86/xen/enlighten_pvh.c
index c28f073c1df5..12be665b27d8 100644
--- a/arch/x86/xen/enlighten_pvh.c
+++ b/arch/x86/xen/enlighten_pvh.c
@@ -2,6 +2,7 @@
 #include <linux/acpi.h>
 #include <linux/export.h>
 #include <linux/mm.h>
+#include <linux/pci.h>
 
 #include <xen/hvc-console.h>
 
@@ -26,6 +27,97 @@
 bool __ro_after_init xen_pvh;
 EXPORT_SYMBOL_GPL(xen_pvh);
 
+typedef struct gsi_info {
+	int gsi;
+	int trigger;
+	int polarity;
+} gsi_info_t;
+
+struct acpi_prt_entry {
+	struct acpi_pci_id	id;
+	u8			pin;
+	acpi_handle		link;
+	u32			index;		/* GSI, or link _CRS index */
+};
+
+static int xen_pvh_get_gsi_info(struct pci_dev *dev,
+								gsi_info_t *gsi_info)
+{
+	int gsi;
+	u8 pin;
+	struct acpi_prt_entry *entry;
+	int trigger = ACPI_LEVEL_SENSITIVE;
+	int polarity = acpi_irq_model == ACPI_IRQ_MODEL_GIC ?
+				      ACPI_ACTIVE_HIGH : ACPI_ACTIVE_LOW;
+
+	if (!dev || !gsi_info)
+		return -EINVAL;
+
+	pin = dev->pin;
+	if (!pin)
+		return -EINVAL;
+
+	entry = acpi_pci_irq_lookup(dev, pin);
+	if (entry) {
+		if (entry->link)
+			gsi = acpi_pci_link_allocate_irq(entry->link,
+							 entry->index,
+							 &trigger, &polarity,
+							 NULL);
+		else
+			gsi = entry->index;
+	} else
+		gsi = -1;
+
+	if (gsi < 0)
+		return -EINVAL;
+
+	gsi_info->gsi = gsi;
+	gsi_info->trigger = trigger;
+	gsi_info->polarity = polarity;
+
+	return 0;
+}
+
+static int xen_pvh_setup_gsi(gsi_info_t *gsi_info)
+{
+	struct physdev_setup_gsi setup_gsi;
+
+	if (!gsi_info)
+		return -EINVAL;
+
+	setup_gsi.gsi = gsi_info->gsi;
+	setup_gsi.triggering = (gsi_info->trigger == ACPI_EDGE_SENSITIVE ? 0 : 1);
+	setup_gsi.polarity = (gsi_info->polarity == ACPI_ACTIVE_HIGH ? 0 : 1);
+
+	return HYPERVISOR_physdev_op(PHYSDEVOP_setup_gsi, &setup_gsi);
+}
+
+int xen_pvh_passthrough_gsi(struct pci_dev *dev)
+{
+	int ret;
+	gsi_info_t gsi_info;
+
+	if (!dev)
+		return -EINVAL;
+
+	ret = xen_pvh_get_gsi_info(dev, &gsi_info);
+	if (ret) {
+		xen_raw_printk("Fail to get gsi info!\n");
+		return ret;
+	}
+
+	ret = xen_pvh_setup_gsi(&gsi_info);
+	if (ret == -EEXIST) {
+		xen_raw_printk("Already setup the GSI :%d\n", gsi_info.gsi);
+		ret = 0;
+	} else if (ret)
+		xen_raw_printk("Fail to setup GSI (%d)!\n", gsi_info.gsi);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(xen_pvh_passthrough_gsi);
+
 void __init xen_pvh_init(struct boot_params *boot_params)
 {
 	u32 msr;
diff --git a/drivers/acpi/pci_irq.c b/drivers/acpi/pci_irq.c
index ff30ceca2203..630fe0a34bc6 100644
--- a/drivers/acpi/pci_irq.c
+++ b/drivers/acpi/pci_irq.c
@@ -288,7 +288,7 @@ static int acpi_reroute_boot_interrupt(struct pci_dev *dev,
 }
 #endif /* CONFIG_X86_IO_APIC */
 
-static struct acpi_prt_entry *acpi_pci_irq_lookup(struct pci_dev *dev, int pin)
+struct acpi_prt_entry *acpi_pci_irq_lookup(struct pci_dev *dev, int pin)
 {
 	struct acpi_prt_entry *entry = NULL;
 	struct pci_dev *bridge;
diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
index 46c40ec8a18e..22d4380d2b04 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -20,6 +20,7 @@
 #include <linux/atomic.h>
 #include <xen/events.h>
 #include <xen/pci.h>
+#include <xen/acpi.h>
 #include <xen/xen.h>
 #include <asm/xen/hypervisor.h>
 #include <xen/interface/physdev.h>
@@ -435,6 +436,13 @@ static int pcistub_init_device(struct pci_dev *dev)
 			goto config_release;
 		pci_restore_state(dev);
 	}
+
+	if (xen_initial_domain() && xen_pvh_domain()) {
+		err = xen_pvh_passthrough_gsi(dev);
+		if (err)
+			goto config_release;
+	}
+
 	/* Now disable the device (this also ensures some private device
 	 * data is setup before we export)
 	 */
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index b7165e52b3c6..08f1e316bf27 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -361,6 +361,7 @@ void acpi_unregister_gsi (u32 gsi);
 
 struct pci_dev;
 
+struct acpi_prt_entry *acpi_pci_irq_lookup(struct pci_dev *dev, int pin);
 int acpi_pci_irq_enable (struct pci_dev *dev);
 void acpi_penalize_isa_irq(int irq, int active);
 bool acpi_isa_irq_available(int irq);
diff --git a/include/xen/acpi.h b/include/xen/acpi.h
index b1e11863144d..17c4d37f1e60 100644
--- a/include/xen/acpi.h
+++ b/include/xen/acpi.h
@@ -67,10 +67,16 @@ static inline void xen_acpi_sleep_register(void)
 		acpi_suspend_lowlevel = xen_acpi_suspend_lowlevel;
 	}
 }
+int xen_pvh_passthrough_gsi(struct pci_dev *dev);
 #else
 static inline void xen_acpi_sleep_register(void)
 {
 }
+
+static inline int xen_pvh_passthrough_gsi(struct pci_dev *dev)
+{
+	return -1;
+}
 #endif
 
 #endif	/* _XEN_ACPI_H */
-- 
2.34.1



* [RFC KERNEL PATCH v6 3/3] xen/privcmd: Add new syscall to get gsi from irq
  2024-04-19  3:36 [RFC KERNEL PATCH v6 0/3] Support device passthrough when dom0 is PVH on Xen Jiqian Chen
  2024-04-19  3:36 ` [KERNEL PATCH v6 1/3] xen/pci: Add xen_reset_device_state function Jiqian Chen
  2024-04-19  3:36 ` [RFC KERNEL PATCH v6 2/3] xen/pvh: Setup gsi for passthrough device Jiqian Chen
@ 2024-04-19  3:36 ` Jiqian Chen
  2024-05-10  6:46   ` Jürgen Groß
  2 siblings, 1 reply; 16+ messages in thread
From: Jiqian Chen @ 2024-04-19  3:36 UTC (permalink / raw)
  To: Juergen Gross, Stefano Stabellini, Bjorn Helgaas,
	Rafael J . Wysocki, Roger Pau Monné
  Cc: xen-devel, linux-pci, linux-kernel, linux-acpi, Huang Rui,
	Jiqian Chen, Huang Rui

PVH dom0 uses the Linux local interrupt mechanism: irqs are
allocated for gsis dynamically, on a first come, first served
basis. Irq numbers are allocated in increasing order, but
gsis are not requested in increasing order (gsi 38 may be
requested before gsi 28), so the irq number is not equal to
the gsi number.
When passing through a device, QEMU uses the device's gsi
number to do the pirq mapping, but it reads that number from
the file /sys/bus/pci/devices/<sbdf>/irq. Since irq != gsi,
the mapping fails.
The current Linux code offers userspace no way to translate
an irq to a gsi.

To address this, record the relationship between gsi and irq
when PVH dom0 does acpi_register_gsi_ioapic for devices, and
add a new syscall to privcmd so that userspace can get
that translation when needed.

Co-developed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
---
 arch/x86/include/asm/apic.h      |  8 +++++++
 arch/x86/include/asm/xen/pci.h   |  5 ++++
 arch/x86/kernel/acpi/boot.c      |  2 +-
 arch/x86/pci/xen.c               | 21 +++++++++++++++++
 drivers/xen/events/events_base.c | 39 ++++++++++++++++++++++++++++++++
 drivers/xen/privcmd.c            | 19 ++++++++++++++++
 include/uapi/xen/privcmd.h       |  7 ++++++
 include/xen/events.h             |  5 ++++
 8 files changed, 105 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index 9d159b771dc8..dd4139250895 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -169,6 +169,9 @@ extern bool apic_needs_pit(void);
 
 extern void apic_send_IPI_allbutself(unsigned int vector);
 
+extern int acpi_register_gsi_ioapic(struct device *dev, u32 gsi,
+				    int trigger, int polarity);
+
 #else /* !CONFIG_X86_LOCAL_APIC */
 static inline void lapic_shutdown(void) { }
 #define local_apic_timer_c2_ok		1
@@ -183,6 +186,11 @@ static inline void apic_intr_mode_init(void) { }
 static inline void lapic_assign_system_vectors(void) { }
 static inline void lapic_assign_legacy_vector(unsigned int i, bool r) { }
 static inline bool apic_needs_pit(void) { return true; }
+static inline int acpi_register_gsi_ioapic(struct device *dev, u32 gsi,
+				    int trigger, int polarity)
+{
+	return (int)gsi;
+}
 #endif /* !CONFIG_X86_LOCAL_APIC */
 
 #ifdef CONFIG_X86_X2APIC
diff --git a/arch/x86/include/asm/xen/pci.h b/arch/x86/include/asm/xen/pci.h
index 9015b888edd6..aa8ded61fc2d 100644
--- a/arch/x86/include/asm/xen/pci.h
+++ b/arch/x86/include/asm/xen/pci.h
@@ -5,6 +5,7 @@
 #if defined(CONFIG_PCI_XEN)
 extern int __init pci_xen_init(void);
 extern int __init pci_xen_hvm_init(void);
+extern int __init pci_xen_pvh_init(void);
 #define pci_xen 1
 #else
 #define pci_xen 0
@@ -13,6 +14,10 @@ static inline int pci_xen_hvm_init(void)
 {
 	return -1;
 }
+static inline int pci_xen_pvh_init(void)
+{
+	return -1;
+}
 #endif
 #ifdef CONFIG_XEN_PV_DOM0
 int __init pci_xen_initial_domain(void);
diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
index 85a3ce2a3666..72c73458c083 100644
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -749,7 +749,7 @@ static int acpi_register_gsi_pic(struct device *dev, u32 gsi,
 }
 
 #ifdef CONFIG_X86_LOCAL_APIC
-static int acpi_register_gsi_ioapic(struct device *dev, u32 gsi,
+int acpi_register_gsi_ioapic(struct device *dev, u32 gsi,
 				    int trigger, int polarity)
 {
 	int irq = gsi;
diff --git a/arch/x86/pci/xen.c b/arch/x86/pci/xen.c
index 652cd53e77f6..f056ab5c0a06 100644
--- a/arch/x86/pci/xen.c
+++ b/arch/x86/pci/xen.c
@@ -114,6 +114,21 @@ static int acpi_register_gsi_xen_hvm(struct device *dev, u32 gsi,
 				 false /* no mapping of GSI to PIRQ */);
 }
 
+static int acpi_register_gsi_xen_pvh(struct device *dev, u32 gsi,
+				    int trigger, int polarity)
+{
+	int irq;
+
+	irq = acpi_register_gsi_ioapic(dev, gsi, trigger, polarity);
+	if (irq < 0)
+		return irq;
+
+	if (xen_pvh_add_gsi_irq_map(gsi, irq) == -EEXIST)
+		printk(KERN_INFO "Already map the GSI :%u and IRQ: %d\n", gsi, irq);
+
+	return irq;
+}
+
 #ifdef CONFIG_XEN_PV_DOM0
 static int xen_register_gsi(u32 gsi, int triggering, int polarity)
 {
@@ -558,6 +573,12 @@ int __init pci_xen_hvm_init(void)
 	return 0;
 }
 
+int __init pci_xen_pvh_init(void)
+{
+	__acpi_register_gsi = acpi_register_gsi_xen_pvh;
+	return 0;
+}
+
 #ifdef CONFIG_XEN_PV_DOM0
 int __init pci_xen_initial_domain(void)
 {
diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
index 27553673e46b..80d4f7faac64 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -953,6 +953,43 @@ int xen_irq_from_gsi(unsigned gsi)
 }
 EXPORT_SYMBOL_GPL(xen_irq_from_gsi);
 
+int xen_gsi_from_irq(unsigned irq)
+{
+	struct irq_info *info;
+
+	list_for_each_entry(info, &xen_irq_list_head, list) {
+		if (info->type != IRQT_PIRQ)
+			continue;
+
+		if (info->irq == irq)
+			return info->u.pirq.gsi;
+	}
+
+	return -1;
+}
+EXPORT_SYMBOL_GPL(xen_gsi_from_irq);
+
+int xen_pvh_add_gsi_irq_map(unsigned gsi, unsigned irq)
+{
+	int tmp_irq;
+	struct irq_info *info;
+
+	tmp_irq = xen_irq_from_gsi(gsi);
+	if (tmp_irq != -1)
+		return -EEXIST;
+
+	info = kzalloc(sizeof(*info), GFP_KERNEL);
+	if (info == NULL)
+		panic("Unable to allocate metadata for GSI%d\n", gsi);
+
+	info->type = IRQT_PIRQ;
+	info->irq = irq;
+	info->u.pirq.gsi = gsi;
+	list_add_tail(&info->list, &xen_irq_list_head);
+
+	return 0;
+}
+
 static void __unbind_from_irq(struct irq_info *info, unsigned int irq)
 {
 	evtchn_port_t evtchn;
@@ -2295,6 +2332,8 @@ void __init xen_init_IRQ(void)
 	xen_init_setup_upcall_vector();
 	xen_alloc_callback_vector();
 
+	if (xen_pvh_domain())
+		pci_xen_pvh_init();
 
 	if (xen_hvm_domain()) {
 		native_init_IRQ();
diff --git a/drivers/xen/privcmd.c b/drivers/xen/privcmd.c
index 67dfa4778864..11feed529e1d 100644
--- a/drivers/xen/privcmd.c
+++ b/drivers/xen/privcmd.c
@@ -842,6 +842,21 @@ static long privcmd_ioctl_mmap_resource(struct file *file,
 	return rc;
 }
 
+static long privcmd_ioctl_gsi_from_irq(struct file *file, void __user *udata)
+{
+	struct privcmd_gsi_from_irq kdata;
+
+	if (copy_from_user(&kdata, udata, sizeof(kdata)))
+		return -EFAULT;
+
+	kdata.gsi = xen_gsi_from_irq(kdata.irq);
+
+	if (copy_to_user(udata, &kdata, sizeof(kdata)))
+		return -EFAULT;
+
+	return 0;
+}
+
 #ifdef CONFIG_XEN_PRIVCMD_EVENTFD
 /* Irqfd support */
 static struct workqueue_struct *irqfd_cleanup_wq;
@@ -1529,6 +1544,10 @@ static long privcmd_ioctl(struct file *file,
 		ret = privcmd_ioctl_ioeventfd(file, udata);
 		break;
 
+	case IOCTL_PRIVCMD_GSI_FROM_IRQ:
+		ret = privcmd_ioctl_gsi_from_irq(file, udata);
+		break;
+
 	default:
 		break;
 	}
diff --git a/include/uapi/xen/privcmd.h b/include/uapi/xen/privcmd.h
index 8b8c5d1420fe..61f0ffbec077 100644
--- a/include/uapi/xen/privcmd.h
+++ b/include/uapi/xen/privcmd.h
@@ -126,6 +126,11 @@ struct privcmd_ioeventfd {
 	__u8 pad[2];
 };
 
+struct privcmd_gsi_from_irq {
+	__u32 irq;
+	__u32 gsi;
+};
+
 /*
  * @cmd: IOCTL_PRIVCMD_HYPERCALL
  * @arg: &privcmd_hypercall_t
@@ -157,5 +162,7 @@ struct privcmd_ioeventfd {
 	_IOW('P', 8, struct privcmd_irqfd)
 #define IOCTL_PRIVCMD_IOEVENTFD					\
 	_IOW('P', 9, struct privcmd_ioeventfd)
+#define IOCTL_PRIVCMD_GSI_FROM_IRQ				\
+	_IOC(_IOC_NONE, 'P', 10, sizeof(struct privcmd_gsi_from_irq))
 
 #endif /* __LINUX_PUBLIC_PRIVCMD_H__ */
diff --git a/include/xen/events.h b/include/xen/events.h
index 3b07409f8032..411298ae7fb0 100644
--- a/include/xen/events.h
+++ b/include/xen/events.h
@@ -127,6 +127,11 @@ int xen_pirq_from_irq(unsigned irq);
 /* Return the irq allocated to the gsi */
 int xen_irq_from_gsi(unsigned gsi);
 
+/* Return the gsi from irq */
+int xen_gsi_from_irq(unsigned irq);
+
+int xen_pvh_add_gsi_irq_map(unsigned gsi, unsigned irq);
+
 /* Determine whether to ignore this IRQ if it is passed to a guest. */
 int xen_test_irq_shared(int irq);
 
-- 
2.34.1



* Re: [RFC KERNEL PATCH v6 3/3] xen/privcmd: Add new syscall to get gsi from irq
  2024-04-19  3:36 ` [RFC KERNEL PATCH v6 3/3] xen/privcmd: Add new syscall to get gsi from irq Jiqian Chen
@ 2024-05-10  6:46   ` Jürgen Groß
  2024-05-10  9:06     ` Chen, Jiqian
  0 siblings, 1 reply; 16+ messages in thread
From: Jürgen Groß @ 2024-05-10  6:46 UTC (permalink / raw)
  To: Jiqian Chen, Stefano Stabellini, Bjorn Helgaas,
	Rafael J . Wysocki, Roger Pau Monné
  Cc: xen-devel, linux-pci, linux-kernel, linux-acpi, Huang Rui

On 19.04.24 05:36, Jiqian Chen wrote:
> PVH dom0 uses the Linux local interrupt mechanism: irqs are
> allocated for gsis dynamically, on a first come, first served
> basis. Irq numbers are allocated in increasing order, but
> gsis are not requested in increasing order (gsi 38 may be
> requested before gsi 28), so the irq number is not equal to
> the gsi number.
> When passing through a device, QEMU uses the device's gsi
> number to do the pirq mapping, but it reads that number from
> the file /sys/bus/pci/devices/<sbdf>/irq. Since irq != gsi,
> the mapping fails.
> The current Linux code offers userspace no way to translate
> an irq to a gsi.
> 
> To address this, record the relationship between gsi and irq
> when PVH dom0 does acpi_register_gsi_ioapic for devices, and
> add a new syscall to privcmd so that userspace can get
> that translation when needed.
> 
> Co-developed-by: Huang Rui <ray.huang@amd.com>
> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> ---
>   arch/x86/include/asm/apic.h      |  8 +++++++
>   arch/x86/include/asm/xen/pci.h   |  5 ++++
>   arch/x86/kernel/acpi/boot.c      |  2 +-
>   arch/x86/pci/xen.c               | 21 +++++++++++++++++
>   drivers/xen/events/events_base.c | 39 ++++++++++++++++++++++++++++++++
>   drivers/xen/privcmd.c            | 19 ++++++++++++++++
>   include/uapi/xen/privcmd.h       |  7 ++++++
>   include/xen/events.h             |  5 ++++
>   8 files changed, 105 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
> index 9d159b771dc8..dd4139250895 100644
> --- a/arch/x86/include/asm/apic.h
> +++ b/arch/x86/include/asm/apic.h
> @@ -169,6 +169,9 @@ extern bool apic_needs_pit(void);
>   
>   extern void apic_send_IPI_allbutself(unsigned int vector);
>   
> +extern int acpi_register_gsi_ioapic(struct device *dev, u32 gsi,
> +				    int trigger, int polarity);
> +
>   #else /* !CONFIG_X86_LOCAL_APIC */
>   static inline void lapic_shutdown(void) { }
>   #define local_apic_timer_c2_ok		1
> @@ -183,6 +186,11 @@ static inline void apic_intr_mode_init(void) { }
>   static inline void lapic_assign_system_vectors(void) { }
>   static inline void lapic_assign_legacy_vector(unsigned int i, bool r) { }
>   static inline bool apic_needs_pit(void) { return true; }
> +static inline int acpi_register_gsi_ioapic(struct device *dev, u32 gsi,
> +				    int trigger, int polarity)
> +{
> +	return (int)gsi;
> +}
>   #endif /* !CONFIG_X86_LOCAL_APIC */
>   
>   #ifdef CONFIG_X86_X2APIC
> diff --git a/arch/x86/include/asm/xen/pci.h b/arch/x86/include/asm/xen/pci.h
> index 9015b888edd6..aa8ded61fc2d 100644
> --- a/arch/x86/include/asm/xen/pci.h
> +++ b/arch/x86/include/asm/xen/pci.h
> @@ -5,6 +5,7 @@
>   #if defined(CONFIG_PCI_XEN)
>   extern int __init pci_xen_init(void);
>   extern int __init pci_xen_hvm_init(void);
> +extern int __init pci_xen_pvh_init(void);
>   #define pci_xen 1
>   #else
>   #define pci_xen 0
> @@ -13,6 +14,10 @@ static inline int pci_xen_hvm_init(void)
>   {
>   	return -1;
>   }
> +static inline int pci_xen_pvh_init(void)
> +{
> +	return -1;
> +}
>   #endif
>   #ifdef CONFIG_XEN_PV_DOM0
>   int __init pci_xen_initial_domain(void);
> diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
> index 85a3ce2a3666..72c73458c083 100644
> --- a/arch/x86/kernel/acpi/boot.c
> +++ b/arch/x86/kernel/acpi/boot.c
> @@ -749,7 +749,7 @@ static int acpi_register_gsi_pic(struct device *dev, u32 gsi,
>   }
>   
>   #ifdef CONFIG_X86_LOCAL_APIC
> -static int acpi_register_gsi_ioapic(struct device *dev, u32 gsi,
> +int acpi_register_gsi_ioapic(struct device *dev, u32 gsi,
>   				    int trigger, int polarity)
>   {
>   	int irq = gsi;
> diff --git a/arch/x86/pci/xen.c b/arch/x86/pci/xen.c
> index 652cd53e77f6..f056ab5c0a06 100644
> --- a/arch/x86/pci/xen.c
> +++ b/arch/x86/pci/xen.c
> @@ -114,6 +114,21 @@ static int acpi_register_gsi_xen_hvm(struct device *dev, u32 gsi,
>   				 false /* no mapping of GSI to PIRQ */);
>   }
>   
> +static int acpi_register_gsi_xen_pvh(struct device *dev, u32 gsi,
> +				    int trigger, int polarity)
> +{
> +	int irq;
> +
> +	irq = acpi_register_gsi_ioapic(dev, gsi, trigger, polarity);
> +	if (irq < 0)
> +		return irq;
> +
> +	if (xen_pvh_add_gsi_irq_map(gsi, irq) == -EEXIST)
> +		printk(KERN_INFO "Already map the GSI :%u and IRQ: %d\n", gsi, irq);
> +
> +	return irq;
> +}
> +
>   #ifdef CONFIG_XEN_PV_DOM0
>   static int xen_register_gsi(u32 gsi, int triggering, int polarity)
>   {
> @@ -558,6 +573,12 @@ int __init pci_xen_hvm_init(void)
>   	return 0;
>   }
>   
> +int __init pci_xen_pvh_init(void)
> +{
> +	__acpi_register_gsi = acpi_register_gsi_xen_pvh;

No support for unregistering the gsi again?

> +	return 0;
> +}
> +
>   #ifdef CONFIG_XEN_PV_DOM0
>   int __init pci_xen_initial_domain(void)
>   {
> diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
> index 27553673e46b..80d4f7faac64 100644
> --- a/drivers/xen/events/events_base.c
> +++ b/drivers/xen/events/events_base.c
> @@ -953,6 +953,43 @@ int xen_irq_from_gsi(unsigned gsi)
>   }
>   EXPORT_SYMBOL_GPL(xen_irq_from_gsi);
>   
> +int xen_gsi_from_irq(unsigned irq)
> +{
> +	struct irq_info *info;
> +
> +	list_for_each_entry(info, &xen_irq_list_head, list) {
> +		if (info->type != IRQT_PIRQ)
> +			continue;
> +
> +		if (info->irq == irq)
> +			return info->u.pirq.gsi;
> +	}
> +
> +	return -1;
> +}
> +EXPORT_SYMBOL_GPL(xen_gsi_from_irq);
> +
> +int xen_pvh_add_gsi_irq_map(unsigned gsi, unsigned irq)
> +{
> +	int tmp_irq;
> +	struct irq_info *info;
> +
> +	tmp_irq = xen_irq_from_gsi(gsi);
> +	if (tmp_irq != -1)
> +		return -EEXIST;
> +
> +	info = kzalloc(sizeof(*info), GFP_KERNEL);
> +	if (info == NULL)
> +		panic("Unable to allocate metadata for GSI%d\n", gsi);

Please don't kill the system here, just return -ENOMEM.

> +
> +	info->type = IRQT_PIRQ;
> +	info->irq = irq;
> +	info->u.pirq.gsi = gsi;
> +	list_add_tail(&info->list, &xen_irq_list_head);

I think you need some kind of locking to protect changing of the list against
concurrent accesses.
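As a rough illustration of the two points above (return -ENOMEM instead of panicking, and serialize list updates), here is a userspace sketch. It is not the kernel implementation: struct gsi_irq_entry, map_head, map_lock and add_gsi_irq_map() are hypothetical stand-ins, and a pthread mutex takes the place of whatever kernel lock would actually be appropriate.

```c
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for an irq_info list entry. */
struct gsi_irq_entry {
	unsigned int gsi;
	unsigned int irq;
	struct gsi_irq_entry *next;
};

static struct gsi_irq_entry *map_head;
static pthread_mutex_t map_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Returns 0 on success, -EEXIST if the GSI is already mapped and
 * -ENOMEM if the entry cannot be allocated. It never aborts.
 */
int add_gsi_irq_map(unsigned int gsi, unsigned int irq)
{
	struct gsi_irq_entry *entry, *it;

	/* Allocate before taking the lock to keep the critical section short. */
	entry = calloc(1, sizeof(*entry));
	if (!entry)
		return -ENOMEM;
	entry->gsi = gsi;
	entry->irq = irq;

	pthread_mutex_lock(&map_lock);
	/* The duplicate check and the insertion happen under the same lock. */
	for (it = map_head; it; it = it->next) {
		if (it->gsi == gsi) {
			pthread_mutex_unlock(&map_lock);
			free(entry);
			return -EEXIST;
		}
	}
	entry->next = map_head;
	map_head = entry;
	pthread_mutex_unlock(&map_lock);

	return 0;
}
```

The important part is that the lookup and the list update are one critical section; checking for an existing entry outside the lock would leave a window for two callers to add the same GSI.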

> +
> +	return 0;
> +}
> +
>   static void __unbind_from_irq(struct irq_info *info, unsigned int irq)
>   {
>   	evtchn_port_t evtchn;
> @@ -2295,6 +2332,8 @@ void __init xen_init_IRQ(void)
>   	xen_init_setup_upcall_vector();
>   	xen_alloc_callback_vector();
>   
> +	if (xen_pvh_domain())
> +		pci_xen_pvh_init();
>   
>   	if (xen_hvm_domain()) {
>   		native_init_IRQ();
> diff --git a/drivers/xen/privcmd.c b/drivers/xen/privcmd.c
> index 67dfa4778864..11feed529e1d 100644
> --- a/drivers/xen/privcmd.c
> +++ b/drivers/xen/privcmd.c
> @@ -842,6 +842,21 @@ static long privcmd_ioctl_mmap_resource(struct file *file,
>   	return rc;
>   }
>   
> +static long privcmd_ioctl_gsi_from_irq(struct file *file, void __user *udata)
> +{
> +	struct privcmd_gsi_from_irq kdata;
> +
> +	if (copy_from_user(&kdata, udata, sizeof(kdata)))
> +		return -EFAULT;
> +
> +	kdata.gsi = xen_gsi_from_irq(kdata.irq);
> +
> +	if (copy_to_user(udata, &kdata, sizeof(kdata)))
> +		return -EFAULT;
> +
> +	return 0;

Shouldn't you return an error if xen_gsi_from_irq() returned -1?
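A minimal userspace sketch of the behaviour asked for here, under stated assumptions: struct gsi_from_irq_req only mirrors the two fields the patch uses (irq and gsi) and is not the real uapi layout, the lookup table is invented test data, and -EINVAL is just one plausible error code for a failed lookup.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical request layout; only the irq and gsi fields are known. */
struct gsi_from_irq_req {
	unsigned int irq;
	unsigned int gsi;
};

struct model_map_entry {
	unsigned int irq;
	unsigned int gsi;
};

/* Invented sample data standing in for the recorded gsi/irq mappings. */
static const struct model_map_entry model_table[] = {
	{ 24, 38 },
	{ 25, 28 },
};

/* Returns the GSI for irq, or -1 when no mapping is recorded. */
static int model_gsi_from_irq(unsigned int irq)
{
	size_t i;

	for (i = 0; i < sizeof(model_table) / sizeof(model_table[0]); i++)
		if (model_table[i].irq == irq)
			return (int)model_table[i].gsi;
	return -1;
}

/* Fails with -EINVAL instead of handing -1 back as a "gsi". */
long handle_gsi_from_irq(struct gsi_from_irq_req *req)
{
	int gsi = model_gsi_from_irq(req->irq);

	if (gsi < 0)
		return -EINVAL;

	req->gsi = (unsigned int)gsi;
	return 0;
}
```

With this shape the caller can tell "no mapping" apart from a valid result by the return value alone, and the output field is left untouched on failure.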


Juergen

* Re: [RFC KERNEL PATCH v6 2/3] xen/pvh: Setup gsi for passthrough device
  2024-04-19  3:36 ` [RFC KERNEL PATCH v6 2/3] xen/pvh: Setup gsi for passthrough device Jiqian Chen
@ 2024-05-10  7:48   ` Juergen Gross
  2024-05-10  8:42     ` Chen, Jiqian
  0 siblings, 1 reply; 16+ messages in thread
From: Juergen Gross @ 2024-05-10  7:48 UTC (permalink / raw)
  To: Jiqian Chen, Stefano Stabellini, Bjorn Helgaas,
	Rafael J . Wysocki, Roger Pau Monné
  Cc: xen-devel, linux-pci, linux-kernel, linux-acpi, Huang Rui


On 19.04.24 05:36, Jiqian Chen wrote:
> In PVH dom0, the gsis don't get registered, but the gsi of
> a passthrough device must be configured for it to be able to be
> mapped into a domU.
> 
> When assign a device to passthrough, proactively setup the gsi
> of the device during that process.
> 
> Co-developed-by: Huang Rui <ray.huang@amd.com>
> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>

This patch is breaking the build.

On Arm I get:

In file included from /home/gross/korg/src/drivers/xen/xen-pciback/pci_stub.c:23:0:
/home/gross/korg/src/include/xen/acpi.h: In function 'xen_acpi_sleep_register':
/home/gross/korg/src/include/xen/acpi.h:67:3: error: 'acpi_suspend_lowlevel' undeclared (first use in this function); did you mean 'xen_acpi_suspend_lowlevel'?
    acpi_suspend_lowlevel = xen_acpi_suspend_lowlevel;
    ^~~~~~~~~~~~~~~~~~~~~
    xen_acpi_suspend_lowlevel
/home/gross/korg/src/include/xen/acpi.h:67:3: note: each undeclared identifier is reported only once for each function it appears in
make[6]: *** [/home/gross/korg/src/scripts/Makefile.build:244: drivers/xen/xen-pciback/pci_stub.o] Error 1
make[5]: *** [/home/gross/korg/src/scripts/Makefile.build:485: drivers/xen/xen-pciback] Error 2
make[4]: *** [/home/gross/korg/src/scripts/Makefile.build:485: drivers/xen] Error 2

Additionally I'm seeing this warning on x86_64:

/home/gross/korg/src/arch/x86/xen/enlighten_pvh.c:97:5: warning: no previous prototype for ‘xen_pvh_passthrough_gsi’ [-Wmissing-prototypes]
  int xen_pvh_passthrough_gsi(struct pci_dev *dev)


Juergen

* Re: [RFC KERNEL PATCH v6 2/3] xen/pvh: Setup gsi for passthrough device
  2024-05-10  7:48   ` Juergen Gross
@ 2024-05-10  8:42     ` Chen, Jiqian
  0 siblings, 0 replies; 16+ messages in thread
From: Chen, Jiqian @ 2024-05-10  8:42 UTC (permalink / raw)
  To: Juergen Gross
  Cc: Stefano Stabellini, Bjorn Helgaas, Rafael J . Wysocki,
	Roger Pau Monné,
	xen-devel, linux-pci, linux-kernel, linux-acpi, Huang, Ray, Chen,
	Jiqian

Hi,

On 2024/5/10 15:48, Juergen Gross wrote:
> On 19.04.24 05:36, Jiqian Chen wrote:
>> In PVH dom0, the gsis don't get registered, but the gsi of
>> a passthrough device must be configured for it to be able to be
>> mapped into a domU.
>>
>> When assign a device to passthrough, proactively setup the gsi
>> of the device during that process.
>>
>> Co-developed-by: Huang Rui <ray.huang@amd.com>
>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
> 
> This patch is breaking the build.
> 
> On Arm I get:
> 
> In file included from /home/gross/korg/src/drivers/xen/xen-pciback/pci_stub.c:23:0:
> /home/gross/korg/src/include/xen/acpi.h: In function 'xen_acpi_sleep_register':
> /home/gross/korg/src/include/xen/acpi.h:67:3: error: 'acpi_suspend_lowlevel' undeclared (first use in this function); did you mean 'xen_acpi_suspend_lowlevel'?
>    acpi_suspend_lowlevel = xen_acpi_suspend_lowlevel;
>    ^~~~~~~~~~~~~~~~~~~~~
>    xen_acpi_suspend_lowlevel
> /home/gross/korg/src/include/xen/acpi.h:67:3: note: each undeclared identifier is reported only once for each function it appears in
> make[6]: *** [/home/gross/korg/src/scripts/Makefile.build:244: drivers/xen/xen-pciback/pci_stub.o] Error 1
> make[5]: *** [/home/gross/korg/src/scripts/Makefile.build:485: drivers/xen/xen-pciback] Error 2
> make[4]: *** [/home/gross/korg/src/scripts/Makefile.build:485: drivers/xen] Error 2
Thanks for testing on Arm. It seems I should guard the modifications to this file with CONFIG_X86.

> 
> Additionally I'm seeing this warning on x86_64:
> 
> /home/gross/korg/src/arch/x86/xen/enlighten_pvh.c:97:5: warning: no previous prototype for ‘xen_pvh_passthrough_gsi’ [-Wmissing-prototypes]
>  int xen_pvh_passthrough_gsi(struct pci_dev *dev)
I think I need to add "#include <xen/acpi.h>" to enlighten_pvh.c.

> 
> 
> Juergen

-- 
Best regards,
Jiqian Chen.

* Re: [RFC KERNEL PATCH v6 3/3] xen/privcmd: Add new syscall to get gsi from irq
  2024-05-10  6:46   ` Jürgen Groß
@ 2024-05-10  9:06     ` Chen, Jiqian
  2024-05-10  9:53       ` Jürgen Groß
  2024-05-13  7:47       ` Chen, Jiqian
  0 siblings, 2 replies; 16+ messages in thread
From: Chen, Jiqian @ 2024-05-10  9:06 UTC (permalink / raw)
  To: Jürgen Groß
  Cc: Stefano Stabellini, Bjorn Helgaas, Rafael J . Wysocki,
	Roger Pau Monné,
	xen-devel, linux-pci, linux-kernel, linux-acpi, Huang, Ray, Chen,
	Jiqian

Hi,

On 2024/5/10 14:46, Jürgen Groß wrote:
> On 19.04.24 05:36, Jiqian Chen wrote:
>> In PVH dom0, it uses the linux local interrupt mechanism,
>> when it allocs irq for a gsi, it is dynamic, and follow
>> the principle of applying first, distributing first. And
>> the irq number is alloced from small to large, but the
>> applying gsi number is not, may gsi 38 comes before gsi 28,
>> it causes the irq number is not equal with the gsi number.
>> And when passthrough a device, QEMU will use device's gsi
>> number to do pirq mapping, but the gsi number is got from
>> file /sys/bus/pci/devices/<sbdf>/irq, irq!= gsi, so it will
>> fail when mapping.
>> And in current linux codes, there is no method to translate
>> irq to gsi for userspace.
>>
>> For above purpose, record the relationship of gsi and irq
>> when PVH dom0 do acpi_register_gsi_ioapic for devices and
>> adds a new syscall into privcmd to let userspace can get
>> that translation when they have a need.
>>
>> Co-developed-by: Huang Rui <ray.huang@amd.com>
>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>> ---
>>   arch/x86/include/asm/apic.h      |  8 +++++++
>>   arch/x86/include/asm/xen/pci.h   |  5 ++++
>>   arch/x86/kernel/acpi/boot.c      |  2 +-
>>   arch/x86/pci/xen.c               | 21 +++++++++++++++++
>>   drivers/xen/events/events_base.c | 39 ++++++++++++++++++++++++++++++++
>>   drivers/xen/privcmd.c            | 19 ++++++++++++++++
>>   include/uapi/xen/privcmd.h       |  7 ++++++
>>   include/xen/events.h             |  5 ++++
>>   8 files changed, 105 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
>> index 9d159b771dc8..dd4139250895 100644
>> --- a/arch/x86/include/asm/apic.h
>> +++ b/arch/x86/include/asm/apic.h
>> @@ -169,6 +169,9 @@ extern bool apic_needs_pit(void);
>>     extern void apic_send_IPI_allbutself(unsigned int vector);
>>   +extern int acpi_register_gsi_ioapic(struct device *dev, u32 gsi,
>> +                    int trigger, int polarity);
>> +
>>   #else /* !CONFIG_X86_LOCAL_APIC */
>>   static inline void lapic_shutdown(void) { }
>>   #define local_apic_timer_c2_ok        1
>> @@ -183,6 +186,11 @@ static inline void apic_intr_mode_init(void) { }
>>   static inline void lapic_assign_system_vectors(void) { }
>>   static inline void lapic_assign_legacy_vector(unsigned int i, bool r) { }
>>   static inline bool apic_needs_pit(void) { return true; }
>> +static inline int acpi_register_gsi_ioapic(struct device *dev, u32 gsi,
>> +                    int trigger, int polarity)
>> +{
>> +    return (int)gsi;
>> +}
>>   #endif /* !CONFIG_X86_LOCAL_APIC */
>>     #ifdef CONFIG_X86_X2APIC
>> diff --git a/arch/x86/include/asm/xen/pci.h b/arch/x86/include/asm/xen/pci.h
>> index 9015b888edd6..aa8ded61fc2d 100644
>> --- a/arch/x86/include/asm/xen/pci.h
>> +++ b/arch/x86/include/asm/xen/pci.h
>> @@ -5,6 +5,7 @@
>>   #if defined(CONFIG_PCI_XEN)
>>   extern int __init pci_xen_init(void);
>>   extern int __init pci_xen_hvm_init(void);
>> +extern int __init pci_xen_pvh_init(void);
>>   #define pci_xen 1
>>   #else
>>   #define pci_xen 0
>> @@ -13,6 +14,10 @@ static inline int pci_xen_hvm_init(void)
>>   {
>>       return -1;
>>   }
>> +static inline int pci_xen_pvh_init(void)
>> +{
>> +    return -1;
>> +}
>>   #endif
>>   #ifdef CONFIG_XEN_PV_DOM0
>>   int __init pci_xen_initial_domain(void);
>> diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
>> index 85a3ce2a3666..72c73458c083 100644
>> --- a/arch/x86/kernel/acpi/boot.c
>> +++ b/arch/x86/kernel/acpi/boot.c
>> @@ -749,7 +749,7 @@ static int acpi_register_gsi_pic(struct device *dev, u32 gsi,
>>   }
>>     #ifdef CONFIG_X86_LOCAL_APIC
>> -static int acpi_register_gsi_ioapic(struct device *dev, u32 gsi,
>> +int acpi_register_gsi_ioapic(struct device *dev, u32 gsi,
>>                       int trigger, int polarity)
>>   {
>>       int irq = gsi;
>> diff --git a/arch/x86/pci/xen.c b/arch/x86/pci/xen.c
>> index 652cd53e77f6..f056ab5c0a06 100644
>> --- a/arch/x86/pci/xen.c
>> +++ b/arch/x86/pci/xen.c
>> @@ -114,6 +114,21 @@ static int acpi_register_gsi_xen_hvm(struct device *dev, u32 gsi,
>>                    false /* no mapping of GSI to PIRQ */);
>>   }
>>   +static int acpi_register_gsi_xen_pvh(struct device *dev, u32 gsi,
>> +                    int trigger, int polarity)
>> +{
>> +    int irq;
>> +
>> +    irq = acpi_register_gsi_ioapic(dev, gsi, trigger, polarity);
>> +    if (irq < 0)
>> +        return irq;
>> +
>> +    if (xen_pvh_add_gsi_irq_map(gsi, irq) == -EEXIST)
>> +        printk(KERN_INFO "Already map the GSI :%u and IRQ: %d\n", gsi, irq);
>> +
>> +    return irq;
>> +}
>> +
>>   #ifdef CONFIG_XEN_PV_DOM0
>>   static int xen_register_gsi(u32 gsi, int triggering, int polarity)
>>   {
>> @@ -558,6 +573,12 @@ int __init pci_xen_hvm_init(void)
>>       return 0;
>>   }
>>   +int __init pci_xen_pvh_init(void)
>> +{
>> +    __acpi_register_gsi = acpi_register_gsi_xen_pvh;
> 
> No support for unregistering the gsi again?
__acpi_unregister_gsi is set in acpi_set_irq_model_ioapic().
Maybe I need a new function that calls acpi_unregister_gsi_ioapic() and also removes the irq/gsi mapping from xen_irq_list_head?
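To sketch what "remove the mapping" could mean, here is a userspace model of a map with add, lookup and remove. All names (struct map_node, model_map_add(), model_irq_from_gsi(), model_map_remove()) are hypothetical, the -ENOENT return is an assumption, and locking is left out to keep the example short; the real code would operate on xen_irq_list_head and pair the removal with the ioapic unregister call.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for one recorded gsi/irq mapping. */
struct map_node {
	unsigned int gsi;
	unsigned int irq;
	struct map_node *next;
};

static struct map_node *map_nodes;

/* Record a gsi/irq pair; returns 0 or -ENOMEM. */
int model_map_add(unsigned int gsi, unsigned int irq)
{
	struct map_node *node = calloc(1, sizeof(*node));

	if (!node)
		return -ENOMEM;
	node->gsi = gsi;
	node->irq = irq;
	node->next = map_nodes;
	map_nodes = node;
	return 0;
}

/* Returns the irq recorded for gsi, or -1 if there is none. */
int model_irq_from_gsi(unsigned int gsi)
{
	const struct map_node *node;

	for (node = map_nodes; node; node = node->next)
		if (node->gsi == gsi)
			return (int)node->irq;
	return -1;
}

/* Drop the mapping for gsi; returns 0 or -ENOENT if it was not mapped. */
int model_map_remove(unsigned int gsi)
{
	struct map_node **link = &map_nodes;

	while (*link) {
		if ((*link)->gsi == gsi) {
			struct map_node *victim = *link;

			*link = victim->next;	/* unlink before freeing */
			free(victim);
			return 0;
		}
		link = &(*link)->next;
	}
	return -ENOENT;
}
```

After a remove, a later lookup for the same gsi reports "not mapped", so a re-registration would be able to record a fresh irq.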

> 
>> +    return 0;
>> +}
>> +
>>   #ifdef CONFIG_XEN_PV_DOM0
>>   int __init pci_xen_initial_domain(void)
>>   {
>> diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
>> index 27553673e46b..80d4f7faac64 100644
>> --- a/drivers/xen/events/events_base.c
>> +++ b/drivers/xen/events/events_base.c
>> @@ -953,6 +953,43 @@ int xen_irq_from_gsi(unsigned gsi)
>>   }
>>   EXPORT_SYMBOL_GPL(xen_irq_from_gsi);
>>   +int xen_gsi_from_irq(unsigned irq)
>> +{
>> +    struct irq_info *info;
>> +
>> +    list_for_each_entry(info, &xen_irq_list_head, list) {
>> +        if (info->type != IRQT_PIRQ)
>> +            continue;
>> +
>> +        if (info->irq == irq)
>> +            return info->u.pirq.gsi;
>> +    }
>> +
>> +    return -1;
>> +}
>> +EXPORT_SYMBOL_GPL(xen_gsi_from_irq);
>> +
>> +int xen_pvh_add_gsi_irq_map(unsigned gsi, unsigned irq)
>> +{
>> +    int tmp_irq;
>> +    struct irq_info *info;
>> +
>> +    tmp_irq = xen_irq_from_gsi(gsi);
>> +    if (tmp_irq != -1)
>> +        return -EEXIST;
>> +
>> +    info = kzalloc(sizeof(*info), GFP_KERNEL);
>> +    if (info == NULL)
>> +        panic("Unable to allocate metadata for GSI%d\n", gsi);
> 
> Please don't kill the system here, just return -ENOMEM.
Will change in next version.

> 
>> +
>> +    info->type = IRQT_PIRQ;
I am considering whether to use a new type (like IRQT_GSI) here to distinguish these entries from IRQT_PIRQ, because restore_pirqs() processes all IRQT_PIRQ entries.

>> +    info->irq = irq;
>> +    info->u.pirq.gsi = gsi;
>> +    list_add_tail(&info->list, &xen_irq_list_head);
> 
> I think you need some kind of locking to protect changing of the list against
> concurrent accesses.
OK, will add a lock in next version.

> 
>> +
>> +    return 0;
>> +}
>> +
>>   static void __unbind_from_irq(struct irq_info *info, unsigned int irq)
>>   {
>>       evtchn_port_t evtchn;
>> @@ -2295,6 +2332,8 @@ void __init xen_init_IRQ(void)
>>       xen_init_setup_upcall_vector();
>>       xen_alloc_callback_vector();
>>   +    if (xen_pvh_domain())
>> +        pci_xen_pvh_init();
>>         if (xen_hvm_domain()) {
>>           native_init_IRQ();
>> diff --git a/drivers/xen/privcmd.c b/drivers/xen/privcmd.c
>> index 67dfa4778864..11feed529e1d 100644
>> --- a/drivers/xen/privcmd.c
>> +++ b/drivers/xen/privcmd.c
>> @@ -842,6 +842,21 @@ static long privcmd_ioctl_mmap_resource(struct file *file,
>>       return rc;
>>   }
>>   +static long privcmd_ioctl_gsi_from_irq(struct file *file, void __user *udata)
>> +{
>> +    struct privcmd_gsi_from_irq kdata;
>> +
>> +    if (copy_from_user(&kdata, udata, sizeof(kdata)))
>> +        return -EFAULT;
>> +
>> +    kdata.gsi = xen_gsi_from_irq(kdata.irq);
>> +
>> +    if (copy_to_user(udata, &kdata, sizeof(kdata)))
>> +        return -EFAULT;
>> +
>> +    return 0;
> 
> Shouldn't you return an error if xen_gsi_from_irq() returned -1?
Oh, will change in next version.

> 
> 
> Juergen

-- 
Best regards,
Jiqian Chen.

* Re: [RFC KERNEL PATCH v6 3/3] xen/privcmd: Add new syscall to get gsi from irq
  2024-05-10  9:06     ` Chen, Jiqian
@ 2024-05-10  9:53       ` Jürgen Groß
  2024-05-10 10:13         ` Chen, Jiqian
  2024-05-13  7:47       ` Chen, Jiqian
  1 sibling, 1 reply; 16+ messages in thread
From: Jürgen Groß @ 2024-05-10  9:53 UTC (permalink / raw)
  To: Chen, Jiqian
  Cc: Stefano Stabellini, Bjorn Helgaas, Rafael J . Wysocki,
	Roger Pau Monné,
	xen-devel, linux-pci, linux-kernel, linux-acpi, Huang, Ray

On 10.05.24 11:06, Chen, Jiqian wrote:
> Hi,
> 
> On 2024/5/10 14:46, Jürgen Groß wrote:
>> On 19.04.24 05:36, Jiqian Chen wrote:
>>> +
>>> +    info->type = IRQT_PIRQ;
> I am considering whether I need to use a new type(like IRQT_GSI) here to distinguish with IRQT_PIRQ, because function restore_pirqs will process all IRQT_PIRQ.

restore_pirqs() already considers gsi == 0 to be not GSI related. Isn't this
enough?


Juergen

* Re: [RFC KERNEL PATCH v6 3/3] xen/privcmd: Add new syscall to get gsi from irq
  2024-05-10  9:53       ` Jürgen Groß
@ 2024-05-10 10:13         ` Chen, Jiqian
  2024-05-10 10:21           ` Jürgen Groß
  0 siblings, 1 reply; 16+ messages in thread
From: Chen, Jiqian @ 2024-05-10 10:13 UTC (permalink / raw)
  To: Jürgen Groß
  Cc: Stefano Stabellini, Bjorn Helgaas, Rafael J . Wysocki,
	Roger Pau Monné,
	xen-devel, linux-pci, linux-kernel, linux-acpi, Huang, Ray, Chen,
	Jiqian

On 2024/5/10 17:53, Jürgen Groß wrote:
> On 10.05.24 11:06, Chen, Jiqian wrote:
>> Hi,
>>
>> On 2024/5/10 14:46, Jürgen Groß wrote:
>>> On 19.04.24 05:36, Jiqian Chen wrote:
>>>> +
>>>> +    info->type = IRQT_PIRQ;
>> I am considering whether I need to use a new type(like IRQT_GSI) here to distinguish with IRQT_PIRQ, because function restore_pirqs will process all IRQT_PIRQ.
> 
> restore_pirqs() already considers gsi == 0 to be not GSI related. Isn't this
> enough?
No, it is not enough.
xen_pvh_add_gsi_irq_map() adds the gsi/irq mapping, and the gsi value in it is not 0.
Once restore_pirqs() is called, it will do PHYSDEVOP_map_pirq for that gsi, but in PVH dom0 we shouldn't do PHYSDEVOP_map_pirq.

> 
> 
> Juergen

-- 
Best regards,
Jiqian Chen.

* Re: [RFC KERNEL PATCH v6 3/3] xen/privcmd: Add new syscall to get gsi from irq
  2024-05-10 10:13         ` Chen, Jiqian
@ 2024-05-10 10:21           ` Jürgen Groß
  2024-05-10 10:32             ` Chen, Jiqian
  0 siblings, 1 reply; 16+ messages in thread
From: Jürgen Groß @ 2024-05-10 10:21 UTC (permalink / raw)
  To: Chen, Jiqian
  Cc: Stefano Stabellini, Bjorn Helgaas, Rafael J . Wysocki,
	Roger Pau Monné,
	xen-devel, linux-pci, linux-kernel, linux-acpi, Huang, Ray

On 10.05.24 12:13, Chen, Jiqian wrote:
> On 2024/5/10 17:53, Jürgen Groß wrote:
>> On 10.05.24 11:06, Chen, Jiqian wrote:
>>> Hi,
>>>
>>> On 2024/5/10 14:46, Jürgen Groß wrote:
>>>> On 19.04.24 05:36, Jiqian Chen wrote:
>>>>> +
>>>>> +    info->type = IRQT_PIRQ;
>>> I am considering whether I need to use a new type(like IRQT_GSI) here to distinguish with IRQT_PIRQ, because function restore_pirqs will process all IRQT_PIRQ.
>>
>> restore_pirqs() already considers gsi == 0 to be not GSI related. Isn't this
>> enough?
> No, it is not enough.
> xen_pvh_add_gsi_irq_map adds the mapping of gsi and irq, but the value of gsi is not 0,
> once restore_pirqs is called, it will do PHYSDEVOP_map_pirq for that gsi, but in pvh dom0, we shouldn't do PHYSDEVOP_map_pirq.

Okay, then add a new flag to info->u.pirq.flags for that purpose?


Juergen


* Re: [RFC KERNEL PATCH v6 3/3] xen/privcmd: Add new syscall to get gsi from irq
  2024-05-10 10:21           ` Jürgen Groß
@ 2024-05-10 10:32             ` Chen, Jiqian
  2024-05-10 11:27               ` Jürgen Groß
  0 siblings, 1 reply; 16+ messages in thread
From: Chen, Jiqian @ 2024-05-10 10:32 UTC (permalink / raw)
  To: Jürgen Groß
  Cc: Stefano Stabellini, Bjorn Helgaas, Rafael J . Wysocki,
	Roger Pau Monné,
	xen-devel, linux-pci, linux-kernel, linux-acpi, Huang, Ray, Chen,
	Jiqian

On 2024/5/10 18:21, Jürgen Groß wrote:
> On 10.05.24 12:13, Chen, Jiqian wrote:
>> On 2024/5/10 17:53, Jürgen Groß wrote:
>>> On 10.05.24 11:06, Chen, Jiqian wrote:
>>>> Hi,
>>>>
>>>> On 2024/5/10 14:46, Jürgen Groß wrote:
>>>>> On 19.04.24 05:36, Jiqian Chen wrote:
>>>>>> +
>>>>>> +    info->type = IRQT_PIRQ;
>>>> I am considering whether I need to use a new type(like IRQT_GSI) here to distinguish with IRQT_PIRQ, because function restore_pirqs will process all IRQT_PIRQ.
>>>
>>> restore_pirqs() already considers gsi == 0 to be not GSI related. Isn't this
>>> enough?
>> No, it is not enough.
>> xen_pvh_add_gsi_irq_map adds the mapping of gsi and irq, but the value of gsi is not 0,
>> once restore_pirqs is called, it will do PHYSDEVOP_map_pirq for that gsi, but in pvh dom0, we shouldn't do PHYSDEVOP_map_pirq.
> 
> Okay, then add a new flag to info->u.pirq.flags for that purpose?
I feel that adding a new flag to info->u.pirq.flags is not as good as adding a new type to info->type.
restore_pirqs() already skips entries with info->type != IRQT_PIRQ, so a new type needs no change there, whereas a new flag would need an extra condition in restore_pirqs().
Also, this mapping (gsi and irq in PVH) has no pirq at all, so u.pirq.flags is not a suitable place for it.
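The argument for a separate type can be illustrated with a small userspace model. This is only a sketch: the MODEL_ names, struct model_irq_info and model_restore_pirqs() are invented for the example, and the counter stands in for the PHYSDEVOP_map_pirq call that the real restore path would make.

```c
#include <stddef.h>

/* Hypothetical model of the entry types under discussion. */
enum model_irq_type {
	MODEL_IRQT_PIRQ,
	MODEL_IRQT_GSI,	/* proposed type: gsi/irq record with no pirq */
};

struct model_irq_info {
	enum model_irq_type type;
	unsigned int irq;
	unsigned int gsi;
};

/*
 * Model of the restore loop: returns how many entries would be
 * re-mapped. Entries of the new type are skipped by the existing
 * type check, so no extra condition is needed.
 */
size_t model_restore_pirqs(const struct model_irq_info *infos, size_t count)
{
	size_t remapped = 0;
	size_t i;

	for (i = 0; i < count; i++) {
		if (infos[i].type != MODEL_IRQT_PIRQ)
			continue;	/* also filters MODEL_IRQT_GSI */
		if (infos[i].gsi == 0)
			continue;	/* not GSI related */
		remapped++;		/* real code would map the pirq here */
	}
	return remapped;
}
```

With a flag in u.pirq.flags instead, the loop would need a third check; with the new type, the first check already does the job.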

> 
> 
> Juergen
> 

-- 
Best regards,
Jiqian Chen.

* Re: [RFC KERNEL PATCH v6 3/3] xen/privcmd: Add new syscall to get gsi from irq
  2024-05-10 10:32             ` Chen, Jiqian
@ 2024-05-10 11:27               ` Jürgen Groß
  2024-05-11  2:16                 ` Chen, Jiqian
  0 siblings, 1 reply; 16+ messages in thread
From: Jürgen Groß @ 2024-05-10 11:27 UTC (permalink / raw)
  To: Chen, Jiqian
  Cc: Stefano Stabellini, Bjorn Helgaas, Rafael J . Wysocki,
	Roger Pau Monné,
	xen-devel, linux-pci, linux-kernel, linux-acpi, Huang, Ray

On 10.05.24 12:32, Chen, Jiqian wrote:
> On 2024/5/10 18:21, Jürgen Groß wrote:
>> On 10.05.24 12:13, Chen, Jiqian wrote:
>>> On 2024/5/10 17:53, Jürgen Groß wrote:
>>>> On 10.05.24 11:06, Chen, Jiqian wrote:
>>>>> Hi,
>>>>>
>>>>> On 2024/5/10 14:46, Jürgen Groß wrote:
>>>>>> On 19.04.24 05:36, Jiqian Chen wrote:
>>>>>>> +
>>>>>>> +    info->type = IRQT_PIRQ;
>>>>> I am considering whether I need to use a new type(like IRQT_GSI) here to distinguish with IRQT_PIRQ, because function restore_pirqs will process all IRQT_PIRQ.
>>>>
>>>> restore_pirqs() already considers gsi == 0 to be not GSI related. Isn't this
>>>> enough?
>>> No, it is not enough.
>>> xen_pvh_add_gsi_irq_map adds the mapping of gsi and irq, but the value of gsi is not 0,
>>> once restore_pirqs is called, it will do PHYSDEVOP_map_pirq for that gsi, but in pvh dom0, we shouldn't do PHYSDEVOP_map_pirq.
>>
>> Okay, then add a new flag to info->u.pirq.flags for that purpose?
> I feel like adding "new flag to info->u.pirq.flags" is not as good as adding " new type to info->type".
> Because in restore_pirqs, it considers " info->type != IRQT_PIRQ", if adding " new flag to info->u.pirq.flags", we need to add a new condition in restore_pirqs.
> And actually this mapping(gsi and irq of pvh) doesn't have pirq, so it is not suitable to add to u.pirq.flags.

Does this mean there is no other IRQT_PIRQ related activity relevant for those
GSIs/IRQs? In that case I agree to add IRQT_GSI.


Juergen

* Re: [RFC KERNEL PATCH v6 3/3] xen/privcmd: Add new syscall to get gsi from irq
  2024-05-10 11:27               ` Jürgen Groß
@ 2024-05-11  2:16                 ` Chen, Jiqian
  0 siblings, 0 replies; 16+ messages in thread
From: Chen, Jiqian @ 2024-05-11  2:16 UTC (permalink / raw)
  To: Jürgen Groß
  Cc: Stefano Stabellini, Bjorn Helgaas, Rafael J . Wysocki,
	Roger Pau Monné,
	xen-devel, linux-pci, linux-kernel, linux-acpi, Huang, Ray, Chen,
	Jiqian

On 2024/5/10 19:27, Jürgen Groß wrote:
> On 10.05.24 12:32, Chen, Jiqian wrote:
>> On 2024/5/10 18:21, Jürgen Groß wrote:
>>> On 10.05.24 12:13, Chen, Jiqian wrote:
>>>> On 2024/5/10 17:53, Jürgen Groß wrote:
>>>>> On 10.05.24 11:06, Chen, Jiqian wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On 2024/5/10 14:46, Jürgen Groß wrote:
>>>>>>> On 19.04.24 05:36, Jiqian Chen wrote:
>>>>>>>> +
>>>>>>>> +    info->type = IRQT_PIRQ;
>>>>>> I am considering whether I need to use a new type(like IRQT_GSI) here to distinguish with IRQT_PIRQ, because function restore_pirqs will process all IRQT_PIRQ.
>>>>>
>>>>> restore_pirqs() already considers gsi == 0 to be not GSI related. Isn't this
>>>>> enough?
>>>> No, it is not enough.
>>>> xen_pvh_add_gsi_irq_map adds the mapping of gsi and irq, but the value of gsi is not 0,
>>>> once restore_pirqs is called, it will do PHYSDEVOP_map_pirq for that gsi, but in pvh dom0, we shouldn't do PHYSDEVOP_map_pirq.
>>>
>>> Okay, then add a new flag to info->u.pirq.flags for that purpose?
>> I feel like adding "new flag to info->u.pirq.flags" is not as good as adding " new type to info->type".
>> Because in restore_pirqs, it considers " info->type != IRQT_PIRQ", if adding " new flag to info->u.pirq.flags", we need to add a new condition in restore_pirqs.
>> And actually this mapping(gsi and irq of pvh) doesn't have pirq, so it is not suitable to add to u.pirq.flags.
> 
> Does this mean there is no other IRQT_PIRQ related activity relevant for those GSIs/IRQs?
Yes, I think so.
> In that case I agree to add IRQT_GSI.
Thank you!
> 
> 
> Juergen

-- 
Best regards,
Jiqian Chen.

* Re: [RFC KERNEL PATCH v6 3/3] xen/privcmd: Add new syscall to get gsi from irq
  2024-05-10  9:06     ` Chen, Jiqian
  2024-05-10  9:53       ` Jürgen Groß
@ 2024-05-13  7:47       ` Chen, Jiqian
  2024-05-13  7:59         ` Jürgen Groß
  1 sibling, 1 reply; 16+ messages in thread
From: Chen, Jiqian @ 2024-05-13  7:47 UTC (permalink / raw)
  To: Jürgen Groß, Stefano Stabellini
  Cc: Bjorn Helgaas, Rafael J . Wysocki, Roger Pau Monné,
	xen-devel, linux-pci, linux-kernel, linux-acpi, Huang, Ray, Chen,
	Jiqian

Hi,
On 2024/5/10 17:06, Chen, Jiqian wrote:
> Hi,
> 
> On 2024/5/10 14:46, Jürgen Groß wrote:
>> On 19.04.24 05:36, Jiqian Chen wrote:
>>> In PVH dom0, it uses the linux local interrupt mechanism,
>>> when it allocs irq for a gsi, it is dynamic, and follow
>>> the principle of applying first, distributing first. And
>>> the irq number is alloced from small to large, but the
>>> applying gsi number is not, may gsi 38 comes before gsi 28,
>>> it causes the irq number is not equal with the gsi number.
>>> And when passthrough a device, QEMU will use device's gsi
>>> number to do pirq mapping, but the gsi number is got from
>>> file /sys/bus/pci/devices/<sbdf>/irq, irq!= gsi, so it will
>>> fail when mapping.
>>> And in current linux codes, there is no method to translate
>>> irq to gsi for userspace.
>>>
>>> For above purpose, record the relationship of gsi and irq
>>> when PVH dom0 do acpi_register_gsi_ioapic for devices and
>>> adds a new syscall into privcmd to let userspace can get
>>> that translation when they have a need.
>>>
>>> Co-developed-by: Huang Rui <ray.huang@amd.com>
>>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>>> ---
>>>   arch/x86/include/asm/apic.h      |  8 +++++++
>>>   arch/x86/include/asm/xen/pci.h   |  5 ++++
>>>   arch/x86/kernel/acpi/boot.c      |  2 +-
>>>   arch/x86/pci/xen.c               | 21 +++++++++++++++++
>>>   drivers/xen/events/events_base.c | 39 ++++++++++++++++++++++++++++++++
>>>   drivers/xen/privcmd.c            | 19 ++++++++++++++++
>>>   include/uapi/xen/privcmd.h       |  7 ++++++
>>>   include/xen/events.h             |  5 ++++
>>>   8 files changed, 105 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
>>> index 9d159b771dc8..dd4139250895 100644
>>> --- a/arch/x86/include/asm/apic.h
>>> +++ b/arch/x86/include/asm/apic.h
>>> @@ -169,6 +169,9 @@ extern bool apic_needs_pit(void);
>>>     extern void apic_send_IPI_allbutself(unsigned int vector);
>>>   +extern int acpi_register_gsi_ioapic(struct device *dev, u32 gsi,
>>> +                    int trigger, int polarity);
>>> +
>>>   #else /* !CONFIG_X86_LOCAL_APIC */
>>>   static inline void lapic_shutdown(void) { }
>>>   #define local_apic_timer_c2_ok        1
>>> @@ -183,6 +186,11 @@ static inline void apic_intr_mode_init(void) { }
>>>   static inline void lapic_assign_system_vectors(void) { }
>>>   static inline void lapic_assign_legacy_vector(unsigned int i, bool r) { }
>>>   static inline bool apic_needs_pit(void) { return true; }
>>> +static inline int acpi_register_gsi_ioapic(struct device *dev, u32 gsi,
>>> +                    int trigger, int polarity)
>>> +{
>>> +    return (int)gsi;
>>> +}
>>>   #endif /* !CONFIG_X86_LOCAL_APIC */
>>>     #ifdef CONFIG_X86_X2APIC
>>> diff --git a/arch/x86/include/asm/xen/pci.h b/arch/x86/include/asm/xen/pci.h
>>> index 9015b888edd6..aa8ded61fc2d 100644
>>> --- a/arch/x86/include/asm/xen/pci.h
>>> +++ b/arch/x86/include/asm/xen/pci.h
>>> @@ -5,6 +5,7 @@
>>>   #if defined(CONFIG_PCI_XEN)
>>>   extern int __init pci_xen_init(void);
>>>   extern int __init pci_xen_hvm_init(void);
>>> +extern int __init pci_xen_pvh_init(void);
>>>   #define pci_xen 1
>>>   #else
>>>   #define pci_xen 0
>>> @@ -13,6 +14,10 @@ static inline int pci_xen_hvm_init(void)
>>>   {
>>>       return -1;
>>>   }
>>> +static inline int pci_xen_pvh_init(void)
>>> +{
>>> +    return -1;
>>> +}
>>>   #endif
>>>   #ifdef CONFIG_XEN_PV_DOM0
>>>   int __init pci_xen_initial_domain(void);
>>> diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
>>> index 85a3ce2a3666..72c73458c083 100644
>>> --- a/arch/x86/kernel/acpi/boot.c
>>> +++ b/arch/x86/kernel/acpi/boot.c
>>> @@ -749,7 +749,7 @@ static int acpi_register_gsi_pic(struct device *dev, u32 gsi,
>>>   }
>>>     #ifdef CONFIG_X86_LOCAL_APIC
>>> -static int acpi_register_gsi_ioapic(struct device *dev, u32 gsi,
>>> +int acpi_register_gsi_ioapic(struct device *dev, u32 gsi,
>>>                       int trigger, int polarity)
>>>   {
>>>       int irq = gsi;
>>> diff --git a/arch/x86/pci/xen.c b/arch/x86/pci/xen.c
>>> index 652cd53e77f6..f056ab5c0a06 100644
>>> --- a/arch/x86/pci/xen.c
>>> +++ b/arch/x86/pci/xen.c
>>> @@ -114,6 +114,21 @@ static int acpi_register_gsi_xen_hvm(struct device *dev, u32 gsi,
>>>                    false /* no mapping of GSI to PIRQ */);
>>>   }
>>>   +static int acpi_register_gsi_xen_pvh(struct device *dev, u32 gsi,
>>> +                    int trigger, int polarity)
>>> +{
>>> +    int irq;
>>> +
>>> +    irq = acpi_register_gsi_ioapic(dev, gsi, trigger, polarity);
>>> +    if (irq < 0)
>>> +        return irq;
>>> +
>>> +    if (xen_pvh_add_gsi_irq_map(gsi, irq) == -EEXIST)
>>> +        printk(KERN_INFO "Already map the GSI :%u and IRQ: %d\n", gsi, irq);
>>> +
>>> +    return irq;
>>> +}
>>> +
>>>   #ifdef CONFIG_XEN_PV_DOM0
>>>   static int xen_register_gsi(u32 gsi, int triggering, int polarity)
>>>   {
>>> @@ -558,6 +573,12 @@ int __init pci_xen_hvm_init(void)
>>>       return 0;
>>>   }
>>>   +int __init pci_xen_pvh_init(void)
>>> +{
>>> +    __acpi_register_gsi = acpi_register_gsi_xen_pvh;
>>
>> No support for unregistering the gsi again?
> __acpi_unregister_gsi is set in function acpi_set_irq_model_ioapic.
> Maybe I need to use a new function to call acpi_unregister_gsi_ioapic and remove the mapping of irq and gsi from xen_irq_list_head ?
When I tried to support unregistering the gsi and removing the mapping when a device is disabled,
I hit a problem: after running "xl pci-assignable-add 03:00.0", the call chain pcistub_init_device->xen_pcibk_reset_device->pci_disable_device->pcibios_disable_device->acpi_pci_irq_disable->__acpi_unregister_gsi
removed the mapping, so when user space later called xen_gsi_from_irq to get the gsi, it failed.

To cover the above case, I want to change the implementation of xen_gsi_from_irq to take the sbdf instead of the irq,
because the sbdf and gsi of a device are unique and will not change even if the device is disabled or re-enabled.

Do you think this kind of change is acceptable?

> 
>>
>>> +    return 0;
>>> +}
>>> +
>>
>> Juergen
> 

-- 
Best regards,
Jiqian Chen.


* Re: [RFC KERNEL PATCH v6 3/3] xen/privcmd: Add new syscall to get gsi from irq
  2024-05-13  7:47       ` Chen, Jiqian
@ 2024-05-13  7:59         ` Jürgen Groß
  0 siblings, 0 replies; 16+ messages in thread
From: Jürgen Groß @ 2024-05-13  7:59 UTC (permalink / raw)
  To: Chen, Jiqian, Stefano Stabellini
  Cc: Bjorn Helgaas, Rafael J . Wysocki, Roger Pau Monné,
	xen-devel, linux-pci, linux-kernel, linux-acpi, Huang, Ray

On 13.05.24 09:47, Chen, Jiqian wrote:
> Hi,
> On 2024/5/10 17:06, Chen, Jiqian wrote:
>> Hi,
>>
>> On 2024/5/10 14:46, Jürgen Groß wrote:
>>> On 19.04.24 05:36, Jiqian Chen wrote:
>>>> PVH dom0 uses the Linux local interrupt mechanism. When it
>>>> allocates an irq for a gsi, the allocation is dynamic and
>>>> first come, first served. The irq numbers are allocated from
>>>> small to large, but the gsi numbers being registered are not
>>>> necessarily in order: gsi 38 may come before gsi 28, so the
>>>> irq number may not equal the gsi number.
>>>> When passing through a device, QEMU uses the device's gsi
>>>> number to do the pirq mapping, but it gets that number from
>>>> the file /sys/bus/pci/devices/<sbdf>/irq. Since irq != gsi,
>>>> the mapping fails.
>>>> The current Linux code has no method for userspace to
>>>> translate an irq to a gsi.
>>>>
>>>> To solve this, record the relationship between gsi and irq
>>>> when PVH dom0 calls acpi_register_gsi_ioapic for a device, and
>>>> add a new syscall to privcmd so that userspace can get that
>>>> translation when needed.
>>>>
>>>> Co-developed-by: Huang Rui <ray.huang@amd.com>
>>>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>>>> ---
>>>>    arch/x86/include/asm/apic.h      |  8 +++++++
>>>>    arch/x86/include/asm/xen/pci.h   |  5 ++++
>>>>    arch/x86/kernel/acpi/boot.c      |  2 +-
>>>>    arch/x86/pci/xen.c               | 21 +++++++++++++++++
>>>>    drivers/xen/events/events_base.c | 39 ++++++++++++++++++++++++++++++++
>>>>    drivers/xen/privcmd.c            | 19 ++++++++++++++++
>>>>    include/uapi/xen/privcmd.h       |  7 ++++++
>>>>    include/xen/events.h             |  5 ++++
>>>>    8 files changed, 105 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
>>>> index 9d159b771dc8..dd4139250895 100644
>>>> --- a/arch/x86/include/asm/apic.h
>>>> +++ b/arch/x86/include/asm/apic.h
>>>> @@ -169,6 +169,9 @@ extern bool apic_needs_pit(void);
>>>>      extern void apic_send_IPI_allbutself(unsigned int vector);
>>>>    +extern int acpi_register_gsi_ioapic(struct device *dev, u32 gsi,
>>>> +                    int trigger, int polarity);
>>>> +
>>>>    #else /* !CONFIG_X86_LOCAL_APIC */
>>>>    static inline void lapic_shutdown(void) { }
>>>>    #define local_apic_timer_c2_ok        1
>>>> @@ -183,6 +186,11 @@ static inline void apic_intr_mode_init(void) { }
>>>>    static inline void lapic_assign_system_vectors(void) { }
>>>>    static inline void lapic_assign_legacy_vector(unsigned int i, bool r) { }
>>>>    static inline bool apic_needs_pit(void) { return true; }
>>>> +static inline int acpi_register_gsi_ioapic(struct device *dev, u32 gsi,
>>>> +                    int trigger, int polarity)
>>>> +{
>>>> +    return (int)gsi;
>>>> +}
>>>>    #endif /* !CONFIG_X86_LOCAL_APIC */
>>>>      #ifdef CONFIG_X86_X2APIC
>>>> diff --git a/arch/x86/include/asm/xen/pci.h b/arch/x86/include/asm/xen/pci.h
>>>> index 9015b888edd6..aa8ded61fc2d 100644
>>>> --- a/arch/x86/include/asm/xen/pci.h
>>>> +++ b/arch/x86/include/asm/xen/pci.h
>>>> @@ -5,6 +5,7 @@
>>>>    #if defined(CONFIG_PCI_XEN)
>>>>    extern int __init pci_xen_init(void);
>>>>    extern int __init pci_xen_hvm_init(void);
>>>> +extern int __init pci_xen_pvh_init(void);
>>>>    #define pci_xen 1
>>>>    #else
>>>>    #define pci_xen 0
>>>> @@ -13,6 +14,10 @@ static inline int pci_xen_hvm_init(void)
>>>>    {
>>>>        return -1;
>>>>    }
>>>> +static inline int pci_xen_pvh_init(void)
>>>> +{
>>>> +    return -1;
>>>> +}
>>>>    #endif
>>>>    #ifdef CONFIG_XEN_PV_DOM0
>>>>    int __init pci_xen_initial_domain(void);
>>>> diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
>>>> index 85a3ce2a3666..72c73458c083 100644
>>>> --- a/arch/x86/kernel/acpi/boot.c
>>>> +++ b/arch/x86/kernel/acpi/boot.c
>>>> @@ -749,7 +749,7 @@ static int acpi_register_gsi_pic(struct device *dev, u32 gsi,
>>>>    }
>>>>      #ifdef CONFIG_X86_LOCAL_APIC
>>>> -static int acpi_register_gsi_ioapic(struct device *dev, u32 gsi,
>>>> +int acpi_register_gsi_ioapic(struct device *dev, u32 gsi,
>>>>                        int trigger, int polarity)
>>>>    {
>>>>        int irq = gsi;
>>>> diff --git a/arch/x86/pci/xen.c b/arch/x86/pci/xen.c
>>>> index 652cd53e77f6..f056ab5c0a06 100644
>>>> --- a/arch/x86/pci/xen.c
>>>> +++ b/arch/x86/pci/xen.c
>>>> @@ -114,6 +114,21 @@ static int acpi_register_gsi_xen_hvm(struct device *dev, u32 gsi,
>>>>                     false /* no mapping of GSI to PIRQ */);
>>>>    }
>>>>    +static int acpi_register_gsi_xen_pvh(struct device *dev, u32 gsi,
>>>> +                    int trigger, int polarity)
>>>> +{
>>>> +    int irq;
>>>> +
>>>> +    irq = acpi_register_gsi_ioapic(dev, gsi, trigger, polarity);
>>>> +    if (irq < 0)
>>>> +        return irq;
>>>> +
>>>> +    if (xen_pvh_add_gsi_irq_map(gsi, irq) == -EEXIST)
>>>> +        printk(KERN_INFO "Already map the GSI :%u and IRQ: %d\n", gsi, irq);
>>>> +
>>>> +    return irq;
>>>> +}
>>>> +
>>>>    #ifdef CONFIG_XEN_PV_DOM0
>>>>    static int xen_register_gsi(u32 gsi, int triggering, int polarity)
>>>>    {
>>>> @@ -558,6 +573,12 @@ int __init pci_xen_hvm_init(void)
>>>>        return 0;
>>>>    }
>>>>    +int __init pci_xen_pvh_init(void)
>>>> +{
>>>> +    __acpi_register_gsi = acpi_register_gsi_xen_pvh;
>>>
>>> No support for unregistering the gsi again?
>> __acpi_unregister_gsi is set in function acpi_set_irq_model_ioapic.
>> Maybe I need to use a new function to call acpi_unregister_gsi_ioapic and remove the mapping of irq and gsi from xen_irq_list_head ?
> When I tried to support unregistering the gsi and removing the mapping when a device is disabled,
> I hit a problem: after running "xl pci-assignable-add 03:00.0", the call chain pcistub_init_device->xen_pcibk_reset_device->pci_disable_device->pcibios_disable_device->acpi_pci_irq_disable->__acpi_unregister_gsi
> removed the mapping, so when user space later called xen_gsi_from_irq to get the gsi, it failed.
> 
> To cover the above case, I want to change the implementation of xen_gsi_from_irq to take the sbdf instead of the irq,
> because the sbdf and gsi of a device are unique and will not change even if the device is disabled or re-enabled.
> 
> Do you think this kind of change is acceptable?

Yes, I think so.


Juergen


end of thread, other threads:[~2024-05-13  7:59 UTC | newest]

Thread overview: 16+ messages
2024-04-19  3:36 [RFC KERNEL PATCH v6 0/3] Support device passthrough when dom0 is PVH on Xen Jiqian Chen
2024-04-19  3:36 ` [KERNEL PATCH v6 1/3] xen/pci: Add xen_reset_device_state function Jiqian Chen
2024-04-19  3:36 ` [RFC KERNEL PATCH v6 2/3] xen/pvh: Setup gsi for passthrough device Jiqian Chen
2024-05-10  7:48   ` Juergen Gross
2024-05-10  8:42     ` Chen, Jiqian
2024-04-19  3:36 ` [RFC KERNEL PATCH v6 3/3] xen/privcmd: Add new syscall to get gsi from irq Jiqian Chen
2024-05-10  6:46   ` Jürgen Groß
2024-05-10  9:06     ` Chen, Jiqian
2024-05-10  9:53       ` Jürgen Groß
2024-05-10 10:13         ` Chen, Jiqian
2024-05-10 10:21           ` Jürgen Groß
2024-05-10 10:32             ` Chen, Jiqian
2024-05-10 11:27               ` Jürgen Groß
2024-05-11  2:16                 ` Chen, Jiqian
2024-05-13  7:47       ` Chen, Jiqian
2024-05-13  7:59         ` Jürgen Groß
