All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC KERNEL PATCH v2 0/3] Support device passthrough when dom0 is PVH on Xen
@ 2023-11-24 10:31 Jiqian Chen
  2023-11-24 10:31 ` [RFC KERNEL PATCH v2 1/3] xen/pci: Add xen_reset_device_state function Jiqian Chen
                   ` (2 more replies)
  0 siblings, 3 replies; 49+ messages in thread
From: Jiqian Chen @ 2023-11-24 10:31 UTC (permalink / raw)
  To: Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko,
	Thomas Gleixner, Boris Ostrovsky, Rafael J . Wysocki, Len Brown,
	Bjorn Helgaas, Roger Pau Monné,
	xen-devel, linux-kernel, linux-acpi
  Cc: Stefano Stabellini, Alex Deucher, Christian Koenig,
	Stewart Hildebrand, Xenia Ragiadakou, Honglei Huang, Julia Zhang,
	Huang Rui, Jiqian Chen

Hi All,

This series of patches are the v2 of the implementation of passthrough when dom0 is PVH on Xen.
We sent the v1 to upstream before, but the v1 had so many problems and we got lots of suggestions.
I will introduce all issues that these patches try to fix and the differences between v1 and v2.

Issues we encountered:
1. pci_stub failed to write bar for a passthrough device.
Problem: when we run “sudo xl pci-assignable-add <sbdf>” to assign a device, pci_stub will
call “pcistub_init_device() -> pci_restore_state() -> pci_restore_config_space() ->
pci_restore_config_space_range() -> pci_restore_config_dword() -> pci_write_config_dword()”,
the pci config write will trigger an io interrupt to bar_write() in the xen, but the
bar->enabled was set before, the write is not allowed now, and then when Qemu config the
passthrough device in xen_pt_realize(), it gets invalid bar values.

Reason: the reason is that we don't tell vPCI that the device has been reset, so the current
cached state in pdev->vpci is all out of date and is different from the real device state.

Solution: to solve this problem, the first patch of kernel(xen/pci: Add xen_reset_device_state
function) and the fist patch of xen(xen/vpci: Clear all vpci status of device) add a new
hypercall to reset the state stored in vPCI when the state of real device has changed.
Thank Roger for the suggestion of this v2, and it is different from v1
(https://lore.kernel.org/xen-devel/20230312075455.450187-3-ray.huang@amd.com/), v1 simply allow
domU to write pci bar, it does not comply with the design principles of vPCI.

2. failed to do PHYSDEVOP_map_pirq when dom0 is PVH
Problem: HVM domU will do PHYSDEVOP_map_pirq for a passthrough device by using gsi. See
xen_pt_realize->xc_physdev_map_pirq and pci_add_dm_done->xc_physdev_map_pirq. Then
xc_physdev_map_pirq will call into Xen, but in hvm_physdev_op(), PHYSDEVOP_map_pirq is not allowed.

Reason: In hvm_physdev_op(), the variable "currd" is PVH dom0 and PVH has no X86_EMU_USE_PIRQ flag,
it will fail at has_pirq check.

Solution: I think we may need to allow PHYSDEVOP_map_pirq when "currd" is dom0 (at present dom0 is
PVH). The second patch of xen(x86/pvh: Open PHYSDEVOP_map_pirq for PVH dom0) allow PVH dom0 do
PHYSDEVOP_map_pirq. This v2 patch is better than v1, v1 simply remove the has_pirq check(xen
https://lore.kernel.org/xen-devel/20230312075455.450187-4-ray.huang@amd.com/).

3. the gsi of a passthrough device doesn't be unmasked
 3.1 failed to check the permission of pirq
 3.2 the gsi of passthrough device was not registered in PVH dom0

Problem:
3.1 callback function pci_add_dm_done() will be called when qemu config a passthrough device for domU.
This function will call xc_domain_irq_permission()-> pirq_access_permitted() to check if the gsi has
corresponding mappings in dom0. But it didn’t, so failed. See
XEN_DOMCTL_irq_permission->pirq_access_permitted, "current" is PVH dom0 and it return irq is 0.
3.2 it's possible for a gsi (iow: vIO-APIC pin) to never get registered on PVH dom0, because the
devices of PVH are using MSI(-X) interrupts. However, the IO-APIC pin must be configured for it to be
able to be mapped into a domU.

Reason: After searching codes, I find "map_pirq" and "register_gsi" will be done in function
vioapic_write_redirent->vioapic_hwdom_map_gsi when the gsi(aka ioapic's pin) is unmasked in PVH dom0.
So the two problems can be concluded to that the gsi of a passthrough device doesn't be unmasked.

Solution: to solve these problems, the second patch of kernel(xen/pvh: Unmask irq for passthrough
device in PVH dom0) call the unmask_irq() when we assign a device to be passthrough. So that
passthrough devices can have the mapping of gsi on PVH dom0 and gsi can be registered. This v2 patch
is different from the v1(
kernel https://lore.kernel.org/xen-devel/20230312120157.452859-5-ray.huang@amd.com/,
kernel https://lore.kernel.org/xen-devel/20230312120157.452859-5-ray.huang@amd.com/ and
xen https://lore.kernel.org/xen-devel/20230312075455.450187-5-ray.huang@amd.com/),
v1 performed "map_pirq" and "register_gsi" on all pci devices on PVH dom0, which is unnecessary and
may cause multiple registration.

4. failed to map pirq for gsi
Problem: qemu will call xc_physdev_map_pirq() to map a passthrough device’s gsi to pirq in function
xen_pt_realize(). But failed.

Reason: According to the implement of xc_physdev_map_pirq(), it needs gsi instead of irq, but qemu
pass irq to it and treat irq as gsi, it is got from file /sys/bus/pci/devices/xxxx:xx:xx.x/irq in
function xen_host_pci_device_get(). But actually the gsi number is not equal with irq. On PVH dom0,
when it allocates irq for a gsi in function acpi_register_gsi_ioapic(), allocation is dynamic, and
follow the principle of applying first, distributing first. And if you debug the kernel codes(see
function __irq_alloc_descs), you will find the irq number is allocated from small to large by order,
but the applying gsi number is not, gsi 38 may come before gsi 28, that causes gsi 38 get a smaller
irq number than gsi 28, and then gsi != irq.

Solution: we can record the relation between gsi and irq, then when userspace(qemu) want to use gsi,
we can do a translation. The third patch of kernel(xen/privcmd: Add new syscall to get gsi from irq)
records all the relations in acpi_register_gsi_xen_pvh() when dom0 initialize pci devices, and provide
a syscall for userspace to get the gsi from irq. The third patch of xen(tools: Add new function to get
gsi from irq) add a new function xc_physdev_gsi_from_irq() to call the new syscall added on kernel side.
And then userspace can use that function to get gsi. Then xc_physdev_map_pirq() will success. This v2
patch is the same as v1(
kernel https://lore.kernel.org/xen-devel/20230312120157.452859-6-ray.huang@amd.com/ and
xen https://lore.kernel.org/xen-devel/20230312075455.450187-6-ray.huang@amd.com/)

About the v2 patch of qemu, just change an included head file, other are similar to the v1 (
qemu https://lore.kernel.org/xen-devel/20230312092244.451465-19-ray.huang@amd.com/), just call
xc_physdev_gsi_from_irq() to get gsi from irq.


Jiqian Chen (3):
  xen/pci: Add xen_reset_device_state function
  xen/pvh: Unmask irq for passthrough device in PVH dom0
  xen/privcmd: Add new syscall to get gsi from irq

 arch/x86/include/asm/apic.h        |  7 ++++++
 arch/x86/include/asm/xen/pci.h     |  5 ++++
 arch/x86/kernel/acpi/boot.c        |  2 +-
 arch/x86/pci/xen.c                 | 21 ++++++++++++++++
 drivers/xen/events/events_base.c   | 39 ++++++++++++++++++++++++++++++
 drivers/xen/pci.c                  | 12 +++++++++
 drivers/xen/privcmd.c              | 20 +++++++++++++++
 drivers/xen/xen-pciback/pci_stub.c | 10 ++++++++
 include/linux/irq.h                |  1 +
 include/uapi/xen/privcmd.h         |  7 ++++++
 include/xen/events.h               |  5 ++++
 include/xen/interface/physdev.h    |  2 ++
 include/xen/pci.h                  |  6 +++++
 kernel/irq/chip.c                  |  1 +
 kernel/irq/internals.h             |  1 -
 kernel/irq/irqdesc.c               |  2 +-
 16 files changed, 138 insertions(+), 3 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 49+ messages in thread

* [RFC KERNEL PATCH v2 1/3] xen/pci: Add xen_reset_device_state function
  2023-11-24 10:31 [RFC KERNEL PATCH v2 0/3] Support device passthrough when dom0 is PVH on Xen Jiqian Chen
@ 2023-11-24 10:31 ` Jiqian Chen
  2023-11-30  3:46   ` Stefano Stabellini
  2023-12-04  7:58   ` Thomas Gleixner
  2023-11-24 10:31 ` [RFC KERNEL PATCH v2 2/3] xen/pvh: Unmask irq for passthrough device in PVH dom0 Jiqian Chen
  2023-11-24 10:31 ` [RFC KERNEL PATCH v2 3/3] xen/privcmd: Add new syscall to get gsi from irq Jiqian Chen
  2 siblings, 2 replies; 49+ messages in thread
From: Jiqian Chen @ 2023-11-24 10:31 UTC (permalink / raw)
  To: Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko,
	Thomas Gleixner, Boris Ostrovsky, Rafael J . Wysocki, Len Brown,
	Bjorn Helgaas, Roger Pau Monné,
	xen-devel, linux-kernel, linux-acpi
  Cc: Stefano Stabellini, Alex Deucher, Christian Koenig,
	Stewart Hildebrand, Xenia Ragiadakou, Honglei Huang, Julia Zhang,
	Huang Rui, Jiqian Chen, Huang Rui

When device on dom0 side has been reset, the vpci on Xen side
won't get notification, so that the cached state in vpci is
all out of date with the real device state.
To solve that problem, this patch add a function to clear all
vpci device state when device is reset on dom0 side.

And call that function in pcistub_init_device. Because when
we use "pci-assignable-add" to assign a passthrough device in
Xen, it will reset passthrough device and the vpci state will
out of date, and then device will fail to restore bar state.

Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
---
 drivers/xen/pci.c                  | 12 ++++++++++++
 drivers/xen/xen-pciback/pci_stub.c |  3 +++
 include/xen/interface/physdev.h    |  2 ++
 include/xen/pci.h                  |  6 ++++++
 4 files changed, 23 insertions(+)

diff --git a/drivers/xen/pci.c b/drivers/xen/pci.c
index 72d4e3f193af..e9b30bc09139 100644
--- a/drivers/xen/pci.c
+++ b/drivers/xen/pci.c
@@ -177,6 +177,18 @@ static int xen_remove_device(struct device *dev)
 	return r;
 }
 
+int xen_reset_device_state(const struct pci_dev *dev)
+{
+	struct physdev_pci_device device = {
+		.seg = pci_domain_nr(dev->bus),
+		.bus = dev->bus->number,
+		.devfn = dev->devfn
+	};
+
+	return HYPERVISOR_physdev_op(PHYSDEVOP_pci_device_state_reset, &device);
+}
+EXPORT_SYMBOL_GPL(xen_reset_device_state);
+
 static int xen_pci_notifier(struct notifier_block *nb,
 			    unsigned long action, void *data)
 {
diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
index e34b623e4b41..5a96b6c66c07 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -421,6 +421,9 @@ static int pcistub_init_device(struct pci_dev *dev)
 	else {
 		dev_dbg(&dev->dev, "resetting (FLR, D3, etc) the device\n");
 		__pci_reset_function_locked(dev);
+		err = xen_reset_device_state(dev);
+		if (err)
+			goto config_release;
 		pci_restore_state(dev);
 	}
 	/* Now disable the device (this also ensures some private device
diff --git a/include/xen/interface/physdev.h b/include/xen/interface/physdev.h
index a237af867873..231526f80f6c 100644
--- a/include/xen/interface/physdev.h
+++ b/include/xen/interface/physdev.h
@@ -263,6 +263,8 @@ struct physdev_pci_device {
     uint8_t devfn;
 };
 
+#define PHYSDEVOP_pci_device_state_reset     32
+
 #define PHYSDEVOP_DBGP_RESET_PREPARE    1
 #define PHYSDEVOP_DBGP_RESET_DONE       2
 
diff --git a/include/xen/pci.h b/include/xen/pci.h
index b8337cf85fd1..b2e2e856efd6 100644
--- a/include/xen/pci.h
+++ b/include/xen/pci.h
@@ -4,10 +4,16 @@
 #define __XEN_PCI_H__
 
 #if defined(CONFIG_XEN_DOM0)
+int xen_reset_device_state(const struct pci_dev *dev);
 int xen_find_device_domain_owner(struct pci_dev *dev);
 int xen_register_device_domain_owner(struct pci_dev *dev, uint16_t domain);
 int xen_unregister_device_domain_owner(struct pci_dev *dev);
 #else
+static inline int xen_reset_device_state(const struct pci_dev *dev)
+{
+	return -1;
+}
+
 static inline int xen_find_device_domain_owner(struct pci_dev *dev)
 {
 	return -1;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [RFC KERNEL PATCH v2 2/3] xen/pvh: Unmask irq for passthrough device in PVH dom0
  2023-11-24 10:31 [RFC KERNEL PATCH v2 0/3] Support device passthrough when dom0 is PVH on Xen Jiqian Chen
  2023-11-24 10:31 ` [RFC KERNEL PATCH v2 1/3] xen/pci: Add xen_reset_device_state function Jiqian Chen
@ 2023-11-24 10:31 ` Jiqian Chen
  2023-11-25  7:24   ` kernel test robot
                     ` (3 more replies)
  2023-11-24 10:31 ` [RFC KERNEL PATCH v2 3/3] xen/privcmd: Add new syscall to get gsi from irq Jiqian Chen
  2 siblings, 4 replies; 49+ messages in thread
From: Jiqian Chen @ 2023-11-24 10:31 UTC (permalink / raw)
  To: Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko,
	Thomas Gleixner, Boris Ostrovsky, Rafael J . Wysocki, Len Brown,
	Bjorn Helgaas, Roger Pau Monné,
	xen-devel, linux-kernel, linux-acpi
  Cc: Stefano Stabellini, Alex Deucher, Christian Koenig,
	Stewart Hildebrand, Xenia Ragiadakou, Honglei Huang, Julia Zhang,
	Huang Rui, Jiqian Chen, Huang Rui

This patch is to solve two problems we encountered when we try to
passthrough a device to hvm domU base on Xen PVH dom0.

First, hvm guest will alloc a pirq and irq for a passthrough device
by using gsi, before that, the gsi must first has a mapping in dom0,
see Xen code pci_add_dm_done->xc_domain_irq_permission, it will call
into Xen and check whether dom0 has the mapping. See
XEN_DOMCTL_irq_permission->pirq_access_permitted, "current" is PVH
dom0 and it return irq is 0, and then return -EPERM.
This is because the passthrough device doesn't do PHYSDEVOP_map_pirq
when thay are enabled.

Second, in PVH dom0, the gsi of a passthrough device doesn't get
registered, but gsi must be configured for it to be able to be
mapped into a domU.

After searching codes, we can find map_pirq and register_gsi will be
done in function vioapic_write_redirent->vioapic_hwdom_map_gsi when
the gsi(aka ioapic's pin) is unmasked in PVH dom0. So the problems
can be conclude to that the gsi of a passthrough device doesn't be
unmasked.

To solve the unmaske problem, this patch call the unmask_irq when we
assign a device to be passthrough. So that the gsi can get registered
and mapped in PVH dom0.

Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
---
 drivers/xen/xen-pciback/pci_stub.c | 7 +++++++
 include/linux/irq.h                | 1 +
 kernel/irq/chip.c                  | 1 +
 kernel/irq/internals.h             | 1 -
 kernel/irq/irqdesc.c               | 2 +-
 5 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
index 5a96b6c66c07..b83d02bcc76c 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -357,6 +357,7 @@ static int pcistub_match(struct pci_dev *dev)
 static int pcistub_init_device(struct pci_dev *dev)
 {
 	struct xen_pcibk_dev_data *dev_data;
+	struct irq_desc *desc = NULL;
 	int err = 0;
 
 	dev_dbg(&dev->dev, "initializing...\n");
@@ -399,6 +400,12 @@ static int pcistub_init_device(struct pci_dev *dev)
 	if (err)
 		goto config_release;
 
+	if (xen_initial_domain() && xen_pvh_domain()) {
+		if (dev->irq <= 0 || !(desc = irq_to_desc(dev->irq)))
+			goto config_release;
+		unmask_irq(desc);
+	}
+
 	if (dev->msix_cap) {
 		struct physdev_pci_device ppdev = {
 			.seg = pci_domain_nr(dev->bus),
diff --git a/include/linux/irq.h b/include/linux/irq.h
index 90081afa10ce..44650ca178d9 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -659,6 +659,7 @@ extern void handle_percpu_irq(struct irq_desc *desc);
 extern void handle_percpu_devid_irq(struct irq_desc *desc);
 extern void handle_bad_irq(struct irq_desc *desc);
 extern void handle_nested_irq(unsigned int irq);
+extern void unmask_irq(struct irq_desc *desc);
 
 extern void handle_fasteoi_nmi(struct irq_desc *desc);
 extern void handle_percpu_devid_fasteoi_nmi(struct irq_desc *desc);
diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index dc94e0bf2c94..fd67b40b678d 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -439,6 +439,7 @@ void unmask_irq(struct irq_desc *desc)
 		irq_state_clr_masked(desc);
 	}
 }
+EXPORT_SYMBOL_GPL(unmask_irq);
 
 void unmask_threaded_irq(struct irq_desc *desc)
 {
diff --git a/kernel/irq/internals.h b/kernel/irq/internals.h
index bcc7f21db9ee..d08e3e7b2819 100644
--- a/kernel/irq/internals.h
+++ b/kernel/irq/internals.h
@@ -95,7 +95,6 @@ extern void irq_disable(struct irq_desc *desc);
 extern void irq_percpu_enable(struct irq_desc *desc, unsigned int cpu);
 extern void irq_percpu_disable(struct irq_desc *desc, unsigned int cpu);
 extern void mask_irq(struct irq_desc *desc);
-extern void unmask_irq(struct irq_desc *desc);
 extern void unmask_threaded_irq(struct irq_desc *desc);
 
 #ifdef CONFIG_SPARSE_IRQ
diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
index 27ca1c866f29..5977efed31b5 100644
--- a/kernel/irq/irqdesc.c
+++ b/kernel/irq/irqdesc.c
@@ -380,7 +380,7 @@ struct irq_desc *irq_to_desc(unsigned int irq)
 {
 	return mtree_load(&sparse_irqs, irq);
 }
-#ifdef CONFIG_KVM_BOOK3S_64_HV_MODULE
+#if defined CONFIG_KVM_BOOK3S_64_HV_MODULE || defined CONFIG_XEN_PVH
 EXPORT_SYMBOL_GPL(irq_to_desc);
 #endif
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [RFC KERNEL PATCH v2 3/3] xen/privcmd: Add new syscall to get gsi from irq
  2023-11-24 10:31 [RFC KERNEL PATCH v2 0/3] Support device passthrough when dom0 is PVH on Xen Jiqian Chen
  2023-11-24 10:31 ` [RFC KERNEL PATCH v2 1/3] xen/pci: Add xen_reset_device_state function Jiqian Chen
  2023-11-24 10:31 ` [RFC KERNEL PATCH v2 2/3] xen/pvh: Unmask irq for passthrough device in PVH dom0 Jiqian Chen
@ 2023-11-24 10:31 ` Jiqian Chen
  2 siblings, 0 replies; 49+ messages in thread
From: Jiqian Chen @ 2023-11-24 10:31 UTC (permalink / raw)
  To: Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko,
	Thomas Gleixner, Boris Ostrovsky, Rafael J . Wysocki, Len Brown,
	Bjorn Helgaas, Roger Pau Monné,
	xen-devel, linux-kernel, linux-acpi
  Cc: Stefano Stabellini, Alex Deucher, Christian Koenig,
	Stewart Hildebrand, Xenia Ragiadakou, Honglei Huang, Julia Zhang,
	Huang Rui, Jiqian Chen, Huang Rui

In PVH dom0, it uses the linux local interrupt mechanism,
when it allocs irq for a gsi, it is dynamic, and follow
the principle of applying first, distributing first. And
if you debug the codes, you will find the irq number is
alloced from small to large, but the applying gsi number
is not, may gsi 38 comes before gsi 28, it causes the irq
number is not equal with the gsi number.
And when we passthrough a device, QEMU will use device's gsi
number to do mapping actions, see xen_pt_realize->
xc_physdev_map_pirq, but the gsi number is got from file
/sys/bus/pci/devices/xxxx:xx:xx.x/irq, irq!= gsi, so it will
fail when mapping.
And in current linux codes, there is no method to translate
irq to gsi for userspace.

For above purpose, this patch record the relationship of gsi
and irq when PVH dom0 do acpi_register_gsi_ioapic for devices
and adds a new syscall into privcmd to let userspace can get
that translation when they have a need.

Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
---
 arch/x86/include/asm/apic.h      |  7 ++++++
 arch/x86/include/asm/xen/pci.h   |  5 ++++
 arch/x86/kernel/acpi/boot.c      |  2 +-
 arch/x86/pci/xen.c               | 21 +++++++++++++++++
 drivers/xen/events/events_base.c | 39 ++++++++++++++++++++++++++++++++
 drivers/xen/privcmd.c            | 20 ++++++++++++++++
 include/uapi/xen/privcmd.h       |  7 ++++++
 include/xen/events.h             |  5 ++++
 8 files changed, 105 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index d21f48f1c242..5646444285ac 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -169,6 +169,8 @@ extern bool apic_needs_pit(void);
 
 extern void apic_send_IPI_allbutself(unsigned int vector);
 
+extern int acpi_register_gsi_ioapic(struct device *dev, u32 gsi,
+				    int trigger, int polarity);
 #else /* !CONFIG_X86_LOCAL_APIC */
 static inline void lapic_shutdown(void) { }
 #define local_apic_timer_c2_ok		1
@@ -183,6 +185,11 @@ static inline void apic_intr_mode_init(void) { }
 static inline void lapic_assign_system_vectors(void) { }
 static inline void lapic_assign_legacy_vector(unsigned int i, bool r) { }
 static inline bool apic_needs_pit(void) { return true; }
+static inline int acpi_register_gsi_ioapic(struct device *dev, u32 gsi,
+				    int trigger, int polarity)
+{
+	return (int)gsi;
+}
 #endif /* !CONFIG_X86_LOCAL_APIC */
 
 #ifdef CONFIG_X86_X2APIC
diff --git a/arch/x86/include/asm/xen/pci.h b/arch/x86/include/asm/xen/pci.h
index 9015b888edd6..aa8ded61fc2d 100644
--- a/arch/x86/include/asm/xen/pci.h
+++ b/arch/x86/include/asm/xen/pci.h
@@ -5,6 +5,7 @@
 #if defined(CONFIG_PCI_XEN)
 extern int __init pci_xen_init(void);
 extern int __init pci_xen_hvm_init(void);
+extern int __init pci_xen_pvh_init(void);
 #define pci_xen 1
 #else
 #define pci_xen 0
@@ -13,6 +14,10 @@ static inline int pci_xen_hvm_init(void)
 {
 	return -1;
 }
+static inline int pci_xen_pvh_init(void)
+{
+	return -1;
+}
 #endif
 #ifdef CONFIG_XEN_PV_DOM0
 int __init pci_xen_initial_domain(void);
diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
index d0918a75cb00..45b157e18c0b 100644
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -739,7 +739,7 @@ static int acpi_register_gsi_pic(struct device *dev, u32 gsi,
 }
 
 #ifdef CONFIG_X86_LOCAL_APIC
-static int acpi_register_gsi_ioapic(struct device *dev, u32 gsi,
+int acpi_register_gsi_ioapic(struct device *dev, u32 gsi,
 				    int trigger, int polarity)
 {
 	int irq = gsi;
diff --git a/arch/x86/pci/xen.c b/arch/x86/pci/xen.c
index 652cd53e77f6..f056ab5c0a06 100644
--- a/arch/x86/pci/xen.c
+++ b/arch/x86/pci/xen.c
@@ -114,6 +114,21 @@ static int acpi_register_gsi_xen_hvm(struct device *dev, u32 gsi,
 				 false /* no mapping of GSI to PIRQ */);
 }
 
+static int acpi_register_gsi_xen_pvh(struct device *dev, u32 gsi,
+				    int trigger, int polarity)
+{
+	int irq;
+
+	irq = acpi_register_gsi_ioapic(dev, gsi, trigger, polarity);
+	if (irq < 0)
+		return irq;
+
+	if (xen_pvh_add_gsi_irq_map(gsi, irq) == -EEXIST)
+		printk(KERN_INFO "Already map the GSI :%u and IRQ: %d\n", gsi, irq);
+
+	return irq;
+}
+
 #ifdef CONFIG_XEN_PV_DOM0
 static int xen_register_gsi(u32 gsi, int triggering, int polarity)
 {
@@ -558,6 +573,12 @@ int __init pci_xen_hvm_init(void)
 	return 0;
 }
 
+int __init pci_xen_pvh_init(void)
+{
+	__acpi_register_gsi = acpi_register_gsi_xen_pvh;
+	return 0;
+}
+
 #ifdef CONFIG_XEN_PV_DOM0
 int __init pci_xen_initial_domain(void)
 {
diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
index 6de6b084ea60..a02d62955509 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -957,6 +957,43 @@ int xen_irq_from_gsi(unsigned gsi)
 }
 EXPORT_SYMBOL_GPL(xen_irq_from_gsi);
 
+int xen_gsi_from_irq(unsigned irq)
+{
+	struct irq_info *info;
+
+	list_for_each_entry(info, &xen_irq_list_head, list) {
+		if (info->type != IRQT_PIRQ)
+			continue;
+
+		if (info->irq == irq)
+			return info->u.pirq.gsi;
+	}
+
+	return -1;
+}
+EXPORT_SYMBOL_GPL(xen_gsi_from_irq);
+
+int xen_pvh_add_gsi_irq_map(unsigned gsi, unsigned irq)
+{
+	int tmp_irq;
+	struct irq_info *info;
+
+	tmp_irq = xen_irq_from_gsi(gsi);
+	if (tmp_irq != -1)
+		return -EEXIST;
+
+	info = kzalloc(sizeof(*info), GFP_KERNEL);
+	if (info == NULL)
+		panic("Unable to allocate metadata for GSI%d\n", gsi);
+
+	info->type = IRQT_PIRQ;
+	info->irq = irq;
+	info->u.pirq.gsi = gsi;
+	list_add_tail(&info->list, &xen_irq_list_head);
+
+	return 0;
+}
+
 static void __unbind_from_irq(unsigned int irq)
 {
 	evtchn_port_t evtchn = evtchn_from_irq(irq);
@@ -2303,6 +2340,8 @@ void __init xen_init_IRQ(void)
 	xen_init_setup_upcall_vector();
 	xen_alloc_callback_vector();
 
+	if (xen_pvh_domain())
+		pci_xen_pvh_init();
 
 	if (xen_hvm_domain()) {
 		native_init_IRQ();
diff --git a/drivers/xen/privcmd.c b/drivers/xen/privcmd.c
index 1ce7f3c7a950..6fa8a01d8ae6 100644
--- a/drivers/xen/privcmd.c
+++ b/drivers/xen/privcmd.c
@@ -45,6 +45,7 @@
 #include <xen/page.h>
 #include <xen/xen-ops.h>
 #include <xen/balloon.h>
+#include <xen/events.h>
 
 #include "privcmd.h"
 
@@ -842,6 +843,21 @@ static long privcmd_ioctl_mmap_resource(struct file *file,
 	return rc;
 }
 
+static long privcmd_ioctl_gsi_from_irq(struct file *file, void __user *udata)
+{
+	struct privcmd_gsi_from_irq kdata;
+
+	if (copy_from_user(&kdata, udata, sizeof(kdata)))
+		return -EFAULT;
+
+	kdata.gsi = xen_gsi_from_irq(kdata.irq);
+
+	if (copy_to_user(udata, &kdata, sizeof(kdata)))
+		return -EFAULT;
+
+	return 0;
+}
+
 #ifdef CONFIG_XEN_PRIVCMD_EVENTFD
 /* Irqfd support */
 static struct workqueue_struct *irqfd_cleanup_wq;
@@ -1534,6 +1550,10 @@ static long privcmd_ioctl(struct file *file,
 		ret = privcmd_ioctl_ioeventfd(file, udata);
 		break;
 
+	case IOCTL_PRIVCMD_GSI_FROM_IRQ:
+		ret = privcmd_ioctl_gsi_from_irq(file, udata);
+		break;
+
 	default:
 		break;
 	}
diff --git a/include/uapi/xen/privcmd.h b/include/uapi/xen/privcmd.h
index 8b8c5d1420fe..61f0ffbec077 100644
--- a/include/uapi/xen/privcmd.h
+++ b/include/uapi/xen/privcmd.h
@@ -126,6 +126,11 @@ struct privcmd_ioeventfd {
 	__u8 pad[2];
 };
 
+struct privcmd_gsi_from_irq {
+	__u32 irq;
+	__u32 gsi;
+};
+
 /*
  * @cmd: IOCTL_PRIVCMD_HYPERCALL
  * @arg: &privcmd_hypercall_t
@@ -157,5 +162,7 @@ struct privcmd_ioeventfd {
 	_IOW('P', 8, struct privcmd_irqfd)
 #define IOCTL_PRIVCMD_IOEVENTFD					\
 	_IOW('P', 9, struct privcmd_ioeventfd)
+#define IOCTL_PRIVCMD_GSI_FROM_IRQ				\
+	_IOC(_IOC_NONE, 'P', 10, sizeof(struct privcmd_gsi_from_irq))
 
 #endif /* __LINUX_PUBLIC_PRIVCMD_H__ */
diff --git a/include/xen/events.h b/include/xen/events.h
index 23932b0673dc..78377c936efe 100644
--- a/include/xen/events.h
+++ b/include/xen/events.h
@@ -131,6 +131,11 @@ int xen_pirq_from_irq(unsigned irq);
 /* Return the irq allocated to the gsi */
 int xen_irq_from_gsi(unsigned gsi);
 
+/* Return the gsi from irq */
+int xen_gsi_from_irq(unsigned irq);
+
+int xen_pvh_add_gsi_irq_map(unsigned gsi, unsigned irq);
+
 /* Determine whether to ignore this IRQ if it is passed to a guest. */
 int xen_test_irq_shared(int irq);
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: [RFC KERNEL PATCH v2 3/3] xen/privcmd: Add new syscall to get gsi from irq
  2023-11-24 10:31 ` [RFC KERNEL PATCH v2 3/3] xen/privcmd: Add new syscall to get gsi from irq Jiqian Chen
@ 2023-11-26 16:06 ` kernel test robot
  -1 siblings, 0 replies; 49+ messages in thread
From: kernel test robot @ 2023-11-24 23:38 UTC (permalink / raw)
  To: oe-kbuild; +Cc: lkp

:::::: 
:::::: Manual check reason: "low confidence bisect report"
:::::: 

BCC: lkp@intel.com
CC: oe-kbuild-all@lists.linux.dev
In-Reply-To: <20231124103123.3263471-4-Jiqian.Chen@amd.com>
References: <20231124103123.3263471-4-Jiqian.Chen@amd.com>
TO: Jiqian Chen <Jiqian.Chen@amd.com>

Hi Jiqian,

[This is a private test report for your RFC patch.]
kernel test robot noticed the following build warnings:

[auto build test WARNING on v6.7-rc1]
[cannot apply to xen-tip/linux-next tip/irq/core tip/x86/core linus/master v6.7-rc2 next-20231124]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Jiqian-Chen/xen-pci-Add-xen_reset_device_state-function/20231124-183607
base:   v6.7-rc1
patch link:    https://lore.kernel.org/r/20231124103123.3263471-4-Jiqian.Chen%40amd.com
patch subject: [RFC KERNEL PATCH v2 3/3] xen/privcmd: Add new syscall to get gsi from irq
:::::: branch date: 10 hours ago
:::::: commit date: 10 hours ago
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce: (https://download.01.org/0day-ci/archive/20231125/202311250440.DbUQXNLK-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/r/202311250440.DbUQXNLK-lkp@intel.com/

includecheck warnings: (new ones prefixed by >>)
>> drivers/xen/privcmd.c: xen/events.h is included more than once.

vim +38 drivers/xen/privcmd.c

    36	
    37	#include <xen/xen.h>
  > 38	#include <xen/events.h>
    39	#include <xen/privcmd.h>
    40	#include <xen/interface/xen.h>
    41	#include <xen/interface/memory.h>
    42	#include <xen/interface/hvm/dm_op.h>
    43	#include <xen/interface/hvm/ioreq.h>
    44	#include <xen/features.h>
    45	#include <xen/page.h>
    46	#include <xen/xen-ops.h>
    47	#include <xen/balloon.h>
  > 48	#include <xen/events.h>
    49	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC KERNEL PATCH v2 2/3] xen/pvh: Unmask irq for passthrough device in PVH dom0
  2023-11-24 10:31 ` [RFC KERNEL PATCH v2 2/3] xen/pvh: Unmask irq for passthrough device in PVH dom0 Jiqian Chen
@ 2023-11-25  7:24   ` kernel test robot
  2023-11-25  7:42   ` kernel test robot
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 49+ messages in thread
From: kernel test robot @ 2023-11-25  7:24 UTC (permalink / raw)
  To: Jiqian Chen; +Cc: llvm, oe-kbuild-all

Hi Jiqian,

[This is a private test report for your RFC patch.]
kernel test robot noticed the following build errors:

[auto build test ERROR on v6.7-rc1]
[also build test ERROR on next-20231124]
[cannot apply to xen-tip/linux-next tip/irq/core tip/x86/core linus/master v6.7-rc2]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Jiqian-Chen/xen-pci-Add-xen_reset_device_state-function/20231124-183607
base:   v6.7-rc1
patch link:    https://lore.kernel.org/r/20231124103123.3263471-3-Jiqian.Chen%40amd.com
patch subject: [RFC KERNEL PATCH v2 2/3] xen/pvh: Unmask irq for passthrough device in PVH dom0
config: hexagon-randconfig-001-20231125 (https://download.01.org/0day-ci/archive/20231125/202311251303.UOiJlP4m-lkp@intel.com/config)
compiler: clang version 14.0.6 (https://github.com/llvm/llvm-project.git f28c006a5895fc0e329fe15fead81e37457cb1d1)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231125/202311251303.UOiJlP4m-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202311251303.UOiJlP4m-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from arch/hexagon/kernel/irq_cpu.c:8:
   In file included from include/linux/interrupt.h:11:
   In file included from include/linux/hardirq.h:11:
   In file included from ./arch/hexagon/include/generated/asm/hardirq.h:1:
   In file included from include/asm-generic/hardirq.h:17:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/hexagon/include/asm/io.h:337:
   include/asm-generic/io.h:547:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __raw_readb(PCI_IOBASE + addr);
                             ~~~~~~~~~~ ^
   include/asm-generic/io.h:560:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/little_endian.h:37:51: note: expanded from macro '__le16_to_cpu'
   #define __le16_to_cpu(x) ((__force __u16)(__le16)(x))
                                                     ^
   In file included from arch/hexagon/kernel/irq_cpu.c:8:
   In file included from include/linux/interrupt.h:11:
   In file included from include/linux/hardirq.h:11:
   In file included from ./arch/hexagon/include/generated/asm/hardirq.h:1:
   In file included from include/asm-generic/hardirq.h:17:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/hexagon/include/asm/io.h:337:
   include/asm-generic/io.h:573:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/little_endian.h:35:51: note: expanded from macro '__le32_to_cpu'
   #define __le32_to_cpu(x) ((__force __u32)(__le32)(x))
                                                     ^
   In file included from arch/hexagon/kernel/irq_cpu.c:8:
   In file included from include/linux/interrupt.h:11:
   In file included from include/linux/hardirq.h:11:
   In file included from ./arch/hexagon/include/generated/asm/hardirq.h:1:
   In file included from include/asm-generic/hardirq.h:17:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/hexagon/include/asm/io.h:337:
   include/asm-generic/io.h:584:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writeb(value, PCI_IOBASE + addr);
                               ~~~~~~~~~~ ^
   include/asm-generic/io.h:594:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr);
                                                         ~~~~~~~~~~ ^
   include/asm-generic/io.h:604:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr);
                                                         ~~~~~~~~~~ ^
   arch/hexagon/kernel/irq_cpu.c:22:13: error: static declaration of 'unmask_irq' follows non-static declaration
   static void unmask_irq(struct irq_data *data)
               ^
   include/linux/irq.h:662:13: note: previous declaration is here
   extern void unmask_irq(struct irq_desc *desc);
               ^
>> arch/hexagon/kernel/irq_cpu.c:48:16: error: incompatible function pointer types initializing 'void (*)(struct irq_data *)' with an expression of type 'void (struct irq_desc *)' [-Werror,-Wincompatible-function-pointer-types]
           .irq_unmask     = unmask_irq,
                             ^~~~~~~~~~
   6 warnings and 2 errors generated.


vim +48 arch/hexagon/kernel/irq_cpu.c

2ac211bc423a0e Richard Kuo 2011-10-31  44  
2ac211bc423a0e Richard Kuo 2011-10-31  45  static struct irq_chip hexagon_irq_chip = {
2ac211bc423a0e Richard Kuo 2011-10-31  46  	.name		= "HEXAGON",
2ac211bc423a0e Richard Kuo 2011-10-31  47  	.irq_mask	= mask_irq,
2ac211bc423a0e Richard Kuo 2011-10-31 @48  	.irq_unmask	= unmask_irq,
2ac211bc423a0e Richard Kuo 2011-10-31  49  	.irq_set_wake	= set_wake,
2ac211bc423a0e Richard Kuo 2011-10-31  50  	.irq_eoi	= eoi_irq
2ac211bc423a0e Richard Kuo 2011-10-31  51  };
2ac211bc423a0e Richard Kuo 2011-10-31  52  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC KERNEL PATCH v2 2/3] xen/pvh: Unmask irq for passthrough device in PVH dom0
  2023-11-24 10:31 ` [RFC KERNEL PATCH v2 2/3] xen/pvh: Unmask irq for passthrough device in PVH dom0 Jiqian Chen
  2023-11-25  7:24   ` kernel test robot
@ 2023-11-25  7:42   ` kernel test robot
  2023-11-30  3:53   ` Stefano Stabellini
  2023-12-04  8:13   ` Thomas Gleixner
  3 siblings, 0 replies; 49+ messages in thread
From: kernel test robot @ 2023-11-25  7:42 UTC (permalink / raw)
  To: Jiqian Chen; +Cc: llvm, oe-kbuild-all

Hi Jiqian,

[This is a private test report for your RFC patch.]
kernel test robot noticed the following build errors:

[auto build test ERROR on v6.7-rc1]
[also build test ERROR on next-20231124]
[cannot apply to xen-tip/linux-next tip/irq/core tip/x86/core linus/master v6.7-rc2]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Jiqian-Chen/xen-pci-Add-xen_reset_device_state-function/20231124-183607
base:   v6.7-rc1
patch link:    https://lore.kernel.org/r/20231124103123.3263471-3-Jiqian.Chen%40amd.com
patch subject: [RFC KERNEL PATCH v2 2/3] xen/pvh: Unmask irq for passthrough device in PVH dom0
config: hexagon-randconfig-002-20231125 (https://download.01.org/0day-ci/archive/20231125/202311251435.j0xYajwA-lkp@intel.com/config)
compiler: clang version 16.0.4 (https://github.com/llvm/llvm-project.git ae42196bc493ffe877a7e3dff8be32035dea4d07)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231125/202311251435.j0xYajwA-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202311251435.j0xYajwA-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from arch/hexagon/kernel/irq_cpu.c:8:
   In file included from include/linux/interrupt.h:11:
   In file included from include/linux/hardirq.h:11:
   In file included from ./arch/hexagon/include/generated/asm/hardirq.h:1:
   In file included from include/asm-generic/hardirq.h:17:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/hexagon/include/asm/io.h:337:
   include/asm-generic/io.h:547:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __raw_readb(PCI_IOBASE + addr);
                             ~~~~~~~~~~ ^
   include/asm-generic/io.h:560:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/little_endian.h:37:51: note: expanded from macro '__le16_to_cpu'
   #define __le16_to_cpu(x) ((__force __u16)(__le16)(x))
                                                     ^
   In file included from arch/hexagon/kernel/irq_cpu.c:8:
   In file included from include/linux/interrupt.h:11:
   In file included from include/linux/hardirq.h:11:
   In file included from ./arch/hexagon/include/generated/asm/hardirq.h:1:
   In file included from include/asm-generic/hardirq.h:17:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/hexagon/include/asm/io.h:337:
   include/asm-generic/io.h:573:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/little_endian.h:35:51: note: expanded from macro '__le32_to_cpu'
   #define __le32_to_cpu(x) ((__force __u32)(__le32)(x))
                                                     ^
   In file included from arch/hexagon/kernel/irq_cpu.c:8:
   In file included from include/linux/interrupt.h:11:
   In file included from include/linux/hardirq.h:11:
   In file included from ./arch/hexagon/include/generated/asm/hardirq.h:1:
   In file included from include/asm-generic/hardirq.h:17:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/hexagon/include/asm/io.h:337:
   include/asm-generic/io.h:584:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writeb(value, PCI_IOBASE + addr);
                               ~~~~~~~~~~ ^
   include/asm-generic/io.h:594:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr);
                                                         ~~~~~~~~~~ ^
   include/asm-generic/io.h:604:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr);
                                                         ~~~~~~~~~~ ^
>> arch/hexagon/kernel/irq_cpu.c:22:13: error: static declaration of 'unmask_irq' follows non-static declaration
   static void unmask_irq(struct irq_data *data)
               ^
   include/linux/irq.h:662:13: note: previous declaration is here
   extern void unmask_irq(struct irq_desc *desc);
               ^
>> arch/hexagon/kernel/irq_cpu.c:48:16: error: incompatible function pointer types initializing 'void (*)(struct irq_data *)' with an expression of type 'void (struct irq_desc *)' [-Wincompatible-function-pointer-types]
           .irq_unmask     = unmask_irq,
                             ^~~~~~~~~~
   6 warnings and 2 errors generated.


vim +/unmask_irq +22 arch/hexagon/kernel/irq_cpu.c

2ac211bc423a0e Richard Kuo 2011-10-31  21  
2ac211bc423a0e Richard Kuo 2011-10-31 @22  static void unmask_irq(struct irq_data *data)
2ac211bc423a0e Richard Kuo 2011-10-31  23  {
2ac211bc423a0e Richard Kuo 2011-10-31  24  	__vmintop_locen((long) data->irq);
2ac211bc423a0e Richard Kuo 2011-10-31  25  }
2ac211bc423a0e Richard Kuo 2011-10-31  26  
2ac211bc423a0e Richard Kuo 2011-10-31  27  /*  This is actually all we need for handle_fasteoi_irq  */
2ac211bc423a0e Richard Kuo 2011-10-31  28  static void eoi_irq(struct irq_data *data)
2ac211bc423a0e Richard Kuo 2011-10-31  29  {
2ac211bc423a0e Richard Kuo 2011-10-31  30  	__vmintop_globen((long) data->irq);
2ac211bc423a0e Richard Kuo 2011-10-31  31  }
2ac211bc423a0e Richard Kuo 2011-10-31  32  
2ac211bc423a0e Richard Kuo 2011-10-31  33  /* Power mamangement wake call. We don't need this, however,
2ac211bc423a0e Richard Kuo 2011-10-31  34   * if this is absent, then an -ENXIO error is returned to the
2ac211bc423a0e Richard Kuo 2011-10-31  35   * msm_serial driver, and it fails to correctly initialize.
2ac211bc423a0e Richard Kuo 2011-10-31  36   * This is a bug in the msm_serial driver, but, for now, we
2ac211bc423a0e Richard Kuo 2011-10-31  37   * work around it here, by providing this bogus handler.
2ac211bc423a0e Richard Kuo 2011-10-31  38   * XXX FIXME!!! remove this when msm_serial is fixed.
2ac211bc423a0e Richard Kuo 2011-10-31  39   */
2ac211bc423a0e Richard Kuo 2011-10-31  40  static int set_wake(struct irq_data *data, unsigned int on)
2ac211bc423a0e Richard Kuo 2011-10-31  41  {
2ac211bc423a0e Richard Kuo 2011-10-31  42  	return 0;
2ac211bc423a0e Richard Kuo 2011-10-31  43  }
2ac211bc423a0e Richard Kuo 2011-10-31  44  
2ac211bc423a0e Richard Kuo 2011-10-31  45  static struct irq_chip hexagon_irq_chip = {
2ac211bc423a0e Richard Kuo 2011-10-31  46  	.name		= "HEXAGON",
2ac211bc423a0e Richard Kuo 2011-10-31  47  	.irq_mask	= mask_irq,
2ac211bc423a0e Richard Kuo 2011-10-31 @48  	.irq_unmask	= unmask_irq,
2ac211bc423a0e Richard Kuo 2011-10-31  49  	.irq_set_wake	= set_wake,
2ac211bc423a0e Richard Kuo 2011-10-31  50  	.irq_eoi	= eoi_irq
2ac211bc423a0e Richard Kuo 2011-10-31  51  };
2ac211bc423a0e Richard Kuo 2011-10-31  52  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC KERNEL PATCH v2 3/3] xen/privcmd: Add new syscall to get gsi from irq
@ 2023-11-26 16:06 ` kernel test robot
  0 siblings, 0 replies; 49+ messages in thread
From: kernel test robot @ 2023-11-26 16:06 UTC (permalink / raw)
  To: Jiqian Chen; +Cc: oe-kbuild-all

Hi Jiqian,

[This is a private test report for your RFC patch.]
kernel test robot noticed the following build warnings:

[auto build test WARNING on v6.7-rc1]
[cannot apply to xen-tip/linux-next tip/irq/core tip/x86/core linus/master v6.7-rc2 next-20231124]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Jiqian-Chen/xen-pci-Add-xen_reset_device_state-function/20231124-183607
base:   v6.7-rc1
patch link:    https://lore.kernel.org/r/20231124103123.3263471-4-Jiqian.Chen%40amd.com
patch subject: [RFC KERNEL PATCH v2 3/3] xen/privcmd: Add new syscall to get gsi from irq
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce: (https://download.01.org/0day-ci/archive/20231125/202311250440.DbUQXNLK-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <yujie.liu@intel.com>
| Closes: https://lore.kernel.org/r/202311250440.DbUQXNLK-lkp@intel.com/

includecheck warnings: (new ones prefixed by >>)
>> drivers/xen/privcmd.c: xen/events.h is included more than once.

vim +38 drivers/xen/privcmd.c

    36	
    37	#include <xen/xen.h>
  > 38	#include <xen/events.h>
    39	#include <xen/privcmd.h>
    40	#include <xen/interface/xen.h>
    41	#include <xen/interface/memory.h>
    42	#include <xen/interface/hvm/dm_op.h>
    43	#include <xen/interface/hvm/ioreq.h>
    44	#include <xen/features.h>
    45	#include <xen/page.h>
    46	#include <xen/xen-ops.h>
    47	#include <xen/balloon.h>
  > 48	#include <xen/events.h>
    49	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC KERNEL PATCH v2 1/3] xen/pci: Add xen_reset_device_state function
  2023-11-24 10:31 ` [RFC KERNEL PATCH v2 1/3] xen/pci: Add xen_reset_device_state function Jiqian Chen
@ 2023-11-30  3:46   ` Stefano Stabellini
  2023-11-30  7:03     ` Chen, Jiqian
  2023-12-04  7:58   ` Thomas Gleixner
  1 sibling, 1 reply; 49+ messages in thread
From: Stefano Stabellini @ 2023-11-30  3:46 UTC (permalink / raw)
  To: Jiqian Chen
  Cc: Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko,
	Thomas Gleixner, Boris Ostrovsky, Rafael J . Wysocki, Len Brown,
	Bjorn Helgaas, Roger Pau Monné,
	xen-devel, linux-kernel, linux-acpi, Stefano Stabellini,
	Alex Deucher, Christian Koenig, Stewart Hildebrand,
	Xenia Ragiadakou, Honglei Huang, Julia Zhang, Huang Rui,
	Huang Rui

On Fri, 24 Nov 2023, Jiqian Chen wrote:
> When device on dom0 side has been reset, the vpci on Xen side
> won't get notification, so that the cached state in vpci is
> all out of date with the real device state.
> To solve that problem, this patch add a function to clear all
> vpci device state when device is reset on dom0 side.
> 
> And call that function in pcistub_init_device. Because when
> we use "pci-assignable-add" to assign a passthrough device in
> Xen, it will reset passthrough device and the vpci state will
> out of date, and then device will fail to restore bar state.
> 
> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> Signed-off-by: Huang Rui <ray.huang@amd.com>
> ---
>  drivers/xen/pci.c                  | 12 ++++++++++++
>  drivers/xen/xen-pciback/pci_stub.c |  3 +++
>  include/xen/interface/physdev.h    |  2 ++
>  include/xen/pci.h                  |  6 ++++++
>  4 files changed, 23 insertions(+)
> 
> diff --git a/drivers/xen/pci.c b/drivers/xen/pci.c
> index 72d4e3f193af..e9b30bc09139 100644
> --- a/drivers/xen/pci.c
> +++ b/drivers/xen/pci.c
> @@ -177,6 +177,18 @@ static int xen_remove_device(struct device *dev)
>  	return r;
>  }
>  
> +int xen_reset_device_state(const struct pci_dev *dev)
> +{
> +	struct physdev_pci_device device = {
> +		.seg = pci_domain_nr(dev->bus),
> +		.bus = dev->bus->number,
> +		.devfn = dev->devfn
> +	};
> +
> +	return HYPERVISOR_physdev_op(PHYSDEVOP_pci_device_state_reset, &device);
> +}
> +EXPORT_SYMBOL_GPL(xen_reset_device_state);
> +
>  static int xen_pci_notifier(struct notifier_block *nb,
>  			    unsigned long action, void *data)
>  {
> diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
> index e34b623e4b41..5a96b6c66c07 100644
> --- a/drivers/xen/xen-pciback/pci_stub.c
> +++ b/drivers/xen/xen-pciback/pci_stub.c
> @@ -421,6 +421,9 @@ static int pcistub_init_device(struct pci_dev *dev)
>  	else {
>  		dev_dbg(&dev->dev, "resetting (FLR, D3, etc) the device\n");
>  		__pci_reset_function_locked(dev);
> +		err = xen_reset_device_state(dev);
> +		if (err)
> +			goto config_release;

Older versions of Xen won't have the hypercall
PHYSDEVOP_pci_device_state_reset implemented. I think we should do
something like:

if (err && xen_pvh_domain())
    goto config_release;


Or even:

if (xen_pvh_domain()) {
    err = xen_reset_device_state(dev);
    if (err)
        goto config_release;
}

depending on whether we want to call xen_reset_device_state also for PV
guests or not. I am assuming we don't want to error out on failure such
as -ENOENT for PV guests.


>  		pci_restore_state(dev);
>  	}
>  	/* Now disable the device (this also ensures some private device
> diff --git a/include/xen/interface/physdev.h b/include/xen/interface/physdev.h
> index a237af867873..231526f80f6c 100644
> --- a/include/xen/interface/physdev.h
> +++ b/include/xen/interface/physdev.h
> @@ -263,6 +263,8 @@ struct physdev_pci_device {
>      uint8_t devfn;
>  };
>  
> +#define PHYSDEVOP_pci_device_state_reset     32
> +
>  #define PHYSDEVOP_DBGP_RESET_PREPARE    1
>  #define PHYSDEVOP_DBGP_RESET_DONE       2
>  
> diff --git a/include/xen/pci.h b/include/xen/pci.h
> index b8337cf85fd1..b2e2e856efd6 100644
> --- a/include/xen/pci.h
> +++ b/include/xen/pci.h
> @@ -4,10 +4,16 @@
>  #define __XEN_PCI_H__
>  
>  #if defined(CONFIG_XEN_DOM0)
> +int xen_reset_device_state(const struct pci_dev *dev);
>  int xen_find_device_domain_owner(struct pci_dev *dev);
>  int xen_register_device_domain_owner(struct pci_dev *dev, uint16_t domain);
>  int xen_unregister_device_domain_owner(struct pci_dev *dev);
>  #else
> +static inline int xen_reset_device_state(const struct pci_dev *dev)
> +{
> +	return -1;
> +}
> +
>  static inline int xen_find_device_domain_owner(struct pci_dev *dev)
>  {
>  	return -1;
> -- 
> 2.34.1
> 

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC KERNEL PATCH v2 2/3] xen/pvh: Unmask irq for passthrough device in PVH dom0
  2023-11-24 10:31 ` [RFC KERNEL PATCH v2 2/3] xen/pvh: Unmask irq for passthrough device in PVH dom0 Jiqian Chen
  2023-11-25  7:24   ` kernel test robot
  2023-11-25  7:42   ` kernel test robot
@ 2023-11-30  3:53   ` Stefano Stabellini
  2023-11-30 16:02     ` Roger Pau Monné
  2023-12-04  8:13   ` Thomas Gleixner
  3 siblings, 1 reply; 49+ messages in thread
From: Stefano Stabellini @ 2023-11-30  3:53 UTC (permalink / raw)
  To: Jiqian Chen, roger.pau
  Cc: Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko,
	Thomas Gleixner, Boris Ostrovsky, Rafael J . Wysocki, Len Brown,
	Bjorn Helgaas, Roger Pau Monné,
	xen-devel, linux-kernel, linux-acpi, Stefano Stabellini,
	Alex Deucher, Christian Koenig, Stewart Hildebrand,
	Xenia Ragiadakou, Honglei Huang, Julia Zhang, Huang Rui,
	Huang Rui

On Fri, 24 Nov 2023, Jiqian Chen wrote:
> This patch is to solve two problems we encountered when we try to
> passthrough a device to hvm domU base on Xen PVH dom0.
> 
> First, hvm guest will alloc a pirq and irq for a passthrough device
> by using gsi, before that, the gsi must first has a mapping in dom0,
> see Xen code pci_add_dm_done->xc_domain_irq_permission, it will call
> into Xen and check whether dom0 has the mapping. See
> XEN_DOMCTL_irq_permission->pirq_access_permitted, "current" is PVH
> dom0 and it return irq is 0, and then return -EPERM.
> This is because the passthrough device doesn't do PHYSDEVOP_map_pirq
> when thay are enabled.
> 
> Second, in PVH dom0, the gsi of a passthrough device doesn't get
> registered, but gsi must be configured for it to be able to be
> mapped into a domU.
> 
> After searching codes, we can find map_pirq and register_gsi will be
> done in function vioapic_write_redirent->vioapic_hwdom_map_gsi when
> the gsi(aka ioapic's pin) is unmasked in PVH dom0. So the problems
> can be conclude to that the gsi of a passthrough device doesn't be
> unmasked.
> 
> To solve the unmaske problem, this patch call the unmask_irq when we
> assign a device to be passthrough. So that the gsi can get registered
> and mapped in PVH dom0.


Roger, this seems to be more of a Xen issue than a Linux issue. Why do
we need the unmask check in Xen? Couldn't we just do:


diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
index 4e40d3609a..df262a4a18 100644
--- a/xen/arch/x86/hvm/vioapic.c
+++ b/xen/arch/x86/hvm/vioapic.c
@@ -287,7 +287,7 @@ static void vioapic_write_redirent(
             hvm_dpci_eoi(d, gsi);
     }
 
-    if ( is_hardware_domain(d) && unmasked )
+    if ( is_hardware_domain(d) )
     {
         /*
          * NB: don't call vioapic_hwdom_map_gsi while holding hvm.irq_lock

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: [RFC KERNEL PATCH v2 1/3] xen/pci: Add xen_reset_device_state function
  2023-11-30  3:46   ` Stefano Stabellini
@ 2023-11-30  7:03     ` Chen, Jiqian
  2023-11-30 15:03       ` Stewart Hildebrand
  0 siblings, 1 reply; 49+ messages in thread
From: Chen, Jiqian @ 2023-11-30  7:03 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Juergen Gross, Oleksandr Tyshchenko, Thomas Gleixner,
	Boris Ostrovsky, Rafael J . Wysocki, Len Brown, Bjorn Helgaas,
	Roger Pau Monné,
	xen-devel, linux-kernel, linux-acpi, Stabellini, Stefano,
	Deucher, Alexander, Koenig, Christian, Hildebrand, Stewart,
	Ragiadakou, Xenia, Huang, Honglei1, Zhang, Julia, Huang, Ray,
	Chen, Jiqian


On 2023/11/30 11:46, Stefano Stabellini wrote:
> On Fri, 24 Nov 2023, Jiqian Chen wrote:
>> When device on dom0 side has been reset, the vpci on Xen side
>> won't get notification, so that the cached state in vpci is
>> all out of date with the real device state.
>> To solve that problem, this patch add a function to clear all
>> vpci device state when device is reset on dom0 side.
>>
>> And call that function in pcistub_init_device. Because when
>> we use "pci-assignable-add" to assign a passthrough device in
>> Xen, it will reset passthrough device and the vpci state will
>> out of date, and then device will fail to restore bar state.
>>
>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>> Signed-off-by: Huang Rui <ray.huang@amd.com>
>> ---
>>  drivers/xen/pci.c                  | 12 ++++++++++++
>>  drivers/xen/xen-pciback/pci_stub.c |  3 +++
>>  include/xen/interface/physdev.h    |  2 ++
>>  include/xen/pci.h                  |  6 ++++++
>>  4 files changed, 23 insertions(+)
>>
>> diff --git a/drivers/xen/pci.c b/drivers/xen/pci.c
>> index 72d4e3f193af..e9b30bc09139 100644
>> --- a/drivers/xen/pci.c
>> +++ b/drivers/xen/pci.c
>> @@ -177,6 +177,18 @@ static int xen_remove_device(struct device *dev)
>>  	return r;
>>  }
>>  
>> +int xen_reset_device_state(const struct pci_dev *dev)
>> +{
>> +	struct physdev_pci_device device = {
>> +		.seg = pci_domain_nr(dev->bus),
>> +		.bus = dev->bus->number,
>> +		.devfn = dev->devfn
>> +	};
>> +
>> +	return HYPERVISOR_physdev_op(PHYSDEVOP_pci_device_state_reset, &device);
>> +}
>> +EXPORT_SYMBOL_GPL(xen_reset_device_state);
>> +
>>  static int xen_pci_notifier(struct notifier_block *nb,
>>  			    unsigned long action, void *data)
>>  {
>> diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
>> index e34b623e4b41..5a96b6c66c07 100644
>> --- a/drivers/xen/xen-pciback/pci_stub.c
>> +++ b/drivers/xen/xen-pciback/pci_stub.c
>> @@ -421,6 +421,9 @@ static int pcistub_init_device(struct pci_dev *dev)
>>  	else {
>>  		dev_dbg(&dev->dev, "resetting (FLR, D3, etc) the device\n");
>>  		__pci_reset_function_locked(dev);
>> +		err = xen_reset_device_state(dev);
>> +		if (err)
>> +			goto config_release;
> 
> Older versions of Xen won't have the hypercall
> PHYSDEVOP_pci_device_state_reset implemented. I think we should do
> something like:
> 
> if (err && xen_pvh_domain())
>     goto config_release;
> 
> 
> Or even:
> 
> if (xen_pvh_domain()) {
>     err = xen_reset_device_state(dev);
>     if (err)
>         goto config_release;
> }
> 
> depending on whether we want to call xen_reset_device_state also for PV
> guests or not. I am assuming we don't want to error out on failure such
> as -ENOENT for PV guests.
Yes, only for PVH dom0, I will add the condition in next version. Thank you!

> 
> 
>>  		pci_restore_state(dev);
>>  	}
>>  	/* Now disable the device (this also ensures some private device
>> diff --git a/include/xen/interface/physdev.h b/include/xen/interface/physdev.h
>> index a237af867873..231526f80f6c 100644
>> --- a/include/xen/interface/physdev.h
>> +++ b/include/xen/interface/physdev.h
>> @@ -263,6 +263,8 @@ struct physdev_pci_device {
>>      uint8_t devfn;
>>  };
>>  
>> +#define PHYSDEVOP_pci_device_state_reset     32
>> +
>>  #define PHYSDEVOP_DBGP_RESET_PREPARE    1
>>  #define PHYSDEVOP_DBGP_RESET_DONE       2
>>  
>> diff --git a/include/xen/pci.h b/include/xen/pci.h
>> index b8337cf85fd1..b2e2e856efd6 100644
>> --- a/include/xen/pci.h
>> +++ b/include/xen/pci.h
>> @@ -4,10 +4,16 @@
>>  #define __XEN_PCI_H__
>>  
>>  #if defined(CONFIG_XEN_DOM0)
>> +int xen_reset_device_state(const struct pci_dev *dev);
>>  int xen_find_device_domain_owner(struct pci_dev *dev);
>>  int xen_register_device_domain_owner(struct pci_dev *dev, uint16_t domain);
>>  int xen_unregister_device_domain_owner(struct pci_dev *dev);
>>  #else
>> +static inline int xen_reset_device_state(const struct pci_dev *dev)
>> +{
>> +	return -1;
>> +}
>> +
>>  static inline int xen_find_device_domain_owner(struct pci_dev *dev)
>>  {
>>  	return -1;
>> -- 
>> 2.34.1
>>

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC KERNEL PATCH v2 1/3] xen/pci: Add xen_reset_device_state function
  2023-11-30  7:03     ` Chen, Jiqian
@ 2023-11-30 15:03       ` Stewart Hildebrand
  2023-12-04  3:25         ` Chen, Jiqian
  0 siblings, 1 reply; 49+ messages in thread
From: Stewart Hildebrand @ 2023-11-30 15:03 UTC (permalink / raw)
  To: Chen, Jiqian, Stefano Stabellini
  Cc: Juergen Gross, Oleksandr Tyshchenko, Thomas Gleixner,
	Boris Ostrovsky, Rafael J . Wysocki, Len Brown, Bjorn Helgaas,
	Roger Pau Monné,
	xen-devel, linux-kernel, linux-acpi, Stabellini, Stefano,
	Deucher, Alexander, Koenig, Christian, Ragiadakou, Xenia, Huang,
	Honglei1, Zhang, Julia, Huang, Ray

On 11/30/23 02:03, Chen, Jiqian wrote:
> 
> On 2023/11/30 11:46, Stefano Stabellini wrote:
>> On Fri, 24 Nov 2023, Jiqian Chen wrote:
>>> When device on dom0 side has been reset, the vpci on Xen side
>>> won't get notification, so that the cached state in vpci is
>>> all out of date with the real device state.
>>> To solve that problem, this patch add a function to clear all
>>> vpci device state when device is reset on dom0 side.
>>>
>>> And call that function in pcistub_init_device. Because when
>>> we use "pci-assignable-add" to assign a passthrough device in
>>> Xen, it will reset passthrough device and the vpci state will
>>> out of date, and then device will fail to restore bar state.
>>>
>>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>>> Signed-off-by: Huang Rui <ray.huang@amd.com>
>>> ---
>>>  drivers/xen/pci.c                  | 12 ++++++++++++
>>>  drivers/xen/xen-pciback/pci_stub.c |  3 +++
>>>  include/xen/interface/physdev.h    |  2 ++
>>>  include/xen/pci.h                  |  6 ++++++
>>>  4 files changed, 23 insertions(+)
>>>
>>> diff --git a/drivers/xen/pci.c b/drivers/xen/pci.c
>>> index 72d4e3f193af..e9b30bc09139 100644
>>> --- a/drivers/xen/pci.c
>>> +++ b/drivers/xen/pci.c
>>> @@ -177,6 +177,18 @@ static int xen_remove_device(struct device *dev)
>>>  	return r;
>>>  }
>>>  
>>> +int xen_reset_device_state(const struct pci_dev *dev)
>>> +{
>>> +	struct physdev_pci_device device = {
>>> +		.seg = pci_domain_nr(dev->bus),
>>> +		.bus = dev->bus->number,
>>> +		.devfn = dev->devfn
>>> +	};
>>> +
>>> +	return HYPERVISOR_physdev_op(PHYSDEVOP_pci_device_state_reset, &device);
>>> +}
>>> +EXPORT_SYMBOL_GPL(xen_reset_device_state);
>>> +
>>>  static int xen_pci_notifier(struct notifier_block *nb,
>>>  			    unsigned long action, void *data)
>>>  {
>>> diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
>>> index e34b623e4b41..5a96b6c66c07 100644
>>> --- a/drivers/xen/xen-pciback/pci_stub.c
>>> +++ b/drivers/xen/xen-pciback/pci_stub.c
>>> @@ -421,6 +421,9 @@ static int pcistub_init_device(struct pci_dev *dev)
>>>  	else {
>>>  		dev_dbg(&dev->dev, "resetting (FLR, D3, etc) the device\n");
>>>  		__pci_reset_function_locked(dev);
>>> +		err = xen_reset_device_state(dev);
>>> +		if (err)
>>> +			goto config_release;
>>
>> Older versions of Xen won't have the hypercall
>> PHYSDEVOP_pci_device_state_reset implemented. I think we should do
>> something like:
>>
>> if (err && xen_pvh_domain())
>>     goto config_release;
>>
>>
>> Or even:
>>
>> if (xen_pvh_domain()) {
>>     err = xen_reset_device_state(dev);
>>     if (err)
>>         goto config_release;
>> }
>>
>> depending on whether we want to call xen_reset_device_state also for PV
>> guests or not. I am assuming we don't want to error out on failure such
>> as -ENOENT for PV guests.
> Yes, only for PVH dom0, I will add the condition in next version. Thank you!

We will want to call xen_reset_device_state() for Arm dom0, too, so checking xen_pvh_domain() alone is not sufficient. I suggest instead to check !xen_pv_domain().

> 
>>
>>
>>>  		pci_restore_state(dev);
>>>  	}
>>>  	/* Now disable the device (this also ensures some private device
>>> diff --git a/include/xen/interface/physdev.h b/include/xen/interface/physdev.h
>>> index a237af867873..231526f80f6c 100644
>>> --- a/include/xen/interface/physdev.h
>>> +++ b/include/xen/interface/physdev.h
>>> @@ -263,6 +263,8 @@ struct physdev_pci_device {
>>>      uint8_t devfn;
>>>  };
>>>  
>>> +#define PHYSDEVOP_pci_device_state_reset     32
>>> +
>>>  #define PHYSDEVOP_DBGP_RESET_PREPARE    1
>>>  #define PHYSDEVOP_DBGP_RESET_DONE       2
>>>  
>>> diff --git a/include/xen/pci.h b/include/xen/pci.h
>>> index b8337cf85fd1..b2e2e856efd6 100644
>>> --- a/include/xen/pci.h
>>> +++ b/include/xen/pci.h
>>> @@ -4,10 +4,16 @@
>>>  #define __XEN_PCI_H__
>>>  
>>>  #if defined(CONFIG_XEN_DOM0)
>>> +int xen_reset_device_state(const struct pci_dev *dev);
>>>  int xen_find_device_domain_owner(struct pci_dev *dev);
>>>  int xen_register_device_domain_owner(struct pci_dev *dev, uint16_t domain);
>>>  int xen_unregister_device_domain_owner(struct pci_dev *dev);
>>>  #else
>>> +static inline int xen_reset_device_state(const struct pci_dev *dev)
>>> +{
>>> +	return -1;
>>> +}
>>> +
>>>  static inline int xen_find_device_domain_owner(struct pci_dev *dev)
>>>  {
>>>  	return -1;
>>> -- 
>>> 2.34.1
>>>
> 

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC KERNEL PATCH v2 2/3] xen/pvh: Unmask irq for passthrough device in PVH dom0
  2023-11-30  3:53   ` Stefano Stabellini
@ 2023-11-30 16:02     ` Roger Pau Monné
  2023-12-01  3:15       ` Stefano Stabellini
  0 siblings, 1 reply; 49+ messages in thread
From: Roger Pau Monné @ 2023-11-30 16:02 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Jiqian Chen, Juergen Gross, Oleksandr Tyshchenko,
	Thomas Gleixner, Boris Ostrovsky, Rafael J . Wysocki, Len Brown,
	Bjorn Helgaas, xen-devel, linux-kernel, linux-acpi,
	Stefano Stabellini, Alex Deucher, Christian Koenig,
	Stewart Hildebrand, Xenia Ragiadakou, Honglei Huang, Julia Zhang,
	Huang Rui

On Wed, Nov 29, 2023 at 07:53:59PM -0800, Stefano Stabellini wrote:
> On Fri, 24 Nov 2023, Jiqian Chen wrote:
> > This patch is to solve two problems we encountered when we try to
> > passthrough a device to hvm domU base on Xen PVH dom0.
> > 
> > First, hvm guest will alloc a pirq and irq for a passthrough device
> > by using gsi, before that, the gsi must first has a mapping in dom0,
> > see Xen code pci_add_dm_done->xc_domain_irq_permission, it will call
> > into Xen and check whether dom0 has the mapping. See
> > XEN_DOMCTL_irq_permission->pirq_access_permitted, "current" is PVH
> > dom0 and it return irq is 0, and then return -EPERM.
> > This is because the passthrough device doesn't do PHYSDEVOP_map_pirq
> > when thay are enabled.
> > 
> > Second, in PVH dom0, the gsi of a passthrough device doesn't get
> > registered, but gsi must be configured for it to be able to be
> > mapped into a domU.
> > 
> > After searching codes, we can find map_pirq and register_gsi will be
> > done in function vioapic_write_redirent->vioapic_hwdom_map_gsi when
> > the gsi(aka ioapic's pin) is unmasked in PVH dom0. So the problems
> > can be conclude to that the gsi of a passthrough device doesn't be
> > unmasked.
> > 
> > To solve the unmaske problem, this patch call the unmask_irq when we
> > assign a device to be passthrough. So that the gsi can get registered
> > and mapped in PVH dom0.
> 
> 
> Roger, this seems to be more of a Xen issue than a Linux issue. Why do
> we need the unmask check in Xen? Couldn't we just do:
> 
> 
> diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
> index 4e40d3609a..df262a4a18 100644
> --- a/xen/arch/x86/hvm/vioapic.c
> +++ b/xen/arch/x86/hvm/vioapic.c
> @@ -287,7 +287,7 @@ static void vioapic_write_redirent(
>              hvm_dpci_eoi(d, gsi);
>      }
>  
> -    if ( is_hardware_domain(d) && unmasked )
> +    if ( is_hardware_domain(d) )
>      {
>          /*
>           * NB: don't call vioapic_hwdom_map_gsi while holding hvm.irq_lock

There are some issues with this approach.

mp_register_gsi() will only setup the trigger and polarity of the
IO-APIC pin once, so we do so once the guest unmask the pin in order
to assert that the configuration is the intended one.  A guest is
allowed to write all kind of nonsense stuff to the IO-APIC RTE, but
that doesn't take effect unless the pin is unmasked.

Overall the question would be whether we have any guarantees that
the hardware domain has properly configured the pin, even if it's not
using it itself (as it hasn't been unmasked).

IIRC PCI legacy interrupts are level triggered and low polarity, so we
could configure any pins that are not setup at bind time?

Thanks, Roger.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC KERNEL PATCH v2 2/3] xen/pvh: Unmask irq for passthrough device in PVH dom0
  2023-11-30 16:02     ` Roger Pau Monné
@ 2023-12-01  3:15       ` Stefano Stabellini
  2023-12-01  8:58         ` Roger Pau Monné
  0 siblings, 1 reply; 49+ messages in thread
From: Stefano Stabellini @ 2023-12-01  3:15 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Stefano Stabellini, Jiqian Chen, Juergen Gross,
	Oleksandr Tyshchenko, Thomas Gleixner, Boris Ostrovsky,
	Rafael J . Wysocki, Len Brown, Bjorn Helgaas, xen-devel,
	linux-kernel, linux-acpi, Stefano Stabellini, Alex Deucher,
	Christian Koenig, Stewart Hildebrand, Xenia Ragiadakou,
	Honglei Huang, Julia Zhang, Huang Rui

[-- Attachment #1: Type: text/plain, Size: 3016 bytes --]

On Thu, 30 Nov 2023, Roger Pau Monné wrote:
> On Wed, Nov 29, 2023 at 07:53:59PM -0800, Stefano Stabellini wrote:
> > On Fri, 24 Nov 2023, Jiqian Chen wrote:
> > > This patch is to solve two problems we encountered when we try to
> > > passthrough a device to hvm domU base on Xen PVH dom0.
> > > 
> > > First, hvm guest will alloc a pirq and irq for a passthrough device
> > > by using gsi, before that, the gsi must first has a mapping in dom0,
> > > see Xen code pci_add_dm_done->xc_domain_irq_permission, it will call
> > > into Xen and check whether dom0 has the mapping. See
> > > XEN_DOMCTL_irq_permission->pirq_access_permitted, "current" is PVH
> > > dom0 and it return irq is 0, and then return -EPERM.
> > > This is because the passthrough device doesn't do PHYSDEVOP_map_pirq
> > > when thay are enabled.
> > > 
> > > Second, in PVH dom0, the gsi of a passthrough device doesn't get
> > > registered, but gsi must be configured for it to be able to be
> > > mapped into a domU.
> > > 
> > > After searching codes, we can find map_pirq and register_gsi will be
> > > done in function vioapic_write_redirent->vioapic_hwdom_map_gsi when
> > > the gsi(aka ioapic's pin) is unmasked in PVH dom0. So the problems
> > > can be conclude to that the gsi of a passthrough device doesn't be
> > > unmasked.
> > > 
> > > To solve the unmaske problem, this patch call the unmask_irq when we
> > > assign a device to be passthrough. So that the gsi can get registered
> > > and mapped in PVH dom0.
> > 
> > 
> > Roger, this seems to be more of a Xen issue than a Linux issue. Why do
> > we need the unmask check in Xen? Couldn't we just do:
> > 
> > 
> > diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
> > index 4e40d3609a..df262a4a18 100644
> > --- a/xen/arch/x86/hvm/vioapic.c
> > +++ b/xen/arch/x86/hvm/vioapic.c
> > @@ -287,7 +287,7 @@ static void vioapic_write_redirent(
> >              hvm_dpci_eoi(d, gsi);
> >      }
> >  
> > -    if ( is_hardware_domain(d) && unmasked )
> > +    if ( is_hardware_domain(d) )
> >      {
> >          /*
> >           * NB: don't call vioapic_hwdom_map_gsi while holding hvm.irq_lock
> 
> There are some issues with this approach.
> 
> mp_register_gsi() will only setup the trigger and polarity of the
> IO-APIC pin once, so we do so once the guest unmask the pin in order
> to assert that the configuration is the intended one.  A guest is
> allowed to write all kind of nonsense stuff to the IO-APIC RTE, but
> that doesn't take effect unless the pin is unmasked.
> 
> Overall the question would be whether we have any guarantees that
> the hardware domain has properly configured the pin, even if it's not
> using it itself (as it hasn't been unmasked).
> 
> IIRC PCI legacy interrupts are level triggered and low polarity, so we
> could configure any pins that are not setup at bind time?

That could work.

Another idea is to move only the call to allocate_and_map_gsi_pirq at
bind time? That might be enough to pass a pirq_access_permitted check.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC KERNEL PATCH v2 2/3] xen/pvh: Unmask irq for passthrough device in PVH dom0
  2023-12-01  3:15       ` Stefano Stabellini
@ 2023-12-01  8:58         ` Roger Pau Monné
  2023-12-02  3:37           ` Stefano Stabellini
  0 siblings, 1 reply; 49+ messages in thread
From: Roger Pau Monné @ 2023-12-01  8:58 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Jiqian Chen, Juergen Gross, Oleksandr Tyshchenko,
	Thomas Gleixner, Boris Ostrovsky, Rafael J . Wysocki, Len Brown,
	Bjorn Helgaas, xen-devel, linux-kernel, linux-acpi,
	Stefano Stabellini, Alex Deucher, Christian Koenig,
	Stewart Hildebrand, Xenia Ragiadakou, Honglei Huang, Julia Zhang,
	Huang Rui

On Thu, Nov 30, 2023 at 07:15:17PM -0800, Stefano Stabellini wrote:
> On Thu, 30 Nov 2023, Roger Pau Monné wrote:
> > On Wed, Nov 29, 2023 at 07:53:59PM -0800, Stefano Stabellini wrote:
> > > On Fri, 24 Nov 2023, Jiqian Chen wrote:
> > > > This patch is to solve two problems we encountered when we try to
> > > > passthrough a device to hvm domU base on Xen PVH dom0.
> > > > 
> > > > First, hvm guest will alloc a pirq and irq for a passthrough device
> > > > by using gsi, before that, the gsi must first has a mapping in dom0,
> > > > see Xen code pci_add_dm_done->xc_domain_irq_permission, it will call
> > > > into Xen and check whether dom0 has the mapping. See
> > > > XEN_DOMCTL_irq_permission->pirq_access_permitted, "current" is PVH
> > > > dom0 and it return irq is 0, and then return -EPERM.
> > > > This is because the passthrough device doesn't do PHYSDEVOP_map_pirq
> > > > when thay are enabled.
> > > > 
> > > > Second, in PVH dom0, the gsi of a passthrough device doesn't get
> > > > registered, but gsi must be configured for it to be able to be
> > > > mapped into a domU.
> > > > 
> > > > After searching codes, we can find map_pirq and register_gsi will be
> > > > done in function vioapic_write_redirent->vioapic_hwdom_map_gsi when
> > > > the gsi(aka ioapic's pin) is unmasked in PVH dom0. So the problems
> > > > can be conclude to that the gsi of a passthrough device doesn't be
> > > > unmasked.
> > > > 
> > > > To solve the unmaske problem, this patch call the unmask_irq when we
> > > > assign a device to be passthrough. So that the gsi can get registered
> > > > and mapped in PVH dom0.
> > > 
> > > 
> > > Roger, this seems to be more of a Xen issue than a Linux issue. Why do
> > > we need the unmask check in Xen? Couldn't we just do:
> > > 
> > > 
> > > diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
> > > index 4e40d3609a..df262a4a18 100644
> > > --- a/xen/arch/x86/hvm/vioapic.c
> > > +++ b/xen/arch/x86/hvm/vioapic.c
> > > @@ -287,7 +287,7 @@ static void vioapic_write_redirent(
> > >              hvm_dpci_eoi(d, gsi);
> > >      }
> > >  
> > > -    if ( is_hardware_domain(d) && unmasked )
> > > +    if ( is_hardware_domain(d) )
> > >      {
> > >          /*
> > >           * NB: don't call vioapic_hwdom_map_gsi while holding hvm.irq_lock
> > 
> > There are some issues with this approach.
> > 
> > mp_register_gsi() will only setup the trigger and polarity of the
> > IO-APIC pin once, so we do so once the guest unmask the pin in order
> > to assert that the configuration is the intended one.  A guest is
> > allowed to write all kind of nonsense stuff to the IO-APIC RTE, but
> > that doesn't take effect unless the pin is unmasked.
> > 
> > Overall the question would be whether we have any guarantees that
> > the hardware domain has properly configured the pin, even if it's not
> > using it itself (as it hasn't been unmasked).
> > 
> > IIRC PCI legacy interrupts are level triggered and low polarity, so we
> > could configure any pins that are not setup at bind time?
> 
> That could work.
> 
> Another idea is to move only the call to allocate_and_map_gsi_pirq at
> bind time? That might be enough to pass a pirq_access_permitted check.

Maybe, albeit that would change the behavior of XEN_DOMCTL_bind_pt_irq
just for PT_IRQ_TYPE_PCI and only when called from a PVH dom0 (as the
parameter would be a GSI instead of a previously mapped IRQ).  Such
difference just for PT_IRQ_TYPE_PCI is slightly weird - if we go that
route I would recommend that we instead introduce a new dmop that has
this syntax regardless of the domain type it's called from.

Thanks, Roger.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC KERNEL PATCH v2 2/3] xen/pvh: Unmask irq for passthrough device in PVH dom0
  2023-12-01  8:58         ` Roger Pau Monné
@ 2023-12-02  3:37           ` Stefano Stabellini
  2023-12-04 10:28             ` Roger Pau Monné
  0 siblings, 1 reply; 49+ messages in thread
From: Stefano Stabellini @ 2023-12-02  3:37 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Stefano Stabellini, Jiqian Chen, Juergen Gross,
	Oleksandr Tyshchenko, Thomas Gleixner, Boris Ostrovsky,
	Rafael J . Wysocki, Len Brown, Bjorn Helgaas, xen-devel,
	linux-kernel, linux-acpi, Stefano Stabellini, Alex Deucher,
	Christian Koenig, Stewart Hildebrand, Xenia Ragiadakou,
	Honglei Huang, Julia Zhang, Huang Rui

[-- Attachment #1: Type: text/plain, Size: 5940 bytes --]

On Fri, 1 Dec 2023, Roger Pau Monné wrote:
> On Thu, Nov 30, 2023 at 07:15:17PM -0800, Stefano Stabellini wrote:
> > On Thu, 30 Nov 2023, Roger Pau Monné wrote:
> > > On Wed, Nov 29, 2023 at 07:53:59PM -0800, Stefano Stabellini wrote:
> > > > On Fri, 24 Nov 2023, Jiqian Chen wrote:
> > > > > This patch is to solve two problems we encountered when we try to
> > > > > passthrough a device to hvm domU base on Xen PVH dom0.
> > > > > 
> > > > > First, hvm guest will alloc a pirq and irq for a passthrough device
> > > > > by using gsi, before that, the gsi must first has a mapping in dom0,
> > > > > see Xen code pci_add_dm_done->xc_domain_irq_permission, it will call
> > > > > into Xen and check whether dom0 has the mapping. See
> > > > > XEN_DOMCTL_irq_permission->pirq_access_permitted, "current" is PVH
> > > > > dom0 and it return irq is 0, and then return -EPERM.
> > > > > This is because the passthrough device doesn't do PHYSDEVOP_map_pirq
> > > > > when thay are enabled.
> > > > > 
> > > > > Second, in PVH dom0, the gsi of a passthrough device doesn't get
> > > > > registered, but gsi must be configured for it to be able to be
> > > > > mapped into a domU.
> > > > > 
> > > > > After searching codes, we can find map_pirq and register_gsi will be
> > > > > done in function vioapic_write_redirent->vioapic_hwdom_map_gsi when
> > > > > the gsi(aka ioapic's pin) is unmasked in PVH dom0. So the problems
> > > > > can be conclude to that the gsi of a passthrough device doesn't be
> > > > > unmasked.
> > > > > 
> > > > > To solve the unmaske problem, this patch call the unmask_irq when we
> > > > > assign a device to be passthrough. So that the gsi can get registered
> > > > > and mapped in PVH dom0.
> > > > 
> > > > 
> > > > Roger, this seems to be more of a Xen issue than a Linux issue. Why do
> > > > we need the unmask check in Xen? Couldn't we just do:
> > > > 
> > > > 
> > > > diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
> > > > index 4e40d3609a..df262a4a18 100644
> > > > --- a/xen/arch/x86/hvm/vioapic.c
> > > > +++ b/xen/arch/x86/hvm/vioapic.c
> > > > @@ -287,7 +287,7 @@ static void vioapic_write_redirent(
> > > >              hvm_dpci_eoi(d, gsi);
> > > >      }
> > > >  
> > > > -    if ( is_hardware_domain(d) && unmasked )
> > > > +    if ( is_hardware_domain(d) )
> > > >      {
> > > >          /*
> > > >           * NB: don't call vioapic_hwdom_map_gsi while holding hvm.irq_lock
> > > 
> > > There are some issues with this approach.
> > > 
> > > mp_register_gsi() will only setup the trigger and polarity of the
> > > IO-APIC pin once, so we do so once the guest unmask the pin in order
> > > to assert that the configuration is the intended one.  A guest is
> > > allowed to write all kind of nonsense stuff to the IO-APIC RTE, but
> > > that doesn't take effect unless the pin is unmasked.
> > > 
> > > Overall the question would be whether we have any guarantees that
> > > the hardware domain has properly configured the pin, even if it's not
> > > using it itself (as it hasn't been unmasked).
> > > 
> > > IIRC PCI legacy interrupts are level triggered and low polarity, so we
> > > could configure any pins that are not setup at bind time?
> > 
> > That could work.
> > 
> > Another idea is to move only the call to allocate_and_map_gsi_pirq at
> > bind time? That might be enough to pass a pirq_access_permitted check.
> 
> Maybe, albeit that would change the behavior of XEN_DOMCTL_bind_pt_irq
> just for PT_IRQ_TYPE_PCI and only when called from a PVH dom0 (as the
> parameter would be a GSI instead of a previously mapped IRQ).  Such
> difference just for PT_IRQ_TYPE_PCI is slightly weird - if we go that
> route I would recommend that we instead introduce a new dmop that has
> this syntax regardless of the domain type it's called from.

Looking at the code it is certainly a bit confusing. My point was that
we don't need to wait until polarity and trigger are set appropriately
to allow Dom0 to pass successfully a pirq_access_permitted() check. Xen
should be able to figure out that Dom0 is permitted pirq access.

So the idea was to move the call to allocate_and_map_gsi_pirq() earlier
somewhere because allocate_and_map_gsi_pirq doesn't require trigger or
polarity to be configured to work. But the suggestion of doing it a
"bind time" (meaning: XEN_DOMCTL_bind_pt_irq) was a bad idea.

But maybe we can find another location, maybe within
xen/arch/x86/hvm/vioapic.c, to call allocate_and_map_gsi_pirq() before
trigger and polarity are set and before the interrupt is unmasked.

Then we change the implementation of vioapic_hwdom_map_gsi to skip the
call to allocate_and_map_gsi_pirq, because by the time
vioapic_hwdom_map_gsi we assume that allocate_and_map_gsi_pirq had
already been done.

I am not familiar with vioapic.c but to give you an idea of what I was
thinking:


diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
index 4e40d3609a..16d56fe851 100644
--- a/xen/arch/x86/hvm/vioapic.c
+++ b/xen/arch/x86/hvm/vioapic.c
@@ -189,14 +189,6 @@ static int vioapic_hwdom_map_gsi(unsigned int gsi, unsigned int trig,
         return ret;
     }
 
-    ret = allocate_and_map_gsi_pirq(currd, pirq, &pirq);
-    if ( ret )
-    {
-        gprintk(XENLOG_WARNING, "vioapic: error mapping GSI %u: %d\n",
-                 gsi, ret);
-        return ret;
-    }
-
     pcidevs_lock();
     ret = pt_irq_create_bind(currd, &pt_irq_bind);
     if ( ret )
@@ -287,6 +279,17 @@ static void vioapic_write_redirent(
             hvm_dpci_eoi(d, gsi);
     }
 
+    if ( is_hardware_domain(d) ) 
+    {
+        int pirq = gsi, ret;
+        ret = allocate_and_map_gsi_pirq(currd, pirq, &pirq);
+        if ( ret )
+        {
+            gprintk(XENLOG_WARNING, "vioapic: error mapping GSI %u: %d\n",
+                    gsi, ret);
+            return ret;
+        }
+    }
     if ( is_hardware_domain(d) && unmasked )
     {
         /*

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: [RFC KERNEL PATCH v2 1/3] xen/pci: Add xen_reset_device_state function
  2023-11-30 15:03       ` Stewart Hildebrand
@ 2023-12-04  3:25         ` Chen, Jiqian
  2023-12-04  3:45           ` Stewart Hildebrand
  0 siblings, 1 reply; 49+ messages in thread
From: Chen, Jiqian @ 2023-12-04  3:25 UTC (permalink / raw)
  To: Hildebrand, Stewart, Stefano Stabellini
  Cc: Juergen Gross, Oleksandr Tyshchenko, Thomas Gleixner,
	Boris Ostrovsky, Rafael J . Wysocki, Len Brown, Bjorn Helgaas,
	Roger Pau Monné,
	xen-devel, Chen, Jiqian, linux-kernel, linux-acpi, Stabellini,
	Stefano, Deucher, Alexander, Koenig, Christian, Ragiadakou,
	Xenia, Huang, Honglei1, Zhang, Julia, Huang, Ray

Hi Stewart,

On 2023/11/30 23:03, Stewart Hildebrand wrote:
> On 11/30/23 02:03, Chen, Jiqian wrote:
>>
>> On 2023/11/30 11:46, Stefano Stabellini wrote:
>>> On Fri, 24 Nov 2023, Jiqian Chen wrote:
>>>> When device on dom0 side has been reset, the vpci on Xen side
>>>> won't get notification, so that the cached state in vpci is
>>>> all out of date with the real device state.
>>>> To solve that problem, this patch add a function to clear all
>>>> vpci device state when device is reset on dom0 side.
>>>>
>>>> And call that function in pcistub_init_device. Because when
>>>> we use "pci-assignable-add" to assign a passthrough device in
>>>> Xen, it will reset passthrough device and the vpci state will
>>>> out of date, and then device will fail to restore bar state.
>>>>
>>>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>>>> Signed-off-by: Huang Rui <ray.huang@amd.com>
>>>> ---
>>>>  drivers/xen/pci.c                  | 12 ++++++++++++
>>>>  drivers/xen/xen-pciback/pci_stub.c |  3 +++
>>>>  include/xen/interface/physdev.h    |  2 ++
>>>>  include/xen/pci.h                  |  6 ++++++
>>>>  4 files changed, 23 insertions(+)
>>>>
>>>> diff --git a/drivers/xen/pci.c b/drivers/xen/pci.c
>>>> index 72d4e3f193af..e9b30bc09139 100644
>>>> --- a/drivers/xen/pci.c
>>>> +++ b/drivers/xen/pci.c
>>>> @@ -177,6 +177,18 @@ static int xen_remove_device(struct device *dev)
>>>>  	return r;
>>>>  }
>>>>  
>>>> +int xen_reset_device_state(const struct pci_dev *dev)
>>>> +{
>>>> +	struct physdev_pci_device device = {
>>>> +		.seg = pci_domain_nr(dev->bus),
>>>> +		.bus = dev->bus->number,
>>>> +		.devfn = dev->devfn
>>>> +	};
>>>> +
>>>> +	return HYPERVISOR_physdev_op(PHYSDEVOP_pci_device_state_reset, &device);
>>>> +}
>>>> +EXPORT_SYMBOL_GPL(xen_reset_device_state);
>>>> +
>>>>  static int xen_pci_notifier(struct notifier_block *nb,
>>>>  			    unsigned long action, void *data)
>>>>  {
>>>> diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
>>>> index e34b623e4b41..5a96b6c66c07 100644
>>>> --- a/drivers/xen/xen-pciback/pci_stub.c
>>>> +++ b/drivers/xen/xen-pciback/pci_stub.c
>>>> @@ -421,6 +421,9 @@ static int pcistub_init_device(struct pci_dev *dev)
>>>>  	else {
>>>>  		dev_dbg(&dev->dev, "resetting (FLR, D3, etc) the device\n");
>>>>  		__pci_reset_function_locked(dev);
>>>> +		err = xen_reset_device_state(dev);
>>>> +		if (err)
>>>> +			goto config_release;
>>>
>>> Older versions of Xen won't have the hypercall
>>> PHYSDEVOP_pci_device_state_reset implemented. I think we should do
>>> something like:
>>>
>>> if (err && xen_pvh_domain())
>>>     goto config_release;
>>>
>>>
>>> Or even:
>>>
>>> if (xen_pvh_domain()) {
>>>     err = xen_reset_device_state(dev);
>>>     if (err)
>>>         goto config_release;
>>> }
>>>
>>> depending on whether we want to call xen_reset_device_state also for PV
>>> guests or not. I am assuming we don't want to error out on failure such
>>> as -ENOENT for PV guests.
>> Yes, only for PVH dom0, I will add the condition in next version. Thank you!
> 
> We will want to call xen_reset_device_state() for Arm dom0, too, so checking xen_pvh_domain() alone is not sufficient. I suggest instead to check !xen_pv_domain().
I am not using Arm. But is Arm dom0 not a PVH type dom0?

> 
>>
>>>
>>>
>>>>  		pci_restore_state(dev);
>>>>  	}
>>>>  	/* Now disable the device (this also ensures some private device
>>>> diff --git a/include/xen/interface/physdev.h b/include/xen/interface/physdev.h
>>>> index a237af867873..231526f80f6c 100644
>>>> --- a/include/xen/interface/physdev.h
>>>> +++ b/include/xen/interface/physdev.h
>>>> @@ -263,6 +263,8 @@ struct physdev_pci_device {
>>>>      uint8_t devfn;
>>>>  };
>>>>  
>>>> +#define PHYSDEVOP_pci_device_state_reset     32
>>>> +
>>>>  #define PHYSDEVOP_DBGP_RESET_PREPARE    1
>>>>  #define PHYSDEVOP_DBGP_RESET_DONE       2
>>>>  
>>>> diff --git a/include/xen/pci.h b/include/xen/pci.h
>>>> index b8337cf85fd1..b2e2e856efd6 100644
>>>> --- a/include/xen/pci.h
>>>> +++ b/include/xen/pci.h
>>>> @@ -4,10 +4,16 @@
>>>>  #define __XEN_PCI_H__
>>>>  
>>>>  #if defined(CONFIG_XEN_DOM0)
>>>> +int xen_reset_device_state(const struct pci_dev *dev);
>>>>  int xen_find_device_domain_owner(struct pci_dev *dev);
>>>>  int xen_register_device_domain_owner(struct pci_dev *dev, uint16_t domain);
>>>>  int xen_unregister_device_domain_owner(struct pci_dev *dev);
>>>>  #else
>>>> +static inline int xen_reset_device_state(const struct pci_dev *dev)
>>>> +{
>>>> +	return -1;
>>>> +}
>>>> +
>>>>  static inline int xen_find_device_domain_owner(struct pci_dev *dev)
>>>>  {
>>>>  	return -1;
>>>> -- 
>>>> 2.34.1
>>>>
>>

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC KERNEL PATCH v2 1/3] xen/pci: Add xen_reset_device_state function
  2023-12-04  3:25         ` Chen, Jiqian
@ 2023-12-04  3:45           ` Stewart Hildebrand
  0 siblings, 0 replies; 49+ messages in thread
From: Stewart Hildebrand @ 2023-12-04  3:45 UTC (permalink / raw)
  To: Chen, Jiqian, Stefano Stabellini
  Cc: Juergen Gross, Oleksandr Tyshchenko, Thomas Gleixner,
	Boris Ostrovsky, Rafael J . Wysocki, Len Brown, Bjorn Helgaas,
	Roger Pau Monné,
	xen-devel, linux-kernel, linux-acpi, Stabellini, Stefano,
	Deucher, Alexander, Koenig, Christian, Ragiadakou, Xenia, Huang,
	Honglei1, Zhang, Julia, Huang, Ray

On 12/3/23 22:25, Chen, Jiqian wrote:
> Hi Stewart,
> 
> On 2023/11/30 23:03, Stewart Hildebrand wrote:
>> On 11/30/23 02:03, Chen, Jiqian wrote:
>>>
>>> On 2023/11/30 11:46, Stefano Stabellini wrote:
>>>> On Fri, 24 Nov 2023, Jiqian Chen wrote:
>>>>> When device on dom0 side has been reset, the vpci on Xen side
>>>>> won't get notification, so that the cached state in vpci is
>>>>> all out of date with the real device state.
>>>>> To solve that problem, this patch add a function to clear all
>>>>> vpci device state when device is reset on dom0 side.
>>>>>
>>>>> And call that function in pcistub_init_device. Because when
>>>>> we use "pci-assignable-add" to assign a passthrough device in
>>>>> Xen, it will reset passthrough device and the vpci state will
>>>>> out of date, and then device will fail to restore bar state.
>>>>>
>>>>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>>>>> Signed-off-by: Huang Rui <ray.huang@amd.com>
>>>>> ---
>>>>>  drivers/xen/pci.c                  | 12 ++++++++++++
>>>>>  drivers/xen/xen-pciback/pci_stub.c |  3 +++
>>>>>  include/xen/interface/physdev.h    |  2 ++
>>>>>  include/xen/pci.h                  |  6 ++++++
>>>>>  4 files changed, 23 insertions(+)
>>>>>
>>>>> diff --git a/drivers/xen/pci.c b/drivers/xen/pci.c
>>>>> index 72d4e3f193af..e9b30bc09139 100644
>>>>> --- a/drivers/xen/pci.c
>>>>> +++ b/drivers/xen/pci.c
>>>>> @@ -177,6 +177,18 @@ static int xen_remove_device(struct device *dev)
>>>>>  	return r;
>>>>>  }
>>>>>  
>>>>> +int xen_reset_device_state(const struct pci_dev *dev)
>>>>> +{
>>>>> +	struct physdev_pci_device device = {
>>>>> +		.seg = pci_domain_nr(dev->bus),
>>>>> +		.bus = dev->bus->number,
>>>>> +		.devfn = dev->devfn
>>>>> +	};
>>>>> +
>>>>> +	return HYPERVISOR_physdev_op(PHYSDEVOP_pci_device_state_reset, &device);
>>>>> +}
>>>>> +EXPORT_SYMBOL_GPL(xen_reset_device_state);
>>>>> +
>>>>>  static int xen_pci_notifier(struct notifier_block *nb,
>>>>>  			    unsigned long action, void *data)
>>>>>  {
>>>>> diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
>>>>> index e34b623e4b41..5a96b6c66c07 100644
>>>>> --- a/drivers/xen/xen-pciback/pci_stub.c
>>>>> +++ b/drivers/xen/xen-pciback/pci_stub.c
>>>>> @@ -421,6 +421,9 @@ static int pcistub_init_device(struct pci_dev *dev)
>>>>>  	else {
>>>>>  		dev_dbg(&dev->dev, "resetting (FLR, D3, etc) the device\n");
>>>>>  		__pci_reset_function_locked(dev);
>>>>> +		err = xen_reset_device_state(dev);
>>>>> +		if (err)
>>>>> +			goto config_release;
>>>>
>>>> Older versions of Xen won't have the hypercall
>>>> PHYSDEVOP_pci_device_state_reset implemented. I think we should do
>>>> something like:
>>>>
>>>> if (err && xen_pvh_domain())
>>>>     goto config_release;
>>>>
>>>>
>>>> Or even:
>>>>
>>>> if (xen_pvh_domain()) {
>>>>     err = xen_reset_device_state(dev);
>>>>     if (err)
>>>>         goto config_release;
>>>> }
>>>>
>>>> depending on whether we want to call xen_reset_device_state also for PV
>>>> guests or not. I am assuming we don't want to error out on failure such
>>>> as -ENOENT for PV guests.
>>> Yes, only for PVH dom0, I will add the condition in next version. Thank you!
>>
>> We will want to call xen_reset_device_state() for Arm dom0, too, so checking xen_pvh_domain() alone is not sufficient. I suggest instead to check !xen_pv_domain().
> I am not using Arm. But is Arm dom0 not a PVH type dom0?

Apparently not, Arm dom0 appears to be a xen_hvm_domain() from the kernel's perspective.

> 
>>
>>>
>>>>
>>>>
>>>>>  		pci_restore_state(dev);
>>>>>  	}
>>>>>  	/* Now disable the device (this also ensures some private device
>>>>> diff --git a/include/xen/interface/physdev.h b/include/xen/interface/physdev.h
>>>>> index a237af867873..231526f80f6c 100644
>>>>> --- a/include/xen/interface/physdev.h
>>>>> +++ b/include/xen/interface/physdev.h
>>>>> @@ -263,6 +263,8 @@ struct physdev_pci_device {
>>>>>      uint8_t devfn;
>>>>>  };
>>>>>  
>>>>> +#define PHYSDEVOP_pci_device_state_reset     32
>>>>> +
>>>>>  #define PHYSDEVOP_DBGP_RESET_PREPARE    1
>>>>>  #define PHYSDEVOP_DBGP_RESET_DONE       2
>>>>>  
>>>>> diff --git a/include/xen/pci.h b/include/xen/pci.h
>>>>> index b8337cf85fd1..b2e2e856efd6 100644
>>>>> --- a/include/xen/pci.h
>>>>> +++ b/include/xen/pci.h
>>>>> @@ -4,10 +4,16 @@
>>>>>  #define __XEN_PCI_H__
>>>>>  
>>>>>  #if defined(CONFIG_XEN_DOM0)
>>>>> +int xen_reset_device_state(const struct pci_dev *dev);
>>>>>  int xen_find_device_domain_owner(struct pci_dev *dev);
>>>>>  int xen_register_device_domain_owner(struct pci_dev *dev, uint16_t domain);
>>>>>  int xen_unregister_device_domain_owner(struct pci_dev *dev);
>>>>>  #else
>>>>> +static inline int xen_reset_device_state(const struct pci_dev *dev)
>>>>> +{
>>>>> +	return -1;
>>>>> +}
>>>>> +
>>>>>  static inline int xen_find_device_domain_owner(struct pci_dev *dev)
>>>>>  {
>>>>>  	return -1;
>>>>> -- 
>>>>> 2.34.1
>>>>>
>>>
> 

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC KERNEL PATCH v2 1/3] xen/pci: Add xen_reset_device_state function
  2023-11-24 10:31 ` [RFC KERNEL PATCH v2 1/3] xen/pci: Add xen_reset_device_state function Jiqian Chen
  2023-11-30  3:46   ` Stefano Stabellini
@ 2023-12-04  7:58   ` Thomas Gleixner
  2023-12-04  8:49     ` Chen, Jiqian
  1 sibling, 1 reply; 49+ messages in thread
From: Thomas Gleixner @ 2023-12-04  7:58 UTC (permalink / raw)
  To: Jiqian Chen, Juergen Gross, Stefano Stabellini,
	Oleksandr Tyshchenko, Boris Ostrovsky, Rafael J . Wysocki,
	Len Brown, Bjorn Helgaas, Roger Pau Monné,
	xen-devel, linux-kernel, linux-acpi
  Cc: Stefano Stabellini, Alex Deucher, Christian Koenig,
	Stewart Hildebrand, Xenia Ragiadakou, Honglei Huang, Julia Zhang,
	Huang Rui, Jiqian Chen, Huang Rui

On Fri, Nov 24 2023 at 18:31, Jiqian Chen wrote:
> When device on dom0 side has been reset, the vpci on Xen side
> won't get notification, so that the cached state in vpci is
> all out of date with the real device state.
> To solve that problem, this patch add a function to clear all

Please get rid of 'this patch' all over the series.

# grep -r 'This patch' Documentation/process/

> vpci device state when device is reset on dom0 side.
>
> And call that function in pcistub_init_device. Because when
> we use "pci-assignable-add" to assign a passthrough device in
> Xen, it will reset passthrough device and the vpci state will
> out of date, and then device will fail to restore bar state.
>
> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> Signed-off-by: Huang Rui <ray.huang@amd.com>

This Signed-off-by chain is incorrect.

Documentation/process/submitting-patches.rst has a full chapter about
S-O-B and the correct usage.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC KERNEL PATCH v2 2/3] xen/pvh: Unmask irq for passthrough device in PVH dom0
  2023-11-24 10:31 ` [RFC KERNEL PATCH v2 2/3] xen/pvh: Unmask irq for passthrough device in PVH dom0 Jiqian Chen
                     ` (2 preceding siblings ...)
  2023-11-30  3:53   ` Stefano Stabellini
@ 2023-12-04  8:13   ` Thomas Gleixner
  2023-12-05  7:03     ` Chen, Jiqian
  3 siblings, 1 reply; 49+ messages in thread
From: Thomas Gleixner @ 2023-12-04  8:13 UTC (permalink / raw)
  To: Jiqian Chen, Juergen Gross, Stefano Stabellini,
	Oleksandr Tyshchenko, Boris Ostrovsky, Rafael J . Wysocki,
	Len Brown, Bjorn Helgaas, Roger Pau Monné,
	xen-devel, linux-kernel, linux-acpi
  Cc: Stefano Stabellini, Alex Deucher, Christian Koenig,
	Stewart Hildebrand, Xenia Ragiadakou, Honglei Huang, Julia Zhang,
	Huang Rui, Jiqian Chen, Huang Rui

On Fri, Nov 24 2023 at 18:31, Jiqian Chen wrote:
> diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
> index 5a96b6c66c07..b83d02bcc76c 100644
> --- a/drivers/xen/xen-pciback/pci_stub.c
> +++ b/drivers/xen/xen-pciback/pci_stub.c
> @@ -357,6 +357,7 @@ static int pcistub_match(struct pci_dev *dev)
>  static int pcistub_init_device(struct pci_dev *dev)
>  {
>  	struct xen_pcibk_dev_data *dev_data;
> +	struct irq_desc *desc = NULL;
>  	int err = 0;
>  
>  	dev_dbg(&dev->dev, "initializing...\n");
> @@ -399,6 +400,12 @@ static int pcistub_init_device(struct pci_dev *dev)
>  	if (err)
>  		goto config_release;
>  
> +	if (xen_initial_domain() && xen_pvh_domain()) {
> +		if (dev->irq <= 0 || !(desc = irq_to_desc(dev->irq)))

Driver code has absolutely no business to access irq_desc.

> +			goto config_release;
> +		unmask_irq(desc);

Or to invoke any internal function.

> --- a/kernel/irq/chip.c
> +++ b/kernel/irq/chip.c
> @@ -439,6 +439,7 @@ void unmask_irq(struct irq_desc *desc)
>  		irq_state_clr_masked(desc);
>  	}
>  }
> +EXPORT_SYMBOL_GPL(unmask_irq);

Not going to happen.

> --- a/kernel/irq/irqdesc.c
> +++ b/kernel/irq/irqdesc.c
> @@ -380,7 +380,7 @@ struct irq_desc *irq_to_desc(unsigned int irq)
>  {
>  	return mtree_load(&sparse_irqs, irq);
>  }
> -#ifdef CONFIG_KVM_BOOK3S_64_HV_MODULE
> +#if defined CONFIG_KVM_BOOK3S_64_HV_MODULE || defined CONFIG_XEN_PVH

Neither that.

This all smells badly like a XEN internal issue and we are not going to
hack around it by exposing interrupt internals.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC KERNEL PATCH v2 1/3] xen/pci: Add xen_reset_device_state function
  2023-12-04  7:58   ` Thomas Gleixner
@ 2023-12-04  8:49     ` Chen, Jiqian
  2023-12-04 21:31       ` Stefano Stabellini
  0 siblings, 1 reply; 49+ messages in thread
From: Chen, Jiqian @ 2023-12-04  8:49 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko,
	Boris Ostrovsky, Rafael J . Wysocki, Len Brown, Bjorn Helgaas,
	Roger Pau Monné,
	xen-devel, linux-kernel, linux-acpi, Stabellini, Stefano,
	Deucher, Alexander, Koenig, Christian, Hildebrand, Stewart,
	Ragiadakou, Xenia, Huang, Honglei1, Zhang, Julia, Huang, Ray,
	Chen, Jiqian


On 2023/12/4 15:58, Thomas Gleixner wrote:
> On Fri, Nov 24 2023 at 18:31, Jiqian Chen wrote:
>> When device on dom0 side has been reset, the vpci on Xen side
>> won't get notification, so that the cached state in vpci is
>> all out of date with the real device state.
>> To solve that problem, this patch add a function to clear all
> 
> Please get rid of 'this patch' all over the series.
> 
> # grep -r 'This patch' Documentation/process/
Thanks. I will remove "this patch" or "I/we" in next version.

> 
>> vpci device state when device is reset on dom0 side.
>>
>> And call that function in pcistub_init_device. Because when
>> we use "pci-assignable-add" to assign a passthrough device in
>> Xen, it will reset passthrough device and the vpci state will
>> out of date, and then device will fail to restore bar state.
>>
>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>> Signed-off-by: Huang Rui <ray.huang@amd.com>
> 
> This Signed-off-by chain is incorrect.
> 
> Documentation/process/submitting-patches.rst has a full chapter about
> S-O-B and the correct usage.
I am the author of this series of patches, and Huang Rui transported the v1 to upstream. And now I transport v2. I am not aware that the SOB chain is incorrect.
Do you have any suggestions?

> 
> Thanks,
> 
>         tglx

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC KERNEL PATCH v2 2/3] xen/pvh: Unmask irq for passthrough device in PVH dom0
  2023-12-02  3:37           ` Stefano Stabellini
@ 2023-12-04 10:28             ` Roger Pau Monné
  2023-12-04 22:19               ` Stefano Stabellini
  2023-12-05  6:46               ` Chen, Jiqian
  0 siblings, 2 replies; 49+ messages in thread
From: Roger Pau Monné @ 2023-12-04 10:28 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Jiqian Chen, Juergen Gross, Oleksandr Tyshchenko,
	Thomas Gleixner, Boris Ostrovsky, Rafael J . Wysocki, Len Brown,
	Bjorn Helgaas, xen-devel, linux-kernel, linux-acpi,
	Stefano Stabellini, Alex Deucher, Christian Koenig,
	Stewart Hildebrand, Xenia Ragiadakou, Honglei Huang, Julia Zhang,
	Huang Rui

On Fri, Dec 01, 2023 at 07:37:55PM -0800, Stefano Stabellini wrote:
> On Fri, 1 Dec 2023, Roger Pau Monné wrote:
> > On Thu, Nov 30, 2023 at 07:15:17PM -0800, Stefano Stabellini wrote:
> > > On Thu, 30 Nov 2023, Roger Pau Monné wrote:
> > > > On Wed, Nov 29, 2023 at 07:53:59PM -0800, Stefano Stabellini wrote:
> > > > > On Fri, 24 Nov 2023, Jiqian Chen wrote:
> > > > > > This patch is to solve two problems we encountered when we try to
> > > > > > passthrough a device to hvm domU base on Xen PVH dom0.
> > > > > > 
> > > > > > First, hvm guest will alloc a pirq and irq for a passthrough device
> > > > > > by using gsi, before that, the gsi must first has a mapping in dom0,
> > > > > > see Xen code pci_add_dm_done->xc_domain_irq_permission, it will call
> > > > > > into Xen and check whether dom0 has the mapping. See
> > > > > > XEN_DOMCTL_irq_permission->pirq_access_permitted, "current" is PVH
> > > > > > dom0 and it return irq is 0, and then return -EPERM.
> > > > > > This is because the passthrough device doesn't do PHYSDEVOP_map_pirq
> > > > > > when thay are enabled.
> > > > > > 
> > > > > > Second, in PVH dom0, the gsi of a passthrough device doesn't get
> > > > > > registered, but gsi must be configured for it to be able to be
> > > > > > mapped into a domU.
> > > > > > 
> > > > > > After searching codes, we can find map_pirq and register_gsi will be
> > > > > > done in function vioapic_write_redirent->vioapic_hwdom_map_gsi when
> > > > > > the gsi(aka ioapic's pin) is unmasked in PVH dom0. So the problems
> > > > > > can be conclude to that the gsi of a passthrough device doesn't be
> > > > > > unmasked.
> > > > > > 
> > > > > > To solve the unmaske problem, this patch call the unmask_irq when we
> > > > > > assign a device to be passthrough. So that the gsi can get registered
> > > > > > and mapped in PVH dom0.
> > > > > 
> > > > > 
> > > > > Roger, this seems to be more of a Xen issue than a Linux issue. Why do
> > > > > we need the unmask check in Xen? Couldn't we just do:
> > > > > 
> > > > > 
> > > > > diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
> > > > > index 4e40d3609a..df262a4a18 100644
> > > > > --- a/xen/arch/x86/hvm/vioapic.c
> > > > > +++ b/xen/arch/x86/hvm/vioapic.c
> > > > > @@ -287,7 +287,7 @@ static void vioapic_write_redirent(
> > > > >              hvm_dpci_eoi(d, gsi);
> > > > >      }
> > > > >  
> > > > > -    if ( is_hardware_domain(d) && unmasked )
> > > > > +    if ( is_hardware_domain(d) )
> > > > >      {
> > > > >          /*
> > > > >           * NB: don't call vioapic_hwdom_map_gsi while holding hvm.irq_lock
> > > > 
> > > > There are some issues with this approach.
> > > > 
> > > > mp_register_gsi() will only setup the trigger and polarity of the
> > > > IO-APIC pin once, so we do so once the guest unmask the pin in order
> > > > to assert that the configuration is the intended one.  A guest is
> > > > allowed to write all kind of nonsense stuff to the IO-APIC RTE, but
> > > > that doesn't take effect unless the pin is unmasked.
> > > > 
> > > > Overall the question would be whether we have any guarantees that
> > > > the hardware domain has properly configured the pin, even if it's not
> > > > using it itself (as it hasn't been unmasked).
> > > > 
> > > > IIRC PCI legacy interrupts are level triggered and low polarity, so we
> > > > could configure any pins that are not setup at bind time?
> > > 
> > > That could work.
> > > 
> > > Another idea is to move only the call to allocate_and_map_gsi_pirq at
> > > bind time? That might be enough to pass a pirq_access_permitted check.
> > 
> > Maybe, albeit that would change the behavior of XEN_DOMCTL_bind_pt_irq
> > just for PT_IRQ_TYPE_PCI and only when called from a PVH dom0 (as the
> > parameter would be a GSI instead of a previously mapped IRQ).  Such
> > difference just for PT_IRQ_TYPE_PCI is slightly weird - if we go that
> > route I would recommend that we instead introduce a new dmop that has
> > this syntax regardless of the domain type it's called from.
> 
> Looking at the code it is certainly a bit confusing. My point was that
> we don't need to wait until polarity and trigger are set appropriately
> to allow Dom0 to pass successfully a pirq_access_permitted() check. Xen
> should be able to figure out that Dom0 is permitted pirq access.

The logic is certainly not straightforward, and it could benefit from
some comments.

The irq permissions are a bit special, in that they get setup when the
IRQ is mapped.

The problem however is not so much with IRQ permissions, that we can
indeed sort out internally in Xen.  Such check in dom0 has the side
effect of preventing the IRQ from being assigned to a domU without the
hardware source being properly configured AFAICT.

> 
> So the idea was to move the call to allocate_and_map_gsi_pirq() earlier
> somewhere because allocate_and_map_gsi_pirq doesn't require trigger or
> polarity to be configured to work. But the suggestion of doing it a
> "bind time" (meaning: XEN_DOMCTL_bind_pt_irq) was a bad idea.
> 
> But maybe we can find another location, maybe within
> xen/arch/x86/hvm/vioapic.c, to call allocate_and_map_gsi_pirq() before
> trigger and polarity are set and before the interrupt is unmasked.
> 
> Then we change the implementation of vioapic_hwdom_map_gsi to skip the
> call to allocate_and_map_gsi_pirq, because by the time
> vioapic_hwdom_map_gsi we assume that allocate_and_map_gsi_pirq had
> already been done.

But then we would end up in a situation where the
pirq_access_permitted() check will pass, but the IO-APIC pin won't be
configured, which I think it's not what we want.

One option would be to allow mp_register_gsi() to be called multiple
times, and update the IO-APIC pin configuration as long as the pin is
not unmasked.  That would propagate each dom0 RTE update to the
underlying IO-APIC.  However such approach relies on dom0 configuring
all possible IO-APIC pins, even if no device on dom0 is using them, I
think it's not a very reliable option.

Another option would be to modify the toolstack to setup the GSI
itself using the PHYSDEVOP_setup_gsi hypercall.  As said in a previous
email, since we only care about PCI device passthrough the legacy INTx
should always be level triggered and low polarity.

> I am not familiar with vioapic.c but to give you an idea of what I was
> thinking:
> 
> 
> diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
> index 4e40d3609a..16d56fe851 100644
> --- a/xen/arch/x86/hvm/vioapic.c
> +++ b/xen/arch/x86/hvm/vioapic.c
> @@ -189,14 +189,6 @@ static int vioapic_hwdom_map_gsi(unsigned int gsi, unsigned int trig,
>          return ret;
>      }
>  
> -    ret = allocate_and_map_gsi_pirq(currd, pirq, &pirq);
> -    if ( ret )
> -    {
> -        gprintk(XENLOG_WARNING, "vioapic: error mapping GSI %u: %d\n",
> -                 gsi, ret);
> -        return ret;
> -    }
> -
>      pcidevs_lock();
>      ret = pt_irq_create_bind(currd, &pt_irq_bind);
>      if ( ret )
> @@ -287,6 +279,17 @@ static void vioapic_write_redirent(
>              hvm_dpci_eoi(d, gsi);
>      }
>  
> +    if ( is_hardware_domain(d) ) 
> +    {
> +        int pirq = gsi, ret;
> +        ret = allocate_and_map_gsi_pirq(currd, pirq, &pirq);
> +        if ( ret )
> +        {
> +            gprintk(XENLOG_WARNING, "vioapic: error mapping GSI %u: %d\n",
> +                    gsi, ret);
> +            return ret;
> +        }
> +    }
>      if ( is_hardware_domain(d) && unmasked )
>      {
>          /*

As said above, such approach relies on dom0 writing to the IO-APIC RTE
of likely each IO-APIC pin, which is IMO not quite reliable.  In there
are two different issues here that need to be fixed for PVH dom0:

 - Fix the XEN_DOMCTL_irq_permission pirq_access_permitted() call to
   succeed for a PVH dom0, even if dom0 is not using the GSI itself.

 - Configure IO-APIC pins for PCI interrupts even if dom0 is not using
   the IO-APIC pin itself.

First one needs to be fixed internally in Xen, second one will require
the toolstack to issue an extra hypercall in order to ensure the
IO-APIC pin is properly configured.

Thanks, Roger.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC KERNEL PATCH v2 1/3] xen/pci: Add xen_reset_device_state function
  2023-12-04  8:49     ` Chen, Jiqian
@ 2023-12-04 21:31       ` Stefano Stabellini
  2023-12-05  6:50         ` Chen, Jiqian
                           ` (2 more replies)
  0 siblings, 3 replies; 49+ messages in thread
From: Stefano Stabellini @ 2023-12-04 21:31 UTC (permalink / raw)
  To: Chen, Jiqian
  Cc: Thomas Gleixner, Juergen Gross, Stefano Stabellini,
	Oleksandr Tyshchenko, Boris Ostrovsky, Rafael J . Wysocki,
	Len Brown, Bjorn Helgaas, Roger Pau Monné,
	xen-devel, linux-kernel, linux-acpi, Stabellini, Stefano,
	Deucher, Alexander, Koenig, Christian, Hildebrand, Stewart,
	Ragiadakou, Xenia, Huang, Honglei1, Zhang, Julia, Huang, Ray

On Mon, 3 Dec 2023, Chen, Jiqian wrote:
> >> vpci device state when device is reset on dom0 side.
> >>
> >> And call that function in pcistub_init_device. Because when
> >> we use "pci-assignable-add" to assign a passthrough device in
> >> Xen, it will reset passthrough device and the vpci state will
> >> out of date, and then device will fail to restore bar state.
> >>
> >> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> >> Signed-off-by: Huang Rui <ray.huang@amd.com>
> > 
> > This Signed-off-by chain is incorrect.
> > 
> > Documentation/process/submitting-patches.rst has a full chapter about
> > S-O-B and the correct usage.
> I am the author of this series of patches, and Huang Rui transported the v1 to upstream. And now I transport v2. I am not aware that the SOB chain is incorrect.
> Do you have any suggestions?

I think he means that your Signed-off-by should be the second one of the
two as you are the one submitting the patch to the LKML

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC KERNEL PATCH v2 2/3] xen/pvh: Unmask irq for passthrough device in PVH dom0
  2023-12-04 10:28             ` Roger Pau Monné
@ 2023-12-04 22:19               ` Stefano Stabellini
  2023-12-05  9:19                 ` Roger Pau Monné
  2023-12-05  6:46               ` Chen, Jiqian
  1 sibling, 1 reply; 49+ messages in thread
From: Stefano Stabellini @ 2023-12-04 22:19 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Stefano Stabellini, Jiqian Chen, Juergen Gross,
	Oleksandr Tyshchenko, Thomas Gleixner, Boris Ostrovsky,
	Rafael J . Wysocki, Len Brown, Bjorn Helgaas, xen-devel,
	linux-kernel, linux-acpi, Stefano Stabellini, Alex Deucher,
	Christian Koenig, Stewart Hildebrand, Xenia Ragiadakou,
	Honglei Huang, Julia Zhang, Huang Rui

[-- Attachment #1: Type: text/plain, Size: 9083 bytes --]

On Mon, 4 Dec 2023, Roger Pau Monné wrote:
> On Fri, Dec 01, 2023 at 07:37:55PM -0800, Stefano Stabellini wrote:
> > On Fri, 1 Dec 2023, Roger Pau Monné wrote:
> > > On Thu, Nov 30, 2023 at 07:15:17PM -0800, Stefano Stabellini wrote:
> > > > On Thu, 30 Nov 2023, Roger Pau Monné wrote:
> > > > > On Wed, Nov 29, 2023 at 07:53:59PM -0800, Stefano Stabellini wrote:
> > > > > > On Fri, 24 Nov 2023, Jiqian Chen wrote:
> > > > > > > This patch is to solve two problems we encountered when we try to
> > > > > > > passthrough a device to hvm domU base on Xen PVH dom0.
> > > > > > > 
> > > > > > > First, hvm guest will alloc a pirq and irq for a passthrough device
> > > > > > > by using gsi, before that, the gsi must first has a mapping in dom0,
> > > > > > > see Xen code pci_add_dm_done->xc_domain_irq_permission, it will call
> > > > > > > into Xen and check whether dom0 has the mapping. See
> > > > > > > XEN_DOMCTL_irq_permission->pirq_access_permitted, "current" is PVH
> > > > > > > dom0 and it return irq is 0, and then return -EPERM.
> > > > > > > This is because the passthrough device doesn't do PHYSDEVOP_map_pirq
> > > > > > > when thay are enabled.
> > > > > > > 
> > > > > > > Second, in PVH dom0, the gsi of a passthrough device doesn't get
> > > > > > > registered, but gsi must be configured for it to be able to be
> > > > > > > mapped into a domU.
> > > > > > > 
> > > > > > > After searching codes, we can find map_pirq and register_gsi will be
> > > > > > > done in function vioapic_write_redirent->vioapic_hwdom_map_gsi when
> > > > > > > the gsi(aka ioapic's pin) is unmasked in PVH dom0. So the problems
> > > > > > > can be conclude to that the gsi of a passthrough device doesn't be
> > > > > > > unmasked.
> > > > > > > 
> > > > > > > To solve the unmaske problem, this patch call the unmask_irq when we
> > > > > > > assign a device to be passthrough. So that the gsi can get registered
> > > > > > > and mapped in PVH dom0.
> > > > > > 
> > > > > > 
> > > > > > Roger, this seems to be more of a Xen issue than a Linux issue. Why do
> > > > > > we need the unmask check in Xen? Couldn't we just do:
> > > > > > 
> > > > > > 
> > > > > > diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
> > > > > > index 4e40d3609a..df262a4a18 100644
> > > > > > --- a/xen/arch/x86/hvm/vioapic.c
> > > > > > +++ b/xen/arch/x86/hvm/vioapic.c
> > > > > > @@ -287,7 +287,7 @@ static void vioapic_write_redirent(
> > > > > >              hvm_dpci_eoi(d, gsi);
> > > > > >      }
> > > > > >  
> > > > > > -    if ( is_hardware_domain(d) && unmasked )
> > > > > > +    if ( is_hardware_domain(d) )
> > > > > >      {
> > > > > >          /*
> > > > > >           * NB: don't call vioapic_hwdom_map_gsi while holding hvm.irq_lock
> > > > > 
> > > > > There are some issues with this approach.
> > > > > 
> > > > > mp_register_gsi() will only setup the trigger and polarity of the
> > > > > IO-APIC pin once, so we do so once the guest unmask the pin in order
> > > > > to assert that the configuration is the intended one.  A guest is
> > > > > allowed to write all kind of nonsense stuff to the IO-APIC RTE, but
> > > > > that doesn't take effect unless the pin is unmasked.
> > > > > 
> > > > > Overall the question would be whether we have any guarantees that
> > > > > the hardware domain has properly configured the pin, even if it's not
> > > > > using it itself (as it hasn't been unmasked).
> > > > > 
> > > > > IIRC PCI legacy interrupts are level triggered and low polarity, so we
> > > > > could configure any pins that are not setup at bind time?
> > > > 
> > > > That could work.
> > > > 
> > > > Another idea is to move only the call to allocate_and_map_gsi_pirq at
> > > > bind time? That might be enough to pass a pirq_access_permitted check.
> > > 
> > > Maybe, albeit that would change the behavior of XEN_DOMCTL_bind_pt_irq
> > > just for PT_IRQ_TYPE_PCI and only when called from a PVH dom0 (as the
> > > parameter would be a GSI instead of a previously mapped IRQ).  Such
> > > difference just for PT_IRQ_TYPE_PCI is slightly weird - if we go that
> > > route I would recommend that we instead introduce a new dmop that has
> > > this syntax regardless of the domain type it's called from.
> > 
> > Looking at the code it is certainly a bit confusing. My point was that
> > we don't need to wait until polarity and trigger are set appropriately
> > to allow Dom0 to pass successfully a pirq_access_permitted() check. Xen
> > should be able to figure out that Dom0 is permitted pirq access.
> 
> The logic is certainly not straightforward, and it could benefit from
> some comments.
> 
> The irq permissions are a bit special, in that they get setup when the
> IRQ is mapped.
> 
> The problem however is not so much with IRQ permissions, that we can
> indeed sort out internally in Xen.  Such check in dom0 has the side
> effect of preventing the IRQ from being assigned to a domU without the
> hardware source being properly configured AFAICT.

Now I understand why you made a comment previously about Xen having to
configure trigger and polarity for these interrupts on its own.


> > So the idea was to move the call to allocate_and_map_gsi_pirq() earlier
> > somewhere because allocate_and_map_gsi_pirq doesn't require trigger or
> > polarity to be configured to work. But the suggestion of doing it a
> > "bind time" (meaning: XEN_DOMCTL_bind_pt_irq) was a bad idea.
> > 
> > But maybe we can find another location, maybe within
> > xen/arch/x86/hvm/vioapic.c, to call allocate_and_map_gsi_pirq() before
> > trigger and polarity are set and before the interrupt is unmasked.
> > 
> > Then we change the implementation of vioapic_hwdom_map_gsi to skip the
> > call to allocate_and_map_gsi_pirq, because by the time
> > vioapic_hwdom_map_gsi we assume that allocate_and_map_gsi_pirq had
> > already been done.
> 
> But then we would end up in a situation where the
> pirq_access_permitted() check will pass, but the IO-APIC pin won't be
> configured, which I think it's not what we want.
> 
> One option would be to allow mp_register_gsi() to be called multiple
> times, and update the IO-APIC pin configuration as long as the pin is
> not unmasked.  That would propagate each dom0 RTE update to the
> underlying IO-APIC.  However such approach relies on dom0 configuring
> all possible IO-APIC pins, even if no device on dom0 is using them, I
> think it's not a very reliable option.
> 
> Another option would be to modify the toolstack to setup the GSI
> itself using the PHYSDEVOP_setup_gsi hypercall.  As said in a previous
> email, since we only care about PCI device passthrough the legacy INTx
> should always be level triggered and low polarity.
> 
> > I am not familiar with vioapic.c but to give you an idea of what I was
> > thinking:
> > 
> > 
> > diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
> > index 4e40d3609a..16d56fe851 100644
> > --- a/xen/arch/x86/hvm/vioapic.c
> > +++ b/xen/arch/x86/hvm/vioapic.c
> > @@ -189,14 +189,6 @@ static int vioapic_hwdom_map_gsi(unsigned int gsi, unsigned int trig,
> >          return ret;
> >      }
> >  
> > -    ret = allocate_and_map_gsi_pirq(currd, pirq, &pirq);
> > -    if ( ret )
> > -    {
> > -        gprintk(XENLOG_WARNING, "vioapic: error mapping GSI %u: %d\n",
> > -                 gsi, ret);
> > -        return ret;
> > -    }
> > -
> >      pcidevs_lock();
> >      ret = pt_irq_create_bind(currd, &pt_irq_bind);
> >      if ( ret )
> > @@ -287,6 +279,17 @@ static void vioapic_write_redirent(
> >              hvm_dpci_eoi(d, gsi);
> >      }
> >  
> > +    if ( is_hardware_domain(d) ) 
> > +    {
> > +        int pirq = gsi, ret;
> > +        ret = allocate_and_map_gsi_pirq(currd, pirq, &pirq);
> > +        if ( ret )
> > +        {
> > +            gprintk(XENLOG_WARNING, "vioapic: error mapping GSI %u: %d\n",
> > +                    gsi, ret);
> > +            return ret;
> > +        }
> > +    }
> >      if ( is_hardware_domain(d) && unmasked )
> >      {
> >          /*
> 
> As said above, such approach relies on dom0 writing to the IO-APIC RTE
> of likely each IO-APIC pin, which is IMO not quite reliable.  In there
> are two different issues here that need to be fixed for PVH dom0:
> 
>  - Fix the XEN_DOMCTL_irq_permission pirq_access_permitted() call to
>    succeed for a PVH dom0, even if dom0 is not using the GSI itself.

Yes makes sense


>  - Configure IO-APIC pins for PCI interrupts even if dom0 is not using
>    the IO-APIC pin itself.
> 
> First one needs to be fixed internally in Xen, second one will require
> the toolstack to issue an extra hypercall in order to ensure the
> IO-APIC pin is properly configured.
 
On ARM, Xen doesn't need to wait for dom0 to configure interrupts
correctly. Xen configures them all on its own at boot based on Device
Tree information. I guess it is not possible to do the same on x86? If
not, then I can see why we would need 1 extra toolstack hypercall for
that (or to bundle the operation of configuring IO-APIC pins together
with an existing toolstack hypercall).

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC KERNEL PATCH v2 2/3] xen/pvh: Unmask irq for passthrough device in PVH dom0
  2023-12-04 10:28             ` Roger Pau Monné
  2023-12-04 22:19               ` Stefano Stabellini
@ 2023-12-05  6:46               ` Chen, Jiqian
  1 sibling, 0 replies; 49+ messages in thread
From: Chen, Jiqian @ 2023-12-05  6:46 UTC (permalink / raw)
  To: Roger Pau Monné, Stefano Stabellini
  Cc: Juergen Gross, Oleksandr Tyshchenko, Thomas Gleixner,
	Boris Ostrovsky, Rafael J . Wysocki, Len Brown, Bjorn Helgaas,
	xen-devel, linux-kernel, linux-acpi, Stabellini, Stefano,
	Deucher, Alexander, Koenig, Christian, Hildebrand, Stewart,
	Ragiadakou, Xenia, Huang, Honglei1, Zhang, Julia, Huang, Ray,
	Chen, Jiqian

On 2023/12/4 18:28, Roger Pau Monné wrote:
> On Fri, Dec 01, 2023 at 07:37:55PM -0800, Stefano Stabellini wrote:
>> On Fri, 1 Dec 2023, Roger Pau Monné wrote:
>>> On Thu, Nov 30, 2023 at 07:15:17PM -0800, Stefano Stabellini wrote:
>>>> On Thu, 30 Nov 2023, Roger Pau Monné wrote:
>>>>> On Wed, Nov 29, 2023 at 07:53:59PM -0800, Stefano Stabellini wrote:
>>>>>> On Fri, 24 Nov 2023, Jiqian Chen wrote:
>>>>>>> This patch is to solve two problems we encountered when we try to
>>>>>>> passthrough a device to hvm domU base on Xen PVH dom0.
>>>>>>>
>>>>>>> First, hvm guest will alloc a pirq and irq for a passthrough device
>>>>>>> by using gsi, before that, the gsi must first has a mapping in dom0,
>>>>>>> see Xen code pci_add_dm_done->xc_domain_irq_permission, it will call
>>>>>>> into Xen and check whether dom0 has the mapping. See
>>>>>>> XEN_DOMCTL_irq_permission->pirq_access_permitted, "current" is PVH
>>>>>>> dom0 and it return irq is 0, and then return -EPERM.
>>>>>>> This is because the passthrough device doesn't do PHYSDEVOP_map_pirq
>>>>>>> when thay are enabled.
>>>>>>>
>>>>>>> Second, in PVH dom0, the gsi of a passthrough device doesn't get
>>>>>>> registered, but gsi must be configured for it to be able to be
>>>>>>> mapped into a domU.
>>>>>>>
>>>>>>> After searching codes, we can find map_pirq and register_gsi will be
>>>>>>> done in function vioapic_write_redirent->vioapic_hwdom_map_gsi when
>>>>>>> the gsi(aka ioapic's pin) is unmasked in PVH dom0. So the problems
>>>>>>> can be conclude to that the gsi of a passthrough device doesn't be
>>>>>>> unmasked.
>>>>>>>
>>>>>>> To solve the unmaske problem, this patch call the unmask_irq when we
>>>>>>> assign a device to be passthrough. So that the gsi can get registered
>>>>>>> and mapped in PVH dom0.
>>>>>>
>>>>>>
>>>>>> Roger, this seems to be more of a Xen issue than a Linux issue. Why do
>>>>>> we need the unmask check in Xen? Couldn't we just do:
>>>>>>
>>>>>>
>>>>>> diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
>>>>>> index 4e40d3609a..df262a4a18 100644
>>>>>> --- a/xen/arch/x86/hvm/vioapic.c
>>>>>> +++ b/xen/arch/x86/hvm/vioapic.c
>>>>>> @@ -287,7 +287,7 @@ static void vioapic_write_redirent(
>>>>>>              hvm_dpci_eoi(d, gsi);
>>>>>>      }
>>>>>>  
>>>>>> -    if ( is_hardware_domain(d) && unmasked )
>>>>>> +    if ( is_hardware_domain(d) )
>>>>>>      {
>>>>>>          /*
>>>>>>           * NB: don't call vioapic_hwdom_map_gsi while holding hvm.irq_lock
>>>>>
>>>>> There are some issues with this approach.
>>>>>
>>>>> mp_register_gsi() will only setup the trigger and polarity of the
>>>>> IO-APIC pin once, so we do so once the guest unmask the pin in order
>>>>> to assert that the configuration is the intended one.  A guest is
>>>>> allowed to write all kind of nonsense stuff to the IO-APIC RTE, but
>>>>> that doesn't take effect unless the pin is unmasked.
>>>>>
>>>>> Overall the question would be whether we have any guarantees that
>>>>> the hardware domain has properly configured the pin, even if it's not
>>>>> using it itself (as it hasn't been unmasked).
>>>>>
>>>>> IIRC PCI legacy interrupts are level triggered and low polarity, so we
>>>>> could configure any pins that are not setup at bind time?
>>>>
>>>> That could work.
>>>>
>>>> Another idea is to move only the call to allocate_and_map_gsi_pirq at
>>>> bind time? That might be enough to pass a pirq_access_permitted check.
>>>
>>> Maybe, albeit that would change the behavior of XEN_DOMCTL_bind_pt_irq
>>> just for PT_IRQ_TYPE_PCI and only when called from a PVH dom0 (as the
>>> parameter would be a GSI instead of a previously mapped IRQ).  Such
>>> difference just for PT_IRQ_TYPE_PCI is slightly weird - if we go that
>>> route I would recommend that we instead introduce a new dmop that has
>>> this syntax regardless of the domain type it's called from.
>>
>> Looking at the code it is certainly a bit confusing. My point was that
>> we don't need to wait until polarity and trigger are set appropriately
>> to allow Dom0 to pass successfully a pirq_access_permitted() check. Xen
>> should be able to figure out that Dom0 is permitted pirq access.
> 
> The logic is certainly not straightforward, and it could benefit from
> some comments.
> 
> The irq permissions are a bit special, in that they get setup when the
> IRQ is mapped.
> 
> The problem however is not so much with IRQ permissions, that we can
> indeed sort out internally in Xen.  Such check in dom0 has the side
> effect of preventing the IRQ from being assigned to a domU without the
> hardware source being properly configured AFAICT.
> 
>>
>> So the idea was to move the call to allocate_and_map_gsi_pirq() earlier
>> somewhere because allocate_and_map_gsi_pirq doesn't require trigger or
>> polarity to be configured to work. But the suggestion of doing it a
>> "bind time" (meaning: XEN_DOMCTL_bind_pt_irq) was a bad idea.
>>
>> But maybe we can find another location, maybe within
>> xen/arch/x86/hvm/vioapic.c, to call allocate_and_map_gsi_pirq() before
>> trigger and polarity are set and before the interrupt is unmasked.
>>
>> Then we change the implementation of vioapic_hwdom_map_gsi to skip the
>> call to allocate_and_map_gsi_pirq, because by the time
>> vioapic_hwdom_map_gsi we assume that allocate_and_map_gsi_pirq had
>> already been done.
> 
> But then we would end up in a situation where the
> pirq_access_permitted() check will pass, but the IO-APIC pin won't be
> configured, which I think it's not what we want.
> 
> One option would be to allow mp_register_gsi() to be called multiple
> times, and update the IO-APIC pin configuration as long as the pin is
> not unmasked.  That would propagate each dom0 RTE update to the
> underlying IO-APIC.  However such approach relies on dom0 configuring
> all possible IO-APIC pins, even if no device on dom0 is using them, I
> think it's not a very reliable option.
> 
> Another option would be to modify the toolstack to setup the GSI
> itself using the PHYSDEVOP_setup_gsi hypercall.  As said in a previous
> email, since we only care about PCI device passthrough the legacy INTx
> should always be level triggered and low polarity.

I am thinking if we can do PHYSDEVOP_map_pirq and PHYSDEVOP_setup_gsi only for passthrough devices(in function pcistub_init_device).
Then it can pass permission check and setup gsi without affecting other devices that do not use IO-APIC pins.
What do you think?

> 
>> I am not familiar with vioapic.c but to give you an idea of what I was
>> thinking:
>>
>>
>> diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
>> index 4e40d3609a..16d56fe851 100644
>> --- a/xen/arch/x86/hvm/vioapic.c
>> +++ b/xen/arch/x86/hvm/vioapic.c
>> @@ -189,14 +189,6 @@ static int vioapic_hwdom_map_gsi(unsigned int gsi, unsigned int trig,
>>          return ret;
>>      }
>>  
>> -    ret = allocate_and_map_gsi_pirq(currd, pirq, &pirq);
>> -    if ( ret )
>> -    {
>> -        gprintk(XENLOG_WARNING, "vioapic: error mapping GSI %u: %d\n",
>> -                 gsi, ret);
>> -        return ret;
>> -    }
>> -
>>      pcidevs_lock();
>>      ret = pt_irq_create_bind(currd, &pt_irq_bind);
>>      if ( ret )
>> @@ -287,6 +279,17 @@ static void vioapic_write_redirent(
>>              hvm_dpci_eoi(d, gsi);
>>      }
>>  
>> +    if ( is_hardware_domain(d) ) 
>> +    {
>> +        int pirq = gsi, ret;
>> +        ret = allocate_and_map_gsi_pirq(currd, pirq, &pirq);
>> +        if ( ret )
>> +        {
>> +            gprintk(XENLOG_WARNING, "vioapic: error mapping GSI %u: %d\n",
>> +                    gsi, ret);
>> +            return ret;
>> +        }
>> +    }
>>      if ( is_hardware_domain(d) && unmasked )
>>      {
>>          /*
> 
> As said above, such approach relies on dom0 writing to the IO-APIC RTE
> of likely each IO-APIC pin, which is IMO not quite reliable.  In there
> are two different issues here that need to be fixed for PVH dom0:
> 
>  - Fix the XEN_DOMCTL_irq_permission pirq_access_permitted() call to
>    succeed for a PVH dom0, even if dom0 is not using the GSI itself.
> 
>  - Configure IO-APIC pins for PCI interrupts even if dom0 is not using
>    the IO-APIC pin itself.
> 
> First one needs to be fixed internally in Xen, second one will require
> the toolstack to issue an extra hypercall in order to ensure the
> IO-APIC pin is properly configured.
> 
> Thanks, Roger.

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC KERNEL PATCH v2 1/3] xen/pci: Add xen_reset_device_state function
  2023-12-04 21:31       ` Stefano Stabellini
@ 2023-12-05  6:50         ` Chen, Jiqian
  2023-12-05  8:10         ` Jan Beulich
  2023-12-05 17:02         ` Thomas Gleixner
  2 siblings, 0 replies; 49+ messages in thread
From: Chen, Jiqian @ 2023-12-05  6:50 UTC (permalink / raw)
  To: Stefano Stabellini, Thomas Gleixner
  Cc: Juergen Gross, Oleksandr Tyshchenko, Boris Ostrovsky,
	Rafael J . Wysocki, Len Brown, Bjorn Helgaas,
	Roger Pau Monné,
	xen-devel, linux-kernel, linux-acpi, Stabellini, Stefano,
	Deucher, Alexander, Koenig, Christian, Hildebrand, Stewart,
	Ragiadakou, Xenia, Huang, Honglei1, Zhang, Julia, Huang, Ray,
	Chen, Jiqian

On 2023/12/5 05:31, Stefano Stabellini wrote:
> On Mon, 3 Dec 2023, Chen, Jiqian wrote:
>>>> vpci device state when device is reset on dom0 side.
>>>>
>>>> And call that function in pcistub_init_device. Because when
>>>> we use "pci-assignable-add" to assign a passthrough device in
>>>> Xen, it will reset passthrough device and the vpci state will
>>>> out of date, and then device will fail to restore bar state.
>>>>
>>>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>>>> Signed-off-by: Huang Rui <ray.huang@amd.com>
>>>
>>> This Signed-off-by chain is incorrect.
>>>
>>> Documentation/process/submitting-patches.rst has a full chapter about
>>> S-O-B and the correct usage.
>> I am the author of this series of patches, and Huang Rui transported the v1 to upstream. And now I transport v2. I am not aware that the SOB chain is incorrect.
>> Do you have any suggestions?
> 
> I think he means that your Signed-off-by should be the second one of the
> two as you are the one submitting the patch to the LKML
Got it. Thanks Stefano! I will adjust the sequence in next version.

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC KERNEL PATCH v2 2/3] xen/pvh: Unmask irq for passthrough device in PVH dom0
  2023-12-04  8:13   ` Thomas Gleixner
@ 2023-12-05  7:03     ` Chen, Jiqian
  0 siblings, 0 replies; 49+ messages in thread
From: Chen, Jiqian @ 2023-12-05  7:03 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Stefano Stabellini, Juergen Gross, Oleksandr Tyshchenko,
	Boris Ostrovsky, Rafael J . Wysocki, Len Brown, Bjorn Helgaas,
	Roger Pau Monné,
	xen-devel, linux-kernel, linux-acpi, Stabellini, Stefano,
	Deucher, Alexander, Koenig, Christian, Hildebrand, Stewart,
	Ragiadakou, Xenia, Huang, Honglei1, Zhang, Julia, Huang, Ray,
	Chen, Jiqian

Hi Thomas Gleixner,

Thank you for review, and you are right, it seems more like a XEN internal issue. We are discussing it and maybe will fix it in Xen code next version.

On 2023/12/4 16:13, Thomas Gleixner wrote:
> On Fri, Nov 24 2023 at 18:31, Jiqian Chen wrote:
>> diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
>> index 5a96b6c66c07..b83d02bcc76c 100644
>> --- a/drivers/xen/xen-pciback/pci_stub.c
>> +++ b/drivers/xen/xen-pciback/pci_stub.c
>> @@ -357,6 +357,7 @@ static int pcistub_match(struct pci_dev *dev)
>>  static int pcistub_init_device(struct pci_dev *dev)
>>  {
>>  	struct xen_pcibk_dev_data *dev_data;
>> +	struct irq_desc *desc = NULL;
>>  	int err = 0;
>>  
>>  	dev_dbg(&dev->dev, "initializing...\n");
>> @@ -399,6 +400,12 @@ static int pcistub_init_device(struct pci_dev *dev)
>>  	if (err)
>>  		goto config_release;
>>  
>> +	if (xen_initial_domain() && xen_pvh_domain()) {
>> +		if (dev->irq <= 0 || !(desc = irq_to_desc(dev->irq)))
> 
> Driver code has absolutely no business to access irq_desc.
> 
>> +			goto config_release;
>> +		unmask_irq(desc);
> 
> Or to invoke any internal function.
> 
>> --- a/kernel/irq/chip.c
>> +++ b/kernel/irq/chip.c
>> @@ -439,6 +439,7 @@ void unmask_irq(struct irq_desc *desc)
>>  		irq_state_clr_masked(desc);
>>  	}
>>  }
>> +EXPORT_SYMBOL_GPL(unmask_irq);
> 
> Not going to happen.
> 
>> --- a/kernel/irq/irqdesc.c
>> +++ b/kernel/irq/irqdesc.c
>> @@ -380,7 +380,7 @@ struct irq_desc *irq_to_desc(unsigned int irq)
>>  {
>>  	return mtree_load(&sparse_irqs, irq);
>>  }
>> -#ifdef CONFIG_KVM_BOOK3S_64_HV_MODULE
>> +#if defined CONFIG_KVM_BOOK3S_64_HV_MODULE || defined CONFIG_XEN_PVH
> 
> Neither that.
> 
> This all smells badly like a XEN internal issue and we are not going to
> hack around it by exposing interrupt internals.
> 
> Thanks,
> 
>         tglx

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC KERNEL PATCH v2 1/3] xen/pci: Add xen_reset_device_state function
  2023-12-04 21:31       ` Stefano Stabellini
  2023-12-05  6:50         ` Chen, Jiqian
@ 2023-12-05  8:10         ` Jan Beulich
  2023-12-05 17:02         ` Thomas Gleixner
  2 siblings, 0 replies; 49+ messages in thread
From: Jan Beulich @ 2023-12-05  8:10 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Roger Pau Monné,
	xen-devel, Hildebrand, Stewart, Huang, Ray, Chen, Jiqian

(reducing Cc list)

On 04.12.2023 22:31, Stefano Stabellini wrote:
> On Mon, 3 Dec 2023, Chen, Jiqian wrote:
>>>> vpci device state when device is reset on dom0 side.
>>>>
>>>> And call that function in pcistub_init_device. Because when
>>>> we use "pci-assignable-add" to assign a passthrough device in
>>>> Xen, it will reset passthrough device and the vpci state will
>>>> out of date, and then device will fail to restore bar state.
>>>>
>>>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>>>> Signed-off-by: Huang Rui <ray.huang@amd.com>
>>>
>>> This Signed-off-by chain is incorrect.
>>>
>>> Documentation/process/submitting-patches.rst has a full chapter about
>>> S-O-B and the correct usage.
>> I am the author of this series of patches, and Huang Rui transported the v1 to upstream. And now I transport v2. I am not aware that the SOB chain is incorrect.
>> Do you have any suggestions?
> 
> I think he means that your Signed-off-by should be the second one of the
> two as you are the one submitting the patch to the LKML

But that's not really correct either, as it then doesn't represent the
sequence of events. The first S-o-b normally is the (original) author's,
isn't it?

Jan


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC KERNEL PATCH v2 2/3] xen/pvh: Unmask irq for passthrough device in PVH dom0
  2023-12-04 22:19               ` Stefano Stabellini
@ 2023-12-05  9:19                 ` Roger Pau Monné
  2023-12-05  9:39                   ` Chen, Jiqian
  2023-12-05 10:32                   ` Jan Beulich
  0 siblings, 2 replies; 49+ messages in thread
From: Roger Pau Monné @ 2023-12-05  9:19 UTC (permalink / raw)
  To: Stefano Stabellini, Jan Beulich
  Cc: Jiqian Chen, Juergen Gross, Oleksandr Tyshchenko,
	Thomas Gleixner, Boris Ostrovsky, Rafael J . Wysocki, Len Brown,
	Bjorn Helgaas, xen-devel, linux-kernel, linux-acpi,
	Stefano Stabellini, Alex Deucher, Christian Koenig,
	Stewart Hildebrand, Xenia Ragiadakou, Honglei Huang, Julia Zhang,
	Huang Rui

On Mon, Dec 04, 2023 at 02:19:33PM -0800, Stefano Stabellini wrote:
> On Mon, 4 Dec 2023, Roger Pau Monné wrote:
> > On Fri, Dec 01, 2023 at 07:37:55PM -0800, Stefano Stabellini wrote:
> > > On Fri, 1 Dec 2023, Roger Pau Monné wrote:
> > > > On Thu, Nov 30, 2023 at 07:15:17PM -0800, Stefano Stabellini wrote:
> > > > > On Thu, 30 Nov 2023, Roger Pau Monné wrote:
> > > > > > On Wed, Nov 29, 2023 at 07:53:59PM -0800, Stefano Stabellini wrote:
> > > > > > > On Fri, 24 Nov 2023, Jiqian Chen wrote:
> > > > > > > > This patch is to solve two problems we encountered when we try to
> > > > > > > > passthrough a device to hvm domU base on Xen PVH dom0.
> > > > > > > > 
> > > > > > > > First, hvm guest will alloc a pirq and irq for a passthrough device
> > > > > > > > by using gsi, before that, the gsi must first has a mapping in dom0,
> > > > > > > > see Xen code pci_add_dm_done->xc_domain_irq_permission, it will call
> > > > > > > > into Xen and check whether dom0 has the mapping. See
> > > > > > > > XEN_DOMCTL_irq_permission->pirq_access_permitted, "current" is PVH
> > > > > > > > dom0 and it return irq is 0, and then return -EPERM.
> > > > > > > > This is because the passthrough device doesn't do PHYSDEVOP_map_pirq
> > > > > > > > when thay are enabled.
> > > > > > > > 
> > > > > > > > Second, in PVH dom0, the gsi of a passthrough device doesn't get
> > > > > > > > registered, but gsi must be configured for it to be able to be
> > > > > > > > mapped into a domU.
> > > > > > > > 
> > > > > > > > After searching codes, we can find map_pirq and register_gsi will be
> > > > > > > > done in function vioapic_write_redirent->vioapic_hwdom_map_gsi when
> > > > > > > > the gsi(aka ioapic's pin) is unmasked in PVH dom0. So the problems
> > > > > > > > can be conclude to that the gsi of a passthrough device doesn't be
> > > > > > > > unmasked.
> > > > > > > > 
> > > > > > > > To solve the unmaske problem, this patch call the unmask_irq when we
> > > > > > > > assign a device to be passthrough. So that the gsi can get registered
> > > > > > > > and mapped in PVH dom0.
> > > > > > > 
> > > > > > > 
> > > > > > > Roger, this seems to be more of a Xen issue than a Linux issue. Why do
> > > > > > > we need the unmask check in Xen? Couldn't we just do:
> > > > > > > 
> > > > > > > 
> > > > > > > diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
> > > > > > > index 4e40d3609a..df262a4a18 100644
> > > > > > > --- a/xen/arch/x86/hvm/vioapic.c
> > > > > > > +++ b/xen/arch/x86/hvm/vioapic.c
> > > > > > > @@ -287,7 +287,7 @@ static void vioapic_write_redirent(
> > > > > > >              hvm_dpci_eoi(d, gsi);
> > > > > > >      }
> > > > > > >  
> > > > > > > -    if ( is_hardware_domain(d) && unmasked )
> > > > > > > +    if ( is_hardware_domain(d) )
> > > > > > >      {
> > > > > > >          /*
> > > > > > >           * NB: don't call vioapic_hwdom_map_gsi while holding hvm.irq_lock
> > > > > > 
> > > > > > There are some issues with this approach.
> > > > > > 
> > > > > > mp_register_gsi() will only setup the trigger and polarity of the
> > > > > > IO-APIC pin once, so we do so once the guest unmask the pin in order
> > > > > > to assert that the configuration is the intended one.  A guest is
> > > > > > allowed to write all kind of nonsense stuff to the IO-APIC RTE, but
> > > > > > that doesn't take effect unless the pin is unmasked.
> > > > > > 
> > > > > > Overall the question would be whether we have any guarantees that
> > > > > > the hardware domain has properly configured the pin, even if it's not
> > > > > > using it itself (as it hasn't been unmasked).
> > > > > > 
> > > > > > IIRC PCI legacy interrupts are level triggered and low polarity, so we
> > > > > > could configure any pins that are not setup at bind time?
> > > > > 
> > > > > That could work.
> > > > > 
> > > > > Another idea is to move only the call to allocate_and_map_gsi_pirq at
> > > > > bind time? That might be enough to pass a pirq_access_permitted check.
> > > > 
> > > > Maybe, albeit that would change the behavior of XEN_DOMCTL_bind_pt_irq
> > > > just for PT_IRQ_TYPE_PCI and only when called from a PVH dom0 (as the
> > > > parameter would be a GSI instead of a previously mapped IRQ).  Such
> > > > difference just for PT_IRQ_TYPE_PCI is slightly weird - if we go that
> > > > route I would recommend that we instead introduce a new dmop that has
> > > > this syntax regardless of the domain type it's called from.
> > > 
> > > Looking at the code it is certainly a bit confusing. My point was that
> > > we don't need to wait until polarity and trigger are set appropriately
> > > to allow Dom0 to pass successfully a pirq_access_permitted() check. Xen
> > > should be able to figure out that Dom0 is permitted pirq access.
> > 
> > The logic is certainly not straightforward, and it could benefit from
> > some comments.
> > 
> > The irq permissions are a bit special, in that they get setup when the
> > IRQ is mapped.
> > 
> > The problem however is not so much with IRQ permissions, that we can
> > indeed sort out internally in Xen.  Such check in dom0 has the side
> > effect of preventing the IRQ from being assigned to a domU without the
> > hardware source being properly configured AFAICT.
> 
> Now I understand why you made a comment previously about Xen having to
> configure trigger and polarity for these interrupts on its own.
> 
> 
> > > So the idea was to move the call to allocate_and_map_gsi_pirq() earlier
> > > somewhere because allocate_and_map_gsi_pirq doesn't require trigger or
> > > polarity to be configured to work. But the suggestion of doing it a
> > > "bind time" (meaning: XEN_DOMCTL_bind_pt_irq) was a bad idea.
> > > 
> > > But maybe we can find another location, maybe within
> > > xen/arch/x86/hvm/vioapic.c, to call allocate_and_map_gsi_pirq() before
> > > trigger and polarity are set and before the interrupt is unmasked.
> > > 
> > > Then we change the implementation of vioapic_hwdom_map_gsi to skip the
> > > call to allocate_and_map_gsi_pirq, because by the time
> > > vioapic_hwdom_map_gsi we assume that allocate_and_map_gsi_pirq had
> > > already been done.
> > 
> > But then we would end up in a situation where the
> > pirq_access_permitted() check will pass, but the IO-APIC pin won't be
> > configured, which I think it's not what we want.
> > 
> > One option would be to allow mp_register_gsi() to be called multiple
> > times, and update the IO-APIC pin configuration as long as the pin is
> > not unmasked.  That would propagate each dom0 RTE update to the
> > underlying IO-APIC.  However such approach relies on dom0 configuring
> > all possible IO-APIC pins, even if no device on dom0 is using them, I
> > think it's not a very reliable option.
> > 
> > Another option would be to modify the toolstack to setup the GSI
> > itself using the PHYSDEVOP_setup_gsi hypercall.  As said in a previous
> > email, since we only care about PCI device passthrough the legacy INTx
> > should always be level triggered and low polarity.
> > 
> > > I am not familiar with vioapic.c but to give you an idea of what I was
> > > thinking:
> > > 
> > > 
> > > diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
> > > index 4e40d3609a..16d56fe851 100644
> > > --- a/xen/arch/x86/hvm/vioapic.c
> > > +++ b/xen/arch/x86/hvm/vioapic.c
> > > @@ -189,14 +189,6 @@ static int vioapic_hwdom_map_gsi(unsigned int gsi, unsigned int trig,
> > >          return ret;
> > >      }
> > >  
> > > -    ret = allocate_and_map_gsi_pirq(currd, pirq, &pirq);
> > > -    if ( ret )
> > > -    {
> > > -        gprintk(XENLOG_WARNING, "vioapic: error mapping GSI %u: %d\n",
> > > -                 gsi, ret);
> > > -        return ret;
> > > -    }
> > > -
> > >      pcidevs_lock();
> > >      ret = pt_irq_create_bind(currd, &pt_irq_bind);
> > >      if ( ret )
> > > @@ -287,6 +279,17 @@ static void vioapic_write_redirent(
> > >              hvm_dpci_eoi(d, gsi);
> > >      }
> > >  
> > > +    if ( is_hardware_domain(d) ) 
> > > +    {
> > > +        int pirq = gsi, ret;
> > > +        ret = allocate_and_map_gsi_pirq(currd, pirq, &pirq);
> > > +        if ( ret )
> > > +        {
> > > +            gprintk(XENLOG_WARNING, "vioapic: error mapping GSI %u: %d\n",
> > > +                    gsi, ret);
> > > +            return ret;
> > > +        }
> > > +    }
> > >      if ( is_hardware_domain(d) && unmasked )
> > >      {
> > >          /*
> > 
> > As said above, such approach relies on dom0 writing to the IO-APIC RTE
> > of likely each IO-APIC pin, which is IMO not quite reliable.  In there
> > are two different issues here that need to be fixed for PVH dom0:
> > 
> >  - Fix the XEN_DOMCTL_irq_permission pirq_access_permitted() call to
> >    succeed for a PVH dom0, even if dom0 is not using the GSI itself.
> 
> Yes makes sense
> 
> 
> >  - Configure IO-APIC pins for PCI interrupts even if dom0 is not using
> >    the IO-APIC pin itself.
> > 
> > First one needs to be fixed internally in Xen, second one will require
> > the toolstack to issue an extra hypercall in order to ensure the
> > IO-APIC pin is properly configured.
>  
> On ARM, Xen doesn't need to wait for dom0 to configure interrupts
> correctly. Xen configures them all on its own at boot based on Device
> Tree information. I guess it is not possible to do the same on x86?

No, not exactly.  There's some interrupt information in the ACPI MADT,
but that's just for very specific sources (Interrupt Source Override
Structures)

Then on AML devices can have resource descriptors that contain
information about how interrupts are setup.  However Xen is not able
to read any of this information on AML.

Legacy PCI interrupts are (always?) level triggered and low polarity,
because it's assumed that an interrupt source can be shared between
multiple devices.

I'm however not able to find any reference to this in the PCI spec,
hence I'm reluctant to take this for granted in Xen, and default all
GSIs >= 16 to such mode.

OTOH legacy PCI interrupts are not that used anymore, as almost all
devices will support MSI(-X) (because PCIe mandates it) and OSes
should prefer the latter.  SR-IOV VF don't even support legacy PCI
interrupts anymore.

> If
> not, then I can see why we would need 1 extra toolstack hypercall for
> that (or to bundle the operation of configuring IO-APIC pins together
> with an existing toolstack hypercall).

One suitable compromise would be to default unconfigured GSIs >= 16 to
level-triggered and low-polarity, as I would expect that to work in
almost all cases.  We can always introduce the usage of
PHYSDEVOP_setup_gsi later if required.

Maybe Jan has more input here, would you agree to defaulting non-ISA
GSIs to level-triggered, low-polarity in the absence of a specific
setup provided by dom0?

Thanks, Roger.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC KERNEL PATCH v2 2/3] xen/pvh: Unmask irq for passthrough device in PVH dom0
  2023-12-05  9:19                 ` Roger Pau Monné
@ 2023-12-05  9:39                   ` Chen, Jiqian
  2023-12-05 10:32                   ` Jan Beulich
  1 sibling, 0 replies; 49+ messages in thread
From: Chen, Jiqian @ 2023-12-05  9:39 UTC (permalink / raw)
  To: Roger Pau Monné, Stefano Stabellini, Jan Beulich
  Cc: Chen, Jiqian, Juergen Gross, Oleksandr Tyshchenko,
	Thomas Gleixner, Boris Ostrovsky, Rafael J . Wysocki, Len Brown,
	Bjorn Helgaas, xen-devel, linux-kernel, linux-acpi, Stabellini,
	Stefano, Deucher, Alexander, Koenig, Christian, Hildebrand,
	Stewart, Ragiadakou, Xenia, Huang, Honglei1, Zhang, Julia, Huang,
	Ray


On 2023/12/5 17:19, Roger Pau Monné wrote:
> On Mon, Dec 04, 2023 at 02:19:33PM -0800, Stefano Stabellini wrote:
>> On Mon, 4 Dec 2023, Roger Pau Monné wrote:
>>> On Fri, Dec 01, 2023 at 07:37:55PM -0800, Stefano Stabellini wrote:
>>>> On Fri, 1 Dec 2023, Roger Pau Monné wrote:
>>>>> On Thu, Nov 30, 2023 at 07:15:17PM -0800, Stefano Stabellini wrote:
>>>>>> On Thu, 30 Nov 2023, Roger Pau Monné wrote:
>>>>>>> On Wed, Nov 29, 2023 at 07:53:59PM -0800, Stefano Stabellini wrote:
>>>>>>>> On Fri, 24 Nov 2023, Jiqian Chen wrote:
>>>>>>>>> This patch is to solve two problems we encountered when we try to
>>>>>>>>> passthrough a device to hvm domU base on Xen PVH dom0.
>>>>>>>>>
>>>>>>>>> First, hvm guest will alloc a pirq and irq for a passthrough device
>>>>>>>>> by using gsi, before that, the gsi must first has a mapping in dom0,
>>>>>>>>> see Xen code pci_add_dm_done->xc_domain_irq_permission, it will call
>>>>>>>>> into Xen and check whether dom0 has the mapping. See
>>>>>>>>> XEN_DOMCTL_irq_permission->pirq_access_permitted, "current" is PVH
>>>>>>>>> dom0 and it return irq is 0, and then return -EPERM.
>>>>>>>>> This is because the passthrough device doesn't do PHYSDEVOP_map_pirq
>>>>>>>>> when thay are enabled.
>>>>>>>>>
>>>>>>>>> Second, in PVH dom0, the gsi of a passthrough device doesn't get
>>>>>>>>> registered, but gsi must be configured for it to be able to be
>>>>>>>>> mapped into a domU.
>>>>>>>>>
>>>>>>>>> After searching codes, we can find map_pirq and register_gsi will be
>>>>>>>>> done in function vioapic_write_redirent->vioapic_hwdom_map_gsi when
>>>>>>>>> the gsi(aka ioapic's pin) is unmasked in PVH dom0. So the problems
>>>>>>>>> can be conclude to that the gsi of a passthrough device doesn't be
>>>>>>>>> unmasked.
>>>>>>>>>
>>>>>>>>> To solve the unmaske problem, this patch call the unmask_irq when we
>>>>>>>>> assign a device to be passthrough. So that the gsi can get registered
>>>>>>>>> and mapped in PVH dom0.
>>>>>>>>
>>>>>>>>
>>>>>>>> Roger, this seems to be more of a Xen issue than a Linux issue. Why do
>>>>>>>> we need the unmask check in Xen? Couldn't we just do:
>>>>>>>>
>>>>>>>>
>>>>>>>> diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
>>>>>>>> index 4e40d3609a..df262a4a18 100644
>>>>>>>> --- a/xen/arch/x86/hvm/vioapic.c
>>>>>>>> +++ b/xen/arch/x86/hvm/vioapic.c
>>>>>>>> @@ -287,7 +287,7 @@ static void vioapic_write_redirent(
>>>>>>>>              hvm_dpci_eoi(d, gsi);
>>>>>>>>      }
>>>>>>>>  
>>>>>>>> -    if ( is_hardware_domain(d) && unmasked )
>>>>>>>> +    if ( is_hardware_domain(d) )
>>>>>>>>      {
>>>>>>>>          /*
>>>>>>>>           * NB: don't call vioapic_hwdom_map_gsi while holding hvm.irq_lock
>>>>>>>
>>>>>>> There are some issues with this approach.
>>>>>>>
>>>>>>> mp_register_gsi() will only setup the trigger and polarity of the
>>>>>>> IO-APIC pin once, so we do so once the guest unmask the pin in order
>>>>>>> to assert that the configuration is the intended one.  A guest is
>>>>>>> allowed to write all kind of nonsense stuff to the IO-APIC RTE, but
>>>>>>> that doesn't take effect unless the pin is unmasked.
>>>>>>>
>>>>>>> Overall the question would be whether we have any guarantees that
>>>>>>> the hardware domain has properly configured the pin, even if it's not
>>>>>>> using it itself (as it hasn't been unmasked).
>>>>>>>
>>>>>>> IIRC PCI legacy interrupts are level triggered and low polarity, so we
>>>>>>> could configure any pins that are not setup at bind time?
>>>>>>
>>>>>> That could work.
>>>>>>
>>>>>> Another idea is to move only the call to allocate_and_map_gsi_pirq at
>>>>>> bind time? That might be enough to pass a pirq_access_permitted check.
>>>>>
>>>>> Maybe, albeit that would change the behavior of XEN_DOMCTL_bind_pt_irq
>>>>> just for PT_IRQ_TYPE_PCI and only when called from a PVH dom0 (as the
>>>>> parameter would be a GSI instead of a previously mapped IRQ).  Such
>>>>> difference just for PT_IRQ_TYPE_PCI is slightly weird - if we go that
>>>>> route I would recommend that we instead introduce a new dmop that has
>>>>> this syntax regardless of the domain type it's called from.
>>>>
>>>> Looking at the code it is certainly a bit confusing. My point was that
>>>> we don't need to wait until polarity and trigger are set appropriately
>>>> to allow Dom0 to pass successfully a pirq_access_permitted() check. Xen
>>>> should be able to figure out that Dom0 is permitted pirq access.
>>>
>>> The logic is certainly not straightforward, and it could benefit from
>>> some comments.
>>>
>>> The irq permissions are a bit special, in that they get setup when the
>>> IRQ is mapped.
>>>
>>> The problem however is not so much with IRQ permissions, that we can
>>> indeed sort out internally in Xen.  Such check in dom0 has the side
>>> effect of preventing the IRQ from being assigned to a domU without the
>>> hardware source being properly configured AFAICT.
>>
>> Now I understand why you made a comment previously about Xen having to
>> configure trigger and polarity for these interrupts on its own.
>>
>>
>>>> So the idea was to move the call to allocate_and_map_gsi_pirq() earlier
>>>> somewhere because allocate_and_map_gsi_pirq doesn't require trigger or
>>>> polarity to be configured to work. But the suggestion of doing it a
>>>> "bind time" (meaning: XEN_DOMCTL_bind_pt_irq) was a bad idea.
>>>>
>>>> But maybe we can find another location, maybe within
>>>> xen/arch/x86/hvm/vioapic.c, to call allocate_and_map_gsi_pirq() before
>>>> trigger and polarity are set and before the interrupt is unmasked.
>>>>
>>>> Then we change the implementation of vioapic_hwdom_map_gsi to skip the
>>>> call to allocate_and_map_gsi_pirq, because by the time
>>>> vioapic_hwdom_map_gsi we assume that allocate_and_map_gsi_pirq had
>>>> already been done.
>>>
>>> But then we would end up in a situation where the
>>> pirq_access_permitted() check will pass, but the IO-APIC pin won't be
>>> configured, which I think it's not what we want.
>>>
>>> One option would be to allow mp_register_gsi() to be called multiple
>>> times, and update the IO-APIC pin configuration as long as the pin is
>>> not unmasked.  That would propagate each dom0 RTE update to the
>>> underlying IO-APIC.  However such approach relies on dom0 configuring
>>> all possible IO-APIC pins, even if no device on dom0 is using them, I
>>> think it's not a very reliable option.
>>>
>>> Another option would be to modify the toolstack to setup the GSI
>>> itself using the PHYSDEVOP_setup_gsi hypercall.  As said in a previous
>>> email, since we only care about PCI device passthrough the legacy INTx
>>> should always be level triggered and low polarity.
>>>
>>>> I am not familiar with vioapic.c but to give you an idea of what I was
>>>> thinking:
>>>>
>>>>
>>>> diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
>>>> index 4e40d3609a..16d56fe851 100644
>>>> --- a/xen/arch/x86/hvm/vioapic.c
>>>> +++ b/xen/arch/x86/hvm/vioapic.c
>>>> @@ -189,14 +189,6 @@ static int vioapic_hwdom_map_gsi(unsigned int gsi, unsigned int trig,
>>>>          return ret;
>>>>      }
>>>>  
>>>> -    ret = allocate_and_map_gsi_pirq(currd, pirq, &pirq);
>>>> -    if ( ret )
>>>> -    {
>>>> -        gprintk(XENLOG_WARNING, "vioapic: error mapping GSI %u: %d\n",
>>>> -                 gsi, ret);
>>>> -        return ret;
>>>> -    }
>>>> -
>>>>      pcidevs_lock();
>>>>      ret = pt_irq_create_bind(currd, &pt_irq_bind);
>>>>      if ( ret )
>>>> @@ -287,6 +279,17 @@ static void vioapic_write_redirent(
>>>>              hvm_dpci_eoi(d, gsi);
>>>>      }
>>>>  
>>>> +    if ( is_hardware_domain(d) ) 
>>>> +    {
>>>> +        int pirq = gsi, ret;
>>>> +        ret = allocate_and_map_gsi_pirq(currd, pirq, &pirq);
>>>> +        if ( ret )
>>>> +        {
>>>> +            gprintk(XENLOG_WARNING, "vioapic: error mapping GSI %u: %d\n",
>>>> +                    gsi, ret);
>>>> +            return ret;
>>>> +        }
>>>> +    }
>>>>      if ( is_hardware_domain(d) && unmasked )
>>>>      {
>>>>          /*
>>>
>>> As said above, such approach relies on dom0 writing to the IO-APIC RTE
>>> of likely each IO-APIC pin, which is IMO not quite reliable.  In there
>>> are two different issues here that need to be fixed for PVH dom0:
>>>
>>>  - Fix the XEN_DOMCTL_irq_permission pirq_access_permitted() call to
>>>    succeed for a PVH dom0, even if dom0 is not using the GSI itself.
>>
>> Yes makes sense
>>
>>
>>>  - Configure IO-APIC pins for PCI interrupts even if dom0 is not using
>>>    the IO-APIC pin itself.
>>>
>>> First one needs to be fixed internally in Xen, second one will require
>>> the toolstack to issue an extra hypercall in order to ensure the
>>> IO-APIC pin is properly configured.
>>  
>> On ARM, Xen doesn't need to wait for dom0 to configure interrupts
>> correctly. Xen configures them all on its own at boot based on Device
>> Tree information. I guess it is not possible to do the same on x86?
> 
> No, not exactly.  There's some interrupt information in the ACPI MADT,
> but that's just for very specific sources (Interrupt Source Override
> Structures)
> 
> Then on AML devices can have resource descriptors that contain
> information about how interrupts are setup.  However Xen is not able
> to read any of this information on AML.
> 
> Legacy PCI interrupts are (always?) level triggered and low polarity,
> because it's assumed that an interrupt source can be shared between
> multiple devices.
> 
> I'm however not able to find any reference to this in the PCI spec,
> hence I'm reluctant to take this for granted in Xen, and default all
> GSIs >= 16 to such mode.
> 
> OTOH legacy PCI interrupts are not that used anymore, as almost all
> devices will support MSI(-X) (because PCIe mandates it) and OSes
> should prefer the latter.  SR-IOV VF don't even support legacy PCI
> interrupts anymore.
> 
>> If
>> not, then I can see why we would need 1 extra toolstack hypercall for
>> that (or to bundle the operation of configuring IO-APIC pins together
>> with an existing toolstack hypercall).
> 
> One suitable compromise would be to default unconfigured GSIs >= 16 to
> level-triggered and low-polarity, as I would expect that to work in
> almost all cases.  We can always introduce the usage of
> PHYSDEVOP_setup_gsi later if required.
> 
> Maybe Jan has more input here, would you agree to defaulting non-ISA
> GSIs to level-triggered, low-polarity in the absence of a specific
> setup provided by dom0?
> 
> Thanks, Roger.

No intention to disturb if I am incorrect, just a little input. On dom0 PVH, when it enables devices, it will call acpi_pci_irq_enable, and in that function, its default trigger is level and polarity is low for pci interrupt.

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC KERNEL PATCH v2 2/3] xen/pvh: Unmask irq for passthrough device in PVH dom0
  2023-12-05  9:19                 ` Roger Pau Monné
  2023-12-05  9:39                   ` Chen, Jiqian
@ 2023-12-05 10:32                   ` Jan Beulich
  2023-12-06  6:07                     ` Chen, Jiqian
  1 sibling, 1 reply; 49+ messages in thread
From: Jan Beulich @ 2023-12-05 10:32 UTC (permalink / raw)
  To: Roger Pau Monné, Stefano Stabellini
  Cc: Jiqian Chen, Juergen Gross, Oleksandr Tyshchenko,
	Thomas Gleixner, Boris Ostrovsky, Rafael J . Wysocki, Len Brown,
	Bjorn Helgaas, xen-devel, linux-kernel, linux-acpi,
	Stefano Stabellini, Alex Deucher, Christian Koenig,
	Stewart Hildebrand, Xenia Ragiadakou, Honglei Huang, Julia Zhang,
	Huang Rui

On 05.12.2023 10:19, Roger Pau Monné wrote:
> On Mon, Dec 04, 2023 at 02:19:33PM -0800, Stefano Stabellini wrote:
>> On Mon, 4 Dec 2023, Roger Pau Monné wrote:
>>> On Fri, Dec 01, 2023 at 07:37:55PM -0800, Stefano Stabellini wrote:
>>>> On Fri, 1 Dec 2023, Roger Pau Monné wrote:
>>>>> On Thu, Nov 30, 2023 at 07:15:17PM -0800, Stefano Stabellini wrote:
>>>>>> On Thu, 30 Nov 2023, Roger Pau Monné wrote:
>>>>>>> On Wed, Nov 29, 2023 at 07:53:59PM -0800, Stefano Stabellini wrote:
>>>>>>>> On Fri, 24 Nov 2023, Jiqian Chen wrote:
>>>>>>>>> This patch is to solve two problems we encountered when we try to
>>>>>>>>> passthrough a device to hvm domU base on Xen PVH dom0.
>>>>>>>>>
>>>>>>>>> First, hvm guest will alloc a pirq and irq for a passthrough device
>>>>>>>>> by using gsi, before that, the gsi must first has a mapping in dom0,
>>>>>>>>> see Xen code pci_add_dm_done->xc_domain_irq_permission, it will call
>>>>>>>>> into Xen and check whether dom0 has the mapping. See
>>>>>>>>> XEN_DOMCTL_irq_permission->pirq_access_permitted, "current" is PVH
>>>>>>>>> dom0 and it return irq is 0, and then return -EPERM.
>>>>>>>>> This is because the passthrough device doesn't do PHYSDEVOP_map_pirq
>>>>>>>>> when thay are enabled.
>>>>>>>>>
>>>>>>>>> Second, in PVH dom0, the gsi of a passthrough device doesn't get
>>>>>>>>> registered, but gsi must be configured for it to be able to be
>>>>>>>>> mapped into a domU.
>>>>>>>>>
>>>>>>>>> After searching codes, we can find map_pirq and register_gsi will be
>>>>>>>>> done in function vioapic_write_redirent->vioapic_hwdom_map_gsi when
>>>>>>>>> the gsi(aka ioapic's pin) is unmasked in PVH dom0. So the problems
>>>>>>>>> can be conclude to that the gsi of a passthrough device doesn't be
>>>>>>>>> unmasked.
>>>>>>>>>
>>>>>>>>> To solve the unmaske problem, this patch call the unmask_irq when we
>>>>>>>>> assign a device to be passthrough. So that the gsi can get registered
>>>>>>>>> and mapped in PVH dom0.
>>>>>>>>
>>>>>>>>
>>>>>>>> Roger, this seems to be more of a Xen issue than a Linux issue. Why do
>>>>>>>> we need the unmask check in Xen? Couldn't we just do:
>>>>>>>>
>>>>>>>>
>>>>>>>> diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
>>>>>>>> index 4e40d3609a..df262a4a18 100644
>>>>>>>> --- a/xen/arch/x86/hvm/vioapic.c
>>>>>>>> +++ b/xen/arch/x86/hvm/vioapic.c
>>>>>>>> @@ -287,7 +287,7 @@ static void vioapic_write_redirent(
>>>>>>>>              hvm_dpci_eoi(d, gsi);
>>>>>>>>      }
>>>>>>>>  
>>>>>>>> -    if ( is_hardware_domain(d) && unmasked )
>>>>>>>> +    if ( is_hardware_domain(d) )
>>>>>>>>      {
>>>>>>>>          /*
>>>>>>>>           * NB: don't call vioapic_hwdom_map_gsi while holding hvm.irq_lock
>>>>>>>
>>>>>>> There are some issues with this approach.
>>>>>>>
>>>>>>> mp_register_gsi() will only setup the trigger and polarity of the
>>>>>>> IO-APIC pin once, so we do so once the guest unmask the pin in order
>>>>>>> to assert that the configuration is the intended one.  A guest is
>>>>>>> allowed to write all kind of nonsense stuff to the IO-APIC RTE, but
>>>>>>> that doesn't take effect unless the pin is unmasked.
>>>>>>>
>>>>>>> Overall the question would be whether we have any guarantees that
>>>>>>> the hardware domain has properly configured the pin, even if it's not
>>>>>>> using it itself (as it hasn't been unmasked).
>>>>>>>
>>>>>>> IIRC PCI legacy interrupts are level triggered and low polarity, so we
>>>>>>> could configure any pins that are not setup at bind time?
>>>>>>
>>>>>> That could work.
>>>>>>
>>>>>> Another idea is to move only the call to allocate_and_map_gsi_pirq at
>>>>>> bind time? That might be enough to pass a pirq_access_permitted check.
>>>>>
>>>>> Maybe, albeit that would change the behavior of XEN_DOMCTL_bind_pt_irq
>>>>> just for PT_IRQ_TYPE_PCI and only when called from a PVH dom0 (as the
>>>>> parameter would be a GSI instead of a previously mapped IRQ).  Such
>>>>> difference just for PT_IRQ_TYPE_PCI is slightly weird - if we go that
>>>>> route I would recommend that we instead introduce a new dmop that has
>>>>> this syntax regardless of the domain type it's called from.
>>>>
>>>> Looking at the code it is certainly a bit confusing. My point was that
>>>> we don't need to wait until polarity and trigger are set appropriately
>>>> to allow Dom0 to pass successfully a pirq_access_permitted() check. Xen
>>>> should be able to figure out that Dom0 is permitted pirq access.
>>>
>>> The logic is certainly not straightforward, and it could benefit from
>>> some comments.
>>>
>>> The irq permissions are a bit special, in that they get setup when the
>>> IRQ is mapped.
>>>
>>> The problem however is not so much with IRQ permissions, that we can
>>> indeed sort out internally in Xen.  Such check in dom0 has the side
>>> effect of preventing the IRQ from being assigned to a domU without the
>>> hardware source being properly configured AFAICT.
>>
>> Now I understand why you made a comment previously about Xen having to
>> configure trigger and polarity for these interrupts on its own.
>>
>>
>>>> So the idea was to move the call to allocate_and_map_gsi_pirq() earlier
>>>> somewhere because allocate_and_map_gsi_pirq doesn't require trigger or
>>>> polarity to be configured to work. But the suggestion of doing it a
>>>> "bind time" (meaning: XEN_DOMCTL_bind_pt_irq) was a bad idea.
>>>>
>>>> But maybe we can find another location, maybe within
>>>> xen/arch/x86/hvm/vioapic.c, to call allocate_and_map_gsi_pirq() before
>>>> trigger and polarity are set and before the interrupt is unmasked.
>>>>
>>>> Then we change the implementation of vioapic_hwdom_map_gsi to skip the
>>>> call to allocate_and_map_gsi_pirq, because by the time
>>>> vioapic_hwdom_map_gsi we assume that allocate_and_map_gsi_pirq had
>>>> already been done.
>>>
>>> But then we would end up in a situation where the
>>> pirq_access_permitted() check will pass, but the IO-APIC pin won't be
>>> configured, which I think it's not what we want.
>>>
>>> One option would be to allow mp_register_gsi() to be called multiple
>>> times, and update the IO-APIC pin configuration as long as the pin is
>>> not unmasked.  That would propagate each dom0 RTE update to the
>>> underlying IO-APIC.  However such approach relies on dom0 configuring
>>> all possible IO-APIC pins, even if no device on dom0 is using them, I
>>> think it's not a very reliable option.
>>>
>>> Another option would be to modify the toolstack to setup the GSI
>>> itself using the PHYSDEVOP_setup_gsi hypercall.  As said in a previous
>>> email, since we only care about PCI device passthrough the legacy INTx
>>> should always be level triggered and low polarity.
>>>
>>>> I am not familiar with vioapic.c but to give you an idea of what I was
>>>> thinking:
>>>>
>>>>
>>>> diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
>>>> index 4e40d3609a..16d56fe851 100644
>>>> --- a/xen/arch/x86/hvm/vioapic.c
>>>> +++ b/xen/arch/x86/hvm/vioapic.c
>>>> @@ -189,14 +189,6 @@ static int vioapic_hwdom_map_gsi(unsigned int gsi, unsigned int trig,
>>>>          return ret;
>>>>      }
>>>>  
>>>> -    ret = allocate_and_map_gsi_pirq(currd, pirq, &pirq);
>>>> -    if ( ret )
>>>> -    {
>>>> -        gprintk(XENLOG_WARNING, "vioapic: error mapping GSI %u: %d\n",
>>>> -                 gsi, ret);
>>>> -        return ret;
>>>> -    }
>>>> -
>>>>      pcidevs_lock();
>>>>      ret = pt_irq_create_bind(currd, &pt_irq_bind);
>>>>      if ( ret )
>>>> @@ -287,6 +279,17 @@ static void vioapic_write_redirent(
>>>>              hvm_dpci_eoi(d, gsi);
>>>>      }
>>>>  
>>>> +    if ( is_hardware_domain(d) ) 
>>>> +    {
>>>> +        int pirq = gsi, ret;
>>>> +        ret = allocate_and_map_gsi_pirq(currd, pirq, &pirq);
>>>> +        if ( ret )
>>>> +        {
>>>> +            gprintk(XENLOG_WARNING, "vioapic: error mapping GSI %u: %d\n",
>>>> +                    gsi, ret);
>>>> +            return ret;
>>>> +        }
>>>> +    }
>>>>      if ( is_hardware_domain(d) && unmasked )
>>>>      {
>>>>          /*
>>>
>>> As said above, such approach relies on dom0 writing to the IO-APIC RTE
>>> of likely each IO-APIC pin, which is IMO not quite reliable.  In there
>>> are two different issues here that need to be fixed for PVH dom0:
>>>
>>>  - Fix the XEN_DOMCTL_irq_permission pirq_access_permitted() call to
>>>    succeed for a PVH dom0, even if dom0 is not using the GSI itself.
>>
>> Yes makes sense
>>
>>
>>>  - Configure IO-APIC pins for PCI interrupts even if dom0 is not using
>>>    the IO-APIC pin itself.
>>>
>>> First one needs to be fixed internally in Xen, second one will require
>>> the toolstack to issue an extra hypercall in order to ensure the
>>> IO-APIC pin is properly configured.
>>  
>> On ARM, Xen doesn't need to wait for dom0 to configure interrupts
>> correctly. Xen configures them all on its own at boot based on Device
>> Tree information. I guess it is not possible to do the same on x86?
> 
> No, not exactly.  There's some interrupt information in the ACPI MADT,
> but that's just for very specific sources (Interrupt Source Override
> Structures)
> 
> Then on AML devices can have resource descriptors that contain
> information about how interrupts are setup.  However Xen is not able
> to read any of this information on AML.
> 
> Legacy PCI interrupts are (always?) level triggered and low polarity,
> because it's assumed that an interrupt source can be shared between
> multiple devices.

Except that as per what you said just in the earlier paragraph ACPI can
tell us otherwise.

> I'm however not able to find any reference to this in the PCI spec,
> hence I'm reluctant to take this for granted in Xen, and default all
> GSIs >= 16 to such mode.
> 
> OTOH legacy PCI interrupts are not that used anymore, as almost all
> devices will support MSI(-X) (because PCIe mandates it) and OSes
> should prefer the latter.  SR-IOV VF don't even support legacy PCI
> interrupts anymore.
> 
>> If
>> not, then I can see why we would need 1 extra toolstack hypercall for
>> that (or to bundle the operation of configuring IO-APIC pins together
>> with an existing toolstack hypercall).
> 
> One suitable compromise would be to default unconfigured GSIs >= 16 to
> level-triggered and low-polarity, as I would expect that to work in
> almost all cases.  We can always introduce the usage of
> PHYSDEVOP_setup_gsi later if required.
> 
> Maybe Jan has more input here, would you agree to defaulting non-ISA
> GSIs to level-triggered, low-polarity in the absence of a specific
> setup provided by dom0?

Well, such defaulting is an option, but in case it's wrong we might
end up with hard to diagnose issues. Personally I'd prefer if we
didn't take shortcuts here, i.e. if we followed what Dom0 is able
to read from ACPI.

Jan

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC KERNEL PATCH v2 1/3] xen/pci: Add xen_reset_device_state function
  2023-12-04 21:31       ` Stefano Stabellini
  2023-12-05  6:50         ` Chen, Jiqian
  2023-12-05  8:10         ` Jan Beulich
@ 2023-12-05 17:02         ` Thomas Gleixner
  2023-12-06  6:37           ` Chen, Jiqian
  2 siblings, 1 reply; 49+ messages in thread
From: Thomas Gleixner @ 2023-12-05 17:02 UTC (permalink / raw)
  To: Stefano Stabellini, Chen, Jiqian
  Cc: Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko,
	Boris Ostrovsky, Rafael J . Wysocki, Len Brown, Bjorn Helgaas,
	Roger Pau Monné,
	xen-devel, linux-kernel, linux-acpi, Stabellini, Stefano,
	Deucher, Alexander, Koenig, Christian, Hildebrand, Stewart,
	Ragiadakou, Xenia, Huang, Honglei1, Zhang, Julia, Huang, Ray

On Mon, Dec 04 2023 at 13:31, Stefano Stabellini wrote:
> On Mon, 3 Dec 2023, Chen, Jiqian wrote:
>> >> vpci device state when device is reset on dom0 side.
>> >>
>> >> And call that function in pcistub_init_device. Because when
>> >> we use "pci-assignable-add" to assign a passthrough device in
>> >> Xen, it will reset passthrough device and the vpci state will
>> >> out of date, and then device will fail to restore bar state.
>> >>
>> >> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>> >> Signed-off-by: Huang Rui <ray.huang@amd.com>
>> > 
>> > This Signed-off-by chain is incorrect.
>> > 
>> > Documentation/process/submitting-patches.rst has a full chapter about
>> > S-O-B and the correct usage.
>> I am the author of this series of patches, and Huang Rui transported the v1 to upstream. And now I transport v2. I am not aware that the SOB chain is incorrect.
>> Do you have any suggestions?
>
> I think he means that your Signed-off-by should be the second one of the
> two as you are the one submitting the patch to the LKML

No.

   Mailfrom: Jiqian Chen <Jiqian.Chen@amd.com>
   <body>

   Changelog-text

   Signed-off-by: Huang Rui <ray.huang@amd.com>
   Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>

is equally wrong because that would end up with Chen as author and Huang
as first S-O-B which is required to be the author's S-O-B

To make the above correct this would require:

   Mailfrom: Jiqian Chen <Jiqian.Chen@amd.com>
   <body>

   From: Huang Rui <ray.huang@amd.com>

   Changelog-text

   Signed-off-by: Huang Rui <ray.huang@amd.com>
   Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>

   which tells that Huang is the author and Chen is the 'transporter',
   which unfortunately does not reflect reality.

Or:

   Mailfrom: Jiqian Chen <Jiqian.Chen@amd.com>
   <body>

   Changelog-text

   Co-developed-by: Huang Rui <ray.huang@amd.com>
   Signed-off-by: Huang Rui <ray.huang@amd.com>
   Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>

   which tells that Checn is the author and Huang co-developed the
   patch, which might be true or not.


V1 which was sent by Huang has the ordering is correct:

   Mailfrom: Huang Rui <ray.huang@amd.com>
   <body>

   From: Jiqian Chen <Jiqian.Chen@amd.com>

   Changelog-text

   Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
   Signed-off-by: Huang Rui <ray.huang@amd.com>

   i.e. Chen authored and Huang transported

Now this V2 has not really much to do with V1 and is a new
implementation to solve the problem, which was authored by Chen, so
Huang is not involved at all if I understand correctly.

So what does his S-O-B mean here? Nothing...

It's very well documented how the whole S-O-B business works and it's
not really rocket science to get it straight.

It has a meaning and is not just for decoration purposes.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC KERNEL PATCH v2 2/3] xen/pvh: Unmask irq for passthrough device in PVH dom0
  2023-12-05 10:32                   ` Jan Beulich
@ 2023-12-06  6:07                     ` Chen, Jiqian
  2023-12-07  2:18                       ` Stefano Stabellini
  2023-12-11 15:45                       ` Roger Pau Monné
  0 siblings, 2 replies; 49+ messages in thread
From: Chen, Jiqian @ 2023-12-06  6:07 UTC (permalink / raw)
  To: Jan Beulich, Roger Pau Monné, Stefano Stabellini
  Cc: Chen, Jiqian, Juergen Gross, Oleksandr Tyshchenko,
	Thomas Gleixner, Boris Ostrovsky, Rafael J . Wysocki, Len Brown,
	Bjorn Helgaas, xen-devel, linux-kernel, linux-acpi, Stabellini,
	Stefano, Deucher, Alexander, Koenig, Christian, Hildebrand,
	Stewart, Ragiadakou, Xenia, Huang, Honglei1, Zhang, Julia, Huang,
	Ray

On 2023/12/5 18:32, Jan Beulich wrote:
> On 05.12.2023 10:19, Roger Pau Monné wrote:
>> On Mon, Dec 04, 2023 at 02:19:33PM -0800, Stefano Stabellini wrote:
>>> On Mon, 4 Dec 2023, Roger Pau Monné wrote:
>>>> On Fri, Dec 01, 2023 at 07:37:55PM -0800, Stefano Stabellini wrote:
>>>>> On Fri, 1 Dec 2023, Roger Pau Monné wrote:
>>>>>> On Thu, Nov 30, 2023 at 07:15:17PM -0800, Stefano Stabellini wrote:
>>>>>>> On Thu, 30 Nov 2023, Roger Pau Monné wrote:
>>>>>>>> On Wed, Nov 29, 2023 at 07:53:59PM -0800, Stefano Stabellini wrote:
>>>>>>>>> On Fri, 24 Nov 2023, Jiqian Chen wrote:
>>>>>>>>>> This patch is to solve two problems we encountered when we try to
>>>>>>>>>> passthrough a device to hvm domU base on Xen PVH dom0.
>>>>>>>>>>
>>>>>>>>>> First, hvm guest will alloc a pirq and irq for a passthrough device
>>>>>>>>>> by using gsi, before that, the gsi must first has a mapping in dom0,
>>>>>>>>>> see Xen code pci_add_dm_done->xc_domain_irq_permission, it will call
>>>>>>>>>> into Xen and check whether dom0 has the mapping. See
>>>>>>>>>> XEN_DOMCTL_irq_permission->pirq_access_permitted, "current" is PVH
>>>>>>>>>> dom0 and it return irq is 0, and then return -EPERM.
>>>>>>>>>> This is because the passthrough device doesn't do PHYSDEVOP_map_pirq
>>>>>>>>>> when thay are enabled.
>>>>>>>>>>
>>>>>>>>>> Second, in PVH dom0, the gsi of a passthrough device doesn't get
>>>>>>>>>> registered, but gsi must be configured for it to be able to be
>>>>>>>>>> mapped into a domU.
>>>>>>>>>>
>>>>>>>>>> After searching codes, we can find map_pirq and register_gsi will be
>>>>>>>>>> done in function vioapic_write_redirent->vioapic_hwdom_map_gsi when
>>>>>>>>>> the gsi(aka ioapic's pin) is unmasked in PVH dom0. So the problems
>>>>>>>>>> can be conclude to that the gsi of a passthrough device doesn't be
>>>>>>>>>> unmasked.
>>>>>>>>>>
>>>>>>>>>> To solve the unmaske problem, this patch call the unmask_irq when we
>>>>>>>>>> assign a device to be passthrough. So that the gsi can get registered
>>>>>>>>>> and mapped in PVH dom0.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Roger, this seems to be more of a Xen issue than a Linux issue. Why do
>>>>>>>>> we need the unmask check in Xen? Couldn't we just do:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
>>>>>>>>> index 4e40d3609a..df262a4a18 100644
>>>>>>>>> --- a/xen/arch/x86/hvm/vioapic.c
>>>>>>>>> +++ b/xen/arch/x86/hvm/vioapic.c
>>>>>>>>> @@ -287,7 +287,7 @@ static void vioapic_write_redirent(
>>>>>>>>>              hvm_dpci_eoi(d, gsi);
>>>>>>>>>      }
>>>>>>>>>  
>>>>>>>>> -    if ( is_hardware_domain(d) && unmasked )
>>>>>>>>> +    if ( is_hardware_domain(d) )
>>>>>>>>>      {
>>>>>>>>>          /*
>>>>>>>>>           * NB: don't call vioapic_hwdom_map_gsi while holding hvm.irq_lock
>>>>>>>>
>>>>>>>> There are some issues with this approach.
>>>>>>>>
>>>>>>>> mp_register_gsi() will only setup the trigger and polarity of the
>>>>>>>> IO-APIC pin once, so we do so once the guest unmask the pin in order
>>>>>>>> to assert that the configuration is the intended one.  A guest is
>>>>>>>> allowed to write all kind of nonsense stuff to the IO-APIC RTE, but
>>>>>>>> that doesn't take effect unless the pin is unmasked.
>>>>>>>>
>>>>>>>> Overall the question would be whether we have any guarantees that
>>>>>>>> the hardware domain has properly configured the pin, even if it's not
>>>>>>>> using it itself (as it hasn't been unmasked).
>>>>>>>>
>>>>>>>> IIRC PCI legacy interrupts are level triggered and low polarity, so we
>>>>>>>> could configure any pins that are not setup at bind time?
>>>>>>>
>>>>>>> That could work.
>>>>>>>
>>>>>>> Another idea is to move only the call to allocate_and_map_gsi_pirq at
>>>>>>> bind time? That might be enough to pass a pirq_access_permitted check.
>>>>>>
>>>>>> Maybe, albeit that would change the behavior of XEN_DOMCTL_bind_pt_irq
>>>>>> just for PT_IRQ_TYPE_PCI and only when called from a PVH dom0 (as the
>>>>>> parameter would be a GSI instead of a previously mapped IRQ).  Such
>>>>>> difference just for PT_IRQ_TYPE_PCI is slightly weird - if we go that
>>>>>> route I would recommend that we instead introduce a new dmop that has
>>>>>> this syntax regardless of the domain type it's called from.
>>>>>
>>>>> Looking at the code it is certainly a bit confusing. My point was that
>>>>> we don't need to wait until polarity and trigger are set appropriately
>>>>> to allow Dom0 to pass successfully a pirq_access_permitted() check. Xen
>>>>> should be able to figure out that Dom0 is permitted pirq access.
>>>>
>>>> The logic is certainly not straightforward, and it could benefit from
>>>> some comments.
>>>>
>>>> The irq permissions are a bit special, in that they get setup when the
>>>> IRQ is mapped.
>>>>
>>>> The problem however is not so much with IRQ permissions, that we can
>>>> indeed sort out internally in Xen.  Such check in dom0 has the side
>>>> effect of preventing the IRQ from being assigned to a domU without the
>>>> hardware source being properly configured AFAICT.
>>>
>>> Now I understand why you made a comment previously about Xen having to
>>> configure trigger and polarity for these interrupts on its own.
>>>
>>>
>>>>> So the idea was to move the call to allocate_and_map_gsi_pirq() earlier
>>>>> somewhere because allocate_and_map_gsi_pirq doesn't require trigger or
>>>>> polarity to be configured to work. But the suggestion of doing it a
>>>>> "bind time" (meaning: XEN_DOMCTL_bind_pt_irq) was a bad idea.
>>>>>
>>>>> But maybe we can find another location, maybe within
>>>>> xen/arch/x86/hvm/vioapic.c, to call allocate_and_map_gsi_pirq() before
>>>>> trigger and polarity are set and before the interrupt is unmasked.
>>>>>
>>>>> Then we change the implementation of vioapic_hwdom_map_gsi to skip the
>>>>> call to allocate_and_map_gsi_pirq, because by the time
>>>>> vioapic_hwdom_map_gsi we assume that allocate_and_map_gsi_pirq had
>>>>> already been done.
>>>>
>>>> But then we would end up in a situation where the
>>>> pirq_access_permitted() check will pass, but the IO-APIC pin won't be
>>>> configured, which I think it's not what we want.
>>>>
>>>> One option would be to allow mp_register_gsi() to be called multiple
>>>> times, and update the IO-APIC pin configuration as long as the pin is
>>>> not unmasked.  That would propagate each dom0 RTE update to the
>>>> underlying IO-APIC.  However such approach relies on dom0 configuring
>>>> all possible IO-APIC pins, even if no device on dom0 is using them, I
>>>> think it's not a very reliable option.
>>>>
>>>> Another option would be to modify the toolstack to setup the GSI
>>>> itself using the PHYSDEVOP_setup_gsi hypercall.  As said in a previous
>>>> email, since we only care about PCI device passthrough the legacy INTx
>>>> should always be level triggered and low polarity.
>>>>
>>>>> I am not familiar with vioapic.c but to give you an idea of what I was
>>>>> thinking:
>>>>>
>>>>>
>>>>> diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
>>>>> index 4e40d3609a..16d56fe851 100644
>>>>> --- a/xen/arch/x86/hvm/vioapic.c
>>>>> +++ b/xen/arch/x86/hvm/vioapic.c
>>>>> @@ -189,14 +189,6 @@ static int vioapic_hwdom_map_gsi(unsigned int gsi, unsigned int trig,
>>>>>          return ret;
>>>>>      }
>>>>>  
>>>>> -    ret = allocate_and_map_gsi_pirq(currd, pirq, &pirq);
>>>>> -    if ( ret )
>>>>> -    {
>>>>> -        gprintk(XENLOG_WARNING, "vioapic: error mapping GSI %u: %d\n",
>>>>> -                 gsi, ret);
>>>>> -        return ret;
>>>>> -    }
>>>>> -
>>>>>      pcidevs_lock();
>>>>>      ret = pt_irq_create_bind(currd, &pt_irq_bind);
>>>>>      if ( ret )
>>>>> @@ -287,6 +279,17 @@ static void vioapic_write_redirent(
>>>>>              hvm_dpci_eoi(d, gsi);
>>>>>      }
>>>>>  
>>>>> +    if ( is_hardware_domain(d) ) 
>>>>> +    {
>>>>> +        int pirq = gsi, ret;
>>>>> +        ret = allocate_and_map_gsi_pirq(currd, pirq, &pirq);
>>>>> +        if ( ret )
>>>>> +        {
>>>>> +            gprintk(XENLOG_WARNING, "vioapic: error mapping GSI %u: %d\n",
>>>>> +                    gsi, ret);
>>>>> +            return ret;
>>>>> +        }
>>>>> +    }
>>>>>      if ( is_hardware_domain(d) && unmasked )
>>>>>      {
>>>>>          /*
>>>>
>>>> As said above, such approach relies on dom0 writing to the IO-APIC RTE
>>>> of likely each IO-APIC pin, which is IMO not quite reliable.  In there
>>>> are two different issues here that need to be fixed for PVH dom0:
>>>>
>>>>  - Fix the XEN_DOMCTL_irq_permission pirq_access_permitted() call to
>>>>    succeed for a PVH dom0, even if dom0 is not using the GSI itself.
>>>
>>> Yes makes sense
>>>
>>>
>>>>  - Configure IO-APIC pins for PCI interrupts even if dom0 is not using
>>>>    the IO-APIC pin itself.
>>>>
>>>> First one needs to be fixed internally in Xen, second one will require
>>>> the toolstack to issue an extra hypercall in order to ensure the
>>>> IO-APIC pin is properly configured.
>>>  
>>> On ARM, Xen doesn't need to wait for dom0 to configure interrupts
>>> correctly. Xen configures them all on its own at boot based on Device
>>> Tree information. I guess it is not possible to do the same on x86?
>>
>> No, not exactly.  There's some interrupt information in the ACPI MADT,
>> but that's just for very specific sources (Interrupt Source Override
>> Structures)
>>
>> Then on AML devices can have resource descriptors that contain
>> information about how interrupts are setup.  However Xen is not able
>> to read any of this information on AML.
>>
>> Legacy PCI interrupts are (always?) level triggered and low polarity,
>> because it's assumed that an interrupt source can be shared between
>> multiple devices.
> 
> Except that as per what you said just in the earlier paragraph ACPI can
> tell us otherwise.
> 
>> I'm however not able to find any reference to this in the PCI spec,
>> hence I'm reluctant to take this for granted in Xen, and default all
>> GSIs >= 16 to such mode.
>>
>> OTOH legacy PCI interrupts are not that used anymore, as almost all
>> devices will support MSI(-X) (because PCIe mandates it) and OSes
>> should prefer the latter.  SR-IOV VF don't even support legacy PCI
>> interrupts anymore.
>>
>>> If
>>> not, then I can see why we would need 1 extra toolstack hypercall for
>>> that (or to bundle the operation of configuring IO-APIC pins together
>>> with an existing toolstack hypercall).
>>
>> One suitable compromise would be to default unconfigured GSIs >= 16 to
>> level-triggered and low-polarity, as I would expect that to work in
>> almost all cases.  We can always introduce the usage of
>> PHYSDEVOP_setup_gsi later if required.
>>
>> Maybe Jan has more input here, would you agree to defaulting non-ISA
>> GSIs to level-triggered, low-polarity in the absence of a specific
>> setup provided by dom0?
> 
> Well, such defaulting is an option, but in case it's wrong we might
> end up with hard to diagnose issues. Personally I'd prefer if we
> didn't take shortcuts here, i.e. if we followed what Dom0 is able
> to read from ACPI.

When PVH dom0 enable a device, it will get trigger and polarity from ACPI (see acpi_pci_irq_enable)
I have a version of patch which tried that way, see below:

diff --git a/arch/x86/xen/enlighten_pvh.c b/arch/x86/xen/enlighten_pvh.c
index ada3868c02c2..43e1bda9f946 100644
--- a/arch/x86/xen/enlighten_pvh.c
+++ b/arch/x86/xen/enlighten_pvh.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
 #include <linux/acpi.h>
 #include <linux/export.h>
+#include <linux/pci.h>

 #include <xen/hvc-console.h>

@@ -25,6 +26,127 @@
 bool __ro_after_init xen_pvh;
 EXPORT_SYMBOL_GPL(xen_pvh);

+typedef struct gsi_info {
+       int gsi;
+       int trigger;
+       int polarity;
+       int pirq;
+} gsi_info_t;
+
+struct acpi_prt_entry {
+       struct acpi_pci_id      id;
+       u8                      pin;
+       acpi_handle             link;
+       u32                     index;          /* GSI, or link _CRS index */
+};
+
+static int xen_pvh_get_gsi_info(struct pci_dev *dev,
+                                                               gsi_info_t *gsi_info)
+{
+       int gsi;
+       u8 pin = 0;
+       struct acpi_prt_entry *entry;
+       int trigger = ACPI_LEVEL_SENSITIVE;
+       int polarity = acpi_irq_model == ACPI_IRQ_MODEL_GIC ?
+                                     ACPI_ACTIVE_HIGH : ACPI_ACTIVE_LOW;
+
+       if (dev)
+               pin = dev->pin;
+       if (!pin) {
+               xen_raw_printk("No interrupt pin configured\n");
+               return -EINVAL;
+       }
+
+       entry = acpi_pci_irq_lookup(dev, pin);
+       if (entry) {
+               if (entry->link)
+                       gsi = acpi_pci_link_allocate_irq(entry->link,
+                                                        entry->index,
+                                                        &trigger, &polarity,
+                                                        NULL);
+               else
+                       gsi = entry->index;
+       } else
+               return -EINVAL;
+
+       gsi_info->gsi = gsi;
+       gsi_info->trigger = trigger;
+       gsi_info->polarity = polarity;
+
+       return 0;
+}
+
+static int xen_pvh_map_pirq(gsi_info_t *gsi_info)
+{
+       struct physdev_map_pirq map_irq;
+       int ret;
+
+       map_irq.domid = DOMID_SELF;
+       map_irq.type = MAP_PIRQ_TYPE_GSI;
+       map_irq.index = gsi_info->gsi;
+       map_irq.pirq = gsi_info->gsi;
+
+       ret = HYPERVISOR_physdev_op(PHYSDEVOP_map_pirq, &map_irq);
+       gsi_info->pirq = map_irq.pirq;
+
+       return ret;
+}
+
+static int xen_pvh_unmap_pirq(gsi_info_t *gsi_info)
+{
+       struct physdev_unmap_pirq unmap_irq;
+
+       unmap_irq.domid = DOMID_SELF;
+       unmap_irq.pirq = gsi_info->pirq;
+
+       return HYPERVISOR_physdev_op(PHYSDEVOP_unmap_pirq, &unmap_irq);
+}
+
+static int xen_pvh_setup_gsi(gsi_info_t *gsi_info)
+{
+       struct physdev_setup_gsi setup_gsi;
+
+       setup_gsi.gsi = gsi_info->gsi;
+       setup_gsi.triggering = (gsi_info->trigger == ACPI_EDGE_SENSITIVE ? 0 : 1);
+       setup_gsi.polarity = (gsi_info->polarity == ACPI_ACTIVE_HIGH ? 0 : 1);
+
+       return HYPERVISOR_physdev_op(PHYSDEVOP_setup_gsi, &setup_gsi);
+}
+
+int xen_pvh_passthrough_gsi(struct pci_dev *dev)
+{
+       int ret;
+       gsi_info_t gsi_info;
+
+       if (!dev) {
+               return -EINVAL;
+       }
+
+       ret = xen_pvh_get_gsi_info(dev, &gsi_info);
+       if (ret) {
+               xen_raw_printk("Fail to get gsi info!\n");
+               return ret;
+       }
+
+       ret = xen_pvh_map_pirq(&gsi_info);
+       if (ret) {
+               xen_raw_printk("Fail to map pirq for gsi (%d)!\n", gsi_info.gsi);
+               return ret;
+       }
+
+       ret = xen_pvh_setup_gsi(&gsi_info);
+       if (ret == -EEXIST) {
+               ret = 0;
+               xen_raw_printk("Already setup the GSI :%u\n", gsi_info.gsi);
+       } else if (ret) {
+               xen_raw_printk("Fail to setup gsi (%d)!\n", gsi_info.gsi);
+               xen_pvh_unmap_pirq(&gsi_info);
+       }
+
+       return ret;
+}
+EXPORT_SYMBOL_GPL(xen_pvh_passthrough_gsi);
+
 void __init xen_pvh_init(struct boot_params *boot_params)
 {
        u32 msr;
diff --git a/drivers/acpi/pci_irq.c b/drivers/acpi/pci_irq.c
index ff30ceca2203..630fe0a34bc6 100644
--- a/drivers/acpi/pci_irq.c
+++ b/drivers/acpi/pci_irq.c
@@ -288,7 +288,7 @@ static int acpi_reroute_boot_interrupt(struct pci_dev *dev,
 }
 #endif /* CONFIG_X86_IO_APIC */

-static struct acpi_prt_entry *acpi_pci_irq_lookup(struct pci_dev *dev, int pin)
+struct acpi_prt_entry *acpi_pci_irq_lookup(struct pci_dev *dev, int pin)
 {
        struct acpi_prt_entry *entry = NULL;
        struct pci_dev *bridge;
diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
index e34b623e4b41..1abd4dad6f40 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -20,6 +20,7 @@
 #include <linux/atomic.h>
 #include <xen/events.h>
 #include <xen/pci.h>
+#include <xen/acpi.h>
 #include <xen/xen.h>
 #include <asm/xen/hypervisor.h>
 #include <xen/interface/physdev.h>
@@ -399,6 +400,12 @@ static int pcistub_init_device(struct pci_dev *dev)
        if (err)
                goto config_release;

+       if (xen_initial_domain() && xen_pvh_domain()) {
+               err = xen_pvh_passthrough_gsi(dev);
+               if (err)
+                       goto config_release;
+       }
+
        if (dev->msix_cap) {
                struct physdev_pci_device ppdev = {
                        .seg = pci_domain_nr(dev->bus),
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index 641dc4843987..368d56ba2c5e 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -375,6 +375,7 @@ void acpi_unregister_gsi (u32 gsi);

 struct pci_dev;

+struct acpi_prt_entry *acpi_pci_irq_lookup(struct pci_dev *dev, int pin);
 int acpi_pci_irq_enable (struct pci_dev *dev);
 void acpi_penalize_isa_irq(int irq, int active);
 bool acpi_isa_irq_available(int irq);
diff --git a/include/xen/acpi.h b/include/xen/acpi.h
index b1e11863144d..ce7f5554f88e 100644
--- a/include/xen/acpi.h
+++ b/include/xen/acpi.h
@@ -67,6 +67,7 @@ static inline void xen_acpi_sleep_register(void)
                acpi_suspend_lowlevel = xen_acpi_suspend_lowlevel;
        }
 }
+int xen_pvh_passthrough_gsi(struct pci_dev *dev);
 #else
 static inline void xen_acpi_sleep_register(void)
 {

> 
> Jan

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: [RFC KERNEL PATCH v2 1/3] xen/pci: Add xen_reset_device_state function
  2023-12-05 17:02         ` Thomas Gleixner
@ 2023-12-06  6:37           ` Chen, Jiqian
  0 siblings, 0 replies; 49+ messages in thread
From: Chen, Jiqian @ 2023-12-06  6:37 UTC (permalink / raw)
  To: Thomas Gleixner, Stefano Stabellini
  Cc: Juergen Gross, Oleksandr Tyshchenko, Boris Ostrovsky,
	Rafael J . Wysocki, Len Brown, Bjorn Helgaas,
	Roger Pau Monné,
	xen-devel, linux-kernel, linux-acpi, Stabellini, Stefano,
	Deucher, Alexander, Koenig, Christian, Hildebrand, Stewart,
	Ragiadakou, Xenia, Huang, Honglei1, Zhang, Julia, Huang, Ray,
	Chen, Jiqian

Hi Thomas Gleixner,

On 2023/12/6 01:02, Thomas Gleixner wrote:
> On Mon, Dec 04 2023 at 13:31, Stefano Stabellini wrote:
>> On Mon, 3 Dec 2023, Chen, Jiqian wrote:
>>>>> vpci device state when device is reset on dom0 side.
>>>>>
>>>>> And call that function in pcistub_init_device. Because when
>>>>> we use "pci-assignable-add" to assign a passthrough device in
>>>>> Xen, it will reset passthrough device and the vpci state will
>>>>> out of date, and then device will fail to restore bar state.
>>>>>
>>>>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>>>>> Signed-off-by: Huang Rui <ray.huang@amd.com>
>>>>
>>>> This Signed-off-by chain is incorrect.
>>>>
>>>> Documentation/process/submitting-patches.rst has a full chapter about
>>>> S-O-B and the correct usage.
>>> I am the author of this series of patches, and Huang Rui transported the v1 to upstream. And now I transport v2. I am not aware that the SOB chain is incorrect.
>>> Do you have any suggestions?
>>
>> I think he means that your Signed-off-by should be the second one of the
>> two as you are the one submitting the patch to the LKML
> 
> No.
> 
>    Mailfrom: Jiqian Chen <Jiqian.Chen@amd.com>
>    <body>
> 
>    Changelog-text
> 
>    Signed-off-by: Huang Rui <ray.huang@amd.com>
>    Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> 
> is equally wrong because that would end up with Chen as author and Huang
> as first S-O-B which is required to be the author's S-O-B
> 
> To make the above correct this would require:
> 
>    Mailfrom: Jiqian Chen <Jiqian.Chen@amd.com>
>    <body>
> 
>    From: Huang Rui <ray.huang@amd.com>
> 
>    Changelog-text
> 
>    Signed-off-by: Huang Rui <ray.huang@amd.com>
>    Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> 
>    which tells that Huang is the author and Chen is the 'transporter',
>    which unfortunately does not reflect reality.
> 
> Or:
> 
>    Mailfrom: Jiqian Chen <Jiqian.Chen@amd.com>
>    <body>
> 
>    Changelog-text
> 
>    Co-developed-by: Huang Rui <ray.huang@amd.com>
>    Signed-off-by: Huang Rui <ray.huang@amd.com>
>    Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> 
>    which tells that Checn is the author and Huang co-developed the
>    patch, which might be true or not.
> 
> 
> V1 which was sent by Huang has the ordering is correct:
> 
>    Mailfrom: Huang Rui <ray.huang@amd.com>
>    <body>
> 
>    From: Jiqian Chen <Jiqian.Chen@amd.com>
> 
>    Changelog-text
> 
>    Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>    Signed-off-by: Huang Rui <ray.huang@amd.com>
> 
>    i.e. Chen authored and Huang transported
> 
> Now this V2 has not really much to do with V1 and is a new
> implementation to solve the problem, which was authored by Chen, so
> Huang is not involved at all if I understand correctly.
Not exactly, V2 is modified based on the V1. I am the author of this series. Huang Rui upstream the V1, and he also helped me to improve the first patch of this V2 series.
Maybe in next version, below is more suitable:

Co-developed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>

Which can reflect that Huang Rui is co-developer, and I am the author and the last people to send patches.

> 
> So what does his S-O-B mean here? Nothing...
> 
> It's very well documented how the whole S-O-B business works and it's
> not really rocket science to get it straight.
> 
> It has a meaning and is not just for decoration purposes.
> 
> Thanks,
> 
>         tglx

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC KERNEL PATCH v2 2/3] xen/pvh: Unmask irq for passthrough device in PVH dom0
  2023-12-06  6:07                     ` Chen, Jiqian
@ 2023-12-07  2:18                       ` Stefano Stabellini
  2023-12-07  3:38                         ` Chen, Jiqian
  2023-12-07  6:43                         ` Juergen Gross
  2023-12-11 15:45                       ` Roger Pau Monné
  1 sibling, 2 replies; 49+ messages in thread
From: Stefano Stabellini @ 2023-12-07  2:18 UTC (permalink / raw)
  To: Chen, Jiqian
  Cc: Jan Beulich, Roger Pau Monné,
	Stefano Stabellini, Juergen Gross, Oleksandr Tyshchenko,
	Thomas Gleixner, Boris Ostrovsky, Rafael J . Wysocki, Len Brown,
	Bjorn Helgaas, xen-devel, linux-kernel, linux-acpi, Stabellini,
	Stefano, Deucher, Alexander, Koenig, Christian, Hildebrand,
	Stewart, Ragiadakou, Xenia, Huang, Honglei1, Zhang, Julia, Huang,
	Ray

On Tue, 5 Dec 2023, Chen, Jiqian wrote:
> When PVH dom0 enable a device, it will get trigger and polarity from ACPI (see acpi_pci_irq_enable)
> I have a version of patch which tried that way, see below:

This approach looks much better. I think this patch is OKish. Juergen,
what do you think?


> diff --git a/arch/x86/xen/enlighten_pvh.c b/arch/x86/xen/enlighten_pvh.c
> index ada3868c02c2..43e1bda9f946 100644
> --- a/arch/x86/xen/enlighten_pvh.c
> +++ b/arch/x86/xen/enlighten_pvh.c
> @@ -1,6 +1,7 @@
>  // SPDX-License-Identifier: GPL-2.0
>  #include <linux/acpi.h>
>  #include <linux/export.h>
> +#include <linux/pci.h>
> 
>  #include <xen/hvc-console.h>
> 
> @@ -25,6 +26,127 @@
>  bool __ro_after_init xen_pvh;
>  EXPORT_SYMBOL_GPL(xen_pvh);
> 
> +typedef struct gsi_info {
> +       int gsi;
> +       int trigger;
> +       int polarity;
> +       int pirq;
> +} gsi_info_t;
> +
> +struct acpi_prt_entry {
> +       struct acpi_pci_id      id;
> +       u8                      pin;
> +       acpi_handle             link;
> +       u32                     index;          /* GSI, or link _CRS index */
> +};
> +
> +static int xen_pvh_get_gsi_info(struct pci_dev *dev,
> +                                                               gsi_info_t *gsi_info)
> +{
> +       int gsi;
> +       u8 pin = 0;
> +       struct acpi_prt_entry *entry;
> +       int trigger = ACPI_LEVEL_SENSITIVE;
> +       int polarity = acpi_irq_model == ACPI_IRQ_MODEL_GIC ?
> +                                     ACPI_ACTIVE_HIGH : ACPI_ACTIVE_LOW;
> +
> +       if (dev)
> +               pin = dev->pin;
> +       if (!pin) {
> +               xen_raw_printk("No interrupt pin configured\n");
> +               return -EINVAL;
> +       }
> +
> +       entry = acpi_pci_irq_lookup(dev, pin);
> +       if (entry) {
> +               if (entry->link)
> +                       gsi = acpi_pci_link_allocate_irq(entry->link,
> +                                                        entry->index,
> +                                                        &trigger, &polarity,
> +                                                        NULL);
> +               else
> +                       gsi = entry->index;
> +       } else
> +               return -EINVAL;
> +
> +       gsi_info->gsi = gsi;
> +       gsi_info->trigger = trigger;
> +       gsi_info->polarity = polarity;
> +
> +       return 0;
> +}
> +
> +static int xen_pvh_map_pirq(gsi_info_t *gsi_info)
> +{
> +       struct physdev_map_pirq map_irq;
> +       int ret;
> +
> +       map_irq.domid = DOMID_SELF;
> +       map_irq.type = MAP_PIRQ_TYPE_GSI;
> +       map_irq.index = gsi_info->gsi;
> +       map_irq.pirq = gsi_info->gsi;
> +
> +       ret = HYPERVISOR_physdev_op(PHYSDEVOP_map_pirq, &map_irq);
> +       gsi_info->pirq = map_irq.pirq;
> +
> +       return ret;
> +}
> +
> +static int xen_pvh_unmap_pirq(gsi_info_t *gsi_info)
> +{
> +       struct physdev_unmap_pirq unmap_irq;
> +
> +       unmap_irq.domid = DOMID_SELF;
> +       unmap_irq.pirq = gsi_info->pirq;
> +
> +       return HYPERVISOR_physdev_op(PHYSDEVOP_unmap_pirq, &unmap_irq);
> +}
> +
> +static int xen_pvh_setup_gsi(gsi_info_t *gsi_info)
> +{
> +       struct physdev_setup_gsi setup_gsi;
> +
> +       setup_gsi.gsi = gsi_info->gsi;
> +       setup_gsi.triggering = (gsi_info->trigger == ACPI_EDGE_SENSITIVE ? 0 : 1);
> +       setup_gsi.polarity = (gsi_info->polarity == ACPI_ACTIVE_HIGH ? 0 : 1);
> +
> +       return HYPERVISOR_physdev_op(PHYSDEVOP_setup_gsi, &setup_gsi);
> +}
> +
> +int xen_pvh_passthrough_gsi(struct pci_dev *dev)
> +{
> +       int ret;
> +       gsi_info_t gsi_info;
> +
> +       if (!dev) {
> +               return -EINVAL;
> +       }
> +
> +       ret = xen_pvh_get_gsi_info(dev, &gsi_info);
> +       if (ret) {
> +               xen_raw_printk("Fail to get gsi info!\n");
> +               return ret;
> +       }
> +
> +       ret = xen_pvh_map_pirq(&gsi_info);
> +       if (ret) {
> +               xen_raw_printk("Fail to map pirq for gsi (%d)!\n", gsi_info.gsi);
> +               return ret;
> +       }
> +
> +       ret = xen_pvh_setup_gsi(&gsi_info);
> +       if (ret == -EEXIST) {
> +               ret = 0;
> +               xen_raw_printk("Already setup the GSI :%u\n", gsi_info.gsi);
> +       } else if (ret) {
> +               xen_raw_printk("Fail to setup gsi (%d)!\n", gsi_info.gsi);
> +               xen_pvh_unmap_pirq(&gsi_info);
> +       }
> +
> +       return ret;
> +}
> +EXPORT_SYMBOL_GPL(xen_pvh_passthrough_gsi);
> +
>  void __init xen_pvh_init(struct boot_params *boot_params)
>  {
>         u32 msr;
> diff --git a/drivers/acpi/pci_irq.c b/drivers/acpi/pci_irq.c
> index ff30ceca2203..630fe0a34bc6 100644
> --- a/drivers/acpi/pci_irq.c
> +++ b/drivers/acpi/pci_irq.c
> @@ -288,7 +288,7 @@ static int acpi_reroute_boot_interrupt(struct pci_dev *dev,
>  }
>  #endif /* CONFIG_X86_IO_APIC */
> 
> -static struct acpi_prt_entry *acpi_pci_irq_lookup(struct pci_dev *dev, int pin)
> +struct acpi_prt_entry *acpi_pci_irq_lookup(struct pci_dev *dev, int pin)
>  {
>         struct acpi_prt_entry *entry = NULL;
>         struct pci_dev *bridge;
> diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
> index e34b623e4b41..1abd4dad6f40 100644
> --- a/drivers/xen/xen-pciback/pci_stub.c
> +++ b/drivers/xen/xen-pciback/pci_stub.c
> @@ -20,6 +20,7 @@
>  #include <linux/atomic.h>
>  #include <xen/events.h>
>  #include <xen/pci.h>
> +#include <xen/acpi.h>
>  #include <xen/xen.h>
>  #include <asm/xen/hypervisor.h>
>  #include <xen/interface/physdev.h>
> @@ -399,6 +400,12 @@ static int pcistub_init_device(struct pci_dev *dev)
>         if (err)
>                 goto config_release;
> 
> +       if (xen_initial_domain() && xen_pvh_domain()) {
> +               err = xen_pvh_passthrough_gsi(dev);
> +               if (err)
> +                       goto config_release;
> +       }
> +
>         if (dev->msix_cap) {
>                 struct physdev_pci_device ppdev = {
>                         .seg = pci_domain_nr(dev->bus),
> diff --git a/include/linux/acpi.h b/include/linux/acpi.h
> index 641dc4843987..368d56ba2c5e 100644
> --- a/include/linux/acpi.h
> +++ b/include/linux/acpi.h
> @@ -375,6 +375,7 @@ void acpi_unregister_gsi (u32 gsi);
> 
>  struct pci_dev;
> 
> +struct acpi_prt_entry *acpi_pci_irq_lookup(struct pci_dev *dev, int pin);
>  int acpi_pci_irq_enable (struct pci_dev *dev);
>  void acpi_penalize_isa_irq(int irq, int active);
>  bool acpi_isa_irq_available(int irq);
> diff --git a/include/xen/acpi.h b/include/xen/acpi.h
> index b1e11863144d..ce7f5554f88e 100644
> --- a/include/xen/acpi.h
> +++ b/include/xen/acpi.h
> @@ -67,6 +67,7 @@ static inline void xen_acpi_sleep_register(void)
>                 acpi_suspend_lowlevel = xen_acpi_suspend_lowlevel;
>         }
>  }
> +int xen_pvh_passthrough_gsi(struct pci_dev *dev);
>  #else
>  static inline void xen_acpi_sleep_register(void)
>  {
> 
> > 
> > Jan
> 
> -- 
> Best regards,
> Jiqian Chen.
> 

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC KERNEL PATCH v2 2/3] xen/pvh: Unmask irq for passthrough device in PVH dom0
  2023-12-07  2:18                       ` Stefano Stabellini
@ 2023-12-07  3:38                         ` Chen, Jiqian
  2023-12-07  6:43                         ` Juergen Gross
  1 sibling, 0 replies; 49+ messages in thread
From: Chen, Jiqian @ 2023-12-07  3:38 UTC (permalink / raw)
  To: Juergen Gross, Stefano Stabellini
  Cc: Jan Beulich, Roger Pau Monné,
	Oleksandr Tyshchenko, Thomas Gleixner, Boris Ostrovsky,
	Rafael J . Wysocki, Len Brown, Bjorn Helgaas, xen-devel,
	linux-kernel, linux-acpi, Stabellini, Stefano, Deucher,
	Alexander, Koenig, Christian, Hildebrand, Stewart, Ragiadakou,
	Xenia, Huang, Honglei1, Zhang, Julia, Huang, Ray, Chen, Jiqian

(Adding Juergen to the "To" list.)

Hi Juergen, 
Looking forward to your opinions.

On 2023/12/7 10:18, Stefano Stabellini wrote:
> On Tue, 5 Dec 2023, Chen, Jiqian wrote:
>> When PVH dom0 enable a device, it will get trigger and polarity from ACPI (see acpi_pci_irq_enable)
>> I have a version of patch which tried that way, see below:
> 
> This approach looks much better. I think this patch is OKish. Juergen,
> what do you think?
> 
> 
>> diff --git a/arch/x86/xen/enlighten_pvh.c b/arch/x86/xen/enlighten_pvh.c
>> index ada3868c02c2..43e1bda9f946 100644
>> --- a/arch/x86/xen/enlighten_pvh.c
>> +++ b/arch/x86/xen/enlighten_pvh.c
>> @@ -1,6 +1,7 @@
>>  // SPDX-License-Identifier: GPL-2.0
>>  #include <linux/acpi.h>
>>  #include <linux/export.h>
>> +#include <linux/pci.h>
>>
>>  #include <xen/hvc-console.h>
>>
>> @@ -25,6 +26,127 @@
>>  bool __ro_after_init xen_pvh;
>>  EXPORT_SYMBOL_GPL(xen_pvh);
>>
>> +typedef struct gsi_info {
>> +       int gsi;
>> +       int trigger;
>> +       int polarity;
>> +       int pirq;
>> +} gsi_info_t;
>> +
>> +struct acpi_prt_entry {
>> +       struct acpi_pci_id      id;
>> +       u8                      pin;
>> +       acpi_handle             link;
>> +       u32                     index;          /* GSI, or link _CRS index */
>> +};
>> +
>> +static int xen_pvh_get_gsi_info(struct pci_dev *dev,
>> +                                                               gsi_info_t *gsi_info)
>> +{
>> +       int gsi;
>> +       u8 pin = 0;
>> +       struct acpi_prt_entry *entry;
>> +       int trigger = ACPI_LEVEL_SENSITIVE;
>> +       int polarity = acpi_irq_model == ACPI_IRQ_MODEL_GIC ?
>> +                                     ACPI_ACTIVE_HIGH : ACPI_ACTIVE_LOW;
>> +
>> +       if (dev)
>> +               pin = dev->pin;
>> +       if (!pin) {
>> +               xen_raw_printk("No interrupt pin configured\n");
>> +               return -EINVAL;
>> +       }
>> +
>> +       entry = acpi_pci_irq_lookup(dev, pin);
>> +       if (entry) {
>> +               if (entry->link)
>> +                       gsi = acpi_pci_link_allocate_irq(entry->link,
>> +                                                        entry->index,
>> +                                                        &trigger, &polarity,
>> +                                                        NULL);
>> +               else
>> +                       gsi = entry->index;
>> +       } else
>> +               return -EINVAL;
>> +
>> +       gsi_info->gsi = gsi;
>> +       gsi_info->trigger = trigger;
>> +       gsi_info->polarity = polarity;
>> +
>> +       return 0;
>> +}
>> +
>> +static int xen_pvh_map_pirq(gsi_info_t *gsi_info)
>> +{
>> +       struct physdev_map_pirq map_irq;
>> +       int ret;
>> +
>> +       map_irq.domid = DOMID_SELF;
>> +       map_irq.type = MAP_PIRQ_TYPE_GSI;
>> +       map_irq.index = gsi_info->gsi;
>> +       map_irq.pirq = gsi_info->gsi;
>> +
>> +       ret = HYPERVISOR_physdev_op(PHYSDEVOP_map_pirq, &map_irq);
>> +       gsi_info->pirq = map_irq.pirq;
>> +
>> +       return ret;
>> +}
>> +
>> +static int xen_pvh_unmap_pirq(gsi_info_t *gsi_info)
>> +{
>> +       struct physdev_unmap_pirq unmap_irq;
>> +
>> +       unmap_irq.domid = DOMID_SELF;
>> +       unmap_irq.pirq = gsi_info->pirq;
>> +
>> +       return HYPERVISOR_physdev_op(PHYSDEVOP_unmap_pirq, &unmap_irq);
>> +}
>> +
>> +static int xen_pvh_setup_gsi(gsi_info_t *gsi_info)
>> +{
>> +       struct physdev_setup_gsi setup_gsi;
>> +
>> +       setup_gsi.gsi = gsi_info->gsi;
>> +       setup_gsi.triggering = (gsi_info->trigger == ACPI_EDGE_SENSITIVE ? 0 : 1);
>> +       setup_gsi.polarity = (gsi_info->polarity == ACPI_ACTIVE_HIGH ? 0 : 1);
>> +
>> +       return HYPERVISOR_physdev_op(PHYSDEVOP_setup_gsi, &setup_gsi);
>> +}
>> +
>> +int xen_pvh_passthrough_gsi(struct pci_dev *dev)
>> +{
>> +       int ret;
>> +       gsi_info_t gsi_info;
>> +
>> +       if (!dev) {
>> +               return -EINVAL;
>> +       }
>> +
>> +       ret = xen_pvh_get_gsi_info(dev, &gsi_info);
>> +       if (ret) {
>> +               xen_raw_printk("Fail to get gsi info!\n");
>> +               return ret;
>> +       }
>> +
>> +       ret = xen_pvh_map_pirq(&gsi_info);
>> +       if (ret) {
>> +               xen_raw_printk("Fail to map pirq for gsi (%d)!\n", gsi_info.gsi);
>> +               return ret;
>> +       }
>> +
>> +       ret = xen_pvh_setup_gsi(&gsi_info);
>> +       if (ret == -EEXIST) {
>> +               ret = 0;
>> +               xen_raw_printk("Already setup the GSI :%u\n", gsi_info.gsi);
>> +       } else if (ret) {
>> +               xen_raw_printk("Fail to setup gsi (%d)!\n", gsi_info.gsi);
>> +               xen_pvh_unmap_pirq(&gsi_info);
>> +       }
>> +
>> +       return ret;
>> +}
>> +EXPORT_SYMBOL_GPL(xen_pvh_passthrough_gsi);
>> +
>>  void __init xen_pvh_init(struct boot_params *boot_params)
>>  {
>>         u32 msr;
>> diff --git a/drivers/acpi/pci_irq.c b/drivers/acpi/pci_irq.c
>> index ff30ceca2203..630fe0a34bc6 100644
>> --- a/drivers/acpi/pci_irq.c
>> +++ b/drivers/acpi/pci_irq.c
>> @@ -288,7 +288,7 @@ static int acpi_reroute_boot_interrupt(struct pci_dev *dev,
>>  }
>>  #endif /* CONFIG_X86_IO_APIC */
>>
>> -static struct acpi_prt_entry *acpi_pci_irq_lookup(struct pci_dev *dev, int pin)
>> +struct acpi_prt_entry *acpi_pci_irq_lookup(struct pci_dev *dev, int pin)
>>  {
>>         struct acpi_prt_entry *entry = NULL;
>>         struct pci_dev *bridge;
>> diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
>> index e34b623e4b41..1abd4dad6f40 100644
>> --- a/drivers/xen/xen-pciback/pci_stub.c
>> +++ b/drivers/xen/xen-pciback/pci_stub.c
>> @@ -20,6 +20,7 @@
>>  #include <linux/atomic.h>
>>  #include <xen/events.h>
>>  #include <xen/pci.h>
>> +#include <xen/acpi.h>
>>  #include <xen/xen.h>
>>  #include <asm/xen/hypervisor.h>
>>  #include <xen/interface/physdev.h>
>> @@ -399,6 +400,12 @@ static int pcistub_init_device(struct pci_dev *dev)
>>         if (err)
>>                 goto config_release;
>>
>> +       if (xen_initial_domain() && xen_pvh_domain()) {
>> +               err = xen_pvh_passthrough_gsi(dev);
>> +               if (err)
>> +                       goto config_release;
>> +       }
>> +
>>         if (dev->msix_cap) {
>>                 struct physdev_pci_device ppdev = {
>>                         .seg = pci_domain_nr(dev->bus),
>> diff --git a/include/linux/acpi.h b/include/linux/acpi.h
>> index 641dc4843987..368d56ba2c5e 100644
>> --- a/include/linux/acpi.h
>> +++ b/include/linux/acpi.h
>> @@ -375,6 +375,7 @@ void acpi_unregister_gsi (u32 gsi);
>>
>>  struct pci_dev;
>>
>> +struct acpi_prt_entry *acpi_pci_irq_lookup(struct pci_dev *dev, int pin);
>>  int acpi_pci_irq_enable (struct pci_dev *dev);
>>  void acpi_penalize_isa_irq(int irq, int active);
>>  bool acpi_isa_irq_available(int irq);
>> diff --git a/include/xen/acpi.h b/include/xen/acpi.h
>> index b1e11863144d..ce7f5554f88e 100644
>> --- a/include/xen/acpi.h
>> +++ b/include/xen/acpi.h
>> @@ -67,6 +67,7 @@ static inline void xen_acpi_sleep_register(void)
>>                 acpi_suspend_lowlevel = xen_acpi_suspend_lowlevel;
>>         }
>>  }
>> +int xen_pvh_passthrough_gsi(struct pci_dev *dev);
>>  #else
>>  static inline void xen_acpi_sleep_register(void)
>>  {
>>
>>>
>>> Jan
>>
>> -- 
>> Best regards,
>> Jiqian Chen.
>>

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC KERNEL PATCH v2 2/3] xen/pvh: Unmask irq for passthrough device in PVH dom0
  2023-12-07  2:18                       ` Stefano Stabellini
  2023-12-07  3:38                         ` Chen, Jiqian
@ 2023-12-07  6:43                         ` Juergen Gross
  2023-12-08  5:53                           ` Chen, Jiqian
  1 sibling, 1 reply; 49+ messages in thread
From: Juergen Gross @ 2023-12-07  6:43 UTC (permalink / raw)
  To: Stefano Stabellini, Chen, Jiqian
  Cc: Jan Beulich, Roger Pau Monné,
	Oleksandr Tyshchenko, Thomas Gleixner, Boris Ostrovsky,
	Rafael J . Wysocki, Len Brown, Bjorn Helgaas, xen-devel,
	linux-kernel, linux-acpi, Stabellini, Stefano, Deucher,
	Alexander, Koenig, Christian, Hildebrand, Stewart, Ragiadakou,
	Xenia, Huang, Honglei1, Zhang, Julia, Huang, Ray


[-- Attachment #1.1.1: Type: text/plain, Size: 7560 bytes --]

On 07.12.23 03:18, Stefano Stabellini wrote:
> On Tue, 5 Dec 2023, Chen, Jiqian wrote:
>> When PVH dom0 enable a device, it will get trigger and polarity from ACPI (see acpi_pci_irq_enable)
>> I have a version of patch which tried that way, see below:
> 
> This approach looks much better. I think this patch is OKish. Juergen,
> what do you think?

The approach seems to be fine.


Juergen

> 
> 
>> diff --git a/arch/x86/xen/enlighten_pvh.c b/arch/x86/xen/enlighten_pvh.c
>> index ada3868c02c2..43e1bda9f946 100644
>> --- a/arch/x86/xen/enlighten_pvh.c
>> +++ b/arch/x86/xen/enlighten_pvh.c
>> @@ -1,6 +1,7 @@
>>   // SPDX-License-Identifier: GPL-2.0
>>   #include <linux/acpi.h>
>>   #include <linux/export.h>
>> +#include <linux/pci.h>
>>
>>   #include <xen/hvc-console.h>
>>
>> @@ -25,6 +26,127 @@
>>   bool __ro_after_init xen_pvh;
>>   EXPORT_SYMBOL_GPL(xen_pvh);
>>
>> +typedef struct gsi_info {
>> +       int gsi;
>> +       int trigger;
>> +       int polarity;
>> +       int pirq;
>> +} gsi_info_t;
>> +
>> +struct acpi_prt_entry {
>> +       struct acpi_pci_id      id;
>> +       u8                      pin;
>> +       acpi_handle             link;
>> +       u32                     index;          /* GSI, or link _CRS index */
>> +};
>> +
>> +static int xen_pvh_get_gsi_info(struct pci_dev *dev,
>> +                                                               gsi_info_t *gsi_info)
>> +{
>> +       int gsi;
>> +       u8 pin = 0;
>> +       struct acpi_prt_entry *entry;
>> +       int trigger = ACPI_LEVEL_SENSITIVE;
>> +       int polarity = acpi_irq_model == ACPI_IRQ_MODEL_GIC ?
>> +                                     ACPI_ACTIVE_HIGH : ACPI_ACTIVE_LOW;
>> +
>> +       if (dev)
>> +               pin = dev->pin;
>> +       if (!pin) {
>> +               xen_raw_printk("No interrupt pin configured\n");
>> +               return -EINVAL;
>> +       }
>> +
>> +       entry = acpi_pci_irq_lookup(dev, pin);
>> +       if (entry) {
>> +               if (entry->link)
>> +                       gsi = acpi_pci_link_allocate_irq(entry->link,
>> +                                                        entry->index,
>> +                                                        &trigger, &polarity,
>> +                                                        NULL);
>> +               else
>> +                       gsi = entry->index;
>> +       } else
>> +               return -EINVAL;
>> +
>> +       gsi_info->gsi = gsi;
>> +       gsi_info->trigger = trigger;
>> +       gsi_info->polarity = polarity;
>> +
>> +       return 0;
>> +}
>> +
>> +static int xen_pvh_map_pirq(gsi_info_t *gsi_info)
>> +{
>> +       struct physdev_map_pirq map_irq;
>> +       int ret;
>> +
>> +       map_irq.domid = DOMID_SELF;
>> +       map_irq.type = MAP_PIRQ_TYPE_GSI;
>> +       map_irq.index = gsi_info->gsi;
>> +       map_irq.pirq = gsi_info->gsi;
>> +
>> +       ret = HYPERVISOR_physdev_op(PHYSDEVOP_map_pirq, &map_irq);
>> +       gsi_info->pirq = map_irq.pirq;
>> +
>> +       return ret;
>> +}
>> +
>> +static int xen_pvh_unmap_pirq(gsi_info_t *gsi_info)
>> +{
>> +       struct physdev_unmap_pirq unmap_irq;
>> +
>> +       unmap_irq.domid = DOMID_SELF;
>> +       unmap_irq.pirq = gsi_info->pirq;
>> +
>> +       return HYPERVISOR_physdev_op(PHYSDEVOP_unmap_pirq, &unmap_irq);
>> +}
>> +
>> +static int xen_pvh_setup_gsi(gsi_info_t *gsi_info)
>> +{
>> +       struct physdev_setup_gsi setup_gsi;
>> +
>> +       setup_gsi.gsi = gsi_info->gsi;
>> +       setup_gsi.triggering = (gsi_info->trigger == ACPI_EDGE_SENSITIVE ? 0 : 1);
>> +       setup_gsi.polarity = (gsi_info->polarity == ACPI_ACTIVE_HIGH ? 0 : 1);
>> +
>> +       return HYPERVISOR_physdev_op(PHYSDEVOP_setup_gsi, &setup_gsi);
>> +}
>> +
>> +int xen_pvh_passthrough_gsi(struct pci_dev *dev)
>> +{
>> +       int ret;
>> +       gsi_info_t gsi_info;
>> +
>> +       if (!dev) {
>> +               return -EINVAL;
>> +       }
>> +
>> +       ret = xen_pvh_get_gsi_info(dev, &gsi_info);
>> +       if (ret) {
>> +               xen_raw_printk("Fail to get gsi info!\n");
>> +               return ret;
>> +       }
>> +
>> +       ret = xen_pvh_map_pirq(&gsi_info);
>> +       if (ret) {
>> +               xen_raw_printk("Fail to map pirq for gsi (%d)!\n", gsi_info.gsi);
>> +               return ret;
>> +       }
>> +
>> +       ret = xen_pvh_setup_gsi(&gsi_info);
>> +       if (ret == -EEXIST) {
>> +               ret = 0;
>> +               xen_raw_printk("Already setup the GSI :%u\n", gsi_info.gsi);
>> +       } else if (ret) {
>> +               xen_raw_printk("Fail to setup gsi (%d)!\n", gsi_info.gsi);
>> +               xen_pvh_unmap_pirq(&gsi_info);
>> +       }
>> +
>> +       return ret;
>> +}
>> +EXPORT_SYMBOL_GPL(xen_pvh_passthrough_gsi);
>> +
>>   void __init xen_pvh_init(struct boot_params *boot_params)
>>   {
>>          u32 msr;
>> diff --git a/drivers/acpi/pci_irq.c b/drivers/acpi/pci_irq.c
>> index ff30ceca2203..630fe0a34bc6 100644
>> --- a/drivers/acpi/pci_irq.c
>> +++ b/drivers/acpi/pci_irq.c
>> @@ -288,7 +288,7 @@ static int acpi_reroute_boot_interrupt(struct pci_dev *dev,
>>   }
>>   #endif /* CONFIG_X86_IO_APIC */
>>
>> -static struct acpi_prt_entry *acpi_pci_irq_lookup(struct pci_dev *dev, int pin)
>> +struct acpi_prt_entry *acpi_pci_irq_lookup(struct pci_dev *dev, int pin)
>>   {
>>          struct acpi_prt_entry *entry = NULL;
>>          struct pci_dev *bridge;
>> diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
>> index e34b623e4b41..1abd4dad6f40 100644
>> --- a/drivers/xen/xen-pciback/pci_stub.c
>> +++ b/drivers/xen/xen-pciback/pci_stub.c
>> @@ -20,6 +20,7 @@
>>   #include <linux/atomic.h>
>>   #include <xen/events.h>
>>   #include <xen/pci.h>
>> +#include <xen/acpi.h>
>>   #include <xen/xen.h>
>>   #include <asm/xen/hypervisor.h>
>>   #include <xen/interface/physdev.h>
>> @@ -399,6 +400,12 @@ static int pcistub_init_device(struct pci_dev *dev)
>>          if (err)
>>                  goto config_release;
>>
>> +       if (xen_initial_domain() && xen_pvh_domain()) {
>> +               err = xen_pvh_passthrough_gsi(dev);
>> +               if (err)
>> +                       goto config_release;
>> +       }
>> +
>>          if (dev->msix_cap) {
>>                  struct physdev_pci_device ppdev = {
>>                          .seg = pci_domain_nr(dev->bus),
>> diff --git a/include/linux/acpi.h b/include/linux/acpi.h
>> index 641dc4843987..368d56ba2c5e 100644
>> --- a/include/linux/acpi.h
>> +++ b/include/linux/acpi.h
>> @@ -375,6 +375,7 @@ void acpi_unregister_gsi (u32 gsi);
>>
>>   struct pci_dev;
>>
>> +struct acpi_prt_entry *acpi_pci_irq_lookup(struct pci_dev *dev, int pin);
>>   int acpi_pci_irq_enable (struct pci_dev *dev);
>>   void acpi_penalize_isa_irq(int irq, int active);
>>   bool acpi_isa_irq_available(int irq);
>> diff --git a/include/xen/acpi.h b/include/xen/acpi.h
>> index b1e11863144d..ce7f5554f88e 100644
>> --- a/include/xen/acpi.h
>> +++ b/include/xen/acpi.h
>> @@ -67,6 +67,7 @@ static inline void xen_acpi_sleep_register(void)
>>                  acpi_suspend_lowlevel = xen_acpi_suspend_lowlevel;
>>          }
>>   }
>> +int xen_pvh_passthrough_gsi(struct pci_dev *dev);
>>   #else
>>   static inline void xen_acpi_sleep_register(void)
>>   {
>>
>>>
>>> Jan
>>
>> -- 
>> Best regards,
>> Jiqian Chen.
>>


[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 3743 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC KERNEL PATCH v2 2/3] xen/pvh: Unmask irq for passthrough device in PVH dom0
  2023-12-07  6:43                         ` Juergen Gross
@ 2023-12-08  5:53                           ` Chen, Jiqian
  0 siblings, 0 replies; 49+ messages in thread
From: Chen, Jiqian @ 2023-12-08  5:53 UTC (permalink / raw)
  To: Juergen Gross, Stefano Stabellini
  Cc: Jan Beulich, Roger Pau Monné,
	Oleksandr Tyshchenko, Thomas Gleixner, Boris Ostrovsky,
	Rafael J . Wysocki, Len Brown, Bjorn Helgaas, xen-devel,
	linux-kernel, linux-acpi, Stabellini, Stefano, Deucher,
	Alexander, Koenig, Christian, Hildebrand, Stewart, Ragiadakou,
	Xenia, Huang, Honglei1, Zhang, Julia, Huang, Ray, Chen, Jiqian

Thank Stefano and Juergen, I will use this approach in next version.

On 2023/12/7 14:43, Juergen Gross wrote:
> On 07.12.23 03:18, Stefano Stabellini wrote:
>> On Tue, 5 Dec 2023, Chen, Jiqian wrote:
>>> When PVH dom0 enable a device, it will get trigger and polarity from ACPI (see acpi_pci_irq_enable)
>>> I have a version of patch which tried that way, see below:
>>
>> This approach looks much better. I think this patch is OKish. Juergen,
>> what do you think?
> 
> The approach seems to be fine.
> 
> 
> Juergen
> 
>>
>>
>>> diff --git a/arch/x86/xen/enlighten_pvh.c b/arch/x86/xen/enlighten_pvh.c
>>> index ada3868c02c2..43e1bda9f946 100644
>>> --- a/arch/x86/xen/enlighten_pvh.c
>>> +++ b/arch/x86/xen/enlighten_pvh.c
>>> @@ -1,6 +1,7 @@
>>>   // SPDX-License-Identifier: GPL-2.0
>>>   #include <linux/acpi.h>
>>>   #include <linux/export.h>
>>> +#include <linux/pci.h>
>>>
>>>   #include <xen/hvc-console.h>
>>>
>>> @@ -25,6 +26,127 @@
>>>   bool __ro_after_init xen_pvh;
>>>   EXPORT_SYMBOL_GPL(xen_pvh);
>>>
>>> +typedef struct gsi_info {
>>> +       int gsi;
>>> +       int trigger;
>>> +       int polarity;
>>> +       int pirq;
>>> +} gsi_info_t;
>>> +
>>> +struct acpi_prt_entry {
>>> +       struct acpi_pci_id      id;
>>> +       u8                      pin;
>>> +       acpi_handle             link;
>>> +       u32                     index;          /* GSI, or link _CRS index */
>>> +};
>>> +
>>> +static int xen_pvh_get_gsi_info(struct pci_dev *dev,
>>> +                                                               gsi_info_t *gsi_info)
>>> +{
>>> +       int gsi;
>>> +       u8 pin = 0;
>>> +       struct acpi_prt_entry *entry;
>>> +       int trigger = ACPI_LEVEL_SENSITIVE;
>>> +       int polarity = acpi_irq_model == ACPI_IRQ_MODEL_GIC ?
>>> +                                     ACPI_ACTIVE_HIGH : ACPI_ACTIVE_LOW;
>>> +
>>> +       if (dev)
>>> +               pin = dev->pin;
>>> +       if (!pin) {
>>> +               xen_raw_printk("No interrupt pin configured\n");
>>> +               return -EINVAL;
>>> +       }
>>> +
>>> +       entry = acpi_pci_irq_lookup(dev, pin);
>>> +       if (entry) {
>>> +               if (entry->link)
>>> +                       gsi = acpi_pci_link_allocate_irq(entry->link,
>>> +                                                        entry->index,
>>> +                                                        &trigger, &polarity,
>>> +                                                        NULL);
>>> +               else
>>> +                       gsi = entry->index;
>>> +       } else
>>> +               return -EINVAL;
>>> +
>>> +       gsi_info->gsi = gsi;
>>> +       gsi_info->trigger = trigger;
>>> +       gsi_info->polarity = polarity;
>>> +
>>> +       return 0;
>>> +}
>>> +
>>> +static int xen_pvh_map_pirq(gsi_info_t *gsi_info)
>>> +{
>>> +       struct physdev_map_pirq map_irq;
>>> +       int ret;
>>> +
>>> +       map_irq.domid = DOMID_SELF;
>>> +       map_irq.type = MAP_PIRQ_TYPE_GSI;
>>> +       map_irq.index = gsi_info->gsi;
>>> +       map_irq.pirq = gsi_info->gsi;
>>> +
>>> +       ret = HYPERVISOR_physdev_op(PHYSDEVOP_map_pirq, &map_irq);
>>> +       gsi_info->pirq = map_irq.pirq;
>>> +
>>> +       return ret;
>>> +}
>>> +
>>> +static int xen_pvh_unmap_pirq(gsi_info_t *gsi_info)
>>> +{
>>> +       struct physdev_unmap_pirq unmap_irq;
>>> +
>>> +       unmap_irq.domid = DOMID_SELF;
>>> +       unmap_irq.pirq = gsi_info->pirq;
>>> +
>>> +       return HYPERVISOR_physdev_op(PHYSDEVOP_unmap_pirq, &unmap_irq);
>>> +}
>>> +
>>> +static int xen_pvh_setup_gsi(gsi_info_t *gsi_info)
>>> +{
>>> +       struct physdev_setup_gsi setup_gsi;
>>> +
>>> +       setup_gsi.gsi = gsi_info->gsi;
>>> +       setup_gsi.triggering = (gsi_info->trigger == ACPI_EDGE_SENSITIVE ? 0 : 1);
>>> +       setup_gsi.polarity = (gsi_info->polarity == ACPI_ACTIVE_HIGH ? 0 : 1);
>>> +
>>> +       return HYPERVISOR_physdev_op(PHYSDEVOP_setup_gsi, &setup_gsi);
>>> +}
>>> +
>>> +int xen_pvh_passthrough_gsi(struct pci_dev *dev)
>>> +{
>>> +       int ret;
>>> +       gsi_info_t gsi_info;
>>> +
>>> +       if (!dev) {
>>> +               return -EINVAL;
>>> +       }
>>> +
>>> +       ret = xen_pvh_get_gsi_info(dev, &gsi_info);
>>> +       if (ret) {
>>> +               xen_raw_printk("Fail to get gsi info!\n");
>>> +               return ret;
>>> +       }
>>> +
>>> +       ret = xen_pvh_map_pirq(&gsi_info);
>>> +       if (ret) {
>>> +               xen_raw_printk("Fail to map pirq for gsi (%d)!\n", gsi_info.gsi);
>>> +               return ret;
>>> +       }
>>> +
>>> +       ret = xen_pvh_setup_gsi(&gsi_info);
>>> +       if (ret == -EEXIST) {
>>> +               ret = 0;
>>> +               xen_raw_printk("Already setup the GSI :%u\n", gsi_info.gsi);
>>> +       } else if (ret) {
>>> +               xen_raw_printk("Fail to setup gsi (%d)!\n", gsi_info.gsi);
>>> +               xen_pvh_unmap_pirq(&gsi_info);
>>> +       }
>>> +
>>> +       return ret;
>>> +}
>>> +EXPORT_SYMBOL_GPL(xen_pvh_passthrough_gsi);
>>> +
>>>   void __init xen_pvh_init(struct boot_params *boot_params)
>>>   {
>>>          u32 msr;
>>> diff --git a/drivers/acpi/pci_irq.c b/drivers/acpi/pci_irq.c
>>> index ff30ceca2203..630fe0a34bc6 100644
>>> --- a/drivers/acpi/pci_irq.c
>>> +++ b/drivers/acpi/pci_irq.c
>>> @@ -288,7 +288,7 @@ static int acpi_reroute_boot_interrupt(struct pci_dev *dev,
>>>   }
>>>   #endif /* CONFIG_X86_IO_APIC */
>>>
>>> -static struct acpi_prt_entry *acpi_pci_irq_lookup(struct pci_dev *dev, int pin)
>>> +struct acpi_prt_entry *acpi_pci_irq_lookup(struct pci_dev *dev, int pin)
>>>   {
>>>          struct acpi_prt_entry *entry = NULL;
>>>          struct pci_dev *bridge;
>>> diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
>>> index e34b623e4b41..1abd4dad6f40 100644
>>> --- a/drivers/xen/xen-pciback/pci_stub.c
>>> +++ b/drivers/xen/xen-pciback/pci_stub.c
>>> @@ -20,6 +20,7 @@
>>>   #include <linux/atomic.h>
>>>   #include <xen/events.h>
>>>   #include <xen/pci.h>
>>> +#include <xen/acpi.h>
>>>   #include <xen/xen.h>
>>>   #include <asm/xen/hypervisor.h>
>>>   #include <xen/interface/physdev.h>
>>> @@ -399,6 +400,12 @@ static int pcistub_init_device(struct pci_dev *dev)
>>>          if (err)
>>>                  goto config_release;
>>>
>>> +       if (xen_initial_domain() && xen_pvh_domain()) {
>>> +               err = xen_pvh_passthrough_gsi(dev);
>>> +               if (err)
>>> +                       goto config_release;
>>> +       }
>>> +
>>>          if (dev->msix_cap) {
>>>                  struct physdev_pci_device ppdev = {
>>>                          .seg = pci_domain_nr(dev->bus),
>>> diff --git a/include/linux/acpi.h b/include/linux/acpi.h
>>> index 641dc4843987..368d56ba2c5e 100644
>>> --- a/include/linux/acpi.h
>>> +++ b/include/linux/acpi.h
>>> @@ -375,6 +375,7 @@ void acpi_unregister_gsi (u32 gsi);
>>>
>>>   struct pci_dev;
>>>
>>> +struct acpi_prt_entry *acpi_pci_irq_lookup(struct pci_dev *dev, int pin);
>>>   int acpi_pci_irq_enable (struct pci_dev *dev);
>>>   void acpi_penalize_isa_irq(int irq, int active);
>>>   bool acpi_isa_irq_available(int irq);
>>> diff --git a/include/xen/acpi.h b/include/xen/acpi.h
>>> index b1e11863144d..ce7f5554f88e 100644
>>> --- a/include/xen/acpi.h
>>> +++ b/include/xen/acpi.h
>>> @@ -67,6 +67,7 @@ static inline void xen_acpi_sleep_register(void)
>>>                  acpi_suspend_lowlevel = xen_acpi_suspend_lowlevel;
>>>          }
>>>   }
>>> +int xen_pvh_passthrough_gsi(struct pci_dev *dev);
>>>   #else
>>>   static inline void xen_acpi_sleep_register(void)
>>>   {
>>>
>>>>
>>>> Jan
>>>
>>> -- 
>>> Best regards,
>>> Jiqian Chen.
>>>
> 

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC KERNEL PATCH v2 2/3] xen/pvh: Unmask irq for passthrough device in PVH dom0
  2023-12-06  6:07                     ` Chen, Jiqian
  2023-12-07  2:18                       ` Stefano Stabellini
@ 2023-12-11 15:45                       ` Roger Pau Monné
  2023-12-12  6:16                         ` Chen, Jiqian
  1 sibling, 1 reply; 49+ messages in thread
From: Roger Pau Monné @ 2023-12-11 15:45 UTC (permalink / raw)
  To: Chen, Jiqian
  Cc: Jan Beulich, Stefano Stabellini, Juergen Gross,
	Oleksandr Tyshchenko, Thomas Gleixner, Boris Ostrovsky,
	Rafael J . Wysocki, Len Brown, Bjorn Helgaas, xen-devel,
	linux-kernel, linux-acpi, Stabellini, Stefano, Deucher,
	Alexander, Koenig, Christian, Hildebrand, Stewart, Ragiadakou,
	Xenia, Huang, Honglei1, Zhang, Julia, Huang, Ray

On Wed, Dec 06, 2023 at 06:07:26AM +0000, Chen, Jiqian wrote:
> On 2023/12/5 18:32, Jan Beulich wrote:
> > On 05.12.2023 10:19, Roger Pau Monné wrote:
> >> On Mon, Dec 04, 2023 at 02:19:33PM -0800, Stefano Stabellini wrote:
> >>> On Mon, 4 Dec 2023, Roger Pau Monné wrote:
> >>>> On Fri, Dec 01, 2023 at 07:37:55PM -0800, Stefano Stabellini wrote:
> >>>>> On Fri, 1 Dec 2023, Roger Pau Monné wrote:
> >>>>>> On Thu, Nov 30, 2023 at 07:15:17PM -0800, Stefano Stabellini wrote:
> >>>>>>> On Thu, 30 Nov 2023, Roger Pau Monné wrote:
> >>>>>>>> On Wed, Nov 29, 2023 at 07:53:59PM -0800, Stefano Stabellini wrote:
> >>>>>>>>> On Fri, 24 Nov 2023, Jiqian Chen wrote:
> >>>>>>>>>> This patch is to solve two problems we encountered when we try to
> >>>>>>>>>> passthrough a device to hvm domU base on Xen PVH dom0.
> >>>>>>>>>>
> >>>>>>>>>> First, hvm guest will alloc a pirq and irq for a passthrough device
> >>>>>>>>>> by using gsi, before that, the gsi must first has a mapping in dom0,
> >>>>>>>>>> see Xen code pci_add_dm_done->xc_domain_irq_permission, it will call
> >>>>>>>>>> into Xen and check whether dom0 has the mapping. See
> >>>>>>>>>> XEN_DOMCTL_irq_permission->pirq_access_permitted, "current" is PVH
> >>>>>>>>>> dom0 and it return irq is 0, and then return -EPERM.
> >>>>>>>>>> This is because the passthrough device doesn't do PHYSDEVOP_map_pirq
> >>>>>>>>>> when thay are enabled.
> >>>>>>>>>>
> >>>>>>>>>> Second, in PVH dom0, the gsi of a passthrough device doesn't get
> >>>>>>>>>> registered, but gsi must be configured for it to be able to be
> >>>>>>>>>> mapped into a domU.
> >>>>>>>>>>
> >>>>>>>>>> After searching codes, we can find map_pirq and register_gsi will be
> >>>>>>>>>> done in function vioapic_write_redirent->vioapic_hwdom_map_gsi when
> >>>>>>>>>> the gsi(aka ioapic's pin) is unmasked in PVH dom0. So the problems
> >>>>>>>>>> can be conclude to that the gsi of a passthrough device doesn't be
> >>>>>>>>>> unmasked.
> >>>>>>>>>>
> >>>>>>>>>> To solve the unmaske problem, this patch call the unmask_irq when we
> >>>>>>>>>> assign a device to be passthrough. So that the gsi can get registered
> >>>>>>>>>> and mapped in PVH dom0.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Roger, this seems to be more of a Xen issue than a Linux issue. Why do
> >>>>>>>>> we need the unmask check in Xen? Couldn't we just do:
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
> >>>>>>>>> index 4e40d3609a..df262a4a18 100644
> >>>>>>>>> --- a/xen/arch/x86/hvm/vioapic.c
> >>>>>>>>> +++ b/xen/arch/x86/hvm/vioapic.c
> >>>>>>>>> @@ -287,7 +287,7 @@ static void vioapic_write_redirent(
> >>>>>>>>>              hvm_dpci_eoi(d, gsi);
> >>>>>>>>>      }
> >>>>>>>>>  
> >>>>>>>>> -    if ( is_hardware_domain(d) && unmasked )
> >>>>>>>>> +    if ( is_hardware_domain(d) )
> >>>>>>>>>      {
> >>>>>>>>>          /*
> >>>>>>>>>           * NB: don't call vioapic_hwdom_map_gsi while holding hvm.irq_lock
> >>>>>>>>
> >>>>>>>> There are some issues with this approach.
> >>>>>>>>
> >>>>>>>> mp_register_gsi() will only setup the trigger and polarity of the
> >>>>>>>> IO-APIC pin once, so we do so once the guest unmask the pin in order
> >>>>>>>> to assert that the configuration is the intended one.  A guest is
> >>>>>>>> allowed to write all kind of nonsense stuff to the IO-APIC RTE, but
> >>>>>>>> that doesn't take effect unless the pin is unmasked.
> >>>>>>>>
> >>>>>>>> Overall the question would be whether we have any guarantees that
> >>>>>>>> the hardware domain has properly configured the pin, even if it's not
> >>>>>>>> using it itself (as it hasn't been unmasked).
> >>>>>>>>
> >>>>>>>> IIRC PCI legacy interrupts are level triggered and low polarity, so we
> >>>>>>>> could configure any pins that are not setup at bind time?
> >>>>>>>
> >>>>>>> That could work.
> >>>>>>>
> >>>>>>> Another idea is to move only the call to allocate_and_map_gsi_pirq at
> >>>>>>> bind time? That might be enough to pass a pirq_access_permitted check.
> >>>>>>
> >>>>>> Maybe, albeit that would change the behavior of XEN_DOMCTL_bind_pt_irq
> >>>>>> just for PT_IRQ_TYPE_PCI and only when called from a PVH dom0 (as the
> >>>>>> parameter would be a GSI instead of a previously mapped IRQ).  Such
> >>>>>> difference just for PT_IRQ_TYPE_PCI is slightly weird - if we go that
> >>>>>> route I would recommend that we instead introduce a new dmop that has
> >>>>>> this syntax regardless of the domain type it's called from.
> >>>>>
> >>>>> Looking at the code it is certainly a bit confusing. My point was that
> >>>>> we don't need to wait until polarity and trigger are set appropriately
> >>>>> to allow Dom0 to pass successfully a pirq_access_permitted() check. Xen
> >>>>> should be able to figure out that Dom0 is permitted pirq access.
> >>>>
> >>>> The logic is certainly not straightforward, and it could benefit from
> >>>> some comments.
> >>>>
> >>>> The irq permissions are a bit special, in that they get setup when the
> >>>> IRQ is mapped.
> >>>>
> >>>> The problem however is not so much with IRQ permissions, that we can
> >>>> indeed sort out internally in Xen.  Such check in dom0 has the side
> >>>> effect of preventing the IRQ from being assigned to a domU without the
> >>>> hardware source being properly configured AFAICT.
> >>>
> >>> Now I understand why you made a comment previously about Xen having to
> >>> configure trigger and polarity for these interrupts on its own.
> >>>
> >>>
> >>>>> So the idea was to move the call to allocate_and_map_gsi_pirq() earlier
> >>>>> somewhere because allocate_and_map_gsi_pirq doesn't require trigger or
> >>>>> polarity to be configured to work. But the suggestion of doing it a
> >>>>> "bind time" (meaning: XEN_DOMCTL_bind_pt_irq) was a bad idea.
> >>>>>
> >>>>> But maybe we can find another location, maybe within
> >>>>> xen/arch/x86/hvm/vioapic.c, to call allocate_and_map_gsi_pirq() before
> >>>>> trigger and polarity are set and before the interrupt is unmasked.
> >>>>>
> >>>>> Then we change the implementation of vioapic_hwdom_map_gsi to skip the
> >>>>> call to allocate_and_map_gsi_pirq, because by the time
> >>>>> vioapic_hwdom_map_gsi we assume that allocate_and_map_gsi_pirq had
> >>>>> already been done.
> >>>>
> >>>> But then we would end up in a situation where the
> >>>> pirq_access_permitted() check will pass, but the IO-APIC pin won't be
> >>>> configured, which I think it's not what we want.
> >>>>
> >>>> One option would be to allow mp_register_gsi() to be called multiple
> >>>> times, and update the IO-APIC pin configuration as long as the pin is
> >>>> not unmasked.  That would propagate each dom0 RTE update to the
> >>>> underlying IO-APIC.  However such approach relies on dom0 configuring
> >>>> all possible IO-APIC pins, even if no device on dom0 is using them, I
> >>>> think it's not a very reliable option.
> >>>>
> >>>> Another option would be to modify the toolstack to setup the GSI
> >>>> itself using the PHYSDEVOP_setup_gsi hypercall.  As said in a previous
> >>>> email, since we only care about PCI device passthrough the legacy INTx
> >>>> should always be level triggered and low polarity.
> >>>>
> >>>>> I am not familiar with vioapic.c but to give you an idea of what I was
> >>>>> thinking:
> >>>>>
> >>>>>
> >>>>> diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
> >>>>> index 4e40d3609a..16d56fe851 100644
> >>>>> --- a/xen/arch/x86/hvm/vioapic.c
> >>>>> +++ b/xen/arch/x86/hvm/vioapic.c
> >>>>> @@ -189,14 +189,6 @@ static int vioapic_hwdom_map_gsi(unsigned int gsi, unsigned int trig,
> >>>>>          return ret;
> >>>>>      }
> >>>>>  
> >>>>> -    ret = allocate_and_map_gsi_pirq(currd, pirq, &pirq);
> >>>>> -    if ( ret )
> >>>>> -    {
> >>>>> -        gprintk(XENLOG_WARNING, "vioapic: error mapping GSI %u: %d\n",
> >>>>> -                 gsi, ret);
> >>>>> -        return ret;
> >>>>> -    }
> >>>>> -
> >>>>>      pcidevs_lock();
> >>>>>      ret = pt_irq_create_bind(currd, &pt_irq_bind);
> >>>>>      if ( ret )
> >>>>> @@ -287,6 +279,17 @@ static void vioapic_write_redirent(
> >>>>>              hvm_dpci_eoi(d, gsi);
> >>>>>      }
> >>>>>  
> >>>>> +    if ( is_hardware_domain(d) ) 
> >>>>> +    {
> >>>>> +        int pirq = gsi, ret;
> >>>>> +        ret = allocate_and_map_gsi_pirq(currd, pirq, &pirq);
> >>>>> +        if ( ret )
> >>>>> +        {
> >>>>> +            gprintk(XENLOG_WARNING, "vioapic: error mapping GSI %u: %d\n",
> >>>>> +                    gsi, ret);
> >>>>> +            return ret;
> >>>>> +        }
> >>>>> +    }
> >>>>>      if ( is_hardware_domain(d) && unmasked )
> >>>>>      {
> >>>>>          /*
> >>>>
> >>>> As said above, such approach relies on dom0 writing to the IO-APIC RTE
> >>>> of likely each IO-APIC pin, which is IMO not quite reliable.  In there
> >>>> are two different issues here that need to be fixed for PVH dom0:
> >>>>
> >>>>  - Fix the XEN_DOMCTL_irq_permission pirq_access_permitted() call to
> >>>>    succeed for a PVH dom0, even if dom0 is not using the GSI itself.
> >>>
> >>> Yes makes sense
> >>>
> >>>
> >>>>  - Configure IO-APIC pins for PCI interrupts even if dom0 is not using
> >>>>    the IO-APIC pin itself.
> >>>>
> >>>> First one needs to be fixed internally in Xen, second one will require
> >>>> the toolstack to issue an extra hypercall in order to ensure the
> >>>> IO-APIC pin is properly configured.
> >>>  
> >>> On ARM, Xen doesn't need to wait for dom0 to configure interrupts
> >>> correctly. Xen configures them all on its own at boot based on Device
> >>> Tree information. I guess it is not possible to do the same on x86?
> >>
> >> No, not exactly.  There's some interrupt information in the ACPI MADT,
> >> but that's just for very specific sources (Interrupt Source Override
> >> Structures)
> >>
> >> Then on AML devices can have resource descriptors that contain
> >> information about how interrupts are setup.  However Xen is not able
> >> to read any of this information on AML.
> >>
> >> Legacy PCI interrupts are (always?) level triggered and low polarity,
> >> because it's assumed that an interrupt source can be shared between
> >> multiple devices.
> > 
> > Except that as per what you said just in the earlier paragraph ACPI can
> > tell us otherwise.
> > 
> >> I'm however not able to find any reference to this in the PCI spec,
> >> hence I'm reluctant to take this for granted in Xen, and default all
> >> GSIs >= 16 to such mode.
> >>
> >> OTOH legacy PCI interrupts are not that used anymore, as almost all
> >> devices will support MSI(-X) (because PCIe mandates it) and OSes
> >> should prefer the latter.  SR-IOV VF don't even support legacy PCI
> >> interrupts anymore.
> >>
> >>> If
> >>> not, then I can see why we would need 1 extra toolstack hypercall for
> >>> that (or to bundle the operation of configuring IO-APIC pins together
> >>> with an existing toolstack hypercall).
> >>
> >> One suitable compromise would be to default unconfigured GSIs >= 16 to
> >> level-triggered and low-polarity, as I would expect that to work in
> >> almost all cases.  We can always introduce the usage of
> >> PHYSDEVOP_setup_gsi later if required.
> >>
> >> Maybe Jan has more input here, would you agree to defaulting non-ISA
> >> GSIs to level-triggered, low-polarity in the absence of a specific
> >> setup provided by dom0?
> > 
> > Well, such defaulting is an option, but in case it's wrong we might
> > end up with hard to diagnose issues. Personally I'd prefer if we
> > didn't take shortcuts here, i.e. if we followed what Dom0 is able
> > to read from ACPI.
> 
> When PVH dom0 enable a device, it will get trigger and polarity from ACPI (see acpi_pci_irq_enable)
> I have a version of patch which tried that way, see below:
> 
> diff --git a/arch/x86/xen/enlighten_pvh.c b/arch/x86/xen/enlighten_pvh.c
> index ada3868c02c2..43e1bda9f946 100644
> --- a/arch/x86/xen/enlighten_pvh.c
> +++ b/arch/x86/xen/enlighten_pvh.c
> @@ -1,6 +1,7 @@
>  // SPDX-License-Identifier: GPL-2.0
>  #include <linux/acpi.h>
>  #include <linux/export.h>
> +#include <linux/pci.h>
> 
>  #include <xen/hvc-console.h>
> 
> @@ -25,6 +26,127 @@
>  bool __ro_after_init xen_pvh;
>  EXPORT_SYMBOL_GPL(xen_pvh);
> 
> +typedef struct gsi_info {
> +       int gsi;
> +       int trigger;
> +       int polarity;
> +       int pirq;
> +} gsi_info_t;
> +
> +struct acpi_prt_entry {
> +       struct acpi_pci_id      id;
> +       u8                      pin;
> +       acpi_handle             link;
> +       u32                     index;          /* GSI, or link _CRS index */
> +};
> +
> +static int xen_pvh_get_gsi_info(struct pci_dev *dev,
> +                                                               gsi_info_t *gsi_info)
> +{
> +       int gsi;
> +       u8 pin = 0;
> +       struct acpi_prt_entry *entry;
> +       int trigger = ACPI_LEVEL_SENSITIVE;
> +       int polarity = acpi_irq_model == ACPI_IRQ_MODEL_GIC ?
> +                                     ACPI_ACTIVE_HIGH : ACPI_ACTIVE_LOW;
> +
> +       if (dev)
> +               pin = dev->pin;
> +       if (!pin) {
> +               xen_raw_printk("No interrupt pin configured\n");
> +               return -EINVAL;
> +       }
> +
> +       entry = acpi_pci_irq_lookup(dev, pin);
> +       if (entry) {
> +               if (entry->link)
> +                       gsi = acpi_pci_link_allocate_irq(entry->link,
> +                                                        entry->index,
> +                                                        &trigger, &polarity,
> +                                                        NULL);
> +               else
> +                       gsi = entry->index;
> +       } else
> +               return -EINVAL;
> +
> +       gsi_info->gsi = gsi;
> +       gsi_info->trigger = trigger;
> +       gsi_info->polarity = polarity;
> +
> +       return 0;
> +}
> +
> +static int xen_pvh_map_pirq(gsi_info_t *gsi_info)
> +{
> +       struct physdev_map_pirq map_irq;
> +       int ret;
> +
> +       map_irq.domid = DOMID_SELF;
> +       map_irq.type = MAP_PIRQ_TYPE_GSI;
> +       map_irq.index = gsi_info->gsi;
> +       map_irq.pirq = gsi_info->gsi;
> +
> +       ret = HYPERVISOR_physdev_op(PHYSDEVOP_map_pirq, &map_irq);
> +       gsi_info->pirq = map_irq.pirq;
> +
> +       return ret;
> +}
> +
> +static int xen_pvh_unmap_pirq(gsi_info_t *gsi_info)
> +{
> +       struct physdev_unmap_pirq unmap_irq;
> +
> +       unmap_irq.domid = DOMID_SELF;
> +       unmap_irq.pirq = gsi_info->pirq;
> +
> +       return HYPERVISOR_physdev_op(PHYSDEVOP_unmap_pirq, &unmap_irq);
> +}
> +
> +static int xen_pvh_setup_gsi(gsi_info_t *gsi_info)
> +{
> +       struct physdev_setup_gsi setup_gsi;
> +
> +       setup_gsi.gsi = gsi_info->gsi;
> +       setup_gsi.triggering = (gsi_info->trigger == ACPI_EDGE_SENSITIVE ? 0 : 1);
> +       setup_gsi.polarity = (gsi_info->polarity == ACPI_ACTIVE_HIGH ? 0 : 1);
> +
> +       return HYPERVISOR_physdev_op(PHYSDEVOP_setup_gsi, &setup_gsi);
> +}

Hm, why not simply call pcibios_enable_device() from pciback?  What
you are doing here using the hypercalls is a backdoor into what's done
automatically by Xen on IO-APIC accesses by a PVH dom0.

It will be much more natural for the PVH dom0 model to simply use the
native way to configure and unmask the IO-APIC pin, and that would
correctly setup the triggering/polarity and bind it to dom0 without
requiring the usage of any hypercalls.

Is that an issue since in that case the gsi will get mapped and bound
to dom0?

Otherwise I would prefer if the gsi is just configured from pciback
(PHYSDEVOP_setup_gsi) but not mapped, as allowing a PVH dom0 to map
interrupts using PHYSDEVOP_{,un}map_pirq to itself introduces a new
interface to manage interrupts that clashes with the native way that a
PVH dom0 uses.

Thanks, Roger.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC KERNEL PATCH v2 2/3] xen/pvh: Unmask irq for passthrough device in PVH dom0
  2023-12-11 15:45                       ` Roger Pau Monné
@ 2023-12-12  6:16                         ` Chen, Jiqian
  2023-12-12  8:49                           ` Roger Pau Monné
  0 siblings, 1 reply; 49+ messages in thread
From: Chen, Jiqian @ 2023-12-12  6:16 UTC (permalink / raw)
  To: Roger Pau Monné, Thomas Gleixner, Stefano Stabellini
  Cc: Jan Beulich, Juergen Gross, Oleksandr Tyshchenko,
	Boris Ostrovsky, Rafael J . Wysocki, Len Brown, Bjorn Helgaas,
	xen-devel, linux-kernel, linux-acpi, Stabellini, Stefano,
	Deucher, Alexander, Koenig, Christian, Hildebrand, Stewart,
	Ragiadakou, Xenia, Huang, Honglei1, Zhang, Julia, Huang, Ray,
	Chen, Jiqian

On 2023/12/11 23:45, Roger Pau Monné wrote:
> On Wed, Dec 06, 2023 at 06:07:26AM +0000, Chen, Jiqian wrote:
>>
>> When PVH dom0 enable a device, it will get trigger and polarity from ACPI (see acpi_pci_irq_enable)
>> I have a version of patch which tried that way, see below:
>>
>> diff --git a/arch/x86/xen/enlighten_pvh.c b/arch/x86/xen/enlighten_pvh.c
>> index ada3868c02c2..43e1bda9f946 100644
>> --- a/arch/x86/xen/enlighten_pvh.c
>> +++ b/arch/x86/xen/enlighten_pvh.c
>> @@ -1,6 +1,7 @@
>>  // SPDX-License-Identifier: GPL-2.0
>>  #include <linux/acpi.h>
>>  #include <linux/export.h>
>> +#include <linux/pci.h>
>>
>>  #include <xen/hvc-console.h>
>>
>> @@ -25,6 +26,127 @@
>>  bool __ro_after_init xen_pvh;
>>  EXPORT_SYMBOL_GPL(xen_pvh);
>>
>> +typedef struct gsi_info {
>> +       int gsi;
>> +       int trigger;
>> +       int polarity;
>> +       int pirq;
>> +} gsi_info_t;
>> +
>> +struct acpi_prt_entry {
>> +       struct acpi_pci_id      id;
>> +       u8                      pin;
>> +       acpi_handle             link;
>> +       u32                     index;          /* GSI, or link _CRS index */
>> +};
>> +
>> +static int xen_pvh_get_gsi_info(struct pci_dev *dev,
>> +                                                               gsi_info_t *gsi_info)
>> +{
>> +       int gsi;
>> +       u8 pin = 0;
>> +       struct acpi_prt_entry *entry;
>> +       int trigger = ACPI_LEVEL_SENSITIVE;
>> +       int polarity = acpi_irq_model == ACPI_IRQ_MODEL_GIC ?
>> +                                     ACPI_ACTIVE_HIGH : ACPI_ACTIVE_LOW;
>> +
>> +       if (dev)
>> +               pin = dev->pin;
>> +       if (!pin) {
>> +               xen_raw_printk("No interrupt pin configured\n");
>> +               return -EINVAL;
>> +       }
>> +
>> +       entry = acpi_pci_irq_lookup(dev, pin);
>> +       if (entry) {
>> +               if (entry->link)
>> +                       gsi = acpi_pci_link_allocate_irq(entry->link,
>> +                                                        entry->index,
>> +                                                        &trigger, &polarity,
>> +                                                        NULL);
>> +               else
>> +                       gsi = entry->index;
>> +       } else
>> +               return -EINVAL;
>> +
>> +       gsi_info->gsi = gsi;
>> +       gsi_info->trigger = trigger;
>> +       gsi_info->polarity = polarity;
>> +
>> +       return 0;
>> +}
>> +
>> +static int xen_pvh_map_pirq(gsi_info_t *gsi_info)
>> +{
>> +       struct physdev_map_pirq map_irq;
>> +       int ret;
>> +
>> +       map_irq.domid = DOMID_SELF;
>> +       map_irq.type = MAP_PIRQ_TYPE_GSI;
>> +       map_irq.index = gsi_info->gsi;
>> +       map_irq.pirq = gsi_info->gsi;
>> +
>> +       ret = HYPERVISOR_physdev_op(PHYSDEVOP_map_pirq, &map_irq);
>> +       gsi_info->pirq = map_irq.pirq;
>> +
>> +       return ret;
>> +}
>> +
>> +static int xen_pvh_unmap_pirq(gsi_info_t *gsi_info)
>> +{
>> +       struct physdev_unmap_pirq unmap_irq;
>> +
>> +       unmap_irq.domid = DOMID_SELF;
>> +       unmap_irq.pirq = gsi_info->pirq;
>> +
>> +       return HYPERVISOR_physdev_op(PHYSDEVOP_unmap_pirq, &unmap_irq);
>> +}
>> +
>> +static int xen_pvh_setup_gsi(gsi_info_t *gsi_info)
>> +{
>> +       struct physdev_setup_gsi setup_gsi;
>> +
>> +       setup_gsi.gsi = gsi_info->gsi;
>> +       setup_gsi.triggering = (gsi_info->trigger == ACPI_EDGE_SENSITIVE ? 0 : 1);
>> +       setup_gsi.polarity = (gsi_info->polarity == ACPI_ACTIVE_HIGH ? 0 : 1);
>> +
>> +       return HYPERVISOR_physdev_op(PHYSDEVOP_setup_gsi, &setup_gsi);
>> +}
> 
> Hm, why not simply call pcibios_enable_device() from pciback?  What
pcibios_enable_device had been called when using cmd "xl pci-assignable-add sbdf" from pciback. But it didn't do map_pirq and setup_gsi.
Because pcibios_enable_device-> pcibios_enable_irq-> __acpi_register_gsi(acpi_register_gsi_ioapic PVH specific)
> you are doing here using the hypercalls is a backdoor into what's done
> automatically by Xen on IO-APIC accesses by a PVH dom0.
But the gsi didn't be unmasked, and vioapic_hwdom_map_gsi is never called.
So, I think in pciback, if we can do what vioapic_hwdom_map_gsi does.
> 
> It will be much more natural for the PVH dom0 model to simply use the
> native way to configure and unmask the IO-APIC pin, and that would
> correctly setup the triggering/polarity and bind it to dom0 without
> requiring the usage of any hypercalls.
Do you still prefer that I called unmask_irq in pcistub_init_device, as this v2 patch do?
But Thomas Gleixner think it is not suitable to export unmask_irq.
> 
> Is that an issue since in that case the gsi will get mapped and bound
> to dom0?
Dom0 do map_pirq is to pass the check xc_domain_irq_permission()-> pirq_access_permitted(), 
> 
> Otherwise I would prefer if the gsi is just configured from pciback
> (PHYSDEVOP_setup_gsi) but not mapped, as allowing a PVH dom0 to map
> interrupts using PHYSDEVOP_{,un}map_pirq to itself introduces a new
> interface to manage interrupts that clashes with the native way that a
> PVH dom0 uses.
This method does map_pirq and setup_gsi only when a device is assigned to passthrough, it won't affect the other device using native way.

> 
> Thanks, Roger.

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC KERNEL PATCH v2 2/3] xen/pvh: Unmask irq for passthrough device in PVH dom0
  2023-12-12  6:16                         ` Chen, Jiqian
@ 2023-12-12  8:49                           ` Roger Pau Monné
  2023-12-12  9:38                             ` Jan Beulich
  0 siblings, 1 reply; 49+ messages in thread
From: Roger Pau Monné @ 2023-12-12  8:49 UTC (permalink / raw)
  To: Chen, Jiqian
  Cc: Thomas Gleixner, Stefano Stabellini, Jan Beulich, Juergen Gross,
	Oleksandr Tyshchenko, Boris Ostrovsky, Rafael J . Wysocki,
	Len Brown, Bjorn Helgaas, xen-devel, linux-kernel, linux-acpi,
	Stabellini, Stefano, Deucher, Alexander, Koenig, Christian,
	Hildebrand, Stewart, Ragiadakou, Xenia, Huang, Honglei1, Zhang,
	Julia, Huang, Ray

On Tue, Dec 12, 2023 at 06:16:43AM +0000, Chen, Jiqian wrote:
> On 2023/12/11 23:45, Roger Pau Monné wrote:
> > On Wed, Dec 06, 2023 at 06:07:26AM +0000, Chen, Jiqian wrote:
> >>
> >> When PVH dom0 enable a device, it will get trigger and polarity from ACPI (see acpi_pci_irq_enable)
> >> I have a version of patch which tried that way, see below:
> >>
> >> diff --git a/arch/x86/xen/enlighten_pvh.c b/arch/x86/xen/enlighten_pvh.c
> >> index ada3868c02c2..43e1bda9f946 100644
> >> --- a/arch/x86/xen/enlighten_pvh.c
> >> +++ b/arch/x86/xen/enlighten_pvh.c
> >> @@ -1,6 +1,7 @@
> >>  // SPDX-License-Identifier: GPL-2.0
> >>  #include <linux/acpi.h>
> >>  #include <linux/export.h>
> >> +#include <linux/pci.h>
> >>
> >>  #include <xen/hvc-console.h>
> >>
> >> @@ -25,6 +26,127 @@
> >>  bool __ro_after_init xen_pvh;
> >>  EXPORT_SYMBOL_GPL(xen_pvh);
> >>
> >> +typedef struct gsi_info {
> >> +       int gsi;
> >> +       int trigger;
> >> +       int polarity;
> >> +       int pirq;
> >> +} gsi_info_t;
> >> +
> >> +struct acpi_prt_entry {
> >> +       struct acpi_pci_id      id;
> >> +       u8                      pin;
> >> +       acpi_handle             link;
> >> +       u32                     index;          /* GSI, or link _CRS index */
> >> +};
> >> +
> >> +static int xen_pvh_get_gsi_info(struct pci_dev *dev,
> >> +                                                               gsi_info_t *gsi_info)
> >> +{
> >> +       int gsi;
> >> +       u8 pin = 0;
> >> +       struct acpi_prt_entry *entry;
> >> +       int trigger = ACPI_LEVEL_SENSITIVE;
> >> +       int polarity = acpi_irq_model == ACPI_IRQ_MODEL_GIC ?
> >> +                                     ACPI_ACTIVE_HIGH : ACPI_ACTIVE_LOW;
> >> +
> >> +       if (dev)
> >> +               pin = dev->pin;
> >> +       if (!pin) {
> >> +               xen_raw_printk("No interrupt pin configured\n");
> >> +               return -EINVAL;
> >> +       }
> >> +
> >> +       entry = acpi_pci_irq_lookup(dev, pin);
> >> +       if (entry) {
> >> +               if (entry->link)
> >> +                       gsi = acpi_pci_link_allocate_irq(entry->link,
> >> +                                                        entry->index,
> >> +                                                        &trigger, &polarity,
> >> +                                                        NULL);
> >> +               else
> >> +                       gsi = entry->index;
> >> +       } else
> >> +               return -EINVAL;
> >> +
> >> +       gsi_info->gsi = gsi;
> >> +       gsi_info->trigger = trigger;
> >> +       gsi_info->polarity = polarity;
> >> +
> >> +       return 0;
> >> +}
> >> +
> >> +static int xen_pvh_map_pirq(gsi_info_t *gsi_info)
> >> +{
> >> +       struct physdev_map_pirq map_irq;
> >> +       int ret;
> >> +
> >> +       map_irq.domid = DOMID_SELF;
> >> +       map_irq.type = MAP_PIRQ_TYPE_GSI;
> >> +       map_irq.index = gsi_info->gsi;
> >> +       map_irq.pirq = gsi_info->gsi;
> >> +
> >> +       ret = HYPERVISOR_physdev_op(PHYSDEVOP_map_pirq, &map_irq);
> >> +       gsi_info->pirq = map_irq.pirq;
> >> +
> >> +       return ret;
> >> +}
> >> +
> >> +static int xen_pvh_unmap_pirq(gsi_info_t *gsi_info)
> >> +{
> >> +       struct physdev_unmap_pirq unmap_irq;
> >> +
> >> +       unmap_irq.domid = DOMID_SELF;
> >> +       unmap_irq.pirq = gsi_info->pirq;
> >> +
> >> +       return HYPERVISOR_physdev_op(PHYSDEVOP_unmap_pirq, &unmap_irq);
> >> +}
> >> +
> >> +static int xen_pvh_setup_gsi(gsi_info_t *gsi_info)
> >> +{
> >> +       struct physdev_setup_gsi setup_gsi;
> >> +
> >> +       setup_gsi.gsi = gsi_info->gsi;
> >> +       setup_gsi.triggering = (gsi_info->trigger == ACPI_EDGE_SENSITIVE ? 0 : 1);
> >> +       setup_gsi.polarity = (gsi_info->polarity == ACPI_ACTIVE_HIGH ? 0 : 1);
> >> +
> >> +       return HYPERVISOR_physdev_op(PHYSDEVOP_setup_gsi, &setup_gsi);
> >> +}
> > 
> > Hm, why not simply call pcibios_enable_device() from pciback?  What
> pcibios_enable_device had been called when using cmd "xl pci-assignable-add sbdf" from pciback. But it didn't do map_pirq and setup_gsi.
> Because pcibios_enable_device-> pcibios_enable_irq-> __acpi_register_gsi(acpi_register_gsi_ioapic PVH specific)
> > you are doing here using the hypercalls is a backdoor into what's done
> > automatically by Xen on IO-APIC accesses by a PVH dom0.
> But the gsi didn't be unmasked, and vioapic_hwdom_map_gsi is never called.
> So, I think in pciback, if we can do what vioapic_hwdom_map_gsi does.
>

I see, it does setup the IO-APIC pin but doesn't unmask it, that's
what I feared.

> > It will be much more natural for the PVH dom0 model to simply use the
> > native way to configure and unmask the IO-APIC pin, and that would
> > correctly setup the triggering/polarity and bind it to dom0 without
> > requiring the usage of any hypercalls.
> Do you still prefer that I called unmask_irq in pcistub_init_device, as this v2 patch do?
> But Thomas Gleixner think it is not suitable to export unmask_irq.

Yeah, that wasn't good.

> > 
> > Is that an issue since in that case the gsi will get mapped and bound
> > to dom0?
> Dom0 do map_pirq is to pass the check xc_domain_irq_permission()-> pirq_access_permitted(), 

Can we see about finding another way to fix this check?

One option would be granting permissions over the IRQ in
PHYSDEVOP_setup_gsi?

Otherwise we could see about modifying the logic in PHYSDEVOP_map_pirq
so that the hardware domain can map IRQs to other domains without
having it mapped to itself first?

I think the call to PHYSDEVOP_setup_gsi in pciback is fine, but I
would really like to avoid the usage of PHYSDEVOP_{,un}map_pirq on a
PVH dom0 against itself.

> > 
> > Otherwise I would prefer if the gsi is just configured from pciback
> > (PHYSDEVOP_setup_gsi) but not mapped, as allowing a PVH dom0 to map
> > interrupts using PHYSDEVOP_{,un}map_pirq to itself introduces a new
> > interface to manage interrupts that clashes with the native way that a
> > PVH dom0 uses.
> This method does map_pirq and setup_gsi only when a device is assigned to passthrough, it won't affect the other device using native way.

It's not affected because of the specific usage in Linux, but allowing
the interface to be used against itself (so to manage interrupts
from assigned to dom0) is opening a whole new way to setup interrupts,
and it's unclear to me how that will affect the current way we use to
manage interrupts on a PVH dom0.

Thanks, Roger.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC KERNEL PATCH v2 2/3] xen/pvh: Unmask irq for passthrough device in PVH dom0
  2023-12-12  8:49                           ` Roger Pau Monné
@ 2023-12-12  9:38                             ` Jan Beulich
  2023-12-12 11:18                               ` Roger Pau Monné
  0 siblings, 1 reply; 49+ messages in thread
From: Jan Beulich @ 2023-12-12  9:38 UTC (permalink / raw)
  To: Roger Pau Monné, Chen, Jiqian
  Cc: Thomas Gleixner, Stefano Stabellini, Juergen Gross,
	Oleksandr Tyshchenko, Boris Ostrovsky, Rafael J . Wysocki,
	Len Brown, Bjorn Helgaas, xen-devel, linux-kernel, linux-acpi,
	Stabellini, Stefano, Deucher, Alexander, Koenig, Christian,
	Hildebrand, Stewart, Ragiadakou, Xenia, Huang, Honglei1, Zhang,
	Julia, Huang, Ray

(I think the Cc list is too long here, but then I don't know who to
keep and who to possibly drop.)

On 12.12.2023 09:49, Roger Pau Monné wrote:
> On Tue, Dec 12, 2023 at 06:16:43AM +0000, Chen, Jiqian wrote:
>> On 2023/12/11 23:45, Roger Pau Monné wrote:
>>> On Wed, Dec 06, 2023 at 06:07:26AM +0000, Chen, Jiqian wrote:
>>>> +static int xen_pvh_setup_gsi(gsi_info_t *gsi_info)
>>>> +{
>>>> +       struct physdev_setup_gsi setup_gsi;
>>>> +
>>>> +       setup_gsi.gsi = gsi_info->gsi;
>>>> +       setup_gsi.triggering = (gsi_info->trigger == ACPI_EDGE_SENSITIVE ? 0 : 1);
>>>> +       setup_gsi.polarity = (gsi_info->polarity == ACPI_ACTIVE_HIGH ? 0 : 1);
>>>> +
>>>> +       return HYPERVISOR_physdev_op(PHYSDEVOP_setup_gsi, &setup_gsi);
>>>> +}
>>>
>>> Hm, why not simply call pcibios_enable_device() from pciback?  What
>> pcibios_enable_device had been called when using cmd "xl pci-assignable-add sbdf" from pciback. But it didn't do map_pirq and setup_gsi.
>> Because pcibios_enable_device-> pcibios_enable_irq-> __acpi_register_gsi(acpi_register_gsi_ioapic PVH specific)
>>> you are doing here using the hypercalls is a backdoor into what's done
>>> automatically by Xen on IO-APIC accesses by a PVH dom0.
>> But the gsi didn't be unmasked, and vioapic_hwdom_map_gsi is never called.
>> So, I think in pciback, if we can do what vioapic_hwdom_map_gsi does.
>>
> 
> I see, it does setup the IO-APIC pin but doesn't unmask it, that's
> what I feared.
> 
>>> It will be much more natural for the PVH dom0 model to simply use the
>>> native way to configure and unmask the IO-APIC pin, and that would
>>> correctly setup the triggering/polarity and bind it to dom0 without
>>> requiring the usage of any hypercalls.
>> Do you still prefer that I called unmask_irq in pcistub_init_device, as this v2 patch do?
>> But Thomas Gleixner think it is not suitable to export unmask_irq.
> 
> Yeah, that wasn't good.
> 
>>>
>>> Is that an issue since in that case the gsi will get mapped and bound
>>> to dom0?
>> Dom0 do map_pirq is to pass the check xc_domain_irq_permission()-> pirq_access_permitted(), 
> 
> Can we see about finding another way to fix this check?
> 
> One option would be granting permissions over the IRQ in
> PHYSDEVOP_setup_gsi?

There's no domain available there, and imo it's also the wrong interface to
possibly grant any permissions.

> Otherwise we could see about modifying the logic in PHYSDEVOP_map_pirq
> so that the hardware domain can map IRQs to other domains without
> having it mapped to itself first?

While this may be possible to arrange for, it still would feel wrong. How
would you express the same in a disaggregated environment? I.e. how would
you make sure a domain trying to grant permission is actually permitted to
do so for the resource in question?

> I think the call to PHYSDEVOP_setup_gsi in pciback is fine, but I
> would really like to avoid the usage of PHYSDEVOP_{,un}map_pirq on a
> PVH dom0 against itself.

+1

Jan

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC KERNEL PATCH v2 2/3] xen/pvh: Unmask irq for passthrough device in PVH dom0
  2023-12-12  9:38                             ` Jan Beulich
@ 2023-12-12 11:18                               ` Roger Pau Monné
  2023-12-12 11:19                                 ` Jan Beulich
  0 siblings, 1 reply; 49+ messages in thread
From: Roger Pau Monné @ 2023-12-12 11:18 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Chen, Jiqian, Thomas Gleixner, Stefano Stabellini, Juergen Gross,
	Oleksandr Tyshchenko, Boris Ostrovsky, Rafael J . Wysocki,
	Len Brown, Bjorn Helgaas, xen-devel, linux-kernel, linux-acpi,
	Stabellini, Stefano, Deucher, Alexander, Koenig, Christian,
	Hildebrand, Stewart, Ragiadakou, Xenia, Huang, Honglei1, Zhang,
	Julia, Huang, Ray

On Tue, Dec 12, 2023 at 10:38:08AM +0100, Jan Beulich wrote:
> (I think the Cc list is too long here, but then I don't know who to
> keep and who to possibly drop.)
> 
> On 12.12.2023 09:49, Roger Pau Monné wrote:
> > On Tue, Dec 12, 2023 at 06:16:43AM +0000, Chen, Jiqian wrote:
> >> On 2023/12/11 23:45, Roger Pau Monné wrote:
> >>> On Wed, Dec 06, 2023 at 06:07:26AM +0000, Chen, Jiqian wrote:
> >>>> +static int xen_pvh_setup_gsi(gsi_info_t *gsi_info)
> >>>> +{
> >>>> +       struct physdev_setup_gsi setup_gsi;
> >>>> +
> >>>> +       setup_gsi.gsi = gsi_info->gsi;
> >>>> +       setup_gsi.triggering = (gsi_info->trigger == ACPI_EDGE_SENSITIVE ? 0 : 1);
> >>>> +       setup_gsi.polarity = (gsi_info->polarity == ACPI_ACTIVE_HIGH ? 0 : 1);
> >>>> +
> >>>> +       return HYPERVISOR_physdev_op(PHYSDEVOP_setup_gsi, &setup_gsi);
> >>>> +}
> >>>
> >>> Hm, why not simply call pcibios_enable_device() from pciback?  What
> >> pcibios_enable_device had been called when using cmd "xl pci-assignable-add sbdf" from pciback. But it didn't do map_pirq and setup_gsi.
> >> Because pcibios_enable_device-> pcibios_enable_irq-> __acpi_register_gsi(acpi_register_gsi_ioapic PVH specific)
> >>> you are doing here using the hypercalls is a backdoor into what's done
> >>> automatically by Xen on IO-APIC accesses by a PVH dom0.
> >> But the gsi didn't be unmasked, and vioapic_hwdom_map_gsi is never called.
> >> So, I think in pciback, if we can do what vioapic_hwdom_map_gsi does.
> >>
> > 
> > I see, it does setup the IO-APIC pin but doesn't unmask it, that's
> > what I feared.
> > 
> >>> It will be much more natural for the PVH dom0 model to simply use the
> >>> native way to configure and unmask the IO-APIC pin, and that would
> >>> correctly setup the triggering/polarity and bind it to dom0 without
> >>> requiring the usage of any hypercalls.
> >> Do you still prefer that I called unmask_irq in pcistub_init_device, as this v2 patch do?
> >> But Thomas Gleixner think it is not suitable to export unmask_irq.
> > 
> > Yeah, that wasn't good.
> > 
> >>>
> >>> Is that an issue since in that case the gsi will get mapped and bound
> >>> to dom0?
> >> Dom0 do map_pirq is to pass the check xc_domain_irq_permission()-> pirq_access_permitted(), 
> > 
> > Can we see about finding another way to fix this check?
> > 
> > One option would be granting permissions over the IRQ in
> > PHYSDEVOP_setup_gsi?
> 
> There's no domain available there, and imo it's also the wrong interface to
> possibly grant any permissions.

Well, the domain is the caller.

> > Otherwise we could see about modifying the logic in PHYSDEVOP_map_pirq
> > so that the hardware domain can map IRQs to other domains without
> > having it mapped to itself first?
> 
> While this may be possible to arrange for, it still would feel wrong. How
> would you express the same in a disaggregated environment? I.e. how would
> you make sure a domain trying to grant permission is actually permitted to
> do so for the resource in question?

I've been looking again at the original issue, and I think I was
confused.  The issue is not that dom0 doesn't have permissions over
the GSIs (as we do grant those in dom0_setup_permissions()), but
rather that XEN_DOMCTL_irq_permission attempts to use
domain_pirq_to_irq() in order to get the IRQ from the PIRQ parameter.

I've always been a bit confused with the PIRQ GSI relation, but IIRC
GSIs are always identity mapped to PIRQs, and hence you could possibly
adjust XEN_DOMCTL_irq_permission to use irq_permit_access() to check
if the caller domain has permissions over the target PIRQ.

Thanks, Roger.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC KERNEL PATCH v2 2/3] xen/pvh: Unmask irq for passthrough device in PVH dom0
  2023-12-12 11:18                               ` Roger Pau Monné
@ 2023-12-12 11:19                                 ` Jan Beulich
  2023-12-12 11:39                                   ` Roger Pau Monné
  0 siblings, 1 reply; 49+ messages in thread
From: Jan Beulich @ 2023-12-12 11:19 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Chen, Jiqian, Thomas Gleixner, Stefano Stabellini, Juergen Gross,
	Oleksandr Tyshchenko, Boris Ostrovsky, Rafael J . Wysocki,
	Len Brown, Bjorn Helgaas, xen-devel, linux-kernel, linux-acpi,
	Stabellini, Stefano, Deucher, Alexander, Koenig, Christian,
	Hildebrand, Stewart, Ragiadakou, Xenia, Huang, Honglei1, Zhang,
	Julia, Huang, Ray

On 12.12.2023 12:18, Roger Pau Monné wrote:
> On Tue, Dec 12, 2023 at 10:38:08AM +0100, Jan Beulich wrote:
>> (I think the Cc list is too long here, but then I don't know who to
>> keep and who to possibly drop.)
>>
>> On 12.12.2023 09:49, Roger Pau Monné wrote:
>>> On Tue, Dec 12, 2023 at 06:16:43AM +0000, Chen, Jiqian wrote:
>>>> On 2023/12/11 23:45, Roger Pau Monné wrote:
>>>>> On Wed, Dec 06, 2023 at 06:07:26AM +0000, Chen, Jiqian wrote:
>>>>>> +static int xen_pvh_setup_gsi(gsi_info_t *gsi_info)
>>>>>> +{
>>>>>> +       struct physdev_setup_gsi setup_gsi;
>>>>>> +
>>>>>> +       setup_gsi.gsi = gsi_info->gsi;
>>>>>> +       setup_gsi.triggering = (gsi_info->trigger == ACPI_EDGE_SENSITIVE ? 0 : 1);
>>>>>> +       setup_gsi.polarity = (gsi_info->polarity == ACPI_ACTIVE_HIGH ? 0 : 1);
>>>>>> +
>>>>>> +       return HYPERVISOR_physdev_op(PHYSDEVOP_setup_gsi, &setup_gsi);
>>>>>> +}
>>>>>
>>>>> Hm, why not simply call pcibios_enable_device() from pciback?  What
>>>> pcibios_enable_device had been called when using cmd "xl pci-assignable-add sbdf" from pciback. But it didn't do map_pirq and setup_gsi.
>>>> Because pcibios_enable_device-> pcibios_enable_irq-> __acpi_register_gsi(acpi_register_gsi_ioapic PVH specific)
>>>>> you are doing here using the hypercalls is a backdoor into what's done
>>>>> automatically by Xen on IO-APIC accesses by a PVH dom0.
>>>> But the gsi didn't be unmasked, and vioapic_hwdom_map_gsi is never called.
>>>> So, I think in pciback, if we can do what vioapic_hwdom_map_gsi does.
>>>>
>>>
>>> I see, it does setup the IO-APIC pin but doesn't unmask it, that's
>>> what I feared.
>>>
>>>>> It will be much more natural for the PVH dom0 model to simply use the
>>>>> native way to configure and unmask the IO-APIC pin, and that would
>>>>> correctly setup the triggering/polarity and bind it to dom0 without
>>>>> requiring the usage of any hypercalls.
>>>> Do you still prefer that I called unmask_irq in pcistub_init_device, as this v2 patch do?
>>>> But Thomas Gleixner think it is not suitable to export unmask_irq.
>>>
>>> Yeah, that wasn't good.
>>>
>>>>>
>>>>> Is that an issue since in that case the gsi will get mapped and bound
>>>>> to dom0?
>>>> Dom0 do map_pirq is to pass the check xc_domain_irq_permission()-> pirq_access_permitted(), 
>>>
>>> Can we see about finding another way to fix this check?
>>>
>>> One option would be granting permissions over the IRQ in
>>> PHYSDEVOP_setup_gsi?
>>
>> There's no domain available there, and imo it's also the wrong interface to
>> possibly grant any permissions.
> 
> Well, the domain is the caller.

Granting permission to itself?

Jan

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC KERNEL PATCH v2 2/3] xen/pvh: Unmask irq for passthrough device in PVH dom0
  2023-12-12 11:19                                 ` Jan Beulich
@ 2023-12-12 11:39                                   ` Roger Pau Monné
  2023-12-13  7:14                                     ` Chen, Jiqian
  0 siblings, 1 reply; 49+ messages in thread
From: Roger Pau Monné @ 2023-12-12 11:39 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Chen, Jiqian, Thomas Gleixner, Stefano Stabellini, Juergen Gross,
	Oleksandr Tyshchenko, Boris Ostrovsky, Rafael J . Wysocki,
	Len Brown, Bjorn Helgaas, xen-devel, linux-kernel, linux-acpi,
	Stabellini, Stefano, Deucher, Alexander, Koenig, Christian,
	Hildebrand, Stewart, Ragiadakou, Xenia, Huang, Honglei1, Zhang,
	Julia, Huang, Ray

On Tue, Dec 12, 2023 at 12:19:49PM +0100, Jan Beulich wrote:
> On 12.12.2023 12:18, Roger Pau Monné wrote:
> > On Tue, Dec 12, 2023 at 10:38:08AM +0100, Jan Beulich wrote:
> >> (I think the Cc list is too long here, but then I don't know who to
> >> keep and who to possibly drop.)
> >>
> >> On 12.12.2023 09:49, Roger Pau Monné wrote:
> >>> On Tue, Dec 12, 2023 at 06:16:43AM +0000, Chen, Jiqian wrote:
> >>>> On 2023/12/11 23:45, Roger Pau Monné wrote:
> >>>>> On Wed, Dec 06, 2023 at 06:07:26AM +0000, Chen, Jiqian wrote:
> >>>>>> +static int xen_pvh_setup_gsi(gsi_info_t *gsi_info)
> >>>>>> +{
> >>>>>> +       struct physdev_setup_gsi setup_gsi;
> >>>>>> +
> >>>>>> +       setup_gsi.gsi = gsi_info->gsi;
> >>>>>> +       setup_gsi.triggering = (gsi_info->trigger == ACPI_EDGE_SENSITIVE ? 0 : 1);
> >>>>>> +       setup_gsi.polarity = (gsi_info->polarity == ACPI_ACTIVE_HIGH ? 0 : 1);
> >>>>>> +
> >>>>>> +       return HYPERVISOR_physdev_op(PHYSDEVOP_setup_gsi, &setup_gsi);
> >>>>>> +}
> >>>>>
> >>>>> Hm, why not simply call pcibios_enable_device() from pciback?  What
> >>>> pcibios_enable_device had been called when using cmd "xl pci-assignable-add sbdf" from pciback. But it didn't do map_pirq and setup_gsi.
> >>>> Because pcibios_enable_device-> pcibios_enable_irq-> __acpi_register_gsi(acpi_register_gsi_ioapic PVH specific)
> >>>>> you are doing here using the hypercalls is a backdoor into what's done
> >>>>> automatically by Xen on IO-APIC accesses by a PVH dom0.
> >>>> But the gsi didn't be unmasked, and vioapic_hwdom_map_gsi is never called.
> >>>> So, I think in pciback, if we can do what vioapic_hwdom_map_gsi does.
> >>>>
> >>>
> >>> I see, it does setup the IO-APIC pin but doesn't unmask it, that's
> >>> what I feared.
> >>>
> >>>>> It will be much more natural for the PVH dom0 model to simply use the
> >>>>> native way to configure and unmask the IO-APIC pin, and that would
> >>>>> correctly setup the triggering/polarity and bind it to dom0 without
> >>>>> requiring the usage of any hypercalls.
> >>>> Do you still prefer that I called unmask_irq in pcistub_init_device, as this v2 patch do?
> >>>> But Thomas Gleixner think it is not suitable to export unmask_irq.
> >>>
> >>> Yeah, that wasn't good.
> >>>
> >>>>>
> >>>>> Is that an issue since in that case the gsi will get mapped and bound
> >>>>> to dom0?
> >>>> Dom0 do map_pirq is to pass the check xc_domain_irq_permission()-> pirq_access_permitted(), 
> >>>
> >>> Can we see about finding another way to fix this check?
> >>>
> >>> One option would be granting permissions over the IRQ in
> >>> PHYSDEVOP_setup_gsi?
> >>
> >> There's no domain available there, and imo it's also the wrong interface to
> >> possibly grant any permissions.
> > 
> > Well, the domain is the caller.
> 
> Granting permission to itself?

See below in the previous email, the issue is not with the
permissions, which are correctly assigned from
dom0_setup_permissions(), but the usage of domain_pirq_to_irq() in
pirq_access_permitted() as called by XEN_DOMCTL_irq_permission.
There's no need to play with the permissions at all.

Regards, Roger.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC KERNEL PATCH v2 2/3] xen/pvh: Unmask irq for passthrough device in PVH dom0
  2023-12-12 11:39                                   ` Roger Pau Monné
@ 2023-12-13  7:14                                     ` Chen, Jiqian
  2023-12-13  7:41                                       ` Jan Beulich
  0 siblings, 1 reply; 49+ messages in thread
From: Chen, Jiqian @ 2023-12-13  7:14 UTC (permalink / raw)
  To: Roger Pau Monné, Jan Beulich
  Cc: Stefano Stabellini, Juergen Gross, Oleksandr Tyshchenko,
	xen-devel, linux-kernel, Huang, Ray, Chen, Jiqian

On 2023/12/12 19:39, Roger Pau Monné wrote:
> On Tue, Dec 12, 2023 at 12:19:49PM +0100, Jan Beulich wrote:
>> On 12.12.2023 12:18, Roger Pau Monné wrote:
>>> On Tue, Dec 12, 2023 at 10:38:08AM +0100, Jan Beulich wrote:
>>>> (I think the Cc list is too long here, but then I don't know who to
>>>> keep and who to possibly drop.)
>>>>
>>>> On 12.12.2023 09:49, Roger Pau Monné wrote:
>>>>> On Tue, Dec 12, 2023 at 06:16:43AM +0000, Chen, Jiqian wrote:
>>>>>> On 2023/12/11 23:45, Roger Pau Monné wrote:
>>>>>>> On Wed, Dec 06, 2023 at 06:07:26AM +0000, Chen, Jiqian wrote:
>>>>>>>> +static int xen_pvh_setup_gsi(gsi_info_t *gsi_info)
>>>>>>>> +{
>>>>>>>> +       struct physdev_setup_gsi setup_gsi;
>>>>>>>> +
>>>>>>>> +       setup_gsi.gsi = gsi_info->gsi;
>>>>>>>> +       setup_gsi.triggering = (gsi_info->trigger == ACPI_EDGE_SENSITIVE ? 0 : 1);
>>>>>>>> +       setup_gsi.polarity = (gsi_info->polarity == ACPI_ACTIVE_HIGH ? 0 : 1);
>>>>>>>> +
>>>>>>>> +       return HYPERVISOR_physdev_op(PHYSDEVOP_setup_gsi, &setup_gsi);
>>>>>>>> +}
>>>>>>>
>>>>>>> Hm, why not simply call pcibios_enable_device() from pciback?  What
>>>>>> pcibios_enable_device had been called when using cmd "xl pci-assignable-add sbdf" from pciback. But it didn't do map_pirq and setup_gsi.
>>>>>> Because pcibios_enable_device-> pcibios_enable_irq-> __acpi_register_gsi(acpi_register_gsi_ioapic PVH specific)
>>>>>>> you are doing here using the hypercalls is a backdoor into what's done
>>>>>>> automatically by Xen on IO-APIC accesses by a PVH dom0.
>>>>>> But the gsi didn't be unmasked, and vioapic_hwdom_map_gsi is never called.
>>>>>> So, I think in pciback, if we can do what vioapic_hwdom_map_gsi does.
>>>>>>
>>>>>
>>>>> I see, it does setup the IO-APIC pin but doesn't unmask it, that's
>>>>> what I feared.
>>>>>
>>>>>>> It will be much more natural for the PVH dom0 model to simply use the
>>>>>>> native way to configure and unmask the IO-APIC pin, and that would
>>>>>>> correctly setup the triggering/polarity and bind it to dom0 without
>>>>>>> requiring the usage of any hypercalls.
>>>>>> Do you still prefer that I called unmask_irq in pcistub_init_device, as this v2 patch do?
>>>>>> But Thomas Gleixner think it is not suitable to export unmask_irq.
>>>>>
>>>>> Yeah, that wasn't good.
>>>>>
>>>>>>>
>>>>>>> Is that an issue since in that case the gsi will get mapped and bound
>>>>>>> to dom0?
>>>>>> Dom0 do map_pirq is to pass the check xc_domain_irq_permission()-> pirq_access_permitted(), 
>>>>>
>>>>> Can we see about finding another way to fix this check?
>>>>>
>>>>> One option would be granting permissions over the IRQ in
>>>>> PHYSDEVOP_setup_gsi?
>>>>
>>>> There's no domain available there, and imo it's also the wrong interface to
>>>> possibly grant any permissions.
>>>
>>> Well, the domain is the caller.
>>
>> Granting permission to itself?
> 
> See below in the previous email, the issue is not with the
> permissions, which are correctly assigned from
> dom0_setup_permissions(), but the usage of domain_pirq_to_irq() in
> pirq_access_permitted() as called by XEN_DOMCTL_irq_permission.
> There's no need to play with the permissions at all.
Yes, the problem is pci_add_dm_done-> xc_domain_irq_permission-> XEN_DOMCTL_irq_permission-> pirq_access_permitted->domain_pirq_to_irq->return irq is 0, so it failed.
I am think that since the PVH doesn't use pirq, can we just skip this irq_permission check for PVH?

> 
> Regards, Roger.

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC KERNEL PATCH v2 2/3] xen/pvh: Unmask irq for passthrough device in PVH dom0
  2023-12-13  7:14                                     ` Chen, Jiqian
@ 2023-12-13  7:41                                       ` Jan Beulich
  0 siblings, 0 replies; 49+ messages in thread
From: Jan Beulich @ 2023-12-13  7:41 UTC (permalink / raw)
  To: Chen, Jiqian
  Cc: Stefano Stabellini, Juergen Gross, Oleksandr Tyshchenko,
	xen-devel, linux-kernel, Huang, Ray, Roger Pau Monné

On 13.12.2023 08:14, Chen, Jiqian wrote:
> On 2023/12/12 19:39, Roger Pau Monné wrote:
>> On Tue, Dec 12, 2023 at 12:19:49PM +0100, Jan Beulich wrote:
>>> On 12.12.2023 12:18, Roger Pau Monné wrote:
>>>> On Tue, Dec 12, 2023 at 10:38:08AM +0100, Jan Beulich wrote:
>>>>> (I think the Cc list is too long here, but then I don't know who to
>>>>> keep and who to possibly drop.)
>>>>>
>>>>> On 12.12.2023 09:49, Roger Pau Monné wrote:
>>>>>> On Tue, Dec 12, 2023 at 06:16:43AM +0000, Chen, Jiqian wrote:
>>>>>>> On 2023/12/11 23:45, Roger Pau Monné wrote:
>>>>>>>> On Wed, Dec 06, 2023 at 06:07:26AM +0000, Chen, Jiqian wrote:
>>>>>>>>> +static int xen_pvh_setup_gsi(gsi_info_t *gsi_info)
>>>>>>>>> +{
>>>>>>>>> +       struct physdev_setup_gsi setup_gsi;
>>>>>>>>> +
>>>>>>>>> +       setup_gsi.gsi = gsi_info->gsi;
>>>>>>>>> +       setup_gsi.triggering = (gsi_info->trigger == ACPI_EDGE_SENSITIVE ? 0 : 1);
>>>>>>>>> +       setup_gsi.polarity = (gsi_info->polarity == ACPI_ACTIVE_HIGH ? 0 : 1);
>>>>>>>>> +
>>>>>>>>> +       return HYPERVISOR_physdev_op(PHYSDEVOP_setup_gsi, &setup_gsi);
>>>>>>>>> +}
>>>>>>>>
>>>>>>>> Hm, why not simply call pcibios_enable_device() from pciback?  What
>>>>>>> pcibios_enable_device had been called when using cmd "xl pci-assignable-add sbdf" from pciback. But it didn't do map_pirq and setup_gsi.
>>>>>>> Because pcibios_enable_device-> pcibios_enable_irq-> __acpi_register_gsi(acpi_register_gsi_ioapic PVH specific)
>>>>>>>> you are doing here using the hypercalls is a backdoor into what's done
>>>>>>>> automatically by Xen on IO-APIC accesses by a PVH dom0.
>>>>>>> But the gsi didn't be unmasked, and vioapic_hwdom_map_gsi is never called.
>>>>>>> So, I think in pciback, if we can do what vioapic_hwdom_map_gsi does.
>>>>>>>
>>>>>>
>>>>>> I see, it does setup the IO-APIC pin but doesn't unmask it, that's
>>>>>> what I feared.
>>>>>>
>>>>>>>> It will be much more natural for the PVH dom0 model to simply use the
>>>>>>>> native way to configure and unmask the IO-APIC pin, and that would
>>>>>>>> correctly setup the triggering/polarity and bind it to dom0 without
>>>>>>>> requiring the usage of any hypercalls.
>>>>>>> Do you still prefer that I called unmask_irq in pcistub_init_device, as this v2 patch do?
>>>>>>> But Thomas Gleixner think it is not suitable to export unmask_irq.
>>>>>>
>>>>>> Yeah, that wasn't good.
>>>>>>
>>>>>>>>
>>>>>>>> Is that an issue since in that case the gsi will get mapped and bound
>>>>>>>> to dom0?
>>>>>>> Dom0 do map_pirq is to pass the check xc_domain_irq_permission()-> pirq_access_permitted(), 
>>>>>>
>>>>>> Can we see about finding another way to fix this check?
>>>>>>
>>>>>> One option would be granting permissions over the IRQ in
>>>>>> PHYSDEVOP_setup_gsi?
>>>>>
>>>>> There's no domain available there, and imo it's also the wrong interface to
>>>>> possibly grant any permissions.
>>>>
>>>> Well, the domain is the caller.
>>>
>>> Granting permission to itself?
>>
>> See below in the previous email, the issue is not with the
>> permissions, which are correctly assigned from
>> dom0_setup_permissions(), but the usage of domain_pirq_to_irq() in
>> pirq_access_permitted() as called by XEN_DOMCTL_irq_permission.
>> There's no need to play with the permissions at all.
> Yes, the problem is pci_add_dm_done-> xc_domain_irq_permission-> XEN_DOMCTL_irq_permission-> pirq_access_permitted->domain_pirq_to_irq->return irq is 0, so it failed.
> I am think that since the PVH doesn't use pirq, can we just skip this irq_permission check for PVH?

If not the pIRQ, then the real IRQ would need checking for permissions.
You won't get away without any checking at all, I'm afraid.

Jan

^ permalink raw reply	[flat|nested] 49+ messages in thread

* RE: [RFC KERNEL PATCH v2 3/3] xen/privcmd: Add new syscall to get gsi from irq
  2023-12-01  8:14 kernel test robot
@ 2023-12-04  2:40 ` Liu, Yujie
  0 siblings, 0 replies; 49+ messages in thread
From: Liu, Yujie @ 2023-12-04  2:40 UTC (permalink / raw)
  To: lkp, oe-kbuild

not_report

duplicate to https://lore.kernel.org/all/202311250440.DbUQXNLK-lkp@intel.com/

-----Original Message-----
From: lkp <lkp@intel.com> 
Sent: Friday, December 1, 2023 16:14
To: oe-kbuild@lists.linux.dev
Cc: lkp <lkp@intel.com>
Subject: Re: [RFC KERNEL PATCH v2 3/3] xen/privcmd: Add new syscall to get gsi from irq

:::::: 
:::::: Manual check reason: "low confidence bisect report"
:::::: 

BCC: lkp@intel.com
CC: oe-kbuild-all@lists.linux.dev
In-Reply-To: <20231124103123.3263471-4-Jiqian.Chen@amd.com>
References: <20231124103123.3263471-4-Jiqian.Chen@amd.com>
TO: Jiqian Chen <Jiqian.Chen@amd.com>

Hi Jiqian,

[This is a private test report for your RFC patch.]
kernel test robot noticed the following build warnings:

[auto build test WARNING on v6.7-rc1]
[cannot apply to xen-tip/linux-next tip/irq/core tip/x86/core linus/master v6.7-rc2 next-20231201]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Jiqian-Chen/xen-pci-Add-xen_reset_device_state-function/20231124-183607
base:   v6.7-rc1
patch link:    https://lore.kernel.org/r/20231124103123.3263471-4-Jiqian.Chen%40amd.com
patch subject: [RFC KERNEL PATCH v2 3/3] xen/privcmd: Add new syscall to get gsi from irq
:::::: branch date: 7 days ago
:::::: commit date: 7 days ago
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce: (https://download.01.org/0day-ci/archive/20231201/202312011652.YwfS8NBH-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/r/202312011652.YwfS8NBH-lkp@intel.com/

includecheck warnings: (new ones prefixed by >>)
>> drivers/xen/privcmd.c: xen/events.h is included more than once.

vim +38 drivers/xen/privcmd.c

    36	
    37	#include <xen/xen.h>
  > 38	#include <xen/events.h>
    39	#include <xen/privcmd.h>
    40	#include <xen/interface/xen.h>
    41	#include <xen/interface/memory.h>
    42	#include <xen/interface/hvm/dm_op.h>
    43	#include <xen/interface/hvm/ioreq.h>
    44	#include <xen/features.h>
    45	#include <xen/page.h>
    46	#include <xen/xen-ops.h>
    47	#include <xen/balloon.h>
  > 48	#include <xen/events.h>
    49	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [RFC KERNEL PATCH v2 3/3] xen/privcmd: Add new syscall to get gsi from irq
@ 2023-12-01  8:14 kernel test robot
  2023-12-04  2:40 ` Liu, Yujie
  0 siblings, 1 reply; 49+ messages in thread
From: kernel test robot @ 2023-12-01  8:14 UTC (permalink / raw)
  To: oe-kbuild; +Cc: lkp

:::::: 
:::::: Manual check reason: "low confidence bisect report"
:::::: 

BCC: lkp@intel.com
CC: oe-kbuild-all@lists.linux.dev
In-Reply-To: <20231124103123.3263471-4-Jiqian.Chen@amd.com>
References: <20231124103123.3263471-4-Jiqian.Chen@amd.com>
TO: Jiqian Chen <Jiqian.Chen@amd.com>

Hi Jiqian,

[This is a private test report for your RFC patch.]
kernel test robot noticed the following build warnings:

[auto build test WARNING on v6.7-rc1]
[cannot apply to xen-tip/linux-next tip/irq/core tip/x86/core linus/master v6.7-rc2 next-20231201]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Jiqian-Chen/xen-pci-Add-xen_reset_device_state-function/20231124-183607
base:   v6.7-rc1
patch link:    https://lore.kernel.org/r/20231124103123.3263471-4-Jiqian.Chen%40amd.com
patch subject: [RFC KERNEL PATCH v2 3/3] xen/privcmd: Add new syscall to get gsi from irq
:::::: branch date: 7 days ago
:::::: commit date: 7 days ago
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce: (https://download.01.org/0day-ci/archive/20231201/202312011652.YwfS8NBH-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/r/202312011652.YwfS8NBH-lkp@intel.com/

includecheck warnings: (new ones prefixed by >>)
>> drivers/xen/privcmd.c: xen/events.h is included more than once.

vim +38 drivers/xen/privcmd.c

    36	
    37	#include <xen/xen.h>
  > 38	#include <xen/events.h>
    39	#include <xen/privcmd.h>
    40	#include <xen/interface/xen.h>
    41	#include <xen/interface/memory.h>
    42	#include <xen/interface/hvm/dm_op.h>
    43	#include <xen/interface/hvm/ioreq.h>
    44	#include <xen/features.h>
    45	#include <xen/page.h>
    46	#include <xen/xen-ops.h>
    47	#include <xen/balloon.h>
  > 48	#include <xen/events.h>
    49	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 49+ messages in thread

end of thread, other threads:[~2023-12-13  7:41 UTC | newest]

Thread overview: 49+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-11-24 23:38 [RFC KERNEL PATCH v2 3/3] xen/privcmd: Add new syscall to get gsi from irq kernel test robot
2023-11-26 16:06 ` kernel test robot
  -- strict thread matches above, loose matches on Subject: below --
2023-12-01  8:14 kernel test robot
2023-12-04  2:40 ` Liu, Yujie
2023-11-24 10:31 [RFC KERNEL PATCH v2 0/3] Support device passthrough when dom0 is PVH on Xen Jiqian Chen
2023-11-24 10:31 ` [RFC KERNEL PATCH v2 1/3] xen/pci: Add xen_reset_device_state function Jiqian Chen
2023-11-30  3:46   ` Stefano Stabellini
2023-11-30  7:03     ` Chen, Jiqian
2023-11-30 15:03       ` Stewart Hildebrand
2023-12-04  3:25         ` Chen, Jiqian
2023-12-04  3:45           ` Stewart Hildebrand
2023-12-04  7:58   ` Thomas Gleixner
2023-12-04  8:49     ` Chen, Jiqian
2023-12-04 21:31       ` Stefano Stabellini
2023-12-05  6:50         ` Chen, Jiqian
2023-12-05  8:10         ` Jan Beulich
2023-12-05 17:02         ` Thomas Gleixner
2023-12-06  6:37           ` Chen, Jiqian
2023-11-24 10:31 ` [RFC KERNEL PATCH v2 2/3] xen/pvh: Unmask irq for passthrough device in PVH dom0 Jiqian Chen
2023-11-25  7:24   ` kernel test robot
2023-11-25  7:42   ` kernel test robot
2023-11-30  3:53   ` Stefano Stabellini
2023-11-30 16:02     ` Roger Pau Monné
2023-12-01  3:15       ` Stefano Stabellini
2023-12-01  8:58         ` Roger Pau Monné
2023-12-02  3:37           ` Stefano Stabellini
2023-12-04 10:28             ` Roger Pau Monné
2023-12-04 22:19               ` Stefano Stabellini
2023-12-05  9:19                 ` Roger Pau Monné
2023-12-05  9:39                   ` Chen, Jiqian
2023-12-05 10:32                   ` Jan Beulich
2023-12-06  6:07                     ` Chen, Jiqian
2023-12-07  2:18                       ` Stefano Stabellini
2023-12-07  3:38                         ` Chen, Jiqian
2023-12-07  6:43                         ` Juergen Gross
2023-12-08  5:53                           ` Chen, Jiqian
2023-12-11 15:45                       ` Roger Pau Monné
2023-12-12  6:16                         ` Chen, Jiqian
2023-12-12  8:49                           ` Roger Pau Monné
2023-12-12  9:38                             ` Jan Beulich
2023-12-12 11:18                               ` Roger Pau Monné
2023-12-12 11:19                                 ` Jan Beulich
2023-12-12 11:39                                   ` Roger Pau Monné
2023-12-13  7:14                                     ` Chen, Jiqian
2023-12-13  7:41                                       ` Jan Beulich
2023-12-05  6:46               ` Chen, Jiqian
2023-12-04  8:13   ` Thomas Gleixner
2023-12-05  7:03     ` Chen, Jiqian
2023-11-24 10:31 ` [RFC KERNEL PATCH v2 3/3] xen/privcmd: Add new syscall to get gsi from irq Jiqian Chen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.