xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
* [RFC XEN PATCH v2 0/3] Support device passthrough when dom0 is PVH on Xen
@ 2023-11-24 10:41 Jiqian Chen
  2023-11-24 10:41 ` [RFC XEN PATCH v2 1/3] xen/vpci: Clear all vpci status of device Jiqian Chen
                   ` (2 more replies)
  0 siblings, 3 replies; 37+ messages in thread
From: Jiqian Chen @ 2023-11-24 10:41 UTC (permalink / raw)
  To: Jan Beulich, Roger Pau Monné,
	Wei Liu, Anthony PERARD, Juergen Gross, Stefano Stabellini,
	xen-devel
  Cc: Stewart Hildebrand, Alex Deucher, Xenia Ragiadakou,
	Stefano Stabellini, Huang Rui, Honglei Huang, Julia Zhang,
	Jiqian Chen

Hi All,

This series of patches are the v2 of the implementation of passthrough when dom0 is PVH on Xen.
We sent the v1 to upstream before, but the v1 had so many problems and we got lots of suggestions.
I will introduce all issues that these patches try to fix and the differences between v1 and v2.

v2 on kernel side:
https://lore.kernel.org/lkml/20231124103123.3263471-1-Jiqian.Chen@amd.com/T/#t

Issues we encountered:
1. pci_stub failed to write bar for a passthrough device.
Problem: when we run “sudo xl pci-assignable-add <sbdf>” to assign a device, pci_stub will
call “pcistub_init_device() -> pci_restore_state() -> pci_restore_config_space() ->
pci_restore_config_space_range() -> pci_restore_config_dword() -> pci_write_config_dword()”,
the pci config write will trigger an io interrupt to bar_write() in the xen, but the
bar->enabled was set before, the write is not allowed now, and then when Qemu config the
passthrough device in xen_pt_realize(), it gets invalid bar values.

Reason: the reason is that we don't tell vPCI that the device has been reset, so the current
cached state in pdev->vpci is all out of date and is different from the real device state.

Solution: to solve this problem, the first patch of kernel(xen/pci: Add xen_reset_device_state
function) and the fist patch of xen(xen/vpci: Clear all vpci status of device) add a new
hypercall to reset the state stored in vPCI when the state of real device has changed.
Thank Roger for the suggestion of this v2, and it is different from v1
(https://lore.kernel.org/xen-devel/20230312075455.450187-3-ray.huang@amd.com/), v1 simply allow
domU to write pci bar, it does not comply with the design principles of vPCI.

2. failed to do PHYSDEVOP_map_pirq when dom0 is PVH
Problem: HVM domU will do PHYSDEVOP_map_pirq for a passthrough device by using gsi. See
xen_pt_realize->xc_physdev_map_pirq and pci_add_dm_done->xc_physdev_map_pirq. Then
xc_physdev_map_pirq will call into Xen, but in hvm_physdev_op(), PHYSDEVOP_map_pirq is not allowed.

Reason: In hvm_physdev_op(), the variable "currd" is PVH dom0 and PVH has no X86_EMU_USE_PIRQ flag,
it will fail at has_pirq check.

Solution: I think we may need to allow PHYSDEVOP_map_pirq when "currd" is dom0 (at present dom0 is
PVH). The second patch of xen(x86/pvh: Open PHYSDEVOP_map_pirq for PVH dom0) allow PVH dom0 do
PHYSDEVOP_map_pirq. This v2 patch is better than v1, v1 simply remove the has_pirq check(xen
https://lore.kernel.org/xen-devel/20230312075455.450187-4-ray.huang@amd.com/).

3. the gsi of a passthrough device doesn't be unmasked
 3.1 failed to check the permission of pirq
 3.2 the gsi of passthrough device was not registered in PVH dom0

Problem:
3.1 callback function pci_add_dm_done() will be called when qemu config a passthrough device for domU.
This function will call xc_domain_irq_permission()-> pirq_access_permitted() to check if the gsi has
corresponding mappings in dom0. But it didn’t, so failed. See
XEN_DOMCTL_irq_permission->pirq_access_permitted, "current" is PVH dom0 and it return irq is 0.
3.2 it's possible for a gsi (iow: vIO-APIC pin) to never get registered on PVH dom0, because the
devices of PVH are using MSI(-X) interrupts. However, the IO-APIC pin must be configured for it to be
able to be mapped into a domU.

Reason: After searching codes, I find "map_pirq" and "register_gsi" will be done in function
vioapic_write_redirent->vioapic_hwdom_map_gsi when the gsi(aka ioapic's pin) is unmasked in PVH dom0.
So the two problems can be concluded to that the gsi of a passthrough device doesn't be unmasked.

Solution: to solve these problems, the second patch of kernel(xen/pvh: Unmask irq for passthrough
device in PVH dom0) call the unmask_irq() when we assign a device to be passthrough. So that
passthrough devices can have the mapping of gsi on PVH dom0 and gsi can be registered. This v2 patch
is different from the v1(
kernel https://lore.kernel.org/xen-devel/20230312120157.452859-5-ray.huang@amd.com/,
kernel https://lore.kernel.org/xen-devel/20230312120157.452859-5-ray.huang@amd.com/ and
xen https://lore.kernel.org/xen-devel/20230312075455.450187-5-ray.huang@amd.com/),
v1 performed "map_pirq" and "register_gsi" on all pci devices on PVH dom0, which is unnecessary and
may cause multiple registration.

4. failed to map pirq for gsi
Problem: qemu will call xc_physdev_map_pirq() to map a passthrough device’s gsi to pirq in function
xen_pt_realize(). But failed.

Reason: According to the implement of xc_physdev_map_pirq(), it needs gsi instead of irq, but qemu
pass irq to it and treat irq as gsi, it is got from file /sys/bus/pci/devices/xxxx:xx:xx.x/irq in
function xen_host_pci_device_get(). But actually the gsi number is not equal with irq. On PVH dom0,
when it allocates irq for a gsi in function acpi_register_gsi_ioapic(), allocation is dynamic, and
follow the principle of applying first, distributing first. And if you debug the kernel codes(see
function __irq_alloc_descs), you will find the irq number is allocated from small to large by order,
but the applying gsi number is not, gsi 38 may come before gsi 28, that causes gsi 38 get a smaller
irq number than gsi 28, and then gsi != irq.

Solution: we can record the relation between gsi and irq, then when userspace(qemu) want to use gsi,
we can do a translation. The third patch of kernel(xen/privcmd: Add new syscall to get gsi from irq)
records all the relations in acpi_register_gsi_xen_pvh() when dom0 initialize pci devices, and provide
a syscall for userspace to get the gsi from irq. The third patch of xen(tools: Add new function to get
gsi from irq) add a new function xc_physdev_gsi_from_irq() to call the new syscall added on kernel side.
And then userspace can use that function to get gsi. Then xc_physdev_map_pirq() will success. This v2
patch is the same as v1(
kernel https://lore.kernel.org/xen-devel/20230312120157.452859-6-ray.huang@amd.com/ and
xen https://lore.kernel.org/xen-devel/20230312075455.450187-6-ray.huang@amd.com/)

About the v2 patch of qemu, just change an included head file, other are similar to the v1 (
qemu https://lore.kernel.org/xen-devel/20230312092244.451465-19-ray.huang@amd.com/), just call
xc_physdev_gsi_from_irq() to get gsi from irq.


Jiqian Chen (3):
  xen/vpci: Clear all vpci status of device
  x86/pvh: Open PHYSDEVOP_map_pirq for PVH dom0
  tools: Add new function to get gsi from irq

 tools/include/xen-sys/Linux/privcmd.h |  7 +++++++
 tools/include/xencall.h               |  2 ++
 tools/include/xenctrl.h               |  2 ++
 tools/libs/call/core.c                |  5 +++++
 tools/libs/call/libxencall.map        |  2 ++
 tools/libs/call/linux.c               | 14 ++++++++++++++
 tools/libs/call/private.h             |  9 +++++++++
 tools/libs/ctrl/xc_physdev.c          |  4 ++++
 tools/libs/light/libxl_pci.c          |  1 +
 xen/arch/x86/hvm/hypercall.c          |  3 +++
 xen/drivers/passthrough/pci.c         | 21 +++++++++++++++++++++
 xen/drivers/pci/physdev.c             | 14 ++++++++++++++
 xen/drivers/vpci/vpci.c               |  9 +++++++++
 xen/include/public/physdev.h          |  2 ++
 xen/include/xen/pci.h                 |  1 +
 xen/include/xen/vpci.h                |  6 ++++++
 16 files changed, 102 insertions(+)

-- 
2.34.1



^ permalink raw reply	[flat|nested] 37+ messages in thread

* [RFC XEN PATCH v2 1/3] xen/vpci: Clear all vpci status of device
  2023-11-24 10:41 [RFC XEN PATCH v2 0/3] Support device passthrough when dom0 is PVH on Xen Jiqian Chen
@ 2023-11-24 10:41 ` Jiqian Chen
  2023-11-28 14:08   ` Roger Pau Monné
  2023-11-24 10:41 ` [RFC XEN PATCH v2 2/3] x86/pvh: Open PHYSDEVOP_map_pirq for PVH dom0 Jiqian Chen
  2023-11-24 10:41 ` [RFC XEN PATCH v2 3/3] tools: Add new function to get gsi from irq Jiqian Chen
  2 siblings, 1 reply; 37+ messages in thread
From: Jiqian Chen @ 2023-11-24 10:41 UTC (permalink / raw)
  To: Jan Beulich, Roger Pau Monné,
	Wei Liu, Anthony PERARD, Juergen Gross, Stefano Stabellini,
	xen-devel
  Cc: Stewart Hildebrand, Alex Deucher, Xenia Ragiadakou,
	Stefano Stabellini, Huang Rui, Honglei Huang, Julia Zhang,
	Jiqian Chen, Huang Rui

When a device has been reset on dom0 side, the vpci on Xen
side won't get notification, so the cached state in vpci is
all out of date compare with the real device state.
To solve that problem, this patch add new hypercall to clear
all vpci device state. And when reset device happens on dom0
side, dom0 can call this hypercall to notify vpci.

Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
---
 xen/arch/x86/hvm/hypercall.c  |  1 +
 xen/drivers/passthrough/pci.c | 21 +++++++++++++++++++++
 xen/drivers/pci/physdev.c     | 14 ++++++++++++++
 xen/drivers/vpci/vpci.c       |  9 +++++++++
 xen/include/public/physdev.h  |  2 ++
 xen/include/xen/pci.h         |  1 +
 xen/include/xen/vpci.h        |  6 ++++++
 7 files changed, 54 insertions(+)

diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c
index eeb73e1aa5..6ad5b4d5f1 100644
--- a/xen/arch/x86/hvm/hypercall.c
+++ b/xen/arch/x86/hvm/hypercall.c
@@ -84,6 +84,7 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
     case PHYSDEVOP_pci_mmcfg_reserved:
     case PHYSDEVOP_pci_device_add:
     case PHYSDEVOP_pci_device_remove:
+    case PHYSDEVOP_pci_device_state_reset:
     case PHYSDEVOP_dbgp_op:
         if ( !is_hardware_domain(currd) )
             return -ENOSYS;
diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index 04d00c7c37..f871715585 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -824,6 +824,27 @@ int pci_remove_device(u16 seg, u8 bus, u8 devfn)
     return ret;
 }
 
+int pci_reset_device_state(u16 seg, u8 bus, u8 devfn)
+{
+    struct pci_dev *pdev;
+    int ret = -ENODEV;
+
+    pcidevs_lock();
+
+    pdev = pci_get_pdev(NULL, PCI_SBDF(seg, bus, devfn));
+    if ( !pdev )
+        goto error;
+
+    ret = vpci_reset_device_state(pdev);
+    if (ret)
+        printk(XENLOG_ERR "PCI reset device %pp state failed\n", &pdev->sbdf);
+
+error:
+    pcidevs_unlock();
+
+    return ret;
+}
+
 /* Caller should hold the pcidevs_lock */
 static int deassign_device(struct domain *d, uint16_t seg, uint8_t bus,
                            uint8_t devfn)
diff --git a/xen/drivers/pci/physdev.c b/xen/drivers/pci/physdev.c
index 42db3e6d13..cfdb545654 100644
--- a/xen/drivers/pci/physdev.c
+++ b/xen/drivers/pci/physdev.c
@@ -67,6 +67,20 @@ ret_t pci_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
         break;
     }
 
+    case PHYSDEVOP_pci_device_state_reset: {
+        struct physdev_pci_device dev;
+
+        if ( !is_pci_passthrough_enabled() )
+            return -EOPNOTSUPP;
+
+        ret = -EFAULT;
+        if ( copy_from_guest(&dev, arg, 1) != 0 )
+            break;
+
+        ret = pci_reset_device_state(dev.seg, dev.bus, dev.devfn);
+        break;
+    }
+
     default:
         ret = -ENOSYS;
         break;
diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
index 3bec9a4153..e9c3eb1cd6 100644
--- a/xen/drivers/vpci/vpci.c
+++ b/xen/drivers/vpci/vpci.c
@@ -103,6 +103,15 @@ int vpci_add_handlers(struct pci_dev *pdev)
 
     return rc;
 }
+
+int vpci_reset_device_state(struct pci_dev *pdev)
+{
+    ASSERT(pcidevs_locked());
+
+    vpci_remove_device(pdev);
+    return vpci_add_handlers(pdev);
+}
+
 #endif /* __XEN__ */
 
 static int vpci_register_cmp(const struct vpci_register *r1,
diff --git a/xen/include/public/physdev.h b/xen/include/public/physdev.h
index f0c0d4727c..4156948903 100644
--- a/xen/include/public/physdev.h
+++ b/xen/include/public/physdev.h
@@ -305,6 +305,8 @@ struct physdev_pci_device {
 typedef struct physdev_pci_device physdev_pci_device_t;
 DEFINE_XEN_GUEST_HANDLE(physdev_pci_device_t);
 
+#define PHYSDEVOP_pci_device_state_reset      32
+
 #define PHYSDEVOP_DBGP_RESET_PREPARE    1
 #define PHYSDEVOP_DBGP_RESET_DONE       2
 
diff --git a/xen/include/xen/pci.h b/xen/include/xen/pci.h
index 50d7dfb2a2..a0c28c5e6c 100644
--- a/xen/include/xen/pci.h
+++ b/xen/include/xen/pci.h
@@ -186,6 +186,7 @@ const unsigned long *pci_get_ro_map(u16 seg);
 int pci_add_device(u16 seg, u8 bus, u8 devfn,
                    const struct pci_dev_info *info, nodeid_t node);
 int pci_remove_device(u16 seg, u8 bus, u8 devfn);
+int pci_reset_device_state(u16 seg, u8 bus, u8 devfn);
 int pci_ro_device(int seg, int bus, int devfn);
 int pci_hide_device(unsigned int seg, unsigned int bus, unsigned int devfn);
 struct pci_dev *pci_get_pdev(const struct domain *d, pci_sbdf_t sbdf);
diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
index d743d96a10..e2d6cd5fa3 100644
--- a/xen/include/xen/vpci.h
+++ b/xen/include/xen/vpci.h
@@ -30,6 +30,7 @@ int __must_check vpci_add_handlers(struct pci_dev *pdev);
 
 /* Remove all handlers and free vpci related structures. */
 void vpci_remove_device(struct pci_dev *pdev);
+int vpci_reset_device_state(struct pci_dev *pdev);
 
 /* Add/remove a register handler. */
 int __must_check vpci_add_register(struct vpci *vpci,
@@ -242,6 +243,11 @@ static inline int vpci_add_handlers(struct pci_dev *pdev)
 
 static inline void vpci_remove_device(struct pci_dev *pdev) { }
 
+static inline int vpci_reset_device_state(struct pci_dev *pdev)
+{
+    return 0;
+}
+
 static inline void vpci_dump_msi(void) { }
 
 static inline uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg,
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [RFC XEN PATCH v2 2/3] x86/pvh: Open PHYSDEVOP_map_pirq for PVH dom0
  2023-11-24 10:41 [RFC XEN PATCH v2 0/3] Support device passthrough when dom0 is PVH on Xen Jiqian Chen
  2023-11-24 10:41 ` [RFC XEN PATCH v2 1/3] xen/vpci: Clear all vpci status of device Jiqian Chen
@ 2023-11-24 10:41 ` Jiqian Chen
  2023-11-28 14:17   ` Roger Pau Monné
  2023-11-28 15:14   ` Jan Beulich
  2023-11-24 10:41 ` [RFC XEN PATCH v2 3/3] tools: Add new function to get gsi from irq Jiqian Chen
  2 siblings, 2 replies; 37+ messages in thread
From: Jiqian Chen @ 2023-11-24 10:41 UTC (permalink / raw)
  To: Jan Beulich, Roger Pau Monné,
	Wei Liu, Anthony PERARD, Juergen Gross, Stefano Stabellini,
	xen-devel
  Cc: Stewart Hildebrand, Alex Deucher, Xenia Ragiadakou,
	Stefano Stabellini, Huang Rui, Honglei Huang, Julia Zhang,
	Jiqian Chen, Huang Rui

If we run Xen with PVH dom0 and hvm domU, hvm will map a pirq for
a passthrough device by using gsi, see xen_pt_realize->xc_physdev_map_pirq
and pci_add_dm_done->xc_physdev_map_pirq. Then xc_physdev_map_pirq will
call into Xen, but in hvm_physdev_op, PHYSDEVOP_map_pirq is not allowed
because currd is PVH dom0 and PVH has no X86_EMU_USE_PIRQ flag, it will
fail at has_pirq check.

So, I think we may need to allow PHYSDEVOP_map_pirq when currd is dom0 (at
present dom0 is PVH).

Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
---
 xen/arch/x86/hvm/hypercall.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c
index 6ad5b4d5f1..f9c4a2243a 100644
--- a/xen/arch/x86/hvm/hypercall.c
+++ b/xen/arch/x86/hvm/hypercall.c
@@ -74,6 +74,8 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
     {
     case PHYSDEVOP_map_pirq:
     case PHYSDEVOP_unmap_pirq:
+        if (is_hardware_domain(currd))
+            break;
     case PHYSDEVOP_eoi:
     case PHYSDEVOP_irq_status_query:
     case PHYSDEVOP_get_free_pirq:
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [RFC XEN PATCH v2 3/3] tools: Add new function to get gsi from irq
  2023-11-24 10:41 [RFC XEN PATCH v2 0/3] Support device passthrough when dom0 is PVH on Xen Jiqian Chen
  2023-11-24 10:41 ` [RFC XEN PATCH v2 1/3] xen/vpci: Clear all vpci status of device Jiqian Chen
  2023-11-24 10:41 ` [RFC XEN PATCH v2 2/3] x86/pvh: Open PHYSDEVOP_map_pirq for PVH dom0 Jiqian Chen
@ 2023-11-24 10:41 ` Jiqian Chen
  2023-11-28 14:25   ` Roger Pau Monné
  2 siblings, 1 reply; 37+ messages in thread
From: Jiqian Chen @ 2023-11-24 10:41 UTC (permalink / raw)
  To: Jan Beulich, Roger Pau Monné,
	Wei Liu, Anthony PERARD, Juergen Gross, Stefano Stabellini,
	xen-devel
  Cc: Stewart Hildebrand, Alex Deucher, Xenia Ragiadakou,
	Stefano Stabellini, Huang Rui, Honglei Huang, Julia Zhang,
	Jiqian Chen, Huang Rui

In PVH dom0, it uses the linux local interrupt mechanism,
when it allocs irq for a gsi, it is dynamic, and follow
the principle of applying first, distributing first. And
if you debug the kernel codes, you will find the irq
number is alloced from small to large, but the applying
gsi number is not, may gsi 38 comes before gsi 28, that
causes the irq number is not equal with the gsi number.
And when we passthrough a device, QEMU will use its gsi
number to do mapping actions, see xen_pt_realize->
xc_physdev_map_pirq, but the gsi number is got from file
/sys/bus/pci/devices/xxxx:xx:xx.x/irq in current code,
so it will fail when mapping.
And in current codes, there is no method to translate
irq to gsi for userspace.

For above purpose, this patch adds new function to get
that translation. And then we can use the correct gsi
number to do map_pirq.

And call this function before xc_physdev_map_pirq

Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
---
 tools/include/xen-sys/Linux/privcmd.h |  7 +++++++
 tools/include/xencall.h               |  2 ++
 tools/include/xenctrl.h               |  2 ++
 tools/libs/call/core.c                |  5 +++++
 tools/libs/call/libxencall.map        |  2 ++
 tools/libs/call/linux.c               | 14 ++++++++++++++
 tools/libs/call/private.h             |  9 +++++++++
 tools/libs/ctrl/xc_physdev.c          |  4 ++++
 tools/libs/light/libxl_pci.c          |  1 +
 9 files changed, 46 insertions(+)

diff --git a/tools/include/xen-sys/Linux/privcmd.h b/tools/include/xen-sys/Linux/privcmd.h
index bc60e8fd55..ba4b8c3054 100644
--- a/tools/include/xen-sys/Linux/privcmd.h
+++ b/tools/include/xen-sys/Linux/privcmd.h
@@ -95,6 +95,11 @@ typedef struct privcmd_mmap_resource {
 	__u64 addr;
 } privcmd_mmap_resource_t;
 
+typedef struct privcmd_gsi_from_irq {
+	__u32 irq;
+	__u32 gsi;
+} privcmd_gsi_from_irq_t;
+
 /*
  * @cmd: IOCTL_PRIVCMD_HYPERCALL
  * @arg: &privcmd_hypercall_t
@@ -114,6 +119,8 @@ typedef struct privcmd_mmap_resource {
 	_IOC(_IOC_NONE, 'P', 6, sizeof(domid_t))
 #define IOCTL_PRIVCMD_MMAP_RESOURCE				\
 	_IOC(_IOC_NONE, 'P', 7, sizeof(privcmd_mmap_resource_t))
+#define IOCTL_PRIVCMD_GSI_FROM_IRQ				\
+	_IOC(_IOC_NONE, 'P', 10, sizeof(privcmd_gsi_from_irq_t))
 #define IOCTL_PRIVCMD_UNIMPLEMENTED				\
 	_IOC(_IOC_NONE, 'P', 0xFF, 0)
 
diff --git a/tools/include/xencall.h b/tools/include/xencall.h
index fc95ed0fe5..962cb45e1f 100644
--- a/tools/include/xencall.h
+++ b/tools/include/xencall.h
@@ -113,6 +113,8 @@ int xencall5(xencall_handle *xcall, unsigned int op,
              uint64_t arg1, uint64_t arg2, uint64_t arg3,
              uint64_t arg4, uint64_t arg5);
 
+int xen_oscall_gsi_from_irq(xencall_handle *xcall, int irq);
+
 /* Variant(s) of the above, as needed, returning "long" instead of "int". */
 long xencall2L(xencall_handle *xcall, unsigned int op,
                uint64_t arg1, uint64_t arg2);
diff --git a/tools/include/xenctrl.h b/tools/include/xenctrl.h
index 2ef8b4e054..2b9d55d2c6 100644
--- a/tools/include/xenctrl.h
+++ b/tools/include/xenctrl.h
@@ -1641,6 +1641,8 @@ int xc_physdev_unmap_pirq(xc_interface *xch,
                           uint32_t domid,
                           int pirq);
 
+int xc_physdev_gsi_from_irq(xc_interface *xch, int irq);
+
 /*
  *  LOGGING AND ERROR REPORTING
  */
diff --git a/tools/libs/call/core.c b/tools/libs/call/core.c
index 02c4f8e1ae..6f79f3babd 100644
--- a/tools/libs/call/core.c
+++ b/tools/libs/call/core.c
@@ -173,6 +173,11 @@ int xencall5(xencall_handle *xcall, unsigned int op,
     return osdep_hypercall(xcall, &call);
 }
 
+int xen_oscall_gsi_from_irq(xencall_handle *xcall, int irq)
+{
+    return osdep_oscall(xcall, irq);
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libs/call/libxencall.map b/tools/libs/call/libxencall.map
index d18a3174e9..6cde8eda05 100644
--- a/tools/libs/call/libxencall.map
+++ b/tools/libs/call/libxencall.map
@@ -10,6 +10,8 @@ VERS_1.0 {
 		xencall4;
 		xencall5;
 
+		xen_oscall_gsi_from_irq;
+
 		xencall_alloc_buffer;
 		xencall_free_buffer;
 		xencall_alloc_buffer_pages;
diff --git a/tools/libs/call/linux.c b/tools/libs/call/linux.c
index 6d588e6bea..5267bceabf 100644
--- a/tools/libs/call/linux.c
+++ b/tools/libs/call/linux.c
@@ -85,6 +85,20 @@ long osdep_hypercall(xencall_handle *xcall, privcmd_hypercall_t *hypercall)
     return ioctl(xcall->fd, IOCTL_PRIVCMD_HYPERCALL, hypercall);
 }
 
+long osdep_oscall(xencall_handle *xcall, int irq)
+{
+    privcmd_gsi_from_irq_t gsi_irq = {
+        .irq = irq,
+        .gsi = -1,
+    };
+
+    if (ioctl(xcall->fd, IOCTL_PRIVCMD_GSI_FROM_IRQ, &gsi_irq)) {
+        return gsi_irq.irq;
+    }
+
+    return gsi_irq.gsi;
+}
+
 static void *alloc_pages_bufdev(xencall_handle *xcall, size_t npages)
 {
     void *p;
diff --git a/tools/libs/call/private.h b/tools/libs/call/private.h
index 9c3aa432ef..01a1f5076a 100644
--- a/tools/libs/call/private.h
+++ b/tools/libs/call/private.h
@@ -57,6 +57,15 @@ int osdep_xencall_close(xencall_handle *xcall);
 
 long osdep_hypercall(xencall_handle *xcall, privcmd_hypercall_t *hypercall);
 
+#if defined(__linux__)
+long osdep_oscall(xencall_handle *xcall, int irq);
+#else
+static inline long osdep_oscall(xencall_handle *xcall, int irq)
+{
+    return irq;
+}
+#endif
+
 void *osdep_alloc_pages(xencall_handle *xcall, size_t nr_pages);
 void osdep_free_pages(xencall_handle *xcall, void *p, size_t nr_pages);
 
diff --git a/tools/libs/ctrl/xc_physdev.c b/tools/libs/ctrl/xc_physdev.c
index 460a8e779c..4d3b138ebd 100644
--- a/tools/libs/ctrl/xc_physdev.c
+++ b/tools/libs/ctrl/xc_physdev.c
@@ -111,3 +111,7 @@ int xc_physdev_unmap_pirq(xc_interface *xch,
     return rc;
 }
 
+int xc_physdev_gsi_from_irq(xc_interface *xch, int irq)
+{
+    return xen_oscall_gsi_from_irq(xch->xcall, irq);
+}
diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
index 96cb4da079..ba8803dab4 100644
--- a/tools/libs/light/libxl_pci.c
+++ b/tools/libs/light/libxl_pci.c
@@ -1486,6 +1486,7 @@ static void pci_add_dm_done(libxl__egc *egc,
         goto out_no_irq;
     }
     if ((fscanf(f, "%u", &irq) == 1) && irq) {
+        irq = xc_physdev_gsi_from_irq(ctx->xch, irq);
         r = xc_physdev_map_pirq(ctx->xch, domid, irq, &irq);
         if (r < 0) {
             LOGED(ERROR, domainid, "xc_physdev_map_pirq irq=%d (error=%d)",
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [RFC XEN PATCH v2 1/3] xen/vpci: Clear all vpci status of device
  2023-11-24 10:41 ` [RFC XEN PATCH v2 1/3] xen/vpci: Clear all vpci status of device Jiqian Chen
@ 2023-11-28 14:08   ` Roger Pau Monné
  2023-11-30  6:22     ` Chen, Jiqian
  0 siblings, 1 reply; 37+ messages in thread
From: Roger Pau Monné @ 2023-11-28 14:08 UTC (permalink / raw)
  To: Jiqian Chen
  Cc: Jan Beulich, Wei Liu, Anthony PERARD, Juergen Gross,
	Stefano Stabellini, xen-devel, Stewart Hildebrand, Alex Deucher,
	Xenia Ragiadakou, Stefano Stabellini, Huang Rui, Honglei Huang,
	Julia Zhang, Daniel P. Smith

On Fri, Nov 24, 2023 at 06:41:34PM +0800, Jiqian Chen wrote:
> When a device has been reset on dom0 side, the vpci on Xen
> side won't get notification, so the cached state in vpci is
> all out of date compare with the real device state.
> To solve that problem, this patch add new hypercall to clear
> all vpci device state. And when reset device happens on dom0
> side, dom0 can call this hypercall to notify vpci.
> 
> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> Signed-off-by: Huang Rui <ray.huang@amd.com>
> ---
>  xen/arch/x86/hvm/hypercall.c  |  1 +
>  xen/drivers/passthrough/pci.c | 21 +++++++++++++++++++++
>  xen/drivers/pci/physdev.c     | 14 ++++++++++++++
>  xen/drivers/vpci/vpci.c       |  9 +++++++++
>  xen/include/public/physdev.h  |  2 ++
>  xen/include/xen/pci.h         |  1 +
>  xen/include/xen/vpci.h        |  6 ++++++
>  7 files changed, 54 insertions(+)
> 
> diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c
> index eeb73e1aa5..6ad5b4d5f1 100644
> --- a/xen/arch/x86/hvm/hypercall.c
> +++ b/xen/arch/x86/hvm/hypercall.c
> @@ -84,6 +84,7 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>      case PHYSDEVOP_pci_mmcfg_reserved:
>      case PHYSDEVOP_pci_device_add:
>      case PHYSDEVOP_pci_device_remove:
> +    case PHYSDEVOP_pci_device_state_reset:
>      case PHYSDEVOP_dbgp_op:
>          if ( !is_hardware_domain(currd) )
>              return -ENOSYS;
> diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
> index 04d00c7c37..f871715585 100644
> --- a/xen/drivers/passthrough/pci.c
> +++ b/xen/drivers/passthrough/pci.c
> @@ -824,6 +824,27 @@ int pci_remove_device(u16 seg, u8 bus, u8 devfn)
>      return ret;
>  }
>  
> +int pci_reset_device_state(u16 seg, u8 bus, u8 devfn)

You could use pci_sbdf_t here instead of 3 parameters.

I'm however unsure whether we really need this helper just to fetch
the pdev and call vpci_reset_device_state(), I think you could place
this logic directly in pci_physdev_op().  Unless there are plans to
use such logic outside of pci_physdev_op().

> +{
> +    struct pci_dev *pdev;
> +    int ret = -ENODEV;

Some XSM check should be introduced fro this operation, as none of the
existing ones is suitable.  See xsm_resource_unplug_pci() for example.

xsm_resource_reset_state_pci() or some such I would assume.

(adding the XSM maintainer for feedback).

> +
> +    pcidevs_lock();
> +
> +    pdev = pci_get_pdev(NULL, PCI_SBDF(seg, bus, devfn));
> +    if ( !pdev )
> +        goto error;
> +
> +    ret = vpci_reset_device_state(pdev);
> +    if (ret)
> +        printk(XENLOG_ERR "PCI reset device %pp state failed\n", &pdev->sbdf);
> +
> +error:
> +    pcidevs_unlock();
> +
> +    return ret;
> +}
> +
>  /* Caller should hold the pcidevs_lock */
>  static int deassign_device(struct domain *d, uint16_t seg, uint8_t bus,
>                             uint8_t devfn)
> diff --git a/xen/drivers/pci/physdev.c b/xen/drivers/pci/physdev.c
> index 42db3e6d13..cfdb545654 100644
> --- a/xen/drivers/pci/physdev.c
> +++ b/xen/drivers/pci/physdev.c
> @@ -67,6 +67,20 @@ ret_t pci_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>          break;
>      }
>  
> +    case PHYSDEVOP_pci_device_state_reset: {
> +        struct physdev_pci_device dev;
> +
> +        if ( !is_pci_passthrough_enabled() )
> +            return -EOPNOTSUPP;
> +
> +        ret = -EFAULT;
> +        if ( copy_from_guest(&dev, arg, 1) != 0 )
> +            break;
> +
> +        ret = pci_reset_device_state(dev.seg, dev.bus, dev.devfn);
> +        break;
> +    }
> +
>      default:
>          ret = -ENOSYS;
>          break;
> diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
> index 3bec9a4153..e9c3eb1cd6 100644
> --- a/xen/drivers/vpci/vpci.c
> +++ b/xen/drivers/vpci/vpci.c
> @@ -103,6 +103,15 @@ int vpci_add_handlers(struct pci_dev *pdev)
>  
>      return rc;
>  }
> +
> +int vpci_reset_device_state(struct pci_dev *pdev)
> +{
> +    ASSERT(pcidevs_locked());
> +
> +    vpci_remove_device(pdev);
> +    return vpci_add_handlers(pdev);
> +}
> +
>  #endif /* __XEN__ */
>  
>  static int vpci_register_cmp(const struct vpci_register *r1,
> diff --git a/xen/include/public/physdev.h b/xen/include/public/physdev.h
> index f0c0d4727c..4156948903 100644
> --- a/xen/include/public/physdev.h
> +++ b/xen/include/public/physdev.h
> @@ -305,6 +305,8 @@ struct physdev_pci_device {
>  typedef struct physdev_pci_device physdev_pci_device_t;
>  DEFINE_XEN_GUEST_HANDLE(physdev_pci_device_t);
>  
> +#define PHYSDEVOP_pci_device_state_reset      32

This needs some comment in order to explain the expected behaviour,
and might be better placed a bit up after PHYSDEVOP_release_msix.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC XEN PATCH v2 2/3] x86/pvh: Open PHYSDEVOP_map_pirq for PVH dom0
  2023-11-24 10:41 ` [RFC XEN PATCH v2 2/3] x86/pvh: Open PHYSDEVOP_map_pirq for PVH dom0 Jiqian Chen
@ 2023-11-28 14:17   ` Roger Pau Monné
  2023-11-30  6:32     ` Chen, Jiqian
  2023-11-28 15:14   ` Jan Beulich
  1 sibling, 1 reply; 37+ messages in thread
From: Roger Pau Monné @ 2023-11-28 14:17 UTC (permalink / raw)
  To: Jiqian Chen
  Cc: Jan Beulich, Wei Liu, Anthony PERARD, Juergen Gross,
	Stefano Stabellini, xen-devel, Stewart Hildebrand, Alex Deucher,
	Xenia Ragiadakou, Stefano Stabellini, Huang Rui, Honglei Huang,
	Julia Zhang

On Fri, Nov 24, 2023 at 06:41:35PM +0800, Jiqian Chen wrote:
> If we run Xen with PVH dom0 and hvm domU, hvm will map a pirq for
> a passthrough device by using gsi, see xen_pt_realize->xc_physdev_map_pirq
> and pci_add_dm_done->xc_physdev_map_pirq. Then xc_physdev_map_pirq will
> call into Xen, but in hvm_physdev_op, PHYSDEVOP_map_pirq is not allowed
> because currd is PVH dom0 and PVH has no X86_EMU_USE_PIRQ flag, it will
> fail at has_pirq check.
> 
> So, I think we may need to allow PHYSDEVOP_map_pirq when currd is dom0 (at

And PHYSDEVOP_unmap_pirq also?

> present dom0 is PVH).

IMO it would be better to implement a new set of DMOP hypercalls that
handle the setup of interrupts from physical devices that are assigned
to a guest.  That should allow us to get rid of the awkward PHYSDEVOP
+ XEN_DOMCTL_{,un}bind_pt_irq hypercall interface, which currently
prevents QEMU from being hypervisor version agnostic (since the domctl
interface is not stable).

I understand this might be too much to ask for, but something to take
into account.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC XEN PATCH v2 3/3] tools: Add new function to get gsi from irq
  2023-11-24 10:41 ` [RFC XEN PATCH v2 3/3] tools: Add new function to get gsi from irq Jiqian Chen
@ 2023-11-28 14:25   ` Roger Pau Monné
  2023-11-28 14:42     ` Juergen Gross
                       ` (2 more replies)
  0 siblings, 3 replies; 37+ messages in thread
From: Roger Pau Monné @ 2023-11-28 14:25 UTC (permalink / raw)
  To: Jiqian Chen
  Cc: Jan Beulich, Wei Liu, Anthony PERARD, Juergen Gross,
	Stefano Stabellini, xen-devel, Stewart Hildebrand, Alex Deucher,
	Xenia Ragiadakou, Stefano Stabellini, Huang Rui, Honglei Huang,
	Julia Zhang

On Fri, Nov 24, 2023 at 06:41:36PM +0800, Jiqian Chen wrote:
> In PVH dom0, it uses the linux local interrupt mechanism,
> when it allocs irq for a gsi, it is dynamic, and follow
> the principle of applying first, distributing first. And
> if you debug the kernel codes, you will find the irq
> number is alloced from small to large, but the applying
> gsi number is not, may gsi 38 comes before gsi 28, that
> causes the irq number is not equal with the gsi number.
> And when we passthrough a device, QEMU will use its gsi
> number to do mapping actions, see xen_pt_realize->
> xc_physdev_map_pirq, but the gsi number is got from file
> /sys/bus/pci/devices/xxxx:xx:xx.x/irq in current code,
> so it will fail when mapping.
> And in current codes, there is no method to translate
> irq to gsi for userspace.

I think it would be cleaner to just introduce a new sysfs node that
contains the gsi if a device is using one (much like the irq sysfs
node)?

Such ioctl to translate from IRQ to GSI has nothing to do with Xen, so
placing it in privcmd does seem quite strange to me.  I understand
that for passthrough we need the GSI, but such translation layer from
IRQ to GSI is all Linux internal, and it would be much simpler to just
expose the GSI in sysfs IMO.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC XEN PATCH v2 3/3] tools: Add new function to get gsi from irq
  2023-11-28 14:25   ` Roger Pau Monné
@ 2023-11-28 14:42     ` Juergen Gross
  2023-11-28 16:11       ` Roger Pau Monné
  2023-11-30  3:38     ` Stefano Stabellini
  2023-11-30  6:53     ` Chen, Jiqian
  2 siblings, 1 reply; 37+ messages in thread
From: Juergen Gross @ 2023-11-28 14:42 UTC (permalink / raw)
  To: Roger Pau Monné, Jiqian Chen
  Cc: Jan Beulich, Wei Liu, Anthony PERARD, Stefano Stabellini,
	xen-devel, Stewart Hildebrand, Alex Deucher, Xenia Ragiadakou,
	Stefano Stabellini, Huang Rui, Honglei Huang, Julia Zhang


[-- Attachment #1.1.1: Type: text/plain, Size: 1543 bytes --]

On 28.11.23 15:25, Roger Pau Monné wrote:
> On Fri, Nov 24, 2023 at 06:41:36PM +0800, Jiqian Chen wrote:
>> In PVH dom0, it uses the linux local interrupt mechanism,
>> when it allocs irq for a gsi, it is dynamic, and follow
>> the principle of applying first, distributing first. And
>> if you debug the kernel codes, you will find the irq
>> number is alloced from small to large, but the applying
>> gsi number is not, may gsi 38 comes before gsi 28, that
>> causes the irq number is not equal with the gsi number.
>> And when we passthrough a device, QEMU will use its gsi
>> number to do mapping actions, see xen_pt_realize->
>> xc_physdev_map_pirq, but the gsi number is got from file
>> /sys/bus/pci/devices/xxxx:xx:xx.x/irq in current code,
>> so it will fail when mapping.
>> And in current codes, there is no method to translate
>> irq to gsi for userspace.
> 
> I think it would be cleaner to just introduce a new sysfs node that
> contains the gsi if a device is using one (much like the irq sysfs
> node)?
> 
> Such ioctl to translate from IRQ to GSI has nothing to do with Xen, so
> placing it in privcmd does seem quite strange to me.  I understand
> that for passthrough we need the GSI, but such translation layer from
> IRQ to GSI is all Linux internal, and it would be much simpler to just
> expose the GSI in sysfs IMO.

You are aware that we have a Xen specific variant of acpi_register_gsi()?

It is the Xen event channel driver being responsible for the GSI<->IRQ
mapping.


Juergen


[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 3743 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC XEN PATCH v2 2/3] x86/pvh: Open PHYSDEVOP_map_pirq for PVH dom0
  2023-11-24 10:41 ` [RFC XEN PATCH v2 2/3] x86/pvh: Open PHYSDEVOP_map_pirq for PVH dom0 Jiqian Chen
  2023-11-28 14:17   ` Roger Pau Monné
@ 2023-11-28 15:14   ` Jan Beulich
  2023-11-30  6:44     ` Chen, Jiqian
  1 sibling, 1 reply; 37+ messages in thread
From: Jan Beulich @ 2023-11-28 15:14 UTC (permalink / raw)
  To: Jiqian Chen
  Cc: Stewart Hildebrand, Alex Deucher, Xenia Ragiadakou,
	Stefano Stabellini, Huang Rui, Honglei Huang, Julia Zhang,
	Roger Pau Monné,
	Wei Liu, Anthony PERARD, Juergen Gross, Stefano Stabellini,
	xen-devel

On 24.11.2023 11:41, Jiqian Chen wrote:
> --- a/xen/arch/x86/hvm/hypercall.c
> +++ b/xen/arch/x86/hvm/hypercall.c
> @@ -74,6 +74,8 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>      {
>      case PHYSDEVOP_map_pirq:
>      case PHYSDEVOP_unmap_pirq:
> +        if (is_hardware_domain(currd))
> +            break;
>      case PHYSDEVOP_eoi:
>      case PHYSDEVOP_irq_status_query:
>      case PHYSDEVOP_get_free_pirq:

If you wouldn't go the route suggested by Roger, I think you will need
to deny self-mapping requests here.

Also note that both here and in patch 1 you will want to adjust a number
of style violations.

Jan


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC XEN PATCH v2 3/3] tools: Add new function to get gsi from irq
  2023-11-28 14:42     ` Juergen Gross
@ 2023-11-28 16:11       ` Roger Pau Monné
  2023-11-28 16:22         ` Juergen Gross
  2023-11-30  6:57         ` Chen, Jiqian
  0 siblings, 2 replies; 37+ messages in thread
From: Roger Pau Monné @ 2023-11-28 16:11 UTC (permalink / raw)
  To: Juergen Gross
  Cc: Jiqian Chen, Jan Beulich, Wei Liu, Anthony PERARD,
	Stefano Stabellini, xen-devel, Stewart Hildebrand, Alex Deucher,
	Xenia Ragiadakou, Stefano Stabellini, Huang Rui, Honglei Huang,
	Julia Zhang

On Tue, Nov 28, 2023 at 03:42:31PM +0100, Juergen Gross wrote:
> On 28.11.23 15:25, Roger Pau Monné wrote:
> > On Fri, Nov 24, 2023 at 06:41:36PM +0800, Jiqian Chen wrote:
> > > In PVH dom0, it uses the linux local interrupt mechanism,
> > > when it allocs irq for a gsi, it is dynamic, and follow
> > > the principle of applying first, distributing first. And
> > > if you debug the kernel codes, you will find the irq
> > > number is alloced from small to large, but the applying
> > > gsi number is not, may gsi 38 comes before gsi 28, that
> > > causes the irq number is not equal with the gsi number.
> > > And when we passthrough a device, QEMU will use its gsi
> > > number to do mapping actions, see xen_pt_realize->
> > > xc_physdev_map_pirq, but the gsi number is got from file
> > > /sys/bus/pci/devices/xxxx:xx:xx.x/irq in current code,
> > > so it will fail when mapping.
> > > And in current codes, there is no method to translate
> > > irq to gsi for userspace.
> > 
> > I think it would be cleaner to just introduce a new sysfs node that
> > contains the gsi if a device is using one (much like the irq sysfs
> > node)?
> > 
> > Such ioctl to translate from IRQ to GSI has nothing to do with Xen, so
> > placing it in privcmd does seem quite strange to me.  I understand
> > that for passthrough we need the GSI, but such translation layer from
> > IRQ to GSI is all Linux internal, and it would be much simpler to just
> > expose the GSI in sysfs IMO.
> 
> You are aware that we have a Xen specific variant of acpi_register_gsi()?
> 
> It is the Xen event channel driver being responsible for the GSI<->IRQ
> mapping.

I'm kind of lost, this translation function is specifically needed for
PVH which doesn't use the Xen specific variant of acpi_register_gsi(),
and hence the IRQ <-> GSI relation is whatever the Linux kernel does
on native.

I do understand that on a PV dom0 the proposed sysfs gsi node would
match the irq node, but that doesn't seem like an issue to me.

Note also that PVH doesn't use acpi_register_gsi_xen_hvm() because
XENFEAT_hvm_pirqs feature is not exposed to PVH, so I expect it uses
the x86 acpi_register_gsi_ioapic().

Thanks, Roger.


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC XEN PATCH v2 3/3] tools: Add new function to get gsi from irq
  2023-11-28 16:11       ` Roger Pau Monné
@ 2023-11-28 16:22         ` Juergen Gross
  2023-11-30  6:57         ` Chen, Jiqian
  1 sibling, 0 replies; 37+ messages in thread
From: Juergen Gross @ 2023-11-28 16:22 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Jiqian Chen, Jan Beulich, Wei Liu, Anthony PERARD,
	Stefano Stabellini, xen-devel, Stewart Hildebrand, Alex Deucher,
	Xenia Ragiadakou, Stefano Stabellini, Huang Rui, Honglei Huang,
	Julia Zhang


[-- Attachment #1.1.1: Type: text/plain, Size: 2334 bytes --]

On 28.11.23 17:11, Roger Pau Monné wrote:
> On Tue, Nov 28, 2023 at 03:42:31PM +0100, Juergen Gross wrote:
>> On 28.11.23 15:25, Roger Pau Monné wrote:
>>> On Fri, Nov 24, 2023 at 06:41:36PM +0800, Jiqian Chen wrote:
>>>> In PVH dom0, it uses the linux local interrupt mechanism,
>>>> when it allocs irq for a gsi, it is dynamic, and follow
>>>> the principle of applying first, distributing first. And
>>>> if you debug the kernel codes, you will find the irq
>>>> number is alloced from small to large, but the applying
>>>> gsi number is not, may gsi 38 comes before gsi 28, that
>>>> causes the irq number is not equal with the gsi number.
>>>> And when we passthrough a device, QEMU will use its gsi
>>>> number to do mapping actions, see xen_pt_realize->
>>>> xc_physdev_map_pirq, but the gsi number is got from file
>>>> /sys/bus/pci/devices/xxxx:xx:xx.x/irq in current code,
>>>> so it will fail when mapping.
>>>> And in current codes, there is no method to translate
>>>> irq to gsi for userspace.
>>>
>>> I think it would be cleaner to just introduce a new sysfs node that
>>> contains the gsi if a device is using one (much like the irq sysfs
>>> node)?
>>>
>>> Such ioctl to translate from IRQ to GSI has nothing to do with Xen, so
>>> placing it in privcmd does seem quite strange to me.  I understand
>>> that for passthrough we need the GSI, but such translation layer from
>>> IRQ to GSI is all Linux internal, and it would be much simpler to just
>>> expose the GSI in sysfs IMO.
>>
>> You are aware that we have a Xen specific variant of acpi_register_gsi()?
>>
>> It is the Xen event channel driver being responsible for the GSI<->IRQ
>> mapping.
> 
> I'm kind of lost, this translation function is specifically needed for
> PVH which doesn't use the Xen specific variant of acpi_register_gsi(),
> and hence the IRQ <-> GSI relation is whatever the Linux kernel does
> on native.
> 
> I do understand that on a PV dom0 the proposed sysfs gsi node would
> match the irq node, but that doesn't seem like an issue to me.
> 
> Note also that PVH doesn't use acpi_register_gsi_xen_hvm() because
> XENFEAT_hvm_pirqs feature is not exposed to PVH, so I expect it uses
> the x86 acpi_register_gsi_ioapic().

Oh, I wasn't aware of this.

Sorry for the noise.


Juergen

[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 3743 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC XEN PATCH v2 3/3] tools: Add new function to get gsi from irq
  2023-11-28 14:25   ` Roger Pau Monné
  2023-11-28 14:42     ` Juergen Gross
@ 2023-11-30  3:38     ` Stefano Stabellini
  2023-11-30  4:02       ` Stefano Stabellini
  2023-11-30  6:53     ` Chen, Jiqian
  2 siblings, 1 reply; 37+ messages in thread
From: Stefano Stabellini @ 2023-11-30  3:38 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Jiqian Chen, Jan Beulich, Wei Liu, Anthony PERARD, Juergen Gross,
	Stefano Stabellini, xen-devel, Stewart Hildebrand, Alex Deucher,
	Xenia Ragiadakou, Stefano Stabellini, Huang Rui, Honglei Huang,
	Julia Zhang

[-- Attachment #1: Type: text/plain, Size: 1449 bytes --]

On Tue, 28 Nov 2023, Roger Pau Monné wrote:
> On Fri, Nov 24, 2023 at 06:41:36PM +0800, Jiqian Chen wrote:
> > In PVH dom0, it uses the linux local interrupt mechanism,
> > when it allocs irq for a gsi, it is dynamic, and follow
> > the principle of applying first, distributing first. And
> > if you debug the kernel codes, you will find the irq
> > number is alloced from small to large, but the applying
> > gsi number is not, may gsi 38 comes before gsi 28, that
> > causes the irq number is not equal with the gsi number.
> > And when we passthrough a device, QEMU will use its gsi
> > number to do mapping actions, see xen_pt_realize->
> > xc_physdev_map_pirq, but the gsi number is got from file
> > /sys/bus/pci/devices/xxxx:xx:xx.x/irq in current code,
> > so it will fail when mapping.
> > And in current codes, there is no method to translate
> > irq to gsi for userspace.
> 
> I think it would be cleaner to just introduce a new sysfs node that
> contains the gsi if a device is using one (much like the irq sysfs
> node)?
> 
> Such ioctl to translate from IRQ to GSI has nothing to do with Xen, so
> placing it in privcmd does seem quite strange to me.  I understand
> that for passthrough we need the GSI, but such translation layer from
> IRQ to GSI is all Linux internal, and it would be much simpler to just
> expose the GSI in sysfs IMO.

Maybe something to add to drivers/xen/sys-hypervisor.c in Linux.
Juergen what do you think?

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC XEN PATCH v2 3/3] tools: Add new function to get gsi from irq
  2023-11-30  3:38     ` Stefano Stabellini
@ 2023-11-30  4:02       ` Stefano Stabellini
  2023-11-30 10:06         ` Roger Pau Monné
  0 siblings, 1 reply; 37+ messages in thread
From: Stefano Stabellini @ 2023-11-30  4:02 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Roger Pau Monné,
	Jiqian Chen, Jan Beulich, Wei Liu, Anthony PERARD, Juergen Gross,
	xen-devel, Stewart Hildebrand, Alex Deucher, Xenia Ragiadakou,
	Stefano Stabellini, Huang Rui, Honglei Huang, Julia Zhang

[-- Attachment #1: Type: text/plain, Size: 1835 bytes --]

n Wed, 29 Nov 2023, Stefano Stabellini wrote:
> On Tue, 28 Nov 2023, Roger Pau Monné wrote:
> > On Fri, Nov 24, 2023 at 06:41:36PM +0800, Jiqian Chen wrote:
> > > In PVH dom0, it uses the linux local interrupt mechanism,
> > > when it allocs irq for a gsi, it is dynamic, and follow
> > > the principle of applying first, distributing first. And
> > > if you debug the kernel codes, you will find the irq
> > > number is alloced from small to large, but the applying
> > > gsi number is not, may gsi 38 comes before gsi 28, that
> > > causes the irq number is not equal with the gsi number.
> > > And when we passthrough a device, QEMU will use its gsi
> > > number to do mapping actions, see xen_pt_realize->
> > > xc_physdev_map_pirq, but the gsi number is got from file
> > > /sys/bus/pci/devices/xxxx:xx:xx.x/irq in current code,
> > > so it will fail when mapping.
> > > And in current codes, there is no method to translate
> > > irq to gsi for userspace.
> > 
> > I think it would be cleaner to just introduce a new sysfs node that
> > contains the gsi if a device is using one (much like the irq sysfs
> > node)?
> > 
> > Such ioctl to translate from IRQ to GSI has nothing to do with Xen, so
> > placing it in privcmd does seem quite strange to me.  I understand
> > that for passthrough we need the GSI, but such translation layer from
> > IRQ to GSI is all Linux internal, and it would be much simpler to just
> > expose the GSI in sysfs IMO.
> 
> Maybe something to add to drivers/xen/sys-hypervisor.c in Linux.
> Juergen what do you think?

Let me also add that privcmd.c is already a Linux specific interface.
Although it was born to be a Xen hypercall "proxy" in reality today we
have a few privcmd ioctls that don't translate into hypercalls. So I
don't think that adding IOCTL_PRIVCMD_GSI_FROM_IRQ would be a problem.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC XEN PATCH v2 1/3] xen/vpci: Clear all vpci status of device
  2023-11-28 14:08   ` Roger Pau Monné
@ 2023-11-30  6:22     ` Chen, Jiqian
  2023-11-30 11:52       ` Roger Pau Monné
  2023-11-30 12:25       ` Daniel P. Smith
  0 siblings, 2 replies; 37+ messages in thread
From: Chen, Jiqian @ 2023-11-30  6:22 UTC (permalink / raw)
  To: Roger Pau Monné, Daniel P. Smith
  Cc: Jan Beulich, Wei Liu, Anthony PERARD, Juergen Gross,
	Stefano Stabellini, xen-devel, Hildebrand, Stewart, Deucher,
	Alexander, Ragiadakou, Xenia, Stabellini, Stefano, Huang, Ray,
	Huang, Honglei1, Zhang, Julia, Chen, Jiqian

Hi Roger and Daniel P. Smith,

On 2023/11/28 22:08, Roger Pau Monné wrote:
> On Fri, Nov 24, 2023 at 06:41:34PM +0800, Jiqian Chen wrote:
>> When a device has been reset on dom0 side, the vpci on Xen
>> side won't get notification, so the cached state in vpci is
>> all out of date compare with the real device state.
>> To solve that problem, this patch add new hypercall to clear
>> all vpci device state. And when reset device happens on dom0
>> side, dom0 can call this hypercall to notify vpci.
>>
>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>> Signed-off-by: Huang Rui <ray.huang@amd.com>
>> ---
>>  xen/arch/x86/hvm/hypercall.c  |  1 +
>>  xen/drivers/passthrough/pci.c | 21 +++++++++++++++++++++
>>  xen/drivers/pci/physdev.c     | 14 ++++++++++++++
>>  xen/drivers/vpci/vpci.c       |  9 +++++++++
>>  xen/include/public/physdev.h  |  2 ++
>>  xen/include/xen/pci.h         |  1 +
>>  xen/include/xen/vpci.h        |  6 ++++++
>>  7 files changed, 54 insertions(+)
>>
>> diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c
>> index eeb73e1aa5..6ad5b4d5f1 100644
>> --- a/xen/arch/x86/hvm/hypercall.c
>> +++ b/xen/arch/x86/hvm/hypercall.c
>> @@ -84,6 +84,7 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>      case PHYSDEVOP_pci_mmcfg_reserved:
>>      case PHYSDEVOP_pci_device_add:
>>      case PHYSDEVOP_pci_device_remove:
>> +    case PHYSDEVOP_pci_device_state_reset:
>>      case PHYSDEVOP_dbgp_op:
>>          if ( !is_hardware_domain(currd) )
>>              return -ENOSYS;
>> diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
>> index 04d00c7c37..f871715585 100644
>> --- a/xen/drivers/passthrough/pci.c
>> +++ b/xen/drivers/passthrough/pci.c
>> @@ -824,6 +824,27 @@ int pci_remove_device(u16 seg, u8 bus, u8 devfn)
>>      return ret;
>>  }
>>  
>> +int pci_reset_device_state(u16 seg, u8 bus, u8 devfn)
> 
> You could use pci_sbdf_t here instead of 3 parameters.
Will change in next version, thank you.

> 
> I'm however unsure whether we really need this helper just to fetch
> the pdev and call vpci_reset_device_state(), I think you could place
> this logic directly in pci_physdev_op().  Unless there are plans to
> use such logic outside of pci_physdev_op().
If I place the logic of pci_reset_device_state directly in pci_physdev_op. I think I need to declare vpci_reset_device_state in pci.h? If it is suitable, I will change in next version.

> 
>> +{
>> +    struct pci_dev *pdev;
>> +    int ret = -ENODEV;
> 
> Some XSM check should be introduced fro this operation, as none of the
> existing ones is suitable.  See xsm_resource_unplug_pci() for example.
> 
> xsm_resource_reset_state_pci() or some such I would assume.
I don't know what I should do in XSM function(assume it is xsm_resource_reset_state_pci).
Hi Daniel P. Smith, could you please give me some suggestions?
Just to check the XSM_PRIV action?

> 
> (adding the XSM maintainer for feedback).
> 
>> +
>> +    pcidevs_lock();
>> +
>> +    pdev = pci_get_pdev(NULL, PCI_SBDF(seg, bus, devfn));
>> +    if ( !pdev )
>> +        goto error;
>> +
>> +    ret = vpci_reset_device_state(pdev);
>> +    if (ret)
>> +        printk(XENLOG_ERR "PCI reset device %pp state failed\n", &pdev->sbdf);
>> +
>> +error:
>> +    pcidevs_unlock();
>> +
>> +    return ret;
>> +}
>> +
>>  /* Caller should hold the pcidevs_lock */
>>  static int deassign_device(struct domain *d, uint16_t seg, uint8_t bus,
>>                             uint8_t devfn)
>> diff --git a/xen/drivers/pci/physdev.c b/xen/drivers/pci/physdev.c
>> index 42db3e6d13..cfdb545654 100644
>> --- a/xen/drivers/pci/physdev.c
>> +++ b/xen/drivers/pci/physdev.c
>> @@ -67,6 +67,20 @@ ret_t pci_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>          break;
>>      }
>>  
>> +    case PHYSDEVOP_pci_device_state_reset: {
>> +        struct physdev_pci_device dev;
>> +
>> +        if ( !is_pci_passthrough_enabled() )
>> +            return -EOPNOTSUPP;
>> +
>> +        ret = -EFAULT;
>> +        if ( copy_from_guest(&dev, arg, 1) != 0 )
>> +            break;
>> +
>> +        ret = pci_reset_device_state(dev.seg, dev.bus, dev.devfn);
>> +        break;
>> +    }
>> +
>>      default:
>>          ret = -ENOSYS;
>>          break;
>> diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
>> index 3bec9a4153..e9c3eb1cd6 100644
>> --- a/xen/drivers/vpci/vpci.c
>> +++ b/xen/drivers/vpci/vpci.c
>> @@ -103,6 +103,15 @@ int vpci_add_handlers(struct pci_dev *pdev)
>>  
>>      return rc;
>>  }
>> +
>> +int vpci_reset_device_state(struct pci_dev *pdev)
>> +{
>> +    ASSERT(pcidevs_locked());
>> +
>> +    vpci_remove_device(pdev);
>> +    return vpci_add_handlers(pdev);
>> +}
>> +
>>  #endif /* __XEN__ */
>>  
>>  static int vpci_register_cmp(const struct vpci_register *r1,
>> diff --git a/xen/include/public/physdev.h b/xen/include/public/physdev.h
>> index f0c0d4727c..4156948903 100644
>> --- a/xen/include/public/physdev.h
>> +++ b/xen/include/public/physdev.h
>> @@ -305,6 +305,8 @@ struct physdev_pci_device {
>>  typedef struct physdev_pci_device physdev_pci_device_t;
>>  DEFINE_XEN_GUEST_HANDLE(physdev_pci_device_t);
>>  
>> +#define PHYSDEVOP_pci_device_state_reset      32
> 
> This needs some comment in order to explain the expected behaviour,
> and might be better placed a bit up after PHYSDEVOP_release_msix.
I will add some comment and move it closer to PHYSDEVOP_release_msix in next version. Thank you.

> 
> Thanks, Roger.

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC XEN PATCH v2 2/3] x86/pvh: Open PHYSDEVOP_map_pirq for PVH dom0
  2023-11-28 14:17   ` Roger Pau Monné
@ 2023-11-30  6:32     ` Chen, Jiqian
  2023-11-30 15:13       ` Roger Pau Monné
  0 siblings, 1 reply; 37+ messages in thread
From: Chen, Jiqian @ 2023-11-30  6:32 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Jan Beulich, Wei Liu, Anthony PERARD, Juergen Gross,
	Stefano Stabellini, xen-devel, Hildebrand, Stewart, Deucher,
	Alexander, Ragiadakou, Xenia, Stabellini, Stefano, Huang, Ray,
	Huang, Honglei1, Zhang, Julia, Chen, Jiqian


On 2023/11/28 22:17, Roger Pau Monné wrote:
> On Fri, Nov 24, 2023 at 06:41:35PM +0800, Jiqian Chen wrote:
>> If we run Xen with PVH dom0 and hvm domU, hvm will map a pirq for
>> a passthrough device by using gsi, see xen_pt_realize->xc_physdev_map_pirq
>> and pci_add_dm_done->xc_physdev_map_pirq. Then xc_physdev_map_pirq will
>> call into Xen, but in hvm_physdev_op, PHYSDEVOP_map_pirq is not allowed
>> because currd is PVH dom0 and PVH has no X86_EMU_USE_PIRQ flag, it will
>> fail at has_pirq check.
>>
>> So, I think we may need to allow PHYSDEVOP_map_pirq when currd is dom0 (at
> 
> And PHYSDEVOP_unmap_pirq also?
Yes, in the failed path, PHYSDEVOP_unmap_pirq will be called. I will add some descriptions about it into the commit message.

> 
>> present dom0 is PVH).
> 
> IMO it would be better to implement a new set of DMOP hypercalls that
> handle the setup of interrupts from physical devices that are assigned
> to a guest.  That should allow us to get rid of the awkward PHYSDEVOP
> + XEN_DOMCTL_{,un}bind_pt_irq hypercall interface, which currently
> prevents QEMU from being hypervisor version agnostic (since the domctl
> interface is not stable).
> 
> I understand this might be too much to ask for, but something to take
> into account.
Yes, that will be a complex project. I think current change can meet the needs. We can take DMOP into account in the future. Thank you.

> 
> Thanks, Roger.

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC XEN PATCH v2 2/3] x86/pvh: Open PHYSDEVOP_map_pirq for PVH dom0
  2023-11-28 15:14   ` Jan Beulich
@ 2023-11-30  6:44     ` Chen, Jiqian
  2023-11-30  8:38       ` Chen, Jiqian
  2023-11-30  9:00       ` Jan Beulich
  0 siblings, 2 replies; 37+ messages in thread
From: Chen, Jiqian @ 2023-11-30  6:44 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Hildebrand, Stewart, Deucher, Alexander, Ragiadakou, Xenia,
	Stabellini, Stefano, Huang, Ray, Huang, Honglei1, Zhang, Julia,
	Roger Pau Monné,
	Wei Liu, Anthony PERARD, Juergen Gross, Stefano Stabellini,
	xen-devel, Chen, Jiqian


On 2023/11/28 23:14, Jan Beulich wrote:
> On 24.11.2023 11:41, Jiqian Chen wrote:
>> --- a/xen/arch/x86/hvm/hypercall.c
>> +++ b/xen/arch/x86/hvm/hypercall.c
>> @@ -74,6 +74,8 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>      {
>>      case PHYSDEVOP_map_pirq:
>>      case PHYSDEVOP_unmap_pirq:
>> +        if (is_hardware_domain(currd))
>> +            break;
>>      case PHYSDEVOP_eoi:
>>      case PHYSDEVOP_irq_status_query:
>>      case PHYSDEVOP_get_free_pirq:
> 
> If you wouldn't go the route suggested by Roger, I think you will need
> to deny self-mapping requests here.
Do you mean below?
if (arg.domid == DOMID_SELF)
	return;

> 
> Also note that both here and in patch 1 you will want to adjust a number
> of style violations.
Could you please descript in detail? This will greatly assist me in making modifications in the next version. Thank you!

> 
> Jan

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC XEN PATCH v2 3/3] tools: Add new function to get gsi from irq
  2023-11-28 14:25   ` Roger Pau Monné
  2023-11-28 14:42     ` Juergen Gross
  2023-11-30  3:38     ` Stefano Stabellini
@ 2023-11-30  6:53     ` Chen, Jiqian
  2 siblings, 0 replies; 37+ messages in thread
From: Chen, Jiqian @ 2023-11-30  6:53 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Jan Beulich, Wei Liu, Anthony PERARD, Juergen Gross,
	Stefano Stabellini, xen-devel, Hildebrand, Stewart, Deucher,
	Alexander, Ragiadakou, Xenia, Stabellini, Stefano, Huang, Ray,
	Huang, Honglei1, Zhang, Julia, Chen, Jiqian


On 2023/11/28 22:25, Roger Pau Monné wrote:
> On Fri, Nov 24, 2023 at 06:41:36PM +0800, Jiqian Chen wrote:
>> In PVH dom0, it uses the linux local interrupt mechanism,
>> when it allocs irq for a gsi, it is dynamic, and follow
>> the principle of applying first, distributing first. And
>> if you debug the kernel codes, you will find the irq
>> number is alloced from small to large, but the applying
>> gsi number is not, may gsi 38 comes before gsi 28, that
>> causes the irq number is not equal with the gsi number.
>> And when we passthrough a device, QEMU will use its gsi
>> number to do mapping actions, see xen_pt_realize->
>> xc_physdev_map_pirq, but the gsi number is got from file
>> /sys/bus/pci/devices/xxxx:xx:xx.x/irq in current code,
>> so it will fail when mapping.
>> And in current codes, there is no method to translate
>> irq to gsi for userspace.
> 
> I think it would be cleaner to just introduce a new sysfs node that
> contains the gsi if a device is using one (much like the irq sysfs
> node)?
Yes, I also ever thought this way. Add a sysfs node in /sys/bus/pci/devices/xxxx:xx:xx.x/gsi.
But I am not sure sysfs or privcmd, which is better.
If use sysfs node, should I need to use the macro of Xen to wrap the codes? And is it suitable to create it in function acpi_register_gsi_ioapic?

> 
> Such ioctl to translate from IRQ to GSI has nothing to do with Xen, so
> placing it in privcmd does seem quite strange to me.  I understand
> that for passthrough we need the GSI, but such translation layer from
> IRQ to GSI is all Linux internal, and it would be much simpler to just
> expose the GSI in sysfs IMO.
> 
> Thanks, Roger.

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC XEN PATCH v2 3/3] tools: Add new function to get gsi from irq
  2023-11-28 16:11       ` Roger Pau Monné
  2023-11-28 16:22         ` Juergen Gross
@ 2023-11-30  6:57         ` Chen, Jiqian
  1 sibling, 0 replies; 37+ messages in thread
From: Chen, Jiqian @ 2023-11-30  6:57 UTC (permalink / raw)
  To: Roger Pau Monné, Juergen Gross
  Cc: Chen, Jiqian, Jan Beulich, Wei Liu, Anthony PERARD,
	Stefano Stabellini, xen-devel, Hildebrand, Stewart, Deucher,
	Alexander, Ragiadakou, Xenia, Stabellini, Stefano, Huang, Ray,
	Huang, Honglei1, Zhang, Julia


On 2023/11/29 00:11, Roger Pau Monné wrote:
> On Tue, Nov 28, 2023 at 03:42:31PM +0100, Juergen Gross wrote:
>> On 28.11.23 15:25, Roger Pau Monné wrote:
>>> On Fri, Nov 24, 2023 at 06:41:36PM +0800, Jiqian Chen wrote:
>>>> In PVH dom0, it uses the linux local interrupt mechanism,
>>>> when it allocs irq for a gsi, it is dynamic, and follow
>>>> the principle of applying first, distributing first. And
>>>> if you debug the kernel codes, you will find the irq
>>>> number is alloced from small to large, but the applying
>>>> gsi number is not, may gsi 38 comes before gsi 28, that
>>>> causes the irq number is not equal with the gsi number.
>>>> And when we passthrough a device, QEMU will use its gsi
>>>> number to do mapping actions, see xen_pt_realize->
>>>> xc_physdev_map_pirq, but the gsi number is got from file
>>>> /sys/bus/pci/devices/xxxx:xx:xx.x/irq in current code,
>>>> so it will fail when mapping.
>>>> And in current codes, there is no method to translate
>>>> irq to gsi for userspace.
>>>
>>> I think it would be cleaner to just introduce a new sysfs node that
>>> contains the gsi if a device is using one (much like the irq sysfs
>>> node)?
>>>
>>> Such ioctl to translate from IRQ to GSI has nothing to do with Xen, so
>>> placing it in privcmd does seem quite strange to me.  I understand
>>> that for passthrough we need the GSI, but such translation layer from
>>> IRQ to GSI is all Linux internal, and it would be much simpler to just
>>> expose the GSI in sysfs IMO.
>>
>> You are aware that we have a Xen specific variant of acpi_register_gsi()?
>>
>> It is the Xen event channel driver being responsible for the GSI<->IRQ
>> mapping.
> 
> I'm kind of lost, this translation function is specifically needed for
> PVH which doesn't use the Xen specific variant of acpi_register_gsi(),
> and hence the IRQ <-> GSI relation is whatever the Linux kernel does
> on native.
> 
> I do understand that on a PV dom0 the proposed sysfs gsi node would
> match the irq node, but that doesn't seem like an issue to me.
> 
> Note also that PVH doesn't use acpi_register_gsi_xen_hvm() because
> XENFEAT_hvm_pirqs feature is not exposed to PVH, so I expect it uses
> the x86 acpi_register_gsi_ioapic().
Yes, PVH use acpi_register_gsi_ioapic, thank Roger for explanation.

> 
> Thanks, Roger.

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC XEN PATCH v2 2/3] x86/pvh: Open PHYSDEVOP_map_pirq for PVH dom0
  2023-11-30  6:44     ` Chen, Jiqian
@ 2023-11-30  8:38       ` Chen, Jiqian
  2023-11-30  9:00       ` Jan Beulich
  1 sibling, 0 replies; 37+ messages in thread
From: Chen, Jiqian @ 2023-11-30  8:38 UTC (permalink / raw)
  To: Jan Beulich, Ragiadakou, Xenia
  Cc: Hildebrand, Stewart, Deucher, Alexander, Stabellini, Stefano,
	Huang, Ray, Huang, Honglei1, Zhang, Julia, Roger Pau Monné,
	Wei Liu, Anthony PERARD, Juergen Gross, Stefano Stabellini,
	xen-devel, Chen, Jiqian

Hi Jan Beulich,

On 2023/11/30 14:44, Chen, Jiqian wrote:
> 
> On 2023/11/28 23:14, Jan Beulich wrote:
>> On 24.11.2023 11:41, Jiqian Chen wrote:
>>> --- a/xen/arch/x86/hvm/hypercall.c
>>> +++ b/xen/arch/x86/hvm/hypercall.c
>>> @@ -74,6 +74,8 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>>      {
>>>      case PHYSDEVOP_map_pirq:
>>>      case PHYSDEVOP_unmap_pirq:
>>> +        if (is_hardware_domain(currd))
>>> +            break;
>>>      case PHYSDEVOP_eoi:
>>>      case PHYSDEVOP_irq_status_query:
>>>      case PHYSDEVOP_get_free_pirq:
>>
>> If you wouldn't go the route suggested by Roger, I think you will need
>> to deny self-mapping requests here.
> Do you mean below?
> if (arg.domid == DOMID_SELF)
> 	return;
> 
>>
>> Also note that both here and in patch 1 you will want to adjust a number
>> of style violations.
> Could you please descript in detail? This will greatly assist me in making modifications in the next version. Thank you!
Oh! Do you mean there are many code style problems that not satisfiable for CODING_STYLE of Xen in my codes?
Thank Xenia for reminder.

> 
>>
>> Jan
> 

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC XEN PATCH v2 2/3] x86/pvh: Open PHYSDEVOP_map_pirq for PVH dom0
  2023-11-30  6:44     ` Chen, Jiqian
  2023-11-30  8:38       ` Chen, Jiqian
@ 2023-11-30  9:00       ` Jan Beulich
  2023-11-30  9:43         ` Chen, Jiqian
  1 sibling, 1 reply; 37+ messages in thread
From: Jan Beulich @ 2023-11-30  9:00 UTC (permalink / raw)
  To: Chen, Jiqian
  Cc: Hildebrand, Stewart, Deucher, Alexander, Ragiadakou, Xenia,
	Stabellini, Stefano, Huang, Ray, Huang, Honglei1, Zhang, Julia,
	Roger Pau Monné,
	Wei Liu, Anthony PERARD, Juergen Gross, Stefano Stabellini,
	xen-devel

On 30.11.2023 07:44, Chen, Jiqian wrote:
> On 2023/11/28 23:14, Jan Beulich wrote:
>> On 24.11.2023 11:41, Jiqian Chen wrote:
>>> --- a/xen/arch/x86/hvm/hypercall.c
>>> +++ b/xen/arch/x86/hvm/hypercall.c
>>> @@ -74,6 +74,8 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>>      {
>>>      case PHYSDEVOP_map_pirq:
>>>      case PHYSDEVOP_unmap_pirq:
>>> +        if (is_hardware_domain(currd))
>>> +            break;
>>>      case PHYSDEVOP_eoi:
>>>      case PHYSDEVOP_irq_status_query:
>>>      case PHYSDEVOP_get_free_pirq:
>>
>> If you wouldn't go the route suggested by Roger, I think you will need
>> to deny self-mapping requests here.
> Do you mean below?
> if (arg.domid == DOMID_SELF)
> 	return;

That's part of it, yes. You'd also need to check for the actual domain ID of
the caller domain.

>> Also note that both here and in patch 1 you will want to adjust a number
>> of style violations.
> Could you please descript in detail? This will greatly assist me in making modifications in the next version. Thank you!

Well, in the code above you're missing blanks inside the if(). Please see
./CODING_STYLE.

Jan


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC XEN PATCH v2 2/3] x86/pvh: Open PHYSDEVOP_map_pirq for PVH dom0
  2023-11-30  9:00       ` Jan Beulich
@ 2023-11-30  9:43         ` Chen, Jiqian
  0 siblings, 0 replies; 37+ messages in thread
From: Chen, Jiqian @ 2023-11-30  9:43 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Hildebrand, Stewart, Deucher, Alexander, Ragiadakou, Xenia,
	Stabellini, Stefano, Huang, Ray, Huang, Honglei1, Zhang, Julia,
	Roger Pau Monné,
	Wei Liu, Anthony PERARD, Juergen Gross, Stefano Stabellini,
	xen-devel, Chen, Jiqian

On 2023/11/30 17:00, Jan Beulich wrote:
> On 30.11.2023 07:44, Chen, Jiqian wrote:
>> On 2023/11/28 23:14, Jan Beulich wrote:
>>> On 24.11.2023 11:41, Jiqian Chen wrote:
>>>> --- a/xen/arch/x86/hvm/hypercall.c
>>>> +++ b/xen/arch/x86/hvm/hypercall.c
>>>> @@ -74,6 +74,8 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>>>      {
>>>>      case PHYSDEVOP_map_pirq:
>>>>      case PHYSDEVOP_unmap_pirq:
>>>> +        if (is_hardware_domain(currd))
>>>> +            break;
>>>>      case PHYSDEVOP_eoi:
>>>>      case PHYSDEVOP_irq_status_query:
>>>>      case PHYSDEVOP_get_free_pirq:
>>>
>>> If you wouldn't go the route suggested by Roger, I think you will need
>>> to deny self-mapping requests here.
>> Do you mean below?
>> if (arg.domid == DOMID_SELF)
>> 	return;
> 
> That's part of it, yes. You'd also need to check for the actual domain ID of
> the caller domain.
I will add more check in next version.

> 
>>> Also note that both here and in patch 1 you will want to adjust a number
>>> of style violations.
>> Could you please descript in detail? This will greatly assist me in making modifications in the next version. Thank you!
> 
> Well, in the code above you're missing blanks inside the if(). Please see
> ./CODING_STYLE.
Thank you very much! I will check and modify all my patches to meet the Xen code style in next version.

> 
> Jan

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC XEN PATCH v2 3/3] tools: Add new function to get gsi from irq
  2023-11-30  4:02       ` Stefano Stabellini
@ 2023-11-30 10:06         ` Roger Pau Monné
  2023-12-01  3:09           ` Stefano Stabellini
  0 siblings, 1 reply; 37+ messages in thread
From: Roger Pau Monné @ 2023-11-30 10:06 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Jiqian Chen, Jan Beulich, Wei Liu, Anthony PERARD, Juergen Gross,
	xen-devel, Stewart Hildebrand, Alex Deucher, Xenia Ragiadakou,
	Stefano Stabellini, Huang Rui, Honglei Huang, Julia Zhang

On Wed, Nov 29, 2023 at 08:02:40PM -0800, Stefano Stabellini wrote:
> n Wed, 29 Nov 2023, Stefano Stabellini wrote:
> > On Tue, 28 Nov 2023, Roger Pau Monné wrote:
> > > On Fri, Nov 24, 2023 at 06:41:36PM +0800, Jiqian Chen wrote:
> > > > In PVH dom0, it uses the linux local interrupt mechanism,
> > > > when it allocs irq for a gsi, it is dynamic, and follow
> > > > the principle of applying first, distributing first. And
> > > > if you debug the kernel codes, you will find the irq
> > > > number is alloced from small to large, but the applying
> > > > gsi number is not, may gsi 38 comes before gsi 28, that
> > > > causes the irq number is not equal with the gsi number.
> > > > And when we passthrough a device, QEMU will use its gsi
> > > > number to do mapping actions, see xen_pt_realize->
> > > > xc_physdev_map_pirq, but the gsi number is got from file
> > > > /sys/bus/pci/devices/xxxx:xx:xx.x/irq in current code,
> > > > so it will fail when mapping.
> > > > And in current codes, there is no method to translate
> > > > irq to gsi for userspace.
> > > 
> > > I think it would be cleaner to just introduce a new sysfs node that
> > > contains the gsi if a device is using one (much like the irq sysfs
> > > node)?
> > > 
> > > Such ioctl to translate from IRQ to GSI has nothing to do with Xen, so
> > > placing it in privcmd does seem quite strange to me.  I understand
> > > that for passthrough we need the GSI, but such translation layer from
> > > IRQ to GSI is all Linux internal, and it would be much simpler to just
> > > expose the GSI in sysfs IMO.
> > 
> > Maybe something to add to drivers/xen/sys-hypervisor.c in Linux.
> > Juergen what do you think?
> 
> Let me also add that privcmd.c is already a Linux specific interface.
> Although it was born to be a Xen hypercall "proxy" in reality today we
> have a few privcmd ioctls that don't translate into hypercalls. So I
> don't think that adding IOCTL_PRIVCMD_GSI_FROM_IRQ would be a problem.

Maybe not all ioctls translate to hypercalls (I guess you are
referring to the IOCTL_PRIVCMD_RESTRICT ioctl), but they are specific
Xen actions.  Getting the GSI used by a device has nothing do to with
Xen.

IMO drivers/xen/sys-hypervisor.c is also not appropriate, but I'm not
the maintainer of any of those components.

There's nothing Xen specific about fetching the GSI associated with a
PCI device.  The fact that Xen needs it for passthrough is just a red
herring, further cases where the GSI is needed might arise outside of
Xen, and hence such node would better be placed in a generic
location.  The right location should be /sys/bus/pci/devices/<sbdf>/gsi.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC XEN PATCH v2 1/3] xen/vpci: Clear all vpci status of device
  2023-11-30  6:22     ` Chen, Jiqian
@ 2023-11-30 11:52       ` Roger Pau Monné
  2023-11-30 11:55         ` Chen, Jiqian
  2023-11-30 12:25       ` Daniel P. Smith
  1 sibling, 1 reply; 37+ messages in thread
From: Roger Pau Monné @ 2023-11-30 11:52 UTC (permalink / raw)
  To: Chen, Jiqian
  Cc: Daniel P. Smith, Jan Beulich, Wei Liu, Anthony PERARD,
	Juergen Gross, Stefano Stabellini, xen-devel, Hildebrand,
	Stewart, Deucher, Alexander, Ragiadakou, Xenia, Stabellini,
	Stefano, Huang, Ray, Huang, Honglei1, Zhang, Julia

On Thu, Nov 30, 2023 at 06:22:28AM +0000, Chen, Jiqian wrote:
> Hi Roger and Daniel P. Smith,
> 
> On 2023/11/28 22:08, Roger Pau Monné wrote:
> > On Fri, Nov 24, 2023 at 06:41:34PM +0800, Jiqian Chen wrote:
> >> When a device has been reset on dom0 side, the vpci on Xen
> >> side won't get notification, so the cached state in vpci is
> >> all out of date compare with the real device state.
> >> To solve that problem, this patch add new hypercall to clear
> >> all vpci device state. And when reset device happens on dom0
> >> side, dom0 can call this hypercall to notify vpci.
> >>
> >> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> >> Signed-off-by: Huang Rui <ray.huang@amd.com>
> >> ---
> >>  xen/arch/x86/hvm/hypercall.c  |  1 +
> >>  xen/drivers/passthrough/pci.c | 21 +++++++++++++++++++++
> >>  xen/drivers/pci/physdev.c     | 14 ++++++++++++++
> >>  xen/drivers/vpci/vpci.c       |  9 +++++++++
> >>  xen/include/public/physdev.h  |  2 ++
> >>  xen/include/xen/pci.h         |  1 +
> >>  xen/include/xen/vpci.h        |  6 ++++++
> >>  7 files changed, 54 insertions(+)
> >>
> >> diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c
> >> index eeb73e1aa5..6ad5b4d5f1 100644
> >> --- a/xen/arch/x86/hvm/hypercall.c
> >> +++ b/xen/arch/x86/hvm/hypercall.c
> >> @@ -84,6 +84,7 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
> >>      case PHYSDEVOP_pci_mmcfg_reserved:
> >>      case PHYSDEVOP_pci_device_add:
> >>      case PHYSDEVOP_pci_device_remove:
> >> +    case PHYSDEVOP_pci_device_state_reset:
> >>      case PHYSDEVOP_dbgp_op:
> >>          if ( !is_hardware_domain(currd) )
> >>              return -ENOSYS;
> >> diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
> >> index 04d00c7c37..f871715585 100644
> >> --- a/xen/drivers/passthrough/pci.c
> >> +++ b/xen/drivers/passthrough/pci.c
> >> @@ -824,6 +824,27 @@ int pci_remove_device(u16 seg, u8 bus, u8 devfn)
> >>      return ret;
> >>  }
> >>  
> >> +int pci_reset_device_state(u16 seg, u8 bus, u8 devfn)
> > 
> > You could use pci_sbdf_t here instead of 3 parameters.
> Will change in next version, thank you.
> 
> > 
> > I'm however unsure whether we really need this helper just to fetch
> > the pdev and call vpci_reset_device_state(), I think you could place
> > this logic directly in pci_physdev_op().  Unless there are plans to
> > use such logic outside of pci_physdev_op().
> If I place the logic of pci_reset_device_state directly in pci_physdev_op. I think I need to declare vpci_reset_device_state in pci.h? If it is suitable, I will change in next version.

Just include xen/vpci.h in drivers/pci/physdev.c AFAICT.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC XEN PATCH v2 1/3] xen/vpci: Clear all vpci status of device
  2023-11-30 11:52       ` Roger Pau Monné
@ 2023-11-30 11:55         ` Chen, Jiqian
  0 siblings, 0 replies; 37+ messages in thread
From: Chen, Jiqian @ 2023-11-30 11:55 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Daniel P. Smith, Jan Beulich, Wei Liu, Anthony PERARD,
	Juergen Gross, Stefano Stabellini, xen-devel, Hildebrand,
	Stewart, Deucher, Alexander, Ragiadakou, Xenia, Stabellini,
	Stefano, Huang, Ray, Huang, Honglei1, Zhang, Julia, Chen, Jiqian


On 2023/11/30 19:52, Roger Pau Monné wrote:
> On Thu, Nov 30, 2023 at 06:22:28AM +0000, Chen, Jiqian wrote:
>> Hi Roger and Daniel P. Smith,
>>
>> On 2023/11/28 22:08, Roger Pau Monné wrote:
>>> On Fri, Nov 24, 2023 at 06:41:34PM +0800, Jiqian Chen wrote:
>>>> When a device has been reset on dom0 side, the vpci on Xen
>>>> side won't get notification, so the cached state in vpci is
>>>> all out of date compare with the real device state.
>>>> To solve that problem, this patch add new hypercall to clear
>>>> all vpci device state. And when reset device happens on dom0
>>>> side, dom0 can call this hypercall to notify vpci.
>>>>
>>>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>>>> Signed-off-by: Huang Rui <ray.huang@amd.com>
>>>> ---
>>>>  xen/arch/x86/hvm/hypercall.c  |  1 +
>>>>  xen/drivers/passthrough/pci.c | 21 +++++++++++++++++++++
>>>>  xen/drivers/pci/physdev.c     | 14 ++++++++++++++
>>>>  xen/drivers/vpci/vpci.c       |  9 +++++++++
>>>>  xen/include/public/physdev.h  |  2 ++
>>>>  xen/include/xen/pci.h         |  1 +
>>>>  xen/include/xen/vpci.h        |  6 ++++++
>>>>  7 files changed, 54 insertions(+)
>>>>
>>>> diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c
>>>> index eeb73e1aa5..6ad5b4d5f1 100644
>>>> --- a/xen/arch/x86/hvm/hypercall.c
>>>> +++ b/xen/arch/x86/hvm/hypercall.c
>>>> @@ -84,6 +84,7 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>>>      case PHYSDEVOP_pci_mmcfg_reserved:
>>>>      case PHYSDEVOP_pci_device_add:
>>>>      case PHYSDEVOP_pci_device_remove:
>>>> +    case PHYSDEVOP_pci_device_state_reset:
>>>>      case PHYSDEVOP_dbgp_op:
>>>>          if ( !is_hardware_domain(currd) )
>>>>              return -ENOSYS;
>>>> diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
>>>> index 04d00c7c37..f871715585 100644
>>>> --- a/xen/drivers/passthrough/pci.c
>>>> +++ b/xen/drivers/passthrough/pci.c
>>>> @@ -824,6 +824,27 @@ int pci_remove_device(u16 seg, u8 bus, u8 devfn)
>>>>      return ret;
>>>>  }
>>>>  
>>>> +int pci_reset_device_state(u16 seg, u8 bus, u8 devfn)
>>>
>>> You could use pci_sbdf_t here instead of 3 parameters.
>> Will change in next version, thank you.
>>
>>>
>>> I'm however unsure whether we really need this helper just to fetch
>>> the pdev and call vpci_reset_device_state(), I think you could place
>>> this logic directly in pci_physdev_op().  Unless there are plans to
>>> use such logic outside of pci_physdev_op().
>> If I place the logic of pci_reset_device_state directly in pci_physdev_op. I think I need to declare vpci_reset_device_state in pci.h? If it is suitable, I will change in next version.
> 
> Just include xen/vpci.h in drivers/pci/physdev.c AFAICT.
Ok, I will change it in next version. Thanks.

> 
> Thanks, Roger.

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC XEN PATCH v2 1/3] xen/vpci: Clear all vpci status of device
  2023-11-30  6:22     ` Chen, Jiqian
  2023-11-30 11:52       ` Roger Pau Monné
@ 2023-11-30 12:25       ` Daniel P. Smith
  2023-11-30 12:39         ` Daniel P. Smith
  2023-11-30 13:03         ` Chen, Jiqian
  1 sibling, 2 replies; 37+ messages in thread
From: Daniel P. Smith @ 2023-11-30 12:25 UTC (permalink / raw)
  To: Chen, Jiqian, Roger Pau Monné
  Cc: Jan Beulich, Wei Liu, Anthony PERARD, Juergen Gross,
	Stefano Stabellini, xen-devel, Hildebrand, Stewart, Deucher,
	Alexander, Ragiadakou, Xenia, Stabellini, Stefano, Huang, Ray,
	Huang, Honglei1, Zhang, Julia

On 11/30/23 01:22, Chen, Jiqian wrote:
> Hi Roger and Daniel P. Smith,
> 
> On 2023/11/28 22:08, Roger Pau Monné wrote:
>> On Fri, Nov 24, 2023 at 06:41:34PM +0800, Jiqian Chen wrote:
>>> When a device has been reset on dom0 side, the vpci on Xen
>>> side won't get notification, so the cached state in vpci is
>>> all out of date compare with the real device state.
>>> To solve that problem, this patch add new hypercall to clear
>>> all vpci device state. And when reset device happens on dom0
>>> side, dom0 can call this hypercall to notify vpci.
>>>
>>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>>> Signed-off-by: Huang Rui <ray.huang@amd.com>
>>> ---
>>>   xen/arch/x86/hvm/hypercall.c  |  1 +
>>>   xen/drivers/passthrough/pci.c | 21 +++++++++++++++++++++
>>>   xen/drivers/pci/physdev.c     | 14 ++++++++++++++
>>>   xen/drivers/vpci/vpci.c       |  9 +++++++++
>>>   xen/include/public/physdev.h  |  2 ++
>>>   xen/include/xen/pci.h         |  1 +
>>>   xen/include/xen/vpci.h        |  6 ++++++
>>>   7 files changed, 54 insertions(+)
>>>
>>> diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c
>>> index eeb73e1aa5..6ad5b4d5f1 100644
>>> --- a/xen/arch/x86/hvm/hypercall.c
>>> +++ b/xen/arch/x86/hvm/hypercall.c
>>> @@ -84,6 +84,7 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>>       case PHYSDEVOP_pci_mmcfg_reserved:
>>>       case PHYSDEVOP_pci_device_add:
>>>       case PHYSDEVOP_pci_device_remove:
>>> +    case PHYSDEVOP_pci_device_state_reset:
>>>       case PHYSDEVOP_dbgp_op:
>>>           if ( !is_hardware_domain(currd) )
>>>               return -ENOSYS;
>>> diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
>>> index 04d00c7c37..f871715585 100644
>>> --- a/xen/drivers/passthrough/pci.c
>>> +++ b/xen/drivers/passthrough/pci.c
>>> @@ -824,6 +824,27 @@ int pci_remove_device(u16 seg, u8 bus, u8 devfn)
>>>       return ret;
>>>   }
>>>   
>>> +int pci_reset_device_state(u16 seg, u8 bus, u8 devfn)
>>
>> You could use pci_sbdf_t here instead of 3 parameters.
> Will change in next version, thank you.
> 
>>
>> I'm however unsure whether we really need this helper just to fetch
>> the pdev and call vpci_reset_device_state(), I think you could place
>> this logic directly in pci_physdev_op().  Unless there are plans to
>> use such logic outside of pci_physdev_op().
> If I place the logic of pci_reset_device_state directly in pci_physdev_op. I think I need to declare vpci_reset_device_state in pci.h? If it is suitable, I will change in next version.
> 
>>
>>> +{
>>> +    struct pci_dev *pdev;
>>> +    int ret = -ENODEV;
>>
>> Some XSM check should be introduced fro this operation, as none of the
>> existing ones is suitable.  See xsm_resource_unplug_pci() for example.
>>
>> xsm_resource_reset_state_pci() or some such I would assume.
> I don't know what I should do in XSM function(assume it is xsm_resource_reset_state_pci).
> Hi Daniel P. Smith, could you please give me some suggestions?
> Just to check the XSM_PRIV action?
> 

Roger, thank you for seeing this and adding me in!

Jiqian, I just wanted to let you know I have seen your post but I have 
been a little tied up this week. Just with the cursory look, I think 
Roger is suggesting a new XSM check/hook is warranted.

If you would like to attempt at writing the dummy policy side, mimic 
xsm_resource_plug_pci in xen/include/xsm/dummy.h and 
xen/include/xsm/xsm.h, then I can look at handling the FLASK portion 
next week and provide it to you for inclusion into the series. If you 
are not comfortable with it, I can look at the whole thing next week. 
Just let me know what you would be comfortable with.

v/r,
dps


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC XEN PATCH v2 1/3] xen/vpci: Clear all vpci status of device
  2023-11-30 12:25       ` Daniel P. Smith
@ 2023-11-30 12:39         ` Daniel P. Smith
  2023-11-30 14:52           ` Roger Pau Monné
  2023-11-30 13:03         ` Chen, Jiqian
  1 sibling, 1 reply; 37+ messages in thread
From: Daniel P. Smith @ 2023-11-30 12:39 UTC (permalink / raw)
  To: Chen, Jiqian, Roger Pau Monné
  Cc: Jan Beulich, Wei Liu, Anthony PERARD, Juergen Gross,
	Stefano Stabellini, xen-devel, Hildebrand, Stewart, Deucher,
	Alexander, Ragiadakou, Xenia, Stabellini, Stefano, Huang, Ray,
	Huang, Honglei1, Zhang, Julia

On 11/30/23 07:25, Daniel P. Smith wrote:
> On 11/30/23 01:22, Chen, Jiqian wrote:
>> Hi Roger and Daniel P. Smith,
>>
>> On 2023/11/28 22:08, Roger Pau Monné wrote:
>>> On Fri, Nov 24, 2023 at 06:41:34PM +0800, Jiqian Chen wrote:
>>>> When a device has been reset on dom0 side, the vpci on Xen
>>>> side won't get notification, so the cached state in vpci is
>>>> all out of date compare with the real device state.
>>>> To solve that problem, this patch add new hypercall to clear
>>>> all vpci device state. And when reset device happens on dom0
>>>> side, dom0 can call this hypercall to notify vpci.
>>>>
>>>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>>>> Signed-off-by: Huang Rui <ray.huang@amd.com>
>>>> ---
>>>>   xen/arch/x86/hvm/hypercall.c  |  1 +
>>>>   xen/drivers/passthrough/pci.c | 21 +++++++++++++++++++++
>>>>   xen/drivers/pci/physdev.c     | 14 ++++++++++++++
>>>>   xen/drivers/vpci/vpci.c       |  9 +++++++++
>>>>   xen/include/public/physdev.h  |  2 ++
>>>>   xen/include/xen/pci.h         |  1 +
>>>>   xen/include/xen/vpci.h        |  6 ++++++
>>>>   7 files changed, 54 insertions(+)
>>>>
>>>> diff --git a/xen/arch/x86/hvm/hypercall.c 
>>>> b/xen/arch/x86/hvm/hypercall.c
>>>> index eeb73e1aa5..6ad5b4d5f1 100644
>>>> --- a/xen/arch/x86/hvm/hypercall.c
>>>> +++ b/xen/arch/x86/hvm/hypercall.c
>>>> @@ -84,6 +84,7 @@ long hvm_physdev_op(int cmd, 
>>>> XEN_GUEST_HANDLE_PARAM(void) arg)
>>>>       case PHYSDEVOP_pci_mmcfg_reserved:
>>>>       case PHYSDEVOP_pci_device_add:
>>>>       case PHYSDEVOP_pci_device_remove:
>>>> +    case PHYSDEVOP_pci_device_state_reset:
>>>>       case PHYSDEVOP_dbgp_op:
>>>>           if ( !is_hardware_domain(currd) )
>>>>               return -ENOSYS;
>>>> diff --git a/xen/drivers/passthrough/pci.c 
>>>> b/xen/drivers/passthrough/pci.c
>>>> index 04d00c7c37..f871715585 100644
>>>> --- a/xen/drivers/passthrough/pci.c
>>>> +++ b/xen/drivers/passthrough/pci.c
>>>> @@ -824,6 +824,27 @@ int pci_remove_device(u16 seg, u8 bus, u8 devfn)
>>>>       return ret;
>>>>   }
>>>> +int pci_reset_device_state(u16 seg, u8 bus, u8 devfn)
>>>
>>> You could use pci_sbdf_t here instead of 3 parameters.
>> Will change in next version, thank you.
>>
>>>
>>> I'm however unsure whether we really need this helper just to fetch
>>> the pdev and call vpci_reset_device_state(), I think you could place
>>> this logic directly in pci_physdev_op().  Unless there are plans to
>>> use such logic outside of pci_physdev_op().
>> If I place the logic of pci_reset_device_state directly in 
>> pci_physdev_op. I think I need to declare vpci_reset_device_state in 
>> pci.h? If it is suitable, I will change in next version.
>>
>>>
>>>> +{
>>>> +    struct pci_dev *pdev;
>>>> +    int ret = -ENODEV;
>>>
>>> Some XSM check should be introduced fro this operation, as none of the
>>> existing ones is suitable.  See xsm_resource_unplug_pci() for example.
>>>
>>> xsm_resource_reset_state_pci() or some such I would assume.
>> I don't know what I should do in XSM function(assume it is 
>> xsm_resource_reset_state_pci).
>> Hi Daniel P. Smith, could you please give me some suggestions?
>> Just to check the XSM_PRIV action?
>>
> 
> Roger, thank you for seeing this and adding me in!
> 
> Jiqian, I just wanted to let you know I have seen your post but I have 
> been a little tied up this week. Just with the cursory look, I think 
> Roger is suggesting a new XSM check/hook is warranted.
> 
> If you would like to attempt at writing the dummy policy side, mimic 
> xsm_resource_plug_pci in xen/include/xsm/dummy.h and 
> xen/include/xsm/xsm.h, then I can look at handling the FLASK portion 
> next week and provide it to you for inclusion into the series. If you 
> are not comfortable with it, I can look at the whole thing next week. 
> Just let me know what you would be comfortable with.

Apologies, thinking about for a moment and was thinking the hook should 
be called xsm_resource_config_pci. I would reset as a config operation 
and there might be new ones in the future. I do not believe there is a 
need to have fine grain access control down to individual config 
operation, thus they could all be captured under this one hook. Roger, 
what are your thoughts about this, in particular how you see vpci evolving?

v/r,
dps


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC XEN PATCH v2 1/3] xen/vpci: Clear all vpci status of device
  2023-11-30 12:25       ` Daniel P. Smith
  2023-11-30 12:39         ` Daniel P. Smith
@ 2023-11-30 13:03         ` Chen, Jiqian
  1 sibling, 0 replies; 37+ messages in thread
From: Chen, Jiqian @ 2023-11-30 13:03 UTC (permalink / raw)
  To: Daniel P. Smith
  Cc: Jan Beulich, Wei Liu, Anthony PERARD, Juergen Gross,
	Stefano Stabellini, xen-devel, Hildebrand, Stewart, Deucher,
	Alexander, Ragiadakou, Xenia, Stabellini, Stefano, Huang, Ray,
	Huang, Honglei1, Zhang, Julia, Chen, Jiqian, Roger Pau Monné

On 2023/11/30 20:25, Daniel P. Smith wrote:
> On 11/30/23 01:22, Chen, Jiqian wrote:
>> Hi Roger and Daniel P. Smith,
>>
>> On 2023/11/28 22:08, Roger Pau Monné wrote:
>>> On Fri, Nov 24, 2023 at 06:41:34PM +0800, Jiqian Chen wrote:
>>>> When a device has been reset on dom0 side, the vpci on Xen
>>>> side won't get notification, so the cached state in vpci is
>>>> all out of date compare with the real device state.
>>>> To solve that problem, this patch add new hypercall to clear
>>>> all vpci device state. And when reset device happens on dom0
>>>> side, dom0 can call this hypercall to notify vpci.
>>>>
>>>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>>>> Signed-off-by: Huang Rui <ray.huang@amd.com>
>>>> ---
>>>>   xen/arch/x86/hvm/hypercall.c  |  1 +
>>>>   xen/drivers/passthrough/pci.c | 21 +++++++++++++++++++++
>>>>   xen/drivers/pci/physdev.c     | 14 ++++++++++++++
>>>>   xen/drivers/vpci/vpci.c       |  9 +++++++++
>>>>   xen/include/public/physdev.h  |  2 ++
>>>>   xen/include/xen/pci.h         |  1 +
>>>>   xen/include/xen/vpci.h        |  6 ++++++
>>>>   7 files changed, 54 insertions(+)
>>>>
>>>> diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c
>>>> index eeb73e1aa5..6ad5b4d5f1 100644
>>>> --- a/xen/arch/x86/hvm/hypercall.c
>>>> +++ b/xen/arch/x86/hvm/hypercall.c
>>>> @@ -84,6 +84,7 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>>>       case PHYSDEVOP_pci_mmcfg_reserved:
>>>>       case PHYSDEVOP_pci_device_add:
>>>>       case PHYSDEVOP_pci_device_remove:
>>>> +    case PHYSDEVOP_pci_device_state_reset:
>>>>       case PHYSDEVOP_dbgp_op:
>>>>           if ( !is_hardware_domain(currd) )
>>>>               return -ENOSYS;
>>>> diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
>>>> index 04d00c7c37..f871715585 100644
>>>> --- a/xen/drivers/passthrough/pci.c
>>>> +++ b/xen/drivers/passthrough/pci.c
>>>> @@ -824,6 +824,27 @@ int pci_remove_device(u16 seg, u8 bus, u8 devfn)
>>>>       return ret;
>>>>   }
>>>>   +int pci_reset_device_state(u16 seg, u8 bus, u8 devfn)
>>>
>>> You could use pci_sbdf_t here instead of 3 parameters.
>> Will change in next version, thank you.
>>
>>>
>>> I'm however unsure whether we really need this helper just to fetch
>>> the pdev and call vpci_reset_device_state(), I think you could place
>>> this logic directly in pci_physdev_op().  Unless there are plans to
>>> use such logic outside of pci_physdev_op().
>> If I place the logic of pci_reset_device_state directly in pci_physdev_op. I think I need to declare vpci_reset_device_state in pci.h? If it is suitable, I will change in next version.
>>
>>>
>>>> +{
>>>> +    struct pci_dev *pdev;
>>>> +    int ret = -ENODEV;
>>>
>>> Some XSM check should be introduced fro this operation, as none of the
>>> existing ones is suitable.  See xsm_resource_unplug_pci() for example.
>>>
>>> xsm_resource_reset_state_pci() or some such I would assume.
>> I don't know what I should do in XSM function(assume it is xsm_resource_reset_state_pci).
>> Hi Daniel P. Smith, could you please give me some suggestions?
>> Just to check the XSM_PRIV action?
>>
> 
> Roger, thank you for seeing this and adding me in!
> 
> Jiqian, I just wanted to let you know I have seen your post but I have been a little tied up this week. Just with the cursory look, I think Roger is suggesting a new XSM check/hook is warranted.
> 
> If you would like to attempt at writing the dummy policy side, mimic xsm_resource_plug_pci in xen/include/xsm/dummy.h and xen/include/xsm/xsm.h, then I can look at handling the FLASK portion next week and provide it to you for inclusion into the series. If you are not comfortable with it, I can look at the whole thing next week. Just let me know what you would be comfortable with.
I will refer to xsm_resource_unplug_pci and try to add dummy policy in next version. I am looking forward your review and guide. Thank you very much!

> 
> v/r,
> dps

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC XEN PATCH v2 1/3] xen/vpci: Clear all vpci status of device
  2023-11-30 12:39         ` Daniel P. Smith
@ 2023-11-30 14:52           ` Roger Pau Monné
  2023-12-04  6:57             ` Chen, Jiqian
  0 siblings, 1 reply; 37+ messages in thread
From: Roger Pau Monné @ 2023-11-30 14:52 UTC (permalink / raw)
  To: Daniel P. Smith
  Cc: Chen, Jiqian, Jan Beulich, Wei Liu, Anthony PERARD,
	Juergen Gross, Stefano Stabellini, xen-devel, Hildebrand,
	Stewart, Deucher, Alexander, Ragiadakou, Xenia, Stabellini,
	Stefano, Huang, Ray, Huang, Honglei1, Zhang, Julia

On Thu, Nov 30, 2023 at 07:39:38AM -0500, Daniel P. Smith wrote:
> On 11/30/23 07:25, Daniel P. Smith wrote:
> > On 11/30/23 01:22, Chen, Jiqian wrote:
> > > Hi Roger and Daniel P. Smith,
> > > 
> > > On 2023/11/28 22:08, Roger Pau Monné wrote:
> > > > On Fri, Nov 24, 2023 at 06:41:34PM +0800, Jiqian Chen wrote:
> > > > > When a device has been reset on dom0 side, the vpci on Xen
> > > > > side won't get notification, so the cached state in vpci is
> > > > > all out of date compare with the real device state.
> > > > > To solve that problem, this patch add new hypercall to clear
> > > > > all vpci device state. And when reset device happens on dom0
> > > > > side, dom0 can call this hypercall to notify vpci.
> > > > > 
> > > > > Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> > > > > Signed-off-by: Huang Rui <ray.huang@amd.com>
> > > > > ---
> > > > >   xen/arch/x86/hvm/hypercall.c  |  1 +
> > > > >   xen/drivers/passthrough/pci.c | 21 +++++++++++++++++++++
> > > > >   xen/drivers/pci/physdev.c     | 14 ++++++++++++++
> > > > >   xen/drivers/vpci/vpci.c       |  9 +++++++++
> > > > >   xen/include/public/physdev.h  |  2 ++
> > > > >   xen/include/xen/pci.h         |  1 +
> > > > >   xen/include/xen/vpci.h        |  6 ++++++
> > > > >   7 files changed, 54 insertions(+)
> > > > > 
> > > > > diff --git a/xen/arch/x86/hvm/hypercall.c
> > > > > b/xen/arch/x86/hvm/hypercall.c
> > > > > index eeb73e1aa5..6ad5b4d5f1 100644
> > > > > --- a/xen/arch/x86/hvm/hypercall.c
> > > > > +++ b/xen/arch/x86/hvm/hypercall.c
> > > > > @@ -84,6 +84,7 @@ long hvm_physdev_op(int cmd,
> > > > > XEN_GUEST_HANDLE_PARAM(void) arg)
> > > > >       case PHYSDEVOP_pci_mmcfg_reserved:
> > > > >       case PHYSDEVOP_pci_device_add:
> > > > >       case PHYSDEVOP_pci_device_remove:
> > > > > +    case PHYSDEVOP_pci_device_state_reset:
> > > > >       case PHYSDEVOP_dbgp_op:
> > > > >           if ( !is_hardware_domain(currd) )
> > > > >               return -ENOSYS;
> > > > > diff --git a/xen/drivers/passthrough/pci.c
> > > > > b/xen/drivers/passthrough/pci.c
> > > > > index 04d00c7c37..f871715585 100644
> > > > > --- a/xen/drivers/passthrough/pci.c
> > > > > +++ b/xen/drivers/passthrough/pci.c
> > > > > @@ -824,6 +824,27 @@ int pci_remove_device(u16 seg, u8 bus, u8 devfn)
> > > > >       return ret;
> > > > >   }
> > > > > +int pci_reset_device_state(u16 seg, u8 bus, u8 devfn)
> > > > 
> > > > You could use pci_sbdf_t here instead of 3 parameters.
> > > Will change in next version, thank you.
> > > 
> > > > 
> > > > I'm however unsure whether we really need this helper just to fetch
> > > > the pdev and call vpci_reset_device_state(), I think you could place
> > > > this logic directly in pci_physdev_op().  Unless there are plans to
> > > > use such logic outside of pci_physdev_op().
> > > If I place the logic of pci_reset_device_state directly in
> > > pci_physdev_op. I think I need to declare vpci_reset_device_state in
> > > pci.h? If it is suitable, I will change in next version.
> > > 
> > > > 
> > > > > +{
> > > > > +    struct pci_dev *pdev;
> > > > > +    int ret = -ENODEV;
> > > > 
> > > > Some XSM check should be introduced fro this operation, as none of the
> > > > existing ones is suitable.  See xsm_resource_unplug_pci() for example.
> > > > 
> > > > xsm_resource_reset_state_pci() or some such I would assume.
> > > I don't know what I should do in XSM function(assume it is
> > > xsm_resource_reset_state_pci).
> > > Hi Daniel P. Smith, could you please give me some suggestions?
> > > Just to check the XSM_PRIV action?
> > > 
> > 
> > Roger, thank you for seeing this and adding me in!
> > 
> > Jiqian, I just wanted to let you know I have seen your post but I have
> > been a little tied up this week. Just with the cursory look, I think
> > Roger is suggesting a new XSM check/hook is warranted.
> > 
> > If you would like to attempt at writing the dummy policy side, mimic
> > xsm_resource_plug_pci in xen/include/xsm/dummy.h and
> > xen/include/xsm/xsm.h, then I can look at handling the FLASK portion
> > next week and provide it to you for inclusion into the series. If you
> > are not comfortable with it, I can look at the whole thing next week.
> > Just let me know what you would be comfortable with.
> 
> Apologies, thinking about for a moment and was thinking the hook should be
> called xsm_resource_config_pci. I would reset as a config operation and
> there might be new ones in the future. I do not believe there is a need to
> have fine grain access control down to individual config operation, thus
> they could all be captured under this one hook. Roger, what are your
> thoughts about this, in particular how you see vpci evolving?

So the configuration space reset should only be done by the domain
that's also capable of triggering the physical device reset, usually
the hardware domain.  I don't think it's possible ATM to allow a
domain different than the hardware domain to perform a PCI reset, as
doing it requires unmediated access to the PCI config space.

So resetting the vPCI state should either be limited to the hardware
domain, or to a pci reset capability explicitly
(xsm_resource_reset_pci or some such?).  Or maybe
xsm_resource_config_pci if that denotes unmediated access to the PCI
config space.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC XEN PATCH v2 2/3] x86/pvh: Open PHYSDEVOP_map_pirq for PVH dom0
  2023-11-30  6:32     ` Chen, Jiqian
@ 2023-11-30 15:13       ` Roger Pau Monné
  0 siblings, 0 replies; 37+ messages in thread
From: Roger Pau Monné @ 2023-11-30 15:13 UTC (permalink / raw)
  To: Chen, Jiqian
  Cc: Jan Beulich, Wei Liu, Anthony PERARD, Juergen Gross,
	Stefano Stabellini, xen-devel, Hildebrand, Stewart, Deucher,
	Alexander, Ragiadakou, Xenia, Stabellini, Stefano, Huang, Ray,
	Huang, Honglei1, Zhang, Julia

On Thu, Nov 30, 2023 at 06:32:00AM +0000, Chen, Jiqian wrote:
> 
> On 2023/11/28 22:17, Roger Pau Monné wrote:
> > On Fri, Nov 24, 2023 at 06:41:35PM +0800, Jiqian Chen wrote:
> >> If we run Xen with PVH dom0 and hvm domU, hvm will map a pirq for
> >> a passthrough device by using gsi, see xen_pt_realize->xc_physdev_map_pirq
> >> and pci_add_dm_done->xc_physdev_map_pirq. Then xc_physdev_map_pirq will
> >> call into Xen, but in hvm_physdev_op, PHYSDEVOP_map_pirq is not allowed
> >> because currd is PVH dom0 and PVH has no X86_EMU_USE_PIRQ flag, it will
> >> fail at has_pirq check.
> >>
> >> So, I think we may need to allow PHYSDEVOP_map_pirq when currd is dom0 (at
> > 
> > And PHYSDEVOP_unmap_pirq also?
> Yes, in the failed path, PHYSDEVOP_unmap_pirq will be called. I will add some descriptions about it into the commit message.
> 
> > 
> >> present dom0 is PVH).
> > 
> > IMO it would be better to implement a new set of DMOP hypercalls that
> > handle the setup of interrupts from physical devices that are assigned
> > to a guest.  That should allow us to get rid of the awkward PHYSDEVOP
> > + XEN_DOMCTL_{,un}bind_pt_irq hypercall interface, which currently
> > prevents QEMU from being hypervisor version agnostic (since the domctl
> > interface is not stable).
> > 
> > I understand this might be too much to ask for, but something to take
> > into account.
> Yes, that will be a complex project. I think current change can meet the needs. We can take DMOP into account in the future. Thank you.

The issue with this approach is that we always do things in a rush and
cut corners, and then never pay back the debt.  Anyway, I'm not going
to block this, and I'm not blaming you.

Sadly this is just focused on getting something working in the short
term rather than thinking long term in a maintainable interface.

Regards, Roger.


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC XEN PATCH v2 3/3] tools: Add new function to get gsi from irq
  2023-11-30 10:06         ` Roger Pau Monné
@ 2023-12-01  3:09           ` Stefano Stabellini
  2023-12-01  9:03             ` Roger Pau Monné
  0 siblings, 1 reply; 37+ messages in thread
From: Stefano Stabellini @ 2023-12-01  3:09 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Stefano Stabellini, Jiqian Chen, Jan Beulich, Wei Liu,
	Anthony PERARD, Juergen Gross, xen-devel, Stewart Hildebrand,
	Alex Deucher, Xenia Ragiadakou, Stefano Stabellini, Huang Rui,
	Honglei Huang, Julia Zhang

[-- Attachment #1: Type: text/plain, Size: 2973 bytes --]

On Thu, 30 Nov 2023, Roger Pau Monné wrote:
> On Wed, Nov 29, 2023 at 08:02:40PM -0800, Stefano Stabellini wrote:
> > n Wed, 29 Nov 2023, Stefano Stabellini wrote:
> > > On Tue, 28 Nov 2023, Roger Pau Monné wrote:
> > > > On Fri, Nov 24, 2023 at 06:41:36PM +0800, Jiqian Chen wrote:
> > > > > In PVH dom0, it uses the linux local interrupt mechanism,
> > > > > when it allocs irq for a gsi, it is dynamic, and follow
> > > > > the principle of applying first, distributing first. And
> > > > > if you debug the kernel codes, you will find the irq
> > > > > number is alloced from small to large, but the applying
> > > > > gsi number is not, may gsi 38 comes before gsi 28, that
> > > > > causes the irq number is not equal with the gsi number.
> > > > > And when we passthrough a device, QEMU will use its gsi
> > > > > number to do mapping actions, see xen_pt_realize->
> > > > > xc_physdev_map_pirq, but the gsi number is got from file
> > > > > /sys/bus/pci/devices/xxxx:xx:xx.x/irq in current code,
> > > > > so it will fail when mapping.
> > > > > And in current codes, there is no method to translate
> > > > > irq to gsi for userspace.
> > > > 
> > > > I think it would be cleaner to just introduce a new sysfs node that
> > > > contains the gsi if a device is using one (much like the irq sysfs
> > > > node)?
> > > > 
> > > > Such ioctl to translate from IRQ to GSI has nothing to do with Xen, so
> > > > placing it in privcmd does seem quite strange to me.  I understand
> > > > that for passthrough we need the GSI, but such translation layer from
> > > > IRQ to GSI is all Linux internal, and it would be much simpler to just
> > > > expose the GSI in sysfs IMO.
> > > 
> > > Maybe something to add to drivers/xen/sys-hypervisor.c in Linux.
> > > Juergen what do you think?
> > 
> > Let me also add that privcmd.c is already a Linux specific interface.
> > Although it was born to be a Xen hypercall "proxy" in reality today we
> > have a few privcmd ioctls that don't translate into hypercalls. So I
> > don't think that adding IOCTL_PRIVCMD_GSI_FROM_IRQ would be a problem.
> 
> Maybe not all ioctls translate to hypercalls (I guess you are
> referring to the IOCTL_PRIVCMD_RESTRICT ioctl), but they are specific
> Xen actions.  Getting the GSI used by a device has nothing do to with
> Xen.
> 
> IMO drivers/xen/sys-hypervisor.c is also not appropriate, but I'm not
> the maintainer of any of those components.
> 
> There's nothing Xen specific about fetching the GSI associated with a
> PCI device.  The fact that Xen needs it for passthrough is just a red
> herring, further cases where the GSI is needed might arise outside of
> Xen, and hence such node would better be placed in a generic
> location.  The right location should be /sys/bus/pci/devices/<sbdf>/gsi.

That might be true but /sys/bus/pci/devices/<sbdf>/gsi is a non-Xen
generic interface and the maintainers of that portion of Linux code
might have a different opinion. We'll have to see.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC XEN PATCH v2 3/3] tools: Add new function to get gsi from irq
  2023-12-01  3:09           ` Stefano Stabellini
@ 2023-12-01  9:03             ` Roger Pau Monné
  2023-12-04  5:38               ` Chen, Jiqian
  0 siblings, 1 reply; 37+ messages in thread
From: Roger Pau Monné @ 2023-12-01  9:03 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Jiqian Chen, Jan Beulich, Wei Liu, Anthony PERARD, Juergen Gross,
	xen-devel, Stewart Hildebrand, Alex Deucher, Xenia Ragiadakou,
	Stefano Stabellini, Huang Rui, Honglei Huang, Julia Zhang

On Thu, Nov 30, 2023 at 07:09:12PM -0800, Stefano Stabellini wrote:
> On Thu, 30 Nov 2023, Roger Pau Monné wrote:
> > On Wed, Nov 29, 2023 at 08:02:40PM -0800, Stefano Stabellini wrote:
> > > n Wed, 29 Nov 2023, Stefano Stabellini wrote:
> > > > On Tue, 28 Nov 2023, Roger Pau Monné wrote:
> > > > > On Fri, Nov 24, 2023 at 06:41:36PM +0800, Jiqian Chen wrote:
> > > > > > In PVH dom0, it uses the linux local interrupt mechanism,
> > > > > > when it allocs irq for a gsi, it is dynamic, and follow
> > > > > > the principle of applying first, distributing first. And
> > > > > > if you debug the kernel codes, you will find the irq
> > > > > > number is alloced from small to large, but the applying
> > > > > > gsi number is not, may gsi 38 comes before gsi 28, that
> > > > > > causes the irq number is not equal with the gsi number.
> > > > > > And when we passthrough a device, QEMU will use its gsi
> > > > > > number to do mapping actions, see xen_pt_realize->
> > > > > > xc_physdev_map_pirq, but the gsi number is got from file
> > > > > > /sys/bus/pci/devices/xxxx:xx:xx.x/irq in current code,
> > > > > > so it will fail when mapping.
> > > > > > And in current codes, there is no method to translate
> > > > > > irq to gsi for userspace.
> > > > > 
> > > > > I think it would be cleaner to just introduce a new sysfs node that
> > > > > contains the gsi if a device is using one (much like the irq sysfs
> > > > > node)?
> > > > > 
> > > > > Such ioctl to translate from IRQ to GSI has nothing to do with Xen, so
> > > > > placing it in privcmd does seem quite strange to me.  I understand
> > > > > that for passthrough we need the GSI, but such translation layer from
> > > > > IRQ to GSI is all Linux internal, and it would be much simpler to just
> > > > > expose the GSI in sysfs IMO.
> > > > 
> > > > Maybe something to add to drivers/xen/sys-hypervisor.c in Linux.
> > > > Juergen what do you think?
> > > 
> > > Let me also add that privcmd.c is already a Linux specific interface.
> > > Although it was born to be a Xen hypercall "proxy" in reality today we
> > > have a few privcmd ioctls that don't translate into hypercalls. So I
> > > don't think that adding IOCTL_PRIVCMD_GSI_FROM_IRQ would be a problem.
> > 
> > Maybe not all ioctls translate to hypercalls (I guess you are
> > referring to the IOCTL_PRIVCMD_RESTRICT ioctl), but they are specific
> > Xen actions.  Getting the GSI used by a device has nothing do to with
> > Xen.
> > 
> > IMO drivers/xen/sys-hypervisor.c is also not appropriate, but I'm not
> > the maintainer of any of those components.
> > 
> > There's nothing Xen specific about fetching the GSI associated with a
> > PCI device.  The fact that Xen needs it for passthrough is just a red
> > herring, further cases where the GSI is needed might arise outside of
> > Xen, and hence such node would better be placed in a generic
> > location.  The right location should be /sys/bus/pci/devices/<sbdf>/gsi.
> 
> That might be true but /sys/bus/pci/devices/<sbdf>/gsi is a non-Xen
> generic interface and the maintainers of that portion of Linux code
> might have a different opinion. We'll have to see.

Right, but before resorting to implement a Xen specific workaround
let's attempt to do it the proper way :).

I cannot see why exposing the gsi on sysfs like that would be an
issue.  There's a lot of resource information exposed on sysfs
already, and it's a trivial node to implement.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC XEN PATCH v2 3/3] tools: Add new function to get gsi from irq
  2023-12-01  9:03             ` Roger Pau Monné
@ 2023-12-04  5:38               ` Chen, Jiqian
  2023-12-04 11:12                 ` Roger Pau Monné
  0 siblings, 1 reply; 37+ messages in thread
From: Chen, Jiqian @ 2023-12-04  5:38 UTC (permalink / raw)
  To: Roger Pau Monné, Stefano Stabellini
  Cc: Jan Beulich, Wei Liu, Anthony PERARD, Juergen Gross, xen-devel,
	Hildebrand, Stewart, Deucher, Alexander, Ragiadakou, Xenia,
	Stabellini, Stefano, Huang, Ray, Huang, Honglei1, Zhang, Julia,
	Chen, Jiqian

On 2023/12/1 17:03, Roger Pau Monné wrote:
> On Thu, Nov 30, 2023 at 07:09:12PM -0800, Stefano Stabellini wrote:
>> On Thu, 30 Nov 2023, Roger Pau Monné wrote:
>>> On Wed, Nov 29, 2023 at 08:02:40PM -0800, Stefano Stabellini wrote:
>>>> n Wed, 29 Nov 2023, Stefano Stabellini wrote:
>>>>> On Tue, 28 Nov 2023, Roger Pau Monné wrote:
>>>>>> On Fri, Nov 24, 2023 at 06:41:36PM +0800, Jiqian Chen wrote:
>>>>>>> In PVH dom0, it uses the linux local interrupt mechanism,
>>>>>>> when it allocs irq for a gsi, it is dynamic, and follow
>>>>>>> the principle of applying first, distributing first. And
>>>>>>> if you debug the kernel codes, you will find the irq
>>>>>>> number is alloced from small to large, but the applying
>>>>>>> gsi number is not, may gsi 38 comes before gsi 28, that
>>>>>>> causes the irq number is not equal with the gsi number.
>>>>>>> And when we passthrough a device, QEMU will use its gsi
>>>>>>> number to do mapping actions, see xen_pt_realize->
>>>>>>> xc_physdev_map_pirq, but the gsi number is got from file
>>>>>>> /sys/bus/pci/devices/xxxx:xx:xx.x/irq in current code,
>>>>>>> so it will fail when mapping.
>>>>>>> And in current codes, there is no method to translate
>>>>>>> irq to gsi for userspace.
>>>>>>
>>>>>> I think it would be cleaner to just introduce a new sysfs node that
>>>>>> contains the gsi if a device is using one (much like the irq sysfs
>>>>>> node)?
>>>>>>
>>>>>> Such ioctl to translate from IRQ to GSI has nothing to do with Xen, so
>>>>>> placing it in privcmd does seem quite strange to me.  I understand
>>>>>> that for passthrough we need the GSI, but such translation layer from
>>>>>> IRQ to GSI is all Linux internal, and it would be much simpler to just
>>>>>> expose the GSI in sysfs IMO.
>>>>>
>>>>> Maybe something to add to drivers/xen/sys-hypervisor.c in Linux.
>>>>> Juergen what do you think?
>>>>
>>>> Let me also add that privcmd.c is already a Linux specific interface.
>>>> Although it was born to be a Xen hypercall "proxy" in reality today we
>>>> have a few privcmd ioctls that don't translate into hypercalls. So I
>>>> don't think that adding IOCTL_PRIVCMD_GSI_FROM_IRQ would be a problem.
>>>
>>> Maybe not all ioctls translate to hypercalls (I guess you are
>>> referring to the IOCTL_PRIVCMD_RESTRICT ioctl), but they are specific
>>> Xen actions.  Getting the GSI used by a device has nothing do to with
>>> Xen.
>>>
>>> IMO drivers/xen/sys-hypervisor.c is also not appropriate, but I'm not
>>> the maintainer of any of those components.
>>>
>>> There's nothing Xen specific about fetching the GSI associated with a
>>> PCI device.  The fact that Xen needs it for passthrough is just a red
>>> herring, further cases where the GSI is needed might arise outside of
>>> Xen, and hence such node would better be placed in a generic
>>> location.  The right location should be /sys/bus/pci/devices/<sbdf>/gsi.
>>
>> That might be true but /sys/bus/pci/devices/<sbdf>/gsi is a non-Xen
>> generic interface and the maintainers of that portion of Linux code
>> might have a different opinion. We'll have to see.
> 
> Right, but before resorting to implement a Xen specific workaround
> let's attempt to do it the proper way :).
> 
> I cannot see why exposing the gsi on sysfs like that would be an
> issue.  There's a lot of resource information exposed on sysfs
> already, and it's a trivial node to implement.
Thanks for both of you' s suggestions. At present, it seems the result of discussion is that it needs to add a gsi sysfs. I will modify it in the next version and then add the corresponding maintainer to the review list.

> 
> Thanks, Roger.

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC XEN PATCH v2 1/3] xen/vpci: Clear all vpci status of device
  2023-11-30 14:52           ` Roger Pau Monné
@ 2023-12-04  6:57             ` Chen, Jiqian
  2023-12-04 11:10               ` Roger Pau Monné
  0 siblings, 1 reply; 37+ messages in thread
From: Chen, Jiqian @ 2023-12-04  6:57 UTC (permalink / raw)
  To: Daniel P. Smith, Roger Pau Monné
  Cc: Jan Beulich, Wei Liu, Anthony PERARD, Juergen Gross,
	Stefano Stabellini, xen-devel, Hildebrand, Stewart, Deucher,
	Alexander, Ragiadakou, Xenia, Stabellini, Stefano, Huang, Ray,
	Huang, Honglei1, Zhang, Julia, Chen, Jiqian

Hi Daniel P. Smith,

On 2023/11/30 22:52, Roger Pau Monné wrote:
> On Thu, Nov 30, 2023 at 07:39:38AM -0500, Daniel P. Smith wrote:
>> On 11/30/23 07:25, Daniel P. Smith wrote:
>>> On 11/30/23 01:22, Chen, Jiqian wrote:
>>>> Hi Roger and Daniel P. Smith,
>>>>
>>>> On 2023/11/28 22:08, Roger Pau Monné wrote:
>>>>> On Fri, Nov 24, 2023 at 06:41:34PM +0800, Jiqian Chen wrote:
>>>>>> When a device has been reset on dom0 side, the vpci on Xen
>>>>>> side won't get notification, so the cached state in vpci is
>>>>>> all out of date compare with the real device state.
>>>>>> To solve that problem, this patch add new hypercall to clear
>>>>>> all vpci device state. And when reset device happens on dom0
>>>>>> side, dom0 can call this hypercall to notify vpci.
>>>>>>
>>>>>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>>>>>> Signed-off-by: Huang Rui <ray.huang@amd.com>
>>>>>> ---
>>>>>>   xen/arch/x86/hvm/hypercall.c  |  1 +
>>>>>>   xen/drivers/passthrough/pci.c | 21 +++++++++++++++++++++
>>>>>>   xen/drivers/pci/physdev.c     | 14 ++++++++++++++
>>>>>>   xen/drivers/vpci/vpci.c       |  9 +++++++++
>>>>>>   xen/include/public/physdev.h  |  2 ++
>>>>>>   xen/include/xen/pci.h         |  1 +
>>>>>>   xen/include/xen/vpci.h        |  6 ++++++
>>>>>>   7 files changed, 54 insertions(+)
>>>>>>
>>>>>> diff --git a/xen/arch/x86/hvm/hypercall.c
>>>>>> b/xen/arch/x86/hvm/hypercall.c
>>>>>> index eeb73e1aa5..6ad5b4d5f1 100644
>>>>>> --- a/xen/arch/x86/hvm/hypercall.c
>>>>>> +++ b/xen/arch/x86/hvm/hypercall.c
>>>>>> @@ -84,6 +84,7 @@ long hvm_physdev_op(int cmd,
>>>>>> XEN_GUEST_HANDLE_PARAM(void) arg)
>>>>>>       case PHYSDEVOP_pci_mmcfg_reserved:
>>>>>>       case PHYSDEVOP_pci_device_add:
>>>>>>       case PHYSDEVOP_pci_device_remove:
>>>>>> +    case PHYSDEVOP_pci_device_state_reset:
>>>>>>       case PHYSDEVOP_dbgp_op:
>>>>>>           if ( !is_hardware_domain(currd) )
>>>>>>               return -ENOSYS;
>>>>>> diff --git a/xen/drivers/passthrough/pci.c
>>>>>> b/xen/drivers/passthrough/pci.c
>>>>>> index 04d00c7c37..f871715585 100644
>>>>>> --- a/xen/drivers/passthrough/pci.c
>>>>>> +++ b/xen/drivers/passthrough/pci.c
>>>>>> @@ -824,6 +824,27 @@ int pci_remove_device(u16 seg, u8 bus, u8 devfn)
>>>>>>       return ret;
>>>>>>   }
>>>>>> +int pci_reset_device_state(u16 seg, u8 bus, u8 devfn)
>>>>>
>>>>> You could use pci_sbdf_t here instead of 3 parameters.
>>>> Will change in next version, thank you.
>>>>
>>>>>
>>>>> I'm however unsure whether we really need this helper just to fetch
>>>>> the pdev and call vpci_reset_device_state(), I think you could place
>>>>> this logic directly in pci_physdev_op().  Unless there are plans to
>>>>> use such logic outside of pci_physdev_op().
>>>> If I place the logic of pci_reset_device_state directly in
>>>> pci_physdev_op. I think I need to declare vpci_reset_device_state in
>>>> pci.h? If it is suitable, I will change in next version.
>>>>
>>>>>
>>>>>> +{
>>>>>> +    struct pci_dev *pdev;
>>>>>> +    int ret = -ENODEV;
>>>>>
>>>>> Some XSM check should be introduced fro this operation, as none of the
>>>>> existing ones is suitable.  See xsm_resource_unplug_pci() for example.
>>>>>
>>>>> xsm_resource_reset_state_pci() or some such I would assume.
>>>> I don't know what I should do in XSM function(assume it is
>>>> xsm_resource_reset_state_pci).
>>>> Hi Daniel P. Smith, could you please give me some suggestions?
>>>> Just to check the XSM_PRIV action?
>>>>
>>>
>>> Roger, thank you for seeing this and adding me in!
>>>
>>> Jiqian, I just wanted to let you know I have seen your post but I have
>>> been a little tied up this week. Just with the cursory look, I think
>>> Roger is suggesting a new XSM check/hook is warranted.
>>>
>>> If you would like to attempt at writing the dummy policy side, mimic
>>> xsm_resource_plug_pci in xen/include/xsm/dummy.h and
>>> xen/include/xsm/xsm.h, then I can look at handling the FLASK portion
>>> next week and provide it to you for inclusion into the series. If you
>>> are not comfortable with it, I can look at the whole thing next week.
>>> Just let me know what you would be comfortable with.
>>
>> Apologies, thinking about for a moment and was thinking the hook should be
>> called xsm_resource_config_pci. I would reset as a config operation and
>> there might be new ones in the future. I do not believe there is a need to
>> have fine grain access control down to individual config operation, thus
>> they could all be captured under this one hook. Roger, what are your
>> thoughts about this, in particular how you see vpci evolving?
> 
> So the configuration space reset should only be done by the domain
> that's also capable of triggering the physical device reset, usually
> the hardware domain.  I don't think it's possible ATM to allow a
> domain different than the hardware domain to perform a PCI reset, as
> doing it requires unmediated access to the PCI config space.
> 
> So resetting the vPCI state should either be limited to the hardware
> domain, or to a pci reset capability explicitly
> (xsm_resource_reset_pci or some such?).  Or maybe
> xsm_resource_config_pci if that denotes unmediated access to the PCI
> config space.
> 
> Thanks, Roger.

Is it like below that I need to add for XSM?
diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index 9d7519eb89..81ceaf145f 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -937,6 +937,10 @@ int pci_reset_device_state(u16 seg, u8 bus, u8 devfn)
     struct pci_dev *pdev;
     int ret = -ENODEV;

+    ret = xsm_resource_config_pci(XSM_PRIV, (seg << 16) | (bus << 8) | devfn);
+    if ( ret )
+        return ret;
+
     pcidevs_lock();

     pdev = pci_get_pdev(NULL, PCI_SBDF(seg, bus, devfn));
diff --git a/xen/include/xsm/dummy.h b/xen/include/xsm/dummy.h
index 8671af1ba4..bcaff99b23 100644
--- a/xen/include/xsm/dummy.h
+++ b/xen/include/xsm/dummy.h
@@ -472,6 +472,13 @@ static XSM_INLINE int cf_check xsm_resource_setup_pci(
     return xsm_default_action(action, current->domain, NULL);
 }

+static XSM_INLINE int cf_check xsm_resource_config_pci(
+    XSM_DEFAULT_ARG uint32_t machine_bdf)
+{
+    XSM_ASSERT_ACTION(XSM_PRIV);
+    return xsm_default_action(action, current->domain, NULL);
+}
+
 static XSM_INLINE int cf_check xsm_resource_setup_gsi(XSM_DEFAULT_ARG int gsi)
 {
     XSM_ASSERT_ACTION(XSM_PRIV);
diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
index 8dad03fd3d..1cb16b00de 100644
--- a/xen/include/xsm/xsm.h
+++ b/xen/include/xsm/xsm.h
@@ -571,6 +571,12 @@ static inline int xsm_resource_setup_pci(
     return alternative_call(xsm_ops.resource_setup_pci, machine_bdf);
 }

+static inline int xsm_resource_config_pci(
+    xsm_default_t def, uint32_t machine_bdf)
+{
+    return alternative_call(xsm_ops.resource_config_pci, machine_bdf);
+}
+
 static inline int xsm_resource_setup_gsi(xsm_default_t def, int gsi)
 {
     return alternative_call(xsm_ops.resource_setup_gsi, gsi);
diff --git a/xen/xsm/dummy.c b/xen/xsm/dummy.c
index e6ffa948f7..7a289ba5d8 100644
--- a/xen/xsm/dummy.c
+++ b/xen/xsm/dummy.c
@@ -91,6 +91,7 @@ static const struct xsm_ops __initconst_cf_clobber dummy_ops = {
     .resource_plug_pci             = xsm_resource_plug_pci,
     .resource_unplug_pci           = xsm_resource_unplug_pci,
     .resource_setup_pci            = xsm_resource_setup_pci,
+    .resource_config_pci            = xsm_resource_config_pci,
     .resource_setup_gsi            = xsm_resource_setup_gsi,


-- 
Best regards,
Jiqian Chen.

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [RFC XEN PATCH v2 1/3] xen/vpci: Clear all vpci status of device
  2023-12-04  6:57             ` Chen, Jiqian
@ 2023-12-04 11:10               ` Roger Pau Monné
  2023-12-05  5:49                 ` Chen, Jiqian
  2023-12-07  1:11                 ` Daniel P. Smith
  0 siblings, 2 replies; 37+ messages in thread
From: Roger Pau Monné @ 2023-12-04 11:10 UTC (permalink / raw)
  To: Chen, Jiqian
  Cc: Daniel P. Smith, Jan Beulich, Wei Liu, Anthony PERARD,
	Juergen Gross, Stefano Stabellini, xen-devel, Hildebrand,
	Stewart, Deucher, Alexander, Ragiadakou, Xenia, Stabellini,
	Stefano, Huang, Ray, Huang, Honglei1, Zhang, Julia

On Mon, Dec 04, 2023 at 06:57:03AM +0000, Chen, Jiqian wrote:
> Hi Daniel P. Smith,
> 
> On 2023/11/30 22:52, Roger Pau Monné wrote:
> > On Thu, Nov 30, 2023 at 07:39:38AM -0500, Daniel P. Smith wrote:
> >> On 11/30/23 07:25, Daniel P. Smith wrote:
> >>> On 11/30/23 01:22, Chen, Jiqian wrote:
> >>>> Hi Roger and Daniel P. Smith,
> >>>>
> >>>> On 2023/11/28 22:08, Roger Pau Monné wrote:
> >>>>> On Fri, Nov 24, 2023 at 06:41:34PM +0800, Jiqian Chen wrote:
> >>>>>> When a device has been reset on dom0 side, the vpci on Xen
> >>>>>> side won't get notification, so the cached state in vpci is
> >>>>>> all out of date compare with the real device state.
> >>>>>> To solve that problem, this patch add new hypercall to clear
> >>>>>> all vpci device state. And when reset device happens on dom0
> >>>>>> side, dom0 can call this hypercall to notify vpci.
> >>>>>>
> >>>>>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> >>>>>> Signed-off-by: Huang Rui <ray.huang@amd.com>
> >>>>>> ---
> >>>>>>   xen/arch/x86/hvm/hypercall.c  |  1 +
> >>>>>>   xen/drivers/passthrough/pci.c | 21 +++++++++++++++++++++
> >>>>>>   xen/drivers/pci/physdev.c     | 14 ++++++++++++++
> >>>>>>   xen/drivers/vpci/vpci.c       |  9 +++++++++
> >>>>>>   xen/include/public/physdev.h  |  2 ++
> >>>>>>   xen/include/xen/pci.h         |  1 +
> >>>>>>   xen/include/xen/vpci.h        |  6 ++++++
> >>>>>>   7 files changed, 54 insertions(+)
> >>>>>>
> >>>>>> diff --git a/xen/arch/x86/hvm/hypercall.c
> >>>>>> b/xen/arch/x86/hvm/hypercall.c
> >>>>>> index eeb73e1aa5..6ad5b4d5f1 100644
> >>>>>> --- a/xen/arch/x86/hvm/hypercall.c
> >>>>>> +++ b/xen/arch/x86/hvm/hypercall.c
> >>>>>> @@ -84,6 +84,7 @@ long hvm_physdev_op(int cmd,
> >>>>>> XEN_GUEST_HANDLE_PARAM(void) arg)
> >>>>>>       case PHYSDEVOP_pci_mmcfg_reserved:
> >>>>>>       case PHYSDEVOP_pci_device_add:
> >>>>>>       case PHYSDEVOP_pci_device_remove:
> >>>>>> +    case PHYSDEVOP_pci_device_state_reset:
> >>>>>>       case PHYSDEVOP_dbgp_op:
> >>>>>>           if ( !is_hardware_domain(currd) )
> >>>>>>               return -ENOSYS;
> >>>>>> diff --git a/xen/drivers/passthrough/pci.c
> >>>>>> b/xen/drivers/passthrough/pci.c
> >>>>>> index 04d00c7c37..f871715585 100644
> >>>>>> --- a/xen/drivers/passthrough/pci.c
> >>>>>> +++ b/xen/drivers/passthrough/pci.c
> >>>>>> @@ -824,6 +824,27 @@ int pci_remove_device(u16 seg, u8 bus, u8 devfn)
> >>>>>>       return ret;
> >>>>>>   }
> >>>>>> +int pci_reset_device_state(u16 seg, u8 bus, u8 devfn)
> >>>>>
> >>>>> You could use pci_sbdf_t here instead of 3 parameters.
> >>>> Will change in next version, thank you.
> >>>>
> >>>>>
> >>>>> I'm however unsure whether we really need this helper just to fetch
> >>>>> the pdev and call vpci_reset_device_state(), I think you could place
> >>>>> this logic directly in pci_physdev_op().  Unless there are plans to
> >>>>> use such logic outside of pci_physdev_op().
> >>>> If I place the logic of pci_reset_device_state directly in
> >>>> pci_physdev_op. I think I need to declare vpci_reset_device_state in
> >>>> pci.h? If it is suitable, I will change in next version.
> >>>>
> >>>>>
> >>>>>> +{
> >>>>>> +    struct pci_dev *pdev;
> >>>>>> +    int ret = -ENODEV;
> >>>>>
> >>>>> Some XSM check should be introduced fro this operation, as none of the
> >>>>> existing ones is suitable.  See xsm_resource_unplug_pci() for example.
> >>>>>
> >>>>> xsm_resource_reset_state_pci() or some such I would assume.
> >>>> I don't know what I should do in XSM function(assume it is
> >>>> xsm_resource_reset_state_pci).
> >>>> Hi Daniel P. Smith, could you please give me some suggestions?
> >>>> Just to check the XSM_PRIV action?
> >>>>
> >>>
> >>> Roger, thank you for seeing this and adding me in!
> >>>
> >>> Jiqian, I just wanted to let you know I have seen your post but I have
> >>> been a little tied up this week. Just with the cursory look, I think
> >>> Roger is suggesting a new XSM check/hook is warranted.
> >>>
> >>> If you would like to attempt at writing the dummy policy side, mimic
> >>> xsm_resource_plug_pci in xen/include/xsm/dummy.h and
> >>> xen/include/xsm/xsm.h, then I can look at handling the FLASK portion
> >>> next week and provide it to you for inclusion into the series. If you
> >>> are not comfortable with it, I can look at the whole thing next week.
> >>> Just let me know what you would be comfortable with.
> >>
> >> Apologies, thinking about for a moment and was thinking the hook should be
> >> called xsm_resource_config_pci. I would reset as a config operation and
> >> there might be new ones in the future. I do not believe there is a need to
> >> have fine grain access control down to individual config operation, thus
> >> they could all be captured under this one hook. Roger, what are your
> >> thoughts about this, in particular how you see vpci evolving?
> > 
> > So the configuration space reset should only be done by the domain
> > that's also capable of triggering the physical device reset, usually
> > the hardware domain.  I don't think it's possible ATM to allow a
> > domain different than the hardware domain to perform a PCI reset, as
> > doing it requires unmediated access to the PCI config space.
> > 
> > So resetting the vPCI state should either be limited to the hardware
> > domain, or to a pci reset capability explicitly
> > (xsm_resource_reset_pci or some such?).  Or maybe
> > xsm_resource_config_pci if that denotes unmediated access to the PCI
> > config space.
> > 
> > Thanks, Roger.
> 
> Is it like below that I need to add for XSM?
> diff --git a/xen/xsm/dummy.c b/xen/xsm/dummy.c
> index e6ffa948f7..7a289ba5d8 100644
> --- a/xen/xsm/dummy.c
> +++ b/xen/xsm/dummy.c
> @@ -91,6 +91,7 @@ static const struct xsm_ops __initconst_cf_clobber dummy_ops = {
>      .resource_plug_pci             = xsm_resource_plug_pci,
>      .resource_unplug_pci           = xsm_resource_unplug_pci,
>      .resource_setup_pci            = xsm_resource_setup_pci,
> +    .resource_config_pci            = xsm_resource_config_pci,

Now that I look at it, using the existing xsm_resource_setup_pci might
be enough, no need to introduce a xsm_resource_config_pci.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC XEN PATCH v2 3/3] tools: Add new function to get gsi from irq
  2023-12-04  5:38               ` Chen, Jiqian
@ 2023-12-04 11:12                 ` Roger Pau Monné
  0 siblings, 0 replies; 37+ messages in thread
From: Roger Pau Monné @ 2023-12-04 11:12 UTC (permalink / raw)
  To: Chen, Jiqian
  Cc: Stefano Stabellini, Jan Beulich, Wei Liu, Anthony PERARD,
	Juergen Gross, xen-devel, Hildebrand, Stewart, Deucher,
	Alexander, Ragiadakou, Xenia, Stabellini, Stefano, Huang, Ray,
	Huang, Honglei1, Zhang, Julia

On Mon, Dec 04, 2023 at 05:38:06AM +0000, Chen, Jiqian wrote:
> On 2023/12/1 17:03, Roger Pau Monné wrote:
> > On Thu, Nov 30, 2023 at 07:09:12PM -0800, Stefano Stabellini wrote:
> >> On Thu, 30 Nov 2023, Roger Pau Monné wrote:
> >>> On Wed, Nov 29, 2023 at 08:02:40PM -0800, Stefano Stabellini wrote:
> >>>> n Wed, 29 Nov 2023, Stefano Stabellini wrote:
> >>>>> On Tue, 28 Nov 2023, Roger Pau Monné wrote:
> >>>>>> On Fri, Nov 24, 2023 at 06:41:36PM +0800, Jiqian Chen wrote:
> >>>>>>> In PVH dom0, it uses the linux local interrupt mechanism,
> >>>>>>> when it allocs irq for a gsi, it is dynamic, and follow
> >>>>>>> the principle of applying first, distributing first. And
> >>>>>>> if you debug the kernel codes, you will find the irq
> >>>>>>> number is alloced from small to large, but the applying
> >>>>>>> gsi number is not, may gsi 38 comes before gsi 28, that
> >>>>>>> causes the irq number is not equal with the gsi number.
> >>>>>>> And when we passthrough a device, QEMU will use its gsi
> >>>>>>> number to do mapping actions, see xen_pt_realize->
> >>>>>>> xc_physdev_map_pirq, but the gsi number is got from file
> >>>>>>> /sys/bus/pci/devices/xxxx:xx:xx.x/irq in current code,
> >>>>>>> so it will fail when mapping.
> >>>>>>> And in current codes, there is no method to translate
> >>>>>>> irq to gsi for userspace.
> >>>>>>
> >>>>>> I think it would be cleaner to just introduce a new sysfs node that
> >>>>>> contains the gsi if a device is using one (much like the irq sysfs
> >>>>>> node)?
> >>>>>>
> >>>>>> Such ioctl to translate from IRQ to GSI has nothing to do with Xen, so
> >>>>>> placing it in privcmd does seem quite strange to me.  I understand
> >>>>>> that for passthrough we need the GSI, but such translation layer from
> >>>>>> IRQ to GSI is all Linux internal, and it would be much simpler to just
> >>>>>> expose the GSI in sysfs IMO.
> >>>>>
> >>>>> Maybe something to add to drivers/xen/sys-hypervisor.c in Linux.
> >>>>> Juergen what do you think?
> >>>>
> >>>> Let me also add that privcmd.c is already a Linux specific interface.
> >>>> Although it was born to be a Xen hypercall "proxy" in reality today we
> >>>> have a few privcmd ioctls that don't translate into hypercalls. So I
> >>>> don't think that adding IOCTL_PRIVCMD_GSI_FROM_IRQ would be a problem.
> >>>
> >>> Maybe not all ioctls translate to hypercalls (I guess you are
> >>> referring to the IOCTL_PRIVCMD_RESTRICT ioctl), but they are specific
> >>> Xen actions.  Getting the GSI used by a device has nothing do to with
> >>> Xen.
> >>>
> >>> IMO drivers/xen/sys-hypervisor.c is also not appropriate, but I'm not
> >>> the maintainer of any of those components.
> >>>
> >>> There's nothing Xen specific about fetching the GSI associated with a
> >>> PCI device.  The fact that Xen needs it for passthrough is just a red
> >>> herring, further cases where the GSI is needed might arise outside of
> >>> Xen, and hence such node would better be placed in a generic
> >>> location.  The right location should be /sys/bus/pci/devices/<sbdf>/gsi.
> >>
> >> That might be true but /sys/bus/pci/devices/<sbdf>/gsi is a non-Xen
> >> generic interface and the maintainers of that portion of Linux code
> >> might have a different opinion. We'll have to see.
> > 
> > Right, but before resorting to implement a Xen specific workaround
> > let's attempt to do it the proper way :).
> > 
> > I cannot see why exposing the gsi on sysfs like that would be an
> > issue.  There's a lot of resource information exposed on sysfs
> > already, and it's a trivial node to implement.
> Thanks for both of you' s suggestions. At present, it seems the result of discussion is that it needs to add a gsi sysfs. I will modify it in the next version and then add the corresponding maintainer to the review list.

Thanks, please keep xen-devel on Cc if possible.  Maybe if the
suggested path is not suitable maintainers can recommend another path
where the gsi (or equivalent) node could live.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC XEN PATCH v2 1/3] xen/vpci: Clear all vpci status of device
  2023-12-04 11:10               ` Roger Pau Monné
@ 2023-12-05  5:49                 ` Chen, Jiqian
  2023-12-07  1:11                 ` Daniel P. Smith
  1 sibling, 0 replies; 37+ messages in thread
From: Chen, Jiqian @ 2023-12-05  5:49 UTC (permalink / raw)
  To: Roger Pau Monné, Daniel P. Smith
  Cc: Jan Beulich, Wei Liu, Anthony PERARD, Juergen Gross,
	Stefano Stabellini, xen-devel, Hildebrand, Stewart, Deucher,
	Alexander, Ragiadakou, Xenia, Stabellini, Stefano, Huang, Ray,
	Huang, Honglei1, Zhang, Julia, Chen, Jiqian

On 2023/12/4 19:10, Roger Pau Monné wrote:
> On Mon, Dec 04, 2023 at 06:57:03AM +0000, Chen, Jiqian wrote:
>> Hi Daniel P. Smith,
>>
>> On 2023/11/30 22:52, Roger Pau Monné wrote:
>>> On Thu, Nov 30, 2023 at 07:39:38AM -0500, Daniel P. Smith wrote:
>>>> On 11/30/23 07:25, Daniel P. Smith wrote:
>>>>> On 11/30/23 01:22, Chen, Jiqian wrote:
>>>>>> Hi Roger and Daniel P. Smith,
>>>>>>
>>>>>> On 2023/11/28 22:08, Roger Pau Monné wrote:
>>>>>>> On Fri, Nov 24, 2023 at 06:41:34PM +0800, Jiqian Chen wrote:
>>>>>>>> When a device has been reset on dom0 side, the vpci on Xen
>>>>>>>> side won't get notification, so the cached state in vpci is
>>>>>>>> all out of date compare with the real device state.
>>>>>>>> To solve that problem, this patch add new hypercall to clear
>>>>>>>> all vpci device state. And when reset device happens on dom0
>>>>>>>> side, dom0 can call this hypercall to notify vpci.
>>>>>>>>
>>>>>>>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>>>>>>>> Signed-off-by: Huang Rui <ray.huang@amd.com>
>>>>>>>> ---
>>>>>>>>   xen/arch/x86/hvm/hypercall.c  |  1 +
>>>>>>>>   xen/drivers/passthrough/pci.c | 21 +++++++++++++++++++++
>>>>>>>>   xen/drivers/pci/physdev.c     | 14 ++++++++++++++
>>>>>>>>   xen/drivers/vpci/vpci.c       |  9 +++++++++
>>>>>>>>   xen/include/public/physdev.h  |  2 ++
>>>>>>>>   xen/include/xen/pci.h         |  1 +
>>>>>>>>   xen/include/xen/vpci.h        |  6 ++++++
>>>>>>>>   7 files changed, 54 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/xen/arch/x86/hvm/hypercall.c
>>>>>>>> b/xen/arch/x86/hvm/hypercall.c
>>>>>>>> index eeb73e1aa5..6ad5b4d5f1 100644
>>>>>>>> --- a/xen/arch/x86/hvm/hypercall.c
>>>>>>>> +++ b/xen/arch/x86/hvm/hypercall.c
>>>>>>>> @@ -84,6 +84,7 @@ long hvm_physdev_op(int cmd,
>>>>>>>> XEN_GUEST_HANDLE_PARAM(void) arg)
>>>>>>>>       case PHYSDEVOP_pci_mmcfg_reserved:
>>>>>>>>       case PHYSDEVOP_pci_device_add:
>>>>>>>>       case PHYSDEVOP_pci_device_remove:
>>>>>>>> +    case PHYSDEVOP_pci_device_state_reset:
>>>>>>>>       case PHYSDEVOP_dbgp_op:
>>>>>>>>           if ( !is_hardware_domain(currd) )
>>>>>>>>               return -ENOSYS;
>>>>>>>> diff --git a/xen/drivers/passthrough/pci.c
>>>>>>>> b/xen/drivers/passthrough/pci.c
>>>>>>>> index 04d00c7c37..f871715585 100644
>>>>>>>> --- a/xen/drivers/passthrough/pci.c
>>>>>>>> +++ b/xen/drivers/passthrough/pci.c
>>>>>>>> @@ -824,6 +824,27 @@ int pci_remove_device(u16 seg, u8 bus, u8 devfn)
>>>>>>>>       return ret;
>>>>>>>>   }
>>>>>>>> +int pci_reset_device_state(u16 seg, u8 bus, u8 devfn)
>>>>>>>
>>>>>>> You could use pci_sbdf_t here instead of 3 parameters.
>>>>>> Will change in next version, thank you.
>>>>>>
>>>>>>>
>>>>>>> I'm however unsure whether we really need this helper just to fetch
>>>>>>> the pdev and call vpci_reset_device_state(), I think you could place
>>>>>>> this logic directly in pci_physdev_op().  Unless there are plans to
>>>>>>> use such logic outside of pci_physdev_op().
>>>>>> If I place the logic of pci_reset_device_state directly in
>>>>>> pci_physdev_op. I think I need to declare vpci_reset_device_state in
>>>>>> pci.h? If it is suitable, I will change in next version.
>>>>>>
>>>>>>>
>>>>>>>> +{
>>>>>>>> +    struct pci_dev *pdev;
>>>>>>>> +    int ret = -ENODEV;
>>>>>>>
>>>>>>> Some XSM check should be introduced fro this operation, as none of the
>>>>>>> existing ones is suitable.  See xsm_resource_unplug_pci() for example.
>>>>>>>
>>>>>>> xsm_resource_reset_state_pci() or some such I would assume.
>>>>>> I don't know what I should do in XSM function(assume it is
>>>>>> xsm_resource_reset_state_pci).
>>>>>> Hi Daniel P. Smith, could you please give me some suggestions?
>>>>>> Just to check the XSM_PRIV action?
>>>>>>
>>>>>
>>>>> Roger, thank you for seeing this and adding me in!
>>>>>
>>>>> Jiqian, I just wanted to let you know I have seen your post but I have
>>>>> been a little tied up this week. Just with the cursory look, I think
>>>>> Roger is suggesting a new XSM check/hook is warranted.
>>>>>
>>>>> If you would like to attempt at writing the dummy policy side, mimic
>>>>> xsm_resource_plug_pci in xen/include/xsm/dummy.h and
>>>>> xen/include/xsm/xsm.h, then I can look at handling the FLASK portion
>>>>> next week and provide it to you for inclusion into the series. If you
>>>>> are not comfortable with it, I can look at the whole thing next week.
>>>>> Just let me know what you would be comfortable with.
>>>>
>>>> Apologies, thinking about for a moment and was thinking the hook should be
>>>> called xsm_resource_config_pci. I would reset as a config operation and
>>>> there might be new ones in the future. I do not believe there is a need to
>>>> have fine grain access control down to individual config operation, thus
>>>> they could all be captured under this one hook. Roger, what are your
>>>> thoughts about this, in particular how you see vpci evolving?
>>>
>>> So the configuration space reset should only be done by the domain
>>> that's also capable of triggering the physical device reset, usually
>>> the hardware domain.  I don't think it's possible ATM to allow a
>>> domain different than the hardware domain to perform a PCI reset, as
>>> doing it requires unmediated access to the PCI config space.
>>>
>>> So resetting the vPCI state should either be limited to the hardware
>>> domain, or to a pci reset capability explicitly
>>> (xsm_resource_reset_pci or some such?).  Or maybe
>>> xsm_resource_config_pci if that denotes unmediated access to the PCI
>>> config space.
>>>
>>> Thanks, Roger.
>>
>> Is it like below that I need to add for XSM?
>> diff --git a/xen/xsm/dummy.c b/xen/xsm/dummy.c
>> index e6ffa948f7..7a289ba5d8 100644
>> --- a/xen/xsm/dummy.c
>> +++ b/xen/xsm/dummy.c
>> @@ -91,6 +91,7 @@ static const struct xsm_ops __initconst_cf_clobber dummy_ops = {
>>      .resource_plug_pci             = xsm_resource_plug_pci,
>>      .resource_unplug_pci           = xsm_resource_unplug_pci,
>>      .resource_setup_pci            = xsm_resource_setup_pci,
>> +    .resource_config_pci            = xsm_resource_config_pci,
> 
> Now that I look at it, using the existing xsm_resource_setup_pci might
> be enough, no need to introduce a xsm_resource_config_pci.
Ok, thank you. I will add xsm_resource_setup_pci check into PHYSDEVOP_pci_device_state_reset next step.

> 
> Thanks, Roger.

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC XEN PATCH v2 1/3] xen/vpci: Clear all vpci status of device
  2023-12-04 11:10               ` Roger Pau Monné
  2023-12-05  5:49                 ` Chen, Jiqian
@ 2023-12-07  1:11                 ` Daniel P. Smith
  1 sibling, 0 replies; 37+ messages in thread
From: Daniel P. Smith @ 2023-12-07  1:11 UTC (permalink / raw)
  To: Roger Pau Monné, Chen, Jiqian
  Cc: Jan Beulich, Wei Liu, Anthony PERARD, Juergen Gross,
	Stefano Stabellini, xen-devel, Hildebrand, Stewart, Deucher,
	Alexander, Ragiadakou, Xenia, Stabellini, Stefano, Huang, Ray,
	Huang, Honglei1, Zhang, Julia



V/r,
Daniel P. Smith
Apertus Solutions, LLC

On 12/4/23 06:10, Roger Pau Monné wrote:
> On Mon, Dec 04, 2023 at 06:57:03AM +0000, Chen, Jiqian wrote:
>> Hi Daniel P. Smith,
>>
>> On 2023/11/30 22:52, Roger Pau Monné wrote:
>>> On Thu, Nov 30, 2023 at 07:39:38AM -0500, Daniel P. Smith wrote:
>>>> On 11/30/23 07:25, Daniel P. Smith wrote:
>>>>> On 11/30/23 01:22, Chen, Jiqian wrote:
>>>>>> Hi Roger and Daniel P. Smith,
>>>>>>
>>>>>> On 2023/11/28 22:08, Roger Pau Monné wrote:
>>>>>>> On Fri, Nov 24, 2023 at 06:41:34PM +0800, Jiqian Chen wrote:
>>>>>>>> When a device has been reset on dom0 side, the vpci on Xen
>>>>>>>> side won't get notification, so the cached state in vpci is
>>>>>>>> all out of date compare with the real device state.
>>>>>>>> To solve that problem, this patch add new hypercall to clear
>>>>>>>> all vpci device state. And when reset device happens on dom0
>>>>>>>> side, dom0 can call this hypercall to notify vpci.
>>>>>>>>
>>>>>>>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>>>>>>>> Signed-off-by: Huang Rui <ray.huang@amd.com>
>>>>>>>> ---
>>>>>>>>    xen/arch/x86/hvm/hypercall.c  |  1 +
>>>>>>>>    xen/drivers/passthrough/pci.c | 21 +++++++++++++++++++++
>>>>>>>>    xen/drivers/pci/physdev.c     | 14 ++++++++++++++
>>>>>>>>    xen/drivers/vpci/vpci.c       |  9 +++++++++
>>>>>>>>    xen/include/public/physdev.h  |  2 ++
>>>>>>>>    xen/include/xen/pci.h         |  1 +
>>>>>>>>    xen/include/xen/vpci.h        |  6 ++++++
>>>>>>>>    7 files changed, 54 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/xen/arch/x86/hvm/hypercall.c
>>>>>>>> b/xen/arch/x86/hvm/hypercall.c
>>>>>>>> index eeb73e1aa5..6ad5b4d5f1 100644
>>>>>>>> --- a/xen/arch/x86/hvm/hypercall.c
>>>>>>>> +++ b/xen/arch/x86/hvm/hypercall.c
>>>>>>>> @@ -84,6 +84,7 @@ long hvm_physdev_op(int cmd,
>>>>>>>> XEN_GUEST_HANDLE_PARAM(void) arg)
>>>>>>>>        case PHYSDEVOP_pci_mmcfg_reserved:
>>>>>>>>        case PHYSDEVOP_pci_device_add:
>>>>>>>>        case PHYSDEVOP_pci_device_remove:
>>>>>>>> +    case PHYSDEVOP_pci_device_state_reset:
>>>>>>>>        case PHYSDEVOP_dbgp_op:
>>>>>>>>            if ( !is_hardware_domain(currd) )
>>>>>>>>                return -ENOSYS;
>>>>>>>> diff --git a/xen/drivers/passthrough/pci.c
>>>>>>>> b/xen/drivers/passthrough/pci.c
>>>>>>>> index 04d00c7c37..f871715585 100644
>>>>>>>> --- a/xen/drivers/passthrough/pci.c
>>>>>>>> +++ b/xen/drivers/passthrough/pci.c
>>>>>>>> @@ -824,6 +824,27 @@ int pci_remove_device(u16 seg, u8 bus, u8 devfn)
>>>>>>>>        return ret;
>>>>>>>>    }
>>>>>>>> +int pci_reset_device_state(u16 seg, u8 bus, u8 devfn)
>>>>>>>
>>>>>>> You could use pci_sbdf_t here instead of 3 parameters.
>>>>>> Will change in next version, thank you.
>>>>>>
>>>>>>>
>>>>>>> I'm however unsure whether we really need this helper just to fetch
>>>>>>> the pdev and call vpci_reset_device_state(), I think you could place
>>>>>>> this logic directly in pci_physdev_op().  Unless there are plans to
>>>>>>> use such logic outside of pci_physdev_op().
>>>>>> If I place the logic of pci_reset_device_state directly in
>>>>>> pci_physdev_op. I think I need to declare vpci_reset_device_state in
>>>>>> pci.h? If it is suitable, I will change in next version.
>>>>>>
>>>>>>>
>>>>>>>> +{
>>>>>>>> +    struct pci_dev *pdev;
>>>>>>>> +    int ret = -ENODEV;
>>>>>>>
>>>>>>> Some XSM check should be introduced fro this operation, as none of the
>>>>>>> existing ones is suitable.  See xsm_resource_unplug_pci() for example.
>>>>>>>
>>>>>>> xsm_resource_reset_state_pci() or some such I would assume.
>>>>>> I don't know what I should do in XSM function(assume it is
>>>>>> xsm_resource_reset_state_pci).
>>>>>> Hi Daniel P. Smith, could you please give me some suggestions?
>>>>>> Just to check the XSM_PRIV action?
>>>>>>
>>>>>
>>>>> Roger, thank you for seeing this and adding me in!
>>>>>
>>>>> Jiqian, I just wanted to let you know I have seen your post but I have
>>>>> been a little tied up this week. Just with the cursory look, I think
>>>>> Roger is suggesting a new XSM check/hook is warranted.
>>>>>
>>>>> If you would like to attempt at writing the dummy policy side, mimic
>>>>> xsm_resource_plug_pci in xen/include/xsm/dummy.h and
>>>>> xen/include/xsm/xsm.h, then I can look at handling the FLASK portion
>>>>> next week and provide it to you for inclusion into the series. If you
>>>>> are not comfortable with it, I can look at the whole thing next week.
>>>>> Just let me know what you would be comfortable with.
>>>>
>>>> Apologies, thinking about for a moment and was thinking the hook should be
>>>> called xsm_resource_config_pci. I would reset as a config operation and
>>>> there might be new ones in the future. I do not believe there is a need to
>>>> have fine grain access control down to individual config operation, thus
>>>> they could all be captured under this one hook. Roger, what are your
>>>> thoughts about this, in particular how you see vpci evolving?
>>>
>>> So the configuration space reset should only be done by the domain
>>> that's also capable of triggering the physical device reset, usually
>>> the hardware domain.  I don't think it's possible ATM to allow a
>>> domain different than the hardware domain to perform a PCI reset, as
>>> doing it requires unmediated access to the PCI config space.
>>>
>>> So resetting the vPCI state should either be limited to the hardware
>>> domain, or to a pci reset capability explicitly
>>> (xsm_resource_reset_pci or some such?).  Or maybe
>>> xsm_resource_config_pci if that denotes unmediated access to the PCI
>>> config space.
>>>
>>> Thanks, Roger.
>>
>> Is it like below that I need to add for XSM?
>> diff --git a/xen/xsm/dummy.c b/xen/xsm/dummy.c
>> index e6ffa948f7..7a289ba5d8 100644
>> --- a/xen/xsm/dummy.c
>> +++ b/xen/xsm/dummy.c
>> @@ -91,6 +91,7 @@ static const struct xsm_ops __initconst_cf_clobber dummy_ops = {
>>       .resource_plug_pci             = xsm_resource_plug_pci,
>>       .resource_unplug_pci           = xsm_resource_unplug_pci,
>>       .resource_setup_pci            = xsm_resource_setup_pci,
>> +    .resource_config_pci            = xsm_resource_config_pci,
> 
> Now that I look at it, using the existing xsm_resource_setup_pci might
> be enough, no need to introduce a xsm_resource_config_pci.

Ack.


^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2023-12-07  1:12 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-11-24 10:41 [RFC XEN PATCH v2 0/3] Support device passthrough when dom0 is PVH on Xen Jiqian Chen
2023-11-24 10:41 ` [RFC XEN PATCH v2 1/3] xen/vpci: Clear all vpci status of device Jiqian Chen
2023-11-28 14:08   ` Roger Pau Monné
2023-11-30  6:22     ` Chen, Jiqian
2023-11-30 11:52       ` Roger Pau Monné
2023-11-30 11:55         ` Chen, Jiqian
2023-11-30 12:25       ` Daniel P. Smith
2023-11-30 12:39         ` Daniel P. Smith
2023-11-30 14:52           ` Roger Pau Monné
2023-12-04  6:57             ` Chen, Jiqian
2023-12-04 11:10               ` Roger Pau Monné
2023-12-05  5:49                 ` Chen, Jiqian
2023-12-07  1:11                 ` Daniel P. Smith
2023-11-30 13:03         ` Chen, Jiqian
2023-11-24 10:41 ` [RFC XEN PATCH v2 2/3] x86/pvh: Open PHYSDEVOP_map_pirq for PVH dom0 Jiqian Chen
2023-11-28 14:17   ` Roger Pau Monné
2023-11-30  6:32     ` Chen, Jiqian
2023-11-30 15:13       ` Roger Pau Monné
2023-11-28 15:14   ` Jan Beulich
2023-11-30  6:44     ` Chen, Jiqian
2023-11-30  8:38       ` Chen, Jiqian
2023-11-30  9:00       ` Jan Beulich
2023-11-30  9:43         ` Chen, Jiqian
2023-11-24 10:41 ` [RFC XEN PATCH v2 3/3] tools: Add new function to get gsi from irq Jiqian Chen
2023-11-28 14:25   ` Roger Pau Monné
2023-11-28 14:42     ` Juergen Gross
2023-11-28 16:11       ` Roger Pau Monné
2023-11-28 16:22         ` Juergen Gross
2023-11-30  6:57         ` Chen, Jiqian
2023-11-30  3:38     ` Stefano Stabellini
2023-11-30  4:02       ` Stefano Stabellini
2023-11-30 10:06         ` Roger Pau Monné
2023-12-01  3:09           ` Stefano Stabellini
2023-12-01  9:03             ` Roger Pau Monné
2023-12-04  5:38               ` Chen, Jiqian
2023-12-04 11:12                 ` Roger Pau Monné
2023-11-30  6:53     ` Chen, Jiqian

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).