All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC XEN PATCH v4 0/5] Support device passthrough when dom0 is PVH on Xen
@ 2024-01-05  7:09 Jiqian Chen
  2024-01-05  7:09 ` [RFC XEN PATCH v4 1/5] xen/vpci: Clear all vpci status of device Jiqian Chen
                   ` (4 more replies)
  0 siblings, 5 replies; 35+ messages in thread
From: Jiqian Chen @ 2024-01-05  7:09 UTC (permalink / raw)
  To: xen-devel
  Cc: Jan Beulich, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Julien Grall, Stefano Stabellini,
	Anthony PERARD, Juergen Gross, Stewart Hildebrand, Huang Rui,
	Jiqian Chen

Hi All,
This is v4 series to support passthrough when dom0 is PVH
v3->v4 changes:
* patch#1: change the comment of PHYSDEVOP_pci_device_state_reset; move printings behind pcidevs_unlock
* patch#2: add check to prevent PVH self map
* patch#3: new patch, The implementation of adding PHYSDEVOP_setup_gsi for PVH is treated as a separate patch
* patch#4: new patch to solve the map_pirq problem of PVH dom0. use gsi to grant irq permission in XEN_DOMCTL_irq_permission.
* patch#5: to be compatible with previous kernel versions, when there is no gsi sysfs, still use irq

v2->v3 changes:
* patch#1: move the content out of pci_reset_device_state and delete pci_reset_device_state; add xsm_resource_setup_pci check for PHYSDEVOP_pci_device_state_reset; add description for PHYSDEVOP_pci_device_state_reset;
* patch#2: du to changes in the implementation of the second patch on kernel side(that it will do setup_gsi and map_pirq when assigning a device to passthrough), add PHYSDEVOP_setup_gsi for PVH dom0, and we need to support self mapping.
* patch#3: du to changes in the implementation of the second patch on kernel side(that adds a new sysfs for gsi instead of a new syscall), so read gsi number from the sysfs of gsi.
v3 link:
https://lore.kernel.org/xen-devel/20231210164009.1551147-1-Jiqian.Chen@amd.com/T/#t

v2 link:
https://lore.kernel.org/xen-devel/20231124104136.3263722-1-Jiqian.Chen@amd.com/T/#t
Below is the description of v2 cover letter:
This series of patches are the v2 of the implementation of passthrough when dom0 is PVH on Xen.
We sent the v1 to upstream before, but the v1 had so many problems and we got lots of suggestions.
I will introduce all issues that these patches try to fix and the differences between v1 and v2.

Issues we encountered:
1. pci_stub failed to write bar for a passthrough device.
Problem: when we run “sudo xl pci-assignable-add <sbdf>” to assign a device, pci_stub will call “pcistub_init_device() -> pci_restore_state() -> pci_restore_config_space() ->
pci_restore_config_space_range() -> pci_restore_config_dword() -> pci_write_config_dword()”, the pci config write will trigger an io interrupt to bar_write() in the xen, but the
bar->enabled was set before, the write is not allowed now, and then when 
bar->Qemu config the
passthrough device in xen_pt_realize(), it gets invalid bar values.

Reason: the reason is that we don't tell vPCI that the device has been reset, so the current cached state in pdev->vpci is all out of date and is different from the real device state.

Solution: to solve this problem, the first patch of kernel(xen/pci: Add xen_reset_device_state
function) and the fist patch of xen(xen/vpci: Clear all vpci status of device) add a new hypercall to reset the state stored in vPCI when the state of real device has changed.
Thank Roger for the suggestion of this v2, and it is different from v1 (https://lore.kernel.org/xen-devel/20230312075455.450187-3-ray.huang@amd.com/), v1 simply allow domU to write pci bar, it does not comply with the design principles of vPCI.

2. failed to do PHYSDEVOP_map_pirq when dom0 is PVH
Problem: HVM domU will do PHYSDEVOP_map_pirq for a passthrough device by using gsi. See xen_pt_realize->xc_physdev_map_pirq and pci_add_dm_done->xc_physdev_map_pirq. Then xc_physdev_map_pirq will call into Xen, but in hvm_physdev_op(), PHYSDEVOP_map_pirq is not allowed.

Reason: In hvm_physdev_op(), the variable "currd" is PVH dom0 and PVH has no X86_EMU_USE_PIRQ flag, it will fail at has_pirq check.

Solution: I think we may need to allow PHYSDEVOP_map_pirq when "currd" is dom0 (at present dom0 is PVH). The second patch of xen(x86/pvh: Open PHYSDEVOP_map_pirq for PVH dom0) allow PVH dom0 do PHYSDEVOP_map_pirq. This v2 patch is better than v1, v1 simply remove the has_pirq check(xen https://lore.kernel.org/xen-devel/20230312075455.450187-4-ray.huang@amd.com/).

3. the gsi of a passthrough device doesn't be unmasked
 3.1 failed to check the permission of pirq
 3.2 the gsi of passthrough device was not registered in PVH dom0

Problem:
3.1 callback function pci_add_dm_done() will be called when qemu config a passthrough device for domU.
This function will call xc_domain_irq_permission()-> pirq_access_permitted() to check if the gsi has corresponding mappings in dom0. But it didn’t, so failed. See XEN_DOMCTL_irq_permission->pirq_access_permitted, "current" is PVH dom0 and it return irq is 0.
3.2 it's possible for a gsi (iow: vIO-APIC pin) to never get registered on PVH dom0, because the devices of PVH are using MSI(-X) interrupts. However, the IO-APIC pin must be configured for it to be able to be mapped into a domU.

Reason: After searching codes, I find "map_pirq" and "register_gsi" will be done in function vioapic_write_redirent->vioapic_hwdom_map_gsi when the gsi(aka ioapic's pin) is unmasked in PVH dom0.
So the two problems can be concluded to that the gsi of a passthrough device doesn't be unmasked.

Solution: to solve these problems, the second patch of kernel(xen/pvh: Unmask irq for passthrough device in PVH dom0) call the unmask_irq() when we assign a device to be passthrough. So that passthrough devices can have the mapping of gsi on PVH dom0 and gsi can be registered. This v2 patch is different from the v1( kernel https://lore.kernel.org/xen-devel/20230312120157.452859-5-ray.huang@amd.com/,
kernel https://lore.kernel.org/xen-devel/20230312120157.452859-5-ray.huang@amd.com/ and xen https://lore.kernel.org/xen-devel/20230312075455.450187-5-ray.huang@amd.com/),
v1 performed "map_pirq" and "register_gsi" on all pci devices on PVH dom0, which is unnecessary and may cause multiple registration.

4. failed to map pirq for gsi
Problem: qemu will call xc_physdev_map_pirq() to map a passthrough device’s gsi to pirq in function xen_pt_realize(). But failed.

Reason: According to the implement of xc_physdev_map_pirq(), it needs gsi instead of irq, but qemu pass irq to it and treat irq as gsi, it is got from file /sys/bus/pci/devices/xxxx:xx:xx.x/irq in function xen_host_pci_device_get(). But actually the gsi number is not equal with irq. On PVH dom0, when it allocates irq for a gsi in function acpi_register_gsi_ioapic(), allocation is dynamic, and follow the principle of applying first, distributing first. And if you debug the kernel codes(see function __irq_alloc_descs), you will find the irq number is allocated from small to large by order, but the applying gsi number is not, gsi 38 may come before gsi 28, that causes gsi 38 get a smaller irq number than gsi 28, and then gsi != irq.

Solution: we can record the relation between gsi and irq, then when userspace(qemu) want to use gsi, we can do a translation. The third patch of kernel(xen/privcmd: Add new syscall to get gsi from irq) records all the relations in acpi_register_gsi_xen_pvh() when dom0 initialize pci devices, and provide a syscall for userspace to get the gsi from irq. The third patch of xen(tools: Add new function to get gsi from irq) add a new function xc_physdev_gsi_from_irq() to call the new syscall added on kernel side.
And then userspace can use that function to get gsi. Then xc_physdev_map_pirq() will success. This v2 patch is the same as v1( kernel https://lore.kernel.org/xen-devel/20230312120157.452859-6-ray.huang@amd.com/ and xen https://lore.kernel.org/xen-devel/20230312075455.450187-6-ray.huang@amd.com/)

About the v2 patch of qemu, just change an included head file, other are similar to the v1 ( qemu https://lore.kernel.org/xen-devel/20230312092244.451465-19-ray.huang@amd.com/), just call
xc_physdev_gsi_from_irq() to get gsi from irq.

Jiqian Chen (5):
  xen/vpci: Clear all vpci status of device
  x86/pvh: Allow (un)map_pirq when caller isn't DOMID_SELF
  x86/pvh: Add PHYSDEVOP_setup_gsi for PVH dom0
  domctl: Use gsi to grant/revoke irq permission
  libxl: Use gsi instead of irq for mapping pirq

 tools/libs/light/libxl_pci.c | 21 +++++++++++++++++----
 tools/libs/light/libxl_x86.c |  5 ++++-
 xen/arch/x86/hvm/hypercall.c | 34 ++++++++++++++++++++++++++++++++--
 xen/common/domctl.c          | 12 ++++++++++--
 xen/drivers/pci/physdev.c    | 34 ++++++++++++++++++++++++++++++++++
 xen/drivers/vpci/vpci.c      |  9 +++++++++
 xen/include/public/physdev.h |  7 +++++++
 xen/include/xen/vpci.h       |  6 ++++++
 8 files changed, 119 insertions(+), 9 deletions(-)

-- 
2.34.1



^ permalink raw reply	[flat|nested] 35+ messages in thread

* [RFC XEN PATCH v4 1/5] xen/vpci: Clear all vpci status of device
  2024-01-05  7:09 [RFC XEN PATCH v4 0/5] Support device passthrough when dom0 is PVH on Xen Jiqian Chen
@ 2024-01-05  7:09 ` Jiqian Chen
  2024-01-09 15:24   ` Stewart Hildebrand
  2024-01-05  7:09 ` [RFC XEN PATCH v4 2/5] x86/pvh: Allow (un)map_pirq when caller isn't DOMID_SELF Jiqian Chen
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 35+ messages in thread
From: Jiqian Chen @ 2024-01-05  7:09 UTC (permalink / raw)
  To: xen-devel
  Cc: Jan Beulich, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Julien Grall, Stefano Stabellini,
	Anthony PERARD, Juergen Gross, Stewart Hildebrand, Huang Rui,
	Jiqian Chen, Huang Rui

When a device has been reset on dom0 side, the vpci on Xen
side won't get notification, so the cached state in vpci is
all out of date compare with the real device state.
To solve that problem, add a new hypercall to clear all vpci
device state. When the state of device is reset on dom0 side,
dom0 can call this hypercall to notify vpci.

Co-developed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
---
 xen/arch/x86/hvm/hypercall.c |  1 +
 xen/drivers/pci/physdev.c    | 34 ++++++++++++++++++++++++++++++++++
 xen/drivers/vpci/vpci.c      |  9 +++++++++
 xen/include/public/physdev.h |  7 +++++++
 xen/include/xen/vpci.h       |  6 ++++++
 5 files changed, 57 insertions(+)

diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c
index eeb73e1aa5d0..6ad5b4d5f11f 100644
--- a/xen/arch/x86/hvm/hypercall.c
+++ b/xen/arch/x86/hvm/hypercall.c
@@ -84,6 +84,7 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
     case PHYSDEVOP_pci_mmcfg_reserved:
     case PHYSDEVOP_pci_device_add:
     case PHYSDEVOP_pci_device_remove:
+    case PHYSDEVOP_pci_device_state_reset:
     case PHYSDEVOP_dbgp_op:
         if ( !is_hardware_domain(currd) )
             return -ENOSYS;
diff --git a/xen/drivers/pci/physdev.c b/xen/drivers/pci/physdev.c
index 42db3e6d133c..552ccbf747cb 100644
--- a/xen/drivers/pci/physdev.c
+++ b/xen/drivers/pci/physdev.c
@@ -2,6 +2,7 @@
 #include <xen/guest_access.h>
 #include <xen/hypercall.h>
 #include <xen/init.h>
+#include <xen/vpci.h>
 
 #ifndef COMPAT
 typedef long ret_t;
@@ -67,6 +68,39 @@ ret_t pci_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
         break;
     }
 
+    case PHYSDEVOP_pci_device_state_reset: {
+        struct physdev_pci_device dev;
+        struct pci_dev *pdev;
+        pci_sbdf_t sbdf;
+
+        if ( !is_pci_passthrough_enabled() )
+            return -EOPNOTSUPP;
+
+        ret = -EFAULT;
+        if ( copy_from_guest(&dev, arg, 1) != 0 )
+            break;
+        sbdf = PCI_SBDF(dev.seg, dev.bus, dev.devfn);
+
+        ret = xsm_resource_setup_pci(XSM_PRIV, sbdf.sbdf);
+        if ( ret )
+            break;
+
+        pcidevs_lock();
+        pdev = pci_get_pdev(NULL, sbdf);
+        if ( !pdev )
+        {
+            pcidevs_unlock();
+            ret = -ENODEV;
+            break;
+        }
+
+        ret = vpci_reset_device_state(pdev);
+        pcidevs_unlock();
+        if ( ret )
+            printk(XENLOG_ERR "%pp: failed to reset PCI device state\n", &sbdf);
+        break;
+    }
+
     default:
         ret = -ENOSYS;
         break;
diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
index 72ef277c4f8e..3c64cb10ccbb 100644
--- a/xen/drivers/vpci/vpci.c
+++ b/xen/drivers/vpci/vpci.c
@@ -107,6 +107,15 @@ int vpci_add_handlers(struct pci_dev *pdev)
 
     return rc;
 }
+
+int vpci_reset_device_state(struct pci_dev *pdev)
+{
+    ASSERT(pcidevs_locked());
+
+    vpci_remove_device(pdev);
+    return vpci_add_handlers(pdev);
+}
+
 #endif /* __XEN__ */
 
 static int vpci_register_cmp(const struct vpci_register *r1,
diff --git a/xen/include/public/physdev.h b/xen/include/public/physdev.h
index f0c0d4727c0b..f5bab1f29779 100644
--- a/xen/include/public/physdev.h
+++ b/xen/include/public/physdev.h
@@ -296,6 +296,13 @@ DEFINE_XEN_GUEST_HANDLE(physdev_pci_device_add_t);
  */
 #define PHYSDEVOP_prepare_msix          30
 #define PHYSDEVOP_release_msix          31
+/*
+ * Notify the hypervisor that a PCI device has been reset, so that any
+ * internally cached state is regenerated.  Should be called after any
+ * device reset performed by the hardware domain.
+ */
+#define PHYSDEVOP_pci_device_state_reset     32
+
 struct physdev_pci_device {
     /* IN */
     uint16_t seg;
diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
index d20c301a3db3..6ec83ce9ae13 100644
--- a/xen/include/xen/vpci.h
+++ b/xen/include/xen/vpci.h
@@ -30,6 +30,7 @@ int __must_check vpci_add_handlers(struct pci_dev *pdev);
 
 /* Remove all handlers and free vpci related structures. */
 void vpci_remove_device(struct pci_dev *pdev);
+int __must_check vpci_reset_device_state(struct pci_dev *pdev);
 
 /* Add/remove a register handler. */
 int __must_check vpci_add_register_mask(struct vpci *vpci,
@@ -262,6 +263,11 @@ static inline int vpci_add_handlers(struct pci_dev *pdev)
 
 static inline void vpci_remove_device(struct pci_dev *pdev) { }
 
+static inline int __must_check vpci_reset_device_state(struct pci_dev *pdev)
+{
+    return 0;
+}
+
 static inline void vpci_dump_msi(void) { }
 
 static inline uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg,
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [RFC XEN PATCH v4 2/5] x86/pvh: Allow (un)map_pirq when caller isn't DOMID_SELF
  2024-01-05  7:09 [RFC XEN PATCH v4 0/5] Support device passthrough when dom0 is PVH on Xen Jiqian Chen
  2024-01-05  7:09 ` [RFC XEN PATCH v4 1/5] xen/vpci: Clear all vpci status of device Jiqian Chen
@ 2024-01-05  7:09 ` Jiqian Chen
  2024-01-06  0:46   ` Stefano Stabellini
  2024-01-05  7:09 ` [RFC XEN PATCH v4 3/5] x86/pvh: Add PHYSDEVOP_setup_gsi for PVH dom0 Jiqian Chen
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 35+ messages in thread
From: Jiqian Chen @ 2024-01-05  7:09 UTC (permalink / raw)
  To: xen-devel
  Cc: Jan Beulich, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Julien Grall, Stefano Stabellini,
	Anthony PERARD, Juergen Gross, Stewart Hildebrand, Huang Rui,
	Jiqian Chen, Huang Rui

If run Xen with PVH dom0 and hvm domU, hvm will map a pirq for
a passthrough device by using gsi, see
xen_pt_realize->xc_physdev_map_pirq and
pci_add_dm_done->xc_physdev_map_pirq. Then xc_physdev_map_pirq
will call into Xen, but in hvm_physdev_op, PHYSDEVOP_map_pirq
is not allowed because currd is PVH dom0 and PVH has no
X86_EMU_USE_PIRQ flag, it will fail at has_pirq check.

So, allow PHYSDEVOP_map_pirq when domid of the caller is not
DOMID_SELF no matter whether currd has X86_EMU_USE_PIRQ flag
and also allow PHYSDEVOP_unmap_pirq for the failed path to
unmap pirq.

Co-developed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
---
 xen/arch/x86/hvm/hypercall.c | 27 +++++++++++++++++++++++++--
 1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c
index 6ad5b4d5f11f..632a68be3cc4 100644
--- a/xen/arch/x86/hvm/hypercall.c
+++ b/xen/arch/x86/hvm/hypercall.c
@@ -10,6 +10,7 @@
 #include <xen/hypercall.h>
 #include <xen/ioreq.h>
 #include <xen/nospec.h>
+#include <xen/guest_access.h>
 
 #include <asm/hvm/emulate.h>
 #include <asm/hvm/support.h>
@@ -72,8 +73,30 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 
     switch ( cmd )
     {
-    case PHYSDEVOP_map_pirq:
-    case PHYSDEVOP_unmap_pirq:
+    case PHYSDEVOP_map_pirq: {
+        physdev_map_pirq_t map;
+
+        if ( copy_from_guest(&map, arg, 1) != 0 )
+            return -EFAULT;
+
+        if ( !has_pirq(currd) && map.domid == DOMID_SELF )
+            return -ENOSYS;
+
+        break;
+    }
+
+    case PHYSDEVOP_unmap_pirq: {
+        physdev_unmap_pirq_t unmap;
+
+        if ( copy_from_guest(&unmap, arg, 1) != 0 )
+            return -EFAULT;
+
+        if ( !has_pirq(currd) && unmap.domid == DOMID_SELF )
+            return -ENOSYS;
+
+        break;
+    }
+
     case PHYSDEVOP_eoi:
     case PHYSDEVOP_irq_status_query:
     case PHYSDEVOP_get_free_pirq:
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [RFC XEN PATCH v4 3/5] x86/pvh: Add PHYSDEVOP_setup_gsi for PVH dom0
  2024-01-05  7:09 [RFC XEN PATCH v4 0/5] Support device passthrough when dom0 is PVH on Xen Jiqian Chen
  2024-01-05  7:09 ` [RFC XEN PATCH v4 1/5] xen/vpci: Clear all vpci status of device Jiqian Chen
  2024-01-05  7:09 ` [RFC XEN PATCH v4 2/5] x86/pvh: Allow (un)map_pirq when caller isn't DOMID_SELF Jiqian Chen
@ 2024-01-05  7:09 ` Jiqian Chen
  2024-01-06  0:54   ` Stefano Stabellini
  2024-01-05  7:09 ` [RFC XEN PATCH v4 4/5] domctl: Use gsi to grant/revoke irq permission Jiqian Chen
  2024-01-05  7:09 ` [RFC XEN PATCH v4 5/5] libxl: Use gsi instead of irq for mapping pirq Jiqian Chen
  4 siblings, 1 reply; 35+ messages in thread
From: Jiqian Chen @ 2024-01-05  7:09 UTC (permalink / raw)
  To: xen-devel
  Cc: Jan Beulich, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Julien Grall, Stefano Stabellini,
	Anthony PERARD, Juergen Gross, Stewart Hildebrand, Huang Rui,
	Jiqian Chen, Huang Rui

On PVH dom0, the gsis don't get registered, but
the gsi of a passthrough device must be configured for it to
be able to be mapped into a hvm domU.
On Linux kernel side, it calles PHYSDEVOP_setup_gsi for
passthrough devices to register gsi when dom0 is PVH.
So, add PHYSDEVOP_setup_gsi for above purpose.

Co-developed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
---
 xen/arch/x86/hvm/hypercall.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c
index 632a68be3cc4..e27d3ca15185 100644
--- a/xen/arch/x86/hvm/hypercall.c
+++ b/xen/arch/x86/hvm/hypercall.c
@@ -97,6 +97,12 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
         break;
     }
 
+    case PHYSDEVOP_setup_gsi:
+        if ( is_hardware_domain(currd) && !has_pirq(currd) )
+            break;
+        else
+            return -ENOSYS;
+
     case PHYSDEVOP_eoi:
     case PHYSDEVOP_irq_status_query:
     case PHYSDEVOP_get_free_pirq:
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [RFC XEN PATCH v4 4/5] domctl: Use gsi to grant/revoke irq permission
  2024-01-05  7:09 [RFC XEN PATCH v4 0/5] Support device passthrough when dom0 is PVH on Xen Jiqian Chen
                   ` (2 preceding siblings ...)
  2024-01-05  7:09 ` [RFC XEN PATCH v4 3/5] x86/pvh: Add PHYSDEVOP_setup_gsi for PVH dom0 Jiqian Chen
@ 2024-01-05  7:09 ` Jiqian Chen
  2024-01-06  1:08   ` Stefano Stabellini
  2024-01-10 20:09   ` Stewart Hildebrand
  2024-01-05  7:09 ` [RFC XEN PATCH v4 5/5] libxl: Use gsi instead of irq for mapping pirq Jiqian Chen
  4 siblings, 2 replies; 35+ messages in thread
From: Jiqian Chen @ 2024-01-05  7:09 UTC (permalink / raw)
  To: xen-devel
  Cc: Jan Beulich, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Julien Grall, Stefano Stabellini,
	Anthony PERARD, Juergen Gross, Stewart Hildebrand, Huang Rui,
	Jiqian Chen, Huang Rui

Some type of domain don't have PIRQ, like PVH, current
implementation is not suitable for those domain.

When passthrough a device to guest on PVH dom0, this
pci_add_dm_done->XEN_DOMCTL_irq_permission will failed
at domain_pirq_to_irq.

So, change it to use gsi to grant/revoke irq permission.
And change the caller of XEN_DOMCTL_irq_permission to
pass gsi to it instead of pirq.

Co-developed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
---
 tools/libs/light/libxl_pci.c |  6 ++++--
 tools/libs/light/libxl_x86.c |  5 ++++-
 xen/common/domctl.c          | 12 ++++++++++--
 3 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
index 96cb4da0794e..e1341d1e9850 100644
--- a/tools/libs/light/libxl_pci.c
+++ b/tools/libs/light/libxl_pci.c
@@ -1418,6 +1418,7 @@ static void pci_add_dm_done(libxl__egc *egc,
     unsigned long long start, end, flags, size;
     int irq, i;
     int r;
+    int gsi;
     uint32_t flag = XEN_DOMCTL_DEV_RDM_RELAXED;
     uint32_t domainid = domid;
     bool isstubdom = libxl_is_stubdom(ctx, domid, &domainid);
@@ -1486,6 +1487,7 @@ static void pci_add_dm_done(libxl__egc *egc,
         goto out_no_irq;
     }
     if ((fscanf(f, "%u", &irq) == 1) && irq) {
+        gsi = irq;
         r = xc_physdev_map_pirq(ctx->xch, domid, irq, &irq);
         if (r < 0) {
             LOGED(ERROR, domainid, "xc_physdev_map_pirq irq=%d (error=%d)",
@@ -1494,10 +1496,10 @@ static void pci_add_dm_done(libxl__egc *egc,
             rc = ERROR_FAIL;
             goto out;
         }
-        r = xc_domain_irq_permission(ctx->xch, domid, irq, 1);
+        r = xc_domain_irq_permission(ctx->xch, domid, gsi, 1);
         if (r < 0) {
             LOGED(ERROR, domainid,
-                  "xc_domain_irq_permission irq=%d (error=%d)", irq, r);
+                  "xc_domain_irq_permission irq=%d (error=%d)", gsi, r);
             fclose(f);
             rc = ERROR_FAIL;
             goto out;
diff --git a/tools/libs/light/libxl_x86.c b/tools/libs/light/libxl_x86.c
index d16573e72cd4..88ad50cf6360 100644
--- a/tools/libs/light/libxl_x86.c
+++ b/tools/libs/light/libxl_x86.c
@@ -654,12 +654,15 @@ out:
 int libxl__arch_domain_map_irq(libxl__gc *gc, uint32_t domid, int irq)
 {
     int ret;
+    int gsi;
+
+    gsi = irq;
 
     ret = xc_physdev_map_pirq(CTX->xch, domid, irq, &irq);
     if (ret)
         return ret;
 
-    ret = xc_domain_irq_permission(CTX->xch, domid, irq, 1);
+    ret = xc_domain_irq_permission(CTX->xch, domid, gsi, 1);
 
     return ret;
 }
diff --git a/xen/common/domctl.c b/xen/common/domctl.c
index f5a71ee5f78d..eeb975bd0194 100644
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -653,12 +653,20 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
         unsigned int pirq = op->u.irq_permission.pirq, irq;
         int allow = op->u.irq_permission.allow_access;
 
-        if ( pirq >= current->domain->nr_pirqs )
+        if ( pirq >= nr_irqs_gsi )
         {
             ret = -EINVAL;
             break;
         }
-        irq = pirq_access_permitted(current->domain, pirq);
+
+        if ( irq_access_permitted(current->domain, pirq) )
+            irq = pirq;
+        else
+        {
+            ret = -EPERM;
+            break;
+        }
+
         if ( !irq || xsm_irq_permission(XSM_HOOK, d, irq, allow) )
             ret = -EPERM;
         else if ( allow )
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [RFC XEN PATCH v4 5/5] libxl: Use gsi instead of irq for mapping pirq
  2024-01-05  7:09 [RFC XEN PATCH v4 0/5] Support device passthrough when dom0 is PVH on Xen Jiqian Chen
                   ` (3 preceding siblings ...)
  2024-01-05  7:09 ` [RFC XEN PATCH v4 4/5] domctl: Use gsi to grant/revoke irq permission Jiqian Chen
@ 2024-01-05  7:09 ` Jiqian Chen
  2024-01-05  7:36   ` Jan Beulich
  2024-01-06  1:11   ` Stefano Stabellini
  4 siblings, 2 replies; 35+ messages in thread
From: Jiqian Chen @ 2024-01-05  7:09 UTC (permalink / raw)
  To: xen-devel
  Cc: Jan Beulich, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Julien Grall, Stefano Stabellini,
	Anthony PERARD, Juergen Gross, Stewart Hildebrand, Huang Rui,
	Jiqian Chen, Huang Rui

In PVH dom0, it uses the linux local interrupt mechanism,
when it allocs irq for a gsi, it is dynamic, and follow
the principle of applying first, distributing first. And
the irq number is alloced from small to large, but the
applying gsi number is not, may gsi 38 comes before gsi
28, that causes the irq number is not equal with the gsi
number. And when passthrough a device, xl wants to use
gsi to map pirq, see pci_add_dm_done->xc_physdev_map_pirq,
but the gsi number is got from file
/sys/bus/pci/devices/<sbdf>/irq in current code, so it
will fail when mapping.

So, use real gsi number read from gsi sysfs.

Co-developed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
---
 tools/libs/light/libxl_pci.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
index e1341d1e9850..ab51dc099725 100644
--- a/tools/libs/light/libxl_pci.c
+++ b/tools/libs/light/libxl_pci.c
@@ -1479,8 +1479,14 @@ static void pci_add_dm_done(libxl__egc *egc,
     fclose(f);
     if (!pci_supp_legacy_irq())
         goto out_no_irq;
-    sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/irq", pci->domain,
+    sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/gsi", pci->domain,
                                 pci->bus, pci->dev, pci->func);
+
+    if (access(sysfs_path, F_OK) != 0) {
+        sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/irq", pci->domain,
+                                pci->bus, pci->dev, pci->func);
+    }
+
     f = fopen(sysfs_path, "r");
     if (f == NULL) {
         LOGED(ERROR, domainid, "Couldn't open %s", sysfs_path);
@@ -2231,9 +2237,14 @@ skip_bar:
     if (!pci_supp_legacy_irq())
         goto skip_legacy_irq;
 
-    sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/irq", pci->domain,
+    sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/gsi", pci->domain,
                            pci->bus, pci->dev, pci->func);
 
+    if (access(sysfs_path, F_OK) != 0) {
+        sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/irq", pci->domain,
+                           pci->bus, pci->dev, pci->func);
+    }
+
     f = fopen(sysfs_path, "r");
     if (f == NULL) {
         LOGED(ERROR, domid, "Couldn't open %s", sysfs_path);
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [RFC XEN PATCH v4 5/5] libxl: Use gsi instead of irq for mapping pirq
  2024-01-05  7:09 ` [RFC XEN PATCH v4 5/5] libxl: Use gsi instead of irq for mapping pirq Jiqian Chen
@ 2024-01-05  7:36   ` Jan Beulich
  2024-01-05  7:59     ` Chen, Jiqian
  2024-01-06  1:11   ` Stefano Stabellini
  1 sibling, 1 reply; 35+ messages in thread
From: Jan Beulich @ 2024-01-05  7:36 UTC (permalink / raw)
  To: Jiqian Chen
  Cc: Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Julien Grall, Stefano Stabellini,
	Anthony PERARD, Juergen Gross, Stewart Hildebrand, Huang Rui,
	xen-devel

On 05.01.2024 08:09, Jiqian Chen wrote:
> --- a/tools/libs/light/libxl_pci.c
> +++ b/tools/libs/light/libxl_pci.c
> @@ -1479,8 +1479,14 @@ static void pci_add_dm_done(libxl__egc *egc,
>      fclose(f);
>      if (!pci_supp_legacy_irq())
>          goto out_no_irq;
> -    sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/irq", pci->domain,
> +    sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/gsi", pci->domain,
>                                  pci->bus, pci->dev, pci->func);
> +
> +    if (access(sysfs_path, F_OK) != 0) {

I suppose you want to also check errno, and fall back only when you got
back (I guess) ENOENT.

Jan

> +        sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/irq", pci->domain,
> +                                pci->bus, pci->dev, pci->func);
> +    }
> +
>      f = fopen(sysfs_path, "r");
>      if (f == NULL) {
>          LOGED(ERROR, domainid, "Couldn't open %s", sysfs_path);
> @@ -2231,9 +2237,14 @@ skip_bar:
>      if (!pci_supp_legacy_irq())
>          goto skip_legacy_irq;
>  
> -    sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/irq", pci->domain,
> +    sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/gsi", pci->domain,
>                             pci->bus, pci->dev, pci->func);
>  
> +    if (access(sysfs_path, F_OK) != 0) {
> +        sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/irq", pci->domain,
> +                           pci->bus, pci->dev, pci->func);
> +    }
> +
>      f = fopen(sysfs_path, "r");
>      if (f == NULL) {
>          LOGED(ERROR, domid, "Couldn't open %s", sysfs_path);



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC XEN PATCH v4 5/5] libxl: Use gsi instead of irq for mapping pirq
  2024-01-05  7:36   ` Jan Beulich
@ 2024-01-05  7:59     ` Chen, Jiqian
  0 siblings, 0 replies; 35+ messages in thread
From: Chen, Jiqian @ 2024-01-05  7:59 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Julien Grall, Stefano Stabellini,
	Anthony PERARD, Juergen Gross, Hildebrand, Stewart, Huang, Ray,
	xen-devel, Chen, Jiqian

On 2024/1/5 15:36, Jan Beulich wrote:
> On 05.01.2024 08:09, Jiqian Chen wrote:
>> --- a/tools/libs/light/libxl_pci.c
>> +++ b/tools/libs/light/libxl_pci.c
>> @@ -1479,8 +1479,14 @@ static void pci_add_dm_done(libxl__egc *egc,
>>      fclose(f);
>>      if (!pci_supp_legacy_irq())
>>          goto out_no_irq;
>> -    sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/irq", pci->domain,
>> +    sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/gsi", pci->domain,
>>                                  pci->bus, pci->dev, pci->func);
>> +
>> +    if (access(sysfs_path, F_OK) != 0) {
> 
> I suppose you want to also check errno, and fall back only when you got
> back (I guess) ENOENT.
Ok, will check errno in next version, if it is ENOENT, then use irq, if it is not ENOENT, then log error and go to out.

> 
> Jan
> 
>> +        sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/irq", pci->domain,
>> +                                pci->bus, pci->dev, pci->func);
>> +    }
>> +
>>      f = fopen(sysfs_path, "r");
>>      if (f == NULL) {
>>          LOGED(ERROR, domainid, "Couldn't open %s", sysfs_path);
>> @@ -2231,9 +2237,14 @@ skip_bar:
>>      if (!pci_supp_legacy_irq())
>>          goto skip_legacy_irq;
>>  
>> -    sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/irq", pci->domain,
>> +    sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/gsi", pci->domain,
>>                             pci->bus, pci->dev, pci->func);
>>  
>> +    if (access(sysfs_path, F_OK) != 0) {
>> +        sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/irq", pci->domain,
>> +                           pci->bus, pci->dev, pci->func);
>> +    }
>> +
>>      f = fopen(sysfs_path, "r");
>>      if (f == NULL) {
>>          LOGED(ERROR, domid, "Couldn't open %s", sysfs_path);
> 

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC XEN PATCH v4 2/5] x86/pvh: Allow (un)map_pirq when caller isn't DOMID_SELF
  2024-01-05  7:09 ` [RFC XEN PATCH v4 2/5] x86/pvh: Allow (un)map_pirq when caller isn't DOMID_SELF Jiqian Chen
@ 2024-01-06  0:46   ` Stefano Stabellini
  2024-01-08  8:47     ` Jan Beulich
  0 siblings, 1 reply; 35+ messages in thread
From: Stefano Stabellini @ 2024-01-06  0:46 UTC (permalink / raw)
  To: Jiqian Chen
  Cc: xen-devel, Jan Beulich, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Julien Grall, Stefano Stabellini,
	Anthony PERARD, Juergen Gross, Stewart Hildebrand, Huang Rui,
	Huang Rui

On Fri, 5 Jan 2024, Jiqian Chen wrote:
> If run Xen with PVH dom0 and hvm domU, hvm will map a pirq for
> a passthrough device by using gsi, see
> xen_pt_realize->xc_physdev_map_pirq and
> pci_add_dm_done->xc_physdev_map_pirq. Then xc_physdev_map_pirq
> will call into Xen, but in hvm_physdev_op, PHYSDEVOP_map_pirq
> is not allowed because currd is PVH dom0 and PVH has no
> X86_EMU_USE_PIRQ flag, it will fail at has_pirq check.
> 
> So, allow PHYSDEVOP_map_pirq when domid of the caller is not
> DOMID_SELF no matter whether currd has X86_EMU_USE_PIRQ flag
> and also allow PHYSDEVOP_unmap_pirq for the failed path to
> unmap pirq.
> 
> Co-developed-by: Huang Rui <ray.huang@amd.com>
> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> ---
>  xen/arch/x86/hvm/hypercall.c | 27 +++++++++++++++++++++++++--
>  1 file changed, 25 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c
> index 6ad5b4d5f11f..632a68be3cc4 100644
> --- a/xen/arch/x86/hvm/hypercall.c
> +++ b/xen/arch/x86/hvm/hypercall.c
> @@ -10,6 +10,7 @@
>  #include <xen/hypercall.h>
>  #include <xen/ioreq.h>
>  #include <xen/nospec.h>
> +#include <xen/guest_access.h>
>  
>  #include <asm/hvm/emulate.h>
>  #include <asm/hvm/support.h>
> @@ -72,8 +73,30 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>  
>      switch ( cmd )
>      {
> -    case PHYSDEVOP_map_pirq:
> -    case PHYSDEVOP_unmap_pirq:
> +    case PHYSDEVOP_map_pirq: {
> +        physdev_map_pirq_t map;
> +
> +        if ( copy_from_guest(&map, arg, 1) != 0 )
> +            return -EFAULT;
> +
> +        if ( !has_pirq(currd) && map.domid == DOMID_SELF )
> +            return -ENOSYS;

This looks OK to me although there is already another copy_from_guest in
do_physdev_op, but I don't see an easy way to make it better.

Also, you could write this check like this:

        d = rcu_lock_domain_by_any_id(map.domid);
        if ( d == NULL )
            return -ESRCH;
        if ( !has_pirq(d) )
            return -ENOSYS;
        rcu_unlock_domain(d);

This way it is a bit more generic and not special-cased to DOMID_SELF.

I'll let the x86 maintainers comment on the way the prefer it.


> +        break;
> +    }
> +
> +    case PHYSDEVOP_unmap_pirq: {
> +        physdev_unmap_pirq_t unmap;
> +
> +        if ( copy_from_guest(&unmap, arg, 1) != 0 )
> +            return -EFAULT;
> +
> +        if ( !has_pirq(currd) && unmap.domid == DOMID_SELF )
> +            return -ENOSYS;
> +
> +        break;
> +    }
> +
>      case PHYSDEVOP_eoi:
>      case PHYSDEVOP_irq_status_query:
>      case PHYSDEVOP_get_free_pirq:
> -- 
> 2.34.1
> 


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC XEN PATCH v4 3/5] x86/pvh: Add PHYSDEVOP_setup_gsi for PVH dom0
  2024-01-05  7:09 ` [RFC XEN PATCH v4 3/5] x86/pvh: Add PHYSDEVOP_setup_gsi for PVH dom0 Jiqian Chen
@ 2024-01-06  0:54   ` Stefano Stabellini
  2024-01-08  8:50     ` Jan Beulich
  0 siblings, 1 reply; 35+ messages in thread
From: Stefano Stabellini @ 2024-01-06  0:54 UTC (permalink / raw)
  To: Jiqian Chen
  Cc: xen-devel, Jan Beulich, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Julien Grall, Stefano Stabellini,
	Anthony PERARD, Juergen Gross, Stewart Hildebrand, Huang Rui,
	Huang Rui

On Fri, 5 Jan 2024, Jiqian Chen wrote:
> On PVH dom0, the gsis don't get registered, but
> the gsi of a passthrough device must be configured for it to
> be able to be mapped into a hvm domU.
> On Linux kernel side, it calles PHYSDEVOP_setup_gsi for
> passthrough devices to register gsi when dom0 is PVH.
> So, add PHYSDEVOP_setup_gsi for above purpose.
> 
> Co-developed-by: Huang Rui <ray.huang@amd.com>
> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> ---
>  xen/arch/x86/hvm/hypercall.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c
> index 632a68be3cc4..e27d3ca15185 100644
> --- a/xen/arch/x86/hvm/hypercall.c
> +++ b/xen/arch/x86/hvm/hypercall.c
> @@ -97,6 +97,12 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>          break;
>      }
>  
> +    case PHYSDEVOP_setup_gsi:
> +        if ( is_hardware_domain(currd) && !has_pirq(currd) )
> +            break;
> +        else
> +            return -ENOSYS;

I am not sure what is the best "if" check for this situation but I am
guessing we don't need has_pirq(currd). Maybe this is sufficient:

if ( is_hardware_domain(currd) )
    break;
else
    return -ENOSYS;


This is another one for the x86 maintainers.


>      case PHYSDEVOP_eoi:
>      case PHYSDEVOP_irq_status_query:
>      case PHYSDEVOP_get_free_pirq:
> -- 
> 2.34.1
> 


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC XEN PATCH v4 4/5] domctl: Use gsi to grant/revoke irq permission
  2024-01-05  7:09 ` [RFC XEN PATCH v4 4/5] domctl: Use gsi to grant/revoke irq permission Jiqian Chen
@ 2024-01-06  1:08   ` Stefano Stabellini
  2024-01-06  1:12     ` Stefano Stabellini
                       ` (2 more replies)
  2024-01-10 20:09   ` Stewart Hildebrand
  1 sibling, 3 replies; 35+ messages in thread
From: Stefano Stabellini @ 2024-01-06  1:08 UTC (permalink / raw)
  To: Jiqian Chen
  Cc: xen-devel, Jan Beulich, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Julien Grall, Stefano Stabellini,
	Anthony PERARD, Juergen Gross, Stewart Hildebrand, Huang Rui,
	Huang Rui

On Fri, 5 Jan 2024, Jiqian Chen wrote:
> Some type of domain don't have PIRQ, like PVH, current
> implementation is not suitable for those domain.
> 
> When passthrough a device to guest on PVH dom0, this
> pci_add_dm_done->XEN_DOMCTL_irq_permission will failed
> at domain_pirq_to_irq.
> 
> So, change it to use gsi to grant/revoke irq permission.
> And change the caller of XEN_DOMCTL_irq_permission to
> pass gsi to it instead of pirq.
> 
> Co-developed-by: Huang Rui <ray.huang@amd.com>
> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>

I realize there is no explanation or comment right now, but I think this
change would benefit from a in-code comment in xen/include/public/domctl.h
E.g.:

/* XEN_DOMCTL_irq_permission */
struct xen_domctl_irq_permission {
    uint32_t pirq;           /* pirq is the GSI on x86 */
    uint8_t allow_access;    /* flag to specify enable/disable of IRQ access */
    uint8_t pad[3];
};


> ---
>  tools/libs/light/libxl_pci.c |  6 ++++--
>  tools/libs/light/libxl_x86.c |  5 ++++-
>  xen/common/domctl.c          | 12 ++++++++++--
>  3 files changed, 18 insertions(+), 5 deletions(-)
> 
> diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
> index 96cb4da0794e..e1341d1e9850 100644
> --- a/tools/libs/light/libxl_pci.c
> +++ b/tools/libs/light/libxl_pci.c
> @@ -1418,6 +1418,7 @@ static void pci_add_dm_done(libxl__egc *egc,
>      unsigned long long start, end, flags, size;
>      int irq, i;
>      int r;
> +    int gsi;
>      uint32_t flag = XEN_DOMCTL_DEV_RDM_RELAXED;
>      uint32_t domainid = domid;
>      bool isstubdom = libxl_is_stubdom(ctx, domid, &domainid);
> @@ -1486,6 +1487,7 @@ static void pci_add_dm_done(libxl__egc *egc,
>          goto out_no_irq;
>      }
>      if ((fscanf(f, "%u", &irq) == 1) && irq) {
> +        gsi = irq;

A question for Roger and Jan: are we always guaranteed that gsi == irq
(also in the PV case)?

Also I don't think we necessarily need the gsi local variable, I would
just add an in-code comment below...


>          r = xc_physdev_map_pirq(ctx->xch, domid, irq, &irq);
>          if (r < 0) {
>              LOGED(ERROR, domainid, "xc_physdev_map_pirq irq=%d (error=%d)",
> @@ -1494,10 +1496,10 @@ static void pci_add_dm_done(libxl__egc *egc,
>              rc = ERROR_FAIL;
>              goto out;
>          }

... like this:

/* xc_domain_irq_permission takes a gsi, here we assume irq == gsi */
r = xc_domain_irq_permission(ctx->xch, domid, irq, 1);


> -        r = xc_domain_irq_permission(ctx->xch, domid, irq, 1);
> +        r = xc_domain_irq_permission(ctx->xch, domid, gsi, 1);
>          if (r < 0) {
>              LOGED(ERROR, domainid,
> -                  "xc_domain_irq_permission irq=%d (error=%d)", irq, r);
> +                  "xc_domain_irq_permission irq=%d (error=%d)", gsi, r);
>              fclose(f);
>              rc = ERROR_FAIL;
>              goto out;
> diff --git a/tools/libs/light/libxl_x86.c b/tools/libs/light/libxl_x86.c
> index d16573e72cd4..88ad50cf6360 100644
> --- a/tools/libs/light/libxl_x86.c
> +++ b/tools/libs/light/libxl_x86.c
> @@ -654,12 +654,15 @@ out:
>  int libxl__arch_domain_map_irq(libxl__gc *gc, uint32_t domid, int irq)

you could just rename the int irq parameter to int gsi ?


>  {
>      int ret;
> +    int gsi;
> +
> +    gsi = irq;
>  
>      ret = xc_physdev_map_pirq(CTX->xch, domid, irq, &irq);
>      if (ret)
>          return ret;
>  
> -    ret = xc_domain_irq_permission(CTX->xch, domid, irq, 1);
> +    ret = xc_domain_irq_permission(CTX->xch, domid, gsi, 1);
>      return ret;
>  }
> diff --git a/xen/common/domctl.c b/xen/common/domctl.c
> index f5a71ee5f78d..eeb975bd0194 100644
> --- a/xen/common/domctl.c
> +++ b/xen/common/domctl.c
> @@ -653,12 +653,20 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
>          unsigned int pirq = op->u.irq_permission.pirq, irq;
>          int allow = op->u.irq_permission.allow_access;

here we could benefit from renaming pirq to gsi, at least it becomes
clear:

unsigned int gsi = op->u.irq_permission.pirq, irq;


> -        if ( pirq >= current->domain->nr_pirqs )
> +        if ( pirq >= nr_irqs_gsi )
>          {
>              ret = -EINVAL;
>              break;
>          }
> -        irq = pirq_access_permitted(current->domain, pirq);
> +
> +        if ( irq_access_permitted(current->domain, pirq) )
> +            irq = pirq;
> +        else
> +        {
> +            ret = -EPERM;
> +            break;
> +        }
> +
>          if ( !irq || xsm_irq_permission(XSM_HOOK, d, irq, allow) )
>              ret = -EPERM;
>          else if ( allow )
> -- 
> 2.34.1
> 


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC XEN PATCH v4 5/5] libxl: Use gsi instead of irq for mapping pirq
  2024-01-05  7:09 ` [RFC XEN PATCH v4 5/5] libxl: Use gsi instead of irq for mapping pirq Jiqian Chen
  2024-01-05  7:36   ` Jan Beulich
@ 2024-01-06  1:11   ` Stefano Stabellini
  2024-01-08  6:02     ` Chen, Jiqian
  1 sibling, 1 reply; 35+ messages in thread
From: Stefano Stabellini @ 2024-01-06  1:11 UTC (permalink / raw)
  To: Jiqian Chen
  Cc: xen-devel, Jan Beulich, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Julien Grall, Stefano Stabellini,
	Anthony PERARD, Juergen Gross, Stewart Hildebrand, Huang Rui,
	Huang Rui

On Fri, 5 Jan 2024, Jiqian Chen wrote:
> In PVH dom0, it uses the linux local interrupt mechanism,
> when it allocs irq for a gsi, it is dynamic, and follow
> the principle of applying first, distributing first. And
> the irq number is alloced from small to large, but the
> applying gsi number is not, may gsi 38 comes before gsi
> 28, that causes the irq number is not equal with the gsi
> number. And when passthrough a device, xl wants to use
> gsi to map pirq, see pci_add_dm_done->xc_physdev_map_pirq,
> but the gsi number is got from file
> /sys/bus/pci/devices/<sbdf>/irq in current code, so it
> will fail when mapping.
> 
> So, use real gsi number read from gsi sysfs.
> 
> Co-developed-by: Huang Rui <ray.huang@amd.com>
> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC XEN PATCH v4 4/5] domctl: Use gsi to grant/revoke irq permission
  2024-01-06  1:08   ` Stefano Stabellini
@ 2024-01-06  1:12     ` Stefano Stabellini
  2024-01-08  3:43     ` Chen, Jiqian
  2024-01-08  8:55     ` Jan Beulich
  2 siblings, 0 replies; 35+ messages in thread
From: Stefano Stabellini @ 2024-01-06  1:12 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Jiqian Chen, xen-devel, Jan Beulich, Andrew Cooper,
	Roger Pau Monné,
	Wei Liu, George Dunlap, Julien Grall, Anthony PERARD,
	Juergen Gross, Stewart Hildebrand, Huang Rui, Huang Rui

On Fri, 5 Jan 2024, Stefano Stabellini wrote:
> On Fri, 5 Jan 2024, Jiqian Chen wrote:
> > Some type of domain don't have PIRQ, like PVH, current
> > implementation is not suitable for those domain.
> > 
> > When passthrough a device to guest on PVH dom0, this
> > pci_add_dm_done->XEN_DOMCTL_irq_permission will failed
> > at domain_pirq_to_irq.
> > 
> > So, change it to use gsi to grant/revoke irq permission.
> > And change the caller of XEN_DOMCTL_irq_permission to
> > pass gsi to it instead of pirq.
> > 
> > Co-developed-by: Huang Rui <ray.huang@amd.com>
> > Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> 
> I realize there is no explanation or comment right now, but I think this
> change would benefit from a in-code comment in xen/include/public/domctl.h
> E.g.:
> 
> /* XEN_DOMCTL_irq_permission */
> struct xen_domctl_irq_permission {
>     uint32_t pirq;           /* pirq is the GSI on x86 */
>     uint8_t allow_access;    /* flag to specify enable/disable of IRQ access */
>     uint8_t pad[3];
> };
> 
> 
> > ---
> >  tools/libs/light/libxl_pci.c |  6 ++++--
> >  tools/libs/light/libxl_x86.c |  5 ++++-
> >  xen/common/domctl.c          | 12 ++++++++++--
> >  3 files changed, 18 insertions(+), 5 deletions(-)
> > 
> > diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
> > index 96cb4da0794e..e1341d1e9850 100644
> > --- a/tools/libs/light/libxl_pci.c
> > +++ b/tools/libs/light/libxl_pci.c
> > @@ -1418,6 +1418,7 @@ static void pci_add_dm_done(libxl__egc *egc,
> >      unsigned long long start, end, flags, size;
> >      int irq, i;
> >      int r;
> > +    int gsi;

After reading patch #5 I think you could just rename the int irq local
variable to int gsi.


> >      uint32_t flag = XEN_DOMCTL_DEV_RDM_RELAXED;
> >      uint32_t domainid = domid;
> >      bool isstubdom = libxl_is_stubdom(ctx, domid, &domainid);
> > @@ -1486,6 +1487,7 @@ static void pci_add_dm_done(libxl__egc *egc,
> >          goto out_no_irq;
> >      }
> >      if ((fscanf(f, "%u", &irq) == 1) && irq) {
> > +        gsi = irq;
> 
> A question for Roger and Jan: are we always guaranteed that gsi == irq
> (also in the PV case)?
> 
> Also I don't think we necessarily need the gsi local variable, I would
> just add an in-code comment below...
> 
> 
> >          r = xc_physdev_map_pirq(ctx->xch, domid, irq, &irq);
> >          if (r < 0) {
> >              LOGED(ERROR, domainid, "xc_physdev_map_pirq irq=%d (error=%d)",
> > @@ -1494,10 +1496,10 @@ static void pci_add_dm_done(libxl__egc *egc,
> >              rc = ERROR_FAIL;
> >              goto out;
> >          }
> 
> ... like this:
> 
> /* xc_domain_irq_permission takes a gsi, here we assume irq == gsi */
> r = xc_domain_irq_permission(ctx->xch, domid, irq, 1);
> 
> 
> > -        r = xc_domain_irq_permission(ctx->xch, domid, irq, 1);
> > +        r = xc_domain_irq_permission(ctx->xch, domid, gsi, 1);
> >          if (r < 0) {
> >              LOGED(ERROR, domainid,
> > -                  "xc_domain_irq_permission irq=%d (error=%d)", irq, r);
> > +                  "xc_domain_irq_permission irq=%d (error=%d)", gsi, r);
> >              fclose(f);
> >              rc = ERROR_FAIL;
> >              goto out;
> > diff --git a/tools/libs/light/libxl_x86.c b/tools/libs/light/libxl_x86.c
> > index d16573e72cd4..88ad50cf6360 100644
> > --- a/tools/libs/light/libxl_x86.c
> > +++ b/tools/libs/light/libxl_x86.c
> > @@ -654,12 +654,15 @@ out:
> >  int libxl__arch_domain_map_irq(libxl__gc *gc, uint32_t domid, int irq)
> 
> you could just rename the int irq parameter to int gsi ?
> 
> 
> >  {
> >      int ret;
> > +    int gsi;
> > +
> > +    gsi = irq;
> >  
> >      ret = xc_physdev_map_pirq(CTX->xch, domid, irq, &irq);
> >      if (ret)
> >          return ret;
> >  
> > -    ret = xc_domain_irq_permission(CTX->xch, domid, irq, 1);
> > +    ret = xc_domain_irq_permission(CTX->xch, domid, gsi, 1);
> >      return ret;
> >  }
> > diff --git a/xen/common/domctl.c b/xen/common/domctl.c
> > index f5a71ee5f78d..eeb975bd0194 100644
> > --- a/xen/common/domctl.c
> > +++ b/xen/common/domctl.c
> > @@ -653,12 +653,20 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
> >          unsigned int pirq = op->u.irq_permission.pirq, irq;
> >          int allow = op->u.irq_permission.allow_access;
> 
> here we could benefit from renaming pirq to gsi, at least it becomes
> clear:
> 
> unsigned int gsi = op->u.irq_permission.pirq, irq;
> 
> 
> > -        if ( pirq >= current->domain->nr_pirqs )
> > +        if ( pirq >= nr_irqs_gsi )
> >          {
> >              ret = -EINVAL;
> >              break;
> >          }
> > -        irq = pirq_access_permitted(current->domain, pirq);
> > +
> > +        if ( irq_access_permitted(current->domain, pirq) )
> > +            irq = pirq;
> > +        else
> > +        {
> > +            ret = -EPERM;
> > +            break;
> > +        }
> > +
> >          if ( !irq || xsm_irq_permission(XSM_HOOK, d, irq, allow) )
> >              ret = -EPERM;
> >          else if ( allow )
> > -- 
> > 2.34.1
> > 
> 


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC XEN PATCH v4 4/5] domctl: Use gsi to grant/revoke irq permission
  2024-01-06  1:08   ` Stefano Stabellini
  2024-01-06  1:12     ` Stefano Stabellini
@ 2024-01-08  3:43     ` Chen, Jiqian
  2024-01-08  8:55     ` Jan Beulich
  2 siblings, 0 replies; 35+ messages in thread
From: Chen, Jiqian @ 2024-01-08  3:43 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, Jan Beulich, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Julien Grall, Anthony PERARD,
	Juergen Gross, Hildebrand, Stewart, Huang, Ray, Chen, Jiqian

On 2024/1/6 09:08, Stefano Stabellini wrote:
> On Fri, 5 Jan 2024, Jiqian Chen wrote:
>> Some type of domain don't have PIRQ, like PVH, current
>> implementation is not suitable for those domain.
>>
>> When passthrough a device to guest on PVH dom0, this
>> pci_add_dm_done->XEN_DOMCTL_irq_permission will failed
>> at domain_pirq_to_irq.
>>
>> So, change it to use gsi to grant/revoke irq permission.
>> And change the caller of XEN_DOMCTL_irq_permission to
>> pass gsi to it instead of pirq.
>>
>> Co-developed-by: Huang Rui <ray.huang@amd.com>
>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> 
> I realize there is no explanation or comment right now, but I think this
> change would benefit from a in-code comment in xen/include/public/domctl.h
> E.g.:
> 
> /* XEN_DOMCTL_irq_permission */
> struct xen_domctl_irq_permission {
>     uint32_t pirq;           /* pirq is the GSI on x86 */
>     uint8_t allow_access;    /* flag to specify enable/disable of IRQ access */
>     uint8_t pad[3];
> };
Will add this comment in next version, thanks.

> 
> 
>> ---
>>  tools/libs/light/libxl_pci.c |  6 ++++--
>>  tools/libs/light/libxl_x86.c |  5 ++++-
>>  xen/common/domctl.c          | 12 ++++++++++--
>>  3 files changed, 18 insertions(+), 5 deletions(-)
>>
>> diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
>> index 96cb4da0794e..e1341d1e9850 100644
>> --- a/tools/libs/light/libxl_pci.c
>> +++ b/tools/libs/light/libxl_pci.c
>> @@ -1418,6 +1418,7 @@ static void pci_add_dm_done(libxl__egc *egc,
>>      unsigned long long start, end, flags, size;
>>      int irq, i;
>>      int r;
>> +    int gsi;
>>      uint32_t flag = XEN_DOMCTL_DEV_RDM_RELAXED;
>>      uint32_t domainid = domid;
>>      bool isstubdom = libxl_is_stubdom(ctx, domid, &domainid);
>> @@ -1486,6 +1487,7 @@ static void pci_add_dm_done(libxl__egc *egc,
>>          goto out_no_irq;
>>      }
>>      if ((fscanf(f, "%u", &irq) == 1) && irq) {
>> +        gsi = irq;
> 
> A question for Roger and Jan: are we always guaranteed that gsi == irq
> (also in the PV case)?
> 
> Also I don't think we necessarily need the gsi local variable, I would
> just add an in-code comment below...
Here pass the pointer of irq to xc_physdev_map_pirq, and that function will assign the value of pirq to irq on Xen side, so after calling xc_physdev_map_pirq, the value of irq isn't the original value, we need a local variable to record the original value.

> 
> 
>>          r = xc_physdev_map_pirq(ctx->xch, domid, irq, &irq);
>>          if (r < 0) {
>>              LOGED(ERROR, domainid, "xc_physdev_map_pirq irq=%d (error=%d)",
>> @@ -1494,10 +1496,10 @@ static void pci_add_dm_done(libxl__egc *egc,
>>              rc = ERROR_FAIL;
>>              goto out;
>>          }
> 
> ... like this:
> 
> /* xc_domain_irq_permission takes a gsi, here we assume irq == gsi */
> r = xc_domain_irq_permission(ctx->xch, domid, irq, 1);
Will add this comment in next version, thanks.

> 
> 
>> -        r = xc_domain_irq_permission(ctx->xch, domid, irq, 1);
>> +        r = xc_domain_irq_permission(ctx->xch, domid, gsi, 1);
>>          if (r < 0) {
>>              LOGED(ERROR, domainid,
>> -                  "xc_domain_irq_permission irq=%d (error=%d)", irq, r);
>> +                  "xc_domain_irq_permission irq=%d (error=%d)", gsi, r);
>>              fclose(f);
>>              rc = ERROR_FAIL;
>>              goto out;
>> diff --git a/tools/libs/light/libxl_x86.c b/tools/libs/light/libxl_x86.c
>> index d16573e72cd4..88ad50cf6360 100644
>> --- a/tools/libs/light/libxl_x86.c
>> +++ b/tools/libs/light/libxl_x86.c
>> @@ -654,12 +654,15 @@ out:
>>  int libxl__arch_domain_map_irq(libxl__gc *gc, uint32_t domid, int irq)
> 
> you could just rename the int irq parameter to int gsi ?
No, the same reason as above.

> 
> 
>>  {
>>      int ret;
>> +    int gsi;
>> +
>> +    gsi = irq;
>>  
>>      ret = xc_physdev_map_pirq(CTX->xch, domid, irq, &irq);
>>      if (ret)
>>          return ret;
>>  
>> -    ret = xc_domain_irq_permission(CTX->xch, domid, irq, 1);
>> +    ret = xc_domain_irq_permission(CTX->xch, domid, gsi, 1);
>>      return ret;
>>  }
>> diff --git a/xen/common/domctl.c b/xen/common/domctl.c
>> index f5a71ee5f78d..eeb975bd0194 100644
>> --- a/xen/common/domctl.c
>> +++ b/xen/common/domctl.c
>> @@ -653,12 +653,20 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
>>          unsigned int pirq = op->u.irq_permission.pirq, irq;
>>          int allow = op->u.irq_permission.allow_access;
> 
> here we could benefit from renaming pirq to gsi, at least it becomes
> clear:
> 
> unsigned int gsi = op->u.irq_permission.pirq, irq;
Thank you, I will rename it in next version.

> 
> 
>> -        if ( pirq >= current->domain->nr_pirqs )
>> +        if ( pirq >= nr_irqs_gsi )
>>          {
>>              ret = -EINVAL;
>>              break;
>>          }
>> -        irq = pirq_access_permitted(current->domain, pirq);
>> +
>> +        if ( irq_access_permitted(current->domain, pirq) )
>> +            irq = pirq;
>> +        else
>> +        {
>> +            ret = -EPERM;
>> +            break;
>> +        }
>> +
>>          if ( !irq || xsm_irq_permission(XSM_HOOK, d, irq, allow) )
>>              ret = -EPERM;
>>          else if ( allow )
>> -- 
>> 2.34.1
>>

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC XEN PATCH v4 5/5] libxl: Use gsi instead of irq for mapping pirq
  2024-01-06  1:11   ` Stefano Stabellini
@ 2024-01-08  6:02     ` Chen, Jiqian
  0 siblings, 0 replies; 35+ messages in thread
From: Chen, Jiqian @ 2024-01-08  6:02 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, Jan Beulich, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Julien Grall, Anthony PERARD,
	Juergen Gross, Hildebrand, Stewart, Huang, Ray, Chen, Jiqian

On 2024/1/6 09:11, Stefano Stabellini wrote:
> On Fri, 5 Jan 2024, Jiqian Chen wrote:
>> In PVH dom0, it uses the linux local interrupt mechanism,
>> when it allocs irq for a gsi, it is dynamic, and follow
>> the principle of applying first, distributing first. And
>> the irq number is alloced from small to large, but the
>> applying gsi number is not, may gsi 38 comes before gsi
>> 28, that causes the irq number is not equal with the gsi
>> number. And when passthrough a device, xl wants to use
>> gsi to map pirq, see pci_add_dm_done->xc_physdev_map_pirq,
>> but the gsi number is got from file
>> /sys/bus/pci/devices/<sbdf>/irq in current code, so it
>> will fail when mapping.
>>
>> So, use real gsi number read from gsi sysfs.
>>
>> Co-developed-by: Huang Rui <ray.huang@amd.com>
>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> 
> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Thank you very much! I will add this in next version.
And this implementation depends on the third patch on kernel side, if the maintainer doesn't like to add a gsi sysfs, this implementation may be changed in the future, if so, I will let you know.

> 

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC XEN PATCH v4 2/5] x86/pvh: Allow (un)map_pirq when caller isn't DOMID_SELF
  2024-01-06  0:46   ` Stefano Stabellini
@ 2024-01-08  8:47     ` Jan Beulich
  2024-01-08  9:15       ` Chen, Jiqian
  0 siblings, 1 reply; 35+ messages in thread
From: Jan Beulich @ 2024-01-08  8:47 UTC (permalink / raw)
  To: Stefano Stabellini, Jiqian Chen
  Cc: xen-devel, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Julien Grall, Anthony PERARD,
	Juergen Gross, Stewart Hildebrand, Huang Rui

On 06.01.2024 01:46, Stefano Stabellini wrote:
> On Fri, 5 Jan 2024, Jiqian Chen wrote:
>> @@ -72,8 +73,30 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>  
>>      switch ( cmd )
>>      {
>> -    case PHYSDEVOP_map_pirq:
>> -    case PHYSDEVOP_unmap_pirq:
>> +    case PHYSDEVOP_map_pirq: {
>> +        physdev_map_pirq_t map;
>> +
>> +        if ( copy_from_guest(&map, arg, 1) != 0 )
>> +            return -EFAULT;
>> +
>> +        if ( !has_pirq(currd) && map.domid == DOMID_SELF )
>> +            return -ENOSYS;
> 
> This looks OK to me although there is already another copy_from_guest in
> do_physdev_op, but I don't see an easy way to make it better.

How can double reads of hypercall args ever be okay? The new check clearly
needs to be inserted in the code path where the structure is being read
already anyway.

Jan


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC XEN PATCH v4 3/5] x86/pvh: Add PHYSDEVOP_setup_gsi for PVH dom0
  2024-01-06  0:54   ` Stefano Stabellini
@ 2024-01-08  8:50     ` Jan Beulich
  2024-01-09  8:01       ` Chen, Jiqian
  0 siblings, 1 reply; 35+ messages in thread
From: Jan Beulich @ 2024-01-08  8:50 UTC (permalink / raw)
  To: Stefano Stabellini, Jiqian Chen
  Cc: xen-devel, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Julien Grall, Anthony PERARD,
	Juergen Gross, Stewart Hildebrand, Huang Rui

On 06.01.2024 01:54, Stefano Stabellini wrote:
> On Fri, 5 Jan 2024, Jiqian Chen wrote:
>> On PVH dom0, the gsis don't get registered, but
>> the gsi of a passthrough device must be configured for it to
>> be able to be mapped into a hvm domU.
>> On Linux kernel side, it calles PHYSDEVOP_setup_gsi for
>> passthrough devices to register gsi when dom0 is PVH.
>> So, add PHYSDEVOP_setup_gsi for above purpose.
>>
>> Co-developed-by: Huang Rui <ray.huang@amd.com>
>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>> ---
>>  xen/arch/x86/hvm/hypercall.c | 6 ++++++
>>  1 file changed, 6 insertions(+)
>>
>> diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c
>> index 632a68be3cc4..e27d3ca15185 100644
>> --- a/xen/arch/x86/hvm/hypercall.c
>> +++ b/xen/arch/x86/hvm/hypercall.c
>> @@ -97,6 +97,12 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>          break;
>>      }
>>  
>> +    case PHYSDEVOP_setup_gsi:
>> +        if ( is_hardware_domain(currd) && !has_pirq(currd) )
>> +            break;
>> +        else
>> +            return -ENOSYS;
> 
> I am not sure what is the best "if" check for this situation but I am
> guessing we don't need has_pirq(currd). Maybe this is sufficient:
> 
> if ( is_hardware_domain(currd) )
>     break;
> else
>     return -ENOSYS;

Maybe

    if ( !is_hardware_domain(currd) )
        return -EOPNOTSUPP;
    ASSERT(!has_pirq(currd));
    break;

? What I primarily dislike in both earlier proposals is the (imo
confusing) use of "else".

Jan


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC XEN PATCH v4 4/5] domctl: Use gsi to grant/revoke irq permission
  2024-01-06  1:08   ` Stefano Stabellini
  2024-01-06  1:12     ` Stefano Stabellini
  2024-01-08  3:43     ` Chen, Jiqian
@ 2024-01-08  8:55     ` Jan Beulich
  2024-01-08 15:05       ` Roger Pau Monné
  2 siblings, 1 reply; 35+ messages in thread
From: Jan Beulich @ 2024-01-08  8:55 UTC (permalink / raw)
  To: Stefano Stabellini, Jiqian Chen
  Cc: xen-devel, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Julien Grall, Anthony PERARD,
	Juergen Gross, Stewart Hildebrand, Huang Rui

On 06.01.2024 02:08, Stefano Stabellini wrote:
> On Fri, 5 Jan 2024, Jiqian Chen wrote:
>> --- a/tools/libs/light/libxl_pci.c
>> +++ b/tools/libs/light/libxl_pci.c
>> @@ -1418,6 +1418,7 @@ static void pci_add_dm_done(libxl__egc *egc,
>>      unsigned long long start, end, flags, size;
>>      int irq, i;
>>      int r;
>> +    int gsi;
>>      uint32_t flag = XEN_DOMCTL_DEV_RDM_RELAXED;
>>      uint32_t domainid = domid;
>>      bool isstubdom = libxl_is_stubdom(ctx, domid, &domainid);
>> @@ -1486,6 +1487,7 @@ static void pci_add_dm_done(libxl__egc *egc,
>>          goto out_no_irq;
>>      }
>>      if ((fscanf(f, "%u", &irq) == 1) && irq) {
>> +        gsi = irq;
> 
> A question for Roger and Jan: are we always guaranteed that gsi == irq
> (also in the PV case)?

Iirc for IO-APIC based IRQs that's always the case; for MSI ones there
simply is no GSI, and Xen's IRQ number then normally isn't the same as
the pIRQ a domain would see.

Jan


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC XEN PATCH v4 2/5] x86/pvh: Allow (un)map_pirq when caller isn't DOMID_SELF
  2024-01-08  8:47     ` Jan Beulich
@ 2024-01-08  9:15       ` Chen, Jiqian
  2024-01-08  9:25         ` Jan Beulich
  0 siblings, 1 reply; 35+ messages in thread
From: Chen, Jiqian @ 2024-01-08  9:15 UTC (permalink / raw)
  To: Jan Beulich, Stefano Stabellini
  Cc: xen-devel, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Julien Grall, Anthony PERARD,
	Juergen Gross, Hildebrand, Stewart, Huang, Ray, Chen, Jiqian

On 2024/1/8 16:47, Jan Beulich wrote:
> On 06.01.2024 01:46, Stefano Stabellini wrote:
>> On Fri, 5 Jan 2024, Jiqian Chen wrote:
>>> @@ -72,8 +73,30 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>>  
>>>      switch ( cmd )
>>>      {
>>> -    case PHYSDEVOP_map_pirq:
>>> -    case PHYSDEVOP_unmap_pirq:
>>> +    case PHYSDEVOP_map_pirq: {
>>> +        physdev_map_pirq_t map;
>>> +
>>> +        if ( copy_from_guest(&map, arg, 1) != 0 )
>>> +            return -EFAULT;
>>> +
>>> +        if ( !has_pirq(currd) && map.domid == DOMID_SELF )
>>> +            return -ENOSYS;
>>
>> This looks OK to me although there is already another copy_from_guest in
>> do_physdev_op, but I don't see an easy way to make it better.
> 
> How can double reads of hypercall args ever be okay? The new check clearly
> needs to be inserted in the code path where the structure is being read
> already anyway.
I also tried to add this check in PHYSDEVOP_map_pirq in physdev.c, but pv has no flag X86_EMU_USE_PIRQ too.
If want to add it into physdev.c and combine Stefano's opinions, this check may be like:

diff --git a/xen/arch/x86/physdev.c b/xen/arch/x86/physdev.c
index 47c4da0af7e1..c38d4d405726 100644
--- a/xen/arch/x86/physdev.c
+++ b/xen/arch/x86/physdev.c
@@ -303,11 +303,19 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
     case PHYSDEVOP_map_pirq: {
         physdev_map_pirq_t map;
         struct msi_info msi;
+        struct domain *d;

         ret = -EFAULT;
         if ( copy_from_guest(&map, arg, 1) != 0 )
             break;

+        d = rcu_lock_domain_by_any_id(map.domid);
+        if ( d == NULL )
+            return -ESRCH;
+        if ( !is_pv_domain(d) && !has_pirq(d) )
+            return -ENOSYS;
+        rcu_unlock_domain(d);
+
         switch ( map.type )
         {
         case MAP_PIRQ_TYPE_MSI_SEG:

> 
> Jan

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [RFC XEN PATCH v4 2/5] x86/pvh: Allow (un)map_pirq when caller isn't DOMID_SELF
  2024-01-08  9:15       ` Chen, Jiqian
@ 2024-01-08  9:25         ` Jan Beulich
  2024-01-09  7:58           ` Chen, Jiqian
  0 siblings, 1 reply; 35+ messages in thread
From: Jan Beulich @ 2024-01-08  9:25 UTC (permalink / raw)
  To: Chen, Jiqian
  Cc: xen-devel, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Julien Grall, Anthony PERARD,
	Juergen Gross, Hildebrand, Stewart, Huang, Ray,
	Stefano Stabellini

On 08.01.2024 10:15, Chen, Jiqian wrote:
> On 2024/1/8 16:47, Jan Beulich wrote:
>> On 06.01.2024 01:46, Stefano Stabellini wrote:
>>> On Fri, 5 Jan 2024, Jiqian Chen wrote:
>>>> @@ -72,8 +73,30 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>>>  
>>>>      switch ( cmd )
>>>>      {
>>>> -    case PHYSDEVOP_map_pirq:
>>>> -    case PHYSDEVOP_unmap_pirq:
>>>> +    case PHYSDEVOP_map_pirq: {
>>>> +        physdev_map_pirq_t map;
>>>> +
>>>> +        if ( copy_from_guest(&map, arg, 1) != 0 )
>>>> +            return -EFAULT;
>>>> +
>>>> +        if ( !has_pirq(currd) && map.domid == DOMID_SELF )
>>>> +            return -ENOSYS;
>>>
>>> This looks OK to me although there is already another copy_from_guest in
>>> do_physdev_op, but I don't see an easy way to make it better.
>>
>> How can double reads of hypercall args ever be okay? The new check clearly
>> needs to be inserted in the code path where the structure is being read
>> already anyway.
> I also tried to add this check in PHYSDEVOP_map_pirq in physdev.c, but pv has no flag X86_EMU_USE_PIRQ too.
> If want to add it into physdev.c and combine Stefano's opinions, this check may be like:
> 
> diff --git a/xen/arch/x86/physdev.c b/xen/arch/x86/physdev.c
> index 47c4da0af7e1..c38d4d405726 100644
> --- a/xen/arch/x86/physdev.c
> +++ b/xen/arch/x86/physdev.c
> @@ -303,11 +303,19 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>      case PHYSDEVOP_map_pirq: {
>          physdev_map_pirq_t map;
>          struct msi_info msi;
> +        struct domain *d;
> 
>          ret = -EFAULT;
>          if ( copy_from_guest(&map, arg, 1) != 0 )
>              break;
> 
> +        d = rcu_lock_domain_by_any_id(map.domid);
> +        if ( d == NULL )
> +            return -ESRCH;
> +        if ( !is_pv_domain(d) && !has_pirq(d) )
> +            return -ENOSYS;
> +        rcu_unlock_domain(d);
> +
>          switch ( map.type )
>          {
>          case MAP_PIRQ_TYPE_MSI_SEG:

Well, yes, perhaps kind of like that, but with rcu_unlock_domain() called
on the error 2nd return path as well, and without abusing ENOSYS.

Jan


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC XEN PATCH v4 4/5] domctl: Use gsi to grant/revoke irq permission
  2024-01-08  8:55     ` Jan Beulich
@ 2024-01-08 15:05       ` Roger Pau Monné
  2024-01-09  8:18         ` Chen, Jiqian
  0 siblings, 1 reply; 35+ messages in thread
From: Roger Pau Monné @ 2024-01-08 15:05 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Jiqian Chen, xen-devel, Andrew Cooper,
	Wei Liu, George Dunlap, Julien Grall, Anthony PERARD,
	Juergen Gross, Stewart Hildebrand, Huang Rui

On Mon, Jan 08, 2024 at 09:55:26AM +0100, Jan Beulich wrote:
> On 06.01.2024 02:08, Stefano Stabellini wrote:
> > On Fri, 5 Jan 2024, Jiqian Chen wrote:
> >> --- a/tools/libs/light/libxl_pci.c
> >> +++ b/tools/libs/light/libxl_pci.c
> >> @@ -1418,6 +1418,7 @@ static void pci_add_dm_done(libxl__egc *egc,
> >>      unsigned long long start, end, flags, size;
> >>      int irq, i;
> >>      int r;
> >> +    int gsi;
> >>      uint32_t flag = XEN_DOMCTL_DEV_RDM_RELAXED;
> >>      uint32_t domainid = domid;
> >>      bool isstubdom = libxl_is_stubdom(ctx, domid, &domainid);
> >> @@ -1486,6 +1487,7 @@ static void pci_add_dm_done(libxl__egc *egc,
> >>          goto out_no_irq;
> >>      }
> >>      if ((fscanf(f, "%u", &irq) == 1) && irq) {
> >> +        gsi = irq;
> > 
> > A question for Roger and Jan: are we always guaranteed that gsi == irq
> > (also in the PV case)?
> 
> Iirc for IO-APIC based IRQs that's always the case;

I think that's always the case on Linux, because it calls
PHYSDEVOP_map_pirq with index == pirq (see Linux
pci_xen_initial_domain()).  But other OSes could possibly make the
call with pirq == -1 and get a randomly allocated pirq for GSIs.

IOW: I don't think the pirq field in xen_domctl_irq_permission can be
altered like proposed here to switch from passing a pirq to a GSI.  A
new hypercall should be introduced that either is GSI specific, or
contains a type field in order to specify the namespace the field
targets.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC XEN PATCH v4 2/5] x86/pvh: Allow (un)map_pirq when caller isn't DOMID_SELF
  2024-01-08  9:25         ` Jan Beulich
@ 2024-01-09  7:58           ` Chen, Jiqian
  0 siblings, 0 replies; 35+ messages in thread
From: Chen, Jiqian @ 2024-01-09  7:58 UTC (permalink / raw)
  To: Jan Beulich
  Cc: xen-devel, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Julien Grall, Anthony PERARD,
	Juergen Gross, Hildebrand, Stewart, Huang, Ray,
	Stefano Stabellini, Chen, Jiqian

On 2024/1/8 17:25, Jan Beulich wrote:
> On 08.01.2024 10:15, Chen, Jiqian wrote:
>> On 2024/1/8 16:47, Jan Beulich wrote:
>>> On 06.01.2024 01:46, Stefano Stabellini wrote:
>>>> On Fri, 5 Jan 2024, Jiqian Chen wrote:
>>>>> @@ -72,8 +73,30 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>>>>  
>>>>>      switch ( cmd )
>>>>>      {
>>>>> -    case PHYSDEVOP_map_pirq:
>>>>> -    case PHYSDEVOP_unmap_pirq:
>>>>> +    case PHYSDEVOP_map_pirq: {
>>>>> +        physdev_map_pirq_t map;
>>>>> +
>>>>> +        if ( copy_from_guest(&map, arg, 1) != 0 )
>>>>> +            return -EFAULT;
>>>>> +
>>>>> +        if ( !has_pirq(currd) && map.domid == DOMID_SELF )
>>>>> +            return -ENOSYS;
>>>>
>>>> This looks OK to me although there is already another copy_from_guest in
>>>> do_physdev_op, but I don't see an easy way to make it better.
>>>
>>> How can double reads of hypercall args ever be okay? The new check clearly
>>> needs to be inserted in the code path where the structure is being read
>>> already anyway.
>> I also tried to add this check in PHYSDEVOP_map_pirq in physdev.c, but pv has no flag X86_EMU_USE_PIRQ too.
>> If want to add it into physdev.c and combine Stefano's opinions, this check may be like:
>>
>> diff --git a/xen/arch/x86/physdev.c b/xen/arch/x86/physdev.c
>> index 47c4da0af7e1..c38d4d405726 100644
>> --- a/xen/arch/x86/physdev.c
>> +++ b/xen/arch/x86/physdev.c
>> @@ -303,11 +303,19 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>      case PHYSDEVOP_map_pirq: {
>>          physdev_map_pirq_t map;
>>          struct msi_info msi;
>> +        struct domain *d;
>>
>>          ret = -EFAULT;
>>          if ( copy_from_guest(&map, arg, 1) != 0 )
>>              break;
>>
>> +        d = rcu_lock_domain_by_any_id(map.domid);
>> +        if ( d == NULL )
>> +            return -ESRCH;
>> +        if ( !is_pv_domain(d) && !has_pirq(d) )
>> +            return -ENOSYS;
>> +        rcu_unlock_domain(d);
>> +
>>          switch ( map.type )
>>          {
>>          case MAP_PIRQ_TYPE_MSI_SEG:
> 
> Well, yes, perhaps kind of like that, but with rcu_unlock_domain() called
> on the error 2nd return path as well, and without abusing ENOSYS.
Ok, will call rcu_unlock_domain on 2nd error path, and change ENOSYS to EOPNOTSUPP.

> 
> Jan

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC XEN PATCH v4 3/5] x86/pvh: Add PHYSDEVOP_setup_gsi for PVH dom0
  2024-01-08  8:50     ` Jan Beulich
@ 2024-01-09  8:01       ` Chen, Jiqian
  0 siblings, 0 replies; 35+ messages in thread
From: Chen, Jiqian @ 2024-01-09  8:01 UTC (permalink / raw)
  To: Jan Beulich, Stefano Stabellini
  Cc: xen-devel, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Julien Grall, Anthony PERARD,
	Juergen Gross, Hildebrand, Stewart, Huang, Ray, Chen, Jiqian

On 2024/1/8 16:50, Jan Beulich wrote:
> On 06.01.2024 01:54, Stefano Stabellini wrote:
>> On Fri, 5 Jan 2024, Jiqian Chen wrote:
>>> On PVH dom0, the gsis don't get registered, but
>>> the gsi of a passthrough device must be configured for it to
>>> be able to be mapped into a hvm domU.
>>> On Linux kernel side, it calles PHYSDEVOP_setup_gsi for
>>> passthrough devices to register gsi when dom0 is PVH.
>>> So, add PHYSDEVOP_setup_gsi for above purpose.
>>>
>>> Co-developed-by: Huang Rui <ray.huang@amd.com>
>>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>>> ---
>>>  xen/arch/x86/hvm/hypercall.c | 6 ++++++
>>>  1 file changed, 6 insertions(+)
>>>
>>> diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c
>>> index 632a68be3cc4..e27d3ca15185 100644
>>> --- a/xen/arch/x86/hvm/hypercall.c
>>> +++ b/xen/arch/x86/hvm/hypercall.c
>>> @@ -97,6 +97,12 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>>          break;
>>>      }
>>>  
>>> +    case PHYSDEVOP_setup_gsi:
>>> +        if ( is_hardware_domain(currd) && !has_pirq(currd) )
>>> +            break;
>>> +        else
>>> +            return -ENOSYS;
>>
>> I am not sure what is the best "if" check for this situation but I am
>> guessing we don't need has_pirq(currd). Maybe this is sufficient:
>>
>> if ( is_hardware_domain(currd) )
>>     break;
>> else
>>     return -ENOSYS;
> 
> Maybe
> 
>     if ( !is_hardware_domain(currd) )
>         return -EOPNOTSUPP;
>     ASSERT(!has_pirq(currd));
>     break;
> 
> ? What I primarily dislike in both earlier proposals is the (imo
> confusing) use of "else".
It looks like more reasonable. I will change to as this in next version if there are no other guys have different opinions.

> 
> Jan

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC XEN PATCH v4 4/5] domctl: Use gsi to grant/revoke irq permission
  2024-01-08 15:05       ` Roger Pau Monné
@ 2024-01-09  8:18         ` Chen, Jiqian
  2024-01-09  9:38           ` Jan Beulich
  0 siblings, 1 reply; 35+ messages in thread
From: Chen, Jiqian @ 2024-01-09  8:18 UTC (permalink / raw)
  To: Roger Pau Monné, Jan Beulich, Stefano Stabellini
  Cc: xen-devel, Andrew Cooper, Wei Liu, George Dunlap, Julien Grall,
	Anthony PERARD, Juergen Gross, Hildebrand, Stewart, Huang, Ray,
	Chen, Jiqian

On 2024/1/8 23:05, Roger Pau Monné wrote:
> On Mon, Jan 08, 2024 at 09:55:26AM +0100, Jan Beulich wrote:
>> On 06.01.2024 02:08, Stefano Stabellini wrote:
>>> On Fri, 5 Jan 2024, Jiqian Chen wrote:
>>>> --- a/tools/libs/light/libxl_pci.c
>>>> +++ b/tools/libs/light/libxl_pci.c
>>>> @@ -1418,6 +1418,7 @@ static void pci_add_dm_done(libxl__egc *egc,
>>>>      unsigned long long start, end, flags, size;
>>>>      int irq, i;
>>>>      int r;
>>>> +    int gsi;
>>>>      uint32_t flag = XEN_DOMCTL_DEV_RDM_RELAXED;
>>>>      uint32_t domainid = domid;
>>>>      bool isstubdom = libxl_is_stubdom(ctx, domid, &domainid);
>>>> @@ -1486,6 +1487,7 @@ static void pci_add_dm_done(libxl__egc *egc,
>>>>          goto out_no_irq;
>>>>      }
>>>>      if ((fscanf(f, "%u", &irq) == 1) && irq) {
>>>> +        gsi = irq;
>>>
>>> A question for Roger and Jan: are we always guaranteed that gsi == irq
>>> (also in the PV case)?
>>
>> Iirc for IO-APIC based IRQs that's always the case;
> 
> I think that's always the case on Linux, because it calls
> PHYSDEVOP_map_pirq with index == pirq (see Linux
> pci_xen_initial_domain()).  But other OSes could possibly make the
> call with pirq == -1 and get a randomly allocated pirq for GSIs.
I don't think it's important whether pirq is randomly generated. What's important is whether irq always equals gsi in xen.
If so, we can directly pass in and grant gsi. However, according to Jan's previous email reply, in the case of Msi, irq may not be equal to gsi, so my patch cannot meet this situation.
I am confusing if the current domain doesn't have PIRQ flag, then regarding to XEN_DOMCTL_irq_permission, which kind of irq we should grant to caller domain? The gsi or other irq?
Or can we add a check in XEN_DOMCTL_irq_permission, if the current domain has PRIQ, we can get irq from pirq(like the original implementation), if not we can assign gsi to irq, and then grant irq. Of course, that needs to require the caller to pass in both the pirq and gsi.

> 
> IOW: I don't think the pirq field in xen_domctl_irq_permission can be
> altered like proposed here to switch from passing a pirq to a GSI.  A
> new hypercall should be introduced that either is GSI specific, or
> contains a type field in order to specify the namespace the field
> targets.
A new hypercall using for granting gsi? If so, how does the caller know to call which hypercall to grant permission, XEN_DOMCTL_irq_permission or that new hypercall?
I mean how the caller know if the current domain has PIRQ or not and when to pass in pirq number, when to pass in gsi number.

> 
> Thanks, Roger.

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC XEN PATCH v4 4/5] domctl: Use gsi to grant/revoke irq permission
  2024-01-09  8:18         ` Chen, Jiqian
@ 2024-01-09  9:38           ` Jan Beulich
  2024-01-09 10:16             ` Chen, Jiqian
  0 siblings, 1 reply; 35+ messages in thread
From: Jan Beulich @ 2024-01-09  9:38 UTC (permalink / raw)
  To: Chen, Jiqian
  Cc: xen-devel, Andrew Cooper, Wei Liu, George Dunlap, Julien Grall,
	Anthony PERARD, Juergen Gross, Hildebrand, Stewart, Huang, Ray,
	Roger Pau Monné,
	Stefano Stabellini

On 09.01.2024 09:18, Chen, Jiqian wrote:
> On 2024/1/8 23:05, Roger Pau Monné wrote:
>> On Mon, Jan 08, 2024 at 09:55:26AM +0100, Jan Beulich wrote:
>>> On 06.01.2024 02:08, Stefano Stabellini wrote:
>>>> On Fri, 5 Jan 2024, Jiqian Chen wrote:
>>>>> --- a/tools/libs/light/libxl_pci.c
>>>>> +++ b/tools/libs/light/libxl_pci.c
>>>>> @@ -1418,6 +1418,7 @@ static void pci_add_dm_done(libxl__egc *egc,
>>>>>      unsigned long long start, end, flags, size;
>>>>>      int irq, i;
>>>>>      int r;
>>>>> +    int gsi;
>>>>>      uint32_t flag = XEN_DOMCTL_DEV_RDM_RELAXED;
>>>>>      uint32_t domainid = domid;
>>>>>      bool isstubdom = libxl_is_stubdom(ctx, domid, &domainid);
>>>>> @@ -1486,6 +1487,7 @@ static void pci_add_dm_done(libxl__egc *egc,
>>>>>          goto out_no_irq;
>>>>>      }
>>>>>      if ((fscanf(f, "%u", &irq) == 1) && irq) {
>>>>> +        gsi = irq;
>>>>
>>>> A question for Roger and Jan: are we always guaranteed that gsi == irq
>>>> (also in the PV case)?
>>>
>>> Iirc for IO-APIC based IRQs that's always the case;
>>
>> I think that's always the case on Linux, because it calls
>> PHYSDEVOP_map_pirq with index == pirq (see Linux
>> pci_xen_initial_domain()).  But other OSes could possibly make the
>> call with pirq == -1 and get a randomly allocated pirq for GSIs.
> I don't think it's important whether pirq is randomly generated. What's important is whether irq always equals gsi in xen.
> If so, we can directly pass in and grant gsi. However, according to Jan's previous email reply, in the case of Msi, irq may not be equal to gsi, so my patch cannot meet this situation.
> I am confusing if the current domain doesn't have PIRQ flag, then regarding to XEN_DOMCTL_irq_permission, which kind of irq we should grant to caller domain? The gsi or other irq?
> Or can we add a check in XEN_DOMCTL_irq_permission, if the current domain has PRIQ, we can get irq from pirq(like the original implementation), if not we can assign gsi to irq, and then grant irq. Of course, that needs to require the caller to pass in both the pirq and gsi.

I expect MSI will need handling differently from GSIs. When MSI is
set up for a device (and hence for a domain in possession of that
device), access ought to be granted right away.

>> IOW: I don't think the pirq field in xen_domctl_irq_permission can be
>> altered like proposed here to switch from passing a pirq to a GSI.  A
>> new hypercall should be introduced that either is GSI specific, or
>> contains a type field in order to specify the namespace the field
>> targets.
> A new hypercall using for granting gsi? If so, how does the caller know to call which hypercall to grant permission, XEN_DOMCTL_irq_permission or that new hypercall?

Either we add a feature indicator, or the caller simply tries the
new GSI interface first.

> I mean how the caller know if the current domain has PIRQ or not and when to pass in pirq number, when to pass in gsi number.

An interface that's GSI-centric would only ever be passed a GSI
(of course, I'm inclined to add).

Jan


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC XEN PATCH v4 4/5] domctl: Use gsi to grant/revoke irq permission
  2024-01-09  9:38           ` Jan Beulich
@ 2024-01-09 10:16             ` Chen, Jiqian
  2024-01-09 10:40               ` Roger Pau Monné
  2024-01-09 10:46               ` Jan Beulich
  0 siblings, 2 replies; 35+ messages in thread
From: Chen, Jiqian @ 2024-01-09 10:16 UTC (permalink / raw)
  To: Jan Beulich
  Cc: xen-devel, Andrew Cooper, Wei Liu, George Dunlap, Julien Grall,
	Anthony PERARD, Juergen Gross, Hildebrand, Stewart, Huang, Ray,
	Roger Pau Monné,
	Stefano Stabellini, Chen, Jiqian

On 2024/1/9 17:38, Jan Beulich wrote:
> On 09.01.2024 09:18, Chen, Jiqian wrote:
>> On 2024/1/8 23:05, Roger Pau Monné wrote:
>>> On Mon, Jan 08, 2024 at 09:55:26AM +0100, Jan Beulich wrote:
>>>> On 06.01.2024 02:08, Stefano Stabellini wrote:
>>>>> On Fri, 5 Jan 2024, Jiqian Chen wrote:
>>>>>> --- a/tools/libs/light/libxl_pci.c
>>>>>> +++ b/tools/libs/light/libxl_pci.c
>>>>>> @@ -1418,6 +1418,7 @@ static void pci_add_dm_done(libxl__egc *egc,
>>>>>>      unsigned long long start, end, flags, size;
>>>>>>      int irq, i;
>>>>>>      int r;
>>>>>> +    int gsi;
>>>>>>      uint32_t flag = XEN_DOMCTL_DEV_RDM_RELAXED;
>>>>>>      uint32_t domainid = domid;
>>>>>>      bool isstubdom = libxl_is_stubdom(ctx, domid, &domainid);
>>>>>> @@ -1486,6 +1487,7 @@ static void pci_add_dm_done(libxl__egc *egc,
>>>>>>          goto out_no_irq;
>>>>>>      }
>>>>>>      if ((fscanf(f, "%u", &irq) == 1) && irq) {
>>>>>> +        gsi = irq;
>>>>>
>>>>> A question for Roger and Jan: are we always guaranteed that gsi == irq
>>>>> (also in the PV case)?
>>>>
>>>> Iirc for IO-APIC based IRQs that's always the case;
>>>
>>> I think that's always the case on Linux, because it calls
>>> PHYSDEVOP_map_pirq with index == pirq (see Linux
>>> pci_xen_initial_domain()).  But other OSes could possibly make the
>>> call with pirq == -1 and get a randomly allocated pirq for GSIs.
>> I don't think it's important whether pirq is randomly generated. What's important is whether irq always equals gsi in xen.
>> If so, we can directly pass in and grant gsi. However, according to Jan's previous email reply, in the case of Msi, irq may not be equal to gsi, so my patch cannot meet this situation.
>> I am confusing if the current domain doesn't have PIRQ flag, then regarding to XEN_DOMCTL_irq_permission, which kind of irq we should grant to caller domain? The gsi or other irq?
>> Or can we add a check in XEN_DOMCTL_irq_permission, if the current domain has PRIQ, we can get irq from pirq(like the original implementation), if not we can assign gsi to irq, and then grant irq. Of course, that needs to require the caller to pass in both the pirq and gsi.
> 
> I expect MSI will need handling differently from GSIs. When MSI is
> set up for a device (and hence for a domain in possession of that
> device), access ought to be granted right away.
> 
>>> IOW: I don't think the pirq field in xen_domctl_irq_permission can be
>>> altered like proposed here to switch from passing a pirq to a GSI.  A
>>> new hypercall should be introduced that either is GSI specific, or
>>> contains a type field in order to specify the namespace the field
>>> targets.
>> A new hypercall using for granting gsi? If so, how does the caller know to call which hypercall to grant permission, XEN_DOMCTL_irq_permission or that new hypercall?
> 
> Either we add a feature indicator, or the caller simply tries the
> new GSI interface first.
I am still not sure how to use and implement it.
Taking pci_add_dm_done as an example, for now its implementation is:
pci_add_dm_done
	xc_physdev_map_pirq
	xc_domain_irq_permission(,,pirq,)
		XEN_DOMCTL_irq_permission

And assume the new hypercall is XEN_DOMCTL_gsi_permission, do you mean:
pci_add_dm_done
	xc_physdev_map_pirq
	ret = xc_domain_gsi_permission(,,gsi,)
		XEN_DOMCTL_gsi_permission
	if ( ret != 0 )
		xc_domain_irq_permission(,,pirq,)
			XEN_DOMCTL_irq_permission
But if so, I have a question that in XEN_DOMCTL_gsi_permission, when to fail and when to success?

Or do you mean:
pci_add_dm_done
	xc_physdev_map_pirq
	ret = xc_domain_irq_permission(,,pirq,)
		XEN_DOMCTL_irq_permission
	if ( ret != 0 )
		xc_domain_gsi_permission(,,gsi,)
			XEN_DOMCTL_gsi_permission
And in XEN_DOMCTL_gsi_permission, as long as the current domain has the access of gsi, then granting gsi to caller should be successful. Right?

> 
>> I mean how the caller know if the current domain has PIRQ or not and when to pass in pirq number, when to pass in gsi number.
> 
> An interface that's GSI-centric would only ever be passed a GSI
> (of course, I'm inclined to add).
> 
> Jan

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC XEN PATCH v4 4/5] domctl: Use gsi to grant/revoke irq permission
  2024-01-09 10:16             ` Chen, Jiqian
@ 2024-01-09 10:40               ` Roger Pau Monné
  2024-01-09 10:46               ` Jan Beulich
  1 sibling, 0 replies; 35+ messages in thread
From: Roger Pau Monné @ 2024-01-09 10:40 UTC (permalink / raw)
  To: Chen, Jiqian
  Cc: Jan Beulich, xen-devel, Andrew Cooper, Wei Liu, George Dunlap,
	Julien Grall, Anthony PERARD, Juergen Gross, Hildebrand, Stewart,
	Huang, Ray, Stefano Stabellini

On Tue, Jan 09, 2024 at 10:16:26AM +0000, Chen, Jiqian wrote:
> On 2024/1/9 17:38, Jan Beulich wrote:
> > On 09.01.2024 09:18, Chen, Jiqian wrote:
> >> On 2024/1/8 23:05, Roger Pau Monné wrote:
> >>> On Mon, Jan 08, 2024 at 09:55:26AM +0100, Jan Beulich wrote:
> >>>> On 06.01.2024 02:08, Stefano Stabellini wrote:
> >>>>> On Fri, 5 Jan 2024, Jiqian Chen wrote:
> >>>>>> --- a/tools/libs/light/libxl_pci.c
> >>>>>> +++ b/tools/libs/light/libxl_pci.c
> >>>>>> @@ -1418,6 +1418,7 @@ static void pci_add_dm_done(libxl__egc *egc,
> >>>>>>      unsigned long long start, end, flags, size;
> >>>>>>      int irq, i;
> >>>>>>      int r;
> >>>>>> +    int gsi;
> >>>>>>      uint32_t flag = XEN_DOMCTL_DEV_RDM_RELAXED;
> >>>>>>      uint32_t domainid = domid;
> >>>>>>      bool isstubdom = libxl_is_stubdom(ctx, domid, &domainid);
> >>>>>> @@ -1486,6 +1487,7 @@ static void pci_add_dm_done(libxl__egc *egc,
> >>>>>>          goto out_no_irq;
> >>>>>>      }
> >>>>>>      if ((fscanf(f, "%u", &irq) == 1) && irq) {
> >>>>>> +        gsi = irq;
> >>>>>
> >>>>> A question for Roger and Jan: are we always guaranteed that gsi == irq
> >>>>> (also in the PV case)?
> >>>>
> >>>> Iirc for IO-APIC based IRQs that's always the case;
> >>>
> >>> I think that's always the case on Linux, because it calls
> >>> PHYSDEVOP_map_pirq with index == pirq (see Linux
> >>> pci_xen_initial_domain()).  But other OSes could possibly make the
> >>> call with pirq == -1 and get a randomly allocated pirq for GSIs.
> >> I don't think it's important whether pirq is randomly generated. What's important is whether irq always equals gsi in xen.

My 'randomly generated' comment was in that direction, not so much as
the pirq being truly random, but no longer matching the gsi.

> >> If so, we can directly pass in and grant gsi. However, according to Jan's previous email reply, in the case of Msi, irq may not be equal to gsi, so my patch cannot meet this situation.
> >> I am confusing if the current domain doesn't have PIRQ flag, then regarding to XEN_DOMCTL_irq_permission, which kind of irq we should grant to caller domain? The gsi or other irq?
> >> Or can we add a check in XEN_DOMCTL_irq_permission, if the current domain has PRIQ, we can get irq from pirq(like the original implementation), if not we can assign gsi to irq, and then grant irq. Of course, that needs to require the caller to pass in both the pirq and gsi.
> > 
> > I expect MSI will need handling differently from GSIs. When MSI is
> > set up for a device (and hence for a domain in possession of that
> > device), access ought to be granted right away.
> > 
> >>> IOW: I don't think the pirq field in xen_domctl_irq_permission can be
> >>> altered like proposed here to switch from passing a pirq to a GSI.  A
> >>> new hypercall should be introduced that either is GSI specific, or
> >>> contains a type field in order to specify the namespace the field
> >>> targets.
> >> A new hypercall using for granting gsi? If so, how does the caller know to call which hypercall to grant permission, XEN_DOMCTL_irq_permission or that new hypercall?
> > 
> > Either we add a feature indicator, or the caller simply tries the
> > new GSI interface first.
> I am still not sure how to use and implement it.
> Taking pci_add_dm_done as an example, for now its implementation is:
> pci_add_dm_done
> 	xc_physdev_map_pirq
> 	xc_domain_irq_permission(,,pirq,)
> 		XEN_DOMCTL_irq_permission
> 
> And assume the new hypercall is XEN_DOMCTL_gsi_permission, do you mean:
> pci_add_dm_done
> 	xc_physdev_map_pirq
> 	ret = xc_domain_gsi_permission(,,gsi,)
> 		XEN_DOMCTL_gsi_permission

Making this first call would also depend on whether the 'gsi' can be
fetched from sysfs, otherwise the call should be avoided.

> 	if ( ret != 0 )
> 		xc_domain_irq_permission(,,pirq,)
> 			XEN_DOMCTL_irq_permission

You can only fallback when you have the pirq (ie: when in a PV dom0),
on a PVH dom0 there's no pirq, so no fallback.

The code in pci_add_dm_done() must be aware of this, and only fallback
when in PV dom0.

> But if so, I have a question that in XEN_DOMCTL_gsi_permission, when to fail and when to success?
> 
> Or do you mean:
> pci_add_dm_done
> 	xc_physdev_map_pirq
> 	ret = xc_domain_irq_permission(,,pirq,)
> 		XEN_DOMCTL_irq_permission
> 	if ( ret != 0 )
> 		xc_domain_gsi_permission(,,gsi,)
> 			XEN_DOMCTL_gsi_permission
> And in XEN_DOMCTL_gsi_permission, as long as the current domain has the access of gsi, then granting gsi to caller should be successful. Right?

No, I think that would be backwards, the logic above looks correct
regarding the ordering of the hypercalls.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC XEN PATCH v4 4/5] domctl: Use gsi to grant/revoke irq permission
  2024-01-09 10:16             ` Chen, Jiqian
  2024-01-09 10:40               ` Roger Pau Monné
@ 2024-01-09 10:46               ` Jan Beulich
  2024-01-10  8:49                 ` Chen, Jiqian
  1 sibling, 1 reply; 35+ messages in thread
From: Jan Beulich @ 2024-01-09 10:46 UTC (permalink / raw)
  To: Chen, Jiqian
  Cc: xen-devel, Andrew Cooper, Wei Liu, George Dunlap, Julien Grall,
	Anthony PERARD, Juergen Gross, Hildebrand, Stewart, Huang, Ray,
	Roger Pau Monné,
	Stefano Stabellini

On 09.01.2024 11:16, Chen, Jiqian wrote:
> On 2024/1/9 17:38, Jan Beulich wrote:
>> On 09.01.2024 09:18, Chen, Jiqian wrote:
>>> A new hypercall using for granting gsi? If so, how does the caller know to call which hypercall to grant permission, XEN_DOMCTL_irq_permission or that new hypercall?
>>
>> Either we add a feature indicator, or the caller simply tries the
>> new GSI interface first.
> I am still not sure how to use and implement it.
> Taking pci_add_dm_done as an example, for now its implementation is:
> pci_add_dm_done
> 	xc_physdev_map_pirq
> 	xc_domain_irq_permission(,,pirq,)
> 		XEN_DOMCTL_irq_permission
> 
> And assume the new hypercall is XEN_DOMCTL_gsi_permission, do you mean:
> pci_add_dm_done
> 	xc_physdev_map_pirq
> 	ret = xc_domain_gsi_permission(,,gsi,)
> 		XEN_DOMCTL_gsi_permission
> 	if ( ret != 0 )
> 		xc_domain_irq_permission(,,pirq,)
> 			XEN_DOMCTL_irq_permission

No, falling back shouldn't be "blind". Fallback should only happen
when the new sub-op isn't implemented (hence why a feature indicator
may be necessary), and only if calling the existing sub-op promises
to be useful (which iirc would limit that to the PV Dom0 case).

> But if so, I have a question that in XEN_DOMCTL_gsi_permission, when to fail and when to success?

I'm afraid I don't understand the question. Behavior there isn't to
be fundamentally different from that for XEN_DOMCTL_irq_permission.
It's just that the incoming value is in another value space.

> Or do you mean:
> pci_add_dm_done
> 	xc_physdev_map_pirq
> 	ret = xc_domain_irq_permission(,,pirq,)
> 		XEN_DOMCTL_irq_permission
> 	if ( ret != 0 )
> 		xc_domain_gsi_permission(,,gsi,)
> 			XEN_DOMCTL_gsi_permission

No, this looks the wrong way round.

> And in XEN_DOMCTL_gsi_permission, as long as the current domain has the access of gsi, then granting gsi to caller should be successful. Right?

I think so; see above.

Jan


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC XEN PATCH v4 1/5] xen/vpci: Clear all vpci status of device
  2024-01-05  7:09 ` [RFC XEN PATCH v4 1/5] xen/vpci: Clear all vpci status of device Jiqian Chen
@ 2024-01-09 15:24   ` Stewart Hildebrand
  2024-01-10  6:24     ` Chen, Jiqian
  0 siblings, 1 reply; 35+ messages in thread
From: Stewart Hildebrand @ 2024-01-09 15:24 UTC (permalink / raw)
  To: Jiqian Chen, xen-devel
  Cc: Jan Beulich, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Julien Grall, Stefano Stabellini,
	Anthony PERARD, Juergen Gross, Huang Rui

On 1/5/24 02:09, Jiqian Chen wrote:
> diff --git a/xen/drivers/pci/physdev.c b/xen/drivers/pci/physdev.c
> index 42db3e6d133c..552ccbf747cb 100644
> --- a/xen/drivers/pci/physdev.c
> +++ b/xen/drivers/pci/physdev.c
> @@ -67,6 +68,39 @@ ret_t pci_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>          break;
>      }
>  
> +    case PHYSDEVOP_pci_device_state_reset: {
> +        struct physdev_pci_device dev;
> +        struct pci_dev *pdev;
> +        pci_sbdf_t sbdf;
> +
> +        if ( !is_pci_passthrough_enabled() )
> +            return -EOPNOTSUPP;
> +
> +        ret = -EFAULT;
> +        if ( copy_from_guest(&dev, arg, 1) != 0 )
> +            break;
> +        sbdf = PCI_SBDF(dev.seg, dev.bus, dev.devfn);
> +
> +        ret = xsm_resource_setup_pci(XSM_PRIV, sbdf.sbdf);
> +        if ( ret )
> +            break;
> +
> +        pcidevs_lock();
> +        pdev = pci_get_pdev(NULL, sbdf);
> +        if ( !pdev )
> +        {
> +            pcidevs_unlock();
> +            ret = -ENODEV;
> +            break;
> +        }
> +

        write_lock(&pdev->domain->pci_lock);

> +        ret = vpci_reset_device_state(pdev);

        write_unlock(&pdev->domain->pci_lock);

> +        pcidevs_unlock();
> +        if ( ret )
> +            printk(XENLOG_ERR "%pp: failed to reset PCI device state\n", &sbdf);
> +        break;
> +    }
> +
>      default:
>          ret = -ENOSYS;
>          break;
> diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
> index 72ef277c4f8e..3c64cb10ccbb 100644
> --- a/xen/drivers/vpci/vpci.c
> +++ b/xen/drivers/vpci/vpci.c
> @@ -107,6 +107,15 @@ int vpci_add_handlers(struct pci_dev *pdev)
>  
>      return rc;
>  }
> +
> +int vpci_reset_device_state(struct pci_dev *pdev)
> +{
> +    ASSERT(pcidevs_locked());

    ASSERT(rw_is_write_locked(&pdev->domain->pci_lock));

> +
> +    vpci_remove_device(pdev);
> +    return vpci_add_handlers(pdev);
> +}
> +
>  #endif /* __XEN__ */
>  
>  static int vpci_register_cmp(const struct vpci_register *r1,


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC XEN PATCH v4 1/5] xen/vpci: Clear all vpci status of device
  2024-01-09 15:24   ` Stewart Hildebrand
@ 2024-01-10  6:24     ` Chen, Jiqian
  2024-01-10 14:09       ` Stewart Hildebrand
  0 siblings, 1 reply; 35+ messages in thread
From: Chen, Jiqian @ 2024-01-10  6:24 UTC (permalink / raw)
  To: Hildebrand, Stewart
  Cc: xen-devel, Jan Beulich, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Julien Grall, Stefano Stabellini,
	Anthony PERARD, Juergen Gross, Huang, Ray, Chen, Jiqian

On 2024/1/9 23:24, Stewart Hildebrand wrote:
> On 1/5/24 02:09, Jiqian Chen wrote:
>> diff --git a/xen/drivers/pci/physdev.c b/xen/drivers/pci/physdev.c
>> index 42db3e6d133c..552ccbf747cb 100644
>> --- a/xen/drivers/pci/physdev.c
>> +++ b/xen/drivers/pci/physdev.c
>> @@ -67,6 +68,39 @@ ret_t pci_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>          break;
>>      }
>>  
>> +    case PHYSDEVOP_pci_device_state_reset: {
>> +        struct physdev_pci_device dev;
>> +        struct pci_dev *pdev;
>> +        pci_sbdf_t sbdf;
>> +
>> +        if ( !is_pci_passthrough_enabled() )
>> +            return -EOPNOTSUPP;
>> +
>> +        ret = -EFAULT;
>> +        if ( copy_from_guest(&dev, arg, 1) != 0 )
>> +            break;
>> +        sbdf = PCI_SBDF(dev.seg, dev.bus, dev.devfn);
>> +
>> +        ret = xsm_resource_setup_pci(XSM_PRIV, sbdf.sbdf);
>> +        if ( ret )
>> +            break;
>> +
>> +        pcidevs_lock();
>> +        pdev = pci_get_pdev(NULL, sbdf);
>> +        if ( !pdev )
>> +        {
>> +            pcidevs_unlock();
>> +            ret = -ENODEV;
>> +            break;
>> +        }
>> +
> 
>         write_lock(&pdev->domain->pci_lock);
> 
>> +        ret = vpci_reset_device_state(pdev);
> 
>         write_unlock(&pdev->domain->pci_lock);
vpci_reset_device_state only reset the vpci state of pdev without deleting pdev from domain, and here has held pcidevs_lock, it has no need to lock pci_lock?

> 
>> +        pcidevs_unlock();
>> +        if ( ret )
>> +            printk(XENLOG_ERR "%pp: failed to reset PCI device state\n", &sbdf);
>> +        break;
>> +    }
>> +
>>      default:
>>          ret = -ENOSYS;
>>          break;
>> diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
>> index 72ef277c4f8e..3c64cb10ccbb 100644
>> --- a/xen/drivers/vpci/vpci.c
>> +++ b/xen/drivers/vpci/vpci.c
>> @@ -107,6 +107,15 @@ int vpci_add_handlers(struct pci_dev *pdev)
>>  
>>      return rc;
>>  }
>> +
>> +int vpci_reset_device_state(struct pci_dev *pdev)
>> +{
>> +    ASSERT(pcidevs_locked());
> 
>     ASSERT(rw_is_write_locked(&pdev->domain->pci_lock));
> 
>> +
>> +    vpci_remove_device(pdev);
>> +    return vpci_add_handlers(pdev);
>> +}
>> +
>>  #endif /* __XEN__ */
>>  
>>  static int vpci_register_cmp(const struct vpci_register *r1,

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC XEN PATCH v4 4/5] domctl: Use gsi to grant/revoke irq permission
  2024-01-09 10:46               ` Jan Beulich
@ 2024-01-10  8:49                 ` Chen, Jiqian
  0 siblings, 0 replies; 35+ messages in thread
From: Chen, Jiqian @ 2024-01-10  8:49 UTC (permalink / raw)
  To: Jan Beulich, Roger Pau Monné
  Cc: xen-devel, Andrew Cooper, Wei Liu, George Dunlap, Julien Grall,
	Anthony PERARD, Juergen Gross, Hildebrand, Stewart, Huang, Ray,
	Stefano Stabellini, Chen, Jiqian

Thank Jan and Roger, I may know how to add a new hypercall XEN_DOMCTL_gsi_permission, I will implement it in next version.

On 2024/1/9 18:46, Jan Beulich wrote:
> On 09.01.2024 11:16, Chen, Jiqian wrote:
>> On 2024/1/9 17:38, Jan Beulich wrote:
>>> On 09.01.2024 09:18, Chen, Jiqian wrote:
>>>> A new hypercall using for granting gsi? If so, how does the caller know to call which hypercall to grant permission, XEN_DOMCTL_irq_permission or that new hypercall?
>>>
>>> Either we add a feature indicator, or the caller simply tries the
>>> new GSI interface first.
>> I am still not sure how to use and implement it.
>> Taking pci_add_dm_done as an example, for now its implementation is:
>> pci_add_dm_done
>> 	xc_physdev_map_pirq
>> 	xc_domain_irq_permission(,,pirq,)
>> 		XEN_DOMCTL_irq_permission
>>
>> And assume the new hypercall is XEN_DOMCTL_gsi_permission, do you mean:
>> pci_add_dm_done
>> 	xc_physdev_map_pirq
>> 	ret = xc_domain_gsi_permission(,,gsi,)
>> 		XEN_DOMCTL_gsi_permission
>> 	if ( ret != 0 )
>> 		xc_domain_irq_permission(,,pirq,)
>> 			XEN_DOMCTL_irq_permission
> 
> No, falling back shouldn't be "blind". Fallback should only happen
> when the new sub-op isn't implemented (hence why a feature indicator
> may be necessary), and only if calling the existing sub-op promises
> to be useful (which iirc would limit that to the PV Dom0 case).
> 
>> But if so, I have a question that in XEN_DOMCTL_gsi_permission, when to fail and when to success?
> 
> I'm afraid I don't understand the question. Behavior there isn't to
> be fundamentally different from that for XEN_DOMCTL_irq_permission.
> It's just that the incoming value is in another value space.
> 
>> Or do you mean:
>> pci_add_dm_done
>> 	xc_physdev_map_pirq
>> 	ret = xc_domain_irq_permission(,,pirq,)
>> 		XEN_DOMCTL_irq_permission
>> 	if ( ret != 0 )
>> 		xc_domain_gsi_permission(,,gsi,)
>> 			XEN_DOMCTL_gsi_permission
> 
> No, this looks the wrong way round.
> 
>> And in XEN_DOMCTL_gsi_permission, as long as the current domain has the access of gsi, then granting gsi to caller should be successful. Right?
> 
> I think so; see above.
> 
> Jan

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC XEN PATCH v4 1/5] xen/vpci: Clear all vpci status of device
  2024-01-10  6:24     ` Chen, Jiqian
@ 2024-01-10 14:09       ` Stewart Hildebrand
  2024-01-11  2:36         ` Chen, Jiqian
  0 siblings, 1 reply; 35+ messages in thread
From: Stewart Hildebrand @ 2024-01-10 14:09 UTC (permalink / raw)
  To: Chen, Jiqian
  Cc: xen-devel, Jan Beulich, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Julien Grall, Stefano Stabellini,
	Anthony PERARD, Juergen Gross, Huang, Ray

On 1/10/24 01:24, Chen, Jiqian wrote:
> On 2024/1/9 23:24, Stewart Hildebrand wrote:
>> On 1/5/24 02:09, Jiqian Chen wrote:
>>> diff --git a/xen/drivers/pci/physdev.c b/xen/drivers/pci/physdev.c
>>> index 42db3e6d133c..552ccbf747cb 100644
>>> --- a/xen/drivers/pci/physdev.c
>>> +++ b/xen/drivers/pci/physdev.c
>>> @@ -67,6 +68,39 @@ ret_t pci_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>>          break;
>>>      }
>>>  
>>> +    case PHYSDEVOP_pci_device_state_reset: {
>>> +        struct physdev_pci_device dev;
>>> +        struct pci_dev *pdev;
>>> +        pci_sbdf_t sbdf;
>>> +
>>> +        if ( !is_pci_passthrough_enabled() )
>>> +            return -EOPNOTSUPP;
>>> +
>>> +        ret = -EFAULT;
>>> +        if ( copy_from_guest(&dev, arg, 1) != 0 )
>>> +            break;
>>> +        sbdf = PCI_SBDF(dev.seg, dev.bus, dev.devfn);
>>> +
>>> +        ret = xsm_resource_setup_pci(XSM_PRIV, sbdf.sbdf);
>>> +        if ( ret )
>>> +            break;
>>> +
>>> +        pcidevs_lock();
>>> +        pdev = pci_get_pdev(NULL, sbdf);
>>> +        if ( !pdev )
>>> +        {
>>> +            pcidevs_unlock();
>>> +            ret = -ENODEV;
>>> +            break;
>>> +        }
>>> +
>>
>>         write_lock(&pdev->domain->pci_lock);
>>
>>> +        ret = vpci_reset_device_state(pdev);
>>
>>         write_unlock(&pdev->domain->pci_lock);
> vpci_reset_device_state only reset the vpci state of pdev without deleting pdev from domain, and here has held pcidevs_lock, it has no need to lock pci_lock?

Strictly speaking, it is not enforced yet. However, an upcoming change [1] will expand the scope of d->pci_lock to include protecting the pdev->vpci structure to an extent, so it will be required once that change is committed. In my opinion there is no harm in adding the additional lock now. If you prefer to wait I would not object, but in this case I would at least ask for a TODO comment to help remind us to address it later.

[1] https://lists.xenproject.org/archives/html/xen-devel/2024-01/msg00446.html

> 
>>
>>> +        pcidevs_unlock();
>>> +        if ( ret )
>>> +            printk(XENLOG_ERR "%pp: failed to reset PCI device state\n", &sbdf);
>>> +        break;
>>> +    }
>>> +
>>>      default:
>>>          ret = -ENOSYS;
>>>          break;
>>> diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
>>> index 72ef277c4f8e..3c64cb10ccbb 100644
>>> --- a/xen/drivers/vpci/vpci.c
>>> +++ b/xen/drivers/vpci/vpci.c
>>> @@ -107,6 +107,15 @@ int vpci_add_handlers(struct pci_dev *pdev)
>>>  
>>>      return rc;
>>>  }
>>> +
>>> +int vpci_reset_device_state(struct pci_dev *pdev)
>>> +{
>>> +    ASSERT(pcidevs_locked());
>>
>>     ASSERT(rw_is_write_locked(&pdev->domain->pci_lock));
>>
>>> +
>>> +    vpci_remove_device(pdev);
>>> +    return vpci_add_handlers(pdev);
>>> +}
>>> +
>>>  #endif /* __XEN__ */
>>>  
>>>  static int vpci_register_cmp(const struct vpci_register *r1,
> 


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC XEN PATCH v4 4/5] domctl: Use gsi to grant/revoke irq permission
  2024-01-05  7:09 ` [RFC XEN PATCH v4 4/5] domctl: Use gsi to grant/revoke irq permission Jiqian Chen
  2024-01-06  1:08   ` Stefano Stabellini
@ 2024-01-10 20:09   ` Stewart Hildebrand
  2024-01-11  2:39     ` Chen, Jiqian
  1 sibling, 1 reply; 35+ messages in thread
From: Stewart Hildebrand @ 2024-01-10 20:09 UTC (permalink / raw)
  To: Jiqian Chen, xen-devel
  Cc: Jan Beulich, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Julien Grall, Stefano Stabellini,
	Anthony PERARD, Juergen Gross, Huang Rui

On 1/5/24 02:09, Jiqian Chen wrote:
> diff --git a/xen/common/domctl.c b/xen/common/domctl.c
> index f5a71ee5f78d..eeb975bd0194 100644
> --- a/xen/common/domctl.c
> +++ b/xen/common/domctl.c
> @@ -653,12 +653,20 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
>          unsigned int pirq = op->u.irq_permission.pirq, irq;
>          int allow = op->u.irq_permission.allow_access;
>  
> -        if ( pirq >= current->domain->nr_pirqs )
> +        if ( pirq >= nr_irqs_gsi )

This doesn't build on ARM, as nr_irqs_gsi is x86 only. This is a wild guess: we may want keep the existing current->domain->nr_pirqs check, then add the new nr_irqs_gsi check wrapped in #ifdef CONFIG_X86.

>          {
>              ret = -EINVAL;
>              break;
>          }
> -        irq = pirq_access_permitted(current->domain, pirq);
> +
> +        if ( irq_access_permitted(current->domain, pirq) )
> +            irq = pirq;
> +        else
> +        {
> +            ret = -EPERM;
> +            break;
> +        }
> +
>          if ( !irq || xsm_irq_permission(XSM_HOOK, d, irq, allow) )
>              ret = -EPERM;
>          else if ( allow )


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC XEN PATCH v4 1/5] xen/vpci: Clear all vpci status of device
  2024-01-10 14:09       ` Stewart Hildebrand
@ 2024-01-11  2:36         ` Chen, Jiqian
  0 siblings, 0 replies; 35+ messages in thread
From: Chen, Jiqian @ 2024-01-11  2:36 UTC (permalink / raw)
  To: Hildebrand, Stewart
  Cc: xen-devel, Jan Beulich, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Julien Grall, Stefano Stabellini,
	Anthony PERARD, Juergen Gross, Huang, Ray, Chen, Jiqian

On 2024/1/10 22:09, Stewart Hildebrand wrote:
> On 1/10/24 01:24, Chen, Jiqian wrote:
>> On 2024/1/9 23:24, Stewart Hildebrand wrote:
>>> On 1/5/24 02:09, Jiqian Chen wrote:
>>>> diff --git a/xen/drivers/pci/physdev.c b/xen/drivers/pci/physdev.c
>>>> index 42db3e6d133c..552ccbf747cb 100644
>>>> --- a/xen/drivers/pci/physdev.c
>>>> +++ b/xen/drivers/pci/physdev.c
>>>> @@ -67,6 +68,39 @@ ret_t pci_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>>>          break;
>>>>      }
>>>>  
>>>> +    case PHYSDEVOP_pci_device_state_reset: {
>>>> +        struct physdev_pci_device dev;
>>>> +        struct pci_dev *pdev;
>>>> +        pci_sbdf_t sbdf;
>>>> +
>>>> +        if ( !is_pci_passthrough_enabled() )
>>>> +            return -EOPNOTSUPP;
>>>> +
>>>> +        ret = -EFAULT;
>>>> +        if ( copy_from_guest(&dev, arg, 1) != 0 )
>>>> +            break;
>>>> +        sbdf = PCI_SBDF(dev.seg, dev.bus, dev.devfn);
>>>> +
>>>> +        ret = xsm_resource_setup_pci(XSM_PRIV, sbdf.sbdf);
>>>> +        if ( ret )
>>>> +            break;
>>>> +
>>>> +        pcidevs_lock();
>>>> +        pdev = pci_get_pdev(NULL, sbdf);
>>>> +        if ( !pdev )
>>>> +        {
>>>> +            pcidevs_unlock();
>>>> +            ret = -ENODEV;
>>>> +            break;
>>>> +        }
>>>> +
>>>
>>>         write_lock(&pdev->domain->pci_lock);
>>>
>>>> +        ret = vpci_reset_device_state(pdev);
>>>
>>>         write_unlock(&pdev->domain->pci_lock);
>> vpci_reset_device_state only reset the vpci state of pdev without deleting pdev from domain, and here has held pcidevs_lock, it has no need to lock pci_lock?
> 
> Strictly speaking, it is not enforced yet. However, an upcoming change [1] will expand the scope of d->pci_lock to include protecting the pdev->vpci structure to an extent, so it will be required once that change is committed. In my opinion there is no harm in adding the additional lock now. If you prefer to wait I would not object, but in this case I would at least ask for a TODO comment to help remind us to address it later.
> 
> [1] https://lists.xenproject.org/archives/html/xen-devel/2024-01/msg00446.html
> 
Ok, I see. I will add pci_lock in next version, thank you for reminding me.

>>
>>>
>>>> +        pcidevs_unlock();
>>>> +        if ( ret )
>>>> +            printk(XENLOG_ERR "%pp: failed to reset PCI device state\n", &sbdf);
>>>> +        break;
>>>> +    }
>>>> +
>>>>      default:
>>>>          ret = -ENOSYS;
>>>>          break;
>>>> diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
>>>> index 72ef277c4f8e..3c64cb10ccbb 100644
>>>> --- a/xen/drivers/vpci/vpci.c
>>>> +++ b/xen/drivers/vpci/vpci.c
>>>> @@ -107,6 +107,15 @@ int vpci_add_handlers(struct pci_dev *pdev)
>>>>  
>>>>      return rc;
>>>>  }
>>>> +
>>>> +int vpci_reset_device_state(struct pci_dev *pdev)
>>>> +{
>>>> +    ASSERT(pcidevs_locked());
>>>
>>>     ASSERT(rw_is_write_locked(&pdev->domain->pci_lock));
>>>
>>>> +
>>>> +    vpci_remove_device(pdev);
>>>> +    return vpci_add_handlers(pdev);
>>>> +}
>>>> +
>>>>  #endif /* __XEN__ */
>>>>  
>>>>  static int vpci_register_cmp(const struct vpci_register *r1,
>>

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC XEN PATCH v4 4/5] domctl: Use gsi to grant/revoke irq permission
  2024-01-10 20:09   ` Stewart Hildebrand
@ 2024-01-11  2:39     ` Chen, Jiqian
  0 siblings, 0 replies; 35+ messages in thread
From: Chen, Jiqian @ 2024-01-11  2:39 UTC (permalink / raw)
  To: Hildebrand, Stewart
  Cc: xen-devel, Jan Beulich, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Julien Grall, Stefano Stabellini,
	Anthony PERARD, Juergen Gross, Huang, Ray, Chen, Jiqian

On 2024/1/11 04:09, Stewart Hildebrand wrote:
> On 1/5/24 02:09, Jiqian Chen wrote:
>> diff --git a/xen/common/domctl.c b/xen/common/domctl.c
>> index f5a71ee5f78d..eeb975bd0194 100644
>> --- a/xen/common/domctl.c
>> +++ b/xen/common/domctl.c
>> @@ -653,12 +653,20 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
>>          unsigned int pirq = op->u.irq_permission.pirq, irq;
>>          int allow = op->u.irq_permission.allow_access;
>>  
>> -        if ( pirq >= current->domain->nr_pirqs )
>> +        if ( pirq >= nr_irqs_gsi )
> 
> This doesn't build on ARM, as nr_irqs_gsi is x86 only. This is a wild guess: we may want keep the existing current->domain->nr_pirqs check, then add the new nr_irqs_gsi check wrapped in #ifdef CONFIG_X86.
As I will add a new hypercall in next version, then this change will not exist, thank you.

> 
>>          {
>>              ret = -EINVAL;
>>              break;
>>          }
>> -        irq = pirq_access_permitted(current->domain, pirq);
>> +
>> +        if ( irq_access_permitted(current->domain, pirq) )
>> +            irq = pirq;
>> +        else
>> +        {
>> +            ret = -EPERM;
>> +            break;
>> +        }
>> +
>>          if ( !irq || xsm_irq_permission(XSM_HOOK, d, irq, allow) )
>>              ret = -EPERM;
>>          else if ( allow )

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2024-01-11  2:39 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-01-05  7:09 [RFC XEN PATCH v4 0/5] Support device passthrough when dom0 is PVH on Xen Jiqian Chen
2024-01-05  7:09 ` [RFC XEN PATCH v4 1/5] xen/vpci: Clear all vpci status of device Jiqian Chen
2024-01-09 15:24   ` Stewart Hildebrand
2024-01-10  6:24     ` Chen, Jiqian
2024-01-10 14:09       ` Stewart Hildebrand
2024-01-11  2:36         ` Chen, Jiqian
2024-01-05  7:09 ` [RFC XEN PATCH v4 2/5] x86/pvh: Allow (un)map_pirq when caller isn't DOMID_SELF Jiqian Chen
2024-01-06  0:46   ` Stefano Stabellini
2024-01-08  8:47     ` Jan Beulich
2024-01-08  9:15       ` Chen, Jiqian
2024-01-08  9:25         ` Jan Beulich
2024-01-09  7:58           ` Chen, Jiqian
2024-01-05  7:09 ` [RFC XEN PATCH v4 3/5] x86/pvh: Add PHYSDEVOP_setup_gsi for PVH dom0 Jiqian Chen
2024-01-06  0:54   ` Stefano Stabellini
2024-01-08  8:50     ` Jan Beulich
2024-01-09  8:01       ` Chen, Jiqian
2024-01-05  7:09 ` [RFC XEN PATCH v4 4/5] domctl: Use gsi to grant/revoke irq permission Jiqian Chen
2024-01-06  1:08   ` Stefano Stabellini
2024-01-06  1:12     ` Stefano Stabellini
2024-01-08  3:43     ` Chen, Jiqian
2024-01-08  8:55     ` Jan Beulich
2024-01-08 15:05       ` Roger Pau Monné
2024-01-09  8:18         ` Chen, Jiqian
2024-01-09  9:38           ` Jan Beulich
2024-01-09 10:16             ` Chen, Jiqian
2024-01-09 10:40               ` Roger Pau Monné
2024-01-09 10:46               ` Jan Beulich
2024-01-10  8:49                 ` Chen, Jiqian
2024-01-10 20:09   ` Stewart Hildebrand
2024-01-11  2:39     ` Chen, Jiqian
2024-01-05  7:09 ` [RFC XEN PATCH v4 5/5] libxl: Use gsi instead of irq for mapping pirq Jiqian Chen
2024-01-05  7:36   ` Jan Beulich
2024-01-05  7:59     ` Chen, Jiqian
2024-01-06  1:11   ` Stefano Stabellini
2024-01-08  6:02     ` Chen, Jiqian

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.