All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/2] hw/xen/xen_pt: Call default handler only if no custom one is set
@ 2022-11-14 19:20 Marek Marczykowski-Górecki
  2022-11-14 19:20 ` [PATCH 2/2] Do not access /dev/mem in MSI-X PCI passthrough on Xen Marek Marczykowski-Górecki
  2022-11-22 17:12   ` Anthony PERARD
  0 siblings, 2 replies; 16+ messages in thread
From: Marek Marczykowski-Górecki @ 2022-11-14 19:20 UTC (permalink / raw)
  To: qemu-devel
  Cc: Marek Marczykowski-Górecki, Stefano Stabellini,
	Anthony Perard, Paul Durrant, open list:X86 Xen CPUs

Call pci_default_write_config() in xen_pt_pci_write_config() only for
registers that do not have custom handler, and do that only after
resolving them. This is important for two reasons:
1. XenPTRegInfo has ro_mask which needs to be enforced - Xen-specific
   hooks do that on their own (especially xen_pt_*_reg_write()).
2. Not setting value early allows the hooks to see the old value too.

If it would be only about the first point, setting PCIDevice.wmask would
probably be sufficient, but given the second point, restructure those
writes.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
---
 hw/xen/xen_pt.c | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
index 0ec7e52183..269bd26109 100644
--- a/hw/xen/xen_pt.c
+++ b/hw/xen/xen_pt.c
@@ -255,6 +255,7 @@ static void xen_pt_pci_write_config(PCIDevice *d, uint32_t addr,
     uint32_t find_addr = addr;
     XenPTRegInfo *reg = NULL;
     bool wp_flag = false;
+    uint32_t emul_mask = 0, write_val;
 
     if (xen_pt_pci_config_access_check(d, addr, len)) {
         return;
@@ -310,7 +311,6 @@ static void xen_pt_pci_write_config(PCIDevice *d, uint32_t addr,
     }
 
     memory_region_transaction_begin();
-    pci_default_write_config(d, addr, val, len);
 
     /* adjust the read and write value to appropriate CFC-CFF window */
     read_val <<= (addr & 3) << 3;
@@ -370,6 +370,8 @@ static void xen_pt_pci_write_config(PCIDevice *d, uint32_t addr,
                 return;
             }
 
+            emul_mask |= ( (1 << (reg->size * 8) ) - 1 ) << ((find_addr & 3) * 8);
+
             /* calculate next address to find */
             emul_len -= reg->size;
             if (emul_len > 0) {
@@ -396,6 +398,24 @@ static void xen_pt_pci_write_config(PCIDevice *d, uint32_t addr,
     /* need to shift back before passing them to xen_host_pci_set_block. */
     val >>= (addr & 3) << 3;
 
+    /* store emulated registers that didn't have specific hooks */
+    write_val = val;
+    for (index = 0; emul_mask; index += emul_len) {
+        emul_len = 0;
+        while (emul_mask & 0xff) {
+            emul_len++;
+            emul_mask >>= 8;
+        }
+        if (emul_len) {
+            uint32_t mask = ((1 << (emul_len * 8)) - 1);
+            pci_default_write_config(d, addr, write_val & mask, emul_len);
+            write_val >>= emul_len * 8;
+        } else {
+            emul_mask >>= 8;
+            write_val >>= 8;
+        }
+    }
+
     memory_region_transaction_commit();
 
 out:
-- 
2.37.3



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 2/2] Do not access /dev/mem in MSI-X PCI passthrough on Xen
  2022-11-14 19:20 [PATCH 1/2] hw/xen/xen_pt: Call default handler only if no custom one is set Marek Marczykowski-Górecki
@ 2022-11-14 19:20 ` Marek Marczykowski-Górecki
  2022-11-14 19:39   ` Andrew Cooper
                     ` (2 more replies)
  2022-11-22 17:12   ` Anthony PERARD
  1 sibling, 3 replies; 16+ messages in thread
From: Marek Marczykowski-Górecki @ 2022-11-14 19:20 UTC (permalink / raw)
  To: qemu-devel
  Cc: Marek Marczykowski-Górecki, Stefano Stabellini,
	Anthony Perard, Paul Durrant, open list:X86 Xen CPUs

The /dev/mem is used for two purposes:
 - reading PCI_MSIX_ENTRY_CTRL_MASKBIT
 - reading Pending Bit Array (PBA)

The first one was originally done because when Xen did not send all
vector ctrl writes to the device model, so QEMU might have outdated old
register value. This has been changed in Xen, so QEMU can now use its
cached value of the register instead.

The Pending Bit Array (PBA) handling is for the case where it lives on
the same page as the MSI-X table itself. Xen has been extended to handle
this case too (as well as other registers that may live on those pages),
so QEMU handling is not necessary anymore.

Removing /dev/mem access is useful to work within stubdomain, and
necessary when dom0 kernel runs in lockdown mode.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
---
 hw/xen/xen_pt.h     |  1 -
 hw/xen/xen_pt_msi.c | 51 ++++-----------------------------------------
 2 files changed, 4 insertions(+), 48 deletions(-)

diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h
index e7c4316a7d..de4094e7ec 100644
--- a/hw/xen/xen_pt.h
+++ b/hw/xen/xen_pt.h
@@ -206,7 +206,6 @@ typedef struct XenPTMSIX {
     uint32_t table_offset_adjust; /* page align mmap */
     uint64_t mmio_base_addr;
     MemoryRegion mmio;
-    void *phys_iomem_base;
     XenPTMSIXEntry msix_entry[];
 } XenPTMSIX;
 
diff --git a/hw/xen/xen_pt_msi.c b/hw/xen/xen_pt_msi.c
index b71563f98a..a8a75dff66 100644
--- a/hw/xen/xen_pt_msi.c
+++ b/hw/xen/xen_pt_msi.c
@@ -460,15 +460,7 @@ static void pci_msix_write(void *opaque, hwaddr addr,
         entry->updated = true;
     } else if (msix->enabled && entry->updated &&
                !(val & PCI_MSIX_ENTRY_CTRL_MASKBIT)) {
-        const volatile uint32_t *vec_ctrl;
-
-        /*
-         * If Xen intercepts the mask bit access, entry->vec_ctrl may not be
-         * up-to-date. Read from hardware directly.
-         */
-        vec_ctrl = s->msix->phys_iomem_base + entry_nr * PCI_MSIX_ENTRY_SIZE
-            + PCI_MSIX_ENTRY_VECTOR_CTRL;
-        xen_pt_msix_update_one(s, entry_nr, *vec_ctrl);
+        xen_pt_msix_update_one(s, entry_nr, entry->latch(VECTOR_CTRL));
     }
 
     set_entry_value(entry, offset, val);
@@ -493,7 +485,9 @@ static uint64_t pci_msix_read(void *opaque, hwaddr addr,
         return get_entry_value(&msix->msix_entry[entry_nr], offset);
     } else {
         /* Pending Bit Array (PBA) */
-        return *(uint32_t *)(msix->phys_iomem_base + addr);
+        XEN_PT_LOG(&s->dev, "reading PBA, addr %#lx, offset %#lx\n",
+                   addr, addr - msix->total_entries * PCI_MSIX_ENTRY_SIZE);
+        return 0xFFFFFFFF;
     }
 }
 
@@ -529,7 +523,6 @@ int xen_pt_msix_init(XenPCIPassthroughState *s, uint32_t base)
     int i, total_entries, bar_index;
     XenHostPCIDevice *hd = &s->real_device;
     PCIDevice *d = &s->dev;
-    int fd = -1;
     XenPTMSIX *msix = NULL;
     int rc = 0;
 
@@ -576,34 +569,6 @@ int xen_pt_msix_init(XenPCIPassthroughState *s, uint32_t base)
     msix->table_base = s->real_device.io_regions[bar_index].base_addr;
     XEN_PT_LOG(d, "get MSI-X table BAR base 0x%"PRIx64"\n", msix->table_base);
 
-    fd = open("/dev/mem", O_RDWR);
-    if (fd == -1) {
-        rc = -errno;
-        XEN_PT_ERR(d, "Can't open /dev/mem: %s\n", strerror(errno));
-        goto error_out;
-    }
-    XEN_PT_LOG(d, "table_off = 0x%x, total_entries = %d\n",
-               table_off, total_entries);
-    msix->table_offset_adjust = table_off & 0x0fff;
-    msix->phys_iomem_base =
-        mmap(NULL,
-             total_entries * PCI_MSIX_ENTRY_SIZE + msix->table_offset_adjust,
-             PROT_READ,
-             MAP_SHARED | MAP_LOCKED,
-             fd,
-             msix->table_base + table_off - msix->table_offset_adjust);
-    close(fd);
-    if (msix->phys_iomem_base == MAP_FAILED) {
-        rc = -errno;
-        XEN_PT_ERR(d, "Can't map physical MSI-X table: %s\n", strerror(errno));
-        goto error_out;
-    }
-    msix->phys_iomem_base = (char *)msix->phys_iomem_base
-        + msix->table_offset_adjust;
-
-    XEN_PT_LOG(d, "mapping physical MSI-X table to %p\n",
-               msix->phys_iomem_base);
-
     memory_region_add_subregion_overlap(&s->bar[bar_index], table_off,
                                         &msix->mmio,
                                         2); /* Priority: pci default + 1 */
@@ -624,14 +589,6 @@ void xen_pt_msix_unmap(XenPCIPassthroughState *s)
         return;
     }
 
-    /* unmap the MSI-X memory mapped register area */
-    if (msix->phys_iomem_base) {
-        XEN_PT_LOG(&s->dev, "unmapping physical MSI-X table from %p\n",
-                   msix->phys_iomem_base);
-        munmap(msix->phys_iomem_base, msix->total_entries * PCI_MSIX_ENTRY_SIZE
-               + msix->table_offset_adjust);
-    }
-
     memory_region_del_subregion(&s->bar[msix->bar_index], &msix->mmio);
 }
 
-- 
2.37.3



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] Do not access /dev/mem in MSI-X PCI passthrough on Xen
  2022-11-14 19:20 ` [PATCH 2/2] Do not access /dev/mem in MSI-X PCI passthrough on Xen Marek Marczykowski-Górecki
@ 2022-11-14 19:39   ` Andrew Cooper
  2022-11-14 22:44     ` Marek Marczykowski-Górecki
  2022-11-15  8:14   ` Jan Beulich
  2022-11-16 19:15   ` Jason Andryuk
  2 siblings, 1 reply; 16+ messages in thread
From: Andrew Cooper @ 2022-11-14 19:39 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki, qemu-devel
  Cc: Stefano Stabellini, Anthony Perard, Paul Durrant, open list:X86 Xen CPUs

On 14/11/2022 19:20, Marek Marczykowski-Górecki wrote:
> The /dev/mem is used for two purposes:
>  - reading PCI_MSIX_ENTRY_CTRL_MASKBIT
>  - reading Pending Bit Array (PBA)
>
> The first one was originally done because when Xen did not send all
> vector ctrl writes to the device model, so QEMU might have outdated old
> register value. This has been changed in Xen, so QEMU can now use its
> cached value of the register instead.
>
> The Pending Bit Array (PBA) handling is for the case where it lives on
> the same page as the MSI-X table itself. Xen has been extended to handle
> this case too (as well as other registers that may live on those pages),
> so QEMU handling is not necessary anymore.
>
> Removing /dev/mem access is useful to work within stubdomain, and
> necessary when dom0 kernel runs in lockdown mode.
>
> Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>

The commit message ought to go further.  Using /dev/mem like this is
buggy anyway, because it is trapped and emulated by Xen in whatever
context Qemu is running.  Qemu cannot get the actual hardware value, and
even if it could, it would be racy with transient operations needing to
mask the vector.

i.e. it's not just nice-to-remote - it's fixing real corner cases.

~Andrew

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] Do not access /dev/mem in MSI-X PCI passthrough on Xen
  2022-11-14 19:39   ` Andrew Cooper
@ 2022-11-14 22:44     ` Marek Marczykowski-Górecki
  0 siblings, 0 replies; 16+ messages in thread
From: Marek Marczykowski-Górecki @ 2022-11-14 22:44 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: qemu-devel, Stefano Stabellini, Anthony Perard, Paul Durrant,
	open list:X86 Xen CPUs

[-- Attachment #1: Type: text/plain, Size: 1946 bytes --]

On Mon, Nov 14, 2022 at 07:39:48PM +0000, Andrew Cooper wrote:
> On 14/11/2022 19:20, Marek Marczykowski-Górecki wrote:
> > The /dev/mem is used for two purposes:
> >  - reading PCI_MSIX_ENTRY_CTRL_MASKBIT
> >  - reading Pending Bit Array (PBA)
> >
> > The first one was originally done because when Xen did not send all
> > vector ctrl writes to the device model, so QEMU might have outdated old
> > register value. This has been changed in Xen, so QEMU can now use its
> > cached value of the register instead.

I should have noted that "has been changed in Xen" isn't fully accurate
(yet). It refers to this patch:
https://lore.kernel.org/xen-devel/20221114192100.1539267-2-marmarek@invisiblethingslab.com/T/#u

> > The Pending Bit Array (PBA) handling is for the case where it lives on
> > the same page as the MSI-X table itself. Xen has been extended to handle
> > this case too (as well as other registers that may live on those pages),
> > so QEMU handling is not necessary anymore.
> >
> > Removing /dev/mem access is useful to work within stubdomain, and
> > necessary when dom0 kernel runs in lockdown mode.
> >
> > Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
> 
> The commit message ought to go further.  Using /dev/mem like this is
> buggy anyway, because it is trapped and emulated by Xen in whatever
> context Qemu is running.  Qemu cannot get the actual hardware value, and
> even if it could, it would be racy with transient operations needing to
> mask the vector.
> 
> i.e. it's not just nice-to-remote - it's fixing real corner cases.

Good point, I'll extend it.
But for this to work, the Xen patch needs to go in first (which won't
happen for 4.17). And also, upstream QEMU probably needs some way to
detect whether Xen has the change or not - to work with older versions
too.

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] Do not access /dev/mem in MSI-X PCI passthrough on Xen
  2022-11-14 19:20 ` [PATCH 2/2] Do not access /dev/mem in MSI-X PCI passthrough on Xen Marek Marczykowski-Górecki
  2022-11-14 19:39   ` Andrew Cooper
@ 2022-11-15  8:14   ` Jan Beulich
  2022-11-15 11:38     ` Marek Marczykowski-Górecki
  2022-11-16 19:15   ` Jason Andryuk
  2 siblings, 1 reply; 16+ messages in thread
From: Jan Beulich @ 2022-11-15  8:14 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki
  Cc: Stefano Stabellini, Anthony Perard, Paul Durrant,
	open list:X86 Xen CPUs, qemu-devel

On 14.11.2022 20:20, Marek Marczykowski-Górecki wrote:
> The /dev/mem is used for two purposes:
>  - reading PCI_MSIX_ENTRY_CTRL_MASKBIT
>  - reading Pending Bit Array (PBA)
> 
> The first one was originally done because when Xen did not send all
> vector ctrl writes to the device model, so QEMU might have outdated old
> register value. This has been changed in Xen, so QEMU can now use its
> cached value of the register instead.
> 
> The Pending Bit Array (PBA) handling is for the case where it lives on
> the same page as the MSI-X table itself. Xen has been extended to handle
> this case too (as well as other registers that may live on those pages),
> so QEMU handling is not necessary anymore.

Don't you need to check for new enough Xen for both aspects?

Jan


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] Do not access /dev/mem in MSI-X PCI passthrough on Xen
  2022-11-15  8:14   ` Jan Beulich
@ 2022-11-15 11:38     ` Marek Marczykowski-Górecki
  2022-11-15 14:05       ` Jan Beulich
  0 siblings, 1 reply; 16+ messages in thread
From: Marek Marczykowski-Górecki @ 2022-11-15 11:38 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Anthony Perard, Paul Durrant,
	open list:X86 Xen CPUs, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 1078 bytes --]

On Tue, Nov 15, 2022 at 09:14:07AM +0100, Jan Beulich wrote:
> On 14.11.2022 20:20, Marek Marczykowski-Górecki wrote:
> > The /dev/mem is used for two purposes:
> >  - reading PCI_MSIX_ENTRY_CTRL_MASKBIT
> >  - reading Pending Bit Array (PBA)
> > 
> > The first one was originally done because when Xen did not send all
> > vector ctrl writes to the device model, so QEMU might have outdated old
> > register value. This has been changed in Xen, so QEMU can now use its
> > cached value of the register instead.
> > 
> > The Pending Bit Array (PBA) handling is for the case where it lives on
> > the same page as the MSI-X table itself. Xen has been extended to handle
> > this case too (as well as other registers that may live on those pages),
> > so QEMU handling is not necessary anymore.
> 
> Don't you need to check for new enough Xen for both aspects?

Yes, see my response to Andrew in the thread. I'm open for suggestions
what to check specifically (Xen version directly?). 

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] Do not access /dev/mem in MSI-X PCI passthrough on Xen
  2022-11-15 11:38     ` Marek Marczykowski-Górecki
@ 2022-11-15 14:05       ` Jan Beulich
  0 siblings, 0 replies; 16+ messages in thread
From: Jan Beulich @ 2022-11-15 14:05 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki
  Cc: Stefano Stabellini, Anthony Perard, Paul Durrant,
	open list:X86 Xen CPUs, qemu-devel

On 15.11.2022 12:38, Marek Marczykowski-Górecki wrote:
> On Tue, Nov 15, 2022 at 09:14:07AM +0100, Jan Beulich wrote:
>> On 14.11.2022 20:20, Marek Marczykowski-Górecki wrote:
>>> The /dev/mem is used for two purposes:
>>>  - reading PCI_MSIX_ENTRY_CTRL_MASKBIT
>>>  - reading Pending Bit Array (PBA)
>>>
>>> The first one was originally done because when Xen did not send all
>>> vector ctrl writes to the device model, so QEMU might have outdated old
>>> register value. This has been changed in Xen, so QEMU can now use its
>>> cached value of the register instead.
>>>
>>> The Pending Bit Array (PBA) handling is for the case where it lives on
>>> the same page as the MSI-X table itself. Xen has been extended to handle
>>> this case too (as well as other registers that may live on those pages),
>>> so QEMU handling is not necessary anymore.
>>
>> Don't you need to check for new enough Xen for both aspects?
> 
> Yes, see my response to Andrew in the thread. I'm open for suggestions
> what to check specifically (Xen version directly?). 

I guess we should first see what changes we (you) end up doing in Xen
itself, and then decide whether to surface the new functionality in
some kind of feature indicator. Generally I'd prefer to avoid version
checks, because they don't fit very well with backports nor the (rare)
need to revert something. But sometimes a feature can be probed easily
without requiring an explicit feature indicator.

Jan


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] Do not access /dev/mem in MSI-X PCI passthrough on Xen
  2022-11-14 19:20 ` [PATCH 2/2] Do not access /dev/mem in MSI-X PCI passthrough on Xen Marek Marczykowski-Górecki
  2022-11-14 19:39   ` Andrew Cooper
  2022-11-15  8:14   ` Jan Beulich
@ 2022-11-16 19:15   ` Jason Andryuk
  2022-11-16 21:40     ` Marek Marczykowski-Górecki
  2 siblings, 1 reply; 16+ messages in thread
From: Jason Andryuk @ 2022-11-16 19:15 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki
  Cc: qemu-devel, Stefano Stabellini, Anthony Perard, Paul Durrant,
	open list:X86 Xen CPUs

On Mon, Nov 14, 2022 at 2:21 PM Marek Marczykowski-Górecki
<marmarek@invisiblethingslab.com> wrote:
>
> The /dev/mem is used for two purposes:
>  - reading PCI_MSIX_ENTRY_CTRL_MASKBIT
>  - reading Pending Bit Array (PBA)
>
> The first one was originally done because when Xen did not send all
> vector ctrl writes to the device model, so QEMU might have outdated old
> register value. This has been changed in Xen, so QEMU can now use its
> cached value of the register instead.
>
> The Pending Bit Array (PBA) handling is for the case where it lives on
> the same page as the MSI-X table itself. Xen has been extended to handle
> this case too (as well as other registers that may live on those pages),
> so QEMU handling is not necessary anymore.
>
> Removing /dev/mem access is useful to work within stubdomain, and
> necessary when dom0 kernel runs in lockdown mode.
>
> Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>

I put the Xen, QEMU, and xen-pciback patches into OpenXT and gave a
little test.  When pci_permissive=0, iwlwifi fails to load its
firmware.  With pci_permissive=1, it looks like MSI-X is enabled. (I
previously included your libxl allow_interrupt_control patch - that
seemed to get regular MSIs working prior to the MSI-X patches.)  I
also removed the OpenXT equivalent of 0005-Disable-MSI-X-caps.patch.
I am testing with Linux 5.4.y, so that could be another factor.

One strange thing is the lspci output.  Dom0 shows MSI-X enabled.
Meanwhile NDVM (sys-net) does not show the MSI-X capability.  If you
`hexdump -C /sys/bus/pci/devices/$dev/config` you can see MSI-X
enabled, but you also see that the MSI capability has 00 as the next
pointer, so lspci stops parsing.

MSI cap stubdom:
00000040  10 00 92 00 c0 0e 00 00  10 0c 10 00 00 00 00 00  |................|
0x41 -> next 0x00
MSI cap dom0:
00000040  10 80 92 00 c0 0e 00 10  10 0c 10 00 00 00 00 00  |................|
0x41 -> next 0x80

MSI-X:
00000080  11 00 0f 80 00 20 00 00  00 30 00 00 00 00 00 00

AFAIU, the value 0x80 at offset 0x83 is MSI-X Enabled.

I had a boot where assignment failed with the hypervisor printing:
d12: assign (0000:00:14.3) failed (-16)
Rebooting the laptop seemed to clear that.

Regards,
Jason


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] Do not access /dev/mem in MSI-X PCI passthrough on Xen
  2022-11-16 19:15   ` Jason Andryuk
@ 2022-11-16 21:40     ` Marek Marczykowski-Górecki
  2022-11-17  3:34       ` Marek Marczykowski-Górecki
  0 siblings, 1 reply; 16+ messages in thread
From: Marek Marczykowski-Górecki @ 2022-11-16 21:40 UTC (permalink / raw)
  To: Jason Andryuk
  Cc: qemu-devel, Stefano Stabellini, Anthony Perard, Paul Durrant,
	open list:X86 Xen CPUs

[-- Attachment #1: Type: text/plain, Size: 3147 bytes --]

On Wed, Nov 16, 2022 at 02:15:22PM -0500, Jason Andryuk wrote:
> On Mon, Nov 14, 2022 at 2:21 PM Marek Marczykowski-Górecki
> <marmarek@invisiblethingslab.com> wrote:
> >
> > The /dev/mem is used for two purposes:
> >  - reading PCI_MSIX_ENTRY_CTRL_MASKBIT
> >  - reading Pending Bit Array (PBA)
> >
> > The first one was originally done because when Xen did not send all
> > vector ctrl writes to the device model, so QEMU might have outdated old
> > register value. This has been changed in Xen, so QEMU can now use its
> > cached value of the register instead.
> >
> > The Pending Bit Array (PBA) handling is for the case where it lives on
> > the same page as the MSI-X table itself. Xen has been extended to handle
> > this case too (as well as other registers that may live on those pages),
> > so QEMU handling is not necessary anymore.
> >
> > Removing /dev/mem access is useful to work within stubdomain, and
> > necessary when dom0 kernel runs in lockdown mode.
> >
> > Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
> 
> I put the Xen, QEMU, and xen-pciback patches into OpenXT and gave a
> little test.  When pci_permissive=0, iwlwifi fails to load its
> firmware.  With pci_permissive=1, it looks like MSI-X is enabled. (I
> previously included your libxl allow_interrupt_control patch - that
> seemed to get regular MSIs working prior to the MSI-X patches.)  I
> also removed the OpenXT equivalent of 0005-Disable-MSI-X-caps.patch.
> I am testing with Linux 5.4.y, so that could be another factor.

Can you confirm the allow_interrupt_control is set by libxl? Also,
vanilla 5.4 doesn't have the allow_interrupt_control patch at all, and you
may have an earlier version that had "allow_msi_enable" as the sysfs
file name.

> One strange thing is the lspci output.  Dom0 shows MSI-X enabled.
> Meanwhile NDVM (sys-net) does not show the MSI-X capability.  If you
> `hexdump -C /sys/bus/pci/devices/$dev/config` you can see MSI-X
> enabled, but you also see that the MSI capability has 00 as the next
> pointer, so lspci stops parsing.

This 00 value is written by Linux[1] (sic!) and then qemu incorrectly
allowing the write and happily emulating that zero. The other qemu patch
in this series ought to fix that (as in: properly refuse the write), do
you have it included?

[1] https://github.com/torvalds/linux/blob/master/drivers/net/wireless/intel/iwlwifi/pcie/drv.c#L1721

> MSI cap stubdom:
> 00000040  10 00 92 00 c0 0e 00 00  10 0c 10 00 00 00 00 00  |................|
> 0x41 -> next 0x00
> MSI cap dom0:
> 00000040  10 80 92 00 c0 0e 00 10  10 0c 10 00 00 00 00 00  |................|
> 0x41 -> next 0x80
> 
> MSI-X:
> 00000080  11 00 0f 80 00 20 00 00  00 30 00 00 00 00 00 00
> 
> AFAIU, the value 0x80 at offset 0x83 is MSI-X Enabled.
> 
> I had a boot where assignment failed with the hypervisor printing:
> d12: assign (0000:00:14.3) failed (-16)
> Rebooting the laptop seemed to clear that.

Zombie of previous domain? Not set as "assignable" first?

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] Do not access /dev/mem in MSI-X PCI passthrough on Xen
  2022-11-16 21:40     ` Marek Marczykowski-Górecki
@ 2022-11-17  3:34       ` Marek Marczykowski-Górecki
  2022-11-17  8:04         ` Jan Beulich
  2022-11-17 17:29         ` Jason Andryuk
  0 siblings, 2 replies; 16+ messages in thread
From: Marek Marczykowski-Górecki @ 2022-11-17  3:34 UTC (permalink / raw)
  To: Jason Andryuk, Jan Beulich
  Cc: qemu-devel, Stefano Stabellini, Anthony Perard, Paul Durrant,
	open list:X86 Xen CPUs

[-- Attachment #1: Type: text/plain, Size: 3641 bytes --]

On Wed, Nov 16, 2022 at 10:40:02PM +0100, Marek Marczykowski-Górecki wrote:
> On Wed, Nov 16, 2022 at 02:15:22PM -0500, Jason Andryuk wrote:
> > On Mon, Nov 14, 2022 at 2:21 PM Marek Marczykowski-Górecki
> > <marmarek@invisiblethingslab.com> wrote:
> > >
> > > The /dev/mem is used for two purposes:
> > >  - reading PCI_MSIX_ENTRY_CTRL_MASKBIT
> > >  - reading Pending Bit Array (PBA)
> > >
> > > The first one was originally done because when Xen did not send all
> > > vector ctrl writes to the device model, so QEMU might have outdated old
> > > register value. This has been changed in Xen, so QEMU can now use its
> > > cached value of the register instead.
> > >
> > > The Pending Bit Array (PBA) handling is for the case where it lives on
> > > the same page as the MSI-X table itself. Xen has been extended to handle
> > > this case too (as well as other registers that may live on those pages),
> > > so QEMU handling is not necessary anymore.
> > >
> > > Removing /dev/mem access is useful to work within stubdomain, and
> > > necessary when dom0 kernel runs in lockdown mode.
> > >
> > > Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
> > 
> > I put the Xen, QEMU, and xen-pciback patches into OpenXT and gave a
> > little test.  When pci_permissive=0, iwlwifi fails to load its
> > firmware.  With pci_permissive=1, it looks like MSI-X is enabled. (I
> > previously included your libxl allow_interrupt_control patch - that
> > seemed to get regular MSIs working prior to the MSI-X patches.)  I
> > also removed the OpenXT equivalent of 0005-Disable-MSI-X-caps.patch.
> > I am testing with Linux 5.4.y, so that could be another factor.
> 
> Can you confirm the allow_interrupt_control is set by libxl? Also,
> vanilla 5.4 doesn't have the allow_interrupt_control patch at all, and you
> may have an earlier version that had "allow_msi_enable" as the sysfs
> file name.

Ok, I found what is wrong. Enabling MSI-X is refused, because INTx isn't
disabled at this point yet. And apparently I was testing this with
permissive=1...

Linux does this:
https://github.com/torvalds/linux/blob/master/drivers/pci/msi/msi.c#L611
In short:
1. Enable MSI-X with MASKALL=1
2. Setup MSI-X table
3. Disable INTx
4. Set MASKALL=0

This patch on top should fix this:
----8<----
diff --git a/drivers/xen/xen-pciback/conf_space_capability.c b/drivers/xen/xen-pciback/conf_space_capability.c
index 097316a74126..f4c4381de76e 100644
--- a/drivers/xen/xen-pciback/conf_space_capability.c
+++ b/drivers/xen/xen-pciback/conf_space_capability.c
@@ -235,7 +235,7 @@ static int msi_msix_flags_write(struct pci_dev *dev, int offset, u16 new_value,
 	    (new_value ^ old_value) & ~field_config->allowed_bits)
 		return PCIBIOS_SET_FAILED;
 
-	if (new_value & field_config->enable_bit) {
+	if ((new_value & field_config->allowed_bits) == field_config->enable_bit) {
 		/* don't allow enabling together with other interrupt types */
 		int int_type = xen_pcibk_get_interrupt_type(dev);
 
----8<----

Jan, is the above safe? It should prevent clearing MASKALL if INTx isn't
disabled, unless I missed something? But also, it will allow enabling
MSI-X with MASKALL=1 together with MSI, which I'm not sure if should be
allowed.
Alternatively to the above patch, I could allow specifically setting
MSIX_FLAGS_ENABLE + MSIX_FLAGS_MASKALL while INTx isn't disabled as a
single exception, but at this point I'm not sure if some other driver or
OS wouldn't approach this in yet another way.

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] Do not access /dev/mem in MSI-X PCI passthrough on Xen
  2022-11-17  3:34       ` Marek Marczykowski-Górecki
@ 2022-11-17  8:04         ` Jan Beulich
  2022-11-17 11:15           ` Marek Marczykowski-Górecki
  2022-11-17 17:29         ` Jason Andryuk
  1 sibling, 1 reply; 16+ messages in thread
From: Jan Beulich @ 2022-11-17  8:04 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki
  Cc: qemu-devel, Stefano Stabellini, Anthony Perard, Paul Durrant,
	open list:X86 Xen CPUs, Jason Andryuk

On 17.11.2022 04:34, Marek Marczykowski-Górecki wrote:
> Ok, I found what is wrong. Enabling MSI-X is refused, because INTx isn't
> disabled at this point yet. And apparently I was testing this with
> permissive=1...
> 
> Linux does this:
> https://github.com/torvalds/linux/blob/master/drivers/pci/msi/msi.c#L611
> In short:
> 1. Enable MSI-X with MASKALL=1
> 2. Setup MSI-X table
> 3. Disable INTx
> 4. Set MASKALL=0
> 
> This patch on top should fix this:
> ----8<----
> diff --git a/drivers/xen/xen-pciback/conf_space_capability.c b/drivers/xen/xen-pciback/conf_space_capability.c
> index 097316a74126..f4c4381de76e 100644
> --- a/drivers/xen/xen-pciback/conf_space_capability.c
> +++ b/drivers/xen/xen-pciback/conf_space_capability.c
> @@ -235,7 +235,7 @@ static int msi_msix_flags_write(struct pci_dev *dev, int offset, u16 new_value,
>  	    (new_value ^ old_value) & ~field_config->allowed_bits)
>  		return PCIBIOS_SET_FAILED;
>  
> -	if (new_value & field_config->enable_bit) {
> +	if ((new_value & field_config->allowed_bits) == field_config->enable_bit) {
>  		/* don't allow enabling together with other interrupt types */
>  		int int_type = xen_pcibk_get_interrupt_type(dev);
>  
> ----8<----
> 
> Jan, is the above safe? It should prevent clearing MASKALL if INTx isn't
> disabled, unless I missed something? But also, it will allow enabling
> MSI-X with MASKALL=1 together with MSI, which I'm not sure if should be
> allowed.

While it would probably be okay with what we have now (after your earlier
patch introducing allowed_bits), it's likely not going to be correct once
further bits would be added to allowed_bits (which clearly is going to be
wanted sooner or later, e.g. for multi-vector MSI). Hence I think ...

> Alternatively to the above patch, I could allow specifically setting
> MSIX_FLAGS_ENABLE + MSIX_FLAGS_MASKALL while INTx isn't disabled as a
> single exception,

... this is the way to go, and ...

> but at this point I'm not sure if some other driver or
> OS wouldn't approach this in yet another way.

... I guess we need to further add exceptions as needed - the one further
approach I could see is to keep all entry's mask bits set until setting
"INTx disable", without using MASKALL.

I'd like to note though that the PCI spec has no such exception. It,
however, also doesn't mandate setting "INTx disable" - that's merely a
workaround for flawed hardware afaik. (In Xen we also only cross check
MSI and MSI-X. IOW we rely on Dom0 anyway when it comes to driving
"INTx disable".) Therefore in the end pciback looks to be going too far
in enforcing such dependencies.

Thinking of this - what about making the change in
xen_pcibk_get_interrupt_type() instead, setting INTERRUPT_TYPE_MSIX only
when MASKALL is clear (alongside ENABLE being set)? This would then also
cover command_write().

Jan


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] Do not access /dev/mem in MSI-X PCI passthrough on Xen
  2022-11-17  8:04         ` Jan Beulich
@ 2022-11-17 11:15           ` Marek Marczykowski-Górecki
  0 siblings, 0 replies; 16+ messages in thread
From: Marek Marczykowski-Górecki @ 2022-11-17 11:15 UTC (permalink / raw)
  To: Jan Beulich
  Cc: qemu-devel, Stefano Stabellini, Anthony Perard, Paul Durrant,
	open list:X86 Xen CPUs, Jason Andryuk

[-- Attachment #1: Type: text/plain, Size: 3374 bytes --]

On Thu, Nov 17, 2022 at 09:04:40AM +0100, Jan Beulich wrote:
> On 17.11.2022 04:34, Marek Marczykowski-Górecki wrote:
> > Ok, I found what is wrong. Enabling MSI-X is refused, because INTx isn't
> > disabled at this point yet. And apparently I was testing this with
> > permissive=1...
> > 
> > Linux does this:
> > https://github.com/torvalds/linux/blob/master/drivers/pci/msi/msi.c#L611
> > In short:
> > 1. Enable MSI-X with MASKALL=1
> > 2. Setup MSI-X table
> > 3. Disable INTx
> > 4. Set MASKALL=0
> > 
> > This patch on top should fix this:
> > ----8<----
> > diff --git a/drivers/xen/xen-pciback/conf_space_capability.c b/drivers/xen/xen-pciback/conf_space_capability.c
> > index 097316a74126..f4c4381de76e 100644
> > --- a/drivers/xen/xen-pciback/conf_space_capability.c
> > +++ b/drivers/xen/xen-pciback/conf_space_capability.c
> > @@ -235,7 +235,7 @@ static int msi_msix_flags_write(struct pci_dev *dev, int offset, u16 new_value,
> >  	    (new_value ^ old_value) & ~field_config->allowed_bits)
> >  		return PCIBIOS_SET_FAILED;
> >  
> > -	if (new_value & field_config->enable_bit) {
> > +	if ((new_value & field_config->allowed_bits) == field_config->enable_bit) {
> >  		/* don't allow enabling together with other interrupt types */
> >  		int int_type = xen_pcibk_get_interrupt_type(dev);
> >  
> > ----8<----
> > 
> > Jan, is the above safe? It should prevent clearing MASKALL if INTx isn't
> > disabled, unless I missed something? But also, it will allow enabling
> > MSI-X with MASKALL=1 together with MSI, which I'm not sure if should be
> > allowed.
> 
> While it would probably be okay with what we have now (after your earlier
> patch introducing allowed_bits), it's likely not going to be correct once
> further bits would be added to allowed_bits (which clearly is going to be
> wanted sooner or later, e.g. for multi-vector MSI). Hence I think ...
> 
> > Alternatively to the above patch, I could allow specifically setting
> > MSIX_FLAGS_ENABLE + MSIX_FLAGS_MASKALL while INTx isn't disabled as a
> > single exception,
> 
> ... this is the way to go, and ...

Ok, I'll go this way then.

> > but at this point I'm not sure if some other driver or
> > OS wouldn't approach this in yet another way.
> 
> ... I guess we need to further add exceptions as needed - the one further
> approach I could see is to keep all entry's mask bits set until setting
> "INTx disable", without using MASKALL.
> 
> I'd like to note though that the PCI spec has no such exception. It,
> however, also doesn't mandate setting "INTx disable" - that's merely a
> workaround for flawed hardware afaik. (In Xen we also only cross check
> MSI and MSI-X. IOW we rely on Dom0 anyway when it comes to driving
> "INTx disable".) Therefore in the end pciback looks to be going too far
> in enforcing such dependencies.
> 
> Thinking of this - what about making the change in
> xen_pcibk_get_interrupt_type() instead, setting INTERRUPT_TYPE_MSIX only
> when MASKALL is clear (alongside ENABLE being set)? This would then also
> cover command_write().

That may be a good idea, but wouldn't cover the case here:
xen_pcibk_get_interrupt_type() at this time returns INTERRUPT_TYPE_INTX
only, and setting MSIX_FLAGS_ENABLE is prevented.

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] Do not access /dev/mem in MSI-X PCI passthrough on Xen
  2022-11-17  3:34       ` Marek Marczykowski-Górecki
  2022-11-17  8:04         ` Jan Beulich
@ 2022-11-17 17:29         ` Jason Andryuk
  1 sibling, 0 replies; 16+ messages in thread
From: Jason Andryuk @ 2022-11-17 17:29 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki
  Cc: Jan Beulich, qemu-devel, Stefano Stabellini, Anthony Perard,
	Paul Durrant, open list:X86 Xen CPUs

On Wed, Nov 16, 2022 at 10:34 PM Marek Marczykowski-Górecki
<marmarek@invisiblethingslab.com> wrote:
>
> On Wed, Nov 16, 2022 at 10:40:02PM +0100, Marek Marczykowski-Górecki wrote:
> > On Wed, Nov 16, 2022 at 02:15:22PM -0500, Jason Andryuk wrote:
> > > On Mon, Nov 14, 2022 at 2:21 PM Marek Marczykowski-Górecki
> > > <marmarek@invisiblethingslab.com> wrote:
> > > >
> > > > The /dev/mem is used for two purposes:
> > > >  - reading PCI_MSIX_ENTRY_CTRL_MASKBIT
> > > >  - reading Pending Bit Array (PBA)
> > > >
> > > > The first one was originally done because when Xen did not send all
> > > > vector ctrl writes to the device model, so QEMU might have outdated old
> > > > register value. This has been changed in Xen, so QEMU can now use its
> > > > cached value of the register instead.
> > > >
> > > > The Pending Bit Array (PBA) handling is for the case where it lives on
> > > > the same page as the MSI-X table itself. Xen has been extended to handle
> > > > this case too (as well as other registers that may live on those pages),
> > > > so QEMU handling is not necessary anymore.
> > > >
> > > > Removing /dev/mem access is useful to work within stubdomain, and
> > > > necessary when dom0 kernel runs in lockdown mode.
> > > >
> > > > Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
> > >
> > > I put the Xen, QEMU, and xen-pciback patches into OpenXT and gave a
> > > little test.  When pci_permissive=0, iwlwifi fails to load its
> > > firmware.  With pci_permissive=1, it looks like MSI-X is enabled. (I
> > > previously included your libxl allow_interrupt_control patch - that
> > > seemed to get regular MSIs working prior to the MSI-X patches.)  I
> > > also removed the OpenXT equivalent of 0005-Disable-MSI-X-caps.patch.
> > > I am testing with Linux 5.4.y, so that could be another factor.
> >
> > Can you confirm the allow_interrupt_control is set by libxl? Also,
> > vanilla 5.4 doesn't have the allow_interrupt_control patch at all, and you
> > may have an earlier version that had "allow_msi_enable" as the sysfs
> > file name.

I backported allow_interrupt_control to 5.4 and that is set properly.

> Ok, I found what is wrong. Enabling MSI-X is refused, because INTx isn't
> disabled at this point yet. And apparently I was testing this with
> permissive=1...
>
> Linux does this:
> https://github.com/torvalds/linux/blob/master/drivers/pci/msi/msi.c#L611
> In short:
> 1. Enable MSI-X with MASKALL=1
> 2. Setup MSI-X table
> 3. Disable INTx
> 4. Set MASKALL=0
>
> This patch on top should fix this:
> ----8<----
> diff --git a/drivers/xen/xen-pciback/conf_space_capability.c b/drivers/xen/xen-pciback/conf_space_capability.c
> index 097316a74126..f4c4381de76e 100644
> --- a/drivers/xen/xen-pciback/conf_space_capability.c
> +++ b/drivers/xen/xen-pciback/conf_space_capability.c
> @@ -235,7 +235,7 @@ static int msi_msix_flags_write(struct pci_dev *dev, int offset, u16 new_value,
>             (new_value ^ old_value) & ~field_config->allowed_bits)
>                 return PCIBIOS_SET_FAILED;
>
> -       if (new_value & field_config->enable_bit) {
> +       if ((new_value & field_config->allowed_bits) == field_config->enable_bit) {
>                 /* don't allow enabling together with other interrupt types */
>                 int int_type = xen_pcibk_get_interrupt_type(dev);
>
> ----8<----

FWIW, I can confirm this allows enabling MSI-X with permissive=0 for me.

Thanks,
Jason


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/2] hw/xen/xen_pt: Call default handler only if no custom one is set
  2022-11-14 19:20 [PATCH 1/2] hw/xen/xen_pt: Call default handler only if no custom one is set Marek Marczykowski-Górecki
@ 2022-11-22 17:12   ` Anthony PERARD
  2022-11-22 17:12   ` Anthony PERARD
  1 sibling, 0 replies; 16+ messages in thread
From: Anthony PERARD via @ 2022-11-22 17:12 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki
  Cc: qemu-devel, Stefano Stabellini, Paul Durrant, open list:X86 Xen CPUs

On Mon, Nov 14, 2022 at 08:20:10PM +0100, Marek Marczykowski-Górecki wrote:
> diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
> index 0ec7e52183..269bd26109 100644
> --- a/hw/xen/xen_pt.c
> +++ b/hw/xen/xen_pt.c
> @@ -255,6 +255,7 @@ static void xen_pt_pci_write_config(PCIDevice *d, uint32_t addr,
>      uint32_t find_addr = addr;
>      XenPTRegInfo *reg = NULL;
>      bool wp_flag = false;
> +    uint32_t emul_mask = 0, write_val;
>  
>      if (xen_pt_pci_config_access_check(d, addr, len)) {
>          return;
> @@ -310,7 +311,6 @@ static void xen_pt_pci_write_config(PCIDevice *d, uint32_t addr,
>      }
>  
>      memory_region_transaction_begin();
> -    pci_default_write_config(d, addr, val, len);
>  
>      /* adjust the read and write value to appropriate CFC-CFF window */
>      read_val <<= (addr & 3) << 3;
> @@ -370,6 +370,8 @@ static void xen_pt_pci_write_config(PCIDevice *d, uint32_t addr,
>                  return;
>              }
>  
> +            emul_mask |= ( (1 << (reg->size * 8) ) - 1 ) << ((find_addr & 3) * 8);
> +
>              /* calculate next address to find */
>              emul_len -= reg->size;
>              if (emul_len > 0) {
> @@ -396,6 +398,24 @@ static void xen_pt_pci_write_config(PCIDevice *d, uint32_t addr,
>      /* need to shift back before passing them to xen_host_pci_set_block. */
>      val >>= (addr & 3) << 3;
>  
> +    /* store emulated registers that didn't have specific hooks */
> +    write_val = val;
> +    for (index = 0; emul_mask; index += emul_len) {

`index` isn't used, was it meant to be use for something?

> +        emul_len = 0;
> +        while (emul_mask & 0xff) {
> +            emul_len++;

This seems to count the number of byte that have a hook
(xen_pt_find_reg() found a `reg_entry`).
This loop should count instead the number of bytes for which no
`reg_entry` have been found, right? Shouldn't the loop count when a byte
in emul_mask is unset?

> +            emul_mask >>= 8;
> +        }
> +        if (emul_len) {
> +            uint32_t mask = ((1 << (emul_len * 8)) - 1);
> +            pci_default_write_config(d, addr, write_val & mask, emul_len);

`addr` isn't updated in the loop, aren't we going to write bytes to the
wrong place? If for example "emul_mask == 0x00ff00ff" ?

> +            write_val >>= emul_len * 8;
> +        } else {
> +            emul_mask >>= 8;
> +            write_val >>= 8;
> +        }
> +    }

Thanks,

-- 
Anthony PERARD


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/2] hw/xen/xen_pt: Call default handler only if no custom one is set
@ 2022-11-22 17:12   ` Anthony PERARD
  0 siblings, 0 replies; 16+ messages in thread
From: Anthony PERARD @ 2022-11-22 17:12 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki
  Cc: qemu-devel, Stefano Stabellini, Paul Durrant, open list:X86 Xen CPUs

On Mon, Nov 14, 2022 at 08:20:10PM +0100, Marek Marczykowski-Górecki wrote:
> diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
> index 0ec7e52183..269bd26109 100644
> --- a/hw/xen/xen_pt.c
> +++ b/hw/xen/xen_pt.c
> @@ -255,6 +255,7 @@ static void xen_pt_pci_write_config(PCIDevice *d, uint32_t addr,
>      uint32_t find_addr = addr;
>      XenPTRegInfo *reg = NULL;
>      bool wp_flag = false;
> +    uint32_t emul_mask = 0, write_val;
>  
>      if (xen_pt_pci_config_access_check(d, addr, len)) {
>          return;
> @@ -310,7 +311,6 @@ static void xen_pt_pci_write_config(PCIDevice *d, uint32_t addr,
>      }
>  
>      memory_region_transaction_begin();
> -    pci_default_write_config(d, addr, val, len);
>  
>      /* adjust the read and write value to appropriate CFC-CFF window */
>      read_val <<= (addr & 3) << 3;
> @@ -370,6 +370,8 @@ static void xen_pt_pci_write_config(PCIDevice *d, uint32_t addr,
>                  return;
>              }
>  
> +            emul_mask |= ( (1 << (reg->size * 8) ) - 1 ) << ((find_addr & 3) * 8);
> +
>              /* calculate next address to find */
>              emul_len -= reg->size;
>              if (emul_len > 0) {
> @@ -396,6 +398,24 @@ static void xen_pt_pci_write_config(PCIDevice *d, uint32_t addr,
>      /* need to shift back before passing them to xen_host_pci_set_block. */
>      val >>= (addr & 3) << 3;
>  
> +    /* store emulated registers that didn't have specific hooks */
> +    write_val = val;
> +    for (index = 0; emul_mask; index += emul_len) {

`index` isn't used, was it meant to be use for something?

> +        emul_len = 0;
> +        while (emul_mask & 0xff) {
> +            emul_len++;

This seems to count the number of byte that have a hook
(xen_pt_find_reg() found a `reg_entry`).
This loop should count instead the number of bytes for which no
`reg_entry` have been found, right? Shouldn't the loop count when a byte
in emul_mask is unset?

> +            emul_mask >>= 8;
> +        }
> +        if (emul_len) {
> +            uint32_t mask = ((1 << (emul_len * 8)) - 1);
> +            pci_default_write_config(d, addr, write_val & mask, emul_len);

`addr` isn't updated in the loop, aren't we going to write bytes to the
wrong place? If for example "emul_mask == 0x00ff00ff" ?

> +            write_val >>= emul_len * 8;
> +        } else {
> +            emul_mask >>= 8;
> +            write_val >>= 8;
> +        }
> +    }

Thanks,

-- 
Anthony PERARD


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/2] hw/xen/xen_pt: Call default handler only if no custom one is set
  2022-11-22 17:12   ` Anthony PERARD
  (?)
@ 2023-09-26 23:25   ` Marek Marczykowski-Górecki
  -1 siblings, 0 replies; 16+ messages in thread
From: Marek Marczykowski-Górecki @ 2023-09-26 23:25 UTC (permalink / raw)
  To: Anthony PERARD
  Cc: qemu-devel, Stefano Stabellini, Paul Durrant, open list:X86 Xen CPUs

[-- Attachment #1: Type: text/plain, Size: 3012 bytes --]

On Tue, Nov 22, 2022 at 05:12:59PM +0000, Anthony PERARD wrote:
> On Mon, Nov 14, 2022 at 08:20:10PM +0100, Marek Marczykowski-Górecki wrote:
> > diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
> > index 0ec7e52183..269bd26109 100644
> > --- a/hw/xen/xen_pt.c
> > +++ b/hw/xen/xen_pt.c
> > @@ -255,6 +255,7 @@ static void xen_pt_pci_write_config(PCIDevice *d, uint32_t addr,
> >      uint32_t find_addr = addr;
> >      XenPTRegInfo *reg = NULL;
> >      bool wp_flag = false;
> > +    uint32_t emul_mask = 0, write_val;
> >  
> >      if (xen_pt_pci_config_access_check(d, addr, len)) {
> >          return;
> > @@ -310,7 +311,6 @@ static void xen_pt_pci_write_config(PCIDevice *d, uint32_t addr,
> >      }
> >  
> >      memory_region_transaction_begin();
> > -    pci_default_write_config(d, addr, val, len);
> >  
> >      /* adjust the read and write value to appropriate CFC-CFF window */
> >      read_val <<= (addr & 3) << 3;
> > @@ -370,6 +370,8 @@ static void xen_pt_pci_write_config(PCIDevice *d, uint32_t addr,
> >                  return;
> >              }
> >  
> > +            emul_mask |= ( (1 << (reg->size * 8) ) - 1 ) << ((find_addr & 3) * 8);
> > +
> >              /* calculate next address to find */
> >              emul_len -= reg->size;
> >              if (emul_len > 0) {
> > @@ -396,6 +398,24 @@ static void xen_pt_pci_write_config(PCIDevice *d, uint32_t addr,
> >      /* need to shift back before passing them to xen_host_pci_set_block. */
> >      val >>= (addr & 3) << 3;
> >  
> > +    /* store emulated registers that didn't have specific hooks */
> > +    write_val = val;
> > +    for (index = 0; emul_mask; index += emul_len) {
> 
> `index` isn't used, was it meant to be use for something?

Yes, it should be used as addr + index below.

> > +        emul_len = 0;
> > +        while (emul_mask & 0xff) {
> > +            emul_len++;
> 
> This seems to count the number of byte that have a hook
> (xen_pt_find_reg() found a `reg_entry`).
> This loop should count instead the number of bytes for which no
> `reg_entry` have been found, right? Shouldn't the loop count when a byte
> in emul_mask is unset?

No, see the patch description - only declared registers should be saved.
The patch title is misleading, I'll clarify it.

> > +            emul_mask >>= 8;
> > +        }
> > +        if (emul_len) {
> > +            uint32_t mask = ((1 << (emul_len * 8)) - 1);
> > +            pci_default_write_config(d, addr, write_val & mask, emul_len);
> 
> `addr` isn't updated in the loop, aren't we going to write bytes to the
> wrong place? If for example "emul_mask == 0x00ff00ff" ?

Indeed, it should be addr + index.

> > +            write_val >>= emul_len * 8;
> > +        } else {
> > +            emul_mask >>= 8;
> > +            write_val >>= 8;
> > +        }
> > +    }
> 
> Thanks,
> 
> -- 
> Anthony PERARD

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2023-09-26 23:26 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-14 19:20 [PATCH 1/2] hw/xen/xen_pt: Call default handler only if no custom one is set Marek Marczykowski-Górecki
2022-11-14 19:20 ` [PATCH 2/2] Do not access /dev/mem in MSI-X PCI passthrough on Xen Marek Marczykowski-Górecki
2022-11-14 19:39   ` Andrew Cooper
2022-11-14 22:44     ` Marek Marczykowski-Górecki
2022-11-15  8:14   ` Jan Beulich
2022-11-15 11:38     ` Marek Marczykowski-Górecki
2022-11-15 14:05       ` Jan Beulich
2022-11-16 19:15   ` Jason Andryuk
2022-11-16 21:40     ` Marek Marczykowski-Górecki
2022-11-17  3:34       ` Marek Marczykowski-Górecki
2022-11-17  8:04         ` Jan Beulich
2022-11-17 11:15           ` Marek Marczykowski-Górecki
2022-11-17 17:29         ` Jason Andryuk
2022-11-22 17:12 ` [PATCH 1/2] hw/xen/xen_pt: Call default handler only if no custom one is set Anthony PERARD via
2022-11-22 17:12   ` Anthony PERARD
2023-09-26 23:25   ` Marek Marczykowski-Górecki

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.