All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v1 kvmtool 0/7] vfio/pci: Fix MSIX table and PBA size allocation
@ 2021-09-13 15:44 Alexandru Elisei
  2021-09-13 15:44 ` [PATCH v1 kvmtool 1/7] arm/gicv2m: Set errno when gicv2_update_routing() fails Alexandru Elisei
                   ` (7 more replies)
  0 siblings, 8 replies; 20+ messages in thread
From: Alexandru Elisei @ 2021-09-13 15:44 UTC (permalink / raw)
  To: will, julien.thierry.kdev, kvm; +Cc: andre.przywara, jean-philippe

This series is meant to rework the way the MSIX table and PBA are allocated
to prevent situations where the size allocated by kvmtool is larger than
the size of the BAR that holds them.

Patches 1-3 are fixes for stuff I found when I was investing a bug
triggered by the incorrect sizing of the table and PBA.

Patch 4 is a preparatory patch.

Patch 5 is the proper fix. More details in the commit message.

Patch 6 is there to make it easier to catch such errors if the code
regresses.

Patch 7 is an optimization for guests with larger page sizes than the host.


Testing
=======

On an AMD Seattle, host page size 4k and 64k, guest page size 4k and 64k
(so 4 tests in total for each device). Tried device passthrough with a
Realtek RTL8168 NIC and with a Intel 82574L NIC (not at the same time). No
issues encountered, MSIX table and PBA BAR has the same layout as the
physical device.

On an AMD Ryzen 3900x, host and guest 4k pages, I've assigned an Intel
82574L NIC to the guest. Realtek RTL8168 doesn't work without emulating a
PCI Express bus because the drivers falls back to CSI for device
configuration. Everything works in the guest, the shared BAR has the same
layout in the guest as the physical device.


Alexandru Elisei (7):
  arm/gicv2m: Set errno when gicv2_update_routing() fails
  vfio/pci.c: Remove double include for assert.h
  pci: Fix pci_dev_* print macros
  vfio/pci: Rename PBA offset in device descriptor to fd_offset
  vfio/pci: Rework MSIX table and PBA physical size allocation
  vfio/pci: Print an error when offset is outside of the MSIX table or
    PBA
  vfio/pci: Align MSIX Table and PBA size allocation to 64k

 arm/gicv2m.c       | 10 +++---
 include/kvm/pci.h  | 10 +++---
 include/kvm/vfio.h |  3 +-
 vfio/pci.c         | 82 ++++++++++++++++++++++++++++++----------------
 4 files changed, 66 insertions(+), 39 deletions(-)

-- 
2.20.1


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v1 kvmtool 1/7] arm/gicv2m: Set errno when gicv2_update_routing() fails
  2021-09-13 15:44 [PATCH v1 kvmtool 0/7] vfio/pci: Fix MSIX table and PBA size allocation Alexandru Elisei
@ 2021-09-13 15:44 ` Alexandru Elisei
  2021-10-06 15:08   ` Andre Przywara
  2021-09-13 15:44 ` [PATCH v1 kvmtool 2/7] vfio/pci.c: Remove double include for assert.h Alexandru Elisei
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 20+ messages in thread
From: Alexandru Elisei @ 2021-09-13 15:44 UTC (permalink / raw)
  To: will, julien.thierry.kdev, kvm; +Cc: andre.przywara, jean-philippe

In case of an error when updating the routing table entries,
irq__update_msix_route() uses perror to print an error message.
gicv2m_update_routing() doesn't set errno, and instead returns the value
that errno should have had, which can lead to failure messages like this:

KVM_SET_GSI_ROUTING: Success

Set errno in gicv2m_update_routing() to avoid such messages in the future.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 arm/gicv2m.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/arm/gicv2m.c b/arm/gicv2m.c
index d7e6398..b47ada8 100644
--- a/arm/gicv2m.c
+++ b/arm/gicv2m.c
@@ -42,16 +42,18 @@ static int gicv2m_update_routing(struct kvm *kvm,
 {
 	int spi;
 
-	if (entry->type != KVM_IRQ_ROUTING_MSI)
-		return -EINVAL;
+	if (entry->type != KVM_IRQ_ROUTING_MSI) {
+		errno = EINVAL;
+		return -errno;
+	}
 
 	if (!entry->u.msi.address_hi && !entry->u.msi.address_lo)
 		return 0;
 
 	spi = entry->u.msi.data & GICV2M_SPI_MASK;
 	if (spi < v2m.first_spi || spi >= v2m.first_spi + v2m.num_spis) {
-		pr_err("invalid SPI number %d", spi);
-		return -EINVAL;
+		errno = EINVAL;
+		return -errno;
 	}
 
 	v2m.spis[spi - v2m.first_spi] = entry->gsi;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v1 kvmtool 2/7] vfio/pci.c: Remove double include for assert.h
  2021-09-13 15:44 [PATCH v1 kvmtool 0/7] vfio/pci: Fix MSIX table and PBA size allocation Alexandru Elisei
  2021-09-13 15:44 ` [PATCH v1 kvmtool 1/7] arm/gicv2m: Set errno when gicv2_update_routing() fails Alexandru Elisei
@ 2021-09-13 15:44 ` Alexandru Elisei
  2021-10-06 15:09   ` Andre Przywara
  2021-09-13 15:44 ` [PATCH v1 kvmtool 3/7] pci: Fix pci_dev_* print macros Alexandru Elisei
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 20+ messages in thread
From: Alexandru Elisei @ 2021-09-13 15:44 UTC (permalink / raw)
  To: will, julien.thierry.kdev, kvm; +Cc: andre.przywara, jean-philippe

assert.h is included twice, keep only one instance.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 vfio/pci.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/vfio/pci.c b/vfio/pci.c
index ea33fd6..10ff99e 100644
--- a/vfio/pci.c
+++ b/vfio/pci.c
@@ -10,8 +10,6 @@
 #include <sys/resource.h>
 #include <sys/time.h>
 
-#include <assert.h>
-
 /* Some distros don't have the define. */
 #ifndef PCI_CAP_EXP_RC_ENDPOINT_SIZEOF_V1
 #define PCI_CAP_EXP_RC_ENDPOINT_SIZEOF_V1	12
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v1 kvmtool 3/7] pci: Fix pci_dev_* print macros
  2021-09-13 15:44 [PATCH v1 kvmtool 0/7] vfio/pci: Fix MSIX table and PBA size allocation Alexandru Elisei
  2021-09-13 15:44 ` [PATCH v1 kvmtool 1/7] arm/gicv2m: Set errno when gicv2_update_routing() fails Alexandru Elisei
  2021-09-13 15:44 ` [PATCH v1 kvmtool 2/7] vfio/pci.c: Remove double include for assert.h Alexandru Elisei
@ 2021-09-13 15:44 ` Alexandru Elisei
  2021-09-14  9:13   ` [RESEND PATCH v1 kvmtool 4/8] vfio/pci: Rename PBA offset in device descriptor to fd_offset Alexandru Elisei
  2021-10-06 15:10   ` [PATCH v1 kvmtool 3/7] pci: Fix pci_dev_* print macros Andre Przywara
  2021-09-13 15:44 ` [PATCH v1 kvmtool 5/7] vfio/pci: Rework MSIX table and PBA physical size allocation Alexandru Elisei
                   ` (4 subsequent siblings)
  7 siblings, 2 replies; 20+ messages in thread
From: Alexandru Elisei @ 2021-09-13 15:44 UTC (permalink / raw)
  To: will, julien.thierry.kdev, kvm; +Cc: andre.przywara, jean-philippe

Evaluate the "pci_hdr" argument before attempting to deference a field.
This fixes cryptic errors like this one, which came about during a
debugging session:

vfio/pci.c: In function 'vfio_pci_bar_activate':
include/kvm/pci.h:18:40: error: invalid type argument of '->' (have 'struct pci_device_header')
  pr_warning("[%04x:%04x] " fmt, pci_hdr->vendor_id, pci_hdr->device_id, ##__VA_ARGS__)
                                        ^~
vfio/pci.c:482:3: note: in expansion of macro 'pci_dev_warn'
   pci_dev_warn(&vdev->pci.hdr, "%s: BAR4\n", __func__);

This is caused by the operator precedence rules in C, where pointer
deference via "->" has a higher precedence than taking the address with the
ampersand symbol. When the macro is substituted, it becomes
&vdev->pci.hdr->vendor_id and it dereferences vdev->pci.hdr, which is not a
pointer, instead of dereferencing &vdev->pci.hdr, which is a pointer, and
quite likely what the author intended.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 include/kvm/pci.h | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/include/kvm/pci.h b/include/kvm/pci.h
index 0f2d5bb..d6eb398 100644
--- a/include/kvm/pci.h
+++ b/include/kvm/pci.h
@@ -13,15 +13,15 @@
 #include "kvm/kvm-arch.h"
 
 #define pci_dev_err(pci_hdr, fmt, ...) \
-	pr_err("[%04x:%04x] " fmt, pci_hdr->vendor_id, pci_hdr->device_id, ##__VA_ARGS__)
+	pr_err("[%04x:%04x] " fmt, (pci_hdr)->vendor_id, (pci_hdr)->device_id, ##__VA_ARGS__)
 #define pci_dev_warn(pci_hdr, fmt, ...) \
-	pr_warning("[%04x:%04x] " fmt, pci_hdr->vendor_id, pci_hdr->device_id, ##__VA_ARGS__)
+	pr_warning("[%04x:%04x] " fmt, (pci_hdr)->vendor_id, (pci_hdr)->device_id, ##__VA_ARGS__)
 #define pci_dev_info(pci_hdr, fmt, ...) \
-	pr_info("[%04x:%04x] " fmt, pci_hdr->vendor_id, pci_hdr->device_id, ##__VA_ARGS__)
+	pr_info("[%04x:%04x] " fmt, (pci_hdr)->vendor_id, (pci_hdr)->device_id, ##__VA_ARGS__)
 #define pci_dev_dbg(pci_hdr, fmt, ...) \
-	pr_debug("[%04x:%04x] " fmt, pci_hdr->vendor_id, pci_hdr->device_id, ##__VA_ARGS__)
+	pr_debug("[%04x:%04x] " fmt, (pci_hdr)->vendor_id, (pci_hdr)->device_id, ##__VA_ARGS__)
 #define pci_dev_die(pci_hdr, fmt, ...) \
-	die("[%04x:%04x] " fmt, pci_hdr->vendor_id, pci_hdr->device_id, ##__VA_ARGS__)
+	die("[%04x:%04x] " fmt, (pci_hdr)->vendor_id, (pci_hdr)->device_id, ##__VA_ARGS__)
 
 /*
  * PCI Configuration Mechanism #1 I/O ports. See Section 3.7.4.1.
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v1 kvmtool 5/7] vfio/pci: Rework MSIX table and PBA physical size allocation
  2021-09-13 15:44 [PATCH v1 kvmtool 0/7] vfio/pci: Fix MSIX table and PBA size allocation Alexandru Elisei
                   ` (2 preceding siblings ...)
  2021-09-13 15:44 ` [PATCH v1 kvmtool 3/7] pci: Fix pci_dev_* print macros Alexandru Elisei
@ 2021-09-13 15:44 ` Alexandru Elisei
  2021-10-06 15:11   ` Andre Przywara
  2021-09-13 15:44 ` [PATCH v1 kvmtool 6/7] vfio/pci: Print an error when offset is outside of the MSIX table or PBA Alexandru Elisei
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 20+ messages in thread
From: Alexandru Elisei @ 2021-09-13 15:44 UTC (permalink / raw)
  To: will, julien.thierry.kdev, kvm; +Cc: andre.przywara, jean-philippe

When creating the MSIX table and PBA, kvmtool rounds up the table and
pending bit array sizes to the host's page size. Unfortunately, when doing
that, it doesn't take into account that the new size can exceed the device
BAR size, leading to hard to diagnose errors for certain configurations.

One theoretical example: PBA and table in the same 4k BAR, host's page size
is 4k. In this case, table->size = 4k, pba->size = 4k, map_size = 4k, which
means that pba->guest_phys_addr = table->guest_phys_addr + 4k, which is
outside of the 4k MMIO range allocated for both structures.

Another example, this time a real-world error that I encountered: happens
with a 64k host booting a 4k guest, an RTL8168 PCIE NIC assigned to the
guest. In this case, kvmtool sets table->size = 64k (because it's rounded
to the host's page size) and pba->size = 64k.

Truncated output of lspci -vv on the host:

01:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 06)
	Subsystem: TP-LINK Technologies Co., Ltd. TG-3468 Gigabit PCI Express Network Adapter
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 255
	Region 0: I/O ports at 1000 [size=256]
	Region 2: Memory at 40000000 (64-bit, non-prefetchable) [size=4K]
	Region 4: Memory at 100000000 (64-bit, prefetchable) [size=16K]
	[..]
	Capabilities: [b0] MSI-X: Enable- Count=4 Masked-
		Vector table: BAR=4 offset=00000000
		PBA: BAR=4 offset=00000800
	[..]

When booting the guest:

[..]
[    0.207444] pci-host-generic 40000000.pci: host bridge /pci ranges:
[    0.208564] pci-host-generic 40000000.pci:       IO 0x0000000000..0x000000ffff -> 0x0000000000
[    0.209857] pci-host-generic 40000000.pci:      MEM 0x0050000000..0x007fffffff -> 0x0050000000
[    0.211184] pci-host-generic 40000000.pci: ECAM at [mem 0x40000000-0x4fffffff] for [bus 00]
[    0.212625] pci-host-generic 40000000.pci: PCI host bridge to bus 0000:00
[    0.213647] pci_bus 0000:00: root bus resource [bus 00]
[    0.214429] pci_bus 0000:00: root bus resource [io  0x0000-0xffff]
[    0.215355] pci_bus 0000:00: root bus resource [mem 0x50000000-0x7fffffff]
[    0.216676] pci 0000:00:00.0: [10ec:8168] type 00 class 0x020000
[    0.223771] pci 0000:00:00.0: reg 0x10: [io  0x6200-0x62ff]
[    0.239765] pci 0000:00:00.0: reg 0x18: [mem 0x50010000-0x50010fff]
[    0.244595] pci 0000:00:00.0: reg 0x20: [mem 0x50000000-0x50003fff]
[    0.246331] pci 0000:00:01.0: [1af4:1000] type 00 class 0x020000
[    0.247278] pci 0000:00:01.0: reg 0x10: [io  0x6300-0x63ff]
[    0.248212] pci 0000:00:01.0: reg 0x14: [mem 0x50020000-0x500200ff]
[    0.249172] pci 0000:00:01.0: reg 0x18: [mem 0x50020400-0x500207ff]
[    0.250450] pci 0000:00:02.0: [1af4:1001] type 00 class 0x018000
[    0.251392] pci 0000:00:02.0: reg 0x10: [io  0x6400-0x64ff]
[    0.252351] pci 0000:00:02.0: reg 0x14: [mem 0x50020800-0x500208ff]
[    0.253312] pci 0000:00:02.0: reg 0x18: [mem 0x50020c00-0x50020fff]
(1) [    0.254760] pci 0000:00:00.0: BAR 4: assigned [mem 0x50000000-0x50003fff]
(2) [    0.255805] pci 0000:00:00.0: BAR 2: assigned [mem 0x50004000-0x50004fff]
  Warning: [10ec:8168] Error activating emulation for BAR 2
  Warning: [10ec:8168] Error activating emulation for BAR 2
[    0.260432] pci 0000:00:01.0: BAR 2: assigned [mem 0x50005000-0x500053ff]
  Warning: [1af4:1000] Error activating emulation for BAR 2
  Warning: [1af4:1000] Error activating emulation for BAR 2
[    0.261469] pci 0000:00:02.0: BAR 2: assigned [mem 0x50005400-0x500057ff]
  Warning: [1af4:1001] Error activating emulation for BAR 2
  Warning: [1af4:1001] Error activating emulation for BAR 2
[    0.262499] pci 0000:00:00.0: BAR 0: assigned [io  0x1000-0x10ff]
[    0.263415] pci 0000:00:01.0: BAR 0: assigned [io  0x1100-0x11ff]
[    0.264462] pci 0000:00:01.0: BAR 1: assigned [mem 0x50005800-0x500058ff]
  Warning: [1af4:1000] Error activating emulation for BAR 1
  Warning: [1af4:1000] Error activating emulation for BAR 1
[    0.265481] pci 0000:00:02.0: BAR 0: assigned [io  0x1200-0x12ff]
[    0.266397] pci 0000:00:02.0: BAR 1: assigned [mem 0x50005900-0x500059ff]
  Warning: [1af4:1001] Error activating emulation for BAR 1
  Warning: [1af4:1001] Error activating emulation for BAR 1
[    0.267892] EINJ: ACPI disabled.
[    0.269922] virtio-pci 0000:00:01.0: virtio_pci: leaving for legacy driver
[    0.271118] virtio-pci 0000:00:02.0: virtio_pci: leaving for legacy driver
[    0.274122] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
[    0.275930] printk: console [ttyS0] disabled
[    0.276669] 1000000.U6_16550A: ttyS0 at MMIO 0x1000000 (irq = 13, base_baud = 115200) is a 16550A
[    0.278058] printk: console [ttyS0] enabled
[    0.278058] printk: console [ttyS0] enabled
[    0.279304] printk: bootconsole [ns16550a0] disabled
[    0.279304] printk: bootconsole [ns16550a0] disabled
[    0.281252] 1001000.U6_16550A: ttyS1 at MMIO 0x1001000 (irq = 14, base_baud = 115200) is a 16550A
[    0.282842] 1002000.U6_16550A: ttyS2 at MMIO 0x1002000 (irq = 15, base_baud = 115200) is a 16550A
[    0.284611] 1003000.U6_16550A: ttyS3 at MMIO 0x1003000 (irq = 16, base_baud = 115200) is a 16550A
[    0.286094] SuperH (H)SCI(F) driver initialized
[    0.286868] msm_serial: driver initialized
[    0.287890] [drm] radeon kernel modesetting enabled.
[    0.288826] cacheinfo: Unable to detect cache hierarchy for CPU 0
[    0.293321] loop: module loaded
KVM_SET_GSI_ROUTING: Invalid argument

At (1), the guest writes 0x50000000 into BAR 4 of the NIC (which holds
the MSIX table and PBA), expecting that will cover only 16k of address
space (the BAR size), up to 0x50003fff, inclusive. On the host side, in
vfio_pci_bar_activate(), kvmtool will actually register for MMIO
emulation the region 0x50000000-0x5000ffff (64k in total) for the MSIX
table and 0x50010000-0x5001ffff (another 64k) for the PBA (kvmtool set
table->size and pba->size to 64k when it aligned them to the host's page
size).

Then at step (2), the guest writes the next available address (from its
point of view) into BAR 2 of the NIC, which is 0x50004000. On the host
side, the PCI emulation layer will search all the regions that overlap with
the BAR address range (0x50004000-0x50004fff) and will find none because,
just like the guest, it uses the BAR size to check for overlaps. When
vfio_pci_bar_activate() is reached, kvmtool will try to register memory for
this region, but it is already registered for the MSIX table emulation and
fails.

The same scenario repeats for every following memory BAR, because the MSIX
table and PBA use memory from 0x50000000 to 0x5001ffff.

The error at the end, which finally terminates the VM, is caused by the
guest trying to write to a totally different BAR, which vfio-pci
interpretes as a write to MSI-X table because it falls in the 64k region
that was registered for emulation. The IRQ ID is not a valid SPI number and
gicv2m_update_routing() returns an error (and sets errno to EINVAL).

Fix this by aligning the table and PBA size to 8 bytes to allow for
qword accesses, like PCI 3.0 mandates.

For the sake of simplicity, the PBA offset in a BAR, in case of a shared
BAR, is kept the same as the offset of the physical device. One hopes that
the device respects the recommendations set forth in PCI LOCAL BUS
SPECIFICATION, REV. 3.0, section "MSI-X Capability and Table Structures"

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 include/kvm/vfio.h |  1 +
 vfio/pci.c         | 65 ++++++++++++++++++++++++++++------------------
 2 files changed, 41 insertions(+), 25 deletions(-)

diff --git a/include/kvm/vfio.h b/include/kvm/vfio.h
index 8cdf04f..764ab9b 100644
--- a/include/kvm/vfio.h
+++ b/include/kvm/vfio.h
@@ -50,6 +50,7 @@ struct vfio_pci_msix_pba {
 	size_t				size;
 	off_t				fd_offset; /* in VFIO device fd */
 	unsigned int			bar;
+	u32				bar_offset; /* in the shared BAR */
 	u32				guest_phys_addr;
 };
 
diff --git a/vfio/pci.c b/vfio/pci.c
index cc18311..7781868 100644
--- a/vfio/pci.c
+++ b/vfio/pci.c
@@ -502,7 +502,7 @@ static int vfio_pci_bar_activate(struct kvm *kvm,
 
 	if (has_msix && (u32)bar_num == pba->bar) {
 		if (pba->bar == table->bar)
-			pba->guest_phys_addr = table->guest_phys_addr + table->size;
+			pba->guest_phys_addr = table->guest_phys_addr + pba->bar_offset;
 		else
 			pba->guest_phys_addr = region->guest_phys_addr;
 		ret = kvm__register_mmio(kvm, pba->guest_phys_addr,
@@ -815,15 +815,21 @@ static int vfio_pci_fixup_cfg_space(struct vfio_device *vdev)
 	if (msix) {
 		/* Add a shortcut to the PBA region for the MMIO handler */
 		int pba_index = VFIO_PCI_BAR0_REGION_INDEX + pdev->msix_pba.bar;
+		u32 pba_bar_offset = msix->pba_offset & PCI_MSIX_PBA_OFFSET;
+
 		pdev->msix_pba.fd_offset = vdev->regions[pba_index].info.offset +
-					   (msix->pba_offset & PCI_MSIX_PBA_OFFSET);
+					   pba_bar_offset;
 
 		/* Tidy up the capability */
 		msix->table_offset &= PCI_MSIX_TABLE_BIR;
-		msix->pba_offset &= PCI_MSIX_PBA_BIR;
-		if (pdev->msix_table.bar == pdev->msix_pba.bar)
-			msix->pba_offset |= pdev->msix_table.size &
-					    PCI_MSIX_PBA_OFFSET;
+		if (pdev->msix_table.bar == pdev->msix_pba.bar) {
+			/* Keep the same offset as the MSIX cap. */
+			pdev->msix_pba.bar_offset = pba_bar_offset;
+		} else {
+			/* PBA is at the start of the BAR. */
+			msix->pba_offset &= PCI_MSIX_PBA_BIR;
+			pdev->msix_pba.bar_offset = 0;
+		}
 	}
 
 	/* Install our fake Configuration Space */
@@ -896,8 +902,10 @@ static int vfio_pci_create_msix_table(struct kvm *kvm, struct vfio_device *vdev)
 	 * KVM needs memory regions to be multiple of and aligned on PAGE_SIZE.
 	 */
 	nr_entries = (msix->ctrl & PCI_MSIX_FLAGS_QSIZE) + 1;
-	table->size = ALIGN(nr_entries * PCI_MSIX_ENTRY_SIZE, PAGE_SIZE);
-	pba->size = ALIGN(DIV_ROUND_UP(nr_entries, 64), PAGE_SIZE);
+
+	/* MSIX table and PBA must support QWORD accesses. */
+	table->size = ALIGN(nr_entries * PCI_MSIX_ENTRY_SIZE, 8);
+	pba->size = ALIGN(DIV_ROUND_UP(nr_entries, 64), 8);
 
 	entries = calloc(nr_entries, sizeof(struct vfio_pci_msi_entry));
 	if (!entries)
@@ -911,23 +919,8 @@ static int vfio_pci_create_msix_table(struct kvm *kvm, struct vfio_device *vdev)
 		return ret;
 	if (!info.size)
 		return -EINVAL;
-	map_size = info.size;
-
-	if (table->bar != pba->bar) {
-		ret = vfio_pci_get_region_info(vdev, pba->bar, &info);
-		if (ret)
-			return ret;
-		if (!info.size)
-			return -EINVAL;
-		map_size += info.size;
-	}
 
-	/*
-	 * To ease MSI-X cap configuration in case they share the same BAR,
-	 * collapse table and pending array. The size of the BAR regions must be
-	 * powers of two.
-	 */
-	map_size = ALIGN(map_size, PAGE_SIZE);
+	map_size = ALIGN(info.size, PAGE_SIZE);
 	table->guest_phys_addr = pci_get_mmio_block(map_size);
 	if (!table->guest_phys_addr) {
 		pr_err("cannot allocate MMIO space");
@@ -943,7 +936,29 @@ static int vfio_pci_create_msix_table(struct kvm *kvm, struct vfio_device *vdev)
 	 * between MSI-X table and PBA. For the sake of isolation, create a
 	 * virtual PBA.
 	 */
-	pba->guest_phys_addr = table->guest_phys_addr + table->size;
+	if (table->bar == pba->bar) {
+		u32 pba_bar_offset = msix->pba_offset & PCI_MSIX_PBA_OFFSET;
+		/* Sanity checks. */
+		if (table->size > pba_bar_offset)
+			die("MSIX table overlaps with PBA");
+		if (pba_bar_offset + pba->size > info.size)
+			die("PBA exceeds the size of the region");
+		pba->guest_phys_addr = table->guest_phys_addr + pba_bar_offset;
+	} else {
+		ret = vfio_pci_get_region_info(vdev, pba->bar, &info);
+		if (ret)
+			return ret;
+		if (!info.size)
+			return -EINVAL;
+
+		map_size = ALIGN(info.size, PAGE_SIZE);
+		pba->guest_phys_addr = pci_get_mmio_block(map_size);
+		if (!pba->guest_phys_addr) {
+			pr_err("cannot allocate MMIO space");
+			ret = -ENOMEM;
+			goto out_free;
+		}
+	}
 
 	pdev->msix.entries = entries;
 	pdev->msix.nr_entries = nr_entries;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v1 kvmtool 6/7] vfio/pci: Print an error when offset is outside of the MSIX table or PBA
  2021-09-13 15:44 [PATCH v1 kvmtool 0/7] vfio/pci: Fix MSIX table and PBA size allocation Alexandru Elisei
                   ` (3 preceding siblings ...)
  2021-09-13 15:44 ` [PATCH v1 kvmtool 5/7] vfio/pci: Rework MSIX table and PBA physical size allocation Alexandru Elisei
@ 2021-09-13 15:44 ` Alexandru Elisei
  2021-10-06 15:11   ` Andre Przywara
  2021-09-13 15:44 ` [PATCH v1 kvmtool 7/7] vfio/pci: Align MSIX Table and PBA size allocation to 64k Alexandru Elisei
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 20+ messages in thread
From: Alexandru Elisei @ 2021-09-13 15:44 UTC (permalink / raw)
  To: will, julien.thierry.kdev, kvm; +Cc: andre.przywara, jean-philippe

Now that we keep track of the real size of MSIX table and PBA, print an
error when the guest tries to write to an offset which is not inside the
correct regions.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 vfio/pci.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/vfio/pci.c b/vfio/pci.c
index 7781868..a6d0408 100644
--- a/vfio/pci.c
+++ b/vfio/pci.c
@@ -249,6 +249,11 @@ static void vfio_pci_msix_pba_access(struct kvm_cpu *vcpu, u64 addr, u8 *data,
 	u64 offset = addr - pba->guest_phys_addr;
 	struct vfio_device *vdev = container_of(pdev, struct vfio_device, pci);
 
+	if (offset >= pba->size) {
+		vfio_dev_err(vdev, "access outside of the MSIX PBA");
+		return;
+	}
+
 	if (is_write)
 		return;
 
@@ -269,6 +274,10 @@ static void vfio_pci_msix_table_access(struct kvm_cpu *vcpu, u64 addr, u8 *data,
 	struct vfio_device *vdev = container_of(pdev, struct vfio_device, pci);
 
 	u64 offset = addr - pdev->msix_table.guest_phys_addr;
+	if (offset >= pdev->msix_table.size) {
+		vfio_dev_err(vdev, "access outside of the MSI-X table");
+		return;
+	}
 
 	size_t vector = offset / PCI_MSIX_ENTRY_SIZE;
 	off_t field = offset % PCI_MSIX_ENTRY_SIZE;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v1 kvmtool 7/7] vfio/pci: Align MSIX Table and PBA size allocation to 64k
  2021-09-13 15:44 [PATCH v1 kvmtool 0/7] vfio/pci: Fix MSIX table and PBA size allocation Alexandru Elisei
                   ` (4 preceding siblings ...)
  2021-09-13 15:44 ` [PATCH v1 kvmtool 6/7] vfio/pci: Print an error when offset is outside of the MSIX table or PBA Alexandru Elisei
@ 2021-09-13 15:44 ` Alexandru Elisei
  2021-10-06 15:11   ` Andre Przywara
       [not found] ` <20210913154413.14322-5-alexandru.elisei@arm.com>
  2021-10-12  8:31 ` [PATCH v1 kvmtool 0/7] vfio/pci: Fix MSIX table and PBA size allocation Will Deacon
  7 siblings, 1 reply; 20+ messages in thread
From: Alexandru Elisei @ 2021-09-13 15:44 UTC (permalink / raw)
  To: will, julien.thierry.kdev, kvm; +Cc: andre.przywara, jean-philippe

When allocating MMIO space for the MSI-X table, kvmtool rounds the
allocation to the host's page size to make it as easy as possible for the
guest to map the table to a page, if it wants to (and doesn't do BAR
reassignment, like the x86 architecture for example). However, the host's
page size can differ from the guest's, for example, if the host is compiled
with 4k pages and the guest is using 64k pages.

To make sure the allocation is always aligned to a guest's page size, round
it up to the maximum page size, which is 64k. Do the same for the pending
bit array if it lives in its own BAR.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 vfio/pci.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/vfio/pci.c b/vfio/pci.c
index a6d0408..7e258a4 100644
--- a/vfio/pci.c
+++ b/vfio/pci.c
@@ -1,3 +1,5 @@
+#include "linux/sizes.h"
+
 #include "kvm/irq.h"
 #include "kvm/kvm.h"
 #include "kvm/kvm-cpu.h"
@@ -929,7 +931,7 @@ static int vfio_pci_create_msix_table(struct kvm *kvm, struct vfio_device *vdev)
 	if (!info.size)
 		return -EINVAL;
 
-	map_size = ALIGN(info.size, PAGE_SIZE);
+	map_size = ALIGN(info.size, SZ_64K);
 	table->guest_phys_addr = pci_get_mmio_block(map_size);
 	if (!table->guest_phys_addr) {
 		pr_err("cannot allocate MMIO space");
@@ -960,7 +962,7 @@ static int vfio_pci_create_msix_table(struct kvm *kvm, struct vfio_device *vdev)
 		if (!info.size)
 			return -EINVAL;
 
-		map_size = ALIGN(info.size, PAGE_SIZE);
+		map_size = ALIGN(info.size, SZ_64K);
 		pba->guest_phys_addr = pci_get_mmio_block(map_size);
 		if (!pba->guest_phys_addr) {
 			pr_err("cannot allocate MMIO space");
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [RESEND PATCH v1 kvmtool 4/8] vfio/pci: Rename PBA offset in device descriptor to fd_offset
  2021-09-13 15:44 ` [PATCH v1 kvmtool 3/7] pci: Fix pci_dev_* print macros Alexandru Elisei
@ 2021-09-14  9:13   ` Alexandru Elisei
  2021-10-06 15:10   ` [PATCH v1 kvmtool 3/7] pci: Fix pci_dev_* print macros Andre Przywara
  1 sibling, 0 replies; 20+ messages in thread
From: Alexandru Elisei @ 2021-09-14  9:13 UTC (permalink / raw)
  To: will, julien.thierry.kdev, kvm; +Cc: andre.przywara, jean-philippe

The MSI-X capability defines a PBA offset, which is the offset of the PBA
array in the BAR that holds the array.

kvmtool uses the field "pba_offset" in struct msix_cap (which represents
the MSIX capability) to refer to the [PBA offset:BAR] field of the
capability; and the field "offset" in the struct vfio_pci_msix_pba to refer
to offset of the PBA array in the device descriptor created by the VFIO
driver.

As we're getting ready to add yet another field that represents an offset
to struct vfio_pci_msix_pba, try to avoid ambiguities by renaming the
struct's "offset" field to "fd_offset".

No functional change intended.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
This is a resend of the original patch which kvm@vger.kernel.org rejected:

<kvm@vger.kernel.org>: host vger.kernel.org[23.128.96.18] said: 550 5.7.1
    Content-Policy reject msg: The message contains HTML, therefore we consider
    it SPAM.  Send pure TEXT/PLAIN if you are not a spammer. BF:<S 1>;
    S1343576AbhIMPoN (in reply to end of DATA command)

 include/kvm/vfio.h | 2 +-
 vfio/pci.c         | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/kvm/vfio.h b/include/kvm/vfio.h
index 28223cf2f036..8cdf04fcc265 100644
--- a/include/kvm/vfio.h
+++ b/include/kvm/vfio.h
@@ -48,7 +48,7 @@ struct vfio_pci_msix_table {
 
 struct vfio_pci_msix_pba {
 	size_t				size;
-	off_t				offset; /* in VFIO device fd */
+	off_t				fd_offset; /* in VFIO device fd */
 	unsigned int			bar;
 	u32				guest_phys_addr;
 };
diff --git a/vfio/pci.c b/vfio/pci.c
index 10ff99e70226..cc183118c7e3 100644
--- a/vfio/pci.c
+++ b/vfio/pci.c
@@ -256,7 +256,7 @@ static void vfio_pci_msix_pba_access(struct kvm_cpu *vcpu, u64 addr, u8 *data,
 	 * TODO: emulate PBA. Hardware MSI-X is never masked, so reading the PBA
 	 * is completely useless here. Note that Linux doesn't use PBA.
 	 */
-	if (pread(vdev->fd, data, len, pba->offset + offset) != (ssize_t)len)
+	if (pread(vdev->fd, data, len, pba->fd_offset + offset) != (ssize_t)len)
 		vfio_dev_err(vdev, "cannot access MSIX PBA\n");
 }
 
@@ -815,8 +815,8 @@ static int vfio_pci_fixup_cfg_space(struct vfio_device *vdev)
 	if (msix) {
 		/* Add a shortcut to the PBA region for the MMIO handler */
 		int pba_index = VFIO_PCI_BAR0_REGION_INDEX + pdev->msix_pba.bar;
-		pdev->msix_pba.offset = vdev->regions[pba_index].info.offset +
-					(msix->pba_offset & PCI_MSIX_PBA_OFFSET);
+		pdev->msix_pba.fd_offset = vdev->regions[pba_index].info.offset +
+					   (msix->pba_offset & PCI_MSIX_PBA_OFFSET);
 
 		/* Tidy up the capability */
 		msix->table_offset &= PCI_MSIX_TABLE_BIR;
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 kvmtool 1/7] arm/gicv2m: Set errno when gicv2_update_routing() fails
  2021-09-13 15:44 ` [PATCH v1 kvmtool 1/7] arm/gicv2m: Set errno when gicv2_update_routing() fails Alexandru Elisei
@ 2021-10-06 15:08   ` Andre Przywara
  0 siblings, 0 replies; 20+ messages in thread
From: Andre Przywara @ 2021-10-06 15:08 UTC (permalink / raw)
  To: Alexandru Elisei; +Cc: will, julien.thierry.kdev, kvm, jean-philippe

On Mon, 13 Sep 2021 16:44:07 +0100
Alexandru Elisei <alexandru.elisei@arm.com> wrote:

> In case of an error when updating the routing table entries,
> irq__update_msix_route() uses perror to print an error message.
> gicv2m_update_routing() doesn't set errno, and instead returns the value
> that errno should have had, which can lead to failure messages like this:
> 
> KVM_SET_GSI_ROUTING: Success
> 
> Set errno in gicv2m_update_routing() to avoid such messages in the future.

Fair enough, the usage of errno in the error reporting path is not really
consistent in kvmtool, but as we also keep the return value, that's
alright:

> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>

Reviewed-by: Andre Przywara <andre.przywara@arm.com>

Cheers,
Andre

> ---
>  arm/gicv2m.c | 10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/arm/gicv2m.c b/arm/gicv2m.c
> index d7e6398..b47ada8 100644
> --- a/arm/gicv2m.c
> +++ b/arm/gicv2m.c
> @@ -42,16 +42,18 @@ static int gicv2m_update_routing(struct kvm *kvm,
>  {
>  	int spi;
>  
> -	if (entry->type != KVM_IRQ_ROUTING_MSI)
> -		return -EINVAL;
> +	if (entry->type != KVM_IRQ_ROUTING_MSI) {
> +		errno = EINVAL;
> +		return -errno;
> +	}
>  
>  	if (!entry->u.msi.address_hi && !entry->u.msi.address_lo)
>  		return 0;
>  
>  	spi = entry->u.msi.data & GICV2M_SPI_MASK;
>  	if (spi < v2m.first_spi || spi >= v2m.first_spi + v2m.num_spis) {
> -		pr_err("invalid SPI number %d", spi);
> -		return -EINVAL;
> +		errno = EINVAL;
> +		return -errno;
>  	}
>  
>  	v2m.spis[spi - v2m.first_spi] = entry->gsi;


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 kvmtool 2/7] vfio/pci.c: Remove double include for assert.h
  2021-09-13 15:44 ` [PATCH v1 kvmtool 2/7] vfio/pci.c: Remove double include for assert.h Alexandru Elisei
@ 2021-10-06 15:09   ` Andre Przywara
  0 siblings, 0 replies; 20+ messages in thread
From: Andre Przywara @ 2021-10-06 15:09 UTC (permalink / raw)
  To: Alexandru Elisei; +Cc: will, julien.thierry.kdev, kvm, jean-philippe

On Mon, 13 Sep 2021 16:44:08 +0100
Alexandru Elisei <alexandru.elisei@arm.com> wrote:

> assert.h is included twice, keep only one instance.
> 
> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>

Reviewed-by: Andre Przywara <andre.przywara@arm.com>

Cheers,
Andre

> ---
>  vfio/pci.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/vfio/pci.c b/vfio/pci.c
> index ea33fd6..10ff99e 100644
> --- a/vfio/pci.c
> +++ b/vfio/pci.c
> @@ -10,8 +10,6 @@
>  #include <sys/resource.h>
>  #include <sys/time.h>
>  
> -#include <assert.h>
> -
>  /* Some distros don't have the define. */
>  #ifndef PCI_CAP_EXP_RC_ENDPOINT_SIZEOF_V1
>  #define PCI_CAP_EXP_RC_ENDPOINT_SIZEOF_V1	12


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 kvmtool 3/7] pci: Fix pci_dev_* print macros
  2021-09-13 15:44 ` [PATCH v1 kvmtool 3/7] pci: Fix pci_dev_* print macros Alexandru Elisei
  2021-09-14  9:13   ` [RESEND PATCH v1 kvmtool 4/8] vfio/pci: Rename PBA offset in device descriptor to fd_offset Alexandru Elisei
@ 2021-10-06 15:10   ` Andre Przywara
  1 sibling, 0 replies; 20+ messages in thread
From: Andre Przywara @ 2021-10-06 15:10 UTC (permalink / raw)
  To: Alexandru Elisei; +Cc: will, julien.thierry.kdev, kvm, jean-philippe

On Mon, 13 Sep 2021 16:44:09 +0100
Alexandru Elisei <alexandru.elisei@arm.com> wrote:

> Evaluate the "pci_hdr" argument before attempting to deference a field.
> This fixes cryptic errors like this one, which came about during a
> debugging session:
> 
> vfio/pci.c: In function 'vfio_pci_bar_activate':
> include/kvm/pci.h:18:40: error: invalid type argument of '->' (have 'struct pci_device_header')
>   pr_warning("[%04x:%04x] " fmt, pci_hdr->vendor_id, pci_hdr->device_id, ##__VA_ARGS__)
>                                         ^~
> vfio/pci.c:482:3: note: in expansion of macro 'pci_dev_warn'
>    pci_dev_warn(&vdev->pci.hdr, "%s: BAR4\n", __func__);
> 
> This is caused by the operator precedence rules in C, where pointer
> deference via "->" has a higher precedence than taking the address with the
> ampersand symbol. When the macro is substituted, it becomes
> &vdev->pci.hdr->vendor_id and it dereferences vdev->pci.hdr, which is not a
> pointer, instead of dereferencing &vdev->pci.hdr, which is a pointer, and
> quite likely what the author intended.

Indeed! Actually that should not need that many words, parameters in macros
should always be put in parentheses.

> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>

Reviewed-by: Andre Przywara <andre.przywara@arm.com>

Cheers,
Andre

> ---
>  include/kvm/pci.h | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/include/kvm/pci.h b/include/kvm/pci.h
> index 0f2d5bb..d6eb398 100644
> --- a/include/kvm/pci.h
> +++ b/include/kvm/pci.h
> @@ -13,15 +13,15 @@
>  #include "kvm/kvm-arch.h"
>  
>  #define pci_dev_err(pci_hdr, fmt, ...) \
> -	pr_err("[%04x:%04x] " fmt, pci_hdr->vendor_id, pci_hdr->device_id, ##__VA_ARGS__)
> +	pr_err("[%04x:%04x] " fmt, (pci_hdr)->vendor_id, (pci_hdr)->device_id, ##__VA_ARGS__)
>  #define pci_dev_warn(pci_hdr, fmt, ...) \
> -	pr_warning("[%04x:%04x] " fmt, pci_hdr->vendor_id, pci_hdr->device_id, ##__VA_ARGS__)
> +	pr_warning("[%04x:%04x] " fmt, (pci_hdr)->vendor_id, (pci_hdr)->device_id, ##__VA_ARGS__)
>  #define pci_dev_info(pci_hdr, fmt, ...) \
> -	pr_info("[%04x:%04x] " fmt, pci_hdr->vendor_id, pci_hdr->device_id, ##__VA_ARGS__)
> +	pr_info("[%04x:%04x] " fmt, (pci_hdr)->vendor_id, (pci_hdr)->device_id, ##__VA_ARGS__)
>  #define pci_dev_dbg(pci_hdr, fmt, ...) \
> -	pr_debug("[%04x:%04x] " fmt, pci_hdr->vendor_id, pci_hdr->device_id, ##__VA_ARGS__)
> +	pr_debug("[%04x:%04x] " fmt, (pci_hdr)->vendor_id, (pci_hdr)->device_id, ##__VA_ARGS__)
>  #define pci_dev_die(pci_hdr, fmt, ...) \
> -	die("[%04x:%04x] " fmt, pci_hdr->vendor_id, pci_hdr->device_id, ##__VA_ARGS__)
> +	die("[%04x:%04x] " fmt, (pci_hdr)->vendor_id, (pci_hdr)->device_id, ##__VA_ARGS__)
>  
>  /*
>   * PCI Configuration Mechanism #1 I/O ports. See Section 3.7.4.1.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 kvmtool 4/7] vfio/pci: Rename PBA offset in device descriptor to fd_offset
       [not found] ` <20210913154413.14322-5-alexandru.elisei@arm.com>
@ 2021-10-06 15:10   ` Andre Przywara
  0 siblings, 0 replies; 20+ messages in thread
From: Andre Przywara @ 2021-10-06 15:10 UTC (permalink / raw)
  To: Alexandru Elisei; +Cc: will, julien.thierry.kdev, kvm, jean-philippe

On Mon, 13 Sep 2021 16:44:10 +0100
Alexandru Elisei <alexandru.elisei@arm.com> wrote:

> The MSI-X capability defines a PBA offset, which is the offset of the PBA
> array in the BAR that holds the array.
> 
> kvmtool uses the field "pba_offset" in struct msix_cap (which represents
> the MSIX capability) to refer to the [PBA offset:BAR] field of the
> capability; and the field "offset" in the struct vfio_pci_msix_pba to refer
> to offset of the PBA array in the device descriptor created by the VFIO
> driver.
> 
> As we're getting ready to add yet another field that represents an offset
> to struct vfio_pci_msix_pba, try to avoid ambiguities by renaming the
> struct's "offset" field to "fd_offset".
> 
> No functional change intended.

Makes sense, too many offsets.

> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>

Reviewed-by: Andre Przywara <andre.przywara@arm.com>

Cheers,
Andre

> ---
>  include/kvm/vfio.h | 2 +-
>  vfio/pci.c         | 6 +++---
>  2 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/include/kvm/vfio.h b/include/kvm/vfio.h
> index 28223cf..8cdf04f 100644
> --- a/include/kvm/vfio.h
> +++ b/include/kvm/vfio.h
> @@ -48,7 +48,7 @@ struct vfio_pci_msix_table {
>  
>  struct vfio_pci_msix_pba {
>  	size_t				size;
> -	off_t				offset; /* in VFIO device fd */
> +	off_t				fd_offset; /* in VFIO device fd */
>  	unsigned int			bar;
>  	u32				guest_phys_addr;
>  };
> diff --git a/vfio/pci.c b/vfio/pci.c
> index 10ff99e..cc18311 100644
> --- a/vfio/pci.c
> +++ b/vfio/pci.c
> @@ -256,7 +256,7 @@ static void vfio_pci_msix_pba_access(struct kvm_cpu *vcpu, u64 addr, u8 *data,
>  	 * TODO: emulate PBA. Hardware MSI-X is never masked, so reading the PBA
>  	 * is completely useless here. Note that Linux doesn't use PBA.
>  	 */
> -	if (pread(vdev->fd, data, len, pba->offset + offset) != (ssize_t)len)
> +	if (pread(vdev->fd, data, len, pba->fd_offset + offset) != (ssize_t)len)
>  		vfio_dev_err(vdev, "cannot access MSIX PBA\n");
>  }
>  
> @@ -815,8 +815,8 @@ static int vfio_pci_fixup_cfg_space(struct vfio_device *vdev)
>  	if (msix) {
>  		/* Add a shortcut to the PBA region for the MMIO handler */
>  		int pba_index = VFIO_PCI_BAR0_REGION_INDEX + pdev->msix_pba.bar;
> -		pdev->msix_pba.offset = vdev->regions[pba_index].info.offset +
> -					(msix->pba_offset & PCI_MSIX_PBA_OFFSET);
> +		pdev->msix_pba.fd_offset = vdev->regions[pba_index].info.offset +
> +					   (msix->pba_offset & PCI_MSIX_PBA_OFFSET);
>  
>  		/* Tidy up the capability */
>  		msix->table_offset &= PCI_MSIX_TABLE_BIR;


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 kvmtool 5/7] vfio/pci: Rework MSIX table and PBA physical size allocation
  2021-09-13 15:44 ` [PATCH v1 kvmtool 5/7] vfio/pci: Rework MSIX table and PBA physical size allocation Alexandru Elisei
@ 2021-10-06 15:11   ` Andre Przywara
  2021-10-11 14:39     ` Alexandru Elisei
  0 siblings, 1 reply; 20+ messages in thread
From: Andre Przywara @ 2021-10-06 15:11 UTC (permalink / raw)
  To: Alexandru Elisei; +Cc: will, julien.thierry.kdev, kvm, jean-philippe

On Mon, 13 Sep 2021 16:44:11 +0100
Alexandru Elisei <alexandru.elisei@arm.com> wrote:

Hi,

> When creating the MSIX table and PBA, kvmtool rounds up the table and
> pending bit array sizes to the host's page size.

Do you have an idea why it was doing that? I mean the comment:
"KVM needs memory regions to be multiple of and aligned on PAGE_SIZE."
is still there. I guess the actual size alignment is done later, so 
table->size and pba->size don't need to be aligned already?
(In that case you should remove the now misleading comment)

Just one nit below, but other than that it looks indeed much better.

> Unfortunately, when doing
> that, it doesn't take into account that the new size can exceed the device
> BAR size, leading to hard to diagnose errors for certain configurations.
> 
> One theoretical example: PBA and table in the same 4k BAR, host's page size
> is 4k. In this case, table->size = 4k, pba->size = 4k, map_size = 4k, which
> means that pba->guest_phys_addr = table->guest_phys_addr + 4k, which is
> outside of the 4k MMIO range allocated for both structures.
> 
> Another example, this time a real-world error that I encountered: happens
> with a 64k host booting a 4k guest, an RTL8168 PCIE NIC assigned to the
> guest. In this case, kvmtool sets table->size = 64k (because it's rounded
> to the host's page size) and pba->size = 64k.
> 
> Truncated output of lspci -vv on the host:
> 
> 01:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 06)
> 	Subsystem: TP-LINK Technologies Co., Ltd. TG-3468 Gigabit PCI Express Network Adapter
> 	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
> 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> 	Latency: 0
> 	Interrupt: pin A routed to IRQ 255
> 	Region 0: I/O ports at 1000 [size=256]
> 	Region 2: Memory at 40000000 (64-bit, non-prefetchable) [size=4K]
> 	Region 4: Memory at 100000000 (64-bit, prefetchable) [size=16K]
> 	[..]
> 	Capabilities: [b0] MSI-X: Enable- Count=4 Masked-
> 		Vector table: BAR=4 offset=00000000
> 		PBA: BAR=4 offset=00000800
> 	[..]
> 
> When booting the guest:
> 
> [..]
> [    0.207444] pci-host-generic 40000000.pci: host bridge /pci ranges:
> [    0.208564] pci-host-generic 40000000.pci:       IO 0x0000000000..0x000000ffff -> 0x0000000000
> [    0.209857] pci-host-generic 40000000.pci:      MEM 0x0050000000..0x007fffffff -> 0x0050000000
> [    0.211184] pci-host-generic 40000000.pci: ECAM at [mem 0x40000000-0x4fffffff] for [bus 00]
> [    0.212625] pci-host-generic 40000000.pci: PCI host bridge to bus 0000:00
> [    0.213647] pci_bus 0000:00: root bus resource [bus 00]
> [    0.214429] pci_bus 0000:00: root bus resource [io  0x0000-0xffff]
> [    0.215355] pci_bus 0000:00: root bus resource [mem 0x50000000-0x7fffffff]
> [    0.216676] pci 0000:00:00.0: [10ec:8168] type 00 class 0x020000
> [    0.223771] pci 0000:00:00.0: reg 0x10: [io  0x6200-0x62ff]
> [    0.239765] pci 0000:00:00.0: reg 0x18: [mem 0x50010000-0x50010fff]
> [    0.244595] pci 0000:00:00.0: reg 0x20: [mem 0x50000000-0x50003fff]
> [    0.246331] pci 0000:00:01.0: [1af4:1000] type 00 class 0x020000
> [    0.247278] pci 0000:00:01.0: reg 0x10: [io  0x6300-0x63ff]
> [    0.248212] pci 0000:00:01.0: reg 0x14: [mem 0x50020000-0x500200ff]
> [    0.249172] pci 0000:00:01.0: reg 0x18: [mem 0x50020400-0x500207ff]
> [    0.250450] pci 0000:00:02.0: [1af4:1001] type 00 class 0x018000
> [    0.251392] pci 0000:00:02.0: reg 0x10: [io  0x6400-0x64ff]
> [    0.252351] pci 0000:00:02.0: reg 0x14: [mem 0x50020800-0x500208ff]
> [    0.253312] pci 0000:00:02.0: reg 0x18: [mem 0x50020c00-0x50020fff]
> (1) [    0.254760] pci 0000:00:00.0: BAR 4: assigned [mem 0x50000000-0x50003fff]
> (2) [    0.255805] pci 0000:00:00.0: BAR 2: assigned [mem 0x50004000-0x50004fff]
>   Warning: [10ec:8168] Error activating emulation for BAR 2
>   Warning: [10ec:8168] Error activating emulation for BAR 2
> [    0.260432] pci 0000:00:01.0: BAR 2: assigned [mem 0x50005000-0x500053ff]
>   Warning: [1af4:1000] Error activating emulation for BAR 2
>   Warning: [1af4:1000] Error activating emulation for BAR 2
> [    0.261469] pci 0000:00:02.0: BAR 2: assigned [mem 0x50005400-0x500057ff]
>   Warning: [1af4:1001] Error activating emulation for BAR 2
>   Warning: [1af4:1001] Error activating emulation for BAR 2
> [    0.262499] pci 0000:00:00.0: BAR 0: assigned [io  0x1000-0x10ff]
> [    0.263415] pci 0000:00:01.0: BAR 0: assigned [io  0x1100-0x11ff]
> [    0.264462] pci 0000:00:01.0: BAR 1: assigned [mem 0x50005800-0x500058ff]
>   Warning: [1af4:1000] Error activating emulation for BAR 1
>   Warning: [1af4:1000] Error activating emulation for BAR 1
> [    0.265481] pci 0000:00:02.0: BAR 0: assigned [io  0x1200-0x12ff]
> [    0.266397] pci 0000:00:02.0: BAR 1: assigned [mem 0x50005900-0x500059ff]
>   Warning: [1af4:1001] Error activating emulation for BAR 1
>   Warning: [1af4:1001] Error activating emulation for BAR 1
> [    0.267892] EINJ: ACPI disabled.
> [    0.269922] virtio-pci 0000:00:01.0: virtio_pci: leaving for legacy driver
> [    0.271118] virtio-pci 0000:00:02.0: virtio_pci: leaving for legacy driver
> [    0.274122] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
> [    0.275930] printk: console [ttyS0] disabled
> [    0.276669] 1000000.U6_16550A: ttyS0 at MMIO 0x1000000 (irq = 13, base_baud = 115200) is a 16550A
> [    0.278058] printk: console [ttyS0] enabled
> [    0.278058] printk: console [ttyS0] enabled
> [    0.279304] printk: bootconsole [ns16550a0] disabled
> [    0.279304] printk: bootconsole [ns16550a0] disabled
> [    0.281252] 1001000.U6_16550A: ttyS1 at MMIO 0x1001000 (irq = 14, base_baud = 115200) is a 16550A
> [    0.282842] 1002000.U6_16550A: ttyS2 at MMIO 0x1002000 (irq = 15, base_baud = 115200) is a 16550A
> [    0.284611] 1003000.U6_16550A: ttyS3 at MMIO 0x1003000 (irq = 16, base_baud = 115200) is a 16550A
> [    0.286094] SuperH (H)SCI(F) driver initialized
> [    0.286868] msm_serial: driver initialized
> [    0.287890] [drm] radeon kernel modesetting enabled.
> [    0.288826] cacheinfo: Unable to detect cache hierarchy for CPU 0
> [    0.293321] loop: module loaded
> KVM_SET_GSI_ROUTING: Invalid argument
> 
> At (1), the guest writes 0x50000000 into BAR 4 of the NIC (which holds
> the MSIX table and PBA), expecting that will cover only 16k of address
> space (the BAR size), up to 0x50003fff, inclusive. On the host side, in
> vfio_pci_bar_activate(), kvmtool will actually register for MMIO
> emulation the region 0x50000000-0x5000ffff (64k in total) for the MSIX
> table and 0x50010000-0x5001ffff (another 64k) for the PBA (kvmtool set
> table->size and pba->size to 64k when it aligned them to the host's page
> size).
> 
> Then at step (2), the guest writes the next available address (from its
> point of view) into BAR 2 of the NIC, which is 0x50004000. On the host
> side, the PCI emulation layer will search all the regions that overlap with
> the BAR address range (0x50004000-0x50004fff) and will find none because,
> just like the guest, it uses the BAR size to check for overlaps. When
> vfio_pci_bar_activate() is reached, kvmtool will try to register memory for
> this region, but it is already registered for the MSIX table emulation and
> fails.
> 
> The same scenario repeats for every following memory BAR, because the MSIX
> table and PBA use memory from 0x50000000 to 0x5001ffff.
> 
> The error at the end, which finally terminates the VM, is caused by the
> guest trying to write to a totally different BAR, which vfio-pci
> interpretes as a write to MSI-X table because it falls in the 64k region
> that was registered for emulation. The IRQ ID is not a valid SPI number and
> gicv2m_update_routing() returns an error (and sets errno to EINVAL).
> 
> Fix this by aligning the table and PBA size to 8 bytes to allow for
> qword accesses, like PCI 3.0 mandates.
> 
> For the sake of simplicity, the PBA offset in a BAR, in case of a shared
> BAR, is kept the same as the offset of the physical device. One hopes that
> the device respects the recommendations set forth in PCI LOCAL BUS
> SPECIFICATION, REV. 3.0, section "MSI-X Capability and Table Structures"
> 
> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> ---
>  include/kvm/vfio.h |  1 +
>  vfio/pci.c         | 65 ++++++++++++++++++++++++++++------------------
>  2 files changed, 41 insertions(+), 25 deletions(-)
> 
> diff --git a/include/kvm/vfio.h b/include/kvm/vfio.h
> index 8cdf04f..764ab9b 100644
> --- a/include/kvm/vfio.h
> +++ b/include/kvm/vfio.h
> @@ -50,6 +50,7 @@ struct vfio_pci_msix_pba {
>  	size_t				size;
>  	off_t				fd_offset; /* in VFIO device fd */
>  	unsigned int			bar;
> +	u32				bar_offset; /* in the shared BAR */
>  	u32				guest_phys_addr;
>  };
>  
> diff --git a/vfio/pci.c b/vfio/pci.c
> index cc18311..7781868 100644
> --- a/vfio/pci.c
> +++ b/vfio/pci.c
> @@ -502,7 +502,7 @@ static int vfio_pci_bar_activate(struct kvm *kvm,
>  
>  	if (has_msix && (u32)bar_num == pba->bar) {
>  		if (pba->bar == table->bar)
> -			pba->guest_phys_addr = table->guest_phys_addr + table->size;
> +			pba->guest_phys_addr = table->guest_phys_addr + pba->bar_offset;
>  		else
>  			pba->guest_phys_addr = region->guest_phys_addr;
>  		ret = kvm__register_mmio(kvm, pba->guest_phys_addr,
> @@ -815,15 +815,21 @@ static int vfio_pci_fixup_cfg_space(struct vfio_device *vdev)
>  	if (msix) {
>  		/* Add a shortcut to the PBA region for the MMIO handler */
>  		int pba_index = VFIO_PCI_BAR0_REGION_INDEX + pdev->msix_pba.bar;
> +		u32 pba_bar_offset = msix->pba_offset & PCI_MSIX_PBA_OFFSET;
> +
>  		pdev->msix_pba.fd_offset = vdev->regions[pba_index].info.offset +
> -					   (msix->pba_offset & PCI_MSIX_PBA_OFFSET);
> +					   pba_bar_offset;
>  
>  		/* Tidy up the capability */
>  		msix->table_offset &= PCI_MSIX_TABLE_BIR;
> -		msix->pba_offset &= PCI_MSIX_PBA_BIR;
> -		if (pdev->msix_table.bar == pdev->msix_pba.bar)
> -			msix->pba_offset |= pdev->msix_table.size &
> -					    PCI_MSIX_PBA_OFFSET;
> +		if (pdev->msix_table.bar == pdev->msix_pba.bar) {
> +			/* Keep the same offset as the MSIX cap. */
> +			pdev->msix_pba.bar_offset = pba_bar_offset;
> +		} else {
> +			/* PBA is at the start of the BAR. */
> +			msix->pba_offset &= PCI_MSIX_PBA_BIR;
> +			pdev->msix_pba.bar_offset = 0;
> +		}
>  	}
>  
>  	/* Install our fake Configuration Space */
> @@ -896,8 +902,10 @@ static int vfio_pci_create_msix_table(struct kvm *kvm, struct vfio_device *vdev)
>  	 * KVM needs memory regions to be multiple of and aligned on PAGE_SIZE.
>  	 */
>  	nr_entries = (msix->ctrl & PCI_MSIX_FLAGS_QSIZE) + 1;
> -	table->size = ALIGN(nr_entries * PCI_MSIX_ENTRY_SIZE, PAGE_SIZE);
> -	pba->size = ALIGN(DIV_ROUND_UP(nr_entries, 64), PAGE_SIZE);
> +
> +	/* MSIX table and PBA must support QWORD accesses. */
> +	table->size = ALIGN(nr_entries * PCI_MSIX_ENTRY_SIZE, 8);
> +	pba->size = ALIGN(DIV_ROUND_UP(nr_entries, 64), 8);
>  
>  	entries = calloc(nr_entries, sizeof(struct vfio_pci_msi_entry));
>  	if (!entries)
> @@ -911,23 +919,8 @@ static int vfio_pci_create_msix_table(struct kvm *kvm, struct vfio_device *vdev)
>  		return ret;
>  	if (!info.size)
>  		return -EINVAL;
> -	map_size = info.size;
> -
> -	if (table->bar != pba->bar) {
> -		ret = vfio_pci_get_region_info(vdev, pba->bar, &info);
> -		if (ret)
> -			return ret;
> -		if (!info.size)
> -			return -EINVAL;
> -		map_size += info.size;
> -	}
>  
> -	/*
> -	 * To ease MSI-X cap configuration in case they share the same BAR,
> -	 * collapse table and pending array. The size of the BAR regions must be
> -	 * powers of two.
> -	 */
> -	map_size = ALIGN(map_size, PAGE_SIZE);
> +	map_size = ALIGN(info.size, PAGE_SIZE);
>  	table->guest_phys_addr = pci_get_mmio_block(map_size);
>  	if (!table->guest_phys_addr) {
>  		pr_err("cannot allocate MMIO space");
> @@ -943,7 +936,29 @@ static int vfio_pci_create_msix_table(struct kvm *kvm, struct vfio_device *vdev)
>  	 * between MSI-X table and PBA. For the sake of isolation, create a
>  	 * virtual PBA.
>  	 */
> -	pba->guest_phys_addr = table->guest_phys_addr + table->size;
> +	if (table->bar == pba->bar) {
> +		u32 pba_bar_offset = msix->pba_offset & PCI_MSIX_PBA_OFFSET;

I think it's customary to put an empty line after variable declarations.

Cheers,
Andre

> +		/* Sanity checks. */
> +		if (table->size > pba_bar_offset)
> +			die("MSIX table overlaps with PBA");
> +		if (pba_bar_offset + pba->size > info.size)
> +			die("PBA exceeds the size of the region");
> +		pba->guest_phys_addr = table->guest_phys_addr + pba_bar_offset;
> +	} else {
> +		ret = vfio_pci_get_region_info(vdev, pba->bar, &info);
> +		if (ret)
> +			return ret;
> +		if (!info.size)
> +			return -EINVAL;
> +
> +		map_size = ALIGN(info.size, PAGE_SIZE);
> +		pba->guest_phys_addr = pci_get_mmio_block(map_size);
> +		if (!pba->guest_phys_addr) {
> +			pr_err("cannot allocate MMIO space");
> +			ret = -ENOMEM;
> +			goto out_free;
> +		}
> +	}
>  
>  	pdev->msix.entries = entries;
>  	pdev->msix.nr_entries = nr_entries;


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 kvmtool 6/7] vfio/pci: Print an error when offset is outside of the MSIX table or PBA
  2021-09-13 15:44 ` [PATCH v1 kvmtool 6/7] vfio/pci: Print an error when offset is outside of the MSIX table or PBA Alexandru Elisei
@ 2021-10-06 15:11   ` Andre Przywara
  2021-10-11 14:46     ` Alexandru Elisei
  0 siblings, 1 reply; 20+ messages in thread
From: Andre Przywara @ 2021-10-06 15:11 UTC (permalink / raw)
  To: Alexandru Elisei; +Cc: will, julien.thierry.kdev, kvm, jean-philippe

On Mon, 13 Sep 2021 16:44:12 +0100
Alexandru Elisei <alexandru.elisei@arm.com> wrote:

> Now that we keep track of the real size of MSIX table and PBA, print an
> error when the guest tries to write to an offset which is not inside the
> correct regions.

What is the actual purpose of this message? Is there anything the user can
do about it? Shouldn't we either abort the guest or inject an error
condition back into KVM or the guest instead?

Cheers,
Andre

> 
> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> ---
>  vfio/pci.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/vfio/pci.c b/vfio/pci.c
> index 7781868..a6d0408 100644
> --- a/vfio/pci.c
> +++ b/vfio/pci.c
> @@ -249,6 +249,11 @@ static void vfio_pci_msix_pba_access(struct kvm_cpu
> *vcpu, u64 addr, u8 *data, u64 offset = addr - pba->guest_phys_addr;
>  	struct vfio_device *vdev = container_of(pdev, struct
> vfio_device, pci); 
> +	if (offset >= pba->size) {
> +		vfio_dev_err(vdev, "access outside of the MSIX PBA");
> +		return;
> +	}
> +
>  	if (is_write)
>  		return;
>  
> @@ -269,6 +274,10 @@ static void vfio_pci_msix_table_access(struct
> kvm_cpu *vcpu, u64 addr, u8 *data, struct vfio_device *vdev =
> container_of(pdev, struct vfio_device, pci); 
>  	u64 offset = addr - pdev->msix_table.guest_phys_addr;
> +	if (offset >= pdev->msix_table.size) {
> +		vfio_dev_err(vdev, "access outside of the MSI-X table");
> +		return;
> +	}
>  
>  	size_t vector = offset / PCI_MSIX_ENTRY_SIZE;
>  	off_t field = offset % PCI_MSIX_ENTRY_SIZE;


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 kvmtool 7/7] vfio/pci: Align MSIX Table and PBA size allocation to 64k
  2021-09-13 15:44 ` [PATCH v1 kvmtool 7/7] vfio/pci: Align MSIX Table and PBA size allocation to 64k Alexandru Elisei
@ 2021-10-06 15:11   ` Andre Przywara
  2021-10-11 14:57     ` Alexandru Elisei
  0 siblings, 1 reply; 20+ messages in thread
From: Andre Przywara @ 2021-10-06 15:11 UTC (permalink / raw)
  To: Alexandru Elisei; +Cc: will, julien.thierry.kdev, kvm, jean-philippe

On Mon, 13 Sep 2021 16:44:13 +0100
Alexandru Elisei <alexandru.elisei@arm.com> wrote:

Hi,

> When allocating MMIO space for the MSI-X table, kvmtool rounds the
> allocation to the host's page size to make it as easy as possible for the
> guest to map the table to a page, if it wants to (and doesn't do BAR
> reassignment, like the x86 architecture for example). However, the host's
> page size can differ from the guest's, for example, if the host is compiled
> with 4k pages and the guest is using 64k pages.
> 
> To make sure the allocation is always aligned to a guest's page size, round
> it up to the maximum page size, which is 64k. Do the same for the pending
> bit array if it lives in its own BAR.

The idea of that looks alright on the first glance, but isn't needed for
x86, right? So should this be using an arch-specific MAX_PAGE_SIZE instead?

Cheers,
Andre

> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> ---
>  vfio/pci.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/vfio/pci.c b/vfio/pci.c
> index a6d0408..7e258a4 100644
> --- a/vfio/pci.c
> +++ b/vfio/pci.c
> @@ -1,3 +1,5 @@
> +#include "linux/sizes.h"
> +
>  #include "kvm/irq.h"
>  #include "kvm/kvm.h"
>  #include "kvm/kvm-cpu.h"
> @@ -929,7 +931,7 @@ static int vfio_pci_create_msix_table(struct kvm
> *kvm, struct vfio_device *vdev) if (!info.size)
>  		return -EINVAL;
>  
> -	map_size = ALIGN(info.size, PAGE_SIZE);
> +	map_size = ALIGN(info.size, SZ_64K);
>  	table->guest_phys_addr = pci_get_mmio_block(map_size);
>  	if (!table->guest_phys_addr) {
>  		pr_err("cannot allocate MMIO space");
> @@ -960,7 +962,7 @@ static int vfio_pci_create_msix_table(struct kvm
> *kvm, struct vfio_device *vdev) if (!info.size)
>  			return -EINVAL;
>  
> -		map_size = ALIGN(info.size, PAGE_SIZE);
> +		map_size = ALIGN(info.size, SZ_64K);
>  		pba->guest_phys_addr = pci_get_mmio_block(map_size);
>  		if (!pba->guest_phys_addr) {
>  			pr_err("cannot allocate MMIO space");


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 kvmtool 5/7] vfio/pci: Rework MSIX table and PBA physical size allocation
  2021-10-06 15:11   ` Andre Przywara
@ 2021-10-11 14:39     ` Alexandru Elisei
  0 siblings, 0 replies; 20+ messages in thread
From: Alexandru Elisei @ 2021-10-11 14:39 UTC (permalink / raw)
  To: Andre Przywara; +Cc: will, julien.thierry.kdev, kvm, jean-philippe

Hi Andre,

Many thanks for the review.

On Wed, Oct 06, 2021 at 04:11:13PM +0100, Andre Przywara wrote:
> On Mon, 13 Sep 2021 16:44:11 +0100
> Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> 
> Hi,
> 
> > When creating the MSIX table and PBA, kvmtool rounds up the table and
> > pending bit array sizes to the host's page size.
> 
> Do you have an idea why it was doing that? I mean the comment:
> "KVM needs memory regions to be multiple of and aligned on PAGE_SIZE."
> is still there. I guess the actual size alignment is done later, so 
> table->size and pba->size don't need to be aligned already?
> (In that case you should remove the now misleading comment)

I don't know why the comment is there either. The only thing that I can think of
that requires memory regions to be aligned to PAGE_SIZE is
KVM_SET_USER_MEMORY_REGION, but kvmtool doesn't create a memslot for the MSIX
table and PBA, and there's nothing stopping userspace from doing trap and
emulate with a granularity as small as 1 byte.

I'll remove the comment, I don't think it serves any purpose after this patch.

> 
> Just one nit below, but other than that it looks indeed much better.
> 
> > Unfortunately, when doing
> > that, it doesn't take into account that the new size can exceed the device
> > BAR size, leading to hard to diagnose errors for certain configurations.
> > 
> > One theoretical example: PBA and table in the same 4k BAR, host's page size
> > is 4k. In this case, table->size = 4k, pba->size = 4k, map_size = 4k, which
> > means that pba->guest_phys_addr = table->guest_phys_addr + 4k, which is
> > outside of the 4k MMIO range allocated for both structures.
> > 
> > Another example, this time a real-world error that I encountered: happens
> > with a 64k host booting a 4k guest, an RTL8168 PCIE NIC assigned to the
> > guest. In this case, kvmtool sets table->size = 64k (because it's rounded
> > to the host's page size) and pba->size = 64k.
> > 
> > Truncated output of lspci -vv on the host:
> > 
> > 01:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 06)
> > 	Subsystem: TP-LINK Technologies Co., Ltd. TG-3468 Gigabit PCI Express Network Adapter
> > 	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
> > 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> > 	Latency: 0
> > 	Interrupt: pin A routed to IRQ 255
> > 	Region 0: I/O ports at 1000 [size=256]
> > 	Region 2: Memory at 40000000 (64-bit, non-prefetchable) [size=4K]
> > 	Region 4: Memory at 100000000 (64-bit, prefetchable) [size=16K]
> > 	[..]
> > 	Capabilities: [b0] MSI-X: Enable- Count=4 Masked-
> > 		Vector table: BAR=4 offset=00000000
> > 		PBA: BAR=4 offset=00000800
> > 	[..]
> > 
> > When booting the guest:
> > 
> > [..]
> > [    0.207444] pci-host-generic 40000000.pci: host bridge /pci ranges:
> > [    0.208564] pci-host-generic 40000000.pci:       IO 0x0000000000..0x000000ffff -> 0x0000000000
> > [    0.209857] pci-host-generic 40000000.pci:      MEM 0x0050000000..0x007fffffff -> 0x0050000000
> > [    0.211184] pci-host-generic 40000000.pci: ECAM at [mem 0x40000000-0x4fffffff] for [bus 00]
> > [    0.212625] pci-host-generic 40000000.pci: PCI host bridge to bus 0000:00
> > [    0.213647] pci_bus 0000:00: root bus resource [bus 00]
> > [    0.214429] pci_bus 0000:00: root bus resource [io  0x0000-0xffff]
> > [    0.215355] pci_bus 0000:00: root bus resource [mem 0x50000000-0x7fffffff]
> > [    0.216676] pci 0000:00:00.0: [10ec:8168] type 00 class 0x020000
> > [    0.223771] pci 0000:00:00.0: reg 0x10: [io  0x6200-0x62ff]
> > [    0.239765] pci 0000:00:00.0: reg 0x18: [mem 0x50010000-0x50010fff]
> > [    0.244595] pci 0000:00:00.0: reg 0x20: [mem 0x50000000-0x50003fff]
> > [    0.246331] pci 0000:00:01.0: [1af4:1000] type 00 class 0x020000
> > [    0.247278] pci 0000:00:01.0: reg 0x10: [io  0x6300-0x63ff]
> > [    0.248212] pci 0000:00:01.0: reg 0x14: [mem 0x50020000-0x500200ff]
> > [    0.249172] pci 0000:00:01.0: reg 0x18: [mem 0x50020400-0x500207ff]
> > [    0.250450] pci 0000:00:02.0: [1af4:1001] type 00 class 0x018000
> > [    0.251392] pci 0000:00:02.0: reg 0x10: [io  0x6400-0x64ff]
> > [    0.252351] pci 0000:00:02.0: reg 0x14: [mem 0x50020800-0x500208ff]
> > [    0.253312] pci 0000:00:02.0: reg 0x18: [mem 0x50020c00-0x50020fff]
> > (1) [    0.254760] pci 0000:00:00.0: BAR 4: assigned [mem 0x50000000-0x50003fff]
> > (2) [    0.255805] pci 0000:00:00.0: BAR 2: assigned [mem 0x50004000-0x50004fff]
> >   Warning: [10ec:8168] Error activating emulation for BAR 2
> >   Warning: [10ec:8168] Error activating emulation for BAR 2
> > [    0.260432] pci 0000:00:01.0: BAR 2: assigned [mem 0x50005000-0x500053ff]
> >   Warning: [1af4:1000] Error activating emulation for BAR 2
> >   Warning: [1af4:1000] Error activating emulation for BAR 2
> > [    0.261469] pci 0000:00:02.0: BAR 2: assigned [mem 0x50005400-0x500057ff]
> >   Warning: [1af4:1001] Error activating emulation for BAR 2
> >   Warning: [1af4:1001] Error activating emulation for BAR 2
> > [    0.262499] pci 0000:00:00.0: BAR 0: assigned [io  0x1000-0x10ff]
> > [    0.263415] pci 0000:00:01.0: BAR 0: assigned [io  0x1100-0x11ff]
> > [    0.264462] pci 0000:00:01.0: BAR 1: assigned [mem 0x50005800-0x500058ff]
> >   Warning: [1af4:1000] Error activating emulation for BAR 1
> >   Warning: [1af4:1000] Error activating emulation for BAR 1
> > [    0.265481] pci 0000:00:02.0: BAR 0: assigned [io  0x1200-0x12ff]
> > [    0.266397] pci 0000:00:02.0: BAR 1: assigned [mem 0x50005900-0x500059ff]
> >   Warning: [1af4:1001] Error activating emulation for BAR 1
> >   Warning: [1af4:1001] Error activating emulation for BAR 1
> > [    0.267892] EINJ: ACPI disabled.
> > [    0.269922] virtio-pci 0000:00:01.0: virtio_pci: leaving for legacy driver
> > [    0.271118] virtio-pci 0000:00:02.0: virtio_pci: leaving for legacy driver
> > [    0.274122] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
> > [    0.275930] printk: console [ttyS0] disabled
> > [    0.276669] 1000000.U6_16550A: ttyS0 at MMIO 0x1000000 (irq = 13, base_baud = 115200) is a 16550A
> > [    0.278058] printk: console [ttyS0] enabled
> > [    0.278058] printk: console [ttyS0] enabled
> > [    0.279304] printk: bootconsole [ns16550a0] disabled
> > [    0.279304] printk: bootconsole [ns16550a0] disabled
> > [    0.281252] 1001000.U6_16550A: ttyS1 at MMIO 0x1001000 (irq = 14, base_baud = 115200) is a 16550A
> > [    0.282842] 1002000.U6_16550A: ttyS2 at MMIO 0x1002000 (irq = 15, base_baud = 115200) is a 16550A
> > [    0.284611] 1003000.U6_16550A: ttyS3 at MMIO 0x1003000 (irq = 16, base_baud = 115200) is a 16550A
> > [    0.286094] SuperH (H)SCI(F) driver initialized
> > [    0.286868] msm_serial: driver initialized
> > [    0.287890] [drm] radeon kernel modesetting enabled.
> > [    0.288826] cacheinfo: Unable to detect cache hierarchy for CPU 0
> > [    0.293321] loop: module loaded
> > KVM_SET_GSI_ROUTING: Invalid argument
> > 
> > At (1), the guest writes 0x50000000 into BAR 4 of the NIC (which holds
> > the MSIX table and PBA), expecting that will cover only 16k of address
> > space (the BAR size), up to 0x50003fff, inclusive. On the host side, in
> > vfio_pci_bar_activate(), kvmtool will actually register for MMIO
> > emulation the region 0x50000000-0x5000ffff (64k in total) for the MSIX
> > table and 0x50010000-0x5001ffff (another 64k) for the PBA (kvmtool set
> > table->size and pba->size to 64k when it aligned them to the host's page
> > size).
> > 
> > Then at step (2), the guest writes the next available address (from its
> > point of view) into BAR 2 of the NIC, which is 0x50004000. On the host
> > side, the PCI emulation layer will search all the regions that overlap with
> > the BAR address range (0x50004000-0x50004fff) and will find none because,
> > just like the guest, it uses the BAR size to check for overlaps. When
> > vfio_pci_bar_activate() is reached, kvmtool will try to register memory for
> > this region, but it is already registered for the MSIX table emulation and
> > fails.
> > 
> > The same scenario repeats for every following memory BAR, because the MSIX
> > table and PBA use memory from 0x50000000 to 0x5001ffff.
> > 
> > The error at the end, which finally terminates the VM, is caused by the
> > guest trying to write to a totally different BAR, which vfio-pci
> > interpretes as a write to MSI-X table because it falls in the 64k region
> > that was registered for emulation. The IRQ ID is not a valid SPI number and
> > gicv2m_update_routing() returns an error (and sets errno to EINVAL).
> > 
> > Fix this by aligning the table and PBA size to 8 bytes to allow for
> > qword accesses, like PCI 3.0 mandates.
> > 
> > For the sake of simplicity, the PBA offset in a BAR, in case of a shared
> > BAR, is kept the same as the offset of the physical device. One hopes that
> > the device respects the recommendations set forth in PCI LOCAL BUS
> > SPECIFICATION, REV. 3.0, section "MSI-X Capability and Table Structures"
> > 
> > Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> > ---
> >  include/kvm/vfio.h |  1 +
> >  vfio/pci.c         | 65 ++++++++++++++++++++++++++++------------------
> >  2 files changed, 41 insertions(+), 25 deletions(-)
> > 
> > diff --git a/include/kvm/vfio.h b/include/kvm/vfio.h
> > index 8cdf04f..764ab9b 100644
> > --- a/include/kvm/vfio.h
> > +++ b/include/kvm/vfio.h
> > @@ -50,6 +50,7 @@ struct vfio_pci_msix_pba {
> >  	size_t				size;
> >  	off_t				fd_offset; /* in VFIO device fd */
> >  	unsigned int			bar;
> > +	u32				bar_offset; /* in the shared BAR */
> >  	u32				guest_phys_addr;
> >  };
> >  
> > diff --git a/vfio/pci.c b/vfio/pci.c
> > index cc18311..7781868 100644
> > --- a/vfio/pci.c
> > +++ b/vfio/pci.c
> > @@ -502,7 +502,7 @@ static int vfio_pci_bar_activate(struct kvm *kvm,
> >  
> >  	if (has_msix && (u32)bar_num == pba->bar) {
> >  		if (pba->bar == table->bar)
> > -			pba->guest_phys_addr = table->guest_phys_addr + table->size;
> > +			pba->guest_phys_addr = table->guest_phys_addr + pba->bar_offset;
> >  		else
> >  			pba->guest_phys_addr = region->guest_phys_addr;
> >  		ret = kvm__register_mmio(kvm, pba->guest_phys_addr,
> > @@ -815,15 +815,21 @@ static int vfio_pci_fixup_cfg_space(struct vfio_device *vdev)
> >  	if (msix) {
> >  		/* Add a shortcut to the PBA region for the MMIO handler */
> >  		int pba_index = VFIO_PCI_BAR0_REGION_INDEX + pdev->msix_pba.bar;
> > +		u32 pba_bar_offset = msix->pba_offset & PCI_MSIX_PBA_OFFSET;
> > +
> >  		pdev->msix_pba.fd_offset = vdev->regions[pba_index].info.offset +
> > -					   (msix->pba_offset & PCI_MSIX_PBA_OFFSET);
> > +					   pba_bar_offset;
> >  
> >  		/* Tidy up the capability */
> >  		msix->table_offset &= PCI_MSIX_TABLE_BIR;
> > -		msix->pba_offset &= PCI_MSIX_PBA_BIR;
> > -		if (pdev->msix_table.bar == pdev->msix_pba.bar)
> > -			msix->pba_offset |= pdev->msix_table.size &
> > -					    PCI_MSIX_PBA_OFFSET;
> > +		if (pdev->msix_table.bar == pdev->msix_pba.bar) {
> > +			/* Keep the same offset as the MSIX cap. */
> > +			pdev->msix_pba.bar_offset = pba_bar_offset;
> > +		} else {
> > +			/* PBA is at the start of the BAR. */
> > +			msix->pba_offset &= PCI_MSIX_PBA_BIR;
> > +			pdev->msix_pba.bar_offset = 0;
> > +		}
> >  	}
> >  
> >  	/* Install our fake Configuration Space */
> > @@ -896,8 +902,10 @@ static int vfio_pci_create_msix_table(struct kvm *kvm, struct vfio_device *vdev)
> >  	 * KVM needs memory regions to be multiple of and aligned on PAGE_SIZE.
> >  	 */
> >  	nr_entries = (msix->ctrl & PCI_MSIX_FLAGS_QSIZE) + 1;
> > -	table->size = ALIGN(nr_entries * PCI_MSIX_ENTRY_SIZE, PAGE_SIZE);
> > -	pba->size = ALIGN(DIV_ROUND_UP(nr_entries, 64), PAGE_SIZE);
> > +
> > +	/* MSIX table and PBA must support QWORD accesses. */
> > +	table->size = ALIGN(nr_entries * PCI_MSIX_ENTRY_SIZE, 8);
> > +	pba->size = ALIGN(DIV_ROUND_UP(nr_entries, 64), 8);
> >  
> >  	entries = calloc(nr_entries, sizeof(struct vfio_pci_msi_entry));
> >  	if (!entries)
> > @@ -911,23 +919,8 @@ static int vfio_pci_create_msix_table(struct kvm *kvm, struct vfio_device *vdev)
> >  		return ret;
> >  	if (!info.size)
> >  		return -EINVAL;
> > -	map_size = info.size;
> > -
> > -	if (table->bar != pba->bar) {
> > -		ret = vfio_pci_get_region_info(vdev, pba->bar, &info);
> > -		if (ret)
> > -			return ret;
> > -		if (!info.size)
> > -			return -EINVAL;
> > -		map_size += info.size;
> > -	}
> >  
> > -	/*
> > -	 * To ease MSI-X cap configuration in case they share the same BAR,
> > -	 * collapse table and pending array. The size of the BAR regions must be
> > -	 * powers of two.
> > -	 */
> > -	map_size = ALIGN(map_size, PAGE_SIZE);
> > +	map_size = ALIGN(info.size, PAGE_SIZE);
> >  	table->guest_phys_addr = pci_get_mmio_block(map_size);
> >  	if (!table->guest_phys_addr) {
> >  		pr_err("cannot allocate MMIO space");
> > @@ -943,7 +936,29 @@ static int vfio_pci_create_msix_table(struct kvm *kvm, struct vfio_device *vdev)
> >  	 * between MSI-X table and PBA. For the sake of isolation, create a
> >  	 * virtual PBA.
> >  	 */
> > -	pba->guest_phys_addr = table->guest_phys_addr + table->size;
> > +	if (table->bar == pba->bar) {
> > +		u32 pba_bar_offset = msix->pba_offset & PCI_MSIX_PBA_OFFSET;
> 
> I think it's customary to put an empty line after variable declarations.

Will do.

Thanks,
Alex

> 
> Cheers,
> Andre
> 
> > +		/* Sanity checks. */
> > +		if (table->size > pba_bar_offset)
> > +			die("MSIX table overlaps with PBA");
> > +		if (pba_bar_offset + pba->size > info.size)
> > +			die("PBA exceeds the size of the region");
> > +		pba->guest_phys_addr = table->guest_phys_addr + pba_bar_offset;
> > +	} else {
> > +		ret = vfio_pci_get_region_info(vdev, pba->bar, &info);
> > +		if (ret)
> > +			return ret;
> > +		if (!info.size)
> > +			return -EINVAL;
> > +
> > +		map_size = ALIGN(info.size, PAGE_SIZE);
> > +		pba->guest_phys_addr = pci_get_mmio_block(map_size);
> > +		if (!pba->guest_phys_addr) {
> > +			pr_err("cannot allocate MMIO space");
> > +			ret = -ENOMEM;
> > +			goto out_free;
> > +		}
> > +	}
> >  
> >  	pdev->msix.entries = entries;
> >  	pdev->msix.nr_entries = nr_entries;
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 kvmtool 6/7] vfio/pci: Print an error when offset is outside of the MSIX table or PBA
  2021-10-06 15:11   ` Andre Przywara
@ 2021-10-11 14:46     ` Alexandru Elisei
  0 siblings, 0 replies; 20+ messages in thread
From: Alexandru Elisei @ 2021-10-11 14:46 UTC (permalink / raw)
  To: Andre Przywara; +Cc: will, julien.thierry.kdev, kvm, jean-philippe

Hi Andre,

On Wed, Oct 06, 2021 at 04:11:28PM +0100, Andre Przywara wrote:
> On Mon, 13 Sep 2021 16:44:12 +0100
> Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> 
> > Now that we keep track of the real size of MSIX table and PBA, print an
> > error when the guest tries to write to an offset which is not inside the
> > correct regions.
> 
> What is the actual purpose of this message? Is there anything the user can
> do about it? Shouldn't we either abort the guest or inject an error
> condition back into KVM or the guest instead?

The only way for the error message to trigger is in case of a programming error
in kvmtool - when access to the PBA BAR is enabled by the guest, kvmtool will
register the vfio_pci_msix_pba_access callback for the memory region
[pba->guest_phys_addr, pba->guest_phys_addr + pba->size) in
vfio_pci_bar_activate().

I believe kvmtool should definitely print something when it detects a
programming error. The question now becomes if kvmtool should die or continue
running the VM, as the VM can run just fine even if emulation for assigned
devices is malfunctioning. I would rather give the user the chance to safely
shutdown the VM instead of killing it outright.

Thanks,
Alex

> 
> Cheers,
> Andre
> 
> > 
> > Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> > ---
> >  vfio/pci.c | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> > 
> > diff --git a/vfio/pci.c b/vfio/pci.c
> > index 7781868..a6d0408 100644
> > --- a/vfio/pci.c
> > +++ b/vfio/pci.c
> > @@ -249,6 +249,11 @@ static void vfio_pci_msix_pba_access(struct kvm_cpu
> > *vcpu, u64 addr, u8 *data, u64 offset = addr - pba->guest_phys_addr;
> >  	struct vfio_device *vdev = container_of(pdev, struct
> > vfio_device, pci); 
> > +	if (offset >= pba->size) {
> > +		vfio_dev_err(vdev, "access outside of the MSIX PBA");
> > +		return;
> > +	}
> > +
> >  	if (is_write)
> >  		return;
> >  
> > @@ -269,6 +274,10 @@ static void vfio_pci_msix_table_access(struct
> > kvm_cpu *vcpu, u64 addr, u8 *data, struct vfio_device *vdev =
> > container_of(pdev, struct vfio_device, pci); 
> >  	u64 offset = addr - pdev->msix_table.guest_phys_addr;
> > +	if (offset >= pdev->msix_table.size) {
> > +		vfio_dev_err(vdev, "access outside of the MSI-X table");
> > +		return;
> > +	}
> >  
> >  	size_t vector = offset / PCI_MSIX_ENTRY_SIZE;
> >  	off_t field = offset % PCI_MSIX_ENTRY_SIZE;
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 kvmtool 7/7] vfio/pci: Align MSIX Table and PBA size allocation to 64k
  2021-10-06 15:11   ` Andre Przywara
@ 2021-10-11 14:57     ` Alexandru Elisei
  0 siblings, 0 replies; 20+ messages in thread
From: Alexandru Elisei @ 2021-10-11 14:57 UTC (permalink / raw)
  To: Andre Przywara; +Cc: will, julien.thierry.kdev, kvm, jean-philippe

Hi Andre,

On Wed, Oct 06, 2021 at 04:11:42PM +0100, Andre Przywara wrote:
> On Mon, 13 Sep 2021 16:44:13 +0100
> Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> 
> Hi,
> 
> > When allocating MMIO space for the MSI-X table, kvmtool rounds the
> > allocation to the host's page size to make it as easy as possible for the
> > guest to map the table to a page, if it wants to (and doesn't do BAR
> > reassignment, like the x86 architecture for example). However, the host's
> > page size can differ from the guest's, for example, if the host is compiled
> > with 4k pages and the guest is using 64k pages.
> > 
> > To make sure the allocation is always aligned to a guest's page size, round
> > it up to the maximum page size, which is 64k. Do the same for the pending
> > bit array if it lives in its own BAR.
> 
> The idea of that looks alright on the first glance, but isn't needed for
> x86, right? So should this be using an arch-specific MAX_PAGE_SIZE instead?

By "not needed for x86", do you mean that 4k is enough and 64k is not needed? Or
that doing any kind of alignment is unnecessary? If it's the latter, Linux on
x86 relies on the BARs being programmed by the BIOS with valid addresses, so I
would rather err on the side of caution and align the BARs to at least 4k.

If it's the former, doing the alignment is not costing kvmtool anything, as the
addreses are just numbers in the PCI memory region, which is not allocated as
userspace memory by kvmtool, and it's about 1GiB, so I don't think 64 - 4 = 60k
will make much of a difference. With 100 devices, each with the MSIX table and
PBA in different BARs, that amounts to 60k * 2 * 100 ~= 12MiB. Out of 1GiB.

But I do agree that using something like MAX_PAGE_SIZE would be the correct way
to do it. I'll have a look at the page sizes for mips and powerpc and create the
define.

Thanks,
Alex

> 
> Cheers,
> Andre
> 
> > Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> > ---
> >  vfio/pci.c | 6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> > 
> > diff --git a/vfio/pci.c b/vfio/pci.c
> > index a6d0408..7e258a4 100644
> > --- a/vfio/pci.c
> > +++ b/vfio/pci.c
> > @@ -1,3 +1,5 @@
> > +#include "linux/sizes.h"
> > +
> >  #include "kvm/irq.h"
> >  #include "kvm/kvm.h"
> >  #include "kvm/kvm-cpu.h"
> > @@ -929,7 +931,7 @@ static int vfio_pci_create_msix_table(struct kvm
> > *kvm, struct vfio_device *vdev) if (!info.size)
> >  		return -EINVAL;
> >  
> > -	map_size = ALIGN(info.size, PAGE_SIZE);
> > +	map_size = ALIGN(info.size, SZ_64K);
> >  	table->guest_phys_addr = pci_get_mmio_block(map_size);
> >  	if (!table->guest_phys_addr) {
> >  		pr_err("cannot allocate MMIO space");
> > @@ -960,7 +962,7 @@ static int vfio_pci_create_msix_table(struct kvm
> > *kvm, struct vfio_device *vdev) if (!info.size)
> >  			return -EINVAL;
> >  
> > -		map_size = ALIGN(info.size, PAGE_SIZE);
> > +		map_size = ALIGN(info.size, SZ_64K);
> >  		pba->guest_phys_addr = pci_get_mmio_block(map_size);
> >  		if (!pba->guest_phys_addr) {
> >  			pr_err("cannot allocate MMIO space");
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 kvmtool 0/7] vfio/pci: Fix MSIX table and PBA size allocation
  2021-09-13 15:44 [PATCH v1 kvmtool 0/7] vfio/pci: Fix MSIX table and PBA size allocation Alexandru Elisei
                   ` (6 preceding siblings ...)
       [not found] ` <20210913154413.14322-5-alexandru.elisei@arm.com>
@ 2021-10-12  8:31 ` Will Deacon
  2021-10-12 10:50   ` Alexandru Elisei
  7 siblings, 1 reply; 20+ messages in thread
From: Will Deacon @ 2021-10-12  8:31 UTC (permalink / raw)
  To: Alexandru Elisei; +Cc: julien.thierry.kdev, kvm, andre.przywara, jean-philippe

Hi Alex,

On Mon, Sep 13, 2021 at 04:44:06PM +0100, Alexandru Elisei wrote:
> This series is meant to rework the way the MSIX table and PBA are allocated
> to prevent situations where the size allocated by kvmtool is larger than
> the size of the BAR that holds them.
> 
> Patches 1-3 are fixes for stuff I found when I was investing a bug
> triggered by the incorrect sizing of the table and PBA.
> 
> Patch 4 is a preparatory patch.
> 
> Patch 5 is the proper fix. More details in the commit message.
> 
> Patch 6 is there to make it easier to catch such errors if the code
> regresses.
> 
> Patch 7 is an optimization for guests with larger page sizes than the host.

Please can you post a version of this with Andre's tags/comments addressed
so that I can pick it up?

Thanks!

Will

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 kvmtool 0/7] vfio/pci: Fix MSIX table and PBA size allocation
  2021-10-12  8:31 ` [PATCH v1 kvmtool 0/7] vfio/pci: Fix MSIX table and PBA size allocation Will Deacon
@ 2021-10-12 10:50   ` Alexandru Elisei
  0 siblings, 0 replies; 20+ messages in thread
From: Alexandru Elisei @ 2021-10-12 10:50 UTC (permalink / raw)
  To: Will Deacon; +Cc: julien.thierry.kdev, kvm, andre.przywara, jean-philippe

Hi Will,

On Tue, Oct 12, 2021 at 09:31:33AM +0100, Will Deacon wrote:
> Hi Alex,
> 
> On Mon, Sep 13, 2021 at 04:44:06PM +0100, Alexandru Elisei wrote:
> > This series is meant to rework the way the MSIX table and PBA are allocated
> > to prevent situations where the size allocated by kvmtool is larger than
> > the size of the BAR that holds them.
> > 
> > Patches 1-3 are fixes for stuff I found when I was investing a bug
> > triggered by the incorrect sizing of the table and PBA.
> > 
> > Patch 4 is a preparatory patch.
> > 
> > Patch 5 is the proper fix. More details in the commit message.
> > 
> > Patch 6 is there to make it easier to catch such errors if the code
> > regresses.
> > 
> > Patch 7 is an optimization for guests with larger page sizes than the host.
> 
> Please can you post a version of this with Andre's tags/comments addressed
> so that I can pick it up?

Working on it, will post as soon as I can.

Thanks,
Alex

> 
> Thanks!
> 
> Will

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2021-10-12 10:49 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-13 15:44 [PATCH v1 kvmtool 0/7] vfio/pci: Fix MSIX table and PBA size allocation Alexandru Elisei
2021-09-13 15:44 ` [PATCH v1 kvmtool 1/7] arm/gicv2m: Set errno when gicv2_update_routing() fails Alexandru Elisei
2021-10-06 15:08   ` Andre Przywara
2021-09-13 15:44 ` [PATCH v1 kvmtool 2/7] vfio/pci.c: Remove double include for assert.h Alexandru Elisei
2021-10-06 15:09   ` Andre Przywara
2021-09-13 15:44 ` [PATCH v1 kvmtool 3/7] pci: Fix pci_dev_* print macros Alexandru Elisei
2021-09-14  9:13   ` [RESEND PATCH v1 kvmtool 4/8] vfio/pci: Rename PBA offset in device descriptor to fd_offset Alexandru Elisei
2021-10-06 15:10   ` [PATCH v1 kvmtool 3/7] pci: Fix pci_dev_* print macros Andre Przywara
2021-09-13 15:44 ` [PATCH v1 kvmtool 5/7] vfio/pci: Rework MSIX table and PBA physical size allocation Alexandru Elisei
2021-10-06 15:11   ` Andre Przywara
2021-10-11 14:39     ` Alexandru Elisei
2021-09-13 15:44 ` [PATCH v1 kvmtool 6/7] vfio/pci: Print an error when offset is outside of the MSIX table or PBA Alexandru Elisei
2021-10-06 15:11   ` Andre Przywara
2021-10-11 14:46     ` Alexandru Elisei
2021-09-13 15:44 ` [PATCH v1 kvmtool 7/7] vfio/pci: Align MSIX Table and PBA size allocation to 64k Alexandru Elisei
2021-10-06 15:11   ` Andre Przywara
2021-10-11 14:57     ` Alexandru Elisei
     [not found] ` <20210913154413.14322-5-alexandru.elisei@arm.com>
2021-10-06 15:10   ` [PATCH v1 kvmtool 4/7] vfio/pci: Rename PBA offset in device descriptor to fd_offset Andre Przywara
2021-10-12  8:31 ` [PATCH v1 kvmtool 0/7] vfio/pci: Fix MSIX table and PBA size allocation Will Deacon
2021-10-12 10:50   ` Alexandru Elisei

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.