linux-scsi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/3] UIO_MEM_DMA_COHERENT for cnic/bnx2/bnx2x
@ 2024-01-09 12:14 Nilesh Javali
  2024-01-09 12:14 ` [PATCH v3 1/3] uio: introduce UIO_MEM_DMA_COHERENT type Nilesh Javali
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Nilesh Javali @ 2024-01-09 12:14 UTC (permalink / raw)
  To: martin.petersen, lduncan, cleech
  Cc: linux-scsi, GR-QLogic-Storage-Upstream, jmeneghi

During bnx2i iSCSI testing we ran into page refcounting issues in the
uio mmaps exported from cnic to the iscsiuio process, and bisected back
to the removal of the __GFP_COMP flag from dma_alloc_coherent calls.

In order to fix these drivers to be able to mmap dma coherent memory via
a uio device, without resorting to hacks and working with an iommu
enabled, introduce a new uio mmap type backed by dma_mmap_coherent.

While converting the uio interface, I also noticed that not all of these
allocations were PAGE_SIZE aligned. Particularly the bnx2/bnx2x status
block mapping was much smaller than any architecture page size, and I
was concerned that it could be unintentionally exposing kernel memory.

v3:
- fix warnings reported by kernel test robot
  and added base commit
v2:
- expose only the dma_addr within uio and cnic.
- Cleanup newly added unions comprising virtual_addr
  and struct device

Chris Leech (3):
  uio: introduce UIO_MEM_DMA_COHERENT type
  cnic,bnx2,bnx2x: use UIO_MEM_DMA_COHERENT
  cnic,bnx2,bnx2x: page align uio mmap allocations

 drivers/net/ethernet/broadcom/bnx2.c          |  2 +
 .../net/ethernet/broadcom/bnx2x/bnx2x_main.c  | 10 +++--
 drivers/net/ethernet/broadcom/cnic.c          | 20 +++++++---
 drivers/net/ethernet/broadcom/cnic.h          |  1 +
 drivers/net/ethernet/broadcom/cnic_if.h       |  1 +
 drivers/uio/uio.c                             | 40 +++++++++++++++++++
 include/linux/uio_driver.h                    |  2 +
 7 files changed, 66 insertions(+), 10 deletions(-)


base-commit: ed340d13aa1db6773667ed4bf907738df203fbda
-- 
2.23.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v3 1/3] uio: introduce UIO_MEM_DMA_COHERENT type
  2024-01-09 12:14 [PATCH v3 0/3] UIO_MEM_DMA_COHERENT for cnic/bnx2/bnx2x Nilesh Javali
@ 2024-01-09 12:14 ` Nilesh Javali
  2024-01-24 23:14   ` Chris Leech
  2024-01-24 23:36   ` Chris Leech
  2024-01-09 12:14 ` [PATCH v3 2/3] cnic,bnx2,bnx2x: use UIO_MEM_DMA_COHERENT Nilesh Javali
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 10+ messages in thread
From: Nilesh Javali @ 2024-01-09 12:14 UTC (permalink / raw)
  To: martin.petersen, lduncan, cleech
  Cc: linux-scsi, GR-QLogic-Storage-Upstream, jmeneghi

From: Chris Leech <cleech@redhat.com>

Add a UIO memtype specifically for sharing dma_alloc_coherent
memory with userspace, backed by dma_mmap_coherent.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202401042222.J9GOUiYL-lkp@intel.com/
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Chris Leech <cleech@redhat.com>
---
v3:
- fix warnings reported by kernel test robot
v2:
- expose only the dma_addr within uio_mem
- Cleanup newly added unions comprising virtual_addr
  and struct device
 drivers/uio/uio.c          | 40 ++++++++++++++++++++++++++++++++++++++
 include/linux/uio_driver.h |  2 ++
 2 files changed, 42 insertions(+)

diff --git a/drivers/uio/uio.c b/drivers/uio/uio.c
index 62082d64ece0..01d83728b513 100644
--- a/drivers/uio/uio.c
+++ b/drivers/uio/uio.c
@@ -24,6 +24,7 @@
 #include <linux/kobject.h>
 #include <linux/cdev.h>
 #include <linux/uio_driver.h>
+#include <linux/dma-mapping.h>
 
 #define UIO_MAX_DEVICES		(1U << MINORBITS)
 
@@ -759,6 +760,42 @@ static int uio_mmap_physical(struct vm_area_struct *vma)
 			       vma->vm_page_prot);
 }
 
+static int uio_mmap_dma_coherent(struct vm_area_struct *vma)
+{
+	struct uio_device *idev = vma->vm_private_data;
+	struct uio_mem *mem;
+	void *addr;
+	int ret = 0;
+	int mi;
+
+	mi = uio_find_mem_index(vma);
+	if (mi < 0)
+		return -EINVAL;
+
+	mem = idev->info->mem + mi;
+
+	if (mem->dma_addr & ~PAGE_MASK)
+		return -ENODEV;
+	if (vma->vm_end - vma->vm_start > mem->size)
+		return -EINVAL;
+
+	/*
+	 * UIO uses offset to index into the maps for a device.
+	 * We need to clear vm_pgoff for dma_mmap_coherent.
+	 */
+	vma->vm_pgoff = 0;
+
+	addr = (void *)mem->addr;
+	ret = dma_mmap_coherent(&idev->dev,
+				vma,
+				addr,
+				mem->dma_addr,
+				vma->vm_end - vma->vm_start);
+	vma->vm_pgoff = mi;
+
+	return ret;
+}
+
 static int uio_mmap(struct file *filep, struct vm_area_struct *vma)
 {
 	struct uio_listener *listener = filep->private_data;
@@ -806,6 +843,9 @@ static int uio_mmap(struct file *filep, struct vm_area_struct *vma)
 	case UIO_MEM_VIRTUAL:
 		ret = uio_mmap_logical(vma);
 		break;
+	case UIO_MEM_DMA_COHERENT:
+		ret = uio_mmap_dma_coherent(vma);
+		break;
 	default:
 		ret = -EINVAL;
 	}
diff --git a/include/linux/uio_driver.h b/include/linux/uio_driver.h
index 47c5962b876b..7efa81497183 100644
--- a/include/linux/uio_driver.h
+++ b/include/linux/uio_driver.h
@@ -37,6 +37,7 @@ struct uio_map;
 struct uio_mem {
 	const char		*name;
 	phys_addr_t		addr;
+	dma_addr_t		dma_addr;
 	unsigned long		offs;
 	resource_size_t		size;
 	int			memtype;
@@ -158,6 +159,7 @@ extern int __must_check
 #define UIO_MEM_LOGICAL	2
 #define UIO_MEM_VIRTUAL 3
 #define UIO_MEM_IOVA	4
+#define UIO_MEM_DMA_COHERENT	5
 
 /* defines for uio_port->porttype */
 #define UIO_PORT_NONE	0
-- 
2.23.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v3 2/3] cnic,bnx2,bnx2x: use UIO_MEM_DMA_COHERENT
  2024-01-09 12:14 [PATCH v3 0/3] UIO_MEM_DMA_COHERENT for cnic/bnx2/bnx2x Nilesh Javali
  2024-01-09 12:14 ` [PATCH v3 1/3] uio: introduce UIO_MEM_DMA_COHERENT type Nilesh Javali
@ 2024-01-09 12:14 ` Nilesh Javali
  2024-01-09 12:14 ` [PATCH v3 3/3] cnic,bnx2,bnx2x: page align uio mmap allocations Nilesh Javali
  2024-01-17  7:59 ` [PATCH v3 0/3] UIO_MEM_DMA_COHERENT for cnic/bnx2/bnx2x Nilesh Javali
  3 siblings, 0 replies; 10+ messages in thread
From: Nilesh Javali @ 2024-01-09 12:14 UTC (permalink / raw)
  To: martin.petersen, lduncan, cleech
  Cc: linux-scsi, GR-QLogic-Storage-Upstream, jmeneghi

From: Chris Leech <cleech@redhat.com>

Use new UIO_MEM_DMA_COHERENT type to properly handle
mmap for dma_alloc_coherent buffers.

The cnic l2_ring and l2_buf mmaps have caused page
refcount issues as the dma_alloc_coherent no more
provide __GFP_COMP allocation as per commit,
dma-mapping: reject __GFP_COMP in dma_alloc_attrs.

Fix this by having the uio device use dma_mmap_coherent.

The bnx2 and bnx2x status block allocations are also dma_alloc_coherent,
and should use dma_mmap_coherent. They didn't allocate multiple pages,
but also didn't seem to work correctly with an iommu enabled.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202401051255.Qa1W34Jr-lkp@intel.com/
Fixes: bb73955c0b1d ("cnic: don't pass bogus GFP_ flags to dma_alloc_coherent")
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Chris Leech <cleech@redhat.com>
---
v3:
- fix warnings reported by kernel test robot
v2:
- expose only the dma_addr from cnic to uio
- Cleanup newly added unions comprising virtual_addr
  and struct device
 drivers/net/ethernet/broadcom/bnx2.c             |  1 +
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c |  2 ++
 drivers/net/ethernet/broadcom/cnic.c             | 12 +++++++++---
 drivers/net/ethernet/broadcom/cnic.h             |  1 +
 drivers/net/ethernet/broadcom/cnic_if.h          |  1 +
 5 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2.c b/drivers/net/ethernet/broadcom/bnx2.c
index 0d917a9699c5..b65b8592ad75 100644
--- a/drivers/net/ethernet/broadcom/bnx2.c
+++ b/drivers/net/ethernet/broadcom/bnx2.c
@@ -367,6 +367,7 @@ static void bnx2_setup_cnic_irq_info(struct bnx2 *bp)
 	cp->irq_arr[0].status_blk = (void *)
 		((unsigned long) bnapi->status_blk.msi +
 		(BNX2_SBLK_MSIX_ALIGN_SIZE * sb_id));
+	cp->irq_arr[0].status_blk_map = bp->status_blk_mapping;
 	cp->irq_arr[0].status_blk_num = sb_id;
 	cp->num_irq = 1;
 }
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index 0d8e61c63c7c..678829646cec 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -14912,9 +14912,11 @@ void bnx2x_setup_cnic_irq_info(struct bnx2x *bp)
 	else
 		cp->irq_arr[0].status_blk = (void *)bp->cnic_sb.e1x_sb;
 
+	cp->irq_arr[0].status_blk_map = bp->cnic_sb_mapping;
 	cp->irq_arr[0].status_blk_num =  bnx2x_cnic_fw_sb_id(bp);
 	cp->irq_arr[0].status_blk_num2 = bnx2x_cnic_igu_sb_id(bp);
 	cp->irq_arr[1].status_blk = bp->def_status_blk;
+	cp->irq_arr[1].status_blk_map = bp->def_status_blk_mapping;
 	cp->irq_arr[1].status_blk_num = DEF_SB_ID;
 	cp->irq_arr[1].status_blk_num2 = DEF_SB_IGU_ID;
 
diff --git a/drivers/net/ethernet/broadcom/cnic.c b/drivers/net/ethernet/broadcom/cnic.c
index 7926aaef8f0c..e2069a984764 100644
--- a/drivers/net/ethernet/broadcom/cnic.c
+++ b/drivers/net/ethernet/broadcom/cnic.c
@@ -1107,6 +1107,7 @@ static int cnic_init_uio(struct cnic_dev *dev)
 						     TX_MAX_TSS_RINGS + 1);
 		uinfo->mem[1].addr = (unsigned long) cp->status_blk.gen &
 					CNIC_PAGE_MASK;
+		uinfo->mem[1].dma_addr = cp->status_blk_map;
 		if (cp->ethdev->drv_state & CNIC_DRV_STATE_USING_MSIX)
 			uinfo->mem[1].size = BNX2_SBLK_MSIX_ALIGN_SIZE * 9;
 		else
@@ -1118,20 +1119,23 @@ static int cnic_init_uio(struct cnic_dev *dev)
 
 		uinfo->mem[1].addr = (unsigned long) cp->bnx2x_def_status_blk &
 			CNIC_PAGE_MASK;
+		uinfo->mem[1].dma_addr = cp->status_blk_map;
 		uinfo->mem[1].size = sizeof(*cp->bnx2x_def_status_blk);
 
 		uinfo->name = "bnx2x_cnic";
 	}
 
-	uinfo->mem[1].memtype = UIO_MEM_LOGICAL;
+	uinfo->mem[1].memtype = UIO_MEM_DMA_COHERENT;
 
 	uinfo->mem[2].addr = (unsigned long) udev->l2_ring;
+	uinfo->mem[2].dma_addr = udev->l2_ring_map;
 	uinfo->mem[2].size = udev->l2_ring_size;
-	uinfo->mem[2].memtype = UIO_MEM_LOGICAL;
+	uinfo->mem[2].memtype = UIO_MEM_DMA_COHERENT;
 
 	uinfo->mem[3].addr = (unsigned long) udev->l2_buf;
+	uinfo->mem[3].dma_addr = udev->l2_buf_map;
 	uinfo->mem[3].size = udev->l2_buf_size;
-	uinfo->mem[3].memtype = UIO_MEM_LOGICAL;
+	uinfo->mem[3].memtype = UIO_MEM_DMA_COHERENT;
 
 	uinfo->version = CNIC_MODULE_VERSION;
 	uinfo->irq = UIO_IRQ_CUSTOM;
@@ -1313,6 +1317,7 @@ static int cnic_alloc_bnx2x_resc(struct cnic_dev *dev)
 		return 0;
 
 	cp->bnx2x_def_status_blk = cp->ethdev->irq_arr[1].status_blk;
+	cp->status_blk_map = cp->ethdev->irq_arr[1].status_blk_map;
 
 	cp->l2_rx_ring_size = 15;
 
@@ -5323,6 +5328,7 @@ static int cnic_start_hw(struct cnic_dev *dev)
 	pci_dev_get(dev->pcidev);
 	cp->func = PCI_FUNC(dev->pcidev->devfn);
 	cp->status_blk.gen = ethdev->irq_arr[0].status_blk;
+	cp->status_blk_map = ethdev->irq_arr[0].status_blk_map;
 	cp->status_blk_num = ethdev->irq_arr[0].status_blk_num;
 
 	err = cp->alloc_resc(dev);
diff --git a/drivers/net/ethernet/broadcom/cnic.h b/drivers/net/ethernet/broadcom/cnic.h
index 4baea81bae7a..fedc84ada937 100644
--- a/drivers/net/ethernet/broadcom/cnic.h
+++ b/drivers/net/ethernet/broadcom/cnic.h
@@ -260,6 +260,7 @@ struct cnic_local {
 		#define SM_RX_ID		0
 		#define SM_TX_ID		1
 	} status_blk;
+	dma_addr_t status_blk_map;
 
 	struct host_sp_status_block	*bnx2x_def_status_blk;
 
diff --git a/drivers/net/ethernet/broadcom/cnic_if.h b/drivers/net/ethernet/broadcom/cnic_if.h
index 789e5c7e9311..49a11ec80b36 100644
--- a/drivers/net/ethernet/broadcom/cnic_if.h
+++ b/drivers/net/ethernet/broadcom/cnic_if.h
@@ -190,6 +190,7 @@ struct cnic_ops {
 struct cnic_irq {
 	unsigned int	vector;
 	void		*status_blk;
+	dma_addr_t	status_blk_map;
 	u32		status_blk_num;
 	u32		status_blk_num2;
 	u32		irq_flags;
-- 
2.23.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v3 3/3] cnic,bnx2,bnx2x: page align uio mmap allocations
  2024-01-09 12:14 [PATCH v3 0/3] UIO_MEM_DMA_COHERENT for cnic/bnx2/bnx2x Nilesh Javali
  2024-01-09 12:14 ` [PATCH v3 1/3] uio: introduce UIO_MEM_DMA_COHERENT type Nilesh Javali
  2024-01-09 12:14 ` [PATCH v3 2/3] cnic,bnx2,bnx2x: use UIO_MEM_DMA_COHERENT Nilesh Javali
@ 2024-01-09 12:14 ` Nilesh Javali
  2024-01-17  7:59 ` [PATCH v3 0/3] UIO_MEM_DMA_COHERENT for cnic/bnx2/bnx2x Nilesh Javali
  3 siblings, 0 replies; 10+ messages in thread
From: Nilesh Javali @ 2024-01-09 12:14 UTC (permalink / raw)
  To: martin.petersen, lduncan, cleech
  Cc: linux-scsi, GR-QLogic-Storage-Upstream, jmeneghi

From: Chris Leech <cleech@redhat.com>

Allocations in these drivers that will be mmaped through a uio device
should be made in multiples of PAGE_SIZE to avoid exposing additional
kernel memory unintentionally.

Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Chris Leech <cleech@redhat.com>
---
 drivers/net/ethernet/broadcom/bnx2.c             | 1 +
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c | 8 ++++----
 drivers/net/ethernet/broadcom/cnic.c             | 8 +++++---
 3 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2.c b/drivers/net/ethernet/broadcom/bnx2.c
index b65b8592ad75..490f88ad3bd2 100644
--- a/drivers/net/ethernet/broadcom/bnx2.c
+++ b/drivers/net/ethernet/broadcom/bnx2.c
@@ -838,6 +838,7 @@ bnx2_alloc_stats_blk(struct net_device *dev)
 						 BNX2_SBLK_MSIX_ALIGN_SIZE);
 	bp->status_stats_size = status_blk_size +
 				sizeof(struct statistics_block);
+	bp->status_stats_size = PAGE_ALIGN(bp->status_stats_size);
 	status_blk = dma_alloc_coherent(&bp->pdev->dev, bp->status_stats_size,
 					&bp->status_blk_mapping, GFP_KERNEL);
 	if (!status_blk)
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index 678829646cec..1d7be7b7c63f 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -8270,10 +8270,10 @@ void bnx2x_free_mem_cnic(struct bnx2x *bp)
 
 	if (!CHIP_IS_E1x(bp))
 		BNX2X_PCI_FREE(bp->cnic_sb.e2_sb, bp->cnic_sb_mapping,
-			       sizeof(struct host_hc_status_block_e2));
+			       PAGE_ALIGN(sizeof(struct host_hc_status_block_e2)));
 	else
 		BNX2X_PCI_FREE(bp->cnic_sb.e1x_sb, bp->cnic_sb_mapping,
-			       sizeof(struct host_hc_status_block_e1x));
+			       PAGE_ALIGN(sizeof(struct host_hc_status_block_e1x)));
 
 	BNX2X_PCI_FREE(bp->t2, bp->t2_mapping, SRC_T2_SZ);
 }
@@ -8316,12 +8316,12 @@ int bnx2x_alloc_mem_cnic(struct bnx2x *bp)
 	if (!CHIP_IS_E1x(bp)) {
 		/* size = the status block + ramrod buffers */
 		bp->cnic_sb.e2_sb = BNX2X_PCI_ALLOC(&bp->cnic_sb_mapping,
-						    sizeof(struct host_hc_status_block_e2));
+						    PAGE_ALIGN(sizeof(struct host_hc_status_block_e2)));
 		if (!bp->cnic_sb.e2_sb)
 			goto alloc_mem_err;
 	} else {
 		bp->cnic_sb.e1x_sb = BNX2X_PCI_ALLOC(&bp->cnic_sb_mapping,
-						     sizeof(struct host_hc_status_block_e1x));
+						     PAGE_ALIGN(sizeof(struct host_hc_status_block_e1x)));
 		if (!bp->cnic_sb.e1x_sb)
 			goto alloc_mem_err;
 	}
diff --git a/drivers/net/ethernet/broadcom/cnic.c b/drivers/net/ethernet/broadcom/cnic.c
index e2069a984764..ec4333e1ac9b 100644
--- a/drivers/net/ethernet/broadcom/cnic.c
+++ b/drivers/net/ethernet/broadcom/cnic.c
@@ -1026,6 +1026,7 @@ static int __cnic_alloc_uio_rings(struct cnic_uio_dev *udev, int pages)
 		return 0;
 
 	udev->l2_ring_size = pages * CNIC_PAGE_SIZE;
+	udev->l2_ring_size = PAGE_ALIGN(udev->l2_ring_size);
 	udev->l2_ring = dma_alloc_coherent(&udev->pdev->dev, udev->l2_ring_size,
 					   &udev->l2_ring_map, GFP_KERNEL);
 	if (!udev->l2_ring)
@@ -1033,6 +1034,7 @@ static int __cnic_alloc_uio_rings(struct cnic_uio_dev *udev, int pages)
 
 	udev->l2_buf_size = (cp->l2_rx_ring_size + 1) * cp->l2_single_buf_size;
 	udev->l2_buf_size = CNIC_PAGE_ALIGN(udev->l2_buf_size);
+	udev->l2_buf_size = PAGE_ALIGN(udev->l2_buf_size);
 	udev->l2_buf = dma_alloc_coherent(&udev->pdev->dev, udev->l2_buf_size,
 					  &udev->l2_buf_map, GFP_KERNEL);
 	if (!udev->l2_buf) {
@@ -1109,9 +1111,9 @@ static int cnic_init_uio(struct cnic_dev *dev)
 					CNIC_PAGE_MASK;
 		uinfo->mem[1].dma_addr = cp->status_blk_map;
 		if (cp->ethdev->drv_state & CNIC_DRV_STATE_USING_MSIX)
-			uinfo->mem[1].size = BNX2_SBLK_MSIX_ALIGN_SIZE * 9;
+			uinfo->mem[1].size = PAGE_ALIGN(BNX2_SBLK_MSIX_ALIGN_SIZE * 9);
 		else
-			uinfo->mem[1].size = BNX2_SBLK_MSIX_ALIGN_SIZE;
+			uinfo->mem[1].size = PAGE_ALIGN(BNX2_SBLK_MSIX_ALIGN_SIZE);
 
 		uinfo->name = "bnx2_cnic";
 	} else if (test_bit(CNIC_F_BNX2X_CLASS, &dev->flags)) {
@@ -1120,7 +1122,7 @@ static int cnic_init_uio(struct cnic_dev *dev)
 		uinfo->mem[1].addr = (unsigned long) cp->bnx2x_def_status_blk &
 			CNIC_PAGE_MASK;
 		uinfo->mem[1].dma_addr = cp->status_blk_map;
-		uinfo->mem[1].size = sizeof(*cp->bnx2x_def_status_blk);
+		uinfo->mem[1].size = PAGE_ALIGN(sizeof(*cp->bnx2x_def_status_blk));
 
 		uinfo->name = "bnx2x_cnic";
 	}
-- 
2.23.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* RE: [PATCH v3 0/3] UIO_MEM_DMA_COHERENT for cnic/bnx2/bnx2x
  2024-01-09 12:14 [PATCH v3 0/3] UIO_MEM_DMA_COHERENT for cnic/bnx2/bnx2x Nilesh Javali
                   ` (2 preceding siblings ...)
  2024-01-09 12:14 ` [PATCH v3 3/3] cnic,bnx2,bnx2x: page align uio mmap allocations Nilesh Javali
@ 2024-01-17  7:59 ` Nilesh Javali
  3 siblings, 0 replies; 10+ messages in thread
From: Nilesh Javali @ 2024-01-17  7:59 UTC (permalink / raw)
  To: Nilesh Javali, martin.petersen, lduncan, cleech
  Cc: linux-scsi, GR-QLogic-Storage-Upstream, jmeneghi

A gentle reminder if this patch set can be reviewed.

Thanks,
Nilesh

> -----Original Message-----
> From: Nilesh Javali <njavali@marvell.com>
> Sent: Tuesday, January 9, 2024 5:45 PM
> To: martin.petersen@oracle.com; lduncan@suse.com; cleech@redhat.com
> Cc: linux-scsi@vger.kernel.org; GR-QLogic-Storage-Upstream <GR-QLogic-
> Storage-Upstream@marvell.com>; jmeneghi@redhat.com
> Subject: [PATCH v3 0/3] UIO_MEM_DMA_COHERENT for cnic/bnx2/bnx2x
> 
> During bnx2i iSCSI testing we ran into page refcounting issues in the
> uio mmaps exported from cnic to the iscsiuio process, and bisected back
> to the removal of the __GFP_COMP flag from dma_alloc_coherent calls.
> 
> In order to fix these drivers to be able to mmap dma coherent memory via
> a uio device, without resorting to hacks and working with an iommu
> enabled, introduce a new uio mmap type backed by dma_mmap_coherent.
> 
> While converting the uio interface, I also noticed that not all of these
> allocations were PAGE_SIZE aligned. Particularly the bnx2/bnx2x status
> block mapping was much smaller than any architecture page size, and I
> was concerned that it could be unintentionally exposing kernel memory.
> 
> v3:
> - fix warnings reported by kernel test robot
>   and added base commit
> v2:
> - expose only the dma_addr within uio and cnic.
> - Cleanup newly added unions comprising virtual_addr
>   and struct device
> 
> Chris Leech (3):
>   uio: introduce UIO_MEM_DMA_COHERENT type
>   cnic,bnx2,bnx2x: use UIO_MEM_DMA_COHERENT
>   cnic,bnx2,bnx2x: page align uio mmap allocations
> 
>  drivers/net/ethernet/broadcom/bnx2.c          |  2 +
>  .../net/ethernet/broadcom/bnx2x/bnx2x_main.c  | 10 +++--
>  drivers/net/ethernet/broadcom/cnic.c          | 20 +++++++---
>  drivers/net/ethernet/broadcom/cnic.h          |  1 +
>  drivers/net/ethernet/broadcom/cnic_if.h       |  1 +
>  drivers/uio/uio.c                             | 40 +++++++++++++++++++
>  include/linux/uio_driver.h                    |  2 +
>  7 files changed, 66 insertions(+), 10 deletions(-)
> 
> 
> base-commit: ed340d13aa1db6773667ed4bf907738df203fbda
> --
> 2.23.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 1/3] uio: introduce UIO_MEM_DMA_COHERENT type
  2024-01-09 12:14 ` [PATCH v3 1/3] uio: introduce UIO_MEM_DMA_COHERENT type Nilesh Javali
@ 2024-01-24 23:14   ` Chris Leech
  2024-01-25 22:43     ` Greg Kroah-Hartman
  2024-01-24 23:36   ` Chris Leech
  1 sibling, 1 reply; 10+ messages in thread
From: Chris Leech @ 2024-01-24 23:14 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Nilesh Javali
  Cc: martin.petersen, lduncan, linux-scsi, GR-QLogic-Storage-Upstream,
	jmeneghi, Christoph Hellwig

Hi Greg, I'd like to ask you to reconsider accepting UIO API changes for
mapping DMA coherent allocations.

The bnx2i iSCSI driver is currently broken due to incorrect (but
previously functioning) uio use in the "cnic" module.  I believe that
the most direct way to fix it is to specifically handle DMA coherent
allocations in the UIO API backed by dma_mmap_coherent.

After my original posting of these patches[1], Nilesh Javali had made an
attempt to change the cnic allocations[2]. Those were rejected, and
Nilesh has returned to cleaning up the UIO_MEM_DMA_COHERENT patches.

[1] https://lore.kernel.org/all/20230929170023.1020032-1-cleech@redhat.com/
[2] https://lore.kernel.org/linux-scsi/20231219055514.12324-1-njavali@marvell.com/

While I understand the complaints about how these drivers have been
structured, I also don't like letting support bitrot when there's a
reasonable alternative to re-architecting an existing driver.

There are two other uio drivers which are mmaping dma_alloc_coherent
memory as UIO_MEM_PHYS, uio_dmem_genirq and uio_pruss. While a
conversion to use dma_mmap_coherent might be more correct for these as
well, I have no way of testing them and assume that this just hasn't
been an issue for the platforms in question.

Nilesh has kindly removed the unions I had in the original posting, and
done additional testing on the changes.

Although I do believe that we need to go back to somehow storing the
device struct associated with the PCI device, and not using idev->dev
here.

> +	ret = dma_mmap_coherent(&idev->dev,
> +				vma,
> +				addr,
> +				mem->dma_addr,
> +				vma->vm_end - vma->vm_start);

Thank you,
- Chris Leech

On Tue, Jan 09, 2024 at 05:44:56PM +0530, Nilesh Javali wrote:
> From: Chris Leech <cleech@redhat.com>
> 
> Add a UIO memtype specifically for sharing dma_alloc_coherent
> memory with userspace, backed by dma_mmap_coherent.
> 
> Reported-by: kernel test robot <lkp@intel.com>
> Closes: https://lore.kernel.org/oe-kbuild-all/202401042222.J9GOUiYL-lkp@intel.com/
> Signed-off-by: Nilesh Javali <njavali@marvell.com>
> Signed-off-by: Chris Leech <cleech@redhat.com>
> ---
> v3:
> - fix warnings reported by kernel test robot
> v2:
> - expose only the dma_addr within uio_mem
> - Cleanup newly added unions comprising virtual_addr
>   and struct device
>  drivers/uio/uio.c          | 40 ++++++++++++++++++++++++++++++++++++++
>  include/linux/uio_driver.h |  2 ++
>  2 files changed, 42 insertions(+)
> 
> diff --git a/drivers/uio/uio.c b/drivers/uio/uio.c
> index 62082d64ece0..01d83728b513 100644
> --- a/drivers/uio/uio.c
> +++ b/drivers/uio/uio.c
> @@ -24,6 +24,7 @@
>  #include <linux/kobject.h>
>  #include <linux/cdev.h>
>  #include <linux/uio_driver.h>
> +#include <linux/dma-mapping.h>
>  
>  #define UIO_MAX_DEVICES		(1U << MINORBITS)
>  
> @@ -759,6 +760,42 @@ static int uio_mmap_physical(struct vm_area_struct *vma)
>  			       vma->vm_page_prot);
>  }
>  
> +static int uio_mmap_dma_coherent(struct vm_area_struct *vma)
> +{
> +	struct uio_device *idev = vma->vm_private_data;
> +	struct uio_mem *mem;
> +	void *addr;
> +	int ret = 0;
> +	int mi;
> +
> +	mi = uio_find_mem_index(vma);
> +	if (mi < 0)
> +		return -EINVAL;
> +
> +	mem = idev->info->mem + mi;
> +
> +	if (mem->dma_addr & ~PAGE_MASK)
> +		return -ENODEV;
> +	if (vma->vm_end - vma->vm_start > mem->size)
> +		return -EINVAL;
> +
> +	/*
> +	 * UIO uses offset to index into the maps for a device.
> +	 * We need to clear vm_pgoff for dma_mmap_coherent.
> +	 */
> +	vma->vm_pgoff = 0;
> +
> +	addr = (void *)mem->addr;
> +	ret = dma_mmap_coherent(&idev->dev,
> +				vma,
> +				addr,
> +				mem->dma_addr,
> +				vma->vm_end - vma->vm_start);
> +	vma->vm_pgoff = mi;
> +
> +	return ret;
> +}
> +
>  static int uio_mmap(struct file *filep, struct vm_area_struct *vma)
>  {
>  	struct uio_listener *listener = filep->private_data;
> @@ -806,6 +843,9 @@ static int uio_mmap(struct file *filep, struct vm_area_struct *vma)
>  	case UIO_MEM_VIRTUAL:
>  		ret = uio_mmap_logical(vma);
>  		break;
> +	case UIO_MEM_DMA_COHERENT:
> +		ret = uio_mmap_dma_coherent(vma);
> +		break;
>  	default:
>  		ret = -EINVAL;
>  	}
> diff --git a/include/linux/uio_driver.h b/include/linux/uio_driver.h
> index 47c5962b876b..7efa81497183 100644
> --- a/include/linux/uio_driver.h
> +++ b/include/linux/uio_driver.h
> @@ -37,6 +37,7 @@ struct uio_map;
>  struct uio_mem {
>  	const char		*name;
>  	phys_addr_t		addr;
> +	dma_addr_t		dma_addr;
>  	unsigned long		offs;
>  	resource_size_t		size;
>  	int			memtype;
> @@ -158,6 +159,7 @@ extern int __must_check
>  #define UIO_MEM_LOGICAL	2
>  #define UIO_MEM_VIRTUAL 3
>  #define UIO_MEM_IOVA	4
> +#define UIO_MEM_DMA_COHERENT	5
>  
>  /* defines for uio_port->porttype */
>  #define UIO_PORT_NONE	0
> -- 
> 2.23.1
> 


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 1/3] uio: introduce UIO_MEM_DMA_COHERENT type
  2024-01-09 12:14 ` [PATCH v3 1/3] uio: introduce UIO_MEM_DMA_COHERENT type Nilesh Javali
  2024-01-24 23:14   ` Chris Leech
@ 2024-01-24 23:36   ` Chris Leech
  2024-01-25 10:49     ` [EXT] " Nilesh Javali
  1 sibling, 1 reply; 10+ messages in thread
From: Chris Leech @ 2024-01-24 23:36 UTC (permalink / raw)
  To: Nilesh Javali
  Cc: martin.petersen, lduncan, linux-scsi, GR-QLogic-Storage-Upstream,
	jmeneghi

Nilesh, 

On Tue, Jan 09, 2024 at 05:44:56PM +0530, Nilesh Javali wrote:
> +	ret = dma_mmap_coherent(&idev->dev,
> +				vma,
> +				addr,
> +				mem->dma_addr,
> +				vma->vm_end - vma->vm_start);

When I asked about the use of idev->dev here in the v2 posting, you
repled as follows.

> While the cnic loads, it registers the cnic_uio_dev->pdev->dev with
> uio and the uio attaches its device to cnic device as it's parent. So
> uio and cnic are attached to the same PCI device.

I still don't think the sysfs parent relationship is enough to get the
correct behavior out of the DMA APIs on all platforms, and
dma_mmap_coherent needs to be using the same device struct as
dma_alloc_coherent.

I had some testing done on an AMD system, where your v2 patch set was
failing with the iommu enabled, and my original changes were reported to
work.  And I believe these v3 patches are functionally the same.

- Chris


^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [EXT] Re: [PATCH v3 1/3] uio: introduce UIO_MEM_DMA_COHERENT type
  2024-01-24 23:36   ` Chris Leech
@ 2024-01-25 10:49     ` Nilesh Javali
  2024-01-25 19:16       ` Chris Leech
  0 siblings, 1 reply; 10+ messages in thread
From: Nilesh Javali @ 2024-01-25 10:49 UTC (permalink / raw)
  To: Chris Leech
  Cc: martin.petersen, lduncan, linux-scsi, GR-QLogic-Storage-Upstream,
	jmeneghi, Kaushal Desai, Greg Kroah-Hartman

Chris,

> -----Original Message-----
> From: Chris Leech <cleech@redhat.com>
> Sent: Thursday, January 25, 2024 5:07 AM
> To: Nilesh Javali <njavali@marvell.com>
> Cc: martin.petersen@oracle.com; lduncan@suse.com; linux-
> scsi@vger.kernel.org; GR-QLogic-Storage-Upstream <GR-QLogic-Storage-
> Upstream@marvell.com>; jmeneghi@redhat.com
> Subject: [EXT] Re: [PATCH v3 1/3] uio: introduce UIO_MEM_DMA_COHERENT
> type
> 
> External Email
> 
> ----------------------------------------------------------------------
> Nilesh,
> 
> On Tue, Jan 09, 2024 at 05:44:56PM +0530, Nilesh Javali wrote:
> > +	ret = dma_mmap_coherent(&idev->dev,
> > +				vma,
> > +				addr,
> > +				mem->dma_addr,
> > +				vma->vm_end - vma->vm_start);
> 
> When I asked about the use of idev->dev here in the v2 posting, you
> repled as follows.
> 
> > While the cnic loads, it registers the cnic_uio_dev->pdev->dev with
> > uio and the uio attaches its device to cnic device as it's parent. So
> > uio and cnic are attached to the same PCI device.
> 
> I still don't think the sysfs parent relationship is enough to get the
> correct behavior out of the DMA APIs on all platforms, and
> dma_mmap_coherent needs to be using the same device struct as
> dma_alloc_coherent.
> 
> I had some testing done on an AMD system, where your v2 patch set was
> failing with the iommu enabled, and my original changes were reported to
> work.  And I believe these v3 patches are functionally the same.

We have extensively verified this patch set with IOMMU enabled and see
no regression when VT and SRIOV are enabled. However issues are observed
only when VT, VT-D and SRIOV are enabled in the HW BIOS.
In the failure case, with VT-D enabled, we observe the OS fails to boot with
DMAR timeout error.

"  **] A start job is running for Network Manager (2min 6s / no limit)
[ 147.069016] DMAR: VT-d detected Invalidation Time-out Error: SID 0
[ 147.069016] DMAR: DRHD: handling fault status reg 40
[ 147.080924] DMAR: QI HEAD: Device-TLB Invalidation qw0 = 0xaf0300100003, qw1 = 0x7ffffffffffff001
[ 147.090207] DMAR: QI PRIOR: Invalidation Wait qw0 = 0x200000025, qw1 = 0x10005f634".

With your proposed changes, please confirm if you see no issues with VT-D enabled on Intel/AMD platform.

Also based on our observation the issue with VT-D enabled is not related to the current patch set under test.

Thanks,
Nilesh

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [EXT] Re: [PATCH v3 1/3] uio: introduce UIO_MEM_DMA_COHERENT type
  2024-01-25 10:49     ` [EXT] " Nilesh Javali
@ 2024-01-25 19:16       ` Chris Leech
  0 siblings, 0 replies; 10+ messages in thread
From: Chris Leech @ 2024-01-25 19:16 UTC (permalink / raw)
  To: Nilesh Javali
  Cc: martin.petersen, lduncan, linux-scsi, GR-QLogic-Storage-Upstream,
	jmeneghi, Kaushal Desai, Greg Kroah-Hartman

On Thu, Jan 25, 2024 at 10:49:56AM +0000, Nilesh Javali wrote:
>
> We have extensively verified this patch set with IOMMU enabled and see
> no regression when VT and SRIOV are enabled. However issues are observed
> only when VT, VT-D and SRIOV are enabled in the HW BIOS.

I don't understand what having IOMMU enabled while VT-d is disabled in
the BIOS settings actually means. Isn't VT-d Intel's name for it's IOMMU
implemenation? Does disabling VT-d support but setting intel_iommu=on
actually do anything?

> In the failure case, with VT-D enabled, we observe the OS fails to boot with
> DMAR timeout error.
> 
> "  **] A start job is running for Network Manager (2min 6s / no limit)
> [ 147.069016] DMAR: VT-d detected Invalidation Time-out Error: SID 0
> [ 147.069016] DMAR: DRHD: handling fault status reg 40
> [ 147.080924] DMAR: QI HEAD: Device-TLB Invalidation qw0 = 0xaf0300100003, qw1 = 0x7ffffffffffff001
> [ 147.090207] DMAR: QI PRIOR: Invalidation Wait qw0 = 0x200000025, qw1 = 0x10005f634".
> 
> With your proposed changes, please confirm if you see no issues with
> VT-D enabled on Intel/AMD platform.

I've just gone back and tested on a R320 with a Xeon E5-2420, this
systems BIOS does not have seperate VT and VT-d setting, but VT-d does
appear to be on when the single virtualization features setting is
enabled.

- A kernel with my v1 patches logs into a target just fine.
- These v3 patches fail.  My configuration is probably different, I
  don't see a network manager online stall but I do see segfaults in
  iscsiuio.
- Adding back the dma_device lines to the v3 patches, keeping the union
  cleanups you did in place, and it goes back to working.

> Also based on our observation the issue with VT-D enabled is not
> related to the current patch set under test.

Yes, Jerry Snitsel noted that the IOMMU code had been clearing the
__GFP_COMP flag for longer than the DMA API has been rejecting it. 
So IOMMU support has had issues for longer, but I think we can fix both
by doing this correctly.

Thanks,
 Chris Leech


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 1/3] uio: introduce UIO_MEM_DMA_COHERENT type
  2024-01-24 23:14   ` Chris Leech
@ 2024-01-25 22:43     ` Greg Kroah-Hartman
  0 siblings, 0 replies; 10+ messages in thread
From: Greg Kroah-Hartman @ 2024-01-25 22:43 UTC (permalink / raw)
  To: Chris Leech
  Cc: Nilesh Javali, martin.petersen, lduncan, linux-scsi,
	GR-QLogic-Storage-Upstream, jmeneghi, Christoph Hellwig

On Wed, Jan 24, 2024 at 03:14:51PM -0800, Chris Leech wrote:
> Hi Greg, I'd like to ask you to reconsider accepting UIO API changes for
> mapping DMA coherent allocations.

Then why wasn't this patch series actually sent to me?  It's a bit hard
to apply something I am not even aware of :(

greg k-h

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2024-01-25 22:43 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-01-09 12:14 [PATCH v3 0/3] UIO_MEM_DMA_COHERENT for cnic/bnx2/bnx2x Nilesh Javali
2024-01-09 12:14 ` [PATCH v3 1/3] uio: introduce UIO_MEM_DMA_COHERENT type Nilesh Javali
2024-01-24 23:14   ` Chris Leech
2024-01-25 22:43     ` Greg Kroah-Hartman
2024-01-24 23:36   ` Chris Leech
2024-01-25 10:49     ` [EXT] " Nilesh Javali
2024-01-25 19:16       ` Chris Leech
2024-01-09 12:14 ` [PATCH v3 2/3] cnic,bnx2,bnx2x: use UIO_MEM_DMA_COHERENT Nilesh Javali
2024-01-09 12:14 ` [PATCH v3 3/3] cnic,bnx2,bnx2x: page align uio mmap allocations Nilesh Javali
2024-01-17  7:59 ` [PATCH v3 0/3] UIO_MEM_DMA_COHERENT for cnic/bnx2/bnx2x Nilesh Javali

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).