linux-rdma.vger.kernel.org archive mirror
* [PATCH v4 1/2] net/mlx4: Get rid of page operation after dma_alloc_coherent
@ 2019-01-03 17:23 Stephen Warren
  2019-01-03 17:23 ` [PATCH v4 2/2] net/mlx4: replace pci_{,un}map_sg with dma_{,un}map_sg Stephen Warren
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Stephen Warren @ 2019-01-03 17:23 UTC (permalink / raw)
  To: Tariq Toukan, xavier.huwei
  Cc: netdev, linux-rdma, Doug Ledford, Jason Gunthorpe,
	Christoph Hellwig, Stephen Warren

From: Stephen Warren <swarren@nvidia.com>

This patch solves a crash at the time of mlx4 driver unload or system
shutdown. The crash occurs because dma_alloc_coherent() returns one
value in mlx4_alloc_icm_coherent(), but a different value is passed to
dma_free_coherent() in mlx4_free_icm_coherent(). In turn this is because
when allocated, that pointer is passed to sg_set_buf() to record it,
then when freed it is re-calculated by calling
lowmem_page_address(sg_page()) which returns a different value. Solve
this by recording the value that dma_alloc_coherent() returns, and
passing this to dma_free_coherent().

This patch is roughly equivalent to commit 378efe798ecf ("RDMA/hns: Get
rid of page operation after dma_alloc_coherent").

Based-on-code-from: Christoph Hellwig <hch@lst.de>
Signed-off-by: Stephen Warren <swarren@nvidia.com>
---
v4 (Jan 3):
- Shortened commit description.
- Use bool not int in struct mlx4_icm_chunk.
- Tariq said "Thanks for your patch. It looks good to me." for v3.
v3 (Dec 19):
- Rework chunk data structure to store all data for coherent allocations
  separately from the sg list. Code from Christoph Hellwig with fixes by
  me. Notes:
  - chunk->coherent is an int not a bool since checkpatch complains about
    using bool in structs; see https://lkml.org/lkml/2017/11/21/384.
  - chunk->coherent is used rather than chunk->table->coherent since the
    table pointer isn't available when creating chunks. This duplicates
    data, but simplifies the patch.
v2:
- Rework mlx4_table_find() to explicitly calculate the returned address
  differently depending on whether the table was allocated using
  dma_alloc_coherent() or alloc_pages(), which in turn allows the
  changes to mlx4_alloc_icm_pages() to be dropped.
- Drop changes to mlx4_alloc/free_icm_pages. This path uses
  pci_map_sg() which can re-write the sg list which in turn would cause
  chunk->mem[] (the sg list) and chunk->buf[] to become inconsistent.
- Enhance commit description.

Note: I've tested this patch in a downstream 4.14 based kernel (using
ibping, ib_read_bw, and ib_write_bw), but can't test it in mainline
since my system isn't supported there yet. I have compile-tested it in
mainline at least, for ARM64.
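
For readers following the description above, here is a minimal userspace
sketch of the rule the fix relies on: keep exactly what the allocator
returned and hand the same values back on free. Everything named model_*
is hypothetical. malloc() stands in for dma_alloc_coherent(), the bus
addresses are made up, and the mismatch check in model_free_coherent()
exists only in this model (the real dma_free_coherent() does not report
a bad pointer this way).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MODEL_MAX_LIVE 8

/* One live allocation as the model allocator tracks it. */
struct model_live {
	void     *addr;
	size_t    size;
	uint64_t  dma_addr;
	int       in_use;
};

static struct model_live model_live_tbl[MODEL_MAX_LIVE];
static uint64_t model_next_dma = 0x1000;

/* Stand-in for dma_alloc_coherent(): returns a CPU address and writes a
 * made-up bus address to *dma_addr. */
static void *model_alloc_coherent(size_t size, uint64_t *dma_addr)
{
	assert(dma_addr != NULL);
	for (int i = 0; i < MODEL_MAX_LIVE; i++) {
		if (model_live_tbl[i].in_use)
			continue;
		void *p = malloc(size);
		if (!p)
			return NULL;
		model_live_tbl[i] =
			(struct model_live){ p, size, model_next_dma, 1 };
		*dma_addr = model_next_dma;
		model_next_dma += size;
		return p;
	}
	return NULL;
}

/* Stand-in for dma_free_coherent(): returns 0 when (size, addr, dma_addr)
 * matches a live allocation, -1 otherwise. */
static int model_free_coherent(size_t size, void *addr, uint64_t dma_addr)
{
	for (int i = 0; i < MODEL_MAX_LIVE; i++) {
		struct model_live *l = &model_live_tbl[i];

		if (l->in_use && l->addr == addr && l->size == size &&
		    l->dma_addr == dma_addr) {
			free(addr);
			l->in_use = 0;
			return 0;
		}
	}
	return -1;
}

/* Same three fields as struct mlx4_icm_buf in the patch. */
struct model_icm_buf {
	void     *addr;
	size_t    size;
	uint64_t  dma_addr;
};

/* Record everything the allocator handed back. */
static int model_buf_alloc(struct model_icm_buf *buf, size_t size)
{
	buf->addr = model_alloc_coherent(size, &buf->dma_addr);
	if (!buf->addr)
		return -1;
	buf->size = size;
	return 0;
}

/* Free with the recorded values, never with a recomputed address. */
static int model_buf_free(struct model_icm_buf *buf)
{
	return model_free_coherent(buf->size, buf->addr, buf->dma_addr);
}
```

In this model, passing any address other than the recorded one to
model_free_coherent() fails, which is the userspace analogue of the
mismatch described above.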
---
 drivers/net/ethernet/mellanox/mlx4/icm.c | 92 ++++++++++++++----------
 drivers/net/ethernet/mellanox/mlx4/icm.h | 22 +++++-
 2 files changed, 75 insertions(+), 39 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/icm.c b/drivers/net/ethernet/mellanox/mlx4/icm.c
index 4b4351141b94..76b84d08a058 100644
--- a/drivers/net/ethernet/mellanox/mlx4/icm.c
+++ b/drivers/net/ethernet/mellanox/mlx4/icm.c
@@ -57,12 +57,12 @@ static void mlx4_free_icm_pages(struct mlx4_dev *dev, struct mlx4_icm_chunk *chu
 	int i;
 
 	if (chunk->nsg > 0)
-		pci_unmap_sg(dev->persist->pdev, chunk->mem, chunk->npages,
+		pci_unmap_sg(dev->persist->pdev, chunk->sg, chunk->npages,
 			     PCI_DMA_BIDIRECTIONAL);
 
 	for (i = 0; i < chunk->npages; ++i)
-		__free_pages(sg_page(&chunk->mem[i]),
-			     get_order(chunk->mem[i].length));
+		__free_pages(sg_page(&chunk->sg[i]),
+			     get_order(chunk->sg[i].length));
 }
 
 static void mlx4_free_icm_coherent(struct mlx4_dev *dev, struct mlx4_icm_chunk *chunk)
@@ -71,9 +71,9 @@ static void mlx4_free_icm_coherent(struct mlx4_dev *dev, struct mlx4_icm_chunk *
 
 	for (i = 0; i < chunk->npages; ++i)
 		dma_free_coherent(&dev->persist->pdev->dev,
-				  chunk->mem[i].length,
-				  lowmem_page_address(sg_page(&chunk->mem[i])),
-				  sg_dma_address(&chunk->mem[i]));
+				  chunk->buf[i].size,
+				  chunk->buf[i].addr,
+				  chunk->buf[i].dma_addr);
 }
 
 void mlx4_free_icm(struct mlx4_dev *dev, struct mlx4_icm *icm, int coherent)
@@ -111,22 +111,21 @@ static int mlx4_alloc_icm_pages(struct scatterlist *mem, int order,
 	return 0;
 }
 
-static int mlx4_alloc_icm_coherent(struct device *dev, struct scatterlist *mem,
-				    int order, gfp_t gfp_mask)
+static int mlx4_alloc_icm_coherent(struct device *dev, struct mlx4_icm_buf *buf,
+				   int order, gfp_t gfp_mask)
 {
-	void *buf = dma_alloc_coherent(dev, PAGE_SIZE << order,
-				       &sg_dma_address(mem), gfp_mask);
-	if (!buf)
+	buf->addr = dma_alloc_coherent(dev, PAGE_SIZE << order,
+				       &buf->dma_addr, gfp_mask);
+	if (!buf->addr)
 		return -ENOMEM;
 
-	if (offset_in_page(buf)) {
-		dma_free_coherent(dev, PAGE_SIZE << order,
-				  buf, sg_dma_address(mem));
+	if (offset_in_page(buf->addr)) {
+		dma_free_coherent(dev, PAGE_SIZE << order, buf->addr,
+				  buf->dma_addr);
 		return -ENOMEM;
 	}
 
-	sg_set_buf(mem, buf, PAGE_SIZE << order);
-	sg_dma_len(mem) = PAGE_SIZE << order;
+	buf->size = PAGE_SIZE << order;
 	return 0;
 }
 
@@ -159,21 +158,21 @@ struct mlx4_icm *mlx4_alloc_icm(struct mlx4_dev *dev, int npages,
 
 	while (npages > 0) {
 		if (!chunk) {
-			chunk = kmalloc_node(sizeof(*chunk),
+			chunk = kzalloc_node(sizeof(*chunk),
 					     gfp_mask & ~(__GFP_HIGHMEM |
 							  __GFP_NOWARN),
 					     dev->numa_node);
 			if (!chunk) {
-				chunk = kmalloc(sizeof(*chunk),
+				chunk = kzalloc(sizeof(*chunk),
 						gfp_mask & ~(__GFP_HIGHMEM |
 							     __GFP_NOWARN));
 				if (!chunk)
 					goto fail;
 			}
+			chunk->coherent = coherent;
 
-			sg_init_table(chunk->mem, MLX4_ICM_CHUNK_LEN);
-			chunk->npages = 0;
-			chunk->nsg    = 0;
+			if (!coherent)
+				sg_init_table(chunk->sg, MLX4_ICM_CHUNK_LEN);
 			list_add_tail(&chunk->list, &icm->chunk_list);
 		}
 
@@ -186,10 +185,10 @@ struct mlx4_icm *mlx4_alloc_icm(struct mlx4_dev *dev, int npages,
 
 		if (coherent)
 			ret = mlx4_alloc_icm_coherent(&dev->persist->pdev->dev,
-						      &chunk->mem[chunk->npages],
-						      cur_order, mask);
+						&chunk->buf[chunk->npages],
+						cur_order, mask);
 		else
-			ret = mlx4_alloc_icm_pages(&chunk->mem[chunk->npages],
+			ret = mlx4_alloc_icm_pages(&chunk->sg[chunk->npages],
 						   cur_order, mask,
 						   dev->numa_node);
 
@@ -205,7 +204,7 @@ struct mlx4_icm *mlx4_alloc_icm(struct mlx4_dev *dev, int npages,
 		if (coherent)
 			++chunk->nsg;
 		else if (chunk->npages == MLX4_ICM_CHUNK_LEN) {
-			chunk->nsg = pci_map_sg(dev->persist->pdev, chunk->mem,
+			chunk->nsg = pci_map_sg(dev->persist->pdev, chunk->sg,
 						chunk->npages,
 						PCI_DMA_BIDIRECTIONAL);
 
@@ -220,7 +219,7 @@ struct mlx4_icm *mlx4_alloc_icm(struct mlx4_dev *dev, int npages,
 	}
 
 	if (!coherent && chunk) {
-		chunk->nsg = pci_map_sg(dev->persist->pdev, chunk->mem,
+		chunk->nsg = pci_map_sg(dev->persist->pdev, chunk->sg,
 					chunk->npages,
 					PCI_DMA_BIDIRECTIONAL);
 
@@ -320,7 +319,7 @@ void *mlx4_table_find(struct mlx4_icm_table *table, u32 obj,
 	u64 idx;
 	struct mlx4_icm_chunk *chunk;
 	struct mlx4_icm *icm;
-	struct page *page = NULL;
+	void *addr = NULL;
 
 	if (!table->lowmem)
 		return NULL;
@@ -336,28 +335,49 @@ void *mlx4_table_find(struct mlx4_icm_table *table, u32 obj,
 
 	list_for_each_entry(chunk, &icm->chunk_list, list) {
 		for (i = 0; i < chunk->npages; ++i) {
+			dma_addr_t dma_addr;
+			size_t len;
+
+			if (table->coherent) {
+				len = chunk->buf[i].size;
+				dma_addr = chunk->buf[i].dma_addr;
+				addr = chunk->buf[i].addr;
+			} else {
+				struct page *page;
+
+				len = sg_dma_len(&chunk->sg[i]);
+				dma_addr = sg_dma_address(&chunk->sg[i]);
+
+				/* XXX: we should never do this for highmem
+				 * allocation.  This function either needs
+				 * to be split, or the kernel virtual address
+				 * return needs to be made optional.
+				 */
+				page = sg_page(&chunk->sg[i]);
+				addr = lowmem_page_address(page);
+			}
+
 			if (dma_handle && dma_offset >= 0) {
-				if (sg_dma_len(&chunk->mem[i]) > dma_offset)
-					*dma_handle = sg_dma_address(&chunk->mem[i]) +
-						dma_offset;
-				dma_offset -= sg_dma_len(&chunk->mem[i]);
+				if (len > dma_offset)
+					*dma_handle = dma_addr + dma_offset;
+				dma_offset -= len;
 			}
+
 			/*
 			 * DMA mapping can merge pages but not split them,
 			 * so if we found the page, dma_handle has already
 			 * been assigned to.
 			 */
-			if (chunk->mem[i].length > offset) {
-				page = sg_page(&chunk->mem[i]);
+			if (len > offset)
 				goto out;
-			}
-			offset -= chunk->mem[i].length;
+			offset -= len;
 		}
 	}
 
+	addr = NULL;
 out:
 	mutex_unlock(&table->mutex);
-	return page ? lowmem_page_address(page) + offset : NULL;
+	return addr ? addr + offset : NULL;
 }
 
 int mlx4_table_get_range(struct mlx4_dev *dev, struct mlx4_icm_table *table,
diff --git a/drivers/net/ethernet/mellanox/mlx4/icm.h b/drivers/net/ethernet/mellanox/mlx4/icm.h
index c9169a490557..d199874b1c07 100644
--- a/drivers/net/ethernet/mellanox/mlx4/icm.h
+++ b/drivers/net/ethernet/mellanox/mlx4/icm.h
@@ -47,11 +47,21 @@ enum {
 	MLX4_ICM_PAGE_SIZE	= 1 << MLX4_ICM_PAGE_SHIFT,
 };
 
+struct mlx4_icm_buf {
+	void			*addr;
+	size_t			size;
+	dma_addr_t		dma_addr;
+};
+
 struct mlx4_icm_chunk {
 	struct list_head	list;
 	int			npages;
 	int			nsg;
-	struct scatterlist	mem[MLX4_ICM_CHUNK_LEN];
+	bool			coherent;
+	union {
+		struct scatterlist	sg[MLX4_ICM_CHUNK_LEN];
+		struct mlx4_icm_buf	buf[MLX4_ICM_CHUNK_LEN];
+	};
 };
 
 struct mlx4_icm {
@@ -114,12 +124,18 @@ static inline void mlx4_icm_next(struct mlx4_icm_iter *iter)
 
 static inline dma_addr_t mlx4_icm_addr(struct mlx4_icm_iter *iter)
 {
-	return sg_dma_address(&iter->chunk->mem[iter->page_idx]);
+	if (iter->chunk->coherent)
+		return iter->chunk->buf[iter->page_idx].dma_addr;
+	else
+		return sg_dma_address(&iter->chunk->sg[iter->page_idx]);
 }
 
 static inline unsigned long mlx4_icm_size(struct mlx4_icm_iter *iter)
 {
-	return sg_dma_len(&iter->chunk->mem[iter->page_idx]);
+	if (iter->chunk->coherent)
+		return iter->chunk->buf[iter->page_idx].size;
+	else
+		return sg_dma_len(&iter->chunk->sg[iter->page_idx]);
 }
 
 int mlx4_MAP_ICM_AUX(struct mlx4_dev *dev, struct mlx4_icm *icm);
-- 
2.20.1
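
The chunk layout introduced in icm.h above is a tagged union: the
coherent flag says which array is valid, and every accessor has to check
it first. A rough userspace sketch of that pattern follows. The model_*
types are simplified stand-ins (in particular, model_sg is not the real
struct scatterlist), and the lookup uses a single offset where
mlx4_table_find() tracks the CPU offset and the DMA offset separately.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_CHUNK_LEN 4

/* Simplified stand-in for a mapped scatterlist entry. */
struct model_sg {
	uint64_t     dma_address;
	unsigned int dma_length;
};

/* Same fields as struct mlx4_icm_buf in the patch. */
struct model_buf {
	void     *addr;
	size_t    size;
	uint64_t  dma_addr;
};

struct model_chunk {
	int  npages;
	bool coherent;	/* selects which union member is valid */
	union {		/* anonymous union: needs C11 or GNU C */
		struct model_sg  sg[MODEL_CHUNK_LEN];
		struct model_buf buf[MODEL_CHUNK_LEN];
	};
};

/* Zero the whole chunk first, as kzalloc() does in the patch, so npages
 * and the unused union member start out in a known state. */
static void model_chunk_init(struct model_chunk *c, bool coherent)
{
	memset(c, 0, sizeof(*c));
	c->coherent = coherent;
}

/* Mirrors mlx4_icm_addr(): dispatch on the tag. */
static uint64_t model_chunk_addr(const struct model_chunk *c, int i)
{
	if (c->coherent)
		return c->buf[i].dma_addr;
	return c->sg[i].dma_address;
}

/* Mirrors mlx4_icm_size(): dispatch on the tag. */
static size_t model_chunk_size(const struct model_chunk *c, int i)
{
	if (c->coherent)
		return c->buf[i].size;
	return c->sg[i].dma_length;
}

/* Find the entry containing byte 'offset'. Returns 0 and stores the bus
 * address of that byte in *dma, or returns -1 if offset is out of range. */
static int model_chunk_find(const struct model_chunk *c, size_t offset,
			    uint64_t *dma)
{
	for (int i = 0; i < c->npages; i++) {
		size_t len = model_chunk_size(c, i);

		if (len > offset) {
			*dma = model_chunk_addr(c, i) + offset;
			return 0;
		}
		offset -= len;
	}
	return -1;
}
```

Keeping the flag in the chunk itself means an iterator that only holds a
chunk pointer can still pick the right union member, which matches the
v3 note about chunk->coherent versus chunk->table->coherent.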


* [PATCH v4 2/2] net/mlx4: replace pci_{,un}map_sg with dma_{,un}map_sg
  2019-01-03 17:23 [PATCH v4 1/2] net/mlx4: Get rid of page operation after dma_alloc_coherent Stephen Warren
@ 2019-01-03 17:23 ` Stephen Warren
  2019-01-06  8:29   ` Tariq Toukan
  2019-01-07 15:10   ` David Miller
  2019-01-04 21:42 ` [PATCH v4 1/2] net/mlx4: Get rid of page operation after dma_alloc_coherent David Miller
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 8+ messages in thread
From: Stephen Warren @ 2019-01-03 17:23 UTC (permalink / raw)
  To: Tariq Toukan, xavier.huwei
  Cc: netdev, linux-rdma, Doug Ledford, Jason Gunthorpe,
	Christoph Hellwig, Stephen Warren

From: Stephen Warren <swarren@nvidia.com>

pci_{,un}map_sg are deprecated and replaced by dma_{,un}map_sg. This is
especially relevant since the rest of the driver uses the DMA API. Fix
the driver to use the replacement APIs.

Signed-off-by: Stephen Warren <swarren@nvidia.com>
---
v4: New patch.
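
As a sketch of why this conversion should not change behaviour: as far
as I recall, the pci_* helpers were thin wrappers that took the struct
device embedded in the pci_dev and cast the direction value, with
PCI_DMA_BIDIRECTIONAL and DMA_BIDIRECTIONAL both equal to 0. The
userspace model below only illustrates that shape. All model_* names are
hypothetical stubs, not kernel definitions, and the stub mapping function
just records its arguments.

```c
#include <assert.h>
#include <stddef.h>

/* Stub types standing in for the kernel structures. */
struct model_device      { int id; };
struct model_pci_dev     { struct model_device dev; };
struct model_scatterlist { int unused; };

enum model_dma_data_direction {
	MODEL_DMA_BIDIRECTIONAL = 0,
	MODEL_DMA_TO_DEVICE     = 1,
	MODEL_DMA_FROM_DEVICE   = 2,
	MODEL_DMA_NONE          = 3,
};

#define MODEL_PCI_DMA_BIDIRECTIONAL 0

/* What the stub mapping call last saw, for inspection. */
static struct model_device *model_last_dev;
static enum model_dma_data_direction model_last_dir;

/* Stub for the generic DMA API call: records its arguments and reports
 * every entry as mapped. */
static int model_dma_map_sg(struct model_device *dev,
			    struct model_scatterlist *sg, int nents,
			    enum model_dma_data_direction dir)
{
	(void)sg;
	model_last_dev = dev;
	model_last_dir = dir;
	return nents;
}

/* Stub for the deprecated PCI wrapper: forwards to the generic call with
 * the embedded device and the direction cast to the enum. */
static int model_pci_map_sg(struct model_pci_dev *hwdev,
			    struct model_scatterlist *sg, int nents,
			    int direction)
{
	assert(hwdev != NULL);
	return model_dma_map_sg(&hwdev->dev, sg, nents,
				(enum model_dma_data_direction)direction);
}
```

Under that assumption, each call site changes in two ways only: the
first argument becomes &dev->persist->pdev->dev instead of
dev->persist->pdev, and the direction constant loses its PCI_ prefix.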
---
 drivers/net/ethernet/mellanox/mlx4/icm.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/icm.c b/drivers/net/ethernet/mellanox/mlx4/icm.c
index 76b84d08a058..d89a3da89e5a 100644
--- a/drivers/net/ethernet/mellanox/mlx4/icm.c
+++ b/drivers/net/ethernet/mellanox/mlx4/icm.c
@@ -57,8 +57,8 @@ static void mlx4_free_icm_pages(struct mlx4_dev *dev, struct mlx4_icm_chunk *chu
 	int i;
 
 	if (chunk->nsg > 0)
-		pci_unmap_sg(dev->persist->pdev, chunk->sg, chunk->npages,
-			     PCI_DMA_BIDIRECTIONAL);
+		dma_unmap_sg(&dev->persist->pdev->dev, chunk->sg, chunk->npages,
+			     DMA_BIDIRECTIONAL);
 
 	for (i = 0; i < chunk->npages; ++i)
 		__free_pages(sg_page(&chunk->sg[i]),
@@ -204,9 +204,9 @@ struct mlx4_icm *mlx4_alloc_icm(struct mlx4_dev *dev, int npages,
 		if (coherent)
 			++chunk->nsg;
 		else if (chunk->npages == MLX4_ICM_CHUNK_LEN) {
-			chunk->nsg = pci_map_sg(dev->persist->pdev, chunk->sg,
-						chunk->npages,
-						PCI_DMA_BIDIRECTIONAL);
+			chunk->nsg = dma_map_sg(&dev->persist->pdev->dev,
+						chunk->sg, chunk->npages,
+						DMA_BIDIRECTIONAL);
 
 			if (chunk->nsg <= 0)
 				goto fail;
@@ -219,9 +219,8 @@ struct mlx4_icm *mlx4_alloc_icm(struct mlx4_dev *dev, int npages,
 	}
 
 	if (!coherent && chunk) {
-		chunk->nsg = pci_map_sg(dev->persist->pdev, chunk->sg,
-					chunk->npages,
-					PCI_DMA_BIDIRECTIONAL);
+		chunk->nsg = dma_map_sg(&dev->persist->pdev->dev, chunk->sg,
+					chunk->npages, DMA_BIDIRECTIONAL);
 
 		if (chunk->nsg <= 0)
 			goto fail;
-- 
2.20.1


* Re: [PATCH v4 1/2] net/mlx4: Get rid of page operation after dma_alloc_coherent
  2019-01-03 17:23 [PATCH v4 1/2] net/mlx4: Get rid of page operation after dma_alloc_coherent Stephen Warren
  2019-01-03 17:23 ` [PATCH v4 2/2] net/mlx4: replace pci_{,un}map_sg with dma_{,un}map_sg Stephen Warren
@ 2019-01-04 21:42 ` David Miller
  2019-01-06  8:26   ` Tariq Toukan
  2019-01-06  8:30 ` Tariq Toukan
  2019-01-07 15:10 ` David Miller
  3 siblings, 1 reply; 8+ messages in thread
From: David Miller @ 2019-01-04 21:42 UTC (permalink / raw)
  To: swarren
  Cc: tariqt, xavier.huwei, netdev, linux-rdma, dledford, jgg, hch, swarren


Mellanox folks, these two patches could use some review.

Thank you.


* Re: [PATCH v4 1/2] net/mlx4: Get rid of page operation after dma_alloc_coherent
  2019-01-04 21:42 ` [PATCH v4 1/2] net/mlx4: Get rid of page operation after dma_alloc_coherent David Miller
@ 2019-01-06  8:26   ` Tariq Toukan
  0 siblings, 0 replies; 8+ messages in thread
From: Tariq Toukan @ 2019-01-06  8:26 UTC (permalink / raw)
  To: David Miller, swarren
  Cc: Tariq Toukan, xavier.huwei, netdev, linux-rdma, dledford,
	Jason Gunthorpe, hch, swarren



On 1/4/2019 11:42 PM, David Miller wrote:
> 
> Mellanox folks, these two patches could use some review.
> 
> Thank you.
> 

Sure. They were sent on our weekend. Reviewing now.


* Re: [PATCH v4 2/2] net/mlx4: replace pci_{,un}map_sg with dma_{,un}map_sg
  2019-01-03 17:23 ` [PATCH v4 2/2] net/mlx4: replace pci_{,un}map_sg with dma_{,un}map_sg Stephen Warren
@ 2019-01-06  8:29   ` Tariq Toukan
  2019-01-07 15:10   ` David Miller
  1 sibling, 0 replies; 8+ messages in thread
From: Tariq Toukan @ 2019-01-06  8:29 UTC (permalink / raw)
  To: Stephen Warren, Tariq Toukan, xavier.huwei
  Cc: netdev, linux-rdma, Doug Ledford, Jason Gunthorpe,
	Christoph Hellwig, Stephen Warren



On 1/3/2019 7:23 PM, Stephen Warren wrote:
> From: Stephen Warren <swarren@nvidia.com>
> 
> pci_{,un}map_sg are deprecated and replaced by dma_{,un}map_sg. This is
> especially relevant since the rest of the driver uses the DMA API. Fix
> the driver to use the replacement APIs.
> 
> Signed-off-by: Stephen Warren <swarren@nvidia.com>
> ---
> v4: New patch.
> ---
>   drivers/net/ethernet/mellanox/mlx4/icm.c | 15 +++++++--------
>   1 file changed, 7 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx4/icm.c b/drivers/net/ethernet/mellanox/mlx4/icm.c
> index 76b84d08a058..d89a3da89e5a 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/icm.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/icm.c
> @@ -57,8 +57,8 @@ static void mlx4_free_icm_pages(struct mlx4_dev *dev, struct mlx4_icm_chunk *chu
>   	int i;
>   
>   	if (chunk->nsg > 0)
> -		pci_unmap_sg(dev->persist->pdev, chunk->sg, chunk->npages,
> -			     PCI_DMA_BIDIRECTIONAL);
> +		dma_unmap_sg(&dev->persist->pdev->dev, chunk->sg, chunk->npages,
> +			     DMA_BIDIRECTIONAL);
>   
>   	for (i = 0; i < chunk->npages; ++i)
>   		__free_pages(sg_page(&chunk->sg[i]),
> @@ -204,9 +204,9 @@ struct mlx4_icm *mlx4_alloc_icm(struct mlx4_dev *dev, int npages,
>   		if (coherent)
>   			++chunk->nsg;
>   		else if (chunk->npages == MLX4_ICM_CHUNK_LEN) {
> -			chunk->nsg = pci_map_sg(dev->persist->pdev, chunk->sg,
> -						chunk->npages,
> -						PCI_DMA_BIDIRECTIONAL);
> +			chunk->nsg = dma_map_sg(&dev->persist->pdev->dev,
> +						chunk->sg, chunk->npages,
> +						DMA_BIDIRECTIONAL);
>   
>   			if (chunk->nsg <= 0)
>   				goto fail;
> @@ -219,9 +219,8 @@ struct mlx4_icm *mlx4_alloc_icm(struct mlx4_dev *dev, int npages,
>   	}
>   
>   	if (!coherent && chunk) {
> -		chunk->nsg = pci_map_sg(dev->persist->pdev, chunk->sg,
> -					chunk->npages,
> -					PCI_DMA_BIDIRECTIONAL);
> +		chunk->nsg = dma_map_sg(&dev->persist->pdev->dev, chunk->sg,
> +					chunk->npages, DMA_BIDIRECTIONAL);
>   
>   		if (chunk->nsg <= 0)
>   			goto fail;
> 

Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Thanks.


* Re: [PATCH v4 1/2] net/mlx4: Get rid of page operation after dma_alloc_coherent
  2019-01-03 17:23 [PATCH v4 1/2] net/mlx4: Get rid of page operation after dma_alloc_coherent Stephen Warren
  2019-01-03 17:23 ` [PATCH v4 2/2] net/mlx4: replace pci_{,un}map_sg with dma_{,un}map_sg Stephen Warren
  2019-01-04 21:42 ` [PATCH v4 1/2] net/mlx4: Get rid of page operation after dma_alloc_coherent David Miller
@ 2019-01-06  8:30 ` Tariq Toukan
  2019-01-07 15:10 ` David Miller
  3 siblings, 0 replies; 8+ messages in thread
From: Tariq Toukan @ 2019-01-06  8:30 UTC (permalink / raw)
  To: Stephen Warren, Tariq Toukan, xavier.huwei
  Cc: netdev, linux-rdma, Doug Ledford, Jason Gunthorpe,
	Christoph Hellwig, Stephen Warren



On 1/3/2019 7:23 PM, Stephen Warren wrote:
> From: Stephen Warren <swarren@nvidia.com>
> 
> This patch solves a crash at the time of mlx4 driver unload or system
> shutdown. The crash occurs because dma_alloc_coherent() returns one
> value in mlx4_alloc_icm_coherent(), but a different value is passed to
> dma_free_coherent() in mlx4_free_icm_coherent(). In turn this is because
> when allocated, that pointer is passed to sg_set_buf() to record it,
> then when freed it is re-calculated by calling
> lowmem_page_address(sg_page()) which returns a different value. Solve
> this by recording the value that dma_alloc_coherent() returns, and
> passing this to dma_free_coherent().
> 
> This patch is roughly equivalent to commit 378efe798ecf ("RDMA/hns: Get
> rid of page operation after dma_alloc_coherent").
> 
> Based-on-code-from: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Stephen Warren <swarren@nvidia.com>
> ---
> v4 (Jan 3):
> - Shortened commit description.
> - Use bool not int in struct mlx4_icm_chunk.
> - Tariq said "Thanks for your patch. It looks good to me." for v3.
> v3 (Dec 19):
> - Rework chunk data structure to store all data for coherent allocations
>    separately from the sg list. Code from Christoph Hellwig with fixes by
>    me. Notes:
>    - chunk->coherent is an int not a bool since checkpatch complains about
>      using bool in structs; see https://lkml.org/lkml/2017/11/21/384.
>    - chunk->coherent is used rather than chunk->table->coherent since the
>      table pointer isn't available when creating chunks. This duplicates
>      data, but simplifies the patch.
> v2:
> - Rework mlx4_table_find() to explicitly calculate the returned address
>    differently depending on whether the table was allocated using
>    dma_alloc_coherent() or alloc_pages(), which in turn allows the
>    changes to mlx4_alloc_icm_pages() to be dropped.
> - Drop changes to mlx4_alloc/free_icm_pages. This path uses
>    pci_map_sg() which can re-write the sg list which in turn would cause
>    chunk->mem[] (the sg list) and chunk->buf[] to become inconsistent.
> - Enhance commit description.
> 
> Note: I've tested this patch in a downstream 4.14 based kernel (using
> ibping, ib_read_bw, and ib_write_bw), but can't test it in mainline
> since my system isn't supported there yet. I have compile-tested it in
> mainline at least, for ARM64.
> ---
>   drivers/net/ethernet/mellanox/mlx4/icm.c | 92 ++++++++++++++----------
>   drivers/net/ethernet/mellanox/mlx4/icm.h | 22 +++++-
>   2 files changed, 75 insertions(+), 39 deletions(-)
> 

Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Thanks.


* Re: [PATCH v4 1/2] net/mlx4: Get rid of page operation after dma_alloc_coherent
  2019-01-03 17:23 [PATCH v4 1/2] net/mlx4: Get rid of page operation after dma_alloc_coherent Stephen Warren
                   ` (2 preceding siblings ...)
  2019-01-06  8:30 ` Tariq Toukan
@ 2019-01-07 15:10 ` David Miller
  3 siblings, 0 replies; 8+ messages in thread
From: David Miller @ 2019-01-07 15:10 UTC (permalink / raw)
  To: swarren
  Cc: tariqt, xavier.huwei, netdev, linux-rdma, dledford, jgg, hch, swarren

From: Stephen Warren <swarren@wwwdotorg.org>
Date: Thu,  3 Jan 2019 10:23:23 -0700

> From: Stephen Warren <swarren@nvidia.com>
> 
> This patch solves a crash at the time of mlx4 driver unload or system
> shutdown. The crash occurs because dma_alloc_coherent() returns one
> value in mlx4_alloc_icm_coherent(), but a different value is passed to
> dma_free_coherent() in mlx4_free_icm_coherent(). In turn this is because
> when allocated, that pointer is passed to sg_set_buf() to record it,
> then when freed it is re-calculated by calling
> lowmem_page_address(sg_page()) which returns a different value. Solve
> this by recording the value that dma_alloc_coherent() returns, and
> passing this to dma_free_coherent().
> 
> This patch is roughly equivalent to commit 378efe798ecf ("RDMA/hns: Get
> rid of page operation after dma_alloc_coherent").
> 
> Based-on-code-from: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Stephen Warren <swarren@nvidia.com>

Applied.


* Re: [PATCH v4 2/2] net/mlx4: replace pci_{,un}map_sg with dma_{,un}map_sg
  2019-01-03 17:23 ` [PATCH v4 2/2] net/mlx4: replace pci_{,un}map_sg with dma_{,un}map_sg Stephen Warren
  2019-01-06  8:29   ` Tariq Toukan
@ 2019-01-07 15:10   ` David Miller
  1 sibling, 0 replies; 8+ messages in thread
From: David Miller @ 2019-01-07 15:10 UTC (permalink / raw)
  To: swarren
  Cc: tariqt, xavier.huwei, netdev, linux-rdma, dledford, jgg, hch, swarren

From: Stephen Warren <swarren@wwwdotorg.org>
Date: Thu,  3 Jan 2019 10:23:24 -0700

> From: Stephen Warren <swarren@nvidia.com>
> 
> pci_{,un}map_sg are deprecated and replaced by dma_{,un}map_sg. This is
> especially relevant since the rest of the driver uses the DMA API. Fix
> the driver to use the replacement APIs.
> 
> Signed-off-by: Stephen Warren <swarren@nvidia.com>

Applied.
