* [PATCH V2 rdma-rc 0/3] RDMA/hns: Bug fixes in hns RoCE driver
@ 2017-11-28  7:10 Wei Hu (Xavier)
  2017-11-28  7:10 ` [PATCH V2 rdma-rc 1/3] RDMA/hns: Fix the issue of IOVA not page continuous in hip08 Wei Hu (Xavier)
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Wei Hu (Xavier) @ 2017-11-28  7:10 UTC (permalink / raw)
  To: dledford, jgg
  Cc: linux-rdma, linuxarm, linux-kernel, shaobo.xu, xavier.huwei,
	zhangxiping3

This patch set contains three patches that fix memory-related
issues. One fixes a DMA operation failure when the SMMU is enabled;
the other two fix incorrect usage of the DMA API that may cause
coherency problems.

Wei Hu (Xavier) (3):
  RDMA/hns: Fix the issue of IOVA not page continuous in hip08
  RDMA/hns: Get rid of virt_to_page and vmap calls after
    dma_alloc_coherent
  RDMA/hns: Get rid of page operation after dma_alloc_coherent

 drivers/infiniband/hw/hns/hns_roce_alloc.c  | 23 -----------------------
 drivers/infiniband/hw/hns/hns_roce_device.h |  4 +---
 drivers/infiniband/hw/hns/hns_roce_hem.c    | 25 +++++++++++++------------
 drivers/infiniband/hw/hns/hns_roce_hem.h    |  1 +
 drivers/infiniband/hw/hns/hns_roce_hw_v2.c  | 22 +++++++++++++++-------
 5 files changed, 30 insertions(+), 45 deletions(-)

-- 
1.9.1


* [PATCH V2 rdma-rc 1/3] RDMA/hns: Fix the issue of IOVA not page continuous in hip08
  2017-11-28  7:10 [PATCH V2 rdma-rc 0/3] RDMA/hns: Bug fixes in hns RoCE driver Wei Hu (Xavier)
@ 2017-11-28  7:10 ` Wei Hu (Xavier)
  2017-11-28  7:10 ` [PATCH V2 rdma-rc 2/3] RDMA/hns: Get rid of virt_to_page and vmap calls after dma_alloc_coherent Wei Hu (Xavier)
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Wei Hu (Xavier) @ 2017-11-28  7:10 UTC (permalink / raw)
  To: dledford, jgg
  Cc: linux-rdma, linuxarm, linux-kernel, shaobo.xu, xavier.huwei,
	zhangxiping3

If the SMMU is enabled, the length of an sg entry obtained from
__iommu_map_sg_attrs() is not necessarily 4KB, because contiguous
pages may be merged into a single entry. When the IOVA is taken only
from the sg dma address, the recorded IOVAs are not page contiguous,
so the current MTPT configuration is wrong and can cause DMA
operation failures. Fix this by deriving the per-page IOVAs from the
sg length.
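
For illustration, a minimal sketch of the intended scatterlist walk
(function and variable names here are illustrative, not the driver's
own code):

#include <linux/scatterlist.h>
#include <linux/types.h>

/*
 * Sketch only: behind an IOMMU/SMMU a single scatterlist entry can
 * cover several pages, so per-page addresses must be derived from
 * sg_dma_address() plus an offset computed from sg_dma_len(), rather
 * than assuming one page per entry.
 */
static void record_page_addrs(struct scatterlist *sgl, int nents,
                              unsigned int page_shift,
                              u64 *pages, int max_pages)
{
        struct scatterlist *sg;
        int entry, i = 0, j;

        for_each_sg(sgl, sg, nents, entry) {
                int npages = sg_dma_len(sg) >> page_shift;

                for (j = 0; j < npages; j++) {
                        if (i >= max_pages)
                                return;
                        /* address of the j-th page inside this entry */
                        pages[i++] = sg_dma_address(sg) +
                                     ((u64)j << page_shift);
                }
        }
}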

Fixes: 3958cc5 ("RDMA/hns: Configure the MTPT in hip08")
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Shaobo Xu <xushaobo2@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Xiping Zhang (Francis) <zhangxiping3@huawei.com>
---
changelog:
 v1 -> v2: Revised the commit message and added the Fixes line per
    Jason's comment. Related link: https://lkml.org/lkml/2017/11/27/841
---
 drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 22 +++++++++++++++-------
 1 file changed, 15 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
index 04281d0..4d3e976 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
@@ -1131,9 +1131,11 @@ static int hns_roce_v2_write_mtpt(void *mb_buf, struct hns_roce_mr *mr,
 {
 	struct hns_roce_v2_mpt_entry *mpt_entry;
 	struct scatterlist *sg;
+	u64 page_addr;
 	u64 *pages;
+	int i, j;
+	int len;
 	int entry;
-	int i;
 
 	mpt_entry = mb_buf;
 	memset(mpt_entry, 0, sizeof(*mpt_entry));
@@ -1191,14 +1193,20 @@ static int hns_roce_v2_write_mtpt(void *mb_buf, struct hns_roce_mr *mr,
 
 	i = 0;
 	for_each_sg(mr->umem->sg_head.sgl, sg, mr->umem->nmap, entry) {
-		pages[i] = ((u64)sg_dma_address(sg)) >> 6;
-
-		/* Record the first 2 entry directly to MTPT table */
-		if (i >= HNS_ROCE_V2_MAX_INNER_MTPT_NUM - 1)
-			break;
-		i++;
+		len = sg_dma_len(sg) >> PAGE_SHIFT;
+		for (j = 0; j < len; ++j) {
+			page_addr = sg_dma_address(sg) +
+				    (j << mr->umem->page_shift);
+			pages[i] = page_addr >> 6;
+
+			/* Record the first 2 entry directly to MTPT table */
+			if (i >= HNS_ROCE_V2_MAX_INNER_MTPT_NUM - 1)
+				goto found;
+			i++;
+		}
 	}
 
+found:
 	mpt_entry->pa0_l = cpu_to_le32(lower_32_bits(pages[0]));
 	roce_set_field(mpt_entry->byte_56_pa0_h, V2_MPT_BYTE_56_PA0_H_M,
 		       V2_MPT_BYTE_56_PA0_H_S,
-- 
1.9.1


* [PATCH V2 rdma-rc 2/3] RDMA/hns: Get rid of virt_to_page and vmap calls after dma_alloc_coherent
  2017-11-28  7:10 [PATCH V2 rdma-rc 0/3] RDMA/hns: Bug fixes in hns RoCE driver Wei Hu (Xavier)
  2017-11-28  7:10 ` [PATCH V2 rdma-rc 1/3] RDMA/hns: Fix the issue of IOVA not page continuous in hip08 Wei Hu (Xavier)
@ 2017-11-28  7:10 ` Wei Hu (Xavier)
  2017-11-28  7:10 ` [PATCH V2 rdma-rc 3/3] RDMA/hns: Get rid of page operation " Wei Hu (Xavier)
  2017-12-02  0:04 ` [PATCH V2 rdma-rc 0/3] RDMA/hns: Bug fixes in hns RoCE driver Jason Gunthorpe
  3 siblings, 0 replies; 5+ messages in thread
From: Wei Hu (Xavier) @ 2017-11-28  7:10 UTC (permalink / raw)
  To: dledford, jgg
  Cc: linux-rdma, linuxarm, linux-kernel, shaobo.xu, xavier.huwei,
	zhangxiping3

In general dma_alloc_coherent() returns a CPU virtual address and
a DMA address, and we have no guarantee that the virtual address
is in the linear map or in vmalloc space rather than some other
special place. We also have no guarantee that the underlying memory
even has an associated struct page at all.

The current code uses the incorrect pattern
dma_alloc_coherent() + virt_to_page() + vmap(), which can introduce
coherency problems. This patch gets rid of the virt_to_page() and
vmap() calls, at Leon's suggestion. Related link:
https://lkml.org/lkml/2017/11/7/34
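
A minimal sketch of the usage difference (illustrative names, not the
driver's own code):

#include <linux/dma-mapping.h>
#include <linux/gfp.h>

/*
 * Sketch only: dma_alloc_coherent() already hands back a usable CPU
 * mapping, so that pointer should be used directly.  Rebuilding a
 * mapping via virt_to_page() + vmap() assumes the buffer has struct
 * pages behind it and can break the coherency guarantees.
 */
static void *coherent_buf_alloc(struct device *dev, size_t size,
                                dma_addr_t *dma_handle)
{
        /* correct: keep and use the VA returned here */
        return dma_alloc_coherent(dev, size, dma_handle, GFP_KERNEL);

        /*
         * incorrect (the pattern this patch removes), roughly:
         *   page = virt_to_page(cpu_addr);
         *   vaddr = vmap(&page, 1, VM_MAP, PAGE_KERNEL);
         */
}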

Fixes: 9a44353 ("IB/hns: Add driver files for hns RoCE driver")
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Shaobo Xu <xushaobo2@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Xiping Zhang (Francis) <zhangxiping3@huawei.com>
---
changelog:
 v1 -> v2: Revised the commit message and added the Fixes line per
    Jason's comment. Related link: https://lkml.org/lkml/2017/11/27/841
---
 drivers/infiniband/hw/hns/hns_roce_alloc.c  | 23 -----------------------
 drivers/infiniband/hw/hns/hns_roce_device.h |  4 +---
 2 files changed, 1 insertion(+), 26 deletions(-)

diff --git a/drivers/infiniband/hw/hns/hns_roce_alloc.c b/drivers/infiniband/hw/hns/hns_roce_alloc.c
index 3e4c525..a40ec93 100644
--- a/drivers/infiniband/hw/hns/hns_roce_alloc.c
+++ b/drivers/infiniband/hw/hns/hns_roce_alloc.c
@@ -162,14 +162,10 @@ void hns_roce_buf_free(struct hns_roce_dev *hr_dev, u32 size,
 {
 	int i;
 	struct device *dev = hr_dev->dev;
-	u32 bits_per_long = BITS_PER_LONG;
 
 	if (buf->nbufs == 1) {
 		dma_free_coherent(dev, size, buf->direct.buf, buf->direct.map);
 	} else {
-		if (bits_per_long == 64 && buf->page_shift == PAGE_SHIFT)
-			vunmap(buf->direct.buf);
-
 		for (i = 0; i < buf->nbufs; ++i)
 			if (buf->page_list[i].buf)
 				dma_free_coherent(dev, 1 << buf->page_shift,
@@ -185,9 +181,7 @@ int hns_roce_buf_alloc(struct hns_roce_dev *hr_dev, u32 size, u32 max_direct,
 {
 	int i = 0;
 	dma_addr_t t;
-	struct page **pages;
 	struct device *dev = hr_dev->dev;
-	u32 bits_per_long = BITS_PER_LONG;
 	u32 page_size = 1 << page_shift;
 	u32 order;
 
@@ -236,23 +230,6 @@ int hns_roce_buf_alloc(struct hns_roce_dev *hr_dev, u32 size, u32 max_direct,
 			buf->page_list[i].map = t;
 			memset(buf->page_list[i].buf, 0, page_size);
 		}
-		if (bits_per_long == 64 && page_shift == PAGE_SHIFT) {
-			pages = kmalloc_array(buf->nbufs, sizeof(*pages),
-					      GFP_KERNEL);
-			if (!pages)
-				goto err_free;
-
-			for (i = 0; i < buf->nbufs; ++i)
-				pages[i] = virt_to_page(buf->page_list[i].buf);
-
-			buf->direct.buf = vmap(pages, buf->nbufs, VM_MAP,
-					       PAGE_KERNEL);
-			kfree(pages);
-			if (!buf->direct.buf)
-				goto err_free;
-		} else {
-			buf->direct.buf = NULL;
-		}
 	}
 
 	return 0;
diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h
index dde5178..dcfd209 100644
--- a/drivers/infiniband/hw/hns/hns_roce_device.h
+++ b/drivers/infiniband/hw/hns/hns_roce_device.h
@@ -795,11 +795,9 @@ static inline void hns_roce_write64_k(__be32 val[2], void __iomem *dest)
 
 static inline void *hns_roce_buf_offset(struct hns_roce_buf *buf, int offset)
 {
-	u32 bits_per_long_val = BITS_PER_LONG;
 	u32 page_size = 1 << buf->page_shift;
 
-	if ((bits_per_long_val == 64 && buf->page_shift == PAGE_SHIFT) ||
-	    buf->nbufs == 1)
+	if (buf->nbufs == 1)
 		return (char *)(buf->direct.buf) + offset;
 	else
 		return (char *)(buf->page_list[offset >> buf->page_shift].buf) +
-- 
1.9.1


* [PATCH V2 rdma-rc 3/3] RDMA/hns: Get rid of page operation after dma_alloc_coherent
  2017-11-28  7:10 [PATCH V2 rdma-rc 0/3] RDMA/hns: Bug fixes in hns RoCE driver Wei Hu (Xavier)
  2017-11-28  7:10 ` [PATCH V2 rdma-rc 1/3] RDMA/hns: Fix the issue of IOVA not page continuous in hip08 Wei Hu (Xavier)
  2017-11-28  7:10 ` [PATCH V2 rdma-rc 2/3] RDMA/hns: Get rid of virt_to_page and vmap calls after dma_alloc_coherent Wei Hu (Xavier)
@ 2017-11-28  7:10 ` Wei Hu (Xavier)
  2017-12-02  0:04 ` [PATCH V2 rdma-rc 0/3] RDMA/hns: Bug fixes in hns RoCE driver Jason Gunthorpe
  3 siblings, 0 replies; 5+ messages in thread
From: Wei Hu (Xavier) @ 2017-11-28  7:10 UTC (permalink / raw)
  To: dledford, jgg
  Cc: linux-rdma, linuxarm, linux-kernel, shaobo.xu, xavier.huwei,
	zhangxiping3

In general dma_alloc_coherent() returns a CPU virtual address and
a DMA address, and we have no guarantee that the underlying memory
even has an associated struct page at all.

This patch gets rid of the page operations after dma_alloc_coherent()
and instead records the VA returned by dma_alloc_coherent() in the
hem structure of the hns RoCE driver.
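
A minimal sketch of the bookkeeping change (simplified structure; the
real driver extends struct hns_roce_hem_chunk as in the diff below):

#include <linux/dma-mapping.h>
#include <linux/errno.h>
#include <linux/gfp.h>
#include <linux/scatterlist.h>

/*
 * Sketch only: remember the VA returned by dma_alloc_coherent() next
 * to the scatterlist entry, so it can be handed straight back to
 * dma_free_coherent() without going through sg_page().
 */
struct chunk_sketch {
        struct scatterlist mem[8];
        void *buf[8];           /* VA recorded at allocation time */
        int npages;
};

static int chunk_add_buf(struct device *dev, struct chunk_sketch *chunk,
                         size_t size)
{
        dma_addr_t dma;
        void *va = dma_alloc_coherent(dev, size, &dma, GFP_KERNEL);

        if (!va)
                return -ENOMEM;

        sg_dma_address(&chunk->mem[chunk->npages]) = dma;
        sg_dma_len(&chunk->mem[chunk->npages]) = size;
        chunk->buf[chunk->npages] = va;
        chunk->npages++;
        return 0;
}

static void chunk_free_bufs(struct device *dev, struct chunk_sketch *chunk)
{
        int i;

        for (i = 0; i < chunk->npages; ++i)
                dma_free_coherent(dev, sg_dma_len(&chunk->mem[i]),
                                  chunk->buf[i],
                                  sg_dma_address(&chunk->mem[i]));
}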

Fixes: 9a44353 ("IB/hns: Add driver files for hns RoCE driver")
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Shaobo Xu <xushaobo2@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Xiping Zhang (Francis) <zhangxiping3@huawei.com>
---
changelog:
 v1 -> v2: Revised the commit message and added the Fixes line per
    Jason's comment. Related link: https://lkml.org/lkml/2017/11/27/841
---
 drivers/infiniband/hw/hns/hns_roce_hem.c | 25 +++++++++++++------------
 drivers/infiniband/hw/hns/hns_roce_hem.h |  1 +
 2 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/drivers/infiniband/hw/hns/hns_roce_hem.c b/drivers/infiniband/hw/hns/hns_roce_hem.c
index 8b733a6..0eeabfb 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hem.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hem.c
@@ -224,6 +224,7 @@ static struct hns_roce_hem *hns_roce_alloc_hem(struct hns_roce_dev *hr_dev,
 			sg_init_table(chunk->mem, HNS_ROCE_HEM_CHUNK_LEN);
 			chunk->npages = 0;
 			chunk->nsg = 0;
+			memset(chunk->buf, 0, sizeof(chunk->buf));
 			list_add_tail(&chunk->list, &hem->chunk_list);
 		}
 
@@ -240,8 +241,7 @@ static struct hns_roce_hem *hns_roce_alloc_hem(struct hns_roce_dev *hr_dev,
 		if (!buf)
 			goto fail;
 
-		sg_set_buf(mem, buf, PAGE_SIZE << order);
-		WARN_ON(mem->offset);
+		chunk->buf[chunk->npages] = buf;
 		sg_dma_len(mem) = PAGE_SIZE << order;
 
 		++chunk->npages;
@@ -267,8 +267,8 @@ void hns_roce_free_hem(struct hns_roce_dev *hr_dev, struct hns_roce_hem *hem)
 	list_for_each_entry_safe(chunk, tmp, &hem->chunk_list, list) {
 		for (i = 0; i < chunk->npages; ++i)
 			dma_free_coherent(hr_dev->dev,
-				   chunk->mem[i].length,
-				   lowmem_page_address(sg_page(&chunk->mem[i])),
+				   sg_dma_len(&chunk->mem[i]),
+				   chunk->buf[i],
 				   sg_dma_address(&chunk->mem[i]));
 		kfree(chunk);
 	}
@@ -722,11 +722,12 @@ void *hns_roce_table_find(struct hns_roce_dev *hr_dev,
 	struct hns_roce_hem_chunk *chunk;
 	struct hns_roce_hem_mhop mhop;
 	struct hns_roce_hem *hem;
-	struct page *page = NULL;
+	void *addr = NULL;
 	unsigned long mhop_obj = obj;
 	unsigned long obj_per_chunk;
 	unsigned long idx_offset;
 	int offset, dma_offset;
+	int length;
 	int i, j;
 	u32 hem_idx = 0;
 
@@ -763,25 +764,25 @@ void *hns_roce_table_find(struct hns_roce_dev *hr_dev,
 
 	list_for_each_entry(chunk, &hem->chunk_list, list) {
 		for (i = 0; i < chunk->npages; ++i) {
+			length = sg_dma_len(&chunk->mem[i]);
 			if (dma_handle && dma_offset >= 0) {
-				if (sg_dma_len(&chunk->mem[i]) >
-				    (u32)dma_offset)
+				if (length > (u32)dma_offset)
 					*dma_handle = sg_dma_address(
 						&chunk->mem[i]) + dma_offset;
-				dma_offset -= sg_dma_len(&chunk->mem[i]);
+				dma_offset -= length;
 			}
 
-			if (chunk->mem[i].length > (u32)offset) {
-				page = sg_page(&chunk->mem[i]);
+			if (length > (u32)offset) {
+				addr = chunk->buf[i] + offset;
 				goto out;
 			}
-			offset -= chunk->mem[i].length;
+			offset -= length;
 		}
 	}
 
 out:
 	mutex_unlock(&table->mutex);
-	return page ? lowmem_page_address(page) + offset : NULL;
+	return addr;
 }
 EXPORT_SYMBOL_GPL(hns_roce_table_find);
 
diff --git a/drivers/infiniband/hw/hns/hns_roce_hem.h b/drivers/infiniband/hw/hns/hns_roce_hem.h
index db66db1..e8850d5 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hem.h
+++ b/drivers/infiniband/hw/hns/hns_roce_hem.h
@@ -78,6 +78,7 @@ struct hns_roce_hem_chunk {
 	int			 npages;
 	int			 nsg;
 	struct scatterlist	 mem[HNS_ROCE_HEM_CHUNK_LEN];
+	void			 *buf[HNS_ROCE_HEM_CHUNK_LEN];
 };
 
 struct hns_roce_hem {
-- 
1.9.1


* Re: [PATCH V2 rdma-rc 0/3] RDMA/hns: Bug fixes in hns RoCE driver
  2017-11-28  7:10 [PATCH V2 rdma-rc 0/3] RDMA/hns: Bug fixes in hns RoCE driver Wei Hu (Xavier)
                   ` (2 preceding siblings ...)
  2017-11-28  7:10 ` [PATCH V2 rdma-rc 3/3] RDMA/hns: Get rid of page operation " Wei Hu (Xavier)
@ 2017-12-02  0:04 ` Jason Gunthorpe
  3 siblings, 0 replies; 5+ messages in thread
From: Jason Gunthorpe @ 2017-12-02  0:04 UTC (permalink / raw)
  To: Wei Hu (Xavier)
  Cc: dledford, linux-rdma, linuxarm, linux-kernel, shaobo.xu,
	xavier.huwei, zhangxiping3

On Tue, Nov 28, 2017 at 03:10:25PM +0800, Wei Hu (Xavier) wrote:
> This patch set contains three patches that fix memory-related
> issues. One fixes a DMA operation failure when the SMMU is enabled;
> the other two fix incorrect usage of the DMA API that may cause
> coherency problems.
> 
> Wei Hu (Xavier) (3):
>   RDMA/hns: Fix the issue of IOVA not page continuous in hip08
>   RDMA/hns: Get rid of virt_to_page and vmap calls after
>     dma_alloc_coherent
>   RDMA/hns: Get rid of page operation after dma_alloc_coherent
> 
>  drivers/infiniband/hw/hns/hns_roce_alloc.c  | 23 -----------------------
>  drivers/infiniband/hw/hns/hns_roce_device.h |  4 +---
>  drivers/infiniband/hw/hns/hns_roce_hem.c    | 25 +++++++++++++------------
>  drivers/infiniband/hw/hns/hns_roce_hem.h    |  1 +
>  drivers/infiniband/hw/hns/hns_roce_hw_v2.c  | 22 +++++++++++++++-------
>  5 files changed, 30 insertions(+), 45 deletions(-)

Applied to for-rc, thanks

Jason

