All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kashyap Desai <kashyap.desai@broadcom.com>
To: linux-rdma@vger.kernel.org
Cc: jgg@nvidia.com, leonro@nvidia.com, selvin.xavier@broadcom.com,
	andrew.gospodarek@broadcom.com,
	Kashyap Desai <kashyap.desai@broadcom.com>
Subject: [PATCH  rdma-rc v1] RDMA/core: fix sg_to_page mapping for boundary condition
Date: Tue, 16 Aug 2022 13:41:53 +0530	[thread overview]
Message-ID: <20220816081153.1580612-1-kashyap.desai@broadcom.com> (raw)

[-- Attachment #1: Type: text/plain, Size: 3999 bytes --]

This issue frequently hits if AMD IOMMU is enabled.

In case of 1MB data transfer, ib core is supposed to set 256 entries of
4K page size in MR page table. Because of the defect in ib_sg_to_pages,
it breaks just after setting one entry.
Memory region page table entries may find stale entries (or NULL if
address is memset). Something like this -

crash> x/32a 0xffff9cd9f7f84000
<<< -This looks like stale entries. Only first entry is valid ->>>
0xffff9cd9f7f84000:     0xfffffffffff00000      0x68d31000
0xffff9cd9f7f84010:     0x68d32000      0x68d33000
0xffff9cd9f7f84020:     0x68d34000      0x975c5000
0xffff9cd9f7f84030:     0x975c6000      0x975c7000
0xffff9cd9f7f84040:     0x975c8000      0x975c9000
0xffff9cd9f7f84050:     0x975ca000      0x975cb000
0xffff9cd9f7f84060:     0x975cc000      0x975cd000
0xffff9cd9f7f84070:     0x975ce000      0x975cf000
0xffff9cd9f7f84080:     0x0     0x0
0xffff9cd9f7f84090:     0x0     0x0
0xffff9cd9f7f840a0:     0x0     0x0
0xffff9cd9f7f840b0:     0x0     0x0
0xffff9cd9f7f840c0:     0x0     0x0
0xffff9cd9f7f840d0:     0x0     0x0
0xffff9cd9f7f840e0:     0x0     0x0
0xffff9cd9f7f840f0:     0x0     0x0

All addresses other than 0xfffffffffff00000 are stale entries.
Once this kind of incorrect page entries are passed to the RDMA h/w,
AMD IOMMU module detects the page fault whenever h/w tries to access
addresses which are not actually populated by the ib stack correctly.
Below prints are logged whenever this issue hits.

bnxt_en 0000:21:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x001e address=0x68d31000 flags=0x0050]

ib_sg_to_pages function populates the correct page address in most of the cases,
but there is one boundary condition which is not handled correctly.

Boundary condition explained -
Page addresses are not populated correctly if the dma buffer is mapped to the
very last region of address space.

One of the example -
Whenever page_add is  0xfffffffffff00000  (Last 1MB section of the address space)
and dma length is 1MB, end of the dma address = 0
(Derived from 0xfffffffffff00000 + 0x100000).

use dma buffer length instead of end_dma_addr to fill page addresses.

v0->v1 : Use first_page_off instead of page_off for readability
	 Fix functional issue of not reseting first_page_off

Fixes: 4c67e2bfc8b7 ("IB/core: Introduce new fast registration API")
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
---
 drivers/infiniband/core/verbs.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index e54b3f1b730e..5e72c44bac3a 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -2676,15 +2676,19 @@ int ib_sg_to_pages(struct ib_mr *mr, struct scatterlist *sgl, int sg_nents,
 		u64 dma_addr = sg_dma_address(sg) + sg_offset;
 		u64 prev_addr = dma_addr;
 		unsigned int dma_len = sg_dma_len(sg) - sg_offset;
+		unsigned int curr_dma_len = 0;
+		unsigned int first_page_off = 0;
 		u64 end_dma_addr = dma_addr + dma_len;
 		u64 page_addr = dma_addr & page_mask;
 
+		if (i == 0)
+			first_page_off = dma_addr - page_addr;
 		/*
 		 * For the second and later elements, check whether either the
 		 * end of element i-1 or the start of element i is not aligned
 		 * on a page boundary.
 		 */
-		if (i && (last_page_off != 0 || page_addr != dma_addr)) {
+		else if (last_page_off != 0 || page_addr != dma_addr) {
 			/* Stop mapping if there is a gap. */
 			if (last_end_dma_addr != dma_addr)
 				break;
@@ -2708,8 +2712,10 @@ int ib_sg_to_pages(struct ib_mr *mr, struct scatterlist *sgl, int sg_nents,
 			}
 			prev_addr = page_addr;
 next_page:
+			curr_dma_len += mr->page_size - first_page_off;
 			page_addr += mr->page_size;
-		} while (page_addr < end_dma_addr);
+			first_page_off = 0;
+		} while (curr_dma_len < dma_len);
 
 		mr->length += dma_len;
 		last_end_dma_addr = end_dma_addr;
-- 
2.27.0


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4212 bytes --]

             reply	other threads:[~2022-08-16  9:39 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-16  8:11 Kashyap Desai [this message]
2022-08-18 23:52 ` [PATCH rdma-rc v1] RDMA/core: fix sg_to_page mapping for boundary condition Jason Gunthorpe
2022-08-19  9:43   ` Kashyap Desai
2022-08-19 11:48     ` Jason Gunthorpe
2022-08-22 14:21       ` Kashyap Desai
2022-08-26 13:14         ` Jason Gunthorpe
2022-09-01 12:06           ` Kashyap Desai
2022-09-06 17:33             ` Jason Gunthorpe
2022-09-12 11:02               ` Kashyap Desai
2022-09-20 19:14                 ` Jason Gunthorpe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220816081153.1580612-1-kashyap.desai@broadcom.com \
    --to=kashyap.desai@broadcom.com \
    --cc=andrew.gospodarek@broadcom.com \
    --cc=jgg@nvidia.com \
    --cc=leonro@nvidia.com \
    --cc=linux-rdma@vger.kernel.org \
    --cc=selvin.xavier@broadcom.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.