Linux-RDMA Archive on lore.kernel.org
From: Leon Romanovsky <leon@kernel.org>
To: Doug Ledford <dledford@redhat.com>,
	Jason Gunthorpe <jgg@mellanox.com>,
	"David S . Miller" <davem@davemloft.net>,
	Santosh Shilimkar <santosh.shilimkar@oracle.com>
Cc: Leon Romanovsky <leonro@mellanox.com>,
	RDMA mailing list <linux-rdma@vger.kernel.org>,
	Hans Westgaard Ry <hans.westgaard.ry@oracle.com>,
	Moni Shoua <monis@mellanox.com>,
	linux-netdev <netdev@vger.kernel.org>
Subject: [PATCH mlx5-next 07/10] RDMA/mlx5: Fix handling of IOVA != user_va in ODP paths
Date: Wed, 15 Jan 2020 14:43:37 +0200
Message-ID: <20200115124340.79108-8-leon@kernel.org> (raw)
In-Reply-To: <20200115124340.79108-1-leon@kernel.org>

From: Jason Gunthorpe <jgg@mellanox.com>

Until recently it was not possible for userspace to specify an IOVA
different from the user VA, but the new ibv_reg_mr_iova() library call
makes this possible.

To compute the user_va we must compute:
  user_va = (iova - iova_start) + user_va_start

while being cautious of overflow and other math problems.
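
The translation above can be sketched in standalone C. This is only an
illustration, not the driver code: iova_to_user_va() and its parameter names
are hypothetical, and the GCC/Clang builtin __builtin_add_overflow() stands in
for the kernel's check_add_overflow().

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper: translate a device-visible address (io_virt) into
 * the user virtual address that backs it, using
 *   user_va = (io_virt - iova_start) + user_va_start
 * Returns 0 on success, or -EFAULT if io_virt lies below the start of the
 * IOVA range or the addition wraps around 64 bits.
 */
static int iova_to_user_va(uint64_t io_virt, uint64_t iova_start,
			   uint64_t user_va_start, uint64_t *user_va)
{
	/* Reject addresses below the IOVA base so the subtraction cannot wrap. */
	if (io_virt < iova_start)
		return -EFAULT;

	/* Assumption: GCC/Clang builtin used in place of check_add_overflow(). */
	if (__builtin_add_overflow(io_virt - iova_start, user_va_start, user_va))
		return -EFAULT;

	return 0;
}
```

Checking io_virt against the IOVA base first keeps the subtraction from
wrapping, so only the addition needs an explicit overflow check.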

The iova is not reliably stored in the mmkey when the MR is created. Only
the cached creation path (the common one) sets it, so it must also be set
when creating uncached MRs.

Fix the weird use of iova when computing the starting page index in the
MR. In the normal case, when iova == umem.address:
  iova & (~(BIT(page_shift) - 1)) ==
  ALIGN_DOWN(umem.address, odp->page_size) ==
  ib_umem_start(odp)

And when iova is different, using it in math with a user_va is wrong.
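
The equivalence and the resulting page index can be sketched in standalone C.
The helper names page_align_down() and start_page_index() are hypothetical;
they only mirror the arithmetic described here and assume a power-of-two page
size of (1 << page_shift).

```c
#include <stdint.h>

/* Round addr down to a multiple of the page size (1 << page_shift). */
static uint64_t page_align_down(uint64_t addr, unsigned int page_shift)
{
	uint64_t page_mask = ~((UINT64_C(1) << page_shift) - 1);

	return addr & page_mask;
}

/*
 * Index of the page containing user_va, counted from the page-aligned start
 * of the umem. The caller must ensure user_va is not below that start.
 */
static uint64_t start_page_index(uint64_t user_va, uint64_t umem_address,
				 unsigned int page_shift)
{
	return (user_va - page_align_down(umem_address, page_shift)) >> page_shift;
}
```

Both operands of the subtraction are user virtual addresses here, so the
index does not depend on which IOVA the MR was registered with.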

Finally, do not allow an implicit ODP to be created with a non-zero IOVA
as we have no support for that.

Fixes: 7bdf65d411c1 ("IB/mlx5: Handle page faults")
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/hw/mlx5/mr.c  |  2 ++
 drivers/infiniband/hw/mlx5/odp.c | 19 +++++++++++++------
 2 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index 1913e88522ec..6fa0a83c19de 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -1249,6 +1249,8 @@ struct ib_mr *mlx5_ib_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,

 	if (IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING) && !start &&
 	    length == U64_MAX) {
+		if (virt_addr != start)
+			return ERR_PTR(-EINVAL);
 		if (!(access_flags & IB_ACCESS_ON_DEMAND) ||
 		    !(dev->odp_caps.general_caps & IB_ODP_SUPPORT_IMPLICIT))
 			return ERR_PTR(-EINVAL);
diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c
index 879ed9ac0af9..4216814ba871 100644
--- a/drivers/infiniband/hw/mlx5/odp.c
+++ b/drivers/infiniband/hw/mlx5/odp.c
@@ -662,11 +662,10 @@ static int pagefault_real_mr(struct mlx5_ib_mr *mr, struct ib_umem_odp *odp,
 	bool downgrade = flags & MLX5_PF_FLAGS_DOWNGRADE;
 	unsigned long current_seq;
 	u64 access_mask;
-	u64 start_idx, page_mask;
+	u64 start_idx;

 	page_shift = odp->page_shift;
-	page_mask = ~(BIT(page_shift) - 1);
-	start_idx = (user_va - (mr->mmkey.iova & page_mask)) >> page_shift;
+	start_idx = (user_va - ib_umem_start(odp)) >> page_shift;
 	access_mask = ODP_READ_ALLOWED_BIT;

 	if (odp->umem.writable && !downgrade)
@@ -805,11 +804,19 @@ static int pagefault_mr(struct mlx5_ib_mr *mr, u64 io_virt, size_t bcnt,
 {
 	struct ib_umem_odp *odp = to_ib_umem_odp(mr->umem);

+	if (unlikely(io_virt < mr->mmkey.iova))
+		return -EFAULT;
+
 	if (!odp->is_implicit_odp) {
-		if (unlikely(io_virt < ib_umem_start(odp) ||
-			     ib_umem_end(odp) - io_virt < bcnt))
+		u64 user_va;
+
+		if (check_add_overflow(io_virt - mr->mmkey.iova,
+				       (u64)odp->umem.address, &user_va))
+			return -EFAULT;
+		if (unlikely(user_va >= ib_umem_end(odp) ||
+			     ib_umem_end(odp) - user_va < bcnt))
 			return -EFAULT;
-		return pagefault_real_mr(mr, odp, io_virt, bcnt, bytes_mapped,
+		return pagefault_real_mr(mr, odp, user_va, bcnt, bytes_mapped,
 					 flags);
 	}
 	return pagefault_implicit_mr(mr, odp, io_virt, bcnt, bytes_mapped,
--
2.20.1



Thread overview: 22+ messages
2020-01-15 12:43 [PATCH mlx5-next 00/10] Use ODP MRs for kernel ULPs Leon Romanovsky
2020-01-15 12:43 ` [PATCH mlx5-next 01/10] IB: Allow calls to ib_umem_get from " Leon Romanovsky
2020-01-15 12:43 ` [PATCH mlx5-next 02/10] IB/core: Introduce ib_reg_user_mr Leon Romanovsky
2020-01-15 12:43 ` [PATCH mlx5-next 03/10] IB/core: Add interface to advise_mr for kernel users Leon Romanovsky
2020-01-15 12:43 ` [PATCH mlx5-next 04/10] IB/mlx5: Add ODP WQE handlers for kernel QPs Leon Romanovsky
2020-01-15 12:43 ` [PATCH mlx5-next 05/10] RDMA/mlx5: Don't fake udata for kernel path Leon Romanovsky
2020-01-15 12:43 ` [PATCH mlx5-next 06/10] IB/mlx5: Mask out unsupported ODP capabilities for kernel QPs Leon Romanovsky
2020-01-15 12:43 ` Leon Romanovsky [this message]
2020-01-15 12:43 ` [PATCH mlx5-next 08/10] net/rds: Detect need of On-Demand-Paging memory registration Leon Romanovsky
2020-01-15 21:42   ` santosh.shilimkar
2020-01-15 12:43 ` [PATCH mlx5-next 09/10] net/rds: Handle ODP mr registration/unregistration Leon Romanovsky
2020-01-15 21:51   ` santosh.shilimkar
2020-01-16  7:11     ` Leon Romanovsky
2020-01-16  7:22       ` santosh.shilimkar
2020-01-18 10:19   ` Leon Romanovsky
2020-01-15 12:43 ` [PATCH mlx5-next 10/10] net/rds: Use prefetch for On-Demand-Paging MR Leon Romanovsky
2020-01-15 21:43   ` santosh.shilimkar
2020-01-16  6:59 ` [PATCH mlx5-next 00/10] Use ODP MRs for kernel ULPs Leon Romanovsky
2020-01-16 13:57   ` Jason Gunthorpe
2020-01-16 14:04     ` Leon Romanovsky
2020-01-16 19:34     ` santosh.shilimkar
2020-01-17 14:12       ` Jason Gunthorpe
