From: Jason Gunthorpe <jgg@nvidia.com>
To: Leon Romanovsky <leon@kernel.org>
Cc: Doug Ledford <dledford@redhat.com>,
	Artemy Kovalyov <artemyko@mellanox.com>,
	<linux-rdma@vger.kernel.org>
Subject: Re: [PATCH rdma-rc] RDMA/mlx5: Prevent prefetch from racing with implicit destruction
Date: Tue, 21 Jul 2020 13:59:13 -0300	[thread overview]
Message-ID: <20200721165913.GA3171161@nvidia.com> (raw)
In-Reply-To: <20200719065435.130722-1-leon@kernel.org>

On Sun, Jul 19, 2020 at 09:54:35AM +0300, Leon Romanovsky wrote:
> From: Jason Gunthorpe <jgg@nvidia.com>
> 
> Prefetch work queued through mlx5_ib_prefetch_mr_work() can still be
> pending and run concurrently with destruction of the implicit MR. The
> num_deferred_work counter was intended to serialize this, but there is
> a race:
> 
>        CPU0                                          CPU1
> 
>     mlx5_ib_free_implicit_mr()
>       xa_erase(odp_mkeys)
>       synchronize_srcu()
>       __xa_erase(implicit_children)
>                                       mlx5_ib_prefetch_mr_work()
>                                         pagefault_mr()
>                                          pagefault_implicit_mr()
>                                           implicit_get_child_mr()
>                                            xa_cmpxchg()
>                                         atomic_dec_and_test(num_deferred_mr)
>       wait_event(imr->q_deferred_work)
>       ib_umem_odp_release(odp_imr)
>         kfree(odp_imr)
> 
> At this point in mlx5_ib_free_implicit_mr() the implicit_children list
> is supposed to stay empty forever, guaranteeing that
> destroy_unused_implicit_child_mr() and its related flows are not
> running and will never run again.
> 
> Since it is not empty, the destroy_unused_implicit_child_mr() flow ends
> up touching deallocated memory because mlx5_ib_free_implicit_mr() has
> already torn down the imr parent.
> 
> The solution is to flush out the prefetch wq by driving num_deferred_work
> to zero after creation of new prefetch work is blocked.
> 
> Fixes: 5256edcb98a1 ("RDMA/mlx5: Rework implicit ODP destroy")
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
> ---
>  drivers/infiniband/hw/mlx5/odp.c | 22 +++++++++++++++++++---
>  1 file changed, 19 insertions(+), 3 deletions(-)
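
For anyone reading along, a minimal kernel-style sketch of the
destroy-side ordering the commit message describes may help. It is
illustrative only, not the applied diff: the imr_sketch struct and the
parameter names are stand-ins for driver state the message does not
spell out, and only the calls and fields named above (xa_erase(),
synchronize_srcu(), wait_event(), num_deferred_work, q_deferred_work,
implicit_children) are taken from the text.

#include <linux/atomic.h>
#include <linux/srcu.h>
#include <linux/wait.h>
#include <linux/xarray.h>

/*
 * Minimal stand-in for the implicit MR state named in the commit
 * message; the real struct mlx5_ib_mr carries much more than this.
 */
struct imr_sketch {
	atomic_t num_deferred_work;        /* in-flight deferred/prefetch work */
	wait_queue_head_t q_deferred_work; /* woken when the count drops */
	struct xarray implicit_children;   /* child MRs keyed by index */
};

/*
 * Destroy-side ordering after the fix: quiesce lookups, flush the
 * deferred work, and only then tear down the children and the parent.
 */
static void free_implicit_mr_sketch(struct imr_sketch *imr,
				    struct xarray *odp_mkeys,
				    unsigned long mkey_index,
				    struct srcu_struct *odp_srcu)
{
	unsigned long idx;
	void *child;

	/* 1. Stop new prefetch work from finding this MR. */
	xa_erase(odp_mkeys, mkey_index);
	synchronize_srcu(odp_srcu);

	/*
	 * 2. The fix: drive num_deferred_work to zero before touching
	 *    implicit_children. A prefetch worker queued earlier may
	 *    still be running and may re-add a child through
	 *    implicit_get_child_mr(); waiting here closes that window.
	 */
	wait_event(imr->q_deferred_work,
		   !atomic_read(&imr->num_deferred_work));

	/*
	 * 3. implicit_children is now guaranteed to stay empty, so the
	 *    children and the parent umem can be released without
	 *    racing (ib_umem_odp_release()/kfree() in the diagram).
	 */
	xa_for_each(&imr->implicit_children, idx, child)
		xa_erase(&imr->implicit_children, idx);
}

The essential point is that the wait in step 2 now happens before
implicit_children is emptied rather than after it, which is exactly the
window shown in the race diagram.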

Applied to for-rc, thanks

Jason
