From: Jason Gunthorpe <jgg@nvidia.com>
To: Yishai Hadas <yishaih@nvidia.com>
Cc: Leon Romanovsky <leon@kernel.org>,
	Doug Ledford <dledford@redhat.com>, <linux-rdma@vger.kernel.org>,
	Christoph Hellwig <hch@infradead.org>
Subject: Re: [PATCH rdma-next v2 1/4] IB/core: Improve ODP to use hmm_range_fault()
Date: Tue, 29 Sep 2020 17:13:03 -0300	[thread overview]
Message-ID: <20200929201303.GG9475@nvidia.com> (raw)
In-Reply-To: <089ce58a-a439-79b5-72ac-128d56002878@nvidia.com>

On Tue, Sep 29, 2020 at 11:09:43PM +0300, Yishai Hadas wrote:
> On 9/29/2020 10:27 PM, Jason Gunthorpe wrote:
> > On Tue, Sep 22, 2020 at 11:21:01AM +0300, Leon Romanovsky wrote:
> > 
> > > +	if (!*dma_addr) {
> > > +		*dma_addr = ib_dma_map_page(dev, page, 0,
> > > +				1 << umem_odp->page_shift,
> > > +				DMA_BIDIRECTIONAL);
> > > +		if (ib_dma_mapping_error(dev, *dma_addr)) {
> > > +			*dma_addr = 0;
> > > +			return -EFAULT;
> > > +		}
> > > +		umem_odp->npages++;
> > > +	}
> > > +
> > > +	*dma_addr |= access_mask;
> > This does need some masking, the purpose of this is to update the
> > access flags in the case we hit a fault on a dma mapped thing. Looks
> > like this can happen on a read-only page becoming writable again
> > (wp_page_reuse() doesn't trigger notifiers)
> > 
> > It should also have a comment to that effect.
> > 
> > something like:
> > 
> > if (*dma_addr) {
> >      /*
> >       * If the page is already dma mapped it means it went through a
> >       * non-invalidating transition, like read-only to writable. Resync the
> >       * flags.
> >       */
> >      *dma_addr = (*dma_addr & (~ODP_DMA_ADDR_MASK)) | access_mask;
> Did you mean
> 
> *dma_addr = (*dma_addr & (ODP_DMA_ADDR_MASK)) | access_mask;

Probably

> flags (see ODP_DMA_ADDR_MASK). Also, if we went through a
> read->write access without invalidation, why do we need to mask at
> all? The new access_mask should have the write access.

Feels like a good idea to be safe here
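
Something like this, as a rough sketch of the whole helper with both
branches (the helper name is illustrative; ODP_DMA_ADDR_MASK is assumed
to keep the address bits and mask off the low access bits, per
umem_odp.c):

static int ib_umem_odp_map_dma_single_page(struct ib_umem_odp *umem_odp,
					   unsigned int dma_index,
					   struct page *page,
					   u64 access_mask)
{
	struct ib_device *dev = umem_odp->umem.ibdev;
	dma_addr_t *dma_addr = &umem_odp->dma_list[dma_index];

	if (*dma_addr) {
		/*
		 * If the page is already dma mapped it means it went
		 * through a non-invalidating transition, like read-only
		 * to writable. Keep the address bits, resync the flags.
		 */
		*dma_addr = (*dma_addr & ODP_DMA_ADDR_MASK) | access_mask;
		return 0;
	}

	*dma_addr = ib_dma_map_page(dev, page, 0, 1 << umem_odp->page_shift,
				    DMA_BIDIRECTIONAL);
	if (ib_dma_mapping_error(dev, *dma_addr)) {
		*dma_addr = 0;
		return -EFAULT;
	}
	umem_odp->npages++;
	*dma_addr |= access_mask;
	return 0;
}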
 
> > > +		WARN_ON(range.hmm_pfns[pfn_index] & HMM_PFN_ERROR);
> > > +		WARN_ON(!(range.hmm_pfns[pfn_index] & HMM_PFN_VALID));
> > > +		hmm_order = hmm_pfn_to_map_order(range.hmm_pfns[pfn_index]);
> > > +		/* If a hugepage was detected but the umem wasn't set up for
> > > +		 * hugepages, the umem page_shift will be used; the opposite
> > > +		 * case is an error.
> > > +		 */
> > > +		if (hmm_order + PAGE_SHIFT < page_shift) {
> > > +			ret = -EINVAL;
> > > +			pr_debug("%s: un-expected hmm_order %d, page_shift %d\n",
> > > +				 __func__, hmm_order, page_shift);
> > >   			break;
> > >   		}
> > I think this break should be a continue here. There is no reason not
> > to go to the next aligned PFN and try to sync as much as possible.
> 
> This might happen if the application didn't honor the contract to use
> hugepages for the full range even though it set IB_ACCESS_HUGETLB, right?

Yes

> Do we still need to sync as much as possible in that case? I believe
> we should consider returning an error here so the application is aware,
> as it was before this series.

We might be prefetching or something weird where it could make sense.
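
As a sketch of that hunk with continue (the loop is assumed to advance
pfn_index/dma_index by the aligned block size in its header, so skipping
a block stays aligned):

		hmm_order = hmm_pfn_to_map_order(range.hmm_pfns[pfn_index]);
		if (hmm_order + PAGE_SHIFT < page_shift) {
			/*
			 * The application broke the IB_ACCESS_HUGETLB
			 * contract for part of the range; skip this
			 * aligned block and keep syncing the rest.
			 */
			pr_debug("%s: unexpected hmm_order %d, page_shift %d\n",
				 __func__, hmm_order, page_shift);
			continue;
		}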

> > This should also
> > 
> >    WARN_ON(umem_odp->dma_list[dma_index]);
> > 
> > And all the pr_debugs around this code being touched should become
> > mlx5_ib_dbg
> We are in IB core, why mlx5_ib_dbg?

oops, dev_dbg
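
Something like this (a sketch; ibdev_dbg() is the ib_device flavour of
dev_dbg() from rdma/ib_verbs.h):

	WARN_ON(umem_odp->dma_list[dma_index]);
	ibdev_dbg(umem_odp->umem.ibdev,
		  "unexpected hmm_order %d, page_shift %d\n",
		  hmm_order, page_shift);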

Jason
