From: Alistair Popple <apopple@nvidia.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: <linux-mm@kvack.org>, <nouveau@lists.freedesktop.org>,
	<bskeggs@redhat.com>, <akpm@linux-foundation.org>,
	<linux-doc@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	<dri-devel@lists.freedesktop.org>, <jhubbard@nvidia.com>,
	<rcampbell@nvidia.com>, <jglisse@redhat.com>, <hch@infradead.org>,
	<daniel@ffwll.ch>
Subject: Re: [PATCH v3 5/8] mm: Device exclusive memory access
Date: Tue, 2 Mar 2021 19:57:58 +1100
Message-ID: <2758096.Z30Q8iEM0t@nvdebian>
In-Reply-To: <20210302000559.GA763995@nvidia.com>

On Tuesday, 2 March 2021 11:05:59 AM AEDT Jason Gunthorpe wrote:
> On Fri, Feb 26, 2021 at 06:18:29PM +1100, Alistair Popple wrote:
> 
> > +/**
> > + * make_device_exclusive_range() - Mark a range for exclusive use by a device
> > + * @mm: mm_struct of associated target process
> > + * @start: start of the region to mark for exclusive device access
> > + * @end: end address of region
> > + * @pages: returns the pages which were successfully marked for exclusive access
> > + *
> > + * Returns: number of pages successfully marked for exclusive access
> > + *
> > + * This function finds the ptes mapping page(s) to the given address range and
> > + * replaces them with special swap entries preventing userspace CPU access. On
> > + * fault these entries are replaced with the original mapping after calling MMU
> > + * notifiers.
> > + */
> > +int make_device_exclusive_range(struct mm_struct *mm, unsigned long start,
> > +				unsigned long end, struct page **pages)
> > +{
> > +	long npages = (end - start) >> PAGE_SHIFT;
> > +	long i;
> > +
> > +	npages = get_user_pages_remote(mm, start, npages,
> > +				       FOLL_GET | FOLL_WRITE | FOLL_SPLIT_PMD,
> > +				       pages, NULL, NULL);
> > +	for (i = 0; i < npages; i++) {
> > +		if (!trylock_page(pages[i])) {
> > +			put_page(pages[i]);
> > +			pages[i] = NULL;
> > +			continue;
> > +		}
> > +
> > +		if (!try_to_protect(pages[i])) {
> 
> Isn't this racy? get_user_pages returns the ptes at an instant in
> time, they could have already been changed to something else?

Right. On its own this does not guarantee that the page is still mapped at the 
given location, only that a new mapping won't get established without an mmu 
notifier callback to clear the swap entry.
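
Something like the pattern below is what I had in mind. This is only a rough 
sketch, not code from the series: mni and driver_lock stand for whatever 
mmu_interval_notifier and lock the driver's invalidate() callback uses, and 
error handling such as releasing the pages before a retry is omitted:

#include <linux/mm.h>
#include <linux/mmu_notifier.h>
#include <linux/mutex.h>

static int driver_make_exclusive(struct mmu_interval_notifier *mni,
				 struct mutex *driver_lock,
				 struct mm_struct *mm, unsigned long start,
				 unsigned long end, struct page **pages)
{
	unsigned long seq;
	int npages;

again:
	seq = mmu_interval_read_begin(mni);

	/* This fires mmu notifiers itself, so don't hold driver_lock here */
	npages = make_device_exclusive_range(mm, start, end, pages);
	if (npages <= 0)
		return npages;

	mutex_lock(driver_lock);
	if (mmu_interval_read_retry(mni, seq)) {
		/* An invalidate raced with us, retry (page release omitted) */
		mutex_unlock(driver_lock);
		goto again;
	}

	/* ... program the device page tables under driver_lock here ... */
	mutex_unlock(driver_lock);
	return npages;
}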

The intent was that a driver could use HMM or some other mechanism, as in the 
sketch above, to keep PTEs synchronised if required. However I just looked at 
patch 8 in the series again and it appears I got this wrong when converting 
from the old migration approach:

+               mutex_unlock(&svmm->mutex);
+               ret = nouveau_atomic_range_fault(svmm, drm, args,
+                                               size, hmm_flags, mm);

The mutex needs to stay locked until after the range fault to ensure the PTE 
hasn't changed. But that ends up being a problem, because try_to_protect() 
calls notifiers which need to take that same mutex and hence it deadlocks.
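
In other words the ordering that deadlocks looks roughly like this (call chain 
abbreviated, and the exact nouveau function name is from memory so treat it as 
illustrative):

mutex_lock(&svmm->mutex);
nouveau_atomic_range_fault()
  -> make_device_exclusive_range()
    -> try_to_protect()
      -> mmu notifier invalidation callback
        -> nouveau_svmm_invalidate() tries to take svmm->mutex
           /* deadlock: svmm->mutex is already held above */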

> I would think you'd want to switch to the swap entry atomically under
> the PTLs?

That is one approach, but reusing get_user_pages() to walk the page tables 
and fault/gather the pages is a nice simplification, and adding a new FOLL 
flag/mode to atomically swap in the entries doesn't seem right.

However try_to_protect() scans the PTEs again under the PTL, so having it 
check that the mapping of interest actually does get replaced during the rmap 
walk seems like a reasonable solution, something like the sketch below. Thanks 
for the comments.
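
A rough sketch of that check (the argument struct and the field names here 
are made up for illustration, not from the posted patch):

struct ttp_args {
	struct mm_struct *mm;
	unsigned long address;
	bool valid;	/* set once the pte at ->address has been replaced */
};

static bool try_to_protect_one(struct page *page, struct vm_area_struct *vma,
			       unsigned long address, void *arg)
{
	struct ttp_args *ttp = arg;
	struct page_vma_mapped_walk pvmw = {
		.page = page,
		.vma = vma,
		.address = address,
	};

	while (page_vma_mapped_walk(&pvmw)) {
		/* ... install the exclusive swap entry as before ... */

		/* Remember whether the pte we asked about was replaced */
		if (vma->vm_mm == ttp->mm && pvmw.address == ttp->address)
			ttp->valid = true;
	}
	return true;
}

make_device_exclusive_range() would then only count a page as successfully 
marked if ttp.valid is set once the rmap walk completes.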

 - Alistair

> Jason
> 




