Linux-kselftest Archive on lore.kernel.org
 help / color / Atom feed
From: Ralph Campbell <rcampbell@nvidia.com>
To: Jason Gunthorpe <jgg@mellanox.com>
Cc: "linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-kselftest@vger.kernel.org"
	<linux-kselftest@vger.kernel.org>,
	Jerome Glisse <jglisse@redhat.com>,
	"John Hubbard" <jhubbard@nvidia.com>,
	Christoph Hellwig <hch@lst.de>,
	Andrew Morton <akpm@linux-foundation.org>,
	Shuah Khan <shuah@kernel.org>
Subject: Re: [PATCH v5 1/2] mm/mmu_notifier: make interval notifier updates safe
Date: Tue, 17 Dec 2019 13:50:24 -0800
Message-ID: <59d4ea9e-3f6b-11c2-75d1-5baecd5b4ae2@nvidia.com> (raw)
In-Reply-To: <20191217205147.GI16762@mellanox.com>


On 12/17/19 12:51 PM, Jason Gunthorpe wrote:
> On Mon, Dec 16, 2019 at 11:57:32AM -0800, Ralph Campbell wrote:
>> mmu_interval_notifier_insert() and mmu_interval_notifier_remove() can't
>> be called safely from inside the invalidate() callback. This is fine for
>> devices with explicit memory region register and unregister calls but it
>> is desirable from a programming model standpoint to not require explicit
>> memory region registration. Regions can be registered based on device
>> address faults but without a mechanism for updating or removing the mmu
>> interval notifiers in response to munmap(), the invalidation callbacks
>> will be for regions that are stale or apply to different mmaped regions.
> 
> What we do in RDMA is drive the removal from a work queue, as we need
> a synchronize_srcu anyhow to serialize everything to do with
> destroying a part of the address space mirror.
> 
> Is it really necessary to have all this stuff just to save doing
> something like a work queue?

Well, the invalidates already have to use the driver lock to synchronize
so handling the range tracking updates semi-synchronously seems more
straightforward to me.

Do you feel strongly that adding a work queue is the right way to handle
this?

> Also, I think we are not taking core kernel APIs like this with out an
> in-kernel user??

Right. I was looking for feedback before updating nouveau to use it.

>> diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
>> index 9e6caa8ecd19..55fbefcdc564 100644
>> +++ b/include/linux/mmu_notifier.h
>> @@ -233,11 +233,18 @@ struct mmu_notifier {
>>    * @invalidate: Upon return the caller must stop using any SPTEs within this
>>    *              range. This function can sleep. Return false only if sleeping
>>    *              was required but mmu_notifier_range_blockable(range) is false.
>> + * @release:	This function will be called when the mmu_interval_notifier
>> + *		is removed from the interval tree. Defining this function also
>> + *		allows mmu_interval_notifier_remove() and
>> + *		mmu_interval_notifier_update() to be called from the
>> + *		invalidate() callback function (i.e., they won't block waiting
>> + *		for invalidations to finish.
> 
> Having a function called remove that doesn't block seems like very
> poor choice of language, we've tended to use put to describe that
> operation.
> 
> The difference is meaningful as people often create use after free
> bugs in drivers when presented with interfaces named 'remove' or
> 'destroy' that don't actually guarentee there is not going to be
> continued accesses to the memory.

OK. I can rename it put().

>>    */
>>   struct mmu_interval_notifier_ops {
>>   	bool (*invalidate)(struct mmu_interval_notifier *mni,
>>   			   const struct mmu_notifier_range *range,
>>   			   unsigned long cur_seq);
>> +	void (*release)(struct mmu_interval_notifier *mni);
>>   };
>>   
>>   struct mmu_interval_notifier {
>> @@ -246,6 +253,8 @@ struct mmu_interval_notifier {
>>   	struct mm_struct *mm;
>>   	struct hlist_node deferred_item;
>>   	unsigned long invalidate_seq;
>> +	unsigned long deferred_start;
>> +	unsigned long deferred_last;
> 
> I couldn't quite understand how something like this can work, what is
> preventing parallel updates?

It is serialized by the struct mmu_notifier_mm lock.
If there are no tasks walking the interval tree, the update
happens synchronously under the lock. If there are walkers,
the start/last values are stored under the lock and the last caller's
values are used to update the interval tree when the last walker
finishes (under the lock again).

>> +/**
>> + * mmu_interval_notifier_update - Update interval notifier end
>> + * @mni: Interval notifier to update
>> + * @start: New starting virtual address to monitor
>> + * @length: New length of the range to monitor
>> + *
>> + * This function updates the range being monitored.
>> + * If there is no release() function defined, the call will wait for the
>> + * update to finish before returning.
>> + */
>> +int mmu_interval_notifier_update(struct mmu_interval_notifier *mni,
>> +				 unsigned long start, unsigned long length)
>> +{
> 
> Update should probably be its own patch
> 
> Jason

OK.
Thanks for the review.

  reply index

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-12-16 19:57 [PATCH v5 0/2] mm/hmm/test: add self tests for HMM Ralph Campbell
2019-12-16 19:57 ` [PATCH v5 1/2] mm/mmu_notifier: make interval notifier updates safe Ralph Campbell
2019-12-17 20:51   ` Jason Gunthorpe
2019-12-17 21:50     ` Ralph Campbell [this message]
2020-01-09 19:48   ` Jason Gunthorpe
2020-01-09 22:01     ` Ralph Campbell
2020-01-09 23:25       ` Jason Gunthorpe
2020-01-13 22:44         ` Ralph Campbell
2020-01-14 12:45           ` Jason Gunthorpe
2020-01-15 22:04             ` Ralph Campbell
2020-01-16 14:13               ` Jason Gunthorpe
2019-12-16 19:57 ` [PATCH v5 2/2] mm/hmm/test: add self tests for HMM Ralph Campbell

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=59d4ea9e-3f6b-11c2-75d1-5baecd5b4ae2@nvidia.com \
    --to=rcampbell@nvidia.com \
    --cc=akpm@linux-foundation.org \
    --cc=hch@lst.de \
    --cc=jgg@mellanox.com \
    --cc=jglisse@redhat.com \
    --cc=jhubbard@nvidia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=shuah@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-kselftest Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-kselftest/0 linux-kselftest/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-kselftest linux-kselftest/ https://lore.kernel.org/linux-kselftest \
		linux-kselftest@vger.kernel.org
	public-inbox-index linux-kselftest

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kselftest


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git