From: Ralph Campbell <rcampbell@nvidia.com>
To: Jason Gunthorpe <jgg@mellanox.com>
Cc: "linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-kselftest@vger.kernel.org"
<linux-kselftest@vger.kernel.org>,
Jerome Glisse <jglisse@redhat.com>,
"John Hubbard" <jhubbard@nvidia.com>,
Christoph Hellwig <hch@lst.de>,
Andrew Morton <akpm@linux-foundation.org>,
Shuah Khan <shuah@kernel.org>
Subject: Re: [PATCH v5 1/2] mm/mmu_notifier: make interval notifier updates safe
Date: Tue, 17 Dec 2019 13:50:24 -0800
Message-ID: <59d4ea9e-3f6b-11c2-75d1-5baecd5b4ae2@nvidia.com>
In-Reply-To: <20191217205147.GI16762@mellanox.com>

On 12/17/19 12:51 PM, Jason Gunthorpe wrote:
> On Mon, Dec 16, 2019 at 11:57:32AM -0800, Ralph Campbell wrote:
>> mmu_interval_notifier_insert() and mmu_interval_notifier_remove() can't
>> be called safely from inside the invalidate() callback. This is fine for
>> devices with explicit memory region register and unregister calls but it
>> is desirable from a programming model standpoint to not require explicit
>> memory region registration. Regions can be registered based on device
>> address faults but without a mechanism for updating or removing the mmu
>> interval notifiers in response to munmap(), the invalidation callbacks
>> will be for regions that are stale or apply to different mmaped regions.
>
> What we do in RDMA is drive the removal from a work queue, as we need
> a synchronize_srcu anyhow to serialize everything to do with
> destroying a part of the address space mirror.
>
> Is it really necessary to have all this stuff just to save doing
> something like a work queue?

Well, the invalidates already have to use the driver lock to synchronize
so handling the range tracking updates semi-synchronously seems more
straightforward to me.

Do you feel strongly that adding a work queue is the right way to handle
this?

> Also, I think we are not taking core kernel APIs like this without an
> in-kernel user??

Right. I was looking for feedback before updating nouveau to use it.

>> diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
>> index 9e6caa8ecd19..55fbefcdc564 100644
>> +++ b/include/linux/mmu_notifier.h
>> @@ -233,11 +233,18 @@ struct mmu_notifier {
>> * @invalidate: Upon return the caller must stop using any SPTEs within this
>> * range. This function can sleep. Return false only if sleeping
>> * was required but mmu_notifier_range_blockable(range) is false.
>> + * @release: This function will be called when the mmu_interval_notifier
>> + * is removed from the interval tree. Defining this function also
>> + * allows mmu_interval_notifier_remove() and
>> + * mmu_interval_notifier_update() to be called from the
>> + * invalidate() callback function (i.e., they won't block waiting
>> + * for invalidations to finish).
>
> Having a function called remove that doesn't block seems like a very
> poor choice of language, we've tended to use put to describe that
> operation.
>
> The difference is meaningful as people often create use after free
> bugs in drivers when presented with interfaces named 'remove' or
> 'destroy' that don't actually guarantee there will be no
> continued accesses to the memory.

OK. I can rename it put().

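For reference, the put-style contract could look roughly like the sketch
below. It is only a userspace illustration using C11 atomics, not code from
the patch: struct tracked, tracked_get(), tracked_put() and mark_released()
are hypothetical names. The point is that put() never blocks, and the object
may still be in use by other reference holders until release() runs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical object with put() semantics (not the patch's struct). */
struct tracked {
	atomic_int refs;                      /* outstanding references */
	void (*release)(struct tracked *t);   /* called when the last ref is dropped */
	bool released;                        /* demo flag set by mark_released() */
};

/* Demo release callback; a real driver would free its resources here. */
void mark_released(struct tracked *t)
{
	t->released = true;
}

/* Take an extra reference, e.g. while a callback is using the object. */
void tracked_get(struct tracked *t)
{
	atomic_fetch_add(&t->refs, 1);
}

/*
 * Drop a reference without blocking. The caller must not touch the
 * object afterwards; release() runs only once the last reference is gone.
 */
void tracked_put(struct tracked *t)
{
	int old = atomic_fetch_sub(&t->refs, 1);

	assert(old > 0);
	if (old == 1)
		t->release(t);
}
```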
>> */
>> struct mmu_interval_notifier_ops {
>> bool (*invalidate)(struct mmu_interval_notifier *mni,
>> const struct mmu_notifier_range *range,
>> unsigned long cur_seq);
>> + void (*release)(struct mmu_interval_notifier *mni);
>> };
>>
>> struct mmu_interval_notifier {
>> @@ -246,6 +253,8 @@ struct mmu_interval_notifier {
>> struct mm_struct *mm;
>> struct hlist_node deferred_item;
>> unsigned long invalidate_seq;
>> + unsigned long deferred_start;
>> + unsigned long deferred_last;
>
> I couldn't quite understand how something like this can work, what is
> preventing parallel updates?

It is serialized by the struct mmu_notifier_mm lock.
If there are no tasks walking the interval tree, the update
happens synchronously under the lock. If there are walkers,
the start/last values are stored under the lock and the last caller's
values are used to update the interval tree when the last walker
finishes (under the lock again).

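As a rough illustration of that scheme, here is a userspace sketch (not the
patch code). The names tree_state, active_walkers, has_deferred,
walker_begin() and walker_end() are invented for the example, a pthread
mutex stands in for the mmu_notifier_mm lock, and a single notifier stands
in for the deferred list. Only deferred_start and deferred_last come from
the patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the state guarded by the mmu_notifier_mm lock. */
struct tree_state {
	pthread_mutex_t lock;
	unsigned int active_walkers;    /* tasks currently walking the tree */
};

/* Hypothetical stand-in for struct mmu_interval_notifier. */
struct notifier {
	unsigned long start, last;      /* range currently in the tree */
	unsigned long deferred_start;   /* pending range, valid if has_deferred */
	unsigned long deferred_last;
	bool has_deferred;
};

/* A walker (e.g. an invalidation in progress) starts using the tree. */
void walker_begin(struct tree_state *ts)
{
	pthread_mutex_lock(&ts->lock);
	ts->active_walkers++;
	pthread_mutex_unlock(&ts->lock);
}

/* Update now if nobody is walking, otherwise record the request. */
void notifier_update(struct tree_state *ts, struct notifier *n,
		     unsigned long start, unsigned long length)
{
	pthread_mutex_lock(&ts->lock);
	if (ts->active_walkers == 0) {
		n->start = start;
		n->last = start + length - 1;
	} else {
		/* A later caller overwrites an earlier one: last caller wins. */
		n->deferred_start = start;
		n->deferred_last = start + length - 1;
		n->has_deferred = true;
	}
	pthread_mutex_unlock(&ts->lock);
}

/* The last walker to finish applies the pending update, under the lock. */
void walker_end(struct tree_state *ts, struct notifier *n)
{
	pthread_mutex_lock(&ts->lock);
	assert(ts->active_walkers > 0);
	if (--ts->active_walkers == 0 && n->has_deferred) {
		n->start = n->deferred_start;
		n->last = n->deferred_last;
		n->has_deferred = false;
	}
	pthread_mutex_unlock(&ts->lock);
}
```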
>> +/**
>> + * mmu_interval_notifier_update - Update interval notifier end
>> + * @mni: Interval notifier to update
>> + * @start: New starting virtual address to monitor
>> + * @length: New length of the range to monitor
>> + *
>> + * This function updates the range being monitored.
>> + * If there is no release() function defined, the call will wait for the
>> + * update to finish before returning.
>> + */
>> +int mmu_interval_notifier_update(struct mmu_interval_notifier *mni,
>> + unsigned long start, unsigned long length)
>> +{
>
> Update should probably be its own patch
>
> Jason

OK.
Thanks for the review.