From: Davidlohr Bueso <dave@stgolabs.net>
To: Matthew Wilcox <willy@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>,
Waiman Long <longman@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>, Will Deacon <will.deacon@arm.com>,
Alexander Viro <viro@zeniv.linux.org.uk>,
linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-mm@kvack.org
Subject: Re: [PATCH 5/5] hugetlbfs: Limit wait time when trying to share huge PMD
Date: Wed, 11 Sep 2019 21:40:02 -0700
Message-ID: <20190912044002.xp3c7jbpbmq4dbz6@linux-p48b>
In-Reply-To: <20190912034143.GJ29434@bombadil.infradead.org>
On Wed, 11 Sep 2019, Matthew Wilcox wrote:
>On Wed, Sep 11, 2019 at 08:26:52PM -0700, Mike Kravetz wrote:
>> All this got me wondering if we really need to take i_mmap_rwsem in write
>> mode here. We are not changing the tree, only traversing it looking for
>> a suitable vma.
>>
>> Unless I am missing something, the hugetlb code only ever takes the semaphore
>> in write mode; never read. Could this have been the result of changing the
>> tree semaphore to read/write? Instead of analyzing all the code, the easiest
>> and safest thing would have been to take all accesses in write mode.
>
>I was wondering the same thing. It was changed here:
>
>commit 83cde9e8ba95d180eaefefe834958fbf7008cf39
>Author: Davidlohr Bueso <dave@stgolabs.net>
>Date: Fri Dec 12 16:54:21 2014 -0800
>
> mm: use new helper functions around the i_mmap_mutex
>
> Convert all open coded mutex_lock/unlock calls to the
> i_mmap_[lock/unlock]_write() helpers.
>
>and a subsequent patch said:
>
> This conversion is straightforward. For now, all users take the write
> lock.
>
>There were subsequent patches which changed a few places
>c8475d144abb1e62958cc5ec281d2a9e161c1946
>1acf2e040721564d579297646862b8ea3dd4511b
>d28eb9c861f41aa2af4cfcc5eeeddff42b13d31e
>874bfcaf79e39135cd31e1cfc9265cf5222d1ec3
>3dec0ba0be6a532cac949e02b853021bf6d57dad
>
>but I don't know why this one wasn't changed.
I cannot recall why huge_pmd_share() was not changed along with the other
callers that don't modify the interval tree. Looking at the function,
I agree that the lock could be taken in shared (read) mode here. In fact,
this lock is much less involved than its anon_vma counterpart, last I
checked (perhaps with the exception of take_rmap_locks()).
>
>(I was also wondering about caching a potentially sharable page table
>in the address_space to avoid having to walk the VMA tree at all if that
>one happened to be sharable).
I also think that the right solution is within the mm instead of adding
a new api to rwsem and the extra complexity/overhead to osq _just_ for this
case. We've managed to not need timeout extensions in our locking primitives
thus far, which is a good thing imo.
Thanks,
Davidlohr