From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
To: Mike Kravetz <mike.kravetz@oracle.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Michal Hocko <mhocko@kernel.org>, Hugh Dickins <hughd@google.com>,
Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
"Aneesh Kumar K . V" <aneesh.kumar@linux.vnet.ibm.com>,
Andrea Arcangeli <aarcange@redhat.com>,
"Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>,
Davidlohr Bueso <dave@stgolabs.net>,
Prakash Sangappa <prakash.sangappa@oracle.com>,
Andrew Morton <akpm@linux-foundation.org>,
stable@vger.kernel.org
Subject: Re: [PATCH 2/3] hugetlbfs: Use i_mmap_rwsem to fix page fault/truncate race
Date: Mon, 17 Dec 2018 15:55:28 +0530
Message-ID: <27f8893b-57b3-088d-2d48-9e8acc5987bd@linux.ibm.com>
In-Reply-To: <20181203200850.6460-3-mike.kravetz@oracle.com>
On 12/4/18 1:38 AM, Mike Kravetz wrote:
> hugetlbfs page faults can race with truncate and hole punch operations.
> Current code in the page fault path attempts to handle this by 'backing
> out' operations if we encounter the race. One obvious omission in the
> current code is removing a page newly added to the page cache. This is
> pretty straightforward to address, but there is a more subtle and
> difficult issue of backing out hugetlb reservations. To handle this
> correctly, the 'reservation state' before page allocation needs to be
> noted so that it can be properly backed out. There are four distinct
> possibilities for reservation state: shared/reserved, shared/no-resv,
> private/reserved and private/no-resv. Backing out a reservation may
> require memory allocation which could fail so that needs to be taken
> into account as well.
>
> Instead of writing the required complicated code for this rare
> occurrence, just eliminate the race. i_mmap_rwsem is now held in read
> mode for the duration of page fault processing. Hold i_mmap_rwsem
> longer in truncation and hole punch code to cover the call to
> remove_inode_hugepages.
>
> Cc: <stable@vger.kernel.org>
> Fixes: ebed4bfc8da8 ("hugetlb: fix absurd HugePages_Rsvd")
> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
> ---
> fs/hugetlbfs/inode.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> index 32920a10100e..3244147fc42b 100644
> --- a/fs/hugetlbfs/inode.c
> +++ b/fs/hugetlbfs/inode.c
> @@ -505,8 +505,8 @@ static int hugetlb_vmtruncate(struct inode *inode, loff_t offset)
>  	i_mmap_lock_write(mapping);
>  	if (!RB_EMPTY_ROOT(&mapping->i_mmap.rb_root))
>  		hugetlb_vmdelete_list(&mapping->i_mmap, pgoff, 0);
> -	i_mmap_unlock_write(mapping);
>  	remove_inode_hugepages(inode, offset, LLONG_MAX);
> +	i_mmap_unlock_write(mapping);
>  	return 0;
>  }

We used to do, in remove_inode_hugepages():

	mutex_lock(&hugetlb_fault_mutex_table[hash]);
	i_mmap_lock_write(mapping);
	hugetlb_vmdelete_list(&mapping->i_mmap,
	i_mmap_unlock_write(mapping);

Did we change the lock ordering with this patch?
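
To make the ordering question concrete, here is a minimal user-space sketch (not kernel code, and not part of the patch). It records which lock was taken while another was held and reports when the opposite nesting has already been seen, which is the classic ABBA pattern. The enum values are named after the two locks discussed above; `lock_class`, `nested_under` and `record_nesting()` are hypothetical names invented for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock classes, named after the two locks in this thread. */
enum lock_class { FAULT_MUTEX, I_MMAP_RWSEM, NR_LOCK_CLASSES };

/* nested_under[a][b] is true once some path took b while holding a. */
static bool nested_under[NR_LOCK_CLASSES][NR_LOCK_CLASSES];

/*
 * Record that 'inner' was acquired while 'outer' was held.
 * Returns false if the reverse nesting was recorded earlier,
 * i.e. the two paths disagree on lock order.
 */
static bool record_nesting(enum lock_class outer, enum lock_class inner)
{
	nested_under[outer][inner] = true;
	return !nested_under[inner][outer];
}
```

Recording the old nesting first, `record_nesting(FAULT_MUTEX, I_MMAP_RWSEM)`, returns true. Recording the nesting that results from holding i_mmap_rwsem across the remove_inode_hugepages() call, `record_nesting(I_MMAP_RWSEM, FAULT_MUTEX)`, then returns false, flagging the inversion asked about here.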
>
> @@ -540,8 +540,8 @@ static long hugetlbfs_punch_hole(struct inode *inode, loff_t offset, loff_t len)
>  			hugetlb_vmdelete_list(&mapping->i_mmap,
>  						hole_start >> PAGE_SHIFT,
>  						hole_end >> PAGE_SHIFT);
> -		i_mmap_unlock_write(mapping);
>  		remove_inode_hugepages(inode, hole_start, hole_end);
> +		i_mmap_unlock_write(mapping);
>  		inode_unlock(inode);
>  	}
>
-aneesh

Thread overview: 17+ messages
2018-12-03 20:08 [PATCH 0/3] hugetlbfs: use i_mmap_rwsem for better synchronization Mike Kravetz
2018-12-03 20:08 ` [PATCH 1/3] hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization Mike Kravetz
2018-12-04 13:54 ` Sasha Levin
2018-12-03 20:08 ` [PATCH 2/3] hugetlbfs: Use i_mmap_rwsem to fix page fault/truncate race Mike Kravetz
2018-12-04 13:54 ` Sasha Levin
2018-12-17 10:25 ` Aneesh Kumar K.V [this message]
2018-12-17 18:42 ` Mike Kravetz
2018-12-18 0:17 ` Mike Kravetz
2018-12-18 22:10 ` Andrew Morton
2018-12-18 22:34 ` Mike Kravetz
2019-06-14 21:56 ` Sasha Levin
2019-06-14 23:33 ` Mike Kravetz
2019-06-15 22:38 ` Sasha Levin
2018-12-03 20:08 ` [PATCH 3/3] hugetlbfs: remove unnecessary code after i_mmap_rwsem synchronization Mike Kravetz
2018-12-04 13:54 ` Sasha Levin
2018-12-17 10:34 ` Aneesh Kumar K.V
2018-12-14 21:22 ` [PATCH 0/3] hugetlbfs: use i_mmap_rwsem for better synchronization Andrew Morton