From: Hugh Dickins <hughd@google.com>
To: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>,
Ning Qu <quning@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH 19/24] huge tmpfs: disband split huge pmds on race or memory failure
Date: Fri, 20 Feb 2015 20:22:21 -0800 (PST) [thread overview]
Message-ID: <alpine.LSU.2.11.1502202020580.14414@eggly.anvils> (raw)
In-Reply-To: <alpine.LSU.2.11.1502201941340.14414@eggly.anvils>
Andres L-C has pointed out that the single-page unmap_mapping_range()
fallback in truncate_inode_page() cannot protect against the case when
a huge page was faulted in after the full-range unmap_mapping_range():
because page_mapped(page) checks tail page's mapcount, not the head's.
So, there's a danger that hole-punching (and maybe even truncation)
can free pages while they are mapped into userspace with a huge pmd.
And I don't believe that the CVE-2014-4171 protection in shmem_fault()
can fully protect from this, although it does make it much harder.
Fix that by adding a duplicate single-page unmap_mapping_range()
into shmem_disband_hugeteam() (called when punching or truncating
a PageTeam), at the point when we also hold the head's page lock
(without which there would still be races): which will then split
all huge pmd mappings covering the page into team pte mappings.
This is also just what's needed to handle memory_failure() correctly:
provide custom shmem_error_remove_page(), call shmem_disband_hugeteam()
from that before proceeding to generic_error_remove_page(), then this
additional unmap_mapping_range() will remap team by ptes as needed.
(There is an unlikely case that we're racing with another disbander,
or disband didn't get trylock on head page at first: memory_failure()
has almost finished with the page, so it's safe to unlock and relock
before retrying.)
But there is one further change needed in hwpoison_user_mappings():
it must recognize a hugely mapped team before concluding that the
page is not mapped. (And still no support for soft_offline(),
which will have to wait for page migration of teams.)
Signed-off-by: Hugh Dickins <hughd@google.com>
---
mm/memory-failure.c | 8 +++++++-
mm/shmem.c | 27 ++++++++++++++++++++++++++-
2 files changed, 33 insertions(+), 2 deletions(-)
--- thpfs.orig/mm/memory-failure.c 2015-02-08 18:54:22.000000000 -0800
+++ thpfs/mm/memory-failure.c 2015-02-20 19:34:59.047883965 -0800
@@ -44,6 +44,7 @@
#include <linux/rmap.h>
#include <linux/export.h>
#include <linux/pagemap.h>
+#include <linux/pageteam.h>
#include <linux/swap.h>
#include <linux/backing-dev.h>
#include <linux/migrate.h>
@@ -889,6 +890,7 @@ static int hwpoison_user_mappings(struct
int kill = 1, forcekill;
struct page *hpage = *hpagep;
struct page *ppage;
+ bool mapped;
/*
* Here we are interested only in user-mapped pages, so skip any
@@ -903,7 +905,11 @@ static int hwpoison_user_mappings(struct
* This check implies we don't kill processes if their pages
* are in the swap cache early. Those are always late kills.
*/
- if (!page_mapped(hpage))
+ mapped = page_mapped(hpage);
+ if (PageTeam(p) && !PageAnon(p) &&
+ team_hugely_mapped(team_head(p)))
+ mapped = true;
+ if (!mapped)
return SWAP_SUCCESS;
if (PageKsm(p)) {
--- thpfs.orig/mm/shmem.c 2015-02-20 19:34:21.603969581 -0800
+++ thpfs/mm/shmem.c 2015-02-20 19:34:59.051883956 -0800
@@ -603,6 +603,17 @@ static void shmem_disband_hugeteam(struc
page_cache_release(head);
return;
}
+ /*
+ * truncate_inode_page() will unmap page if page_mapped(page),
+ * but there's a race by which the team could be hugely mapped,
+ * with page_mapped(page) saying false. So check here if the
+ * head is hugely mapped, and if so unmap page to remap team.
+ */
+ if (team_hugely_mapped(head)) {
+ unmap_mapping_range(page->mapping,
+ (loff_t)page->index << PAGE_CACHE_SHIFT,
+ PAGE_CACHE_SIZE, 0);
+ }
}
/*
@@ -1216,6 +1227,20 @@ void shmem_truncate_range(struct inode *
}
EXPORT_SYMBOL_GPL(shmem_truncate_range);
+int shmem_error_remove_page(struct address_space *mapping, struct page *page)
+{
+ if (PageTeam(page)) {
+ shmem_disband_hugeteam(page);
+ while (unlikely(PageTeam(page))) {
+ unlock_page(page);
+ cond_resched();
+ lock_page(page);
+ shmem_disband_hugeteam(page);
+ }
+ }
+ return generic_error_remove_page(mapping, page);
+}
+
static int shmem_setattr(struct dentry *dentry, struct iattr *attr)
{
struct inode *inode = dentry->d_inode;
@@ -4031,7 +4056,7 @@ static const struct address_space_operat
#ifdef CONFIG_MIGRATION
.migratepage = migrate_page,
#endif
- .error_remove_page = generic_error_remove_page,
+ .error_remove_page = shmem_error_remove_page,
};
static const struct file_operations shmem_file_operations = {
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2015-02-21 4:22 UTC|newest]
Thread overview: 38+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-02-21 3:49 [PATCH 00/24] huge tmpfs: an alternative approach to THPageCache Hugh Dickins
2015-02-21 3:51 ` [PATCH 01/24] mm: update_lru_size warn and reset bad lru_size Hugh Dickins
2015-02-23 9:30 ` Kirill A. Shutemov
2015-03-23 2:44 ` Hugh Dickins
2015-02-21 3:54 ` [PATCH 02/24] mm: update_lru_size do the __mod_zone_page_state Hugh Dickins
2015-02-21 3:56 ` [PATCH 03/24] mm: use __SetPageSwapBacked and don't ClearPageSwapBacked Hugh Dickins
2015-02-25 10:53 ` Mel Gorman
2015-03-23 3:01 ` Hugh Dickins
2015-02-21 3:58 ` [PATCH 04/24] mm: make page migration's newpage handling more robust Hugh Dickins
2015-02-21 4:00 ` [PATCH 05/24] tmpfs: preliminary minor tidyups Hugh Dickins
2015-02-21 4:01 ` [PATCH 06/24] huge tmpfs: prepare counts in meminfo, vmstat and SysRq-m Hugh Dickins
2015-02-21 4:03 ` [PATCH 07/24] huge tmpfs: include shmem freeholes in available memory counts Hugh Dickins
2015-02-21 4:05 ` [PATCH 08/24] huge tmpfs: prepare huge=N mount option and /proc/sys/vm/shmem_huge Hugh Dickins
2015-02-21 4:06 ` [PATCH 09/24] huge tmpfs: try to allocate huge pages, split into a team Hugh Dickins
2015-02-21 4:07 ` [PATCH 10/24] huge tmpfs: avoid team pages in a few places Hugh Dickins
2015-02-21 4:09 ` [PATCH 11/24] huge tmpfs: shrinker to migrate and free underused holes Hugh Dickins
2015-03-19 16:56 ` Konstantin Khlebnikov
2015-03-23 4:40 ` Hugh Dickins
2015-03-23 12:50 ` Kirill A. Shutemov
2015-03-23 13:50 ` Kirill A. Shutemov
2015-03-24 12:57 ` Kirill A. Shutemov
2015-03-25 0:41 ` Hugh Dickins
2015-02-21 4:11 ` [PATCH 12/24] huge tmpfs: get_unmapped_area align and fault supply huge page Hugh Dickins
2015-02-21 4:12 ` [PATCH 13/24] huge tmpfs: extend get_user_pages_fast to shmem pmd Hugh Dickins
2015-02-21 4:13 ` [PATCH 14/24] huge tmpfs: extend vma_adjust_trans_huge " Hugh Dickins
2015-02-21 4:15 ` [PATCH 15/24] huge tmpfs: rework page_referenced_one and try_to_unmap_one Hugh Dickins
2015-02-21 4:16 ` [PATCH 16/24] huge tmpfs: fix problems from premature exposure of pagetable Hugh Dickins
2015-07-01 10:53 ` Kirill A. Shutemov
2015-02-21 4:18 ` [PATCH 17/24] huge tmpfs: map shmem by huge page pmd or by page team ptes Hugh Dickins
2015-02-21 4:20 ` [PATCH 18/24] huge tmpfs: mmap_sem is unlocked when truncation splits huge pmd Hugh Dickins
2015-02-21 4:22 ` Hugh Dickins [this message]
2015-02-21 4:23 ` [PATCH 20/24] huge tmpfs: use Unevictable lru with variable hpage_nr_pages() Hugh Dickins
2015-02-21 4:25 ` [PATCH 21/24] huge tmpfs: fix Mlocked meminfo, tracking huge and unhuge mlocks Hugh Dickins
2015-02-21 4:27 ` [PATCH 22/24] huge tmpfs: fix Mapped meminfo, tracking huge and unhuge mappings Hugh Dickins
2015-02-21 4:29 ` [PATCH 23/24] kvm: plumb return of hva when resolving page fault Hugh Dickins
2015-02-21 4:31 ` [PATCH 24/24] kvm: teach kvm to map page teams as huge pages Hugh Dickins
2015-02-23 13:48 ` [PATCH 00/24] huge tmpfs: an alternative approach to THPageCache Kirill A. Shutemov
2015-03-23 2:25 ` Hugh Dickins
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=alpine.LSU.2.11.1502202020580.14414@eggly.anvils \
--to=hughd@google.com \
--cc=aarcange@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=kirill.shutemov@linux.intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=quning@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).