Date: Fri, 20 Feb 2015 20:22:21 -0800 (PST)
From: Hugh Dickins
Subject: [PATCH 19/24] huge tmpfs: disband split huge pmds on race or memory failure
To: "Kirill A. Shutemov"
Cc: Andrea Arcangeli, Ning Qu, Andrew Morton,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org

Andres L-C has pointed out that the single-page unmap_mapping_range()
fallback in truncate_inode_page() cannot protect against the case when
a huge page was faulted in after the full-range unmap_mapping_range():
because page_mapped(page) checks the tail page's mapcount, not the
head's.  So there's a danger that hole-punching (and maybe even
truncation) can free pages while they are mapped into userspace with
a huge pmd.  And I don't believe that the CVE-2014-4171 protection in
shmem_fault() can fully protect against this, although it does make it
much harder.

Fix that by adding a duplicate single-page unmap_mapping_range() into
shmem_disband_hugeteam() (called when punching or truncating a
PageTeam), at the point when we also hold the head's page lock (without
which there would still be races): this will then split all huge pmd
mappings covering the page into team pte mappings.

This is also just what's needed to handle memory_failure() correctly:
provide a custom shmem_error_remove_page(), which calls
shmem_disband_hugeteam() before proceeding to
generic_error_remove_page(); then this additional unmap_mapping_range()
will remap the team by ptes as needed.

(There is an unlikely case that we're racing with another disbander,
or disband didn't get trylock on head page at first: memory_failure()
has almost finished with the page, so it's safe to unlock and relock
before retrying.)

But there is one further change needed in hwpoison_user_mappings():
it must recognize a hugely mapped team before concluding that the page
is not mapped.  (And still no support for soft_offline(), which will
have to wait for page migration of teams.)

Signed-off-by: Hugh Dickins
---
 mm/memory-failure.c |    8 +++++++-
 mm/shmem.c          |   27 ++++++++++++++++++++++++++-
 2 files changed, 33 insertions(+), 2 deletions(-)

--- thpfs.orig/mm/memory-failure.c	2015-02-08 18:54:22.000000000 -0800
+++ thpfs/mm/memory-failure.c	2015-02-20 19:34:59.047883965 -0800
@@ -44,6 +44,7 @@
 #include
 #include
 #include
+#include <linux/pageteam.h>
 #include
 #include
 #include
@@ -889,6 +890,7 @@ static int hwpoison_user_mappings(struct
 	int kill = 1, forcekill;
 	struct page *hpage = *hpagep;
 	struct page *ppage;
+	bool mapped;
 
 	/*
 	 * Here we are interested only in user-mapped pages, so skip any
@@ -903,7 +905,11 @@ static int hwpoison_user_mappings(struct
 	 * This check implies we don't kill processes if their pages
 	 * are in the swap cache early.  Those are always late kills.
 	 */
-	if (!page_mapped(hpage))
+	mapped = page_mapped(hpage);
+	if (PageTeam(p) && !PageAnon(p) &&
+	    team_hugely_mapped(team_head(p)))
+		mapped = true;
+	if (!mapped)
 		return SWAP_SUCCESS;
 
 	if (PageKsm(p)) {
--- thpfs.orig/mm/shmem.c	2015-02-20 19:34:21.603969581 -0800
+++ thpfs/mm/shmem.c	2015-02-20 19:34:59.051883956 -0800
@@ -603,6 +603,17 @@ static void shmem_disband_hugeteam(struc
 		page_cache_release(head);
 		return;
 	}
+	/*
+	 * truncate_inode_page() will unmap page if page_mapped(page),
+	 * but there's a race by which the team could be hugely mapped,
+	 * with page_mapped(page) saying false.  So check here if the
+	 * head is hugely mapped, and if so unmap page to remap team.
+	 */
+	if (team_hugely_mapped(head)) {
+		unmap_mapping_range(page->mapping,
+			(loff_t)page->index << PAGE_CACHE_SHIFT,
+			PAGE_CACHE_SIZE, 0);
+	}
 }
 
 /*
@@ -1216,6 +1227,20 @@ void shmem_truncate_range(struct inode *
 }
 EXPORT_SYMBOL_GPL(shmem_truncate_range);
 
+int shmem_error_remove_page(struct address_space *mapping, struct page *page)
+{
+	if (PageTeam(page)) {
+		shmem_disband_hugeteam(page);
+		while (unlikely(PageTeam(page))) {
+			unlock_page(page);
+			cond_resched();
+			lock_page(page);
+			shmem_disband_hugeteam(page);
+		}
+	}
+	return generic_error_remove_page(mapping, page);
+}
+
 static int shmem_setattr(struct dentry *dentry, struct iattr *attr)
 {
 	struct inode *inode = dentry->d_inode;
@@ -4031,7 +4056,7 @@ static const struct address_space_operat
 #ifdef CONFIG_MIGRATION
 	.migratepage	= migrate_page,
 #endif
-	.error_remove_page = generic_error_remove_page,
+	.error_remove_page = shmem_error_remove_page,
 };
 
 static const struct file_operations shmem_file_operations = {
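
For readers without the rest of this series in hand, the new check in
hwpoison_user_mappings() amounts to the predicate below.  This is a
sketch only, not part of the patch: poisoned_page_mapped() is a made-up
name, and PageTeam(), team_head() and team_hugely_mapped() are the huge
tmpfs helpers introduced earlier in the series.

/*
 * Sketch only, not part of the patch: how hwpoison_user_mappings()
 * now decides whether the poisoned page is mapped into userspace.
 * page_mapped() on a tail page sees only that page's own mapcount;
 * a huge tmpfs team may instead be mapped by a huge pmd accounted
 * at the team head, so the head must be asked as well.
 */
static bool poisoned_page_mapped(struct page *p, struct page *hpage)
{
	if (page_mapped(hpage))		/* ordinary mapcount check */
		return true;
	return PageTeam(p) && !PageAnon(p) &&
	       team_hugely_mapped(team_head(p));
}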
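
And the single-page unmap added to shmem_disband_hugeteam() reduces to
the byte-range calculation below: again just a sketch with a made-up
helper name, using the PAGE_CACHE_SHIFT/PAGE_CACHE_SIZE constants of
this era (4kB on x86).

/*
 * Sketch only: unmap just this one small page of the file.  Even a
 * single-page unmap_mapping_range() must split any huge pmd covering
 * the page, leaving the rest of the team mapped by team ptes.
 */
static void unmap_one_team_page(struct page *page)
{
	loff_t holebegin = (loff_t)page->index << PAGE_CACHE_SHIFT;

	/* even_cows 0: private COWed copies need not be unmapped */
	unmap_mapping_range(page->mapping, holebegin, PAGE_CACHE_SIZE, 0);
}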