From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org,
	Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	Michal Hocko <mhocko@suse.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Punit Agrawal <punit.agrawal@arm.com>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Anshuman Khandual <khandual@linux.vnet.ibm.com>,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Subject: [PATCH 4.14 35/53] mm: hwpoison: disable memory error handling on 1GB hugepage
Date: Tue, 10 Jul 2018 20:25:11 +0200
Message-ID: <20180710182500.819612094@linuxfoundation.org>
In-Reply-To: <20180710182458.736721865@linuxfoundation.org>

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>

commit 31286a8484a85e8b4e91ddb0f5415aee8a416827 upstream.

Recently the following BUG was reported:

    Injecting memory failure for pfn 0x3c0000 at process virtual address 0x7fe300000000
    Memory failure: 0x3c0000: recovery action for huge page: Recovered
    BUG: unable to handle kernel paging request at ffff8dfcc0003000
    IP: gup_pgd_range+0x1f0/0xc20
    PGD 17ae72067 P4D 17ae72067 PUD 0
    Oops: 0000 [#1] SMP PTI
    ...
    CPU: 3 PID: 5467 Comm: hugetlb_1gb Not tainted 4.15.0-rc8-mm1-abc+ #3
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-1.fc25 04/01/2014

You can easily reproduce this by calling madvise(MADV_HWPOISON) twice on
a 1GB hugepage.  This happens because get_user_pages_fast() is not aware
of the migration entry on the pud that was created by the first
madvise() call.
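
A minimal sketch of such a reproducer follows.  It is not part of the
original report: the function names, the choice of an anonymous
MAP_HUGETLB mapping, and the one-base-page madvise() length are all
assumptions made for illustration.  It needs CAP_SYS_ADMIN, a kernel
with CONFIG_MEMORY_FAILURE, and a reserved 1GB hugepage, and it really
poisons memory, so it should only ever be run in a disposable VM.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Fallbacks for older libc headers (assumed Linux UAPI values). */
#ifndef MAP_HUGE_SHIFT
#define MAP_HUGE_SHIFT 26
#endif
#ifndef MAP_HUGE_1GB
#define MAP_HUGE_1GB (30 << MAP_HUGE_SHIFT)	/* log2(1GB) == 30 */
#endif
#ifndef MADV_HWPOISON
#define MADV_HWPOISON 100
#endif

#define SZ_1GB ((size_t)1 << 30)

/* mmap() flags for an anonymous, private 1GB hugetlb mapping. */
int hugetlb_1gb_flags(void)
{
	return MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB | MAP_HUGE_1GB;
}

/* Inject a memory error at addr twice; returns how many calls failed. */
int poison_twice(void *addr, size_t len)
{
	int failed = 0;

	assert(addr != NULL && len > 0);
	for (int i = 0; i < 2; i++) {
		if (madvise(addr, len, MADV_HWPOISON) != 0) {
			perror("madvise(MADV_HWPOISON)");
			failed++;
		}
	}
	return failed;
}

/* Hypothetical driver: map one 1GB hugepage, fault it in, poison it twice. */
int repro_1gb_hwpoison(void)
{
	char *p = mmap(NULL, SZ_1GB, PROT_READ | PROT_WRITE,
		       hugetlb_1gb_flags(), -1, 0);

	if (p == MAP_FAILED) {
		perror("mmap");
		return -1;
	}
	p[0] = 1;	/* touch the mapping so the hugepage is populated */
	return poison_twice(p, (size_t)sysconf(_SC_PAGESIZE));
}
```

Calling repro_1gb_hwpoison() from main() on an affected kernel would be
expected to hit the BUG above on the second madvise(); the sketch has
not been validated against the exact kernel in the report.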

I think that conversion to a pud-aligned migration entry is working, but
other MM code that walks the page tables isn't prepared for it.  We need
some time and effort to make all this work properly, so this patch
avoids the reported bug by simply disabling error handling for 1GB
hugepages.
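
The size test that decides whether error handling is skipped can be
modelled in userspace as below.  This is only an illustration of the
comparison against PMD_SIZE; the sketch_ names are hypothetical, and
the 2MB value is the x86-64 PMD_SIZE, assumed here because the real
constant is architecture-specific.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed x86-64 value; the kernel's PMD_SIZE is architecture-specific. */
#define SKETCH_PMD_SIZE (UINT64_C(1) << 21)	/* 2MB */

/* Returns true if memory error handling would be skipped for this size. */
bool sketch_hwpoison_ignored(uint64_t huge_page_size)
{
	assert(huge_page_size > 0);
	return huge_page_size > SKETCH_PMD_SIZE;
}
```

Under these assumptions a 2MB hugepage is still handled, while a 1GB
hugepage is larger than PMD_SIZE and is therefore ignored.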

[n-horiguchi@ah.jp.nec.com: v2]
  Link: http://lkml.kernel.org/r/1517284444-18149-1-git-send-email-n-horiguchi@ah.jp.nec.com
Link: http://lkml.kernel.org/r/1517207283-15769-1-git-send-email-n-horiguchi@ah.jp.nec.com
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Punit Agrawal <punit.agrawal@arm.com>
Tested-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/linux/mm.h  |    1 +
 mm/memory-failure.c |   16 ++++++++++++++++
 2 files changed, 17 insertions(+)

--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2549,6 +2549,7 @@ enum mf_action_page_type {
 	MF_MSG_POISONED_HUGE,
 	MF_MSG_HUGE,
 	MF_MSG_FREE_HUGE,
+	MF_MSG_NON_PMD_HUGE,
 	MF_MSG_UNMAP_FAILED,
 	MF_MSG_DIRTY_SWAPCACHE,
 	MF_MSG_CLEAN_SWAPCACHE,
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -508,6 +508,7 @@ static const char * const action_page_ty
 	[MF_MSG_POISONED_HUGE]		= "huge page already hardware poisoned",
 	[MF_MSG_HUGE]			= "huge page",
 	[MF_MSG_FREE_HUGE]		= "free huge page",
+	[MF_MSG_NON_PMD_HUGE]		= "non-pmd-sized huge page",
 	[MF_MSG_UNMAP_FAILED]		= "unmapping failed page",
 	[MF_MSG_DIRTY_SWAPCACHE]	= "dirty swapcache page",
 	[MF_MSG_CLEAN_SWAPCACHE]	= "clean swapcache page",
@@ -1090,6 +1091,21 @@ static int memory_failure_hugetlb(unsign
 		return 0;
 	}
 
+	/*
+	 * TODO: hwpoison for pud-sized hugetlb doesn't work right now, so
+	 * simply disable it. In order to make it work properly, we need
+	 * make sure that:
+	 *  - conversion of a pud that maps an error hugetlb into hwpoison
+	 *    entry properly works, and
+	 *  - other mm code walking over page table is aware of pud-aligned
+	 *    hwpoison entries.
+	 */
+	if (huge_page_size(page_hstate(head)) > PMD_SIZE) {
+		action_result(pfn, MF_MSG_NON_PMD_HUGE, MF_IGNORED);
+		res = -EBUSY;
+		goto out;
+	}
+
 	if (!hwpoison_user_mappings(p, pfn, trapno, flags, &head)) {
 		action_result(pfn, MF_MSG_UNMAP_FAILED, MF_IGNORED);
 		res = -EBUSY;


