All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH 1/1] x86/mm: Forbid the zero page once it has uncorrectable errors
  2022-04-20 21:00 [PATCH 1/1] x86/mm: Forbid the zero page once it has uncorrectable errors Qiuxu Zhuo
@ 2022-04-20 13:39 ` Dave Hansen
  2022-04-21  7:53   ` Zhuo, Qiuxu
  0 siblings, 1 reply; 4+ messages in thread
From: Dave Hansen @ 2022-04-20 13:39 UTC (permalink / raw)
  To: Qiuxu Zhuo, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Tony Luck, Dave Hansen, Andy Lutomirski, Peter Zijlstra,
	Andrew Morton, Naoya Horiguchi
  Cc: maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	open list:HWPOISON MEMORY FAILURE HANDLING,
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT)

On 4/20/22 14:00, Qiuxu Zhuo wrote:
> Accessing to the zero page with uncorrectable errors causes unexpected
> machine checks. So forbid the zero page from being used by user-space
> processes once it has uncorrectable errors. Processes that have already
> mapped the zero page with uncorrectable errors will get killed once they
> access to it. New processes will not use the zero page.

There are lots of pages which are entirely fatal if they have
uncorrectable errors.  On my laptop, if there were an error, there is a
0.00000596% chance it will be in the zero page.

Why is this worth special casing this one page?

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH 1/1] x86/mm: Forbid the zero page once it has uncorrectable errors
@ 2022-04-20 21:00 Qiuxu Zhuo
  2022-04-20 13:39 ` Dave Hansen
  0 siblings, 1 reply; 4+ messages in thread
From: Qiuxu Zhuo @ 2022-04-20 21:00 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Tony Luck,
	Dave Hansen, Andy Lutomirski, Peter Zijlstra, Andrew Morton,
	Naoya Horiguchi
  Cc: Qiuxu Zhuo, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	open list:HWPOISON MEMORY FAILURE HANDLING,
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT)

Accessing to the zero page with uncorrectable errors causes unexpected
machine checks. So forbid the zero page from being used by user-space
processes once it has uncorrectable errors. Processes that have already
mapped the zero page with uncorrectable errors will get killed once they
access to it. New processes will not use the zero page.

Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
---
1) Processes that have already mapped the zero page with uncorrectable
   errors could be recovered by attaching a new zeroed anonymous page.
   But this may need to walk all page tables for all such processes to
   update the PTEs pointing to the zero page. Looks like a big modification
   for a rare problem?

2) Some validation tests that sometimes pick up the virtual address
   mapped to the zero page to inject errors get themself killed and can't
   run anymore until reboot the system. To avoid injecting errors to the
   zero page, please refer to the path:

   https://lore.kernel.org/all/20220419211921.2230752-1-tony.luck@intel.com/

 arch/x86/include/asm/pgtable.h | 3 +++
 arch/x86/kernel/cpu/mce/core.c | 6 ++++++
 arch/x86/mm/pgtable.c          | 2 ++
 mm/memory-failure.c            | 2 +-
 4 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 62ab07e24aef..d4b8693452e5 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -55,6 +55,9 @@ extern unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)]
 	__visible;
 #define ZERO_PAGE(vaddr) ((void)(vaddr),virt_to_page(empty_zero_page))
 
+extern bool __read_mostly forbids_zeropage;
+#define mm_forbids_zeropage(x)	forbids_zeropage
+
 extern spinlock_t pgd_lock;
 extern struct list_head pgd_list;
 
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 981496e6bc0e..5b3af27cc8fa 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -44,6 +44,7 @@
 #include <linux/sync_core.h>
 #include <linux/task_work.h>
 #include <linux/hardirq.h>
+#include <linux/pgtable.h>
 
 #include <asm/intel-family.h>
 #include <asm/processor.h>
@@ -1370,6 +1371,11 @@ static void queue_task_work(struct mce *m, char *msg, void (*func)(struct callba
 	if (count > 1)
 		return;
 
+	if (is_zero_pfn(current->mce_addr >> PAGE_SHIFT) && !forbids_zeropage) {
+		pr_err("Forbid user-space process from using zero page\n");
+		forbids_zeropage = true;
+	}
+
 	task_work_add(current, &current->mce_kill_me, TWA_RESUME);
 }
 
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index 3481b35cb4ec..c0c56bce3acc 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -28,6 +28,8 @@ void paravirt_tlb_remove_table(struct mmu_gather *tlb, void *table)
 
 gfp_t __userpte_alloc_gfp = GFP_PGTABLE_USER | PGTABLE_HIGHMEM;
 
+bool __read_mostly forbids_zeropage;
+
 pgtable_t pte_alloc_one(struct mm_struct *mm)
 {
 	return __pte_alloc_one(mm, __userpte_alloc_gfp);
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index dcb6bb9cf731..30ad7bdeb89f 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1744,7 +1744,7 @@ int memory_failure(unsigned long pfn, int flags)
 		goto unlock_mutex;
 	}
 
-	if (TestSetPageHWPoison(p)) {
+	if (TestSetPageHWPoison(p) || is_zero_pfn(pfn)) {
 		pr_err("Memory failure: %#lx: already hardware poisoned\n",
 			pfn);
 		res = -EHWPOISON;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* RE: [PATCH 1/1] x86/mm: Forbid the zero page once it has uncorrectable errors
  2022-04-20 13:39 ` Dave Hansen
@ 2022-04-21  7:53   ` Zhuo, Qiuxu
  2022-04-21  8:50     ` David Hildenbrand
  0 siblings, 1 reply; 4+ messages in thread
From: Zhuo, Qiuxu @ 2022-04-21  7:53 UTC (permalink / raw)
  To: Hansen, Dave, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Luck, Tony, Dave Hansen, Lutomirski, Andy, Peter Zijlstra,
	Andrew Morton, Naoya Horiguchi
  Cc: maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	open list:HWPOISON MEMORY FAILURE HANDLING,
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT)

> From: Hansen, Dave <dave.hansen@intel.com>
> ...
> Subject: Re: [PATCH 1/1] x86/mm: Forbid the zero page once it has
> uncorrectable errors
> ...
> There are lots of pages which are entirely fatal if they have uncorrectable errors.
> On my laptop, if there were an error, there is a 0.00000596% chance it will be in
> the zero page.
> 
> Why is this worth special casing this one page?

Hi Dave,

   Yes, this is a rare problem. Just feel that the fix is simple, so post it here to see whether you'll consider it 😊.

Thanks!
-Qiuxu


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH 1/1] x86/mm: Forbid the zero page once it has uncorrectable errors
  2022-04-21  7:53   ` Zhuo, Qiuxu
@ 2022-04-21  8:50     ` David Hildenbrand
  0 siblings, 0 replies; 4+ messages in thread
From: David Hildenbrand @ 2022-04-21  8:50 UTC (permalink / raw)
  To: Zhuo, Qiuxu, Hansen, Dave, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Luck, Tony, Dave Hansen, Lutomirski, Andy,
	Peter Zijlstra, Andrew Morton, Naoya Horiguchi,
	Christian Borntraeger
  Cc: maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	open list:HWPOISON MEMORY FAILURE HANDLING,
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT)

On 21.04.22 09:53, Zhuo, Qiuxu wrote:
>> From: Hansen, Dave <dave.hansen@intel.com>
>> ...
>> Subject: Re: [PATCH 1/1] x86/mm: Forbid the zero page once it has
>> uncorrectable errors
>> ...
>> There are lots of pages which are entirely fatal if they have uncorrectable errors.
>> On my laptop, if there were an error, there is a 0.00000596% chance it will be in
>> the zero page.
>>
>> Why is this worth special casing this one page?
> 
> Hi Dave,
> 
>    Yes, this is a rare problem. Just feel that the fix is simple, so post it here to see whether you'll consider it 😊.

Just some background information.

mm_forbids_zeropage() exists for the sole purpose of s390x/kvm not being
able to use the shared zeropage for a KVM guest because the storage keys
associated with the shared zeropage could result in trouble. So
s390x/kvm has to make sure that no shared zeropage
is mapped into the process.

See fa41ba0d08de ("s390/mm: avoid empty zero pages for KVM guests to
avoid postcopy hangs") for details.

@Christian

a) with keyless guests we could actually use the shared zeropage because
the guest cannot possibly enable storage keys, correct?

b) Why is there no mm_forbids_zeropage() check in mfill_zeropage_pte()?
Maybe I'm missing something, but looks like we can still place the
shared zeropage into a KVM guest via uffd.


In general, there are more place that will use the shared zeropage, most
notably, fs/dax.c  will place the shared zeropage for holes and would
still use it on x86-64. IIRC, s390x doesn't use it.

/proc/vmcore will map the zeropage to user space for areas that are not
RAM, so you could still stumble over it there and trigger a MCE.

Last but not least, the huge shared zeropage would suffer from similar
problems.


Also, I wonder if the generic code change in mm/memory-failure.c is
correct as it touches common code and you only mess with the x86
zeropage. But I did not look into the details.


So the code here at least isn't complete. So I'm not convinced this
change is worth it.


-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2022-04-21  8:50 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-20 21:00 [PATCH 1/1] x86/mm: Forbid the zero page once it has uncorrectable errors Qiuxu Zhuo
2022-04-20 13:39 ` Dave Hansen
2022-04-21  7:53   ` Zhuo, Qiuxu
2022-04-21  8:50     ` David Hildenbrand

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.