[PATCH 00/11] mm: Teach memory_failure() about ZONE_DEVICE pages

From: Dan Williams @ 2018-05-22 14:39 UTC
  To: linux-nvdimm
  Cc: linux-edac, Tony Luck, Borislav Petkov, stable, Jan Kara,
	H. Peter Anvin, x86, Thomas Gleixner, Andi Kleen,
	Christoph Hellwig, Ross Zwisler, Matthew Wilcox, Ingo Molnar,
	Michal Hocko, Naoya Horiguchi, Jérôme Glisse,
	Wu Fengguang, Souptick Joarder, linux-mm, linux-fsdevel,
	tony.luck

As it stands, memory_failure() gets thoroughly confused by dev_pagemap
backed mappings. The recovery code has specific enabling for several
possible page states and needs new enabling to handle poison in dax
mappings.

To support reliable reverse mapping of user space addresses, add new
locking in the fsdax implementation to prevent races between
page-address_space disassociation events and the rmap performed in the
memory_failure() path. Additionally, since dev_pagemap pages are hidden
from the page allocator, add a mechanism to determine the size of the
mapping that encompasses a given poisoned pfn. Lastly, since pmem errors
can be repaired, change mce_unmap_kpfn(), the protection against
speculative access to poison, to be reversible and otherwise allow
ongoing access from the kernel.
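
To illustrate the "size of the mapping that encompasses a given poisoned
pfn" step, here is a minimal userspace sketch of the arithmetic only. It
is not code from this series: the struct, the helper name
poison_range_for(), and the SKETCH_* shift constants (x86-64 values for
4 KiB, 2 MiB and 1 GiB mappings) are assumptions made for illustration,
and the sketch takes the mapping shift as an input rather than showing
how the kernel would discover it.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed x86-64 shifts for 4 KiB, 2 MiB and 1 GiB mappings. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PMD_SHIFT  21
#define SKETCH_PUD_SHIFT  30

/* Hypothetical description of the range affected by one poisoned pfn. */
struct poison_range {
	uint64_t start_pfn;	/* first pfn covered by the mapping */
	uint64_t nr_pfns;	/* number of base pages in the mapping */
	uint64_t size;		/* mapping size in bytes */
};

/*
 * Hypothetical helper: expand a single poisoned pfn to the naturally
 * aligned mapping that contains it, given the shift of that mapping.
 */
static struct poison_range poison_range_for(uint64_t pfn,
					    unsigned int mapping_shift)
{
	struct poison_range r;
	uint64_t nr;

	/* A mapping cannot be smaller than one base page or wider than 64 bits. */
	assert(mapping_shift >= SKETCH_PAGE_SHIFT && mapping_shift < 64);

	nr = 1ULL << (mapping_shift - SKETCH_PAGE_SHIFT);
	r.nr_pfns = nr;
	r.start_pfn = pfn & ~(nr - 1);	/* round down to the mapping boundary */
	r.size = 1ULL << mapping_shift;
	return r;
}
```

For example, calling poison_range_for(0x12345, SKETCH_PMD_SHIFT) describes
a 2 MiB mapping of 512 base pages starting at pfn 0x12200, so a consumer
of the error would treat the whole 2 MiB range as affected rather than a
single 4 KiB page.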

---

Dan Williams (11):
      device-dax: convert to vmf_insert_mixed and vm_fault_t
      device-dax: cleanup vm_fault de-reference chains
      device-dax: enable page_mapping()
      device-dax: set page->index
      filesystem-dax: set page->index
      filesystem-dax: perform __dax_invalidate_mapping_entry() under the page lock
      mm, madvise_inject_error: fix page count leak
      x86, memory_failure: introduce {set,clear}_mce_nospec()
      mm, memory_failure: pass page size to kill_proc()
      mm, memory_failure: teach memory_failure() about dev_pagemap pages
      libnvdimm, pmem: restore page attributes when clearing errors

 arch/x86/include/asm/set_memory.h         |   29 ++++++
 arch/x86/kernel/cpu/mcheck/mce-internal.h |   15 ---
 arch/x86/kernel/cpu/mcheck/mce.c          |   38 +-------
 drivers/dax/device.c                      |   91 ++++++++++++--------
 drivers/nvdimm/pmem.c                     |   26 ++++++
 drivers/nvdimm/pmem.h                     |   13 +++
 fs/dax.c                                  |  102 ++++++++++++++++++++--
 include/linux/huge_mm.h                   |    5 +
 include/linux/set_memory.h                |   14 +++
 mm/huge_memory.c                          |    4 -
 mm/madvise.c                              |   11 ++
 mm/memory-failure.c                       |  133 +++++++++++++++++++++++++++--
 12 files changed, 370 insertions(+), 111 deletions(-)
