Date: Tue, 12 Jun 2018 14:14:20 -0600
From: Ross Zwisler
To: Dan Williams
Cc: linux-nvdimm@lists.01.org, Jan Kara, Christoph Hellwig,
 Jérôme Glisse, Matthew Wilcox, Naoya Horiguchi, Ross Zwisler,
 linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH v4 11/12] mm, memory_failure: Teach memory_failure() about dev_pagemap pages
Message-ID: <20180612201420.GA12706@linux.intel.com>
References: <152850182079.38390.8280340535691965744.stgit@dwillia2-desk3.amr.corp.intel.com>
 <152850187949.38390.1012249765651998342.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <152850187949.38390.1012249765651998342.stgit@dwillia2-desk3.amr.corp.intel.com>

On Fri, Jun 08, 2018 at 04:51:19PM -0700, Dan Williams wrote:
> mce: Uncorrected hardware memory error in user-access at af34214200
> {1}[Hardware Error]: It has been corrected by h/w and requires no further action
> mce: [Hardware Error]: Machine check events logged
> {1}[Hardware Error]: event severity: corrected
> Memory failure: 0xaf34214: reserved kernel page still referenced by 1 users
> [..]
> Memory failure: 0xaf34214: recovery action for reserved kernel page: Failed
> mce: Memory error not recovered
>
> In contrast to typical memory, dev_pagemap pages may be dax mapped. With
> dax there is no possibility to map in another page dynamically since dax
> establishes 1:1 physical address to file offset associations. Also
> dev_pagemap pages associated with NVDIMM / persistent memory devices can
> internally remap/repair addresses with poison. While memory_failure()
> assumes that it can discard typical poisoned pages and keep them
> unmapped indefinitely, dev_pagemap pages may be returned to service
> after the error is cleared.
>
> Teach memory_failure() to detect and handle MEMORY_DEVICE_HOST
> dev_pagemap pages that have poison consumed by userspace. Mark the
> memory as UC instead of unmapping it completely to allow ongoing access
> via the device driver (nd_pmem). Later, nd_pmem will grow support for
> marking the page back to WB when the error is cleared.
>
> Cc: Jan Kara
> Cc: Christoph Hellwig
> Cc: Jérôme Glisse
> Cc: Matthew Wilcox
> Cc: Naoya Horiguchi
> Cc: Ross Zwisler
> Signed-off-by: Dan Williams
> ---
<>
> +static int memory_failure_dev_pagemap(unsigned long pfn, int flags,
> +		struct dev_pagemap *pgmap)
> +{
> +	const bool unmap_success = true;
> +	unsigned long size;
> +	struct page *page;
> +	LIST_HEAD(tokill);
> +	int rc = -EBUSY;
> +	loff_t start;
> +
> +	/*
> +	 * Prevent the inode from being freed while we are interrogating
> +	 * the address_space, typically this would be handled by
> +	 * lock_page(), but dax pages do not use the page lock.
> +	 */
> +	page = dax_lock_page(pfn);
> +	if (!page)
> +		goto out;
> +
> +	if (hwpoison_filter(page)) {
> +		rc = 0;
> +		goto unlock;
> +	}
> +
> +	switch (pgmap->type) {
> +	case MEMORY_DEVICE_PRIVATE:
> +	case MEMORY_DEVICE_PUBLIC:
> +		/*
> +		 * TODO: Handle HMM pages which may need coordination
> +		 * with device-side memory.
> +		 */
> +		goto unlock;
> +	default:
> +		break;
> +	}
> +
> +	/*
> +	 * If the page is not mapped in userspace then report it as
> +	 * unhandled.
> +	 */
> +	size = dax_mapping_size(page);
> +	if (!size) {
> +		pr_err("Memory failure: %#lx: failed to unmap page\n", pfn);
> +		goto unlock;
> +	}
> +
> +	SetPageHWPoison(page);
> +
> +	/*
> +	 * Unlike System-RAM there is no possibility to swap in a
> +	 * different physical page at a given virtual address, so all
> +	 * userspace consumption of ZONE_DEVICE memory necessitates
> +	 * SIGBUS (i.e. MF_MUST_KILL)
> +	 */
> +	flags |= MF_ACTION_REQUIRED | MF_MUST_KILL;
> +	collect_procs(page, &tokill, flags & MF_ACTION_REQUIRED);

You know "flags & MF_ACTION_REQUIRED" will always be true, so you can just
pass in MF_ACTION_REQUIRED or even just "true".

> +
> +	start = (page->index << PAGE_SHIFT) & ~(size - 1);
> +	unmap_mapping_range(page->mapping, start, start + size, 0);
> +
> +	kill_procs(&tokill, flags & MF_MUST_KILL, !unmap_success, ilog2(size),

You know "flags & MF_MUST_KILL" will always be true, so you can just pass in
MF_MUST_KILL or even just "true".  Also, you can get rid of the constant
"unmap_success" if you want and just pass in false as the 3rd argument.
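
Something like this, perhaps (untested sketch; the trailing kill_procs()
arguments are left elided below because the quoted call is truncated here):

	collect_procs(page, &tokill, true);

	start = (page->index << PAGE_SHIFT) & ~(size - 1);
	unmap_mapping_range(page->mapping, start, start + size, 0);

	/* constants replace the always-true flag tests */
	kill_procs(&tokill, true, false, ilog2(size), ...);

That would also let you drop the local "unmap_success" constant entirely.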