Re: [PATCH 0/6] use memcpy_mcsafe() for copy_to_iter()

From: Dan Williams <dan.j.williams@intel.com>
To: Andy Lutomirski <luto@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>,
	linux-nvdimm <linux-nvdimm@lists.01.org>,
	Peter Zijlstra <peterz@infradead.org>, X86 ML <x86@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Thomas Gleixner <tglx@linutronix.de>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH 0/6] use memcpy_mcsafe() for copy_to_iter()
Date: Tue, 1 May 2018 16:31:44 -0700
Message-ID: <CAPcyv4hSvAx9kR7tb5-g4D-zKjZbL1WJxM5q6L5ejLFgESSQug@mail.gmail.com>
In-Reply-To: <CALCETrV=HtQDfrKmY0Td2GfxAxZbAoz71JbYSZcq0LxL2A-RqQ@mail.gmail.com>

On Tue, May 1, 2018 at 4:28 PM, Andy Lutomirski <luto@kernel.org> wrote:
> On Tue, May 1, 2018 at 4:02 PM Dan Williams <dan.j.williams@intel.com>
> wrote:
>
>> On Tue, May 1, 2018 at 2:05 PM, Linus Torvalds
>> <torvalds@linux-foundation.org> wrote:
>> > On Tue, May 1, 2018 at 1:55 PM Dan Williams <dan.j.williams@intel.com>
>> > wrote:
>> >
>> >> The result of the bypass is that the kernel treats machine checks
>> >> during read as system fatal (reboot) when they could simply be flagged
>> >> as an I/O error, similar to performing reads through the pmem driver.
>> >> Prevent this fatal condition by deploying memcpy_mcsafe() in the fsdax
>> >> read path.
>> >
>> > How about just changing the rules, and go the old "Don't do that then"
>> > way?
>> >
>> > IOW, get rid of the whole idea that MCS errors should be fatal. It's
>> > wrong and pointless anyway.
>> >
>> > The whole approach seems fundamentally buggered, if you ever want to
>> > mmap one of these things. And don't you want that?
>> >
>> > So why continue down a fundamentally broken path?
>
>> I'm confused. Are you talking about getting rid of the block-layer
>> bypass or changing how MCS errors are handled? If it's the former I've
>> gotten push back in the past trying to remove the bypass, but I feel
>> better about my chances to slay that beast wielding the +5 Hammer of
>> Linus. If it's the latter, MCS error handling, I don't see how to get
>> around something like copy_to_iter_mcsafe().
>
>> You mention mmap. Yes, we want the predominant access model to be
>> dax-mmap for Persistent Memory, but there's still the question about
>> what to do with media errors. To date we are trying to mirror the
>> error handling model for System Memory, i.e. SIGBUS to the process
>> that consumed the error. Is that error handling model also problematic
>> in your view?
>
> I'm not sure exactly what you mean here, but my understanding of the status
> quo is that memory errors in user code are non-fatal but that memory errors
> in kernel code are fatal unless there's an appropriate extable entry.  The
> old iov_iter code assumes that memcpy() on kernel addresses can't fail.
> I'm not sure how else it could work.

Right, I'm trying to clarify the "IOW, get rid of the whole idea that
MCS errors should be fatal" comment. Especially as I am about to go
fix memory_failure() to understand that ZONE_DEVICE pages != typical
"struct page", and do the right thing with respect to un-mapping
userspace dax mapped pages.
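
To make the distinction concrete, here is a minimal user-space sketch of
the pattern being discussed: a copy routine that can stop partway and
report how many bytes it did not copy, plus a read path that turns that
into an I/O error instead of treating it as fatal. This is not kernel
code. The names sim_media, sim_copy_mcsafe(), sim_read() and SIM_NO_POISON
are hypothetical, the "poison" offset merely stands in for a media error,
and the short-read-or-EIO policy is an assumption for illustration only.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical marker meaning "this media has no bad byte". */
#define SIM_NO_POISON ((size_t)-1)

/* Hypothetical stand-in for a pmem range with at most one bad byte. */
struct sim_media {
	const unsigned char *data;
	size_t len;
	size_t poison_off;	/* offset of the unreadable byte, or SIM_NO_POISON */
};

/*
 * Copy n bytes starting at media offset off into dst.
 * Returns the number of bytes NOT copied (0 on full success), which is
 * how this sketch models a copy that can fail partway through.
 */
static size_t sim_copy_mcsafe(unsigned char *dst, const struct sim_media *m,
			      size_t off, size_t n)
{
	assert(dst && m && off <= m->len && n <= m->len - off);

	for (size_t i = 0; i < n; i++) {
		if (off + i == m->poison_off)
			return n - i;	/* stop at the simulated media error */
		dst[i] = m->data[off + i];
	}
	return 0;
}

/*
 * Read path built on the fallible copy (assumed policy): return the
 * bytes copied so far, or -EIO if the very first byte is unreadable.
 */
static long sim_read(unsigned char *dst, const struct sim_media *m,
		     size_t off, size_t n)
{
	size_t rem = sim_copy_mcsafe(dst, m, off, n);
	size_t done = n - rem;

	if (rem && done == 0)
		return -EIO;
	return (long)done;
}
```

The point of the sketch is only that the caller needs a copy primitive
whose failure it can observe; a plain memcpy() has no way to report one.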