From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dan Williams
Date: Tue, 1 May 2018 16:02:58 -0700
Subject: Re: [PATCH 0/6] use memcpy_mcsafe() for copy_to_iter()
References: <152520750404.36522.15462513519590065300.stgit@dwillia2-desk3.amr.corp.intel.com>
To: Linus Torvalds
Cc: Tony Luck, "linux-nvdimm@lists.01.org", Peter Zijlstra,
    the arch/x86 maintainers, Linux Kernel Mailing List, Andy Lutomirski,
    Ingo Molnar, Borislav Petkov, Al Viro, Thomas Gleixner, Andrew Morton

On Tue, May 1, 2018 at 2:05 PM, Linus Torvalds wrote:
> On Tue, May 1, 2018 at 1:55 PM Dan Williams
> wrote:
>
>> The result of the bypass is that the kernel treats machine checks during
>> read as system fatal (reboot) when they could simply be flagged as an
>> I/O error, similar to performing reads through the pmem driver. Prevent
>> this fatal condition by deploying memcpy_mcsafe() in the fsdax read
>> path.
>
> How about just changing the rules, and go the old "Don't do that then" way?
>
> IOW, get rid of the whole idea that MCS errors should be fatal. It's wrong
> and pointless anyway.
>
> The while approach seems fundamentally buggered, if you ever want to mmap
> one of these things. And don't you want that?
>
> So why continue down a fundamentally broken path?

I'm confused.
Are you talking about getting rid of the block-layer bypass or changing
how MCS errors are handled? If it's the former, I've gotten push back in
the past trying to remove the bypass, but I feel better about my chances
to slay that beast wielding the +5 Hammer of Linus. If it's the latter,
MCS error handling, I don't see how to get around something like
copy_to_iter_mcsafe().

You mention mmap. Yes, we want the predominant access model to be
dax-mmap for Persistent Memory, but there's still the question about
what to do with media errors. To date we are trying to mirror the error
handling model for System Memory, i.e. SIGBUS to the process that
consumed the error. Is that error handling model also problematic in
your view?
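
For reference, the "SIGBUS to the process that consumed the error" model can be sketched from the userspace side. This is only an illustration under stated assumptions, not code from the patch series: guarded_read() and map_short_file() are hypothetical names, and the fault is simulated by reading past the end of a short file-backed mapping rather than by a real media error on a dax mapping.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Recovery point used by the SIGBUS handler (single-threaded sketch). */
static sigjmp_buf bus_recover;

static void on_sigbus(int sig)
{
    (void)sig;
    siglongjmp(bus_recover, 1);
}

/*
 * Hypothetical helper: read one byte through a mapping.
 * Returns 0 on success, -1 if the load raised SIGBUS, so the caller
 * can treat the fault like an I/O error instead of dying.
 */
static int guarded_read(const volatile unsigned char *addr, unsigned char *out)
{
    struct sigaction sa, old;
    volatile int rc = -1;

    sa.sa_handler = on_sigbus;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGBUS, &sa, &old) != 0)
        return -1;

    /* savemask=1 so the signal mask is restored after the jump. */
    if (sigsetjmp(bus_recover, 1) == 0) {
        *out = *addr;   /* may fault */
        rc = 0;
    }

    sigaction(SIGBUS, &old, NULL);
    return rc;
}

/*
 * Hypothetical demo helper: map two pages of a temporary file that is
 * only one page long. Touching the second page raises SIGBUS, which
 * stands in for a consumed media error in this sketch.
 */
static unsigned char *map_short_file(size_t *page_out)
{
    char path[] = "/tmp/sigbus-demo-XXXXXX";
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    int fd = mkstemp(path);
    void *p;

    if (fd < 0)
        return NULL;
    unlink(path);
    if (ftruncate(fd, (off_t)page) != 0) {
        close(fd);
        return NULL;
    }
    p = mmap(NULL, 2 * page, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);
    if (p == MAP_FAILED)
        return NULL;
    *page_out = page;
    return p;
}
```

A caller that gets -1 back from guarded_read() would handle it much as it handles -EIO from read(2), which is the symmetry between the mmap path and the read path that the thread is discussing.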