From: Dan Williams
Date: Wed, 18 Aug 2021 10:10:51 -0700
Subject: Re: [PATCH RESEND v6 1/9] pagemap: Introduce ->memory_failure()
To: "ruansy.fnst@fujitsu.com"
Cc: Jane Chu, "linux-kernel@vger.kernel.org", "linux-xfs@vger.kernel.org",
 "nvdimm@lists.linux.dev", "linux-mm@kvack.org",
 "linux-fsdevel@vger.kernel.org", "dm-devel@redhat.com",
 "djwong@kernel.org", "david@fromorbit.com", "hch@lst.de",
 "agk@redhat.com", "snitzer@redhat.com"
References: <20210730100158.3117319-1-ruansy.fnst@fujitsu.com>
 <20210730100158.3117319-2-ruansy.fnst@fujitsu.com>
 <1d286104-28f4-d442-efed-4344eb8fa5a1@oracle.com>
 <78c22960-3f6d-8e5d-890a-72915236bedc@oracle.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Aug 18, 2021 at 12:52 AM ruansy.fnst@fujitsu.com wrote:
>
> > -----Original Message-----
> > From: Jane Chu
> > Subject: Re: [PATCH RESEND v6 1/9] pagemap: Introduce ->memory_failure()
> >
> > On 8/17/2021 10:43 PM, Jane Chu wrote:
> > > More information -
> > >
> > > On 8/16/2021 10:20 AM, Jane Chu wrote:
> > >> Hi, ShiYang,
> > >>
> > >> So I applied the v6 patch series to my 5.14-rc3 as it's what you
> > >> indicated is what v6 was based at, and injected a hardware poison.
> > >>
> > >> I'm seeing the same problem that was reported a while ago after the
> > >> poison was consumed - in the SIGBUS payload, the si_addr is missing:
> > >>
> > >> ** SIGBUS(7): canjmp=1, whichstep=0, **
> > >> ** si_addr(0x(nil)), si_lsb(0xC), si_code(0x4, BUS_MCEERR_AR) **
> > >>
> > >> The si_addr ought to be 0x7f6568000000 - the vaddr of the first page
> > >> in this case.
> > >
> > > The failure came from here :
> > >
> > > [PATCH RESEND v6 6/9] xfs: Implement ->notify_failure() for XFS
> > >
> > > +static int
> > > +xfs_dax_notify_failure(
> > > ...
> > > +	if (!xfs_sb_version_hasrmapbt(&mp->m_sb)) {
> > > +		xfs_warn(mp, "notify_failure() needs rmapbt enabled!");
> > > +		return -EOPNOTSUPP;
> > > +	}
> > >
> > > I am not familiar with XFS, but I have a few questions I hope to get
> > > answers -
> > >
> > > 1) What does it take and cost to make
> > >    xfs_sb_version_hasrmapbt(&mp->m_sb) to return true?
>
> Enable rmapbt feature when making xfs filesystem
>    `mkfs.xfs -m rmapbt=1 /path/to/device`
> BTW, reflink is enabled by default.
>
> > >
> > > 2) For a running environment that fails the above check, is it
> > >    okay to leave the poison handle in limbo and why?
> It will fall back to the old handler.  I think you have already known it.
>
> > >
> > > 3) If the above regression is not acceptable, any potential remedy?
> >
> > How about moving the check to prior to the notifier registration?
> > And register only if the check is passed?  This seems better than an
> > alternative which is to fall back to the legacy memory_failure handling
> > in case the filesystem returns -EOPNOTSUPP.
>
> Sounds like a nice solution.  I think I can add an is_notify_supported()
> interface in dax_holder_ops and check it when register dax_holder.

Shouldn't the fs avoid registering a memory failure handler if it is
not prepared to take over? For example, shouldn't this case behave
identically to ext4 that will not even register a callback?
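
To make the question concrete, here is a rough userspace sketch of the
"do not register unless you can take over" behavior. It is not kernel code
and not the patch's implementation: struct fs_model, struct holder_ops,
fs_mount() and handle_poison() are hypothetical names invented for this
illustration, and has_rmapbt merely stands in for the
xfs_sb_version_hasrmapbt() check quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a holder callback table (not the kernel struct). */
struct holder_ops {
	int (*notify_failure)(void *fs, unsigned long offset);
};

/* Hypothetical filesystem instance. */
struct fs_model {
	bool has_rmapbt;              /* stand-in for xfs_sb_version_hasrmapbt() */
	const struct holder_ops *ops; /* NULL means no handler is registered */
};

/* Pretend the fs looked up the owners of 'offset' and notified them. */
static int fs_notify_failure(void *fs, unsigned long offset)
{
	(void)fs;
	(void)offset;
	return 0;
}

static const struct holder_ops fs_holder_ops = {
	.notify_failure = fs_notify_failure,
};

/* Decide at mount time: register only if the fs can really take over. */
static void fs_mount(struct fs_model *fs)
{
	assert(fs != NULL);
	fs->ops = fs->has_rmapbt ? &fs_holder_ops : NULL;
}

enum poison_path { PATH_FS_HANDLER, PATH_LEGACY };

/*
 * Poison handling: with no registered handler the legacy path runs,
 * which is the same outcome as a filesystem that never registers one.
 */
static enum poison_path handle_poison(struct fs_model *fs, unsigned long offset)
{
	assert(fs != NULL);
	if (fs->ops && fs->ops->notify_failure(fs, offset) == 0)
		return PATH_FS_HANDLER;
	return PATH_LEGACY;
}
```

In this model a filesystem without rmapbt ends up with ops == NULL after
fs_mount(), so handle_poison() takes the legacy path without the handler
ever having to return -EOPNOTSUPP at failure time.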