Date: Tue, 15 Dec 2020 18:46:18 -0800
From: "Darrick J. Wong"
To: Dave Chinner
Cc: Jane Chu, Ruan Shiyang, linux-kernel@vger.kernel.org,
    linux-xfs@vger.kernel.org, linux-nvdimm@lists.01.org,
    linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
    linux-raid@vger.kernel.org, dan.j.williams@intel.com, hch@lst.de,
    song@kernel.org, rgoldwyn@suse.de, qi.fuli@fujitsu.com,
    y-goto@fujitsu.com
Subject: Re: [RFC PATCH v2 0/6] fsdax: introduce fs query to support reflink
Message-ID: <20201216024618.GC6918@magnolia>
References: <20201123004116.2453-1-ruansy.fnst@cn.fujitsu.com>
    <89ab4ec4-e4f0-7c17-6982-4f55bb40f574@oracle.com>
    <3b35604c-57e2-8cb5-da69-53508c998540@oracle.com>
    <20201215231022.GL632069@dread.disaster.area>
In-Reply-To: <20201215231022.GL632069@dread.disaster.area>

On Wed, Dec 16, 2020 at 10:10:22AM +1100, Dave Chinner wrote:
> On Tue, Dec 15, 2020 at 11:05:07AM -0800, Jane Chu wrote:
> > On 12/15/2020 3:58 AM, Ruan Shiyang wrote:
> > > Hi Jane
> > >
> > > On 2020/12/15 4:58 AM, Jane Chu wrote:
> > > > Hi, Shiyang,
> > > >
> > > > On 11/22/2020 4:41 PM, Shiyang Ruan wrote:
> > > > > This patchset is a try to resolve the problem of tracking shared page
> > > > > for fsdax.
> > > > >
> > > > > Change from v1:
> > > > >    - Introduce ->block_lost() for block device
> > > > >    - Support mapped device
> > > > >    - Add 'not available' warning for realtime device in XFS
> > > > >    - Rebased to v5.10-rc1
> > > > >
> > > > > This patchset moves owner tracking from dax_associate_entry() to pmem
> > > > > device, by introducing an interface ->memory_failure() of struct
> > > > > pagemap.  The interface is called by memory_failure() in mm, and
> > > > > implemented by pmem device.  Then pmem device calls its ->block_lost()
> > > > > to find the filesystem which the damaged page located in, and call
> > > > > ->storage_lost() to track files or metadata associated with this page.
> > > > > Finally we are able to try to fix the damaged data in filesystem and do
> > > >
> > > > Does that mean clearing poison? if so, would you mind to elaborate
> > > > specifically which change does that?
> > >
> > > Recovering data for filesystem (or pmem device) has not been done in
> > > this patchset...  I just triggered the handler for the files sharing the
> > > corrupted page here.
> >
> > Thanks! That confirms my understanding.
> >
> > With the framework provided by the patchset, how do you envision it to
> > ease/simplify poison recovery from the user's perspective?
>
> At the moment, I'd say no change what-so-ever. The behaviour is
> necessary so that we can kill whatever user application maps
> multiply-shared physical blocks if there's a memory error. The
> recovery method from that is unchanged. The only advantage may be
> that the filesystem (if rmap enabled) can tell you the exact file
> and offset into the file where data was corrupted.
>
> However, it can be worse, too: it may also now completely shut down
> the filesystem if the filesystem discovers the error is in metadata
> rather than user data. That's much more complex to recover from, and
> right now will require downtime to take the filesystem offline and
> run fsck to correct the error. That may trash whatever the metadata
> that can't be recovered points to, so you still have a user data
> recovery process to perform after this...

...though for the future future I'd like to bypass the default behaviors
if there's somebody watching the sb notification that will also kick off
the appropriate repair activities.  The xfs auto-repair parts are coming
along nicely.  Dunno about userspace, though I figure if we can do
userspace page faults then some people could probably do autorepair too.

--D

> > And how does it help in dealing with page faults upon poisoned
> > dax page?
>
> It doesn't. If the page is poisoned, the same behaviour will occur
> as does now. This is simply error reporting infrastructure, not
> error handling.
>
> Future work might change how we correct the faults found in the
> storage, but I think the user visible behaviour is going to be "kill
> apps mapping corrupted data" for a long time yet....
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
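
For readers following the thread, the notification chain described in the quoted cover letter (memory_failure() in mm, then the pagemap ->memory_failure() hook, then the device's ->block_lost(), then the filesystem's ->storage_lost()) can be modelled with a small toy sketch. This is plain Python, not kernel code. ToyFilesystem, ToyPmemDevice, the rmap dictionary, the one-block-per-page mapping and the shut_down flag are all hypothetical names and simplifications invented for illustration; the patchset's real structures and signatures are not shown in this message. The sketch only reports which owners share a bad block, which matches the point made above that this is error reporting, not error handling.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# (file name or the literal "metadata", byte offset within that owner)
Owner = Tuple[str, int]


@dataclass
class ToyFilesystem:
    """Hypothetical stand-in for a filesystem with a reverse map."""
    # Reverse map: device block number -> every owner sharing that block.
    rmap: Dict[int, List[Owner]] = field(default_factory=dict)
    reports: List[Owner] = field(default_factory=list)
    shut_down: bool = False

    def storage_lost(self, block: int) -> None:
        """Toy analogue of ->storage_lost(): record each owner of the block."""
        for owner in self.rmap.get(block, []):
            self.reports.append(owner)
            if owner[0] == "metadata":
                # Assumption for the toy: damaged metadata shuts the fs down.
                self.shut_down = True


@dataclass
class ToyPmemDevice:
    """Hypothetical stand-in for the pmem device behind a dax mapping."""
    fs: ToyFilesystem
    first_pfn: int  # pfn backing device block 0 (one block per page here)

    def memory_failure(self, pfn: int) -> None:
        """Toy analogue of the pagemap ->memory_failure() hook."""
        self.block_lost(pfn - self.first_pfn)

    def block_lost(self, block: int) -> None:
        """Toy analogue of ->block_lost(): pass the bad block to the fs."""
        self.fs.storage_lost(block)


def memory_failure(dev: ToyPmemDevice, pfn: int) -> None:
    """Toy analogue of mm's memory_failure() calling into the device."""
    dev.memory_failure(pfn)


if __name__ == "__main__":
    # Block 7 is shared by two reflinked files in this made-up layout.
    fs = ToyFilesystem(rmap={7: [("a.dat", 0), ("b.dat", 8192)]})
    dev = ToyPmemDevice(fs=fs, first_pfn=1000)
    memory_failure(dev, 1007)
    for name, offset in fs.reports:
        print(f"{name}: corrupted at offset {offset}")
```

The reverse-map lookup is the part that makes shared (reflinked) blocks tractable in this model: one bad page fans out to every file that maps it, rather than to a single owner recorded on the page.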