linux-mm.kvack.org archive mirror
From: Jane Chu <jane.chu@oracle.com>
To: "ruansy.fnst@fujitsu.com" <ruansy.fnst@fujitsu.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-xfs@vger.kernel.org" <linux-xfs@vger.kernel.org>,
	"nvdimm@lists.linux.dev" <nvdimm@lists.linux.dev>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"dm-devel@redhat.com" <dm-devel@redhat.com>
Cc: "djwong@kernel.org" <djwong@kernel.org>,
	"dan.j.williams@intel.com" <dan.j.williams@intel.com>,
	"david@fromorbit.com" <david@fromorbit.com>,
	"hch@lst.de" <hch@lst.de>, "agk@redhat.com" <agk@redhat.com>,
	"snitzer@redhat.com" <snitzer@redhat.com>
Subject: Re: [PATCH RESEND v6 1/9] pagemap: Introduce ->memory_failure()
Date: Thu, 19 Aug 2021 13:50:49 -0700
Message-ID: <0c11714b-06f8-8eba-e0b3-8bb1caa8ebf2@oracle.com>
In-Reply-To: <OSBPR01MB29203E90FCF9711D8736C8D4F4C09@OSBPR01MB2920.jpnprd01.prod.outlook.com>


On 8/19/2021 2:10 AM, ruansy.fnst@fujitsu.com wrote:
>> From: Jane Chu <jane.chu@oracle.com>
>> Subject: Re: [PATCH RESEND v6 1/9] pagemap: Introduce ->memory_failure()
>>
>> Sorry, correction in line.
>>
>> On 8/19/2021 12:18 AM, Jane Chu wrote:
>>> Hi, Shiyang,
>>>
>>>   >  > > 1) What does it take and cost to make
>>>   >  > > xfs_sb_version_hasrmapbt(&mp->m_sb) to return true?
>>>   >
>>>   > Enable rmapbt feature when making xfs filesystem
>>>   >     `mkfs.xfs -m rmapbt=1 /path/to/device`
>>>   > BTW, reflink is enabled by default.
>>>
>>> Thanks!  I tried
>>> mkfs.xfs -d agcount=2,extszinherit=512,su=2m,sw=1 -m reflink=0 -m
>>> rmapbt=1 -f /dev/pmem0
>>>
>>> Again, injected a HW poison to the first page in a dax-file, had the
>>> poison consumed and received a SIGBUS. The result is better -
>>>
>>> ** SIGBUS(7): canjmp=1, whichstep=0, **
>>> ** si_addr(0x0x7ff2d8800000), si_lsb(0x15), si_code(0x4,
>>> BUS_MCEERR_AR) **
>>>
>>> The SIGBUS payload looks correct.
>>>
>>> However, "dmesg" has 2048 lines on sending SIGBUS, one per 512bytes -
>>
>> Actually that's one per 2MB, even though the poison is located in pfn 0x1850600
>> only.
>>
>>>
>>> [ 7003.482326] Memory failure: 0x1850600: Sending SIGBUS to fsdax_poison_v1:4109 due to hardware memory corruption
>>> [ 7003.507956] Memory failure: 0x1850800: Sending SIGBUS to fsdax_poison_v1:4109 due to hardware memory corruption
>>> [ 7003.531681] Memory failure: 0x1850a00: Sending SIGBUS to fsdax_poison_v1:4109 due to hardware memory corruption
>>> [ 7003.554190] Memory failure: 0x1850c00: Sending SIGBUS to fsdax_poison_v1:4109 due to hardware memory corruption
>>> [ 7003.575831] Memory failure: 0x1850e00: Sending SIGBUS to fsdax_poison_v1:4109 due to hardware memory corruption
>>> [ 7003.596796] Memory failure: 0x1851000: Sending SIGBUS to fsdax_poison_v1:4109 due to hardware memory corruption
>>> ....
>>> [ 7045.738270] Memory failure: 0x194fe00: Sending SIGBUS to fsdax_poison_v1:4109 due to hardware memory corruption
>>> [ 7045.758885] Memory failure: 0x1950000: Sending SIGBUS to fsdax_poison_v1:4109 due to hardware memory corruption
>>> [ 7045.779495] Memory failure: 0x1950200: Sending SIGBUS to fsdax_poison_v1:4109 due to hardware memory corruption
>>> [ 7045.800106] Memory failure: 0x1950400: Sending SIGBUS to fsdax_poison_v1:4109 due to hardware memory corruption
>>>
>>> That's too much for a single process dealing with a single poison in a
>>> PMD page. If nothing else, given an .si_addr_lsb being 0x15, it
>>> doesn't make sense to send a SIGBUS per 512B block.
>>>
>>> Could you determine the user process' mapping size from the
>>> filesystem, and take that as a hint to determine how many iterations
>>> to call
>>> mf_dax_kill_procs() ?
>>
>> Sorry, scratch the 512-byte stuff... the filesystem has been notified of the
>> length of the poison blast radius; could it take a cue from that?
> 
> I think this is caused by a mistake I made in the 6th patch: the xfs handler
> iterates over the file range in block-size steps (4k here) even when it is a
> PMD page. That's why so many messages show up when the poison is on a PMD
> page.  I'll fix it in the next version.
> 

Sorry, just to clarify, it looks like XFS has iterated through the
entire file in 2MiB strides.  The test file size is 4GiB, which explains
'dmesg' showing 2048 lines about sending SIGBUS.
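
For reference, the arithmetic behind those numbers can be checked with a
small C sketch. The helper names below are illustrative only and do not
come from the patch set, and the 4KiB base page size is an assumption
(x86-64 defaults).

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB base pages, so one pfn step is 1 << 12 bytes. */
#define BASE_PAGE_SHIFT 12u

/* Bytes covered by a SIGBUS whose siginfo reports this si_addr_lsb. */
uint64_t blast_radius_bytes(unsigned int si_addr_lsb)
{
    assert(si_addr_lsb < 64);
    return UINT64_C(1) << si_addr_lsb;
}

/* Byte distance between two consecutive pfns printed in the log. */
uint64_t pfn_stride_bytes(uint64_t pfn_a, uint64_t pfn_b)
{
    assert(pfn_b > pfn_a);
    return (pfn_b - pfn_a) << BASE_PAGE_SHIFT;
}

/* Number of fixed-size strides needed to walk a range of this length. */
uint64_t stride_count(uint64_t length_bytes, uint64_t stride_bytes)
{
    assert(stride_bytes != 0);
    return (length_bytes + stride_bytes - 1) / stride_bytes;
}
```

With the values from the log, blast_radius_bytes(0x15) and
pfn_stride_bytes(0x1850600, 0x1850800) both come out to 2MiB, and
stride_count() for a 4GiB file walked in 2MiB strides gives 2048, which
matches the number of dmesg lines. The last pfn in the log, 0x1950400,
is also exactly 2047 strides of 0x200 pfns past the first one.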

thanks,
-jane


> 
> --
> Thanks,
> Ruan.
> 
>>
>> thanks,
>> -jane
>>
>>>
>>> thanks!
>>> -jane
>>>
>>>
>>>

