From: Jane Chu <jane.chu@oracle.com>
To: Shiyang Ruan <ruansy.fnst@fujitsu.com>,
	linux-kernel@vger.kernel.org, linux-xfs@vger.kernel.org,
	nvdimm@lists.linux.dev, linux-mm@kvack.org,
	linux-fsdevel@vger.kernel.org, dm-devel@redhat.com
Cc: djwong@kernel.org, dan.j.williams@intel.com, david@fromorbit.com,
	hch@lst.de, agk@redhat.com, snitzer@redhat.com,
	Jane Chu <jane.chu@oracle.com>
Subject: Re: [PATCH RESEND v6 1/9] pagemap: Introduce ->memory_failure()
Date: Thu, 19 Aug 2021 01:11:52 -0700	[thread overview]
Message-ID: <ab9b42d8-2b81-9977-c60a-3f419e53f7bc@oracle.com> (raw)
In-Reply-To: <d908b630-dbaf-fac5-527b-682ced045643@oracle.com>

Sorry, correction in line.

On 8/19/2021 12:18 AM, Jane Chu wrote:
> Hi, Shiyang,
> 
>  >  > > 1) What does it take and cost to make
>  >  > >     xfs_sb_version_hasrmapbt(&mp->m_sb) to return true?
>  >
>  > Enable the rmapbt feature when making the xfs filesystem
>  >     `mkfs.xfs -m rmapbt=1 /path/to/device`
>  > BTW, reflink is enabled by default.
> 
> Thanks!  I tried
> mkfs.xfs -d agcount=2,extszinherit=512,su=2m,sw=1 -m reflink=0 -m rmapbt=1 -f /dev/pmem0
> 
> Again, injected a HW poison to the first page in a dax-file, had
> the poison consumed and received a SIGBUS. The result is better -
> 
> ** SIGBUS(7): canjmp=1, whichstep=0, **
> ** si_addr(0x0x7ff2d8800000), si_lsb(0x15), si_code(0x4, BUS_MCEERR_AR) **
> 
> The SIGBUS payload looks correct.
> 
> However, "dmesg" has 2048 lines on sending SIGBUS, one per 512bytes -

Actually that's one per 2MB, even though the poison is located
in pfn 0x1850600 only.
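
The "one per 2MB" figure can be checked from the PFNs in the quoted dmesg lines below. This is a minimal sketch of that arithmetic, not kernel code. It assumes 4 KiB base pages and that the elided lines ("....") continue with the same stride as the lines that are shown.

```python
# Assumption: 4 KiB base pages (the usual x86-64 configuration).
PAGE_SIZE = 4096

# PFNs copied from the quoted dmesg output.
first_pfn = 0x1850600   # first "Memory failure" line
second_pfn = 0x1850800  # second "Memory failure" line
last_pfn = 0x1950400    # last "Memory failure" line

# Distance between consecutive reports, in pages and in bytes.
stride_pages = second_pfn - first_pfn
stride_bytes = stride_pages * PAGE_SIZE

# Number of reports, assuming the elided lines keep the same stride.
line_count = (last_pfn - first_pfn) // stride_pages + 1

# Total physical range walked by the reports.
span_bytes = line_count * stride_bytes

print(f"stride: {stride_pages} pages = {stride_bytes // (1 << 20)} MiB")
print(f"reports: {line_count}")
print(f"range covered: {span_bytes // (1 << 30)} GiB")
```

Under those assumptions the stride is 512 pages (2 MiB) and the count comes out to 2048, which matches the number of dmesg lines reported.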

> 
> [ 7003.482326] Memory failure: 0x1850600: Sending SIGBUS to fsdax_poison_v1:4109 due to hardware memory corruption
> [ 7003.507956] Memory failure: 0x1850800: Sending SIGBUS to fsdax_poison_v1:4109 due to hardware memory corruption
> [ 7003.531681] Memory failure: 0x1850a00: Sending SIGBUS to fsdax_poison_v1:4109 due to hardware memory corruption
> [ 7003.554190] Memory failure: 0x1850c00: Sending SIGBUS to fsdax_poison_v1:4109 due to hardware memory corruption
> [ 7003.575831] Memory failure: 0x1850e00: Sending SIGBUS to fsdax_poison_v1:4109 due to hardware memory corruption
> [ 7003.596796] Memory failure: 0x1851000: Sending SIGBUS to fsdax_poison_v1:4109 due to hardware memory corruption
> ....
> [ 7045.738270] Memory failure: 0x194fe00: Sending SIGBUS to fsdax_poison_v1:4109 due to hardware memory corruption
> [ 7045.758885] Memory failure: 0x1950000: Sending SIGBUS to fsdax_poison_v1:4109 due to hardware memory corruption
> [ 7045.779495] Memory failure: 0x1950200: Sending SIGBUS to fsdax_poison_v1:4109 due to hardware memory corruption
> [ 7045.800106] Memory failure: 0x1950400: Sending SIGBUS to fsdax_poison_v1:4109 due to hardware memory corruption
> 
> That's too much for a single process dealing with a single
> poison in a PMD page. If nothing else, given an .si_addr_lsb being 0x15,
> it doesn't make sense to send a SIGBUS per 512B block.
> 
> Could you determine the user process' mapping size from the filesystem,
> and take that as a hint to determine how many iterations to call
> mf_dax_kill_procs() ?

Sorry, scratch the 512-byte stuff... the filesystem has been
notified of the length of the poison blast radius; could it take a
clue from that?
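
To make the "blast radius" point concrete, here is a rough sketch of the arithmetic only. It is not how the kernel or the patch series computes anything. `blast_radius_bytes` and `units_covering` are hypothetical helpers made up for illustration, and 4 KiB base pages are assumed. The sketch decodes the reported si_lsb (0x15) into a size and counts how many aligned units of that size overlap a poisoned PFN range.

```python
# Assumption: 4 KiB base pages, i.e. a page shift of 12.
PAGE_SHIFT = 12


def blast_radius_bytes(si_addr_lsb: int) -> int:
    """Size in bytes implied by si_addr_lsb (the least significant valid address bit)."""
    return 1 << si_addr_lsb


def units_covering(start_pfn: int, nr_pfns: int, si_addr_lsb: int) -> int:
    """Hypothetical helper: count aligned units of size (1 << si_addr_lsb)
    that overlap the PFN range [start_pfn, start_pfn + nr_pfns)."""
    if si_addr_lsb < PAGE_SHIFT or nr_pfns < 1:
        raise ValueError("expected si_addr_lsb >= PAGE_SHIFT and nr_pfns >= 1")
    pfn_shift = si_addr_lsb - PAGE_SHIFT  # unit size expressed in pages, as a shift
    first_unit = start_pfn >> pfn_shift
    last_unit = (start_pfn + nr_pfns - 1) >> pfn_shift
    return last_unit - first_unit + 1


# Values taken from the SIGBUS payload and dmesg output quoted in this thread.
si_lsb = 0x15
poisoned_pfn = 0x1850600

print(blast_radius_bytes(si_lsb) // (1 << 20), "MiB per reported unit")
print(units_covering(poisoned_pfn, 1, si_lsb), "unit(s) overlap a single poisoned pfn")
```

With these inputs, si_lsb 0x15 corresponds to 2 MiB, and a poison confined to pfn 0x1850600 overlaps exactly one such unit, which is why a single notification would seem sufficient for this case.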

thanks,
-jane

> 
> thanks!
> -jane
> 
> 
> 

