From: Dan Williams <dan.j.williams@intel.com>
To: "Dan Williams" <dan.j.williams@intel.com>,
	"HORIGUCHI NAOYA(堀口 直也)" <naoya.horiguchi@nec.com>
Cc: Shiyang Ruan <ruansy.fnst@fujitsu.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-xfs@vger.kernel.org" <linux-xfs@vger.kernel.org>,
	"nvdimm@lists.linux.dev" <nvdimm@lists.linux.dev>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"djwong@kernel.org" <djwong@kernel.org>,
	"david@fromorbit.com" <david@fromorbit.com>,
	"hch@infradead.org" <hch@infradead.org>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"jane.chu@oracle.com" <jane.chu@oracle.com>,
	"rgoldwyn@suse.de" <rgoldwyn@suse.de>,
	"viro@zeniv.linux.org.uk" <viro@zeniv.linux.org.uk>,
	"willy@infradead.org" <willy@infradead.org>,
	"linmiaohe@huawei.com" <linmiaohe@huawei.com>,
	Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH v2 05/14] mm: Introduce mf_dax_kill_procs() for fsdax case
Date: Wed, 24 Aug 2022 22:05:31 -0700
Message-ID: <6307031b763be_18ed729455@dwillia2-xfh.jf.intel.com.notmuch>
In-Reply-To: <6306fbabab4cd_18ed7294e2@dwillia2-xfh.jf.intel.com.notmuch>

Dan Williams wrote:
> HORIGUCHI NAOYA(堀口 直也) wrote:
> > On Wed, Aug 24, 2022 at 02:52:51PM -0700, Dan Williams wrote:
> > > Shiyang Ruan wrote:
> > > > This new function is a variant of mf_generic_kill_procs that accepts a
> > > > file, offset pair instead of a struct to support multiple files sharing
> > > > a DAX mapping.  It is intended to be called by the file systems as part
> > > > of the memory_failure handler after the file system performed a reverse
> > > > mapping from the storage address to the file and file offset.
> > > > 
> > > > Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
> > > > Reviewed-by: Dan Williams <dan.j.williams@intel.com>
> > > > Reviewed-by: Christoph Hellwig <hch@lst.de>
> > > > Reviewed-by: Darrick J. Wong <djwong@kernel.org>
> > > > Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
> > > > ---
> > > >  include/linux/mm.h  |  2 +
> > > >  mm/memory-failure.c | 96 ++++++++++++++++++++++++++++++++++++++++-----
> > > >  2 files changed, 88 insertions(+), 10 deletions(-)
> > > 
> > > Unfortunately my test suite was only running the "non-destructive" set
> > > of 'ndctl' tests which skipped some of the complex memory-failure cases.
> > > Upon fixing that, bisect flags this commit as the source of the following
> > > crash regression:
> > 
> > Thank you for testing/reporting.
> > 
> > > 
> > >  kernel BUG at mm/memory-failure.c:310!
> > >  invalid opcode: 0000 [#1] PREEMPT SMP PTI
> > >  CPU: 26 PID: 1252 Comm: dax-pmd Tainted: G           OE     5.19.0-rc4+ #58
> > >  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
> > >  RIP: 0010:add_to_kill+0x304/0x400
> > > [..]
> > >  Call Trace:
> > >   <TASK>
> > >   collect_procs.part.0+0x2c8/0x470
> > >   memory_failure+0x979/0xf30
> > >   do_madvise.part.0.cold+0x9c/0xd3
> > >   ? lock_is_held_type+0xe3/0x140
> > >   ? find_held_lock+0x2b/0x80
> > >   ? lock_release+0x145/0x2f0
> > >   ? lock_is_held_type+0xe3/0x140
> > >   ? syscall_enter_from_user_mode+0x20/0x70
> > >   __x64_sys_madvise+0x56/0x70
> > >   do_syscall_64+0x3a/0x80
> > >   entry_SYSCALL_64_after_hwframe+0x46/0xb0
> > 
> > This stacktrace shows that VM_BUG_ON_VMA() in dev_pagemap_mapping_shift()
> > was triggered.  I think that BUG_ON is too harsh here because address ==
> > -EFAULT means that there's no mapping for the address.  The subsequent
> > code considers "tk->size_shift == 0" as "no mapping" cases, so
> > dev_pagemap_mapping_shift() can return 0 in such a case?
> > 
> > Could the following diff work for the issue?
> 
> This passes the "dax-ext4.sh" and "dax-xfs.sh" tests from the ndctl
> suite.
> 
> It then fails on the "device-dax" test with this signature:
> 
>  BUG: kernel NULL pointer dereference, address: 0000000000000010
>  #PF: supervisor read access in kernel mode
>  #PF: error_code(0x0000) - not-present page
>  PGD 8000000205073067 P4D 8000000205073067 PUD 2062b3067 PMD 0 
>  Oops: 0000 [#1] PREEMPT SMP PTI
>  CPU: 22 PID: 4535 Comm: device-dax Tainted: G           OE    N 6.0.0-rc2+ #59
>  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
>  RIP: 0010:memory_failure+0x667/0xba0
> [..]
>  Call Trace:
>   <TASK>
>   ? _printk+0x58/0x73
>   do_madvise.part.0.cold+0xaf/0xc5
> 
> Which is:
> 
> (gdb) li *(memory_failure+0x667)
> 0xffffffff813b7f17 is in memory_failure (mm/memory-failure.c:1933).
> 1928
> 1929            /*
> 1930             * Call driver's implementation to handle the memory failure, otherwise
> 1931             * fall back to generic handler.
> 1932             */
> 1933            if (pgmap->ops->memory_failure) {
> 1934                    rc = pgmap->ops->memory_failure(pgmap, pfn, 1, flags);
> 
> 
> ...I think this is just a simple matter of:
> 
> @@ -1928,7 +1930,7 @@ static int memory_failure_dev_pagemap(unsigned long pfn, int flags,
>          * Call driver's implementation to handle the memory failure, otherwise
>          * fall back to generic handler.
>          */
> -       if (pgmap->ops->memory_failure) {
> +       if (pgmap->ops && pgmap->ops->memory_failure) {
>                 rc = pgmap->ops->memory_failure(pgmap, pfn, 1, flags);
>                 /*
>                  * Fall back to generic handler too if operation is not
> 
> 
> ...since device-dax does not implement pagemap ops.
> 
> I will see what else pops up and make sure that this regression always
> runs going forward.

Ok, that was the last of the regression fallout that I could find.
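
For anyone following along, here is a minimal userspace sketch of the
guarded dispatch from the one-line fix quoted above. It is not the kernel
code: the structs are trimmed stand-ins, every model_* name is
hypothetical, and the -EOPNOTSUPP fallback condition is an assumption
based on the truncated "Fall back to generic handler too if operation is
not" comment in the diff context.

```c
#include <errno.h>
#include <stddef.h>

struct dev_pagemap;

/* Trimmed stand-in for the kernel's struct dev_pagemap_ops. */
struct dev_pagemap_ops {
	int (*memory_failure)(struct dev_pagemap *pgmap, unsigned long pfn,
			      unsigned long nr_pages, int mf_flags);
};

/* Trimmed stand-in for struct dev_pagemap; ops may legitimately be NULL. */
struct dev_pagemap {
	const struct dev_pagemap_ops *ops;
};

/* Hypothetical marker so callers can tell the generic path was taken. */
#define MODEL_GENERIC_HANDLED 1

/* Hypothetical generic fallback handler. */
static int model_generic_handler(unsigned long pfn, int flags)
{
	(void)pfn;
	(void)flags;
	return MODEL_GENERIC_HANDLED;
}

/* Hypothetical driver handler that fully handles the failure. */
static int model_driver_handler(struct dev_pagemap *pgmap, unsigned long pfn,
				unsigned long nr_pages, int mf_flags)
{
	(void)pgmap; (void)pfn; (void)nr_pages; (void)mf_flags;
	return 0;
}

/* Hypothetical driver handler that declines the request. */
static int model_unsupported_handler(struct dev_pagemap *pgmap,
				     unsigned long pfn,
				     unsigned long nr_pages, int mf_flags)
{
	(void)pgmap; (void)pfn; (void)nr_pages; (void)mf_flags;
	return -EOPNOTSUPP;
}

/*
 * Models the fixed check: test pgmap->ops before dereferencing it, so a
 * pagemap without ops (the device-dax case in this thread) goes to the
 * generic handler instead of faulting on a NULL pointer.
 */
static int model_memory_failure_dev_pagemap(struct dev_pagemap *pgmap,
					    unsigned long pfn, int flags)
{
	if (pgmap->ops && pgmap->ops->memory_failure) {
		int rc = pgmap->ops->memory_failure(pgmap, pfn, 1, flags);

		/* Assumed: only "not supported" falls through to generic. */
		if (rc != -EOPNOTSUPP)
			return rc;
	}
	return model_generic_handler(pfn, flags);
}
```

With ops == NULL the first half of the && short-circuits, so the
->memory_failure load that produced the oops at address 0x10 never
happens.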
