From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Jan Stancek <jstancek@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>,
linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linuxppc-dev@lists.ozlabs.org,
Memory Management <mm-qe@redhat.com>,
LTP Mailing List <ltp@lists.linux.it>,
Linux Stable maillist <stable@vger.kernel.org>,
CKI Project <cki-project@redhat.com>,
Michael Ellerman <mpe@ellerman.id.au>
Subject: Re: [bug] userspace hitting sporadic SIGBUS on xfs (Power9, ppc64le), v4.19 and later
Date: Tue, 3 Dec 2019 08:08:39 -0800
Message-ID: <20191203160839.GJ7335@magnolia>
In-Reply-To: <433638211.14837331.1575383728189.JavaMail.zimbra@redhat.com>

On Tue, Dec 03, 2019 at 09:35:28AM -0500, Jan Stancek wrote:
>
> ----- Original Message -----
> > On Tue, Dec 03, 2019 at 07:50:39AM -0500, Jan Stancek wrote:
> > > My theory is that there's a race in iomap. There appear to be
> > > interleaved calls to iomap_set_range_uptodate() for same page
> > > with varying offset and length. Each call sees bitmap as _not_
> > > entirely "uptodate" and hence doesn't call SetPageUptodate().
> > > Even though each bit in bitmap ends up uptodate by the time
> > > all calls finish.
> >
> > Weird. That should be prevented by the page lock that all callers
> > of iomap_set_range_uptodate hold. But in case I miss something, does
> > the patch below trigger? If not, it is not just a race, but might
> > be some weird ordering problem with the bitops, especially if it
> > only triggers on ppc, which is very weakly ordered.
> >
> > diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> > index d33c7bc5ee92..25e942c71590 100644
> > --- a/fs/iomap/buffered-io.c
> > +++ b/fs/iomap/buffered-io.c
> > @@ -148,6 +148,8 @@ iomap_set_range_uptodate(struct page *page, unsigned off,
> > unsigned len)
> > >  	unsigned int i;
> > >  	bool uptodate = true;
> > >
> > > +	WARN_ON_ONCE(!PageLocked(page));
> > > +
> > >  	if (iop) {
> > >  		for (i = 0; i < PAGE_SIZE / i_blocksize(inode); i++) {
> > >  			if (i >= first && i <= last)
> >
>
> Hit it pretty quick this time:
>
> # uptime
> 09:27:42 up 22 min, 2 users, load average: 0.09, 13.38, 26.18
>
> # /mnt/testarea/ltp/testcases/bin/genbessel
> Bus error (core dumped)
>
> # dmesg | grep -i -e warn -e call
> [ 0.000000] dt-cpu-ftrs: not enabling: system-call-vectored (disabled or unsupported by kernel)
> [ 0.000000] random: get_random_u64 called from cache_random_seq_create+0x98/0x1e0 with crng_init=0
> [ 0.000000] rcu: Offload RCU callbacks from CPUs: (none).
> [ 5.312075] megaraid_sas 0031:01:00.0: megasas_disable_intr_fusion is called outbound_intr_mask:0x40000009
> [ 5.357307] megaraid_sas 0031:01:00.0: megasas_disable_intr_fusion is called outbound_intr_mask:0x40000009
> [ 5.485126] megaraid_sas 0031:01:00.0: megasas_enable_intr_fusion is called outbound_intr_mask:0x40000000
>
> So, extra WARN_ON_ONCE applied on top of v5.4-8836-g81b6b96475ac
> did not trigger.
>
> Is it possible for iomap code to submit multiple bio-s for same
> locked page and then receive callbacks in parallel?

Yes. If (say) you have 64k pages on a 4k-block filesystem and the extent
mappings for the page's 16 blocks aren't contiguous, then iomap will
issue a separate bio for each physical fragment it finds. iomap calls
submit_bio on each of those bios as soon as it thinks it's done filling
it, so you can indeed get multiple completion callbacks in parallel.
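
To make the arithmetic concrete, here is a rough userspace sketch (not
kernel code, and not how iomap is actually implemented) that counts how
many bios one page would need under the simplifying assumption that each
physically contiguous run of blocks becomes exactly one bio. The names
count_bios() and blocks_per_page() and the phys[] array of physical
block numbers are made up for illustration; real bios are also limited
by size and by holes or unwritten extents, which this ignores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of filesystem blocks covered by one page,
 * e.g. a 64k page on a 4k-block filesystem gives 16.
 */
static size_t blocks_per_page(size_t page_size, size_t block_size)
{
	assert(block_size > 0 && page_size % block_size == 0);
	return page_size / block_size;
}

/*
 * Hypothetical model: phys[i] is the physical block number backing the
 * i-th block of the page. Assume one bio per physically contiguous run,
 * so a new bio starts at block 0 and at every discontinuity.
 */
static size_t count_bios(const uint64_t *phys, size_t nblocks)
{
	size_t bios = 0;
	size_t i;

	assert(phys != NULL);

	for (i = 0; i < nblocks; i++) {
		if (i == 0 || phys[i] != phys[i - 1] + 1)
			bios++;
	}
	return bios;
}
```

With a fully contiguous mapping count_bios() returns 1, so there is only
one completion for the page. A page whose 16 blocks sit in three
separate physical runs needs three bios under this model, and each of
them can complete independently of the others.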

--D