linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Hugh Dickins <hughd@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Matthew Wilcox <willy@infradead.org>,
	syzbot <syzbot+2fc0712f8f8b8b8fa0ef@syzkaller.appspotmail.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>,
	syzkaller-bugs <syzkaller-bugs@googlegroups.com>
Subject: Re: kernel BUG at mm/page-writeback.c:LINE!
Date: Tue, 12 Jan 2021 11:44:25 +0100	[thread overview]
Message-ID: <20210112104425.GA8760@quack2.suse.cz> (raw)
In-Reply-To: <CAHk-=wgD9GK5CeHopYmRHoYS9cNuCmDMsc=+MbM_KgJ0KB+=ng@mail.gmail.com>

On Fri 08-01-21 18:04:21, Linus Torvalds wrote:
> On Tue, Jan 5, 2021 at 11:53 AM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > I took your "way to go" statement as an ack, and made it all be commit
> > c2407cf7d22d ("mm: make wait_on_page_writeback() wait for multiple
> > pending writebacks").
> 
> Oh, and Michael Larabel (of phoronix) reports that that one-liner does
> something bad to a few PostgreSQL tests, on the order of 5-10%
> regression on some machines (but apparently not others).

Do you have more details? From my experience (we do regular pgbench runs
for various kernels in various configs in SUSE) PostgreSQL numbers tend to
be somewhat noisy and more dependent on CPU scheduling and NUMA locality
than anything else. But it very much depends on the exact config passed to
pgbench so that's why I'm asking...

> I suspect that's a sign of instability in the benchmark numbers, but
> it probably also means that we have some silly condition where
> multiple threads want to clean the same page.
> 
> I sent him a patch to try if it ends up being better to just not wake
> things up early at all (instead of the "if" -> "while") conversion.
> That trivial patch appended here in case anybody has comments.
> 
> Just the fact that that one-liner made a performance impact makes me
> go "hmm", though. Michael didn't see the BUG_ON(), so it's presumably
> some _other_ user of wait_on_page_writeback() than the
> write_cache_pages() one that causes issues.
> 
> Anybody got any suspicions? Honestly, when working on the page wait
> queues, I was working under the assumption that it's really just the
> page lock that truly matters.
> 
> I'm thinking things like __filemap_fdatawait_range(), which doesn't
> hold the page lock at all, so it's all kinds of non-serialized, and
> could now be waiting for any number of IO's ro complete..
> 
> Oh well. This email doesn't really have a point, it's more of a
> heads-up that that "wait to see one or multiple writebacks" thing
> seems to matter more than I would have expected for some loads..

Honestly I'm surprised your patch made a difference as well. It is pretty
common a page gets redirtied while it's being written back but usually it
takes time before next writeback of the page is started. But I guess with
the DB load it is possible e.g. if we frequently flush out some page for
data consistency reasons (I know PostgreSQL is using sync_file_range(2)
interface to start flushing pages early and then uses fsync(2) when it
really needs the pages written which could create a situation with unfair
treatment of PageWriteback bit).

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

  reply	other threads:[~2021-01-12 10:45 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-03 14:19 kernel BUG at mm/page-writeback.c:LINE! syzbot
2021-01-04 20:41 ` Andrew Morton
2021-01-04 21:52   ` Linus Torvalds
2021-01-05  3:28     ` Hugh Dickins
2021-01-05 19:31       ` Linus Torvalds
2021-01-05 19:53         ` Linus Torvalds
2021-01-05 21:13           ` Hugh Dickins
2021-01-05 21:22             ` Linus Torvalds
2021-01-05 21:34               ` Matthew Wilcox
2021-01-09  2:04           ` Linus Torvalds
2021-01-12 10:44             ` Jan Kara [this message]
2021-01-12 17:35               ` Linus Torvalds

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210112104425.GA8760@quack2.suse.cz \
    --to=jack@suse.cz \
    --cc=akpm@linux-foundation.org \
    --cc=hughd@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=syzbot+2fc0712f8f8b8b8fa0ef@syzkaller.appspotmail.com \
    --cc=syzkaller-bugs@googlegroups.com \
    --cc=torvalds@linux-foundation.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).