All of lore.kernel.org
 help / color / mirror / Atom feed
From: Linus Torvalds <torvalds@linux-foundation.org>
To: Jan Kara <jack@suse.cz>
Cc: Dave Chinner <david@fromorbit.com>, Dave Jones <davej@redhat.com>,
	Oleg Nesterov <oleg@redhat.com>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Linux Kernel <linux-kernel@vger.kernel.org>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	Andrey Vagin <avagin@openvz.org>,
	Steven Rostedt <rostedt@goodmis.org>
Subject: Re: frequent softlockups with 3.10rc6.
Date: Tue, 2 Jul 2013 10:38:20 -0700	[thread overview]
Message-ID: <CA+55aFxSY76aWEC04yREjFObPF3ZJFMHdczoT=yGmSpsg5hhyg@mail.gmail.com> (raw)
In-Reply-To: <20130702165752.GA12179@quack.suse.cz>

On Tue, Jul 2, 2013 at 9:57 AM, Jan Kara <jack@suse.cz> wrote:
>
>   sync(2) was always slow in presence of heavy concurrent IO so I don't
> think this is a stable material.

It's not the "sync being slow" part I personally react to. I don't
care that much about that.

It's the "sync slows down other things" part that makes me go "Hmm,
this may be due to interactions with the flushing changes". A slow
sync is fine - a sync that causes the global disk throughput to go
down by also stopping *other* writeback is not.

So it's the effect on the normal background writeback that makes me go
"hmm - have we really always had that, or is this an effect of the old
sync logic _mixed_ with all the bdflush -> worker changes"

The thing is, it used to be that bdflush didn't much care what a sync
by another user was doing. But bdflush doesn't exist any more, it's
all worker threads..

>   The trouble is with callers like write_inode_now() from iput_final().
> For write_inode_now() to work correctly in that place, you must make sure
> page writeback is finished before calling ->write_inode() because
> filesystems may (and do) dirty the inode in their ->end_io callbacks. If
> you don't wait you risk calling ->evict_inode() on a dirty inode and thus
> loosing some updates.

My point was - why don't we move that sync thing into the caller (so
write_inode_now() in this case)?

IOW, I'm not disputing the need for filemap_fdatawait() in the data
paths. I'm just saying that maybe we could split things up - including
that whole "write_inode()" call. Some users clearly want to do this in
different orders.

That said, we might also just want to change the "sync_mode" thing.
The thing that I dislike about this patch (even though I applied it)
is that odd

        if (wbc->sync_mode == WB_SYNC_ALL && !wbc->for_sync) {

test. It doesn't make sense to me. It's a hack saying "I know that
'sync' does something special and doesn't actually want this
particular WB_SYNC_ALL behavior at all". That's hacky. Moving that
kind of "I know what the caller *really* meant" logic into the callers
- by splitting up the logic - would get rid of the hacky part.

But another approach of getting rid of the hacky part might be to
simple split - and rename - that "WB_SYNC_ALL" thing, and simply say
"clearly 'sync()' and individual callers of 'write_inode_now()' have
totally different expectations of the semantics of WB_SYNC_ALL". Which
means that they really shouldn't share the same "sync_mode" at all.

So maybe we could just extend that "sync_mode", and have the ones that
want to do _one_ inode synchronously use "WB_SYNC_SINGLE" to make it
clear that they are syncing a single inode. Vs "WB_SYNC_ALL" that
would be used for "I'm syncing all inodes, and I'll do a separate
second pass for syncing".

Then that test would become

        if (wbc->sync_mode == WB_SYNC_SINGLE) {

instead, and now "sync_mode" would actually describe what mode of
syncing the caller wants, without that hacky special "we know what the
caller _really_ meant by looking at *which* caller it is".

See what my objection to the code is? And maybe there is yet another
solution to the oddity, I've just outlined two possible ones..

                   Linus

  reply	other threads:[~2013-07-02 17:38 UTC|newest]

Thread overview: 93+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-06-19 16:45 frequent softlockups with 3.10rc6 Dave Jones
2013-06-19 17:53 ` Dave Jones
2013-06-19 18:13   ` Paul E. McKenney
2013-06-19 18:42     ` Dave Jones
2013-06-20  0:12     ` Dave Jones
2013-06-20 16:16       ` Paul E. McKenney
2013-06-20 16:27         ` Dave Jones
2013-06-21 15:11         ` Dave Jones
2013-06-21 19:59           ` Oleg Nesterov
2013-06-22  1:37             ` Dave Jones
2013-06-22 17:31               ` Oleg Nesterov
2013-06-22 21:59                 ` Dave Jones
2013-06-23  5:00                   ` Andrew Vagin
2013-06-23 14:36                   ` Oleg Nesterov
2013-06-23 15:06                     ` Dave Jones
2013-06-23 16:04                       ` Oleg Nesterov
2013-06-24  0:21                         ` Dave Jones
2013-06-24  2:00                         ` Dave Jones
2013-06-24 14:39                           ` Oleg Nesterov
2013-06-24 14:52                             ` Steven Rostedt
2013-06-24 16:00                               ` Dave Jones
2013-06-24 16:24                                 ` Steven Rostedt
2013-06-24 16:51                                   ` Dave Jones
2013-06-24 17:04                                     ` Steven Rostedt
2013-06-25 16:55                                       ` Dave Jones
2013-06-25 17:21                                         ` Steven Rostedt
2013-06-25 17:23                                           ` Steven Rostedt
2013-06-25 17:26                                           ` Dave Jones
2013-06-25 17:31                                             ` Steven Rostedt
2013-06-25 17:32                                             ` Steven Rostedt
2013-06-25 17:29                                           ` Steven Rostedt
2013-06-25 17:34                                             ` Dave Jones
2013-06-24 16:37                                 ` Oleg Nesterov
2013-06-24 16:49                                   ` Dave Jones
2013-06-24 15:57                         ` Dave Jones
2013-06-24 17:35                           ` Oleg Nesterov
2013-06-24 17:44                             ` Dave Jones
2013-06-24 17:53                             ` Steven Rostedt
2013-06-24 18:00                               ` Dave Jones
2013-06-25 15:35                             ` Dave Jones
2013-06-25 16:23                               ` Steven Rostedt
2013-06-26  5:23                                 ` Dave Jones
2013-06-26 19:52                                   ` Steven Rostedt
2013-06-26 20:00                                     ` Dave Jones
2013-06-27  3:01                                       ` Steven Rostedt
2013-06-26  5:48                                 ` Dave Jones
2013-06-26 19:18                               ` Oleg Nesterov
2013-06-26 19:40                                 ` Dave Jones
2013-06-27  0:22                                 ` Dave Jones
2013-06-27  1:06                                   ` Eric W. Biederman
2013-06-27  2:32                                     ` Tejun Heo
2013-06-27  7:55                                   ` Dave Chinner
2013-06-27 10:06                                     ` Dave Chinner
2013-06-27 12:52                                       ` Dave Chinner
2013-06-27 15:21                                         ` Dave Jones
2013-06-28  1:13                                           ` Dave Chinner
2013-06-28  3:58                                             ` Dave Chinner
2013-06-28 10:28                                               ` Jan Kara
2013-06-29  3:39                                                 ` Dave Chinner
2013-07-01 12:00                                                   ` Jan Kara
2013-07-02  6:29                                                     ` Dave Chinner
2013-07-02  8:19                                                       ` Jan Kara
2013-07-02 12:38                                                         ` Dave Chinner
2013-07-02 14:05                                                           ` Jan Kara
2013-07-02 16:13                                                             ` Linus Torvalds
2013-07-02 16:57                                                               ` Jan Kara
2013-07-02 17:38                                                                 ` Linus Torvalds [this message]
2013-07-03  3:07                                                                   ` Dave Chinner
2013-07-03  3:28                                                                     ` Linus Torvalds
2013-07-03  4:49                                                                       ` Dave Chinner
2013-07-04  7:19                                                                         ` Andrew Morton
2013-06-29 20:13                                               ` Dave Jones
2013-06-29 22:23                                                 ` Linus Torvalds
2013-06-29 23:44                                                   ` Dave Jones
2013-06-30  0:21                                                     ` Steven Rostedt
2013-07-01 12:49                                                     ` Pavel Machek
2013-06-30  0:17                                                   ` Steven Rostedt
2013-06-30  2:05                                                   ` Dave Chinner
2013-06-30  2:34                                                     ` Dave Chinner
2013-06-27 14:30                                     ` Dave Jones
2013-06-28  1:18                                       ` Dave Chinner
2013-06-28  2:54                                         ` Linus Torvalds
2013-06-28  3:54                                           ` Dave Chinner
2013-06-28  5:59                                             ` Linus Torvalds
2013-06-28  7:21                                               ` Dave Chinner
2013-06-28  8:22                                                 ` Linus Torvalds
2013-06-28  8:32                                                   ` Al Viro
2013-06-28  8:22                                               ` Al Viro
2013-06-28  9:49                                               ` Jan Kara
2013-07-01 17:57                                             ` block layer softlockup Dave Jones
2013-07-02  2:07                                               ` Dave Chinner
2013-07-02  6:01                                                 ` Dave Jones
2013-07-02  7:30                                                   ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CA+55aFxSY76aWEC04yREjFObPF3ZJFMHdczoT=yGmSpsg5hhyg@mail.gmail.com' \
    --to=torvalds@linux-foundation.org \
    --cc=avagin@openvz.org \
    --cc=davej@redhat.com \
    --cc=david@fromorbit.com \
    --cc=ebiederm@xmission.com \
    --cc=jack@suse.cz \
    --cc=linux-kernel@vger.kernel.org \
    --cc=oleg@redhat.com \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.