All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jens Axboe <axboe@fb.com>, Ingo Molnar <mingo@kernel.org>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Chris Mason <clm@fb.com>, Dave Chinner <david@fromorbit.com>,
	Jan Kara <jack@suse.cz>, Josef Bacik <jbacik@fb.com>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Neil Brown <neilb@suse.de>, Christoph Hellwig <hch@lst.de>,
	Tejun Heo <tj@kernel.org>
Subject: Re: [PATCH] fs-writeback: drop wb->list_lock during blk_finish_plug()
Date: Fri, 18 Sep 2015 17:59:56 +0200	[thread overview]
Message-ID: <20150918155956.GZ3816@twins.programming.kicks-ass.net> (raw)
In-Reply-To: <CA+55aFznZFMHzGZr1ybY9wu_FuRJ-ZiJipuHM3brvHO7GybUpQ@mail.gmail.com>

On Fri, Sep 18, 2015 at 08:32:15AM -0700, Linus Torvalds wrote:

> Actually, even a *full* schedule doesn't unplug, unless the process is
> going to sleep. See sched_submit_work(), which  will only call the
> unplugging if the process is actually going to sleep (ok, so it's a
> bit subtle if you don't know the state rules, but it's the
> "!tsk->state" check there)
> 
> So preemption and cond_resched() isn't _that_ odd. We've basically
> treated a non-sleeping schedule as a no-op for the task work.
> 
> The thinking was probably that it might be better to delay starting
> the IO in case we get scheduled back quickly, and we're obviously not
> actually _sleeping_, so it's likely not too bad.
> 
> Now, that's probably bogus, and I think that we should perhaps just
> make the rule be that "if we actually switch to another task, we run
> blk_schedule_flush_plug()".

That gets to be a tad tricky. We must either do it while holding
scheduler locks (and I would really want to avoid calling block layer
code with those held) or when the new task is already scheduled (and
that would suck too, because then we account stuff to the wrong task).

> But it should be noted that that really *does* introduce a lot of new
> potential races. Traditionally, our block layer plugging has been
> entirely thread-synchronous, and would never happen asynchronously.
> But with preemption, that "switch to another thread" really *does*
> happen asynchronously.
> 
> So making things always happen on task switch is actually fairly
> dangerous, and potentially adds the need for much more synchronization
> for the IO submission.

Can't say I'm a fan of it :/

> What we possibly *could* make the scheduler rule be:
> 
>  - if it's not an actual PREEMPT_ACTIVE (ie in a random place)

PREEMPT_ACTIVE is actually a recursion flag, 'random place' does not
factor into it. That is, most 'random places' will not have
PREEMPT_ACTIVE set.

>  - _and_ we actually switch to another thread

This is 'hard', as per the above.

>  - _then_ do the whole blk_schedule_flush_plug(tsk) thing.
> 
> adding some scheduler people to the explicit cc list.
> 
> That said, the "cond_resched[_lock]()" functions currently always set
> PREEMPT_ACTIVE (indirectly - they use preempt_schedule_common()), so
> even though those are synchronous, right now they *look* asynchronous
> to the scheduler, so we'd still have to sort that out.

See PREEMPT_ACTIVE being a recursion flag, we set it there so we won't
preempt while we're already scheduling.

  reply	other threads:[~2015-09-18 16:05 UTC|newest]

Thread overview: 58+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-09-11 19:37 [PATCH] fs-writeback: drop wb->list_lock during blk_finish_plug() Chris Mason
2015-09-11 20:02 ` Linus Torvalds
2015-09-11 20:37   ` Linus Torvalds
2015-09-11 20:40     ` Josef Bacik
2015-09-11 21:04       ` Linus Torvalds
2015-09-11 22:06         ` Linus Torvalds
2015-09-11 23:16           ` Chris Mason
2015-09-11 23:36             ` Linus Torvalds
2015-09-12  0:52               ` Linus Torvalds
2015-09-12  2:15                 ` Chris Mason
2015-09-12  2:27                   ` Linus Torvalds
2015-09-12 23:00               ` Chris Mason
2015-09-12 23:29                 ` Linus Torvalds
2015-09-12 23:46                   ` Chris Mason
2015-09-13 13:12                     ` Chris Mason
2015-09-13 22:56                       ` Dave Chinner
2015-09-13 23:12                 ` Dave Chinner
2015-09-14 20:06                   ` Linus Torvalds
2015-09-16 15:16                     ` Chris Mason
2015-09-16 19:58                       ` Jan Kara
2015-09-16 20:00                         ` Chris Mason
2015-09-16 22:07                           ` Dave Chinner
2015-09-17  0:37                             ` Dave Chinner
2015-09-17  1:12                               ` Linus Torvalds
2015-09-17  2:14                                 ` Dave Chinner
2015-09-17 19:39                                   ` Linus Torvalds
2015-09-17 22:42                                     ` Chris Mason
2015-09-17 23:08                                       ` Linus Torvalds
2015-09-17 23:56                                         ` Chris Mason
2015-09-18  0:37                                           ` Dave Chinner
2015-09-18  1:50                                             ` Linus Torvalds
2015-09-18  5:40                                               ` Dave Chinner
2015-09-18  6:04                                                 ` Linus Torvalds
2015-09-18  6:06                                                   ` Linus Torvalds
2015-09-18 14:21                                                     ` Jens Axboe
2015-09-18 13:16                                                   ` Chris Mason
2015-09-18 14:23                                                     ` Jens Axboe
2015-09-18 15:32                                                       ` Linus Torvalds
2015-09-18 15:59                                                         ` Peter Zijlstra [this message]
2015-09-18 16:02                                                           ` Peter Zijlstra
2015-09-18 16:12                                                           ` Linus Torvalds
2015-09-28 14:47                                                             ` Peter Zijlstra
2015-09-28 16:08                                                               ` Linus Torvalds
2015-09-29  7:55                                                                 ` Ingo Molnar
2015-09-18 22:17                                                   ` Dave Chinner
2015-09-21  9:24                                                     ` Jan Kara
2015-09-21  9:24                                                       ` Jan Kara
2015-09-21 20:21                                                       ` Andrew Morton
2015-09-21 20:21                                                         ` Andrew Morton
2015-09-17 23:03                                   ` Dave Chinner
2015-09-17 23:13                                     ` Linus Torvalds
2015-09-17  3:48                               ` Chris Mason
2015-09-17  4:30                                 ` Dave Chinner
2015-09-17 12:13                                   ` Chris Mason
2015-09-11 23:06         ` Chris Mason
2015-09-11 23:13           ` Linus Torvalds
  -- strict thread matches above, loose matches on Subject: below --
2015-09-09 15:23 Chris Mason
2015-09-11 18:49 ` Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150918155956.GZ3816@twins.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=axboe@fb.com \
    --cc=clm@fb.com \
    --cc=david@fromorbit.com \
    --cc=fweisbec@gmail.com \
    --cc=hch@lst.de \
    --cc=jack@suse.cz \
    --cc=jbacik@fb.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=neilb@suse.de \
    --cc=tj@kernel.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.