From: Dave Chinner <david@fromorbit.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Chris Mason <clm@fb.com>, Jan Kara <jack@suse.cz>,
	Josef Bacik <jbacik@fb.com>, LKML <linux-kernel@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Neil Brown <neilb@suse.de>, Christoph Hellwig <hch@lst.de>,
	Tejun Heo <tj@kernel.org>
Subject: Re: [PATCH] fs-writeback: drop wb->list_lock during blk_finish_plug()
Date: Fri, 18 Sep 2015 09:03:23 +1000
Message-ID: <20150917230323.GQ3902@dastard>
In-Reply-To: <20150917021453.GO3902@dastard>

On Thu, Sep 17, 2015 at 12:14:53PM +1000, Dave Chinner wrote:
> On Wed, Sep 16, 2015 at 06:12:29PM -0700, Linus Torvalds wrote:
> > On Wed, Sep 16, 2015 at 5:37 PM, Dave Chinner <david@fromorbit.com> wrote:
> > >
> > > TL;DR: Results look really bad - not only is the plugging
> > > problematic, baseline writeback performance has regressed
> > > significantly.
> > 
> > Dave, if you're testing my current -git, the other performance issue
> > might still be the spinlock thing.
> 
> I have the fix as the first commit in my local tree - it'll remain
> there until I get a conflict after an update. :)
> 
> > The plugging IO pauses are interesting, though. Plugging really
> > *shouldn't* cause that kind of pauses, _regardless_ of what level it
> > happens on, so I wonder if the patch ends up just exposing some really
> > basic problem that just normally goes hidden.
> 
> Right, that's what I suspect - it didn't happen on older kernels,
> but we've just completely reworked the writeback code for the
> control group awareness since I last looked really closely at
> this...
> 
> > Can you match up the IO wait times with just *where* it is
> > waiting? Is it waiting for that inode I_SYNC thing in
> > inode_sleep_on_writeback()?
> 
> I'll do some more investigation.

Ok, I'm happy to report there is actually nothing wrong with the
plugging code that is in your tree.  I finally tracked the problem I
was seeing down to a misbehaving RAID controller.[*]

With that problem sorted:

kernel		files/s		wall time
3.17		32500		5m54s
4.3-noplug	34400		5m25s
3.17-plug	52900		3m19s
4.3-badplug	60540		3m24s
4.3-rc1		56600		3m23s

So the 3.17/4.3-noplug baselines show no regression - 4.3 is slightly
faster. All the plugging variants show roughly the same improvement
and IO behaviour. These numbers are reproducible and there are no
weird performance inconsistencies during any of the 4.3-rc1 kernel
runs.  Hence my numbers and observed behaviour now align with
Chris' results and so I think we can say the reworked high level
plugging is behaving as we expected it to.
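
For reference, the relative changes implied by the table above can be
worked out with a short script. This is only a minimal sketch and not
part of the original measurements: the figures are copied from the
table, while the helper names and the choice of which kernel to compare
against which baseline are assumptions made for illustration.

```python
# Figures copied from the table above: (files/s, wall time in seconds).
results = {
    "3.17":        (32500, 5 * 60 + 54),
    "4.3-noplug":  (34400, 5 * 60 + 25),
    "3.17-plug":   (52900, 3 * 60 + 19),
    "4.3-badplug": (60540, 3 * 60 + 24),
    "4.3-rc1":     (56600, 3 * 60 + 23),
}


def pct_change(new, old):
    """Percentage change of new relative to old."""
    return 100.0 * (new - old) / old


def compare(kernel, baseline):
    """Return (files/s change %, wall time change %) for kernel vs baseline."""
    rate, secs = results[kernel]
    base_rate, base_secs = results[baseline]
    return pct_change(rate, base_rate), pct_change(secs, base_secs)


# Assumed pairings: unplugged 4.3 against unplugged 3.17, then each
# plugging variant against the unplugged baseline of the same version.
pairs = [
    ("4.3-noplug", "3.17"),
    ("3.17-plug", "3.17"),
    ("4.3-badplug", "4.3-noplug"),
    ("4.3-rc1", "4.3-noplug"),
]

for kernel, baseline in pairs:
    rate_pct, time_pct = compare(kernel, baseline)
    print(f"{kernel:12s} vs {baseline:11s} "
          f"files/s {rate_pct:+6.1f}%  wall time {time_pct:+6.1f}%")
```

With these pairings, the unplugged 4.3 baseline comes out a few percent
ahead of 3.17 on files/s, and each plugging variant is well ahead of
its unplugged baseline on both columns.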

Cheers,

Dave.

[*] It seems to have a dodgy battery connector, and so has been
"losing" battery backup and changing the cache mode of the HBA from
write back to write through. This results in changing from NVRAM
performance to SSD native performance and back again.  A small
vibration would cause the connection to the battery to reconnect and
the controller would switch back to writeback mode. The few log
entries in the BIOS showed status changes anywhere from a few seconds
apart to minutes apart - enough for the cache status to change
several times during a 5-10 minute benchmark run.

I didn't notice the hardware was playing up because it wasn't
triggering the machine alert indicator through the BIOS like it's
supposed to, and so the visible and audible alarms were not being
triggered, nor was the BMC logging the RAID controller cache status
changes.

In the end, I noticed it by chance - during a low level test the
behaviour changed very obviously as one of my dogs ran past the
rack.  I unplugged everything inside the server, plugged it all back
in, powered it back up and fiddled with cables until I found what
was causing the problem. And having done this, the BMC is now
sending warnings and the audible alarm is working when the battery
is disconnected... :/
-- 
Dave Chinner
david@fromorbit.com
