From: Marc MERLIN <marc@merlins.org>
To: Christian Rohmann <crohmann@netcologne.de>
Cc: Chris Murphy <lists@colorremedies.com>,
	"Austin S. Hemmelgarn" <ahferroin7@gmail.com>,
	linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: btrfs-progs 4.4 re-balance of RAID6 is very slow / limited to one cpu core?
Date: Tue, 9 Feb 2016 08:46:37 -0800
Message-ID: <20160209164637.GL13969@merlins.org>
In-Reply-To: <56B9EE1E.2040000@netcologne.de>

On Tue, Feb 09, 2016 at 02:48:14PM +0100, Christian Rohmann wrote:
> 
> 
> On 02/01/2016 09:52 PM, Chris Murphy wrote:
> >> Would some sort of stracing or profiling of the process help to narrow
> >> down where the time is currently spent and why the balancing is only
> >> running single-threaded?
> > This can't be straced. Someone a lot more knowledgeable than I am
> > might figure out where all the waits are with just a sysrq + t, if it
> > is a hold up in say parity computations. Otherwise perf which is a
> > rabbit hole but perf top is kinda cool to watch. That might give you
> > an idea where most of the cpu cycles are going if you can isolate the
> > workload to just the balance. Otherwise you may end up with noisy
> > data.
> 
> My balance run is now working away since 19th of January:
>  "885 out of about 3492 chunks balanced (996 considered),  75% left"
> 
> So this will take several more WEEKS to finish. Is there really nothing
> anyone here wants me to do or analyze to help finding the root cause of
> this? I mean with this kind of performance there is no way a RAID6 can
> be used in production. Not because the code is not stable or
> functioning, but because regular maintenance like replacing a drive or
> growing an array takes WEEKS in which another maintenance procedure
> could be necessary or, much worse, another drive might have failed.
> 
> What I'm saying is: Such a slow RAID6 balance renders the redundancy
> unusable because drives might fail quicker than the potential rebuild
> (read "balance").
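
As a rough sanity check on the "several more WEEKS" estimate quoted above,
here is a back-of-the-envelope sketch. It assumes the balance started on
19 January 2016 (the year is inferred from this message's date), that the
count of 885 of about 3492 chunks was taken on 9 February, and that the
average rate so far stays constant. The function name is only illustrative.

```python
from datetime import date


def remaining_days(done, total, started, now):
    """Extrapolate the remaining run time from the average rate so far."""
    elapsed = (now - started).days      # whole days since the balance began
    rate = done / elapsed               # chunks balanced per day, on average
    return (total - done) / rate        # days left at that same rate


# Figures from the quoted status line; dates are assumptions (see above).
days = remaining_days(885, 3492, date(2016, 1, 19), date(2016, 2, 9))
print(f"{days:.0f} days (~{days / 7:.1f} weeks) left")
```

Under those assumptions this comes out at roughly two more months, which
fits the "75% left" in the status line.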

I agree, this is bad.
For what it's worth, one of my own filesystems (a backup target with a
very large number of files) has apparently become slow enough that it
half-hangs my system whenever I use it.
I've just unmounted it to make sure my overall system performance comes
back, and I may have to delete and recreate it.

Sadly, this also means that btrfs still seems to get itself into corner
cases that cause performance issues.
I'm not saying that you hit this problem, but it is possible.

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/                         | PGP 1024R/763BE901

Thread overview: 28+ messages
2016-01-22 13:38 btrfs-progs 4.4 re-balance of RAID6 is very slow / limited to one cpu core? Christian Rohmann
2016-01-22 14:51 ` Duncan
2016-01-24  2:30 ` Henk Slager
2016-01-25 11:34   ` Christian Rohmann
2016-01-25 22:13     ` Chris Murphy
     [not found]       ` <CAKZK7uxdX9UBPOKButtPjqBOdVUfHdRTimP+W34fkz1h9P+wHg@mail.gmail.com>
2016-01-26  0:44         ` Fwd: " Justin Brown
2016-01-26  5:17           ` Chris Murphy
2016-01-26  6:14             ` Chris Murphy
2016-01-26  8:54               ` Christian Rohmann
2016-01-26 19:26                 ` Chris Murphy
2016-01-26 19:27                   ` Chris Murphy
2016-01-26 19:57                   ` Austin S. Hemmelgarn
2016-01-26 20:20                     ` Chris Murphy
2016-01-27  8:48                       ` Christian Rohmann
2016-01-27 16:34                         ` Austin S. Hemmelgarn
2016-01-27 20:58                           ` bbrendon
2016-01-27 21:53                           ` Chris Murphy
2016-01-28 12:27                             ` Austin S. Hemmelgarn
2016-02-01 14:10                             ` Christian Rohmann
2016-02-01 20:52                               ` Chris Murphy
2016-02-09 13:48                                 ` Christian Rohmann
2016-02-09 16:46                                   ` Marc MERLIN [this message]
2016-02-09 21:46                                   ` Chris Murphy
2016-02-10  2:23                                     ` Chris Murphy
2016-02-10  2:36                                       ` Chris Murphy
2016-02-10 13:19                                     ` Christian Rohmann
2016-02-10 19:16                                       ` Chris Murphy
2016-02-10 19:38                                         ` Chris Murphy
