From: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
To: linux-btrfs@vger.kernel.org
Subject: Re: BTRFS as a GlusterFS storage back-end, and what I've learned from using it as such.
Date: Thu, 13 Apr 2017 07:33:06 -0400
Message-ID: <43b388aa-197d-2700-14db-8e296c658000@gmail.com>
In-Reply-To: <pan$ea378$cb1a6397$cb89db25$92b1a470@cox.net>

On 2017-04-12 18:48, Duncan wrote:
> Austin S. Hemmelgarn posted on Wed, 12 Apr 2017 07:18:44 -0400 as
> excerpted:
>
>> On 2017-04-12 01:49, Qu Wenruo wrote:
>>>
>>> At 04/11/2017 11:40 PM, Austin S. Hemmelgarn wrote:
>>>>
>>>> 4. Depending on other factors, compression can actually slow you down
>>>> pretty significantly.  In the particular case I saw this happen (all
>>>> cores completely utilized by userspace software), LZO compression
>>>> actually caused around 5-10% performance degradation compared to no
>>>> compression.  This is somewhat obvious once it's explained, but it's
>>>> not exactly intuitive, and as such it's probably worth documenting in
>>>> the man pages that compression won't always make things better.  I may
>>>> send a patch to add this at some point in the near future.
>>>
>>> This seems interesting.
>>> Maybe it's the CPU limiting the performance?
>
>> In this case, I'm pretty certain that that's the cause.  I've only ever
>> seen this happen, though, when the CPU was under either full or more than
>> full load (so pretty much full utilization of all the cores), and it
>> gets worse as the CPU load increases.
>
> This seems blatantly obvious to me, no explanation needed, at least
> assuming people understand what compression is and does.  It certainly
> doesn't seem btrfs specific to me.
>
> Which makes me wonder if I'm missing something that would seem to
> counteract the obvious, but doesn't in this case.
>
> Compression at its most basic can be described as a tradeoff of CPU
> cycles to decrease data size (by tracking and eliminating internal
> redundancy), and thus transfer time of the data.
>
> In conditions where the bottleneck is (seek and) transfer time, as on
> HDDs with mostly idle CPUs, compression therefore tends to be a pretty
> big performance boost: the smaller size of the compressed data means
> fewer seeks and less transfer time, and because that's where the
> bottleneck is, making that stage more efficient speeds up the whole
> operation.
>
> But the context here is SSDs, with no seek time and fast transfer
> speeds, and already 100% utilized CPUs, so the bottleneck is the CPU,
> and the extra cycles needed for compression/decompression simply deepen
> that bottleneck.
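
To put rough numbers on that, here's a toy model (every figure below is
invented for illustration; only the shape of the result matters):

    # Toy model of the compression tradeoff.  All numbers are invented;
    # real figures vary wildly with hardware and data.
    data    = 1e9     # 1 GB written
    disk_bw = 150e6   # bytes/s an HDD streams sequentially
    lzo_bw  = 400e6   # bytes/s one otherwise-idle core can LZO-compress
    ratio   = 2.0     # assumed compression ratio

    # Mostly idle CPU: compression overlaps the I/O, so the slower stage
    # sets the pace, and halving the data nearly halves the time.
    t_plain = data / disk_bw                                # ~6.7 s
    t_lzo   = max(data / lzo_bw, (data / ratio) / disk_bw)  # ~3.3 s, a win

    # Fully loaded CPU: the disk is no longer the bottleneck, the CPU is,
    # so every cycle spent in the codec lands on the critical path.
    app_cpu      = 60.0        # CPU-seconds the workload itself needs
    t_plain_busy = app_cpu                   # I/O hidden behind computation
    t_lzo_busy   = app_cpu + data / lzo_bw   # ~4% slower; same ballpark as
                                             # the 5-10% regression above
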
>
> So far from a mystery, this seems so basic to me that the simplest
> dunderhead should get it, at least as long as they aren't /so/ simple
> they can't understand the tradeoff inherent in the simplest compression
> basics.
>
> But that's not the implication of the discussion quoted above, and the
> participants are both what I'd consider far more qualified to understand
> and deal with this sort of thing than I am, so I /gotta/ be missing
> something: despite my correct ultimate conclusion, I may not have reached
> it by a correct logic train, and there /must/ be some steps I've left out
> that would make this a rather less intuitive conclusion than I'm
> thinking.
>
> So what am I missing?
>
> Or is it simply that the tradeoff between CPU usage, data size, and
> minimum transit time isn't as simple and basic for most people as I'm
> assuming here, such that it isn't obvious that compression gives more
> work to an already bottlenecked CPU, reducing performance when it /is/
> the CPU that's the bottleneck?
>
There's also CPU overhead in transferring the data.  Normally this isn't 
big, but when you start talking about workloads that manage to saturate 
the SSDs' bandwidth, it has some impact.  Especially in this case, since 
these are AHCI-based SATA controllers (not quite as bad as IDE, but still 
far more overhead than SAS or even parallel SCSI).
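
Extending the same toy model with two more invented numbers shows why the
transfer overhead matters but doesn't change the outcome: compression does
save the CPU the work of pushing the eliminated bytes through the AHCI
stack, but that saving is small next to what the codec itself burns.

    # Both per-byte costs are guesses, for illustration only.
    xfer_cpu_per_byte  = 1 / 2e9    # CPU-seconds to move a byte via AHCI
    codec_cpu_per_byte = 1 / 400e6  # CPU-seconds to LZO-compress a byte

    bytes_saved = data - data / ratio                # 0.5 GB never transferred
    cpu_saved   = bytes_saved * xfer_cpu_per_byte    # ~0.25 CPU-seconds back
    cpu_spent   = data * codec_cpu_per_byte          # ~2.5 CPU-seconds spent

On a higher-overhead interface the per-byte transfer cost goes up and the
gap narrows, which is why the controller type is worth mentioning at all.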

The other thing, though, is that I see this when dealing with traditional 
hard disks too, and the way the impact grows as CPU load increases doesn't 
match what I would expect after factoring in both the increased scheduling 
overhead and the decreased runtime.  That mismatch leads me to believe 
that BTRFS is doing something less efficiently than it could here, but 
I'm not sure what.
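
For anyone who wants to gauge the codec side of this on their own
hardware, a quick single-threaded probe is enough (zlib at level 1 here
as a stand-in, since Python has no LZO in its standard library; the
absolute number will differ from the kernel's LZO, but it shows where
one core's compression throughput sits relative to disk bandwidth):

    import time
    import zlib

    # ~1 MiB of repetitive, highly compressible data; real ratios vary.
    data = b"btrfs compression throughput probe " * 30000

    start = time.perf_counter()
    for _ in range(100):
        zlib.compress(data, 1)   # level 1: fast, LZO-ish speed tradeoff
    elapsed = time.perf_counter() - start

    mb = len(data) * 100 / 1e6
    print(f"single-core compress throughput: {mb / elapsed:.0f} MB/s")

If that number comes out well above the disk's bandwidth, compression
should only hurt when there are no spare cycles, which is exactly the
situation described above.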

