From: Hans van Kranenburg <hans@knorrie.org>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>, linux-btrfs@vger.kernel.org
Subject: Re: Debugging abysmal write performance with 100% cpu kworker/u16:X+flush-btrfs-2
Date: Tue, 28 Jul 2020 16:52:08 +0200
Message-ID: <bfba7719-d976-d7e3-2956-f0f200de623f@knorrie.org>
In-Reply-To: <5155a008-d534-18d3-5416-1c879031b45d@gmx.com>

On 7/28/20 3:52 AM, Qu Wenruo wrote:
> 
> 
> On 2020/7/28 上午8:51, Qu Wenruo wrote:
>>
>>
>> On 2020/7/28 上午1:17, Hans van Kranenburg wrote:
>>> Hi!
>>>
>>> [...]
>>>>
>>>> Since it's almost a dead CPU burner loop, regular sleep based lockup
>>>> detector won't help much.
>>>
>>> Here's a flame graph of 180 seconds, taken from the kernel thread pid:
>>>
>>> https://syrinx.knorrie.org/~knorrie/btrfs/keep/2020-07-27-perf-kworker-flush-btrfs.svg
>>
>> That's really awesome!
>>
>>>
>>>> You can try trace events first to see which trace events get executed the
>>>> most frequently, then try to add probe points to pin down the real cause.
>>>
>>> From the default collection, I already got the following, a few days
>>> ago, by enabling find_free_extent and btrfs_cow_block:
>>>
>>> https://syrinx.knorrie.org/~knorrie/btrfs/keep/2020-07-25-find_free_extent.txt
> 
> This output is in fact pretty confusing, and may give you a false view
> of the callers of find_free_extent().
> 
> It always shows "EXTENT_TREE" as the owner, but that's due to a bad
> decision on the trace event.
> 
> I have submitted a patch addressing it, and added you to the CC.

Oh right, thanks, that actually makes a lot of sense, lol.

I was misled because at first sight I thought, "yeah, obviously, where
else than in the extent tree would you do the administration of
allocated blocks?", and didn't yet realize that it made no sense.

> Would you mind re-capturing the events with that patch?
> That way we could get a clearer idea of which tree sees the most
> concurrency.

Yes. I will do that and try to reproduce the symptoms with as few
actions in parallel as possible.
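
Roughly what I have in mind for the capture, once the patched kernel is
booted (paths and event names from memory, so take them with a grain of
salt):

  cd /sys/kernel/debug/tracing
  # enable just the two events of interest
  echo 1 > events/btrfs/find_free_extent/enable
  echo 1 > events/btrfs/btrfs_cow_block/enable
  # collect while the rsync / btrfs receive workload is running
  cat trace_pipe > /tmp/btrfs-events.txt

(trace-cmd record -e btrfs:find_free_extent -e btrfs:btrfs_cow_block
would be the equivalent shortcut.)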

I've been away most of the day today; I will see how far I get later,
and otherwise continue tomorrow.

What you *can* see in the current output already, however, is that the
kworker/u16:3-13887 thread is doing all the DATA work, while many
different processes (rsync, btrfs receive) each do the find_free_extent
work for METADATA themselves. That's already an interesting difference.
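
(When I redo the capture I'll do a quick per-task tally instead of
eyeballing it; a rough sketch, assuming the capture file from the
commands above, the usual task-pid column at the start of each trace
line, and the flags printed as (DATA)/(METADATA):)

  grep find_free_extent /tmp/btrfs-events.txt | grep -w METADATA \
      | awk '{print $1}' | sort | uniq -c | sort -rn | head
  grep find_free_extent /tmp/btrfs-events.txt | grep -w DATA \
      | awk '{print $1}' | sort | uniq -c | sort -rn | head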

So, one direction to look into is which tasks are all trying to grab
that spin lock. If it is per 'space' (which sounds logical, since the
workers will never clash, because a whole block group belongs to only
one 'space'), then I don't see why kworker/u16:18-11336 would spend a
third of its time busy-waiting on that lock while it's the only process
working on METADATA.

But, I'll gather some more logs and pictures.
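
For reference, the flame graph linked above came from more or less the
standard perf + FlameGraph recipe, roughly (reconstructed from memory,
the sampling frequency is an arbitrary choice):

  # sample the busy kworker for 180 seconds, including call graphs
  perf record -F 99 -g -p <pid of the kworker> -- sleep 180
  # fold the stacks and render the SVG with the FlameGraph scripts
  perf script | ./stackcollapse-perf.pl | ./flamegraph.pl > kworker-flush.svg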

Do you know of a way (some RTFM pointer?) to debug which tasks are
contending on the same lock? I haven't researched that yet.
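
(The only candidate I'm aware of myself is the kernel's lockstat
facility, but I haven't checked whether it is usable here; a rough
sketch, assuming a kernel built with CONFIG_LOCK_STAT=y:)

  # clear old statistics and enable collection
  echo 0 > /proc/lock_stat
  echo 1 > /proc/sys/kernel/lock_stat
  # ... let the rsync / btrfs receive workload run for a while ...
  echo 0 > /proc/sys/kernel/lock_stat
  # contention counts and wait times, per lock class
  less /proc/lock_stat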

Hans

> [...]

Thread overview: 17+ messages
2020-07-25 14:24 Debugging abysmal write performance with 100% cpu kworker/u16:X+flush-btrfs-2 Hans van Kranenburg
2020-07-25 15:37 ` Holger Hoffstätte
2020-07-25 16:43   ` Hans van Kranenburg
2020-07-25 19:44     ` Holger Hoffstätte
2020-07-25 21:03       ` Hans van Kranenburg
2020-07-26  1:00         ` Chris Murphy
2020-07-25 21:27 ` Hans van Kranenburg
2020-07-26  8:10   ` A L
2020-07-26  0:50 ` Chris Murphy
2020-07-27 11:09 ` Qu Wenruo
2020-07-27 17:17   ` Hans van Kranenburg
2020-07-27 19:23     ` Chris Murphy
2020-07-27 23:16     ` Chris Murphy
2020-07-28  0:51     ` Qu Wenruo
2020-07-28  1:52       ` Qu Wenruo
2020-07-28 14:52         ` Hans van Kranenburg [this message]
2020-07-29  0:15           ` Qu Wenruo
