linux-raid.vger.kernel.org archive mirror
From: Vojtech Myslivec <vojtech@xmyslivec.cz>
To: Chris Murphy <lists@colorremedies.com>
Cc: Michal Moravec <michal.moravec@logicworks.cz>,
	Btrfs BTRFS <linux-btrfs@vger.kernel.org>,
	Linux-RAID <linux-raid@vger.kernel.org>
Subject: Re: Linux RAID with btrfs stuck and consume 100 % CPU
Date: Wed, 23 Sep 2020 20:14:19 +0200
Message-ID: <0c792470-6ee9-8254-dd57-a7a90ac95bcd@xmyslivec.cz>
In-Reply-To: <CAJCQCtTR1JZTLr+xTEZxrwh8xSfb+zpKjc+_S2tJGFofVMUdSQ@mail.gmail.com>


On 17. 09. 20 19:08, Chris Murphy wrote:
>
> On Wed, Sep 16, 2020 at 3:42 AM Vojtech Myslivec wrote:
>>
>> Description of the devices in iostat, just for recap:
>> - sda-sdf: 6 HDD disks
>> - sdg, sdh: 2 SSD disks
>>
>> - md0: raid1 over sdg1 and sdh1 ("SSD RAID", Physical Volume for LVM)
>> - md1: our "problematic" raid6 over sda-sdf ("HDD RAID", btrfs
>>        formatted)
>>
>> - Logical volumes over md0 Physical Volume (on SSD RAID)
>>     - dm-0: 4G  LV for SWAP
>>     - dm-1: 16G LV for root file system (ext4 formatted)
>>     - dm-2: 1G  LV for md1 journal
>
> It's kind of a complicated setup. When this problem happens, can you
> check swap pressure?
>
> /sys/fs/cgroup/memory.stat
>
> pgfault and maybe also pgmajfault - see if they're going up; or also
> you can look at vmstat and see how heavy swap is being used at the
> time. The thing is.
>
> Because any heavy eviction means writes to dm-0->md0 raid1->sdg+sdh
> SSDs, which are the same SSDs that you have the md1 raid6 mdadm
> journal going to. So if you have any kind of swap pressure, it very
> likely will stop the journal or at least substantially slow it down,
> and now you get blocked tasks as the pressure builds more and more
> because now you have a ton of dirty writes in Btrfs that can't make it
> to disk.
>
> If there is minimal swap usage, then this hypothesis is false and
> something else is going on. I also don't have an explanation for why
> your workaround works.

On 17. 09. 20 19:20, Chris Murphy wrote:
> The iostat isn't particularly revealing, I don't see especially high
> %util for any device. SSD write MB/s gets up to 42 which is
> reasonable.

On 17. 09. 20 19:43, Chris Murphy wrote:
> [Mon Aug 31 15:31:55 2020] sysrq: Show Blocked State
> [Mon Aug 31 15:31:55 2020]   task                        PC stack   pid father
> 
> [Mon Aug 31 15:31:55 2020] md1_reclaim     D    0   806      2 0x80004000
> [Mon Aug 31 15:31:55 2020] Call Trace:
> ...
> 
> *shrug*
> 
> These SSDs should be able to handle > 500MB/s. And > 130K IOPS. Swap
> would have to be pretty heavy to slow down journal writes.
> 
> I'm not sure I have any good advice. My remaining ideas involve
> changing configuration just to see if the problem goes away, rather
> than actually understanding the cause of the problem.

OK, I see.

This is a physical server with 32 GB of RAM, dedicated to backup tasks.
Our monitoring shows (almost) no swap usage at any time, so I hope this
is not the problem. However, I will look at the stats you mentioned and,
for a start, I will disable swap for several days. It is there only as a
"backup" just in case, and it is not used at all most of the time.
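
A minimal sketch of how the counters mentioned above could be sampled,
assuming a Linux host where /proc/vmstat exposes pswpin, pswpout, pgfault
and pgmajfault. The function names and the 5-second interval are
illustrative choices, not something taken from this thread. The same
"name value" parser should also work on a cgroup v2 memory.stat file.

```python
"""Sample swap and page-fault counters from /proc/vmstat-style text."""
import time
from pathlib import Path

# Counters of interest; names as they appear in /proc/vmstat.
COUNTERS = ("pswpin", "pswpout", "pgfault", "pgmajfault")


def parse_vmstat(text):
    """Parse 'name value' lines into a dict of ints, skipping other lines."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def counter_deltas(before, after, names=COUNTERS):
    """Return how much each named counter grew between two samples."""
    return {name: after.get(name, 0) - before.get(name, 0) for name in names}


def sample(path="/proc/vmstat", interval=5.0):
    """Read the file twice, `interval` seconds apart, and return the deltas."""
    first = parse_vmstat(Path(path).read_text())
    time.sleep(interval)
    second = parse_vmstat(Path(path).read_text())
    return counter_deltas(first, second)
```

Calling sample() while md1_reclaim is stuck, and again while the array is
healthy, would show whether pswpout and pgmajfault grow noticeably faster
during the hang. Deltas near zero in both cases would argue against the
swap-pressure hypothesis.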

Sadly, I am not able to _disable the journal_. If I try, just by removing
the device from the array, the MD device instantly fails and the btrfs
volume remounts read-only. I cannot find any other way to disable the
journal; it seems it is not supported. I can see only the
`--add-journal` option and no corresponding `--delete-journal` one.
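
As a read-only sketch (it does not remove or change the journal), the
current journal state of the array could be inspected through the md
sysfs attributes. This assumes the kernel exposes
/sys/block/md1/md/consistency_policy and /sys/block/md1/md/journal_mode,
and that journal_mode marks the active mode with square brackets, as in
"[write-through] write-back". The helper names are hypothetical.

```python
"""Inspect the journal-related sysfs attributes of an md array (read-only)."""
from pathlib import Path


def selected_option(text):
    """Return the bracketed entry from text like '[write-through] write-back'."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def read_md_attr(md_name, attr, sysfs_root="/sys/block"):
    """Read one md sysfs attribute, or return None if it cannot be read."""
    path = Path(sysfs_root) / md_name / "md" / attr
    try:
        return path.read_text().strip()
    except OSError:
        return None


def journal_summary(md_name="md1", sysfs_root="/sys/block"):
    """Collect the consistency policy and the active journal mode."""
    mode = read_md_attr(md_name, "journal_mode", sysfs_root)
    return {
        "consistency_policy": read_md_attr(md_name, "consistency_policy", sysfs_root),
        "journal_mode": selected_option(mode) if mode else None,
    }
```

On the server described here, journal_summary("md1") would be expected to
report a "journal" consistency policy; recording it before and after any
configuration change gives a simple way to confirm what actually changed.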

I welcome any advice on how to replace the write-journal with an
internal bitmap.

The other possible changes that come to my mind are:
- Enlarge the write-journal
- Move the write-journal to the physical sdg/sdh SSDs (out of the md0
  raid1 device).

I find the latter a bit risky, as the write-journal is then not
redundant. That's the reason we chose to put the write-journal on a RAID
device.

Vojtech
