From: Vojtech Myslivec <vojtech@xmyslivec.cz>
To: Chris Murphy <lists@colorremedies.com>,
Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Cc: Btrfs BTRFS <linux-btrfs@vger.kernel.org>,
Linux-RAID <linux-raid@vger.kernel.org>,
Michal Moravec <michal.moravec@logicworks.cz>,
Song Liu <songliubraving@fb.com>
Subject: Re: Linux RAID with btrfs stuck and consume 100 % CPU
Date: Wed, 12 Aug 2020 16:19:31 +0200
Message-ID: <442d5127-11f0-80ca-5914-1a561bb2c292@xmyslivec.cz>
In-Reply-To: <CAJCQCtQAHr91wEwvFmh_-UB3Cd3UecSjjy6w7nOeqUktrn4UzQ@mail.gmail.com>
[-- Attachment #1: Type: text/plain, Size: 2848 bytes --]
On 29. 07. 20 23:48, Chris Murphy wrote:
> On Wed, Jul 29, 2020 at 3:06 PM Guoqing Jiang
> <guoqing.jiang@cloud.ionos.com> wrote:
>> On 7/22/20 10:47 PM, Vojtech Myslivec wrote:
>>> 1. What should be the cause of this problem?
>>
>> Just a quick glance based on the stacks which you attached, I guess it
>> could be
>> a deadlock issue of raid5 cache super write.
>>
>> Maybe the commit 8e018c21da3f ("raid5-cache: fix a deadlock in superblock
>> write") didn't fix the problem completely. Cc Song.
>
> That references discards, and it makes me relook at mdadm -D which
> shows a journal device:
>
> 0 253 2 - journal /dev/dm-2
>
> Vojtech, can you confirm this device is an SSD? There are a couple
> SSDs that show up in the dmesg if I recall correctly.
I tried to explain this in my first post. It's a logical volume in a
volume group on top of RAID 1 over 2 SSDs.
My colleague replied with more details:
On 05. 08. 2020 Michal Moravec wrote:
>> On 29 Jul 2020, Chris Murphy wrote:
>> Vojtech, can you confirm this device is an SSD? There are a couple
>> SSDs that show up in the dmesg if I recall correctly.
>
> Yes. We have a pair (sdg, sdh) of INTEL D3-S4610 240 GB SSDs
> (SSDSC2KG240G8).
> We use them for OS and the raid6 journal.
> They are configured as raid md0 array with LVM on top of it.
> Logical volume vg0-journal_md1 (of 1G size) is used as journal device
> for the md1 array (where the problem with the md1_raid6 process
> consuming 100% CPU and blocking btrfs operations is happening)
>> What is the default discard hinting for this SSD when it's used as
>> a journal device for mdadm?
>
> What do you mean by discard hinting?
> We have issue_discards = 1 configured in /etc/lvm/lvm.conf
>> And what is the write behavior of the journal?
>
> That would be journal_mode set to write-through, right?
>> I'm not familiar with this feature at all, whether it's treated as a
>> raw block device for the journal or if the journal resides on a file
>> system.
>
> From the lsblk output I see no filesystem on vg0-journal_md1. It looks
> like a plain logical volume to me.
[my comment]: yes, it's an LV block device with no filesystem on it.
>> So I get kinda curious what might happen long term if this is a very
>> busy file system, very busy raid5/6 journal on this SSD, without any
>> discard hints?
>> Is it possible the SSD runs out of ready-to-write erase blocks, and
>> the firmware has become super slow doing erasure/garbage collection
>> on demand?
>> And the journal is now having a hard time flushing?
>
> What kind of information could we gather to verify/reject any of these
> ideas?
[my question]: Is the LVM configuration (above) enough? Sadly, there is
not much information about RAID 6 journaling on the kernel wiki. There is
some info in mdadm(8), but nothing about discard/trim operations.
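One way to gather at least part of this could be to look at what each
layer under the journal advertises in sysfs. The following is only a
minimal sketch, assuming the usual Linux block attributes
queue/discard_max_bytes and queue/discard_granularity and the md attribute
md/journal_mode. The device names (sdg, sdh, md0, dm-2, md1) are taken
from the mdadm -D and lsblk output in this thread, and the helper names
are made up for illustration. It is also an assumption that journal_mode
marks the active mode with brackets. A discard_max_bytes of 0 would mean
that the layer does not accept discards; the script says nothing about
whether the raid5-cache code actually issues any.

```python
from pathlib import Path

# Layers below the md1 journal, as named in this thread; adjust as needed.
JOURNAL_STACK = ["sdg", "sdh", "md0", "dm-2"]


def read_attr(path):
    """Return the stripped content of a sysfs attribute, or None if unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def discard_limits(dev, sysfs_root="/sys/block"):
    """Return (discard_max_bytes, discard_granularity); None for missing values."""
    queue = Path(sysfs_root) / dev / "queue"
    values = []
    for name in ("discard_max_bytes", "discard_granularity"):
        raw = read_attr(queue / name)
        values.append(int(raw) if raw is not None and raw.isdigit() else None)
    return tuple(values)


def journal_mode(md_dev, sysfs_root="/sys/block"):
    """Return the active journal mode of an md array, or None if unavailable.

    Assumption: the active mode is shown in brackets,
    e.g. "[write-through] write-back".
    """
    raw = read_attr(Path(sysfs_root) / md_dev / "md" / "journal_mode")
    if raw is None:
        return None
    for word in raw.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    # No bracketed entry found: return the raw text unchanged.
    return raw


if __name__ == "__main__":
    for dev in JOURNAL_STACK:
        max_bytes, granularity = discard_limits(dev)
        if max_bytes is None:
            state = "unknown"
        elif max_bytes > 0:
            state = "accepted"
        else:
            state = "not accepted"
        print(f"{dev}: discards {state} "
              f"(max_bytes={max_bytes}, granularity={granularity})")
    print("md1 journal_mode:", journal_mode("md1"))
```

Run on the affected machine, this prints one line per layer plus the
journal mode of md1, which could be compared with the issue_discards
setting mentioned above.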
[-- Attachment #2: lsblk-output.txt --]
[-- Type: text/plain, Size: 1076 bytes --]
NAME                     MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sdg                        8:96   1 223,6G  0 disk
├─sdg1                     8:97   1  37,3G  0 part
│ └─md0                    9:0    0  37,2G  0 raid1
│   ├─vg0-swap           253:0    0   3,7G  0 lvm   [SWAP]
│   ├─vg0-root           253:1    0  14,9G  0 lvm   /
│   └─vg0-journal_md1    253:2    0     1G  0 lvm
│     └─md1                9:1    0  29,1T  0 raid6 /mnt/data
├─sdg2                     8:98   1     1K  0 part
└─sdg5                     8:101  1 186,3G  0 part
sdh                        8:112  1 223,6G  0 disk
├─sdh1                     8:113  1  37,3G  0 part
│ └─md0                    9:0    0  37,2G  0 raid1
│   ├─vg0-swap           253:0    0   3,7G  0 lvm   [SWAP]
│   ├─vg0-root           253:1    0  14,9G  0 lvm   /
│   └─vg0-journal_md1    253:2    0     1G  0 lvm
│     └─md1                9:1    0  29,1T  0 raid6 /mnt/data
├─sdh2                     8:114  1     1K  0 part
└─sdh5                     8:117  1 186,3G  0 part