From: Song Liu <song@kernel.org>
To: Dan Moulding <dan@danm.net>
Cc: junxiao.bi@oracle.com, gregkh@linuxfoundation.org,
linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org,
regressions@lists.linux.dev, stable@vger.kernel.org,
yukuai1@huaweicloud.com
Subject: Re: [REGRESSION] 6.7.1: md: raid5 hang and unresponsive system; successfully bisected
Date: Tue, 6 Feb 2024 00:07:36 -0800
Message-ID: <CAPhsuW58VdmZwigxP6t_fstkSDb34GB9+gTM0Sziet=n17HzQg@mail.gmail.com>
In-Reply-To: <20240125203130.28187-1-dan@danm.net>

On Thu, Jan 25, 2024 at 12:31 PM Dan Moulding <dan@danm.net> wrote:
>
> Hi Junxiao,
>
> I first noticed this problem the day after I had upgraded some
> machines to the 6.7.1 kernel. One of the machines is a backup server.
> Just a few hours after the upgrade to 6.7.1, it started running its
> overnight backup jobs. Those backup jobs hung part way through. When I
> tried to check on the backups in the morning, I found the server
> mostly unresponsive. I could SSH in but most shell commands would just
> hang. I was able to run top and see that the md0_raid5 kernel thread
> was using 100% CPU. I tried to reboot the server, but it wasn't able
> to shut down successfully and eventually I had to hard reset it.
>
> The next day, the same sequence of events occurred on that server
> again when it tried to run its backup jobs. Then the following day, I
> experienced another hang on a different machine, with a similar RAID-5
> configuration. That time I was scp'ing a large file to a virtual
> machine whose image was stored on the RAID-5 array. Part way through
> the transfer scp reported that the transfer had stalled. I checked top
> on that machine and found once again that the md0_raid5 kernel thread
> was using 100% CPU.
>
> Yesterday I created a fresh Fedora 39 VM for the purposes of
> reproducing this problem in a different environment (the other two
> machines are both Gentoo servers running v6.7 kernels straight from
> the stable trees with a custom kernel configuration). I am able to
> reproduce the problem on Fedora 39 running both the v6.6.13 stable
> tree kernel code and the Fedora 39 6.6.13 distribution kernel.
>
> On this Fedora 39 VM, I created a 1GiB LVM volume to use as the RAID-5
> journal from space on the "boot" disk. Then I attached 3 additional
> 100 GiB virtual disks and created the RAID-5 from those 3 disks and
> the write-journal device. I then created a new LVM volume group from
> the md0 array and created one LVM logical volume named "data", using
> all but 64GiB of the available VG space. I then created an ext4 file
> system on the "data" volume, mounted it, and used "dd" to copy 1MiB
> blocks from /dev/urandom to a file on the "data" file system, and just
> let it run. Eventually "dd" hangs and top shows that md0_raid5 is
> using 100% CPU.
>
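> The above corresponds roughly to this command sequence (reconstructed
> here for reference; device, VG, and mount names are illustrative, not
> the exact commands used):
>
>     lvcreate -L 1G -n lv-journal vg0       # journal LV on the "boot" disk's VG
>     mdadm --create /dev/md0 --level=5 --raid-devices=3 \
>           --write-journal /dev/vg0/lv-journal /dev/vdb /dev/vdc /dev/vdd
>     vgcreate vg-md-data /dev/md0
>     lvcreate -L 136G -n data vg-md-data    # all but 64GiB of the ~200GiB VG
>     mkfs.ext4 /dev/vg-md-data/data
>     mkdir -p /data && mount /dev/vg-md-data/data /data
>     dd if=/dev/urandom bs=1M of=/data/random.dat status=progress
>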
> Here is an example command I just ran, which has hung after writing
> 4.1 GiB of random data to the array:
>
> test@localhost:~$ dd if=/dev/urandom bs=1M of=/data/random.dat status=progress
> 4410310656 bytes (4.4 GB, 4.1 GiB) copied, 324 s, 13.6 MB/s
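>
> When it hangs like this, the kernel stack of the spinning thread can
> be captured with something along these lines (as root; thread name as
> reported by top):
>
>     cat /proc/$(pgrep md0_raid5)/stack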

Update on this:

I haven't been able to reproduce the issue with the following config
on the md-6.9 branch [1]; the array works fine, AFAICT.

Dan, could you please run the test on this branch
(83cbdaf61b1ab9cdaa0321eeea734bc70ca069c8)?
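
Something along these lines should get that exact commit (remote name
is arbitrary; repository as in [1]):

    git remote add song https://git.kernel.org/pub/scm/linux/kernel/git/song/md.git
    git fetch song md-6.9
    git checkout 83cbdaf61b1ab9cdaa0321eeea734bc70ca069c8
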
Thanks,
Song
[1] https://git.kernel.org/pub/scm/linux/kernel/git/song/md.git/log/?h=md-6.9

[root@eth50-1 ~]# lsblk
NAME                             MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINT
sr0                               11:0  1 1024M  0 rom
vda                              253:0  0   32G  0 disk
├─vda1                           253:1  0    2G  0 part  /boot
└─vda2                           253:2  0   30G  0 part  /
nvme2n1                          259:0  0   50G  0 disk
└─md0                              9:0  0  100G  0 raid5
  ├─vg--md--data-md--data-real   250:2  0   50G  0 lvm
  │ ├─vg--md--data-md--data      250:1  0   50G  0 lvm   /mnt/2
  │ └─vg--md--data-snap          250:4  0   50G  0 lvm
  └─vg--md--data-snap-cow        250:3  0   49G  0 lvm
    └─vg--md--data-snap          250:4  0   50G  0 lvm
nvme0n1                          259:1  0   50G  0 disk
└─md0                              9:0  0  100G  0 raid5
  ├─vg--md--data-md--data-real   250:2  0   50G  0 lvm
  │ ├─vg--md--data-md--data      250:1  0   50G  0 lvm   /mnt/2
  │ └─vg--md--data-snap          250:4  0   50G  0 lvm
  └─vg--md--data-snap-cow        250:3  0   49G  0 lvm
    └─vg--md--data-snap          250:4  0   50G  0 lvm
nvme1n1                          259:2  0   50G  0 disk
└─md0                              9:0  0  100G  0 raid5
  ├─vg--md--data-md--data-real   250:2  0   50G  0 lvm
  │ ├─vg--md--data-md--data      250:1  0   50G  0 lvm   /mnt/2
  │ └─vg--md--data-snap          250:4  0   50G  0 lvm
  └─vg--md--data-snap-cow        250:3  0   49G  0 lvm
    └─vg--md--data-snap          250:4  0   50G  0 lvm
nvme4n1                          259:3  0    2G  0 disk
nvme3n1                          259:4  0   50G  0 disk
└─vg--data-lv--journal           250:0  0  512M  0 lvm
  └─md0                            9:0  0  100G  0 raid5
    ├─vg--md--data-md--data-real 250:2  0   50G  0 lvm
    │ ├─vg--md--data-md--data    250:1  0   50G  0 lvm   /mnt/2
    │ └─vg--md--data-snap        250:4  0   50G  0 lvm
    └─vg--md--data-snap-cow      250:3  0   49G  0 lvm
      └─vg--md--data-snap        250:4  0   50G  0 lvm
nvme5n1                          259:5  0    2G  0 disk
nvme6n1                          259:6  0    4G  0 disk

[root@eth50-1 ~]# cat /proc/mdstat
Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md0 : active raid5 nvme2n1[4] dm-0[3](J) nvme1n1[1] nvme0n1[0]
      104790016 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/3] [UUU]

unused devices: <none>

[root@eth50-1 ~]# mount | grep /mnt/2
/dev/mapper/vg--md--data-md--data on /mnt/2 type ext4 (rw,relatime,stripe=256)
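
For completeness, the "(J)" flag on dm-0 in the mdstat output above
marks the write-journal device; the journal can also be confirmed with
something like:

    mdadm --detail /dev/md0 | grep -i 'consistency policy'

which should report "journal" for this array.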