All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dan Moulding <dan@danm.net>
To: junxiao.bi@oracle.com
Cc: dan@danm.net, gregkh@linuxfoundation.org,
	linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org,
	regressions@lists.linux.dev, song@kernel.org,
	stable@vger.kernel.org, logang@deltatee.com
Subject: Re: [REGRESSION] 6.7.1: md: raid5 hang and unresponsive system; successfully bisected
Date: Sat,  2 Mar 2024 09:55:38 -0700	[thread overview]
Message-ID: <20240302165538.30761-1-dan@danm.net> (raw)
In-Reply-To: <739634c3-3e21-44dd-abb1-356cf54e54fd@oracle.com>

> I have not root cause this yet, but would like share some findings from 
> the vmcore Dan shared. From what i can see, this doesn't look like a md 
> issue, but something wrong with block layer or below.

Below is one other thing I found that might be of interest. This is
from the original email thread [1] that was linked to in the original
issue from 2022, which the change in question reverts:

On 2022-09-02 17:46, Logan Gunthorpe wrote:
> I've made some progress on this nasty bug. I've got far enough to know it's not
> related to the blk-wbt or the block layer.
> 
> Turns out a bunch of bios are stuck queued in a blk_plug in the md_raid5 
> thread while that thread appears to be stuck in an infinite loop (so it never
> schedules or does anything to flush the plug). 
> 
> I'm still debugging to try and find out the root cause of that infinite loop, 
> but I just wanted to send an update that the previous place I was stuck at
> was not correct.
> 
> Logan

This certainly sounds like it has some similarities to what we are
seeing when that change is reverted. The md0_raid5 thread appears to be
in an infinite loop, consuming 100% CPU, but not actually doing any
work.

-- Dan

[1] https://lore.kernel.org/r/7f3b87b6-b52a-f737-51d7-a4eec5c44112@deltatee.com

  parent reply	other threads:[~2024-03-02 16:55 UTC|newest]

Thread overview: 56+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-23  0:56 [REGRESSION] 6.7.1: md: raid5 hang and unresponsive system; successfully bisected Dan Moulding
2024-01-23  1:08 ` Song Liu
2024-01-23  1:35 ` Dan Moulding
2024-01-23  6:35   ` Song Liu
2024-01-23 21:53     ` Dan Moulding
2024-01-23 22:21       ` Song Liu
2024-01-23 23:58         ` Dan Moulding
2024-01-25  0:01           ` Song Liu
2024-01-25 16:44             ` junxiao.bi
2024-01-25 19:40               ` Song Liu
2024-01-25 20:31               ` Dan Moulding
2024-01-26  3:30                 ` Carlos Carvalho
2024-01-26 15:46                   ` Dan Moulding
2024-01-30 16:26                     ` Blazej Kucman
2024-01-30 20:21                       ` Song Liu
2024-01-31  1:26                       ` Song Liu
2024-01-31  2:13                         ` Yu Kuai
2024-01-31  2:41                       ` Yu Kuai
2024-01-31  4:55                         ` Song Liu
2024-01-31 13:36                           ` Blazej Kucman
2024-02-01  1:39                             ` Yu Kuai
2024-01-26 16:21                   ` Roman Mamedov
2024-01-31 17:37                 ` junxiao.bi
2024-02-06  8:07                 ` Song Liu
2024-02-06 20:56                   ` Dan Moulding
2024-02-06 21:34                     ` Song Liu
2024-02-20 23:06 ` Dan Moulding
2024-02-20 23:15   ` junxiao.bi
2024-02-21 14:50     ` Mateusz Kusiak
2024-02-21 19:15       ` junxiao.bi
2024-02-23 17:44     ` Dan Moulding
2024-02-23 19:18       ` junxiao.bi
2024-02-23 20:22         ` Dan Moulding
2024-02-23  8:07   ` Linux regression tracking (Thorsten Leemhuis)
2024-02-24  2:13     ` Song Liu
2024-02-25 17:46       ` Thomas B. Clark
2024-02-26  1:17         ` Thomas B. Clark
2024-02-26 17:35           ` Song Liu
2024-03-01 20:26       ` junxiao.bi
2024-03-01 23:12         ` Dan Moulding
2024-03-02  0:05           ` Song Liu
2024-03-06  8:38             ` Linux regression tracking (Thorsten Leemhuis)
2024-03-06 17:13               ` Song Liu
2024-03-02 16:55         ` Dan Moulding [this message]
2024-03-07  3:34         ` Yu Kuai
2024-03-08 23:49         ` junxiao.bi
2024-03-10  5:13           ` Dan Moulding
2024-03-11  1:50           ` Yu Kuai
2024-03-12 22:56             ` junxiao.bi
2024-03-13  1:20               ` Yu Kuai
2024-03-14 18:20                 ` junxiao.bi
2024-03-14 22:36                   ` Song Liu
2024-03-15  1:30                   ` Yu Kuai
2024-03-14 16:12             ` Dan Moulding
2024-03-15  1:17               ` Yu Kuai
2024-03-19 14:16                 ` Dan Moulding

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240302165538.30761-1-dan@danm.net \
    --to=dan@danm.net \
    --cc=gregkh@linuxfoundation.org \
    --cc=junxiao.bi@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-raid@vger.kernel.org \
    --cc=logang@deltatee.com \
    --cc=regressions@lists.linux.dev \
    --cc=song@kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.