linux-raid.vger.kernel.org archive mirror
From: Donald Buczek <buczek@molgen.mpg.de>
To: Dragan Stancevic <dragan@stancevic.com>,
	Yu Kuai <yukuai1@huaweicloud.com>,
	song@kernel.org
Cc: guoqing.jiang@linux.dev, it+raid@molgen.mpg.de,
	linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org,
	msmith626@gmail.com,
	"yangerkun@huawei.com" <yangerkun@huawei.com>
Subject: Re: md_raid: mdX_raid6 looping after sync_action "check" to "idle" transition
Date: Sun, 24 Sep 2023 16:35:59 +0200
Message-ID: <fb261b77-4859-07bb-e586-8589741e0c9e@molgen.mpg.de>
In-Reply-To: <f79867f5-befb-0d7d-0c01-a42caa5d1466@molgen.mpg.de>

On 9/17/23 10:55, Donald Buczek wrote:
> On 9/14/23 08:03, Donald Buczek wrote:
>> On 9/13/23 16:16, Dragan Stancevic wrote:
>>> Hi Donald-
>>> [...]
>>> Here is a list of changes for 6.1:
>>>
>>> e5e9b9cb71a0 md: factor out a helper to wake up md_thread directly
>>> f71209b1f21c md: enhance checking in md_check_recovery()
>>> 753260ed0b46 md: wake up 'resync_wait' at last in md_reap_sync_thread()
>>> 130443d60b1b md: refactor idle/frozen_sync_thread() to fix deadlock
>>> 6f56f0c4f124 md: add a mutex to synchronize idle and frozen in action_store()
>>> 64e5e09afc14 md: refactor action_store() for 'idle' and 'frozen'
>>> a865b96c513b Revert "md: unlock mddev before reap sync_thread in action_store"
>>
>> Thanks!
>>
>> I've put these patches on v6.1.52. A few hours ago I started a script which transitions the three md-devices of a very active backup server through idle->check->idle every 6 minutes.  It went through ~400 iterations till now. No lock-ups so far.
> 
> Oh dear, looks like the deadlock problem is _not_ fixed with these patches.
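
For readers who want to reproduce the idle->check->idle cycling described in the quoted text, a minimal sketch might look like the following. This is not the script that was actually run: the names SYSFS_ROOT, set_sync_action and cycle_once, and the run time passed to cycle_once, are assumptions made for illustration. Only the sysfs path and the "check" and "idle" values come from this thread.

```shell
#!/bin/sh
# Hypothetical sketch, not the original script.
# SYSFS_ROOT is an illustrative override so the functions can be pointed
# at a scratch directory instead of the real sysfs tree.
SYSFS_ROOT="${SYSFS_ROOT:-/sys/devices/virtual/block}"

# Write a sync_action value for one array, e.g.: set_sync_action md0 check
set_sync_action() {
    printf '%s\n' "$2" > "$SYSFS_ROOT/$1/md/sync_action"
}

# One iteration for one array: start a check, let it run for a while,
# then request idle again.
cycle_once() {
    dev=$1
    run_seconds=$2
    set_sync_action "$dev" check || return 1
    sleep "$run_seconds"
    # In the reported incidents it is this write that hangs.
    set_sync_action "$dev" idle
}
```

A driver loop such as `while :; do for d in md0 md1 md2; do cycle_once "$d" 180; done; done` would then keep the arrays cycling; the device names and the 180 seconds are placeholders, not values from the original setup.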

Some more info after another incident:

- We've hit the deadlock with 5.15.131 (so it is NOT introduced by any of the above patches).
- The symptoms are not exactly the same as with the original year-old problem. Differences:
  - mdX_raid6 is NOT busy looping.
  - /sys/devices/virtual/block/mdX/md/array_state says "active", not "write pending".
  - `echo active > /sys/devices/virtual/block/mdX/md/array_state` does not resolve the deadlock.
  - After hours in the deadlock state, the system resumed operation when a script of mine read(!) lots of sysfs files.
- But in both cases, `echo idle > /sys/devices/virtual/block/mdX/md/sync_action` hangs, as does all I/O on the raid.
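
To compare incidents against the symptoms listed above, the two sysfs attributes mentioned can be captured in one go. The following is only a sketch: md_snapshot and SYSFS_ROOT are hypothetical names, and since reading sysfs files coincided with the system resuming operation in the case described above, taking such a snapshot may itself perturb a hung array.

```shell
#!/bin/sh
# Hypothetical helper, not part of any existing tool.
# SYSFS_ROOT is an illustrative override for testing outside real sysfs.
SYSFS_ROOT="${SYSFS_ROOT:-/sys/devices/virtual/block}"

# Print array_state and sync_action for one array, e.g.: md_snapshot md0
# An attribute that cannot be read is reported as "?".
md_snapshot() {
    for attr in array_state sync_action; do
        printf '%s=%s\n' "$attr" \
            "$(cat "$SYSFS_ROOT/$1/md/$attr" 2>/dev/null || echo '?')"
    done
}
```

Calling `md_snapshot md0` prints one `name=value` line per attribute, which makes it easy to log the state before and after writing to sync_action.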

The fact that we didn't hit the problem for many months on 5.15.94 might hint that it was introduced between 5.15.94 and 5.15.131.

We'll try to reproduce the problem on a test machine for analysis, but this may take time (vacation imminent for one...).

But it's not like these patches caused the problem. And maybe they _did_ fix the original problem, as we didn't hit that one.

Best

   Donald

-- 
Donald Buczek
buczek@molgen.mpg.de
Tel: +49 30 8413 1433
