All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mikko Rantalainen <mikko.rantalainen@peda.net>
To: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>, song@kernel.org
Cc: agk@redhat.com, snitzer@redhat.com, linux-raid@vger.kernel.org,
	dm-devel@redhat.com
Subject: Re: [PATCH V2] md: don't unregister sync_thread with reconfig_mutex held
Date: Tue, 11 Jan 2022 14:25:04 +0200	[thread overview]
Message-ID: <a238949b-f8e2-fc6a-fecc-99df8ec6292a@peda.net> (raw)
In-Reply-To: <1613177399-22024-1-git-send-email-guoqing.jiang@cloud.ionos.com>

Guoqing Jiang (2021-02-13 02:49 Europe/Helsinki):
> Unregister sync_thread doesn't need to hold reconfig_mutex since it
> doesn't reconfigure array.
> 
> And it could cause deadlock problem for raid5 as follows:
> 
> 1. process A tried to reap sync thread with reconfig_mutex held after echo
>    idle to sync_action.
> 2. raid5 sync thread was blocked if there were too many active stripes.
> 3. SB_CHANGE_PENDING was set (because of write IO comes from upper layer)
>    which causes the number of active stripes can't be decreased.
> 4. SB_CHANGE_PENDING can't be cleared since md_check_recovery was not able
>    to hold reconfig_mutex.
> 
> More details in the link:
> https://lore.kernel.org/linux-raid/5ed54ffc-ce82-bf66-4eff-390cb23bc1ac@molgen.mpg.de/T/#t

I don't understand md well enough to suggest a patch but isn't this
logically a classic two lock deadlock problem where

thread 1:
- lock reconfig_mutex
- blocked for sync that requires SB_CHANGE_PENDING

thread 2:
- (logically) acquire lock SB_CHANGE_PENDING
- blocked for reconfig_mutex before SB_CHANGE_PENDING can be released

?

The classic fix for this kind of deadlock is to require these locks to
be always acquired in constant order and released in reverse order.

If you had a rule that SB_CHANGE_PENDING cannot be set or cleared
without already having reconfig_mutex, wouldn't that prevent this
deadlock? (If I understood the issue correctly, it's currently possible
to set but not clear the SB_CHANGE_PENDING without having reconfig_mutex.)

Another possibility is to expect SB_CHANGE_PENDING to be set as part of
sync process required change to "idle" (write to sync_action). In that
case the logic would be you cannot have reconfig_mutex already locked
before setting (logically acquiring lock) SB_CHANGE_PENDING. So the
transfer from active to idle would require first setting
SB_CHANGE_PENDING, doing the required processing (getting and freeing
reconfig_mutex in process) and then clearing SB_CHANGE_PENDING.
Basically the rule would be you must lock SB_CHANGE_PENDING before you
can lock reconfig_mutex.

If I've understood correctly SB_CHANGE_PENDING is not technically a lock
but it's logically used like it were a lock.

Would either of these make sense for the overall design?

Obviously, if it doesn't hurt the performance too much, using a single
lock for everything that needs to be serialized would be much easier.

-- 
Mikko

WARNING: multiple messages have this Message-ID (diff)
From: Mikko Rantalainen <mikko.rantalainen@peda.net>
To: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>, song@kernel.org
Cc: linux-raid@vger.kernel.org, dm-devel@redhat.com,
	snitzer@redhat.com, agk@redhat.com
Subject: Re: [dm-devel] [PATCH V2] md: don't unregister sync_thread with reconfig_mutex held
Date: Tue, 11 Jan 2022 14:25:04 +0200	[thread overview]
Message-ID: <a238949b-f8e2-fc6a-fecc-99df8ec6292a@peda.net> (raw)
In-Reply-To: <1613177399-22024-1-git-send-email-guoqing.jiang@cloud.ionos.com>

Guoqing Jiang (2021-02-13 02:49 Europe/Helsinki):
> Unregister sync_thread doesn't need to hold reconfig_mutex since it
> doesn't reconfigure array.
> 
> And it could cause deadlock problem for raid5 as follows:
> 
> 1. process A tried to reap sync thread with reconfig_mutex held after echo
>    idle to sync_action.
> 2. raid5 sync thread was blocked if there were too many active stripes.
> 3. SB_CHANGE_PENDING was set (because of write IO comes from upper layer)
>    which causes the number of active stripes can't be decreased.
> 4. SB_CHANGE_PENDING can't be cleared since md_check_recovery was not able
>    to hold reconfig_mutex.
> 
> More details in the link:
> https://lore.kernel.org/linux-raid/5ed54ffc-ce82-bf66-4eff-390cb23bc1ac@molgen.mpg.de/T/#t

I don't understand md well enough to suggest a patch but isn't this
logically a classic two lock deadlock problem where

thread 1:
- lock reconfig_mutex
- blocked for sync that requires SB_CHANGE_PENDING

thread 2:
- (logically) acquire lock SB_CHANGE_PENDING
- blocked for reconfig_mutex before SB_CHANGE_PENDING can be released

?

The classic fix for this kind of deadlock is to require these locks to
be always acquired in constant order and released in reverse order.

If you had a rule that SB_CHANGE_PENDING cannot be set or cleared
without already having reconfig_mutex, wouldn't that prevent this
deadlock? (If I understood the issue correctly, it's currently possible
to set but not clear the SB_CHANGE_PENDING without having reconfig_mutex.)

Another possibility is to expect SB_CHANGE_PENDING to be set as part of
sync process required change to "idle" (write to sync_action). In that
case the logic would be you cannot have reconfig_mutex already locked
before setting (logically acquiring lock) SB_CHANGE_PENDING. So the
transfer from active to idle would require first setting
SB_CHANGE_PENDING, doing the required processing (getting and freeing
reconfig_mutex in process) and then clearing SB_CHANGE_PENDING.
Basically the rule would be you must lock SB_CHANGE_PENDING before you
can lock reconfig_mutex.

If I've understood correctly SB_CHANGE_PENDING is not technically a lock
but it's logically used like it were a lock.

Would either of these make sense for the overall design?

Obviously, if it doesn't hurt the performance too much, using a single
lock for everything that needs to be serialized would be much easier.

-- 
Mikko

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


  parent reply	other threads:[~2022-01-11 12:34 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-13  0:49 [PATCH V2] md: don't unregister sync_thread with reconfig_mutex held Guoqing Jiang
2021-02-13  0:49 ` [dm-devel] " Guoqing Jiang
2021-02-15 11:07 ` Paul Menzel
2021-02-15 11:07   ` [dm-devel] " Paul Menzel
2021-02-24  9:09   ` Song Liu
2021-02-24  9:09     ` [dm-devel] " Song Liu
2021-02-24  9:25     ` Guoqing Jiang
2021-02-24  9:25       ` [dm-devel] " Guoqing Jiang
2021-03-19 23:00       ` Song Liu
2021-03-19 23:00         ` [dm-devel] " Song Liu
2021-11-30 17:25         ` Paul Menzel
2021-11-30 17:25           ` [dm-devel] " Paul Menzel
2021-11-30 17:27           ` Paul Menzel
2021-11-30 17:27             ` [dm-devel] " Paul Menzel
2021-12-08 14:16             ` Guoqing Jiang
2021-12-08 14:16               ` [dm-devel] " Guoqing Jiang
2021-12-08 16:35               ` Heinz Mauelshagen
2021-12-09  0:47                 ` Guoqing Jiang
2021-12-09  0:47                   ` [dm-devel] " Guoqing Jiang
2021-12-09 12:54   ` Donald Buczek
2021-12-09 12:54     ` [dm-devel] " Donald Buczek
2021-12-09 12:57   ` Donald Buczek
2021-12-09 12:57     ` [dm-devel] " Donald Buczek
2021-12-10  1:06     ` Guoqing Jiang
2021-12-10  1:06       ` [dm-devel] " Guoqing Jiang
2021-12-10 14:16 ` Donald Buczek
2021-12-10 14:16   ` [dm-devel] " Donald Buczek
2021-12-14  2:34   ` Guoqing Jiang
2021-12-14  2:34     ` [dm-devel] " Guoqing Jiang
2021-12-14  9:31     ` Donald Buczek
2021-12-14  9:31       ` [dm-devel] " Donald Buczek
2021-12-14 10:03       ` Guoqing Jiang
2021-12-14 10:03         ` [dm-devel] " Guoqing Jiang
2021-12-14 12:21         ` Donald Buczek
2021-12-14 12:21           ` [dm-devel] " Donald Buczek
2022-01-11 12:25 ` Mikko Rantalainen [this message]
2022-01-11 12:25   ` Mikko Rantalainen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a238949b-f8e2-fc6a-fecc-99df8ec6292a@peda.net \
    --to=mikko.rantalainen@peda.net \
    --cc=agk@redhat.com \
    --cc=dm-devel@redhat.com \
    --cc=guoqing.jiang@cloud.ionos.com \
    --cc=linux-raid@vger.kernel.org \
    --cc=snitzer@redhat.com \
    --cc=song@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.