linux-raid.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "heming.zhao@suse.com" <heming.zhao@suse.com>
To: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>,
	linux-raid@vger.kernel.org, song@kernel.org
Cc: lidong.zhong@suse.com, xni@redhat.com, neilb@suse.de, colyli@suse.de
Subject: Re: [PATCH 2/2] md/cluster: fix deadlock when doing reshape job
Date: Wed, 11 Nov 2020 20:13:12 +0800	[thread overview]
Message-ID: <e5ab9dfc-46bd-ac84-e79f-468f4ed0512f@suse.com> (raw)
In-Reply-To: <b6ef581d-2eaf-37d3-7a21-a91630b0836c@cloud.ionos.com>

On 11/10/20 2:36 PM, Guoqing Jiang wrote:
> 
> 
> On 11/8/20 15:53, Zhao Heming wrote:
>> There is a similar deadlock in commit 0ba959774e93
>> ("md-cluster: use sync way to handle METADATA_UPDATED msg")
>> ... ...
>>   static void unlock_comm(struct md_cluster_info *cinfo)
>> @@ -784,9 +794,11 @@ static int sendmsg(struct md_cluster_info *cinfo, struct cluster_msg *cmsg,
>>   {
>>       int ret;
>> -    lock_comm(cinfo, mddev_locked);
>> -    ret = __sendmsg(cinfo, cmsg);
>> -    unlock_comm(cinfo);
>> +    ret = lock_comm(cinfo, mddev_locked);
>> +    if (!ret) {
>> +        ret = __sendmsg(cinfo, cmsg);
>> +        unlock_comm(cinfo);
>> +    }
>>       return ret;
>>   }
> 
> What happens if the cluster has latency issue? Let's say, node normally can't get lock during the 5s window. IOW, is the timeout value justified well?

5 seconds is a random number first from my brain when I was coding. 

below I gave more info for my patch.

The key of patch 2 is to change from wait_event to wait_event_timeout.
After applying patch, code logic of some functions are also changed.
I will analyze the two sides: send & receive

*** send side***

Before applying patch, sendmsg only fail when __sendmsg return error. it means dlm layer fails to convert lock. But currently code should do access the return value of lock_comm.

After applying patch, there will have new fail cases: 5s timeout.

all related functions:
resync_bitmap
- void, only print error info when error. 

update_bitmap_size
- return err. caller already handles error case.

resync_info_update
- return err. caller ignore. because this is info msg, it can safely ignore.

remove_disk
- return err. part of caller handle error case. 
  I forgot to modify hot_remove_disk(), will add it in v2 patch.
  So in future v2 patch, all callers will handle error case.

gather_bitmaps - return err, caller already handles error case

We can see there is only one function which doesn't care return value: resync_bitmap
this function is used in leave path. if the msg doesn't send out (5s timeout), the result is other nodes won't know the failure event by BITMAP_NEEDS_SYNC. But if my understand is correct, even if missing BITMAP_NEEDS_SYNC, there is another api recover_slot(), which is triggered by dlm and do the same job.

So all the sending side related functions are safe.

*** receive side ***

process_metadata_update
the patch means it won't do metadata update job when 5s timeout. 
and I change my mind, 5s timeout is wrong. Receive side should do as more as possible to handle incomming msg. if there is a 5s timeout code branch, there will tigger inconsistent issue.
e.g. 
A does --faulty, send METADATA_UPDATE to B, 
B receives the msg, but meets 5s timeout. it won't to update kernel md info. and it will have a gap between kernel md info and disk metadata.


  reply	other threads:[~2020-11-11 12:15 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-08 14:52 [PATCH 0/2] md-cluster bugs fix Zhao Heming
2020-11-08 14:53 ` [PATCH 1/2] md/cluster: reshape should returns error when remote doing resyncing job Zhao Heming
2020-11-09 18:01   ` Song Liu
2020-11-10  6:38   ` Guoqing Jiang
2020-11-10  6:59     ` heming.zhao
2020-11-08 14:53 ` [PATCH 2/2] md/cluster: fix deadlock when doing reshape job Zhao Heming
2020-11-09  2:02   ` heming.zhao
2020-11-09 18:06     ` Song Liu
2020-11-10  2:24       ` heming.zhao
2020-11-10  6:36   ` Guoqing Jiang
2020-11-11 12:13     ` heming.zhao [this message]
2020-11-09 17:43 ` [PATCH 0/2] md-cluster bugs fix Song Liu
  -- strict thread matches above, loose matches on Subject: below --
2020-11-05 13:11 [PATCH 1/2] md/cluster: reshape should returns error when remote doing resyncing job Zhao Heming
2020-11-05 13:11 ` [PATCH 2/2] md/cluster: fix deadlock when doing reshape job Zhao Heming

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e5ab9dfc-46bd-ac84-e79f-468f4ed0512f@suse.com \
    --to=heming.zhao@suse.com \
    --cc=colyli@suse.de \
    --cc=guoqing.jiang@cloud.ionos.com \
    --cc=lidong.zhong@suse.com \
    --cc=linux-raid@vger.kernel.org \
    --cc=neilb@suse.de \
    --cc=song@kernel.org \
    --cc=xni@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).