From: Mark Fasheh <mfasheh@versity.com>
To: ocfs2-devel@oss.oracle.com
Subject: [Ocfs2-devel] [PATCH] ocfs2: re-queue AST or BAST if sending is failed to improve the reliability
Date: Tue, 22 Aug 2017 15:49:59 -0500	[thread overview]
Message-ID: <CAAXPY_Khwm1o4gnCjPAhfn576F9dOrCkEggEb4BcrexvxywLkw@mail.gmail.com> (raw)
In-Reply-To: <63ADC13FD55D6546B7DECE290D39E373AC2CB9E5@H3CMLB14-EX.srv.huawei-3com.com>

On Tue, Aug 8, 2017 at 5:56 AM, Changwei Ge <ge.changwei@h3c.com> wrote:
>>> It will improve the reliability a lot.
>> Can you detail your testing? Code-wise this looks fine to me but as
>> you note, this is a pretty hard to hit corner case so it'd be nice to
>> hear that you were able to exercise it.
>>
>> Thanks,
>>    --Mark
> Hi Mark,
>
> My test is quite simple to perform.
> The test environment includes 7 hosts. The Ethernet devices on 6 of them are
> repeatedly brought down and back up.
> After several rounds of this, some file operations hang.
>
> Running the debugfs.ocfs2 tool on NODE 2, which was the owner of
> lock resource 'O000000000000000011150300000000',
> showed:
>
> debugfs: dlm_locks O000000000000000011150300000000
> Lockres: O000000000000000011150300000000   Owner: 2    State: 0x0
> Last Used: 0      ASTs Reserved: 0    Inflight: 0    Migration Pending: No
> Refs: 4    Locks: 2    On Lists: None
> Reference Map: 3
>  Lock-Queue  Node  Level  Conv  Cookie           Refs  AST  BAST  Pending-Action
>  Granted     2     PR     -1    2:53             2     No   No    None
>  Granted     3     PR     -1    3:48             2     No   No    None
>
> That meant NODE 2 had granted the lock to NODE 3 and the AST had been
> transmitted to NODE 3.
>
> Meanwhile, running the debugfs.ocfs2 tool on NODE 3
> showed:
> debugfs: dlm_locks O000000000000000011150300000000
> Lockres: O000000000000000011150300000000   Owner: 2    State: 0x0
> Last Used: 0      ASTs Reserved: 0    Inflight: 0    Migration Pending: No
> Refs: 3    Locks: 1    On Lists: None
> Reference Map:
>  Lock-Queue  Node  Level  Conv  Cookie           Refs  AST  BAST  Pending-Action
>  Blocked     3     PR     -1    3:48             2     No   No    None
>
> That meant NODE 3 never received the AST that would move its local lock
> from the blocked list to the granted list.
>
> This outcome makes sense, since sending the AST failed, as can be seen in
> the kernel log.
>
> As for BAST, it is more or less the same.
>
> Thanks
> Changwei
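
The comparison described in the quoted report (granted on the owner, still blocked on the requester) can be checked mechanically. What follows is only a rough sketch, not part of the patch or of debugfs.ocfs2: the helper names parse_locks and lost_grants are made up for illustration, and the parser assumes lock rows look exactly like the ones in the two dumps quoted above.

```python
import re

# One lock row as it appears in the dumps quoted above, e.g.
#   Granted     3     PR     -1    3:48             2     No   No    None
# Columns: queue, node, level, conv, cookie (the rest is ignored here).
LOCK_ROW = re.compile(r"^\s*(\w+)\s+(\d+)\s+(\S+)\s+(-?\d+)\s+(\d+:\d+)\s")


def parse_locks(dump):
    """Map each lock cookie to the queue it sits on in one dlm_locks dump."""
    queues = {}
    for line in dump.splitlines():
        line = line.lstrip("> ")  # tolerate e-mail quote prefixes
        m = LOCK_ROW.match(line)
        if m:
            queues[m.group(5)] = m.group(1)
    return queues


def lost_grants(owner_dump, requester_dump):
    """Cookies the owner lists as Granted but the requester still lists as Blocked."""
    owner = parse_locks(owner_dump)
    requester = parse_locks(requester_dump)
    return sorted(
        cookie
        for cookie, queue in requester.items()
        if queue == "Blocked" and owner.get(cookie) == "Granted"
    )


# The two dumps from the report, trimmed to the lock table.
OWNER_DUMP = """\
Lockres: O000000000000000011150300000000   Owner: 2    State: 0x0
Refs: 4    Locks: 2    On Lists: None
 Lock-Queue  Node  Level  Conv  Cookie           Refs  AST  BAST  Pending-Action
 Granted     2     PR     -1    2:53             2     No   No    None
 Granted     3     PR     -1    3:48             2     No   No    None
"""

NODE3_DUMP = """\
Lockres: O000000000000000011150300000000   Owner: 2    State: 0x0
Refs: 3    Locks: 1    On Lists: None
 Lock-Queue  Node  Level  Conv  Cookie           Refs  AST  BAST  Pending-Action
 Blocked     3     PR     -1    3:48             2     No   No    None
"""

print(lost_grants(OWNER_DUMP, NODE3_DUMP))  # prints ['3:48']
```

A non-empty result only flags a candidate for a lost AST. The two dumps are taken at different moments, so a transient mismatch is also possible and the kernel log still needs to confirm the send failure.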


Thanks for the testing details. I think you got Andrew's e-mail wrong
so I'm CC'ing him now. It might be a good idea to re-send the patch
with the right CC's - add some of your testing details to the log.
You're free to use my

Reviewed-by: Mark Fasheh <mfasheh@versity.com>

as well.

Thanks again,
   --Mark