Potential race in dlm based messaging md-cluster.c

From: Abhijit Bhopatkar @ 2015-04-30 18:36 UTC
  To: Goldwyn Rodrigues; +Cc: linux-raid

A receiver can miss messages in certain corner cases. One such case
arises when two senders have messages ready to send. Sender 1 gets the
TOKEN lock first and proceeds.
After its initial processing, Sender 1 _will_ release TOKEN as soon as
the receiver releases ACK; it does not wait until the receiver has
re-acquired CR on ACK. Sender 2 can then take TOKEN and send while the
receiver holds no lock on ACK, so the receiver gets no BAST for the
second message.

To illustrate the problem, consider the timeline for two senders and one
receiver (we will ignore the receive part on the Sender2 node):

Sender1               Sender2               Receiver
Get EX on TOKEN       Get EX on TOKEN
<granted>             <wait till granted>

Get EX on MSG
write LVB
down MSG to CR
Get EX on ACK
<wait till granted>
                                            BAST for ACK
                                            Get CR on MSG
                                            read LVB
                                            process
                                            release ACK
AST for ACK
down ACK to CR
release MSG
release TOKEN
                      <granted>
                      Get EX on MSG
                      <... proceed ...>
                      release TOKEN
                                            <lost one message>
                                            ^^^^^^^^^^^^^^^^^^
                                            Get EX on MSG
                                            Get CR on ACK
                                            release MSG
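
The ordering in the timeline above can be replayed with a small toy
model in Python. This is only a sketch under stated assumptions, not the
md-cluster.c code or the DLM API: run_timeline, send and
receiver_reacquire_ack are hypothetical names, and the model assumes the
receiver learns of a message only through the BAST it gets when a sender
requests EX on ACK while the receiver holds CR on ACK. The sender's own
CR on ACK is left out for brevity.

```python
def run_timeline(receiver_reacquires_before_sender2):
    """Replay the two-sender timeline against a toy model of the ACK lock.

    Assumption: a receiver is notified of a message only by the BAST
    delivered when a sender requests EX on ACK while it holds CR on ACK.
    """
    ack_cr_holders = {"receiver"}   # receiver starts out holding CR on ACK
    received = []                   # messages the receiver actually processed

    def send(msg):
        # Sender: Get EX on ACK. Only current CR holders get a BAST.
        for node in list(ack_cr_holders):
            received.append(msg)          # receiver: Get CR on MSG, read LVB, process
            ack_cr_holders.discard(node)  # receiver: release ACK
        # Sender: AST for ACK, down ACK to CR, release MSG, release TOKEN.

    def receiver_reacquire_ack():
        ack_cr_holders.add("receiver")    # receiver: Get CR on ACK

    send("msg1")                          # Sender1 holds TOKEN
    if receiver_reacquires_before_sender2:
        receiver_reacquire_ack()
        send("msg2")
    else:
        send("msg2")                      # Sender2 was granted TOKEN too early
        receiver_reacquire_ack()          # too late, no BAST was delivered
    return received


if __name__ == "__main__":
    print(run_timeline(False))  # ordering from the timeline
    print(run_timeline(True))   # receiver re-acquires CR on ACK first
```

In this model, the ordering from the timeline delivers only msg1 to the
receiver, while both messages arrive if the receiver re-acquires CR on
ACK before Sender2 sends.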

Abhijit
