* [Cluster-devel] DLM Shutdown
From: Andreas Gruenbacher @ 2016-02-10  1:33 UTC
  To: cluster-devel.redhat.com

Hi Dave and Chrissie,

I recently started looking into how DLM works, with the help of
Chrissie's "Programming Locking Applications" handbook
(http://people.redhat.com/ccaulfie/docs/rhdlmbook.pdf). I didn't find
a simple way to test DLM in a minimal setup: DLM requires
dlm_controld, which depends on corosync. dlm_controld needs some sort
of membership management service, so I understand why it uses
corosync, but from a testing perspective, something simpler would
still be nice. So I started writing FakeDLM, a toy dlm_controld
substitute (https://github.com/andreas-gruenbacher/fakedlm).

This turned up a problem when shutting down DLM: dlm_controld only
shuts itself down on SIGTERM when no lockspaces exist anymore; it
never actively releases existing lockspaces. This means that as soon
as any application creates the default lockspace (via libdlm), or an
application fails to release a lockspace it created, dlm_controld
will never shut down.

It would make more sense, at least for testing purposes, to try to
remove existing lockspaces and perform a proper cleanup. The only way
I could find to make that happen is to do what
dlm_release_lockspace() in libdlm does: issue
DLM_USER_REMOVE_LOCKSPACE requests. An added difficulty is that a
lockspace can be "created" with DLM_USER_CREATE_LOCKSPACE multiple
times (only the first request actually creates it; later requests
just bump its use count), and only a matching number of
DLM_USER_REMOVE_LOCKSPACE requests will eventually remove it.
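
For illustration, here is roughly what a dlm_controld substitute
would have to do to drain such a lockspace via libdlm (an untested
sketch; it assumes that dlm_open_lockspace() does not bump the use
count itself, and that dlm_release_lockspace() invalidates the handle
whether it succeeds or fails):

#include <libdlm.h>

/* Keep releasing a lockspace until the kernel actually removes it. */
static int drain_lockspace(const char *name)
{
        for (;;) {
                dlm_lshandle_t ls = dlm_open_lockspace(name);

                if (ls == NULL)
                        return 0;       /* no such lockspace anymore */

                /* force=0: fail (e.g. EBUSY) rather than discard locks */
                if (dlm_release_lockspace(name, ls, 0) != 0)
                        return -1;
        }
}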

In addition, DLM_USER_REMOVE_LOCKSPACE requests are blocking, so they
cannot be written to /dev/misc/dlm-control synchronously from the
process that handles the offline@/kernel/dlm/<lockspace_name> uevents
which removing a lockspace triggers for cleaning up the lockspace
configuration. Using aio_write instead has led to lockdep warnings
and a deadlock in the kernel; I haven't found the reason for this
problem yet.
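
An alternative I haven't fully explored would be to hand the blocking
write off to a short-lived thread instead of using aio_write. A
minimal sketch; remove_lockspace_sync() here is a stand-in for the
actual blocking write to dlm-control, not an existing function:

#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Placeholder for the blocking DLM_USER_REMOVE_LOCKSPACE write. */
extern int remove_lockspace_sync(const char *name);

static void *remove_worker(void *arg)
{
        char *name = arg;

        remove_lockspace_sync(name);    /* may block for a while */
        free(name);
        return NULL;
}

/* Called from the offline@ uevent handler; returns immediately. */
static int remove_lockspace_async(const char *name)
{
        pthread_t thread;
        char *copy = strdup(name);

        if (copy == NULL)
                return -1;
        if (pthread_create(&thread, NULL, remove_worker, copy) != 0) {
                free(copy);
                return -1;
        }
        return pthread_detach(thread);
}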

Any ideas would be welcome.

Thanks,
Andreas




* [Cluster-devel] DLM Shutdown
From: David Teigland @ 2016-02-10 17:38 UTC
  To: cluster-devel.redhat.com

On Wed, Feb 10, 2016 at 02:33:49AM +0100, Andreas Gruenbacher wrote:
> never actively releases existing lockspaces. This means that as soon
> as any application creates the default lockspace (via libdlm), or an
> application fails to release a lockspace it created, dlm_controld
> will never shut down.

You can simply run 'dlm_tool leave foo' (repeatedly if necessary) to remove
a lockspace.  You could add a dlm_tool option that would trivially clear
all existing lockspaces this way.




* [Cluster-devel] DLM Shutdown
From: Andreas Gruenbacher @ 2016-02-10 19:48 UTC
  To: cluster-devel.redhat.com

On Wed, Feb 10, 2016 at 6:38 PM, David Teigland <teigland@redhat.com> wrote:
> On Wed, Feb 10, 2016 at 02:33:49AM +0100, Andreas Gruenbacher wrote:
>> never actively releases existing lockspaces. This means that as soon
>> as any application creates the default lockspace (via libdlm), or an
>> application fails to release a lockspace it created, dlm_controld
>> will never shut down.
>
> You can simply run 'dlm_tool leave foo' (repeatedly if necessary) to remove
> a lockspace.  You could add a dlm_tool option that would trivially clear
> all existing lockspaces this way.

Hmm yes, that can be used to release unused lockspaces. I see that it
even forces lockspaces to be released when there are still active
locks.

When a shutdown is requested, shouldn't dlm_controld really release
lockspaces in a similar way as well?

Thanks,
Andreas




* [Cluster-devel] DLM Shutdown
From: David Teigland @ 2016-02-10 20:18 UTC
  To: cluster-devel.redhat.com

On Wed, Feb 10, 2016 at 08:48:12PM +0100, Andreas Gruenbacher wrote:
> When a shutdown is requested, shouldn't dlm_controld really release
> lockspaces in a similar way as well?

You could probably do that if you check that the lockspace is managing no
local locks (which would be a pain).  If locks exist you'd not want to do
that without a force option at least.




* [Cluster-devel] DLM Shutdown
From: Andreas Gruenbacher @ 2016-02-10 20:38 UTC
  To: cluster-devel.redhat.com

On Wed, Feb 10, 2016 at 9:18 PM, David Teigland <teigland@redhat.com> wrote:
> On Wed, Feb 10, 2016 at 08:48:12PM +0100, Andreas Gruenbacher wrote:
>> When a shutdown is requested, shouldn't dlm_controld really release
>> lockspaces in a similar way as well?
>
> You could probably do that if you check that the lockspace is managing no
> local locks (which would be a pain).  If locks exist you'd not want to do
> that without a force option at least.

Is that not what dlm_release_lockspace() with force set to false is
supposed to do? It's weird that the operation may have to be repeated
several times before the lockspace finally goes away; that could be
improved with an additional flag to DLM_USER_REMOVE_LOCKSPACE.




* [Cluster-devel] DLM Shutdown
From: David Teigland @ 2016-02-10 21:16 UTC
  To: cluster-devel.redhat.com

On Wed, Feb 10, 2016 at 09:38:58PM +0100, Andreas Gruenbacher wrote:
> On Wed, Feb 10, 2016 at 9:18 PM, David Teigland <teigland@redhat.com> wrote:
> > On Wed, Feb 10, 2016 at 08:48:12PM +0100, Andreas Gruenbacher wrote:
> >> When a shutdown is requested, shouldn't dlm_controld really release
> >> lockspaces in a similar way as well?
> >
> > You could probably do that if you check that the lockspace is managing no
> > local locks (which would be a pain).  If locks exist you'd not want to do
> > that without a force option at least.
> 
> Is that not what dlm_release_lockspace() with force set to false is
> supposed to do? It's weird that the operation may have to be repeated
> several times before the lockspace finally goes away; that could be
> improved with an additional flag to DLM_USER_REMOVE_LOCKSPACE.

OK, yes, but we've wandered into the weeds here.  dlm_controld isn't
involved in lockspace lifetimes; that's the application/libdlm side.
The question is what behavior the program creating/removing the lockspace
needs (and whether the program is for more than just testing).




* [Cluster-devel] DLM Shutdown
From: Andreas Gruenbacher @ 2016-02-10 22:08 UTC
  To: cluster-devel.redhat.com

On Wed, Feb 10, 2016 at 10:16 PM, David Teigland <teigland@redhat.com> wrote:
> OK, yes, but we've wandered into the weeds here.  dlm_controld isn't
> involved in lockspace lifetimes; that's the application/libdlm side.
> The question is what behavior the program creating/removing the lockspace
> needs (and whether the program is for more than just testing).

Well, I'm not sure that this is the right question. Sending
dlm_controld a TERM signal is an indication that dlm_controld should
seriously try shutting itself down and should only ignore the request
if it really must. This surely includes removing the default
lockspace as well as lockspaces that don't have active locks; I'm not
so sure about lockspaces with active locks.
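
To sketch the policy I have in mind (the helpers here are made up,
not actual dlm_controld interfaces):

#include <stdbool.h>

struct lockspace;

/* Hypothetical helpers: lockspace_busy() reports whether local locks
 * are still held, release_lockspace() issues the necessary
 * DLM_USER_REMOVE_LOCKSPACE request(s) for one lockspace. */
extern bool lockspace_busy(const struct lockspace *ls);
extern int release_lockspace(struct lockspace *ls);

/* On SIGTERM, release every lockspace without active locks; the
 * daemon exits only once all lockspaces are gone. */
static bool try_shutdown(struct lockspace **lockspaces, int count)
{
        bool all_released = true;
        int i;

        for (i = 0; i < count; i++) {
                if (lockspace_busy(lockspaces[i])) {
                        all_released = false;   /* leave it alone */
                        continue;
                }
                if (release_lockspace(lockspaces[i]) != 0)
                        all_released = false;
        }
        return all_released;
}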

It's a design deficiency that dlm_controld doesn't know which
lockspaces are currently in use by any applications; with this
knowledge, it could recycle unused lockspaces after a while.



