* [Cluster-devel] cman init rework
@ 2009-03-26 13:50 Fabio M. Di Nitto
  2009-03-30 21:42 ` David Teigland
  0 siblings, 1 reply; 5+ messages in thread
From: Fabio M. Di Nitto @ 2009-03-26 13:50 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi everybody,

I spent a bit of time cleaning up the cman init script to make it a lot
easier to read (IMHO), more maintainable, and ready to be extended to fix
some outstanding issues.

major highlights:

- clean up all over
- grouping of functions
- made ready for extensions (especially at startup time)
- remove tons of duplication
- try to standardize the shell scripting style a bit
- run in quiet/terse/full output mode

The final script is far smaller than the whole patchset.

I published a fabbione_cmaninit branch so people can look at the history
of changes.

A copy of the new script is attached.

Two notes (I know somebody is going to ask):

1) On startup (eg):

start_foo()
{
        start_daemon foo
}

All those wrappers look the same. This is preparation for extending them.

In our current startup sequence, we do start a daemon, we make sure it
starts, but we never check if it's actually working properly.

Over time, and with feedback from the maintainers of the different
daemons, those snippets will change to look like:

start_foo()
{
        start_daemon foo || return 1
        check_if_daemon_foo_is_working
}

This is similar to what we do with cman, where we wait for quorum and
check cman_is_running via cman_status.

Those checks will make our init script a lot more robust than it is now.
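
The start-then-verify pattern above could be sketched roughly as follows.
This is only an illustration under stated assumptions, not the code in
cman.in: start_daemon is stubbed here, and wait_until and foo_is_working
are hypothetical helpers introduced for the example.

```shell
# Stub for illustration: a real start_daemon would launch the daemon.
start_daemon()
{
        echo "starting $1"
}

# Hypothetical helper: run a check up to TRIES times, one second apart.
# usage: wait_until TRIES COMMAND [ARGS...]
wait_until()
{
        tries=$1
        shift
        while [ "$tries" -gt 0 ]; do
                "$@" && return 0
                tries=$((tries - 1))
                # no point sleeping after the last failed attempt
                [ "$tries" -gt 0 ] && sleep 1
        done
        return 1
}

# Hypothetical health probe: a real one would query the daemon itself.
foo_is_working()
{
        true
}

start_foo()
{
        start_daemon foo || return 1
        wait_until 5 foo_is_working
}
```

Keeping the retry loop in one helper means each start_* wrapper only has
to supply its own probe.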

2) On shutdown (eg):

I did slow down the shutdown time by adding a sleep 1 after each kill
invocation. This will give each daemon a bit of time to exit.
Note that the timer is configurable (see for example stop_qdiskd).
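
A minimal sketch of a kill followed by a configurable pause, assuming the
daemon is identified by pid. The stop_daemon name and its arguments are
hypothetical and do not reflect how stop_qdiskd is actually written.

```shell
# Hypothetical helper: send SIGTERM to PID, then pause for DELAY seconds
# (default 1) so the daemon has a moment to exit.
# usage: stop_daemon PID [DELAY]
stop_daemon()
{
        pid=$1
        delay=${2:-1}

        # If the signal cannot be delivered, treat the daemon as already gone.
        kill -TERM "$pid" 2>/dev/null || return 0

        sleep "$delay"
        return 0
}
```

A caller that needs a longer grace period for one daemon would pass a
larger second argument, for example stop_daemon "$pid" 5.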

Please give it a shot and let me know.

I plan to land this one in master and stable3 sometime next week.

It's not a blocker for 3.0.0 final.

Fabio
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cman.in
Type: application/x-shellscript
Size: 16199 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/cluster-devel/attachments/20090326/6c0ac25d/attachment.bin>


* [Cluster-devel] cman init rework
  2009-03-26 13:50 [Cluster-devel] cman init rework Fabio M. Di Nitto
@ 2009-03-30 21:42 ` David Teigland
  2009-03-31  5:23   ` Fabio M. Di Nitto
  2009-03-31  7:21   ` Fabio M. Di Nitto
  0 siblings, 2 replies; 5+ messages in thread
From: David Teigland @ 2009-03-30 21:42 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Thu, Mar 26, 2009 at 02:50:31PM +0100, Fabio M. Di Nitto wrote:
> In our current startup sequence, we do start a daemon, we make sure it
> starts, but we never check if it's actually working properly.

If there's no groupd_compat setting in cluster.conf, or if it's set to 2, then
groupd does compat "detection" when it starts up, looking for old cluster2
nodes that require compat mode.  This detection phase can sometimes take a
while.  Other daemons have to ask groupd about the mode it chose after the
detection phase, and retry for a while if it's still pending.  It might be
nice for the init script to wait for this detection phase to complete after
starting groupd.  To do this we can run 'group_tool compat' and loop until
"pending" doesn't show up in a grep.  We should probably loop for somewhere
around 10 seconds, there's no good predictable number.  If groupd is still
pending after that time, the init script should just continue since it's most
likely taking longer than expected.  Other daemons are already prepared to
wait for groupd to pick a mode during their startup.
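
The wait described above could be sketched like this. The only things
taken from the description are the 'group_tool compat' command, the grep
for "pending", and the limit of about 10 seconds. The function names are
hypothetical, and nothing is assumed about the rest of the command's
output.

```shell
# Succeeds while 'group_tool compat' still reports "pending".
groupd_compat_pending()
{
        group_tool compat 2>/dev/null | grep -q pending
}

# Wait up to MAX seconds (default 10) for detection to finish.
# Always returns 0: on timeout the init script simply carries on, since
# the other daemons already wait for groupd to pick a mode.
# usage: wait_groupd_compat [MAX]
wait_groupd_compat()
{
        max=${1:-10}
        waited=0
        while groupd_compat_pending; do
                [ "$waited" -ge "$max" ] && break
                sleep 1
                waited=$((waited + 1))
        done
        return 0
}
```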





* [Cluster-devel] cman init rework
  2009-03-30 21:42 ` David Teigland
@ 2009-03-31  5:23   ` Fabio M. Di Nitto
  2009-03-31 17:55     ` David Teigland
  2009-03-31  7:21   ` Fabio M. Di Nitto
  1 sibling, 1 reply; 5+ messages in thread
From: Fabio M. Di Nitto @ 2009-03-31  5:23 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Mon, 2009-03-30 at 16:42 -0500, David Teigland wrote:
> On Thu, Mar 26, 2009 at 02:50:31PM +0100, Fabio M. Di Nitto wrote:
> > In our current startup sequence, we do start a daemon, we make sure it
> > starts, but we never check if it's actually working properly.
> 
> If there's no groupd_compat setting in cluster.conf, or if it's set to 2, then
> groupd does compat "detection" when it starts up, looking for old cluster2
> nodes that require compat mode.  This detection phase can sometimes take a
> while.  Other daemons have to ask groupd about the mode it chose after the
> detection phase, and retry for a while if it's still pending.  It might be
> nice for the init script to wait for this detection phase to complete after
> starting groupd.  To do this we can run 'group_tool compat' and loop until
> "pending" doesn't show up in a grep.  We should probably loop for somewhere
> around 10 seconds, there's no good predictable number.  If groupd is still
> pending after that time, the init script should just continue since it's most
> likely taking longer than expected.  Other daemons are already prepared to
> wait for groupd to pick a mode during their startup.

So far we specifically check for groupd_compat=0 to avoid starting
groupd at all.

Is this still correct?

For other values of groupd_compat or none specified in the config, we
start groupd.

Should we wait no matter what or only when none or 2 are specified?

Thanks
Fabio




* [Cluster-devel] cman init rework
  2009-03-30 21:42 ` David Teigland
  2009-03-31  5:23   ` Fabio M. Di Nitto
@ 2009-03-31  7:21   ` Fabio M. Di Nitto
  1 sibling, 0 replies; 5+ messages in thread
From: Fabio M. Di Nitto @ 2009-03-31  7:21 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Mon, 2009-03-30 at 16:42 -0500, David Teigland wrote:
> On Thu, Mar 26, 2009 at 02:50:31PM +0100, Fabio M. Di Nitto wrote:
> > In our current startup sequence, we do start a daemon, we make sure it
> > starts, but we never check if it's actually working properly.
> 
> If there's no groupd_compat setting in cluster.conf, or if it's set to 2, then
> groupd does compat "detection" when it starts up, looking for old cluster2
> nodes that require compat mode.  This detection phase can sometimes take a
> while.  Other daemons have to ask groupd about the mode it chose after the
> detection phase, and retry for a while if it's still pending.  It might be
> nice for the init script to wait for this detection phase to complete after
> starting groupd.  To do this we can run 'group_tool compat' and loop until
> "pending" doesn't show up in a grep.  We should probably loop for somewhere
> around 10 seconds, there's no good predictable number.  If groupd is still
> pending after that time, the init script should just continue since it's most
> likely taking longer than expected.  Other daemons are already prepared to
> wait for groupd to pick a mode during their startup.
> 

I committed: c685efb1ca60e907f1a1d36ba371ed8241c1586e

that should implement what you have asked for...

On a 2-node cluster I see:

node1:

   Starting groupd...                                      [  OK  ]
   Waiting groupd protocol negotiation:                    [  OK  ]

(there is no waiting time; the first node comes up quickly)

node2 (starts a bit after node1):

   Starting groupd...                                      [  OK  ]
   Waiting groupd protocol negotiation: 0 1                [  OK  ]

It takes approximately 2 to 3 seconds to negotiate.

The output is shown only when running in full verbose mode.

Fabio




* [Cluster-devel] cman init rework
  2009-03-31  5:23   ` Fabio M. Di Nitto
@ 2009-03-31 17:55     ` David Teigland
  0 siblings, 0 replies; 5+ messages in thread
From: David Teigland @ 2009-03-31 17:55 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Tue, Mar 31, 2009 at 07:23:15AM +0200, Fabio M. Di Nitto wrote:
> On Mon, 2009-03-30 at 16:42 -0500, David Teigland wrote:
> > On Thu, Mar 26, 2009 at 02:50:31PM +0100, Fabio M. Di Nitto wrote:
> > > In our current startup sequence, we do start a daemon, we make sure it
> > > starts, but we never check if it's actually working properly.
> > 
> > If there's no groupd_compat setting in cluster.conf, or if it's set to 2, then
> > groupd does compat "detection" when it starts up, looking for old cluster2
> > nodes that require compat mode.  This detection phase can sometimes take a
> > while.  Other daemons have to ask groupd about the mode it chose after the
> > detection phase, and retry for a while if it's still pending.  It might be
> > nice for the init script to wait for this detection phase to complete after
> > starting groupd.  To do this we can run 'group_tool compat' and loop until
> > "pending" doesn't show up in a grep.  We should probably loop for somewhere
> > around 10 seconds, there's no good predictable number.  If groupd is still
> > pending after that time, the init script should just continue since it's most
> > likely taking longer than expected.  Other daemons are already prepared to
> > wait for groupd to pick a mode during their startup.
> 
> So far we specifically check for groupd_compat=0 to avoid starting
> groupd at all.
> 
> Is this still correct?
> 
> For other values of groupd_compat or none specified in the config, we
> start groupd.
> 
> Should we wait no matter what or only when none or 2 are specified?

Only when none or 2 are specified; there's no detection when set to 0 or 1.
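
That rule could be captured in a small predicate, sketched below. The
function name is hypothetical, and how the init script reads
groupd_compat from cluster.conf is left out: the value is passed in as an
argument, with an empty string meaning "not set".

```shell
# Succeeds when groupd will run compat detection, i.e. when groupd_compat
# is unset (empty) or 2. With 0 or 1 there is no detection to wait for.
# usage: groupd_needs_compat_wait VALUE
groupd_needs_compat_wait()
{
        case "$1" in
                ""|2) return 0 ;;
                *)    return 1 ;;
        esac
}
```

An init script would call it right after starting groupd and skip the
wait whenever it returns non-zero.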



