* [Cluster-devel] [PATCH 1/2] cman init: make sure we start after fence_sanlockd and warn users
From: Fabio M. Di Nitto @ 2012-10-09  9:36 UTC
  To: cluster-devel.redhat.com

From: "Fabio M. Di Nitto" <fdinitto@redhat.com>

Resolves: rhbz#509056

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
---
 cman/init.d/cman.in |   13 +++++++++++--
 1 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/cman/init.d/cman.in b/cman/init.d/cman.in
index a88f52f..849739b 100644
--- a/cman/init.d/cman.in
+++ b/cman/init.d/cman.in
@@ -8,8 +8,8 @@
 #
 ### BEGIN INIT INFO
 # Provides:		cman
-# Required-Start:	$network $time
-# Required-Stop:	$network $time
+# Required-Start:	$network $time fence_sanlockd
+# Required-Stop:	$network $time fence_sanlockd
 # Default-Start:
 # Default-Stop:
 # Short-Description:	Starts and stops cman
@@ -740,6 +740,13 @@ stop_cmannotifyd()
 	stop_daemon cmannotifyd
 }
 
+fence_sanlock_check()
+{
+	service fence_sanlockd status > /dev/null 2>&1 &&
+		echo "   fence_sanlockd detected. Unfencing might take several minutes!"
+	return 0
+}
+
 unfence_self()
 {
 	# fence_node returns 0 on success, 1 on failure, 2 if unconfigured
@@ -881,6 +888,8 @@ start()
 
 	[ "$breakpoint" = "daemons" ] && exit 0
 
+	fence_sanlock_check
+
 	runwrap unfence_self \
 		none \
 		"Unfencing self"
-- 
1.7.7.6



* [Cluster-devel] [PATCH 2/2] checkquorum.wdmd: add integration script with wdmd
From: Fabio M. Di Nitto @ 2012-10-09  9:36 UTC
  To: cluster-devel.redhat.com

From: "Fabio M. Di Nitto" <fdinitto@redhat.com>

requires wdmd >= 2.6

Resolves: rhbz#509056

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
---
 cman/scripts/Makefile         |    2 +-
 cman/scripts/checkquorum.wdmd |  104 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 105 insertions(+), 1 deletions(-)
 create mode 100644 cman/scripts/checkquorum.wdmd

diff --git a/cman/scripts/Makefile b/cman/scripts/Makefile
index b4866c8..7950311 100644
--- a/cman/scripts/Makefile
+++ b/cman/scripts/Makefile
@@ -1,4 +1,4 @@
-SHAREDIRTEX=checkquorum
+SHAREDIRTEX=checkquorum checkquorum.wdmd
 
 include ../../make/defines.mk
 include $(OBJDIR)/make/clean.mk
diff --git a/cman/scripts/checkquorum.wdmd b/cman/scripts/checkquorum.wdmd
new file mode 100644
index 0000000..1d81ff6
--- /dev/null
+++ b/cman/scripts/checkquorum.wdmd
@@ -0,0 +1,104 @@
+#!/bin/bash
+# Quorum detection watchdog script
+#
+# This script will return -2 if the node had quorum at one point
+# and then subsequently lost it
+#
+# Copyright 2012 Red Hat, Inc.
+
+# defaults
+
+# Amount of time in seconds to wait after quorum is lost to fail script
+waittime=60
+
+# action to take if quorum is missing for longer than waittime
+# autodetect|hardreboot|crashdump|watchdog
+action=autodetect
+
+# Location of temporary file to capture timeouts
+timerfile="/var/run/cluster/checkquorum-timer"
+
+# rpm based distros
+[ -d /etc/sysconfig ] && \
+	[ -f /etc/sysconfig/checkquorum ] && \
+	. /etc/sysconfig/checkquorum
+
+# deb based distros
+[ ! -d /etc/sysconfig ] && \
+	[ -f /etc/default/checkquorum ] && \
+	. /etc/default/checkquorum
+
+has_quorum() {
+	corosync-quorumtool -s 2>/dev/null | \
+		grep ^Quorate: | \
+		grep -q Yes$
+}
+
+had_quorum() {
+	output="$(corosync-objctl 2>/dev/null | \
+		grep runtime.totem.pg.mrp.srp.operational_entered | cut -d "=" -f 2)"
+	[ -n "$output" ] && {
+		[ "$output" -ge 1 ] && return 0
+		return 1
+	}
+}
+
+take_action() {
+	case "$action" in
+		watchdog)
+			[ -n "$wdmd_action" ] && return 1
+			;;
+		hardreboot)
+			echo 1 > /proc/sys/kernel/sysrq
+			echo b > /proc/sysrq-trigger
+			;;
+		crashdump)
+			echo 1 > /proc/sys/kernel/sysrq
+			echo c > /proc/sysrq-trigger
+			;;
+		autodetect)
+			service kdump status > /dev/null 2>&1
+			usekexec="$?"
+			[ -n "$wdmd_action" ] && [ "$usekexec" != "0" ] && return 1
+			echo 1 > /proc/sys/kernel/sysrq
+			[ "$usekexec" = "0" ] && echo c > /proc/sysrq-trigger
+			echo b > /proc/sysrq-trigger
+	esac
+}
+
+# watchdog uses $1 = test or = repair
+# with no arguments we are called by wdmd
+[ -z "$1" ] && wdmd_action=yes
+
+# we don't support watchdog repair action
+[ "$1" = "repair" ] && exit 1
+
+service corosync status > /dev/null 2>&1
+ret=$?
+
+case "$ret" in
+	3) # corosync is not running (clean)
+		rm -f "$timerfile"
+		exit 0
+		;;
+	1) # corosync crashed or exited abnormally (dirty - take action)
+		logger -t checkquorum.wdmd "corosync crashed or exited abnormally. Node will soon reboot"
+		take_action
+		;;
+	0) # corosync is running (clean)
+		# check quorum here
+		has_quorum && {
+			echo -e "oldtime=$(date +%s)" > "$timerfile"
+			exit 0
+		}
+		. "$timerfile"
+		newtime="$(date +%s)" 
+		delta=$((newtime - oldtime))
+		logger -t checkquorum.wdmd "Node has lost quorum. Node will soon reboot"
+		had_quorum && [ "$delta" -gt "$waittime" ] && {
+			take_action
+		}
+		;;
+esac
+
+exit $?
-- 
1.7.7.6
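
A note on the timer handling in checkquorum.wdmd above: the script sources "$timerfile" unconditionally once quorum is lost. If that file was never written (for instance, when the script first runs after quorum has already been lost), oldtime stays unset and delta evaluates to the full epoch time, which exceeds any reasonable waittime. Below is a minimal sketch of a guard, assuming it is acceptable to restart the grace period in that case. read_oldtime is a hypothetical helper, not part of the patch.

```shell
# Hypothetical helper (not in the posted patch): print the timestamp stored
# in a timer file written as "oldtime=<epoch seconds>", or the fallback
# value if the file is missing or does not hold a plain number.
read_oldtime() {
	timerfile="$1"
	fallback="$2"
	oldtime=""
	[ -r "$timerfile" ] && . "$timerfile"
	case "$oldtime" in
		''|*[!0-9]*) echo "$fallback" ;;
		*) echo "$oldtime" ;;
	esac
}

now="$(date +%s)"
# With no usable timer file, oldtime falls back to "now", so delta is 0
# and the grace period starts from this run.
oldtime="$(read_oldtime /var/run/cluster/checkquorum-timer "$now")"
delta=$((now - oldtime))
```

The helper is meant to be called through command substitution, as shown, so that sourcing the timer file cannot overwrite variables in the calling script.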



* [Cluster-devel] [PATCH 2/2] checkquorum.wdmd: add integration script with wdmd
From: Dietmar Maurer @ 2012-10-10  4:26 UTC
  To: cluster-devel.redhat.com

> +# rpm based distros
> +[ -d /etc/sysconfig ] && \
> +	[ -f /etc/sysconfig/checkquorum ] && \
> +	. /etc/sysconfig/checkquorum
> +
> +# deb based distros
> +[ ! -d /etc/sysconfig ] && \
> +	[ -f /etc/default/checkquorum ] && \
> +	. /etc/default/checkquorum
> +

FYI: Some RAID tool vendors deliver utilities for Debian that create the directory
'/etc/sysconfig' on Debian boxes, so that test is not reliable.





* [Cluster-devel] [PATCH 2/2] checkquorum.wdmd: add integration script with wdmd
From: Dietmar Maurer @ 2012-10-10  4:33 UTC
  To: cluster-devel.redhat.com

Will you add some documentation on how to use those scripts?

It seems those scripts do not check whether the node has joined the fence domain?

> -----Original Message-----
> From: cluster-devel-bounces at redhat.com
> [mailto:cluster-devel-bounces at redhat.com] On Behalf Of Fabio M. Di Nitto
> Sent: Tuesday, 09 October 2012 11:36
> To: cluster-devel at redhat.com
> Cc: Fabio M. Di Nitto
> Subject: [Cluster-devel] [PATCH 2/2] checkquorum.wdmd: add integration
> script with wdmd
> 
> From: "Fabio M. Di Nitto" <fdinitto@redhat.com>
> 




* [Cluster-devel] [PATCH 2/2] checkquorum.wdmd: add integration script with wdmd
From: Fabio M. Di Nitto @ 2012-10-10  6:59 UTC
  To: cluster-devel.redhat.com

On 10/10/2012 6:26 AM, Dietmar Maurer wrote:
>> +# rpm based distros
>> +[ -d /etc/sysconfig ] && \
>> +	[ -f /etc/sysconfig/checkquorum ] && \
>> +	. /etc/sysconfig/checkquorum
>> +
>> +# deb based distros
>> +[ ! -d /etc/sysconfig ] && \
>> +	[ -f /etc/default/checkquorum ] && \
>> +	. /etc/default/checkquorum
>> +
> 
> FYI: Some RAID tool vendors deliver utilities for Debian that create the directory
> '/etc/sysconfig' on Debian boxes, so that test is not reliable.
> 
> 

This might be a controversial argument.

Debian policy (1) defines the use of /etc/default as a "should" (2) for
conffiles such as this one. On the other hand, it does not explicitly
forbid the use of sysconfig.

sysconfig is not found anywhere in the default Debian archive because
packages follow the formal *should* policy.

If third-party applications don't follow Debian packaging guidelines, it
is possible that they might break other components as well.

Of course we can argue on the definition of "should" forever and ever :)

As upstream we follow basic guidelines; distribution packagers and
porters should (pun intended ;)) make sure to provide us with porting
patches (that's also part of the Debian Maintainer duty).

Fabio

1) http://www.debian.org/doc/debian-policy/ch-opersys.html
   Section 9.3.2

"To ease the burden on the system administrator, such configurable values
 should not be placed directly in the script.
 Instead, they should be placed in a file in /etc/default, ...."

2) http://www.thefreedictionary.com/should
 should  (shd)
  aux.v. Past tense of shall
  1. Used to express obligation or duty
....





* [Cluster-devel] [PATCH 2/2] checkquorum.wdmd: add integration script with wdmd
From: Fabio M. Di Nitto @ 2012-10-10  7:06 UTC
  To: cluster-devel.redhat.com

On 10/10/2012 6:33 AM, Dietmar Maurer wrote:
> Will you add some documentation on how to use those scripts?

Yes our documentation overlord is preparing an upstream wiki page for
it. It will be ready before a release.

> 
> It seems those scripts do not check whether the node has joined the fence domain?
> 

It doesn't really need to.

I'll put this in the simplest way possible:

- real fencing == murder
  there can only be one killer in the cluster at a time
  fence domain coordinates who can/should be killed by who

- checkquorum.wdmd == suicide
  there are N nodes in the cluster that can decide to commit suicide
  without really caring about what others are doing.
  this can run without any fencing configuration at all.
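
Because each node decides for itself, the decision needs only local state. Here is a minimal sketch of that decision as a pure predicate, mirroring the quorum branch of the posted script. should_self_fence and its argument order are illustrative names, not part of the patch.

```shell
# Hypothetical predicate mirroring the quorum branch of checkquorum.wdmd.
# $1 = 1 if the node has quorum now, 0 otherwise
# $2 = 1 if the node had quorum at some point, 0 otherwise
# $3 = seconds elapsed since quorum was last seen
# $4 = waittime (grace period), in seconds
should_self_fence() {
	[ "$1" -eq 1 ] && return 1	# quorate: nothing to do
	[ "$2" -eq 1 ] || return 1	# never had quorum: do not act
	[ "$3" -gt "$4" ]		# act only once the grace period is over
}
```

For example, `should_self_fence 0 1 75 60` succeeds (quorum was lost 75 seconds ago with a 60 second grace period), while the same call with 30 seconds elapsed does not. No input describes what other nodes are doing.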

Anyway, examples, setups, limitations... all will be in the doc as soon as
it's ready. Please be a bit patient :)

Fabio



* [Cluster-devel] [PATCH 2/2] checkquorum.wdmd: add integration script with wdmd
From: Dietmar Maurer @ 2012-10-10  8:06 UTC
  To: cluster-devel.redhat.com

> On 10/10/2012 6:26 AM, Dietmar Maurer wrote:
> >> +# rpm based distros
> >> +[ -d /etc/sysconfig ] && \
> >> +	[ -f /etc/sysconfig/checkquorum ] && \
> >> +	. /etc/sysconfig/checkquorum
> >> +
> >> +# deb based distros
> >> +[ ! -d /etc/sysconfig ] && \
> >> +	[ -f /etc/default/checkquorum ] && \
> >> +	. /etc/default/checkquorum
> >> +
> >
> > FYI: Some RAID tool vendors deliver utilities for Debian that create the
> > directory '/etc/sysconfig' on Debian boxes, so that test is not reliable.
> >
> >
> 
> This might be a controversial argument.

I just thought there are better tests to see whether you are running on Debian, for example:

[ -f /etc/debian_version ] && [ -d /etc/default ]








* [Cluster-devel] [PATCH 2/2] checkquorum.wdmd: add integration script with wdmd
From: Dietmar Maurer @ 2012-10-10  8:10 UTC
  To: cluster-devel.redhat.com

> Anyway, examples, setups, limitations... all will be in the doc as soon as it's
> ready. Please be a bit patient :)

OK (I am just curious) - many thanks for your fast answers!

- Dietmar




* [Cluster-devel] [PATCH 2/2] checkquorum.wdmd: add integration script with wdmd
From: Fabio M. Di Nitto @ 2012-10-10  8:11 UTC
  To: cluster-devel.redhat.com

On 10/10/2012 10:06 AM, Dietmar Maurer wrote:
>> On 10/10/2012 6:26 AM, Dietmar Maurer wrote:
>>>> +# rpm based distros
>>>> +[ -d /etc/sysconfig ] && \
>>>> +	[ -f /etc/sysconfig/checkquorum ] && \
>>>> +	. /etc/sysconfig/checkquorum
>>>> +
>>>> +# deb based distros
>>>> +[ ! -d /etc/sysconfig ] && \
>>>> +	[ -f /etc/default/checkquorum ] && \
>>>> +	. /etc/default/checkquorum
>>>> +
>>>
>>> FYI: Some RAID tool vendors deliver utilities for Debian that create the
>>> directory '/etc/sysconfig' on Debian boxes, so that test is not reliable.
>>>
>>>
>>
>> This might be a controversial argument.
> 
> I just thought there are better tests to see whether you are running on Debian, for example:
> 
> [ -f /etc/debian_version ] && [ -d /etc/default ]
> 

That doesn't scale well for Debian derivatives that don't ship
debian_version :) (see Ubuntu & co..)

You can't even use something like "which dpkg" since the tool is
available on rpm based distributions... or vice versa.. there is rpm for
Debian & derivatives.

hardcoding all distributions is not optimal either, as they might change
policy by version....
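
Here is a minimal sketch of an alternative that avoids detecting the distribution at all. It assumes it is acceptable to read whichever of the two files exists, with /etc/sysconfig taking precedence. load_first_config is a hypothetical helper, not part of the posted patch.

```shell
# Hypothetical helper (not in the posted patch): source the first readable
# config file among the candidates, in the order given.
load_first_config() {
	for candidate in "$@"; do
		if [ -f "$candidate" ] && [ -r "$candidate" ]; then
			. "$candidate"
			return 0
		fi
	done
	# No config file found: keep the built-in defaults.
	return 0
}

# Defaults as in the patch.
waittime=60
action=autodetect

load_first_config /etc/sysconfig/checkquorum /etc/default/checkquorum
```

This keys on the config file itself rather than on the presence of the /etc/sysconfig directory, so a stray directory created by a third-party package does not change which file is read.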

Fabio



* [Cluster-devel] [PATCH 2/2] checkquorum.wdmd: add integration script with wdmd
From: Dietmar Maurer @ 2012-10-10  8:15 UTC
  To: cluster-devel.redhat.com

> > [ -f /etc/debian_version ] && [ -d /etc/default ]
> >
> 
> That doesn't scale well for Debian derivatives that don't ship debian_version :)
> (see Ubuntu & co..)
> 
> You can't even use something like "which dpkg" since the tool is available on
> rpm based distributions... or vice versa.. there is rpm for Debian & derivatives.
> 
> hardcoding all distributions is not optimal either, as they might change policy
> by version....

OK, I can see the problem now.

- Dietmar




* [Cluster-devel] [PATCH 2/2] checkquorum.wdmd: add integration script with wdmd
From: Heiko Nardmann @ 2012-10-10 11:04 UTC
  To: cluster-devel.redhat.com

On 10.10.2012 10:11, Fabio M. Di Nitto wrote:
> [snip]
> That doesn't scale well for Debian derivatives that don't ship
> debian_version :) (see Ubuntu & co..)
>
> You can't even use something like "which dpkg" since the tool is
> available on rpm based distributions... or vice versa.. there is rpm for
> Debian & derivatives.
>
> hardcoding all distributions is not optimal either, as they might change
> policy by version....
>
> Fabio
>

What about 'lsb_release'? Is that executable available on all platforms?

Heiko



* [Cluster-devel] [PATCH 2/2] checkquorum.wdmd: add integration script with wdmd
From: Fabio M. Di Nitto @ 2012-10-10 11:14 UTC
  To: cluster-devel.redhat.com

On 10/10/2012 1:04 PM, Heiko Nardmann wrote:
> On 10.10.2012 10:11, Fabio M. Di Nitto wrote:
>> [snip]
>> That doesn't scale well for Debian derivatives that don't ship
>> debian_version :) (see Ubuntu & co..)
>>
>> You can't even use something like "which dpkg" since the tool is
>> available on rpm based distributions... or vice versa.. there is rpm for
>> Debian & derivatives.
>>
>> hardcoding all distributions is not optimal either, as they might change
>> policy by version....
>>
>> Fabio
>>
> 
> What about 'lsb_release'? Is that executable available on all platforms?


It is not installed by default; it's generally shipped with the $distro-lsb
metapackage, which pulls in half a gazillion dependencies.

I doubt it would solve anything since you still need to parse the
output. It's really no different from hardcoding /etc/$distro_release,
except with a few GB of extra packages ;)

Fabio


