lttng-dev.lists.lttng.org archive mirror
* [PATCH lttng-tools 2/6] Fix: tests: error handling in high throughput limits test (v2)
       [not found] <20190516190801.16878-1-mathieu.desnoyers@efficios.com>
@ 2019-05-16 19:07 ` Mathieu Desnoyers
  2019-05-16 19:07 ` [PATCH lttng-tools 3/6] Fix: utils.sh: handle SIGPIPE Mathieu Desnoyers
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 6+ messages in thread
From: Mathieu Desnoyers @ 2019-05-16 19:07 UTC
  To: joraj, jgalar; +Cc: lttng-dev

Check each individual call to "tc" for errors; otherwise we may fail
to catch specific tc errors caused, for instance, by a kernel
configuration that contains only some of the required class modules.

Also, invoke the utils.sh full_cleanup function from the script-specific
interrupt_cleanup trap handler rather than stopping relayd and sessiond
within the script itself.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
---
Changes since v1:
- Issue reset_bw_limit when set_bw_limit fails.
---
 .../streaming/test_high_throughput_limits     | 48 ++++++++++++++-----
 1 file changed, 37 insertions(+), 11 deletions(-)

diff --git a/tests/regression/tools/streaming/test_high_throughput_limits b/tests/regression/tools/streaming/test_high_throughput_limits
index 32c3f1f2..5945bde7 100755
--- a/tests/regression/tools/streaming/test_high_throughput_limits
+++ b/tests/regression/tools/streaming/test_high_throughput_limits
@@ -41,6 +41,12 @@ if [ ! -x "$TESTAPP_BIN" ]; then
 	BAIL_OUT "No UST events binary detected."
 fi
 
+function reset_bw_limit
+{
+	tc qdisc del dev $DEFAULT_IF root >/dev/null 2>&1
+	return $?
+}
+
 function set_bw_limit
 {
 	limit=$1
@@ -51,28 +57,47 @@ function set_bw_limit
 	# parent qdisc (1:) will always limit us to the right max value
 	dataportlimit=$((9*${ctrlportlimit}))
 
+	diag "Set bandwidth limits to ${limit}kbits, ${ctrlportlimit} for control and ${dataportlimit} for data"
 
 	tc qdisc add dev $DEFAULT_IF root handle 1: htb default 15 >/dev/null 2>&1
+	if [ $? -ne 0 ]; then
+		reset_bw_limit
+		return 1
+	fi
 
 	# the total bandwidth is the limit set by the user
 	tc class add dev $DEFAULT_IF parent 1: classid 1:1 htb rate ${limit}kbit ceil ${limit}kbit >/dev/null 2>&1
+	if [ $? -ne 0 ]; then
+		reset_bw_limit
+		return 1
+	fi
 	# 1/10 of the bandwidth guaranteed and traffic prioritized for the control port
 	tc class add dev $DEFAULT_IF parent 1:1 classid 1:10 htb rate ${ctrlportlimit}kbit ceil ${limit}kbit prio 1 >/dev/null 2>&1
+	if [ $? -ne 0 ]; then
+		reset_bw_limit
+		return 1
+	fi
 	# 9/10 of the bandwidth guaranteed and can borrow up to the total bandwidth (if unused)
 	tc class add dev $DEFAULT_IF parent 1:1 classid 1:11 htb rate ${dataportlimit}kbit ceil ${limit}kbit prio 2 >/dev/null 2>&1
+	if [ $? -ne 0 ]; then
+		reset_bw_limit
+		return 1
+	fi
 
 	# filter to assign control traffic to the 1:10 class
 	tc filter add dev $DEFAULT_IF parent 1: protocol ip u32 match ip dport $SESSIOND_CTRL_PORT 0xffff flowid 1:10 >/dev/null 2>&1
+	if [ $? -ne 0 ]; then
+		reset_bw_limit
+		return 1
+	fi
 	# filter to assign data traffic to the 1:11 class
 	tc filter add dev $DEFAULT_IF parent 1: protocol ip u32 match ip dport $SESSIOND_DATA_PORT 0xffff flowid 1:11 >/dev/null 2>&1
+	if [ $? -ne 0 ]; then
+		reset_bw_limit
+		return 1
+	fi
 
-	ok $? "Set bandwidth limits to ${limit}kbits, ${ctrlportlimit} for control and ${dataportlimit} for data"
-}
-
-function reset_bw_limit
-{
-	tc qdisc del dev $DEFAULT_IF root >/dev/null 2>&1
-	ok $? "Reset bandwith limits"
+	return 0
 }
 
 function create_lttng_session_with_uri
@@ -148,9 +173,9 @@ function validate_event_count
 function interrupt_cleanup()
 {
 	diag "*** Exiting ***"
-	stop_lttng_relayd
-	stop_lttng_sessiond
 	reset_bw_limit
+	# invoke utils cleanup
+	full_cleanup
 	exit 1
 }
 
@@ -168,8 +193,7 @@ skip $isroot "Root access is needed to set bandwith limits. Skipping all tests."
 {
 
 	# Catch sigint and try to cleanup limits
-	trap interrupt_cleanup SIGTERM
-	trap interrupt_cleanup SIGINT
+	trap interrupt_cleanup SIGTERM SIGINT
 
 	BW_LIMITS=(3200 1600 800 400 200 100 50 25)
 	for BW in ${BW_LIMITS[@]};
@@ -177,6 +201,7 @@ skip $isroot "Root access is needed to set bandwith limits. Skipping all tests."
 		diag "Test high-throughput with bandwidth limit set to ${BW}kbits"
 
 		set_bw_limit $BW
+		ok $? "Setting bandwidth limit"
 
 		start_lttng_sessiond
 		start_lttng_relayd "-o $TRACE_PATH"
@@ -185,5 +210,6 @@ skip $isroot "Root access is needed to set bandwith limits. Skipping all tests."
 		stop_lttng_relayd
 		stop_lttng_sessiond
 		reset_bw_limit
+		ok $? "Reset bandwidth limits"
 	done
 }
-- 
2.17.1
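
The error handling that this patch adds to set_bw_limit (check every
command, undo the partial setup on the first failure) can be tried
without root access. The following is only a minimal sketch:
run_or_rollback, demo_rollback, and demo_setup are hypothetical names
that do not exist in lttng-tools, and "true"/"false" stand in for the
individual tc invocations.

```shell
# Hypothetical sketch, not part of lttng-tools.
rollback_log=""

# Stand-in for reset_bw_limit: only records that it was called.
demo_rollback() {
	rollback_log="rolled back"
}

# Run one command ("$@"); on failure, undo the partial setup and
# report the error to the caller.
run_or_rollback() {
	if ! "$@" >/dev/null 2>&1; then
		demo_rollback
		return 1
	fi
	return 0
}

# Stand-in for set_bw_limit: stop at the first failing step.
demo_setup() {
	run_or_rollback true || return 1
	run_or_rollback false || return 1	# simulated tc failure
	run_or_rollback true || return 1	# not reached
	return 0
}

demo_status=0
demo_setup || demo_status=$?
echo "setup status: $demo_status, rollback: ${rollback_log:-none}"
```

Because the helper returns a status instead of calling "ok" itself, the
caller stays in charge of emitting exactly one TAP result per setup
attempt, which is the same split the patch makes between set_bw_limit
and the test loop.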


* [PATCH lttng-tools 3/6] Fix: utils.sh: handle SIGPIPE
       [not found] <20190516190801.16878-1-mathieu.desnoyers@efficios.com>
  2019-05-16 19:07 ` [PATCH lttng-tools 2/6] Fix: tests: error handling in high throughput limits test (v2) Mathieu Desnoyers
@ 2019-05-16 19:07 ` Mathieu Desnoyers
  2019-05-16 19:07 ` [PATCH lttng-tools 4/6] Fix: test: utils.sh: exit from process on full_cleanup Mathieu Desnoyers
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 6+ messages in thread
From: Mathieu Desnoyers @ 2019-05-16 19:07 UTC
  To: joraj, jgalar; +Cc: lttng-dev

Perl's prove closes the pipes to its child before giving the child a
chance to execute its signal trap handler. This means the child cannot
complete execution of the trap handler if that handler writes to stdout
or stderr.

Work around this situation by redirecting stdin, stdout, and stderr
to /dev/null if a SIGPIPE is caught.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
---
 tests/utils/utils.sh | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/tests/utils/utils.sh b/tests/utils/utils.sh
index 8f6381eb..53964027 100644
--- a/tests/utils/utils.sh
+++ b/tests/utils/utils.sh
@@ -68,9 +68,21 @@ function full_cleanup ()
 	trap - SIGTERM && kill -- -$$
 }
 
+function null_pipes ()
+{
+	exec 0>/dev/null
+	exec 1>/dev/null
+	exec 2>/dev/null
+}
 
 trap full_cleanup SIGINT SIGTERM
 
+# perl prove closes its child pipes before giving it a chance to run its
+# signal trap handlers. Redirect pipes to /dev/null if SIGPIPE is caught
+# to allow those trap handlers to proceed.
+
+trap null_pipes SIGPIPE
+
 function print_ok ()
 {
 	# Check if we are a terminal
-- 
2.17.1
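
The effect of the redirection in null_pipes can be observed in
isolation. The following is a minimal sketch with hypothetical names
(demo_null_pipes, captured). It calls the handler by hand inside a
command substitution instead of provoking a real SIGPIPE, and it opens
stdin with "<", whereas the patch uses "0>", which also detaches fd 0
from the pipe.

```shell
# Hypothetical stand-alone copy of the handler, not the utils.sh code.
demo_null_pipes() {
	exec 0</dev/null	# stdin, opened for reading
	exec 1>/dev/null	# stdout
	exec 2>/dev/null	# stderr
}

# POSIX sh names the signal "PIPE"; bash also accepts "SIGPIPE".
trap demo_null_pipes PIPE

# Call the handler by hand in a subshell: anything written afterwards
# goes to /dev/null, so nothing reaches the command substitution.
captured=$(demo_null_pipes; echo "written after redirection")
echo "captured: '${captured}'"
```

The last line prints an empty capture, which shows that writes issued
after the handler ran no longer depend on the original pipe.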


* [PATCH lttng-tools 4/6] Fix: test: utils.sh: exit from process on full_cleanup
       [not found] <20190516190801.16878-1-mathieu.desnoyers@efficios.com>
  2019-05-16 19:07 ` [PATCH lttng-tools 2/6] Fix: tests: error handling in high throughput limits test (v2) Mathieu Desnoyers
  2019-05-16 19:07 ` [PATCH lttng-tools 3/6] Fix: utils.sh: handle SIGPIPE Mathieu Desnoyers
@ 2019-05-16 19:07 ` Mathieu Desnoyers
  2019-05-16 19:08 ` [PATCH lttng-tools 5/6] Cleanup: test: don't stop relayd twice Mathieu Desnoyers
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 6+ messages in thread
From: Mathieu Desnoyers @ 2019-05-16 19:07 UTC
  To: joraj, jgalar; +Cc: lttng-dev

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
---
 tests/utils/utils.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tests/utils/utils.sh b/tests/utils/utils.sh
index 53964027..ad3088d6 100644
--- a/tests/utils/utils.sh
+++ b/tests/utils/utils.sh
@@ -66,6 +66,7 @@ function full_cleanup ()
 	# The '-' before the pid number ($$) indicates 'kill' to signal the
 	# whole process group.
 	trap - SIGTERM && kill -- -$$
+	exit 1
 }
 
 function null_pipes ()
-- 
2.17.1


* [PATCH lttng-tools 5/6] Cleanup: test: don't stop relayd twice
       [not found] <20190516190801.16878-1-mathieu.desnoyers@efficios.com>
                   ` (2 preceding siblings ...)
  2019-05-16 19:07 ` [PATCH lttng-tools 4/6] Fix: test: utils.sh: exit from process on full_cleanup Mathieu Desnoyers
@ 2019-05-16 19:08 ` Mathieu Desnoyers
  2019-05-16 19:08 ` [PATCH lttng-tools 6/6] tests: invoke full_cleanup from script trap handlers, use modprobe -r Mathieu Desnoyers
  2019-09-05 20:54 ` [PATCH lttng-tools 1/6] Improve handling of test SIGTERM/SIGINT (v2) Jérémie Galarneau
  5 siblings, 0 replies; 6+ messages in thread
From: Mathieu Desnoyers @ 2019-05-16 19:08 UTC
  To: joraj, jgalar; +Cc: lttng-dev

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
---
 tests/regression/tools/live/test_lttng_ust | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/tests/regression/tools/live/test_lttng_ust b/tests/regression/tools/live/test_lttng_ust
index 06017d01..830fc783 100755
--- a/tests/regression/tools/live/test_lttng_ust
+++ b/tests/regression/tools/live/test_lttng_ust
@@ -34,7 +34,7 @@ TRACE_PATH=$(mktemp -d)
 
 DIR=$(readlink -f $TESTDIR)
 
-NUM_TESTS=12
+NUM_TESTS=11
 
 source $TESTDIR/utils/utils.sh
 
@@ -84,5 +84,4 @@ stop_lttng_relayd
 
 test_custom_url
 
-stop_lttng_relayd
 stop_lttng_sessiond
-- 
2.17.1


* [PATCH lttng-tools 6/6] tests: invoke full_cleanup from script trap handlers, use modprobe -r
       [not found] <20190516190801.16878-1-mathieu.desnoyers@efficios.com>
                   ` (3 preceding siblings ...)
  2019-05-16 19:08 ` [PATCH lttng-tools 5/6] Cleanup: test: don't stop relayd twice Mathieu Desnoyers
@ 2019-05-16 19:08 ` Mathieu Desnoyers
  2019-09-05 20:54 ` [PATCH lttng-tools 1/6] Improve handling of test SIGTERM/SIGINT (v2) Jérémie Galarneau
  5 siblings, 0 replies; 6+ messages in thread
From: Mathieu Desnoyers @ 2019-05-16 19:08 UTC
  To: joraj, jgalar; +Cc: lttng-dev

Scripts implementing their own trap handlers override the generic
one provided by utils.sh (full_cleanup). Invoke full_cleanup at the end
of those handlers so that the utils cleanup runs as well.

Moreover, change use of "rmmod" to "modprobe -r", which is better
in trap handlers because it does not print errors if the module
was not loaded yet when the signal occurs.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
---
 tests/regression/kernel/test_clock_override            | 10 +++-------
 tests/regression/kernel/test_rotation_destroy_flush    |  7 +++----
 tests/regression/tools/crash/test_crash                |  3 +--
 .../tools/notification/test_notification_kernel        |  2 +-
 .../tools/notification/test_notification_multi_app     |  2 +-
 .../tools/notification/test_notification_ust           |  2 +-
 .../tools/streaming/test_high_throughput_limits        |  1 -
 .../rotation-destroy-flush/test_rotation_destroy_flush |  3 +--
 tests/stress/test_multi_sessions_per_uid_10app         |  5 ++---
 .../stress/test_multi_sessions_per_uid_5app_streaming  |  5 ++---
 ...t_multi_sessions_per_uid_5app_streaming_kill_relayd |  5 ++---
 11 files changed, 17 insertions(+), 28 deletions(-)

diff --git a/tests/regression/kernel/test_clock_override b/tests/regression/kernel/test_clock_override
index e19b77e1..1fbba771 100755
--- a/tests/regression/kernel/test_clock_override
+++ b/tests/regression/kernel/test_clock_override
@@ -49,11 +49,9 @@ source $TESTDIR/utils/utils.sh
 function signal_cleanup()
 {
 	diag "*** Exiting ***"
-	rmmod lttng-test
 	stop_lttng_sessiond
-	rmmod lttng-clock-plugin-test
-	rmmod lttng-clock
-	exit 1
+	modprobe -r lttng-test lttng-clock-plugin-test lttng-clock
+	full_cleanup
 }
 
 function extract_clock_metadata()
@@ -93,10 +91,8 @@ function test_clock_override_metadata()
 	stop_lttng_tracing_ok $SESSION_NAME
 	destroy_lttng_session_ok $SESSION_NAME
 
-	rmmod lttng-test
 	stop_lttng_sessiond
-	rmmod lttng-clock-plugin-test
-	rmmod lttng-clock
+	modprobe -r lttng-test lttng-clock-plugin-test lttng-clock
 
 	local TRACE_METADATA_FILE_PATH="$(find "$TRACE_PATH" -name metadata -type f)"
 	local TRACE_METADATA_DIR="$(dirname "$TRACE_METADATA_FILE_PATH")"
diff --git a/tests/regression/kernel/test_rotation_destroy_flush b/tests/regression/kernel/test_rotation_destroy_flush
index 0b0b0ca7..03933a3a 100755
--- a/tests/regression/kernel/test_rotation_destroy_flush
+++ b/tests/regression/kernel/test_rotation_destroy_flush
@@ -39,9 +39,8 @@ source $TESTDIR/utils/utils.sh
 function signal_cleanup()
 {
 	diag "*** Exiting ***"
-	rmmod lttng-test
-	stop_lttng_sessiond
-	exit 1
+	modprobe -r lttng-test
+	full_cleanup
 }
 
 function enable_kernel_lttng_channel_size_limit ()
@@ -107,7 +106,7 @@ function test_rotation_destroy_flush_single()
 
 	rm -rf $TRACE_PATH
 
-	rmmod lttng-test
+	modprobe -r lttng-test
 	stop_lttng_sessiond
 }
 
diff --git a/tests/regression/tools/crash/test_crash b/tests/regression/tools/crash/test_crash
index 13909c1b..5bad16e5 100755
--- a/tests/regression/tools/crash/test_crash
+++ b/tests/regression/tools/crash/test_crash
@@ -392,8 +392,7 @@ function interrupt_cleanup()
 {
     diag "*** Cleaning-up test ***"
     stop_test_apps
-    stop_lttng_sessiond
-    exit 1
+    full_cleanup
 }
 
 TESTS=(
diff --git a/tests/regression/tools/notification/test_notification_kernel b/tests/regression/tools/notification/test_notification_kernel
index e7368df2..cc6fc581 100755
--- a/tests/regression/tools/notification/test_notification_kernel
+++ b/tests/regression/tools/notification/test_notification_kernel
@@ -56,7 +56,7 @@ function kernel_event_generator
 	state_file=$1
 	kernel_event_generator_suspended=0
 	trap kernel_event_generator_toogle_state SIGUSR1
-	trap "exit" SIGTERM SIGINT EXIT
+
 	while (true); do
 		if [[ $kernel_event_generator_suspended -eq "1" ]]; then
 			touch $state_file
diff --git a/tests/regression/tools/notification/test_notification_multi_app b/tests/regression/tools/notification/test_notification_multi_app
index 7465a83f..51d94e4f 100755
--- a/tests/regression/tools/notification/test_notification_multi_app
+++ b/tests/regression/tools/notification/test_notification_multi_app
@@ -64,7 +64,7 @@ function kernel_event_generator
 	state_file=$1
 	kernel_event_generator_suspended=0
 	trap kernel_event_generator_toogle_state SIGUSR1
-	trap "exit" SIGTERM SIGINT
+
 	while (true); do
 		if [[ $kernel_event_generator_suspended -eq "1" ]]; then
 			touch $state_file
diff --git a/tests/regression/tools/notification/test_notification_ust b/tests/regression/tools/notification/test_notification_ust
index 8941e476..82f79a8e 100755
--- a/tests/regression/tools/notification/test_notification_ust
+++ b/tests/regression/tools/notification/test_notification_ust
@@ -56,7 +56,7 @@ function ust_event_generator
 	state_file=$1
 	ust_event_generator_suspended=0
 	trap ust_event_generator_toogle_state SIGUSR1
-	trap "exit" SIGTERM SIGINT
+
 	while (true); do
 		if [[ $ust_event_generator_suspended -eq "1" ]]; then
 			touch $state_file
diff --git a/tests/regression/tools/streaming/test_high_throughput_limits b/tests/regression/tools/streaming/test_high_throughput_limits
index 5945bde7..1bcf3532 100755
--- a/tests/regression/tools/streaming/test_high_throughput_limits
+++ b/tests/regression/tools/streaming/test_high_throughput_limits
@@ -176,7 +176,6 @@ function interrupt_cleanup()
 	reset_bw_limit
 	# invoke utils cleanup
 	full_cleanup
-	exit 1
 }
 
 plan_tests $NUM_TESTS
diff --git a/tests/regression/ust/rotation-destroy-flush/test_rotation_destroy_flush b/tests/regression/ust/rotation-destroy-flush/test_rotation_destroy_flush
index a7a93771..e404564e 100755
--- a/tests/regression/ust/rotation-destroy-flush/test_rotation_destroy_flush
+++ b/tests/regression/ust/rotation-destroy-flush/test_rotation_destroy_flush
@@ -48,8 +48,7 @@ function run_app()
 function signal_cleanup()
 {
 	diag "*** Exiting ***"
-	stop_lttng_sessiond
-	exit 1
+	full_cleanup
 }
 
 function enable_ust_lttng_channel_size_limit ()
diff --git a/tests/stress/test_multi_sessions_per_uid_10app b/tests/stress/test_multi_sessions_per_uid_10app
index 82e8ad50..c9f8403e 100755
--- a/tests/stress/test_multi_sessions_per_uid_10app
+++ b/tests/stress/test_multi_sessions_per_uid_10app
@@ -112,11 +112,10 @@ function sighandler()
 {
 	cleanup
 	rm $LOG_FILE
-	exit 1
+	full_cleanup
 }
 
-trap sighandler SIGINT
-trap sighandler SIGTERM
+trap sighandler SIGINT SIGTERM
 
 # Make sure we collect a coredump if possible.
 ulimit -c unlimited
diff --git a/tests/stress/test_multi_sessions_per_uid_5app_streaming b/tests/stress/test_multi_sessions_per_uid_5app_streaming
index ed989498..4203ac30 100755
--- a/tests/stress/test_multi_sessions_per_uid_5app_streaming
+++ b/tests/stress/test_multi_sessions_per_uid_5app_streaming
@@ -142,11 +142,10 @@ function sighandler()
 {
 	cleanup
 	rm $LOG_FILE_SESSIOND $LOG_FILE_RELAYD
-	exit 1
+	full_cleanup
 }
 
-trap sighandler SIGINT
-trap sighandler SIGTERM
+trap sighandler SIGINT SIGTERM
 
 # Make sure we collect a coredump if possible.
 ulimit -c unlimited
diff --git a/tests/stress/test_multi_sessions_per_uid_5app_streaming_kill_relayd b/tests/stress/test_multi_sessions_per_uid_5app_streaming_kill_relayd
index c699ac22..d0121e32 100755
--- a/tests/stress/test_multi_sessions_per_uid_5app_streaming_kill_relayd
+++ b/tests/stress/test_multi_sessions_per_uid_5app_streaming_kill_relayd
@@ -144,11 +144,10 @@ function sighandler()
 {
 	cleanup
 	#rm $LOG_FILE_SESSIOND $LOG_FILE_RELAYD
-	exit 1
+	full_cleanup
 }
 
-trap sighandler SIGINT
-trap sighandler SIGTERM
+trap sighandler SIGINT SIGTERM
 
 # Make sure we collect a coredump if possible.
 ulimit -c unlimited
-- 
2.17.1
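
The handler chaining applied throughout this patch can be exercised
outside the test suite. The following is a minimal sketch under stated
assumptions: demo_full_cleanup is a simplified stand-in that only
prints a message and exits with status 1, whereas the real full_cleanup
also stops the daemons and signals the process group. All demo_* names
are hypothetical.

```shell
# Hypothetical sketch, not part of lttng-tools. The scenario runs in a
# child shell so that "$$" and "exit" refer to that child only.
if demo_output=$(sh -c '
	# Simplified stand-in for the generic handler from utils.sh.
	demo_full_cleanup() {
		echo "generic cleanup"
		exit 1
	}

	# Script-specific handler: local cleanup first, generic cleanup last.
	demo_signal_cleanup() {
		echo "script cleanup"
		demo_full_cleanup
	}

	trap demo_signal_cleanup INT TERM
	kill -TERM $$
	echo "not reached"
'); then
	demo_status=0
else
	demo_status=$?
fi

echo "$demo_output"
echo "exit status: $demo_status"
```

Since the generic handler is the one that exits, the script-specific
handler needs no "exit 1" of its own, which matches the lines removed
by this patch.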


* Re: [PATCH lttng-tools 1/6] Improve handling of test SIGTERM/SIGINT (v2)
       [not found] <20190516190801.16878-1-mathieu.desnoyers@efficios.com>
                   ` (4 preceding siblings ...)
  2019-05-16 19:08 ` [PATCH lttng-tools 6/6] tests: invoke full_cleanup from script trap handlers, use modprobe -r Mathieu Desnoyers
@ 2019-09-05 20:54 ` Jérémie Galarneau
  5 siblings, 0 replies; 6+ messages in thread
From: Jérémie Galarneau @ 2019-09-05 20:54 UTC
  To: Mathieu Desnoyers; +Cc: lttng-dev, jgalar, joraj

Series merged in master, stable-2.11, and stable-2.10.

Thanks!
Jérémie
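
The shutdown escalation described in the quoted patch below (send
SIGTERM, wait up to a timeout, then fall back to SIGKILL) can be
sketched for a single process. This is an illustration under
assumptions, not the utils.sh implementation: demo_stop_with_timeout is
a hypothetical helper that polls one PID with "kill -0" instead of
matching daemons with pgrep, and a "sleep" that ignores SIGTERM stands
in for a daemon that hangs during teardown.

```shell
# Hypothetical sketch, not part of lttng-tools.
# Send signal $2 to PID $1, then poll every 0.5 s until the process is
# gone or $3 seconds have elapsed. Returns 0 only if it vanished.
demo_stop_with_timeout() {
	demo_target=$1
	# Count half-second ticks to keep the arithmetic integer-only.
	demo_ticks=$(($3 * 2))

	kill -s "$2" "$demo_target" 2>/dev/null || return 1
	while kill -0 "$demo_target" 2>/dev/null; do
		if [ "$demo_ticks" -le 0 ]; then
			return 1
		fi
		demo_ticks=$((demo_ticks - 1))
		sleep 0.5
	done
	return 0
}

# Start the stand-in daemon with SIGTERM ignored: a signal ignored by
# the shell stays ignored in the commands it launches.
trap '' TERM
sleep 30 >/dev/null 2>&1 &
demo_pid=$!
trap - TERM

# Try a graceful stop first, then force it.
if demo_stop_with_timeout "$demo_pid" TERM 1; then
	demo_result="graceful"
else
	demo_stop_with_timeout "$demo_pid" KILL 1 || true
	demo_result="forced"
fi
wait "$demo_pid" 2>/dev/null || true
echo "shutdown: $demo_result"
```

The timeout is passed as an argument here; in the quoted patch the
equivalent value comes from LTTNG_TEST_TEARDOWN_TIMEOUT, which defaults
to 60 seconds.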

On Thu, May 16, 2019 at 03:07:56PM -0400, Mathieu Desnoyers wrote:
> The current state of signal handling for test scripts is: on
> SIGTERM/SIGINT of the tests (e.g. a CTRL-C on the console), session
> daemon and relay daemon are killed with SIGKILL, thus leaking all their
> resources, and leaving lttng kernel modules loaded.
> 
> Revamp the "stop" functions to take a signal number and a timeout
> as optional parameters. The default signal number is SIGTERM.
> 
> The full_cleanup trap handler now tries to nicely kill relayd and
> sessiond (if they are present) with SIGTERM, and wait up to the
> user-configurable LTTNG_TEST_TEARDOWN_TIMEOUT environment variable
> (which has a default of 60s). Then, if there are still either relayd,
> sessiond, or consumerd present, it will SIGKILL them and wait for
> them to vanish. If it had to kill sessiond with SIGKILL, it will
> also explicitly try to unload the lttng modules with modprobe.
> 
> This approach is inspired by sysv init script shutdown behavior.
> 
> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> ---
> Changes since v1:
> - Take care of feedback from Jonathan Rajotte,
> - Run through shellcheck.
> ---
>  tests/utils/utils.sh | 296 +++++++++++++++++++++++++++++--------------
>  1 file changed, 204 insertions(+), 92 deletions(-)
> 
> diff --git a/tests/utils/utils.sh b/tests/utils/utils.sh
> index 94b3a3c4..8f6381eb 100644
> --- a/tests/utils/utils.sh
> +++ b/tests/utils/utils.sh
> @@ -15,14 +15,12 @@
>  
>  SESSIOND_BIN="lttng-sessiond"
>  SESSIOND_MATCH=".*lttng-sess.*"
> -SESSIOND_PIDS=""
>  RUNAS_BIN="lttng-runas"
>  RUNAS_MATCH=".*lttng-runas.*"
>  CONSUMERD_BIN="lttng-consumerd"
>  CONSUMERD_MATCH=".*lttng-consumerd.*"
>  RELAYD_BIN="lttng-relayd"
>  RELAYD_MATCH=".*lttng-relayd.*"
> -RELAYD_PIDS=""
>  LTTNG_BIN="lttng"
>  BABELTRACE_BIN="babeltrace"
>  OUTPUT_DEST=/dev/null
> @@ -48,11 +46,20 @@ export LTTNG_SESSIOND_PATH="/bin/true"
>  
>  source $TESTDIR/utils/tap/tap.sh
>  
> +if [ -z $LTTNG_TEST_TEARDOWN_TIMEOUT ]; then
> +	LTTNG_TEST_TEARDOWN_TIMEOUT=60
> +fi
> +
>  function full_cleanup ()
>  {
> -	if [ -n "${SESSIOND_PIDS}" ] || [ -n "${RELAYD_PIDS}" ]; then
> -		kill -9 ${SESSIOND_PIDS} ${RELAYD_PIDS} > /dev/null 2>&1
> -	fi
> +	# Try to kill daemons gracefully
> +	stop_lttng_relayd_notap SIGTERM $LTTNG_TEST_TEARDOWN_TIMEOUT
> +	stop_lttng_sessiond_notap SIGTERM $LTTNG_TEST_TEARDOWN_TIMEOUT
> +
> +	# If daemons are still present, forcibly kill them
> +	stop_lttng_relayd_notap SIGKILL $LTTNG_TEST_TEARDOWN_TIMEOUT
> +	stop_lttng_sessiond_notap SIGKILL $LTTNG_TEST_TEARDOWN_TIMEOUT
> +	stop_lttng_consumerd_notap SIGKILL $LTTNG_TEST_TEARDOWN_TIMEOUT
>  
>  	# Disable trap for SIGTERM since the following kill to the
>  	# pidgroup will be SIGTERM. Otherwise it loops.
> @@ -379,26 +386,25 @@ function start_lttng_relayd_opt()
>  	local withtap=$1
>  	local opt=$2
>  
> -	DIR=$(readlink -f $TESTDIR)
> +	DIR=$(readlink -f "$TESTDIR")
>  
> -	if [ -z $(pgrep $RELAYD_MATCH) ]; then
> -		$DIR/../src/bin/lttng-relayd/$RELAYD_BIN -b $opt 1> $OUTPUT_DEST 2> $ERROR_OUTPUT_DEST
> -		#$DIR/../src/bin/lttng-relayd/$RELAYD_BIN $opt -vvv >>/tmp/relayd.log 2>&1 &
> +	if [ -z "$(pgrep "$RELAYD_MATCH")" ]; then
> +		# shellcheck disable=SC2086
> +		"$DIR/../src/bin/lttng-relayd/$RELAYD_BIN" -b $opt 1> $OUTPUT_DEST 2> $ERROR_OUTPUT_DEST
> +		#"$DIR/../src/bin/lttng-relayd/$RELAYD_BIN" $opt -vvv >>/tmp/relayd.log 2>&1 &
>  		if [ $? -eq 1 ]; then
> -			if [ $withtap -eq "1" ]; then
> +			if [ "$withtap" -eq "1" ]; then
>  				fail "Start lttng-relayd (opt: $opt)"
>  			fi
>  			return 1
>  		else
> -			if [ $withtap -eq "1" ]; then
> +			if [ "$withtap" -eq "1" ]; then
>  				pass "Start lttng-relayd (opt: $opt)"
>  			fi
>  		fi
>  	else
>  		pass "Start lttng-relayd (opt: $opt)"
>  	fi
> -
> -	RELAYD_PIDS=$(pgrep $RELAYD_MATCH)
>  }
>  
>  function start_lttng_relayd()
> @@ -414,29 +420,60 @@ function start_lttng_relayd_notap()
>  function stop_lttng_relayd_opt()
>  {
>  	local withtap=$1
> +	local signal=$2
>  
> -	if [ $withtap -eq "1" ]; then
> -		diag "Killing lttng-relayd (pid: $RELAYD_PIDS)"
> +	if [ -z "$signal" ]; then
> +		signal="SIGTERM"
>  	fi
> -	kill $RELAYD_PIDS 1> $OUTPUT_DEST 2> $ERROR_OUTPUT_DEST
> -	retval=$?
>  
> -	if [ $? -eq 1 ]; then
> -		if [ $withtap -eq "1" ]; then
> +	local timeout_s=$3
> +	local dtimeleft_s=
> +
> +	# Multiply time by 2 to simplify integer arithmetic
> +	if [ -n "$timeout_s" ]; then
> +		dtimeleft_s=$((timeout_s * 2))
> +	fi
> +
> +	local retval=0
> +	local pids=
> +
> +	pids=$(pgrep "$RELAYD_MATCH")
> +	if [ -z "$pids" ]; then
> +		if [ "$withtap" -eq "1" ]; then
> +			pass "No relay daemon to kill"
> +		fi
> +		return 0
> +	fi
> +
> +	diag "Killing (signal $signal) lttng-relayd (pid: $pids)"
> +
> +	# shellcheck disable=SC2086
> +	if ! kill -s $signal $pids 1> $OUTPUT_DEST 2> $ERROR_OUTPUT_DEST; then
> +		retval=1
> +		if [ "$withtap" -eq "1" ]; then
>  			fail "Kill relay daemon"
>  		fi
> -		return 1
>  	else
>  		out=1
>  		while [ -n "$out" ]; do
> -			out=$(pgrep $RELAYD_MATCH)
> +			out=$(pgrep "$RELAYD_MATCH")
> +			if [ -n "$dtimeleft_s" ]; then
> +				if [ $dtimeleft_s -lt 0 ]; then
> +					out=
> +					retval=1
> +				fi
> +				dtimeleft_s=$((dtimeleft_s - 1))
> +			fi
>  			sleep 0.5
>  		done
> -		if [ $withtap -eq "1" ]; then
> -			pass "Kill relay daemon"
> +		if [ "$withtap" -eq "1" ]; then
> +			if [ "$retval" -eq "0" ]; then
> +				pass "Wait after kill relay daemon"
> +			else
> +				fail "Wait after kill relay daemon"
> +			fi
>  		fi
>  	fi
> -	RELAYD_PIDS=""
>  	return $retval
>  }
>  
> @@ -459,14 +496,16 @@ function start_lttng_sessiond_opt()
>  
>  	local env_vars=""
>  	local consumerd=""
> -	local long_bit_value=$(getconf LONG_BIT)
>  
> -	if [ -n $TEST_NO_SESSIOND ] && [ "$TEST_NO_SESSIOND" == "1" ]; then
> +	local long_bit_value=
> +	long_bit_value=$(getconf LONG_BIT)
> +
> +	if [ -n "$TEST_NO_SESSIOND" ] && [ "$TEST_NO_SESSIOND" == "1" ]; then
>  		# Env variable requested no session daemon
>  		return
>  	fi
>  
> -	DIR=$(readlink -f $TESTDIR)
> +	DIR=$(readlink -f "$TESTDIR")
>  
>  	# Get long_bit value for 32/64 consumerd
>  	case "$long_bit_value" in
> @@ -483,32 +522,33 @@ function start_lttng_sessiond_opt()
>  
>  	# Check for env. variable. Allow the use of LD_PRELOAD etc.
>  	if [[ "x${LTTNG_SESSIOND_ENV_VARS}" != "x" ]]; then
> -		env_vars=${LTTNG_SESSIOND_ENV_VARS}
> +		env_vars="${LTTNG_SESSIOND_ENV_VARS} "
>  	fi
> +	env_vars="${env_vars}$DIR/../src/bin/lttng-sessiond/$SESSIOND_BIN"
>  
> -	validate_kernel_version
> -	if [ $? -ne 0 ]; then
> +	if ! validate_kernel_version; then
>  	    fail "Start session daemon"
>  	    BAIL_OUT "*** Kernel too old for session daemon tests ***"
>  	fi
>  
> -	: ${LTTNG_SESSION_CONFIG_XSD_PATH=${DIR}/../src/common/config/}
> +	: "${LTTNG_SESSION_CONFIG_XSD_PATH="${DIR}/../src/common/config/"}"
>  	export LTTNG_SESSION_CONFIG_XSD_PATH
>  
> -	if [ -z $(pgrep ${SESSIOND_MATCH}) ]; then
> +	if [ -z "$(pgrep "${SESSIOND_MATCH}")" ]; then
>  		# Have a load path ?
>  		if [ -n "$load_path" ]; then
> -			env $env_vars $DIR/../src/bin/lttng-sessiond/$SESSIOND_BIN --load "$load_path" --background $consumerd
> +			# shellcheck disable=SC2086
> +			env $env_vars --load "$load_path" --background "$consumerd"
>  		else
> -			env $env_vars $DIR/../src/bin/lttng-sessiond/$SESSIOND_BIN --background $consumerd
> +			# shellcheck disable=SC2086
> +			env $env_vars --background "$consumerd"
>  		fi
>  		#$DIR/../src/bin/lttng-sessiond/$SESSIOND_BIN --background --consumerd32-path="$DIR/../src/bin/lttng-consumerd/lttng-consumerd" --consumerd64-path="$DIR/../src/bin/lttng-consumerd/lttng-consumerd" --verbose-consumer >>/tmp/sessiond.log 2>&1
>  		status=$?
> -		if [ $withtap -eq "1" ]; then
> +		if [ "$withtap" -eq "1" ]; then
>  			ok $status "Start session daemon"
>  		fi
>  	fi
> -	SESSIOND_PIDS=$(pgrep $SESSIOND_MATCH)
>  }
>  
>  function start_lttng_sessiond()
> @@ -525,44 +565,98 @@ function stop_lttng_sessiond_opt()
>  {
>  	local withtap=$1
>  	local signal=$2
> -	local kill_opt=""
>  
> -	if [ -n $TEST_NO_SESSIOND ] && [ "$TEST_NO_SESSIOND" == "1" ]; then
> +	if [ -z "$signal" ]; then
> +		signal=SIGTERM
> +	fi
> +
> +	local timeout_s=$3
> +	local dtimeleft_s=
> +
> +	# Multiply time by 2 to simplify integer arithmetic
> +	if [ -n "$timeout_s" ]; then
> +		dtimeleft_s=$((timeout_s * 2))
> +	fi
> +
> +	if [ -n "$TEST_NO_SESSIOND" ] && [ "$TEST_NO_SESSIOND" == "1" ]; then
>  		# Env variable requested no session daemon
> -		return
> +		return 0
>  	fi
>  
> -	local pids="${SESSIOND_PIDS} $(pgrep $RUNAS_MATCH)"
> +	local retval=0
>  
> -	if [ -n "$2" ]; then
> -		kill_opt="$kill_opt -s $signal"
> +	local runas_pids=
> +	runas_pids=$(pgrep "$RUNAS_MATCH")
> +
> +	local pids=
> +	pids=$(pgrep "$SESSIOND_MATCH")
> +
> +	if [ -n "$runas_pids" ]; then
> +		pids="$pids $runas_pids"
>  	fi
> -	if [ $withtap -eq "1" ]; then
> -		diag "Killing $SESSIOND_BIN and lt-$SESSIOND_BIN pids: $(echo $pids | tr '\n' ' ')"
> +
> +	if [ -z "$pids" ]; then
> +		if [ "$withtap" -eq "1" ]; then
> +			pass "No session daemon to kill"
> +		fi
> +		return 0
>  	fi
> -	kill $kill_opt $pids 1> $OUTPUT_DEST 2> $ERROR_OUTPUT_DEST
>  
> -	if [ $? -eq 1 ]; then
> -		if [ $withtap -eq "1" ]; then
> +	diag "Killing (signal $signal) $SESSIOND_BIN and lt-$SESSIOND_BIN pids: $(echo "$pids" | tr '\n' ' ')"
> +
> +	# shellcheck disable=SC2086
> +	if ! kill -s $signal $pids 1> $OUTPUT_DEST 2> $ERROR_OUTPUT_DEST; then
> +		retval=1
> +		if [ "$withtap" -eq "1" ]; then
>  			fail "Kill sessions daemon"
>  		fi
>  	else
>  		out=1
>  		while [ -n "$out" ]; do
> -			out=$(pgrep ${SESSIOND_MATCH})
> +			out=$(pgrep "${SESSIOND_MATCH}")
> +			if [ -n "$dtimeleft_s" ]; then
> +				if [ $dtimeleft_s -lt 0 ]; then
> +					out=
> +					retval=1
> +				fi
> +				dtimeleft_s=$((dtimeleft_s - 1))
> +			fi
>  			sleep 0.5
>  		done
>  		out=1
>  		while [ -n "$out" ]; do
> -			out=$(pgrep $CONSUMERD_MATCH)
> +			out=$(pgrep "$CONSUMERD_MATCH")
> +			if [ -n "$dtimeleft_s" ]; then
> +				if [ $dtimeleft_s -lt 0 ]; then
> +					out=
> +					retval=1
> +				fi
> +				dtimeleft_s=$((dtimeleft_s - 1))
> +			fi
>  			sleep 0.5
>  		done
>  
> -		SESSIOND_PIDS=""
> -		if [ $withtap -eq "1" ]; then
> -			pass "Kill session daemon"
> +		if [ "$withtap" -eq "1" ]; then
> +			if [ "$retval" -eq "0" ]; then
> +				pass "Wait after kill session daemon"
> +			else
> +				fail "Wait after kill session daemon"
> +			fi
>  		fi
>  	fi
> +	if [ "$signal" = "SIGKILL" ]; then
> +		if [ "$(id -u)" -eq "0" ]; then
> +			local modules=
> +			modules="$(lsmod | grep ^lttng | awk '{print $1}')"
> +
> +			if [ -n "$modules" ]; then
> +				diag "Unloading all LTTng modules"
> +				modprobe -r $modules
> +			fi
> +		fi
> +	fi
> +
> +	return $retval
>  }
>  
>  function stop_lttng_sessiond()
> @@ -579,43 +673,40 @@ function sigstop_lttng_sessiond_opt()
>  {
>  	local withtap=$1
>  	local signal=SIGSTOP
> -	local kill_opt=""
>  
> -	if [ -n $TEST_NO_SESSIOND ] && [ "$TEST_NO_SESSIOND" == "1" ]; then
> +	if [ -n "$TEST_NO_SESSIOND" ] && [ "$TEST_NO_SESSIOND" == "1" ]; then
>  		# Env variable requested no session daemon
>  		return
>  	fi
>  
> -	PID_SESSIOND="$(pgrep ${SESSIOND_MATCH}) $(pgrep $RUNAS_MATCH)"
> -
> -	kill_opt="$kill_opt -s $signal"
> +	PID_SESSIOND="$(pgrep "${SESSIOND_MATCH}") $(pgrep "$RUNAS_MATCH")"
>  
> -	if [ $withtap -eq "1" ]; then
> -		diag "Sending SIGSTOP to lt-$SESSIOND_BIN and $SESSIOND_BIN pids: $(echo $PID_SESSIOND | tr '\n' ' ')"
> +	if [ "$withtap" -eq "1" ]; then
> +		diag "Sending SIGSTOP to lt-$SESSIOND_BIN and $SESSIOND_BIN pids: $(echo "$PID_SESSIOND" | tr '\n' ' ')"
>  	fi
> -	kill $kill_opt $PID_SESSIOND 1> $OUTPUT_DEST 2> $ERROR_OUTPUT_DEST
>  
> -	if [ $? -eq 1 ]; then
> -		if [ $withtap -eq "1" ]; then
> +	# shellcheck disable=SC2086
> +	if ! kill -s $signal $PID_SESSIOND 1> $OUTPUT_DEST 2> $ERROR_OUTPUT_DEST; then
> +		if [ "$withtap" -eq "1" ]; then
>  			fail "Sending SIGSTOP to session daemon"
>  		fi
>  	else
>  		out=1
>  		while [ $out -ne 0 ]; do
> -			pid=$(pgrep $SESSIOND_MATCH)
> +			pid="$(pgrep "$SESSIOND_MATCH")"
>  
>  			# Wait until state becomes stopped for session
>  			# daemon(s).
>  			out=0
>  			for sessiond_pid in $pid; do
> -				state=$(ps -p $sessiond_pid -o state= )
> +				state="$(ps -p "$sessiond_pid" -o state= )"
>  				if [[ -n "$state" && "$state" != "T" ]]; then
>  					out=1
>  				fi
>  			done
>  			sleep 0.5
>  		done
> -		if [ $withtap -eq "1" ]; then
> +		if [ "$withtap" -eq "1" ]; then
>  			pass "Sending SIGSTOP to session daemon"
>  		fi
>  	fi
> @@ -635,46 +726,70 @@ function stop_lttng_consumerd_opt()
>  {
>  	local withtap=$1
>  	local signal=$2
> -	local kill_opt=""
>  
> -	PID_CONSUMERD=$(pgrep $CONSUMERD_MATCH)
> +	if [ -z "$signal" ]; then
> +		signal=SIGTERM
> +	fi
> +
> +	local timeout_s=$3
> +	local dtimeleft_s=
>  
> -	if [ -n "$2" ]; then
> -		kill_opt="$kill_opt -s $signal"
> +	# Multiply time by 2 to simplify integer arithmetic
> +	if [ -n "$timeout_s" ]; then
> +		dtimeleft_s=$((timeout_s * 2))
>  	fi
>  
> -	if [ $withtap -eq "1" ]; then
> -		diag "Killing $CONSUMERD_BIN pids: $(echo $PID_CONSUMERD | tr '\n' ' ')"
> +	local retval=0
> +
> +	PID_CONSUMERD="$(pgrep "$CONSUMERD_MATCH")"
> +
> +	if [ -z "$PID_CONSUMERD" ]; then
> +		if [ "$withtap" -eq "1" ]; then
> +			pass "No consumer daemon to kill"
> +		fi
> +		return 0
>  	fi
>  
> -	kill $kill_opt $PID_CONSUMERD 1> $OUTPUT_DEST 2> $ERROR_OUTPUT_DEST
> -	retval=$?
> +	diag "Killing (signal $signal) $CONSUMERD_BIN pids: $(echo "$PID_CONSUMERD" | tr '\n' ' ')"
>  
> -	if [ $? -eq 1 ]; then
> -		if [ $withtap -eq "1" ]; then
> +	# shellcheck disable=SC2086
> +	if ! kill -s $signal $PID_CONSUMERD 1> $OUTPUT_DEST 2> $ERROR_OUTPUT_DEST; then
> +		retval=1
> +		if [ "$withtap" -eq "1" ]; then
>  			fail "Kill consumer daemon"
>  		fi
> -		return 1
>  	else
>  		out=1
>  		while [ $out -ne 0 ]; do
> -			pid=$(pgrep $CONSUMERD_MATCH)
> +			pid="$(pgrep "$CONSUMERD_MATCH")"
>  
>  			# If consumerds are still present check their status.
>  			# A zombie status qualifies the consumerd as *killed*
>  			out=0
>  			for consumer_pid in $pid; do
> -				state=$(ps -p $consumer_pid -o state= )
> +				state="$(ps -p "$consumer_pid" -o state= )"
>  				if [[ -n "$state" && "$state" != "Z" ]]; then
>  					out=1
>  				fi
>  			done
> +			if [ -n "$dtimeleft_s" ]; then
> +				if [ $dtimeleft_s -lt 0 ]; then
> +					out=0
> +					retval=1
> +				fi
> +				dtimeleft_s=$((dtimeleft_s - 1))
> +			fi
>  			sleep 0.5
>  		done
> -		if [ $withtap -eq "1" ]; then
> -			pass "Kill consumer daemon"
> +		if [ "$withtap" -eq "1" ]; then
> +			if [ "$retval" -eq "0" ]; then
> +				pass "Wait after kill consumer daemon"
> +			else
> +				fail "Wait after kill consumer daemon"
> +			fi
>  		fi
>  	fi
> +
>  	return $retval
>  }
>  
> @@ -692,40 +807,37 @@ function sigstop_lttng_consumerd_opt()
>  {
>  	local withtap=$1
>  	local signal=SIGSTOP
> -	local kill_opt=""
>  
> -	PID_CONSUMERD=$(pgrep $CONSUMERD_MATCH)
> +	PID_CONSUMERD="$(pgrep "$CONSUMERD_MATCH")"
>  
> -	kill_opt="$kill_opt -s $signal"
> +	diag "Sending SIGSTOP to $CONSUMERD_BIN pids: $(echo "$PID_CONSUMERD" | tr '\n' ' ')"
>  
> -	if [ $withtap -eq "1" ]; then
> -		diag "Sending SIGSTOP to $CONSUMERD_BIN pids: $(echo $PID_CONSUMERD | tr '\n' ' ')"
> -	fi
> -	kill $kill_opt $PID_CONSUMERD 1> $OUTPUT_DEST 2> $ERROR_OUTPUT_DEST
> +	# shellcheck disable=SC2086
> +	kill -s $signal $PID_CONSUMERD 1> $OUTPUT_DEST 2> $ERROR_OUTPUT_DEST
>  	retval=$?
>  
> -	if [ $? -eq 1 ]; then
> -		if [ $withtap -eq "1" ]; then
> +	if [ $retval -eq 1 ]; then
> +		if [ "$withtap" -eq "1" ]; then
>  			fail "Sending SIGSTOP to consumer daemon"
>  		fi
>  		return 1
>  	else
>  		out=1
>  		while [ $out -ne 0 ]; do
> -			pid=$(pgrep $CONSUMERD_MATCH)
> +			pid="$(pgrep "$CONSUMERD_MATCH")"
>  
>  			# Wait until state becomes stopped for all
>  			# consumers.
>  			out=0
>  			for consumer_pid in $pid; do
> -				state=$(ps -p $consumer_pid -o state= )
> +				state="$(ps -p "$consumer_pid" -o state= )"
>  				if [[ -n "$state" && "$state" != "T" ]]; then
>  					out=1
>  				fi
>  			done
>  			sleep 0.5
>  		done
> -		if [ $withtap -eq "1" ]; then
> +		if [ "$withtap" -eq "1" ]; then
>  			pass "Sending SIGSTOP to consumer daemon"
>  		fi
>  	fi
> -- 
> 2.17.1
> 
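
For reference, the bounded wait that this patch adds to stop_lttng_consumerd_opt can be sketched on its own. This is a minimal sketch under stated assumptions, not the code from utils.sh: the helper name wait_until and its calling convention are hypothetical. It keeps the patch's approach of doubling the timeout so that each 0.5 s poll costs one integer tick. It gives up once the tick count reaches zero, whereas the patch tests for a value below zero and so polls one more time.

```shell
#!/bin/bash

# Hypothetical helper (not part of utils.sh): run a predicate command
# every 0.5 s until it succeeds. If timeout_s is empty, wait forever;
# otherwise give up after roughly timeout_s seconds.
wait_until()
{
	local timeout_s=$1
	shift
	local dtimeleft_s=

	# Multiply time by 2 so each half-second poll is one integer tick.
	if [ -n "$timeout_s" ]; then
		dtimeleft_s=$((timeout_s * 2))
	fi

	until "$@"; do
		if [ -n "$dtimeleft_s" ]; then
			if [ "$dtimeleft_s" -le 0 ]; then
				# Timed out: the predicate never succeeded.
				return 1
			fi
			dtimeleft_s=$((dtimeleft_s - 1))
		fi
		sleep 0.5
	done

	return 0
}
```

A caller would pass the timeout followed by the predicate and its arguments, for example "wait_until 5 some_check arg", where some_check is any command or function returning 0 once the awaited state is reached.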


Thread overview: 6+ messages
     [not found] <20190516190801.16878-1-mathieu.desnoyers@efficios.com>
2019-05-16 19:07 ` [PATCH lttng-tools 2/6] Fix: tests: error handling in high throughput limits test (v2) Mathieu Desnoyers
2019-05-16 19:07 ` [PATCH lttng-tools 3/6] Fix: utils.sh: handle SIGPIPE Mathieu Desnoyers
2019-05-16 19:07 ` [PATCH lttng-tools 4/6] Fix: test: utils.sh: exit from process on full_cleanup Mathieu Desnoyers
2019-05-16 19:08 ` [PATCH lttng-tools 5/6] Cleanup: test: don't stop relayd twice Mathieu Desnoyers
2019-05-16 19:08 ` [PATCH lttng-tools 6/6] tests: invoke full_cleanup from script trap handlers, use modprobe -r Mathieu Desnoyers
2019-09-05 20:54 ` [PATCH lttng-tools 1/6] Improve handling of test SIGTERM/SIGINT (v2) Jérémie Galarneau
