Subject: improving logging of hanging tests in CI
From: Peter Maydell
Date: 2021-08-04 14:50 UTC
To: QEMU Developers

Some of our tests in 'make check' hang intermittently. At the moment
we have a path to diagnosing these when they happen via my ad-hoc CI
scripts: I can log into the relevant machine and manually look at
what has hung, attach gdb, etc. However, for the gitlab CI this
doesn't work.

Is it possible to stick something into the 'make check' framework
that does something like this:
 * for each test run, if it hasn't exited after 5 minutes,
   assume it has hung
 * in that case, print "ERROR: test $WHATEVER hung" to stdout
 * run something like the below script to capture backtraces
   (which is just something I threw together this afternoon and
   could probably be improved)
 * kill the offending subtree of processes
 * make sure 'make' exits with an error
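
The steps above could be sketched roughly as below. This is only an
illustration, not tied to the real 'make check' machinery: the names
run_with_hang_check, kill_tree, HANG_TIMEOUT and BACKTRACE_SCRIPT are
all made up for the example, and the polling loop is just the simplest
thing that works in plain bash.

```shell
#!/bin/bash
# Hypothetical hang watchdog for a single test command.
# HANG_TIMEOUT and BACKTRACE_SCRIPT are illustrative names (assumptions).

HANG_TIMEOUT="${HANG_TIMEOUT:-300}"   # seconds; 5 minutes as proposed
BACKTRACE_SCRIPT="${BACKTRACE_SCRIPT:-./backtrace-process-tree}"

# Kill a process and all its descendants. The child list is collected
# before the parent is killed so the children can still be found.
kill_tree() {
    local pid="$1" child children
    children=$(pgrep -P "$pid")
    kill -KILL "$pid" 2>/dev/null || true
    for child in $children; do
        kill_tree "$child"
    done
}

# Usage: run_with_hang_check NAME COMMAND [ARGS...]
# Returns the command's exit status, or 124 if it was declared hung.
run_with_hang_check() {
    local name="$1"
    shift
    "$@" &
    local pid=$! waited=0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$HANG_TIMEOUT" ]; then
            echo "ERROR: test $name hung"
            # Capture backtraces if the helper script is available.
            if [ -x "$BACKTRACE_SCRIPT" ]; then
                "$BACKTRACE_SCRIPT" "$pid" || true
            fi
            kill_tree "$pid"
            wait "$pid" 2>/dev/null || true
            return 124
        fi
        sleep 1
        waited=$((waited + 1))
    done
    wait "$pid"
}
```

If a make recipe invoked something like
"run_with_hang_check mytest ./mytest", the nonzero return would make
'make' fail, which covers the last bullet. Note that kill_tree can
miss children spawned while it is running.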

We'd need to make sure that the CI environment has 'gdb' installed
and that the CI machine config lets gdb attach to processes by PID.
We can arrange the latter for our own runners even if the stock
gitlab setup forbids it.
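
A runner could report these prerequisites up front with something like
the sketch below. It assumes a Linux host where attach permission is
governed by the Yama ptrace_scope sysctl; the function names are
invented for illustration, and container runtimes may impose further
restrictions that this does not detect.

```shell
#!/bin/bash
# Hypothetical preflight report for "can gdb attach by PID here?".

# Map a Yama ptrace_scope value to a short description.
describe_ptrace_scope() {
    case "$1" in
        0) echo "classic: same-uid processes may attach" ;;
        1) echo "restricted: only ancestors may attach without privilege" ;;
        2) echo "admin-only: attaching needs CAP_SYS_PTRACE" ;;
        3) echo "disabled: no attaching at all" ;;
        *) echo "unknown" ;;
    esac
}

# Print whether gdb is installed and what ptrace_scope is set to.
# The optional argument overrides the sysctl path (useful for testing).
report_gdb_attach_prereqs() {
    local scope_file="${1:-/proc/sys/kernel/yama/ptrace_scope}" scope
    if command -v gdb >/dev/null 2>&1; then
        echo "gdb: found"
    else
        echo "gdb: not installed"
    fi
    if [ -r "$scope_file" ]; then
        scope=$(cat "$scope_file")
        echo "ptrace_scope=$scope ($(describe_ptrace_scope "$scope"))"
    else
        echo "ptrace_scope: not available (Yama not enabled?)"
    fi
}
```

With ptrace_scope set to 1, a gdb started by a separate script is not
an ancestor of the hung test, so it would need to run with extra
privilege (for example as root) to attach.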

The idea is to at least get a backtrace of a hung test into the
logs, so we have some idea of what happened.

===backtrace-process-tree===
#!/bin/bash -e
# backtrace-process-tree: print a thread backtrace of specified
# process and all its descendants.
# Copyright 2021 Linaro
# License GPL-v2-or-later

if [ $# != 1 ]; then
    echo "Usage: backtrace-process-tree PID"
    exit 1
fi

TOPPID="$1"

if [ ! -e "/proc/$TOPPID" ]; then
    echo "$TOPPID not a PID of a running process?"
    exit 1
fi

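# Print a backtrace of every thread in process $1, then recurse into
# each of its children.
bt_me_and_children() {
  # 'local' keeps each recursion level's PID from being overwritten
  local me="$1" child
  echo "==========================================================="
  echo "PROCESS: $me"
  ps -ww -f -p "$me" | tail -1
  gdb --nx --batch -ex 'thread apply all bt' /proc/"$me"/exe "$me"
  echo

  for child in $(pgrep -P "$me"); do
      bt_me_and_children "$child"
  done
}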

echo "Process tree:"
pstree -pT "$TOPPID"

bt_me_and_children "$TOPPID"
===endit===

-- PMM

