* Re: [OE-core] [PATCH 00/14] oeqa runtime tests when qemu hangs
       [not found] <173C0B9603DBAB46.24231@lists.openembedded.org>
@ 2023-01-25  7:54 ` Mikko Rapeli
  2023-01-25  8:03   ` Alexander Kanavin
  0 siblings, 1 reply; 4+ messages in thread
From: Mikko Rapeli @ 2023-01-25  7:54 UTC (permalink / raw)
  To: openembedded-core

Hi,

On Fri, Jan 20, 2023 at 04:44:36PM +0200, Mikko Rapeli via lists.openembedded.org wrote:
> I get a qemu hang on kirkstone, swtpm and optee. One of the
> optee-test/xtest tests hangs the qemu machine in some kind of deadlock.
> While this needs to be debugged and tested, the oeqa runtime tests
> also hung and never returned. Thus this patch set. With these changes
> the qemu deadlock is detected and the do_testimage() task eventually exits
> with the correct tests failing and the hanging qemu system killed.
> There are a lot of debug prints added by this patch set, but I don't know of
> any other way to debug complex python code. strace output from the hang
> doesn't tell where the deadlock happened.

On #yocto Richard said he doesn't like the large number of debug prints
here. If there are specific ones I should drop, please let me
know. I think the logs in do_testimage() are quite readable with
these enabled. I can follow the logs and see target debug output in
larger, multi-line chunks, and I can see when an ssh command on the
target has been waiting for output for a long time.

I have a complex boot sequence which includes firmware, kernel,
initramfs, rootfs encryption etc. before reaching the login prompt, so
collecting all logs from the boot is critical. The boot also takes a
long time, so seeing frequent output in the do_testimage() logs is
important too. Reading the output data in chunks really helps.
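
To illustrate what reading and logging output in chunks can look like, here is a minimal sketch. It is not the code from the patch series; the function names `split_complete_lines` and `log_in_chunks` are hypothetical and only show the idea of emitting one multi-line log record per chunk while holding back a partial last line.

```python
def split_complete_lines(buffer: bytes):
    """Return (complete, remainder): everything up to and including the
    last newline, and the trailing partial line (possibly empty)."""
    cut = buffer.rfind(b"\n") + 1  # 0 when there is no newline at all
    return buffer[:cut], buffer[cut:]


def log_in_chunks(chunks, emit):
    """Call emit() once per received chunk with all complete lines in it,
    holding back any partial last line until it is completed."""
    pending = b""
    for chunk in chunks:
        complete, pending = split_complete_lines(pending + chunk)
        if complete:
            emit(complete.decode("utf-8", errors="replace"))
    if pending:
        # Flush whatever is left when the stream ends without a newline.
        emit(pending.decode("utf-8", errors="replace"))
```

Passing something like `logger.debug` as `emit` would give one log record per chunk instead of one per line, which is roughly the effect described above.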

Cheers,

-Mikko



* Re: [OE-core] [PATCH 00/14] oeqa runtime tests when qemu hangs
  2023-01-25  7:54 ` [OE-core] [PATCH 00/14] oeqa runtime tests when qemu hangs Mikko Rapeli
@ 2023-01-25  8:03   ` Alexander Kanavin
  2023-01-25 12:32     ` Mikko Rapeli
  0 siblings, 1 reply; 4+ messages in thread
From: Alexander Kanavin @ 2023-01-25  8:03 UTC (permalink / raw)
  To: Mikko Rapeli; +Cc: openembedded-core


Perhaps you could show a sample output before and after all the changes?

Alex



* Re: [OE-core] [PATCH 00/14] oeqa runtime tests when qemu hangs
  2023-01-25  8:03   ` Alexander Kanavin
@ 2023-01-25 12:32     ` Mikko Rapeli
  2023-01-25 21:10       ` Alexander Kanavin
  0 siblings, 1 reply; 4+ messages in thread
From: Mikko Rapeli @ 2023-01-25 12:32 UTC (permalink / raw)
  To: Alexander Kanavin; +Cc: openembedded-core

On Wed, Jan 25, 2023 at 09:03:03AM +0100, Alexander Kanavin wrote:
> Perhaps you could show a sample output before and after all the changes?

With kirkstone and build config:

MACHINE ??= "qemuarm64"
PACKAGE_CLASSES = "package_ipk"
INHERIT += "rm_work"
INHERIT += "buildhistory"
BUILDHISTORY_COMMIT = "1"
DISTRO_FEATURES:append = " systemd"
VIRTUAL-RUNTIME_init_manager = "systemd"
DISTRO_FEATURES_BACKFILL_CONSIDERED = "sysvinit"
VIRTUAL-RUNTIME_initscripts = ""
IMAGE_CLASSES += "testimage"
IMAGE_FEATURES += "ssh-server-openssh package-management"
TEST_RUNQEMUPARAMS = "slirp nographic novga"
TEST_SUITES = "\
    ping \
    ssh \
    date \
    df \
    parselogs \
    ptest \
"
QEMU_USE_SLIRP = "1"
# only localhost to access via ssh
QB_SLIRP_OPT = "-netdev user,id=net0,hostfwd=tcp:127.0.0.1:2222-:22"
TEST_SERVER_IP = "127.0.0.1"

With this I compiled core-image-minimal and ran do_testimage:

 * before this patch series: https://pastebin.com/raw/rzhgRGix

 * with this patch series: https://pastebin.com/raw/3R5mUutS

The major difference is that with the series the full boot log is in the
do_testimage task debug output. It's not in the bitbake output though;
that remains unchanged. Now if there are any hangs, timeouts or other
problems, the do_testimage logs with the patch series will show exactly
where the failure happened. Without this patch series, a hang in
qemu goes undetected and the whole test execution hangs, in multiple
locations: first in the select()/read() loop that collects ssh command
output, for every command executed on the target, and then in the QMP
debug output for every ssh command which fails with return value 255.
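
As a rough illustration of how a select()/read() loop can avoid blocking forever on a hung peer, here is a minimal sketch. It is not the oeqa implementation; `read_until_idle` and its parameters are hypothetical names, and the only assumption is a readable file descriptor such as a pipe from an ssh subprocess.

```python
import os
import select


def read_until_idle(fd, idle_timeout=5.0, chunk_size=4096):
    """Read from fd until EOF, or until no data arrives for idle_timeout
    seconds. Returns (data, timed_out)."""
    data = b""
    while True:
        # Wait at most idle_timeout seconds for fd to become readable.
        ready, _, _ = select.select([fd], [], [], idle_timeout)
        if not ready:
            # Nothing arrived within the timeout: treat the peer as hung.
            return data, True
        chunk = os.read(fd, chunk_size)
        if not chunk:
            # An empty read means EOF: the writer closed its end.
            return data, False
        data += chunk
```

A caller could treat `timed_out` as the signal to fail the test and kill qemu rather than waiting indefinitely; which timeout value is appropriate depends on the target and is left open here.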

Cheers,

-Mikko



* Re: [OE-core] [PATCH 00/14] oeqa runtime tests when qemu hangs
  2023-01-25 12:32     ` Mikko Rapeli
@ 2023-01-25 21:10       ` Alexander Kanavin
  0 siblings, 0 replies; 4+ messages in thread
From: Alexander Kanavin @ 2023-01-25 21:10 UTC (permalink / raw)
  To: Mikko Rapeli; +Cc: openembedded-core

On Wed, 25 Jan 2023 at 13:32, Mikko Rapeli <mikko.rapeli@linaro.org> wrote:

> The major difference is that with the series the full boot log is in the
> do_testimage task debug output. It's not in the bitbake output though;
> that remains unchanged. Now if there are any hangs, timeouts or other
> problems, the do_testimage logs with the patch series will show exactly
> where the failure happened. Without this patch series, a hang in
> qemu goes undetected and the whole test execution hangs, in multiple
> locations: first in the select()/read() loop that collects ssh command
> output, for every command executed on the target, and then in the QMP
> debug output for every ssh command which fails with return value 255.

Unfortunately I tend to agree with RP: if everything works normally,
and even if the tests themselves fail, this amount of output is
excessive, and is just noise. If you are experiencing hangs, then the
verbosity needs to be conditionally enabled, and off by default.
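
One possible shape for "conditionally enabled, and off by default", sketched with Python's standard logging module. The helper name `make_test_logger` and its `verbose` flag are hypothetical; how that flag would be set from the build configuration is left open here.

```python
import logging


def make_test_logger(name="testimage-sketch", verbose=False):
    """Return a logger that drops debug records unless verbose is set."""
    logger = logging.getLogger(name)
    # Quiet by default: only INFO and above get through.
    logger.setLevel(logging.DEBUG if verbose else logging.INFO)
    return logger
```

With this, calls such as `logger.debug(boot_chunk)` can stay in the code but only produce output when someone debugging a hang turns `verbose` on.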

Alex

