* failing smokey tests
From: Bradley Valdenebro Peter (DC-AE/ESW52) @ 2020-03-20 12:23 UTC
  To: xenomai

Hello Xenomai team,

We are currently running a Xenomai/Linux setup on a Zynq Z-7020 SoC (We run Linux on CPU0 and Xenomai on CPU1):
	$ uname -a
		Linux wom03 4.14.110-mainline #1 SMP PREEMPT Tue Mar 17 17:25:22 CET 2020 armv7l GNU/Linux
	$ cat /proc/xenomai/version
		3.1
	$ cat /proc/ipipe/version
		7

We are experiencing two problems with the smokey tests:

1. posix_cond (check POSIX condition variable services). This test fails in 8 out of 10 executions with the following message:
	
	$ sudo /usr/bin/smokey --run=14 --verbose=2
		autoinit_simple_conddestroy
		autoinit_simple_condwait
		simple_condwait
		relative_condwait
		autoinit_absolute_condwait
		cond_wait waited 9998.799 us

Looking at the test autoinit_absolute_condwait, it appears that a timed condition wait should time out after 10000 us. The test measures the wait and checks it.
If the measured wait is shorter than 10000 us, the test fails. Must it always be equal to or higher than 10000 us? Is 9998.799 us not acceptable?
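
For illustration, the kind of measurement described above can be sketched in plain POSIX C. This is not the smokey source: the helper names `ts_to_ns` and `timed_wait_ns` are hypothetical, and the sketch assumes a condition variable bound to CLOCK_MONOTONIC that nobody ever signals, so the wait can only end by timing out.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000LL

/* Convert a timespec to nanoseconds (hypothetical helper). */
static int64_t ts_to_ns(const struct timespec *ts)
{
	return (int64_t)ts->tv_sec * NSEC_PER_SEC + ts->tv_nsec;
}

/*
 * Wait on a condition variable that is never signalled, using an
 * absolute deadline timeout_ns in the future. Returns the measured
 * wait in nanoseconds, or -1 if the wait did not end with ETIMEDOUT.
 * Return codes of the setup calls are ignored to keep the sketch short.
 */
static int64_t timed_wait_ns(int64_t timeout_ns)
{
	pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
	pthread_condattr_t attr;
	pthread_cond_t cond;
	struct timespec start, deadline, end;
	int64_t deadline_ns;
	int rc;

	/* Bind the condition variable to the monotonic clock. */
	pthread_condattr_init(&attr);
	pthread_condattr_setclock(&attr, CLOCK_MONOTONIC);
	pthread_cond_init(&cond, &attr);
	pthread_condattr_destroy(&attr);

	/* Absolute deadline = now + timeout. */
	clock_gettime(CLOCK_MONOTONIC, &start);
	deadline_ns = ts_to_ns(&start) + timeout_ns;
	deadline.tv_sec = deadline_ns / NSEC_PER_SEC;
	deadline.tv_nsec = deadline_ns % NSEC_PER_SEC;

	pthread_mutex_lock(&mutex);
	do {
		rc = pthread_cond_timedwait(&cond, &mutex, &deadline);
	} while (rc == 0); /* spurious wakeup: wait again */
	pthread_mutex_unlock(&mutex);
	clock_gettime(CLOCK_MONOTONIC, &end);

	pthread_cond_destroy(&cond);
	pthread_mutex_destroy(&mutex);

	return rc == ETIMEDOUT ? ts_to_ns(&end) - ts_to_ns(&start) : -1;
}
```

Calling `timed_wait_ns(10000000)` corresponds to the 10000 us wait discussed here; a check of the form "result >= 10000000" is the kind of strict comparison that a measured 9998.799 us would fail.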

2. memory_coreheap and memory_heapmem
First execution of any of these tests goes fine but later executions of any of them always result in a system freeze and a power cycle is necessary.
E.g.: If I execute memory_coreheap after a power cycle it will succeed. If I later execute memory_heapmem it will freeze.
        Or if I execute memory_heapmem after a power cycle it will succeed. If I later execute it again it will freeze.

It freezes just after printing "(running the pattern check test for 'heapmem' -- this may take some time)"

By freeze I mean the system is unresponsive. I cannot even ping the system.
If I had to guess it looks like something isn't freed or cleaned-up properly after the test.

Thanks for your help.
	
Best regards,
Peter Bradley


* Re: failing smokey tests
From: Jan Kiszka @ 2020-03-20 12:35 UTC
  To: Bradley Valdenebro Peter (DC-AE/ESW52), xenomai, Quirin Gylstorff

On 20.03.20 13:23, Bradley Valdenebro Peter (DC-AE/ESW52) via Xenomai wrote:
> Hello Xenomai team,
> 
> We are currently running a Xenomai/Linux setup on a Zynq Z-7020 SoC (We run Linux on CPU0 and Xenomai on CPU1):
> 	$ uname -a
> 		Linux wom03 4.14.110-mainline #1 SMP PREEMPT Tue Mar 17 17:25:22 CET 2020 armv7l GNU/Linux
> 	$ cat /proc/xenomai/version
> 		3.1
> 	$ cat /proc/ipipe/version
> 		7
> 
> We are experiencing two problems with the smokey tests:
> 
> 1. posix_cond (check POSIX condition variable services). This test fails in 8 out of 10 executions with the following message:
> 	
> 	$ sudo /usr/bin/smokey --run=14 --verbose=2
> 		autoinit_simple_conddestroy
> 		autoinit_simple_condwait
> 		simple_condwait
> 		relative_condwait
> 		autoinit_absolute_condwait
> 		cond_wait waited 9998.799 us
> 
> Looking at the test autoinit_absolute_condwait, it appears that a timed condition wait should time out after 10000 us. The test measures the wait and checks it.
> If the measured wait is shorter than 10000 us, the test fails. Must it always be equal to or higher than 10000 us? Is 9998.799 us not acceptable?
> 

Did you run autotune prior to smokey? Maybe latencies are not properly 
tuned so that we overshoot.

That said, some tolerance for such scenarios might be needed here.
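
As a rough illustration of what such a tolerance could look like (a minimal sketch, not the actual smokey check; the function name `wait_long_enough` and the 2 us tolerance used below are assumptions made for this example only):

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Accept a measured wait that falls short of the expected timeout by
 * at most tolerance_ns. With tolerance_ns == 0 this degenerates into
 * the strict "measured >= expected" comparison.
 */
static bool wait_long_enough(int64_t measured_ns, int64_t expected_ns,
			     int64_t tolerance_ns)
{
	return measured_ns >= expected_ns - tolerance_ns;
}
```

With the values reported above, `wait_long_enough(9998799, 10000000, 0)` is false, while `wait_long_enough(9998799, 10000000, 2000)` is true, since the wait fell short by 1.201 us.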

> 2. memory_coreheap and memory_heapmem
> First execution of any of these tests goes fine but later executions of any of them always result in a system freeze and a power cycle is necessary.
> E.g.: If I execute memory_coreheap after a power cycle it will succeed. If I later execute memory_heapmem it will freeze.
>          Or if I execute memory_heapmem after a power cycle it will succeed. If I later execute it again it will freeze.

Sounds as if uninitialized memory is biting here.

> 
> It freezes just after printing "(running the pattern check test for 'heapmem' -- this may take some time)"
> 
> By freeze I mean the system is unresponsive. I cannot even ping the system.
> If I had to guess it looks like something isn't freed or cleaned-up properly after the test.

Does your system have the Xenomai watchdog enabled? If we only lock up 
in an endless loop, that should kick in and make the case analyzable 
(e.g. with gdb).

We run those tests on ARMv7 as well (BeagleBone Black and QEMU), but - 
Quirin, correct me if I'm wrong - not multiple times in a row.

Jan

-- 
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux



* RE: failing smokey tests
From: Bradley Valdenebro Peter (DC-AE/ESW52) @ 2020-03-23 19:44 UTC
  To: Jan Kiszka, xenomai, Quirin Gylstorff

Hello,

Thanks for your reply.
Concerning the posix_cond issue: no, we didn't run autotune. We were under the impression that the system autotuned itself during startup. We will run autotune and try again. If we still have problems, we will let you know.
As for memory_coreheap and memory_heapmem, we don't have the Xenomai watchdog enabled. We will enable it and try to spot the point where it hangs using gdb, but this might take us some time.


Best regards,

Peter Bradley
