* [LTP] [PATCH 1/2] ltp: Add the ability to specify the latency constraint
@ 2017-08-10  8:01 Daniel Lezcano
  2017-08-10  8:01 ` [LTP] [PATCH 2/2] syscalls/pselect: Add a zero " Daniel Lezcano
  2017-08-11 11:25 ` [LTP] [PATCH 1/2] ltp: Add the ability to specify the " Jan Stancek
  0 siblings, 2 replies; 29+ messages in thread
From: Daniel Lezcano @ 2017-08-10  8:01 UTC (permalink / raw)
  To: ltp

The LTP test suite provides a set of tests, some of which check that an
operation completes within a specified amount of time.

Unfortunately, some platforms have slow power management routines, adding more
than 1.5ms to wake up from a deep idle state. This duration is far too long to
be acceptable when we are measuring a specified routine against a reasonably
tight timeout. For example, testcases/kernel/syscalls/pselect_01 is failing
for this reason.

This patch gives the testcase the opportunity to specify a latency constraint
when running. This option must be used together with needs_root in order to
have the necessary privileges.
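
For example (mirroring patch 2/2 in this series), a testcase would request
the constraint with:

	static struct tst_test test = {
		.needs_root = 1,
		.needs_latency = 1,
		.latency = 0,
		/* ... the rest of the test description ... */
	};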

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
---
 include/tst_test.h |  4 ++++
 lib/tst_test.c     | 18 ++++++++++++++++++
 2 files changed, 22 insertions(+)

diff --git a/include/tst_test.h b/include/tst_test.h
index e90312a..519fd4c 100644
--- a/include/tst_test.h
+++ b/include/tst_test.h
@@ -124,6 +124,7 @@ struct tst_test {
 	int needs_checkpoints:1;
 	int format_device:1;
 	int mount_device:1;
+	int needs_latency:1;
 
 	/* Minimal device size in megabytes */
 	unsigned int dev_min_size;
@@ -154,6 +155,9 @@ struct tst_test {
 
 	/* NULL terminated array of resource file names */
 	const char *const *resource_files;
+
+	/* Latency constraint to be set for the test */
+	int latency;
 };
 
 /*
diff --git a/lib/tst_test.c b/lib/tst_test.c
index 4c30eda..485515e 100644
--- a/lib/tst_test.c
+++ b/lib/tst_test.c
@@ -619,6 +619,21 @@ static void copy_resources(void)
 		TST_RESOURCE_COPY(NULL, tst_test->resource_files[i], NULL);
 }
 
+static int set_latency(void)
+{
+	int fd, ret;
+
+	fd = open("/dev/cpu_dma_latency", O_WRONLY);
+	if (fd < 0)
+		return fd;
+
+	ret = write(fd, &tst_test->latency, sizeof(tst_test->latency));
+	if (ret < 0)
+		return ret;
+
+	return 0;
+}
+
 static const char *get_tid(char *argv[])
 {
 	char *p;
@@ -736,6 +751,9 @@ static void do_setup(int argc, char *argv[])
 
 	if (tst_test->resource_files)
 		copy_resources();
+
+	if (tst_test->needs_latency && set_latency())
+		tst_brk(TCONF, "Failed to set cpu latency");
 }
 
 static void do_test_setup(void)
-- 
2.1.4



* [LTP] [PATCH 2/2] syscalls/pselect: Add a zero latency constraint
  2017-08-10  8:01 [LTP] [PATCH 1/2] ltp: Add the ability to specify the latency constraint Daniel Lezcano
@ 2017-08-10  8:01 ` Daniel Lezcano
  2017-08-10 11:50   ` Jiri Jaburek
  2017-08-11 11:25 ` [LTP] [PATCH 1/2] ltp: Add the ability to specify the " Jan Stancek
  1 sibling, 1 reply; 29+ messages in thread
From: Daniel Lezcano @ 2017-08-10  8:01 UTC (permalink / raw)
  To: ltp

The pselect_01 testcase works well on x86, as it is a fast platform with a
fast idle-state exit latency.

However, on ARM[64] the idle routine can take much longer, for example
1500us.

pselect fails on the ARM[64] platforms because of these slow exit latencies:
the delay between the expected expiration and the observed one is not
acceptable.

The fix could be to increase the allowed deviation on ARM64, but that wouldn't
make sense as some platforms, for the same architecture, can have faster or
different delays, hence we could potentially miss a bug.

The simplest solution is to set the cpu_dma latency constraint to zero, so the
idle driver will always choose the fastest idle state, thus fixing the issue
above. The latency constraint will apply only for this test.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
---
 testcases/kernel/syscalls/pselect/pselect01.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/testcases/kernel/syscalls/pselect/pselect01.c b/testcases/kernel/syscalls/pselect/pselect01.c
index a2b5339..42027b6 100644
--- a/testcases/kernel/syscalls/pselect/pselect01.c
+++ b/testcases/kernel/syscalls/pselect/pselect01.c
@@ -44,6 +44,9 @@ int sample_fn(int clk_id, long long usec)
 }
 
 static struct tst_test test = {
+	.needs_root = 1,
+	.needs_latency = 1,
+	.latency = 0,
 	.tid = "pselect()",
 	.sample = sample_fn,
 };
-- 
2.1.4



* [LTP] [PATCH 2/2] syscalls/pselect: Add a zero latency constraint
  2017-08-10  8:01 ` [LTP] [PATCH 2/2] syscalls/pselect: Add a zero " Daniel Lezcano
@ 2017-08-10 11:50   ` Jiri Jaburek
  2017-08-10 12:00     ` Daniel Lezcano
  0 siblings, 1 reply; 29+ messages in thread
From: Jiri Jaburek @ 2017-08-10 11:50 UTC (permalink / raw)
  To: ltp

On 08/10/17 10:01, Daniel Lezcano wrote:
> The pselect_01 testcase works well on x86, as it is a fast platform with a
> fast idle-state exit latency.
>
> However, on ARM[64] the idle routine can take much longer, for example
> 1500us.
>
> pselect fails on the ARM[64] platforms because of these slow exit latencies:
> the delay between the expected expiration and the observed one is not
> acceptable.
>
> The fix could be to increase the allowed deviation on ARM64, but that
> wouldn't make sense as some platforms, for the same architecture, can have
> faster or different delays, hence we could potentially miss a bug.
>
> The simplest solution is to set the cpu_dma latency constraint to zero, so
> the idle driver will always choose the fastest idle state, thus fixing the
> issue above. The latency constraint will apply only for this test.

I think a more generic LTP-wide solution could be made; there are more
tests that fail, e.g. in virtualized environments, because of CPU over-
provisioning on the host (x86, s390, ppc, etc.), and I can imagine
networking tests having similar needs as well (probably currently worked
around using big timeouts).

Basically, some mechanism that would allow a user to set a "latency
multiplier" based on the system under test, so that e.g. dedicated x86
machines with fairly deterministic CPU/net/etc. scheduling could have it
set low, and so that other systems could still run tests that are now
deemed unsuitable for such hardware.
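
Something like this hypothetical helper (the LTP_LATENCY_MUL name and the
whole interface are only illustrative, nothing like it exists today):

	#include <stdlib.h>

	/* Scale a test threshold by a user-supplied multiplier. */
	static long long scale_threshold(long long threshold_us)
	{
		const char *mul = getenv("LTP_LATENCY_MUL");

		if (mul)
			return (long long)(threshold_us * atof(mul));

		return threshold_us;
	}

A timer test would then compare against scale_threshold(450) instead of a
hard-coded 450us threshold, keeping strict limits on deterministic machines
while letting over-provisioned guests run with e.g. LTP_LATENCY_MUL=4.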

> 
> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
> ---
>  testcases/kernel/syscalls/pselect/pselect01.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/testcases/kernel/syscalls/pselect/pselect01.c b/testcases/kernel/syscalls/pselect/pselect01.c
> index a2b5339..42027b6 100644
> --- a/testcases/kernel/syscalls/pselect/pselect01.c
> +++ b/testcases/kernel/syscalls/pselect/pselect01.c
> @@ -44,6 +44,9 @@ int sample_fn(int clk_id, long long usec)
>  }
>  
>  static struct tst_test test = {
> +	.needs_root = 1,
> +	.needs_latency = 1,
> +	.latency = 0,
>  	.tid = "pselect()",
>  	.sample = sample_fn,
>  };
> 



* [LTP] [PATCH 2/2] syscalls/pselect: Add a zero latency constraint
  2017-08-10 11:50   ` Jiri Jaburek
@ 2017-08-10 12:00     ` Daniel Lezcano
  2017-08-11 11:26       ` Jan Stancek
  0 siblings, 1 reply; 29+ messages in thread
From: Daniel Lezcano @ 2017-08-10 12:00 UTC (permalink / raw)
  To: ltp

On 10/08/2017 13:50, Jiri Jaburek wrote:
> On 08/10/17 10:01, Daniel Lezcano wrote:
>> The pselect_01 testcase works well on x86, as it is a fast platform with a
>> fast idle-state exit latency.
>>
>> However, on ARM[64] the idle routine can take much longer, for example
>> 1500us.
>>
>> pselect fails on the ARM[64] platforms because of these slow exit
>> latencies: the delay between the expected expiration and the observed one
>> is not acceptable.
>>
>> The fix could be to increase the allowed deviation on ARM64, but that
>> wouldn't make sense as some platforms, for the same architecture, can have
>> faster or different delays, hence we could potentially miss a bug.
>>
>> The simplest solution is to set the cpu_dma latency constraint to zero, so
>> the idle driver will always choose the fastest idle state, thus fixing the
>> issue above. The latency constraint will apply only for this test.
> 
> I think a more generic LTP-wide solution could be made; there are more
> tests that fail, e.g. in virtualized environments, because of CPU over-
> provisioning on the host (x86, s390, ppc, etc.)

Not sure a latency constraint will fix the above.

> and I can imagine
> networking tests having similar needs as well (probably currently worked
> around using big timeouts).

Here you can set the network latency.

> Basically, some mechanism that would allow a user to set a "latency
> multiplier" based on the system under test, so that e.g. dedicated x86
> machines with fairly deterministic CPU/net/etc. scheduling could have it
> set low, and so that other systems could still run tests that are now
> deemed unsuitable for such hardware.

Here the solution is to provide the cpu_dma latency constraint; it will
prevent the cpuidle driver from choosing a deeper idle state that introduces
an excessive wakeup latency.

The excessive idle exit latency is the root cause of the pselect failure, and
setting the cpu_dma latency is the correct fix to let this test do the right
thing. That is the purpose of the cpu_dma latency constraint.



>> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
>> ---
>>  testcases/kernel/syscalls/pselect/pselect01.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/testcases/kernel/syscalls/pselect/pselect01.c b/testcases/kernel/syscalls/pselect/pselect01.c
>> index a2b5339..42027b6 100644
>> --- a/testcases/kernel/syscalls/pselect/pselect01.c
>> +++ b/testcases/kernel/syscalls/pselect/pselect01.c
>> @@ -44,6 +44,9 @@ int sample_fn(int clk_id, long long usec)
>>  }
>>  
>>  static struct tst_test test = {
>> +	.needs_root = 1,
>> +	.needs_latency = 1,
>> +	.latency = 0,
>>  	.tid = "pselect()",
>>  	.sample = sample_fn,
>>  };
>>
> 
> 





* [LTP] [PATCH 1/2] ltp: Add the ability to specify the latency constraint
  2017-08-10  8:01 [LTP] [PATCH 1/2] ltp: Add the ability to specify the latency constraint Daniel Lezcano
  2017-08-10  8:01 ` [LTP] [PATCH 2/2] syscalls/pselect: Add a zero " Daniel Lezcano
@ 2017-08-11 11:25 ` Jan Stancek
  2017-08-11 12:54   ` [LTP] [PATCH V2 " Daniel Lezcano
  1 sibling, 1 reply; 29+ messages in thread
From: Jan Stancek @ 2017-08-11 11:25 UTC (permalink / raw)
  To: ltp



----- Original Message -----
> The LTP test suite provides a set of tests, some of which check that an
> operation completes within a specified amount of time.
> 
> Unfortunately, some platforms have slow power management routines, adding
> more than 1.5ms to wake up from a deep idle state. This duration is far too
> long to be acceptable when we are measuring a specified routine against a
> reasonably tight timeout. For example,
> testcases/kernel/syscalls/pselect_01 is failing for this reason.
> 
> This patch gives the testcase the opportunity to specify a latency
> constraint when running. This option must be used together with needs_root
> in order to have the necessary privileges.
> 
> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
> ---
>  include/tst_test.h |  4 ++++
>  lib/tst_test.c     | 18 ++++++++++++++++++
>  2 files changed, 22 insertions(+)
> 
> diff --git a/include/tst_test.h b/include/tst_test.h
> index e90312a..519fd4c 100644
> --- a/include/tst_test.h
> +++ b/include/tst_test.h
> @@ -124,6 +124,7 @@ struct tst_test {
>  	int needs_checkpoints:1;
>  	int format_device:1;
>  	int mount_device:1;
> +	int needs_latency:1;
>  
>  	/* Minimal device size in megabytes */
>  	unsigned int dev_min_size;
> @@ -154,6 +155,9 @@ struct tst_test {
>  
>  	/* NULL terminated array of resource file names */
>  	const char *const *resource_files;
> +
> +	/* Latency constraint to be set for the test */
> +	int latency;
>  };
>  
>  /*
> diff --git a/lib/tst_test.c b/lib/tst_test.c
> index 4c30eda..485515e 100644
> --- a/lib/tst_test.c
> +++ b/lib/tst_test.c
> @@ -619,6 +619,21 @@ static void copy_resources(void)
>  		TST_RESOURCE_COPY(NULL, tst_test->resource_files[i], NULL);
>  }
>  
> +static int set_latency(void)
> +{
> +	int fd, ret;
> +
> +	fd = open("/dev/cpu_dma_latency", O_WRONLY);
> +	if (fd < 0)
> +		return fd;

Hi,

so any kind of failure to open/write to this node will end the test
with TCONF. I'd rather not hide problems with open/write and
instead report any trouble via TBROK:

- if the node doesn't exist (for example because the kernel is too old),
  we'll run the test anyway
- if the node exists but can't be opened -> TBROK
- if the node can be opened, but the write fails -> TBROK
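
An untested sketch of what I have in mind, in the lib/tst_test.c context
(the messages are illustrative):

	static void set_latency(void)
	{
		int fd;

		fd = open("/dev/cpu_dma_latency", O_WRONLY);
		if (fd < 0) {
			if (errno == ENOENT) {
				tst_res(TINFO, "/dev/cpu_dma_latency not available");
				return;
			}
			tst_brk(TBROK | TERRNO, "open(/dev/cpu_dma_latency)");
		}

		if (write(fd, &tst_test->latency, sizeof(tst_test->latency)) < 0)
			tst_brk(TBROK | TERRNO, "write(/dev/cpu_dma_latency)");
	}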

Regards,
Jan


* [LTP] [PATCH 2/2] syscalls/pselect: Add a zero latency constraint
  2017-08-10 12:00     ` Daniel Lezcano
@ 2017-08-11 11:26       ` Jan Stancek
  0 siblings, 0 replies; 29+ messages in thread
From: Jan Stancek @ 2017-08-11 11:26 UTC (permalink / raw)
  To: ltp



----- Original Message -----
> On 10/08/2017 13:50, Jiri Jaburek wrote:
> > On 08/10/17 10:01, Daniel Lezcano wrote:
> >> The pselect_01 testcase works well on x86, as it is a fast platform
> >> with a fast idle-state exit latency.
> >>
> >> However, on ARM[64] the idle routine can take much longer, for example
> >> 1500us.
> >>
> >> pselect fails on the ARM[64] platforms because of these slow exit
> >> latencies: the delay between the expected expiration and the observed
> >> one is not acceptable.
> >>
> >> The fix could be to increase the allowed deviation on ARM64, but that
> >> wouldn't make sense as some platforms, for the same architecture, can
> >> have faster or different delays, hence we could potentially miss a bug.
> >>
> >> The simplest solution is to set the cpu_dma latency constraint to zero,
> >> so the idle driver will always choose the fastest idle state, thus
> >> fixing the issue above. The latency constraint will apply only for this
> >> test.
> > 
> > I think a more generic LTP-wide solution could be made; there are more
> > tests that fail, e.g. in virtualized environments, because of CPU over-
> > provisioning on the host (x86, s390, ppc, etc.)
> 
> Not sure a latency constraint will fix the above.

We could detect bare-metal / (some [1]) virt and adjust the thresholds
for timer tests, but I'm assuming Daniel sees this on bare metal,
so this patch would be useful anyway.

[1] kvm/xen is easy to detect; lpars might be more tricky

Regards,
Jan


* [LTP] [PATCH V2 1/2] ltp: Add the ability to specify the latency constraint
  2017-08-11 11:25 ` [LTP] [PATCH 1/2] ltp: Add the ability to specify the " Jan Stancek
@ 2017-08-11 12:54   ` Daniel Lezcano
  2017-08-11 12:54     ` [LTP] [PATCH V2 2/2] syscalls/pselect: Add a zero " Daniel Lezcano
  2017-08-11 14:09     ` [LTP] [PATCH V2 1/2] ltp: Add the ability to specify the " Cyril Hrubis
  0 siblings, 2 replies; 29+ messages in thread
From: Daniel Lezcano @ 2017-08-11 12:54 UTC (permalink / raw)
  To: ltp

The LTP test suite provides a set of tests, some of which check that an
operation completes within a specified amount of time.

Unfortunately, some platforms have slow power management routines, adding more
than 1.5ms to wake up from a deep idle state. This duration is far too long to
be acceptable when we are measuring a specified routine against a reasonably
tight timeout. For example, testcases/kernel/syscalls/pselect_01 is failing
for this reason.

This patch gives the testcase the opportunity to specify a latency constraint
when running. This option must be used together with needs_root in order to
have the necessary privileges.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
---
 include/tst_test.h |  4 ++++
 lib/tst_test.c     | 28 ++++++++++++++++++++++++++++
 2 files changed, 32 insertions(+)

diff --git a/include/tst_test.h b/include/tst_test.h
index e90312a..519fd4c 100644
--- a/include/tst_test.h
+++ b/include/tst_test.h
@@ -124,6 +124,7 @@ struct tst_test {
 	int needs_checkpoints:1;
 	int format_device:1;
 	int mount_device:1;
+	int needs_latency:1;
 
 	/* Minimal device size in megabytes */
 	unsigned int dev_min_size;
@@ -154,6 +155,9 @@ struct tst_test {
 
 	/* NULL terminated array of resource file names */
 	const char *const *resource_files;
+
+	/* Latency constraint to be set for the test */
+	int latency;
 };
 
 /*
diff --git a/lib/tst_test.c b/lib/tst_test.c
index 4c30eda..717a782 100644
--- a/lib/tst_test.c
+++ b/lib/tst_test.c
@@ -619,6 +619,31 @@ static void copy_resources(void)
 		TST_RESOURCE_COPY(NULL, tst_test->resource_files[i], NULL);
 }
 
+static int set_latency(void)
+{
+	int fd, ret;
+
+	fd = open("/dev/cpu_dma_latency", O_WRONLY);
+	if (fd < 0) {
+		/*
+		 * In case we are running on an old kernel where the cpu_dma
+		 * latency does not exist, do not fail, just inform and bail out.
+		 */
+		if (errno == ENOENT) {
+			tst_res(TINFO, "/dev/cpu_dma_latency does not exist");
+			return 0;
+		}
+
+		return fd;
+	}
+
+	ret = write(fd, &tst_test->latency, sizeof(tst_test->latency));
+	if (ret < 0)
+		return ret;
+
+	return 0;
+}
+
 static const char *get_tid(char *argv[])
 {
 	char *p;
@@ -736,6 +761,9 @@ static void do_setup(int argc, char *argv[])
 
 	if (tst_test->resource_files)
 		copy_resources();
+
+	if (tst_test->needs_latency && set_latency())
+		tst_brk(TBROK, "Failed to set cpu latency");
 }
 
 static void do_test_setup(void)
-- 
2.1.4



* [LTP] [PATCH V2 2/2] syscalls/pselect: Add a zero latency constraint
  2017-08-11 12:54   ` [LTP] [PATCH V2 " Daniel Lezcano
@ 2017-08-11 12:54     ` Daniel Lezcano
  2017-08-11 14:09     ` [LTP] [PATCH V2 1/2] ltp: Add the ability to specify the " Cyril Hrubis
  1 sibling, 0 replies; 29+ messages in thread
From: Daniel Lezcano @ 2017-08-11 12:54 UTC (permalink / raw)
  To: ltp

The pselect_01 testcase works well on x86, as it is a fast platform with a
fast idle-state exit latency.

However, on ARM[64] the idle routine can take much longer, for example
1500us.

pselect fails on the ARM[64] platforms because of these slow exit latencies:
the delay between the expected expiration and the observed one is not
acceptable.

The fix could be to increase the allowed deviation on ARM64, but that wouldn't
make sense as some platforms, for the same architecture, can have faster or
different delays, hence we could potentially miss a bug.

The simplest solution is to set the cpu_dma latency constraint to zero, so the
idle driver will always choose the fastest idle state, thus fixing the issue
above. The latency constraint will apply only for this test.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
---
 testcases/kernel/syscalls/pselect/pselect01.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/testcases/kernel/syscalls/pselect/pselect01.c b/testcases/kernel/syscalls/pselect/pselect01.c
index a2b5339..42027b6 100644
--- a/testcases/kernel/syscalls/pselect/pselect01.c
+++ b/testcases/kernel/syscalls/pselect/pselect01.c
@@ -44,6 +44,9 @@ int sample_fn(int clk_id, long long usec)
 }
 
 static struct tst_test test = {
+	.needs_root = 1,
+	.needs_latency = 1,
+	.latency = 0,
 	.tid = "pselect()",
 	.sample = sample_fn,
 };
-- 
2.1.4



* [LTP] [PATCH V2 1/2] ltp: Add the ability to specify the latency constraint
  2017-08-11 12:54   ` [LTP] [PATCH V2 " Daniel Lezcano
  2017-08-11 12:54     ` [LTP] [PATCH V2 2/2] syscalls/pselect: Add a zero " Daniel Lezcano
@ 2017-08-11 14:09     ` Cyril Hrubis
  2017-08-11 14:52       ` Daniel Lezcano
  1 sibling, 1 reply; 29+ messages in thread
From: Cyril Hrubis @ 2017-08-11 14:09 UTC (permalink / raw)
  To: ltp

Hi!
> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
> ---
>  include/tst_test.h |  4 ++++
>  lib/tst_test.c     | 28 ++++++++++++++++++++++++++++
>  2 files changed, 32 insertions(+)
> 
> diff --git a/include/tst_test.h b/include/tst_test.h
> index e90312a..519fd4c 100644
> --- a/include/tst_test.h
> +++ b/include/tst_test.h
> @@ -124,6 +124,7 @@ struct tst_test {
>  	int needs_checkpoints:1;
>  	int format_device:1;
>  	int mount_device:1;
> +	int needs_latency:1;
>  
>  	/* Minimal device size in megabytes */
>  	unsigned int dev_min_size;
> @@ -154,6 +155,9 @@ struct tst_test {
>  
>  	/* NULL terminated array of resource file names */
>  	const char *const *resource_files;
> +
> +	/* Latency constraint to be set for the test */
> +	int latency;
>  };

This seems a bit too specific to me; I would be happier if we
named it "measures_sleep_time" or something that describes what the
testcase does rather than naming it after a specific workaround.

Also, shouldn't this be set automatically for all timer-related testcases
(those that make use of the sampling function) rather than just for the
pselect one? If so, we should set the flag in the tst_timer_test.c
library instead. Or we can also stick the latency setup/cleanup
into tst_timer_test.c as well and not bother with adding more
flags to the tst_test structure (see timer_setup() and timer_cleanup()
in tst_timer_test.c).

>  /*
> diff --git a/lib/tst_test.c b/lib/tst_test.c
> index 4c30eda..717a782 100644
> --- a/lib/tst_test.c
> +++ b/lib/tst_test.c
> @@ -619,6 +619,31 @@ static void copy_resources(void)
>  		TST_RESOURCE_COPY(NULL, tst_test->resource_files[i], NULL);
>  }
>  
> +static int set_latency(void)
> +{
> +	int fd, ret;
> +
> +	fd = open("/dev/cpu_dma_latency", O_WRONLY);
> +	if (fd < 0) {
> +		/*
> +		 * In case we are running on an old kernel where the cpu_dma
> +		 * latency does not exist, do not fail, just inform and bail out.
> +		 */
> +		if (errno == ENOENT) {
> +			tst_res(TINFO, "/dev/cpu_dma_latency does not exist");
> +			return 0;
> +		}
> +
> +		return fd;
> +	}
> +
> +	ret = write(fd, &tst_test->latency, sizeof(tst_test->latency));
> +	if (ret < 0)
> +		return ret;
> +
> +	return 0;
> +}
> +
>  static const char *get_tid(char *argv[])
>  {
>  	char *p;
> @@ -736,6 +761,9 @@ static void do_setup(int argc, char *argv[])
>  
>  	if (tst_test->resource_files)
>  		copy_resources();
> +
> +	if (tst_test->needs_latency && set_latency())
> +		tst_brk(TBROK, "Failed to set cpu latency");
>  }
>  
>  static void do_test_setup(void)
> -- 
> 2.1.4
> 
> 
> -- 
> Mailing list info: https://lists.linux.it/listinfo/ltp

-- 
Cyril Hrubis
chrubis@suse.cz


* [LTP] [PATCH V2 1/2] ltp: Add the ability to specify the latency constraint
  2017-08-11 14:09     ` [LTP] [PATCH V2 1/2] ltp: Add the ability to specify the " Cyril Hrubis
@ 2017-08-11 14:52       ` Daniel Lezcano
  2017-08-11 15:28         ` Cyril Hrubis
  0 siblings, 1 reply; 29+ messages in thread
From: Daniel Lezcano @ 2017-08-11 14:52 UTC (permalink / raw)
  To: ltp

On 11/08/2017 16:09, Cyril Hrubis wrote:
> Hi!
>> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
>> ---
>>  include/tst_test.h |  4 ++++
>>  lib/tst_test.c     | 28 ++++++++++++++++++++++++++++
>>  2 files changed, 32 insertions(+)
>>
>> diff --git a/include/tst_test.h b/include/tst_test.h
>> index e90312a..519fd4c 100644
>> --- a/include/tst_test.h
>> +++ b/include/tst_test.h
>> @@ -124,6 +124,7 @@ struct tst_test {
>>  	int needs_checkpoints:1;
>>  	int format_device:1;
>>  	int mount_device:1;
>> +	int needs_latency:1;
>>  
>>  	/* Minimal device size in megabytes */
>>  	unsigned int dev_min_size;
>> @@ -154,6 +155,9 @@ struct tst_test {
>>  
>>  	/* NULL terminated array of resource file names */
>>  	const char *const *resource_files;
>> +
>> +	/* Latency constraint to be set for the test */
>> +	int latency;
>>  };
> 
> This seems a bit too specific to me; I would be happier if we
> named it "measures_sleep_time" or something that describes what the
> testcase does rather than naming it after a specific workaround.

I'm sorry, I don't get the point. The latency is a system parameter for
any kind of run. It is in the common tst lib code because the constraint
may be needed by any other new test. I don't understand
'measures_sleep_time'; there is no connection with the word latency.

Perhaps the cpu_dma_latency needs a better explanation:

When the CPU has no task in its runqueue, the kernel will run a special
task (the idle task). When entering the idle routine, it will select an
idle state based on how long we predict the CPU will be sleeping (from a
mix of wakeup statistics and timer events) as well as on an exit latency
constraint. The deeper the idle state, the longer the wakeup latency and
the more energy we save.

The number of idle states is hardware specific, as are the idle state
characteristics. For example, on an x86 i7 the deepest idle state's exit
latency is 86us, while on other platforms (eg. mobile) the deepest idle
state's exit latency can be more than 1500us (or less).

Under certain circumstances, you don't want your CPU to go into too deep
an idle state because you need a fast response: eg. playing a game, or
playing an mp3 or an HD video (otherwise you miss frames). The application
will then set a cpu_dma_latency constraint, and as the kernel checks this
latency when selecting an idle state, deep idle states will be skipped.

For the specific pselect test, we measure the drift between the expected
wakeup and the observed one. With the energy framework, we added an
unacceptable (for the test) delay. This test does not check the latency;
it checks whether the wakeup period of the *pselect* syscall is
acceptable, in absolute terms, so we need to keep the idle routine out of
the picture during this specific test.

Technically speaking, the cpu_dma_latency is set by opening the file
/dev/cpu_dma_latency and writing the value, but as soon as you close the
file descriptor, the constraint is removed from the kernel. So in this
case, any test setting the cpu_dma_latency will have the file descriptor
closed at exit time; the constraint is bound to the test itself.
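
For illustration, here is a minimal standalone program holding the
constraint (the 10s sleep is arbitrary, just to have a window where the
constraint is active):

	#include <fcntl.h>
	#include <stdint.h>
	#include <unistd.h>

	int main(void)
	{
		int32_t latency = 0;	/* max tolerated exit latency, in usecs */
		int fd = open("/dev/cpu_dma_latency", O_WRONLY);

		if (fd < 0)
			return 1;

		/* The constraint is active from the write()... */
		write(fd, &latency, sizeof(latency));
		sleep(10);

		/* ...until the fd is closed (here, or implicitly at exit). */
		close(fd);
		return 0;
	}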


> Also, shouldn't this be set automatically for all timer-related testcases
> (those that make use of the sampling function) rather than just for the
> pselect one? If so, we should set the flag in the tst_timer_test.c
> library instead. Or we can also stick the latency setup/cleanup
> into tst_timer_test.c as well and not bother with adding more
> flags to the tst_test structure (see timer_setup() and timer_cleanup()
> in tst_timer_test.c).

I'm not sure. We may want to measure everything together (timer + drift
+ energy framework latencies) in other testcases. I think it is fine if
we let the test program decide what latency it wants to set.






* [LTP] [PATCH V2 1/2] ltp: Add the ability to specify the latency constraint
  2017-08-11 14:52       ` Daniel Lezcano
@ 2017-08-11 15:28         ` Cyril Hrubis
  2017-08-14 12:56           ` Daniel Lezcano
  0 siblings, 1 reply; 29+ messages in thread
From: Cyril Hrubis @ 2017-08-11 15:28 UTC (permalink / raw)
  To: ltp

Hi!
> > This seems a bit too specific to me; I would be happier if we
> > named it "measures_sleep_time" or something that describes what the
> > testcase does rather than naming it after a specific workaround.
> 
> I'm sorry, I don't get the point. The latency is a system parameter for
> any kind of run. It is in the common tst lib code because the constraint
> may be needed by any other new test. I don't understand
> 'measures_sleep_time'; there is no connection with the word latency.

My point of view is that this is a very specific tuning parameter, and in
an ideal world the testcases would not need to touch kernel tuning knobs
unless the knob is what is being tested; but we do not live in an ideal
world either, so some kind of solution is desirable. On the other hand,
there are plenty of different tuning knobs and I would like to
avoid putting each of them into the test library as another flag to be
set in the test structure.

> Perhaps the cpu_dma_latency needs a better explanation:
> 
> When the CPU has no task in its runqueue, the kernel will run a special
> task (the idle task). When entering the idle routine, it will select an
> idle state based on how long we predict the CPU will be sleeping (from a
> mix of wakeup statistics and timer events) as well as on an exit latency
> constraint. The deeper the idle state, the longer the wakeup latency and
> the more energy we save.
> 
> The number of idle states is hardware specific, as are the idle state
> characteristics. For example, on an x86 i7 the deepest idle state's exit
> latency is 86us, while on other platforms (eg. mobile) the deepest idle
> state's exit latency can be more than 1500us (or less).
> 
> Under certain circumstances, you don't want your CPU to go into too deep
> an idle state because you need a fast response: eg. playing a game, or
> playing an mp3 or an HD video (otherwise you miss frames). The
> application will then set a cpu_dma_latency constraint, and as the kernel
> checks this latency when selecting an idle state, deep idle states will
> be skipped.
> 
> For the specific pselect test, we measure the drift between the expected
> wakeup and the observed one. With the energy framework, we added an
> unacceptable (for the test) delay. This test does not check the latency;
> it checks whether the wakeup period of the *pselect* syscall is
> acceptable, in absolute terms, so we need to keep the idle routine out of
> the picture during this specific test.
> 
> Technically speaking, the cpu_dma_latency is set by opening the file
> /dev/cpu_dma_latency and writing the value, but as soon as you close the
> file descriptor, the constraint is removed from the kernel. So in this
> case, any test setting the cpu_dma_latency will have the file descriptor
> closed at exit time; the constraint is bound to the test itself.

So if I'm getting this correctly, the kernel tries to save power
aggressively in your case, since it knows when the next timer expires and
that waking up from a deep idle state would add slack to the timer, right?
From that point of view this more or less looks like a workaround :-).

But what I do not understand here is why pselect requires special
handling while the rest of the timer testcases, which do exactly the same
and end up using the very same hrtimer framework in the kernel, work fine.

We do have at least select() and poll() tests that should end up using
the same timer with the same slack formula in the kernel, and hence should
fail in the very same way. And possibly the rest of the timer testcases
should likely fail as well, since these are not that different either.

Are you, by any chance, using the latest stable release? We have recently
rewritten all the timer precision tests in:

https://github.com/linux-test-project/ltp/commit/c459654db64cd29177a382ab178fdd5ad59664e4

> > Also shouldn't this be set automatically for all timer related testcases
> > (these that make use of the sampling function) rather than just for the
> > pselect one? If so we should set the flag in the tst_timer_test.c
> > library instead. Or we can also stick the latency setup/cleanup
> > into the tst_timer_test.c as well and don't bother with adding more
> > flags into the tst_test structure (see timer_setup() and timer_cleanup()
> > in the tst_timer_test.c).
> 
> I'm not sure. We may want to measure everything together (timer + drift
> + energy framework latencies) in other testcases. I think it is fine if
> we let the test program to decide what latency he wants to set.

Are you using latest git with the timer library? I would suspect that
all of the test would need the very same treatement  since the test
paramters and maximal error margins are all defined in a single place
now.

-- 
Cyril Hrubis
chrubis@suse.cz


* [LTP] [PATCH V2 1/2] ltp: Add the ability to specify the latency constraint
  2017-08-11 15:28         ` Cyril Hrubis
@ 2017-08-14 12:56           ` Daniel Lezcano
  2017-08-14 13:33             ` Cyril Hrubis
  0 siblings, 1 reply; 29+ messages in thread
From: Daniel Lezcano @ 2017-08-14 12:56 UTC (permalink / raw)
  To: ltp

On 11/08/2017 17:28, Cyril Hrubis wrote:
> Hi!
>>> This seems a bit too specific to me; I would be happier if we
>>> named it "measures_sleep_time" or something that describes
>>> what the testcase does rather than naming it after a specific
>>> workaround.
>> 
>> I'm sorry, I don't get the point. The latency is a system parameter
>> for any kind of run. It is in the common tst lib code because the
>> constraint may be needed by any other new test. I don't understand
>> 'measures_sleep_time'; there is no connection with the word latency.
> 
> My point of view is that this is a very specific tuning parameter, and
> in an ideal world the testcases would not need to touch kernel tuning
> knobs unless the knob is what is being tested; but we do not live in
> an ideal world either, so some kind of solution is desirable. On the
> other hand, there are plenty of different tuning knobs and I would
> like to avoid putting each of them into the test library as another
> flag to be set in the test structure.
> 
>> Perhaps the cpu_dma_latency needs a better explanation:
>> 
>> When the CPU has no task in its runqueue, the kernel will run a
>> special task (the idle task). When entering the idle routine, it
>> will select an idle state based on how long we predict the CPU will
>> be sleeping (from a mix of wakeup statistics and timer events) as
>> well as on an exit latency constraint. The deeper the idle state,
>> the longer the wakeup latency and the more energy we save.
>> 
>> The number of idle states is hardware specific, as are the idle
>> state characteristics. For example, on an x86 i7 the deepest idle
>> state's exit latency is 86us, while on other platforms (eg. mobile)
>> the deepest idle state's exit latency can be more than 1500us (or
>> less).
>> 
>> Under certain circumstances, you don't want your CPU to go into
>> too deep an idle state because you need a fast response: eg. playing
>> a game, or playing an mp3 or an HD video (otherwise you miss frames).
>> The application will then set a cpu_dma_latency constraint, and as
>> the kernel checks this latency when selecting an idle state, deep
>> idle states will be skipped.
>> 
>> For the specific pselect test, we measure the drift between the
>> expected wakeup and the observed one. With the energy framework,
>> we added an unacceptable (for the test) delay. This test does not
>> check the latency; it checks whether the wakeup period of the
>> *pselect* syscall is acceptable, in absolute terms, so we need to
>> keep the idle routine out of the picture during this specific test.
>> 
>> Technically speaking, the cpu_dma_latency is set by opening the
>> file /dev/cpu_dma_latency and writing the value, but as soon as you
>> close the file descriptor, the constraint is removed from the
>> kernel. So in this case, any test setting the cpu_dma_latency will
>> have the file descriptor closed at exit time; the constraint is
>> bound to the test itself.
> 
> So if I'm getting this correctly, the kernel tries to save power
> aggressively in your case, since it knows when the next timer expires
> and that waking up from a deep idle state would add slack to the timer,
> right? From that point of view this more or less looks like a
> workaround :-).

No, it is not trying to save power aggressively. It has exactly the same
behavior as on other platforms/archs. The kernel does not have a
constraint in terms of latency, so it does not discriminate against any
idle state. Luckily the test works on other architectures because:

 1. the architecture / platform does not have deep idle states
    or
 2. the deep idle states have fast exit latencies

However, there are some discussions around setting a timer in the kernel
to force a wakeup a bit sooner (basically minus the exit latency) in
order to reduce the drift, but that won't be for tomorrow because other
parameters enter into the equation (eg. the latency can change depending
on the CPU clock).

> But what I do not understand here is why pselect requires special
> handling while the rest of the timer testcases, which do exactly the same
> and end up using the very same hrtimer framework in the kernel, work
> fine.
> We do have at least select() and poll() 

I tested them individually and they fail with the userspace governor + min
freq.

If I change the cpufreq governor to ondemand and run select0[0-4] one
after the other, they don't fail. The reason is that the previous tests
have a CPU duty cycle long enough that the governor decides to stay at a
high frequency, thus reducing the exit latency.

> tests that should end up
> using the same timer with the same slack formula in the kernel, and hence
> should fail in the very same way. And possibly the rest of the timer
> testcases should likely fail as well, since these are not that
> different either.

The latencies introduced by the kernel energy saving framework can make
these tests fail, and it is impossible to get rid of that except by
setting the latency to zero when running specific test cases.


> Are you, by any chance, using the latest stable release? We have recently
> rewritten all the timer precision tests in:
> 
> https://github.com/linux-test-project/ltp/commit/c459654db64cd29177a382ab178fdd5ad59664e4

Yes, I am.


>>> Also, shouldn't this be set automatically for all timer-related
>>> testcases (those that make use of the sampling function) rather
>>> than just for the pselect one? If so, we should set the flag in
>>> the tst_timer_test.c library instead. Or we can also stick the
>>> latency setup/cleanup into tst_timer_test.c as well and not
>>> bother with adding more flags to the tst_test structure
>>> (see timer_setup() and timer_cleanup() in tst_timer_test.c).
>> 
>> I'm not sure. We may want to measure everything together (timer +
>> drift + energy framework latencies) in other testcases. I think it
>> is fine if we let the test program decide what latency it wants
>> to set.
> 
Are you using the latest git with the timer library? I would suspect that
all of the tests would need the very same treatment, since the test
parameters and maximal error margins are all defined in a single place
now.

Do you mean setting the zero latency in tst_timer_test?





* [LTP] [PATCH V2 1/2] ltp: Add the ability to specify the latency constraint
  2017-08-14 12:56           ` Daniel Lezcano
@ 2017-08-14 13:33             ` Cyril Hrubis
  2017-08-14 14:19               ` Daniel Lezcano
  0 siblings, 1 reply; 29+ messages in thread
From: Cyril Hrubis @ 2017-08-14 13:33 UTC (permalink / raw)
  To: ltp

Hi!
> > tests that should end up
> > using the same timer with the same slack formula in the kernel, and hence
> > should fail in the very same way. And possibly the rest of the timer
> > testcases should likely fail as well, since these are not that
> > different either.
> 
> The latencies introduced by the kernel energy saving framework can make
> these tests fail, and it is impossible to get rid of that except by
> setting the latency to zero when running specific test cases.
> 
> 
> > Are you, by any chance, using the latest stable release? We have recently
> > rewritten all the timer precision tests in:
> > 
> > https://github.com/linux-test-project/ltp/commit/c459654db64cd29177a382ab178fdd5ad59664e4
> 
> Yes, I am.

That explains it. Previously each of the timer testcases had its own
PASS/FAIL criteria and each of them was slightly different. We got rid
of that mess recently, so the latest git has a timer measurement
library and the test only defines a sampling function now. We also did
quite a lot of testing to make sure that the tests are stable now. And
because of that we take more samples and apply a discarded mean to get rid
of random outliers. But we did most of the testing on x86 hardware so
it's possible that it still needs some adjustments.

Can you, please, try with the latest git to see if these tests work for
you now? And then, in case they still fail, we will figure out how
to fix them. Most likely we will patch the timer test library, either
to loosen the criteria or to keep cpu_dma_latency open while we sample
the timers.

-- 
Cyril Hrubis
chrubis@suse.cz


* [LTP] [PATCH V2 1/2] ltp: Add the ability to specify the latency constraint
  2017-08-14 13:33             ` Cyril Hrubis
@ 2017-08-14 14:19               ` Daniel Lezcano
  2017-08-14 14:36                 ` Cyril Hrubis
  0 siblings, 1 reply; 29+ messages in thread
From: Daniel Lezcano @ 2017-08-14 14:19 UTC (permalink / raw)
  To: ltp

On 14/08/2017 15:33, Cyril Hrubis wrote:

[ ... ]

>>> Are you, by any chance, using the latest stable release? We have recently
>>> rewritten all the timer precision tests in:
>>>
>>> https://github.com/linux-test-project/ltp/commit/c459654db64cd29177a382ab178fdd5ad59664e4
>>
>> Yes, I am.
> 
> That explains it. Previously each of the timer testcases had its own
> PASS/FAIL criteria and each of them was slightly different. We got rid
> of that mess recently, so the latest git has a timer measurement
> library and the test only defines a sampling function now. We also did
> quite a lot of testing to make sure that the tests are stable now. And
> because of that we take more samples and apply a discarded mean to get
> rid of random outliers. But we did most of the testing on x86 hardware
> so it's possible that it still needs some adjustments.

IMO, you should not try to adjust this, because there can be such a big
gap between arch/platforms in terms of exit_latency that the test could
miss a bug. I mean, being more tolerant for one arch can make the test
miss a bug on another arch.

eg.

exynos4 : 5000us
at91: 10us
ux500: 70us
mediatek: 600us
ppc: 10us
x86: 86us
sh mobile: 2300us

etc...


The simplest and cleanest way is to reduce the latency to its minimum in
order to reduce the energy framework's impact on the tests.

It is only recently that mobile platforms run ltp.

> Can you, please, try with the latest git to see if these tests work for
> you now? And then, in case they still fail, we will figure out how
> to fix them. Most likely we will patch the timer test library, either
> to loosen the criteria or to keep cpu_dma_latency open while we sample
> the timers.

There is a misunderstanding. I ran the tests (and they fail) on the
latest one, 4a707d417e3f95025fe6c707e2763e84b2bed29a.








* [LTP] [PATCH V2 1/2] ltp: Add the ability to specify the latency constraint
  2017-08-14 14:19               ` Daniel Lezcano
@ 2017-08-14 14:36                 ` Cyril Hrubis
  2017-08-14 15:43                   ` Daniel Lezcano
  0 siblings, 1 reply; 29+ messages in thread
From: Cyril Hrubis @ 2017-08-14 14:36 UTC (permalink / raw)
  To: ltp

Hi!
> > That explains it. Previously each of the timer testcases had its own
> > PASS/FAIL criteria and each of them was slightly different. We got rid
> > of that mess recently, so the latest git has a timer measurement
> > library and the test only defines a sampling function now. We also did
> > quite a lot of testing to make sure that the tests are stable now. And
> > because of that we take more samples and apply a discarded mean to get
> > rid of random outliers. But we did most of the testing on x86 hardware
> > so it's possible that it still needs some adjustments.
> 
> IMO, you should not try to adjust this, because there can be such a big
> gap between arch/platforms in terms of exit_latency that the test could
> miss a bug. I mean, being more tolerant for one arch can make the test
> miss a bug on another arch.
> 
> eg.
> 
> exynos4 : 5000us
> at91: 10us
> ux500: 70us
> mediatek: 600us
> ppc: 10us
> x86: 86us
> sh mobile: 2300us
> 
> etc...

Ok.

> The simplest and cleanest way is to reduce the latency to its minimum in
> order to reduce the energy framework's impact on the tests.
> 
> It is only recently that mobile platforms run ltp.

Sounds reasonably then.

> > Can you, please, try with the latest git to see if these tests work for
> > you now? And then, in case they still fail, we will figure out how
> > to fix them. Most likely we will patch the timer test library, either
> > to loosen the criteria or to keep cpu_dma_latency open while we sample
> > the timers.
> 
> There is a misunderstanding. I ran the tests (and they fail) on the
> latest one, 4a707d417e3f95025fe6c707e2763e84b2bed29a.

Okay, and do all of the timer tests fail or just some subset?

And even if only a subset of them fails, I would still consider changing
the timer library rather than individual testcases.

-- 
Cyril Hrubis
chrubis@suse.cz


* [LTP] [PATCH V2 1/2] ltp: Add the ability to specify the latency constraint
  2017-08-14 14:36                 ` Cyril Hrubis
@ 2017-08-14 15:43                   ` Daniel Lezcano
  2017-08-15 11:06                     ` Cyril Hrubis
  0 siblings, 1 reply; 29+ messages in thread
From: Daniel Lezcano @ 2017-08-14 15:43 UTC (permalink / raw)
  To: ltp

On 14/08/2017 16:36, Cyril Hrubis wrote:
> Hi!
>>> That explains it. Previously each of the timer testcases had its own
>>> PASS/FAIL criteria and each of them was slightly different. We got rid
>>> of that mess recently, so the latest git has a timer measurement
>>> library and the test only defines a sampling function now. We also did
>>> quite a lot of testing to make sure that the tests are stable now. And
>>> because of that we take more samples and apply a discarded mean to get
>>> rid of random outliers. But we did most of the testing on x86 hardware
>>> so it's possible that it still needs some adjustments.
>>
>> IMO, you should not try to adjust this, because there can be such a big
>> gap between arch/platforms in terms of exit_latency that the test could
>> miss a bug. I mean, being more tolerant for one arch can make the test
>> miss a bug on another arch.
>>
>> eg.
>>
>> exynos4 : 5000us
>> at91: 10us
>> ux500: 70us
>> mediatek: 600us
>> ppc: 10us
>> x86: 86us
>> sh mobile: 2300us
>>
>> etc...
> 
> Ok.
> 
>> The simplest and cleanest way is to reduce the latency to its minimum in
>> order to reduce the energy framework's impact on the tests.
>>
>> It is only recently that mobile platforms run ltp.
> 
> Sounds reasonably then.
> 
>>> Can you, please, try with the latest git to see if these tests work for
>>> you now? And then, in case they still fail, we will figure out how
>>> to fix them. Most likely we will patch the timer test library, either
>>> to loosen the criteria or to keep cpu_dma_latency open while we sample
>>> the timers.
>>
>> There is a misunderstanding. I ran the tests (and they fail) on the
>> latest one, 4a707d417e3f95025fe6c707e2763e84b2bed29a.
> 
> Okay, and do all of the timer tests fail or just some subset?

Actually, I did not run the entire test suite; I ran the tests that use the
tst_timer_start() function:

  ---------------------------------------------
 | latency constraint	infinite	 0     |
  ---------------------------------------------
 | nanosleep01		failed		pass   |
  ---------------------------------------------
 | nanosleep02		pass		pass   |
  ---------------------------------------------
 | fcntl33		pass		pass   |
  ---------------------------------------------	
 | clock_nanosleep02	failed		pass   |
  ---------------------------------------------
 | epoll_wait02		failed		pass   |
  ---------------------------------------------
 | futex_wait05		failed		pass   |
  ---------------------------------------------

> And even if only a subset of them fails, I would still consider changing
> the timer library rather than individual testcases.

Yes, that makes sense.

Do you want to keep the latency option for future use?





* [LTP] [PATCH V2 1/2] ltp: Add the ability to specify the latency constraint
  2017-08-14 15:43                   ` Daniel Lezcano
@ 2017-08-15 11:06                     ` Cyril Hrubis
  2017-08-15 20:15                       ` Daniel Lezcano
  0 siblings, 1 reply; 29+ messages in thread
From: Cyril Hrubis @ 2017-08-15 11:06 UTC (permalink / raw)
  To: ltp

Hi!
> > And even if only a subset of them fails, I would still consider changing
> > the timer library rather than individual testcases.
> 
> Yes, that makes sense.
> 
> Do you want to keep the latency option for future use?

I tend not to add anything just in case we may need it later, in
order to keep the test library as small as possible; it's complex enough
even as it is. Moreover, we can always add it easily when we find a test
that requires it.

-- 
Cyril Hrubis
chrubis@suse.cz


* [LTP] [PATCH V2 1/2] ltp: Add the ability to specify the latency constraint
  2017-08-15 11:06                     ` Cyril Hrubis
@ 2017-08-15 20:15                       ` Daniel Lezcano
  2017-08-17 13:50                         ` Cyril Hrubis
  0 siblings, 1 reply; 29+ messages in thread
From: Daniel Lezcano @ 2017-08-15 20:15 UTC (permalink / raw)
  To: ltp

On 15/08/2017 13:06, Cyril Hrubis wrote:
> Hi!
>>> And even if only a subset of them fails, I would still consider changing
>>> the timer library rather than individual testcases.
>>
>> Yes, that makes sense.
>>
>> Do you want to keep the latency option for future use?
> 
> I tend not to add anything just in case we may need it later, in
> order to keep the test library as small as possible; it's complex enough
> even as it is. Moreover, we can always add it easily when we find a test
> that requires it.

When setting the latency to zero in tst_timer_start(): if opening the
/dev/cpu_dma_latency file fails, we continue but issue a warning with

	tst_resm(TWARN,
		"Failed to open '/dev/cpu_dma_latency': %s",
		strerror(errno));

Is that ok?




* [LTP] [PATCH V2 1/2] ltp: Add the ability to specify the latency constraint
  2017-08-15 20:15                       ` Daniel Lezcano
@ 2017-08-17 13:50                         ` Cyril Hrubis
  2017-08-17 14:02                           ` Daniel Lezcano
  2017-08-17 15:00                           ` [LTP] [PATCH V3] ltp: Add a zero latency constraint for the timer tests library Daniel Lezcano
  0 siblings, 2 replies; 29+ messages in thread
From: Cyril Hrubis @ 2017-08-17 13:50 UTC (permalink / raw)
  To: ltp

Hi!
> >> Yes, that makes sense.
> >>
> >> Do you want to keep the latency option for future use?
> > 
> > I tend not to add anything just in case we may need it later, in
> > order to keep the test library as small as possible; it's complex enough
> > even as it is. Moreover, we can always add it easily when we find a test
> > that requires it.
> 
> When setting the latency to zero in tst_timer_start(): if opening the
> /dev/cpu_dma_latency file fails, we continue but issue a warning with
> 
> 	tst_resm(TWARN,
> 		"Failed to open '/dev/cpu_dma_latency': %s",
> 		strerror(errno));
> 
> Is that ok?

Issuing TWARN marks the test as a failure; you should go for TINFO if
you just want to inform the user about a non-fatal problem.

Also, are you sure that tst_timer_start() is the right place to open the
file? That function is called ~1000 times in each timer test, hence this
would add quite a bit of overhead. Why don't we just put it into
timer_setup() in lib/tst_timer_test.c, which is called once at the
start of the test?

-- 
Cyril Hrubis
chrubis@suse.cz


* [LTP] [PATCH V2 1/2] ltp: Add the ability to specify the latency constraint
  2017-08-17 13:50                         ` Cyril Hrubis
@ 2017-08-17 14:02                           ` Daniel Lezcano
  2017-08-17 15:00                           ` [LTP] [PATCH V3] ltp: Add a zero latency constraint for the timer tests library Daniel Lezcano
  1 sibling, 0 replies; 29+ messages in thread
From: Daniel Lezcano @ 2017-08-17 14:02 UTC (permalink / raw)
  To: ltp

On 17/08/2017 15:50, Cyril Hrubis wrote:
> Hi!
>>>> Yes, that makes sense.
>>>>
>>>> Do you want to keep the latency option for future use?
>>>
>>> I tend not to add anything just in case we may need it later, in
>>> order to keep the test library as small as possible; it's complex enough
>>> even as it is. Moreover, we can always add it easily when we find a test
>>> that requires it.
>>
>> When setting the latency to zero in tst_timer_start(): if opening the
>> /dev/cpu_dma_latency file fails, we continue but issue a warning with
>>
>> 	tst_resm(TWARN,
>> 		"Failed to open '/dev/cpu_dma_latency': %s",
>> 		strerror(errno));
>>
>> Is that ok?
> 
> Issuing TWARN marks the test as a failure; you should go for TINFO if
> you just want to inform the user about a non-fatal problem.

Ok.


> Also, are you sure that tst_timer_start() is the right place to open the
> file? That function is called ~1000 times in each timer test, hence this
> would add quite a bit of overhead. Why don't we just put it into
> timer_setup() in lib/tst_timer_test.c, which is called once at the
> start of the test?

Yes, absolutely.

Thanks.

  -- Daniel







* [LTP] [PATCH V3] ltp: Add a zero latency constraint for the timer tests library
  2017-08-17 13:50                         ` Cyril Hrubis
  2017-08-17 14:02                           ` Daniel Lezcano
@ 2017-08-17 15:00                           ` Daniel Lezcano
  2017-08-18 12:25                             ` Cyril Hrubis
  1 sibling, 1 reply; 29+ messages in thread
From: Daniel Lezcano @ 2017-08-17 15:00 UTC (permalink / raw)
  To: ltp

The LTP test suite provides a set of tests, some of which check that an
operation completes within a specified amount of time.

Unfortunately, some platforms have slow power management routines, adding more
than 1.5ms to wake up from a deep idle state. This duration is far too long to
be acceptable when we are measuring a specified routine against a reasonably
tight timeout.

All the timer tests measure the deviation between the measured and the
expected timeouts and check that the gap is reasonable. A slow platform with
slow idle states will introduce latency that breaks the tests randomly (eg.
when the cpu frequency is low).

More precisely, the following tests fail randomly on a hikey 96board:

-------------------------------------------------
| latency constraint    infinite         0      |
-------------------------------------------------
| nanosleep01           failed          pass    |
-------------------------------------------------
| nanosleep02           pass            pass    |
-------------------------------------------------
| fcntl33               pass            pass    |
-------------------------------------------------
| clock_nanosleep02     failed          pass    |
-------------------------------------------------
| epoll_wait02          failed          pass    |
-------------------------------------------------
| futex_wait0           failed          pass    |
-------------------------------------------------

In order to reduce the impact of the power management framework on the tests,
let's specify a temporary latency constraint by setting the cpu_dma_latency
to zero.

The constraint ends when the file descriptor to /dev/cpu_dma_latency is
closed; we let that happen when the process exits.

Note that access to /dev/cpu_dma_latency requires root privileges. Without
them, the test will run without the latency constraint, which is fine for
most of the platforms ltp runs on.
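
For reference, the whole mechanism fits in a few lines. A minimal
standalone sketch (a hypothetical demo program, not part of this patch)
holding a zero latency constraint until it exits:

	#include <fcntl.h>
	#include <stdint.h>
	#include <stdio.h>
	#include <unistd.h>

	int main(void)
	{
		int32_t latency = 0;
		int fd;

		/* Writing a 32-bit value (in microseconds) to the PM QoS
		 * device requests a maximum wakeup latency constraint. */
		fd = open("/dev/cpu_dma_latency", O_WRONLY);
		if (fd < 0) {
			perror("open /dev/cpu_dma_latency");
			return 1;
		}

		if (write(fd, &latency, sizeof(latency)) != sizeof(latency)) {
			perror("write /dev/cpu_dma_latency");
			return 1;
		}

		/* The constraint is held as long as the fd stays open and
		 * is dropped automatically when the process exits. */
		pause();
		return 0;
	}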

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
---
 lib/tst_timer_test.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/lib/tst_timer_test.c b/lib/tst_timer_test.c
index 7539c62..cd4ebca 100644
--- a/lib/tst_timer_test.c
+++ b/lib/tst_timer_test.c
@@ -335,6 +335,17 @@ void do_timer_test(long long usec, unsigned int nsamples)
 
 static void parse_timer_opts(void);
 
+static int set_latency(void)
+{
+        int fd, latency = 0;
+
+        fd = open("/dev/cpu_dma_latency", O_WRONLY);
+        if (fd < 0)
+                return fd;
+
+        return write(fd, &latency, sizeof(latency));
+}
+
 static void timer_setup(void)
 {
 	struct timespec t;
@@ -365,6 +376,9 @@ static void timer_setup(void)
 
 	samples = SAFE_MALLOC(sizeof(long long) * MAX(MAX_SAMPLES, sample_cnt));
 
+	if (set_latency() < 0)
+		tst_res(TINFO, "Failed to set zero latency constraint: %m");
+
 	if (setup)
 		setup();
 }
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [LTP] [PATCH V3] ltp: Add a zero latency constraint for the timer tests library
  2017-08-17 15:00                           ` [LTP] [PATCH V3] ltp: Add a zero latency constraint for the timer tests library Daniel Lezcano
@ 2017-08-18 12:25                             ` Cyril Hrubis
  2017-12-12 14:48                               ` Jan Stancek
  0 siblings, 1 reply; 29+ messages in thread
From: Cyril Hrubis @ 2017-08-18 12:25 UTC (permalink / raw)
  To: ltp

Hi!
Pushed, thanks.

-- 
Cyril Hrubis
chrubis@suse.cz

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [LTP] [PATCH V3] ltp: Add a zero latency constraint for the timer tests library
  2017-08-18 12:25                             ` Cyril Hrubis
@ 2017-12-12 14:48                               ` Jan Stancek
  2017-12-12 14:56                                 ` Daniel Lezcano
  0 siblings, 1 reply; 29+ messages in thread
From: Jan Stancek @ 2017-12-12 14:48 UTC (permalink / raw)
  To: ltp

Hi,

I'm running into a similar problem on a "Dell Precision 5820 tower" with an
Intel(R) Xeon(R) W-2133 CPU @ 3.60GHz, but I can't find any /proc or /sys
knob that would help.

# uname -r
4.14.5

# cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
performance
performance
performance
performance
performance
performance
performance
performance
performance
performance
performance
performance

Any timer-related tests are reliably failing on longer timeouts:
---
tst_test.c:934: INFO: Timeout per run is 0h 05m 00s
tst_timer_test.c:356: INFO: CLOCK_MONOTONIC resolution 1ns
tst_timer_test.c:368: INFO: prctl(PR_GET_TIMERSLACK) = 50us
tst_timer_test.c:275: INFO: poll() sleeping for 1000us 500 iterations, threshold 450.01us
tst_timer_test.c:318: INFO: min 1095us, max 1098us, median 1096us, trunc mean 1095.99us (discarded 25)
tst_timer_test.c:333: PASS: Measured times are within thresholds
tst_timer_test.c:275: INFO: poll() sleeping for 2000us 500 iterations, threshold 450.01us
tst_timer_test.c:318: INFO: min 2062us, max 2138us, median 2137us, trunc mean 2135.98us (discarded 25)
tst_timer_test.c:333: PASS: Measured times are within thresholds
tst_timer_test.c:275: INFO: poll() sleeping for 5000us 300 iterations, threshold 450.04us
tst_timer_test.c:318: INFO: min 5262us, max 5263us, median 5262us, trunc mean 5262.20us (discarded 15)
tst_timer_test.c:333: PASS: Measured times are within thresholds
tst_timer_test.c:275: INFO: poll() sleeping for 10000us 100 iterations, threshold 450.33us
tst_timer_test.c:318: INFO: min 10318us, max 10471us, median 10471us, trunc mean 10469.07us (discarded 5)
tst_timer_test.c:321: FAIL: poll() slept for too long

 Time: us | Frequency
--------------------------------------------------------------------------------
    10318 | +
    10327 | 
    10336 | 
    10345 | 
    10354 | 
    10363 | 
    10372 | 
    10381 | 
    10390 | 
    10399 | 
    10408 | 
    10417 | 
    10426 | 
    10435 | +
    10444 | 
    10453 | 
    10462 | 
    10471 | ********************************************************************
--------------------------------------------------------------------------------
      9us | 1 sample = 0.69388 '*', 1.38776 '+', 2.77551 '-', non-zero '.'

tst_timer_test.c:275: INFO: poll() sleeping for 25000us 50 iterations, threshold 451.29us
tst_timer_test.c:318: INFO: min 26096us, max 26097us, median 26096us, trunc mean 26096.00us (discarded 2)
tst_timer_test.c:321: FAIL: poll() slept for too long

 Time: us | Frequency
--------------------------------------------------------------------------------
    26096 | ********************************************************************
    26097 | *-
--------------------------------------------------------------------------------
      1us | 1 sample = 1.38776 '*', 2.77551 '+', 5.55102 '-', non-zero '.'

tst_timer_test.c:275: INFO: poll() sleeping for 100000us 10 iterations, threshold 537.00us
tst_timer_test.c:318: INFO: min 104273us, max 104274us, median 104274us, trunc mean 104273.67us (discarded 1)
tst_timer_test.c:321: FAIL: poll() slept for too long

 Time: us | Frequency
--------------------------------------------------------------------------------
   104273 | *****************************
   104274 | ********************************************************************
--------------------------------------------------------------------------------
      1us | 1 sample = 9.71429 '*', 19.42857 '+', 38.85714 '-', non-zero '.'

tst_timer_test.c:275: INFO: poll() sleeping for 1000000us 2 iterations, threshold 4400.00us
tst_timer_test.c:318: INFO: min 1007717us, max 1011050us, median 1007717us, trunc mean 1007717.00us (discarded 1)
tst_timer_test.c:321: FAIL: poll() slept for too long

 Time: us | Frequency
--------------------------------------------------------------------------------
  1007717 | ********************************************************************
  1007893 | 
  1008069 | 
  1008245 | 
  1008421 | 
  1008597 | 
  1008773 | 
  1008949 | 
  1009125 | 
  1009301 | 
  1009477 | 
  1009653 | 
  1009829 | 
  1010005 | 
  1010181 | 
  1010357 | 
  1010533 | 
  1010709 | 
  1010885 | ********************************************************************
--------------------------------------------------------------------------------
    176us | 1 sample = 68.00000 '*', 136.00000 '+', 272.00000 '-', non-zero '.'
---

SCHED_OTHER or SCHED_FIFO -> FAIL
intel_idle.max_cstate=0 processor.max_cstate=1 -> FAIL
echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo -> FAIL
idle=halt -> FAIL
idle=poll -> FAIL

The only thing I found to help is to keep the CPU slightly busy with
  taskset -c 1 sh -c "while [ True ]; do usleep 100; done"

After that it started to PASS:

# taskset -c 1 ./poll02
tst_test.c:934: INFO: Timeout per run is 0h 05m 00s
tst_timer_test.c:356: INFO: CLOCK_MONOTONIC resolution 1ns
tst_timer_test.c:368: INFO: prctl(PR_GET_TIMERSLACK) = 50us
tst_timer_test.c:275: INFO: poll() sleeping for 1000us 500 iterations, threshold 450.01us
tst_timer_test.c:318: INFO: min 1004us, max 1325us, median 1072us, trunc mean 1149.29us (discarded 25)
tst_timer_test.c:333: PASS: Measured times are within thresholds
tst_timer_test.c:275: INFO: poll() sleeping for 2000us 500 iterations, threshold 450.01us
tst_timer_test.c:318: INFO: min 2007us, max 2326us, median 2075us, trunc mean 2158.64us (discarded 25)
tst_timer_test.c:333: PASS: Measured times are within thresholds
tst_timer_test.c:275: INFO: poll() sleeping for 5000us 300 iterations, threshold 450.04us
tst_timer_test.c:318: INFO: min 5006us, max 5345us, median 5074us, trunc mean 5146.84us (discarded 15)
tst_timer_test.c:333: PASS: Measured times are within thresholds
tst_timer_test.c:275: INFO: poll() sleeping for 10000us 100 iterations, threshold 450.33us
tst_timer_test.c:318: INFO: min 10004us, max 10364us, median 10075us, trunc mean 10128.61us (discarded 5)
tst_timer_test.c:333: PASS: Measured times are within thresholds
tst_timer_test.c:275: INFO: poll() sleeping for 25000us 50 iterations, threshold 451.29us
tst_timer_test.c:318: INFO: min 25006us, max 25359us, median 25072us, trunc mean 25137.48us (discarded 2)
tst_timer_test.c:333: PASS: Measured times are within thresholds
tst_timer_test.c:275: INFO: poll() sleeping for 100000us 10 iterations, threshold 537.00us
tst_timer_test.c:318: INFO: min 100010us, max 100372us, median 100125us, trunc mean 100167.78us (discarded 1)
tst_timer_test.c:333: PASS: Measured times are within thresholds
tst_timer_test.c:275: INFO: poll() sleeping for 1000000us 2 iterations, threshold 4400.00us
tst_timer_test.c:318: INFO: min 1000843us, max 1000920us, median 1000843us, trunc mean 1000843.00us (discarded 1)
tst_timer_test.c:333: PASS: Measured times are within thresholds

Summary:
passed   7
failed   0
skipped  0
warnings 0

Any ideas?

Regards,
Jan

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [LTP] [PATCH V3] ltp: Add a zero latency constraint for the timer tests library
  2017-12-12 14:48                               ` Jan Stancek
@ 2017-12-12 14:56                                 ` Daniel Lezcano
  2017-12-12 15:04                                   ` Jan Stancek
  0 siblings, 1 reply; 29+ messages in thread
From: Daniel Lezcano @ 2017-12-12 14:56 UTC (permalink / raw)
  To: ltp

On 12/12/2017 15:48, Jan Stancek wrote:
> Hi,
> 
> I'm running into similar problem on "Dell Precision 5820 tower", with
> Intel(R) Xeon(R) W-2133 CPU @ 3.60GHz, but I can't find any /proc /sys
> knob that would help.
> 
> # uname -r
> 4.14.5
> 
> # cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
> performance
> performance
> performance
> performance
> performance
> performance
> performance
> performance
> performance
> performance
> performance
> performance
> 
> Any timer related tests are reliably failing on longer timeouts:
> ---
> tst_test.c:934: INFO: Timeout per run is 0h 05m 00s
> tst_timer_test.c:356: INFO: CLOCK_MONOTONIC resolution 1ns
> tst_timer_test.c:368: INFO: prctl(PR_GET_TIMERSLACK) = 50us
> tst_timer_test.c:275: INFO: poll() sleeping for 1000us 500 iterations, threshold 450.01us
> tst_timer_test.c:318: INFO: min 1095us, max 1098us, median 1096us, trunc mean 1095.99us (discarded 25)
> tst_timer_test.c:333: PASS: Measured times are within thresholds
> tst_timer_test.c:275: INFO: poll() sleeping for 2000us 500 iterations, threshold 450.01us
> tst_timer_test.c:318: INFO: min 2062us, max 2138us, median 2137us, trunc mean 2135.98us (discarded 25)
> tst_timer_test.c:333: PASS: Measured times are within thresholds
> tst_timer_test.c:275: INFO: poll() sleeping for 5000us 300 iterations, threshold 450.04us
> tst_timer_test.c:318: INFO: min 5262us, max 5263us, median 5262us, trunc mean 5262.20us (discarded 15)
> tst_timer_test.c:333: PASS: Measured times are within thresholds
> tst_timer_test.c:275: INFO: poll() sleeping for 10000us 100 iterations, threshold 450.33us
> tst_timer_test.c:318: INFO: min 10318us, max 10471us, median 10471us, trunc mean 10469.07us (discarded 5)
> tst_timer_test.c:321: FAIL: poll() slept for too long

Are you running the tests with this zero latency patch included ?


> SCHED_OTHER or SCHED_FIFO -> FAIL
> intel_idle.max_cstate=0 processor.max_cstate=1 -> FAIL
> echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo -> FAIL
> idle=halt -> FAIL
> idle=poll -> FAIL
> 
> Only thing I found to help is to keep CPU slightly busy with
>   taskset -c 1 sh -c "while [ True ]; do usleep 100; done"
> 
> After that it started to PASS:
> 
> # taskset -c 1 ./poll02
> tst_test.c:934: INFO: Timeout per run is 0h 05m 00s
> tst_timer_test.c:356: INFO: CLOCK_MONOTONIC resolution 1ns
> tst_timer_test.c:368: INFO: prctl(PR_GET_TIMERSLACK) = 50us
> tst_timer_test.c:275: INFO: poll() sleeping for 1000us 500 iterations, threshold 450.01us
> tst_timer_test.c:318: INFO: min 1004us, max 1325us, median 1072us, trunc mean 1149.29us (discarded 25)
> tst_timer_test.c:333: PASS: Measured times are within thresholds
> tst_timer_test.c:275: INFO: poll() sleeping for 2000us 500 iterations, threshold 450.01us
> tst_timer_test.c:318: INFO: min 2007us, max 2326us, median 2075us, trunc mean 2158.64us (discarded 25)
> tst_timer_test.c:333: PASS: Measured times are within thresholds
> tst_timer_test.c:275: INFO: poll() sleeping for 5000us 300 iterations, threshold 450.04us
> tst_timer_test.c:318: INFO: min 5006us, max 5345us, median 5074us, trunc mean 5146.84us (discarded 15)
> tst_timer_test.c:333: PASS: Measured times are within thresholds
> tst_timer_test.c:275: INFO: poll() sleeping for 10000us 100 iterations, threshold 450.33us
> tst_timer_test.c:318: INFO: min 10004us, max 10364us, median 10075us, trunc mean 10128.61us (discarded 5)
> tst_timer_test.c:333: PASS: Measured times are within thresholds
> tst_timer_test.c:275: INFO: poll() sleeping for 25000us 50 iterations, threshold 451.29us
> tst_timer_test.c:318: INFO: min 25006us, max 25359us, median 25072us, trunc mean 25137.48us (discarded 2)
> tst_timer_test.c:333: PASS: Measured times are within thresholds
> tst_timer_test.c:275: INFO: poll() sleeping for 100000us 10 iterations, threshold 537.00us
> tst_timer_test.c:318: INFO: min 100010us, max 100372us, median 100125us, trunc mean 100167.78us (discarded 1)
> tst_timer_test.c:333: PASS: Measured times are within thresholds
> tst_timer_test.c:275: INFO: poll() sleeping for 1000000us 2 iterations, threshold 4400.00us
> tst_timer_test.c:318: INFO: min 1000843us, max 1000920us, median 1000843us, trunc mean 1000843.00us (discarded 1)
> tst_timer_test.c:333: PASS: Measured times are within thresholds
> 
> Summary:
> passed   7
> failed   0
> skipped  0
> warnings 0
> 
> Any ideas?
> 
> Regards,
> Jan
> 


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [LTP] [PATCH V3] ltp: Add a zero latency constraint for the timer tests library
  2017-12-12 14:56                                 ` Daniel Lezcano
@ 2017-12-12 15:04                                   ` Jan Stancek
  2017-12-12 15:21                                     ` Daniel Lezcano
  0 siblings, 1 reply; 29+ messages in thread
From: Jan Stancek @ 2017-12-12 15:04 UTC (permalink / raw)
  To: ltp



----- Original Message -----
> On 12/12/2017 15:48, Jan Stancek wrote:
> > Hi,
> > 
> > I'm running into similar problem on "Dell Precision 5820 tower", with
> > Intel(R) Xeon(R) W-2133 CPU @ 3.60GHz, but I can't find any /proc /sys
> > knob that would help.
> > 
> > # uname -r
> > 4.14.5
> > 
> > # cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
> > performance
> > performance
> > performance
> > performance
> > performance
> > performance
> > performance
> > performance
> > performance
> > performance
> > performance
> > performance
> > 
> > Any timer related tests are reliably failing on longer timeouts:
> > ---
> > tst_test.c:934: INFO: Timeout per run is 0h 05m 00s
> > tst_timer_test.c:356: INFO: CLOCK_MONOTONIC resolution 1ns
> > tst_timer_test.c:368: INFO: prctl(PR_GET_TIMERSLACK) = 50us
> > tst_timer_test.c:275: INFO: poll() sleeping for 1000us 500 iterations,
> > threshold 450.01us
> > tst_timer_test.c:318: INFO: min 1095us, max 1098us, median 1096us, trunc
> > mean 1095.99us (discarded 25)
> > tst_timer_test.c:333: PASS: Measured times are within thresholds
> > tst_timer_test.c:275: INFO: poll() sleeping for 2000us 500 iterations,
> > threshold 450.01us
> > tst_timer_test.c:318: INFO: min 2062us, max 2138us, median 2137us, trunc
> > mean 2135.98us (discarded 25)
> > tst_timer_test.c:333: PASS: Measured times are within thresholds
> > tst_timer_test.c:275: INFO: poll() sleeping for 5000us 300 iterations,
> > threshold 450.04us
> > tst_timer_test.c:318: INFO: min 5262us, max 5263us, median 5262us, trunc
> > mean 5262.20us (discarded 15)
> > tst_timer_test.c:333: PASS: Measured times are within thresholds
> > tst_timer_test.c:275: INFO: poll() sleeping for 10000us 100 iterations,
> > threshold 450.33us
> > tst_timer_test.c:318: INFO: min 10318us, max 10471us, median 10471us, trunc
> > mean 10469.07us (discarded 5)
> > tst_timer_test.c:321: FAIL: poll() slept for too long
> 
> Are you running the tests with this zero latency patch included ?

Yes, I'm running ltp release 20170929, which has your patch.

...
[pid 18276] 09:59:21.379883 open("/dev/cpu_dma_latency", O_WRONLY) = 3
[pid 18276] 09:59:21.379906 write(3, "\0\0\0\0", 4) = 4
...

I tried it without that patch, and it started failing even more, on
smaller timeouts too.

Regards,
Jan

> 
> 
> > SCHED_OTHER or SCHED_FIFO -> FAIL
> > intel_idle.max_cstate=0 processor.max_cstate=1 -> FAIL
> > echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo -> FAIL
> > idle=halt -> FAIL
> > idle=poll -> FAIL
> > 
> > Only thing I found to help is to keep CPU slightly busy with
> >   taskset -c 1 sh -c "while [ True ]; do usleep 100; done"
> > 
> > After that it started to PASS:
> > 
> > # taskset -c 1 ./poll02
> > tst_test.c:934: INFO: Timeout per run is 0h 05m 00s
> > tst_timer_test.c:356: INFO: CLOCK_MONOTONIC resolution 1ns
> > tst_timer_test.c:368: INFO: prctl(PR_GET_TIMERSLACK) = 50us
> > tst_timer_test.c:275: INFO: poll() sleeping for 1000us 500 iterations,
> > threshold 450.01us
> > tst_timer_test.c:318: INFO: min 1004us, max 1325us, median 1072us, trunc
> > mean 1149.29us (discarded 25)
> > tst_timer_test.c:333: PASS: Measured times are within thresholds
> > tst_timer_test.c:275: INFO: poll() sleeping for 2000us 500 iterations,
> > threshold 450.01us
> > tst_timer_test.c:318: INFO: min 2007us, max 2326us, median 2075us, trunc
> > mean 2158.64us (discarded 25)
> > tst_timer_test.c:333: PASS: Measured times are within thresholds
> > tst_timer_test.c:275: INFO: poll() sleeping for 5000us 300 iterations,
> > threshold 450.04us
> > tst_timer_test.c:318: INFO: min 5006us, max 5345us, median 5074us, trunc
> > mean 5146.84us (discarded 15)
> > tst_timer_test.c:333: PASS: Measured times are within thresholds
> > tst_timer_test.c:275: INFO: poll() sleeping for 10000us 100 iterations,
> > threshold 450.33us
> > tst_timer_test.c:318: INFO: min 10004us, max 10364us, median 10075us, trunc
> > mean 10128.61us (discarded 5)
> > tst_timer_test.c:333: PASS: Measured times are within thresholds
> > tst_timer_test.c:275: INFO: poll() sleeping for 25000us 50 iterations,
> > threshold 451.29us
> > tst_timer_test.c:318: INFO: min 25006us, max 25359us, median 25072us, trunc
> > mean 25137.48us (discarded 2)
> > tst_timer_test.c:333: PASS: Measured times are within thresholds
> > tst_timer_test.c:275: INFO: poll() sleeping for 100000us 10 iterations,
> > threshold 537.00us
> > tst_timer_test.c:318: INFO: min 100010us, max 100372us, median 100125us,
> > trunc mean 100167.78us (discarded 1)
> > tst_timer_test.c:333: PASS: Measured times are within thresholds
> > tst_timer_test.c:275: INFO: poll() sleeping for 1000000us 2 iterations,
> > threshold 4400.00us
> > tst_timer_test.c:318: INFO: min 1000843us, max 1000920us, median 1000843us,
> > trunc mean 1000843.00us (discarded 1)
> > tst_timer_test.c:333: PASS: Measured times are within thresholds
> > 
> > Summary:
> > passed   7
> > failed   0
> > skipped  0
> > warnings 0
> > 
> > Any ideas?
> > 
> > Regards,
> > Jan
> > 
> 
> 
> --
>  <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs
> 
> Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
> <http://twitter.com/#!/linaroorg> Twitter |
> <http://www.linaro.org/linaro-blog/> Blog
> 
> 
> --
> Mailing list info: https://lists.linux.it/listinfo/ltp
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [LTP] [PATCH V3] ltp: Add a zero latency constraint for the timer tests library
  2017-12-12 15:04                                   ` Jan Stancek
@ 2017-12-12 15:21                                     ` Daniel Lezcano
  2017-12-13 17:00                                       ` Daniel Lezcano
  2018-02-01 22:52                                       ` Jan Stancek
  0 siblings, 2 replies; 29+ messages in thread
From: Daniel Lezcano @ 2017-12-12 15:21 UTC (permalink / raw)
  To: ltp


On Intel, the firmware usually overrides the kernel's power management
decisions by auto-promoting the idle states.

So it is possible that this is the case for you: a process keeping the CPU
"busy" prevents the firmware from going into the deepest idle states.

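Your taskset loop does exactly that. The same idea as a minimal C sketch
(a hypothetical helper; pin it to a CPU with taskset as you did):

	#include <time.h>

	/* Wake up every 100us so the CPU never stays idle long enough
	 * to be promoted into a deep C-state (mirrors the shell loop
	 * "while [ True ]; do usleep 100; done"). */
	int main(void)
	{
		struct timespec ts = { .tv_sec = 0, .tv_nsec = 100 * 1000 };

		for (;;)
			nanosleep(&ts, NULL);
	}
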
I see it is a Xeon platform. You can try checking the performance/power
balance option in the BIOS. AFAIR, the choices are aggressive performance,
aggressive power, balanced-performance, balanced-power and balanced.

Can you check this option?
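
It may also help to see what cpuidle advertises for your idle states. A
small sketch (assuming the usual sysfs cpuidle interface is present)
printing the exit latency of each idle state on cpu0:

	#include <stdio.h>

	int main(void)
	{
		char path[128], name[64];
		unsigned int latency;
		int i;

		for (i = 0; ; i++) {
			FILE *f;

			/* Each stateN directory exposes a human readable
			 * name and a worst-case exit latency in us. */
			snprintf(path, sizeof(path),
				 "/sys/devices/system/cpu/cpu0/cpuidle/state%d/name", i);
			f = fopen(path, "r");
			if (!f)
				break;
			if (fscanf(f, "%63s", name) != 1)
				name[0] = '\0';
			fclose(f);

			snprintf(path, sizeof(path),
				 "/sys/devices/system/cpu/cpu0/cpuidle/state%d/latency", i);
			f = fopen(path, "r");
			if (!f)
				break;
			if (fscanf(f, "%u", &latency) != 1)
				latency = 0;
			fclose(f);

			printf("%s: %u us\n", name, latency);
		}

		return 0;
	}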

On 12/12/2017 16:04, Jan Stancek wrote:
> 
> 
> ----- Original Message -----
>> On 12/12/2017 15:48, Jan Stancek wrote:
>>> Hi,
>>>
>>> I'm running into similar problem on "Dell Precision 5820 tower", with
>>> Intel(R) Xeon(R) W-2133 CPU @ 3.60GHz, but I can't find any /proc /sys
>>> knob that would help.
>>>
>>> # uname -r
>>> 4.14.5
>>>
>>> # cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
>>> performance
>>> performance
>>> performance
>>> performance
>>> performance
>>> performance
>>> performance
>>> performance
>>> performance
>>> performance
>>> performance
>>> performance
>>>
>>> Any timer related tests are reliably failing on longer timeouts:
>>> ---
>>> tst_test.c:934: INFO: Timeout per run is 0h 05m 00s
>>> tst_timer_test.c:356: INFO: CLOCK_MONOTONIC resolution 1ns
>>> tst_timer_test.c:368: INFO: prctl(PR_GET_TIMERSLACK) = 50us
>>> tst_timer_test.c:275: INFO: poll() sleeping for 1000us 500 iterations,
>>> threshold 450.01us
>>> tst_timer_test.c:318: INFO: min 1095us, max 1098us, median 1096us, trunc
>>> mean 1095.99us (discarded 25)
>>> tst_timer_test.c:333: PASS: Measured times are within thresholds
>>> tst_timer_test.c:275: INFO: poll() sleeping for 2000us 500 iterations,
>>> threshold 450.01us
>>> tst_timer_test.c:318: INFO: min 2062us, max 2138us, median 2137us, trunc
>>> mean 2135.98us (discarded 25)
>>> tst_timer_test.c:333: PASS: Measured times are within thresholds
>>> tst_timer_test.c:275: INFO: poll() sleeping for 5000us 300 iterations,
>>> threshold 450.04us
>>> tst_timer_test.c:318: INFO: min 5262us, max 5263us, median 5262us, trunc
>>> mean 5262.20us (discarded 15)
>>> tst_timer_test.c:333: PASS: Measured times are within thresholds
>>> tst_timer_test.c:275: INFO: poll() sleeping for 10000us 100 iterations,
>>> threshold 450.33us
>>> tst_timer_test.c:318: INFO: min 10318us, max 10471us, median 10471us, trunc
>>> mean 10469.07us (discarded 5)
>>> tst_timer_test.c:321: FAIL: poll() slept for too long
>>
>> Are you running the tests with this zero latency patch included ?
> 
> Yes, I'm running ltp release 20170929, which has your patch.
> 
> ...
> [pid 18276] 09:59:21.379883 open("/dev/cpu_dma_latency", O_WRONLY) = 3
> [pid 18276] 09:59:21.379906 write(3, "\0\0\0\0", 4) = 4
> ...
> 
> I tried it without that patch, and it started failing more with smaller
> timeouts.
> 
> Regards,
> Jan
> 
>>
>>
>>> SCHED_OTHER or SCHED_FIFO -> FAIL
>>> intel_idle.max_cstate=0 processor.max_cstate=1 -> FAIL
>>> echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo -> FAIL
>>> idle=halt -> FAIL
>>> idle=poll -> FAIL
>>>
>>> Only thing I found to help is to keep CPU slightly busy with
>>>   taskset -c 1 sh -c "while [ True ]; do usleep 100; done"
>>>
>>> After that it started to PASS:
>>>
>>> # taskset -c 1 ./poll02
>>> tst_test.c:934: INFO: Timeout per run is 0h 05m 00s
>>> tst_timer_test.c:356: INFO: CLOCK_MONOTONIC resolution 1ns
>>> tst_timer_test.c:368: INFO: prctl(PR_GET_TIMERSLACK) = 50us
>>> tst_timer_test.c:275: INFO: poll() sleeping for 1000us 500 iterations,
>>> threshold 450.01us
>>> tst_timer_test.c:318: INFO: min 1004us, max 1325us, median 1072us, trunc
>>> mean 1149.29us (discarded 25)
>>> tst_timer_test.c:333: PASS: Measured times are within thresholds
>>> tst_timer_test.c:275: INFO: poll() sleeping for 2000us 500 iterations,
>>> threshold 450.01us
>>> tst_timer_test.c:318: INFO: min 2007us, max 2326us, median 2075us, trunc
>>> mean 2158.64us (discarded 25)
>>> tst_timer_test.c:333: PASS: Measured times are within thresholds
>>> tst_timer_test.c:275: INFO: poll() sleeping for 5000us 300 iterations,
>>> threshold 450.04us
>>> tst_timer_test.c:318: INFO: min 5006us, max 5345us, median 5074us, trunc
>>> mean 5146.84us (discarded 15)
>>> tst_timer_test.c:333: PASS: Measured times are within thresholds
>>> tst_timer_test.c:275: INFO: poll() sleeping for 10000us 100 iterations,
>>> threshold 450.33us
>>> tst_timer_test.c:318: INFO: min 10004us, max 10364us, median 10075us, trunc
>>> mean 10128.61us (discarded 5)
>>> tst_timer_test.c:333: PASS: Measured times are within thresholds
>>> tst_timer_test.c:275: INFO: poll() sleeping for 25000us 50 iterations,
>>> threshold 451.29us
>>> tst_timer_test.c:318: INFO: min 25006us, max 25359us, median 25072us, trunc
>>> mean 25137.48us (discarded 2)
>>> tst_timer_test.c:333: PASS: Measured times are within thresholds
>>> tst_timer_test.c:275: INFO: poll() sleeping for 100000us 10 iterations,
>>> threshold 537.00us
>>> tst_timer_test.c:318: INFO: min 100010us, max 100372us, median 100125us,
>>> trunc mean 100167.78us (discarded 1)
>>> tst_timer_test.c:333: PASS: Measured times are within thresholds
>>> tst_timer_test.c:275: INFO: poll() sleeping for 1000000us 2 iterations,
>>> threshold 4400.00us
>>> tst_timer_test.c:318: INFO: min 1000843us, max 1000920us, median 1000843us,
>>> trunc mean 1000843.00us (discarded 1)
>>> tst_timer_test.c:333: PASS: Measured times are within thresholds
>>>
>>> Summary:
>>> passed   7
>>> failed   0
>>> skipped  0
>>> warnings 0
>>>
>>> Any ideas?
>>>
>>> Regards,
>>> Jan
>>>
>>
>>
>> --
>>  <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs
>>
>> Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
>> <http://twitter.com/#!/linaroorg> Twitter |
>> <http://www.linaro.org/linaro-blog/> Blog
>>
>>
>> --
>> Mailing list info: https://lists.linux.it/listinfo/ltp
>>


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [LTP] [PATCH V3] ltp: Add a zero latency constraint for the timer tests library
  2017-12-12 15:21                                     ` Daniel Lezcano
@ 2017-12-13 17:00                                       ` Daniel Lezcano
  2017-12-13 20:42                                         ` Jan Stancek
  2018-02-01 22:52                                       ` Jan Stancek
  1 sibling, 1 reply; 29+ messages in thread
From: Daniel Lezcano @ 2017-12-13 17:00 UTC (permalink / raw)
  To: ltp

On 12/12/2017 16:21, Daniel Lezcano wrote:
> 
> On Intel, the firmware usually overrides the kernel power management
> decision by auto-promoting the idle states.
> 
> So it is possible it is the case for you. With a process keeping the CPU
> "busy" that prevents the firmware to go to a deepest idle state.
> 
> I see it is a Xeon platform. You can try by checking the
> performance/power balance option in the BIOS. AFAIR, there is
> performance aggressive, power aggressive, balanced-performance+ and
> balanced-power+ and balanced.
> 
> Can you check this option ?

So? Any news from the front?

> On 12/12/2017 16:04, Jan Stancek wrote:
>>
>>
>> ----- Original Message -----
>>> On 12/12/2017 15:48, Jan Stancek wrote:
>>>> Hi,
>>>>
>>>> I'm running into similar problem on "Dell Precision 5820 tower", with
>>>> Intel(R) Xeon(R) W-2133 CPU @ 3.60GHz, but I can't find any /proc /sys
>>>> knob that would help.
>>>>
>>>> # uname -r
>>>> 4.14.5
>>>>
>>>> # cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
>>>> performance
>>>> performance
>>>> performance
>>>> performance
>>>> performance
>>>> performance
>>>> performance
>>>> performance
>>>> performance
>>>> performance
>>>> performance
>>>> performance
>>>>
>>>> Any timer related tests are reliably failing on longer timeouts:
>>>> ---
>>>> tst_test.c:934: INFO: Timeout per run is 0h 05m 00s
>>>> tst_timer_test.c:356: INFO: CLOCK_MONOTONIC resolution 1ns
>>>> tst_timer_test.c:368: INFO: prctl(PR_GET_TIMERSLACK) = 50us
>>>> tst_timer_test.c:275: INFO: poll() sleeping for 1000us 500 iterations,
>>>> threshold 450.01us
>>>> tst_timer_test.c:318: INFO: min 1095us, max 1098us, median 1096us, trunc
>>>> mean 1095.99us (discarded 25)
>>>> tst_timer_test.c:333: PASS: Measured times are within thresholds
>>>> tst_timer_test.c:275: INFO: poll() sleeping for 2000us 500 iterations,
>>>> threshold 450.01us
>>>> tst_timer_test.c:318: INFO: min 2062us, max 2138us, median 2137us, trunc
>>>> mean 2135.98us (discarded 25)
>>>> tst_timer_test.c:333: PASS: Measured times are within thresholds
>>>> tst_timer_test.c:275: INFO: poll() sleeping for 5000us 300 iterations,
>>>> threshold 450.04us
>>>> tst_timer_test.c:318: INFO: min 5262us, max 5263us, median 5262us, trunc
>>>> mean 5262.20us (discarded 15)
>>>> tst_timer_test.c:333: PASS: Measured times are within thresholds
>>>> tst_timer_test.c:275: INFO: poll() sleeping for 10000us 100 iterations,
>>>> threshold 450.33us
>>>> tst_timer_test.c:318: INFO: min 10318us, max 10471us, median 10471us, trunc
>>>> mean 10469.07us (discarded 5)
>>>> tst_timer_test.c:321: FAIL: poll() slept for too long
>>>
>>> Are you running the tests with this zero latency patch included ?
>>
>> Yes, I'm running ltp release 20170929, which has your patch.
>>
>> ...
>> [pid 18276] 09:59:21.379883 open("/dev/cpu_dma_latency", O_WRONLY) = 3
>> [pid 18276] 09:59:21.379906 write(3, "\0\0\0\0", 4) = 4
>> ...
>>
>> I tried it without that patch, and it started failing more with smaller
>> timeouts.
>>
>> Regards,
>> Jan
>>
>>>
>>>
>>>> SCHED_OTHER or SCHED_FIFO -> FAIL
>>>> intel_idle.max_cstate=0 processor.max_cstate=1 -> FAIL
>>>> echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo -> FAIL
>>>> idle=halt -> FAIL
>>>> idle=poll -> FAIL
>>>>
>>>> Only thing I found to help is to keep CPU slightly busy with
>>>>   taskset -c 1 sh -c "while [ True ]; do usleep 100; done"
>>>>
>>>> After that it started to PASS:
>>>>
>>>> # taskset -c 1 ./poll02
>>>> tst_test.c:934: INFO: Timeout per run is 0h 05m 00s
>>>> tst_timer_test.c:356: INFO: CLOCK_MONOTONIC resolution 1ns
>>>> tst_timer_test.c:368: INFO: prctl(PR_GET_TIMERSLACK) = 50us
>>>> tst_timer_test.c:275: INFO: poll() sleeping for 1000us 500 iterations,
>>>> threshold 450.01us
>>>> tst_timer_test.c:318: INFO: min 1004us, max 1325us, median 1072us, trunc
>>>> mean 1149.29us (discarded 25)
>>>> tst_timer_test.c:333: PASS: Measured times are within thresholds
>>>> tst_timer_test.c:275: INFO: poll() sleeping for 2000us 500 iterations,
>>>> threshold 450.01us
>>>> tst_timer_test.c:318: INFO: min 2007us, max 2326us, median 2075us, trunc
>>>> mean 2158.64us (discarded 25)
>>>> tst_timer_test.c:333: PASS: Measured times are within thresholds
>>>> tst_timer_test.c:275: INFO: poll() sleeping for 5000us 300 iterations,
>>>> threshold 450.04us
>>>> tst_timer_test.c:318: INFO: min 5006us, max 5345us, median 5074us, trunc
>>>> mean 5146.84us (discarded 15)
>>>> tst_timer_test.c:333: PASS: Measured times are within thresholds
>>>> tst_timer_test.c:275: INFO: poll() sleeping for 10000us 100 iterations,
>>>> threshold 450.33us
>>>> tst_timer_test.c:318: INFO: min 10004us, max 10364us, median 10075us, trunc
>>>> mean 10128.61us (discarded 5)
>>>> tst_timer_test.c:333: PASS: Measured times are within thresholds
>>>> tst_timer_test.c:275: INFO: poll() sleeping for 25000us 50 iterations,
>>>> threshold 451.29us
>>>> tst_timer_test.c:318: INFO: min 25006us, max 25359us, median 25072us, trunc
>>>> mean 25137.48us (discarded 2)
>>>> tst_timer_test.c:333: PASS: Measured times are within thresholds
>>>> tst_timer_test.c:275: INFO: poll() sleeping for 100000us 10 iterations,
>>>> threshold 537.00us
>>>> tst_timer_test.c:318: INFO: min 100010us, max 100372us, median 100125us,
>>>> trunc mean 100167.78us (discarded 1)
>>>> tst_timer_test.c:333: PASS: Measured times are within thresholds
>>>> tst_timer_test.c:275: INFO: poll() sleeping for 1000000us 2 iterations,
>>>> threshold 4400.00us
>>>> tst_timer_test.c:318: INFO: min 1000843us, max 1000920us, median 1000843us,
>>>> trunc mean 1000843.00us (discarded 1)
>>>> tst_timer_test.c:333: PASS: Measured times are within thresholds
>>>>
>>>> Summary:
>>>> passed   7
>>>> failed   0
>>>> skipped  0
>>>> warnings 0
>>>>
>>>> Any ideas?
>>>>
>>>> Regards,
>>>> Jan
>>>>
>>>
>>>
>>> --
>>>  <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs
>>>
>>> Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
>>> <http://twitter.com/#!/linaroorg> Twitter |
>>> <http://www.linaro.org/linaro-blog/> Blog
>>>
>>>
>>> --
>>> Mailing list info: https://lists.linux.it/listinfo/ltp
>>>
> 
> 


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [LTP] [PATCH V3] ltp: Add a zero latency constraint for the timer tests library
  2017-12-13 17:00                                       ` Daniel Lezcano
@ 2017-12-13 20:42                                         ` Jan Stancek
  0 siblings, 0 replies; 29+ messages in thread
From: Jan Stancek @ 2017-12-13 20:42 UTC (permalink / raw)
  To: ltp



----- Original Message -----
> On 12/12/2017 16:21, Daniel Lezcano wrote:
> > 
> > On Intel, the firmware usually overrides the kernel power management
> > decision by auto-promoting the idle states.

Yeah, I was saving the BIOS as a last option.

> > 
> > So it is possible it is the case for you. With a process keeping the CPU
> > "busy" that prevents the firmware to go to a deepest idle state.
> > 
> > I see it is a Xeon platform. You can try by checking the
> > performance/power balance option in the BIOS. AFAIR, there is
> > performance aggressive, power aggressive, balanced-performance+ and
> > balanced-power+ and balanced.
> > 
> > Can you check this option ?
> 
> So ? any news from the front ?

Not yet, it's a remote system and I can't get into the BIOS atm.

> 
> > On 12/12/2017 16:04, Jan Stancek wrote:
> >>
> >>
> >> ----- Original Message -----
> >>> On 12/12/2017 15:48, Jan Stancek wrote:
> >>>> Hi,
> >>>>
> >>>> I'm running into similar problem on "Dell Precision 5820 tower", with
> >>>> Intel(R) Xeon(R) W-2133 CPU @ 3.60GHz, but I can't find any /proc /sys
> >>>> knob that would help.
> >>>>
> >>>> # uname -r
> >>>> 4.14.5
> >>>>
> >>>> # cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
> >>>> performance
> >>>> performance
> >>>> performance
> >>>> performance
> >>>> performance
> >>>> performance
> >>>> performance
> >>>> performance
> >>>> performance
> >>>> performance
> >>>> performance
> >>>> performance
> >>>>
> >>>> Any timer related tests are reliably failing on longer timeouts:
> >>>> ---
> >>>> tst_test.c:934: INFO: Timeout per run is 0h 05m 00s
> >>>> tst_timer_test.c:356: INFO: CLOCK_MONOTONIC resolution 1ns
> >>>> tst_timer_test.c:368: INFO: prctl(PR_GET_TIMERSLACK) = 50us
> >>>> tst_timer_test.c:275: INFO: poll() sleeping for 1000us 500 iterations,
> >>>> threshold 450.01us
> >>>> tst_timer_test.c:318: INFO: min 1095us, max 1098us, median 1096us, trunc
> >>>> mean 1095.99us (discarded 25)
> >>>> tst_timer_test.c:333: PASS: Measured times are within thresholds
> >>>> tst_timer_test.c:275: INFO: poll() sleeping for 2000us 500 iterations,
> >>>> threshold 450.01us
> >>>> tst_timer_test.c:318: INFO: min 2062us, max 2138us, median 2137us, trunc
> >>>> mean 2135.98us (discarded 25)
> >>>> tst_timer_test.c:333: PASS: Measured times are within thresholds
> >>>> tst_timer_test.c:275: INFO: poll() sleeping for 5000us 300 iterations,
> >>>> threshold 450.04us
> >>>> tst_timer_test.c:318: INFO: min 5262us, max 5263us, median 5262us, trunc
> >>>> mean 5262.20us (discarded 15)
> >>>> tst_timer_test.c:333: PASS: Measured times are within thresholds
> >>>> tst_timer_test.c:275: INFO: poll() sleeping for 10000us 100 iterations,
> >>>> threshold 450.33us
> >>>> tst_timer_test.c:318: INFO: min 10318us, max 10471us, median 10471us,
> >>>> trunc
> >>>> mean 10469.07us (discarded 5)
> >>>> tst_timer_test.c:321: FAIL: poll() slept for too long
> >>>
> >>> Are you running the tests with this zero latency patch included ?
> >>
> >> Yes, I'm running ltp release 20170929, which has your patch.
> >>
> >> ...
> >> [pid 18276] 09:59:21.379883 open("/dev/cpu_dma_latency", O_WRONLY) = 3
> >> [pid 18276] 09:59:21.379906 write(3, "\0\0\0\0", 4) = 4
> >> ...
> >>
> >> I tried it without that patch, and it started failing more with smaller
> >> timeouts.
> >>
> >> Regards,
> >> Jan
> >>
> >>>
> >>>
> >>>> SCHED_OTHER or SCHED_FIFO -> FAIL
> >>>> intel_idle.max_cstate=0 processor.max_cstate=1 -> FAIL
> >>>> echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo -> FAIL
> >>>> idle=halt -> FAIL
> >>>> idle=poll -> FAIL
> >>>>
> >>>> Only thing I found to help is to keep CPU slightly busy with
> >>>>   taskset -c 1 sh -c "while [ True ]; do usleep 100; done"
> >>>>
> >>>> After that it started to PASS:
> >>>>
> >>>> # taskset -c 1 ./poll02
> >>>> tst_test.c:934: INFO: Timeout per run is 0h 05m 00s
> >>>> tst_timer_test.c:356: INFO: CLOCK_MONOTONIC resolution 1ns
> >>>> tst_timer_test.c:368: INFO: prctl(PR_GET_TIMERSLACK) = 50us
> >>>> tst_timer_test.c:275: INFO: poll() sleeping for 1000us 500 iterations,
> >>>> threshold 450.01us
> >>>> tst_timer_test.c:318: INFO: min 1004us, max 1325us, median 1072us, trunc
> >>>> mean 1149.29us (discarded 25)
> >>>> tst_timer_test.c:333: PASS: Measured times are within thresholds
> >>>> tst_timer_test.c:275: INFO: poll() sleeping for 2000us 500 iterations,
> >>>> threshold 450.01us
> >>>> tst_timer_test.c:318: INFO: min 2007us, max 2326us, median 2075us, trunc
> >>>> mean 2158.64us (discarded 25)
> >>>> tst_timer_test.c:333: PASS: Measured times are within thresholds
> >>>> tst_timer_test.c:275: INFO: poll() sleeping for 5000us 300 iterations,
> >>>> threshold 450.04us
> >>>> tst_timer_test.c:318: INFO: min 5006us, max 5345us, median 5074us, trunc
> >>>> mean 5146.84us (discarded 15)
> >>>> tst_timer_test.c:333: PASS: Measured times are within thresholds
> >>>> tst_timer_test.c:275: INFO: poll() sleeping for 10000us 100 iterations,
> >>>> threshold 450.33us
> >>>> tst_timer_test.c:318: INFO: min 10004us, max 10364us, median 10075us,
> >>>> trunc
> >>>> mean 10128.61us (discarded 5)
> >>>> tst_timer_test.c:333: PASS: Measured times are within thresholds
> >>>> tst_timer_test.c:275: INFO: poll() sleeping for 25000us 50 iterations,
> >>>> threshold 451.29us
> >>>> tst_timer_test.c:318: INFO: min 25006us, max 25359us, median 25072us,
> >>>> trunc
> >>>> mean 25137.48us (discarded 2)
> >>>> tst_timer_test.c:333: PASS: Measured times are within thresholds
> >>>> tst_timer_test.c:275: INFO: poll() sleeping for 100000us 10 iterations,
> >>>> threshold 537.00us
> >>>> tst_timer_test.c:318: INFO: min 100010us, max 100372us, median 100125us,
> >>>> trunc mean 100167.78us (discarded 1)
> >>>> tst_timer_test.c:333: PASS: Measured times are within thresholds
> >>>> tst_timer_test.c:275: INFO: poll() sleeping for 1000000us 2 iterations,
> >>>> threshold 4400.00us
> >>>> tst_timer_test.c:318: INFO: min 1000843us, max 1000920us, median
> >>>> 1000843us,
> >>>> trunc mean 1000843.00us (discarded 1)
> >>>> tst_timer_test.c:333: PASS: Measured times are within thresholds
> >>>>
> >>>> Summary:
> >>>> passed   7
> >>>> failed   0
> >>>> skipped  0
> >>>> warnings 0
> >>>>
> >>>> Any ideas?
> >>>>
> >>>> Regards,
> >>>> Jan
> >>>>
> >>>
> >>>
> >>> --
> >>>  <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs
> >>>
> >>> Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
> >>> <http://twitter.com/#!/linaroorg> Twitter |
> >>> <http://www.linaro.org/linaro-blog/> Blog
> >>>
> >>>
> >>> --
> >>> Mailing list info: https://lists.linux.it/listinfo/ltp
> >>>
> > 
> > 
> 
> 
> --
>  <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs
> 
> Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
> <http://twitter.com/#!/linaroorg> Twitter |
> <http://www.linaro.org/linaro-blog/> Blog
> 
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [LTP] [PATCH V3] ltp: Add a zero latency constraint for the timer tests library
  2017-12-12 15:21                                     ` Daniel Lezcano
  2017-12-13 17:00                                       ` Daniel Lezcano
@ 2018-02-01 22:52                                       ` Jan Stancek
  1 sibling, 0 replies; 29+ messages in thread
From: Jan Stancek @ 2018-02-01 22:52 UTC (permalink / raw)
  To: ltp



----- Original Message -----
> 
> On Intel, the firmware usually overrides the kernel power management
> decision by auto-promoting the idle states.
> 
> So it is possible it is the case for you. With a process keeping the CPU
> "busy" that prevents the firmware to go to a deepest idle state.
> 
> I see it is a Xeon platform. You can try by checking the
> performance/power balance option in the BIOS. AFAIR, there is
> performance aggressive, power aggressive, balanced-performance+ and
> balanced-power+ and balanced.
> 
> Can you check this option ?

I played a lot with the BIOS settings, but none of:
 speedstep, c-states, turboboost, S3 disable
made any difference.

As it turned out, this is a kernel bug which was fixed in 4.15 by:
  commit b511203093489eb1829cb4de86e8214752205ac6
  Author: Len Brown <len.brown@intel.com>
  Date:   Fri Dec 22 00:27:55 2017 -0500
    x86/tsc: Fix erroneous TSC rate on Skylake Xeon

Numbers look great with this patch:

tst_timer_test.c:356: INFO: CLOCK_MONOTONIC resolution 1ns
tst_timer_test.c:368: INFO: prctl(PR_GET_TIMERSLACK) = 50us
tst_timer_test.c:275: INFO: poll() sleeping for 1000us 500 iterations, threshold 450.01us
tst_timer_test.c:318: INFO: min 1052us, max 1056us, median 1052us, trunc mean 1052.00us (discarded 25)
tst_timer_test.c:333: PASS: Measured times are within thresholds
tst_timer_test.c:275: INFO: poll() sleeping for 2000us 500 iterations, threshold 450.01us
tst_timer_test.c:318: INFO: min 2019us, max 2055us, median 2052us, trunc mean 2051.95us (discarded 25)
tst_timer_test.c:333: PASS: Measured times are within thresholds
tst_timer_test.c:275: INFO: poll() sleeping for 5000us 300 iterations, threshold 450.04us
tst_timer_test.c:318: INFO: min 5052us, max 5053us, median 5052us, trunc mean 5052.15us (discarded 15)
tst_timer_test.c:333: PASS: Measured times are within thresholds
tst_timer_test.c:275: INFO: poll() sleeping for 10000us 100 iterations, threshold 450.33us
tst_timer_test.c:318: INFO: min 10052us, max 10053us, median 10052us, trunc mean 10052.26us (discarded 5)
tst_timer_test.c:333: PASS: Measured times are within thresholds
tst_timer_test.c:275: INFO: poll() sleeping for 25000us 50 iterations, threshold 451.29us
tst_timer_test.c:318: INFO: min 25052us, max 25053us, median 25052us, trunc mean 25052.46us (discarded 2)
tst_timer_test.c:333: PASS: Measured times are within thresholds
tst_timer_test.c:275: INFO: poll() sleeping for 100000us 10 iterations, threshold 537.00us
tst_timer_test.c:318: INFO: min 100103us, max 100103us, median 100103us, trunc mean 100103.00us (discarded 1)
tst_timer_test.c:333: PASS: Measured times are within thresholds
tst_timer_test.c:275: INFO: poll() sleeping for 1000000us 2 iterations, threshold 4400.00us
tst_timer_test.c:318: INFO: min 1001003us, max 1001004us, median 1001003us, trunc mean 1001003.00us (discarded 1)
tst_timer_test.c:333: PASS: Measured times are within thresholds

Regards,
Jan

> 
> On 12/12/2017 16:04, Jan Stancek wrote:
> > 
> > 
> > ----- Original Message -----
> >> On 12/12/2017 15:48, Jan Stancek wrote:
> >>> Hi,
> >>>
> >>> I'm running into similar problem on "Dell Precision 5820 tower", with
> >>> Intel(R) Xeon(R) W-2133 CPU @ 3.60GHz, but I can't find any /proc /sys
> >>> knob that would help.
> >>>
> >>> # uname -r
> >>> 4.14.5
> >>>
> >>> # cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
> >>> performance
> >>> performance
> >>> performance
> >>> performance
> >>> performance
> >>> performance
> >>> performance
> >>> performance
> >>> performance
> >>> performance
> >>> performance
> >>> performance
> >>>
> >>> Any timer related tests are reliably failing on longer timeouts:
> >>> ---
> >>> tst_test.c:934: INFO: Timeout per run is 0h 05m 00s
> >>> tst_timer_test.c:356: INFO: CLOCK_MONOTONIC resolution 1ns
> >>> tst_timer_test.c:368: INFO: prctl(PR_GET_TIMERSLACK) = 50us
> >>> tst_timer_test.c:275: INFO: poll() sleeping for 1000us 500 iterations,
> >>> threshold 450.01us
> >>> tst_timer_test.c:318: INFO: min 1095us, max 1098us, median 1096us, trunc
> >>> mean 1095.99us (discarded 25)
> >>> tst_timer_test.c:333: PASS: Measured times are within thresholds
> >>> tst_timer_test.c:275: INFO: poll() sleeping for 2000us 500 iterations,
> >>> threshold 450.01us
> >>> tst_timer_test.c:318: INFO: min 2062us, max 2138us, median 2137us, trunc
> >>> mean 2135.98us (discarded 25)
> >>> tst_timer_test.c:333: PASS: Measured times are within thresholds
> >>> tst_timer_test.c:275: INFO: poll() sleeping for 5000us 300 iterations,
> >>> threshold 450.04us
> >>> tst_timer_test.c:318: INFO: min 5262us, max 5263us, median 5262us, trunc
> >>> mean 5262.20us (discarded 15)
> >>> tst_timer_test.c:333: PASS: Measured times are within thresholds
> >>> tst_timer_test.c:275: INFO: poll() sleeping for 10000us 100 iterations,
> >>> threshold 450.33us
> >>> tst_timer_test.c:318: INFO: min 10318us, max 10471us, median 10471us,
> >>> trunc
> >>> mean 10469.07us (discarded 5)
> >>> tst_timer_test.c:321: FAIL: poll() slept for too long
> >>
> >> Are you running the tests with this zero latency patch included ?
> > 
> > Yes, I'm running ltp release 20170929, which has your patch.
> > 
> > ...
> > [pid 18276] 09:59:21.379883 open("/dev/cpu_dma_latency", O_WRONLY) = 3
> > [pid 18276] 09:59:21.379906 write(3, "\0\0\0\0", 4) = 4
> > ...
> > 
> > I tried it without that patch, and it started failing more with smaller
> > timeouts.
> > 
> > Regards,
> > Jan
> > 
> >>
> >>
> >>> SCHED_OTHER or SCHED_FIFO -> FAIL
> >>> intel_idle.max_cstate=0 processor.max_cstate=1 -> FAIL
> >>> echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo -> FAIL
> >>> idle=halt -> FAIL
> >>> idle=poll -> FAIL
> >>>
> >>> Only thing I found to help is to keep CPU slightly busy with
> >>>   taskset -c 1 sh -c "while [ True ]; do usleep 100; done"
> >>>
> >>> After that it started to PASS:
> >>>
> >>> # taskset -c 1 ./poll02
> >>> tst_test.c:934: INFO: Timeout per run is 0h 05m 00s
> >>> tst_timer_test.c:356: INFO: CLOCK_MONOTONIC resolution 1ns
> >>> tst_timer_test.c:368: INFO: prctl(PR_GET_TIMERSLACK) = 50us
> >>> tst_timer_test.c:275: INFO: poll() sleeping for 1000us 500 iterations,
> >>> threshold 450.01us
> >>> tst_timer_test.c:318: INFO: min 1004us, max 1325us, median 1072us, trunc
> >>> mean 1149.29us (discarded 25)
> >>> tst_timer_test.c:333: PASS: Measured times are within thresholds
> >>> tst_timer_test.c:275: INFO: poll() sleeping for 2000us 500 iterations,
> >>> threshold 450.01us
> >>> tst_timer_test.c:318: INFO: min 2007us, max 2326us, median 2075us, trunc
> >>> mean 2158.64us (discarded 25)
> >>> tst_timer_test.c:333: PASS: Measured times are within thresholds
> >>> tst_timer_test.c:275: INFO: poll() sleeping for 5000us 300 iterations,
> >>> threshold 450.04us
> >>> tst_timer_test.c:318: INFO: min 5006us, max 5345us, median 5074us, trunc
> >>> mean 5146.84us (discarded 15)
> >>> tst_timer_test.c:333: PASS: Measured times are within thresholds
> >>> tst_timer_test.c:275: INFO: poll() sleeping for 10000us 100 iterations,
> >>> threshold 450.33us
> >>> tst_timer_test.c:318: INFO: min 10004us, max 10364us, median 10075us,
> >>> trunc
> >>> mean 10128.61us (discarded 5)
> >>> tst_timer_test.c:333: PASS: Measured times are within thresholds
> >>> tst_timer_test.c:275: INFO: poll() sleeping for 25000us 50 iterations,
> >>> threshold 451.29us
> >>> tst_timer_test.c:318: INFO: min 25006us, max 25359us, median 25072us,
> >>> trunc
> >>> mean 25137.48us (discarded 2)
> >>> tst_timer_test.c:333: PASS: Measured times are within thresholds
> >>> tst_timer_test.c:275: INFO: poll() sleeping for 100000us 10 iterations,
> >>> threshold 537.00us
> >>> tst_timer_test.c:318: INFO: min 100010us, max 100372us, median 100125us,
> >>> trunc mean 100167.78us (discarded 1)
> >>> tst_timer_test.c:333: PASS: Measured times are within thresholds
> >>> tst_timer_test.c:275: INFO: poll() sleeping for 1000000us 2 iterations,
> >>> threshold 4400.00us
> >>> tst_timer_test.c:318: INFO: min 1000843us, max 1000920us, median
> >>> 1000843us,
> >>> trunc mean 1000843.00us (discarded 1)
> >>> tst_timer_test.c:333: PASS: Measured times are within thresholds
> >>>
> >>> Summary:
> >>> passed   7
> >>> failed   0
> >>> skipped  0
> >>> warnings 0
> >>>
> >>> Any ideas?
> >>>
> >>> Regards,
> >>> Jan
> >>>


^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2018-02-01 22:52 UTC | newest]

Thread overview: 29+ messages
2017-08-10  8:01 [LTP] [PATCH 1/2] ltp: Add the ability to specify the latency constraint Daniel Lezcano
2017-08-10  8:01 ` [LTP] [PATCH 2/2] syscalls/pselect: Add a zero " Daniel Lezcano
2017-08-10 11:50   ` Jiri Jaburek
2017-08-10 12:00     ` Daniel Lezcano
2017-08-11 11:26       ` Jan Stancek
2017-08-11 11:25 ` [LTP] [PATCH 1/2] ltp: Add the ability to specify the " Jan Stancek
2017-08-11 12:54   ` [LTP] [PATCH V2 " Daniel Lezcano
2017-08-11 12:54     ` [LTP] [PATCH V2 2/2] syscalls/pselect: Add a zero " Daniel Lezcano
2017-08-11 14:09     ` [LTP] [PATCH V2 1/2] ltp: Add the ability to specify the " Cyril Hrubis
2017-08-11 14:52       ` Daniel Lezcano
2017-08-11 15:28         ` Cyril Hrubis
2017-08-14 12:56           ` Daniel Lezcano
2017-08-14 13:33             ` Cyril Hrubis
2017-08-14 14:19               ` Daniel Lezcano
2017-08-14 14:36                 ` Cyril Hrubis
2017-08-14 15:43                   ` Daniel Lezcano
2017-08-15 11:06                     ` Cyril Hrubis
2017-08-15 20:15                       ` Daniel Lezcano
2017-08-17 13:50                         ` Cyril Hrubis
2017-08-17 14:02                           ` Daniel Lezcano
2017-08-17 15:00                           ` [LTP] [PATCH V3] ltp: Add a zero latency constraint for the timer tests library Daniel Lezcano
2017-08-18 12:25                             ` Cyril Hrubis
2017-12-12 14:48                               ` Jan Stancek
2017-12-12 14:56                                 ` Daniel Lezcano
2017-12-12 15:04                                   ` Jan Stancek
2017-12-12 15:21                                     ` Daniel Lezcano
2017-12-13 17:00                                       ` Daniel Lezcano
2017-12-13 20:42                                         ` Jan Stancek
2018-02-01 22:52                                       ` Jan Stancek
