* Strange problem with PREEMPT_RT
@ 2021-09-29 16:40 Pierre FICHEUX
  2021-09-30  0:13 ` Punit Agrawal
  0 siblings, 1 reply; 21+ messages in thread
From: Pierre FICHEUX @ 2021-09-29 16:40 UTC (permalink / raw)
  To: linux-rt-users

Hi,

I have a strange problem on a PREEMPT_RT system.

I have a process with 2 threads:

- 1 RT thread (10 ms period) which places 350 KB blocks in a FIFO (1
block every 10 ms).
- 1 non-RT thread (SCHED_OTHER) which reads each block from the FIFO and
writes it to the disk.

If I run this on a powerful machine (HP Z4-i9, 14 cores, NVME disk,
CentOS 7 with 3.10 PREEMPT_RT kernel, yes that's ooold), the max
jitter WITH hackbench remains around 20 to 30 µs while the max jitter
WITHOUT hackbench goes up to 350 µs!

-> hackbench -p -g 20 -l 10000000

Running the program with taskset 01 doesn't change anything.
If I don't write the data to disk, it doesn't change anything either.

The large jitter values appear mostly at the beginning (but sometimes
also later).

Any ideas?


Thanks in advance

--
Pierre
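
For readers who want to reproduce the measurement, the following is a
minimal sketch of a periodic loop that records the worst wake-up
lateness. It is not Pierre's program: the function names are
hypothetical, and the sketch assumes the calling thread has already
been given an RT policy (for example SCHED_FIFO) and that
CLOCK_MONOTONIC with an absolute deadline is an acceptable time base.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <stdio.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000LL

/* Advance an absolute deadline by ns (>= 0) nanoseconds, normalising tv_nsec. */
static void timespec_add_ns(struct timespec *t, int64_t ns)
{
    int64_t total = (int64_t)t->tv_nsec + ns;

    t->tv_sec += total / NSEC_PER_SEC;
    t->tv_nsec = total % NSEC_PER_SEC;
}

/* Signed difference a - b in nanoseconds. */
static int64_t timespec_diff_ns(const struct timespec *a,
                                const struct timespec *b)
{
    return (int64_t)(a->tv_sec - b->tv_sec) * NSEC_PER_SEC
           + (a->tv_nsec - b->tv_nsec);
}

/* Run `loops` periods of `period_ns`; return the worst wake-up lateness in ns. */
static int64_t measure_max_jitter_ns(int64_t period_ns, int loops)
{
    struct timespec next, now;
    int64_t max_late = 0;

    clock_gettime(CLOCK_MONOTONIC, &next);
    for (int i = 0; i < loops; i++) {
        timespec_add_ns(&next, period_ns);
        /* Sleep until an absolute deadline so errors do not accumulate. */
        clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
        clock_gettime(CLOCK_MONOTONIC, &now);

        int64_t late = timespec_diff_ns(&now, &next);
        if (late > max_late)
            max_late = late;
    }
    return max_late;
}
```

Calling measure_max_jitter_ns(10000000, 1000) corresponds to 1000
periods of 10 ms. Comparing the result with and without a background
load such as hackbench is one way to reproduce the effect described
above.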


* Re: Strange problem with PREEMPT_RT
  2021-09-29 16:40 Strange problem with PREEMPT_RT Pierre FICHEUX
@ 2021-09-30  0:13 ` Punit Agrawal
  2021-09-30  8:21   ` Pierre FICHEUX
  2021-09-30 13:23   ` Sebastian Andrzej Siewior
  0 siblings, 2 replies; 21+ messages in thread
From: Punit Agrawal @ 2021-09-30  0:13 UTC (permalink / raw)
  To: Pierre FICHEUX; +Cc: linux-rt-users

Pierre FICHEUX <pierre.ficheux@smile.fr> writes:

> Hi,
>
> I have a strange problem on a PREEMPT_RT system.
>
> I have a process with 2 threads:
>
> - 1 RT thread (10 ms period) which places 350 KB blocks in a FIFO (1
> block every 10 ms).
> - 1 non-RT thread (SCHED_OTHER) which reads each block from the FIFO and
> writes it to the disk.
>
> If I run this on a powerful machine (HP Z4-i9, 14 cores, NVME disk,
> CentOS 7 with 3.10 PREEMPT_RT kernel, yes that's ooold), the max
> jitter WITH hackbench remains around 20 to 30 µs while the max jitter
> WITHOUT hackbench goes up to 350 µs!
>
> -> hackbench -p -g 20 -l 10000000
>
> Running the program with  taskset 01 doesn't change anything
> If I don't write the data to disk it doesn't change anything either.
>
> The important jitters appear rather at the beginning (but sometimes also later).
>
> Any ideas ?

Is power management (cpuidle, cpufreq) enabled on the system?

One possible explanation:

The load from the 10 ms task isn't high enough to keep the system at
high frequencies or to prevent it from going into deeper sleep states.
Both of these can impact latencies.

With hackbench, the system is sufficiently busy to avoid going idle.

>
>
> thanks by advance
>
> --
> Pierre
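
One way to check the frequency-scaling half of this explanation is to
read the cpufreq attributes from sysfs. The sketch below is only an
illustration: the helper names are hypothetical, and it assumes the
kernel exposes scaling_governor, scaling_cur_freq and cpuinfo_max_freq
under /sys/devices/system/cpu/cpuN/cpufreq/, which may not be the case
on every kernel or driver.

```c
#include <stdio.h>
#include <string.h>

/*
 * Read the first line of a sysfs-style text file into buf and strip the
 * trailing newline. Returns 0 on success, -1 on failure.
 */
static int read_first_line(const char *path, char *buf, size_t len)
{
    FILE *f = fopen(path, "r");

    if (!f)
        return -1;
    if (!fgets(buf, (int)len, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}

/* Print the governor and the current and maximum frequency of one CPU. */
static void print_cpufreq(int cpu)
{
    static const char *attrs[] = {
        "scaling_governor", "scaling_cur_freq", "cpuinfo_max_freq"
    };
    char path[128], val[64];

    for (size_t i = 0; i < sizeof(attrs) / sizeof(attrs[0]); i++) {
        snprintf(path, sizeof(path),
                 "/sys/devices/system/cpu/cpu%d/cpufreq/%s", cpu, attrs[i]);
        if (read_first_line(path, val, sizeof(val)) == 0)
            printf("cpu%d %s: %s\n", cpu, attrs[i], val);
        else
            printf("cpu%d %s: <not available>\n", cpu, attrs[i]);
    }
}
```

If scaling_cur_freq sits well below cpuinfo_max_freq while only the
10 ms task is running, frequency scaling is a plausible contributor.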


* Re: Strange problem with PREEMPT_RT
  2021-09-30  0:13 ` Punit Agrawal
@ 2021-09-30  8:21   ` Pierre FICHEUX
  2021-09-30 13:17     ` Luis Goncalves
  2021-09-30 13:18     ` Sebastian Andrzej Siewior
  2021-09-30 13:23   ` Sebastian Andrzej Siewior
  1 sibling, 2 replies; 21+ messages in thread
From: Pierre FICHEUX @ 2021-09-30  8:21 UTC (permalink / raw)
  To: Punit Agrawal; +Cc: linux-rt-users

Hi,

Thanks a lot for your help (I suspected something like this).

I can see CONFIG_CPU_IDLE=y in the config, but as it's an old kernel
(3.10) there is no entry such as
/sys/devices/system/cpu/cpu?/cpuidle/state?/disable

I will look into it further.

regards



-- 

Pierre FICHEUX -/- CTO Smile ECS, France -\- pierre.ficheux@smile.fr
                             http://www.smile.fr
                             https://smile.eu/fr/offres/embarque-iot
I would love to change the world, but they won't give me the source code


* Re: Strange problem with PREEMPT_RT
  2021-09-30  8:21   ` Pierre FICHEUX
@ 2021-09-30 13:17     ` Luis Goncalves
  2021-09-30 13:18     ` Sebastian Andrzej Siewior
  1 sibling, 0 replies; 21+ messages in thread
From: Luis Goncalves @ 2021-09-30 13:17 UTC (permalink / raw)
  To: Pierre FICHEUX; +Cc: Punit Agrawal, Linux RT Users

On Thu, Sep 30, 2021 at 5:22 AM Pierre FICHEUX <pierre.ficheux@smile.fr> wrote:
>
> Hi,
>
> Thx a lot for your help (I suspected such a thing).
>
> I can see CONFIG_CPU_IDLE=y in the config but as it's an old kernel
> (3.10) there is no entry such as
> /sys/devices/system/cpu/cpu?/cpuidle/state?/disable
>
> Will take a look further.

You probably have tuned installed on your system. If so, you could run:

    tuned-adm profile realtime

That should set most of the required and suggested configurations, including
the ones Punit wisely mentioned.

Best regards,
Luis


* Re: Strange problem with PREEMPT_RT
  2021-09-30  8:21   ` Pierre FICHEUX
  2021-09-30 13:17     ` Luis Goncalves
@ 2021-09-30 13:18     ` Sebastian Andrzej Siewior
  1 sibling, 0 replies; 21+ messages in thread
From: Sebastian Andrzej Siewior @ 2021-09-30 13:18 UTC (permalink / raw)
  To: Pierre FICHEUX; +Cc: Punit Agrawal, linux-rt-users

On 2021-09-30 10:21:44 [+0200], Pierre FICHEUX wrote:
> Hi,
> 
> Thx a lot for your help (I suspected such a thing).
> 
> I can see CONFIG_CPU_IDLE=y in the config but as it's an old kernel
> (3.10) there is no entry such as
> /sys/devices/system/cpu/cpu?/cpuidle/state?/disable
> 
> Will take a look further.

  cpupower idle-info

Some of the deeper C-states have _long_ transition times. Very often
anything deeper than C1 is evil.

> regards

Sebastian
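
The same per-state information can be read directly from sysfs. The
following is a minimal sketch, assuming the kernel exposes the usual
cpuN/cpuidle/stateM/latency attribute (exit latency in microseconds);
the function names are hypothetical and the root directory is a
parameter so the code can also be pointed at a test tree.

```c
#include <stdio.h>

/* Read one integer attribute (e.g. "latency") of a cpuidle state; -1 if absent. */
static long cpuidle_attr(const char *cpu_root, int cpu, int state,
                         const char *attr)
{
    char path[256];
    long v = -1;
    FILE *f;

    snprintf(path, sizeof(path), "%s/cpu%d/cpuidle/state%d/%s",
             cpu_root, cpu, state, attr);
    f = fopen(path, "r");
    if (!f)
        return -1;
    if (fscanf(f, "%ld", &v) != 1)
        v = -1;
    fclose(f);
    return v;
}

/* Print the exit latency of every idle state of one CPU; return the count. */
static int print_idle_states(const char *cpu_root, int cpu)
{
    int n = 0;

    for (;; n++) {
        long lat = cpuidle_attr(cpu_root, cpu, n, "latency");

        if (lat < 0)
            break;
        printf("cpu%d state%d: exit latency %ld us\n", cpu, n, lat);
    }
    if (n == 0)
        printf("cpu%d: no idle states exposed\n", cpu);
    return n;
}
```

On a real system the root would be "/sys/devices/system/cpu". A return
value of 0 means the kernel exposes no idle states for that CPU.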


* Re: Strange problem with PREEMPT_RT
  2021-09-30  0:13 ` Punit Agrawal
  2021-09-30  8:21   ` Pierre FICHEUX
@ 2021-09-30 13:23   ` Sebastian Andrzej Siewior
  2021-09-30 13:26     ` Pierre FICHEUX
                       ` (2 more replies)
  1 sibling, 3 replies; 21+ messages in thread
From: Sebastian Andrzej Siewior @ 2021-09-30 13:23 UTC (permalink / raw)
  To: Punit Agrawal, Clark Williams, John Kacur
  Cc: Pierre FICHEUX, linux-rt-users, Thomas Gleixner

On 2021-09-30 09:13:25 [+0900], Punit Agrawal wrote:
> With hackbench, the system is sufficiently busy to avoid the going into
> idle.

Not just that. cyclictest's usage of /dev/cpu_dma_latency has the side
effect that it may disable some of the PM stuff in the system.
So your system appears good but then when cyclictest is gone, the
numbers go up.

Maybe we should drop that so we observe a system without altering its
behaviour?

Sebastian
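
For context, the /dev/cpu_dma_latency interface takes a 32-bit latency
bound in microseconds, and the request stays in force only while the
file descriptor remains open. The sketch below illustrates that
pattern. It is not a copy of cyclictest's code: the function name is
hypothetical and the device path is passed as a parameter so the
function can be exercised against an ordinary file.

```c
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Request a maximum wake-up latency of target_us microseconds.
 * The request holds only while the returned descriptor stays open, so
 * the caller must keep it for the lifetime of the RT workload.
 * Returns the open fd, or -1 on failure.
 */
static int request_cpu_latency(const char *path, int32_t target_us)
{
    int fd = open(path, O_RDWR);

    if (fd < 0)
        return -1;
    if (write(fd, &target_us, sizeof(target_us))
            != (ssize_t)sizeof(target_us)) {
        close(fd);
        return -1;
    }
    return fd;
}
```

An application would typically call
request_cpu_latency("/dev/cpu_dma_latency", 0) once at start-up and
close the descriptor only on exit. Closing it earlier drops the
request, which is why latency numbers can rise once a tool holding the
descriptor terminates.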


* Re: Strange problem with PREEMPT_RT
  2021-09-30 13:23   ` Sebastian Andrzej Siewior
@ 2021-09-30 13:26     ` Pierre FICHEUX
  2021-09-30 13:51       ` Sebastian Andrzej Siewior
  2021-09-30 13:41     ` John Ogness
  2021-09-30 23:01     ` Punit Agrawal
  2 siblings, 1 reply; 21+ messages in thread
From: Pierre FICHEUX @ 2021-09-30 13:26 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Punit Agrawal, Clark Williams, John Kacur, linux-rt-users,
	Thomas Gleixner

I've just copied set_latency_target() from cyclictest.c into my code,
but I get the same behaviour.

On Thu, Sep 30, 2021 at 15:23, Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
>
> On 2021-09-30 09:13:25 [+0900], Punit Agrawal wrote:
> > With hackbench, the system is sufficiently busy to avoid the going into
> > idle.
>
> Not just that. cyclictest's usage of /dev/cpu_dma_latency has the side
> effect that it may disable some of the PM stuff in the system.
> So your system appears good but then when cyclictest is gone, the
> numbers go up.
>
> Maybe we should drop that so we observe a system without altering its
> behaviour?
>
> Sebastian




* Re: Strange problem with PREEMPT_RT
  2021-09-30 13:23   ` Sebastian Andrzej Siewior
  2021-09-30 13:26     ` Pierre FICHEUX
@ 2021-09-30 13:41     ` John Ogness
  2021-09-30 14:25       ` John Kacur
  2021-09-30 14:59       ` John Kacur
  2021-09-30 23:01     ` Punit Agrawal
  2 siblings, 2 replies; 21+ messages in thread
From: John Ogness @ 2021-09-30 13:41 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior, Punit Agrawal, Clark Williams, John Kacur
  Cc: Pierre FICHEUX, linux-rt-users, Thomas Gleixner

On 2021-09-30, Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote:
>> With hackbench, the system is sufficiently busy to avoid the going
>> into idle.
>
> Not just that. cyclictest's usage of /dev/cpu_dma_latency has the side
> effect that it may disable some of the PM stuff in the system.
> So your system appears good but then when cyclictest is gone, the
> numbers go up.
>
> Maybe we should drop that so we observe a system without altering its
> behaviour?

+1

Developers wanting to explicitly cause this behavior can use --latency=
to enable it. Having it on as a default is misleading.

John Ogness


* Re: Strange problem with PREEMPT_RT
  2021-09-30 13:26     ` Pierre FICHEUX
@ 2021-09-30 13:51       ` Sebastian Andrzej Siewior
  2021-09-30 14:31         ` Pierre FICHEUX
  0 siblings, 1 reply; 21+ messages in thread
From: Sebastian Andrzej Siewior @ 2021-09-30 13:51 UTC (permalink / raw)
  To: Pierre FICHEUX
  Cc: Punit Agrawal, Clark Williams, John Kacur, linux-rt-users,
	Thomas Gleixner

On 2021-09-30 15:26:32 [+0200], Pierre FICHEUX wrote:
> I've just copied the set_latency_target() of cyclictest.c to my code
> but I got the same behaviour.

What happens if you disable idle states > C1 with cpupower?

Sebastian


* Re: Strange problem with PREEMPT_RT
  2021-09-30 13:41     ` John Ogness
@ 2021-09-30 14:25       ` John Kacur
  2021-09-30 15:02         ` John Ogness
  2021-09-30 14:59       ` John Kacur
  1 sibling, 1 reply; 21+ messages in thread
From: John Kacur @ 2021-09-30 14:25 UTC (permalink / raw)
  To: John Ogness
  Cc: Sebastian Andrzej Siewior, Punit Agrawal, Clark Williams,
	Pierre FICHEUX, linux-rt-users, Thomas Gleixner



On Thu, 30 Sep 2021, John Ogness wrote:

> On 2021-09-30, Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote:
> >> With hackbench, the system is sufficiently busy to avoid the going
> >> into idle.
> >
> > Not just that. cyclictest's usage of /dev/cpu_dma_latency has the side
> > effect that it may disable some of the PM stuff in the system.
> > So your system appears good but then when cyclictest is gone, the
> > numbers go up.
> >
> > Maybe we should drop that so we observe a system without altering its
> > behaviour?
> 
> +1
> 
> Developers wanting to explicitly cause this behavior can use --latency=
> to enable it. Having it on as a default is misleading.

What does this "--latency=" option apply to?

John



* Re: Strange problem with PREEMPT_RT
  2021-09-30 13:51       ` Sebastian Andrzej Siewior
@ 2021-09-30 14:31         ` Pierre FICHEUX
  2021-09-30 23:40           ` Punit Agrawal
  2021-10-18  9:12           ` Sebastian Andrzej Siewior
  0 siblings, 2 replies; 21+ messages in thread
From: Pierre FICHEUX @ 2021-09-30 14:31 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Punit Agrawal, Clark Williams, John Kacur, linux-rt-users,
	Thomas Gleixner

# cpupower -c all idle-info
CPUidle driver: none
CPUidle governor: menu
analyzing CPU 0:

CPU 0: No idle states
...
Same for all CPUs.

On Thu, Sep 30, 2021 at 15:51, Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
>
> On 2021-09-30 15:26:32 [+0200], Pierre FICHEUX wrote:
> > I've just copied the set_latency_target() of cyclictest.c to my code
> > but I got the same behaviour.
>
> What happens if you disable idle states > C1 with cpupower?
>
> Sebastian




* Re: Strange problem with PREEMPT_RT
  2021-09-30 13:41     ` John Ogness
  2021-09-30 14:25       ` John Kacur
@ 2021-09-30 14:59       ` John Kacur
  1 sibling, 0 replies; 21+ messages in thread
From: John Kacur @ 2021-09-30 14:59 UTC (permalink / raw)
  To: John Ogness
  Cc: Sebastian Andrzej Siewior, Punit Agrawal, Clark Williams,
	Pierre FICHEUX, linux-rt-users, Thomas Gleixner



On Thu, 30 Sep 2021, John Ogness wrote:

> On 2021-09-30, Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote:
> >> With hackbench, the system is sufficiently busy to avoid the going
> >> into idle.
> >
> > Not just that. cyclictest's usage of /dev/cpu_dma_latency has the side
> > effect that it may disable some of the PM stuff in the system.
> > So your system appears good but then when cyclictest is gone, the
> > numbers go up.
> >
> > Maybe we should drop that so we observe a system without altering its
> > behaviour?
> 
> +1
> 
> Developers wanting to explicitly cause this behavior can use --latency=
> to enable it. Having it on as a default is misleading.
> 
> John Ogness
> 

If you pass "--laptop" to cyclictest, it won't use the trick of
writing to /dev/cpu_dma_latency, which saves battery power.
It doesn't matter whether you are really running on a laptop or not.

John Kacur



* Re: Strange problem with PREEMPT_RT
  2021-09-30 14:25       ` John Kacur
@ 2021-09-30 15:02         ` John Ogness
  2021-09-30 15:49           ` Sebastian Andrzej Siewior
  2021-09-30 16:16           ` John Kacur
  0 siblings, 2 replies; 21+ messages in thread
From: John Ogness @ 2021-09-30 15:02 UTC (permalink / raw)
  To: John Kacur
  Cc: Sebastian Andrzej Siewior, Punit Agrawal, Clark Williams,
	Pierre FICHEUX, linux-rt-users, Thomas Gleixner

On 2021-09-30, John Kacur <jkacur@redhat.com> wrote:
>>>> With hackbench, the system is sufficiently busy to avoid the going
>>>> into idle.
>>>
>>> Not just that. cyclictest's usage of /dev/cpu_dma_latency has the side
>>> effect that it may disable some of the PM stuff in the system.
>>> So your system appears good but then when cyclictest is gone, the
>>> numbers go up.
>>>
>>> Maybe we should drop that so we observe a system without altering its
>>> behaviour?
>> 
>> +1
>> 
>> Developers wanting to explicitly cause this behavior can use --latency=
>> to enable it. Having it on as a default is misleading.
>
> Where does this "--latency=" option apply to?

It is the value written to /dev/cpu_dma_latency, which AFAIK sets the
maximum acceptable latency (in microseconds). This translates to the
allowed C states. cyclictest currently writes 0, which should keep the
processor in C0. For example, setting it to 1-5 should allow C0 and C1.

Using --laptop will cause cyclictest to avoid touching
/dev/cpu_dma_latency. But nobody would know that unless they looked at
the code.

IMHO, systems should be configured for production use and cyclictest
should just _measure_ latencies at a specified priority level. But by
default cyclictest is adjusting global system behavior during
measurements, thus providing results that the system (as it is actually
configured) would not be able to provide.

I realize that nobody wants to touch defaults. But I'm not sure users
are aware how important the --laptop option is for realistic
measurements. In fact, the description of --laptop even encourages users
_not_ to use it. :-/

John Ogness
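
The mapping from a latency bound to the allowed C states can be
pictured with a small, simplified model: a state is usable only if its
exit latency does not exceed the bound. The sketch below assumes the
states are ordered from shallowest to deepest. The function name is
hypothetical, and real cpuidle governors also weigh the predicted idle
duration, so this shows only the filtering step.

```c
#include <stddef.h>

/*
 * Given exit latencies (in microseconds) ordered from the shallowest to
 * the deepest idle state, return the index of the deepest state whose
 * exit latency does not exceed limit_us, or -1 if no state qualifies.
 */
static int deepest_allowed_state(const unsigned int *exit_latency_us,
                                 size_t n, unsigned int limit_us)
{
    int deepest = -1;

    for (size_t i = 0; i < n; i++) {
        if (exit_latency_us[i] <= limit_us)
            deepest = (int)i;
    }
    return deepest;
}
```

With a hypothetical table of { 0, 2, 10, 133 } microseconds for four
states, a bound of 0 leaves only the first state, a bound of 5 allows
the first two, and a bound of 1000 allows all four.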


* Re: Strange problem with PREEMPT_RT
  2021-09-30 15:02         ` John Ogness
@ 2021-09-30 15:49           ` Sebastian Andrzej Siewior
  2021-09-30 16:16           ` John Kacur
  1 sibling, 0 replies; 21+ messages in thread
From: Sebastian Andrzej Siewior @ 2021-09-30 15:49 UTC (permalink / raw)
  To: John Ogness
  Cc: John Kacur, Punit Agrawal, Clark Williams, Pierre FICHEUX,
	linux-rt-users, Thomas Gleixner

On 2021-09-30 17:08:54 [+0206], John Ogness wrote:
> > Where does this "--latency=" option apply to?
> 
> It is the value written to /dev/cpu_dma_latency, which AFAIK writes the
> maximum acceptable latency (in microseconds). This translates to the
> allowed C states. cyclictest currently writes 0, which should keep the
> processor in C0. For example, setting it to 1-5, should allow C0 and C1.
> 
> Using --laptop will cause cyclictest to avoid touching
> /dev/cpu_dma_latency. But nobody would know that unless they looked at
> the code.

Exactly. We want to measure, not change the defaults before we start
measuring.

> IMHO, systems should be configured for production use and cyclictest
> should just _measure_ latencies at a specified priority level. But by
> default cyclictest is adjusting global system behavior during
> measurements, thus providing results that the system (as it is actually
> configured) would not be able to provide.
> 
> I realize that nobody wants to touch defaults. But I'm not sure users
> are aware how important the --laptop option is for realistic
> measurements. In fact, the description of --laptop even encourages users
> _not_ to use it. :-/

We don't want to touch the defaults? Not so long ago I was curious why
the -b argument didn't stop the trace. Apparently I didn't use the
--tracemark option which is required nowadays.

> John Ogness

Sebastian


* Re: Strange problem with PREEMPT_RT
  2021-09-30 15:02         ` John Ogness
  2021-09-30 15:49           ` Sebastian Andrzej Siewior
@ 2021-09-30 16:16           ` John Kacur
  2021-09-30 23:22             ` Punit Agrawal
  1 sibling, 1 reply; 21+ messages in thread
From: John Kacur @ 2021-09-30 16:16 UTC (permalink / raw)
  To: John Ogness
  Cc: Sebastian Andrzej Siewior, Punit Agrawal, Clark Williams,
	Pierre FICHEUX, linux-rt-users, Thomas Gleixner



On Thu, 30 Sep 2021, John Ogness wrote:

> On 2021-09-30, John Kacur <jkacur@redhat.com> wrote:
> >>>> With hackbench, the system is sufficiently busy to avoid the going
> >>>> into idle.
> >>>
> >>> Not just that. cyclictest's usage of /dev/cpu_dma_latency has the side
> >>> effect that it may disable some of the PM stuff in the system.
> >>> So your system appears good but then when cyclictest is gone, the
> >>> numbers go up.
> >>>
> >>> Maybe we should drop that so we observe a system without altering its
> >>> behaviour?
> >> 
> >> +1
> >> 
> >> Developers wanting to explicitly cause this behavior can use --latency=
> >> to enable it. Having it on as a default is misleading.
> >
> > Where does this "--latency=" option apply to?

I see this option was dropped from the help, will add it back.

> 
> It is the value written to /dev/cpu_dma_latency, which AFAIK writes the
> maximum acceptable latency (in microseconds). This translates to the
> allowed C states. cyclictest currently writes 0, which should keep the
> processor in C0. For example, setting it to 1-5, should allow C0 and C1.
> 
> Using --laptop will cause cyclictest to avoid touching
> /dev/cpu_dma_latency. But nobody would know that unless they looked at
> the code.
> 
> IMHO, systems should be configured for production use and cyclictest
> should just _measure_ latencies at a specified priority level. But by
> default cyclictest is adjusting global system behavior during
> measurements, thus providing results that the system (as it is actually
> configured) would not be able to provide.

The thing is, we are not just trying to measure an environment, we are 
also simulating a realtime application. If your realtime environment 
doesn't disable c-states, then your realtime application probably should.

It's always a question of whether we are trying to measure a worst-case
scenario or a best-case scenario.

If I remove this default we're going to get a slew of reports from people 
using cyclictest wondering why they aren't getting good realtime 
performance.

> 
> I realize that nobody wants to touch defaults. But I'm not sure users
> are aware how important the --laptop option is for realistic
> measurements. In fact, the description of --laptop even encourages users
> _not_ to use it. :-/

I agree, but I just wanted to quickly inform people on this email thread
how to disable the default of writing to cpu_dma_latency, before we create
any patches to improve this.



* Re: Strange problem with PREEMPT_RT
  2021-09-30 13:23   ` Sebastian Andrzej Siewior
  2021-09-30 13:26     ` Pierre FICHEUX
  2021-09-30 13:41     ` John Ogness
@ 2021-09-30 23:01     ` Punit Agrawal
  2 siblings, 0 replies; 21+ messages in thread
From: Punit Agrawal @ 2021-09-30 23:01 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Clark Williams, John Kacur, Pierre FICHEUX, linux-rt-users,
	Thomas Gleixner

Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes:

> On 2021-09-30 09:13:25 [+0900], Punit Agrawal wrote:
>> With hackbench, the system is sufficiently busy to avoid the going into
>> idle.
>
> Not just that. cyclictest's usage of /dev/cpu_dma_latency has the side
> effect that it may disable some of the PM stuff in the system.
> So your system appears good but then when cyclictest is gone, the
> numbers go up.
>
> Maybe we should drop that so we observe a system without altering its
> behaviour?

+1

For users tuning their system it doesn't help if the reported numbers
are not reproducible when running their application.


* Re: Strange problem with PREEMPT_RT
  2021-09-30 16:16           ` John Kacur
@ 2021-09-30 23:22             ` Punit Agrawal
  0 siblings, 0 replies; 21+ messages in thread
From: Punit Agrawal @ 2021-09-30 23:22 UTC (permalink / raw)
  To: John Kacur
  Cc: John Ogness, Sebastian Andrzej Siewior, Clark Williams,
	Pierre FICHEUX, linux-rt-users, Thomas Gleixner

John Kacur <jkacur@redhat.com> writes:

> On Thu, 30 Sep 2021, John Ogness wrote:
>
>> On 2021-09-30, John Kacur <jkacur@redhat.com> wrote:
>> >>>> With hackbench, the system is sufficiently busy to avoid the going
>> >>>> into idle.
>> >>>
>> >>> Not just that. cyclictest's usage of /dev/cpu_dma_latency has the side
>> >>> effect that it may disable some of the PM stuff in the system.
>> >>> So your system appears good but then when cyclictest is gone, the
>> >>> numbers go up.
>> >>>
>> >>> Maybe we should drop that so we observe a system without altering its
>> >>> behaviour?
>> >> 
>> >> +1
>> >> 
>> >> Developers wanting to explicitly cause this behavior can use --latency=
>> >> to enable it. Having it on as a default is misleading.
>> >
>> > Where does this "--latency=" option apply to?
>
> I see this option was dropped from the help, will add it back.
>
>> 
>> It is the value written to /dev/cpu_dma_latency, which AFAIK writes the
>> maximum acceptable latency (in microseconds). This translates to the
>> allowed C states. cyclictest currently writes 0, which should keep the
>> processor in C0. For example, setting it to 1-5, should allow C0 and C1.
>> 
>> Using --laptop will cause cyclictest to avoid touching
>> /dev/cpu_dma_latency. But nobody would know that unless they looked at
>> the code.
>> 
>> IMHO, systems should be configured for production use and cyclictest
>> should just _measure_ latencies at a specified priority level. But by
>> default cyclictest is adjusting global system behavior during
>> measurements, thus providing results that the system (as it is actually
>> configured) would not be able to provide.
>
> The thing is, we are not just trying to measure an environment, we are 
> also simulating a realtime application. If your realtime environment 
> doesn't disable c-states, then your realtime application probably
> should.

I agree something (firmware, OS, application) should disable c-states -
cyclictest doing it during measurements hides the fact that the system
isn't configured correctly for applications requiring low latencies.

I too lean towards "altering system behaviour is better left outside
the application", as different systems have different knobs and need
different tweaking to achieve low latency.

> It's always a question of are we trying to measure a worse case scenario 
> or a best case scenario.
>
> If I remove this default we're going to get a slew of reports from people 
> using cyclictest wondering why they aren't getting good realtime 
> performance.

Perhaps this is a change best done with a major version bump. Not that
it'll stop user queries completely, but hopefully more people will pay
attention to the changelog.

[...]


* Re: Strange problem with PREEMPT_RT
  2021-09-30 14:31         ` Pierre FICHEUX
@ 2021-09-30 23:40           ` Punit Agrawal
  2021-10-03 11:11             ` Jack Winch
  2021-10-18  9:12           ` Sebastian Andrzej Siewior
  1 sibling, 1 reply; 21+ messages in thread
From: Punit Agrawal @ 2021-09-30 23:40 UTC (permalink / raw)
  To: Pierre FICHEUX
  Cc: Sebastian Andrzej Siewior, Clark Williams, John Kacur,
	linux-rt-users, Thomas Gleixner

Please don't top-post[0] - it makes it harder to reply with context from
previous emails. Kernel related mailing lists tend to prefer
bottom-posting[1].

Pierre FICHEUX <pierre.ficheux@smile.fr> writes:

> # cpupower -c all idle-info
> CPUidle driver: none
> CPUidle governor: menu
> analyzing CPU 0:
>
> CPU 0: No idle states
> ...
> same for all cpus.

It looks like cpuidle may not be to blame for your woes.

One last thing to check regarding idle (call me paranoid) would be to
see if there's an option to disable C-states in the BIOS. Not being
familiar with your system, I think it would be useful to rule out any
firmware-driven idle management.

Barring that, maybe look into upgrading to a newer kernel if possible.

[0] https://en.wikipedia.org/wiki/Posting_style#Top-posting
[1] https://en.wikipedia.org/wiki/Posting_style#Bottom-posting

[...]



* Re: Strange problem with PREEMPT_RT
  2021-09-30 23:40           ` Punit Agrawal
@ 2021-10-03 11:11             ` Jack Winch
  2021-10-04 12:54               ` John Kacur
  0 siblings, 1 reply; 21+ messages in thread
From: Jack Winch @ 2021-10-03 11:11 UTC (permalink / raw)
  To: Punit Agrawal
  Cc: Pierre FICHEUX, Sebastian Andrzej Siewior, Clark Williams,
	John Kacur, linux-rt-users, Thomas Gleixner

Punit Agrawal <punitagrawal@gmail.com> writes:

> > If I remove this default we're going to get a slew of reports from people
> > using cyclictest wondering why they aren't getting good realtime
> > performance.
>
> Perhaps, this is a change best done with a major version bump. Not that
> it'll stop users queries completely but hopefully more people will pay
> attention to the changelog.

Regrettably, I know quite a few RT Linux users who are neither party
to this mailing list (due to company IT policy), nor do they pay
attention to the changelog.  Even fewer read the source code of the
rt-tests suite of tools.

Regardless of these sorrowful facts, as this is such a fundamental
matter, perhaps it would be useful to highlight this behaviour of
cyclictest in the tool's documentation - specifically the program's
help output, manual page, and related content on the RT Linux wiki.  I
see additional information going into the first two forms of
documentation as a required step.  However, I believe that adding
information on this behaviour to the relevant pages of the RT Linux
wiki would also be greatly beneficial.  This is because many
developers who are new to RT Linux development, in my experience, tend
to make use of the wiki to 'orientate' themselves when first starting
out, etc, and the omission of information regarding this behaviour
could lead to wrong assessments and conclusions being made when using
this tool.

Jack


* Re: Strange problem with PREEMPT_RT
  2021-10-03 11:11             ` Jack Winch
@ 2021-10-04 12:54               ` John Kacur
  0 siblings, 0 replies; 21+ messages in thread
From: John Kacur @ 2021-10-04 12:54 UTC (permalink / raw)
  To: Jack Winch
  Cc: Punit Agrawal, Pierre FICHEUX, Sebastian Andrzej Siewior,
	Clark Williams, linux-rt-users, Thomas Gleixner



On Sun, 3 Oct 2021, Jack Winch wrote:

> Punit Agrawal <punitagrawal@gmail.com> writes:
> 
> > > If I remove this default we're going to get a slew of reports from people
> > > using cyclictest wondering why they aren't getting good realtime
> > > performance.
> >
> > Perhaps, this is a change best done with a major version bump. Not that
> > it'll stop users queries completely but hopefully more people will pay
> > attention to the changelog.
> 
> Regrettably, I know quite a few RT Linux users who are neither party
> to this mailing list (due to company IT policy), nor do they pay
> attention to the changelog.  Even less read the source code of the
> rt-tests suite of tools.
> 
> Regardless of these sorrowful facts, as this is such a fundamental
> matter, perhaps it would be useful to highlight this behaviour of
> cyclictest in the tool's documentation - specifically the program's
> help output, manual page, and related content on the RT Linux wiki.  I
> see additional information going into the first two forms of
> documentation as a required step.  However, I believe that adding
> information on this behaviour to the relevant pages of the RT Linux
> wiki would also be greatly beneficial.  This is because many
> developers who are new to RT Linux development, in my experience, tend
> to make use of the wiki to 'orientate' themselves when first starting
> out, etc, and the omission of information regarding this behaviour
> could lead to wrong assessments and conclusions being made when using
> this tool.
> 
> Jack
> 

Don't worry, I try to keep the help and man pages up-to-date so you don't 
need to read change logs. If you see any problems with help / man pages
either report that to the list and cc me, or send me a patch.

Thanks

John Kacur



* Re: Strange problem with PREEMPT_RT
  2021-09-30 14:31         ` Pierre FICHEUX
  2021-09-30 23:40           ` Punit Agrawal
@ 2021-10-18  9:12           ` Sebastian Andrzej Siewior
  1 sibling, 0 replies; 21+ messages in thread
From: Sebastian Andrzej Siewior @ 2021-10-18  9:12 UTC (permalink / raw)
  To: Pierre FICHEUX
  Cc: Punit Agrawal, Clark Williams, John Kacur, linux-rt-users,
	Thomas Gleixner

On 2021-09-30 16:31:25 [+0200], Pierre FICHEUX wrote:
> # cpupower -c all idle-info
> CPUidle driver: none
> CPUidle governor: menu
> analyzing CPU 0:
> 
> CPU 0: No idle states
> ...
> same for all cpus.

Are you still stuck with this or did you get somewhere in the meantime?

Sebastian
