linux-pm.vger.kernel.org archive mirror
* Re: [PATCH 4/4] sched/core: split iowait state into two states
       [not found] ` <20240416121526.67022-5-axboe@kernel.dk>
@ 2024-04-24 10:01   ` Peter Zijlstra
  2024-04-24 10:08     ` Christian Loehle
  2024-04-25 14:20     ` Rafael J. Wysocki
  0 siblings, 2 replies; 5+ messages in thread
From: Peter Zijlstra @ 2024-04-24 10:01 UTC
  To: Jens Axboe
  Cc: linux-kernel, tglx, Rafael J. Wysocki, linux-pm, daniel.lezcano

On Tue, Apr 16, 2024 at 06:11:21AM -0600, Jens Axboe wrote:
> iowait is a bogus metric, but it's helpful in the sense that it allows
> short waits to not enter sleep states that have a higher exit latency
> than would otherwise have been picked for iowait'ing tasks. However,
> it's harmful in that lots of applications and monitoring tools assume that
> iowait is busy time, or otherwise use it as a health metric.
> Particularly for async IO it's entirely nonsensical.

Let me get this straight, all of this is about working around
cpuidle menu governor insanity?

Rafael, how far along are we with fully deprecating that thing? Yes it
still exists, but should people really be using it still?


* Re: [PATCH 4/4] sched/core: split iowait state into two states
  2024-04-24 10:01   ` [PATCH 4/4] sched/core: split iowait state into two states Peter Zijlstra
@ 2024-04-24 10:08     ` Christian Loehle
  2024-04-25 10:16       ` Peter Zijlstra
  2024-04-25 14:20     ` Rafael J. Wysocki
  1 sibling, 1 reply; 5+ messages in thread
From: Christian Loehle @ 2024-04-24 10:08 UTC
  To: Peter Zijlstra, Jens Axboe
  Cc: linux-kernel, tglx, Rafael J. Wysocki, linux-pm, daniel.lezcano

On 24/04/2024 11:01, Peter Zijlstra wrote:
> On Tue, Apr 16, 2024 at 06:11:21AM -0600, Jens Axboe wrote:
>> iowait is a bogus metric, but it's helpful in the sense that it allows
>> short waits to not enter sleep states that have a higher exit latency
>> than would otherwise have been picked for iowait'ing tasks. However,
>> it's harmful in that lots of applications and monitoring tools assume that
>> iowait is busy time, or otherwise use it as a health metric.
>> Particularly for async IO it's entirely nonsensical.
> 
> Let me get this straight, all of this is about working around
> cpuidle menu governor insanity?
> 
> Rafael, how far along are we with fully deprecating that thing? Yes it
> still exists, but should people really be using it still?
> 

Well, there is also the iowait boost handling in schedutil and intel_pstate
which, at least in synthetic benchmarks, does have an effect [1].
io_uring (the only user of iowait but not iowait_acct) works around both.

See commit 8a796565cec3 ("io_uring: Use io_schedule* in cqring wait")

[1]
https://lore.kernel.org/lkml/20240304201625.100619-1-christian.loehle@arm.com/#t

Kind Regards,
Christian


* Re: [PATCH 4/4] sched/core: split iowait state into two states
  2024-04-24 10:08     ` Christian Loehle
@ 2024-04-25 10:16       ` Peter Zijlstra
  2024-04-25 10:39         ` Christian Loehle
  0 siblings, 1 reply; 5+ messages in thread
From: Peter Zijlstra @ 2024-04-25 10:16 UTC
  To: Christian Loehle
  Cc: Jens Axboe, linux-kernel, tglx, Rafael J. Wysocki, linux-pm,
	daniel.lezcano

On Wed, Apr 24, 2024 at 11:08:42AM +0100, Christian Loehle wrote:
> On 24/04/2024 11:01, Peter Zijlstra wrote:
> > On Tue, Apr 16, 2024 at 06:11:21AM -0600, Jens Axboe wrote:
> >> iowait is a bogus metric, but it's helpful in the sense that it allows
> >> short waits to not enter sleep states that have a higher exit latency
> >> than would otherwise have been picked for iowait'ing tasks. However,
> >> it's harmful in that lots of applications and monitoring tools assume that
> >> iowait is busy time, or otherwise use it as a health metric.
> >> Particularly for async IO it's entirely nonsensical.
> > 
> > Let me get this straight, all of this is about working around
> > cpuidle menu governor insanity?
> > 
> > Rafael, how far along are we with fully deprecating that thing? Yes it
> > still exists, but should people really be using it still?
> > 
> 
> Well there is also the iowait boost handling in schedutil and intel_pstate
> which, at least in synthetic benchmarks, does have an effect [1].

Those are cpufreq, not cpuidle, and at least they don't use nr_iowait. The
original Changelog mentioned idle states, and I hate on menu for using
nr_iowait.

> io_uring (the only user of iowait but not iowait_acct) works around both.
> 
> See commit 8a796565cec3 ("io_uring: Use io_schedule* in cqring wait")
> 
> [1]
> https://lore.kernel.org/lkml/20240304201625.100619-1-christian.loehle@arm.com/#t

So while I agree with most of the shortcomings listed in that set,
that patch is quite terrifying.

I would prefer to start with something a *lot* simpler. How about a
tick-driven decay of a per-task iops count? And that whole step array
*shudder*.
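
As a rough illustration only, a tick-driven decay of a per-task iops
count could look something like the userspace sketch below. Everything
in it is an assumption made for the example: struct io_decay and the
io_decay_*() helpers are made-up names and not existing kernel code,
halving on every tick is just one possible decay rule, and the boost
threshold is an arbitrary parameter.

```c
#include <stdint.h>

/* Hypothetical per-task state; not an existing kernel structure. */
struct io_decay {
	uint32_t iops;	/* decayed count of recent iowait wakeups */
};

/* Assumed hook: called whenever the task wakes up from an iowait sleep. */
static void io_decay_wakeup(struct io_decay *d)
{
	if (d->iops < UINT32_MAX)	/* saturate instead of wrapping */
		d->iops++;
}

/* Assumed hook: called once per tick; halves the count (one possible rule). */
static void io_decay_tick(struct io_decay *d)
{
	d->iops >>= 1;
}

/* Hypothetical policy: boost only while the decayed count is high enough. */
static int io_decay_should_boost(const struct io_decay *d, uint32_t threshold)
{
	return d->iops >= threshold;
}
```

With this rule a task that stops doing IO loses its boost eligibility
within a handful of ticks, since the count halves each time.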


* Re: [PATCH 4/4] sched/core: split iowait state into two states
  2024-04-25 10:16       ` Peter Zijlstra
@ 2024-04-25 10:39         ` Christian Loehle
  0 siblings, 0 replies; 5+ messages in thread
From: Christian Loehle @ 2024-04-25 10:39 UTC
  To: Peter Zijlstra
  Cc: Jens Axboe, linux-kernel, tglx, Rafael J. Wysocki, linux-pm,
	daniel.lezcano

On 25/04/2024 11:16, Peter Zijlstra wrote:
> On Wed, Apr 24, 2024 at 11:08:42AM +0100, Christian Loehle wrote:
>> On 24/04/2024 11:01, Peter Zijlstra wrote:
>>> On Tue, Apr 16, 2024 at 06:11:21AM -0600, Jens Axboe wrote:
>>>> iowait is a bogus metric, but it's helpful in the sense that it allows
>>>> short waits to not enter sleep states that have a higher exit latency
>>>> than would otherwise have been picked for iowait'ing tasks. However,
>>>> it's harmful in that lots of applications and monitoring tools assume that
>>>> iowait is busy time, or otherwise use it as a health metric.
>>>> Particularly for async IO it's entirely nonsensical.
>>>
>>> Let me get this straight, all of this is about working around
>>> cpuidle menu governor insanity?
>>>
>>> Rafael, how far along are we with fully deprecating that thing? Yes it
>>> still exists, but should people really be using it still?
>>>
>>
>> Well there is also the iowait boost handling in schedutil and intel_pstate
>> which, at least in synthetic benchmarks, does have an effect [1].
> 
> Those are cpufreq, not cpuidle, and at least they don't use nr_iowait. The
> original Changelog mentioned idle states, and I hate on menu for using
> nr_iowait.

I'd say they care about any regression, but I'll let Jens answer that.
The original change also mentions cpufreq and Jens did mention in an
earlier version that he doesn't care, for them it's all just increased
latency ;) 
https://lore.kernel.org/lkml/00d36e83-c9a5-412d-bf49-2e109308d6cd@arm.com/T/#m216536520bc31846aff5875993d22f446a37b297

> 
>> io_uring (the only user of iowait but not iowait_acct) works around both.
>>
>> See commit 8a796565cec3 ("io_uring: Use io_schedule* in cqring wait")
>>
>> [1]
>> https://lore.kernel.org/lkml/20240304201625.100619-1-christian.loehle@arm.com/#t
> 
> So while I agree with most of the shortcomings listed in that set,
> that patch is quite terrifying.

Not disagreeing with you on that.

> 
> I would prefer to start with something a *lot* simpler. How about a
> tick-driven decay of a per-task iops count? And that whole step array
> *shudder*.

It's an attempt at solving unnecessary boosting based on what we have to
work with right now: iowait wakeups.
There are many workloads with e.g. > 5000 iowait wakeups per second that don't
benefit from boosting at all (so boosting them is a complete waste of energy).
I don't see an obvious way to detect non-boost-worthy scenarios with a
tick-driven decay count, but please do elaborate.

(If you *really* care about IO throughput, the task wakeup path is hopefully
not critical anyway (i.e. you do everything in your power to have IO pending
during that time) and then we don't need boosting, but just looking
at a tick-length period doesn't let us distinguish those scenarios AFAICS.)

Regards,
Christian


* Re: [PATCH 4/4] sched/core: split iowait state into two states
  2024-04-24 10:01   ` [PATCH 4/4] sched/core: split iowait state into two states Peter Zijlstra
  2024-04-24 10:08     ` Christian Loehle
@ 2024-04-25 14:20     ` Rafael J. Wysocki
  1 sibling, 0 replies; 5+ messages in thread
From: Rafael J. Wysocki @ 2024-04-25 14:20 UTC
  To: Jens Axboe, Peter Zijlstra
  Cc: linux-kernel, tglx, linux-pm, daniel.lezcano, Rafael J. Wysocki

On Wednesday, April 24, 2024 12:01:27 PM CEST Peter Zijlstra wrote:
> On Tue, Apr 16, 2024 at 06:11:21AM -0600, Jens Axboe wrote:
> > iowait is a bogus metric, but it's helpful in the sense that it allows
> > short waits to not enter sleep states that have a higher exit latency
> > than would otherwise have been picked for iowait'ing tasks. However,
> > it's harmful in that lots of applications and monitoring tools assume that
> > iowait is busy time, or otherwise use it as a health metric.
> > Particularly for async IO it's entirely nonsensical.
> 
> Let me get this straight, all of this is about working around
> cpuidle menu governor insanity?
> 
> Rafael, how far along are we with fully deprecating that thing? Yes it
> still exists, but should people really be using it still?

Well, they appear to be used to it ...
