* Re: [PATCH 4/4] sched/core: split iowait state into two states
       [not found] ` <20240416121526.67022-5-axboe@kernel.dk>
@ 2024-04-24 10:01   ` Peter Zijlstra
  2024-04-24 10:08     ` Christian Loehle
  2024-04-25 14:20     ` Rafael J. Wysocki
  0 siblings, 2 replies; 5+ messages in thread
From: Peter Zijlstra @ 2024-04-24 10:01 UTC
  To: Jens Axboe
  Cc: linux-kernel, tglx, Rafael J. Wysocki, linux-pm, daniel.lezcano

On Tue, Apr 16, 2024 at 06:11:21AM -0600, Jens Axboe wrote:
> iowait is a bogus metric, but it's helpful in the sense that it allows
> short waits to not enter sleep states that have a higher exit latency
> than would've otherwise have been picked for iowait'ing tasks. However,
> it's harmless in that lots of applications and monitoring assumes that
> iowait is busy time, or otherwise use it as a health metric.
> Particularly for async IO it's entirely nonsensical.

Let me get this straight, all of this is about working around
cpuidle menu governor insanity?

Rafael, how far along are we with fully deprecating that thing? Yes it
still exists, but should people really be using it still?
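
[Illustration, not part of the original message] The quoted changelog says that iowait keeps short waits out of sleep states with a high exit latency. A toy model of that idea can be sketched as below. This is not the menu governor's code: the struct, the function names, the state table and the scaling factor of 10 are all assumptions chosen for illustration.

```c
#include <stddef.h>

/* Hypothetical idle-state description; values are illustrative only. */
struct idle_state {
    const char *name;
    unsigned int exit_latency_us;   /* time needed to leave the state */
};

/*
 * Toy heuristic (an assumption, not taken from the thread): shrink the
 * tolerated exit latency as more tasks on this CPU sit in iowait, on the
 * theory that their I/O will complete soon and wake the CPU.
 */
static unsigned int latency_limit(unsigned int predicted_idle_us,
                                  unsigned int nr_iowaiters)
{
    return predicted_idle_us / (1 + 10 * nr_iowaiters);
}

/*
 * Return the index of the deepest state whose exit latency fits within
 * the limit. States are assumed to be sorted shallowest to deepest, and
 * state 0 is always allowed as a fallback.
 */
static size_t pick_state(const struct idle_state *states, size_t n,
                         unsigned int latency_limit_us)
{
    size_t best = 0;

    for (size_t i = 1; i < n; i++) {
        if (states[i].exit_latency_us <= latency_limit_us)
            best = i;
    }
    return best;
}
```

With a made-up table of states at 1, 50 and 800 us exit latency and a predicted idle period of 1000 us, this model picks the deepest state when nothing is in iowait and progressively shallower states as the number of iowaiters grows. That is the coupling between the accounting metric and idle-state selection that the patch series tries to separate.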
* Re: [PATCH 4/4] sched/core: split iowait state into two states
  2024-04-24 10:01   ` [PATCH 4/4] sched/core: split iowait state into two states Peter Zijlstra
@ 2024-04-24 10:08     ` Christian Loehle
  2024-04-25 10:16       ` Peter Zijlstra
  2024-04-25 14:20     ` Rafael J. Wysocki
  1 sibling, 1 reply; 5+ messages in thread
From: Christian Loehle @ 2024-04-24 10:08 UTC
  To: Peter Zijlstra, Jens Axboe
  Cc: linux-kernel, tglx, Rafael J. Wysocki, linux-pm, daniel.lezcano

On 24/04/2024 11:01, Peter Zijlstra wrote:
> On Tue, Apr 16, 2024 at 06:11:21AM -0600, Jens Axboe wrote:
>> iowait is a bogus metric, but it's helpful in the sense that it allows
>> short waits to not enter sleep states that have a higher exit latency
>> than would've otherwise have been picked for iowait'ing tasks. However,
>> it's harmless in that lots of applications and monitoring assumes that
>> iowait is busy time, or otherwise use it as a health metric.
>> Particularly for async IO it's entirely nonsensical.
>
> Let me get this straight, all of this is about working around
> cpuidle menu governor insanity?
>
> Rafael, how far along are we with fully deprecating that thing? Yes it
> still exists, but should people really be using it still?
>

Well there is also the iowait boost handling in schedutil and intel_pstate
which, at least in synthetic benchmarks, does have an effect [1].

io_uring (the only user of iowait but not iowait_acct) works around both.

See commit ("8a796565cec3 io_uring: Use io_schedule* in cqring wait")

[1]
https://lore.kernel.org/lkml/20240304201625.100619-1-christian.loehle@arm.com/#t

Kind Regards,
Christian
* Re: [PATCH 4/4] sched/core: split iowait state into two states
  2024-04-24 10:08     ` Christian Loehle
@ 2024-04-25 10:16       ` Peter Zijlstra
  2024-04-25 10:39         ` Christian Loehle
  0 siblings, 1 reply; 5+ messages in thread
From: Peter Zijlstra @ 2024-04-25 10:16 UTC
  To: Christian Loehle
  Cc: Jens Axboe, linux-kernel, tglx, Rafael J. Wysocki, linux-pm, daniel.lezcano

On Wed, Apr 24, 2024 at 11:08:42AM +0100, Christian Loehle wrote:
> On 24/04/2024 11:01, Peter Zijlstra wrote:
> > On Tue, Apr 16, 2024 at 06:11:21AM -0600, Jens Axboe wrote:
> >> iowait is a bogus metric, but it's helpful in the sense that it allows
> >> short waits to not enter sleep states that have a higher exit latency
> >> than would've otherwise have been picked for iowait'ing tasks. However,
> >> it's harmless in that lots of applications and monitoring assumes that
> >> iowait is busy time, or otherwise use it as a health metric.
> >> Particularly for async IO it's entirely nonsensical.
> >
> > Let me get this straight, all of this is about working around
> > cpuidle menu governor insanity?
> >
> > Rafael, how far along are we with fully deprecating that thing? Yes it
> > still exists, but should people really be using it still?
> >
>
> Well there is also the iowait boost handling in schedutil and intel_pstate
> which, at least in synthetic benchmarks, does have an effect [1].

Those are cpufreq not cpuidle and at least they don't use nr_iowait. The
original Changelog mentioned idle states, and I hate on menu for using
nr_iowait.

> io_uring (the only user of iowait but not iowait_acct) works around both.
>
> See commit ("8a796565cec3 io_uring: Use io_schedule* in cqring wait")
>
> [1]
> https://lore.kernel.org/lkml/20240304201625.100619-1-christian.loehle@arm.com/#t

So while I agree with most of the shortcomings listed in that set,
however that patch is quite terrifying.

I would prefer to start with something a *lot* simpler. How about a tick
driven decay of iops count per task. And that whole step array
*shudder*.
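
[Illustration, not part of the original message] The "tick driven decay of iops count per task" is not spelled out anywhere in the thread. One possible reading, as a minimal userspace sketch: each task keeps a counter that is bumped on every iowait wakeup and halved on every tick, so it tracks recent I/O activity and fades when the task stops doing I/O. The struct, the function names, the halving decay and the threshold test are all assumptions for illustration. This is not kernel code and not necessarily what was meant.

```c
/* Hypothetical per-task state; not a real kernel structure. */
struct io_task {
    unsigned int io_count;  /* decayed count of recent iowait wakeups */
};

/* Would be called each time the task wakes up from an I/O wait. */
static void io_task_wakeup(struct io_task *t)
{
    t->io_count++;
}

/* Would be called once per scheduler tick: halve the count (assumed decay). */
static void io_task_tick(struct io_task *t)
{
    t->io_count >>= 1;
}

/* Assumed policy: treat the task as I/O-active while the count is high. */
static int io_task_wants_boost(const struct io_task *t, unsigned int threshold)
{
    return t->io_count >= threshold;
}
```

In this sketch a task that stops issuing I/O loses its "I/O-active" status within a few ticks, with no per-step table to maintain. Whether a counter like this can tell boost-worthy workloads from ones that merely wake up often is the open question in this subthread.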
* Re: [PATCH 4/4] sched/core: split iowait state into two states
  2024-04-25 10:16       ` Peter Zijlstra
@ 2024-04-25 10:39         ` Christian Loehle
  0 siblings, 0 replies; 5+ messages in thread
From: Christian Loehle @ 2024-04-25 10:39 UTC
  To: Peter Zijlstra
  Cc: Jens Axboe, linux-kernel, tglx, Rafael J. Wysocki, linux-pm, daniel.lezcano

On 25/04/2024 11:16, Peter Zijlstra wrote:
> On Wed, Apr 24, 2024 at 11:08:42AM +0100, Christian Loehle wrote:
>> On 24/04/2024 11:01, Peter Zijlstra wrote:
>>> On Tue, Apr 16, 2024 at 06:11:21AM -0600, Jens Axboe wrote:
>>>> iowait is a bogus metric, but it's helpful in the sense that it allows
>>>> short waits to not enter sleep states that have a higher exit latency
>>>> than would've otherwise have been picked for iowait'ing tasks. However,
>>>> it's harmless in that lots of applications and monitoring assumes that
>>>> iowait is busy time, or otherwise use it as a health metric.
>>>> Particularly for async IO it's entirely nonsensical.
>>>
>>> Let me get this straight, all of this is about working around
>>> cpuidle menu governor insanity?
>>>
>>> Rafael, how far along are we with fully deprecating that thing? Yes it
>>> still exists, but should people really be using it still?
>>>
>>
>> Well there is also the iowait boost handling in schedutil and intel_pstate
>> which, at least in synthetic benchmarks, does have an effect [1].
>
> Those are cpufreq not cpuidle and at least they don't use nr_iowait. The
> original Changelog mentioned idle states, and I hate on menu for using
> nr_iowait.

I'd say they care about any regression, but I'll let Jens answer that.
The original change also mentions cpufreq and Jens did mention in an
earlier version that he doesn't care, for them it's all just increased
latency ;)
https://lore.kernel.org/lkml/00d36e83-c9a5-412d-bf49-2e109308d6cd@arm.com/T/#m216536520bc31846aff5875993d22f446a37b297

>
>> io_uring (the only user of iowait but not iowait_acct) works around both.
>>
>> See commit ("8a796565cec3 io_uring: Use io_schedule* in cqring wait")
>>
>> [1]
>> https://lore.kernel.org/lkml/20240304201625.100619-1-christian.loehle@arm.com/#t
>
> So while I agree with most of the shortcomings listed in that set,
> however that patch is quite terrifying.

Not disagreeing with you on that.

>
> I would prefer to start with something a *lot* simpler. How about a tick
> driven decay of iops count per task. And that whole step array
> *shudder*.

It's an attempt of solving unnecessary boosting based upon what is there for
us to work with now: iowait wakeups.
There are many workloads with e.g. > 5000 iowait wakeups per second that don't
benefit from boosting at all (and therefore it's a complete energy waste).
I don't see anything obvious how we would attempt to detect non-boost-worthy
scenarios with a tick driven decay count, but please do elaborate.

(If you *really* care about IO throughput, the task wakeup path is hopefully
not critical anyway (i.e. you do everything in your power to have IO pending
during that time) and then we don't need boosting, but just looking at a
tick-length period doesn't let us distinguish those scenarios AFAICS.)

Regards,
Christian
* Re: [PATCH 4/4] sched/core: split iowait state into two states
  2024-04-24 10:01   ` [PATCH 4/4] sched/core: split iowait state into two states Peter Zijlstra
  2024-04-24 10:08     ` Christian Loehle
@ 2024-04-25 14:20     ` Rafael J. Wysocki
  1 sibling, 0 replies; 5+ messages in thread
From: Rafael J. Wysocki @ 2024-04-25 14:20 UTC
  To: Jens Axboe, Peter Zijlstra
  Cc: linux-kernel, tglx, linux-pm, daniel.lezcano, Rafael J. Wysocki

On Wednesday, April 24, 2024 12:01:27 PM CEST Peter Zijlstra wrote:
> On Tue, Apr 16, 2024 at 06:11:21AM -0600, Jens Axboe wrote:
> > iowait is a bogus metric, but it's helpful in the sense that it allows
> > short waits to not enter sleep states that have a higher exit latency
> > than would've otherwise have been picked for iowait'ing tasks. However,
> > it's harmless in that lots of applications and monitoring assumes that
> > iowait is busy time, or otherwise use it as a health metric.
> > Particularly for async IO it's entirely nonsensical.
>
> Let me get this straight, all of this is about working around
> cpuidle menu governor insanity?
>
> Rafael, how far along are we with fully deprecating that thing? Yes it
> still exists, but should people really be using it still?

Well, they appear to be used to it ...