* [PATCH] cros_ec_spi: Even though we're RT priority, don't bump cpu freq @ 2020-06-10 22:18 Douglas Anderson 2020-06-11 11:03 ` Quentin Perret 2020-06-12 12:52 ` Qais Yousef 0 siblings, 2 replies; 26+ messages in thread From: Douglas Anderson @ 2020-06-10 22:18 UTC (permalink / raw) To: Benson Leung, Enric Balletbo i Serra Cc: hsinyi, joelaf, peterz, drinkcat, gwendal, qperret, ctheegal, Douglas Anderson, Guenter Roeck, linux-kernel The cros_ec_spi driver is realtime priority so that it doesn't get preempted by other taks while it's talking to the EC but overall it really doesn't need lots of compute power. Unfortunately, by default, the kernel assumes that all realtime tasks should cause the cpufreq to jump to max and burn through power to get things done as quickly as possible. That's just not the correct behavior for cros_ec_spi. Switch to manually overriding the default. This won't help us if our work moves over to the SPI pump thread but that's not the common code path. Signed-off-by: Douglas Anderson <dianders@chromium.org> --- NOTE: This would cause a conflict if the patch https://lore.kernel.org/r/20200422112831.870192415@infradead.org lands first drivers/platform/chrome/cros_ec_spi.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/drivers/platform/chrome/cros_ec_spi.c b/drivers/platform/chrome/cros_ec_spi.c index debea5c4c829..76d59d5e7efd 100644 --- a/drivers/platform/chrome/cros_ec_spi.c +++ b/drivers/platform/chrome/cros_ec_spi.c @@ -709,8 +709,11 @@ static void cros_ec_spi_high_pri_release(void *worker) static int cros_ec_spi_devm_high_pri_alloc(struct device *dev, struct cros_ec_spi *ec_spi) { - struct sched_param sched_priority = { - .sched_priority = MAX_RT_PRIO / 2, + struct sched_attr sched_attr = { + .sched_policy = SCHED_FIFO, + .sched_priority = MAX_RT_PRIO / 2, + .sched_flags = SCHED_FLAG_UTIL_CLAMP_MIN, + .sched_util_min = 0, }; int err; @@ -728,8 +731,7 @@ static int cros_ec_spi_devm_high_pri_alloc(struct device *dev, if (err) return err; - err = sched_setscheduler_nocheck(ec_spi->high_pri_worker->task, - SCHED_FIFO, &sched_priority); + err = sched_setattr_nocheck(ec_spi->high_pri_worker->task, &sched_attr); if (err) dev_err(dev, "Can't set cros_ec high pri priority: %d\n", err); return err; -- 2.27.0.290.gba653c62da-goog ^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [PATCH] cros_ec_spi: Even though we're RT priority, don't bump cpu freq 2020-06-10 22:18 [PATCH] cros_ec_spi: Even though we're RT priority, don't bump cpu freq Douglas Anderson @ 2020-06-11 11:03 ` Quentin Perret 2020-06-11 17:48 ` Doug Anderson 2020-06-12 12:52 ` Qais Yousef 1 sibling, 1 reply; 26+ messages in thread From: Quentin Perret @ 2020-06-11 11:03 UTC (permalink / raw) To: Douglas Anderson Cc: Benson Leung, Enric Balletbo i Serra, hsinyi, joelaf, peterz, drinkcat, gwendal, ctheegal, Guenter Roeck, linux-kernel Hi Doug, On Wednesday 10 Jun 2020 at 15:18:43 (-0700), Douglas Anderson wrote: > The cros_ec_spi driver is realtime priority so that it doesn't get > preempted by other taks while it's talking to the EC but overall it > really doesn't need lots of compute power. Unfortunately, by default, > the kernel assumes that all realtime tasks should cause the cpufreq to > jump to max and burn through power to get things done as quickly as > possible. That's just not the correct behavior for cros_ec_spi. Is this specific to this driver, or something you would want applied more globally to all RT tasks in ChromeOS (which is what we'd like to have in Android for instance)? IOW, how do you feel about 20200511154053.7822-1-qais.yousef@arm.com ? Otherwise, the patch looks good to me. Thanks, Quentin ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] cros_ec_spi: Even though we're RT priority, don't bump cpu freq 2020-06-11 11:03 ` Quentin Perret @ 2020-06-11 17:48 ` Doug Anderson 2020-06-12 9:24 ` Quentin Perret 0 siblings, 1 reply; 26+ messages in thread From: Doug Anderson @ 2020-06-11 17:48 UTC (permalink / raw) To: Quentin Perret Cc: Benson Leung, Enric Balletbo i Serra, hsinyi, Joel Fernandes, Peter Zijlstra, Nicolas Boichat, Gwendal Grignou, ctheegal, Guenter Roeck, LKML Hi, On Thu, Jun 11, 2020 at 4:03 AM Quentin Perret <qperret@google.com> wrote: > > Hi Doug, > > On Wednesday 10 Jun 2020 at 15:18:43 (-0700), Douglas Anderson wrote: > > The cros_ec_spi driver is realtime priority so that it doesn't get > > preempted by other taks while it's talking to the EC but overall it > > really doesn't need lots of compute power. Unfortunately, by default, > > the kernel assumes that all realtime tasks should cause the cpufreq to > > jump to max and burn through power to get things done as quickly as > > possible. That's just not the correct behavior for cros_ec_spi. > > Is this specific to this driver, or something you would want applied > more globally to all RT tasks in ChromeOS (which is what we'd like to > have in Android for instance)? Hrm. I guess my first instinct is to say that we still want this patch even if we have something that is applied more globally. Specifically it sounds as if the patch you point at is suggesting that we'd tweak the boost value to something other than max but we'd still have a boost value. In the case of cros_ec_spi I don't believe we need any boost value at all, so my patch would still be useful. The computational needs of cros_ec_spi are very modest and it can do its work at lower CPU frequencies just fine. It just can't be interrupted for large swaths of time. > IOW, how do you feel about 20200511154053.7822-1-qais.yousef@arm.com ? I'm not totally a fan, but I'm definitely not an expert in this area (I've also only read the patch description and not the patch or the whole thread). I really don't want yet another value that I need to tune from board to board. Even worse, this tuning value isn't board-specific but a combination of board and software specific. By this, I'd imagine a scenario where you're using a real-time task to get audio decoding done within a certain latency. I guess you'd tune this value to make sure that you can get all your audio decoding done in time but also not burn extra power. Now, imagine that the OS upgrades and the audio task suddenly has to decode more complex streams. You've got to go across all of your boards and re-tune every one? ...or, nobody thinks about it and older boards start getting stuttery audio? Perhaps the opposite happens and someone comes up with a newer lower-cpu-intensive codec and you could save power. Sounds like a bit of a nightmare. I'd rather have a boolean value: boost all RT threads to max vs. don't boost all RT threads to max. Someone that just wanted RT stuff to run as fast as possible without any hassle on their system and didn't care about power efficiency could turn this on. Anyone who really cared about power could turn this off and then could find a more targeted way to boost things, hopefully in a way that doesn't require tuning. One option would be to still boost the CPU to max but only for certain tasks known to be really latency sensitive. Another might be to somehow measure whether or not the task is making its deadlines and boost the CPU frequency up if deadlines are not being met. I'm sure there are fancier ways. ...of course, I believe your patch allows me to do what I want: I can just set the default boost to 0. It just leaves in the temptation for others to require a default boost of something else and then I'm stuck in my tuning nightmare. -Doug -Doug ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] cros_ec_spi: Even though we're RT priority, don't bump cpu freq 2020-06-11 17:48 ` Doug Anderson @ 2020-06-12 9:24 ` Quentin Perret 2020-06-12 12:34 ` Qais Yousef 0 siblings, 1 reply; 26+ messages in thread From: Quentin Perret @ 2020-06-12 9:24 UTC (permalink / raw) To: Doug Anderson Cc: Benson Leung, Enric Balletbo i Serra, hsinyi, Joel Fernandes, Peter Zijlstra, Nicolas Boichat, Gwendal Grignou, ctheegal, Guenter Roeck, LKML, qais.yousef +CC Qais [FYI] On Thursday 11 Jun 2020 at 10:48:40 (-0700), Doug Anderson wrote: > Hrm. I guess my first instinct is to say that we still want this > patch even if we have something that is applied more globally. Fair enough. > Specifically it sounds as if the patch you point at is suggesting that > we'd tweak the boost value to something other than max but we'd still > have a boost value. In the case of cros_ec_spi I don't believe we > need any boost value at all, so my patch would still be useful. The > computational needs of cros_ec_spi are very modest and it can do its > work at lower CPU frequencies just fine. It just can't be interrupted > for large swaths of time. > > > > IOW, how do you feel about 20200511154053.7822-1-qais.yousef@arm.com ? > > I'm not totally a fan, but I'm definitely not an expert in this area > (I've also only read the patch description and not the patch or the > whole thread). I really don't want yet another value that I need to > tune from board to board. Even worse, this tuning value isn't > board-specific but a combination of board and software specific. By > this, I'd imagine a scenario where you're using a real-time task to > get audio decoding done within a certain latency. I guess you'd tune > this value to make sure that you can get all your audio decoding done > in time but also not burn extra power. Now, imagine that the OS > upgrades and the audio task suddenly has to decode more complex > streams. You've got to go across all of your boards and re-tune every > one? ...or, nobody thinks about it and older boards start getting > stuttery audio? Perhaps the opposite happens and someone comes up > with a newer lower-cpu-intensive codec and you could save power. > Sounds like a bit of a nightmare. > > I'd rather have a boolean value: boost all RT threads to max vs. don't > boost all RT threads to max. Someone that just wanted RT stuff to run > as fast as possible without any hassle on their system and didn't care > about power efficiency could turn this on. Anyone who really cared > about power could turn this off and then could find a more targeted > way to boost things, hopefully in a way that doesn't require tuning. > One option would be to still boost the CPU to max but only for certain > tasks known to be really latency sensitive. Another might be to > somehow measure whether or not the task is making its deadlines and > boost the CPU frequency up if deadlines are not being met. I'm sure > there are fancier ways. > > ...of course, I believe your patch allows me to do what I want: I can > just set the default boost to 0. It just leaves in the temptation for > others to require a default boost of something else and then I'm stuck > in my tuning nightmare. Right, so I am not disagreeing at all with the above. FWIW, I expect Android to set this default value to 0 as you mentioned, so that uclamp basically becomes 'opt-in' for _all_ tasks, including RT. Now, given that Qais' patch is commiting something to userspace, I think it makes sense to expose the full range rather than a boolean value, as it's probably more future-proof. That is, if we expose a boolean knob now, and somebody wants to be able to set a default value in the middle in a few years, we'll have to add another knob, and maintain both (which sucks). But it's just my personal opinion. Feel free to jump in the other thread if you feel differently :) Cheers, Quentin ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] cros_ec_spi: Even though we're RT priority, don't bump cpu freq 2020-06-12 9:24 ` Quentin Perret @ 2020-06-12 12:34 ` Qais Yousef 2020-06-12 13:47 ` Quentin Perret 2020-06-18 21:31 ` Doug Anderson 0 siblings, 2 replies; 26+ messages in thread From: Qais Yousef @ 2020-06-12 12:34 UTC (permalink / raw) To: Quentin Perret Cc: Doug Anderson, Benson Leung, Enric Balletbo i Serra, hsinyi, Joel Fernandes, Peter Zijlstra, Nicolas Boichat, Gwendal Grignou, ctheegal, Guenter Roeck, LKML On 06/12/20 10:24, Quentin Perret wrote: > +CC Qais [FYI] Thanks for the CC. > > On Thursday 11 Jun 2020 at 10:48:40 (-0700), Doug Anderson wrote: > > Hrm. I guess my first instinct is to say that we still want this > > patch even if we have something that is applied more globally. > > Fair enough. > > > Specifically it sounds as if the patch you point at is suggesting that > > we'd tweak the boost value to something other than max but we'd still > > have a boost value. In the case of cros_ec_spi I don't believe we > > need any boost value at all, so my patch would still be useful. The > > computational needs of cros_ec_spi are very modest and it can do its > > work at lower CPU frequencies just fine. It just can't be interrupted > > for large swaths of time. > > > > > > > IOW, how do you feel about 20200511154053.7822-1-qais.yousef@arm.com ? > > > > I'm not totally a fan, but I'm definitely not an expert in this area > > (I've also only read the patch description and not the patch or the > > whole thread). I really don't want yet another value that I need to > > tune from board to board. Even worse, this tuning value isn't > > board-specific but a combination of board and software specific. By > > this, I'd imagine a scenario where you're using a real-time task to > > get audio decoding done within a certain latency. I guess you'd tune > > this value to make sure that you can get all your audio decoding done > > in time but also not burn extra power. Now, imagine that the OS > > upgrades and the audio task suddenly has to decode more complex > > streams. You've got to go across all of your boards and re-tune every > > one? ...or, nobody thinks about it and older boards start getting > > stuttery audio? Perhaps the opposite happens and someone comes up > > with a newer lower-cpu-intensive codec and you could save power. > > Sounds like a bit of a nightmare. Generally I would expect this global tunable to be part of a vendor's SoC BSP. People tend to think of the flagship SoCs which are powerful, but if you consider the low and medium end devices there's a massive spectrum over there that this range is trying to cover. I would expect older boards init script to be separate for newer boards init script. The OS by default boosts all RT tasks unless a platform specific script overrides that. So I can't see how an OS upgrade would affect older boards. This knob still allows you to disable the max boosting and use the per-task uclamp interface to boost only those tasks you care about. AFAIK this is already done in a hacky way in android devices via special vendors provisions. > > > > I'd rather have a boolean value: boost all RT threads to max vs. don't > > boost all RT threads to max. Someone that just wanted RT stuff to run If that's what your use case requires, you can certainly treat it like a boolean if you want. > > as fast as possible without any hassle on their system and didn't care > > about power efficiency could turn this on. Anyone who really cared > > about power could turn this off and then could find a more targeted > > way to boost things, hopefully in a way that doesn't require tuning. > > One option would be to still boost the CPU to max but only for certain > > tasks known to be really latency sensitive. Another might be to per-task uclamp interface allows you to do that. But SoC vendors/system integrators need to decide that. I'm saying this with Android in mind specifically. Linux based laptops that are tuned in similar way are rare. But hopefully this will change at some point :) > > somehow measure whether or not the task is making its deadlines and > > boost the CPU frequency up if deadlines are not being met. I'm sure > > there are fancier ways. You need to use SCHED_DEADLINE then :) Making sure that RT tasks meet their deadlines is a userspace tuning problem. There's no way the kernel can know if an RT task is meeting its deadline. > > > > ...of course, I believe your patch allows me to do what I want: I can > > just set the default boost to 0. It just leaves in the temptation for > > others to require a default boost of something else and then I'm stuck > > in my tuning nightmare. To cover all possible systems Linux can run on, this variability is required. On general purpose systems with decent SoCs, I'd expect them to just want to set this to 0 and selectively boost specific RT tasks. > > Right, so I am not disagreeing at all with the above. FWIW, I expect > Android to set this default value to 0 as you mentioned, so that uclamp > basically becomes 'opt-in' for _all_ tasks, including RT. > > Now, given that Qais' patch is commiting something to userspace, I think > it makes sense to expose the full range rather than a boolean value, as > it's probably more future-proof. That is, if we expose a boolean knob > now, and somebody wants to be able to set a default value in the middle > in a few years, we'll have to add another knob, and maintain both (which > sucks). But it's just my personal opinion. Feel free to jump in the > other thread if you feel differently :) I guess you're talking about Android/Chromium specific knobs on top of the sysctl one here. Not sure why you need that since init scripts should be able to modify this directly since they run as root. Cheers -- Qais Yousef ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] cros_ec_spi: Even though we're RT priority, don't bump cpu freq 2020-06-12 12:34 ` Qais Yousef @ 2020-06-12 13:47 ` Quentin Perret 2020-06-12 14:00 ` Qais Yousef 2020-06-18 21:31 ` Doug Anderson 1 sibling, 1 reply; 26+ messages in thread From: Quentin Perret @ 2020-06-12 13:47 UTC (permalink / raw) To: Qais Yousef Cc: Doug Anderson, Benson Leung, Enric Balletbo i Serra, hsinyi, Joel Fernandes, Peter Zijlstra, Nicolas Boichat, Gwendal Grignou, ctheegal, Guenter Roeck, LKML On Friday 12 Jun 2020 at 13:34:48 (+0100), Qais Yousef wrote: > > On Thursday 11 Jun 2020 at 10:48:40 (-0700), Doug Anderson wrote: > > > I'm not totally a fan, but I'm definitely not an expert in this area > > > (I've also only read the patch description and not the patch or the > > > whole thread). I really don't want yet another value that I need to > > > tune from board to board. Even worse, this tuning value isn't > > > board-specific but a combination of board and software specific. By > > > this, I'd imagine a scenario where you're using a real-time task to > > > get audio decoding done within a certain latency. I guess you'd tune > > > this value to make sure that you can get all your audio decoding done > > > in time but also not burn extra power. Now, imagine that the OS > > > upgrades and the audio task suddenly has to decode more complex > > > streams. You've got to go across all of your boards and re-tune every > > > one? ...or, nobody thinks about it and older boards start getting > > > stuttery audio? Perhaps the opposite happens and someone comes up > > > with a newer lower-cpu-intensive codec and you could save power. > > > Sounds like a bit of a nightmare. > > Generally I would expect this global tunable to be part of a vendor's SoC BSP. > > People tend to think of the flagship SoCs which are powerful, but if you > consider the low and medium end devices there's a massive spectrum over there > that this range is trying to cover. > > I would expect older boards init script to be separate for newer boards init > script. The OS by default boosts all RT tasks unless a platform specific script > overrides that. So I can't see how an OS upgrade would affect older boards. I think Doug meant that the device-specific values need re-tuning in case of major OS updates, which is indeed a pain. But yeah, I'm not sure if we have a better solution than that, though. > This knob still allows you to disable the max boosting and use the per-task > uclamp interface to boost only those tasks you care about. AFAIK this is > already done in a hacky way in android devices via special vendors provisions. > > > > > > > I'd rather have a boolean value: boost all RT threads to max vs. don't > > > boost all RT threads to max. Someone that just wanted RT stuff to run > > If that's what your use case requires, you can certainly treat it like > a boolean if you want. +1 > > > as fast as possible without any hassle on their system and didn't care > > > about power efficiency could turn this on. Anyone who really cared > > > about power could turn this off and then could find a more targeted > > > way to boost things, hopefully in a way that doesn't require tuning. > > > One option would be to still boost the CPU to max but only for certain > > > tasks known to be really latency sensitive. Another might be to > > per-task uclamp interface allows you to do that. But SoC vendors/system > integrators need to decide that. I'm saying this with Android in mind > specifically. Linux based laptops that are tuned in similar way are rare. But > hopefully this will change at some point :) > > > > somehow measure whether or not the task is making its deadlines and > > > boost the CPU frequency up if deadlines are not being met. I'm sure > > > there are fancier ways. > > You need to use SCHED_DEADLINE then :) Well, not quite :-) The frequency selection for DL is purely based on the userspace-provided parameters, from which we derive the bandwidth request. But we don't do things like 'raise the frequency if the actual runtime gets close to the WCET', or anything of the sort. All of that would have to be implemented in userspace ATM. Cheers, Quentin ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] cros_ec_spi: Even though we're RT priority, don't bump cpu freq 2020-06-12 13:47 ` Quentin Perret @ 2020-06-12 14:00 ` Qais Yousef 0 siblings, 0 replies; 26+ messages in thread From: Qais Yousef @ 2020-06-12 14:00 UTC (permalink / raw) To: Quentin Perret Cc: Doug Anderson, Benson Leung, Enric Balletbo i Serra, hsinyi, Joel Fernandes, Peter Zijlstra, Nicolas Boichat, Gwendal Grignou, ctheegal, Guenter Roeck, LKML On 06/12/20 14:47, Quentin Perret wrote: [...] > > > > somehow measure whether or not the task is making its deadlines and > > > > boost the CPU frequency up if deadlines are not being met. I'm sure > > > > there are fancier ways. > > > > You need to use SCHED_DEADLINE then :) > > Well, not quite :-) > > The frequency selection for DL is purely based on the userspace-provided > parameters, from which we derive the bandwidth request. But we don't do > things like 'raise the frequency if the actual runtime gets close to the > WCET', or anything of the sort. All of that would have to be implemented > in userspace ATM. Hmm okay. I am not a deadline expert so sorry if I was misleading. But it's the only scheduling class that has the concept of deadline. Cheers -- Qais Yousef ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] cros_ec_spi: Even though we're RT priority, don't bump cpu freq 2020-06-12 12:34 ` Qais Yousef 2020-06-12 13:47 ` Quentin Perret @ 2020-06-18 21:31 ` Doug Anderson 2020-06-19 15:31 ` Qais Yousef 1 sibling, 1 reply; 26+ messages in thread From: Doug Anderson @ 2020-06-18 21:31 UTC (permalink / raw) To: Qais Yousef Cc: Quentin Perret, Benson Leung, Enric Balletbo i Serra, hsinyi, Joel Fernandes, Peter Zijlstra, Nicolas Boichat, Gwendal Grignou, ctheegal, Guenter Roeck, LKML Hi, On Fri, Jun 12, 2020 at 5:34 AM Qais Yousef <qais.yousef@arm.com> wrote: > > On 06/12/20 10:24, Quentin Perret wrote: > > +CC Qais [FYI] > > Thanks for the CC. > > > > > On Thursday 11 Jun 2020 at 10:48:40 (-0700), Doug Anderson wrote: > > > Hrm. I guess my first instinct is to say that we still want this > > > patch even if we have something that is applied more globally. > > > > Fair enough. > > > > > Specifically it sounds as if the patch you point at is suggesting that > > > we'd tweak the boost value to something other than max but we'd still > > > have a boost value. In the case of cros_ec_spi I don't believe we > > > need any boost value at all, so my patch would still be useful. The > > > computational needs of cros_ec_spi are very modest and it can do its > > > work at lower CPU frequencies just fine. It just can't be interrupted > > > for large swaths of time. > > > > > > > > > > IOW, how do you feel about 20200511154053.7822-1-qais.yousef@arm.com ? > > > > > > I'm not totally a fan, but I'm definitely not an expert in this area > > > (I've also only read the patch description and not the patch or the > > > whole thread). I really don't want yet another value that I need to > > > tune from board to board. Even worse, this tuning value isn't > > > board-specific but a combination of board and software specific. By > > > this, I'd imagine a scenario where you're using a real-time task to > > > get audio decoding done within a certain latency. I guess you'd tune > > > this value to make sure that you can get all your audio decoding done > > > in time but also not burn extra power. Now, imagine that the OS > > > upgrades and the audio task suddenly has to decode more complex > > > streams. You've got to go across all of your boards and re-tune every > > > one? ...or, nobody thinks about it and older boards start getting > > > stuttery audio? Perhaps the opposite happens and someone comes up > > > with a newer lower-cpu-intensive codec and you could save power. > > > Sounds like a bit of a nightmare. > > Generally I would expect this global tunable to be part of a vendor's SoC BSP. I think I'm coming at this from a very different world than the one you're thinking of. The concept of a BSP makes me think of a SoC vendor providing a drop of Linux with all their own weird hacks in it that only ever works on that one SoC and will never, ever, ever be updated. 5 years down the road if a software update is needed (security fix?) some poor soul will be in charge of tracking down the exact ancient code base that was used to build the original kernel, making the tweak, and building a new kernel. ;-) In the world of Chrome OS we try very very hard to build everything from a clean code base and are trying hard to stay even closer to upstream and away from per-device weirdness... > People tend to think of the flagship SoCs which are powerful, but if you > consider the low and medium end devices there's a massive spectrum over there > that this range is trying to cover. Yeah, perhaps you're right. When thinking about a laptop it's almost always a fairly powerful device and things could certainly be different for very low end chips trying to run Linux. > I would expect older boards init script to be separate for newer boards init > script. The OS by default boosts all RT tasks unless a platform specific script > overrides that. So I can't see how an OS upgrade would affect older boards. In the Chrome OS world I'm part of, the people supporting the boards are not always the people adding new whizbang features. We get new versions of the OS every 6 weeks and those new versions may change the way things work pretty significantly. We ship those new versions off to all boards. Certainly they undergo testing, but there are a lot of boards and if something is not tuned as well as it was when the device first shipped it's likely nobody will really notice. This is my bias against needing per-board tuning. > This knob still allows you to disable the max boosting and use the per-task > uclamp interface to boost only those tasks you care about. AFAIK this is > already done in a hacky way in android devices via special vendors provisions. Yeah. I don't think I have enough skin in the game to really say one way or the other what the API should be so I'll probably stay off the other, bigger thread and let others decide what the best API should be. For Chrome OS I'd probably advocate just disabling the default boost but even there I'd be willing to defer to the folks doing the actual work. -Doug ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] cros_ec_spi: Even though we're RT priority, don't bump cpu freq 2020-06-18 21:31 ` Doug Anderson @ 2020-06-19 15:31 ` Qais Yousef 2020-06-22 18:19 ` Doug Anderson 0 siblings, 1 reply; 26+ messages in thread From: Qais Yousef @ 2020-06-19 15:31 UTC (permalink / raw) To: Doug Anderson Cc: Quentin Perret, Benson Leung, Enric Balletbo i Serra, hsinyi, Joel Fernandes, Peter Zijlstra, Nicolas Boichat, Gwendal Grignou, ctheegal, Guenter Roeck, LKML Hi Doug, On 06/18/20 14:31, Doug Anderson wrote: > Hi, > > On Fri, Jun 12, 2020 at 5:34 AM Qais Yousef <qais.yousef@arm.com> wrote: > > > > On 06/12/20 10:24, Quentin Perret wrote: > > > +CC Qais [FYI] > > > > Thanks for the CC. > > > > > > > > On Thursday 11 Jun 2020 at 10:48:40 (-0700), Doug Anderson wrote: > > > > Hrm. I guess my first instinct is to say that we still want this > > > > patch even if we have something that is applied more globally. > > > > > > Fair enough. > > > > > > > Specifically it sounds as if the patch you point at is suggesting that > > > > we'd tweak the boost value to something other than max but we'd still > > > > have a boost value. In the case of cros_ec_spi I don't believe we > > > > need any boost value at all, so my patch would still be useful. The > > > > computational needs of cros_ec_spi are very modest and it can do its > > > > work at lower CPU frequencies just fine. It just can't be interrupted > > > > for large swaths of time. > > > > > > > > > > > > > IOW, how do you feel about 20200511154053.7822-1-qais.yousef@arm.com ? > > > > > > > > I'm not totally a fan, but I'm definitely not an expert in this area > > > > (I've also only read the patch description and not the patch or the > > > > whole thread). I really don't want yet another value that I need to > > > > tune from board to board. Even worse, this tuning value isn't > > > > board-specific but a combination of board and software specific. By > > > > this, I'd imagine a scenario where you're using a real-time task to > > > > get audio decoding done within a certain latency. I guess you'd tune > > > > this value to make sure that you can get all your audio decoding done > > > > in time but also not burn extra power. Now, imagine that the OS > > > > upgrades and the audio task suddenly has to decode more complex > > > > streams. You've got to go across all of your boards and re-tune every > > > > one? ...or, nobody thinks about it and older boards start getting > > > > stuttery audio? Perhaps the opposite happens and someone comes up > > > > with a newer lower-cpu-intensive codec and you could save power. > > > > Sounds like a bit of a nightmare. > > > > Generally I would expect this global tunable to be part of a vendor's SoC BSP. > > I think I'm coming at this from a very different world than the one > you're thinking of. The concept of a BSP makes me think of a SoC > vendor providing a drop of Linux with all their own weird hacks in it > that only ever works on that one SoC and will never, ever, ever be > updated. 5 years down the road if a software update is needed > (security fix?) some poor soul will be in charge of tracking down the > exact ancient code base that was used to build the original kernel, > making the tweak, and building a new kernel. ;-) > > In the world of Chrome OS we try very very hard to build everything > from a clean code base and are trying hard to stay even closer to > upstream and away from per-device weirdness... Hmm I thought OEMs who ship stuff that are based on Chrome OS would have to do the final tuning here, which would be based on the recommendation of the SoC vendor. And I'm being overly generic here to think not only SoC from Intel which traditionally they have been treated in different ways. I think you provide a generic stack for OEMs, no? Generally for RT tasks, Linux always required an admin to do the tuning. And to provide an automatic solution that fits all is not easy to happen, because what ultimately is important for userspace, is only known by the userspace. This is an interesting problem for me personally that I have been trying to look at in my free time. On general purpose systems, it's hard to reason about what's important because, as you say, you distribute something that should target a wide range of platforms. And sometimes the end user (like a person installing Ubuntu Desktop on their laptop), have no clue what a driver is even to pick the right tuning for all RT tasks in their system. But this problem already exists and catching up with this will need more work from both distros and maybe kernel. I can't see a possible situation where the kernel can do all the tuning without userspace providing more info anyway. The per-device weirdness you're talking about is just how the way goes in a world where there are many SoCs that are created to target different budgets and use cases. So I don't see the 2 worlds (laptops/mobile) really different, they just traditionally started differently where such variations in SoC didn't exist that much. But there are more Arm based SoCs that are targetting laptop markets now.. So you never know when they will be dealing with the same variations the mobile world sees today. > > > > People tend to think of the flagship SoCs which are powerful, but if you > > consider the low and medium end devices there's a massive spectrum over there > > that this range is trying to cover. > > Yeah, perhaps you're right. When thinking about a laptop it's almost > always a fairly powerful device and things could certainly be > different for very low end chips trying to run Linux. I think there are low end laptops in the market that barely have enough grunt to do work. So not all laptops are fairly powerful. > > > > I would expect older boards init script to be separate for newer boards init > > script. The OS by default boosts all RT tasks unless a platform specific script > > overrides that. So I can't see how an OS upgrade would affect older boards. > > In the Chrome OS world I'm part of, the people supporting the boards > are not always the people adding new whizbang features. We get new > versions of the OS every 6 weeks and those new versions may change the > way things work pretty significantly. We ship those new versions off > to all boards. Certainly they undergo testing, but there are a lot of > boards and if something is not tuned as well as it was when the device > first shipped it's likely nobody will really notice. This is my bias > against needing per-board tuning. Apologies for my lack knowledge about the exact process in Chrome OS. In my head, I think what you do is like what Android and other distros do. Provide a software stack that can be used on any platform. On Android OEMs do have to do the tuning (with a help from SoC vendor usually). Not many laptops ship with Linux as their default OS, but this is an increasing trend and LWN did write something recently about the demand is increasing for Linux based laptops. I thought Chrome OS is the same, where an OEM will take your deliverables and do the final integration. So I agree at your level it might be hard to reason about the full spectrum. But I thought the OEMs that wants to ship Chrome OS will need to look at init scripts, drivers etc to ensure their laptop is competitive. > > > > This knob still allows you to disable the max boosting and use the per-task > > uclamp interface to boost only those tasks you care about. AFAIK this is > > already done in a hacky way in android devices via special vendors provisions. > > Yeah. I don't think I have enough skin in the game to really say one > way or the other what the API should be so I'll probably stay off the > other, bigger thread and let others decide what the best API should > be. For Chrome OS I'd probably advocate just disabling the default > boost but even there I'd be willing to defer to the folks doing the > actual work. Yes I appreciate the challenges you have. I think for most users disabling is good enough for them as RT tasks could end up causing a lot of power to be wasted unnecessarily. But if someone wants to take the extra mile and squeeze the best compromise, they can :) Thanks -- Qais Yousef ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] cros_ec_spi: Even though we're RT priority, don't bump cpu freq 2020-06-19 15:31 ` Qais Yousef @ 2020-06-22 18:19 ` Doug Anderson 2020-06-23 15:56 ` Qais Yousef 0 siblings, 1 reply; 26+ messages in thread From: Doug Anderson @ 2020-06-22 18:19 UTC (permalink / raw) To: Qais Yousef Cc: Quentin Perret, Benson Leung, Enric Balletbo i Serra, hsinyi, Joel Fernandes, Peter Zijlstra, Nicolas Boichat, Gwendal Grignou, ctheegal, Guenter Roeck, LKML Hi, On Fri, Jun 19, 2020 at 8:31 AM Qais Yousef <qais.yousef@arm.com> wrote: > > Hi Doug, > > On 06/18/20 14:31, Doug Anderson wrote: > > Hi, > > > > On Fri, Jun 12, 2020 at 5:34 AM Qais Yousef <qais.yousef@arm.com> wrote: > > > > > > On 06/12/20 10:24, Quentin Perret wrote: > > > > +CC Qais [FYI] > > > > > > Thanks for the CC. > > > > > > > > > > > On Thursday 11 Jun 2020 at 10:48:40 (-0700), Doug Anderson wrote: > > > > > Hrm. I guess my first instinct is to say that we still want this > > > > > patch even if we have something that is applied more globally. > > > > > > > > Fair enough. > > > > > > > > > Specifically it sounds as if the patch you point at is suggesting that > > > > > we'd tweak the boost value to something other than max but we'd still > > > > > have a boost value. In the case of cros_ec_spi I don't believe we > > > > > need any boost value at all, so my patch would still be useful. The > > > > > computational needs of cros_ec_spi are very modest and it can do its > > > > > work at lower CPU frequencies just fine. It just can't be interrupted > > > > > for large swaths of time. > > > > > > > > > > > > > > > > IOW, how do you feel about 20200511154053.7822-1-qais.yousef@arm.com ? > > > > > > > > > > I'm not totally a fan, but I'm definitely not an expert in this area > > > > > (I've also only read the patch description and not the patch or the > > > > > whole thread). I really don't want yet another value that I need to > > > > > tune from board to board. Even worse, this tuning value isn't > > > > > board-specific but a combination of board and software specific. By > > > > > this, I'd imagine a scenario where you're using a real-time task to > > > > > get audio decoding done within a certain latency. I guess you'd tune > > > > > this value to make sure that you can get all your audio decoding done > > > > > in time but also not burn extra power. Now, imagine that the OS > > > > > upgrades and the audio task suddenly has to decode more complex > > > > > streams. You've got to go across all of your boards and re-tune every > > > > > one? ...or, nobody thinks about it and older boards start getting > > > > > stuttery audio? Perhaps the opposite happens and someone comes up > > > > > with a newer lower-cpu-intensive codec and you could save power. > > > > > Sounds like a bit of a nightmare. > > > > > > Generally I would expect this global tunable to be part of a vendor's SoC BSP. > > > > I think I'm coming at this from a very different world than the one > > you're thinking of. The concept of a BSP makes me think of a SoC > > vendor providing a drop of Linux with all their own weird hacks in it > > that only ever works on that one SoC and will never, ever, ever be > > updated. 5 years down the road if a software update is needed > > (security fix?) some poor soul will be in charge of tracking down the > > exact ancient code base that was used to build the original kernel, > > making the tweak, and building a new kernel. ;-) > > > > In the world of Chrome OS we try very very hard to build everything > > from a clean code base and are trying hard to stay even closer to > > upstream and away from per-device weirdness... > > Hmm I thought OEMs who ship stuff that are based on Chrome OS would have to do > the final tuning here, which would be based on the recommendation of the SoC > vendor. And I'm being overly generic here to think not only SoC from Intel > which traditionally they have been treated in different ways. > > I think you provide a generic stack for OEMs, no? No, that's the Android way. The only way Chrome OS can ship updates for the whole fleet every ~6 weeks and support it for so many years is that all the builds and updates happen on Google servers. The only way Google can maintain this is to not have a separate kernel code base for every single variant of every single board. > Generally for RT tasks, Linux always required an admin to do the tuning. And to > provide an automatic solution that fits all is not easy to happen, because what > ultimately is important for userspace, is only known by the userspace. > > This is an interesting problem for me personally that I have been trying to > look at in my free time. On general purpose systems, it's hard to reason about > what's important because, as you say, you distribute something that should > target a wide range of platforms. And sometimes the end user (like a person > installing Ubuntu Desktop on their laptop), have no clue what a driver is even > to pick the right tuning for all RT tasks in their system. > > But this problem already exists and catching up with this will need more work > from both distros and maybe kernel. I can't see a possible situation where the > kernel can do all the tuning without userspace providing more info anyway. > > The per-device weirdness you're talking about is just how the way goes in > a world where there are many SoCs that are created to target different budgets > and use cases. I think it's sane for the OS to do very high level tuning, like "this platform has an underpowered CPU and being fast is more important than battery life, so error on the side of running fast" or "this platform has a fast SSD so tune disk algorithms appropriately". ...but picking specific values gets tricky. > So I don't see the 2 worlds (laptops/mobile) really different, they just > traditionally started differently where such variations in SoC didn't exist > that much. But there are more Arm based SoCs that are targetting laptop markets > now.. So you never know when they will be dealing with the same variations the > mobile world sees today. The slowest laptop we currently support is a Quad-core A17 running at ~1.8 GHz. On that CPU the cros_ec thread needs to be realtime (so it doesn't get interrupted) but doesn't need any particular CPU speed. That being said, that does remind me that on that laptop we did give up and end up with a hack to set a minimum CPU speed when USB audio was happening. That CPU has dwc2 for its USB controller and it's utterly important to service interrupts in < 125 us when USB audio is going on. So in that case, yes, I was forced to do tuning and pick a specific value. > > > People tend to think of the flagship SoCs which are powerful, but if you > > > consider the low and medium end devices there's a massive spectrum over there > > > that this range is trying to cover. > > > > Yeah, perhaps you're right. When thinking about a laptop it's almost > > always a fairly powerful device and things could certainly be > > different for very low end chips trying to run Linux. > > I think there are low end laptops in the market that barely have enough grunt > to do work. So not all laptops are fairly powerful. > > > > > > > > I would expect older boards init script to be separate for newer boards init > > > script. The OS by default boosts all RT tasks unless a platform specific script > > > overrides that. So I can't see how an OS upgrade would affect older boards. > > > > In the Chrome OS world I'm part of, the people supporting the boards > > are not always the people adding new whizbang features. We get new > > versions of the OS every 6 weeks and those new versions may change the > > way things work pretty significantly. We ship those new versions off > > to all boards. Certainly they undergo testing, but there are a lot of > > boards and if something is not tuned as well as it was when the device > > first shipped it's likely nobody will really notice. This is my bias > > against needing per-board tuning. > > Apologies for my lack knowledge about the exact process in Chrome OS. In my > head, I think what you do is like what Android and other distros do. Provide > a software stack that can be used on any platform. On Android OEMs do have to > do the tuning (with a help from SoC vendor usually). Not many laptops ship with > Linux as their default OS, but this is an increasing trend and LWN did write > something recently about the demand is increasing for Linux based laptops. > > > I thought Chrome OS is the same, where an OEM will take your deliverables and > do the final integration. > > So I agree at your level it might be hard to reason about the full spectrum. > But I thought the OEMs that wants to ship Chrome OS will need to look at init > scripts, drivers etc to ensure their laptop is competitive. > > > > > > > > This knob still allows you to disable the max boosting and use the per-task > > > uclamp interface to boost only those tasks you care about. AFAIK this is > > > already done in a hacky way in android devices via special vendors provisions. > > > > Yeah. I don't think I have enough skin in the game to really say one > > way or the other what the API should be so I'll probably stay off the > > other, bigger thread and let others decide what the best API should > > be. For Chrome OS I'd probably advocate just disabling the default > > boost but even there I'd be willing to defer to the folks doing the > > actual work. > > Yes I appreciate the challenges you have. I think for most users disabling is > good enough for them as RT tasks could end up causing a lot of power to be > wasted unnecessarily. But if someone wants to take the extra mile and squeeze > the best compromise, they can :) Yeah, thinking about the USB audio hack we ended up with makes me understand that sometimes on lower-end systems you are forced to do tuning even if it's annoying. -Doug ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] cros_ec_spi: Even though we're RT priority, don't bump cpu freq 2020-06-22 18:19 ` Doug Anderson @ 2020-06-23 15:56 ` Qais Yousef 0 siblings, 0 replies; 26+ messages in thread From: Qais Yousef @ 2020-06-23 15:56 UTC (permalink / raw) To: Doug Anderson Cc: Quentin Perret, Benson Leung, Enric Balletbo i Serra, hsinyi, Joel Fernandes, Peter Zijlstra, Nicolas Boichat, Gwendal Grignou, ctheegal, Guenter Roeck, LKML On 06/22/20 11:19, Doug Anderson wrote: [...] > > Hmm I thought OEMs who ship stuff that are based on Chrome OS would have to do > > the final tuning here, which would be based on the recommendation of the SoC > > vendor. And I'm being overly generic here to think not only SoC from Intel > > which traditionally they have been treated in different ways. > > > > I think you provide a generic stack for OEMs, no? > > No, that's the Android way. The only way Chrome OS can ship updates > for the whole fleet every ~6 weeks and support it for so many years is > that all the builds and updates happen on Google servers. The only > way Google can maintain this is to not have a separate kernel code > base for every single variant of every single board. I was referring to userspace tuning, which I expected to be platform specific. Assuming the devices share the same kernel. At least in the context of uclamp and RT tuning that should be possible. I appreciate that you want to ship a simple generic setup on as many devices. But there will be a trade-off to make between optimal tuning and keeping things simple and generic. The latter is perfectly fine goal to have of course. Others who want to tune more they're free to do so too as well. > > > > Generally for RT tasks, Linux always required an admin to do the tuning. And to > > provide an automatic solution that fits all is not easy to happen, because what > > ultimately is important for userspace, is only known by the userspace. > > > > This is an interesting problem for me personally that I have been trying to > > look at in my free time. On general purpose systems, it's hard to reason about > > what's important because, as you say, you distribute something that should > > target a wide range of platforms. And sometimes the end user (like a person > > installing Ubuntu Desktop on their laptop), have no clue what a driver is even > > to pick the right tuning for all RT tasks in their system. > > > > But this problem already exists and catching up with this will need more work > > from both distros and maybe kernel. I can't see a possible situation where the > > kernel can do all the tuning without userspace providing more info anyway. > > > > The per-device weirdness you're talking about is just how the way goes in > > a world where there are many SoCs that are created to target different budgets > > and use cases. > > I think it's sane for the OS to do very high level tuning, like "this > platform has an underpowered CPU and being fast is more important than > battery life, so error on the side of running fast" or "this platform > has a fast SSD so tune disk algorithms appropriately". ...but picking > specific values gets tricky. You can always use the per-task API to boost that task to maximum, if power is not your concern. From user-space ;-) If you're power conscious too, yeah there's no way but to test for the minimum value that gives you what you want. The task can try to regulate itself too if it can observe when it's not running fast enough (notices a glitch?). You can get fancy if you want, depending on your goal. It's hard to get the best though with one size fits all. HTH -- Qais Yousef ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] cros_ec_spi: Even though we're RT priority, don't bump cpu freq 2020-06-10 22:18 [PATCH] cros_ec_spi: Even though we're RT priority, don't bump cpu freq Douglas Anderson 2020-06-11 11:03 ` Quentin Perret @ 2020-06-12 12:52 ` Qais Yousef 2020-06-18 21:18 ` Doug Anderson 1 sibling, 1 reply; 26+ messages in thread From: Qais Yousef @ 2020-06-12 12:52 UTC (permalink / raw) To: Douglas Anderson Cc: Benson Leung, Enric Balletbo i Serra, hsinyi, joelaf, peterz, drinkcat, gwendal, qperret, ctheegal, Guenter Roeck, linux-kernel On 06/10/20 15:18, Douglas Anderson wrote: > The cros_ec_spi driver is realtime priority so that it doesn't get > preempted by other taks while it's talking to the EC but overall it > really doesn't need lots of compute power. Unfortunately, by default, > the kernel assumes that all realtime tasks should cause the cpufreq to > jump to max and burn through power to get things done as quickly as > possible. That's just not the correct behavior for cros_ec_spi. > > Switch to manually overriding the default. > > This won't help us if our work moves over to the SPI pump thread but > that's not the common code path. > > Signed-off-by: Douglas Anderson <dianders@chromium.org> > --- > NOTE: This would cause a conflict if the patch > https://lore.kernel.org/r/20200422112831.870192415@infradead.org lands > first > > drivers/platform/chrome/cros_ec_spi.c | 10 ++++++---- > 1 file changed, 6 insertions(+), 4 deletions(-) > > diff --git a/drivers/platform/chrome/cros_ec_spi.c b/drivers/platform/chrome/cros_ec_spi.c > index debea5c4c829..76d59d5e7efd 100644 > --- a/drivers/platform/chrome/cros_ec_spi.c > +++ b/drivers/platform/chrome/cros_ec_spi.c > @@ -709,8 +709,11 @@ static void cros_ec_spi_high_pri_release(void *worker) > static int cros_ec_spi_devm_high_pri_alloc(struct device *dev, > struct cros_ec_spi *ec_spi) > { > - struct sched_param sched_priority = { > - .sched_priority = MAX_RT_PRIO / 2, > + struct sched_attr sched_attr = { > + .sched_policy = SCHED_FIFO, > + .sched_priority = MAX_RT_PRIO / 2, > + .sched_flags = SCHED_FLAG_UTIL_CLAMP_MIN, > + .sched_util_min = 0, Hmm I thought Peter already removed all users that change RT priority directly. https://lore.kernel.org/lkml/20200422112719.826676174@infradead.org/ > }; > int err; > > @@ -728,8 +731,7 @@ static int cros_ec_spi_devm_high_pri_alloc(struct device *dev, > if (err) > return err; > > - err = sched_setscheduler_nocheck(ec_spi->high_pri_worker->task, > - SCHED_FIFO, &sched_priority); > + err = sched_setattr_nocheck(ec_spi->high_pri_worker->task, &sched_attr); > if (err) > dev_err(dev, "Can't set cros_ec high pri priority: %d\n", err); > return err; If I read the code correctly, if you fail here cros_ec_spi_probe() will fail too and the whole thing will not be loaded. If you compile without uclamp then sched_setattr_nocheck() will always fail with -EOPNOTSUPP. Why can't you manage the priority and boost value of such tasks via your init scripts instead? I have to admit I need to think about whether it makes sense to have a generic API that allows drivers to opt-out of the default boosting behavior for their RT tasks. Thanks -- Qais Yousef > -- > 2.27.0.290.gba653c62da-goog > ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] cros_ec_spi: Even though we're RT priority, don't bump cpu freq 2020-06-12 12:52 ` Qais Yousef @ 2020-06-18 21:18 ` Doug Anderson 2020-06-19 15:38 ` Qais Yousef 2020-06-24 15:56 ` Peter Zijlstra 0 siblings, 2 replies; 26+ messages in thread From: Doug Anderson @ 2020-06-18 21:18 UTC (permalink / raw) To: Qais Yousef Cc: Benson Leung, Enric Balletbo i Serra, hsinyi, Joel Fernandes, Peter Zijlstra, Nicolas Boichat, Gwendal Grignou, Quentin Perret, ctheegal, Guenter Roeck, LKML Hi, On Fri, Jun 12, 2020 at 5:52 AM Qais Yousef <qais.yousef@arm.com> wrote: > > On 06/10/20 15:18, Douglas Anderson wrote: > > The cros_ec_spi driver is realtime priority so that it doesn't get > > preempted by other taks while it's talking to the EC but overall it > > really doesn't need lots of compute power. Unfortunately, by default, > > the kernel assumes that all realtime tasks should cause the cpufreq to > > jump to max and burn through power to get things done as quickly as > > possible. That's just not the correct behavior for cros_ec_spi. > > > > Switch to manually overriding the default. > > > > This won't help us if our work moves over to the SPI pump thread but > > that's not the common code path. > > > > Signed-off-by: Douglas Anderson <dianders@chromium.org> > > --- > > NOTE: This would cause a conflict if the patch > > https://lore.kernel.org/r/20200422112831.870192415@infradead.org lands > > first > > > > drivers/platform/chrome/cros_ec_spi.c | 10 ++++++---- > > 1 file changed, 6 insertions(+), 4 deletions(-) > > > > diff --git a/drivers/platform/chrome/cros_ec_spi.c b/drivers/platform/chrome/cros_ec_spi.c > > index debea5c4c829..76d59d5e7efd 100644 > > --- a/drivers/platform/chrome/cros_ec_spi.c > > +++ b/drivers/platform/chrome/cros_ec_spi.c > > @@ -709,8 +709,11 @@ static void cros_ec_spi_high_pri_release(void *worker) > > static int cros_ec_spi_devm_high_pri_alloc(struct device *dev, > > struct cros_ec_spi *ec_spi) > > { > > - struct sched_param sched_priority = { > > - .sched_priority = MAX_RT_PRIO / 2, > > + struct sched_attr sched_attr = { > > + .sched_policy = SCHED_FIFO, > > + .sched_priority = MAX_RT_PRIO / 2, > > + .sched_flags = SCHED_FLAG_UTIL_CLAMP_MIN, > > + .sched_util_min = 0, > > Hmm I thought Peter already removed all users that change RT priority directly. > > https://lore.kernel.org/lkml/20200422112719.826676174@infradead.org/ Yeah, I mentioned that patch series "after the cut" in my patch and also made sure to CC Peter on my patch. I'm not sure what happened to that series since I don't see it in linuxnext... > > }; > > int err; > > > > @@ -728,8 +731,7 @@ static int cros_ec_spi_devm_high_pri_alloc(struct device *dev, > > if (err) > > return err; > > > > - err = sched_setscheduler_nocheck(ec_spi->high_pri_worker->task, > > - SCHED_FIFO, &sched_priority); > > + err = sched_setattr_nocheck(ec_spi->high_pri_worker->task, &sched_attr); > > if (err) > > dev_err(dev, "Can't set cros_ec high pri priority: %d\n", err); > > return err; > > If I read the code correctly, if you fail here cros_ec_spi_probe() will fail > too and the whole thing will not be loaded. If you compile without uclamp then > sched_setattr_nocheck() will always fail with -EOPNOTSUPP. Oops, definitely need to send out a v2 for that if nothing else. Is there any good way for me to code this up or do I need a big #ifdef somewhere in my code? > Why can't you manage the priority and boost value of such tasks via your init > scripts instead? I guess I feel like it's weird in this case. Userspace isn't creating this task and isn't the one marking it as realtime. ...and, if we ever manage to upgrade the protocol which we use to talk to the EC we might fully get rid of this task the need to have something boosted up to realtime. Said another way: the fact that we even have this task at all and the fact that it's realtime is an implementation detail of the kernel. It seems really weird to add initscripts for it. > I have to admit I need to think about whether it makes sense to have a generic > API that allows drivers to opt-out of the default boosting behavior for their > RT tasks. Seems like it would be useful. -Doug ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] cros_ec_spi: Even though we're RT priority, don't bump cpu freq 2020-06-18 21:18 ` Doug Anderson @ 2020-06-19 15:38 ` Qais Yousef 2020-06-22 18:21 ` Doug Anderson 2020-06-24 15:56 ` Peter Zijlstra 1 sibling, 1 reply; 26+ messages in thread From: Qais Yousef @ 2020-06-19 15:38 UTC (permalink / raw) To: Doug Anderson Cc: Benson Leung, Enric Balletbo i Serra, hsinyi, Joel Fernandes, Peter Zijlstra, Nicolas Boichat, Gwendal Grignou, Quentin Perret, ctheegal, Guenter Roeck, LKML On 06/18/20 14:18, Doug Anderson wrote: > Hi, > > On Fri, Jun 12, 2020 at 5:52 AM Qais Yousef <qais.yousef@arm.com> wrote: > > > > On 06/10/20 15:18, Douglas Anderson wrote: > > > The cros_ec_spi driver is realtime priority so that it doesn't get > > > preempted by other taks while it's talking to the EC but overall it > > > really doesn't need lots of compute power. Unfortunately, by default, > > > the kernel assumes that all realtime tasks should cause the cpufreq to > > > jump to max and burn through power to get things done as quickly as > > > possible. That's just not the correct behavior for cros_ec_spi. > > > > > > Switch to manually overriding the default. > > > > > > This won't help us if our work moves over to the SPI pump thread but > > > that's not the common code path. > > > > > > Signed-off-by: Douglas Anderson <dianders@chromium.org> > > > --- > > > NOTE: This would cause a conflict if the patch > > > https://lore.kernel.org/r/20200422112831.870192415@infradead.org lands > > > first > > > > > > drivers/platform/chrome/cros_ec_spi.c | 10 ++++++---- > > > 1 file changed, 6 insertions(+), 4 deletions(-) > > > > > > diff --git a/drivers/platform/chrome/cros_ec_spi.c b/drivers/platform/chrome/cros_ec_spi.c > > > index debea5c4c829..76d59d5e7efd 100644 > > > --- a/drivers/platform/chrome/cros_ec_spi.c > > > +++ b/drivers/platform/chrome/cros_ec_spi.c > > > @@ -709,8 +709,11 @@ static void cros_ec_spi_high_pri_release(void *worker) > > > static int cros_ec_spi_devm_high_pri_alloc(struct device *dev, > > > struct cros_ec_spi *ec_spi) > > > { > > > - struct sched_param sched_priority = { > > > - .sched_priority = MAX_RT_PRIO / 2, > > > + struct sched_attr sched_attr = { > > > + .sched_policy = SCHED_FIFO, > > > + .sched_priority = MAX_RT_PRIO / 2, > > > + .sched_flags = SCHED_FLAG_UTIL_CLAMP_MIN, > > > + .sched_util_min = 0, > > > > Hmm I thought Peter already removed all users that change RT priority directly. > > > > https://lore.kernel.org/lkml/20200422112719.826676174@infradead.org/ > > Yeah, I mentioned that patch series "after the cut" in my patch and > also made sure to CC Peter on my patch. I'm not sure what happened to > that series since I don't see it in linuxnext... Apologies I missed that. > > > > > }; > > > int err; > > > > > > @@ -728,8 +731,7 @@ static int cros_ec_spi_devm_high_pri_alloc(struct device *dev, > > > if (err) > > > return err; > > > > > > - err = sched_setscheduler_nocheck(ec_spi->high_pri_worker->task, > > > - SCHED_FIFO, &sched_priority); > > > + err = sched_setattr_nocheck(ec_spi->high_pri_worker->task, &sched_attr); > > > if (err) > > > dev_err(dev, "Can't set cros_ec high pri priority: %d\n", err); > > > return err; > > > > If I read the code correctly, if you fail here cros_ec_spi_probe() will fail > > too and the whole thing will not be loaded. If you compile without uclamp then > > sched_setattr_nocheck() will always fail with -EOPNOTSUPP. > > Oops, definitely need to send out a v2 for that if nothing else. Is > there any good way for me to code this up or do I need a big #ifdef > somewhere in my code? A big #ifdef. But this kind of use I don't think was anticipated. And generally if we want to allow that, it has to be done via a proper API. Drivers picking random uclamp values is as bad as them picking random RT priority. > > > > Why can't you manage the priority and boost value of such tasks via your init > > scripts instead? > > I guess I feel like it's weird in this case. Userspace isn't creating > this task and isn't the one marking it as realtime. ...and, if we > ever manage to upgrade the protocol which we use to talk to the EC we > might fully get rid of this task the need to have something boosted up > to realtime. > > Said another way: the fact that we even have this task at all and the > fact that it's realtime is an implementation detail of the kernel. It > seems really weird to add initscripts for it. Yes this is the problem of RT for a general purpose systems. It's hard to reason about their priorities/importance since it's not a special purpose system with well defined spec of what hardware/software will be running on it and their precise requirements is not known before hand. > > > > I have to admit I need to think about whether it makes sense to have a generic > > API that allows drivers to opt-out of the default boosting behavior for their > > RT tasks. > > Seems like it would be useful. If you propose something that will help the discussion. I think based on the same approach Peter has taken to prevent random RT priorities. In uclamp case I think we just want to allow driver to opt RT tasks out of the default boosting behavior. I'm a bit wary that this extra layer of tuning might create a confusion, but I can't reason about why is it bad for a driver to say I don't want my RT task to be boosted too. Thanks -- Qais Yousef ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] cros_ec_spi: Even though we're RT priority, don't bump cpu freq 2020-06-19 15:38 ` Qais Yousef @ 2020-06-22 18:21 ` Doug Anderson 2020-06-23 16:40 ` Qais Yousef 0 siblings, 1 reply; 26+ messages in thread From: Doug Anderson @ 2020-06-22 18:21 UTC (permalink / raw) To: Qais Yousef Cc: Benson Leung, Enric Balletbo i Serra, hsinyi, Joel Fernandes, Peter Zijlstra, Nicolas Boichat, Gwendal Grignou, Quentin Perret, ctheegal, Guenter Roeck, LKML Hi, On Fri, Jun 19, 2020 at 8:38 AM Qais Yousef <qais.yousef@arm.com> wrote: > > On 06/18/20 14:18, Doug Anderson wrote: > > Hi, > > > > On Fri, Jun 12, 2020 at 5:52 AM Qais Yousef <qais.yousef@arm.com> wrote: > > > > > > On 06/10/20 15:18, Douglas Anderson wrote: > > > > The cros_ec_spi driver is realtime priority so that it doesn't get > > > > preempted by other taks while it's talking to the EC but overall it > > > > really doesn't need lots of compute power. Unfortunately, by default, > > > > the kernel assumes that all realtime tasks should cause the cpufreq to > > > > jump to max and burn through power to get things done as quickly as > > > > possible. That's just not the correct behavior for cros_ec_spi. > > > > > > > > Switch to manually overriding the default. > > > > > > > > This won't help us if our work moves over to the SPI pump thread but > > > > that's not the common code path. > > > > > > > > Signed-off-by: Douglas Anderson <dianders@chromium.org> > > > > --- > > > > NOTE: This would cause a conflict if the patch > > > > https://lore.kernel.org/r/20200422112831.870192415@infradead.org lands > > > > first > > > > > > > > drivers/platform/chrome/cros_ec_spi.c | 10 ++++++---- > > > > 1 file changed, 6 insertions(+), 4 deletions(-) > > > > > > > > diff --git a/drivers/platform/chrome/cros_ec_spi.c b/drivers/platform/chrome/cros_ec_spi.c > > > > index debea5c4c829..76d59d5e7efd 100644 > > > > --- a/drivers/platform/chrome/cros_ec_spi.c > > > > +++ b/drivers/platform/chrome/cros_ec_spi.c > > > > @@ -709,8 +709,11 @@ static void cros_ec_spi_high_pri_release(void *worker) > > > > static int cros_ec_spi_devm_high_pri_alloc(struct device *dev, > > > > struct cros_ec_spi *ec_spi) > > > > { > > > > - struct sched_param sched_priority = { > > > > - .sched_priority = MAX_RT_PRIO / 2, > > > > + struct sched_attr sched_attr = { > > > > + .sched_policy = SCHED_FIFO, > > > > + .sched_priority = MAX_RT_PRIO / 2, > > > > + .sched_flags = SCHED_FLAG_UTIL_CLAMP_MIN, > > > > + .sched_util_min = 0, > > > > > > Hmm I thought Peter already removed all users that change RT priority directly. > > > > > > https://lore.kernel.org/lkml/20200422112719.826676174@infradead.org/ > > > > Yeah, I mentioned that patch series "after the cut" in my patch and > > also made sure to CC Peter on my patch. I'm not sure what happened to > > that series since I don't see it in linuxnext... > > Apologies I missed that. > > > > > > > > > }; > > > > int err; > > > > > > > > @@ -728,8 +731,7 @@ static int cros_ec_spi_devm_high_pri_alloc(struct device *dev, > > > > if (err) > > > > return err; > > > > > > > > - err = sched_setscheduler_nocheck(ec_spi->high_pri_worker->task, > > > > - SCHED_FIFO, &sched_priority); > > > > + err = sched_setattr_nocheck(ec_spi->high_pri_worker->task, &sched_attr); > > > > if (err) > > > > dev_err(dev, "Can't set cros_ec high pri priority: %d\n", err); > > > > return err; > > > > > > If I read the code correctly, if you fail here cros_ec_spi_probe() will fail > > > too and the whole thing will not be loaded. If you compile without uclamp then > > > sched_setattr_nocheck() will always fail with -EOPNOTSUPP. > > > > Oops, definitely need to send out a v2 for that if nothing else. Is > > there any good way for me to code this up or do I need a big #ifdef > > somewhere in my code? > > A big #ifdef. But this kind of use I don't think was anticipated. And generally > if we want to allow that, it has to be done via a proper API. Drivers picking > random uclamp values is as bad as them picking random RT priority. > > > > > > > > Why can't you manage the priority and boost value of such tasks via your init > > > scripts instead? > > > > I guess I feel like it's weird in this case. Userspace isn't creating > > this task and isn't the one marking it as realtime. ...and, if we > > ever manage to upgrade the protocol which we use to talk to the EC we > > might fully get rid of this task the need to have something boosted up > > to realtime. > > > > Said another way: the fact that we even have this task at all and the > > fact that it's realtime is an implementation detail of the kernel. It > > seems really weird to add initscripts for it. > > Yes this is the problem of RT for a general purpose systems. It's hard to > reason about their priorities/importance since it's not a special purpose > system with well defined spec of what hardware/software will be running on it > and their precise requirements is not known before hand. > > > > > > > > I have to admit I need to think about whether it makes sense to have a generic > > > API that allows drivers to opt-out of the default boosting behavior for their > > > RT tasks. > > > > Seems like it would be useful. > > If you propose something that will help the discussion. I think based on the > same approach Peter has taken to prevent random RT priorities. In uclamp case > I think we just want to allow driver to opt RT tasks out of the default > boosting behavior. > > I'm a bit wary that this extra layer of tuning might create a confusion, but > I can't reason about why is it bad for a driver to say I don't want my RT task > to be boosted too. Right. I was basically just trying to say "turn my boosting off". ...so I guess you're saying that doing a v2 of my patch with the proper #ifdef protection wouldn't be a good way to go and I'd need to propose some sort of API for this? -Doug ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] cros_ec_spi: Even though we're RT priority, don't bump cpu freq 2020-06-22 18:21 ` Doug Anderson @ 2020-06-23 16:40 ` Qais Yousef 2020-06-24 15:49 ` Joel Fernandes 2020-06-24 16:01 ` Peter Zijlstra 0 siblings, 2 replies; 26+ messages in thread From: Qais Yousef @ 2020-06-23 16:40 UTC (permalink / raw) To: Doug Anderson, Peter Zijlstra Cc: Benson Leung, Enric Balletbo i Serra, hsinyi, Joel Fernandes, Nicolas Boichat, Gwendal Grignou, Quentin Perret, ctheegal, Guenter Roeck, LKML On 06/22/20 11:21, Doug Anderson wrote: [...] > > If you propose something that will help the discussion. I think based on the > > same approach Peter has taken to prevent random RT priorities. In uclamp case > > I think we just want to allow driver to opt RT tasks out of the default > > boosting behavior. > > > > I'm a bit wary that this extra layer of tuning might create a confusion, but > > I can't reason about why is it bad for a driver to say I don't want my RT task > > to be boosted too. > > Right. I was basically just trying to say "turn my boosting off". > > ...so I guess you're saying that doing a v2 of my patch with the > proper #ifdef protection wouldn't be a good way to go and I'd need to > propose some sort of API for this? It's up to Peter really. It concerns me in general to start having in-kernel users of uclamp that might end up setting random values (like we ended having random RT priorities), that really don't mean a lot outside the context of the specific system it was tested on. Given the kernel could run anywhere, it's hard to rationalize what's okay or not. Opting out of default RT boost for a specific task in the kernel, could make sense though it still concerns me for the same reasons. Is this okay for all possible systems this can run on? It feels better for userspace to turn RT boosting off for all tasks if you know your system is powerful, or use the per task API to switch off boosting for the tasks you know they don't need it. But if we want to allow in-kernel users, IMO it needs to be done in a controlled way, in a similar manner Peter changed how RT priority can be set in the kernel. It would be good hear what Peter thinks. Thanks -- Qais Yousef ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] cros_ec_spi: Even though we're RT priority, don't bump cpu freq 2020-06-23 16:40 ` Qais Yousef @ 2020-06-24 15:49 ` Joel Fernandes 2020-06-24 16:55 ` Qais Yousef 2020-06-24 16:01 ` Peter Zijlstra 1 sibling, 1 reply; 26+ messages in thread From: Joel Fernandes @ 2020-06-24 15:49 UTC (permalink / raw) To: Qais Yousef Cc: Doug Anderson, Peter Zijlstra, Benson Leung, Enric Balletbo i Serra, hsinyi, Nicolas Boichat, Gwendal Grignou, Quentin Perret, ctheegal, Guenter Roeck, LKML On Tue, Jun 23, 2020 at 12:40 PM Qais Yousef <qais.yousef@arm.com> wrote: > > On 06/22/20 11:21, Doug Anderson wrote: > > [...] > > > > If you propose something that will help the discussion. I think based on the > > > same approach Peter has taken to prevent random RT priorities. In uclamp case > > > I think we just want to allow driver to opt RT tasks out of the default > > > boosting behavior. > > > > > > I'm a bit wary that this extra layer of tuning might create a confusion, but > > > I can't reason about why is it bad for a driver to say I don't want my RT task > > > to be boosted too. > > > > Right. I was basically just trying to say "turn my boosting off". > > > > ...so I guess you're saying that doing a v2 of my patch with the > > proper #ifdef protection wouldn't be a good way to go and I'd need to > > propose some sort of API for this? > > It's up to Peter really. > > It concerns me in general to start having in-kernel users of uclamp that might > end up setting random values (like we ended having random RT priorities), that > really don't mean a lot outside the context of the specific system it was > tested on. Given the kernel could run anywhere, it's hard to rationalize what's > okay or not. > > Opting out of default RT boost for a specific task in the kernel, could make > sense though it still concerns me for the same reasons. Is this okay for all > possible systems this can run on? > > It feels better for userspace to turn RT boosting off for all tasks if you know > your system is powerful, or use the per task API to switch off boosting for the > tasks you know they don't need it. > > But if we want to allow in-kernel users, IMO it needs to be done in > a controlled way, in a similar manner Peter changed how RT priority can be set > in the kernel. > > It would be good hear what Peter thinks. It seems a bit of a hack, but really the commit message says the driver is not expected to take a lot of CPU capacity so it should be expected to work across platforms. It is likely to behave better than how it behaves now. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] cros_ec_spi: Even though we're RT priority, don't bump cpu freq 2020-06-24 15:49 ` Joel Fernandes @ 2020-06-24 16:55 ` Qais Yousef 2020-06-24 17:35 ` Joel Fernandes 0 siblings, 1 reply; 26+ messages in thread From: Qais Yousef @ 2020-06-24 16:55 UTC (permalink / raw) To: Joel Fernandes Cc: Doug Anderson, Peter Zijlstra, Benson Leung, Enric Balletbo i Serra, hsinyi, Nicolas Boichat, Gwendal Grignou, Quentin Perret, ctheegal, Guenter Roeck, LKML On 06/24/20 11:49, Joel Fernandes wrote: > On Tue, Jun 23, 2020 at 12:40 PM Qais Yousef <qais.yousef@arm.com> wrote: > > > > On 06/22/20 11:21, Doug Anderson wrote: > > > > [...] > > > > > > If you propose something that will help the discussion. I think based on the > > > > same approach Peter has taken to prevent random RT priorities. In uclamp case > > > > I think we just want to allow driver to opt RT tasks out of the default > > > > boosting behavior. > > > > > > > > I'm a bit wary that this extra layer of tuning might create a confusion, but > > > > I can't reason about why is it bad for a driver to say I don't want my RT task > > > > to be boosted too. > > > > > > Right. I was basically just trying to say "turn my boosting off". > > > > > > ...so I guess you're saying that doing a v2 of my patch with the > > > proper #ifdef protection wouldn't be a good way to go and I'd need to > > > propose some sort of API for this? > > > > It's up to Peter really. > > > > It concerns me in general to start having in-kernel users of uclamp that might > > end up setting random values (like we ended having random RT priorities), that > > really don't mean a lot outside the context of the specific system it was > > tested on. Given the kernel could run anywhere, it's hard to rationalize what's > > okay or not. > > > > Opting out of default RT boost for a specific task in the kernel, could make > > sense though it still concerns me for the same reasons. Is this okay for all > > possible systems this can run on? > > > > It feels better for userspace to turn RT boosting off for all tasks if you know > > your system is powerful, or use the per task API to switch off boosting for the > > tasks you know they don't need it. > > > > But if we want to allow in-kernel users, IMO it needs to be done in > > a controlled way, in a similar manner Peter changed how RT priority can be set > > in the kernel. > > > > It would be good hear what Peter thinks. > > It seems a bit of a hack, but really the commit message says the Which part is the hack, the userspace control? It is how Linux expects things to work AFAIU. But I do agree there's a hole for general purpose userspace that wants to run and manage a diverse range of hardware. I still think it's the job of device manufacturers/system integrator. But not many ship Linux by default. Though I thought ChromeOS is the exception here. > driver is not expected to take a lot of CPU capacity so it should be > expected to work across platforms. It is likely to behave better than > how it behaves now. Doing the in-kernel opt-out via API should be fine, I think. But this will need to be discussed in the wider circle. It will already clash with this for example https://lore.kernel.org/lkml/20200619172011.5810-1-qais.yousef@arm.com/ Thanks -- Qais Yousef ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] cros_ec_spi: Even though we're RT priority, don't bump cpu freq 2020-06-24 16:55 ` Qais Yousef @ 2020-06-24 17:35 ` Joel Fernandes 2020-06-24 17:52 ` Qais Yousef 0 siblings, 1 reply; 26+ messages in thread From: Joel Fernandes @ 2020-06-24 17:35 UTC (permalink / raw) To: Qais Yousef Cc: Doug Anderson, Peter Zijlstra, Benson Leung, Enric Balletbo i Serra, hsinyi, Nicolas Boichat, Gwendal Grignou, Quentin Perret, ctheegal, Guenter Roeck, LKML On Wed, Jun 24, 2020 at 12:55 PM Qais Yousef <qais.yousef@arm.com> wrote: > > On 06/24/20 11:49, Joel Fernandes wrote: > > On Tue, Jun 23, 2020 at 12:40 PM Qais Yousef <qais.yousef@arm.com> wrote: > > > > > > On 06/22/20 11:21, Doug Anderson wrote: > > > > > > [...] > > > > > > > > If you propose something that will help the discussion. I think based on the > > > > > same approach Peter has taken to prevent random RT priorities. In uclamp case > > > > > I think we just want to allow driver to opt RT tasks out of the default > > > > > boosting behavior. > > > > > > > > > > I'm a bit wary that this extra layer of tuning might create a confusion, but > > > > > I can't reason about why is it bad for a driver to say I don't want my RT task > > > > > to be boosted too. > > > > > > > > Right. I was basically just trying to say "turn my boosting off". > > > > > > > > ...so I guess you're saying that doing a v2 of my patch with the > > > > proper #ifdef protection wouldn't be a good way to go and I'd need to > > > > propose some sort of API for this? > > > > > > It's up to Peter really. > > > > > > It concerns me in general to start having in-kernel users of uclamp that might > > > end up setting random values (like we ended having random RT priorities), that > > > really don't mean a lot outside the context of the specific system it was > > > tested on. Given the kernel could run anywhere, it's hard to rationalize what's > > > okay or not. > > > > > > Opting out of default RT boost for a specific task in the kernel, could make > > > sense though it still concerns me for the same reasons. Is this okay for all > > > possible systems this can run on? > > > > > > It feels better for userspace to turn RT boosting off for all tasks if you know > > > your system is powerful, or use the per task API to switch off boosting for the > > > tasks you know they don't need it. > > > > > > But if we want to allow in-kernel users, IMO it needs to be done in > > > a controlled way, in a similar manner Peter changed how RT priority can be set > > > in the kernel. > > > > > > It would be good hear what Peter thinks. > > > > It seems a bit of a hack, but really the commit message says the > > Which part is the hack, the userspace control? It is how Linux expects things > to work AFAIU. But I do agree there's a hole for general purpose userspace that > wants to run and manage a diverse range of hardware. I meant to say, this patch is a necessary hack of sorts. > > driver is not expected to take a lot of CPU capacity so it should be > > expected to work across platforms. It is likely to behave better than > > how it behaves now. > > Doing the in-kernel opt-out via API should be fine, I think. But this will > need to be discussed in the wider circle. It will already clash with this for > example > > https://lore.kernel.org/lkml/20200619172011.5810-1-qais.yousef@arm.com/ Have not yet looked closer at that patch, but are you saying this patch clashes with that work? Sorry I am operating on 2 hours of sleep here. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] cros_ec_spi: Even though we're RT priority, don't bump cpu freq 2020-06-24 17:35 ` Joel Fernandes @ 2020-06-24 17:52 ` Qais Yousef 2020-06-24 17:55 ` Joel Fernandes 0 siblings, 1 reply; 26+ messages in thread From: Qais Yousef @ 2020-06-24 17:52 UTC (permalink / raw) To: Joel Fernandes Cc: Doug Anderson, Peter Zijlstra, Benson Leung, Enric Balletbo i Serra, hsinyi, Nicolas Boichat, Gwendal Grignou, Quentin Perret, ctheegal, Guenter Roeck, LKML On 06/24/20 13:35, Joel Fernandes wrote: [...] > > Doing the in-kernel opt-out via API should be fine, I think. But this will > > need to be discussed in the wider circle. It will already clash with this for > > example > > > > https://lore.kernel.org/lkml/20200619172011.5810-1-qais.yousef@arm.com/ > > Have not yet looked closer at that patch, but are you saying this > patch clashes with that work? Sorry I am operating on 2 hours of sleep > here. The series is an optimization to remove the uclamp overhead from the scheduler fastpath until the userspace uses it. It introduces a static key that is disabled by default and will cause uclamp logic not to execute in the fast path. Once the userspace starts using util clamp, which we detect by either 1. Changing uclamp value of a task with sched_setattr() 2. Modifying the default sysctl_sched_util_clamp_{min, max} 3. Modifying the default cpu.uclamp.{min, max} value in cgroup If we start having in-kernel users changing uclamp value this means drivers will cause the system to opt-in into uclamp automatically even if the userspace doesn't actually use it. I think we can solve this by providing a special API to opt-out safely. Which is the right thing to do anyway even if we didn't have this clash. Hope this makes sense. Cheers -- Qais Yousef ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] cros_ec_spi: Even though we're RT priority, don't bump cpu freq 2020-06-24 17:52 ` Qais Yousef @ 2020-06-24 17:55 ` Joel Fernandes 2020-06-24 18:29 ` Doug Anderson 0 siblings, 1 reply; 26+ messages in thread From: Joel Fernandes @ 2020-06-24 17:55 UTC (permalink / raw) To: Qais Yousef Cc: Doug Anderson, Peter Zijlstra, Benson Leung, Enric Balletbo i Serra, hsinyi, Nicolas Boichat, Gwendal Grignou, Quentin Perret, ctheegal, Guenter Roeck, LKML On Wed, Jun 24, 2020 at 1:52 PM Qais Yousef <qais.yousef@arm.com> wrote: > > On 06/24/20 13:35, Joel Fernandes wrote: > > [...] > > > > Doing the in-kernel opt-out via API should be fine, I think. But this will > > > need to be discussed in the wider circle. It will already clash with this for > > > example > > > > > > https://lore.kernel.org/lkml/20200619172011.5810-1-qais.yousef@arm.com/ > > > > Have not yet looked closer at that patch, but are you saying this > > patch clashes with that work? Sorry I am operating on 2 hours of sleep > > here. > > The series is an optimization to remove the uclamp overhead from the scheduler > fastpath until the userspace uses it. It introduces a static key that is > disabled by default and will cause uclamp logic not to execute in the fast > path. Once the userspace starts using util clamp, which we detect by either > > 1. Changing uclamp value of a task with sched_setattr() > 2. Modifying the default sysctl_sched_util_clamp_{min, max} > 3. Modifying the default cpu.uclamp.{min, max} value in cgroup > > If we start having in-kernel users changing uclamp value this means drivers > will cause the system to opt-in into uclamp automatically even if the > userspace doesn't actually use it. > > I think we can solve this by providing a special API to opt-out safely. Which > is the right thing to do anyway even if we didn't have this clash. Makes sense, thanks. - Joel ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] cros_ec_spi: Even though we're RT priority, don't bump cpu freq 2020-06-24 17:55 ` Joel Fernandes @ 2020-06-24 18:29 ` Doug Anderson 2020-06-25 14:20 ` Qais Yousef 0 siblings, 1 reply; 26+ messages in thread From: Doug Anderson @ 2020-06-24 18:29 UTC (permalink / raw) To: Joel Fernandes Cc: Qais Yousef, Peter Zijlstra, Benson Leung, Enric Balletbo i Serra, hsinyi, Nicolas Boichat, Gwendal Grignou, Quentin Perret, ctheegal, Guenter Roeck, LKML Hi, On Wed, Jun 24, 2020 at 10:55 AM Joel Fernandes <joelaf@google.com> wrote: > > On Wed, Jun 24, 2020 at 1:52 PM Qais Yousef <qais.yousef@arm.com> wrote: > > > > On 06/24/20 13:35, Joel Fernandes wrote: > > > > [...] > > > > > > Doing the in-kernel opt-out via API should be fine, I think. But this will > > > > need to be discussed in the wider circle. It will already clash with this for > > > > example > > > > > > > > https://lore.kernel.org/lkml/20200619172011.5810-1-qais.yousef@arm.com/ > > > > > > Have not yet looked closer at that patch, but are you saying this > > > patch clashes with that work? Sorry I am operating on 2 hours of sleep > > > here. > > > > The series is an optimization to remove the uclamp overhead from the scheduler > > fastpath until the userspace uses it. It introduces a static key that is > > disabled by default and will cause uclamp logic not to execute in the fast > > path. Once the userspace starts using util clamp, which we detect by either > > > > 1. Changing uclamp value of a task with sched_setattr() > > 2. Modifying the default sysctl_sched_util_clamp_{min, max} > > 3. Modifying the default cpu.uclamp.{min, max} value in cgroup > > > > If we start having in-kernel users changing uclamp value this means drivers > > will cause the system to opt-in into uclamp automatically even if the > > userspace doesn't actually use it. > > > > I think we can solve this by providing a special API to opt-out safely. Which > > is the right thing to do anyway even if we didn't have this clash. > > Makes sense, thanks. OK, so I think the summary is: 1. There are enough external dependencies that are currently in the works that it makes sense for those to land first without trying to cram my patch to cros_ec in. 2. Maybe, as part of the work that's already going on, someone will add an API that I can use. If so then I can write my patch once that lands. 3. If nobody adds an API then I could look at adding the API myself once everything else is settled. 4. It looks as if the patch you mentioned originally [1] that allows userspace to just fully disable uclamp for RT tasks will land eventually (if we're stuck for a short term solution we can pick the existing patch). I believe Chrome OS will use that to just fully disable the default boosting for RT tasks across our system and (if needed) add boosts on a case-by-case basis. Once we do that, the incentive to land a patch for cros_ec will be mostly gone and probably we could consider my patch abandoned. While it would technically still be sane to land it won't have any real-world benefit. [1] https://lore.kernel.org/r/20200511154053.7822-1-qais.yousef@arm.com ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] cros_ec_spi: Even though we're RT priority, don't bump cpu freq 2020-06-24 18:29 ` Doug Anderson @ 2020-06-25 14:20 ` Qais Yousef 0 siblings, 0 replies; 26+ messages in thread From: Qais Yousef @ 2020-06-25 14:20 UTC (permalink / raw) To: Doug Anderson Cc: Joel Fernandes, Peter Zijlstra, Benson Leung, Enric Balletbo i Serra, hsinyi, Nicolas Boichat, Gwendal Grignou, Quentin Perret, ctheegal, Guenter Roeck, LKML On 06/24/20 11:29, Doug Anderson wrote: > Hi, > > On Wed, Jun 24, 2020 at 10:55 AM Joel Fernandes <joelaf@google.com> wrote: > > > > On Wed, Jun 24, 2020 at 1:52 PM Qais Yousef <qais.yousef@arm.com> wrote: > > > > > > On 06/24/20 13:35, Joel Fernandes wrote: > > > > > > [...] > > > > > > > > Doing the in-kernel opt-out via API should be fine, I think. But this will > > > > > need to be discussed in the wider circle. It will already clash with this for > > > > > example > > > > > > > > > > https://lore.kernel.org/lkml/20200619172011.5810-1-qais.yousef@arm.com/ > > > > > > > > Have not yet looked closer at that patch, but are you saying this > > > > patch clashes with that work? Sorry I am operating on 2 hours of sleep > > > > here. > > > > > > The series is an optimization to remove the uclamp overhead from the scheduler > > > fastpath until the userspace uses it. It introduces a static key that is > > > disabled by default and will cause uclamp logic not to execute in the fast > > > path. Once the userspace starts using util clamp, which we detect by either > > > > > > 1. Changing uclamp value of a task with sched_setattr() > > > 2. Modifying the default sysctl_sched_util_clamp_{min, max} > > > 3. Modifying the default cpu.uclamp.{min, max} value in cgroup > > > > > > If we start having in-kernel users changing uclamp value this means drivers > > > will cause the system to opt-in into uclamp automatically even if the > > > userspace doesn't actually use it. > > > > > > I think we can solve this by providing a special API to opt-out safely. Which > > > is the right thing to do anyway even if we didn't have this clash. > > > > Makes sense, thanks. > > OK, so I think the summary is: > > 1. There are enough external dependencies that are currently in the > works that it makes sense for those to land first without trying to > cram my patch to cros_ec in. +1 > > 2. Maybe, as part of the work that's already going on, someone will > add an API that I can use. If so then I can write my patch once that > lands. I won't be adding this API. Mainly because I can't argue for it personally as I'm still not convinced it's a valid way of handling RT default boosting behavior from within the kernel. But it's a valid discussion to have if you want to drive it. > > 3. If nobody adds an API then I could look at adding the API myself > once everything else is settled. > > 4. It looks as if the patch you mentioned originally [1] that allows > userspace to just fully disable uclamp for RT tasks will land > eventually (if we're stuck for a short term solution we can pick the > existing patch). I believe Chrome OS will use that to just fully > disable the default boosting for RT tasks across our system and (if > needed) add boosts on a case-by-case basis. Once we do that, the > incentive to land a patch for cros_ec will be mostly gone and probably > we could consider my patch abandoned. While it would technically > still be sane to land it won't have any real-world benefit. I think this is the best way forward. I'm interesting in hearing what difficulties you encounter while doing this work. Regarding the patch [1], I need to tweak the way it is implemented and send v6, but there are no objection to the idea and interface, so hopefully once I send v6 it'd be accepted. Thanks -- Qais Yousef > [1] https://lore.kernel.org/r/20200511154053.7822-1-qais.yousef@arm.com ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] cros_ec_spi: Even though we're RT priority, don't bump cpu freq 2020-06-23 16:40 ` Qais Yousef 2020-06-24 15:49 ` Joel Fernandes @ 2020-06-24 16:01 ` Peter Zijlstra 2020-06-24 17:14 ` Qais Yousef 1 sibling, 1 reply; 26+ messages in thread From: Peter Zijlstra @ 2020-06-24 16:01 UTC (permalink / raw) To: Qais Yousef Cc: Doug Anderson, Benson Leung, Enric Balletbo i Serra, hsinyi, Joel Fernandes, Nicolas Boichat, Gwendal Grignou, Quentin Perret, ctheegal, Guenter Roeck, LKML On Tue, Jun 23, 2020 at 05:40:21PM +0100, Qais Yousef wrote: > On 06/22/20 11:21, Doug Anderson wrote: > > [...] > > > > If you propose something that will help the discussion. I think based on the > > > same approach Peter has taken to prevent random RT priorities. In uclamp case > > > I think we just want to allow driver to opt RT tasks out of the default > > > boosting behavior. > > > > > > I'm a bit wary that this extra layer of tuning might create a confusion, but > > > I can't reason about why is it bad for a driver to say I don't want my RT task > > > to be boosted too. > > > > Right. I was basically just trying to say "turn my boosting off". > > > > ...so I guess you're saying that doing a v2 of my patch with the > > proper #ifdef protection wouldn't be a good way to go and I'd need to > > propose some sort of API for this? > > It's up to Peter really. > > It concerns me in general to start having in-kernel users of uclamp that might > end up setting random values (like we ended having random RT priorities), that > really don't mean a lot outside the context of the specific system it was > tested on. Given the kernel could run anywhere, it's hard to rationalize what's > okay or not. > > Opting out of default RT boost for a specific task in the kernel, could make > sense though it still concerns me for the same reasons. Is this okay for all > possible systems this can run on? > > It feels better for userspace to turn RT boosting off for all tasks if you know > your system is powerful, or use the per task API to switch off boosting for the > tasks you know they don't need it. > > But if we want to allow in-kernel users, IMO it needs to be done in > a controlled way, in a similar manner Peter changed how RT priority can be set > in the kernel. > > It would be good hear what Peter thinks. Hurmph.. I think I understand the problem, but I'm not sure what to do about it :-( Esp. given the uclamp optimization patches now under consideration. That is, if random drivers are going to install uclamps, then that will automagically enable the static_key and make the scheduler slower, even if that driver isn't particularly interesting to the user. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] cros_ec_spi: Even though we're RT priority, don't bump cpu freq 2020-06-24 16:01 ` Peter Zijlstra @ 2020-06-24 17:14 ` Qais Yousef 0 siblings, 0 replies; 26+ messages in thread From: Qais Yousef @ 2020-06-24 17:14 UTC (permalink / raw) To: Peter Zijlstra Cc: Doug Anderson, Benson Leung, Enric Balletbo i Serra, hsinyi, Joel Fernandes, Nicolas Boichat, Gwendal Grignou, Quentin Perret, ctheegal, Guenter Roeck, LKML On 06/24/20 18:01, Peter Zijlstra wrote: > On Tue, Jun 23, 2020 at 05:40:21PM +0100, Qais Yousef wrote: > > On 06/22/20 11:21, Doug Anderson wrote: > > > > [...] > > > > > > If you propose something that will help the discussion. I think based on the > > > > same approach Peter has taken to prevent random RT priorities. In uclamp case > > > > I think we just want to allow driver to opt RT tasks out of the default > > > > boosting behavior. > > > > > > > > I'm a bit wary that this extra layer of tuning might create a confusion, but > > > > I can't reason about why is it bad for a driver to say I don't want my RT task > > > > to be boosted too. > > > > > > Right. I was basically just trying to say "turn my boosting off". > > > > > > ...so I guess you're saying that doing a v2 of my patch with the > > > proper #ifdef protection wouldn't be a good way to go and I'd need to > > > propose some sort of API for this? > > > > It's up to Peter really. > > > > It concerns me in general to start having in-kernel users of uclamp that might > > end up setting random values (like we ended having random RT priorities), that > > really don't mean a lot outside the context of the specific system it was > > tested on. Given the kernel could run anywhere, it's hard to rationalize what's > > okay or not. > > > > Opting out of default RT boost for a specific task in the kernel, could make > > sense though it still concerns me for the same reasons. Is this okay for all > > possible systems this can run on? > > > > It feels better for userspace to turn RT boosting off for all tasks if you know > > your system is powerful, or use the per task API to switch off boosting for the > > tasks you know they don't need it. > > > > But if we want to allow in-kernel users, IMO it needs to be done in > > a controlled way, in a similar manner Peter changed how RT priority can be set > > in the kernel. > > > > It would be good hear what Peter thinks. > > Hurmph.. I think I understand the problem, but I'm not sure what to do > about it :-( > > Esp. given the uclamp optimization patches now under consideration. That > is, if random drivers are going to install uclamps, then that will > automagically enable the static_key and make the scheduler slower, even > if that driver isn't particularly interesting to the user. For some reason this didn't land in my inbox. Yeah I was considering how we can handle this potential new user without a clash. But I thought we first need to agree this is indeed the right thing to do. The easiest way is to make the new API act on the task struct directly rather than go through sched_setscheduler(). I.e: a wrapper around uclamp_se_set(). But first, I think we need to consider how accessible we want to keep sched_setscheduler(). IMO it should be hidden because of its potential misuse/abuse. Happy to help with that if you agree. Thanks -- Qais Yousef ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] cros_ec_spi: Even though we're RT priority, don't bump cpu freq 2020-06-18 21:18 ` Doug Anderson 2020-06-19 15:38 ` Qais Yousef @ 2020-06-24 15:56 ` Peter Zijlstra 1 sibling, 0 replies; 26+ messages in thread From: Peter Zijlstra @ 2020-06-24 15:56 UTC (permalink / raw) To: Doug Anderson Cc: Qais Yousef, Benson Leung, Enric Balletbo i Serra, hsinyi, Joel Fernandes, Nicolas Boichat, Gwendal Grignou, Quentin Perret, ctheegal, Guenter Roeck, LKML On Thu, Jun 18, 2020 at 02:18:23PM -0700, Doug Anderson wrote: > Hi, > > On Fri, Jun 12, 2020 at 5:52 AM Qais Yousef <qais.yousef@arm.com> wrote: > > > > On 06/10/20 15:18, Douglas Anderson wrote: > > > + struct sched_attr sched_attr = { > > > + .sched_policy = SCHED_FIFO, > > > + .sched_priority = MAX_RT_PRIO / 2, > > > + .sched_flags = SCHED_FLAG_UTIL_CLAMP_MIN, > > > + .sched_util_min = 0, > > > > Hmm I thought Peter already removed all users that change RT priority directly. > > > > https://lore.kernel.org/lkml/20200422112719.826676174@infradead.org/ > > Yeah, I mentioned that patch series "after the cut" in my patch and > also made sure to CC Peter on my patch. I'm not sure what happened to > that series since I don't see it in linuxnext... Right, I got busy with other stuff and it languished a bit. I tried to merge it recently (it's actually in tip/sched/fifo atm), but then 0day robot found a problem with it and I've not yet had the time to deal with that. Specifically these patches unEXPORT the schet_setscheduler() family, but kernel/trace/ring_buffer_benchmark.c can still be a module :-( ^ permalink raw reply [flat|nested] 26+ messages in thread
end of thread, other threads:[~2020-06-25 14:20 UTC | newest] Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-06-10 22:18 [PATCH] cros_ec_spi: Even though we're RT priority, don't bump cpu freq Douglas Anderson 2020-06-11 11:03 ` Quentin Perret 2020-06-11 17:48 ` Doug Anderson 2020-06-12 9:24 ` Quentin Perret 2020-06-12 12:34 ` Qais Yousef 2020-06-12 13:47 ` Quentin Perret 2020-06-12 14:00 ` Qais Yousef 2020-06-18 21:31 ` Doug Anderson 2020-06-19 15:31 ` Qais Yousef 2020-06-22 18:19 ` Doug Anderson 2020-06-23 15:56 ` Qais Yousef 2020-06-12 12:52 ` Qais Yousef 2020-06-18 21:18 ` Doug Anderson 2020-06-19 15:38 ` Qais Yousef 2020-06-22 18:21 ` Doug Anderson 2020-06-23 16:40 ` Qais Yousef 2020-06-24 15:49 ` Joel Fernandes 2020-06-24 16:55 ` Qais Yousef 2020-06-24 17:35 ` Joel Fernandes 2020-06-24 17:52 ` Qais Yousef 2020-06-24 17:55 ` Joel Fernandes 2020-06-24 18:29 ` Doug Anderson 2020-06-25 14:20 ` Qais Yousef 2020-06-24 16:01 ` Peter Zijlstra 2020-06-24 17:14 ` Qais Yousef 2020-06-24 15:56 ` Peter Zijlstra
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).