> On Dec 19, 2022, at 7:50 AM, Ola x Nilsson wrote:
>
>
> On Mon, Dec 12 2022, Randy MacLeod wrote:
>
>> CCing Richard
>>
>> On 2022-12-12 05:07, Ola x Nilsson via lists.openembedded.org wrote:
>>> Hi,
>>>
>>> I've been looking into using the pressure stall information awareness of
>>> bitbake
>> That's good to hear Ola.
>>> but I have some problems getting it to work. Actually I think
>>> it just doesn't work at all.
>>
>> Doesn't work at all?
>>
>> Well that would be surprising. See below.
>
> OK, it will occasionally block a task. But since the next attempt will
> always come after a very short time interval, it will almost always start
> a new task even if the pressure is high.
> At least this is what I observe on my system.
>
>
>
>> 1. Rather than just keeping track of the previous pressure values
>> seen more than 1 second ago, as is done currently:
>>
>>     if now - self.prev_pressure_time > 1.0:
>>
>> and always using that as a reference, we can
>> store, say, 10 values per second and use those as a reference.
>>
>> There are some challenges in that approach in that we don't control
>> how often the function is called. Averaging over the last 10 calls
>> is tempting but likely has some edge cases, such as when there are
>> lots of tasks starting/ending.
>>
>>
>> 2. If there has been a long delay since the function was last called,
>> we could check the pressure, sleep for a short period of time and check it
>> again. Some people would not like this since it will needlessly delay
>> the build, so we'd have to keep the delay to < 1 second. Too short a delay
>> will reduce the accuracy of the result, but I suspect that 0.1 seconds is
>> sufficient for most users. We could also look at the avg10 value in this
>> case, or even some combination of both the current contention and avg10.
>>
>>
>> 3. Just calculate the pressure per second by:
>>
>>     ( current pressure - last pressure ) / (now - last_time)
>>
>> This could handle short time differences such as milliseconds
>> and would be a 'cheap' way to deal with long delays. In your case,
>> the pressure would be:
>>
>>     cpu_pressure 8978077.0 io_pressure 1353882.0 mem_pressure 20922.0
>>
>> divided by ~19 since the initial values were close to zero.
>>
>> Then for the next time, just 0.1 seconds later:
>>
>>     1670840042.384582 cpu_pressure 8978077.0 io_pressure 1353882.0 mem_pressure 20922.0
>>     1670840042.384582 cpu io pressure exceeded over 18.677629 seconds
>>     1670840042.486946 cpu_pressure 466.0 io_pressure 30792.0 mem_pressure 0.0
>>
>> Multiplying by 10 for easy calculation, that would be a pressure of:
>>
>>     cpu: 4660, io: 307920, mem: 0.
>>
>>
>> Do you have another idea or a preference as to which approach we take?
>
> I think 3 is a good first step. Using multiple samples could improve
> our calculated "avg1", but let's do that later if needed.

I agree; Randy and I have been working on patching make and have taken a
similar approach:

https://github.com/ZhengQ2/make/tree/cpu-pressure

Additionally, we found that when the pressure is read too frequently, we may
get the same CPU pressure value as a result, even if the pressure has
actually changed. This is likely due to the per-CPU variables used in the
kernel. So, in addition to the algorithm Randy described above, we also check
whether the CPU pressure reading has changed; if it has not, we return the
last result that was produced.

I will CC you when I have a patch, and you can try it out before the commit
gets merged if you like.

ZQ

>
> /Ola
>
>>
>> ../Randy
>>
>>
>>>
>>> /Ola Nilsson
>>>
>>>
>>>
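To make approach 3 and the unchanged-reading check a bit more concrete, here
is a minimal Python sketch. It is not the bitbake code or the make patch; the
names parse_psi_total and PressureRate are made up for illustration. It
assumes the cumulative "total" counter (in microseconds) from the "some" line
of a PSI file such as /proc/pressure/cpu is used as the pressure value, and
that the caller supplies the timestamps.

```python
def parse_psi_total(text):
    """Return the cumulative 'total' (microseconds) from the 'some' line
    of PSI text, e.g. the contents of /proc/pressure/cpu."""
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0] == "some":
            for field in fields[1:]:
                key, _, value = field.partition("=")
                if key == "total":
                    return int(value)
    raise ValueError("no 'some ... total=' entry found")


class PressureRate:
    """Hypothetical helper: pressure per second between two readings."""

    def __init__(self, total, now):
        self.prev_total = total   # last reading that actually changed
        self.prev_time = now      # time of that reading
        self.last_rate = 0.0      # last result that was produced

    def update(self, total, now):
        elapsed = now - self.prev_time
        # If the counter has not moved (for example, it was read again too
        # quickly), reuse the last result instead of reporting a bogus 0.
        if total == self.prev_total or elapsed <= 0:
            return self.last_rate
        # (current pressure - last pressure) / (now - last_time)
        self.last_rate = (total - self.prev_total) / elapsed
        self.prev_total = total
        self.prev_time = now
        return self.last_rate
```

Because the reference time is only advanced when the counter changes, the
next change is divided by the full interval it accumulated over. One
trade-off of this sketch: if the pressure really drops to zero, the stale
last result keeps being returned until the counter moves again, so a real
implementation would probably want to expire it after some time.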