From: Randy MacLeod <randy.macleod@windriver.com>
To: Ola x Nilsson <ola.x.nilsson@axis.com>, ChenQi <Qi.Chen@windriver.com>
Cc: contrib@zhengqiu.net,
	Richard Purdie <richard.purdie@linuxfoundation.org>,
	bitbake-devel@lists.openembedded.org
Subject: Re: [bitbake-devel] Bitbake PSI checker
Date: Mon, 22 May 2023 10:41:48 -0400
Message-ID: <3dd30f41-688d-5691-f26e-66fc73bb49d0@windriver.com>
In-Reply-To: <jwqzg5w4s0y.fsf@axis.com>


On 2023-05-22 05:36, Ola x Nilsson wrote:
> Hi Qi and Randy,
>
> I did some testing this morning, and I think this works fine for the <1s
> intervals.
>
> I added log prints whenever the exceeds_max_pressure function was called
> and was a bit surprised at some of my observations.


Yes, the kernel uses per-cpu variables to track pressure
efficiently and only updates what you see in /proc/pressure
periodically. Fun, eh!

I don't have a graph at hand to show that but here's a
typical CPU pressure pattern:

https://photos.app.goo.gl/XCMVAjywmBgoqj4E6

for those who haven't looked at the data.

This graph doesn't show that if you over-sample you'll get the same
value from pressure repeatedly until the per-cpu data is updated.
I might have that data on hand somewhere else but officially today is
a holiday so I'm not going to go look for it even if graphs are more
of a hobby than work!
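
For anyone who wants to see the over-sampling effect on their own machine, here is a minimal sketch (not bitbake code). It assumes the usual /proc/pressure line format, "some avg10=... avg60=... avg300=... total=...", and a kernel with PSI enabled. The helper names parse_total, count_repeats and sample_cpu_totals are made up for illustration, and the sample count and interval are arbitrary.

```python
import time


def parse_total(line):
    """Return the 'total' stall time (microseconds) from one PSI line."""
    # Look the field up by name rather than by position in the line.
    for field in line.split()[1:]:
        key, _, value = field.partition("=")
        if key == "total":
            return int(value)
    raise ValueError("no total= field in %r" % line)


def count_repeats(samples):
    """Count samples that are identical to the sample just before them."""
    return sum(1 for prev, cur in zip(samples, samples[1:]) if cur == prev)


def sample_cpu_totals(n=20, interval=0.05, path="/proc/pressure/cpu"):
    """Read the 'some' total n times, sleeping `interval` seconds between reads."""
    totals = []
    for _ in range(n):
        with open(path) as f:
            # The first line of the file is the 'some' line.
            totals.append(parse_total(f.readline()))
        time.sleep(interval)
    return totals
```

Calling count_repeats(sample_cpu_totals()) on a loaded machine gives a rough idea of how many reads returned a value that had not been refreshed yet; the exact refresh behaviour depends on the kernel.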

>
> It seems setscene tasks are started without checking the PSI.  Is this
> by design?
Well, more like by lack of design!

I'll take a look, hopefully this week.


>   With the antivirus program forced on me by IT I easily reach a
> CPU PSI above 600000 (my current limit) while only running setscene
> tasks.
Ugh!
>
> If the PSI threshold has been reached, no new tasks will be started for
> a while.  But once the PSI check passes, it seems as many tasks as are
> allowed are started at once.  Considering the time interval between
> checks for each started task would be very small, this would probably
> happen even if the PSI was checked for each task start.  But won't this
> cause 'waves' of tasks that compete and cause high PSI instead of
> allowing just a few (one?) tasks to start and then wait a second?
Yes, I've considered that but hadn't gathered data
on it while Zheng was still working with me. I was also
concerned that we didn't want to slow the builds down
too much. I'm not sure how to make that trade-off in a
generic manner given that we don't know if a new build
will generate little, some or tremendous pressure.


The problem is even harder if you have 2 or 3 builds on the
same machine. The related but not exactly appropriate term
for this phenomenon is 'the thundering herd problem':
https://en.wikipedia.org/wiki/Thundering_herd_problem

I expect that there are good or even optimal solutions but
I haven't had/taken time to read the literature.
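
To make the trade-off concrete, one simple option would be to cap how many tasks may start per time window once a pressure check passes. The sketch below only illustrates that idea and is not a proposal for runqueue.py; StartThrottle, max_starts and interval are hypothetical names and the default numbers are arbitrary.

```python
import time


class StartThrottle:
    """Allow at most max_starts task launches per `interval` seconds."""

    def __init__(self, max_starts=1, interval=1.0, clock=time.monotonic):
        self.max_starts = max_starts
        self.interval = interval
        self.clock = clock          # injectable so the logic can be tested
        self.window_start = None
        self.started = 0

    def may_start(self):
        """Return True and record a launch if the current window has room."""
        now = self.clock()
        if self.window_start is None or now - self.window_start >= self.interval:
            # The previous window has expired: open a new one.
            self.window_start = now
            self.started = 0
        if self.started < self.max_starts:
            self.started += 1
            return True
        return False
```

The cost is the one mentioned above: on a machine with little pressure a cap like this delays the build for no benefit, so the window and the cap would need to be tunable or adaptive.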


>
> These two things are obviously not connected to this patch.  I think
> this is fine except for the commit message which refers to runqemu.py
> instead of runqueue.py.


Oops.... I don't actually see that error but if it's done, c'est la vie.

>
> Thank you for this improvement.

+1 Qi !

Ola,
Thanks for checking and reporting and helping push us to do better!

../Randy



> /Ola
>
> On Mon, May 22 2023, ChenQi wrote:
>
>> Hi Ola & Randy,
>>
>> I just checked the code and I think Ola is right. The current PSI check cannot block spawning of new tasks if the time interval
>> between the current check and the last check is small. I'll send out a patch to fix this issue.
>>
>> Also, I don't think calculating the value too often is a good idea, so I'll change the check to be >1s.
>>
>> Please help review the patch.
>>
>> Regards,
>> Qi
>>
>> On 5/21/23 03:58, Randy MacLeod wrote:
>>
>>   On 2022-12-19 14:49, Zheng Qiu via lists.openembedded.org wrote:
>>
>>   On Dec 19, 2022, at 7:50 AM, Ola x Nilsson<ola.x.nilsson@axis.com>  wrote:
>>
>>   On Mon, Dec 12 2022, Randy MacLeod wrote:
>>
>>   CCing Richard
>>
>>   On 2022-12-12 05:07, Ola x Nilsson via lists.openembedded.org wrote:
>>
>>   Hi,
>>
>>   I've been looking into using the pressure stall information awareness of
>>   bitbake
>>
>>   That's good to hear Ola.
>>
>>    but I have some problems getting it to work.  Actually I think
>>   it just doesn't work at all.
>>
>>   Doesn't work at all?
>>
>>   Well that would be surprising. See below.
>>
>>   OK, it will occasionally block a task. But since the next attempt will
>>   always be a very short time interval it will almost always start a new
>>   task even if the pressure is high.
>>   At least this is what I observe on my system.
>>
>>   <snip>
>>
>>   1. Rather than just keep track of the previous pressure values
>>   seen more than 1 second ago as done currently:
>>
>>         if now - self.prev_pressure_time > 1.0:
>>
>>   and always using that as a reference, we can
>>   store say 10 values per second and use that as a reference.
>>
>>   There are some challenges in that approach in that we don't control
>>   how often the function is called. Averaging over the last 10 calls
>>   is tempting but likely has some edge cases such as when there are
>>   lots of tasks starting/ending.
>>
>>   2. If there has been a long delay since the function was last called,
>>   we could check the pressure, sleep for a short period of time and check it
>>   again. Some people would not like this since it will needlessly delay
>>   the build, so we'd have to keep the delay to < 1 second. Too short a delay
>>   will reduce the accuracy of the result but I suspect that 0.1 seconds is
>>   sufficient for most users. We could also look at the avg10 value in this
>>   case or even some combination of both the current contention and avg10.
>>
>>   3. Just calculate the pressure per second by:
>>
>>      ( current pressure - last pressure ) / (now - last_time)
>>
>>   This could handle short time differences such as milliseconds
>>   and would be a 'cheap' way to deal with long delays. In your case,
>>   the pressure would be:
>>
>>     cpu_pressure 8978077.0 io_pressure 1353882.0 mem_pressure 20922.0
>>
>>   divided by ~19 since the initial values were close to zero.
>>
>>   Then for the next time, just 0.1 seconds later:
>>
>>   1670840042.384582 cpu_pressure 8978077.0 io_pressure 1353882.0 mem_pressure 20922.0
>>   1670840042.384582 cpu io  pressure exceeded over 18.677629 seconds
>>   1670840042.486946 cpu_pressure 466.0 io_pressure 30792.0 mem_pressure 0.0
>>
>>   Multiplying by 10 for easy calculation, that would be a pressure of:
>>
>>   cpu: 4660, io: 307920, mem: 0.
>>
>>   Do you have another idea or a preference as to which approach we take?
>>
>>   I think 3 is a good first step.  Using multiple samples could improve
>>   our calculated "avg1", but let's do that later if needed.
>>
>>   I agree; Randy and I have been working on patching make and have taken a similar approach:
>>
>>   ZhengQ2/make at cpu-pressure (github.com)
>>
>>   Additionally, we found that when the pressure is read too frequently, we may get the same CPU pressure as a result,
>>   even if the pressure has actually changed. This is likely due to the per-CPU variables used in the kernel.
>>   So, in addition to the algorithm Randy described above, we also check whether the CPU pressure has changed; if not,
>>   we return the last result that was produced.
>>
>>   I will CC you when I have a patch, and you can try it out before the commit gets merged if you like.
>>
>>   Ola,
>>
>>   Does Qi's patch below help in your situation?
>>
>>   I still want/intend to add a bitbake PSI test case that uses stress-ng to induce load
>>   and a lightweight sleep task but there are never enough hours in the day/week/...
>>
>>   The basic idea is to:
>>
>>   1. Run a task that just sleeps for say 10 seconds and confirm that the actual
>>   execution time is < 11 seconds or so.
>>
>>   2. Use stress-ng to get the system into a CPU pressure environment above
>>   the current threshold for say 30 seconds and simultaneously / shortly thereafter,
>>   launch the same sleep task and confirm that this time, the actual execution time
>>   from launch to completion is 40+ seconds.
>>
>>   ../Randy 'getting caught up on email on the weekend' MacLeod
>>
>>   ❯ git show ba94f9a3b1960cc0fdc831c20a9d2f8ad289f307
>>   commit ba94f9a3b1960cc0fdc831c20a9d2f8ad289f307
>>   Author: Chen Qi<Qi.Chen@windriver.com>
>>   Date:   Thu Apr 6 23:07:14 2023
>>
>>       bitbake: runqueue: fix PSI check calculation
>>       
>>       The current PSI check calculation does not take into consideration
>>       the possibility of the time interval between last check and current
>>       check being much larger than 1s. In fact, the current behavior does
>>       not match what the manual says about BB_PRESSURE_MAX_XXX, even if
>>       the value is set to upper limit, 1000000, we still get many blocks
>>       on new task launch. The difference between 'total' should be divided
>>       by the time interval if it's larger than 1s.
>>       
>>       (Bitbake rev: b4763c2c93e7494e0a27f5970c19c1aac66c228b)
>>       
>>       Signed-off-by: Chen Qi<Qi.Chen@windriver.com>
>>       Signed-off-by: Richard Purdie<richard.purdie@linuxfoundation.org>
>>
>>   Δ bitbake/lib/bb/runqueue.py
>>
>>   • 198: class RunQueueScheduler(object):
>>
>>                   curr_cpu_pressure = cpu_pressure_fds.readline().split()[4].split("=")[1]
>>                   curr_io_pressure = io_pressure_fds.readline().split()[4].split("=")[1]
>>                   curr_memory_pressure = memory_pressure_fds.readline().split()[4].split("=")[1]
>>  -                exceeds_cpu_pressure =  self.rq.max_cpu_pressure and (float(curr_cpu_pressure) - float(self.prev_cpu_pressure)) > self.rq.max_cpu_pressure
>>  -                exceeds_io_pressure =  self.rq.max_io_pressure and (float(curr_io_pressure) - float(self.prev_io_pressure)) > self.rq.max_io_pressure
>>  -                exceeds_memory_pressure = self.rq.max_memory_pressure and (float(curr_memory_pressure) - float(self.prev_memory_pressure)) > self.rq.max_memory_pressure
>>                   now = time.time()
>>  -                if now - self.prev_pressure_time > 1.0:
>>  +                tdiff = now - self.prev_pressure_time
>>  +                if tdiff > 1.0:
>>  +                    exceeds_cpu_pressure =  self.rq.max_cpu_pressure and (float(curr_cpu_pressure) - float(self.prev_cpu_pressure)) / tdiff > self.rq.max_cpu_pressure
>>  +                    exceeds_io_pressure =  self.rq.max_io_pressure and (float(curr_io_pressure) - float(self.prev_io_pressure)) / tdiff > self.rq.max_io_pressure
>>  +                    exceeds_memory_pressure = self.rq.max_memory_pressure and (float(curr_memory_pressure) - float(self.prev_memory_pressure)) / tdiff > self.rq.max_memory_pressure
>>                       self.prev_cpu_pressure = curr_cpu_pressure
>>                       self.prev_io_pressure = curr_io_pressure
>>                       self.prev_memory_pressure = curr_memory_pressure
>>                       self.prev_pressure_time = now
>>  +                else:
>>  +                    exceeds_cpu_pressure =  self.rq.max_cpu_pressure and (float(curr_cpu_pressure) - float(self.prev_cpu_pressure)) > self.rq.max_cpu_pressure
>>  +                    exceeds_io_pressure =  self.rq.max_io_pressure and (float(curr_io_pressure) - float(self.prev_io_pressure)) > self.rq.max_io_pressure
>>  +                    exceeds_memory_pressure = self.rq.max_memory_pressure and (float(curr_memory_pressure) - float(self.prev_memory_pressure)) > self.rq.max_memory_pressure
>>               return (exceeds_cpu_pressure or exceeds_io_pressure or exceeds_memory_pressure)
>>           return False
>>
>>   ZQ
>>
>>   /Ola
>>
>>   ../Randy
>>
>>   /Ola Nilsson
>>

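For reference, the per-second calculation from option 3 in the quoted thread, (current pressure - last pressure) / (now - last_time), can be sketched on its own as below. This is a standalone illustration rather than the code in runqueue.py; the function names pressure_rate and exceeds are made up, and the choice to raise on a non-positive interval is an assumption of this sketch.

```python
def pressure_rate(prev_total, curr_total, prev_time, now):
    """Return stall time accumulated per second of wall-clock time."""
    elapsed = now - prev_time
    if elapsed <= 0:
        # Assumption of this sketch: refuse to divide by a zero or
        # negative interval instead of guessing a rate.
        raise ValueError("elapsed time must be positive")
    return (curr_total - prev_total) / elapsed


def exceeds(prev_total, curr_total, prev_time, now, max_pressure):
    """True if a limit is set and the per-second rate is above it."""
    if not max_pressure:
        # No limit configured, so nothing can be exceeded.
        return False
    return pressure_rate(prev_total, curr_total, prev_time, now) > max_pressure
```

With the cpu_pressure difference of 8978077.0 over 18.677629 seconds from the quoted log, the rate works out to roughly 480000 per second, which is under a limit of 600000, whereas the undivided difference would be far over it.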

-- 
# Randy MacLeod
# Wind River Linux


Thread overview: 9+ messages
2022-12-12 10:07 Bitbake PSI checker Ola x Nilsson
2022-12-12 20:48 ` [bitbake-devel] " Randy MacLeod
2022-12-19 12:50   ` Ola x Nilsson
2022-12-19 19:49     ` contrib
2023-05-20 19:58       ` Randy MacLeod
2023-05-22  2:17         ` ChenQi
2023-05-22  9:36           ` Ola x Nilsson
2023-05-22 14:41             ` Randy MacLeod [this message]
2023-05-23  2:08               ` Chen, Qi
