From: "Gupta, Aryaman" <Aryaman.Gupta@windriver.com>
To: Richard Purdie <richard.purdie@linuxfoundation.org>,
	"openembedded-core@lists.openembedded.org"
	<openembedded-core@lists.openembedded.org>
Subject: RE: [OE-core] [PATCH] runqueue: add cpu/io pressure regulation
Date: Thu, 30 Jun 2022 20:50:28 +0000	[thread overview]
Message-ID: <PH7PR11MB60302DE19EF77B6C29DAAB8BEBBA9@PH7PR11MB6030.namprd11.prod.outlook.com> (raw)
In-Reply-To: <566a77ac43e2ee7d4bb3204a0290fd6dedd3713c.camel@linuxfoundation.org>



>-----Original Message-----
>From: Richard Purdie <richard.purdie@linuxfoundation.org>
>Sent: Thursday, June 30, 2022 5:53 AM
>To: Gupta, Aryaman <Aryaman.Gupta@windriver.com>; openembedded-
>core@lists.openembedded.org
>Subject: Re: [OE-core] [PATCH] runqueue: add cpu/io pressure regulation
>
>On Wed, 2022-06-29 at 16:10 -0400, Aryaman Gupta wrote:
>> Stop the scheduler from starting new tasks if the current cpu or io
>> pressure is above a certain threshold, specified through the
>> "BB_MAX_{CPU|IO}_SOME_PRESSURE" variables in conf/local.conf.
>>
>> If the thresholds aren't specified, the default values are 100 for
>> both CPU and IO, which will have no impact on build times.
>> An arbitrary lower limit of 1.0 results in a fatal error to avoid
>> extremely long builds. If the percentage limits are higher than 100,
>> then the default values are used and warnings are issued to inform
>> users that the specified limit is out of bounds.
>>
>> Signed-off-by: Aryaman Gupta <aryaman.gupta@windriver.com>
>> Signed-off-by: Randy Macleod <randy.macleod@windriver.com>
>> ---
>>  bitbake/lib/bb/runqueue.py | 39 ++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 39 insertions(+)
>
>This looks like a good start, thanks. There are a few things which will need
>cleaning up in here as this is pretty performance sensitive code (try a "time
>bitbake world -n" to see what I mean).
>
>Firstly, it is a bitbake patch so should really go to the bitbake mailing list,
>not OE-Core.
>

Right, will do!

>> diff --git a/bitbake/lib/bb/runqueue.py b/bitbake/lib/bb/runqueue.py
>> index 1e47fe70ef..9667acc11c 100644
>> --- a/bitbake/lib/bb/runqueue.py
>> +++ b/bitbake/lib/bb/runqueue.py
>> @@ -159,6 +159,27 @@ class RunQueueScheduler(object):
>>                  self.buildable.append(tid)
>>
>>          self.rev_prio_map = None
>> +        # Some hosts like openSUSE have readable /proc/pressure files
>> +        # but throw errors when these files are opened.
>> +        try:
>> +            subprocess.check_output(["cat", "/proc/pressure/cpu", "/proc/pressure/io"], \
>> +                                    universal_newlines=True, stderr=subprocess.DEVNULL)
>> +            self.readable_pressure_files = True
>> +        except:
>> +            if self.rq.max_cpu_pressure!=100 or self.rq.max_io_pressure!=100:
>> +                bb.warn("The /proc/pressure files can't be read. Continuing build without monitoring pressure")
>> +            self.readable_pressure_files = False
>> +
>> +    def exceeds_max_pressure(self):
>> +        if self.readable_pressure_files:
>> +            # extract avg10 from /proc/pressure/{cpu|io}
>> +            curr_pressure_sample = subprocess.check_output(["cat", "/proc/pressure/cpu", "/proc/pressure/io"], \
>> +                                   universal_newlines=True, stderr=subprocess.DEVNULL)
>> +            curr_cpu_pressure = curr_pressure_sample.split('\n')[0].split()[1].split("=")[1]
>> +            curr_io_pressure = curr_pressure_sample.split('\n')[2].split()[1].split("=")[1]
>
>This is horrible, you're adding in a fork() call for every pass through the
>scheduler code. You can just open() and read the file instead which will have
>*much* lower overhead.

Oh, you're right. Randy and I tested this and it turns out subprocess is significantly slower.
We also looked at another level of optimization where you open() once and reuse the file
descriptor, similar to what's done in buildstats. While this is faster, the absolute time to
read the files is only around 13 or 37 microseconds per sample either way (timings below).
Given the complexity it would add to the code, the optimization does not seem worthwhile.
I'll remove the subprocess call in v2.
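For v2, the direct read would look roughly like this (a sketch only; read_some_avg10 is a
made-up helper, and it assumes the standard PSI layout where the "some" line comes first
in /proc/pressure/{cpu,io}):

def read_some_avg10(path):
    # Each PSI file starts with a line like:
    #   some avg10=0.00 avg60=0.00 avg300=0.00 total=123456
    # avg10 is the second whitespace-separated field.
    with open(path) as f:
        some_line = f.readline()
    return float(some_line.split()[1].split("=")[1])

# e.g. the equivalent of the current exceeds_max_pressure() comparison,
# with 90.0 as a placeholder threshold:
exceeds = read_some_avg10("/proc/pressure/cpu") > 90.0 or \
          read_some_avg10("/proc/pressure/io") > 90.0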

>
>Even then, I'm not particularly thrilled to have this level of overhead in this
>codepath, in some ways I'd prefer to rate limit how often we're looking up this
>value rather than once per pass through the scheduler path. I'm curious what
>the timings say.
>
We wrote this simple test code:
import timeit, subprocess

file = "/proc/pressure/cpu"

def test_subprocess():
    # fork/exec "cat" for every sample, as the current patch does
    subprocess.check_output(["cat", file])

test1 = open(file)
def test_seek():
    # reuse a single file descriptor, rewinding before each read
    test1.seek(0)
    test1.read()

def test_openread():
    # open and read the file fresh on every sample
    with open(file) as test:
        test.read()
        check_var = 1+1

repeat = 1000
#print("SUBPROCESS: %s " % timeit.Timer(test_subprocess).timeit(number=repeat))
print("SEEK: %s " % timeit.Timer(test_seek).timeit(number=repeat))
print("OPEN AND READ: %s " % timeit.Timer(test_openread).timeit(number=repeat))

Running this with python test.py gave:
SEEK: 0.012695984973106533
OPEN AND READ: 0.0369647319894284

You can see that reusing the descriptor is about 3x faster, but at roughly 13 versus 37
microseconds per read over 1000 iterations, the difference is small in absolute terms.
Do you think that's a reasonable overhead?
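If the per-call cost is still a concern, rate limiting along the lines you suggested could
look something like this rough sketch (the PressureMonitor class and the one-second default
interval are made up for illustration, not part of the patch):

import time

class PressureMonitor:
    # Re-read /proc/pressure at most once per interval and reuse the cached
    # verdict in between, keeping the scheduler hot path cheap.
    def __init__(self, max_cpu, max_io, interval=1.0):
        self.max_cpu = max_cpu
        self.max_io = max_io
        self.interval = interval
        self.last_sample_time = 0.0
        self.last_result = False

    def _read_some_avg10(self, path):
        # avg10 from the "some" line, as in the patch's parsing
        with open(path) as f:
            return float(f.readline().split()[1].split("=")[1])

    def exceeds_max_pressure(self):
        now = time.time()
        if now - self.last_sample_time < self.interval:
            return self.last_result
        self.last_sample_time = now
        cpu = self._read_some_avg10("/proc/pressure/cpu")
        io = self._read_some_avg10("/proc/pressure/io")
        self.last_result = cpu > self.max_cpu or io > self.max_io
        return self.last_result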

>> +
>> +            return float(curr_cpu_pressure) > self.rq.max_cpu_pressure or float(curr_io_pressure) > self.rq.max_io_pressure
>> +        return False
>>
>>      def next_buildable_task(self):
>>          """
>> @@ -171,6 +192,8 @@ class RunQueueScheduler(object):
>>          buildable.intersection_update(self.rq.tasks_covered | self.rq.tasks_notcovered)
>>          if not buildable:
>>              return None
>> +        if self.exceeds_max_pressure():
>> +            return None
>>
>>          # Filter out tasks that have a max number of threads that have been exceeded
>>          skip_buildable = {}
>> @@ -1699,6 +1722,8 @@ class RunQueueExecute:
>>
>>          self.number_tasks = int(self.cfgData.getVar("BB_NUMBER_THREADS") or 1)
>>          self.scheduler = self.cfgData.getVar("BB_SCHEDULER") or "speed"
>> +        self.max_cpu_pressure = float(self.cfgData.getVar("BB_MAX_CPU_SOME_PRESSURE") or 100.0)
>> +        self.max_io_pressure = float(self.cfgData.getVar("BB_MAX_IO_SOME_PRESSURE") or 100.0)
>
>I did wonder if this should be BB_PRESSURE_MAX_SOME_IO as the order of the
>information kind of seems backwards to me. That could just be me though! :)

Sure, that colour of bikeshed is perfect ;^)
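For reference, with the variable names as they stand in this patch, a user would set
something like this in conf/local.conf (the values are just an example; scheduling of new
tasks pauses whenever the avg10 percentage exceeds them):

BB_MAX_CPU_SOME_PRESSURE = "90"
BB_MAX_IO_SOME_PRESSURE = "90"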

Regards,
Aryaman
>
>Cheers,
>
>Richard
