bitbake-devel.lists.openembedded.org archive mirror
From: "Chen, Qi" <Qi.Chen@windriver.com>
To: "Chen, Qi" <Qi.Chen@windriver.com>,
	Richard Purdie <richard.purdie@linuxfoundation.org>,
	"bitbake-devel@lists.openembedded.org"
	<bitbake-devel@lists.openembedded.org>
Subject: RE: [bitbake-devel][PATCH] bitbake: add --noreply-timeout option
Date: Wed, 10 May 2023 15:05:59 +0000
Message-ID: <CO6PR11MB56027195BD2DA96E48864117ED779@CO6PR11MB5602.namprd11.prod.outlook.com>
In-Reply-To: <175DC558A674BAA1.32438@lists.openembedded.org>

Hi Richard,

Thanks for the information. After checking the updateConfig code and looking at my project's special configuration, I finally found why updateConfig takes so long for me.
We have an event handler triggered by ConfigParsed. In that handler we do file searches, mtime checks, etc., and then set BB_INVALIDCONF to True, which triggers a new parse.
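
To illustrate why this makes updateConfig slow, here is a minimal, self-contained sketch of the reparse behaviour. It is a toy model, not bitbake's actual code: parse_base_config, make_slow_handler, the dict used as a datastore, and max_passes are all hypothetical names introduced for illustration. The only things taken from the discussion above are the ConfigParsed handler and the BB_INVALIDCONF flag.

```python
import time


def parse_base_config(handlers, max_passes=5):
    """Toy model: parse, run the ConfigParsed handlers, reparse if invalidated.

    Returns the final datastore (a plain dict here) and the number of passes.
    max_passes is a safety limit for this sketch only.
    """
    passes = 0
    while True:
        passes += 1
        # Stand-in for a freshly parsed base configuration.
        data = {"BB_INVALIDCONF": False}
        for handler in handlers:
            handler(data)
        # A handler that set the flag forces another full pass.
        if data["BB_INVALIDCONF"] is not True or passes >= max_passes:
            return data, passes


def make_slow_handler(delay_s):
    """Hypothetical handler: slow checks, then invalidate the first parse only."""
    state = {"seen": False}

    def handler(data):
        time.sleep(delay_s)  # stands in for the file search and mtime checks
        if not state["seen"]:
            state["seen"] = True
            data["BB_INVALIDCONF"] = True

    return handler


if __name__ == "__main__":
    start = time.monotonic()
    _, passes = parse_base_config([make_slow_handler(0.2)])
    print(f"passes={passes}, elapsed={time.monotonic() - start:.2f}s")
```

In this model the handler's cost is paid on every pass, so a slow handler plus one forced reparse at least doubles the time spent before the client gets its reply.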

Regards,
Qi

-----Original Message-----
From: bitbake-devel@lists.openembedded.org <bitbake-devel@lists.openembedded.org> On Behalf Of Chen Qi via lists.openembedded.org
Sent: Wednesday, May 10, 2023 7:40 PM
To: Richard Purdie <richard.purdie@linuxfoundation.org>; bitbake-devel@lists.openembedded.org
Subject: Re: [bitbake-devel][PATCH] bitbake: add --noreply-timeout option

Hi Richard,

Thanks for your reply. I totally agree that it is the server configuration (e.g., BB_NUMBER_THREADS for each build, the number of builds started in parallel, and maybe PSI-related settings) that should be adjusted to avoid such timeout issues.

To add more information: the server where I saw this start-up timeout failure has 128 cores, 512G of RAM, and 4 RAID0 disks. The load average reported by uptime when the timeout happened was around 500. BB_NUMBER_THREADS was left at its default value.
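
As a rough way to put those numbers in perspective, the sketch below computes load per core. It assumes the "uptime value" above is the load average; load_per_core and current_load_per_core are hypothetical helper names, and os.getloadavg() is only available on Unix-like systems.

```python
import os


def load_per_core(load_avg, cores):
    """Return the load average divided by the number of CPU cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return load_avg / cores


def current_load_per_core():
    """Load per core for the running machine, using the 1-minute load average."""
    one_minute, _, _ = os.getloadavg()
    return load_per_core(one_minute, os.cpu_count() or 1)


if __name__ == "__main__":
    # Figures quoted in this message: load around 500 on a 128-core server.
    print(load_per_core(500, 128))
    print(current_load_per_core())
```

With the quoted figures this gives roughly 3.9 runnable or waiting tasks per core.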

I also found that this startup timeout is more likely to happen on machines with more cores: on another server, which has 40 cores, I have to start 3~5 builds to trigger the timeout error.

Regards,
Qi


-----Original Message-----
From: Richard Purdie <richard.purdie@linuxfoundation.org>
Sent: Wednesday, May 10, 2023 6:36 PM
To: Chen, Qi <Qi.Chen@windriver.com>; bitbake-devel@lists.openembedded.org
Subject: Re: [bitbake-devel][PATCH] bitbake: add --noreply-timeout option

On Wed, 2023-05-10 at 09:43 +0000, Chen, Qi wrote:
> Hi Richard,
> 
> It's not the bitbake server that takes more than 1 minute to start; 
> it's the 'updateConfig' command that takes more than 1 minute.
> From what I see, other commands finish quite fast except the actual 
> building one.

The updateConfig command doesn't actually build anything. All it does is trigger a parse of the base configuration. It isn't even parsing recipes, just the bitbake.conf and layer.conf files.

Think of updateConfig as just bringing the two ends of the connection, client and server, into sync.

I'm a bit worried that if the system is so overloaded it can't do that in 60s, we have bigger problems.

> The server was still usable, even after the second world build managed 
> to start.
> 
> I'm not using any special configuration. All default values. I do have 
> extra layers added, such as meta-openembedded/*, meta-virtualization, 
> meta-browser, meta-security, meta-selinux, etc.
> Anyway, the task number of a world build is about 40000+.
> 
> In fact, what I really want to do is to set the timeout to be some 
> large value on our autobuilders so that we can avoid this start-up 
> timeout failure. If such a patch is not suitable for the official 
> bitbake, I'll make it a local patch until our autobuilders are 
> configured more properly. Do you have any suggestions?
> 
> That's all the information and the background.

What is BB_NUMBER_THREADS set to and how many CPU cores? Spinning media or SSDs? What are the system load numbers when this happens?

I think that if this startup timeout is happening, there will be other issues, and you need to resolve the overloaded system problem rather than just moving the timeout problem somewhere else.

Cheers,

Richard


Thread overview: 6+ messages
2023-05-10  3:16 [bitbake-devel][PATCH] bitbake: add --noreply-timeout option Qi.Chen
2023-05-10  7:52 ` Richard Purdie
2023-05-10  9:43   ` Chen, Qi
2023-05-10 10:35     ` Richard Purdie
2023-05-10 11:40       ` Chen, Qi
     [not found]       ` <175DC558A674BAA1.32438@lists.openembedded.org>
2023-05-10 15:05         ` Chen, Qi [this message]