* an idea about parallel doing do_patch
@ 2020-09-07  8:27 zhangyifan46
  2020-09-07  8:37 ` [bitbake-devel] " Mikko Rapeli
  0 siblings, 1 reply; 5+ messages in thread
From: zhangyifan46 @ 2020-09-07  8:27 UTC (permalink / raw)
  To: bitbake-devel

In our project, we have suffered a lot from the long duration of do_patch for the Linux kernel (3000+ patches). I found that a do_patch task uses only one process, so I modified the do_patch task a little. Here is my idea:
1. Analyse the patches, extracting only the files modified by each patch.
2. Cluster all patches according to the files they modify (for example, if patch no. 1 modifies files A and B, patch no. 2 modifies file A, and patch no. 3 modifies file C, then patches no. 1 and no. 2 form one group and patch no. 3 forms another). I use union-find to do the clustering.
3. Assign each group to one process.
But I ran into the problem of patches intermittently going missing.
Does anyone have any comments on my idea?
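
The clustering in steps 1 to 3 can be sketched as follows. This is a minimal illustration, not the poster's actual code: the function names (`find`, `union`, `cluster_patches`) and the input shape (an ordered list of `(patch name, files)` pairs) are assumptions, and extracting the file list from each patch is left out.

```python
def find(parent, x):
    """Return the root of x's set, shortening the path as we go (path halving)."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def union(parent, a, b):
    """Merge the sets containing a and b; the smaller index stays the root."""
    ra, rb = find(parent, a), find(parent, b)
    if ra == rb:
        return
    if ra < rb:
        parent[rb] = ra
    else:
        parent[ra] = rb


def cluster_patches(patches):
    """Group patches that (transitively) touch a common file.

    patches: ordered list of (name, iterable_of_file_paths) pairs.
    Returns a list of groups; each group lists patch names in series order.
    """
    parent = list(range(len(patches)))
    first_seen = {}  # file path -> index of the first patch touching it
    for idx, (_name, files) in enumerate(patches):
        for path in files:
            if path in first_seen:
                union(parent, first_seen[path], idx)
            else:
                first_seen[path] = idx

    groups = {}  # root index -> patch names, filled in series order
    for idx, (name, _files) in enumerate(patches):
        groups.setdefault(find(parent, idx), []).append(name)
    return list(groups.values())


if __name__ == "__main__":
    series = [("0001", {"A", "B"}), ("0002", {"A"}), ("0003", {"C"})]
    print(cluster_patches(series))  # [['0001', '0002'], ['0003']]
```

Each group keeps the original series order, which matters because patches touching the same file generally have to be applied in sequence. Groups share no files, so in principle each could be handed to a separate process. For this to hold, the file set of a patch that creates, deletes, or renames a file would need to include every path involved.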


* Re: [bitbake-devel] an idea about parallel doing do_patch
  2020-09-07  8:27 an idea about parallel doing do_patch zhangyifan46
@ 2020-09-07  8:37 ` Mikko Rapeli
       [not found]   ` <20610.1599468217725892636@lists.openembedded.org>
  2020-09-09 23:25   ` Khem Raj
  0 siblings, 2 replies; 5+ messages in thread
From: Mikko Rapeli @ 2020-09-07  8:37 UTC (permalink / raw)
  To: zhangyifan46; +Cc: bitbake-devel

Hi,

On Mon, Sep 07, 2020 at 01:27:29AM -0700, zhangyifan46 via lists.openembedded.org wrote:
> In our project, we have suffered a lot from the long duration of do_patch for the Linux kernel (3000+ patches). I found that a do_patch task uses only one process, so I modified the do_patch task a little. Here is my idea:
> 1. Analyse the patches, extracting only the files modified by each patch.
> 2. Cluster all patches according to the files they modify (for example, if patch no. 1 modifies files A and B, patch no. 2 modifies file A, and patch no. 3 modifies file C, then patches no. 1 and no. 2 form one group and patch no. 3 forms another). I use union-find to do the clustering.
> 3. Assign each group to one process.
> But I ran into the problem of patches intermittently going missing.
> Does anyone have any comments on my idea?

I think keeping large patch sets in meta layers and applying them with bitbake is not the
right solution. I would create a custom git repo and branch for large forks/branches
like this.

It might also be possible to apply the patches with git through mbox files, but
I don't think this would be much faster than bitbake.

Hope this helps,

-Mikko


* Re: [bitbake-devel] an idea about parallel doing do_patch
       [not found]   ` <20610.1599468217725892636@lists.openembedded.org>
@ 2020-09-07  8:58     ` Mikko Rapeli
  2020-09-07 10:30       ` Richard Purdie
  0 siblings, 1 reply; 5+ messages in thread
From: Mikko Rapeli @ 2020-09-07  8:58 UTC (permalink / raw)
  To: zhangyifan46; +Cc: bitbake-devel

Hi, please reply to the mailing list too.

On Mon, Sep 07, 2020 at 01:43:37AM -0700, zhangyifan46@huawei.com wrote:
> Sure, using a git repo is the best solution.
> But I am working in a company and we use local code (tar + patches).
> Right or wrong, I think that improving the efficiency of patching is better than nothing.

Well, here I would really move away from the tar + patches approach and move things to git.
I've been there and forced BSP vendors to provide full git trees with proper commit messages
and history linked to upstream kernel.org stable point releases. We have these
in our project and product requirements. I suggest talking to the vendors and moving
to git.

Tar and patches were OK in the 90s, but not anymore for large change sets.

Parallelizing do_patch() could be a nice addition but needs serious testing
and validation because bugs could have really serious effects.

Cheers,

-Mikko


* Re: [bitbake-devel] an idea about parallel doing do_patch
  2020-09-07  8:58     ` Mikko Rapeli
@ 2020-09-07 10:30       ` Richard Purdie
  0 siblings, 0 replies; 5+ messages in thread
From: Richard Purdie @ 2020-09-07 10:30 UTC (permalink / raw)
  To: Mikko Rapeli, zhangyifan46; +Cc: bitbake-devel

On Mon, 2020-09-07 at 08:58 +0000, Mikko Rapeli wrote:
> Hi, please reply to the mailing list too.
> 
> On Mon, Sep 07, 2020 at 01:43:37AM -0700, zhangyifan46@huawei.com
> wrote:
> > Sure, using a git repo is the best solution.
> > But I am working in a company and we use local code (tar +
> > patches).
> > Right or wrong, I think that improving the efficiency of
> > patching is better than nothing.
> 
> Well, here I would really move away from the tar + patches approach
> and move things to git.
> I've been there and forced BSP vendors to provide full git trees with
> proper commit messages and history linked to upstream kernel.org
> stable point releases. We have these in our project and product
> requirements. I suggest talking to the vendors and moving to git.
> 
> Tar and patches were OK in the 90s, but not anymore for large change
> sets.
> 
> Parallelizing do_patch() could be a nice addition but needs serious
> testing and validation because bugs could have really serious
> effects.

I agree with Mikko that do_patch isn't really designed to handle
queues of 3000 patches.

If you really want to do what you're doing, I'd write it as a
replacement for quilt that supports parallelism. You'd then perhaps be
able to swap that in with fewer changes to OE-Core. I'd also want to very
carefully analyse where the time bottleneck is in do_patch, as I suspect
it may be in process execution overhead rather than in the patch tool
itself.
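
One rough way to check the process-overhead suspicion is to time how long it takes just to start and reap a process, then scale that to the number of patches. The sketch below is only an illustration under assumptions: `mean_spawn_seconds` is a hypothetical helper, and the Python interpreter is used as the spawned command only because it is guaranteed to exist; the real measurement would substitute whatever command do_patch actually runs for each patch.

```python
import subprocess
import sys
import time


def mean_spawn_seconds(command, runs=20):
    """Average wall-clock seconds to start `command` and wait for it to exit."""
    start = time.perf_counter()
    for _ in range(runs):
        subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=True,
        )
    return (time.perf_counter() - start) / runs


if __name__ == "__main__":
    # Placeholder command: replace with the per-patch tool invocation to measure.
    per_call = mean_spawn_seconds([sys.executable, "-c", "pass"])
    print(f"{per_call * 1000:.2f} ms per process")
    print(f"about {per_call * 3000:.1f} s of pure spawn cost for 3000 patches")
```

If the per-process cost multiplied by the patch count accounts for most of the observed do_patch time, parallelising the patch tool itself is unlikely to be where the gain is.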

I'm not really very interested in changing do_patch in the core to do
parallelism since this is a very specific corner case where you really
should be doing something else entirely (use git).

Cheers,

Richard





* Re: [bitbake-devel] an idea about parallel doing do_patch
  2020-09-07  8:37 ` [bitbake-devel] " Mikko Rapeli
       [not found]   ` <20610.1599468217725892636@lists.openembedded.org>
@ 2020-09-09 23:25   ` Khem Raj
  1 sibling, 0 replies; 5+ messages in thread
From: Khem Raj @ 2020-09-09 23:25 UTC (permalink / raw)
  To: Mikko Rapeli, zhangyifan46; +Cc: bitbake-devel


On 9/7/20 1:37 AM, Mikko Rapeli wrote:
> Hi,
> 
> On Mon, Sep 07, 2020 at 01:27:29AM -0700, zhangyifan46 via lists.openembedded.org wrote:
>> In our project, we have suffered a lot from the long duration of do_patch for the Linux kernel (3000+ patches). I found that a do_patch task uses only one process, so I modified the do_patch task a little. Here is my idea:
>> 1. Analyse the patches, extracting only the files modified by each patch.
>> 2. Cluster all patches according to the files they modify (for example, if patch no. 1 modifies files A and B, patch no. 2 modifies file A, and patch no. 3 modifies file C, then patches no. 1 and no. 2 form one group and patch no. 3 forms another). I use union-find to do the clustering.
>> 3. Assign each group to one process.
>> But I ran into the problem of patches intermittently going missing.
>> Does anyone have any comments on my idea?
> 
> I think keeping large patch sets in meta layers and applying them with bitbake is not the
> right solution. I would create a custom git repo and branch for large forks/branches
> like this.
> 

Right, I think managing thousands of patches with patch management tools
like quilt is going to be cumbersome. I think the best approach here would
be to fork the kernel tree and maintain your patches on a branch. It will
ease maintenance and reduce build times. Of course, it means you have to
adopt a good process to manage your kernel branch so that it does not turn
into a maintenance problem either.

> It might also be possible to apply the patches with git through mbox files, but
> I don't think this would be much faster than bitbake.


> 
> Hope this helps,
> 
> -Mikko
> 
>>
>>
>> 

