* Improving npm(sw) fetcher & integration within Bitbake
@ 2021-11-04 12:29 Jasper Orschulko
  2021-11-04 13:09 ` [bitbake-devel] " Alexander Kanavin
  2021-11-04 23:15 ` Richard Purdie
  0 siblings, 2 replies; 16+ messages in thread
From: Jasper Orschulko @ 2021-11-04 12:29 UTC (permalink / raw)
  To: bitbake-devel; +Cc: martin, Daniel Baumgart

Dear Bitbake developers,

recently we have been looking at the npmsw fetcher and discovered some
challenges regarding the integration into the developer workflow as
well as the build times within Bitbake. We believe that we found a
mechanism which would integrate well into Bitbake's existing project
structure and drastically improve the situation.


But first, what are the issues with the current npmsw fetcher?

1. Let's have a look at a typical npm-based project. You'd typically
have your package-lock.json (aka shrinkwrap file) stored within the git
repository containing your source code. Developers will rely on this
package-lock file on a daily basis during the development cycle.
Unfortunately, the current npmsw fetcher only supports shrinkwrap files
stored within the meta layer or within an npm registry. This is not
ideal, as changes to the file might be made within the project repo,
which then need to be manually applied to the lock file within the meta
repo. An ideal npmsw fetcher therefore would support using the lock
file directly from the source code repo. 

2. The current implementation of the npm class uses multiple shellouts
per npm module in order to add these to the npm cache. This is done, as
the `npm install` command is not called within the do_fetch, but at the
end of the do_configure step. This drastically increases the time
Bitbake spends in the do_configure step for an npm-based recipe. In our
case (we have a relatively small project with approx. 600 npm packages
in total, including recursive packages) this takes ~100 minutes to
complete. What makes things worse, every change to the recipe and/or
lock file will cause a complete rerun of the do_configure job.

As a result, the npm fetcher currently is not really usable for
production workloads.


So how can we address these issues?

We plan to implement a "sub-fetcher" for npmsw (a concept which might
also be recyclable for similar use-cases). This would take the
form of e.g.:

SRC_URI = "npmsw+git://git-uri.git;npm-topdir=path_to_npm_project;..."

The idea is that the npmsw fetcher would then call an arbitrary sub-
fetcher (in this case git, however any fetcher will be supported) and
after the sub-fetcher has extracted the source code into the DL_DIR,
the npm fetcher will create a secondary download folder as a copy of
the sub-fetcher's download folder. Within this copy, the npm fetcher
will call `npm ci`, effectively downloading the npm packages by doing a
clean-install on the basis of the package.json and the package-
lock.json files within the npmsw download dir. This results in a much
faster build, as it removes the need for separate handling of the
individual node packages, as well as streamlining the developer's
workflow with the build process within Bitbake.
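
To make the proposed URI form concrete, here is a minimal Python sketch of
how a combined scheme such as npmsw+git could be split into the wrapping
fetcher, the sub-fetcher and the ;key=value parameters. The helper name
split_subfetcher_uri and the returned dictionary layout are hypothetical
and not part of Bitbake; the protocol=https parameter is added only for
illustration. Only the npmsw+git:// form and the npm-topdir parameter
come from the proposal above.

```python
def split_subfetcher_uri(uri):
    """Split 'outer+inner://location;key=value;...' into its parts.

    Hypothetical helper for illustration only; it does not use or
    mirror Bitbake's own URI decoding.
    """
    scheme, sep, rest = uri.partition("://")
    if not sep:
        raise ValueError("not a URI: %r" % uri)

    # 'npmsw+git' -> wrapping fetcher 'npmsw', sub-fetcher 'git'
    outer, plus, inner = scheme.partition("+")
    if not plus or not inner:
        raise ValueError("expected a scheme of the form outer+inner: %r" % scheme)

    # Everything after the first ';' is a list of key=value parameters.
    location, *raw_params = rest.split(";")
    params = {}
    for item in raw_params:
        if not item:
            continue
        key, _, value = item.partition("=")
        params[key] = value

    return {
        "fetcher": outer,
        "subfetcher": inner,
        "sub_uri": "%s://%s" % (inner, location),
        "params": params,
    }


parts = split_subfetcher_uri(
    "npmsw+git://git-uri.git;npm-topdir=path_to_npm_project;protocol=https"
)
print(parts["subfetcher"], parts["sub_uri"], parts["params"]["npm-topdir"])
```

In such a design the wrapping fetcher would hand sub_uri to the sub-fetcher
and keep parameters such as npm-topdir for itself.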

As this fetcher would be implemented separately from the current npmsw
fetcher, this will not cause any breaking changes for existing setups.

Additionally, we plan on writing a separate npmsw.bbclass, which will
parse the package.json for each node module for an automated Bitbake
license manifest generation, which will resolve the current challenge
of having to maintain these manually, as currently described at
https://www.yoctoproject.org/docs/latest/mega-manual/mega-manual.html#npm-using-the-registry-modules-method
.
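
As a rough illustration of the package.json parsing mentioned above, the
following Python sketch walks a node_modules tree and collects the license
string each module declares. The function name collect_declared_licenses,
the package names and the output layout are assumptions made for this
example; it is not the planned npmsw.bbclass.

```python
import json
import os
import tempfile


def collect_declared_licenses(node_modules):
    """Map 'name@version' to the license declared in each package.json."""
    found = {}
    for root, _dirs, files in os.walk(node_modules):
        if "package.json" not in files:
            continue
        try:
            with open(os.path.join(root, "package.json"), encoding="utf-8") as f:
                meta = json.load(f)
        except ValueError:
            continue  # skip malformed manifests
        if not isinstance(meta, dict) or "name" not in meta:
            continue
        declared = meta.get("license", "UNKNOWN")
        # Older packages sometimes use an object such as {"type": "ISC"}.
        if isinstance(declared, dict):
            declared = declared.get("type", "UNKNOWN")
        key = "%s@%s" % (meta["name"], meta.get("version", "unknown"))
        found[key] = str(declared)
    return found


def _write_manifest(directory, meta):
    os.makedirs(directory, exist_ok=True)
    with open(os.path.join(directory, "package.json"), "w", encoding="utf-8") as f:
        json.dump(meta, f)


# Build a tiny made-up node_modules tree and scan it.
with tempfile.TemporaryDirectory() as tmp:
    modules = os.path.join(tmp, "node_modules")
    _write_manifest(os.path.join(modules, "left"),
                    {"name": "left", "version": "1.0.0", "license": "MIT"})
    _write_manifest(os.path.join(modules, "left", "node_modules", "right"),
                    {"name": "right", "version": "2.1.0", "license": {"type": "ISC"}})
    licenses = collect_declared_licenses(modules)

for name in sorted(licenses):
    print(name, licenses[name])
```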

If this is something you see as a worthwhile goal, we will provide a
set of patch files within the coming weeks.

-- 
With best regards

Jasper Orschulko
DevOps Engineer

Tel. +49 30 58 58 14 265
Fax +49 30 58 58 14 999
Jasper.Orschulko@iris-sensing.com

• • • • • • • • • • • • • • • • • • • • • • • • • •

iris-GmbH
infrared & intelligent sensors
Schnellerstraße 1-5 | 12439 Berlin

https://iris-sensing.com/






* Re: [bitbake-devel] Improving npm(sw) fetcher & integration within Bitbake
  2021-11-04 12:29 Improving npm(sw) fetcher & integration within Bitbake Jasper Orschulko
@ 2021-11-04 13:09 ` Alexander Kanavin
  2021-11-06 16:58   ` Mike Crowe
  2021-11-04 23:15 ` Richard Purdie
  1 sibling, 1 reply; 16+ messages in thread
From: Alexander Kanavin @ 2021-11-04 13:09 UTC (permalink / raw)
  To: Jasper Orschulko, Stefan Herbrechtsmeier
  Cc: bitbake-devel, martin, Daniel Baumgart


Hello Jasper,

I want to invite Stefan to this discussion because he's been facing similar
issues in integrating npm code with Yocto. He's been working on some
improvements to npm fetcher and class, and is most qualified to comment, so
maybe you guys can work out a plan to make it shine (or at least not be too
horrible).

To the best of my knowledge, no other companies at the moment are using
Yocto to integrate npm-based items into a product.

Alex



* Re: [bitbake-devel] Improving npm(sw) fetcher & integration within Bitbake
  2021-11-04 12:29 Improving npm(sw) fetcher & integration within Bitbake Jasper Orschulko
  2021-11-04 13:09 ` [bitbake-devel] " Alexander Kanavin
@ 2021-11-04 23:15 ` Richard Purdie
  2021-11-05  9:07   ` Stefan Herbrechtsmeier
  1 sibling, 1 reply; 16+ messages in thread
From: Richard Purdie @ 2021-11-04 23:15 UTC (permalink / raw)
  To: Jasper Orschulko, bitbake-devel; +Cc: martin, Daniel Baumgart

At a first read it sounds reasonable but I don't know the answers to a few
questions which make or break things from an OE/bitbake perspective. Those
questions are:

a) Once DL_DIR has been populated by this fetch mechanism, can a subsequent
build run with just the data from there without accessing the network?

b) Is the information encoded into SRC_URI enough to give a deterministic build
result, i.e. if we run this build at some later date, will we get the same
result?

c) Is fetching only happening during the do_fetch task and not in any subsequent
step?


I'd love for some of the other people who've worked on this code to jump in as I
don't use it or understand it in detail. I am worried about how we maintain this
longer term, as different people seem to have different use cases, which sees the
code changing in different directions, and we're starting to look like we may end
up with multiple ways of doing things, which I really dislike.

If we do go this way, does this mean we can simplify other pieces and stop
supporting other codepaths?

Cheers

Richard









* Re: [bitbake-devel] Improving npm(sw) fetcher & integration within Bitbake
  2021-11-04 23:15 ` Richard Purdie
@ 2021-11-05  9:07   ` Stefan Herbrechtsmeier
  2021-11-05 11:24     ` Jean-Marie Lemetayer
       [not found]     ` <4106f9ef-5b2e-5276-f1bb-c80a989d7fdf@mko.dev>
  0 siblings, 2 replies; 16+ messages in thread
From: Stefan Herbrechtsmeier @ 2021-11-05  9:07 UTC (permalink / raw)
  To: richard.purdie, Jasper Orschulko, bitbake-devel; +Cc: martin, Daniel Baumgart

Hi Jasper and Richard,

Am 05.11.2021 um 00:15 schrieb Richard Purdie via lists.openembedded.org:
> On Thu, 2021-11-04 at 12:29 +0000, Jasper Orschulko wrote:
>> Dear Bitbake developers,
>>
>> recently we have been looking at the npmsw fetcher and discovered some
>> challenges regarding the integration into the developer workflow as
>> well as the build times within Bitbake. We believe that we found a
>> mechanism which would integrate well into Bitbake's existing project
>> structure and drastically improve the situation.
>>
>>
>> But first, what are the issues with the current npmsw fetcher?
>>
>> 1. Let's have a look at a typical npm-based project. You'd typically
>> have your package-lock.json (aka shrinkwrap file) stored within the git
>> repository containing your source code. Developers will rely on this
>> package-lock file on a daily basis during the development cycle.
>> Unfortunately, the current npmsw fetcher only supports shrinkwrap files
>> stored within the meta layer or within an npm registry. This is not
>> ideal, as changes to the file might be made within the project repo,
>> which then need to be manually applied to the lock file within the meta
>> repo. An ideal npmsw fetcher therefore would support using the lock
>> file directly from the source code repo.

The package-lock.json and npm-shrinkwrap.json files are identical in format.
The only difference is that npm-shrinkwrap.json can be published with the
package.

>> 2. The current implementation of the npm class uses multiple shellouts
>> per npm module in order to add these to the npm cache. This is done, as
>> the `npm install` command is not called within the do_fetch, but at the
>> end of the do_configure step. This drastically increases the time
>> Bitbake spends in the do_configure step for an npm-based recipe. In our
>> case (we have a relatively small project with approx. 600 npm packages
>> in total, including recursive packages) this takes ~100 minutes to
>> complete. What makes things worse, every change to the recipe and/or
>> lock file will cause a complete rerun of the do_configure job.

This is a problem with the sequential setup of the cache. I have a
prototype that does this in a special bb task and uses multiple parallel
processes inside that task. But I also have a prototype which removes
the cache completely and speeds up the build significantly.
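
The difference between sequential and parallel cache setup can be sketched
in a few lines of Python. This is only an illustration, not the prototype
mentioned above: populate_cache and fake_add are hypothetical names, and
fake_add stands in for whatever per-tarball work the real task performs
(for example one shell-out per tarball).

```python
from concurrent.futures import ThreadPoolExecutor


def populate_cache(tarballs, add_one, max_workers=8):
    """Call add_one(tarball) for every tarball using a thread pool.

    Results come back in input order. An exception raised by a worker
    is re-raised here when its result is collected.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(add_one, tarballs))


def fake_add(tarball):
    # Stand-in for the real per-tarball step. Threads are a reasonable
    # fit if that step mostly waits on a child process or on I/O.
    return "cached:" + tarball


results = populate_cache(
    ["a-1.0.0.tgz", "b-2.0.0.tgz", "c-0.3.1.tgz"], fake_add, max_workers=2
)
print(results)
```

With roughly 600 packages handled one after another, the per-call overhead
adds up; spreading the calls across workers reduces the wall-clock time but
not the total work, which is why dropping the cache step entirely is the
more radical option.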

>> As a result, the npm fetcher currently is not really usable for
>> production workloads.

Ack.

>> So how can we address these issues?
>>
>> We plan to implement a "sub-fetcher" for npmsw (a concept which might
>> also be recyclable for similar use-cases). This would take the
>> form of e.g.:
>>
>> SRC_URI = "npmsw+git://git-uri.git;npm-topdir=path_to_npm_project;..."
>>
>> The idea is that the npmsw fetcher would then call an arbitrary sub-
>> fetcher (in this case git, however any fetcher will be supported) and
>> after the sub-fetcher has extracted the source code into the DL_DIR,
>> the npm fetcher will create a secondary download folder as a copy of
>> the sub-fetcher's download folder. Within this copy, the npm fetcher
>> will call `npm ci`, effectively downloading the npm packages by doing a
>> clean-install on the basis of the package.json and the package-
>> lock.json files within the npmsw download dir. This results in a much
>> faster build, as it removes the need for separate handling of the
>> individual node packages, as well as streamlining the developer's
>> workflow with the build process within Bitbake.

How should this support the download proxy? The npm ci command needs a
repository or a cache to work.

Furthermore, you need a patch step in between the fetch steps to support
tuning / fixing of the configuration before the second fetch step.

>> As this fetcher would be implemented separately from the current npmsw
>> fetcher, this will not cause any breaking changes for existing setups.
>>
>> Additionally, we plan on writing a separate npmsw.bbclass, which will
>> parse the package.json for each node module for an automated Bitbake
>> license manifest generation, which will resolve the current challenge
>> of having to maintain these manually, as currently described at
>> https://www.yoctoproject.org/docs/latest/mega-manual/mega-manual.html#npm-using-the-registry-modules-method
>> .

These licenses will be generated by recipetool, and you could provide
checksums to detect the correct licenses.

The license inside the package.json is only a hint, and you need a
license file to fulfill license compliance. Because of this I remove
the package.json from LIC_FILES_CHKSUM, as it is useless for
license compliance.


>> If this is something you see as a worthwhile goal, we will provide a
>> set of patch files within the coming weeks.

I think you are mixing up the unusable npm implementation with your
special use case.

The problem is that the current npm implementation isn't really usable.
I'm working on this and already have a prototype that can install,
build and *test* a proprietary Angular project and node-red as well as
koa/examples from GitHub.

If I understand you correctly, you would like to build an npm recipe that
can change its dependencies without updating the recipe, except for the
SRCREV of the repositories.

> At a first read it sounds reasonable but I don't know the answers to a few
> questions which make or break things from an OE/bitbake perspective. Those
> questions are:
> 
> a) Once DL_DIR has been populated by this fetch mechanism, can a subsequent
> build run with just the data from there without accessing the network?
> 
> b) Is the information encoded into SRC_URI enough to give a deterministic build
> result, i.e. if we run this build at some later date, will we get the same
> result?
> 
> c) Is fetching only happening during the do_fetch task and not in any subsequent
> step?
> 
> 
> I'd love for some of the other people who've worked on this code to jump in as I
> don't use it or understand it in detail. I am worried about how we maintain this
> longer term as different people seem to have different use cases which sees the
> code changing in different directions and we're starting to look like we may end
> up with multiple ways of doing things which I really dislike.

This leads to the question of what the desired way to integrate a
package / dependency manager is. Nowadays any language (even C/C++) has a
package manager available, and more and more build systems (e.g. Meson,
CMake) support automatic download of dependencies. The common
integration into OE is a script (recipetool) that generates a recipe
from the foreign configuration. The current npm implementation is
special because it reuses a foreign configuration and translates it into
fetch commands on the fly. This leads to the problem that common tweaks,
like overriding a dependency or sharing configuration between recipes via
an include file, aren't possible. We could fix this by removing the foreign
configuration and doing the translation during recipe creation. But this
means you have to recreate the recipe after every dependency change.

Is it a valid use case for OE to support foreign dependency
configurations like npm-shrinkwrap.json, go.sum or conan.lock?

Regards
   Stefan



* Re: [bitbake-devel] Improving npm(sw) fetcher & integration within Bitbake
       [not found]     ` <4106f9ef-5b2e-5276-f1bb-c80a989d7fdf@mko.dev>
@ 2021-11-05 11:12       ` Martin Koppehel
  2021-11-05 13:16       ` Stefan Herbrechtsmeier
  1 sibling, 0 replies; 16+ messages in thread
From: Martin Koppehel @ 2021-11-05 11:12 UTC (permalink / raw)
  To: bitbake-devel

Hi Stefan and Richard,

first of all, thanks for sharing your thoughts on this. I'm working with 
Jasper to improve this and want to share my view on the topic.

On 11/5/21 10:07, Stefan Herbrechtsmeier wrote:
> Hi Jasper and Richard,
>
> Am 05.11.2021 um 00:15 schrieb Richard Purdie via lists.openembedded.org:
>> On Thu, 2021-11-04 at 12:29 +0000, Jasper Orschulko wrote:
>>> Dear Bitbake developers,
>>>
>>> recently we have been looking at the npmsw fetcher and discovered some
>>> challenges regarding the integration into the developer workflow as
>>> well as the build times within Bitbake. We believe that we found a
>>> mechanism which would integrate well into Bitbake's existing project
>>> structure and drastically improve the situation.
>>>
>>>
>>> But first, what are the issues with the current npmsw fetcher?
>>>
>>> 1. Let's have a look at a typical npm-based project. You'd typically
>>> have your package-lock.json (aka shrinkwrap file) stored within the git
>>> repository containing your source code. Developers will rely on this
>>> package-lock file on a daily basis during the development cycle.
>>> Unfortunately, the current npmsw fetcher only supports shrinkwrap files
>>> stored within the meta layer or within an npm registry. This is not
>>> ideal, as changes to the file might be made within the project repo,
>>> which then need to be manually applied to the lock file within the meta
>>> repo. An ideal npmsw fetcher therefore would support using the lock
>>> file directly from the source code repo.
>
> The package-lock.json and npm-shrinkwrap.json are identical. The only 
> difference is that the npm-shrinkwrap.json could be published with the 
> package.
>
>>> 2. The current implementation of the npm class uses multiple shellouts
>>> per npm module in order to add these to the npm cache. This is done, as
>>> the `npm install` command is not called within the do_fetch, but at the
>>> end of the do_configure step. This drastically increases the time
>>> Bitbake spends in the do_configure step for an npm-based recipe. In our
>>> case (we have a relatively small project with approx. 600 npm packages
>>> in total, including recursive packages) this takes ~100 minutes to
>>> complete. What makes things worse, every change to the recipe and/or
>>> lock file will cause a complete rerun of the do_configure job.
>
> This is a problem of the sequential setup of the cache. I have a 
> prototype to do this in a special bb task and use multiple parallel 
> process task inside the bb task. But I have also a prototype which 
> remove the complete cache and speed up the build significant.
I agree with what Stefan wrote here, especially that most of
the build-duration issues come from the fact that the npmsw fetcher and
the npm bbclass work sequentially on all packages, which takes
~100 minutes in the do_configure step in our case.
Our general idea was likewise to remove the cache population step
completely, which should reduce the build time significantly.
>
>>> As a result, the npm fetcher currently is not really usable for
>>> production workloads.
>
> Ack.
>
>>> So how can we address these issues?
>>>
>>> We plan to implement a "sub-fetcher" for npmsw (a concept which might
>>> also be recyclable for similar use-cases). This would take the
>>> form of e.g.:
>>>
>>> SRC_URI = "npmsw+git://git-uri.git;npm-topdir=path_to_npm_project;..."
>>>
>>> The idea is that the npmsw fetcher would then call an arbitrary sub-
>>> fetcher (in this case git, however any fetcher will be supported) and
>>> after the sub-fetcher has extracted the source code into the DL_DIR,
>>> the npm fetcher will create a secondary download folder as a copy of
>>> the sub-fetcher's download folder. Within this copy, the npm fetcher
>>> will call `npm ci`, effectively downloading the npm packages by doing a
>>> clean-install on the basis of the package.json and the package-
>>> lock.json files within the npmsw download dir. This results in a much
>>> faster build, as it removes the need for separate handling of the
>>> individual node packages, as well as streamlining the developer's
>>> workflow with the build process within Bitbake.
>
> How should this support the download proxy? The npm ci command need a 
> repository or a cache to work.
The npm ci command can utilize a private registry and/or http proxy if 
that's required. We didn't consider that case yet, but I think we could 
add a call to npm to configure a proxy according to e.g. a set of 
environment variables.
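
One possible shape for this, sketched in Python: derive npm's proxy settings
from the usual http_proxy / https_proxy variables and pass them to `npm ci`
through npm_config_* environment variables, which npm reads as configuration.
The function name npm_ci_invocation and the proxy host are made up for this
example; how the fetcher would actually obtain its proxy settings is an open
question in this thread.

```python
import os


def npm_ci_invocation(project_dir, environ=None):
    """Return (argv, cwd, env) for running `npm ci` behind an optional proxy."""
    base = dict(os.environ if environ is None else environ)
    env = dict(base)

    http_proxy = base.get("http_proxy") or base.get("HTTP_PROXY")
    https_proxy = base.get("https_proxy") or base.get("HTTPS_PROXY")

    # npm maps npm_config_<key> environment variables onto its config
    # keys; 'proxy' and 'https-proxy' are the relevant ones here.
    if http_proxy:
        env["npm_config_proxy"] = http_proxy
    if https_proxy:
        env["npm_config_https_proxy"] = https_proxy

    return ["npm", "ci"], project_dir, env


argv, cwd, env = npm_ci_invocation(
    "/tmp/project",
    {"http_proxy": "http://proxy.example:3128",
     "HTTPS_PROXY": "http://proxy.example:3128"},
)
print(argv, cwd, env["npm_config_proxy"], env["npm_config_https_proxy"])
# A caller would then run: subprocess.run(argv, cwd=cwd, env=env, check=True)
```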
>
> Furthermore you need a patch step in between the fetch steps to 
> support tuning / fixing of the configuration before the second fetch step.
Our idea was to build a completely checked out and installed repository 
and archive this in the DL_DIR, which then can be used in the do_patch 
phase.
Are we missing some important use-case here? Whenever it is necessary to 
patch the package.json/package-lock.json this should ideally be done in 
your upstream repository.

Our primary motivation behind leaving the package-lock within the source 
repository was to have a single source of truth for the dependency versions.

>
>>> As this fetcher would be implemented separately from the current npmsw
>>> fetcher, this will not cause any breaking changes for existing setups.
>>>
>>> Additionally, we plan on writing a separate npmsw.bbclass, which will
>>> parse the package.json for each node module for an automated Bitbake
>>> license manifest generation, which will resolve the current challenge
>>> of having to maintain these manually, as currently described at
>>> https://www.yoctoproject.org/docs/latest/mega-manual/mega-manual.html#npm-using-the-registry-modules-method 
>>>
>>> .
>
> This licenses will be generated by the recipetool and you could 
> provide checksums to detect the correct licenses.
>
> The license inside the package.json is only a hint and you need a 
> license file to fulfill the license compliance. Because of this I 
> remove the package.json from LIC_FILES_CHKSUM because it is useless 
> for the license compliance.
You're right here that there's a need to have the full license file. In 
this case, a license crawler would need to traverse node_modules and 
scan for LICENSE[.md,.txt,] files and then generate the checksums.
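
Such a crawler could look roughly like the Python sketch below. It walks a
directory tree, picks up files named LICENSE, LICENSE.md or LICENSE.txt, and
prints entries in the file://path;md5=... style used by LIC_FILES_CHKSUM.
The function name license_checksums and the sample tree are assumptions for
illustration; real packages use more file name variants than these three.

```python
import hashlib
import os
import tempfile

LICENSE_NAMES = {"LICENSE", "LICENSE.md", "LICENSE.txt"}


def license_checksums(topdir):
    """Return 'file://<relative path>;md5=<hex>' entries for license files."""
    entries = []
    for root, dirs, files in os.walk(topdir):
        dirs.sort()  # make the traversal order stable
        for name in sorted(files):
            if name not in LICENSE_NAMES:
                continue
            path = os.path.join(root, name)
            with open(path, "rb") as f:
                digest = hashlib.md5(f.read()).hexdigest()
            rel = os.path.relpath(path, topdir).replace(os.sep, "/")
            entries.append("file://%s;md5=%s" % (rel, digest))
    return entries


# Build a tiny made-up tree with one license file and scan it.
LICENSE_TEXT = b"Permission is hereby granted...\n"
with tempfile.TemporaryDirectory() as tmp:
    pkg = os.path.join(tmp, "node_modules", "left")
    os.makedirs(pkg)
    with open(os.path.join(pkg, "LICENSE"), "wb") as f:
        f.write(LICENSE_TEXT)
    with open(os.path.join(pkg, "index.js"), "wb") as f:
        f.write(b"module.exports = {};\n")
    entries = license_checksums(tmp)

for entry in entries:
    print(entry)
```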
>
>>> If this is something you see as a worthwhile goal, we will provide a
>>> set of patch files within the coming weeks.
>
> I think you mixed the unusable npm implementation with your special 
> use case.
>
> The problem is that the current npm implementation isn't really 
> usable. I'm working on this and have already a prototype that could 
> install, build and *test* a proprietary angular project and node-red 
> as well as koa/examples from github.
You make a very interesting point here, primarily because you cover two
very different use cases. I think we have to distinguish between them. The
first is something like a web interface that only uses Node.js and npm at
compile time for dependency management and bundling, where Node.js itself
is not even required on the target (this is our use case). The second
class of use cases is running software like node-red directly on the
target, where the current approach of the npm fetcher works quite well.
Our thoughts primarily focused on the web interface use case, but I agree
with you that we should keep an eye on supporting all use cases.
>
> If I understand you correct you like to build a npm recipe that could 
> change it dependencies without update the recipe except the SRCREV of 
> the repositories.

That is true, and I believe keeping the package-lock file directly in 
the source repository is something worth pursuing not only for us.
Do you have a strong preference for keeping the dependencies outside of 
the source repository?

>
>> At a first read it sounds reasonable but I don't know the answers to 
>> a few
>> questions which make or break things from an OE/bitbake perspective. 
>> Those
>> questions are:
>>
>> a) Once DL_DIR has been populated by this fetch mechanism, can a 
>> subsequent
>> build run with just the data from there without accessing the network?
This does hold for well-built packages that only use code out of 
node_modules.
However, we cannot guarantee it in general, because a package could 
execute arbitrary JS code at build time, including fetching content from 
the internet.
>>
>> b) Is the information encoded into SRC_URI enough to give a 
>> deterministic build
>> result, i.e. if we run this build at some later date, will we get the 
>> same
>> result?
The package-lock.json should be checked into the source repository, so 
pinning down SRCREV guarantees a 100% reproducible dependency installation.
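
As a sanity check for this assumption, a small sketch (Python, hypothetical helper names) could list lock file entries that carry no "resolved" URL or no "integrity" hash, since those are the entries the lock file does not pin. It assumes the newer lock file layout with a top-level "packages" object (lockfileVersion 2 or 3).

```python
import json


def load_lock(path):
    """Read a package-lock.json / npm-shrinkwrap.json file."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def unpinned_packages(lock):
    """Return lock file entries lacking a 'resolved' URL or an
    'integrity' hash."""
    missing = []
    for path, entry in sorted(lock.get("packages", {}).items()):
        # Skip the root project entry (""), symlinked entries and bundled
        # dependencies; none of these are downloaded individually.
        if path == "" or entry.get("link") or entry.get("inBundle"):
            continue
        if "resolved" not in entry or "integrity" not in entry:
            missing.append(path)
    return missing
```

An empty result only shows that every downloaded tarball is pinned by hash; it says nothing about install scripts that fetch additional content.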
>>
>> c) Is fetching only happening during the do_fetch task and not in any 
>> subsequent
>> step?
Yes, we want to perform a full fetch directly in do_fetch and then 
archive the result of this operation within the DL_DIR, so subsequent 
builds can be done directly from the DL_DIR.
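
The archiving step could look roughly like the following sketch. The function name, the archive name and the directory layout are assumptions for illustration, not the planned implementation. Owner and timestamp metadata are normalized so that repeated fetches of the same tree produce a more stable archive.

```python
import os
import tarfile


def _normalize(info):
    """Drop owner and timestamp metadata from an archive member."""
    info.uid = info.gid = 0
    info.uname = info.gname = ""
    info.mtime = 0
    return info


def archive_node_modules(node_modules_dir, dl_dir, name):
    """Pack an installed node_modules tree into dl_dir and return the
    path of the resulting tarball."""
    os.makedirs(dl_dir, exist_ok=True)
    tarball = os.path.join(dl_dir, name + "-node_modules.tar")
    with tarfile.open(tarball, "w") as tar:
        # tarfile adds directory contents recursively, in sorted order
        # since Python 3.7.
        tar.add(node_modules_dir, arcname="node_modules", filter=_normalize)
    return tarball
```

A later task would then only unpack this tarball and would not need network access for the dependencies themselves.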
>>
>> I'd love for some of the other people who're worked on this code to 
>> jump in as I
>> don't use it or understand it in detail. I am worried about how we 
>> maintain this
>> longer term as different people seem to have different use cases 
>> which sees the
>> code changing in different directions and we're starting to look like 
>> we may end
>> up with multiple ways of doing things which I really dislike.
>
> This leads to the questions what is the desired way to integrate a 
> package / dependency manager. Nowadays any language (even C/C++) has a 
> package manager available and more and more build systems (ex. Meson, 
> CMake) support automatic download of dependencies. The common 
> integration into OE is a script (recipetool) that generate a recipe 
> from the foreign configuration. The current npm implementation is 
> special because it reuse a foreign configuration and translate it into 
> fetch commands on-the-fly. This leads to the problem that common 
> tweaks like override a dependency or share configuration between 
> recipes via include file isn't possible. We could fix it by removing 
> the foreign configuration and do the translation during recipe 
> creation. But this means you have to recreate the recipe after every 
> dependency change.
>
> Is it a valid use case for OE to support foreign dependency 
> configurations like npm-shrinkwrap.json, go.sum or conan.lock?
Agreed. Especially for cases like Javascript/Go/Rust where the 
dependency management is a core part of the language and ecosystem, we 
should support these.

>
> Regards
>   Stefan

Regards,
Martin


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [bitbake-devel] Improving npm(sw) fetcher & integration within Bitbake
  2021-11-05  9:07   ` Stefan Herbrechtsmeier
@ 2021-11-05 11:24     ` Jean-Marie Lemetayer
  2021-11-05 16:02       ` Jasper Orschulko
       [not found]     ` <4106f9ef-5b2e-5276-f1bb-c80a989d7fdf@mko.dev>
  1 sibling, 1 reply; 16+ messages in thread
From: Jean-Marie Lemetayer @ 2021-11-05 11:24 UTC (permalink / raw)
  To: Stefan Herbrechtsmeier
  Cc: Richard Purdie, Jasper Orschulko, bitbake-devel, martin, Daniel Baumgart

[-- Attachment #1: Type: text/plain, Size: 8839 bytes --]

Hi Jasper and folks,

I am the guy who updated the NPM support in bitbake last time. All your
remarks are correct, shrinkwrap file placement can be updated and the
do_configure step is a real pain. But this is the only way I found which
follows the bitbake requirements.

For the 'access to network' thing, be careful with NPM because there are
some corner cases where NPM tries to access the network without permission
(node-gyp, if I remember correctly). I was doing my test builds with my
RJ45 unplugged to be sure.

The fact that the NPM workflow is totally not the same as the Bitbake one
is the real issue. I think the best solution could be to make some
improvements to NPM directly so it can fit better in an environment like
Bitbake (maybe a new fetch command).

Best regards and good luck,
Jean-Marie


On Fri, Nov 5, 2021 at 2:07 AM Stefan Herbrechtsmeier <
stefan.herbrechtsmeier-oss@weidmueller.com> wrote:

> Hi Jasper and Richard,
>
> Am 05.11.2021 um 00:15 schrieb Richard Purdie via lists.openembedded.org:
> > On Thu, 2021-11-04 at 12:29 +0000, Jasper Orschulko wrote:
> >> Dear Bitbake developers,
> >>
> >> recently we have been looking at the npmsw fetcher and discovered some
> >> challenges regarding the integration into the developer workflow as
> >> well as the build times within Bitbake. We believe that we found a
> >> mechanism which would integrate well into Bitbake's existing project
> >> structure and drastically improve the situation.
> >>
> >>
> >> But first, what are the issues with the current npmsw fetcher?
> >>
> >> 1. Let's have a look at a typical npm-based project. You'd typically
> >> have your package-lock.json (aka shrinkwrap file) stored within the git
> >> repository containing your source code. Developers will rely on this
> >> package-lock file on a daily basis during the development cycle.
> >> Unfortunately, the current npmsw fetcher only supports shrinkwrap files
> >> stored within the meta layer or within an npm registry. This is not
> >> ideal, as changes to the file might be made within the project repo,
> >> which then need to be manually applied to the lock file within the meta
> >> repo. An ideal npmsw fetcher therefore would support using the lock
> >> file directly from the source code repo.
>
> The package-lock.json and npm-shrinkwrap.json are identical. The only
> difference is that the npm-shrinkwrap.json could be published with the
> package.
>
> >> 2. The current implementation of the npm class uses multiple shellouts
> >> per npm module in order to add these to the npm cache. This is done, as
> >> the `npm install` command is not called within the do_fetch, but at the
> >> end of the do_configure step. This drastically increases the time
> >> Bitbake spends in the do_configure step for a npm based recipe. In our
> >> case (we have a relatively small project with approx. 600 npm packages
> >> in total, including recursive packages) this takes ~100 minutes to
> >> complete. What makes things worse, every change to the recipe and/or
> >> lock file will cause a complete rerun of the do_configure job.
>
> This is a problem of the sequential setup of the cache. I have a
> prototype to do this in a special bb task and use multiple parallel
> process task inside the bb task. But I have also a prototype which
> remove the complete cache and speed up the build significant.
>
> >> As a result, the npm fetcher currently is not really usable for
> >> production workloads.
>
> Ack.
>
> >> So how can we address these issues?
> >>
> >> We plan to implement a "sub-fetcher" for npmsw (a concept which might
> >> also be recyclable for similar use-cases). This would take the
> >> form of e.g.:
> >>
> >> SRC_URI = "npmsw+git://git-uri.git;npm-topdir=path_to_npm_project;..."
> >>
> >> The idea is, that the npsw fetcher would then call an arbitrary sub-
> >> fetcher (in this case git, however any fetcher will be supported) and
> >> after the sub-fetcher has extracted the source code into the DL_DIR,
> >> the npm fetcher will create a secondary download folder as a copy of
> >> the sub-fetchers download folder. Within this copy, the npm fetcher
> >> will call `npm ci`, effectively downloading the npm packages by doing a
> >> clean-install on the basis of the package.json and the package-
> >> lock.json files within the npmsw download dir. This results in a much
> >> faster build, as it removes the need for seperate handling of the
> >> individual node packages, as well as streamlining the developers
> >> workflow with the build process within Bitbake.
>
> How should this support the download proxy? The npm ci command need a
> repository or a cache to work.
>
> Furthermore you need a patch step in between the fetch steps to support
> tuning / fixing of the configuration before the second fetch step.
>
> >> As this fetcher would be implemented separately from the current npmsw
> >> fetcher, this will not cause any breaking changes for existing setups.
> >>
> >> Additionally, we plan on writing a separate npmsw.bbclass, which will
> >> parse the package.json for each node module for an automated Bitbake
> >> license manifest generation, which will resolve the current challenge
> >> of having to maintain these manually, as currently described at
> >>
> https://www.yoctoproject.org/docs/latest/mega-manual/mega-manual.html#npm-using-the-registry-modules-method
> >> .
>
> This licenses will be generated by the recipetool and you could provide
> checksums to detect the correct licenses.
>
> The license inside the package.json is only a hint and you need a
> license file to fulfill the license compliance. Because of this I remove
> the package.json from LIC_FILES_CHKSUM because it is useless for the
> license compliance.
>
>
> >> If this is something you see as a worthwhile goal, we will provide a
> >> set of patch files within the coming weeks.
>
> I think you mixed the unusable npm implementation with your special use
> case.
>
> The problem is that the current npm implementation isn't really usable.
> I'm working on this and have already a prototype that could install,
> build and *test* a proprietary angular project and node-red as well as
> koa/examples from github.
>
> If I understand you correct you like to build a npm recipe that could
> change it dependencies without update the recipe except the SRCREV of
> the repositories.
>
> > At a first read it sounds reasonable but I don't know the answers to a few
> > questions which make or break things from an OE/bitbake perspective. Those
> > questions are:
> >
> > a) Once DL_DIR has been populated by this fetch mechanism, can a subsequent
> > build run with just the data from there without accessing the network?
> >
> > b) Is the information encoded into SRC_URI enough to give a deterministic build
> > result, i.e. if we run this build at some later date, will we get the same
> > result?
> >
> > c) Is fetching only happening during the do_fetch task and not in any subsequent
> > step?
> >
> >
> > I'd love for some of the other people who're worked on this code to jump in as I
> > don't use it or understand it in detail. I am worried about how we maintain this
> > longer term as different people seem to have different use cases which sees the
> > code changing in different directions and we're starting to look like we may end
> > up with multiple ways of doing things which I really dislike.
>
> This leads to the questions what is the desired way to integrate a
> package / dependency manager. Nowadays any language (even C/C++) has a
> package manager available and more and more build systems (ex. Meson,
> CMake) support automatic download of dependencies. The common
> integration into OE is a script (recipetool) that generate a recipe
> from the foreign configuration. The current npm implementation is
> special because it reuse a foreign configuration and translate it into
> fetch commands on-the-fly. This leads to the problem that common tweaks
> like override a dependency or share configuration between recipes via
> include file isn't possible. We could fix it by removing the foreign
> configuration and do the translation during recipe creation. But this
> means you have to recreate the recipe after every dependency change.
>
> Is it a valid use case for OE to support foreign dependency
> configurations like npm-shrinkwrap.json, go.sum or conan.lock?
>
> Regards
>    Stefan
>
> -=-=-=-=-=-=-=-=-=-=-=-
> Links: You receive all messages sent to this group.
> View/Reply Online (#12885):
> https://lists.openembedded.org/g/bitbake-devel/message/12885
> Mute This Topic: https://lists.openembedded.org/mt/86814331/3618298
> Group Owner: bitbake-devel+owner@lists.openembedded.org
> Unsubscribe: https://lists.openembedded.org/g/bitbake-devel/unsub [
> jeanmarie.lemetayer@gmail.com]
> -=-=-=-=-=-=-=-=-=-=-=-
>
>

[-- Attachment #2: Type: text/html, Size: 10995 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [bitbake-devel] Improving npm(sw) fetcher & integration within Bitbake
       [not found]     ` <4106f9ef-5b2e-5276-f1bb-c80a989d7fdf@mko.dev>
  2021-11-05 11:12       ` Martin Koppehel
@ 2021-11-05 13:16       ` Stefan Herbrechtsmeier
  1 sibling, 0 replies; 16+ messages in thread
From: Stefan Herbrechtsmeier @ 2021-11-05 13:16 UTC (permalink / raw)
  To: Martin Koppehel, richard.purdie, Jasper Orschulko, bitbake-devel
  Cc: Daniel Baumgart

Hi,

Am 05.11.2021 um 12:10 schrieb Martin Koppehel:
> On 11/5/21 10:07, Stefan Herbrechtsmeier wrote:
>> Am 05.11.2021 um 00:15 schrieb Richard Purdie via lists.openembedded.org:
>>> On Thu, 2021-11-04 at 12:29 +0000, Jasper Orschulko wrote:
>>>> Dear Bitbake developers,

[snip]

>>>> So how can we address these issues?
>>>>
>>>> We plan to implement a "sub-fetcher" for npmsw (a concept which might
>>>> also be recyclable for similar use-cases). This would take the
>>>> form of e.g.:
>>>>
>>>> SRC_URI = "npmsw+git://git-uri.git;npm-topdir=path_to_npm_project;..."
>>>>
>>>> The idea is, that the npsw fetcher would then call an arbitrary sub-
>>>> fetcher (in this case git, however any fetcher will be supported) and
>>>> after the sub-fetcher has extracted the source code into the DL_DIR,
>>>> the npm fetcher will create a secondary download folder as a copy of
>>>> the sub-fetchers download folder. Within this copy, the npm fetcher
>>>> will call `npm ci`, effectively downloading the npm packages by doing a
>>>> clean-install on the basis of the package.json and the package-
>>>> lock.json files within the npmsw download dir. This results in a much
>>>> faster build, as it removes the need for seperate handling of the
>>>> individual node packages, as well as streamlining the developers
>>>> workflow with the build process within Bitbake.
>>
>> How should this support the download proxy? The npm ci command need a 
>> repository or a cache to work.
> The npm ci command can utilize a private registry and/or http proxy if 
> that's required. We didn't consider that case yet, but I think we could 
> add a call to npm to configure a proxy according to e.g. a set of 
> environment variables.

By proxy I mean the Yocto HTTP download proxy, not a private npm registry:
https://downloads.yoctoproject.org/mirror/sources/

>> Furthermore you need a patch step in between the fetch steps to 
>> support tuning / fixing of the configuration before the second fetch 
>> step.
> Our idea was to build a completely checked out and installed repository 
> and archive this in the DL_DIR, which then can be used in the do_patch 
> phase.

This makes the download recipe-specific, and you can't share npm packages 
between recipes.

> Are we missing some important use-case here? Whenever it is necessary to 
> patch the package.json/package-lock.json this should ideally be done in 
> your upstream repository.

Yes, but what if your upstream repository doesn't exist anymore or the 
upstream repo doesn't accept your change?


> Our primary motivation behind leaving the package-lock within the source 
> repository was to have a single source of truth for the dependency 
> versions.

What happens if you have a CVE in a common dependency? You have to wait 
for every project to integrate the update and have to check the external 
sources to know if the package was updated.

The problem is that the requirements differ between a developer's and a 
distribution's point of view.

>>>> As this fetcher would be implemented separately from the current npmsw
>>>> fetcher, this will not cause any breaking changes for existing setups.
>>>>
>>>> Additionally, we plan on writing a separate npmsw.bbclass, which will
>>>> parse the package.json for each node module for an automated Bitbake
>>>> license manifest generation, which will resolve the current challenge
>>>> of having to maintain these manually, as currently described at
>>>> https://www.yoctoproject.org/docs/latest/mega-manual/mega-manual.html#npm-using-the-registry-modules-method 
>>>>
>>>> .
>>
>> This licenses will be generated by the recipetool and you could 
>> provide checksums to detect the correct licenses.
>>
>> The license inside the package.json is only a hint and you need a 
>> license file to fulfill the license compliance. Because of this I 
>> remove the package.json from LIC_FILES_CHKSUM because it is useless 
>> for the license compliance.
> You're right here that there's a need to have the full license file. In 
> this case, a license crawler would need to traverse node_modules and 
> scan for LICENSE[.md,.txt,] files and then generate the checksums.
>>
>>>> If this is something you see as a worthwhile goal, we will provide a
>>>> set of patch files within the coming weeks.
>>
>> I think you mixed the unusable npm implementation with your special 
>> use case.
>>
>> The problem is that the current npm implementation isn't really 
>> usable. I'm working on this and have already a prototype that could 
>> install, build and *test* a proprietary angular project and node-red 
>> as well as koa/examples from github.
> You make a very interesting point here, primarily because you cover two 
> very different use-cases. I think we have to distinguish between 
> something like a webinterface that only uses nodejs and npm at 
> compile-time for dependency management and bundling, where NodeJS itself 
> is not even required on the target (this is our use case). The second 
> class of use cases is running software like node-red directly on the 
> target, where the current approach of the npm fetcher works quite well. 
> Our thoughts primarily focused on the webinterface use case, but I agree 
> with you that we should keep an eye on supporting all use cases.
>>
>> If I understand you correct you like to build a npm recipe that could 
>> change it dependencies without update the recipe except the SRCREV of 
>> the repositories.
> 
> That is true, and I believe keeping the package-lock file directly in 
> the source repository is something worth pursuing not only for us.
> Do you have a strong preference for keeping the dependencies outside of 
> the source repository?

The problem is the different focus of a project and a distribution. 
If you use the dependencies directly, you rely on the policy of the 
project and its dependencies. It must be possible to override the 
decision of an individual project or dependency if it doesn't match your 
requirements.

My question is whether we really need a fetcher for the content of a 
package-lock or whether we should create a recipe from a package-lock.
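
To illustrate the recipe generator variant, here is a rough Python sketch of the translation step. It is not the recipetool code and all names are hypothetical. It assumes a lock file with a top-level "packages" object (lockfileVersion 2 or 3) and turns each pinned entry into a plain record with the download URL and a hex sha512 checksum. How such records are rendered into a recipe is left open.

```python
import base64


def integrity_to_sha512(integrity):
    """Convert an npm 'sha512-<base64>' integrity string to a hex digest."""
    b64 = integrity[len("sha512-"):]
    return base64.b64decode(b64).hex()


def lock_to_entries(lock):
    """Return one record per pinned dependency found in the lock file."""
    entries = []
    for path, entry in sorted(lock.get("packages", {}).items()):
        resolved = entry.get("resolved")
        integrity = entry.get("integrity", "")
        # Skip the root project and anything without a URL or a sha512 hash.
        if not path or not resolved or not integrity.startswith("sha512-"):
            continue
        entries.append({
            "destdir": path,  # e.g. node_modules/foo
            "version": entry.get("version"),
            "url": resolved,
            "sha512": integrity_to_sha512(integrity),
        })
    return entries
```

Because the result is ordinary recipe data, a dependency could be overridden or shared via an include file like any other SRC_URI entry, at the cost of regenerating the recipe after each dependency change.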

>>> At a first read it sounds reasonable but I don't know the answers to 
>>> a few
>>> questions which make or break things from an OE/bitbake perspective. 
>>> Those
>>> questions are:
>>>
>>> a) Once DL_DIR has been populated by this fetch mechanism, can a 
>>> subsequent
>>> build run with just the data from there without accessing the network?
> This does hold for well-built packages that only use code out of 
> node_modules.
> We can not guarantee this because the package could execute arbitrary JS 
> code during its build time, including fetching content from the internet.
>>>
>>> b) Is the information encoded into SRC_URI enough to give a 
>>> deterministic build
>>> result, i.e. if we run this build at some later date, will we get the 
>>> same
>>> result?
> The package-lock.json should be checked into the source repository, so 
> pinning down SRCREV guarantees a 100% reproducible dependency installation.
>>>
>>> c) Is fetching only happening during the do_fetch task and not in any 
>>> subsequent
>>> step?
> Yes, we want to perform a full fetch directly in do_fetch and then 
> archive the result of this operation within the DL_DIR, so subsequent 
> builds can be done directly from the DL_DIR.
>>>
>>> I'd love for some of the other people who're worked on this code to 
>>> jump in as I
>>> don't use it or understand it in detail. I am worried about how we 
>>> maintain this
>>> longer term as different people seem to have different use cases 
>>> which sees the
>>> code changing in different directions and we're starting to look like 
>>> we may end
>>> up with multiple ways of doing things which I really dislike.
>>
>> This leads to the questions what is the desired way to integrate a 
>> package / dependency manager. Nowadays any language (even C/C++) has a 
>> package manager available and more and more build systems (ex. Meson, 
>> CMake) support automatic download of dependencies. The common 
>> integration into OE is a script (recipetool) that generate a recipe 
>> from the foreign configuration. The current npm implementation is 
>> special because it reuse a foreign configuration and translate it into 
>> fetch commands on-the-fly. This leads to the problem that common 
>> tweaks like override a dependency or share configuration between 
>> recipes via include file isn't possible. We could fix it by removing 
>> the foreign configuration and do the translation during recipe 
>> creation. But this means you have to recreate the recipe after every 
>> dependency change.
>>
>> Is it a valid use case for OE to support foreign dependency 
>> configurations like npm-shrinkwrap.json, go.sum or conan.lock?
> Agreed. Especially for cases like Javascript/Go/Rust where the 
> dependency management is a core part of the language and ecosystem, we 
> should support these.

What is the advantage of a package-manager-specific fetcher over a 
package-manager-specific recipe generator? Do these advantages outweigh 
the loss of common OE features?

Regards
   Stefan


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [bitbake-devel] Improving npm(sw) fetcher & integration within Bitbake
  2021-11-05 11:24     ` Jean-Marie Lemetayer
@ 2021-11-05 16:02       ` Jasper Orschulko
  2021-11-05 17:42         ` Alexander Kanavin
       [not found]         ` <5fb67154d576b74629e4836a86dcb5e479b73e67.camel@linuxfoundation.org>
  0 siblings, 2 replies; 16+ messages in thread
From: Jasper Orschulko @ 2021-11-05 16:02 UTC (permalink / raw)
  To: stefan.herbrechtsmeier-oss, jeanmarie.lemetayer
  Cc: richard.purdie, bitbake-devel, martin, Daniel Baumgart

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Thanks @all for your input.

It seems like there is a lot to be considered. When it comes to
under-the-hood development, the Yocto Project is fairly new to Martin and
myself. The term "bitbake requirements" has been used more than once in
this conversation, yet I am currently not aware of any formal
requirements. You wouldn't happen to have some documentation on this by
any chance? A common understanding of this and of the actual use cases for
npm would be great before we go into further detail on what a possible
solution might look like.

- -- 
With best regards

Jasper Orschulko
DevOps Engineer

Tel. +49 30 58 58 14 265
Fax +49 30 58 58 14 999
Jasper.Orschulko@iris-sensing.com

• • • • • • • • • • • • • • • • • • • • • • • • • •

iris-GmbH
infrared & intelligent sensors
Schnellerstraße 1-5 | 12439 Berlin

https://iris-sensing.com/





On Fri, 2021-11-05 at 04:24 -0700, Jean-Marie Lemetayer wrote:
> Hi Jasper and folks,
> 
> I am the guy who updated the NPM support in bitbake last time. All
> your remarks are correct, shrinkwarp file placement can be updated
> and the do_configure step is a real pain. But this is the only way I
> found which follows the bitbake requirements.
> 
> For the 'access to network' thing, be careful with NPM because there
> are some corner cases when NPM tries to access it without permissions
> (node-gyp if I remember well). I was doing my test builds with my
> RJ45 unplugged to be sure.
> 
> The fact that the NPM workflow is totally not the same as the Bitbake
> one is the real issue. I think the best solution could be to make
> some improvements to NPM directly so it can fit better in an
> environment like Bitbake (maybe a new fetch command).
> 
> Best regards and good luck,
> Jean-Marie
> 
> 
> On Fri, Nov 5, 2021 at 2:07 AM Stefan Herbrechtsmeier
> <stefan.herbrechtsmeier-oss@weidmueller.com> wrote:
> > Hi Jasper and Richard,
> > 
> > Am 05.11.2021 um 00:15 schrieb Richard Purdie via
> > lists.openembedded.org:
> > > On Thu, 2021-11-04 at 12:29 +0000, Jasper Orschulko wrote:
> > > > Dear Bitbake developers,
> > > > 
> > > > recently we have been looking at the npmsw fetcher and discovered some
> > > > challenges regarding the integration into the developer workflow as
> > > > well as the build times within Bitbake. We believe that we found a
> > > > mechanism which would integrate well into Bitbake's existing project
> > > > structure and drastically improve the situation.
> > > >
> > > >
> > > > But first, what are the issues with the current npmsw fetcher?
> > > >
> > > > 1. Let's have a look at a typical npm-based project. You'd typically
> > > > have your package-lock.json (aka shrinkwrap file) stored within the git
> > > > repository containing your source code. Developers will rely on this
> > > > package-lock file on a daily basis during the development cycle.
> > > > Unfortunately, the current npmsw fetcher only supports shrinkwrap files
> > > > stored within the meta layer or within an npm registry. This is not
> > > > ideal, as changes to the file might be made within the project repo,
> > > > which then need to be manually applied to the lock file within the meta
> > > > repo. An ideal npmsw fetcher therefore would support using the lock
> > > > file directly from the source code repo.
> > 
> > The package-lock.json and npm-shrinkwrap.json are identical. The
> > only
> > difference is that the npm-shrinkwrap.json could be published with
> > the 
> > package.
> > 
> > > > 2. The current implementation of the npm class uses multiple shellouts
> > > > per npm module in order to add these to the npm cache. This is done, as
> > > > the `npm install` command is not called within the do_fetch, but at the
> > > > end of the do_configure step. This drastically increases the time
> > > > Bitbake spends in the do_configure step for a npm based recipe. In our
> > > > case (we have a relatively small project with approx. 600 npm packages
> > > > in total, including recursive packages) this takes ~100 minutes to
> > > > complete. What makes things worse, every change to the recipe and/or
> > > > lock file will cause a complete rerun of the do_configure job.
> > 
> > This is a problem of the sequential setup of the cache. I have a 
> > prototype to do this in a special bb task and use multiple parallel
> > process task inside the bb task. But I have also a prototype which 
> > remove the complete cache and speed up the build significant.
> > 
> > > > As a result, the npm fetcher currently is not really usable for
> > > > production workloads.
> > 
> > Ack.
> > 
> > > > So how can we address these issues?
> > > > 
> > > > We plan to implement a "sub-fetcher" for npmsw (a concept which might
> > > > also be recyclable for similar use-cases). This would take the
> > > > form of e.g.:
> > > >
> > > > SRC_URI = "npmsw+git://git-uri.git;npm-topdir=path_to_npm_project;..."
> > > >
> > > > The idea is, that the npsw fetcher would then call an arbitrary sub-
> > > > fetcher (in this case git, however any fetcher will be supported) and
> > > > after the sub-fetcher has extracted the source code into the DL_DIR,
> > > > the npm fetcher will create a secondary download folder as a copy of
> > > > the sub-fetchers download folder. Within this copy, the npm fetcher
> > > > will call `npm ci`, effectively downloading the npm packages by doing a
> > > > clean-install on the basis of the package.json and the package-
> > > > lock.json files within the npmsw download dir. This results in a much
> > > > faster build, as it removes the need for seperate handling of the
> > > > individual node packages, as well as streamlining the developers
> > > > workflow with the build process within Bitbake.
> > 
> > How should this support the download proxy? The npm ci command need
> > a
> > repository or a cache to work.
> > 
> > Furthermore you need a patch step in between the fetch steps to
> > support 
> > tuning / fixing of the configuration before the second fetch step.
> > 
> > > > As this fetcher would be implemented separately from the current npmsw
> > > > fetcher, this will not cause any breaking changes for existing setups.
> > > >
> > > > Additionally, we plan on writing a separate npmsw.bbclass, which will
> > > > parse the package.json for each node module for an automated Bitbake
> > > > license manifest generation, which will resolve the current challenge
> > > > of having to maintain these manually, as currently described at
> > > > https://www.yoctoproject.org/docs/latest/mega-manual/mega-manual.html#npm-using-the-registry-modules-method
> > > > .
> > 
> > This licenses will be generated by the recipetool and you could
> > provide 
> > checksums to detect the correct licenses.
> > 
> > The license inside the package.json is only a hint and you need a 
> > license file to fulfill the license compliance. Because of this I
> > remove 
> > the package.json from LIC_FILES_CHKSUM because it is useless for
> > the 
> > license compliance.
> > 
> > 
> > > > If this is something you see as a worthwhile goal, we will provide a
> > > > set of patch files within the coming weeks.
> > 
> > I think you mixed the unusable npm implementation with your special
> > use 
> > case.
> > 
> > The problem is that the current npm implementation isn't really
> > usable. 
> > I'm working on this and have already a prototype that could
> > install, 
> > build and *test* a proprietary angular project and node-red as well
> > as 
> > koa/examples from github.
> > 
> > If I understand you correct you like to build a npm recipe that
> > could
> > change it dependencies without update the recipe except the SRCREV
> > of
> > the repositories.
> > 
> > > At a first read it sounds reasonable but I don't know the answers to a few
> > > questions which make or break things from an OE/bitbake perspective. Those
> > > questions are:
> > >
> > > a) Once DL_DIR has been populated by this fetch mechanism, can a subsequent
> > > build run with just the data from there without accessing the network?
> > >
> > > b) Is the information encoded into SRC_URI enough to give a deterministic build
> > > result, i.e. if we run this build at some later date, will we get the same
> > > result?
> > >
> > > c) Is fetching only happening during the do_fetch task and not in any subsequent
> > > step?
> > >
> > >
> > > I'd love for some of the other people who're worked on this code to jump in as I
> > > don't use it or understand it in detail. I am worried about how we maintain this
> > > longer term as different people seem to have different use cases which sees the
> > > code changing in different directions and we're starting to look like we may end
> > > up with multiple ways of doing things which I really dislike.
> > 
> > This leads to the questions what is the desired way to integrate a 
> > package / dependency manager. Nowadays any language (even C/C++)
> > has
> > a 
> > package manager available and more and more build systems (ex.
> > Meson,
> > CMake) support automatic download of dependencies. The common 
> > integration into OE is a script (recipetool) that generate a recipe
> > from the foreign configuration. The current npm implementation is 
> > special because it reuse a foreign configuration and translate it
> > into 
> > fetch commands on-the-fly. This leads to the problem that common
> > tweaks 
> > like override a dependency or share configuration between recipes
> > via
> > include file isn't possible. We could fix it by removing the
> > foreign 
> > configuration and do the translation during recipe creation. But
> > this
> > means you have to recreate the recipe after every dependency
> > change.
> > 
> > Is it a valid use case for OE to support foreign dependency 
> > configurations like npm-shrinkwrap.json, go.sum or conan.lock?
> > 
> > Regards
> >    Stefan
> > 
-----BEGIN PGP SIGNATURE-----

iQEzBAEBCAAdFiEE4WyPMIC5Ap4+Ooo1Ygqew07VMNUFAmGFVaUACgkQYgqew07V
MNX4NQf+PwyOaBwFeWY/IKWhUfTswHKOkMDubyJnvTM0eRuNiJ+a6GNZo0ZXe/8v
7ozIwTIJShu/roimQff9K44HNjyvZJVC3hMeBUY9VS75RUliAJa48Vo71P57CtXD
ysuTg9hDaPicDA228pljFTHGk+SgLVUkAtgAw0J4URwJyfqoquzRXckt3s7A5/Nm
K3/SZg7FOS4D8bjWLwZvlSX4NVWswb478ct00JL0hdJv6I/Hnr+KBpKXG6OE0+Dj
x2lple1BxDmLcOoUsbvXroojGQNqIi4iYcxbkkvQqFmL/zb9r3MKv5h2VipbNLIC
E3Hdk2iFsXSOnf2Eo1RB3Hy4EwYoSQ==
=rOen
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [bitbake-devel] Improving npm(sw) fetcher & integration within Bitbake
  2021-11-05 16:02       ` Jasper Orschulko
@ 2021-11-05 17:42         ` Alexander Kanavin
       [not found]         ` <5fb67154d576b74629e4836a86dcb5e479b73e67.camel@linuxfoundation.org>
  1 sibling, 0 replies; 16+ messages in thread
From: Alexander Kanavin @ 2021-11-05 17:42 UTC (permalink / raw)
  To: Jasper Orschulko
  Cc: stefan.herbrechtsmeier-oss, jeanmarie.lemetayer, richard.purdie,
	bitbake-devel, martin, Daniel Baumgart

[-- Attachment #1: Type: text/plain, Size: 12266 bytes --]

For what it's worth (food for thought wise) I've written an essay a while
ago on the subject:

https://www.yoctoproject.org/pipermail/yocto/2017-March/035028.html
https://www.yoctoproject.org/pipermail/yocto/2017-March/035029.html

Alex

On Fri, 5 Nov 2021 at 17:02, Jasper Orschulko <
Jasper.Orschulko@iris-sensing.com> wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA256
>
> Thanks @all for your input.
>
> It seems like there is a lot to be considered. When it comes to under-
> the-hood development, the yocto project is fairly new to Martin and
> myself. The term "bitbake requirements" has been used more than once in
> this conversation, yet I am currently not aware of any formal
> requirements. You wouldn't happen to have some documentation on this by
> any chance? A common understanding of this and the actual usecases for
> npm would be great, before we go into further details on how a possible
> solution might look like.
>
> - --
> With best regards
>
> Jasper Orschulko
> DevOps Engineer
>
> Tel. +49 30 58 58 14 265
> Fax +49 30 58 58 14 999
> Jasper.Orschulko@iris-sensing.com
>
> • • • • • • • • • • • • • • • • • • • • • • • • • •
>
> iris-GmbH
> infrared & intelligent sensors
> Schnellerstraße 1-5 | 12439 Berlin
>
> https://iris-sensing.com/
>
>
>
>
>
> On Fri, 2021-11-05 at 04:24 -0700, Jean-Marie Lemetayer wrote:
> > Hi Jasper and folks,
> >
> > I am the guy who updated the NPM support in bitbake last time. All
> > your remarks are correct, shrinkwrap file placement can be updated
> > and the do_configure step is a real pain. But this is the only way I
> > found which follows the bitbake requirements.
> >
> > For the 'access to network' thing, be careful with NPM because there
> > are some corner cases when NPM tries to access it without permissions
> > (node-gyp if I remember well). I was doing my test builds with my
> > RJ45 unplugged to be sure.
> >
> > The fact that the NPM workflow is totally not the same as the Bitbake
> > one is the real issue. I think the best solution could be to make
> > some improvements to NPM directly so it can fit better in an
> > environment like Bitbake (maybe a new fetch command).
> >
> > Best regards and good luck,
> > Jean-Marie
> >
> >
> > On Fri, Nov 5, 2021 at 2:07 AM Stefan Herbrechtsmeier
> > <stefan.herbrechtsmeier-oss@weidmueller.com> wrote:
> > > Hi Jasper and Richard,
> > >
> > > Am 05.11.2021 um 00:15 schrieb Richard Purdie via
> > > lists.openembedded.org:
> > > > On Thu, 2021-11-04 at 12:29 +0000, Jasper Orschulko wrote:
> > > > > Dear Bitbake developers,
> > > > >
> > > > > recently we have been looking at the npmsw fetcher and
> > > > > discovered
> > > some
> > > > > challenges regarding the integration into the developer
> > > > > workflow
> > > as
> > > > > well as the build times within Bitbake. We believe that we
> > > > > found a
> > > > > mechanism which would integrate well into Bitbake's existing
> > > project
> > > > > structure and drastically improve the situation.
> > > > >
> > > > >
> > > > > But first, what are the issues with the current npmsw fetcher?
> > > > >
> > > > > 1. Let's have a look at a typical npm-based project. You'd
> > > typically
> > > > > have your package-lock.json (aka shrinkwrap file) stored within
> > > the git
> > > > > repository containing your source code. Developers will rely on
> > > this
> > > > > package-lock file on a daily basis during the development
> > > > > cycle.
> > > > > Unfortunately, the current npmsw fetcher only supports
> > > > > shrinkwrap
> > > files
> > > > > stored within the meta layer or within an npm registry. This is
> > > not
> > > > > ideal, as changes to the file might be made within the project
> > > repo,
> > > > > which then need to be manually applied to the lock file within
> > > > > the
> > > meta
> > > > > repo. An ideal npmsw fetcher therefore would support using the
> > > lock
> > > > > file directly from the source code repo.
> > >
> > > The package-lock.json and npm-shrinkwrap.json are identical. The
> > > only
> > > difference is that the npm-shrinkwrap.json could be published with
> > > the
> > > package.
> > >
> > > > > 2. The current implementation of the npm class uses multiple
> > > shellouts
> > > > > per npm module in order to add these to the npm cache. This is
> > > done, as
> > > > > the `npm install` command is not called within the do_fetch,
> > > > > but
> > > at the
> > > > > end of the do_configure step. This drastically increases the
> > > > > time
> > > > > Bitbake spends in the do_configure step for a npm based recipe.
> > > > > In
> > > our
> > > > > case (we have a relatively small project with approx. 600 npm
> > > packages
> > > > > in total, including recursive packages) this takes ~100 minutes
> > > > > to
> > > > > complete. What makes things worse, every change to the recipe
> > > and/or
> > > > > lock file will cause a complete rerun of the do_configure job.
> > >
> > > This is a problem of the sequential setup of the cache. I have a
> > > process task inside the bb task. But I have also a prototype which
> > > remove the complete cache and speed up the build significant.
> > >
> > > > > As a result, the npm fetcher currently is not really usable for
> > > > > production workloads.
> > >
> > > Ack.
> > >
> > > > > So how can we address these issues?
> > > > >
> > > > > We plan to implement a "sub-fetcher" for npmsw (a concept which
> > > might
> > > > > also be recyclable for similar use-cases). This would take the
> > > > > form of e.g.:
> > > > >
> > > > > SRC_URI = "npmsw+git://git-uri.git;npm-
> > > topdir=path_to_npm_project;..."
> > > > >
> > > > > The idea is, that the npmsw fetcher would then call an arbitrary
> > > sub-
> > > > > fetcher (in this case git, however any fetcher will be
> > > > > supported)
> > > and
> > > > > after the sub-fetcher has extracted the source code into the
> > > DL_DIR,
> > > > > the npm fetcher will create a secondary download folder as a
> > > > > copy
> > > of
> > > > > the sub-fetchers download folder. Within this copy, the npm
> > > fetcher
> > > > > will call `npm ci`, effectively downloading the npm packages by
> > > doing a
> > > > > clean-install on the basis of the package.json and the package-
> > > > > lock.json files within the npmsw download dir. This results in
> > > > > a
> > > much
> > > > > faster build, as it removes the need for separate handling of
> > > > > the
> > > > > individual node packages, as well as streamlining the
> > > > > developers
> > > > > workflow with the build process within Bitbake.
> > >
> > > How should this support the download proxy? The npm ci command need
> > > a
> > > repository or a cache to work.
> > >
> > > Furthermore you need a patch step in between the fetch steps to
> > > support
> > > tuning / fixing of the configuration before the second fetch step.
> > >
> > > > > As this fetcher would be implemented separately from the
> > > > > current
> > > npmsw
> > > > > fetcher, this will not cause any breaking changes for existing
> > > setups.
> > > > >
> > > > > Additionally, we plan on writing a separate npmsw.bbclass,
> > > > > which
> > > will
> > > > > parse the package.json for each node module for an automated
> > > Bitbake
> > > > > license manifest generation, which will resolve the current
> > > challenge
> > > > > of having to maintain these manually, as currently described at
> > > > >
> > >
> https://www.yoctoproject.org/docs/latest/mega-manual/mega-manual.html#npm-using-the-registry-modules-method
> > > > > .
> > >
> > > This licenses will be generated by the recipetool and you could
> > > provide
> > > checksums to detect the correct licenses.
> > >
> > > The license inside the package.json is only a hint and you need a
> > > license file to fulfill the license compliance. Because of this I
> > > remove
> > > the package.json from LIC_FILES_CHKSUM because it is useless for
> > > the
> > > license compliance.
> > >
> > >
> > > > > If this is something you see as a worthwhile goal, we will
> > > > > provide
> > > a
> > > > > set of patch files within the coming weeks.
> > >
> > > I think you mixed the unusable npm implementation with your special
> > > use
> > > case.
> > >
> > > The problem is that the current npm implementation isn't really
> > > usable.
> > > I'm working on this and have already a prototype that could
> > > install,
> > > build and *test* a proprietary angular project and node-red as well
> > > as
> > > koa/examples from github.
> > >
> > > If I understand you correct you like to build a npm recipe that
> > > could
> > > change it dependencies without update the recipe except the SRCREV
> > > of
> > > the repositories.
> > >
> > > > At a first read it sounds reasonable but I don't know the answers
> > > to a few
> > > > questions which make or break things from an OE/bitbake
> > > perspective. Those
> > > > questions are:
> > > >
> > > > a) Once DL_DIR has been populated by this fetch mechanism, can a
> > > subsequent
> > > > build run with just the data from there without accessing the
> > > network?
> > > >
> > > > b) Is the information encoded into SRC_URI enough to give a
> > > deterministic build
> > > > result, i.e. if we run this build at some later date, will we get
> > > the same
> > > > result?
> > > >
> > > > c) Is fetching only happening during the do_fetch task and not in
> > > any subsequent
> > > > step?
> > > >
> > > >
> > > > I'd love for some of the other people who've worked on this code
> > > > to
> > > jump in as I
> > > > don't use it or understand it in detail. I am worried about how
> > > > we
> > > maintain this
> > > > longer term as different people seem to have different use cases
> > > which sees the
> > > > code changing in different directions and we're starting to look
> > > like we may end
> > > > up with multiple ways of doing things which I really dislike.
> > >
> > > This leads to the questions what is the desired way to integrate a
> > > package / dependency manager. Nowadays any language (even C/C++)
> > > has
> > > a
> > > package manager available and more and more build systems (ex.
> > > Meson,
> > > CMake) support automatic download of dependencies. The common
> > > from the foreign configuration. The current npm implementation is
> > > special because it reuse a foreign configuration and translate it
> > > into
> > > fetch commands on-the-fly. This leads to the problem that common
> > > tweaks
> > > like override a dependency or share configuration between recipes
> > > via
> > > include file isn't possible. We could fix it by removing the
> > > foreign
> > > configuration and do the translation during recipe creation. But
> > > this
> > > means you have to recreate the recipe after every dependency
> > > change.
> > >
> > > Is it a valid use case for OE to support foreign dependency
> > > configurations like npm-shrinkwrap.json, go.sum or conan.lock?
> > >
> > > Regards
> > >    Stefan
> > >
> > >
> > >
> -----BEGIN PGP SIGNATURE-----
>
> iQEzBAEBCAAdFiEE4WyPMIC5Ap4+Ooo1Ygqew07VMNUFAmGFVaUACgkQYgqew07V
> MNX4NQf+PwyOaBwFeWY/IKWhUfTswHKOkMDubyJnvTM0eRuNiJ+a6GNZo0ZXe/8v
> 7ozIwTIJShu/roimQff9K44HNjyvZJVC3hMeBUY9VS75RUliAJa48Vo71P57CtXD
> ysuTg9hDaPicDA228pljFTHGk+SgLVUkAtgAw0J4URwJyfqoquzRXckt3s7A5/Nm
> K3/SZg7FOS4D8bjWLwZvlSX4NVWswb478ct00JL0hdJv6I/Hnr+KBpKXG6OE0+Dj
> x2lple1BxDmLcOoUsbvXroojGQNqIi4iYcxbkkvQqFmL/zb9r3MKv5h2VipbNLIC
> E3Hdk2iFsXSOnf2Eo1RB3Hy4EwYoSQ==
> =rOen
> -----END PGP SIGNATURE-----
>

[-- Attachment #2: Type: text/html, Size: 16541 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [bitbake-devel] Improving npm(sw) fetcher & integration within Bitbake
       [not found]         ` <5fb67154d576b74629e4836a86dcb5e479b73e67.camel@linuxfoundation.org>
@ 2021-11-06 10:30           ` Konrad Weihmann
  2021-11-08  7:41           ` Stefan Herbrechtsmeier
  1 sibling, 0 replies; 16+ messages in thread
From: Konrad Weihmann @ 2021-11-06 10:30 UTC (permalink / raw)
  To: Richard Purdie, Jasper Orschulko, stefan.herbrechtsmeier-oss,
	jeanmarie.lemetayer
  Cc: bitbake-devel, martin, Daniel Baumgart



On 06.11.21 11:06, Richard Purdie wrote:
> On Fri, 2021-11-05 at 16:02 +0000, Jasper Orschulko wrote:
>> Thanks @all for your input.
>>
>> It seems like there is a lot to be considered. When it comes to under-
>> the-hood development, the yocto project is fairly new to Martin and
>> myself. The term "bitbake requirements" has been used more than once in
>> this conversation, yet I am currently not aware of any formal
>> requirements. You wouldn't happen to have some documentation on this by
>> any chance? A common understanding of this and the actual usecases for
>> npm would be great, before we go into further details on how a possible
>> solution might look like.
> 
> We've used that term as there is a set of things which the bitbake fetcher is
> expected to provide in terms of workflows and user experiences, effectively what
> its API is. Sadly there isn't a formal document, just a long understanding of
> what this looks like. It isn't a piece of the project that people have wanted to
> change too often so writing such documentation hasn't been a priority.
> 
> I can try and briefly summarise the expectations I can think of in the list
> below and we should perhaps just put this as a text file in the fetcher
> directory of the codebase. If others see anything I'm missing please add to the
> list.
> 
> Just to be clear, we're not trying to be awkward about making changes, there is
> just a long history with this codebase and end users tend to get upset if we
> break the API. Different users do have very different use cases from the code.
> 
> 
> a) network access for sources is only expected to happen in the do_fetch step.
> This is not enforced or tested but is required so that we can:
> 
>    i) audit the sources used (i.e. for license/manifest reasons)
>    ii) support offline builds with a suitable cache
>    iii) allow work to continue even with downtime upstream
>    iv) allow for changes upstream in incompatible ways
>    v) allow rebuilding of the software in X years time
> 
> b) network access is not expected in do_unpack
> 
> c) you can take DL_DIR and use it as a mirror for offline builds
> 
> d) access to the network is only made when explicitly configured in recipes
>     (e.g. use of AUTOREV, or use of git tags which change revision)
> 
> e) fetcher output is deterministic
>     (i.e. if you fetch configuration XXX now it will match in future exactly in
>      a clean build with a new DL_DIR)
> 
> f) network access is expected to work with the standard linux proxy variables
>     so that access behind firewalls works (the fetcher sets these in the
>     environment but only in the do_fetch tasks)
> 
> g) access during parsing has to be minimal, a "git ls-remote" for an AUTOREV
>     git recipe might be ok but you can't expect to checkout a git tree
> 
> h) we need to provide revision information during parsing such that a version
>     for the recipe can be constructed.
> 
> i) versions are expected to be able to increase in a way which sorts allowing
>     package feeds to operate (see PR server required for git revisions to sort)
> 
> j) API to query for possible version upgrades of a url is highly desirable to
>     allow our automated upgrade code to function (it is implied this does always
>     have network access)
> 
> k) Where fixes or changes to behaviour in the fetcher are made, we ask that
>     test cases are added (run with "bitbake-selftest bb.tests.fetch"). We do
>     have fairly extensive test coverage of the fetcher as it is the only way
>     to track all of its corner cases; it still doesn't give entire coverage
>     though sadly.

That list looks very good and I'd really like to see that as a small 
README as part of the bitbake source tree.
One (maybe optional) addition from my side:
Each fetcher has to respect the resource control settings offered by
bitbake (like BB_NUMBER_THREADS), so that the user can control that
behavior globally and the fetcher code does not allocate resources in
an uncontrolled manner that the user's configuration does not allow.
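
To make the point above concrete, here is a minimal sketch in plain
Python of a fetch loop whose parallelism is capped by a user-supplied
limit. It is not bitbake code: bounded_fetch and fetch_one are
hypothetical names, and how a real fetcher would read a value such as
BB_NUMBER_THREADS from the datastore is deliberately left out.

```python
from concurrent.futures import ThreadPoolExecutor


def bounded_fetch(urls, fetch_one, max_threads):
    """Run fetch_one on every URL with at most max_threads workers.

    max_threads is assumed to come from the user's configuration
    (it may arrive as a string); results keep the input order.
    """
    workers = max(1, int(max_threads))  # never fall below one worker
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch_one, urls))


if __name__ == "__main__":
    # Stand-in "fetch" that only upper-cases the URL.
    print(bounded_fetch(["a", "b", "c"], str.upper, "2"))
```

The only design choice shown is that the worker count is an input
rather than something the fetcher picks for itself.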

> 
> Not all fetchers support all features, autorev is optional and doesn't make
> sense for some. Upgrade detection means different things in different contexts
> too.
> 
> Also, I did realise the npm fetcher tests simply don't work anymore. They're not
> run as standard as our infrastructure didn't have npm on it to run the tests so
> they have bitrotted :(.
> 
> Cheers,
> 
> Richard
> 
> 
> 
> 


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [bitbake-devel] Improving npm(sw) fetcher & integration within Bitbake
  2021-11-04 13:09 ` [bitbake-devel] " Alexander Kanavin
@ 2021-11-06 16:58   ` Mike Crowe
  2021-11-08  8:01     ` Stefan Herbrechtsmeier
  0 siblings, 1 reply; 16+ messages in thread
From: Mike Crowe @ 2021-11-06 16:58 UTC (permalink / raw)
  To: Alexander Kanavin, Jasper Orschulko, Stefan Herbrechtsmeier,
	bitbake-devel, martin, Caner Altinbasak, Daniel Baumgart

On Thursday 04 November 2021 at 14:09:42 +0100, Alexander Kanavin wrote:
> To the best of my knowledge, no other companies at the moment are using
> Yocto to integrate npm-based items into a product.

We make light use of npm in the production of the rootfs for our products.
The upgrade to Dunfell was a bit painful, and the slowness is annoying too,
but I think that we've ended up with things being better in the end because
we can now be sure that all the sources are captured correctly.

We did run into a few bugs and some of our fixes for those have landed. The
npm fetcher seems to be fighting with the usual way that Bitbake expects
fetchers to work, in ways that I don't really understand well enough (from
either side) to know how to fix. (e.g.
https://bugzilla.yoctoproject.org/show_bug.cgi?id=14383 which doesn't
directly affect us, but the underlying cause meant that our usual method
for capturing sources needed some extra workarounds.)

Mike.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [bitbake-devel] Improving npm(sw) fetcher & integration within Bitbake
       [not found]         ` <5fb67154d576b74629e4836a86dcb5e479b73e67.camel@linuxfoundation.org>
  2021-11-06 10:30           ` Konrad Weihmann
@ 2021-11-08  7:41           ` Stefan Herbrechtsmeier
  2021-11-08  7:59             ` Alexander Kanavin
  1 sibling, 1 reply; 16+ messages in thread
From: Stefan Herbrechtsmeier @ 2021-11-08  7:41 UTC (permalink / raw)
  To: Richard Purdie, Jasper Orschulko, jeanmarie.lemetayer
  Cc: bitbake-devel, martin, Daniel Baumgart

Am 06.11.2021 um 11:06 schrieb Richard Purdie:
> On Fri, 2021-11-05 at 16:02 +0000, Jasper Orschulko wrote:
>> Thanks @all for your input.
>>
>> It seems like there is a lot to be considered. When it comes to under-
>> the-hood development, the yocto project is fairly new to Martin and
>> myself. The term "bitbake requirements" has been used more than once in
>> this conversation, yet I am currently not aware of any formal
>> requirements. You wouldn't happen to have some documentation on this by
>> any chance? A common understanding of this and the actual usecases for
>> npm would be great, before we go into further details on how a possible
>> solution might look like.
> 
> We've used that term as there is a set of things which the bitbake fetcher is
> expected to provide in terms of workflows and user experiences, effectively what
> its API is. Sadly there isn't a formal document, just a long understanding of
> what this looks like. It isn't a piece of the project that people have wanted to
> change too often so writing such documentation hasn't been a priority.
> 
> I can try and briefly summarise the expectations I can think of in the list
> below and we should perhaps just put this as a text file in the fetcher
> directory of the codebase. If others see anything I'm missing please add to the
> list.
> 
> Just to be clear, we're not trying to be awkward about making changes, there is
> just a long history with this codebase and end users tend to get upset if we
> break the API. Different users do have very different use cases from the code.
> 
> 
> a) network access for sources is only expected to happen in the do_fetch step.
> This is not enforced or tested but is required so that we can:
> 
>    i) audit the sources used (i.e. for license/manifest reasons)
>    ii) support offline builds with a suitable cache
>    iii) allow work to continue even with downtime upstream
>    iv) allow for changes upstream in incompatible ways
>    v) allow rebuilding of the software in X years time
> 
> b) network access is not expected in do_unpack
> 
> c) you can take DL_DIR and use it as a mirror for offline builds
> 
> d) access to the network is only made when explicitly configured in recipes
>     (e.g. use of AUTOREV, or use of git tags which change revision)
> 
> e) fetcher output is deterministic
>     (i.e. if you fetch configuration XXX now it will match in future exactly in
>      a clean build with a new DL_DIR)
> 
> f) network access is expected to work with the standard linux proxy variables
>     so that access behind firewalls works (the fetcher sets these in the
>     environment but only in the do_fetch tasks)
> 
> g) access during parsing has to be minimal, a "git ls-remote" for an AUTOREV
>     git recipe might be ok but you can't expect to checkout a git tree
> 
> h) we need to provide revision information during parsing such that a version
>     for the recipe can be constructed.
> 
> i) versions are expected to be able to increase in a way which sorts allowing
>     package feeds to operate (see PR server required for git revisions to sort)
> 
> j) API to query for possible version upgrades of a url is highly desirable to
>     allow our automated upgrade code to function (it is implied this does always
>     have network access)
> 
> k) Where fixes or changes to behaviour in the fetcher are made, we ask that
>     test cases are added (run with "bitbake-selftest bb.tests.fetch"). We do
>     have fairly extensive test coverage of the fetcher as it is the only way
>     to track all of its corner cases; it still doesn't give entire coverage
>     though sadly.
> 
> Not all fetchers support all features, autorev is optional and doesn't make
> sense for some. Upgrade detection means different things in different contexts
> too.

Thanks for the list and I would like to see that as part of the bitbake 
source tree.
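
To illustrate point e) of the list (deterministic fetcher output), a
minimal sketch of how two download directories from separate clean
fetches could be compared file by file. This is an assumption-laden
illustration, not part of bitbake: digest_tree and differing_files are
hypothetical helpers, and a real DL_DIR also holds git clones and
stamp files that would need to be filtered out first.

```python
import hashlib
from pathlib import Path


def digest_tree(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    digests = {}
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        key = path.relative_to(root).as_posix()
        digests[key] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def differing_files(first, second):
    """Return the relative paths that are missing or differ between two trees."""
    a, b = digest_tree(first), digest_tree(second)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))
```

An empty result from differing_files would be consistent with the
fetch being reproducible for that configuration; it does not prove it
in general.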

> Also, I did realise the npm fetcher tests simply don't work anymore. They're not
> run as standard as our infrastructure didn't have npm on it to run the tests so
> they have bitrotted :(.

Is it possible to add the nodejs recipe to the core?



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [bitbake-devel] Improving npm(sw) fetcher & integration within Bitbake
  2021-11-08  7:41           ` Stefan Herbrechtsmeier
@ 2021-11-08  7:59             ` Alexander Kanavin
  0 siblings, 0 replies; 16+ messages in thread
From: Alexander Kanavin @ 2021-11-08  7:59 UTC (permalink / raw)
  To: Stefan Herbrechtsmeier
  Cc: Richard Purdie, Jasper Orschulko, jeanmarie.lemetayer,
	bitbake-devel, martin, Daniel Baumgart

[-- Attachment #1: Type: text/plain, Size: 649 bytes --]

On Mon, 8 Nov 2021 at 08:41, Stefan Herbrechtsmeier <
stefan.herbrechtsmeier-oss@weidmueller.com> wrote:

> > Also, I did realise the npm fetcher tests simply don't work anymore.
> They're not
> > run as standard as our infrastructure didn't have npm on it to run the
> tests so
> > they have bitrotted :(.
>
> Is it possible to add the nodejs recipe to the core?
>

I don't think it's eligible (the criteria for core is 'broadly useful for
embedded use cases', which nodejs is clearly not ;). I think the question
should be 'is it possible to run nodejs tests on the autobuilder so it
doesn't regress', and to that the answer is 'yes'.

Alex


[-- Attachment #2: Type: text/html, Size: 1213 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [bitbake-devel] Improving npm(sw) fetcher & integration within Bitbake
  2021-11-06 16:58   ` Mike Crowe
@ 2021-11-08  8:01     ` Stefan Herbrechtsmeier
  2021-11-08 12:44       ` Jasper Orschulko
  0 siblings, 1 reply; 16+ messages in thread
From: Stefan Herbrechtsmeier @ 2021-11-08  8:01 UTC (permalink / raw)
  To: Mike Crowe, Alexander Kanavin, Jasper Orschulko, bitbake-devel,
	martin, Caner Altinbasak, Daniel Baumgart, Richard Purdie

Am 06.11.2021 um 17:58 schrieb Mike Crowe:
> On Thursday 04 November 2021 at 14:09:42 +0100, Alexander Kanavin wrote:
>> To the best of my knowledge, no other companies at the moment are using
>> Yocto to integrate npm-based items into a product.
> 
> We make light use of npm in the production of the rootfs for our products.
> The upgrade to Dunfell was a bit painful, and the slowness is annoying too,
> but I think that we've ended up with things being better in the end because
> we can now be sure that all the sources are captured correctly.

We have a similar problem.

> We did run into a few bugs and some of our fixes for those have landed. The
> npm fetcher seems to be fighting with the usual way that Bitbake expects
> fetchers to work that I don't really understand enough (from either side)
> to know how to fix. (e.g.
> https://bugzilla.yoctoproject.org/show_bug.cgi?id=14383 which doesn't
> directly affect us, but the underlying cause meant that our usual method
> for capture sources needed some extra workarounds.)

The npmsw fetcher is the only one that fetches multiple sources from one
foreign configuration file. To do so, it parses the configuration file
and translates it into multiple fetch commands.

I think this is the wrong approach and creates more problems than it
solves. I suggest moving the logic to the recipetool and using the npm
fetcher directly, once per npm package (dependency). Even the npm
fetcher could be avoided if we don't require an autorev feature for an
npm package.

This solution has the advantage that you can manipulate the
dependencies inside a recipe, and you can replace an npm package with
another OE package (DEPENDS / RDEPENDS).

The disadvantage is a big recipe instead of a big foreign configuration
file, and it becomes impossible to use a foreign configuration file
directly from a git repository. But I think this use case doesn't match
the OE requirements, and we should improve the recipe generation so that
this can be done automatically.
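
As a rough sketch of what such a translation at recipe-creation time
could look like (not the actual recipetool code): the snippet walks a
lock file and emits one npm fetcher URL per package. It assumes the
nested "dependencies" layout of lockfileVersion 1 and registry
packages only; lock_to_src_uri and the example data are hypothetical,
and nested install paths, git or file dependencies and checksums are
ignored.

```python
import json


def lock_to_src_uri(lock, registry="registry.npmjs.org"):
    """Flatten a lockfileVersion-1 style dependency tree into npm:// URLs."""
    entries = set()  # a set drops packages that appear more than once

    def walk(deps):
        for name, info in deps.items():
            version = info.get("version")
            if version:
                entries.add(f"npm://{registry}/;package={name};version={version}")
            # Nested dependencies use the same structure one level down.
            walk(info.get("dependencies", {}))

    walk(lock.get("dependencies", {}))
    return sorted(entries)


# Hypothetical lock file content, for illustration only.
example_lock = json.loads("""
{
  "name": "demo", "lockfileVersion": 1,
  "dependencies": {
    "koa": {"version": "2.13.4",
            "dependencies": {"debug": {"version": "4.3.2"}}},
    "debug": {"version": "4.3.2"}
  }
}
""")

for uri in lock_to_src_uri(example_lock):
    print(uri)
```

The resulting list is what would end up in the recipe's SRC_URI, where
single entries can then be overridden or dropped like any other source.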

Regards
   Stefan


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [bitbake-devel] Improving npm(sw) fetcher & integration within Bitbake
  2021-11-08  8:01     ` Stefan Herbrechtsmeier
@ 2021-11-08 12:44       ` Jasper Orschulko
  2021-11-11  7:51         ` Stefan Herbrechtsmeier
  0 siblings, 1 reply; 16+ messages in thread
From: Jasper Orschulko @ 2021-11-08 12:44 UTC (permalink / raw)
  To: alex.kanavin, mac, stefan.herbrechtsmeier-oss, richard.purdie,
	bitbake-devel, martin, Daniel Baumgart, cal

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Hi Stefan,

would you (and any other interested party) be interested in a
conference call with Martin and I? We are greatly interested in your
current prototypes, how they perform for the various usecases and
if/how we can help you with the implementation.

- -- 
With best regards

Jasper Orschulko
DevOps Engineer

Tel. +49 30 58 58 14 265
Fax +49 30 58 58 14 999
Jasper.Orschulko@iris-sensing.com

• • • • • • • • • • • • • • • • • • • • • • • • • •

iris-GmbH
infrared & intelligent sensors
Schnellerstraße 1-5 | 12439 Berlin

https://iris-sensing.com/





On Mon, 2021-11-08 at 09:01 +0100, Stefan Herbrechtsmeier wrote:
> Am 06.11.2021 um 17:58 schrieb Mike Crowe:
> > On Thursday 04 November 2021 at 14:09:42 +0100, Alexander Kanavin
> > wrote:
> > > To the best of my knowledge, no other companies at the moment are
> > > using
> > > Yocto to integrate npm-based items into a product.
> > 
> > We make light use of npm in the production of the rootfs for our
> > products.
> > The upgrade to Dunfell was a bit painful, and the slowness is
> > annoying too,
> > but I think that we've ended up with things being better in the end
> > because
> > we can now be sure that all the sources are captured correctly.
> 
> We have a similar problem.
> 
> > We did run into a few bugs and some of our fixes for those have
> > landed. The
> > npm fetcher seems to be fighting with the usual way that Bitbake
> > expects
> > fetchers to work that I don't really understand enough (from either
> > side)
> > to know how to fix. (e.g.
> > https://bugzilla.yoctoproject.org/show_bug.cgi?id=14383 which
> > doesn't
> > directly affect us, but the underlying cause meant that our usual
> > method
> > for capture sources needed some extra workarounds.)
> 
> The npmsw fetcher is the only one that fetch multiple sources from
> one foreign configuration file. Therefore it parse the configuration
> file and translate it into multiple fetch commands.
> 
> I think this is a wrong approach and makes more problems as it solve.
> I 
> suggest to move the logic to the recipetool and direct use the npm 
> fetcher per npm package (dependency). Even the npm fetcher could be 
> avoid if we doesn't require an autorev feature for an npm package.
> 
> This solution has the advantage that you could manipulate the 
> dependencies inside a recipe and you could replace a npm package with
> an 
> other oe package (DEPENDS / RDEPENDS).
> 
> The disadvantage is a big recipe instead of a big foreign
> configuration 
> file and it is impossible to use a foreign configuration file direct 
> from a git repository. But I think this use case doesn't match with
> the 
> oe requirements and we should improve the recipe generation so that
> this 
> could be done automatically.
> 
> Regards
>    Stefan
-----BEGIN PGP SIGNATURE-----

iQEzBAEBCAAdFiEE4WyPMIC5Ap4+Ooo1Ygqew07VMNUFAmGJG5oACgkQYgqew07V
MNVTDgf/d6eY/AlnnHg3OkHP1GfKoRDmMKNuS1sjbx7xMyGsZBpWRrPGPwpLWa6T
rdS3qcGyVnMBbHeTRDp2yuYxxFCrBqyiSGpHfQwmUL9u94Si3ujTdKZKpQSYquT3
FDZW4WDILP6LCyfVFnLdLyLyVJd62X2tHd8jfG8c+70xQ5Ibh0CDpEslHvqWb/nS
30pKJ3FX/VnMit6BIY4ndiidW+UX/rnu5+w2AeqEk9H9t1ZcjCJyEP3TgF+f3icN
6S6um5EG+A9YJ7qeuWVDwxGyDkWod7TkXMTxYAdhud8Sfo0gkx3cJ1zLr10szGIj
bSpEn6im5cccoeaFtaWpiLUmuwpEzg==
=uNna
-----END PGP SIGNATURE-----
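
Editorial sketch of the proposal quoted above (one npm fetcher entry per
dependency, generated ahead of time instead of being expanded by npmsw at
fetch time). This is only an illustration under stated assumptions, not
Stefan's prototype and not existing recipetool code: the function names are
hypothetical, the lock file is assumed to use the lockfileVersion 1
"dependencies" tree, every package is assumed to come from the public
registry, and the `destsuffix` values are illustrative.

```python
import json

# Assumption: every dependency is fetched from the public npm registry.
REGISTRY = "registry.npmjs.org"


def iter_dependencies(deps, prefix="node_modules"):
    """Yield (name, version, unpack_dir) for each entry of a
    lockfileVersion 1 'dependencies' tree, including nested entries."""
    for name, info in sorted(deps.items()):
        unpack_dir = f"{prefix}/{name}"
        yield name, info["version"], unpack_dir
        # Nested dependencies live in the parent's own node_modules directory.
        nested = info.get("dependencies", {})
        yield from iter_dependencies(nested, f"{unpack_dir}/node_modules")


def lockfile_to_src_uri(lock):
    """Return one npm:// SRC_URI entry per locked dependency.

    Hypothetical helper: git, file, or tarball dependencies are not
    handled here and would need other fetchers.
    """
    return [
        f"npm://{REGISTRY};package={name};version={version};destsuffix={unpack_dir}"
        for name, version, unpack_dir in iter_dependencies(lock.get("dependencies", {}))
    ]


# Small made-up lock file used only for demonstration.
EXAMPLE_LOCK = json.loads("""
{
  "lockfileVersion": 1,
  "dependencies": {
    "express": {
      "version": "4.17.1",
      "dependencies": {"array-flatten": {"version": "1.1.1"}}
    },
    "debug": {"version": "2.6.9"}
  }
}
""")

SRC_URI_ENTRIES = lockfile_to_src_uri(EXAMPLE_LOCK)
for entry in SRC_URI_ENTRIES:
    print(entry)
```

Because the generated entries are ordinary recipe text, a single line could
be dropped and replaced by a DEPENDS / RDEPENDS on another OE recipe, which
is the advantage the quoted message describes; the cost is that the recipe
has to be regenerated whenever the lock file changes.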


* Re: [bitbake-devel] Improving npm(sw) fetcher & integration within Bitbake
  2021-11-08 12:44       ` Jasper Orschulko
@ 2021-11-11  7:51         ` Stefan Herbrechtsmeier
  0 siblings, 0 replies; 16+ messages in thread
From: Stefan Herbrechtsmeier @ 2021-11-11  7:51 UTC (permalink / raw)
  To: Jasper Orschulko, alex.kanavin, mac, richard.purdie,
	bitbake-devel, martin, Daniel Baumgart, cal

Hi Jasper,

Am 08.11.2021 um 13:44 schrieb Jasper Orschulko:
> would you (and any other interested party) be interested in a
> conference call with Martin and me? We are very interested in your
> current prototypes, how they perform for the various use cases, and
> whether and how we can help you with the implementation.

I will contact you both directly to schedule a call.

Regards
   Stefan



end of thread (last message: 2021-11-11  7:51 UTC)

Thread overview: 16+ messages
2021-11-04 12:29 Improving npm(sw) fetcher & integration within Bitbake Jasper Orschulko
2021-11-04 13:09 ` [bitbake-devel] " Alexander Kanavin
2021-11-06 16:58   ` Mike Crowe
2021-11-08  8:01     ` Stefan Herbrechtsmeier
2021-11-08 12:44       ` Jasper Orschulko
2021-11-11  7:51         ` Stefan Herbrechtsmeier
2021-11-04 23:15 ` Richard Purdie
2021-11-05  9:07   ` Stefan Herbrechtsmeier
2021-11-05 11:24     ` Jean-Marie Lemetayer
2021-11-05 16:02       ` Jasper Orschulko
2021-11-05 17:42         ` Alexander Kanavin
     [not found]         ` <5fb67154d576b74629e4836a86dcb5e479b73e67.camel@linuxfoundation.org>
2021-11-06 10:30           ` Konrad Weihmann
2021-11-08  7:41           ` Stefan Herbrechtsmeier
2021-11-08  7:59             ` Alexander Kanavin
     [not found]     ` <4106f9ef-5b2e-5276-f1bb-c80a989d7fdf@mko.dev>
2021-11-05 11:12       ` Martin Koppehel
2021-11-05 13:16       ` Stefan Herbrechtsmeier
