Hi Jasper and folks,

I am the one who last updated the NPM support in bitbake. All your remarks are correct: the shrinkwrap file placement can be improved, and the do_configure step is a real pain. But this is the only way I found that follows the bitbake requirements.

Regarding network access, be careful with NPM, because there are some corner cases where NPM tries to access the network when it is not supposed to (node-gyp, if I remember correctly). I ran my test builds with my RJ45 cable unplugged to be sure.

The real issue is that the NPM workflow is completely different from the Bitbake one. I think the best solution could be to improve NPM itself so that it fits better into an environment like Bitbake (maybe with a new fetch command).

Best regards and good luck,
Jean-Marie


On Fri, Nov 5, 2021 at 2:07 AM Stefan Herbrechtsmeier <stefan.herbrechtsmeier-oss@weidmueller.com> wrote:
Hi Jasper and Richard,

On 05.11.2021 at 00:15, Richard Purdie via lists.openembedded.org wrote:
> On Thu, 2021-11-04 at 12:29 +0000, Jasper Orschulko wrote:
>> Dear Bitbake developers,
>>
>> recently we have been looking at the npmsw fetcher and discovered some
>> challenges regarding the integration into the developer workflow as
>> well as the build times within Bitbake. We believe that we found a
>> mechanism which would integrate well into Bitbake's existing project
>> structure and drastically improve the situation.
>>
>>
>> But first, what are the issues with the current npmsw fetcher?
>>
>> 1. Let's have a look at a typical npm-based project. You'd typically
>> have your package-lock.json (aka shrinkwrap file) stored within the git
>> repository containing your source code. Developers will rely on this
>> package-lock file on a daily basis during the development cycle.
>> Unfortunately, the current npmsw fetcher only supports shrinkwrap files
>> stored within the meta layer or within an npm registry. This is not
>> ideal, as changes to the file might be made within the project repo,
>> which then need to be manually applied to the lock file within the meta
>> repo. An ideal npmsw fetcher therefore would support using the lock
>> file directly from the source code repo.

The package-lock.json and npm-shrinkwrap.json files have the same
format. The only difference is that npm-shrinkwrap.json can be
published with the package.

>> 2. The current implementation of the npm class uses multiple shellouts
>> per npm module in order to add these to the npm cache. This is done
>> because the `npm install` command is not called within do_fetch, but
>> at the end of the do_configure step. This drastically increases the
>> time Bitbake spends in the do_configure step for an npm-based recipe.
>> In our case (we have a relatively small project with approx. 600 npm
>> packages in total, including recursive packages) this takes ~100
>> minutes to complete. What makes things worse, every change to the
>> recipe and/or lock file will cause a complete rerun of the
>> do_configure job.

This is caused by the sequential setup of the cache. I have a
prototype that does this in a dedicated bb task and runs multiple
parallel processes inside that task. I also have a prototype that
removes the cache completely and speeds up the build significantly.
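
As an illustration only (this is not the prototype mentioned above), the
difference between sequential and parallel cache setup can be sketched
in Python. The names `populate_cache` and `add_to_cache` are
hypothetical; `add_to_cache` stands for whatever per-package step
currently runs one after another, for example a shellout to npm.

```python
from concurrent.futures import ThreadPoolExecutor


def populate_cache(tarballs, add_to_cache, max_workers=8):
    """Call add_to_cache(tarball) for every tarball, several at a time.

    Returns a dict mapping each tarball to the worker's return value.
    An exception raised by any worker propagates to the caller.
    """
    tarballs = list(tarballs)  # allow generators; we iterate twice below
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so zip pairs inputs with results.
        return dict(zip(tarballs, pool.map(add_to_cache, tarballs)))
```

Threads are sufficient here under the assumption that the worker mostly
waits on a child process or on I/O; the pool size is an arbitrary
default, not a measured value.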

>> As a result, the npm fetcher currently is not really usable for
>> production workloads.

Ack.

>> So how can we address these issues?
>>
>> We plan to implement a "sub-fetcher" for npmsw (a concept which might
>> also be recyclable for similar use-cases). This would take the
>> form of e.g.:
>>
>> SRC_URI = "npmsw+git://git-uri.git;npm-topdir=path_to_npm_project;..."
>>
>> The idea is that the npmsw fetcher would then call an arbitrary
>> sub-fetcher (in this case git, however any fetcher will be supported)
>> and, after the sub-fetcher has extracted the source code into the
>> DL_DIR, the npm fetcher will create a secondary download folder as a
>> copy of the sub-fetcher's download folder. Within this copy, the npm
>> fetcher will call `npm ci`, effectively downloading the npm packages
>> by doing a clean install on the basis of the package.json and
>> package-lock.json files within the npmsw download dir. This results
>> in a much faster build, as it removes the need for separate handling
>> of the individual node packages, and it also streamlines the
>> developer's workflow with the build process within Bitbake.

How should this support the download proxy? The npm ci command needs a
repository or a cache to work.

Furthermore, you need a patch step in between the fetch steps to allow
tuning / fixing of the configuration before the second fetch step.

>> As this fetcher would be implemented separately from the current npmsw
>> fetcher, this will not cause any breaking changes for existing setups.
>>
>> Additionally, we plan on writing a separate npmsw.bbclass, which will
>> parse the package.json for each node module for an automated Bitbake
>> license manifest generation, which will resolve the current challenge
>> of having to maintain these manually, as currently described at
>> https://www.yoctoproject.org/docs/latest/mega-manual/mega-manual.html#npm-using-the-registry-modules-method
>> .

These licenses will be generated by recipetool, and you can provide
checksums to detect the correct licenses.

The license inside the package.json is only a hint, and you need a
license file to fulfill license compliance. Because of this, I remove
the package.json from LIC_FILES_CHKSUM, since it is useless for
license compliance.
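
To make the distinction concrete, here is a minimal Python sketch (not
recipetool's actual logic) that reads the declared license from
package.json as a hint and separately collects md5 checksums of license
files, which is the kind of value LIC_FILES_CHKSUM records. The function
name `license_info` and the list of file name prefixes are assumptions
made for this example.

```python
import hashlib
import json
from pathlib import Path

# Assumed naming convention for license files; real packages vary.
LICENSE_PREFIXES = ("LICENSE", "LICENCE", "COPYING")


def license_info(package_dir):
    """Return (declared_license, [(file_name, md5_hex), ...]) for a package."""
    package_dir = Path(package_dir)
    manifest = json.loads((package_dir / "package.json").read_text(encoding="utf-8"))
    declared = manifest.get("license")  # only a hint, may be missing or wrong

    files = []
    for path in sorted(package_dir.iterdir()):
        if path.is_file() and path.name.upper().startswith(LICENSE_PREFIXES):
            files.append((path.name, hashlib.md5(path.read_bytes()).hexdigest()))
    return declared, files
```

A package that declares a license but yields an empty file list is the
case where the hint alone would not be enough.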


>> If this is something you see as a worthwhile goal, we will provide a
>> set of patch files within the coming weeks.

I think you are mixing up the unusable npm implementation with your
special use case.

The problem is that the current npm implementation isn't really usable.
I'm working on this and already have a prototype that can install,
build and *test* a proprietary angular project and node-red as well as
koa/examples from github.

If I understand you correctly, you would like to build an npm recipe
that can change its dependencies without updating the recipe, apart
from the SRCREV of the repositories.

> At a first read it sounds reasonable but I don't know the answers to a few
> questions which make or break things from an OE/bitbake perspective. Those
> questions are:
>
> a) Once DL_DIR has been populated by this fetch mechanism, can a subsequent
> build run with just the data from there without accessing the network?
>
> b) Is the information encoded into SRC_URI enough to give a deterministic build
> result, i.e. if we run this build at some later date, will we get the same
> result?
>
> c) Is fetching only happening during the do_fetch task and not in any subsequent
> step?
>
>
> I'd love for some of the other people who've worked on this code to jump in as I
> don't use it or understand it in detail. I am worried about how we maintain this
> longer term as different people seem to have different use cases which sees the
> code changing in different directions and we're starting to look like we may end
> up with multiple ways of doing things which I really dislike.

This leads to the question of what the desired way to integrate a
package / dependency manager is. Nowadays any language (even C/C++) has
a package manager available, and more and more build systems (e.g.
Meson, CMake) support automatic download of dependencies. The common
integration into OE is a script (recipetool) that generates a recipe
from the foreign configuration. The current npm implementation is
special because it reuses a foreign configuration and translates it
into fetch commands on the fly. This leads to the problem that common
tweaks, like overriding a dependency or sharing configuration between
recipes via an include file, aren't possible. We could fix this by
removing the foreign configuration and doing the translation during
recipe creation. But this means you have to recreate the recipe after
every dependency change.
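
As a rough sketch of what "translating the foreign configuration" means,
the following Python reads a lock file and lists what would have to be
fetched. It is not the npmsw fetcher's code. It assumes lockfileVersion
2 or 3, where a top-level "packages" object maps install paths to
metadata; the function name `lock_entries` is made up for this example.

```python
import json


def lock_entries(lock_text):
    """Return sorted (name, version, resolved, integrity) tuples.

    Assumes a "packages" object keyed by install path, such as
    "node_modules/foo" or "node_modules/a/node_modules/b".
    """
    packages = json.loads(lock_text).get("packages", {})
    entries = []
    for path, meta in packages.items():
        if not path:  # the empty key describes the root project itself
            continue
        # Keep only what follows the last "node_modules/" as the name.
        name = path.rpartition("node_modules/")[2]
        entries.append(
            (name, meta.get("version"), meta.get("resolved"), meta.get("integrity"))
        )
    return sorted(entries, key=lambda e: (e[0], e[1] or ""))
```

Whether such a list is produced on the fly at fetch time or written
into the recipe at creation time is exactly the trade-off described
above.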

Is it a valid use case for OE to support foreign dependency
configurations like npm-shrinkwrap.json, go.sum or conan.lock?

Regards
   Stefan
