* [Buildroot] Untractably slow CVE checks in support/scripts/pkg-stat
@ 2020-02-27 15:05 Titouan Christophe
  2020-02-27 16:54 ` Thomas Petazzoni
  0 siblings, 1 reply; 3+ messages in thread
From: Titouan Christophe @ 2020-02-27 15:05 UTC (permalink / raw)
  To: buildroot

Hello Thomas^2, Yann, and all Buildrooters,

During the FOSDEM 2020 developer meeting, we started to work on matching 
Buildroot packages against the NIST National Vulnerability Database (NVD) 
files, so as to obtain a list of known CVEs affecting our packages.

The first implementation, merged in 
4a157be9efac8ba8888e4972f42eda213077152c, loaded entire NVD files 
one by one (with Python's json.load()). While this is the most 
straightforward approach, it was not practical: once loaded 
into their Python representation, these files take up to a few gigabytes 
of memory, and hosts with a modest amount of RAM (4GB or less) were 
OOMing while processing the CVEs.

I therefore introduced the use of the third-party Python module ijson, 
which allows iterating over a JSON file in a streaming fashion, i.e. 
loading only one CVE at a time from its JSON representation. Thomas D.S. 
confirmed that this drastically reduced memory consumption. This 
modification was subsequently merged in 
712f81c41cde9d58c750ae2b1617831c0b07ccbd . In the commit message, I wrote:

"""
To run the script with these modifications, one should install the ijson 
python package. This can be done with pip: `pip install ijson`. On 
Debian based distributions, this can be done with the apt package 
manager: `apt install python-ijson`.
"""
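
For illustration, here is a minimal sketch of the streaming approach described above. It is not the code from the commit: the function name iter_cves is hypothetical, the "CVE_Items" key is assumed from the NVD JSON feed layout, and the CVE IDs are made up. When ijson is not installed, the sketch falls back to json.load(), which shows the memory-hungry behaviour of the first implementation.

```python
import io
import json

try:
    import ijson  # third-party module, assumed to be installed via pip or apt
except ImportError:
    ijson = None


def iter_cves(fileobj):
    """Yield CVE entries one at a time from an NVD-style JSON feed.

    Assumption: the feed is a JSON object whose "CVE_Items" key holds
    the list of CVE entries.
    """
    if ijson is not None:
        # Streaming: only the entry currently being yielded is built in memory.
        yield from ijson.items(fileobj, "CVE_Items.item")
    else:
        # Fallback: the whole document is loaded at once (first implementation).
        yield from json.load(fileobj)["CVE_Items"]


if __name__ == "__main__":
    # Tiny in-memory stand-in for a real feed; the IDs are made up.
    sample = (
        b'{"CVE_Items": ['
        b'{"cve": {"CVE_data_meta": {"ID": "CVE-0000-0001"}}},'
        b'{"cve": {"CVE_data_meta": {"ID": "CVE-0000-0002"}}}'
        b']}'
    )
    for cve in iter_cves(io.BytesIO(sample)):
        print(cve["cve"]["CVE_data_meta"]["ID"])
```

With a real feed, the file object would typically come from gzip.open() or open() in binary mode instead of io.BytesIO.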

However, Thomas P. reported that the pkg-stat script now takes much 
longer to complete (from a few minutes before the change to 2h30 (!) 
now). At first, I was puzzled because the same script completes in less 
than 5 minutes on my laptop. With Yann's help, I managed to 
isolate the issue in a small Python script, which can be found, along 
with the accompanying data files, here: 
https://mypi.cz/0c3af4651d1aefe7335b6f137131424e.tar.gz . On my laptop, 
the 8 steps in this script all run in less than 1 minute, 
while on Yann's machine, the first one did not even complete within a few 
minutes.

I therefore went through the release notes of ijson 
(https://github.com/ICRAR/ijson/blob/master/CHANGELOG.md#24), and found 
that version 2.4 introduced huge performance improvements. On my 
laptop, I have ijson-2.6.1, because I installed it via pip and 
therefore got the latest version. Yann, on the other hand, installed 
it via the apt package manager, which only provides ijson-2.3 (from 
before the performance improvements).
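
To tell which side of the 2.4 threshold a given machine is on, the installed version can be queried from Python. The following is only a sketch: the helper names are hypothetical, the version parsing is deliberately simplistic, and importlib.metadata requires Python 3.8 or later.

```python
import re
from importlib import metadata  # available from Python 3.8 onwards


def version_tuple(version):
    """Turn a string such as '2.6.1' into (2, 6, 1).

    Parsing stops at the first component that does not start with a digit.
    """
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def has_fast_ijson(installed, minimum="2.4"):
    """Return True if `installed` is at least the 2.4 release."""
    return version_tuple(installed) >= version_tuple(minimum)


def installed_ijson_version():
    """Return the installed ijson version string, or None if it is absent."""
    try:
        return metadata.version("ijson")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_ijson_version()
    if found is None:
        print("ijson is not installed")
    elif has_fast_ijson(found):
        print("ijson %s includes the 2.4 performance improvements" % found)
    else:
        print("ijson %s predates 2.4, expect slow parsing" % found)
```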

=> Thus, I think that the severe slowdown we currently see in pkg-stat is 
due to an old ijson version. This can easily be verified with the 
following procedure:

1. Uninstall the version distributed by apt:
    `apt remove python-ijson`
2. Install the latest version with pip:
    `pip install [--user] ijson`
3. Start the pkg-stat script, it should complete in less than 5 minutes.


Kind regards,

Titouan


* [Buildroot] Untractably slow CVE checks in support/scripts/pkg-stat
  2020-02-27 15:05 [Buildroot] Untractably slow CVE checks in support/scripts/pkg-stat Titouan Christophe
@ 2020-02-27 16:54 ` Thomas Petazzoni
  2020-02-27 21:32   ` Peter Korsgaard
  0 siblings, 1 reply; 3+ messages in thread
From: Thomas Petazzoni @ 2020-02-27 16:54 UTC (permalink / raw)
  To: buildroot

Hello Titouan,

On Thu, 27 Feb 2020 16:05:41 +0100
Titouan Christophe <titouan.christophe@railnova.eu> wrote:

> 1. Uninstall the version distributed by apt:
>     `apt remove python-ijson`
> 2. Install the latest version with pip:
>     `pip install [--user] ijson`
> 3. Start the pkg-stat script, it should complete in less than 5 minutes.

Thanks a lot for the investigation and research!

I tested with ijson 2.6 installed from pip, and things are indeed a lot
better. It's not 5 minutes though:

real	19m55.557s
user	18m7.632s
sys	1m53.140s

Best regards,

Thomas
-- 
Thomas Petazzoni, CTO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


* [Buildroot] Untractably slow CVE checks in support/scripts/pkg-stat
  2020-02-27 16:54 ` Thomas Petazzoni
@ 2020-02-27 21:32   ` Peter Korsgaard
  0 siblings, 0 replies; 3+ messages in thread
From: Peter Korsgaard @ 2020-02-27 21:32 UTC (permalink / raw)
  To: buildroot

>>>>> "Thomas" == Thomas Petazzoni <thomas.petazzoni@bootlin.com> writes:

 > Hello Titouan,
 > On Thu, 27 Feb 2020 16:05:41 +0100
 > Titouan Christophe <titouan.christophe@railnova.eu> wrote:

 >> 1. Uninstall the version distributed by apt:
 >> `apt remove python-ijson`
 >> 2. Install the latest version with pip:
 >> `pip install [--user] ijson`
 >> 3. Start the pkg-stat script, it should complete in less than 5 minutes.

 > Thanks a lot for the investigation and research!

 > I tested with ijson 2.6 installed from pip, and things are indeed a lot
 > better. It's not 5 minutes though:

 > real	19m55.557s
 > user	18m7.632s
 > sys	1m53.140s

Ok, but still almost an order of magnitude faster - Thanks!

If needed, I think we can get most of the speedup even with the older
2.3 version packaged in Debian/Ubuntu by selecting a better backend than
the pure-Python fallback that used to be the default, e.g. something
like what is done automatically from version 2.5 onwards:

https://github.com/ICRAR/ijson/commit/4c6d4144ea56e883634bd8c3eddca02c6f54a8f7

But I'm not sure if it is worth the trouble.
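
A rough sketch of such an explicit backend selection follows. The module names are assumed from the ijson 2.x documentation and are not taken from the linked commit, and the yajl-based backends additionally need the yajl library (plus cffi for yajl2_cffi) to be installed. The pick_backend helper and its importer parameter are hypothetical.

```python
import importlib

# Candidate backends, fastest first. These names are assumptions based on
# the ijson 2.x documentation; yajl2_c is only expected in 2.4 and later.
CANDIDATES = (
    "ijson.backends.yajl2_c",
    "ijson.backends.yajl2_cffi",
    "ijson.backends.yajl2",
    "ijson.backends.yajl",
    "ijson.backends.python",
)


def pick_backend(candidates=CANDIDATES, importer=importlib.import_module):
    """Return the first backend module that imports cleanly, or None."""
    for name in candidates:
        try:
            return importer(name)
        except Exception:
            # Deliberately broad: a missing ijson, a missing yajl library
            # and a missing cffi module can each fail in a different way.
            continue
    return None


if __name__ == "__main__":
    backend = pick_backend()
    if backend is None:
        print("no ijson backend could be imported")
    else:
        print("using backend:", backend.__name__)
```

The returned module is meant to be used in place of the top-level ijson module, for instance backend.items(fileobj, "CVE_Items.item").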

-- 
Bye, Peter Korsgaard

