All of lore.kernel.org
 help / color / mirror / Atom feed
* [Buildroot] [PATCH v3 00/12] pkg-stats json output improvements
@ 2020-02-22  8:57 Heiko Thiery
  2020-02-22  8:57 ` [Buildroot] [PATCH v3 01/12] support/scripts/pkg-stats: store latest version in a dict Heiko Thiery
                   ` (12 more replies)
  0 siblings, 13 replies; 36+ messages in thread
From: Heiko Thiery @ 2020-02-22  8:57 UTC (permalink / raw)
  To: buildroot

This patchset adds several improvements for the json output of the
pkg-stats script.

- add developers information to the packages
- add supported defconfigs to json
- add license information to json
- add more generic status field to the packages for easier post
  processing

---
v2 -> v3:
- keep variable latest_release but change format
- add check for license file hashes
- introduce a na status for the checks
- add a list of all possible checks to the json output
- add the cve check status

v1 -> v2:
- cleanup and recreation of patches
- remove pkg name from dumping to json
- use patch_files instead of combine count and files in dict
- include getdevelopers.py to reuse Developers class

---
Heiko Thiery (12):
  support/scripts/pkg-stats: store latest version in a dict
  support/scripts/pkg-stats: store patch files for the package
  support/scripts/pkg-stats: set developers info
  support/scripts/pkg-stats: store licences of package
  support/scripts/pkg-stats: add package status
  support/scripts/pkg-stats: add package count to stats
  support/scripts/pkg-stats: store pkg dir path
  support/scripts/pkg-stats: add defconfig support
  support/scripts/pkg-stats: add support for license hash check
  support/scripts/pkg-stats: set status to 'na' for virtual packages
  support/scripts/pkg-stats: initialize all package status checks
  support/scripts/pkg-stats: add status for cve check

 support/scripts/pkg-stats | 272 +++++++++++++++++++++++++++++---------
 1 file changed, 212 insertions(+), 60 deletions(-)

-- 
2.20.1

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [Buildroot] [PATCH v3 01/12] support/scripts/pkg-stats: store latest version in a dict
  2020-02-22  8:57 [Buildroot] [PATCH v3 00/12] pkg-stats json output improvements Heiko Thiery
@ 2020-02-22  8:57 ` Heiko Thiery
  2020-02-23 13:26   ` Titouan Christophe
  2020-02-22  8:57 ` [Buildroot] [PATCH v3 02/12] support/scripts/pkg-stats: store patch files for the package Heiko Thiery
                   ` (11 subsequent siblings)
  12 siblings, 1 reply; 36+ messages in thread
From: Heiko Thiery @ 2020-02-22  8:57 UTC (permalink / raw)
  To: buildroot

From: Heiko Thiery <heiko.thiery@kontron.com>

This patch changes the type of the latest_version variable to a dict.
This is for better readability/usability of the data. With this the json
output is more descriptive in later processing of the json output.

Signed-off-by: Heiko Thiery <heiko.thiery@kontron.com>
Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com>
---
 support/scripts/pkg-stats | 33 +++++++++++++++++----------------
 1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/support/scripts/pkg-stats b/support/scripts/pkg-stats
index 7721d98459..8cc78f2f66 100755
--- a/support/scripts/pkg-stats
+++ b/support/scripts/pkg-stats
@@ -71,7 +71,7 @@ class Package:
         self.url_status = None
         self.url_worker = None
         self.cves = list()
-        self.latest_version = (RM_API_STATUS_ERROR, None, None)
+        self.latest_version = {'status': RM_API_STATUS_ERROR, 'version': None, 'id': None}
 
     def pkgvar(self):
         return self.name.upper().replace("-", "_")
@@ -460,9 +460,8 @@ def check_package_latest_version(packages):
     """
     Fills in the .latest_version field of all Package objects
 
-    This field has a special format:
-      (status, version, id)
-    with:
+    This field is a dict and has the following keys:
+
     - status: one of RM_API_STATUS_ERROR,
       RM_API_STATUS_FOUND_BY_DISTRO, RM_API_STATUS_FOUND_BY_PATTERN,
       RM_API_STATUS_NOT_FOUND
@@ -478,7 +477,9 @@ def check_package_latest_version(packages):
     worker_pool = Pool(processes=64)
     results = worker_pool.map(check_package_latest_version_worker, (pkg.name for pkg in packages))
     for pkg, r in zip(packages, results):
-        pkg.latest_version = r
+        pkg.latest_version['status'] = r[0]
+        pkg.latest_version['version'] = r[1]
+        pkg.latest_version['id'] = r[2]
     del http_pool
 
 
@@ -516,13 +517,13 @@ def calculate_stats(packages):
             stats["hash"] += 1
         else:
             stats["no-hash"] += 1
-        if pkg.latest_version[0] == RM_API_STATUS_FOUND_BY_DISTRO:
+        if pkg.latest_version['status'] == RM_API_STATUS_FOUND_BY_DISTRO:
             stats["rmo-mapping"] += 1
         else:
             stats["rmo-no-mapping"] += 1
-        if not pkg.latest_version[1]:
+        if not pkg.latest_version['version']:
             stats["version-unknown"] += 1
-        elif pkg.latest_version[1] == pkg.current_version:
+        elif pkg.latest_version['version'] == pkg.current_version:
             stats["version-uptodate"] += 1
         else:
             stats["version-not-uptodate"] += 1
@@ -688,29 +689,29 @@ def dump_html_pkg(f, pkg):
     f.write("  <td class=\"centered\">%s</td>\n" % current_version)
 
     # Latest version
-    if pkg.latest_version[0] == RM_API_STATUS_ERROR:
+    if pkg.latest_version['status'] == RM_API_STATUS_ERROR:
         td_class.append("version-error")
-    if pkg.latest_version[1] is None:
+    if pkg.latest_version['version'] is None:
         td_class.append("version-unknown")
-    elif pkg.latest_version[1] != pkg.current_version:
+    elif pkg.latest_version['version'] != pkg.current_version:
         td_class.append("version-needs-update")
     else:
         td_class.append("version-good")
 
-    if pkg.latest_version[0] == RM_API_STATUS_ERROR:
+    if pkg.latest_version['status'] == RM_API_STATUS_ERROR:
         latest_version_text = "<b>Error</b>"
-    elif pkg.latest_version[0] == RM_API_STATUS_NOT_FOUND:
+    elif pkg.latest_version['status'] == RM_API_STATUS_NOT_FOUND:
         latest_version_text = "<b>Not found</b>"
     else:
-        if pkg.latest_version[1] is None:
+        if pkg.latest_version['version'] is None:
             latest_version_text = "<b>Found, but no version</b>"
         else:
             latest_version_text = "<a href=\"https://release-monitoring.org/project/%s\"><b>%s</b></a>" % \
-                (pkg.latest_version[2], str(pkg.latest_version[1]))
+                (pkg.latest_version['id'], str(pkg.latest_version['version']))
 
         latest_version_text += "<br/>"
 
-        if pkg.latest_version[0] == RM_API_STATUS_FOUND_BY_DISTRO:
+        if pkg.latest_version['status'] == RM_API_STATUS_FOUND_BY_DISTRO:
             latest_version_text += "found by <a href=\"https://release-monitoring.org/distro/Buildroot/\">distro</a>"
         else:
             latest_version_text += "found by guess"
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Buildroot] [PATCH v3 02/12] support/scripts/pkg-stats: store patch files for the package
  2020-02-22  8:57 [Buildroot] [PATCH v3 00/12] pkg-stats json output improvements Heiko Thiery
  2020-02-22  8:57 ` [Buildroot] [PATCH v3 01/12] support/scripts/pkg-stats: store latest version in a dict Heiko Thiery
@ 2020-02-22  8:57 ` Heiko Thiery
  2020-02-23 13:35   ` Titouan Christophe
  2020-02-22  8:57 ` [Buildroot] [PATCH v3 03/12] support/scripts/pkg-stats: set developers info Heiko Thiery
                   ` (10 subsequent siblings)
  12 siblings, 1 reply; 36+ messages in thread
From: Heiko Thiery @ 2020-02-22  8:57 UTC (permalink / raw)
  To: buildroot

From: Heiko Thiery <heiko.thiery@kontron.com>

Signed-off-by: Heiko Thiery <heiko.thiery@kontron.com>
Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com>
---
 support/scripts/pkg-stats | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/support/scripts/pkg-stats b/support/scripts/pkg-stats
index 8cc78f2f66..4c963cef0f 100755
--- a/support/scripts/pkg-stats
+++ b/support/scripts/pkg-stats
@@ -65,6 +65,7 @@ class Package:
         self.has_license_files = False
         self.has_hash = False
         self.patch_count = 0
+        self.patch_files = []
         self.warnings = 0
         self.current_version = None
         self.url = None
@@ -131,10 +132,10 @@ class Package:
         """
         Fills in the .patch_count field
         """
-        self.patch_count = 0
         pkgdir = os.path.dirname(self.path)
         for subdir, _, _ in os.walk(pkgdir):
-            self.patch_count += len(fnmatch.filter(os.listdir(subdir), '*.patch'))
+            self.patch_files = fnmatch.filter(os.listdir(subdir), '*.patch')
+            self.patch_count = len(self.patch_files)
 
     def set_current_version(self):
         """
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Buildroot] [PATCH v3 03/12] support/scripts/pkg-stats: set developers info
  2020-02-22  8:57 [Buildroot] [PATCH v3 00/12] pkg-stats json output improvements Heiko Thiery
  2020-02-22  8:57 ` [Buildroot] [PATCH v3 01/12] support/scripts/pkg-stats: store latest version in a dict Heiko Thiery
  2020-02-22  8:57 ` [Buildroot] [PATCH v3 02/12] support/scripts/pkg-stats: store patch files for the package Heiko Thiery
@ 2020-02-22  8:57 ` Heiko Thiery
  2020-02-23 13:45   ` Titouan Christophe
  2020-02-22  8:57 ` [Buildroot] [PATCH v3 04/12] support/scripts/pkg-stats: store licences of package Heiko Thiery
                   ` (9 subsequent siblings)
  12 siblings, 1 reply; 36+ messages in thread
From: Heiko Thiery @ 2020-02-22  8:57 UTC (permalink / raw)
  To: buildroot

Use the function 'parse_developers' function from getdeveloperlib that
collect the information about the developers and the files they
maintain. Then set the maintainer(s) to each package.

Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com>
---
 support/scripts/pkg-stats | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/support/scripts/pkg-stats b/support/scripts/pkg-stats
index 4c963cef0f..643272e9d2 100755
--- a/support/scripts/pkg-stats
+++ b/support/scripts/pkg-stats
@@ -30,10 +30,14 @@ import certifi
 import distutils.version
 import time
 import gzip
+import sys
 from urllib3 import HTTPSConnectionPool
 from urllib3.exceptions import HTTPError
 from multiprocessing import Pool
 
+sys.path.append('utils/')
+from getdeveloperlib import parse_developers
+
 NVD_START_YEAR = 2002
 NVD_JSON_VERSION = "1.0"
 NVD_BASE_URL = "https://nvd.nist.gov/feeds/json/cve/" + NVD_JSON_VERSION
@@ -169,6 +173,15 @@ class Package:
         """
         return cve in self.all_ignored_cves.get(self.pkgvar(), [])
 
+    def set_developers(self, developers):
+        """
+        Fills in the .developers field
+        """
+        self.developers = list()
+        for dev in developers:
+            if dev.hasfile(self.path):
+                self.developers.append((dev.name))
+
     def __eq__(self, other):
         return self.path == other.path
 
@@ -891,6 +904,8 @@ def __main__():
                                       'HEAD']).splitlines()[0]
     print("Build package list ...")
     packages = get_pkglist(args.npackages, package_list)
+    print("Getting developers ...")
+    developers = parse_developers()
     print("Getting package make info ...")
     package_init_make_info()
     print("Getting package details ...")
@@ -902,6 +917,7 @@ def __main__():
         pkg.set_check_package_warnings()
         pkg.set_current_version()
         pkg.set_url()
+        pkg.set_developers(developers)
     print("Checking URL status")
     check_package_urls(packages)
     print("Getting latest versions ...")
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Buildroot] [PATCH v3 04/12] support/scripts/pkg-stats: store licences of package
  2020-02-22  8:57 [Buildroot] [PATCH v3 00/12] pkg-stats json output improvements Heiko Thiery
                   ` (2 preceding siblings ...)
  2020-02-22  8:57 ` [Buildroot] [PATCH v3 03/12] support/scripts/pkg-stats: set developers info Heiko Thiery
@ 2020-02-22  8:57 ` Heiko Thiery
  2020-02-23 15:27   ` Titouan Christophe
  2020-02-22  8:57 ` [Buildroot] [PATCH v3 05/12] support/scripts/pkg-stats: add package status Heiko Thiery
                   ` (8 subsequent siblings)
  12 siblings, 1 reply; 36+ messages in thread
From: Heiko Thiery @ 2020-02-22  8:57 UTC (permalink / raw)
  To: buildroot

Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com>
---
 support/scripts/pkg-stats | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/support/scripts/pkg-stats b/support/scripts/pkg-stats
index 643272e9d2..ebaf04465e 100755
--- a/support/scripts/pkg-stats
+++ b/support/scripts/pkg-stats
@@ -56,7 +56,7 @@ http_pool = None
 
 
 class Package:
-    all_licenses = list()
+    all_licenses = dict()
     all_license_files = list()
     all_versions = dict()
     all_ignored_cves = dict()
@@ -65,6 +65,7 @@ class Package:
         self.name = name
         self.path = path
         self.infras = None
+        self.license = None
         self.has_license = False
         self.has_license_files = False
         self.has_hash = False
@@ -122,6 +123,7 @@ class Package:
         var = self.pkgvar()
         if var in self.all_licenses:
             self.has_license = True
+            self.license = self.all_licenses[var]
         if var in self.all_license_files:
             self.has_license_files = True
 
@@ -384,7 +386,7 @@ def package_init_make_info():
             if value == "unknown":
                 continue
             pkgvar = pkgvar[:-8]
-            Package.all_licenses.append(pkgvar)
+            Package.all_licenses[pkgvar] = value
 
         elif pkgvar.endswith("_LICENSE_FILES"):
             if pkgvar.endswith("_MANIFEST_LICENSE_FILES"):
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Buildroot] [PATCH v3 05/12] support/scripts/pkg-stats: add package status
  2020-02-22  8:57 [Buildroot] [PATCH v3 00/12] pkg-stats json output improvements Heiko Thiery
                   ` (3 preceding siblings ...)
  2020-02-22  8:57 ` [Buildroot] [PATCH v3 04/12] support/scripts/pkg-stats: store licences of package Heiko Thiery
@ 2020-02-22  8:57 ` Heiko Thiery
  2020-02-23 15:19   ` Titouan Christophe
  2020-02-22  8:57 ` [Buildroot] [PATCH v3 06/12] support/scripts/pkg-stats: add package count to stats Heiko Thiery
                   ` (7 subsequent siblings)
  12 siblings, 1 reply; 36+ messages in thread
From: Heiko Thiery @ 2020-02-22  8:57 UTC (permalink / raw)
  To: buildroot

Unify the status check information. The status is stored in a tuple. The
first entry is the status that can be 'ok', 'warning' or 'error'. The
second entry is a verbose message.

The following checks are performed:
- url: status of the URL check
- license: status of the license presence check
- license-files: status of the license file check
- hash: status of the hash file presence check
- patches: status of the patches count check
- pkg-check: status of the check-package script result
- developers: status if a package has developers in the DEVELOPERS file
- version: status of the version check

With that status information the following variables are replaced:
has_license, has_license_files, has_hash, url_status

Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com>
---
 support/scripts/pkg-stats | 104 +++++++++++++++++++++++++-------------
 1 file changed, 69 insertions(+), 35 deletions(-)

diff --git a/support/scripts/pkg-stats b/support/scripts/pkg-stats
index ebaf04465e..7c8cc81cf2 100755
--- a/support/scripts/pkg-stats
+++ b/support/scripts/pkg-stats
@@ -66,18 +66,15 @@ class Package:
         self.path = path
         self.infras = None
         self.license = None
-        self.has_license = False
-        self.has_license_files = False
-        self.has_hash = False
         self.patch_count = 0
         self.patch_files = []
         self.warnings = 0
         self.current_version = None
         self.url = None
-        self.url_status = None
         self.url_worker = None
         self.cves = list()
         self.latest_version = {'status': RM_API_STATUS_ERROR, 'version': None, 'id': None}
+        self.status = {}
 
     def pkgvar(self):
         return self.name.upper().replace("-", "_")
@@ -86,17 +83,17 @@ class Package:
         """
         Fills in the .url field
         """
-        self.url_status = "No Config.in"
+        self.status['url'] = ("warning", "no Config.in")
         for filename in os.listdir(os.path.dirname(self.path)):
             if fnmatch.fnmatch(filename, 'Config.*'):
                 fp = open(os.path.join(os.path.dirname(self.path), filename), "r")
                 for config_line in fp:
                     if URL_RE.match(config_line):
                         self.url = config_line.strip()
-                        self.url_status = "Found"
+                        self.status['url'] = ("ok", "found")
                         fp.close()
                         return
-                self.url_status = "Missing"
+                self.status['url'] = ("warning", "missing")
                 fp.close()
 
     def set_infra(self):
@@ -118,31 +115,43 @@ class Package:
 
     def set_license(self):
         """
-        Fills in the .has_license and .has_license_files fields
+        Fills in the .status['license'] and .status['license-files'] fields
         """
         var = self.pkgvar()
+        self.status['license'] = ("error", "missing")
+        self.status['license-files'] = ("error", "missing")
         if var in self.all_licenses:
-            self.has_license = True
             self.license = self.all_licenses[var]
+            self.status['license'] = ("ok", "found")
         if var in self.all_license_files:
-            self.has_license_files = True
+            self.status['license-files'] = ("ok", "found")
 
     def set_hash_info(self):
         """
-        Fills in the .has_hash field
+        Fills in the .status['hash'] field
         """
         hashpath = self.path.replace(".mk", ".hash")
-        self.has_hash = os.path.exists(hashpath)
+        if os.path.exists(hashpath):
+            self.status['hash'] = ("ok", "found")
+        else:
+            self.status['hash'] = ("error", "missing")
 
     def set_patch_count(self):
         """
-        Fills in the .patch_count field
+        Fills in the .patch_count, .patch_files and .status['patches'] fields
         """
         pkgdir = os.path.dirname(self.path)
         for subdir, _, _ in os.walk(pkgdir):
             self.patch_files = fnmatch.filter(os.listdir(subdir), '*.patch')
             self.patch_count = len(self.patch_files)
 
+        if self.patch_count == 0:
+            self.status['patches'] = ("ok", "no patches")
+        elif self.patch_count < 5:
+            self.status['patches'] = ("warning", "some patches")
+        else:
+            self.status['patches'] = ("error", "lots of patches")
+
     def set_current_version(self):
         """
         Fills in the .current_version field
@@ -153,10 +162,11 @@ class Package:
 
     def set_check_package_warnings(self):
         """
-        Fills in the .warnings field
+        Fills in the .warnings and .status['pkg-check'] fields
         """
         cmd = ["./utils/check-package"]
         pkgdir = os.path.dirname(self.path)
+        self.status['pkg-check'] = ("error", "Missing")
         for root, dirs, files in os.walk(pkgdir):
             for f in files:
                 if f.endswith(".mk") or f.endswith(".hash") or f == "Config.in" or f == "Config.in.host":
@@ -167,6 +177,10 @@ class Package:
             m = re.match("^([0-9]*) warnings generated", line)
             if m:
                 self.warnings = int(m.group(1))
+                if self.warnings == 0:
+                    self.status['pkg-check'] = ("ok", "no warnings")
+                else:
+                    self.status['pkg-check'] = ("error", "{} warnings".format(self.warnings))
                 return
 
     def is_cve_ignored(self, cve):
@@ -177,13 +191,20 @@ class Package:
 
     def set_developers(self, developers):
         """
-        Fills in the .developers field
+        Fills in the .developers and .status['developers'] field
         """
         self.developers = list()
+        self.status['developers'] = ("warning", "no developers")
         for dev in developers:
             if dev.hasfile(self.path):
                 self.developers.append((dev.name))
 
+        if self.developers:
+            self.status['developers'] = ("ok", "{} developers".format(len(self.developers)))
+
+    def is_status_ok(self, name):
+        return self.status[name][0] == 'ok'
+
     def __eq__(self, other):
         return self.path == other.path
 
@@ -192,7 +213,7 @@ class Package:
 
     def __str__(self):
         return "%s (path='%s', license='%s', license_files='%s', hash='%s', patches=%d)" % \
-            (self.name, self.path, self.has_license, self.has_license_files, self.has_hash, self.patch_count)
+            (self.name, self.path, self.is_status_ok('license'), self.is_status_ok('license-files'), self.status['hash'], self.patch_count)
 
 
 class CVE:
@@ -406,23 +427,23 @@ def package_init_make_info():
 
 
 def check_url_status_worker(url, url_status):
-    if url_status != "Missing" and url_status != "No Config.in":
+    if url_status[0] == 'ok':
         try:
             url_status_code = requests.head(url, timeout=30).status_code
             if url_status_code >= 400:
-                return "Invalid(%s)" % str(url_status_code)
+                return ("error", "invalid {}".format(url_status_code))
         except requests.exceptions.RequestException:
-            return "Invalid(Err)"
-        return "Ok"
+            return ("error", "invalid (err)")
+        return ("ok", "valid")
     return url_status
 
 
 def check_package_urls(packages):
     Package.pool = Pool(processes=64)
     for pkg in packages:
-        pkg.url_worker = pkg.pool.apply_async(check_url_status_worker, (pkg.url, pkg.url_status))
+        pkg.url_worker = pkg.pool.apply_async(check_url_status_worker, (pkg.url, pkg.status['url']))
     for pkg in packages:
-        pkg.url_status = pkg.url_worker.get(timeout=3600)
+        pkg.status['url'] = pkg.url_worker.get(timeout=3600)
 
 
 def release_monitoring_get_latest_version_by_distro(pool, name):
@@ -496,6 +517,18 @@ def check_package_latest_version(packages):
         pkg.latest_version['status'] = r[0]
         pkg.latest_version['version'] = r[1]
         pkg.latest_version['id'] = r[2]
+
+        if pkg.latest_version['status'] == RM_API_STATUS_ERROR:
+            pkg.status['version'] = ('warning', 'RM API error')
+        elif pkg.latest_version['status'] == RM_API_STATUS_NOT_FOUND:
+            pkg.status['version'] = ('warning', 'RM package not found')
+
+        if pkg.latest_version['version'] is None:
+            pkg.status['version'] = ('warning', 'no upstream version available')
+        elif pkg.latest_version['version'] != pkg.current_version:
+            pkg.status['version'] = ('error', 'package needs update')
+        else:
+            pkg.status['version'] = ('ok', 'up-to-date')
     del http_pool
 
 
@@ -521,15 +554,15 @@ def calculate_stats(packages):
             stats["infra-%s" % infra] += 1
         else:
             stats["infra-unknown"] += 1
-        if pkg.has_license:
+        if pkg.is_status_ok('license'):
             stats["license"] += 1
         else:
             stats["no-license"] += 1
-        if pkg.has_license_files:
+        if pkg.is_status_ok('license-files'):
             stats["license-files"] += 1
         else:
             stats["no-license-files"] += 1
-        if pkg.has_hash:
+        if pkg.is_status_ok('hash'):
             stats["hash"] += 1
         else:
             stats["no-hash"] += 1
@@ -672,30 +705,30 @@ def dump_html_pkg(f, pkg):
 
     # License
     td_class = ["centered"]
-    if pkg.has_license:
+    if pkg.is_status_ok('license'):
         td_class.append("correct")
     else:
         td_class.append("wrong")
     f.write("  <td class=\"%s\">%s</td>\n" %
-            (" ".join(td_class), boolean_str(pkg.has_license)))
+            (" ".join(td_class), boolean_str(pkg.is_status_ok('license'))))
 
     # License files
     td_class = ["centered"]
-    if pkg.has_license_files:
+    if pkg.is_status_ok('license-files'):
         td_class.append("correct")
     else:
         td_class.append("wrong")
     f.write("  <td class=\"%s\">%s</td>\n" %
-            (" ".join(td_class), boolean_str(pkg.has_license_files)))
+            (" ".join(td_class), boolean_str(pkg.is_status_ok('license-files'))))
 
     # Hash
     td_class = ["centered"]
-    if pkg.has_hash:
+    if pkg.is_status_ok('hash'):
         td_class.append("correct")
     else:
         td_class.append("wrong")
     f.write("  <td class=\"%s\">%s</td>\n" %
-            (" ".join(td_class), boolean_str(pkg.has_hash)))
+            (" ".join(td_class), boolean_str(pkg.is_status_ok('hash'))))
 
     # Current version
     if len(pkg.current_version) > 20:
@@ -746,12 +779,13 @@ def dump_html_pkg(f, pkg):
 
     # URL status
     td_class = ["centered"]
-    url_str = pkg.url_status
-    if pkg.url_status == "Missing" or pkg.url_status == "No Config.in":
+    url_str = pkg.status['url'][1]
+    if pkg.status['url'][0] == "warning":
+        td_class.append("missing_url")
+    elif pkg.status['url'][0] == "error":
         td_class.append("missing_url")
-    elif pkg.url_status.startswith("Invalid"):
         td_class.append("invalid_url")
-        url_str = "<a href=%s>%s</a>" % (pkg.url, pkg.url_status)
+        url_str = "<a href=%s>%s</a>" % (pkg.url, pkg.status['url'][1])
     else:
         td_class.append("good_url")
         url_str = "<a href=%s>Link</a>" % pkg.url
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Buildroot] [PATCH v3 06/12] support/scripts/pkg-stats: add package count to stats
  2020-02-22  8:57 [Buildroot] [PATCH v3 00/12] pkg-stats json output improvements Heiko Thiery
                   ` (4 preceding siblings ...)
  2020-02-22  8:57 ` [Buildroot] [PATCH v3 05/12] support/scripts/pkg-stats: add package status Heiko Thiery
@ 2020-02-22  8:57 ` Heiko Thiery
  2020-02-23 14:40   ` Titouan Christophe
  2020-02-22  8:57 ` [Buildroot] [PATCH v3 07/12] support/scripts/pkg-stats: store pkg dir path Heiko Thiery
                   ` (6 subsequent siblings)
  12 siblings, 1 reply; 36+ messages in thread
From: Heiko Thiery @ 2020-02-22  8:57 UTC (permalink / raw)
  To: buildroot

Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com>
---
 support/scripts/pkg-stats | 1 +
 1 file changed, 1 insertion(+)

diff --git a/support/scripts/pkg-stats b/support/scripts/pkg-stats
index 7c8cc81cf2..90afcc2332 100755
--- a/support/scripts/pkg-stats
+++ b/support/scripts/pkg-stats
@@ -544,6 +544,7 @@ def check_package_cves(nvd_path, packages):
 
 def calculate_stats(packages):
     stats = defaultdict(int)
+    stats['packages'] = len(packages)
     for pkg in packages:
         # If packages have multiple infra, take the first one. For the
         # vast majority of packages, the target and host infra are the
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Buildroot] [PATCH v3 07/12] support/scripts/pkg-stats: store pkg dir path
  2020-02-22  8:57 [Buildroot] [PATCH v3 00/12] pkg-stats json output improvements Heiko Thiery
                   ` (5 preceding siblings ...)
  2020-02-22  8:57 ` [Buildroot] [PATCH v3 06/12] support/scripts/pkg-stats: add package count to stats Heiko Thiery
@ 2020-02-22  8:57 ` Heiko Thiery
  2020-02-23 15:20   ` Titouan Christophe
  2020-02-22  8:57 ` [Buildroot] [PATCH v3 08/12] support/scripts/pkg-stats: add defconfig support Heiko Thiery
                   ` (5 subsequent siblings)
  12 siblings, 1 reply; 36+ messages in thread
From: Heiko Thiery @ 2020-02-22  8:57 UTC (permalink / raw)
  To: buildroot

This value can be used for later processing.

In the buildroot-stats application this is used to create links pointing
to the git repo of buildroot.

Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com>
---
 support/scripts/pkg-stats | 1 +
 1 file changed, 1 insertion(+)

diff --git a/support/scripts/pkg-stats b/support/scripts/pkg-stats
index 90afcc2332..36b33586ef 100755
--- a/support/scripts/pkg-stats
+++ b/support/scripts/pkg-stats
@@ -64,6 +64,7 @@ class Package:
     def __init__(self, name, path):
         self.name = name
         self.path = path
+        self.pkg_path = os.path.dirname(path)
         self.infras = None
         self.license = None
         self.patch_count = 0
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Buildroot] [PATCH v3 08/12] support/scripts/pkg-stats: add defconfig support
  2020-02-22  8:57 [Buildroot] [PATCH v3 00/12] pkg-stats json output improvements Heiko Thiery
                   ` (6 preceding siblings ...)
  2020-02-22  8:57 ` [Buildroot] [PATCH v3 07/12] support/scripts/pkg-stats: store pkg dir path Heiko Thiery
@ 2020-02-22  8:57 ` Heiko Thiery
  2020-02-23 14:37   ` Titouan Christophe
  2020-02-22  8:57 ` [Buildroot] [PATCH v3 09/12] support/scripts/pkg-stats: add support for license hash check Heiko Thiery
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 36+ messages in thread
From: Heiko Thiery @ 2020-02-22  8:57 UTC (permalink / raw)
  To: buildroot

Scan configs directory and create Defconfig objects.

Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com>
---
 support/scripts/pkg-stats | 42 +++++++++++++++++++++++++++++++++++++--
 1 file changed, 40 insertions(+), 2 deletions(-)

diff --git a/support/scripts/pkg-stats b/support/scripts/pkg-stats
index 36b33586ef..ae70f90485 100755
--- a/support/scripts/pkg-stats
+++ b/support/scripts/pkg-stats
@@ -54,6 +54,33 @@ RM_API_STATUS_NOT_FOUND = 4
 # because it's used by sub-processes.
 http_pool = None
 
+class Defconfig:
+    def __init__(self, name, path):
+        self.name = name
+        self.path = path
+        self.developers = None
+
+    def set_developers(self, developers):
+        """
+        Fills in the .developers field
+        """
+        self.developers = list()
+        for dev in developers:
+            if dev.hasfile(self.path):
+                self.developers.append(dev.name)
+
+def get_defconfig_list():
+     """
+     Builds the list of Buildroot defconfigs, returning a list of Defconfig
+     objects.
+     """
+     defconfigs = list()
+     files = [f for f in os.listdir('configs')]
+     for name in files:
+         d = Defconfig(name[:-10], os.path.join('configs', name))
+         defconfigs.append(d)
+     return defconfigs
+
 
 class Package:
     all_licenses = dict()
@@ -882,7 +909,7 @@ def dump_html(packages, stats, date, commit, output):
         f.write(html_footer)
 
 
-def dump_json(packages, stats, date, commit, output):
+def dump_json(packages, defconfigs, stats, date, commit, output):
     # Format packages as a dictionnary instead of a list
     # Exclude local field that does not contains real date
     excluded_fields = ['url_worker', 'name']
@@ -893,6 +920,12 @@ def dump_json(packages, stats, date, commit, output):
             if k not in excluded_fields
         } for pkg in packages
     }
+    defconfigs = {
+        d.name: {
+            k: v
+            for k, v in d.__dict__.items()
+        } for d in defconfigs
+    }
     # Aggregate infrastructures into a single dict entry
     statistics = {
         k: v
@@ -903,6 +936,7 @@ def dump_json(packages, stats, date, commit, output):
     # The actual structure to dump, add commit and date to it
     final = {'packages': pkgs,
              'stats': statistics,
+             'defconfigs': defconfigs,
              'commit': commit,
              'date': str(date)}
 
@@ -944,6 +978,10 @@ def __main__():
     packages = get_pkglist(args.npackages, package_list)
     print("Getting developers ...")
     developers = parse_developers()
+    print("Build defconfig list ...")
+    defconfigs = get_defconfig_list()
+    for d in defconfigs:
+        d.set_developers(developers)
     print("Getting package make info ...")
     package_init_make_info()
     print("Getting package details ...")
@@ -970,7 +1008,7 @@ def __main__():
         dump_html(packages, stats, date, commit, args.html)
     if args.json:
         print("Write JSON")
-        dump_json(packages, stats, date, commit, args.json)
+        dump_json(packages, defconfigs, stats, date, commit, args.json)
 
 
 __main__()
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Buildroot] [PATCH v3 09/12] support/scripts/pkg-stats: add support for license hash check
  2020-02-22  8:57 [Buildroot] [PATCH v3 00/12] pkg-stats json output improvements Heiko Thiery
                   ` (7 preceding siblings ...)
  2020-02-22  8:57 ` [Buildroot] [PATCH v3 08/12] support/scripts/pkg-stats: add defconfig support Heiko Thiery
@ 2020-02-22  8:57 ` Heiko Thiery
  2020-02-23 16:02   ` Titouan Christophe
  2020-02-22  8:57 ` [Buildroot] [PATCH v3 10/12] support/scripts/pkg-stats: set status to 'na' for virtual packages Heiko Thiery
                   ` (3 subsequent siblings)
  12 siblings, 1 reply; 36+ messages in thread
From: Heiko Thiery @ 2020-02-22  8:57 UTC (permalink / raw)
  To: buildroot

Store the names of license files and check if they are in the hash file.

Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com>
---
 support/scripts/pkg-stats | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/support/scripts/pkg-stats b/support/scripts/pkg-stats
index ae70f90485..e954dd125e 100755
--- a/support/scripts/pkg-stats
+++ b/support/scripts/pkg-stats
@@ -84,7 +84,7 @@ def get_defconfig_list():
 
 class Package:
     all_licenses = dict()
-    all_license_files = list()
+    all_license_files = dict()
     all_versions = dict()
     all_ignored_cves = dict()
 
@@ -94,6 +94,7 @@ class Package:
         self.pkg_path = os.path.dirname(path)
         self.infras = None
         self.license = None
+        self.license_files = None
         self.patch_count = 0
         self.patch_files = []
         self.warnings = 0
@@ -152,6 +153,7 @@ class Package:
             self.license = self.all_licenses[var]
             self.status['license'] = ("ok", "found")
         if var in self.all_license_files:
+            self.license_files = self.all_license_files[var].split(' ')
             self.status['license-files'] = ("ok", "found")
 
     def set_hash_info(self):
@@ -159,8 +161,18 @@ class Package:
         Fills in the .status['hash'] field
         """
         hashpath = self.path.replace(".mk", ".hash")
+        self.status['hash-license'] = ("na", "no hash file")
         if os.path.exists(hashpath):
             self.status['hash'] = ("ok", "found")
+            self.status['hash-license'] = ("error", "no license in hash file")
+            # check if license files are in hash file
+            if self.license_files is not None:
+                self.status['hash-license'] = ("ok", "found")
+                with open(hashpath) as f:
+                    content = f.read()
+                    for license in self.license_files:
+                        if content.find(license) == -1:
+                            self.status['hash-license'] = ("error", "license missing in hash file")
         else:
             self.status['hash'] = ("error", "missing")
 
@@ -441,7 +453,7 @@ def package_init_make_info():
             if pkgvar.endswith("_MANIFEST_LICENSE_FILES"):
                 continue
             pkgvar = pkgvar[:-14]
-            Package.all_license_files.append(pkgvar)
+            Package.all_license_files[pkgvar] = value
 
         elif pkgvar.endswith("_VERSION"):
             if pkgvar.endswith("_DL_VERSION"):
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Buildroot] [PATCH v3 10/12] support/scripts/pkg-stats: set status to 'na' for virtual packages
  2020-02-22  8:57 [Buildroot] [PATCH v3 00/12] pkg-stats json output improvements Heiko Thiery
                   ` (8 preceding siblings ...)
  2020-02-22  8:57 ` [Buildroot] [PATCH v3 09/12] support/scripts/pkg-stats: add support for license hash check Heiko Thiery
@ 2020-02-22  8:57 ` Heiko Thiery
  2020-02-23 16:11   ` Titouan Christophe
  2020-02-22  8:57 ` [Buildroot] [PATCH v3 11/12] support/scripts/pkg-stats: initialize all package status checks Heiko Thiery
                   ` (2 subsequent siblings)
  12 siblings, 1 reply; 36+ messages in thread
From: Heiko Thiery @ 2020-02-22  8:57 UTC (permalink / raw)
  To: buildroot

If there is no infra set or infra is virtual the status is set to 'na'.

This is done for the follwing checks:
 - license
 - license-files
 - hash
 - hash-license
 - patches
 - version

Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com>
---
 support/scripts/pkg-stats | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/support/scripts/pkg-stats b/support/scripts/pkg-stats
index e954dd125e..be3b6d7e71 100755
--- a/support/scripts/pkg-stats
+++ b/support/scripts/pkg-stats
@@ -125,6 +125,14 @@ class Package:
                 self.status['url'] = ("warning", "missing")
                 fp.close()
 
+    def is_valid_infra(self):
+        try:
+            if self.infras[0][1] == 'virtual':
+                return False
+        except IndexError:
+                return False
+        return True
+
     def set_infra(self):
         """
         Fills in the .infras field
@@ -146,6 +154,11 @@ class Package:
         """
         Fills in the .status['license'] and .status['license-files'] fields
         """
+        if self.is_valid_infra() == False:
+            self.status['license'] = ("na", "no valid package infra")
+            self.status['license-files'] = ("na", "no valid package infra")
+            return
+
         var = self.pkgvar()
         self.status['license'] = ("error", "missing")
         self.status['license-files'] = ("error", "missing")
@@ -160,6 +173,11 @@ class Package:
         """
         Fills in the .status['hash'] field
         """
+        if self.is_valid_infra() == False:
+            self.status['hash'] = ("na", "no valid package infra")
+            self.status['hash-license'] = ("na", "no valid package infra")
+            return
+
         hashpath = self.path.replace(".mk", ".hash")
         self.status['hash-license'] = ("na", "no hash file")
         if os.path.exists(hashpath):
@@ -180,6 +198,10 @@ class Package:
         """
         Fills in the .patch_count, .patch_files and .status['patches'] fields
         """
+        if self.is_valid_infra() == False:
+            self.status['patches'] = ("na", "no valid package infra")
+            return
+
         pkgdir = os.path.dirname(self.path)
         for subdir, _, _ in os.walk(pkgdir):
             self.patch_files = fnmatch.filter(os.listdir(subdir), '*.patch')
@@ -558,6 +580,10 @@ def check_package_latest_version(packages):
         pkg.latest_version['version'] = r[1]
         pkg.latest_version['id'] = r[2]
 
+        if pkg.is_valid_infra() == False:
+            pkg.status['version'] = ("na", "no valid package infra")
+            continue
+
         if pkg.latest_version['status'] == RM_API_STATUS_ERROR:
             pkg.status['version'] = ('warning', 'RM API error')
         elif pkg.latest_version['status'] == RM_API_STATUS_NOT_FOUND:
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Buildroot] [PATCH v3 11/12] support/scripts/pkg-stats: initialize all package status checks
  2020-02-22  8:57 [Buildroot] [PATCH v3 00/12] pkg-stats json output improvements Heiko Thiery
                   ` (9 preceding siblings ...)
  2020-02-22  8:57 ` [Buildroot] [PATCH v3 10/12] support/scripts/pkg-stats: set status to 'na' for virtual packages Heiko Thiery
@ 2020-02-22  8:57 ` Heiko Thiery
  2020-02-23 14:09   ` Titouan Christophe
  2020-02-22  8:57 ` [Buildroot] [PATCH v3 12/12] support/scripts/pkg-stats: add status for cve check Heiko Thiery
  2020-02-23 16:26 ` [Buildroot] [PATCH v3 00/12] pkg-stats json output improvements Titouan Christophe
  12 siblings, 1 reply; 36+ messages in thread
From: Heiko Thiery @ 2020-02-22  8:57 UTC (permalink / raw)
  To: buildroot

Also the list of status checks is added to the json output.

Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com>
---
 support/scripts/pkg-stats | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/support/scripts/pkg-stats b/support/scripts/pkg-stats
index be3b6d7e71..ed22f6b650 100755
--- a/support/scripts/pkg-stats
+++ b/support/scripts/pkg-stats
@@ -87,6 +87,11 @@ class Package:
     all_license_files = dict()
     all_versions = dict()
     all_ignored_cves = dict()
+    # This is the list of all possible checks. Add new checks to this list so
+    # a tool that post-processeds the json output knows the checks before
+    # iterating over the packages.
+    status_checks = ['cve', 'developers', 'hash', 'hash-license', 'license',
+                     'license-files', 'patches', 'pkg-check', 'url', 'version']
 
     def __init__(self, name, path):
         self.name = name
@@ -103,7 +108,15 @@ class Package:
         self.url_worker = None
         self.cves = list()
         self.latest_version = {'status': RM_API_STATUS_ERROR, 'version': None, 'id': None}
+        self.init_check_status()
+
+    def init_check_status(self):
+        """
+        Set the status of all check to 'na'.
+        """
         self.status = {}
+        for check in self.status_checks:
+            self.status[check] = ("na", "no status check done")
 
     def pkgvar(self):
         return self.name.upper().replace("-", "_")
@@ -975,6 +988,7 @@ def dump_json(packages, defconfigs, stats, date, commit, output):
     final = {'packages': pkgs,
              'stats': statistics,
              'defconfigs': defconfigs,
+             'package_status_checks': Package.status_checks,
              'commit': commit,
              'date': str(date)}
 
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Buildroot] [PATCH v3 12/12] support/scripts/pkg-stats: add status for cve check
  2020-02-22  8:57 [Buildroot] [PATCH v3 00/12] pkg-stats json output improvements Heiko Thiery
                   ` (10 preceding siblings ...)
  2020-02-22  8:57 ` [Buildroot] [PATCH v3 11/12] support/scripts/pkg-stats: initialize all package status checks Heiko Thiery
@ 2020-02-22  8:57 ` Heiko Thiery
  2020-02-23 14:24   ` Titouan Christophe
  2020-02-23 16:26 ` [Buildroot] [PATCH v3 00/12] pkg-stats json output improvements Titouan Christophe
  12 siblings, 1 reply; 36+ messages in thread
From: Heiko Thiery @ 2020-02-22  8:57 UTC (permalink / raw)
  To: buildroot

Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com>
---
 support/scripts/pkg-stats | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/support/scripts/pkg-stats b/support/scripts/pkg-stats
index ed22f6b650..4efff8624f 100755
--- a/support/scripts/pkg-stats
+++ b/support/scripts/pkg-stats
@@ -617,8 +617,14 @@ def check_package_cves(nvd_path, packages):
 
     for cve in CVE.read_nvd_dir(nvd_path):
         for pkg_name in cve.pkg_names:
-            if pkg_name in packages and cve.affects(packages[pkg_name]):
-                packages[pkg_name].cves.append(cve.identifier)
+            if pkg_name in packages:
+                if cve.affects(packages[pkg_name]):
+                    packages[pkg_name].cves.append(cve.identifier)
+                if len(packages[pkg_name].cves):
+                    packages[pkg_name].status['cve'] = ('error', 'affected by cve')
+                else:
+                    packages[pkg_name].status['cve'] = ('ok', 'no cve found')
+
 
 
 def calculate_stats(packages):
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Buildroot] [PATCH v3 01/12] support/scripts/pkg-stats: store latest version in a dict
  2020-02-22  8:57 ` [Buildroot] [PATCH v3 01/12] support/scripts/pkg-stats: store latest version in a dict Heiko Thiery
@ 2020-02-23 13:26   ` Titouan Christophe
  2020-02-23 21:41     ` Heiko Thiery
  0 siblings, 1 reply; 36+ messages in thread
From: Titouan Christophe @ 2020-02-23 13:26 UTC (permalink / raw)
  To: buildroot

Hello Heiko and all,

On 2/22/20 9:57 AM, Heiko Thiery wrote:
> From: Heiko Thiery <heiko.thiery@kontron.com>
> 
> This patch changes the type of the latest_version variable to a dict.
> This is for better readability/usability of the data. With this the json
> output is more descriptive in later processing of the json output.
> 
> Signed-off-by: Heiko Thiery <heiko.thiery@kontron.com>
> Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com>

You signed off with both your personal and work email addresses, was 
this intended ?

> ---
>   support/scripts/pkg-stats | 33 +++++++++++++++++----------------
>   1 file changed, 17 insertions(+), 16 deletions(-)
> 
> diff --git a/support/scripts/pkg-stats b/support/scripts/pkg-stats
> index 7721d98459..8cc78f2f66 100755
> --- a/support/scripts/pkg-stats
> +++ b/support/scripts/pkg-stats
> @@ -71,7 +71,7 @@ class Package:
>           self.url_status = None
>           self.url_worker = None
>           self.cves = list()
> -        self.latest_version = (RM_API_STATUS_ERROR, None, None)
> +        self.latest_version = {'status': RM_API_STATUS_ERROR, 'version': None, 'id': None}
>   
>       def pkgvar(self):
>           return self.name.upper().replace("-", "_")
> @@ -460,9 +460,8 @@ def check_package_latest_version(packages):
>       """
>       Fills in the .latest_version field of all Package objects
>   
> -    This field has a special format:
> -      (status, version, id)
> -    with:
> +    This field is a dict and has the following keys:
> +
>       - status: one of RM_API_STATUS_ERROR,
>         RM_API_STATUS_FOUND_BY_DISTRO, RM_API_STATUS_FOUND_BY_PATTERN,
>         RM_API_STATUS_NOT_FOUND
> @@ -478,7 +477,9 @@ def check_package_latest_version(packages):
>       worker_pool = Pool(processes=64)
>       results = worker_pool.map(check_package_latest_version_worker, (pkg.name for pkg in packages))
>       for pkg, r in zip(packages, results):
> -        pkg.latest_version = r
> +        pkg.latest_version['status'] = r[0]
> +        pkg.latest_version['version'] = r[1]
> +        pkg.latest_version['id'] = r[2]

Bikeshedding, but maybe more elegant:

pkg.latest_version = dict(zip(['status', 'version', 'id'], r))

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [Buildroot] [PATCH v3 02/12] support/scripts/pkg-stats: store patch files for the package
  2020-02-22  8:57 ` [Buildroot] [PATCH v3 02/12] support/scripts/pkg-stats: store patch files for the package Heiko Thiery
@ 2020-02-23 13:35   ` Titouan Christophe
  2020-02-23 21:23     ` Heiko Thiery
  0 siblings, 1 reply; 36+ messages in thread
From: Titouan Christophe @ 2020-02-23 13:35 UTC (permalink / raw)
  To: buildroot

Heiko,


On 2/22/20 9:57 AM, Heiko Thiery wrote:
> From: Heiko Thiery <heiko.thiery@kontron.com>
> 
> Signed-off-by: Heiko Thiery <heiko.thiery@kontron.com>
> Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com>

Same as the previous patch, signed off with 2 different email addresses.

> ---
>   support/scripts/pkg-stats | 5 +++--
>   1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/support/scripts/pkg-stats b/support/scripts/pkg-stats
> index 8cc78f2f66..4c963cef0f 100755
> --- a/support/scripts/pkg-stats
> +++ b/support/scripts/pkg-stats
> @@ -65,6 +65,7 @@ class Package:
>           self.has_license_files = False
>           self.has_hash = False
>           self.patch_count = 0
> +        self.patch_files = []
>           self.warnings = 0
>           self.current_version = None
>           self.url = None
> @@ -131,10 +132,10 @@ class Package:
>           """
>           Fills in the .patch_count field
>           """
> -        self.patch_count = 0
>           pkgdir = os.path.dirname(self.path)
>           for subdir, _, _ in os.walk(pkgdir):
> -            self.patch_count += len(fnmatch.filter(os.listdir(subdir), '*.patch'))
> +            self.patch_files = fnmatch.filter(os.listdir(subdir), '*.patch')
> +            self.patch_count = len(self.patch_files)

We can compute the patch_count only once after looping through all the 
files, no need to update it at each step.

Maybe we can also entirely remove this patch_count attribute from the 
Package class, and define it as a property:

class Package:
     # ...
     @property
     def patch_count(self):
         return len(self.patch_files)

>   
>       def set_current_version(self):
>           """
> 


Best regards,

Titouan

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [Buildroot] [PATCH v3 03/12] support/scripts/pkg-stats: set developers info
  2020-02-22  8:57 ` [Buildroot] [PATCH v3 03/12] support/scripts/pkg-stats: set developers info Heiko Thiery
@ 2020-02-23 13:45   ` Titouan Christophe
  2020-02-23 21:37     ` Heiko Thiery
  0 siblings, 1 reply; 36+ messages in thread
From: Titouan Christophe @ 2020-02-23 13:45 UTC (permalink / raw)
  To: buildroot

Heiko, all,
On 2/22/20 9:57 AM, Heiko Thiery wrote:
> Use the function 'parse_developers' function from getdeveloperlib that
> collect the information about the developers and the files they
> maintain. Then set the maintainer(s) to each package.
> 
> Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com>
> ---
>   support/scripts/pkg-stats | 16 ++++++++++++++++
>   1 file changed, 16 insertions(+)
> 
> diff --git a/support/scripts/pkg-stats b/support/scripts/pkg-stats
> index 4c963cef0f..643272e9d2 100755
> --- a/support/scripts/pkg-stats
> +++ b/support/scripts/pkg-stats
> @@ -30,10 +30,14 @@ import certifi
>   import distutils.version
>   import time
>   import gzip
> +import sys
>   from urllib3 import HTTPSConnectionPool
>   from urllib3.exceptions import HTTPError
>   from multiprocessing import Pool
>   
> +sys.path.append('utils/')
> +from getdeveloperlib import parse_developers
> + >   NVD_START_YEAR = 2002
>   NVD_JSON_VERSION = "1.0"
>   NVD_BASE_URL = "https://nvd.nist.gov/feeds/json/cve/" + NVD_JSON_VERSION
> @@ -169,6 +173,15 @@ class Package:
>           """
>           return cve in self.all_ignored_cves.get(self.pkgvar(), [])
>   
> +    def set_developers(self, developers):
> +        """
> +        Fills in the .developers field
> +        """
> +        self.developers = list()
> +        for dev in developers:
> +            if dev.hasfile(self.path):
> +                self.developers.append((dev.name))

Bikeshedding again, but maybe more elegant like this:

self.developers = [
     dev.name
     for dev in developers
     if dev.hasfile(self.path)
]

> +
>       def __eq__(self, other):
>           return self.path == other.path
>   
> @@ -891,6 +904,8 @@ def __main__():
>                                         'HEAD']).splitlines()[0]
>       print("Build package list ...")
>       packages = get_pkglist(args.npackages, package_list)
> +    print("Getting developers ...")
> +    developers = parse_developers()
>       print("Getting package make info ...")
>       package_init_make_info()
>       print("Getting package details ...")
> @@ -902,6 +917,7 @@ def __main__():
>           pkg.set_check_package_warnings()
>           pkg.set_current_version()
>           pkg.set_url()
> +        pkg.set_developers(developers)
>       print("Checking URL status")
>       check_package_urls(packages)
>       print("Getting latest versions ...")
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [Buildroot] [PATCH v3 11/12] support/scripts/pkg-stats: initialize all package status checks
  2020-02-22  8:57 ` [Buildroot] [PATCH v3 11/12] support/scripts/pkg-stats: initialize all package status checks Heiko Thiery
@ 2020-02-23 14:09   ` Titouan Christophe
  2020-02-24  7:28     ` Heiko Thiery
  0 siblings, 1 reply; 36+ messages in thread
From: Titouan Christophe @ 2020-02-23 14:09 UTC (permalink / raw)
  To: buildroot

Heiko, all,

On 2/22/20 9:57 AM, Heiko Thiery wrote:
> Also the list of status checks is added to the json output.
> 
> Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com> > ---
>   support/scripts/pkg-stats | 14 ++++++++++++++
>   1 file changed, 14 insertions(+)
> 
> diff --git a/support/scripts/pkg-stats b/support/scripts/pkg-stats
> index be3b6d7e71..ed22f6b650 100755
> --- a/support/scripts/pkg-stats
> +++ b/support/scripts/pkg-stats
> @@ -87,6 +87,11 @@ class Package:
>       all_license_files = dict()
>       all_versions = dict()
>       all_ignored_cves = dict()
> +    # This is the list of all possible checks. Add new checks to this list so
> +    # a tool that post-processeds the json output knows the checks before
> +    # iterating over the packages.
> +    status_checks = ['cve', 'developers', 'hash', 'hash-license', 'license',
> +                     'license-files', 'patches', 'pkg-check', 'url', 'version']
>   
>       def __init__(self, name, path):
>           self.name = name
> @@ -103,7 +108,15 @@ class Package:
>           self.url_worker = None
>           self.cves = list()
>           self.latest_version = {'status': RM_API_STATUS_ERROR, 'version': None, 'id': None}
> +        self.init_check_status()
> +
> +    def init_check_status(self):
> +        """
> +        Set the status of all check to 'na'.
> +        """
>           self.status = {}
> +        for check in self.status_checks:
> +            self.status[check] = ("na", "no status check done")

In my opinion, it would be better to leave this dictionary empty at 
initialization.

Because we know the list of status checks that have been run (from the 
'package_status_checks' list you introduce below), we can tell that a 
particular status check has not been done if its name is missing from 
the status dictionary. This would make the outputted json a tad smaller.

>   
>       def pkgvar(self):
>           return self.name.upper().replace("-", "_")
> @@ -975,6 +988,7 @@ def dump_json(packages, defconfigs, stats, date, commit, output):
>       final = {'packages': pkgs,
>                'stats': statistics,
>                'defconfigs': defconfigs,
> +             'package_status_checks': Package.status_checks,
>                'commit': commit,
>                'date': str(date)}
>   
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [Buildroot] [PATCH v3 12/12] support/scripts/pkg-stats: add status for cve check
  2020-02-22  8:57 ` [Buildroot] [PATCH v3 12/12] support/scripts/pkg-stats: add status for cve check Heiko Thiery
@ 2020-02-23 14:24   ` Titouan Christophe
  2020-02-24  7:06     ` Heiko Thiery
  0 siblings, 1 reply; 36+ messages in thread
From: Titouan Christophe @ 2020-02-23 14:24 UTC (permalink / raw)
  To: buildroot

Heiko, all,

On 2/22/20 9:57 AM, Heiko Thiery wrote:
> Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com>
> ---
>   support/scripts/pkg-stats | 10 ++++++++--
>   1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/support/scripts/pkg-stats b/support/scripts/pkg-stats
> index ed22f6b650..4efff8624f 100755
> --- a/support/scripts/pkg-stats
> +++ b/support/scripts/pkg-stats
> @@ -617,8 +617,14 @@ def check_package_cves(nvd_path, packages):
>   
>       for cve in CVE.read_nvd_dir(nvd_path):
>           for pkg_name in cve.pkg_names:
> -            if pkg_name in packages and cve.affects(packages[pkg_name]):
> -                packages[pkg_name].cves.append(cve.identifier)
> +            if pkg_name in packages:
> +                if cve.affects(packages[pkg_name]):
> +                    packages[pkg_name].cves.append(cve.identifier)
> +                if len(packages[pkg_name].cves):
> +                    packages[pkg_name].status['cve'] = ('error', 'affected by cve')
> +                else:
> +                    packages[pkg_name].status['cve'] = ('ok', 'no cve found')
> +
>   
>   
>   def calculate_stats(packages):
> 

In the current state, that would give:

* If a CVE applies to a br package -> error
* If a CVE does not applies to a br package -> ok
* If no CVE points to a br package -> na (no status check done)

I would argue that the latest case is not correct. The status should be 
ok, because we ran through the whole list of CVEs from the NVD feed, and 
we did not find any of them that applies to this package.

I would rather write it like this:

########################
diff --git a/support/scripts/pkg-stats b/support/scripts/pkg-stats
index ed22f6b650..91477d583e 100755
--- a/support/scripts/pkg-stats
+++ b/support/scripts/pkg-stats
@@ -620,6 +620,12 @@ def check_package_cves(nvd_path, packages):
              if pkg_name in packages and cve.affects(packages[pkg_name]):
                  packages[pkg_name].cves.append(cve.identifier)

+    for pkg_name, pkg in packages.items():
+        if len(pkg.cves) > 0:
+            pkg.status['cve'] = ('error', 'affected by CVE(s)')
+        else:
+            pkg.status['cve'] = ('ok', 'no CVE found')
+

  def calculate_stats(packages):
      stats = defaultdict(int)
########################


Best regards,

Titouan

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Buildroot] [PATCH v3 08/12] support/scripts/pkg-stats: add defconfig support
  2020-02-22  8:57 ` [Buildroot] [PATCH v3 08/12] support/scripts/pkg-stats: add defconfig support Heiko Thiery
@ 2020-02-23 14:37   ` Titouan Christophe
  0 siblings, 0 replies; 36+ messages in thread
From: Titouan Christophe @ 2020-02-23 14:37 UTC (permalink / raw)
  To: buildroot

Heiko, all,

Nice to add the defconfigs to the stats !

On 2/22/20 9:57 AM, Heiko Thiery wrote:
> Scan configs directory and create Defconfig objects.
> 
> Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com>
> ---
>   support/scripts/pkg-stats | 42 +++++++++++++++++++++++++++++++++++++--
>   1 file changed, 40 insertions(+), 2 deletions(-)
> 
> diff --git a/support/scripts/pkg-stats b/support/scripts/pkg-stats
> index 36b33586ef..ae70f90485 100755
> --- a/support/scripts/pkg-stats
> +++ b/support/scripts/pkg-stats
> @@ -54,6 +54,33 @@ RM_API_STATUS_NOT_FOUND = 4
>   # because it's used by sub-processes.
>   http_pool = None
>   
> +class Defconfig:
> +    def __init__(self, name, path):
> +        self.name = name
> +        self.path = path
> +        self.developers = None
> +
> +    def set_developers(self, developers):
> +        """
> +        Fills in the .developers field
> +        """
> +        self.developers = list()
> +        for dev in developers:
> +            if dev.hasfile(self.path):
> +                self.developers.append(dev.name)

self.developers = [
     dev.name
     for dev in developers
     if dev.hasfile(self.path)
]

> +
> +def get_defconfig_list():
> +     """
> +     Builds the list of Buildroot defconfigs, returning a list of Defconfig
> +     objects.
> +     """
> +     defconfigs = list()
> +     files = [f for f in os.listdir('configs')]
> +     for name in files:
> +         d = Defconfig(name[:-10], os.path.join('configs', name))
> +         defconfigs.append(d)
> +     return defconfigs

return [
     Defconfig(name[:-10], os.path.join('configs', name))
     for name in os.listdir('configs')
]

> +
>   
>   class Package:
>       all_licenses = dict()
> @@ -882,7 +909,7 @@ def dump_html(packages, stats, date, commit, output):
>           f.write(html_footer)
>   
>   
> -def dump_json(packages, stats, date, commit, output):
> +def dump_json(packages, defconfigs, stats, date, commit, output):
>       # Format packages as a dictionnary instead of a list
>       # Exclude local field that does not contains real date
>       excluded_fields = ['url_worker', 'name']
> @@ -893,6 +920,12 @@ def dump_json(packages, stats, date, commit, output):
>               if k not in excluded_fields
>           } for pkg in packages
>       }
> +    defconfigs = {
> +        d.name: {
> +            k: v
> +            for k, v in d.__dict__.items()
> +        } for d in defconfigs
> +    }
>       # Aggregate infrastructures into a single dict entry
>       statistics = {
>           k: v
> @@ -903,6 +936,7 @@ def dump_json(packages, stats, date, commit, output):
>       # The actual structure to dump, add commit and date to it
>       final = {'packages': pkgs,
>                'stats': statistics,
> +             'defconfigs': defconfigs,
>                'commit': commit,
>                'date': str(date)}
>   
> @@ -944,6 +978,10 @@ def __main__():
>       packages = get_pkglist(args.npackages, package_list)
>       print("Getting developers ...")
>       developers = parse_developers()
> +    print("Build defconfig list ...")
> +    defconfigs = get_defconfig_list()
> +    for d in defconfigs:
> +        d.set_developers(developers)
>       print("Getting package make info ...")
>       package_init_make_info()
>       print("Getting package details ...")
> @@ -970,7 +1008,7 @@ def __main__():
>           dump_html(packages, stats, date, commit, args.html)
>       if args.json:
>           print("Write JSON")
> -        dump_json(packages, stats, date, commit, args.json)
> +        dump_json(packages, defconfigs, stats, date, commit, args.json)
>   
>   
>   __main__()
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [Buildroot] [PATCH v3 06/12] support/scripts/pkg-stats: add package count to stats
  2020-02-22  8:57 ` [Buildroot] [PATCH v3 06/12] support/scripts/pkg-stats: add package count to stats Heiko Thiery
@ 2020-02-23 14:40   ` Titouan Christophe
  0 siblings, 0 replies; 36+ messages in thread
From: Titouan Christophe @ 2020-02-23 14:40 UTC (permalink / raw)
  To: buildroot

Heiko, all,

On 2/22/20 9:57 AM, Heiko Thiery wrote:
> Signed-off-by: Heiko Thiery <heiko.thiery at gmail.

Reviewed-by: Titouan Christophe <titouan.christophe@railnova.eu>
Tested-by: Titouan Christophe <titouan.christophe@railnova.eu>

> ---
>   support/scripts/pkg-stats | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/support/scripts/pkg-stats b/support/scripts/pkg-stats
> index 7c8cc81cf2..90afcc2332 100755
> --- a/support/scripts/pkg-stats
> +++ b/support/scripts/pkg-stats
> @@ -544,6 +544,7 @@ def check_package_cves(nvd_path, packages):
>   
>   def calculate_stats(packages):
>       stats = defaultdict(int)
> +    stats['packages'] = len(packages)
>       for pkg in packages:
>           # If packages have multiple infra, take the first one. For the
>           # vast majority of packages, the target and host infra are the
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [Buildroot] [PATCH v3 05/12] support/scripts/pkg-stats: add package status
  2020-02-22  8:57 ` [Buildroot] [PATCH v3 05/12] support/scripts/pkg-stats: add package status Heiko Thiery
@ 2020-02-23 15:19   ` Titouan Christophe
  2020-02-24  8:03     ` Heiko Thiery
  0 siblings, 1 reply; 36+ messages in thread
From: Titouan Christophe @ 2020-02-23 15:19 UTC (permalink / raw)
  To: buildroot

Heiko, all,

First of all, I really like this idea, as it makes the status checks 
quite easier to extend !

On 2/22/20 9:57 AM, Heiko Thiery wrote:
> Unify the status check information. The status is stored in a tuple. The
> first entry is the status that can be 'ok', 'warning' or 'error'. The
> second entry is a verbose message.
> 
> The following checks are performed:
> - url: status of the URL check
> - license: status of the license presence check
> - license-files: status of the license file check
> - hash: status of the hash file presence check
> - patches: status of the patches count check
> - pkg-check: status of the check-package script result
> - developers: status if a package has developers in the DEVELOPERS file
> - version: status of the version check
> 
> With that status information the following variables are replaced:
> has_license, has_license_files, has_hash, url_status
> 
> Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com>
> ---
>   support/scripts/pkg-stats | 104 +++++++++++++++++++++++++-------------
>   1 file changed, 69 insertions(+), 35 deletions(-)
> 
> diff --git a/support/scripts/pkg-stats b/support/scripts/pkg-stats
> index ebaf04465e..7c8cc81cf2 100755
> --- a/support/scripts/pkg-stats
> +++ b/support/scripts/pkg-stats
> @@ -66,18 +66,15 @@ class Package:
>           self.path = path
>           self.infras = None
>           self.license = None
> -        self.has_license = False
> -        self.has_license_files = False
> -        self.has_hash = False
>           self.patch_count = 0
>           self.patch_files = []
>           self.warnings = 0
>           self.current_version = None
>           self.url = None
> -        self.url_status = None
>           self.url_worker = None
>           self.cves = list()
>           self.latest_version = {'status': RM_API_STATUS_ERROR, 'version': None, 'id': None}
> +        self.status = {}
>   
>       def pkgvar(self):
>           return self.name.upper().replace("-", "_")
> @@ -86,17 +83,17 @@ class Package:
>           """
>           Fills in the .url field
>           """
> -        self.url_status = "No Config.in"
> +        self.status['url'] = ("warning", "no Config.in")
>           for filename in os.listdir(os.path.dirname(self.path)):
>               if fnmatch.fnmatch(filename, 'Config.*'):
>                   fp = open(os.path.join(os.path.dirname(self.path), filename), "r")
>                   for config_line in fp:
>                       if URL_RE.match(config_line):
>                           self.url = config_line.strip()
> -                        self.url_status = "Found"
> +                        self.status['url'] = ("ok", "found")
>                           fp.close()
>                           return
> -                self.url_status = "Missing"
> +                self.status['url'] = ("warning", "missing")

I would set ("error", "missing") for consistency with what is done elsewhere

>                   fp.close()
>   
>       def set_infra(self):

[-- SNIP --]

> @@ -496,6 +517,18 @@ def check_package_latest_version(packages):
>           pkg.latest_version['status'] = r[0]
>           pkg.latest_version['version'] = r[1]
>           pkg.latest_version['id'] = r[2]
> +
> +        if pkg.latest_version['status'] == RM_API_STATUS_ERROR:
> +            pkg.status['version'] = ('warning', 'RM API error')
> +        elif pkg.latest_version['status'] == RM_API_STATUS_NOT_FOUND:
> +            pkg.status['version'] = ('warning', 'RM package not found')
> +
> +        if pkg.latest_version['version'] is None:
> +            pkg.status['version'] = ('warning', 'no upstream version available')
> +        elif pkg.latest_version['version'] != pkg.current_version:
> +            pkg.status['version'] = ('error', 'package needs update')

We can make the messages a bit more explicit:
* "Release Monitoring API error"
* "Package not found on Release Monitoring"
* "The newer version {} is available 
upstream".format(pkg.latest_version['version'])

> +        else:
> +            pkg.status['version'] = ('ok', 'up-to-date')
>       del http_pool
>   
>   

[-- SNIP --]

> @@ -746,12 +779,13 @@ def dump_html_pkg(f, pkg):
>   
>       # URL status
>       td_class = ["centered"]
> -    url_str = pkg.url_status
> -    if pkg.url_status == "Missing" or pkg.url_status == "No Config.in":
> +    url_str = pkg.status['url'][1]
> +    if pkg.status['url'][0] == "warning":
> +        td_class.append("missing_url")
> +    elif pkg.status['url'][0] == "error":
>           td_class.append("missing_url")

The above 2 conditions can be simplified like:

+    if pkg.status['url'][0] in ("error", "warning"):
+        td_class.append("missing_url")

> -    elif pkg.url_status.startswith("Invalid"):
>           td_class.append("invalid_url")
> -        url_str = "<a href=%s>%s</a>" % (pkg.url, pkg.url_status)
> +        url_str = "<a href=%s>%s</a>" % (pkg.url, pkg.status['url'][1])
>       else:
>           td_class.append("good_url")
>           url_str = "<a href=%s>Link</a>" % pkg.url
> 


Best regards,

Titouan

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [Buildroot] [PATCH v3 07/12] support/scripts/pkg-stats: store pkg dir path
  2020-02-22  8:57 ` [Buildroot] [PATCH v3 07/12] support/scripts/pkg-stats: store pkg dir path Heiko Thiery
@ 2020-02-23 15:20   ` Titouan Christophe
  2020-02-24  8:04     ` Heiko Thiery
  0 siblings, 1 reply; 36+ messages in thread
From: Titouan Christophe @ 2020-02-23 15:20 UTC (permalink / raw)
  To: buildroot

Hello Heiko,
On 2/22/20 9:57 AM, Heiko Thiery wrote:
> This value can be used for later processing.
> 
> In the buildroot-stats application this is used to create links pointing
> to the git repo of buildroot.
> 
> Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com>

Reviewed-by: Titouan Christophe <titouan.christophe@railnova.eu>
Tested-by: Titouan Christophe <titouan.christophe@railnova.eu>

> ---
>   support/scripts/pkg-stats | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/support/scripts/pkg-stats b/support/scripts/pkg-stats
> index 90afcc2332..36b33586ef 100755
> --- a/support/scripts/pkg-stats
> +++ b/support/scripts/pkg-stats
> @@ -64,6 +64,7 @@ class Package:
>       def __init__(self, name, path):
>           self.name = name
>           self.path = path
> +        self.pkg_path = os.path.dirname(path)
>           self.infras = None
>           self.license = None
>           self.patch_count = 0
> 

Regards,

Titouan

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [Buildroot] [PATCH v3 04/12] support/scripts/pkg-stats: store licences of package
  2020-02-22  8:57 ` [Buildroot] [PATCH v3 04/12] support/scripts/pkg-stats: store licences of package Heiko Thiery
@ 2020-02-23 15:27   ` Titouan Christophe
  0 siblings, 0 replies; 36+ messages in thread
From: Titouan Christophe @ 2020-02-23 15:27 UTC (permalink / raw)
  To: buildroot

Heiko and all,

On 2/22/20 9:57 AM, Heiko Thiery wrote:
> Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com>

Reviewed-by: Titouan Christophe <titouan.christophe@railnova.eu>
Tested-by: Titouan Christophe <titouan.christophe@railnova.eu>

> ---
>   support/scripts/pkg-stats | 6 ++++--
>   1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/support/scripts/pkg-stats b/support/scripts/pkg-stats
> index 643272e9d2..ebaf04465e 100755
> --- a/support/scripts/pkg-stats
> +++ b/support/scripts/pkg-stats
> @@ -56,7 +56,7 @@ http_pool = None
>   
>   
>   class Package:
> -    all_licenses = list()
> +    all_licenses = dict()
>       all_license_files = list()
>       all_versions = dict()
>       all_ignored_cves = dict()
> @@ -65,6 +65,7 @@ class Package:
>           self.name = name
>           self.path = path
>           self.infras = None
> +        self.license = None
>           self.has_license = False
>           self.has_license_files = False
>           self.has_hash = False
> @@ -122,6 +123,7 @@ class Package >           var = self.pkgvar()
>           if var in self.all_licenses:
>               self.has_license = True
> +            self.license = self.all_licenses[var]
>           if var in self.all_license_files:
>               self.has_license_files = True
>   
> @@ -384,7 +386,7 @@ def package_init_make_info():
>               if value == "unknown":
>                   continue
>               pkgvar = pkgvar[:-8]
> -            Package.all_licenses.append(pkgvar)
> +            Package.all_licenses[pkgvar] = value
>   
>           elif pkgvar.endswith("_LICENSE_FILES"):
>               if pkgvar.endswith("_MANIFEST_LICENSE_FILES"):
> 

Best regards,

Titouan

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [Buildroot] [PATCH v3 09/12] support/scripts/pkg-stats: add support for license hash check
  2020-02-22  8:57 ` [Buildroot] [PATCH v3 09/12] support/scripts/pkg-stats: add support for license hash check Heiko Thiery
@ 2020-02-23 16:02   ` Titouan Christophe
  0 siblings, 0 replies; 36+ messages in thread
From: Titouan Christophe @ 2020-02-23 16:02 UTC (permalink / raw)
  To: buildroot

Heiko, all,


On 2/22/20 9:57 AM, Heiko Thiery wrote:
> Store the names of license files and check if they are in the hash file.
> 
> Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com>

Reviewed-by: Titouan Christophe <titouan.christophe@railnova.eu>
Tested-by: Titouan Christophe <titouan.christophe@railnova.eu>

> ---
>   support/scripts/pkg-stats | 16 ++++++++++++++--
>   1 file changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/support/scripts/pkg-stats b/support/scripts/pkg-stats
> index ae70f90485..e954dd125e 100755
> --- a/support/scripts/pkg-stats
> +++ b/support/scripts/pkg-stats
> @@ -84,7 +84,7 @@ def get_defconfig_list():
>   
>   class Package:
>       all_licenses = dict()
> -    all_license_files = list()
> +    all_license_files = dict()
>       all_versions = dict()
>       all_ignored_cves = dict()
>   
> @@ -94,6 +94,7 @@ class Package:
>           self.pkg_path = os.path.dirname(path)
>           self.infras = None
>           self.license = None
> +        self.license_files = None
>           self.patch_count = 0
>           self.patch_files = []
>           self.warnings = 0
> @@ -152,6 +153,7 @@ class Package:
>               self.license = self.all_licenses[var]
>               self.status['license'] = ("ok", "found")
>           if var in self.all_license_files:
> +            self.license_files = self.all_license_files[var].split(' ')
>               self.status['license-files'] = ("ok", "found")
>   
>       def set_hash_info(self):
> @@ -159,8 +161,18 @@ class Package:
>           Fills in the .status['hash'] field
>           """
>           hashpath = self.path.replace(".mk", ".hash")
> +        self.status['hash-license'] = ("na", "no hash file")
>           if os.path.exists(hashpath):
>               self.status['hash'] = ("ok", "found")
> +            self.status['hash-license'] = ("error", "no license in hash file")
> +            # check if license files are in hash file
> +            if self.license_files is not None:
> +                self.status['hash-license'] = ("ok", "found")
> +                with open(hashpath) as f:
> +                    content = f.read()
> +                    for license in self.license_files:
> +                        if content.find(license) == -1:
> +                            self.status['hash-license'] = ("error", "license missing in hash file")
>           else:
>               self.status['hash'] = ("error", "missing")
>   
> @@ -441,7 +453,7 @@ def package_init_make_info():
>               if pkgvar.endswith("_MANIFEST_LICENSE_FILES"):
>                   continue
>               pkgvar = pkgvar[:-14]
> -            Package.all_license_files.append(pkgvar)
> +            Package.all_license_files[pkgvar] = value
>   
>           elif pkgvar.endswith("_VERSION"):
>               if pkgvar.endswith("_DL_VERSION"):
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [Buildroot] [PATCH v3 10/12] support/scripts/pkg-stats: set status to 'na' for virtual packages
  2020-02-22  8:57 ` [Buildroot] [PATCH v3 10/12] support/scripts/pkg-stats: set status to 'na' for virtual packages Heiko Thiery
@ 2020-02-23 16:11   ` Titouan Christophe
  2020-02-24  8:22     ` Heiko Thiery
  0 siblings, 1 reply; 36+ messages in thread
From: Titouan Christophe @ 2020-02-23 16:11 UTC (permalink / raw)
  To: buildroot

Hello Heiko and all,

On 2/22/20 9:57 AM, Heiko Thiery wrote:
> If there is no infra set or infra is virtual the status is set to 'na'.
> 
> This is done for the follwing checks:
>   - license
>   - license-files
>   - hash
>   - hash-license
>   - patches
>   - version
> 
> Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com>
> ---
>   support/scripts/pkg-stats | 26 ++++++++++++++++++++++++++
>   1 file changed, 26 insertions(+)
> 
> diff --git a/support/scripts/pkg-stats b/support/scripts/pkg-stats
> index e954dd125e..be3b6d7e71 100755
> --- a/support/scripts/pkg-stats
> +++ b/support/scripts/pkg-stats
> @@ -125,6 +125,14 @@ class Package:
>                   self.status['url'] = ("warning", "missing")
>                   fp.close()
>   
> +    def is_valid_infra(self):
> +        try:
> +            if self.infras[0][1] == 'virtual':
> +                return False
> +        except IndexError:
> +                return False
> +        return True

You might want to make a property of this (like CVE.identifier).
I would also suggest has_valid_infra, because a package is not an 
infrastructure :-)

> +
>       def set_infra(self):
>           """
>           Fills in the .infras field
> @@ -146,6 +154,11 @@ class Package:
>           """
>           Fills in the .status['license'] and .status['license-files'] fields
>           """
> +        if self.is_valid_infra() == False:

if not self.is_valid_infra():

(or using a property, if not self.has_valid_infra:)

> +            self.status['license'] = ("na", "no valid package infra")
> +            self.status['license-files'] = ("na", "no valid package infra")
> +            return
> +
>           var = self.pkgvar()
>           self.status['license'] = ("error", "missing")
>           self.status['license-files'] = ("error", "missing")
> @@ -160,6 +173,11 @@ class Package:
>           """
>           Fills in the .status['hash'] field
>           """
> +        if self.is_valid_infra() == False:

same

> +            self.status['hash'] = ("na", "no valid package infra")
> +            self.status['hash-license'] = ("na", "no valid package infra")
> +            return
> +
>           hashpath = self.path.replace(".mk", ".hash")
>           self.status['hash-license'] = ("na", "no hash file")
>           if os.path.exists(hashpath):
> @@ -180,6 +198,10 @@ class Package:
>           """
>           Fills in the .patch_count, .patch_files and .status['patches'] fields
>           """
> +        if self.is_valid_infra() == False:

same

> +            self.status['patches'] = ("na", "no valid package infra")
> +            return
> +
>           pkgdir = os.path.dirname(self.path)
>           for subdir, _, _ in os.walk(pkgdir):
>               self.patch_files = fnmatch.filter(os.listdir(subdir), '*.patch')
> @@ -558,6 +580,10 @@ def check_package_latest_version(packages):
>           pkg.latest_version['version'] = r[1]
>           pkg.latest_version['id'] = r[2]
>   
> +        if pkg.is_valid_infra() == False:

same

> +            pkg.status['version'] = ("na", "no valid package infra")
> +            continue
> +
>           if pkg.latest_version['status'] == RM_API_STATUS_ERROR:
>               pkg.status['version'] = ('warning', 'RM API error')
>           elif pkg.latest_version['status'] == RM_API_STATUS_NOT_FOUND:
> 


Regards,

Titouan

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [Buildroot] [PATCH v3 00/12] pkg-stats json output improvements
  2020-02-22  8:57 [Buildroot] [PATCH v3 00/12] pkg-stats json output improvements Heiko Thiery
                   ` (11 preceding siblings ...)
  2020-02-22  8:57 ` [Buildroot] [PATCH v3 12/12] support/scripts/pkg-stats: add status for cve check Heiko Thiery
@ 2020-02-23 16:26 ` Titouan Christophe
  12 siblings, 0 replies; 36+ messages in thread
From: Titouan Christophe @ 2020-02-23 16:26 UTC (permalink / raw)
  To: buildroot

Hello Heiko,

I tested your patchset, along with your buildroot-stat application, and 
overall this is great ! The generated HTML is nice and very comfortable 
to navigate.


I replied to individual patches, most of the time for cosmetic reasons:

* Using list/dict comprehensions in place of for-append loops
* Make use of Python properties
* if xxx == False >>> if not xxx

As this is mostly a matter of taste, if you (or others) don't agree with 
me, I will add a Reviewed-by anyway.


There are also a few required changes:

* You signed off with 2 distinct email addresses for 2 patches in the series
* Prefer to update the package status after entirely processing a status 
check (instead of updating at each change during a loop)
* Consistency in the generated status (warning vs. error), and improve 
the status messages

Best regards,

Titouan
On 2/22/20 9:57 AM, Heiko Thiery wrote:
> This patchset adds several improvements for the json output of the
> pkg-stats script.
> 
> - add developers information to the packages
> - add supported defconfigs to json
> - add license information to json
> - add more generic status field to the packages for easier post
>    processing
> 
> ---
> v2 -> v3:
> - keep variable latest_release but change format
> - add check for license file hashes
> - introduce a na status for the checks
> - add a list of all possible checks to the json output
> - add the cve check status
> 
> v1 -> v2:
> - cleanup and recreation of patches
> - remove pkg name from dumping to json
> - use patch_files instead of combine count and files in dict
> - include getdevelopers.py to reuse Developers class
> 
> ---
> Heiko Thiery (12):
>    support/scripts/pkg-stats: store latest version in a dict
>    support/scripts/pkg-stats: store patch files for the package
>    support/scripts/pkg-stats: set developers info
>    support/scripts/pkg-stats: store licences of package
>    support/scripts/pkg-stats: add package status
>    support/scripts/pkg-stats: add package count to stats
>    support/scripts/pkg-stats: store pkg dir path
>    support/scripts/pkg-stats: add defconfig support
>    support/scripts/pkg-stats: add support for license hash check
>    support/scripts/pkg-stats: set status to 'na' for virtual packages
>    support/scripts/pkg-stats: initialize all package status checks
>    support/scripts/pkg-stats: add status for cve check
> 
>   support/scripts/pkg-stats | 272 +++++++++++++++++++++++++++++---------
>   1 file changed, 212 insertions(+), 60 deletions(-)
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [Buildroot] [PATCH v3 02/12] support/scripts/pkg-stats: store patch files for the package
  2020-02-23 13:35   ` Titouan Christophe
@ 2020-02-23 21:23     ` Heiko Thiery
  0 siblings, 0 replies; 36+ messages in thread
From: Heiko Thiery @ 2020-02-23 21:23 UTC (permalink / raw)
  To: buildroot

Hi Titouan and all,

Am So., 23. Feb. 2020 um 14:35 Uhr schrieb Titouan Christophe
<titouan.christophe@railnova.eu>:
>
> Heiko,
>
>
> On 2/22/20 9:57 AM, Heiko Thiery wrote:
> > From: Heiko Thiery <heiko.thiery@kontron.com>
> >
> > Signed-off-by: Heiko Thiery <heiko.thiery@kontron.com>
> > Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com>
>
> Same as the previous patch, signed off with 2 different email addresses.
>
> > ---
> >   support/scripts/pkg-stats | 5 +++--
> >   1 file changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/support/scripts/pkg-stats b/support/scripts/pkg-stats
> > index 8cc78f2f66..4c963cef0f 100755
> > --- a/support/scripts/pkg-stats
> > +++ b/support/scripts/pkg-stats
> > @@ -65,6 +65,7 @@ class Package:
> >           self.has_license_files = False
> >           self.has_hash = False
> >           self.patch_count = 0
> > +        self.patch_files = []
> >           self.warnings = 0
> >           self.current_version = None
> >           self.url = None
> > @@ -131,10 +132,10 @@ class Package:
> >           """
> >           Fills in the .patch_count field
> >           """
> > -        self.patch_count = 0
> >           pkgdir = os.path.dirname(self.path)
> >           for subdir, _, _ in os.walk(pkgdir):
> > -            self.patch_count += len(fnmatch.filter(os.listdir(subdir), '*.patch'))
> > +            self.patch_files = fnmatch.filter(os.listdir(subdir), '*.patch')
> > +            self.patch_count = len(self.patch_files)
>
> We can compute the patch_count only once after looping through all the
> files, no need to update it at each step.
>
> Maybe we can also entirely remove this patch_count attribute from the
> Package class, and define it as a property:

This is a good idea. Just to mention then we lose this field in the
json output. I don't know if there is another user for that. I can
change the fields in the buildroot-stats app to use the length of the
patches list. Has anyone concerns about remove the attribute?

> class Package:
>      # ...
>      @property
>      def patch_count(self):
>          return len(self.patch_files)
>
> >
> >       def set_current_version(self):
> >           """
> >
>
>
> Best regards,
>
> Titouan

Thank you,
Heiko

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [Buildroot] [PATCH v3 03/12] support/scripts/pkg-stats: set developers info
  2020-02-23 13:45   ` Titouan Christophe
@ 2020-02-23 21:37     ` Heiko Thiery
  0 siblings, 0 replies; 36+ messages in thread
From: Heiko Thiery @ 2020-02-23 21:37 UTC (permalink / raw)
  To: buildroot

Hi Titouan and all,

Am So., 23. Feb. 2020 um 14:45 Uhr schrieb Titouan Christophe
<titouan.christophe@railnova.eu>:
>
> Heiko, all,
> On 2/22/20 9:57 AM, Heiko Thiery wrote:
> > Use the function 'parse_developers' function from getdeveloperlib that
> > collect the information about the developers and the files they
> > maintain. Then set the maintainer(s) to each package.
> >
> > Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com>
> > ---
> >   support/scripts/pkg-stats | 16 ++++++++++++++++
> >   1 file changed, 16 insertions(+)
> >
> > diff --git a/support/scripts/pkg-stats b/support/scripts/pkg-stats
> > index 4c963cef0f..643272e9d2 100755
> > --- a/support/scripts/pkg-stats
> > +++ b/support/scripts/pkg-stats
> > @@ -30,10 +30,14 @@ import certifi
> >   import distutils.version
> >   import time
> >   import gzip
> > +import sys
> >   from urllib3 import HTTPSConnectionPool
> >   from urllib3.exceptions import HTTPError
> >   from multiprocessing import Pool
> >
> > +sys.path.append('utils/')
> > +from getdeveloperlib import parse_developers
> > + >   NVD_START_YEAR = 2002
> >   NVD_JSON_VERSION = "1.0"
> >   NVD_BASE_URL = "https://nvd.nist.gov/feeds/json/cve/" + NVD_JSON_VERSION
> > @@ -169,6 +173,15 @@ class Package:
> >           """
> >           return cve in self.all_ignored_cves.get(self.pkgvar(), [])
> >
> > +    def set_developers(self, developers):
> > +        """
> > +        Fills in the .developers field
> > +        """
> > +        self.developers = list()
> > +        for dev in developers:
> > +            if dev.hasfile(self.path):
> > +                self.developers.append((dev.name))
>
> Bikeshedding again, but maybe more elegant like this:

Indeed this is more elegant. I will change that.

> self.developers = [
>      dev.name
>      for dev in developers
>      if dev.hasfile(self.path)
> ]
>
> > +
> >       def __eq__(self, other):
> >           return self.path == other.path
> >
> > @@ -891,6 +904,8 @@ def __main__():
> >                                         'HEAD']).splitlines()[0]
> >       print("Build package list ...")
> >       packages = get_pkglist(args.npackages, package_list)
> > +    print("Getting developers ...")
> > +    developers = parse_developers()
> >       print("Getting package make info ...")
> >       package_init_make_info()
> >       print("Getting package details ...")
> > @@ -902,6 +917,7 @@ def __main__():
> >           pkg.set_check_package_warnings()
> >           pkg.set_current_version()
> >           pkg.set_url()
> > +        pkg.set_developers(developers)
> >       print("Checking URL status")
> >       check_package_urls(packages)
> >       print("Getting latest versions ...")
> >

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [Buildroot] [PATCH v3 01/12] support/scripts/pkg-stats: store latest version in a dict
  2020-02-23 13:26   ` Titouan Christophe
@ 2020-02-23 21:41     ` Heiko Thiery
  0 siblings, 0 replies; 36+ messages in thread
From: Heiko Thiery @ 2020-02-23 21:41 UTC (permalink / raw)
  To: buildroot

Hi Titouan and all,

Am So., 23. Feb. 2020 um 14:26 Uhr schrieb Titouan Christophe
<titouan.christophe@railnova.eu>:
>
> Hello Heiko and all,
>
> On 2/22/20 9:57 AM, Heiko Thiery wrote:
> > From: Heiko Thiery <heiko.thiery@kontron.com>
> >
> > This patch changes the type of the latest_version variable to a dict.
> > This is for better readability/usability of the data. With this the json
> > output is more descriptive in later processing of the json output.
> >
> > Signed-off-by: Heiko Thiery <heiko.thiery@kontron.com>
> > Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com>
>
> You signed off with both your personal and work email addresses, was
> this intended ?

No that was not intended. Thank you.

> > ---
> >   support/scripts/pkg-stats | 33 +++++++++++++++++----------------
> >   1 file changed, 17 insertions(+), 16 deletions(-)
> >
> > diff --git a/support/scripts/pkg-stats b/support/scripts/pkg-stats
> > index 7721d98459..8cc78f2f66 100755
> > --- a/support/scripts/pkg-stats
> > +++ b/support/scripts/pkg-stats
> > @@ -71,7 +71,7 @@ class Package:
> >           self.url_status = None
> >           self.url_worker = None
> >           self.cves = list()
> > -        self.latest_version = (RM_API_STATUS_ERROR, None, None)
> > +        self.latest_version = {'status': RM_API_STATUS_ERROR, 'version': None, 'id': None}
> >
> >       def pkgvar(self):
> >           return self.name.upper().replace("-", "_")
> > @@ -460,9 +460,8 @@ def check_package_latest_version(packages):
> >       """
> >       Fills in the .latest_version field of all Package objects
> >
> > -    This field has a special format:
> > -      (status, version, id)
> > -    with:
> > +    This field is a dict and has the following keys:
> > +
> >       - status: one of RM_API_STATUS_ERROR,
> >         RM_API_STATUS_FOUND_BY_DISTRO, RM_API_STATUS_FOUND_BY_PATTERN,
> >         RM_API_STATUS_NOT_FOUND
> > @@ -478,7 +477,9 @@ def check_package_latest_version(packages):
> >       worker_pool = Pool(processes=64)
> >       results = worker_pool.map(check_package_latest_version_worker, (pkg.name for pkg in packages))
> >       for pkg, r in zip(packages, results):
> > -        pkg.latest_version = r
> > +        pkg.latest_version['status'] = r[0]
> > +        pkg.latest_version['version'] = r[1]
> > +        pkg.latest_version['id'] = r[2]
>
> Bikeshedding, but maybe more elegant:
>
> pkg.latest_version = dict(zip(['status', 'version', 'id'], r))

Yes, this is bikeshedding ;-) but a good one! I will change that.

Thank you
--
Heiko

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [Buildroot] [PATCH v3 12/12] support/scripts/pkg-stats: add status for cve check
  2020-02-23 14:24   ` Titouan Christophe
@ 2020-02-24  7:06     ` Heiko Thiery
  2020-02-24  9:35       ` Titouan Christophe
  0 siblings, 1 reply; 36+ messages in thread
From: Heiko Thiery @ 2020-02-24  7:06 UTC (permalink / raw)
  To: buildroot

Hi Titouan and all,

Am So., 23. Feb. 2020 um 15:24 Uhr schrieb Titouan Christophe
<titouan.christophe@railnova.eu>:
>
> Heiko, all,
>
> On 2/22/20 9:57 AM, Heiko Thiery wrote:
> > Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com>
> > ---
> >   support/scripts/pkg-stats | 10 ++++++++--
> >   1 file changed, 8 insertions(+), 2 deletions(-)
> >
> > diff --git a/support/scripts/pkg-stats b/support/scripts/pkg-stats
> > index ed22f6b650..4efff8624f 100755
> > --- a/support/scripts/pkg-stats
> > +++ b/support/scripts/pkg-stats
> > @@ -617,8 +617,14 @@ def check_package_cves(nvd_path, packages):
> >
> >       for cve in CVE.read_nvd_dir(nvd_path):
> >           for pkg_name in cve.pkg_names:
> > -            if pkg_name in packages and cve.affects(packages[pkg_name]):
> > -                packages[pkg_name].cves.append(cve.identifier)
> > +            if pkg_name in packages:
> > +                if cve.affects(packages[pkg_name]):
> > +                    packages[pkg_name].cves.append(cve.identifier)
> > +                if len(packages[pkg_name].cves):
> > +                    packages[pkg_name].status['cve'] = ('error', 'affected by cve')
> > +                else:
> > +                    packages[pkg_name].status['cve'] = ('ok', 'no cve found')
> > +
> >
> >
> >   def calculate_stats(packages):
> >
>
> In the current state, that would give:
>
> * If a CVE applies to a br package -> error
> * If a CVE does not applies to a br package -> ok
> * If no CVE points to a br package -> na (no status check done)
>
> I would argue that the latest case is not correct. The status should be
> ok, because we ran through the whole list of CVEs from the NVD feed, and
> we did not find any of them that applies to this package.

I see .. I think you are right. This will lead to a wrong status. I
will remove the initializer for the status.

>
> I would rather write it like this:
>
> ########################
> diff --git a/support/scripts/pkg-stats b/support/scripts/pkg-stats
> index ed22f6b650..91477d583e 100755
> --- a/support/scripts/pkg-stats
> +++ b/support/scripts/pkg-stats
> @@ -620,6 +620,12 @@ def check_package_cves(nvd_path, packages):
>               if pkg_name in packages and cve.affects(packages[pkg_name]):
>                   packages[pkg_name].cves.append(cve.identifier)
>
> +    for pkg_name, pkg in packages.items():
> +        if len(pkg.cves) > 0:
> +            pkg.status['cve'] = ('error', 'affected by CVE(s)')
> +        else:
> +            pkg.status['cve'] = ('ok', 'no CVE found')
> +

Isn't it right that we loop then (depending on the amount of nvd
pathes) several thousend times?

e.g. packages ~2600, nvds ~ 20 => 20*2600=52000

On the other hand we loop over the list of packages all over the place ;-/

>
>   def calculate_stats(packages):
>       stats = defaultdict(int)
> ########################
>
>
> Best regards,
>
> Titouan

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [Buildroot] [PATCH v3 11/12] support/scripts/pkg-stats: initialize all package status checks
  2020-02-23 14:09   ` Titouan Christophe
@ 2020-02-24  7:28     ` Heiko Thiery
  0 siblings, 0 replies; 36+ messages in thread
From: Heiko Thiery @ 2020-02-24  7:28 UTC (permalink / raw)
  To: buildroot

Hi Titouan,

Am So., 23. Feb. 2020 um 15:09 Uhr schrieb Titouan Christophe
<titouan.christophe@railnova.eu>:
>
> Heiko, all,
>
> On 2/22/20 9:57 AM, Heiko Thiery wrote:
> > Also the list of status checks is added to the json output.
> >
> > Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com> > ---
> >   support/scripts/pkg-stats | 14 ++++++++++++++
> >   1 file changed, 14 insertions(+)
> >
> > diff --git a/support/scripts/pkg-stats b/support/scripts/pkg-stats
> > index be3b6d7e71..ed22f6b650 100755
> > --- a/support/scripts/pkg-stats
> > +++ b/support/scripts/pkg-stats
> > @@ -87,6 +87,11 @@ class Package:
> >       all_license_files = dict()
> >       all_versions = dict()
> >       all_ignored_cves = dict()
> > +    # This is the list of all possible checks. Add new checks to this list so
> > +    # a tool that post-processeds the json output knows the checks before
> > +    # iterating over the packages.
> > +    status_checks = ['cve', 'developers', 'hash', 'hash-license', 'license',
> > +                     'license-files', 'patches', 'pkg-check', 'url', 'version']
> >
> >       def __init__(self, name, path):
> >           self.name = name
> > @@ -103,7 +108,15 @@ class Package:
> >           self.url_worker = None
> >           self.cves = list()
> >           self.latest_version = {'status': RM_API_STATUS_ERROR, 'version': None, 'id': None}
> > +        self.init_check_status()
> > +
> > +    def init_check_status(self):
> > +        """
> > +        Set the status of all check to 'na'.
> > +        """
> >           self.status = {}
> > +        for check in self.status_checks:
> > +            self.status[check] = ("na", "no status check done")
>
> In my opinion, it would be better to leave this dictionary empty at
> initialization.
>
> Because we know the list of status checks that have been run (from the
> 'package_status_checks' list you introduce below), we can tell that a
> particular status check has not been done if its name is missing from
> the status dictionary. This would make the outputted json a tad smaller.
>

No problem. As for the cve check this can lead to wrong status.

> >
> >       def pkgvar(self):
> >           return self.name.upper().replace("-", "_")
> > @@ -975,6 +988,7 @@ def dump_json(packages, defconfigs, stats, date, commit, output):
> >       final = {'packages': pkgs,
> >                'stats': statistics,
> >                'defconfigs': defconfigs,
> > +             'package_status_checks': Package.status_checks,
> >                'commit': commit,
> >                'date': str(date)}
> >
> >

Thank you,
Heiko

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [Buildroot] [PATCH v3 05/12] support/scripts/pkg-stats: add package status
  2020-02-23 15:19   ` Titouan Christophe
@ 2020-02-24  8:03     ` Heiko Thiery
  0 siblings, 0 replies; 36+ messages in thread
From: Heiko Thiery @ 2020-02-24  8:03 UTC (permalink / raw)
  To: buildroot

Am So., 23. Feb. 2020 um 16:19 Uhr schrieb Titouan Christophe
<titouan.christophe@railnova.eu>:
>
> Heiko, all,
>
> First of all, I really like this idea, as it makes the status checks
> quite easier to extend !
>
> On 2/22/20 9:57 AM, Heiko Thiery wrote:
> > Unify the status check information. The status is stored in a tuple. The
> > first entry is the status that can be 'ok', 'warning' or 'error'. The
> > second entry is a verbose message.
> >
> > The following checks are performed:
> > - url: status of the URL check
> > - license: status of the license presence check
> > - license-files: status of the license file check
> > - hash: status of the hash file presence check
> > - patches: status of the patches count check
> > - pkg-check: status of the check-package script result
> > - developers: status if a package has developers in the DEVELOPERS file
> > - version: status of the version check
> >
> > With that status information the following variables are replaced:
> > has_license, has_license_files, has_hash, url_status
> >
> > Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com>
> > ---
> >   support/scripts/pkg-stats | 104 +++++++++++++++++++++++++-------------
> >   1 file changed, 69 insertions(+), 35 deletions(-)
> >
> > diff --git a/support/scripts/pkg-stats b/support/scripts/pkg-stats
> > index ebaf04465e..7c8cc81cf2 100755
> > --- a/support/scripts/pkg-stats
> > +++ b/support/scripts/pkg-stats
> > @@ -66,18 +66,15 @@ class Package:
> >           self.path = path
> >           self.infras = None
> >           self.license = None
> > -        self.has_license = False
> > -        self.has_license_files = False
> > -        self.has_hash = False
> >           self.patch_count = 0
> >           self.patch_files = []
> >           self.warnings = 0
> >           self.current_version = None
> >           self.url = None
> > -        self.url_status = None
> >           self.url_worker = None
> >           self.cves = list()
> >           self.latest_version = {'status': RM_API_STATUS_ERROR, 'version': None, 'id': None}
> > +        self.status = {}
> >
> >       def pkgvar(self):
> >           return self.name.upper().replace("-", "_")
> > @@ -86,17 +83,17 @@ class Package:
> >           """
> >           Fills in the .url field
> >           """
> > -        self.url_status = "No Config.in"
> > +        self.status['url'] = ("warning", "no Config.in")
> >           for filename in os.listdir(os.path.dirname(self.path)):
> >               if fnmatch.fnmatch(filename, 'Config.*'):
> >                   fp = open(os.path.join(os.path.dirname(self.path), filename), "r")
> >                   for config_line in fp:
> >                       if URL_RE.match(config_line):
> >                           self.url = config_line.strip()
> > -                        self.url_status = "Found"
> > +                        self.status['url'] = ("ok", "found")
> >                           fp.close()
> >                           return
> > -                self.url_status = "Missing"
> > +                self.status['url'] = ("warning", "missing")
>
> I would set ("error", "missing") for consistency with what is done elsewhere

Yes I will change that.

By the way I have no glue what is correct. What is an error and what
is only a warning? I think this has to be a separate threat.

e.g. for the version check. Is it an error when a newer package is
available and "only" a warning if the version cannot be checked
against release monitoring version?

>
> >                   fp.close()
> >
> >       def set_infra(self):
>
> [-- SNIP --]
>
> > @@ -496,6 +517,18 @@ def check_package_latest_version(packages):
> >           pkg.latest_version['status'] = r[0]
> >           pkg.latest_version['version'] = r[1]
> >           pkg.latest_version['id'] = r[2]
> > +
> > +        if pkg.latest_version['status'] == RM_API_STATUS_ERROR:
> > +            pkg.status['version'] = ('warning', 'RM API error')
> > +        elif pkg.latest_version['status'] == RM_API_STATUS_NOT_FOUND:
> > +            pkg.status['version'] = ('warning', 'RM package not found')
> > +
> > +        if pkg.latest_version['version'] is None:
> > +            pkg.status['version'] = ('warning', 'no upstream version available')
> > +        elif pkg.latest_version['version'] != pkg.current_version:
> > +            pkg.status['version'] = ('error', 'package needs update')
>
> We can make the messages a bit more explicit:
> * "Release Monitoring API error"
> * "Package not found on Release Monitoring"
> * "The newer version {} is available
> upstream".format(pkg.latest_version['version'])
>

ok

> > +        else:
> > +            pkg.status['version'] = ('ok', 'up-to-date')
> >       del http_pool
> >
> >
>
> [-- SNIP --]
>
> > @@ -746,12 +779,13 @@ def dump_html_pkg(f, pkg):
> >
> >       # URL status
> >       td_class = ["centered"]
> > -    url_str = pkg.url_status
> > -    if pkg.url_status == "Missing" or pkg.url_status == "No Config.in":
> > +    url_str = pkg.status['url'][1]
> > +    if pkg.status['url'][0] == "warning":
> > +        td_class.append("missing_url")
> > +    elif pkg.status['url'][0] == "error":
> >           td_class.append("missing_url")
>
> The above 2 conditions can be simplified like:
>
> +    if pkg.status['url'][0] in ("error", "warning"):
> +        td_class.append("missing_url")

But then I have to change this to:

@@ -778,10 +778,9 @@ def dump_html_pkg(f, pkg):
     # URL status
     td_class = ["centered"]
     url_str = pkg.status['url'][1]
-    if pkg.status['url'][0] == "warning":
-        td_class.append("missing_url")
-    elif pkg.status['url'][0] == "error":
+    if pkg.status['url'][0] in ("error", "warning"):
         td_class.append("missing_url")
+    if pkg.status['url'][0] == "error":
         td_class.append("invalid_url")
         url_str = "<a href=%s>%s</a>" % (pkg.url, pkg.status['url'][1])
     else:

So the invalid_url class is set only in case of status 'error'. Right?

> > -    elif pkg.url_status.startswith("Invalid"):
> >           td_class.append("invalid_url")
> > -        url_str = "<a href=%s>%s</a>" % (pkg.url, pkg.url_status)
> > +        url_str = "<a href=%s>%s</a>" % (pkg.url, pkg.status['url'][1])
> >       else:
> >           td_class.append("good_url")
> >           url_str = "<a href=%s>Link</a>" % pkg.url
> >
>
>
> Best regards,
>
> Titouan

Thank you,
Heiko

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [Buildroot] [PATCH v3 07/12] support/scripts/pkg-stats: store pkg dir path
  2020-02-23 15:20   ` Titouan Christophe
@ 2020-02-24  8:04     ` Heiko Thiery
  0 siblings, 0 replies; 36+ messages in thread
From: Heiko Thiery @ 2020-02-24  8:04 UTC (permalink / raw)
  To: buildroot

HI Titouan and all,

Am So., 23. Feb. 2020 um 16:20 Uhr schrieb Titouan Christophe
<titouan.christophe@railnova.eu>:
>
> Hello Heiko,
> On 2/22/20 9:57 AM, Heiko Thiery wrote:
> > This value can be used for later processing.
> >
> > In the buildroot-stats application this is used to create links pointing
> > to the git repo of buildroot.
> >
> > Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com>
>
> Reviewed-by: Titouan Christophe <titouan.christophe@railnova.eu>
> Tested-by: Titouan Christophe <titouan.christophe@railnova.eu>
>
> > ---
> >   support/scripts/pkg-stats | 1 +
> >   1 file changed, 1 insertion(+)
> >
> > diff --git a/support/scripts/pkg-stats b/support/scripts/pkg-stats
> > index 90afcc2332..36b33586ef 100755
> > --- a/support/scripts/pkg-stats
> > +++ b/support/scripts/pkg-stats
> > @@ -64,6 +64,7 @@ class Package:
> >       def __init__(self, name, path):
> >           self.name = name
> >           self.path = path
> > +        self.pkg_path = os.path.dirname(path)
> >           self.infras = None
> >           self.license = None
> >           self.patch_count = 0
> >
>
> Regards,
>
> Titouan

Thank you,
Heiko

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [Buildroot] [PATCH v3 10/12] support/scripts/pkg-stats: set status to 'na' for virtual packages
  2020-02-23 16:11   ` Titouan Christophe
@ 2020-02-24  8:22     ` Heiko Thiery
  0 siblings, 0 replies; 36+ messages in thread
From: Heiko Thiery @ 2020-02-24  8:22 UTC (permalink / raw)
  To: buildroot

Hi Titouan and all,

Am So., 23. Feb. 2020 um 17:11 Uhr schrieb Titouan Christophe
<titouan.christophe@railnova.eu>:
>
> Hello Heiko and all,
>
> On 2/22/20 9:57 AM, Heiko Thiery wrote:
> > If there is no infra set or infra is virtual the status is set to 'na'.
> >
> > This is done for the follwing checks:
> >   - license
> >   - license-files
> >   - hash
> >   - hash-license
> >   - patches
> >   - version
> >
> > Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com>
> > ---
> >   support/scripts/pkg-stats | 26 ++++++++++++++++++++++++++
> >   1 file changed, 26 insertions(+)
> >
> > diff --git a/support/scripts/pkg-stats b/support/scripts/pkg-stats
> > index e954dd125e..be3b6d7e71 100755
> > --- a/support/scripts/pkg-stats
> > +++ b/support/scripts/pkg-stats
> > @@ -125,6 +125,14 @@ class Package:
> >                   self.status['url'] = ("warning", "missing")
> >                   fp.close()
> >
> > +    def is_valid_infra(self):
> > +        try:
> > +            if self.infras[0][1] == 'virtual':
> > +                return False
> > +        except IndexError:
> > +                return False
> > +        return True
>
> You might want to make a property of this (like CVE.identifier).
> I would also suggest has_valid_infra, because a package is not an
> infrastructure :-)
>
> > +
> >       def set_infra(self):
> >           """
> >           Fills in the .infras field
> > @@ -146,6 +154,11 @@ class Package:
> >           """
> >           Fills in the .status['license'] and .status['license-files'] fields
> >           """
> > +        if self.is_valid_infra() == False:
>
> if not self.is_valid_infra():
>
> (or using a property, if not self.has_valid_infra:)
>
> > +            self.status['license'] = ("na", "no valid package infra")
> > +            self.status['license-files'] = ("na", "no valid package infra")
> > +            return
> > +
> >           var = self.pkgvar()
> >           self.status['license'] = ("error", "missing")
> >           self.status['license-files'] = ("error", "missing")
> > @@ -160,6 +173,11 @@ class Package:
> >           """
> >           Fills in the .status['hash'] field
> >           """
> > +        if self.is_valid_infra() == False:
>
> same
>
> > +            self.status['hash'] = ("na", "no valid package infra")
> > +            self.status['hash-license'] = ("na", "no valid package infra")
> > +            return
> > +
> >           hashpath = self.path.replace(".mk", ".hash")
> >           self.status['hash-license'] = ("na", "no hash file")
> >           if os.path.exists(hashpath):
> > @@ -180,6 +198,10 @@ class Package:
> >           """
> >           Fills in the .patch_count, .patch_files and .status['patches'] fields
> >           """
> > +        if self.is_valid_infra() == False:
>
> same
>
> > +            self.status['patches'] = ("na", "no valid package infra")
> > +            return
> > +
> >           pkgdir = os.path.dirname(self.path)
> >           for subdir, _, _ in os.walk(pkgdir):
> >               self.patch_files = fnmatch.filter(os.listdir(subdir), '*.patch')
> > @@ -558,6 +580,10 @@ def check_package_latest_version(packages):
> >           pkg.latest_version['version'] = r[1]
> >           pkg.latest_version['id'] = r[2]
> >
> > +        if pkg.is_valid_infra() == False:
>
> same
>
> > +            pkg.status['version'] = ("na", "no valid package infra")
> > +            continue
> > +
> >           if pkg.latest_version['status'] == RM_API_STATUS_ERROR:
> >               pkg.status['version'] = ('warning', 'RM API error')
> >           elif pkg.latest_version['status'] == RM_API_STATUS_NOT_FOUND:
> >
>

You're right with all of them .. I will change that for v4.

>
> Regards,
>
> Titouan

Thank you,
Heiko

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [Buildroot] [PATCH v3 12/12] support/scripts/pkg-stats: add status for cve check
  2020-02-24  7:06     ` Heiko Thiery
@ 2020-02-24  9:35       ` Titouan Christophe
  2020-02-24 12:21         ` Heiko Thiery
  0 siblings, 1 reply; 36+ messages in thread
From: Titouan Christophe @ 2020-02-24  9:35 UTC (permalink / raw)
  To: buildroot


On 2/24/20 8:06 AM, Heiko Thiery wrote:
> Hi Titouan and all,
> 
> Am So., 23. Feb. 2020 um 15:24 Uhr schrieb Titouan Christophe
> <titouan.christophe@railnova.eu>:
>>
>> Heiko, all,
>>
>> On 2/22/20 9:57 AM, Heiko Thiery wrote:
>>> Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com>

[--SNIP--]

> 
>>
>> I would rather write it like this:
>>
>> ########################
>> diff --git a/support/scripts/pkg-stats b/support/scripts/pkg-stats
>> index ed22f6b650..91477d583e 100755
>> --- a/support/scripts/pkg-stats
>> +++ b/support/scripts/pkg-stats
>> @@ -620,6 +620,12 @@ def check_package_cves(nvd_path, packages):
>>                if pkg_name in packages and cve.affects(packages[pkg_name]):
>>                    packages[pkg_name].cves.append(cve.identifier)
>>
>> +    for pkg_name, pkg in packages.items():
>> +        if len(pkg.cves) > 0:
>> +            pkg.status['cve'] = ('error', 'affected by CVE(s)')
>> +        else:
>> +            pkg.status['cve'] = ('ok', 'no CVE found')
>> +
> 
> Isn't it right that we loop then (depending on the amount of nvd
> pathes) several thousend times?
> 
> e.g. packages ~2600, nvds ~ 20 => 20*2600=52000

Except that each NVD file contains a few thousands CVEs :).

> 
> On the other hand we loop over the list of packages all over the place ;-/

Looping over all CVEs in a single NVD file yields 5 to 10 more 
iterations than looping over all packages (for instance year 2018 alone 
has 16039 CVE items)

> 
>>
>>    def calculate_stats(packages):
>>        stats = defaultdict(int)
>> ########################
>>
>>
>> Best regards,
>>
>> Titouan

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [Buildroot] [PATCH v3 12/12] support/scripts/pkg-stats: add status for cve check
  2020-02-24  9:35       ` Titouan Christophe
@ 2020-02-24 12:21         ` Heiko Thiery
  0 siblings, 0 replies; 36+ messages in thread
From: Heiko Thiery @ 2020-02-24 12:21 UTC (permalink / raw)
  To: buildroot

Hi Titouan,


Am Mo., 24. Feb. 2020 um 10:35 Uhr schrieb Titouan Christophe
<titouan.christophe@railnova.eu>:
>
>
> On 2/24/20 8:06 AM, Heiko Thiery wrote:
> > Hi Titouan and all,
> >
> > Am So., 23. Feb. 2020 um 15:24 Uhr schrieb Titouan Christophe
> > <titouan.christophe@railnova.eu>:
> >>
> >> Heiko, all,
> >>
> >> On 2/22/20 9:57 AM, Heiko Thiery wrote:
> >>> Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com>
>

[--SNIP--]

> > Isn't it right that we loop then (depending on the amount of nvd
> > pathes) several thousend times?
> >
> > e.g. packages ~2600, nvds ~ 20 => 20*2600=52000
>
> Except that each NVD file contains a few thousands CVEs :).
>
> >
> > On the other hand we loop over the list of packages all over the place ;-/
>
> Looping over all CVEs in a single NVD file yields 5 to 10 more
> iterations than looping over all packages (for instance year 2018 alone
> has 16039 CVE items)
>

you're right ... compared to this it doesn't matter.

> >
> >>
> >>    def calculate_stats(packages):
> >>        stats = defaultdict(int)
> >> ########################
> >>
> >>
> >> Best regards,
> >>
> >> Titouan

Thank you,
Heiko

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2020-02-24 12:21 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-22  8:57 [Buildroot] [PATCH v3 00/12] pkg-stats json output improvements Heiko Thiery
2020-02-22  8:57 ` [Buildroot] [PATCH v3 01/12] support/scripts/pkg-stats: store latest version in a dict Heiko Thiery
2020-02-23 13:26   ` Titouan Christophe
2020-02-23 21:41     ` Heiko Thiery
2020-02-22  8:57 ` [Buildroot] [PATCH v3 02/12] support/scripts/pkg-stats: store patch files for the package Heiko Thiery
2020-02-23 13:35   ` Titouan Christophe
2020-02-23 21:23     ` Heiko Thiery
2020-02-22  8:57 ` [Buildroot] [PATCH v3 03/12] support/scripts/pkg-stats: set developers info Heiko Thiery
2020-02-23 13:45   ` Titouan Christophe
2020-02-23 21:37     ` Heiko Thiery
2020-02-22  8:57 ` [Buildroot] [PATCH v3 04/12] support/scripts/pkg-stats: store licences of package Heiko Thiery
2020-02-23 15:27   ` Titouan Christophe
2020-02-22  8:57 ` [Buildroot] [PATCH v3 05/12] support/scripts/pkg-stats: add package status Heiko Thiery
2020-02-23 15:19   ` Titouan Christophe
2020-02-24  8:03     ` Heiko Thiery
2020-02-22  8:57 ` [Buildroot] [PATCH v3 06/12] support/scripts/pkg-stats: add package count to stats Heiko Thiery
2020-02-23 14:40   ` Titouan Christophe
2020-02-22  8:57 ` [Buildroot] [PATCH v3 07/12] support/scripts/pkg-stats: store pkg dir path Heiko Thiery
2020-02-23 15:20   ` Titouan Christophe
2020-02-24  8:04     ` Heiko Thiery
2020-02-22  8:57 ` [Buildroot] [PATCH v3 08/12] support/scripts/pkg-stats: add defconfig support Heiko Thiery
2020-02-23 14:37   ` Titouan Christophe
2020-02-22  8:57 ` [Buildroot] [PATCH v3 09/12] support/scripts/pkg-stats: add support for license hash check Heiko Thiery
2020-02-23 16:02   ` Titouan Christophe
2020-02-22  8:57 ` [Buildroot] [PATCH v3 10/12] support/scripts/pkg-stats: set status to 'na' for virtual packages Heiko Thiery
2020-02-23 16:11   ` Titouan Christophe
2020-02-24  8:22     ` Heiko Thiery
2020-02-22  8:57 ` [Buildroot] [PATCH v3 11/12] support/scripts/pkg-stats: initialize all package status checks Heiko Thiery
2020-02-23 14:09   ` Titouan Christophe
2020-02-24  7:28     ` Heiko Thiery
2020-02-22  8:57 ` [Buildroot] [PATCH v3 12/12] support/scripts/pkg-stats: add status for cve check Heiko Thiery
2020-02-23 14:24   ` Titouan Christophe
2020-02-24  7:06     ` Heiko Thiery
2020-02-24  9:35       ` Titouan Christophe
2020-02-24 12:21         ` Heiko Thiery
2020-02-23 16:26 ` [Buildroot] [PATCH v3 00/12] pkg-stats json output improvements Titouan Christophe

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.