* [Buildroot] [PATCH v3 0/5] New pkg-stats script, with version information
@ 2018-03-23 20:54 Thomas Petazzoni
  2018-03-23 20:54 ` [Buildroot] [PATCH v3 1/5] support/scripts/pkg-stats-new: rewrite in Python Thomas Petazzoni
                   ` (5 more replies)
  0 siblings, 6 replies; 11+ messages in thread
From: Thomas Petazzoni @ 2018-03-23 20:54 UTC
  To: buildroot

Hello,

This series rewrites the pkg-stats script in Python, and adds two new,
very useful pieces of information to it:

 - The current version of each package in Buildroot

 - The latest upstream version of each package, as provided by the
   release-monitoring.org web site.

The script then compares the current version in Buildroot with the
latest upstream version, and tells whether they are different.

The code is available at:

 https://git.free-electrons.com/users/thomas-petazzoni/buildroot/log/?h=pkg-stats-v3

Changes since v2
================

 - Take into account patches in sub-directories in
   add_patch_count(). Comment from Ricardo.
 - Fix get_check_package_warnings() which was not appending the full
   file path to be checked. Comment from Ricardo.
 - Use subprocess.Popen() instead of subprocess.check_output() in
   get_check_package_warnings(). Comment from Ricardo.
 - Move a lot of the logic into methods of the Package() class.
 - Use the "timeout" argument of urllib2.urlopen() in order to make
   sure that the requests terminate at some point, even if
   release-monitoring.org is stuck.
 - Address Ricardo's concern about printvars variable sorting affecting
   the version reported: we now parse all HOST_*_VERSION variables
   first, and then all *_VERSION variables, ensuring that the target
   version of a package wins.
 - Limit length of version string to 20 characters, as suggested by
   Ricardo.
 - Indicate in help text that the list of packages is
   comma-separated. Comment from Ricardo.
 - Fix typo in commit log. Comment from Ricardo.
 - Added Reviewed-by from Ricardo.

Changes since v1
================

This version mainly focuses on fixing the numerous comments and issues
pointed out by Ricardo Martincoski (many thanks for his very detailed
and useful review!), and also fixes a few other things:

 - Added copyright notice in the new script

 - Fixed all flake8 warnings (I still personally find the "let's force
   to have two empty lines between every function" a very silly rule,
   but well)

 - Pass the package path to the Package() constructor directly

 - Only introduce CSS properties when they are really needed (some
   were introduced in PATCH 1, but they were only really used by later
   patches)

 - Fix the broken #results internal link, which now properly points to
   the statistics table.

 - Remove double </table> tag.

 - Fix the sorting of the table by defining __eq__ and __lt__ in
   Package(), and make 'packages' a list rather than a dict.

 - Add the build date and git commit in the HTML page footer

 - Pass BR2_HAVE_DOT_CONFIG=y when calling make, in order to fake
   having a .config. This allows "printvars" to dump all variables
   even without a .config.

 - Add newlines in the HTML to make it readable.

More details
============

release-monitoring.org is a very useful web site, monitoring more than
16000 projects. It is also very easy to add new projects to be
monitored. It supports monitoring projects on popular hosting
platforms such as GitHub, but can also monitor plain HTTP folders, or
even web pages, using a regexp to identify what is a version number
within the HTML blurb.

Projects can be found by regular search, but it is also possible to
add a mapping between the name of a package in a given distribution,
and the name of the package as known by release-monitoring.org. For
example in Buildroot "samba" is not named "samba" but "samba4", and
this mapping mechanism allows release-monitoring.org to reply to
requests for samba4 within the Buildroot distribution.

I had very good interactions with the release-monitoring.org
maintainers:

 - They are easily available on IRC

 - They created the "Buildroot" distribution within minutes,
   https://release-monitoring.org/distro/Buildroot/.

 - They have been very responsive to fix issues in existing packages.

release-monitoring.org doesn't provide CVE related information for
security, so it would still be useful to extend this mechanism with
another CVE related database. But as we discussed during the last
Buildroot meeting in Brussels, the NIST database is not very up to
date, while release-monitoring.org is, thanks to the process being
fully automated.

Before people start sending gazillions of patches to update packages,
I would like us to focus on:

 - Adding missing projects on release-monitoring.org

 - Adding missing mappings for the Buildroot distribution on
   release-monitoring.org

 - Deciding how to handle packages such as all python-* packages or
   all x11r7 packages, for which the name never matches with the
   release-monitoring.org package name.

   Do we create a mapping for each of them on release-monitoring.org
   (which we would have to do for every new python package), or do we
   make the script smarter, and attempt to search for the package
   without its python- prefix (which won't always work either)?
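
As a rough illustration of the second option, here is a minimal sketch
of how the script could build the list of names to try. The helper
candidate_upstream_names() and the prefix list are hypothetical and not
part of this series; the actual lookup on release-monitoring.org is
deliberately left out.

```python
# Hypothetical helper, not part of the posted series.
# Prefixes that Buildroot package names carry but upstream project names
# usually do not (illustrative list, an assumption for this sketch).
STRIPPABLE_PREFIXES = ("python-",)


def candidate_upstream_names(pkgname):
    """Return the names to try on release-monitoring.org, in order."""
    candidates = [pkgname]
    for prefix in STRIPPABLE_PREFIXES:
        # Only strip when something is left after the prefix
        if pkgname.startswith(prefix) and len(pkgname) > len(prefix):
            candidates.append(pkgname[len(prefix):])
    return candidates


print(candidate_upstream_names("python-requests"))
print(candidate_upstream_names("samba4"))
```

The exact Buildroot name is always tried first, so an explicit mapping
on release-monitoring.org would still take precedence over the guess.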

Basically, I would like to focus on making the output of the script
more useful/relevant, and only then start getting gazillions of
patches updating packages.

Thanks for your review, and contributions!

Thomas

Thomas Petazzoni (5):
  support/scripts/pkg-stats-new: rewrite in Python
  support/scripts/pkg-stats-new: add -n and -p options
  support/scripts/pkg-stats-new: add current version information
  support/scripts/pkg-stats-new: add latest upstream version information
  support/scripts/pkg-stats: replace with new Python version

 support/scripts/pkg-stats | 997 ++++++++++++++++++++++++++++------------------
 1 file changed, 603 insertions(+), 394 deletions(-)

-- 
2.14.3


* [Buildroot] [PATCH v3 1/5] support/scripts/pkg-stats-new: rewrite in Python
  2018-03-23 20:54 [Buildroot] [PATCH v3 0/5] New pkg-stats script, with version information Thomas Petazzoni
@ 2018-03-23 20:54 ` Thomas Petazzoni
  2018-03-30  3:23   ` Ricardo Martincoski
  2018-03-23 20:54 ` [Buildroot] [PATCH v3 2/5] support/scripts/pkg-stats-new: add -n and -p options Thomas Petazzoni
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 11+ messages in thread
From: Thomas Petazzoni @ 2018-03-23 20:54 UTC
  To: buildroot

This commit adds a new version of the pkg-stats script, rewritten in
Python. It is for now implemented in a separate file called
pkg-stats-new, in order to make the diff easily readable. A future
commit will rename it to pkg-stats.

Compared to the existing shell-based pkg-stats script, the
functionality and output are basically the same. The main difference is
that the output no longer goes to stdout, but to the file passed as
argument using the -o option. This allows stdout to be used for more
debugging related information.

The way the script works is that a first function, get_pkglist(),
builds a list of Package() objects, each containing basic information
about a package. package_init_make_info() then collects the license
variables from "make printvars", and a number of Package() methods
(set_infra, set_license, set_hash_info, set_patch_count,
set_check_package_warnings) calculate additional information about
the packages, and fill in fields in the Package objects.

calculate_stats() then calculates global statistics (how many packages
have license information, how many packages have a hash file, etc.).
Finally, dump_html() produces the HTML output, using a number of
sub-functions.
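
The counting pattern used for the global statistics can be sketched as
follows. This is a simplified illustration and not the code from the
patch: plain dicts stand in for Package objects (an assumption, to keep
the example self-contained), and count_stats() is a hypothetical name.

```python
from collections import defaultdict


def count_stats(packages):
    """Count packages with/without a license and a hash, and sum patches."""
    # defaultdict(int) makes every counter start at 0 on first use
    stats = defaultdict(int)
    for pkg in packages:
        stats["license" if pkg["has_license"] else "no-license"] += 1
        stats["hash" if pkg["has_hash"] else "no-hash"] += 1
        stats["patches"] += pkg["patch_count"]
    return stats


# Illustrative input, not real Buildroot data
pkgs = [
    {"has_license": True, "has_hash": True, "patch_count": 2},
    {"has_license": False, "has_hash": True, "patch_count": 0},
]
stats = count_stats(pkgs)
print(dict(stats))
```

A counter that was never incremented (here "no-hash") still reads as 0
when looked up, which is convenient when writing the statistics table.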

One improvement over the shell-based version is that we can use
regexps to exclude some .mk files. Thanks to this, we can exclude all
linux-ext-*.mk files, avoiding incorrect matches.
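
The exclusion check can be sketched in isolation as below. The two
patterns are copied from WALK_EXCLUDES in this patch; the is_excluded()
helper and the sample paths are only illustrative.

```python
import re

# Two of the patterns from WALK_EXCLUDES
EXCLUDES = ["linux/linux-ext-.*.mk", "package/pkg-.*.mk"]


def is_excluded(relpath):
    """True if relpath (relative to the top directory) matches an exclude."""
    # re.match() anchors the pattern at the start of the path only
    return any(re.match(pattern, relpath) for pattern in EXCLUDES)


print(is_excluded("linux/linux-ext-rtai.mk"))
print(is_excluded("linux/linux.mk"))
print(is_excluded("package/pkg-generic.mk"))
```

Note that the dots in these patterns are not escaped, so they match any
character, and that re.match() does not anchor the end of the string.
This is good enough for the current file names, but stricter patterns
would escape the dots and end with "$".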

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
---
Changes since v2:

- Take into account patches in sub-directories in
  add_patch_count(). Comment from Ricardo.
- Fix get_check_package_warnings() which was not appending the full
  file path to be checked. Comment from Ricardo.
- Use subprocess.Popen() instead of subprocess.check_output() in
  get_check_package_warnings(). Comment from Ricardo.
- Move a lot of the logic into methods of the Package() class.

Changes since v1: take into account Ricardo's comments:

- Added copyright notice
- Fix flake8 warnings
- Pass package path to Package() constructor
- Remove CSS properties not needed in this commit
- Fix broken #results link
- Remove double </table>
- Add newlines in the generated HTML
- Fix the sorting of the table by defining __eq__ and __lt__ in
  Package(), and making 'packages' a list rather than a dict.
- Add the build date and git commit in the HTML page footer
- Pass BR2_HAVE_DOT_CONFIG=y when calling make, in order to fake
  having a .config. This allows "printvars" to dump all variables even
  without a .config.
---
 support/scripts/pkg-stats-new | 459 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 459 insertions(+)
 create mode 100755 support/scripts/pkg-stats-new

diff --git a/support/scripts/pkg-stats-new b/support/scripts/pkg-stats-new
new file mode 100755
index 0000000000..955d3ce990
--- /dev/null
+++ b/support/scripts/pkg-stats-new
@@ -0,0 +1,459 @@
+#!/usr/bin/env python
+
+# Copyright (C) 2009 by Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+# General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+
+import argparse
+import datetime
+import fnmatch
+import os
+from collections import defaultdict
+import re
+import subprocess
+
+INFRA_RE = re.compile("\$\(eval \$\(([a-z-]*)-package\)\)")
+
+
+class Package:
+    all_licenses = list()
+    all_license_files = list()
+
+    def __init__(self, name, path):
+        self.name = name
+        self.path = path
+        self.infras = None
+        self.has_license = False
+        self.has_license_files = False
+        self.has_hash = False
+        self.patch_count = 0
+        self.warnings = 0
+
+    def pkgvar(self):
+        return self.name.upper().replace("-", "_")
+
+    def set_infra(self):
+        """
+        Fills in the .infras field
+        """
+        self.infras = list()
+        with open(self.path, 'r') as f:
+            lines = f.readlines()
+            for l in lines:
+                match = INFRA_RE.match(l)
+                if not match:
+                    continue
+                infra = match.group(1)
+                if infra.startswith("host-"):
+                    self.infras.append(("host", infra[5:]))
+                else:
+                    self.infras.append(("target", infra))
+
+    def set_license(self):
+        """
+        Fills in the .has_license and .has_license_files fields
+        """
+        var = self.pkgvar()
+        if var in self.all_licenses:
+            self.has_license = True
+        if var in self.all_license_files:
+            self.has_license_files = True
+
+    def set_hash_info(self):
+        """
+        Fills in the .has_hash field
+        """
+        hashpath = self.path.replace(".mk", ".hash")
+        self.has_hash = os.path.exists(hashpath)
+
+    def set_patch_count(self):
+        """
+        Fills in the .patch_count field
+        """
+        self.patch_count = 0
+        pkgdir = os.path.dirname(self.path)
+        for subdir, _, _ in os.walk(pkgdir):
+            self.patch_count += len(fnmatch.filter(os.listdir(subdir), '*.patch'))
+
+    def set_check_package_warnings(self):
+        """
+        Fills in the .warnings field
+        """
+        cmd = ["./utils/check-package"]
+        pkgdir = os.path.dirname(self.path)
+        for root, dirs, files in os.walk(pkgdir):
+            for f in files:
+                if f.endswith(".mk") or f.endswith(".hash") or f == "Config.in" or f == "Config.in.host":
+                    cmd.append(os.path.join(root, f))
+        o = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE).communicate()[1]
+        lines = o.splitlines()
+        for line in lines:
+            m = re.match("^([0-9]*) warnings generated", line)
+            if m:
+                self.warnings = int(m.group(1))
+                return
+
+    def __eq__(self, other):
+        return self.path == other.path
+
+    def __lt__(self, other):
+        return self.path < other.path
+
+    def __str__(self):
+        return "%s (path='%s', license='%s', license_files='%s', hash='%s', patches=%d)" % \
+            (self.name, self.path, self.has_license, self.has_license_files, self.has_hash, self.patch_count)
+
+
+def get_pkglist():
+    """
+    Builds the list of Buildroot packages, returning a list of Package
+    objects. Only the .name and .path fields of the Package object are
+    initialized.
+    """
+    WALK_USEFUL_SUBDIRS = ["boot", "linux", "package", "toolchain"]
+    WALK_EXCLUDES = ["boot/common.mk",
+                     "linux/linux-ext-.*.mk",
+                     "package/freescale-imx/freescale-imx.mk",
+                     "package/gcc/gcc.mk",
+                     "package/gstreamer/gstreamer.mk",
+                     "package/gstreamer1/gstreamer1.mk",
+                     "package/gtk2-themes/gtk2-themes.mk",
+                     "package/matchbox/matchbox.mk",
+                     "package/opengl/opengl.mk",
+                     "package/qt5/qt5.mk",
+                     "package/x11r7/x11r7.mk",
+                     "package/doc-asciidoc.mk",
+                     "package/pkg-.*.mk",
+                     "package/nvidia-tegra23/nvidia-tegra23.mk",
+                     "toolchain/toolchain-external/pkg-toolchain-external.mk",
+                     "toolchain/toolchain-external/toolchain-external.mk",
+                     "toolchain/toolchain.mk",
+                     "toolchain/helpers.mk",
+                     "toolchain/toolchain-wrapper.mk"]
+    packages = list()
+    for root, dirs, files in os.walk("."):
+        rootdir = root.split("/")
+        if len(rootdir) < 2:
+            continue
+        if rootdir[1] not in WALK_USEFUL_SUBDIRS:
+            continue
+        for f in files:
+            if not f.endswith(".mk"):
+                continue
+            # Strip ending ".mk"
+            pkgname = f[:-3]
+            pkgpath = os.path.join(root, f)
+            skip = False
+            for exclude in WALK_EXCLUDES:
+                # pkgpath[2:] strips the initial './'
+                if re.match(exclude, pkgpath[2:]):
+                    skip = True
+                    continue
+            if skip:
+                continue
+            p = Package(pkgname, pkgpath)
+            packages.append(p)
+    return packages
+
+
+def package_init_make_info():
+    # Licenses
+    o = subprocess.check_output(["make", "BR2_HAVE_DOT_CONFIG=y",
+                                 "-s", "printvars", "VARS=%_LICENSE"])
+    for l in o.splitlines():
+        # Get variable name and value
+        pkgvar, value = l.split("=")
+
+        # If present, strip HOST_ from variable name
+        if pkgvar.startswith("HOST_"):
+            pkgvar = pkgvar[5:]
+
+        # Strip _LICENSE
+        pkgvar = pkgvar[:-8]
+
+        # If value is "unknown", no license details available
+        if value == "unknown":
+            continue
+        Package.all_licenses.append(pkgvar)
+
+    # License files
+    o = subprocess.check_output(["make", "BR2_HAVE_DOT_CONFIG=y",
+                                 "-s", "printvars", "VARS=%_LICENSE_FILES"])
+    for l in o.splitlines():
+        # Get variable name and value
+        pkgvar, value = l.split("=")
+
+        # If present, strip HOST_ from variable name
+        if pkgvar.startswith("HOST_"):
+            pkgvar = pkgvar[5:]
+
+        if pkgvar.endswith("_MANIFEST_LICENSE_FILES"):
+            continue
+
+        # Strip _LICENSE_FILES
+        pkgvar = pkgvar[:-14]
+
+        Package.all_license_files.append(pkgvar)
+
+
+def calculate_stats(packages):
+    stats = defaultdict(int)
+    for pkg in packages:
+        # If packages have multiple infra, take the first one. For the
+        # vast majority of packages, the target and host infra are the
+        # same. There are very few packages that use a different infra
+        # for the host and target variants.
+        if len(pkg.infras) > 0:
+            infra = pkg.infras[0][1]
+            stats["infra-%s" % infra] += 1
+        else:
+            stats["infra-unknown"] += 1
+        if pkg.has_license:
+            stats["license"] += 1
+        else:
+            stats["no-license"] += 1
+        if pkg.has_license_files:
+            stats["license-files"] += 1
+        else:
+            stats["no-license-files"] += 1
+        if pkg.has_hash:
+            stats["hash"] += 1
+        else:
+            stats["no-hash"] += 1
+        stats["patches"] += pkg.patch_count
+    return stats
+
+
+html_header = """
+<head>
+<script src=\"https://www.kryogenix.org/code/browser/sorttable/sorttable.js\"></script>
+<style type=\"text/css\">
+table {
+  width: 100%;
+}
+td {
+  border: 1px solid black;
+}
+td.centered {
+  text-align: center;
+}
+td.wrong {
+  background: #ff9a69;
+}
+td.correct {
+  background: #d2ffc4;
+}
+td.nopatches {
+  background: #d2ffc4;
+}
+td.somepatches {
+  background: #ffd870;
+}
+td.lotsofpatches {
+  background: #ff9a69;
+}
+</style>
+<title>Statistics of Buildroot packages</title>
+</head>
+
+<a href=\"#results\">Results</a><br/>
+
+<p id=\"sortable_hint\"></p>
+"""
+
+
+html_footer = """
+</body>
+<script>
+if (typeof sorttable === \"object\") {
+  document.getElementById(\"sortable_hint\").innerHTML =
+  \"hint: the table can be sorted by clicking the column headers\"
+}
+</script>
+</html>
+"""
+
+
+def infra_str(infra_list):
+    if not infra_list:
+        return "Unknown"
+    elif len(infra_list) == 1:
+        return "<b>%s</b><br/>%s" % (infra_list[0][1], infra_list[0][0])
+    elif infra_list[0][1] == infra_list[1][1]:
+        return "<b>%s</b><br/>%s + %s" % \
+            (infra_list[0][1], infra_list[0][0], infra_list[1][0])
+    else:
+        return "<b>%s</b> (%s)<br/><b>%s</b> (%s)" % \
+            (infra_list[0][1], infra_list[0][0],
+             infra_list[1][1], infra_list[1][0])
+
+
+def boolean_str(b):
+    if b:
+        return "Yes"
+    else:
+        return "No"
+
+
+def dump_html_pkg(f, pkg):
+    f.write(" <tr>\n")
+    f.write("  <td>%s</td>\n" % pkg.path[2:])
+
+    # Patch count
+    td_class = ["centered"]
+    if pkg.patch_count == 0:
+        td_class.append("nopatches")
+    elif pkg.patch_count < 5:
+        td_class.append("somepatches")
+    else:
+        td_class.append("lotsofpatches")
+    f.write("  <td class=\"%s\">%s</td>\n" %
+            (" ".join(td_class), str(pkg.patch_count)))
+
+    # Infrastructure
+    infra = infra_str(pkg.infras)
+    td_class = ["centered"]
+    if infra == "Unknown":
+        td_class.append("wrong")
+    else:
+        td_class.append("correct")
+    f.write("  <td class=\"%s\">%s</td>\n" %
+            (" ".join(td_class), infra_str(pkg.infras)))
+
+    # License
+    td_class = ["centered"]
+    if pkg.has_license:
+        td_class.append("correct")
+    else:
+        td_class.append("wrong")
+    f.write("  <td class=\"%s\">%s</td>\n" %
+            (" ".join(td_class), boolean_str(pkg.has_license)))
+
+    # License files
+    td_class = ["centered"]
+    if pkg.has_license_files:
+        td_class.append("correct")
+    else:
+        td_class.append("wrong")
+    f.write("  <td class=\"%s\">%s</td>\n" %
+            (" ".join(td_class), boolean_str(pkg.has_license_files)))
+
+    # Hash
+    td_class = ["centered"]
+    if pkg.has_hash:
+        td_class.append("correct")
+    else:
+        td_class.append("wrong")
+    f.write("  <td class=\"%s\">%s</td>\n" %
+            (" ".join(td_class), boolean_str(pkg.has_hash)))
+
+    # Warnings
+    td_class = ["centered"]
+    if pkg.warnings == 0:
+        td_class.append("correct")
+    else:
+        td_class.append("wrong")
+    f.write("  <td class=\"%s\">%d</td>\n" %
+            (" ".join(td_class), pkg.warnings))
+
+    f.write(" </tr>\n")
+
+
+def dump_html_all_pkgs(f, packages):
+    f.write("""
+<table class=\"sortable\">
+<tr>
+<td>Package</td>
+<td class=\"centered\">Patch count</td>
+<td class=\"centered\">Infrastructure</td>
+<td class=\"centered\">License</td>
+<td class=\"centered\">License files</td>
+<td class=\"centered\">Hash file</td>
+<td class=\"centered\">Warnings</td>
+</tr>
+""")
+    for pkg in sorted(packages):
+        dump_html_pkg(f, pkg)
+    f.write("</table>")
+
+
+def dump_html_stats(f, stats):
+    f.write("<a id=\"results\"></a>\n")
+    f.write("<table>\n")
+    infras = [infra[6:] for infra in stats.keys() if infra.startswith("infra-")]
+    for infra in infras:
+        f.write(" <tr><td>Packages using the <i>%s</i> infrastructure</td><td>%s</td></tr>\n" %
+                (infra, stats["infra-%s" % infra]))
+    f.write(" <tr><td>Packages having license information</td><td>%s</td></tr>\n" %
+            stats["license"])
+    f.write(" <tr><td>Packages not having license information</td><td>%s</td></tr>\n" %
+            stats["no-license"])
+    f.write(" <tr><td>Packages having license files information</td><td>%s</td></tr>\n" %
+            stats["license-files"])
+    f.write(" <tr><td>Packages not having license files information</td><td>%s</td></tr>\n" %
+            stats["no-license-files"])
+    f.write(" <tr><td>Packages having a hash file</td><td>%s</td></tr>\n" %
+            stats["hash"])
+    f.write(" <tr><td>Packages not having a hash file</td><td>%s</td></tr>\n" %
+            stats["no-hash"])
+    f.write(" <tr><td>Total number of patches</td><td>%s</td></tr>\n" %
+            stats["patches"])
+    f.write("</table>\n")
+
+
+def dump_gen_info(f):
+    # Updated on Mon Feb 19 08:12:08 CET 2018, Git commit aa77030b8f5e41f1c53eb1c1ad664b8c814ba032
+    o = subprocess.check_output(["git", "log", "master", "-n", "1", "--pretty=format:%H"])
+    git_commit = o.splitlines()[0]
+    f.write("<p><i>Updated on %s, git commit %s</i></p>\n" %
+            (str(datetime.datetime.utcnow()), git_commit))
+
+
+def dump_html(packages, stats, output):
+    with open(output, 'w') as f:
+        f.write(html_header)
+        dump_html_all_pkgs(f, packages)
+        dump_html_stats(f, stats)
+        dump_gen_info(f)
+        f.write(html_footer)
+
+
+def parse_args():
+    parser = argparse.ArgumentParser()
+    parser.add_argument('-o', dest='output', action='store', required=True,
+                        help='HTML output file')
+    return parser.parse_args()
+
+
+def __main__():
+    args = parse_args()
+    print "Build package list ..."
+    packages = get_pkglist()
+    print "Getting package make info ..."
+    package_init_make_info()
+    print "Getting package details ..."
+    for pkg in packages:
+        pkg.set_infra()
+        pkg.set_license()
+        pkg.set_hash_info()
+        pkg.set_patch_count()
+        pkg.set_check_package_warnings()
+    print "Calculate stats"
+    stats = calculate_stats(packages)
+    print "Write HTML"
+    dump_html(packages, stats, args.output)
+
+
+__main__()
-- 
2.14.3


* [Buildroot] [PATCH v3 2/5] support/scripts/pkg-stats-new: add -n and -p options
  2018-03-23 20:54 [Buildroot] [PATCH v3 0/5] New pkg-stats script, with version information Thomas Petazzoni
  2018-03-23 20:54 ` [Buildroot] [PATCH v3 1/5] support/scripts/pkg-stats-new: rewrite in Python Thomas Petazzoni
@ 2018-03-23 20:54 ` Thomas Petazzoni
  2018-03-23 20:54 ` [Buildroot] [PATCH v3 3/5] support/scripts/pkg-stats-new: add current version information Thomas Petazzoni
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Thomas Petazzoni @ 2018-03-23 20:54 UTC
  To: buildroot

This commit adds the following options to the pkg-stats-new script:

 -n, to specify a number of packages to parse instead of all packages

 -p, to specify a list of packages (comma-separated) to parse instead
     of all packages

These options are mainly intended for debugging/developing this
script, where they are very useful: the script is rather slow to run
completely with all 2000+ packages, especially once upstream versions
are fetched from release-monitoring.org.
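
For reference, the same constraint could also be expressed with
argparse's add_mutually_exclusive_group(). This is an alternative
sketch, not what the patch does (the patch checks the two options by
hand in __main__()); the argv parameter and the package_list attribute
are additions made for the example.

```python
import argparse


def parse_args(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument('-o', dest='output', required=True,
                        help='HTML output file')
    # argparse rejects command lines that pass both -n and -p
    group = parser.add_mutually_exclusive_group()
    group.add_argument('-n', dest='npackages', type=int,
                       help='Number of packages')
    group.add_argument('-p', dest='packages',
                       help='List of packages (comma separated)')
    args = parser.parse_args(argv)
    # "busybox,dropbear" becomes ["busybox", "dropbear"]; None means all
    args.package_list = args.packages.split(",") if args.packages else None
    return args


args = parse_args(["-o", "out.html", "-p", "busybox,dropbear"])
print(args.package_list)
```

With this variant, passing both options makes argparse print a usage
error and exit, so no explicit check is needed in the caller.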

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Reviewed-by: Ricardo Martincoski <ricardo.martincoski@gmail.com>
---
Changes since v2:
- Indicate in help text that the list of packages is
  comma-separated. Comment from Ricardo.
- Fix typo in commit log. Comment from Ricardo.
- Added Reviewed-by from Ricardo.

Changes since v1:
- Fix flake8 warnings
---
 support/scripts/pkg-stats-new | 25 +++++++++++++++++++++++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/support/scripts/pkg-stats-new b/support/scripts/pkg-stats-new
index 955d3ce990..5dc70f1671 100755
--- a/support/scripts/pkg-stats-new
+++ b/support/scripts/pkg-stats-new
@@ -23,6 +23,7 @@ import os
 from collections import defaultdict
 import re
 import subprocess
+import sys
 
 INFRA_RE = re.compile("\$\(eval \$\(([a-z-]*)-package\)\)")
 
@@ -116,11 +117,14 @@ class Package:
             (self.name, self.path, self.has_license, self.has_license_files, self.has_hash, self.patch_count)
 
 
-def get_pkglist():
+def get_pkglist(npackages, package_list):
     """
     Builds the list of Buildroot packages, returning a list of Package
     objects. Only the .name and .path fields of the Package object are
     initialized.
+
+    npackages: limit to N packages
+    package_list: limit to those packages in this list
     """
     WALK_USEFUL_SUBDIRS = ["boot", "linux", "package", "toolchain"]
     WALK_EXCLUDES = ["boot/common.mk",
@@ -143,6 +147,7 @@ def get_pkglist():
                      "toolchain/helpers.mk",
                      "toolchain/toolchain-wrapper.mk"]
     packages = list()
+    count = 0
     for root, dirs, files in os.walk("."):
         rootdir = root.split("/")
         if len(rootdir) < 2:
@@ -154,6 +159,8 @@ def get_pkglist():
                 continue
             # Strip ending ".mk"
             pkgname = f[:-3]
+            if package_list and pkgname not in package_list:
+                continue
             pkgpath = os.path.join(root, f)
             skip = False
             for exclude in WALK_EXCLUDES:
@@ -165,6 +172,9 @@ def get_pkglist():
                 continue
             p = Package(pkgname, pkgpath)
             packages.append(p)
+            count += 1
+            if npackages and count == npackages:
+                return packages
     return packages
 
 
@@ -434,13 +444,24 @@ def parse_args():
     parser = argparse.ArgumentParser()
     parser.add_argument('-o', dest='output', action='store', required=True,
                         help='HTML output file')
+    parser.add_argument('-n', dest='npackages', type=int, action='store',
+                        help='Number of packages')
+    parser.add_argument('-p', dest='packages', action='store',
+                        help='List of packages (comma separated)')
     return parser.parse_args()
 
 
 def __main__():
     args = parse_args()
+    if args.npackages and args.packages:
+        print "ERROR: -n and -p are mutually exclusive"
+        sys.exit(1)
+    if args.packages:
+        package_list = args.packages.split(",")
+    else:
+        package_list = None
     print "Build package list ..."
-    packages = get_pkglist()
+    packages = get_pkglist(args.npackages, package_list)
     print "Getting package make info ..."
     package_init_make_info()
     print "Getting package details ..."
-- 
2.14.3


* [Buildroot] [PATCH v3 3/5] support/scripts/pkg-stats-new: add current version information
  2018-03-23 20:54 [Buildroot] [PATCH v3 0/5] New pkg-stats script, with version information Thomas Petazzoni
  2018-03-23 20:54 ` [Buildroot] [PATCH v3 1/5] support/scripts/pkg-stats-new: rewrite in Python Thomas Petazzoni
  2018-03-23 20:54 ` [Buildroot] [PATCH v3 2/5] support/scripts/pkg-stats-new: add -n and -p options Thomas Petazzoni
@ 2018-03-23 20:54 ` Thomas Petazzoni
  2018-03-30  3:25   ` Ricardo Martincoski
  2018-03-23 20:54 ` [Buildroot] [PATCH v3 4/5] support/scripts/pkg-stats-new: add latest upstream " Thomas Petazzoni
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 11+ messages in thread
From: Thomas Petazzoni @ 2018-03-23 20:54 UTC
  To: buildroot

This commit adds a new column in the HTML output containing the
current version of a package in Buildroot. On its own, it isn't
terribly useful, but combined with the latest upstream version added
in a follow-up commit, it will become very useful.
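
The two behaviours described in the changelog of this patch (host
variables parsed first so that the target version wins, and version
strings limited to 20 characters) can be sketched as follows. This is
a simplified illustration under stated assumptions, not the patch
itself: parse_versions() and shorten() are hypothetical names, the
input is a hand-written sample instead of real "make printvars" output,
and the split("=", 1) call and the handling of a missing version are
additions made for the sketch.

```python
def parse_versions(printvars_lines):
    """Map a package variable prefix (e.g. "FOO") to its version string."""
    host = [x for x in printvars_lines if x.startswith("HOST_")]
    target = [x for x in printvars_lines if not x.startswith("HOST_")]
    versions = {}
    # Host entries go first, so target entries overwrite them
    for line in host + target:
        # Split on the first "=" only, in case the value contains one
        pkgvar, value = line.split("=", 1)
        if pkgvar.startswith("HOST_"):
            pkgvar = pkgvar[len("HOST_"):]
        if pkgvar.endswith("_DL_VERSION"):
            continue
        versions[pkgvar[:-len("_VERSION")]] = value
    return versions


def shorten(version, limit=20):
    """Truncate long version strings (e.g. git hashes) for display."""
    if version is None:
        # Assumption: packages without a known version get a placeholder
        return "unknown"
    if len(version) > limit:
        return version[:limit] + "..."
    return version


sample = ["FOO_VERSION=1.2", "HOST_FOO_VERSION=1.1", "HOST_BAR_VERSION=3.0"]
versions = parse_versions(sample)
print(versions)
print(shorten("aa77030b8f5e41f1c53eb1c1ad664b8c814ba032"))
```

Because later assignments overwrite earlier ones in the dict, the
result does not depend on the order in which printvars emits the
variables.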

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
---
Changes since v2:
- Address Ricardo's concern about printvars variable sorting affecting
  the version reported: we now parse all HOST_*_VERSION variables
  first, and then all *_VERSION variables, ensuring that the target
  version of a package wins.
- Limit length of version string to 20 characters, as suggested by
  Ricardo.
- Move a lot of the logic into methods of the Package() class.

Changes since v1:
- Fix flake8 warnings
- Pass BR2_HAVE_DOT_CONFIG=y when calling make, in order to fake
  having a .config. This allows "printvars" to dump all variables even
  without a .config.
- Add missing newline in HTML code
---
 support/scripts/pkg-stats-new | 46 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 46 insertions(+)

diff --git a/support/scripts/pkg-stats-new b/support/scripts/pkg-stats-new
index 5dc70f1671..43f7e8d543 100755
--- a/support/scripts/pkg-stats-new
+++ b/support/scripts/pkg-stats-new
@@ -31,6 +31,7 @@ INFRA_RE = re.compile("\$\(eval \$\(([a-z-]*)-package\)\)")
 class Package:
     all_licenses = list()
     all_license_files = list()
+    all_versions = dict()
 
     def __init__(self, name, path):
         self.name = name
@@ -41,6 +42,7 @@ class Package:
         self.has_hash = False
         self.patch_count = 0
         self.warnings = 0
+        self.current_version = None
 
     def pkgvar(self):
         return self.name.upper().replace("-", "_")
@@ -88,6 +90,14 @@ class Package:
         for subdir, _, _ in os.walk(pkgdir):
             self.patch_count += len(fnmatch.filter(os.listdir(subdir), '*.patch'))
 
+    def set_current_version(self):
+        """
+        Fills in the .current_version field
+        """
+        var = self.pkgvar()
+        if var in self.all_versions:
+            self.current_version = self.all_versions[var]
+
     def set_check_package_warnings(self):
         """
         Fills in the .warnings field
@@ -217,6 +227,33 @@ def package_init_make_info():
 
         Package.all_license_files.append(pkgvar)
 
+    # Version
+    o = subprocess.check_output(["make", "BR2_HAVE_DOT_CONFIG=y",
+                                 "-s", "printvars", "VARS=%_VERSION"])
+
+    # We process first the host package VERSION, and then the target
+    # package VERSION. This means that if a package exists in both
+    # target and host variants, with different version numbers
+    # (unlikely), we'll report the target version number.
+    version_list = o.splitlines()
+    version_list = [x for x in version_list if x.startswith("HOST_")] + \
+                   [x for x in version_list if not x.startswith("HOST_")]
+    for l in version_list:
+        # Get variable name and value
+        pkgvar, value = l.split("=")
+
+        # If present, strip HOST_ from variable name
+        if pkgvar.startswith("HOST_"):
+            pkgvar = pkgvar[5:]
+
+        if pkgvar.endswith("_DL_VERSION"):
+            continue
+
+        # Strip _VERSION
+        pkgvar = pkgvar[:-8]
+
+        Package.all_versions[pkgvar] = value
+
 
 def calculate_stats(packages):
     stats = defaultdict(int)
@@ -369,6 +406,13 @@ def dump_html_pkg(f, pkg):
     f.write("  <td class=\"%s\">%s</td>\n" %
             (" ".join(td_class), boolean_str(pkg.has_hash)))
 
+    # Current version
+    if len(pkg.current_version) > 20:
+        current_version = pkg.current_version[:20] + "..."
+    else:
+        current_version = pkg.current_version
+    f.write("  <td class=\"centered\">%s</td>\n" % current_version)
+
     # Warnings
     td_class = ["centered"]
     if pkg.warnings == 0:
@@ -391,6 +435,7 @@ def dump_html_all_pkgs(f, packages):
 <td class=\"centered\">License</td>
 <td class=\"centered\">License files</td>
 <td class=\"centered\">Hash file</td>
+<td class=\"centered\">Current version</td>
 <td class=\"centered\">Warnings</td>
 </tr>
 """)
@@ -471,6 +516,7 @@ def __main__():
         pkg.set_hash_info()
         pkg.set_patch_count()
         pkg.set_check_package_warnings()
+        pkg.set_current_version()
     print "Calculate stats"
     stats = calculate_stats(packages)
     print "Write HTML"
-- 
2.14.3


* [Buildroot] [PATCH v3 4/5] support/scripts/pkg-stats-new: add latest upstream version information
  2018-03-23 20:54 [Buildroot] [PATCH v3 0/5] New pkg-stats script, with version information Thomas Petazzoni
                   ` (2 preceding siblings ...)
  2018-03-23 20:54 ` [Buildroot] [PATCH v3 3/5] support/scripts/pkg-stats-new: add current version information Thomas Petazzoni
@ 2018-03-23 20:54 ` Thomas Petazzoni
  2018-03-30  3:32   ` Ricardo Martincoski
  2018-03-23 20:54 ` [Buildroot] [PATCH v3 5/5] support/scripts/pkg-stats: replace with new Python version Thomas Petazzoni
  2018-04-04 20:10 ` [Buildroot] [PATCH v3 0/5] New pkg-stats script, with version information Thomas Petazzoni
  5 siblings, 1 reply; 11+ messages in thread
From: Thomas Petazzoni @ 2018-03-23 20:54 UTC (permalink / raw)
  To: buildroot

This commit adds fetching the latest upstream version of each package
from release-monitoring.org.

The fetching process first tries to use the package mappings of the
"Buildroot" distribution [1]. If there is no result, then it does a
regular search, and within the search results, looks for a package
whose name matches the Buildroot name.
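
The two-step lookup can be sketched as below. This is a Python 3
illustration only (the script uses urllib2 on Python 2); the function
names and the injectable "fetch" parameter are hypothetical and exist
only to make the logic testable without network access. The URL layout
and the JSON fields ("versions", "id", "projects", "name") are the
ones the patch relies on.

```python
import json
import urllib.request
from urllib.parse import quote

API = "http://release-monitoring.org/api"  # same base URL as the patch


def fetch_json(url, timeout=15):
    """Return the decoded JSON document, or None on timeout/HTTP error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as f:
            return json.loads(f.read().decode("utf-8"))
    except (OSError, ValueError):
        return None


def latest_by_distro(name, fetch=fetch_json):
    """Use the 'Buildroot' distribution mapping, if there is one."""
    data = fetch("%s/project/Buildroot/%s" % (API, quote(name)))
    if data is None:
        return (False, None, None)
    versions = data.get("versions") or []
    return (True, versions[0] if versions else None, data.get("id"))


def latest_by_guess(name, fetch=fetch_json):
    """Search all projects and keep the one with an identical name."""
    data = fetch("%s/projects/?pattern=%s" % (API, quote(name)))
    if data is None:
        return (False, None, None)
    for p in data.get("projects", []):
        if p.get("name") == name and p.get("versions"):
            return (False, p["versions"][0], p["id"])
    return (False, None, None)


def latest_version(name, fetch=fetch_json):
    """Try the distro mapping first, then fall back to the search."""
    result = latest_by_distro(name, fetch)
    if result == (False, None, None):
        result = latest_by_guess(name, fetch)
    return result
```

The result keeps the (mapping, version, id) layout used by the
script, so the first element tells whether the distro mapping was
used.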

Since release-monitoring.org is a bit slow, we have 8 threads that
fetch information in parallel.
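
A minimal sketch of that worker scheme, written for Python 3 (where
the Queue module is named queue). run_in_threads() is a hypothetical
helper, not part of the script; unlike the patch, it collects results
in a dict and records None when the work function raises, so that a
failing request cannot leave the queue unfinished.

```python
import queue
import threading


def run_in_threads(items, work, nthreads=8):
    """Apply work(item) to every item using nthreads daemon threads."""
    q = queue.Queue()
    results = {}
    lock = threading.Lock()
    for item in items:
        q.put(item)

    def worker():
        while True:
            item = q.get()
            try:
                value = work(item)
            except Exception:
                # Assumption: a failed lookup is reported as None
                value = None
            with lock:
                results[item] = value
            q.task_done()

    for _ in range(nthreads):
        threading.Thread(target=worker, daemon=True).start()
    # Blocks until task_done() has been called once per queued item
    q.join()
    return results
```

The threads are daemons, so the ones still blocked in q.get() after
the queue is drained do not prevent the program from exiting.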

From an output point of view, the latest version column:

 - Is green when the version in Buildroot matches the latest upstream
   version

 - Is orange when the latest upstream version is unknown because the
   package was not found on release-monitoring.org

 - Is red when the version in Buildroot doesn't match the latest
   upstream version. Note that we are not doing anything smart here:
   we are just testing if the strings are equal or not.

 - The cell contains the link to the project on release-monitoring.org
   if found.

 - The cell indicates if the match was done using a distro mapping, or
   through a regular search.
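
The colour rules above amount to a plain string comparison, which can
be sketched as follows. The helper names are hypothetical; the CSS
class names are the ones added by this patch. The sketch builds the
class list from scratch for the cell; the hunk as posted appears to
append to the td_class list left over from the hash column, so that
cell would also carry the "correct" or "wrong" class.

```python
def version_cell_class(current, latest):
    """Pick the CSS class for the 'Latest version' cell."""
    if latest is None:
        return "version-unknown"        # orange: not found upstream
    if latest != current:
        return "version-needs-update"   # red: strings differ
    return "version-good"               # green: strings are equal


def version_cell_classes(current, latest_info):
    """latest_info is the (mapping, version, id) tuple of the script."""
    _mapping, latest, _project_id = latest_info
    return " ".join(["centered", version_cell_class(current, latest)])
```

Since no version ordering is attempted, "1.0" against "1.0.0" is
reported as needing an update.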

[1] https://release-monitoring.org/distro/Buildroot/

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
---
Changes since v2:
- Use the "timeout" argument of urllib2.urlopen() in order to make
  sure that the requests terminate at some point, even if
  release-monitoring.org is stuck.
- Move a lot of the logic as methods of the Package() class.

Changes since v1:
- Fix flake8 warnings
- Add missing newline in HTML
---
 support/scripts/pkg-stats-new | 138 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 138 insertions(+)

diff --git a/support/scripts/pkg-stats-new b/support/scripts/pkg-stats-new
index 43f7e8d543..830040a485 100755
--- a/support/scripts/pkg-stats-new
+++ b/support/scripts/pkg-stats-new
@@ -24,8 +24,13 @@ from collections import defaultdict
 import re
 import subprocess
 import sys
+import json
+import urllib2
+from Queue import Queue
+from threading import Thread
 
 INFRA_RE = re.compile("\$\(eval \$\(([a-z-]*)-package\)\)")
+RELEASE_MONITORING_API = "http://release-monitoring.org/api"
 
 
 class Package:
@@ -43,6 +48,7 @@ class Package:
         self.patch_count = 0
         self.warnings = 0
         self.current_version = None
+        self.latest_version = None
 
     def pkgvar(self):
         return self.name.upper().replace("-", "_")
@@ -116,6 +122,43 @@ class Package:
                 self.warnings = int(m.group(1))
                 return
 
+    def get_latest_version_by_distro(self):
+        try:
+            req = urllib2.Request(os.path.join(RELEASE_MONITORING_API, "project", "Buildroot", self.name))
+            f = urllib2.urlopen(req, timeout=15)
+        except:
+            # Exceptions can typically be a timeout, or a 404 error if no project
+            return (False, None, None)
+        data = json.loads(f.read())
+        if len(data['versions']) > 0:
+            return (True, data['versions'][0], data['id'])
+        else:
+            return (True, None, data['id'])
+
+    def get_latest_version_by_guess(self):
+        try:
+            req = urllib2.Request(os.path.join(RELEASE_MONITORING_API, "projects", "?pattern=%s" % self.name))
+            f = urllib2.urlopen(req, timeout=15)
+        except:
+            # Exceptions can typically be a timeout, or a 404 error if no project
+            return (False, None, None)
+        data = json.loads(f.read())
+        for p in data['projects']:
+            if p['name'] == self.name and len(p['versions']) > 0:
+                return (False, p['versions'][0], p['id'])
+        return (False, None, None)
+
+    def set_latest_version(self):
+        # We first try by using the "Buildroot" distribution on
+        # release-monitoring.org, if it has a mapping for the current
+        # package name.
+        self.latest_version = self.get_latest_version_by_distro()
+        if self.latest_version == (False, None, None):
+            # If that fails because there is no mapping or because we had a
+            # request timeout, we try to search in all packages for a package
+            # of this name.
+            self.latest_version = self.get_latest_version_by_guess()
+
     def __eq__(self, other):
         return self.path == other.path
 
@@ -255,6 +298,41 @@ def package_init_make_info():
         Package.all_versions[pkgvar] = value
 
 
+def set_version_worker(q):
+    while True:
+        pkg = q.get()
+        pkg.set_latest_version()
+        print " [%04d] %s => %s" % (q.qsize(), pkg.name, str(pkg.latest_version))
+        q.task_done()
+
+
+def add_latest_version_info(packages):
+    """
+    Fills in the .latest_version field of all Package objects
+
+    This field has a special format:
+      (mapping, version, id)
+    with:
+    - mapping: boolean that indicates whether release-monitoring.org
+      has a mapping for this package name in the Buildroot distribution
+      or not
+    - version: string containing the latest version known by
+      release-monitoring.org for this package
+    - id: string containing the id of the project corresponding to this
+      package, as known by release-monitoring.org
+    """
+    q = Queue()
+    for pkg in packages:
+        q.put(pkg)
+    # Since release-monitoring.org is rather slow, we create 8 threads
+    # that do HTTP requests to the site.
+    for i in range(8):
+        t = Thread(target=set_version_worker, args=[q])
+        t.daemon = True
+        t.start()
+    q.join()
+
+
 def calculate_stats(packages):
     stats = defaultdict(int)
     for pkg in packages:
@@ -279,6 +357,16 @@ def calculate_stats(packages):
             stats["hash"] += 1
         else:
             stats["no-hash"] += 1
+        if pkg.latest_version[0]:
+            stats["rmo-mapping"] += 1
+        else:
+            stats["rmo-no-mapping"] += 1
+        if not pkg.latest_version[1]:
+            stats["version-unknown"] += 1
+        elif pkg.latest_version[1] == pkg.current_version:
+            stats["version-uptodate"] += 1
+        else:
+            stats["version-not-uptodate"] += 1
         stats["patches"] += pkg.patch_count
     return stats
 
@@ -311,6 +399,15 @@ td.somepatches {
 td.lotsofpatches {
   background: #ff9a69;
 }
+td.version-good {
+  background: #d2ffc4;
+}
+td.version-needs-update {
+  background: #ff9a69;
+}
+td.version-unknown {
+ background: #ffd870;
+}
 </style>
 <title>Statistics of Buildroot packages</title>
 </head>
@@ -413,6 +510,34 @@ def dump_html_pkg(f, pkg):
         current_version = pkg.current_version
     f.write("  <td class=\"centered\">%s</td>\n" % current_version)
 
+    # Latest version
+    if pkg.latest_version[1] is None:
+        td_class.append("version-unknown")
+    elif pkg.latest_version[1] != pkg.current_version:
+        td_class.append("version-needs-update")
+    else:
+        td_class.append("version-good")
+
+    if pkg.latest_version[1] is None:
+        latest_version_text = "<b>Unknown</b>"
+    else:
+        latest_version_text = "<b>%s</b>" % str(pkg.latest_version[1])
+
+    latest_version_text += "<br/>"
+
+    if pkg.latest_version[2]:
+        latest_version_text += "<a href=\"https://release-monitoring.org/project/%s\">link</a>, " % pkg.latest_version[2]
+    else:
+        latest_version_text += "no link, "
+
+    if pkg.latest_version[0]:
+        latest_version_text += "has <a href=\"https://release-monitoring.org/distro/Buildroot/\">mapping</a>"
+    else:
+        latest_version_text += "has <a href=\"https://release-monitoring.org/distro/Buildroot/\">no mapping</a>"
+
+    f.write("  <td class=\"%s\">%s</td>\n" %
+            (" ".join(td_class), latest_version_text))
+
     # Warnings
     td_class = ["centered"]
     if pkg.warnings == 0:
@@ -436,6 +561,7 @@ def dump_html_all_pkgs(f, packages):
 <td class=\"centered\">License files</td>
 <td class=\"centered\">Hash file</td>
 <td class=\"centered\">Current version</td>
+<td class=\"centered\">Latest version</td>
 <td class=\"centered\">Warnings</td>
 </tr>
 """)
@@ -465,6 +591,16 @@ def dump_html_stats(f, stats):
             stats["no-hash"])
     f.write(" <tr><td>Total number of patches</td><td>%s</td></tr>\n" %
             stats["patches"])
+    f.write("<tr><td>Packages having a mapping on <i>release-monitoring.org</i></td><td>%s</td></tr>\n" %
+            stats["rmo-mapping"])
+    f.write("<tr><td>Packages lacking a mapping on <i>release-monitoring.org</i></td><td>%s</td></tr>\n" %
+            stats["rmo-no-mapping"])
+    f.write("<tr><td>Packages that are up-to-date</td><td>%s</td></tr>\n" %
+            stats["version-uptodate"])
+    f.write("<tr><td>Packages that are not up-to-date</td><td>%s</td></tr>\n" %
+            stats["version-not-uptodate"])
+    f.write("<tr><td>Packages with no known upstream version</td><td>%s</td></tr>\n" %
+            stats["version-unknown"])
     f.write("</table>\n")
 
 
@@ -517,6 +653,8 @@ def __main__():
         pkg.set_patch_count()
         pkg.set_check_package_warnings()
         pkg.set_current_version()
+    print "Getting latest versions ..."
+    add_latest_version_info(packages)
     print "Calculate stats"
     stats = calculate_stats(packages)
     print "Write HTML"
-- 
2.14.3


* [Buildroot] [PATCH v3 5/5] support/scripts/pkg-stats: replace with new Python version
  2018-03-23 20:54 [Buildroot] [PATCH v3 0/5] New pkg-stats script, with version information Thomas Petazzoni
                   ` (3 preceding siblings ...)
  2018-03-23 20:54 ` [Buildroot] [PATCH v3 4/5] support/scripts/pkg-stats-new: add latest upstream " Thomas Petazzoni
@ 2018-03-23 20:54 ` Thomas Petazzoni
  2018-04-04 20:10 ` [Buildroot] [PATCH v3 0/5] New pkg-stats script, with version information Thomas Petazzoni
  5 siblings, 0 replies; 11+ messages in thread
From: Thomas Petazzoni @ 2018-03-23 20:54 UTC (permalink / raw)
  To: buildroot

Rename pkg-stats-new to pkg-stats.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
---
 support/scripts/pkg-stats     | 997 +++++++++++++++++++++++++-----------------
 support/scripts/pkg-stats-new | 664 ----------------------------
 2 files changed, 603 insertions(+), 1058 deletions(-)
 delete mode 100755 support/scripts/pkg-stats-new

diff --git a/support/scripts/pkg-stats b/support/scripts/pkg-stats
index 48a2cc29a1..830040a485 100755
--- a/support/scripts/pkg-stats
+++ b/support/scripts/pkg-stats
@@ -1,4 +1,4 @@
-#!/usr/bin/env bash
+#!/usr/bin/env python
 
 # Copyright (C) 2009 by Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
 #
@@ -16,16 +16,363 @@
 # along with this program; if not, write to the Free Software
 # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 
-# This script generates an HTML file that contains a report about all
-# Buildroot packages, their usage of the different package
-# infrastructure and possible cleanup actions
-#
-# Run the script from the Buildroot toplevel directory:
-#
-#  ./support/scripts/pkg-stats > /tmp/pkg.html
-#
-
-echo "<head>
+import argparse
+import datetime
+import fnmatch
+import os
+from collections import defaultdict
+import re
+import subprocess
+import sys
+import json
+import urllib2
+from Queue import Queue
+from threading import Thread
+
+INFRA_RE = re.compile("\$\(eval \$\(([a-z-]*)-package\)\)")
+RELEASE_MONITORING_API = "http://release-monitoring.org/api"
+
+
+class Package:
+    all_licenses = list()
+    all_license_files = list()
+    all_versions = dict()
+
+    def __init__(self, name, path):
+        self.name = name
+        self.path = path
+        self.infras = None
+        self.has_license = False
+        self.has_license_files = False
+        self.has_hash = False
+        self.patch_count = 0
+        self.warnings = 0
+        self.current_version = None
+        self.latest_version = None
+
+    def pkgvar(self):
+        return self.name.upper().replace("-", "_")
+
+    def set_infra(self):
+        """
+        Fills in the .infras field
+        """
+        self.infras = list()
+        with open(self.path, 'r') as f:
+            lines = f.readlines()
+            for l in lines:
+                match = INFRA_RE.match(l)
+                if not match:
+                    continue
+                infra = match.group(1)
+                if infra.startswith("host-"):
+                    self.infras.append(("host", infra[5:]))
+                else:
+                    self.infras.append(("target", infra))
+
+    def set_license(self):
+        """
+        Fills in the .has_license and .has_license_files fields
+        """
+        var = self.pkgvar()
+        if var in self.all_licenses:
+            self.has_license = True
+        if var in self.all_license_files:
+            self.has_license_files = True
+
+    def set_hash_info(self):
+        """
+        Fills in the .has_hash field
+        """
+        hashpath = self.path.replace(".mk", ".hash")
+        self.has_hash = os.path.exists(hashpath)
+
+    def set_patch_count(self):
+        """
+        Fills in the .patch_count field
+        """
+        self.patch_count = 0
+        pkgdir = os.path.dirname(self.path)
+        for subdir, _, _ in os.walk(pkgdir):
+            self.patch_count += len(fnmatch.filter(os.listdir(subdir), '*.patch'))
+
+    def set_current_version(self):
+        """
+        Fills in the .current_version field
+        """
+        var = self.pkgvar()
+        if var in self.all_versions:
+            self.current_version = self.all_versions[var]
+
+    def set_check_package_warnings(self):
+        """
+        Fills in the .warnings field
+        """
+        cmd = ["./utils/check-package"]
+        pkgdir = os.path.dirname(self.path)
+        for root, dirs, files in os.walk(pkgdir):
+            for f in files:
+                if f.endswith(".mk") or f.endswith(".hash") or f == "Config.in" or f == "Config.in.host":
+                    cmd.append(os.path.join(root, f))
+        o = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE).communicate()[1]
+        lines = o.splitlines()
+        for line in lines:
+            m = re.match("^([0-9]*) warnings generated", line)
+            if m:
+                self.warnings = int(m.group(1))
+                return
+
+    def get_latest_version_by_distro(self):
+        try:
+            req = urllib2.Request(os.path.join(RELEASE_MONITORING_API, "project", "Buildroot", self.name))
+            f = urllib2.urlopen(req, timeout=15)
+        except:
+            # Exceptions can typically be a timeout, or a 404 error if no project
+            return (False, None, None)
+        data = json.loads(f.read())
+        if len(data['versions']) > 0:
+            return (True, data['versions'][0], data['id'])
+        else:
+            return (True, None, data['id'])
+
+    def get_latest_version_by_guess(self):
+        try:
+            req = urllib2.Request(os.path.join(RELEASE_MONITORING_API, "projects", "?pattern=%s" % self.name))
+            f = urllib2.urlopen(req, timeout=15)
+        except:
+            # Exceptions can typically be a timeout, or a 404 error if no project
+            return (False, None, None)
+        data = json.loads(f.read())
+        for p in data['projects']:
+            if p['name'] == self.name and len(p['versions']) > 0:
+                return (False, p['versions'][0], p['id'])
+        return (False, None, None)
+
+    def set_latest_version(self):
+        # We first try by using the "Buildroot" distribution on
+        # release-monitoring.org, if it has a mapping for the current
+        # package name.
+        self.latest_version = self.get_latest_version_by_distro()
+        if self.latest_version == (False, None, None):
+            # If that fails because there is no mapping or because we had a
+            # request timeout, we try to search in all packages for a package
+            # of this name.
+            self.latest_version = self.get_latest_version_by_guess()
+
+    def __eq__(self, other):
+        return self.path == other.path
+
+    def __lt__(self, other):
+        return self.path < other.path
+
+    def __str__(self):
+        return "%s (path='%s', license='%s', license_files='%s', hash='%s', patches=%d)" % \
+            (self.name, self.path, self.has_license, self.has_license_files, self.has_hash, self.patch_count)
+
+
+def get_pkglist(npackages, package_list):
+    """
+    Builds the list of Buildroot packages, returning a list of Package
+    objects. Only the .name and .path fields of the Package object are
+    initialized.
+
+    npackages: limit to N packages
+    package_list: limit to those packages in this list
+    """
+    WALK_USEFUL_SUBDIRS = ["boot", "linux", "package", "toolchain"]
+    WALK_EXCLUDES = ["boot/common.mk",
+                     "linux/linux-ext-.*.mk",
+                     "package/freescale-imx/freescale-imx.mk",
+                     "package/gcc/gcc.mk",
+                     "package/gstreamer/gstreamer.mk",
+                     "package/gstreamer1/gstreamer1.mk",
+                     "package/gtk2-themes/gtk2-themes.mk",
+                     "package/matchbox/matchbox.mk",
+                     "package/opengl/opengl.mk",
+                     "package/qt5/qt5.mk",
+                     "package/x11r7/x11r7.mk",
+                     "package/doc-asciidoc.mk",
+                     "package/pkg-.*.mk",
+                     "package/nvidia-tegra23/nvidia-tegra23.mk",
+                     "toolchain/toolchain-external/pkg-toolchain-external.mk",
+                     "toolchain/toolchain-external/toolchain-external.mk",
+                     "toolchain/toolchain.mk",
+                     "toolchain/helpers.mk",
+                     "toolchain/toolchain-wrapper.mk"]
+    packages = list()
+    count = 0
+    for root, dirs, files in os.walk("."):
+        rootdir = root.split("/")
+        if len(rootdir) < 2:
+            continue
+        if rootdir[1] not in WALK_USEFUL_SUBDIRS:
+            continue
+        for f in files:
+            if not f.endswith(".mk"):
+                continue
+            # Strip ending ".mk"
+            pkgname = f[:-3]
+            if package_list and pkgname not in package_list:
+                continue
+            pkgpath = os.path.join(root, f)
+            skip = False
+            for exclude in WALK_EXCLUDES:
+                # pkgpath[2:] strips the initial './'
+                if re.match(exclude, pkgpath[2:]):
+                    skip = True
+                    continue
+            if skip:
+                continue
+            p = Package(pkgname, pkgpath)
+            packages.append(p)
+            count += 1
+            if npackages and count == npackages:
+                return packages
+    return packages
+
+
+def package_init_make_info():
+    # Licenses
+    o = subprocess.check_output(["make", "BR2_HAVE_DOT_CONFIG=y",
+                                 "-s", "printvars", "VARS=%_LICENSE"])
+    for l in o.splitlines():
+        # Get variable name and value
+        pkgvar, value = l.split("=")
+
+        # If present, strip HOST_ from variable name
+        if pkgvar.startswith("HOST_"):
+            pkgvar = pkgvar[5:]
+
+        # Strip _LICENSE
+        pkgvar = pkgvar[:-8]
+
+        # If value is "unknown", no license details available
+        if value == "unknown":
+            continue
+        Package.all_licenses.append(pkgvar)
+
+    # License files
+    o = subprocess.check_output(["make", "BR2_HAVE_DOT_CONFIG=y",
+                                 "-s", "printvars", "VARS=%_LICENSE_FILES"])
+    for l in o.splitlines():
+        # Get variable name and value
+        pkgvar, value = l.split("=")
+
+        # If present, strip HOST_ from variable name
+        if pkgvar.startswith("HOST_"):
+            pkgvar = pkgvar[5:]
+
+        if pkgvar.endswith("_MANIFEST_LICENSE_FILES"):
+            continue
+
+        # Strip _LICENSE_FILES
+        pkgvar = pkgvar[:-14]
+
+        Package.all_license_files.append(pkgvar)
+
+    # Version
+    o = subprocess.check_output(["make", "BR2_HAVE_DOT_CONFIG=y",
+                                 "-s", "printvars", "VARS=%_VERSION"])
+
+    # We process first the host package VERSION, and then the target
+    # package VERSION. This means that if a package exists in both
+    # target and host variants, with different version numbers
+    # (unlikely), we'll report the target version number.
+    version_list = o.splitlines()
+    version_list = [x for x in version_list if x.startswith("HOST_")] + \
+                   [x for x in version_list if not x.startswith("HOST_")]
+    for l in version_list:
+        # Get variable name and value
+        pkgvar, value = l.split("=")
+
+        # If present, strip HOST_ from variable name
+        if pkgvar.startswith("HOST_"):
+            pkgvar = pkgvar[5:]
+
+        if pkgvar.endswith("_DL_VERSION"):
+            continue
+
+        # Strip _VERSION
+        pkgvar = pkgvar[:-8]
+
+        Package.all_versions[pkgvar] = value
+
+
+def set_version_worker(q):
+    while True:
+        pkg = q.get()
+        pkg.set_latest_version()
+        print " [%04d] %s => %s" % (q.qsize(), pkg.name, str(pkg.latest_version))
+        q.task_done()
+
+
+def add_latest_version_info(packages):
+    """
+    Fills in the .latest_version field of all Package objects
+
+    This field has a special format:
+      (mapping, version, id)
+    with:
+    - mapping: boolean that indicates whether release-monitoring.org
+      has a mapping for this package name in the Buildroot distribution
+      or not
+    - version: string containing the latest version known by
+      release-monitoring.org for this package
+    - id: string containing the id of the project corresponding to this
+      package, as known by release-monitoring.org
+    """
+    q = Queue()
+    for pkg in packages:
+        q.put(pkg)
+    # Since release-monitoring.org is rather slow, we create 8 threads
+    # that do HTTP requests to the site.
+    for i in range(8):
+        t = Thread(target=set_version_worker, args=[q])
+        t.daemon = True
+        t.start()
+    q.join()
+
+
+def calculate_stats(packages):
+    stats = defaultdict(int)
+    for pkg in packages:
+        # If packages have multiple infra, take the first one. For the
+        # vast majority of packages, the target and host infra are the
+        # same. There are very few packages that use a different infra
+        # for the host and target variants.
+        if len(pkg.infras) > 0:
+            infra = pkg.infras[0][1]
+            stats["infra-%s" % infra] += 1
+        else:
+            stats["infra-unknown"] += 1
+        if pkg.has_license:
+            stats["license"] += 1
+        else:
+            stats["no-license"] += 1
+        if pkg.has_license_files:
+            stats["license-files"] += 1
+        else:
+            stats["no-license-files"] += 1
+        if pkg.has_hash:
+            stats["hash"] += 1
+        else:
+            stats["no-hash"] += 1
+        if pkg.latest_version[0]:
+            stats["rmo-mapping"] += 1
+        else:
+            stats["rmo-no-mapping"] += 1
+        if not pkg.latest_version[1]:
+            stats["version-unknown"] += 1
+        elif pkg.latest_version[1] == pkg.current_version:
+            stats["version-uptodate"] += 1
+        else:
+            stats["version-not-uptodate"] += 1
+        stats["patches"] += pkg.patch_count
+    return stats
+
+
+html_header = """
+<head>
 <script src=\"https://www.kryogenix.org/code/browser/sorttable/sorttable.js\"></script>
 <style type=\"text/css\">
 table {
@@ -46,14 +393,21 @@ td.correct {
 td.nopatches {
   background: #d2ffc4;
 }
-
 td.somepatches {
   background: #ffd870;
 }
-
 td.lotsofpatches {
   background: #ff9a69;
 }
+td.version-good {
+  background: #d2ffc4;
+}
+td.version-needs-update {
+  background: #ff9a69;
+}
+td.version-unknown {
+ background: #ffd870;
+}
 </style>
 <title>Statistics of Buildroot packages</title>
 </head>
@@ -61,395 +415,250 @@ td.lotsofpatches {
 <a href=\"#results\">Results</a><br/>
 
 <p id=\"sortable_hint\"></p>
+"""
 
+
+html_footer = """
+</body>
+<script>
+if (typeof sorttable === \"object\") {
+  document.getElementById(\"sortable_hint\").innerHTML =
+  \"hint: the table can be sorted by clicking the column headers\"
+}
+</script>
+</html>
+"""
+
+
+def infra_str(infra_list):
+    if not infra_list:
+        return "Unknown"
+    elif len(infra_list) == 1:
+        return "<b>%s</b><br/>%s" % (infra_list[0][1], infra_list[0][0])
+    elif infra_list[0][1] == infra_list[1][1]:
+        return "<b>%s</b><br/>%s + %s" % \
+            (infra_list[0][1], infra_list[0][0], infra_list[1][0])
+    else:
+        return "<b>%s</b> (%s)<br/><b>%s</b> (%s)" % \
+            (infra_list[0][1], infra_list[0][0],
+             infra_list[1][1], infra_list[1][0])
+
+
+def boolean_str(b):
+    if b:
+        return "Yes"
+    else:
+        return "No"
+
+
+def dump_html_pkg(f, pkg):
+    f.write(" <tr>\n")
+    f.write("  <td>%s</td>\n" % pkg.path[2:])
+
+    # Patch count
+    td_class = ["centered"]
+    if pkg.patch_count == 0:
+        td_class.append("nopatches")
+    elif pkg.patch_count < 5:
+        td_class.append("somepatches")
+    else:
+        td_class.append("lotsofpatches")
+    f.write("  <td class=\"%s\">%s</td>\n" %
+            (" ".join(td_class), str(pkg.patch_count)))
+
+    # Infrastructure
+    infra = infra_str(pkg.infras)
+    td_class = ["centered"]
+    if infra == "Unknown":
+        td_class.append("wrong")
+    else:
+        td_class.append("correct")
+    f.write("  <td class=\"%s\">%s</td>\n" %
+            (" ".join(td_class), infra_str(pkg.infras)))
+
+    # License
+    td_class = ["centered"]
+    if pkg.has_license:
+        td_class.append("correct")
+    else:
+        td_class.append("wrong")
+    f.write("  <td class=\"%s\">%s</td>\n" %
+            (" ".join(td_class), boolean_str(pkg.has_license)))
+
+    # License files
+    td_class = ["centered"]
+    if pkg.has_license_files:
+        td_class.append("correct")
+    else:
+        td_class.append("wrong")
+    f.write("  <td class=\"%s\">%s</td>\n" %
+            (" ".join(td_class), boolean_str(pkg.has_license_files)))
+
+    # Hash
+    td_class = ["centered"]
+    if pkg.has_hash:
+        td_class.append("correct")
+    else:
+        td_class.append("wrong")
+    f.write("  <td class=\"%s\">%s</td>\n" %
+            (" ".join(td_class), boolean_str(pkg.has_hash)))
+
+    # Current version
+    if len(pkg.current_version) > 20:
+        current_version = pkg.current_version[:20] + "..."
+    else:
+        current_version = pkg.current_version
+    f.write("  <td class=\"centered\">%s</td>\n" % current_version)
+
+    # Latest version
+    if pkg.latest_version[1] is None:
+        td_class.append("version-unknown")
+    elif pkg.latest_version[1] != pkg.current_version:
+        td_class.append("version-needs-update")
+    else:
+        td_class.append("version-good")
+
+    if pkg.latest_version[1] is None:
+        latest_version_text = "<b>Unknown</b>"
+    else:
+        latest_version_text = "<b>%s</b>" % str(pkg.latest_version[1])
+
+    latest_version_text += "<br/>"
+
+    if pkg.latest_version[2]:
+        latest_version_text += "<a href=\"https://release-monitoring.org/project/%s\">link</a>, " % pkg.latest_version[2]
+    else:
+        latest_version_text += "no link, "
+
+    if pkg.latest_version[0]:
+        latest_version_text += "has <a href=\"https://release-monitoring.org/distro/Buildroot/\">mapping</a>"
+    else:
+        latest_version_text += "has <a href=\"https://release-monitoring.org/distro/Buildroot/\">no mapping</a>"
+
+    f.write("  <td class=\"%s\">%s</td>\n" %
+            (" ".join(td_class), latest_version_text))
+
+    # Warnings
+    td_class = ["centered"]
+    if pkg.warnings == 0:
+        td_class.append("correct")
+    else:
+        td_class.append("wrong")
+    f.write("  <td class=\"%s\">%d</td>\n" %
+            (" ".join(td_class), pkg.warnings))
+
+    f.write(" </tr>\n")
+
+
+def dump_html_all_pkgs(f, packages):
+    f.write("""
 <table class=\"sortable\">
 <tr>
-<td>Id</td>
 <td>Package</td>
 <td class=\"centered\">Patch count</td>
 <td class=\"centered\">Infrastructure</td>
 <td class=\"centered\">License</td>
 <td class=\"centered\">License files</td>
 <td class=\"centered\">Hash file</td>
+<td class=\"centered\">Current version</td>
+<td class=\"centered\">Latest version</td>
 <td class=\"centered\">Warnings</td>
 </tr>
-"
-
-autotools_packages=0
-cmake_packages=0
-kconfig_packages=0
-luarocks_package=0
-perl_packages=0
-python_packages=0
-rebar_packages=0
-virtual_packages=0
-generic_packages=0
-waf_packages=0
-manual_packages=0
-packages_with_licence=0
-packages_without_licence=0
-packages_with_license_files=0
-packages_without_license_files=0
-packages_with_hash_file=0
-packages_without_hash_file=0
-total_patch_count=0
-cnt=0
-
-for i in $(find boot/ linux/ package/ toolchain/ -name '*.mk' | sort) ; do
-
-    if test \
-	$i = "boot/common.mk" -o \
-	$i = "linux/linux-ext-ev3dev-linux-drivers.mk" -o \
-	$i = "linux/linux-ext-fbtft.mk" -o \
-	$i = "linux/linux-ext-xenomai.mk" -o \
-	$i = "linux/linux-ext-rtai.mk" -o \
-	$i = "package/freescale-imx/freescale-imx.mk" -o \
-	$i = "package/gcc/gcc.mk" -o \
-	$i = "package/gstreamer/gstreamer.mk" -o \
-	$i = "package/gstreamer1/gstreamer1.mk" -o \
-	$i = "package/gtk2-themes/gtk2-themes.mk" -o \
-	$i = "package/matchbox/matchbox.mk" -o \
-	$i = "package/opengl/opengl.mk" -o \
-	$i = "package/qt5/qt5.mk" -o \
-	$i = "package/x11r7/x11r7.mk" -o \
-	$i = "package/doc-asciidoc.mk" -o \
-	$i = "package/pkg-autotools.mk" -o \
-	$i = "package/pkg-cmake.mk" -o \
-	$i = "package/pkg-kconfig.mk" -o \
-	$i = "package/pkg-luarocks.mk" -o \
-	$i = "package/pkg-perl.mk" -o \
-	$i = "package/pkg-python.mk" -o \
-	$i = "package/pkg-rebar.mk" -o \
-	$i = "package/pkg-virtual.mk" -o \
-	$i = "package/pkg-download.mk" -o \
-	$i = "package/pkg-generic.mk" -o \
-	$i = "package/pkg-waf.mk" -o \
-	$i = "package/pkg-kernel-module.mk" -o \
-	$i = "package/pkg-utils.mk" -o \
-	$i = "package/nvidia-tegra23/nvidia-tegra23.mk" -o \
-	$i = "toolchain/toolchain-external/pkg-toolchain-external.mk" -o \
-	$i = "toolchain/toolchain-external/toolchain-external.mk" -o \
-	$i = "toolchain/toolchain.mk" -o \
-	$i = "toolchain/helpers.mk" -o \
-	$i = "toolchain/toolchain-wrapper.mk" ; then
-	echo "skipping $i" 1>&2
-	continue
-    fi
-
-    cnt=$((cnt+1))
-
-    hashost=0
-    hastarget=0
-    infratype=""
-
-    # Determine package infrastructure
-    if grep -E "\(host-autotools-package\)" $i > /dev/null ; then
-	infratype="autotools"
-	hashost=1
-    fi
-
-    if grep -E "\(autotools-package\)" $i > /dev/null ; then
-	infratype="autotools"
-	hastarget=1
-    fi
-
-    if grep -E "\(kconfig-package\)" $i > /dev/null ; then
-	infratype="kconfig"
-	hastarget=1
-    fi
-
-    if grep -E "\(host-luarocks-package\)" $i > /dev/null ; then
-	infratype="luarocks"
-	hashost=1
-    fi
-
-    if grep -E "\(luarocks-package\)" $i > /dev/null ; then
-	infratype="luarocks"
-	hastarget=1
-    fi
-
-    if grep -E "\(host-perl-package\)" $i > /dev/null ; then
-	infratype="perl"
-	hashost=1
-    fi
-
-    if grep -E "\(perl-package\)" $i > /dev/null ; then
-	infratype="perl"
-	hastarget=1
-    fi
-
-    if grep -E "\(host-python-package\)" $i > /dev/null ; then
-	infratype="python"
-	hashost=1
-    fi
-
-    if grep -E "\(python-package\)" $i > /dev/null ; then
-	infratype="python"
-	hastarget=1
-    fi
-
-    if grep -E "\(host-rebar-package\)" $i > /dev/null ; then
-	infratype="rebar"
-	hashost=1
-    fi
-
-    if grep -E "\(rebar-package\)" $i > /dev/null ; then
-	infratype="rebar"
-	hastarget=1
-    fi
-
-    if grep -E "\(host-virtual-package\)" $i > /dev/null ; then
-	infratype="virtual"
-	hashost=1
-    fi
-
-    if grep -E "\(virtual-package\)" $i > /dev/null ; then
-	infratype="virtual"
-	hastarget=1
-    fi
-
-    if grep -E "\(host-generic-package\)" $i > /dev/null ; then
-	infratype="generic"
-	hashost=1
-    fi
-
-    if grep -E "\(generic-package\)" $i > /dev/null ; then
-	infratype="generic"
-	hastarget=1
-    fi
-
-    if grep -E "\(host-cmake-package\)" $i > /dev/null ; then
-	infratype="cmake"
-	hashost=1
-    fi
-
-    if grep -E "\(cmake-package\)" $i > /dev/null ; then
-	infratype="cmake"
-	hastarget=1
-    fi
-
-    if grep -E "\(toolchain-external-package\)" $i > /dev/null ; then
-	infratype="toolchain-external"
-	hastarget=1
-    fi
-
-    if grep -E "\(waf-package\)" $i > /dev/null ; then
-	infratype="waf"
-	hastarget=1
-    fi
-
-    pkg=$(basename $i)
-    dir=$(dirname $i)
-    pkg=${pkg%.mk}
-    pkgvariable=$(echo ${pkg} | tr "a-z-" "A-Z_")
-
-
-    # Count packages per infrastructure
-    if [ -z ${infratype} ] ; then
-	infratype="manual"
-	manual_packages=$(($manual_packages+1))
-    elif [ ${infratype} = "autotools" ]; then
-	autotools_packages=$(($autotools_packages+1))
-    elif [ ${infratype} = "cmake" ]; then
-	cmake_packages=$(($cmake_packages+1))
-    elif [ ${infratype} = "kconfig" ]; then
-	kconfig_packages=$(($kconfig_packages+1))
-    elif [ ${infratype} = "luarocks" ]; then
-	luarocks_packages=$(($luarocks_packages+1))
-    elif [ ${infratype} = "perl" ]; then
-	perl_packages=$(($perl_packages+1))
-    elif [ ${infratype} = "python" ]; then
-	python_packages=$(($python_packages+1))
-    elif [ ${infratype} = "rebar" ]; then
-	rebar_packages=$(($rebar_packages+1))
-    elif [ ${infratype} = "virtual" ]; then
-	virtual_packages=$(($virtual_packages+1))
-    elif [ ${infratype} = "generic" ]; then
-	generic_packages=$(($generic_packages+1))
-    elif [ ${infratype} = "waf" ]; then
-	waf_packages=$(($waf_packages+1))
-    fi
-
-    if grep -qE "^${pkgvariable}_LICENSE[ ]*=" $i ; then
-	packages_with_license=$(($packages_with_license+1))
-	license=1
-    else
-	packages_without_license=$(($packages_without_license+1))
-	license=0
-    fi
-
-    if grep -qE "^${pkgvariable}_LICENSE_FILES[ ]*=" $i ; then
-	packages_with_license_files=$(($packages_with_license_files+1))
-	license_files=1
-    else
-	packages_without_license_files=$(($packages_without_license_files+1))
-	license_files=0
-    fi
-
-    if test -f ${dir}/${pkg}.hash; then
-	packages_with_hash_file=$(($packages_with_hash_file+1))
-	hash_file=1
-    else
-	packages_without_hash_file=$(($packages_without_hash_file+1))
-	hash_file=0
-    fi
-
-    echo "<tr>"
-
-    echo "<td>$cnt</td>"
-    echo "<td>$i</td>"
-
-    package_dir=$(dirname $i)
-    patch_count=$(find ${package_dir} -name '*.patch' | wc -l)
-    total_patch_count=$(($total_patch_count+$patch_count))
-
-    if test $patch_count -lt 1 ; then
-	patch_count_class="nopatches"
-    elif test $patch_count -lt 5 ; then
-	patch_count_class="somepatches"
-    else
-	patch_count_class="lotsofpatches"
-    fi
-
-    echo "<td class=\"centered ${patch_count_class}\">"
-    echo "<b>$patch_count</b>"
-    echo "</td>"
-
-    if [ ${infratype} = "manual" ] ; then
-	echo "<td class=\"centered wrong\"><b>manual</b></td>"
-    else
-	echo "<td class=\"centered correct\">"
-	echo "<b>${infratype}</b><br/>"
-	if [ ${hashost} -eq 1 -a ${hastarget} -eq 1 ]; then
-	    echo "target + host"
-	elif [ ${hashost} -eq 1 ]; then
-	    echo "host"
-	else
-	    echo "target"
-	fi
-	echo "</td>"
-    fi
-
-    if [ ${license} -eq 0 ] ; then
-	echo "<td class=\"centered wrong\">No</td>"
-    else
-	echo "<td class=\"centered correct\">Yes</td>"
-    fi
-
-    if [ ${license_files} -eq 0 ] ; then
-	echo "<td class=\"centered wrong\">No</td>"
-    else
-	echo "<td class=\"centered correct\">Yes</td>"
-    fi
-
-    if [ ${hash_file} -eq 0 ] ; then
-	echo "<td class=\"centered wrong\">No</td>"
-    else
-	echo "<td class=\"centered correct\">Yes</td>"
-    fi
-
-    file_list=$(find ${package_dir} -name '*.mk' -o -name '*.in*' -o -name '*.hash')
-    nwarnings=$(./utils/check-package ${file_list} 2>&1 | sed '/\([0-9]*\) warnings generated/!d; s//\1/')
-    if [ ${nwarnings} -eq 0 ] ; then
-	echo "<td class=\"centered correct\">${nwarnings}</td>"
-    else
-	echo "<td class=\"centered wrong\">${nwarnings}</td>"
-    fi
-
-    echo "</tr>"
-
-done
-echo "</table>"
-
-echo "<a id="results"></a>"
-echo "<table>"
-echo "<tr>"
-echo "<td>Packages using the <i>generic</i> infrastructure</td>"
-echo "<td>$generic_packages</td>"
-echo "</tr>"
-echo "<tr>"
-echo "<td>Packages using the <i>cmake</i> infrastructure</td>"
-echo "<td>$cmake_packages</td>"
-echo "</tr>"
-echo "<tr>"
-echo "<td>Packages using the <i>autotools</i> infrastructure</td>"
-echo "<td>$autotools_packages</td>"
-echo "</tr>"
-echo "<tr>"
-echo "<td>Packages using the <i>luarocks</i> infrastructure</td>"
-echo "<td>$luarocks_packages</td>"
-echo "</tr>"
-echo "<tr>"
-echo "<td>Packages using the <i>kconfig</i> infrastructure</td>"
-echo "<td>$kconfig_packages</td>"
-echo "</tr>"
-echo "<tr>"
-echo "<td>Packages using the <i>perl</i> infrastructure</td>"
-echo "<td>$perl_packages</td>"
-echo "</tr>"
-echo "<tr>"
-echo "<td>Packages using the <i>python</i> infrastructure</td>"
-echo "<td>$python_packages</td>"
-echo "</tr>"
-echo "<tr>"
-echo "<td>Packages using the <i>rebar</i> infrastructure</td>"
-echo "<td>$rebar_packages</td>"
-echo "</tr>"
-echo "<tr>"
-echo "<td>Packages using the <i>virtual</i> infrastructure</td>"
-echo "<td>$virtual_packages</td>"
-echo "</tr>"
-echo "<tr>"
-echo "<td>Packages using the <i>waf</i> infrastructure</td>"
-echo "<td>$waf_packages</td>"
-echo "</tr>"
-echo "<tr>"
-echo "<td>Packages not using any infrastructure</td>"
-echo "<td>$manual_packages</td>"
-echo "</tr>"
-echo "<tr>"
-echo "<td>Packages having license information</td>"
-echo "<td>$packages_with_license</td>"
-echo "</tr>"
-echo "<tr>"
-echo "<td>Packages not having licence information</td>"
-echo "<td>$packages_without_license</td>"
-echo "</tr>"
-echo "<tr>"
-echo "<td>Packages having license files information</td>"
-echo "<td>$packages_with_license_files</td>"
-echo "</tr>"
-echo "<tr>"
-echo "<td>Packages not having licence files information</td>"
-echo "<td>$packages_without_license_files</td>"
-echo "</tr>"
-echo "<tr>"
-echo "<td>Packages having hash file</td>"
-echo "<td>$packages_with_hash_file</td>"
-echo "</tr>"
-echo "<tr>"
-echo "<td>Packages not having hash file</td>"
-echo "<td>$packages_without_hash_file</td>"
-echo "</tr>"
-echo "<tr>"
-echo "<td>Number of patches in all packages</td>"
-echo "<td>$total_patch_count</td>"
-echo "</tr>"
-echo "<tr>"
-echo "<td>TOTAL</td>"
-echo "<td>$cnt</td>"
-echo "</tr>"
-echo "</table>"
-
-echo "<hr/>"
-echo "<i>Updated on $(LANG=C date), Git commit $(git log master -n 1 --pretty=format:%H)</i>"
-echo "</body>"
-
-echo "<script>
-if (typeof sorttable === \"object\") {
-  document.getElementById(\"sortable_hint\").innerHTML =
-  \"hint: the table can be sorted by clicking the column headers\"
-}
-</script>
-"
-echo "</html>"
+""")
+    for pkg in sorted(packages):
+        dump_html_pkg(f, pkg)
+    f.write("</table>")
+
+
+def dump_html_stats(f, stats):
+    f.write("<a id=\"results\"></a>\n")
+    f.write("<table>\n")
+    infras = [infra[6:] for infra in stats.keys() if infra.startswith("infra-")]
+    for infra in infras:
+        f.write(" <tr><td>Packages using the <i>%s</i> infrastructure</td><td>%s</td></tr>\n" %
+                (infra, stats["infra-%s" % infra]))
+    f.write(" <tr><td>Packages having license information</td><td>%s</td></tr>\n" %
+            stats["license"])
+    f.write(" <tr><td>Packages not having license information</td><td>%s</td></tr>\n" %
+            stats["no-license"])
+    f.write(" <tr><td>Packages having license files information</td><td>%s</td></tr>\n" %
+            stats["license-files"])
+    f.write(" <tr><td>Packages not having license files information</td><td>%s</td></tr>\n" %
+            stats["no-license-files"])
+    f.write(" <tr><td>Packages having a hash file</td><td>%s</td></tr>\n" %
+            stats["hash"])
+    f.write(" <tr><td>Packages not having a hash file</td><td>%s</td></tr>\n" %
+            stats["no-hash"])
+    f.write(" <tr><td>Total number of patches</td><td>%s</td></tr>\n" %
+            stats["patches"])
+    f.write(" <tr><td>Packages having a mapping on <i>release-monitoring.org</i></td><td>%s</td></tr>\n" %
+            stats["rmo-mapping"])
+    f.write(" <tr><td>Packages lacking a mapping on <i>release-monitoring.org</i></td><td>%s</td></tr>\n" %
+            stats["rmo-no-mapping"])
+    f.write(" <tr><td>Packages that are up-to-date</td><td>%s</td></tr>\n" %
+            stats["version-uptodate"])
+    f.write(" <tr><td>Packages that are not up-to-date</td><td>%s</td></tr>\n" %
+            stats["version-not-uptodate"])
+    f.write(" <tr><td>Packages with no known upstream version</td><td>%s</td></tr>\n" %
+            stats["version-unknown"])
+    f.write("</table>\n")
+
+
+def dump_gen_info(f):
+    # Write the generation time (in UTC) and the Git commit of the master branch
+    o = subprocess.check_output(["git", "log", "master", "-n", "1", "--pretty=format:%H"])
+    git_commit = o.splitlines()[0]
+    f.write("<p><i>Updated on %s UTC, Git commit %s</i></p>\n" %
+            (str(datetime.datetime.utcnow()), git_commit))
+
+
+def dump_html(packages, stats, output):
+    with open(output, 'w') as f:
+        f.write(html_header)
+        dump_html_all_pkgs(f, packages)
+        dump_html_stats(f, stats)
+        dump_gen_info(f)
+        f.write(html_footer)
+
+
+def parse_args():
+    parser = argparse.ArgumentParser()
+    parser.add_argument('-o', dest='output', action='store', required=True,
+                        help='HTML output file')
+    parser.add_argument('-n', dest='npackages', type=int, action='store',
+                        help='Limit to N packages')
+    parser.add_argument('-p', dest='packages', action='store',
+                        help='List of packages (comma separated)')
+    return parser.parse_args()
+
+
+def __main__():
+    args = parse_args()
+    if args.npackages and args.packages:
+        print "ERROR: -n and -p are mutually exclusive"
+        sys.exit(1)
+    if args.packages:
+        package_list = args.packages.split(",")
+    else:
+        package_list = None
+    print "Building package list ..."
+    packages = get_pkglist(args.npackages, package_list)
+    print "Getting package make info ..."
+    package_init_make_info()
+    print "Getting package details ..."
+    for pkg in packages:
+        pkg.set_infra()
+        pkg.set_license()
+        pkg.set_hash_info()
+        pkg.set_patch_count()
+        pkg.set_check_package_warnings()
+        pkg.set_current_version()
+    print "Getting latest versions ..."
+    add_latest_version_info(packages)
+    print "Calculating stats ..."
+    stats = calculate_stats(packages)
+    print "Writing HTML ..."
+    dump_html(packages, stats, args.output)
+
+
+__main__()
diff --git a/support/scripts/pkg-stats-new b/support/scripts/pkg-stats-new
deleted file mode 100755
index 830040a485..0000000000
--- a/support/scripts/pkg-stats-new
+++ /dev/null
@@ -1,664 +0,0 @@
-#!/usr/bin/env python
-
-# Copyright (C) 2009 by Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
-#
-# This program is free software; you can redistribute it and/or modify
-# it under the terms of the GNU General Public License as published by
-# the Free Software Foundation; either version 2 of the License, or
-# (at your option) any later version.
-#
-# This program is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
-# General Public License for more details.
-#
-# You should have received a copy of the GNU General Public License
-# along with this program; if not, write to the Free Software
-# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
-
-import argparse
-import datetime
-import fnmatch
-import os
-from collections import defaultdict
-import re
-import subprocess
-import sys
-import json
-import urllib2
-from Queue import Queue
-from threading import Thread
-
-INFRA_RE = re.compile("\$\(eval \$\(([a-z-]*)-package\)\)")
-RELEASE_MONITORING_API = "http://release-monitoring.org/api"
-
-
-class Package:
-    all_licenses = list()
-    all_license_files = list()
-    all_versions = dict()
-
-    def __init__(self, name, path):
-        self.name = name
-        self.path = path
-        self.infras = None
-        self.has_license = False
-        self.has_license_files = False
-        self.has_hash = False
-        self.patch_count = 0
-        self.warnings = 0
-        self.current_version = None
-        self.latest_version = None
-
-    def pkgvar(self):
-        return self.name.upper().replace("-", "_")
-
-    def set_infra(self):
-        """
-        Fills in the .infras field
-        """
-        self.infras = list()
-        with open(self.path, 'r') as f:
-            lines = f.readlines()
-            for l in lines:
-                match = INFRA_RE.match(l)
-                if not match:
-                    continue
-                infra = match.group(1)
-                if infra.startswith("host-"):
-                    self.infras.append(("host", infra[5:]))
-                else:
-                    self.infras.append(("target", infra))
-
-    def set_license(self):
-        """
-        Fills in the .has_license and .has_license_files fields
-        """
-        var = self.pkgvar()
-        if var in self.all_licenses:
-            self.has_license = True
-        if var in self.all_license_files:
-            self.has_license_files = True
-
-    def set_hash_info(self):
-        """
-        Fills in the .has_hash field
-        """
-        hashpath = self.path.replace(".mk", ".hash")
-        self.has_hash = os.path.exists(hashpath)
-
-    def set_patch_count(self):
-        """
-        Fills in the .patch_count field
-        """
-        self.patch_count = 0
-        pkgdir = os.path.dirname(self.path)
-        for subdir, _, _ in os.walk(pkgdir):
-            self.patch_count += len(fnmatch.filter(os.listdir(subdir), '*.patch'))
-
-    def set_current_version(self):
-        """
-        Fills in the .current_version field
-        """
-        var = self.pkgvar()
-        if var in self.all_versions:
-            self.current_version = self.all_versions[var]
-
-    def set_check_package_warnings(self):
-        """
-        Fills in the .warnings field
-        """
-        cmd = ["./utils/check-package"]
-        pkgdir = os.path.dirname(self.path)
-        for root, dirs, files in os.walk(pkgdir):
-            for f in files:
-                if f.endswith(".mk") or f.endswith(".hash") or f == "Config.in" or f == "Config.in.host":
-                    cmd.append(os.path.join(root, f))
-        o = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE).communicate()[1]
-        lines = o.splitlines()
-        for line in lines:
-            m = re.match("^([0-9]*) warnings generated", line)
-            if m:
-                self.warnings = int(m.group(1))
-                return
-
-    def get_latest_version_by_distro(self):
-        try:
-            req = urllib2.Request(os.path.join(RELEASE_MONITORING_API, "project", "Buildroot", self.name))
-            f = urllib2.urlopen(req, timeout=15)
-        except:
-            # Exceptions can typically be a timeout, or a 404 error if not project
-            return (False, None, None)
-        data = json.loads(f.read())
-        if len(data['versions']) > 0:
-            return (True, data['versions'][0], data['id'])
-        else:
-            return (True, None, data['id'])
-
-    def get_latest_version_by_guess(self):
-        try:
-            req = urllib2.Request(os.path.join(RELEASE_MONITORING_API, "projects", "?pattern=%s" % self.name))
-            f = urllib2.urlopen(req, timeout=15)
-        except:
-            # Exceptions can typically be a timeout, or a 404 error if not project
-            return (False, None, None)
-        data = json.loads(f.read())
-        for p in data['projects']:
-            if p['name'] == self.name and len(p['versions']) > 0:
-                return (False, p['versions'][0], p['id'])
-        return (False, None, None)
-
-    def set_latest_version(self):
-        # We first try by using the "Buildroot" distribution on
-        # release-monitoring.org, if it has a mapping for the current
-        # package name.
-        self.latest_version = self.get_latest_version_by_distro()
-        if self.latest_version == (False, None, None):
-            # If that fails because there is no mapping or because we had a
-            # request timeout, we try to search in all packages for a package
-            # of this name.
-            self.latest_version = self.get_latest_version_by_guess()
-
-    def __eq__(self, other):
-        return self.path == other.path
-
-    def __lt__(self, other):
-        return self.path < other.path
-
-    def __str__(self):
-        return "%s (path='%s', license='%s', license_files='%s', hash='%s', patches=%d)" % \
-            (self.name, self.path, self.has_license, self.has_license_files, self.has_hash, self.patch_count)
-
-
-def get_pkglist(npackages, package_list):
-    """
-    Builds the list of Buildroot packages, returning a list of Package
-    objects. Only the .name and .path fields of the Package object are
-    initialized.
-
-    npackages: limit to N packages
-    package_list: limit to those packages in this list
-    """
-    WALK_USEFUL_SUBDIRS = ["boot", "linux", "package", "toolchain"]
-    WALK_EXCLUDES = ["boot/common.mk",
-                     "linux/linux-ext-.*.mk",
-                     "package/freescale-imx/freescale-imx.mk",
-                     "package/gcc/gcc.mk",
-                     "package/gstreamer/gstreamer.mk",
-                     "package/gstreamer1/gstreamer1.mk",
-                     "package/gtk2-themes/gtk2-themes.mk",
-                     "package/matchbox/matchbox.mk",
-                     "package/opengl/opengl.mk",
-                     "package/qt5/qt5.mk",
-                     "package/x11r7/x11r7.mk",
-                     "package/doc-asciidoc.mk",
-                     "package/pkg-.*.mk",
-                     "package/nvidia-tegra23/nvidia-tegra23.mk",
-                     "toolchain/toolchain-external/pkg-toolchain-external.mk",
-                     "toolchain/toolchain-external/toolchain-external.mk",
-                     "toolchain/toolchain.mk",
-                     "toolchain/helpers.mk",
-                     "toolchain/toolchain-wrapper.mk"]
-    packages = list()
-    count = 0
-    for root, dirs, files in os.walk("."):
-        rootdir = root.split("/")
-        if len(rootdir) < 2:
-            continue
-        if rootdir[1] not in WALK_USEFUL_SUBDIRS:
-            continue
-        for f in files:
-            if not f.endswith(".mk"):
-                continue
-            # Strip ending ".mk"
-            pkgname = f[:-3]
-            if package_list and pkgname not in package_list:
-                continue
-            pkgpath = os.path.join(root, f)
-            skip = False
-            for exclude in WALK_EXCLUDES:
-                # pkgpath[2:] strips the initial './'
-                if re.match(exclude, pkgpath[2:]):
-                    skip = True
-                    continue
-            if skip:
-                continue
-            p = Package(pkgname, pkgpath)
-            packages.append(p)
-            count += 1
-            if npackages and count == npackages:
-                return packages
-    return packages
-
-
-def package_init_make_info():
-    # Licenses
-    o = subprocess.check_output(["make", "BR2_HAVE_DOT_CONFIG=y",
-                                 "-s", "printvars", "VARS=%_LICENSE"])
-    for l in o.splitlines():
-        # Get variable name and value
-        pkgvar, value = l.split("=")
-
-        # If present, strip HOST_ from variable name
-        if pkgvar.startswith("HOST_"):
-            pkgvar = pkgvar[5:]
-
-        # Strip _LICENSE
-        pkgvar = pkgvar[:-8]
-
-        # If value is "unknown", no license details available
-        if value == "unknown":
-            continue
-        Package.all_licenses.append(pkgvar)
-
-    # License files
-    o = subprocess.check_output(["make", "BR2_HAVE_DOT_CONFIG=y",
-                                 "-s", "printvars", "VARS=%_LICENSE_FILES"])
-    for l in o.splitlines():
-        # Get variable name and value
-        pkgvar, value = l.split("=")
-
-        # If present, strip HOST_ from variable name
-        if pkgvar.startswith("HOST_"):
-            pkgvar = pkgvar[5:]
-
-        if pkgvar.endswith("_MANIFEST_LICENSE_FILES"):
-            continue
-
-        # Strip _LICENSE_FILES
-        pkgvar = pkgvar[:-14]
-
-        Package.all_license_files.append(pkgvar)
-
-    # Version
-    o = subprocess.check_output(["make", "BR2_HAVE_DOT_CONFIG=y",
-                                 "-s", "printvars", "VARS=%_VERSION"])
-
-    # We process first the host package VERSION, and then the target
-    # package VERSION. This means that if a package exists in both
-    # target and host variants, with different version numbers
-    # (unlikely), we'll report the target version number.
-    version_list = o.splitlines()
-    version_list = [x for x in version_list if x.startswith("HOST_")] + \
-                   [x for x in version_list if not x.startswith("HOST_")]
-    for l in version_list:
-        # Get variable name and value
-        pkgvar, value = l.split("=")
-
-        # If present, strip HOST_ from variable name
-        if pkgvar.startswith("HOST_"):
-            pkgvar = pkgvar[5:]
-
-        if pkgvar.endswith("_DL_VERSION"):
-            continue
-
-        # Strip _VERSION
-        pkgvar = pkgvar[:-8]
-
-        Package.all_versions[pkgvar] = value
-
-
-def set_version_worker(q):
-    while True:
-        pkg = q.get()
-        pkg.set_latest_version()
-        print " [%04d] %s => %s" % (q.qsize(), pkg.name, str(pkg.latest_version))
-        q.task_done()
-
-
-def add_latest_version_info(packages):
-    """
-    Fills in the .latest_version field of all Package objects
-
-    This field has a special format:
-      (mapping, version, id)
-    with:
-    - mapping: boolean that indicates whether release-monitoring.org
-      has a mapping for this package name in the Buildroot distribution
-      or not
-    - version: string containing the latest version known by
-      release-monitoring.org for this package
-    - id: string containing the id of the project corresponding to this
-      package, as known by release-monitoring.org
-    """
-    q = Queue()
-    for pkg in packages:
-        q.put(pkg)
-    # Since release-monitoring.org is rather slow, we create 8 threads
-    # that do HTTP requests to the site.
-    for i in range(8):
-        t = Thread(target=set_version_worker, args=[q])
-        t.daemon = True
-        t.start()
-    q.join()
-
-
-def calculate_stats(packages):
-    stats = defaultdict(int)
-    for pkg in packages:
-        # If packages have multiple infra, take the first one. For the
-        # vast majority of packages, the target and host infra are the
-        # same. There are very few packages that use a different infra
-        # for the host and target variants.
-        if len(pkg.infras) > 0:
-            infra = pkg.infras[0][1]
-            stats["infra-%s" % infra] += 1
-        else:
-            stats["infra-unknown"] += 1
-        if pkg.has_license:
-            stats["license"] += 1
-        else:
-            stats["no-license"] += 1
-        if pkg.has_license_files:
-            stats["license-files"] += 1
-        else:
-            stats["no-license-files"] += 1
-        if pkg.has_hash:
-            stats["hash"] += 1
-        else:
-            stats["no-hash"] += 1
-        if pkg.latest_version[0]:
-            stats["rmo-mapping"] += 1
-        else:
-            stats["rmo-no-mapping"] += 1
-        if not pkg.latest_version[1]:
-            stats["version-unknown"] += 1
-        elif pkg.latest_version[1] == pkg.current_version:
-            stats["version-uptodate"] += 1
-        else:
-            stats["version-not-uptodate"] += 1
-        stats["patches"] += pkg.patch_count
-    return stats
-
-
-html_header = """
-<head>
-<script src=\"https://www.kryogenix.org/code/browser/sorttable/sorttable.js\"></script>
-<style type=\"text/css\">
-table {
-  width: 100%;
-}
-td {
-  border: 1px solid black;
-}
-td.centered {
-  text-align: center;
-}
-td.wrong {
-  background: #ff9a69;
-}
-td.correct {
-  background: #d2ffc4;
-}
-td.nopatches {
-  background: #d2ffc4;
-}
-td.somepatches {
-  background: #ffd870;
-}
-td.lotsofpatches {
-  background: #ff9a69;
-}
-td.version-good {
-  background: #d2ffc4;
-}
-td.version-needs-update {
-  background: #ff9a69;
-}
-td.version-unknown {
- background: #ffd870;
-}
-</style>
-<title>Statistics of Buildroot packages</title>
-</head>
-
-<a href=\"#results\">Results</a><br/>
-
-<p id=\"sortable_hint\"></p>
-"""
-
-
-html_footer = """
-</body>
-<script>
-if (typeof sorttable === \"object\") {
-  document.getElementById(\"sortable_hint\").innerHTML =
-  \"hint: the table can be sorted by clicking the column headers\"
-}
-</script>
-</html>
-"""
-
-
-def infra_str(infra_list):
-    if not infra_list:
-        return "Unknown"
-    elif len(infra_list) == 1:
-        return "<b>%s</b><br/>%s" % (infra_list[0][1], infra_list[0][0])
-    elif infra_list[0][1] == infra_list[1][1]:
-        return "<b>%s</b><br/>%s + %s" % \
-            (infra_list[0][1], infra_list[0][0], infra_list[1][0])
-    else:
-        return "<b>%s</b> (%s)<br/><b>%s</b> (%s)" % \
-            (infra_list[0][1], infra_list[0][0],
-             infra_list[1][1], infra_list[1][0])
-
-
-def boolean_str(b):
-    if b:
-        return "Yes"
-    else:
-        return "No"
-
-
-def dump_html_pkg(f, pkg):
-    f.write(" <tr>\n")
-    f.write("  <td>%s</td>\n" % pkg.path[2:])
-
-    # Patch count
-    td_class = ["centered"]
-    if pkg.patch_count == 0:
-        td_class.append("nopatches")
-    elif pkg.patch_count < 5:
-        td_class.append("somepatches")
-    else:
-        td_class.append("lotsofpatches")
-    f.write("  <td class=\"%s\">%s</td>\n" %
-            (" ".join(td_class), str(pkg.patch_count)))
-
-    # Infrastructure
-    infra = infra_str(pkg.infras)
-    td_class = ["centered"]
-    if infra == "Unknown":
-        td_class.append("wrong")
-    else:
-        td_class.append("correct")
-    f.write("  <td class=\"%s\">%s</td>\n" %
-            (" ".join(td_class), infra_str(pkg.infras)))
-
-    # License
-    td_class = ["centered"]
-    if pkg.has_license:
-        td_class.append("correct")
-    else:
-        td_class.append("wrong")
-    f.write("  <td class=\"%s\">%s</td>\n" %
-            (" ".join(td_class), boolean_str(pkg.has_license)))
-
-    # License files
-    td_class = ["centered"]
-    if pkg.has_license_files:
-        td_class.append("correct")
-    else:
-        td_class.append("wrong")
-    f.write("  <td class=\"%s\">%s</td>\n" %
-            (" ".join(td_class), boolean_str(pkg.has_license_files)))
-
-    # Hash
-    td_class = ["centered"]
-    if pkg.has_hash:
-        td_class.append("correct")
-    else:
-        td_class.append("wrong")
-    f.write("  <td class=\"%s\">%s</td>\n" %
-            (" ".join(td_class), boolean_str(pkg.has_hash)))
-
-    # Current version
-    if len(pkg.current_version) > 20:
-        current_version = pkg.current_version[:20] + "..."
-    else:
-        current_version = pkg.current_version
-    f.write("  <td class=\"centered\">%s</td>\n" % current_version)
-
-    # Latest version
-    if pkg.latest_version[1] is None:
-        td_class.append("version-unknown")
-    elif pkg.latest_version[1] != pkg.current_version:
-        td_class.append("version-needs-update")
-    else:
-        td_class.append("version-good")
-
-    if pkg.latest_version[1] is None:
-        latest_version_text = "<b>Unknown</b>"
-    else:
-        latest_version_text = "<b>%s</b>" % str(pkg.latest_version[1])
-
-    latest_version_text += "<br/>"
-
-    if pkg.latest_version[2]:
-        latest_version_text += "<a href=\"https://release-monitoring.org/project/%s\">link</a>, " % pkg.latest_version[2]
-    else:
-        latest_version_text += "no link, "
-
-    if pkg.latest_version[0]:
-        latest_version_text += "has <a href=\"https://release-monitoring.org/distro/Buildroot/\">mapping</a>"
-    else:
-        latest_version_text += "has <a href=\"https://release-monitoring.org/distro/Buildroot/\">no mapping</a>"
-
-    f.write("  <td class=\"%s\">%s</td>\n" %
-            (" ".join(td_class), latest_version_text))
-
-    # Warnings
-    td_class = ["centered"]
-    if pkg.warnings == 0:
-        td_class.append("correct")
-    else:
-        td_class.append("wrong")
-    f.write("  <td class=\"%s\">%d</td>\n" %
-            (" ".join(td_class), pkg.warnings))
-
-    f.write(" </tr>\n")
-
-
-def dump_html_all_pkgs(f, packages):
-    f.write("""
-<table class=\"sortable\">
-<tr>
-<td>Package</td>
-<td class=\"centered\">Patch count</td>
-<td class=\"centered\">Infrastructure</td>
-<td class=\"centered\">License</td>
-<td class=\"centered\">License files</td>
-<td class=\"centered\">Hash file</td>
-<td class=\"centered\">Current version</td>
-<td class=\"centered\">Latest version</td>
-<td class=\"centered\">Warnings</td>
-</tr>
-""")
-    for pkg in sorted(packages):
-        dump_html_pkg(f, pkg)
-    f.write("</table>")
-
-
-def dump_html_stats(f, stats):
-    f.write("<a id=\"results\"></a>\n")
-    f.write("<table>\n")
-    infras = [infra[6:] for infra in stats.keys() if infra.startswith("infra-")]
-    for infra in infras:
-        f.write(" <tr><td>Packages using the <i>%s</i> infrastructure</td><td>%s</td></tr>\n" %
-                (infra, stats["infra-%s" % infra]))
-    f.write(" <tr><td>Packages having license information</td><td>%s</td></tr>\n" %
-            stats["license"])
-    f.write(" <tr><td>Packages not having license information</td><td>%s</td></tr>\n" %
-            stats["no-license"])
-    f.write(" <tr><td>Packages having license files information</td><td>%s</td></tr>\n" %
-            stats["license-files"])
-    f.write(" <tr><td>Packages not having license files information</td><td>%s</td></tr>\n" %
-            stats["no-license-files"])
-    f.write(" <tr><td>Packages having a hash file</td><td>%s</td></tr>\n" %
-            stats["hash"])
-    f.write(" <tr><td>Packages not having a hash file</td><td>%s</td></tr>\n" %
-            stats["no-hash"])
-    f.write(" <tr><td>Total number of patches</td><td>%s</td></tr>\n" %
-            stats["patches"])
-    f.write("<tr><td>Packages having a mapping on <i>release-monitoring.org</i></td><td>%s</td></tr>\n" %
-            stats["rmo-mapping"])
-    f.write("<tr><td>Packages lacking a mapping on <i>release-monitoring.org</i></td><td>%s</td></tr>\n" %
-            stats["rmo-no-mapping"])
-    f.write("<tr><td>Packages that are up-to-date</td><td>%s</td></tr>\n" %
-            stats["version-uptodate"])
-    f.write("<tr><td>Packages that are not up-to-date</td><td>%s</td></tr>\n" %
-            stats["version-not-uptodate"])
-    f.write("<tr><td>Packages with no known upstream version</td><td>%s</td></tr>\n" %
-            stats["version-unknown"])
-    f.write("</table>\n")
-
-
-def dump_gen_info(f):
-    # Updated on Mon Feb 19 08:12:08 CET 2018, Git commit aa77030b8f5e41f1c53eb1c1ad664b8c814ba032
-    o = subprocess.check_output(["git", "log", "master", "-n", "1", "--pretty=format:%H"])
-    git_commit = o.splitlines()[0]
-    f.write("<p><i>Updated on %s, git commit %s</i></p>\n" %
-            (str(datetime.datetime.utcnow()), git_commit))
-
-
-def dump_html(packages, stats, output):
-    with open(output, 'w') as f:
-        f.write(html_header)
-        dump_html_all_pkgs(f, packages)
-        dump_html_stats(f, stats)
-        dump_gen_info(f)
-        f.write(html_footer)
-
-
-def parse_args():
-    parser = argparse.ArgumentParser()
-    parser.add_argument('-o', dest='output', action='store', required=True,
-                        help='HTML output file')
-    parser.add_argument('-n', dest='npackages', type=int, action='store',
-                        help='Number of packages')
-    parser.add_argument('-p', dest='packages', action='store',
-                        help='List of packages (comma separated)')
-    return parser.parse_args()
-
-
-def __main__():
-    args = parse_args()
-    if args.npackages and args.packages:
-        print "ERROR: -n and -p are mutually exclusive"
-        sys.exit(1)
-    if args.packages:
-        package_list = args.packages.split(",")
-    else:
-        package_list = None
-    print "Build package list ..."
-    packages = get_pkglist(args.npackages, package_list)
-    print "Getting package make info ..."
-    package_init_make_info()
-    print "Getting package details ..."
-    for pkg in packages:
-        pkg.set_infra()
-        pkg.set_license()
-        pkg.set_hash_info()
-        pkg.set_patch_count()
-        pkg.set_check_package_warnings()
-        pkg.set_current_version()
-    print "Getting latest versions ..."
-    add_latest_version_info(packages)
-    print "Calculate stats"
-    stats = calculate_stats(packages)
-    print "Write HTML"
-    dump_html(packages, stats, args.output)
-
-
-__main__()
-- 
2.14.3


* [Buildroot] [PATCH v3 1/5] support/scripts/pkg-stats-new: rewrite in Python
  2018-03-23 20:54 ` [Buildroot] [PATCH v3 1/5] support/scripts/pkg-stats-new: rewrite in Python Thomas Petazzoni
@ 2018-03-30  3:23   ` Ricardo Martincoski
  0 siblings, 0 replies; 11+ messages in thread
From: Ricardo Martincoski @ 2018-03-30  3:23 UTC (permalink / raw)
  To: buildroot

Hello,

On Fri, Mar 23, 2018 at 05:54 PM, Thomas Petazzoni wrote:

> This commit adds a new version of the pkg-stats script, rewritten in
> Python. It is for now implemented in a separate file called
> pkg-stats-new, in order to make the diff easily readable. A future
> commit will rename it to pkg-stats.
> 
> Compared to the existing shell-based pkg-stats script, the
> functionality and output are basically the same. The main difference is
> that the output no longer goes to stdout, but to the file passed as
> argument using the -o option. This allows stdout to be used for more
> debugging related information.
> 

> The way the script works is that a first function get_pkglist()
> creates a dict associating package names with an instance of a
> Package() object, containing basic information about the package. Then
> a number of other functions (add_infra_info, add_pkg_make_info,
> add_hash_info, add_patch_count, add_check_package_warnings) will
> calculate additional information about packages, and fill in fields in
> the Package objects.

This is not accurate anymore for v3.
The info from make printvars is now gathered as the second step.
The functions were moved to methods of the object and got renamed.

> 
> calculate_stats() then calculates global statistics (how many packages
> have license information, how many packages have a hash file, etc.). Finally,
> dump_html() produces the HTML output, using a number of sub-functions.
> 
> One improvement over the shell-based version is that we can use
> regexps to exclude some .mk files. Thanks to this, we can exclude all
> linux-ext-*.mk files, avoiding incorrect matches.
> 
> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>

[With the commit log updated to describe v3, please add]
Reviewed-by: Ricardo Martincoski <ricardo.martincoski@gmail.com>


Regards,
Ricardo


* [Buildroot] [PATCH v3 3/5] support/scripts/pkg-stats-new: add current version information
  2018-03-23 20:54 ` [Buildroot] [PATCH v3 3/5] support/scripts/pkg-stats-new: add current version information Thomas Petazzoni
@ 2018-03-30  3:25   ` Ricardo Martincoski
  0 siblings, 0 replies; 11+ messages in thread
From: Ricardo Martincoski @ 2018-03-30  3:25 UTC (permalink / raw)
  To: buildroot

Hello,

On Fri, Mar 23, 2018 at 05:54 PM, Thomas Petazzoni wrote:

> This commit adds a new column in the HTML output containing the
> current version of a package in Buildroot. As such, it isn't terribly
> useful, but combined with the latest upstream version added in a
> follow-up commit, it will become very useful.
> 
> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>

Reviewed-by: Ricardo Martincoski <ricardo.martincoski@gmail.com>


Regards,
Ricardo


* [Buildroot] [PATCH v3 4/5] support/scripts/pkg-stats-new: add latest upstream version information
  2018-03-23 20:54 ` [Buildroot] [PATCH v3 4/5] support/scripts/pkg-stats-new: add latest upstream " Thomas Petazzoni
@ 2018-03-30  3:32   ` Ricardo Martincoski
  2018-04-05  8:56     ` Peter Korsgaard
  0 siblings, 1 reply; 11+ messages in thread
From: Ricardo Martincoski @ 2018-03-30  3:32 UTC (permalink / raw)
  To: buildroot

Hello,

On Fri, Mar 23, 2018 at 05:54 PM, Thomas Petazzoni wrote:

[snip]
> Since release-monitoring.org is a bit slow, we have 8 threads that
> fetch information in parallel.

I disagree with this explanation.
As I see it, the problem with release-monitoring.org is that its API v1 forces us
to create one request per package. The consequence is that we have to make 2000+
requests. Doing them serially is what causes the slowdown.
The response time for a single request to the site seems reasonable to me.

[snip]
> ---
> Changes since v2:
> - Use the "timeout" argument of urllib2.urlopen() in order to make
>   sure that the requests terminate at some point, even if
>   release-monitoring.org is stuck.

When I run the script and one request times out, the script still hangs at the
end.

Also, at any moment after the first HTTP request, CTRL+C is ignored and the
script cannot be interrupted by the user. I had to kill the interpreter to exit.

It seems it is possible to properly handle this using threading.Event() +
signal.SIGINT... but wait! It is getting too complicated.
So I thought there must be a better solution.
I did some research and I believe there is.
Let me propose an alternative solution, this time not in the logic of the
script but in the underlying modules used...

[snip]
> +from Queue import Queue
> +from threading import Thread

There are a lot of tutorials and articles in the wild saying this is the way to
go. After some digging online I think most of these articles are incomplete.
This seems to be a more complete article about these modules:
https://christopherdavis.me/blog/threading-basics.html


But then I tested the module multiprocessing.
IMO it is the way to go for this case.
See below the comparison.

1) serialized requests:
 - really simple code
 - would take 2 hours to run on my machine

2) threading + Queue:
 - lots of boilerplate code to work properly
 - 20 minutes on my machine

3) multiprocessing:
 - simpler code than threading + Queue
 - 16 minutes on my machine
 - 9 minutes in the Gitlab CI elastic runner:
https://gitlab.com/RicardoMartincoski/buildroot/-/jobs/60290644

The demo code is here (a commit on the top of this series 1 to 4):
https://gitlab.com/RicardoMartincoski/buildroot/commit/dc5f447c30157499cd925c9e79c7bc9c29252219

Of course, as with any solution, there are some downsides.
 - Pool.apply_async can't call object methods. There are solutions to this using
   other modules, but I think the simpler code wins. We just need to offload the
   code that runs asynchronously to helper functions. Yes, like you did in a
   previous iteration of the series.
 - more RAM is consumed per worker. I did a very simple measurement and htop
   shows 60MB per worker. I don't think it is too much in this case. I did not
   measure the other solutions.

Can we switch to use multiprocessing?
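
To illustrate the shape of the multiprocessing approach described above, here
is a minimal sketch. It is not the demo commit: the helper names
(fetch_latest_version, collect_latest_versions) are hypothetical, the helper
returns a fake value instead of querying release-monitoring.org, and it is
written for Python 3 even though the series itself targets Python 2.

```python
from multiprocessing import Pool


def fetch_latest_version(name):
    # Hypothetical stand-in for the per-package HTTP request: it returns
    # a fake version so the sketch can run without network access.
    return (name, "1.0")


def collect_latest_versions(names, workers=8):
    # apply_async needs a picklable callable, which is why the work is
    # done by a module-level helper rather than by a Package method.
    pool = Pool(processes=workers)
    try:
        pending = [pool.apply_async(fetch_latest_version, (n,)) for n in names]
        pool.close()
        # A timeout on get() bounds how long the parent waits per result.
        return dict(r.get(timeout=60) for r in pending)
    finally:
        # Make sure no worker process is left behind, even on error.
        pool.terminate()
        pool.join()
```

Called from under an "if __name__ == '__main__':" guard,
collect_latest_versions(["busybox", "zlib"]) would return a dict mapping each
name to the placeholder version. The 8 workers and the 60 second timeout are
arbitrary values chosen for the sketch.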

[snip]
> +    def get_latest_version_by_distro(self):
> +        try:
> +            req = urllib2.Request(os.path.join(RELEASE_MONITORING_API, "project", "Buildroot", self.name))
> +            f = urllib2.urlopen(req, timeout=15)
> +        except:

Did you forget to re-run flake8?

Using bare exceptions is bad.
https://docs.python.org/2/howto/doanddont.html#except

You can catch all exceptions from the request by using:

        except urllib2.URLError:
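
As a rough sketch of the narrower exception handling suggested above, ported
to Python 3 (urllib.request and urllib.error instead of urllib2; that port,
the function name fetch_or_none and the opener parameter are assumptions made
for illustration only). It also catches socket.timeout, on the assumption
that a timeout while reading the body may not be wrapped in URLError:

```python
import socket
from urllib.error import URLError
from urllib.request import urlopen


def fetch_or_none(url, timeout=15, opener=urlopen):
    # Return the response body, or None when the request fails or times
    # out. HTTPError is a subclass of URLError, so HTTP error statuses
    # are covered as well. The opener parameter only exists so the
    # function can be exercised without network access.
    try:
        response = opener(url, timeout=timeout)
        try:
            return response.read()
        finally:
            response.close()
    except (URLError, socket.timeout):
        return None
```

Unlike a bare except, this lets KeyboardInterrupt and programming errors such
as NameError propagate instead of being silently swallowed.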

[snip]
> +    def get_latest_version_by_guess(self):
> +        try:
> +            req = urllib2.Request(os.path.join(RELEASE_MONITORING_API, "projects", "?pattern=%s" % self.name))
> +            f = urllib2.urlopen(req, timeout=15)
> +        except:

Same here.


Regards,
Ricardo


* [Buildroot] [PATCH v3 0/5] New pkg-stats script, with version information
  2018-03-23 20:54 [Buildroot] [PATCH v3 0/5] New pkg-stats script, with version information Thomas Petazzoni
                   ` (4 preceding siblings ...)
  2018-03-23 20:54 ` [Buildroot] [PATCH v3 5/5] support/scripts/pkg-stats: replace with new Python version Thomas Petazzoni
@ 2018-04-04 20:10 ` Thomas Petazzoni
  5 siblings, 0 replies; 11+ messages in thread
From: Thomas Petazzoni @ 2018-04-04 20:10 UTC (permalink / raw)
  To: buildroot

Hello,

On Fri, 23 Mar 2018 21:54:50 +0100, Thomas Petazzoni wrote:

> Thomas Petazzoni (5):
>   support/scripts/pkg-stats-new: rewrite in Python
>   support/scripts/pkg-stats-new: add -n and -p options
>   support/scripts/pkg-stats-new: add current version information

I have applied those three patches, after fixing the commit log of the
first patch, as noticed by Ricardo.

>   support/scripts/pkg-stats-new: add latest upstream version information

I've left this patch unapplied, since Ricardo is still reporting issues
with it. I'll mark it as Changes Requested.

>   support/scripts/pkg-stats: replace with new Python version

I've applied this one as well, rebased on top of
"support/scripts/pkg-stats-new: add current version information", so
that pkg-stats-new effectively replaces pkg-stats from now on.

Thanks Ricardo again for all the review on this patch series. Let's
continue to work together to find a proper solution to retrieve the
latest upstream version information.

Best regards,

Thomas
-- 
Thomas Petazzoni, CTO, Bootlin (formerly Free Electrons)
Embedded Linux and Kernel engineering
https://bootlin.com


* [Buildroot] [PATCH v3 4/5] support/scripts/pkg-stats-new: add latest upstream version information
  2018-03-30  3:32   ` Ricardo Martincoski
@ 2018-04-05  8:56     ` Peter Korsgaard
  0 siblings, 0 replies; 11+ messages in thread
From: Peter Korsgaard @ 2018-04-05  8:56 UTC (permalink / raw)
  To: buildroot

>>>>> "Ricardo" == Ricardo Martincoski <ricardo.martincoski@gmail.com> writes:

Hi,

 > There are a lot of tutorials and articles in the wild saying this is the way to
 > go. After some digging online I think most of these articles are incomplete.
 > This seems to be a more complete article about these modules:
 > https://christopherdavis.me/blog/threading-basics.html

 > But then I tested the module multiprocessing.
 > IMO it is the way to go for this case.
 > See below the comparison.

 > 1) serialized requests:
 >  - really simple code
 >  - would take 2 hours to run on my machine

 > 2) threading + Queue:
 >  - lots of boilerplate code to work properly
 >  - 20 minutes on my machine

 > 3) multiprocessing:
 >  - simpler code than threading + Queue
 >  - 16 minutes on my machine
 >  - 9 minutes in the Gitlab CI elastic runner:
 > https://gitlab.com/RicardoMartincoski/buildroot/-/jobs/60290644

 > The demo code is here (a commit on the top of this series 1 to 4):
 > https://gitlab.com/RicardoMartincoski/buildroot/commit/dc5f447c30157499cd925c9e79c7bc9c29252219

 > Of course, as with any solution, there are some downsides.
 >  - Pool.apply_async can't call object methods. There are solutions to this using
 >    other modules, but I think the simpler code wins. We just need to offload the
 >    code that runs asynchronously to helper functions. Yes, like you did in a
 >    previous iteration of the series.
 >  - more RAM is consumed per worker. I did a very simple measurement and htop
 >    shows 60MB per worker. I don't think it is too much in this case. I did not
 >    measure the other solutions.

 > Can we switch to use multiprocessing?

I'm far from a Python expert, but it certainly sounds sensible to me! Thomas?

-- 
Bye, Peter Korsgaard
