* [Buildroot] [PATCH v3 1/2] support/scripts/pkg-stats: URL checking support
@ 2018-10-02  2:37 Matt Weber
  2018-10-02  2:37 ` [Buildroot] [PATCH v3 2/2] support/scripts/pkg-stats: URL check using threads Matt Weber
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Matt Weber @ 2018-10-02  2:37 UTC (permalink / raw)
  To: buildroot

 - Adds support to check if a package has a URL and if that URL
   is valid by doing a header request.
 - Reports this information as part of the generated html output
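
As a rough illustration of the check described in the list above (a sketch, not the code in this patch): the patch calls requests.head() directly, whereas here the HTTP call is passed in as `head`, a hypothetical callable returning a status code, so the classification can be tried without network access. The status strings follow those used in this series.

```python
def classify_url(url, head):
    """Return a status string for url.

    head is a hypothetical callable that takes a URL and returns an HTTP
    status code; it may raise an exception on connection failure.
    """
    if url is None:
        # Nothing to check: no URL was found for the package
        return "Missing"
    try:
        code = head(url)
    except Exception:
        # Connection errors and timeouts are reported as invalid
        return "Invalid(Err)"
    if code >= 400:
        # 4xx and 5xx responses mean the URL is not usable
        return "Invalid(%d)" % code
    return "Ok"
```

With the requests library installed, the callable could be something like `lambda u: requests.head(u, timeout=30).status_code`.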

The URL data is currently gathered from the URL string provided
in the Kconfig help sections for each package.
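
A minimal sketch of that lookup, assuming the URL sits alone on its own line of the help text. The pattern is the one the patch names URL_RE; `find_url` and the sample Config.in content are hypothetical and only for illustration.

```python
import re

# Same pattern as URL_RE in the patch, written as a raw string
URL_RE = re.compile(r"\s*https?://\S*\s*$")


def find_url(lines):
    """Return the first line that consists only of a URL, or None."""
    for line in lines:
        if URL_RE.match(line):
            return line.strip()
    return None


# Hypothetical Config.in content, for illustration only
sample = [
    "config BR2_PACKAGE_FOO\n",
    "\tbool \"foo\"\n",
    "\thelp\n",
    "\t  Foo is an example package, see http://example.com/doc for docs.\n",
    "\n",
    "\t  https://example.com/foo\n",
]
print(find_url(sample))  # prints https://example.com/foo
```

Because match() anchors at the start of the line, a URL embedded in a sentence (as in the fourth sample line) is skipped; only a line holding nothing but the URL is picked up.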

This check helps ensure the URLs are valid and can be used
for other scripting purposes as the product's home site/URL.
CPE XML generation is an example of a case that could use this
product URL as part of an automated update generation script.

CC: Ricardo Martincoski <ricardo.martincoski@gmail.com>
Signed-off-by: Matt Weber <matthew.weber@rockwellcollins.com>
---
Changes
v2 -> v3
[Ricardo]
 - Fixed flake8 warnings of extra lines and unused variable
 - Adjusted regex to be compiled once and used example to
   simplify finding the url and cleaning up the string
 - Removed checking for help section and instead used the suggested
   improved regex to find the URL.

v1 -> v2
 - Dropped disabling of SSL cert verifying

[Thomas]
 - Moved URL report to new column
 - Added color coding
 - Better reporting when a package has no Config.in to search
   or its Config.in is missing the URL
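
The new column and its color coding could be sketched roughly as follows. This is an illustration under assumptions, not the patch code: `url_cell` is a hypothetical helper, it targets Python 3 (html.escape), and unlike the patch it quotes and escapes the href value. The CSS class names are the ones the patch adds.

```python
from html import escape


def url_cell(url, url_status):
    """Build the <td> for the URL column from a package's URL and status."""
    td_class = ["centered"]
    if url_status in ("Missing", "No Config.in"):
        # No URL to link to: show the status text only
        td_class.append("missing_url")
        text = url_status
    elif url_status.startswith("Invalid"):
        # Keep the link so the failing URL can be inspected by hand
        td_class.append("invalid_url")
        text = '<a href="%s">%s</a>' % (escape(url, quote=True),
                                        escape(url_status))
    else:
        td_class.append("good_url")
        text = '<a href="%s">Link</a>' % escape(url, quote=True)
    return '<td class="%s">%s</td>' % (" ".join(td_class), text)
```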
---
 support/scripts/pkg-stats | 57 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 57 insertions(+)

diff --git a/support/scripts/pkg-stats b/support/scripts/pkg-stats
index b7b00e8..c841e09 100755
--- a/support/scripts/pkg-stats
+++ b/support/scripts/pkg-stats
@@ -24,8 +24,10 @@ from collections import defaultdict
 import re
 import subprocess
 import sys
+import requests  # URL checking
 
 INFRA_RE = re.compile("\$\(eval \$\(([a-z-]*)-package\)\)")
+URL_RE = re.compile("\s*https?://\S*\s*$")
 
 
 class Package:
@@ -43,10 +45,29 @@ class Package:
         self.patch_count = 0
         self.warnings = 0
         self.current_version = None
+        self.url = None
+        self.url_status = None
 
     def pkgvar(self):
         return self.name.upper().replace("-", "_")
 
+    def set_url(self):
+        """
+        Fills in the .url field
+        """
+        self.url_status = "No Config.in"
+        for filename in os.listdir(os.path.dirname(self.path)):
+            if fnmatch.fnmatch(filename, 'Config.*'):
+                fp = open(os.path.join(os.path.dirname(self.path), filename), "r")
+                for config_line in fp:
+                    if URL_RE.match(config_line):
+                        self.url = config_line.strip()
+                        self.url_status = "Found"
+                        fp.close()
+                        return
+                self.url_status = "Missing"
+                fp.close()
+
     def set_infra(self):
         """
         Fills in the .infras field
@@ -255,6 +276,16 @@ def package_init_make_info():
         Package.all_versions[pkgvar] = value
 
 
+def check_url_status(pkg):
+    if pkg.url_status != "Missing" and pkg.url_status != "No Config.in":
+        try:
+            url_status_code = requests.head(pkg.url, timeout=5).status_code
+            if url_status_code >= 400:
+                pkg.url_status = "Invalid(%s)" % str(url_status_code)
+        except requests.exceptions.RequestException:
+            return
+
+
 def calculate_stats(packages):
     stats = defaultdict(int)
     for pkg in packages:
@@ -311,6 +342,15 @@ td.somepatches {
 td.lotsofpatches {
   background: #ff9a69;
 }
+td.good_url {
+  background: #d2ffc4;
+}
+td.missing_url {
+  background: #ffd870;
+}
+td.invalid_url {
+  background: #ff9a69;
+}
 </style>
 <title>Statistics of Buildroot packages</title>
 </head>
@@ -422,6 +462,20 @@ def dump_html_pkg(f, pkg):
     f.write("  <td class=\"%s\">%d</td>\n" %
             (" ".join(td_class), pkg.warnings))
 
+    # URL status
+    td_class = ["centered"]
+    url_str = pkg.url_status
+    if pkg.url_status == "Missing" or pkg.url_status == "No Config.in":
+        td_class.append("missing_url")
+    elif pkg.url_status.startswith("Invalid"):
+        td_class.append("invalid_url")
+        url_str = "<a href=%s>%s</a>" % (pkg.url, pkg.url_status)
+    else:
+        td_class.append("good_url")
+        url_str = "<a href=%s>Link</a>" % pkg.url
+    f.write("  <td class=\"%s\">%s</td>\n" %
+            (" ".join(td_class), url_str))
+
     f.write(" </tr>\n")
 
 
@@ -437,6 +491,7 @@ def dump_html_all_pkgs(f, packages):
 <td class=\"centered\">Hash file</td>
 <td class=\"centered\">Current version</td>
 <td class=\"centered\">Warnings</td>
+<td class=\"centered\">URL</td>
 </tr>
 """)
     for pkg in sorted(packages):
@@ -517,6 +572,8 @@ def __main__():
         pkg.set_patch_count()
         pkg.set_check_package_warnings()
         pkg.set_current_version()
+        pkg.set_url()
+        check_url_status(pkg)
     print("Calculate stats")
     stats = calculate_stats(packages)
     print("Write HTML")
-- 
1.9.1


* [Buildroot] [PATCH v3 2/2] support/scripts/pkg-stats: URL check using threads
  2018-10-02  2:37 [Buildroot] [PATCH v3 1/2] support/scripts/pkg-stats: URL checking support Matt Weber
@ 2018-10-02  2:37 ` Matt Weber
  2018-10-03  0:50   ` Ricardo Martincoski
  2018-10-09  8:15   ` Thomas Petazzoni
  2018-10-03  0:49 ` [Buildroot] [PATCH v3 1/2] support/scripts/pkg-stats: URL checking support Ricardo Martincoski
  2018-10-09  8:15 ` Thomas Petazzoni
  2 siblings, 2 replies; 8+ messages in thread
From: Matt Weber @ 2018-10-02  2:37 UTC (permalink / raw)
  To: buildroot

Adds a pool of worker processes (multiprocessing.Pool) to accelerate
connection testing.

~7.5MB and 2% CPU per worker on an Intel i5-3230M CPU @ 2.60GHz.

Runtime is ~3min in parallel vs ~15min sequentially.
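
The submit-then-collect pattern behind this speedup can be sketched as below. Assumptions: the sketch swaps in multiprocessing.pool.ThreadPool, which offers the same apply_async()/get() interface as the process Pool used in the patch, so that it runs without a `__main__` guard; `check_all` and `fake_worker` are hypothetical names, and the worker does no network access.

```python
from multiprocessing.pool import ThreadPool


def check_all(items, worker, workers=8, timeout=60):
    """Run worker(*args) for each args tuple in items, in parallel.

    Results are returned in the same order as items.
    """
    with ThreadPool(processes=workers) as pool:
        # Submit everything first so the jobs run concurrently
        pending = [pool.apply_async(worker, args) for args in items]
        # Then collect each result, waiting at most `timeout` seconds per job
        return [job.get(timeout=timeout) for job in pending]


def fake_worker(url, url_status):
    # Stand-in for the real HEAD request: no network access here
    if url_status != "Found":
        return url_status
    return "Ok" if url.startswith("https://") else "Invalid(Err)"


jobs = [
    ("https://example.com/a", "Found"),
    ("http://example.com/b", "Found"),
    (None, "Missing"),
]
print(check_all(jobs, fake_worker))  # prints ['Ok', 'Invalid(Err)', 'Missing']
```

Submitting all jobs before calling get() on any of them is what lets slow hosts overlap; collecting inside the first loop would serialize the requests again.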

CC: Ricardo Martincoski <ricardo.martincoski@gmail.com>
Signed-off-by: Matthew Weber <matthew.weber@rockwellcollins.com>

---
Changes
v2 -> v3
[Ricardo]
 - Updated the timeout for header request testing to 30 seconds, as
   there may be corner cases with a smaller value when running
   requests in parallel on a busy machine

v1 -> v2
 - New patch
---
 support/scripts/pkg-stats | 25 +++++++++++++++++++------
 1 file changed, 19 insertions(+), 6 deletions(-)

diff --git a/support/scripts/pkg-stats b/support/scripts/pkg-stats
index c841e09..b615001 100755
--- a/support/scripts/pkg-stats
+++ b/support/scripts/pkg-stats
@@ -25,6 +25,7 @@ import re
 import subprocess
 import sys
 import requests  # URL checking
+from multiprocessing import Pool
 
 INFRA_RE = re.compile("\$\(eval \$\(([a-z-]*)-package\)\)")
 URL_RE = re.compile("\s*https?://\S*\s*$")
@@ -47,6 +48,7 @@ class Package:
         self.current_version = None
         self.url = None
         self.url_status = None
+        self.url_worker = None
 
     def pkgvar(self):
         return self.name.upper().replace("-", "_")
@@ -276,14 +278,24 @@ def package_init_make_info():
         Package.all_versions[pkgvar] = value
 
 
-def check_url_status(pkg):
-    if pkg.url_status != "Missing" and pkg.url_status != "No Config.in":
+def check_url_status_worker(url, url_status):
+    if url_status != "Missing" and url_status != "No Config.in":
         try:
-            url_status_code = requests.head(pkg.url, timeout=5).status_code
+            url_status_code = requests.head(url, timeout=30).status_code
             if url_status_code >= 400:
-                pkg.url_status = "Invalid(%s)" % str(url_status_code)
+                return "Invalid(%s)" % str(url_status_code)
         except requests.exceptions.RequestException:
-            return
+            return "Invalid(Err)"
+        return "Ok"
+    return url_status
+
+
+def check_package_urls(packages):
+    Package.pool = Pool(processes=64)
+    for pkg in packages:
+        pkg.url_worker = pkg.pool.apply_async(check_url_status_worker, (pkg.url, pkg.url_status))
+    for pkg in packages:
+        pkg.url_status = pkg.url_worker.get(timeout=3600)
 
 
 def calculate_stats(packages):
@@ -573,7 +585,8 @@ def __main__():
         pkg.set_check_package_warnings()
         pkg.set_current_version()
         pkg.set_url()
-        check_url_status(pkg)
+    print("Checking URL status")
+    check_package_urls(packages)
     print("Calculate stats")
     stats = calculate_stats(packages)
     print("Write HTML")
-- 
1.9.1


* [Buildroot] [PATCH v3 1/2] support/scripts/pkg-stats: URL checking support
  2018-10-02  2:37 [Buildroot] [PATCH v3 1/2] support/scripts/pkg-stats: URL checking support Matt Weber
  2018-10-02  2:37 ` [Buildroot] [PATCH v3 2/2] support/scripts/pkg-stats: URL check using threads Matt Weber
@ 2018-10-03  0:49 ` Ricardo Martincoski
  2018-10-09  8:15 ` Thomas Petazzoni
  2 siblings, 0 replies; 8+ messages in thread
From: Ricardo Martincoski @ 2018-10-03  0:49 UTC (permalink / raw)
  To: buildroot

Hello,

On Mon, Oct 01, 2018 at 11:37 PM, Matt Weber wrote:

>  - Adds support to check if a package has a URL and if that URL
>    is valid by doing a header request.
>  - Reports this information as part of the generated html output
> 
> The URL data is currently gathered from the URL string provided
> in the Kconfig help sections for each package.
> 
> This check helps ensure the URLs are valid and can be used
> for other scripting purposes as the product's home site/URL.
> CPE XML generation is an example of a case that could use this
> product URL as part of an automated update generation script.
> 
> CC: Ricardo Martincoski <ricardo.martincoski@gmail.com>
> Signed-off-by: Matt Weber <matthew.weber@rockwellcollins.com>

Reviewed-by: Ricardo Martincoski <ricardo.martincoski@gmail.com>


Regards,
Ricardo


* [Buildroot] [PATCH v3 2/2] support/scripts/pkg-stats: URL check using threads
  2018-10-02  2:37 ` [Buildroot] [PATCH v3 2/2] support/scripts/pkg-stats: URL check using threads Matt Weber
@ 2018-10-03  0:50   ` Ricardo Martincoski
  2018-10-03 12:54     ` Thomas Petazzoni
  2018-10-09  8:15   ` Thomas Petazzoni
  1 sibling, 1 reply; 8+ messages in thread
From: Ricardo Martincoski @ 2018-10-03  0:50 UTC (permalink / raw)
  To: buildroot

Hello,

On Mon, Oct 01, 2018 at 11:37 PM, Matt Weber wrote:

> Adds a pool of worker threads to accelerate connection testing.
> 
> ~7.5MB and 2% CPU per thread on a Intel i5-3230M CPU @ 2.60GHz.
> 
> Runtime is ~3min in parallel vs ~15min.
> 
> CC: Ricardo Martincoski <ricardo.martincoski@gmail.com>
> Signed-off-by: Matthew Weber <matthew.weber@rockwellcollins.com>

Reviewed-by: Ricardo Martincoski <ricardo.martincoski@gmail.com>


Regards,
Ricardo


* [Buildroot] [PATCH v3 2/2] support/scripts/pkg-stats: URL check using threads
  2018-10-03  0:50   ` Ricardo Martincoski
@ 2018-10-03 12:54     ` Thomas Petazzoni
  2018-10-03 13:22       ` Ricardo Martincoski
  0 siblings, 1 reply; 8+ messages in thread
From: Thomas Petazzoni @ 2018-10-03 12:54 UTC (permalink / raw)
  To: buildroot

Hello Ricardo,

On Tue, 02 Oct 2018 21:50:04 -0300, Ricardo Martincoski wrote:

> On Mon, Oct 01, 2018 at 11:37 PM, Matt Weber wrote:
> 
> > Adds a pool of worker threads to accelerate connection testing.
> > 
> > ~7.5MB and 2% CPU per thread on a Intel i5-3230M CPU @ 2.60GHz.
> > 
> > Runtime is ~3min in parallel vs ~15min.
> > 
> > CC: Ricardo Martincoski <ricardo.martincoski@gmail.com>
> > Signed-off-by: Matthew Weber <matthew.weber@rockwellcollins.com>  
> 
> Reviewed-by: Ricardo Martincoski <ricardo.martincoski@gmail.com>

So this threading works for you? I think it is modeled after what I
initially did to retrieve the upstream version of the packages, and you
said it wasn't working well for you.

Best regards,

Thomas
-- 
Thomas Petazzoni, CTO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


* [Buildroot] [PATCH v3 2/2] support/scripts/pkg-stats: URL check using threads
  2018-10-03 12:54     ` Thomas Petazzoni
@ 2018-10-03 13:22       ` Ricardo Martincoski
  0 siblings, 0 replies; 8+ messages in thread
From: Ricardo Martincoski @ 2018-10-03 13:22 UTC (permalink / raw)
  To: buildroot

Hello Thomas,

On Wednesday, October 3, 2018 9:54:19 AM, Thomas Petazzoni wrote:

> On Tue, 02 Oct 2018 21:50:04 -0300, Ricardo Martincoski wrote:
[snip]
>> Reviewed-by: Ricardo Martincoski <ricardo.martincoski@gmail.com>
> 
> So this threading works for you? I think it is modeled after what I
> initially did to retrieve the upstream version of the packages, and you
> said it wasn't working well for you.

Yes. It is working for me.

Actually, you used threading + Queue, and I suggested that you switch
to multiprocessing.
https://patchwork.ozlabs.org/patch/890236/

Matt uses multiprocessing in this patch.

Regards,
Ricardo


* [Buildroot] [PATCH v3 1/2] support/scripts/pkg-stats: URL checking support
  2018-10-02  2:37 [Buildroot] [PATCH v3 1/2] support/scripts/pkg-stats: URL checking support Matt Weber
  2018-10-02  2:37 ` [Buildroot] [PATCH v3 2/2] support/scripts/pkg-stats: URL check using threads Matt Weber
  2018-10-03  0:49 ` [Buildroot] [PATCH v3 1/2] support/scripts/pkg-stats: URL checking support Ricardo Martincoski
@ 2018-10-09  8:15 ` Thomas Petazzoni
  2 siblings, 0 replies; 8+ messages in thread
From: Thomas Petazzoni @ 2018-10-09  8:15 UTC (permalink / raw)
  To: buildroot

Hello,

On Mon,  1 Oct 2018 21:37:28 -0500, Matt Weber wrote:
>  - Adds support to check if a package has a URL and if that URL
>    is valid by doing a header request.
>  - Reports this information as part of the generated html output
> 
> The URL data is currently gathered from the URL string provided
> in the Kconfig help sections for each package.
> 
> This check helps ensure the URLs are valid and can be used
> for other scripting purposes as the product's home site/URL.
> CPE XML generation is an example of a case that could use this
> product URL as part of an automated update generation script.
> 
> CC: Ricardo Martincoski <ricardo.martincoski@gmail.com>
> Signed-off-by: Matt Weber <matthew.weber@rockwellcollins.com>
> ---
> Changes
> v2 -> v3
> [Ricardo
>  - Fixed flake8 warnings of extra lines and unused variable
>  - Adjusted regex to be compiled once and used example to
>    simplify finding the url and cleaning up the string
>  - Removed checking for help section and instead used the suggested
>    improved regex to find the URL.

Applied to master, thanks. I have just changed the header of the column
from "URL" to "Upstream URL".

Thomas
-- 
Thomas Petazzoni, CTO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


* [Buildroot] [PATCH v3 2/2] support/scripts/pkg-stats: URL check using threads
  2018-10-02  2:37 ` [Buildroot] [PATCH v3 2/2] support/scripts/pkg-stats: URL check using threads Matt Weber
  2018-10-03  0:50   ` Ricardo Martincoski
@ 2018-10-09  8:15   ` Thomas Petazzoni
  1 sibling, 0 replies; 8+ messages in thread
From: Thomas Petazzoni @ 2018-10-09  8:15 UTC (permalink / raw)
  To: buildroot

Hello,

On Mon,  1 Oct 2018 21:37:29 -0500, Matt Weber wrote:
> Adds a pool of worker threads to accelerate connection testing.
> 
> ~7.5MB and 2% CPU per thread on a Intel i5-3230M CPU @ 2.60GHz.
> 
> Runtime is ~3min in parallel vs ~15min.
> 
> CC: Ricardo Martincoski <ricardo.martincoski@gmail.com>
> Signed-off-by: Matthew Weber <matthew.weber@rockwellcollins.com>
> 
> ---
> Changes
> v2 -> v3
> [Ricardo
>  - Updated the timeout for header request testing to 30sec as there
>    maybe corner cases with a smaller value when running requests in
>    parallel on a busy machine

Applied to master, thanks.

Thomas
-- 
Thomas Petazzoni, CTO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


