All of lore.kernel.org
 help / color / mirror / Atom feed
From: Thomas De Schampheleire <patrickdepinguin@gmail.com>
To: buildroot@busybox.net
Subject: [Buildroot] [PATCH 2/9] support/scripts/check-package: new script
Date: Tue, 24 Jan 2017 22:14:52 +0100	[thread overview]
Message-ID: <CAAXf6LX2Od0aXd8s65PQkV5omNJiwgvB+dpFt6LC7bCX_6FhNQ@mail.gmail.com> (raw)
In-Reply-To: <20161231032110.11573-3-ricardo.martincoski@gmail.com>

Hi Ricardo,

On Sat, Dec 31, 2016 at 4:21 AM, Ricardo Martincoski
<ricardo.martincoski@gmail.com> wrote:
> Create the infra to check the style of new packages before submitting.
> The overall function of the script is described inside a txt file.
> It is designed to process the actual files and NOT the patch files
> generated by git format-patch.
>
[..]

Interesting!
Please check below where I have some comments on the implementation of
the infrastructure.

> diff --git a/support/scripts/check-package b/support/scripts/check-package
> new file mode 100755
> index 000000000..42af82e2c
> --- /dev/null
> +++ b/support/scripts/check-package
> @@ -0,0 +1,136 @@
> +#!/usr/bin/env python
> +# See support/scripts/check-package.txt before editing this file.
> +
> +from __future__ import print_function
> +import argparse
> +import inspect
> +import re
> +import sys
> +
> +import checkpackagelib_config
> +import checkpackagelib_hash
> +import checkpackagelib_mk
> +import checkpackagelib_patch
> +
> +
> +def parse_args():
> +    parser = argparse.ArgumentParser()
> +
> +    # Do not use argparse.FileType("r") here because only files with known
> +    # format will be open based on the filename.
> +    parser.add_argument("files", metavar="F", type=str, nargs="*",
> +                        help="list of files")
> +
> +    parser.add_argument("--manual-url", action="store",
> +                        default="http://nightly.buildroot.org/",
> +                        help="default: %(default)s")
> +    parser.add_argument("--verbose", "-v", action="count", default=0)
> +
> +    # Now the debug options in the order they are processed.
> +    parser.add_argument("--include-only", dest="include_list", action="append",
> +                        help="run only the specified functions (debug)")
> +    parser.add_argument("--exclude", dest="exclude_list", action="append",
> +                        help="do not run the specified functions (debug)")
> +    parser.add_argument("--dry-run", action="store_true", help="print the "
> +                        "functions that would be called for each file (debug)")
> +
> +    return parser.parse_args()
> +
> +
> +CONFIG_IN_FILENAME = re.compile("/Config\.in(\.host)?$")

Beware, there are packages with different config filenames:

linux/Config.ext.in
linux/Config.tools.in
package/php/Config.ext
package/qt/Config.gfx.in
package/qt/Config.keyboard.in
package/qt/Config.mouse.in
package/qt/Config.sql.in


> +FILE_IS_FROM_A_PACKAGE = re.compile("package/[^/]*/")
> +
> +
> +def get_lib_from_filename(fname):
> +    if FILE_IS_FROM_A_PACKAGE.search(fname) is None:
> +        return None
> +    if CONFIG_IN_FILENAME.search(fname):
> +        return checkpackagelib_config
> +    if fname.endswith(".hash"):
> +        return checkpackagelib_hash
> +    if fname.endswith(".mk"):
> +        return checkpackagelib_mk
> +    if fname.endswith(".patch"):
> +        return checkpackagelib_patch
> +    return None
> +
> +
> +def call_check_function(function, fname, args, **kwargs):

What are these 'args' ? Shouldn't it be '*args' with asterisk?
https://pythontips.com/2013/08/04/args-and-kwargs-in-python-explained/
Same in other functions.

> +    warnings = function(fname, args, **kwargs)
> +
> +    # Avoid the need to use 'return []' at the end of every check function.
> +    if warnings is None:
> +        return 0  # No warning generated.
> +
> +    for level, message in enumerate(warnings):
> +        if args.verbose >= level:
> +            print(message.replace("\t", "< tab  >").rstrip())
> +    return 1  # One more warning to count.
> +
> +
> +def check_file_using_lib(fname, args):
> +    # Count number of warnings generated and lines processed.
> +    nwarnings = 0
> +    nlines = 0
> +
> +    lib = get_lib_from_filename(fname)
> +    if not lib:
> +        if args.verbose >= 3:

perhaps define named constants instead of magic '3' ?

> +            print("{}: ignored".format(fname))
> +        return nwarnings, nlines
> +    callbacks = inspect.getmembers(lib, inspect.isfunction)
> +
> +    # Do not call helper functions.
> +    callbacks = [cb for cb in callbacks if not cb[0].startswith("_")]
> +
> +    if args.include_list:
> +        callbacks = [cb for cb in callbacks if cb[0] in args.include_list]
> +    if args.exclude_list:
> +        callbacks = [cb for cb in callbacks if cb[0] not in args.exclude_list]

It looks cleaner to me to create a custom function with all the checks
(no helper functions, include list, exclude list) and pass that to
inspect.getmembers, rather than doing it in four steps. The custom
function should then return True only if the function should be kept,
and False otherwise. See
https://docs.python.org/2/library/inspect.html#inspect.getmembers

> +
> +    if args.dry_run:
> +        functions_to_run = [cb[0] for cb in callbacks]
> +        print("{}: would run: {}".format(fname, functions_to_run))
> +        return nwarnings, nlines
> +
> +    for cb in callbacks:
> +        nwarnings += call_check_function(cb[1], fname, args, start=True)
> +    for lineno, text in enumerate(open(fname, "r").readlines()):
> +        nlines += 1
> +        for cb in callbacks:
> +            nwarnings += call_check_function(cb[1], fname, args,
> +                                             lineno=lineno + 1, text=text)
> +    for cb in callbacks:
> +        nwarnings += call_check_function(cb[1], fname, args, end=True)

I find the concept with start and end parameter odd and unpythonic.
Why do you not let each checker be a class, rather than a function?
Each class can then define a start, check and end method (or other
names), which you call above.

> +
> +    return nwarnings, nlines
> +
> +
> +def __main__():
> +    args = parse_args()
> +
> +    if len(args.files) == 0:
> +        print("No files to check style")
> +        sys.exit(1)
> +
> +    # Accumulate number of warnings generated and lines processed.
> +    total_warnings = 0
> +    total_lines = 0
> +
> +    for fname in args.files:
> +        nwarnings, nlines = check_file_using_lib(fname, args)
> +        total_warnings += nwarnings
> +        total_lines += nlines
> +
> +    # The warning messages are printed to stdout and can be post-processed
> +    # (e.g. counted by 'wc'), so for stats use stderr. Wait all warnings are
> +    # printed, for the case there are many of them, before printing stats.
> +    sys.stdout.flush()
> +    print("{} lines processed".format(total_lines), file=sys.stderr)
> +    print("{} warnings generated".format(total_warnings), file=sys.stderr)
> +
> +    if total_warnings > 0:
> +        sys.exit(1)
> +
> +
> +__main__()
> diff --git a/support/scripts/check-package.txt b/support/scripts/check-package.txt
> new file mode 100644
> index 000000000..2e71323a2
> --- /dev/null
> +++ b/support/scripts/check-package.txt
> @@ -0,0 +1,76 @@
> +How the scripts are structured:
> +- check-package is the main engine, called by the user.
> +  For each input file, this script decides which parser should be used and it
> +  collects all functions declared in the library file. It opens the input files
> +  and it serves each raw line (including newline!) to every check function in
> +  the library. Two special parameters are used to call the initialization of
> +  each check function (for the case it needs to keep data across calls) and the
> +  equivalent finalization (e.g. for the case a message must be issued if some
> +  pattern is not in the input file).
> +- checkpackagelib.py contains common check functions.
> +  Each check function is explicitly included in a given type-parsing library.
> +  Do not include every single check function in this file, a function that will
> +  only parse hash files should be implemented in the hash-parsing library.
> +  When a warning must be issued, the check function returns an array of strings.
> +  Each string is a warning message and is displayed if the corresponding verbose
> +  level is active. When the script is called without --verbose only the first
> +  warning in the returned array is printed; when called with --verbose both
> +  first and second warnings are printed; when called with -vv until the third
> +  warning is printed; an so on.
> +  Helper functions can be defined by starting the name with "_"; they will not
> +  be called by the main script.
> +- checkpackagelib_type.py contains check functions specific to files of this
> +  type.
> +
> +Some hints when changing this code:
> +- prefer O(n) algorithms, where n is the total number of lines in the files
> +  processed.
> +- when there is no other reason for ordering, use alphabetical order (e.g. keep
> +  the check functions in alphabetical order, keep the imports in alphabetical
> +  order, and so on).
> +- use pyflakes to detect and fix potential problems.
> +- use pep8 formatting.
> +- keep in mind each function will be called with start=True before any line is
> +  served to be checked. A function that checks the filename should only
> +  implement this behavior. A function that needs to keep data across calls (e.g.
> +  keep the last line before the one being processed) should initialize all
> +  "static" variables at this stage.
> +- keep in mind each function will be called with end=True after all lines were
> +  served to be checked. A function that checks the absence of a pattern in the
> +  file will need to use this stage.
> +- try to avoid false warnings. It's better to not issue a warning message to a
> +  corner case than have too many false warnings. The second can make users stop
> +  using the script.
> +- do not check spacing in the input line in every single function. Trailing
> +  whitespace and wrong indentation should be checked by separate functions.
> +- avoid duplicate tests. Try to test only one thing in each function.
> +- in the warning message, include the url to a section from the manual, when
> +  applicable. It potentially will make more people know the manual.
> +- use short sentences in the warning messages. A complete explanation can be
> +  added to show when --verbose is used.
> +- when testing, verify the error message is displayed when the error pattern is
> +  found, but also verify the error message is not displayed for few
> +  well-formatted packages... there are many of these, just pick your favorite
> +  as golden package that should not trigger any warning message.
> +- check the url displayed by the warning message works.
> +
> +Usage examples:
> +- to get a list of check functions that would be called without actually
> +  calling them you can use the --dry-run option:
> +$ support/scripts/check-package --dry-run package/yourfavorite/*
> +
> +- when you just added a new check function, e.g. check_something, check how it
> +  behaves for all current packages:
> +$ support/scripts/check-package --include-only check_something \
> +  $(find package -type f)
> +
> +- the effective processing time (when the .pyc were already generated and all
> +  files to be processed are cached in the RAM) should stay in the order of few
> +  seconds:
> +$ support/scripts/check-package $(find package -type f) >/dev/null ; \
> +  time support/scripts/check-package $(find package -type f) >/dev/null
> +
> +- vim users can navigate the warnings (most editors probably have similar
> +  function) since warnings are generated in the form 'path/file:line: warning':
> +$ find package/ -name 'Config.*' > filelist && vim -c \
> +  'set makeprg=support/scripts/check-package\ $(cat\ filelist)' -c make -c copen
> diff --git a/support/scripts/checkpackagelib.py b/support/scripts/checkpackagelib.py
> new file mode 100644
> index 000000000..987399954
> --- /dev/null
> +++ b/support/scripts/checkpackagelib.py
> @@ -0,0 +1,18 @@
> +# See support/scripts/check-package.txt before editing this file.
> +
> +
> +def check_newline_at_eof(
> +        fname, args, lineno=0, text=None, start=False, end=False):
> +    if start:
> +        check_newline_at_eof.lastlineno = 0
> +        check_newline_at_eof.lastline = "\n"

With a class, you could just assign to self here, rather than using
function statics.

> +        return
> +    if end:
> +        line = check_newline_at_eof.lastline
> +        if line == line.rstrip("\r\n"):
> +            return ["{}:{}: missing newline at end of file"
> +                    .format(fname, check_newline_at_eof.lastlineno),
> +                    line]
> +        return
> +    check_newline_at_eof.lastlineno = lineno
> +    check_newline_at_eof.lastline = text
> diff --git a/support/scripts/checkpackagelib_config.py b/support/scripts/checkpackagelib_config.py
> new file mode 100644
> index 000000000..2660cccd7
> --- /dev/null
> +++ b/support/scripts/checkpackagelib_config.py
> @@ -0,0 +1,7 @@
> +# See support/scripts/check-package.txt before editing this file.
> +# Kconfig generates errors if someone introduces a typo like "boool" instead of
> +# "bool", so below check functions don't need to check for things already
> +# checked by running "make menuconfig".
> +
> +# Notice: ignore 'imported but unused' from pyflakes for check functions.
> +from checkpackagelib import check_newline_at_eof
> diff --git a/support/scripts/checkpackagelib_hash.py b/support/scripts/checkpackagelib_hash.py
> new file mode 100644
> index 000000000..590d5a7c4
> --- /dev/null
> +++ b/support/scripts/checkpackagelib_hash.py
> @@ -0,0 +1,7 @@
> +# See support/scripts/check-package.txt before editing this file.
> +# The validity of the hashes itself is checked when building, so below check
> +# functions don't need to check for things already checked by running
> +# "make package-dirclean package-source".
> +
> +# Notice: ignore 'imported but unused' from pyflakes for check functions.
> +from checkpackagelib import check_newline_at_eof
> diff --git a/support/scripts/checkpackagelib_mk.py b/support/scripts/checkpackagelib_mk.py
> new file mode 100644
> index 000000000..ad7362335
> --- /dev/null
> +++ b/support/scripts/checkpackagelib_mk.py
> @@ -0,0 +1,8 @@
> +# See support/scripts/check-package.txt before editing this file.
> +# There are already dependency checks during the build, so below check
> +# functions don't need to check for things already checked by exploring the
> +# menu options using "make menuconfig" and by running "make" with appropriate
> +# packages enabled.
> +
> +# Notice: ignore 'imported but unused' from pyflakes for check functions.
> +from checkpackagelib import check_newline_at_eof
> diff --git a/support/scripts/checkpackagelib_patch.py b/support/scripts/checkpackagelib_patch.py
> new file mode 100644
> index 000000000..0fb3685a7
> --- /dev/null
> +++ b/support/scripts/checkpackagelib_patch.py
> @@ -0,0 +1,7 @@
> +# See support/scripts/check-package.txt before editing this file.
> +# The format of the patch files is tested during the build, so below check
> +# functions don't need to check for things already checked by running
> +# "make package-dirclean package-patch".
> +
> +# Notice: ignore 'imported but unused' from pyflakes for check functions.
> +from checkpackagelib import check_newline_at_eof
> --
> 2.11.0

Best regards,
Thomas

  reply	other threads:[~2017-01-24 21:14 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-12-31  3:21 [Buildroot] [PATCH 0/9] A checkpackage script that verifies a package coding style Ricardo Martincoski
2016-12-31  3:21 ` [Buildroot] [PATCH 1/9] support/scripts/check-package: example Ricardo Martincoski
2016-12-31  3:21 ` [Buildroot] [PATCH 2/9] support/scripts/check-package: new script Ricardo Martincoski
2017-01-24 21:14   ` Thomas De Schampheleire [this message]
2017-02-06 18:53     ` Thomas De Schampheleire
2017-02-07  0:17       ` Ricardo Martincoski
2017-02-19 23:13     ` Ricardo Martincoski
2016-12-31  3:21 ` [Buildroot] [PATCH 3/9] check-package: check whitespace and empty lines Ricardo Martincoski
2016-12-31  3:21 ` [Buildroot] [PATCH 4/9] check-package: check *.hash files Ricardo Martincoski
2017-01-24 21:18   ` Thomas De Schampheleire
2017-02-19 23:16     ` Ricardo Martincoski
2016-12-31  3:21 ` [Buildroot] [PATCH 5/9] check-package: check *.patch files Ricardo Martincoski
2017-01-24 21:21   ` Thomas De Schampheleire
2017-02-07  9:58     ` Thomas Petazzoni
2017-02-19 23:41       ` Ricardo Martincoski
2016-12-31  3:21 ` [Buildroot] [PATCH 6/9] check-package: check *.mk files Ricardo Martincoski
2016-12-31  3:21 ` [Buildroot] [PATCH 7/9] docs/manual: size of tab in package description Ricardo Martincoski
2017-01-21 16:58   ` Romain Naour
2017-02-07  1:10     ` Ricardo Martincoski
2016-12-31  3:21 ` [Buildroot] [PATCH 8/9] check-package: check Config.in.* files Ricardo Martincoski
2016-12-31  3:21 ` [Buildroot] [PATCH 9/9] check-package: check *.mk for typo in variable Ricardo Martincoski
2017-01-21 17:19   ` Romain Naour
2017-02-07  0:33     ` Ricardo Martincoski
2017-02-07  9:03   ` Peter Korsgaard
2017-01-21 17:56 ` [Buildroot] [PATCH 0/9] A checkpackage script that verifies a package coding style Romain Naour
2017-02-07  0:52   ` Ricardo Martincoski
2017-02-19 22:17 ` [Buildroot] [PATCH v2 " Ricardo Martincoski
2017-02-19 22:17   ` [Buildroot] [PATCH v2 1/9] support/scripts/check-package: example Ricardo Martincoski
2017-02-19 22:17   ` [Buildroot] [PATCH v2 2/9] support/scripts/check-package: new script Ricardo Martincoski
2017-02-19 22:17   ` [Buildroot] [PATCH v2 3/9] check-package: check whitespace and empty lines Ricardo Martincoski
2017-02-19 22:17   ` [Buildroot] [PATCH v2 4/9] check-package: check *.hash files Ricardo Martincoski
2017-02-19 22:17   ` [Buildroot] [PATCH v2 5/9] check-package: check *.patch files Ricardo Martincoski
2017-02-19 22:17   ` [Buildroot] [PATCH v2 6/9] check-package: check *.mk files Ricardo Martincoski
2017-02-19 22:17   ` [Buildroot] [PATCH v2 7/9] docs/manual: size of tab in package description Ricardo Martincoski
2017-02-19 22:17   ` [Buildroot] [PATCH v2 8/9] check-package: check Config.* files Ricardo Martincoski
2017-02-19 22:17   ` [Buildroot] [PATCH v2 9/9] check-package: check *.mk for typo in variable Ricardo Martincoski
2017-04-08 14:21   ` [Buildroot] [PATCH v2 0/9] A checkpackage script that verifies a package coding style Thomas Petazzoni
2017-04-11 23:03     ` Ricardo Martincoski
2017-04-12  7:49       ` Thomas Petazzoni
2017-04-13  3:03         ` Ricardo Martincoski
2017-04-13  7:20           ` Thomas Petazzoni

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAAXf6LX2Od0aXd8s65PQkV5omNJiwgvB+dpFt6LC7bCX_6FhNQ@mail.gmail.com \
    --to=patrickdepinguin@gmail.com \
    --cc=buildroot@busybox.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.