linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] checkkconfigsymbols.py:  make it Git aware
@ 2015-03-11 11:16 Valentin Rothberg
  2015-03-11 12:04 ` Paul Bolle
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Valentin Rothberg @ 2015-03-11 11:16 UTC (permalink / raw)
  To: gregkh; +Cc: stefan.hengelein, linux-kernel, pebolle, rupran, Valentin Rothberg

The script now supports to check a specified commit or a specified range
of commits (i.e., commit1..commit2).  Developers and maintainers are
encouraged to use this functionality before sending or merging patches
to avoid potential bugs.

This patch adds the following options to the script:

 -c COMMIT, --commit=COMMIT
                  Check if the specified commit (hash) introduces
                  undefined Kconfig symbols.

 -d DIFF, --diff=DIFF
                  Diff undefined symbols between two commits.  The input
                  format bases on Git log's 'commmit1..commit2'.

Note that both options require to 'git reset --hard' the user's Git
tree, which can lead to the loss of uncommitted data.

Signed-off-by: Valentin Rothberg <valentinrothberg@gmail.com>
---
There are still some performance improvements possible (e.g., only parse Kconfig
files for a 2nd time when needed).

Currently, the two options work on a file base, such that files that already
reference the same added undefined feature will not be reported.  A solution can
be to work on a line base.

I like the simplicity of the code at the current state, but I will think about
both issues and send another patch in the near future.
---
 scripts/checkkconfigsymbols.py | 101 +++++++++++++++++++++++++++++++++++++++--
 1 file changed, 98 insertions(+), 3 deletions(-)

diff --git a/scripts/checkkconfigsymbols.py b/scripts/checkkconfigsymbols.py
index e9cc689033fe..7f1b5ccf75cb 100644
--- a/scripts/checkkconfigsymbols.py
+++ b/scripts/checkkconfigsymbols.py
@@ -1,6 +1,6 @@
 #!/usr/bin/env python
 
-"""Find Kconfig identifiers that are referenced but not defined."""
+"""Find Kconfig symbols that are referenced but not defined."""
 
 # (c) 2014 Valentin Rothberg <valentinrothberg@gmail.com>
 # (c) 2014 Stefan Hengelein <stefan.hengelein@fau.de>
@@ -10,7 +10,9 @@
 
 import os
 import re
+import sys
 from subprocess import Popen, PIPE, STDOUT
+from optparse import OptionParser
 
 
 # regex expressions
@@ -32,8 +34,100 @@ REGEX_KCONFIG_HELP = re.compile(r"^\s+(help|---help---)\s*$")
 REGEX_FILTER_FEATURES = re.compile(r"[A-Za-z0-9]$")
 
 
+def parse_options():
+    """The user interface of this module."""
+    usage = "%prog [options]\n\n"                                              \
+            "Run this tool to detect Kconfig symbols that are referenced but " \
+            "not defined in\nKconfig.  The output of this tool has the "       \
+            "format \'Undefined symbol\\tFile list\'\n\n"                      \
+            "If no option is specified, %prog will default to check your\n"    \
+            "current tree.  Please note that specifying commits will "         \
+            "\'git reset --hard\'\nyour current tree!  You may save "          \
+            "uncommited changes to avoid losing data."
+
+    parser = OptionParser(usage=usage)
+
+    parser.add_option('-c', '--commit', dest='commit', action='store',
+                      default="",
+                      help="Check if the specified commit (hash) introduces "
+                           "undefined Kconfig symbols.")
+
+    parser.add_option('-d', '--diff', dest='diff', action='store',
+                      default="",
+                      help="Diff undefined symbols between two commits.  The "
+                           "input format bases on Git log's "
+                           "\'commmit1..commit2\'.")
+
+    (opts, _) = parser.parse_args()
+
+    if opts.commit and opts.diff:
+        sys.exit("Please specify only one option at once.")
+
+    if opts.diff and not re.match(r"^[\w\-\.]+\.\.[\w\-\.]+$", opts.diff):
+        sys.exit("Please specify valid input in the following format: "
+                 "\'commmit1..commit2\'")
+
+    return opts
+
+
 def main():
     """Main function of this module."""
+    opts = parse_options()
+
+    if opts.commit or opts.diff:
+        # get commit range
+        commit_a = None
+        commit_b = None
+        if opts.commit:
+            commit_a = opts.commit + "~"
+            commit_b = opts.commit
+        elif opts.diff:
+            split = opts.diff.split("..")
+            commit_a = split[0]
+            commit_b = split[1]
+            undefined_a = {}
+            undefined_b = {}
+
+        # get undefined items before the commit
+        reset_commit(commit_a)
+        undefined_a = check_symbols()
+
+        # get undefined items for the commit
+        reset_commit(commit_b)
+        undefined_b = check_symbols()
+
+        # report cases that are present for the commit but not before
+        for feature in undefined_b:
+            # feature has not been undefined before
+            if not feature in undefined_a:
+                files = undefined_b.get(feature)
+                print "%s\t%s" % (feature, ", ".join(files))
+            # check there are new files that reference the undefined feature
+            else:
+                files = undefined_b.get(feature) - undefined_a.get(feature)
+                if files:
+                    print "%s\t%s" % (feature, ", ".join(files))
+
+    # default to check the entire tree
+    else:
+        undefined = check_symbols()
+        for feature in undefined:
+            files = undefined.get(feature)
+            print "%s\t%s" % (feature, ", ".join(files))
+
+
+def reset_commit(commit):
+    """Hard reset the current git tree to %commit."""
+    cmd = "git reset --hard %s" % commit
+    pop = Popen(cmd, stdout=PIPE, stderr=STDOUT, shell=True)
+    (stdout, _) = pop.communicate()  # wait until finished
+    if pop.returncode != 0:
+        sys.exit(stdout)
+
+
+def check_symbols():
+    """Find undefined Kconfig symbols and return a dict with the symbol as key
+    and a list of referencing files as value."""
     source_files = []
     kconfig_files = []
     defined_features = set()
@@ -61,7 +155,7 @@ def main():
     for kfile in kconfig_files:
         parse_kconfig_file(kfile, defined_features, referenced_features)
 
-    print "Undefined symbol used\tFile list"
+    undefined = {}  # {feature: [files]}
     for feature in sorted(referenced_features):
         # filter some false positives
         if feature == "FOO" or feature == "BAR" or \
@@ -73,7 +167,8 @@ def main():
                 if feature[:-len("_MODULE")] in defined_features:
                     continue
             files = referenced_features.get(feature)
-            print "%s\t%s" % (feature, ", ".join(files))
+            undefined[feature] = files
+    return undefined
 
 
 def parse_source_file(sfile, referenced_features):
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] checkkconfigsymbols.py:  make it Git aware
  2015-03-11 11:16 [PATCH] checkkconfigsymbols.py: make it Git aware Valentin Rothberg
@ 2015-03-11 12:04 ` Paul Bolle
  2015-03-11 14:19   ` Valentin Rothberg
  2015-03-16 10:38 ` [PATCH v2] " Valentin Rothberg
  2015-03-16 11:16 ` [PATCH v3] " Valentin Rothberg
  2 siblings, 1 reply; 9+ messages in thread
From: Paul Bolle @ 2015-03-11 12:04 UTC (permalink / raw)
  To: Valentin Rothberg; +Cc: gregkh, stefan.hengelein, linux-kernel, rupran

On Wed, 2015-03-11 at 12:16 +0100, Valentin Rothberg wrote:
> Note that both options require to 'git reset --hard' the user's Git
> tree, which can lead to the loss of uncommitted data.

My local "800 line perl monster" basically does
    git ls-tree -r $commit_or_tag

which allows you to generate a list of files and their corresponding
hashes (it also helps with filtering out symlinks by the way).

The you can do
    git cat-file blob $commit:$path

or just
    git cat-file blob $hash

for every file you're interested in and parse the output of that file
(in memory, as it were). None of that messes with the current state of
the tree you're working on.


Paul Bolle


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] checkkconfigsymbols.py: make it Git aware
  2015-03-11 12:04 ` Paul Bolle
@ 2015-03-11 14:19   ` Valentin Rothberg
  2015-03-11 14:33     ` Valentin Rothberg
  0 siblings, 1 reply; 9+ messages in thread
From: Valentin Rothberg @ 2015-03-11 14:19 UTC (permalink / raw)
  To: Paul Bolle; +Cc: Greg KH, hengelein Stefan, linux-kernel, Andreas Ruprecht

On Wed, Mar 11, 2015 at 1:04 PM, Paul Bolle <pebolle@tiscali.nl> wrote:
> On Wed, 2015-03-11 at 12:16 +0100, Valentin Rothberg wrote:
>> Note that both options require to 'git reset --hard' the user's Git
>> tree, which can lead to the loss of uncommitted data.
>
> My local "800 line perl monster" basically does
>     git ls-tree -r $commit_or_tag
>
> which allows you to generate a list of files and their corresponding
> hashes (it also helps with filtering out symlinks by the way).
>
> The you can do
>     git cat-file blob $commit:$path
>
> or just
>     git cat-file blob $hash
>
> for every file you're interested in and parse the output of that file
> (in memory, as it were). None of that messes with the current state of
> the tree you're working on.

That's a good point.  I used cat-file once in another script but
totally forgot its existence, thanks : )  The patch is already queued,
so I will change the behavior to your suggestion soon.

Kind regards,
 Valentin

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] checkkconfigsymbols.py: make it Git aware
  2015-03-11 14:19   ` Valentin Rothberg
@ 2015-03-11 14:33     ` Valentin Rothberg
  2015-03-11 15:04       ` Valentin Rothberg
  0 siblings, 1 reply; 9+ messages in thread
From: Valentin Rothberg @ 2015-03-11 14:33 UTC (permalink / raw)
  To: Paul Bolle; +Cc: Greg KH, hengelein Stefan, linux-kernel, Andreas Ruprecht

On Wed, Mar 11, 2015 at 3:19 PM, Valentin Rothberg
<valentinrothberg@gmail.com> wrote:
> On Wed, Mar 11, 2015 at 1:04 PM, Paul Bolle <pebolle@tiscali.nl> wrote:
>> On Wed, 2015-03-11 at 12:16 +0100, Valentin Rothberg wrote:
>>> Note that both options require to 'git reset --hard' the user's Git
>>> tree, which can lead to the loss of uncommitted data.
>>
>> My local "800 line perl monster" basically does
>>     git ls-tree -r $commit_or_tag
>>
>> which allows you to generate a list of files and their corresponding
>> hashes (it also helps with filtering out symlinks by the way).
>>
>> The you can do
>>     git cat-file blob $commit:$path
>>
>> or just
>>     git cat-file blob $hash
>>
>> for every file you're interested in and parse the output of that file
>> (in memory, as it were). None of that messes with the current state of
>> the tree you're working on.
>
> That's a good point.  I used cat-file once in another script but
> totally forgot its existence, thanks : )  The patch is already queued,
> so I will change the behavior to your suggestion soon.

Sorry for confusion.  This patch is not queued yet.  The one that
filters toos/ has been queued : )

Valentin

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] checkkconfigsymbols.py: make it Git aware
  2015-03-11 14:33     ` Valentin Rothberg
@ 2015-03-11 15:04       ` Valentin Rothberg
  2015-03-13 20:20         ` Paul Bolle
  0 siblings, 1 reply; 9+ messages in thread
From: Valentin Rothberg @ 2015-03-11 15:04 UTC (permalink / raw)
  To: Paul Bolle; +Cc: Greg KH, hengelein Stefan, linux-kernel, Andreas Ruprecht

It seems that using 'git cat-file blob commit:path' instead of 'git reset
--hard commit' + open is much more expensive.

The execution time jumps from 3 secs to 3 mins.

Paul, how long does your monster run?  Maybe I just call it wrong or
mess up with caches.

Kind regards,
 Valentin

On Wed, Mar 11, 2015 at 3:33 PM, Valentin Rothberg
<valentinrothberg@gmail.com> wrote:
> On Wed, Mar 11, 2015 at 3:19 PM, Valentin Rothberg
> <valentinrothberg@gmail.com> wrote:
>> On Wed, Mar 11, 2015 at 1:04 PM, Paul Bolle <pebolle@tiscali.nl> wrote:
>>> On Wed, 2015-03-11 at 12:16 +0100, Valentin Rothberg wrote:
>>>> Note that both options require to 'git reset --hard' the user's Git
>>>> tree, which can lead to the loss of uncommitted data.
>>>
>>> My local "800 line perl monster" basically does
>>>     git ls-tree -r $commit_or_tag
>>>
>>> which allows you to generate a list of files and their corresponding
>>> hashes (it also helps with filtering out symlinks by the way).
>>>
>>> The you can do
>>>     git cat-file blob $commit:$path
>>>
>>> or just
>>>     git cat-file blob $hash
>>>
>>> for every file you're interested in and parse the output of that file
>>> (in memory, as it were). None of that messes with the current state of
>>> the tree you're working on.
>>
>> That's a good point.  I used cat-file once in another script but
>> totally forgot its existence, thanks : )  The patch is already queued,
>> so I will change the behavior to your suggestion soon.
>
> Sorry for confusion.  This patch is not queued yet.  The one that
> filters toos/ has been queued : )
>
> Valentin

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] checkkconfigsymbols.py: make it Git aware
  2015-03-11 15:04       ` Valentin Rothberg
@ 2015-03-13 20:20         ` Paul Bolle
  2015-03-14 16:28           ` Valentin Rothberg
  0 siblings, 1 reply; 9+ messages in thread
From: Paul Bolle @ 2015-03-13 20:20 UTC (permalink / raw)
  To: Valentin Rothberg
  Cc: Greg KH, hengelein Stefan, linux-kernel, Andreas Ruprecht

On Wed, 2015-03-11 at 16:04 +0100, Valentin Rothberg wrote:
> Paul, how long does your monster run?  Maybe I just call it wrong or
> mess up with caches.

Even longer, I presume. Because an update for just a new linux next
release can take over a minute on my fastest machine (a ThinkPad X220).

(Recall that my monster runs daily over just the blobs added to the tree
for the latest linux-next tag and stores an intermediate parse as a git
note to each of those new blobs. It then does a second parse of all the
git notes relevant for that tag and stores the final result as a git
note to that tag.

And I do all this to make the daily update run at a decent speed. A
downside of this approach is that the very first run, which has to
parse, say, 50.000 new blobs, takes ages.)

My suggestion won't do here. So let me just say that messing with the
state of peoples repository might make you the target of a flame or two.
Are you sure you want to go down that route?


Paul Bolle


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] checkkconfigsymbols.py: make it Git aware
  2015-03-13 20:20         ` Paul Bolle
@ 2015-03-14 16:28           ` Valentin Rothberg
  0 siblings, 0 replies; 9+ messages in thread
From: Valentin Rothberg @ 2015-03-14 16:28 UTC (permalink / raw)
  To: Paul Bolle; +Cc: Greg KH, hengelein Stefan, linux-kernel, Andreas Ruprecht

On Fri, Mar 13, 2015 at 9:20 PM, Paul Bolle <pebolle@tiscali.nl> wrote:
> On Wed, 2015-03-11 at 16:04 +0100, Valentin Rothberg wrote:
>> Paul, how long does your monster run?  Maybe I just call it wrong or
>> mess up with caches.
>
> Even longer, I presume. Because an update for just a new linux next
> release can take over a minute on my fastest machine (a ThinkPad X220).
>
> (Recall that my monster runs daily over just the blobs added to the tree
> for the latest linux-next tag and stores an intermediate parse as a git
> note to each of those new blobs. It then does a second parse of all the
> git notes relevant for that tag and stores the final result as a git
> note to that tag.
>
> And I do all this to make the daily update run at a decent speed. A
> downside of this approach is that the very first run, which has to
> parse, say, 50.000 new blobs, takes ages.)

Thank you for the explanation.  I was really surprised how long
(compared to reset) it takes.

> My suggestion won't do here. So let me just say that messing with the
> state of peoples repository might make you the target of a flame or two.
> Are you sure you want to go down that route?

In case running the script would delete one's day of work or worse ...
no ... I don't want to take responsibility for that.  The patch at the
current state is unacceptable, but I see two options to solve the
issue while being fast:

(1)  Test if the current tree is dirty, warn the user and ask if she
wants to continue or not.
(2)  Abort if the tree is dirty.

Personally, I prefer option 2.  The script would still be fast, and
there is no way that it deletes data by accident.

Kind regards,
 Valentin

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v2] checkkconfigsymbols.py: make it Git aware
  2015-03-11 11:16 [PATCH] checkkconfigsymbols.py: make it Git aware Valentin Rothberg
  2015-03-11 12:04 ` Paul Bolle
@ 2015-03-16 10:38 ` Valentin Rothberg
  2015-03-16 11:16 ` [PATCH v3] " Valentin Rothberg
  2 siblings, 0 replies; 9+ messages in thread
From: Valentin Rothberg @ 2015-03-16 10:38 UTC (permalink / raw)
  To: pebolle, gregkh, stefan.hengelein, linux-kernel, rupran; +Cc: Valentin Rothberg

The script now supports to check a specified commit or a specified range
of commits (i.e., commit1..commit2).  Developers and maintainers are
encouraged to use this functionality before sending or merging patches
to avoid potential bugs and to keep the code, documentation, etc. clean.

This patch adds the following options to the script:

 -c COMMIT, --commit=COMMIT
                  Check if the specified commit (hash) introduces
                  undefined Kconfig symbols.

 -d DIFF, --diff=DIFF
                  Diff undefined symbols between two commits.  The input
                  format bases on Git log's 'commmit1..commit2'.

  --force         Reset current Git tree even when it's dirty.

Note that the first two options require to 'git reset --hard' the user's
Git tree.  This hard reset is necessary to keep the script fast, but it
can lead to the loss of uncommitted data.  Hence, the script aborts in
case it is executed in a dirty tree.  It won't abort if '--force' is
passed.

If neither -c nor -d is specified, the script defaults to check the
entire local tree (i.e., the previous behavior).

Signed-off-by: Valentin Rothberg <valentinrothberg@gmail.com>
---
Changelog for v2:

* To avoid losing data with 'git reset --hard' the script aborts in dirty
trees.

* Added option --force to force hard resets and ignore aborts in dirty trees.

* Reset to initial HEAD in the end.

* Some minor code refactoring.
---
 scripts/checkkconfigsymbols.py | 138 +++++++++++++++++++++++++++++++++++++++--
 1 file changed, 132 insertions(+), 6 deletions(-)

diff --git a/scripts/checkkconfigsymbols.py b/scripts/checkkconfigsymbols.py
index 6445693df669..e6bb2bb4badc 100755
--- a/scripts/checkkconfigsymbols.py
+++ b/scripts/checkkconfigsymbols.py
@@ -1,6 +1,6 @@
 #!/usr/bin/env python
 
-"""Find Kconfig identifiers that are referenced but not defined."""
+"""Find Kconfig symbols that are referenced but not defined."""
 
 # (c) 2014-2015 Valentin Rothberg <Valentin.Rothberg@lip6.fr>
 # (c) 2014 Stefan Hengelein <stefan.hengelein@fau.de>
@@ -10,7 +10,9 @@
 
 import os
 import re
+import sys
 from subprocess import Popen, PIPE, STDOUT
+from optparse import OptionParser
 
 
 # regex expressions
@@ -32,16 +34,140 @@ REGEX_KCONFIG_HELP = re.compile(r"^\s+(help|---help---)\s*$")
 REGEX_FILTER_FEATURES = re.compile(r"[A-Za-z0-9]$")
 
 
+def parse_options():
+    """The user interface of this module."""
+    usage = "%prog [options]\n\n"                                              \
+            "Run this tool to detect Kconfig symbols that are referenced but " \
+            "not defined in\nKconfig.  The output of this tool has the "       \
+            "format \'Undefined symbol\\tFile list\'\n\n"                      \
+            "If no option is specified, %prog will default to check your\n"    \
+            "current tree.  Please note that specifying commits will "         \
+            "\'git reset --hard\'\nyour current tree!  You may save "          \
+            "uncommited changes to avoid losing data."
+
+    parser = OptionParser(usage=usage)
+
+    parser.add_option('-c', '--commit', dest='commit', action='store',
+                      default="",
+                      help="Check if the specified commit (hash) introduces "
+                           "undefined Kconfig symbols.")
+
+    parser.add_option('-d', '--diff', dest='diff', action='store',
+                      default="",
+                      help="Diff undefined symbols between two commits.  The "
+                           "input format bases on Git log's "
+                           "\'commmit1..commit2\'.")
+
+    parser.add_option('', '--force', dest='force', action='store_true',
+                      default=False,
+                      help="Reset current Git tree even when it's dirty.")
+
+    (opts, _) = parser.parse_args()
+
+    if opts.commit and opts.diff:
+        sys.exit("Please specify only one option at once.")
+
+    if opts.diff and not re.match(r"^[\w\-\.]+\.\.[\w\-\.]+$", opts.diff):
+        sys.exit("Please specify valid input in the following format: "
+                 "\'commmit1..commit2\'")
+
+#    if opts.commit or opts.diff:
+#        if not opts.force and tree_is_dirty():
+#            sys.exit("The current Git tree is dirty (see 'git status').  "
+#                     "Running this script may\ndelete important data since it "
+#                     "calls 'git reset --hard' for some performance\nreasons. "
+#                     " Please run this script in a clean Git tree or pass "
+#                     "'--force' if you\nwant to ignore this warning and "
+#                     "continue.")
+#
+    return opts
+
+
 def main():
     """Main function of this module."""
+    opts = parse_options()
+
+    if opts.commit or opts.diff:
+        head = get_head()
+
+        # get commit range
+        commit_a = None
+        commit_b = None
+        if opts.commit:
+            commit_a = opts.commit + "~"
+            commit_b = opts.commit
+        elif opts.diff:
+            split = opts.diff.split("..")
+            commit_a = split[0]
+            commit_b = split[1]
+            undefined_a = {}
+            undefined_b = {}
+
+        # get undefined items before the commit
+        execute("git checkout --detach %s" % commit_a)
+        undefined_a = check_symbols()
+
+        # get undefined items for the commit
+        execute("git checkout --detach %s" % commit_b)
+        undefined_b = check_symbols()
+
+        # report cases that are present for the commit but not before
+        for feature in undefined_b:
+            # feature has not been undefined before
+            if not feature in undefined_a:
+                files = undefined_b.get(feature)
+                print "%s\t%s" % (feature, ", ".join(files))
+            # check if there are new files that reference the undefined feature
+            else:
+                files = undefined_b.get(feature) - undefined_a.get(feature)
+                if files:
+                    print "%s\t%s" % (feature, ", ".join(files))
+
+        # reset to head
+        execute("git checkout --detach %s" % head)
+
+    # default to check the entire tree
+    else:
+        undefined = check_symbols()
+        for feature in undefined:
+            files = undefined.get(feature)
+
+
+def execute(cmd):
+    """Execute %cmd and return stdout.  Exit in case of error."""
+    pop = Popen(cmd, stdout=PIPE, stderr=STDOUT, shell=True)
+    (stdout, _) = pop.communicate()  # wait until finished
+    if pop.returncode != 0:
+        sys.exit(stdout)
+    return stdout
+
+
+def tree_is_dirty():
+    """Return true if the current working tree is dirty (i.e., if any file has
+    been added, deleted, modified, renamed or copied but not committed)."""
+    stdout = execute("git status --porcelain")
+    for line in stdout:
+        if re.findall(r"[URMADC]{1}", line[:2]):
+            return True
+    return False
+
+
+def get_head():
+    """Return commit hash of current HEAD."""
+    stdout = execute("git rev-parse HEAD")
+    return stdout.strip('\n')
+
+
+def check_symbols():
+    """Find undefined Kconfig symbols and return a dict with the symbol as key
+    and a list of referencing files as value."""
     source_files = []
     kconfig_files = []
     defined_features = set()
     referenced_features = dict()  # {feature: [files]}
 
     # use 'git ls-files' to get the worklist
-    pop = Popen("git ls-files", stdout=PIPE, stderr=STDOUT, shell=True)
-    (stdout, _) = pop.communicate()  # wait until finished
+    stdout = execute("git ls-files")
     if len(stdout) > 0 and stdout[-1] == "\n":
         stdout = stdout[:-1]
 
@@ -62,7 +188,7 @@ def main():
     for kfile in kconfig_files:
         parse_kconfig_file(kfile, defined_features, referenced_features)
 
-    print "Undefined symbol used\tFile list"
+    undefined = {}  # {feature: [files]}
     for feature in sorted(referenced_features):
         # filter some false positives
         if feature == "FOO" or feature == "BAR" or \
@@ -73,8 +199,8 @@ def main():
                 # avoid false positives for kernel modules
                 if feature[:-len("_MODULE")] in defined_features:
                     continue
-            files = referenced_features.get(feature)
-            print "%s\t%s" % (feature, ", ".join(files))
+            undefined[feature] = referenced_features.get(feature)
+    return undefined
 
 
 def parse_source_file(sfile, referenced_features):
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v3] checkkconfigsymbols.py: make it Git aware
  2015-03-11 11:16 [PATCH] checkkconfigsymbols.py: make it Git aware Valentin Rothberg
  2015-03-11 12:04 ` Paul Bolle
  2015-03-16 10:38 ` [PATCH v2] " Valentin Rothberg
@ 2015-03-16 11:16 ` Valentin Rothberg
  2 siblings, 0 replies; 9+ messages in thread
From: Valentin Rothberg @ 2015-03-16 11:16 UTC (permalink / raw)
  To: pebolle, gregkh, stefan.hengelein, linux-kernel, rupran; +Cc: Valentin Rothberg

The script now supports to check a specified commit or a specified range
of commits (i.e., commit1..commit2).  Developers and maintainers are
encouraged to use this functionality before sending or merging patches
to avoid potential bugs and to keep the code, documentation, etc. clean.

This patch adds the following options to the script:

 -c COMMIT, --commit=COMMIT
                  Check if the specified commit (hash) introduces
                  undefined Kconfig symbols.

 -d DIFF, --diff=DIFF
                  Diff undefined symbols between two commits.  The input
                  format bases on Git log's 'commmit1..commit2'.

  --force         Reset current Git tree even when it's dirty.

Note that the first two options require to 'git reset --hard' the user's
Git tree.  This hard reset is necessary to keep the script fast, but it
can lead to the loss of uncommitted data.  Hence, the script aborts in
case it is executed in a dirty tree.  It won't abort if '--force' is
passed.

If neither -c nor -d is specified, the script defaults to check the
entire local tree (i.e., the previous behavior).

Signed-off-by: Valentin Rothberg <valentinrothberg@gmail.com>
---
Changelog for v2:
* To avoid losing data with 'git reset --hard' the script aborts in dirty
  trees.
* Added option --force to force hard resets and ignore aborts in dirty trees.
* Reset to initial HEAD in the end.
* Some minor code refactoring.

Changelog for v3:
* I accidentally sent the wrong state for v2.  Now the patch corresponds to the
  changelog for v2.  Thanks for catching, Stefan!
---
 scripts/checkkconfigsymbols.py | 138 +++++++++++++++++++++++++++++++++++++++--
 1 file changed, 132 insertions(+), 6 deletions(-)

diff --git a/scripts/checkkconfigsymbols.py b/scripts/checkkconfigsymbols.py
index 6445693df669..ce9ca60808b8 100755
--- a/scripts/checkkconfigsymbols.py
+++ b/scripts/checkkconfigsymbols.py
@@ -1,6 +1,6 @@
 #!/usr/bin/env python
 
-"""Find Kconfig identifiers that are referenced but not defined."""
+"""Find Kconfig symbols that are referenced but not defined."""
 
 # (c) 2014-2015 Valentin Rothberg <Valentin.Rothberg@lip6.fr>
 # (c) 2014 Stefan Hengelein <stefan.hengelein@fau.de>
@@ -10,7 +10,9 @@
 
 import os
 import re
+import sys
 from subprocess import Popen, PIPE, STDOUT
+from optparse import OptionParser
 
 
 # regex expressions
@@ -32,16 +34,140 @@ REGEX_KCONFIG_HELP = re.compile(r"^\s+(help|---help---)\s*$")
 REGEX_FILTER_FEATURES = re.compile(r"[A-Za-z0-9]$")
 
 
+def parse_options():
+    """The user interface of this module."""
+    usage = "%prog [options]\n\n"                                              \
+            "Run this tool to detect Kconfig symbols that are referenced but " \
+            "not defined in\nKconfig.  The output of this tool has the "       \
+            "format \'Undefined symbol\\tFile list\'\n\n"                      \
+            "If no option is specified, %prog will default to check your\n"    \
+            "current tree.  Please note that specifying commits will "         \
+            "\'git reset --hard\'\nyour current tree!  You may save "          \
+            "uncommitted changes to avoid losing data."
+
+    parser = OptionParser(usage=usage)
+
+    parser.add_option('-c', '--commit', dest='commit', action='store',
+                      default="",
+                      help="Check if the specified commit (hash) introduces "
+                           "undefined Kconfig symbols.")
+
+    parser.add_option('-d', '--diff', dest='diff', action='store',
+                      default="",
+                      help="Diff undefined symbols between two commits.  The "
+                           "input format bases on Git log's "
+                           "\'commmit1..commit2\'.")
+
+    parser.add_option('', '--force', dest='force', action='store_true',
+                      default=False,
+                      help="Reset current Git tree even when it's dirty.")
+
+    (opts, _) = parser.parse_args()
+
+    if opts.commit and opts.diff:
+        sys.exit("Please specify only one option at once.")
+
+    if opts.diff and not re.match(r"^[\w\-\.]+\.\.[\w\-\.]+$", opts.diff):
+        sys.exit("Please specify valid input in the following format: "
+                 "\'commmit1..commit2\'")
+
+    if opts.commit or opts.diff:
+        if not opts.force and tree_is_dirty():
+            sys.exit("The current Git tree is dirty (see 'git status').  "
+                     "Running this script may\ndelete important data since it "
+                     "calls 'git reset --hard' for some performance\nreasons. "
+                     " Please run this script in a clean Git tree or pass "
+                     "'--force' if you\nwant to ignore this warning and "
+                     "continue.")
+
+    return opts
+
+
 def main():
     """Main function of this module."""
+    opts = parse_options()
+
+    if opts.commit or opts.diff:
+        head = get_head()
+
+        # get commit range
+        commit_a = None
+        commit_b = None
+        if opts.commit:
+            commit_a = opts.commit + "~"
+            commit_b = opts.commit
+        elif opts.diff:
+            split = opts.diff.split("..")
+            commit_a = split[0]
+            commit_b = split[1]
+            undefined_a = {}
+            undefined_b = {}
+
+        # get undefined items before the commit
+        execute("git reset --hard %s" % commit_a)
+        undefined_a = check_symbols()
+
+        # get undefined items for the commit
+        execute("git reset --hard %s" % commit_b)
+        undefined_b = check_symbols()
+
+        # report cases that are present for the commit but not before
+        for feature in undefined_b:
+            # feature has not been undefined before
+            if not feature in undefined_a:
+                files = undefined_b.get(feature)
+                print "%s\t%s" % (feature, ", ".join(files))
+            # check if there are new files that reference the undefined feature
+            else:
+                files = undefined_b.get(feature) - undefined_a.get(feature)
+                if files:
+                    print "%s\t%s" % (feature, ", ".join(files))
+
+        # reset to head
+        execute("git reset --hard %s" % head)
+
+    # default to check the entire tree
+    else:
+        undefined = check_symbols()
+        for feature in undefined:
+            files = undefined.get(feature)
+
+
+def execute(cmd):
+    """Execute %cmd and return stdout.  Exit in case of error."""
+    pop = Popen(cmd, stdout=PIPE, stderr=STDOUT, shell=True)
+    (stdout, _) = pop.communicate()  # wait until finished
+    if pop.returncode != 0:
+        sys.exit(stdout)
+    return stdout
+
+
+def tree_is_dirty():
+    """Return true if the current working tree is dirty (i.e., if any file has
+    been added, deleted, modified, renamed or copied but not committed)."""
+    stdout = execute("git status --porcelain")
+    for line in stdout:
+        if re.findall(r"[URMADC]{1}", line[:2]):
+            return True
+    return False
+
+
+def get_head():
+    """Return commit hash of current HEAD."""
+    stdout = execute("git rev-parse HEAD")
+    return stdout.strip('\n')
+
+
+def check_symbols():
+    """Find undefined Kconfig symbols and return a dict with the symbol as key
+    and a list of referencing files as value."""
     source_files = []
     kconfig_files = []
     defined_features = set()
     referenced_features = dict()  # {feature: [files]}
 
     # use 'git ls-files' to get the worklist
-    pop = Popen("git ls-files", stdout=PIPE, stderr=STDOUT, shell=True)
-    (stdout, _) = pop.communicate()  # wait until finished
+    stdout = execute("git ls-files")
     if len(stdout) > 0 and stdout[-1] == "\n":
         stdout = stdout[:-1]
 
@@ -62,7 +188,7 @@ def main():
     for kfile in kconfig_files:
         parse_kconfig_file(kfile, defined_features, referenced_features)
 
-    print "Undefined symbol used\tFile list"
+    undefined = {}  # {feature: [files]}
     for feature in sorted(referenced_features):
         # filter some false positives
         if feature == "FOO" or feature == "BAR" or \
@@ -73,8 +199,8 @@ def main():
                 # avoid false positives for kernel modules
                 if feature[:-len("_MODULE")] in defined_features:
                     continue
-            files = referenced_features.get(feature)
-            print "%s\t%s" % (feature, ", ".join(files))
+            undefined[feature] = referenced_features.get(feature)
+    return undefined
 
 
 def parse_source_file(sfile, referenced_features):
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2015-03-16 11:16 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-03-11 11:16 [PATCH] checkkconfigsymbols.py: make it Git aware Valentin Rothberg
2015-03-11 12:04 ` Paul Bolle
2015-03-11 14:19   ` Valentin Rothberg
2015-03-11 14:33     ` Valentin Rothberg
2015-03-11 15:04       ` Valentin Rothberg
2015-03-13 20:20         ` Paul Bolle
2015-03-14 16:28           ` Valentin Rothberg
2015-03-16 10:38 ` [PATCH v2] " Valentin Rothberg
2015-03-16 11:16 ` [PATCH v3] " Valentin Rothberg

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).