tools.linux.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] patchwork-bot: Ignore empty context lines
@ 2021-12-02 23:38 Kees Cook
  2021-12-02 23:38 ` [PATCH 1/2] " Kees Cook
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Kees Cook @ 2021-12-02 23:38 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: Kees Cook, tools

Hi,

I had several patches that seemed to be eluding the automatic state
changing even though they appeared in mainline. This appears to have
been caused by a mismatch in whitespace interpretation between git
and patchwork.

-Kees

Kees Cook (2):
  patchwork-bot: Ignore empty context lines
  patchwork-bot: Add --pwhash for debugging

 git-patchwork-bot.py | 24 +++++++++++++++++++++++-
 1 file changed, 23 insertions(+), 1 deletion(-)

-- 
2.30.2


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/2] patchwork-bot: Ignore empty context lines
  2021-12-02 23:38 [PATCH 0/2] patchwork-bot: Ignore empty context lines Kees Cook
@ 2021-12-02 23:38 ` Kees Cook
  2021-12-16 22:11   ` Kees Cook
  2021-12-02 23:38 ` [PATCH 2/2] patchwork-bot: Add --pwhash for debugging Kees Cook
  2021-12-13 22:30 ` [PATCH 0/2] patchwork-bot: Ignore empty context lines Konstantin Ryabitsev
  2 siblings, 1 reply; 9+ messages in thread
From: Kees Cook @ 2021-12-02 23:38 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: Kees Cook, tools

It seems that there is some mismatch in diff output (or storage?) between
git and patchwork under conditions I haven't figured out (MUA or MTA
whitespace stripping?): patchwork appears to ignore empty context lines
(i.e. a line that is only a single space). As a result, the patchwork
hash will not match, and commits will not be identified in patchwork.

As an example of a commit in upstream that patchwork-bot cannot locate
in patchwork due to a different hash, due to the differing empty context
lines:

$ MSGID=20210911102818.3804-1-len.baker@gmx.com
$ wget -qO- https://patchwork.kernel.org/project/linux-hardening/patch/$MSGID/raw/ | \
	grep '^ $' | wc -l
0

$ SHA=f11ee2ad25b22c2ee587045dd6999434375532f7
$ git diff ${SHA}~..${SHA} | \
	grep '^ $' | wc -l
2

Teaching patchwork-bot to ignore empty context lines allows pwhashes to
match in these cases, which lets those patches get located again.

Signed-off-by: Kees Cook <keescook@chromium.org>
---
 git-patchwork-bot.py | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/git-patchwork-bot.py b/git-patchwork-bot.py
index 2ebe5b4e88e4..4e10385de629 100755
--- a/git-patchwork-bot.py
+++ b/git-patchwork-bot.py
@@ -528,7 +528,9 @@ def get_patchwork_hash(diff):
             line_nos = list(map(fn, hunk_match.groups()))
             line = '@@ -%d +%d @@' % tuple(line_nos)
         elif line[0] in prefixes:
-            # if we have a +, - or context line, leave as-is
+            # if we have a +, - or context line, leave as-is, except blank lines
+            if line == ' ':
+                continue
             pass
         else:
             # other lines are ignored
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 2/2] patchwork-bot: Add --pwhash for debugging
  2021-12-02 23:38 [PATCH 0/2] patchwork-bot: Ignore empty context lines Kees Cook
  2021-12-02 23:38 ` [PATCH 1/2] " Kees Cook
@ 2021-12-02 23:38 ` Kees Cook
  2021-12-13 22:30 ` [PATCH 0/2] patchwork-bot: Ignore empty context lines Konstantin Ryabitsev
  2 siblings, 0 replies; 9+ messages in thread
From: Kees Cook @ 2021-12-02 23:38 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: Kees Cook, tools

Finding a 2-character whitespace difference between patches with different
patchwork hashes strained my sanity, so I want to keep the tooling I
built to track it down. Using "--pwhash PATCH-ID", patchwork-bot will
compare the given patch in patchwork (and its hash) against the diff
from stdin (and its hash).

Signed-off-by: Kees Cook <keescook@chromium.org>
---
 git-patchwork-bot.py | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/git-patchwork-bot.py b/git-patchwork-bot.py
index 4e10385de629..980928cf7b60 100755
--- a/git-patchwork-bot.py
+++ b/git-patchwork-bot.py
@@ -1363,6 +1363,8 @@ if __name__ == '__main__':
                         help='Domain to use when creating message-ids')
     parser.add_argument('--ancestors', default=None,
                         help='During initial database creation, consider this many ancestor commits as fresh')
+    parser.add_argument('--pwhash', default=None, type=int, metavar='PATCH-ID',
+                        help='Debug pwhash mismatches. Compare patchwork hash of diff from stdin to patch id')
 
     cmdargs = parser.parse_args()
 
@@ -1411,6 +1413,24 @@ if __name__ == '__main__':
     if not os.path.isdir(CACHEDIR):
         os.makedirs(CACHEDIR, exist_ok=True)
 
+    if cmdargs.pwhash:
+        diff = sys.stdin.read()
+        pwhash = get_patchwork_hash(diff)
+        print(pwhash)
+        for pw in CONFIG['patchworks']:
+            print(f"Patchwork: {pw}")
+            for pname, psettings in CONFIG['patchworks'][pw]['projects'].items():
+                print(f"Project: {pname}")
+                project, rm, rpc, pconfig = project_by_name(pname)
+                project_id = project['id']
+                print(get_patchwork_patches_by_project_id_hash(rpc, project_id, pwhash))
+                print('-------')
+                p = rm.get_patch(cmdargs.pwhash)
+                pwdiff = p.get('diff')
+                print(pwdiff)
+                print(get_patchwork_hash(pwdiff))
+        sys.exit(0)
+
     if cmdargs.housekeeping:
         for _pserver, _sconfig in CONFIG['patchworks'].items():
             for _pname in _sconfig['projects']:
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 0/2] patchwork-bot: Ignore empty context lines
  2021-12-02 23:38 [PATCH 0/2] patchwork-bot: Ignore empty context lines Kees Cook
  2021-12-02 23:38 ` [PATCH 1/2] " Kees Cook
  2021-12-02 23:38 ` [PATCH 2/2] patchwork-bot: Add --pwhash for debugging Kees Cook
@ 2021-12-13 22:30 ` Konstantin Ryabitsev
  2021-12-16 20:23   ` Konstantin Ryabitsev
  2 siblings, 1 reply; 9+ messages in thread
From: Konstantin Ryabitsev @ 2021-12-13 22:30 UTC (permalink / raw)
  To: Kees Cook; +Cc: tools

On Thu, 2 Dec 2021 15:38:34 -0800, Kees Cook wrote:
> I had several patches that seemed to be eluding the automatic state
> changing even though they appeared in mainline. This appears to have
> been caused by a mismatch in whitespace interpretation between git
> and patchwork.
> 
> -Kees
> 
> [...]

Applied, thanks!

[1/2] patchwork-bot: Ignore empty context lines
      commit: 8998144862b16cfa7a2ec07757d43ca269d59373
[2/2] patchwork-bot: Add --pwhash for debugging
      commit: ffe2576ac7731b7596e4bdb1c10ab9768f93ee09

Best regards,
-- 
Konstantin Ryabitsev <konstantin@linuxfoundation.org>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 0/2] patchwork-bot: Ignore empty context lines
  2021-12-13 22:30 ` [PATCH 0/2] patchwork-bot: Ignore empty context lines Konstantin Ryabitsev
@ 2021-12-16 20:23   ` Konstantin Ryabitsev
  0 siblings, 0 replies; 9+ messages in thread
From: Konstantin Ryabitsev @ 2021-12-16 20:23 UTC (permalink / raw)
  To: Kees Cook, Jakub Kicinski; +Cc: tools

On Mon, 13 Dec 2021 at 17:30, Konstantin Ryabitsev
<konstantin@linuxfoundation.org> wrote:
> [1/2] patchwork-bot: Ignore empty context lines
>       commit: 8998144862b16cfa7a2ec07757d43ca269d59373

Sorry, I had to revert this, as it was causing patchwork hash
mismatches. We want to be fully identical with upstream patchwork in
this function, since otherwise it results in different hashes
generated for identical patches.

Regards,
-K

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] patchwork-bot: Ignore empty context lines
  2021-12-02 23:38 ` [PATCH 1/2] " Kees Cook
@ 2021-12-16 22:11   ` Kees Cook
  2021-12-17 16:17     ` Konstantin Ryabitsev
  0 siblings, 1 reply; 9+ messages in thread
From: Kees Cook @ 2021-12-16 22:11 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: tools

On Thu, Dec 02, 2021 at 03:38:35PM -0800, Kees Cook wrote:
> It seems that there is some mismatch in diff output (or storage?) between
> git and patchwork under conditions I haven't figured out (MUA or MTA
> whitespace stripping?): patchwork appears to ignore empty context lines
> (i.e. a line that is only a single space). As a result, the patchwork
> hash will not match, and commits will not be identified in patchwork.
> 
> As an example of a commit in upstream that patchwork-bot cannot locate
> in patchwork due to a different hash, due to the differing empty context
> lines:
> 
> $ MSGID=20210911102818.3804-1-len.baker@gmx.com
> $ wget -qO- https://patchwork.kernel.org/project/linux-hardening/patch/$MSGID/raw/ | \
> 	grep '^ $' | wc -l
> 0
> 
> $ SHA=f11ee2ad25b22c2ee587045dd6999434375532f7
> $ git diff ${SHA}~..${SHA} | \
> 	grep '^ $' | wc -l
> 2
> 
> Teaching patchwork-bot to ignore empty context lines allows pwhashes to
> match in these cases, which lets those patches get located again.

With regard to the revert, do you not see the above problem with your
repos? I'm working against the same patchwork. I guess there is some
other root cause?

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] patchwork-bot: Ignore empty context lines
  2021-12-16 22:11   ` Kees Cook
@ 2021-12-17 16:17     ` Konstantin Ryabitsev
  2021-12-17 16:41       ` Kees Cook
  0 siblings, 1 reply; 9+ messages in thread
From: Konstantin Ryabitsev @ 2021-12-17 16:17 UTC (permalink / raw)
  To: Kees Cook; +Cc: tools

On Thu, Dec 16, 2021 at 02:11:08PM -0800, Kees Cook wrote:
> > $ MSGID=20210911102818.3804-1-len.baker@gmx.com
> > $ wget -qO- https://patchwork.kernel.org/project/linux-hardening/patch/$MSGID/raw/ | \
> > 	grep '^ $' | wc -l
> > 0
> > 
> > $ SHA=f11ee2ad25b22c2ee587045dd6999434375532f7
> > $ git diff ${SHA}~..${SHA} | \
> > 	grep '^ $' | wc -l
> > 2
> > 
> > Teaching patchwork-bot to ignore empty context lines allows pwhashes to
> > match in these cases, which lets those patches get located again.
> 
> With regard to the revert, do you not see the above problem with your
> repos? I'm working against the same patchwork. I guess there is some
> other root cause?

I'm curious if you have local git configs that affect your "git diff" output.
If you "export GIT_CONFIG_GLOBAL=/dev/null", does this make any difference to
your experience?

-K

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] patchwork-bot: Ignore empty context lines
  2021-12-17 16:17     ` Konstantin Ryabitsev
@ 2021-12-17 16:41       ` Kees Cook
  2021-12-17 17:21         ` Konstantin Ryabitsev
  0 siblings, 1 reply; 9+ messages in thread
From: Kees Cook @ 2021-12-17 16:41 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: tools

On Fri, Dec 17, 2021 at 11:17:44AM -0500, Konstantin Ryabitsev wrote:
> On Thu, Dec 16, 2021 at 02:11:08PM -0800, Kees Cook wrote:
> > > $ MSGID=20210911102818.3804-1-len.baker@gmx.com
> > > $ wget -qO- https://patchwork.kernel.org/project/linux-hardening/patch/$MSGID/raw/ | \
> > > 	grep '^ $' | wc -l
> > > 0
> > > 
> > > $ SHA=f11ee2ad25b22c2ee587045dd6999434375532f7
> > > $ git diff ${SHA}~..${SHA} | \
> > > 	grep '^ $' | wc -l
> > > 2
> > > 
> > > Teaching patchwork-bot to ignore empty context lines allows pwhashes to
> > > match in these cases, which lets those patches get located again.
> > 
> > With regard to the revert, do you not see the above problem with your
> > repos? I'm working against the same patchwork. I guess there is some
> > other root cause?
> 
> I'm curious if you have local git configs that affect your "git diff" output.
> If you "export GIT_CONFIG_GLOBAL=/dev/null", does this make any difference to
> your experience?

$ export SHA=f11ee2ad25b22c2ee587045dd6999434375532f7
$ GIT_CONFIG_GLOBAL=/dev/null git diff $SHA~..$SHA | grep '^ $' | wc -l
2
$ git --version
git version 2.30.2

Does this report "0" for you?

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] patchwork-bot: Ignore empty context lines
  2021-12-17 16:41       ` Kees Cook
@ 2021-12-17 17:21         ` Konstantin Ryabitsev
  0 siblings, 0 replies; 9+ messages in thread
From: Konstantin Ryabitsev @ 2021-12-17 17:21 UTC (permalink / raw)
  To: Kees Cook; +Cc: tools

On Fri, Dec 17, 2021 at 08:41:41AM -0800, Kees Cook wrote:
> > I'm curious if you have local git configs that affect your "git diff" output.
> > If you "export GIT_CONFIG_GLOBAL=/dev/null", does this make any difference to
> > your experience?
> 
> $ export SHA=f11ee2ad25b22c2ee587045dd6999434375532f7
> $ GIT_CONFIG_GLOBAL=/dev/null git diff $SHA~..$SHA | grep '^ $' | wc -l
> 2
> $ git --version
> git version 2.30.2
> 
> Does this report "0" for you?

No, I get 2 as well. However, what I'm driving at is that the hashing function
has to be identical to what patchwork uses, which is why it's a verbatim lift:

https://github.com/getpatchwork/patchwork/blob/master/patchwork/hasher.py#L18
vs.
https://git.kernel.org/pub/scm/linux/kernel/git/mricon/korg-helpers.git/tree/git-patchwork-bot.py#n479

If it generates a different hash from an email message than it does from the
"git diff" output, then something else is happening.

-K

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2021-12-17 17:21 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-02 23:38 [PATCH 0/2] patchwork-bot: Ignore empty context lines Kees Cook
2021-12-02 23:38 ` [PATCH 1/2] " Kees Cook
2021-12-16 22:11   ` Kees Cook
2021-12-17 16:17     ` Konstantin Ryabitsev
2021-12-17 16:41       ` Kees Cook
2021-12-17 17:21         ` Konstantin Ryabitsev
2021-12-02 23:38 ` [PATCH 2/2] patchwork-bot: Add --pwhash for debugging Kees Cook
2021-12-13 22:30 ` [PATCH 0/2] patchwork-bot: Ignore empty context lines Konstantin Ryabitsev
2021-12-16 20:23   ` Konstantin Ryabitsev

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).