From: "Paolo Bonzini" <pbonzini@redhat.com>
To: Michael Ellerman <mpe@ellerman.id.au>,
	Konstantin Ryabitsev <konstantin@linuxfoundation.org>,
	users@linux.kernel.org, tools@linux.kernel.org
Subject: Re: [kernel.org users] What is a useful way to show changes between series?
Date: Wed, 29 Apr 2020 12:32:19 +0200
Message-ID: <da03b84c-e7f3-b71d-ddd4-a06b0b787b6e@redhat.com>
In-Reply-To: <87sggnexjq.fsf@mpe.ellerman.id.au>

On 29/04/20 06:16, Michael Ellerman wrote:
> 
> What would be a useful way to quickly do a "show me what changed" 
> between two versions of the same patch series? Especially considering 
> that:
> 
> - there may be a different number of patches in the series
> - new revisions will likely be based on newer trees
> - subjects/commit messages/trailers may be vastly different between two 
>   series, even if the code is very similar (e.g. someone broke down a 
>   single large commit into several smaller ones or vice-versa)
> 
> In my mind, the best way to approach this would be:
> 
> 1. Create a single unified diff file from all patches in each series 
>    (using combinediff)
> 2. Remove all context, leaving only -/+ lines (using interdiff -U0)
> 3. Remove all line numbers from @@ @@ hunks (replacing them with 0)
> 4. Add commit information on top of the diff

You have to do what "git range-diff" does:

- Remove all line numbers from @@ @@ lines

- Remove the commit message too if you want, though that's not strictly
necessary (sometimes a small patch changes a lot while its commit message
stays the same)

- Add dummy empty patches to the shorter series to detect new or deleted patches

- Compute the diff of every pair of patches, one from each series (this can
be optimized by looking in advance for completely identical patches; removing
the commit message may help in finding them)

- Compute a "weight" that measures how different the two patches in each pair
are (for example, the number of lines removed + the number of lines inserted)

- Solve an assignment problem using the Hungarian algorithm

- Figure out the order in which to present patches (e.g. sort them 
according to the new series, placing each deleted patch from the previous 
version after all the patches that came before it).
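
As a rough illustration of the steps up to the weight computation, here is
a minimal Python sketch. It is not the patchew code; the names normalize,
weight and cost_matrix are made up for this example, and it assumes each
patch is available as a plain string containing a unified diff.

```python
from __future__ import annotations

import difflib
import re

# Matches the line-number part of a hunk header such as "@@ -10,2 +12,2 @@".
HUNK_RE = re.compile(r"^@@ -\d+(?:,\d+)? \+\d+(?:,\d+)? @@")


def normalize(patch: str) -> list[str]:
    """Split a patch into lines and blank out the hunk line numbers."""
    return [HUNK_RE.sub("@@ @@", line) for line in patch.splitlines()]


def weight(old: list[str], new: list[str]) -> int:
    """Lines removed plus lines inserted when turning `old` into `new`."""
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    total = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            total += (i2 - i1) + (j2 - j1)
    return total


def cost_matrix(old_series: list[str], new_series: list[str]) -> list[list[int]]:
    """Square matrix of weights; the shorter series is padded with empty patches."""
    size = max(len(old_series), len(new_series))
    old = [normalize(p) for p in old_series]
    new = [normalize(p) for p in new_series]
    old += [[] for _ in range(size - len(old))]  # dummy patches
    new += [[] for _ in range(size - len(new))]  # dummy patches
    return [[weight(o, n) for n in new] for o in old]


if __name__ == "__main__":
    v1 = ["@@ -1,2 +1,2 @@\n-foo\n+bar\n"]
    v2 = ["@@ -5 +5 @@\n-x\n+y\n", "@@ -10,2 +12,2 @@\n-foo\n+bar\n"]
    print(cost_matrix(v1, v2))
```

A row or column that belongs to a dummy patch then stands for "patch added"
or "patch deleted", and a weight of 0 marks a pair that differs only in its
line numbers.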


You can find an implementation in JavaScript of the last five steps at

https://github.com/patchew-project/patchew/blob/master/www/templates/series-diff.html

with the Hungarian algorithm at

https://github.com/patchew-project/patchew/blob/master/static/jsdifflib/munkres.js

(but there are Python implementations available on PyPI too).
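
For reference, the assignment step is small enough to write out. The
following is a sketch of the usual O(n^3) potentials formulation of the
Hungarian algorithm in plain Python, not a copy of munkres.js; the function
name hungarian and the sample matrix are made up for this example, and the
input is assumed to be a square matrix of non-negative weights (old patches
as rows, new patches as columns). If SciPy is available,
scipy.optimize.linear_sum_assignment solves the same problem.

```python
def hungarian(cost):
    """Return a list where entry i is the column assigned to row i,
    minimizing the total cost. `cost` must be a square matrix."""
    n = len(cost)
    inf = float("inf")
    u = [0] * (n + 1)    # row potentials (1-based)
    v = [0] * (n + 1)    # column potentials (1-based)
    p = [0] * (n + 1)    # p[j]: row matched to column j, 0 means none
    way = [0] * (n + 1)  # previous column on the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0 = p[j0]
            delta = inf
            j1 = 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j] = cur
                        way[j] = j0
                    if minv[j] < delta:
                        delta = minv[j]
                        j1 = j
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip the matching along the augmenting path back to column 0.
        while j0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assignment = [0] * n
    for j in range(1, n + 1):
        assignment[p[j] - 1] = j - 1
    return assignment


if __name__ == "__main__":
    # Hypothetical weights: one old patch (plus one dummy row) against
    # two new patches.
    weights = [[4, 0], [3, 3]]
    print(hungarian(weights))
```

With the sample matrix, old patch 0 is paired with new patch 1 (weight 0),
and the dummy row is paired with new patch 0, i.e. that patch is reported
as newly added.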

Thanks,

Paolo