* Getting clean diff data from git-mailinfo
From: Konstantin Ryabitsev @ 2020-02-21 17:14 UTC
To: git

Hello:

Git-mailinfo is a handy utility to quickly parse the contents of a
message containing a patch. However, I'm curious why there isn't a way
to get just the diff data, without all the surrounding junk.

E.g.:

    curl https://lore.kernel.org/driverdev-devel/20200221123817.16643-1-ajay.kathat@microchip.com/raw \
        | git mailinfo msg patch > info

The contents of "msg" are already munged to reduce it to exactly what
would be in the commit message (properly processing the extra From:
header), but the contents of "patch" contain all the junk from around
the diff, like the diffstat, git version info, and the list trailer.

Is there a git-native command to further clean up the "patch" file to
get just diff contents (i.e. as returned by "git diff" after this patch
is applied)?

-K

* Re: Getting clean diff data from git-mailinfo
From: Junio C Hamano @ 2020-02-22 16:47 UTC
To: Konstantin Ryabitsev; +Cc: git

Konstantin Ryabitsev <konstantin@linuxfoundation.org> writes:

> Is there a git-native command to further clean up the "patch" file to
> get just diff contents (i.e. as returned by "git diff" after this patch
> is applied)?

There isn't one, as Git did not need one ;-)

The "git am" toolchain is tasked to take a reasonably formatted
e-mailed patch generated by tools other people use. When fed a
piece of e-mail, after it was split out of a mailbox by the "git
mailsplit" program, the "git mailinfo" program is asked to

 (1) gather metainfo for author identity
 (2) gather commit log message material
 (3) collect the input for "git apply"

The e-mail header is parsed for (1) and the first line of (2), and
then the e-mail body is scanned to find the boundary between (2) and
(3). This is done in order to avoid cruft at the end of (2) as much
as possible, because (2) is something a human user has to clean up
while applying, as opposed to (3), which is mechanically processed.

For that, the line between (2) and (3) is drawn:

 (a) at a "---\n" line, for output by "git format-patch";
 (b) at an "Index: " line, which often comes from a CVS repository;
 (c) at a "diff -" line, which can catch handmade patch e-mail using
     GNU and BSD diff.

And that is why the diffstat and the commentary to the maintainer,
which are written after the "---\n" line but before the diff, end up
in (3).

Now, if "git apply" were less smart and required a pure diff without
anything else around it as its input, then we may have had to split
(3) into three pieces:

 (3a) material before the pure diff (e.g. diffstat, etc.)
 (3b) pure diff
 (3c) trailing junk (e.g. base-commit info, e-mail signature, etc.)

But because "git apply" was designed to be usable on the whole of a
plain-text e-mail, roughly as a "GNU diff" replacement, it does not
require (3a) and (3c) to be cleansed out of its input.

So, because there is no such need so far, there is no tool in the
Git toolbox to split (3) into three pieces.

You're welcome to write one, but the current toolset does not need
it.

Thanks.
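
The boundary rules (a) through (c) described in the message above can
be sketched in a few lines of Python. This is only an illustration of
the three markers, not a port of what "git mailinfo" does: the function
name split_log_and_patch is hypothetical, the "---" match is simplified
to an exact line, and keeping the marker line at the top of the patch
part is an assumption of this sketch.

```python
def split_log_and_patch(body: str) -> tuple[str, str]:
    """Split an e-mail body into (log message, patch material).

    Hypothetical helper: cuts at the first line matching one of the
    three markers listed in the message above.
    """
    lines = body.splitlines(keepends=True)
    for i, line in enumerate(lines):
        is_three_dashes = line.rstrip("\r\n") == "---"        # rule (a)
        is_index_or_diff = line.startswith(("Index: ", "diff -"))  # rules (b), (c)
        if is_three_dashes or is_index_or_diff:
            # Assumption: the marker line itself goes to the patch part.
            return "".join(lines[:i]), "".join(lines[i:])
    # No marker found: everything is log message, no patch material.
    return body, ""
```

Everything from the marker onward, diffstat included, lands in the
second element, which matches the observation that the "patch" file
carries more than the diff itself.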

* Re: Getting clean diff data from git-mailinfo
From: Junio C Hamano @ 2020-02-22 16:56 UTC
To: Konstantin Ryabitsev; +Cc: git

Junio C Hamano <gitster@pobox.com> writes:

> ... then we may have had to split (3) into three pieces:
>
>  (3a) material before the pure diff (e.g. diffstat, etc.)
>  (3b) pure diff
>  (3c) trailing junk (e.g. base-commit info, e-mail signature, etc.)
> ...
> So, because there is no such need so far, there is no tool in the
> Git toolbox to split (3) into three pieces.
>
> You're welcome to write one, but the current toolset does not need
> it.

Writing something that reads (3), discarding lines before the first
"diff --git", using the line counts that appear on each "@@ ... @@"
line while copying the hunk to the output, repeating the process
until you see something other than "diff --git" (i.e. the beginning
of the patch for the next path) or "@@ ... @@" (i.e. another hunk in
the patch for the current path), and discarding the rest may be
trivial.

But in practice, people edit their diff [*1*], forgetting to update
the line counts on the "@@ ... @@" lines, and it helps the maintainer
to have the whole (3), not only (3b), in a single file to recover
from such a broken patch submission.

So adding another tool to produce (3b) only is fine, but an attempt
to get rid of (3) and to claim that (3b) replaces the need for (3) is
highly discouraged.

Thanks.

[Footnote]

*1* Even when people edit without changing the line numbers (imagine
    a typofix on a '+' line), I saw that "patch" mode of Emacs broke
    the line count on the "@@ ...@@" line of the last hunk when the
    patch ends with certain patterns.
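
A minimal Python sketch of the (3b) extractor outlined in the message
above might look like the following. It is not an existing Git tool:
extract_pure_diff, HUNK_RE and HEADER_PREFIXES are hypothetical names,
only "diff --git" style patches are handled, binary patches are
ignored, and a blank line inside a hunk is assumed to be a context
line whose leading space was stripped in transit. Because it trusts
the counts on the "@@ ... @@" lines, it fails on exactly the
hand-edited patches the message warns about.

```python
import re

# Captures the optional old/new line counts of a unified-diff hunk header.
HUNK_RE = re.compile(r"^@@ -\d+(?:,(\d+))? \+\d+(?:,(\d+))? @@")

# Header lines that may follow "diff --git" before the first hunk.
HEADER_PREFIXES = (
    "index ", "--- ", "+++ ", "old mode ", "new mode ",
    "new file mode ", "deleted file mode ", "similarity index ",
    "dissimilarity index ", "rename from ", "rename to ",
    "copy from ", "copy to ",
)


def extract_pure_diff(patch_text):
    """Return only the (3b) part of the "patch" file, as a string."""
    lines = patch_text.splitlines(keepends=True)
    out, i, n = [], 0, len(lines)

    # (3a): discard everything before the first "diff --git".
    while i < n and not lines[i].startswith("diff --git "):
        i += 1

    while i < n:
        line = lines[i]
        if line.startswith("diff --git "):
            # Start of the patch for the next path: copy its headers.
            out.append(line)
            i += 1
            while i < n and lines[i].startswith(HEADER_PREFIXES):
                out.append(lines[i])
                i += 1
            continue

        match = HUNK_RE.match(line)
        if match is None:
            break  # (3c): neither a new path nor a new hunk, so stop.

        # An omitted count in a hunk header means 1.
        old_left = int(match.group(1)) if match.group(1) is not None else 1
        new_left = int(match.group(2)) if match.group(2) is not None else 1
        out.append(line)
        i += 1

        # Copy hunk lines until both counts from the header are used up.
        while old_left > 0 or new_left > 0:
            if i >= n:
                raise ValueError("hunk is shorter than its header claims")
            body = lines[i]
            if body.startswith("\\"):
                pass  # "\ No newline at end of file" counts for neither side
            elif body.startswith("-"):
                old_left -= 1
            elif body.startswith("+"):
                new_left -= 1
            elif body.startswith(" ") or body.strip() == "":
                old_left -= 1
                new_left -= 1
            else:
                raise ValueError("unexpected line inside hunk: %r" % body)
            if old_left < 0 or new_left < 0:
                raise ValueError("hunk is longer than its header claims")
            out.append(body)
            i += 1

        # A trailing "\ No newline at end of file" belongs to the hunk too.
        if i < n and lines[i].startswith("\\"):
            out.append(lines[i])
            i += 1

    return "".join(out)
```

Raising on a count mismatch, rather than guessing, is a deliberate
choice in this sketch: it leaves recovery to a human who still has the
whole of (3) at hand.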

* Re: Getting clean diff data from git-mailinfo
From: Andreas Schwab @ 2020-02-22 18:00 UTC
To: Junio C Hamano; +Cc: Konstantin Ryabitsev, git

On Feb 22 2020, Junio C Hamano wrote:

> *1* Even when people edit without changing the line numbers (imagine
>     a typofix on a '+' line), I saw that "patch" mode of Emacs broke
>     the line count on "@@ ...@@" line of the last hunk when the
>     patch ends with certain patterns.

For example, when followed by the "-- " signature of git format-patch,
as that makes the output ambiguous.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."
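
The ambiguity can be shown with a small Python sketch. It assumes a
tool that recomputes hunk counts purely from the first character of
each line; that is one plausible way to go wrong, not a description of
how Emacs actually does it. The function name count_by_prefix and the
sample lines are made up for illustration.

```python
def count_by_prefix(lines_after_header):
    """Recount a hunk by line prefix alone, stopping at the first line
    that does not look like part of a hunk."""
    old = new = 0
    for line in lines_after_header:
        if line.startswith(" "):
            old += 1
            new += 1
        elif line.startswith("-"):
            old += 1
        elif line.startswith("+"):
            new += 1
        else:
            break
    return old, new


# Last hunk of a patch whose correct header is "@@ -1,2 +1,2 @@",
# followed by the "-- " signature separator and a version line.
tail = [" greeting\n", "-helo\n", "+hello\n", "-- \n", "2.25.0\n"]
print(count_by_prefix(tail))  # (3, 2)
```

The "-- " line begins with "-", so a prefix-only recount treats it as
one more removed line and reports an old-side count of 3 instead of 2.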