* [BUG] incorrect line numbers reported in git am
From: Denton Liu @ 2019-10-02 18:45 UTC
To: git
Cc: Paul Tan, Nguyễn Thái Ngọc Duy, Jeff King

Hello all,

I found a bug where the line numbers in git am are being reported
incorrectly in the case where a patch fails to apply cleanly.

The test case for this is pretty simple:

	$ wget https://public-inbox.org/git/20191001185524.18772-1-newren@gmail.com/raw
	$ git am raw

And the output for this is:

	Applying: dir: special case check for the possibility that pathspec is NULL
	error: corrupt patch at line 87
	Patch failed at 0001 dir: special case check for the possibility that pathspec is NULL
	hint: Use 'git am --show-current-patch' to see the failed patch
	When you have resolved this problem, run "git am --continue".
	If you prefer to skip this patch, run "git am --skip" instead.
	To restore the original branch and stop patching, run "git am --abort".

In this case, the patch is indeed corrupt. The final hunk header gives
25 lines after instead of 24 lines. As a result, it is erroring out
correctly.

However, the line offsets are off. Line 87, as it reports, is the
following:

	to avoid a segfault.

which is in the middle of the log message. I expect the line to be
reported as something in the range of 198-203, where the end of the
hunk actually is.

Indeed, if you take an 87 line offset from the cutoff "---", we can see
that it gives us line 201, which appears at the end of the corrupt hunk.
So it appears that the bug is a result of the apply process not taking
into account the number of lines from the mail parsing step.

Thanks,

Denton

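The offset arithmetic described in the report can be sketched as a small
shell helper. This is only an illustration and not part of git: the
function name patch_line_to_mail_line is hypothetical, and the sketch
assumes a plain-text mail whose patch part begins at the first line that
consists solely of "---" (it would not work for base64 or
quoted-printable bodies).

```shell
#!/bin/sh
# Hypothetical helper (not part of git): estimate which line of a plain-text
# mail corresponds to a given line of .git/rebase-apply/patch.
# Assumption: the patch file begins at the first line that is exactly "---".
patch_line_to_mail_line () {
	mail=$1
	patch_line=$2
	# Line number of the first "---" separator in the mail.
	sep=$(awk '/^---$/ { print NR; exit }' "$mail")
	# Fail if the mail has no separator line at all.
	test -n "$sep" || return 1
	# Patch line 1 is assumed to be the separator itself, hence the "- 1".
	echo $((sep + patch_line - 1))
}
```

With the mail from the report saved as "raw", running
"patch_line_to_mail_line raw 87" would print an estimate of the mail line
that the error refers to.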
* Re: [BUG] incorrect line numbers reported in git am
From: Junio C Hamano @ 2019-10-02 19:44 UTC
To: Denton Liu
Cc: git, Paul Tan, Nguyễn Thái Ngọc Duy, Jeff King

Denton Liu <liu.denton@gmail.com> writes:

> Applying: dir: special case check for the possibility that pathspec is NULL
> error: corrupt patch at line 87

This refers to line 87 of the input file, not a line that begins
with "@@ -87,count...", doesn't it?

If the sender hand edits a patch without correcting the number of
lines recorded in the hunk header, the parser may not see the next
hunk that begins with "@@" or run out of the input before it reads
the required number of lines given the last hunk header.

We might be able to notice when the input file is shorter than the
last hunk wants it to be, in which case we should be able to say
'premature end of input at line 87' or something like that.

* Re: [BUG] incorrect line numbers reported in git am
From: Denton Liu @ 2019-10-02 20:08 UTC
To: Junio C Hamano
Cc: git, Paul Tan, Nguyễn Thái Ngọc Duy, Jeff King

On Thu, Oct 03, 2019 at 04:44:55AM +0900, Junio C Hamano wrote:
> Denton Liu <liu.denton@gmail.com> writes:
>
> > Applying: dir: special case check for the possibility that pathspec is NULL
> > error: corrupt patch at line 87
>
> This refers to line 87 of the input file, not a line that begins
> with "@@ -87,count...", doesn't it?

Correct, it refers to line 87 of the input file. Since the whole mail
is 202 lines long and the faulty hunk comes at the end of the whole
mail, I'd expect the faulty line number to say something like line 198
or something that's near the end of the mail. Line 87 is somewhere in
the middle of the log message in the mail.

I think the problem comes from line number being expressed as an
offset from the "---" (begin diff) line as opposed to an offset from
the actual beginning of the mail.

> If the sender hand edits a
> patch without correcting the number of lines recorded in the hunk
> header, the parser may not see the next hunk that begins with "@@"
> or run out of the input before it reads the required number of lines
> given the last hunk header.

Correct, but I think that's orthogonal to the main issue. It makes
sense why the error is being reported but what doesn't make sense is
the fact that the line numbers reported are so far off from what a user
would expect.

> We might be able to notice when the input file is shorter than the
> last hunk wants it to be, in which case we should be able to say
> 'premature end of input at line 87' or something like that.

Yep, I noticed this bug while I was writing a patch to do exactly that.

* Re: [BUG] incorrect line numbers reported in git am
From: Junio C Hamano @ 2019-10-02 20:03 UTC
To: Denton Liu
Cc: git, Paul Tan, Nguyễn Thái Ngọc Duy, Jeff King

Denton Liu <liu.denton@gmail.com> writes:

> which is in the middle of the log message. I expect the line to be
> reported as something in the range of 198-203,...

That comes from not knowing who is complaining and what it is
reading. In this case, "git apply" issues a warning because it is
fed .git/rebase-apply/patch file, which is the output of mailinfo
that parses header & log message out, leaves the message in a
separate 'msg' file in the same directory and stores the rest in
that 'patch' file. And it is line 87 that has problems.

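The split described above can be reproduced by hand with "git mailinfo",
which reads one mail on standard input, writes the log message and the
patch to the two files named on its command line, and prints the author
details to standard output. What follows is a minimal sketch, assuming a
single plain-text message; the function name split_mail and the output
directory are hypothetical, and "git am" itself runs "git mailsplit"
first and does more bookkeeping than shown here.

```shell
#!/bin/sh
# Hypothetical helper (not part of git): split one mail into the same kind
# of 'msg', 'patch' and 'info' files that "git am" keeps in
# .git/rebase-apply.
split_mail () {
	mail=$1
	outdir=$2
	mkdir -p "$outdir" &&
	# <msg> receives the log message, <patch> the rest of the body;
	# author, subject and date are printed on standard output.
	git mailinfo "$outdir/msg" "$outdir/patch" <"$mail" >"$outdir/info"
}
```

After "split_mail raw /tmp/split", the line numbers reported by
"git apply" would refer to /tmp/split/patch rather than to "raw".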
* Re: [BUG] incorrect line numbers reported in git am
From: Denton Liu @ 2019-10-02 20:16 UTC
To: Junio C Hamano
Cc: git, Paul Tan, Nguyễn Thái Ngọc Duy, Jeff King

On Thu, Oct 03, 2019 at 05:03:14AM +0900, Junio C Hamano wrote:
> Denton Liu <liu.denton@gmail.com> writes:
>
> > which is in the middle of the log message. I expect the line to be
> > reported as something in the range of 198-203,...
>
> That comes from not knowing who is complaining and what it is
> reading. In this case, "git apply" issues a warning because it is
> fed .git/rebase-apply/patch file, which is the output of mailinfo
> that parses header & log message out, leaves the message in a
> separate 'msg' file in the same directory and stores the rest in
> that 'patch' file. And it is line 87 that has problems.

In this case, I would still regard this as a bug since users would
expect the line 87 to refer to their input file. I think most users
don't even realise that a .git/rebase-apply/patch file exists. (I
certainly didn't.)

In fact, running `git am --show-current-patch` shows the whole mail, not
only the 'patch' file so users would have no reason to expect the line
numbers to refer to the 'patch' file.

I think it would make sense to pass the number of lines skipped by
mailinfo to the apply step so that more accurate line numbers can be
reported to users.

* Re: [BUG] incorrect line numbers reported in git am
From: Junio C Hamano @ 2019-10-03 0:52 UTC
To: Denton Liu
Cc: git, Paul Tan, Nguyễn Thái Ngọc Duy, Jeff King

Denton Liu <liu.denton@gmail.com> writes:

> On Thu, Oct 03, 2019 at 05:03:14AM +0900, Junio C Hamano wrote:
>> Denton Liu <liu.denton@gmail.com> writes:
>>
>> > which is in the middle of the log message. I expect the line to be
>> > reported as something in the range of 198-203,...
>>
>> That comes from not knowing who is complaining and what it is
>> reading. In this case, "git apply" issues a warning because it is
>> fed .git/rebase-apply/patch file, which is the output of mailinfo
>> that parses header & log message out, leaves the message in a
>> separate 'msg' file in the same directory and stores the rest in
>> that 'patch' file. And it is line 87 that has problems.
>
> In this case, I would still regard this as a bug since users would
> expect the line 87 to refer to their input file. I think most users
> don't even realise that a .git/rebase-apply/patch file exists. (I
> certainly didn't.)

In any case, if the error message required me to look anywhere
outside the patch file, it would make it impossible for me to work.
100% of the time, I just pipe the entire message from MUA to "git
am", and I wouldn't know which line it is complaining if it counted
the long run of mail headers like Received:, etc., because I do not
have such an entire message anywhere in a single file (only my MUA
has it, so I'd need to pipe it to "cat >tempfile" again after seeing
a failure).

> In fact, running `git am --show-current-patch` shows the whole mail, not
> only the 'patch' file so users would have no reason to expect the line
> numbers to refer to the 'patch' file.

Yeah, show-current-patch was a misguided attempt to hide useful
information from the users.

* Re: [BUG] incorrect line numbers reported in git am
From: Duy Nguyen @ 2019-10-03 6:17 UTC
To: Junio C Hamano
Cc: Denton Liu, Git Mailing List, Paul Tan, Jeff King

On Thu, Oct 3, 2019 at 7:52 AM Junio C Hamano <gitster@pobox.com> wrote:
> > In fact, running `git am --show-current-patch` shows the whole mail, not
> > only the 'patch' file so users would have no reason to expect the line
> > numbers to refer to the 'patch' file.
>
> Yeah, show-current-patch was a misguided attempt to hide useful
> information from the users.

Not so much hiding as not having the information to present, at least
not the easy way, since the mail is split at the beginning of git-am
and never stored in $GIT_DIR. By the time this command is run, the
mail is already gone. Someone could of course update git-am to keep a
copy of the mail and improve this option.
-- 
Duy

* Re: [BUG] incorrect line numbers reported in git am
From: Junio C Hamano @ 2019-10-03 22:56 UTC
To: Duy Nguyen
Cc: Denton Liu, Git Mailing List, Paul Tan, Jeff King

Duy Nguyen <pclouds@gmail.com> writes:

> On Thu, Oct 3, 2019 at 7:52 AM Junio C Hamano <gitster@pobox.com> wrote:
>> > In fact, running `git am --show-current-patch` shows the whole mail, not
>> > only the 'patch' file so users would have no reason to expect the line
>> > numbers to refer to the 'patch' file.
>>
>> Yeah, show-current-patch was a misguided attempt to hide useful
>> information from the users.
>
> Not so much hiding as not having the information to present, at least
> not the easy way, since the mail is split at the beginning of git-am
> and never stored in $GIT_DIR. By the time this command is run, the
> mail is already gone. Someone could of course update git-am to keep a
> copy of the mail and improve this option.

By "hiding", I meant "rob from the users an opportunity to learn
where the useful patch file is stored".

You seem to be doubly confused in this case, in that (1) you seem to
have mistaken that I was complaining about show-current-patch not
giving the full information contained in the original e-mail, and
(2) you seem to think show-current-patch gives the contents of the
patch without other e-mail cruft.

Both are incorrect.

The first thing the command does is to feed the input to mailsplit
and store the results in numbered files "%04d", and they are not
removed until truly done. When you need to inspect the patch that
does not apply, they are still there. Even emails for those steps
that have been successfully applied before the current one are also
there (the split files are all gone, though, but they no longer
matter as they have been applied fine).

I wouldn't have been so critical if "git am --show-current-patch"
were implemented as "cat $GIT_DIR/rebase-apply/patch", but it does
an equivalent of "cd $GIT_DIR/rebase-apply; cat $(cat next)" which
is much less useful when trying to fix up the patch text that does
not apply.

* [PATCH] apply: tell user location of corrupted patch file
From: Denton Liu @ 2019-10-04 21:59 UTC
To: Git Mailing List
Cc: Duy Nguyen, Jeff King, Junio C Hamano, Paul Tan

When `git am` runs into a corrupt patch, it'll error out and give a
message such as,

	error: corrupt patch at line 87

Casual users of am may assume that this line number refers to the <mbox>
file that they provided on the command-line. This assumption, however,
is incorrect. The line count really refers to the
.git/rebase-apply/patch file generated by am.

Teach am to print the location of corrupted patch files so that users of
the tool will know where to look when fixing their corrupted patch. Thus
the error message will look like this:

	error: corrupt patch at .git/rebase-apply/patch:87

An alternate design was considered which involved printing the line
numbers relative to the output of `git am --show-current-patch` (in
other words, the actual mail file that's provided to am). This design
was not chosen because am does not store the whole mail and instead
splits the mail into several files. As a result, it would break
existing users' workflow: if they piped their mail directly to am
from their mail client, the whole mail would not exist in any file and
they would have to manually recreate the mail to see the line number.

Let's avoid breaking Junio's workflow since he's probably one of the
most frequent users of `git am` in the world. ;)

Signed-off-by: Denton Liu <liu.denton@gmail.com>
---
 apply.c                | 2 +-
 t/t4012-diff-binary.sh | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/apply.c b/apply.c
index 57a61f2881..b0ba2e7b1a 100644
--- a/apply.c
+++ b/apply.c
@@ -1785,7 +1785,7 @@ static int parse_single_patch(struct apply_state *state,
 		len = parse_fragment(state, line, size, patch, fragment);
 		if (len <= 0) {
 			free(fragment);
-			return error(_("corrupt patch at line %d"), state->linenr);
+			return error(_("corrupt patch at %s:%d"), state->patch_input_file, state->linenr);
 		}
 		fragment->patch = line;
 		fragment->size = len;
diff --git a/t/t4012-diff-binary.sh b/t/t4012-diff-binary.sh
index 6579c81216..42cb2dd404 100755
--- a/t/t4012-diff-binary.sh
+++ b/t/t4012-diff-binary.sh
@@ -68,7 +68,7 @@ test_expect_success C_LOCALE_OUTPUT 'apply detecting corrupt patch correctly' '
 	sed -e "s/-CIT/xCIT/" <output >broken &&
 	test_must_fail git apply --stat --summary broken 2>detected &&
 	detected=$(cat detected) &&
-	detected=$(expr "$detected" : "error.*at line \\([0-9]*\\)\$") &&
+	detected=$(expr "$detected" : "error.*at broken:\\([0-9]*\\)\$") &&
 	detected=$(sed -ne "${detected}p" broken) &&
 	test "$detected" = xCIT
 '
@@ -77,7 +77,7 @@ test_expect_success C_LOCALE_OUTPUT 'apply detecting corrupt patch correctly' '
 	git diff --binary | sed -e "s/-CIT/xCIT/" >broken &&
 	test_must_fail git apply --stat --summary broken 2>detected &&
 	detected=$(cat detected) &&
-	detected=$(expr "$detected" : "error.*at line \\([0-9]*\\)\$") &&
+	detected=$(expr "$detected" : "error.*at broken:\\([0-9]*\\)\$") &&
 	detected=$(sed -ne "${detected}p" broken) &&
 	test "$detected" = xCIT
 '
-- 
2.23.0.687.g391267506c

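The "<file>:<line>" form proposed in this patch is easy to act on from a
script, in the same spirit as the expr/sed pair in the updated tests. The
helper below is a sketch only and not part of git: the name
show_corrupt_line is hypothetical, and it assumes the message is exactly
of the form "error: corrupt patch at <file>:<line>" with <file> readable
from the current directory.

```shell
#!/bin/sh
# Hypothetical helper (not part of git): given an error message in the
# proposed "<file>:<line>" format, print the offending line of that file.
show_corrupt_line () {
	msg=$1
	loc=${msg#error: corrupt patch at }
	# Bail out if the message did not start with the expected prefix.
	test "$loc" != "$msg" || return 1
	file=${loc%:*}   # everything before the last colon
	line=${loc##*:}  # everything after the last colon
	# Reject anything that is not a plain decimal line number.
	case $line in
	''|*[!0-9]*) return 1 ;;
	esac
	sed -n "${line}p" "$file"
}
```

Because "git apply" runs at the top level of the working tree inside
"git am", a relative path such as .git/rebase-apply/patch would have to
be resolved from there before calling the helper.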
* Re: [PATCH] apply: tell user location of corrupted patch file
From: Junio C Hamano @ 2019-10-05 8:33 UTC
To: Denton Liu
Cc: Git Mailing List, Duy Nguyen, Jeff King, Paul Tan

Denton Liu <liu.denton@gmail.com> writes:

> When `git am` runs into a corrupt patch, it'll error out and give a
> message such as,
>
> 	error: corrupt patch at line 87
>
> Casual users of am may assume that this line number refers to the <mbox>
> file that they provided on the command-line. This assumption, however,
> is incorrect. The line count really refers to the
> .git/rebase-apply/patch file generated by am.
>
> Teach am to print the location of corrupted patch files so that users of

s/corrupted/corrupt/;

> the tool will know where to look when fixing their corrupted patch. Thus

Likewise.

> the error message will look like this:
>
> 	error: corrupt patch at .git/rebase-apply/patch:87
>
> An alternate design was considered which involved printing the line
> numbers relative to the output of `git am --show-current-patch` (in
> other words, the actual mail file that's provided to am). This design
> was not chosen because am does not store the whole mail and instead,
> splits the mail into several files. As a result of this, this would
> break existing users' workflow if they piped their mail directly to am
> from their mail client, the whole mail would not exist in any file and
> they would have to manually recreate the mail to see the line number.

More importantly, a change to apply.c (hence "git apply", not "git
am") will mean the tool can only talk about its input. If you run
"git apply mbox" instead of "git am mbox", it may work just as well
(assuming you are lucky and the piece of mail does not use any fancy
features like MIME or RFC 1342), and it would report the corrupt
patch relative to the entire mail text.

> Let's avoid breaking Junio's workflow since he's probably one of the
> most frequent users of `git am` in the world. ;)

Don't name me.

> 		if (len <= 0) {
> 			free(fragment);
> -			return error(_("corrupt patch at line %d"), state->linenr);
> +			return error(_("corrupt patch at %s:%d"), state->patch_input_file, state->linenr);
> 		}

Do not forget that you can run "git apply" and feed the patch from
its standard input, e.g.

	$ git apply <patchfile
	$ git show -R | git apply

Make sure state->patch_input_file is a reasonable string before
considering this.

Also, if you have a mbox file

	$ cd sub/direc/tory
	$ git am -s /var/tmp/mbox

The "git apply" process that is run inside "git am" would be running
at the top level of the working tree, so state->patch_input_file may
say ".git/rebase-apply/patch" (i.e. relative pathname) that is not
relative to where the end user is in. I personally do not think it
matters too much, but some people may complain.

Other than that, looks good. I am kind-of surprised that there is
only one place that we report an unusable input with a line number.
Nicely found.

> diff --git a/t/t4012-diff-binary.sh b/t/t4012-diff-binary.sh
> index 6579c81216..42cb2dd404 100755
> --- a/t/t4012-diff-binary.sh
> +++ b/t/t4012-diff-binary.sh
> @@ -68,7 +68,7 @@ test_expect_success C_LOCALE_OUTPUT 'apply detecting corrupt patch correctly' '
>  	sed -e "s/-CIT/xCIT/" <output >broken &&
>  	test_must_fail git apply --stat --summary broken 2>detected &&
>  	detected=$(cat detected) &&
> -	detected=$(expr "$detected" : "error.*at line \\([0-9]*\\)\$") &&
> +	detected=$(expr "$detected" : "error.*at broken:\\([0-9]*\\)\$") &&
>  	detected=$(sed -ne "${detected}p" broken) &&
>  	test "$detected" = xCIT
>  '
> @@ -77,7 +77,7 @@ test_expect_success C_LOCALE_OUTPUT 'apply detecting corrupt patch correctly' '
>  	git diff --binary | sed -e "s/-CIT/xCIT/" >broken &&
>  	test_must_fail git apply --stat --summary broken 2>detected &&
>  	detected=$(cat detected) &&
> -	detected=$(expr "$detected" : "error.*at line \\([0-9]*\\)\$") &&
> +	detected=$(expr "$detected" : "error.*at broken:\\([0-9]*\\)\$") &&
>  	detected=$(sed -ne "${detected}p" broken) &&
>  	test "$detected" = xCIT
>  '

These existing tests can serve as a good test for this new feature,
but I think you'd also need a case where "apply" is fed the patch from
the standard input, and possibly another case where it is run from a
subdirectory of a working tree.

Thanks.

* Re: [PATCH] apply: tell user location of corrupted patch file
From: Junio C Hamano @ 2019-10-05 22:44 UTC
To: Denton Liu
Cc: Git Mailing List, Duy Nguyen, Jeff King, Paul Tan

Junio C Hamano <gitster@pobox.com> writes:

>> An alternate design was considered which involved printing the line
>> numbers relative to the output of `git am --show-current-patch` (in
>> other words, the actual mail file that's provided to am). This design
>> was not chosen because am does not store the whole mail and instead,
>> splits the mail into several files. As a result of this, this would
>> break existing users' workflow if they piped their mail directly to am
>> from their mail client, the whole mail would not exist in any file and
>> they would have to manually recreate the mail to see the line number.
>
> More importantly,...

Addendum. I think the primary reason why the "alternate design"
will not fly is *NOT* that it breaks existing users (which it
would), but giving a line number in the original mbox file is not
always possible. Imagine the message you received was munged by the
sending mailer, or a relaying mailer, and what you received is
encoded in base64 ;-)

* Re: [PATCH] apply: tell user location of corrupted patch file
From: Junio C Hamano @ 2019-10-05 22:51 UTC
To: Denton Liu
Cc: Git Mailing List, Duy Nguyen, Jeff King, Paul Tan

Junio C Hamano <gitster@pobox.com> writes:

>> 		if (len <= 0) {
>> 			free(fragment);
>> -			return error(_("corrupt patch at line %d"), state->linenr);
>> +			return error(_("corrupt patch at %s:%d"), state->patch_input_file, state->linenr);
>> 		}
>
> Do not forget that you can run "git apply" and feed the patch from
> its standard input, e.g.
>
> 	$ git apply <patchfile
> 	$ git show -R | git apply
>
> Make sure state->patch_input_file is a reasonable string before
> considering this.

I think what the patch does is safe in this case; callsites of
apply_patch(), which sets the .patch_input_file field, pass the
string "<stdin>", so you'd say

	error: corrupt patch at <stdin>:43

We lost the word "line" in the message, but it would be picked up
rather quickly by users that colon + integer is a line number, so I
think it is OK.

> Also, if you have a mbox file
>
> 	$ cd sub/direc/tory
> 	$ git am -s /var/tmp/mbox
>
> The "git apply" process that is run inside "git am" would be running
> at the top level of the working tree, so state->patch_input_file may
> say ".git/rebase-apply/patch" (i.e. relative pathname) that is not
> relative to where the end user is in. I personally do not think it
> matters too much, but some people may complain.
>
> Other than that, looks good. I am kind-of surprised that there is
> only one place that we report an unusable input with a line number.
> Nicely found.

I still do not know if we have a relative-path problem, how severe
it would be if there is, or if it is fixable if we wanted to and
how, though.

Thanks.

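On the relative-path question, one workaround on the user's side is to
ask git where the file lives instead of trusting the path printed in the
message. The sketch below is not part of the patch: the function names
am_patch_path and am_patch_line are hypothetical, and it assumes a
reasonably recent git in which "git rev-parse --git-path" prints a path
that is usable from the current directory, including from a
subdirectory of the working tree.

```shell
#!/bin/sh
# Hypothetical helper (not part of git): print a path to the patch file of
# an in-progress "git am" that can be opened from the current directory.
am_patch_path () {
	git rev-parse --git-path rebase-apply/patch
}

# Hypothetical helper: print line $1 of that patch file,
# e.g. "am_patch_line 87" for "corrupt patch at ...:87".
am_patch_line () {
	sed -n "${1}p" "$(am_patch_path)"
}
```

This only helps when the input really was .git/rebase-apply/patch; for
the "<stdin>" case discussed above there is no file to look at.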
Thread overview: 12+ messages

2019-10-02 18:45 [BUG] incorrect line numbers reported in git am Denton Liu
2019-10-02 19:44 ` Junio C Hamano
2019-10-02 20:08   ` Denton Liu
2019-10-02 20:03 ` Junio C Hamano
2019-10-02 20:16   ` Denton Liu
2019-10-03  0:52     ` Junio C Hamano
2019-10-03  6:17       ` Duy Nguyen
2019-10-03 22:56         ` Junio C Hamano
2019-10-04 21:59 ` [PATCH] apply: tell user location of corrupted patch file Denton Liu
2019-10-05  8:33   ` Junio C Hamano
2019-10-05 22:44     ` Junio C Hamano
2019-10-05 22:51     ` Junio C Hamano