* Bug in "git am" when the body starts with spaces @ 2017-04-01 0:24 Linus Torvalds 2017-04-01 0:52 ` Linus Torvalds 0 siblings, 1 reply; 8+ messages in thread From: Linus Torvalds @ 2017-04-01 0:24 UTC (permalink / raw) To: Junio C Hamano; +Cc: Git Mailing List [-- Attachment #1: Type: text/plain, Size: 743 bytes --] Try applying the attached patch with git am 0001-Test-patch.patch in the git repository. At least for me, it results in a very odd commit that has one single line in the commit message: Test patch This should go in the body not in the subject line which is obviously bogus. I think the reason is that the "header continuation line" logic kicks in because the lines in the body start with spaces, but that's entirely incorrect, since (a) we're not in an email header (b) there's an empty line in between anyway, so no way are those body lines continuation lines. I didn't check how far back this goes, I guess I'll do that next. But I thought I'd report it here first in case somebody else goes "ahhh". Linus [-- Attachment #2: 0001-Test-patch.patch --] [-- Type: text/x-patch, Size: 532 bytes --] From ad65cf7ba97ac071da1f845ec854165e7bf1efdf Mon Sep 17 00:00:00 2001 From: Linus Torvalds <torvalds@linux-foundation.org> Date: Fri, 31 Mar 2017 17:18:16 -0700 Subject: [PATCH] Test patch example Subject: [PATCH] Test patch This should go in the body not in the subject line --- Makefile | 1 + 1 file changed, 1 insertion(+) diff --git a/Makefile b/Makefile index 9b36068ac..9f36c149b 100644 --- a/Makefile +++ b/Makefile @@ -1,3 +1,4 @@ + # The default target of this Makefile is... all:: -- 2.12.2.401.g5d4234a49 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: Bug in "git am" when the body starts with spaces 2017-04-01 0:24 Bug in "git am" when the body starts with spaces Linus Torvalds @ 2017-04-01 0:52 ` Linus Torvalds 2017-04-01 5:27 ` Jeff King 2017-04-01 19:03 ` Linus Torvalds 0 siblings, 2 replies; 8+ messages in thread From: Linus Torvalds @ 2017-04-01 0:52 UTC (permalink / raw) To: Junio C Hamano, Jonathan Tan; +Cc: Git Mailing List Ok, did a bisect, and this bisects to commit 6b4b013f1884 ("mailinfo: handle in-body header continuations"). The continuation logic is oddly complex, and I can't follow the logic. But it is completely broken in how it thinks empty lines are somehow "continuations". Jonathan? Linus On Fri, Mar 31, 2017 at 5:24 PM, Linus Torvalds <torvalds@linux-foundation.org> wrote: > > I think the reason is that the "header continuation line" logic kicks > in because the lines in the body start with spaces, but that's > entirely incorrect, since > > (a) we're not in an email header > > (b) there's an empty line in between anyway, so no way are those body > lines continuation lines. > > I didn't check how far back this goes, I guess I'll do that next. But > I thought I'd report it here first in case somebody else goes "ahhh". ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Bug in "git am" when the body starts with spaces 2017-04-01 0:52 ` Linus Torvalds @ 2017-04-01 5:27 ` Jeff King 2017-04-01 19:03 ` Linus Torvalds 1 sibling, 0 replies; 8+ messages in thread From: Jeff King @ 2017-04-01 5:27 UTC (permalink / raw) To: Linus Torvalds; +Cc: Junio C Hamano, Jonathan Tan, Git Mailing List On Fri, Mar 31, 2017 at 05:52:00PM -0700, Linus Torvalds wrote: > Ok, did a bisect, and this bisects to commit 6b4b013f1884 ("mailinfo: > handle in-body header continuations"). > > The continuation logic is oddly complex, and I can't follow the logic. > But it is completely broken in how it thinks empty lines are somehow > "continuations". I think the continuation logic is OK. The problem is that the newline is never fed to check_inbody_header() at all. So the next line its state machine sees is the indented line, which looks like a continuation. It seems to me that ignoring that newline is a bug, and causes other problems. For instance, if you have a blank line and then another header-looking thing (not a continuation) after, it is respected. For instance, running mailinfo on: From ...mbox header... From: Email Author <author@example.com> Subject: email subject Date: whatever From: in-body author <author@example.com> Subject: in-body subject And the rest of the message. will use "in-body subject" as the subject, though I'd have thought we should stop parsing in-body headers as soon as we see the first in-body blank line (the one after the in-body "From"). The blank line gets eaten by the check at the beginning of handle_commit_msg(), right _before_ we pass it off to the check_inbody_header() function. It seems like that should be more like: diff --git a/mailinfo.c b/mailinfo.c index a489d9d0f..6d89781eb 100644 --- a/mailinfo.c +++ b/mailinfo.c @@ -757,8 +757,10 @@ static int handle_commit_msg(struct mailinfo *mi, struct strbuf *line) assert(!mi->filter_stage); if (mi->header_stage) { - if (!line->len || (line->len == 1 && line->buf[0] == '\n')) + if (!line->len || (line->len == 1 && line->buf[0] == '\n')) { + mi->header_stage = 0; return 0; + } } if (mi->use_inbody_headers && mi->header_stage) { But that breaks a bunch of tests. It looks like the loop in handle_body always starts by feeding the initial header-separator (the real headers, not the in-body ones) to the various parsers. So that first newline makes us trigger "no more in-body headers" before we even start parsing any lines. Oops. So doing this: @@ -960,7 +962,7 @@ static void handle_body(struct mailinfo *mi, struct strbuf *line) goto handle_body_out; } - do { + while (!strbuf_getwholeline(line, mi->input, '\n')) { /* process any boundary lines */ if (*(mi->content_top) && is_multipart_boundary(mi, line)) { /* flush any leftover */ @@ -1013,7 +1015,7 @@ static void handle_body(struct mailinfo *mi, struct strbuf *line) if (mi->input_error) break; - } while (!strbuf_getwholeline(line, mi->input, '\n')); + } flush_inbody_header_accum(mi); on top fixes that. But that still breaks a test that has blank lines at the beginning of the message, before the in-body header. So probably the state-machine needs an extra state: sucking up pre-inbody-header newlines. -Peff ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: Bug in "git am" when the body starts with spaces 2017-04-01 0:52 ` Linus Torvalds 2017-04-01 5:27 ` Jeff King @ 2017-04-01 19:03 ` Linus Torvalds 2017-04-02 4:18 ` Jeff King 2017-04-02 17:35 ` Junio C Hamano 1 sibling, 2 replies; 8+ messages in thread From: Linus Torvalds @ 2017-04-01 19:03 UTC (permalink / raw) To: Junio C Hamano, Jonathan Tan, Jeff King; +Cc: Git Mailing List [-- Attachment #1: Type: text/plain, Size: 743 bytes --] On Fri, Mar 31, 2017 at 5:52 PM, Linus Torvalds <torvalds@linux-foundation.org> wrote: > > The continuation logic is oddly complex, and I can't follow the logic. > But it is completely broken in how it thinks empty lines are somehow > "continuations". The attached patch seems to work for me. Comments? The logic is fairly simple: if we encounter an empty line, and we have pending in-body headers, we flush the pending headers, and mark us as no longer in header mode. The code used to just ignore empty lines when in header mode, which is garbage, exactly because it would happily continue accumulating more headers. There may be other cases this misses, but this at least seems to fix the case I encountered. Linus [-- Attachment #2: patch.diff --] [-- Type: text/plain, Size: 640 bytes --] mailinfo.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/mailinfo.c b/mailinfo.c index a489d9d0f..68037758f 100644 --- a/mailinfo.c +++ b/mailinfo.c @@ -757,8 +757,13 @@ static int handle_commit_msg(struct mailinfo *mi, struct strbuf *line) assert(!mi->filter_stage); if (mi->header_stage) { - if (!line->len || (line->len == 1 && line->buf[0] == '\n')) + if (!line->len || (line->len == 1 && line->buf[0] == '\n')) { + if (mi->inbody_header_accum.len) { + flush_inbody_header_accum(mi); + mi->header_stage = 0; + } return 0; + } } if (mi->use_inbody_headers && mi->header_stage) { ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: Bug in "git am" when the body starts with spaces 2017-04-01 19:03 ` Linus Torvalds @ 2017-04-02 4:18 ` Jeff King 2017-04-03 17:42 ` Jonathan Tan 2017-04-02 17:35 ` Junio C Hamano 1 sibling, 1 reply; 8+ messages in thread From: Jeff King @ 2017-04-02 4:18 UTC (permalink / raw) To: Linus Torvalds; +Cc: Junio C Hamano, Jonathan Tan, Git Mailing List On Sat, Apr 01, 2017 at 12:03:44PM -0700, Linus Torvalds wrote: > On Fri, Mar 31, 2017 at 5:52 PM, Linus Torvalds > <torvalds@linux-foundation.org> wrote: > > > > The continuation logic is oddly complex, and I can't follow the logic. > > But it is completely broken in how it thinks empty lines are somehow > > "continuations". > > The attached patch seems to work for me. Comments? > > The logic is fairly simple: if we encounter an empty line, and we have > pending in-body headers, we flush the pending headers, and mark us as > no longer in header mode. Hmm. I think this may work. At first I thought it was too strict in always checking inbody_header_accum.len, because we want this to kick in always, whether there's whitespace continuation or not. But that accumulator has to collect preemptively, before it knows if there's continuation. So it will always be non-empty if we've seen _any_ header, and it will remain non-empty as long as we keep parsing (because any time we flush, we do so in order to handle another line). IOW, I think this implements the state-machine thing I wrote in my earlier email, because the state "are we inside in-body header parsing" is always reflected by having a non-empty accumulator. It is a bit non-obvious though. -Peff ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Bug in "git am" when the body starts with spaces 2017-04-02 4:18 ` Jeff King @ 2017-04-03 17:42 ` Jonathan Tan 2017-04-04 6:50 ` Jeff King 0 siblings, 1 reply; 8+ messages in thread From: Jonathan Tan @ 2017-04-03 17:42 UTC (permalink / raw) To: Jeff King, Linus Torvalds; +Cc: Junio C Hamano, Git Mailing List Thanks, everyone, for looking into this. On 04/01/2017 09:18 PM, Jeff King wrote: > On Sat, Apr 01, 2017 at 12:03:44PM -0700, Linus Torvalds wrote: >> The logic is fairly simple: if we encounter an empty line, and we have >> pending in-body headers, we flush the pending headers, and mark us as >> no longer in header mode. > > Hmm. I think this may work. At first I thought it was too strict in > always checking inbody_header_accum.len, because we want this to kick in > always, whether there's whitespace continuation or not. But that > accumulator has to collect preemptively, before it knows if there's > continuation. So it will always be non-empty if we've seen _any_ header, > and it will remain non-empty as long as we keep parsing (because any > time we flush, we do so in order to handle another line). > > IOW, I think this implements the state-machine thing I wrote in my > earlier email, because the state "are we inside in-body header parsing" > is always reflected by having a non-empty accumulator. It is a bit > non-obvious though. About obviousness, I think of a non-empty accumulator merely representing that the next line could potentially be a continuation line. And it is coincidence that this implies "are we inside in-body header parsing"; if not all in-body header lines could be "continued", there would be no such implication. mi->inbody_header_accum.len is already used in check_inbody_header to mean "could the next line potentially be a continuation line" and to trigger a check for a negative criterion (in this case, a scissors line). I think it's fine to do the same thing, the negative criterion here being a blank line. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Bug in "git am" when the body starts with spaces 2017-04-03 17:42 ` Jonathan Tan @ 2017-04-04 6:50 ` Jeff King 0 siblings, 0 replies; 8+ messages in thread From: Jeff King @ 2017-04-04 6:50 UTC (permalink / raw) To: Jonathan Tan; +Cc: Linus Torvalds, Junio C Hamano, Git Mailing List On Mon, Apr 03, 2017 at 10:42:46AM -0700, Jonathan Tan wrote: > On 04/01/2017 09:18 PM, Jeff King wrote: > > On Sat, Apr 01, 2017 at 12:03:44PM -0700, Linus Torvalds wrote: > > > The logic is fairly simple: if we encounter an empty line, and we have > > > pending in-body headers, we flush the pending headers, and mark us as > > > no longer in header mode. > > > > Hmm. I think this may work. At first I thought it was too strict in > > always checking inbody_header_accum.len, because we want this to kick in > > always, whether there's whitespace continuation or not. But that > > accumulator has to collect preemptively, before it knows if there's > > continuation. So it will always be non-empty if we've seen _any_ header, > > and it will remain non-empty as long as we keep parsing (because any > > time we flush, we do so in order to handle another line). > > > > IOW, I think this implements the state-machine thing I wrote in my > > earlier email, because the state "are we inside in-body header parsing" > > is always reflected by having a non-empty accumulator. It is a bit > > non-obvious though. > > About obviousness, I think of a non-empty accumulator merely representing > that the next line could potentially be a continuation line. And it is > coincidence that this implies "are we inside in-body header parsing"; if not > all in-body header lines could be "continued", there would be no such > implication. > > mi->inbody_header_accum.len is already used in check_inbody_header to mean > "could the next line potentially be a continuation line" and to trigger a > check for a negative criterion (in this case, a scissors line). I think it's > fine to do the same thing, the negative criterion here being a blank line. FWIW, I looked into making a single "state" variable, but it actually got ugly pretty quickly. I think not because it's not a cleaner method in the long run, but just because the existing code is so dependent on individual state variables that needed changing. So I'm OK with Linus's fix; it certainly follows the existing code patterns. -Peff ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Bug in "git am" when the body starts with spaces 2017-04-01 19:03 ` Linus Torvalds 2017-04-02 4:18 ` Jeff King @ 2017-04-02 17:35 ` Junio C Hamano 1 sibling, 0 replies; 8+ messages in thread From: Junio C Hamano @ 2017-04-02 17:35 UTC (permalink / raw) To: Linus Torvalds; +Cc: Jonathan Tan, Jeff King, Git Mailing List Linus Torvalds <torvalds@linux-foundation.org> writes: > On Fri, Mar 31, 2017 at 5:52 PM, Linus Torvalds > <torvalds@linux-foundation.org> wrote: >> >> The continuation logic is oddly complex, and I can't follow the logic. >> But it is completely broken in how it thinks empty lines are somehow >> "continuations". > > The attached patch seems to work for me. Comments? We start at header_stage set to 1, keep skipping empty lines while in that state, and we stay in that state as long as we see in-body header (or a continuation of the in-body header we saw earlier). We get out of this state when we see a blank line after we are done with the in-body headers. Once header_stage is set to 0 with a blank line, we don't do in-body headers (scissors will roll back the whole thing and irrelevant to the analysis of correctness). But you found that "keep skipping empty" done unconditionally is wrong, because we may have already seen an in-body header and are trying to see if the line is a continuation (in which case check_inbody_header() would append to the previous) or another in-body header (in which case again check_inbody_header() would use it), or something else (in which case check_inbody_header() will return false, we get out of header_stage=1 state and process this line as the first line of the log message. An empty line we see in this state must trigger "we are no longer taking in-body header" logic, too. And that is exactly your patch does. The change "feels" correct to me. > mailinfo.c | 7 ++++++- > 1 file changed, 6 insertions(+), 1 deletion(-) > > diff --git a/mailinfo.c b/mailinfo.c > index a489d9d0f..68037758f 100644 > --- a/mailinfo.c > +++ b/mailinfo.c > @@ -757,8 +757,13 @@ static int handle_commit_msg(struct mailinfo *mi, struct strbuf *line) > assert(!mi->filter_stage); > > if (mi->header_stage) { > - if (!line->len || (line->len == 1 && line->buf[0] == '\n')) > + if (!line->len || (line->len == 1 && line->buf[0] == '\n')) { > + if (mi->inbody_header_accum.len) { > + flush_inbody_header_accum(mi); > + mi->header_stage = 0; > + } > return 0; > + } > } > > if (mi->use_inbody_headers && mi->header_stage) { ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2017-04-04 6:51 UTC | newest] Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-04-01 0:24 Bug in "git am" when the body starts with spaces Linus Torvalds 2017-04-01 0:52 ` Linus Torvalds 2017-04-01 5:27 ` Jeff King 2017-04-01 19:03 ` Linus Torvalds 2017-04-02 4:18 ` Jeff King 2017-04-03 17:42 ` Jonathan Tan 2017-04-04 6:50 ` Jeff King 2017-04-02 17:35 ` Junio C Hamano
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.