git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* 'git mailinfo' whitespace bug
@ 2010-02-18 18:05 Linus Torvalds
  2010-02-19 11:54 ` [PATCH] mailinfo: don't trim whitespace in the commit message Lukas Sandström
  2010-02-20  5:51 ` 'git mailinfo' whitespace bug Junio C Hamano
  0 siblings, 2 replies; 5+ messages in thread
From: Linus Torvalds @ 2010-02-18 18:05 UTC (permalink / raw)
  To: Git Mailing List, Junio C Hamano, Lukas Sandström


'git mailinfo' removes the whitespace from the beginning of the email 
body, but it does it incorrectly.

In particular, some people use indented paragraphs, like this:

	  Four-score and Four score and seven years ago our fathers 
   brought forth, upon this continent, a new nation, conceived in Liberty, 
   and dedicated to the proposition that all men are created equal.

	Now we are engaged in a great civil war, testing whether that 
   nation, or any nation so conceived, and so dedicated, can long endure. 
   We are met here on a great battlefield of that war. We have come to 
   dedicate a portion of it as a final resting place for those who here 
   gave their lives that that nation might live. It is altogether fitting 
   and proper that we should do this.

   ...

and mailinfo will not just remove empty lines from the beginning of the 
email body, it will also remove the _first_ indentation (but not any 
others). Which makes the whole thing come out wrong.

I bisected it, and this bug was introduced almost two years ago. In commit 
3b6121f69b2 ("git-mailinfo: use strbuf's instead of fixed buffers"), to be 
exact. I'm pretty sure the bug is that handle_commit_msg() was changed to 
use 'strbuf_ltrim()' for the 'still_looking' case.

Before commit 3b6121f69b2, it would create a new variable that had the 
trimmed results ("char *cp = line;"), after that commit it would just trim 
the line itself. Which is correct for the case of it being a header, but 
if it's the first non-header line, it's wrong.

			Linus

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH] mailinfo: don't trim whitespace in the commit message
  2010-02-18 18:05 'git mailinfo' whitespace bug Linus Torvalds
@ 2010-02-19 11:54 ` Lukas Sandström
  2010-02-20  5:51 ` 'git mailinfo' whitespace bug Junio C Hamano
  1 sibling, 0 replies; 5+ messages in thread
From: Lukas Sandström @ 2010-02-19 11:54 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Lukas Sandström, Git Mailing List, Junio C Hamano

Previously any whitespace in the beginning of the first line
of the commit message was trimmed, destroying any paragraph
indentation. Move the whitespace trimming to check_header()
instead, and preserve all commit message lines as-is in
handle_commit_msg().

Signed-off-by: Lukas Sandström <luksan@gmail.com>
---

On 2010-02-18 19:05, Linus Torvalds wrote:
> 'git mailinfo' removes the whitespace from the beginning of the email
> body, but it does it incorrectly.
>
> In particular, some people use indented paragraphs, like this:
>
> 	  Four-score and Four score and seven years ago our fathers
>    brought forth, upon this continent, a new nation, conceived in Liberty,
>    and dedicated to the proposition that all men are created equal.
>
> 	Now we are engaged in a great civil war, testing whether that
>    nation, or any nation so conceived, and so dedicated, can long endure.
>    We are met here on a great battlefield of that war. We have come to
>    dedicate a portion of it as a final resting place for those who here
>    gave their lives that that nation might live. It is altogether fitting
>    and proper that we should do this.
>
>    ...
>
> and mailinfo will not just remove empty lines from the beginning of the
> email body, it will also remove the _first_ indentation (but not any
> others). Which makes the whole thing come out wrong.
>
> I bisected it, and this bug was introduced almost two years ago. In commit
> 3b6121f69b2 ("git-mailinfo: use strbuf's instead of fixed buffers"), to be
> exact. I'm pretty sure the bug is that handle_commit_msg() was changed to
> use 'strbuf_ltrim()' for the 'still_looking' case.
>
> Before commit 3b6121f69b2, it would create a new variable that had the
> trimmed results ("char *cp = line;"), after that commit it would just trim
> the line itself. Which is correct for the case of it being a header, but
> if it's the first non-header line, it's wrong.
>

This patch should fix it. Note that there was a test-case explicitly
checking for this "trim first line" behaviour.

/Lukas

 builtin-mailinfo.c |   41 ++++++++++++++++++++++-------------------
 t/t5100/msg0015    |    2 +-
 2 files changed, 23 insertions(+), 20 deletions(-)

diff --git a/builtin-mailinfo.c b/builtin-mailinfo.c
index a50ac22..954dc11 100644
--- a/builtin-mailinfo.c
+++ b/builtin-mailinfo.c
@@ -283,11 +283,15 @@ static inline int cmp_header(const struct strbuf *line, const char *hdr)
 			line->buf[len] == ':' && isspace(line->buf[len + 1]);
 }

-static int check_header(const struct strbuf *line,
+static int check_header(const struct strbuf *line_in,
 				struct strbuf *hdr_data[], int overwrite)
 {
 	int i, ret = 0, len;
-	struct strbuf sb = STRBUF_INIT;
+	struct strbuf sb = STRBUF_INIT, sb2 = STRBUF_INIT, *line = &sb2;
+
+	strbuf_addbuf(line, line_in);
+	strbuf_ltrim(line);
+
 	/* search for the interesting parts */
 	for (i = 0; header[i]; i++) {
 		int len = strlen(header[i]);
@@ -339,6 +343,7 @@ static int check_header(const struct strbuf *line,

 check_header_out:
 	strbuf_release(&sb);
+	strbuf_release(&sb2);
 	return ret;
 }

@@ -773,27 +778,25 @@ static int is_scissors_line(const struct strbuf *line)

 static int handle_commit_msg(struct strbuf *line)
 {
-	static int still_looking = 1;
+	static int first_msg_line_found = 0;
+	size_t cnt;

 	if (!cmitmsg)
-		return 0;
-
-	if (still_looking) {
-		strbuf_ltrim(line);
-		if (!line->len)
+		return 0; /* FIXME: shouldn't this be: return 1? */
+
+	if (!first_msg_line_found) {
+		if (use_inbody_headers)
+			if (check_header(line, s_hdr_data, 0))
+				return 0;
+		/* Check if the first line is all whitespace */
+		for (cnt = 0; isspace(line->buf[cnt]); cnt++)
+			; /* nothing */
+		if (line->len == cnt)
+			/* Ignore the first line if it's only whitespace */
 			return 0;
+		first_msg_line_found = 1;
 	}

-	if (use_inbody_headers && still_looking) {
-		still_looking = check_header(line, s_hdr_data, 0);
-		if (still_looking)
-			return 0;
-	} else
-		/* Only trim the first (blank) line of the commit message
-		 * when ignoring in-body headers.
-		 */
-		still_looking = 0;
-
 	/* normalize the log message to UTF-8. */
 	if (metainfo_charset)
 		convert_to_utf8(line, charset.buf);
@@ -804,7 +807,7 @@ static int handle_commit_msg(struct strbuf *line)
 			die_errno("Could not rewind output message file");
 		if (ftruncate(fileno(cmitmsg), 0))
 			die_errno("Could not truncate output message file at scissors");
-		still_looking = 1;
+		first_msg_line_found = 0;

 		/*
 		 * We may have already read "secondary headers"; purge
diff --git a/t/t5100/msg0015 b/t/t5100/msg0015
index 9577238..4abb3d5 100644
--- a/t/t5100/msg0015
+++ b/t/t5100/msg0015
@@ -1,2 +1,2 @@
-- a list
+  - a list
   - of stuff
-- 
1.6.6.1

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: 'git mailinfo' whitespace bug
  2010-02-18 18:05 'git mailinfo' whitespace bug Linus Torvalds
  2010-02-19 11:54 ` [PATCH] mailinfo: don't trim whitespace in the commit message Lukas Sandström
@ 2010-02-20  5:51 ` Junio C Hamano
  2010-02-22 15:13   ` Don Zickus
  1 sibling, 1 reply; 5+ messages in thread
From: Junio C Hamano @ 2010-02-20  5:51 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List, Lukas Sandström, Don Zickus

Linus Torvalds <torvalds@linux-foundation.org> writes:

> I bisected it, and this bug was introduced almost two years ago. In commit 
> 3b6121f69b2 ("git-mailinfo: use strbuf's instead of fixed buffers"), to be 
> exact. I'm pretty sure the bug is that handle_commit_msg() was changed to 
> use 'strbuf_ltrim()' for the 'still_looking' case.
>
> Before commit 3b6121f69b2, it would create a new variable that had the 
> trimmed results ("char *cp = line;"), after that commit it would just trim 
> the line itself. Which is correct for the case of it being a header, but 
> if it's the first non-header line, it's wrong.

True; trimming the body is obviously wrong.

But when is it correct to ltrim a header line?  It means we are going to
accept a header (or header-looking line in body) that is indented.  I
don't know why 87ab799 (builtin-mailinfo.c 2007-03-12) was coded that way.


 builtin-mailinfo.c |    3 +--
 t/t5100/msg0015    |    2 +-
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/builtin-mailinfo.c b/builtin-mailinfo.c
index a50ac22..ce2ef6b 100644
--- a/builtin-mailinfo.c
+++ b/builtin-mailinfo.c
@@ -779,8 +779,7 @@ static int handle_commit_msg(struct strbuf *line)
 		return 0;
 
 	if (still_looking) {
-		strbuf_ltrim(line);
-		if (!line->len)
+		if (!line->len || (line->len == 1 && line->buf[0] == '\n'))
 			return 0;
 	}
 
diff --git a/t/t5100/msg0015 b/t/t5100/msg0015
index 9577238..4abb3d5 100644
--- a/t/t5100/msg0015
+++ b/t/t5100/msg0015
@@ -1,2 +1,2 @@
-- a list
+  - a list
   - of stuff

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: 'git mailinfo' whitespace bug
  2010-02-20  5:51 ` 'git mailinfo' whitespace bug Junio C Hamano
@ 2010-02-22 15:13   ` Don Zickus
  2010-02-22 19:57     ` Junio C Hamano
  0 siblings, 1 reply; 5+ messages in thread
From: Don Zickus @ 2010-02-22 15:13 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, Git Mailing List, Lukas Sandström

On Fri, Feb 19, 2010 at 09:51:19PM -0800, Junio C Hamano wrote:
> Linus Torvalds <torvalds@linux-foundation.org> writes:
> 
> > I bisected it, and this bug was introduced almost two years ago. In commit 
> > 3b6121f69b2 ("git-mailinfo: use strbuf's instead of fixed buffers"), to be 
> > exact. I'm pretty sure the bug is that handle_commit_msg() was changed to 
> > use 'strbuf_ltrim()' for the 'still_looking' case.
> >
> > Before commit 3b6121f69b2, it would create a new variable that had the 
> > trimmed results ("char *cp = line;"), after that commit it would just trim 
> > the line itself. Which is correct for the case of it being a header, but 
> > if it's the first non-header line, it's wrong.
> 
> True; trimming the body is obviously wrong.
> 
> But when is it correct to ltrim a header line?  It means we are going to
> accept a header (or header-looking line in body) that is indented.  I
> don't know why 87ab799 (builtin-mailinfo.c 2007-03-12) was coded that way.

In regards to 87ab799, I just deleted and pasted it from the function
handle_inbody_header (which you can see from that commit).  The original
code for those lines came from ae448e3854d8b6e7e37aa88fa3917f5dd97f3210.
Perhaps I misused it when I moved it.

Your patch belows seems to make sense for what its worth.

Cheers,
Don

> 
> 
>  builtin-mailinfo.c |    3 +--
>  t/t5100/msg0015    |    2 +-
>  2 files changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/builtin-mailinfo.c b/builtin-mailinfo.c
> index a50ac22..ce2ef6b 100644
> --- a/builtin-mailinfo.c
> +++ b/builtin-mailinfo.c
> @@ -779,8 +779,7 @@ static int handle_commit_msg(struct strbuf *line)
>  		return 0;
>  
>  	if (still_looking) {
> -		strbuf_ltrim(line);
> -		if (!line->len)
> +		if (!line->len || (line->len == 1 && line->buf[0] == '\n'))
>  			return 0;
>  	}
>  
> diff --git a/t/t5100/msg0015 b/t/t5100/msg0015
> index 9577238..4abb3d5 100644
> --- a/t/t5100/msg0015
> +++ b/t/t5100/msg0015
> @@ -1,2 +1,2 @@
> -- a list
> +  - a list
>    - of stuff

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: 'git mailinfo' whitespace bug
  2010-02-22 15:13   ` Don Zickus
@ 2010-02-22 19:57     ` Junio C Hamano
  0 siblings, 0 replies; 5+ messages in thread
From: Junio C Hamano @ 2010-02-22 19:57 UTC (permalink / raw)
  To: Don Zickus; +Cc: Linus Torvalds, Git Mailing List, Lukas Sandström

Don Zickus <dzickus@redhat.com> writes:

>> But when is it correct to ltrim a header line?  It means we are going to
>> accept a header (or header-looking line in body) that is indented.  I
>> don't know why 87ab799 (builtin-mailinfo.c 2007-03-12) was coded that way.
>
> In regards to 87ab799, I just deleted and pasted it from the function
> handle_inbody_header (which you can see from that commit).  The original
> code for those lines came from ae448e3854d8b6e7e37aa88fa3917f5dd97f3210.

Thanks for clarifying.  The one in ae448e3 (mailinfo: ignore blanks after
in-body headers., 2006-06-17) was about removing blank lines before the
in-body headers begin, and never about removing indentation of the first
in-body header.  Admittedly, it was also being lenient and skipped over
lines that are not empty but all whitespaces, and if we apply the quoted
patch we will be retroactively tightening the rule, but I somehow do not
think people would care.

>>  builtin-mailinfo.c |    3 +--
>>  t/t5100/msg0015    |    2 +-
>>  2 files changed, 2 insertions(+), 3 deletions(-)
>> 
>> diff --git a/builtin-mailinfo.c b/builtin-mailinfo.c
>> index a50ac22..ce2ef6b 100644
>> --- a/builtin-mailinfo.c
>> +++ b/builtin-mailinfo.c
>> @@ -779,8 +779,7 @@ static int handle_commit_msg(struct strbuf *line)
>>  		return 0;
>>  
>>  	if (still_looking) {
>> -		strbuf_ltrim(line);
>> -		if (!line->len)
>> +		if (!line->len || (line->len == 1 && line->buf[0] == '\n'))
>>  			return 0;
>>  	}

We probably could do something like

	if (still_looking) {
        	if (strspn(line->buf, " \t\n") == line->len)
			return 0;
	}

to keep people with blank but not empty lines before the first in-body
header happy.  I somehow don't think it is worth it, but what would I
know...

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2010-02-22 19:57 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-02-18 18:05 'git mailinfo' whitespace bug Linus Torvalds
2010-02-19 11:54 ` [PATCH] mailinfo: don't trim whitespace in the commit message Lukas Sandström
2010-02-20  5:51 ` 'git mailinfo' whitespace bug Junio C Hamano
2010-02-22 15:13   ` Don Zickus
2010-02-22 19:57     ` Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).