All of lore.kernel.org
 help / color / mirror / Atom feed
* Please make git-am handle \r\n-damaged patches
@ 2009-08-03 19:08 H. Peter Anvin
  2009-08-03 21:30 ` Junio C Hamano
  2009-08-04  6:35 ` Alex Riesen
  0 siblings, 2 replies; 33+ messages in thread
From: H. Peter Anvin @ 2009-08-03 19:08 UTC (permalink / raw)
  To: Git Mailing List

In a serious case of craniorectal immersion, the Thunderbird developers
have started using \r\n line endings on saved emails:

https://bugzilla.mozilla.org/show_bug.cgi?id=503271
https://bugzilla.mozilla.org/show_bug.cgi?id=507530

It would be nice if git-am could handle this case automatically.

	-hpa

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Please make git-am handle \r\n-damaged patches
  2009-08-03 19:08 Please make git-am handle \r\n-damaged patches H. Peter Anvin
@ 2009-08-03 21:30 ` Junio C Hamano
  2009-08-03 21:56   ` Wesley J. Landaker
  2009-08-03 22:13   ` H. Peter Anvin
  2009-08-04  6:35 ` Alex Riesen
  1 sibling, 2 replies; 33+ messages in thread
From: Junio C Hamano @ 2009-08-03 21:30 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Git Mailing List

"H. Peter Anvin" <hpa@zytor.com> writes:

> In a serious case of craniorectal immersion, the Thunderbird developers
> have started using \r\n line endings on saved emails:
>
> https://bugzilla.mozilla.org/show_bug.cgi?id=503271
> https://bugzilla.mozilla.org/show_bug.cgi?id=507530
>
> It would be nice if git-am could handle this case automatically.

Perhaps

    $ dos2unix *.eml | git am

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Please make git-am handle \r\n-damaged patches
  2009-08-03 21:30 ` Junio C Hamano
@ 2009-08-03 21:56   ` Wesley J. Landaker
  2009-08-03 22:56     ` H. Peter Anvin
  2009-08-03 22:13   ` H. Peter Anvin
  1 sibling, 1 reply; 33+ messages in thread
From: Wesley J. Landaker @ 2009-08-03 21:56 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Git Mailing List

[-- Attachment #1: Type: text/plain, Size: 681 bytes --]

On Monday 03 August 2009 15:30:38 Junio C Hamano wrote:
> "H. Peter Anvin" <hpa@zytor.com> writes:
> > In a serious case of craniorectal immersion, the Thunderbird developers
> > have started using \r\n line endings on saved emails:
> >
> > https://bugzilla.mozilla.org/show_bug.cgi?id=503271
> > https://bugzilla.mozilla.org/show_bug.cgi?id=507530
> >
> > It would be nice if git-am could handle this case automatically.
>
> Perhaps
>
>     $ dos2unix *.eml | git am

I didn't try it, but would "git am" with "apply.whitespace" and 
"core.whitespace" set in some reasonable manner help? Not "automatic", but 
may help if dos2unix isn't available for some reason.

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Please make git-am handle \r\n-damaged patches
  2009-08-03 21:30 ` Junio C Hamano
  2009-08-03 21:56   ` Wesley J. Landaker
@ 2009-08-03 22:13   ` H. Peter Anvin
  2009-08-03 22:21     ` Sverre Rabbelier
  1 sibling, 1 reply; 33+ messages in thread
From: H. Peter Anvin @ 2009-08-03 22:13 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List

On 08/03/2009 02:30 PM, Junio C Hamano wrote:
> "H. Peter Anvin" <hpa@zytor.com> writes:
> 
>> In a serious case of craniorectal immersion, the Thunderbird developers
>> have started using \r\n line endings on saved emails:
>>
>> https://bugzilla.mozilla.org/show_bug.cgi?id=503271
>> https://bugzilla.mozilla.org/show_bug.cgi?id=507530
>>
>> It would be nice if git-am could handle this case automatically.
> 
> Perhaps
> 
>     $ dos2unix *.eml | git am
> 

Yes, that's what they suggested, too.  Like I need an extra step in my
patch process.

	-hpa

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Please make git-am handle \r\n-damaged patches
  2009-08-03 22:13   ` H. Peter Anvin
@ 2009-08-03 22:21     ` Sverre Rabbelier
  2009-08-03 22:31       ` H. Peter Anvin
  2009-08-04  6:21       ` Paolo Bonzini
  0 siblings, 2 replies; 33+ messages in thread
From: Sverre Rabbelier @ 2009-08-03 22:21 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Junio C Hamano, Git Mailing List

Heya,

On Mon, Aug 3, 2009 at 15:13, H. Peter Anvin<hpa@zytor.com> wrote:
> Yes, that's what they suggested, too.  Like I need an extra step in my
> patch process.

Write your own git-lazy-am.sh and put it in your path?

cat > git-lazy-am.sh << EOF
#!/bin/bash

dos2unix "$@" | git am
EOF

-- 
Cheers,

Sverre Rabbelier

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Please make git-am handle \r\n-damaged patches
  2009-08-03 22:21     ` Sverre Rabbelier
@ 2009-08-03 22:31       ` H. Peter Anvin
  2009-08-04  6:21       ` Paolo Bonzini
  1 sibling, 0 replies; 33+ messages in thread
From: H. Peter Anvin @ 2009-08-03 22:31 UTC (permalink / raw)
  To: Sverre Rabbelier; +Cc: Junio C Hamano, Git Mailing List

On 08/03/2009 03:21 PM, Sverre Rabbelier wrote:
> Heya,
> 
> On Mon, Aug 3, 2009 at 15:13, H. Peter Anvin<hpa@zytor.com> wrote:
>> Yes, that's what they suggested, too.  Like I need an extra step in my
>> patch process.
> 
> Write your own git-lazy-am.sh and put it in your path?
> 
> cat > git-lazy-am.sh << EOF
> #!/bin/bash
> 
> dos2unix "$@" | git am
> EOF

Yes, I can do that.  However, having this integrated into git am would
be nicer not just for me but for everyone else.

(Currently I'm just using apply.whitespace = fix, which also solves the
issue.)

	-hpa

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Please make git-am handle \r\n-damaged patches
  2009-08-03 21:56   ` Wesley J. Landaker
@ 2009-08-03 22:56     ` H. Peter Anvin
  0 siblings, 0 replies; 33+ messages in thread
From: H. Peter Anvin @ 2009-08-03 22:56 UTC (permalink / raw)
  To: Wesley J. Landaker; +Cc: Git Mailing List

On 08/03/2009 02:56 PM, Wesley J. Landaker wrote:
> On Monday 03 August 2009 15:30:38 Junio C Hamano wrote:
>> "H. Peter Anvin" <hpa@zytor.com> writes:
>>> In a serious case of craniorectal immersion, the Thunderbird developers
>>> have started using \r\n line endings on saved emails:
>>>
>>> https://bugzilla.mozilla.org/show_bug.cgi?id=503271
>>> https://bugzilla.mozilla.org/show_bug.cgi?id=507530
>>>
>>> It would be nice if git-am could handle this case automatically.
>> Perhaps
>>
>>     $ dos2unix *.eml | git am
> 
> I didn't try it, but would "git am" with "apply.whitespace" and 
> "core.whitespace" set in some reasonable manner help? Not "automatic", but 
> may help if dos2unix isn't available for some reason.

It works, but it's extremely noisy.

	-hpa

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Please make git-am handle \r\n-damaged patches
  2009-08-03 22:21     ` Sverre Rabbelier
  2009-08-03 22:31       ` H. Peter Anvin
@ 2009-08-04  6:21       ` Paolo Bonzini
  1 sibling, 0 replies; 33+ messages in thread
From: Paolo Bonzini @ 2009-08-04  6:21 UTC (permalink / raw)
  To: Sverre Rabbelier; +Cc: H. Peter Anvin, Git Mailing List

On 08/04/2009 12:21 AM, Sverre Rabbelier wrote:
> Heya,
>
> On Mon, Aug 3, 2009 at 15:13, H. Peter Anvin<hpa@zytor.com>  wrote:
>> Yes, that's what they suggested, too.  Like I need an extra step in my
>> patch process.
>
> Write your own git-lazy-am.sh and put it in your path?
>
> cat>  git-lazy-am.sh<<  EOF
> #!/bin/bash
>
> dos2unix "$@" | git am
> EOF

Even better than putting it on the path,

[alias]
	tbam = ~/libexec/git/git-lazy-am.sh

Paolo

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Please make git-am handle \r\n-damaged patches
  2009-08-03 19:08 Please make git-am handle \r\n-damaged patches H. Peter Anvin
  2009-08-03 21:30 ` Junio C Hamano
@ 2009-08-04  6:35 ` Alex Riesen
  2009-08-04  6:46   ` Junio C Hamano
  1 sibling, 1 reply; 33+ messages in thread
From: Alex Riesen @ 2009-08-04  6:35 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Git Mailing List

On Mon, Aug 3, 2009 at 21:08, H. Peter Anvin<hpa@zytor.com> wrote:
> In a serious case of craniorectal immersion, the Thunderbird developers
> have started using \r\n line endings on saved emails:
>
> https://bugzilla.mozilla.org/show_bug.cgi?id=503271
> https://bugzilla.mozilla.org/show_bug.cgi?id=507530
>
> It would be nice if git-am could handle this case automatically.
>

Maybe it is as simple as that (not tested yet,
and sent through gmail, so please be careful):

diff --git a/builtin-mailsplit.c b/builtin-mailsplit.c
index ad5f6b5..02c1c92 100644
--- a/builtin-mailsplit.c
+++ b/builtin-mailsplit.c
@@ -58,6 +58,8 @@ int read_line_with_nul(char *buf, int size, FILE *in)
 		if (c == '\n' || len + 1 >= size)
 			break;
 	}
+	if (len && buf[len - 1] == '\r')
+		--len;
 	buf[len] = '\0';

 	return len;

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: Please make git-am handle \r\n-damaged patches
  2009-08-04  6:35 ` Alex Riesen
@ 2009-08-04  6:46   ` Junio C Hamano
  2009-08-04 17:26     ` [PATCH] Let mailsplit and mailinfo handle mails with CRLF line-endings Alex Riesen
                       ` (2 more replies)
  0 siblings, 3 replies; 33+ messages in thread
From: Junio C Hamano @ 2009-08-04  6:46 UTC (permalink / raw)
  To: Alex Riesen; +Cc: H. Peter Anvin, Git Mailing List

Alex Riesen <raa.lkml@gmail.com> writes:

> Maybe it is as simple as that (not tested yet,
> and sent through gmail, so please be careful):

I thought about this approach, but it made me worried about a case where
an otherwise sane piece of e-mail has \r at the end of one line as the
real payload.  But as long as we are talking about a text e-mail (and we
are talking about mailsplit here and a binary payload with a CTE applied
counts as text), I think we can safely use an approach like this.

>
> diff --git a/builtin-mailsplit.c b/builtin-mailsplit.c
> index ad5f6b5..02c1c92 100644
> --- a/builtin-mailsplit.c
> +++ b/builtin-mailsplit.c
> @@ -58,6 +58,8 @@ int read_line_with_nul(char *buf, int size, FILE *in)
>  		if (c == '\n' || len + 1 >= size)
>  			break;
>  	}
> +	if (len && buf[len - 1] == '\r')
> +		--len;
>  	buf[len] = '\0';
>
>  	return len;

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH] Let mailsplit and mailinfo handle mails with CRLF line-endings
  2009-08-04  6:46   ` Junio C Hamano
@ 2009-08-04 17:26     ` Alex Riesen
  2009-08-04 17:34       ` Sverre Rabbelier
  2009-08-04 21:03       ` [PATCH v2] " Alex Riesen
  2009-08-04 20:59     ` Please make git-am handle \r\n-damaged patches Erik Faye-Lund
  2009-08-04 21:15     ` Nanako Shiraishi
  2 siblings, 2 replies; 33+ messages in thread
From: Alex Riesen @ 2009-08-04 17:26 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: H. Peter Anvin, Git Mailing List

It is not that uncommon to have mails with DOS line-ending, notably
Thunderbird and Gmail (when saving what they call "original" message).

Noticed by H. Peter Anvin.

Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
---

Alex Riesen <raa.lkml@gmail.com> writes:

> Maybe it is as simple as that (not tested yet,
> and sent through gmail, so please be careful):

> --- a/builtin-mailsplit.c
> +++ b/builtin-mailsplit.c
> @@ -58,6 +58,8 @@ int read_line_with_nul(char *buf, int size, FILE *in)
> +	if (len && buf[len - 1] == '\r')
> +		--len;

That's wrong, of course. I missed the fact that \n stays in the
buffer. Corrected.

 builtin-mailsplit.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/builtin-mailsplit.c b/builtin-mailsplit.c
index ad5f6b5..188ffec 100644
--- a/builtin-mailsplit.c
+++ b/builtin-mailsplit.c
@@ -58,6 +58,8 @@ int read_line_with_nul(char *buf, int size, FILE *in)
 		if (c == '\n' || len + 1 >= size)
 			break;
 	}
+	if (len > 1 && buf[len - 2] == '\r')
+		buf[--len - 1] = '\n';
 	buf[len] = '\0';
 
 	return len;
-- 
1.6.4.30.gd4c0

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH] Let mailsplit and mailinfo handle mails with CRLF  line-endings
  2009-08-04 17:26     ` [PATCH] Let mailsplit and mailinfo handle mails with CRLF line-endings Alex Riesen
@ 2009-08-04 17:34       ` Sverre Rabbelier
  2009-08-04 18:50         ` Brandon Casey
  2009-08-04 20:57         ` Alex Riesen
  2009-08-04 21:03       ` [PATCH v2] " Alex Riesen
  1 sibling, 2 replies; 33+ messages in thread
From: Sverre Rabbelier @ 2009-08-04 17:34 UTC (permalink / raw)
  To: Alex Riesen; +Cc: Junio C Hamano, H. Peter Anvin, Git Mailing List

Heya,

On Tue, Aug 4, 2009 at 10:26, Alex Riesen<raa.lkml@gmail.com> wrote:
>        }
> +       if (len > 1 && buf[len - 2] == '\r')
> +               buf[--len - 1] = '\n';
>        buf[len] = '\0';

How about something like:

+       if (len > 1 && buf[len - 2] == '\r' && (buf[len - 1] == '\n'
|| buf[len - 1] == '\0'))
+               buf[--len - 1] = '\n';

To make sure that we're not erasing a \r somewhere in the middle of the content?

-- 
Cheers,

Sverre Rabbelier

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] Let mailsplit and mailinfo handle mails with CRLF  line-endings
  2009-08-04 17:34       ` Sverre Rabbelier
@ 2009-08-04 18:50         ` Brandon Casey
  2009-08-04 20:57         ` Alex Riesen
  1 sibling, 0 replies; 33+ messages in thread
From: Brandon Casey @ 2009-08-04 18:50 UTC (permalink / raw)
  To: Sverre Rabbelier
  Cc: Alex Riesen, Junio C Hamano, H. Peter Anvin, Git Mailing List

Sverre Rabbelier wrote:
> Heya,
> 
> On Tue, Aug 4, 2009 at 10:26, Alex Riesen<raa.lkml@gmail.com> wrote:
>>        }
>> +       if (len > 1 && buf[len - 2] == '\r')
>> +               buf[--len - 1] = '\n';
>>        buf[len] = '\0';
> 
> How about something like:
> 
> +       if (len > 1 && buf[len - 2] == '\r' && (buf[len - 1] == '\n'
> || buf[len - 1] == '\0'))
> +               buf[--len - 1] = '\n';
> 
> To make sure that we're not erasing a \r somewhere in the middle of the content?

You may want to push the \r back into the buffer if it is the last character read
too.  We may reach the limit of size characters without finding a \n, and so we
can't tell whether the last \r we read was a solitary \r or whether it is the
beginning of \r\n sequence.

So maybe we need something like this after the 'for' loop instead:


   if (c == '\n') {
   	if (len > 1 && buf[len - 2] == '\r')
		buf[--len - 1] = '\n';
   } else if (c == '\r') {
	ungetc(c, in);
	len--;
   }

-brandon

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] Let mailsplit and mailinfo handle mails with CRLF  line-endings
  2009-08-04 17:34       ` Sverre Rabbelier
  2009-08-04 18:50         ` Brandon Casey
@ 2009-08-04 20:57         ` Alex Riesen
  1 sibling, 0 replies; 33+ messages in thread
From: Alex Riesen @ 2009-08-04 20:57 UTC (permalink / raw)
  To: Sverre Rabbelier; +Cc: Junio C Hamano, H. Peter Anvin, Git Mailing List

On Tue, Aug 4, 2009 at 19:34, Sverre Rabbelier<srabbelier@gmail.com> wrote:
> On Tue, Aug 4, 2009 at 10:26, Alex Riesen<raa.lkml@gmail.com> wrote:
>>        }
>> +       if (len > 1 && buf[len - 2] == '\r')
>> +               buf[--len - 1] = '\n';
>>        buf[len] = '\0';
>
> How about something like:
>
> +       if (len > 1 && buf[len - 2] == '\r' && (buf[len - 1] == '\n'
> || buf[len - 1] == '\0'))
> +               buf[--len - 1] = '\n';
>
> To make sure that we're not erasing a \r somewhere in the middle of the content?

I think this should be enough:

	if (c == '\n' && len > 1 && buf[len - 2] == '\r')
		buf[--len - 1] = '\n';

I'll resend.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Please make git-am handle \r\n-damaged patches
  2009-08-04  6:46   ` Junio C Hamano
  2009-08-04 17:26     ` [PATCH] Let mailsplit and mailinfo handle mails with CRLF line-endings Alex Riesen
@ 2009-08-04 20:59     ` Erik Faye-Lund
  2009-08-04 21:05       ` Alex Riesen
  2009-08-05 23:10       ` Tony Finch
  2009-08-04 21:15     ` Nanako Shiraishi
  2 siblings, 2 replies; 33+ messages in thread
From: Erik Faye-Lund @ 2009-08-04 20:59 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Alex Riesen, H. Peter Anvin, Git Mailing List

On Tue, Aug 4, 2009 at 8:46 AM, Junio C Hamano<gitster@pobox.com> wrote:
> I thought about this approach, but it made me worried about a case where
> an otherwise sane piece of e-mail has \r at the end of one line as the
> real payload. But as long as we are talking about a text e-mail (and we
> are talking about mailsplit here and a binary payload with a CTE applied
> counts as text), I think we can safely use an approach like this.

RFC 2822, section 2.3 explicitly states that a CR should not occur
without a LF (and vice versa, but the e-mail client might convert CRLF
to LF when saving to file), so I think this should be safe.

-- 
Erik "kusma" Faye-Lund
kusmabite@gmail.com
(+47) 986 59 656

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v2] Let mailsplit and mailinfo handle mails with CRLF line-endings
  2009-08-04 17:26     ` [PATCH] Let mailsplit and mailinfo handle mails with CRLF line-endings Alex Riesen
  2009-08-04 17:34       ` Sverre Rabbelier
@ 2009-08-04 21:03       ` Alex Riesen
  2009-08-04 21:29         ` Brandon Casey
  1 sibling, 1 reply; 33+ messages in thread
From: Alex Riesen @ 2009-08-04 21:03 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: H. Peter Anvin, Git Mailing List, Sverre Rabbelier, Brandon Casey

Noticed by H. Peter Anvin.

It is not that uncommon to have mails with DOS line-ending,
notably Thunderbird and web mailers like Gmail (when saving
what they call "original" message).

Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
---

Corrected bug with unconditonal last (or very long) line shortening if
it contains a CR in next-to-last character. Noticed by Sverre Rabbelier.

It should also handle the case mentioned by Brandon Casey.

 builtin-mailsplit.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/builtin-mailsplit.c b/builtin-mailsplit.c
index ad5f6b5..48285a0 100644
--- a/builtin-mailsplit.c
+++ b/builtin-mailsplit.c
@@ -58,6 +58,8 @@ int read_line_with_nul(char *buf, int size, FILE *in)
 		if (c == '\n' || len + 1 >= size)
 			break;
 	}
+	if (c == '\n' && len > 1 && buf[len - 2] == '\r')
+		buf[--len - 1] = '\n';
 	buf[len] = '\0';
 
 	return len;
-- 
1.6.4.34.gc3135e

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: Please make git-am handle \r\n-damaged patches
  2009-08-04 20:59     ` Please make git-am handle \r\n-damaged patches Erik Faye-Lund
@ 2009-08-04 21:05       ` Alex Riesen
  2009-08-04 21:16         ` Erik Faye-Lund
  2009-08-05 23:10       ` Tony Finch
  1 sibling, 1 reply; 33+ messages in thread
From: Alex Riesen @ 2009-08-04 21:05 UTC (permalink / raw)
  To: Erik Faye-Lund; +Cc: Junio C Hamano, H. Peter Anvin, Git Mailing List

On Tue, Aug 4, 2009 at 22:59, Erik Faye-Lund<kusmabite@googlemail.com> wrote:
> On Tue, Aug 4, 2009 at 8:46 AM, Junio C Hamano<gitster@pobox.com> wrote:
>> I thought about this approach, but it made me worried about a case where
>> an otherwise sane piece of e-mail has \r at the end of one line as the
>> real payload. But as long as we are talking about a text e-mail (and we
>> are talking about mailsplit here and a binary payload with a CTE applied
>> counts as text), I think we can safely use an approach like this.
>
> RFC 2822, section 2.3 explicitly states that a CR should not occur
> without a LF (and vice versa, but the e-mail client might convert CRLF
> to LF when saving to file), so I think this should be safe.

You missed this line in original posters e-mail:
"In a serious case of craniorectal immersion..."

We are not safe from that and alike.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Please make git-am handle \r\n-damaged patches
  2009-08-04  6:46   ` Junio C Hamano
  2009-08-04 17:26     ` [PATCH] Let mailsplit and mailinfo handle mails with CRLF line-endings Alex Riesen
  2009-08-04 20:59     ` Please make git-am handle \r\n-damaged patches Erik Faye-Lund
@ 2009-08-04 21:15     ` Nanako Shiraishi
  2009-08-04 22:22       ` Junio C Hamano
  2 siblings, 1 reply; 33+ messages in thread
From: Nanako Shiraishi @ 2009-08-04 21:15 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Alex Riesen, H. Peter Anvin, Git Mailing List

Quoting Junio C Hamano <gitster@pobox.com>

> Alex Riesen <raa.lkml@gmail.com> writes:
>
>> Maybe it is as simple as that (not tested yet,
>> and sent through gmail, so please be careful):
>
> I thought about this approach, but it made me worried about a case where
> an otherwise sane piece of e-mail has \r at the end of one line as the
> real payload.  But as long as we are talking about a text e-mail (and we
> are talking about mailsplit here and a binary payload with a CTE applied
> counts as text), I think we can safely use an approach like this.

Is it safe to rebase a commit that introduces a carriage-return at the end of the line using the updated program?

-- 
Nanako Shiraishi
http://ivory.ap.teacup.com/nanako3/

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Please make git-am handle \r\n-damaged patches
  2009-08-04 21:05       ` Alex Riesen
@ 2009-08-04 21:16         ` Erik Faye-Lund
  0 siblings, 0 replies; 33+ messages in thread
From: Erik Faye-Lund @ 2009-08-04 21:16 UTC (permalink / raw)
  To: Alex Riesen; +Cc: Junio C Hamano, H. Peter Anvin, Git Mailing List

On Tue, Aug 4, 2009 at 11:05 PM, Alex Riesen<raa.lkml@gmail.com> wrote:
> You missed this line in original posters e-mail:
> "In a serious case of craniorectal immersion..."
>
> We are not safe from that and alike.

Storing e-mails with CRLFs intact is perfectly sane. Depending on them
not being stored that way is not.

-- 
Erik "kusma" Faye-Lund
kusmabite@gmail.com
(+47) 986 59 656

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2] Let mailsplit and mailinfo handle mails with CRLF line-endings
  2009-08-04 21:03       ` [PATCH v2] " Alex Riesen
@ 2009-08-04 21:29         ` Brandon Casey
  2009-08-04 21:40           ` Brandon Casey
  2009-08-04 21:58           ` Alex Riesen
  0 siblings, 2 replies; 33+ messages in thread
From: Brandon Casey @ 2009-08-04 21:29 UTC (permalink / raw)
  To: Alex Riesen
  Cc: Junio C Hamano, H. Peter Anvin, Git Mailing List, Sverre Rabbelier

Alex Riesen wrote:
> Noticed by H. Peter Anvin.
> 
> It is not that uncommon to have mails with DOS line-ending,
> notably Thunderbird and web mailers like Gmail (when saving
> what they call "original" message).
> 
> Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
> ---
> 
> Corrected bug with unconditonal last (or very long) line shortening if
> it contains a CR in next-to-last character. Noticed by Sverre Rabbelier.
> 
> It should also handle the case mentioned by Brandon Casey.
> 
>  builtin-mailsplit.c |    2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
> 
> diff --git a/builtin-mailsplit.c b/builtin-mailsplit.c
> index ad5f6b5..48285a0 100644
> --- a/builtin-mailsplit.c
> +++ b/builtin-mailsplit.c
> @@ -58,6 +58,8 @@ int read_line_with_nul(char *buf, int size, FILE *in)
>  		if (c == '\n' || len + 1 >= size)
>  			break;
>  	}
> +	if (c == '\n' && len > 1 && buf[len - 2] == '\r')
> +		buf[--len - 1] = '\n';
>  	buf[len] = '\0';
>  
>  	return len;

What if \r lands at character 99 and \n is at character 100?  If buf has
exactly 100 characters available for writing.  Wouldn't the '\r' be stored
into buf, but not the '\n'?  Then the 'for' loop would terminate since len + 1
would be >= size and the code above would test whether c == '\n', and it
would not, so the '\r' would not be removed from buf as it should be.

At the point where buf has been filled, and the last character read is a '\r',
we can not tell whether the next character is a '\n' or not, so we do not know
if it is a solitary '\r' or whether it is the start of a '\r\n' sequence.  It
seems to me that we must push the '\r' back into the stream and allow the next
call to read_line_with_nul() handle it, or peek to see if the next character
is a '\n'.

Be aware that if we use the version I suggested which pushes the '\r' back
into the input stream, I think we risk an infinite loop if size == 1.  I don't
think that is possible from the current callers though.

-brandon

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2] Let mailsplit and mailinfo handle mails with CRLF line-endings
  2009-08-04 21:29         ` Brandon Casey
@ 2009-08-04 21:40           ` Brandon Casey
  2009-08-04 21:58           ` Alex Riesen
  1 sibling, 0 replies; 33+ messages in thread
From: Brandon Casey @ 2009-08-04 21:40 UTC (permalink / raw)
  To: Alex Riesen
  Cc: Junio C Hamano, H. Peter Anvin, Git Mailing List, Sverre Rabbelier

Brandon Casey wrote:

> Be aware that if we use the version I suggested which pushes the '\r' back
> into the input stream, I think we risk an infinite loop if size == 1.  I don't
> think that is possible from the current callers though.

I think something like this would avoid any potential infinite loop, however improbable:

        if (c == '\n') {
                if (len > 1 && buf[len - 2] == '\r')
                        buf[--len - 1] = '\n';
        } else if (c == '\r') {
                c = getc(in);
                if (c == '\n')
                        buf[len - 1] = '\n';
                else if (c != EOF)
                        ungetc(c, in);
        }

-brandon

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2] Let mailsplit and mailinfo handle mails with CRLF  line-endings
  2009-08-04 21:29         ` Brandon Casey
  2009-08-04 21:40           ` Brandon Casey
@ 2009-08-04 21:58           ` Alex Riesen
  2009-08-04 22:03             ` Alex Riesen
  2009-08-04 22:23             ` Brandon Casey
  1 sibling, 2 replies; 33+ messages in thread
From: Alex Riesen @ 2009-08-04 21:58 UTC (permalink / raw)
  To: Brandon Casey
  Cc: Junio C Hamano, H. Peter Anvin, Git Mailing List, Sverre Rabbelier

On Tue, Aug 4, 2009 at 23:29, Brandon
Casey<brandon.casey.ctr@nrlssc.navy.mil> wrote:
> Alex Riesen wrote:
>> +     if (c == '\n' && len > 1 && buf[len - 2] == '\r')
>> +             buf[--len - 1] = '\n';
>>       buf[len] = '\0';
>>
>>       return len;
>
> What if \r lands at character 99 and \n is at character 100?  If buf has
> exactly 100 characters available for writing. ...

Ah, yes. You're right.

I have strong dislike towards unget, though. How about this, instead:

int read_line_with_nul(char *buf, int size, FILE *in)
{
	int len = 0, c;

	while (len < size) {
		c = getc(in);
		if (c == EOF)
			break;
		buf[len++] = c;
		if (c == '\n')
			break;
		else if (len == size)
			c = 0;
	}
	if (c == '\n' && len > 1 && buf[len - 2] == '\r')
		buf[--len - 1] = '\n';
	buf[len] = '\0';

	return len;
}

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2] Let mailsplit and mailinfo handle mails with CRLF  line-endings
  2009-08-04 21:58           ` Alex Riesen
@ 2009-08-04 22:03             ` Alex Riesen
  2009-08-04 22:04               ` Alex Riesen
  2009-08-04 22:23             ` Brandon Casey
  1 sibling, 1 reply; 33+ messages in thread
From: Alex Riesen @ 2009-08-04 22:03 UTC (permalink / raw)
  To: Brandon Casey
  Cc: Junio C Hamano, H. Peter Anvin, Git Mailing List, Sverre Rabbelier

On Tue, Aug 4, 2009 at 23:58, Alex Riesen<raa.lkml@gmail.com> wrote:
> int read_line_with_nul(char *buf, int size, FILE *in)
> {
>        int len = 0, c;
>
>        while (len < size) {

"for (;;)", of course. Leftover from development process

>                c = getc(in);
>                if (c == EOF)
>                        break;
>                buf[len++] = c;
>                if (c == '\n')
>                        break;
>                else if (len == size)
>                        c = 0;
>        }
>        if (c == '\n' && len > 1 && buf[len - 2] == '\r')
>                buf[--len - 1] = '\n';
>        buf[len] = '\0';
>
>        return len;
> }
>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2] Let mailsplit and mailinfo handle mails with CRLF  line-endings
  2009-08-04 22:03             ` Alex Riesen
@ 2009-08-04 22:04               ` Alex Riesen
  0 siblings, 0 replies; 33+ messages in thread
From: Alex Riesen @ 2009-08-04 22:04 UTC (permalink / raw)
  To: Brandon Casey
  Cc: Junio C Hamano, H. Peter Anvin, Git Mailing List, Sverre Rabbelier

On Wed, Aug 5, 2009 at 00:03, Alex Riesen<raa.lkml@gmail.com> wrote:
> On Tue, Aug 4, 2009 at 23:58, Alex Riesen<raa.lkml@gmail.com> wrote:
>> int read_line_with_nul(char *buf, int size, FILE *in)
>> {
>>        int len = 0, c;
>>
>>        while (len < size) {
>
> "for (;;)", of course. Leftover from development process

Ahm... No, its not! Its correct :)
Good night, everybody.

>>                c = getc(in);
>>                if (c == EOF)
>>                        break;
>>                buf[len++] = c;
>>                if (c == '\n')
>>                        break;
>>                else if (len == size)
>>                        c = 0;
>>        }
>>        if (c == '\n' && len > 1 && buf[len - 2] == '\r')
>>                buf[--len - 1] = '\n';
>>        buf[len] = '\0';
>>
>>        return len;
>> }
>>
>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Please make git-am handle \r\n-damaged patches
  2009-08-04 21:15     ` Nanako Shiraishi
@ 2009-08-04 22:22       ` Junio C Hamano
  2009-08-05  3:31         ` [PATCH 0/4] " Brandon Casey
  0 siblings, 1 reply; 33+ messages in thread
From: Junio C Hamano @ 2009-08-04 22:22 UTC (permalink / raw)
  To: Nanako Shiraishi; +Cc: Alex Riesen, H. Peter Anvin, Git Mailing List

Nanako Shiraishi <nanako3@lavabit.com> writes:

> Quoting Junio C Hamano <gitster@pobox.com>
>
>> Alex Riesen <raa.lkml@gmail.com> writes:
>>
>>> Maybe it is as simple as that (not tested yet,
>>> and sent through gmail, so please be careful):
>>
>> I thought about this approach, but it made me worried about a case where
>> an otherwise sane piece of e-mail has \r at the end of one line as the
>> real payload.  But as long as we are talking about a text e-mail (and we
>> are talking about mailsplit here and a binary payload with a CTE applied
>> counts as text), I think we can safely use an approach like this.
>
> Is it safe to rebase a commit that introduces a carriage-return at the end of the line using the updated program?

Hmmm, good point.  The approach does break rebase.

At least we could patch it up like this.

 builtin-mailinfo.c  |    2 +-
 builtin-mailsplit.c |   11 +++++++++--
 builtin.h           |    2 +-
 git-am.sh           |    8 +++++++-
 t/t3400-rebase.sh   |   26 ++++++++++++++++++++++++--
 5 files changed, 42 insertions(+), 7 deletions(-)

diff --git a/builtin-mailinfo.c b/builtin-mailinfo.c
index 92637ac..6e44d2c 100644
--- a/builtin-mailinfo.c
+++ b/builtin-mailinfo.c
@@ -835,7 +835,7 @@ static void handle_body(void)
 		strbuf_reset(&line);
 		if (strbuf_avail(&line) < 100)
 			strbuf_grow(&line, 100);
-	} while ((len = read_line_with_nul(line.buf, strbuf_avail(&line), fin)));
+	} while ((len = read_line_with_nul(line.buf, strbuf_avail(&line), fin, 0)));
 
 handle_body_out:
 	strbuf_release(&prev);
diff --git a/builtin-mailsplit.c b/builtin-mailsplit.c
index ad5f6b5..f774b04 100644
--- a/builtin-mailsplit.c
+++ b/builtin-mailsplit.c
@@ -45,8 +45,10 @@ static int is_from_line(const char *line, int len)
 /* Could be as small as 64, enough to hold a Unix "From " line. */
 static char buf[4096];
 
+static int keep_cr;
+
 /* We cannot use fgets() because our lines can contain NULs */
-int read_line_with_nul(char *buf, int size, FILE *in)
+int read_line_with_nul(char *buf, int size, FILE *in, int nuke_cr_at_eol)
 {
 	int len = 0, c;
 
@@ -58,6 +60,9 @@ int read_line_with_nul(char *buf, int size, FILE *in)
 		if (c == '\n' || len + 1 >= size)
 			break;
 	}
+	if (nuke_cr_at_eol &&
+	    (1 < len && buf[len - 2] == '\r' && buf[len - 1] == '\n'))
+		buf[len-- - 2] = '\n';
 	buf[len] = '\0';
 
 	return len;
@@ -93,7 +98,7 @@ static int split_one(FILE *mbox, const char *name, int allow_bare)
 		if (fwrite(buf, 1, len, output) != len)
 			die_errno("cannot write output");
 
-		len = read_line_with_nul(buf, sizeof(buf), mbox);
+		len = read_line_with_nul(buf, sizeof(buf), mbox, !keep_cr);
 		if (len == 0) {
 			if (feof(mbox)) {
 				status = 1;
@@ -248,6 +253,8 @@ int cmd_mailsplit(int argc, const char **argv, const char *prefix)
 			nr = strtol(arg+2, NULL, 10);
 		} else if ( arg[1] == 'b' && !arg[2] ) {
 			allow_bare = 1;
+		} else if (!strcmp(arg, "--keep-cr")) {
+			keep_cr = 1;
 		} else if ( arg[1] == 'o' && arg[2] ) {
 			dir = arg+2;
 		} else if ( arg[1] == '-' && !arg[2] ) {
diff --git a/builtin.h b/builtin.h
index 825a96f..59dddee 100644
--- a/builtin.h
+++ b/builtin.h
@@ -13,7 +13,7 @@ extern const char git_more_info_string[];
 extern void list_common_cmds_help(void);
 extern const char *help_unknown_cmd(const char *cmd);
 extern void prune_packed_objects(int);
-extern int read_line_with_nul(char *buf, int size, FILE *file);
+extern int read_line_with_nul(char *buf, int size, FILE *file, int nuke_cr_at_eol);
 extern int fmt_merge_msg(int merge_summary, struct strbuf *in,
 	struct strbuf *out);
 extern int commit_tree(const char *msg, unsigned char *tree,
diff --git a/git-am.sh b/git-am.sh
index d64d997..985226b 100755
--- a/git-am.sh
+++ b/git-am.sh
@@ -197,7 +197,13 @@ check_patch_format () {
 split_patches () {
 	case "$patch_format" in
 	mbox)
-		git mailsplit -d"$prec" -o"$dotest" -b -- "$@" > "$dotest/last" ||
+		case "$rebasing" in
+		'')
+			keep_cr= ;;
+		?*)
+			keep_cr=--keep-cr ;;
+		esac
+		git mailsplit -d"$prec" -o"$dotest" -b $keep_cr -- "$@" > "$dotest/last" ||
 		clean_abort
 		;;
 	stgit-series)
diff --git a/t/t3400-rebase.sh b/t/t3400-rebase.sh
index c5c29cc..4e6a44b 100755
--- a/t/t3400-rebase.sh
+++ b/t/t3400-rebase.sh
@@ -3,9 +3,10 @@
 # Copyright (c) 2005 Amos Waterland
 #
 
-test_description='git rebase should not destroy author information
+test_description='git rebase assorted tests
 
-This test runs git rebase and checks that the author information is not lost.
+This test runs git rebase and checks that the author information is not lost
+among other things.
 '
 . ./test-lib.sh
 
@@ -133,4 +134,25 @@ test_expect_success 'rebase -q is quiet' '
      test ! -s output.out
 '
 
+q_to_cr () {
+	tr Q '\015'
+}
+
+test_expect_success 'Rebase a commit that sprinkles CRs in' '
+	(
+		echo "One"
+		echo "TwoQ"
+		echo "Three"
+		echo "FQur"
+		echo "Five"
+	) | q_to_cr >CR &&
+	git add CR &&
+	test_tick &&
+	git commit -a -m "A file with a line with CR" &&
+	git tag file-with-cr &&
+	git checkout HEAD^0 &&
+	git rebase --onto HEAD^^ HEAD^ &&
+	git diff --exit-code file-with-cr:CR HEAD:CR
+'
+
 test_done

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH v2] Let mailsplit and mailinfo handle mails with CRLF line-endings
  2009-08-04 21:58           ` Alex Riesen
  2009-08-04 22:03             ` Alex Riesen
@ 2009-08-04 22:23             ` Brandon Casey
  1 sibling, 0 replies; 33+ messages in thread
From: Brandon Casey @ 2009-08-04 22:23 UTC (permalink / raw)
  To: Alex Riesen
  Cc: Junio C Hamano, H. Peter Anvin, Git Mailing List, Sverre Rabbelier

Alex Riesen wrote:
> On Tue, Aug 4, 2009 at 23:29, Brandon
> Casey<brandon.casey.ctr@nrlssc.navy.mil> wrote:
>> Alex Riesen wrote:
>>> +     if (c == '\n' && len > 1 && buf[len - 2] == '\r')
>>> +             buf[--len - 1] = '\n';
>>>       buf[len] = '\0';
>>>
>>>       return len;
>> What if \r lands at character 99 and \n is at character 100?  If buf has
>> exactly 100 characters available for writing. ...
> 
> Ah, yes. You're right.
> 
> I have strong dislike towards unget, though. How about this, instead:
> 
> int read_line_with_nul(char *buf, int size, FILE *in)
> {
> 	int len = 0, c;
> 
> 	while (len < size) {
> 		c = getc(in);
> 		if (c == EOF)
> 			break;
> 		buf[len++] = c;
> 		if (c == '\n')
> 			break;
> 		else if (len == size)
> 			c = 0;
> 	}
> 	if (c == '\n' && len > 1 && buf[len - 2] == '\r')
> 		buf[--len - 1] = '\n';
> 	buf[len] = '\0';
> 
> 	return len;
> }


I don't see how this solves the problem.  Still if the buffer is filled,
and the last character read is '\r', and the next character that has not
yet been read is '\n', then the '\r' will erroneously be returned in buf.

Plus, I think you'll have a problem at buf[len] = '\0' if the loop runs to
completion and len == size.

-brandon

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH 0/4] Re: Please make git-am handle \r\n-damaged patches
  2009-08-04 22:22       ` Junio C Hamano
@ 2009-08-05  3:31         ` Brandon Casey
  2009-08-05  3:31           ` [PATCH 1/4] strbuf: add new function strbuf_getwholeline() Brandon Casey
  0 siblings, 1 reply; 33+ messages in thread
From: Brandon Casey @ 2009-08-05  3:31 UTC (permalink / raw)
  To: gitster; +Cc: nanako3, raa.lkml, hpa, git, Brandon Casey

From: Brandon Casey <drafnel@gmail.com>

Convert mailinfo and mailsplit to use strbufs before attempting to address
the CRLF issue for saved emails.

Brandon Casey (3):
  strbuf: add new function strbuf_getwholeline()
  builtin-mailinfo,builtin-mailsplit: use strbufs
  builtin-mailsplit.c: remove read_line_with_nul() since it is no
    longer used

Junio C Hamano (1):
  Allow mailsplit to handle mails with CRLF line-endings

 builtin-mailinfo.c  |    8 +-------
 builtin-mailsplit.c |   44 ++++++++++++++------------------------------
 builtin.h           |    1 -
 git-am.sh           |    8 +++++++-
 strbuf.c            |   15 ++++++++++++---
 strbuf.h            |    1 +
 t/t3400-rebase.sh   |   26 ++++++++++++++++++++++++--
 7 files changed, 59 insertions(+), 44 deletions(-)

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH 1/4] strbuf: add new function strbuf_getwholeline()
  2009-08-05  3:31         ` [PATCH 0/4] " Brandon Casey
@ 2009-08-05  3:31           ` Brandon Casey
  2009-08-05  3:31             ` [PATCH 2/4] builtin-mailinfo,builtin-mailsplit: use strbufs Brandon Casey
  0 siblings, 1 reply; 33+ messages in thread
From: Brandon Casey @ 2009-08-05  3:31 UTC (permalink / raw)
  To: gitster; +Cc: nanako3, raa.lkml, hpa, git, Brandon Casey

From: Brandon Casey <drafnel@gmail.com>

This function is just like strbuf_getline() except it retains the
line-termination character.  This function will be used by the mailinfo
and mailsplit builtins which require the entire line for parsing.

Signed-off-by: Brandon Casey <drafnel@gmail.com>
---
 strbuf.c |   15 ++++++++++++---
 strbuf.h |    1 +
 2 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/strbuf.c b/strbuf.c
index f03d117..a6153dc 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -322,7 +322,7 @@ int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint)
 	return -1;
 }
 
-int strbuf_getline(struct strbuf *sb, FILE *fp, int term)
+int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int term)
 {
 	int ch;
 
@@ -332,10 +332,10 @@ int strbuf_getline(struct strbuf *sb, FILE *fp, int term)
 
 	strbuf_reset(sb);
 	while ((ch = fgetc(fp)) != EOF) {
-		if (ch == term)
-			break;
 		strbuf_grow(sb, 1);
 		sb->buf[sb->len++] = ch;
+		if (ch == term)
+			break;
 	}
 	if (ch == EOF && sb->len == 0)
 		return EOF;
@@ -344,6 +344,15 @@ int strbuf_getline(struct strbuf *sb, FILE *fp, int term)
 	return 0;
 }
 
+int strbuf_getline(struct strbuf *sb, FILE *fp, int term)
+{
+	if (strbuf_getwholeline(sb, fp, term))
+		return EOF;
+	if (sb->buf[sb->len-1] == term)
+		strbuf_setlen(sb, sb->len-1);
+	return 0;
+}
+
 int strbuf_read_file(struct strbuf *sb, const char *path, size_t hint)
 {
 	int fd, len;
diff --git a/strbuf.h b/strbuf.h
index eaa8704..d05e056 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -126,6 +126,7 @@ extern ssize_t strbuf_read(struct strbuf *, int fd, size_t hint);
 extern int strbuf_read_file(struct strbuf *sb, const char *path, size_t hint);
 extern int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint);
 
+extern int strbuf_getwholeline(struct strbuf *, FILE *, int);
 extern int strbuf_getline(struct strbuf *, FILE *, int);
 
 extern void stripspace(struct strbuf *buf, int skip_comments);
-- 
1.6.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 2/4] builtin-mailinfo,builtin-mailsplit: use strbufs
  2009-08-05  3:31           ` [PATCH 1/4] strbuf: add new function strbuf_getwholeline() Brandon Casey
@ 2009-08-05  3:31             ` Brandon Casey
  2009-08-05  3:31               ` [PATCH 3/4] builtin-mailsplit.c: remove read_line_with_nul() since it is no longer used Brandon Casey
  0 siblings, 1 reply; 33+ messages in thread
From: Brandon Casey @ 2009-08-05  3:31 UTC (permalink / raw)
  To: gitster; +Cc: nanako3, raa.lkml, hpa, git, Brandon Casey

From: Brandon Casey <drafnel@gmail.com>

There should be no functional change.  Just the necessary changes and
simplifications associated with calling strbuf_getwholeline() rather
than an internal function or fgets.

Signed-off-by: Brandon Casey <drafnel@gmail.com>
---
 builtin-mailinfo.c  |    8 +-------
 builtin-mailsplit.c |   20 ++++++++------------
 2 files changed, 9 insertions(+), 19 deletions(-)

diff --git a/builtin-mailinfo.c b/builtin-mailinfo.c
index 92637ac..b0b5d8f 100644
--- a/builtin-mailinfo.c
+++ b/builtin-mailinfo.c
@@ -765,7 +765,6 @@ static void handle_filter(struct strbuf *line)
 
 static void handle_body(void)
 {
-	int len = 0;
 	struct strbuf prev = STRBUF_INIT;
 
 	/* Skip up to the first boundary */
@@ -775,8 +774,6 @@ static void handle_body(void)
 	}
 
 	do {
-		strbuf_setlen(&line, line.len + len);
-
 		/* process any boundary lines */
 		if (*content_top && is_multipart_boundary(&line)) {
 			/* flush any leftover */
@@ -832,10 +829,7 @@ static void handle_body(void)
 			handle_filter(&line);
 		}
 
-		strbuf_reset(&line);
-		if (strbuf_avail(&line) < 100)
-			strbuf_grow(&line, 100);
-	} while ((len = read_line_with_nul(line.buf, strbuf_avail(&line), fin)));
+	} while (!strbuf_getwholeline(&line, fin, '\n'));
 
 handle_body_out:
 	strbuf_release(&prev);
diff --git a/builtin-mailsplit.c b/builtin-mailsplit.c
index ad5f6b5..c2ec6ea 100644
--- a/builtin-mailsplit.c
+++ b/builtin-mailsplit.c
@@ -7,6 +7,7 @@
 #include "cache.h"
 #include "builtin.h"
 #include "string-list.h"
+#include "strbuf.h"
 
 static const char git_mailsplit_usage[] =
 "git mailsplit [-d<prec>] [-f<n>] [-b] -o<directory> [<mbox>|<Maildir>...]";
@@ -42,8 +43,7 @@ static int is_from_line(const char *line, int len)
 	return 1;
 }
 
-/* Could be as small as 64, enough to hold a Unix "From " line. */
-static char buf[4096];
+static struct strbuf buf = STRBUF_INIT;
 
 /* We cannot use fgets() because our lines can contain NULs */
 int read_line_with_nul(char *buf, int size, FILE *in)
@@ -71,10 +71,9 @@ int read_line_with_nul(char *buf, int size, FILE *in)
 static int split_one(FILE *mbox, const char *name, int allow_bare)
 {
 	FILE *output = NULL;
-	int len = strlen(buf);
 	int fd;
 	int status = 0;
-	int is_bare = !is_from_line(buf, len);
+	int is_bare = !is_from_line(buf.buf, buf.len);
 
 	if (is_bare && !allow_bare)
 		goto corrupt;
@@ -88,20 +87,17 @@ static int split_one(FILE *mbox, const char *name, int allow_bare)
 	 * "From " and having something that looks like a date format.
 	 */
 	for (;;) {
-		int is_partial = len && buf[len-1] != '\n';
-
-		if (fwrite(buf, 1, len, output) != len)
+		if (fwrite(buf.buf, 1, buf.len, output) != buf.len)
 			die_errno("cannot write output");
 
-		len = read_line_with_nul(buf, sizeof(buf), mbox);
-		if (len == 0) {
+		if (strbuf_getwholeline(&buf, mbox, '\n')) {
 			if (feof(mbox)) {
 				status = 1;
 				break;
 			}
 			die_errno("cannot read mbox");
 		}
-		if (!is_partial && !is_bare && is_from_line(buf, len))
+		if (!is_bare && is_from_line(buf.buf, buf.len))
 			break; /* done with one message */
 	}
 	fclose(output);
@@ -166,7 +162,7 @@ static int split_maildir(const char *maildir, const char *dir,
 			goto out;
 		}
 
-		if (fgets(buf, sizeof(buf), f) == NULL) {
+		if (strbuf_getwholeline(&buf, f, '\n')) {
 			error("cannot read mail %s (%s)", file, strerror(errno));
 			goto out;
 		}
@@ -203,7 +199,7 @@ static int split_mbox(const char *file, const char *dir, int allow_bare,
 	} while (isspace(peek));
 	ungetc(peek, f);
 
-	if (fgets(buf, sizeof(buf), f) == NULL) {
+	if (strbuf_getwholeline(&buf, f, '\n')) {
 		/* empty stdin is OK */
 		if (f != stdin) {
 			error("cannot read mbox %s", file);
-- 
1.6.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 3/4] builtin-mailsplit.c: remove read_line_with_nul() since it is no longer used
  2009-08-05  3:31             ` [PATCH 2/4] builtin-mailinfo,builtin-mailsplit: use strbufs Brandon Casey
@ 2009-08-05  3:31               ` Brandon Casey
  2009-08-05  3:31                 ` [PATCH 4/4] Allow mailsplit (and hence git-am) to handle mails with CRLF line-endings Brandon Casey
  0 siblings, 1 reply; 33+ messages in thread
From: Brandon Casey @ 2009-08-05  3:31 UTC (permalink / raw)
  To: gitster; +Cc: nanako3, raa.lkml, hpa, git, Brandon Casey

From: Brandon Casey <drafnel@gmail.com>


Signed-off-by: Brandon Casey <drafnel@gmail.com>
---


Squash this with the previous one if you like.

-brandon


 builtin-mailsplit.c |   18 ------------------
 builtin.h           |    1 -
 2 files changed, 0 insertions(+), 19 deletions(-)

diff --git a/builtin-mailsplit.c b/builtin-mailsplit.c
index c2ec6ea..d288fde 100644
--- a/builtin-mailsplit.c
+++ b/builtin-mailsplit.c
@@ -45,24 +45,6 @@ static int is_from_line(const char *line, int len)
 
 static struct strbuf buf = STRBUF_INIT;
 
-/* We cannot use fgets() because our lines can contain NULs */
-int read_line_with_nul(char *buf, int size, FILE *in)
-{
-	int len = 0, c;
-
-	for (;;) {
-		c = getc(in);
-		if (c == EOF)
-			break;
-		buf[len++] = c;
-		if (c == '\n' || len + 1 >= size)
-			break;
-	}
-	buf[len] = '\0';
-
-	return len;
-}
-
 /* Called with the first line (potentially partial)
  * already in buf[] -- normally that should begin with
  * the Unix "From " line.  Write it into the specified
diff --git a/builtin.h b/builtin.h
index 20427d2..9993dfc 100644
--- a/builtin.h
+++ b/builtin.h
@@ -13,7 +13,6 @@ extern const char git_more_info_string[];
 extern void list_common_cmds_help(void);
 extern const char *help_unknown_cmd(const char *cmd);
 extern void prune_packed_objects(int);
-extern int read_line_with_nul(char *buf, int size, FILE *file);
 extern int fmt_merge_msg(int merge_summary, struct strbuf *in,
 	struct strbuf *out);
 extern int commit_tree(const char *msg, unsigned char *tree,
-- 
1.6.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 4/4] Allow mailsplit (and hence git-am) to handle mails with CRLF line-endings
  2009-08-05  3:31               ` [PATCH 3/4] builtin-mailsplit.c: remove read_line_with_nul() since it is no longer used Brandon Casey
@ 2009-08-05  3:31                 ` Brandon Casey
  2009-08-05  5:49                   ` Alex Riesen
  0 siblings, 1 reply; 33+ messages in thread
From: Brandon Casey @ 2009-08-05  3:31 UTC (permalink / raw)
  To: gitster; +Cc: nanako3, raa.lkml, hpa, git, Brandon Casey

From: Junio C Hamano <gitster@pobox.com>

It is not that uncommon to have mails with DOS line-ending, notably
Thunderbird and web mailers like Gmail (when saving what they call
"original" message).  So modify mailsplit to convert CRLF line-endings to
just LF.

Since git-rebase is built on top of git-am, add an option to mailsplit to
be used by git-am when it is acting on behalf of git-rebase, to refrain
from doing this conversion.

And add a test to make sure that rebase still works.

Signed-off-by: Brandon Casey <drafnel@gmail.com>
---
 builtin-mailsplit.c |    6 ++++++
 git-am.sh           |    8 +++++++-
 t/t3400-rebase.sh   |   26 ++++++++++++++++++++++++--
 3 files changed, 37 insertions(+), 3 deletions(-)

diff --git a/builtin-mailsplit.c b/builtin-mailsplit.c
index d288fde..c9c85c3 100644
--- a/builtin-mailsplit.c
+++ b/builtin-mailsplit.c
@@ -44,6 +44,7 @@ static int is_from_line(const char *line, int len)
 }
 
 static struct strbuf buf = STRBUF_INIT;
+static int keep_cr;
 
 /* Called with the first line (potentially partial)
  * already in buf[] -- normally that should begin with
@@ -69,6 +70,12 @@ static int split_one(FILE *mbox, const char *name, int allow_bare)
 	 * "From " and having something that looks like a date format.
 	 */
 	for (;;) {
+		if (!keep_cr && buf.len > 1 && buf.buf[buf.len-1] == '\n' &&
+			buf.buf[buf.len-2] == '\r') {
+			strbuf_setlen(&buf, buf.len-2);
+			strbuf_addch(&buf, '\n');
+		}
+
 		if (fwrite(buf.buf, 1, buf.len, output) != buf.len)
 			die_errno("cannot write output");
 
@@ -226,6 +233,8 @@ int cmd_mailsplit(int argc, const char **argv, const char *prefix)
 			nr = strtol(arg+2, NULL, 10);
 		} else if ( arg[1] == 'b' && !arg[2] ) {
 			allow_bare = 1;
+		} else if (!strcmp(arg, "--keep-cr")) {
+			keep_cr = 1;
 		} else if ( arg[1] == 'o' && arg[2] ) {
 			dir = arg+2;
 		} else if ( arg[1] == '-' && !arg[2] ) {
diff --git a/git-am.sh b/git-am.sh
index d64d997..985226b 100755
--- a/git-am.sh
+++ b/git-am.sh
@@ -197,7 +197,13 @@ check_patch_format () {
 split_patches () {
 	case "$patch_format" in
 	mbox)
-		git mailsplit -d"$prec" -o"$dotest" -b -- "$@" > "$dotest/last" ||
+		case "$rebasing" in
+		'')
+			keep_cr= ;;
+		?*)
+			keep_cr=--keep-cr ;;
+		esac
+		git mailsplit -d"$prec" -o"$dotest" -b $keep_cr -- "$@" > "$dotest/last" ||
 		clean_abort
 		;;
 	stgit-series)
diff --git a/t/t3400-rebase.sh b/t/t3400-rebase.sh
index c5c29cc..4e6a44b 100755
--- a/t/t3400-rebase.sh
+++ b/t/t3400-rebase.sh
@@ -3,9 +3,10 @@
 # Copyright (c) 2005 Amos Waterland
 #
 
-test_description='git rebase should not destroy author information
+test_description='git rebase assorted tests
 
-This test runs git rebase and checks that the author information is not lost.
+This test runs git rebase and checks that the author information is not lost
+among other things.
 '
 . ./test-lib.sh
 
@@ -133,4 +134,25 @@ test_expect_success 'rebase -q is quiet' '
      test ! -s output.out
 '
 
+q_to_cr () {
+	tr Q '\015'
+}
+
+test_expect_success 'Rebase a commit that sprinkles CRs in' '
+	(
+		echo "One"
+		echo "TwoQ"
+		echo "Three"
+		echo "FQur"
+		echo "Five"
+	) | q_to_cr >CR &&
+	git add CR &&
+	test_tick &&
+	git commit -a -m "A file with a line with CR" &&
+	git tag file-with-cr &&
+	git checkout HEAD^0 &&
+	git rebase --onto HEAD^^ HEAD^ &&
+	git diff --exit-code file-with-cr:CR HEAD:CR
+'
+
 test_done
-- 
1.6.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH 4/4] Allow mailsplit (and hence git-am) to handle mails  with CRLF line-endings
  2009-08-05  3:31                 ` [PATCH 4/4] Allow mailsplit (and hence git-am) to handle mails with CRLF line-endings Brandon Casey
@ 2009-08-05  5:49                   ` Alex Riesen
  0 siblings, 0 replies; 33+ messages in thread
From: Alex Riesen @ 2009-08-05  5:49 UTC (permalink / raw)
  To: Brandon Casey; +Cc: gitster, nanako3, hpa, git, Brandon Casey

On Wed, Aug 5, 2009 at 05:31, Brandon Casey<casey@nrlssc.navy.mil> wrote:
> @@ -69,6 +70,12 @@ static int split_one(FILE *mbox, const char *name, int allow_bare)
>         * "From " and having something that looks like a date format.
>         */
>        for (;;) {
> +               if (!keep_cr && buf.len > 1 && buf.buf[buf.len-1] == '\n' &&
> +                       buf.buf[buf.len-2] == '\r') {
> +                       strbuf_setlen(&buf, buf.len-2);
> +                       strbuf_addch(&buf, '\n');
> +               }
> +

That's much better then any on-the-fly corrections :)

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Please make git-am handle \r\n-damaged patches
  2009-08-04 20:59     ` Please make git-am handle \r\n-damaged patches Erik Faye-Lund
  2009-08-04 21:05       ` Alex Riesen
@ 2009-08-05 23:10       ` Tony Finch
  1 sibling, 0 replies; 33+ messages in thread
From: Tony Finch @ 2009-08-05 23:10 UTC (permalink / raw)
  To: Erik Faye-Lund
  Cc: Junio C Hamano, Alex Riesen, H. Peter Anvin, Git Mailing List

On Tue, 4 Aug 2009, Erik Faye-Lund wrote:
>
> RFC 2822, section 2.3 explicitly states that a CR should not occur
> without a LF (and vice versa, but the e-mail client might convert CRLF
> to LF when saving to file), so I think this should be safe.

The rare BINARYMIME extension relaxes this requirement.

Tony.
-- 
f.anthony.n.finch  <dot@dotat.at>  http://dotat.at/
GERMAN BIGHT HUMBER: SOUTHWEST 5 TO 7. MODERATE OR ROUGH. SQUALLY SHOWERS.
MODERATE OR GOOD.

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2009-08-05 23:11 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-08-03 19:08 Please make git-am handle \r\n-damaged patches H. Peter Anvin
2009-08-03 21:30 ` Junio C Hamano
2009-08-03 21:56   ` Wesley J. Landaker
2009-08-03 22:56     ` H. Peter Anvin
2009-08-03 22:13   ` H. Peter Anvin
2009-08-03 22:21     ` Sverre Rabbelier
2009-08-03 22:31       ` H. Peter Anvin
2009-08-04  6:21       ` Paolo Bonzini
2009-08-04  6:35 ` Alex Riesen
2009-08-04  6:46   ` Junio C Hamano
2009-08-04 17:26     ` [PATCH] Let mailsplit and mailinfo handle mails with CRLF line-endings Alex Riesen
2009-08-04 17:34       ` Sverre Rabbelier
2009-08-04 18:50         ` Brandon Casey
2009-08-04 20:57         ` Alex Riesen
2009-08-04 21:03       ` [PATCH v2] " Alex Riesen
2009-08-04 21:29         ` Brandon Casey
2009-08-04 21:40           ` Brandon Casey
2009-08-04 21:58           ` Alex Riesen
2009-08-04 22:03             ` Alex Riesen
2009-08-04 22:04               ` Alex Riesen
2009-08-04 22:23             ` Brandon Casey
2009-08-04 20:59     ` Please make git-am handle \r\n-damaged patches Erik Faye-Lund
2009-08-04 21:05       ` Alex Riesen
2009-08-04 21:16         ` Erik Faye-Lund
2009-08-05 23:10       ` Tony Finch
2009-08-04 21:15     ` Nanako Shiraishi
2009-08-04 22:22       ` Junio C Hamano
2009-08-05  3:31         ` [PATCH 0/4] " Brandon Casey
2009-08-05  3:31           ` [PATCH 1/4] strbuf: add new function strbuf_getwholeline() Brandon Casey
2009-08-05  3:31             ` [PATCH 2/4] builtin-mailinfo,builtin-mailsplit: use strbufs Brandon Casey
2009-08-05  3:31               ` [PATCH 3/4] builtin-mailsplit.c: remove read_line_with_nul() since it is no longer used Brandon Casey
2009-08-05  3:31                 ` [PATCH 4/4] Allow mailsplit (and hence git-am) to handle mails with CRLF line-endings Brandon Casey
2009-08-05  5:49                   ` Alex Riesen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.