* Problem in Patches with commit-messages containing non-ascii
From: Martin Krüger @ 2010-12-03 11:19 UTC
  To: git

Hello,

I stumbled over a problem with how git handles patches.
Perhaps I am the only developer who has this problem, because I am the
only developer writing commit messages in German.

Consider this log message:
commit ea2cd63dfe9b3ac3581b6cff8b13a52e69066242
Author: martin <martin@chad.upnx.de>
Date:   Fri Nov 19 18:58:58 2010 +0100

    Methoden überall angepasst.
    Ausser Aussnahmen

Using format-patch, the result is:

From ea2cd63dfe9b3ac3581b6cff8b13a52e69066242 Mon Sep 17 00:00:00 2001
From: martin <martin@chad.upnx.de>
Date: Fri, 19 Nov 2010 18:58:58 +0100
Subject: [PATCH] =?UTF-8?q?Methoden=20=C3=BCberall=20angepasst.
=20Ausser=20Aussnahmen?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The content of the Subject field is split across two lines.
The blank at the start of the second line indicates header folding
according to RFC 2822.
After the folding, the string is encoded according to RFC 2047 because it
contains non-ASCII characters. The blank indicating the folding is encoded
too, as =20.
That is a problem, because unfolding according to RFC 2822 can no longer
detect the fold. RFC 2822 suggests that unfolding must be done before any
further processing of the header, which includes the RFC 2047 decoding.
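
To illustrate why the encoded blank defeats unfolding, here is a minimal
sketch in C of RFC 2822 style unfolding. It is not git's code;
unfold_header is a hypothetical helper written only for this example. It
removes a line break only when the next character is a space or tab, so a
line break followed by "=20" is not recognized as a fold.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Unfold a header value in place: a line break (LF or CRLF) that is
 * immediately followed by a space or tab is removed, and the whitespace
 * itself is kept. A line break followed by anything else is left alone,
 * because it does not mark a fold.
 * Returns the number of folds that were removed.
 */
size_t unfold_header(char *s)
{
	size_t folds = 0;
	char *out = s;

	assert(s != NULL);
	while (*s) {
		if (s[0] == '\r' && s[1] == '\n' &&
		    (s[2] == ' ' || s[2] == '\t')) {
			s += 2;		/* drop CRLF, keep the whitespace */
			folds++;
		} else if (s[0] == '\n' && (s[1] == ' ' || s[1] == '\t')) {
			s += 1;		/* drop LF, keep the whitespace */
			folds++;
		} else {
			*out++ = *s++;
		}
	}
	*out = '\0';
	return folds;
}
```

Run on the Subject value from the format-patch output above, this finds no
fold at all, because the continuation line starts with "=" rather than a
blank.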

Applying this patch leads to this commit message:

commit 3949e57e4773e85e6c55482b68ade7c409426b3c
Author: martin <martin@chad.upnx.de>
Date:   Fri Nov 19 18:58:58 2010 +0100

    =?UTF-8?q?Methoden=20=C3=BCberall=20angepasst.

    =20Ausser=20Aussnahmen?=
    MIME-Version: 1.0
    Content-Type: text/plain; charset=UTF-8
    Content-Transfer-Encoding: 8bit

The solution is to make an exception and not encode blanks that indicate
a folding.
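
One way to sketch this idea in C, independent of pretty.c: close the
encoded-word at each line break, emit the fold as a raw newline plus blank,
and open a new encoded-word. The names encode_subject_lines, needs_q_escape
and put are made up for this example, the set of escaped bytes is a
simplification, and the 75-character limit that RFC 2047 places on an
encoded-word is not enforced. The input is assumed to have no trailing
newline.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Simplified rule for which bytes get =XX quoting inside a Q-encoded word. */
static int needs_q_escape(unsigned char ch)
{
	return ch >= 0x80 || ch < 0x20 || ch == ' ' ||
	       ch == '=' || ch == '?' || ch == '_';
}

/* Append s to out; returns 0 on success, -1 if it would not fit. */
static int put(char *out, size_t outsz, size_t *used, const char *s)
{
	size_t n = strlen(s);

	if (*used + n + 1 > outsz)
		return -1;
	memcpy(out + *used, s, n + 1);
	*used += n;
	return 0;
}

/*
 * Q-encode text as one encoded-word per input line. The words are joined
 * with "\n " so the folding whitespace stays outside the encoded text.
 * Returns 0 on success, -1 if out is too small.
 */
int encode_subject_lines(const char *text, const char *charset,
			 char *out, size_t outsz)
{
	size_t used = 0;

	assert(text && charset && out);
	for (;;) {
		size_t len = strcspn(text, "\n");
		size_t i;

		if (put(out, outsz, &used, "=?") ||
		    put(out, outsz, &used, charset) ||
		    put(out, outsz, &used, "?q?"))
			return -1;
		for (i = 0; i < len; i++) {
			unsigned char ch = (unsigned char)text[i];
			char buf[4];

			if (needs_q_escape(ch))
				snprintf(buf, sizeof(buf), "=%02X", (unsigned)ch);
			else
				snprintf(buf, sizeof(buf), "%c", ch);
			if (put(out, outsz, &used, buf))
				return -1;
		}
		if (text[len] == '\0')
			return put(out, outsz, &used, "?=");
		if (put(out, outsz, &used, "?=\n "))
			return -1;
		text += len + 1;	/* continue after the line break */
	}
}
```

Because the fold now sits between two complete encoded-words, a reader that
unfolds first and decodes afterwards sees the blank again.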

I wrote this patch:

diff --git a/pretty.c b/pretty.c
index f85444b..8a78a4e 100644
--- a/pretty.c
+++ b/pretty.c
@@ -216,7 +216,7 @@ static int is_rfc2047_special(char ch)
 static void add_rfc2047(struct strbuf *sb, const char *line, int len,
 		       const char *encoding)
 {
-	int i, last;
+	int i, last, num_foldings;

 	for (i = 0; i < len; i++) {
 		int ch = line[i];
@@ -229,8 +229,14 @@ static void add_rfc2047(struct strbuf *sb, const char *line, int len,
 	return;

 needquote:
-	strbuf_grow(sb, len * 3 + strlen(encoding) + 100);
+	num_foldings = 0;
+	for (i = 1; i < len; i++)
+		if (line[i] == ' ' && line[i - 1] == '\n')
+			num_foldings++;
+
+	strbuf_grow(sb, len * 3 + num_foldings * (7 + strlen(encoding)) + 100);
 	strbuf_addf(sb, "=?%s?q?", encoding);
+	unsigned last_ch = 0;
 	for (i = last = 0; i < len; i++) {
 		unsigned ch = line[i] & 0xFF;
 		/*
@@ -240,10 +246,19 @@ needquote:
 		 * leave the underscore in place.
 		 */
 		if (is_rfc2047_special(ch) || ch == ' ') {
-			strbuf_add(sb, line + last, i - last);
-			strbuf_addf(sb, "=%02X", ch);
-			last = i + 1;
+			if (!(ch == ' ' && last_ch == '\n')) {
+				strbuf_add(sb, line + last, i - last);
+				strbuf_addf(sb, "=%02X", ch);
+			}
+			else {
+				if (i > last + 1)
+					strbuf_add(sb, line + last, i - last - 1);
+				strbuf_addstr(sb, "?=\n ");
+				strbuf_addf(sb, "=?%s?q?", encoding);
+			}
+			last = i + 1;
 		}
+		last_ch = ch;
 	}
 	strbuf_add(sb, line + last, len - last);
 	strbuf_addstr(sb, "?=");



Then git generates this patch:

From ea2cd63dfe9b3ac3581b6cff8b13a52e69066242 Mon Sep 17 00:00:00 2001
From: martin <martin@chad.upnx.de>
Date: Fri, 19 Nov 2010 18:58:58 +0100
Subject: [PATCH] =?UTF-8?q?Methoden=20=C3=BCberall=20angepasst.?=
 =?UTF-8?q?Ausser=20Aussnahmen?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Applying it leads to a correct commit message:

commit 62d06e3415ec0726dbd58c11ed93771502b77805
Author: martin <martin@chad.upnx.de>
Date:   Fri Nov 19 18:58:58 2010 +0100

    Methoden überall angepasst.Ausser Aussnahmen


Best regards
   martin

* Re: Problem in Patches with commit-messages containing non-ascii
From: Jan Krüger @ 2010-12-03 12:59 UTC
  To: Martin Krüger; +Cc: git

--- Martin Krüger <martin.krueger@gmx.com> wrote:

> Consider this log message:
> commit ea2cd63dfe9b3ac3581b6cff8b13a52e69066242
> Author: martin <martin@chad.upnx.de>
> Date:   Fri Nov 19 18:58:58 2010 +0100
> 
>     Methoden überall angepasst.
>     Ausser Aussnahmen
> 

FWIW, support for multi-line summaries is very limited. Several
tools assume that the log message has this format:

<Summary in one line>
<Blank line>
<Details>

So one could argue that your patch fixes something that isn't really
supported anyway.
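
As a rough illustration of that assumption (not taken from any particular
tool), a check in C for whether the first paragraph of a message is a
single line could look like this; summary_is_single_line is a hypothetical
name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if the first paragraph of msg consists of exactly one line,
 * that is, the first line break is either absent, ends the message, or is
 * followed directly by a blank line. Return 0 otherwise.
 */
int summary_is_single_line(const char *msg)
{
	const char *nl;

	assert(msg != NULL);
	nl = strchr(msg, '\n');
	return !nl || nl[1] == '\n' || nl[1] == '\0';
}
```

A message whose second line follows the first without a blank line, like
the one quoted above, fails this check.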

> [...]
> 
> Applying it leads to a correct commit message:
> 
> commit 62d06e3415ec0726dbd58c11ed93771502b77805
> Author: martin <martin@chad.upnx.de>
> Date:   Fri Nov 19 18:58:58 2010 +0100
> 
>     Methoden überall angepasst.Ausser Aussnahmen

How is that correct? It's different from the original commit message.

-Jan

* Re: Problem in Patches with commit-messages containing non-ascii
From: Michael J Gruber @ 2010-12-03 13:08 UTC
  To: Jan Krüger; +Cc: Martin Krüger, git

Jan Krüger venit, vidit, dixit 03.12.2010 13:59:
> --- Martin Krüger <martin.krueger@gmx.com> wrote:
> 
>> Consider this log message:
>> commit ea2cd63dfe9b3ac3581b6cff8b13a52e69066242
>> Author: martin <martin@chad.upnx.de>
>> Date:   Fri Nov 19 18:58:58 2010 +0100
>>
>>     Methoden überall angepasst.
>>     Ausser Aussnahmen
>>
> 
> FWIW, support for multi-line summaries is very limited. Several
> tools assume that the log message has this format:
> 
> <Summary in one line>
> <Blank line>
> <Details>
> 
> So one could argue that your patch fixes something that isn't really
> supported anyway.
> 
>> [...]
>>
>> Applying it leads to a correct commit message:
>>
>> commit 62d06e3415ec0726dbd58c11ed93771502b77805
>> Author: martin <martin@chad.upnx.de>
>> Date:   Fri Nov 19 18:58:58 2010 +0100
>>
>>     Methoden überall angepasst.Ausser Aussnahmen
> 
> How is that correct? It's different from the original commit message.
> 
> -Jan

Also, it is "Außer Ausnahmen" even after the latest spelling reform ;)

Michael

* Re: Problem in Patches with commit-messages containing non-ascii
From: Andreas Schwab @ 2010-12-03 19:03 UTC
  To: Martin Krüger; +Cc: git

Martin Krüger <martin.krueger@gmx.com> writes:

> Then git generates this patch:
>
> From ea2cd63dfe9b3ac3581b6cff8b13a52e69066242 Mon Sep 17 00:00:00 2001
> From: martin <martin@chad.upnx.de>
> Date: Fri, 19 Nov 2010 18:58:58 +0100
> Subject: [PATCH] =?UTF-8?q?Methoden=20=C3=BCberall=20angepasst.?=
>  =?UTF-8?q?Ausser=20Aussnahmen?=
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> Applying it leads to a correct commit message:
>
> commit 62d06e3415ec0726dbd58c11ed93771502b77805
> Author: martin <martin@chad.upnx.de>
> Date:   Fri Nov 19 18:58:58 2010 +0100
>
>     Methoden überall angepasst.Ausser Aussnahmen

That's at least missing a space after the period.
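
The missing blank is consistent with how RFC 2047 decoding treats adjacent
encoded-words: whitespace that only separates two encoded-words is dropped
when the header is displayed. A minimal decoder sketch in C shows the
effect. It handles only the Q encoding, does no charset conversion, and
expects an already unfolded header; decode_q_word and decode_header are
hypothetical names, not git functions.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * If p starts with an encoded-word of the form =?charset?q?text?=, append
 * the decoded text to out and return a pointer just past the word.
 * Otherwise return NULL without writing anything.
 */
static const char *decode_q_word(const char *p, char *out, size_t *used)
{
	const char *text, *end;

	if (strncmp(p, "=?", 2))
		return NULL;
	text = strchr(p + 2, '?');	/* end of the charset name */
	if (!text || (text[1] != 'q' && text[1] != 'Q') || text[2] != '?')
		return NULL;
	text += 3;
	end = strstr(text, "?=");
	if (!end)
		return NULL;
	for (; text < end; text++) {
		if (*text == '=' && end - text >= 3 &&
		    isxdigit((unsigned char)text[1]) &&
		    isxdigit((unsigned char)text[2])) {
			char hex[3] = { text[1], text[2], '\0' };

			out[(*used)++] = (char)strtol(hex, NULL, 16);
			text += 2;
		} else if (*text == '_') {
			out[(*used)++] = ' ';
		} else {
			out[(*used)++] = *text;
		}
	}
	return end + 2;
}

/* Decode an unfolded header; out needs room for strlen(in) + 1 bytes. */
void decode_header(const char *in, char *out)
{
	size_t used = 0;
	int prev_was_word = 0;

	assert(in && out);
	while (*in) {
		const char *next = decode_q_word(in, out, &used);

		if (!next && prev_was_word && (*in == ' ' || *in == '\t'))
			/* whitespace between two encoded-words is ignored */
			next = decode_q_word(in + strspn(in, " \t"), out, &used);
		if (next) {
			in = next;
			prev_was_word = 1;
			continue;
		}
		out[used++] = *in++;
		prev_was_word = 0;
	}
	out[used] = '\0';
}
```

To keep the blank, an encoder would have to carry it inside one of the
encoded-words, for example as =20 at the start of the second word, rather
than rely on the folding whitespace between them.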

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."
