All of lore.kernel.org
 help / color / mirror / Atom feed
From: "\"Jan H. Schönherr\"" <schnhrr@cs.tu-berlin.de>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, Jeff King <peff@peff.net>
Subject: Re: [PATCH 2/5] format-patch: do not wrap rfc2047 encoded headers too late
Date: Wed, 10 Oct 2012 11:31:59 +0200	[thread overview]
Message-ID: <5075408F.8050502@cs.tu-berlin.de> (raw)
In-Reply-To: <7v7gqzfnpj.fsf@alter.siamese.dyndns.org>

Am 09.10.2012 21:30, schrieb Junio C Hamano:
> Jan H. Schönherr <schnhrr@cs.tu-berlin.de> writes:
...
>>  static int is_rfc2047_special(char ch)
>>  {
>> +	/*
>> +	 * We encode ' ' using '=20' even though rfc2047
>> +	 * allows using '_' for readability.  Unfortunately,
>> +	 * many programs do not understand this and just
>> +	 * leave the underscore in place.
>> +	 */
> 
> The sentence break made me read the above three times to understand
> what it is trying to say.  "Unfortunately" refers to what happens if
> we were to use '_', but it initially appeared to be describing some
> bug due to our encoding ' ' as '=20'.  Perhaps like this?
> 
> 	/*
> 	 * rfc2047 allows '_' to encode ' ' for readability, but
> 	 * many programs do not understand ...; encode ' ' using
> 	 * '=20' instead to avoid the problem.
> 	 */

I was just moving that comment (and the following check) around,
but I'll update the comment in the next version.

>> +	if (ch == ' ' || ch == '\n')
>> +		return 1;
> 
> The comment justifies why this "if (ch == ' ')", which could be part
> of the "return" below, separately is done, but nothing explains why
> you add '\n' (and not other controls, e.g. '\t') to the mix.

The check for '\n' was introduced in commit c22e7de3
("format-patch: rfc2047-encode newlines in headers").

The commit log was:

    These should generally never happen, as we already
    concatenate multiples in subjects into a single line. But
    let's be defensive, since not encoding them means we will
    output malformed headers.

Having again a look at RFC 2047, I see that we should be
even more strict and not allow any non-printable character to
be passed through unencoded. I guess that adds another patch to
the series. Hmm... Maybe I can split patch 4 into two patches,
one that mostly fixes is_rfc2047_special() and one that
avoids 822 quoting when doing 2047 encoding.

> 
>>  	return (non_ascii(ch) || (ch == '=') || (ch == '?') || (ch == '_'));
>>  }
>>  
>>  static void add_rfc2047(struct strbuf *sb, const char *line, int len,
>>  		       const char *encoding)
>>  {
>> -	static const int max_length = 78; /* per rfc2822 */
>> +	static const int max_length = 76; /* per rfc2047 */
>>  	int i;
>>  	int line_len;
>>  
>> @@ -286,7 +295,7 @@ static void add_rfc2047(struct strbuf *sb, const char *line, int len,
>>  		if ((i + 1 < len) && (ch == '=' && line[i+1] == '?'))
>>  			goto needquote;
>>  	}
>> -	strbuf_add_wrapped_bytes(sb, line, len, -line_len, 1, max_length+1);
>> +	strbuf_add_wrapped_bytes(sb, line, len, -line_len, 1, 78+1);
>>  	return;
> 
> Yuck.  If you do want to retain 78 for non-quoted output for
> backward compatibility, that is OK, but if that is the case, please
> introduce a new constant "max_quoted_length" or something to stand
> for 76 and use it in the "needquote:" part below.

Will do.

Regards
Jan

  parent reply	other threads:[~2012-10-10  9:32 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-10-08 17:33 [PATCH 0/5] Cure some format-patch wrapping and encoding issues Jan H. Schönherr
2012-10-08 17:33 ` [PATCH 1/5] format-patch: do not wrap non-rfc2047 headers too early Jan H. Schönherr
2012-10-08 17:33 ` [PATCH 2/5] format-patch: do not wrap rfc2047 encoded headers too late Jan H. Schönherr
     [not found]   ` <7v7gqzfnpj.fsf@alter.siamese.dyndns.org>
2012-10-10  9:31     ` "Jan H. Schönherr" [this message]
2012-10-08 17:33 ` [PATCH 3/5] format-patch: introduce helper function last_line_length() Jan H. Schönherr
2012-10-08 17:33 ` [PATCH 4/5] format-patch: fix rfc2047 address encoding with respect to rfc822 specials Jan H. Schönherr
2012-10-08 17:33 ` [PATCH 5/5] format-patch: tests: check rfc822+rfc2047 in to+cc headers Jan H. Schönherr
     [not found]   ` <7v391nfmzn.fsf@alter.siamese.dyndns.org>
2012-10-10 10:44     ` "Jan H. Schönherr"
2012-10-10 17:02       ` Junio C Hamano
     [not found] ` <7vfw5nfoq9.fsf@alter.siamese.dyndns.org>
2012-10-10 10:49   ` [PATCH 0/5] Cure some format-patch wrapping and encoding issues "Jan H. Schönherr"

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5075408F.8050502@cs.tu-berlin.de \
    --to=schnhrr@cs.tu-berlin.de \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.