All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: tboegi@web.de
Cc: j6t@kdbg.org, lars.schneider@autodesk.com, git@vger.kernel.org,
	gitster@pobox.com, patrick@luehne.de, larsxschneider@gmail.com
Subject: Re: [PATCH/RFC 1/1] Auto diff of UTF-16 files in UTF-8
Date: Tue, 27 Feb 2018 17:39:29 -0500	[thread overview]
Message-ID: <20180227223929.GA3484@sigill.intra.peff.net> (raw)
In-Reply-To: <20180226172706.7822-1-tboegi@web.de>

On Mon, Feb 26, 2018 at 06:27:06PM +0100, tboegi@web.de wrote:

> @@ -3611,8 +3615,25 @@ int diff_populate_filespec(struct diff_filespec *s, unsigned int flags)
>  			s->size = size;
>  			s->should_free = 1;
>  		}
> -	}
> -	else {
> +		if (!s->binary && buffer_is_binary(s->data, s->size) &&
> +		    buffer_has_utf16_bom(s->data, s->size)) {
> +			int outsz = 0;
> +			char *outbuf;
> +			outbuf = reencode_string_len(s->data, (int)s->size,
> +						     "UTF-8", "UTF-16", &outsz);
> +			if (outbuf) {
> +				if (s->should_free)
> +					free(s->data);
> +				if (s->should_munmap)
> +					munmap(s->data, s->size);
> +				s->should_munmap = 0;
> +				s->data = outbuf;
> +				s->size = outsz;
> +				s->reencoded_from_utf16 = 1;
> +				s->should_free = 1;
> +			}
> +		}
> +	} else {

I don't think it makes sense to do the conversion deep inside
diff_populate_filespec(), because it will kick in much more than you'd
want (e.g., "format-patch | am" would stop working with this patch, I
think).

I think you'd want to hook this in at the same level as fill_textconv().
In fact, one way to do it would be to have the get_textconv() stack just
fill in a special driver that does the auto-detection. This is similar
to my earlier patch, but it avoids overriding the user-facing config for
non-textconv things (and naturally any actual configured textconv filter
would override the auto-detection).

-Peff

  parent reply	other threads:[~2018-02-27 22:39 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-12-11 15:50 [PATCH v1] convert: add support for 'encoding' attribute lars.schneider
2017-12-11 18:39 ` Eric Sunshine
2017-12-11 23:47   ` Lars Schneider
2017-12-11 23:58     ` Eric Sunshine
2017-12-12 10:58       ` Lars Schneider
2017-12-11 20:47 ` Johannes Sixt
2017-12-11 23:42   ` Lars Schneider
2017-12-12  0:59     ` Junio C Hamano
2017-12-12  7:15       ` Johannes Sixt
2017-12-12 10:55         ` Lars Schneider
2017-12-12 19:31           ` Junio C Hamano
2017-12-13 17:57             ` Lars Schneider
2017-12-13 18:11               ` Junio C Hamano
2017-12-13 23:02                 ` Lars Schneider
2017-12-14 23:01                   ` Junio C Hamano
2017-12-12  7:09     ` Johannes Sixt
2017-12-18 10:13   ` Torsten Bögershausen
2017-12-18 13:12     ` Jeff King
2017-12-23  8:08       ` Torsten Bögershausen
2017-12-29 13:28       ` [PATCH/RFC 0/2] git diff --UTF-8 tboegi
2017-12-29 13:28       ` [PATCH/RFC 1/2] convert_to_git(): checksafe becomes an integer tboegi
2017-12-29 13:28       ` [PATCH/RFC 2/2] git diff: Allow to reencode into UTF-8 tboegi
2018-02-26 17:27       ` [PATCH/RFC 1/1] Auto diff of UTF-16 files in UTF-8 tboegi
2018-02-26 18:43         ` Peter Krefting
2018-02-27 22:39         ` Jeff King [this message]
2017-12-18 18:02     ` [PATCH v1] convert: add support for 'encoding' attribute Junio C Hamano
2017-12-18 21:55     ` Johannes Sixt
2017-12-15  9:58 ` Jeff King
2017-12-18 10:54   ` Lars Schneider
2017-12-18 12:59     ` Jeff King
2017-12-17 17:14 ` Torsten Bögershausen
2017-12-28 16:14   ` Lars Schneider
2017-12-29 12:59     ` Torsten Bögershausen
2017-12-29 13:56       ` Lars Schneider
2018-01-03 19:15       ` Junio C Hamano
2018-01-03 20:45         ` Lars Schneider

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180227223929.GA3484@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=j6t@kdbg.org \
    --cc=lars.schneider@autodesk.com \
    --cc=larsxschneider@gmail.com \
    --cc=patrick@luehne.de \
    --cc=tboegi@web.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.