git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Torsten Bögershausen" <tboegi@web.de>
To: Jeff King <peff@peff.net>
Cc: Lars Schneider <larsxschneider@gmail.com>,
	Junio C Hamano <gitster@pobox.com>,
	Lars Schneider <lars.schneider@autodesk.com>,
	git <git@vger.kernel.org>, Johannes Sixt <j6t@kdbg.org>,
	Eric Sunshine <sunshine@sunshineco.com>,
	ramsay@ramsayjones.plus.com, Johannes.Schindelin@gmx.de
Subject: Re: [PATCH v7 0/7] convert: add support for different encodings
Date: Wed, 28 Feb 2018 09:20:05 +0100	[thread overview]
Message-ID: <20180228082005.GA16857@tor.lan> (raw)
In-Reply-To: <20180227212537.GA6899@sigill.intra.peff.net>

On Tue, Feb 27, 2018 at 04:25:38PM -0500, Jeff King wrote:
> On Tue, Feb 27, 2018 at 10:05:17PM +0100, Torsten Bögershausen wrote:
> 
> > The other question is:
> > Would this help showing diffs of UTF-16 encoded files on a "git hoster",
> > github/bitbucket/.... ?
> 
> Almost. There's probably one more thing needed. We don't currently read
> in-tree .gitattributes when doing a diff in a bare repository. And most
> hosting sites will store bare repositories.
> 
> And of course it would require the users to actually set the attributes
> themselves.
> 
> > Or would the auto-magic UTF-16 avoid binary patch that I send out be more helpful ?
> > Or both ?
> > Or the w-t-e encoding ?
> 
> Of the three solutions, I think the relative merits are something like
> this:
> 
>   1. baked-in textconv (my patch)
> 
>      - reuses an existing diff feature, so minimal code and not likely to
>        break things
> 
>      - requires people to add a .gitattributes entry
> 
>      - needs work to make bare-repo .gitattributes work (though I think
>        this would be useful for other features, too)
> 
>      - has a run-time cost at each diff to do the conversion
> 
>      - may sometimes annoy people when it doesn't kick in (e.g.,
>        emailed patches from format-patch won't have a readable diff)
> 
>      - doesn't combine with other custom-diff config (e.g., utf-16
>        storing C code should still use diff=c funcname rules, but
>        wouldn't with my patch)
> 
>   2. auto-detect utf-16 (your patch)
>      - Just Works for existing repositories storing utf-16
> 
>      - carries some risk of kicking in when people would like it not to
>        (e.g., when they really do want a binary patch that can be
>        applied).

The binary patch is still supported, but that detail may need some more explanation
in the commit message. Please see  t4066-diff-encoding.sh
  test_expect_success 'diff --binary against local change' '
         cp file2 file &&
         test_tick &&
         cat >expect <<-\EOF &&
         diff --git a/file b/file
         index 26acf09b0aad19fb22566956d1a39cb4e2a3b420..e98d27acfb90cfcfc84fcc5173baa4aa7828290f 100644
         GIT binary patch
         literal 28
         ecmezW?;ArgLn;Fo!ykquAe{qbJq3!C0BHb{ln3Pi

         literal 32
         icmezW?+HT@Lpnn$kmO?c#!w7oaWVX1NCMJ1Ko$VA_z0~4

  EOF
         git diff --binary file >actual &&
         test_cmp expect actual

> 
>        I think it would probably be OK if this kicked in only when
>        ALLOW_TEXTCONV is set (the default for porcelain), and --binary
>        is not (i.e., when we would otherwise just say "binary
>        files differ").

The user can still use "git diff" (Where auto-detection of UTF-16 kicks in
and replaces "binary files differ" with an UTF-8 diff.
When the user wants a patch, "git diff --binary" will generate a binary patch,
as before.
The only thing which is missing is the line "binary files differ", which may be a
regression. I can re-add it in V2.

> 
>      - similar to (1), carries a run-time cost for each diff, and users
>        may sometimes still see binary diffs
> 
>   3. w-t-e (Lars's patch)
> 
>      - requires no server-side modifications; the diff is plain vanilla
> 
>      - works everywhere you diff, plumbing and porcelain
> 
>      - does require people to add a .gitattributes entry
> 
>      - run-time cost is per-checkout, not per-diff
> 
> So I can see room for (3) to co-exist alongside the others. Between (1)
> and (2), I think (2) is probably the better direction.
> 
> -Peff

  parent reply	other threads:[~2018-02-28  8:20 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-02-15 15:27 [PATCH v7 0/7] convert: add support for different encodings lars.schneider
2018-02-15 15:27 ` [PATCH v7 1/7] strbuf: remove unnecessary NUL assignment in xstrdup_tolower() lars.schneider
2018-02-16 12:55   ` Ævar Arnfjörð Bjarmason
2018-02-16 18:45     ` Jeff King
2018-02-16 19:30       ` Junio C Hamano
2018-02-15 15:27 ` [PATCH v7 2/7] strbuf: add xstrdup_toupper() lars.schneider
2018-02-15 15:27 ` [PATCH v7 3/7] utf8: add function to detect prohibited UTF-16/32 BOM lars.schneider
2018-02-15 15:27 ` [PATCH v7 4/7] utf8: add function to detect a missing " lars.schneider
2018-02-15 15:27 ` [PATCH v7 5/7] convert: add 'working-tree-encoding' attribute lars.schneider
2018-02-15 15:27 ` [PATCH v7 6/7] convert: add tracing for " lars.schneider
2018-02-15 15:27 ` [PATCH v7 7/7] convert: add round trip check based on 'core.checkRoundtripEncoding' lars.schneider
2018-02-15 20:03 ` [PATCH v7 0/7] convert: add support for different encodings Junio C Hamano
2018-02-15 22:09   ` Jeff King
2018-02-16 18:55     ` Junio C Hamano
2018-02-16 19:25       ` Jeff King
2018-02-16 19:27         ` Jeff King
2018-02-16 19:41           ` Junio C Hamano
2018-02-21 18:06       ` Lars Schneider
2018-02-16 14:42   ` Lars Schneider
2018-02-16 16:58     ` Torsten Bögershausen
2018-02-22 20:00       ` Lars Schneider
2018-02-22 20:12         ` Jeff King
2018-02-23 16:35         ` Junio C Hamano
2018-02-23 20:11           ` Junio C Hamano
2018-02-24 15:18             ` Lars Schneider
2018-02-26  1:44               ` Jeff King
2018-02-26 17:35                 ` Torsten Bögershausen
2018-02-26 20:46                   ` Jeff King
2018-02-27 21:05                     ` Torsten Bögershausen
2018-02-27 21:25                       ` Jeff King
2018-02-27 21:55                         ` Junio C Hamano
2018-02-27 21:58                           ` Jeff King
2018-02-27 22:10                             ` Junio C Hamano
2018-02-27 22:20                               ` Jeff King
2018-02-28  8:20                         ` Torsten Bögershausen [this message]
2018-02-28 13:21                           ` Jeff King
2018-02-28 17:42                             ` Junio C Hamano
2018-03-01  7:49                               ` Jeff King
2018-03-04 10:16                             ` Torsten Bögershausen
2018-02-28 20:46                         ` Lars Schneider
2018-02-16 19:04     ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180228082005.GA16857@tor.lan \
    --to=tboegi@web.de \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=j6t@kdbg.org \
    --cc=lars.schneider@autodesk.com \
    --cc=larsxschneider@gmail.com \
    --cc=peff@peff.net \
    --cc=ramsay@ramsayjones.plus.com \
    --cc=sunshine@sunshineco.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).