All of lore.kernel.org
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: "Torsten Bögershausen" <tboegi@web.de>
Cc: git@vger.kernel.org, alexander.s.m@gmail.com, Johannes.Schindelin@gmx.de
Subject: Re: [PATCH v4 1/2] diff.c: When appropriate, use utf8_strwidth(), part1
Date: Wed, 07 Sep 2022 11:31:45 -0700	[thread overview]
Message-ID: <xmqqedwnujce.fsf@gitster.g> (raw)
In-Reply-To: <20220907043040.idqqivi3jt35jyst@tb-raspi4> ("Torsten =?utf-8?Q?B=C3=B6gershausen=22's?= message of "Wed, 7 Sep 2022 06:30:40 +0200")

Torsten Bögershausen <tboegi@web.de> writes:

>
> OK - the comment can be removed.
>
> I didn't know how to read this comment:
>>...but the former may chomp a single multi-byte letter in the middle,
>> which would need to be corrected as a part of this change.
>
> After diffing into the code some more times, I think that we don't
> chomp a single byte out of an UTF-8 sequence.

When turning a/b/c vs a/B/c into a/{b->B}/c, two steps are involved.
Take common prefix and suffix (in this case 'a' and 'c') and turn
'b' vs 'B' into {b->B} is one step.  The other is what to do when
prefix and suffix are long.  After turning aaaaa/b/c vs aaaaa/B/c
into aaaaa/{b->B}/c, if the result is overly long, how we shorten
the prefix (i.e. aaaaa) and the suffix?

I knew the code that produces {b->B} honored '/' boundary, but I
just did not remember offhand what diff.c::pprint_rename() did in
its latter half, specifically, if it just chomped pfx and sfx as a
sequence of bytes (which would have been wrong) or insisted that the
common sequence search honors '/' boundary (which would be OK, as
byte '/' will not appear in the middle of a single multi-byte UTF-8
"letter").  I think iti s doing the latter, so it should be fine.



Thanks.

  reply	other threads:[~2022-09-07 18:31 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-09 13:11 [BUG] Unicode filenames handling in `git log --stat` Alexander Meshcheryakov
2022-08-09 18:20 ` Calvin Wan
2022-08-09 19:03   ` Alexander Meshcheryakov
2022-08-09 21:36     ` Calvin Wan
2022-08-10  5:55   ` Junio C Hamano
2022-08-10  8:40     ` Torsten Bögershausen
2022-08-10  8:56       ` Alexander Meshcheryakov
2022-08-10  9:51         ` Torsten Bögershausen
2022-08-10 11:41           ` Torsten Bögershausen
2022-08-10 15:53       ` Junio C Hamano
2022-08-10 17:35         ` Torsten Bögershausen
2022-08-14 13:35 ` [PATCH/RFC 1/1] diff.c: When appropriate, use utf8_strwidth() tboegi
2022-08-14 23:12   ` Junio C Hamano
2022-08-15  6:34     ` Torsten Bögershausen
2022-08-18 21:00       ` Junio C Hamano
2022-08-27  8:50 ` [PATCH v2 " tboegi
2022-08-27  8:54   ` Torsten Bögershausen
2022-08-27  9:50     ` Eric Sunshine
2022-08-29 12:04   ` Johannes Schindelin
2022-08-29 17:54     ` Torsten Bögershausen
2022-08-29 18:37       ` Junio C Hamano
2022-09-02  9:47       ` Johannes Schindelin
2022-09-02  4:21 ` [PATCH v3 1/2] diff.c: When appropriate, use utf8_strwidth(), part1 tboegi
2022-09-02  9:39   ` Johannes Schindelin
2022-09-02  4:21 ` [PATCH v3 2/2] diff.c: More changes and tests around utf8_strwidth() tboegi
2022-09-02 10:12   ` Johannes Schindelin
2022-09-03  5:39 ` [PATCH v4 1/2] diff.c: When appropriate, use utf8_strwidth(), part1 tboegi
2022-09-05 20:46   ` Junio C Hamano
2022-09-07  4:30     ` Torsten Bögershausen
2022-09-07 18:31       ` Junio C Hamano [this message]
2022-09-03  5:39 ` [PATCH v4 2/2] diff.c: More changes and tests around utf8_strwidth() tboegi
2022-09-05 10:13   ` Johannes Schindelin
2022-09-14 15:13 ` [PATCH v5 1/1] diff.c: When appropriate, use utf8_strwidth() tboegi
2022-09-14 16:40   ` Junio C Hamano
2022-09-26 18:43     ` Torsten Bögershausen
2022-10-10 21:58       ` Junio C Hamano
2022-10-20 15:46         ` Torsten Bögershausen
2022-10-20 17:43           ` Junio C Hamano
2022-10-21 15:19             ` Torsten Bögershausen
2022-10-21 21:59               ` Junio C Hamano
2022-10-23 20:02                 ` Torsten Bögershausen
2022-09-15  2:57   ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqedwnujce.fsf@gitster.g \
    --to=gitster@pobox.com \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=alexander.s.m@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=tboegi@web.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.