From: Tzadik Vanderhoof <tzadik.vanderhoof@gmail.com>
To: Luke Diamand <luke@diamand.org>
Cc: Andrew Oakley <andrew@adoakley.name>,
	Git List <git@vger.kernel.org>, Feiyang Xue <me@feiyangxue.com>
Subject: Re: [PATCH 2/2] git-p4: do not decode data from perforce by default
Date: Fri, 30 Apr 2021 11:08:57 -0700
Message-ID: <CAKu1iLWbmPrVjAcgLKP1yisjmVxJr+kKQWJLiqkRzh=aAzETwA@mail.gmail.com>
In-Reply-To: <021c0caf-8e6f-4fbb-6ff7-40bacbe5de38@diamand.org>

On Fri, Apr 30, 2021 at 8:33 AM Luke Diamand <luke@diamand.org> wrote:
>
> Tzadik - is your server unicode enabled or not? That would be
> interesting to know:
>
>      p4 counters | grep -i unicode
>
> I suspect it is not. It's only if unicode is enabled that the server
> will convert to/from utf8 (at least that's my understanding). Without
> this setting, p4d and p4 are (probably) not doing any conversions.

My server is not Unicode-enabled.

These conversions happen even with a non-Unicode Perforce database.
I don't think the p4d code itself is doing the conversion; rather, it
is an interaction between the OS and the code, which behaves
differently under Linux and Windows.  If you write a trivial C program
that dumps the hex values of the bytes it receives in argv, you can see
the difference:

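#include <stdio.h>

/* Dump each command-line argument as hex bytes, then as a string. */
int main(int argc, char *argv[]) {
    int i, j;
    char *s;
    for (i = 1; i < argc; ++i) {
        s = argv[i];
        /* Cast to unsigned char so bytes >= 0x80 are not sign-extended. */
        for (j = 0; s[j] != '\0'; ++j)
            printf(" %X", (unsigned char)s[j]);
        printf("\n");
        printf("[%s]\n\n", s);
    }
    return 0;
}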

When this program is built with Visual Studio and called from Cygwin,
args containing UTF-8 encoded characters are printed as cp1252 bytes.
When it is compiled on a Linux system using gcc, the same args are
printed as UTF-8 (unchanged).  I suspect that's what's happening with
p4d on Windows vs Linux.

In any event, if you look at my patch (v6 is the latest:
https://lore.kernel.org/git/20210429073905.837-1-tzadik.vanderhoof@gmail.com/ ),
you will see that I have written tests that pass under both Linux and
Windows.  (If you want to run them yourself, you need to base my patch
on "master", not "seen".)  The tests make the difference in behavior
clear, and they also show that p4d is not set to Unicode, since the
tests do not change the default setting.
