git.vger.kernel.org archive mirror
From: Elijah Newren <newren@gmail.com>
To: Yang Zhao <yang.zhao@skyboxlabs.com>
Cc: Git Mailing List <git@vger.kernel.org>
Subject: Re: [RFC PATCH 4/4] git-p4: use utf-8 encoding for file paths throughout
Date: Wed, 27 Nov 2019 18:57:35 -0800
Message-ID: <CABPp-BEa883U+m1emhZHU4s5EqsJHyRiPvjnsyw6_0guKLUsLg@mail.gmail.com>
In-Reply-To: <20191128012807.3103-5-yang.zhao@skyboxlabs.com>

On Wed, Nov 27, 2019 at 5:32 PM Yang Zhao <yang.zhao@skyboxlabs.com> wrote:
>
> Try to decode file paths in responses from p4 as soon as possible so
> that we are working with unicode string throughout the rest of the flow.
> This makes python 3 a lot happier.
>
> Signed-off-by: Yang Zhao <yang.zhao@skyboxlabs.com>
> ---
>
> This is probably the most risky patch out of the set. It's very likely
> that I've neglected to consider certain corner cases with decoding of
> path data.

Yes, this does seem somewhat risky to me.  It may go well on platforms
that require all filenames to be unicode.  And it may work for users
who happen to restrict their filenames to valid utf-8.  But this
abstraction doesn't fit the general problem, so some users may be left
out in the cold.
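
To make the concern concrete, here is a minimal sketch of how a strict
utf-8 decode behaves on a filename that was written in another encoding.
The helper name try_decode and the sample filename are hypothetical and
are not taken from git-p4 or the patch:

```python
def try_decode(raw_name: bytes):
    """Return the decoded name, or None if the bytes are not valid UTF-8."""
    try:
        return raw_name.decode("utf-8")
    except UnicodeDecodeError:
        return None


# The same visible name, stored with two different encodings (hypothetical data).
utf8_name = "café.txt".encode("utf-8")
latin1_name = "café.txt".encode("latin-1")

print(try_decode(utf8_name))    # decodes cleanly
print(try_decode(latin1_name))  # None: the byte 0xe9 is not valid UTF-8 here
```

A tool that decodes every path up front has to decide what to do in the
second case, which is where users with such filenames could get stuck.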

I tried multiple times while switching git-filter-repo from python2 to
python3, at different levels of pervasiveness, to use unicode more
generally.  But I mostly gave up; everyone knows file contents won't
necessarily be unicode, but you can't assume that filenames, commit
messages, or branch or tag names (and perhaps a few other things I'm
forgetting) are either.  I ended up using bytestrings everywhere
except in messages displayed to the user, and I only decode at that
point.
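
The bytes-everywhere approach might look roughly like the sketch below.
This is an illustration under assumptions, not code from git-filter-repo
or git-p4: join_path, display_path, and the depot path are all
hypothetical names and data chosen for the example.

```python
def join_path(prefix: bytes, name: bytes) -> bytes:
    """Join two path components without ever leaving the bytes domain."""
    return prefix.rstrip(b"/") + b"/" + name


def display_path(raw_path: bytes) -> str:
    """Decode for display only; undecodable bytes are shown as \\xNN escapes."""
    return raw_path.decode("utf-8", errors="backslashreplace")


if __name__ == "__main__":
    # Hypothetical path whose last component is Latin-1, not valid UTF-8.
    raw = join_path(b"//depot/main/", b"caf\xe9.txt")

    # All internal handling keeps `raw` as bytes; only the message is text.
    print("importing " + display_path(raw))
```

Because the decode happens only when building the message, a filename
that is not valid utf-8 changes how the message looks but never changes
the bytes that are compared, stored, or passed on.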


Of course, if perforce happens to only work with unicode filenames
then you'll be fine.  And perhaps you don't want or need to be as
paranoid as I was about what people could do.  So I don't know if my
experience applies in your case (I've never used perforce myself), but
I just thought I'd mention it in case it's useful.

Thread overview: 7+ messages
2019-11-28  1:28 [RFC PATCH 0/4] git-p4: python 3 compatability Yang Zhao
2019-11-28  1:28 ` [RFC PATCH 1/4] git-p4: decode response from p4 to str for python3 Yang Zhao
2019-11-28  1:28 ` [RFC PATCH 2/4] git-p4: properly encode/decode communication with git for python 3 Yang Zhao
2019-11-28  1:28 ` [RFC PATCH 3/4] git-p4: open .gitp4-usercache.txt in text mode Yang Zhao
2019-11-28  1:28 ` [RFC PATCH 4/4] git-p4: use utf-8 encoding for file paths throughout Yang Zhao
2019-11-28  2:57   ` Elijah Newren [this message]
2019-11-28 12:54 ` [RFC PATCH 0/4] git-p4: python 3 compatability Johannes Schindelin
