From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: "Đoàn Trần Công Danh" <congdanhqx@gmail.com>
Cc: Matheus Tavares <matheus.bernardino@usp.br>,
	gitster@pobox.com, git@vger.kernel.org,
	"brian m . carlson" <sandals@crustytoothpaste.net>
Subject: Re: [PATCH] t2080: fix cp invocation to copy symlinks instead of following them
Date: Wed, 02 Jun 2021 12:50:53 +0200
Message-ID: <87pmx47cs9.fsf@evledraar.gmail.com>
In-Reply-To: <YLbgi0jQn8BJ1ue2@danh.dev>


On Wed, Jun 02 2021, Đoàn Trần Công Danh wrote:

> On 2021-05-31 16:01:01+0200, Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>> 
>> On Thu, May 27 2021, Ævar Arnfjörð Bjarmason wrote:
>> 
>> > On Wed, May 26 2021, Matheus Tavares wrote:
>> >
>> >> t2080 makes a few copies of a test repository and later performs a
>> >> branch switch on each one of the copies to verify that parallel checkout
>> >> and sequential checkout produce the same results. However, the
>> >> repository is copied with `cp -R` which, on some systems, defaults to
>> >> following symlinks on the directory hierarchy and copying their target
>> >> files instead of copying the symlinks themselves. AIX is one example of
>> >> a system where this happens. Because the symlinks are not preserved, the
>> >> copied repositories have paths that do not match what is in the index,
>> >> causing git to abort the checkout operation that we want to test. This
>> >> makes the test fail on these systems.
>> >>
>> >> Fix this by copying the repository with the POSIX flag '-P', which
>> >> forces cp to copy the symlinks instead of following them. Note that we
>> >> already use this flag for other cp invocations in our test suite (see
>> >> t7001). With this change, t2080 now passes on AIX.
>> >>
>> >> Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>> >> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
>> >> ---
>> >>  t/t2080-parallel-checkout-basics.sh | 2 +-
>> >>  1 file changed, 1 insertion(+), 1 deletion(-)
>> >>
>> >> diff --git a/t/t2080-parallel-checkout-basics.sh b/t/t2080-parallel-checkout-basics.sh
>> >> index 7087818550..3e0f8c675f 100755
>> >> --- a/t/t2080-parallel-checkout-basics.sh
>> >> +++ b/t/t2080-parallel-checkout-basics.sh
>> >> @@ -114,7 +114,7 @@ do
>> >>  
>> >>  	test_expect_success "$mode checkout" '
>> >>  		repo=various_$mode &&
>> >> -		cp -R various $repo &&
>> >> +		cp -R -P various $repo &&
>> >>  
>> >>  		# The just copied files have more recent timestamps than their
>> >>  		# associated index entries. So refresh the cached timestamps
>> >
>> > Thanks for the quick fix, I can confirm that this makes the test pass on
>> > AIX 7.2.
>> 
>> There's still a failure[1] in t2082-parallel-checkout-attributes.sh
>> though, which is new in 2.32.0-rc*. The difference is in an unexpected
>> BOM:
>>     
>>     avar@gcc119:[/scratch/avar/git/t]perl -nle 'print unpack "H*"' trash\ directory.t2082-parallel-checkout-attributes/encoding/A.internal 
>>     efbbbf74657874
>>     avar@gcc119:[/scratch/avar/git/t]perl -nle 'print unpack "H*"' trash\ directory.t2082-parallel-checkout-attributes/encoding/utf8-text  
>>     74657874
>> 
>> I.e. the A.internal starts with 0xefbbbf. The 2nd test of t0028*.sh also
>> fails similarly[2], so perhaps it's some old/iconv/whatever issue not
>> per-se related to any change of yours.
>
> The 0xefbbbf looks interesting, it's BOM for utf-8.
>
>> I tried compiling with both NO_ICONV=Y and ICONV_OMITS_BOM=Y, both have
>> the same failure.
>
> I didn't check the code-path for NO_ICONV=Y but ICONV_OMITS_BOM=Y only
> affects output of converting *to* utf-16 and utf-32.
>
> So, I think AIX's iconv implementation automatically adds a BOM to utf-8?
>
> Perhaps we need to call skip_utf8_bom somewhere?

I debugged this a bit more. It's probably *also* an issue in our use of
libiconv, but it already goes wrong in our test setup, which uses
iconv(1). I.e. on my boring Linux box:
    
    echo x | iconv -f UTF-8 -t UTF-16 | perl -0777 -MData::Dumper -ne 'my @a = map { sprintf "0x%x", $_ } unpack "C*"; print Dumper \@a'
    $VAR1 = [
              '0xff',
              '0xfe',
              '0x78',
              '0x0',
              '0xa',
              '0x0'
            ];


On the AIX box, to get the same output I need to do that as:

    (printf '\377\376'; echo x | iconv -f UTF-8 -t UTF-16LE) | [...]

I.e. AIX's iconv omits the BOM *and* what we think of as UTF-16 is what
AIX calls little-endian UTF-16; a plain UTF-16 there gives you the
big-endian version. To make things worse, the same is true of UTF-32,
except that "iconv -l" lists no UTF-32LE version. So it seems we can't
get the same result at all for that one.
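
For reference, the byte sequences under discussion can be written out by
hand without involving iconv at all. This is only a minimal sketch,
assuming a POSIX od(1) and printf(1); hex_bytes is a hypothetical helper
name, not something in our test-lib:

```shell
# hex_bytes (hypothetical helper): dump stdin as space-separated hex
# bytes on a single line. The unquoted $(...) is deliberate: word
# splitting collapses od's padding and line wrapping.
hex_bytes () {
	echo $(od -An -v -tx1)
}

# "x\n" as little-endian UTF-16 with a BOM (FF FE is octal \377\376)
printf '\377\376x\000\n\000' | hex_bytes

# the same text as big-endian UTF-16 with a BOM (FE FF is octal \376\377)
printf '\376\377\000x\000\n' | hex_bytes
```

The first command prints "ff fe 78 00 0a 00", the same bytes as the
Data::Dumper output from the Linux box; the second prints
"fe ff 00 78 00 0a".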

So from the outset the code added around 79444c92943 (utf8: handle
systems that don't write BOM for UTF-16, 2019-02-12) needs to be more
careful (although this looked broken before), i.e. we should test exact
known-good bytes and see if UTF-16 is really what we think it is,
etc. This is likely broken on any big-endian non-GNUish iconv
implementation.
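
A rough sketch of what "test exact known-good bytes" could look like: feed
the iconv output for a fixed input ("x\n") to a classifier and compare it
against the four possible layouts. utf16_flavor is a hypothetical name,
nothing like it exists in our test-lib today, and I'm assuming a POSIX
od(1):

```shell
# utf16_flavor (hypothetical helper): read the UTF-16 encoding of "x\n"
# on stdin and report byte order and BOM presence by exact byte match.
utf16_flavor () {
	# Normalize od's output to "xx xx xx ..." on one line.
	bytes=$(echo $(od -An -v -tx1))
	case "$bytes" in
	"ff fe 78 00 0a 00") echo "LE with BOM" ;;
	"fe ff 00 78 00 0a") echo "BE with BOM" ;;
	"78 00 0a 00")       echo "LE without BOM" ;;
	"00 78 00 0a")       echo "BE without BOM" ;;
	*)                   echo "unknown: $bytes" ;;
	esac
}
```

Used as "echo x | iconv -f UTF-8 -t UTF-16 | utf16_flavor", this should
report "LE with BOM" for the Linux output shown above. A test prerequisite
could then be derived from the reported flavor instead of assuming that a
missing BOM is the only way an iconv can differ.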
