All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Cc: Jeff King <peff@peff.net>, Stefan Beller <sbeller@google.com>,
	git <git@vger.kernel.org>, Junio C Hamano <gitster@pobox.com>,
	Thomas Rast <tr@thomasrast.ch>, Phil Haack <haacked@gmail.com>,
	Jason Frey <jfrey@redhat.com>,
	Philip Oakley <philipoakley@iee.org>,
	Duy Nguyen <pclouds@gmail.com>
Subject: Re: A potential approach to making tests faster on Windows
Date: Tue, 03 Apr 2018 13:28:18 +0200	[thread overview]
Message-ID: <87sh8cv8a5.fsf@evledraar.gmail.com> (raw)
In-Reply-To: <nycvar.QRO.7.76.6.1804031133270.5026@qfpub.tvgsbejvaqbjf.bet>


On Tue, Apr 03 2018, Johannes Schindelin wrote:

> Hi Peff,
>
> On Fri, 30 Mar 2018, Jeff King wrote:
>
>> On Fri, Mar 30, 2018 at 08:45:45PM +0200, Ævar Arnfjörð Bjarmason wrote:
>>
>> > I've wondered for a while whether it wouldn't be a viable approach to
>> > make something like an interpreter for our test suite to get around
>> > this problem, i.e. much of it's very repetitive and just using a few
>> > shell functions we've defined, what if we had C equivalents of those?
>>
>> I've had a similar thought, though I wonder how far we could get with
>> just shell. I even tried it out with test_cmp:
>>
>>   https://public-inbox.org/git/20161020215647.5no7effvutwep2xt@sigill.intra.peff.net/
>>
>> But Johannes Sixt pointed out that they already do this (see
>> mingw_test_cmp in test-lib-functions).
>
> Right.
>
> Additionally, I noticed that that simple loop in shell is *also* very slow on
> Windows (at least in the MSYS2 Bash we use in Git for Windows).
>
> Under the assumption that it is the Bash with the loop that uses too much
> POSIX emulation to make it fast, I re-implemented mingw_test_cmp in pure
> C:
> https://github.com/git-for-windows/git/commit/8a96ef63a0083ba02305dfeef6ff92c31b4fd7c3
>
> Unfortunately, it did not produce any noticeable speed improvement, so I
> did not even finish the conversion (when the cmp fails, it does not show
> you any helpful diff yet).

I don't know the details of Windows, but it sounds like you're trying to
performance test two things that are going to suck for different
reasons.

On one hand the pure-*.sh comparison would be slower than just diff on
*nix, because it's not C, so you'll get that slowness, but gain in not
having to fork another process.

On the other hand the C implementation is going to be really fast, but
it's going to take you a long time to get it started on Windows.

Which is why I think it would be really interesting to see the third
approach I suggested, i.e. hack the shell to make the test_cmp a builtin
and test that. Then you won't fork, but will get the advantage of your
fast C codepath.

Also, even if test_cmp is much faster, Peff's results over at
https://public-inbox.org/git/20161020123111.qnbsainul2g54z4z@sigill.intra.peff.net/
suggest that you may not notice anyway. Aside from the points raised
there about the bin wrappers it seems the easiest wins are having a
builtin version of "rm" and "cat".

Are you able to compile dash on Windows with some modification of the
patch I sent upthread? If not it doesn't seem too hard to do the same
trick for bash, see:

    git grep '\balias\b' -- builtins

Once you have bash.git checked out. I.e. you add a bit of Makefile
boilerplate and you should be able to get a new builtin.

>> I also tried to explore a few numbers about process invocations to see
>> if running shell commands is the problem:
>>
>>   https://public-inbox.org/git/20161020123111.qnbsainul2g54z4z@sigill.intra.peff.net/
>
> This mail was still in my inbox, in want of me saying something about
> this.
>
> My main evidence that shell scripts on macOS are slower than on Linux was
> the difference of the improvement incurred by moving more things from
> git-rebase--interactive.sh into sequencer.c: Linux saw an improvement only
> of about 3x, while macOS saw an improvement of 4x, IIRC. If I don't
> remember the absolute numbers correctly, at least I vividly remember the
> qualitative difference: It was noticeable.
>
>> There was some discussion there about whether the problem is programs
>> being exec'd, or if it's forks due to subshells. And if it is programs
>> being exec'd, whether it's shell programs or if it is simply that we
>> exec Git a huge number of times.
>
> One large problem there is that it is really hard to analyze performance
> over such a heterogenous code base: part C, part Perl, part Unix shell
> (and of course, when you say Unix shell, you imply dozens of separate
> tools that *also* need to be performance-profiled). I have very good
> profiling tools for C, I saw some built-in performance profiling for Perl,
> but there is no good performance profiling for Unix shell scripting: I
> doubt that the inventors of shell scripting had speed-critical production
> code in mind when they came up with the idea.
>
> I did invest dozens of hours earlier this year trying to obtain debug
> symbols in .pdb format (ready for Visual Studio's really envy-inducing
> performance profiler) also for the MSYS2 runtime and Bash, so that I could
> analyze what makes things so awfully slow in Git's test suite.
>
> The only problem is that I also have to do other things in my day-job, so
> that project waits patiently until I have some time to come back to that
> project.
>
> Ciao,
> Dscho

  reply	other threads:[~2018-04-03 11:28 UTC|newest]

Thread overview: 103+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-03-29 15:18 [PATCH 0/9] Assorted fixes for `git config` (including the "empty sections" bug) Johannes Schindelin
2018-03-29 15:18 ` [PATCH 1/9] git_config_set: fix off-by-two Johannes Schindelin
2018-03-29 18:15   ` Stefan Beller
2018-03-29 19:41     ` Jeff King
2018-03-30 12:32       ` Johannes Schindelin
2018-03-30 14:15         ` Ævar Arnfjörð Bjarmason
2018-03-30 16:24           ` Junio C Hamano
2018-03-30 18:44             ` Johannes Schindelin
2018-03-30 19:00               ` Junio C Hamano
2018-04-03  9:31                 ` Johannes Schindelin
2018-04-03 15:29                   ` Duy Nguyen
2018-04-03 15:47                     ` Johannes Schindelin
2018-04-08 23:12                   ` Junio C Hamano
2018-03-30 16:36         ` Duy Nguyen
2018-03-30 18:53           ` Johannes Schindelin
2018-03-30 19:16             ` Duy Nguyen
2018-03-30 18:45         ` A potential approach to making tests faster on Windows Ævar Arnfjörð Bjarmason
2018-03-30 18:58           ` Junio C Hamano
2018-03-30 19:16           ` Jeff King
2018-04-03  9:49             ` Johannes Schindelin
2018-04-03 11:28               ` Ævar Arnfjörð Bjarmason [this message]
2018-04-03 15:55                 ` Johannes Schindelin
2018-04-03 21:36               ` Eric Sunshine
2018-04-03 11:43           ` Johannes Schindelin
2018-04-03 13:27             ` Jeff King
2018-04-03 16:00               ` Johannes Schindelin
2018-04-06 21:40                 ` Jeff King
2018-04-06 21:57                   ` Stefan Beller
2018-03-29 15:18 ` [PATCH 2/9] t1300: rename it to reflect that `repo-config` was deprecated Johannes Schindelin
2018-03-29 19:42   ` Jeff King
2018-03-30 12:37     ` Johannes Schindelin
2018-03-29 15:18 ` [PATCH 3/9] t1300: avoid relying on a bug Johannes Schindelin
2018-03-29 19:43   ` Jeff King
2018-03-30 12:38     ` Johannes Schindelin
2018-03-29 15:18 ` [PATCH 4/9] t1300: remove unreasonable expectation from TODO Johannes Schindelin
2018-03-29 19:52   ` Jeff King
2018-03-29 20:45     ` Junio C Hamano
2018-03-30 12:42     ` Johannes Schindelin
2018-03-29 15:18 ` [PATCH 5/9] t1300: `--unset-all` can leave an empty section behind (bug) Johannes Schindelin
2018-03-29 19:54   ` Jeff King
2018-03-29 15:18 ` [PATCH 6/9] git_config_set: simplify the way the section name is remembered Johannes Schindelin
2018-03-29 15:19 ` [PATCH 7/9] git config --unset: remove empty sections (in normal situations) Johannes Schindelin
2018-03-29 21:32   ` Jeff King
2018-03-30 13:00     ` Johannes Schindelin
2018-03-30 13:09       ` Jeff King
2018-03-29 15:19 ` [PATCH 8/9] git_config_set: use do_config_from_file() directly Johannes Schindelin
2018-03-29 21:38   ` Jeff King
2018-03-30 13:02     ` Johannes Schindelin
2018-03-30 13:14       ` Jeff King
2018-03-30 14:01         ` Johannes Schindelin
2018-03-30 14:08           ` Jeff King
2018-03-30 19:04             ` Johannes Schindelin
2018-03-29 15:19 ` [PATCH 9/9] git_config_set: reuse empty sections Johannes Schindelin
2018-03-29 21:50   ` Jeff King
2018-03-30 13:15     ` Johannes Schindelin
2018-03-29 17:58 ` [PATCH 0/9] Assorted fixes for `git config` (including the "empty sections" bug) Stefan Beller
2018-03-30 12:14   ` Johannes Schindelin
2018-03-29 19:39 ` Jeff King
2018-03-30 12:35   ` Johannes Schindelin
2018-03-30 14:17 ` Ævar Arnfjörð Bjarmason
2018-03-30 18:46   ` Johannes Schindelin
2018-04-03 16:27 ` [PATCH v2 00/15] " Johannes Schindelin
2018-04-03 16:28   ` [PATCH v2 01/15] git_config_set: fix off-by-two Johannes Schindelin
2018-04-03 16:28   ` [PATCH v2 02/15] t1300: rename it to reflect that `repo-config` was deprecated Johannes Schindelin
2018-04-03 16:28   ` [PATCH v2 03/15] t1300: demonstrate that --replace-all can "invent" newlines Johannes Schindelin
2018-04-03 16:28   ` [PATCH v2 04/15] config --replace-all: avoid extra line breaks Johannes Schindelin
2018-04-03 16:28   ` [PATCH v2 05/15] t1300: avoid relying on a bug Johannes Schindelin
2018-04-03 16:28   ` [PATCH v2 06/15] t1300: remove unreasonable expectation from TODO Johannes Schindelin
2018-04-03 16:28   ` [PATCH v2 07/15] t1300: `--unset-all` can leave an empty section behind (bug) Johannes Schindelin
2018-04-03 16:28   ` [PATCH v2 08/15] config: introduce an optional event stream while parsing Johannes Schindelin
2018-04-06 21:22     ` Jeff King
2018-04-09  7:35       ` Johannes Schindelin
2018-04-03 16:28   ` [PATCH v2 09/15] config: avoid using the global variable `store` Johannes Schindelin
2018-04-06 21:23     ` Jeff King
2018-04-09  7:36       ` Johannes Schindelin
2018-04-03 16:28   ` [PATCH v2 10/15] config_set_store: rename some fields for consistency Johannes Schindelin
2018-04-03 16:28   ` [PATCH v2 11/15] git_config_set: do not use a state machine Johannes Schindelin
2018-04-06 21:28     ` Jeff King
2018-04-09  7:50       ` Johannes Schindelin
2018-04-03 16:28   ` [PATCH v2 12/15] git_config_set: make use of the config parser's event stream Johannes Schindelin
2018-04-03 16:28   ` [PATCH v2 13/15] git config --unset: remove empty sections (in the common case) Johannes Schindelin
2018-04-03 16:29   ` [PATCH v2 14/15] git_config_set: reuse empty sections Johannes Schindelin
2018-04-03 16:30   ` [PATCH v2 00/15] Assorted fixes for `git config` (including the "empty sections" bug) Johannes Schindelin
2018-04-06 21:33   ` Jeff King
2018-04-09  8:19     ` Johannes Schindelin
2018-04-09  8:31   ` [PATCH v3 " Johannes Schindelin
2018-04-09  8:31     ` [PATCH v3 01/15] git_config_set: fix off-by-two Johannes Schindelin
2018-04-09  8:31     ` [PATCH v3 02/15] t1300: rename it to reflect that `repo-config` was deprecated Johannes Schindelin
2018-04-09  8:31     ` [PATCH v3 03/15] t1300: demonstrate that --replace-all can "invent" newlines Johannes Schindelin
2018-04-09  8:31     ` [PATCH v3 04/15] config --replace-all: avoid extra line breaks Johannes Schindelin
2018-04-09  8:31     ` [PATCH v3 05/15] t1300: avoid relying on a bug Johannes Schindelin
2018-04-09  8:31     ` [PATCH v3 06/15] t1300: remove unreasonable expectation from TODO Johannes Schindelin
2018-04-09  8:31     ` [PATCH v3 07/15] t1300: add a few more hairy examples of sections becoming empty Johannes Schindelin
2018-04-09  8:32     ` [PATCH v3 08/15] t1300: `--unset-all` can leave an empty section behind (bug) Johannes Schindelin
2018-04-09  8:32     ` [PATCH v3 09/15] config: introduce an optional event stream while parsing Johannes Schindelin
2018-04-09  8:32     ` [PATCH v3 10/15] config: avoid using the global variable `store` Johannes Schindelin
2018-04-09  8:32     ` [PATCH v3 11/15] config_set_store: rename some fields for consistency Johannes Schindelin
2018-04-09  8:32     ` [PATCH v3 12/15] git_config_set: do not use a state machine Johannes Schindelin
2018-04-09  8:32     ` [PATCH v3 13/15] git_config_set: make use of the config parser's event stream Johannes Schindelin
2018-05-08 13:42       ` Jeff King
2018-05-08 14:00         ` Jeff King
2018-04-09  8:32     ` [PATCH v3 14/15] git config --unset: remove empty sections (in the common case) Johannes Schindelin
2018-04-09  8:32     ` [PATCH v3 15/15] git_config_set: reuse empty sections Johannes Schindelin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87sh8cv8a5.fsf@evledraar.gmail.com \
    --to=avarab@gmail.com \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=haacked@gmail.com \
    --cc=jfrey@redhat.com \
    --cc=pclouds@gmail.com \
    --cc=peff@peff.net \
    --cc=philipoakley@iee.org \
    --cc=sbeller@google.com \
    --cc=tr@thomasrast.ch \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.