From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Christian Couder <christian.couder@gmail.com>
Cc: git <git@vger.kernel.org>, Junio C Hamano <gitster@pobox.com>,
	Johannes Schindelin <Johannes.Schindelin@gmx.de>,
	Nguyen Thai Ngoc Duy <pclouds@gmail.com>
Subject: Re: Rebase performance
Date: Thu, 25 Feb 2016 17:31:04 +0100
Message-ID: <CACBZZX7rVAdzfCm=0FdrCXSx8a2=a8n7pjq1ZSW-V3fzmaSGWw@mail.gmail.com>
In-Reply-To: <CAP8UFD0p1kvk2B0kkc-M9dm+H-Bmam=OrE99VwQx=KCETFEjcw@mail.gmail.com>

On Wed, Feb 24, 2016 at 11:09 PM, Christian Couder
<christian.couder@gmail.com> wrote:

[Resent because I was accidentally in GMail's HTML mode and the ML rejected it]

> If there was a config option called maybe "rebase.taskset" or
> "rebase.setcpuaffinity" that could be set to ask the OS for all the
> rebase child processes to be run on the same core, people who run many
> rebases on big repos on big servers as we do at Booking.com could
> easily benefit from a nice speed up.
>
> Technically the option may make git-rebase--am.sh call "git am" using
> "taskset" (if taskset is available on the current OS).

Aside from the issues with git-apply, I think this would be an
interesting feature to have in git: a general facility to intercept
commands and inject a prefix command in front of them, whether that's
taskset, nice/ionice, strace, etc.
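
To make the prefix idea concrete, here is a minimal sketch in Python
rather than in git's own shell/C. The helper name with_prefix() and its
"which" parameter are made up for illustration and nothing like this
exists in git today. It only builds the argument vector and falls back
to the plain command when the prefix program isn't installed:

```python
import shutil


def with_prefix(cmd, prefix, which=shutil.which):
    """Return cmd with prefix prepended, if the prefix program exists.

    Hypothetical helper: 'prefix' is e.g. ["taskset", "-c", "0"] and
    'which' is injectable so the lookup can be faked in tests.
    """
    if prefix and which(prefix[0]) is not None:
        return list(prefix) + list(cmd)
    # Prefix tool not available on this OS: run the command unchanged.
    return list(cmd)
```

A caller would then pass the result to something like subprocess.run(),
e.g. with_prefix(["git", "am"], ["taskset", "-c", "0"]) to pin "git am"
and its children to CPU 0 where taskset is available.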

> Another possibility would be to libify the "git apply" functionality
> and then to use the libified "git apply" in run_apply() instead of
> launching a separate "git apply" process. One benefit from this is
> that we could probably get rid of the read_cache_from() call at the
> end of run_apply() and this would likely further speed up things. Also
> avoiding to launch separate processes might be a win especially on
> Windows.

Yeah, that should help in this particular case, and it would make the
taskset redundant since the whole sequence of operations would all run
on one core, right?

At the risk of derailing this thread, I think something that would make
rebase even faster would be to change it so that, instead of applying
one patch at a time to the working tree, the whole operation takes place
on temporary trees & commits, and we only move the branch pointer to the
result once it's finished.

I.e. there's no reason why a sequence of 1000 patches where FOO.txt is
changed to "hi1", "hi2", "hi3", ... should be noticeably slower than
applying the same changes with git-fast-import.
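
For comparison, the fast-import side of that example can be sketched as
a small Python generator for the stream. The branch name
refs/heads/demo, the committer identity and the timestamps are all
placeholders. Each commit omits "from", so fast-import uses the current
tip of the branch as its parent:

```python
def fast_import_stream(n, path="FOO.txt", ref="refs/heads/demo"):
    """Build a git fast-import stream of n commits rewriting one file.

    Commit i sets the file to "hi<i>\n". Identity, ref and timestamps
    are illustrative placeholders, not taken from any real repository.
    """
    chunks = []
    for i in range(1, n + 1):
        msg = f"set {path} to hi{i}\n".encode()
        body = f"hi{i}\n".encode()
        when = str(1456417864 + i).encode()
        chunks.append(
            b"commit " + ref.encode() + b"\n"
            + b"committer Demo <demo@example.com> " + when + b" +0000\n"
            # 'data' takes an exact byte count, followed by the raw bytes.
            + b"data " + str(len(msg)).encode() + b"\n" + msg
            + b"M 100644 inline " + path.encode() + b"\n"
            + b"data " + str(len(body)).encode() + b"\n" + body
            + b"\n"
        )
    return b"".join(chunks)
```

Writing fast_import_stream(1000) to a file and feeding it to
"git fast-import" in a scratch repository never touches a working tree
or index, which is the point of the comparison.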

Of course this would need to handle a lot of nuances, e.g. if there's a
conflict we'd need to update the working tree & index, as we do now,
before continuing.
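
The control flow I have in mind looks roughly like the toy model below.
It is deliberately not git's data model: a "tree" is a plain dict of
path to content, a "patch" maps a path to an (old, new) pair, and the
names Conflict, apply_patch() and rebase_in_memory() are invented for
this sketch. Everything happens on temporary values, and only on a
conflict would we stop and hand the partial result over to the working
tree & index:

```python
class Conflict(Exception):
    """Raised when a patch's expected old content does not match."""


def apply_patch(tree, patch):
    # Work on a copy so the input tree is never modified in place.
    new_tree = dict(tree)
    for path, (old, new) in patch.items():
        if new_tree.get(path) != old:
            raise Conflict(path)
        new_tree[path] = new
    return new_tree


def rebase_in_memory(base_tree, patches):
    """Apply patches in order without touching any working tree.

    Returns a dict describing the outcome. On "conflict" the caller
    would write 'tree' out to the working tree & index and let the
    user resolve; on "done" it would just move the branch pointer.
    """
    tree = base_tree
    applied = 0
    for patch in patches:
        try:
            tree = apply_patch(tree, patch)
        except Conflict as conflict:
            return {"status": "conflict", "applied": applied,
                    "tree": tree, "path": str(conflict)}
        applied += 1
    return {"status": "done", "applied": applied, "tree": tree}
```

In the common no-conflict case the loop never leaves memory, so the cost
per patch is that of applying it to a tree, not of updating a checkout.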

Has anyone looked into a more advanced refactoring of the rebase process
that would work like this, or does anyone have feedback on why this
would be dumb, or whether there's a better way to do it?
