* Rebase performance
@ 2016-02-24 22:09 Christian Couder
From: Christian Couder @ 2016-02-24 22:09 UTC
  To: git
  Cc: Junio C Hamano, Johannes Schindelin, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason

Hi,

Using GIT_TRACE_PERFORMANCE it looks like a lot of time in a regular
rebase is spent in run_apply() in builtin/am.c. This function first
sets up a 'struct child_process cp' to launch "git apply" on a patch
and then uses run_command(&cp) to actually launch the "git apply".
Then this function calls discard_cache() and read_cache_from() to get
the index created by the "git apply".

On a Linux server with many CPUs and many cores per CPU, something
strange happens: the same rebase of 13 commits on a big repo is
significantly slower than on a laptop (typically around 9 seconds
versus 6 seconds). Both the server and the laptop have SSD storage.

It appears that the server is trying to run the "git apply" processes
on different cores or CPUs, perhaps to spread the load across many of
its cores. Anyway, adding something like "taskset -c 7" in front of
the "git rebase..." command when launching it on the server speeds it
up, so that it takes around the same amount of time as on the laptop
(6 seconds). "taskset -c 7" just asks Linux to run a process and its
children on core number 7, and it appears that doing so results in
much better CPU (or core) cache usage, which explains the speedup.
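
For anyone who wants to repeat the comparison, here is a rough sketch
of how the two runs could be timed. The time_cmd helper is hypothetical
(it is not part of Git), it assumes a "date" that supports "+%s", and
it only has one-second resolution, which is enough for a 9 versus 6
second difference:

```shell
# Hypothetical helper: print the wall-clock seconds a command takes
# and return that command's exit status.
time_cmd () {
	start=$(date +%s)
	"$@" >/dev/null 2>&1
	ret=$?
	end=$(date +%s)
	echo "$((end - start))"
	return $ret
}

# On the server (core 7 and <upstream> are placeholders):
#   time_cmd git rebase <upstream>
#   time_cmd taskset -c 7 git rebase <upstream>
```

The repo has to be reset to the same starting state between the two
runs for the numbers to be comparable.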

If there were a config option, maybe called "rebase.taskset" or
"rebase.setcpuaffinity", that could be set to ask the OS to run all
the rebase child processes on the same core, people who run many
rebases on big repos on big servers, as we do at Booking.com, could
easily benefit from a nice speedup.

Technically, the option could make git-rebase--am.sh call "git am"
through "taskset" (if taskset is available on the current OS).
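
A minimal sketch of what such a wrapper might look like in shell. The
run_pinned name is made up for illustration, and this is not how
git-rebase--am.sh is written today. It falls back to running the
command unpinned when taskset is missing or the requested core cannot
be used:

```shell
# Hypothetical helper: run a command pinned to one core when possible.
# Usage: run_pinned <cpu> <command> [<args>...]
run_pinned () {
	cpu=$1
	shift
	# Probe first: taskset may be missing (non-Linux OS), or the
	# requested core may not be usable on this machine.
	if command -v taskset >/dev/null 2>&1 &&
	   taskset -c "$cpu" true 2>/dev/null
	then
		taskset -c "$cpu" "$@"
	else
		"$@"
	fi
}

# Example (core 7 as in the experiment above, <upstream> is a placeholder):
#   run_pinned 7 git rebase <upstream>
```

The core number would presumably come from the config option; how to
pick a good default is an open question.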

Another possibility would be to libify the "git apply" functionality
and then to use the libified "git apply" in run_apply() instead of
launching a separate "git apply" process. One benefit of this is
that we could probably get rid of the read_cache_from() call at the
end of run_apply(), which would likely speed things up further. Also,
avoiding launching separate processes might be a win, especially on
Windows.

Suggestions?

Thanks,
Christian.
