* VERY slow git format-patch (tens on minutes) during rebase and rev-list during rebase -i
@ 2010-07-13 6:56 Marat Radchenko
2010-07-13 8:12 ` Michael J Gruber
2010-10-13 7:56 ` [FEATURE REQUEST] allow enabling patience diff algorithm by default Marat Radchenko
0 siblings, 2 replies; 7+ messages in thread
From: Marat Radchenko @ 2010-07-13 6:56 UTC (permalink / raw)
To: git
Hi.
My setup:
0. Quad-core machine with 8GB of RAM, 10K RPM HDD.
1. SVN repo that I periodically fetch into the origin/trunk branch. It gets ~200
commits/day.
2. My local branch with 1-5 commits, which I often rebase against trunk.
3. I haven't rebased for 2 days, so I'm rebasing 3 (three) commits in my branch
over 453 commits in trunk using "git rebase trunk".
4. trunk contains files that are "bad" from a diff point of view (big and binary).
5. Sadly, the data in the repo is confidential.
Expected: rebase takes some reasonable amount of time (< 1 min?).
Actual: rebase takes 20 mins.
Almost all of that time was spent running
`git format-patch -k --stdout --full-index --ignore-if-in-upstream 80bb0dfe3d86f3cc9095ea616d9d1b1530fbe7b8..d3fde4ae7497981a6fe61b0366b105477896cf52`
(that's three commits from my branch) at 100% of one CPU core.
Additional info:
Another similar rebase, but over 4.5k commits, took 2 hours.
Running without --ignore-if-in-upstream:
$ time git format-patch -k --stdout --full-index 80bb0dfe3d86f3cc9095ea616d9d1b1530fbe7b8..d3fde4ae7497981a6fe61b0366b105477896cf52 | wc -l
25823
real 0m0.163s
user 0m0.140s
sys 0m0.020s
Proof there are only three commits:
$ git rev-list 80bb0dfe3d86f3cc9095ea616d9d1b1530fbe7b8..d3fde4ae7497981a6fe61b0366b105477896cf52
d3fde4ae7497981a6fe61b0366b105477896cf52
e18069258806bda6a6165822003f5e9fd958f906
c8c2f2e157e615b73d0baab1d793a22991c9ba71
Questions:
1. Is this expected behavior (the branch you rebase onto has binary files -> no
performance for you)?
2. If the answer to [1] is yes, is it possible to prevent rebase from running
--ignore-if-in-upstream?
3. If the answer to [1] is no, should I run some kind of profiler (how?) to
determine what exactly causes such a performance drop?
* Re: VERY slow git format-patch (tens on minutes) during rebase and rev-list during rebase -i
2010-07-13 6:56 VERY slow git format-patch (tens on minutes) during rebase and rev-list during rebase -i Marat Radchenko
@ 2010-07-13 8:12 ` Michael J Gruber
2010-07-13 8:13 ` [RFC/PATCH] rebase: Allow to turn of ignore-if-in-upstream Michael J Gruber
2010-10-13 7:56 ` [FEATURE REQUEST] allow enabling patience diff algorithm by default Marat Radchenko
1 sibling, 1 reply; 7+ messages in thread
From: Michael J Gruber @ 2010-07-13 8:12 UTC (permalink / raw)
To: Marat Radchenko; +Cc: git
Marat Radchenko venit, vidit, dixit 13.07.2010 08:56:
> Hi.
>
> My setup:
> 0. Quad-core machine with 8GB of RAM, 10K RPM HDD.
> 1. SVN repo that I periodically fetch into the origin/trunk branch. It gets ~200
> commits/day.
> 2. My local branch with 1-5 commits, which I often rebase against trunk.
> 3. I haven't rebased for 2 days, so I'm rebasing 3 (three) commits in my branch
> over 453 commits in trunk using "git rebase trunk".
> 4. trunk contains files that are "bad" from a diff point of view (big and binary).
> 5. Sadly, the data in the repo is confidential.
>
> Expected: rebase takes some reasonable amount of time (< 1 min?).
>
> Actual: rebase takes 20 mins.
>
> Almost all of that time was spent running
> `git format-patch -k --stdout --full-index --ignore-if-in-upstream 80bb0dfe3d86f3cc9095ea616d9d1b1530fbe7b8..d3fde4ae7497981a6fe61b0366b105477896cf52`
> (that's three commits from my branch) at 100% of one CPU core.
>
> Additional info:
>
> Another similar rebase, but over 4.5k commits, took 2 hours.
>
> Running without --ignore-if-in-upstream:
> $ time git format-patch -k --stdout --full-index 80bb0dfe3d86f3cc9095ea616d9d1b1530fbe7b8..d3fde4ae7497981a6fe61b0366b105477896cf52 | wc -l
> 25823
> real 0m0.163s
> user 0m0.140s
> sys 0m0.020s
>
> Proof there are only three commits:
>
> $ git rev-list 80bb0dfe3d86f3cc9095ea616d9d1b1530fbe7b8..d3fde4ae7497981a6fe61b0366b105477896cf52
> d3fde4ae7497981a6fe61b0366b105477896cf52
> e18069258806bda6a6165822003f5e9fd958f906
> c8c2f2e157e615b73d0baab1d793a22991c9ba71
>
> Questions:
> 1. Is this expected behavior (the branch you rebase onto has binary files -> no
> performance for you)?
Well, with "ignore-if-in-upstream" git has to compute a patch-id for
every upstream patch (merge-base..upstream) and compare to the ids of
the commits in mb..HEAD.
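The comparison described above can be sketched with stock commands (`git rev-list`, `git diff-tree`, `git patch-id`). This is only an illustration of the idea, not how format-patch implements it internally, and the helper names `patch_ids` and `already_upstream` are made up for this example:

```shell
# Hypothetical helper: print "<patch-id> <commit>" for every non-merge
# commit in the given revision range.
patch_ids () {
	git rev-list --no-merges "$1" |
	while read -r commit
	do
		git diff-tree -p "$commit" | git patch-id
	done
}

# Hypothetical helper: print the commits in merge-base..HEAD whose patch-id
# also occurs in merge-base..<upstream> ($1 names the upstream branch).
already_upstream () {
	mb=$(git merge-base "$1" HEAD) || return 1
	# The expensive part: one full diff per upstream commit.
	upstream_ids=$(patch_ids "$mb..$1" | cut -d' ' -f1)
	patch_ids "$mb..HEAD" |
	while read -r id commit
	do
		if printf '%s\n' "$upstream_ids" | grep -qFx "$id"
		then
			echo "$commit"
		fi
	done
}
```

Running `already_upstream trunk` on the topic branch lists the commits that would be skipped. `git cherry -v trunk` gives a similar view: commits marked `-` already have an equivalent upstream. With 453 upstream commits, the first `patch_ids` call has to produce 453 diffs, large binary files included.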
> 2. If [1] is yes, is it possible to prevent rebase from running
> --ignore-if-in-upstream?
Not currently, but it will be with my upcoming patch ;)
This has the (side-) effect of not ignoring patches which have been
applied (with different sha1) upstream, of course.
> 3. If [1] is no, should I run some kind of profiler (how?) to determine what
> exactly causes such a performance drop?
It is the calculation of the patch-ids. Git first creates a "binary
diff" and then computes the patch-id (sha1) of that diff. I am sure we
could optimize the calculation of patch-ids for binary diffs, which may
be useful in addition to shutting off "cherry" with rebase.
Michael
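To see which upstream commits are likely to dominate that cost, one rough approach is to list the size of each commit's binary patch. This is a sketch under the assumption that patch size is a usable proxy for patch-id cost; it does not measure format-patch itself, and `patch_sizes` is a hypothetical helper name:

```shell
# Hypothetical helper: print "<bytes> <commit>" for each non-merge commit in
# the given range, largest patch first. The byte count includes the commit-id
# line that diff-tree prints before the patch.
patch_sizes () {
	git rev-list --no-merges "$1" |
	while read -r commit
	do
		size=$(git diff-tree -p --binary "$commit" | wc -c | tr -d ' ')
		echo "$size $commit"
	done |
	sort -rn
}
```

For example, `patch_sizes "$(git merge-base trunk HEAD)..trunk" | head` would show the ten largest upstream patches in the range being rebased over.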
* [RFC/PATCH] rebase: Allow to turn of ignore-if-in-upstream
2010-07-13 8:12 ` Michael J Gruber
@ 2010-07-13 8:13 ` Michael J Gruber
2010-07-13 19:33 ` Erik Faye-Lund
0 siblings, 1 reply; 7+ messages in thread
From: Michael J Gruber @ 2010-07-13 8:13 UTC (permalink / raw)
To: git; +Cc: Marat Radchenko
git-rebase uses "format-patch --ignore-if-in-upstream" to determine
which commits to apply. This may or may not be desired: a user may want
to transplant all commits, or may opt to avoid the possibly time
consuming calculation of patch-ids.
Therefore, introduce rebase.cherry (defaulting to true) and --cherry and
--no-cherry options (to override the config), where --cherry means the
current behavior and --no-cherry avoids "--ignore-if-in-upstream".
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
---
RFC for obvious reasons (doc, tests).
git-rebase.sh | 16 +++++++++++++++-
1 files changed, 15 insertions(+), 1 deletions(-)
diff --git a/git-rebase.sh b/git-rebase.sh
index ab4afa7..1eb6ad1 100755
--- a/git-rebase.sh
+++ b/git-rebase.sh
@@ -53,6 +53,7 @@ git_am_opt=
rebase_root=
force_rebase=
allow_rerere_autoupdate=
+cherry=$(git config --bool rebase.cherry)
continue_merge () {
test -n "$prev_head" || die "prev_head must be defined"
@@ -307,6 +308,12 @@ do
esac
do_merge=t
;;
+ --cherry)
+ cherry=true
+ ;;
+ --no-cherry)
+ cherry=false
+ ;;
-n|--no-stat)
diffstat=
;;
@@ -540,9 +547,16 @@ else
revisions="$upstream..$orig_head"
fi
+if test "x$cherry" = "xfalse"
+then
+ cherry_opt=""
+else
+ cherry_opt="--ignore-if-in-upstream"
+fi
+
if test -z "$do_merge"
then
- git format-patch -k --stdout --full-index --ignore-if-in-upstream \
+ git format-patch -k --stdout --full-index $cherry_opt \
$root_flag "$revisions" |
git am $git_am_opt --rebasing --resolvemsg="$RESOLVEMSG" &&
move_to_original_branch
--
1.7.2.rc1.212.g850a
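The option handling in the patch above can be restated as a stand-alone function for experimenting outside git-rebase.sh. This is only a sketch: `resolve_cherry_opt` is a made-up name, and the function takes the config value as its first argument instead of calling `git config --bool rebase.cherry` itself:

```shell
# Stand-alone restatement of the patch's logic (hypothetical helper).
# $1 is the value of rebase.cherry ("true", "false", or empty when unset);
# the remaining arguments are command-line options, where the last of
# --cherry / --no-cherry wins.
resolve_cherry_opt () {
	cherry=$1
	shift
	for arg
	do
		case "$arg" in
		--cherry)    cherry=true ;;
		--no-cherry) cherry=false ;;
		esac
	done
	if test "x$cherry" = "xfalse"
	then
		echo ""
	else
		echo "--ignore-if-in-upstream"
	fi
}
```

An unset rebase.cherry behaves like "true", so existing setups keep the current behavior. Assuming the RFC patch is applied, `git config rebase.cherry false` would make skipping the patch-id comparison the default, and `git rebase --cherry trunk` would re-enable it for one run.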
* Re: [RFC/PATCH] rebase: Allow to turn of ignore-if-in-upstream
2010-07-13 8:13 ` [RFC/PATCH] rebase: Allow to turn of ignore-if-in-upstream Michael J Gruber
@ 2010-07-13 19:33 ` Erik Faye-Lund
2010-09-04 15:03 ` Michael J Gruber
0 siblings, 1 reply; 7+ messages in thread
From: Erik Faye-Lund @ 2010-07-13 19:33 UTC (permalink / raw)
To: Michael J Gruber; +Cc: git, Marat Radchenko
s/of/off/ in the subject ;)
On Tue, Jul 13, 2010 at 10:13 AM, Michael J Gruber
<git@drmicha.warpmail.net> wrote:
> git-rebase uses "format-patch --ignore-if-in-upstream" to determine
> which commits to apply. This may or may not be desired: a user may want
> to transplant all commits, or may opt to avoid the possibly time
> consuming calculation of patch-ids.
>
> Therefore, introduce rebase.cherry (defaulting to true) and --cherry and
> --no-cherry options (to override the config), where --cherry means the
> current behavior and --no-cherry avoids "--ignore-if-in-upstream".
>
> Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
> ---
> RFC for obvious reasons (doc, tests).
--
Erik "kusma" Faye-Lund
* Re: [RFC/PATCH] rebase: Allow to turn of ignore-if-in-upstream
2010-07-13 19:33 ` Erik Faye-Lund
@ 2010-09-04 15:03 ` Michael J Gruber
2010-09-09 8:05 ` Marat Radchenko
0 siblings, 1 reply; 7+ messages in thread
From: Michael J Gruber @ 2010-09-04 15:03 UTC (permalink / raw)
To: kusmabite; +Cc: Erik Faye-Lund, git, Marat Radchenko, Junio C Hamano
Erik Faye-Lund venit, vidit, dixit 13.07.2010 21:33:
> s/of/off/ in the subject ;)
>
> On Tue, Jul 13, 2010 at 10:13 AM, Michael J Gruber
> <git@drmicha.warpmail.net> wrote:
>> git-rebase uses "format-patch --ignore-if-in-upstream" to determine
>> which commits to apply. This may or may not be desired: a user may want
>> to transplant all commits, or may opt to avoid the possibly time
>> consuming calculation of patch-ids.
>>
>> Therefore, introduce rebase.cherry (defaulting to true) and --cherry and
>> --no-cherry options (to override the config), where --cherry means the
>> current behavior and --no-cherry avoids "--ignore-if-in-upstream".
>>
>> Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
>> ---
>> RFC for obvious reasons (doc, tests).
>
Pinging this one. Is there any interest? Erik is right, off course ;)
Michael
* Re: [RFC/PATCH] rebase: Allow to turn of ignore-if-in-upstream
2010-09-04 15:03 ` Michael J Gruber
@ 2010-09-09 8:05 ` Marat Radchenko
0 siblings, 0 replies; 7+ messages in thread
From: Marat Radchenko @ 2010-09-09 8:05 UTC (permalink / raw)
To: Michael J Gruber, kusmabite; +Cc: Erik Faye-Lund, git, Junio C Hamano
> Pinging this one. Is there any interest? Erik is right, off course ;)
There definitely is. Since [1], rebasing has become much faster (minutes instead of tens of minutes), though it still takes longer than I'd like.
[1]: http://repo.or.cz/w/git.git/commit/34597c1f5a77c710dae33092cb8a7cb01c6b21c1
* [FEATURE REQUEST] allow enabling patience diff algorithm by default
2010-07-13 6:56 VERY slow git format-patch (tens on minutes) during rebase and rev-list during rebase -i Marat Radchenko
2010-07-13 8:12 ` Michael J Gruber
@ 2010-10-13 7:56 ` Marat Radchenko
1 sibling, 0 replies; 7+ messages in thread
From: Marat Radchenko @ 2010-10-13 7:56 UTC (permalink / raw)
To: git
I observe the patience algorithm being several times faster than the standard
diff on some big (1MB<size<10MB) text files (and it actually produces smaller
diffs). So using patience diff is likely to improve git-rev-list
performance.
Suggested way: add an option to ~/.gitconfig to enable patience diff by
default. Additionally, something like --no-patience may be added to commands
that accept --patience now, so that the setting can be overridden if needed.
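For readers on newer Git versions: a `diff.algorithm` configuration variable and a `--diff-algorithm=<name>` option were added after this message was written (in the 1.8.2 release, to the best of my knowledge), which covers this request. A minimal sketch, with hypothetical helper names, for comparing algorithms on a given pair of revisions and making patience the default:

```shell
# Hypothetical helper: number of lines in the diff between two revisions
# when produced with the named algorithm (myers, minimal, patience, histogram).
diff_line_count () {
	git diff --diff-algorithm="$1" "$2" "$3" | wc -l | tr -d ' '
}

# Hypothetical helper: make patience the default for the current repository
# only. Passing --global to git config would put it in ~/.gitconfig instead.
use_patience_by_default () {
	git config diff.algorithm patience
}
```

For example, `diff_line_count myers HEAD~1 HEAD` and `diff_line_count patience HEAD~1 HEAD` show whether patience really produces the smaller diff for a given commit. Once the default is changed, a single command can still be overridden with `git diff --diff-algorithm=myers`.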