* [BUG] Performance regression due to #33d4221: write_sha1_file: freshen existing objects @ 2015-04-17 7:30 Stefan Saasen 2015-04-17 14:03 ` Jeff King 0 siblings, 1 reply; 19+ messages in thread From: Stefan Saasen @ 2015-04-17 7:30 UTC (permalink / raw) To: Git Mailing List, Jeff King We became aware of slow merge times with the following setup: The merge is created in a temporary location that uses alternates. The temporary repository is on a local disk, the alternate object database on an NFS mount. After some investigation we believe that #33d4221 (present in git 2.2.0, absent in 2.1.4) is causing this regression in merge time. The following are merge times (in seconds) with git@33d4221~ (2.1.2-393-gabcb865) (before the change) Elapsed System User Min. :0.3700 Min. :0.04000 Min. :0.3000 1st Qu.:0.3800 1st Qu.:0.05000 1st Qu.:0.3100 Median :0.4000 Median :0.06000 Median :0.3300 Mean :0.4295 Mean :0.05905 Mean :0.3519 3rd Qu.:0.4600 3rd Qu.:0.07000 3rd Qu.:0.3600 Max. :0.5900 Max. :0.09000 Max. :0.4900 The following are merge times with git@33d4221 (2.1.2-394-g33d4221): Elapsed System User Min. : 8.58 Min. :1.46 Min. :0.4400 1st Qu.: 9.63 1st Qu.:1.60 1st Qu.:0.4400 Median :10.64 Median :1.66 Median :0.4800 Mean :10.50 Mean :1.69 Mean :0.4986 3rd Qu.:11.13 3rd Qu.:1.81 3rd Qu.:0.5000 Max. :13.96 Max. :2.05 Max. :0.6700 As you can see the merge times are an order of magnitude slower after the change. The effect of #33d4221 can be seen when strace'ing the merge: Running strace on git@33d4221 yields % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 70.79 0.714852 178 4018 utime 14.73 0.148789 3 50141 50123 lstat 13.88 0.140198 17 8074 8067 access 0.24 0.002455 614 4 rename 0.15 0.001493 3 577 write 0.06 0.000618 10 65 close 0.04 0.000453 3 152 brk 0.04 0.000433 27 16 mkdir 0.03 0.000310 8 41 fstat Running strace on git@33d4221~ yields % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 98.37 0.138516 3 50141 50123 lstat 0.92 0.001292 2 577 write 0.37 0.000520 14 38 31 access 0.18 0.000252 36 7 getcwd 0.17 0.000237 7 36 20 stat 0.00 0.000000 0 40 read 0.00 0.000000 0 87 30 open My current hypothesis is that the additional `access`, but more importantly the additional `utime` calls are responsible in the increased merge times that we see. NFS stats on the server for the tests seem to confirm this (see nfsstat-{after,before}-change.txt on https://bitbucket.org/snippets/ssaasen/oend). This is certainly due to the fact that this will all happen over NFS but in 2.1.4 this worked fine and starting with 2.2 this has become very slow. Looking at the detailed strace shows that utime will be called repeatedly in same cases (e.g. https://bitbucket.org/snippets/ssaasen/oend shows an example where the same packfile will be updated more than 4000 times in a single merge). http://www.spinics.net/lists/git/msg240106.html discusses a potential improvement for this case. Would that be an acceptable avenue to improve this situation? Best regards, Stefan Saasen ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [BUG] Performance regression due to #33d4221: write_sha1_file: freshen existing objects 2015-04-17 7:30 [BUG] Performance regression due to #33d4221: write_sha1_file: freshen existing objects Stefan Saasen @ 2015-04-17 14:03 ` Jeff King 2015-04-17 15:50 ` Junio C Hamano 2015-04-18 3:35 ` Stefan Saasen 0 siblings, 2 replies; 19+ messages in thread From: Jeff King @ 2015-04-17 14:03 UTC (permalink / raw) To: Stefan Saasen; +Cc: Git Mailing List On Fri, Apr 17, 2015 at 05:30:22PM +1000, Stefan Saasen wrote: > The merge is created in a temporary location that uses alternates. The > temporary repository is on a local disk, the alternate object database > on an NFS mount. Is the alternate writeable? If we can't freshen the object, we fall back to storing the object locally, which could have a performance impact. But it looks from your tables below like the utime() call is succeeding, so that is probably not what is happening here. > My current hypothesis is that the additional `access`, but more > importantly the additional `utime` calls are responsible in the > increased merge times that we see. Yeah, that makes sense from your tables. The commit in question flips the order of the loose/packed check, and the packed check should be much faster on your NFS mount. Can you try: diff --git a/sha1_file.c b/sha1_file.c index 88f06ba..822aaef 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -3014,7 +3014,7 @@ int write_sha1_file(const void *buf, unsigned long len, const char *type, unsign write_sha1_file_prepare(buf, len, type, sha1, hdr, &hdrlen); if (returnsha1) hashcpy(returnsha1, sha1); - if (freshen_loose_object(sha1) || freshen_packed_object(sha1)) + if (freshen_packed_object(sha1) || freshen_loose_object(sha1)) return 0; return write_loose_object(sha1, hdr, hdrlen, buf, len, 0); } I think that should clear up the access() calls, but leave the utime() ones. > Looking at the detailed strace shows that utime will be called > repeatedly in same cases (e.g. > https://bitbucket.org/snippets/ssaasen/oend shows an example where the > same packfile will be updated more than 4000 times in a single merge). > > http://www.spinics.net/lists/git/msg240106.html discusses a potential > improvement for this case. Would that be an acceptable avenue to > improve this situation? I think so. Here's a tentative patch: diff --git a/cache.h b/cache.h index 3d3244b..72c6888 100644 --- a/cache.h +++ b/cache.h @@ -1174,6 +1174,7 @@ extern struct packed_git { int pack_fd; unsigned pack_local:1, pack_keep:1, + freshened:1, do_not_close:1; unsigned char sha1[20]; /* something like ".git/objects/pack/xxxxx.pack" */ diff --git a/sha1_file.c b/sha1_file.c index 822aaef..f27cbf1 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -2999,7 +2999,11 @@ static int freshen_loose_object(const unsigned char *sha1) static int freshen_packed_object(const unsigned char *sha1) { struct pack_entry e; - return find_pack_entry(sha1, &e) && freshen_file(e.p->pack_name); + if (!find_pack_entry(sha1, &e)) + return 0; + if (e.p->freshened) + return 1; + return freshen_file(e.p->pack_name); } int write_sha1_file(const void *buf, unsigned long len, const char *type, unsigned char *returnsha1) If it's not a problem, I'd love to see timings for your case with just the first patch, and then with both. You may also be interested in: http://thread.gmane.org/gmane.comp.version-control.git/266370 which addresses another performance problem related to the freshen/recent code in v2.2. -Peff ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [BUG] Performance regression due to #33d4221: write_sha1_file: freshen existing objects 2015-04-17 14:03 ` Jeff King @ 2015-04-17 15:50 ` Junio C Hamano 2015-04-18 3:35 ` Stefan Saasen 1 sibling, 0 replies; 19+ messages in thread From: Junio C Hamano @ 2015-04-17 15:50 UTC (permalink / raw) To: Jeff King; +Cc: Stefan Saasen, Git Mailing List Jeff King <peff@peff.net> writes: > If it's not a problem, I'd love to see timings for your case with just > the first patch, and then with both. Thanks for two quick progress patches. > You may also be interested in: > > http://thread.gmane.org/gmane.comp.version-control.git/266370 > > which addresses another performance problem related to the > freshen/recent code in v2.2. That, too. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [BUG] Performance regression due to #33d4221: write_sha1_file: freshen existing objects 2015-04-17 14:03 ` Jeff King 2015-04-17 15:50 ` Junio C Hamano @ 2015-04-18 3:35 ` Stefan Saasen 2015-04-20 19:53 ` Jeff King 1 sibling, 1 reply; 19+ messages in thread From: Stefan Saasen @ 2015-04-18 3:35 UTC (permalink / raw) To: Jeff King; +Cc: Git Mailing List > If it's not a problem, I'd love to see timings for your case with just > the first patch, and then with both. Thanks for the swift response, much appreciated Jeff! Here are the timings for the two patches: Patch 1 on top of 33d4221c79 Elapsed System User Min. :6.110 Min. :0.6700 Min. :0.3600 1st Qu.:6.580 1st Qu.:0.6900 1st Qu.:0.3900 Median :7.260 Median :0.7100 Median :0.4100 Mean :7.347 Mean :0.7248 Mean :0.4214 3rd Qu.:8.000 3rd Qu.:0.7400 3rd Qu.:0.4600 Max. :8.860 Max. :0.8700 Max. :0.5100 I've had to slightly tweak your second patch (`freshened` was never set) but applying the modified patch yielded even better results compared to patch 1: Elapsed System User Min. :0.38 Min. :0.03000 Min. :0.2900 1st Qu.:0.38 1st Qu.:0.04000 1st Qu.:0.3100 Median :0.39 Median :0.06000 Median :0.3200 Mean :0.43 Mean :0.05667 Mean :0.3519 3rd Qu.:0.42 3rd Qu.:0.07000 3rd Qu.:0.3600 Max. :0.68 Max. :0.08000 Max. :0.5700 This is pretty much back to the "before" state. The graph really tells the whole story: https://bytebucket.org/snippets/ssaasen/GeRE/raw/7367353a58c50ccd7c493af40ffb6ba1533e1490/git-merge-timing-patched.png (After is the change in #33d4221, Before the parent of #33d4221 and so on) The graph and the NFS stats can be found here: https://bitbucket.org/snippets/ssaasen/GeRE My tweaked version of your second patch is: diff --git a/cache.h b/cache.h index 51ee856..8982055 100644 --- a/cache.h +++ b/cache.h @@ -1168,6 +1168,7 @@ extern struct packed_git { int pack_fd; unsigned pack_local:1, pack_keep:1, + freshened:1, do_not_close:1; unsigned char sha1[20]; /* something like ".git/objects/pack/xxxxx.pack" */ diff --git a/sha1_file.c b/sha1_file.c index bc6322e..c0ccd4b 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -2999,7 +2999,11 @@ static int freshen_loose_object(const unsigned char *sha1) static int freshen_packed_object(const unsigned char *sha1) { struct pack_entry e; - return find_pack_entry(sha1, &e) && freshen_file(e.p->pack_name); + if (!find_pack_entry(sha1, &e)) + return 0; + if (e.p->freshened) + return 1; + return e.p->freshened = freshen_file(e.p->pack_name); } int write_sha1_file(const void *buf, unsigned long len, const char *type, unsigned char *returnsha1) The only change is that I assign the result of `freshen_file` to the `freshened` flag. I've only ran this with the test case I was using before but it looks like this is pretty much fixing the merge time changes we observed. Thanks again for the swift response. I've got my test setup sitting here, happy to rerun the tests if that'd be useful. Is there a chance to backport those changes to the 2.2+ branches? > You may also be interested in: > > http://thread.gmane.org/gmane.comp.version-control.git/266370 > > which addresses another performance problem related to the > freshen/recent code in v2.2. Thanks for the pointer, I'll have a look at that as well. Cheers, Stefan ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [BUG] Performance regression due to #33d4221: write_sha1_file: freshen existing objects 2015-04-18 3:35 ` Stefan Saasen @ 2015-04-20 19:53 ` Jeff King 2015-04-20 19:54 ` [PATCH 1/2] sha1_file: freshen pack objects before loose Jeff King ` (2 more replies) 0 siblings, 3 replies; 19+ messages in thread From: Jeff King @ 2015-04-20 19:53 UTC (permalink / raw) To: Stefan Saasen; +Cc: Junio C Hamano, Git Mailing List On Sat, Apr 18, 2015 at 01:35:51PM +1000, Stefan Saasen wrote: > Here are the timings for the two patches: > [...] Thanks, that matches what I was hoping for. > My tweaked version of your second patch is: > [...] > - return find_pack_entry(sha1, &e) && freshen_file(e.p->pack_name); > + if (!find_pack_entry(sha1, &e)) > + return 0; > + if (e.p->freshened) > + return 1; > + return e.p->freshened = freshen_file(e.p->pack_name); > } Whooops, yeah, setting the flag is probably helpful. :) We usually try to avoid assignments in a return like this, so I've written it out a little more verbosely in my final version. I'll send those patches in a moment. [1/2]: sha1_file: freshen pack objects before loose [2/2]: sha1_file: only freshen packs once per run > Is there a chance to backport those changes to the 2.2+ branches? That's up to Junio. These patches can be applied straight to the jk/prune-mtime topic. Usually he would then merge the topic up to "maint", which at this would potentially become the next v2.3.x. If an issue is critical (e.g., a security vulnerability), he'll sometimes merge and roll maintenance releases for older versions. But I don't know if this counts as critical (it is for you, certainly, but I don't think that many people are affected, as the crucial factor here is really the slow NFS filesystem operations). -Peff ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH 1/2] sha1_file: freshen pack objects before loose 2015-04-20 19:53 ` Jeff King @ 2015-04-20 19:54 ` Jeff King 2015-04-21 0:46 ` Stefan Saasen 2015-04-20 19:55 ` [PATCH 2/2] sha1_file: only freshen packs once per run Jeff King 2015-04-20 20:04 ` [BUG] Performance regression due to #33d4221: write_sha1_file: freshen existing objects Junio C Hamano 2 siblings, 1 reply; 19+ messages in thread From: Jeff King @ 2015-04-20 19:54 UTC (permalink / raw) To: Stefan Saasen; +Cc: Junio C Hamano, Git Mailing List When writing out an object file, we first check whether it already exists and if so optimize out the write. Prior to 33d4221, we did this by calling has_sha1_file(), which will check for packed objects followed by loose. Since that commit, we check loose objects first. For the common case of a repository whose objects are mostly packed, this means we will make a lot of extra access() system calls checking for loose objects. We should follow the same packed-then-loose order that all of our other lookups use. Reported-by: Stefan Saasen <ssaasen@atlassian.com> Signed-off-by: Jeff King <peff@peff.net> --- sha1_file.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sha1_file.c b/sha1_file.c index 88f06ba..822aaef 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -3014,7 +3014,7 @@ int write_sha1_file(const void *buf, unsigned long len, const char *type, unsign write_sha1_file_prepare(buf, len, type, sha1, hdr, &hdrlen); if (returnsha1) hashcpy(returnsha1, sha1); - if (freshen_loose_object(sha1) || freshen_packed_object(sha1)) + if (freshen_packed_object(sha1) || freshen_loose_object(sha1)) return 0; return write_loose_object(sha1, hdr, hdrlen, buf, len, 0); } -- 2.4.0.rc2.384.g7297a4a ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH 1/2] sha1_file: freshen pack objects before loose 2015-04-20 19:54 ` [PATCH 1/2] sha1_file: freshen pack objects before loose Jeff King @ 2015-04-21 0:46 ` Stefan Saasen 0 siblings, 0 replies; 19+ messages in thread From: Stefan Saasen @ 2015-04-21 0:46 UTC (permalink / raw) To: Jeff King; +Cc: Junio C Hamano, Git Mailing List I didn't expect anything else (as the patch is the same as the previous one) but I verified that applying this patch has the desired effect (https://bitbucket.org/snippets/ssaasen/9AXg). Thanks for the fix Jeff. On 21 April 2015 at 05:54, Jeff King <peff@peff.net> wrote: > When writing out an object file, we first check whether it > already exists and if so optimize out the write. Prior to > 33d4221, we did this by calling has_sha1_file(), which will > check for packed objects followed by loose. Since that > commit, we check loose objects first. > > For the common case of a repository whose objects are mostly > packed, this means we will make a lot of extra access() > system calls checking for loose objects. We should follow > the same packed-then-loose order that all of our other > lookups use. > > Reported-by: Stefan Saasen <ssaasen@atlassian.com> > Signed-off-by: Jeff King <peff@peff.net> > --- > sha1_file.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/sha1_file.c b/sha1_file.c > index 88f06ba..822aaef 100644 > --- a/sha1_file.c > +++ b/sha1_file.c > @@ -3014,7 +3014,7 @@ int write_sha1_file(const void *buf, unsigned long len, const char *type, unsign > write_sha1_file_prepare(buf, len, type, sha1, hdr, &hdrlen); > if (returnsha1) > hashcpy(returnsha1, sha1); > - if (freshen_loose_object(sha1) || freshen_packed_object(sha1)) > + if (freshen_packed_object(sha1) || freshen_loose_object(sha1)) > return 0; > return write_loose_object(sha1, hdr, hdrlen, buf, len, 0); > } > -- > 2.4.0.rc2.384.g7297a4a > ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH 2/2] sha1_file: only freshen packs once per run 2015-04-20 19:53 ` Jeff King 2015-04-20 19:54 ` [PATCH 1/2] sha1_file: freshen pack objects before loose Jeff King @ 2015-04-20 19:55 ` Jeff King 2015-04-21 0:45 ` Stefan Saasen 2015-04-20 20:04 ` [BUG] Performance regression due to #33d4221: write_sha1_file: freshen existing objects Junio C Hamano 2 siblings, 1 reply; 19+ messages in thread From: Jeff King @ 2015-04-20 19:55 UTC (permalink / raw) To: Stefan Saasen; +Cc: Junio C Hamano, Git Mailing List Since 33d4221 (write_sha1_file: freshen existing objects, 2014-10-15), we update the mtime of existing objects that we would have written out (had they not existed). For the common case in which many objects are packed, we may update the mtime on a single packfile repeatedly. This can result in a noticeable performance problem if calling utime() is expensive (e.g., because your storage is on NFS). We can fix this by keeping a per-pack flag that lets us freshen only once per program invocation. An alternative would be to keep the packed_git.mtime flag up to date as we freshen, and freshen only once every N seconds. In practice, it's not worth the complexity. We are racing against prune expiration times here, which inherently must be set to accomodate reasonable program running times (because they really care about the time between an object being written and it becoming referenced, and the latter is typically the last step a program takes). Signed-off-by: Jeff King <peff@peff.net> --- Hopefully I didn't botch the flag logic again. :) I tested with "strace -c" myself this time, so I think it is good. cache.h | 1 + sha1_file.c | 9 ++++++++- 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/cache.h b/cache.h index 3d3244b..72c6888 100644 --- a/cache.h +++ b/cache.h @@ -1174,6 +1174,7 @@ extern struct packed_git { int pack_fd; unsigned pack_local:1, pack_keep:1, + freshened:1, do_not_close:1; unsigned char sha1[20]; /* something like ".git/objects/pack/xxxxx.pack" */ diff --git a/sha1_file.c b/sha1_file.c index 822aaef..26b9b2b 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -2999,7 +2999,14 @@ static int freshen_loose_object(const unsigned char *sha1) static int freshen_packed_object(const unsigned char *sha1) { struct pack_entry e; - return find_pack_entry(sha1, &e) && freshen_file(e.p->pack_name); + if (!find_pack_entry(sha1, &e)) + return 0; + if (e.p->freshened) + return 1; + if (!freshen_file(e.p->pack_name)) + return 0; + e.p->freshened = 1; + return 1; } int write_sha1_file(const void *buf, unsigned long len, const char *type, unsigned char *returnsha1) -- 2.4.0.rc2.384.g7297a4a ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH 2/2] sha1_file: only freshen packs once per run 2015-04-20 19:55 ` [PATCH 2/2] sha1_file: only freshen packs once per run Jeff King @ 2015-04-21 0:45 ` Stefan Saasen 0 siblings, 0 replies; 19+ messages in thread From: Stefan Saasen @ 2015-04-21 0:45 UTC (permalink / raw) To: Jeff King; +Cc: Junio C Hamano, Git Mailing List I can confirm that this patch is equivalent to the previous one. https://bitbucket.org/snippets/ssaasen/9AXg shows both the timing and the NFS stats showing the effect of applying this patch. Thanks for the fix Jeff! Cheers, Stefan On 21 April 2015 at 05:55, Jeff King <peff@peff.net> wrote: > Since 33d4221 (write_sha1_file: freshen existing objects, > 2014-10-15), we update the mtime of existing objects that we > would have written out (had they not existed). For the > common case in which many objects are packed, we may update > the mtime on a single packfile repeatedly. This can result > in a noticeable performance problem if calling utime() is > expensive (e.g., because your storage is on NFS). > > We can fix this by keeping a per-pack flag that lets us > freshen only once per program invocation. > > An alternative would be to keep the packed_git.mtime flag up > to date as we freshen, and freshen only once every N > seconds. In practice, it's not worth the complexity. We are > racing against prune expiration times here, which inherently > must be set to accomodate reasonable program running times > (because they really care about the time between an object > being written and it becoming referenced, and the latter is > typically the last step a program takes). > > Signed-off-by: Jeff King <peff@peff.net> > --- > Hopefully I didn't botch the flag logic again. :) I tested with "strace > -c" myself this time, so I think it is good. > > cache.h | 1 + > sha1_file.c | 9 ++++++++- > 2 files changed, 9 insertions(+), 1 deletion(-) > > diff --git a/cache.h b/cache.h > index 3d3244b..72c6888 100644 > --- a/cache.h > +++ b/cache.h > @@ -1174,6 +1174,7 @@ extern struct packed_git { > int pack_fd; > unsigned pack_local:1, > pack_keep:1, > + freshened:1, > do_not_close:1; > unsigned char sha1[20]; > /* something like ".git/objects/pack/xxxxx.pack" */ > diff --git a/sha1_file.c b/sha1_file.c > index 822aaef..26b9b2b 100644 > --- a/sha1_file.c > +++ b/sha1_file.c > @@ -2999,7 +2999,14 @@ static int freshen_loose_object(const unsigned char *sha1) > static int freshen_packed_object(const unsigned char *sha1) > { > struct pack_entry e; > - return find_pack_entry(sha1, &e) && freshen_file(e.p->pack_name); > + if (!find_pack_entry(sha1, &e)) > + return 0; > + if (e.p->freshened) > + return 1; > + if (!freshen_file(e.p->pack_name)) > + return 0; > + e.p->freshened = 1; > + return 1; > } > > int write_sha1_file(const void *buf, unsigned long len, const char *type, unsigned char *returnsha1) > -- > 2.4.0.rc2.384.g7297a4a ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [BUG] Performance regression due to #33d4221: write_sha1_file: freshen existing objects 2015-04-20 19:53 ` Jeff King 2015-04-20 19:54 ` [PATCH 1/2] sha1_file: freshen pack objects before loose Jeff King 2015-04-20 19:55 ` [PATCH 2/2] sha1_file: only freshen packs once per run Jeff King @ 2015-04-20 20:04 ` Junio C Hamano 2015-04-20 20:09 ` Jeff King 2 siblings, 1 reply; 19+ messages in thread From: Junio C Hamano @ 2015-04-20 20:04 UTC (permalink / raw) To: Jeff King; +Cc: Stefan Saasen, Git Mailing List Jeff King <peff@peff.net> writes: > ... But I don't know > if this counts as critical (it is for you, certainly, but I don't think > that many people are affected, as the crucial factor here is really the > slow NFS filesystem operations). If it is critical to some people, they can downmerge to their custom old installations of Git they maintain with ease, of course, and that "with ease" part is the reason why I try to apply fixes to tip of the original topic branch even though they were merged to the mainline eons ago ;-). Thanks. The patches look good from cursory reading. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [BUG] Performance regression due to #33d4221: write_sha1_file: freshen existing objects 2015-04-20 20:04 ` [BUG] Performance regression due to #33d4221: write_sha1_file: freshen existing objects Junio C Hamano @ 2015-04-20 20:09 ` Jeff King 2015-04-20 20:12 ` Junio C Hamano 2015-04-21 1:49 ` Stefan Saasen 0 siblings, 2 replies; 19+ messages in thread From: Jeff King @ 2015-04-20 20:09 UTC (permalink / raw) To: Junio C Hamano; +Cc: Stefan Saasen, Git Mailing List On Mon, Apr 20, 2015 at 01:04:11PM -0700, Junio C Hamano wrote: > Jeff King <peff@peff.net> writes: > > > ... But I don't know > > if this counts as critical (it is for you, certainly, but I don't think > > that many people are affected, as the crucial factor here is really the > > slow NFS filesystem operations). > > If it is critical to some people, they can downmerge to their custom > old installations of Git they maintain with ease, of course, and > that "with ease" part is the reason why I try to apply fixes to tip > of the original topic branch even though they were merged to the > mainline eons ago ;-). I think it is a bigger deal for folks who do not ship a custom installation, but expect to ship a third-party system that interacts with whatever version of git their customers happen to have (in which case they can only recommend their customers to upgrade). I don't know how Stash or GitLab installations work. GitHub ships our own custom git (which I maintain), though we are already on 2.3.x. Either way, though, I do not think it is the upstream Git project's problem. -Peff ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [BUG] Performance regression due to #33d4221: write_sha1_file: freshen existing objects 2015-04-20 20:09 ` Jeff King @ 2015-04-20 20:12 ` Junio C Hamano 2015-04-20 20:28 ` Jeff King 2015-04-21 1:49 ` Stefan Saasen 1 sibling, 1 reply; 19+ messages in thread From: Junio C Hamano @ 2015-04-20 20:12 UTC (permalink / raw) To: Jeff King; +Cc: Stefan Saasen, Git Mailing List Jeff King <peff@peff.net> writes: > Either way, though, I do not think it is the upstream Git project's > problem. The commit to pick where to queue the fixes actually is my problem, as I have this illusion that I'd be helping these derived works by making it easier for them to merge, not cherry-pick. But I would imagine that they may go the cherry-pick route anyway, in which case I may be wasting my time worrying about them X-<. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [BUG] Performance regression due to #33d4221: write_sha1_file: freshen existing objects 2015-04-20 20:12 ` Junio C Hamano @ 2015-04-20 20:28 ` Jeff King 0 siblings, 0 replies; 19+ messages in thread From: Jeff King @ 2015-04-20 20:28 UTC (permalink / raw) To: Junio C Hamano; +Cc: Stefan Saasen, Git Mailing List On Mon, Apr 20, 2015 at 01:12:54PM -0700, Junio C Hamano wrote: > Jeff King <peff@peff.net> writes: > > > Either way, though, I do not think it is the upstream Git project's > > problem. > > The commit to pick where to queue the fixes actually is my problem, > as I have this illusion that I'd be helping these derived works by > making it easier for them to merge, not cherry-pick. True, I had just meant the actual rolling of the releases. > But I would imagine that they may go the cherry-pick route anyway, > in which case I may be wasting my time worrying about them X-<. FWIW, I typically cherry-pick rather than merge. The resulting history is not as nice, but it means I don't have to think as hard about the history when doing so. It also means that topics may not be as well tested (e.g., they may have been implicitly relying on some other thing that happened upstream that I did _not_ cherry-pick). But we treat even cherry-picked upstream topics as their own feature branches, and do our normal internal testing and review. -Peff ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [BUG] Performance regression due to #33d4221: write_sha1_file: freshen existing objects 2015-04-20 20:09 ` Jeff King 2015-04-20 20:12 ` Junio C Hamano @ 2015-04-21 1:49 ` Stefan Saasen 2015-04-21 17:05 ` Junio C Hamano 1 sibling, 1 reply; 19+ messages in thread From: Stefan Saasen @ 2015-04-21 1:49 UTC (permalink / raw) To: Jeff King, Junio C Hamano; +Cc: Git Mailing List >> If it is critical to some people, they can downmerge to their custom >> old installations of Git they maintain with ease, of course, and >> that "with ease" part is the reason why I try to apply fixes to tip >> of the original topic branch even though they were merged to the >> mainline eons ago ;-). > > I think it is a bigger deal for folks who do not ship a custom > installation, but expect to ship a third-party system that interacts > with whatever version of git their customers happen to have (in which > case they can only recommend their customers to upgrade). Yes, this is the situation we are facing. We allow our customers to use the git version that is supported/available on their OS (within a certain range of supported versions) so our customers usually don't compile from source. > Either way, though, I do not think it is the upstream Git project's > problem. That's fair enough, I was mostly enquiring about the official git versions this will land in so that we can advise customers what git version to use (or not to use). I've noticed Peff's patches on pu which suggest they will be available in git 2.5? Do you Junio, have plans to merge them to maint (2.3.x) and/or next (2.4)? While I certainly agree that this is specific to Git on NFS and not a more widespread git performance problem, I'd love to be able to message something other than "skip all the git version between and including git 2.2 - 2.4". I appreciate your consideration and thanks again for the swift response on this. Cheers, Stefan ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [BUG] Performance regression due to #33d4221: write_sha1_file: freshen existing objects 2015-04-21 1:49 ` Stefan Saasen @ 2015-04-21 17:05 ` Junio C Hamano 2015-04-21 21:45 ` Junio C Hamano 2015-04-22 0:04 ` Stefan Saasen 0 siblings, 2 replies; 19+ messages in thread From: Junio C Hamano @ 2015-04-21 17:05 UTC (permalink / raw) To: Stefan Saasen; +Cc: Jeff King, Git Mailing List Stefan Saasen <ssaasen@atlassian.com> writes: > I've noticed Peff's patches on pu which suggest they will be available > in git 2.5? Being on 'pu' (or 'next' for that matter) is not a suggestion for a change to appear in any future version at all, even though it often means that it would soon be merged to 'master' and will be in the upcoming release to be on 'next' in early part of a development cycle. Some larger topics would stay on 'next' for a few cycles. > Do you Junio, have plans to merge them to maint (2.3.x) and/or next (2.4)? The topic will hopefully be merged to 'master' after 2.4 final is released end of this month, down to 'maint' early May and will ship with 2.4.1, unless there is unforeseen issues discovered in the change while people try it out while it is in 'next' (which will happen today, hopefully). ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [BUG] Performance regression due to #33d4221: write_sha1_file: freshen existing objects 2015-04-21 17:05 ` Junio C Hamano @ 2015-04-21 21:45 ` Junio C Hamano 2015-04-22 1:46 ` Stefan Saasen 2015-04-22 0:04 ` Stefan Saasen 1 sibling, 1 reply; 19+ messages in thread From: Junio C Hamano @ 2015-04-21 21:45 UTC (permalink / raw) To: Stefan Saasen; +Cc: Jeff King, Git Mailing List, Kyle J. McKay Junio C Hamano <gitster@pobox.com> writes: > Stefan Saasen <ssaasen@atlassian.com> writes: > >> I've noticed Peff's patches on pu which suggest they will be available >> in git 2.5? > > Being on 'pu' (or 'next' for that matter) is not a suggestion for a > change to appear in any future version at all, even though it often > means that it would soon be merged to 'master' and will be in the > upcoming release to be on 'next' in early part of a development > cycle. Some larger topics would stay on 'next' for a few cycles. > >> Do you Junio, have plans to merge them to maint (2.3.x) and/or next (2.4)? > > The topic will hopefully be merged to 'master' after 2.4 final is > released end of this month, down to 'maint' early May and will ship > with 2.4.1, unless there is unforeseen issues discovered in the > change while people try it out while it is in 'next' (which will > happen today, hopefully). ... and then if I do not forget and if the topic is really important for real-world users, I am OK to merge it down to 2.3 and even 2.2 maintenance tracks later. But that will happen only after the topic hits 'maint', which will happen only after the topic hits 'master'. What you _can_ help is the "if I do not forget" part ;-) Also see a similar discussion we had recently http://thread.gmane.org/gmane.comp.version-control.git/264365 The key sentence from my part in the thread is > When I say "the tip of 'master' is meant to be more stable than > any tagged versions", I do mean it. and the reasoning behind it that is given in the paragraph before that, though. Perhaps companies like Atlassian that rely on the stability of the open source Git can spare some resources and join forces with like minded folks on LTS of older maintenance tracks, if they are truly interested in. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [BUG] Performance regression due to #33d4221: write_sha1_file: freshen existing objects 2015-04-21 21:45 ` Junio C Hamano @ 2015-04-22 1:46 ` Stefan Saasen 2015-04-22 22:00 ` Junio C Hamano 0 siblings, 1 reply; 19+ messages in thread From: Stefan Saasen @ 2015-04-22 1:46 UTC (permalink / raw) To: Junio C Hamano; +Cc: Jeff King, Git Mailing List, Kyle J. McKay > Perhaps companies like Atlassian that rely on the stability of the > open source Git can spare some resources and join forces with like > minded folks on LTS of older maintenance tracks, if they are truly > interested in. We certainly can and would like to. I'm not entirely sure what that would entail though? >From reading through $gmane/264365 I've identified the following responsibilities/opportunities to help: > - Monitor "git log --first-parent maint-lts..master" and find > the tip of topic branches that need to be down-merged; > > - Down-merge such topics to maint-lts; this might involve > cherry-picking instead of merge, as the bugfix topics may > originally be done on the codebase newer than maint-lts; and more importantly testing the maint-lts version to ensure backported changes don't introduce regressions and the maint-lts branch is stable. This suggests specific, spaced LTS versions but in the same thread you mention maint-2.1or maint-2.2. So a different model could be maintaining old versions in a sliding window fashion (e.g. critical issues would be backported to the last 6 months worth of git releases). Maybe I'm getting ahead of myself here :) Anyway, long story short. We're interested to help but I'm not entirely sure what that would look like at the moment. Are there formed ideas floating around or would you be looking for some form of proposal instead? Cheers, Stefan ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [BUG] Performance regression due to #33d4221: write_sha1_file: freshen existing objects 2015-04-22 1:46 ` Stefan Saasen @ 2015-04-22 22:00 ` Junio C Hamano 0 siblings, 0 replies; 19+ messages in thread From: Junio C Hamano @ 2015-04-22 22:00 UTC (permalink / raw) To: Stefan Saasen; +Cc: Jeff King, Git Mailing List, Kyle J. McKay Stefan Saasen <ssaasen@atlassian.com> writes: > Anyway, long story short. We're interested to help but I'm not > entirely sure what that would look like at the moment. Are there > formed ideas floating around or would you be looking for some form of > proposal instead? I am not proposing anything or looking for proposals myself, actually. It is just somebody expressed interest in having tested older maintenance track that is kept alive in the past, so I was merely trying to help connect you with that old thread. If those who are interested in having such LTS track(s) need something specific from me, and if it will not be unrealistic maintenance burden, I am willing to help. That's all. For example, LTS group for whatever reason may nominate 2.2.x track as a base that they want to keep alive longer than other maintenance tracks and promise to test changes to them to keep it stable. Then I can help the effort by making sure people's bugfix patches would apply down to 2.2.x track (often people make mistake of using newer facility to fix or test the fix for an ancient bug, and bugfix topic branch ends up forked at a point much newer than where it should be). ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [BUG] Performance regression due to #33d4221: write_sha1_file: freshen existing objects 2015-04-21 17:05 ` Junio C Hamano 2015-04-21 21:45 ` Junio C Hamano @ 2015-04-22 0:04 ` Stefan Saasen 1 sibling, 0 replies; 19+ messages in thread From: Stefan Saasen @ 2015-04-22 0:04 UTC (permalink / raw) To: Junio C Hamano; +Cc: Jeff King, Git Mailing List >> I've noticed Peff's patches on pu which suggest they will be available >> in git 2.5? > > Being on 'pu' (or 'next' for that matter) is not a suggestion for a > change to appear in any future version at all, even though it often > means that it would soon be merged to 'master' and will be in the > upcoming release to be on 'next' in early part of a development > cycle. Some larger topics would stay on 'next' for a few cycles. > >> Do you Junio, have plans to merge them to maint (2.3.x) and/or next (2.4)? > > The topic will hopefully be merged to 'master' after 2.4 final is > released end of this month, down to 'maint' early May and will ship > with 2.4.1, unless there is unforeseen issues discovered in the > change while people try it out while it is in 'next' (which will > happen today, hopefully). Thanks for the clarification Junio. ^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2015-04-22 22:00 UTC | newest] Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2015-04-17 7:30 [BUG] Performance regression due to #33d4221: write_sha1_file: freshen existing objects Stefan Saasen 2015-04-17 14:03 ` Jeff King 2015-04-17 15:50 ` Junio C Hamano 2015-04-18 3:35 ` Stefan Saasen 2015-04-20 19:53 ` Jeff King 2015-04-20 19:54 ` [PATCH 1/2] sha1_file: freshen pack objects before loose Jeff King 2015-04-21 0:46 ` Stefan Saasen 2015-04-20 19:55 ` [PATCH 2/2] sha1_file: only freshen packs once per run Jeff King 2015-04-21 0:45 ` Stefan Saasen 2015-04-20 20:04 ` [BUG] Performance regression due to #33d4221: write_sha1_file: freshen existing objects Junio C Hamano 2015-04-20 20:09 ` Jeff King 2015-04-20 20:12 ` Junio C Hamano 2015-04-20 20:28 ` Jeff King 2015-04-21 1:49 ` Stefan Saasen 2015-04-21 17:05 ` Junio C Hamano 2015-04-21 21:45 ` Junio C Hamano 2015-04-22 1:46 ` Stefan Saasen 2015-04-22 22:00 ` Junio C Hamano 2015-04-22 0:04 ` Stefan Saasen
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.