git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Geert Jansen <gerardu@amazon.com>
To: Jeff King <peff@peff.net>
Cc: "Junio C Hamano" <gitster@pobox.com>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"git@vger.kernel.org" <git@vger.kernel.org>,
	"Christian Couder" <christian.couder@gmail.com>
Subject: Re: [RFC PATCH] index-pack: improve performance on NFS
Date: Mon, 29 Oct 2018 22:21:19 +0000	[thread overview]
Message-ID: <20181029222119.GB8325@amazon.com> (raw)
In-Reply-To: <20181029215049.GA24557@sigill.intra.peff.net>

On Mon, Oct 29, 2018 at 05:50:50PM -0400, Jeff King wrote:

> > I believe the loose-object-cache approach would have a performance regression
> > when you're receiving a small pack file and there's many loose objects in the
> > repo. Basically you're trading off
> > 
> >     MIN(256, num_objects_in_pack / dentries_per_readdir) * readdir_latency
> >     
> > against
> > 
> >     num_loose_objects * stat_latency
> 
> Should num_loose_objects and num_objects_in_pack be swapped here? Just
> making sure I understand what you're saying.

Whoops, yes, thanks for spotting that!

> So the 256 in your MIN() is potentially much smaller. We still have to
> deal with the fact that if you have a large number of loose objects,
> they may be split cross multiple readdir (or getdents) calls. The "cache
> maximum" we discussed does bound that, but in some ways that's worse:
> you waste time doing the bounded amount of readdir and then don't even
> get the benefit of the cache. ;)

Yup. To get the performance benefit you'd like the cache to hold all loose
objects except in clearly degenerate cases with far too many loose objects.

> > On Amazon EFS (and I expect on other NFS server implementations too) it is more
> > efficient to do readdir() on a large directory than to stat() each of the
> > individual files in the same directory. I don't have exact numbers but based on
> > a very rough calculation the difference is roughly 10x for large directories
> > under normal circumstances.
> 
> I'd expect readdir() to be much faster than stat() in general (e.g., "ls
> -f" versus "ls -l" is faster even on a warm cache; there's more
> formatting going on in the latter, but I think a lot of it is the effort
> to stat).

In the case of NFS, the client usually requests that the READDIR response also
contains some of the stat flags (like st_mode). But even in this case it's
still more efficient to return multiple entries in one batch through READDIR
rather than as individual responses to GETATTR (which is what stat() maps to).

  reply	other threads:[~2018-10-29 22:21 UTC|newest]

Thread overview: 99+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-10-25 18:38 [RFC PATCH] index-pack: improve performance on NFS Jansen, Geert
2018-10-26  0:21 ` Junio C Hamano
2018-10-26 20:38   ` Ævar Arnfjörð Bjarmason
2018-10-27  7:26     ` Junio C Hamano
2018-10-27  9:33       ` Jeff King
2018-10-27 11:22         ` Ævar Arnfjörð Bjarmason
2018-10-28 22:50           ` [PATCH 0/4] index-pack: optionally turn off SHA-1 collision checking Ævar Arnfjörð Bjarmason
2018-10-30  2:49             ` Geert Jansen
2018-10-30  9:04               ` Junio C Hamano
2018-10-30 18:43             ` [PATCH v2 0/3] index-pack: test updates Ævar Arnfjörð Bjarmason
2018-11-13 20:19               ` [PATCH v3] index-pack: add ability to disable SHA-1 collision check Ævar Arnfjörð Bjarmason
2018-11-14  7:09                 ` Junio C Hamano
2018-11-14 12:40                   ` Ævar Arnfjörð Bjarmason
2018-10-30 18:43             ` [PATCH v2 1/3] pack-objects test: modernize style Ævar Arnfjörð Bjarmason
2018-10-30 18:43             ` [PATCH v2 2/3] pack-objects tests: don't leave test .git corrupt at end Ævar Arnfjörð Bjarmason
2018-10-30 18:43             ` [PATCH v2 3/3] index-pack tests: don't leave test repo dirty " Ævar Arnfjörð Bjarmason
2018-10-28 22:50           ` [PATCH 1/4] pack-objects test: modernize style Ævar Arnfjörð Bjarmason
2018-10-28 22:50           ` [PATCH 2/4] pack-objects tests: don't leave test .git corrupt at end Ævar Arnfjörð Bjarmason
2018-10-28 22:50           ` [PATCH 3/4] index-pack tests: don't leave test repo dirty " Ævar Arnfjörð Bjarmason
2018-10-28 22:50           ` [PATCH 4/4] index-pack: add ability to disable SHA-1 collision check Ævar Arnfjörð Bjarmason
2018-10-29 15:04           ` [RFC PATCH] index-pack: improve performance on NFS Jeff King
2018-10-29 15:09             ` Jeff King
2018-10-29 19:36             ` Ævar Arnfjörð Bjarmason
2018-10-29 23:27               ` Jeff King
2018-11-07 22:55                 ` Geert Jansen
2018-11-08 12:02                   ` Jeff King
2018-11-08 20:58                     ` Geert Jansen
2018-11-08 21:18                       ` Jeff King
2018-11-08 21:55                         ` Geert Jansen
2018-11-08 22:20                     ` Ævar Arnfjörð Bjarmason
2018-11-09 10:11                       ` Ævar Arnfjörð Bjarmason
2018-11-12 14:31                       ` Jeff King
2018-11-12 14:46                     ` [PATCH 0/9] caching loose objects Jeff King
2018-11-12 14:46                       ` [PATCH 1/9] fsck: do not reuse child_process structs Jeff King
2018-11-12 15:26                         ` Derrick Stolee
2018-11-12 14:47                       ` [PATCH 2/9] submodule--helper: prefer strip_suffix() to ends_with() Jeff King
2018-11-12 18:23                         ` Stefan Beller
2018-11-12 14:48                       ` [PATCH 3/9] rename "alternate_object_database" to "object_directory" Jeff King
2018-11-12 15:30                         ` Derrick Stolee
2018-11-12 15:36                           ` Jeff King
2018-11-12 19:41                             ` Ramsay Jones
2018-11-12 14:48                       ` [PATCH 4/9] sha1_file_name(): overwrite buffer instead of appending Jeff King
2018-11-12 15:32                         ` Derrick Stolee
2018-11-12 14:49                       ` [PATCH 5/9] handle alternates paths the same as the main object dir Jeff King
2018-11-12 15:38                         ` Derrick Stolee
2018-11-12 15:46                           ` Jeff King
2018-11-12 15:50                             ` Derrick Stolee
2018-11-12 14:50                       ` [PATCH 6/9] sha1-file: use an object_directory for " Jeff King
2018-11-12 15:48                         ` Derrick Stolee
2018-11-12 16:09                           ` Jeff King
2018-11-12 19:04                             ` Stefan Beller
2018-11-22 17:42                               ` Jeff King
2018-11-12 18:48                           ` Stefan Beller
2018-11-12 14:50                       ` [PATCH 7/9] object-store: provide helpers for loose_objects_cache Jeff King
2018-11-12 19:24                         ` René Scharfe
2018-11-12 20:16                           ` Jeff King
2018-11-12 14:54                       ` [PATCH 8/9] sha1-file: use loose object cache for quick existence check Jeff King
2018-11-12 16:00                         ` Derrick Stolee
2018-11-12 16:01                         ` Ævar Arnfjörð Bjarmason
2018-11-12 16:21                           ` Jeff King
2018-11-12 22:18                             ` Ævar Arnfjörð Bjarmason
2018-11-12 22:30                               ` Ævar Arnfjörð Bjarmason
2018-11-13 10:02                                 ` Ævar Arnfjörð Bjarmason
2018-11-14 18:21                                   ` René Scharfe
2018-12-02 10:52                                   ` René Scharfe
2018-12-03 22:04                                     ` Jeff King
2018-12-04 21:45                                       ` René Scharfe
2018-12-05  4:46                                         ` Jeff King
2018-12-05  6:02                                           ` René Scharfe
2018-12-05  6:51                                             ` Jeff King
2018-12-05  8:15                                               ` Jeff King
2018-12-05 18:41                                                 ` René Scharfe
2018-12-05 20:17                                                   ` Jeff King
2018-11-12 22:44                             ` Geert Jansen
2018-11-27 20:48                         ` René Scharfe
2018-12-01 19:49                           ` Jeff King
2018-11-12 14:55                       ` [PATCH 9/9] fetch-pack: drop custom loose object cache Jeff King
2018-11-12 19:25                         ` René Scharfe
2018-11-12 19:32                           ` Ævar Arnfjörð Bjarmason
2018-11-12 20:07                             ` Jeff King
2018-11-12 20:13                             ` René Scharfe
2018-11-12 16:02                       ` [PATCH 0/9] caching loose objects Derrick Stolee
2018-11-12 19:10                         ` Stefan Beller
2018-11-09 13:43                   ` [RFC PATCH] index-pack: improve performance on NFS Ævar Arnfjörð Bjarmason
2018-11-09 16:08                     ` Duy Nguyen
2018-11-10 14:04                       ` Ævar Arnfjörð Bjarmason
2018-11-12 14:34                         ` Jeff King
2018-11-12 22:58                     ` Geert Jansen
2018-10-27 14:04         ` Duy Nguyen
2018-10-29 15:18           ` Jeff King
2018-10-29  0:48         ` Junio C Hamano
2018-10-29 15:20           ` Jeff King
2018-10-29 18:43             ` Ævar Arnfjörð Bjarmason
2018-10-29 21:34           ` Geert Jansen
2018-10-29 21:50             ` Jeff King
2018-10-29 22:21               ` Geert Jansen [this message]
2018-10-29 22:27             ` Jeff King
2018-10-29 22:35               ` Stefan Beller
2018-10-29 23:29                 ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181029222119.GB8325@amazon.com \
    --to=gerardu@amazon.com \
    --cc=avarab@gmail.com \
    --cc=christian.couder@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).