All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: Junio C Hamano <gitster@pobox.com>,
	Elijah Newren via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org, Jonathan Nieder <jrnieder@gmail.com>,
	Sergey Organov <sorganov@gmail.com>,
	Elijah Newren <newren@gmail.com>
Subject: Re: [PATCH 5/7] tmp-objdir: new API for creating and removing primary object dirs
Date: Thu, 30 Sep 2021 17:00:22 -0400	[thread overview]
Message-ID: <YVYlZhl6uACu+UGG@coredump.intra.peff.net> (raw)
In-Reply-To: <87tui2tckn.fsf@evledraar.gmail.com>

On Thu, Sep 30, 2021 at 03:16:19PM +0200, Ævar Arnfjörð Bjarmason wrote:

> I also wonder how much if anything writing out the one file v.s. lots of
> loose objects is worthwhile on systems where we could write out those
> loose objects on a ramdisk, which is commonly available on e.g. Linux
> distros these days out of the box. If you care about performance but not
> about your transitory data using a ramdisk is generally much better than
> any other potential I/O optimization.

I'd think in general we won't be using a ramdisk, because tmp_objdir is
putting its directory inside $GIT_DIR/objects. It doesn't _have_ to, but
using a ramdisk works against its original purpose (which was to store
potentially quite a lot of data from an incoming push, and to be able to
rename it cheaply into its final resting place).

It would probably not be too hard to provide a flag that indicates the
intended use, though (and then we decide where to create the temporary
directory based on that).

> Finally, and I don't mean to throw a monkey wrench into this whole
> discussion, so take this as a random musing: I wonder how much faster
> this thing could be on its second run if instead of avoiding writing to
> the store & cleaning up, it just wrote to the store, and then wrote
> another object keyed on the git version and any revision paramaters
> etc., and then pretty much just had to do a "git cat-file -p <that-obj>"
> to present the result to the user :)
> 
> I suppose that would be throwing a lot more work at an eventual "git gc"
> than we ever do now, so maybe it's a bit crazy, but I think it might be
> an interesting direction in general to (ab)use either the primary or
> some secondary store in the .git dir as a semi-permanent cache of
> resolved queries from the likes of "git log".

I don't think it's crazy to just write the objects to the main object
store. We already generate cruft objects for some other operations
(Junio asked elsewhere in the thread about virtual trees for recursive
merges; I don't know the answer offhand, but I'd guess we do there).
They do get cleaned up eventually.

I'm not sure it helps performance much by itself. In a merge (or even
just writing a tree out from the index), by the time you realize you
already have the object, you've done most of the work to generate it.

I think what you're describing is to make some kind of cache structure
on top. That might be sensible (and indeed, the index already does this
with the cachetree extension). But it can also easily come later if the
objects are just in the regular odb.

-Peff

  reply	other threads:[~2021-09-30 21:00 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-31  2:26 [PATCH 0/7] Add a new --remerge-diff capability to show & log Elijah Newren via GitGitGadget
2021-08-31  2:26 ` [PATCH 1/7] merge-ort: mark a few more conflict messages as omittable Elijah Newren via GitGitGadget
2021-08-31 21:06   ` Junio C Hamano
2021-09-01  0:03     ` Elijah Newren
2021-09-01 17:19       ` Junio C Hamano
2021-08-31  2:26 ` [PATCH 2/7] merge-ort: add ability to record conflict messages in a file Elijah Newren via GitGitGadget
2021-09-28 22:29   ` Jeff King
2021-09-29  6:25     ` Elijah Newren
2021-09-29 16:14       ` Junio C Hamano
2021-09-29 16:31         ` Elijah Newren
2021-09-30  7:58       ` Jeff King
2021-09-30  8:09         ` Ævar Arnfjörð Bjarmason
2021-10-01  2:07         ` Elijah Newren
2021-10-01  5:28           ` Jeff King
2021-08-31  2:26 ` [PATCH 3/7] ll-merge: add API for capturing warnings in a strbuf instead of stderr Elijah Newren via GitGitGadget
2021-09-28 22:37   ` Jeff King
2021-09-28 23:49     ` Junio C Hamano
2021-09-29  4:03     ` Elijah Newren
2021-08-31  2:26 ` [PATCH 4/7] merge-ort: capture and print ll-merge warnings in our preferred fashion Elijah Newren via GitGitGadget
2021-09-28 22:39   ` Jeff King
2021-09-30 16:53   ` Ævar Arnfjörð Bjarmason
2021-10-01  1:54     ` Elijah Newren
2021-10-01  7:23       ` Ævar Arnfjörð Bjarmason
2021-08-31  2:26 ` [PATCH 5/7] tmp-objdir: new API for creating and removing primary object dirs Elijah Newren via GitGitGadget
2021-09-28  7:55   ` Ævar Arnfjörð Bjarmason
2021-09-29  4:22     ` Elijah Newren
2021-09-30  7:41       ` Jeff King
2021-09-30 14:17       ` Ævar Arnfjörð Bjarmason
2021-10-01  3:55         ` Elijah Newren
2021-09-28 23:17   ` Jeff King
2021-09-29  4:08     ` Junio C Hamano
2021-09-30  7:33       ` Jeff King
2021-09-30 13:16         ` Ævar Arnfjörð Bjarmason
2021-09-30 21:00           ` Jeff King [this message]
2021-10-01  3:11           ` Elijah Newren
2021-10-01  7:30             ` Ævar Arnfjörð Bjarmason
2021-10-01  8:03               ` Elijah Newren
2021-10-01  4:26         ` Elijah Newren
2021-10-01  5:27           ` Jeff King
2021-10-01  7:43             ` Ævar Arnfjörð Bjarmason
2021-09-29  5:05     ` Elijah Newren
2021-09-30  7:26       ` Jeff King
2021-09-30  7:46         ` Jeff King
2021-09-30 20:06           ` Junio C Hamano
2021-10-01  3:59             ` Elijah Newren
2021-10-01 16:36               ` Junio C Hamano
2021-10-01  2:31           ` Elijah Newren
2021-10-01  5:21             ` Jeff King
2021-09-30 19:25         ` Neeraj Singh
2021-09-30 20:19           ` Junio C Hamano
2021-09-30 20:00         ` Junio C Hamano
2021-10-01  2:23         ` Elijah Newren
2021-08-31  2:26 ` [PATCH 6/7] show, log: provide a --remerge-diff capability Elijah Newren via GitGitGadget
2021-08-31  9:19   ` Sergey Organov
2021-09-28  8:01   ` Ævar Arnfjörð Bjarmason
2021-09-29  4:00     ` Elijah Newren
2021-08-31  2:26 ` [PATCH 7/7] doc/diff-options: explain the new --remerge-diff option Elijah Newren via GitGitGadget
2021-08-31 11:05 ` [PATCH 0/7] Add a new --remerge-diff capability to show & log Bagas Sanjaya
2021-08-31 16:16   ` Elijah Newren
2021-08-31 20:03 ` Junio C Hamano
2021-08-31 20:23   ` Elijah Newren
2021-09-01 21:07 ` Junio C Hamano
2021-09-01 21:42   ` Elijah Newren
2021-09-01 21:55     ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YVYlZhl6uACu+UGG@coredump.intra.peff.net \
    --to=peff@peff.net \
    --cc=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=jrnieder@gmail.com \
    --cc=newren@gmail.com \
    --cc=sorganov@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.