From: Johannes Berg <johannes@sipsolutions.net>
To: Derrick Stolee <stolee@gmail.com>, git@vger.kernel.org
Subject: Re: [PATCH] pack-format: correct multi-pack-index description
Date: Mon, 10 Feb 2020 16:06:41 +0100
Message-ID: <a52c8163abfba107a27b359a1588a68efdc581a8.camel@sipsolutions.net>
In-Reply-To: <08dbc3be-34a7-fb8d-e0bd-56a79ab5b65a@gmail.com> (sfid-20200210_160205_899758_698E8FB8)

On Mon, 2020-02-10 at 10:02 -0500, Derrick Stolee wrote:
> Git loads the multi-pack-index file, which includes a sorted list of
> the packs it covers. It then scans the "pack" directory for pack-indexes
> and checks if they are covered by the multi-pack-index. If not, then
> Git will add them to the packed_git struct and use them as normal.
> The hope is that this list of "uncovered" packs is small compared to
> the data covered by the multi-pack-index.
> 
> This allows Git to continue functioning after an action like "git fetch"
> that adds a new pack but may not want to rewrite the multi-pack-index.
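
A minimal sketch of the lookup described above, in Python rather than
Git's C. This is not Git's code: the function name is hypothetical, and
it assumes the names covered by the multi-pack-index have already been
read and are given in the same form as the file names in the pack
directory.

```python
from pathlib import Path


def uncovered_pack_indexes(pack_dir, covered_names):
    """Return the pack-index files in pack_dir that are not covered.

    covered_names is assumed to be the list of pack-index names taken
    from the multi-pack-index (hypothetical input; parsing the file
    itself is out of scope here).
    """
    covered = set(covered_names)
    return sorted(
        entry.name
        for entry in Path(pack_dir).iterdir()
        # Only pack-indexes are candidates; skip .pack files and the
        # multi-pack-index file itself.
        if entry.suffix == ".idx" and entry.name not in covered
    )
```

Whatever this returns would be the "uncovered" packs that still have to
be opened individually, so the shorter that list, the more the
multi-pack-index helps.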

Ah, ok.

So then perhaps I'll just make bup write the multi-pack-index file as
is. That's fine; there's no real need to have multiple files, I just
didn't want to have to make sure the file was always consistent.

Or maybe just call git to do it, and only be able to read the resulting
file :-)

> Our background maintenance essentially runs these commands:
> 
>  1. git multi-pack-index write
>  2. git multi-pack-index expire
>  3. git multi-pack-index repack
> 
> Step 1 ensures all packs are pulled into the multi-pack-index. Step 2
> deletes any pack-files whose objects are contained in newer pack-files.
> Step 3 creates a new pack-file containing all objects from a set of
> small pack-files (using the --batch-size=X option). This process helps
> incrementally reduce the size and number of packs. That may be helpful
> for your backup tool, too.
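
For reference, the three steps above could be driven from a script
roughly like this. It is only a sketch: the helper names and the
dry_run switch are made up for illustration, and the batch size is
left to the caller since the mail only says "X".

```python
import subprocess


def maintenance_commands(batch_size):
    """Build the three commands from the list above, in order."""
    return [
        ["git", "multi-pack-index", "write"],
        ["git", "multi-pack-index", "expire"],
        ["git", "multi-pack-index", "repack", f"--batch-size={batch_size}"],
    ]


def run_maintenance(repo_dir, batch_size, dry_run=False):
    """Run the steps in repo_dir, stopping at the first failure."""
    for cmd in maintenance_commands(batch_size):
        if dry_run:
            # Only show what would be run.
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, cwd=repo_dir, check=True)
```

Calling run_maintenance(".", "X", dry_run=True) just prints the three
commands without touching the repository.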

I'll have to look at this in more detail later, and understand exactly
what the steps do here. Evidently that modifies pack files, which I
hadn't expected for a type of "index" command :-)

> Perhaps after an incremental multi-pack-index is added, then Git could
> (optionally) have a mode that only checks the multi-pack-index to
> avoid scanning the packs directory. It would require inserting a
> multi-pack-index write into the index-pack logic so Git.

I guess you'd still want to read non-covered pack files just in case old
git was used or something though.

> I'm not sure if that mode would be helpful, since the pack directory
> scan is typically done once per command and is relatively fast.

Right.

> > > That said: if someone wanted to contribute an incremental format,
> > > then I would be happy to review it!
> > 
> > I might still get motivated to do so :-)
> 
> YOU CAN DO IT! (Did that help?)

:-)

Thanks,
johannes


