From: Linus Torvalds <torvalds@linux-foundation.org>
To: nobody <darwinskernel@gmail.com>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: Linux 2.6.38-rc1
Date: Tue, 18 Jan 2011 21:56:38 -0800
Message-ID: <AANLkTikV6=FCQ+-Z=fAi8SH90M-izCUEBisinqGwo=DU@mail.gmail.com>
In-Reply-To: <AANLkTikdCek7DGN885Cjuo1nkswe_Y8KG6Knn8XknzsF@mail.gmail.com>

On Tue, Jan 18, 2011 at 9:42 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> When pulling from 2.6.37 to 2.6.38-rc1, it should look something like this:
>
>  remote: Counting objects: 84898, done.
>  remote: Compressing objects: 100% (14274/14274), done.
>  Receiving objects: 100% (71245/71245), 21.07 MiB | 26.53 MiB/s, done.
>  remote: Total 71245 (delta 59086), reused 67779 (delta 56042)
>  Resolving deltas: 100% (59086/59086), completed with 7395 local objects.
>
> ie you got 21.07MiB for the whole change between 2.6.37 and 2.6.38-rc1.

Btw, what may confuse you a bit is that the on-disk representation of
the newly received pack ends up being about 69MB, i.e. the 21MiB of
network traffic roughly tripled in size as a result of that "Resolving
deltas" step. That's because git pack-files are designed to always be
stand-alone, so on disk, the pack-file will always contain the base
objects needed to expand all the deltas.

But on the wire, we don't do that, which is why you have that
"Resolving deltas" phase - it's a purely local phase where git takes
the "pure delta" pack that came over the wire, and creates a
well-formed pack that doesn't have any deltas that depend on external
objects (that's the "completed with 7395 local objects" part of the
output above).
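
To make that concrete, here is a minimal Python sketch of the idea, not
git's actual code or pack format. The names Entry, complete_thin_pack and
is_self_contained are made up for illustration, and a "pack" is just a
dict from object id to entry. In real git this completion is done by
index-pack when it fixes up a thin pack.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Entry:
    """One object in a toy pack: full content, or a delta against `base`."""
    payload: str
    base: Optional[str] = None  # id of the delta base; None for a full object


def is_self_contained(pack: Dict[str, Entry]) -> bool:
    """True if every delta's base object lives in the same pack."""
    return all(e.base is None or e.base in pack for e in pack.values())


def complete_thin_pack(thin: Dict[str, Entry],
                       local: Dict[str, str]) -> Dict[str, Entry]:
    """Return a stand-alone copy of `thin`.

    Every delta base missing from the pack is copied in from the local
    object store as a full object, so the pack grows.
    """
    fixed = dict(thin)
    for entry in thin.values():
        if entry.base is not None and entry.base not in fixed:
            # Raises KeyError if the base is not available locally either.
            fixed[entry.base] = Entry(payload=local[entry.base])
    return fixed


# Hypothetical data: one delta whose base exists only in the local store.
local_store = {"a1": "old file contents"}
thin_pack = {"b2": Entry(payload="+one changed line", base="a1")}
full_pack = complete_thin_pack(thin_pack, local_store)
```

The thin pack has one entry and is not self-contained; the completed pack
has two entries, because the base "a1" had to be copied in. That copying
is where the extra on-disk size comes from.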

And that expansion happens every time you pull: so if you do daily
pulls, each of those pulls may be fairly small on the wire, but every
one of them gets expanded so that the resulting pack is stand-alone.
Which means that you often end up with the same (or very similar) base
objects duplicated across the packs.

So I can well imagine that if you do a pull every day, over two weeks
your .git/objects/pack directory will have new packs that together are
500MB in size due to all of that. That's why git likes doing some GC
on its data every once in a while - it will repack all those
individual packs into one big pack, which avoids all that duplication
of base objects.
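
A rough way to picture the duplication, as a Python sketch under heavy
simplification: each pack is modelled as a set of object ids, and the
counts (14 daily pulls, 3 new objects and 2 shared base objects per pull)
are invented for illustration, not measured from a real repository. The
helpers stored_objects and repack are hypothetical names.

```python
from typing import List, Set


def stored_objects(packs: List[Set[str]]) -> int:
    """Objects held on disk when every pack keeps its own copy of its bases."""
    return sum(len(pack) for pack in packs)


def repack(packs: List[Set[str]]) -> Set[str]:
    """Merge all packs into one; each object is kept exactly once."""
    merged: Set[str] = set()
    for pack in packs:
        merged |= pack
    return merged


# Two base objects that every daily pull had to copy into its own pack.
shared_bases = {"base-1", "base-2"}

# Fourteen stand-alone daily packs: three new objects each, plus the bases.
daily_packs = [
    {f"day{day}-obj{i}" for i in range(3)} | shared_bases
    for day in range(14)
]

before = stored_objects(daily_packs)  # bases counted once per pack
after = len(repack(daily_packs))      # bases counted once overall
```

With these made-up numbers, `before` counts the two bases fourteen times
and `after` counts them once. Real packs differ in byte sizes rather than
object counts, but the shape of the saving is the same.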

And why do we expand the packs and make them stand on their own? Why
don't we just keep all the object data as deltas against objects in
other packs, the way we pass data around on the network? The reason is
simply robustness. You can get into various nasty situations (like
circular delta dependencies) if you allow deltas between different
packs. So the only time we allow a so-called "thin pack" (i.e. a pack
that is full of deltas against objects external to the pack) is for the
ephemeral pack that is transferred during a "pull" or "fetch". In that
situation we end up doing lots of extra sanity checking, and because
it's ephemeral you never get into the whole situation where deltas in
different packs could refer to each other (because by the time it's a
real pack, it will have been expanded out to be self-sufficient).
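
Here is a small Python sketch of the circular-dependency problem,
assuming a toy model where each pack is just a map from object id to the
id of its delta base (None meaning a full object). The function name
find_delta_cycle and the object ids are hypothetical; this is not how git
itself checks anything.

```python
from typing import Dict, List, Optional


def find_delta_cycle(bases: Dict[str, Optional[str]],
                     start: str) -> Optional[List[str]]:
    """Follow delta bases from `start`.

    Returns the chain of ids forming a cycle if one is reached, or None
    if the chain ends at a full object (or at an unknown id).
    """
    seen: List[str] = []
    current: Optional[str] = start
    while current is not None:
        if current in seen:
            return seen[seen.index(current):] + [current]
        seen.append(current)
        current = bases.get(current)
    return None


# If cross-pack deltas were allowed, each pack could look fine alone:
pack_a = {"x": "y"}  # x is stored as a delta against y (which is in pack B)
pack_b = {"y": "x"}  # y is stored as a delta against x (which is in pack A)

# Taken together, neither object can ever be reconstructed.
combined = {**pack_a, **pack_b}
cycle = find_delta_cycle(combined, "x")

# A stand-alone pack always bottoms out in a full object instead.
standalone = {"x": "y", "y": None}
no_cycle = find_delta_cycle(standalone, "x")
```

In the combined case the walk comes back to "x" without ever hitting a
full object, which is exactly the kind of situation that stand-alone
packs rule out by construction.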

So do use "git gc" every once in a while to avoid unnecessary pack
duplication issues (it also makes object indexing much faster etc).
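
If you want to see the effect, one option is to compare the output of
"git count-objects -v" before and after a "git gc". The Python sketch
below just parses that output; the helper names parse_count_objects and
pack_stats are made up, and it assumes git is on your PATH and that the
output is the usual "key: number" lines (packs, size-pack and so on, with
sizes in KiB).

```python
import subprocess
from typing import Dict


def parse_count_objects(text: str) -> Dict[str, int]:
    """Parse `git count-objects -v` style output into a dict of integers."""
    stats: Dict[str, int] = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        value = value.strip()
        # Skip anything that is not a plain "key: number" line.
        if sep and value.isdigit():
            stats[key.strip()] = int(value)
    return stats


def pack_stats(repo: str = ".") -> Dict[str, int]:
    """Run `git count-objects -v` in `repo` and return the parsed numbers."""
    result = subprocess.run(
        ["git", "-C", repo, "count-objects", "-v"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_count_objects(result.stdout)
```

Calling pack_stats() in a repository before and after "git gc" and
looking at the "packs" and "size-pack" entries should show many packs
collapsing into one, with a smaller total size.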

                  Linus
