All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] verify_packfile: check pack validity before accessing data
@ 2016-09-22  3:49 Jeff King
  2016-09-22  4:05 ` Jeff King
  0 siblings, 1 reply; 3+ messages in thread
From: Jeff King @ 2016-09-22  3:49 UTC (permalink / raw)
  To: git

The verify_packfile() does not explicitly open the packfile;
instead, it starts with a sha1 checksum over the whole pack,
and relies on use_pack() to open the packfile as a side
effect.

If the pack cannot be opened for whatever reason (either
because its header information is corrupted, or perhaps
because a simultaneous repack deleted it), then use_pack()
will die(), as it has no way to return an error. This is not
ideal, as verify_packfile() otherwise tries to gently return
an error (this lets programs like git-fsck go on to check
other packs).

Instead, let's check is_pack_valid() up front, and return an
error if it fails. This will open the pack as a side effect,
and then use_pack() will later rely on our cached
descriptor, and avoid calling die().

Signed-off-by: Jeff King <peff@peff.net>
---
 pack-check.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/pack-check.c b/pack-check.c
index d123846..c5c7763 100644
--- a/pack-check.c
+++ b/pack-check.c
@@ -57,11 +57,8 @@ static int verify_packfile(struct packed_git *p,
 	int err = 0;
 	struct idx_entry *entries;
 
-	/* Note that the pack header checks are actually performed by
-	 * use_pack when it first opens the pack file.  If anything
-	 * goes wrong during those checks then the call will die out
-	 * immediately.
-	 */
+	if (!is_pack_valid(p))
+		return error("packfile %s cannot be accessed", p->pack_name);
 
 	git_SHA1_Init(&ctx);
 	do {
-- 
2.10.0.482.gae5a597

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] verify_packfile: check pack validity before accessing data
  2016-09-22  3:49 [PATCH] verify_packfile: check pack validity before accessing data Jeff King
@ 2016-09-22  4:05 ` Jeff King
  2016-09-22 18:21   ` Junio C Hamano
  0 siblings, 1 reply; 3+ messages in thread
From: Jeff King @ 2016-09-22  4:05 UTC (permalink / raw)
  To: git

On Wed, Sep 21, 2016 at 11:49:05PM -0400, Jeff King wrote:

> The verify_packfile() does not explicitly open the packfile;
> instead, it starts with a sha1 checksum over the whole pack,
> and relies on use_pack() to open the packfile as a side
> effect.
> 
> If the pack cannot be opened for whatever reason (either
> because its header information is corrupted, or perhaps
> because a simultaneous repack deleted it), then use_pack()
> will die(), as it has no way to return an error. This is not
> ideal, as verify_packfile() otherwise tries to gently return
> an error (this lets programs like git-fsck go on to check
> other packs).
> 
> Instead, let's check is_pack_valid() up front, and return an
> error if it fails. This will open the pack as a side effect,
> and then use_pack() will later rely on our cached
> descriptor, and avoid calling die().

I actually had an ulterior motive, but it didn't pan out. I
think this change is an improvement on its own, which is why I posted
it. But here's my ulterior motive, for reference.

The die() in question happens when use_pack() is asked to lazily open a
pack, but it fails. If this happens then the code in question is racy
with respect to somebody else running a repack, because the pack we are
looking for might go away, but we could find the object in another pack.
Since we cannot handle the retry at this level, callers of use_pack()
should generally use is_pack_valid() early to cache the descriptor (and
at that early stage, they can still bail to another pack if necessary).
See the comment in fill_pack_entry(), for example, or the discussion in 
4c08018 (pack-objects: protect against disappearing packs, 2011-10-14).

So I wanted to know whether there were any code paths that failed to do
so, and just blindly rely on the lazy-open. Finding the races is
inherently hard, because you only catch them when somebody else is doing
a repack. But if we just _remove_ the lazy-load, then it becomes easy to
catch anybody relying on it. Like:

diff --git a/sha1_file.c b/sha1_file.c
index b9c1fa3..f3d7615 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1122,8 +1122,8 @@ unsigned char *use_pack(struct packed_git *p,
 	 * hash, and the in_window function above wouldn't match
 	 * don't allow an offset too close to the end of the file.
 	 */
-	if (!p->pack_size && p->pack_fd == -1 && open_packed_git(p))
-		die("packfile %s cannot be accessed", p->pack_name);
+	if (!p->pack_size && p->pack_fd == -1)
+		die("BUG: use_pack() called on unopened '%s'", p->pack_name);
 	if (offset > (p->pack_size - 20))
 		die("offset beyond end of packfile (truncated pack?)");
 	if (offset < 0)
@@ -1140,8 +1140,8 @@ unsigned char *use_pack(struct packed_git *p,
 			size_t window_align = packed_git_window_size / 2;
 			off_t len;
 
-			if (p->pack_fd == -1 && open_packed_git(p))
-				die("packfile %s cannot be accessed", p->pack_name);
+			if (p->pack_fd == -1)
+				die("BUG: use_pack() called on unopened '%s'", p->pack_name);
 
 			win = xcalloc(1, sizeof(*win));
 			win->offset = (offset / window_align) * window_align;


Running the test suite with the patch above revealed the issue in
verify_packfile (and with my patch, the test suite now passes, even with
this).

So I was hoping that we could convert these into assertions as above.
But the test suite passing does not quite tell the whole story. We might
still close a pack in the middle of an operation if we need to open
another one and are running against the system file descriptor limits.
That would only trigger in a repository with a large number of packs (or
a very low descriptor limit).

In such a case, we are relying on the lazy-load (and we _are_ racy!).
But the patch above would punish people on low-descriptor systems. It's
better to have an unlikely race and complete the request than to fail
consistently. :-/

For people who are running high-traffic servers, they just need to make
sure their file descriptor limit is reasonably high to avoid the race.

-Peff

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] verify_packfile: check pack validity before accessing data
  2016-09-22  4:05 ` Jeff King
@ 2016-09-22 18:21   ` Junio C Hamano
  0 siblings, 0 replies; 3+ messages in thread
From: Junio C Hamano @ 2016-09-22 18:21 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Jeff King <peff@peff.net> writes:

> So I wanted to know whether there were any code paths that failed to do
> so, and just blindly rely on the lazy-open. Finding the races is
> inherently hard, because you only catch them when somebody else is doing
> a repack. But if we just _remove_ the lazy-load, then it becomes easy to
> catch anybody relying on it. Like:
> ...

Clever; I like it.

> In such a case, we are relying on the lazy-load (and we _are_ racy!).
> But the patch above would punish people on low-descriptor systems. It's
> better to have an unlikely race and complete the request than to fail
> consistently. :-/
>
> For people who are running high-traffic servers, they just need to make
> sure their file descriptor limit is reasonably high to avoid the race.

Thanks for an illuminating backstory for the patch.


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2016-09-22 18:21 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-09-22  3:49 [PATCH] verify_packfile: check pack validity before accessing data Jeff King
2016-09-22  4:05 ` Jeff King
2016-09-22 18:21   ` Junio C Hamano

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.