All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC/PATCH] fsck: do not canonicalize modes in trees we are checking
@ 2014-09-23 15:47 Jeff King
  2014-09-23 16:23 ` Edward Thomson
  0 siblings, 1 reply; 8+ messages in thread
From: Jeff King @ 2014-09-23 15:47 UTC (permalink / raw)
  To: git; +Cc: Edward Thomson, Kirill Smelkov

Doing so means that we do not actually get to see bogus
modes; they are converted into one of our known-good modes
by decode_tree_entry. We want to see the raw data so that we
can complain about it.

Signed-off-by: Jeff King <peff@peff.net>
---
As far as I can tell, fsck's mode-checking has been totally broken
basically forever. Which makes me a little nervous to fix it. :)
linux.git does have some bogus modes, but they are 100664, which is
specifically ignored here unless "fsck --strict" is in effect.

The implementation feels a bit hacky. The global is ugly, but it avoids
having to pass options all through the call chain of init_tree_entry,
update_tree_entry, etc.

Another option would be to have decode_tree_entry leave the raw mode in
the tree_desc, and only canonicalize it during tree_entry_extract (and
teach fsck to look at the raw data, not _extract). That would mean
reverting 7146e66 (tree-walk: finally switch over tree descriptors to
contain a pre-parsed entry, 2014-02-06). That commit says there's no
real benefit at the time, but that future code might benefit. I don't
know if that future code ever materialized (Kirill cc'd).

I thought at first that commit might have been responsible for this
breakage, but I don't think so. The canonicalization has happened in
tree_entry_extract since at least 2006, and we have always called that
function in fsck.

 fsck.c          |  2 ++
 t/t1450-fsck.sh | 21 +++++++++++++++++++++
 tree-walk.c     |  4 +++-
 tree-walk.h     |  3 +++
 4 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/fsck.c b/fsck.c
index 56156ff..ef79396 100644
--- a/fsck.c
+++ b/fsck.c
@@ -153,6 +153,7 @@ static int fsck_tree(struct tree *item, int strict, fsck_error error_func)
 	unsigned o_mode;
 	const char *o_name;
 
+	decode_tree_raw = 1;
 	init_tree_desc(&desc, item->buffer, item->size);
 
 	o_mode = 0;
@@ -213,6 +214,7 @@ static int fsck_tree(struct tree *item, int strict, fsck_error error_func)
 		o_name = name;
 	}
 
+	decode_tree_raw = 0;
 	retval = 0;
 	if (has_null_sha1)
 		retval += error_func(&item->object, FSCK_WARN, "contains entries pointing to null sha1");
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index b52397a..ba40c6f 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -176,6 +176,27 @@ test_expect_success 'malformatted tree object' '
 	grep "error in tree .*contains duplicate file entries" out
 '
 
+# Basically an 8-bit clean version of sed 's/$1/$2/g'.
+munge_tree_bytes () {
+	perl -e '
+		binmode(STDIN); binmode(STDOUT);
+		$_ = do { local $/; <STDIN> };
+		s/$ARGV[0]/$ARGV[1]/g;
+		print
+	' "$@"
+}
+
+test_expect_success 'bogus mode in tree' '
+	test_when_finished "remove_object \$T" &&
+	T=$(
+		git cat-file tree HEAD >good &&
+		munge_tree_bytes 100644 123456 <good >bad &&
+		git hash-object -w -t tree bad
+	) &&
+	git fsck 2>out &&
+	grep "warning.*contains bad file modes" out
+'
+
 test_expect_success 'tag pointing to nonexistent' '
 	cat >invalid-tag <<-\EOF &&
 	object ffffffffffffffffffffffffffffffffffffffff
diff --git a/tree-walk.c b/tree-walk.c
index 5dd9a71..baca766 100644
--- a/tree-walk.c
+++ b/tree-walk.c
@@ -5,6 +5,8 @@
 #include "tree.h"
 #include "pathspec.h"
 
+int decode_tree_raw;
+
 static const char *get_mode(const char *str, unsigned int *modep)
 {
 	unsigned char c;
@@ -37,7 +39,7 @@ static void decode_tree_entry(struct tree_desc *desc, const char *buf, unsigned
 
 	/* Initialize the descriptor entry */
 	desc->entry.path = path;
-	desc->entry.mode = canon_mode(mode);
+	desc->entry.mode = decode_tree_raw ? mode : canon_mode(mode);
 	desc->entry.sha1 = (const unsigned char *)(path + len);
 }
 
diff --git a/tree-walk.h b/tree-walk.h
index ae7fb3a..9f78e85 100644
--- a/tree-walk.h
+++ b/tree-walk.h
@@ -25,6 +25,9 @@ static inline int tree_entry_len(const struct name_entry *ne)
 	return (const char *)ne->sha1 - ne->path - 1;
 }
 
+/* If set, do not canonicalize modes in tree entries. */
+extern int decode_tree_raw;
+
 void update_tree_entry(struct tree_desc *);
 void init_tree_desc(struct tree_desc *desc, const void *buf, unsigned long size);
 
-- 
2.1.1.496.g1ba8081

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [RFC/PATCH] fsck: do not canonicalize modes in trees we are checking
  2014-09-23 15:47 [RFC/PATCH] fsck: do not canonicalize modes in trees we are checking Jeff King
@ 2014-09-23 16:23 ` Edward Thomson
  2014-09-23 16:30   ` Jeff King
  0 siblings, 1 reply; 8+ messages in thread
From: Edward Thomson @ 2014-09-23 16:23 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Kirill Smelkov

On Tue, Sep 23, 2014 at 11:47:51AM -0400, Jeff King wrote:
> As far as I can tell, fsck's mode-checking has been totally broken
> basically forever. Which makes me a little nervous to fix it. :)
> linux.git does have some bogus modes, but they are 100664, which is
> specifically ignored here unless "fsck --strict" is in effect.

I'm in favor of checking the mode in fsck, at least when --strict.  
But I would suggest we be lax (by default) about other likely-to-exist
but strictly invalid modes to prevent peoples previously workable
repositories from being now broken.

I have, for example, encountered 100775 in the wild, and would argue that
like 100644, it should probably not fail unless we are in --strict mode.

Thanks-
-ed

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC/PATCH] fsck: do not canonicalize modes in trees we are checking
  2014-09-23 16:23 ` Edward Thomson
@ 2014-09-23 16:30   ` Jeff King
  2014-10-12 22:37     ` Ben Aveling
  0 siblings, 1 reply; 8+ messages in thread
From: Jeff King @ 2014-09-23 16:30 UTC (permalink / raw)
  To: Edward Thomson; +Cc: git

[-cc Kirill, as his address seem out-of-date]

On Tue, Sep 23, 2014 at 04:23:43PM +0000, Edward Thomson wrote:

> On Tue, Sep 23, 2014 at 11:47:51AM -0400, Jeff King wrote:
> > As far as I can tell, fsck's mode-checking has been totally broken
> > basically forever. Which makes me a little nervous to fix it. :)
> > linux.git does have some bogus modes, but they are 100664, which is
> > specifically ignored here unless "fsck --strict" is in effect.
> 
> I'm in favor of checking the mode in fsck, at least when --strict.  
> But I would suggest we be lax (by default) about other likely-to-exist
> but strictly invalid modes to prevent peoples previously workable
> repositories from being now broken.
> 
> I have, for example, encountered 100775 in the wild, and would argue that
> like 100644, it should probably not fail unless we are in --strict mode.

Yeah, I'd agree with that. The big question is: what breakage have we
seen in the wild? :)

I think treating 100775 the same as 100664 makes sense (want to do a
patch?). Do we know of any others? I guess we can collect them as time
goes on and reports come in. That's not the nicest thing for people with
such repos, but then again, their repos _are_ broken (and it's only
really a showstopper if they are trying to push to somebody with
receive.fsckObjects turned on).

-Peff

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC/PATCH] fsck: do not canonicalize modes in trees we are checking
  2014-09-23 16:30   ` Jeff King
@ 2014-10-12 22:37     ` Ben Aveling
  2014-10-14  8:21       ` Jeff King
  0 siblings, 1 reply; 8+ messages in thread
From: Ben Aveling @ 2014-10-12 22:37 UTC (permalink / raw)
  To: Jeff King, Edward Thomson; +Cc: git


Hi,

A question about fsck - is there a reason it doesn't have an option to 
delete bad objects?

Regards, Ben

On 24/09/2014 02:30, Jeff King wrote:
> [-cc Kirill, as his address seem out-of-date]
>
> On Tue, Sep 23, 2014 at 04:23:43PM +0000, Edward Thomson wrote:
>
>> On Tue, Sep 23, 2014 at 11:47:51AM -0400, Jeff King wrote:
>>> As far as I can tell, fsck's mode-checking has been totally broken
>>> basically forever. Which makes me a little nervous to fix it. :)
>>> linux.git does have some bogus modes, but they are 100664, which is
>>> specifically ignored here unless "fsck --strict" is in effect.
>> I'm in favor of checking the mode in fsck, at least when --strict.
>> But I would suggest we be lax (by default) about other likely-to-exist
>> but strictly invalid modes to prevent peoples previously workable
>> repositories from being now broken.
>>
>> I have, for example, encountered 100775 in the wild, and would argue that
>> like 100644, it should probably not fail unless we are in --strict mode.
> Yeah, I'd agree with that. The big question is: what breakage have we
> seen in the wild? :)
>
> I think treating 100775 the same as 100664 makes sense (want to do a
> patch?). Do we know of any others? I guess we can collect them as time
> goes on and reports come in. That's not the nicest thing for people with
> such repos, but then again, their repos _are_ broken (and it's only
> really a showstopper if they are trying to push to somebody with
> receive.fsckObjects turned on).
>
> -Peff
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC/PATCH] fsck: do not canonicalize modes in trees we are checking
  2014-10-12 22:37     ` Ben Aveling
@ 2014-10-14  8:21       ` Jeff King
       [not found]         ` <543F074B.2050907@optusnet.com.au>
  0 siblings, 1 reply; 8+ messages in thread
From: Jeff King @ 2014-10-14  8:21 UTC (permalink / raw)
  To: Ben Aveling; +Cc: Edward Thomson, git

On Mon, Oct 13, 2014 at 09:37:27AM +1100, Ben Aveling wrote:

> A question about fsck - is there a reason it doesn't have an option to
> delete bad objects?

If the objects are reachable, then deleting them would create other big
problems (i.e., we would be breaking the object graph!).

If they are not, then it is probably safest for them to go away via the
normal means (repack/prune via "git gc").  Deleting via fsck would mean
replicating the reachability and deletion code found elsewhere.

-Peff

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC/PATCH] fsck: do not canonicalize modes in trees we are checking
       [not found]         ` <543F074B.2050907@optusnet.com.au>
@ 2014-10-16  0:20           ` Jeff King
  2014-10-19 12:40             ` Ben Aveling
  0 siblings, 1 reply; 8+ messages in thread
From: Jeff King @ 2014-10-16  0:20 UTC (permalink / raw)
  To: Ben Aveling; +Cc: Edward Thomson, git

On Thu, Oct 16, 2014 at 10:46:19AM +1100, Ben Aveling wrote:

> On 14/10/2014 19:21, Jeff King wrote:
> >On Mon, Oct 13, 2014 at 09:37:27AM +1100, Ben Aveling wrote:
> >>A question about fsck - is there a reason it doesn't have an option to
> >>delete bad objects?
> >If the objects are reachable, then deleting them would create other big
> >problems (i.e., we would be breaking the object graph!).
> 
> The man page for fsck advises:
> 
>    "Any corrupt objects you will have to find in backups or other
>    archives (i.e., you can just remove them and do an /rsync/ with some
>    other site in the hopes that somebody else has the object you have
>    corrupted)."

I think there are really two different types of corruption worth
considering here.

One is where the data bytes of the object were correct at one point, but
now aren't anymore. The sha1 does not check out, and the object is
broken. You need to go find another copy of the object somewhere else.

The other is where the object was malformed in the first place,
generally due to a software bug (e.g., syntactically bogus email
addresses on committer lines, trees that are not sorted properly, etc).
There is no point in finding another object, as any you find will be
byte-for-byte identical.

For this second type (which is what I thought we were talking about in
this thread), deleting the object would be Very Bad.

For the first type, deleting the object _could_ be a path forward. But
it's not the usual method for fixing corruption. If you copy a good
version of the object into the repository, then a subsequent repack
should throw away the broken copy in favor of the good one. Note that
you need to copy the objects in manually; you can't use the git protocol
to do so. But that is true whether you have the broken object or not
(because git does not communicate with the other side about _every_
object you have; it talks about commits, and assumes that you have the
rest of the object graph that a commit refers to).

So I don't think deleting really helps in that case. And it may hurt if
it later turns out you don't have another copy of the object and need to
recover what you can from the corrupted pieces (which has actually
happened before).

> I ask because I have a corrupt repository, and every time I run fsck, it
> reports one corrupt object, then stops.

Corrupt how? Bit-corruption, or a malformed object?

> I could write a script to repeatedly call fsck and then remove the
> next corrupt object, but it raises the question for me; could it make
> sense to extend fsck with the option to do to the removes? Or even
> better, do the removes and then do the necessary [r]sync, assuming the
> user has another repository that has a good copy of the bad objects,
> which in this case I do.

If you have a non-corrupted version of the repository, the simplest
thing is to just pack it, copy the resulting packfile to the broken
repository (again, using "cp" or "rsync" and not git), and then repack
there. That won't transfer the minimum amount of data, but it's simple
and only requires one round-trip (bearing in mind that git doesn't know
what out-of-band method you're using, nor whether we can even initiate a
request from the broken repo to the good one, so each round trip
generally does need to be done by hand).

-Peff

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC/PATCH] fsck: do not canonicalize modes in trees we are checking
  2014-10-16  0:20           ` Jeff King
@ 2014-10-19 12:40             ` Ben Aveling
  2014-10-20  9:21               ` Jeff King
  0 siblings, 1 reply; 8+ messages in thread
From: Ben Aveling @ 2014-10-19 12:40 UTC (permalink / raw)
  To: Jeff King; +Cc: Edward Thomson, git

On 16/10/14 11:20, Jeff King wrote:
> On Thu, Oct 16, 2014 at 10:46:19AM +1100, Ben Aveling wrote:
>
> I have a corrupt repository, and every time I run fsck, it
> reports one corrupt object, then stops.
> Corrupt how? Bit-corruption, or a malformed object?

Bit-corruption, in multiple places.

> If you have a non-corrupted version of the repository, the simplest 
> thing is to just pack it, copy the resulting packfile to the broken 
> repository (again, using "cp" or "rsync" and not git), and then repack 
> there. 

This seems to have worked. I also had to move away the existing .idx and 
copy in a new one before it was happy.

I'm not sure that what I've done is so different from simply copying the 
other version of the repository - there shouldn't have been anything in 
the corrupt version that wasn't also in the good one. But any rate, it 
worked.

Thanks, Ben

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC/PATCH] fsck: do not canonicalize modes in trees we are checking
  2014-10-19 12:40             ` Ben Aveling
@ 2014-10-20  9:21               ` Jeff King
  0 siblings, 0 replies; 8+ messages in thread
From: Jeff King @ 2014-10-20  9:21 UTC (permalink / raw)
  To: Ben Aveling; +Cc: Edward Thomson, git

On Sun, Oct 19, 2014 at 11:40:22PM +1100, Ben Aveling wrote:

> This seems to have worked. I also had to move away the existing .idx and
> copy in a new one before it was happy.

Sorry if I wasn't clear; you do need to copy the .idx files along with
the packfiles (you can regenerate the .idx files from the packfiles on
the destination, but they're not that big; it's probably easier just to
copy them).

> I'm not sure that what I've done is so different from simply copying the
> other version of the repository - there shouldn't have been anything in the
> corrupt version that wasn't also in the good one. But any rate, it worked.

Right, a valid technique for repairing corruption is to just blow away
the original repo entirely and replace it with a good copy. But this is a
way of ensuring that no commits are missed, and keeping the original set
of refs, config options, and reflogs.

-Peff

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2014-10-20  9:22 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-09-23 15:47 [RFC/PATCH] fsck: do not canonicalize modes in trees we are checking Jeff King
2014-09-23 16:23 ` Edward Thomson
2014-09-23 16:30   ` Jeff King
2014-10-12 22:37     ` Ben Aveling
2014-10-14  8:21       ` Jeff King
     [not found]         ` <543F074B.2050907@optusnet.com.au>
2014-10-16  0:20           ` Jeff King
2014-10-19 12:40             ` Ben Aveling
2014-10-20  9:21               ` Jeff King

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.