From: Brandon Williams <bwilliamseng@gmail.com>
To: git <git@vger.kernel.org>
Cc: Jeff King <peff@peff.net>
Subject: invalid tree and commit object
Date: Fri, 8 May 2020 23:19:38 -0700
Message-ID: <CALN-EhTpiLERuB16-WPZaLub6GdaRHJW8xDeaOEqSFtKe0kCYw@mail.gmail.com>

Hey!

It's been a minute since I've written to the list, but I was recently looking
into the rules fsck uses to identify valid or invalid objects, and I believe I
found a case that fsck is currently missing. One of the things fsck
looks for when validating a tree object is that it doesn't contain any
duplicate entries. It even has a nice comment about how `git-write-tree` used
to write out trees with duplicate entries:

    /*
     * git-write-tree used to write out a nonsense tree that has
     * entries with the same name, one blob and one tree.  Make
     * sure we do not have duplicate entries.
     */

Here's the setup (two trees and one blob):
    tree c63d067eaeed0cbc68b7e4fdf40d267c6b152fe8
    tree 6241ab2a5314798183b5c4ee8a7b0ccd12c651e6
    blob 5e1c309dae7f45e0f39b1bf3ac3cd9db12e7d689

    $ git ls-tree c63d067eaeed0cbc68b7e4fdf40d267c6b152fe8
    100644 blob 5e1c309dae7f45e0f39b1bf3ac3cd9db12e7d689    hello
    100644 blob 5e1c309dae7f45e0f39b1bf3ac3cd9db12e7d689    hello.c
    040000 tree 6241ab2a5314798183b5c4ee8a7b0ccd12c651e6    hello

    $ git ls-tree 6241ab2a5314798183b5c4ee8a7b0ccd12c651e6
    100644 blob 5e1c309dae7f45e0f39b1bf3ac3cd9db12e7d689    hello

    # '%' here indicates that there is no newline at the end of the object
    $ git cat-file blob 5e1c309dae7f45e0f39b1bf3ac3cd9db12e7d689
    Hello World%
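
For anyone who wants to recreate the blob without the repository, here is a
minimal sketch of how a blob ID is derived, assuming a SHA-1 repository. The
helper name `git_blob_id` is hypothetical and not part of git. Under that
assumption, the printed ID should match the blob listed above if the content
really is "Hello World" with no trailing newline.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID git assigns to a blob with this content.

    Git hashes the header "blob <size>\\0" followed by the raw bytes.
    (Hypothetical helper for illustration; assumes a SHA-1 repository.)
    """
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    # No trailing newline, as indicated by the '%' in the listing above.
    print(git_blob_id(b"Hello World"))
```
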

fsck currently passes when given these objects, despite c63d067eae having
a duplicate entry. This seems to be because the duplicate entry check in
`fsck_tree` only compares adjacent entries. Under the sorting rules, the tree
"hello" sorts as if it were named "hello/", which places it after "hello.c",
so fsck is unable to realize that there is both a blob and a tree with the
name "hello".
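
To make the ordering problem concrete, here is a small simulation of it. It is
only a simplified model, not fsck's actual code: the function names are
hypothetical, and the sort key approximates git's tree ordering by appending
"/" to directory names before comparing bytes.

```python
def tree_sort_key(name: str, is_tree: bool) -> bytes:
    # Approximation of git's tree ordering: entries are compared bytewise,
    # with a directory compared as if its name ended in "/".
    return name.encode() + (b"/" if is_tree else b"")


def has_adjacent_duplicate(ordered):
    # Simplified model of a check that only looks at neighbouring entries.
    return any(a[0] == b[0] for a, b in zip(ordered, ordered[1:]))


def has_any_duplicate(ordered):
    # Naive whole-tree check, shown only for comparison.
    names = [name for name, _ in ordered]
    return len(names) != len(set(names))


# (name, is_tree) pairs from tree c63d067eae above.
entries = [("hello", False), ("hello.c", False), ("hello", True)]
ordered = sorted(entries, key=lambda e: tree_sort_key(*e))

if __name__ == "__main__":
    for name, is_tree in ordered:
        print("tree" if is_tree else "blob", name)
    print("adjacent check finds duplicate:", has_adjacent_duplicate(ordered))
    print("whole-tree check finds duplicate:", has_any_duplicate(ordered))
```

Because "." (0x2e) sorts before "/" (0x2f), "hello.c" lands between the blob
"hello" and the tree "hello", so the two same-named entries are never
neighbours and the adjacent-only comparison misses them.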

I was even able to produce a commit and push it to GitHub[1] (which
didn't complain):

    $ git show --pretty=raw 62f1ff6e109f8b77edd7eeb65f6634faa76a93b2
    commit 62f1ff6e109f8b77edd7eeb65f6634faa76a93b2
    tree c63d067eaeed0cbc68b7e4fdf40d267c6b152fe8
    author Brandon Williams <bwilliams.eng@gmail.com> 1589004242 -0700
    committer Brandon Williams <bwilliams.eng@gmail.com> 1589004242 -0700

        hello

Checking out that commit leaves your working directory in a somewhat
broken and 'unclean' state (although GitHub's UI seems to be able to handle
displaying it properly).

Am I correct in assuming that this object is indeed invalid and should be
rejected by fsck?

-Brandon

[1]: https://github.com/bmwill/invalid-commit
