* Get a git diff without taking index into account
@ 2015-02-18 14:57 Eric Frederich
  2015-02-18 15:06 ` Eric Frederich
  0 siblings, 1 reply; 11+ messages in thread
From: Eric Frederich @ 2015-02-18 14:57 UTC
  To: git

Some background.
I'm trying to use Git as an object store for trees.
I put trees into the repo and can retrieve them.
I'm having issues with diffing these trees that I retrieve from the repo.
If I use a "git checkout" the diffs seem to work, but if I create the
tree myself using lower-level ls-tree and cat-file commands then the
diff doesn't work.
It seems to take the index into account.

Below is a complete working example.
Should be able to copy / paste line by line.
I am trying to run the diff of form:
  git diff [--options] <commit> [--] [<path>...]
where it says it does a diff from working tree to a commit

Maybe git is interpreting my command as one of the other forms?
Can someone help me understand what is going on?

#
# EXAMPLE
#

# cleanup and create dummy data
rm -rf /tmp/mydatastore && mkdir -p /tmp/mydatastore
rm -rf /tmp/test /tmp/test2 && mkdir -p /tmp/test/d1 /tmp/test2
echo "this is f1" > /tmp/test/f1
echo "this is f2" > /tmp/test/d1/f2

# create a new branch called test with data from /tmp/test
git --git-dir=/tmp/mydatastore/.db init --bare
git --git-dir=/tmp/mydatastore/.db hash-object -w /tmp/test/d1/f2 /tmp/test/f1
echo -e "100644 blob c837441e09d13d3a0a2d906d7c3813adda504833\tf2" |
git --git-dir=/tmp/mydatastore/.db mktree --batch
echo -e "100644 blob 11ac5613caf504eec18b2e60f1e6b3c598b085eb\tf1\n40755 tree 055f1133fbc9872f3093cca5f182b16611e6789a\td1" | git --git-dir=/tmp/mydatastore/.db mktree
commit_sha=`git --git-dir=/tmp/mydatastore/.db commit-tree -m "initial commit" c427094b22e74d1eaeebdc9e49e6790b5b6a706a`
git --git-dir=/tmp/mydatastore/.db update-ref refs/heads/test $commit_sha

# why does this show diffs?
git --git-dir=/tmp/mydatastore/.db --work-tree=/tmp/test diff $commit_sha

# after doing a checkout somewhere else it doesn't show any diffs
git --git-dir=/tmp/mydatastore/.db --work-tree=/tmp/test2 checkout test .
git --git-dir=/tmp/mydatastore/.db --work-tree=/tmp/test diff $commit_sha

# remove the index and it shows diffs again
rm /tmp/mydatastore/.db/index
git --git-dir=/tmp/mydatastore/.db --work-tree=/tmp/test diff $commit_sha

# it was my understanding from "git help diff" that this form of diff is
# supposed to compare a work tree against a commit or branch and not take
# into account the index.
# Clearly it takes into account the index because we get different
# results with and without it
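
A minimal sketch of the same object-store setup that captures each object ID in a variable instead of hardcoding it. This is not taken from the thread: the temporary paths come from `mktemp`, the identity values are placeholders that `commit-tree` needs, and the tree mode is written as `040000` rather than the `40755` used above, so the resulting tree and commit IDs will differ from the ones quoted in the example.

```shell
set -e

# Scratch locations (assumed; the example above uses fixed /tmp paths).
tmp=$(mktemp -d)
store="$tmp/mydatastore/.db"
src="$tmp/test"
mkdir -p "$tmp/mydatastore" "$src/d1"
echo "this is f1" > "$src/f1"
echo "this is f2" > "$src/d1/f2"

git init --bare -q "$store"
export GIT_DIR="$store"

# commit-tree needs an identity; these values are placeholders.
export GIT_AUTHOR_NAME=example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=example GIT_COMMITTER_EMAIL=example@example.invalid

# Write the blobs and remember their IDs.
f1=$(git hash-object -w "$src/f1")
f2=$(git hash-object -w "$src/d1/f2")

# mktree reads ls-tree style lines: "<mode> <type> <id><TAB><name>".
d1=$(printf '100644 blob %s\tf2\n' "$f2" | git mktree)
root=$(printf '100644 blob %s\tf1\n040000 tree %s\td1\n' "$f1" "$d1" | git mktree)

# Wrap the root tree in a commit and point a branch at it.
commit_sha=$(git commit-tree -m "initial commit" "$root")
git update-ref refs/heads/test "$commit_sha"

# List the paths stored in the commit.
git ls-tree -r --name-only "$commit_sha"
```

Capturing the IDs keeps the script copy/paste safe if the file contents change, at the cost of a few extra variables.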


* Re: Get a git diff without taking index into account
  2015-02-18 14:57 Get a git diff without taking index into account Eric Frederich
@ 2015-02-18 15:06 ` Eric Frederich
  2015-02-18 15:37   ` Junio C Hamano
  0 siblings, 1 reply; 11+ messages in thread
From: Eric Frederich @ 2015-02-18 15:06 UTC
  To: git

More concise example:

cd /tmp
git clone --bare https://github.com/defnull/bottle.git
mkdir /tmp/bottlecopy
git --git-dir=/tmp/bottle.git --work-tree=/tmp/bottlecopy checkout master .

# this shows no diffs
git --git-dir=/tmp/bottle.git --work-tree=/tmp/bottlecopy diff master
rm /tmp/bottle.git/index

# why does this one now show a diff?
# how can I compare a working directory to a commit without taking the index into account?
git --git-dir=/tmp/bottle.git --work-tree=/tmp/bottlecopy diff master


* Re: Get a git diff without taking index into account
  2015-02-18 15:06 ` Eric Frederich
@ 2015-02-18 15:37   ` Junio C Hamano
  2015-02-18 15:42     ` Eric Frederich
  0 siblings, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2015-02-18 15:37 UTC
  To: Eric Frederich; +Cc: Git Mailing List

On Wed, Feb 18, 2015 at 7:06 AM, Eric Frederich
<eric.frederich@gmail.com> wrote:
> # how can I compare a working directory to a commit without taking the
> index into account?

You don't.


* Re: Get a git diff without taking index into account
  2015-02-18 15:37   ` Junio C Hamano
@ 2015-02-18 15:42     ` Eric Frederich
  2015-02-18 16:33       ` Junio C Hamano
  0 siblings, 1 reply; 11+ messages in thread
From: Eric Frederich @ 2015-02-18 15:42 UTC
  To: Junio C Hamano; +Cc: Git Mailing List

This is from "git help diff".  It seems to imply that I should be able to do it.
It mentions nothing of the index.

       git diff [--options] <commit> [--] [<path>...]
           This form is to view the changes you have in your working tree
           relative to the named <commit>. You can use HEAD to compare it
           with the latest commit, or a branch name to compare with the tip
           of a different branch.


* Re: Get a git diff without taking index into account
  2015-02-18 15:42     ` Eric Frederich
@ 2015-02-18 16:33       ` Junio C Hamano
  2015-02-18 18:27         ` Eric Frederich
  0 siblings, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2015-02-18 16:33 UTC
  To: Eric Frederich; +Cc: Git Mailing List

Eric Frederich <eric.frederich@gmail.com> writes:

> This is from "git help diff".  It seems to imply that I should be able to do it.
> It mentions nothing of the index.

Most of the documentation on early subcommands (and "git diff"
certainly is one of the early subcommands) was written back when
everybody knew that Git almost always talks about _tracked_ files
that are known to the index, and the only time it even cares about
untracked ones that are not in the index is when it tries to help
users by reminding them what they may have forgotten to "git add".

Documentation pages do not bother repeating "this only looks at
tracked paths" for this reason; Git is about tracked files by
default.

Perhaps you can suggest how to improve the description of commands
without being too repetitive?

Thanks.
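
A small sketch of the point about tracked paths, assuming a bare store plus a separate working directory as in the examples earlier in the thread. The temporary paths, the file names, and the placeholder identity are assumptions. With no index, no path is tracked, so the path in the commit is reported as deleted; once the commit's tree is loaded into the index with `git read-tree`, the same diff is empty, and a file that is not in the index is never mentioned.

```shell
set -e

tmp=$(mktemp -d)
store="$tmp/store.git"
work="$tmp/work"
mkdir -p "$work"
echo "this is f1" > "$work/f1"

# Initialize before exporting GIT_WORK_TREE; "init --bare" rejects it.
git init --bare -q "$store"
export GIT_DIR="$store" GIT_WORK_TREE="$work"

# Placeholder identity for commit-tree.
export GIT_AUTHOR_NAME=example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=example GIT_COMMITTER_EMAIL=example@example.invalid

# Store the working directory content as a commit without using the index.
blob=$(git hash-object -w "$work/f1")
tree=$(printf '100644 blob %s\tf1\n' "$blob" | git mktree)
commit=$(git commit-tree -m "initial commit" "$tree")

cd "$work"

# No index file exists yet, so f1 is not a tracked path.
without_index=$(git diff --name-status "$commit")

# Load the commit's tree into the index; the working files are not touched.
git read-tree "$commit"

# Add a file that the index does not know about.
echo "scratch" > "$work/extra"
with_index=$(git diff --name-status "$commit")

printf 'without index: [%s]\n' "$without_index"
printf 'with index:    [%s]\n' "$with_index"
```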


* Re: Get a git diff without taking index into account
  2015-02-18 16:33       ` Junio C Hamano
@ 2015-02-18 18:27         ` Eric Frederich
  2015-02-18 18:32           ` Jeff King
  0 siblings, 1 reply; 11+ messages in thread
From: Eric Frederich @ 2015-02-18 18:27 UTC
  To: Junio C Hamano; +Cc: Git Mailing List

Thanks for the reply.
My immediate concern is not to fix the documentation but to get some
sort of status or diff.
I want to avoid using an index because I want to allow multiple
processes to do different diffs at the same time.

Right now I can put trees into the repo and get trees out without
using the index but I had to write routines that use the lower level
commands.
If I could use the index it would be a couple of out of the box
commands and I wouldn't have to write my own routine.
While I've already written ones for "git add" and "git checkout" I'm
trying to avoid writing one for "git diff"

It looks like this is unavoidable.
Oh well... thanks for the help.




* Re: Get a git diff without taking index into account
  2015-02-18 18:27         ` Eric Frederich
@ 2015-02-18 18:32           ` Jeff King
  2015-02-18 19:36             ` Eric Frederich
  2015-02-18 21:38             ` Eric Frederich
  0 siblings, 2 replies; 11+ messages in thread
From: Jeff King @ 2015-02-18 18:32 UTC
  To: Eric Frederich; +Cc: Junio C Hamano, Git Mailing List

On Wed, Feb 18, 2015 at 01:27:50PM -0500, Eric Frederich wrote:

> My immediate concern is not to fix the documentation but to get some
> sort of status or diff.
> I want to avoid using an index because I want to allow multiple
> processes to do different diffs at the same time.

If you only have one working tree, can't all of the processes use the
same index (that matches the working tree) and do different diffs
against it?

If you have multiple working trees, can you use one index per working
tree, and specify it using GIT_INDEX_FILE?

If you can persist the index file for each working tree, this will be
much faster in the long run, too (you can just refresh the index before
each diff, which means that git does not have to actually open the files
in most cases; we can compare their stat information to what is in the
index, and then the index sha1 with what is in the tree).

-Peff
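
One way the "one index per working tree" suggestion could look, as a sketch only. The helper name `diff_against`, the choice to keep each index next to its working tree as `<worktree>.index`, the temporary paths, and the placeholder identity are all assumptions. Because each call uses its own `GIT_INDEX_FILE`, two processes diffing different working trees never touch the same index.

```shell
set -e

tmp=$(mktemp -d)
store="$tmp/store.git"
mkdir -p "$tmp/wt1" "$tmp/wt2"
echo "this is f1" > "$tmp/wt1/f1"
echo "this is f1, edited" > "$tmp/wt2/f1"

git init --bare -q "$store"
export GIT_DIR="$store"

# Placeholder identity for commit-tree.
export GIT_AUTHOR_NAME=example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=example GIT_COMMITTER_EMAIL=example@example.invalid

# Commit the content of wt1 without using any index.
blob=$(git hash-object -w "$tmp/wt1/f1")
tree=$(printf '100644 blob %s\tf1\n' "$blob" | git mktree)
commit=$(git commit-tree -m "initial commit" "$tree")

# Hypothetical helper: diff working tree $1 against revision $2 using an
# index file private to that working tree. Runs in a subshell so the
# exported variables and the cd do not leak into the caller.
diff_against() (
    wt=$1
    rev=$2
    cd "$wt"
    export GIT_WORK_TREE="$wt" GIT_INDEX_FILE="$wt.index"
    git read-tree "$rev"
    git diff --name-status "$rev"
)

out1=$(diff_against "$tmp/wt1" "$commit")
out2=$(diff_against "$tmp/wt2" "$commit")

printf 'wt1: [%s]\nwt2: [%s]\n' "$out1" "$out2"
```

Calling `git read-tree` on every run keeps the helper simple but discards the cached stat information, so it does not get the speed benefit of a persisted, refreshed index described above.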


* Re: Get a git diff without taking index into account
  2015-02-18 18:32           ` Jeff King
@ 2015-02-18 19:36             ` Eric Frederich
  2015-02-18 21:38             ` Eric Frederich
  1 sibling, 0 replies; 11+ messages in thread
From: Eric Frederich @ 2015-02-18 19:36 UTC
  To: Jeff King; +Cc: Junio C Hamano, Git Mailing List

Thanks Jeff.

I recognize your picture from here...
   http://git.661346.n2.nabble.com/push-race-td7569254.html
... which helped me figure out how two processes trying to update a
ref at the same time works out.

I will try using a separate GIT_INDEX_FILE for each working tree.

I'm not certain that what I'm trying to do is even a good idea.

We have a system which is the official storage of things we'll just
call "items".
These items have one or more "revisions".  These revisions can have
all sorts of relationships to other revisions.
Once a revision is released, it is locked down and cannot be changed,
including its relationships to other revisions.

Rather than working directly with this official storage system, we
want our application to work against a concept of a "local workspace".
This is where I want to use Git.
We can map a released revision to a tree structure by getting all the
files, serializing all the attributes, relation data, etc.
That tree structure is what I would store in Git.

My options seem to be
  1) use a single Git repo to store all items in a disconnected manner
(each item has a branch disconnected (orphaned?) from the other
branches)
  2) each item gets its own Git repo
  3) use a single Git repo to store all items but have them all
together in a workspace

I'm pursuing option (1) right now and trying to see how much work it would take.
With option (2) I think that would limit my ability to send a bunch of
items from one repo to another.
Option (3) doesn't really map to the system we're trying to mimic
because releases are done at the "item revision" level, not at a
higher workspace level.


* Re: Get a git diff without taking index into account
  2015-02-18 18:32           ` Jeff King
  2015-02-18 19:36             ` Eric Frederich
@ 2015-02-18 21:38             ` Eric Frederich
  2015-02-18 22:16               ` Junio C Hamano
  2015-02-18 22:30               ` Jeff King
  1 sibling, 2 replies; 11+ messages in thread
From: Eric Frederich @ 2015-02-18 21:38 UTC
  To: Jeff King; +Cc: Junio C Hamano, Git Mailing List

On Wed, Feb 18, 2015 at 1:32 PM, Jeff King <peff@peff.net> wrote:
> On Wed, Feb 18, 2015 at 01:27:50PM -0500, Eric Frederich wrote:
>
> If you can persist the index file for each working tree, this will be
> much faster in the long run, too (you can just refresh the index before
> each diff, which means that git does not have to actually open the files
> in most cases; we can compare their stat information to what is in the
> index, and then the index sha1 with what is in the tree).

Could you elaborate on "you can just refresh the index before each diff"?
What command would I use to do this?
I don't want to store some object just to get a diff of it.

Also, how would I go about detecting untracked files the way status does?
There is no way to specify a HEAD per git command using switches or
environment variables.
I can't change the HEAD of the Git repo because other processes may be
using it at the same time.


* Re: Get a git diff without taking index into account
  2015-02-18 21:38             ` Eric Frederich
@ 2015-02-18 22:16               ` Junio C Hamano
  2015-02-18 22:30               ` Jeff King
  1 sibling, 0 replies; 11+ messages in thread
From: Junio C Hamano @ 2015-02-18 22:16 UTC
  To: Eric Frederich; +Cc: Jeff King, Git Mailing List

On Wed, Feb 18, 2015 at 1:38 PM, Eric Frederich
<eric.frederich@gmail.com> wrote:
> Could you elaborate on "you can just refresh the index before each diff"
> What command would I use to do this?

"update-index --refresh", perhaps?

> Also, how would I go about detecting untracked files the way status does?

"ls-files"?
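
A sketch of what the refresh step might look like in practice, assuming an index that was populated with `git read-tree` and therefore has no stat information yet. The paths, the index location, and the placeholder identity are assumptions. The plumbing command `git diff-index` is used here because it does not hide entries that merely look stale, which makes the effect of the refresh visible.

```shell
set -e

tmp=$(mktemp -d)
store="$tmp/store.git"
work="$tmp/work"
mkdir -p "$work"
echo "this is f1" > "$work/f1"

git init --bare -q "$store"
export GIT_DIR="$store"

# Placeholder identity for commit-tree.
export GIT_AUTHOR_NAME=example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=example GIT_COMMITTER_EMAIL=example@example.invalid

blob=$(git hash-object -w "$work/f1")
tree=$(printf '100644 blob %s\tf1\n' "$blob" | git mktree)
commit=$(git commit-tree -m "initial commit" "$tree")

cd "$work"
export GIT_WORK_TREE="$work" GIT_INDEX_FILE="$tmp/work.index"

# Populate the index from the commit; no stat data is recorded for f1 yet.
git read-tree "$commit"

# Without stat data, diff-index cannot tell that f1 is unchanged.
before=$(git diff-index --name-only "$commit")

# Refresh: compare file content once and record the stat data in the index.
git update-index -q --refresh

# Now the stat data matches and nothing is reported.
after=$(git diff-index --name-only "$commit")

printf 'before refresh: [%s]\nafter refresh:  [%s]\n' "$before" "$after"
```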


* Re: Get a git diff without taking index into account
  2015-02-18 21:38             ` Eric Frederich
  2015-02-18 22:16               ` Junio C Hamano
@ 2015-02-18 22:30               ` Jeff King
  1 sibling, 0 replies; 11+ messages in thread
From: Jeff King @ 2015-02-18 22:30 UTC
  To: Eric Frederich; +Cc: Junio C Hamano, Git Mailing List

On Wed, Feb 18, 2015 at 04:38:55PM -0500, Eric Frederich wrote:

> Also, how would I go about detecting untracked files the way status does?
> There is no way to specify a HEAD per git command using switches or
> environment variables.
> I can't change the HEAD of the Git repo because other processes may be
> using it at the same time.

Untracked files are a function of the index, not of the HEAD. So you
would load whatever tree you like into your index (either "the" index,
or a temporary one you specify with GIT_INDEX_FILE), and then "git
ls-files -o".

-Peff
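
The approach described above can be sketched as follows, assuming a temporary index selected with `GIT_INDEX_FILE` so that neither HEAD nor the repository's own index is touched. The paths, the file names, and the placeholder identity are assumptions. The temporary index is kept outside the working directory so that it is not itself listed as untracked.

```shell
set -e

tmp=$(mktemp -d)
store="$tmp/store.git"
work="$tmp/work"
mkdir -p "$work"
echo "this is f1" > "$work/f1"

git init --bare -q "$store"
export GIT_DIR="$store"

# Placeholder identity for commit-tree.
export GIT_AUTHOR_NAME=example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=example GIT_COMMITTER_EMAIL=example@example.invalid

blob=$(git hash-object -w "$work/f1")
tree=$(printf '100644 blob %s\tf1\n' "$blob" | git mktree)
commit=$(git commit-tree -m "initial commit" "$tree")

# A file that is in the working directory but not in the commit.
echo "scratch" > "$work/notes.txt"

cd "$work"
export GIT_WORK_TREE="$work" GIT_INDEX_FILE="$tmp/scratch.index"

# Load whichever tree should define "tracked" into the temporary index.
git read-tree "$commit"

# -o lists files present in the working directory but absent from the index.
untracked=$(git ls-files -o)
printf 'untracked: [%s]\n' "$untracked"
```

Because the tree is chosen per invocation, several processes can each compare against a different commit at the same time, as long as each uses its own index file.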
