From: Junio C Hamano <junkio@cox.net>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nicolas Pitre <nico@cam.org>,
	Johannes Schindelin <Johannes.Schindelin@gmx.de>,
	Marco Costalba <mcostalba@gmail.com>,
	GIT list <git@vger.kernel.org>
Subject: Re: 'git status' is not read-only fs friendly
Date: Sat, 10 Feb 2007 22:33:46 -0800
Message-ID: <7vtzxtdwz9.fsf@assigned-by-dhcp.cox.net>
In-Reply-To: <Pine.LNX.4.64.0702100913020.8424@woody.linux-foundation.org> (Linus Torvalds's message of "Sat, 10 Feb 2007 09:37:58 -0800 (PST)")

Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Sat, 10 Feb 2007, Nicolas Pitre wrote:
>> > >
>> > > Because git-status itself is conceptually a read-only operation, and 
>> > > having it barf on a read-only file system is justifiably a bug.
>> > 
>> > I do not 100% agree that it is conceptually a read-only operation.
>> 
>> It is.
>
> It really isn't. 
>
> It's not even a "technical issue". It's a fundamental optimization. Sure, 
> you can call optimizations just "technical issues", but the fact is, it's
> one of the things that makes git so _usable_ on large archives. At some 
> point, an "optimization" is no longer just about making things slightly 
> faster, it's about something much bigger, and has real semantic meaning.
> ...
> THIS IS NOT "JUST A TECHNICAL ISSUE". 
> ...
> And the index is what makes it so. 
>
> And that's why it's important to keep the index up-to-date.

I think a one paragraph summary of your argument is:

 - index is a good thing -- it is what makes the difference
   between usable and unusable.

 - git-status needs to refresh the index in order to do its
   thing efficiently and usably _anyway_, so once it spends
   cycles to do so, it is senseless not to write the refreshed
   index out when it can.
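
As a rough illustration of the "when it can" part: the sketch below
tries to save a refreshed index and simply reports failure when the
location is not writable (for example on a read-only file system),
so the caller can carry on with the in-memory copy.  The function
name write_index_if_possible and the JSON encoding are assumptions
made for this example only; this is not git's code or its on-disk
index format.

```python
import json
import os
import tempfile


def write_index_if_possible(index, index_path):
    """Try to save `index` at `index_path`; return False if that fails.

    JSON is only a stand-in here; it is not git's on-disk index format.
    """
    directory = os.path.dirname(index_path) or "."
    try:
        # Write to a temporary file first, so that a reader never sees a
        # half-written index.
        fd, tmp_path = tempfile.mkstemp(dir=directory, prefix="index.tmp.")
    except OSError:
        # E.g. a read-only file system: the refreshed index stays in memory.
        return False
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(index, tmp)
        # Rename the finished file into place in one step.
        os.replace(tmp_path, index_path)
    except OSError:
        try:
            os.unlink(tmp_path)
        except OSError:
            pass
        return False
    return True
```

A caller would print its status output either way and treat a False
return as "could not save the refreshed index this time", not as an
error.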

I do not think anybody disputes that in a repository with 20k+
paths, it is not sensible to leave the index stat-dirty for all
paths.  But I think your example

	read-tree HEAD

misses the point by stressing the importance of the index too
much.  The index is important for usability and I do not think
anybody is disputing that.

The thing is, nobody switches the index that way without running
"update-index --refresh" afterwards.  Normal people would use
git-reset to switch to a different tree object, and the command
does that for you.  If you are hardcore, you would know to use
"read-tree -m HEAD" at least to avoid making paths unnecessarily
stat-dirty.  Your example, while it is valid and demonstrates
why the index is a good thing very well, is simply not part of
a normal workflow and not very relevant when discussing the
performance ramifications of what state "git-status" should
leave the index in.
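
To make the "read-tree HEAD" versus "update-index --refresh" point
concrete, here is a toy model, not git's implementation.  It assumes
an index entry is just a content id plus a cached stat signature;
toy_read_tree fills in ids without any stat information, which makes
every path stat-dirty, and toy_refresh re-hashes only stat-dirty
paths and records the stat data when the content matches.  All names
are hypothetical, and a plain SHA-1 of the file contents stands in
for git's object id.

```python
import hashlib
import os
import tempfile


def content_id(path):
    # Stand-in for an object id: plain SHA-1 of the file contents.
    with open(path, "rb") as f:
        return hashlib.sha1(f.read()).hexdigest()


def stat_sig(path):
    st = os.stat(path)
    return (st.st_mtime_ns, st.st_size)


def toy_read_tree(tree):
    """Populate entries from {path: id} without any stat information."""
    return {path: {"id": cid, "stat": None} for path, cid in tree.items()}


def stat_dirty_paths(index):
    return sorted(p for p, e in index.items() if e["stat"] != stat_sig(p))


def toy_refresh(index):
    """Re-hash stat-dirty paths only; return how many were re-hashed."""
    rehashed = 0
    for path, entry in index.items():
        sig = stat_sig(path)
        if entry["stat"] == sig:
            continue  # cached stat data matches, no need to read the file
        rehashed += 1
        if content_id(path) == entry["id"]:
            entry["stat"] = sig  # content unchanged: path is clean again
    return rehashed


def demo_read_tree_then_refresh():
    with tempfile.TemporaryDirectory() as d:
        tree = {}
        for name in ("cache.h", "diff.h", "diff-lib.c"):
            path = os.path.join(d, name)
            with open(path, "w") as f:
                f.write("/* " + name + " */\n")
            tree[path] = content_id(path)
        index = toy_read_tree(tree)
        dirty_before = len(stat_dirty_paths(index))
        first = toy_refresh(index)
        dirty_after = len(stat_dirty_paths(index))
        second = toy_refresh(index)
        return dirty_before, first, dirty_after, second
```

In this model the first refresh has to read every file, while a
second one reads none; with 20k+ paths that is the difference being
argued about.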

When I said "calling 'update-index --refresh' in git-status
loses stat-dirtiness information", I was certainly _NOT_ talking
about losing the information that 20k+ paths used to be
stat-dirty because the user did "read-tree HEAD" earlier.

At least for me, it is very normal to do something like this.

 * start from a clean index.

 * edit cache.h, diff.h, and diff-lib.c.

 * stop, think, and realize that my earlier edit to change one
   function prototype in diff.h was not needed, and revert the
   change to that line still in the editor.

 * fix things up further by editing other files.

And then, I would run "git diff" to see where I am.  I still
remember that I touched diff.h and I also remember that I once
changed a function prototype but then decided the change was not
necessary after all, but I do not remember if I changed anything
else in the file.  It is _very_ reassuring to see the emptiness
that follows the "diff --git" header for diff.h in such a case.
Seeing the path to be stat-dirty is a very good thing for me,
because otherwise I might lose a few seconds thinking that what
I thought I touched might have been cache.h and not diff.h.
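
The edit-then-revert case above can be sketched the same way.  This
is a toy model under stated assumptions: an index entry is a content
id plus cached mtime and size, and the names snapshot, classify and
demo_edit_then_revert, as well as the file contents, are made up for
the example.  A path whose stat data changed but whose contents
still match is reported as "stat-dirty only", which corresponds to
the empty diff described above.

```python
import hashlib
import os
import tempfile


def snapshot(path):
    """Record a content id and the stat data seen at that moment."""
    st = os.stat(path)
    with open(path, "rb") as f:
        digest = hashlib.sha1(f.read()).hexdigest()
    return {"id": digest, "mtime_ns": st.st_mtime_ns, "size": st.st_size}


def classify(path, entry):
    st = os.stat(path)
    if (st.st_mtime_ns, st.st_size) == (entry["mtime_ns"], entry["size"]):
        return "clean"
    with open(path, "rb") as f:
        same = hashlib.sha1(f.read()).hexdigest() == entry["id"]
    return "stat-dirty only" if same else "modified"


def rewrite(path, data, entry, seconds_later):
    """Write `data` and set an mtime later than the recorded one."""
    with open(path, "wb") as f:
        f.write(data)
    later = entry["mtime_ns"] + seconds_later * 10**9
    os.utime(path, ns=(later, later))


def demo_edit_then_revert():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "diff.h")
        original = b"int example(void);\n"  # hypothetical prototype
        with open(path, "wb") as f:
            f.write(original)
        entry = snapshot(path)
        states = [classify(path, entry)]
        # Change the prototype ...
        rewrite(path, b"int example(int flags);\n", entry, 10)
        states.append(classify(path, entry))
        # ... then decide it was not needed and put the old line back.
        rewrite(path, original, entry, 20)
        states.append(classify(path, entry))
        return states
```

Calling demo_edit_then_revert() walks the file through the three
states in order: untouched, edited, and reverted but still
stat-dirty.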

To me, running "git status" is a "wrapping things up" step.  I
do not need that stat-dirty assurance "git diff" gave me at that
point.  Not seeing diff.h in the "modified but not updated" list
is a good thing.  And in my workflow, after that "wrapping things
up" step, I do not need that stat-dirty assurance _anymore_.

I think Nico is correct to point out that the "not _anymore_"
part of the above reasoning of mine assumes _my_ workflow and
preference, and I think that is a valid point.  Not saving the
refreshed index would make the stat-dirtiness for diff.h come
back, which would be inconvenient and annoying to me.

But the user might want to keep it stat-dirty after running
"git-status".  People in the "not _anymore_" camp like me can
throw the stat-dirtiness away with "update-index --refresh".  I
do not think he (or anybody) is advocating keeping 20k+ paths in
a stat-dirty state (arguably, "artificially" due to use of
"read-tree HEAD"), so your example using "read-tree HEAD" only
confuses the discussion.

Having said all that, I do agree with you that git-status should
throw that stat-dirtiness information away by saving the
refreshed index.  Doing otherwise is annoying to me as I already
said, and I cannot think of a valid reason for the user to want
to keep stat-dirtiness information after running "git-status",
because to me the whole point of running "git-status" is to
start wrapping things up.
