From: "Shawn O. Pearce" <spearce@spearce.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nicolas Pitre <nico@cam.org>, Junio C Hamano <junkio@cox.net>,
	Johannes Schindelin <Johannes.Schindelin@gmx.de>,
	Marco Costalba <mcostalba@gmail.com>,
	GIT list <git@vger.kernel.org>
Subject: Re: 'git status' is not read-only fs friendly
Date: Sun, 11 Feb 2007 02:23:58 -0500
Message-ID: <20070211072358.GB2082@spearce.org>
In-Reply-To: <Pine.LNX.4.64.0702100913020.8424@woody.linux-foundation.org>

Linus Torvalds <torvalds@linux-foundation.org> wrote:
> It's not even a "technical issue". It's a fundamental optimization. Sure, 
> you can call optimizations just "technical issues", but the fact is, it's 
> one of the things that makes git so _usable_ on large archives. At some 
> point, an "optimization" is no longer just about making things slightly 
> faster, it's about something much bigger, and has real semantic meaning.
> 
> So the fact is, "git status" _needs_ to refresh the index. Because if it 
> doesn't, you'll see every file that doesn't match the index as "dirty", 
> and that is not just a "technical issue".

Indeed.  Except that `git-update-index --refresh` is itself not
very fast on Cygwin+NTFS and large projects (about the size of
the kernel).  So git-status is a real slouch there.  Not running
`git update-index --refresh` saves at least a couple of seconds.

This is why git-gui lets you disable the refresh, and is part of
the reason why it computes the status on its own by diff-index,
diff-files and ls-files --others.
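
To illustrate the three-command approach described above, here is a
minimal sketch in Python. It is not git-gui's actual code (git-gui is
written in Tcl); the function names `git_paths`, `collect_status` and
`combine_status` are hypothetical, and the use of `--name-only`, `-z`
and `--exclude-standard` is an assumption made to keep the output easy
to parse. Because no refresh is run first, diff-files may report
files whose stat data changed but whose content did not.

```python
import subprocess


def git_paths(args, cwd="."):
    """Run a git plumbing command and return its NUL-separated paths."""
    out = subprocess.run(["git", *args], cwd=cwd, check=True,
                         capture_output=True).stdout
    return [p.decode("utf-8", "surrogateescape")
            for p in out.split(b"\0") if p]


def combine_status(staged, unstaged, untracked):
    """Merge the three path lists into {path: [sorted flags]}."""
    status = {}
    for flag, paths in (("staged", staged),
                        ("unstaged", unstaged),
                        ("untracked", untracked)):
        for path in paths:
            status.setdefault(path, set()).add(flag)
    return {path: sorted(flags) for path, flags in status.items()}


def collect_status(cwd="."):
    """Compute a status without running `update-index --refresh` first."""
    # HEAD vs. index: changes that are staged for the next commit.
    staged = git_paths(
        ["diff-index", "--cached", "--name-only", "-z", "HEAD"], cwd)
    # Index vs. working tree: changes that are not yet staged.
    unstaged = git_paths(["diff-files", "--name-only", "-z"], cwd)
    # Files in the working tree that the index does not know about.
    untracked = git_paths(
        ["ls-files", "--others", "--exclude-standard", "-z"], cwd)
    return combine_status(staged, unstaged, untracked)
```

Calling `collect_status()` inside a repository returns a dictionary
keyed by path; `combine_status` is kept separate so the merging logic
can be exercised without a repository.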
 
> THIS IS NOT "JUST A TECHNICAL ISSUE". 
> 
> When the difference is 40 seconds vs 4 (uncached), or 2 seconds vs 0.06, 
> it's not about "just an optimization" any more. At that point, it's about 
> "unusable vs usable".
> 
> And yeah, waiting 40 seconds for a global "diff" for a big project may be 
> something that a person coming from CVS considers to be just par for the 
> course. Maybe I'm just unreasonable. But I think it's a _bug_ if I can't 
> get a small diff in about a tenth of a second. It needs to be so fast that 
> I never even _think_ about it.

Yes.  Which is why if git-gui finds a file that has an empty diff,
but that was reported as modified by diff-files, it tells the user
it's about to waste a few seconds running `update-index --refresh`,
then does so.
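
The distinction between "reported as modified" and "actually
modified" can be sketched as follows. This is a simplified
illustration, not what git or git-gui does internally: it assumes a
SHA-1 repository, ignores content filters such as line-ending
conversion, and compares only size and mtime where git compares more
stat fields. The names `blob_id` and `classify` are hypothetical.

```python
import hashlib
import os


def blob_id(data: bytes) -> str:
    """Return the SHA-1 object name git assigns to a blob with this content."""
    return hashlib.sha1(b"blob %d\0" % len(data) + data).hexdigest()


def classify(path, cached_size, cached_mtime_ns, cached_blob_id):
    """Classify a tracked file against the values cached for it.

    Returns "clean" if the stat data still matches, "stat-dirty" if only
    the stat data changed (a refresh would fix the cached entry), or
    "modified" if the content itself differs.
    """
    st = os.stat(path)
    if st.st_size == cached_size and st.st_mtime_ns == cached_mtime_ns:
        return "clean"  # cheap path: no need to read the file
    with open(path, "rb") as f:
        data = f.read()
    if blob_id(data) == cached_blob_id:
        return "stat-dirty"
    return "modified"
```

The cheap stat comparison is what makes the common case fast; the
file is only read and hashed when the cached stat data no longer
matches.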

In practice I've found it rare that a file is dirty in the index,
but is not actually modified.  The typical culprit appears to
actually be the virus scanner on a Windows system.  For some reason
it feels a need to modify some random XML 'source' files that are
tracked by Git.  Out of 30,000 files it likes to modify about 100.
*sigh* At least I have Git to tell me it didn't change any content.
 
> I think it would be much better if "git status" always wrote the refreshed 
> index file. It could then choose to ignore any errors if they happen, 
> because if you have a broken setup like the NTFS read-only thing, then 
> tough, it's broken, but git can't do anything about it. But people should 
> be aware that yes, "git status" absolutely _needs_ to write the index 
> file. 

Not only that, but I think we can do much better with git-runstatus
than we do now.  If we scan the working directory (to search for
untracked files), and we walk the index in parallel, we can update
the index with new stat data if necessary.
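
The parallel walk suggested above amounts to a merge join over two
sorted path lists. The sketch below is only an illustration of that
idea, not git-runstatus code; the function name `walk_parallel` and
the state labels are hypothetical, and it assumes both lists are
already sorted in the same order.

```python
def walk_parallel(index_paths, worktree_paths):
    """Walk two sorted path lists in step and label every path.

    "tracked"   : in both lists; this is where cached stat data could be
                  compared with the file and updated if necessary
    "missing"   : in the index but not in the working directory
    "untracked" : in the working directory but not in the index
    """
    result = []
    i = j = 0
    while i < len(index_paths) and j < len(worktree_paths):
        a, b = index_paths[i], worktree_paths[j]
        if a == b:
            result.append((a, "tracked"))
            i += 1
            j += 1
        elif a < b:
            result.append((a, "missing"))
            i += 1
        else:
            result.append((b, "untracked"))
            j += 1
    # Whatever remains on either side has no counterpart on the other.
    result.extend((p, "missing") for p in index_paths[i:])
    result.extend((p, "untracked") for p in worktree_paths[j:])
    return result
```

Each path is visited once, so untracked-file detection and the stat
comparison share a single pass instead of two separate scans.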

Of course that doesn't matter much on Linux; its VFS operations
don't take hours.

-- 
Shawn.