All of lore.kernel.org
 help / color / mirror / Atom feed
From: Linus Torvalds <torvalds@linux-foundation.org>
To: Jeff King <peff@peff.net>
Cc: "Shawn O. Pearce" <spearce@spearce.org>,
	Esko Luontola <esko.luontola@gmail.com>,
	git@vger.kernel.org
Subject: Re: Cross-Platform Version Control
Date: Wed, 13 May 2009 10:46:53 -0700 (PDT)	[thread overview]
Message-ID: <alpine.LFD.2.01.0905131036040.3343@localhost.localdomain> (raw)
In-Reply-To: <alpine.LFD.2.01.0905130951100.3343@localhost.localdomain>



On Wed, 13 May 2009, Linus Torvalds wrote:
>
> Of course, it probably makes sense to have a whole "git_readdir()" that 
> does this thing in general.

Actually, the more I think about that, the less true I think it is.

It _sounds_ like a nice simplification ("just do it once in readdir, and 
forget about it everywhere else"), but it's in fact a stupid thing to do.

Why?

If we _ever_ want to fix this in the general case, then the code that does 
the readdir() will actually have to remember both the "raw filesystem" 
form _and_ the "cleaned-up utf-8 form".

Why? Because when we do readdir(), we'll also do 'lstat()' on the end 
result to check the types, and opendir() in case it's a directory and we 
then want to do things recursively etc. And that happens to work on OS X 
(because we can use our "fixed" filename for lstat too), but it does not 
work in the general case.

And you can say "well, just do the stat inside the wrapped readdir()", but 
that doesn't work _either_, since

 - we don't want to do the lstat() if it's unnecessary. Even if we don't 
   have "de->d_type" information, we can often avoid the need for it, if 
   we can tell that the name isn't interestign (due to being ignored).

   Avoiding the lstat is a huge performance issue for cold-cache cases. 
   It's basically a seek.

   So we really want to do the lstat() later, which implies that the 
   caller needs to know _both_ the original "real" filesystem name _and_ 
   the converted one.

 - it doesn't handle the opendir() case anyway - so the end result is that 
   a real implementation will _always_ need to carry around both the 
   "filesystem view" filename _and_ the "what we've converted it into".

Now, the point of the patch I sent out was that for the specific case of 
OS X, which does UTF-8 conversions (wrong) but also is happy to get our 
properly normalized name, we don't care. So my patch is "correct" for that 
special case - and so would a plain readdir() wrapper be.

But my patch is _also_ correct for the case where a readdir() wrapper 
would do the wrong thing. My patch doesn't _handle_ it (since it doesn't 
change the code to pass both "filesystem view" and "cleaned-up view" 
pathnames), but the patch I sent out also doesn't make it any harder to do 
right.

In contrast, doing a readdir() wrapper makes it much harder to do right 
later, because it's just doing the conversion at the wrong level (you 
could make that "wrapper" return both the original and the fixed 
filename, but at that point the wrapper doesn't really help - you might 
as well just have the "convert" function, and it would be a hell of a lot 
more obvious what is really going on).

So I take it back. A readdir() wrapper is not a good idea. It gets us a 
tiny bit of the way, but it would actually take us a step back from the 
"real" solution.

			Linus

  parent reply	other threads:[~2009-05-13 17:49 UTC|newest]

Thread overview: 61+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-05-12 15:06 Cross-Platform Version Control Esko Luontola
2009-05-12 15:14 ` Shawn O. Pearce
2009-05-12 16:13   ` Johannes Schindelin
2009-05-12 17:56     ` Esko Luontola
2009-05-12 20:38       ` Johannes Schindelin
2009-05-12 21:16         ` Esko Luontola
2009-05-13  0:23           ` Johannes Schindelin
2009-05-13  5:34             ` Esko Luontola
2009-05-13  6:49               ` Alex Riesen
2009-05-13 10:15               ` Johannes Schindelin
     [not found]                 ` <43d8ce650905130340q596043d5g45b342b62fe20e8d@mail.gmail.com>
2009-05-13 10:41                   ` John Tapsell
2009-05-13 13:42                     ` Jay Soffian
2009-05-13 13:44                       ` Alex Riesen
2009-05-13 13:50                         ` Jay Soffian
2009-05-13 13:57                           ` John Tapsell
2009-05-13 15:27                             ` Nicolas Pitre
2009-05-13 16:22                               ` Johannes Schindelin
2009-05-13 17:24                             ` Andreas Ericsson
2009-05-14  1:49                             ` Miles Bader
2009-05-12 16:16   ` Jeff King
2009-05-12 16:57     ` Johannes Schindelin
2009-05-13 16:26     ` Linus Torvalds
2009-05-13 17:12       ` Linus Torvalds
2009-05-13 17:31         ` Andreas Ericsson
2009-05-13 17:46         ` Linus Torvalds [this message]
2009-05-13 18:26           ` Martin Langhoff
2009-05-13 18:37             ` Linus Torvalds
2009-05-13 21:04               ` Theodore Tso
2009-05-13 21:20                 ` Linus Torvalds
2009-05-13 21:08               ` Daniel Barkalow
2009-05-13 21:29                 ` Linus Torvalds
2009-05-13 20:57         ` Matthias Andree
2009-05-13 21:10           ` Linus Torvalds
2009-05-13 21:30             ` Jay Soffian
2009-05-13 21:47             ` Matthias Andree
2009-05-12 18:28 ` Dmitry Potapov
2009-05-12 18:40   ` Martin Langhoff
2009-05-12 18:55     ` Jakub Narebski
2009-05-12 21:43       ` [PATCH] Extend sample pre-commit hook to check for non ascii file/usernames Heiko Voigt
2009-05-12 21:55         ` Jakub Narebski
2009-05-14 17:59           ` [PATCH v2] Extend sample pre-commit hook to check for non ascii filenames Heiko Voigt
2009-05-15 10:52             ` Martin Langhoff
2009-05-18  9:37               ` Heiko Voigt
2009-05-18 22:26                 ` Jakub Narebski
2009-06-20 12:14               ` [RFC PATCH] check for filenames that only differ in case to sample pre-commit hook Heiko Voigt
2009-05-15 14:57             ` [PATCH v2] Extend sample pre-commit hook to check for non ascii filenames Jakub Narebski
2009-05-18  9:50               ` [PATCH] " Heiko Voigt
2009-05-18 10:40                 ` Johannes Sixt
2009-05-18 11:50                   ` Heiko Voigt
2009-05-18 12:04                     ` Johannes Sixt
2009-05-19 20:01                   ` [PATCH v4] " Heiko Voigt
2009-05-18 14:42                 ` [PATCH] " Junio C Hamano
2009-05-18 20:35                 ` Julian Phillips
2009-05-15 18:11             ` [PATCH v2] " Junio C Hamano
2009-05-14 13:48 ` Cross-Platform Version Control Peter Krefting
2009-05-14 19:58   ` Esko Luontola
2009-05-14 20:21     ` Andreas Ericsson
2009-05-14 22:25     ` Johannes Schindelin
2009-05-15 11:18     ` Dmitry Potapov
  -- strict thread matches above, loose matches on Subject: below --
2009-04-27  8:55 Eric Sink's blog - notes on git, dscms and a "whole product" approach Martin Langhoff
2009-04-28 11:24 ` Cross-Platform Version Control (was: Eric Sink's blog - notes on git, dscms and a "whole product" approach) Jakub Narebski
2009-04-29  6:55   ` Martin Langhoff
2009-04-29  7:52     ` Cross-Platform Version Control Jakub Narebski
2009-04-29  8:25       ` Martin Langhoff

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.LFD.2.01.0905131036040.3343@localhost.localdomain \
    --to=torvalds@linux-foundation.org \
    --cc=esko.luontola@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=peff@peff.net \
    --cc=spearce@spearce.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.