From: Linus Torvalds <torvalds@linux-foundation.org>
To: Jeff King <peff@peff.net>
Cc: "Shawn O. Pearce" <spearce@spearce.org>,
Esko Luontola <esko.luontola@gmail.com>,
git@vger.kernel.org
Subject: Re: Cross-Platform Version Control
Date: Wed, 13 May 2009 10:46:53 -0700 (PDT) [thread overview]
Message-ID: <alpine.LFD.2.01.0905131036040.3343@localhost.localdomain> (raw)
In-Reply-To: <alpine.LFD.2.01.0905130951100.3343@localhost.localdomain>
On Wed, 13 May 2009, Linus Torvalds wrote:
>
> Of course, it probably makes sense to have a whole "git_readdir()" that
> does this thing in general.
Actually, the more I think about that, the less true I think it is.
It _sounds_ like a nice simplification ("just do it once in readdir, and
forget about it everywhere else"), but it's in fact a stupid thing to do.
Why?
If we _ever_ want to fix this in the general case, then the code that does
the readdir() will actually have to remember both the "raw filesystem"
form _and_ the "cleaned-up utf-8 form".
Why? Because when we do readdir(), we'll also do 'lstat()' on the end
result to check the types, and opendir() in case it's a directory and we
then want to do things recursively etc. And that happens to work on OS X
(because we can use our "fixed" filename for lstat too), but it does not
work in the general case.
And you can say "well, just do the stat inside the wrapped readdir()", but
that doesn't work _either_, since
- we don't want to do the lstat() if it's unnecessary. Even if we don't
have "de->d_type" information, we can often avoid the need for it, if
we can tell that the name isn't interestign (due to being ignored).
Avoiding the lstat is a huge performance issue for cold-cache cases.
It's basically a seek.
So we really want to do the lstat() later, which implies that the
caller needs to know _both_ the original "real" filesystem name _and_
the converted one.
- it doesn't handle the opendir() case anyway - so the end result is that
a real implementation will _always_ need to carry around both the
"filesystem view" filename _and_ the "what we've converted it into".
Now, the point of the patch I sent out was that for the specific case of
OS X, which does UTF-8 conversions (wrong) but also is happy to get our
properly normalized name, we don't care. So my patch is "correct" for that
special case - and so would a plain readdir() wrapper be.
But my patch is _also_ correct for the case where a readdir() wrapper
would do the wrong thing. My patch doesn't _handle_ it (since it doesn't
change the code to pass both "filesystem view" and "cleaned-up view"
pathnames), but the patch I sent out also doesn't make it any harder to do
right.
In contrast, doing a readdir() wrapper makes it much harder to do right
later, because it's just doing the conversion at the wrong level (you
could make that "wrapper" return both the original and the fixed
filename, but at that point the wrapper doesn't really help - you might
as well just have the "convert" function, and it would be a hell of a lot
more obvious what is really going on).
So I take it back. A readdir() wrapper is not a good idea. It gets us a
tiny bit of the way, but it would actually take us a step back from the
"real" solution.
Linus
next prev parent reply other threads:[~2009-05-13 17:49 UTC|newest]
Thread overview: 61+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-05-12 15:06 Cross-Platform Version Control Esko Luontola
2009-05-12 15:14 ` Shawn O. Pearce
2009-05-12 16:13 ` Johannes Schindelin
2009-05-12 17:56 ` Esko Luontola
2009-05-12 20:38 ` Johannes Schindelin
2009-05-12 21:16 ` Esko Luontola
2009-05-13 0:23 ` Johannes Schindelin
2009-05-13 5:34 ` Esko Luontola
2009-05-13 6:49 ` Alex Riesen
2009-05-13 10:15 ` Johannes Schindelin
[not found] ` <43d8ce650905130340q596043d5g45b342b62fe20e8d@mail.gmail.com>
2009-05-13 10:41 ` John Tapsell
2009-05-13 13:42 ` Jay Soffian
2009-05-13 13:44 ` Alex Riesen
2009-05-13 13:50 ` Jay Soffian
2009-05-13 13:57 ` John Tapsell
2009-05-13 15:27 ` Nicolas Pitre
2009-05-13 16:22 ` Johannes Schindelin
2009-05-13 17:24 ` Andreas Ericsson
2009-05-14 1:49 ` Miles Bader
2009-05-12 16:16 ` Jeff King
2009-05-12 16:57 ` Johannes Schindelin
2009-05-13 16:26 ` Linus Torvalds
2009-05-13 17:12 ` Linus Torvalds
2009-05-13 17:31 ` Andreas Ericsson
2009-05-13 17:46 ` Linus Torvalds [this message]
2009-05-13 18:26 ` Martin Langhoff
2009-05-13 18:37 ` Linus Torvalds
2009-05-13 21:04 ` Theodore Tso
2009-05-13 21:20 ` Linus Torvalds
2009-05-13 21:08 ` Daniel Barkalow
2009-05-13 21:29 ` Linus Torvalds
2009-05-13 20:57 ` Matthias Andree
2009-05-13 21:10 ` Linus Torvalds
2009-05-13 21:30 ` Jay Soffian
2009-05-13 21:47 ` Matthias Andree
2009-05-12 18:28 ` Dmitry Potapov
2009-05-12 18:40 ` Martin Langhoff
2009-05-12 18:55 ` Jakub Narebski
2009-05-12 21:43 ` [PATCH] Extend sample pre-commit hook to check for non ascii file/usernames Heiko Voigt
2009-05-12 21:55 ` Jakub Narebski
2009-05-14 17:59 ` [PATCH v2] Extend sample pre-commit hook to check for non ascii filenames Heiko Voigt
2009-05-15 10:52 ` Martin Langhoff
2009-05-18 9:37 ` Heiko Voigt
2009-05-18 22:26 ` Jakub Narebski
2009-06-20 12:14 ` [RFC PATCH] check for filenames that only differ in case to sample pre-commit hook Heiko Voigt
2009-05-15 14:57 ` [PATCH v2] Extend sample pre-commit hook to check for non ascii filenames Jakub Narebski
2009-05-18 9:50 ` [PATCH] " Heiko Voigt
2009-05-18 10:40 ` Johannes Sixt
2009-05-18 11:50 ` Heiko Voigt
2009-05-18 12:04 ` Johannes Sixt
2009-05-19 20:01 ` [PATCH v4] " Heiko Voigt
2009-05-18 14:42 ` [PATCH] " Junio C Hamano
2009-05-18 20:35 ` Julian Phillips
2009-05-15 18:11 ` [PATCH v2] " Junio C Hamano
2009-05-14 13:48 ` Cross-Platform Version Control Peter Krefting
2009-05-14 19:58 ` Esko Luontola
2009-05-14 20:21 ` Andreas Ericsson
2009-05-14 22:25 ` Johannes Schindelin
2009-05-15 11:18 ` Dmitry Potapov
-- strict thread matches above, loose matches on Subject: below --
2009-04-27 8:55 Eric Sink's blog - notes on git, dscms and a "whole product" approach Martin Langhoff
2009-04-28 11:24 ` Cross-Platform Version Control (was: Eric Sink's blog - notes on git, dscms and a "whole product" approach) Jakub Narebski
2009-04-29 6:55 ` Martin Langhoff
2009-04-29 7:52 ` Cross-Platform Version Control Jakub Narebski
2009-04-29 8:25 ` Martin Langhoff
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=alpine.LFD.2.01.0905131036040.3343@localhost.localdomain \
--to=torvalds@linux-foundation.org \
--cc=esko.luontola@gmail.com \
--cc=git@vger.kernel.org \
--cc=peff@peff.net \
--cc=spearce@spearce.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.