All of lore.kernel.org
 help / color / mirror / Atom feed
From: Heiko Voigt <hvoigt@hvoigt.net>
To: Jakub Narebski <jnareb@gmail.com>, Junio C Hamano <gitster@pobox.com>
Cc: Martin Langhoff <martin.langhoff@gmail.com>,
	Dmitry Potapov <dpotapov@gmail.com>,
	Esko Luontola <esko.luontola@gmail.com>,
	git@vger.kernel.org
Subject: [PATCH] Extend sample pre-commit hook to check for non ascii filenames
Date: Mon, 18 May 2009 11:50:44 +0200	[thread overview]
Message-ID: <20090518095044.GB910@heiko-voigts-macbook.local> (raw)
In-Reply-To: <200905151657.47225.jnareb@gmail.com>

At the moment non-ascii encodings of filenames are not portably converted
between different filesystems by git. This will most likely change in the
future but to allow repositories to be portable among different file/operating
systems this check is enabled by default.

Signed-off-by: Heiko <hvoigt@hvoigt.net>
---
so here is a third version ...

On Fri, May 15, 2009 at 04:57:45PM +0200, Jakub Narebski wrote:
> On Thu, 14 May 2009, Heiko Voigt wrote:
> 
> > At the moment non-ascii encodings of filenames are not portably converted
> > between different filesystems by git. This will most likely change in the
> > future but to allow repositories to be portable among different file/operating
> > systems this check is enabled by default.
> 
> By the way, you might consider choosing shorter line length for commits,
> something around 70-76 characters per line; otherwise it is harder to
> reply to without linewrapping. 80 characters that you used is, IMHO,
> absolute maximum, and it is good that you kept to it.

Yeah, I admit they were a little bit long.

> > +function is_ascii () {
> > +    test -z "$(cat | sed -e "s/[\ -~]*//g")"
> > +    return $?
> > +}
> 
> From CodingGuidelines for shell scripts:
>  - We do not write the noiseword "function" in front of shell
>    functions.
> 
> (in short: do not use bash-specific features... unless, of course,
> you are modifying bash-completion script).

Addressed.

> Second, it would be nice to have comment about how to use this
> function (as it does not check file given by its argument, but
> rather its standard input). And perhaps also a comment that it
> works because ASCII printable characters begin with ' ' space
> (does it have to be escaped?) and end with '~' tilde[2].

Done

> 
> Third, isn't it useless use of 'cat'[3]? And wouldn't it be better
> to use 'tr' to either delete printable characters and check for
> anything left (as you do; BTW. wouldn't "return test ..." be simpler?),
> or use 'tr' to count non portable characters?

Yes indeed it was useless. I also switched from sed to tr.

> 
> [1] http://www.dwheeler.com/essays/fixing-unix-linux-filenames.html
> [2] http://en.wikipedia.org/wiki/ASCII#ASCII_printable_characters
> [3] http://partmaps.org/era/unix/award.html#cat

On Fri, May 15, 2009 at 11:11:12AM -0700, Junio C Hamano wrote:
> Heiko Voigt <hvoigt@hvoigt.net> writes:
> > +function is_ascii () {
> 
> We do not say "#!/bin/bash" at the beginning (hopefully), so let's not say
> "function " here.

See above.

> > +    test -z "$(cat | sed -e "s/[\ -~]*//g")"
> 
> Do you need "cat | "?

Also above.

> Does this script run under LC_ALL=C?  Can an i18n'ized sed interfere with
> what you are trying to do?

I now explicitely set LC_ALL=C for the tr call which should now be robust
against such cases.

> 
> > +    return $?
> 
> Do you need this, or does the function return the result of the last
> statment anyway?

I wasn't aware of that. Removed the return.

> > +		echo "Non-ascii filenames are not allowed !"
> > +		echo "Please rename the file ..."
> 
> Can we make this sound more like a _sample_ project policy?  It's not like
> we enforce that policy to other people's projects.

I've polished this so we are now more user friendly as well.

 templates/hooks--pre-commit.sample |   32 ++++++++++++++++++++++++++++++++
 1 files changed, 32 insertions(+), 0 deletions(-)

diff --git a/templates/hooks--pre-commit.sample b/templates/hooks--pre-commit.sample
index 0e49279..91ab563 100755
--- a/templates/hooks--pre-commit.sample
+++ b/templates/hooks--pre-commit.sample
@@ -7,6 +7,38 @@
 #
 # To enable this hook, rename this file to "pre-commit".
 
+# If you want to allow non-ascii filenames set this variable to true.
+allownonascii=$(git config hooks.allownonascii)
+
+# is_ascii() Tests the string given given on standard input for
+# printable ascii conformance. We exploit the fact that the printable
+# range starts at the space character and ends with tilde.
+is_ascii() {
+    test -z "$(LC_ALL=C tr -d \ -~)"
+}
+
+if [ "$allownonascii" != "true" ]
+then
+	# until git can handle non-ascii filenames gracefully
+	# prevent them to be added into the repository
+	if ! git diff --cached --name-only --diff-filter=A -z \
+	   | tr "\0" "\n" | is_ascii; then
+		echo "Error: Preventing to add a non-ascii filename."
+		echo
+		echo "This can cause problems if you want to work together"
+		echo "with people on other platforms than you."
+		echo
+		echo "To be portable it is adviseable to rename the file ..."
+		echo
+		echo "If you know what you are doing you can disable this"
+		echo "check using:"
+		echo
+		echo "  git config hooks.allownonascii true"
+		echo
+		exit 1
+	fi
+fi
+
 if git-rev-parse --verify HEAD 2>/dev/null
 then
 	against=HEAD
-- 
1.6.3

  reply	other threads:[~2009-05-18  9:50 UTC|newest]

Thread overview: 59+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-05-12 15:06 Cross-Platform Version Control Esko Luontola
2009-05-12 15:14 ` Shawn O. Pearce
2009-05-12 16:13   ` Johannes Schindelin
2009-05-12 17:56     ` Esko Luontola
2009-05-12 20:38       ` Johannes Schindelin
2009-05-12 21:16         ` Esko Luontola
2009-05-13  0:23           ` Johannes Schindelin
2009-05-13  5:34             ` Esko Luontola
2009-05-13  6:49               ` Alex Riesen
2009-05-13 10:15               ` Johannes Schindelin
     [not found]                 ` <43d8ce650905130340q596043d5g45b342b62fe20e8d@mail.gmail.com>
2009-05-13 10:41                   ` John Tapsell
2009-05-13 13:42                     ` Jay Soffian
2009-05-13 13:44                       ` Alex Riesen
2009-05-13 13:50                         ` Jay Soffian
2009-05-13 13:57                           ` John Tapsell
2009-05-13 15:27                             ` Nicolas Pitre
2009-05-13 16:22                               ` Johannes Schindelin
2009-05-13 17:24                             ` Andreas Ericsson
2009-05-14  1:49                             ` Miles Bader
2009-05-12 16:16   ` Jeff King
2009-05-12 16:57     ` Johannes Schindelin
2009-05-13 16:26     ` Linus Torvalds
2009-05-13 17:12       ` Linus Torvalds
2009-05-13 17:31         ` Andreas Ericsson
2009-05-13 17:46         ` Linus Torvalds
2009-05-13 18:26           ` Martin Langhoff
2009-05-13 18:37             ` Linus Torvalds
2009-05-13 21:04               ` Theodore Tso
2009-05-13 21:20                 ` Linus Torvalds
2009-05-13 21:08               ` Daniel Barkalow
2009-05-13 21:29                 ` Linus Torvalds
2009-05-13 20:57         ` Matthias Andree
2009-05-13 21:10           ` Linus Torvalds
2009-05-13 21:30             ` Jay Soffian
2009-05-13 21:47             ` Matthias Andree
2009-05-12 18:28 ` Dmitry Potapov
2009-05-12 18:40   ` Martin Langhoff
2009-05-12 18:55     ` Jakub Narebski
2009-05-12 21:43       ` [PATCH] Extend sample pre-commit hook to check for non ascii file/usernames Heiko Voigt
2009-05-12 21:55         ` Jakub Narebski
2009-05-14 17:59           ` [PATCH v2] Extend sample pre-commit hook to check for non ascii filenames Heiko Voigt
2009-05-15 10:52             ` Martin Langhoff
2009-05-18  9:37               ` Heiko Voigt
2009-05-18 22:26                 ` Jakub Narebski
2009-06-20 12:14               ` [RFC PATCH] check for filenames that only differ in case to sample pre-commit hook Heiko Voigt
2009-05-15 14:57             ` [PATCH v2] Extend sample pre-commit hook to check for non ascii filenames Jakub Narebski
2009-05-18  9:50               ` Heiko Voigt [this message]
2009-05-18 10:40                 ` [PATCH] " Johannes Sixt
2009-05-18 11:50                   ` Heiko Voigt
2009-05-18 12:04                     ` Johannes Sixt
2009-05-19 20:01                   ` [PATCH v4] " Heiko Voigt
2009-05-18 14:42                 ` [PATCH] " Junio C Hamano
2009-05-18 20:35                 ` Julian Phillips
2009-05-15 18:11             ` [PATCH v2] " Junio C Hamano
2009-05-14 13:48 ` Cross-Platform Version Control Peter Krefting
2009-05-14 19:58   ` Esko Luontola
2009-05-14 20:21     ` Andreas Ericsson
2009-05-14 22:25     ` Johannes Schindelin
2009-05-15 11:18     ` Dmitry Potapov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090518095044.GB910@heiko-voigts-macbook.local \
    --to=hvoigt@hvoigt.net \
    --cc=dpotapov@gmail.com \
    --cc=esko.luontola@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jnareb@gmail.com \
    --cc=martin.langhoff@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.