All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] help: correct behavior for is_executable on Windows
@ 2017-01-27 13:50 Johannes Schindelin
  2017-01-27 18:43 ` Junio C Hamano
  2017-01-30 12:40 ` [PATCH v2] help: improve is_executable() " Johannes Schindelin
  0 siblings, 2 replies; 16+ messages in thread
From: Johannes Schindelin @ 2017-01-27 13:50 UTC (permalink / raw)
  To: git; +Cc: Heiko Voigt, Junio C Hamano

From: Heiko Voigt <hvoigt@hvoigt.net>

The previous implementation said that the filesystem information on
Windows is not reliable to determine whether a file is executable. To
gather this information it was peeking into the first two bytes of a
file to see whether it looks executable.

Apart from the fact that on Windows executables are defined as such by
their extension (and Git has special code to support "executing" scripts
when they have a she-bang line) it leads to serious performance
problems: not only do we have to open many files now, it gets even
slower when a virus scanner is running.

See the following measurements (in seconds) as an example, where we
execute a simple program that simply lists the directory contents and
calls open() on every listed file:

With virus scanner running (coldcache):

$ ./a.exe /libexec/git-core/
before open (git-add.exe): 0.000000
after open (git-add.exe): 0.412873
before open (git-annotate.exe): 0.000175
after open (git-annotate.exe): 0.397925
before open (git-apply.exe): 0.000243
after open (git-apply.exe): 0.399996
before open (git-archive.exe): 0.000147
after open (git-archive.exe): 0.397783
before open (git-bisect--helper.exe): 0.000160
after open (git-bisect--helper.exe): 0.397700
before open (git-blame.exe): 0.000160
after open (git-blame.exe): 0.399136
...

With virus scanner running (hotcache):

$ ./a.exe /libexec/git-core/
before open (git-add.exe): 0.000000
after open (git-add.exe): 0.000325
before open (git-annotate.exe): 0.000229
after open (git-annotate.exe): 0.000177
before open (git-apply.exe): 0.000167
after open (git-apply.exe): 0.000150
before open (git-archive.exe): 0.000154
after open (git-archive.exe): 0.000156
before open (git-bisect--helper.exe): 0.000132
after open (git-bisect--helper.exe): 0.000180
before open (git-blame.exe): 0.000718
after open (git-blame.exe): 0.000724
...

With this patch I get:

$ time git help git
Launching default browser to display HTML ...

real    0m8.723s
user    0m0.000s
sys     0m0.000s

and without

$ time git help git
Launching default browser to display HTML ...

real    1m37.734s
user    0m0.000s
sys     0m0.031s

both tests with cold cache and giving the machine some time to settle
down after restart.

Signed-off-by: Heiko Voigt <heiko.voigt@mahr.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
Published-As: https://github.com/dscho/git/releases/tag/help-is-exe-v1
Fetch-It-Via: git fetch https://github.com/dscho/git help-is-exe-v1

 help.c | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/help.c b/help.c
index 53e2a67e00..bc6cd19cf3 100644
--- a/help.c
+++ b/help.c
@@ -105,7 +105,22 @@ static int is_executable(const char *name)
 		return 0;
 
 #if defined(GIT_WINDOWS_NATIVE)
-{	/* cannot trust the executable bit, peek into the file instead */
+	/*
+	 * On Windows there is no executable bit. The file extension
+	 * indicates whether it can be run as an executable, and Git
+	 * has special-handling to detect scripts and launch them
+	 * through the indicated script interpreter. We test for the
+	 * file extension first because virus scanners may make
+	 * it quite expensive to open many files.
+	 */
+	if (ends_with(name, ".exe"))
+		return S_IXUSR;
+
+{
+	/*
+	 * Now that we know it does not have an executable extension,
+	 * peek into the file instead.
+	 */
 	char buf[3] = { 0 };
 	int n;
 	int fd = open(name, O_RDONLY);
@@ -113,8 +128,8 @@ static int is_executable(const char *name)
 	if (fd >= 0) {
 		n = read(fd, buf, 2);
 		if (n == 2)
-			/* DOS executables start with "MZ" */
-			if (!strcmp(buf, "#!") || !strcmp(buf, "MZ"))
+			/* look for a she-bang */
+			if (!strcmp(buf, "#!"))
 				st.st_mode |= S_IXUSR;
 		close(fd);
 	}

base-commit: 4e59582ff70d299f5a88449891e78d15b4b3fabe
-- 
2.11.1.windows.prerelease.2.9.g3014b57

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH] help: correct behavior for is_executable on Windows
  2017-01-27 13:50 [PATCH] help: correct behavior for is_executable on Windows Johannes Schindelin
@ 2017-01-27 18:43 ` Junio C Hamano
  2017-01-30 12:44   ` Johannes Schindelin
  2017-01-30 12:40 ` [PATCH v2] help: improve is_executable() " Johannes Schindelin
  1 sibling, 1 reply; 16+ messages in thread
From: Junio C Hamano @ 2017-01-27 18:43 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Heiko Voigt

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> From: Heiko Voigt <hvoigt@hvoigt.net>
>
> The previous implementation said that the filesystem information on
> Windows is not reliable to determine whether a file is executable. To
> gather this information it was peeking into the first two bytes of a
> file to see whether it looks executable.
>
> Apart from the fact that on Windows executables are defined as such by
> their extension (and Git has special code to support "executing" scripts
> when they have a she-bang line) it leads to serious performance
> problems: not only do we have to open many files now, it gets even
> slower when a virus scanner is running.

Heiko, around here (before going into the details of how severe the
problem is and how wonderful the result applying of this change is)
is the best place to summarize the solution.  E.g.

	Because the definition of "executable" on Windows is based
	on the file extension, update the function to declare that a
	file with ".exe" extension without opening and reading the
	early bytes from it.  This avoids serious performance issues.

I paraphrased the rest only so that the description of the solution
(i.e. "instead of opening and peeking, we trust .exe suffix") fits
well in the surrounding text; the important part is to say what the
change does clearly.

I agree with the reasoning and the execution of the patch, except
that 

 - "correct behaviour" in the title makes it appear that this is a
   correctness thing, but this is primarily a performance fix.

 - It is a bit strange that "MZ" is dropped in the same patch
   without any mention.

The latter may be "correctness" fix, in that earlier we treated any
file that happens to begin with MZ as an executable, regardless of
the filename, which may have misidentified a non-executable file as
an executable.  If that is what is going on, it deserves to be
described in the log message.

> diff --git a/help.c b/help.c
> index 53e2a67e00..bc6cd19cf3 100644
> --- a/help.c
> +++ b/help.c
> @@ -105,7 +105,22 @@ static int is_executable(const char *name)
>  		return 0;
>  
>  #if defined(GIT_WINDOWS_NATIVE)
> -{	/* cannot trust the executable bit, peek into the file instead */
> +	/*
> +	 * On Windows there is no executable bit. The file extension
> +	 * indicates whether it can be run as an executable, and Git
> +	 * has special-handling to detect scripts and launch them
> +	 * through the indicated script interpreter. We test for the
> +	 * file extension first because virus scanners may make
> +	 * it quite expensive to open many files.
> +	 */
> +	if (ends_with(name, ".exe"))
> +		return S_IXUSR;
> +
> +{
> +	/*
> +	 * Now that we know it does not have an executable extension,
> +	 * peek into the file instead.
> +	 */
>  	char buf[3] = { 0 };
>  	int n;
>  	int fd = open(name, O_RDONLY);
> @@ -113,8 +128,8 @@ static int is_executable(const char *name)
>  	if (fd >= 0) {
>  		n = read(fd, buf, 2);
>  		if (n == 2)
> -			/* DOS executables start with "MZ" */
> -			if (!strcmp(buf, "#!") || !strcmp(buf, "MZ"))
> +			/* look for a she-bang */
> +			if (!strcmp(buf, "#!"))
>  				st.st_mode |= S_IXUSR;
>  		close(fd);
>  	}
>
> base-commit: 4e59582ff70d299f5a88449891e78d15b4b3fabe

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v2] help: improve is_executable() on Windows
  2017-01-27 13:50 [PATCH] help: correct behavior for is_executable on Windows Johannes Schindelin
  2017-01-27 18:43 ` Junio C Hamano
@ 2017-01-30 12:40 ` Johannes Schindelin
  2017-01-30 17:03   ` Junio C Hamano
  1 sibling, 1 reply; 16+ messages in thread
From: Johannes Schindelin @ 2017-01-30 12:40 UTC (permalink / raw)
  To: git; +Cc: Heiko Voigt, Junio C Hamano

From: Heiko Voigt <hvoigt@hvoigt.net>

On Windows, executables need to have the file extension `.exe`, or they
are not executables. Hence, to support scripts, Git for Windows also
looks for a she-bang line by opening the file in question, and executing
it via the specified script interpreter.

To figure out whether files in the `PATH` are executable, `git help` has
code that imitates this behavior. With one exception: it *always* opens
the files and looks for a she-bang line *or* an `MZ` tell-tale
(nevermind that files with the magic `MZ` but without file extension
`.exe` would still not be executable).

Opening this many files leads to performance problems that are even more
serious when a virus scanner is running. Therefore, let's change the
code to look for the file extension `.exe` early, and avoid opening the
file altogether if we already know that it is executable.

See the following measurements (in seconds) as an example, where we
execute a simple program that simply lists the directory contents and
calls open() on every listed file:

With virus scanner running (coldcache):

$ ./a.exe /libexec/git-core/
before open (git-add.exe): 0.000000
after open (git-add.exe): 0.412873
before open (git-annotate.exe): 0.000175
after open (git-annotate.exe): 0.397925
before open (git-apply.exe): 0.000243
after open (git-apply.exe): 0.399996
before open (git-archive.exe): 0.000147
after open (git-archive.exe): 0.397783
before open (git-bisect--helper.exe): 0.000160
after open (git-bisect--helper.exe): 0.397700
before open (git-blame.exe): 0.000160
after open (git-blame.exe): 0.399136
...

With virus scanner running (hotcache):

$ ./a.exe /libexec/git-core/
before open (git-add.exe): 0.000000
after open (git-add.exe): 0.000325
before open (git-annotate.exe): 0.000229
after open (git-annotate.exe): 0.000177
before open (git-apply.exe): 0.000167
after open (git-apply.exe): 0.000150
before open (git-archive.exe): 0.000154
after open (git-archive.exe): 0.000156
before open (git-bisect--helper.exe): 0.000132
after open (git-bisect--helper.exe): 0.000180
before open (git-blame.exe): 0.000718
after open (git-blame.exe): 0.000724
...

With this patch I get:

$ time git help git
Launching default browser to display HTML ...

real    0m8.723s
user    0m0.000s
sys     0m0.000s

and without

$ time git help git
Launching default browser to display HTML ...

real    1m37.734s
user    0m0.000s
sys     0m0.031s

both tests with cold cache and giving the machine some time to settle
down after restart.

[jes: adjusted the commit message]

Signed-off-by: Heiko Voigt <heiko.voigt@mahr.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
Published-As: https://github.com/dscho/git/releases/tag/help-is-exe-v2
Fetch-It-Via: git fetch https://github.com/dscho/git help-is-exe-v2

 help.c | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/help.c b/help.c
index 53e2a67e00..bc6cd19cf3 100644
--- a/help.c
+++ b/help.c
@@ -105,7 +105,22 @@ static int is_executable(const char *name)
 		return 0;
 
 #if defined(GIT_WINDOWS_NATIVE)
-{	/* cannot trust the executable bit, peek into the file instead */
+	/*
+	 * On Windows there is no executable bit. The file extension
+	 * indicates whether it can be run as an executable, and Git
+	 * has special-handling to detect scripts and launch them
+	 * through the indicated script interpreter. We test for the
+	 * file extension first because virus scanners may make
+	 * it quite expensive to open many files.
+	 */
+	if (ends_with(name, ".exe"))
+		return S_IXUSR;
+
+{
+	/*
+	 * Now that we know it does not have an executable extension,
+	 * peek into the file instead.
+	 */
 	char buf[3] = { 0 };
 	int n;
 	int fd = open(name, O_RDONLY);
@@ -113,8 +128,8 @@ static int is_executable(const char *name)
 	if (fd >= 0) {
 		n = read(fd, buf, 2);
 		if (n == 2)
-			/* DOS executables start with "MZ" */
-			if (!strcmp(buf, "#!") || !strcmp(buf, "MZ"))
+			/* look for a she-bang */
+			if (!strcmp(buf, "#!"))
 				st.st_mode |= S_IXUSR;
 		close(fd);
 	}

base-commit: 4e59582ff70d299f5a88449891e78d15b4b3fabe
-- 
2.11.1.windows.prerelease.2.9.g3014b57

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH] help: correct behavior for is_executable on Windows
  2017-01-27 18:43 ` Junio C Hamano
@ 2017-01-30 12:44   ` Johannes Schindelin
  0 siblings, 0 replies; 16+ messages in thread
From: Johannes Schindelin @ 2017-01-30 12:44 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Heiko Voigt

Hi Junio,

On Fri, 27 Jan 2017, Junio C Hamano wrote:

> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
> > From: Heiko Voigt <hvoigt@hvoigt.net>
> >
> > The previous implementation said that the filesystem information on
> > Windows is not reliable to determine whether a file is executable. To
> > gather this information it was peeking into the first two bytes of a
> > file to see whether it looks executable.
> >
> > Apart from the fact that on Windows executables are defined as such by
> > their extension (and Git has special code to support "executing"
> > scripts when they have a she-bang line) it leads to serious
> > performance problems: not only do we have to open many files now, it
> > gets even slower when a virus scanner is running.
> 
> Heiko, around here (before going into the details of how severe the
> problem is and how wonderful the result applying of this change is) is
> the best place to summarize the solution.  E.g.
> 
> 	Because the definition of "executable" on Windows is based
> 	on the file extension, update the function to declare that a
> 	file with ".exe" extension without opening and reading the
> 	early bytes from it.  This avoids serious performance issues.
> 
> I paraphrased the rest only so that the description of the solution
> (i.e. "instead of opening and peeking, we trust .exe suffix") fits well
> in the surrounding text; the important part is to say what the change
> does clearly.

I adjusted the commit message. It was tweaked a little differently from
what you suggested, as I preferred to condense the information a bit more.

> I agree with the reasoning and the execution of the patch, except
> that 
> 
>  - "correct behaviour" in the title makes it appear that this is a
>    correctness thing, but this is primarily a performance fix.

Primarily. But not only. The magic `MZ` without the file extension `.exe`
is pretty useless, as the file could not be executed, still.

To avoid further turnaround, though, I also edited the contentious
"correct" to read "improve" now.

>  - It is a bit strange that "MZ" is dropped in the same patch
>    without any mention.

I fixed that in the commit message.

Ciao,
Johannes

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2] help: improve is_executable() on Windows
  2017-01-30 12:40 ` [PATCH v2] help: improve is_executable() " Johannes Schindelin
@ 2017-01-30 17:03   ` Junio C Hamano
  0 siblings, 0 replies; 16+ messages in thread
From: Junio C Hamano @ 2017-01-30 17:03 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Heiko Voigt

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> From: Heiko Voigt <hvoigt@hvoigt.net>
>
> On Windows, executables need to have the file extension `.exe`, or they
> are not executables. Hence, to support scripts, Git for Windows also
> looks for a she-bang line by opening the file in question, and executing
> it via the specified script interpreter.
>
> To figure out whether files in the `PATH` are executable, `git help` has
> code that imitates this behavior. With one exception: it *always* opens
> the files and looks for a she-bang line *or* an `MZ` tell-tale
> (nevermind that files with the magic `MZ` but without file extension
> `.exe` would still not be executable).
>
> Opening this many files leads to performance problems that are even more
> serious when a virus scanner is running. Therefore, let's change the
> code to look for the file extension `.exe` early, and avoid opening the
> file altogether if we already know that it is executable.

Much more readable than the initial round.  Will queue; thanks.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] help: correct behavior for is_executable on Windows
  2012-08-16  2:02               ` Junio C Hamano
@ 2012-08-16 16:35                 ` Heiko Voigt
  0 siblings, 0 replies; 16+ messages in thread
From: Heiko Voigt @ 2012-08-16 16:35 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Hi Junio,

On Wed, Aug 15, 2012 at 07:02:31PM -0700, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
> 
> > My preference is to remove "static int is_executable()" function
> > from help.c, have an...
> > ... I wouldn't mind seeing the implementation of posix_is_executable()
> > in help.c, which will be dead-code on Windows and Cygwin, if that
> > makes linking and Makefile easier.
> 
> An outline of such a change might look like this.

Thanks for the outline and the clarification, I will use it to cook up
something.

Cheers Heiko

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] help: correct behavior for is_executable on Windows
  2012-08-15 23:15             ` Junio C Hamano
@ 2012-08-16  2:02               ` Junio C Hamano
  2012-08-16 16:35                 ` Heiko Voigt
  0 siblings, 1 reply; 16+ messages in thread
From: Junio C Hamano @ 2012-08-16  2:02 UTC (permalink / raw)
  To: Heiko Voigt; +Cc: git

Junio C Hamano <gitster@pobox.com> writes:

> My preference is to remove "static int is_executable()" function
> from help.c, have an...
> ... I wouldn't mind seeing the implementation of posix_is_executable()
> in help.c, which will be dead-code on Windows and Cygwin, if that
> makes linking and Makefile easier.

An outline of such a change might look like this.

As usual, I am not great at names, so you may want to rename the
shared one.  Also I do not know what headers Cygwin and Windows
specific bits would need to include, so you would have to include
more than just "git-compat-util.h" in the compat/*.c files this
patch adds.

It would make sense to move the "does it end with .exe" check from
the caller to the shared function, but because that does not match
the concept of the name I came up with (peek-contents), I left it
out of the function. If there is a good name to express what such a
combined "ends with .exe, or inspected contents look like an
executable", it would make sense to rename it as such and move
".exe" check to it.

Thanks.

 Makefile                     |  6 ++++--
 compat/cygwin.c              | 16 ++++++++++++++++
 compat/win32/is_executable.c | 14 ++++++++++++++
 compat/win32/peek_contents.c | 16 ++++++++++++++++
 git-compat-util.h            |  9 +++++++++
 help.c                       | 32 +-------------------------------
 is_executable.c              | 12 ++++++++++++
 7 files changed, 72 insertions(+), 33 deletions(-)

diff --git a/Makefile b/Makefile
index 6b0c961..2bcb9f4 100644
--- a/Makefile
+++ b/Makefile
@@ -734,6 +734,7 @@ LIB_OBJS += hash.o
 LIB_OBJS += help.o
 LIB_OBJS += hex.o
 LIB_OBJS += ident.o
+LIB_OBJS += is_executable.o
 LIB_OBJS += kwset.o
 LIB_OBJS += levenshtein.o
 LIB_OBJS += list-objects.o
@@ -1073,7 +1074,7 @@ ifeq ($(uname_O),Cygwin)
 	# Try commenting this out if you suspect MMAP is more efficient
 	NO_MMAP = YesPlease
 	X = .exe
-	COMPAT_OBJS += compat/cygwin.o
+	COMPAT_OBJS += compat/cygwin.o compat/win32/peek_contents.o
 	UNRELIABLE_FSTAT = UnfortunatelyYes
 	SPARSE_FLAGS = -isystem /usr/include/w32api -Wno-one-bit-signed-bitfield
 endif
@@ -1257,7 +1258,8 @@ ifeq ($(uname_S),Windows)
 	BASIC_CFLAGS = -nologo -I. -I../zlib -Icompat/vcbuild -Icompat/vcbuild/include -DWIN32 -D_CONSOLE -DHAVE_STRING_H -D_CRT_SECURE_NO_WARNINGS -D_CRT_NONSTDC_NO_DEPRECATE
 	COMPAT_OBJS = compat/msvc.o compat/winansi.o \
 		compat/win32/pthread.o compat/win32/syslog.o \
-		compat/win32/poll.o compat/win32/dirent.o
+		compat/win32/poll.o compat/win32/dirent.o \
+		compat/win32/is_executable.o compat/win32/peek_contents.o
 	COMPAT_CFLAGS = -D__USE_MINGW_ACCESS -DNOGDI -DHAVE_STRING_H -DHAVE_ALLOCA_H -Icompat -Icompat/regex -Icompat/win32 -DSTRIP_EXTENSION=\".exe\"
 	BASIC_LDFLAGS = -IGNORE:4217 -IGNORE:4049 -NOLOGO -SUBSYSTEM:CONSOLE -NODEFAULTLIB:MSVCRT.lib
 	EXTLIBS = user32.lib advapi32.lib shell32.lib wininet.lib ws2_32.lib
diff --git a/compat/cygwin.c b/compat/cygwin.c
index dfe9b30..5eb948a 100644
--- a/compat/cygwin.c
+++ b/compat/cygwin.c
@@ -140,3 +140,19 @@ static int cygwin_lstat_stub(const char *file_name, struct stat *buf)
 stat_fn_t cygwin_stat_fn = cygwin_stat_stub;
 stat_fn_t cygwin_lstat_fn = cygwin_lstat_stub;
 
+extern int win32_peek_contents_executable(const char *);
+
+int cygwin_is_path_executable(const char *name)
+{
+	struct stat st;
+
+	/* stat, not lstat */
+	if (stat(name, &st) || !S_ISREG(st.st_mode))
+		return 0;
+	if (st.st_mode & S_IXUSR)
+		return 1;
+
+	if (has_extension(name, ".exe"))
+		return 1;
+	return win32_peek_contents_executable(name);
+}
diff --git a/compat/win32/is_executable.c b/compat/win32/is_executable.c
new file mode 100644
index 0000000..f7907b3
--- /dev/null
+++ b/compat/win32/is_executable.c
@@ -0,0 +1,14 @@
+#include "git-compat-util.h"
+
+int win32_path_is_executable(const char *name)
+{
+	/*
+	 * Do whatever you would do on win32 to make sure
+	 * "name" exists here
+	 */
+	if (!file_exists(name))
+		return 0;
+	if (has_extension(name, ".exe"))
+		return 1;
+	return win32_peek_contents_executable(name);
+}
diff --git a/compat/win32/peek_contents.c b/compat/win32/peek_contents.c
new file mode 100644
index 0000000..5e425aa
--- /dev/null
+++ b/compat/win32/peek_contents.c
@@ -0,0 +1,16 @@
+#include "git-compat-util.h"
+
+int win32_peek_contents_executable(const char *name)
+{
+	char buf[2];
+	int n, fd;
+
+	fd = open(name, O_RDONLY);
+	if (fd < 0)
+		return 0;
+	n = read(fd, buf, 2);
+	close(fd);
+	return (n == 2 &&
+		/* DOS executables start with "MZ" */
+		(!memcmp(buf, "#!", 2) || !memcmp(buf, "MZ", 2)));
+}
diff --git a/git-compat-util.h b/git-compat-util.h
index 35b095e..e8f8cb1 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -607,4 +607,13 @@ int remove_or_warn(unsigned int mode, const char *path);
 /* Get the passwd entry for the UID of the current process. */
 struct passwd *xgetpwuid_self(void);
 
+/* Does the given path name an executable command? */
+#if defined(__CYGWIN__)
+#define is_path_executable(p) cygwin_is_path_executable(p)
+#elif defined(WIN32)
+#define is_path_executable(p) win32_is_path_executable(p)
+#endif
+
+extern int is_path_executable(const char *);
+
 #endif
diff --git a/help.c b/help.c
index 2a42ec6..672d9d3 100644
--- a/help.c
+++ b/help.c
@@ -98,36 +98,6 @@ static void pretty_print_string_list(struct cmdnames *cmds,
 	string_list_clear(&list, 0);
 }
 
-static int is_executable(const char *name)
-{
-	struct stat st;
-
-	if (stat(name, &st) || /* stat, not lstat */
-	    !S_ISREG(st.st_mode))
-		return 0;
-
-#if defined(WIN32) || defined(__CYGWIN__)
-#if defined(__CYGWIN__)
-if ((st.st_mode & S_IXUSR) == 0)
-#endif
-{	/* cannot trust the executable bit, peek into the file instead */
-	char buf[3] = { 0 };
-	int n;
-	int fd = open(name, O_RDONLY);
-	st.st_mode &= ~S_IXUSR;
-	if (fd >= 0) {
-		n = read(fd, buf, 2);
-		if (n == 2)
-			/* DOS executables start with "MZ" */
-			if (!strcmp(buf, "#!") || !strcmp(buf, "MZ"))
-				st.st_mode |= S_IXUSR;
-		close(fd);
-	}
-}
-#endif
-	return st.st_mode & S_IXUSR;
-}
-
 static void list_commands_in_dir(struct cmdnames *cmds,
 					 const char *path,
 					 const char *prefix)
@@ -155,7 +125,7 @@ static void list_commands_in_dir(struct cmdnames *cmds,
 
 		strbuf_setlen(&buf, len);
 		strbuf_addstr(&buf, de->d_name);
-		if (!is_executable(buf.buf))
+		if (!is_path_executable(buf.buf))
 			continue;
 
 		entlen = strlen(de->d_name) - prefix_len;
diff --git a/is_executable.c b/is_executable.c
new file mode 100644
index 0000000..9dacd2a
--- /dev/null
+++ b/is_executable.c
@@ -0,0 +1,12 @@
+#include "git-compat-util.h"
+
+int is_path_executable(const char *name)
+{
+	struct stat st;
+
+	/* stat, not lstat */
+	if (stat(name, &st) || !S_ISREG(st.st_mode))
+		return 0;
+
+	return st.st_mode & S_IXUSR;
+}

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH] help: correct behavior for is_executable on Windows
  2012-08-15 22:29           ` Heiko Voigt
@ 2012-08-15 23:15             ` Junio C Hamano
  2012-08-16  2:02               ` Junio C Hamano
  0 siblings, 1 reply; 16+ messages in thread
From: Junio C Hamano @ 2012-08-15 23:15 UTC (permalink / raw)
  To: Heiko Voigt; +Cc: git

Heiko Voigt <hvoigt@hvoigt.net> writes:

> I do not know why you are against filling that information into "struct
> stat".

Because it is *WRONG*.  Isn't it a good enough reason?

If the issue you are trying to solve were """stat emulation on
Windows and Cygwin does not give the correct x-bit (and the user
sometimes has to work it around with "update-index --chmod"), and by
using other clues the emulation can be improved to give a better
result""", I agree that it would be a good solution to have the
"""Does it exist as a regular file and ends with ".exe"?  Otherwise
does it start with "MZ" or "#!"?""" heuristics in a helper function
correct_executable_stat(), and to have the implementation of stat
emulation on these two platforms call that shared helper function.

But look at the caller again.  The problem the caller wants this
function to solve is not "I want you to stat this file."  It has a
pathname, and only wants to know if it is an executable file.  It
does not care about who owns it, what time it was last touched,
etc., but calling the incomplete stat emulation on Windows will try
to come up with an answer, and the is_executable() function discards
most of it.  

In other words, you are solving a wrong problem with that approach.

"Run stat() and look at S_IXUSR" happens to be an implementation
detail that is valid only on POSIX systems for a function to answer
the question: "Is this an executable file?", and in that specific
implementation, the answer to that question appears in the S_IXUSR
bit of st.st_mode.  That does not mean the "struct stat" is the best
container for the answer to that question on other platforms.  So
why structure your abstraction around the inappropriate data
structure?  Between the function (is_executable) and its callers,
there is only one bit that needs to be passed.

My preference is to remove "static int is_executable()" function
from help.c, have an "extern int is_executable(const char *)" that
has platform-specific implementation in compat/ layer, and call it
from help.c::list_commands_in_dir() without any #ifdef.  In
git-compat-util.h, have something like:

	#ifdef __CYGWIN__
	#define is_executable(path) cygwin_is_executable(path)
	#else
        # ifdef WIN32
        # define is_executable(path) win32_is_executable(path)
	# endif
	#endif

        #ifndef is_exectutable
	#define is_executable(path) posix_is_executable(path)
	#endif

        extern int is_executable(const char *);

I wouldn't mind seeing the implementation of posix_is_executable()
in help.c, which will be dead-code on Windows and Cygwin, if that
makes linking and Makefile easier.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] help: correct behavior for is_executable on Windows
  2012-08-15 17:53         ` Junio C Hamano
  2012-08-15 19:39           ` Junio C Hamano
@ 2012-08-15 22:29           ` Heiko Voigt
  2012-08-15 23:15             ` Junio C Hamano
  1 sibling, 1 reply; 16+ messages in thread
From: Heiko Voigt @ 2012-08-15 22:29 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Hi Junio,

On Wed, Aug 15, 2012 at 10:53:55AM -0700, Junio C Hamano wrote:
> Heiko Voigt <hvoigt@hvoigt.net> writes:
> 
> > On Mon, Aug 13, 2012 at 10:48:14AM -0700, Junio C Hamano wrote:
> >> Heiko Voigt <hvoigt@hvoigt.net> writes:
> >> > What do you think?
> >> 
> >> Does having the "stat()" help on Windows in any way?  Does it ever
> >> return an executable bit by itself?
> >
> > No, AFAIK it does not return anything about executability. But I think
> > the stat is still necessary to verify that the file exists and is a
> > regular file.
> 
> But if you are going to read it anyway, you can tell it from open()
> and read() of the first 2 bytes failing, no?  That will still be an
> implementation detail of platform specific "is_path_executable()".

No I am not opening the file anyway. Only when it does not have a
".exe" postfix. That at least was the intention of my previous patch in
this thread.

> So you are forcing Windows an extra and unnecessary stat() that only
> is needed on Cygwin, no?

My first patch in this thread (which lead to this extraction) is about
avoiding the open because it is possibly very costly on our git binaries
in case a virus scanner is running.

The optimized (and correct) check does only look at the given filename
if it contains a ".exe" postfix. A directory (although unlikely) could
also have such a suffix. Should we then consider that directory to be a
git command? AFAICS there is no such check in the earlier codepath.

A stat is much cheaper since it does not open the file. For scripts the
open is much cheaper than for our binaries because they are much
smaller. For clarification: All the git builtin binaries basically
contain the same code. Even though they are hardlinked, the system (or
the scanner) does still consider them like individual files and executes
a scan for each of them. That at least seems to be the case on a number
of systems.

> > @@ -1347,7 +1348,8 @@ ifneq (,$(findstring MINGW,$(uname_S)))
> >  	COMPAT_CFLAGS += -DSTRIP_EXTENSION=\".exe\"
> >  	COMPAT_OBJS += compat/mingw.o compat/winansi.o \
> >  		compat/win32/pthread.o compat/win32/syslog.o \
> > -		compat/win32/poll.o compat/win32/dirent.o
> > +		compat/win32/poll.o compat/win32/dirent.o \
> > +		compat/win32/executable.o
> 
> Looks sensible, even though the filename does not tell what it does
> about "executable".  is_executable.o might be a better name for them.

Agreed, I was not sure about the name anyway. is_executable.o sounds
better.

> > diff --git a/help.c b/help.c
> > index ebc2c42..d9fae3c 100644
> > --- a/help.c
> > +++ b/help.c
> > @@ -106,34 +106,8 @@ static int is_executable(const char *name)
> >  	    !S_ISREG(st.st_mode))
> >  		return 0;
> >  
> > -#if defined(WIN32) || defined(__CYGWIN__)
> > -	/* On Windows we cannot use the executable bit. The executable
> > -	 * state is determined by extension only. We do this first
> > -	 * because with virus scanners opening an executeable for
> > -	 * reading is potentially expensive.
> > -	 */
> > -	if (has_extension(name, ".exe"))
> > -		return S_IXUSR;
> > -
> > -#if defined(__CYGWIN__)
> > -if ((st.st_mode & S_IXUSR) == 0)
> > -#endif
> > -{	/* now that we know it does not have an executable extension,
> > -	   peek into the file instead */
> > -	char buf[3] = { 0 };
> > -	int n;
> > -	int fd = open(name, O_RDONLY);
> > -	st.st_mode &= ~S_IXUSR;
> > -	if (fd >= 0) {
> > -		n = read(fd, buf, 2);
> > -		if (n == 2)
> > -			/* look for a she-bang */
> > -			if (!strcmp(buf, "#!"))
> > -				st.st_mode |= S_IXUSR;
> > -		close(fd);
> > -	}
> > -}
> > -#endif
> > +	correct_executable_stat(name, &st);
> > +
> 
> Yuck.
> 
> Why should we need even a single line of the implementation of a
> function that tells if a given pathname contains an executable
> command, which we know is platform specific?  
> 
> On Posix systems, the implementation will be "stat() and check
> S_IXUSR".  On pure Windows, it will be "check .exe, or open it and
> read the first two bytes". On Cygwin, it will also be "check .exe,
> stat() and check S_IXUSR, or open it and read the first two bytes.
> 
> It is not like the caller of is_executable() needed to run stat for
> other purposes on its own and we can optimize by either borrowing
> the stat data the caller already collected for us, or returning the
> stat data we collected for later use by the caller.  The use of stat
> is entirely contained in the POSIX implementation of this function.
> 
> Why are you so dead-set to shoehorn the different semantics into
> "struct stat" that we know is not an appropriate abstraction across
> platforms?

I simply think filling in the correct information into the data
structure we use on all platforms is the most transparent approach. We
could also call this function

	correct_executable_stat_if_needed_by_platform()

but I considered that name to be too long. As explained above: To be
fully correct the stat call is still needed to make sure we are not
looking at a directory. list_commands_in_dir() from help.c is iterating
over all entries in a given directory so if we skip the stat we are not
sure whether we look at something that is not a regular file. Even
though its unlikely that there will be a directory named something.exe
in the git executable directory I would still like to keep the check in
there.

When first looking at this I also thought about extending the windows
stat replacement to do this whole logic. Then we could completely remove
this platform specific thing. But I decided against that because when
the file does not have a ".exe" postfix we would have an open call in all
stats and to my knowledge that is very expensive on Windows.

I do not know why you are against filling that information into "struct
stat". I could rephrase the above into something like this

	if (platform_is_executeable(name))
		return 1;

but its not that easy to replace to a no op by a macro anymore.

Cheers Heiko

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] help: correct behavior for is_executable on Windows
  2012-08-15 17:53         ` Junio C Hamano
@ 2012-08-15 19:39           ` Junio C Hamano
  2012-08-15 22:29           ` Heiko Voigt
  1 sibling, 0 replies; 16+ messages in thread
From: Junio C Hamano @ 2012-08-15 19:39 UTC (permalink / raw)
  To: Heiko Voigt; +Cc: git

Junio C Hamano <gitster@pobox.com> writes:

>> diff --git a/help.c b/help.c
>> ...
>> +
>
> Yuck.
>
> Why should we need even a single line of the implementation of a
> function that tells if a given pathname contains an executable
> command, which we know is platform specific?  

Sorry; sent without sufficient proofreading.  

Obviously we need such an implementation (per platform) somewhere in
the system.

What I meant to ask was "Why should we need a function whose
implementation has to be platform-specific in this help.c file".
The last part of the sentence was missing.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] help: correct behavior for is_executable on Windows
  2012-08-15 16:50       ` Heiko Voigt
@ 2012-08-15 17:53         ` Junio C Hamano
  2012-08-15 19:39           ` Junio C Hamano
  2012-08-15 22:29           ` Heiko Voigt
  0 siblings, 2 replies; 16+ messages in thread
From: Junio C Hamano @ 2012-08-15 17:53 UTC (permalink / raw)
  To: Heiko Voigt; +Cc: git

Heiko Voigt <hvoigt@hvoigt.net> writes:

> On Mon, Aug 13, 2012 at 10:48:14AM -0700, Junio C Hamano wrote:
>> Heiko Voigt <hvoigt@hvoigt.net> writes:
>> > What do you think?
>> 
>> Does having the "stat()" help on Windows in any way?  Does it ever
>> return an executable bit by itself?
>
> No, AFAIK it does not return anything about executability. But I think
> the stat is still necessary to verify that the file exists and is a
> regular file.

But if you are going to read it anyway, you can tell it from open()
and read() of the first 2 bytes failing, no?  That will still be an
implementation detail of platform specific "is_path_executable()".

So you are forcing Windows an extra and unnecessary stat() that only
is needed on Cygwin, no?

> @@ -1347,7 +1348,8 @@ ifneq (,$(findstring MINGW,$(uname_S)))
>  	COMPAT_CFLAGS += -DSTRIP_EXTENSION=\".exe\"
>  	COMPAT_OBJS += compat/mingw.o compat/winansi.o \
>  		compat/win32/pthread.o compat/win32/syslog.o \
> -		compat/win32/poll.o compat/win32/dirent.o
> +		compat/win32/poll.o compat/win32/dirent.o \
> +		compat/win32/executable.o

Looks sensible, even though the filename does not tell what it does
about "executable".  is_executable.o might be a better name for them.

> diff --git a/help.c b/help.c
> index ebc2c42..d9fae3c 100644
> --- a/help.c
> +++ b/help.c
> @@ -106,34 +106,8 @@ static int is_executable(const char *name)
>  	    !S_ISREG(st.st_mode))
>  		return 0;
>  
> -#if defined(WIN32) || defined(__CYGWIN__)
> -	/* On Windows we cannot use the executable bit. The executable
> -	 * state is determined by extension only. We do this first
> -	 * because with virus scanners opening an executeable for
> -	 * reading is potentially expensive.
> -	 */
> -	if (has_extension(name, ".exe"))
> -		return S_IXUSR;
> -
> -#if defined(__CYGWIN__)
> -if ((st.st_mode & S_IXUSR) == 0)
> -#endif
> -{	/* now that we know it does not have an executable extension,
> -	   peek into the file instead */
> -	char buf[3] = { 0 };
> -	int n;
> -	int fd = open(name, O_RDONLY);
> -	st.st_mode &= ~S_IXUSR;
> -	if (fd >= 0) {
> -		n = read(fd, buf, 2);
> -		if (n == 2)
> -			/* look for a she-bang */
> -			if (!strcmp(buf, "#!"))
> -				st.st_mode |= S_IXUSR;
> -		close(fd);
> -	}
> -}
> -#endif
> +	correct_executable_stat(name, &st);
> +

Yuck.

Why should we need even a single line of the implementation of a
function that tells if a given pathname contains an executable
command, which we know is platform specific?  

On Posix systems, the implementation will be "stat() and check
S_IXUSR".  On pure Windows, it will be "check .exe, or open it and
read the first two bytes". On Cygwin, it will also be "check .exe,
stat() and check S_IXUSR, or open it and read the first two bytes.

It is not like the caller of is_executable() needed to run stat for
other purposes on its own and we can optimize by either borrowing
the stat data the caller already collected for us, or returning the
stat data we collected for later use by the caller.  The use of stat
is entirely contained in the POSIX implementation of this function.

Why are you so dead-set to shoehorn the different semantics into
"struct stat" that we know is not an appropriate abstraction across
platforms?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] help: correct behavior for is_executable on Windows
  2012-08-13 17:48     ` Junio C Hamano
@ 2012-08-15 16:50       ` Heiko Voigt
  2012-08-15 17:53         ` Junio C Hamano
  0 siblings, 1 reply; 16+ messages in thread
From: Heiko Voigt @ 2012-08-15 16:50 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Hi Junio,

On Mon, Aug 13, 2012 at 10:48:14AM -0700, Junio C Hamano wrote:
> Heiko Voigt <hvoigt@hvoigt.net> writes:
> > What do you think?
> 
> Does having the "stat()" help on Windows in any way?  Does it ever
> return an executable bit by itself?

No, AFAIK it does not return anything about executability. But I think
the stat is still necessary to verify that the file exists and is a
regular file.

Here is a draft of a patch I would follow up with:

Subject: [PATCH RFC] help: extract platform dependent part of is_executable in
 extra module

To remove various ifdefs required for the special handling of
executables on windows we create a new module 'executable' in compat
which allows a user to correctly fill the S_IXUSR flag of struct stat on
Windows.

Since this is valid for all variants of windows (mingw, msvc, cygwin) we
create a new module to avoid code duplication. compat/msvc.h includes
mingw.h so we do not add an extra include there. By default we define
correct_executeable_stat() to a no op on all other platforms. This is
guarded by a NO_EXECUTABLE_STAT which when defined by a compat header
requires and implementation of this function.

NOTE: I did not test this. I first would like to discuss whether the
overall structure this approach is taking is ok.

Signed-off-by: Heiko Voigt <hvoigt@hvoigt.net>
---
 Makefile                  |  8 +++++---
 compat/cygwin.h           |  2 ++
 compat/mingw.h            |  2 ++
 compat/win32/executable.c | 33 +++++++++++++++++++++++++++++++++
 compat/win32/executable.h |  9 +++++++++
 git-compat-util.h         |  4 ++++
 help.c                    | 30 ++----------------------------
 7 files changed, 57 insertions(+), 31 deletions(-)
 create mode 100644 compat/win32/executable.c
 create mode 100644 compat/win32/executable.h

diff --git a/Makefile b/Makefile
index 6b0c961..cea3559 100644
--- a/Makefile
+++ b/Makefile
@@ -1073,7 +1073,7 @@ ifeq ($(uname_O),Cygwin)
 	# Try commenting this out if you suspect MMAP is more efficient
 	NO_MMAP = YesPlease
 	X = .exe
-	COMPAT_OBJS += compat/cygwin.o
+	COMPAT_OBJS += compat/cygwin.o compat/win32/executable.o
 	UNRELIABLE_FSTAT = UnfortunatelyYes
 	SPARSE_FLAGS = -isystem /usr/include/w32api -Wno-one-bit-signed-bitfield
 endif
@@ -1257,7 +1257,8 @@ ifeq ($(uname_S),Windows)
 	BASIC_CFLAGS = -nologo -I. -I../zlib -Icompat/vcbuild -Icompat/vcbuild/include -DWIN32 -D_CONSOLE -DHAVE_STRING_H -D_CRT_SECURE_NO_WARNINGS -D_CRT_NONSTDC_NO_DEPRECATE
 	COMPAT_OBJS = compat/msvc.o compat/winansi.o \
 		compat/win32/pthread.o compat/win32/syslog.o \
-		compat/win32/poll.o compat/win32/dirent.o
+		compat/win32/poll.o compat/win32/dirent.o \
+		compat/win32/executable.o
 	COMPAT_CFLAGS = -D__USE_MINGW_ACCESS -DNOGDI -DHAVE_STRING_H -DHAVE_ALLOCA_H -Icompat -Icompat/regex -Icompat/win32 -DSTRIP_EXTENSION=\".exe\"
 	BASIC_LDFLAGS = -IGNORE:4217 -IGNORE:4049 -NOLOGO -SUBSYSTEM:CONSOLE -NODEFAULTLIB:MSVCRT.lib
 	EXTLIBS = user32.lib advapi32.lib shell32.lib wininet.lib ws2_32.lib
@@ -1347,7 +1348,8 @@ ifneq (,$(findstring MINGW,$(uname_S)))
 	COMPAT_CFLAGS += -DSTRIP_EXTENSION=\".exe\"
 	COMPAT_OBJS += compat/mingw.o compat/winansi.o \
 		compat/win32/pthread.o compat/win32/syslog.o \
-		compat/win32/poll.o compat/win32/dirent.o
+		compat/win32/poll.o compat/win32/dirent.o \
+		compat/win32/executable.o
 	EXTLIBS += -lws2_32
 	PTHREAD_LIBS =
 	X = .exe
diff --git a/compat/cygwin.h b/compat/cygwin.h
index a3229f5..e3bbd4d 100644
--- a/compat/cygwin.h
+++ b/compat/cygwin.h
@@ -1,6 +1,8 @@
 #include <sys/types.h>
 #include <sys/stat.h>
 
+#include "win32/executable.h"
+
 typedef int (*stat_fn_t)(const char*, struct stat*);
 extern stat_fn_t cygwin_stat_fn;
 extern stat_fn_t cygwin_lstat_fn;
diff --git a/compat/mingw.h b/compat/mingw.h
index 61a6521..73c4f3d 100644
--- a/compat/mingw.h
+++ b/compat/mingw.h
@@ -1,6 +1,8 @@
 #include <winsock2.h>
 #include <ws2tcpip.h>
 
+#include "win32/executable.h"
+
 /*
  * things that are not available in header files
  */
diff --git a/compat/win32/executable.c b/compat/win32/executable.c
new file mode 100644
index 0000000..326ddb1
--- /dev/null
+++ b/compat/win32/executable.c
@@ -0,0 +1,33 @@
+#include "executeable.h"
+
+void correct_executable_stat(const char *filename, struct stat *st)
+{
+	char buf[3] = { 0 };
+	int n, fd;
+
+	/* On Windows we cannot use the executable bit. The executable
+	 * state is determined by extension only. We do this first
+	 * because with virus scanners opening an executeable for
+	 * reading is potentially expensive.
+	 */
+	if (has_extension(name, ".exe"))
+		st->st_mode |= S_IXUSR;
+
+	if (st.st_mode & S_IXUSR)
+		return;
+
+	/* now that we know it does not have an executable extension,
+	   peek into the file instead */
+	fd = open(name, O_RDONLY);
+	if (fd < 0)
+		return;
+
+	n = read(fd, buf, 2);
+	if (n == 2) {
+		/* look for a she-bang */
+		if (!strcmp(buf, "#!"))
+			st.st_mode |= S_IXUSR;
+	}
+
+	close(fd);
+}
diff --git a/compat/win32/executable.h b/compat/win32/executable.h
new file mode 100644
index 0000000..e4f0c5c
--- /dev/null
+++ b/compat/win32/executable.h
@@ -0,0 +1,9 @@
+#ifndef COMPAT_WIN32_EXECUTEABLE
+#define COMPAT_WIN32_EXECUTEABLE
+
+#include "../../git-compat-util.h"
+
+#define NO_EXECUTABLE_STAT
+extern void correct_executable_stat(const char *filename, struct stat *st);
+
+#endif /* COMPAT_WIN32_EXECUTEABLE */
diff --git a/git-compat-util.h b/git-compat-util.h
index 35b095e..1b723af 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -607,4 +607,8 @@ int remove_or_warn(unsigned int mode, const char *path);
 /* Get the passwd entry for the UID of the current process. */
 struct passwd *xgetpwuid_self(void);
 
+#ifndef NO_EXECUTABLE_STAT
+#define correct_executable_stat(name, stat) ;
+#endif
+
 #endif
diff --git a/help.c b/help.c
index ebc2c42..d9fae3c 100644
--- a/help.c
+++ b/help.c
@@ -106,34 +106,8 @@ static int is_executable(const char *name)
 	    !S_ISREG(st.st_mode))
 		return 0;
 
-#if defined(WIN32) || defined(__CYGWIN__)
-	/* On Windows we cannot use the executable bit. The executable
-	 * state is determined by extension only. We do this first
-	 * because with virus scanners opening an executeable for
-	 * reading is potentially expensive.
-	 */
-	if (has_extension(name, ".exe"))
-		return S_IXUSR;
-
-#if defined(__CYGWIN__)
-if ((st.st_mode & S_IXUSR) == 0)
-#endif
-{	/* now that we know it does not have an executable extension,
-	   peek into the file instead */
-	char buf[3] = { 0 };
-	int n;
-	int fd = open(name, O_RDONLY);
-	st.st_mode &= ~S_IXUSR;
-	if (fd >= 0) {
-		n = read(fd, buf, 2);
-		if (n == 2)
-			/* look for a she-bang */
-			if (!strcmp(buf, "#!"))
-				st.st_mode |= S_IXUSR;
-		close(fd);
-	}
-}
-#endif
+	correct_executable_stat(name, &st);
+
 	return st.st_mode & S_IXUSR;
 }
 
-- 
1.7.12.rc2.11.g5d52328

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH] help: correct behavior for is_executable on Windows
  2012-08-13 17:02   ` Heiko Voigt
@ 2012-08-13 17:48     ` Junio C Hamano
  2012-08-15 16:50       ` Heiko Voigt
  0 siblings, 1 reply; 16+ messages in thread
From: Junio C Hamano @ 2012-08-13 17:48 UTC (permalink / raw)
  To: Heiko Voigt; +Cc: git

Heiko Voigt <hvoigt@hvoigt.net> writes:

> Since the code for cygwin and windows in general is almost the same I would
> extract one function for them where I leave in one ifdef for cygwin.
>
> E.g. like this:
>
>
> 	static int is_executable(const char *name)
> 	{
> 	        struct stat st;
>
> 	        if (stat(name, &st) || /* stat, not lstat */
> 	            !S_ISREG(st.st_mode))
> 	                return 0;
>
> 		fill_platform_stat(name, &st);
>
> 	        return st.st_mode & S_IXUSR;
> 	}
>
> which I could then define to a no op on posix. That way we avoid code
> duplication in the platform specific functions.
>
> What do you think?

Does having the "stat()" help on Windows in any way?  Does it ever
return an executable bit by itself?

If not, Windows compat/ implementation may want to skip issuing a
useless stat() and write it as

	if (has_extension(".exe"))
        	return 1;
	if (contents_begins_with("MZ") || contents_begins_with("#!"))
        	return 1;
	return 0;

without ever talking about stat() which is POSIXism compat/
implementation for Windows does not have to worry about.

And that was the reason I suggested making the whole implementation
of path_is_executable() overridable by compat/ layer.

But if having "stat()" helps on Windows, then your counterproposal
is good enough for me.

Thanks.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] help: correct behavior for is_executable on Windows
  2012-08-12  4:30 ` Junio C Hamano
@ 2012-08-13 17:02   ` Heiko Voigt
  2012-08-13 17:48     ` Junio C Hamano
  0 siblings, 1 reply; 16+ messages in thread
From: Heiko Voigt @ 2012-08-13 17:02 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Sat, Aug 11, 2012 at 09:30:06PM -0700, Junio C Hamano wrote:
> Heiko Voigt <hvoigt@hvoigt.net> writes:
> 
> >  help.c | 15 ++++++++++++---
> >  1 file changed, 12 insertions(+), 3 deletions(-)
> >
> > diff --git a/help.c b/help.c
> > index 662349d..b41fa21 100644
> > --- a/help.c
> > +++ b/help.c
> > @@ -103,10 +103,19 @@ static int is_executable(const char *name)
> >  		return 0;
> >  
> >  #if defined(WIN32) || defined(__CYGWIN__)
> > +	/* On Windows we cannot use the executable bit. The executable
> > +	 * state is determined by extension only. We do this first
> > +	 * because with virus scanners opening an executeable for
> > +	 * reading is potentially expensive.
> > +	 */
> > +	if (has_extension(name, ".exe"))
> > +		return S_IXUSR;
> > +
> >  #if defined(__CYGWIN__)
> >  if ((st.st_mode & S_IXUSR) == 0)
> >  #endif
> > -{	/* cannot trust the executable bit, peek into the file instead */
> > +{	/* now that we know it does not have an executable extension,
> > +	   peek into the file instead */
> >  	char buf[3] = { 0 };
> >  	int n;
> >  	int fd = open(name, O_RDONLY);
> > @@ -114,8 +123,8 @@ if ((st.st_mode & S_IXUSR) == 0)
> >  	if (fd >= 0) {
> >  		n = read(fd, buf, 2);
> >  		if (n == 2)
> > -			/* DOS executables start with "MZ" */
> > -			if (!strcmp(buf, "#!") || !strcmp(buf, "MZ"))
> > +			/* look for a she-bang */
> > +			if (!strcmp(buf, "#!"))
> >  				st.st_mode |= S_IXUSR;
> >  		close(fd);
> >  	}
> 
> Would it make sense to move this to compat/win32/, compat/cygwin.c,
> and compat/posix.c, each exporting is_executable(const char *path),
> so that we do not have to suffer the #ifdef mess?

Yes that makes sense. But that means I need to test the code on multiple
platforms. To ease the merge in msysgit (the patch is already applied there)
I would suggest to post a follow up patch which would split up the function
into the platform specific parts.

Since the code for cygwin and windows in general is almost the same I would
extract one function for them where I leave in one ifdef for cygwin.

E.g. like this:


	static int is_executable(const char *name)
	{
	        struct stat st;

	        if (stat(name, &st) || /* stat, not lstat */
	            !S_ISREG(st.st_mode))
	                return 0;

		fill_platform_stat(name, &st);

	        return st.st_mode & S_IXUSR;
	}

which I could then define to a no op on posix. That way we avoid code
duplication in the platform specific functions.

What do you think?

Cheers Heiko

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] help: correct behavior for is_executable on Windows
  2012-08-11  7:00 [PATCH] help: correct behavior for is_executable " Heiko Voigt
@ 2012-08-12  4:30 ` Junio C Hamano
  2012-08-13 17:02   ` Heiko Voigt
  0 siblings, 1 reply; 16+ messages in thread
From: Junio C Hamano @ 2012-08-12  4:30 UTC (permalink / raw)
  To: Heiko Voigt; +Cc: git

Heiko Voigt <hvoigt@hvoigt.net> writes:

>  help.c | 15 ++++++++++++---
>  1 file changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/help.c b/help.c
> index 662349d..b41fa21 100644
> --- a/help.c
> +++ b/help.c
> @@ -103,10 +103,19 @@ static int is_executable(const char *name)
>  		return 0;
>  
>  #if defined(WIN32) || defined(__CYGWIN__)
> +	/* On Windows we cannot use the executable bit. The executable
> +	 * state is determined by extension only. We do this first
> +	 * because with virus scanners opening an executeable for
> +	 * reading is potentially expensive.
> +	 */
> +	if (has_extension(name, ".exe"))
> +		return S_IXUSR;
> +
>  #if defined(__CYGWIN__)
>  if ((st.st_mode & S_IXUSR) == 0)
>  #endif
> -{	/* cannot trust the executable bit, peek into the file instead */
> +{	/* now that we know it does not have an executable extension,
> +	   peek into the file instead */
>  	char buf[3] = { 0 };
>  	int n;
>  	int fd = open(name, O_RDONLY);
> @@ -114,8 +123,8 @@ if ((st.st_mode & S_IXUSR) == 0)
>  	if (fd >= 0) {
>  		n = read(fd, buf, 2);
>  		if (n == 2)
> -			/* DOS executables start with "MZ" */
> -			if (!strcmp(buf, "#!") || !strcmp(buf, "MZ"))
> +			/* look for a she-bang */
> +			if (!strcmp(buf, "#!"))
>  				st.st_mode |= S_IXUSR;
>  		close(fd);
>  	}

Would it make sense to move this to compat/win32/, compat/cygwin.c,
and compat/posix.c, each exporting is_executable(const char *path),
so that we do not have to suffer the #ifdef mess?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH] help: correct behavior for is_executable on Windows
@ 2012-08-11  7:00 Heiko Voigt
  2012-08-12  4:30 ` Junio C Hamano
  0 siblings, 1 reply; 16+ messages in thread
From: Heiko Voigt @ 2012-08-11  7:00 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

The previous implementation said that the filesystem information on
Windows is not reliable to determine whether a file is executable.
To find gather this information it was peeking into the first two bytes
of a file to see whether it looks executable.
Apart from the fact that on Windows executables are usually defined as
such by their extension it lead to slow opening of help file in some
situations.

When you have virus scanner running calling open on an executable file
is a potentially expensive operation. See the following measurements (in
seconds) for example.

With virus scanner running (coldcache):

$ ./a.exe /libexec/git-core/
before open (git-add.exe): 0.000000
after open (git-add.exe): 0.412873
before open (git-annotate.exe): 0.000175
after open (git-annotate.exe): 0.397925
before open (git-apply.exe): 0.000243
after open (git-apply.exe): 0.399996
before open (git-archive.exe): 0.000147
after open (git-archive.exe): 0.397783
before open (git-bisect--helper.exe): 0.000160
after open (git-bisect--helper.exe): 0.397700
before open (git-blame.exe): 0.000160
after open (git-blame.exe): 0.399136
...

With virus scanner running (hotcache):

$ ./a.exe /libexec/git-core/
before open (git-add.exe): 0.000000
after open (git-add.exe): 0.000325
before open (git-annotate.exe): 0.000229
after open (git-annotate.exe): 0.000177
before open (git-apply.exe): 0.000167
after open (git-apply.exe): 0.000150
before open (git-archive.exe): 0.000154
after open (git-archive.exe): 0.000156
before open (git-bisect--helper.exe): 0.000132
after open (git-bisect--helper.exe): 0.000180
before open (git-blame.exe): 0.000718
after open (git-blame.exe): 0.000724
...

This test did just list the given directory and open() each file in it.

With this patch I get:

$ time git help git
Launching default browser to display HTML ...

real    0m8.723s
user    0m0.000s
sys     0m0.000s

and without

$ time git help git
Launching default browser to display HTML ...

real    1m37.734s
user    0m0.000s
sys     0m0.031s

both tests with cold cache and giving the machine some time to settle
down after restart.

Signed-off-by: Heiko Voigt <heiko.voigt@mahr.de>
---
 help.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/help.c b/help.c
index 662349d..b41fa21 100644
--- a/help.c
+++ b/help.c
@@ -103,10 +103,19 @@ static int is_executable(const char *name)
 		return 0;
 
 #if defined(WIN32) || defined(__CYGWIN__)
+	/* On Windows we cannot use the executable bit. The executable
+	 * state is determined by extension only. We do this first
+	 * because with virus scanners opening an executeable for
+	 * reading is potentially expensive.
+	 */
+	if (has_extension(name, ".exe"))
+		return S_IXUSR;
+
 #if defined(__CYGWIN__)
 if ((st.st_mode & S_IXUSR) == 0)
 #endif
-{	/* cannot trust the executable bit, peek into the file instead */
+{	/* now that we know it does not have an executable extension,
+	   peek into the file instead */
 	char buf[3] = { 0 };
 	int n;
 	int fd = open(name, O_RDONLY);
@@ -114,8 +123,8 @@ if ((st.st_mode & S_IXUSR) == 0)
 	if (fd >= 0) {
 		n = read(fd, buf, 2);
 		if (n == 2)
-			/* DOS executables start with "MZ" */
-			if (!strcmp(buf, "#!") || !strcmp(buf, "MZ"))
+			/* look for a she-bang */
+			if (!strcmp(buf, "#!"))
 				st.st_mode |= S_IXUSR;
 		close(fd);
 	}
-- 
1.7.12.rc2.10.gaf2525e

^ permalink raw reply related	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2017-01-30 17:03 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-01-27 13:50 [PATCH] help: correct behavior for is_executable on Windows Johannes Schindelin
2017-01-27 18:43 ` Junio C Hamano
2017-01-30 12:44   ` Johannes Schindelin
2017-01-30 12:40 ` [PATCH v2] help: improve is_executable() " Johannes Schindelin
2017-01-30 17:03   ` Junio C Hamano
  -- strict thread matches above, loose matches on Subject: below --
2012-08-11  7:00 [PATCH] help: correct behavior for is_executable " Heiko Voigt
2012-08-12  4:30 ` Junio C Hamano
2012-08-13 17:02   ` Heiko Voigt
2012-08-13 17:48     ` Junio C Hamano
2012-08-15 16:50       ` Heiko Voigt
2012-08-15 17:53         ` Junio C Hamano
2012-08-15 19:39           ` Junio C Hamano
2012-08-15 22:29           ` Heiko Voigt
2012-08-15 23:15             ` Junio C Hamano
2012-08-16  2:02               ` Junio C Hamano
2012-08-16 16:35                 ` Heiko Voigt

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.