All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/6] Pass t5530 on Windows
@ 2010-03-06 15:40 Johannes Sixt
  2010-03-06 20:12 ` Junio C Hamano
  0 siblings, 1 reply; 16+ messages in thread
From: Johannes Sixt @ 2010-03-06 15:40 UTC (permalink / raw)
  To: git; +Cc: Johannes Sixt

The goal of this series is to fix async procedures to the extent that
t5530-upload-pack-error.sh passes on Windows.

The problem is that on Windows async procedures are run in threads (due
to the lack of fork()), and a die() call from an async procedure tears
down the whole process while on POSIX only the async procedure is
terminated.

The approach taken is to use a custom die routine that exits only the
thread when it detects that it is not called from the main procedure.

As a bonus, the threaded async infrastructure now uses pthreads instead
of Windows specific API, and can be enabled on POSIX as well through a
new compile-time option.

Quite frankly, I don't quite know what to do with this series. On the
one hand, it is a clean-up, but in practice it is not relevant whether
die() kills only the async thread or the whole process because all
callers of async die() themselves anyway when the async procedure died.
On the other hand, it does enable threaded async procedures on POSIX...

Johannes Sixt (6):
  Modernize t5530-upload-pack-error.
  Make report() from usage.c public as vreportf() and use it.
  Fix signature of fcntl() compatibility dummy
  Windows: more pthreads functions
  Reimplement async procedures using pthreads
  Dying in an async procedure should only exit the thread, not the
    process.

 Makefile                     |    5 +++
 compat/mingw.h               |    2 +-
 compat/win32/pthread.c       |    8 ++++
 compat/win32/pthread.h       |   25 ++++++++++++++
 fast-import.c                |    8 ++---
 git-compat-util.h            |    1 +
 http-backend.c               |    5 +--
 run-command.c                |   75 ++++++++++++++++++++++++++++++++----------
 run-command.h                |    8 +++-
 t/t5530-upload-pack-error.sh |   18 +++++-----
 usage.c                      |   10 +++---
 11 files changed, 121 insertions(+), 44 deletions(-)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 0/6] Pass t5530 on Windows
  2010-03-06 15:40 [PATCH 0/6] Pass t5530 on Windows Johannes Sixt
@ 2010-03-06 20:12 ` Junio C Hamano
  2010-03-06 21:50   ` Shawn O. Pearce
  0 siblings, 1 reply; 16+ messages in thread
From: Junio C Hamano @ 2010-03-06 20:12 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: git

Johannes Sixt <j6t@kdbg.org> writes:

> Quite frankly, I don't quite know what to do with this series. On the
> one hand, it is a clean-up, but in practice it is not relevant whether
> die() kills only the async thread or the whole process because all
> callers of async die() themselves anyway when the async procedure died.
> On the other hand, it does enable threaded async procedures on POSIX...

You wrote in [PATCH 5/6]:

    A new configuration option is introduced so that the threaded
    implementation can also be used on POSIX systems. Since this option is
    intended only as playground on POSIX, but is mandatory on Windows, the
    option is not documented.

but I am wondering how much of the real world is threads-challenged these
days.  Here is what you have at the end of technical/api-run-command.txt:

   There are serious restrictions on what the asynchronous function can do
   because this facility is implemented by a pipe to a forked process on
   UNIX, but by a thread in the same address space on Windows:

   . It cannot change the program's state (global variables, environment,
     etc.) in a way that the caller notices; in other words, .in and .out
     are the only communication channels to the caller.

   . It must not change the program's state that the caller of the
     facility also uses.

And calling die() from async is obviously "change the program's state that
the caller of the facility also uses".  We didn't uncover this as a bug
because the above "serious restrictions" go both ways.

If we make threaded-async the default on any platform that is thread
capable, we would increase the likelihood of catching bugs that violate
the latter condition.  I am sure there may be downsides for going that
route, but it might be better than the current situation where two major
platforms use quite different underlying semantics for the same call, and
rely on the program (both caller and callee) to honor the above
conditions, without much tool support [*1*].

[Footnote]

*1* I sometimes wonder if people who are interseted in static analysis can
help with issues like this: "In a function started by start_async() and
its callees, you are not supposed to touch these globals, nor call those
functions".  That would be more useful than reports we occasionally get
"in this codepath this variable can be used before assigned" with many
false positives, that presumably come from from the canned set of rules
these static checkers may have.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 0/6] Pass t5530 on Windows
  2010-03-06 20:12 ` Junio C Hamano
@ 2010-03-06 21:50   ` Shawn O. Pearce
  2010-03-09 20:00     ` [PATCH 7/6] Enable threaded async procedures whenever pthreads is available Johannes Sixt
  0 siblings, 1 reply; 16+ messages in thread
From: Shawn O. Pearce @ 2010-03-06 21:50 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Johannes Sixt, git

Junio C Hamano <gitster@pobox.com> wrote:
> Johannes Sixt <j6t@kdbg.org> writes:
> 
> > Quite frankly, I don't quite know what to do with this series. On the
> > one hand, it is a clean-up, but in practice it is not relevant whether
> > die() kills only the async thread or the whole process because all
> > callers of async die() themselves anyway when the async procedure died.
> > On the other hand, it does enable threaded async procedures on POSIX...
...
>    . It must not change the program's state that the caller of the
>      facility also uses.
> 
> And calling die() from async is obviously "change the program's state that
> the caller of the facility also uses".  We didn't uncover this as a bug
> because the above "serious restrictions" go both ways.

I agree with you Junio.  I think that any async helper that is
invoking die() in its code path is wrong.  Just like its also
wrong for an async helper to try and use the sha1_file.c family
of functions.  The helper should return failure and let its caller
handle the process termination.

Hell, its even wrong for an async helper to use xmalloc(), because in
a low-memory situation that xmalloc may try to remove pack windows
*without locking*.  _DOUBLE_PLUS_UNGOOD_
 
> If we make threaded-async the default on any platform that is thread
> capable, we would increase the likelihood of catching bugs that violate
> the latter condition.

I'm in favor of that.  If we have threaded delta search enabled,
we probably can also run these async procedures in a POSIX thread
rather than forking off a child.  Though on our primary target
platform of Linux, the performance difference either way probably
cannot be measured.

-- 
Shawn.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 7/6] Enable threaded async procedures whenever pthreads is available
  2010-03-06 21:50   ` Shawn O. Pearce
@ 2010-03-09 20:00     ` Johannes Sixt
  2010-03-09 23:43       ` Shawn O. Pearce
  2010-03-10 22:28       ` Junio C Hamano
  0 siblings, 2 replies; 16+ messages in thread
From: Johannes Sixt @ 2010-03-09 20:00 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Junio C Hamano, git

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
---
On Samstag, 6. März 2010, Shawn O. Pearce wrote:
> I'm in favor of that.  If we have threaded delta search enabled,
> we probably can also run these async procedures in a POSIX thread
> rather than forking off a child.

OK. The patch could look like this.

-- Hannes

 Documentation/technical/api-run-command.txt |    5 +++--
 Makefile                                    |    5 -----
 run-command.c                               |    6 +++---
 run-command.h                               |    4 ++--
 4 files changed, 8 insertions(+), 12 deletions(-)

diff --git a/Documentation/technical/api-run-command.txt b/Documentation/technical/api-run-command.txt
index 44876fa..f18b4f4 100644
--- a/Documentation/technical/api-run-command.txt
+++ b/Documentation/technical/api-run-command.txt
@@ -231,8 +231,9 @@ The function pointer in .proc has the following signature:
 
 
 There are serious restrictions on what the asynchronous function can do
-because this facility is implemented by a pipe to a forked process on
-UNIX, but by a thread in the same address space on Windows:
+because this facility is implemented by a thread in the same address
+space on most platforms (when pthreads is available), but by a pipe to
+a forked process otherwise:
 
 . It cannot change the program's state (global variables, environment,
   etc.) in a way that the caller notices; in other words, .in and .out
diff --git a/Makefile b/Makefile
index 2fe52f8..52f2cc0 100644
--- a/Makefile
+++ b/Makefile
@@ -979,7 +979,6 @@ ifeq ($(uname_S),Windows)
 	NO_CURL = YesPlease
 	NO_PYTHON = YesPlease
 	BLK_SHA1 = YesPlease
-	ASYNC_AS_THREAD = YesPlease
 
 	CC = compat/vcbuild/scripts/clink.pl
 	AR = compat/vcbuild/scripts/lib.pl
@@ -1031,7 +1030,6 @@ ifneq (,$(findstring MINGW,$(uname_S)))
 	NO_REGEX = YesPlease
 	NO_PYTHON = YesPlease
 	BLK_SHA1 = YesPlease
-	ASYNC_AS_THREAD = YesPlease
 	COMPAT_CFLAGS += -D__USE_MINGW_ACCESS -DNOGDI -Icompat -Icompat/fnmatch -Icompat/win32
 	COMPAT_CFLAGS += -DSTRIP_EXTENSION=\".exe\"
 	COMPAT_OBJS += compat/mingw.o compat/fnmatch/fnmatch.o compat/winansi.o \
@@ -1344,9 +1342,6 @@ ifdef NO_PTHREADS
 else
 	EXTLIBS += $(PTHREAD_LIBS)
 	LIB_OBJS += thread-utils.o
-ifdef ASYNC_AS_THREAD
-	BASIC_CFLAGS += -DASYNC_AS_THREAD
-endif
 endif
 
 ifdef DIR_HAS_BSD_GROUP_SEMANTICS
diff --git a/run-command.c b/run-command.c
index 66cc4bf..053b28f 100644
--- a/run-command.c
+++ b/run-command.c
@@ -447,7 +447,7 @@ int run_command_v_opt_cd_env(const char **argv, int opt, const char *dir, const
 	return run_command(&cmd);
 }
 
-#ifdef ASYNC_AS_THREAD
+#ifndef NO_PTHREADS
 static pthread_t main_thread;
 static int main_thread_set;
 static pthread_key_t async_key;
@@ -521,7 +521,7 @@ int start_async(struct async *async)
 	else
 		proc_out = -1;
 
-#ifndef ASYNC_AS_THREAD
+#ifdef NO_PTHREADS
 	/* Flush stdio before fork() to avoid cloning buffers */
 	fflush(NULL);
 
@@ -590,7 +590,7 @@ error:
 
 int finish_async(struct async *async)
 {
-#ifndef ASYNC_AS_THREAD
+#ifdef NO_PTHREADS
 	return wait_or_whine(async->pid, "child process", 0);
 #else
 	void *ret = (void *)(intptr_t)(-1);
diff --git a/run-command.h b/run-command.h
index 40db39c..56491b9 100644
--- a/run-command.h
+++ b/run-command.h
@@ -1,7 +1,7 @@
 #ifndef RUN_COMMAND_H
 #define RUN_COMMAND_H
 
-#ifdef ASYNC_AS_THREAD
+#ifndef NO_PTHREADS
 #include <pthread.h>
 #endif
 
@@ -78,7 +78,7 @@ struct async {
 	void *data;
 	int in;		/* caller writes here and closes it */
 	int out;	/* caller reads from here and closes it */
-#ifndef ASYNC_AS_THREAD
+#ifdef NO_PTHREADS
 	pid_t pid;
 #else
 	pthread_t tid;
-- 
1.7.0.rc2.65.g7b13a

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 7/6] Enable threaded async procedures whenever pthreads is available
  2010-03-09 20:00     ` [PATCH 7/6] Enable threaded async procedures whenever pthreads is available Johannes Sixt
@ 2010-03-09 23:43       ` Shawn O. Pearce
  2010-03-10 22:28       ` Junio C Hamano
  1 sibling, 0 replies; 16+ messages in thread
From: Shawn O. Pearce @ 2010-03-09 23:43 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Junio C Hamano, git

Johannes Sixt <j6t@kdbg.org> wrote:
> Signed-off-by: Johannes Sixt <j6t@kdbg.org>
> ---
> On Samstag, 6. M?rz 2010, Shawn O. Pearce wrote:
> > I'm in favor of that.  If we have threaded delta search enabled,
> > we probably can also run these async procedures in a POSIX thread
> > rather than forking off a child.
> 
> OK. The patch could look like this.

Looks fine to me.  But given my earlier statement about xmalloc()
damage... I wonder if we shouldn't try to ensure that isn't a
problem first.

-- 
Shawn.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 7/6] Enable threaded async procedures whenever pthreads is available
  2010-03-09 20:00     ` [PATCH 7/6] Enable threaded async procedures whenever pthreads is available Johannes Sixt
  2010-03-09 23:43       ` Shawn O. Pearce
@ 2010-03-10 22:28       ` Junio C Hamano
  2010-03-11 19:53         ` Johannes Sixt
  2010-03-17 21:28         ` Johannes Sixt
  1 sibling, 2 replies; 16+ messages in thread
From: Junio C Hamano @ 2010-03-10 22:28 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Shawn O. Pearce, git

Johannes Sixt <j6t@kdbg.org> writes:

> Signed-off-by: Johannes Sixt <j6t@kdbg.org>
> ---
> On Samstag, 6. März 2010, Shawn O. Pearce wrote:
>> I'm in favor of that.  If we have threaded delta search enabled,
>> we probably can also run these async procedures in a POSIX thread
>> rather than forking off a child.
>
> OK. The patch could look like this.

Will queue in 'pu', but as Shawn said, we should probably give another
closer look at the callees that are started with this interface before
moving forward.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 7/6] Enable threaded async procedures whenever pthreads is available
  2010-03-10 22:28       ` Junio C Hamano
@ 2010-03-11 19:53         ` Johannes Sixt
  2010-03-12  5:56           ` Junio C Hamano
  2010-03-17 21:28         ` Johannes Sixt
  1 sibling, 1 reply; 16+ messages in thread
From: Johannes Sixt @ 2010-03-11 19:53 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Shawn O. Pearce, git

On Mittwoch, 10. März 2010, Junio C Hamano wrote:
> Will queue in 'pu', but as Shawn said, we should probably give another
> closer look at the callees that are started with this interface before
> moving forward.

I'll audit the call paths. But I'll need some time for this.

-- Hannes

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 7/6] Enable threaded async procedures whenever pthreads is available
  2010-03-11 19:53         ` Johannes Sixt
@ 2010-03-12  5:56           ` Junio C Hamano
  0 siblings, 0 replies; 16+ messages in thread
From: Junio C Hamano @ 2010-03-12  5:56 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Junio C Hamano, Shawn O. Pearce, git

Johannes Sixt <j6t@kdbg.org> writes:

> On Mittwoch, 10. März 2010, Junio C Hamano wrote:
>> Will queue in 'pu', but as Shawn said, we should probably give another
>> closer look at the callees that are started with this interface before
>> moving forward.
>
> I'll audit the call paths. But I'll need some time for this.

Thanks for volunteering.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 7/6] Enable threaded async procedures whenever pthreads is available
  2010-03-10 22:28       ` Junio C Hamano
  2010-03-11 19:53         ` Johannes Sixt
@ 2010-03-17 21:28         ` Johannes Sixt
  2010-03-17 22:19           ` Junio C Hamano
  2010-03-23  8:15           ` Fredrik Kuivinen
  1 sibling, 2 replies; 16+ messages in thread
From: Johannes Sixt @ 2010-03-17 21:28 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Shawn O. Pearce, git

On Mittwoch, 10. März 2010, Junio C Hamano wrote:
> Johannes Sixt <j6t@kdbg.org> writes:
> > On Samstag, 6. März 2010, Shawn O. Pearce wrote:
> >> I'm in favor of that.  If we have threaded delta search enabled,
> >> we probably can also run these async procedures in a POSIX thread
> >> rather than forking off a child.
> >
> > OK. The patch could look like this.
>
> Will queue in 'pu', but as Shawn said, we should probably give another
> closer look at the callees that are started with this interface before
> moving forward.

So here is my analysis.

There are currently these users of the async infrastructure:

builtin-fetch-pack.c
builtin-receive-pack.c
builtin-send-pack.c
convert.c
remote-curl.c
upload_pack.c

The list below shows all functions that read or write potentially global state 
that are called from the async procedure (except for upload_pack.c, see 
below). I ignored all functions that do not depend on global state, like 
strcmp, memcpy, sprintf, strerror, etc.

----------
remote-curl.c:write_discovery():
  write
  close

These two functions only communicate with the parent. This is absolutely 
thread-safe.

----------
builtin-fetch-pack.c:sideband_demux() and
builtin-send-pack.c:sideband_demux() are virtually identical, and
builtin-receive-pack.c:copy_to_sideband() has the same footprint:
  getenv
  read
  die_errno
  die
  write
  close

Again, the functions only communicate with the parent and nothing else. getenv 
could be a problem if the parent called setenv, but it doesn't. die (and 
die_errno) are overridden in the threaded version, and are uncritical. In 
total, this user is thread-safe.

(This is getenv("TERM") from recv_sideband, btw.)

----------
convert.c:filter_buffer()
  pipe
  close
  getenv
  fork
  read
  write
  fprintf
  waitpid

if GIT_TRACE is set:
  xrealloc
  die
  free

This one is less trivial. It calls start_command+finish_command from the async 
procedure. The parent calls
  read()
  xrealloc() (via strbuf_read())
  error()
(and nothing else) between start_async() and finish_async().

As long as GIT_TRACE is not set, we are safe, because the async procedure 
carefully releases all global resources that it allocated, and the parent is 
looking in the other direction while it does os. Even if GIT_TRACE is set, 
xrealloc() in the parent and the async procedure cannot be called 
simultanously because the procedure calls it only before it begins writing 
output, and the parent only after it received this output.

On Windows, the situation is slightly different, because we always call into 
xmalloc() during start_command, but again we do so before the output is 
produced.

I think we are safe in all cases.

----------
upload_pack:create_pack_file(): This user is different because it calls into 
the revision walker in the async procedure, which definitely affects global 
state. Therefore, here is the list of functions called by the parent until it 
exits:
  pipe
  close
  getenv
  fork
  read
  write
  fprintf
  waitpid
  alarm
  die
  die_errno
  pthread_join / waitpid

if GIT_TRACE is set:
  xrealloc
  die
  free

It also calls start_command+finish_command. If GIT_TRACE is set, we are less 
safe because this time, xrealloc() can be called simultanously with pack 
functions while the revision walker is operating.

But as long as GIT_TRACE is not set, it looks like we are safe, though I 
haven't looked through the revision walker whether it messes with alarms or 
the environment.

One aspect is, though, that the async procedure is used *only* for a shallow 
clone, and the reason that it is used is that pack-objects (or the way it is 
called from upload-pack?) has not been taught to produce a pack for a shallow 
clone.

This concludes my analysis.

-- Hannes

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 7/6] Enable threaded async procedures whenever pthreads is available
  2010-03-17 21:28         ` Johannes Sixt
@ 2010-03-17 22:19           ` Junio C Hamano
  2010-03-23  8:15           ` Fredrik Kuivinen
  1 sibling, 0 replies; 16+ messages in thread
From: Junio C Hamano @ 2010-03-17 22:19 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Shawn O. Pearce, git

Johannes Sixt <j6t@kdbg.org> writes:

>> Will queue in 'pu', but as Shawn said, we should probably give another
>> closer look at the callees that are started with this interface before
>> moving forward.
>
> So here is my analysis.
> ...
> This concludes my analysis.

Thanks; very much appreciated.

I'll read it over and start moving things to 'next', then.  I'll be slower
than my usual self for the coming few days, though.

A bit of patience please.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 7/6] Enable threaded async procedures whenever pthreads is  available
  2010-03-17 21:28         ` Johannes Sixt
  2010-03-17 22:19           ` Junio C Hamano
@ 2010-03-23  8:15           ` Fredrik Kuivinen
  2010-03-23 20:19             ` Johannes Sixt
  1 sibling, 1 reply; 16+ messages in thread
From: Fredrik Kuivinen @ 2010-03-23  8:15 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Junio C Hamano, Shawn O. Pearce, git

On Wed, Mar 17, 2010 at 22:28, Johannes Sixt <j6t@kdbg.org> wrote:
> On Mittwoch, 10. März 2010, Junio C Hamano wrote:
>> Johannes Sixt <j6t@kdbg.org> writes:
>> > On Samstag, 6. März 2010, Shawn O. Pearce wrote:
>> >> I'm in favor of that.  If we have threaded delta search enabled,
>> >> we probably can also run these async procedures in a POSIX thread
>> >> rather than forking off a child.
>> >
>> > OK. The patch could look like this.
>>
>> Will queue in 'pu', but as Shawn said, we should probably give another
>> closer look at the callees that are started with this interface before
>> moving forward.
>
> So here is my analysis.
>
> There are currently these users of the async infrastructure:
>
> builtin-fetch-pack.c
> builtin-receive-pack.c
> builtin-send-pack.c
> convert.c
> remote-curl.c
> upload_pack.c
>
> The list below shows all functions that read or write potentially global state
> that are called from the async procedure (except for upload_pack.c, see
> below). I ignored all functions that do not depend on global state, like
> strcmp, memcpy, sprintf, strerror, etc.
>

> ----------
> convert.c:filter_buffer()
>  pipe
>  close
>  getenv
>  fork
>  read
>  write
>  fprintf
>  waitpid
>
> if GIT_TRACE is set:
>  xrealloc
>  die
>  free
>
> This one is less trivial. It calls start_command+finish_command from the async
> procedure. The parent calls
>  read()
>  xrealloc() (via strbuf_read())
>  error()
> (and nothing else) between start_async() and finish_async().
>
> As long as GIT_TRACE is not set, we are safe, because the async procedure
> carefully releases all global resources that it allocated, and the parent is
> looking in the other direction while it does os. Even if GIT_TRACE is set,
> xrealloc() in the parent and the async procedure cannot be called
> simultanously because the procedure calls it only before it begins writing
> output, and the parent only after it received this output.

Maybe I'm missing something but, isn't it possible that xrealloc is
called simultaneously from the two threads if GIT_TRACE is set?

Immediately after start_async the parent calls strbuf_read. We then
get the call chain
strbuf_read -> strbuf_grow -> ALLOG_GROW -> xrealloc, so xrealloc is
called before we read any data in the parent.

In the child we have start_command -> trace_argv_printf -> strbuf_grow -> ...

That xmalloc and xrealloc aren't thread-safe feels a bit fragile.
Maybe we should try to fix that.

> ----------
> upload_pack:create_pack_file(): This user is different because it calls into
> the revision walker in the async procedure, which definitely affects global
> state. Therefore, here is the list of functions called by the parent until it
> exits:
>  pipe
>  close
>  getenv
>  fork
>  read
>  write
>  fprintf
>  waitpid
>  alarm
>  die
>  die_errno
>  pthread_join / waitpid

sha1_to_hex is also called by the parent and the current
implementation of that function is not thread-safe. sha1_to_hex is
also called by some paths in the revision machinery, but I don't know
if it will ever be called in this particular case.

Maybe it would be a good idea to create wrappers for getenv, setenv,
unsetenv, and putenv to make them thread-safe as well. Then won't have
to worry about them in the future.

- Fredrik

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 7/6] Enable threaded async procedures whenever pthreads is  available
  2010-03-23  8:15           ` Fredrik Kuivinen
@ 2010-03-23 20:19             ` Johannes Sixt
  2010-03-23 20:25               ` Johannes Sixt
  2010-03-23 21:42               ` Fredrik Kuivinen
  0 siblings, 2 replies; 16+ messages in thread
From: Johannes Sixt @ 2010-03-23 20:19 UTC (permalink / raw)
  To: Fredrik Kuivinen; +Cc: Junio C Hamano, Shawn O. Pearce, git

On Dienstag, 23. März 2010, Fredrik Kuivinen wrote:
> On Wed, Mar 17, 2010 at 22:28, Johannes Sixt <j6t@kdbg.org> wrote:
> > ----------
> > convert.c:filter_buffer()
> ...
> Maybe I'm missing something but, isn't it possible that xrealloc is
> called simultaneously from the two threads if GIT_TRACE is set?
>
> Immediately after start_async the parent calls strbuf_read. We then
> get the call chain
> strbuf_read -> strbuf_grow -> ALLOG_GROW -> xrealloc, so xrealloc is
> called before we read any data in the parent.
>
> In the child we have start_command -> trace_argv_printf -> strbuf_grow ->
> ...

Outch! You are right. It seems I missed the call of strbuf_grow before the 
loop in strbuf_read.

OK, this means that convert.c is not safe if (and only if) GIT_TRACE is 
set. :-(

> That xmalloc and xrealloc aren't thread-safe feels a bit fragile.
> Maybe we should try to fix that.

The point of this assessment was to find out whether this is necessary (and 
whether something else that is not thread-safe is used).

> > ----------
> > upload_pack:create_pack_file(): 
> ...
> sha1_to_hex is also called by the parent and the current
> implementation of that function is not thread-safe. sha1_to_hex is
> also called by some paths in the revision machinery, but I don't know
> if it will ever be called in this particular case.

sha1_to_hex is only called by the parent when the async procedure is not used.

-- Hannes

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 7/6] Enable threaded async procedures whenever pthreads is  available
  2010-03-23 20:19             ` Johannes Sixt
@ 2010-03-23 20:25               ` Johannes Sixt
  2010-03-23 20:44                 ` Junio C Hamano
  2010-03-23 21:42               ` Fredrik Kuivinen
  1 sibling, 1 reply; 16+ messages in thread
From: Johannes Sixt @ 2010-03-23 20:25 UTC (permalink / raw)
  To: Fredrik Kuivinen; +Cc: Junio C Hamano, Shawn O. Pearce, git

On Dienstag, 23. März 2010, Johannes Sixt wrote:
> On Dienstag, 23. März 2010, Fredrik Kuivinen wrote:
> > On Wed, Mar 17, 2010 at 22:28, Johannes Sixt <j6t@kdbg.org> wrote:
> > > ----------
> > > upload_pack:create_pack_file():
> >
> > ...
> > sha1_to_hex is also called by the parent and the current
> > implementation of that function is not thread-safe. sha1_to_hex is
> > also called by some paths in the revision machinery, but I don't know
> > if it will ever be called in this particular case.
>
> sha1_to_hex is only called by the parent when the async procedure is not
> used.

BTW, the real fix for this potentially problematic case is to teach 
pack-objects to create a pack file for a shallow clone. Then we don't need 
this instance of an async procedure.

-- Hannes

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 7/6] Enable threaded async procedures whenever pthreads is  available
  2010-03-23 20:25               ` Johannes Sixt
@ 2010-03-23 20:44                 ` Junio C Hamano
  2010-03-23 21:09                   ` Johannes Sixt
  0 siblings, 1 reply; 16+ messages in thread
From: Junio C Hamano @ 2010-03-23 20:44 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Fredrik Kuivinen, Shawn O. Pearce, git

Johannes Sixt <j6t@kdbg.org> writes:

> BTW, the real fix for this potentially problematic case is to teach 
> pack-objects to create a pack file for a shallow clone. Then we don't need 
> this instance of an async procedure.

I keep hearing "shallow clone" here, but doesn't "bundle -n 1" essentially
do something quite similar using pack-objects?  Do they make a call into
pack-objects in a different way, and if so can we update "clone" to mimic
how "bundle" drives pack-objects?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 7/6] Enable threaded async procedures whenever pthreads is  available
  2010-03-23 20:44                 ` Junio C Hamano
@ 2010-03-23 21:09                   ` Johannes Sixt
  0 siblings, 0 replies; 16+ messages in thread
From: Johannes Sixt @ 2010-03-23 21:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Fredrik Kuivinen, Shawn O. Pearce, git

On Dienstag, 23. März 2010, Junio C Hamano wrote:
> Johannes Sixt <j6t@kdbg.org> writes:
> > BTW, the real fix for this potentially problematic case is to teach
> > pack-objects to create a pack file for a shallow clone. Then we don't
> > need this instance of an async procedure.
>
> I keep hearing "shallow clone" here, but doesn't "bundle -n 1" essentially
> do something quite similar using pack-objects?  Do they make a call into
> pack-objects in a different way, and if so can we update "clone" to mimic
> how "bundle" drives pack-objects?

Both *current* implementations of shallow clone and bundle feed pack-objects 
with rev-list output.

In the case of non-shallow clone, upload-pack does not need rev-list because 
pack-objects can walk the revisions itself. I cannot tell what pack-objects 
does not have that it cannot walk the revisions itself for shallow clones.

-- Hannes

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 7/6] Enable threaded async procedures whenever pthreads is  available
  2010-03-23 20:19             ` Johannes Sixt
  2010-03-23 20:25               ` Johannes Sixt
@ 2010-03-23 21:42               ` Fredrik Kuivinen
  1 sibling, 0 replies; 16+ messages in thread
From: Fredrik Kuivinen @ 2010-03-23 21:42 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Junio C Hamano, Shawn O. Pearce, git

On Tue, Mar 23, 2010 at 21:19, Johannes Sixt <j6t@kdbg.org> wrote:
> On Dienstag, 23. März 2010, Fredrik Kuivinen wrote:
>> On Wed, Mar 17, 2010 at 22:28, Johannes Sixt <j6t@kdbg.org> wrote:

>> That xmalloc and xrealloc aren't thread-safe feels a bit fragile.
>> Maybe we should try to fix that.
>
> The point of this assessment was to find out whether this is necessary (and
> whether something else that is not thread-safe is used).

It may not be necessary now, but my point was that by having
thread-unsafe xmalloc and xrealloc it is a bit too easy to introduce
new bugs.

>> > ----------
>> > upload_pack:create_pack_file():
>> ...
>> sha1_to_hex is also called by the parent and the current
>> implementation of that function is not thread-safe. sha1_to_hex is
>> also called by some paths in the revision machinery, but I don't know
>> if it will ever be called in this particular case.
>
> sha1_to_hex is only called by the parent when the async procedure is not used.

Yes, you are right.

- Fredrik

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2010-03-23 21:42 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-03-06 15:40 [PATCH 0/6] Pass t5530 on Windows Johannes Sixt
2010-03-06 20:12 ` Junio C Hamano
2010-03-06 21:50   ` Shawn O. Pearce
2010-03-09 20:00     ` [PATCH 7/6] Enable threaded async procedures whenever pthreads is available Johannes Sixt
2010-03-09 23:43       ` Shawn O. Pearce
2010-03-10 22:28       ` Junio C Hamano
2010-03-11 19:53         ` Johannes Sixt
2010-03-12  5:56           ` Junio C Hamano
2010-03-17 21:28         ` Johannes Sixt
2010-03-17 22:19           ` Junio C Hamano
2010-03-23  8:15           ` Fredrik Kuivinen
2010-03-23 20:19             ` Johannes Sixt
2010-03-23 20:25               ` Johannes Sixt
2010-03-23 20:44                 ` Junio C Hamano
2010-03-23 21:09                   ` Johannes Sixt
2010-03-23 21:42               ` Fredrik Kuivinen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.