All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] Fix two buffer overflows and remove a redundant var
@ 2013-12-17 13:43 Michael Haggerty
  2013-12-17 13:43 ` [PATCH 1/3] prune-packed: fix a possible buffer overflow Michael Haggerty
                   ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Michael Haggerty @ 2013-12-17 13:43 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Michael Haggerty

These patches fix three things I found when I was puttering around.
The first two patches fix buffer overflows.  (They don't seem to be
exploitable.)

The third removes a redundant local variable.

The patches apply to master.

Michael Haggerty (3):
  prune-packed: fix a possible buffer overflow
  prune_object_dir(): verify that path fits in the temporary buffer
  cmd_repack(): remove redundant local variable "nr_packs"

 builtin/prune-packed.c | 4 ++--
 builtin/prune.c        | 4 +++-
 builtin/repack.c       | 6 ++----
 3 files changed, 7 insertions(+), 7 deletions(-)

-- 
1.8.5.1

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 1/3] prune-packed: fix a possible buffer overflow
  2013-12-17 13:43 [PATCH 0/3] Fix two buffer overflows and remove a redundant var Michael Haggerty
@ 2013-12-17 13:43 ` Michael Haggerty
  2013-12-17 13:57   ` Duy Nguyen
  2013-12-17 13:43 ` [PATCH 2/3] prune_object_dir(): verify that path fits in the temporary buffer Michael Haggerty
  2013-12-17 13:43 ` [PATCH 3/3] cmd_repack(): remove redundant local variable "nr_packs" Michael Haggerty
  2 siblings, 1 reply; 21+ messages in thread
From: Michael Haggerty @ 2013-12-17 13:43 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Michael Haggerty

The pathname character array might hold:

    strlen(pathname) -- the pathname argument
    '/'              -- a slash, if not already present in pathname
    %02x/            -- the first two characters of the SHA-1 plus slash
    38 characters    -- the last 38 characters of the SHA-1
    NUL              -- terminator
    ---------------------
    strlen(pathname) + 43

(Actually, the NUL character is not written explicitly to pathname;
rather, the code relies on pathname being initialized to zeros and the
zero following the pathname still being there after the other
characters are written to the array.)

But the old pathname variable was PATH_MAX characters long, whereas
the check was (len > PATH_MAX - 42).  So there might have been one
byte too many stored in pathname.  This would have resulted in it's
not being NUL-terminated.

So, increase the size of the pathname variable by one byte to avoid
this possibility.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 builtin/prune-packed.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/builtin/prune-packed.c b/builtin/prune-packed.c
index fa6ce42..81bc786 100644
--- a/builtin/prune-packed.c
+++ b/builtin/prune-packed.c
@@ -37,7 +37,7 @@ static void prune_dir(int i, DIR *dir, char *pathname, int len, int opts)
 void prune_packed_objects(int opts)
 {
 	int i;
-	static char pathname[PATH_MAX];
+	static char pathname[PATH_MAX + 1];
 	const char *dir = get_object_directory();
 	int len = strlen(dir);
 
@@ -45,7 +45,7 @@ void prune_packed_objects(int opts)
 		progress = start_progress_delay("Removing duplicate objects",
 			256, 95, 2);
 
-	if (len > PATH_MAX - 42)
+	if (len + 42 > PATH_MAX)
 		die("impossible object directory");
 	memcpy(pathname, dir, len);
 	if (len && pathname[len-1] != '/')
-- 
1.8.5.1

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 2/3] prune_object_dir(): verify that path fits in the temporary buffer
  2013-12-17 13:43 [PATCH 0/3] Fix two buffer overflows and remove a redundant var Michael Haggerty
  2013-12-17 13:43 ` [PATCH 1/3] prune-packed: fix a possible buffer overflow Michael Haggerty
@ 2013-12-17 13:43 ` Michael Haggerty
  2013-12-17 18:51   ` Junio C Hamano
  2013-12-17 18:56   ` Antoine Pelisse
  2013-12-17 13:43 ` [PATCH 3/3] cmd_repack(): remove redundant local variable "nr_packs" Michael Haggerty
  2 siblings, 2 replies; 21+ messages in thread
From: Michael Haggerty @ 2013-12-17 13:43 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Michael Haggerty

Dimension the buffer based on PATH_MAX rather than a magic number, and
verify that the path fits in it before continuing.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---

I don't think that this problem is remotely exploitable, because the
size of the string doesn't depend on inputs that can be influenced by
a client (at least not within Git).

 builtin/prune.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/builtin/prune.c b/builtin/prune.c
index 6366917..ae34d04 100644
--- a/builtin/prune.c
+++ b/builtin/prune.c
@@ -96,7 +96,9 @@ static void prune_object_dir(const char *path)
 {
 	int i;
 	for (i = 0; i < 256; i++) {
-		static char dir[4096];
+		static char dir[PATH_MAX + 1];
+		if (strlen(path) + 3 > PATH_MAX)
+			die("impossible object directory");
 		sprintf(dir, "%s/%02x", path, i);
 		prune_dir(i, dir);
 	}
-- 
1.8.5.1

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 3/3] cmd_repack(): remove redundant local variable "nr_packs"
  2013-12-17 13:43 [PATCH 0/3] Fix two buffer overflows and remove a redundant var Michael Haggerty
  2013-12-17 13:43 ` [PATCH 1/3] prune-packed: fix a possible buffer overflow Michael Haggerty
  2013-12-17 13:43 ` [PATCH 2/3] prune_object_dir(): verify that path fits in the temporary buffer Michael Haggerty
@ 2013-12-17 13:43 ` Michael Haggerty
  2013-12-17 13:46   ` Stefan Beller
  2 siblings, 1 reply; 21+ messages in thread
From: Michael Haggerty @ 2013-12-17 13:43 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Michael Haggerty

Its value is the same as the number of entries in the "names"
string_list, so just use "names.nr" in its place.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
This is just a trivial simplification.  Take it or leave it.

 builtin/repack.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/builtin/repack.c b/builtin/repack.c
index a0ff5c7..91e2363 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -123,7 +123,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 	struct string_list rollback = STRING_LIST_INIT_NODUP;
 	struct string_list existing_packs = STRING_LIST_INIT_DUP;
 	struct strbuf line = STRBUF_INIT;
-	int nr_packs, ext, ret, failed;
+	int ext, ret, failed;
 	FILE *out;
 
 	/* variables to be filled by option parsing */
@@ -233,13 +233,11 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 	if (ret)
 		return ret;
 
-	nr_packs = 0;
 	out = xfdopen(cmd.out, "r");
 	while (strbuf_getline(&line, out, '\n') != EOF) {
 		if (line.len != 40)
 			die("repack: Expecting 40 character sha1 lines only from pack-objects.");
 		string_list_append(&names, line.buf);
-		nr_packs++;
 	}
 	fclose(out);
 	ret = finish_command(&cmd);
@@ -247,7 +245,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 		return ret;
 	argv_array_clear(&cmd_args);
 
-	if (!nr_packs && !quiet)
+	if (!names.nr && !quiet)
 		printf("Nothing new to pack.\n");
 
 	/*
-- 
1.8.5.1

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH 3/3] cmd_repack(): remove redundant local variable "nr_packs"
  2013-12-17 13:43 ` [PATCH 3/3] cmd_repack(): remove redundant local variable "nr_packs" Michael Haggerty
@ 2013-12-17 13:46   ` Stefan Beller
  0 siblings, 0 replies; 21+ messages in thread
From: Stefan Beller @ 2013-12-17 13:46 UTC (permalink / raw)
  To: Michael Haggerty, Junio C Hamano; +Cc: git

On 17.12.2013 14:43, Michael Haggerty wrote:
> Its value is the same as the number of entries in the "names"
> string_list, so just use "names.nr" in its place.
> 
> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>

Acked-by: Stefan Beller <stefanbeller@googlemail.com>

> ---
> This is just a trivial simplification.  Take it or leave it.
> 
>  builtin/repack.c | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/builtin/repack.c b/builtin/repack.c
> index a0ff5c7..91e2363 100644
> --- a/builtin/repack.c
> +++ b/builtin/repack.c
> @@ -123,7 +123,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
>  	struct string_list rollback = STRING_LIST_INIT_NODUP;
>  	struct string_list existing_packs = STRING_LIST_INIT_DUP;
>  	struct strbuf line = STRBUF_INIT;
> -	int nr_packs, ext, ret, failed;
> +	int ext, ret, failed;
>  	FILE *out;
>  
>  	/* variables to be filled by option parsing */
> @@ -233,13 +233,11 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
>  	if (ret)
>  		return ret;
>  
> -	nr_packs = 0;
>  	out = xfdopen(cmd.out, "r");
>  	while (strbuf_getline(&line, out, '\n') != EOF) {
>  		if (line.len != 40)
>  			die("repack: Expecting 40 character sha1 lines only from pack-objects.");
>  		string_list_append(&names, line.buf);
> -		nr_packs++;
>  	}
>  	fclose(out);
>  	ret = finish_command(&cmd);
> @@ -247,7 +245,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
>  		return ret;
>  	argv_array_clear(&cmd_args);
>  
> -	if (!nr_packs && !quiet)
> +	if (!names.nr && !quiet)
>  		printf("Nothing new to pack.\n");
>  
>  	/*
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/3] prune-packed: fix a possible buffer overflow
  2013-12-17 13:43 ` [PATCH 1/3] prune-packed: fix a possible buffer overflow Michael Haggerty
@ 2013-12-17 13:57   ` Duy Nguyen
  2013-12-17 18:43     ` Junio C Hamano
  0 siblings, 1 reply; 21+ messages in thread
From: Duy Nguyen @ 2013-12-17 13:57 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Junio C Hamano, Git Mailing List

On Tue, Dec 17, 2013 at 8:43 PM, Michael Haggerty <mhagger@alum.mit.edu> wrote:
> The pathname character array might hold:
>
>     strlen(pathname) -- the pathname argument
>     '/'              -- a slash, if not already present in pathname
>     %02x/            -- the first two characters of the SHA-1 plus slash
>     38 characters    -- the last 38 characters of the SHA-1
>     NUL              -- terminator
>     ---------------------
>     strlen(pathname) + 43
>
> (Actually, the NUL character is not written explicitly to pathname;
> rather, the code relies on pathname being initialized to zeros and the
> zero following the pathname still being there after the other
> characters are written to the array.)
>
> But the old pathname variable was PATH_MAX characters long, whereas
> the check was (len > PATH_MAX - 42).  So there might have been one
> byte too many stored in pathname.  This would have resulted in it's
> not being NUL-terminated.
>
> So, increase the size of the pathname variable by one byte to avoid
> this possibility.

Why don't we take this opportunity to replace that array with a
strbuf? The conversion looks simple with this function.

>
> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
> ---
>  builtin/prune-packed.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/builtin/prune-packed.c b/builtin/prune-packed.c
> index fa6ce42..81bc786 100644
> --- a/builtin/prune-packed.c
> +++ b/builtin/prune-packed.c
> @@ -37,7 +37,7 @@ static void prune_dir(int i, DIR *dir, char *pathname, int len, int opts)
>  void prune_packed_objects(int opts)
>  {
>         int i;
> -       static char pathname[PATH_MAX];
> +       static char pathname[PATH_MAX + 1];
>         const char *dir = get_object_directory();
>         int len = strlen(dir);
>
> @@ -45,7 +45,7 @@ void prune_packed_objects(int opts)
>                 progress = start_progress_delay("Removing duplicate objects",
>                         256, 95, 2);
>
> -       if (len > PATH_MAX - 42)
> +       if (len + 42 > PATH_MAX)
>                 die("impossible object directory");
>         memcpy(pathname, dir, len);
>         if (len && pathname[len-1] != '/')
> --
-- 
Duy

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/3] prune-packed: fix a possible buffer overflow
  2013-12-17 13:57   ` Duy Nguyen
@ 2013-12-17 18:43     ` Junio C Hamano
  2013-12-18 22:44       ` Michael Haggerty
  2013-12-19  0:37       ` Duy Nguyen
  0 siblings, 2 replies; 21+ messages in thread
From: Junio C Hamano @ 2013-12-17 18:43 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Michael Haggerty, Git Mailing List

Duy Nguyen <pclouds@gmail.com> writes:

> Why don't we take this opportunity to replace that array with a
> strbuf? The conversion looks simple with this function.

Indeed.  Something like this, perhaps?

 builtin/prune-packed.c | 38 +++++++++++++++++++++-----------------
 1 file changed, 21 insertions(+), 17 deletions(-)

diff --git a/builtin/prune-packed.c b/builtin/prune-packed.c
index fa6ce42..fcf5fb6 100644
--- a/builtin/prune-packed.c
+++ b/builtin/prune-packed.c
@@ -10,58 +10,62 @@ static const char * const prune_packed_usage[] = {
 
 static struct progress *progress;
 
-static void prune_dir(int i, DIR *dir, char *pathname, int len, int opts)
+static void prune_dir(int i, DIR *dir, struct strbuf *pathname, int opts)
 {
 	struct dirent *de;
 	char hex[40];
+	int top_len = pathname->len;
 
 	sprintf(hex, "%02x", i);
 	while ((de = readdir(dir)) != NULL) {
 		unsigned char sha1[20];
 		if (strlen(de->d_name) != 38)
 			continue;
-		memcpy(hex+2, de->d_name, 38);
+		memcpy(hex + 2, de->d_name, 38);
 		if (get_sha1_hex(hex, sha1))
 			continue;
 		if (!has_sha1_pack(sha1))
 			continue;
-		memcpy(pathname + len, de->d_name, 38);
+
+		strbuf_add(pathname, de->d_name, 38);
 		if (opts & PRUNE_PACKED_DRY_RUN)
-			printf("rm -f %s\n", pathname);
+			printf("rm -f %s\n", pathname->buf);
 		else
-			unlink_or_warn(pathname);
+			unlink_or_warn(pathname->buf);
 		display_progress(progress, i + 1);
+		strbuf_setlen(pathname, top_len);
 	}
 }
 
 void prune_packed_objects(int opts)
 {
 	int i;
-	static char pathname[PATH_MAX];
 	const char *dir = get_object_directory();
-	int len = strlen(dir);
+	struct strbuf pathname = STRBUF_INIT;
+	int top_len;
 
+	strbuf_addstr(&pathname, dir);
 	if (opts & PRUNE_PACKED_VERBOSE)
 		progress = start_progress_delay("Removing duplicate objects",
 			256, 95, 2);
 
-	if (len > PATH_MAX - 42)
-		die("impossible object directory");
-	memcpy(pathname, dir, len);
-	if (len && pathname[len-1] != '/')
-		pathname[len++] = '/';
+	if (pathname.len && pathname.buf[pathname.len - 1] != '/')
+		strbuf_addch(&pathname, '/');
+
+	top_len = pathname.len;
 	for (i = 0; i < 256; i++) {
 		DIR *d;
 
 		display_progress(progress, i + 1);
-		sprintf(pathname + len, "%02x/", i);
-		d = opendir(pathname);
+		strbuf_setlen(&pathname, top_len);
+		strbuf_addf(&pathname, "%02x/", i);
+		d = opendir(pathname.buf);
 		if (!d)
 			continue;
-		prune_dir(i, d, pathname, len + 3, opts);
+		prune_dir(i, d, &pathname, opts);
 		closedir(d);
-		pathname[len + 2] = '\0';
-		rmdir(pathname);
+		strbuf_setlen(&pathname, top_len + 2);
+		rmdir(pathname.buf);
 	}
 	stop_progress(&progress);
 }

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/3] prune_object_dir(): verify that path fits in the temporary buffer
  2013-12-17 13:43 ` [PATCH 2/3] prune_object_dir(): verify that path fits in the temporary buffer Michael Haggerty
@ 2013-12-17 18:51   ` Junio C Hamano
  2013-12-17 23:22     ` Jeff King
  2013-12-17 18:56   ` Antoine Pelisse
  1 sibling, 1 reply; 21+ messages in thread
From: Junio C Hamano @ 2013-12-17 18:51 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: git

Michael Haggerty <mhagger@alum.mit.edu> writes:

> Dimension the buffer based on PATH_MAX rather than a magic number, and
> verify that the path fits in it before continuing.
>
> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
> ---
>
> I don't think that this problem is remotely exploitable, because the
> size of the string doesn't depend on inputs that can be influenced by
> a client (at least not within Git).

This is shrinking the buffer on some platforms where PATH_MAX is
lower than 4k---granted, we will die() with the new check instead of
crashing uncontrolled, but it still feels somewhat wrong.

>  builtin/prune.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/builtin/prune.c b/builtin/prune.c
> index 6366917..ae34d04 100644
> --- a/builtin/prune.c
> +++ b/builtin/prune.c
> @@ -96,7 +96,9 @@ static void prune_object_dir(const char *path)
>  {
>  	int i;
>  	for (i = 0; i < 256; i++) {
> -		static char dir[4096];
> +		static char dir[PATH_MAX + 1];
> +		if (strlen(path) + 3 > PATH_MAX)
> +			die("impossible object directory");
>  		sprintf(dir, "%s/%02x", path, i);
>  		prune_dir(i, dir);
>  	}

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/3] prune_object_dir(): verify that path fits in the temporary buffer
  2013-12-17 13:43 ` [PATCH 2/3] prune_object_dir(): verify that path fits in the temporary buffer Michael Haggerty
  2013-12-17 18:51   ` Junio C Hamano
@ 2013-12-17 18:56   ` Antoine Pelisse
  1 sibling, 0 replies; 21+ messages in thread
From: Antoine Pelisse @ 2013-12-17 18:56 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Junio C Hamano, git

On Tue, Dec 17, 2013 at 2:43 PM, Michael Haggerty <mhagger@alum.mit.edu> wrote:
> Dimension the buffer based on PATH_MAX rather than a magic number, and
> verify that the path fits in it before continuing.
>
> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
> ---
>
> I don't think that this problem is remotely exploitable, because the
> size of the string doesn't depend on inputs that can be influenced by
> a client (at least not within Git).
>
>  builtin/prune.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/builtin/prune.c b/builtin/prune.c
> index 6366917..ae34d04 100644
> --- a/builtin/prune.c
> +++ b/builtin/prune.c
> @@ -96,7 +96,9 @@ static void prune_object_dir(const char *path)
>  {
>         int i;
>         for (i = 0; i < 256; i++) {
> -               static char dir[4096];
> +               static char dir[PATH_MAX + 1];
> +               if (strlen(path) + 3 > PATH_MAX)
> +                       die("impossible object directory");
>                 sprintf(dir, "%s/%02x", path, i);
>                 prune_dir(i, dir);
>         }
> --
> 1.8.5.1

Obviously correct,

Thanks,
Antoine,

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/3] prune_object_dir(): verify that path fits in the temporary buffer
  2013-12-17 18:51   ` Junio C Hamano
@ 2013-12-17 23:22     ` Jeff King
  2013-12-18 19:35       ` Junio C Hamano
  0 siblings, 1 reply; 21+ messages in thread
From: Jeff King @ 2013-12-17 23:22 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Michael Haggerty, git

On Tue, Dec 17, 2013 at 10:51:30AM -0800, Junio C Hamano wrote:

> Michael Haggerty <mhagger@alum.mit.edu> writes:
> 
> > Dimension the buffer based on PATH_MAX rather than a magic number, and
> > verify that the path fits in it before continuing.
> >
> > Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
> > ---
> >
> > I don't think that this problem is remotely exploitable, because the
> > size of the string doesn't depend on inputs that can be influenced by
> > a client (at least not within Git).
> 
> This is shrinking the buffer on some platforms where PATH_MAX is
> lower than 4k---granted, we will die() with the new check instead of
> crashing uncontrolled, but it still feels somewhat wrong.

On such a system, though, does the resulting prune_dir call actually do
anything? We will feed the result to opendir(), which I would think
would choke on the long name.

Still, it is probably better to do everything we can and let the OS
choke (especially because we would want to continue operating on other
paths in this case).

Converting it to use strbuf looks like it will actually let us drop a
bunch of copying, too, as we just end up in mkpath at the very lowest
level. I.e., something like below.

As an aside, I have noticed us using this "push/pop" approach to treating a
strbuf as a stack of paths elsewhere, too. I.e:

  size_t baselen = base->len;
  strbuf_addf(base, "/%s", some_thing);
  some_call(base);
  base->len = baselen;

I wondered if there was any kind of helper we could add to make it look
nicer. But I don't think so; the hairy part is that you must remember to
reset base->len after the call, and there is no easy way around that in
C. If you had object destructors that ran as the stack unwound, or
dynamic scoping, it would be easy to manipulate the object. Wrapping
won't work because strbuf isn't just a length wrapping an immutable
buffer; it actually has to move the NUL in the buffer.

Anyway, not important, but perhaps somebody is more clever than I am.

diff --git a/builtin/prune.c b/builtin/prune.c
index 6366917..4ca8ec1 100644
--- a/builtin/prune.c
+++ b/builtin/prune.c
@@ -17,9 +17,8 @@ static int verbose;
 static unsigned long expire;
 static int show_progress = -1;
 
-static int prune_tmp_object(const char *path, const char *filename)
+static int prune_tmp_object(const char *fullpath)
 {
-	const char *fullpath = mkpath("%s/%s", path, filename);
 	struct stat st;
 	if (lstat(fullpath, &st))
 		return error("Could not stat '%s'", fullpath);
@@ -32,9 +31,8 @@ static int prune_tmp_object(const char *path, const char *filename)
 	return 0;
 }
 
-static int prune_object(char *path, const char *filename, const unsigned char *sha1)
+static int prune_object(const char *fullpath, const unsigned char *sha1)
 {
-	const char *fullpath = mkpath("%s/%s", path, filename);
 	struct stat st;
 	if (lstat(fullpath, &st))
 		return error("Could not stat '%s'", fullpath);
@@ -50,9 +48,10 @@ static int prune_object(char *path, const char *filename, const unsigned char *s
 	return 0;
 }
 
-static int prune_dir(int i, char *path)
+static int prune_dir(int i, struct strbuf *path)
 {
-	DIR *dir = opendir(path);
+	size_t baselen = path->len;
+	DIR *dir = opendir(path->buf);
 	struct dirent *de;
 
 	if (!dir)
@@ -77,28 +76,39 @@ static int prune_dir(int i, char *path)
 			if (lookup_object(sha1))
 				continue;
 
-			prune_object(path, de->d_name, sha1);
+			strbuf_addf(path, "/%s", de->d_name);
+			prune_object(path->buf, sha1);
+			path->len = baselen;
 			continue;
 		}
 		if (!prefixcmp(de->d_name, "tmp_obj_")) {
-			prune_tmp_object(path, de->d_name);
+			strbuf_addf(path, "/%s", de->d_name);
+			prune_tmp_object(path->buf);
+			path->len = baselen;
 			continue;
 		}
-		fprintf(stderr, "bad sha1 file: %s/%s\n", path, de->d_name);
+		fprintf(stderr, "bad sha1 file: %s/%s\n", path->buf, de->d_name);
 	}
 	closedir(dir);
 	if (!show_only)
-		rmdir(path);
+		rmdir(path->buf);
 	return 0;
 }
 
 static void prune_object_dir(const char *path)
 {
+	struct strbuf buf = STRBUF_INIT;
+	size_t baselen;
 	int i;
+
+	strbuf_addstr(&buf, path);
+	strbuf_addch(&buf, '/');
+	baselen = buf.len;
+
 	for (i = 0; i < 256; i++) {
-		static char dir[4096];
-		sprintf(dir, "%s/%02x", path, i);
-		prune_dir(i, dir);
+		strbuf_addf(&buf, "%02x", i);
+		prune_dir(i, &buf);
+		buf.len = baselen;
 	}
 }
 
@@ -120,7 +130,7 @@ static void remove_temporary_files(const char *path)
 	}
 	while ((de = readdir(dir)) != NULL)
 		if (!prefixcmp(de->d_name, "tmp_"))
-			prune_tmp_object(path, de->d_name);
+			prune_tmp_object(mkpath("%s/%s", path, de->d_name));
 	closedir(dir);
 }
 

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/3] prune_object_dir(): verify that path fits in the temporary buffer
  2013-12-17 23:22     ` Jeff King
@ 2013-12-18 19:35       ` Junio C Hamano
  2013-12-18 20:00         ` Jeff King
  0 siblings, 1 reply; 21+ messages in thread
From: Junio C Hamano @ 2013-12-18 19:35 UTC (permalink / raw)
  To: Jeff King; +Cc: Michael Haggerty, git

Jeff King <peff@peff.net> writes:

> Converting it to use strbuf looks like it will actually let us drop a
> bunch of copying, too, as we just end up in mkpath at the very lowest
> level. I.e., something like below.

Thanks; I may have a few minor comments, but overall, I like the
placement of mkpath() in the resulting callchain a lot better than
the original.

> As an aside, I have noticed us using this "push/pop" approach to treating a
> strbuf as a stack of paths elsewhere, too. I.e:
>
>   size_t baselen = base->len;
>   strbuf_addf(base, "/%s", some_thing);
>   some_call(base);
>   base->len = baselen;
>
> I wondered if there was any kind of helper we could add to make it look
> nicer. But I don't think so; the hairy part is that you must remember to
> reset base->len after the call, and there is no easy way around that in
> C. If you had object destructors that ran as the stack unwound, or
> dynamic scoping, it would be easy to manipulate the object. Wrapping
> won't work because strbuf isn't just a length wrapping an immutable
> buffer; it actually has to move the NUL in the buffer.
>
> Anyway, not important, but perhaps somebody is more clever than I am.

Hmph... interesting but we would need a lot more thought than the
time necessary to respond to one piece of e-mail for this ;-)
Perhaps later...

> diff --git a/builtin/prune.c b/builtin/prune.c
> index 6366917..4ca8ec1 100644
> --- a/builtin/prune.c
> +++ b/builtin/prune.c
> @@ -17,9 +17,8 @@ static int verbose;
>  static unsigned long expire;
>  static int show_progress = -1;
>  
> -static int prune_tmp_object(const char *path, const char *filename)
> +static int prune_tmp_object(const char *fullpath)
>  {
> -	const char *fullpath = mkpath("%s/%s", path, filename);
>  	struct stat st;
>  	if (lstat(fullpath, &st))
>  		return error("Could not stat '%s'", fullpath);

This function is called to remove

 * Any tmp_* found directly in .git/objects/
 * Any tmp_* found directly in .git/objects/pack/
 * Any tmp_obj_* found directly in .git/objects/??/

and shares the same expiration logic with prune_object().  The only
difference from the other function is what the file is called in
dry-run or verbose report ("stale temporary file" vs "<sha-1> <typename>").

We may want to rename it to prune_tmp_file(); its usage may have
been limited to an unborn loose object file at some point in the
history, but it does not look that way in today's code.

> -static int prune_dir(int i, char *path)
> +static int prune_dir(int i, struct strbuf *path)
>  {
> -	DIR *dir = opendir(path);
> +	size_t baselen = path->len;
> +	DIR *dir = opendir(path->buf);
>  	struct dirent *de;
>  
>  	if (!dir)
> @@ -77,28 +76,39 @@ static int prune_dir(int i, char *path)
>  			if (lookup_object(sha1))
>  				continue;
>  
> -			prune_object(path, de->d_name, sha1);
> +			strbuf_addf(path, "/%s", de->d_name);
> +			prune_object(path->buf, sha1);
> +			path->len = baselen;

This is minor, but I prefer using strbuf_setlen() for this.

Thanks.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/3] prune_object_dir(): verify that path fits in the temporary buffer
  2013-12-18 19:35       ` Junio C Hamano
@ 2013-12-18 20:00         ` Jeff King
  2013-12-18 20:07           ` Junio C Hamano
  0 siblings, 1 reply; 21+ messages in thread
From: Jeff King @ 2013-12-18 20:00 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Michael Haggerty, git

On Wed, Dec 18, 2013 at 11:35:34AM -0800, Junio C Hamano wrote:

> This function is called to remove
> 
>  * Any tmp_* found directly in .git/objects/
>  * Any tmp_* found directly in .git/objects/pack/
>  * Any tmp_obj_* found directly in .git/objects/??/
> 
> and shares the same expiration logic with prune_object().  The only
> difference from the other function is what the file is called in
> dry-run or verbose report ("stale temporary file" vs "<sha-1> <typename>").
> 
> We may want to rename it to prune_tmp_file(); its usage may have
> been limited to an unborn loose object file at some point in the
> history, but it does not look that way in today's code.

Yes, certainly the rename makes sense (and I think the history is as you
mentioned). I noticed the similarity between the two, as well. It might
be simple to refactor into a single function.

> > -			prune_object(path, de->d_name, sha1);
> > +			strbuf_addf(path, "/%s", de->d_name);
> > +			prune_object(path->buf, sha1);
> > +			path->len = baselen;
> 
> This is minor, but I prefer using strbuf_setlen() for this.

Good catch. I do not think it is minor at all; my version is buggy.
After the loop ends, path->len does not match the NUL in path->buf. That
is OK if the next caller is strbuf-aware, but if it were to pass
path->buf straight to a system call, that would be rather...confusing.

-Peff

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/3] prune_object_dir(): verify that path fits in the temporary buffer
  2013-12-18 20:00         ` Jeff King
@ 2013-12-18 20:07           ` Junio C Hamano
  2013-12-18 20:11             ` Jeff King
  0 siblings, 1 reply; 21+ messages in thread
From: Junio C Hamano @ 2013-12-18 20:07 UTC (permalink / raw)
  To: Jeff King; +Cc: Michael Haggerty, git

Jeff King <peff@peff.net> writes:

>> > +			prune_object(path->buf, sha1);
>> > +			path->len = baselen;
>> 
>> This is minor, but I prefer using strbuf_setlen() for this.
>
> Good catch. I do not think it is minor at all; my version is buggy.
> After the loop ends, path->len does not match the NUL in path->buf. That
> is OK if the next caller is strbuf-aware, but if it were to pass
> path->buf straight to a system call, that would be rather...confusing.

Hmph, rmdir(path->buf) at the end of prune_dir() may have that exact
issue.

Will squash in the following.

 builtin/prune.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/builtin/prune.c b/builtin/prune.c
index 4ca8ec1..99f3f35 100644
--- a/builtin/prune.c
+++ b/builtin/prune.c
@@ -17,7 +17,7 @@ static int verbose;
 static unsigned long expire;
 static int show_progress = -1;
 
-static int prune_tmp_object(const char *fullpath)
+static int prune_tmp_file(const char *fullpath)
 {
 	struct stat st;
 	if (lstat(fullpath, &st))
@@ -78,13 +78,13 @@ static int prune_dir(int i, struct strbuf *path)
 
 			strbuf_addf(path, "/%s", de->d_name);
 			prune_object(path->buf, sha1);
-			path->len = baselen;
+			strbuf_setlen(&path, baselen);
 			continue;
 		}
 		if (!prefixcmp(de->d_name, "tmp_obj_")) {
 			strbuf_addf(path, "/%s", de->d_name);
-			prune_tmp_object(path->buf);
-			path->len = baselen;
+			prune_tmp_file(path->buf);
+			strbuf_setlen(&path, baselen);
 			continue;
 		}
 		fprintf(stderr, "bad sha1 file: %s/%s\n", path->buf, de->d_name);
@@ -108,7 +108,7 @@ static void prune_object_dir(const char *path)
 	for (i = 0; i < 256; i++) {
 		strbuf_addf(&buf, "%02x", i);
 		prune_dir(i, &buf);
-		buf.len = baselen;
+		strbuf_setlen(&buf, baselen);
 	}
 }
 
@@ -130,7 +130,7 @@ static void remove_temporary_files(const char *path)
 	}
 	while ((de = readdir(dir)) != NULL)
 		if (!prefixcmp(de->d_name, "tmp_"))
-			prune_tmp_object(mkpath("%s/%s", path, de->d_name));
+			prune_tmp_file(mkpath("%s/%s", path, de->d_name));
 	closedir(dir);
 }
 

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/3] prune_object_dir(): verify that path fits in the temporary buffer
  2013-12-18 20:07           ` Junio C Hamano
@ 2013-12-18 20:11             ` Jeff King
  2013-12-18 20:15               ` Junio C Hamano
  0 siblings, 1 reply; 21+ messages in thread
From: Jeff King @ 2013-12-18 20:11 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Michael Haggerty, git

On Wed, Dec 18, 2013 at 12:07:02PM -0800, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> >> > +			prune_object(path->buf, sha1);
> >> > +			path->len = baselen;
> >> 
> >> This is minor, but I prefer using strbuf_setlen() for this.
> >
> > Good catch. I do not think it is minor at all; my version is buggy.
> > After the loop ends, path->len does not match the NUL in path->buf. That
> > is OK if the next caller is strbuf-aware, but if it were to pass
> > path->buf straight to a system call, that would be rather...confusing.
> 
> Hmph, rmdir(path->buf) at the end of prune_dir() may have that exact
> issue.
> 
> Will squash in the following.

Thanks. Are you picking this up with a commit message, or did you want
me to re-send with the usual message/signoff?

-Peff

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/3] prune_object_dir(): verify that path fits in the temporary buffer
  2013-12-18 20:11             ` Jeff King
@ 2013-12-18 20:15               ` Junio C Hamano
  2013-12-18 20:27                 ` Jeff King
  0 siblings, 1 reply; 21+ messages in thread
From: Junio C Hamano @ 2013-12-18 20:15 UTC (permalink / raw)
  To: Jeff King; +Cc: Michael Haggerty, git

Jeff King <peff@peff.net> writes:

> On Wed, Dec 18, 2013 at 12:07:02PM -0800, Junio C Hamano wrote:
>
>> Jeff King <peff@peff.net> writes:
>> 
>> >> > +			prune_object(path->buf, sha1);
>> >> > +			path->len = baselen;
>> >> 
>> >> This is minor, but I prefer using strbuf_setlen() for this.
>> >
>> > Good catch. I do not think it is minor at all; my version is buggy.
>> > After the loop ends, path->len does not match the NUL in path->buf. That
>> > is OK if the next caller is strbuf-aware, but if it were to pass
>> > path->buf straight to a system call, that would be rather...confusing.
>> 
>> Hmph, rmdir(path->buf) at the end of prune_dir() may have that exact
>> issue.
>> 
>> Will squash in the following.
>
> Thanks. Are you picking this up with a commit message, or did you want
> me to re-send with the usual message/signoff?

I think this should be sufficient ;-)

-- >8 --
From: Jeff King <peff@peff.net>
Date: Tue, 17 Dec 2013 18:22:31 -0500
Subject: [PATCH] builtin/prune.c: use strbuf to avoid having to worry about PATH_MAX

While at it, rename prune_tmp_object(), which used to be a helper to
remove temporary files that were created to become loose object
files, to prune_tmp_file(), as the function is also used to remove
any random cruft whose name begins with tmp_ directly in .git/object
or .git/object/pack directories these days.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/prune.c | 38 ++++++++++++++++++++++++--------------
 1 file changed, 24 insertions(+), 14 deletions(-)

diff --git a/builtin/prune.c b/builtin/prune.c
index 6366917..99f3f35 100644
--- a/builtin/prune.c
+++ b/builtin/prune.c
@@ -17,9 +17,8 @@ static int verbose;
 static unsigned long expire;
 static int show_progress = -1;
 
-static int prune_tmp_object(const char *path, const char *filename)
+static int prune_tmp_file(const char *fullpath)
 {
-	const char *fullpath = mkpath("%s/%s", path, filename);
 	struct stat st;
 	if (lstat(fullpath, &st))
 		return error("Could not stat '%s'", fullpath);
@@ -32,9 +31,8 @@ static int prune_tmp_object(const char *path, const char *filename)
 	return 0;
 }
 
-static int prune_object(char *path, const char *filename, const unsigned char *sha1)
+static int prune_object(const char *fullpath, const unsigned char *sha1)
 {
-	const char *fullpath = mkpath("%s/%s", path, filename);
 	struct stat st;
 	if (lstat(fullpath, &st))
 		return error("Could not stat '%s'", fullpath);
@@ -50,9 +48,10 @@ static int prune_object(char *path, const char *filename, const unsigned char *s
 	return 0;
 }
 
-static int prune_dir(int i, char *path)
+static int prune_dir(int i, struct strbuf *path)
 {
-	DIR *dir = opendir(path);
+	size_t baselen = path->len;
+	DIR *dir = opendir(path->buf);
 	struct dirent *de;
 
 	if (!dir)
@@ -77,28 +76,39 @@ static int prune_dir(int i, char *path)
 			if (lookup_object(sha1))
 				continue;
 
-			prune_object(path, de->d_name, sha1);
+			strbuf_addf(path, "/%s", de->d_name);
+			prune_object(path->buf, sha1);
+			strbuf_setlen(&path, baselen);
 			continue;
 		}
 		if (!prefixcmp(de->d_name, "tmp_obj_")) {
-			prune_tmp_object(path, de->d_name);
+			strbuf_addf(path, "/%s", de->d_name);
+			prune_tmp_file(path->buf);
+			strbuf_setlen(&path, baselen);
 			continue;
 		}
-		fprintf(stderr, "bad sha1 file: %s/%s\n", path, de->d_name);
+		fprintf(stderr, "bad sha1 file: %s/%s\n", path->buf, de->d_name);
 	}
 	closedir(dir);
 	if (!show_only)
-		rmdir(path);
+		rmdir(path->buf);
 	return 0;
 }
 
 static void prune_object_dir(const char *path)
 {
+	struct strbuf buf = STRBUF_INIT;
+	size_t baselen;
 	int i;
+
+	strbuf_addstr(&buf, path);
+	strbuf_addch(&buf, '/');
+	baselen = buf.len;
+
 	for (i = 0; i < 256; i++) {
-		static char dir[4096];
-		sprintf(dir, "%s/%02x", path, i);
-		prune_dir(i, dir);
+		strbuf_addf(&buf, "%02x", i);
+		prune_dir(i, &buf);
+		strbuf_setlen(&buf, baselen);
 	}
 }
 
@@ -120,7 +130,7 @@ static void remove_temporary_files(const char *path)
 	}
 	while ((de = readdir(dir)) != NULL)
 		if (!prefixcmp(de->d_name, "tmp_"))
-			prune_tmp_object(path, de->d_name);
+			prune_tmp_file(mkpath("%s/%s", path, de->d_name));
 	closedir(dir);
 }
 
-- 
1.8.5.2-297-g3e57c29

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/3] prune_object_dir(): verify that path fits in the temporary buffer
  2013-12-18 20:15               ` Junio C Hamano
@ 2013-12-18 20:27                 ` Jeff King
  0 siblings, 0 replies; 21+ messages in thread
From: Jeff King @ 2013-12-18 20:27 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Michael Haggerty, git

On Wed, Dec 18, 2013 at 12:15:19PM -0800, Junio C Hamano wrote:

> > Thanks. Are you picking this up with a commit message, or did you want
> > me to re-send with the usual message/signoff?
> 
> I think this should be sufficient ;-)
> 
> -- >8 --
> From: Jeff King <peff@peff.net>
> Date: Tue, 17 Dec 2013 18:22:31 -0500
> Subject: [PATCH] builtin/prune.c: use strbuf to avoid having to worry about PATH_MAX
> 
> While at it, rename prune_tmp_object(), which used to be a helper to
> remove temporary files that were created to become loose object
> files, to prune_tmp_file(), as the function is also used to remove
> any random cruft whose name begins with tmp_ directly in .git/object
> or .git/object/pack directories these days.
> 
> Signed-off-by: Jeff King <peff@peff.net>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>

Great, thanks. You might also want to stick a:

 Noticed-by: Michael Haggerty <mhagger@alum.mit.edu>

in there.

-Peff

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/3] prune-packed: fix a possible buffer overflow
  2013-12-17 18:43     ` Junio C Hamano
@ 2013-12-18 22:44       ` Michael Haggerty
  2013-12-19  0:04         ` Jeff King
  2013-12-19  0:37       ` Duy Nguyen
  1 sibling, 1 reply; 21+ messages in thread
From: Michael Haggerty @ 2013-12-18 22:44 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Duy Nguyen, Git Mailing List, Jeff King, Antoine Pelisse

On 12/17/2013 07:43 PM, Junio C Hamano wrote:
> Duy Nguyen <pclouds@gmail.com> writes:
> 
>> Why don't we take this opportunity to replace that array with a
>> strbuf? The conversion looks simple with this function.
> 
> Indeed.  Something like this, perhaps?
> [...]

Frankly, with my initial patches I was just trying to paper over the bug
with the smallest possible change.  It's nice that people are attempting
bigger improvements.

I went in a slightly different direction: I am spiking out an API for
iterating over loose object files.  It would be useful in a couple of
places.

[While doing so, I got sidetracked by the question: what happens if a
prune process deletes the "objects/XX" directory just the same moment
that another process is trying to write an object into that directory?
I think the relevant function is sha1_file.c:create_tmpfile().  It looks
like there is a nonzero but very small race window that could result in
a spurious "unable to create temporary file" error, but even then I
don't think there would be any corruption or anything.]

But don't let me stop you; the cleanups you are working on are
definitely nice and are complementary to my ideas.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/3] prune-packed: fix a possible buffer overflow
  2013-12-18 22:44       ` Michael Haggerty
@ 2013-12-19  0:04         ` Jeff King
  2013-12-19 16:33           ` Michael Haggerty
  0 siblings, 1 reply; 21+ messages in thread
From: Jeff King @ 2013-12-19  0:04 UTC (permalink / raw)
  To: Michael Haggerty
  Cc: Junio C Hamano, Duy Nguyen, Git Mailing List, Antoine Pelisse

On Wed, Dec 18, 2013 at 11:44:58PM +0100, Michael Haggerty wrote:

> [While doing so, I got sidetracked by the question: what happens if a
> prune process deletes the "objects/XX" directory just the same moment
> that another process is trying to write an object into that directory?
> I think the relevant function is sha1_file.c:create_tmpfile().  It looks
> like there is a nonzero but very small race window that could result in
> a spurious "unable to create temporary file" error, but even then I
> don't think there would be any corruption or anything.]

There's a race there, but I think it's hard to trigger.

Our strategy with object creation is to call open, recognize ENOENT,
mkdir, and then try again. If the rmdir happens before our call to open,
then we're fine. If open happens first, then the rmdir will fail.

But we don't loop on ENOENT. So if the rmdir happens in the middle,
after the mkdir but before we call open again, we'd fail, because we
don't treat ENOENT specially in the second call to open. That is
unlikely to happen, though, as prune would not be removing a directory
it did not just enter and clean up an object from (in which case we
would not have gotten the first ENOENT in the creator). I think you'd So
you'd have to have something creating and then pruning the directory in
the time between our open and mkdir. It would probably be more likely to
see it if you had two prunes running (the first one kills the directory,
creator notices and calls mkdir, then the second prune kills the
directory, too).

So it seems unlikely and the worst case is a temporary failure, not a
corruption. It's probably not worth caring too much about, but we could
solve it pretty easily by looping on ENOENT on creation.

On a similar note, I imagine that a simultaneous "branch foo/bar" and
"branch -d foo/baz" could race over the creation/deletion of
"refs/heads/foo", but I didn't look into it.

-Peff

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/3] prune-packed: fix a possible buffer overflow
  2013-12-17 18:43     ` Junio C Hamano
  2013-12-18 22:44       ` Michael Haggerty
@ 2013-12-19  0:37       ` Duy Nguyen
  1 sibling, 0 replies; 21+ messages in thread
From: Duy Nguyen @ 2013-12-19  0:37 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Michael Haggerty, Git Mailing List

On Wed, Dec 18, 2013 at 1:43 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Duy Nguyen <pclouds@gmail.com> writes:
>
>> Why don't we take this opportunity to replace that array with a
>> strbuf? The conversion looks simple with this function.
>
> Indeed.  Something like this, perhaps?

Yes, looking good.

>  void prune_packed_objects(int opts)
>  {
>         int i;
> -       static char pathname[PATH_MAX];
>         const char *dir = get_object_directory();
> -       int len = strlen(dir);
> +       struct strbuf pathname = STRBUF_INIT;
> +       int top_len;
>
> +       strbuf_addstr(&pathname, dir);
>         if (opts & PRUNE_PACKED_VERBOSE)
>                 progress = start_progress_delay("Removing duplicate objects",
>                         256, 95, 2);
>
> -       if (len > PATH_MAX - 42)
> -               die("impossible object directory");
> -       memcpy(pathname, dir, len);
> -       if (len && pathname[len-1] != '/')
> -               pathname[len++] = '/';
> +       if (pathname.len && pathname.buf[pathname.len - 1] != '/')
> +               strbuf_addch(&pathname, '/');

I see this pattern (add a trailing slash) in a few places too. Maybe
we could make a wrapper for it.
-- 
Duy

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/3] prune-packed: fix a possible buffer overflow
  2013-12-19  0:04         ` Jeff King
@ 2013-12-19 16:33           ` Michael Haggerty
  2013-12-20  9:30             ` Jeff King
  0 siblings, 1 reply; 21+ messages in thread
From: Michael Haggerty @ 2013-12-19 16:33 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Duy Nguyen, Git Mailing List, Antoine Pelisse

On 12/19/2013 01:04 AM, Jeff King wrote:
> On Wed, Dec 18, 2013 at 11:44:58PM +0100, Michael Haggerty wrote:
> 
>> [While doing so, I got sidetracked by the question: what happens if a
>> prune process deletes the "objects/XX" directory just the same moment
>> that another process is trying to write an object into that directory?
>> I think the relevant function is sha1_file.c:create_tmpfile().  It looks
>> like there is a nonzero but very small race window that could result in
>> a spurious "unable to create temporary file" error, but even then I
>> don't think there would be any corruption or anything.]
> 
> There's a race there, but I think it's hard to trigger.
> 
> Our strategy with object creation is to call open, recognize ENOENT,
> mkdir, and then try again. If the rmdir happens before our call to open,
> then we're fine. If open happens first, then the rmdir will fail.
> 
> But we don't loop on ENOENT. So if the rmdir happens in the middle,
> after the mkdir but before we call open again, we'd fail, because we
> don't treat ENOENT specially in the second call to open. That is
> unlikely to happen, though, as prune would not be removing a directory
> it did not just enter and clean up an object from (in which case we
> would not have gotten the first ENOENT in the creator). [...]

The way I read it, prune tries to delete the directory whether or not
there were any files in it.  So the race could be triggered by a single
writer that wants to write an object to a not-yet-existent shard
directory and a single prune process that encounters the directory
between when it is created and when the object file is added.

But that doesn't mean I disagree with your conclusion:

> So it seems unlikely and the worst case is a temporary failure, not a
> corruption. It's probably not worth caring too much about, but we could
> solve it pretty easily by looping on ENOENT on creation.

Regarding references:

> On a similar note, I imagine that a simultaneous "branch foo/bar" and
> "branch -d foo/baz" could race over the creation/deletion of
> "refs/heads/foo", but I didn't look into it.

Deleting a loose reference doesn't cause the directory containing it to
be deleted.  The directory is only deleted by pack-refs (and then only
when a reference in the directory was just packed) or when there is an
attempt to create a new reference that conflicts with the directory.  So
the question is whether the creation of a loose ref file is robust
against the disappearance of a directory that it just created.

And the answer is "no".  It looks like there are a bunch of places where
similar races occur involving references.  And probably many others
elsewhere in the code.  (Any caller of safe_create_leading_directories()
is a candidate problem point, and in fact that function itself has an
internal race.)  I've started fixing some of these but it might take a
while.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/3] prune-packed: fix a possible buffer overflow
  2013-12-19 16:33           ` Michael Haggerty
@ 2013-12-20  9:30             ` Jeff King
  0 siblings, 0 replies; 21+ messages in thread
From: Jeff King @ 2013-12-20  9:30 UTC (permalink / raw)
  To: Michael Haggerty
  Cc: Junio C Hamano, Duy Nguyen, Git Mailing List, Antoine Pelisse

On Thu, Dec 19, 2013 at 05:33:55PM +0100, Michael Haggerty wrote:

> > But we don't loop on ENOENT. So if the rmdir happens in the middle,
> > after the mkdir but before we call open again, we'd fail, because we
> > don't treat ENOENT specially in the second call to open. That is
> > unlikely to happen, though, as prune would not be removing a directory
> > it did not just enter and clean up an object from (in which case we
> > would not have gotten the first ENOENT in the creator). [...]
> 
> The way I read it, prune tries to delete the directory whether or not
> there were any files in it.  So the race could be triggered by a single
> writer that wants to write an object to a not-yet-existent shard
> directory and a single prune process that encounters the directory
> between when it is created and when the object file is added.

Yes, that's true. It does make the race slightly more difficult than a
straight deletion because the prune has to catch it in the moment where
it exists but does not yet have an object. But it's still possible.

> But that doesn't mean I disagree with your conclusion:

I think we're in violent agreement at this point. :)

> Regarding references:
> 
> > On a similar note, I imagine that a simultaneous "branch foo/bar" and
> > "branch -d foo/baz" could race over the creation/deletion of
> > "refs/heads/foo", but I didn't look into it.
> 
> Deleting a loose reference doesn't cause the directory containing it to
> be deleted.  The directory is only deleted by pack-refs (and then only
> when a reference in the directory was just packed) or when there is an
> attempt to create a new reference that conflicts with the directory.  So
> the question is whether the creation of a loose ref file is robust
> against the disappearance of a directory that it just created.

Ah, right, I forgot we leave the directories sitting around after
deletion. So we may run into a collision with another creator, but by
definition we would have a D/F conflict with such a creator anyway, so
we cannot both succeed.

But we can hit the problem with pack-refs, as you note:

> And the answer is "no".  It looks like there are a bunch of places where
> similar races occur involving references.  And probably many others
> elsewhere in the code.  (Any caller of safe_create_leading_directories()
> is a candidate problem point, and in fact that function itself has an
> internal race.)  I've started fixing some of these but it might take a
> while.

Yeah, I think you'd have to teach safe_create_leading_directories to
atomically try-to-create-and-check-errno rather than stat+mkdir. And
then teach it to backtrack when an expected leading path goes missing
after we created it (so mkdir("foo"), then mkdir("foo/bar"), then step
back to mkdir("foo") if we got ENOENT).

I don't think the races are a big deal, though. As with the prune case,
we will ultimately fail to create the lockfile and get a temporary
failure rather than a corruption. So unless we actually have reports of
it happening (and I have seen none), it's probably not worth spending
much time on.

-Peff

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2013-12-20  9:31 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-12-17 13:43 [PATCH 0/3] Fix two buffer overflows and remove a redundant var Michael Haggerty
2013-12-17 13:43 ` [PATCH 1/3] prune-packed: fix a possible buffer overflow Michael Haggerty
2013-12-17 13:57   ` Duy Nguyen
2013-12-17 18:43     ` Junio C Hamano
2013-12-18 22:44       ` Michael Haggerty
2013-12-19  0:04         ` Jeff King
2013-12-19 16:33           ` Michael Haggerty
2013-12-20  9:30             ` Jeff King
2013-12-19  0:37       ` Duy Nguyen
2013-12-17 13:43 ` [PATCH 2/3] prune_object_dir(): verify that path fits in the temporary buffer Michael Haggerty
2013-12-17 18:51   ` Junio C Hamano
2013-12-17 23:22     ` Jeff King
2013-12-18 19:35       ` Junio C Hamano
2013-12-18 20:00         ` Jeff King
2013-12-18 20:07           ` Junio C Hamano
2013-12-18 20:11             ` Jeff King
2013-12-18 20:15               ` Junio C Hamano
2013-12-18 20:27                 ` Jeff King
2013-12-17 18:56   ` Antoine Pelisse
2013-12-17 13:43 ` [PATCH 3/3] cmd_repack(): remove redundant local variable "nr_packs" Michael Haggerty
2013-12-17 13:46   ` Stefan Beller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.