All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] getcwd without PATH_MAX
@ 2014-07-20 11:18 René Scharfe
  2014-07-20 11:21 ` [PATCH 1/3] strbuf: add strbuf_add_cwd() René Scharfe
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: René Scharfe @ 2014-07-20 11:18 UTC (permalink / raw)
  To: Git Mailing List
  Cc: Karsten Blees, Nguyễn Thái Ngọc Duy, Junio C Hamano

Paths longer than PATH_MAX can be created and used on at least on some
file systems.  Currently we use getcwd() generally with a PATH_MAX-
sized buffer.  This short series adds two functions, strbuf_add_cwd()
and xgetcwd(), then uses them to reduce the number of fixed-sized
buffers and to allow us to handle longer working directory paths.

René Scharfe (3):
  strbuf: add strbuf_add_cwd()
  wrapper: add xgetcwd()
  use xgetcwd() get the current directory or die

 Documentation/technical/api-strbuf.txt |  4 ++++
 builtin/init-db.c                      | 17 ++++++++---------
 builtin/rev-parse.c                    |  6 +++---
 dir.c                                  | 12 ++++++++----
 git-compat-util.h                      |  1 +
 strbuf.c                               | 22 ++++++++++++++++++++++
 strbuf.h                               |  1 +
 trace.c                                |  7 ++++---
 wrapper.c                              |  8 ++++++++
 9 files changed, 59 insertions(+), 19 deletions(-)

-- 
2.0.2

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/3] strbuf: add strbuf_add_cwd()
  2014-07-20 11:18 [PATCH 0/3] getcwd without PATH_MAX René Scharfe
@ 2014-07-20 11:21 ` René Scharfe
  2014-07-20 12:33   ` Duy Nguyen
  2014-07-20 11:21 ` [PATCH 2/3] wrapper: add xgetcwd() René Scharfe
  2014-07-20 11:22 ` [PATCH 3/3] use xgetcwd() get the current directory or die René Scharfe
  2 siblings, 1 reply; 10+ messages in thread
From: René Scharfe @ 2014-07-20 11:21 UTC (permalink / raw)
  To: Git Mailing List
  Cc: Karsten Blees, Nguyễn Thái Ngọc Duy, Junio C Hamano

Add strbuf_add_cwd(), which adds the current working directory to a
strbuf.  Because it doesn't use a fixed-size buffer it supports
arbitrarily long paths, as long as the platform's getcwd() does as
well.  At least on Linux and FreeBSD it handles paths longer than
PATH_MAX just fine.

Suggested-by: Karsten Blees <karsten.blees@gmail.com>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
---
 Documentation/technical/api-strbuf.txt |  4 ++++
 strbuf.c                               | 22 ++++++++++++++++++++++
 strbuf.h                               |  1 +
 3 files changed, 27 insertions(+)

diff --git a/Documentation/technical/api-strbuf.txt b/Documentation/technical/api-strbuf.txt
index f9c06a7..b96b78c 100644
--- a/Documentation/technical/api-strbuf.txt
+++ b/Documentation/technical/api-strbuf.txt
@@ -257,6 +257,10 @@ which can be used by the programmer of the callback as she sees fit.
 
 	Add a formatted string to the buffer.
 
+`strbuf_add_cwd`::
+
+	Add the current working directory to the buffer.
+
 `strbuf_commented_addf`::
 
 	Add a formatted string prepended by a comment character and a
diff --git a/strbuf.c b/strbuf.c
index 33018d8..4e44773 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -406,6 +406,28 @@ int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint)
 	return -1;
 }
 
+int strbuf_add_cwd(struct strbuf *sb)
+{
+	size_t oldalloc = sb->alloc;
+	size_t guessed_len = 32;
+
+	for (;; guessed_len *= 2) {
+		char *cwd;
+
+		strbuf_grow(sb, guessed_len);
+		cwd = getcwd(sb->buf + sb->len, sb->alloc - sb->len);
+		if (cwd) {
+			strbuf_setlen(sb, sb->len + strlen(cwd));
+			return 0;
+		}
+		if (errno != ERANGE)
+			break;
+	}
+	if (oldalloc == 0)
+		strbuf_release(sb);
+	return -1;
+}
+
 int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int term)
 {
 	int ch;
diff --git a/strbuf.h b/strbuf.h
index a7c0192..ba95cd6 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -174,6 +174,7 @@ extern size_t strbuf_fread(struct strbuf *, size_t, FILE *);
 extern ssize_t strbuf_read(struct strbuf *, int fd, size_t hint);
 extern int strbuf_read_file(struct strbuf *sb, const char *path, size_t hint);
 extern int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint);
+extern int strbuf_add_cwd(struct strbuf *sb);
 
 extern int strbuf_getwholeline(struct strbuf *, FILE *, int);
 extern int strbuf_getline(struct strbuf *, FILE *, int);
-- 
2.0.2

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/3] wrapper: add xgetcwd()
  2014-07-20 11:18 [PATCH 0/3] getcwd without PATH_MAX René Scharfe
  2014-07-20 11:21 ` [PATCH 1/3] strbuf: add strbuf_add_cwd() René Scharfe
@ 2014-07-20 11:21 ` René Scharfe
  2014-07-20 12:35   ` Duy Nguyen
  2014-07-20 11:22 ` [PATCH 3/3] use xgetcwd() get the current directory or die René Scharfe
  2 siblings, 1 reply; 10+ messages in thread
From: René Scharfe @ 2014-07-20 11:21 UTC (permalink / raw)
  To: Git Mailing List
  Cc: Karsten Blees, Nguyễn Thái Ngọc Duy, Junio C Hamano

Add the helper function xgetcwd(), which returns the current directory
or dies.  The returned string has to be free()d after use.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
---
 git-compat-util.h | 1 +
 wrapper.c         | 8 ++++++++
 2 files changed, 9 insertions(+)

diff --git a/git-compat-util.h b/git-compat-util.h
index 0b53c9a..d64d012 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -605,6 +605,7 @@ extern int xmkstemp(char *template);
 extern int xmkstemp_mode(char *template, int mode);
 extern int odb_mkstemp(char *template, size_t limit, const char *pattern);
 extern int odb_pack_keep(char *name, size_t namesz, const unsigned char *sha1);
+extern char *xgetcwd(void);
 
 static inline size_t xsize_t(off_t len)
 {
diff --git a/wrapper.c b/wrapper.c
index bc1bfb8..e297fa9 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -493,3 +493,11 @@ struct passwd *xgetpwuid_self(void)
 		    errno ? strerror(errno) : _("no such user"));
 	return pw;
 }
+
+char *xgetcwd(void)
+{
+	struct strbuf sb = STRBUF_INIT;
+	if (strbuf_add_cwd(&sb))
+		die_errno("unable to get current working directory");
+	return strbuf_detach(&sb, NULL);
+}
-- 
2.0.2

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 3/3] use xgetcwd() get the current directory or die
  2014-07-20 11:18 [PATCH 0/3] getcwd without PATH_MAX René Scharfe
  2014-07-20 11:21 ` [PATCH 1/3] strbuf: add strbuf_add_cwd() René Scharfe
  2014-07-20 11:21 ` [PATCH 2/3] wrapper: add xgetcwd() René Scharfe
@ 2014-07-20 11:22 ` René Scharfe
  2014-07-20 12:45   ` Duy Nguyen
  2 siblings, 1 reply; 10+ messages in thread
From: René Scharfe @ 2014-07-20 11:22 UTC (permalink / raw)
  To: Git Mailing List
  Cc: Karsten Blees, Nguyễn Thái Ngọc Duy, Junio C Hamano

Convert several calls of getcwd() and die() to use xgetcwd() instead.
This way we get rid of fixed-size buffers (which can be too small
depending on the used file system) and gain consistent error messages.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
---
 builtin/init-db.c   | 17 ++++++++---------
 builtin/rev-parse.c |  6 +++---
 dir.c               | 12 ++++++++----
 trace.c             |  7 ++++---
 4 files changed, 23 insertions(+), 19 deletions(-)

diff --git a/builtin/init-db.c b/builtin/init-db.c
index 56f85e2..f6dd172 100644
--- a/builtin/init-db.c
+++ b/builtin/init-db.c
@@ -426,8 +426,9 @@ int init_db(const char *template_dir, unsigned int flags)
 
 static int guess_repository_type(const char *git_dir)
 {
-	char cwd[PATH_MAX];
 	const char *slash;
+	char *cwd;
+	int cwd_is_git_dir;
 
 	/*
 	 * "GIT_DIR=. git init" is always bare.
@@ -435,9 +436,10 @@ static int guess_repository_type(const char *git_dir)
 	 */
 	if (!strcmp(".", git_dir))
 		return 1;
-	if (!getcwd(cwd, sizeof(cwd)))
-		die_errno(_("cannot tell cwd"));
-	if (!strcmp(git_dir, cwd))
+	cwd = xgetcwd();
+	cwd_is_git_dir = !strcmp(git_dir, cwd);
+	free(cwd);
+	if (cwd_is_git_dir)
 		return 1;
 	/*
 	 * "GIT_DIR=.git or GIT_DIR=something/.git is usually not.
@@ -572,11 +574,8 @@ int cmd_init_db(int argc, const char **argv, const char *prefix)
 			git_work_tree_cfg = xstrdup(real_path(rel));
 			free(rel);
 		}
-		if (!git_work_tree_cfg) {
-			git_work_tree_cfg = xcalloc(PATH_MAX, 1);
-			if (!getcwd(git_work_tree_cfg, PATH_MAX))
-				die_errno (_("Cannot access current working directory"));
-		}
+		if (!git_work_tree_cfg)
+			git_work_tree_cfg = xgetcwd();
 		if (work_tree)
 			set_git_work_tree(real_path(work_tree));
 		else
diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c
index 8102aaa..f6bbac7 100644
--- a/builtin/rev-parse.c
+++ b/builtin/rev-parse.c
@@ -735,7 +735,7 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
 			}
 			if (!strcmp(arg, "--git-dir")) {
 				const char *gitdir = getenv(GIT_DIR_ENVIRONMENT);
-				static char cwd[PATH_MAX];
+				char *cwd;
 				int len;
 				if (gitdir) {
 					puts(gitdir);
@@ -745,10 +745,10 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
 					puts(".git");
 					continue;
 				}
-				if (!getcwd(cwd, PATH_MAX))
-					die_errno("unable to get current working directory");
+				cwd = xgetcwd();
 				len = strlen(cwd);
 				printf("%s%s.git\n", cwd, len && cwd[len-1] != '/' ? "/" : "");
+				free(cwd);
 				continue;
 			}
 			if (!strcmp(arg, "--resolve-git-dir")) {
diff --git a/dir.c b/dir.c
index e65888d..7b994d4 100644
--- a/dir.c
+++ b/dir.c
@@ -1499,12 +1499,16 @@ int dir_inside_of(const char *subdir, const char *dir)
 
 int is_inside_dir(const char *dir)
 {
-	char cwd[PATH_MAX];
+	char *cwd;
+	int rc;
+
 	if (!dir)
 		return 0;
-	if (!getcwd(cwd, sizeof(cwd)))
-		die_errno("can't find the current directory");
-	return dir_inside_of(cwd, dir) >= 0;
+
+	cwd = xgetcwd();
+	rc = (dir_inside_of(cwd, dir) >= 0);
+	free(cwd);
+	return rc;
 }
 
 int is_empty_dir(const char *path)
diff --git a/trace.c b/trace.c
index 08180a9..3523667 100644
--- a/trace.c
+++ b/trace.c
@@ -158,13 +158,12 @@ void trace_repo_setup(const char *prefix)
 {
 	static const char *key = "GIT_TRACE_SETUP";
 	const char *git_work_tree;
-	char cwd[PATH_MAX];
+	char *cwd;
 
 	if (!trace_want(key))
 		return;
 
-	if (!getcwd(cwd, PATH_MAX))
-		die("Unable to get current working directory");
+	cwd = xgetcwd();
 
 	if (!(git_work_tree = get_git_work_tree()))
 		git_work_tree = "(null)";
@@ -176,6 +175,8 @@ void trace_repo_setup(const char *prefix)
 	trace_printf_key(key, "setup: worktree: %s\n", quote_crnl(git_work_tree));
 	trace_printf_key(key, "setup: cwd: %s\n", quote_crnl(cwd));
 	trace_printf_key(key, "setup: prefix: %s\n", quote_crnl(prefix));
+
+	free(cwd);
 }
 
 int trace_want(const char *key)
-- 
2.0.2

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/3] strbuf: add strbuf_add_cwd()
  2014-07-20 11:21 ` [PATCH 1/3] strbuf: add strbuf_add_cwd() René Scharfe
@ 2014-07-20 12:33   ` Duy Nguyen
  2014-07-20 15:21     ` René Scharfe
  0 siblings, 1 reply; 10+ messages in thread
From: Duy Nguyen @ 2014-07-20 12:33 UTC (permalink / raw)
  To: René Scharfe; +Cc: Git Mailing List, Karsten Blees, Junio C Hamano

On Sun, Jul 20, 2014 at 6:21 PM, René Scharfe <l.s.r@web.de> wrote:
> +int strbuf_add_cwd(struct strbuf *sb)
> +{
> +       size_t oldalloc = sb->alloc;
> +       size_t guessed_len = 32;

For Linux, I think this is enough to succesfully get cwd in the first
pass. Windows' $HOME is usually deep in C:\Users\Blahblah. Maybe
increase this to 128? And we probably want to keep the guessed value,
so if the first strbuf_add_cwd needs a few rounds to get cwd, the next
strbuf_add_cwd() call does not.

> +
> +       for (;; guessed_len *= 2) {
> +               char *cwd;
> +
> +               strbuf_grow(sb, guessed_len);
> +               cwd = getcwd(sb->buf + sb->len, sb->alloc - sb->len);
> +               if (cwd) {
> +                       strbuf_setlen(sb, sb->len + strlen(cwd));
> +                       return 0;
> +               }
> +               if (errno != ERANGE)
> +                       break;
> +       }
> +       if (oldalloc == 0)
> +               strbuf_release(sb);
> +       return -1;
> +}

The loop and the following strbuf_release() are correct. But I wonder
if it's easier to read if you save getcwd in a separate/local strbuf
variable and only concat it to "sb" when you successfully get cwd..
-- 
Duy

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/3] wrapper: add xgetcwd()
  2014-07-20 11:21 ` [PATCH 2/3] wrapper: add xgetcwd() René Scharfe
@ 2014-07-20 12:35   ` Duy Nguyen
  2014-07-20 15:22     ` René Scharfe
  0 siblings, 1 reply; 10+ messages in thread
From: Duy Nguyen @ 2014-07-20 12:35 UTC (permalink / raw)
  To: René Scharfe; +Cc: Git Mailing List, Karsten Blees, Junio C Hamano

On Sun, Jul 20, 2014 at 6:21 PM, René Scharfe <l.s.r@web.de> wrote:
> +char *xgetcwd(void)
> +{
> +       struct strbuf sb = STRBUF_INIT;
> +       if (strbuf_add_cwd(&sb))
> +               die_errno("unable to get current working directory");

Wrap the string with _() to make it translatable? I can't see why any
script would want to grep this string..

> +       return strbuf_detach(&sb, NULL);
> +}
-- 
Duy

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 3/3] use xgetcwd() get the current directory or die
  2014-07-20 11:22 ` [PATCH 3/3] use xgetcwd() get the current directory or die René Scharfe
@ 2014-07-20 12:45   ` Duy Nguyen
  2014-07-20 15:21     ` René Scharfe
  0 siblings, 1 reply; 10+ messages in thread
From: Duy Nguyen @ 2014-07-20 12:45 UTC (permalink / raw)
  To: René Scharfe; +Cc: Git Mailing List, Karsten Blees, Junio C Hamano

On Sun, Jul 20, 2014 at 6:22 PM, René Scharfe <l.s.r@web.de> wrote:
> Convert several calls of getcwd() and die() to use xgetcwd() instead.
> This way we get rid of fixed-size buffers (which can be too small
> depending on the used file system) and gain consistent error messages.
>
> Signed-off-by: Rene Scharfe <l.s.r@web.de>
> ---
>  builtin/init-db.c   | 17 ++++++++---------
>  builtin/rev-parse.c |  6 +++---
>  dir.c               | 12 ++++++++----
>  trace.c             |  7 ++++---
>  4 files changed, 23 insertions(+), 19 deletions(-)

There should be a 4/3 to replace the remaining getcwd with
strbuf_getcwd. But I guess we could add that to the low hanging fruit
list.
-- 
Duy

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/3] strbuf: add strbuf_add_cwd()
  2014-07-20 12:33   ` Duy Nguyen
@ 2014-07-20 15:21     ` René Scharfe
  0 siblings, 0 replies; 10+ messages in thread
From: René Scharfe @ 2014-07-20 15:21 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Git Mailing List, Karsten Blees, Junio C Hamano

Am 20.07.2014 14:33, schrieb Duy Nguyen:
> On Sun, Jul 20, 2014 at 6:21 PM, René Scharfe <l.s.r@web.de> wrote:
>> +int strbuf_add_cwd(struct strbuf *sb)
>> +{
>> +       size_t oldalloc = sb->alloc;
>> +       size_t guessed_len = 32;
>
> For Linux, I think this is enough to succesfully get cwd in the first
> pass. Windows' $HOME is usually deep in C:\Users\Blahblah. Maybe
> increase this to 128? And we probably want to keep the guessed value,
> so if the first strbuf_add_cwd needs a few rounds to get cwd, the next
> strbuf_add_cwd() call does not.

The initial number is debatable, sure.  I just copied the 32 from 
strbuf_readline().

"C:\Users\John Doe\Documents\Projects\foo" (or whatever) isn't thaaat 
long either, though.  FWIW, the longest (non-hidden) path in my home on 
Windows 8.1 is 139 characters long (as reported by dir -r | %{ 
$_.FullName.Length } | sort -desc | select -f 1), but there are no git 
projects in that directory.  Not sure if that means 128 would be a 
better start value, but I don't mind changing it in any case.

Before adding performance optimizations like remembering the last cwd 
length I'd rather see measurements that demonstrate their use.  I doubt 
we'll see getcwd() show up high on a profile (but didn't measure, except 
for a run of t/perf).

>> +
>> +       for (;; guessed_len *= 2) {
>> +               char *cwd;
>> +
>> +               strbuf_grow(sb, guessed_len);
>> +               cwd = getcwd(sb->buf + sb->len, sb->alloc - sb->len);
>> +               if (cwd) {
>> +                       strbuf_setlen(sb, sb->len + strlen(cwd));
>> +                       return 0;
>> +               }
>> +               if (errno != ERANGE)
>> +                       break;
>> +       }
>> +       if (oldalloc == 0)
>> +               strbuf_release(sb);
>> +       return -1;
>> +}
>
> The loop and the following strbuf_release() are correct. But I wonder
> if it's easier to read if you save getcwd in a separate/local strbuf
> variable and only concat it to "sb" when you successfully get cwd..

Adding an extra allocation and string copy sound more complicated.  But 
you are right that the function is more complicated than necessary. 
Reroll coming..

René

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 3/3] use xgetcwd() get the current directory or die
  2014-07-20 12:45   ` Duy Nguyen
@ 2014-07-20 15:21     ` René Scharfe
  0 siblings, 0 replies; 10+ messages in thread
From: René Scharfe @ 2014-07-20 15:21 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Git Mailing List, Karsten Blees, Junio C Hamano

Am 20.07.2014 14:45, schrieb Duy Nguyen:
> On Sun, Jul 20, 2014 at 6:22 PM, René Scharfe <l.s.r@web.de> wrote:
>> Convert several calls of getcwd() and die() to use xgetcwd() instead.
>> This way we get rid of fixed-size buffers (which can be too small
>> depending on the used file system) and gain consistent error messages.
>>
>> Signed-off-by: Rene Scharfe <l.s.r@web.de>
>> ---
>>   builtin/init-db.c   | 17 ++++++++---------
>>   builtin/rev-parse.c |  6 +++---
>>   dir.c               | 12 ++++++++----
>>   trace.c             |  7 ++++---
>>   4 files changed, 23 insertions(+), 19 deletions(-)
>
> There should be a 4/3 to replace the remaining getcwd with
> strbuf_getcwd. But I guess we could add that to the low hanging fruit
> list.

I left out the cases with the go-there-then-come-back pattern on 
purpose, as they hopefully can be changed to cease using getcwd() in the 
first place.  But I'll include another example (in addition to 
xgetcwd()) in the reroll.

René

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/3] wrapper: add xgetcwd()
  2014-07-20 12:35   ` Duy Nguyen
@ 2014-07-20 15:22     ` René Scharfe
  0 siblings, 0 replies; 10+ messages in thread
From: René Scharfe @ 2014-07-20 15:22 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Git Mailing List, Karsten Blees, Junio C Hamano

Am 20.07.2014 14:35, schrieb Duy Nguyen:
> On Sun, Jul 20, 2014 at 6:21 PM, René Scharfe <l.s.r@web.de> wrote:
>> +char *xgetcwd(void)
>> +{
>> +       struct strbuf sb = STRBUF_INIT;
>> +       if (strbuf_add_cwd(&sb))
>> +               die_errno("unable to get current working directory");
>
> Wrap the string with _() to make it translatable? I can't see why any
> script would want to grep this string..

Sure, good idea.

>
>> +       return strbuf_detach(&sb, NULL);
>> +}

Thank you for the review,
René

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2014-07-20 15:23 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-07-20 11:18 [PATCH 0/3] getcwd without PATH_MAX René Scharfe
2014-07-20 11:21 ` [PATCH 1/3] strbuf: add strbuf_add_cwd() René Scharfe
2014-07-20 12:33   ` Duy Nguyen
2014-07-20 15:21     ` René Scharfe
2014-07-20 11:21 ` [PATCH 2/3] wrapper: add xgetcwd() René Scharfe
2014-07-20 12:35   ` Duy Nguyen
2014-07-20 15:22     ` René Scharfe
2014-07-20 11:22 ` [PATCH 3/3] use xgetcwd() get the current directory or die René Scharfe
2014-07-20 12:45   ` Duy Nguyen
2014-07-20 15:21     ` René Scharfe

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.