All of lore.kernel.org
 help / color / mirror / Atom feed
From: Thomas Gummerer <t.gummerer@gmail.com>
To: Matheus Tavares <matheus.bernardino@usp.br>
Cc: git@vger.kernel.org, "Junio C Hamano" <gitster@pobox.com>,
	"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
Subject: Re: [GSoC][PATCH 3/3] clone: use dir-iterator to avoid explicit dir traversal
Date: Sat, 23 Feb 2019 21:48:56 +0000	[thread overview]
Message-ID: <20190223214856.GQ6085@hank.intra.tgummerer.com> (raw)
In-Reply-To: <20190223190309.6728-4-matheus.bernardino@usp.br>

On 02/23, Matheus Tavares wrote:
> Replace usage of opendir/readdir/closedir API to traverse directories
> recursively, at copy_or_link_directory function, by the dir-iterator
> API. This simplifies the code and avoid recursive calls to
> copy_or_link_directory.
> 
> This process also brings some safe behaviour changes to
> copy_or_link_directory:
>  - It will no longer follows symbolic links. This is not a problem,
>    since the function is only used to copy .git/objects directory, and
>    symbolic links are not expected there.
>  - Hidden directories won't be skipped anymore. In fact, it is odd that
>    the function currently skip hidden directories but not hidden files.
>    The reason for that could be unintentional: probably the intention
>    was to skip '.' and '..' only, but it ended up accidentally skipping
>    all directories starting with '.'. Again, it must not be a problem
>    not to skip hidden dirs since hidden dirs/files are not expected at
>    .git/objects.
>  - Now, copy_or_link_directory will call die() in case of an error on
>    openddir, readdir or lstat, inside dir_iterator_advance. That means
>    it will abort in case of an error trying to fetch any iteration
>    entry.
> 
> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
> ---
> Changes in v2:
>  - Improved patch message
>  - Removed a now unused variable

s/variable/parameter/ I believe?

>  - Put warning on stat error back
>  - Added pedantic option to dir-iterator initialization
>  - Modified copy_or_link_directory not to skip hidden paths

Thanks, these descriptions are very useful for reviewers that had a
look at previous rounds.

>  builtin/clone.c | 47 ++++++++++++++++++++++++++++-------------------
>  1 file changed, 28 insertions(+), 19 deletions(-)
> 
> diff --git a/builtin/clone.c b/builtin/clone.c
> index 862d2ea69c..515dc91d63 100644
> --- a/builtin/clone.c
> +++ b/builtin/clone.c
> @@ -23,6 +23,8 @@
>  #include "transport.h"
>  #include "strbuf.h"
>  #include "dir.h"
> +#include "dir-iterator.h"
> +#include "iterator.h"
>  #include "sigchain.h"
>  #include "branch.h"
>  #include "remote.h"
> @@ -411,42 +413,45 @@ static void mkdir_if_missing(const char *pathname, mode_t mode)
>  }
>  
>  static void copy_or_link_directory(struct strbuf *src, struct strbuf *dest,
> -				   const char *src_repo, int src_baselen)
> +				   const char *src_repo)
>  {
> -	struct dirent *de;
> -	struct stat buf;
>  	int src_len, dest_len;
> -	DIR *dir;
> -
> -	dir = opendir(src->buf);
> -	if (!dir)
> -		die_errno(_("failed to open '%s'"), src->buf);
> +	struct dir_iterator *iter;
> +	int iter_status;
> +	struct stat st;
>  
>  	mkdir_if_missing(dest->buf, 0777);
>  
> +	iter = dir_iterator_begin(src->buf, 1);
> +
>  	strbuf_addch(src, '/');
>  	src_len = src->len;
>  	strbuf_addch(dest, '/');
>  	dest_len = dest->len;
>  
> -	while ((de = readdir(dir)) != NULL) {
> +	while ((iter_status = dir_iterator_advance(iter)) == ITER_OK) {
>  		strbuf_setlen(src, src_len);
> -		strbuf_addstr(src, de->d_name);
> +		strbuf_addstr(src, iter->relative_path);
>  		strbuf_setlen(dest, dest_len);
> -		strbuf_addstr(dest, de->d_name);
> -		if (stat(src->buf, &buf)) {
> +		strbuf_addstr(dest, iter->relative_path);
> +
> +		/*
> +		 * dir_iterator_advance already calls lstat to populate iter->st
> +		 * but, unlike stat, lstat does not checks for permissions on
> +		 * the given path.
> +		 */

Hmm, lstat does check the permissions on the path, it just doesn't
follow symlinks.  I think I actually mislead you in my previous review
here, and was reading the code in dir-iterator.c all wrong.

I thought it said "if (errno == ENOENT) warning(...)", however the
condition is "errno != ENOENT", which is why I thought we were loosing
warnings when errno == EACCES for example.

As we decided that we would no longer follow symlinks now, I think we
can actually get rid of the stat call here.  Sorry about the confusion.

> +		if (stat(src->buf, &st)) {
>  			warning (_("failed to stat %s\n"), src->buf);
>  			continue;
>  		}
> -		if (S_ISDIR(buf.st_mode)) {
> -			if (de->d_name[0] != '.')
> -				copy_or_link_directory(src, dest,
> -						       src_repo, src_baselen);
> +
> +		if (S_ISDIR(iter->st.st_mode)) {
> +			mkdir_if_missing(dest->buf, 0777);
>  			continue;
>  		}
>  
>  		/* Files that cannot be copied bit-for-bit... */
> -		if (!strcmp(src->buf + src_baselen, "/info/alternates")) {
> +		if (!strcmp(iter->relative_path, "info/alternates")) {
>  			copy_alternates(src, dest, src_repo);
>  			continue;
>  		}
> @@ -463,7 +468,11 @@ static void copy_or_link_directory(struct strbuf *src, struct strbuf *dest,
>  		if (copy_file_with_time(dest->buf, src->buf, 0666))
>  			die_errno(_("failed to copy file to '%s'"), dest->buf);
>  	}
> -	closedir(dir);
> +
> +	if (iter_status != ITER_DONE) {
> +		strbuf_setlen(src, src_len);
> +		die(_("failed to iterate over '%s'"), src->buf);
> +	}
>  }
>  
>  static void clone_local(const char *src_repo, const char *dest_repo)
> @@ -481,7 +490,7 @@ static void clone_local(const char *src_repo, const char *dest_repo)
>  		get_common_dir(&dest, dest_repo);
>  		strbuf_addstr(&src, "/objects");
>  		strbuf_addstr(&dest, "/objects");
> -		copy_or_link_directory(&src, &dest, src_repo, src.len);
> +		copy_or_link_directory(&src, &dest, src_repo);
>  		strbuf_release(&src);
>  		strbuf_release(&dest);
>  	}
> -- 
> 2.20.1
> 

  reply	other threads:[~2019-02-23 21:49 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-23 19:03 [GSoC][PATCH 0/3] clone: convert explicit dir traversal to dir-iterator Matheus Tavares
2019-02-23 19:03 ` [GSoC][PATCH 1/3] dir-iterator: add pedantic option to dir_iterator_begin Matheus Tavares
2019-02-23 21:35   ` Thomas Gummerer
2019-02-24  8:35     ` Christian Couder
2019-02-24 17:43       ` Matheus Tavares Bernardino
2019-02-24 21:06         ` Thomas Gummerer
2019-02-23 19:03 ` [GSoC][PATCH 2/3] clone: extract function from copy_or_link_directory Matheus Tavares
2019-02-24  8:38   ` Christian Couder
2019-02-23 19:03 ` [GSoC][PATCH 3/3] clone: use dir-iterator to avoid explicit dir traversal Matheus Tavares
2019-02-23 21:48   ` Thomas Gummerer [this message]
2019-02-24 18:19     ` Matheus Tavares Bernardino
2019-02-23 22:40   ` Ævar Arnfjörð Bjarmason
2019-02-24  9:41     ` Christian Couder
2019-02-24 14:45       ` Ævar Arnfjörð Bjarmason
2019-02-25  9:45         ` Duy Nguyen
2019-02-26  0:26           ` [WIP RFC PATCH 0/7] clone: dir iterator refactoring with tests Ævar Arnfjörð Bjarmason
2019-02-26  0:26           ` [WIP RFC PATCH 1/7] dir-iterator: add pedantic option to dir_iterator_begin Ævar Arnfjörð Bjarmason
2019-02-26  0:26           ` [WIP RFC PATCH 2/7] dir-iterator: use stat() instead of lstat() Ævar Arnfjörð Bjarmason
2019-02-26  1:53             ` Matheus Tavares Bernardino
2019-02-26  0:26           ` [WIP RFC PATCH 3/7] clone: extract function from copy_or_link_directory Ævar Arnfjörð Bjarmason
2019-02-26  0:26           ` [WIP RFC PATCH 4/7] clone: test for our behavior on odd objects/* content Ævar Arnfjörð Bjarmason
2019-02-26  0:26           ` [WIP RFC PATCH 5/7] clone: use dir-iterator to avoid explicit dir traversal Ævar Arnfjörð Bjarmason
2019-02-26  3:48             ` Matheus Tavares Bernardino
2019-02-26 11:33               ` Ævar Arnfjörð Bjarmason
2019-02-26  0:26           ` [WIP RFC PATCH 6/7] clone: stop ignoring dotdirs in --local etc. clone Ævar Arnfjörð Bjarmason
2019-02-26  0:26           ` [WIP RFC PATCH 7/7] clone: break cloning repos that have symlinks in them Ævar Arnfjörð Bjarmason
2019-02-25  2:31       ` [GSoC][PATCH 3/3] clone: use dir-iterator to avoid explicit dir traversal Matheus Tavares Bernardino
2019-02-25 10:25         ` Ævar Arnfjörð Bjarmason
2019-02-25 20:40           ` Christian Couder
2019-02-26 10:33         ` Christian Couder
2019-02-23 19:07 ` [GSoC][PATCH 0/3] clone: convert explicit dir traversal to dir-iterator Matheus Tavares Bernardino
2019-02-23 20:10   ` Ævar Arnfjörð Bjarmason
2019-02-23 21:59 ` Thomas Gummerer
2019-02-24 16:34   ` Matheus Tavares Bernardino
2019-02-24 21:07     ` Thomas Gummerer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190223214856.GQ6085@hank.intra.tgummerer.com \
    --to=t.gummerer@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=matheus.bernardino@usp.br \
    --cc=pclouds@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.