All of lore.kernel.org
 help / color / mirror / Atom feed
From: Thomas Gummerer <t.gummerer@gmail.com>
To: Matheus Tavares <matheus.bernardino@usp.br>
Cc: git@vger.kernel.org, "Junio C Hamano" <gitster@pobox.com>,
	"Ramsay Jones" <ramsay@ramsayjones.plus.com>,
	"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>,
	"Michael Haggerty" <mhagger@alum.mit.edu>
Subject: Re: [GSoC][PATCH 1/3] dir-iterator: add pedantic option to dir_iterator_begin
Date: Sat, 23 Feb 2019 21:35:27 +0000	[thread overview]
Message-ID: <20190223213527.GP6085@hank.intra.tgummerer.com> (raw)
In-Reply-To: <20190223190309.6728-2-matheus.bernardino@usp.br>

On 02/23, Matheus Tavares wrote:
> Add the pedantic option to dir-iterator's initialization function,
> dir_iterator_begin. When this option is set to true,
> dir_iterator_advance will immediately return ITER_ERROR when failing to
> fetch the next entry. When set to false, dir_iterator_advance will emit
> a warning and keep looking for the next entry.
> 
> Also adjust refs/files-backend.c to the new dir_iterator_begin
> signature.

Thanks, this makes sense to me.  This commit message describes what we
are doing in this patch, but not why we are doing it.  From previously
reviewing this series, I know this is going to be used in a subsequent
patch, but it is nice to give reviewers and future readers of this
patch that context as well.  Just something like "This behaviour is
going to be used in a subsequent patch." should be enough here.

A few comments inline.

> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
> ---
> Changes in v2:
>  - Added in v2
> 
>  dir-iterator.c       | 23 +++++++++++++++++++++--
>  dir-iterator.h       | 16 +++++++++++++---
>  refs/files-backend.c |  2 +-
>  3 files changed, 35 insertions(+), 6 deletions(-)
> 
> diff --git a/dir-iterator.c b/dir-iterator.c
> index f2dcd82fde..070a656555 100644
> --- a/dir-iterator.c
> +++ b/dir-iterator.c
> @@ -48,6 +48,13 @@ struct dir_iterator_int {
>  	 * that will be included in this iteration.
>  	 */
>  	struct dir_iterator_level *levels;
> +
> +	/*
> +	 * Boolean value to define dir-iterator's behaviour when failing to
> +	 * fetch next entry. See comments on dir_iterator_begin at
> +	 * dir-iterator.h
> +	 */
> +	int pedantic;
>  };
>  
>  int dir_iterator_advance(struct dir_iterator *dir_iterator)
> @@ -71,6 +78,8 @@ int dir_iterator_advance(struct dir_iterator *dir_iterator)
>  
>  			level->dir = opendir(iter->base.path.buf);
>  			if (!level->dir && errno != ENOENT) {
> +				if (iter->pedantic)
> +					goto error_out;

I think we should also print an error here.  The caller doesn't have
any context on what went wrong, and will probably just 'die()' if an
error is encountered.  I think it would make sense to call
'error(...)' here before 'goto error_out;' to give a useful error
message here.

>  				warning("error opening directory %s: %s",
>  					iter->base.path.buf, strerror(errno));
>  				/* Popping the level is handled below */
> @@ -122,6 +131,8 @@ int dir_iterator_advance(struct dir_iterator *dir_iterator)
>  			if (!de) {
>  				/* This level is exhausted; pop up a level. */
>  				if (errno) {
> +					if (iter->pedantic)
> +						goto error_out;
>  					warning("error reading directory %s: %s",
>  						iter->base.path.buf, strerror(errno));
>  				} else if (closedir(level->dir))
> @@ -139,10 +150,13 @@ int dir_iterator_advance(struct dir_iterator *dir_iterator)
>  
>  			strbuf_addstr(&iter->base.path, de->d_name);
>  			if (lstat(iter->base.path.buf, &iter->base.st) < 0) {
> -				if (errno != ENOENT)
> +				if (errno != ENOENT) {
> +					if (iter->pedantic)
> +						goto error_out;
>  					warning("error reading path '%s': %s",
>  						iter->base.path.buf,
>  						strerror(errno));
> +				}
>  				continue;
>  			}
>  
> @@ -159,6 +173,10 @@ int dir_iterator_advance(struct dir_iterator *dir_iterator)
>  			return ITER_OK;
>  		}
>  	}
> +
> +error_out:
> +	dir_iterator_abort(dir_iterator);
> +	return ITER_ERROR;
>  }
>  
>  int dir_iterator_abort(struct dir_iterator *dir_iterator)
> @@ -182,7 +200,7 @@ int dir_iterator_abort(struct dir_iterator *dir_iterator)
>  	return ITER_DONE;
>  }
>  
> -struct dir_iterator *dir_iterator_begin(const char *path)
> +struct dir_iterator *dir_iterator_begin(const char *path, int pedantic)

Thinking about the future evolution of this interface, it might make
more sense to have that second parameter be a "struct
dir_iterator_opts".  For now it would just have one member "pedantic",
but in the future we could add additional options there instead of
adding additional parameters.

For now there are not many callers of the dir iterator API, so adding
a parameter is not a huge problem.  However we will most likely add
more callers in the future, so changing all of them becomes more and
more costly.

It also becomes a problem for integrating these patches, in the case
where a topic in flight adds a new use of this API.  In that case the
breakage wouldn't be noticed on merges, as there would be no merge
conflicts, however it would break the build, which makes the
maintainers job harder.

We could of course add new functions in the future, rather than
changing dir_iterator_begin, however just having one function, that
allows combining different options sounds like the better choice to me
here.

>  {
>  	struct dir_iterator_int *iter = xcalloc(1, sizeof(*iter));
>  	struct dir_iterator *dir_iterator = &iter->base;
> @@ -195,6 +213,7 @@ struct dir_iterator *dir_iterator_begin(const char *path)
>  
>  	ALLOC_GROW(iter->levels, 10, iter->levels_alloc);
>  
> +	iter->pedantic = pedantic;
>  	iter->levels_nr = 1;
>  	iter->levels[0].initialized = 0;
>  
> diff --git a/dir-iterator.h b/dir-iterator.h
> index 970793d07a..50ca8e1a27 100644
> --- a/dir-iterator.h
> +++ b/dir-iterator.h
> @@ -19,7 +19,7 @@
>   * A typical iteration looks like this:
>   *
>   *     int ok;
> - *     struct iterator *iter = dir_iterator_begin(path);
> + *     struct iterator *iter = dir_iterator_begin(path, 0);
>   *
>   *     while ((ok = dir_iterator_advance(iter)) == ITER_OK) {
>   *             if (want_to_stop_iteration()) {
> @@ -65,9 +65,15 @@ struct dir_iterator {
>   * The iteration includes all paths under path, not including path
>   * itself and not including "." or ".." entries.
>   *
> - * path is the starting directory. An internal copy will be made.
> + * Parameters are:
> + * - path is the starting directory. An internal copy will be made.
> + * - pedantic is a boolean value. If true, dir-iterator will free
> + *   resources and return ITER_ERROR immediately, in case of an error
> + *   while trying to fetch the next entry in dir_iterator_advance. If
> + *   false, it will just emit a warning and keep looking for the next
> + *   entry.
>   */
> -struct dir_iterator *dir_iterator_begin(const char *path);
> +struct dir_iterator *dir_iterator_begin(const char *path, int pedantic);
>  
>  /*
>   * Advance the iterator to the first or next item and return ITER_OK.
> @@ -76,6 +82,10 @@ struct dir_iterator *dir_iterator_begin(const char *path);
>   * dir_iterator and associated resources and return ITER_ERROR. It is
>   * a bug to use iterator or call this function again after it has
>   * returned ITER_DONE or ITER_ERROR.
> + *
> + * Note that whether dir_iterator_advance will return ITER_ERROR when
> + * failing to fetch the next entry or keep going is defined by the
> + * 'pedantic' option at dir-iterator's initialization.
>   */
>  int dir_iterator_advance(struct dir_iterator *iterator);
>  
> diff --git a/refs/files-backend.c b/refs/files-backend.c
> index dd8abe9185..c3d3b6c454 100644
> --- a/refs/files-backend.c
> +++ b/refs/files-backend.c
> @@ -2143,7 +2143,7 @@ static struct ref_iterator *reflog_iterator_begin(struct ref_store *ref_store,
>  
>  	base_ref_iterator_init(ref_iterator, &files_reflog_iterator_vtable, 0);
>  	strbuf_addf(&sb, "%s/logs", gitdir);
> -	iter->dir_iterator = dir_iterator_begin(sb.buf);
> +	iter->dir_iterator = dir_iterator_begin(sb.buf, 0);
>  	iter->ref_store = ref_store;
>  	strbuf_release(&sb);
>  
> -- 
> 2.20.1
> 

  reply	other threads:[~2019-02-23 21:35 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-23 19:03 [GSoC][PATCH 0/3] clone: convert explicit dir traversal to dir-iterator Matheus Tavares
2019-02-23 19:03 ` [GSoC][PATCH 1/3] dir-iterator: add pedantic option to dir_iterator_begin Matheus Tavares
2019-02-23 21:35   ` Thomas Gummerer [this message]
2019-02-24  8:35     ` Christian Couder
2019-02-24 17:43       ` Matheus Tavares Bernardino
2019-02-24 21:06         ` Thomas Gummerer
2019-02-23 19:03 ` [GSoC][PATCH 2/3] clone: extract function from copy_or_link_directory Matheus Tavares
2019-02-24  8:38   ` Christian Couder
2019-02-23 19:03 ` [GSoC][PATCH 3/3] clone: use dir-iterator to avoid explicit dir traversal Matheus Tavares
2019-02-23 21:48   ` Thomas Gummerer
2019-02-24 18:19     ` Matheus Tavares Bernardino
2019-02-23 22:40   ` Ævar Arnfjörð Bjarmason
2019-02-24  9:41     ` Christian Couder
2019-02-24 14:45       ` Ævar Arnfjörð Bjarmason
2019-02-25  9:45         ` Duy Nguyen
2019-02-26  0:26           ` [WIP RFC PATCH 0/7] clone: dir iterator refactoring with tests Ævar Arnfjörð Bjarmason
2019-02-26  0:26           ` [WIP RFC PATCH 1/7] dir-iterator: add pedantic option to dir_iterator_begin Ævar Arnfjörð Bjarmason
2019-02-26  0:26           ` [WIP RFC PATCH 2/7] dir-iterator: use stat() instead of lstat() Ævar Arnfjörð Bjarmason
2019-02-26  1:53             ` Matheus Tavares Bernardino
2019-02-26  0:26           ` [WIP RFC PATCH 3/7] clone: extract function from copy_or_link_directory Ævar Arnfjörð Bjarmason
2019-02-26  0:26           ` [WIP RFC PATCH 4/7] clone: test for our behavior on odd objects/* content Ævar Arnfjörð Bjarmason
2019-02-26  0:26           ` [WIP RFC PATCH 5/7] clone: use dir-iterator to avoid explicit dir traversal Ævar Arnfjörð Bjarmason
2019-02-26  3:48             ` Matheus Tavares Bernardino
2019-02-26 11:33               ` Ævar Arnfjörð Bjarmason
2019-02-26  0:26           ` [WIP RFC PATCH 6/7] clone: stop ignoring dotdirs in --local etc. clone Ævar Arnfjörð Bjarmason
2019-02-26  0:26           ` [WIP RFC PATCH 7/7] clone: break cloning repos that have symlinks in them Ævar Arnfjörð Bjarmason
2019-02-25  2:31       ` [GSoC][PATCH 3/3] clone: use dir-iterator to avoid explicit dir traversal Matheus Tavares Bernardino
2019-02-25 10:25         ` Ævar Arnfjörð Bjarmason
2019-02-25 20:40           ` Christian Couder
2019-02-26 10:33         ` Christian Couder
2019-02-23 19:07 ` [GSoC][PATCH 0/3] clone: convert explicit dir traversal to dir-iterator Matheus Tavares Bernardino
2019-02-23 20:10   ` Ævar Arnfjörð Bjarmason
2019-02-23 21:59 ` Thomas Gummerer
2019-02-24 16:34   ` Matheus Tavares Bernardino
2019-02-24 21:07     ` Thomas Gummerer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190223213527.GP6085@hank.intra.tgummerer.com \
    --to=t.gummerer@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=matheus.bernardino@usp.br \
    --cc=mhagger@alum.mit.edu \
    --cc=pclouds@gmail.com \
    --cc=ramsay@ramsayjones.plus.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.