All of lore.kernel.org
 help / color / mirror / Atom feed
From: Matheus Tavares Bernardino <matheus.bernardino@usp.br>
To: Duy Nguyen <pclouds@gmail.com>
Cc: "Git Mailing List" <git@vger.kernel.org>,
	"Thomas Gummerer" <t.gummerer@gmail.com>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Christian Couder" <christian.couder@gmail.com>,
	"Junio C Hamano" <gitster@pobox.com>
Subject: Re: [WIP RFC PATCH v2 5/5] clone: use dir-iterator to avoid explicit dir traversal
Date: Wed, 27 Feb 2019 14:40:16 -0300	[thread overview]
Message-ID: <CAHd-oW6WJ5JSRAbcy+5kcEA4V8qKEUc9B=6WQZvdqaHz4XHBTA@mail.gmail.com> (raw)
In-Reply-To: <CACsJy8B1asF3i+G-C1aZRw7QTW7jS+a4VkCbg-17YOTyYHuw5w@mail.gmail.com>

On Tue, Feb 26, 2019 at 9:32 AM Duy Nguyen <pclouds@gmail.com> wrote:
>
> On Tue, Feb 26, 2019 at 12:18 PM Matheus Tavares
> <matheus.bernardino@usp.br> wrote:
> >
> > Replace usage of opendir/readdir/closedir API to traverse directories
> > recursively, at copy_or_link_directory function, by the dir-iterator
> > API. This simplifies the code and avoid recursive calls to
> > copy_or_link_directory.
> >
> > This process also makes copy_or_link_directory call die() in case of an
> > error on readdir or stat, inside dir_iterator_advance. Previously it
> > would just print a warning for errors on stat and ignore errors on
> > readdir, which isn't nice because a local git clone would end up
> > successfully even though the .git/objects copy didn't fully succeeded.
> >
> > Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
> > ---
> > I can also make the change described in the last paragraph in a separate
> > patch before this one, but I would have to undo it in this patch because
> > dir-iterator already implements it. So, IMHO, it would be just noise
> > and not worthy.
> >
> >  builtin/clone.c | 45 +++++++++++++++++++++++----------------------
> >  1 file changed, 23 insertions(+), 22 deletions(-)
> >
> > diff --git a/builtin/clone.c b/builtin/clone.c
> > index fd580fa98d..b23ba64c94 100644
> > --- a/builtin/clone.c
> > +++ b/builtin/clone.c
> > @@ -23,6 +23,8 @@
> >  #include "transport.h"
> >  #include "strbuf.h"
> >  #include "dir.h"
> > +#include "dir-iterator.h"
> > +#include "iterator.h"
> >  #include "sigchain.h"
> >  #include "branch.h"
> >  #include "remote.h"
> > @@ -411,42 +413,37 @@ static void mkdir_if_missing(const char *pathname, mode_t mode)
> >  }
> >
> >  static void copy_or_link_directory(struct strbuf *src, struct strbuf *dest,
> > -                                  const char *src_repo, int src_baselen)
> > +                                  const char *src_repo)
> >  {
> > -       struct dirent *de;
> > -       struct stat buf;
> >         int src_len, dest_len;
> > -       DIR *dir;
> > -
> > -       dir = opendir(src->buf);
> > -       if (!dir)
> > -               die_errno(_("failed to open '%s'"), src->buf);
> > +       struct dir_iterator *iter;
> > +       int iter_status;
> > +       struct stat st;
> > +       unsigned flags;
> >
> >         mkdir_if_missing(dest->buf, 0777);
> >
> > +       flags = DIR_ITERATOR_PEDANTIC | DIR_ITERATOR_FOLLOW_SYMLINKS;
> > +       iter = dir_iterator_begin(src->buf, flags);
> > +
> >         strbuf_addch(src, '/');
> >         src_len = src->len;
> >         strbuf_addch(dest, '/');
> >         dest_len = dest->len;
> >
> > -       while ((de = readdir(dir)) != NULL) {
> > +       while ((iter_status = dir_iterator_advance(iter)) == ITER_OK) {
> >                 strbuf_setlen(src, src_len);
> > -               strbuf_addstr(src, de->d_name);
> > +               strbuf_addstr(src, iter->relative_path);
> >                 strbuf_setlen(dest, dest_len);
> > -               strbuf_addstr(dest, de->d_name);
> > -               if (stat(src->buf, &buf)) {
> > -                       warning (_("failed to stat %s\n"), src->buf);
> > -                       continue;
> > -               }
> > -               if (S_ISDIR(buf.st_mode)) {
> > -                       if (!is_dot_or_dotdot(de->d_name))
> > -                               copy_or_link_directory(src, dest,
> > -                                                      src_repo, src_baselen);
> > +               strbuf_addstr(dest, iter->relative_path);
> > +
> > +               if (S_ISDIR(iter->st.st_mode)) {
> > +                       mkdir_if_missing(dest->buf, 0777);
>
> I wonder if this mkdir_if_missing is sufficient. What if you have to
> create multiple directories?
>
> Let's say the first advance, we hit "a". The the second advance we hit
> directory "b/b/b/b", we would need to mkdir recursively and something
> like safe_create_leading_directories() would be a better fit.
>
> I'm not sure if it can happen though. I haven't re-read dir-iterator
> code carefully.
>
> >                         continue;
> >                 }
> >
> >                 /* Files that cannot be copied bit-for-bit... */
> > -               if (!strcmp(src->buf + src_baselen, "/info/alternates")) {
> > +               if (!strcmp(iter->relative_path, "info/alternates")) {
>
> While we're here, this should be fspathcmp to be friendlier to
> case-insensitive filesystems. You probably should fix it in a separate
> patch though.
>

Nice! I will make this change in a separate patch in the series. Thanks!

> >                         copy_alternates(src, dest, src_repo);
> >                         continue;
> >                 }
> > @@ -463,7 +460,11 @@ static void copy_or_link_directory(struct strbuf *src, struct strbuf *dest,
> >                 if (copy_file_with_time(dest->buf, src->buf, 0666))
> >                         die_errno(_("failed to copy file to '%s'"), dest->buf);
> >         }
> > -       closedir(dir);
> > +
> > +       if (iter_status != ITER_DONE) {
> > +               strbuf_setlen(src, src_len);
> > +               die(_("failed to iterate over '%s'"), src->buf);
> > +       }
>
> I think you need to abort the iterator even when it returns ITER_DONE.
> At least that's how the first caller in files-backend.c does it.
>

Hm, I don't think so, since dir_iterator_advance() already frees the
resources before returning ITER_DONE. Also, I may be wrong, but it
doesn't seem to me, that files-backend.c does it. The function
files_reflog_iterator_advance() that calls dir_iterator_advance() even
sets the dir-iterator pointer to NULL as soon as ITER_DONE is
returned.


> >  }
> >
> >  static void clone_local(const char *src_repo, const char *dest_repo)
> > @@ -481,7 +482,7 @@ static void clone_local(const char *src_repo, const char *dest_repo)
> >                 get_common_dir(&dest, dest_repo);
> >                 strbuf_addstr(&src, "/objects");
> >                 strbuf_addstr(&dest, "/objects");
> > -               copy_or_link_directory(&src, &dest, src_repo, src.len);
> > +               copy_or_link_directory(&src, &dest, src_repo);
> >                 strbuf_release(&src);
> >                 strbuf_release(&dest);
> >         }
> > --
> > 2.20.1
> >
>
>
> --
> Duy

  parent reply	other threads:[~2019-02-27 17:40 UTC|newest]

Thread overview: 127+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-26  5:17 [WIP RFC PATCH v2 0/5] clone: dir iterator refactoring with tests Matheus Tavares
2019-02-26  5:18 ` [WIP RFC PATCH v2 1/5] dir-iterator: add flags parameter to dir_iterator_begin Matheus Tavares
2019-02-26 12:01   ` Duy Nguyen
2019-02-27 13:59     ` Matheus Tavares Bernardino
2019-02-26  5:18 ` [WIP RFC PATCH v2 2/5] clone: test for our behavior on odd objects/* content Matheus Tavares
2019-02-26  5:18 ` [WIP RFC PATCH v2 3/5] clone: copy hidden paths at local clone Matheus Tavares
2019-02-26 12:13   ` Duy Nguyen
2019-02-26  5:18 ` [WIP RFC PATCH v2 4/5] clone: extract function from copy_or_link_directory Matheus Tavares
2019-02-26 12:18   ` Duy Nguyen
2019-02-27 17:30     ` Matheus Tavares Bernardino
2019-02-27 22:45       ` Thomas Gummerer
2019-02-27 22:50         ` Matheus Tavares Bernardino
2019-02-26  5:18 ` [WIP RFC PATCH v2 5/5] clone: use dir-iterator to avoid explicit dir traversal Matheus Tavares
2019-02-26 11:35   ` Ævar Arnfjörð Bjarmason
2019-02-26 12:32   ` Duy Nguyen
2019-02-26 12:50     ` Ævar Arnfjörð Bjarmason
2019-02-27 17:40     ` Matheus Tavares Bernardino [this message]
2019-02-28  7:13       ` Duy Nguyen
2019-02-28  7:53       ` Ævar Arnfjörð Bjarmason
2019-02-26 11:36 ` [WIP RFC PATCH v2 0/5] clone: dir iterator refactoring with tests Ævar Arnfjörð Bjarmason
2019-02-26 12:20   ` Duy Nguyen
2019-02-26 12:28 ` [RFC PATCH v3 " Ævar Arnfjörð Bjarmason
2019-02-26 20:56   ` Matheus Tavares Bernardino
2019-03-22 23:22   ` [GSoC][PATCH v4 0/7] clone: dir-iterator " Matheus Tavares
2019-03-22 23:22     ` [GSoC][PATCH v4 1/7] clone: test for our behavior on odd objects/* content Matheus Tavares
2019-03-24 18:09       ` Matheus Tavares Bernardino
2019-03-24 20:56       ` SZEDER Gábor
2019-03-26 19:43         ` Matheus Tavares Bernardino
2019-03-28 21:49       ` Thomas Gummerer
2019-03-29 14:06         ` Matheus Tavares Bernardino
2019-03-29 19:31           ` Thomas Gummerer
2019-03-29 19:42             ` SZEDER Gábor
2019-03-30  2:49             ` Matheus Tavares Bernardino
2019-03-22 23:22     ` [GSoC][PATCH v4 2/7] clone: better handle symlinked files at .git/objects/ Matheus Tavares
2019-03-28 22:10       ` Thomas Gummerer
2019-03-29  8:38         ` Ævar Arnfjörð Bjarmason
2019-03-29 20:15           ` Thomas Gummerer
2019-03-29 14:27         ` Matheus Tavares Bernardino
2019-03-29 20:05           ` Thomas Gummerer
2019-03-30  5:32             ` Matheus Tavares Bernardino
2019-03-30 19:27               ` Thomas Gummerer
2019-04-01  3:56                 ` Matheus Tavares Bernardino
2019-03-29 15:40         ` Johannes Schindelin
2019-03-22 23:22     ` [GSoC][PATCH v4 3/7] dir-iterator: add flags parameter to dir_iterator_begin Matheus Tavares
2019-03-28 22:19       ` Thomas Gummerer
2019-03-29 13:16         ` Matheus Tavares Bernardino
2019-03-22 23:22     ` [GSoC][PATCH v4 4/7] clone: copy hidden paths at local clone Matheus Tavares
2019-03-22 23:22     ` [GSoC][PATCH v4 5/7] clone: extract function from copy_or_link_directory Matheus Tavares
2019-03-22 23:22     ` [GSoC][PATCH v4 6/7] clone: use dir-iterator to avoid explicit dir traversal Matheus Tavares
2019-03-22 23:22     ` [GSoC][PATCH v4 7/7] clone: Replace strcmp by fspathcmp Matheus Tavares
2019-03-30 22:49     ` [GSoC][PATCH v5 0/7] clone: dir-iterator refactoring with tests Matheus Tavares
2019-03-30 22:49       ` [GSoC][PATCH v5 1/7] clone: test for our behavior on odd objects/* content Matheus Tavares
2019-03-30 22:49       ` [GSoC][PATCH v5 2/7] clone: better handle symlinked files at .git/objects/ Matheus Tavares
2019-03-31 17:40         ` Thomas Gummerer
2019-04-01  3:59           ` Matheus Tavares Bernardino
2019-03-30 22:49       ` [GSoC][PATCH v5 3/7] dir-iterator: add flags parameter to dir_iterator_begin Matheus Tavares
2019-03-31 18:12         ` Thomas Gummerer
2019-04-10 20:24           ` Matheus Tavares Bernardino
2019-04-11 21:09             ` Thomas Gummerer
2019-04-23 17:07               ` Matheus Tavares Bernardino
2019-04-24 18:36                 ` Thomas Gummerer
2019-04-26  4:13                   ` Matheus Tavares Bernardino
2019-03-30 22:49       ` [GSoC][PATCH v5 4/7] clone: copy hidden paths at local clone Matheus Tavares
2019-03-30 22:49       ` [GSoC][PATCH v5 5/7] clone: extract function from copy_or_link_directory Matheus Tavares
2019-03-30 22:49       ` [GSoC][PATCH v5 6/7] clone: use dir-iterator to avoid explicit dir traversal Matheus Tavares
2019-03-30 22:49       ` [GSoC][PATCH v5 7/7] clone: replace strcmp by fspathcmp Matheus Tavares
2019-03-31 18:16       ` [GSoC][PATCH v5 0/7] clone: dir-iterator refactoring with tests Thomas Gummerer
2019-04-01 13:56         ` Matheus Tavares Bernardino
2019-05-02 14:48       ` [GSoC][PATCH v6 00/10] " Matheus Tavares
2019-05-02 14:48         ` [GSoC][PATCH v6 01/10] clone: test for our behavior on odd objects/* content Matheus Tavares
2019-05-02 14:48         ` [GSoC][PATCH v6 02/10] clone: better handle symlinked files at .git/objects/ Matheus Tavares
2019-05-02 14:48         ` [GSoC][PATCH v6 03/10] dir-iterator: add tests for dir-iterator API Matheus Tavares
2019-05-02 14:48         ` [GSoC][PATCH v6 04/10] dir-iterator: use warning_errno when possible Matheus Tavares
2019-05-02 14:48         ` [GSoC][PATCH v6 05/10] dir-iterator: refactor state machine model Matheus Tavares
2019-05-02 14:48         ` [GSoC][PATCH v6 06/10] dir-iterator: add flags parameter to dir_iterator_begin Matheus Tavares
2019-05-02 14:48         ` [GSoC][PATCH v6 07/10] clone: copy hidden paths at local clone Matheus Tavares
2019-05-02 14:48         ` [GSoC][PATCH v6 08/10] clone: extract function from copy_or_link_directory Matheus Tavares
2019-05-02 14:48         ` [GSoC][PATCH v6 09/10] clone: use dir-iterator to avoid explicit dir traversal Matheus Tavares
2019-05-02 14:48         ` [GSoC][PATCH v6 10/10] clone: replace strcmp by fspathcmp Matheus Tavares
2019-06-18 23:27         ` [GSoC][PATCH v7 00/10] clone: dir-iterator refactoring with tests Matheus Tavares
2019-06-18 23:27           ` [GSoC][PATCH v7 01/10] clone: test for our behavior on odd objects/* content Matheus Tavares
2019-06-18 23:27           ` [GSoC][PATCH v7 02/10] clone: better handle symlinked files at .git/objects/ Matheus Tavares
2019-06-18 23:27           ` [GSoC][PATCH v7 03/10] dir-iterator: add tests for dir-iterator API Matheus Tavares
2019-06-18 23:27           ` [GSoC][PATCH v7 04/10] dir-iterator: use warning_errno when possible Matheus Tavares
2019-06-18 23:27           ` [GSoC][PATCH v7 05/10] dir-iterator: refactor state machine model Matheus Tavares
2019-06-18 23:27           ` [GSoC][PATCH v7 06/10] dir-iterator: add flags parameter to dir_iterator_begin Matheus Tavares
2019-06-25 18:00             ` Junio C Hamano
2019-06-25 18:11               ` Matheus Tavares Bernardino
2019-06-26 13:34             ` Johannes Schindelin
2019-06-26 18:04               ` Junio C Hamano
2019-06-27  9:20                 ` Duy Nguyen
2019-06-27 17:23                 ` Matheus Tavares Bernardino
2019-06-27 18:48                   ` Johannes Schindelin
2019-06-27 19:33                     ` Matheus Tavares Bernardino
2019-06-28 12:51                       ` Johannes Schindelin
2019-06-28 14:16                         ` Matheus Tavares Bernardino
2019-07-01 12:15                           ` Johannes Schindelin
2019-07-03  8:57             ` SZEDER Gábor
2019-07-08 22:21               ` Matheus Tavares Bernardino
2019-06-18 23:27           ` [GSoC][PATCH v7 07/10] clone: copy hidden paths at local clone Matheus Tavares
2019-06-18 23:27           ` [GSoC][PATCH v7 08/10] clone: extract function from copy_or_link_directory Matheus Tavares
2019-06-18 23:27           ` [GSoC][PATCH v7 09/10] clone: use dir-iterator to avoid explicit dir traversal Matheus Tavares
2019-06-18 23:27           ` [GSoC][PATCH v7 10/10] clone: replace strcmp by fspathcmp Matheus Tavares
2019-06-19  4:36           ` [GSoC][PATCH v7 00/10] clone: dir-iterator refactoring with tests Matheus Tavares Bernardino
2019-06-20 20:18           ` Junio C Hamano
2019-06-21 13:41             ` Matheus Tavares Bernardino
2019-07-10 23:58           ` [GSoC][PATCH v8 " Matheus Tavares
2019-07-10 23:58             ` [GSoC][PATCH v8 01/10] clone: test for our behavior on odd objects/* content Matheus Tavares
2019-07-10 23:58             ` [GSoC][PATCH v8 02/10] clone: better handle symlinked files at .git/objects/ Matheus Tavares
2019-07-10 23:58             ` [GSoC][PATCH v8 03/10] dir-iterator: add tests for dir-iterator API Matheus Tavares
2019-07-10 23:58             ` [GSoC][PATCH v8 04/10] dir-iterator: use warning_errno when possible Matheus Tavares
2019-07-10 23:58             ` [GSoC][PATCH v8 05/10] dir-iterator: refactor state machine model Matheus Tavares
2019-07-10 23:59             ` [GSoC][PATCH v8 06/10] dir-iterator: add flags parameter to dir_iterator_begin Matheus Tavares
2019-07-10 23:59             ` [GSoC][PATCH v8 07/10] clone: copy hidden paths at local clone Matheus Tavares
2019-07-10 23:59             ` [GSoC][PATCH v8 08/10] clone: extract function from copy_or_link_directory Matheus Tavares
2019-07-10 23:59             ` [GSoC][PATCH v8 09/10] clone: use dir-iterator to avoid explicit dir traversal Matheus Tavares
2019-07-10 23:59             ` [GSoC][PATCH v8 10/10] clone: replace strcmp by fspathcmp Matheus Tavares
2019-07-11 11:56             ` [GSoC][PATCH v8 00/10] clone: dir-iterator refactoring with tests Johannes Schindelin
2019-07-11 15:24               ` Matheus Tavares Bernardino
2019-02-26 12:28 ` [RFC PATCH v3 1/5] clone: test for our behavior on odd objects/* content Ævar Arnfjörð Bjarmason
2019-02-28 21:19   ` Matheus Tavares Bernardino
2019-03-01 13:49     ` Ævar Arnfjörð Bjarmason
2019-03-13  3:17       ` Matheus Tavares
2019-02-26 12:28 ` [RFC PATCH v3 2/5] dir-iterator: add flags parameter to dir_iterator_begin Ævar Arnfjörð Bjarmason
2019-02-26 12:28 ` [RFC PATCH v3 3/5] clone: copy hidden paths at local clone Ævar Arnfjörð Bjarmason
2019-02-26 12:28 ` [RFC PATCH v3 4/5] clone: extract function from copy_or_link_directory Ævar Arnfjörð Bjarmason
2019-02-26 12:28 ` [RFC PATCH v3 5/5] clone: use dir-iterator to avoid explicit dir traversal Ævar Arnfjörð Bjarmason

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAHd-oW6WJ5JSRAbcy+5kcEA4V8qKEUc9B=6WQZvdqaHz4XHBTA@mail.gmail.com' \
    --to=matheus.bernardino@usp.br \
    --cc=avarab@gmail.com \
    --cc=christian.couder@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=pclouds@gmail.com \
    --cc=t.gummerer@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.