* [PATCH] entry.c: use dir-iterator to avoid explicit dir traversal
@ 2019-12-08 18:04 otalpster
2019-12-09 15:21 ` Derrick Stolee
2019-12-09 17:41 ` Matheus Tavares Bernardino
0 siblings, 2 replies; 6+ messages in thread
From: otalpster @ 2019-12-08 18:04 UTC (permalink / raw)
To: git; +Cc: mhagger, matheus.bernardino, Plato
From: Plato <otalpster@gmail.com>
Replace usage of opendir/readdir/closedir API to traverse directories
recursively, at remove_subtree() function, by the dir-iterator API. This
simplifies the code and avoids recursive calls to remove_subtree().
Signed-off-by: Plato <otalpster@gmail.com>
---
Hello,
This is my first patch.
I hope I cc'd the correct people and didn't mess up.
The changes pass the test suite t/ and Travis CI.
Please point out any mistakes.
Thanks for your time! :)
entry.c | 32 +++++++++++++++-----------------
1 file changed, 15 insertions(+), 17 deletions(-)
diff --git a/entry.c b/entry.c
index 53380bb614..e7f4881d3b 100644
--- a/entry.c
+++ b/entry.c
@@ -2,6 +2,8 @@
#include "blob.h"
#include "object-store.h"
#include "dir.h"
+#include "iterator.h"
+#include "dir-iterator.h"
#include "streaming.h"
#include "submodule.h"
#include "progress.h"
@@ -50,29 +52,25 @@ static void create_directories(const char *path, int path_len,
static void remove_subtree(struct strbuf *path)
{
- DIR *dir = opendir(path->buf);
- struct dirent *de;
- int origlen = path->len;
+ int ok;
+ unsigned int flags = DIR_ITERATOR_PEDANTIC;
+ struct dir_iterator *iter = dir_iterator_begin(path->buf, flags);
- if (!dir)
+ if (!iter)
die_errno("cannot opendir '%s'", path->buf);
- while ((de = readdir(dir)) != NULL) {
- struct stat st;
- if (is_dot_or_dotdot(de->d_name))
+ while ((ok = dir_iterator_advance(iter)) == ITER_OK) {
+ if (is_dot_or_dotdot(iter->path.buf))
continue;
- strbuf_addch(path, '/');
- strbuf_addstr(path, de->d_name);
- if (lstat(path->buf, &st))
- die_errno("cannot lstat '%s'", path->buf);
- if (S_ISDIR(st.st_mode))
- remove_subtree(path);
- else if (unlink(path->buf))
- die_errno("cannot unlink '%s'", path->buf);
- strbuf_setlen(path, origlen);
+ if (unlink(iter->path.buf)) {
+ die_errno("cannot unlink '%s'", iter->path.buf);
+ }
}
- closedir(dir);
+
+ if (ok != ITER_DONE)
+ die(_("failed to iterate over '%s'"), path->buf);
+
if (rmdir(path->buf))
die_errno("cannot rmdir '%s'", path->buf);
}
--
2.24.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] entry.c: use dir-iterator to avoid explicit dir traversal
2019-12-08 18:04 [PATCH] entry.c: use dir-iterator to avoid explicit dir traversal otalpster
@ 2019-12-09 15:21 ` Derrick Stolee
2019-12-09 17:41 ` Matheus Tavares Bernardino
1 sibling, 0 replies; 6+ messages in thread
From: Derrick Stolee @ 2019-12-09 15:21 UTC (permalink / raw)
To: otalpster, git; +Cc: mhagger, matheus.bernardino
On 12/8/2019 1:04 PM, otalpster@gmail.com wrote:
> From: Plato <otalpster@gmail.com>
>
> Replace usage of opendir/readdir/closedir API to traverse directories
> recursively, at remove_subtree() function, by the dir-iterator API. This
> simplifies the code and avoids recursive calls to remove_subtree().
>
> Signed-off-by: Plato <otalpster@gmail.com>
> ---
> Hello,
>
> This is my first patch.
> I hope I cc'd the correct people and didn't mess up.
Welcome, Plato!
> The changes pass the test suite t/ and Travis CI.
> Please point out any mistakes.
>
> Thanks for your time! :)
>
> entry.c | 32 +++++++++++++++-----------------
> 1 file changed, 15 insertions(+), 17 deletions(-)
>
> diff --git a/entry.c b/entry.c
> index 53380bb614..e7f4881d3b 100644
> --- a/entry.c
> +++ b/entry.c
> @@ -2,6 +2,8 @@
> #include "blob.h"
> #include "object-store.h"
> #include "dir.h"
> +#include "iterator.h"
> +#include "dir-iterator.h"
> #include "streaming.h"
> #include "submodule.h"
> #include "progress.h"
> @@ -50,29 +52,25 @@ static void create_directories(const char *path, int path_len,
>
> static void remove_subtree(struct strbuf *path)
> {
> - DIR *dir = opendir(path->buf);
> - struct dirent *de;
> - int origlen = path->len;
> + int ok;
> + unsigned int flags = DIR_ITERATOR_PEDANTIC;
> + struct dir_iterator *iter = dir_iterator_begin(path->buf, flags);
>
> - if (!dir)
> + if (!iter)
> die_errno("cannot opendir '%s'", path->buf);
> - while ((de = readdir(dir)) != NULL) {
> - struct stat st;
>
> - if (is_dot_or_dotdot(de->d_name))
> + while ((ok = dir_iterator_advance(iter)) == ITER_OK) {
> + if (is_dot_or_dotdot(iter->path.buf))
> continue;
The documentation in dir-iterator.h seems to skip . and .. for you.
This 'if' shouldn't be needed.
> - strbuf_addch(path, '/');
> - strbuf_addstr(path, de->d_name);
> - if (lstat(path->buf, &st))
> - die_errno("cannot lstat '%s'", path->buf);
> - if (S_ISDIR(st.st_mode))
> - remove_subtree(path);
It took me reading dir-iterator.h to realize that the iterator
is recursive, because I'm new to this API. This is a nice cleanup!
> - else if (unlink(path->buf))
> - die_errno("cannot unlink '%s'", path->buf);
> - strbuf_setlen(path, origlen);
> + if (unlink(iter->path.buf)) {
> + die_errno("cannot unlink '%s'", iter->path.buf);
> + }
nit: don't use braces for a single-line block.
> }
> - closedir(dir);
> +
> + if (ok != ITER_DONE)
> + die(_("failed to iterate over '%s'"), path->buf);
> +
> if (rmdir(path->buf))
> die_errno("cannot rmdir '%s'", path->buf);
> }
Thanks,
-Stolee
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] entry.c: use dir-iterator to avoid explicit dir traversal
2019-12-08 18:04 [PATCH] entry.c: use dir-iterator to avoid explicit dir traversal otalpster
2019-12-09 15:21 ` Derrick Stolee
@ 2019-12-09 17:41 ` Matheus Tavares Bernardino
2019-12-09 21:18 ` Junio C Hamano
1 sibling, 1 reply; 6+ messages in thread
From: Matheus Tavares Bernardino @ 2019-12-09 17:41 UTC (permalink / raw)
To: otalpster; +Cc: git, Michael Haggerty
On Sun, Dec 8, 2019 at 3:06 PM <otalpster@gmail.com> wrote:
>
> From: Plato <otalpster@gmail.com>
>
> Replace usage of opendir/readdir/closedir API to traverse directories
> recursively, at remove_subtree() function, by the dir-iterator API. This
> simplifies the code and avoids recursive calls to remove_subtree().
>
> Signed-off-by: Plato <otalpster@gmail.com>
> ---
> Hello,
>
> This is my first patch.
Hello Plato, and welcome!
Thanks for working on this.
> I hope I cc'd the correct people and didn't mess up.
>
> The changes pass the test suite t/ and Travis CI.
> Please point out any mistakes.
>
> Thanks for your time! :)
>
> entry.c | 32 +++++++++++++++-----------------
> 1 file changed, 15 insertions(+), 17 deletions(-)
>
> diff --git a/entry.c b/entry.c
> index 53380bb614..e7f4881d3b 100644
> --- a/entry.c
> +++ b/entry.c
> @@ -2,6 +2,8 @@
> #include "blob.h"
> #include "object-store.h"
> #include "dir.h"
> +#include "iterator.h"
> +#include "dir-iterator.h"
> #include "streaming.h"
> #include "submodule.h"
> #include "progress.h"
> @@ -50,29 +52,25 @@ static void create_directories(const char *path, int path_len,
>
> static void remove_subtree(struct strbuf *path)
> {
> - DIR *dir = opendir(path->buf);
> - struct dirent *de;
> - int origlen = path->len;
> + int ok;
> + unsigned int flags = DIR_ITERATOR_PEDANTIC;
> + struct dir_iterator *iter = dir_iterator_begin(path->buf, flags);
>
> - if (!dir)
> + if (!iter)
> die_errno("cannot opendir '%s'", path->buf);
Nitpick: since dir_iterator_begin() might fail for reasons other than
an opendir() error, I think the error message here could be more
generic. Maybe "failed to start iterator over %s"?
> - while ((de = readdir(dir)) != NULL) {
> - struct stat st;
>
> - if (is_dot_or_dotdot(de->d_name))
> + while ((ok = dir_iterator_advance(iter)) == ITER_OK) {
> + if (is_dot_or_dotdot(iter->path.buf))
This check is already done by dir-iterator internally, so you may
remove it here.
> continue;
>
> - strbuf_addch(path, '/');
> - strbuf_addstr(path, de->d_name);
> - if (lstat(path->buf, &st))
> - die_errno("cannot lstat '%s'", path->buf);
> - if (S_ISDIR(st.st_mode))
> - remove_subtree(path);
> - else if (unlink(path->buf))
> - die_errno("cannot unlink '%s'", path->buf);
> - strbuf_setlen(path, origlen);
> + if (unlink(iter->path.buf)) {
unlink()-ing a directory in Linux will return a EISDIR error. So I
think you still need to use S_ISDIR() to check if iter->path.buf is a
directory and call rmdir(), in this case.
However, note that the dir-iterator API gives entries in pre-order.
I.e. a directory appears before its subentries. In the use case of
remove_subtree(), though, we need to traverse in post-order, since we
have to remove the subentries before removing the directory where they
reside. My suggestion is that you add a preliminary patch,
implementing a new DIR_ITERATOR_POST_ORDER flag to dir-iterator.h, and
then use it in this patch.
You may also want to check this[1] series, which worked towards the
same goal of converting remove_subtree(). It ended up not getting
merged, back them, but some of the patches were re-used in this[2]
series which got merged. I think you could also re-use some of the
code from [1] that implements the post-order traversing and a test[3]
for remove_subtree().
Thanks,
Matheus
[1]: https://public-inbox.org/git/1493226219-33423-1-git-send-email-bnmvco@gmail.com/
[2]: https://public-inbox.org/git/cover.1562801254.git.matheus.bernardino@usp.br/
[3]: https://public-inbox.org/git/1493226219-33423-3-git-send-email-bnmvco@gmail.com/
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] entry.c: use dir-iterator to avoid explicit dir traversal
2019-12-09 17:41 ` Matheus Tavares Bernardino
@ 2019-12-09 21:18 ` Junio C Hamano
2019-12-10 4:38 ` Matheus Tavares Bernardino
0 siblings, 1 reply; 6+ messages in thread
From: Junio C Hamano @ 2019-12-09 21:18 UTC (permalink / raw)
To: Matheus Tavares Bernardino; +Cc: otalpster, git, Michael Haggerty
Matheus Tavares Bernardino <matheus.bernardino@usp.br> writes:
>> The changes pass the test suite t/ and Travis CI.
>> Please point out any mistakes.
>> ...
>> - strbuf_addch(path, '/');
>> - strbuf_addstr(path, de->d_name);
>> - if (lstat(path->buf, &st))
>> - die_errno("cannot lstat '%s'", path->buf);
>> - if (S_ISDIR(st.st_mode))
>> - remove_subtree(path);
>> - else if (unlink(path->buf))
>> - die_errno("cannot unlink '%s'", path->buf);
>> - strbuf_setlen(path, origlen);
>> + if (unlink(iter->path.buf)) {
>
> unlink()-ing a directory in Linux will return a EISDIR error. So I
> think you still need to use S_ISDIR() to check if iter->path.buf is a
> directory and call rmdir(), in this case.
>
> However, note that the dir-iterator API gives entries in pre-order.
> I.e. a directory appears before its subentries. In the use case of
> remove_subtree(), though, we need to traverse in post-order, since we
> have to remove the subentries before removing the directory where they
> reside. My suggestion is that you add a preliminary patch,
> implementing a new DIR_ITERATOR_POST_ORDER flag to dir-iterator.h, and
> then use it in this patch.
Thanks for a review and a few hints to nudge a new contributor in
the right direction. Very much appreciated.
I wonder why the bugs in this patch weren't caught by self test we
already have, by the way. We need a bit better test coverage,
perhaps?
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] entry.c: use dir-iterator to avoid explicit dir traversal
2019-12-09 21:18 ` Junio C Hamano
@ 2019-12-10 4:38 ` Matheus Tavares Bernardino
2019-12-16 16:12 ` otalpster
0 siblings, 1 reply; 6+ messages in thread
From: Matheus Tavares Bernardino @ 2019-12-10 4:38 UTC (permalink / raw)
To: Junio C Hamano; +Cc: otalpster, git, Michael Haggerty
On Mon, Dec 9, 2019 at 6:18 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Matheus Tavares Bernardino <matheus.bernardino@usp.br> writes:
>
> >> The changes pass the test suite t/ and Travis CI.
> >> Please point out any mistakes.
> >> ...
> >> - strbuf_addch(path, '/');
> >> - strbuf_addstr(path, de->d_name);
> >> - if (lstat(path->buf, &st))
> >> - die_errno("cannot lstat '%s'", path->buf);
> >> - if (S_ISDIR(st.st_mode))
> >> - remove_subtree(path);
> >> - else if (unlink(path->buf))
> >> - die_errno("cannot unlink '%s'", path->buf);
> >> - strbuf_setlen(path, origlen);
> >> + if (unlink(iter->path.buf)) {
> >
> > unlink()-ing a directory in Linux will return a EISDIR error. So I
> > think you still need to use S_ISDIR() to check if iter->path.buf is a
> > directory and call rmdir(), in this case.
> >
> > However, note that the dir-iterator API gives entries in pre-order.
> > I.e. a directory appears before its subentries. In the use case of
> > remove_subtree(), though, we need to traverse in post-order, since we
> > have to remove the subentries before removing the directory where they
> > reside. My suggestion is that you add a preliminary patch,
> > implementing a new DIR_ITERATOR_POST_ORDER flag to dir-iterator.h, and
> > then use it in this patch.
>
> Thanks for a review and a few hints to nudge a new contributor in
> the right direction. Very much appreciated.
>
> I wonder why the bugs in this patch weren't caught by self test we
> already have, by the way. We need a bit better test coverage,
> perhaps?
I think there is no current test that covers remove_subtree() being
called with nested directories. But we could use the test proposed by
Daniel[1], which does fail when this current patch is applied. So,
maybe, Plato could also include this test in v2, before performing the
dir-iterator convertion.
[1]: https://public-inbox.org/git/1493226219-33423-3-git-send-email-bnmvco@gmail.com/
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] entry.c: use dir-iterator to avoid explicit dir traversal
2019-12-10 4:38 ` Matheus Tavares Bernardino
@ 2019-12-16 16:12 ` otalpster
0 siblings, 0 replies; 6+ messages in thread
From: otalpster @ 2019-12-16 16:12 UTC (permalink / raw)
To: Matheus Tavares Bernardino; +Cc: Junio C Hamano, git, Michael Haggerty
On Tue, Dec 10, 2019 at 6:38 AM Matheus Tavares Bernardino
<matheus.bernardino@usp.br> wrote:
>
> On Mon, Dec 9, 2019 at 6:18 PM Junio C Hamano <gitster@pobox.com> wrote:
> >
> > Matheus Tavares Bernardino <matheus.bernardino@usp.br> writes:
> >
> > >> The changes pass the test suite t/ and Travis CI.
> > >> Please point out any mistakes.
> > >> ...
> > >> - strbuf_addch(path, '/');
> > >> - strbuf_addstr(path, de->d_name);
> > >> - if (lstat(path->buf, &st))
> > >> - die_errno("cannot lstat '%s'", path->buf);
> > >> - if (S_ISDIR(st.st_mode))
> > >> - remove_subtree(path);
> > >> - else if (unlink(path->buf))
> > >> - die_errno("cannot unlink '%s'", path->buf);
> > >> - strbuf_setlen(path, origlen);
> > >> + if (unlink(iter->path.buf)) {
> > >
> > > unlink()-ing a directory in Linux will return a EISDIR error. So I
> > > think you still need to use S_ISDIR() to check if iter->path.buf is a
> > > directory and call rmdir(), in this case.
> > >
> > > However, note that the dir-iterator API gives entries in pre-order.
> > > I.e. a directory appears before its subentries. In the use case of
> > > remove_subtree(), though, we need to traverse in post-order, since we
> > > have to remove the subentries before removing the directory where they
> > > reside. My suggestion is that you add a preliminary patch,
> > > implementing a new DIR_ITERATOR_POST_ORDER flag to dir-iterator.h, and
> > > then use it in this patch.
> >
> > Thanks for a review and a few hints to nudge a new contributor in
> > the right direction. Very much appreciated.
> >
> > I wonder why the bugs in this patch weren't caught by self test we
> > already have, by the way. We need a bit better test coverage,
> > perhaps?
>
> I think there is no current test that covers remove_subtree() being
> called with nested directories. But we could use the test proposed by
> Daniel[1], which does fail when this current patch is applied. So,
> maybe, Plato could also include this test in v2, before performing the
> dir-iterator convertion.
>
> [1]: https://public-inbox.org/git/1493226219-33423-3-git-send-email-bnmvco@gmail.com/
Hello!
Thanks for the warm welcome and sorry for the late reply.
Your review and hints you provided were very helpful and appreciated.
Yes, I can work and include the test in v2, as well as performing the
dir-iterator conversion. The required process and the necessary
changes I need to make were very clear.
This week I'm very busy. I'll work on this in great detail from next week.
Thanks,
Plato
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2019-12-16 16:13 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-08 18:04 [PATCH] entry.c: use dir-iterator to avoid explicit dir traversal otalpster
2019-12-09 15:21 ` Derrick Stolee
2019-12-09 17:41 ` Matheus Tavares Bernardino
2019-12-09 21:18 ` Junio C Hamano
2019-12-10 4:38 ` Matheus Tavares Bernardino
2019-12-16 16:12 ` otalpster
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).