git.vger.kernel.org archive mirror
From: "Uwe Kleine-König" <u.kleine-koenig@pengutronix.de>
To: Elijah Newren <newren@gmail.com>
Cc: "Git Mailing List" <git@vger.kernel.org>,
	"Björn Lässig" <b.laessig@pengutronix.de>
Subject: Re: Regression in 8d92fb292706 ("dir: replace exponential algorithm with a linear one")
Date: Tue, 7 Dec 2021 08:14:42 +0100
Message-ID: <20211207071442.ajkcl5hpwwbze2lx@pengutronix.de>
In-Reply-To: <CABPp-BEEpboXxs6dghCagj2oXkXKX9aNQGLmUmQRa5m3jj2--w@mail.gmail.com>

Hello,

First of all, thanks for addressing my report; thanks to brian as well.

On Mon, Dec 06, 2021 at 09:29:59PM -0800, Elijah Newren wrote:
> On Mon, Dec 6, 2021 at 4:16 AM Uwe Kleine-König
> <u.kleine-koenig@pengutronix.de> wrote:
> > I admit this is somewhat of a corner case, still it happens in the
> > reality of our admin team ...
> > Initially this was noticed after upgrading the OS from Debian buster
> > (with git 2.20.1) to Debian bullseye (with git 2.30.2).
> >
> > (wgit is just a wrapper for git to call it from my ~/src/git.)
> >
> > This is the good ("old") case:
> >
> >         uwe@taurus:~/tmp/8d92fb292706$ wgit version
> >         git version 2.25.2.7.g0bbd0e8b5233
> >
> >         uwe@taurus:~/tmp/8d92fb292706$ wgit init
> >         Initialized empty Git repository in /home/uwe/tmp/8d92fb292706/.git/
> >
> >         uwe@taurus:~/tmp/8d92fb292706$ mkdir subdir
> >         uwe@taurus:~/tmp/8d92fb292706$ cd subdir/
> >         uwe@taurus:~/tmp/8d92fb292706/subdir$ wgit init
> >         Initialized empty Git repository in /home/uwe/tmp/8d92fb292706/subdir/.git/
> >         uwe@taurus:~/tmp/8d92fb292706/subdir$ cd ..
> >
> >         uwe@taurus:~/tmp/8d92fb292706$ echo content > subdir/somefile
> >         uwe@taurus:~/tmp/8d92fb292706$ wgit add subdir/somefile
> >         uwe@taurus:~/tmp/8d92fb292706$ wgit status
> >         On branch master
> >
> >         No commits yet
> >
> >         Changes to be committed:
> >           (use "git rm --cached <file>..." to unstage)
> >                 new file:   subdir/somefile
> 
> Eek, that's bad.  I think there's a number of dragons going down that route.

Yes, as soon as you start to check out files in one of the involved
repositories, you implicitly change the other one. In this case git is
only used to track files in /etc and elsewhere, and there are two
mechanisms to track them. (Don't ask for the reasons, I don't know
them. :-) In this setup I think the dragons are well secured to their
chains.

> > with 8d92fb292706, the following happens:
> >
> >         uwe@taurus:~/tmp/8d92fb292706$ wgit version
> >         git version 2.25.2.8.g8d92fb292706
> >         uwe@taurus:~/tmp/8d92fb292706$ wgit init
> >         Initialized empty Git repository in /home/uwe/tmp/8d92fb292706/.git/
> >         uwe@taurus:~/tmp/8d92fb292706$ mkdir subdir
> >         uwe@taurus:~/tmp/8d92fb292706$ cd subdir/
> >         uwe@taurus:~/tmp/8d92fb292706/subdir$ wgit init
> >         Initialized empty Git repository in /home/uwe/tmp/8d92fb292706/subdir/.git/
> >         uwe@taurus:~/tmp/8d92fb292706/subdir$ cd ..
> >         uwe@taurus:~/tmp/8d92fb292706$ echo content > subdir/somefile
> >         uwe@taurus:~/tmp/8d92fb292706$ wgit add subdir/somefile
> 
> Not optimal; more on this below.
> 
> >         uwe@taurus:~/tmp/8d92fb292706$ wgit status
> >         On branch master
> >
> >         No commits yet
> >
> >         Untracked files:
> >           (use "git add <file>..." to include in what will be committed)
> >                 subdir/
> >
> >         nothing added to commit but untracked files present (use "git add" to track)
> 
> This part looks good to me.
> 
> > So git after 8d92fb292706 doesn't add files from a subdirectory if
> > said subdirectory is tracked in git, too.
> >
> > While I'm not sure which of the two behaviours is the bogus one, this is
> > a change in behaviour that I guess wasn't intended in 8d92fb292706.
> 
> I put some effort separate from that commit into avoiding accidentally
> recursing into nested git dirs; see e.g. commit 09487f2cba ("clean:
> avoid removing untracked files in a nested git repository",
> 2019-09-17).  So, I was slightly surprised that some other commit
> hadn't fixed this.
> 
> However, it's not all that surprising to me that 8d92fb292706 affected
> this.  Prior to that commit, we visited untracked paths which were n
> directories deep a ridiculous 2^n times.  But what made it even more
> fun was that the status returned for any given path (tracked, ignored,
> not interesting to the traversal, etc.) was not always the same; later
> traversals might return something different than earlier traversals.
> That confusion made it real "fun" trying to ensure no regressions when
> reducing the number of visits to any given path from 2^n down to 1.
> The fact that side effects of the traversals (the population of the
> dir.entries and dir.ignored) could have essentially functioned to
> override a later traversal's return status certainly didn't help; it
> was such a mess.
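
To make the 2^n figure concrete for myself, here is a toy model in Python. It is not git's actual dir.c logic. It assumes, purely for illustration, that the old walk entered each subdirectory twice (once to classify it, once to collect its entries); all function names are made up.

```python
from collections import Counter


def visit_twice(depth, max_depth, visits):
    """Toy 'old' walk: every subdirectory is entered twice by its parent."""
    visits[depth] += 1
    if depth == max_depth:
        return
    visit_twice(depth + 1, max_depth, visits)  # pass 1: classify the child
    visit_twice(depth + 1, max_depth, visits)  # pass 2: collect its entries


def visit_once(depth, max_depth, visits):
    """Toy 'new' walk: every subdirectory is entered exactly once."""
    visits[depth] += 1
    if depth == max_depth:
        return
    visit_once(depth + 1, max_depth, visits)


def count_visits(walk, max_depth):
    """Return a Counter mapping directory depth to number of visits."""
    visits = Counter()
    walk(0, max_depth, visits)
    return visits


if __name__ == "__main__":
    for depth, count in sorted(count_visits(visit_twice, 4).items()):
        print(depth, count)
```

With max_depth = 4, the deepest directory is entered 16 times by visit_twice and once by visit_once.
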
> 
> But, interestingly, the fixed behavior here also depends pretty
> strongly on commit b9670c1f5e ("dir: fix checks on common prefix
> directory", 2019-12-19) which came months earlier.  This is
> particularly important in combination with the following comment from
> dir.h:
>     /**
>      * If set, recurse into a directory that looks like a Git directory.
>      * Otherwise it is shown as a directory.
>      */
>     DIR_NO_GITLINKS = 1<<3,
> which suggests that cmd_add() (which didn't set this flag) should have
> never been recursing into a Git-tracked directory.  In other words,
> this was a bug all along.
> 
> > Is this something that needs fixing?
> 
> I agree with brian elsewhere in this thread that not adding the file
> is correct.  However, two points:
> 
> * I would prefer to see a warning/error from git add when it doesn't
> add a path  (Any takers?  #leftoverbits maybe?)

Agreed. Maybe it should also return a non-zero exit code?

> * It is possible that one might want to be able to force the addition
> of files to an outer repository despite existing within a directory
> tracked by an inner git repository, perhaps with a double `--force`
> being passed to git-add (much like git-clean allows).  If so, that
> could be implemented via the addition of
>     dir.flags |= DIR_NO_GITLINKS;
> to cmd_add() when the double force is detected.

I think this would be good.
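
For what it's worth, here is a rough sketch of that idea, written in Python rather than git's C. Only DIR_NO_GITLINKS and its value come from the dir.h excerpt quoted above. The function names and the force_count parameter are hypothetical, and this describes the proposal, not what git add does today.

```python
DIR_NO_GITLINKS = 1 << 3  # value taken from the dir.h excerpt quoted above


def dir_flags_for_add(force_count, flags=0):
    """Hypothetical: set DIR_NO_GITLINKS only when --force was given twice."""
    if force_count >= 2:
        flags |= DIR_NO_GITLINKS
    return flags


def should_recurse(looks_like_git_dir, flags):
    """Follow the dir.h comment: a directory that looks like a Git
    directory is only recursed into when DIR_NO_GITLINKS is set."""
    if not looks_like_git_dir:
        return True
    return bool(flags & DIR_NO_GITLINKS)


if __name__ == "__main__":
    for force_count in (0, 1, 2):
        flags = dir_flags_for_add(force_count)
        print(force_count, should_recurse(True, flags))
```

With zero or one --force the nested repository is left alone; only the double force makes the walk descend into it.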

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |

