All of lore.kernel.org
 help / color / mirror / Atom feed
From: Elijah Newren <newren@gmail.com>
To: Derrick Stolee <stolee@gmail.com>
Cc: Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>,
	Git Mailing List <git@vger.kernel.org>,
	Derrick Stolee <derrickstolee@github.com>,
	Derrick Stolee <dstolee@microsoft.com>
Subject: Re: [PATCH 4/9] repository: add repo reference to index_state
Date: Wed, 20 Jan 2021 11:50:10 -0800	[thread overview]
Message-ID: <CABPp-BE7wg9YBvFidrQaB3EP+zNb3u9W9_VRhWFdsS85vuYP5A@mail.gmail.com> (raw)
In-Reply-To: <f790db9c-0c6d-3661-93d9-0339e22c12ff@gmail.com>

On Wed, Jan 20, 2021 at 11:16 AM Derrick Stolee <stolee@gmail.com> wrote:
>
> On 1/20/2021 12:46 PM, Elijah Newren wrote:
> > On Wed, Jan 20, 2021 at 8:54 AM Derrick Stolee via GitGitGadget
> > <gitgitgadget@gmail.com> wrote:
> >>
> >> From: Derrick Stolee <dstolee@microsoft.com>
> >>
> >> It will be helpful to add behavior to index opertations that might
> >
> > s/opertations/operations/
>
> Thanks.
>
> >> trigger an object lookup. Since each index belongs to a specific
> >> repository, add a 'repo' pointer to struct index_state that allows
> >> access to this repository.
> >>
> >> This will prevent future changes from needing to pass an additional
> >> 'struct repository *repo' parameter and instead rely only on the 'struct
> >> index_state *istate' parameter.
> >>
> >> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> >> ---
> >>  cache.h      | 1 +
> >>  repository.c | 4 ++++
> >>  2 files changed, 5 insertions(+)
> >>
> >> diff --git a/cache.h b/cache.h
> >> index 71097657489..f9c7a603841 100644
> >> --- a/cache.h
> >> +++ b/cache.h
> >> @@ -328,6 +328,7 @@ struct index_state {
> >>         struct ewah_bitmap *fsmonitor_dirty;
> >>         struct mem_pool *ce_mem_pool;
> >>         struct progress *progress;
> >> +       struct repository *repo;
> >>  };
> >>
> >>  /* Name hashing */
> >> diff --git a/repository.c b/repository.c
> >> index a4174ddb062..67a4c1da2d9 100644
> >> --- a/repository.c
> >> +++ b/repository.c
> >> @@ -264,6 +264,10 @@ int repo_read_index(struct repository *repo)
> >>         if (!repo->index)
> >>                 repo->index = xcalloc(1, sizeof(*repo->index));
> >>
> >> +       /* Complete the double-reference */
> >> +       if (!repo->index->repo)
> >> +               repo->index->repo = repo;
> >> +
> >>         return read_index_from(repo->index, repo->index_file, repo->gitdir);
> >>  }
> >>
> >> --
> >> gitgitgadget
> >
> > Since we have repo->index and we have index->repo, which are intended
> > to be circular...what if they aren't?  Do we want or need to add
> > assertions anywhere that repo == repo->index->repo or that index ==
> > index->repo->index ?
>
> Here, we are pairing them together and the loop is complete. I don't
> view that as a permanent thing. This only initializes istate->repo
> when we are parsing an index from a file, but not when we create one
> in memory.
>
> I imagine it will be likely in some cases to have multiple index_state
> instances for a single repository. However, having the pointer "this
> index belongs to this repository" seems helpful (to me).
>
> > My initial implementations of --remerge-diff[1] played around with
> > creating a second repo, with a different primary object store but
> > everything else the same.  The index for the two repository objects
> > was thus the same, and thus clearly would have violated this assumed
> > invariant for one of the two repos.  I discarded that initial
> > implementation (which I didn't quite have working) because I
> > discovered tmp-objdir.h and was able to add some
> > tmp_objdir_make_primary() and tmp_objdir_remove_as_primary() functions
> > that merely altered the existing repo's primary object store, but I'm
> > curious if there might be other cases of folks doing stuff that might
> > have weird failures with this new invariant.
>
> This is an interesting concept, and definitely violates my expectations
> that an index belongs to only one repository. I'd need to know more
> about why this was a good design decision before being convinced that
> the relationship should not be many-to-one (index-to-repo).

I'm not sure what I did was a good design decision; I was kind of
exploring and trying to figure things out.  In retrospect, I think it
was probably a bad idea.  But we have various guard rails in the form
of BUG() calls and such when basic assumptions are violated, and here
it seems that you are now making a new basic assumption that an index
belongs to only one repository.  (Even if all current callers happen
to satisfy that assumption, it's not clear to me that git previously
cared if this condition were satisfied or not).  Hence my question
about safety checks.

> > It's entirely possible that --remerge-diff was just so different, and
> > I was so unfamiliar with repo objects (and still kind of am) that I
> > was just doing weird stuff no one has done before, so perhaps no
> > additional checks are needed -- I'm just throwing my gut question out
> > there as food for thought.
> >
> > [1] I have not yet submitted `--remerge-diff` to the list; you haven't
> > missed anything.  I'm waiting for merge-ort to be submitted, reviewed,
> > and merged first.  It's the remerge-diff branch in my fork on GitHub
> > if anyone is curious, though.
>
> I'm interested in what others might say about this idea. I'd be able
> to do most of what I want to do without this patch, but it just gets
> a lot messier. (istate->repo is used in the very next patch in a way
> that would be less clean without it.)

I'm less concerned with your patch as-is (I think your assumption
seems reasonable and I'm fine with labelling my former unsubmitted
patches as erroneous), and more wondering whether others in the future
will accidentally violate assumptions your patch starts encoding...and
whether we can or should do anything about it.  If there's a simple
place we can add a check for such an error, then it probably makes
sense to add one.  If there isn't...then at least we considered it?

  reply	other threads:[~2021-01-20 19:51 UTC|newest]

Thread overview: 65+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-20 16:53 [PATCH 0/9] More index cleanups Derrick Stolee via GitGitGadget
2021-01-20 16:53 ` [PATCH 1/9] cache-tree: clean up cache_tree_update() Derrick Stolee via GitGitGadget
2021-01-20 17:21   ` Elijah Newren
2021-01-20 19:10     ` Derrick Stolee
2021-01-20 16:53 ` [PATCH 2/9] cache-tree: extract subtree_pos() Derrick Stolee via GitGitGadget
2021-01-20 17:23   ` Elijah Newren
2021-01-20 16:53 ` [PATCH 3/9] fsmonitor: de-duplicate BUG()s around dirty bits Derrick Stolee via GitGitGadget
2021-01-20 17:26   ` Elijah Newren
2021-01-21 12:53   ` Chris Torek
2021-01-21 15:56     ` Derrick Stolee
2021-01-20 16:53 ` [PATCH 4/9] repository: add repo reference to index_state Derrick Stolee via GitGitGadget
2021-01-20 17:46   ` Elijah Newren
2021-01-20 19:16     ` Derrick Stolee
2021-01-20 19:50       ` Elijah Newren [this message]
2021-01-20 16:53 ` [PATCH 5/9] name-hash: use trace2 regions for init Derrick Stolee via GitGitGadget
2021-01-20 17:47   ` Elijah Newren
2021-01-20 16:53 ` [PATCH 6/9] sparse-checkout: load sparse-checkout patterns Derrick Stolee via GitGitGadget
2021-01-20 17:54   ` Elijah Newren
2021-01-20 16:53 ` [PATCH 7/9] sparse-checkout: hold pattern list in index Derrick Stolee via GitGitGadget
2021-01-20 18:03   ` Elijah Newren
2021-01-20 19:22     ` Derrick Stolee
2021-01-20 16:53 ` [PATCH 8/9] test-lib: test_region looks for trace2 regions Derrick Stolee via GitGitGadget
2021-01-20 18:20   ` Elijah Newren
2021-01-20 19:24     ` Derrick Stolee
2021-01-20 16:53 ` [PATCH 9/9] t1092: test interesting sparse-checkout scenarios Derrick Stolee via GitGitGadget
2021-01-20 19:40   ` Elijah Newren
2021-01-21 11:59     ` Derrick Stolee
2021-01-22 13:03 ` [PATCH v2 0/8] More index cleanups Derrick Stolee via GitGitGadget
2021-01-22 13:03   ` [PATCH v2 1/8] cache-tree: clean up cache_tree_update() Derrick Stolee via GitGitGadget
2021-01-22 19:11     ` Junio C Hamano
2021-01-22 13:03   ` [PATCH v2 2/8] cache-tree: extract subtree_pos() Derrick Stolee via GitGitGadget
2021-01-22 13:03   ` [PATCH v2 3/8] fsmonitor: de-duplicate BUG()s around dirty bits Derrick Stolee via GitGitGadget
2021-01-22 19:18     ` Junio C Hamano
2021-01-22 13:03   ` [PATCH v2 4/8] repository: add repo reference to index_state Derrick Stolee via GitGitGadget
2021-01-22 19:23     ` Junio C Hamano
2021-01-22 13:03   ` [PATCH v2 5/8] name-hash: use trace2 regions for init Derrick Stolee via GitGitGadget
2021-01-22 13:03   ` [PATCH v2 6/8] sparse-checkout: load sparse-checkout patterns Derrick Stolee via GitGitGadget
2021-01-22 13:03   ` [PATCH v2 7/8] test-lib: test_region looks for trace2 regions Derrick Stolee via GitGitGadget
2021-01-22 19:42     ` Junio C Hamano
2021-01-23 18:36       ` Derrick Stolee
2021-01-23 18:50         ` Junio C Hamano
2021-01-22 13:03   ` [PATCH v2 8/8] t1092: test interesting sparse-checkout scenarios Derrick Stolee via GitGitGadget
2021-01-22 19:49   ` [PATCH v2 0/8] More index cleanups Elijah Newren
2021-01-23 18:47     ` Derrick Stolee
2021-01-23 19:58   ` [PATCH v3 0/9] " Derrick Stolee via GitGitGadget
2021-01-23 19:58     ` [PATCH v3 1/9] cache-tree: clean up cache_tree_update() Derrick Stolee via GitGitGadget
2021-01-23 19:58     ` [PATCH v3 2/9] cache-tree: simplify verify_cache() prototype Derrick Stolee via GitGitGadget
2021-01-23 20:24       ` Elijah Newren
2021-01-23 21:02         ` Derrick Stolee
2021-01-23 21:10           ` Elijah Newren
2021-01-23 21:41           ` Junio C Hamano
2021-01-23 21:10         ` Junio C Hamano
2021-01-23 21:14           ` Derrick Stolee
2021-01-23 19:58     ` [PATCH v3 3/9] cache-tree: extract subtree_pos() Derrick Stolee via GitGitGadget
2021-01-23 19:58     ` [PATCH v3 4/9] fsmonitor: de-duplicate BUG()s around dirty bits Derrick Stolee via GitGitGadget
2021-01-23 19:58     ` [PATCH v3 5/9] repository: add repo reference to index_state Derrick Stolee via GitGitGadget
2021-01-23 19:58     ` [PATCH v3 6/9] name-hash: use trace2 regions for init Derrick Stolee via GitGitGadget
2021-01-23 19:58     ` [PATCH v3 7/9] sparse-checkout: load sparse-checkout patterns Derrick Stolee via GitGitGadget
2021-01-23 19:58     ` [PATCH v3 8/9] test-lib: test_region looks for trace2 regions Derrick Stolee via GitGitGadget
2021-01-23 21:07       ` Derrick Stolee
2021-01-23 19:58     ` [PATCH v3 9/9] t1092: test interesting sparse-checkout scenarios Derrick Stolee via GitGitGadget
2021-01-25 18:45       ` Elijah Newren
2021-01-23 20:29     ` [PATCH v3 0/9] More index cleanups Elijah Newren
2021-01-23 21:05       ` Derrick Stolee
2021-01-23 21:42         ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CABPp-BE7wg9YBvFidrQaB3EP+zNb3u9W9_VRhWFdsS85vuYP5A@mail.gmail.com \
    --to=newren@gmail.com \
    --cc=derrickstolee@github.com \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=stolee@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.