git.vger.kernel.org archive mirror
From: Derrick Stolee <stolee@gmail.com>
To: Taylor Blau <me@ttaylorr.com>,
	Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, Derrick Stolee <derrickstolee@github.com>,
	Derrick Stolee <dstolee@microsoft.com>
Subject: Re: [PATCH] commit-graph: ignore duplicates when merging layers
Date: Thu, 8 Oct 2020 10:29:13 -0400
Message-ID: <3571945a-5cac-15fa-773f-2eeb06dc0f6a@gmail.com>
In-Reply-To: <20201008141527.GA351725@nand.local>

On 10/8/2020 10:15 AM, Taylor Blau wrote:
> On Thu, Oct 08, 2020 at 01:56:52PM +0000, Derrick Stolee via GitGitGadget wrote:
>> Thus, this die() is too aggignoring the duplicates.
> 
> s/aggignoring/aggressively ignoring ?
> 
>>
>> This leads to some additional complication that we did no have before:
> 
> s/no/not, but I am more wondering about what "This" is. I think what
> you're saying is: "Suppose we didn't die on duplicates, what would
> happen? Well, there'd be some additional problems, but here's a way that
> we can fix them (storing the de-duplicated OIDs separately)".

Thanks. The message will be edited to fix these brain farts.

>>     I still don't have a grasp on how this happened in the first place, but
>>     will keep looking.
> 
> I'm looking as well, but I haven't found any smoking guns yet. I could
> imagine that this is a problem that existed before 0bd52e27e3
> (commit-graph.h: store an odb in 'struct write_commit_graph_context',
> 2020-02-03), and simply couldn't be tickled because of how brittle
> comparing ODB paths is. I could equally imagine that 0bd52e27e3 did
> introduce this problem.

Thanks.

>> +	ALLOC_ARRAY(deduped_commits.list, deduped_commits.alloc);
> 
> I'm not sure that this deduped_commits list is actually necessary.
> 
> It would be nice for this caller if ctx->commits were a linked list
> since it would make deleting duplicates easy, but I think that it would
> be a burden for other callers. So, that's a dead end.
> 
> But what about marking the duplicate positions by NULL-ing them out, and
> then taking another pass over the list to (1) compact it (i.e., push
> everything down so that all of the NULLs occur at the end), and then (2)
> truncate the length to the number of unique commits.
> 
> I could imagine that something like that is a little trickier, but it
> seems worth it to avoid doubling the memory cost of this function.

You are correct that we can just re-use the commits.list by "collapsing"
the list on top of duplicate entries. I'll send a new version that does
exactly that.
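
A minimal sketch of such an in-place collapse, for illustration only. It
assumes the list is already sorted by object ID so that duplicates sit next
to each other. The names demo_commit, demo_commit_list, and
collapse_duplicates are hypothetical stand-ins, not git's actual structures
or the code from the later patch versions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_OID_LEN 20

/* Hypothetical stand-in for a commit; only the object ID matters here. */
struct demo_commit {
	unsigned char oid[DEMO_OID_LEN];
};

/* Hypothetical stand-in for a list of commit pointers. */
struct demo_commit_list {
	struct demo_commit **list;
	size_t nr;
};

/*
 * Collapse a list that is sorted by oid so that each oid appears once.
 * The first entry of every run of equal oids is kept; later entries are
 * overwritten as unique entries slide down. No second array is allocated.
 * Returns the number of duplicate entries that were dropped.
 */
static size_t collapse_duplicates(struct demo_commit_list *commits)
{
	size_t read, write = 0, dropped;

	assert(commits);
	if (!commits->nr)
		return 0;

	for (read = 1; read < commits->nr; read++) {
		/* Same oid as the last kept entry: skip it. */
		if (!memcmp(commits->list[write]->oid,
			    commits->list[read]->oid, DEMO_OID_LEN))
			continue;
		/* New oid: move it into the next free slot. */
		commits->list[++write] = commits->list[read];
	}

	dropped = commits->nr - (write + 1);
	commits->nr = write + 1; /* truncate to the unique entries */
	return dropped;
}
```

Compared with NULL-ing out duplicates and compacting in a second pass, this
variant does both steps in one loop, at the cost of relying on the sorted
order to detect duplicates.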

Thanks,
-Stolee

Thread overview: 14+ messages
2020-10-08 13:56 Derrick Stolee via GitGitGadget
2020-10-08 14:15 ` Taylor Blau
2020-10-08 14:29   ` Derrick Stolee [this message]
2020-10-08 14:59 ` [PATCH v2] " Derrick Stolee via GitGitGadget
2020-10-08 15:04   ` [PATCH v3] " Derrick Stolee via GitGitGadget
2020-10-08 15:53     ` Jeff King
2020-10-08 16:26       ` Derrick Stolee
2020-10-08 16:42         ` Taylor Blau
2020-10-08 16:43         ` Jeff King
2020-10-09 20:53     ` [PATCH v4 0/2] " Derrick Stolee via GitGitGadget
2020-10-09 20:53       ` [PATCH v4 1/2] " Derrick Stolee via GitGitGadget
2020-10-09 20:53       ` [PATCH v4 2/2] commit-graph: don't write commit-graph when disabled Derrick Stolee via GitGitGadget
2020-10-09 21:12         ` Junio C Hamano
2020-10-09 21:17         ` Taylor Blau