All of lore.kernel.org
 help / color / mirror / Atom feed
From: Derrick Stolee <stolee@gmail.com>
To: Elijah Newren <newren@gmail.com>, git@vger.kernel.org
Subject: Re: [PATCH v2 09/20] merge-ort: record stage and auxiliary info for every path
Date: Wed, 11 Nov 2020 10:26:43 -0500	[thread overview]
Message-ID: <94bf9b69-5d13-c914-fb1a-bce912018a63@gmail.com> (raw)
In-Reply-To: <20201102204344.342633-10-newren@gmail.com>

On 11/2/2020 3:43 PM, Elijah Newren wrote:
> +static void setup_path_info(struct merge_options *opt,
> +			    struct string_list_item *result,
> +			    const char *current_dir_name,
> +			    int current_dir_name_len,
> +			    char *fullpath, /* we'll take over ownership */
> +			    struct name_entry *names,
> +			    struct name_entry *merged_version,
> +			    unsigned is_null,     /* boolean */
> +			    unsigned df_conflict, /* boolean */
> +			    unsigned filemask,
> +			    unsigned dirmask,
> +			    int resolved          /* boolean */)
> +{
> +	struct conflict_info *path_info;

In addition to my concerns below about 'conflict_info' versus
'merged_info', I was doubly confused that 'result' in the parameter
list is given a variable named 'pi' for "path info" and result->util
eventually is equal to this path_info. What if we renamed 'result'
to 'pi' for "path info" here, then operated on 'pi->util' in this
method?

> +	path_info = xcalloc(1, resolved ? sizeof(struct merged_info) :
> +					  sizeof(struct conflict_info));

Hm. I'm happy to have a `struct merged_info *` pointing to a
`struct conflict_info`, but the opposite seems very dangerous.
Perhaps we should always use sizeof(struct conflict_info)?

We can use path_info->merged.clean to detect whether the rest of
the data is worth looking at. (Or, in your case, whether or not
it is allocated.)

I imagine that in a large repo we will need many of these structs,
but very few of them will actually need to be conflicts, so using
'struct conflict_info' always will lead to memory bloat. But in
that case, would we not be better off with an array instead of a
scattering of data across the heap?

Perhaps 'struct conflict_info' shouldn't contain a 'struct merged_info'
and instead be just the "extra" data. Then we could have a contiguous
array of 'struct merged_info' values for most of the paths, but heap
pointers for 'struct conflict_info' as necessary.

It's also true that I haven't fully formed a mental model for how these
are used in your algorithm, so I'll keep reading.

> +	path_info->merged.directory_name = current_dir_name;
> +	path_info->merged.basename_offset = current_dir_name_len;
> +	path_info->merged.clean = !!resolved;
> +	if (resolved) {
> +		path_info->merged.result.mode = merged_version->mode;
> +		oidcpy(&path_info->merged.result.oid, &merged_version->oid);
> +		path_info->merged.is_null = !!is_null;
> +	} else {
> +		int i;
> +
> +		for (i = 0; i < 3; i++) {
> +			path_info->pathnames[i] = fullpath;
> +			path_info->stages[i].mode = names[i].mode;
> +			oidcpy(&path_info->stages[i].oid, &names[i].oid);
> +		}
> +		path_info->filemask = filemask;
> +		path_info->dirmask = dirmask;
> +		path_info->df_conflict = !!df_conflict;
> +	}
> +	strmap_put(&opt->priv->paths, fullpath, path_info);
> +	result->string = fullpath;
> +	result->util = path_info;

This is set in all cases, so should we use it everywhere? Naturally,
there might be a cost to the extra pointer indirection, so maybe we
create a 'struct conflict_info *util' to operate on during this
method, but set 'result->util = util' right after allocating so we
know how it should behave?

> @@ -91,10 +136,12 @@ static int collect_merge_info_callback(int n,
>  	 */
>  	struct merge_options *opt = info->data;
>  	struct merge_options_internal *opti = opt->priv;
> -	struct conflict_info *ci;
> +	struct string_list_item pi;  /* Path Info */
> +	struct conflict_info *ci; /* pi.util when there's a conflict */

...

> +	setup_path_info(opt, &pi, dirname, info->pathlen, fullpath,
> +			names, NULL, 0, df_conflict, filemask, dirmask, 0);
> +	ci = pi.util;

Here is the use of 'pi' that I was talking about earlier.

Thanks,
-Stolee

  parent reply	other threads:[~2020-11-11 15:26 UTC|newest]

Thread overview: 84+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-02 20:43 [PATCH v2 00/20] fundamentals of merge-ort implementation Elijah Newren
2020-11-02 20:43 ` [PATCH v2 01/20] merge-ort: setup basic internal data structures Elijah Newren
2020-11-06 22:05   ` Jonathan Tan
2020-11-06 22:45     ` Elijah Newren
2020-11-09 20:55       ` Jonathan Tan
2020-11-02 20:43 ` [PATCH v2 02/20] merge-ort: add some high-level algorithm structure Elijah Newren
2020-11-02 20:43 ` [PATCH v2 03/20] merge-ort: port merge_start() from merge-recursive Elijah Newren
2020-11-11 13:52   ` Derrick Stolee
2020-11-11 16:22     ` Elijah Newren
2020-11-02 20:43 ` [PATCH v2 04/20] merge-ort: use histogram diff Elijah Newren
2020-11-11 13:54   ` Derrick Stolee
2020-11-11 16:47     ` Elijah Newren
2020-11-11 16:51       ` Derrick Stolee
2020-11-11 17:03         ` Elijah Newren
2020-11-02 20:43 ` [PATCH v2 05/20] merge-ort: add an err() function similar to one from merge-recursive Elijah Newren
2020-11-11 13:58   ` Derrick Stolee
2020-11-11 17:07     ` Elijah Newren
2020-11-11 17:10       ` Derrick Stolee
2020-11-02 20:43 ` [PATCH v2 06/20] merge-ort: implement a very basic collect_merge_info() Elijah Newren
2020-11-06 22:19   ` Jonathan Tan
2020-11-06 23:10     ` Elijah Newren
2020-11-09 20:59       ` Jonathan Tan
2020-11-11 14:38   ` Derrick Stolee
2020-11-11 17:02     ` Elijah Newren
2020-11-02 20:43 ` [PATCH v2 07/20] merge-ort: avoid repeating fill_tree_descriptor() on the same tree Elijah Newren
2020-11-11 14:51   ` Derrick Stolee
2020-11-11 17:13     ` Elijah Newren
2020-11-11 17:21       ` Eric Sunshine
2020-11-02 20:43 ` [PATCH v2 08/20] merge-ort: compute a few more useful fields for collect_merge_info Elijah Newren
2020-11-06 22:52   ` Jonathan Tan
2020-11-06 23:41     ` Elijah Newren
2020-11-09 22:04       ` Jonathan Tan
2020-11-09 23:05         ` Elijah Newren
2020-11-02 20:43 ` [PATCH v2 09/20] merge-ort: record stage and auxiliary info for every path Elijah Newren
2020-11-06 22:58   ` Jonathan Tan
2020-11-07  0:26     ` Elijah Newren
2020-11-09 22:09       ` Jonathan Tan
2020-11-09 23:08         ` Elijah Newren
2020-11-11 15:26   ` Derrick Stolee [this message]
2020-11-11 18:16     ` Elijah Newren
2020-11-11 22:06       ` Elijah Newren
2020-11-12 18:23         ` Derrick Stolee
2020-11-12 18:39       ` Derrick Stolee
2020-11-02 20:43 ` [PATCH v2 10/20] merge-ort: avoid recursing into identical trees Elijah Newren
2020-11-11 15:31   ` Derrick Stolee
2020-11-02 20:43 ` [PATCH v2 11/20] merge-ort: add a preliminary simple process_entries() implementation Elijah Newren
2020-11-11 19:51   ` Jonathan Tan
2020-11-12  1:48     ` Elijah Newren
2020-11-02 20:43 ` [PATCH v2 12/20] merge-ort: have process_entries operate in a defined order Elijah Newren
2020-11-11 16:09   ` Derrick Stolee
2020-11-11 18:58     ` Elijah Newren
2020-11-02 20:43 ` [PATCH v2 13/20] merge-ort: step 1 of tree writing -- record basenames, modes, and oids Elijah Newren
2020-11-11 20:01   ` Jonathan Tan
2020-11-11 20:24     ` Elijah Newren
2020-11-12 20:39       ` Jonathan Tan
2020-11-02 20:43 ` [PATCH v2 14/20] merge-ort: step 2 of tree writing -- function to create tree object Elijah Newren
2020-11-11 20:47   ` Jonathan Tan
2020-11-11 21:21     ` Elijah Newren
2020-11-02 20:43 ` [PATCH v2 15/20] merge-ort: step 3 of tree writing -- handling subdirectories as we go Elijah Newren
2020-11-12 20:15   ` Jonathan Tan
2020-11-12 22:30     ` Elijah Newren
2020-11-24 20:19       ` Elijah Newren
2020-11-25  2:07         ` Jonathan Tan
2020-11-26 18:13           ` Elijah Newren
2020-11-30 18:41             ` Jonathan Tan
2020-11-02 20:43 ` [PATCH v2 16/20] merge-ort: basic outline for merge_switch_to_result() Elijah Newren
2020-11-02 20:43 ` [PATCH v2 17/20] merge-ort: add implementation of checkout() Elijah Newren
2020-11-02 20:43 ` [PATCH v2 18/20] tree: enable cmp_cache_name_compare() to be used elsewhere Elijah Newren
2020-11-02 20:43 ` [PATCH v2 19/20] merge-ort: add implementation of record_unmerged_index_entries() Elijah Newren
2020-11-02 20:43 ` [PATCH v2 20/20] merge-ort: free data structures in merge_finalize() Elijah Newren
2020-11-03 14:49 ` [PATCH v2 00/20] fundamentals of merge-ort implementation Derrick Stolee
2020-11-03 16:36   ` Elijah Newren
2020-11-07  6:06     ` Elijah Newren
2020-11-07 15:02       ` Derrick Stolee
2020-11-07 19:39         ` Elijah Newren
2020-11-09 12:30           ` Derrick Stolee
2020-11-09 17:13             ` Elijah Newren
2020-11-09 19:51               ` Derrick Stolee
2020-11-09 22:44                 ` Elijah Newren
2020-11-11 17:08 ` Derrick Stolee
2020-11-11 18:35   ` Elijah Newren
2020-11-11 20:48     ` Derrick Stolee
2020-11-11 21:18       ` Elijah Newren
2020-11-29  7:43 [PATCH " Elijah Newren via GitGitGadget
2020-12-04 20:47 ` [PATCH v2 " Elijah Newren via GitGitGadget
2020-12-04 20:47   ` [PATCH v2 09/20] merge-ort: record stage and auxiliary info for every path Elijah Newren via GitGitGadget

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=94bf9b69-5d13-c914-fb1a-bce912018a63@gmail.com \
    --to=stolee@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=newren@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.