git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: "René Scharfe" <l.s.r@web.de>
Cc: Jessica Clarke <jrtc27@jrtc27.com>, git@vger.kernel.org
Subject: Re: [PATCH] apply: Avoid ambiguous pointer provenance for CHERI/Arm's Morello
Date: Fri, 07 Jan 2022 11:40:06 -0800	[thread overview]
Message-ID: <xmqq35lz9vxl.fsf@gitster.g> (raw)
In-Reply-To: <8739caad-aa3d-1f0f-b5dd-6174a8e059f6@web.de> (=?utf-8?Q?=22R?= =?utf-8?Q?en=C3=A9?= Scharfe"'s message of "Fri, 7 Jan 2022 13:16:53 +0100")

René Scharfe <l.s.r@web.de> writes:

>> I actually wonder if it results in code that is much easier to
>> follow if we did:
>>
>>  * Introduce an "enum apply_symlink_treatment" that has
>>    APPLY_SYMLINK_GOES_AWAY and APPLY_SYMLINK_IN_RESULT as its
>>    possible values;
>>
>>  * Make register_symlink_changes() and check_symlink_changes()
>>    work with "enum apply_symlink_treatment";
>>
>>  * (optional) stop using string_list() to store the symlink_changes;
>>    use strintmap and use strintmap_set() and strintmap_get() to
>>    access its entries, so that the ugly implementation detail
>>    (i.e. "the container type we use only has a (void *) field to
>>    store caller-supplied data, so we cast an integer and a pointer
>>    back and forth") can be safely hidden.
>>
> Or strsets -- we only need two.

True.

When we check a path, we make a single look-up of two bit in a
single hashtable but now we need two look-ups, but addition, removal
and renaming of a symlink would be rare enough to matter either way.

Will queue; thanks.

> --- >8 ---
> Subject: [PATCH] apply: use strsets to track symlinks
>
> Symlink changes are tracked in a string_list, with the util pointer
> value indicating whether a symlink is kept or removed.  Using fake
> pointer values requires awkward casts.  Use one strset for each type of
> change instead to simplify and shorten the code.
>
> Original-patch-by: Jessica Clarke <jrtc27@jrtc27.com>
> Signed-off-by: René Scharfe <l.s.r@web.de>
> ---
>  apply.c | 42 ++++++++----------------------------------
>  apply.h | 26 +++++++++++---------------
>  2 files changed, 19 insertions(+), 49 deletions(-)
>
> diff --git a/apply.c b/apply.c
> index fed195250b..7deb4f79fd 100644
> --- a/apply.c
> +++ b/apply.c
> @@ -103,7 +103,8 @@ int init_apply_state(struct apply_state *state,
>  	state->linenr = 1;
>  	string_list_init_nodup(&state->fn_table);
>  	string_list_init_nodup(&state->limit_by_name);
> -	string_list_init_nodup(&state->symlink_changes);
> +	strset_init(&state->removed_symlinks);
> +	strset_init(&state->kept_symlinks);
>  	strbuf_init(&state->root, 0);
>
>  	git_apply_config();
> @@ -117,7 +118,8 @@ int init_apply_state(struct apply_state *state,
>  void clear_apply_state(struct apply_state *state)
>  {
>  	string_list_clear(&state->limit_by_name, 0);
> -	string_list_clear(&state->symlink_changes, 0);
> +	strset_clear(&state->removed_symlinks);
> +	strset_clear(&state->kept_symlinks);
>  	strbuf_release(&state->root);
>
>  	/* &state->fn_table is cleared at the end of apply_patch() */
> @@ -3812,59 +3814,31 @@ static int check_to_create(struct apply_state *state,
>  	return 0;
>  }
>
> -static uintptr_t register_symlink_changes(struct apply_state *state,
> -					  const char *path,
> -					  uintptr_t what)
> -{
> -	struct string_list_item *ent;
> -
> -	ent = string_list_lookup(&state->symlink_changes, path);
> -	if (!ent) {
> -		ent = string_list_insert(&state->symlink_changes, path);
> -		ent->util = (void *)0;
> -	}
> -	ent->util = (void *)(what | ((uintptr_t)ent->util));
> -	return (uintptr_t)ent->util;
> -}
> -
> -static uintptr_t check_symlink_changes(struct apply_state *state, const char *path)
> -{
> -	struct string_list_item *ent;
> -
> -	ent = string_list_lookup(&state->symlink_changes, path);
> -	if (!ent)
> -		return 0;
> -	return (uintptr_t)ent->util;
> -}
> -
>  static void prepare_symlink_changes(struct apply_state *state, struct patch *patch)
>  {
>  	for ( ; patch; patch = patch->next) {
>  		if ((patch->old_name && S_ISLNK(patch->old_mode)) &&
>  		    (patch->is_rename || patch->is_delete))
>  			/* the symlink at patch->old_name is removed */
> -			register_symlink_changes(state, patch->old_name, APPLY_SYMLINK_GOES_AWAY);
> +			strset_add(&state->removed_symlinks, patch->old_name);
>
>  		if (patch->new_name && S_ISLNK(patch->new_mode))
>  			/* the symlink at patch->new_name is created or remains */
> -			register_symlink_changes(state, patch->new_name, APPLY_SYMLINK_IN_RESULT);
> +			strset_add(&state->kept_symlinks, patch->new_name);
>  	}
>  }
>
>  static int path_is_beyond_symlink_1(struct apply_state *state, struct strbuf *name)
>  {
>  	do {
> -		unsigned int change;
> -
>  		while (--name->len && name->buf[name->len] != '/')
>  			; /* scan backwards */
>  		if (!name->len)
>  			break;
>  		name->buf[name->len] = '\0';
> -		change = check_symlink_changes(state, name->buf);
> -		if (change & APPLY_SYMLINK_IN_RESULT)
> +		if (strset_contains(&state->kept_symlinks, name->buf))
>  			return 1;
> -		if (change & APPLY_SYMLINK_GOES_AWAY)
> +		if (strset_contains(&state->removed_symlinks, name->buf))
>  			/*
>  			 * This cannot be "return 0", because we may
>  			 * see a new one created at a higher level.
> diff --git a/apply.h b/apply.h
> index 16202da160..4052da50c0 100644
> --- a/apply.h
> +++ b/apply.h
> @@ -4,6 +4,7 @@
>  #include "hash.h"
>  #include "lockfile.h"
>  #include "string-list.h"
> +#include "strmap.h"
>
>  struct repository;
>
> @@ -25,20 +26,6 @@ enum apply_verbosity {
>  	verbosity_verbose = 1
>  };
>
> -/*
> - * We need to keep track of how symlinks in the preimage are
> - * manipulated by the patches.  A patch to add a/b/c where a/b
> - * is a symlink should not be allowed to affect the directory
> - * the symlink points at, but if the same patch removes a/b,
> - * it is perfectly fine, as the patch removes a/b to make room
> - * to create a directory a/b so that a/b/c can be created.
> - *
> - * See also "struct string_list symlink_changes" in "struct
> - * apply_state".
> - */
> -#define APPLY_SYMLINK_GOES_AWAY 01
> -#define APPLY_SYMLINK_IN_RESULT 02
> -
>  struct apply_state {
>  	const char *prefix;
>
> @@ -86,7 +73,16 @@ struct apply_state {
>
>  	/* Various "current state" */
>  	int linenr; /* current line number */
> -	struct string_list symlink_changes; /* we have to track symlinks */
> +	/*
> +	 * We need to keep track of how symlinks in the preimage are
> +	 * manipulated by the patches.  A patch to add a/b/c where a/b
> +	 * is a symlink should not be allowed to affect the directory
> +	 * the symlink points at, but if the same patch removes a/b,
> +	 * it is perfectly fine, as the patch removes a/b to make room
> +	 * to create a directory a/b so that a/b/c can be created.
> +	 */
> +	struct strset removed_symlinks;
> +	struct strset kept_symlinks;
>
>  	/*
>  	 * For "diff-stat" like behaviour, we keep track of the biggest change
> --
> 2.34.1

  parent reply	other threads:[~2022-01-07 19:40 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-01-05 13:23 [PATCH] apply: Avoid ambiguous pointer provenance for CHERI/Arm's Morello Jessica Clarke
2022-01-05 16:39 ` Konstantin Khomoutov
2022-01-05 16:40   ` Jessica Clarke
2022-01-06 22:50 ` Taylor Blau
2022-01-06 22:57   ` Jessica Clarke
2022-01-06 22:53 ` Junio C Hamano
2022-01-06 23:02   ` Jessica Clarke
2022-01-06 23:41     ` Junio C Hamano
2022-01-07 12:16   ` René Scharfe
2022-01-07 13:00     ` Jessica Clarke
2022-01-07 19:40     ` Junio C Hamano [this message]
2022-01-08  0:04       ` René Scharfe
2022-01-08  0:51         ` Junio C Hamano
2022-01-07 23:25     ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqq35lz9vxl.fsf@gitster.g \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=jrtc27@jrtc27.com \
    --cc=l.s.r@web.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).