All of lore.kernel.org
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: "René Scharfe" <l.s.r@web.de>
Cc: Jessica Clarke <jrtc27@jrtc27.com>, git@vger.kernel.org
Subject: Re: [PATCH] apply: Avoid ambiguous pointer provenance for CHERI/Arm's Morello
Date: Fri, 07 Jan 2022 11:40:06 -0800	[thread overview]
Message-ID: <xmqq35lz9vxl.fsf@gitster.g> (raw)
In-Reply-To: <8739caad-aa3d-1f0f-b5dd-6174a8e059f6@web.de> (=?utf-8?Q?=22R?= =?utf-8?Q?en=C3=A9?= Scharfe"'s message of "Fri, 7 Jan 2022 13:16:53 +0100")

René Scharfe <l.s.r@web.de> writes:

>> I actually wonder if it results in code that is much easier to
>> follow if we did:
>>
>>  * Introduce an "enum apply_symlink_treatment" that has
>>    APPLY_SYMLINK_GOES_AWAY and APPLY_SYMLINK_IN_RESULT as its
>>    possible values;
>>
>>  * Make register_symlink_changes() and check_symlink_changes()
>>    work with "enum apply_symlink_treatment";
>>
>>  * (optional) stop using string_list() to store the symlink_changes;
>>    use strintmap and use strintmap_set() and strintmap_get() to
>>    access its entries, so that the ugly implementation detail
>>    (i.e. "the container type we use only has a (void *) field to
>>    store caller-supplied data, so we cast an integer and a pointer
>>    back and forth") can be safely hidden.
>>
> Or strsets -- we only need two.

True.

When we check a path, we make a single look-up of two bit in a
single hashtable but now we need two look-ups, but addition, removal
and renaming of a symlink would be rare enough to matter either way.

Will queue; thanks.

> --- >8 ---
> Subject: [PATCH] apply: use strsets to track symlinks
>
> Symlink changes are tracked in a string_list, with the util pointer
> value indicating whether a symlink is kept or removed.  Using fake
> pointer values requires awkward casts.  Use one strset for each type of
> change instead to simplify and shorten the code.
>
> Original-patch-by: Jessica Clarke <jrtc27@jrtc27.com>
> Signed-off-by: René Scharfe <l.s.r@web.de>
> ---
>  apply.c | 42 ++++++++----------------------------------
>  apply.h | 26 +++++++++++---------------
>  2 files changed, 19 insertions(+), 49 deletions(-)
>
> diff --git a/apply.c b/apply.c
> index fed195250b..7deb4f79fd 100644
> --- a/apply.c
> +++ b/apply.c
> @@ -103,7 +103,8 @@ int init_apply_state(struct apply_state *state,
>  	state->linenr = 1;
>  	string_list_init_nodup(&state->fn_table);
>  	string_list_init_nodup(&state->limit_by_name);
> -	string_list_init_nodup(&state->symlink_changes);
> +	strset_init(&state->removed_symlinks);
> +	strset_init(&state->kept_symlinks);
>  	strbuf_init(&state->root, 0);
>
>  	git_apply_config();
> @@ -117,7 +118,8 @@ int init_apply_state(struct apply_state *state,
>  void clear_apply_state(struct apply_state *state)
>  {
>  	string_list_clear(&state->limit_by_name, 0);
> -	string_list_clear(&state->symlink_changes, 0);
> +	strset_clear(&state->removed_symlinks);
> +	strset_clear(&state->kept_symlinks);
>  	strbuf_release(&state->root);
>
>  	/* &state->fn_table is cleared at the end of apply_patch() */
> @@ -3812,59 +3814,31 @@ static int check_to_create(struct apply_state *state,
>  	return 0;
>  }
>
> -static uintptr_t register_symlink_changes(struct apply_state *state,
> -					  const char *path,
> -					  uintptr_t what)
> -{
> -	struct string_list_item *ent;
> -
> -	ent = string_list_lookup(&state->symlink_changes, path);
> -	if (!ent) {
> -		ent = string_list_insert(&state->symlink_changes, path);
> -		ent->util = (void *)0;
> -	}
> -	ent->util = (void *)(what | ((uintptr_t)ent->util));
> -	return (uintptr_t)ent->util;
> -}
> -
> -static uintptr_t check_symlink_changes(struct apply_state *state, const char *path)
> -{
> -	struct string_list_item *ent;
> -
> -	ent = string_list_lookup(&state->symlink_changes, path);
> -	if (!ent)
> -		return 0;
> -	return (uintptr_t)ent->util;
> -}
> -
>  static void prepare_symlink_changes(struct apply_state *state, struct patch *patch)
>  {
>  	for ( ; patch; patch = patch->next) {
>  		if ((patch->old_name && S_ISLNK(patch->old_mode)) &&
>  		    (patch->is_rename || patch->is_delete))
>  			/* the symlink at patch->old_name is removed */
> -			register_symlink_changes(state, patch->old_name, APPLY_SYMLINK_GOES_AWAY);
> +			strset_add(&state->removed_symlinks, patch->old_name);
>
>  		if (patch->new_name && S_ISLNK(patch->new_mode))
>  			/* the symlink at patch->new_name is created or remains */
> -			register_symlink_changes(state, patch->new_name, APPLY_SYMLINK_IN_RESULT);
> +			strset_add(&state->kept_symlinks, patch->new_name);
>  	}
>  }
>
>  static int path_is_beyond_symlink_1(struct apply_state *state, struct strbuf *name)
>  {
>  	do {
> -		unsigned int change;
> -
>  		while (--name->len && name->buf[name->len] != '/')
>  			; /* scan backwards */
>  		if (!name->len)
>  			break;
>  		name->buf[name->len] = '\0';
> -		change = check_symlink_changes(state, name->buf);
> -		if (change & APPLY_SYMLINK_IN_RESULT)
> +		if (strset_contains(&state->kept_symlinks, name->buf))
>  			return 1;
> -		if (change & APPLY_SYMLINK_GOES_AWAY)
> +		if (strset_contains(&state->removed_symlinks, name->buf))
>  			/*
>  			 * This cannot be "return 0", because we may
>  			 * see a new one created at a higher level.
> diff --git a/apply.h b/apply.h
> index 16202da160..4052da50c0 100644
> --- a/apply.h
> +++ b/apply.h
> @@ -4,6 +4,7 @@
>  #include "hash.h"
>  #include "lockfile.h"
>  #include "string-list.h"
> +#include "strmap.h"
>
>  struct repository;
>
> @@ -25,20 +26,6 @@ enum apply_verbosity {
>  	verbosity_verbose = 1
>  };
>
> -/*
> - * We need to keep track of how symlinks in the preimage are
> - * manipulated by the patches.  A patch to add a/b/c where a/b
> - * is a symlink should not be allowed to affect the directory
> - * the symlink points at, but if the same patch removes a/b,
> - * it is perfectly fine, as the patch removes a/b to make room
> - * to create a directory a/b so that a/b/c can be created.
> - *
> - * See also "struct string_list symlink_changes" in "struct
> - * apply_state".
> - */
> -#define APPLY_SYMLINK_GOES_AWAY 01
> -#define APPLY_SYMLINK_IN_RESULT 02
> -
>  struct apply_state {
>  	const char *prefix;
>
> @@ -86,7 +73,16 @@ struct apply_state {
>
>  	/* Various "current state" */
>  	int linenr; /* current line number */
> -	struct string_list symlink_changes; /* we have to track symlinks */
> +	/*
> +	 * We need to keep track of how symlinks in the preimage are
> +	 * manipulated by the patches.  A patch to add a/b/c where a/b
> +	 * is a symlink should not be allowed to affect the directory
> +	 * the symlink points at, but if the same patch removes a/b,
> +	 * it is perfectly fine, as the patch removes a/b to make room
> +	 * to create a directory a/b so that a/b/c can be created.
> +	 */
> +	struct strset removed_symlinks;
> +	struct strset kept_symlinks;
>
>  	/*
>  	 * For "diff-stat" like behaviour, we keep track of the biggest change
> --
> 2.34.1

  parent reply	other threads:[~2022-01-07 19:40 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-01-05 13:23 [PATCH] apply: Avoid ambiguous pointer provenance for CHERI/Arm's Morello Jessica Clarke
2022-01-05 16:39 ` Konstantin Khomoutov
2022-01-05 16:40   ` Jessica Clarke
2022-01-06 22:50 ` Taylor Blau
2022-01-06 22:57   ` Jessica Clarke
2022-01-06 22:53 ` Junio C Hamano
2022-01-06 23:02   ` Jessica Clarke
2022-01-06 23:41     ` Junio C Hamano
2022-01-07 12:16   ` René Scharfe
2022-01-07 13:00     ` Jessica Clarke
2022-01-07 19:40     ` Junio C Hamano [this message]
2022-01-08  0:04       ` René Scharfe
2022-01-08  0:51         ` Junio C Hamano
2022-01-07 23:25     ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqq35lz9vxl.fsf@gitster.g \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=jrtc27@jrtc27.com \
    --cc=l.s.r@web.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.