git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Johannes Schindelin <Johannes.Schindelin@gmx.de>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: git@vger.kernel.org, "Junio C Hamano" <gitster@pobox.com>,
	"Michał Kępień" <michal@isc.org>,
	"Phillip Wood" <phillip.wood123@gmail.com>
Subject: Re: [PATCH 1/2] diff: add an API for deferred freeing
Date: Wed, 10 Feb 2021 17:00:45 +0100 (CET)	[thread overview]
Message-ID: <nycvar.QRO.7.76.6.2102101557160.29765@tvgsbejvaqbjf.bet> (raw)
In-Reply-To: <20210205141320.18076-1-avarab@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 12871 bytes --]

Hi Ævar,

On Fri, 5 Feb 2021, Ævar Arnfjörð Bjarmason wrote:

> Add a diff_free() function to free anything we may have allocated in
> the "diff_options" struct, and the ability to make calling it a noop
> by setting "no_free" in "diff_options".

Why do we need a `no_free` flag? Why not simply set the `free()`d (or
`fclose()`d) attributes to `NULL`?

> This is required because when e.g. "git diff" is run we'll allocate
> things in that struct, use the diff machinery once, and then exit, but
> if we run e.g. "git log -p" we're going to re-use what we allocated
> across multiple diff_flush() calls, and only want to free things at
> the end.
>
> We've thus ended up with features like the recently added "diff -I"[1]
> where we'll leak memory. As it turns out it could have simply used the
> pattern established in 6ea57703f6 (log: prepare log/log-tree to reuse
> the diffopt.close_file attribute, 2016-06-22).
>
> Manually adding more such flags to things log_tree_commit() every time
> we need to allocate something would be tedious.
>
> Let's instead move that fclose() code it to a new diff_free(), in
> anticipation of freeing more things in that function in follow-up
> commits. I'm renaming the "close_file" struct member to "fclose_file"
> for the ease of validating this, we can be certain that these are all
> the relevant callsites.

I like that idea a lot.

> Some functions such as log_tree_commit() need an idiom of optionally
> retaining a previous "no_free", as they may either free the memory
> themselves, or their caller may do so. I'm keeping that idiom in
> log_show_early() even though I don't think it's currently called in
> this manner, since it also gets passed an existing "struct rev_info"..
>
> 1. 296d4a94e7 (diff: add -I<regex> that ignores matching changes,
>    2020-10-20)
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  builtin/add.c |  2 +-
>  builtin/am.c  |  6 ++++--
>  builtin/log.c | 27 ++++++++++++++-------------
>  diff.c        | 18 +++++++++++++-----
>  diff.h        | 14 +++++++++++++-
>  log-tree.c    | 10 ++++++----
>  wt-status.c   |  2 +-
>  7 files changed, 52 insertions(+), 27 deletions(-)
>
> diff --git a/builtin/add.c b/builtin/add.c
> index a825887c50..6319710186 100644
> --- a/builtin/add.c
> +++ b/builtin/add.c
> @@ -282,7 +282,7 @@ static int edit_patch(int argc, const char **argv, const char *prefix)
>  	if (out < 0)
>  		die(_("Could not open '%s' for writing."), file);
>  	rev.diffopt.file = xfdopen(out, "w");
> -	rev.diffopt.close_file = 1;
> +	rev.diffopt.fclose_file = 1;

This rename makes the patch unnecessarily tedious to review, and I do not
even agree with leaking the implementation detail that we need to
`fclose()` the file.

Let's just not?

>  	if (run_diff_files(&rev, 0))
>  		die(_("Could not write patch"));
>
> diff --git a/builtin/am.c b/builtin/am.c
> index 8355e3566f..157d264583 100644
> --- a/builtin/am.c
> +++ b/builtin/am.c
> @@ -1315,7 +1315,7 @@ static void write_commit_patch(const struct am_state *state, struct commit *comm
>  	rev_info.diffopt.flags.full_index = 1;
>  	rev_info.diffopt.use_color = 0;
>  	rev_info.diffopt.file = fp;
> -	rev_info.diffopt.close_file = 1;
> +	rev_info.diffopt.fclose_file = 1; /* log_tree_commit() sets .no_free=1 */
>  	add_pending_object(&rev_info, &commit->object, "");
>  	diff_setup_done(&rev_info.diffopt);
>  	log_tree_commit(&rev_info, commit);
> @@ -1347,10 +1347,12 @@ static void write_index_patch(const struct am_state *state)
>  	rev_info.diffopt.output_format = DIFF_FORMAT_PATCH;
>  	rev_info.diffopt.use_color = 0;
>  	rev_info.diffopt.file = fp;
> -	rev_info.diffopt.close_file = 1;
> +	rev_info.diffopt.fclose_file = 1;
> +	rev_info.diffopt.no_free = 1;
>  	add_pending_object(&rev_info, &tree->object, "");
>  	diff_setup_done(&rev_info.diffopt);
>  	run_diff_index(&rev_info, 1);
> +	diff_free(&rev_info.diffopt);
>  }
>
>  /**
> diff --git a/builtin/log.c b/builtin/log.c
> index fd282def43..604ee29ec0 100644
> --- a/builtin/log.c
> +++ b/builtin/log.c
> @@ -306,10 +306,11 @@ static struct itimerval early_output_timer;
>
>  static void log_show_early(struct rev_info *revs, struct commit_list *list)
>  {
> -	int i = revs->early_output, close_file = revs->diffopt.close_file;
> +	int i = revs->early_output;
>  	int show_header = 1;
> +	int no_free = revs->diffopt.no_free;
>
> -	revs->diffopt.close_file = 0;
> +	revs->diffopt.no_free = 0;
>  	sort_in_topological_order(&list, revs->sort_order);
>  	while (list && i) {
>  		struct commit *commit = list->item;
> @@ -326,8 +327,8 @@ static void log_show_early(struct rev_info *revs, struct commit_list *list)
>  		case commit_ignore:
>  			break;
>  		case commit_error:
> -			if (close_file)
> -				fclose(revs->diffopt.file);
> +			revs->diffopt.no_free = no_free;
> +			diff_free(&revs->diffopt);
>  			return;
>  		}
>  		list = list->next;
> @@ -335,8 +336,8 @@ static void log_show_early(struct rev_info *revs, struct commit_list *list)
>
>  	/* Did we already get enough commits for the early output? */
>  	if (!i) {
> -		if (close_file)
> -			fclose(revs->diffopt.file);
> +		revs->diffopt.no_free = 0;
> +		diff_free(&revs->diffopt);
>  		return;
>  	}
>
> @@ -400,7 +401,7 @@ static int cmd_log_walk(struct rev_info *rev)
>  {
>  	struct commit *commit;
>  	int saved_nrl = 0;
> -	int saved_dcctc = 0, close_file = rev->diffopt.close_file;
> +	int saved_dcctc = 0;
>
>  	if (rev->early_output)
>  		setup_early_output();
> @@ -416,7 +417,7 @@ static int cmd_log_walk(struct rev_info *rev)
>  	 * and HAS_CHANGES being accumulated in rev->diffopt, so be careful to
>  	 * retain that state information if replacing rev->diffopt in this loop
>  	 */
> -	rev->diffopt.close_file = 0;
> +	rev->diffopt.no_free = 1;
>  	while ((commit = get_revision(rev)) != NULL) {
>  		if (!log_tree_commit(rev, commit) && rev->max_count >= 0)
>  			/*
> @@ -441,8 +442,8 @@ static int cmd_log_walk(struct rev_info *rev)
>  	}
>  	rev->diffopt.degraded_cc_to_c = saved_dcctc;
>  	rev->diffopt.needed_rename_limit = saved_nrl;
> -	if (close_file)
> -		fclose(rev->diffopt.file);
> +	rev->diffopt.no_free = 0;
> +	diff_free(&rev->diffopt);
>
>  	if (rev->diffopt.output_format & DIFF_FORMAT_CHECKDIFF &&
>  	    rev->diffopt.flags.check_failed) {
> @@ -1952,18 +1953,18 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
>  	if (rev.show_notes)
>  		load_display_notes(&rev.notes_opt);
>
> -	if (use_stdout + rev.diffopt.close_file + !!output_directory > 1)
> +	if (use_stdout + rev.diffopt.fclose_file + !!output_directory > 1)
>  		die(_("--stdout, --output, and --output-directory are mutually exclusive"));
>
>  	if (use_stdout) {
>  		setup_pager();
> -	} else if (rev.diffopt.close_file) {
> +	} else if (rev.diffopt.fclose_file) {
>  		/*
>  		 * The diff code parsed --output; it has already opened the
>  		 * file, but but we must instruct it not to close after each
>  		 * diff.
>  		 */
> -		rev.diffopt.close_file = 0;
> +		rev.diffopt.no_free = 1;
>  	} else {
>  		int saved;
>
> diff --git a/diff.c b/diff.c
> index 69e3bc00ed..3e6f8f0a71 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -5187,7 +5187,7 @@ static enum parse_opt_result diff_opt_output(struct parse_opt_ctx_t *ctx,
>  	BUG_ON_OPT_NEG(unset);
>  	path = prefix_filename(ctx->prefix, arg);
>  	options->file = xfopen(path, "w");
> -	options->close_file = 1;
> +	options->fclose_file = 1;
>  	if (options->use_color != GIT_COLOR_ALWAYS)
>  		options->use_color = GIT_COLOR_NEVER;
>  	free(path);
> @@ -6399,10 +6399,10 @@ void diff_flush(struct diff_options *options)
>  		 * options->file to /dev/null should be safe, because we
>  		 * aren't supposed to produce any output anyway.
>  		 */
> -		if (options->close_file)
> +		if (options->fclose_file)
>  			fclose(options->file);

And at this stage, we should set `options->file = NULL`.

>  		options->file = xfopen("/dev/null", "w");
> -		options->close_file = 1;
> +		options->fclose_file = 1;
>  		options->color_moved = 0;
>  		for (i = 0; i < q->nr; i++) {
>  			struct diff_filepair *p = q->queue[i];
> @@ -6433,8 +6433,7 @@ void diff_flush(struct diff_options *options)
>  free_queue:
>  	free(q->queue);
>  	DIFF_QUEUE_CLEAR(q);
> -	if (options->close_file)
> -		fclose(options->file);
> +	diff_free(options);
>
>  	/*
>  	 * Report the content-level differences with HAS_CHANGES;
> @@ -6449,6 +6448,15 @@ void diff_flush(struct diff_options *options)
>  	}
>  }
>
> +void diff_free(struct diff_options *options)
> +{
> +	if (options->no_free)
> +		return;
> +	if (options->fclose_file)
> +		fclose(options->file);

And at this stage, we should set `options->file = NULL`.

> +}
> +
> +
>  static int match_filter(const struct diff_options *options, const struct diff_filepair *p)
>  {
>  	return (((p->status == DIFF_STATUS_MODIFIED) &&
> diff --git a/diff.h b/diff.h
> index 2ff2b1c7f2..d1d74c3a9e 100644
> --- a/diff.h
> +++ b/diff.h
> @@ -49,7 +49,16 @@
>   * - Once you finish feeding the pairs of files, call `diffcore_std()`.
>   * This will tell the diffcore library to go ahead and do its work.
>   *
> + * - The `diff_opt_parse()` etc. functions might allocate memory in
> + *  `struct diff_options`. When running the API `N > 1` set `.no_free
> + *  = 1` to make the `diff_free()` invoked by `diff_flush()` below a
> + *  noop.

I have serious trouble parsing that last sentence. Would you mind
rephrasing it?

> + *
>   * - Calling `diff_flush()` will produce the output.
> + *
> + * - If you set `.no_free = 1` before set it to `0` and call
> + *   `diff_free()` again. If `.no_free = 1` was not set there's no
> + *   need to call `diff_free()`, `diff_flush()` will call it.

Again, as I mentioned before, the indicator whether things need to be
released should be whether the attribute is `NULL` or not. And once
released, ot should be set to `NULL`.

Other than that, it looks fine to me. And I definitely like the idea of
introducing a function to release all of the stuff in `struct diffopt`.

Thanks,
Dscho

>   */
>
>  struct combine_diff_path;
> @@ -328,7 +337,7 @@ struct diff_options {
>  	void (*set_default)(struct diff_options *);
>
>  	FILE *file;
> -	int close_file;
> +	int fclose_file;
>
>  #define OUTPUT_INDICATOR_NEW 0
>  #define OUTPUT_INDICATOR_OLD 1
> @@ -365,6 +374,8 @@ struct diff_options {
>
>  	struct repository *repo;
>  	struct option *parseopts;
> +
> +	int no_free;
>  };
>
>  unsigned diff_filter_bit(char status);
> @@ -559,6 +570,7 @@ void diffcore_fix_diff_index(void);
>
>  int diff_queue_is_empty(void);
>  void diff_flush(struct diff_options*);
> +void diff_free(struct diff_options*);
>  void diff_warn_rename_limit(const char *varname, int needed, int degraded_cc);
>
>  /* diff-raw status letters */
> diff --git a/log-tree.c b/log-tree.c
> index fd0dde97ec..bb946f15f1 100644
> --- a/log-tree.c
> +++ b/log-tree.c
> @@ -952,12 +952,14 @@ static int log_tree_diff(struct rev_info *opt, struct commit *commit, struct log
>  int log_tree_commit(struct rev_info *opt, struct commit *commit)
>  {
>  	struct log_info log;
> -	int shown, close_file = opt->diffopt.close_file;
> +	int shown;
> +	/* maybe called by e.g. cmd_log_walk(), maybe stand-alone */
> +	int no_free = opt->diffopt.no_free;
>
>  	log.commit = commit;
>  	log.parent = NULL;
>  	opt->loginfo = &log;
> -	opt->diffopt.close_file = 0;
> +	opt->diffopt.no_free = 1;
>
>  	if (opt->line_level_traverse)
>  		return line_log_print(opt, commit);
> @@ -974,7 +976,7 @@ int log_tree_commit(struct rev_info *opt, struct commit *commit)
>  		fprintf(opt->diffopt.file, "\n%s\n", opt->break_bar);
>  	opt->loginfo = NULL;
>  	maybe_flush_or_die(opt->diffopt.file, "stdout");
> -	if (close_file)
> -		fclose(opt->diffopt.file);
> +	opt->diffopt.no_free = no_free;
> +	diff_free(&opt->diffopt);
>  	return shown;
>  }
> diff --git a/wt-status.c b/wt-status.c
> index 0c8287a023..41b08474e5 100644
> --- a/wt-status.c
> +++ b/wt-status.c
> @@ -1052,7 +1052,7 @@ static void wt_longstatus_print_verbose(struct wt_status *s)
>  	rev.diffopt.rename_limit = s->rename_limit >= 0 ? s->rename_limit : rev.diffopt.rename_limit;
>  	rev.diffopt.rename_score = s->rename_score >= 0 ? s->rename_score : rev.diffopt.rename_score;
>  	rev.diffopt.file = s->fp;
> -	rev.diffopt.close_file = 0;
> +	rev.diffopt.fclose_file = 0; /* wt_status owns the s->fp */
>  	/*
>  	 * If we're not going to stdout, then we definitely don't
>  	 * want color, since we are going to the commit message
> --
> 2.30.0.284.gd98b1dd5eaa7
>
>

  reply	other threads:[~2021-02-10 16:02 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-01 12:06 [PATCH 0/2] diff: add -I<regex> that ignores matching changes Michał Kępień
2020-10-01 12:06 ` [PATCH 1/2] " Michał Kępień
2020-10-01 18:21   ` Junio C Hamano
2020-10-07 19:48     ` Michał Kępień
2020-10-07 20:08       ` Junio C Hamano
2020-10-01 12:06 ` [PATCH 2/2] t: add -I<regex> tests Michał Kępień
2020-10-01 17:02 ` [PATCH 0/2] diff: add -I<regex> that ignores matching changes Junio C Hamano
2020-10-12  9:17 ` [PATCH v2 0/3] " Michał Kępień
2020-10-12  9:17   ` [PATCH v2 1/3] merge-base, xdiff: zero out xpparam_t structures Michał Kępień
2020-10-12 11:14     ` Johannes Schindelin
2020-10-12 17:09       ` Junio C Hamano
2020-10-12 19:52     ` Junio C Hamano
2020-10-13  6:35       ` Michał Kępień
2020-10-12  9:17   ` [PATCH v2 2/3] diff: add -I<regex> that ignores matching changes Michał Kępień
2020-10-12 11:20     ` Johannes Schindelin
2020-10-12 20:00       ` Junio C Hamano
2020-10-12 20:39         ` Johannes Schindelin
2020-10-12 21:43           ` Junio C Hamano
2020-10-13  6:37             ` Michał Kępień
2020-10-13 15:49               ` Junio C Hamano
2020-10-13  6:36       ` Michał Kępień
2020-10-13 12:02         ` Johannes Schindelin
2020-10-13 15:53           ` Junio C Hamano
2020-10-13 18:45           ` Michał Kępień
2020-10-12 18:01     ` Junio C Hamano
2020-10-13  6:38       ` Michał Kępień
2020-10-12 20:04     ` Junio C Hamano
2020-10-13  6:38       ` Michał Kępień
2020-10-12  9:17   ` [PATCH v2 3/3] t: add -I<regex> tests Michał Kępień
2020-10-12 11:49     ` Johannes Schindelin
2020-10-13  6:38       ` Michał Kępień
2020-10-13 12:00         ` Johannes Schindelin
2020-10-13 16:00           ` Junio C Hamano
2020-10-13 19:01           ` Michał Kępień
2020-10-15 11:45             ` Johannes Schindelin
2020-10-15  7:24   ` [PATCH v3 0/2] diff: add -I<regex> that ignores matching changes Michał Kępień
2020-10-15  7:24     ` [PATCH v3 1/2] merge-base, xdiff: zero out xpparam_t structures Michał Kępień
2020-10-15  7:24     ` [PATCH v3 2/2] diff: add -I<regex> that ignores matching changes Michał Kępień
2020-10-16 15:32       ` Phillip Wood
2020-10-16 18:04         ` Junio C Hamano
2020-10-19  9:48           ` Michał Kępień
2020-10-16 18:16       ` Junio C Hamano
2020-10-19  9:55         ` Michał Kępień
2020-10-19 17:29           ` Junio C Hamano
2020-10-16 10:00     ` [PATCH v3 0/2] " Johannes Schindelin
2020-10-20  6:48     ` [PATCH v4 " Michał Kępień
2020-10-20  6:48       ` [PATCH v4 1/2] merge-base, xdiff: zero out xpparam_t structures Michał Kępień
2020-10-20  6:48       ` [PATCH v4 2/2] diff: add -I<regex> that ignores matching changes Michał Kępień
2021-02-05 14:13       ` [PATCH 1/2] diff: add an API for deferred freeing Ævar Arnfjörð Bjarmason
2021-02-10 16:00         ` Johannes Schindelin [this message]
2021-02-11  3:00           ` Ævar Arnfjörð Bjarmason
2021-02-11  9:40             ` Johannes Schindelin
2021-02-11 10:21               ` Jeff King
2021-02-11 10:45                 ` [PATCH v2 0/2] " Ævar Arnfjörð Bjarmason
2021-02-11 10:45                 ` [PATCH v2 1/2] " Ævar Arnfjörð Bjarmason
2021-02-11 10:45                 ` [PATCH v2 2/2] diff: plug memory leak from regcomp() on {log,diff} -I Ævar Arnfjörð Bjarmason
2021-02-05 14:13       ` [PATCH " Ævar Arnfjörð Bjarmason

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=nycvar.QRO.7.76.6.2102101557160.29765@tvgsbejvaqbjf.bet \
    --to=johannes.schindelin@gmx.de \
    --cc=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=michal@isc.org \
    --cc=phillip.wood123@gmail.com \
    --subject='Re: [PATCH 1/2] diff: add an API for deferred freeing' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).