Git Mailing List Archive on lore.kernel.org
 help / color / Atom feed
From: "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: sandals@crustytoothpaste.net, steadmon@google.com,
	jrnieder@gmail.com, peff@peff.net, congdanhqx@gmail.com,
	phillip.wood123@gmail.com, emilyshaffer@google.com,
	sluongng@gmail.com, jonathantanmy@google.com,
	Derrick Stolee <derrickstolee@github.com>
Subject: [PATCH v3 00/11] Maintenance I: Command, gc and commit-graph tasks
Date: Tue, 25 Aug 2020 18:33:35 +0000
Message-ID: <pull.695.v3.git.1598380426.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.695.v2.git.1597760589.gitgitgadget@gmail.com>

This series is based on jc/no-update-fetch-head.

This patch series contains 11patches that were going to be part of v4 of
ds/maintenance [1], but the discussion has gotten really long. To help focus
the conversation, I'm splitting out the portions that create and test the
'maintenance' builtin from the additional tasks (prefetch, loose-objects,
incremental-repack) that can be brought in later.

[1] 
https://lore.kernel.org/git/pull.671.git.1594131695.gitgitgadget@gmail.com/

As mentioned before, git gc already plays the role of maintaining Git
repositories. It has accumulated several smaller pieces in its long history,
including:

 1. Repacking all reachable objects into one pack-file (and deleting
    unreachable objects).
 2. Packing refs.
 3. Expiring reflogs.
 4. Clearing rerere logs.
 5. Updating the commit-graph file.
 6. Pruning worktrees.

While expiring reflogs, clearing rererelogs, and deleting unreachable
objects are suitable under the guise of "garbage collection", packing refs
and updating the commit-graph file are not as obviously fitting. Further,
these operations are "all or nothing" in that they rewrite almost all
repository data, which does not perform well at extremely large scales.
These operations can also be disruptive to foreground Git commands when git
gc --auto triggers during routine use.

This series does not intend to change what git gc does, but instead create
new choices for automatic maintenance activities, of which git gc remains
the only one enabled by default.

The new maintenance tasks are:

 * 'commit-graph' : write and verify a single layer of an incremental
   commit-graph.
 * 'loose-objects' : prune packed loose objects, then create a new pack from
   a batch of loose objects.
 * 'pack-files' : expire redundant packs from the multi-pack-index, then
   repack using the multi-pack-index's incremental repack strategy.
 * 'prefetch' : fetch from each remote, storing the refs in 'refs/prefetch/
   /'.

The only included tasks are the 'gc' and 'commit-graph' tasks. The rest will
follow in a follow-up series. Including the 'commit-graph' task here allows
us to build and test features like config settings and the --task= 
command-line argument.

These tasks are all disabled by default, but can be enabled with config
options or run explicitly using "git maintenance run --task=". There are
additional config options to allow customizing the conditions for which the
tasks run during the '--auto' option.

 Because 'gc' is implemented as a maintenance task, the most dramatic change
of this series is to convert the 'git gc --auto' calls into 'git maintenance
run --auto' calls at the end of some Git commands. By default, the only
change is that 'git gc --auto' will be run below an additional 'git
maintenance' process.

The 'git maintenance' builtin has a 'run' subcommand so it can be extended
later with subcommands that manage background maintenance, such as 'start'
or 'stop'. These are not the subject of this series, as it is important to
focus on the maintenance activities themselves.

Updates since v2
================

 * Based on jc/no-update-fetch-head instead of jk/strvec.
   
   
 * I realized that the other maintenance subcommands should not accept the
   options for the 'run' subcommand, so I reorganized the option parsing
   into that subcommand. This makes the range-diff noisier than it would
   have been otherwise.
   
   
 * While updating the parsing, I also updated the usage strings.
   
   
 * The "verify, then delete and rewrite on failure" logic is removed. I'll
   consider bringing this back with a way to test the behavior in a later
   patch series.
   
   
 * Other feedback from Jonathan Tan is applied.
   
   

Updates since v1 (of this series)
=================================

 * Documentation fixes.
   
   
 * The builtin code had some slight tweaks in PATCH 1.
   
   

UPDATES since v3 of [1]
=======================

 * The biggest change here is the use of "test_subcommand", based on
   Jonathan Nieder's approach. This requires having the exact command-line
   figured out, which now requires spelling out all --no- [quiet%7Cprogress] 
   options. I also added a bunch of "2>/dev/null" checks because of the
   isatty(2) calls. Without that, the behavior will change depending on
   whether the test is run with -x/-v or without.
   
   
 * The option parsing has changed to use a local struct and pass that struct
   to the helper methods. This is instead of having a global singleton.
   
   

Thanks, -Stolee

Derrick Stolee (11):
  maintenance: create basic maintenance runner
  maintenance: add --quiet option
  maintenance: replace run_auto_gc()
  maintenance: initialize task array
  maintenance: add commit-graph task
  maintenance: add --task option
  maintenance: take a lock on the objects directory
  maintenance: create maintenance.<task>.enabled config
  maintenance: use pointers to check --auto
  maintenance: add auto condition for commit-graph task
  maintenance: add trace2 regions for task execution

 .gitignore                           |   1 +
 Documentation/config.txt             |   2 +
 Documentation/config/maintenance.txt |  16 ++
 Documentation/fetch-options.txt      |   6 +-
 Documentation/git-clone.txt          |   6 +-
 Documentation/git-maintenance.txt    |  79 +++++++
 builtin.h                            |   1 +
 builtin/am.c                         |   2 +-
 builtin/commit.c                     |   2 +-
 builtin/fetch.c                      |   6 +-
 builtin/gc.c                         | 336 +++++++++++++++++++++++++++
 builtin/merge.c                      |   2 +-
 builtin/rebase.c                     |   4 +-
 command-list.txt                     |   1 +
 commit-graph.c                       |   8 +-
 commit-graph.h                       |   1 +
 git.c                                |   1 +
 object.h                             |   1 +
 run-command.c                        |  16 +-
 run-command.h                        |   2 +-
 t/t5510-fetch.sh                     |   2 +-
 t/t5514-fetch-multiple.sh            |   2 +-
 t/t7900-maintenance.sh               |  63 +++++
 t/test-lib-functions.sh              |  33 +++
 24 files changed, 565 insertions(+), 28 deletions(-)
 create mode 100644 Documentation/config/maintenance.txt
 create mode 100644 Documentation/git-maintenance.txt
 create mode 100755 t/t7900-maintenance.sh


base-commit: 887952b8c680626f4721cb5fa57704478801aca4
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-695%2Fderrickstolee%2Fmaintenance%2Fbuiltin-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-695/derrickstolee/maintenance/builtin-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/695

Range-diff vs v2:

  1:  e09e4a4a87 !  1:  aa961af387 maintenance: create basic maintenance runner
     @@ builtin/gc.c: int cmd_gc(int argc, const char **argv, const char *prefix)
       	return 0;
       }
      +
     -+static const char * const builtin_maintenance_usage[] = {
     -+	N_("git maintenance run [<options>]"),
     ++static const char * const builtin_maintenance_run_usage[] = {
     ++	N_("git maintenance run [--auto]"),
      +	NULL
      +};
      +
     -+struct maintenance_opts {
     ++struct maintenance_run_opts {
      +	int auto_flag;
      +};
      +
     -+static int maintenance_task_gc(struct maintenance_opts *opts)
     ++static int maintenance_task_gc(struct maintenance_run_opts *opts)
      +{
      +	struct child_process child = CHILD_PROCESS_INIT;
      +
     @@ builtin/gc.c: int cmd_gc(int argc, const char **argv, const char *prefix)
      +	return run_command(&child);
      +}
      +
     -+static int maintenance_run(struct maintenance_opts *opts)
     ++static int maintenance_run(int argc, const char **argv, const char *prefix)
      +{
     -+	return maintenance_task_gc(opts);
     -+}
     -+
     -+int cmd_maintenance(int argc, const char **argv, const char *prefix)
     -+{
     -+	struct maintenance_opts opts;
     -+	struct option builtin_maintenance_options[] = {
     ++	struct maintenance_run_opts opts;
     ++	struct option builtin_maintenance_run_options[] = {
      +		OPT_BOOL(0, "auto", &opts.auto_flag,
      +			 N_("run tasks based on the state of the repository")),
      +		OPT_END()
      +	};
     -+
      +	memset(&opts, 0, sizeof(opts));
      +
     -+	if (argc == 2 && !strcmp(argv[1], "-h"))
     -+		usage_with_options(builtin_maintenance_usage,
     -+				   builtin_maintenance_options);
     -+
      +	argc = parse_options(argc, argv, prefix,
     -+			     builtin_maintenance_options,
     -+			     builtin_maintenance_usage,
     -+			     PARSE_OPT_KEEP_UNKNOWN);
     ++			     builtin_maintenance_run_options,
     ++			     builtin_maintenance_run_usage,
     ++			     PARSE_OPT_STOP_AT_NON_OPTION);
     ++
     ++	if (argc != 0)
     ++		usage_with_options(builtin_maintenance_run_usage,
     ++				   builtin_maintenance_run_options);
     ++	return maintenance_task_gc(&opts);
     ++}
     ++
     ++static const char builtin_maintenance_usage[] = N_("git maintenance run [<options>]");
      +
     -+	if (argc != 1)
     -+		usage_with_options(builtin_maintenance_usage,
     -+				   builtin_maintenance_options);
     ++int cmd_maintenance(int argc, const char **argv, const char *prefix)
     ++{
     ++	if (argc == 2 && !strcmp(argv[1], "-h"))
     ++		usage(builtin_maintenance_usage);
      +
     -+	if (!strcmp(argv[0], "run"))
     -+		return maintenance_run(&opts);
     ++	if (!strcmp(argv[1], "run"))
     ++		return maintenance_run(argc - 1, argv + 1, prefix);
      +
     -+	die(_("invalid subcommand: %s"), argv[0]);
     ++	die(_("invalid subcommand: %s"), argv[1]);
      +}
      
     + ## command-list.txt ##
     +@@ command-list.txt: git-ls-remote                           plumbinginterrogators
     + git-ls-tree                             plumbinginterrogators
     + git-mailinfo                            purehelpers
     + git-mailsplit                           purehelpers
     ++git-maintenance                         mainporcelain
     + git-merge                               mainporcelain           history
     + git-merge-base                          plumbinginterrogators
     + git-merge-file                          plumbingmanipulators
     +
       ## git.c ##
      @@ git.c: static struct cmd_struct commands[] = {
       	{ "ls-tree", cmd_ls_tree, RUN_SETUP },
  2:  adae48d235 !  2:  5386d8a628 maintenance: add --quiet option
     @@ Documentation/git-maintenance.txt: OPTIONS
       Part of the linkgit:git[1] suite
      
       ## builtin/gc.c ##
     -@@ builtin/gc.c: static const char * const builtin_maintenance_usage[] = {
     +@@ builtin/gc.c: int cmd_gc(int argc, const char **argv, const char *prefix)
     + }
       
     - struct maintenance_opts {
     + static const char * const builtin_maintenance_run_usage[] = {
     +-	N_("git maintenance run [--auto]"),
     ++	N_("git maintenance run [--auto] [--[no-]quiet]"),
     + 	NULL
     + };
     + 
     + struct maintenance_run_opts {
       	int auto_flag;
      +	int quiet;
       };
       
     - static int maintenance_task_gc(struct maintenance_opts *opts)
     -@@ builtin/gc.c: static int maintenance_task_gc(struct maintenance_opts *opts)
     + static int maintenance_task_gc(struct maintenance_run_opts *opts)
     +@@ builtin/gc.c: static int maintenance_task_gc(struct maintenance_run_opts *opts)
       
       	if (opts->auto_flag)
       		strvec_push(&child.args, "--auto");
     @@ builtin/gc.c: static int maintenance_task_gc(struct maintenance_opts *opts)
       
       	close_object_store(the_repository->objects);
       	return run_command(&child);
     -@@ builtin/gc.c: int cmd_maintenance(int argc, const char **argv, const char *prefix)
     - 	struct option builtin_maintenance_options[] = {
     +@@ builtin/gc.c: static int maintenance_run(int argc, const char **argv, const char *prefix)
     + 	struct option builtin_maintenance_run_options[] = {
       		OPT_BOOL(0, "auto", &opts.auto_flag,
       			 N_("run tasks based on the state of the repository")),
      +		OPT_BOOL(0, "quiet", &opts.quiet,
      +			 N_("do not report progress or other information over stderr")),
       		OPT_END()
       	};
     - 
     -@@ builtin/gc.c: int cmd_maintenance(int argc, const char **argv, const char *prefix)
     - 		usage_with_options(builtin_maintenance_usage,
     - 				   builtin_maintenance_options);
     + 	memset(&opts, 0, sizeof(opts));
       
      +	opts.quiet = !isatty(2);
      +
       	argc = parse_options(argc, argv, prefix,
     - 			     builtin_maintenance_options,
     - 			     builtin_maintenance_usage,
     + 			     builtin_maintenance_run_options,
     + 			     builtin_maintenance_run_usage,
      
       ## t/t7900-maintenance.sh ##
      @@ t/t7900-maintenance.sh: test_expect_success 'help text' '
  3:  91741a0cfc =  3:  e28b332df4 maintenance: replace run_auto_gc()
  4:  1db3b96280 !  4:  82326c1c38 maintenance: initialize task array
     @@ Commit message
      
          The run subcommand will return a nonzero exit code if any task fails.
          However, it will attempt all tasks in its loop before returning with the
     -    failure. Also each failed task will send an error message.
     +    failure. Also each failed task will print an error message.
      
          Helped-by: Taylor Blau <me@ttaylorr.com>
          Helped-by: Junio C Hamano <gitster@pobox.com>
          Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
      
       ## builtin/gc.c ##
     -@@ builtin/gc.c: static int maintenance_task_gc(struct maintenance_opts *opts)
     +@@ builtin/gc.c: static int maintenance_task_gc(struct maintenance_run_opts *opts)
       	return run_command(&child);
       }
       
     -+typedef int maintenance_task_fn(struct maintenance_opts *opts);
     ++typedef int maintenance_task_fn(struct maintenance_run_opts *opts);
      +
      +struct maintenance_task {
      +	const char *name;
     @@ builtin/gc.c: static int maintenance_task_gc(struct maintenance_opts *opts)
      +	},
      +};
      +
     - static int maintenance_run(struct maintenance_opts *opts)
     - {
     --	return maintenance_task_gc(opts);
     ++static int maintenance_run_tasks(struct maintenance_run_opts *opts)
     ++{
      +	int i;
      +	int result = 0;
      +
     @@ builtin/gc.c: static int maintenance_task_gc(struct maintenance_opts *opts)
      +	}
      +
      +	return result;
     ++}
     ++
     + static int maintenance_run(int argc, const char **argv, const char *prefix)
     + {
     + 	struct maintenance_run_opts opts;
     +@@ builtin/gc.c: static int maintenance_run(int argc, const char **argv, const char *prefix)
     + 	if (argc != 0)
     + 		usage_with_options(builtin_maintenance_run_usage,
     + 				   builtin_maintenance_run_options);
     +-	return maintenance_task_gc(&opts);
     ++	return maintenance_run_tasks(&opts);
       }
       
     - int cmd_maintenance(int argc, const char **argv, const char *prefix)
     + static const char builtin_maintenance_usage[] = N_("git maintenance run [<options>]");
  5:  50b457fd57 !  5:  06984a42bf maintenance: add commit-graph task
     @@ Commit message
          maintenance: add commit-graph task
      
          The first new task in the 'git maintenance' builtin is the
     -    'commit-graph' job. It is based on the sequence of events in the
     -    'commit-graph' job in Scalar [1]. This sequence is as follows:
     +    'commit-graph' task. This updates the commit-graph file
     +    incrementally with the command
      
     -    1. git commit-graph write --reachable --split
     -    2. git commit-graph verify --shallow
     -    3. If the verify succeeds, stop.
     -    4. Delete the commit-graph-chain file.
     -    5. git commit-graph write --reachable --split
     +            git commit-graph write --reachable --split
      
          By writing an incremental commit-graph file using the "--split"
          option we minimize the disruption from this operation. The default
     @@ Commit message
          This could be avoided by a future update to use the --expire-time
          argument when writing the commit-graph.
      
     -    By using 'git commit-graph verify --shallow' we can ensure that
     -    the file we just wrote is valid. This is an extra safety precaution
     -    that is faster than our 'write' subcommand. In the rare situation
     -    that the newest layer of the commit-graph is corrupt, we can "fix"
     -    the corruption by deleting the commit-graph-chain file and rewrite
     -    the full commit-graph as a new one-layer commit graph. This does
     -    not completely prevent _that_ file from being corrupt, but it does
     -    recompute the commit-graph by parsing commits from the object
     -    database. In our use of this step in Scalar and VFS for Git, we
     -    have only seen this issue arise because our microsoft/git fork
     -    reverted 43d3561 ("commit-graph write: don't die if the existing
     -    graph is corrupt" 2019-03-25) for a while to keep commit-graph
     -    writes very fast. We dropped the revert when updating to v2.23.0.
     -    The verify still has potential for catching corrupt data across
     -    the layer boundary: if the new file has commit X with parent Y
     -    in an old file but the commit ID for Y in the old file had a
     -    bitswap, then we will notice that in the 'verify' command.
     -
     -    [1] https://github.com/microsoft/scalar/blob/master/Scalar.Common/Maintenance/CommitGraphStep.cs
     -
          Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
      
       ## Documentation/git-maintenance.txt ##
     @@ Documentation/git-maintenance.txt: run::
       
      +commit-graph::
      +	The `commit-graph` job updates the `commit-graph` files incrementally,
     -+	then verifies that the written data is correct. If the new layer has an
     -+	issue, then the chain file is removed and the `commit-graph` is
     -+	rewritten from scratch.
     -++
     -+The incremental write is safe to run alongside concurrent Git processes
     -+since it will not expire `.graph` files that were in the previous
     -+`commit-graph-chain` file. They will be deleted by a later run based on
     -+the expiration delay.
     ++	then verifies that the written data is correct. The incremental
     ++	write is safe to run alongside concurrent Git processes since it
     ++	will not expire `.graph` files that were in the previous
     ++	`commit-graph-chain` file. They will be deleted by a later run based
     ++	on the expiration delay.
      +
       gc::
       	Clean up unnecessary files and optimize the local repository. "GC"
       	stands for "garbage collection," but this task performs many
      
       ## builtin/gc.c ##
     -@@ builtin/gc.c: struct maintenance_opts {
     +@@ builtin/gc.c: struct maintenance_run_opts {
       	int quiet;
       };
       
     -+static int run_write_commit_graph(struct maintenance_opts *opts)
     ++static int run_write_commit_graph(struct maintenance_run_opts *opts)
      +{
      +	struct child_process child = CHILD_PROCESS_INIT;
      +
     @@ builtin/gc.c: struct maintenance_opts {
      +	return !!run_command(&child);
      +}
      +
     -+static int run_verify_commit_graph(struct maintenance_opts *opts)
     ++static int maintenance_task_commit_graph(struct maintenance_run_opts *opts)
      +{
     -+	struct child_process child = CHILD_PROCESS_INIT;
     -+
     -+	child.git_cmd = 1;
     -+	strvec_pushl(&child.args, "commit-graph", "verify",
     -+		     "--shallow", NULL);
     -+
     -+	if (opts->quiet)
     -+		strvec_push(&child.args, "--no-progress");
     -+
     -+	return !!run_command(&child);
     -+}
     -+
     -+static int maintenance_task_commit_graph(struct maintenance_opts *opts)
     -+{
     -+	struct repository *r = the_repository;
     -+	char *chain_path;
     -+
     -+	close_object_store(r->objects);
     ++	close_object_store(the_repository->objects);
      +	if (run_write_commit_graph(opts)) {
      +		error(_("failed to write commit-graph"));
      +		return 1;
      +	}
      +
     -+	if (!run_verify_commit_graph(opts))
     -+		return 0;
     -+
     -+	warning(_("commit-graph verify caught error, rewriting"));
     -+
     -+	chain_path = get_commit_graph_chain_filename(r->objects->odb);
     -+	if (unlink(chain_path)) {
     -+		UNLEAK(chain_path);
     -+		die(_("failed to remove commit-graph at %s"), chain_path);
     -+	}
     -+	free(chain_path);
     -+
     -+	if (!run_write_commit_graph(opts))
     -+		return 0;
     -+
     -+	error(_("failed to rewrite commit-graph"));
     -+	return 1;
     ++	return 0;
      +}
      +
     - static int maintenance_task_gc(struct maintenance_opts *opts)
     + static int maintenance_task_gc(struct maintenance_run_opts *opts)
       {
       	struct child_process child = CHILD_PROCESS_INIT;
      @@ builtin/gc.c: struct maintenance_task {
     @@ builtin/gc.c: static struct maintenance_task tasks[] = {
      +	},
       };
       
     - static int maintenance_run(struct maintenance_opts *opts)
     + static int maintenance_run_tasks(struct maintenance_run_opts *opts)
      
       ## commit-graph.c ##
      @@ commit-graph.c: static char *get_split_graph_filename(struct object_directory *odb,
     @@ commit-graph.c: static void expire_commit_graphs(struct write_commit_graph_conte
       		ctx->num_commit_graphs_after = 0;
      
       ## commit-graph.h ##
     -@@ commit-graph.h: struct commit;
     - struct bloom_filter_settings;
     +@@ commit-graph.h: struct raw_object_store;
     + struct string_list;
       
       char *get_commit_graph_filename(struct object_directory *odb);
      +char *get_commit_graph_chain_filename(struct object_directory *odb);
  6:  85268bd53e !  6:  69298aee24 maintenance: add --task option
     @@ Documentation/git-maintenance.txt: OPTIONS
       
      +--task=<task>::
      +	If this option is specified one or more times, then only run the
     -+	specified tasks in the specified order.
     ++	specified tasks in the specified order. See the 'TASKS' section
     ++	for the list of accepted `<task>` values.
      +
       GIT
       ---
       Part of the linkgit:git[1] suite
      
       ## builtin/gc.c ##
     -@@ builtin/gc.c: typedef int maintenance_task_fn(struct maintenance_opts *opts);
     - struct maintenance_task {
     +@@ builtin/gc.c: int cmd_gc(int argc, const char **argv, const char *prefix)
     + }
     + 
     + static const char * const builtin_maintenance_run_usage[] = {
     +-	N_("git maintenance run [--auto] [--[no-]quiet]"),
     ++	N_("git maintenance run [--auto] [--[no-]quiet] [--task=<task>]"),
     + 	NULL
     + };
     + 
     +@@ builtin/gc.c: struct maintenance_task {
       	const char *name;
       	maintenance_task_fn *fn;
     --	unsigned enabled:1;
     -+	unsigned enabled:1,
     -+		 selected:1;
     + 	unsigned enabled:1;
     ++
     ++	/* -1 if not selected. */
      +	int selected_order;
       };
       
     @@ builtin/gc.c: static struct maintenance_task tasks[] = {
      +	return b->selected_order - a->selected_order;
      +}
      +
     - static int maintenance_run(struct maintenance_opts *opts)
     + static int maintenance_run_tasks(struct maintenance_run_opts *opts)
       {
      -	int i;
      +	int i, found_selected = 0;
       	int result = 0;
       
      +	for (i = 0; !found_selected && i < TASK__COUNT; i++)
     -+		found_selected = tasks[i].selected;
     ++		found_selected = tasks[i].selected_order >= 0;
      +
      +	if (found_selected)
      +		QSORT(tasks, TASK__COUNT, compare_tasks_by_selection);
      +
       	for (i = 0; i < TASK__COUNT; i++) {
      -		if (!tasks[i].enabled)
     -+		if (found_selected && !tasks[i].selected)
     ++		if (found_selected && tasks[i].selected_order < 0)
      +			continue;
      +
      +		if (!found_selected && !tasks[i].enabled)
       			continue;
       
       		if (tasks[i].fn(opts)) {
     -@@ builtin/gc.c: static int maintenance_run(struct maintenance_opts *opts)
     +@@ builtin/gc.c: static int maintenance_run_tasks(struct maintenance_run_opts *opts)
       	return result;
       }
       
     @@ builtin/gc.c: static int maintenance_run(struct maintenance_opts *opts)
      +	BUG_ON_OPT_NEG(unset);
      +
      +	for (i = 0; i < TASK__COUNT; i++) {
     -+		num_selected += tasks[i].selected;
     ++		if (tasks[i].selected_order >= 0)
     ++			num_selected++;
      +		if (!strcasecmp(tasks[i].name, arg)) {
      +			task = &tasks[i];
      +		}
     @@ builtin/gc.c: static int maintenance_run(struct maintenance_opts *opts)
      +		return 1;
      +	}
      +
     -+	if (task->selected) {
     ++	if (task->selected_order >= 0) {
      +		error(_("task '%s' cannot be selected multiple times"), arg);
      +		return 1;
      +	}
      +
     -+	task->selected = 1;
      +	task->selected_order = num_selected + 1;
      +
      +	return 0;
      +}
      +
     - int cmd_maintenance(int argc, const char **argv, const char *prefix)
     + static int maintenance_run(int argc, const char **argv, const char *prefix)
       {
     - 	struct maintenance_opts opts;
     -@@ builtin/gc.c: int cmd_maintenance(int argc, const char **argv, const char *prefix)
     ++	int i;
     + 	struct maintenance_run_opts opts;
     + 	struct option builtin_maintenance_run_options[] = {
     + 		OPT_BOOL(0, "auto", &opts.auto_flag,
       			 N_("run tasks based on the state of the repository")),
       		OPT_BOOL(0, "quiet", &opts.quiet,
       			 N_("do not report progress or other information over stderr")),
     @@ builtin/gc.c: int cmd_maintenance(int argc, const char **argv, const char *prefi
      +			PARSE_OPT_NONEG, task_option_parse),
       		OPT_END()
       	};
     + 	memset(&opts, 0, sizeof(opts));
     + 
     + 	opts.quiet = !isatty(2);
       
     ++	for (i = 0; i < TASK__COUNT; i++)
     ++		tasks[i].selected_order = -1;
     ++
     + 	argc = parse_options(argc, argv, prefix,
     + 			     builtin_maintenance_run_options,
     + 			     builtin_maintenance_run_usage,
      
       ## t/t7900-maintenance.sh ##
      @@ t/t7900-maintenance.sh: test_expect_success 'run [--auto|--quiet]' '
  7:  6f86cfaa94 !  7:  3d513acdd8 maintenance: take a lock on the objects directory
     @@ Commit message
          lock is never committed, since it does not represent meaningful data.
          Instead, it is only a placeholder.
      
     -    If the lock file already exists, then fail silently. This will become
     -    very important later when we implement the 'fetch' task, as this is our
     -    stop-gap from creating a recursive process loop between 'git fetch' and
     -    'git maintenance run'.
     +    If the lock file already exists, then fail with a warning. If '--auto'
     +    is specified, then instead no warning is shown and no tasks are attempted.
     +    This will become very important later when we implement the 'prefetch'
     +    task, as this is our stop-gap from creating a recursive process loop
     +    between 'git fetch' and 'git maintenance run --auto'.
      
          Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
      
       ## builtin/gc.c ##
     -@@ builtin/gc.c: static int maintenance_run(struct maintenance_opts *opts)
     +@@ builtin/gc.c: static int maintenance_run_tasks(struct maintenance_run_opts *opts)
       {
       	int i, found_selected = 0;
       	int result = 0;
     @@ builtin/gc.c: static int maintenance_run(struct maintenance_opts *opts)
      +		 * that case.
      +		 */
      +		if (!opts->auto_flag && !opts->quiet)
     -+			error(_("lock file '%s' exists, skipping maintenance"),
     -+			      lock_path);
     ++			warning(_("lock file '%s' exists, skipping maintenance"),
     ++				lock_path);
      +		free(lock_path);
      +		return 0;
      +	}
      +	free(lock_path);
       
       	for (i = 0; !found_selected && i < TASK__COUNT; i++)
     - 		found_selected = tasks[i].selected;
     -@@ builtin/gc.c: static int maintenance_run(struct maintenance_opts *opts)
     + 		found_selected = tasks[i].selected_order >= 0;
     +@@ builtin/gc.c: static int maintenance_run_tasks(struct maintenance_run_opts *opts)
       		}
       	}
       
  8:  5c0f9d69d1 !  8:  712f5f2d8e maintenance: create maintenance.<task>.enabled config
     @@ Documentation/config/maintenance.txt (new)
      @@
      +maintenance.<task>.enabled::
      +	This boolean config option controls whether the maintenance task
     -+	with name `<task>` is run when no `--task` option is specified.
     -+	By default, only `maintenance.gc.enabled` is true.
     ++	with name `<task>` is run when no `--task` option is specified to
     ++	`git maintenance run`. These config values are ignored if a
     ++	`--task` option exists. By default, only `maintenance.gc.enabled`
     ++	is true.
      
       ## Documentation/git-maintenance.txt ##
      @@ Documentation/git-maintenance.txt: SUBCOMMANDS
     @@ Documentation/git-maintenance.txt: SUBCOMMANDS
       
       TASKS
       -----
     +@@ Documentation/git-maintenance.txt: OPTIONS
     + 
     + --task=<task>::
     + 	If this option is specified one or more times, then only run the
     +-	specified tasks in the specified order. See the 'TASKS' section
     +-	for the list of accepted `<task>` values.
     ++	specified tasks in the specified order. If no `--task=<task>`
     ++	arguments are specified, then only the tasks with
     ++	`maintenance.<task>.enabled` configured as `true` are considered.
     ++	See the 'TASKS' section for the list of accepted `<task>` values.
     + 
     + GIT
     + ---
      
       ## builtin/gc.c ##
     -@@ builtin/gc.c: static int maintenance_run(struct maintenance_opts *opts)
     +@@ builtin/gc.c: static int maintenance_run_tasks(struct maintenance_run_opts *opts)
       	return result;
       }
       
     @@ builtin/gc.c: static int maintenance_run(struct maintenance_opts *opts)
       static int task_option_parse(const struct option *opt,
       			     const char *arg, int unset)
       {
     -@@ builtin/gc.c: int cmd_maintenance(int argc, const char **argv, const char *prefix)
     - 				   builtin_maintenance_options);
     +@@ builtin/gc.c: static int maintenance_run(int argc, const char **argv, const char *prefix)
     + 	memset(&opts, 0, sizeof(opts));
       
       	opts.quiet = !isatty(2);
      +	initialize_task_config();
       
     - 	argc = parse_options(argc, argv, prefix,
     - 			     builtin_maintenance_options,
     + 	for (i = 0; i < TASK__COUNT; i++)
     + 		tasks[i].selected_order = -1;
      
       ## t/t7900-maintenance.sh ##
      @@ t/t7900-maintenance.sh: test_expect_success 'run [--auto|--quiet]' '
  9:  68bf5bef4b !  9:  69d3b48fd4 maintenance: use pointers to check --auto
     @@ Commit message
          Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
      
       ## builtin/gc.c ##
     -@@ builtin/gc.c: static int maintenance_task_gc(struct maintenance_opts *opts)
     +@@ builtin/gc.c: static int maintenance_task_gc(struct maintenance_run_opts *opts)
       
     - typedef int maintenance_task_fn(struct maintenance_opts *opts);
     + typedef int maintenance_task_fn(struct maintenance_run_opts *opts);
       
      +/*
      + * An auto condition function returns 1 if the task should run
     @@ builtin/gc.c: static int maintenance_task_gc(struct maintenance_opts *opts)
       	const char *name;
       	maintenance_task_fn *fn;
      +	maintenance_auto_fn *auto_condition;
     - 	unsigned enabled:1,
     - 		 selected:1;
     - 	int selected_order;
     + 	unsigned enabled:1;
     + 
     + 	/* -1 if not selected. */
      @@ builtin/gc.c: static struct maintenance_task tasks[] = {
       	[TASK_GC] = {
       		"gc",
     @@ builtin/gc.c: static struct maintenance_task tasks[] = {
       		1,
       	},
       	[TASK_COMMIT_GRAPH] = {
     -@@ builtin/gc.c: static int maintenance_run(struct maintenance_opts *opts)
     +@@ builtin/gc.c: static int maintenance_run_tasks(struct maintenance_run_opts *opts)
       		if (!found_selected && !tasks[i].enabled)
       			continue;
       
 10:  fc097c389a ! 10:  4c3115fe35 maintenance: add auto condition for commit-graph task
     @@ Commit message
      
       ## Documentation/config/maintenance.txt ##
      @@ Documentation/config/maintenance.txt: maintenance.<task>.enabled::
     - 	This boolean config option controls whether the maintenance task
     - 	with name `<task>` is run when no `--task` option is specified.
     - 	By default, only `maintenance.gc.enabled` is true.
     + 	`git maintenance run`. These config values are ignored if a
     + 	`--task` option exists. By default, only `maintenance.gc.enabled`
     + 	is true.
      +
      +maintenance.commit-graph.auto::
      +	This integer config option controls how often the `commit-graph` task
     @@ builtin/gc.c
       
       #define FAILED_RUN "failed to run %s"
       
     -@@ builtin/gc.c: struct maintenance_opts {
     +@@ builtin/gc.c: struct maintenance_run_opts {
       	int quiet;
       };
       
      +/* Remember to update object flag allocation in object.h */
     -+#define PARENT1		(1u<<16)
     ++#define SEEN		(1u<<0)
      +
     -+static int num_commits_not_in_graph = 0;
     -+static int limit_commits_not_in_graph = 100;
     ++struct cg_auto_data {
     ++	int num_not_in_graph;
     ++	int limit;
     ++};
      +
      +static int dfs_on_ref(const char *refname,
      +		      const struct object_id *oid, int flags,
      +		      void *cb_data)
      +{
     ++	struct cg_auto_data *data = (struct cg_auto_data *)cb_data;
      +	int result = 0;
      +	struct object_id peeled;
      +	struct commit_list *stack = NULL;
     @@ builtin/gc.c: struct maintenance_opts {
      +		for (parent = commit->parents; parent; parent = parent->next) {
      +			if (parse_commit(parent->item) ||
      +			    commit_graph_position(parent->item) != COMMIT_NOT_FROM_GRAPH ||
     -+			    parent->item->object.flags & PARENT1)
     ++			    parent->item->object.flags & SEEN)
      +				continue;
      +
     -+			parent->item->object.flags |= PARENT1;
     -+			num_commits_not_in_graph++;
     ++			parent->item->object.flags |= SEEN;
     ++			data->num_not_in_graph++;
      +
     -+			if (num_commits_not_in_graph >= limit_commits_not_in_graph) {
     ++			if (data->num_not_in_graph >= data->limit) {
      +				result = 1;
      +				break;
      +			}
     @@ builtin/gc.c: struct maintenance_opts {
      +static int should_write_commit_graph(void)
      +{
      +	int result;
     ++	struct cg_auto_data data;
      +
     ++	data.num_not_in_graph = 0;
     ++	data.limit = 100;
      +	git_config_get_int("maintenance.commit-graph.auto",
     -+			   &limit_commits_not_in_graph);
     ++			   &data.limit);
      +
     -+	if (!limit_commits_not_in_graph)
     ++	if (!data.limit)
      +		return 0;
     -+	if (limit_commits_not_in_graph < 0)
     ++	if (data.limit < 0)
      +		return 1;
      +
     -+	result = for_each_ref(dfs_on_ref, NULL);
     ++	result = for_each_ref(dfs_on_ref, &data);
      +
     -+	clear_commit_marks_all(PARENT1);
     ++	clear_commit_marks_all(SEEN);
      +
      +	return result;
      +}
      +
     - static int run_write_commit_graph(struct maintenance_opts *opts)
     + static int run_write_commit_graph(struct maintenance_run_opts *opts)
       {
       	struct child_process child = CHILD_PROCESS_INIT;
      @@ builtin/gc.c: static struct maintenance_task tasks[] = {
     @@ builtin/gc.c: static struct maintenance_task tasks[] = {
      
       ## object.h ##
      @@ object.h: struct object_array {
     +  * sha1-name.c:                                              20
        * list-objects-filter.c:                                      21
        * builtin/fsck.c:           0--3
     ++ * builtin/gc.c:             0
        * builtin/index-pack.c:                                     2021
     -+ * builtin/maintenance.c:                           16
        * builtin/pack-objects.c:                                   20
        * builtin/reflog.c:                   10--12
     -  * builtin/show-branch.c:    0-------------------------------------------26
 11:  46fbe161aa ! 11:  652a8eac57 maintenance: add trace2 regions for task execution
     @@ Commit message
          Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
      
       ## builtin/gc.c ##
     -@@ builtin/gc.c: static int maintenance_run(struct maintenance_opts *opts)
     +@@ builtin/gc.c: static int maintenance_run_tasks(struct maintenance_run_opts *opts)
       		     !tasks[i].auto_condition()))
       			continue;
       

-- 
gitgitgadget

  parent reply index

Thread overview: 91+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-08-06 15:48 [PATCH " Derrick Stolee via GitGitGadget
2020-08-06 15:48 ` [PATCH 01/11] maintenance: create basic maintenance runner Derrick Stolee via GitGitGadget
2020-08-07 22:16   ` Martin Ågren
2020-08-12 21:03   ` Jonathan Nieder
2020-08-12 22:07     ` Junio C Hamano
2020-08-12 22:50       ` Jonathan Nieder
2020-08-14  1:05     ` Derrick Stolee
2020-08-06 15:48 ` [PATCH 02/11] maintenance: add --quiet option Derrick Stolee via GitGitGadget
2020-08-06 15:48 ` [PATCH 03/11] maintenance: replace run_auto_gc() Derrick Stolee via GitGitGadget
2020-08-06 15:48 ` [PATCH 04/11] maintenance: initialize task array Derrick Stolee via GitGitGadget
2020-08-06 15:48 ` [PATCH 05/11] maintenance: add commit-graph task Derrick Stolee via GitGitGadget
2020-08-07 22:29   ` Martin Ågren
2020-08-12 13:30     ` Derrick Stolee
2020-08-14 12:23       ` Martin Ågren
2020-08-06 15:48 ` [PATCH 06/11] maintenance: add --task option Derrick Stolee via GitGitGadget
2020-08-06 15:48 ` [PATCH 07/11] maintenance: take a lock on the objects directory Derrick Stolee via GitGitGadget
2020-08-06 15:48 ` [PATCH 08/11] maintenance: create maintenance.<task>.enabled config Derrick Stolee via GitGitGadget
2020-08-06 15:48 ` [PATCH 09/11] maintenance: use pointers to check --auto Derrick Stolee via GitGitGadget
2020-08-06 15:48 ` [PATCH 10/11] maintenance: add auto condition for commit-graph task Derrick Stolee via GitGitGadget
2020-08-06 15:48 ` [PATCH 11/11] maintenance: add trace2 regions for task execution Derrick Stolee via GitGitGadget
2020-08-18 14:22 ` [PATCH v2 00/11] Maintenance I: Command, gc and commit-graph tasks Derrick Stolee via GitGitGadget
2020-08-18 14:22   ` [PATCH v2 01/11] maintenance: create basic maintenance runner Derrick Stolee via GitGitGadget
2020-08-18 14:22   ` [PATCH v2 02/11] maintenance: add --quiet option Derrick Stolee via GitGitGadget
2020-08-18 14:23   ` [PATCH v2 03/11] maintenance: replace run_auto_gc() Derrick Stolee via GitGitGadget
2020-08-18 14:23   ` [PATCH v2 04/11] maintenance: initialize task array Derrick Stolee via GitGitGadget
2020-08-18 23:46     ` Jonathan Tan
2020-08-18 14:23   ` [PATCH v2 05/11] maintenance: add commit-graph task Derrick Stolee via GitGitGadget
2020-08-18 23:51     ` Jonathan Tan
2020-08-19 15:04       ` Derrick Stolee
2020-08-19 17:43         ` Jonathan Tan
2020-08-18 14:23   ` [PATCH v2 06/11] maintenance: add --task option Derrick Stolee via GitGitGadget
2020-08-19  0:00     ` Jonathan Tan
2020-08-19  0:36       ` Junio C Hamano
2020-08-19 15:09         ` Derrick Stolee
2020-08-19 17:35           ` Jonathan Tan
2020-08-18 14:23   ` [PATCH v2 07/11] maintenance: take a lock on the objects directory Derrick Stolee via GitGitGadget
2020-08-19  0:04     ` Jonathan Tan
2020-08-19 15:10       ` Derrick Stolee
2020-08-18 14:23   ` [PATCH v2 08/11] maintenance: create maintenance.<task>.enabled config Derrick Stolee via GitGitGadget
2020-08-18 14:23   ` [PATCH v2 09/11] maintenance: use pointers to check --auto Derrick Stolee via GitGitGadget
2020-08-18 14:23   ` [PATCH v2 10/11] maintenance: add auto condition for commit-graph task Derrick Stolee via GitGitGadget
2020-08-19  0:09     ` Jonathan Tan
2020-08-19 15:15       ` Derrick Stolee
2020-08-18 14:23   ` [PATCH v2 11/11] maintenance: add trace2 regions for task execution Derrick Stolee via GitGitGadget
2020-08-19  0:11     ` Jonathan Tan
2020-08-18 20:18   ` [PATCH v2 00/11] Maintenance I: Command, gc and commit-graph tasks Junio C Hamano
2020-08-19 14:51     ` Derrick Stolee
2020-08-25 18:33   ` Derrick Stolee via GitGitGadget [this message]
2020-08-25 18:33     ` [PATCH v3 01/11] maintenance: create basic maintenance runner Derrick Stolee via GitGitGadget
2020-08-25 18:33     ` [PATCH v3 02/11] maintenance: add --quiet option Derrick Stolee via GitGitGadget
2020-08-25 18:33     ` [PATCH v3 03/11] maintenance: replace run_auto_gc() Derrick Stolee via GitGitGadget
2020-08-25 18:33     ` [PATCH v3 04/11] maintenance: initialize task array Derrick Stolee via GitGitGadget
2020-08-25 18:33     ` [PATCH v3 05/11] maintenance: add commit-graph task Derrick Stolee via GitGitGadget
2020-08-25 18:33     ` [PATCH v3 06/11] maintenance: add --task option Derrick Stolee via GitGitGadget
2020-08-25 18:33     ` [PATCH v3 07/11] maintenance: take a lock on the objects directory Derrick Stolee via GitGitGadget
2020-08-26 23:02       ` Jonathan Tan
2020-08-25 18:33     ` [PATCH v3 08/11] maintenance: create maintenance.<task>.enabled config Derrick Stolee via GitGitGadget
2020-08-25 18:33     ` [PATCH v3 09/11] maintenance: use pointers to check --auto Derrick Stolee via GitGitGadget
2020-08-25 18:33     ` [PATCH v3 10/11] maintenance: add auto condition for commit-graph task Derrick Stolee via GitGitGadget
2020-08-26 23:02       ` Jonathan Tan
2020-08-26 23:56         ` Junio C Hamano
2020-08-25 18:33     ` [PATCH v3 11/11] maintenance: add trace2 regions for task execution Derrick Stolee via GitGitGadget
2020-09-04 13:09     ` [PATCH v4 00/11] Maintenance I: Command, gc and commit-graph tasks Derrick Stolee via GitGitGadget
2020-09-04 13:09       ` [PATCH v4 01/11] maintenance: create basic maintenance runner Derrick Stolee via GitGitGadget
2020-09-04 13:09       ` [PATCH v4 02/11] maintenance: add --quiet option Derrick Stolee via GitGitGadget
2020-09-04 13:09       ` [PATCH v4 03/11] maintenance: replace run_auto_gc() Derrick Stolee via GitGitGadget
2020-09-04 13:09       ` [PATCH v4 04/11] maintenance: initialize task array Derrick Stolee via GitGitGadget
2020-09-04 13:09       ` [PATCH v4 05/11] maintenance: add commit-graph task Derrick Stolee via GitGitGadget
2020-09-04 13:09       ` [PATCH v4 06/11] maintenance: add --task option Derrick Stolee via GitGitGadget
2020-09-04 13:09       ` [PATCH v4 07/11] maintenance: take a lock on the objects directory Derrick Stolee via GitGitGadget
2020-09-04 13:09       ` [PATCH v4 08/11] maintenance: create maintenance.<task>.enabled config Derrick Stolee via GitGitGadget
2020-09-04 13:09       ` [PATCH v4 09/11] maintenance: use pointers to check --auto Derrick Stolee via GitGitGadget
2020-09-04 13:09       ` [PATCH v4 10/11] maintenance: add auto condition for commit-graph task Derrick Stolee via GitGitGadget
2020-09-04 13:09       ` [PATCH v4 11/11] maintenance: add trace2 regions for task execution Derrick Stolee via GitGitGadget
2020-09-17 18:11       ` [PATCH v5 00/11] Maintenance I: Command, gc and commit-graph tasks Derrick Stolee via GitGitGadget
2020-09-17 18:11         ` [PATCH v5 01/11] maintenance: create basic maintenance runner Derrick Stolee via GitGitGadget
2020-09-17 18:11         ` [PATCH v5 02/11] maintenance: add --quiet option Derrick Stolee via GitGitGadget
2020-09-17 18:11         ` [PATCH v5 03/11] maintenance: replace run_auto_gc() Derrick Stolee via GitGitGadget
2020-09-17 18:11         ` [PATCH v5 04/11] maintenance: initialize task array Derrick Stolee via GitGitGadget
2020-09-17 18:11         ` [PATCH v5 05/11] maintenance: add commit-graph task Derrick Stolee via GitGitGadget
2020-09-17 18:11         ` [PATCH v5 06/11] maintenance: add --task option Derrick Stolee via GitGitGadget
2020-09-17 18:11         ` [PATCH v5 07/11] maintenance: take a lock on the objects directory Derrick Stolee via GitGitGadget
2020-09-21 13:36           ` Ævar Arnfjörð Bjarmason
2020-09-21 13:43             ` Derrick Stolee
2020-09-21 19:29               ` Junio C Hamano
2020-09-17 18:11         ` [PATCH v5 08/11] maintenance: create maintenance.<task>.enabled config Derrick Stolee via GitGitGadget
2020-09-17 18:11         ` [PATCH v5 09/11] maintenance: use pointers to check --auto Derrick Stolee via GitGitGadget
2020-09-17 18:11         ` [PATCH v5 10/11] maintenance: add auto condition for commit-graph task Derrick Stolee via GitGitGadget
2020-09-17 18:11         ` [PATCH v5 11/11] maintenance: add trace2 regions for task execution Derrick Stolee via GitGitGadget
2020-09-17 18:35         ` [PATCH v5 00/11] Maintenance I: Command, gc and commit-graph tasks Junio C Hamano
2020-09-18 13:14           ` Johannes Schindelin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=pull.695.v3.git.1598380426.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail.com \
    --cc=congdanhqx@gmail.com \
    --cc=derrickstolee@github.com \
    --cc=emilyshaffer@google.com \
    --cc=git@vger.kernel.org \
    --cc=jonathantanmy@google.com \
    --cc=jrnieder@gmail.com \
    --cc=peff@peff.net \
    --cc=phillip.wood123@gmail.com \
    --cc=sandals@crustytoothpaste.net \
    --cc=sluongng@gmail.com \
    --cc=steadmon@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Git Mailing List Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/git/0 git/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 git git/ https://lore.kernel.org/git \
		git@vger.kernel.org
	public-inbox-index git

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.git


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git