git.vger.kernel.org archive mirror
From: "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: jrnieder@gmail.com, jonathantanmy@google.com, sluongng@gmail.com,
	congdanhqx@gmail.com, "SZEDER Gábor" <szeder.dev@gmail.com>,
	"Derrick Stolee" <stolee@gmail.com>,
	"Derrick Stolee" <derrickstolee@github.com>
Subject: [PATCH v2 0/7] Maintenance III: Background maintenance
Date: Fri, 11 Sep 2020 17:49:13 +0000
Message-ID: <pull.724.v2.git.1599846560.gitgitgadget@gmail.com>
In-Reply-To: <pull.724.git.1599234126.gitgitgadget@gmail.com>

This is based on ds/maintenance-part-2 and replaces the RFC from [1].

[1] https://lore.kernel.org/git/pull.680.v3.git.1598629517.gitgitgadget@gmail.com/

This series introduces background maintenance to Git, through an integration
with cron and crontab.

Some preliminary work is done to allow a new --schedule option that tells
the command which tasks to run based on a maintenance.<task>.schedule config
option. The timing is not enforced by Git, but instead is expected to be
provided as a hint from a cron schedule. The options are "hourly", "daily",
and "weekly".
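
A minimal sketch of how such a schedule filter could work. The skip
condition mirrors the 'tasks[i].schedule < opts->schedule' check visible in
the range-diff for PATCH 2; the enum names, their numeric values, and the
helper functions are assumptions made for illustration, not the code from
builtin/gc.c.

```c
#include <assert.h>
#include <string.h>

/*
 * Assumed ordering: more frequent schedules get larger values, so a
 * "daily" run also picks up "hourly" tasks but skips "weekly" ones.
 */
enum schedule_priority {
	SCHEDULE_NONE = 0,
	SCHEDULE_WEEKLY = 1,
	SCHEDULE_DAILY = 2,
	SCHEDULE_HOURLY = 3,
};

/* Map a config or command-line string to a priority (hypothetical helper). */
enum schedule_priority parse_schedule(const char *value)
{
	assert(value != NULL);
	if (!strcmp(value, "hourly"))
		return SCHEDULE_HOURLY;
	if (!strcmp(value, "daily"))
		return SCHEDULE_DAILY;
	if (!strcmp(value, "weekly"))
		return SCHEDULE_WEEKLY;
	return SCHEDULE_NONE; /* unknown strings are treated as unscheduled */
}

/*
 * Return 1 if a task with schedule 'task' should run when the command
 * was invoked with schedule 'requested'. When no schedule was requested,
 * this filter does not exclude anything.
 */
int task_runs(enum schedule_priority task, enum schedule_priority requested)
{
	if (requested && task < requested)
		return 0;
	return 1;
}
```

Under these assumed values, task_runs(SCHEDULE_HOURLY, SCHEDULE_DAILY)
returns 1 and task_runs(SCHEDULE_WEEKLY, SCHEDULE_DAILY) returns 0, and a
task without a configured schedule never runs in a scheduled invocation.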

A new for-each-repo builtin runs Git commands on every repo in a given list.
Currently, the list is stored as a config setting, allowing a new 
maintenance.repos config list to store the repositories registered for
background maintenance. Others may want to add a --file=<file> option for
their own workflows, but I focused on making this as simple as possible for
now.
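
To illustrate the idea behind for-each-repo, here is a rough sketch that
walks a list of repository paths and produces one 'git -C <repo> <args>'
command line per entry. The function names are hypothetical, the list is
passed in directly instead of being read from config, and the commands are
only printed. A real implementation would pass an argument vector to a
child process rather than build an unquoted string.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format the command that would run inside one repository.
 * Returns the snprintf() result: the length the full string needs.
 * No quoting is done, so this is for display purposes only.
 */
int format_repo_command(char *out, size_t len, const char *repo,
			const char *args)
{
	assert(out && repo && args);
	return snprintf(out, len, "git -C %s %s", repo, args);
}

/*
 * Print one command per repository in 'repos'.
 * Returns the number of repositories for which a command was printed.
 */
int for_each_repo_print(FILE *to, const char **repos, size_t nr,
			const char *args)
{
	char cmd[1024];
	size_t i;
	int visited = 0;

	assert(to && repos && args);
	for (i = 0; i < nr; i++) {
		int n = format_repo_command(cmd, sizeof(cmd), repos[i], args);

		if (n < 0 || (size_t)n >= sizeof(cmd))
			continue; /* skip commands that do not fit the buffer */
		fprintf(to, "%s\n", cmd);
		visited++;
	}
	return visited;
}
```

Calling for_each_repo_print(stdout, repos, nr, "maintenance run
--schedule=hourly") with the registered repositories would list the hourly
maintenance command for each of them.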

The updates to the git maintenance builtin include new register/unregister
subcommands and start/stop subcommands. The register subcommand initializes
the config, while the start subcommand does everything register does and
also updates the cron table. The unregister and stop subcommands reverse
this process.
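
As a rough picture of what a cron table update might contain, the sketch
below writes the three entries from the PATCH 2 commit message for a single
repository. It is an illustration under that assumption only: the struct,
the function name, and the idea of one entry set per repository are
hypothetical, and the lines the start subcommand actually writes are not
shown in this message.

```c
#include <assert.h>
#include <stdio.h>

struct cron_entry {
	const char *when;      /* minute hour day-of-month month day-of-week */
	const char *frequency; /* value passed to --schedule */
};

/* Timings copied from the cron table in the PATCH 2 commit message. */
static const struct cron_entry entries[] = {
	{ "0 1-23 * * *", "hourly" }, /* every hour except midnight */
	{ "0 0 * * 1-6",  "daily"  }, /* midnight, all days but the first */
	{ "0 0 * * 0",    "weekly" }, /* midnight on the first day of the week */
};

/*
 * Write one cron line per frequency for 'repo' to 'to'.
 * Returns the number of lines written, or -1 on a write error.
 */
int write_cron_table(FILE *to, const char *repo)
{
	size_t i;
	int lines = 0;

	assert(to && repo);
	for (i = 0; i < sizeof(entries) / sizeof(entries[0]); i++) {
		if (fprintf(to, "%s git -C %s maintenance run --schedule=%s\n",
			    entries[i].when, repo, entries[i].frequency) < 0)
			return -1;
		lines++;
	}
	return lines;
}
```

The hour range 1-23 and the day range 1-6 keep the three entries from
firing at the same moment, which matches the reasoning given in the commit
message about avoiding concurrent runs.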

A troubleshooting guide is added to Documentation/git-maintenance.txt to
advise expert users who choose to create custom cron schedules.

Patch 6 is entirely optional. It sets a recommended schedule
based on my own experience with very large repositories. I'm open to other
suggestions, but these are ones that I think work well and don't cause a
"rewrite the world" scenario like running nightly 'gc' would do.

I've been testing this scenario on my macOS laptop and Linux desktop. I have
modified my cron task to provide logging via trace2 so I can see what's
happening. A future direction here would be to add some maintenance logs to
the repository so we can track what is happening and diagnose whether the
maintenance strategy is working on real repos.

Note: git maintenance (start|stop) only works on machines with cron, by
design. The proper thing to do on Windows will come later. Perhaps this
command should be marked as unavailable on Windows somehow, or at least be
given a better error than "cron may not be available on your system". I did
find that this message is helpful sometimes: macOS worker agents for CI
builds typically do not have cron available.

Updates in v2:

 * Fixed the char/int issue in test-tool crontab, and a typo.
 * Updated the commit message and reduced patch noise in PATCH 2.
 * This should fix the test failures, allowing this to be picked up in
   'seen'.
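
For context on the char/int fix: the stdio character functions return an
int so that EOF can be told apart from every valid byte value. The sketch
below shows the pattern with a hypothetical copy_stream() helper. It is not
the code from t/helper/test-crontab.c, and the use of fgetc() there is an
assumption.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Copy every byte from 'from' to 'to'.
 * Returns the number of bytes copied, or -1 on a write error.
 */
long copy_stream(FILE *from, FILE *to)
{
	int c; /* int, not char: must hold all byte values plus EOF */
	long copied = 0;

	assert(from && to);
	while ((c = fgetc(from)) != EOF) {
		if (fputc(c, to) == EOF)
			return -1;
		copied++;
	}
	return copied;
}
```

With a 'char' variable, a 0xFF byte can compare equal to EOF where char is
signed and end the copy early, while on platforms where char is unsigned
the comparison with EOF never succeeds and the loop does not terminate.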

Derrick Stolee (7):
  maintenance: optionally skip --auto process
  maintenance: add --schedule option and config
  for-each-repo: run subcommands on configured repos
  maintenance: add [un]register subcommands
  maintenance: add start/stop subcommands
  maintenance: recommended schedule in register/start
  maintenance: add troubleshooting guide to docs

 .gitignore                           |   1 +
 Documentation/config/maintenance.txt |  10 +
 Documentation/git-for-each-repo.txt  |  59 ++++++
 Documentation/git-maintenance.txt    |  88 +++++++-
 Makefile                             |   2 +
 builtin.h                            |   1 +
 builtin/for-each-repo.c              |  58 ++++++
 builtin/gc.c                         | 289 ++++++++++++++++++++++++++-
 command-list.txt                     |   1 +
 git.c                                |   1 +
 run-command.c                        |   6 +
 t/helper/test-crontab.c              |  35 ++++
 t/helper/test-tool.c                 |   1 +
 t/helper/test-tool.h                 |   1 +
 t/t0068-for-each-repo.sh             |  30 +++
 t/t7900-maintenance.sh               | 114 ++++++++++-
 t/test-lib.sh                        |   6 +
 17 files changed, 697 insertions(+), 6 deletions(-)
 create mode 100644 Documentation/git-for-each-repo.txt
 create mode 100644 builtin/for-each-repo.c
 create mode 100644 t/helper/test-crontab.c
 create mode 100755 t/t0068-for-each-repo.sh


base-commit: 6f11fba53777584b94dd9ed32976c2079d645fa2
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-724%2Fderrickstolee%2Fmaintenance%2Fscheduled-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-724/derrickstolee/maintenance/scheduled-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/724

Range-diff vs v1:

 1:  bd95729009 = 1:  b21cd68c90 maintenance: optionally skip --auto process
 2:  1783e80b8d ! 2:  e2d14d66d4 maintenance: add --schedule option and config
     @@ Metadata
       ## Commit message ##
          maintenance: add --schedule option and config
      
     -    A user may want to run certain maintenance tasks based on frequency, not
     -    conditions given in the repository. For example, the user may want to
     -    perform a 'prefetch' task every hour, or 'gc' task every day. To assist,
     -    update the 'git maintenance run' command to include a
     -    '--schedule=<frequency>' option. The allowed frequencies are 'hourly',
     -    'daily', and 'weekly'. These values are also allowed in a new config
     -    value 'maintenance.<task>.schedule'.
     +    Maintenance currently triggers when certain data-size thresholds are
     +    met, such as number of pack-files or loose objects. Users may want to
     +    run certain maintenance tasks based on frequency instead. For example,
     +    a user may want to perform a 'prefetch' task every hour, or 'gc' task
     +    every day. To help these users, update the 'git maintenance run' command
     +    to include a '--schedule=<frequency>' option. The allowed frequencies
     +    are 'hourly', 'daily', and 'weekly'. These values are also allowed in a
     +    new config value 'maintenance.<task>.schedule'.
      
          The 'git maintenance run --schedule=<frequency>' checks the '*.schedule'
          config value for each enabled task to see if the configured frequency is
     @@ Commit message
          The following cron table would run the scheduled tasks with the correct
          frequencies:
      
     -      0 1-23 * * *    git -C <repo> maintenance run --scheduled=hourly
     -      0 0    * * 1-6  git -C <repo> maintenance run --scheduled=daily
     -      0 0    * * 0    git -C <repo> maintenance run --scheduled=weekly
     +      0 1-23 * * *    git -C <repo> maintenance run --schedule=hourly
     +      0 0    * * 1-6  git -C <repo> maintenance run --schedule=daily
     +      0 0    * * 0    git -C <repo> maintenance run --schedule=weekly
      
     -    This cron schedule will run --scheduled=hourly every hour except at
     -    midnight. This avoids a concurrent run with the --scheduled=daily that
     +    This cron schedule will run --schedule=hourly every hour except at
     +    midnight. This avoids a concurrent run with the --schedule=daily that
          runs at midnight every day except the first day of the week. This avoids
     -    a concurrent run with the --scheduled=weekly that runs at midnight on
     -    the first day of the week. Since --scheduled=daily also runs the
     -    'hourly' tasks and --scheduled=weekly runs the 'hourly' and 'daily'
     +    a concurrent run with the --schedule=weekly that runs at midnight on
     +    the first day of the week. Since --schedule=daily also runs the
     +    'hourly' tasks and --schedule=weekly runs the 'hourly' and 'daily'
          tasks, we will still see all tasks run with the proper frequencies.
      
          Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
     @@ builtin/gc.c: struct maintenance_task {
       	int selected_order;
       };
      @@ builtin/gc.c: static int maintenance_run_tasks(struct maintenance_run_opts *opts)
     + 		     !tasks[i].auto_condition()))
       			continue;
       
     - 		if (opts->auto_flag &&
     --		    (!tasks[i].auto_condition ||
     --		     !tasks[i].auto_condition()))
     -+		    (!tasks[i].auto_condition || !tasks[i].auto_condition()))
     ++		if (opts->schedule && tasks[i].schedule < opts->schedule)
      +			continue;
      +
     -+		if (opts->schedule && tasks[i].schedule < opts->schedule)
     - 			continue;
     - 
       		trace2_region_enter("maintenance", tasks[i].name, r);
     + 		if (tasks[i].fn(opts)) {
     + 			error(_("task '%s' failed"), tasks[i].name);
      @@ builtin/gc.c: static void initialize_task_config(void)
       
       	for (i = 0; i < TASK__COUNT; i++) {
 3:  6082d939eb = 3:  41a346dfbb for-each-repo: run subcommands on configured repos
 4:  b7775b3aaf = 4:  1f49cda18e maintenance: add [un]register subcommands
 5:  e02641881d ! 5:  e9b2a39c1d maintenance: add start/stop subcommands
     @@ t/helper/test-crontab.c (new)
      +/*
      + * Usage: test-tool cron <file> [-l]
      + *
     -+ * If -l is specified, then write the contents of <file> to stdou.
     ++ * If -l is specified, then write the contents of <file> to stdout.
      + * Otherwise, write from stdin into <file>.
      + */
      +int cmd__crontab(int argc, const char **argv)
      +{
     -+	char a;
     ++	int a;
      +	FILE *from, *to;
      +
      +	if (argc == 3 && !strcmp(argv[2], "-l")) {
 6:  8a285e00e6 = 6:  f609c1bde2 maintenance: recommended schedule in register/start
 7:  c00de53906 = 7:  2344eff4ba maintenance: add troubleshooting guide to docs

-- 
gitgitgadget

