git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/6] maintenance: schedule maintenance on a random minute
@ 2023-08-07 18:51 Derrick Stolee via GitGitGadget
  2023-08-07 18:51 ` [PATCH 1/6] maintenance: add get_random_minute() Derrick Stolee via GitGitGadget
                   ` (6 more replies)
  0 siblings, 7 replies; 47+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2023-08-07 18:51 UTC (permalink / raw)
  To: git; +Cc: gitster, sandals, lenaic, Derrick Stolee

When we initially created background maintenance -- with its hourly, daily,
and weekly schedules -- we considered the effects of all clients launching
fetches to the server every hour on the hour. The worry of DDoSing server
hosts was noted, but left as something we would consider for a future
update.

As background maintenance has gained more adoption over the past three
years, our worries about DDoSing the big Git hosts has been unfounded. Those
systems, especially those serving public repositories, are already resilient
to thundering herds of much smaller scale.

However, sometimes organizations spin up specific custom server
infrastructure either in addition to or on top of their Git host. Some of
these technologies are built for a different range of scale, and can hit
concurrency limits sooner. Organizations with such custom infrastructures
are more likely to recommend tools like scalar which furthers their adoption
of background maintenance.

This series attempts to help by spreading out background maintenance to a
random minute of the hour. This minute is selected during git maintenance
start, and the same minute applies to each of the three schedules.

This isn't a full solution to this problem, as the custom infrastructure
needs to be resilient to bursts of activity, but at least this will help
somewhat.

Each of the integrated schedulers needs a different way of integrating the
random minute. The most problematic is systemd, since our integration had a
clever use of templates to write one schedule that inserted the hourly,
daily, and weekly schedules as a string into the template. This needs some
refactoring before the custom minute could be inserted.

For the most part, each scheduler's integration is relatively simple. That
is, until we get to the systemd integration. That integration made use of a
clever templating technique that is no longer possible when making this
adjustment. The last two patches involve systemd, though one is just a move
of code (without edits) to make the second's diff look a lot simpler.

Thanks, -Stolee

Derrick Stolee (6):
  maintenance: add get_random_minute()
  maintenance: use random minute in launchctl scheduler
  maintenance: use random minute in Windows scheduler
  maintenance: use random minute in cron scheduler
  maintenance: swap method locations
  maintenance: use random minute in systemd scheduler

 builtin/gc.c           | 259 ++++++++++++++++++++++++++---------------
 t/t7900-maintenance.sh |   4 +-
 2 files changed, 166 insertions(+), 97 deletions(-)


base-commit: fe86abd7511a9a6862d5706c6fa1d9b57a63ba09
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1567%2Fderrickstolee%2Fmaintenance-random-minute-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1567/derrickstolee/maintenance-random-minute-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1567
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 1/6] maintenance: add get_random_minute()
  2023-08-07 18:51 [PATCH 0/6] maintenance: schedule maintenance on a random minute Derrick Stolee via GitGitGadget
@ 2023-08-07 18:51 ` Derrick Stolee via GitGitGadget
  2023-08-07 21:20   ` Taylor Blau
  2023-08-07 18:51 ` [PATCH 2/6] maintenance: use random minute in launchctl scheduler Derrick Stolee via GitGitGadget
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 47+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2023-08-07 18:51 UTC (permalink / raw)
  To: git; +Cc: gitster, sandals, lenaic, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <derrickstolee@github.com>

When we initially created background maintenance -- with its hourly,
daily, and weekly schedules -- we considered the effects of all clients
launching fetches to the server every hour on the hour. The worry of
DDoSing server hosts was noted, but left as something we would consider
for a future update.

As background maintenance has gained more adoption over the past three
years, our worries about DDoSing the big Git hosts has been unfounded.
Those systems, especially those serving public repositories, are already
resilient to thundering herds of much smaller scale.

However, sometimes organizations spin up specific custom server
infrastructure either in addition to or on top of their Git host. Some
of these technologies are built for a different range of scale, and can
hit concurrency limits sooner. Organizations with such custom
infrastructures are more likely to recommend tools like `scalar` which
furthers their adoption of background maintenance.

To help solve for this, create get_random_minute() as a method to help
Git select a random minute when creating schedules in the future. The
integrations with this method do not yet exist, but will follow in
future changes.

One thing that is important for testability is that we notice when we
are under a test scenario and return a predictable result. The schedules
themselves are not checked for this value, but at least one launchctl
test checks that we do not unnecessarily reboot the schedule if it has
not changed from a previous version.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 builtin/gc.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/builtin/gc.c b/builtin/gc.c
index f3942188a61..66a972bc292 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1708,6 +1708,23 @@ static int get_schedule_cmd(const char **cmd, int *is_available)
 	return 1;
 }
 
+MAYBE_UNUSED
+static int get_random_minute(void)
+{
+	static int random_initialized = 0;
+
+	/* Use a static value when under tests. */
+	if (!getenv("GIT_TEST_MAINTENANCE_SCHEDULER"))
+		return 13;
+
+	if (!random_initialized) {
+		srand((unsigned int)getpid());
+		random_initialized = 1;
+	}
+
+	return rand() % 60;
+}
+
 static int is_launchctl_available(void)
 {
 	const char *cmd = "launchctl";
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH 2/6] maintenance: use random minute in launchctl scheduler
  2023-08-07 18:51 [PATCH 0/6] maintenance: schedule maintenance on a random minute Derrick Stolee via GitGitGadget
  2023-08-07 18:51 ` [PATCH 1/6] maintenance: add get_random_minute() Derrick Stolee via GitGitGadget
@ 2023-08-07 18:51 ` Derrick Stolee via GitGitGadget
  2023-08-07 21:23   ` Taylor Blau
  2023-08-07 18:51 ` [PATCH 3/6] maintenance: use random minute in Windows scheduler Derrick Stolee via GitGitGadget
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 47+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2023-08-07 18:51 UTC (permalink / raw)
  To: git; +Cc: gitster, sandals, lenaic, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <derrickstolee@github.com>

The get_random_minute() method was created to allow maintenance
schedules to be fixed to a random minute of the hour. This randomness is
only intended to spread out the load from a number of clients, but each
client should have an hour between each maintenance cycle.

Use get_random_minute() when constructing the schedules for launchctl.

The format already includes a 'Minute' key which is modified from 0 to
the random minute.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 builtin/gc.c | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index 66a972bc292..51d6c7620ff 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1708,7 +1708,6 @@ static int get_schedule_cmd(const char **cmd, int *is_available)
 	return 1;
 }
 
-MAYBE_UNUSED
 static int get_random_minute(void)
 {
 	static int random_initialized = 0;
@@ -1837,6 +1836,7 @@ static int launchctl_schedule_plist(const char *exec_path, enum schedule_priorit
 	struct strbuf plist = STRBUF_INIT, plist2 = STRBUF_INIT;
 	struct stat st;
 	const char *cmd = "launchctl";
+	int minute = get_random_minute();
 
 	get_schedule_cmd(&cmd, NULL);
 	preamble = "<?xml version=\"1.0\"?>\n"
@@ -1862,29 +1862,30 @@ static int launchctl_schedule_plist(const char *exec_path, enum schedule_priorit
 	case SCHEDULE_HOURLY:
 		repeat = "<dict>\n"
 			 "<key>Hour</key><integer>%d</integer>\n"
-			 "<key>Minute</key><integer>0</integer>\n"
+			 "<key>Minute</key><integer>%d</integer>\n"
 			 "</dict>\n";
 		for (i = 1; i <= 23; i++)
-			strbuf_addf(&plist, repeat, i);
+			strbuf_addf(&plist, repeat, i, minute);
 		break;
 
 	case SCHEDULE_DAILY:
 		repeat = "<dict>\n"
 			 "<key>Day</key><integer>%d</integer>\n"
 			 "<key>Hour</key><integer>0</integer>\n"
-			 "<key>Minute</key><integer>0</integer>\n"
+			 "<key>Minute</key><integer>%d</integer>\n"
 			 "</dict>\n";
 		for (i = 1; i <= 6; i++)
-			strbuf_addf(&plist, repeat, i);
+			strbuf_addf(&plist, repeat, i, minute);
 		break;
 
 	case SCHEDULE_WEEKLY:
-		strbuf_addstr(&plist,
-			      "<dict>\n"
-			      "<key>Day</key><integer>0</integer>\n"
-			      "<key>Hour</key><integer>0</integer>\n"
-			      "<key>Minute</key><integer>0</integer>\n"
-			      "</dict>\n");
+		strbuf_addf(&plist,
+			    "<dict>\n"
+			    "<key>Day</key><integer>0</integer>\n"
+			    "<key>Hour</key><integer>0</integer>\n"
+			    "<key>Minute</key><integer>%d</integer>\n"
+			    "</dict>\n",
+			    minute);
 		break;
 
 	default:
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH 3/6] maintenance: use random minute in Windows scheduler
  2023-08-07 18:51 [PATCH 0/6] maintenance: schedule maintenance on a random minute Derrick Stolee via GitGitGadget
  2023-08-07 18:51 ` [PATCH 1/6] maintenance: add get_random_minute() Derrick Stolee via GitGitGadget
  2023-08-07 18:51 ` [PATCH 2/6] maintenance: use random minute in launchctl scheduler Derrick Stolee via GitGitGadget
@ 2023-08-07 18:51 ` Derrick Stolee via GitGitGadget
  2023-08-07 18:51 ` [PATCH 4/6] maintenance: use random minute in cron scheduler Derrick Stolee via GitGitGadget
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 47+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2023-08-07 18:51 UTC (permalink / raw)
  To: git; +Cc: gitster, sandals, lenaic, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <derrickstolee@github.com>

The get_random_minute() method was created to allow maintenance
schedules to be fixed to a random minute of the hour. This randomness is
only intended to spread out the load from a number of clients, but each
client should have an hour between each maintenance cycle.

Add this random minute to the Windows scheduler integration.

We need only to modify the minute value for the 'StartBoundary' tag
across the three schedules.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 builtin/gc.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index 51d6c7620ff..f059c9f89d5 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -2002,6 +2002,7 @@ static int schtasks_schedule_task(const char *exec_path, enum schedule_priority
 	const char *frequency = get_frequency(schedule);
 	char *name = schtasks_task_name(frequency);
 	struct strbuf tfilename = STRBUF_INIT;
+	int minute = get_random_minute();
 
 	get_schedule_cmd(&cmd, NULL);
 
@@ -2022,7 +2023,7 @@ static int schtasks_schedule_task(const char *exec_path, enum schedule_priority
 	switch (schedule) {
 	case SCHEDULE_HOURLY:
 		fprintf(tfile->fp,
-			"<StartBoundary>2020-01-01T01:00:00</StartBoundary>\n"
+			"<StartBoundary>2020-01-01T01:%02d:00</StartBoundary>\n"
 			"<Enabled>true</Enabled>\n"
 			"<ScheduleByDay>\n"
 			"<DaysInterval>1</DaysInterval>\n"
@@ -2031,12 +2032,13 @@ static int schtasks_schedule_task(const char *exec_path, enum schedule_priority
 			"<Interval>PT1H</Interval>\n"
 			"<Duration>PT23H</Duration>\n"
 			"<StopAtDurationEnd>false</StopAtDurationEnd>\n"
-			"</Repetition>\n");
+			"</Repetition>\n",
+			minute);
 		break;
 
 	case SCHEDULE_DAILY:
 		fprintf(tfile->fp,
-			"<StartBoundary>2020-01-01T00:00:00</StartBoundary>\n"
+			"<StartBoundary>2020-01-01T00:%02d:00</StartBoundary>\n"
 			"<Enabled>true</Enabled>\n"
 			"<ScheduleByWeek>\n"
 			"<DaysOfWeek>\n"
@@ -2048,19 +2050,21 @@ static int schtasks_schedule_task(const char *exec_path, enum schedule_priority
 			"<Saturday />\n"
 			"</DaysOfWeek>\n"
 			"<WeeksInterval>1</WeeksInterval>\n"
-			"</ScheduleByWeek>\n");
+			"</ScheduleByWeek>\n",
+			minute);
 		break;
 
 	case SCHEDULE_WEEKLY:
 		fprintf(tfile->fp,
-			"<StartBoundary>2020-01-01T00:00:00</StartBoundary>\n"
+			"<StartBoundary>2020-01-01T00:%02d:00</StartBoundary>\n"
 			"<Enabled>true</Enabled>\n"
 			"<ScheduleByWeek>\n"
 			"<DaysOfWeek>\n"
 			"<Sunday />\n"
 			"</DaysOfWeek>\n"
 			"<WeeksInterval>1</WeeksInterval>\n"
-			"</ScheduleByWeek>\n");
+			"</ScheduleByWeek>\n",
+			minute);
 		break;
 
 	default:
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH 4/6] maintenance: use random minute in cron scheduler
  2023-08-07 18:51 [PATCH 0/6] maintenance: schedule maintenance on a random minute Derrick Stolee via GitGitGadget
                   ` (2 preceding siblings ...)
  2023-08-07 18:51 ` [PATCH 3/6] maintenance: use random minute in Windows scheduler Derrick Stolee via GitGitGadget
@ 2023-08-07 18:51 ` Derrick Stolee via GitGitGadget
  2023-08-07 18:51 ` [PATCH 5/6] maintenance: swap method locations Derrick Stolee via GitGitGadget
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 47+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2023-08-07 18:51 UTC (permalink / raw)
  To: git; +Cc: gitster, sandals, lenaic, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <derrickstolee@github.com>

The get_random_minute() method was created to allow maintenance
schedules to be fixed to a random minute of the hour. This randomness is
only intended to spread out the load from a number of clients, but each
client should have an hour between each maintenance cycle.

Add this random minute to the cron integration.

The cron schedule specification starts with a minute indicator, which
was previously inserted as the "0" string but now takes the given minute
as an integer parameter.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 builtin/gc.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index f059c9f89d5..be47000ed36 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -2181,6 +2181,7 @@ static int crontab_update_schedule(int run_maintenance, int fd)
 	FILE *cron_list, *cron_in;
 	struct strbuf line = STRBUF_INIT;
 	struct tempfile *tmpedit = NULL;
+	int minute = get_random_minute();
 
 	get_schedule_cmd(&cmd, NULL);
 	strvec_split(&crontab_list.args, cmd);
@@ -2235,11 +2236,11 @@ static int crontab_update_schedule(int run_maintenance, int fd)
 			"# replaced in the future by a Git command.\n\n");
 
 		strbuf_addf(&line_format,
-			    "%%s %%s * * %%s \"%s/git\" --exec-path=\"%s\" for-each-repo --config=maintenance.repo maintenance run --schedule=%%s\n",
+			    "%%d %%s * * %%s \"%s/git\" --exec-path=\"%s\" for-each-repo --config=maintenance.repo maintenance run --schedule=%%s\n",
 			    exec_path, exec_path);
-		fprintf(cron_in, line_format.buf, "0", "1-23", "*", "hourly");
-		fprintf(cron_in, line_format.buf, "0", "0", "1-6", "daily");
-		fprintf(cron_in, line_format.buf, "0", "0", "0", "weekly");
+		fprintf(cron_in, line_format.buf, minute, "1-23", "*", "hourly");
+		fprintf(cron_in, line_format.buf, minute, "0", "1-6", "daily");
+		fprintf(cron_in, line_format.buf, minute, "0", "0", "weekly");
 		strbuf_release(&line_format);
 
 		fprintf(cron_in, "\n%s\n", END_LINE);
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH 5/6] maintenance: swap method locations
  2023-08-07 18:51 [PATCH 0/6] maintenance: schedule maintenance on a random minute Derrick Stolee via GitGitGadget
                   ` (3 preceding siblings ...)
  2023-08-07 18:51 ` [PATCH 4/6] maintenance: use random minute in cron scheduler Derrick Stolee via GitGitGadget
@ 2023-08-07 18:51 ` Derrick Stolee via GitGitGadget
  2023-08-07 21:24   ` Taylor Blau
  2023-08-07 18:51 ` [PATCH 6/6] maintenance: use random minute in systemd scheduler Derrick Stolee via GitGitGadget
  2023-08-10 20:39 ` [PATCH v2 0/8] maintenance: schedule maintenance on a random minute Derrick Stolee via GitGitGadget
  6 siblings, 1 reply; 47+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2023-08-07 18:51 UTC (permalink / raw)
  To: git; +Cc: gitster, sandals, lenaic, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <derrickstolee@github.com>

The systemd_timer_write_unit_templates() method writes a single template
that is then used to start the hourly, daily, and weekly schedules with
systemd.

However, in order to schedule systemd maintenance on a given minute,
these templates need to be replaced with specific schedules for each of
these jobs.

Before modifying the schedules, move the writing method above the
systemd_timer_enable_unit() method, so we can write a specific schedule
for each unit.

The diff is computed smaller by showing systemd_timer_enable_unit() move
instead of systemd_timer_write_unit_templates().

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 builtin/gc.c | 128 +++++++++++++++++++++++++--------------------------
 1 file changed, 64 insertions(+), 64 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index be47000ed36..b3ef95b10aa 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -2299,70 +2299,6 @@ static char *xdg_config_home_systemd(const char *filename)
 	return xdg_config_home_for("systemd/user", filename);
 }
 
-static int systemd_timer_enable_unit(int enable,
-				     enum schedule_priority schedule)
-{
-	const char *cmd = "systemctl";
-	struct child_process child = CHILD_PROCESS_INIT;
-	const char *frequency = get_frequency(schedule);
-
-	/*
-	 * Disabling the systemd unit while it is already disabled makes
-	 * systemctl print an error.
-	 * Let's ignore it since it means we already are in the expected state:
-	 * the unit is disabled.
-	 *
-	 * On the other hand, enabling a systemd unit which is already enabled
-	 * produces no error.
-	 */
-	if (!enable)
-		child.no_stderr = 1;
-
-	get_schedule_cmd(&cmd, NULL);
-	strvec_split(&child.args, cmd);
-	strvec_pushl(&child.args, "--user", enable ? "enable" : "disable",
-		     "--now", NULL);
-	strvec_pushf(&child.args, "git-maintenance@%s.timer", frequency);
-
-	if (start_command(&child))
-		return error(_("failed to start systemctl"));
-	if (finish_command(&child))
-		/*
-		 * Disabling an already disabled systemd unit makes
-		 * systemctl fail.
-		 * Let's ignore this failure.
-		 *
-		 * Enabling an enabled systemd unit doesn't fail.
-		 */
-		if (enable)
-			return error(_("failed to run systemctl"));
-	return 0;
-}
-
-static int systemd_timer_delete_unit_templates(void)
-{
-	int ret = 0;
-	char *filename = xdg_config_home_systemd("git-maintenance@.timer");
-	if (unlink(filename) && !is_missing_file_error(errno))
-		ret = error_errno(_("failed to delete '%s'"), filename);
-	FREE_AND_NULL(filename);
-
-	filename = xdg_config_home_systemd("git-maintenance@.service");
-	if (unlink(filename) && !is_missing_file_error(errno))
-		ret = error_errno(_("failed to delete '%s'"), filename);
-
-	free(filename);
-	return ret;
-}
-
-static int systemd_timer_delete_units(void)
-{
-	return systemd_timer_enable_unit(0, SCHEDULE_HOURLY) ||
-	       systemd_timer_enable_unit(0, SCHEDULE_DAILY) ||
-	       systemd_timer_enable_unit(0, SCHEDULE_WEEKLY) ||
-	       systemd_timer_delete_unit_templates();
-}
-
 static int systemd_timer_write_unit_templates(const char *exec_path)
 {
 	char *filename;
@@ -2444,6 +2380,70 @@ error:
 	return -1;
 }
 
+static int systemd_timer_enable_unit(int enable,
+				     enum schedule_priority schedule)
+{
+	const char *cmd = "systemctl";
+	struct child_process child = CHILD_PROCESS_INIT;
+	const char *frequency = get_frequency(schedule);
+
+	/*
+	 * Disabling the systemd unit while it is already disabled makes
+	 * systemctl print an error.
+	 * Let's ignore it since it means we already are in the expected state:
+	 * the unit is disabled.
+	 *
+	 * On the other hand, enabling a systemd unit which is already enabled
+	 * produces no error.
+	 */
+	if (!enable)
+		child.no_stderr = 1;
+
+	get_schedule_cmd(&cmd, NULL);
+	strvec_split(&child.args, cmd);
+	strvec_pushl(&child.args, "--user", enable ? "enable" : "disable",
+		     "--now", NULL);
+	strvec_pushf(&child.args, "git-maintenance@%s.timer", frequency);
+
+	if (start_command(&child))
+		return error(_("failed to start systemctl"));
+	if (finish_command(&child))
+		/*
+		 * Disabling an already disabled systemd unit makes
+		 * systemctl fail.
+		 * Let's ignore this failure.
+		 *
+		 * Enabling an enabled systemd unit doesn't fail.
+		 */
+		if (enable)
+			return error(_("failed to run systemctl"));
+	return 0;
+}
+
+static int systemd_timer_delete_unit_templates(void)
+{
+	int ret = 0;
+	char *filename = xdg_config_home_systemd("git-maintenance@.timer");
+	if (unlink(filename) && !is_missing_file_error(errno))
+		ret = error_errno(_("failed to delete '%s'"), filename);
+	FREE_AND_NULL(filename);
+
+	filename = xdg_config_home_systemd("git-maintenance@.service");
+	if (unlink(filename) && !is_missing_file_error(errno))
+		ret = error_errno(_("failed to delete '%s'"), filename);
+
+	free(filename);
+	return ret;
+}
+
+static int systemd_timer_delete_units(void)
+{
+	return systemd_timer_enable_unit(0, SCHEDULE_HOURLY) ||
+	       systemd_timer_enable_unit(0, SCHEDULE_DAILY) ||
+	       systemd_timer_enable_unit(0, SCHEDULE_WEEKLY) ||
+	       systemd_timer_delete_unit_templates();
+}
+
 static int systemd_timer_setup_units(void)
 {
 	const char *exec_path = git_exec_path();
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH 6/6] maintenance: use random minute in systemd scheduler
  2023-08-07 18:51 [PATCH 0/6] maintenance: schedule maintenance on a random minute Derrick Stolee via GitGitGadget
                   ` (4 preceding siblings ...)
  2023-08-07 18:51 ` [PATCH 5/6] maintenance: swap method locations Derrick Stolee via GitGitGadget
@ 2023-08-07 18:51 ` Derrick Stolee via GitGitGadget
  2023-08-07 21:31   ` Taylor Blau
                     ` (2 more replies)
  2023-08-10 20:39 ` [PATCH v2 0/8] maintenance: schedule maintenance on a random minute Derrick Stolee via GitGitGadget
  6 siblings, 3 replies; 47+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2023-08-07 18:51 UTC (permalink / raw)
  To: git; +Cc: gitster, sandals, lenaic, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <derrickstolee@github.com>

The get_random_minute() method was created to allow maintenance
schedules to be fixed to a random minute of the hour. This randomness is
only intended to spread out the load from a number of clients, but each
client should have an hour between each maintenance cycle.

Add this random minute to the systemd integration.

This integration is more complicated than similar changes for other
schedulers because of a neat trick that systemd allows: templating.

The previous implementation generated two template files with names
of the form 'git-maintenance@.(timer|service)'. The '.timer' or
'.service' indicates that this is a template that is picked up when we
later specify '...@<schedule>.timer' or '...@<schedule>.service'. The
'<schedule>' string is then used to insert into the template both the
'OnCalendar' schedule setting and the '--schedule' parameter of the
'git maintenance run' command.

In order to set these schedules to a given minute, we can no longer use
the 'hourly', 'daily', or 'weekly' strings for '<schedule>' and instead
need to abandon the template model.

Modify the template with a custom schedule in the 'OnCalendar' setting.
This schedule has some interesting differences from cron-like patterns,
but is relatively easy to figure out from context. The one that might be
confusing is that '*-*-*' is a date-based pattern, but this must be
omitted when using 'Mon' to signal that we care about the day of the
week. Monday is used since that matches the day used for the 'weekly'
schedule used previously.

The rest of the change involves making sure we are writing these .timer
and .service files before initializing the schedule with 'systemctl' and
deleting the files when we are done. Some changes are also made to share
the random minute along with a single computation of the execution path
of the current Git executable.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 builtin/gc.c           | 82 ++++++++++++++++++++++++++++++++----------
 t/t7900-maintenance.sh |  4 ++-
 2 files changed, 66 insertions(+), 20 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index b3ef95b10aa..5f5bb95641f 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -2299,13 +2299,20 @@ static char *xdg_config_home_systemd(const char *filename)
 	return xdg_config_home_for("systemd/user", filename);
 }
 
-static int systemd_timer_write_unit_templates(const char *exec_path)
+static int systemd_timer_write_unit_template(enum schedule_priority schedule,
+					     const char *exec_path,
+					     int minute)
 {
 	char *filename;
 	FILE *file;
 	const char *unit;
+	char *schedule_pattern = NULL;
+	const char *frequency = get_frequency(schedule);
+	char *local_timer_name = xstrfmt("git-maintenance@%s.timer", frequency);
+	char *local_service_name = xstrfmt("git-maintenance@%s.service", frequency);
+
+	filename = xdg_config_home_systemd(local_timer_name);
 
-	filename = xdg_config_home_systemd("git-maintenance@.timer");
 	if (safe_create_leading_directories(filename)) {
 		error(_("failed to create directories for '%s'"), filename);
 		goto error;
@@ -2314,6 +2321,23 @@ static int systemd_timer_write_unit_templates(const char *exec_path)
 	if (!file)
 		goto error;
 
+	switch (schedule) {
+	case SCHEDULE_HOURLY:
+		schedule_pattern = xstrfmt("*-*-* *:%02d:00", minute);
+		break;
+
+	case SCHEDULE_DAILY:
+		schedule_pattern = xstrfmt("*-*-* 0:%02d:00", minute);
+		break;
+
+	case SCHEDULE_WEEKLY:
+		schedule_pattern = xstrfmt("Mon 0:%02d:00", minute);
+		break;
+
+	default:
+		BUG("Unhandled schedule_priority");
+	}
+
 	unit = "# This file was created and is maintained by Git.\n"
 	       "# Any edits made in this file might be replaced in the future\n"
 	       "# by a Git command.\n"
@@ -2322,12 +2346,12 @@ static int systemd_timer_write_unit_templates(const char *exec_path)
 	       "Description=Optimize Git repositories data\n"
 	       "\n"
 	       "[Timer]\n"
-	       "OnCalendar=%i\n"
+	       "OnCalendar=%s\n"
 	       "Persistent=true\n"
 	       "\n"
 	       "[Install]\n"
 	       "WantedBy=timers.target\n";
-	if (fputs(unit, file) == EOF) {
+	if (fprintf(file, unit, schedule_pattern) < 0) {
 		error(_("failed to write to '%s'"), filename);
 		fclose(file);
 		goto error;
@@ -2338,7 +2362,7 @@ static int systemd_timer_write_unit_templates(const char *exec_path)
 	}
 	free(filename);
 
-	filename = xdg_config_home_systemd("git-maintenance@.service");
+	filename = xdg_config_home_systemd(local_service_name);
 	file = fopen_or_warn(filename, "w");
 	if (!file)
 		goto error;
@@ -2352,7 +2376,7 @@ static int systemd_timer_write_unit_templates(const char *exec_path)
 	       "\n"
 	       "[Service]\n"
 	       "Type=oneshot\n"
-	       "ExecStart=\"%s/git\" --exec-path=\"%s\" for-each-repo --config=maintenance.repo maintenance run --schedule=%%i\n"
+	       "ExecStart=\"%s/git\" --exec-path=\"%s\" for-each-repo --config=maintenance.repo maintenance run --schedule=%s\n"
 	       "LockPersonality=yes\n"
 	       "MemoryDenyWriteExecute=yes\n"
 	       "NoNewPrivileges=yes\n"
@@ -2362,7 +2386,7 @@ static int systemd_timer_write_unit_templates(const char *exec_path)
 	       "RestrictSUIDSGID=yes\n"
 	       "SystemCallArchitectures=native\n"
 	       "SystemCallFilter=@system-service\n";
-	if (fprintf(file, unit, exec_path, exec_path) < 0) {
+	if (fprintf(file, unit, exec_path, exec_path, frequency) < 0) {
 		error(_("failed to write to '%s'"), filename);
 		fclose(file);
 		goto error;
@@ -2375,13 +2399,16 @@ static int systemd_timer_write_unit_templates(const char *exec_path)
 	return 0;
 
 error:
+	free(schedule_pattern);
+	free(local_timer_name);
 	free(filename);
-	systemd_timer_delete_unit_templates();
 	return -1;
 }
 
 static int systemd_timer_enable_unit(int enable,
-				     enum schedule_priority schedule)
+				     enum schedule_priority schedule,
+				     const char *exec_path,
+				     int minute)
 {
 	const char *cmd = "systemctl";
 	struct child_process child = CHILD_PROCESS_INIT;
@@ -2398,6 +2425,8 @@ static int systemd_timer_enable_unit(int enable,
 	 */
 	if (!enable)
 		child.no_stderr = 1;
+	else if (systemd_timer_write_unit_template(schedule, exec_path, minute))
+		return -1;
 
 	get_schedule_cmd(&cmd, NULL);
 	strvec_split(&child.args, cmd);
@@ -2420,38 +2449,53 @@ static int systemd_timer_enable_unit(int enable,
 	return 0;
 }
 
-static int systemd_timer_delete_unit_templates(void)
+static int systemd_timer_delete_unit_template(enum schedule_priority priority)
 {
+	const char *frequency = get_frequency(priority);
+	char *local_timer_name = xstrfmt("git-maintenance@%s.timer", frequency);
+	char *local_service_name = xstrfmt("git-maintenance@%s.service", frequency);
 	int ret = 0;
-	char *filename = xdg_config_home_systemd("git-maintenance@.timer");
+	char *filename = xdg_config_home_systemd(local_timer_name);
 	if (unlink(filename) && !is_missing_file_error(errno))
 		ret = error_errno(_("failed to delete '%s'"), filename);
 	FREE_AND_NULL(filename);
 
-	filename = xdg_config_home_systemd("git-maintenance@.service");
+	filename = xdg_config_home_systemd(local_service_name);
 	if (unlink(filename) && !is_missing_file_error(errno))
 		ret = error_errno(_("failed to delete '%s'"), filename);
 
 	free(filename);
+	free(local_timer_name);
+	free(local_service_name);
 	return ret;
 }
 
+static int systemd_timer_delete_unit_templates(void)
+{
+	/* Purposefully not short-circuited to make sure all are called. */
+	return systemd_timer_delete_unit_template(SCHEDULE_HOURLY) |
+	       systemd_timer_delete_unit_template(SCHEDULE_DAILY) |
+	       systemd_timer_delete_unit_template(SCHEDULE_WEEKLY);
+}
+
 static int systemd_timer_delete_units(void)
 {
-	return systemd_timer_enable_unit(0, SCHEDULE_HOURLY) ||
-	       systemd_timer_enable_unit(0, SCHEDULE_DAILY) ||
-	       systemd_timer_enable_unit(0, SCHEDULE_WEEKLY) ||
+	int minute = get_random_minute();
+	const char *exec_path = git_exec_path();
+	return systemd_timer_enable_unit(0, SCHEDULE_HOURLY, exec_path, minute) ||
+	       systemd_timer_enable_unit(0, SCHEDULE_DAILY, exec_path, minute) ||
+	       systemd_timer_enable_unit(0, SCHEDULE_WEEKLY, exec_path, minute) ||
 	       systemd_timer_delete_unit_templates();
 }
 
 static int systemd_timer_setup_units(void)
 {
+	int minute = get_random_minute();
 	const char *exec_path = git_exec_path();
 
-	int ret = systemd_timer_write_unit_templates(exec_path) ||
-		  systemd_timer_enable_unit(1, SCHEDULE_HOURLY) ||
-		  systemd_timer_enable_unit(1, SCHEDULE_DAILY) ||
-		  systemd_timer_enable_unit(1, SCHEDULE_WEEKLY);
+	int ret = systemd_timer_enable_unit(1, SCHEDULE_HOURLY, exec_path, minute) ||
+		  systemd_timer_enable_unit(1, SCHEDULE_DAILY, exec_path, minute) ||
+		  systemd_timer_enable_unit(1, SCHEDULE_WEEKLY, exec_path, minute);
 	if (ret)
 		systemd_timer_delete_units();
 	return ret;
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 487e326b3fa..f5a93f5aecf 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -744,7 +744,9 @@ test_expect_success 'start and stop Linux/systemd maintenance' '
 	# start registers the repo
 	git config --get --global --fixed-value maintenance.repo "$(pwd)" &&
 
-	test_systemd_analyze_verify "systemd/user/git-maintenance@.service" &&
+	test_systemd_analyze_verify "systemd/user/git-maintenance@hourly.service" &&
+	test_systemd_analyze_verify "systemd/user/git-maintenance@daily.service" &&
+	test_systemd_analyze_verify "systemd/user/git-maintenance@weekly.service" &&
 
 	printf -- "--user enable --now git-maintenance@%s.timer\n" hourly daily weekly >expect &&
 	test_cmp expect args &&
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [PATCH 1/6] maintenance: add get_random_minute()
  2023-08-07 18:51 ` [PATCH 1/6] maintenance: add get_random_minute() Derrick Stolee via GitGitGadget
@ 2023-08-07 21:20   ` Taylor Blau
  2023-08-07 23:53     ` Junio C Hamano
  2023-08-08 17:28     ` Derrick Stolee
  0 siblings, 2 replies; 47+ messages in thread
From: Taylor Blau @ 2023-08-07 21:20 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, gitster, sandals, lenaic, Derrick Stolee

On Mon, Aug 07, 2023 at 06:51:35PM +0000, Derrick Stolee via GitGitGadget wrote:
> diff --git a/builtin/gc.c b/builtin/gc.c
> index f3942188a61..66a972bc292 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -1708,6 +1708,23 @@ static int get_schedule_cmd(const char **cmd, int *is_available)
>  	return 1;
>  }
>
> +MAYBE_UNUSED
> +static int get_random_minute(void)
> +{
> +	static int random_initialized = 0;
> +
> +	/* Use a static value when under tests. */
> +	if (!getenv("GIT_TEST_MAINTENANCE_SCHEDULER"))
> +		return 13;
> +
> +	if (!random_initialized) {
> +		srand((unsigned int)getpid());
> +		random_initialized = 1;
> +	}

I was wondering where else we call srand() within Git, and it looks like
the only other spot is in `lock_file_timeout()`.

I doubt it, but is there a chance that that code depends on only calling
srand() once? I think the answer is "no", since we only use rand()
within that function to generate a random-ish backoff period, so I think
the repeatability of it doesn't matter all that much.

So I think this is kind of outside the scope of your series, but I
wonder if we should have a git_rand() that automatically initializes the
PRNG with the value of getpid()? Then multiple callers can grab random
values at will without reinitializing the PRNG.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 2/6] maintenance: use random minute in launchctl scheduler
  2023-08-07 18:51 ` [PATCH 2/6] maintenance: use random minute in launchctl scheduler Derrick Stolee via GitGitGadget
@ 2023-08-07 21:23   ` Taylor Blau
  0 siblings, 0 replies; 47+ messages in thread
From: Taylor Blau @ 2023-08-07 21:23 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, gitster, sandals, lenaic, Derrick Stolee

On Mon, Aug 07, 2023 at 06:51:36PM +0000, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <derrickstolee@github.com>
>
> The get_random_minute() method was created to allow maintenance
> schedules to be fixed to a random minute of the hour. This randomness is
> only intended to spread out the load from a number of clients, but each
> client should have an hour between each maintenance cycle.
>
> Use get_random_minute() when constructing the schedules for launchctl.
>
> The format already includes a 'Minute' key which is modified from 0 to
> the random minute.

All looks good. I was wondering if this is something that we'd want to
allow users to opt out of, but I think that the built-in schedules are
sufficiently vague that we can afford to have a fudge factor on the
exact minute.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 5/6] maintenance: swap method locations
  2023-08-07 18:51 ` [PATCH 5/6] maintenance: swap method locations Derrick Stolee via GitGitGadget
@ 2023-08-07 21:24   ` Taylor Blau
  0 siblings, 0 replies; 47+ messages in thread
From: Taylor Blau @ 2023-08-07 21:24 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, gitster, sandals, lenaic, Derrick Stolee

On Mon, Aug 07, 2023 at 06:51:39PM +0000, Derrick Stolee via GitGitGadget wrote:
> The diff is computed smaller by showing systemd_timer_enable_unit() move
> instead of systemd_timer_write_unit_templates().
>
> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
> ---
>  builtin/gc.c | 128 +++++++++++++++++++++++++--------------------------
>  1 file changed, 64 insertions(+), 64 deletions(-)

This looks like a straightforward code movement, and diffing with
`--color-moved` verifies it as such. LGTM.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 6/6] maintenance: use random minute in systemd scheduler
  2023-08-07 18:51 ` [PATCH 6/6] maintenance: use random minute in systemd scheduler Derrick Stolee via GitGitGadget
@ 2023-08-07 21:31   ` Taylor Blau
  2023-08-08 13:49     ` Derrick Stolee
  2023-08-08  9:53   ` Phillip Wood
  2023-08-08 12:08   ` Phillip Wood
  2 siblings, 1 reply; 47+ messages in thread
From: Taylor Blau @ 2023-08-07 21:31 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, gitster, sandals, lenaic, Derrick Stolee

On Mon, Aug 07, 2023 at 06:51:40PM +0000, Derrick Stolee via GitGitGadget wrote:
> In order to set these schedules to a given minute, we can no longer use
> the 'hourly', 'daily', or 'weekly' strings for '<schedule>' and instead
> need to abandon the template model.

Makes sense.

> Modify the template with a custom schedule in the 'OnCalendar' setting.
> This schedule has some interesting differences from cron-like patterns,
> but is relatively easy to figure out from context. The one that might be
> confusing is that '*-*-*' is a date-based pattern, but this must be
> omitted when using 'Mon' to signal that we care about the day of the
> week. Monday is used since that matches the day used for the 'weekly'
> schedule used previously.

I think the launchd version (which uses "0" for the day of the week)
runs on Sunday, if I remember correctly. I don't think that these two
necessarily need to run on the same day of the week when configured to
run weekly.

But I figured I'd raise the question in case you did mean for them to
both run on either Sunday or Monday.

> The rest of the change involves making sure we are writing these .timer
> and .service files before initializing the schedule with 'systemctl' and
> deleting the files when we are done. Some changes are also made to share
> the random minute along with a single computation of the execution path
> of the current Git executable.
>
> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
> ---
>  builtin/gc.c           | 82 ++++++++++++++++++++++++++++++++----------
>  t/t7900-maintenance.sh |  4 ++-
>  2 files changed, 66 insertions(+), 20 deletions(-)
>
> diff --git a/builtin/gc.c b/builtin/gc.c
> index b3ef95b10aa..5f5bb95641f 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -2299,13 +2299,20 @@ static char *xdg_config_home_systemd(const char *filename)
>  	return xdg_config_home_for("systemd/user", filename);
>  }
>
> -static int systemd_timer_write_unit_templates(const char *exec_path)
> +static int systemd_timer_write_unit_template(enum schedule_priority schedule,
> +					     const char *exec_path,
> +					     int minute)
>  {
>  	char *filename;
>  	FILE *file;
>  	const char *unit;
> +	char *schedule_pattern = NULL;

You should be able to drop the NULL initialization, since you assign
this value unconditionally in the switch statement below (or BUG() on an
unknown schedule type).

> +	const char *frequency = get_frequency(schedule);
> +	char *local_timer_name = xstrfmt("git-maintenance@%s.timer", frequency);
> +	char *local_service_name = xstrfmt("git-maintenance@%s.service", frequency);
> +
> +	filename = xdg_config_home_systemd(local_timer_name);
>
> -	filename = xdg_config_home_systemd("git-maintenance@.timer");
>  	if (safe_create_leading_directories(filename)) {
>  		error(_("failed to create directories for '%s'"), filename);
>  		goto error;
> @@ -2314,6 +2321,23 @@ static int systemd_timer_write_unit_templates(const char *exec_path)
>  	if (!file)
>  		goto error;
>
> +	switch (schedule) {
> +	case SCHEDULE_HOURLY:
> +		schedule_pattern = xstrfmt("*-*-* *:%02d:00", minute);
> +		break;
> +
> +	case SCHEDULE_DAILY:
> +		schedule_pattern = xstrfmt("*-*-* 0:%02d:00", minute);
> +		break;
> +
> +	case SCHEDULE_WEEKLY:
> +		schedule_pattern = xstrfmt("Mon 0:%02d:00", minute);
> +		break;
> +
> +	default:
> +		BUG("Unhandled schedule_priority");
> +	}
> +
>  	unit = "# This file was created and is maintained by Git.\n"
>  	       "# Any edits made in this file might be replaced in the future\n"
>  	       "# by a Git command.\n"
> @@ -2322,12 +2346,12 @@ static int systemd_timer_write_unit_templates(const char *exec_path)
>  	       "Description=Optimize Git repositories data\n"
>  	       "\n"
>  	       "[Timer]\n"
> -	       "OnCalendar=%i\n"
> +	       "OnCalendar=%s\n"
>  	       "Persistent=true\n"
>  	       "\n"
>  	       "[Install]\n"
>  	       "WantedBy=timers.target\n";
> -	if (fputs(unit, file) == EOF) {
> +	if (fprintf(file, unit, schedule_pattern) < 0) {

OK, this is the templating part that you were mentioning earlier. I was
wondering what we were doing fputs()-ing a string with "%i" in it
without a formatting value to fill it in with. But that "%i" pertains to
systemd's instance value, IIUC.

The rest all looks good, thanks.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 1/6] maintenance: add get_random_minute()
  2023-08-07 21:20   ` Taylor Blau
@ 2023-08-07 23:53     ` Junio C Hamano
  2023-08-08  0:22       ` Junio C Hamano
  2023-08-08 17:28     ` Derrick Stolee
  1 sibling, 1 reply; 47+ messages in thread
From: Junio C Hamano @ 2023-08-07 23:53 UTC (permalink / raw)
  To: Taylor Blau
  Cc: Derrick Stolee via GitGitGadget, git, sandals, lenaic, Derrick Stolee

Taylor Blau <me@ttaylorr.com> writes:

> I was wondering where else we call srand() within Git, and it looks like
> the only other spot is in `lock_file_timeout()`.

lock_file_timeout() should be updated to match git_mkstemps_mode(),
which was taught to use the csprng_bytes() function with 47efda96
(wrapper: use a CSPRNG to generate random file names, 2022-01-17),
and this new caller may want to do so as well, perhaps?  I dunno,
but the caller then does not have to worry about "initializing it
just once".

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 1/6] maintenance: add get_random_minute()
  2023-08-07 23:53     ` Junio C Hamano
@ 2023-08-08  0:22       ` Junio C Hamano
  2023-08-08 14:48         ` Taylor Blau
  0 siblings, 1 reply; 47+ messages in thread
From: Junio C Hamano @ 2023-08-08  0:22 UTC (permalink / raw)
  To: Taylor Blau
  Cc: Derrick Stolee via GitGitGadget, git, sandals, lenaic, Derrick Stolee

Junio C Hamano <gitster@pobox.com> writes:

> Taylor Blau <me@ttaylorr.com> writes:
>
>> I was wondering where else we call srand() within Git, and it looks like
>> the only other spot is in `lock_file_timeout()`.
>
> lock_file_timeout() should be updated to match git_mkstemps_mode(),
> which was taught to use the csprng_bytes() function with 47efda96
> (wrapper: use a CSPRNG to generate random file names, 2022-01-17),
> and this new caller may want to do so as well, perhaps?  I dunno,
> but the caller then does not have to worry about "initializing it
> just once".

Of course, the obvious downside is that crypto-secure one may be,
unlike for its use in mkstemps(), way overkill for lockfiles and
cron dispersion purposes, as these codepaths are not on the target
surface.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 6/6] maintenance: use random minute in systemd scheduler
  2023-08-07 18:51 ` [PATCH 6/6] maintenance: use random minute in systemd scheduler Derrick Stolee via GitGitGadget
  2023-08-07 21:31   ` Taylor Blau
@ 2023-08-08  9:53   ` Phillip Wood
  2023-08-08 13:03     ` Phillip Wood
  2023-08-08 13:56     ` Derrick Stolee
  2023-08-08 12:08   ` Phillip Wood
  2 siblings, 2 replies; 47+ messages in thread
From: Phillip Wood @ 2023-08-08  9:53 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget, git
  Cc: gitster, sandals, lenaic, Derrick Stolee, Taylor Blau

Hi Stolee

On 07/08/2023 19:51, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <derrickstolee@github.com>
> +	switch (schedule) {
> +	case SCHEDULE_HOURLY:
> +		schedule_pattern = xstrfmt("*-*-* *:%02d:00", minute);
> +		break;
> +
> +	case SCHEDULE_DAILY:
> +		schedule_pattern = xstrfmt("*-*-* 0:%02d:00", minute);
> +		break;
> +
> +	case SCHEDULE_WEEKLY:
> +		schedule_pattern = xstrfmt("Mon 0:%02d:00", minute);
> +		break;

This is not a new issue with this patch but we run the hourly job even 
when we want to run the daily job or the weekly job and we run the daily 
job when we want to run the weekly job.  maintenance_run_tasks() contains

	if (hold_lock_file_for_update(&lk, lock_path, LOCK_NO_DEREF) < 0) {
		/*
		 * Another maintenance command is running.
		 *
		 * If --auto was provided, then it is likely due to a
		 * recursive process stack. Do not report an error in
		 * that case.
		 */
		if (!opts->auto_flag && !opts->quiet)
			warning(_("lock file '%s' exists, skipping maintenance"),
				lock_path);
		free(lock_path);
		return 0;
	}

So only one of these jobs will succeed. The cron entries are careful to 
only run one job at a time, I think it would be worth doing the same 
thing here. I think the using the following format strings would fix this.

Hourly: "Tue..Sun *-*-* 1..23:00:%02d"
Daily:  "Tue..Sun *-*-* 00:00:%02d"
Weekly: "Mon      *-*-* 00:00:%02d"

It looks like the launchctl schedule has the same issue.

One thing I've been wondering about which is related to maintenance but 
totally off-topic for this patch is that I think when auto maintenance 
is enabled we stop automatically running "gc" so how do the reflogs get 
expired?

Best Wishes

Phillip

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 6/6] maintenance: use random minute in systemd scheduler
  2023-08-07 18:51 ` [PATCH 6/6] maintenance: use random minute in systemd scheduler Derrick Stolee via GitGitGadget
  2023-08-07 21:31   ` Taylor Blau
  2023-08-08  9:53   ` Phillip Wood
@ 2023-08-08 12:08   ` Phillip Wood
  2023-08-08 17:06     ` Derrick Stolee
  2 siblings, 1 reply; 47+ messages in thread
From: Phillip Wood @ 2023-08-08 12:08 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget, git
  Cc: gitster, sandals, lenaic, Derrick Stolee

On 07/08/2023 19:51, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <derrickstolee@github.com>
> 
> The get_random_minute() method was created to allow maintenance
> schedules to be fixed to a random minute of the hour. This randomness is
> only intended to spread out the load from a number of clients, but each
> client should have an hour between each maintenance cycle.
> 
> Add this random minute to the systemd integration.

I think it makes sense to keep the random minute implementation the same 
across all the schedulers, but we could use RandomizedDelaySec (possibly 
combined with FixedRandomDelay) to randomize when the job is actually run.

> This integration is more complicated than similar changes for other
> schedulers because of a neat trick that systemd allows: templating.
> 
> The previous implementation generated two template files with names
> of the form 'git-maintenance@.(timer|service)'. The '.timer' or
> '.service' indicates that this is a template that is picked up when we
> later specify '...@<schedule>.timer' or '...@<schedule>.service'. The
> '<schedule>' string is then used to insert into the template both the
> 'OnCalendar' schedule setting and the '--schedule' parameter of the
> 'git maintenance run' command.
> 
> In order to set these schedules to a given minute, we can no longer use
> the 'hourly', 'daily', or 'weekly' strings for '<schedule>' and instead
> need to abandon the template model.

I've left some comments about this below.


> @@ -2299,13 +2299,20 @@ static char *xdg_config_home_systemd(const char *filename)
>   	return xdg_config_home_for("systemd/user", filename);
>   }
>   
> -static int systemd_timer_write_unit_templates(const char *exec_path)
> +static int systemd_timer_write_unit_template(enum schedule_priority schedule,

As we're not writing template files any more I think we should rename 
this to systemd_timer_write_unit_file()

> +					     const char *exec_path,
> +					     int minute)
>   {
>   	char *filename;
>   	FILE *file;
>   	const char *unit;
> +	char *schedule_pattern = NULL;
> +	const char *frequency = get_frequency(schedule);
> +	char *local_timer_name = xstrfmt("git-maintenance@%s.timer", frequency);

The "@" in the name signifies that it is a template unit which it isn't 
anymore so I think we want to change this to "git-maintenance-%s.timer"

> +	char *local_service_name = xstrfmt("git-maintenance@%s.service", frequency);

Same change to the name here. I wonder if we could still use a template 
service file but that would complicate the implementation as we'd need 
to write three timer files but only one service file.

> [...]
> @@ -2375,13 +2399,16 @@ static int systemd_timer_write_unit_templates(const char *exec_path)
>   	return 0;
>   
>   error:
> +	free(schedule_pattern);
> +	free(local_timer_name);
>   	free(filename);
> -	systemd_timer_delete_unit_templates();

This looks like a change in behavior as previously we'd remove any files 
if there was an error rather than leaving behind a timer file without a 
corresponding unit file.

Looking at maintenance_start() we call maintenance_register() which 
disables "gc --auto" before we get to this point so if we fail to write 
the files we'll end up disabling any form of gc in the repository.

> [...]
> -static int systemd_timer_delete_unit_templates(void)
> +static int systemd_timer_delete_unit_template(enum schedule_priority priority)

Same suggestion as above to rename this to ..._unit_file()

>   {
> +	const char *frequency = get_frequency(priority);
> +	char *local_timer_name = xstrfmt("git-maintenance@%s.timer", frequency);
> +	char *local_service_name = xstrfmt("git-maintenance@%s.service", frequency);

I'm not sure it is worth it but we could perhaps

	#define SYSTEMD_UNIT_FORMAT "git-maintenance-%s.%s"

above and then these lines and the ones in 
systemd_timer_write_unit_file() would become

	char *local_timer_name = xstrfmt(SYSTEMD_UNIT_FORMAT, frequency, "timer");
	char *local_service = xstrfmt(SYSTEMD_UNIT_FORMAT, frequency, "service");

> [...]
> +static int systemd_timer_delete_unit_templates(void)

Naming again.

Best Wishes

Phillip

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 6/6] maintenance: use random minute in systemd scheduler
  2023-08-08  9:53   ` Phillip Wood
@ 2023-08-08 13:03     ` Phillip Wood
  2023-08-08 13:56     ` Derrick Stolee
  1 sibling, 0 replies; 47+ messages in thread
From: Phillip Wood @ 2023-08-08 13:03 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget, git
  Cc: gitster, sandals, lenaic, Derrick Stolee, Taylor Blau

On 08/08/2023 10:53, Phillip Wood wrote:
> Hourly: "Tue..Sun *-*-* 1..23:00:%02d"
> Daily:  "Tue..Sun *-*-* 00:00:%02d"
> Weekly: "Mon      *-*-* 00:00:%02d"

Thinking about it some more, this only fixes the problem if the computer 
is actually on at midnight. If the computer is switched off overnight 
then we still try to start three maintenance jobs at the same time when 
the user turns their computer on on Tuesday morning. We could stop 
marking the hourly job as persistent on the assumption that it will be 
run soon anyway but that does not solve the problem of the daily and 
weekly jobs running concurrently on a Tuesday morning.

Best Wishes

Phillip


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 6/6] maintenance: use random minute in systemd scheduler
  2023-08-07 21:31   ` Taylor Blau
@ 2023-08-08 13:49     ` Derrick Stolee
  2023-08-08 20:05       ` Taylor Blau
  0 siblings, 1 reply; 47+ messages in thread
From: Derrick Stolee @ 2023-08-08 13:49 UTC (permalink / raw)
  To: Taylor Blau, Derrick Stolee via GitGitGadget
  Cc: git, gitster, sandals, lenaic

On 8/7/2023 5:31 PM, Taylor Blau wrote:
> On Mon, Aug 07, 2023 at 06:51:40PM +0000, Derrick Stolee via GitGitGadget wrote:

>> Modify the template with a custom schedule in the 'OnCalendar' setting.
>> This schedule has some interesting differences from cron-like patterns,
>> but is relatively easy to figure out from context. The one that might be
>> confusing is that '*-*-*' is a date-based pattern, but this must be
>> omitted when using 'Mon' to signal that we care about the day of the
>> week. Monday is used since that matches the day used for the 'weekly'
>> schedule used previously.
> 
> I think the launchd version (which uses "0" for the day of the week)
> runs on Sunday, if I remember correctly. I don't think that these two
> necessarily need to run on the same day of the week when configured to
> run weekly.
> 
> But I figured I'd raise the question in case you did mean for them to
> both run on either Sunday or Monday.

I don't intend to change the day that is run as part of this change.

I think all other schedulers run on Sunday, but systemd running on Monday
is a particular detail of its "weekly" schedule.

>> -static int systemd_timer_write_unit_templates(const char *exec_path)
>> +static int systemd_timer_write_unit_template(enum schedule_priority schedule,
>> +					     const char *exec_path,
>> +					     int minute)
>>  {
>>  	char *filename;
>>  	FILE *file;
>>  	const char *unit;
>> +	char *schedule_pattern = NULL;
> 
> You should be able to drop the NULL initialization, since you assign
> this value unconditionally in the switch statement below (or BUG() on an
> unknown schedule type).

Unfortunately, GCC complained about a possibly-unassigned value when I
left this unset during development, so this actually is necessary for
that compiler.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 6/6] maintenance: use random minute in systemd scheduler
  2023-08-08  9:53   ` Phillip Wood
  2023-08-08 13:03     ` Phillip Wood
@ 2023-08-08 13:56     ` Derrick Stolee
  2023-08-08 17:24       ` Derrick Stolee
  1 sibling, 1 reply; 47+ messages in thread
From: Derrick Stolee @ 2023-08-08 13:56 UTC (permalink / raw)
  To: phillip.wood, Derrick Stolee via GitGitGadget, git
  Cc: gitster, sandals, lenaic, Taylor Blau

On 8/8/2023 5:53 AM, Phillip Wood wrote:> Hi Stolee
> 
> On 07/08/2023 19:51, Derrick Stolee via GitGitGadget wrote:
>> From: Derrick Stolee <derrickstolee@github.com>
>> +    switch (schedule) {
>> +    case SCHEDULE_HOURLY:
>> +        schedule_pattern = xstrfmt("*-*-* *:%02d:00", minute);
>> +        break;
>> +
>> +    case SCHEDULE_DAILY:
>> +        schedule_pattern = xstrfmt("*-*-* 0:%02d:00", minute);
>> +        break;
>> +
>> +    case SCHEDULE_WEEKLY:
>> +        schedule_pattern = xstrfmt("Mon 0:%02d:00", minute);
>> +        break;
> 
> This is not a new issue with this patch but we run the hourly job even
> when we want to run the daily job or the weekly job and we run the daily
> job when we want to run the weekly job.

This is an excellent point! Thanks for bringing it up.

> So only one of these jobs will succeed. The cron entries are careful to
> only run one job at a time, I think it would be worth doing the same
> thing here. I think the using the following format strings would fix this.
> 
> Hourly: "Tue..Sun *-*-* 1..23:00:%02d"
> Daily:  "Tue..Sun *-*-* 00:00:%02d"
> Weekly: "Mon      *-*-* 00:00:%02d"

I would modify this with dropping the "Tue..Sun" from the hourly schedule,
as we still want 23 runs on Mondays.

> It looks like the launchctl schedule has the same issue.

I will take a look at this and consider some additional patches to correct
these issues across both schedulers. Thank you for the attention to detail!

> One thing I've been wondering about which is related to maintenance but
> totally off-topic for this patch is that I think when auto maintenance
> is enabled we stop automatically running "gc" so how do the reflogs get
> expired?

The default maintenance schedule does not include a 'gc' run as it does
not intend to remove any data. Reflog expiration could be considered as a
separate maintenance task that might be useful in the default maintenance
schedule.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 1/6] maintenance: add get_random_minute()
  2023-08-08  0:22       ` Junio C Hamano
@ 2023-08-08 14:48         ` Taylor Blau
  2023-08-08 16:34           ` Junio C Hamano
  0 siblings, 1 reply; 47+ messages in thread
From: Taylor Blau @ 2023-08-08 14:48 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Derrick Stolee via GitGitGadget, git, sandals, lenaic, Derrick Stolee

On Mon, Aug 07, 2023 at 05:22:49PM -0700, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
>
> > Taylor Blau <me@ttaylorr.com> writes:
> >
> >> I was wondering where else we call srand() within Git, and it looks like
> >> the only other spot is in `lock_file_timeout()`.
> >
> > lock_file_timeout() should be updated to match git_mkstemps_mode(),
> > which was taught to use the csprng_bytes() function with 47efda96
> > (wrapper: use a CSPRNG to generate random file names, 2022-01-17),
> > and this new caller may want to do so as well, perhaps?  I dunno,
> > but the caller then does not have to worry about "initializing it
> > just once".
>
> Of course, the obvious downside is that crypto-secure one may be,
> unlike for its use in mkstemps(), way overkill for lockfiles and
> cron dispersion purposes, as these codepaths are not on the target
> surface.

I think that's an acceptable price to pay here, since we can drop the
code to remember whether or not srand() has been called or not. Here's a
patch that we could take in that direction:

--- 8< ---
Subject: [PATCH] lockfile.c: use a CSPRNG to generate backoff milliseconds

Upon failing to acquire the lockfile, `lock_file_timeout()` will try
again with an exponential backoff. This backoff includes some noise as a
multiplier over the default backoff behavior ranging from [0.75, 1.25].

It generates this noise via rand(3). Using a non-cryptographic source of
randomness here is OK, since a more trivial attack vector (holding the
file open via an external process for longer than the value of
`timeout_ms`) is easier to exploit.

That all said, `lock_file_timeout()` initializes the PRNG with
`srand()`. This has a couple of downsides:

  - lock_file_timeout() needs to remember whether or not the PRNG has
    or hasn't been seeded.

  - If a different function also calls `srand()`, the PRNG may start
    generating repeated values (if that caller also initialized the PRNG
    with `getpid()`).

Let's avoid both of these by using `csprng_bytes()`, in a similar spirit
as 47efda967c (wrapper: use a CSPRNG to generate random file names,
2022-01-17).

Using a CSPRNG is definitely overkill for noising a backoff window, but
it avoids the concerns about calling `srand()`, so let's use it here,
too.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 lockfile.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/lockfile.c b/lockfile.c
index 1d5ed01682..6587d407f4 100644
--- a/lockfile.c
+++ b/lockfile.c
@@ -107,22 +107,17 @@ static int lock_file_timeout(struct lock_file *lk, const char *path,
 	int n = 1;
 	int multiplier = 1;
 	long remaining_ms = 0;
-	static int random_initialized = 0;

 	if (timeout_ms == 0)
 		return lock_file(lk, path, flags, mode);

-	if (!random_initialized) {
-		srand((unsigned int)getpid());
-		random_initialized = 1;
-	}
-
 	if (timeout_ms > 0)
 		remaining_ms = timeout_ms;

 	while (1) {
 		long backoff_ms, wait_ms;
 		int fd;
+		uint64_t rand;

 		fd = lock_file(lk, path, flags, mode);

@@ -135,7 +130,10 @@ static int lock_file_timeout(struct lock_file *lk, const char *path,

 		backoff_ms = multiplier * INITIAL_BACKOFF_MS;
 		/* back off for between 0.75*backoff_ms and 1.25*backoff_ms */
-		wait_ms = (750 + rand() % 500) * backoff_ms / 1000;
+		if (csprng_bytes(&rand, sizeof(uint64_t)) < 0)
+			return error_errno(_("unable to get random bytes for"
+					     "lockfile backoff"));
+		wait_ms = (750 + rand % 500) * backoff_ms / 1000;
 		sleep_millisec(wait_ms);
 		remaining_ms -= wait_ms;

--
2.42.0.rc0.26.g802d811bac.dirty

--- >8 ---

Thanks,
Taylor

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [PATCH 1/6] maintenance: add get_random_minute()
  2023-08-08 14:48         ` Taylor Blau
@ 2023-08-08 16:34           ` Junio C Hamano
  2023-08-08 16:49             ` Junio C Hamano
  0 siblings, 1 reply; 47+ messages in thread
From: Junio C Hamano @ 2023-08-08 16:34 UTC (permalink / raw)
  To: Taylor Blau
  Cc: Derrick Stolee via GitGitGadget, git, sandals, lenaic, Derrick Stolee

Taylor Blau <me@ttaylorr.com> writes:

> I think that's an acceptable price to pay here, since we can drop the
> code to remember whether or not srand() has been called or not. Here's a
> patch that we could take in that direction:
> ...
>   - If a different function also calls `srand()`, the PRNG may start
>     generating repeated values (if that caller also initialized the PRNG
>     with `getpid()`).

Hmph, I didn't think about that one.  I do not think it can be a
viable attack vector to attack _this_ code, but the other function
might be.  But if the other function is worth attacking and
attackable, it ought to be using crypto-secure one and not srand(),
so this argument may not give us a good justification X-<.

Provided if code simplification is a good enough rationale, the
patch looks sensible, but I find its use of u64 a bit questionable
(though not wrong).  I would have expected that the type of "rand"
would be the same as backoff_ms and wait_ms, two variables involved
in the same expression.

Thanks.

> diff --git a/lockfile.c b/lockfile.c
> index 1d5ed01682..6587d407f4 100644
> --- a/lockfile.c
> +++ b/lockfile.c
> @@ -107,22 +107,17 @@ static int lock_file_timeout(struct lock_file *lk, const char *path,
>  	int n = 1;
>  	int multiplier = 1;
>  	long remaining_ms = 0;
> -	static int random_initialized = 0;
>
>  	if (timeout_ms == 0)
>  		return lock_file(lk, path, flags, mode);
>
> -	if (!random_initialized) {
> -		srand((unsigned int)getpid());
> -		random_initialized = 1;
> -	}
> -
>  	if (timeout_ms > 0)
>  		remaining_ms = timeout_ms;
>
>  	while (1) {
>  		long backoff_ms, wait_ms;
>  		int fd;
> +		uint64_t rand;
>
>  		fd = lock_file(lk, path, flags, mode);
>
> @@ -135,7 +130,10 @@ static int lock_file_timeout(struct lock_file *lk, const char *path,
>
>  		backoff_ms = multiplier * INITIAL_BACKOFF_MS;
>  		/* back off for between 0.75*backoff_ms and 1.25*backoff_ms */
> -		wait_ms = (750 + rand() % 500) * backoff_ms / 1000;
> +		if (csprng_bytes(&rand, sizeof(uint64_t)) < 0)
> +			return error_errno(_("unable to get random bytes for"
> +					     "lockfile backoff"));
> +		wait_ms = (750 + rand % 500) * backoff_ms / 1000;
>  		sleep_millisec(wait_ms);
>  		remaining_ms -= wait_ms;
>
> --
> 2.42.0.rc0.26.g802d811bac.dirty
>
> --- >8 ---
>
> Thanks,
> Taylor

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 1/6] maintenance: add get_random_minute()
  2023-08-08 16:34           ` Junio C Hamano
@ 2023-08-08 16:49             ` Junio C Hamano
  2023-08-08 20:01               ` Taylor Blau
  0 siblings, 1 reply; 47+ messages in thread
From: Junio C Hamano @ 2023-08-08 16:49 UTC (permalink / raw)
  To: Taylor Blau
  Cc: Derrick Stolee via GitGitGadget, git, sandals, lenaic, Derrick Stolee

Junio C Hamano <gitster@pobox.com> writes:

> Provided if code simplification is a good enough rationale, the
> patch looks sensible, but I find its use of u64 a bit questionable
> (though not wrong).  I would have expected that the type of "rand"
> would be the same as backoff_ms and wait_ms, two variables involved
> in the same expression.

Ah, not so fast.  The use of modulo means we need to be careful
about about the fuzzing factor going negative, and use of unsigned
type allows us to forget about it.

(fuzz % 250), when fuzz is of a signed random integral type, ranges
between -250 and +250 and because we want the center of our
distribution at 1000, so I think the following is equivalent.

 lockfile.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git c/lockfile.c w/lockfile.c
index 6587d407f4..7de53526ac 100644
--- c/lockfile.c
+++ w/lockfile.c
@@ -115,9 +115,8 @@ static int lock_file_timeout(struct lock_file *lk, const char *path,
 		remaining_ms = timeout_ms;
 
 	while (1) {
-		long backoff_ms, wait_ms;
+		long backoff_ms, wait_ms, fuzz;
 		int fd;
-		uint64_t rand;
 
 		fd = lock_file(lk, path, flags, mode);
 
@@ -130,10 +129,10 @@ static int lock_file_timeout(struct lock_file *lk, const char *path,
 
 		backoff_ms = multiplier * INITIAL_BACKOFF_MS;
 		/* back off for between 0.75*backoff_ms and 1.25*backoff_ms */
-		if (csprng_bytes(&rand, sizeof(uint64_t)) < 0)
+		if (csprng_bytes(&fuzz, sizeof(fuzz)) < 0)
 			return error_errno(_("unable to get random bytes for"
 					     "lockfile backoff"));
-		wait_ms = (750 + rand % 500) * backoff_ms / 1000;
+		wait_ms = (1000 + fuzz % 250) * backoff_ms / 1000;
 		sleep_millisec(wait_ms);
 		remaining_ms -= wait_ms;
 



^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [PATCH 6/6] maintenance: use random minute in systemd scheduler
  2023-08-08 12:08   ` Phillip Wood
@ 2023-08-08 17:06     ` Derrick Stolee
  2023-08-08 17:14       ` Derrick Stolee
  0 siblings, 1 reply; 47+ messages in thread
From: Derrick Stolee @ 2023-08-08 17:06 UTC (permalink / raw)
  To: phillip.wood, Derrick Stolee via GitGitGadget, git
  Cc: gitster, sandals, lenaic

On 8/8/2023 8:08 AM, Phillip Wood wrote:
> On 07/08/2023 19:51, Derrick Stolee via GitGitGadget wrote:
>> From: Derrick Stolee <derrickstolee@github.com>
>>
>> The get_random_minute() method was created to allow maintenance
>> schedules to be fixed to a random minute of the hour. This randomness is
>> only intended to spread out the load from a number of clients, but each
>> client should have an hour between each maintenance cycle.
>>
>> Add this random minute to the systemd integration.
> 
> I think it makes sense to keep the random minute implementation the same across all the schedulers, but we could use RandomizedDelaySec (possibly combined with FixedRandomDelay) to randomize when the job is actually run.

That's an interesting suggestion, but I also think it is valuable to
have consistent gaps between maintenance activities on the client, but
RandomizedDelaySec would present the possibility of 60-3540 seconds between
"hourly" maintenance runs (if I understand the option correctly).

>> @@ -2299,13 +2299,20 @@ static char *xdg_config_home_systemd(const char *filename)
>>       return xdg_config_home_for("systemd/user", filename);
>>   }
>>   -static int systemd_timer_write_unit_templates(const char *exec_path)
>> +static int systemd_timer_write_unit_template(enum schedule_priority schedule,
> 
> As we're not writing template files any more I think we should rename this to systemd_timer_write_unit_file()

Good point. I have adjusted all the names in my next version.

>> +                         const char *exec_path,
>> +                         int minute)
>>   {
>>       char *filename;
>>       FILE *file;
>>       const char *unit;
>> +    char *schedule_pattern = NULL;
>> +    const char *frequency = get_frequency(schedule);
>> +    char *local_timer_name = xstrfmt("git-maintenance@%s.timer", frequency);
> 
> The "@" in the name signifies that it is a template unit which it isn't anymore so I think we want to change this to "git-maintenance-%s.timer"

I'll also take your SYSTEMD_UNIT_FORMAT macro suggestion to simplify things.
 
>> +    char *local_service_name = xstrfmt("git-maintenance@%s.service", frequency);
> 
> Same change to the name here. I wonder if we could still use a template service file but that would complicate the implementation as we'd need to write three timer files but only one service file.

Since we execute systemctl only passing the .timer filename, I think we'd
need to keep the '@' symbol in the name if we wanted to use .schedule
templates. Best to keep things simple and abandon templates completely.

>> [...]
>> @@ -2375,13 +2399,16 @@ static int systemd_timer_write_unit_templates(const char *exec_path)
>>       return 0;
>>     error:
>> +    free(schedule_pattern);
>> +    free(local_timer_name);
>>       free(filename);
>> -    systemd_timer_delete_unit_templates();
> 
> This looks like a change in behavior as previously we'd remove any files if there was an error rather than leaving behind a timer file without a corresponding unit file.

The callers take care of deleting the unit files, but there was one place
where a short-circuit could have avoided this deletion. I'll clean that up.

> Looking at maintenance_start() we call maintenance_register() which disables "gc --auto" before we get to this point so if we fail to write the files we'll end up disabling any form of gc in the repository.

Adding this to the list of cleanups at the end. Thanks.

I appreciate the careful review!

Thanks,
-Stolee


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 6/6] maintenance: use random minute in systemd scheduler
  2023-08-08 17:06     ` Derrick Stolee
@ 2023-08-08 17:14       ` Derrick Stolee
  2023-08-09 10:00         ` Phillip Wood
  0 siblings, 1 reply; 47+ messages in thread
From: Derrick Stolee @ 2023-08-08 17:14 UTC (permalink / raw)
  To: phillip.wood, Derrick Stolee via GitGitGadget, git
  Cc: gitster, sandals, lenaic

On 8/8/2023 1:06 PM, Derrick Stolee wrote:
> On 8/8/2023 8:08 AM, Phillip Wood wrote:
>> On 07/08/2023 19:51, Derrick Stolee via GitGitGadget wrote:

>>> +    char *local_timer_name = xstrfmt("git-maintenance@%s.timer", frequency);
>>
>> The "@" in the name signifies that it is a template unit which it isn't anymore so I think we want to change this to "git-maintenance-%s.timer"
> 
> I'll also take your SYSTEMD_UNIT_FORMAT macro suggestion to simplify things.

As I was checking things, it turns out that we _should_ keep the '@' symbol
if only to make sure that our new schedule overwrites the old schedule.

The alternative is that we manually try to delete the old schedule, but that
feels like an inefficient way to do it, leaving some cruft around long-term.

For completeness, here is what I did to check:

$ systemctl --user list-timers
NEXT                        LEFT        LAST                        PASSED       UNIT                         ACTIVATES                     
Tue 2023-08-08 13:13:00 EDT 6s left     n/a                         n/a          git-maintenance-hourly.timer git-maintenance-hourly.service
Tue 2023-08-08 13:50:00 EDT 37min left  Tue 2023-08-08 12:50:10 EDT 22min ago    git-maintenance@hourly.timer git-maintenance@hourly.service
Wed 2023-08-09 00:13:00 EDT 11h left    n/a                         n/a          git-maintenance-daily.timer  git-maintenance-daily.service
Wed 2023-08-09 00:50:00 EDT 11h left    Tue 2023-08-08 09:35:31 EDT 3h 37min ago git-maintenance@daily.timer  git-maintenance@daily.service
Mon 2023-08-14 00:13:00 EDT 5 days left n/a                         n/a          git-maintenance-weekly.timer git-maintenance-weekly.service
Mon 2023-08-14 00:50:00 EDT 5 days left Mon 2023-08-07 10:28:10 EDT 1 day 2h ago git-maintenance@weekly.timer git-maintenance@weekly.service

Do you have an alternative idea how to handle that?

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 6/6] maintenance: use random minute in systemd scheduler
  2023-08-08 13:56     ` Derrick Stolee
@ 2023-08-08 17:24       ` Derrick Stolee
  2023-08-09 10:03         ` Phillip Wood
  0 siblings, 1 reply; 47+ messages in thread
From: Derrick Stolee @ 2023-08-08 17:24 UTC (permalink / raw)
  To: phillip.wood, Derrick Stolee via GitGitGadget, git
  Cc: gitster, sandals, lenaic, Taylor Blau

On 8/8/2023 9:56 AM, Derrick Stolee wrote:
> On 8/8/2023 5:53 AM, Phillip Wood wrote:> Hi Stolee

>> So only one of these jobs will succeed. The cron entries are careful to
>> only run one job at a time, I think it would be worth doing the same
>> thing here. I think the using the following format strings would fix this.
>>
>> Hourly: "Tue..Sun *-*-* 1..23:00:%02d"
>> Daily:  "Tue..Sun *-*-* 00:00:%02d"
>> Weekly: "Mon      *-*-* 00:00:%02d"
> 
> I would modify this with dropping the "Tue..Sun" from the hourly schedule,
> as we still want 23 runs on Mondays.
> 
>> It looks like the launchctl schedule has the same issue.
> 
> I will take a look at this and consider some additional patches to correct
> these issues across both schedulers. Thank you for the attention to detail!

Taking a look, it seems that launchctl does not have this same problem.

The schedule is set via an <array> of <dict>s as follows:

	case SCHEDULE_HOURLY:
		repeat = "<dict>\n"
			 "<key>Hour</key><integer>%d</integer>\n"
			 "<key>Minute</key><integer>%d</integer>\n"
			 "</dict>\n";
		for (i = 1; i <= 23; i++)
			strbuf_addf(&plist, repeat, i, minute);
		break;

	case SCHEDULE_DAILY:
		repeat = "<dict>\n"
			 "<key>Day</key><integer>%d</integer>\n"
			 "<key>Hour</key><integer>0</integer>\n"
			 "<key>Minute</key><integer>%d</integer>\n"
			 "</dict>\n";
		for (i = 1; i <= 6; i++)
			strbuf_addf(&plist, repeat, i, minute);
		break;

This means that we create an hourly schedule for each hour 1..23
(no 0th hour means no collision with daily or weekly schedule) and
a daily schedule for each day 1..6 (no 0th day means no collision
with the weekly schedule).

Does this match your understanding?

The overlap _definitely_ exists in systemd, which I will fix.

Thanks,
-Stolee


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 1/6] maintenance: add get_random_minute()
  2023-08-07 21:20   ` Taylor Blau
  2023-08-07 23:53     ` Junio C Hamano
@ 2023-08-08 17:28     ` Derrick Stolee
  2023-08-08 20:04       ` Taylor Blau
  1 sibling, 1 reply; 47+ messages in thread
From: Derrick Stolee @ 2023-08-08 17:28 UTC (permalink / raw)
  To: Taylor Blau, Derrick Stolee via GitGitGadget
  Cc: git, gitster, sandals, lenaic

On 8/7/2023 5:20 PM, Taylor Blau wrote:
> On Mon, Aug 07, 2023 at 06:51:35PM +0000, Derrick Stolee via GitGitGadget wrote:

>> +	if (!random_initialized) {
>> +		srand((unsigned int)getpid());
>> +		random_initialized = 1;
>> +	}
> 
> I was wondering where else we call srand() within Git, and it looks like
> the only other spot is in `lock_file_timeout()`.
> 
> I doubt it, but is there a chance that that code depends on only calling
> srand() once? I think the answer is "no", since we only use rand()
> within that function to generate a random-ish backoff period, so I think
> the repeatability of it doesn't matter all that much.

The main point would be to avoid the cost of spinning up the number
generator, but there is potential for a bug if we run both initializers:
the lockfile would re-use the same filenames after the generator is
reset.

> So I think this is kind of outside the scope of your series, but I
> wonder if we should have a git_rand() that automatically initializes the
> PRNG with the value of getpid()? Then multiple callers can grab random
> values at will without reinitializing the PRNG.

I see you're moving ahead with removing the srand() from the lockfile code,
so I'll focus on creating a `git_rand()` that centralizes the use of
srand(), but won't touch the code in the lockfile so your patch applies
independently.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 1/6] maintenance: add get_random_minute()
  2023-08-08 16:49             ` Junio C Hamano
@ 2023-08-08 20:01               ` Taylor Blau
  0 siblings, 0 replies; 47+ messages in thread
From: Taylor Blau @ 2023-08-08 20:01 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Derrick Stolee via GitGitGadget, git, sandals, lenaic, Derrick Stolee

On Tue, Aug 08, 2023 at 09:49:30AM -0700, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
>
> > Provided if code simplification is a good enough rationale, the
> > patch looks sensible, but I find its use of u64 a bit questionable
> > (though not wrong).  I would have expected that the type of "rand"
> > would be the same as backoff_ms and wait_ms, two variables involved
> > in the same expression.
>
> Ah, not so fast.  The use of modulo means we need to be careful
> about about the fuzzing factor going negative, and use of unsigned
> type allows us to forget about it.
>
> (fuzz % 250), when fuzz is of a signed random integral type, ranges
> between -250 and +250 and because we want the center of our
> distribution at 1000, so I think the following is equivalent.

[...]

> -		wait_ms = (750 + rand % 500) * backoff_ms / 1000;
> +		wait_ms = (1000 + fuzz % 250) * backoff_ms / 1000;

I was going to say that having rand be an unsigned type was the right
behavior, but with this change it can (and should) be signed. Thanks for
the simplification :-).

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 1/6] maintenance: add get_random_minute()
  2023-08-08 17:28     ` Derrick Stolee
@ 2023-08-08 20:04       ` Taylor Blau
  2023-08-09 12:17         ` Derrick Stolee
  0 siblings, 1 reply; 47+ messages in thread
From: Taylor Blau @ 2023-08-08 20:04 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Derrick Stolee via GitGitGadget, git, gitster, sandals, lenaic

On Tue, Aug 08, 2023 at 01:28:50PM -0400, Derrick Stolee wrote:
> > So I think this is kind of outside the scope of your series, but I
> > wonder if we should have a git_rand() that automatically initializes the
> > PRNG with the value of getpid()? Then multiple callers can grab random
> > values at will without reinitializing the PRNG.
>
> I see you're moving ahead with removing the srand() from the lockfile code,
> so I'll focus on creating a `git_rand()` that centralizes the use of
> srand(), but won't touch the code in the lockfile so your patch applies
> independently.

That thread may have progressed a little since you last looked at it.

Instead of using srand() and rand() (which would make sense to wrap with
git_rand() as you propose), we can simplify our lives by using a CSPRNG,
which only gets initialized once, as is already the case with
csprng_bytes().

I think Junio is picking up a lightly modified version of my patch
there, see [1].

Thanks,
Taylor

[1]: https://lore.kernel.org/git/ZNKfKs1mLQhnybvF@nand.local/

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 6/6] maintenance: use random minute in systemd scheduler
  2023-08-08 13:49     ` Derrick Stolee
@ 2023-08-08 20:05       ` Taylor Blau
  0 siblings, 0 replies; 47+ messages in thread
From: Taylor Blau @ 2023-08-08 20:05 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Derrick Stolee via GitGitGadget, git, gitster, sandals, lenaic

On Tue, Aug 08, 2023 at 09:49:40AM -0400, Derrick Stolee wrote:
> > But I figured I'd raise the question in case you did mean for them to
> > both run on either Sunday or Monday.
>
> I don't intend to change the day that is run as part of this change.
>
> I think all other schedulers run on Sunday, but systemd running on Monday
> is a particular detail of its "weekly" schedule.

No problem -- I figured that you didn't intend on changing the days
around here, just wanted to make sure it was known that the systemd
scheduler picks a different day of the week for its weekly schedule than
the others do.

> >> -static int systemd_timer_write_unit_templates(const char *exec_path)
> >> +static int systemd_timer_write_unit_template(enum schedule_priority schedule,
> >> +					     const char *exec_path,
> >> +					     int minute)
> >>  {
> >>  	char *filename;
> >>  	FILE *file;
> >>  	const char *unit;
> >> +	char *schedule_pattern = NULL;
> >
> > You should be able to drop the NULL initialization, since you assign
> > this value unconditionally in the switch statement below (or BUG() on an
> > unknown schedule type).
>
> Unfortunately, GCC complained about a possibly-unassigned value when I
> left this unset during development, so this actually is necessary for
> that compiler.

Ah, I would have thought that GCC would be smart enough to figure out
that schedule_pattern is unconditionally initialized via the switch
statement, but I guess not. Makes sense.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 6/6] maintenance: use random minute in systemd scheduler
  2023-08-08 17:14       ` Derrick Stolee
@ 2023-08-09 10:00         ` Phillip Wood
  0 siblings, 0 replies; 47+ messages in thread
From: Phillip Wood @ 2023-08-09 10:00 UTC (permalink / raw)
  To: Derrick Stolee, phillip.wood, Derrick Stolee via GitGitGadget, git
  Cc: gitster, sandals, lenaic

On 08/08/2023 18:14, Derrick Stolee wrote:
> On 8/8/2023 1:06 PM, Derrick Stolee wrote:
>> On 8/8/2023 8:08 AM, Phillip Wood wrote:
>>> On 07/08/2023 19:51, Derrick Stolee via GitGitGadget wrote:
> 
>>>> +    char *local_timer_name = xstrfmt("git-maintenance@%s.timer", frequency);
>>>
>>> The "@" in the name signifies that it is a template unit which it isn't anymore so I think we want to change this to "git-maintenance-%s.timer"
>>
>> I'll also take your SYSTEMD_UNIT_FORMAT macro suggestion to simplify things.
> 
> As I was checking things, it turns out that we _should_ keep the '@' symbol
> if only to make sure that our new schedule overwrites the old schedule.

Oh, so if the user already has scheduled maintenance set up then running 
"git maintenance start" adds a new set of timers. I'd not though about that.

> The alternative is that we manually try to delete the old schedule, but that
> feels like an inefficient way to do it, leaving some cruft around long-term.

This patch still changes the names of the files we write. Currently we write

	$XDG_CONFIG_HOME/systemd/user/git-maintenance@.service
	$XDG_CONFIG_HOME/systemd/user/git-maintenance@.timer

and this patch changes that to

	$XDG_CONFIG_HOME/systemd/user/git-maintenance@hourly.service
	$XDG_CONFIG_HOME/systemd/user/git-maintenance@daily.service
	$XDG_CONFIG_HOME/systemd/user/git-maintenance@weekly.service
	$XDG_CONFIG_HOME/systemd/user/git-maintenance@hourly.timer
	$XDG_CONFIG_HOME/systemd/user/git-maintenance@daily.timer
	$XDG_CONFIG_HOME/systemd/user/git-maintenance@weekly.timer

If the user has already enabled maintenance then

	$XDG_CONFIG_HOME/systemd/user/timers.target.wants/git-maintenance@hourly.timer
	$XDG_CONFIG_HOME/systemd/user/timers.target.wants/git-maintenance@daily.timer
	$XDG_CONFIG_HOME/systemd/user/timers.target.wants/git-maintenance@weekly.timer

will exist and are all symlinks pointing to

	$XDG_CONFIG_HOME/systemd/user/git-maintenance@.timer

After this patch if the user runs "git maintenance start" again then 
systemctl will update the symlinks tot point to the matching unit files 
rather than the old template file. That means the user will pick up the 
new schedule but we leave behind the original files that are unused.

> For completeness, here is what I did to check:
> 
> $ systemctl --user list-timers
> NEXT                        LEFT        LAST                        PASSED       UNIT                         ACTIVATES
> Tue 2023-08-08 13:13:00 EDT 6s left     n/a                         n/a          git-maintenance-hourly.timer git-maintenance-hourly.service
> Tue 2023-08-08 13:50:00 EDT 37min left  Tue 2023-08-08 12:50:10 EDT 22min ago    git-maintenance@hourly.timer git-maintenance@hourly.service
> Wed 2023-08-09 00:13:00 EDT 11h left    n/a                         n/a          git-maintenance-daily.timer  git-maintenance-daily.service
> Wed 2023-08-09 00:50:00 EDT 11h left    Tue 2023-08-08 09:35:31 EDT 3h 37min ago git-maintenance@daily.timer  git-maintenance@daily.service
> Mon 2023-08-14 00:13:00 EDT 5 days left n/a                         n/a          git-maintenance-weekly.timer git-maintenance-weekly.service
> Mon 2023-08-14 00:50:00 EDT 5 days left Mon 2023-08-07 10:28:10 EDT 1 day 2h ago git-maintenance@weekly.timer git-maintenance@weekly.service
> 
> Do you have an alternative idea how to handle that?

I think we should stick with the names as you have them. It might be 
worth keeping the service file as a template so we only write the new 
timer files and have a reason to use the "@" naming scheme. We could 
update systemd_timer_setup_units() to delete git-maintenance@.timer if 
we successfully enable the new timer unit files.

Sorry for the confusion, I should have thought about the user running 
"git maintenance start" for a second time.

Best Wishes

Phillip

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 6/6] maintenance: use random minute in systemd scheduler
  2023-08-08 17:24       ` Derrick Stolee
@ 2023-08-09 10:03         ` Phillip Wood
  0 siblings, 0 replies; 47+ messages in thread
From: Phillip Wood @ 2023-08-09 10:03 UTC (permalink / raw)
  To: Derrick Stolee, phillip.wood, Derrick Stolee via GitGitGadget, git
  Cc: gitster, sandals, lenaic, Taylor Blau

On 08/08/2023 18:24, Derrick Stolee wrote:
> On 8/8/2023 9:56 AM, Derrick Stolee wrote:
>> On 8/8/2023 5:53 AM, Phillip Wood wrote:> Hi Stolee
> 
>>> So only one of these jobs will succeed. The cron entries are careful to
>>> only run one job at a time, I think it would be worth doing the same
>>> thing here. I think the using the following format strings would fix this.
>>>
>>> Hourly: "Tue..Sun *-*-* 1..23:00:%02d"
>>> Daily:  "Tue..Sun *-*-* 00:00:%02d"
>>> Weekly: "Mon      *-*-* 00:00:%02d"
>>
>> I would modify this with dropping the "Tue..Sun" from the hourly schedule,
>> as we still want 23 runs on Mondays.

Oops, well spotted

>>> It looks like the launchctl schedule has the same issue.
>>
>> I will take a look at this and consider some additional patches to correct
>> these issues across both schedulers. Thank you for the attention to detail!
> 
> Taking a look, it seems that launchctl does not have this same problem.
> 
> The schedule is set via an <array> of <dict>s as follows:
> 
> 	case SCHEDULE_HOURLY:
> 		repeat = "<dict>\n"
> 			 "<key>Hour</key><integer>%d</integer>\n"
> 			 "<key>Minute</key><integer>%d</integer>\n"
> 			 "</dict>\n";
> 		for (i = 1; i <= 23; i++)
> 			strbuf_addf(&plist, repeat, i, minute);
> 		break;
> 
> 	case SCHEDULE_DAILY:
> 		repeat = "<dict>\n"
> 			 "<key>Day</key><integer>%d</integer>\n"
> 			 "<key>Hour</key><integer>0</integer>\n"
> 			 "<key>Minute</key><integer>%d</integer>\n"
> 			 "</dict>\n";
> 		for (i = 1; i <= 6; i++)
> 			strbuf_addf(&plist, repeat, i, minute);
> 		break;
> 
> This means that we create an hourly schedule for each hour 1..23
> (no 0th hour means no collision with daily or weekly schedule) and
> a daily schedule for each day 1..6 (no 0th day means no collision
> with the weekly schedule).
> 
> Does this match your understanding?

Yes, having read it again - sorry I'd misunderstood it yesterday.

> The overlap _definitely_ exists in systemd, which I will fix.

That's great

Best Wishes

Phillip


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 1/6] maintenance: add get_random_minute()
  2023-08-08 20:04       ` Taylor Blau
@ 2023-08-09 12:17         ` Derrick Stolee
  2023-08-09 18:50           ` Junio C Hamano
  0 siblings, 1 reply; 47+ messages in thread
From: Derrick Stolee @ 2023-08-09 12:17 UTC (permalink / raw)
  To: Taylor Blau
  Cc: Derrick Stolee via GitGitGadget, git, gitster, sandals, lenaic

On 8/8/2023 4:04 PM, Taylor Blau wrote:
> On Tue, Aug 08, 2023 at 01:28:50PM -0400, Derrick Stolee wrote:

>> I see you're moving ahead with removing the srand() from the lockfile code,
 
> That thread may have progressed a little since you last looked at it.

I think this part of my summary is still correct.

>> so I'll focus on creating a `git_rand()` that centralizes the use of
>> srand(), but won't touch the code in the lockfile so your patch applies
>> independently.
 
> Instead of using srand() and rand() (which would make sense to wrap with
> git_rand() as you propose), we can simplify our lives by using a CSPRNG,
> which only gets initialized once, as is already the case with
> csprng_bytes().

So the idea is to use csprng_bytes() everywhere instead of srand()/rand().

I can adjust my local patch to still create git_rand(), but base it on
csprng_bytes() and not collide with your patch. Mimicking rand()'s behavior
is a simpler interface to consume.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 1/6] maintenance: add get_random_minute()
  2023-08-09 12:17         ` Derrick Stolee
@ 2023-08-09 18:50           ` Junio C Hamano
  2023-08-09 20:34             ` Taylor Blau
  0 siblings, 1 reply; 47+ messages in thread
From: Junio C Hamano @ 2023-08-09 18:50 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Taylor Blau, Derrick Stolee via GitGitGadget, git, sandals, lenaic

Derrick Stolee <derrickstolee@github.com> writes:

>> Instead of using srand() and rand() (which would make sense to wrap with
>> git_rand() as you propose), we can simplify our lives by using a CSPRNG,
>> which only gets initialized once, as is already the case with
>> csprng_bytes().
>
> So the idea is to use csprng_bytes() everywhere instead of srand()/rand().
>
> I can adjust my local patch to still create git_rand(), but base it on
> csprng_bytes() and not collide with your patch. Mimicking rand()'s behavior
> is a simpler interface to consume.

I am still ambivalent about wasting entropy for something that
srand() would suffice, so git_rand() interface may be an welcome
addition anyway, that serves an extra layer of indirection to
insulate the callers from the implementation.

Thanks.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 1/6] maintenance: add get_random_minute()
  2023-08-09 18:50           ` Junio C Hamano
@ 2023-08-09 20:34             ` Taylor Blau
  0 siblings, 0 replies; 47+ messages in thread
From: Taylor Blau @ 2023-08-09 20:34 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Derrick Stolee, Derrick Stolee via GitGitGadget, git, sandals, lenaic

On Wed, Aug 09, 2023 at 11:50:38AM -0700, Junio C Hamano wrote:
> Derrick Stolee <derrickstolee@github.com> writes:
>
> >> Instead of using srand() and rand() (which would make sense to wrap with
> >> git_rand() as you propose), we can simplify our lives by using a CSPRNG,
> >> which only gets initialized once, as is already the case with
> >> csprng_bytes().
> >
> > So the idea is to use csprng_bytes() everywhere instead of srand()/rand().
> >
> > I can adjust my local patch to still create git_rand(), but base it on
> > csprng_bytes() and not collide with your patch. Mimicking rand()'s behavior
> > is a simpler interface to consume.
>
> I am still ambivalent about wasting entropy for something that
> srand() would suffice, so git_rand() interface may be an welcome
> addition anyway, that serves an extra layer of indirection to
> insulate the callers from the implementation.

Sounds good to me, I'm not particularly attached to one implementation
over another. Thanks, both.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v2 0/8] maintenance: schedule maintenance on a random minute
  2023-08-07 18:51 [PATCH 0/6] maintenance: schedule maintenance on a random minute Derrick Stolee via GitGitGadget
                   ` (5 preceding siblings ...)
  2023-08-07 18:51 ` [PATCH 6/6] maintenance: use random minute in systemd scheduler Derrick Stolee via GitGitGadget
@ 2023-08-10 20:39 ` Derrick Stolee via GitGitGadget
  2023-08-10 20:39   ` [PATCH v2 1/8] maintenance: add get_random_minute() Derrick Stolee via GitGitGadget
                     ` (7 more replies)
  6 siblings, 8 replies; 47+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2023-08-10 20:39 UTC (permalink / raw)
  To: git; +Cc: gitster, sandals, lenaic, Taylor Blau, Phillip Wood, Derrick Stolee

When we initially created background maintenance -- with its hourly, daily,
and weekly schedules -- we considered the effects of all clients launching
fetches to the server every hour on the hour. The worry of DDoSing server
hosts was noted, but left as something we would consider for a future
update.

As background maintenance has gained more adoption over the past three
years, our worries about DDoSing the big Git hosts has been unfounded. Those
systems, especially those serving public repositories, are already resilient
to thundering herds of much smaller scale.

However, sometimes organizations spin up specific custom server
infrastructure either in addition to or on top of their Git host. Some of
these technologies are built for a different range of scale, and can hit
concurrency limits sooner. Organizations with such custom infrastructures
are more likely to recommend tools like scalar which furthers their adoption
of background maintenance.

This series attempts to help by spreading out background maintenance to a
random minute of the hour. This minute is selected during git maintenance
start, and the same minute applies to each of the three schedules.

This isn't a full solution to this problem, as the custom infrastructure
needs to be resilient to bursts of activity, but at least this will help
somewhat.

Each of the integrated schedulers needs a different way of integrating the
random minute. The most problematic is systemd, since our integration had a
clever use of templates to write one schedule that inserted the hourly,
daily, and weekly schedules as a string into the template. This needs some
refactoring before the custom minute could be inserted.

For the most part, each scheduler's integration is relatively simple. That
is, until we get to the systemd integration. That integration made use of a
clever templating technique that is no longer possible when making this
adjustment.

Patches 5-7 involve systemd, though patch 5 is just a move of code (without
edits) to make the diff in patch 6 somewhat simpler (it's still complicated
due to templating changes). Patch 7 fixes an issue where the systemd
schedules overlap.

Patch 8 fixes an issue where config changes persist even if the scheduler
fails to initialize. Thanks for noticing, Philip!


Updates in version 2
====================

 * get_random_minute() now uses a new helper, git_rand(), which is itself a
   wrapper around csprng_bytes() for easier use.
 * get_random_minute() also had an error in its use of getenv() which is now
   fixed.
 * Patch 6 has a lot of new changes, including:
   * Keeping the .service template.
   * Deleting the old .timer template when safe to do so.
 * Patch 7 fixes the schedule overlap in systemd.
 * Patch 8 fixes the issue where 'mainteancne.auto=false' would persist even
   if the scheduler failed to initialize.

Thanks, -Stolee

Derrick Stolee (8):
  maintenance: add get_random_minute()
  maintenance: use random minute in launchctl scheduler
  maintenance: use random minute in Windows scheduler
  maintenance: use random minute in cron scheduler
  maintenance: swap method locations
  maintenance: use random minute in systemd scheduler
  maintenance: fix systemd schedule overlaps
  maintenance: update schedule before config

 builtin/gc.c           | 291 +++++++++++++++++++++++++++++------------
 t/t7900-maintenance.sh |  28 +++-
 wrapper.c              |  10 ++
 wrapper.h              |   6 +
 4 files changed, 250 insertions(+), 85 deletions(-)


base-commit: a82fb66fed250e16d3010c75404503bea3f0ab61
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1567%2Fderrickstolee%2Fmaintenance-random-minute-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1567/derrickstolee/maintenance-random-minute-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1567

Range-diff vs v1:

 1:  fefdaa94579 ! 1:  edc08023ed5 maintenance: add get_random_minute()
     @@ Commit message
          integrations with this method do not yet exist, but will follow in
          future changes.
      
     +    To avoid multiple sources of randomness in the Git codebase, create a
     +    new helper function, git_rand(), that returns a random uint32_t. This is
     +    similar to how rand() returns a random nonnegative value, except it is
     +    based on csprng_bytes() which is cryptographic and will return values
     +    larger than RAND_MAX.
     +
          One thing that is important for testability is that we notice when we
          are under a test scenario and return a predictable result. The schedules
          themselves are not checked for this value, but at least one launchctl
     @@ builtin/gc.c: static int get_schedule_cmd(const char **cmd, int *is_available)
      +MAYBE_UNUSED
      +static int get_random_minute(void)
      +{
     -+	static int random_initialized = 0;
     -+
      +	/* Use a static value when under tests. */
     -+	if (!getenv("GIT_TEST_MAINTENANCE_SCHEDULER"))
     ++	if (getenv("GIT_TEST_MAINT_SCHEDULER"))
      +		return 13;
      +
     -+	if (!random_initialized) {
     -+		srand((unsigned int)getpid());
     -+		random_initialized = 1;
     -+	}
     -+
     -+	return rand() % 60;
     ++	return git_rand() % 60;
      +}
      +
       static int is_launchctl_available(void)
       {
       	const char *cmd = "launchctl";
     +
     + ## wrapper.c ##
     +@@ wrapper.c: int csprng_bytes(void *buf, size_t len)
     + 	return 0;
     + #endif
     + }
     ++
     ++uint32_t git_rand(void)
     ++{
     ++	uint32_t result;
     ++
     ++	if (csprng_bytes(&result, sizeof(result)) < 0)
     ++		die(_("unable to get random bytes"));
     ++
     ++	return result;
     ++}
     +
     + ## wrapper.h ##
     +@@ wrapper.h: void sleep_millisec(int millisec);
     +  */
     + int csprng_bytes(void *buf, size_t len);
     + 
     ++/*
     ++ * Returns a random uint32_t, uniformly distributed across all possible
     ++ * values.
     ++ */
     ++uint32_t git_rand(void);
     ++
     + #endif /* WRAPPER_H */
 2:  babf62ec6d5 ! 2:  72ec86f2f88 maintenance: use random minute in launchctl scheduler
     @@ builtin/gc.c: static int get_schedule_cmd(const char **cmd, int *is_available)
      -MAYBE_UNUSED
       static int get_random_minute(void)
       {
     - 	static int random_initialized = 0;
     + 	/* Use a static value when under tests. */
      @@ builtin/gc.c: static int launchctl_schedule_plist(const char *exec_path, enum schedule_priorit
       	struct strbuf plist = STRBUF_INIT, plist2 = STRBUF_INIT;
       	struct stat st;
 3:  1048dff1d3a = 3:  f6d9c4f3b02 maintenance: use random minute in Windows scheduler
 4:  3ef3cf0861f = 4:  b291e6f7aec maintenance: use random minute in cron scheduler
 5:  37f96b6b619 ! 5:  88610437b4b maintenance: swap method locations
     @@ Commit message
          systemd_timer_enable_unit() method, so we can write a specific schedule
          for each unit.
      
     -    The diff is computed smaller by showing systemd_timer_enable_unit() move
     -    instead of systemd_timer_write_unit_templates().
     +    The diff is computed smaller by showing systemd_timer_enable_unit() and
     +    systemd_timer_delete_units()  move instead of
     +    systemd_timer_write_unit_templates() and
     +    systemd_timer_delete_unit_templates().
      
          Signed-off-by: Derrick Stolee <derrickstolee@github.com>
      
     @@ builtin/gc.c: static char *xdg_config_home_systemd(const char *filename)
      -	return 0;
      -}
      -
     --static int systemd_timer_delete_unit_templates(void)
     --{
     --	int ret = 0;
     --	char *filename = xdg_config_home_systemd("git-maintenance@.timer");
     --	if (unlink(filename) && !is_missing_file_error(errno))
     --		ret = error_errno(_("failed to delete '%s'"), filename);
     --	FREE_AND_NULL(filename);
     --
     --	filename = xdg_config_home_systemd("git-maintenance@.service");
     --	if (unlink(filename) && !is_missing_file_error(errno))
     --		ret = error_errno(_("failed to delete '%s'"), filename);
     --
     --	free(filename);
     --	return ret;
     --}
     --
     + static int systemd_timer_delete_unit_templates(void)
     + {
     + 	int ret = 0;
     +@@ builtin/gc.c: static int systemd_timer_delete_unit_templates(void)
     + 	return ret;
     + }
     + 
      -static int systemd_timer_delete_units(void)
      -{
      -	return systemd_timer_enable_unit(0, SCHEDULE_HOURLY) ||
     @@ builtin/gc.c: error:
      +	return 0;
      +}
      +
     -+static int systemd_timer_delete_unit_templates(void)
     -+{
     -+	int ret = 0;
     -+	char *filename = xdg_config_home_systemd("git-maintenance@.timer");
     -+	if (unlink(filename) && !is_missing_file_error(errno))
     -+		ret = error_errno(_("failed to delete '%s'"), filename);
     -+	FREE_AND_NULL(filename);
     -+
     -+	filename = xdg_config_home_systemd("git-maintenance@.service");
     -+	if (unlink(filename) && !is_missing_file_error(errno))
     -+		ret = error_errno(_("failed to delete '%s'"), filename);
     -+
     -+	free(filename);
     -+	return ret;
     -+}
     -+
      +static int systemd_timer_delete_units(void)
      +{
      +	return systemd_timer_enable_unit(0, SCHEDULE_HOURLY) ||
 6:  14e340b75fa ! 6:  e43778d3e40 maintenance: use random minute in systemd scheduler
     @@ Commit message
      
          In order to set these schedules to a given minute, we can no longer use
          the 'hourly', 'daily', or 'weekly' strings for '<schedule>' and instead
     -    need to abandon the template model.
     +    need to abandon the template model for the .timer files. We can still
     +    use templates for the .service files. For this reason, we split these
     +    writes into two methods.
      
          Modify the template with a custom schedule in the 'OnCalendar' setting.
          This schedule has some interesting differences from cron-like patterns,
     @@ Commit message
          week. Monday is used since that matches the day used for the 'weekly'
          schedule used previously.
      
     +    Now that the timer files are not templates, we might want to abandon the
     +    '@' symbol in the file names. However, this would cause users with
     +    existing schedules to get two competing schedules due to different
     +    names. The work to remove the old schedule name is one thing that we can
     +    avoid by keeping the '@' symbol in our unit names. Since we are locked
     +    into this name, it makes sense that we keep the template model for the
     +    .service files.
     +
          The rest of the change involves making sure we are writing these .timer
          and .service files before initializing the schedule with 'systemctl' and
          deleting the files when we are done. Some changes are also made to share
          the random minute along with a single computation of the execution path
          of the current Git executable.
      
     +    In addition, older Git versions may have written a
     +    'git-maintenance@.timer' template file. Be sure to remove this when
     +    successfully enabling maintenance (or disabling maintenance).
     +
          Signed-off-by: Derrick Stolee <derrickstolee@github.com>
      
       ## builtin/gc.c ##
     @@ builtin/gc.c: static char *xdg_config_home_systemd(const char *filename)
       	return xdg_config_home_for("systemd/user", filename);
       }
       
     +-static int systemd_timer_delete_unit_templates(void)
     ++#define SYSTEMD_UNIT_FORMAT "git-maintenance@%s.%s"
     ++
     ++static int systemd_timer_delete_timer_file(enum schedule_priority priority)
     + {
     + 	int ret = 0;
     +-	char *filename = xdg_config_home_systemd("git-maintenance@.timer");
     ++	const char *frequency = get_frequency(priority);
     ++	char *local_timer_name = xstrfmt(SYSTEMD_UNIT_FORMAT, frequency, "timer");
     ++	char *filename = xdg_config_home_systemd(local_timer_name);
     ++
     + 	if (unlink(filename) && !is_missing_file_error(errno))
     + 		ret = error_errno(_("failed to delete '%s'"), filename);
     +-	FREE_AND_NULL(filename);
     + 
     +-	filename = xdg_config_home_systemd("git-maintenance@.service");
     ++	free(filename);
     ++	free(local_timer_name);
     ++	return ret;
     ++}
     ++
     ++static int systemd_timer_delete_service_template(void)
     ++{
     ++	int ret = 0;
     ++	char *local_service_name = xstrfmt(SYSTEMD_UNIT_FORMAT, "", "service");
     ++	char *filename = xdg_config_home_systemd(local_service_name);
     + 	if (unlink(filename) && !is_missing_file_error(errno))
     + 		ret = error_errno(_("failed to delete '%s'"), filename);
     + 
     + 	free(filename);
     ++	free(local_service_name);
     + 	return ret;
     + }
     + 
      -static int systemd_timer_write_unit_templates(const char *exec_path)
     -+static int systemd_timer_write_unit_template(enum schedule_priority schedule,
     -+					     const char *exec_path,
     -+					     int minute)
     ++/*
     ++ * Write the schedule information into a git-maintenance@<schedule>.timer
     ++ * file using a custom minute. This timer file cannot use the templating
     ++ * system, so we generate a specific file for each.
     ++ */
     ++static int systemd_timer_write_timer_file(enum schedule_priority schedule,
     ++					  int minute)
       {
     ++	int res = -1;
       	char *filename;
       	FILE *file;
       	const char *unit;
      +	char *schedule_pattern = NULL;
      +	const char *frequency = get_frequency(schedule);
     -+	char *local_timer_name = xstrfmt("git-maintenance@%s.timer", frequency);
     -+	char *local_service_name = xstrfmt("git-maintenance@%s.service", frequency);
     ++	char *local_timer_name = xstrfmt(SYSTEMD_UNIT_FORMAT, frequency, "timer");
      +
      +	filename = xdg_config_home_systemd(local_timer_name);
       
     @@ builtin/gc.c: static int systemd_timer_write_unit_templates(const char *exec_pat
       		fclose(file);
       		goto error;
      @@ builtin/gc.c: static int systemd_timer_write_unit_templates(const char *exec_path)
     + 		error_errno(_("failed to flush '%s'"), filename);
     + 		goto error;
       	}
     ++
     ++	res = 0;
     ++
     ++error:
     ++	free(schedule_pattern);
     ++	free(local_timer_name);
       	free(filename);
     ++	return res;
     ++}
       
      -	filename = xdg_config_home_systemd("git-maintenance@.service");
     ++/*
     ++ * No matter the schedule, we use the same service and can make use of the
     ++ * templating system. When installing git-maintenance@<schedule>.timer,
     ++ * systemd will notice that git-maintenance@.service exists as a template
     ++ * and will use this file and insert the <schedule> into the template at
     ++ * the position of "%i".
     ++ */
     ++static int systemd_timer_write_service_template(const char *exec_path)
     ++{
     ++	int res = -1;
     ++	char *filename;
     ++	FILE *file;
     ++	const char *unit;
     ++	char *local_service_name = xstrfmt(SYSTEMD_UNIT_FORMAT, "", "service");
     ++
      +	filename = xdg_config_home_systemd(local_service_name);
     ++	if (safe_create_leading_directories(filename)) {
     ++		error(_("failed to create directories for '%s'"), filename);
     ++		goto error;
     ++	}
       	file = fopen_or_warn(filename, "w");
       	if (!file)
       		goto error;
      @@ builtin/gc.c: static int systemd_timer_write_unit_templates(const char *exec_path)
     - 	       "\n"
     - 	       "[Service]\n"
     - 	       "Type=oneshot\n"
     --	       "ExecStart=\"%s/git\" --exec-path=\"%s\" for-each-repo --config=maintenance.repo maintenance run --schedule=%%i\n"
     -+	       "ExecStart=\"%s/git\" --exec-path=\"%s\" for-each-repo --config=maintenance.repo maintenance run --schedule=%s\n"
     - 	       "LockPersonality=yes\n"
     - 	       "MemoryDenyWriteExecute=yes\n"
     - 	       "NoNewPrivileges=yes\n"
     -@@ builtin/gc.c: static int systemd_timer_write_unit_templates(const char *exec_path)
     - 	       "RestrictSUIDSGID=yes\n"
     - 	       "SystemCallArchitectures=native\n"
     - 	       "SystemCallFilter=@system-service\n";
     --	if (fprintf(file, unit, exec_path, exec_path) < 0) {
     -+	if (fprintf(file, unit, exec_path, exec_path, frequency) < 0) {
     - 		error(_("failed to write to '%s'"), filename);
     - 		fclose(file);
     + 		error_errno(_("failed to flush '%s'"), filename);
       		goto error;
     -@@ builtin/gc.c: static int systemd_timer_write_unit_templates(const char *exec_path)
     - 	return 0;
     + 	}
     +-	free(filename);
     +-	return 0;
     ++
     ++	res = 0;
       
       error:
     -+	free(schedule_pattern);
     -+	free(local_timer_name);
     ++	free(local_service_name);
       	free(filename);
      -	systemd_timer_delete_unit_templates();
     - 	return -1;
     +-	return -1;
     ++	return res;
       }
       
       static int systemd_timer_enable_unit(int enable,
      -				     enum schedule_priority schedule)
      +				     enum schedule_priority schedule,
     -+				     const char *exec_path,
      +				     int minute)
       {
       	const char *cmd = "systemctl";
     @@ builtin/gc.c: static int systemd_timer_enable_unit(int enable,
       	 */
       	if (!enable)
       		child.no_stderr = 1;
     -+	else if (systemd_timer_write_unit_template(schedule, exec_path, minute))
     ++	else if (systemd_timer_write_timer_file(schedule, minute))
      +		return -1;
       
       	get_schedule_cmd(&cmd, NULL);
       	strvec_split(&child.args, cmd);
     + 	strvec_pushl(&child.args, "--user", enable ? "enable" : "disable",
     + 		     "--now", NULL);
     +-	strvec_pushf(&child.args, "git-maintenance@%s.timer", frequency);
     ++	strvec_pushf(&child.args, SYSTEMD_UNIT_FORMAT, frequency, "timer");
     + 
     + 	if (start_command(&child))
     + 		return error(_("failed to start systemctl"));
      @@ builtin/gc.c: static int systemd_timer_enable_unit(int enable,
       	return 0;
       }
       
     --static int systemd_timer_delete_unit_templates(void)
     -+static int systemd_timer_delete_unit_template(enum schedule_priority priority)
     - {
     -+	const char *frequency = get_frequency(priority);
     -+	char *local_timer_name = xstrfmt("git-maintenance@%s.timer", frequency);
     -+	char *local_service_name = xstrfmt("git-maintenance@%s.service", frequency);
     - 	int ret = 0;
     --	char *filename = xdg_config_home_systemd("git-maintenance@.timer");
     -+	char *filename = xdg_config_home_systemd(local_timer_name);
     - 	if (unlink(filename) && !is_missing_file_error(errno))
     - 		ret = error_errno(_("failed to delete '%s'"), filename);
     - 	FREE_AND_NULL(filename);
     - 
     --	filename = xdg_config_home_systemd("git-maintenance@.service");
     -+	filename = xdg_config_home_systemd(local_service_name);
     - 	if (unlink(filename) && !is_missing_file_error(errno))
     - 		ret = error_errno(_("failed to delete '%s'"), filename);
     - 
     - 	free(filename);
     -+	free(local_timer_name);
     -+	free(local_service_name);
     - 	return ret;
     - }
     - 
     -+static int systemd_timer_delete_unit_templates(void)
     ++/*
     ++ * A previous version of Git wrote the timer units as template files.
     ++ * Clean these up, if they exist.
     ++ */
     ++static void systemd_timer_delete_stale_timer_templates(void)
     ++{
     ++	char *timer_template_name = xstrfmt(SYSTEMD_UNIT_FORMAT, "", "timer");
     ++	char *filename = xdg_config_home_systemd(timer_template_name);
     ++
     ++	if (unlink(filename) && !is_missing_file_error(errno))
     ++		warning(_("failed to delete '%s'"), filename);
     ++
     ++	free(filename);
     ++	free(timer_template_name);
     ++}
     ++
     ++static int systemd_timer_delete_unit_files(void)
      +{
     ++	systemd_timer_delete_stale_timer_templates();
     ++
      +	/* Purposefully not short-circuited to make sure all are called. */
     -+	return systemd_timer_delete_unit_template(SCHEDULE_HOURLY) |
     -+	       systemd_timer_delete_unit_template(SCHEDULE_DAILY) |
     -+	       systemd_timer_delete_unit_template(SCHEDULE_WEEKLY);
     ++	return systemd_timer_delete_timer_file(SCHEDULE_HOURLY) |
     ++	       systemd_timer_delete_timer_file(SCHEDULE_DAILY) |
     ++	       systemd_timer_delete_timer_file(SCHEDULE_WEEKLY) |
     ++	       systemd_timer_delete_service_template();
      +}
      +
       static int systemd_timer_delete_units(void)
     @@ builtin/gc.c: static int systemd_timer_enable_unit(int enable,
      -	return systemd_timer_enable_unit(0, SCHEDULE_HOURLY) ||
      -	       systemd_timer_enable_unit(0, SCHEDULE_DAILY) ||
      -	       systemd_timer_enable_unit(0, SCHEDULE_WEEKLY) ||
     +-	       systemd_timer_delete_unit_templates();
      +	int minute = get_random_minute();
     -+	const char *exec_path = git_exec_path();
     -+	return systemd_timer_enable_unit(0, SCHEDULE_HOURLY, exec_path, minute) ||
     -+	       systemd_timer_enable_unit(0, SCHEDULE_DAILY, exec_path, minute) ||
     -+	       systemd_timer_enable_unit(0, SCHEDULE_WEEKLY, exec_path, minute) ||
     - 	       systemd_timer_delete_unit_templates();
     ++	/* Purposefully not short-circuited to make sure all are called. */
     ++	return systemd_timer_enable_unit(0, SCHEDULE_HOURLY, minute) |
     ++	       systemd_timer_enable_unit(0, SCHEDULE_DAILY, minute) |
     ++	       systemd_timer_enable_unit(0, SCHEDULE_WEEKLY, minute) |
     ++	       systemd_timer_delete_unit_files();
       }
       
       static int systemd_timer_setup_units(void)
     @@ builtin/gc.c: static int systemd_timer_enable_unit(int enable,
      -		  systemd_timer_enable_unit(1, SCHEDULE_HOURLY) ||
      -		  systemd_timer_enable_unit(1, SCHEDULE_DAILY) ||
      -		  systemd_timer_enable_unit(1, SCHEDULE_WEEKLY);
     -+	int ret = systemd_timer_enable_unit(1, SCHEDULE_HOURLY, exec_path, minute) ||
     -+		  systemd_timer_enable_unit(1, SCHEDULE_DAILY, exec_path, minute) ||
     -+		  systemd_timer_enable_unit(1, SCHEDULE_WEEKLY, exec_path, minute);
     ++	int ret = systemd_timer_write_service_template(exec_path) ||
     ++		  systemd_timer_enable_unit(1, SCHEDULE_HOURLY, minute) ||
     ++		  systemd_timer_enable_unit(1, SCHEDULE_DAILY, minute) ||
     ++		  systemd_timer_enable_unit(1, SCHEDULE_WEEKLY, minute);
     ++
       	if (ret)
       		systemd_timer_delete_units();
     ++	else
     ++		systemd_timer_delete_stale_timer_templates();
     ++
       	return ret;
     + }
     + 
      
       ## t/t7900-maintenance.sh ##
      @@ t/t7900-maintenance.sh: test_expect_success 'start and stop Linux/systemd maintenance' '
     @@ t/t7900-maintenance.sh: test_expect_success 'start and stop Linux/systemd mainte
       	git config --get --global --fixed-value maintenance.repo "$(pwd)" &&
       
      -	test_systemd_analyze_verify "systemd/user/git-maintenance@.service" &&
     ++	for schedule in hourly daily weekly
     ++	do
     ++		test_path_is_file "systemd/user/git-maintenance@$schedule.timer" || return 1
     ++	done &&
     ++	test_path_is_file "systemd/user/git-maintenance@.service" &&
     ++
      +	test_systemd_analyze_verify "systemd/user/git-maintenance@hourly.service" &&
      +	test_systemd_analyze_verify "systemd/user/git-maintenance@daily.service" &&
      +	test_systemd_analyze_verify "systemd/user/git-maintenance@weekly.service" &&
       
       	printf -- "--user enable --now git-maintenance@%s.timer\n" hourly daily weekly >expect &&
       	test_cmp expect args &&
     +@@ t/t7900-maintenance.sh: test_expect_success 'start and stop Linux/systemd maintenance' '
     + 	# stop does not unregister the repo
     + 	git config --get --global --fixed-value maintenance.repo "$(pwd)" &&
     + 
     +-	test_path_is_missing "systemd/user/git-maintenance@.timer" &&
     ++	for schedule in hourly daily weekly
     ++	do
     ++		test_path_is_missing "systemd/user/git-maintenance@$schedule.timer" || return 1
     ++	done &&
     + 	test_path_is_missing "systemd/user/git-maintenance@.service" &&
     + 
     + 	printf -- "--user disable --now git-maintenance@%s.timer\n" hourly daily weekly >expect &&
 -:  ----------- > 7:  86c4255d63d maintenance: fix systemd schedule overlaps
 -:  ----------- > 8:  f0c0f6eff88 maintenance: update schedule before config

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v2 1/8] maintenance: add get_random_minute()
  2023-08-10 20:39 ` [PATCH v2 0/8] maintenance: schedule maintenance on a random minute Derrick Stolee via GitGitGadget
@ 2023-08-10 20:39   ` Derrick Stolee via GitGitGadget
  2023-08-10 21:25     ` Taylor Blau
  2023-08-10 20:39   ` [PATCH v2 2/8] maintenance: use random minute in launchctl scheduler Derrick Stolee via GitGitGadget
                     ` (6 subsequent siblings)
  7 siblings, 1 reply; 47+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2023-08-10 20:39 UTC (permalink / raw)
  To: git
  Cc: gitster, sandals, lenaic, Taylor Blau, Phillip Wood,
	Derrick Stolee, Derrick Stolee

From: Derrick Stolee <derrickstolee@github.com>

When we initially created background maintenance -- with its hourly,
daily, and weekly schedules -- we considered the effects of all clients
launching fetches to the server every hour on the hour. The worry of
DDoSing server hosts was noted, but left as something we would consider
for a future update.

As background maintenance has gained more adoption over the past three
years, our worries about DDoSing the big Git hosts has been unfounded.
Those systems, especially those serving public repositories, are already
resilient to thundering herds of much smaller scale.

However, sometimes organizations spin up specific custom server
infrastructure either in addition to or on top of their Git host. Some
of these technologies are built for a different range of scale, and can
hit concurrency limits sooner. Organizations with such custom
infrastructures are more likely to recommend tools like `scalar` which
furthers their adoption of background maintenance.

To help solve for this, create get_random_minute() as a method to help
Git select a random minute when creating schedules in the future. The
integrations with this method do not yet exist, but will follow in
future changes.

To avoid multiple sources of randomness in the Git codebase, create a
new helper function, git_rand(), that returns a random uint32_t. This is
similar to how rand() returns a random nonnegative value, except it is
based on csprng_bytes() which is cryptographic and will return values
larger than RAND_MAX.

One thing that is important for testability is that we notice when we
are under a test scenario and return a predictable result. The schedules
themselves are not checked for this value, but at least one launchctl
test checks that we do not unnecessarily reboot the schedule if it has
not changed from a previous version.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 builtin/gc.c | 10 ++++++++++
 wrapper.c    | 10 ++++++++++
 wrapper.h    |  6 ++++++
 3 files changed, 26 insertions(+)

diff --git a/builtin/gc.c b/builtin/gc.c
index 19d73067aad..2ebae7bc17c 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1708,6 +1708,16 @@ static int get_schedule_cmd(const char **cmd, int *is_available)
 	return 1;
 }
 
+MAYBE_UNUSED
+static int get_random_minute(void)
+{
+	/* Use a static value when under tests. */
+	if (getenv("GIT_TEST_MAINT_SCHEDULER"))
+		return 13;
+
+	return git_rand() % 60;
+}
+
 static int is_launchctl_available(void)
 {
 	const char *cmd = "launchctl";
diff --git a/wrapper.c b/wrapper.c
index 5160c9e28de..48065c4f533 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -819,3 +819,13 @@ int csprng_bytes(void *buf, size_t len)
 	return 0;
 #endif
 }
+
+uint32_t git_rand(void)
+{
+	uint32_t result;
+
+	if (csprng_bytes(&result, sizeof(result)) < 0)
+		die(_("unable to get random bytes"));
+
+	return result;
+}
diff --git a/wrapper.h b/wrapper.h
index 79a9c1b5077..79c7321bb37 100644
--- a/wrapper.h
+++ b/wrapper.h
@@ -139,4 +139,10 @@ void sleep_millisec(int millisec);
  */
 int csprng_bytes(void *buf, size_t len);
 
+/*
+ * Returns a random uint32_t, uniformly distributed across all possible
+ * values.
+ */
+uint32_t git_rand(void);
+
 #endif /* WRAPPER_H */
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 2/8] maintenance: use random minute in launchctl scheduler
  2023-08-10 20:39 ` [PATCH v2 0/8] maintenance: schedule maintenance on a random minute Derrick Stolee via GitGitGadget
  2023-08-10 20:39   ` [PATCH v2 1/8] maintenance: add get_random_minute() Derrick Stolee via GitGitGadget
@ 2023-08-10 20:39   ` Derrick Stolee via GitGitGadget
  2023-08-10 20:39   ` [PATCH v2 3/8] maintenance: use random minute in Windows scheduler Derrick Stolee via GitGitGadget
                     ` (5 subsequent siblings)
  7 siblings, 0 replies; 47+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2023-08-10 20:39 UTC (permalink / raw)
  To: git
  Cc: gitster, sandals, lenaic, Taylor Blau, Phillip Wood,
	Derrick Stolee, Derrick Stolee

From: Derrick Stolee <derrickstolee@github.com>

The get_random_minute() method was created to allow maintenance
schedules to be fixed to a random minute of the hour. This randomness is
only intended to spread out the load from a number of clients, but each
client should have an hour between each maintenance cycle.

Use get_random_minute() when constructing the schedules for launchctl.

The format already includes a 'Minute' key which is modified from 0 to
the random minute.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 builtin/gc.c | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index 2ebae7bc17c..286ce131a5c 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1708,7 +1708,6 @@ static int get_schedule_cmd(const char **cmd, int *is_available)
 	return 1;
 }
 
-MAYBE_UNUSED
 static int get_random_minute(void)
 {
 	/* Use a static value when under tests. */
@@ -1830,6 +1829,7 @@ static int launchctl_schedule_plist(const char *exec_path, enum schedule_priorit
 	struct strbuf plist = STRBUF_INIT, plist2 = STRBUF_INIT;
 	struct stat st;
 	const char *cmd = "launchctl";
+	int minute = get_random_minute();
 
 	get_schedule_cmd(&cmd, NULL);
 	preamble = "<?xml version=\"1.0\"?>\n"
@@ -1855,29 +1855,30 @@ static int launchctl_schedule_plist(const char *exec_path, enum schedule_priorit
 	case SCHEDULE_HOURLY:
 		repeat = "<dict>\n"
 			 "<key>Hour</key><integer>%d</integer>\n"
-			 "<key>Minute</key><integer>0</integer>\n"
+			 "<key>Minute</key><integer>%d</integer>\n"
 			 "</dict>\n";
 		for (i = 1; i <= 23; i++)
-			strbuf_addf(&plist, repeat, i);
+			strbuf_addf(&plist, repeat, i, minute);
 		break;
 
 	case SCHEDULE_DAILY:
 		repeat = "<dict>\n"
 			 "<key>Day</key><integer>%d</integer>\n"
 			 "<key>Hour</key><integer>0</integer>\n"
-			 "<key>Minute</key><integer>0</integer>\n"
+			 "<key>Minute</key><integer>%d</integer>\n"
 			 "</dict>\n";
 		for (i = 1; i <= 6; i++)
-			strbuf_addf(&plist, repeat, i);
+			strbuf_addf(&plist, repeat, i, minute);
 		break;
 
 	case SCHEDULE_WEEKLY:
-		strbuf_addstr(&plist,
-			      "<dict>\n"
-			      "<key>Day</key><integer>0</integer>\n"
-			      "<key>Hour</key><integer>0</integer>\n"
-			      "<key>Minute</key><integer>0</integer>\n"
-			      "</dict>\n");
+		strbuf_addf(&plist,
+			    "<dict>\n"
+			    "<key>Day</key><integer>0</integer>\n"
+			    "<key>Hour</key><integer>0</integer>\n"
+			    "<key>Minute</key><integer>%d</integer>\n"
+			    "</dict>\n",
+			    minute);
 		break;
 
 	default:
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 3/8] maintenance: use random minute in Windows scheduler
  2023-08-10 20:39 ` [PATCH v2 0/8] maintenance: schedule maintenance on a random minute Derrick Stolee via GitGitGadget
  2023-08-10 20:39   ` [PATCH v2 1/8] maintenance: add get_random_minute() Derrick Stolee via GitGitGadget
  2023-08-10 20:39   ` [PATCH v2 2/8] maintenance: use random minute in launchctl scheduler Derrick Stolee via GitGitGadget
@ 2023-08-10 20:39   ` Derrick Stolee via GitGitGadget
  2023-08-10 20:39   ` [PATCH v2 4/8] maintenance: use random minute in cron scheduler Derrick Stolee via GitGitGadget
                     ` (4 subsequent siblings)
  7 siblings, 0 replies; 47+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2023-08-10 20:39 UTC (permalink / raw)
  To: git
  Cc: gitster, sandals, lenaic, Taylor Blau, Phillip Wood,
	Derrick Stolee, Derrick Stolee

From: Derrick Stolee <derrickstolee@github.com>

The get_random_minute() method was created to allow maintenance
schedules to be fixed to a random minute of the hour. This randomness is
only intended to spread out the load from a number of clients, but each
client should have an hour between each maintenance cycle.

Add this random minute to the Windows scheduler integration.

We need only to modify the minute value for the 'StartBoundary' tag
across the three schedules.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 builtin/gc.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index 286ce131a5c..78924bb32c6 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1995,6 +1995,7 @@ static int schtasks_schedule_task(const char *exec_path, enum schedule_priority
 	const char *frequency = get_frequency(schedule);
 	char *name = schtasks_task_name(frequency);
 	struct strbuf tfilename = STRBUF_INIT;
+	int minute = get_random_minute();
 
 	get_schedule_cmd(&cmd, NULL);
 
@@ -2015,7 +2016,7 @@ static int schtasks_schedule_task(const char *exec_path, enum schedule_priority
 	switch (schedule) {
 	case SCHEDULE_HOURLY:
 		fprintf(tfile->fp,
-			"<StartBoundary>2020-01-01T01:00:00</StartBoundary>\n"
+			"<StartBoundary>2020-01-01T01:%02d:00</StartBoundary>\n"
 			"<Enabled>true</Enabled>\n"
 			"<ScheduleByDay>\n"
 			"<DaysInterval>1</DaysInterval>\n"
@@ -2024,12 +2025,13 @@ static int schtasks_schedule_task(const char *exec_path, enum schedule_priority
 			"<Interval>PT1H</Interval>\n"
 			"<Duration>PT23H</Duration>\n"
 			"<StopAtDurationEnd>false</StopAtDurationEnd>\n"
-			"</Repetition>\n");
+			"</Repetition>\n",
+			minute);
 		break;
 
 	case SCHEDULE_DAILY:
 		fprintf(tfile->fp,
-			"<StartBoundary>2020-01-01T00:00:00</StartBoundary>\n"
+			"<StartBoundary>2020-01-01T00:%02d:00</StartBoundary>\n"
 			"<Enabled>true</Enabled>\n"
 			"<ScheduleByWeek>\n"
 			"<DaysOfWeek>\n"
@@ -2041,19 +2043,21 @@ static int schtasks_schedule_task(const char *exec_path, enum schedule_priority
 			"<Saturday />\n"
 			"</DaysOfWeek>\n"
 			"<WeeksInterval>1</WeeksInterval>\n"
-			"</ScheduleByWeek>\n");
+			"</ScheduleByWeek>\n",
+			minute);
 		break;
 
 	case SCHEDULE_WEEKLY:
 		fprintf(tfile->fp,
-			"<StartBoundary>2020-01-01T00:00:00</StartBoundary>\n"
+			"<StartBoundary>2020-01-01T00:%02d:00</StartBoundary>\n"
 			"<Enabled>true</Enabled>\n"
 			"<ScheduleByWeek>\n"
 			"<DaysOfWeek>\n"
 			"<Sunday />\n"
 			"</DaysOfWeek>\n"
 			"<WeeksInterval>1</WeeksInterval>\n"
-			"</ScheduleByWeek>\n");
+			"</ScheduleByWeek>\n",
+			minute);
 		break;
 
 	default:
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 4/8] maintenance: use random minute in cron scheduler
  2023-08-10 20:39 ` [PATCH v2 0/8] maintenance: schedule maintenance on a random minute Derrick Stolee via GitGitGadget
                     ` (2 preceding siblings ...)
  2023-08-10 20:39   ` [PATCH v2 3/8] maintenance: use random minute in Windows scheduler Derrick Stolee via GitGitGadget
@ 2023-08-10 20:39   ` Derrick Stolee via GitGitGadget
  2023-08-10 20:39   ` [PATCH v2 5/8] maintenance: swap method locations Derrick Stolee via GitGitGadget
                     ` (3 subsequent siblings)
  7 siblings, 0 replies; 47+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2023-08-10 20:39 UTC (permalink / raw)
  To: git
  Cc: gitster, sandals, lenaic, Taylor Blau, Phillip Wood,
	Derrick Stolee, Derrick Stolee

From: Derrick Stolee <derrickstolee@github.com>

The get_random_minute() method was created to allow maintenance
schedules to be fixed to a random minute of the hour. This randomness is
only intended to spread out the load from a number of clients, but each
client should have an hour between each maintenance cycle.

Add this random minute to the cron integration.

The cron schedule specification starts with a minute indicator, which
was previously inserted as the "0" string but now takes the given minute
as an integer parameter.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 builtin/gc.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index 78924bb32c6..ef8bb772c38 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -2174,6 +2174,7 @@ static int crontab_update_schedule(int run_maintenance, int fd)
 	FILE *cron_list, *cron_in;
 	struct strbuf line = STRBUF_INIT;
 	struct tempfile *tmpedit = NULL;
+	int minute = get_random_minute();
 
 	get_schedule_cmd(&cmd, NULL);
 	strvec_split(&crontab_list.args, cmd);
@@ -2228,11 +2229,11 @@ static int crontab_update_schedule(int run_maintenance, int fd)
 			"# replaced in the future by a Git command.\n\n");
 
 		strbuf_addf(&line_format,
-			    "%%s %%s * * %%s \"%s/git\" --exec-path=\"%s\" for-each-repo --config=maintenance.repo maintenance run --schedule=%%s\n",
+			    "%%d %%s * * %%s \"%s/git\" --exec-path=\"%s\" for-each-repo --config=maintenance.repo maintenance run --schedule=%%s\n",
 			    exec_path, exec_path);
-		fprintf(cron_in, line_format.buf, "0", "1-23", "*", "hourly");
-		fprintf(cron_in, line_format.buf, "0", "0", "1-6", "daily");
-		fprintf(cron_in, line_format.buf, "0", "0", "0", "weekly");
+		fprintf(cron_in, line_format.buf, minute, "1-23", "*", "hourly");
+		fprintf(cron_in, line_format.buf, minute, "0", "1-6", "daily");
+		fprintf(cron_in, line_format.buf, minute, "0", "0", "weekly");
 		strbuf_release(&line_format);
 
 		fprintf(cron_in, "\n%s\n", END_LINE);
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 5/8] maintenance: swap method locations
  2023-08-10 20:39 ` [PATCH v2 0/8] maintenance: schedule maintenance on a random minute Derrick Stolee via GitGitGadget
                     ` (3 preceding siblings ...)
  2023-08-10 20:39   ` [PATCH v2 4/8] maintenance: use random minute in cron scheduler Derrick Stolee via GitGitGadget
@ 2023-08-10 20:39   ` Derrick Stolee via GitGitGadget
  2023-08-10 20:39   ` [PATCH v2 6/8] maintenance: use random minute in systemd scheduler Derrick Stolee via GitGitGadget
                     ` (2 subsequent siblings)
  7 siblings, 0 replies; 47+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2023-08-10 20:39 UTC (permalink / raw)
  To: git
  Cc: gitster, sandals, lenaic, Taylor Blau, Phillip Wood,
	Derrick Stolee, Derrick Stolee

From: Derrick Stolee <derrickstolee@github.com>

The systemd_timer_write_unit_templates() method writes a single template
that is then used to start the hourly, daily, and weekly schedules with
systemd.

However, in order to schedule systemd maintenance on a given minute,
these templates need to be replaced with specific schedules for each of
these jobs.

Before modifying the schedules, move the writing method above the
systemd_timer_enable_unit() method, so we can write a specific schedule
for each unit.

The diff is computed smaller by showing systemd_timer_enable_unit() and
systemd_timer_delete_units()  move instead of
systemd_timer_write_unit_templates() and
systemd_timer_delete_unit_templates().

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 builtin/gc.c | 96 ++++++++++++++++++++++++++--------------------------
 1 file changed, 48 insertions(+), 48 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index ef8bb772c38..e3819fc285a 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -2292,46 +2292,6 @@ static char *xdg_config_home_systemd(const char *filename)
 	return xdg_config_home_for("systemd/user", filename);
 }
 
-static int systemd_timer_enable_unit(int enable,
-				     enum schedule_priority schedule)
-{
-	const char *cmd = "systemctl";
-	struct child_process child = CHILD_PROCESS_INIT;
-	const char *frequency = get_frequency(schedule);
-
-	/*
-	 * Disabling the systemd unit while it is already disabled makes
-	 * systemctl print an error.
-	 * Let's ignore it since it means we already are in the expected state:
-	 * the unit is disabled.
-	 *
-	 * On the other hand, enabling a systemd unit which is already enabled
-	 * produces no error.
-	 */
-	if (!enable)
-		child.no_stderr = 1;
-
-	get_schedule_cmd(&cmd, NULL);
-	strvec_split(&child.args, cmd);
-	strvec_pushl(&child.args, "--user", enable ? "enable" : "disable",
-		     "--now", NULL);
-	strvec_pushf(&child.args, "git-maintenance@%s.timer", frequency);
-
-	if (start_command(&child))
-		return error(_("failed to start systemctl"));
-	if (finish_command(&child))
-		/*
-		 * Disabling an already disabled systemd unit makes
-		 * systemctl fail.
-		 * Let's ignore this failure.
-		 *
-		 * Enabling an enabled systemd unit doesn't fail.
-		 */
-		if (enable)
-			return error(_("failed to run systemctl"));
-	return 0;
-}
-
 static int systemd_timer_delete_unit_templates(void)
 {
 	int ret = 0;
@@ -2348,14 +2308,6 @@ static int systemd_timer_delete_unit_templates(void)
 	return ret;
 }
 
-static int systemd_timer_delete_units(void)
-{
-	return systemd_timer_enable_unit(0, SCHEDULE_HOURLY) ||
-	       systemd_timer_enable_unit(0, SCHEDULE_DAILY) ||
-	       systemd_timer_enable_unit(0, SCHEDULE_WEEKLY) ||
-	       systemd_timer_delete_unit_templates();
-}
-
 static int systemd_timer_write_unit_templates(const char *exec_path)
 {
 	char *filename;
@@ -2437,6 +2389,54 @@ error:
 	return -1;
 }
 
+static int systemd_timer_enable_unit(int enable,
+				     enum schedule_priority schedule)
+{
+	const char *cmd = "systemctl";
+	struct child_process child = CHILD_PROCESS_INIT;
+	const char *frequency = get_frequency(schedule);
+
+	/*
+	 * Disabling the systemd unit while it is already disabled makes
+	 * systemctl print an error.
+	 * Let's ignore it since it means we already are in the expected state:
+	 * the unit is disabled.
+	 *
+	 * On the other hand, enabling a systemd unit which is already enabled
+	 * produces no error.
+	 */
+	if (!enable)
+		child.no_stderr = 1;
+
+	get_schedule_cmd(&cmd, NULL);
+	strvec_split(&child.args, cmd);
+	strvec_pushl(&child.args, "--user", enable ? "enable" : "disable",
+		     "--now", NULL);
+	strvec_pushf(&child.args, "git-maintenance@%s.timer", frequency);
+
+	if (start_command(&child))
+		return error(_("failed to start systemctl"));
+	if (finish_command(&child))
+		/*
+		 * Disabling an already disabled systemd unit makes
+		 * systemctl fail.
+		 * Let's ignore this failure.
+		 *
+		 * Enabling an enabled systemd unit doesn't fail.
+		 */
+		if (enable)
+			return error(_("failed to run systemctl"));
+	return 0;
+}
+
+static int systemd_timer_delete_units(void)
+{
+	return systemd_timer_enable_unit(0, SCHEDULE_HOURLY) ||
+	       systemd_timer_enable_unit(0, SCHEDULE_DAILY) ||
+	       systemd_timer_enable_unit(0, SCHEDULE_WEEKLY) ||
+	       systemd_timer_delete_unit_templates();
+}
+
 static int systemd_timer_setup_units(void)
 {
 	const char *exec_path = git_exec_path();
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 6/8] maintenance: use random minute in systemd scheduler
  2023-08-10 20:39 ` [PATCH v2 0/8] maintenance: schedule maintenance on a random minute Derrick Stolee via GitGitGadget
                     ` (4 preceding siblings ...)
  2023-08-10 20:39   ` [PATCH v2 5/8] maintenance: swap method locations Derrick Stolee via GitGitGadget
@ 2023-08-10 20:39   ` Derrick Stolee via GitGitGadget
  2023-08-14 11:26     ` Phillip Wood
  2023-08-10 20:39   ` [PATCH v2 7/8] maintenance: fix systemd schedule overlaps Derrick Stolee via GitGitGadget
  2023-08-10 20:39   ` [PATCH v2 8/8] maintenance: update schedule before config Derrick Stolee via GitGitGadget
  7 siblings, 1 reply; 47+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2023-08-10 20:39 UTC (permalink / raw)
  To: git
  Cc: gitster, sandals, lenaic, Taylor Blau, Phillip Wood,
	Derrick Stolee, Derrick Stolee

From: Derrick Stolee <derrickstolee@github.com>

The get_random_minute() method was created to allow maintenance
schedules to be fixed to a random minute of the hour. This randomness is
only intended to spread out the load from a number of clients, but each
client should have an hour between each maintenance cycle.

Add this random minute to the systemd integration.

This integration is more complicated than similar changes for other
schedulers because of a neat trick that systemd allows: templating.

The previous implementation generated two template files with names
of the form 'git-maintenance@.(timer|service)'. The '.timer' or
'.service' indicates that this is a template that is picked up when we
later specify '...@<schedule>.timer' or '...@<schedule>.service'. The
'<schedule>' string is then used to insert into the template both the
'OnCalendar' schedule setting and the '--schedule' parameter of the
'git maintenance run' command.

In order to set these schedules to a given minute, we can no longer use
the 'hourly', 'daily', or 'weekly' strings for '<schedule>' and instead
need to abandon the template model for the .timer files. We can still
use templates for the .service files. For this reason, we split these
writes into two methods.

Modify the template with a custom schedule in the 'OnCalendar' setting.
This schedule has some interesting differences from cron-like patterns,
but is relatively easy to figure out from context. The one that might be
confusing is that '*-*-*' is a date-based pattern, but this must be
omitted when using 'Mon' to signal that we care about the day of the
week. Monday is used since that matches the day used for the 'weekly'
schedule used previously.

Now that the timer files are not templates, we might want to abandon the
'@' symbol in the file names. However, this would cause users with
existing schedules to get two competing schedules due to different
names. The work to remove the old schedule name is one thing that we can
avoid by keeping the '@' symbol in our unit names. Since we are locked
into this name, it makes sense that we keep the template model for the
.service files.

The rest of the change involves making sure we are writing these .timer
and .service files before initializing the schedule with 'systemctl' and
deleting the files when we are done. Some changes are also made to share
the random minute along with a single computation of the execution path
of the current Git executable.

In addition, older Git versions may have written a
'git-maintenance@.timer' template file. Be sure to remove this when
successfully enabling maintenance (or disabling maintenance).

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 builtin/gc.c           | 152 ++++++++++++++++++++++++++++++++++-------
 t/t7900-maintenance.sh |  15 +++-
 2 files changed, 142 insertions(+), 25 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index e3819fc285a..e52129e485c 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -2292,29 +2292,54 @@ static char *xdg_config_home_systemd(const char *filename)
 	return xdg_config_home_for("systemd/user", filename);
 }
 
-static int systemd_timer_delete_unit_templates(void)
+#define SYSTEMD_UNIT_FORMAT "git-maintenance@%s.%s"
+
+static int systemd_timer_delete_timer_file(enum schedule_priority priority)
 {
 	int ret = 0;
-	char *filename = xdg_config_home_systemd("git-maintenance@.timer");
+	const char *frequency = get_frequency(priority);
+	char *local_timer_name = xstrfmt(SYSTEMD_UNIT_FORMAT, frequency, "timer");
+	char *filename = xdg_config_home_systemd(local_timer_name);
+
 	if (unlink(filename) && !is_missing_file_error(errno))
 		ret = error_errno(_("failed to delete '%s'"), filename);
-	FREE_AND_NULL(filename);
 
-	filename = xdg_config_home_systemd("git-maintenance@.service");
+	free(filename);
+	free(local_timer_name);
+	return ret;
+}
+
+static int systemd_timer_delete_service_template(void)
+{
+	int ret = 0;
+	char *local_service_name = xstrfmt(SYSTEMD_UNIT_FORMAT, "", "service");
+	char *filename = xdg_config_home_systemd(local_service_name);
 	if (unlink(filename) && !is_missing_file_error(errno))
 		ret = error_errno(_("failed to delete '%s'"), filename);
 
 	free(filename);
+	free(local_service_name);
 	return ret;
 }
 
-static int systemd_timer_write_unit_templates(const char *exec_path)
+/*
+ * Write the schedule information into a git-maintenance@<schedule>.timer
+ * file using a custom minute. This timer file cannot use the templating
+ * system, so we generate a specific file for each.
+ */
+static int systemd_timer_write_timer_file(enum schedule_priority schedule,
+					  int minute)
 {
+	int res = -1;
 	char *filename;
 	FILE *file;
 	const char *unit;
+	char *schedule_pattern = NULL;
+	const char *frequency = get_frequency(schedule);
+	char *local_timer_name = xstrfmt(SYSTEMD_UNIT_FORMAT, frequency, "timer");
+
+	filename = xdg_config_home_systemd(local_timer_name);
 
-	filename = xdg_config_home_systemd("git-maintenance@.timer");
 	if (safe_create_leading_directories(filename)) {
 		error(_("failed to create directories for '%s'"), filename);
 		goto error;
@@ -2323,6 +2348,23 @@ static int systemd_timer_write_unit_templates(const char *exec_path)
 	if (!file)
 		goto error;
 
+	switch (schedule) {
+	case SCHEDULE_HOURLY:
+		schedule_pattern = xstrfmt("*-*-* *:%02d:00", minute);
+		break;
+
+	case SCHEDULE_DAILY:
+		schedule_pattern = xstrfmt("*-*-* 0:%02d:00", minute);
+		break;
+
+	case SCHEDULE_WEEKLY:
+		schedule_pattern = xstrfmt("Mon 0:%02d:00", minute);
+		break;
+
+	default:
+		BUG("Unhandled schedule_priority");
+	}
+
 	unit = "# This file was created and is maintained by Git.\n"
 	       "# Any edits made in this file might be replaced in the future\n"
 	       "# by a Git command.\n"
@@ -2331,12 +2373,12 @@ static int systemd_timer_write_unit_templates(const char *exec_path)
 	       "Description=Optimize Git repositories data\n"
 	       "\n"
 	       "[Timer]\n"
-	       "OnCalendar=%i\n"
+	       "OnCalendar=%s\n"
 	       "Persistent=true\n"
 	       "\n"
 	       "[Install]\n"
 	       "WantedBy=timers.target\n";
-	if (fputs(unit, file) == EOF) {
+	if (fprintf(file, unit, schedule_pattern) < 0) {
 		error(_("failed to write to '%s'"), filename);
 		fclose(file);
 		goto error;
@@ -2345,9 +2387,36 @@ static int systemd_timer_write_unit_templates(const char *exec_path)
 		error_errno(_("failed to flush '%s'"), filename);
 		goto error;
 	}
+
+	res = 0;
+
+error:
+	free(schedule_pattern);
+	free(local_timer_name);
 	free(filename);
+	return res;
+}
 
-	filename = xdg_config_home_systemd("git-maintenance@.service");
+/*
+ * No matter the schedule, we use the same service and can make use of the
+ * templating system. When installing git-maintenance@<schedule>.timer,
+ * systemd will notice that git-maintenance@.service exists as a template
+ * and will use this file and insert the <schedule> into the template at
+ * the position of "%i".
+ */
+static int systemd_timer_write_service_template(const char *exec_path)
+{
+	int res = -1;
+	char *filename;
+	FILE *file;
+	const char *unit;
+	char *local_service_name = xstrfmt(SYSTEMD_UNIT_FORMAT, "", "service");
+
+	filename = xdg_config_home_systemd(local_service_name);
+	if (safe_create_leading_directories(filename)) {
+		error(_("failed to create directories for '%s'"), filename);
+		goto error;
+	}
 	file = fopen_or_warn(filename, "w");
 	if (!file)
 		goto error;
@@ -2380,17 +2449,18 @@ static int systemd_timer_write_unit_templates(const char *exec_path)
 		error_errno(_("failed to flush '%s'"), filename);
 		goto error;
 	}
-	free(filename);
-	return 0;
+
+	res = 0;
 
 error:
+	free(local_service_name);
 	free(filename);
-	systemd_timer_delete_unit_templates();
-	return -1;
+	return res;
 }
 
 static int systemd_timer_enable_unit(int enable,
-				     enum schedule_priority schedule)
+				     enum schedule_priority schedule,
+				     int minute)
 {
 	const char *cmd = "systemctl";
 	struct child_process child = CHILD_PROCESS_INIT;
@@ -2407,12 +2477,14 @@ static int systemd_timer_enable_unit(int enable,
 	 */
 	if (!enable)
 		child.no_stderr = 1;
+	else if (systemd_timer_write_timer_file(schedule, minute))
+		return -1;
 
 	get_schedule_cmd(&cmd, NULL);
 	strvec_split(&child.args, cmd);
 	strvec_pushl(&child.args, "--user", enable ? "enable" : "disable",
 		     "--now", NULL);
-	strvec_pushf(&child.args, "git-maintenance@%s.timer", frequency);
+	strvec_pushf(&child.args, SYSTEMD_UNIT_FORMAT, frequency, "timer");
 
 	if (start_command(&child))
 		return error(_("failed to start systemctl"));
@@ -2429,24 +2501,58 @@ static int systemd_timer_enable_unit(int enable,
 	return 0;
 }
 
+/*
+ * A previous version of Git wrote the timer units as template files.
+ * Clean these up, if they exist.
+ */
+static void systemd_timer_delete_stale_timer_templates(void)
+{
+	char *timer_template_name = xstrfmt(SYSTEMD_UNIT_FORMAT, "", "timer");
+	char *filename = xdg_config_home_systemd(timer_template_name);
+
+	if (unlink(filename) && !is_missing_file_error(errno))
+		warning(_("failed to delete '%s'"), filename);
+
+	free(filename);
+	free(timer_template_name);
+}
+
+static int systemd_timer_delete_unit_files(void)
+{
+	systemd_timer_delete_stale_timer_templates();
+
+	/* Purposefully not short-circuited to make sure all are called. */
+	return systemd_timer_delete_timer_file(SCHEDULE_HOURLY) |
+	       systemd_timer_delete_timer_file(SCHEDULE_DAILY) |
+	       systemd_timer_delete_timer_file(SCHEDULE_WEEKLY) |
+	       systemd_timer_delete_service_template();
+}
+
 static int systemd_timer_delete_units(void)
 {
-	return systemd_timer_enable_unit(0, SCHEDULE_HOURLY) ||
-	       systemd_timer_enable_unit(0, SCHEDULE_DAILY) ||
-	       systemd_timer_enable_unit(0, SCHEDULE_WEEKLY) ||
-	       systemd_timer_delete_unit_templates();
+	int minute = get_random_minute();
+	/* Purposefully not short-circuited to make sure all are called. */
+	return systemd_timer_enable_unit(0, SCHEDULE_HOURLY, minute) |
+	       systemd_timer_enable_unit(0, SCHEDULE_DAILY, minute) |
+	       systemd_timer_enable_unit(0, SCHEDULE_WEEKLY, minute) |
+	       systemd_timer_delete_unit_files();
 }
 
 static int systemd_timer_setup_units(void)
 {
+	int minute = get_random_minute();
 	const char *exec_path = git_exec_path();
 
-	int ret = systemd_timer_write_unit_templates(exec_path) ||
-		  systemd_timer_enable_unit(1, SCHEDULE_HOURLY) ||
-		  systemd_timer_enable_unit(1, SCHEDULE_DAILY) ||
-		  systemd_timer_enable_unit(1, SCHEDULE_WEEKLY);
+	int ret = systemd_timer_write_service_template(exec_path) ||
+		  systemd_timer_enable_unit(1, SCHEDULE_HOURLY, minute) ||
+		  systemd_timer_enable_unit(1, SCHEDULE_DAILY, minute) ||
+		  systemd_timer_enable_unit(1, SCHEDULE_WEEKLY, minute);
+
 	if (ret)
 		systemd_timer_delete_units();
+	else
+		systemd_timer_delete_stale_timer_templates();
+
 	return ret;
 }
 
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 487e326b3fa..9ffe76729e6 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -744,7 +744,15 @@ test_expect_success 'start and stop Linux/systemd maintenance' '
 	# start registers the repo
 	git config --get --global --fixed-value maintenance.repo "$(pwd)" &&
 
-	test_systemd_analyze_verify "systemd/user/git-maintenance@.service" &&
+	for schedule in hourly daily weekly
+	do
+		test_path_is_file "systemd/user/git-maintenance@$schedule.timer" || return 1
+	done &&
+	test_path_is_file "systemd/user/git-maintenance@.service" &&
+
+	test_systemd_analyze_verify "systemd/user/git-maintenance@hourly.service" &&
+	test_systemd_analyze_verify "systemd/user/git-maintenance@daily.service" &&
+	test_systemd_analyze_verify "systemd/user/git-maintenance@weekly.service" &&
 
 	printf -- "--user enable --now git-maintenance@%s.timer\n" hourly daily weekly >expect &&
 	test_cmp expect args &&
@@ -755,7 +763,10 @@ test_expect_success 'start and stop Linux/systemd maintenance' '
 	# stop does not unregister the repo
 	git config --get --global --fixed-value maintenance.repo "$(pwd)" &&
 
-	test_path_is_missing "systemd/user/git-maintenance@.timer" &&
+	for schedule in hourly daily weekly
+	do
+		test_path_is_missing "systemd/user/git-maintenance@$schedule.timer" || return 1
+	done &&
 	test_path_is_missing "systemd/user/git-maintenance@.service" &&
 
 	printf -- "--user disable --now git-maintenance@%s.timer\n" hourly daily weekly >expect &&
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 7/8] maintenance: fix systemd schedule overlaps
  2023-08-10 20:39 ` [PATCH v2 0/8] maintenance: schedule maintenance on a random minute Derrick Stolee via GitGitGadget
                     ` (5 preceding siblings ...)
  2023-08-10 20:39   ` [PATCH v2 6/8] maintenance: use random minute in systemd scheduler Derrick Stolee via GitGitGadget
@ 2023-08-10 20:39   ` Derrick Stolee via GitGitGadget
  2023-08-10 21:22     ` Junio C Hamano
  2023-08-14 11:27     ` Phillip Wood
  2023-08-10 20:39   ` [PATCH v2 8/8] maintenance: update schedule before config Derrick Stolee via GitGitGadget
  7 siblings, 2 replies; 47+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2023-08-10 20:39 UTC (permalink / raw)
  To: git
  Cc: gitster, sandals, lenaic, Taylor Blau, Phillip Wood,
	Derrick Stolee, Derrick Stolee

From: Derrick Stolee <derrickstolee@github.com>

The 'git maintenance run' command prevents concurrent runs in the same
repository using a 'maintenance.lock' file. However, when using systemd
the hourly maintenance runs the same time as the daily and weekly runs.
(Similarly, daily maintenance runs at the same time as weekly
maintenance.) These competing commands result in some maintenance not
actually being run.

This overlap was something we could not fix until we made the recent
change to not use the builting 'hourly', 'daily', and 'weekly' schedules
in systemd. We can adjust the schedules such that:

 1. Hourly runs avoid the 0th hour.
 2. Daily runs avoid Monday.

This will keep maintenance runs from colliding when using systemd.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 builtin/gc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index e52129e485c..6f8df366fbe 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -2350,11 +2350,11 @@ static int systemd_timer_write_timer_file(enum schedule_priority schedule,
 
 	switch (schedule) {
 	case SCHEDULE_HOURLY:
-		schedule_pattern = xstrfmt("*-*-* *:%02d:00", minute);
+		schedule_pattern = xstrfmt("*-*-* 1..23:%02d:00", minute);
 		break;
 
 	case SCHEDULE_DAILY:
-		schedule_pattern = xstrfmt("*-*-* 0:%02d:00", minute);
+		schedule_pattern = xstrfmt("Tue..Sun *-*-* 0:%02d:00", minute);
 		break;
 
 	case SCHEDULE_WEEKLY:
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 8/8] maintenance: update schedule before config
  2023-08-10 20:39 ` [PATCH v2 0/8] maintenance: schedule maintenance on a random minute Derrick Stolee via GitGitGadget
                     ` (6 preceding siblings ...)
  2023-08-10 20:39   ` [PATCH v2 7/8] maintenance: fix systemd schedule overlaps Derrick Stolee via GitGitGadget
@ 2023-08-10 20:39   ` Derrick Stolee via GitGitGadget
  2023-08-14 11:28     ` Phillip Wood
  7 siblings, 1 reply; 47+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2023-08-10 20:39 UTC (permalink / raw)
  To: git
  Cc: gitster, sandals, lenaic, Taylor Blau, Phillip Wood,
	Derrick Stolee, Derrick Stolee

From: Derrick Stolee <derrickstolee@github.com>

When running 'git maintenance start', the current pattern is to
configure global config settings to enable maintenance on the current
repository and set 'maintenance.auto' to false and _then_ to set up the
schedule with the system scheduler.

This has a problematic error condition: if the scheduler fails to
initialize, the repository still will not use automatic maintenance due
to the 'maintenance.auto' setting.

Fix this gap by swapping the order of operations. If Git fails to
initialize maintenance, then the config changes should never happen.

Reported-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 builtin/gc.c           |  5 ++++-
 t/t7900-maintenance.sh | 13 +++++++++++++
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index 6f8df366fbe..fe5f871c493 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -2728,9 +2728,12 @@ static int maintenance_start(int argc, const char **argv, const char *prefix)
 	opts.scheduler = resolve_scheduler(opts.scheduler);
 	validate_scheduler(opts.scheduler);
 
+	if (update_background_schedule(&opts, 1))
+		die(_("failed to set up maintenance schedule"));
+
 	if (maintenance_register(ARRAY_SIZE(register_args)-1, register_args, NULL))
 		warning(_("failed to add repo to global config"));
-	return update_background_schedule(&opts, 1);
+	return 0;
 }
 
 static const char *const builtin_maintenance_stop_usage[] = {
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 9ffe76729e6..e56f5980dc4 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -849,4 +849,17 @@ test_expect_success 'register and unregister bare repo' '
 	)
 '
 
+test_expect_success 'failed schedule prevents config change' '
+	git init --bare failcase &&
+
+	for scheduler in crontab launchctl schtasks systemctl
+	do
+		GIT_TEST_MAINT_SCHEDULER="$scheduler:false" &&
+		export GIT_TEST_MAINT_SCHEDULER &&
+		test_must_fail \
+			git -C failcase maintenance start &&
+		test_must_fail git -C failcase config maintenance.auto || return 1
+	done
+'
+
 test_done
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 7/8] maintenance: fix systemd schedule overlaps
  2023-08-10 20:39   ` [PATCH v2 7/8] maintenance: fix systemd schedule overlaps Derrick Stolee via GitGitGadget
@ 2023-08-10 21:22     ` Junio C Hamano
  2023-08-14 11:27     ` Phillip Wood
  1 sibling, 0 replies; 47+ messages in thread
From: Junio C Hamano @ 2023-08-10 21:22 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, sandals, lenaic, Taylor Blau, Phillip Wood, Derrick Stolee

"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Derrick Stolee <derrickstolee@github.com>
>
> The 'git maintenance run' command prevents concurrent runs in the same
> repository using a 'maintenance.lock' file. However, when using systemd
> the hourly maintenance runs the same time as the daily and weekly runs.
> (Similarly, daily maintenance runs at the same time as weekly
> maintenance.) These competing commands result in some maintenance not
> actually being run.
>
> This overlap was something we could not fix until we made the recent
> change to not use the builting 'hourly', 'daily', and 'weekly' schedules
> in systemd. We can adjust the schedules such that:
>
>  1. Hourly runs avoid the 0th hour.
>  2. Daily runs avoid Monday.

Simple and stupid is good.  When I read the problem description, I
wondered if the hour in which daily jobs are run will also be
dispersed (so that the load of the server that helps daily tasks at
the clients will not be concentrated in a single hour out of 24
hours of a day), but hopefully the load is still spread within that
60 minutes evenly, which probably is good enough.



>
> This will keep maintenance runs from colliding when using systemd.
>
> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
> ---
>  builtin/gc.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/builtin/gc.c b/builtin/gc.c
> index e52129e485c..6f8df366fbe 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -2350,11 +2350,11 @@ static int systemd_timer_write_timer_file(enum schedule_priority schedule,
>  
>  	switch (schedule) {
>  	case SCHEDULE_HOURLY:
> -		schedule_pattern = xstrfmt("*-*-* *:%02d:00", minute);
> +		schedule_pattern = xstrfmt("*-*-* 1..23:%02d:00", minute);
>  		break;
>  
>  	case SCHEDULE_DAILY:
> -		schedule_pattern = xstrfmt("*-*-* 0:%02d:00", minute);
> +		schedule_pattern = xstrfmt("Tue..Sun *-*-* 0:%02d:00", minute);
>  		break;
>  
>  	case SCHEDULE_WEEKLY:

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 1/8] maintenance: add get_random_minute()
  2023-08-10 20:39   ` [PATCH v2 1/8] maintenance: add get_random_minute() Derrick Stolee via GitGitGadget
@ 2023-08-10 21:25     ` Taylor Blau
  0 siblings, 0 replies; 47+ messages in thread
From: Taylor Blau @ 2023-08-10 21:25 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, gitster, sandals, lenaic, Phillip Wood, Derrick Stolee

On Thu, Aug 10, 2023 at 08:39:40PM +0000, Derrick Stolee via GitGitGadget wrote:
> ---
>  builtin/gc.c | 10 ++++++++++
>  wrapper.c    | 10 ++++++++++
>  wrapper.h    |  6 ++++++
>  3 files changed, 26 insertions(+)

Very nice, this patch LGTM. Thanks for working on it!

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 6/8] maintenance: use random minute in systemd scheduler
  2023-08-10 20:39   ` [PATCH v2 6/8] maintenance: use random minute in systemd scheduler Derrick Stolee via GitGitGadget
@ 2023-08-14 11:26     ` Phillip Wood
  0 siblings, 0 replies; 47+ messages in thread
From: Phillip Wood @ 2023-08-14 11:26 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget, git
  Cc: gitster, sandals, lenaic, Taylor Blau, Derrick Stolee

Hi Stolee

This all looks good to me now, I've left a comment on the test but I 
think it is probably fine to go in as it is.

> -static int systemd_timer_write_unit_templates(const char *exec_path)
> +/*
> + * Write the schedule information into a git-maintenance@<schedule>.timer
> + * file using a custom minute. This timer file cannot use the templating
> + * system, so we generate a specific file for each.
> + */

Thanks for adding comments to the functions that write the timer and 
service files, they really helpful especially for readers who are not 
that familiar with systemd.

> +static int systemd_timer_delete_unit_files(void)
> +{
> +	systemd_timer_delete_stale_timer_templates();
> +
> +	/* Purposefully not short-circuited to make sure all are called. */
> +	return systemd_timer_delete_timer_file(SCHEDULE_HOURLY) |
> +	       systemd_timer_delete_timer_file(SCHEDULE_DAILY) |
> +	       systemd_timer_delete_timer_file(SCHEDULE_WEEKLY) |
> +	       systemd_timer_delete_service_template();

Using "|" rather than "||" is a nice touch.

> diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
> index 487e326b3fa..9ffe76729e6 100755
> --- a/t/t7900-maintenance.sh
> +++ b/t/t7900-maintenance.sh
> @@ -744,7 +744,15 @@ test_expect_success 'start and stop Linux/systemd maintenance' '
>   	# start registers the repo
>   	git config --get --global --fixed-value maintenance.repo "$(pwd)" &&
>   
> -	test_systemd_analyze_verify "systemd/user/git-maintenance@.service" &&
> +	for schedule in hourly daily weekly
> +	do
> +		test_path_is_file "systemd/user/git-maintenance@$schedule.timer" || return 1
> +	done &&
> +	test_path_is_file "systemd/user/git-maintenance@.service" &&
> +
> +	test_systemd_analyze_verify "systemd/user/git-maintenance@hourly.service" &&
> +	test_systemd_analyze_verify "systemd/user/git-maintenance@daily.service" &&
> +	test_systemd_analyze_verify "systemd/user/git-maintenance@weekly.service" &&

As we only write the template service file I'm not sure what we gain by 
these three checks but they don't seem to be doing any harm.

Best Wishes

Phillip

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 7/8] maintenance: fix systemd schedule overlaps
  2023-08-10 20:39   ` [PATCH v2 7/8] maintenance: fix systemd schedule overlaps Derrick Stolee via GitGitGadget
  2023-08-10 21:22     ` Junio C Hamano
@ 2023-08-14 11:27     ` Phillip Wood
  1 sibling, 0 replies; 47+ messages in thread
From: Phillip Wood @ 2023-08-14 11:27 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget, git
  Cc: gitster, sandals, lenaic, Taylor Blau, Derrick Stolee

Hi Stolee

On 10/08/2023 21:39, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <derrickstolee@github.com>
> 
> The 'git maintenance run' command prevents concurrent runs in the same
> repository using a 'maintenance.lock' file. However, when using systemd
> the hourly maintenance runs the same time as the daily and weekly runs.
> (Similarly, daily maintenance runs at the same time as weekly
> maintenance.) These competing commands result in some maintenance not
> actually being run.
> 
> This overlap was something we could not fix until we made the recent
> change to not use the builting 'hourly', 'daily', and 'weekly' schedules
> in systemd. We can adjust the schedules such that:
> 
>   1. Hourly runs avoid the 0th hour.
>   2. Daily runs avoid Monday.
> 
> This will keep maintenance runs from colliding when using systemd.

Thanks for fixing this, the patch looks good

Best Wishes

Phillip

> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
> ---
>   builtin/gc.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/builtin/gc.c b/builtin/gc.c
> index e52129e485c..6f8df366fbe 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -2350,11 +2350,11 @@ static int systemd_timer_write_timer_file(enum schedule_priority schedule,
>   
>   	switch (schedule) {
>   	case SCHEDULE_HOURLY:
> -		schedule_pattern = xstrfmt("*-*-* *:%02d:00", minute);
> +		schedule_pattern = xstrfmt("*-*-* 1..23:%02d:00", minute);
>   		break;
>   
>   	case SCHEDULE_DAILY:
> -		schedule_pattern = xstrfmt("*-*-* 0:%02d:00", minute);
> +		schedule_pattern = xstrfmt("Tue..Sun *-*-* 0:%02d:00", minute);
>   		break;
>   
>   	case SCHEDULE_WEEKLY:

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 8/8] maintenance: update schedule before config
  2023-08-10 20:39   ` [PATCH v2 8/8] maintenance: update schedule before config Derrick Stolee via GitGitGadget
@ 2023-08-14 11:28     ` Phillip Wood
  0 siblings, 0 replies; 47+ messages in thread
From: Phillip Wood @ 2023-08-14 11:28 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget, git
  Cc: gitster, sandals, lenaic, Taylor Blau, Derrick Stolee

Hi Stolee

On 10/08/2023 21:39, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <derrickstolee@github.com>
> 
> When running 'git maintenance start', the current pattern is to
> configure global config settings to enable maintenance on the current
> repository and set 'maintenance.auto' to false and _then_ to set up the
> schedule with the system scheduler.
> 
> This has a problematic error condition: if the scheduler fails to
> initialize, the repository still will not use automatic maintenance due
> to the 'maintenance.auto' setting.
> 
> Fix this gap by swapping the order of operations. If Git fails to
> initialize maintenance, then the config changes should never happen.

The commit message and patch look good to me

Thanks

Phillip

> Reported-by: Phillip Wood <phillip.wood123@gmail.com>
> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
> ---
>   builtin/gc.c           |  5 ++++-
>   t/t7900-maintenance.sh | 13 +++++++++++++
>   2 files changed, 17 insertions(+), 1 deletion(-)
> 
> diff --git a/builtin/gc.c b/builtin/gc.c
> index 6f8df366fbe..fe5f871c493 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -2728,9 +2728,12 @@ static int maintenance_start(int argc, const char **argv, const char *prefix)
>   	opts.scheduler = resolve_scheduler(opts.scheduler);
>   	validate_scheduler(opts.scheduler);
>   
> +	if (update_background_schedule(&opts, 1))
> +		die(_("failed to set up maintenance schedule"));
> +
>   	if (maintenance_register(ARRAY_SIZE(register_args)-1, register_args, NULL))
>   		warning(_("failed to add repo to global config"));
> -	return update_background_schedule(&opts, 1);
> +	return 0;
>   }
>   
>   static const char *const builtin_maintenance_stop_usage[] = {
> diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
> index 9ffe76729e6..e56f5980dc4 100755
> --- a/t/t7900-maintenance.sh
> +++ b/t/t7900-maintenance.sh
> @@ -849,4 +849,17 @@ test_expect_success 'register and unregister bare repo' '
>   	)
>   '
>   
> +test_expect_success 'failed schedule prevents config change' '
> +	git init --bare failcase &&
> +
> +	for scheduler in crontab launchctl schtasks systemctl
> +	do
> +		GIT_TEST_MAINT_SCHEDULER="$scheduler:false" &&
> +		export GIT_TEST_MAINT_SCHEDULER &&
> +		test_must_fail \
> +			git -C failcase maintenance start &&
> +		test_must_fail git -C failcase config maintenance.auto || return 1
> +	done
> +'
> +
>   test_done

^ permalink raw reply	[flat|nested] 47+ messages in thread

end of thread, other threads:[~2023-08-14 11:29 UTC | newest]

Thread overview: 47+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-07 18:51 [PATCH 0/6] maintenance: schedule maintenance on a random minute Derrick Stolee via GitGitGadget
2023-08-07 18:51 ` [PATCH 1/6] maintenance: add get_random_minute() Derrick Stolee via GitGitGadget
2023-08-07 21:20   ` Taylor Blau
2023-08-07 23:53     ` Junio C Hamano
2023-08-08  0:22       ` Junio C Hamano
2023-08-08 14:48         ` Taylor Blau
2023-08-08 16:34           ` Junio C Hamano
2023-08-08 16:49             ` Junio C Hamano
2023-08-08 20:01               ` Taylor Blau
2023-08-08 17:28     ` Derrick Stolee
2023-08-08 20:04       ` Taylor Blau
2023-08-09 12:17         ` Derrick Stolee
2023-08-09 18:50           ` Junio C Hamano
2023-08-09 20:34             ` Taylor Blau
2023-08-07 18:51 ` [PATCH 2/6] maintenance: use random minute in launchctl scheduler Derrick Stolee via GitGitGadget
2023-08-07 21:23   ` Taylor Blau
2023-08-07 18:51 ` [PATCH 3/6] maintenance: use random minute in Windows scheduler Derrick Stolee via GitGitGadget
2023-08-07 18:51 ` [PATCH 4/6] maintenance: use random minute in cron scheduler Derrick Stolee via GitGitGadget
2023-08-07 18:51 ` [PATCH 5/6] maintenance: swap method locations Derrick Stolee via GitGitGadget
2023-08-07 21:24   ` Taylor Blau
2023-08-07 18:51 ` [PATCH 6/6] maintenance: use random minute in systemd scheduler Derrick Stolee via GitGitGadget
2023-08-07 21:31   ` Taylor Blau
2023-08-08 13:49     ` Derrick Stolee
2023-08-08 20:05       ` Taylor Blau
2023-08-08  9:53   ` Phillip Wood
2023-08-08 13:03     ` Phillip Wood
2023-08-08 13:56     ` Derrick Stolee
2023-08-08 17:24       ` Derrick Stolee
2023-08-09 10:03         ` Phillip Wood
2023-08-08 12:08   ` Phillip Wood
2023-08-08 17:06     ` Derrick Stolee
2023-08-08 17:14       ` Derrick Stolee
2023-08-09 10:00         ` Phillip Wood
2023-08-10 20:39 ` [PATCH v2 0/8] maintenance: schedule maintenance on a random minute Derrick Stolee via GitGitGadget
2023-08-10 20:39   ` [PATCH v2 1/8] maintenance: add get_random_minute() Derrick Stolee via GitGitGadget
2023-08-10 21:25     ` Taylor Blau
2023-08-10 20:39   ` [PATCH v2 2/8] maintenance: use random minute in launchctl scheduler Derrick Stolee via GitGitGadget
2023-08-10 20:39   ` [PATCH v2 3/8] maintenance: use random minute in Windows scheduler Derrick Stolee via GitGitGadget
2023-08-10 20:39   ` [PATCH v2 4/8] maintenance: use random minute in cron scheduler Derrick Stolee via GitGitGadget
2023-08-10 20:39   ` [PATCH v2 5/8] maintenance: swap method locations Derrick Stolee via GitGitGadget
2023-08-10 20:39   ` [PATCH v2 6/8] maintenance: use random minute in systemd scheduler Derrick Stolee via GitGitGadget
2023-08-14 11:26     ` Phillip Wood
2023-08-10 20:39   ` [PATCH v2 7/8] maintenance: fix systemd schedule overlaps Derrick Stolee via GitGitGadget
2023-08-10 21:22     ` Junio C Hamano
2023-08-14 11:27     ` Phillip Wood
2023-08-10 20:39   ` [PATCH v2 8/8] maintenance: update schedule before config Derrick Stolee via GitGitGadget
2023-08-14 11:28     ` Phillip Wood

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).