From: "brian m. carlson" <sandals@crustytoothpaste.net>
To: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Cc: Derrick Stolee <stolee@gmail.com>,
	Junio C Hamano <gitster@pobox.com>,
	Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org, peff@peff.net, jrnieder@google.com,
	Derrick Stolee <dstolee@microsoft.com>
Subject: Re: [PATCH 00/15] [RFC] Maintenance jobs and job runner
Date: Wed, 8 Apr 2020 00:01:49 +0000
Message-ID: <20200408000149.GN6369@camp.crustytoothpaste.net>
In-Reply-To: <nycvar.QRO.7.76.6.2004072355100.46@tvgsbejvaqbjf.bet>

Hey,

On 2020-04-07 at 22:23:43, Johannes Schindelin wrote:
> > If there are periodic tasks that should be done, even if only on large
> > repos, then let's have a git gc --periodic that does them.  I'm not sure
> > that fetch should be in that set, but nothing prevents users from doing
> > "git fetch origin && git gc --periodic".
> 
> Hmm. Who says that maintenance tasks are essentially only `gc`? With
> _maaaaaybe_ a `fetch` thrown in?

What I'm saying is that we have a tool to run maintenance tasks on the
repository.  If we need to perform additional maintenance tasks, let's
put them in the same place as the ones we have now.  I realize "gc" may
become a less accurate name, but oh, well.

> > Let's make it as simple and straightforward as possible.
> 
> I get the impression, however, that many reviewers here seem to favor the
> goal of making the _patches_ as simple and straightforward as possible,
> however, at the expense of the original goal. Like, totally sacrificing
> the ease of use in return for "just use a shell script" advice.

I think we can have both.  They are not mutually exclusive, and I've
proposed a way to get both.

> > As for handling multiple repositories, the tool to do that could be as
> > simple as a shell script which reads from ~/.config/git/repo-maintenance
> > (or whatever) and runs the same command on all of the repos it finds
> > there, possibly with a subcommand to add and remove repos.
> 
> Sure, that is flexible.
> 
> And it requires a ton of Git expertise to know what to put into those
> scripts. And Git updates cannot deliver more value to those scripts.

Perhaps I was unclear about the design I had in mind.  My proposal is
something like the following:

  git schedule-gc add [--period=TIME] [--fetch=REMOTE | --fetch-all] REPO
  git schedule-gc remove REPO

The actual command invoked by the system scheduler would be something
like the following:

  git schedule-gc run

Under the hood it would work as I proposed earlier (a list of
repositories that the same command is run against), but it would be
relatively straightforward to use.
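
As a rough illustration of the proposal above (not an existing Git
command), the registry behind "add", "remove" and "run" could be
sketched as follows.  The JSON file format, the function names and the
default period are assumptions made for this sketch; only "git -C",
"git fetch" and "git gc --auto" are real Git invocations.  The period is
stored but not enforced here.

```python
import json
import subprocess
from pathlib import Path

# Hypothetical registry location; the proposal only says
# "~/.config/git/repo-maintenance (or whatever)".
REGISTRY = Path.home() / ".config" / "git" / "repo-maintenance"


def load(registry=REGISTRY):
    """Return registered repos as {path: settings}; empty if no file yet."""
    if not registry.exists():
        return {}
    return json.loads(registry.read_text())


def save(repos, registry=REGISTRY):
    """Write the registry back, creating the config directory if needed."""
    registry.parent.mkdir(parents=True, exist_ok=True)
    registry.write_text(json.dumps(repos, indent=2, sort_keys=True))


def add(repo, period="1d", fetch=None, registry=REGISTRY):
    """Sketch of `schedule-gc add`; fetch is a remote name, "--all" or None."""
    repos = load(registry)
    repos[str(Path(repo).resolve())] = {"period": period, "fetch": fetch}
    save(repos, registry)


def remove(repo, registry=REGISTRY):
    """Sketch of `schedule-gc remove`; unknown repos are ignored."""
    repos = load(registry)
    repos.pop(str(Path(repo).resolve()), None)
    save(repos, registry)


def commands_for(repo, settings):
    """Build the git commands to run for one registered repository."""
    cmds = []
    fetch = settings.get("fetch")
    if fetch == "--all":
        cmds.append(["git", "-C", repo, "fetch", "--all"])
    elif fetch:
        cmds.append(["git", "-C", repo, "fetch", fetch])
    cmds.append(["git", "-C", repo, "gc", "--auto"])
    return cmds


def run(registry=REGISTRY, runner=subprocess.run):
    """Sketch of `schedule-gc run`: same commands on every registered repo."""
    for repo, settings in load(registry).items():
        for cmd in commands_for(repo, settings):
            runner(cmd, check=False)
```

The system scheduler would only ever need to call run(); which
repositories exist and what to do in each stays in the registry, so the
scheduler entry never has to change.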

> > I'm not opposed to seeing a tool that can schedule periodic maintenance
> > jobs, perhaps in contrib, depending on whether other people think it
> > should go.  However, I think running periodic jobs is best handled on
> > Unix with cron or anacron and not a custom tool or a command in Git.
> 
> Okay, here is a challenge for you: design this such that the Windows
> experience does _not_ feel like a 3rd-class citizen. Go ahead. Yes, there
> is a scheduler. Yep, it does not do cron-like things. Precisely: you have
> to feed it an XML to make use of the "advanced" features. Yeah, I also
> cannot remember what the semantics are regarding missed jobs due to
> shutdown cycles. Nope, you cannot rely on the XML being an option, that
> would require Windows 10. The list goes on.

I will freely admit that I know next to nothing about Windows.  I have
used it only incidentally, if at all, for at least two decades.  It is
not a platform I generally have an interest in developing for, although
I try to make it work as well as possible when I am working on a project
which supports it.

It is, in general, my assumption, based on its wide usage, that it is a
powerful and robust operating system with many features, but I have
little actual knowledge about how it functions or the exact features it
provides.

I want a solution that builds on the existing Unix tools for Unix,
because that is least surprising to users and it is how Unix tools are
supposed to work.  I think we can agree that Git was designed with the
Unix philosophy in mind.

I also want a solution that works on Windows.  Ideally that solution
would build on existing components that are part of Windows, because it
reduces the maintenance burden on all of us.  But unfortunately, I know
next to nothing about how to build such a solution.

> > I've dealt with systems that implemented periodic tasks without using
> > the existing tools for doing that, and I've found that usually that's a
> > mistake.  Despite seeming straightforward, there are a lot of tricky
> > edge cases to deal with and it's easy to get wrong.
> 
> But maybe you found one of those issues in Stolee's patches? If so, please
> do contribute your experience there to point out those issues, so that
> they can be addressed.

One of the benefits of using anacron on Unix is that it can skip running
tasks when the user is on battery.  This is not anything we can portably
do across systems, nor is it something that Git should need to know
about.

> > We also don't have to reimplement all the features in the system
> > scheduler and can let expert users use a different tool of their choice
> > instead if cron (or the Windows equivalent) is not to their liking.
> 
> Do we really want to start relying on `cron`, when the major platform used
> by the target audience (enterprise software engineers who deal with rather
> larger repositories than git.git or linux.git) quite obviously _lacks_
> support for that?

Unix users will be unhappy with us if we use our own scheduling system
when cron is available.  They will expect us to reimplement cron's
features and they will complain if we do not.  While I cannot name
names, there are a nontrivial number of large, enterprise monorepos that
run only on macOS and Linux.

That doesn't prevent us from building tooling that does the scheduling
on Windows if we can't use the system scheduler, but it would be nice to
try to present a relatively unified interface across the two platforms.
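
One way to picture such a unified interface, purely as a sketch: a
single entry point that picks the system scheduler per platform.  The
"git schedule-gc run" command is the hypothetical one from earlier in
this message, the task name and the hourly interval are made up, and
nothing here addresses missed jobs or battery state.

```python
import sys

# Hypothetical command from the proposal; it does not exist in Git.
RUN_COMMAND = "git schedule-gc run"
# Made-up task name, used only on Windows in this sketch.
TASK_NAME = "GitScheduleGc"


def scheduler_registration(platform=sys.platform):
    """Describe how an hourly job could be registered on this platform."""
    if platform.startswith("win"):
        # Windows Task Scheduler, driven through its schtasks front end.
        return {
            "tool": "schtasks",
            "argv": ["schtasks", "/Create", "/SC", "HOURLY",
                     "/TN", TASK_NAME, "/TR", RUN_COMMAND],
        }
    # Elsewhere, assume cron: a line the user adds with `crontab -e`.
    return {
        "tool": "cron",
        "crontab_line": "0 * * * * " + RUN_COMMAND,
    }
```

Either branch hands the actual timing to the platform's own scheduler,
so the only shared piece is the command that gets run.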
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204
