(cross-posted to git, LKML, and the kernel workflows mailing lists.)

Hi all,

I've been following Konstantin Ryabitsev's quest for better development
and communication tools for the kernel [1][2][3], and I would like to
propose a relatively straightforward idea which I think could bring a
lot to the table.

Step 1:

* git send-email needs to include parent SHA1s and generally all the
  information needed to perfectly recreate the commit when applied so
  that all the SHA1s remain the same
* git am (or an alternative command) needs to recreate the commit
  perfectly when applied, including applying it to the correct parent

Having these two will allow a perfect mapping between email and git;
essentially email just becomes a transport for git. There are a lot of
advantages to this, particularly that you have a stable way to refer to
a patch or commit (despite it appearing on a mailing list), and there
is no need for "changeset IDs" or whatever, since you can just use the
git SHA1 which is unique, unambiguous, and stable.

As a rough proof of concept I've attached 3 git patches which implement
this. There are issues to work out like exact format, encodings, mail
mangling, error handling, etc., but hopefully the git community can
help out here. (Improvement suggestions are welcome!)

Step 2:

* A bot that follows LKML (and other lists) and imports patchsets into
  a git repository hosted on git.kernel.org
* The bot can add git notes with URLs to lore (and/or other mailing
  list archives) and store them in e.g. refs/notes/lore,
  refs/notes/lkml, etc.
  (For those who don't use git notes yet: they are essentially small
  bits of information you can add to a commit without changing its
  SHA1, and you can configure tools like 'git log' to show these at the
  bottom of a commit. Notes can also exist in a repo completely
  separate from the commits they attach data to, so there is _zero_
  overhead for those who don't want to use this.)
* Maintainers can either pull patchsets directly from this
  bot-maintained repo OR they can continue to apply patches from their
  inbox (the result should be the same either way) OR they can continue
  in the old-style process (at least for a while) and just not have the
  benefits of the new process.

Step 3:

* Instead of describing a patchset in a separate introduction email, we
  can create a merge commit between the parent of the first commit in
  the series and the last and put the patchset description in the merge
  commit [5]. This means the patchset description also gets to be part
  of git history.
  (This would require support for git send-email/am to be able to send
  and apply merge commits -- at least those which have the same tree as
  one of the parents. This is _not_ yet supported in my proposed git
  patches.)
* Stable SHA1s mean we can refer to previous versions of a patchset by
  SHA1 rather than archive links. I propose a new changelog tag for
  this, maybe "Previous:" or maybe even a full list of "v1:", "v2:",
  etc. with a SHA1 or ref. Note that these SHA1s do *not* need to exist
  in Linus's repo, but those who want can pull those branches from the
  bot-maintained repo on git.kernel.org.
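
To make Steps 1 and 2 a bit more concrete, here is a minimal sketch in
plain shell (it is not the attached patches and says nothing about the
eventual email format). It only checks the two properties the proposal
relies on: a commit rebuilt from the same parent, tree, author/committer
identity and dates, and message gets the same SHA1, and adding a note
leaves that SHA1 alone. The repository names, identities, and the lore
URL are placeholders, and the shell variables merely stand in for
whatever headers git send-email would have to carry.

```shell
#!/bin/sh
set -eu

# Throwaway sandbox. Identities and the URL below are placeholders.
work=$(mktemp -d)
cd "$work"
export HOME="$work" GIT_CONFIG_NOSYSTEM=1   # ignore user/system git config
export GIT_AUTHOR_NAME='A U Thor' GIT_AUTHOR_EMAIL='author@example.com'
export GIT_COMMITTER_NAME='A U Thor' GIT_COMMITTER_EMAIL='author@example.com'

# Sender: a base commit (branch "base") plus one patch on top of it.
git init -q sender
cd sender
echo one > file
git add file
git commit -q -m 'base'
git branch base
echo two >> file
git commit -q -a -m 'add line two'
orig=$(git rev-parse HEAD)
parent=$(git rev-parse base)

# What the email would need to carry besides the diff itself.
an=$(git log -1 --format=%an "$orig")
ae=$(git log -1 --format=%ae "$orig")
adate=$(git log -1 --format=%ad --date=raw "$orig")
cn=$(git log -1 --format=%cn "$orig")
ce=$(git log -1 --format=%ce "$orig")
cdate=$(git log -1 --format=%cd --date=raw "$orig")
# Exact message bytes: everything after the first blank line of the object.
git cat-file commit "$orig" | sed '1,/^$/d' > "$work/msg"
git diff "$parent" "$orig" > "$work/diff"
cd ..

# Receiver: a different committer who only has the parent commit.
export GIT_COMMITTER_NAME='M Aintainer' GIT_COMMITTER_EMAIL='maintainer@example.com'
git init -q receiver
cd receiver
git fetch -q ../sender base
git read-tree "$parent"              # index := parent's tree
git apply --cached "$work/diff"      # apply the patch to the index only
tree=$(git write-tree)
recreated=$(
    GIT_AUTHOR_NAME="$an" GIT_AUTHOR_EMAIL="$ae" GIT_AUTHOR_DATE="$adate" \
    GIT_COMMITTER_NAME="$cn" GIT_COMMITTER_EMAIL="$ce" GIT_COMMITTER_DATE="$cdate" \
    git commit-tree -p "$parent" "$tree" < "$work/msg"
)

# Step 2: attach an archive link as a note; the commit object is untouched.
git notes --ref=lore add -m 'https://lore.example.org/placeholder-message-id/' "$recreated"
note=$(git notes --ref=lore show "$recreated")

echo "original:  $orig"
echo "recreated: $recreated"
git log -1 --notes=lore "$recreated"
```

The receiver deliberately has a different committer identity, so the
SHA1s only match because the original committer fields are restored
explicitly. That is the part a plain git am does not do today.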

Advantages:

- we can keep using email to post patches/patchsets
- the process is opt-in (but should be encouraged) for both authors and
  maintainers, and the transition can happen over time
- there is a central repo for convenience, but it is not necessary for
  development to happen and is not a single point of failure -- it's
  more like Linus's repo and can be moved or even replicated from
  scratch by somebody else simply by having mailing list archives
- allows quick lookup of patch/patchset <-> email discussion within git
- allows diffing between versions of a single logical patchset
- patchset descriptions naturally become part of the changelog that
  ends up in Linus's tree

Disadvantages:

- requires patching git
- requires a bot to continuously create branches for patchsets sent to
  mailing lists
- increased storage/bandwidth for git.kernel.org (?)
- may need a couple of new wrapper scripts to automate patchset
  construction/versioning

Thoughts?


Vegard

PS: Eric Wong described something that comes quite close to this idea,
but AFAICT without actually recreating commits exactly. I've included
the link for completeness.
[4]

[1]: https://lwn.net/Articles/793037/
     "Ryabitsev: Patches carved into developer sigchains"
[2]: https://lwn.net/Articles/799134/
     "Defragmenting the kernel development process"
[3]: https://lore.kernel.org/workflows/20190924182536.GC6041@hmswarspite.think-freely.org/
[4]: https://lore.kernel.org/workflows/20191008003931.y4rc2dp64gbhv5ju@dcvr/
[5]: To create this merge commit one could use something like this (bash):

    # usage: patchset BASE [PREVIOUS_VERSION]
    patchset () {
        local start=$1
        local prev=${2:-}
        local commit_editmsg merge

        # construct tentative commit message
        commit_editmsg="$(git rev-parse --git-dir)/COMMIT_EDITMSG"
        (
            if [ -z "$prev" ]
            then
                # first version: start from a template listing the commits
                echo 'Patchset title'
                echo
                echo Commits:
                echo
                git log --oneline "$start"..HEAD
            else
                # later version: reuse the previous description and link to it
                git show --format=format:%B --no-patch "$prev"
                echo "Previous-version: $(git rev-parse "$prev")"
            fi
        ) > "$commit_editmsg"

        # fall back to vi if EDITOR is unset
        "${EDITOR:-vi}" "$commit_editmsg"

        # merge of BASE and HEAD that reuses HEAD's tree unchanged
        merge=$(git commit-tree -p "$start" -p HEAD \
            -F "$commit_editmsg" "$(git rev-parse 'HEAD^{tree}')")
        echo "$merge"
    }

This will open the editor to edit the patchset description and create a
merge commit that encompasses the patches in the patchset (use sha1^-
to view the patches in it).
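
For anyone who wants to see what such a merge commit looks like without
going through the editor, here is a small non-interactive sketch of the
same idea in a throwaway repository. The commit messages, identity, and
cover-letter text are made up for illustration; only git commit-tree and
the sha1^- notation come from the description above.

```shell
#!/bin/sh
set -eu

# Throwaway repository; identity and messages are placeholders.
work=$(mktemp -d)
cd "$work"
export HOME="$work" GIT_CONFIG_NOSYSTEM=1   # ignore user/system git config
export GIT_AUTHOR_NAME='A U Thor' GIT_AUTHOR_EMAIL='author@example.com'
export GIT_COMMITTER_NAME='A U Thor' GIT_COMMITTER_EMAIL='author@example.com'

git init -q repo
cd repo
echo 0 > file
git add file
git commit -q -m 'base'
base=$(git rev-parse HEAD)

# A three-patch series on top of the base.
for i in 1 2 3
do
    echo "$i" >> file
    git commit -q -a -m "patch $i"
done
tip=$(git rev-parse HEAD)

# The cover letter becomes the message of a merge commit whose first
# parent is the base, whose second parent is the last patch, and whose
# tree is simply the tree of the last patch.
merge=$(printf 'Example patchset title\n\nExample description.\n' |
    git commit-tree -p "$base" -p "$tip" "$tip^{tree}")

# <merge>^- is shorthand for <merge>^1..<merge>: the merge itself plus
# the patches it encompasses. --no-merges leaves only the patches.
git log --oneline "$merge^-"
count=$(git rev-list --count --no-merges "$merge^-")
echo "patches in series: $count"
```

Since the merge introduces no changes of its own, diffing it against its
second parent is empty, which is the "same tree as one of the parents"
case mentioned in Step 3.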