git.vger.kernel.org archive mirror
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Taylor Blau <me@ttaylorr.com>
Cc: John Austin <john@astrangergravity.com>,
	git@vger.kernel.org, sandals@crustytoothpaste.net,
	larsxschneider@gmail.com, pastelmobilesuit@github.com,
	Joey Hess <id@joeyh.name>
Subject: Re: Git for games working group
Date: Sun, 16 Sep 2018 16:55:13 +0200
Message-ID: <878t41lcfi.fsf@evledraar.gmail.com>
In-Reply-To: <20180915164052.GA88932@syl>


On Sat, Sep 15 2018, Taylor Blau wrote:

> On Fri, Sep 14, 2018 at 02:09:12PM -0700, John Austin wrote:
>> I've been working myself on strategies for handling binary conflicts,
>> and particularly how to do it in a git-friendly way (ie. avoiding as
>> much centralization as possible and playing into the commit/branching
>> model of git).
>
> Git LFS handles conflict resolution and merging over binary files with
> two primary mechanisms: (1) file locking, and (2) use of a merge-tool.
>
>   1. is the most "non-Git-friendly" solution, since it requires the use
>      of a centralized Git LFS server (to be run alongside your remote
>      repository) and that every clone phones home to make sure that they
>      are OK to acquire a lock.
>
>      The workflow that we expect is that users will run 'git lfs lock
>      /path/to/file' any time they want to make a change to an
>      unmergeable file, and that this call first checks to make sure
>      that they are the only person who would hold the lock.
>
>      We also periodically "sync" the state of locks locally with those
>      on the remote, namely during the post-merge, post-commit, and
>      post-checkout hook(s).
>
>      Users are expected to perform the 'git lfs unlock /path/to/file'
>      anytime they "merge" their changes back into master, but the
>      thought is that servers could be taught to automatically do this
>      upon the remote detecting the merge.
>
>   2. is a more Git-friendly approach, i.e., that the 'git mergetool'
>      builtin does work with files tracked under Git LFS, i.e., that both
>      sides of the merge are filtered so that the mergetool can resolve
>      the changes in the large files instead of the textual pointers.
>
>
>> I've got to a loose design that I like, but it'd be good to get some
>> feedback, as well as hearing what other game devs would want in a
>> binary conflict system.
>
> Please do share, and I would be happy to provide feedback (and make
> proposals to integrate favorable parts of your ideas into Git LFS).

All of this is obviously correct as far as git-lfs goes. I'd just like
to use it as a jumping-off point on the topic of file locking, and to
frame this discussion more generally.

It's true that a tool like git-lfs "requires the use of a centralized
[...] server" for file locking, but it's not the case that a feature
like file locking requires a centralized authority.

In particular, git-lfs, unlike git-annex (which preceded it), does the
opposite of (to quote John upthread) "avoid[...] as much centralization
as possible": it *is* explicitly a centralized large file solution, not
a distributed one like git-annex.

That's not a critique of git-lfs or the centralized method, or a
recommendation for decentralization in this context. But we already have
a similar distributed solution in the form of git-annex; it's just a
hop, skip and a jump away from changing "who has the file" to "who has
the lock".

So how does that work? In the centralized case (git-lfs/cvs/p4/whatever)
you have some "lock/unlock" command that locks a file on a central
server. The lock is usually a [locked?, who] state: "is it locked?" and
"who locked it?". Usually this is also followed up on the client side by
checking those files out without the "w" flag.

In the hypothetical git-annex-like case (simplifying a bit for the
purposes of this explanation), for every FILE in your tree you have a
corresponding FILE.lock file. It's not a boolean, though, but a log of
who's asked for locks, i.e. lines of:

    <repository UUID> <ts> <state> <who (email?)> <explanation?>

E.g.:

    $ cat Makefile.lock
    my-random-per-repo-id 2018-09-15 1 avarab@gmail.com "refactoring all Makefiles"
    my-random-per-repo-id 2018-09-16 0 avarab@gmail.com "done!"

This log is append-only; when clients encounter conflicts there's a
merge driver to ensure that all updates are kept.
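
To make the "merge driver keeps all updates" idea concrete, here's a
minimal sketch of such a merge, assuming the line format above and
timestamps that sort lexicographically (as the ISO dates in the example
do). The name merge_lock_logs is made up for illustration; this is not
something git-annex or git-lfs ships.

```python
def _timestamp(line: str) -> str:
    """Return the <ts> field (2nd column) of a lock-log line, or ""."""
    fields = line.split(maxsplit=2)
    return fields[1] if len(fields) > 1 else ""


def merge_lock_logs(ours: str, theirs: str) -> str:
    """Union-merge two versions of an append-only FILE.lock log.

    Every distinct line from either side is kept exactly once, so no
    lock/unlock record is ever dropped by a merge.
    """
    seen = set()
    merged = []
    for line in ours.splitlines() + theirs.splitlines():
        if line.strip() and line not in seen:
            seen.add(line)
            merged.append(line)
    # list.sort() is stable: entries with equal timestamps keep the
    # order in which they were first seen.
    merged.sort(key=_timestamp)
    return "\n".join(merged) + "\n" if merged else ""
```

A real driver would be wired up through git's custom merge driver
configuration (merge.<name>.driver plus a .gitattributes entry) and
would read and write the files git hands it. For a log like this,
git's built-in "union" merge driver may already be close enough.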

You can then enact a policy saying you care or don't care about updates
from certain sources, or ignore locks older than so-and-so.
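
As a rough sketch of what such a policy could look like, assuming the
log is in chronological order, that <ts> is an ISO date as in the
example, and that the last entry that isn't ignored decides the state
(a simplification). The names lock_holder, max_age_days and
ignored_repos are hypothetical:

```python
from datetime import date, timedelta


def lock_holder(log, today, max_age_days=30, ignored_repos=frozenset()):
    """Return the <who> currently holding the lock, or None.

    Policy knobs (both assumptions of this sketch):
      - entries from repository UUIDs in ignored_repos are skipped
      - a lock older than max_age_days is treated as stale
    """
    holder = None
    for line in log.splitlines():
        fields = line.split(maxsplit=4)
        if len(fields) < 4:
            continue  # not a well-formed log line
        repo, ts, state, who = fields[:4]
        if repo in ignored_repos:
            continue
        # state "1" takes the lock, anything else releases it
        holder = (who, date.fromisoformat(ts)) if state == "1" else None
    if holder is None:
        return None
    who, taken = holder
    if today - taken > timedelta(days=max_age_days):
        return None  # too old, policy says ignore it
    return who
```

With only the first line of the Makefile.lock example above, this
reports avarab@gmail.com as the holder on 2018-09-16; once the "done!"
line is appended it reports nobody.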

None of this is stuff I'd really recommend. It's just instructive to
point out that if someone wants a distributed locking solution for git,
it pretty much already exists; you can even (ab)use git-annex for it
today with a tiny hack on top.

I.e. each time you want to lock a file called Makefile just:

    echo We created a lock for this >Makefile.lock &&
    git annex add Makefile.lock &&
    git annex sync

And to release the lock:

    git annex rm Makefile.lock &&
    git annex sync

Then you and others using this just mentally pretend (or set up aliases)
that the following mapping exists:

    git annex add <file> && git annex sync ==> git lockit <file>
    git annex rm <file>  && git annex sync ==> git unlockit <file>

And that stuff like "git annex whereis" (designed to list "who has the
files") means "git annex who-has-locks".

Then you'd change the post-{checkout,merge} hooks to list the locks
(the "tracked annex files") and chmod -w appropriately, and voila: a
distributed locking solution for git, built on top of an existing tool,
that you can implement in a couple of hours.
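
The hook part of that could be sketched roughly as follows, assuming a
lock on FILE is represented by a FILE.lock entry next to it in the
working tree, as in the Makefile.lock example. The function name
mark_locked_read_only is made up, and deciding whose locks should
count is left out entirely:

```python
import stat
from pathlib import Path


def mark_locked_read_only(worktree):
    """chmod -w every FILE that has a FILE.lock next to it.

    Returns the sorted list of paths that were made read-only.
    """
    root = Path(worktree)
    no_write = ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH)
    changed = []
    for lock in root.rglob("*.lock"):
        # Skip git's own lock files (e.g. index.lock) under .git/
        if ".git" in lock.relative_to(root).parts:
            continue
        target = lock.with_suffix("")  # Makefile.lock -> Makefile
        if target.is_file():
            target.chmod(target.stat().st_mode & no_write)
            changed.append(str(target))
    return sorted(changed)
```

A post-checkout or post-merge hook would call this on the top of the
working tree. A fuller version would also restore the "w" bit once a
lock goes away, and would skip locks held by the local user.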

Now, if I were in a game studio like this, would I do any of this? Nope.
I think even if you go for locks, something like the centralized git-lfs
approach is simpler and probably more appropriate (you presumably want
to be centralized anyway).

But to be honest I don't really get the need for this given something
like the use-case noted upthread:

    > John Austin <john@astrangergravity.com> wrote:
    > An essential example would be a team of 5 audio designers working
    > together on the SFX for a game. If one designer wants to add a layer
    > of ambience to 40% of the .wav files, they have to coordinate with
    > everyone else on the project manually.

If you have 5 people working on a project together, isn't it more
straightforward to post in IRC/E-Mail:

    Hey @all, don't change *.wav files for the next couple of days,
    major refactoring.

That's what we do all the time over in the non-game-non-binary-assets SW
development world, and I daresay that even if you have textual
conflicts, they're sometimes just as hard to solve.

I.e. you can have two people on a team, unaware of each other, starting
to refactor the same set of code in parallel in two completely different
ways, needing a lot of manual merging / throwing out of most of one
implementation. The way that's usually dealt with is something like the
above example post to a ML.

But maybe I'm just not imagining the use-cases.
