From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Taylor Blau <me@ttaylorr.com>
Cc: John Austin <john@astrangergravity.com>,
	git@vger.kernel.org, sandals@crustytoothpaste.net,
	larsxschneider@gmail.com, pastelmobilesuit@github.com,
	Joey Hess <id@joeyh.name>
Subject: Re: Git for games working group
Date: Sun, 16 Sep 2018 16:55:13 +0200
Message-ID: <878t41lcfi.fsf@evledraar.gmail.com>
In-Reply-To: <20180915164052.GA88932@syl>


On Sat, Sep 15 2018, Taylor Blau wrote:

> On Fri, Sep 14, 2018 at 02:09:12PM -0700, John Austin wrote:
>> I've been working myself on strategies for handling binary conflicts,
>> and particularly how to do it in a git-friendly way (ie. avoiding as
>> much centralization as possible and playing into the commit/branching
>> model of git).
>
> Git LFS handles conflict resolution and merging over binary files with
> two primary mechanisms: (1) file locking, and (2) use of a merge-tool.
>
>   1. is the most "non-Git-friendly" solution, since it requires the use
>      of a centralized Git LFS server (to be run alongside your remote
>      repository) and that every clone phones home to make sure that they
>      are OK to acquire a lock.
>
>      The workflow that we expect is that users will run 'git lfs lock
>      /path/to/file' any time they want to make a change to an
>      unmergeable file, and that this call first checks to make sure
>      that they are the only person who would hold the lock.
>
>      We also periodically "sync" the state of locks locally with those
>      on the remote, namely during the post-merge, post-commit, and
>      post-checkout hook(s).
>
>      Users are expected to perform the 'git lfs unlock /path/to/file'
>      anytime they "merge" their changes back into master, but the
>      thought is that servers could be taught to automatically do this
>      upon the remote detecting the merge.
>
>   2. is a more Git-friendly approach, i.e., that the 'git mergetool'
>      builtin does work with files tracked under Git LFS, i.e., that both
>      sides of the merge are filtered so that the mergetool can resolve
>      the changes in the large files instead of the textual pointers.
>
>
>> I've got to a loose design that I like, but it'd be good to get some
>> feedback, as well as hearing what other game devs would want in a
>> binary conflict system.
>
> Please do share, and I would be happy to provide feedback (and make
> proposals to integrate favorable parts of your ideas into Git LFS).

All of this is obviously correct as far as git-lfs goes. I'd just like
to use it as a jumping-off point on the topic of file locking, and to
frame this discussion more generally.

It's true that a tool like git-lfs "requires the use of a centralized
[...] server" for file locking, but it's not the case that a feature
like file locking requires a centralized authority.

In particular git-lfs, unlike git-annex (which preceded it), does the
opposite of (to quote John upthread) "avoid[...] as much centralization
as possible": it *is* explicitly a centralized large-file solution, not
a distributed one.

That's not a critique of git-lfs or the centralized method, or a
recommendation for decentralization in this context. But we already have
a similar distributed solution in the form of git-annex; it's just a
hop, skip and a jump from tracking "who has the file" to tracking "who
has the lock".

So how does that work? In the centralized case (git-lfs/cvs/p4/whatever)
you have some "lock/unlock" command which locks a file on a central
server. The lock is usually a [locked?, who] state: "is it locked?" and
"who locked it?". This is usually also followed up on the client side by
checking those files out without the "w" flag.

In the hypothetical git-annex-like case (simplifying a bit for the
purposes of this explanation), for every FILE in your tree you have a
corresponding FILE.lock file. It's not a boolean, though, but a log of
who's asked for locks, i.e. lines of:

    <repository UUID> <ts> <state> <who (email?)> <explanation?>

E.g.:

    $ cat Makefile.lock
    my-random-per-repo-id 2018-09-15 1 avarab@gmail.com "refactoring all Makefiles"
    my-random-per-repo-id 2018-09-16 0 avarab@gmail.com "done!"

This log is append-only. When clients encounter conflicts, a merge
driver ensures that all updates are kept.
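
To make that concrete, here's a rough sketch in shell. The helper names
(lock_log_setup, lock_log_append, lock_log_state) and the LOCK_REPO_ID
and LOCK_WHO variables are invented for illustration. The only existing
feature it leans on is git's built-in "union" merge driver, which keeps
the lines from both sides instead of leaving conflict markers. A union
merge doesn't promise any particular line order, so a real version would
want finer-grained timestamps and would sort on them rather than
trusting the last line:

```shell
# Hypothetical helpers; none of these are existing git or git-annex commands.

# Ask git to merge *.lock logs line-wise, keeping both sides' entries.
lock_log_setup() {
    echo '*.lock merge=union' >>.gitattributes
}

# Append "<repository UUID> <ts> <state> <who> <explanation>" to FILE.lock.
# Usage: lock_log_append FILE STATE EXPLANATION
lock_log_append() {
    printf '%s %s %s %s "%s"\n' \
        "${LOCK_REPO_ID:-my-random-per-repo-id}" \
        "$(date -u +%Y-%m-%d)" \
        "$2" \
        "${LOCK_WHO:-$(git config user.email)}" \
        "$3" >>"$1.lock"
}

# Print the state of the last entry in FILE.lock: 1 = locked, 0 = released.
lock_log_state() {
    awk '{ state = $3 } END { print state }' "$1.lock"
}
```

With these, 'lock_log_append Makefile 1 "refactoring all Makefiles"'
followed by a commit and push would play the role of "lock", and the
same call with state 0 the role of "unlock".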

You can then enact a policy saying you care or don't care about updates
from certain sources, or ignore locks older than so-and-so.
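
Such a policy can be a one-line filter over the log. In this sketch
lock_log_filter is a made-up name, and it assumes the <ts> field is an
ISO date (YYYY-MM-DD) so that plain string comparison orders entries
correctly:

```shell
# Hypothetical policy filter over a FILE.lock log.
# Usage: lock_log_filter FILE CUTOFF [IGNORED_UUID]
# Prints the entries dated CUTOFF or later, skipping any entry whose
# repository UUID equals IGNORED_UUID (if one is given).
lock_log_filter() {
    awk -v cutoff="$2" -v ignored="${3:-}" \
        '$2 >= cutoff && $1 != ignored' "$1.lock"
}
```

Whatever survives the filter is what you'd then treat as the effective
lock history for that file.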

None of this is stuff I'd really recommend. It's just instructive to
point out that if someone wants a distributed locking solution for git,
it pretty much already exists; you can even (ab)use git-annex for it
today with a tiny hack on top.

I.e. each time you want to lock a file called Makefile, just run:

    echo We created a lock for this >Makefile.lock &&
    git annex add Makefile.lock &&
    git annex sync

And to release the lock:

    git annex rm Makefile.lock &&
    git annex sync

Then you and others using this just mentally pretend (or set up aliases)
that the following mapping exists:

    git annex add <file> && git annex sync ==> git lockit <file>
    git annex rm <file>  && git annex sync ==> git unlockit <file>

And that stuff like "git annex whereis" (designed to list "who has the
files") means "git annex who-has-locks".

Then you'd change the post-{checkout,merge} hooks to list the locks
(i.e. the tracked annex files) and chmod -w the corresponding files.
Voila: a distributed locking solution for git, built on top of an
existing tool, that you can implement in a couple of hours.
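
The hook part might look roughly like this. apply_locks is a
hypothetical helper, and it assumes the convention from above that a
tracked FILE.lock means FILE is locked. It only removes the write bit;
restoring it after an unlock, and leaving your own locked files
writable, is left out:

```shell
# Hypothetical helper for .git/hooks/post-checkout and post-merge.
# Reads lock file paths (one per line) on stdin and makes the file each
# one guards read-only. In a hook it could be fed with:
#
#     git ls-files -- '*.lock' | apply_locks
#
apply_locks() {
    while IFS= read -r lock; do
        # Makefile.lock guards Makefile, and so on.
        target=${lock%.lock}
        # Only touch targets that exist in the working tree.
        if [ -f "$target" ]; then
            chmod a-w "$target"
        fi
    done
}
```

That's the client-side equivalent of checking files out without the "w"
flag in the centralized systems.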

Now, if I were at a game studio like this, would I do any of this? Nope.
I think that even if you go for locks, something like the centralized
git-lfs approach is simpler and probably more appropriate (you
presumably want to be centralized anyway).

But to be honest I don't really get the need for this given something
like the use-case noted upthread:

    > John Austin <john@astrangergravity.com> wrote:
    > An essential example would be a team of 5 audio designers working
    > together on the SFX for a game. If one designer wants to add a layer
    > of ambience to 40% of the .wav files, they have to coordinate with
    > everyone else on the project manually.

If you have 5 people working on a project together, isn't it more
straightforward to post in IRC/E-Mail:

    Hey @all, don't change *.wav files for the next couple of days,
    major refactoring.

That's what we do all the time over in the non-game-non-binary-assets SW
development world, and I daresay that even if you have textual
conflicts, they're sometimes just as hard to solve.

I.e. you can have two people on a team, unaware of each other, starting
to refactor the same set of code in parallel in two completely different
ways, which then needs a lot of manual merging or throwing out most of
one implementation. The way that's usually dealt with is something like
the above example post to a ML.

But maybe I'm just not imagining the use-cases.
