* Re: [RFC PATCH gentoolkit] bin: Add merge-driver-ekeyword
[not found] <20201221034452.307153-1-mattst88@gentoo.org>
@ 2020-12-22 22:40 ` Matt Turner
2020-12-23 18:11 ` Beat Bolli
2020-12-23 19:45 ` Junio C Hamano
0 siblings, 2 replies; 6+ messages in thread
From: Matt Turner @ 2020-12-22 22:40 UTC (permalink / raw)
To: gentoo-portage-dev; +Cc: git
tl;dr:
I want to handle conflicts automatically on lines like
> KEYWORDS="~alpha ~amd64 ~arm ~arm64 ~hppa ~ia64 ~mips ~ppc ~ppc64 ~riscv ~s390 ~sparc ~x86"
where conflicts frequently happen by adding/removing ~ before the
architecture names or adding/removing whole architectures. I don't
know if I should use a custom git merge driver or a custom git merge
strategy.
So the program in the patch below works, but it's not ideal, because
it rejects any hunks that don't touch the KEYWORDS=... assignment.
As I understand it, a custom git merge driver is intended to be used
to merge whole file formats, like JSON. As a result, you configure it
via gitattributes on a per-extension basis.
I really just want to make the default recursive git merge handle
KEYWORDS=... conflicts automatically, and I don't expect to be able to
make a git merge driver that can handle arbitrary conflicts in
*.ebuild files. The merge driver returns non-zero if it was unable
to resolve the conflicts, but when it does so git evidently doesn't
fall back and insert the typical <<< HEAD ... === ... >>> markers.
Maybe I could make my merge driver insert those like git normally
does? Seems like git's logic is probably a bit better about handling
some conflicts than my tool would be.
So... is a git merge strategy the thing I want? I don't know. There
doesn't seem to really be any documentation on writing git merge
strategies. I've only found [1] and [2].
Cc'ing git@vger.kernel.org, since I expect that's where the experts
are. Hopefully they have suggestions.
[1] https://stackoverflow.com/questions/23140240/git-how-do-i-add-a-custom-merge-strategy
[2] https://stackoverflow.com/questions/54528824/any-documentation-for-writing-a-custom-git-merge-strategy
On Sun, Dec 20, 2020 at 10:44 PM Matt Turner <mattst88@gentoo.org> wrote:
>
> Since the KEYWORDS=... assignment is a single line, git struggles to
> handle conflicts. When rebasing a series of commits that modify the
> KEYWORDS=... it's usually easier to throw them away and reapply on the
> new tree than it is to manually handle conflicts during the rebase.
>
> git allows a 'merge driver' program to handle conflicts; this program
> handles conflicts in the KEYWORDS=... assignment. E.g., given an ebuild
> with these keywords:
>
> KEYWORDS="~alpha ~amd64 ~arm ~arm64 ~hppa ~ia64 ~mips ~ppc ~ppc64 ~riscv ~s390 ~sparc ~x86"
>
> One developer drops the ~alpha keyword and pushes to gentoo.git, and
> another developer stabilizes hppa. Without this merge driver, git
> requires the second developer to manually resolve the conflict. With
> the custom merge driver, it automatically resolves the conflict.
>
> gentoo.git/.git/config:
>
>     [core]
>         ...
>         attributesfile = ~/.gitattributes
>     [merge "keywords"]
>         name = KEYWORDS merge driver
>         driver = merge-driver-ekeyword %O %A %B
>
> ~/.gitattributes:
>
> *.ebuild merge=keywords
>
> Signed-off-by: Matt Turner <mattst88@gentoo.org>
> ---
> One annoying wart in the program is due to the fact that ekeyword
> won't work on any file not named *.ebuild. I make a symlink (and set up
> an atexit handler to remove it) to work around this. I'm not sure we
> could make ekeyword handle arbitrary filenames given its complex multi-
> argument parameter support. git merge files are named .merge_file_XXXXX
> according to git-unpack-file(1), so we could allow those. Thoughts?
>
> bin/merge-driver-ekeyword | 125 ++++++++++++++++++++++++++++++++++++++
> 1 file changed, 125 insertions(+)
> create mode 100755 bin/merge-driver-ekeyword
>
> diff --git a/bin/merge-driver-ekeyword b/bin/merge-driver-ekeyword
> new file mode 100755
> index 0000000..6e645a9
> --- /dev/null
> +++ b/bin/merge-driver-ekeyword
> @@ -0,0 +1,125 @@
> +#!/usr/bin/python
> +#
> +# Copyright 2020 Gentoo Authors
> +# Distributed under the terms of the GNU General Public License v2 or later
> +
> +"""
> +Custom git merge driver for handling conflicts in KEYWORDS assignments
> +
> +See https://git-scm.com/docs/gitattributes#_defining_a_custom_merge_driver
> +"""
> +
> +import atexit
> +import difflib
> +import os
> +import shutil
> +import sys
> +
> +from typing import List, Optional, Tuple
> +
> +from gentoolkit.ekeyword import ekeyword
> +
> +
> +def keyword_array(keyword_line: str) -> List[str]:
> +    # Find indices of string inside the double-quotes
> +    i1: int = keyword_line.find('"') + 1
> +    i2: int = keyword_line.rfind('"')
> +
> +    # Split into array of KEYWORDS
> +    return keyword_line[i1:i2].split(' ')
> +
> +
> +def keyword_line_changes(old: str, new: str) -> List[Tuple[Optional[str],
> +                                                           Optional[str]]]:
> +    a: List[str] = keyword_array(old)
> +    b: List[str] = keyword_array(new)
> +
> +    s = difflib.SequenceMatcher(a=a, b=b)
> +
> +    changes = []
> +    for tag, i1, i2, j1, j2 in s.get_opcodes():
> +        if tag == 'replace':
> +            changes.append((a[i1:i2], b[j1:j2]),)
> +        elif tag == 'delete':
> +            changes.append((a[i1:i2], None),)
> +        elif tag == 'insert':
> +            changes.append((None, b[j1:j2]),)
> +        else:
> +            assert tag == 'equal'
> +    return changes
> +
> +
> +def keyword_changes(ebuild1: str, ebuild2: str) -> List[Tuple[Optional[str],
> +                                                              Optional[str]]]:
> +    with open(ebuild1) as e1, open(ebuild2) as e2:
> +        lines1 = e1.readlines()
> +        lines2 = e2.readlines()
> +
> +    diff = difflib.unified_diff(lines1, lines2, n=0)
> +    assert next(diff) == '--- \n'
> +    assert next(diff) == '+++ \n'
> +
> +    hunk: int = 0
> +    old: str = ''
> +    new: str = ''
> +
> +    for line in diff:
> +        if line.startswith('@@ '):
> +            if hunk > 0: break
> +            hunk += 1
> +        elif line.startswith('-'):
> +            if old or new: break
> +            old = line
> +        elif line.startswith('+'):
> +            if not old or new: break
> +            new = line
> +    else:
> +        if 'KEYWORDS=' in old and 'KEYWORDS=' in new:
> +            return keyword_line_changes(old, new)
> +    return None
> +
> +
> +def apply_keyword_changes(ebuild: str,
> +                          changes: List[Tuple[Optional[str],
> +                                              Optional[str]]]) -> int:
> +    # ekeyword will only modify files named *.ebuild, so make a symlink
> +    ebuild_symlink = ebuild + '.ebuild'
> +    os.symlink(ebuild, ebuild_symlink)
> +    atexit.register(lambda: os.remove(ebuild_symlink))
> +
> +    for removals, additions in changes:
> +        args = []
> +        for rem in removals or []:
> +            # Drop leading '~' and '-' characters and prepend '^'
> +            i = 1 if rem[0] in ('~', '-') else 0
> +            args.append('^' + rem[i:])
> +        if additions:
> +            args.extend(additions)
> +        args.append(ebuild_symlink)
> +
> +        result = ekeyword.main(args)
> +        if result != 0:
> +            return result
> +    return 0
> +
> +
> +def main(argv):
> +    if len(argv) != 4:
> +        sys.exit(-1)
> +
> +    O = argv[1] # %O - filename of original
> +    A = argv[2] # %A - filename of our current version
> +    B = argv[3] # %B - filename of the other branch's version
> +
> +    # Get changes from %O to %B
> +    changes = keyword_changes(O, B)
> +    if not changes:
> +        sys.exit(-1)
> +
> +    # Apply O -> B changes to A
> +    result: int = apply_keyword_changes(A, changes)
> +    sys.exit(result)
> +
> +
> +if __name__ == "__main__":
> +    main(sys.argv)
> --
> 2.26.2
>
* Re: [RFC PATCH gentoolkit] bin: Add merge-driver-ekeyword
2020-12-22 22:40 ` [RFC PATCH gentoolkit] bin: Add merge-driver-ekeyword Matt Turner
@ 2020-12-23 18:11 ` Beat Bolli
2020-12-23 19:46 ` Junio C Hamano
2020-12-23 19:45 ` Junio C Hamano
1 sibling, 1 reply; 6+ messages in thread
From: Beat Bolli @ 2020-12-23 18:11 UTC (permalink / raw)
To: Matt Turner, gentoo-portage-dev; +Cc: git
On 22.12.20 23:40, Matt Turner wrote:
> tl;dr:
>
> I want to handle conflicts automatically on lines like
>
>> KEYWORDS="~alpha ~amd64 ~arm ~arm64 ~hppa ~ia64 ~mips ~ppc ~ppc64 ~riscv ~s390 ~sparc ~x86"
>
> where conflicts frequently happen by adding/removing ~ before the
> architecture names or adding/removing whole architectures. I don't
> know if I should use a custom git merge driver or a custom git merge
> strategy.
You can probably put each of the keywords on a separate line:
KEYWORDS="
~alpha
~amd64
~arm
~arm64
~hppa
~ia64
~mips
~ppc
~ppc64
~riscv
~s390
~sparc
~x86
"
The shell should handle both forms about the same.
(I'm not a Gentoo user, just talking about my general shell experience)
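For the two forms to count as the same, whatever reads KEYWORDS has to split on any whitespace, not just on single spaces. A small Python check of that assumption follows (Python only because the patch uses it; split_keywords is a made-up helper, not part of gentoolkit). The split(' ') in the posted patch would not treat the two forms alike.

```python
def split_keywords(value: str) -> list:
    """Split a KEYWORDS value on any run of whitespace (spaces, tabs, newlines)."""
    return value.split()


one_line = "~alpha ~amd64 ~arm"
multi_line = "\n~alpha\n~amd64\n~arm\n"

# Both spellings yield the same list of keywords
assert split_keywords(one_line) == split_keywords(multi_line)

# Splitting on a single space instead leaves the multi-line value in one piece
assert multi_line.split(' ') == ["\n~alpha\n~amd64\n~arm\n"]
```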
Regards
Beat
* Re: [RFC PATCH gentoolkit] bin: Add merge-driver-ekeyword
2020-12-22 22:40 ` [RFC PATCH gentoolkit] bin: Add merge-driver-ekeyword Matt Turner
2020-12-23 18:11 ` Beat Bolli
@ 2020-12-23 19:45 ` Junio C Hamano
2020-12-24 4:47 ` Matt Turner
1 sibling, 1 reply; 6+ messages in thread
From: Junio C Hamano @ 2020-12-23 19:45 UTC (permalink / raw)
To: Matt Turner; +Cc: gentoo-portage-dev, git
Matt Turner <mattst88@gentoo.org> writes:
> I want to handle conflicts automatically on lines like
>
>> KEYWORDS="~alpha ~amd64 ~arm ~arm64 ~hppa ~ia64 ~mips ~ppc ~ppc64 ~riscv ~s390 ~sparc ~x86"
>
> where conflicts frequently happen by adding/removing ~ before the
> architecture names or adding/removing whole architectures. I don't
> know if I should use a custom git merge driver or a custom git merge
> strategy.
A merge strategy is about how the changes at the tree level are
handled. A merge driver is given three blobs (original, your
version, and their version) and comes up with a merged blob.
In your case, you'd want a custom merge driver if you want to handle
word changes on a single line, because the default text merge driver
is pretty much line oriented.
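A rough sketch of what a word-oriented three-way merge of a single KEYWORDS line could look like, in Python like the patch. The names parse_keywords and merge_keywords, and the choice to emit the result in alphabetical order, are assumptions for illustration only; neither git nor gentoolkit provides them.

```python
from typing import Dict, Optional


def parse_keywords(line: str) -> Dict[str, str]:
    """Map each architecture to its prefix ('', '~' or '-')."""
    body = line[line.find('"') + 1:line.rfind('"')]
    result = {}
    for word in body.split():
        prefix = word[0] if word[0] in '~-' else ''
        result[word[len(prefix):]] = prefix
    return result


def merge_keywords(base: str, ours: str, theirs: str) -> Optional[str]:
    """Three-way merge of KEYWORDS lines; None means a real conflict."""
    o, a, b = (parse_keywords(x) for x in (base, ours, theirs))
    merged = {}
    for arch in sorted(set(o) | set(a) | set(b)):
        vo, va, vb = o.get(arch), a.get(arch), b.get(arch)
        if va == vb or vb == vo:
            pick = va      # both sides agree, or only our side changed
        elif va == vo:
            pick = vb      # only their side changed
        else:
            return None    # both sides changed this arch differently
        if pick is not None:
            merged[arch] = pick
    words = ' '.join(prefix + arch for arch, prefix in merged.items())
    return 'KEYWORDS="{}"'.format(words)
```

With the example from the patch description (one side drops ~alpha, the other stabilizes hppa), each architecture is changed on one side only, so the merge is clean; if both sides change the same architecture in different ways the function gives up and returns None.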
* Re: [RFC PATCH gentoolkit] bin: Add merge-driver-ekeyword
2020-12-23 18:11 ` Beat Bolli
@ 2020-12-23 19:46 ` Junio C Hamano
0 siblings, 0 replies; 6+ messages in thread
From: Junio C Hamano @ 2020-12-23 19:46 UTC (permalink / raw)
To: Beat Bolli; +Cc: Matt Turner, gentoo-portage-dev, git
Beat Bolli <dev+git@drbeat.li> writes:
> You can probably put each of the keywords on a separate line:
>
> KEYWORDS="
> ~alpha
> ~amd64
> ~arm
> ~arm64
> ~hppa
> ~ia64
> ~mips
> ~ppc
> ~ppc64
> ~riscv
> ~s390
> ~sparc
> ~x86
> "
>
> The shell should handle both forms about the same.
I agree that it is a more practical approach than writing a one-off
merge driver.
* Re: [RFC PATCH gentoolkit] bin: Add merge-driver-ekeyword
2020-12-23 19:45 ` Junio C Hamano
@ 2020-12-24 4:47 ` Matt Turner
2020-12-24 6:13 ` Junio C Hamano
0 siblings, 1 reply; 6+ messages in thread
From: Matt Turner @ 2020-12-24 4:47 UTC (permalink / raw)
To: Junio C Hamano; +Cc: gentoo-portage-dev, git
On Wed, Dec 23, 2020 at 2:46 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Matt Turner <mattst88@gentoo.org> writes:
>
> > I want to handle conflicts automatically on lines like
> >
> >> KEYWORDS="~alpha ~amd64 ~arm ~arm64 ~hppa ~ia64 ~mips ~ppc ~ppc64 ~riscv ~s390 ~sparc ~x86"
> >
> > where conflicts frequently happen by adding/removing ~ before the
> > architecture names or adding/removing whole architectures. I don't
> > know if I should use a custom git merge driver or a custom git merge
> > strategy.
>
> A merge strategy is about how the changes at the tree level are
> handled. A merge driver is given three blobs (original, your
> version, and their version) and comes up with a merged blob.
>
> In your case, you'd want a custom merge driver if you want to handle
> word changes on a single line, because the default text merge driver
> is pretty much line oriented.
Thanks, that makes sense. The merge driver I've written seems to work
great for handling the KEYWORDS=... line.
If users could more simply opt into using it (e.g., on the command
line rather than enabling it via ~/.gitattributes) I think it would be
fine to use. Better yet, is there a way git can be configured to
fall back to another merge driver if the first returns a non-zero
status due to unresolved conflicts? For example, if there are changes
to other lines, how can I fall back to another merge driver?
Thank you for your advice!
* Re: [RFC PATCH gentoolkit] bin: Add merge-driver-ekeyword
2020-12-24 4:47 ` Matt Turner
@ 2020-12-24 6:13 ` Junio C Hamano
0 siblings, 0 replies; 6+ messages in thread
From: Junio C Hamano @ 2020-12-24 6:13 UTC (permalink / raw)
To: Matt Turner; +Cc: gentoo-portage-dev, git
Matt Turner <mattst88@gentoo.org> writes:
> ... is there a way git can be configured to
> fall back to another merge driver if the first returns a non-zero
> status due to unresolved conflicts? For example, if there are changes
> to other lines, how can I fall back to another merge driver?
There is no "fallback", but a merge driver should be able to first
run another merge driver (e.g. "git merge-file" or the "merge"
command from the RCS suite of programs would be line-oriented 3-way
drivers suitable for text files) and then fix up the leftover bits.
If your users don't want to contaminate the .gitattributes file that
is recorded in-tree, they can also use .git/info/attributes to locally
configure Git to use such a driver.
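One possible shape for such a wrapper, sketched in Python like the patch: run "git merge-file" first and only look at the hunks it could not resolve. The helper names (fix_conflicts, run_driver) and the resolve callback are hypothetical. --diff3 is used so that each leftover conflict also carries the base text. The parser is deliberately naive and assumes well-formed markers.

```python
import subprocess
from typing import Callable, List, Optional, Tuple

Resolver = Callable[[List[str], List[str], List[str]], Optional[List[str]]]


def fix_conflicts(lines: List[str], resolve: Resolver) -> Tuple[List[str], int]:
    """Walk diff3-style conflict hunks and replace those `resolve` can handle.

    `resolve(base, ours, theirs)` returns merged lines, or None to give up.
    Returns (new lines, number of conflicts left in place).
    """
    out: List[str] = []
    remaining = 0
    i = 0
    while i < len(lines):
        if not lines[i].startswith('<<<<<<<'):
            out.append(lines[i])
            i += 1
            continue
        start = i
        sections: Tuple[List[str], List[str], List[str]] = ([], [], [])
        which = 0                      # 0 = ours, 1 = base, 2 = theirs
        i += 1
        while not lines[i].startswith('>>>>>>>'):
            if lines[i].startswith('|||||||'):
                which = 1
            elif lines[i].startswith('======='):
                which = 2
            else:
                sections[which].append(lines[i])
            i += 1
        i += 1                         # skip the '>>>>>>>' line
        ours, base, theirs = sections
        fixed = resolve(base, ours, theirs)
        if fixed is None:
            out.extend(lines[start:i])  # keep git's markers untouched
            remaining += 1
        else:
            out.extend(fixed)
    return out, remaining


def run_driver(base: str, ours: str, theirs: str, resolve: Resolver) -> int:
    """Body of a merge driver invoked with %O %A %B; result is left in `ours`."""
    status = subprocess.run(
        ['git', 'merge-file', '--diff3', ours, base, theirs]).returncode
    if status == 0:
        return 0                       # clean line-oriented merge
    if not 0 < status <= 127:
        return 1                       # git itself failed; report a conflict
    with open(ours) as f:
        merged, remaining = fix_conflicts(f.readlines(), resolve)
    with open(ours, 'w') as f:
        f.writelines(merged)
    return 1 if remaining else 0
```

When resolve declines a hunk, the file keeps git's usual conflict markers and the driver exits non-zero, so the user sees the same kind of conflict the default text driver would have produced.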